## Saturday, May 12, 2018

### von Mises-Fisher (part 2)

$$\newcommand{\vv}{\mathbf{v}} \newcommand{\rv}{\mathbf{r}} \newcommand{\muv}{\boldsymbol\mu} \newcommand{\mudotv}{\muv\cdot\vv}$$ A normalized SG has the same equation as the probability distribution function for a von Mises-Fisher (vMF) distribution on the 3 dimensional sphere. This affords us a few more tools and applications to work with. A vMF distribution can be defined for any dimension. I'll focus on 3D here because it is the most widely usable for computer graphics and simplifies discussion. Because a vMF does not have a free amplitude parameter it is written as: \begin{aligned} V(\vv;\muv,\lambda) = \frac{\lambda}{ 2\pi \left( 1 - e^{-2 \lambda} \right) } e^{\lambda(\mudotv - 1)} \end{aligned} \label{eq:vmf} The more common form you will likely see in literature is this: \begin{aligned} V(\vv;\muv,\lambda) = \frac{\lambda}{ 4\pi \sinh(\lambda) } e^{\lambda(\mudotv)} \end{aligned} \label{eq:vmf_sinh} which is equivalent due to the identity \begin{aligned} \sinh(x) = \frac{ 1 - e^{-2x} }{ 2e^{-x} } \end{aligned} \label{eq:sinh_identity} The form in eq \eqref{eq:vmf} is more numerically stable so should be used in practice as explained by [2].

Compare the equation for a vMF to the equation for a SG and it is easy to see that: \begin{aligned} V(\vv;\muv,\lambda) = G\left( \vv; \muv, \lambda, \frac{\lambda}{ 2\pi \left( 1 - e^{-2 \lambda} \right) } \right) \end{aligned} \label{eq:vmf_to_sg} That means a vMF is equivalent to a normalized SG and by moving terms from one side to the other we can show that a SG is equivalent to a scaled vMF. \begin{aligned} G\left( \vv; \muv, \lambda, a \right) = \frac{2\pi a}{\lambda} \left( 1 - e^{-2 \lambda} \right) V(\vv;\muv,\lambda) \end{aligned} \label{eq:sg_to_vmf}

### Fitting a vMF distribution to data

Fitting a vMF distribution to directions or points on a sphere is a very similar process as fitting a normal distribution to points on a line. In the case of a normal distribution, one calculates the mean and variance of the data set and then chooses a normal distribution with the same mean and variance as the best fit to the data.

For the vMF distribution the mean direction and spherical variance are used. Calculating these properties for a set of directions is simple. \begin{aligned} \rv = \frac{1}{n}\sum_{i=1}^{n} \textbf{x}_i \end{aligned} \label{eq:r_average} where $\textbf{x}_1, \textbf{x}_2, ..., \textbf{x}_n$ are a set of unit vectors.

Often values are associated with these directions. So instead taking a simple average we can take a weighted average. \begin{aligned} \rv = \frac{\sum_{i=1}^{n} \textbf{x}_i w_i}{\sum_{i=1}^{n} w_i} \end{aligned} \label{eq:r_weighted_average} We have the two properties, the mean direction $\muv = \frac{\rv}{\|\rv\|}$ and the spherical variance $\sigma^2 = 1 - \|\rv\|$. To fit a vMF distribution to the data we need to know what these properties are for the vMF distribution. Since the vMF distribution is convex, circularly symmetric about its axis, and is max in the direction of $\muv$, it is fairly obvious that the mean direction will be $\muv$ so I won't derive that here.

The spherical variance $\sigma^2$ on the other hand is a bit more involved. Because we already know the direction of $\rv$ is $\muv$ we can simplify this calculation to the integral of the projection of the function onto $\muv$. $$\begin{split} \|\rv\| &= \int_{S^2} V(\vv;\muv,\lambda) (\mudotv) d\vv \\ &= \frac{\lambda}{ 4\pi \sinh(\lambda) } \int_{S^2} e^{\lambda(\mudotv)} (\mudotv) d\vv \\ \end{split}$$ Because the integral over the sphere is rotation-invariant we will replace $\muv$ with the x-axis. $$\begin{split} &= \frac{\lambda}{ 4\pi \sinh(\lambda) } \int_{0}^{2 \pi} \int_{0}^{\pi} e^{\lambda\cos\theta} \cos\theta\sin\theta d\theta d\phi \\ &= \frac{\lambda}{ 4\pi \sinh(\lambda) } 2 \pi \int_{0}^{\pi} e^{\lambda\cos\theta} \cos\theta\sin\theta d\theta \\ \end{split}$$ Substituting $t=-\cos\theta$ and $dt=\sin\theta d\theta$ $$\begin{split} &= \frac{\lambda}{ 2 \sinh(\lambda) } \int_{-1}^{1} -t e^{-\lambda t} dt \\ &= \frac{\lambda}{ 2 \sinh(\lambda) } \left( \frac{ 2 \lambda \cosh(\lambda) - 2 \sinh(\lambda) }{ \lambda^2 } \right) \\ &= \frac{\cosh(\lambda)}{ \sinh(\lambda) } - \frac{\sinh(\lambda)}{ \lambda \sinh(\lambda) } \\ \end{split}$$ Arriving in its final form $$\|\rv\| = \coth(\lambda)-\frac{1}{\lambda} \label{eq:r_length}$$ Although simple in form, this function unfortunately isn't invertible. [1] provides an approximation which is close enough for our purposes. \begin{aligned} \lambda &= \|\rv\| \frac{ 3 - \|\rv\|^2}{1 - \|\rv\|^2} \end{aligned} Now that we have a way to calculate the mean and spherical variance for a data set and we know the corresponding vMF mean and spherical variance, we can fit a vMF to the data set.

Using eq \eqref{eq:r_weighted_average} to calculate $\rv$, the vMF fit to that data is $$V\left( \vv; \frac{\rv}{\|\rv\|},\|\rv\| \frac{ 3 - \|\rv\|^2}{1 - \|\rv\|^2} \right) \label{eq:r_to_vmf}$$
Going the other direction from $V(\vv;\muv,\lambda)$ form to $\rv$ form using eq \eqref{eq:r_length} is this: $$\rv = \left( \coth(\lambda)-\frac{1}{\lambda} \right) \muv \label{eq:vmf_to_r}$$

We now have a way to convert to and from $\rv$ form. $\rv$ is linearly filterable as shown in how it was originally defined in eq \eqref{eq:r_weighted_average}. This means if our vMF functions are representing a spherical distribution of something then a weighted sum of those distributions can be approximately fit by another vMF. In other words we can approximate the resulting distribution by converting to $\rv$ form, filtering, and then converting back to traditional $V(\vv;\muv,\lambda)$ form.

By using the weighted average eq \eqref{eq:r_weighted_average} we can apply this concept to non normalized SGs too. This allows us to not just filter (ie sum with a total weight of 1) but add as well. A non-normalized SG as shown in eq \eqref{eq:sg_to_vmf} is a scaled vMF. We can use this scale as the weight when summing and use the total weight as the final scale for the summed SG.

This is the $\rv$ form for $G(\vv;\muv,\lambda, a)$. It includes an additional weight value you can think of like the energy this SG is adding to the sum: \begin{aligned} \rv_i &= \left( \coth(\lambda_i)-\frac{1}{\lambda_i} \right) \muv_i \\ w_i &= \frac{2\pi a_i}{\lambda_i} \left( 1 - e^{-2 \lambda_i} \right) \\ \end{aligned} This weight is of course used in the weighted sum \begin{aligned} \rv &= \frac{\sum_{i=1}^{n} \rv_i w_i}{\sum_{i=1}^{n} w_i} \\ w &= \sum_{i=1}^{n} w_i \\ \end{aligned} Using eq \eqref{eq:r_to_vmf} and eq \eqref{eq:vmf_to_sg} we can convert back to a scaled vMF and finally to a SG in $G(\vv;\muv,\lambda, a)$ form: \begin{aligned} G\left( \vv; \muv, \lambda, a \right) &= w V(\vv;\muv,\lambda) \\ &= G\left( \vv; \muv, \lambda, w \frac{\lambda}{ 2\pi \left( 1 - e^{-2 \lambda} \right) } \right) \end{aligned} While addition and filtering are approximate they can be useful. The accuracy of the result is very dependent on the angle between the $\mu$ vectors or lobe axii. Adding sharp lobes pointed in different directions will result in a single wide lobe.

Next, what we can use this for:
Normal map filtering using vMF

### References

[1] Banerjee et al. 2005, "Clustering on the Unit Hypersphere using von Mises-Fisher Distributions"
[2] Jakob 2012, Numerically stable sampling of the von Mises Fisher distribution on S2 (and other tricks)"