Geometry of distance functions for probability distributions
Monday, December 12, 2PM – 3PM
Distance functions are used to analyze points drawn from mixture models, for example to cluster them. There are many distances to choose from, and selection is usually driven by the historical preference of the application domain, information theory, or the perceived quality of the resulting clusters. Relatively little consideration is usually given to how distance functions geometrically transform data, or to the distances' algebraic properties. Here we take a look at these, in the hope of providing complementary insight. Several popular distances are shown to be nearly equivalent, despite the dissimilarity of their functional forms: triangular discrimination, Jensen-Shannon divergence, and the square of the Hellinger distance. Equivalence is in terms of their functional forms after transformations, factorizations, and series expansions, and in terms of the geometry of their contours. The ratio between these distances is nearly constant for modest ratios of point coordinates, up to about 4:1. Beyond that, the distances grow at different rates. We include derivations of ratio and difference bounds.
We provide some constructions that nearly achieve the worst-case bounds. These help us understand when the different functions would give different orderings to the distances between points.
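The near-equivalence described above can be checked numerically. The sketch below uses the standard textbook definitions of the three distances (natural logarithm for Jensen-Shannon; the function names are my own, not from the talk) and evaluates them on two nearby distributions, where second-order expansions suggest that triangular discrimination is roughly four times both the Jensen-Shannon divergence and the squared Hellinger distance:

```python
import math

# Standard definitions, assumed from textbook forms; names are illustrative.

def triangular_discrimination(p, q):
    """Delta(p, q) = sum_i (p_i - q_i)^2 / (p_i + q_i)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)

def jensen_shannon(p, q):
    """JS(p, q) = 0.5*KL(p||m) + 0.5*KL(q||m), with m the midpoint of p and q."""
    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def hellinger_sq(p, q):
    """H^2(p, q) = 0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2."""
    return 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

# Two nearby distributions: all coordinate ratios are modest (well under 4:1),
# the regime in which the abstract says the ratio of the distances is nearly flat.
p = [0.30, 0.30, 0.40]
q = [0.36, 0.24, 0.40]

td = triangular_discrimination(p, q)
js = jensen_shannon(p, q)
h2 = hellinger_sq(p, q)

# Near p = q, Taylor expansion gives JS ~ Delta/4 and H^2 ~ Delta/4,
# so both normalized ratios below should sit close to 1.
print(td, 4 * js / td, 4 * h2 / td)
```

Pushing the coordinate ratios well past 4:1 (e.g. a coordinate of 0.01 against 0.5) makes the normalized ratios drift apart, consistent with the distances growing at different rates in that regime.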
Hosted by Chandrajit Bajaj