What is Wasserstein Distance? Wasserstein Distance Explained
Wasserstein distance, also known as the Wasserstein metric or, in its first-order form, the Earth Mover’s distance (EMD), is a measure of dissimilarity between two probability distributions. It quantifies the minimum cost required to transform one distribution into another, where the cost is the amount of “work” needed: the mass moved multiplied by the distance it travels.
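It is particularly useful when comparing distributions that have different shapes or supports, where traditional measures such as Euclidean distance or Kullback-Leibler divergence may not be appropriate. For example, the Kullback-Leibler divergence becomes infinite when the supports do not overlap, whereas the Wasserstein distance stays finite and reflects how far apart the distributions are. It is widely used in various domains, including statistics, machine learning, and optimal transport theory.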
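To make the contrast with Kullback-Leibler divergence concrete, here is a minimal sketch using SciPy. The grid and the point-mass histograms are illustrative values chosen for this example, not data from the article, and point_mass is a hypothetical helper.

```python
import numpy as np
from scipy.stats import entropy, wasserstein_distance

# Shared grid of locations 0..9 (illustrative, not from the article).
grid = np.arange(10)

def point_mass(location):
    """Hypothetical helper: histogram on `grid` with all mass at one location."""
    hist = np.zeros(len(grid))
    hist[location] = 1.0
    return hist

p = point_mass(0)
q_near = point_mass(1)  # one step away from p
q_far = point_mass(9)   # nine steps away from p

# With two arguments, scipy.stats.entropy computes the KL divergence.
# It is infinite for both pairs because the supports do not overlap.
kl_near = entropy(p, q_near)
kl_far = entropy(p, q_far)

# The 1-D Wasserstein distance grows with how far the mass has to move.
w_near = wasserstein_distance(grid, grid, u_weights=p, v_weights=q_near)
w_far = wasserstein_distance(grid, grid, u_weights=p, v_weights=q_far)

print("KL:", kl_near, kl_far)
print("Wasserstein:", w_near, w_far)
```

Kullback-Leibler divergence cannot tell the near pair from the far pair, while the Wasserstein distance separates them by how far the mass travels.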
The EMD between two probability distributions P and Q is calculated by finding the optimal transport plan that minimizes the total cost of moving mass from P to Q. A transport plan specifies how much mass moves from each point of P to each point of Q; the mass at a single point may be split across several destinations. The cost of each move is the amount of mass moved multiplied by the distance between the two points. The Wasserstein distance is the minimum total cost needed to transport the entire mass from P to Q.
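For small discrete distributions, the optimal transport plan can be found with a generic linear-programming solver. The sketch below is one possible way to do it, using scipy.optimize.linprog. The function name emd_linear_program and the example locations and weights are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def emd_linear_program(p, q, cost):
    """Hypothetical helper: solve the discrete transport problem as an LP.

    p: source masses (length n), q: target masses (length m),
    cost: n-by-m matrix of distances between source and target points.
    Returns (minimum total cost, n-by-m transport plan).
    """
    n, m = cost.shape
    # The plan is flattened row by row, so entry (i, j) sits at index i * m + j.
    # Row constraints: mass leaving source i must equal p[i].
    row_constraints = np.kron(np.eye(n), np.ones((1, m)))
    # Column constraints: mass arriving at target j must equal q[j].
    col_constraints = np.kron(np.ones((1, n)), np.eye(m))
    A_eq = np.vstack([row_constraints, col_constraints])
    b_eq = np.concatenate([p, q])
    result = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq,
                     bounds=(0, None), method="highs")
    return result.fun, result.x.reshape(n, m)

# Illustrative example (values are assumptions, not from the article).
source_locations = np.array([0.0, 1.0])
target_locations = np.array([0.0, 1.0, 2.0])
p = np.array([0.6, 0.4])
q = np.array([0.3, 0.3, 0.4])
cost = np.abs(source_locations[:, None] - target_locations[None, :])

total_cost, plan = emd_linear_program(p, q, cost)
print("minimum cost:", total_cost)
print("transport plan:\n", plan)
```

In this example the source point at 0 holds more mass than any single target can accept, so every valid plan has to split it across several targets. The optimal plan is not unique here, so the solver may return any one of several plans with the same minimum cost.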
The formal definition involves solving an optimization problem known as the Kantorovich formulation of optimal transport. The distance comes in different orders, such as the first Wasserstein distance (W1), which coincides with the EMD, and the general p-th Wasserstein distance (Wp), where p ≥ 1 is the exponent applied to the distance between points in the transport cost.
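Written out, the p-th Wasserstein distance in the Kantorovich formulation is usually stated as below. The symbols are standard notation rather than anything defined in the text above: d(x, y) is the ground distance between points, Γ(P, Q) is the set of joint distributions (transport plans) whose marginals are P and Q, and γ is one such plan.

```latex
W_p(P, Q) = \left( \inf_{\gamma \in \Gamma(P, Q)} \int d(x, y)^p \, \mathrm{d}\gamma(x, y) \right)^{1/p}, \qquad p \ge 1
```

Setting p = 1 gives W1, the Earth Mover’s distance.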
It has several desirable properties, including being a true metric (on distributions with finite p-th moments), meaning it satisfies non-negativity, identity of indiscernibles (the distance is zero only when the two distributions are equal), symmetry, and the triangle inequality. It is also able to capture the structural differences between distributions, taking into account both the magnitude and the location of the differences.
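The metric properties can be spot-checked numerically. The sketch below does so for three one-dimensional samples using scipy.stats.wasserstein_distance, which computes W1. The samples are arbitrary choices made for illustration, and a numerical check on one example is not a proof.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Three arbitrary 1-D samples (illustrative; the seed fixes the draw).
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=200)
b = rng.normal(2.0, 0.5, size=200)
c = rng.uniform(-1.0, 3.0, size=200)

d_aa = wasserstein_distance(a, a)
d_ab = wasserstein_distance(a, b)
d_ba = wasserstein_distance(b, a)
d_bc = wasserstein_distance(b, c)
d_ac = wasserstein_distance(a, c)

print("distance to itself is zero:", np.isclose(d_aa, 0.0))
print("non-negative:", d_ab >= 0.0)
print("symmetric:", np.isclose(d_ab, d_ba))
print("triangle inequality:", d_ac <= d_ab + d_bc + 1e-12)
```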
In practical applications, computing the exact EMD can be computationally expensive, especially for distributions with many support points or in high dimensions. However, approximate algorithms and efficient computational techniques have been developed to estimate this distance for large-scale problems. One example is the Sinkhorn algorithm, which solves an entropy-regularized version of the transport problem.
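The following is a minimal NumPy sketch of Sinkhorn iterations for the entropy-regularized problem. It is meant as an illustration, not a production implementation. The function name sinkhorn, the regularization strength reg, the iteration count, and the example data are all assumptions. Very small values of reg can cause numerical underflow in this plain form.

```python
import numpy as np

def sinkhorn(p, q, cost, reg=0.2, n_iters=2000):
    """Hypothetical helper: approximate optimal transport via Sinkhorn scaling.

    Alternately rescales the rows and columns of the kernel exp(-cost / reg)
    so that the resulting plan has marginals close to p and q.
    Returns (transport cost of the regularized plan, plan).
    """
    kernel = np.exp(-cost / reg)
    u = np.ones_like(p)
    v = np.ones_like(q)
    for _ in range(n_iters):
        u = p / (kernel @ v)    # scale rows toward marginal p
        v = q / (kernel.T @ u)  # scale columns toward marginal q
    plan = u[:, None] * kernel * v[None, :]
    return np.sum(plan * cost), plan

# Illustrative example (values are assumptions, not from the article).
source_locations = np.array([0.0, 1.0])
target_locations = np.array([0.0, 1.0, 2.0])
p = np.array([0.6, 0.4])
q = np.array([0.3, 0.3, 0.4])
cost = np.abs(source_locations[:, None] - target_locations[None, :])

approx_cost, approx_plan = sinkhorn(p, q, cost)
print("approximate transport cost:", approx_cost)
```

Smaller values of reg bring the result closer to the exact EMD but slow convergence. Larger values converge faster and give a smoother plan with a higher transport cost.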
It has found applications in various areas, including image analysis, texture synthesis, generative modeling, and shape matching. It provides a powerful tool for comparing and quantifying the dissimilarity between probability distributions in a meaningful way.