To qualify as a distance (i.e., a metric), a measure must satisfy the following properties:
- Non-negativity: $ d(P, Q) \geq 0 $, with equality if and only if $ P = Q $
- Symmetry: $ d(P, Q) = d(Q, P) $
- Triangle inequality: $ d(P, Q) + d(Q, R) \geq d(P, R) $
However, in practice we often work with weaker notions of distance, commonly referred to as divergences.
Example: KL Divergence
The Kullback-Leibler (KL) divergence is defined as: $$ D_{\text{KL}}(P \| Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx $$ where $p$ and $q$ are the densities of $P$ and $Q$.
Properties of KL Divergence
- Not Symmetric: $$ D_{\text{KL}}(P \| Q) \neq D_{\text{KL}}(Q \| P) $$
- Infinite for Mismatched Supports:
$$ D_{\text{KL}}(P \| Q) = \infty \quad \text{if } P \text{ assigns mass to a region where } Q \text{ has none.} $$
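Both properties are easy to check numerically. Below is a minimal sketch using NumPy and SciPy (`scipy.stats.entropy(p, q)` computes $D_{\text{KL}}(P \| Q)$ for discrete distributions; the distributions themselves are arbitrary toy values):

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes D_KL(P || Q)

# Arbitrary toy distributions over three outcomes.
p = np.array([0.5, 0.4, 0.1])
q = np.array([0.3, 0.3, 0.4])

print(entropy(p, q))  # D_KL(P || Q)
print(entropy(q, p))  # D_KL(Q || P) -- a different number: not symmetric

# When P puts mass where Q has none, the divergence is infinite.
p2 = np.array([1.0, 0.0])
q2 = np.array([0.0, 1.0])
print(entropy(p2, q2))  # inf
```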
Addressing These Challenges
Solution 1: Smoothing Distributions💡
To avoid the support-mismatch issue, one solution is to smooth the distributions so that their supports coincide, e.g., by mixing each with a small amount of a broad reference distribution, as sketched below.
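Here is a minimal sketch of this idea (the `smooth` helper and the `eps` value are illustrative choices, not a standard API): mixing each distribution with a little of the uniform distribution gives every outcome positive probability, so the KL divergence becomes finite.

```python
import numpy as np
from scipy.stats import entropy

def smooth(dist, eps=1e-3):
    # Mix with the uniform distribution so every outcome gets some mass.
    uniform = np.full_like(dist, 1.0 / len(dist))
    return (1 - eps) * dist + eps * uniform

p = np.array([1.0, 0.0])
q = np.array([0.0, 1.0])
print(entropy(p, q))                  # inf: disjoint supports
print(entropy(smooth(p), smooth(q)))  # large but finite
```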
Solution 2: Use a Different Divergence💡
An alternative approach is to use a divergence that naturally handles different supports and adheres to desirable distance properties. One such measure is the Wasserstein Distance.
Wasserstein Distance
The Wasserstein distance, rooted in optimal transport theory, addresses the shortcomings of KL divergence by offering:
- Symmetry
- Triangle inequality
- A meaningful geometry of the space of distributions.
Intuition Behind Optimal Transport
The Wasserstein distance can be understood as the minimum “cost” to transform one distribution into another. Imagine redistributing the “mass” of one distribution $P$ to match another distribution $Q$. Each unit of mass has a transportation cost proportional to the distance it is moved.
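To make the mass-moving picture concrete, the discrete version of this problem can be solved directly as a linear program. The sketch below (the bin locations, masses, and W1 ground cost are toy values assumed for illustration) finds the cheapest transport plan with SciPy:

```python
import numpy as np
from scipy.optimize import linprog

# Toy 1-D histograms (assumed values for illustration).
x = np.array([0.0, 1.0, 2.0])   # source bin locations
y = np.array([0.5, 1.5, 2.5])   # target bin locations
p = np.array([0.4, 0.4, 0.2])   # source masses, sum to 1
q = np.array([0.2, 0.3, 0.5])   # target masses, sum to 1

n, m = len(x), len(y)
# Cost of moving one unit of mass from x[i] to y[j] (W1 ground cost).
C = np.abs(x[:, None] - y[None, :])

# Linear program over the transport plan gamma (flattened n*m vector):
# minimize sum_ij C_ij * gamma_ij
# subject to: row sums equal p, column sums equal q, gamma >= 0.
A_eq = np.zeros((n + m, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0  # mass leaving source bin i
for j in range(m):
    A_eq[n + j, j::m] = 1.0           # mass arriving at target bin j
b_eq = np.concatenate([p, q])

res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
print("W1(P, Q) =", res.fun)          # optimal transport cost
```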
Formally, the $p$-Wasserstein distance is defined as:
$$ W_p(P, Q) = \left( \inf_{\gamma \in \Pi(P, Q)} \int \| x - y \|^p \, d\gamma(x, y) \right)^{1/p} $$

Here:
- $ \Pi(P, Q) $: the set of all couplings (joint distributions) with marginals $P$ and $Q$.
- $ \| x - y \| $: the ground cost of moving a unit of mass from $x$ to $y$.
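In one dimension, the $p = 1$ case reduces to an integral of the difference between the two CDFs, and SciPy computes it directly: `scipy.stats.wasserstein_distance` returns the 1-Wasserstein distance between empirical samples. The toy example below also shows that the distance stays finite even when the two samples barely overlap:

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
a = rng.normal(loc=0.0, scale=1.0, size=1000)   # samples from P
b = rng.normal(loc=10.0, scale=1.0, size=1000)  # samples from Q, far away

# Finite, and roughly the distance between the means (~10 here),
# even though the samples have essentially disjoint supports.
print(wasserstein_distance(a, b))
```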
Properties of Wasserstein Distance
- Captures the geometric relationship between distributions.
- Finite even for distributions with disjoint supports.
- Offers meaningful insights in contexts like generative modeling and comparing empirical distributions.
Optimal transport and the Wasserstein distance are widely used in fields such as:
- Machine Learning: Generative models (e.g., GANs with Wasserstein loss).
- Economics: Resource allocation problems.
- Physics: Modeling fluid dynamics.
- Image Processing: Comparing distributions of pixel intensities.
By leveraging the principles of optimal transport, we gain a robust and versatile framework for comparing and transforming probability distributions.