Let $W_2(\mu,\nu)$ denote the $2$-Wasserstein distance between two given probability measures $\mu$ and $\nu$ on $\mathbb R^n$. For a probability measure $\mu$ and $f:\mathbb R^n\to \mathbb R^n$, let $f_{\#}\mu=\mu\circ f^{-1}$ denote the push-forward of $\mu$ under $f$, i.e. $(f_{\#}\mu)(B)=\mu(f^{-1}(B))$ for every Borel set $B$ in $\mathbb R^n$. Why does the following inequality hold true?

$$W^2_2(f_{\#}\mu,g_{\#}\mu)\leq \int_{\mathbb R^n}|f(x)-g(x)|^2\,d\mu(x) $$

for all $\mu$-measurable functions $f,g:\mathbb R^n\to\mathbb R^n$.

A comment: the product measure $f_{\#}\mu\otimes g_{\#}\mu$ is a so-called transport plan between $f_{\#}\mu$ and $g_{\#}\mu$, so by the definition of the Wasserstein distance

$$W^2_2(f_{\#}\mu,g_{\#}\mu)\leq \int_{\mathbb R^n\times \mathbb R^n}|x-y|^2\,d(f_{\#}\mu\otimes g_{\#}\mu)(x,y)=\int_{\mathbb R^n\times \mathbb R^n}|f(x)-g(y)|^2\,d\mu(x)\,d\mu(y).$$
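
A sketch of why the sharper bound might hold, assuming the right-hand side of the claimed inequality is finite (otherwise there is nothing to prove): the product measure is only one admissible plan, and it ignores that both push-forwards come from the same $\mu$. Another candidate plan, written here as $\gamma$ (notation introduced only for this sketch), is the push-forward of $\mu$ under the map $x\mapsto (f(x),g(x))$:

$$\gamma=(f,g)_{\#}\mu,\qquad \gamma(A\times B)=\mu\big(f^{-1}(A)\cap g^{-1}(B)\big)\quad\text{for Borel sets } A,B\subset\mathbb R^n.$$

Taking $B=\mathbb R^n$ and then $A=\mathbb R^n$ shows that the marginals of $\gamma$ are $f_{\#}\mu$ and $g_{\#}\mu$, so $\gamma$ is a transport plan. The change-of-variables formula for push-forwards then gives

$$W^2_2(f_{\#}\mu,g_{\#}\mu)\leq \int_{\mathbb R^n\times \mathbb R^n}|x-y|^2\,d\gamma(x,y)=\int_{\mathbb R^n}|f(x)-g(x)|^2\,d\mu(x).$$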