Transportation Distances and their Application in Machine Learning: New Problems
Nov 30, 2012
from 02:00 PM to 03:00 PM
Contact Name: Prof. Lieven Vandenberghe
In this talk, I will present two new research topics related to the optimal transportation distance (also known as the Earth Mover's or Wasserstein distance) and its application in machine learning to compare histograms of features.
I will first discuss the ground metric learning problem, which is the problem of automatically tuning the parameters of transportation distances using labeled histogram data. After providing some reminders on optimal transportation, I will argue that learning transportation distances is akin to learning an L1 distance on the simplex, namely a distance with polyhedral level sets, and I will draw some parallels with Mahalanobis distances, the L2 distance and elliptic level sets. I will then introduce our algorithm (arXiv:1110.2306) and more recent extensions.
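As a reminder of what a transportation distance computes, the following is a minimal sketch, not the algorithm of arXiv:1110.2306. It assumes two histograms r and c with equal total mass and a ground metric matrix M, and solves the underlying linear program with SciPy's general-purpose `linprog` solver. The function name `transport_distance` and the toy data are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def transport_distance(r, c, M):
    """Optimal transportation cost between histograms r and c.

    M[i, j] is the ground cost of moving one unit of mass from bin i
    of r to bin j of c. Returns the optimal cost and the optimal plan.
    """
    r = np.asarray(r, dtype=float)
    c = np.asarray(c, dtype=float)
    M = np.asarray(M, dtype=float)
    n, m = M.shape

    # The plan P is flattened row by row, so P[i, j] sits at index i * m + j.
    # Row-sum constraints: sum_j P[i, j] = r[i].
    A_rows = np.kron(np.eye(n), np.ones((1, m)))
    # Column-sum constraints: sum_i P[i, j] = c[j].
    A_cols = np.kron(np.ones((1, n)), np.eye(m))

    res = linprog(
        M.ravel(),
        A_eq=np.vstack([A_rows, A_cols]),
        b_eq=np.concatenate([r, c]),
        bounds=(0, None),
        method="highs",
    )
    if not res.success:
        raise ValueError(res.message)
    return res.fun, res.x.reshape(n, m)


# Toy example (assumed data): three bins on a line, ground metric |i - j|.
r = np.array([0.5, 0.5, 0.0])
c = np.array([0.0, 0.5, 0.5])
M = np.abs(np.subtract.outer(np.arange(3), np.arange(3))).astype(float)

dist, plan = transport_distance(r, c, M)
print(dist)
print(plan)
```

In this sketch the ground metric M is fixed by hand. Ground metric learning, as described above, would instead tune the entries of M from labeled histogram data.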
In the second part of my talk, I will address the fact that transportation distances are not Hilbertian by showing that they can be cast as positive definite kernels through the "generating function trick". We prove that the trick, which uses the generating function of the transportation polytope to define a similarity rather than focusing exclusively on the optimal transport to define a distance, leads to a positive definite kernel between histograms (arXiv:1209.2655).
Marco Cuturi received his PhD in 2005 from the Ecole des Mines de Paris, under the supervision of JP Vert. He has worked at the Institute of Statistical Mathematics in Tokyo and at Princeton University, and is now Associate Professor at Kyoto University. His current research interests lie in machine learning, including metric learning, kernel methods and the analysis of time series.