Large margin based parameter estimation for hidden Markov models
| What | |
|---|---|
| When | Jan 23, 2009 from 10:30 AM to 11:30 AM |
| Where | Conference Room 54-134 EIV |
Prof. Fei Sha
University of Southern California
Friday, January 23, 2009 at 10:30am
Conference Room 54-134 EIV
Abstract
In many application domains, we face the task of characterizing the
distribution of continuous random variables. For instance, in automatic
speech recognition (ASR), these variables are acoustic properties of
speech signals. For such tasks, Gaussian mixture models (GMMs) are
widely used as very effective density estimators. In particular, in the
context of ASR, they are embedded in continuous-density hidden Markov
models (CD-HMMs) to yield emission probabilities, i.e., the likelihoods
of acoustic observations conditioned on hidden states such as phonemes.
Meanwhile, the transition probabilities in CD-HMMs attempt to capture
temporal properties of speech signals. Similar modeling choices arise in
other applications, for instance, in activity recognition.
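As a rough illustration of how a GMM can supply an emission probability for one hidden state, the sketch below evaluates the log-likelihood of an observation under a diagonal-covariance mixture. The function names, the diagonal-covariance assumption, and all parameter values are hypothetical and chosen only for illustration; they are not taken from the talk.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, weights, means, variances):
    """Log emission probability log p(x | state) under a GMM."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Log-sum-exp keeps the computation numerically stable.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical two-component GMM for a single hidden state, 2-D features.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

near = gmm_log_likelihood([0.1, -0.2], weights, means, variances)
far = gmm_log_likelihood([10.0, 10.0], weights, means, variances)
print(near > far)  # an observation close to a component mean scores higher
```

In a CD-HMM, one such mixture would typically be attached to each hidden state, and these per-state log-likelihoods would be combined with the transition probabilities during decoding.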
Various techniques have been developed to estimate the parameters of
CD-HMMs. In particular, discriminative techniques such as conditional
maximum likelihood and minimum classification error have attracted
significant research attention. When carefully and skillfully
implemented, they often lead to lower error rates (in speech
recognition) than traditional techniques of maximum likelihood
estimation.
In this talk, I will describe a new discriminative technique that is
based on the principle of large margin, a key framework in many machine
learning algorithms including support vector machines and boosting. The
new technique differs from previous discriminative methods for ASR in
the goal of margin maximization. In particular, in our large margin
training of CD-HMMs, model parameters are optimized to maximize the gap
(or the margin) between correct and incorrect classifications. I will
present an extensive empirical evaluation of our approach on two
benchmark problems in speech recognition: phonetic classification and
recognition on the TIMIT speech database. In both tasks, large margin
systems obtain significantly better performance than systems trained by
maximum likelihood estimation or competing discriminative frameworks.
An in-depth analysis also reveals some interesting features of our
approach, which contribute to the superior performance.
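The notion of a margin between correct and incorrect classifications can be sketched with a generic hinge-style loss over per-class scores. This is a simplified illustration of the large margin idea in general, not the specific objective used in the talk; the class labels, the scores, and the `target_margin` parameter are assumptions made for the example.

```python
def margin(scores, correct):
    """Gap between the correct class score and the best incorrect score."""
    best_wrong = max(s for label, s in scores.items() if label != correct)
    return scores[correct] - best_wrong

def hinge_loss(scores, correct, target_margin=1.0):
    """Loss is zero once the correct class wins by at least target_margin."""
    return max(0.0, target_margin - margin(scores, correct))

# Hypothetical per-class scores (e.g. log-likelihoods) for one speech frame.
scores = {"aa": -12.0, "ae": -12.5, "ah": -15.0}

print(margin(scores, "aa"))      # correct class wins, but by less than 1.0
print(hinge_loss(scores, "aa"))  # so a positive loss remains
```

Under this kind of objective, a training procedure would adjust the model parameters until the correct label outscores every competitor by the target margin, rather than only raising the likelihood of the correct label.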
Towards the end of the talk, I will briefly discuss the connection of
our work to structured prediction problems in machine learning. I will
also discuss future directions for this line of work and other potential
applications.
Biography
Fei Sha received his Ph.D. in computer and information science from the
University of Pennsylvania. Afterwards, he spent a year at UC Berkeley as
a postdoc with Prof. Michael Jordan and Prof. Stuart Russell. He then
joined Yahoo Research as a research scientist for a year. He has been a
faculty member in USC's computer science department since the summer of
2008. His research focuses on statistical machine learning. His work has
won best student paper awards at NIPS and ICML.
