Teager Energy and Modulation Features for Speech Applications
| What | |
|---|---|
| When | Jan 20, 2009 from 02:00 PM to 03:00 PM |
| Where | Conference Room 54-134 EIV |
Prof. Alexandros Potamianos
Technical University of Crete
Tuesday, January 20, 2009 at 2:00pm-3:00pm
Conference Room 54-134 EIV
Abstract
Several studies have been dedicated to the analysis and modeling of
AM-FM modulations in speech, and different algorithms have been proposed
for the exploitation of modulations in speech applications. This talk
first details a statistical analysis of amplitude modulations using a
multi-band AM-FM analysis framework. The aim of this study is to
analyze the phonetic- and speaker-dependency of modulations in the
amplitude envelope of speech resonances. The analysis focuses on the
dependence of such modulations on acoustic features such as fundamental
frequency, formant proximity, and phone identity, as well as on speaker
identity and contextual features. The results show that the amplitude
modulation index of a speech resonance is mainly a function of the
speaker's average fundamental frequency, the phone identity, and the
proximity between neighboring formant resonances. The results are
especially relevant for speech and speaker recognition applications
employing modulation features.
In the second part of this talk, the relationship between the energy
estimation process and automatic speech recognition (ASR) performance in
the presence of additive noise is investigated. Modest improvements in
ASR performance have been shown in the literature when using a
Teager Energy Cepstral Coefficients (TECC) front-end that employs the
Teager-Kaiser energy computation scheme instead of the traditional
quadratic one. A theoretical analysis framework is proposed here that
accounts for the short- and long-term differences between the
Teager-Kaiser and quadratic energy estimation schemes in noise. The
results are generalized for the case of ASR front-ends that use
cepstrum-like coefficients and are experimentally verified. It is shown
that the Teager-Kaiser energy computation scheme has superior short-term
behavior compared to the quadratic one, and performs better in the
presence of lowpass noise, e.g., car noise.
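As a rough illustration of the two energy estimates contrasted above, the sketch below compares the standard discrete Teager-Kaiser operator, x[n]^2 - x[n-1]*x[n+1], with the plain squared-sample (quadratic) energy on a pure sinusoid. It is not the TECC front-end or the analysis framework from the talk; the function names and the test-signal parameters are assumptions chosen only for the example.

```python
import math


def teager_kaiser_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, because the first and last samples
    lack a neighbor on one side.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def quadratic_energy(x):
    """Instantaneous quadratic energy: the squared sample value."""
    return [v * v for v in x]


# Hypothetical test signal: a sinusoid with amplitude A and
# digital frequency omega (radians per sample).
A = 2.0
omega = 0.3
x = [A * math.cos(omega * n + 0.5) for n in range(200)]

tk = teager_kaiser_energy(x)
sq = quadratic_energy(x)

# For a pure sinusoid the Teager-Kaiser energy is the constant
# A^2 * sin(omega)^2, while the squared samples oscillate between 0 and A^2.
expected = (A * math.sin(omega)) ** 2
print(max(abs(v - expected) for v in tk))  # numerically close to zero
print(min(sq), max(sq))
```

For a sinusoid the Teager-Kaiser output depends on frequency as well as amplitude (roughly A^2 * omega^2 for small omega), so low-frequency components contribute less to it than to the quadratic energy. This may offer some intuition for the reported behavior in lowpass noise, though the talk's own analysis is not reproduced here.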
Biography
Alexandros Potamianos received the Diploma in Electrical and Computer
Engineering from the National Technical University of Athens, Greece in
1990. He received the M.S. and Ph.D. degrees in Engineering Sciences from
Harvard University, Cambridge, MA, USA in 1991 and 1995, respectively.
He received the M.B.A. degree from the Stern School of Business, NYU in
2002. In the spring of 2003, he joined the Department of Electronics and
Computer Engineering at the Technical University of Crete, Chania,
Greece as an associate professor.
His current research interests include speech processing, analysis,
synthesis and recognition, dialog and multimodal systems, nonlinear
signal processing, natural language understanding, artificial
intelligence and multimodal child-computer interaction.
