Teager Energy and Modulation Features for Speech Applications

What: Visitor Seminars
When: Jan 20, 2009, from 02:00 PM to 03:00 PM
Where: Conference Room 54-134 EIV

Prof. Alexandros Potamianos
Technical University of Crete

Tuesday, January 20, 2009 at 2:00pm-3:00pm
Conference Room 54-134 EIV

Abstract
Several studies have analyzed and modeled AM-FM modulations in speech, and various algorithms have been proposed for exploiting modulations in speech applications. This talk first details a statistical analysis of amplitude modulations using a multi-band AM-FM analysis framework. The aim of this study is to analyze the phonetic and speaker dependency of modulations in the amplitude envelope of speech resonances. The analysis focuses on the dependence of such modulations on acoustic features such as fundamental frequency, formant proximity, and phone identity, as well as on speaker identity and contextual features. The results show that the amplitude modulation index of a speech resonance is mainly a function of the speaker's average fundamental frequency, the phone identity, and the proximity between neighboring formant resonances. The results are especially relevant for speech and speaker recognition applications that employ modulation features.

In the second part of this talk, the relationship between the energy estimation process and automatic speech recognition (ASR) performance in the presence of additive noise is investigated. The literature reports modest improvements in ASR performance when using a Teager Energy Cepstral Coefficients (TECC) front-end, which employs the Teager-Kaiser energy computation scheme instead of the traditional quadratic one. A theoretical analysis framework is proposed here that accounts for the short- and long-term differences between the Teager-Kaiser and quadratic energy estimation schemes in noise. The results are generalized to ASR front-ends that use cepstrum-like coefficients and are verified experimentally. It is shown that the Teager-Kaiser energy computation scheme has superior short-term behavior compared to the quadratic one and performs better in the presence of lowpass noise such as car noise.
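For readers unfamiliar with the two energy schemes, the following is a minimal illustrative sketch, not the speaker's implementation. It assumes the standard discrete Teager-Kaiser operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and takes the "quadratic" scheme to mean the plain squared sample value. The function names, test signal, amplitude, and frequencies are all hypothetical choices made for illustration.

```python
import math

def teager_kaiser_energy(x):
    """Discrete Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values because the two endpoints have no neighbor.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def quadratic_energy(x):
    """Instantaneous squared-sample energy (assumed 'quadratic' scheme)."""
    return [v * v for v in x]

def sinusoid(amplitude, omega, phase, length):
    """Hypothetical test signal: amplitude * cos(omega * n + phase)."""
    return [amplitude * math.cos(omega * n + phase) for n in range(length)]

# Illustrative parameters (assumptions, not values from the talk).
amplitude = 2.0
omega_high = 0.3   # radians per sample
omega_low = 0.05   # radians per sample, a low-frequency component

tk_high = teager_kaiser_energy(sinusoid(amplitude, omega_high, 0.5, 200))
tk_low = teager_kaiser_energy(sinusoid(amplitude, omega_low, 0.5, 200))
sq_high = quadratic_energy(sinusoid(amplitude, omega_high, 0.5, 200))

# For a pure sinusoid the Teager-Kaiser output is constant at every sample
# and equals amplitude^2 * sin(omega)^2, so it tracks both amplitude and
# frequency. The squared-sample energy instead oscillates between 0 and
# amplitude^2 and must be averaged over a window to become stable.
expected_tk_high = amplitude ** 2 * math.sin(omega_high) ** 2
expected_tk_low = amplitude ** 2 * math.sin(omega_low) ** 2

print("TK energy range:", min(tk_high), max(tk_high))
print("Squared energy range:", min(sq_high), max(sq_high))
print("TK energy of low-frequency tone:", tk_low[0])
```

In this toy setting, a low-frequency tone of the same amplitude yields a much smaller Teager-Kaiser output because of the sin(omega)^2 weighting. This is only one intuition for why the operator might respond differently to lowpass noise; the theoretical analysis presented in the talk is not reproduced here.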

Biography
Alexandros Potamianos received the Diploma in Electrical and Computer Engineering from the National Technical University of Athens, Greece, in 1990. He received the M.S. and Ph.D. degrees in Engineering Sciences from Harvard University, Cambridge, MA, USA, in 1991 and 1995, respectively. He received the M.B.A. degree from the Stern School of Business, NYU, in 2002. In the spring of 2003, he joined the Department of Electronics and Computer Engineering at the Technical University of Crete, Chania, Greece, as an associate professor.

His current research interests include speech processing, analysis, synthesis, and recognition; dialog and multimodal systems; nonlinear signal processing; natural language understanding; artificial intelligence; and multimodal child-computer interaction.
