
2008-2009 Seminars by Visitors to the Department
(excluding speakers in the Seminar Series)

Teager Energy and Modulation Features for Speech Applications

Prof. Alexandros Potamianos
Technical University of Crete

Tuesday, January 20, 2009 at 2:00pm-3:00pm
Conference Room 54-134 EIV

Abstract
Several studies have been dedicated to the analysis and modeling of AM-FM modulations in speech, and different algorithms have been proposed for exploiting modulations in speech applications. This talk first details a statistical analysis of amplitude modulations using a multi-band AM-FM analysis framework. The aim of this study is to analyze the phonetic and speaker dependency of modulations in the amplitude envelope of speech resonances. The analysis focuses on the dependence of such modulations on acoustic features such as fundamental frequency, formant proximity, and phone identity, as well as on speaker identity and contextual features. The results show that the amplitude modulation index of a speech resonance is mainly a function of the speaker's average fundamental frequency, the phone identity, and the proximity between neighboring formant resonances. The results are especially relevant for speech and speaker recognition applications employing modulation features.

In the second part of this talk, the relationship between the energy estimation process and automatic speech recognition (ASR) performance in the presence of additive noise is investigated. The literature reports modest improvements in ASR performance when using a Teager Energy Cepstral Coefficients (TECC) front-end, which employs the Teager-Kaiser energy computation scheme instead of the traditional quadratic one. A theoretical analysis framework is proposed here that accounts for the short- and long-term differences between the Teager-Kaiser and quadratic energy estimation schemes in noise. The results are generalized to ASR front-ends that use cepstrum-like coefficients and are experimentally verified. It is shown that the Teager-Kaiser energy computation scheme has superior short-term behavior compared to the quadratic one and performs better in the presence of lowpass noise, e.g., car noise.
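For readers unfamiliar with the two energy measures compared above, the following is a minimal sketch, not the speaker's TECC front-end. It assumes the commonly used discrete form of the Teager-Kaiser operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and contrasts it with the sample-wise quadratic energy x[n]^2. The function names and the test signal parameters are illustrative assumptions.

```python
import math


def teager_kaiser_energy(x):
    """Discrete Teager-Kaiser energy, assumed form:
    psi[n] = x[n]^2 - x[n-1] * x[n+1], defined for interior samples only."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def quadratic_energy(x):
    """Traditional sample-wise squared energy, x[n]^2."""
    return [v * v for v in x]


# Illustrative test signal (hypothetical parameters): a pure sinusoid
# x[n] = A * cos(omega * n + phase).
amplitude, omega, phase = 2.0, 0.3, 0.4
signal = [amplitude * math.cos(omega * n + phase) for n in range(50)]

tk = teager_kaiser_energy(signal)
quad = quadratic_energy(signal)

# For a pure sinusoid the Teager-Kaiser output is the constant
# A^2 * sin^2(omega), so it depends on frequency as well as amplitude.
expected_tk = amplitude ** 2 * math.sin(omega) ** 2

# The quadratic energy oscillates from sample to sample and needs
# averaging over a window before it yields a stable estimate.
tk_spread = max(tk) - min(tk)
quad_spread = max(quad) - min(quad)
```

On this test signal the Teager-Kaiser output is constant from three samples alone, while the squared signal swings between roughly 0 and A^2. This only illustrates the short-term behavior of the operator on a clean sinusoid; it says nothing about the noise analysis presented in the talk.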

Bio
Alexandros Potamianos received the Diploma in Electrical and Computer Engineering from the National Technical University of Athens, Greece, in 1990. He received the M.S. and Ph.D. degrees in Engineering Sciences from Harvard University, Cambridge, MA, USA, in 1991 and 1995, respectively. He received the M.B.A. degree from the Stern School of Business, NYU, in 2002. In the spring of 2003, he joined the Department of Electronics and Computer Engineering at the Technical University of Crete, Chania, Greece, as an associate professor.

His current research interests include speech processing, analysis, synthesis, and recognition; dialog and multimodal systems; nonlinear signal processing; natural language understanding; artificial intelligence; and multimodal child-computer interaction.

Copyright © 2009. The University of California. All rights reserved.
UCLA Electrical Engineering.