
The Voice Source in Speech Production: Data, Analysis and Models

What: PhD Defenses
When: Mar 02, 2010, from 12:00 PM to 02:00 PM
Where: Engr IV Maxwell Room 57-124

Yen-Liang Shue
Advisor: Abeer Alwan

Tuesday, March 2, 2010 at 12:00pm-2:00pm
Engr IV Maxwell Room 57-124

Abstract:
Analysis of the voice source with respect to voice quality is essential to the understanding of the human speech production system, which can lead to better speech modeling for improving a vast range of applications. However, due to the position of the vocal folds, analyzing the source is often hampered by the lack of direct observations with which to calibrate algorithms.

In this work, two approaches to voice source and voice quality analysis were pursued. In the first approach, the source waveform was extracted by analyzing glottal area waveforms obtained from high-speed imaging of the vocal folds. These direct observations led to the development of a new source model, which is more accurate than existing models. A codebook search technique was then proposed to estimate the source signal from the acoustic data. Results were promising for a number of model parameters, such as the open quotient and the speed of opening. However, error analysis showed that the algorithm required reasonable formant-frequency constraints, which may be difficult to obtain automatically in some cases.
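As a rough illustration of one parameter named above: the open quotient is commonly defined as the fraction of a glottal cycle during which the glottis is open. The sketch below is not the method used in the thesis. It assumes a single synthetic cycle of a glottal area waveform, and the function name `open_quotient`, the `threshold_ratio` parameter, and the half-sine pulse shape are all hypothetical choices made for illustration.

```python
import math


def open_quotient(area, threshold_ratio=0.05):
    """Estimate the open quotient of one glottal cycle.

    `area` holds glottal area samples covering exactly one cycle.
    A sample counts as "open" when it exceeds a small fraction of the
    cycle's peak area (an assumed, simplistic openness criterion).
    """
    peak = max(area)
    if peak <= 0:
        # The glottis never opens in this cycle.
        return 0.0
    open_samples = sum(1 for a in area if a > threshold_ratio * peak)
    return open_samples / len(area)


# Synthetic cycle (assumption): a half-sine opening pulse lasting 60
# samples, followed by 40 samples of complete closure.
n_open, n_closed = 60, 40
pulse = [math.sin(math.pi * (k + 0.5) / n_open) for k in range(n_open)]
cycle = pulse + [0.0] * n_closed

oq = open_quotient(cycle)
```

Because the threshold discards the low-area samples at the edges of the pulse, the estimate comes out slightly below the nominal 0.6 of the synthetic cycle; setting `threshold_ratio=0.0` recovers the nominal value. Real glottal area data would also need cycle segmentation and noise handling, which this sketch leaves out.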

In the second approach, voice source related measures were used in three voice quality applications: voice source analysis, automatic gender classification and prosody analysis. In voice source analysis, acoustic measures were examined in the context of the voice source model parameters obtained by fitting the model to the glottal area waveforms. Results showed correlations between model parameters and the related acoustic measures, such as the asymmetry coefficient and harmonic-to-noise ratio measures. It was also shown that the model parameters and related acoustic measures were affected by the type of voice quality (pressed, normal and breathy). In gender classification, voice source related measures were found to be more helpful for younger (10- to 14-year-old) speakers, for whom traditional pitch and formant frequency features were less useful. Analysis of prosody showed that, among other things, features correlated with pitch accents were not necessarily centered at the target syllable, and depended on the position of other prosodic events.

Biography:
Yen-Liang Shue received his B.E. with highest honors in Computer Engineering from the University of New South Wales, Australia, in 2002, and his M.S. degree in Electrical Engineering from the University of California, Los Angeles, in 2005. He is currently a Ph.D. candidate under the supervision of Professor Abeer Alwan. His research interests include speech analysis, source estimation and voice quality analysis.
