Speaker Recognition Via Fusion of Subglottal and Supraglottal Information
May 21, 2014
from 12:00 PM to 01:00 PM
Where: Engr. IV Bldg., Tesla Room 53-125
Contact: Harish Arsikere
Department Research Forum
Hosted by: Prof. Greg Pottie
Advisor: Prof. Abeer Alwan
Given a speech signal as input, the goal of a speaker-recognition system is either to find the best match among a set of known voices (speaker identification) or to verify whether the talker is indeed the one he or she claims to be (speaker verification). State-of-the-art speaker-recognition technology is based on the use of acoustic features that characterize the supraglottal (i.e., above the glottis, which is the opening between the vocal cords) vocal-tract system. Since vocal-tract shape and acoustics change continuously over time, the efficacy of present-day systems depends on the amount of speech data available for speaker modeling and/or decision making. Unlike the vocal tract, the subglottal system remains almost unchanged during speech production; subglottal acoustics are therefore more stationary than supraglottal acoustics. Subglottal acoustics are also known to be speaker specific owing to the correlation between subglottal-tract length and body height. Therefore, acoustic features characterizing the subglottal system are hypothesized to be useful for speaker recognition, especially when the amount of speech data is limited. This talk presents the first known attempt to combine subglottal features with conventional supraglottal features for speaker identification and verification. It is shown empirically that the proposed fusion approach provides significant performance improvements relative to existing methods.
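The fusion of subglottal and supraglottal features described above can be realized at the score level. The following is a minimal illustrative sketch, not the authors' actual system: it assumes each subsystem produces raw similarity scores, applies min-max normalization so the two score ranges are comparable, and combines them with a weighted sum. The function names, the fusion weight, and the decision threshold are all hypothetical.

```python
# Hypothetical score-level fusion sketch for speaker verification.
# All names, weights, and thresholds are illustrative assumptions,
# not details of the system presented in the talk.

def min_max_normalize(scores):
    """Map raw subsystem scores to [0, 1] so the two systems are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def fuse_scores(supraglottal_scores, subglottal_scores, weight=0.7):
    """Weighted-sum fusion, with more weight on the supraglottal subsystem."""
    supra = min_max_normalize(supraglottal_scores)
    sub = min_max_normalize(subglottal_scores)
    return [weight * a + (1 - weight) * b for a, b in zip(supra, sub)]

def verify(fused_score, threshold=0.5):
    """Accept the claimed identity if the fused score clears a threshold."""
    return fused_score >= threshold
```

For speaker identification, the same fused scores would instead be computed against every enrolled speaker, with the highest-scoring speaker selected as the match.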
This work was supported in part by the National Science Foundation.
Harish Arsikere is a PhD candidate in the Electrical Engineering Department at UCLA. Prior to this, he obtained a Master of Technology degree from the Indian Institute of Technology, Kanpur, in 2009, and a Bachelor of Engineering degree from R. V. College of Engineering, Bangalore, in 2007. His research interests include signal processing, speech processing, speech production, and automatic speech and speaker recognition.