| Voice Source Project |
|
In voiced speech, the vocal folds open and close quasi-periodically and
thus convert the glottal air flow (air volume velocity) into a train of
flow pulses, which is referred to as the voice source excitation signal.
Early models of the source signal used a simple impulse train for
modeling voiced excitation. None of these models has been calibrated
with direct observations of glottal area changes, which are the proximal
cause of the air pressure changes that we hear as sound. The effective
study of the voice source thus requires both more accurate source
models and a comprehensive set of underlying observations on which to
base the models. The primary goal of the proposed research is to
develop and evaluate a new, more powerful source model based on direct
observations of vocal fold vibrations... (details)
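As a rough illustration of the simple impulse-train excitation mentioned above, the following sketch places one unit impulse at the start of each pitch period. The function name, the parameter names, and the example values (100 Hz fundamental, 8 kHz sampling rate) are assumptions chosen for illustration only; they are not taken from the project.

```python
def impulse_train(f0_hz, sample_rate_hz, duration_s):
    """Return a list of samples with a unit impulse at each pitch period.

    Illustrative only: a real glottal source is a train of smooth flow
    pulses, not ideal impulses.
    """
    n_samples = int(round(duration_s * sample_rate_hz))
    period = sample_rate_hz / f0_hz  # pitch period in samples (may be fractional)
    signal = [0.0] * n_samples
    k = 0
    while True:
        idx = int(round(k * period))  # nearest sample to the k-th pulse time
        if idx >= n_samples:
            break
        signal[idx] = 1.0
        k += 1
    return signal


# Assumed example values: 100 Hz voice, 8 kHz sampling, 50 ms of signal.
excitation = impulse_train(f0_hz=100.0, sample_rate_hz=8000, duration_s=0.05)
pulse_positions = [i for i, x in enumerate(excitation) if x == 1.0]
```

With these assumed values the pitch period is 80 samples, so the pulses fall on every 80th sample.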
|
| The Subglottal Resonances: Research and Applications |
During the past few decades, research efforts in the area of speech
processing have focused on the extraction of reliable acoustic features
for applications such as automatic speech recognition, speaker
identification, and speech coding, among many others. These acoustic
features are related either to the vocal tract (filter), or to the
glottal air flow (source) that drives it. Although the mechanics of the
supraglottal (above the glottis) system are well understood, the
subglottal (below the glottis) system and its properties have not been
explored in great detail. Unlike the supraglottal tract, the
configuration of the subglottal system remains fairly constant during
speech production, which makes its properties very interesting and
useful. In particular, its resonant frequencies, through subtle
interactions with the speech signal, are believed to have the potential
to minimize acoustic differences among speakers and also to provide
valuable information about a speaker's identity...
(details)
|
|
|
|
We almost always listen to speech that is degraded by the addition of
competing speech and non-speech signals. Fortunately, we are remarkably
adept at isolating a specific speech signal from the background noise
and understanding what is said. The purpose of this study is to
contribute to a broad research program whose aim is to understand and
model human perception of speech in noise... (details)
|
|
|
|
The TBALL project aims to advance the state of the art in speech
processing, data mining, and human-computer interface design. It
integrates these technologies with a progressive understanding of the
components of academic performance to develop an effective,
child-friendly literacy assessment system.
The project is studying the impact of this approach with native
speakers of American English and non-native speakers of Mexican-Spanish
background, longitudinally from kindergarten through second grade (K-2). It will:
* Analyze children's speech as they grow
* Develop speech recognition algorithms for automated assessments
* Create a query-based, longitudinal database for each student
* Derive instructional guidance from the analysis of an ongoing
professional development program for teachers of native and non-native
speakers
* Allow teachers to make more timely and appropriate decisions about
curriculum and instructional interventions.
The project fosters interdisciplinary activities at:
UCLA - Electrical Engineering, Computer Science, Education
USC - Electrical Engineering, Linguistics, Neuroscience, Psychology
UCB - Education ... (details)
|
|
|
|
Quantitative models of the human speech production system are needed
for a better understanding of our cognitive abilities and for the
development of high-quality speech synthesizers and automatic speech
recognition systems. In previous studies, information regarding the
vocal tract geometry during speech production has been mainly derived
from lateral x-ray data. The main limitations of x-rays include
radiation risks and difficulty in accurately deducing the
cross-sectional morphology from mid-sagittal profiles... (details)
|
|
|
|
Designing high-quality speech coders and echo-cancellation schemes for
wireless networks is a challenging task, since good quality must be
maintained with low power consumption under time-varying channel
conditions and limited bandwidth. The design should account for a
number of parameters such as bit rate, delay, power consumption,
complexity, and quality of coded speech. Available bandwidth will
depend on network protocols. Depending on the application, a set of
parameters is optimized... (details)
|
|
|
|
No accepted standard system exists for describing pathological voice
qualities. Qualities are labeled based on the perceptual judgments of
individual clinicians, a procedure plagued by inter- and intra-rater
inconsistencies and terminological confusions. Synthetic pathological
voices could be useful as an element in a standard protocol for quality
assessment. In this project, we develop guidelines for synthesizing
some kinds of severely pathological voice qualities in the hope of
making synthesis less of a subjective art, as it currently is, and more
of a science... (details)
|
|
|
|
Acoustic feedback can cause oscillations and instability which lead to
a howling sound produced by the hearing aid. Even motion near the
hearing aid can cause changes in the acoustic feedback path. The
purpose of this analysis is to quantify the acoustic path transfer
function (APTF) of in-the-ear (ITE) hearing aids under both static and
dynamic conditions... (details)
(demo)
|
|
|
|
In Smart Kindergarten, we target the early childhood education
environment as a testbed, where we try to provide parents and teachers
with the ability to investigate students’ learning processes
comprehensively. The kinds of questions that we hope to answer range from
evaluations of students’ progress, such as “How well is student A
reading the story book B?” and “Is student C spending too much time on
one learning area?”, to evaluations of students’ social behavior, such
as “Does student A tend to confront other students?” and “Is student B
usually isolated?”. The infrastructure of SmartKG was designed to
collect, manage, and fuse the information from the sensors, and to interpret
and present that information in a logical and user-friendly manner... (details)