Projects & Demos
|
During the past few decades, research efforts in the area of speech
processing have focused on the extraction of reliable acoustic features
for applications such as automatic speech recognition, speaker
identification, and speech coding, among many others. These acoustic
features are related either to the vocal tract (filter), or to the
glottal airflow (source) that drives it. Although the mechanics of the
supraglottal (above the glottis) system are well understood, the
subglottal (below the glottis) system and its properties have not been
explored in great detail. Unlike the supraglottal tract, the
configuration of the subglottal system remains fairly constant during
speech production, which makes its properties very interesting and
useful. In particular, its resonant frequencies, through subtle
interactions with the speech signal, are believed to have the potential
to minimize acoustic differences among speakers and also to provide
valuable information about a speaker's identity...
(details)
|
|
|
|
We almost always listen to speech that is degraded by the addition of
competing speech and non-speech signals. Fortunately, we are remarkably
adept at isolating a specific speech signal from the background noise
and understanding what is said. The purpose of this study is to
contribute to a broad research program whose aim is to understand and
model human perception of speech in noise... (details)
|
|
|
|
The TBALL project aims to advance the state of the art in speech
processing, data mining, and human-computer interface design. It
integrates these technologies with a progressive understanding of the
components of academic performance to develop an effective,
child-friendly literacy assessment system.
The project is studying the impact of this approach on native speakers
of American English and non-native speakers of Mexican-Spanish
background, longitudinally from kindergarten through second grade (K-2). It will:
* Analyze children's speech as they grow
* Develop speech recognition algorithms for automated assessments
* Create a query-based, longitudinal database for each student
* Derive instructional guidance from the analysis of an ongoing
professional development program for teachers of native and non-native
speakers
* Allow teachers to make more timely and appropriate decisions about
curriculum and instructional interventions
The project fosters interdisciplinary activities at:
UCLA - Electrical Engineering, Computer Science, Education
USC - Electrical Engineering, Linguistics, Neuroscience, Psychology
UCB - Education ... (details)
|
|
|
|
Quantitative models of the human speech production system are needed
for a better understanding of our cognitive abilities and for the
development of high-quality speech synthesizers and automatic speech
recognition systems. In previous studies, information regarding the
vocal tract geometry during speech production has been mainly derived
from lateral x-ray data. The main limitations of x-rays include
radiation risks and difficulty in accurately deducing the
cross-sectional morphology from mid-sagittal profiles... (details)
|
|
|
|
Designing high-quality speech coders and echo-cancellation schemes for
wireless networks is challenging because good quality must be
maintained with low power consumption under time-varying channel
conditions and limited bandwidth. The design should account for a
number of parameters such as bit rate, delay, power consumption,
complexity, and quality of coded speech. Available bandwidth will
depend on network protocols. Depending on the application, a set of
parameters is optimized... (details)
|
|
|
|
No accepted standard system exists for describing pathological voice
qualities. Qualities are labeled based on the perceptual judgments of
individual clinicians, a procedure plagued by inter- and intra-rater
inconsistencies and terminological confusion. Synthetic pathological
voices could be useful as an element in a standard protocol for quality
assessment. In this project, we develop guidelines for synthesizing
some kinds of severely pathological voice qualities in the hope of
making synthesis less of the subjective art it currently is and more
of a science... (details)
|
|
|
|
Acoustic feedback can cause oscillations and instability, which lead to
a howling sound produced by the hearing aid. Even motion near the
hearing aid can cause changes in the acoustic feedback path. The
purpose of this analysis is to quantify the acoustic path transfer
function (APTF) of in-the-ear (ITE) hearing aids under both static and
dynamic conditions... (details)
(demo)
|
|
|
|
In Smart Kindergarten (SmartKG), we use the early childhood education
environment as a testbed, aiming to give parents and teachers
the ability to comprehensively investigate students’ learning
processes. The questions we hope to answer range from
evaluations of students’ progress, such as “How well is student A reading
story book B?” and “Is student C spending too much time on one
learning area?”, to evaluations of students’ social behavior, such as
“Does student A tend to confront other students?” and “Is student B
usually isolated?”. The SmartKG infrastructure was designed to
collect, manage, and fuse sensor information, and to interpret
and present it in a logical and user-friendly manner... (details)