Shareware: Codes, Databases, and Useful Links

Codes

 

VFR: Variable Frame Rate

    Variable Frame Rate (VFR) analysis is a feature extraction method for noise-robust automatic speech recognition (ASR). It builds on speech perception research showing that dynamic spectro-temporal information is important and, hence, that not all equal-duration speech segments are equally important perceptually. For example, formant transitions at the onset of a vowel can carry more discriminative information than the steady-state part of the vowel ... (details) (download)
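
    The idea of keeping more frames where the spectrum changes quickly can be illustrated with a minimal sketch. This is not the downloadable VFR code or its algorithm; it is a generic accumulated-distance frame selector, written under assumptions. The function names (frame_signal, log_spectra, select_frames), the log-magnitude spectrum feature, the Euclidean distance, and the threshold values are all hypothetical choices made for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping fixed-rate frames (one per row)."""
    n = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n)[:, None]
    return x[idx]

def log_spectra(frames):
    """Log-magnitude spectrum of each Hann-windowed frame (assumed feature)."""
    win = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * win, axis=1))
    return np.log(mag + 1e-8)

def select_frames(features, threshold):
    """Keep a frame whenever the accumulated frame-to-frame distance
    since the last kept frame reaches the threshold."""
    kept = [0]          # always keep the first frame
    acc = 0.0
    for i in range(1, len(features)):
        acc += np.linalg.norm(features[i] - features[i - 1])
        if acc >= threshold:
            kept.append(i)
            acc = 0.0
    return kept

if __name__ == "__main__":
    # Toy features: steady, then a fast change, then steady again.
    toy = np.array([[0.0], [0.0], [0.0], [1.0], [2.0], [2.0], [2.0]])
    print(select_frames(toy, threshold=1.0))  # [0, 3, 4]

    # Synthetic signal: a 300 Hz tone that switches to 1200 Hz halfway.
    fs = 8000
    t = np.arange(fs) / fs
    x = np.where(t < 0.5, np.sin(2 * np.pi * 300 * t), np.sin(2 * np.pi * 1200 * t))
    frames = frame_signal(x, frame_len=200, hop=20)
    kept = select_frames(log_spectra(frames), threshold=5.0)
    print(len(frames), "fixed-rate frames,", len(kept), "kept")
```

    In the toy example, the steady regions contribute no distance, so only the frames in the transition are kept after the first one. A practical system would tune the feature, distance, and threshold to the task.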
VoiceSauce: A Program for Voice Analysis

    VoiceSauce is an application, implemented in Matlab, which provides automated voice measurements over time from audio recordings. Inputs are standard wave (*.wav) files and the measures currently computed are: F0, Formants F1-F4, H1(*), H2(*), H4(*), A1(*), A2(*), A3(*), H1(*)-H2(*), H2(*)-H4(*), H1(*)-A1(*), H1(*)-A2(*), H1(*)-A3(*), Energy, and Cepstral Peak Prominence ... (details)

XVocal: Vocal Tract Articulatory Synthesizer

    XVocal is the UNIX version of Dr. Shinji Maeda's Vocal Tract Articulatory Synthesizer, VTCALCS (originally developed for the PC platform). In 1995, Edmond Chi Hin Chui of our laboratory ported the PC version to UNIX. With Dr. Maeda's permission, XVocal is now freely available for research purposes only. Please see the user manual for detailed instructions on how to use the program... (details)
CTMRedit: a Matlab-based MRI Image Segmentation Tool with GUI

    A Matlab GUI for viewing, segmenting, and interpolating CT and MRI Images. Written by Mark Hasegawa-Johnson and Jul Cha... (details)
Speechdemo: a Matlab-based Speech Processing Platform with GUI

    Speechdemo is a Matlab-based graphical tool for speech analysis by Qifeng Zhu. It supports simultaneous analysis of signals in two channels. The user can view the signal in time and frequency using a variety of analysis tools such as the Discrete Fourier Transform (DFT); Linear Predictive Coding (LPC); Mel-Frequency Cepstral Coefficients (MFCC); and others... (details)
ITU G.722 Wide-band Codec implementation in ANSI C

    (ANSI C code)

Databases

 

Consonant Vowel Tokens (CV) Database

    An extensive database of 1,728 isolated consonant-vowel (CV) tokens is available through this website. (details)
VTR Formants Database

    The speech group at Microsoft Research (Redmond, Washington, US), together with IPAM and Electrical Engineering at UCLA (Los Angeles, CA, US), has recently developed a database of manually labeled vocal-tract-resonance (or formant) trajectories for research in speech processing, including analysis, synthesis, and recognition. (details)
Narrated Videotape Showing 3D Tongue and Vocal Tract Reconstructions from MRI Data for Consonants and Vowels

    A narrated videotape showing 3D tongue and vocal tract reconstructions from MRI data for consonants and vowels as produced by two talkers. Sample 3D models can be seen at: http://www.ee.ucla.edu/~spapl/projects/mri.html. The videotape is an effective teaching aid and was produced by Shrikanth Narayanan and Abeer Alwan. ... (details)

    For a free copy of the videotape, please email Prof. Alwan at: alwan@icsl.ucla.edu


Useful Links

 


    UCSC Speech Links

    Alexander Graham Bell's Path to the Telephone
 