VoiceSauce is an application, implemented in Matlab, which provides automated voice measurements over time from audio recordings. Inputs are standard wave (*.wav) files and the measures currently computed are:
where (*) indicates that the harmonic/spectral amplitudes are reported with and without corrections for formant frequencies and bandwidths. More parameters will be added soon.
VoiceSauce requires Matlab versions 2007 and up. VoiceSauce has been successfully run under Windows (XP, Vista, 7) and Mac (Leopard). Other operating systems may also work but have not been tested. If you are attempting to run VoiceSauce on a system other than Windows or Mac, you may need to install Tcl/Tk first; this can be obtained on ActiveState's website.
Since many of the parameters estimated by VoiceSauce depend on F0, results are meaningful only for voiced speech. Noisy speech may reduce the accuracy of the F0 estimates and hence of the voice measurements.
The correction formula for the effects of the formant frequencies works best when the first formant is estimated accurately, i.e. when F0 and F1 do not come too close to each other. For example, a high-pitched voice producing high vowels may yield inaccurate results.
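As a rough illustration of the F0/F1 caveat, the sketch below flags frames in which F1 lies close to F0 so that corrected measures for those frames can be treated with caution. It is not part of VoiceSauce: the function name flag_close_f0_f1, the min_ratio threshold of 1.5, the convention that F0 = 0 marks an unvoiced frame, and the sample values are all assumptions made for this example.

```python
def flag_close_f0_f1(f0_hz, f1_hz, min_ratio=1.5):
    """Return indices of voiced frames where F1 is below min_ratio * F0.

    min_ratio is a hypothetical threshold chosen for illustration only;
    it is not a value taken from VoiceSauce.
    """
    flagged = []
    for i, (f0, f1) in enumerate(zip(f0_hz, f1_hz)):
        # Assumption: F0 <= 0 marks an unvoiced frame, which is skipped.
        if f0 > 0 and f1 < min_ratio * f0:
            flagged.append(i)
    return flagged


# Illustrative per-frame values in Hz (not real measurements).
f0 = [120.0, 350.0, 0.0, 400.0]
f1 = [700.0, 380.0, 500.0, 900.0]
flagged_frames = flag_close_f0_f1(f0, f1)
print(flagged_frames)
```

A suitable threshold would depend on the speakers and vowels in the data, so it is best tuned against recordings where the formant estimates are known to be reliable.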
It has been reported that wav files stored in folders whose names contain non-English characters may cause the formant estimator to fail.
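One way to guard against this before starting an analysis is to scan the recording folder for paths that contain non-ASCII characters. The sketch below is a stand-alone helper, not part of VoiceSauce; the function names and the choice to treat any non-ASCII character as risky are assumptions made for this example.

```python
from pathlib import Path


def non_ascii_parts(path):
    """Return the components of path that contain non-ASCII characters."""
    return [part for part in Path(path).parts if not part.isascii()]


def find_risky_wav_files(root):
    """Map each .wav file under root to its non-ASCII path components.

    Files whose paths are entirely ASCII are left out of the result.
    """
    risky = {}
    for wav in sorted(Path(root).rglob("*.wav")):
        bad = non_ascii_parts(wav)
        if bad:
            risky[str(wav)] = bad
    return risky
```

Calling find_risky_wav_files on the top-level recording folder returns an empty dictionary when every path is plain ASCII; any entries it does return indicate folders or files that may be worth renaming before running the formant estimator.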
Distribution is currently in two forms: (1) m-code for systems with Matlab, and (2) compiled executables for systems without Matlab. Note that the compiled executables require the Matlab Component Runtime, which only needs to be installed once.
Currently compiled executables are only available for Windows systems.
Version changelog is available here.
Matlab m-code
Instructions: Unzip and run VoiceSauce.m from Matlab.
Note: Requires Matlab 2007a or later.
Compiled Matlab executables - Windows XP/Vista/7
Instructions: Run MCRInstaller.exe (only needs to be done once). Unzip VoiceSauce_bin.zip and run VoiceSauce.exe.
Note: Running VoiceSauce.exe for the first time may take a few minutes to load.
Documentation (thanks to Chad Vicenik and Spencer Lin) is available here. Note that this may, and likely will, evolve over time.
EggWorks: A free program by Henry Tehrani, created for the NSF Voice project to analyze EGG signals (closing quotients, peak increase in contact) in batch mode; also includes utilities for splitting .pmf files into separate .wav files, for inverting .wav files, and for converting .wav files from 32- to 16-bit.
EggWorks can be found here (download link is at the bottom of the page).
This work was supported in part by the NSF.