We present results from applying an improved algorithm to the automatic segmentation of spontaneous telephone-quality speech, and compare them with the results obtained after superimposing white noise on the speech. Three segmentation algorithms are compared, all based on variants of the Spectral Variation Function (SVF). Experimental results are obtained on the OGI Multi-Language Telephone Speech Corpus (OGI_TS). We show that the use of the auditory forward and backward masking effects prior to the SVF computation increases the robustness of the algorithm to white noise. When the average signal-to-noise ratio (SNR) is decreased to 10 dB, the peak ratio (defined as the ratio of the number of peaks measured at the target SNR to the number measured at the original SNR) increases by 16%, 12%, and 11% for the MFC (Mel Frequency Cepstra), RASTA (Relative Spectral Processing), and FBDYN (Forward Backward Auditory Masking Dynamic Cepstra) SVF segmentation algorithms, respectively.
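The two quantities the abstract relies on, white noise superimposed at a chosen average SNR and the peak ratio, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `add_white_noise`, `count_peaks`, and `peak_ratio` are hypothetical, the noise is assumed to be Gaussian, and the simple local-maximum rule stands in for whatever peak picking the paper applies to its SVF curves. The SVF itself is not computed here.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Superimpose white Gaussian noise scaled to an average SNR of snr_db.

    Assumption: SNR is defined on average power over the whole signal.
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def count_peaks(curve, threshold=0.0):
    """Count strict local maxima above a threshold (hypothetical peak picker)."""
    return sum(
        1
        for i in range(1, len(curve) - 1)
        if curve[i] > threshold and curve[i - 1] < curve[i] > curve[i + 1]
    )


def peak_ratio(peaks_at_target_snr, peaks_at_original_snr):
    """Peak ratio as defined in the abstract: target-SNR peaks over original-SNR peaks."""
    return peaks_at_target_snr / peaks_at_original_snr


if __name__ == "__main__":
    # Toy sine wave standing in for a speech signal.
    clean = [math.sin(2.0 * math.pi * 5.0 * n / 8000.0) for n in range(8000)]
    noisy = add_white_noise(clean, snr_db=10.0)
    print(len(noisy))
    # Illustrative counts only, not figures from the paper.
    print(peak_ratio(116, 100))
```

Under one reading of the abstract, a peak ratio of 1.16 would correspond to 16% more detected peaks at the target SNR than at the original SNR; the counts used above are made up for illustration.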
On the robust automatic segmentation of spontaneous speech
B. Petek, O. Andersen, P. Dalsgaard
Published 1996 in Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
PUBLICATION RECORD
- Publication year
1996
- Venue
Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
- Publication date
1996-10-03
- Fields of study
Computer Science