Automatic phoneme alignment based on acoustic-phonetic modeling

Published 2002 in Interspeech

ABSTRACT

This paper presents a method for speaker-independent automatic phonetic alignment that is distinguished from standard HMM-based “forced alignment” in three respects: (1) specific acoustic-phonetic features are used, in addition to PLP features, by the phonetic classifier; (2) the units of classification consist of distinctive phonetic features instead of phonemes; and (3) observation probabilities depend not only on the current state, but also on the state transition information. This proposed method is compared with a state-of-the-art baseline forced-alignment system on a number of corpora, including telephone speech, microphone speech, and children’s speech. The new method has agreement of 92.57% within 20 msec on the TIMIT corpus, which is a 26% reduction in error over the baseline method (with 89.95% agreement on TIMIT). Average reduction in error over all corpora is 28%.
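
The agreement and error-reduction figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' evaluation code: the function names, the inclusive 20 ms threshold, and the toy boundary times are assumptions. It treats error as 100 minus agreement, which reproduces the reported 26% reduction from the two TIMIT agreement figures.

```python
def agreement_within(auto_ms, manual_ms, tol_ms=20.0):
    """Percentage of automatic boundaries within tol_ms of the manual ones.

    Assumes both lists hold the same boundaries in the same order, and
    that the threshold is inclusive (an assumption, not from the paper).
    """
    if not auto_ms or len(auto_ms) != len(manual_ms):
        raise ValueError("boundary lists must be non-empty and equal in length")
    hits = sum(abs(a - m) <= tol_ms for a, m in zip(auto_ms, manual_ms))
    return 100.0 * hits / len(auto_ms)


def relative_error_reduction(baseline_agreement, new_agreement):
    """Relative reduction in error (percent), with error = 100 - agreement."""
    baseline_error = 100.0 - baseline_agreement
    new_error = 100.0 - new_agreement
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy boundary times in milliseconds (made-up values for illustration).
auto = [100, 215, 330, 470]
manual = [105, 200, 360, 465]
print(agreement_within(auto, manual))  # 75.0 (three of four within 20 ms)

# TIMIT agreement figures quoted in the abstract.
print(round(relative_error_reduction(89.95, 92.57), 1))  # 26.1
```

The baseline error is 10.05% and the new error is 7.43%, so the relative reduction is 2.62 / 10.05, or roughly 26%.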

PUBLICATION RECORD

Publication year
2002
Venue
Interspeech
Publication date
2002-09-16
Fields of study
Linguistics, Computer Science
Identifiers
DOI 10.21437/ICSLP.2002-154
Source metadata
Semantic Scholar

REFERENCES

Automatic time alignment of phonemes using acoustic-phonetic information
2000, cited by this paper
Burst detection based on measurements of intensity discrimination
2000, cited by this paper
High-accuracy automatic segmentation
1999, cited by this paper
On the use of F0 features in automatic segmentation for speech synthesis
1998, cited by this paper
Evaluation and integration of neural-network training techniques for continuous digit recognition
1998, cited by this paper
Universal speech tools: the CSLU toolkit
1998, cited by this paper
An Introduction to the Psychology of Hearing
1997, cited by this paper
Syllable-level desynchronisation of phonetic features for speech recognition
1996, cited by this paper
Labeler agreement in phonetic labeling of continuous speech
1994, cited by this paper
Automatic segmentation and labeling of speech based on Hidden Markov Models
1993, cited by this paper
A preliminary statistical evaluation of manual and automatic segmentation discrepancies
1991, cited by this paper

CITED BY

Analyzing Phonetic and Graphemic Representations in End-to-End Automatic Speech Recognition
2019, cites this paper
Traces of the Northern Cities Shift: An Empirical Case Study in Amherst, MA
2016, cites this paper
Expressive speech synthesis : research and system design with hidden Markov models
2015, cites this paper
Automatic Pronunciation Scoring with Score Combination by Learning to Rank and Class-Normalized DP-Based Quantization
2015, cites this paper
Modeling coarticulation in continuous speech
2014, influential citation
Robust neural network-based estimation of articulatory features for Czech
2014, cites this paper
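英詩における口語(speech)の台頭とその超克 : 詩的言語のイデオロギーとホプキンズ (The Rise of Colloquial Language (Speech) in English Poetry and Its Overcoming: The Ideology of Poetic Language and Hopkins) [In Japanese]
2014, cites this paper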
Quantification of speech disfluency as a marker of medication-induced cognitive impairment: An application of computerized speech analysis in neuropharmacology
2013, cites this paper
Automatic transcription and speech recognition of Romanian corpus RO-GRID
2012, cites this paper
MULTITASK LEARNING TO IMPROVE ARTICULATORY FEATURE ESTIMATION AND PHONEME RECOGNITION
2011, cites this paper
Automatic speech segmentation for Italian: tools, models, evaluation, and applications
2011, cites this paper
Mathematical treatment of uncertainty in the speech recognition process
2010, cites this paper
Robust automatic transcription of English speech corpora
2010, cites this paper
CASTLE: a computer-assisted stress teaching and learning environment for learners of English as a second language
2010, cites this paper
Integrating hidden Markov model and PRAAT: a toolbox for robust automatic speech transcription
2010, cites this paper
EasyAlign : a friendly automatic phonetic alignment tool under Praat
2010, cites this paper
文本不特定之自動音素分段演算法 (Text-Independent Automatic Phone Segmentation Algorithm) [In Chinese]
2010, cites this paper
Using speech transformation to increase speech intelligibility for the hearing- and speaking-impaired
2009, cites this paper
Robust Detection of Phone Boundaries Using Model Selection Criteria With Few Observations
2009, cites this paper
Sistema baseado em regras para o refinamento da segmentação automatica de fala (Rule-Based System for Refining Automatic Speech Segmentation) [In Portuguese]
2008, cites this paper
Phonetic segmentation using multiple speech features
2008, cites this paper
TRUES: Tone Recognition Using Extended Segments
2008, cites this paper
Hybridizing conversational and clear speech to determine the degree of contribution of acoustic features to intelligibility.
2008, cites this paper
Large margin algorithms for discriminative continuous speech recognition (זיהוי דיבור רציף באמצעות אלגוריתמי שוליים רחבים.)
2007, cites this paper
Acoustic-phonetic features for refining the explicit speech segmentation
2007, influential citation
Monolingual and crosslingual comparison of tandem features derived from articulatory and phone MLPS
2007, cites this paper
A Large Margin Algorithm for Speech-to-Phoneme and Music-to-Score Alignment
2007, cites this paper
DERIVED SPOKEN LANGUAGE MARKERS FOR DETECTING MILD COGNITIVE IMPAIRMENT
2007, cites this paper
Online learning: theory, algorithms and applications (למידה מקוונת.)
2007, cites this paper
Brazilian Vowels Recognition using a New Hierarchical Decision Structure with Wavelet Packet and SVM
2007, cites this paper
Segment Boundary Detection via Class Entropy Measurements in Connectionist
2006, cites this paper
Segment boundary detection via class entropy measurements in connectionist phoneme recognition
2006, influential citation
Machine Learning Methods for Automatic Speech Recognition and Analysis
2006, cites this paper
Refining Segmental Boundaries using Support Vector Machine
2006, influential citation
A New Hierarchical Decision Structure Using Wavelet Packet and SVM for Brazilian Phonemes Recognition
2006, cites this paper
Phoneme Alignment using Large Margin Techniques
2005, cites this paper
Phoneme alignment based on discriminative learning
2005, cites this paper
Comparison of acoustic features of time‐compressed and natural speech
2004, cites this paper
Diagnostic Assessment of Childhood Apraxia of Speech Using Automatic Speech Recognition (ASR) Methods.
2004, cites this paper