To date, there has been limited work on applying Deep Belief Networks (DBNs) to acoustic modeling in large vocabulary continuous speech recognition (LVCSR) tasks, with past work using standard speech features. However, a typical LVCSR system makes use of both feature- and model-space speaker adaptation and discriminative training. This paper explores the performance of DBNs in a state-of-the-art LVCSR system, showing improvements over Multi-Layer Perceptrons (MLPs) and GMM/HMMs across a variety of features on an English Broadcast News task. In addition, we provide a recipe for data parallelization of DBN training, showing that data parallelization can provide linear speed-up in the number of machines without impacting word error rate (WER).
Making Deep Belief Networks effective for large vocabulary continuous speech recognition
Tara N. Sainath, Brian Kingsbury, B. Ramabhadran, P. Fousek, Petr Novák, Abdel-rahman Mohamed
Published 2011 in 2011 IEEE Workshop on Automatic Speech Recognition & Understanding
PUBLICATION RECORD
- Publication year: 2011
- Venue: 2011 IEEE Workshop on Automatic Speech Recognition & Understanding
- Publication date: 2011-12-01
- Fields of study: Computer Science
- Source metadata: Semantic Scholar