Making Deep Belief Networks effective for large vocabulary continuous speech recognition

Tara N. Sainath, Brian Kingsbury, B. Ramabhadran, P. Fousek, Petr Novák, Abdel-rahman Mohamed

Published 2011 in 2011 IEEE Workshop on Automatic Speech Recognition & Understanding

ABSTRACT

To date, there has been limited work on applying Deep Belief Networks (DBNs) to acoustic modeling in large vocabulary continuous speech recognition (LVCSR) tasks, and past work has used standard speech features. However, a typical LVCSR system makes use of both feature-space and model-space speaker adaptation, as well as discriminative training. This paper explores the performance of DBNs in a state-of-the-art LVCSR system, showing improvements over Multi-Layer Perceptrons (MLPs) and GMM/HMMs across a variety of features on an English Broadcast News task. In addition, we provide a recipe for data parallelization of DBN training, showing that data parallelization can provide a speed-up that is linear in the number of machines without degrading word error rate (WER).
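
The abstract does not spell out the parallelization recipe, so the following is only an illustrative sketch of the general idea of synchronous data parallelism, not the authors' method. It assumes a toy logistic-regression model in place of a DBN, simulates the machines with a plain loop, and uses hypothetical names (`gradient`, `data_parallel_step`). Each simulated worker computes a gradient on its own equal-sized shard of the batch, and the averaged gradient drives a single shared update.

```python
import math
import random


def gradient(weights, batch):
    """Mean logistic-loss gradient over a batch of (features, label) pairs."""
    grad = [0.0] * len(weights)
    for features, label in batch:
        z = sum(w * x for w, x in zip(weights, features))
        pred = 1.0 / (1.0 + math.exp(-z))
        for i, x in enumerate(features):
            grad[i] += (pred - label) * x
    return [g / len(batch) for g in grad]


def data_parallel_step(weights, batch, num_workers, lr):
    """One synchronous step: shard the batch, then average the workers' gradients.

    Assumes len(batch) is divisible by num_workers so all shards have equal size.
    """
    shard_size = len(batch) // num_workers
    shards = [batch[k * shard_size:(k + 1) * shard_size] for k in range(num_workers)]
    # In a real system each shard's gradient would be computed on a separate machine;
    # here the workers are simulated sequentially.
    worker_grads = [gradient(weights, shard) for shard in shards]
    avg = [sum(gs) / num_workers for gs in zip(*worker_grads)]
    return [w - lr * g for w, g in zip(weights, avg)]


random.seed(0)
# Toy data: two random features plus a constant bias input, with a random binary label.
batch = [([random.gauss(0, 1), random.gauss(0, 1), 1.0], random.randint(0, 1))
         for _ in range(64)]
weights = [0.1, -0.2, 0.0]

single = data_parallel_step(weights, batch, num_workers=1, lr=0.5)
parallel = data_parallel_step(weights, batch, num_workers=4, lr=0.5)
```

With equal-sized shards, the average of the per-worker mean gradients equals the mean gradient over the whole batch, so `single` and `parallel` agree up to floating-point rounding. This illustrates why splitting the data across machines need not change the result of an update, although it says nothing about the speed-up or WER figures reported in the paper.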

PUBLICATION RECORD

Publication year
2011
Venue
2011 IEEE Workshop on Automatic Speech Recognition & Understanding
Publication date
2011-12-01
Fields of study
Computer Science
Identifiers
DOI 10.1109/ASRU.2011.6163900
Source metadata
Semantic Scholar

REFERENCES

Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition
2012, influential reference
Conversational Speech Transcription Using Context-Dependent Deep Neural Networks
2012, cited by this paper
A Practical Guide to Training Restricted Boltzmann Machines
2012, influential reference
Parallelized Stochastic Gradient Descent
2010, influential reference
Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine
2010, cited by this paper
Distributed Training Strategies for the Structured Perceptron
2010, influential reference
Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling
2009, cited by this paper
Optimizing bottle-neck features for LVCSR
2008, cited by this paper
A Fast Learning Algorithm for Deep Belief Nets
2006, influential reference
Parallel Training of Neural Networks for Speech Recognition
2006, cited by this paper
An architecture for rapid decoding of large vocabulary conversational speech
2003, cited by this paper
Tandem acoustic modeling in large-vocabulary recognition
2001, cited by this paper
Connectionist Speech Recognition: A Hybrid Approach
1993, influential reference

CITED BY

An overview of high-resource automatic speech recognition methods and their empirical evaluation in low-resource environments
2025, cites this paper
Enhanced Handwritten Digit Recognition with a Hybrid Optimization Framework for Deep Learning Techniques
2025, cites this paper
Assistive systems for visually impaired people: A survey on current requirements and advancements
2024, cites this paper
Analysis and Optimization of Neural Networks for Image and Speech Recognition
2023, cites this paper
Empirical Investigation for Predicting Depression from Different Machine Learning Based Voice Recognition Techniques
2022, influential citation
Speech Enhancement Using Augmented SSL CycleGAN
2022, cites this paper
JOIST: A Joint Speech and Text Streaming Model for ASR
2022, cites this paper
On the Utility of Combining Topic Models and Recurrent Neural Networks
2021, cites this paper
Sequence-Level Self-Teaching Regularization
2021, cites this paper
Noise can speed backpropagation learning and deep bidirectional pretraining
2020, cites this paper
Sibilant consonants classification comparison with multi- and single-class neural networks
2020, cites this paper
Latest Research Trends in Gait Analysis Using Wearable Sensors and Machine Learning: A Systematic Review
2020, cites this paper
MESRS: Models Ensemble Speech Recognition System
2020, cites this paper
A Systematic Review of Machine Learning based Automatic Speech Assessment System to Evaluate Speech Impairment
2020, cites this paper
An improved optimization technique using Deep Neural Networks for digit recognition
2020, cites this paper
A comparative case study of neural network training by using frame-level cost functions for automatic speech recognition purposes in Spanish
2020, cites this paper
L-Vector: Neural Label Embedding for Domain Adaptation
2020, cites this paper
Training of reduced-rank linear transformations for multi-layer polynomial acoustic features for speech recognition
2019, cites this paper
Character-Aware Attention-Based End-to-End Speech Recognition
2019, cites this paper
Deep belief networks and cortical algorithms: A comparative study for supervised classification
2019, cites this paper
Deep learning models for traffic flow prediction in autonomous vehicles: A review, solutions, and challenges
2019, cites this paper
Improving Noise Robustness of Automatic Speech Recognition via Parallel Data and Teacher-student Learning
2019, cites this paper
Sibilant Consonants Classification with Deep Neural Networks
2019, cites this paper
Speech Recognition Using Deep Neural Networks: A Systematic Review
2019, cites this paper
Improving Layer Trajectory LSTM with Future Context Frames
2019, cites this paper
Training deep neural networks with non-uniform frame-level cost function for automatic speech recognition
2018, cites this paper
Cycle-Consistent Speech Enhancement
2018, influential citation
Adversarial Teacher-Student Learning for Unsupervised Domain Adaptation
2018, cites this paper
A Flexible FPGA-Based Inference Architecture for Pruned Deep Neural Networks
2018, cites this paper
The promotion strategy of supply chain flexibility based on deep belief network
2018, cites this paper
Speech Recognition Using Convolutional Neural Networks
2018, cites this paper
Throughput Optimizations for FPGA-based Deep Neural Network Inference
2018, cites this paper
Deep Learning in Mobile and Wireless Networking: A Survey
2018, cites this paper
Speech recognition in a dialog system: from conventional to deep processing
2018, cites this paper
Deep Learning and its Applications Surveys on Future Mobile Networks Deep Learning Driven Networking Applications Fundamental Principles Advantages Multilayer Perceptron Boltzmann Machine Auto-encoder Convolutional Neural Network Recurrent Neural Network Generative Adversarial Network Deep Reinforce
2018, cites this paper
Exploring Layer Trajectory LSTM with Depth Processing Units and Attention
2018, cites this paper
A Deep Learning Approach for Classification of Cleanliness in Restrooms
2018, cites this paper
Discriminative and adaptive training for robust speech recognition and understanding
2018, cites this paper
A Survey of Deep Learning Techniques in Speech Recognition
2018, cites this paper
Densenet BLSTM for Acoustic Modeling in Robust ASR
2018, cites this paper
Domain and Speaker Adaptation for Cortana Speech Recognition
2018, cites this paper
Adversarial Feature-Mapping for Speech Enhancement
2018, cites this paper
Optimization of discriminative models for speech and handwriting recognition
2017, cites this paper
Temporal-related Convolutional-Restricted-Boltzmann-Machine capable of learning relational order via reinforcement learning procedure?
2017, cites this paper
Knowledge distillation across ensembles of multilingual models for low-resource languages
2017, cites this paper
Effective joint training of denoising feature space transforms and Neural Network based acoustic models
2017, cites this paper
The extraction of motion-onset VEP BCI features based on deep learning and compressed sensing
2017, cites this paper
Discriminative Autoencoders for Acoustic Modeling
2017, influential citation
Unsupervised adaptation with domain separation networks for robust speech recognition
2017, influential citation
Action recognition using instrumented objects for stroke rehabilitation
2017, cites this paper
Large-Scale Domain Adaptation via Teacher-Student Learning
2017, cites this paper
Extended low-rank plus diagonal adaptation for deep and recurrent neural networks
2017, cites this paper
Cracking the cocktail party problem by multi-beam deep attractor network
2017, cites this paper
La représentation des documents par réseaux de neurones pour la compréhension de documents parlés
2017, cites this paper
Turbo fusion of magnitude and phase information for DNN-based phoneme recognition
2017, cites this paper
Discriminative deep belief networks for microarray based cancer classification
2017, cites this paper
On the relevance of auditory-based Gabor features for deep learning in robust speech recognition
2017, cites this paper
Recent advances in LVCSR: A benchmark comparison of performances
2017, cites this paper
Toward growing modular deep neural networks for continuous speech recognition
2017, cites this paper
An Optimized Implementation of Speech Recognition Combining GPU with Deep Belief Network for IoT
2017, cites this paper
Improved cepstra minimum-mean-square-error noise reduction algorithm for robust speech recognition
2017, cites this paper
Kernel Approximation Methods for Speech Recognition
2017, cites this paper
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q
2017, cites this paper
Invariant Representations for Noisy Speech Recognition
2016, cites this paper
Wise teachers train better DNN acoustic models
2016, cites this paper
Low-rank plus diagonal adaptation for deep neural networks
2016, cites this paper
Graph based manifold regularized deep neural networks for automatic speech recognition
2016, cites this paper
Exploring multidimensional LSTMs for large vocabulary ASR
2016, cites this paper
A comparison between deep neural nets and kernel acoustic models for speech recognition
2016, influential citation
Image Retrieval Based on Deep Belief Networks
2016, cites this paper
Convolutional feature learning and Hybrid CNN-HMM for scene number recognition
2016, cites this paper
Improving DNN-Based Automatic Recognition of Non-native Children Speech with Adult Speech
2016, cites this paper
Acoustic model selection for recognition of regional accented speech
2016, cites this paper
Improved Neural Network Initialization by Grouping Context-Dependent Targets for Acoustic Modeling
2016, cites this paper
A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition
2016, cites this paper
Trajectory modelling with limited speech data
2016, cites this paper
Hardware Implementation and Applications of Deep Belief Networks
2016, cites this paper
Phoneme Recognition Based on Deep Belief Network
2016, cites this paper
Efficient deep neural network acceleration through FPGA-based batch processing
2016, cites this paper
LSTM time and frequency recurrence for automatic speech recognition
2015, cites this paper
Semantic Deep Learning
2015, cites this paper
SVD-based universal DNN modeling for multiple scenarios
2015, cites this paper
Investigations on sequence training of neural networks
2015, cites this paper
Speaker adaptive joint training of Gaussian mixture models and bottleneck features
2015, cites this paper
An analysis of convolutional neural networks for speech recognition
2015, cites this paper
Deep neural network training emphasizing central frames
2015, cites this paper
Discriminative training of linear transformations and mixture density splitting for speech recognition
2015, cites this paper
Maximum a posteriori adaptation of network parameters in deep models
2015, cites this paper
Deep Discriminative and Generative Models for Pattern Recognition
2015, cites this paper
Noisy training for deep neural networks in speech recognition
2015, cites this paper
Cross-market behavior modeling
2015, cites this paper
Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data
2015, cites this paper
Fuse Deep Neural Network and Gaussian Mixture Model Systems
2015, cites this paper
Improvements to the pruning behavior of DNN acoustic models
2015, cites this paper
Deep Learning Approaches to Problems in Speech Recognition, Computational Chemistry, and Natural Language Text Processing
2015, influential citation
Deep Generative and Discriminative Models for Speech Recognition
2015, cites this paper
Multilingual representations for low resource speech recognition and keyword search
2015, cites this paper
A multi-region deep neural network model in speech recognition
2015, cites this paper
Fundamentals of speech recognition
2015, cites this paper
Implementation of DNN-HMM Acoustic Models for Phoneme Recognition
2015, cites this paper