Noisy training for deep neural networks in speech recognition
Shi Yin, Chao Liu, Zhiyong Zhang, Yiye Lin, Dong Wang, Javier Tejedor, T. Zheng, Yinguo Li
Published 2015 in EURASIP Journal on Audio, Speech, and Music Processing
ABSTRACT
Deep neural networks (DNNs) have achieved remarkable success in speech recognition, partly owing to their flexibility in learning complex patterns of speech signals. This flexibility, however, can lead to serious over-fitting and hence severe performance degradation in adverse acoustic conditions, such as those with high ambient noise. We propose a noisy training approach to tackle this problem: by injecting moderate noise into the training data intentionally and randomly, more generalizable DNN models can be learned. This 'noise injection' technique, although already known to the neural computation community, has not been studied with DNNs, which involve a highly complex objective function. The experiments presented in this paper confirm that the noisy training approach works well for the DNN model and can provide substantial performance improvements for DNN-based speech recognition.
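The core idea in the abstract, corrupting clean training utterances with randomly chosen noise at a randomly sampled signal-to-noise ratio (SNR), can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the function names, the SNR range, and the use of raw waveform arrays are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def inject_noise(clean, noise, snr_db):
    """Mix `noise` into `clean` at a target signal-to-noise ratio (dB)."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that clean_power / (scale**2 * noise_power)
    # equals 10 ** (snr_db / 10), i.e. the requested SNR.
    scale = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise

def noisy_training_batch(batch, noise_bank, snr_range=(5.0, 20.0)):
    """Corrupt each utterance with a random noise sample at a random SNR,
    as in the 'intentional and random' injection described in the abstract."""
    corrupted = []
    for clean in batch:
        # Pick a noise type at random from the bank.
        noise = noise_bank[rng.integers(len(noise_bank))]
        # Tile or crop the noise to match the utterance length.
        reps = int(np.ceil(len(clean) / len(noise)))
        noise = np.tile(noise, reps)[: len(clean)]
        # Sample a moderate SNR uniformly from the given range.
        snr_db = rng.uniform(*snr_range)
        corrupted.append(inject_noise(clean, noise, snr_db))
    return corrupted
```

The corrupted batch would then be fed to ordinary DNN training in place of (or alongside) the clean data, so the model sees a different random corruption of each utterance on each pass.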
PUBLICATION RECORD
- Publication date
2015-01-20
- Fields of study
Computer Science, Engineering
- Source metadata
Semantic Scholar