Empirically combining unnormalized NNLM and back-off N-gram for fast N-best rescoring in speech recognition

Yongzhe Shi, Weiqiang Zhang, Meng Cai, Jia Liu

Published 2014 in EURASIP Journal on Audio, Speech, and Music Processing

ABSTRACT

Neural network language models (NNLM), including feed-forward NNLM (FNNLM) and recurrent NNLM (RNNLM), have proved to be quite powerful for sequence modeling. One main concern for NNLMs is the heavy computational burden of the output layer, where the output needs to be probabilistically normalized and computing the normalizing factors is expensive. Fast rescoring of the N-best list or lattice with an NNLM is therefore of great interest for large-scale applications. In this paper, the statistical characteristics of the normalizing factors are investigated on the N-best list. Based on these observations, we propose to approximate the normalizing factor for each hypothesis as a constant proportional to the number of words in the hypothesis. The unnormalized NNLM is then investigated and combined with a back-off N-gram for fast rescoring; it can be computed very quickly because no normalization is performed in the output layer, reducing the complexity significantly. We apply the proposed method to a well-tuned context-dependent deep neural network hidden Markov model (CD-DNN-HMM) speech recognition system on the English Switchboard phone-call speech-to-text task, where both an FNNLM and an RNNLM are trained to demonstrate our method. Experimental results show that the unnormalized probability of the NNLM is quite complementary to that of the back-off N-gram, and combining the unnormalized NNLM with the back-off N-gram further reduces the word error rate at little additional computational cost.
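The abstract describes rescoring each N-best hypothesis with an unnormalized NNLM score, where the softmax normalizing factor is approximated by a per-word constant, interpolated with a back-off N-gram score. The following is a minimal sketch of that idea, not the authors' implementation; the callables `unnormalized_nnlm_logit` and `ngram_logprob`, the interpolation weight `lam`, and the constant `log_norm_const` are all assumed interfaces and parameters for illustration.

```python
def rescore_nbest(nbest, unnormalized_nnlm_logit, ngram_logprob,
                  lam=0.5, log_norm_const=0.0, acoustic_weight=1.0):
    """Re-rank an N-best list with an unnormalized NNLM plus a back-off N-gram.

    nbest: list of (first_pass_score, words) pairs from the first decoding pass.
    unnormalized_nnlm_logit(history, word): raw output-layer activation (no softmax).
    ngram_logprob(history, word): back-off N-gram log-probability.
    lam: interpolation weight between the NNLM and N-gram log-scores.
    log_norm_const: per-word constant standing in for the log normalizing factor,
        assumed roughly constant across words and hypotheses.
    """
    rescored = []
    for first_pass_score, words in nbest:
        lm_score = 0.0
        history = ["<s>"]
        for w in words:
            # Unnormalized NNLM log-score: raw logit minus the assumed constant
            # log normalizer, so no softmax over the full vocabulary is needed.
            nn = unnormalized_nnlm_logit(history, w) - log_norm_const
            ng = ngram_logprob(history, w)
            lm_score += lam * nn + (1.0 - lam) * ng
            history.append(w)
        total = acoustic_weight * first_pass_score + lm_score
        rescored.append((total, words))
    # Highest combined score first.
    return sorted(rescored, key=lambda x: x[0], reverse=True)
```

Because the normalizing factor contributes (approximately) the same constant per word to every hypothesis of the same length, it can be folded into a single scalar rather than recomputed over the full vocabulary at each word position, which is the source of the claimed speed-up.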
