ABSTRACT
Including phrases in the vocabulary list can improve n-gram language models used in speech recognition. In this paper, we report results on the automatic extraction of phrases from the training text using frequency, likelihood, and correlation criteria. We show how a language model built from a vocabulary that includes useful phrases can systematically improve language model perplexity on a natural language call-routing task and on the 20K-Nov92 Wall Street Journal evaluation. We also discuss the impact of such phrase-based language models on recognition word error rate.
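As a rough illustration of the idea in the abstract, the sketch below scores adjacent word pairs in a toy corpus by raw frequency and by pointwise mutual information (PMI), then rewrites the text so that selected pairs become single vocabulary tokens. Everything here is an assumption made for illustration: the abstract does not define its frequency, likelihood, or correlation criteria, so PMI stands in for a correlation-style measure, the likelihood criterion is not shown, and the function names, the `min_count` threshold, and the example sentences are hypothetical.

```python
import math
from collections import Counter


def score_bigrams(sentences, min_count=2):
    """Map each adjacent word pair seen at least min_count times to (count, PMI)."""
    unigrams = Counter()
    bigrams = Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # frequency criterion: ignore rare pairs
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI (assumed stand-in for a correlation criterion)
        scores[(w1, w2)] = (count, math.log2(p_pair / (p_w1 * p_w2)))
    return scores


def merge_phrases(sentences, phrases):
    """Join selected pairs into single tokens, scanning each sentence left to right."""
    merged = []
    for words in sentences:
        out, i = [], 0
        while i < len(words):
            if i + 1 < len(words) and (words[i], words[i + 1]) in phrases:
                out.append(words[i] + "_" + words[i + 1])
                i += 2
            else:
                out.append(words[i])
                i += 1
        merged.append(out)
    return merged


# Hypothetical call-routing style utterances, not data from the paper.
corpus = [s.split() for s in [
    "i want to check my account balance",
    "please check my account balance today",
    "i want to talk to an agent",
]]
scores = score_bigrams(corpus)
top = sorted(scores, key=lambda pair: scores[pair][1], reverse=True)[:2]
phrased_corpus = merge_phrases(corpus, set(top))
```

An n-gram model trained on `phrased_corpus` would treat each merged token as one vocabulary entry, which is the general mechanism by which a phrase-augmented vocabulary can change perplexity. When comparing against a word-only model, perplexity would need to be normalized over the same number of words for the comparison to be fair.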
PUBLICATION RECORD
- Publication year
1999
- Venue
EUROSPEECH
- Publication date
1999-09-05
- Fields of study
Linguistics, Computer Science
- Source metadata
Semantic Scholar
REFERENCES
- 6 references
CITED BY
- 31 citing papers