Getting More Mileage from Web Text Sources for Conversational Speech Language Modeling using Class-Dependent Mixtures

I. Bulyko, Mari Ostendorf, A. Stolcke

Published 2003 in North American Chapter of the Association for Computational Linguistics

ABSTRACT

Sources of training data suitable for language modeling of conversational speech are limited. In this paper, we show how training data can be supplemented with text from the web filtered to match the style and/or topic of the target recognition task, and that larger performance gains can be obtained from this data by using class-dependent interpolation of N-grams.
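
The abstract names class-dependent interpolation of N-grams without spelling out a formulation. As a rough illustration only, the sketch below assumes one plausible reading: several component N-gram models (for example, an in-domain model and a web-text model) are mixed linearly, and the mixture weights are selected by the class of the most recent history word. The function name, the class labels, the weights, and the toy constant-probability models are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Sequence, Tuple

History = Tuple[str, ...]
# A component model maps (word, history) to a conditional probability.
Model = Callable[[str, History], float]


def class_dependent_mixture(
    word: str,
    history: History,
    models: Sequence[Model],
    weights: Dict[str, Sequence[float]],
    word_class: Dict[str, str],
    default_class: str = "OTHER",
) -> float:
    """Linearly interpolate component models with class-selected weights.

    Assumption (not confirmed by the source): the weight vector is chosen
    by the class of the last word in the history.
    """
    last = history[-1] if history else None
    cls = word_class.get(last, default_class) if last is not None else default_class
    lambdas = weights[cls]
    if len(lambdas) != len(models):
        raise ValueError("need one weight per component model")
    if abs(sum(lambdas) - 1.0) > 1e-9:
        raise ValueError("weights for a class must sum to 1")
    return sum(lam * m(word, history) for lam, m in zip(lambdas, models))


# Toy component models that return constant probabilities (illustrative only).
def in_domain_model(word: str, history: History) -> float:
    return 0.2


def web_model(word: str, history: History) -> float:
    return 0.1


# Hypothetical classes and weights: trust the in-domain model more after
# function words, and weight both models equally otherwise.
toy_weights = {"FUNC": [0.9, 0.1], "OTHER": [0.5, 0.5]}
toy_classes = {"the": "FUNC", "of": "FUNC"}

p_after_func = class_dependent_mixture(
    "cat", ("on", "the"), [in_domain_model, web_model], toy_weights, toy_classes
)
p_after_other = class_dependent_mixture(
    "cat", ("big", "dog"), [in_domain_model, web_model], toy_weights, toy_classes
)
```

With a single class this reduces to ordinary linear interpolation with one global weight vector. In practice the per-class weights would be estimated on held-out data rather than set by hand.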

PUBLICATION RECORD

  • Publication year

    2003

  • Venue

    North American Chapter of the Association for Computational Linguistics

  • Publication date

    2003-05-27

  • Fields of study

    Linguistics, Computer Science
