Using the Web as an Implicit Training Set: Application to Structural Ambiguity Resolution

Preslav Nakov, Marti A. Hearst

Published 2005 in Human Language Technology - The Baltic Perspective

ABSTRACT

Recent work has shown that very large corpora can act as training data for NLP algorithms even without explicit labels. In this paper we show how surface features and paraphrases, issued as queries against search engines, can be used to infer labels for structural ambiguity resolution tasks. Using unsupervised algorithms, we achieve 84% precision on prepositional phrase (PP) attachment and 80% on noun compound coordination.
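The idea of inferring a label from query counts can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not spell out the specific surface features or paraphrase patterns, so the comparison rule here is an assumed simplification. A small in-memory table stands in for search-engine hit counts, and the names `make_counter`, `attach_pp`, and `toy_counts` are hypothetical, as are the counts themselves.

```python
from typing import Callable, Mapping


def make_counter(counts: Mapping[str, int]) -> Callable[[str], int]:
    """Wrap a phrase -> count table. This is a stand-in (assumption) for
    the number of hits a search engine would report for an exact phrase."""
    def count(phrase: str) -> int:
        return counts.get(phrase.lower(), 0)
    return count


def attach_pp(verb: str, noun: str, prep: str,
              count: Callable[[str], int]) -> str:
    """Guess whether the PP headed by `prep` attaches to the verb or the noun.

    Assumed simplification: compare how often the preposition directly
    follows the verb versus the noun, and pick the more frequent pairing.
    """
    verb_hits = count(f"{verb} {prep}")
    noun_hits = count(f"{noun} {prep}")
    if verb_hits > noun_hits:
        return "verb"
    if noun_hits > verb_hits:
        return "noun"
    return "undecided"  # no evidence either way, so no label is inferred


# Toy counts invented for illustration only; they are not real web data.
toy_counts = {
    "ate with": 120,
    "pizza with": 45,
    "saw with": 10,
    "man with": 300,
}

count = make_counter(toy_counts)
print(attach_pp("ate", "pizza", "with", count))
print(attach_pp("saw", "man", "with", count))
```

No labeled examples are involved: the decision comes only from the relative counts, which is what makes the approach unsupervised. Returning "undecided" when the counts tie keeps precision separate from coverage, since the sketch declines to label cases it has no evidence for.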

PUBLICATION RECORD

  • Publication year

    2005

  • Venue

    Human Language Technology - The Baltic Perspective

  • Publication date

    2005-10-06

  • Fields of study

    Linguistics, Computer Science

  • Source metadata

    Semantic Scholar
