Letter Sequence Labeling for Compound Splitting

Jianqiang Ma, Verena Henrich, E. Hinrichs

Published 2016 in Special Interest Group on Computational Morphology and Phonology Workshop

ABSTRACT

For languages such as German, where compounds occur frequently and are written as single tokens, a wide variety of NLP applications benefit from recognizing and splitting compounds. As the traditional word frequency-based approach to compound splitting has several drawbacks, this paper introduces a letter sequence labeling approach, which can utilize rich word form features to build discriminative learning models that are optimized for splitting. Experiments show that the proposed method significantly outperforms state-of-the-art compound splitters.
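To make the idea of letter sequence labeling concrete, the sketch below encodes a segmented compound as one label per letter and decodes a label sequence back into components. It is an illustration only: the two-label B/I scheme, the function names, and the context-window features are assumptions made for this example and are not taken from the paper, whose tag set and feature templates are not given in the abstract.

```python
def split_to_labels(parts):
    """Encode a segmented compound as one label per letter.

    Assumed scheme (hypothetical, not the paper's): "B" marks the first
    letter of each component, "I" marks every other letter.
    """
    labels = []
    for part in parts:
        if not part:
            raise ValueError("components must be non-empty")
        labels.append("B")
        labels.extend("I" * (len(part) - 1))
    return labels


def labels_to_split(word, labels):
    """Decode a per-letter label sequence back into components."""
    if len(word) != len(labels):
        raise ValueError("need exactly one label per letter")
    parts = []
    for letter, label in zip(word, labels):
        if label == "B" or not parts:
            parts.append(letter)   # start a new component
        else:
            parts[-1] += letter    # extend the current component
    return parts


def letter_features(word, i, window=2):
    """Illustrative word-form features for the letter at position i.

    Uses the letter itself plus its neighbours inside a fixed window;
    "^" and "$" pad the word boundaries. The template is an assumption.
    """
    padded = "^" * window + word.lower() + "$" * window
    j = i + window
    feats = {"letter": padded[j]}
    for k in range(1, window + 1):
        feats[f"prev{k}"] = padded[j - k]
        feats[f"next{k}"] = padded[j + k]
    return feats


labels = split_to_labels(["haus", "tür"])
components = labels_to_split("haustür", labels)
```

In a setup like this, a discriminative sequence model would be trained to predict the label of each letter from its features, and the predicted labels would be decoded into a split with `labels_to_split`.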

PUBLICATION RECORD

  • Publication year

    2016

  • Venue

    Special Interest Group on Computational Morphology and Phonology Workshop

  • Fields of study

    Linguistics, Computer Science

  • Source metadata

    Semantic Scholar
