Does syntax help discourse segmentation? Not so much

Chloé Braud, Ophélie Lacroix, Anders Søgaard

Published 2017 in Conference on Empirical Methods in Natural Language Processing

ABSTRACT

Discourse segmentation is the first step in building discourse parsers. Most work on discourse segmentation does not scale to real-world discourse parsing across languages, for two reasons: (i) models rely on constituent trees, and (ii) experiments have relied on gold-standard identification of sentence and token boundaries. We therefore investigate to what extent constituents can be replaced with universal dependencies, or left out completely, as well as how state-of-the-art segmenters fare in the absence of sentence boundaries. Our results show that dependency information is less useful than expected, but we provide a fully scalable, robust model that only relies on part-of-speech information, and show that it performs well across languages in the absence of any gold-standard annotation.

PUBLICATION RECORD

  • Publication year

    2017

  • Venue

    Conference on Empirical Methods in Natural Language Processing

  • Publication date

    2017-09-01

  • Fields of study

    Linguistics, Computer Science

  • Source metadata

    Semantic Scholar

