Linguistic Structured Sparsity in Text Categorization

Dani Yogatama, Noah A. Smith

Published 2014 in Annual Meeting of the Association for Computational Linguistics

ABSTRACT

We introduce three linguistically motivated structured regularizers based on parse trees, topics, and hierarchical word clusters for text categorization. These regularizers impose linguistic bias in feature weights, enabling us to incorporate prior knowledge into conventional bag-of-words models. We show that our structured regularizers consistently improve classification accuracies compared to standard regularizers that penalize features in isolation (such as lasso, ridge, and elastic net regularizers) on a range of datasets for various text prediction problems: topic classification, sentiment analysis, and forecasting.
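
The contrast the abstract draws, between penalizing features in isolation and penalizing them in linguistically defined groups, can be illustrated with a minimal sketch. It assumes a group-lasso-style penalty (a sum of L2 norms over feature groups); the abstract does not state the exact form of the regularizers, so this form, the function names, the weights, and the word groups below are all hypothetical and chosen only for illustration.

```python
import math


def lasso_penalty(weights, lam=1.0):
    """L1 penalty: every feature weight is penalized on its own."""
    return lam * sum(abs(w) for w in weights.values())


def group_lasso_penalty(weights, groups, lam=1.0):
    """Sum of L2 norms over feature groups (an assumed structured penalty).

    Each group is a list of feature names, e.g. the words of one sentence,
    one topic, or one word cluster. Groups are allowed to overlap.
    """
    total = 0.0
    for group in groups:
        total += math.sqrt(sum(weights.get(f, 0.0) ** 2 for f in group))
    return lam * total


# Hypothetical bag-of-words weights and hypothetical word groups.
weights = {"good": 3.0, "great": 4.0, "the": 0.0, "of": 0.0}
groups = [["good", "great"], ["the", "of"]]

l1 = lasso_penalty(weights)
gl = group_lasso_penalty(weights, groups)
print(l1, gl)
```

Under this assumed form, a group whose weights are all zero adds nothing to the penalty, so the regularizer tends to keep or discard whole groups of related words together, whereas the L1 penalty treats each word independently.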

PUBLICATION RECORD

  • Publication year

    2014

  • Venue

    Annual Meeting of the Association for Computational Linguistics

  • Publication date

    2014-06-01

  • Fields of study

    Linguistics, Computer Science
