Words without Boundaries: Computational Approaches to Chinese Word Segmentation
Published 2012 in Language and Linguistics Compass
ABSTRACT
Because Chinese orthography does not conventionally mark word boundaries, word segmentation is non-trivial, and it remains a challenging topic in Chinese computational linguistics. We survey previous approaches to Chinese word segmentation, including dictionary look-up, strength of internal binding, as well as character tagging and machine learning. We then propose the Word Boundary Decision (WBD) approach, which requires no prior lexical knowledge. The WBD model is shown to greatly reduce the complexity of Chinese word segmentation and may offer a promising way to address domain adaptation and robustness issues.
PUBLICATION RECORD
- Publication year: 2012
- Venue: Language and Linguistics Compass
- Publication date: 2012-08-01
- Fields of study: Linguistics, Computer Science
- Source metadata: Semantic Scholar