High-accuracy Annotation and Parsing of CHILDES Transcripts
Kenji Sagae, Eric Davis, A. Lavie, B. MacWhinney, S. Wintner
Published 2007 in Unknown venue
ABSTRACT
Corpora of child language are essential for psycholinguistic research. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe an ongoing project that aims to annotate the English section of the CHILDES database with grammatical relations in the form of labeled dependency structures. To date, we have produced a corpus of over 65,000 words with manually curated gold-standard grammatical relation annotations. Using this corpus, we have developed a highly accurate data-driven parser for English CHILDES data. The parser and the manually annotated data are freely available for research purposes.
PUBLICATION RECORD
- Publication year
2007
- Venue
Unknown venue
- Publication date
2007-06-29
- Fields of study
Linguistics, Computer Science
- Source metadata
Semantic Scholar