Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning

Minh-Thang Luong, Michael C. Frank, Mark Johnson

Published 2013 in Transactions of the Association for Computational Linguistics

ABSTRACT

Grounded language learning, the task of mapping from natural language to a representation of meaning, has attracted increasing interest in recent years. In most work on this topic, however, utterances in a conversation are treated independently and discourse structure information is largely ignored. In the context of language acquisition, this independence assumption discards cues that are important to the learner, e.g., the fact that consecutive utterances are likely to share the same referent (Frank et al., 2013). The current paper describes an approach to the problem of simultaneously modeling grounded language at the sentence and discourse levels. We combine ideas from parsing and grammar induction to produce a parser that can handle long input strings with thousands of tokens, creating parse trees that represent full discourses. By casting grounded language learning as a grammatical inference task, we use our parser to extend the work of Johnson et al. (2012), investigating the importance of discourse continuity in children's language acquisition and its interaction with social cues. Our model boosts performance in a language acquisition task and yields good discourse segmentations compared with human annotators.
PUBLICATION RECORD
- Publication year: 2013
- Venue: Transactions of the Association for Computational Linguistics
- Publication date: 2013-07-31
- Fields of study: Linguistics, Computer Science
- Source metadata: Semantic Scholar
REFERENCES
- 68 references

CITED BY
- 25 citing papers