Analyzing Assumptions in Conversation Disentanglement Research Through the Lens of a New Dataset and Model

Jonathan K. Kummerfeld, S. R. Gouravajhala, Joseph Peper, V. Athreya, R. Chulaka Gunasekara, Jatin Ganhotra, Siva Sankalp Patel, L. Polymenakos, Walter S. Lasecki

Published 2018 in arXiv.org

ABSTRACT

Disentangling conversations mixed together in a single stream of messages is a difficult task with no large annotated datasets. We created a new dataset that is 25 times the size of any previous publicly available resource, has samples of conversation from 152 points in time across a decade, and is annotated with both threads and a within-thread reply-structure graph. We also developed a new neural network model, which extracts conversation threads substantially more accurately than prior work. Using our annotated data and our model we tested assumptions in prior work, revealing major issues in heuristically constructed resources, and identifying how small datasets have biased our understanding of multi-party multi-conversation chat.
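The abstract mentions two annotation layers, threads and a within-thread reply-structure graph, without saying how one relates to the other. As a minimal sketch, assuming a thread is simply a connected component of the reply graph (an assumption made here for illustration, not a statement of the authors' method), thread labels could be derived from reply links as below. The function name, the (message, antecedent) link format, and the six-message example are all hypothetical.

```python
from collections import defaultdict


def threads_from_replies(num_messages, reply_links):
    """Group message indices into threads.

    reply_links is a list of (message, antecedent) index pairs.
    Assumption: a thread is a connected component of the reply graph.
    """
    parent = list(range(num_messages))  # union-find forest, one node per message

    def find(x):
        # Walk to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the components of each message and the message it replies to.
    for message, antecedent in reply_links:
        root_a, root_b = find(message), find(antecedent)
        if root_a != root_b:
            parent[root_a] = root_b

    # Collect messages by component root; each group is one thread.
    groups = defaultdict(list)
    for msg in range(num_messages):
        groups[find(msg)].append(msg)
    return sorted(groups.values())


# Hypothetical stream of six interleaved messages with four reply links.
links = [(1, 0), (3, 1), (4, 2), (5, 3)]
threads = threads_from_replies(6, links)
print(threads)
```

Under this assumption a message with no reply links forms a thread of its own, so an empty link list yields one thread per message.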

PUBLICATION RECORD

Publication year
2018
Venue
arXiv.org
Publication date
2018-10-25
Fields of study
Computer Science
Identifiers
arXiv 1810.11118
Source metadata
Semantic Scholar

REFERENCES

Recovering Implicit Thread Structure in Newsgroup Style Conversations
2021, cited by this paper
SLATE: A Super-Lightweight Annotation Tool for Experts
2019, cited by this paper
DSTC7 Task 1: Noetic End-to-End Response Selection
2019, cited by this paper
Who Is Answering to Whom? Finding “Reply-To” Relations in Group Chats with Long Short-Term Memory Networks
2018, cited by this paper
Learning to Disentangle Interleaved Conversational Threads with a Siamese Hierarchical Network and Similarity Ranking
2018, influential reference
Chat Disentanglement: Identifying Semantic Reply Relationships with Random Forests and Recurrent Neural Networks
2017, cited by this paper
Characterizing Online Discussion Using Coarse Discourse Sequences
2017, cited by this paper
Addressee and Response Selection in Multi-Party Conversations with Speaker Interaction RNNs
2017, cited by this paper
Training End-to-End Dialogue Systems with the Ubuntu Dialogue Corpus
2017, influential reference
Learning the Structures of Online Asynchronous Conversations
2017, cited by this paper
Piecewise Latent Variables for Neural Variational Text Processing
2017, cited by this paper
DyNet: The Dynamic Neural Network Toolkit
2017, cited by this paper
Addressee and Response Selection for Multi-Party Conversation
2016, cited by this paper
Multi-view Response Selection for Human-Computer Conversation
2016, cited by this paper
Sequential Match Network: A New Architecture for Multi-turn Response Selection in Retrieval-based Chatbots
2016, cited by this paper
Ubuntu-fr: A Large and Open Corpus for Multi-modal Analysis of Online Written Conversations
2016, cited by this paper
Discovering Conversational Dependencies between Messages in Dialogs
2016, cited by this paper
Internet Argument Corpus 2.0: An SQL schema for Dialogic Social Media and the Corpora to go with it
2016, cited by this paper
Enhanced LSTM for Natural Language Inference
2016, cited by this paper
A Novel Method for Unsupervised and Supervised Conversational Message Thread Detection
2016, cited by this paper
The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems
2015, influential reference
On comparing partitions
2015, cited by this paper
Conversation Trees: A Grammar Model for Topic Structure in Forums
2015, cited by this paper
Using discursive information to disentangle French language chat
2015, influential reference
Adam: A Method for Stochastic Optimization
2014, cited by this paper
A Supervised Approach to Predict the Hierarchical Structure of Conversation Threads for Comments
2014, cited by this paper
Conversations in the Crowd: Collecting Data for Task-Oriented Dialog Learning
2013, cited by this paper
Extending Word Highlighting in Multiparticipant Chat
2013, cited by this paper
A Supervised Approach for Reconstructing Thread Structure in Comments on Blogs and Online News Agencies (El enfoque supervisado para reconstrucción de la estructura de hilos en comentarios en blogs y agencias de noticias en línea)
2013, cited by this paper
Efficient Estimation of Word Representations in Vector Space
2013, cited by this paper
The Ubuntu Chat Corpus for Multiparticipant Chat Analysis
2013, cited by this paper
Hierarchical Conversation Structure Prediction in Multi-Party Chat
2012, influential reference
Predicting Thread Discourse Structure over Technical Web Forums
2011, cited by this paper
Reconstruction of Threaded Conversations in Online Discussion Forums
2011, cited by this paper
Disentangling Chat with Local Coherence Models
2011, cited by this paper
Learning online discussion structures by conditional random fields
2011, cited by this paper
Tagging and Linking Web Forum Posts
2010, cited by this paper
Disentangling Chat
2010, cited by this paper
Making Conversational Structure Explicit: Identification of Initiation-response Pairs within Online Discussions
2010, cited by this paper
Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance
2010, cited by this paper
Context-based Message Expansion for Disentanglement of Interleaved Text Conversations
2009, cited by this paper
Bounding and Comparing Methods for Correlation Clustering Beyond ILP
2009, cited by this paper
You Talking to Me? A Corpus and Algorithm for Conversation Disentanglement
2008, influential reference
Topic Detection and Extraction in Chat
2008, cited by this paper
Comparing clusterings---an information based distance
2007, cited by this paper
Extracting the discussion structure in comments on news-articles
2007, cited by this paper
Thread detection in dynamic text message streams
2006, influential reference
Practical statistics for medical research
1990, cited by this paper
A Coefficient of Agreement for Nominal Scales
1960, cited by this paper
Practical Statistics
1888, cited by this paper

CITED BY

CORE: Cooperative Training of Retriever-Reranker for Effective Dialogue Response Selection
2023, cites this paper
DialAug: Mixing up Dialogue Contexts in Contrastive Learning for Robust Conversational Modeling
2022, cites this paper
Domain-matched Pre-training Tasks for Dense Retrieval
2021, cites this paper
Building an Efficient and Effective Retrieval-based Dialogue System via Mutual Learning
2021, cites this paper
Noetic end-to-end response selection with supervised neural network based classifiers and unsupervised similarity models
2020, cites this paper
Sequential Neural Networks for Noetic End-to-End Response Selection
2020, cites this paper
Overview of the seventh Dialog System Technology Challenge: DSTC7
2020, influential citation
Knowledge-incorporating ESIM models for Response Selection in Retrieval-based Dialog Systems
2019, cites this paper
Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
2019, cites this paper
Multi-level Context Response Matching in Retrieval-Based Dialog Systems
2019, cites this paper
Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
2019, cites this paper
Constructing Interpretive Spatio-Temporal Features for Multi-Turn Responses Selection
2019, cites this paper
Sequential Attention-based Network for Noetic End-to-End Response Selection
2019, cites this paper
Dialog System Technology Challenge 7
2019, cites this paper
RAP-Net: Recurrent Attention Pooling Networks for Dialogue Response Selection
2019, cites this paper
Learning Multi-Level Information for Dialogue Response Selection by Highway Recurrent Transformer
2019, cites this paper
DSTC7 Task 1: Noetic End-to-End Response Selection
2019, cites this paper
End-to-End Question Answering Models for Goal-Oriented Dialog Learning
2019, cites this paper
Building Sequential Inference Models for End-to-End Response Selection
2018, cites this paper
Convolutional Neural Encoder for the 7th Dialogue System Technology Challenge
2018, cites this paper
Enhanced Sequential Representation Augmented with Utterance-level Attention for Response Selection
2018, cites this paper
Spatio-Temporal Matching Network for Multi-Turn Responses Selection in Retrieval-Based Chatbots
2018, cites this paper