Serial Speakers: a Dataset of TV Series

Xavier Bost, Vincent Labatut, Georges Linarès

Published 2020 in International Conference on Language Resources and Evaluation

ABSTRACT

For over a decade, TV series have been drawing increasing interest, both from the audience and from various academic fields. But while most viewers are hooked on the continuous plots of TV serials, the few annotated datasets available to researchers focus on standalone episodes of classical TV series. We aim to fill this gap by providing the multimedia and speech processing communities with “Serial Speakers”, an annotated dataset of 155 episodes from three popular American TV serials: “Breaking Bad”, “Game of Thrones” and “House of Cards”. “Serial Speakers” is suitable both for investigating multimedia retrieval in realistic use case scenarios and for addressing lower-level speech-related tasks in especially challenging conditions. We publicly release annotations for every speech turn (boundaries, speaker) and scene boundary, along with annotations for shot boundaries, recurring shots, and interacting speakers in a subset of episodes. Because of copyright restrictions, the textual content of the speech turns is encrypted in the public version of the dataset, but we provide users with a simple online tool to recover the plain text from their own subtitle files.

PUBLICATION RECORD

Publication year: 2020
Venue: International Conference on Language Resources and Evaluation
Publication date: 2020-02-17
Fields of study: Computer Science
Identifiers: arXiv 2002.06923
Source metadata: Semantic Scholar

REFERENCES

Remembering winter was coming
2019, cited by this paper
Extraction and Analysis of Fictional Character Networks
2019, cited by this paper
Montreal Forced Aligner: Trainable Text-Speech Alignment Using Kaldi
2017, cited by this paper
Characterization and Modeling
2017, cited by this paper
Improving Speaker Diarization of TV Series using Talking-Face Detection and Clustering
2016, influential reference
A storytelling machine? Automatic video summarization: the case of TV series (Une machine à raconter des histoires ? Résumé automatique de vidéos : le cas des séries TV)
2016, cited by this paper
Book2Movie: Aligning video scenes with book chapters
2015, influential reference
Accio: A Data Set for Face Track Retrieval in Movies Across Age
2015, cited by this paper
Audiovisual speaker diarization of TV series
2015, cited by this paper
Improved weak labels using contextual cues for person identification in videos
2015, influential reference
Constrained speaker diarization of TV series based on visual patterns
2014, cited by this paper
Story-based Video Retrieval in TV series using Plot Synopses
2014, influential reference
TVD: A Reproducible and Multiply Aligned TV Series Dataset
2014, influential reference
Knock, knock...
2014, cited by this paper
StoryGraphs: Visualizing Character Interactions as a Timeline
2014, influential reference
A time pooled track kernel for person identification
2014, cited by this paper
Semi-supervised Learning with Constraints for Person Identification in Multimedia Data
2013, influential reference
StoViz: story visualization of TV series
2012, influential reference
Toward plot de-interlacing in TV series using scenes clustering
2012, influential reference
“Knock! Knock! Who is it?” probabilistic person identification in TV-series
2012, cited by this paper
Segmentation of TV shows into scenes using speaker diarization and speech recognition
2012, cited by this paper
Speaker diarization of heterogeneous web video files: A preliminary study
2011, cited by this paper
Comparing Multi-Stage Approaches for Cross-Show Speaker Diarization
2011, cited by this paper
Segmentation TV series into scenes using speaker diarization
2011, cited by this paper
Segmenting TV Series into Scenes Using Speaker Diarization
2010, cited by this paper
MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment
2010, cited by this paper
Using Artistic Markers and Speaker Identification for Narrative-Theme Navigation of Seinfeld Episodes
2009, cited by this paper
Power-Law Distributions in Empirical Data
2007, cited by this paper
“Hello! My name is... Buffy” - Automatic Naming of Characters in TV Video
2006, cited by this paper
Characterization and modeling of weighted networks
2004, cited by this paper
Network connection strengths: Another power-law?
2003, cited by this paper
Temporal video segmentation: A survey
2001, cited by this paper
Automated high-level movie segmentation for advanced video-retrieval systems
1999, cited by this paper
Segmentation of Video by Clustering and Graph Analysis
1998, cited by this paper

CITED BY

Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges
2024, cites this paper
Decoding the Popularity of TV Series: A Network Analysis Perspective
2023, cites this paper
Bazinga! A Dataset for Multi-Party Dialogues Structuring
2022, influential citation
Towards Personalised and Document-level Machine Translation of Dialogue
2021, cites this paper
On the Need for Thoughtful Data Collection for Multi-Party Dialogue: A Survey of Available Corpora and Collection Methods
2021, cites this paper
MovieCuts: A New Dataset and Benchmark for Cut Type Recognition
2021, cites this paper