Transfer learning for topic labeling: Analysis of the UK House of Commons speeches 1935–2014

Hannah Béchara, Alexander Herzog, Slava Jankin, Peter John

Published 2021 in Research & Politics

ABSTRACT

Topic models are widely used in natural language processing, allowing researchers to estimate the underlying themes in a collection of documents. Most topic models require the additional step of attaching meaningful labels to estimated topics, a process that is not scalable, suffers from human bias, and is difficult to replicate. We present a transfer topic labeling method that seeks to remedy these problems, using domain-specific codebooks as the knowledge base to automatically label estimated topics. We demonstrate our approach with a large-scale topic model analysis of the complete corpus of UK House of Commons speeches from 1935 to 2014, using the coding instructions of the Comparative Agendas Project to label topics. We evaluate our results using human expert coding and compare our approach with more current state-of-the-art neural methods. Our approach is simple to implement, compares favorably to expert judgments, and outperforms the neural network model for a majority of the topics we estimate.
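
The general idea of labeling an estimated topic from a codebook can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring rule (cosine similarity over raw word counts), the function names, and the toy codebook entries below are all assumptions made for illustration, and the toy descriptions are not taken from the Comparative Agendas Project codebook.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def label_topic(top_words, codebook):
    """Return (label, score) for the codebook entry closest to the topic.

    top_words: the highest-probability words of one estimated topic.
    codebook:  mapping from category label to its coding description.
    Returns (None, 0.0) when no codebook entry shares a word with the topic.
    """
    topic_vec = Counter(w.lower() for w in top_words)
    scores = {
        label: cosine(topic_vec, bag_of_words(description))
        for label, description in codebook.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return None, 0.0
    return best, scores[best]


# Hypothetical toy codebook, invented for this example only.
TOY_CODEBOOK = {
    "Health": "hospitals doctors nurses patients health insurance medical treatment",
    "Defence": "army navy military weapons defence forces war veterans",
    "Education": "schools teachers pupils universities education students curriculum",
}

if __name__ == "__main__":
    topic = ["patients", "nurses", "doctors", "treatment", "waiting"]
    print(label_topic(topic, TOY_CODEBOOK))
```

A real pipeline would likely need stemming or lemmatization, stop-word removal, and term weighting before matching; the sketch omits these to keep the matching step visible.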

PUBLICATION RECORD

Publication year
2021
Venue
Research & Politics
Publication date
2021-04-01
Identifiers
DOI 10.1177/20531680211022206
Source metadata
Semantic Scholar

REFERENCES

Keyword Assisted Topic Models
2020, cited by this paper
Local News and National Politics
2019, cited by this paper
Concealing Corruption: How Chinese Officials Distort Upward Reporting of Online Grievances
2018, cited by this paper
Elites Tweet to Get Feet Off the Streets: Measuring Regime Social Media Strategies During Protest
2018, cited by this paper
Making Austerity Popular: The Media and Mass Attitudes toward Fiscal Policy
2018, cited by this paper
A Model of Text for Experimentation in the Social Sciences
2016, cited by this paper
Source-LDA: Enhancing Probabilistic Topic Models Using Prior Knowledge Sources
2016, cited by this paper
Reading Between the Lines: Prediction of Political Violence Using Newspaper Text
2016, cited by this paper
Automatic Labelling of Topics with Neural Embeddings
2016, influential reference
The Sensitivity of Topic Coherence Evaluation to Topic Cardinality
2016, cited by this paper
Labeling Topics with Images Using a Neural Network
2016, cited by this paper
Measuring Political Positions from Legislative Speech
2016, cited by this paper
The Politics of Scrutiny in Human Rights Monitoring: Evidence from Structural Topic Models of US State Department Human Rights Reports
2016, cited by this paper
The Most Unkindest Cuts: Speaker Selection and Expressed Government Dissent during Economic Crisis
2015, cited by this paper
A Textual Taylor Rule: Estimating Central Bank Preferences Combining Topic and Scaling Methods
2015, cited by this paper
Automatic Labelling of Topic Models Using Word Vectors and Letter Trigram Vectors
2015, cited by this paper
Policy Agendas in British Politics
2013, cited by this paper
Distributed Representations of Words and Phrases and their Compositionality
2013, cited by this paper
Automatic Labelling of Topic Models
2011, cited by this paper
How to Analyze Political Attention with Minimal Assumptions and Costs
2010, cited by this paper
A Survey on Transfer Learning
2010, cited by this paper
Software Framework for Topic Modelling with Large Corpora
2010, cited by this paper
Latent Dirichlet Allocation
2009, cited by this paper
Labeled LDA: A Supervised Topic Model for Credit Attribution in Multi-labeled Corpora
2009, cited by this paper
Automatic Labeling of Multinomial Topic Models
2007, cited by this paper
A Correlated Topic Model of Science
2007, cited by this paper
Supervised Topic Models
2007, cited by this paper
Comparative Studies of Policy Agendas
2006, cited by this paper
Hierarchical Dirichlet Processes
2006, cited by this paper
Finding Scientific Topics
2004, cited by this paper
Locating TDs in Policy Spaces: The Computational Text Analysis of Dáil Speeches
2002, cited by this paper

CITED BY

Agenda-setting studies in public policy: Origins, development, and new possibilities for coding in the age of AI
2026, cites this paper
Topic Modelling: Going Beyond Token Outputs
2024, cites this paper
Energy agendas: A longitudinal analysis of Finnish parliamentary debates
2024, cites this paper
Cross-Domain Topic Transfer Learning Method based on Multiple Balance and Feature Fusion
2024, cites this paper
Automatic Detection of Industry Sectors in Legal Articles Using Machine Learning Approaches
2023, cites this paper
If You Have Choices, Why Not Choose (and Share) All of Them? A Multiverse Approach to Understanding News Engagement on Social Media
2022, cites this paper