Learning Multiview Embeddings of Twitter Users

Published 2016 in Annual Meeting of the Association for Computational Linguistics

ABSTRACT

Low-dimensional vector representations are widely used as stand-ins for the text of words, sentences, and entire documents. These embeddings are used to identify similar words or make predictions about documents. In this work, we consider embeddings for social media users and demonstrate that these can be used to identify users who behave similarly or to predict attributes of users. In order to capture information from all aspects of a user’s online life, we take a multiview approach, applying a weighted variant of Generalized Canonical Correlation Analysis (GCCA) to a collection of over 100,000 Twitter users. We demonstrate the utility of these multiview embeddings on three downstream tasks: user engagement, friend selection, and demographic attribute prediction.
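To make the multiview idea concrete, here is a minimal sketch of weighted GCCA in its MAX-VAR form, where each view contributes a regularized projection matrix scaled by a per-view weight and the shared user embedding is read off the top eigenvectors of their sum. This is an illustration under stated assumptions, not the authors' implementation: the function name `weighted_gcca`, the `weights` and `reg` parameters, and the synthetic data are all hypothetical, and the sketch assumes every view is observed for every user.

```python
import numpy as np


def weighted_gcca(views, weights, k, reg=1e-3):
    """Weighted MAX-VAR GCCA sketch: returns an (n_users, k) shared embedding.

    views   : list of (n_users, d_i) arrays, one per view (hypothetical input)
    weights : list of non-negative per-view weights (hypothetical parameter)
    k       : embedding dimension
    reg     : ridge term that keeps each view's Gram matrix invertible
    """
    n = views[0].shape[0]
    M = np.zeros((n, n))
    for X, w in zip(views, weights):
        Xc = X - X.mean(axis=0)  # center each feature column
        d = Xc.shape[1]
        # Ridge-regularized projection onto the column space of this view.
        P = Xc @ np.linalg.solve(Xc.T @ Xc + reg * np.eye(d), Xc.T)
        M += w * P
    M = (M + M.T) / 2.0  # remove tiny asymmetries from floating-point error
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues come back in ascending order
    return eigvecs[:, ::-1][:, :k]  # keep the k leading eigenvectors


# Synthetic stand-in for three views of 200 users that share a 2-d signal.
rng = np.random.default_rng(0)
shared = rng.normal(size=(200, 2))
views = [
    shared @ rng.normal(size=(2, d)) + 0.1 * rng.normal(size=(200, d))
    for d in (5, 8, 6)
]
G = weighted_gcca(views, weights=[1.0, 0.5, 0.25], k=2)
print(G.shape)
```

Rows of `G` act as user embeddings that could feed a downstream classifier or nearest-neighbor search. The dense n-by-n matrix is only practical for small n; a collection of 100,000 users would need a low-rank approximation instead.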

PUBLICATION RECORD

Publication date
2016-08-01
Fields of study
Computer Science
Identifiers
DOI 10.18653/v1/P16-2003

CITED BY

Evaluation of LLMs-based Hidden States as Author Representations for Psychological Human-Centered NLP Tasks (2025, cites this paper)
AI POWERED SYSTEM QUANTIFIES SUICIDE INDICATORS AND IDENTIFIES SUICIDE RELATED CONTENT IN ONLINE POSTS (2025, cites this paper)
Systematic Evaluation of Auto-Encoding and Large Language Model Representations for Capturing Author States and Traits (2025, cites this paper)
Multi-view subspace text clustering (2024, cites this paper)
Do We Trust What They Say or What They Do? A Multimodal User Embedding Provides Personalized Explanations (2024, cites this paper)
PASUM: A Pre-training Architecture for Social Media User Modeling Based on Text Graph (2024, cites this paper)
From Text to Context: Contextualizing Language with Humans, Groups, and Communities for Socially Aware NLP (2024, cites this paper)
From Text to Context: An Entailment Approach for News Stakeholder Classification (2024, cites this paper)
Rumor Detection by Jointly Learning Propagation Patterns and Users’ Personas (2024, cites this paper)
Extraordinarily Time- and Memory-Efficient Large-Scale Canonical Correlation Analysis in Fourier Domain: From Shallow to Deep (2023, cites this paper)
A deep semantic matching approach for identifying relevant messages for social media analysis (2023, cites this paper)
Domain-based user embedding for competing events on social media (2023, cites this paper)
Large Human Language Models: A Need and the Challenges (2023, cites this paper)
Social world knowledge: Modeling and applications (2023, cites this paper)
Speaker landscapes: machine learning opens a window on the everyday language of opinion (2023, cites this paper)
Time and Memory Efficient Large-Scale Canonical Correlation Analysis in Fourier Domain (2022, influential citation)
Enriching Unsupervised User Embedding via Medical Concepts (2022, cites this paper)
Can Contextualizing User Embeddings Improve Sarcasm and Hate Speech Detection? (2022, cites this paper)
Learning Interest-oriented Universal User Representation via Self-supervision (2022, cites this paper)
Deep multiview learning to identify imaging-driven subtypes in mild cognitive impairment (2022, cites this paper)
Information Processing and Management (2021, cites this paper)
User Factor Adaptation for User Embedding via Multitask Learning (2021, cites this paper)
Analysis of Twitter Users' Lifestyle Choices using Joint Embedding Model (2021, cites this paper)
SocialVec: Social Entity Embeddings (2021, cites this paper)
NPS-AntiClone: Identity Cloning Detection based on Non-Privacy-Sensitive User Profile Data (2021, cites this paper)
Interest-oriented Universal User Representation via Contrastive Learning (2021, cites this paper)
Variational graph autoencoders for multiview canonical correlation analysis (2021, cites this paper)
American Politicians Diverge Systematically, Indian Politicians do so Chaotically: Text Embeddings as a Window into Party Polarization (2021, influential citation)
Generalizable Identity Classifiers from Self-Reporting Statements on Reddit (2020, cites this paper)
Diachronically like-minded user community detection (2020, cites this paper)
Author2Vec: A Framework for Generating User Embedding (2020, cites this paper)
eDarkFind: Unsupervised Multi-view Learning for Sybil Account Detection (2020, cites this paper)
Sparse Generalized Canonical Correlation Analysis: Distributed Alternating Iteration-Based Approach (2020, cites this paper)
Analyzing and Detecting Collusive Users Involved in Blackmarket Retweeting Activities (2020, influential citation)
Returning the N to NLP: Towards Contextually Personalized Classification Models (2020, cites this paper)
Assessing the Severity of Health States based on Social Media Posts (2020, cites this paper)
Drink2Vec: Improving the classification of alcohol-related tweets using distributional semantics and external contextual enrichment (2020, cites this paper)
Multiview Variational Graph Autoencoders for Canonical Correlation Analysis (2020, cites this paper)
Sparse generalized canonical correlation analysis via linearized Bregman method (2020, influential citation)
Does Yoga Make You Happy? Analyzing Twitter User Happiness using Textual and Temporal Information (2020, cites this paper)
Exploiting Behavioral Consistence for Universal User Representation (2020, cites this paper)
Do You Do Yoga? Understanding Twitter Users' Types and Motivations using Social and Textual Information (2020, cites this paper)
Deep Multiview Learning to Identify Population Structure with Multimodal Imaging (2020, cites this paper)
User community detection via embedding of social network structure and temporal content (2020, cites this paper)
How to Evaluate Word Representations of Informal Domain? (2019, cites this paper)
Modeling Large-Scale Dynamic Social Networks via Node Embeddings (2019, cites this paper)
Transfer Learning for Unsupervised Influenza-like Illness Models from Online Search Data (2019, cites this paper)
A Content-Based Approach to Email Triage Action Prediction: Exploration and Evaluation (2019, cites this paper)
Performance evaluation of methods for integrative dimension reduction (2019, cites this paper)
Information Diffusion Prediction with Network Regularized Role-based User Representation Learning (2019, cites this paper)
Representation Learning for Words and Entities (2019, cites this paper)
Predicting delay discounting from heterogeneous social media data (2019, cites this paper)
User Level Multi-feed Weighted Topic Embeddings for Studying Network Interaction in Twitter (2019, cites this paper)
Social Media-based User Embedding: A Literature Review (2019, influential citation)
Canonical Correlation Analysis (CCA) Based Multi-View Learning: An Overview (2019, influential citation)
Multi-modal Sentiment Analysis using Deep Canonical Correlation Analysis (2019, cites this paper)
Talkographics: Measuring TV and Brand Audience Demographics and Interests from User-Generated Content (2019, cites this paper)
UAFA: Unsupervised Attribute-Friendship Attention Framework for User Representation (2019, cites this paper)
Learning Invariant Representations of Social Media Users (2019, cites this paper)
Are Online Reviews of Physicians Biased Against Female Providers? (2019, cites this paper)
Learning a Faceted Customer Segmentation for Discovering new Business Opportunities at Intel (2019, cites this paper)
Multiview Representation Learning for a Union of Subspaces (2019, cites this paper)
Graph Multiview Canonical Correlation Analysis (2018, cites this paper)
Multi-view Network Embedding via Graph Factorization Clustering and Co-regularized Multi-view Agreement (2018, cites this paper)
CASCADE: Contextual Sarcasm Detection in Online Discussion Forums (2018, cites this paper)
Learning multiview embeddings for assessing dementia (2018, cites this paper)
Deep Canonically Correlated LSTMs (2018, cites this paper)
client2vec: Towards Systematic Baselines for Banking Applications (2018, cites this paper)
Mining Daily Canonical Correlations among Multivariable Electricity, Gas and Climate Data (2018, cites this paper)
Using Author Embeddings to Improve Tweet Stance Classification (2018, influential citation)
Automatically Infer Human Traits and Behavior from Social Media Data (2018, cites this paper)
Community Member Retrieval on Social Media Using Textual Information (2018, cites this paper)
Gender Recognition Based on Social Networks for Multimedia Production (2018, cites this paper)
What's ur Type? Contextualized Classification of User Types in Marijuana-Related Communications Using Compositional Multiview Embedding (2018, cites this paper)
CoupleNet: Paying Attention to Couples with Coupled Attention for Relationship Recommendation (2018, cites this paper)
Interpreting Social Media-Based Substance Use Prediction Models with Knowledge Distillation (2018, influential citation)
Learning Representations of Social Media Users (2018, influential citation)
Social Media-based Substance Use Prediction (2017, cites this paper)
How well can machine learning predict demographics of social media users (2017, cites this paper)
Multiview Community Discovery Algorithm via Nonnegative Factorization Matrix in Heterogeneous Networks (2017, cites this paper)
Joint Embedding Models for Textual and Social Analysis (2017, cites this paper)
Temporally Like-minded User Community Identification through Neural Embeddings (2017, cites this paper)
Detection of User Demographics on Social Media: A Review of Methods and Recommendations for Best Practices (2017, cites this paper)
Detecting Culture-specific Tags for News Videos through Multimodal Embedding (2017, cites this paper)
Improved Abusive Comment Moderation with User Embeddings (2017, cites this paper)
Multi-View Unsupervised User Feature Embedding for Social Media-based Substance Use Prediction (2017, cites this paper)
Deep Generalized Canonical Correlation Analysis (2017, influential citation)
Stochastic Approximation for Canonical Correlation Analysis (2017, cites this paper)
Incorporating Metadata into Content-Based User Embeddings (2017, cites this paper)
Application of multiview techniques to NHANES dataset (2016, cites this paper)
Evaluating Informal-Domain Word Representations With UrbanDictionary (2016, cites this paper)
Stochastic Optimization for Multiview Representation Learning using Partial Least Squares (2016, cites this paper)