Ultradense Word Embeddings by Orthogonal Transformation

S. Rothe, Sebastian Ebert, Hinrich Schütze

Published 2016 in North American Chapter of the Association for Computational Linguistics

ABSTRACT

Embeddings are generic representations that are useful for many NLP tasks. In this paper, we introduce DENSIFIER, a method that learns an orthogonal transformation of the embedding space that focuses the information relevant for a task in an ultradense subspace whose dimensionality is smaller by a factor of 100 than that of the original space. We show that ultradense embeddings generated by DENSIFIER reach state-of-the-art performance on a lexicon creation task in which words are annotated with three types of lexical information: sentiment, concreteness, and frequency. On the SemEval-2015 Task 10B sentiment analysis task, we show that no information is lost when the ultradense subspace is used, while training is an order of magnitude more efficient due to the compactness of the ultradense space.
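
The abstract does not describe how the transformation is trained, so the following is only an illustrative sketch of the general idea: rotate the space with an orthogonal matrix so that task-relevant information lands in a few leading coordinates. It assumes synthetic 100-dimensional embeddings with a binary label, and it uses the difference of class means as a stand-in for a learned direction. The function name `orthogonal_with_first_axis` and all variable names are hypothetical and do not come from the paper.

```python
import numpy as np

def orthogonal_with_first_axis(direction):
    """Build an orthogonal matrix Q whose first row is the unit-length `direction`."""
    d = direction.shape[0]
    u = direction / np.linalg.norm(direction)
    # QR of [u | I]: the first column of the orthonormal factor is +u or -u,
    # and the remaining columns complete an orthonormal basis of the space.
    basis, _ = np.linalg.qr(np.column_stack([u, np.eye(d)]))
    if basis[:, 0] @ u < 0:
        basis[:, 0] *= -1.0  # flipping one column keeps the matrix orthogonal
    return basis.T  # rows are the new coordinate axes

# Synthetic data (an assumption for illustration, not the paper's data):
# a binary label is hidden along one random direction of a 100-d space.
rng = np.random.default_rng(0)
dim, n_words = 100, 200
labels = np.repeat([1.0, -1.0], n_words // 2)
hidden = rng.normal(size=dim)
hidden /= np.linalg.norm(hidden)
embeddings = rng.normal(size=(n_words, dim)) + 3.0 * labels[:, None] * hidden

# Stand-in for a learned direction: difference of the two class means.
direction = embeddings[labels > 0].mean(axis=0) - embeddings[labels < 0].mean(axis=0)
Q = orthogonal_with_first_axis(direction)

rotated = embeddings @ Q.T      # same geometry, new axes
ultradense = rotated[:, :1]     # keep 1 of 100 dimensions

agreement = np.mean(np.sign(ultradense[:, 0]) == labels)
print("sign agreement with labels:", agreement)
```

Because `Q` is orthogonal, the rotation preserves vector norms and pairwise distances; information is discarded only when the rotated vectors are truncated to the leading coordinates.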

PUBLICATION RECORD

Publication year
2016
Venue
North American Chapter of the Association for Computational Linguistics
Publication date
2016-02-24
Fields of study
Computer Science
Identifiers
DOI 10.18653/v1/N16-1091 arXiv 1602.07572
Source metadata
Semantic Scholar

REFERENCES

INESC-ID: A Regression Model for Large Scale Twitter Sentiment Lexicon Induction
2015, cited by this paper
A Linguistically Informed Convolutional Neural Network
2015, cited by this paper
KLUEless: Polarity Classification and Association
2015, cited by this paper
Learning Composition Models for Phrase Embeddings
2015, cited by this paper
Lsislif: Feature Extraction and Label Weighting for Sentiment Analysis in Twitter
2015, cited by this paper
Skip-Thought Vectors
2015, cited by this paper
AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes
2015, cited by this paper
UNITN: Training Deep Convolutional Neural Network for Twitter Sentiment Classification
2015, cited by this paper
CLaC-SentiPipe: SemEval2015 Subtasks 10 B,E, and Task 11
2015, cited by this paper
SemEval-2015 Task 10: Sentiment Analysis in Twitter
2015, cited by this paper
ECNU: Multi-level Sentiment Analysis on Twitter Using Traditional Linguistic Features and Word Embedding Features
2015, cited by this paper
Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation
2015, cited by this paper
Sentiment Analysis of Short Informal Texts
2014, cited by this paper
Convolutional Neural Networks for Sentence Classification
2014, cited by this paper
Building Large-Scale Twitter-Specific Sentiment Lexicon: A Representation Learning Approach
2014, cited by this paper
Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification
2014, cited by this paper
A Convolutional Neural Network for Modelling Sentences
2014, cited by this paper
GloVe: Global Vectors for Word Representation
2014, cited by this paper
Concreteness ratings for 40 thousand generally known English word lemmas
2014, cited by this paper
SemEval-2013 Task 2: Sentiment Analysis in Twitter
2013, cited by this paper
Exploiting Similarities among Languages for Machine Translation
2013, cited by this paper
Crowdsourcing a Word–Emotion Association Lexicon
2013, cited by this paper
NRC-Canada: Building the State-of-the-Art in Sentiment Analysis of Tweets
2013, cited by this paper
Sentiment Analysis in Czech Social Media Using Supervised Machine Learning
2013, cited by this paper
Learning Sentiment Lexicons in Spanish
2012, cited by this paper
Sentiment Lexicon Creation from Lexical Resources
2011, cited by this paper
Natural Language Processing (Almost) from Scratch
2011, cited by this paper
Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
2011, cited by this paper
Rectified Linear Units Improve Restricted Boltzmann Machines
2010, cited by this paper
Sentiment Translation through Lexicon Induction
2010, cited by this paper
GermanPolarityClues: A Lexical Resource for German Sentiment Analysis
2010, cited by this paper
Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis
2005, cited by this paper
Mining and summarizing customer reviews
2004, cited by this paper
Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews
2002, cited by this paper
Principal component neural networks — Theory and applications
1998, influential reference
Some metric inequalities in the space of matrices
1955, cited by this paper

CITED BY

Emotion Classification Using Large Language Models: A Comparison of Fine-Tuning and Prompting
2025, cites this paper
eXplainable AI for Word Embeddings: A Survey
2024, cites this paper
Optimal synthesis embeddings
2024, cites this paper
Bridging Natural Language Processing and Psycholinguistics: computationally grounded semantic similarity datasets for Basque and Spanish
2023, cites this paper
Emotion Embeddings — Learning Stable and Homogeneous Abstractions From Heterogeneous Affective Datasets
2023, cites this paper
Dialectograms: Machine Learning Differences between Discursive Communities
2023, cites this paper
for Sentiment, Affect, and Connotation
2023, cites this paper
Bridging Natural Language Processing and Psycholinguistics: computationally grounded semantic similarity and relatedness datasets for Basque and Spanish
2023, cites this paper
Distil Knowledge from Natural Language
2022, cites this paper
Explainability of Text Processing and Retrieval Methods: A Critical Survey
2022, influential citation
Are Male Candidates Better than Females? Debiasing BERT Resume Retrieval System*
2022, cites this paper
Gender Norms Do Not Persist But Converge Across Time
2022, influential citation
On the Geometry of Concreteness
2022, influential citation
Explainability of Text Processing and Retrieval Methods: A Survey
2022, influential citation
Locating Language-Specific Information in Contextualized Embeddings
2021, cites this paper
Probabilistic Latent Semantic Scaling
2021, cites this paper
Modeling Profanity and Hate Speech in Social Media with Semantic Subspaces
2021, influential citation
Guilt by Association: Emotion Intensities in Lexical Representations
2021, cites this paper
Acquiring a Formality-Informed Lexical Resource for Style Analysis
2021, cites this paper
Less is More/More Diverse: On The Communicative Utility of Linguistic Conventionalization
2021, cites this paper
Detecting Domain Polarity-Changes of Words in a Sentiment Lexicon
2020, cites this paper
Comparing Deep Neural Networks to Traditional Models for Sentiment Analysis in Turkish Language
2020, cites this paper
A comprehensive analysis of adverb types for mining user sentiments on amazon product reviews
2020, cites this paper
Learning variable-length representation of words
2020, cites this paper
Towards Verifying Results from Biomedical NLP Machine Learning Models Using the UMLS: Cases of Classification and Named Entity Recognition
2020, cites this paper
Towards Label-Agnostic Emotion Embeddings
2020, cites this paper
Building domain specific lexicon based on TikTok comment dataset
2020, influential citation
Towards a Unified Framework for Emotion Analysis
2020, cites this paper
Domain-Specific Sentiment Lexicons Induced from Labeled Documents
2020, cites this paper
Target-Level Sentiment Analysis on Various Genres
2020, cites this paper
Techno-Economic Assessment of Three Modes of Large-Scale Crop Residue Utilization Projects in China
2020, cites this paper
Querying subjective data
2020, cites this paper
Where Words Get their Meaning
2020, cites this paper
Social Informatics: 12th International Conference, SocInfo 2020, Pisa, Italy, October 6–9, 2020, Proceedings
2020, cites this paper
Exploiting semantic relationships for unsupervised expansion of sentiment lexicons
2020, cites this paper
Knowledge Efficient Deep Learning for Natural Language Processing
2020, cites this paper
Exploiting Latent Semantic Subspaces to Derive Associations for Specific Pharmaceutical Semantics
2020, cites this paper
Predicting the Concreteness of German Words
2020, cites this paper
Learning Lexical Subspaces in a Distributional Vector Space
2020, influential citation
Mining Semantic Subspaces to Express Discipline-Specific Similarities
2020, cites this paper
Explainable Word-Embeddings for Medical Digital Libraries - A Context-Aware Approach
2020, influential citation
A Computational Analysis of Polarization on Indian and Pakistani Social Media
2020, cites this paper
Learning and Evaluating Emotion Lexicons for 91 Languages
2020, cites this paper
Predicting Word Concreteness and Imagery
2019, cites this paper
Subjective Databases
2019, cites this paper
Attentive Mimicking: Better Word Embeddings by Attending to Informative Contexts
2019, influential citation
Analytical Methods for Interpretable Ultradense Word Embeddings
2019, influential citation
UniSent: Universal Adaptable Sentiment Lexica for 1000+ Languages
2019, influential citation
Analyzing Structures in the Semantic Vector Space: A Framework for Decomposing Word Embeddings
2019, cites this paper
A Multilingual BPE Embedding Space for Universal Sentiment Lexicon Induction
2019, influential citation
Automatic Domain Adaptation Outperforms Manual Domain Adaptation for Predicting Financial Outcomes
2019, cites this paper
EmoLabel: Semi-Automatic Methodology for Emotion Annotation of Social Media Text
2019, cites this paper
CatE: Category-Name Guided Word Embedding
2019, cites this paper
SenSALDO: a Swedish Sentiment Lexicon for the SWE-CLARIN Toolbox
2019, influential citation
Cross-lingual Structure Transfer for Relation and Event Extraction
2019, cites this paper
Rotate King to get Queen: Word Relationships as Orthogonal Transformations in Embedding Space
2019, cites this paper
Retrofitting Contextualized Word Embeddings with Paraphrases
2019, cites this paper
Improving Pre-Trained Multilingual Model with Vocabulary Expansion
2019, cites this paper
Grammatical Gender, Neo-Whorfianism, and Word Embeddings: A Data-Driven Approach to Linguistic Relativity
2019, cites this paper
Word embeddings: reliability & semantic change
2019, cites this paper
Discriminative Topic Mining via Category-Name Guided Text Embedding
2019, cites this paper
Improving word embeddings projection for Turkish hypernym extraction
2019, cites this paper
The impact of inflection on word vectors
2019, cites this paper
Towards Interpretation of Node Embeddings
2018, cites this paper
UWB at IEST 2018: Emotion Prediction in Tweets with Bidirectional Long Short-Term Memory Neural Network
2018, cites this paper
Deep recurrent convolutional networks for inferring user interests from social media
2018, cites this paper
SenSALDO: Creating a Sentiment Lexicon for Swedish
2018, influential citation
Refining Pretrained Word Embeddings Using Layer-wise Relevance Propagation
2018, cites this paper
Aff2Vec: Affect–Enriched Distributional Word Representations
2018, cites this paper
Modeling Word Emotion in Historical Language: Quantity Beats Supposed Stability in Seed Word Selection
2018, cites this paper
A Feature-Oriented Sentiment Rating for Mobile App Reviews
2018, cites this paper
Uncovering Divergent Linguistic Information in Word Embeddings with Lessons for Intrinsic and Extrinsic Evaluation
2018, cites this paper
Interpretable Word Embeddings for Medical Domain
2018, cites this paper
Using Sentiment Induction to Understand Variation in Gendered Online Communities
2018, influential citation
Learning Concept Abstractness Using Weak Supervision
2018, cites this paper
Learning Sentiment Composition from Sentiment Lexicons
2018, cites this paper
Tackling the Challenge of Emotion Annotation in Text
2018, cites this paper
Affect Lexicon Induction For the Github Subculture Using Distributed Word Representations
2018, cites this paper
EmotionX-AR: CNN-DCNN autoencoder based Emotion Classifier
2018, cites this paper
Two Methods for Domain Adaptation of Bilingual Tasks: Delightfully Simple and Broadly Applicable
2018, cites this paper
Predicting Concreteness and Imageability of Words Within and Across Languages via Word Embeddings
2018, cites this paper
Inducing Affective Lexical Semantics in Historical Language
2018, influential citation
Multi-lingual Common Semantic Space Construction via Cluster-consistent Word Embedding
2018, cites this paper
SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment
2018, influential citation
Bootstrap Domain-Specific Sentiment Classifiers from Unlabeled Corpora
2018, cites this paper
UWB at SemEval-2018 Task 1: Emotion Intensity Detection in Tweets
2018, cites this paper
Word Emotion Induction for Multiple Languages as a Deep Multi-Task Learning Problem
2018, cites this paper
SemEval-2018 Task 1: Affect in Tweets
2018, cites this paper
Inferring Affective Meanings of Words from Word Embedding
2017, cites this paper
Second-Order Word Embeddings from Nearest Neighbor Topological Features
2017, cites this paper
Semi-Supervised Affective Meaning Lexicon Expansion Using Semantic and Distributed Word Representations
2017, influential citation
Towards the Improvement of Automatic Emotion Pre-annotation with Polarity and Subjective Information
2017, cites this paper
Elucidating Conceptual Properties from Word Embeddings
2017, cites this paper
Improving Claim Stance Classification with Lexical Knowledge Expansion and Context Utilization
2017, cites this paper
What do we need to build explainable AI systems for the medical domain?
2017, cites this paper
Supervised and unsupervised methods for learning representations of linguistic units
2017, influential citation
Cross-Lingual Sentiment Relation Capturing for Cross-Lingual Sentiment Analysis
2017, cites this paper
Distributed representations for fine-grained entity typing
2017, cites this paper
A Study of Style in Machine Translation: Controlling the Formality of Machine Translation Output
2017, influential citation
Discovering Stylistic Variations in Distributional Vector Space Models via Lexical Paraphrases
2017, influential citation