SimCSE: Simple Contrastive Learning of Sentence Embeddings

Published 2021 in Conference on Empirical Methods in Natural Language Processing

ABSTRACT

This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings. We first describe an unsupervised approach, which takes an input sentence and predicts itself in a contrastive objective, with only standard dropout used as noise. This simple method works surprisingly well, performing on par with previous supervised counterparts. We find that dropout acts as minimal data augmentation, and removing it leads to a representation collapse. Then, we propose a supervised approach, which incorporates annotated pairs from natural language inference datasets into our contrastive learning framework by using “entailment” pairs as positives and “contradiction” pairs as hard negatives. We evaluate SimCSE on standard semantic textual similarity (STS) tasks, and our unsupervised and supervised models using BERT base achieve an average of 76.3% and 81.6% Spearman’s correlation respectively, a 4.2% and 2.2% improvement compared to the previous best results. We also show, both theoretically and empirically, that the contrastive learning objective regularizes pre-trained embeddings’ anisotropic space to be more uniform, and that it better aligns positive pairs when supervised signals are available.
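
As a rough illustration of the objective described in the abstract, the sketch below computes an in-batch contrastive (InfoNCE-style) loss in plain Python. It is not the authors' implementation. The function names (`cosine`, `dropout`, `contrastive_loss`), the toy vectors, the temperature of 0.05, and the dropout rate of 0.1 are all assumptions chosen for illustration, and dropout is applied directly to toy vectors as a stand-in for the dropout inside a real encoder. Passing `hard_negatives` mimics the supervised variant, in which "contradiction" pairs act as hard negatives.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Guard against an all-zero vector (possible after dropout).
    return dot / max(norm_u * norm_v, 1e-12)


def dropout(vec, p, rng):
    """Inverted dropout on a toy vector (stand-in for encoder dropout)."""
    return [0.0 if rng.random() < p else x / (1.0 - p) for x in vec]


def contrastive_loss(anchors, positives, hard_negatives=None, temperature=0.05):
    """Mean cross-entropy over similarities; row i's positive is positives[i].

    All other positives in the batch serve as in-batch negatives. If
    hard_negatives is given, every hard negative is added as an extra
    negative for every anchor.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        if hard_negatives is not None:
            logits += [cosine(anchors[i], neg) / temperature for neg in hard_negatives]
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denominator - logits[i]
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical "sentence embeddings": four random 16-dimensional vectors.
    sentences = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(4)]
    # Two noisy views of the same sentences, differing only in the dropout mask.
    view_a = [dropout(s, 0.1, rng) for s in sentences]
    view_b = [dropout(s, 0.1, rng) for s in sentences]
    print("unsupervised-style loss:", contrastive_loss(view_a, view_b))
```

In this sketch the only difference between the two views of a sentence is the random dropout mask, which is the sense in which the abstract calls dropout a minimal data augmentation.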

PUBLICATION RECORD

Publication year
2021
Venue
Conference on Empirical Methods in Natural Language Processing
Publication date
2021-04-18
Fields of study
Computer Science
Identifiers
DOI 10.18653/v1/2021.emnlp-main.552 arXiv 2104.08821
Source metadata
Semantic Scholar
