Learning deep representations by mutual information estimation and maximization
R. Devon Hjelm, A. Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Adam Trischler, Yoshua Bengio
Published 2018 at the International Conference on Learning Representations
ABSTRACT
This work investigates unsupervised learning of representations by maximizing mutual information between an input and the output of a deep neural network encoder. Importantly, we show that structure matters: incorporating knowledge about locality in the input into the objective can significantly improve a representation's suitability for downstream tasks. We further control characteristics of the representation by matching to a prior distribution adversarially. Our method, which we call Deep InfoMax (DIM), outperforms a number of popular unsupervised learning methods and compares favorably with fully-supervised learning on several classification tasks with some standard architectures. DIM opens new avenues for unsupervised learning of representations and is an important step towards flexible formulations of representation learning objectives for specific end-goals.
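To make the objective concrete, the following is a minimal PyTorch sketch of a DIM-style global mutual-information objective: an encoder produces local feature maps and a global code, and a small discriminator scores matched (features, code) pairs against mismatched pairs drawn by shuffling the batch, giving a Jensen-Shannon-style lower bound on mutual information. The names (SmallEncoder, MIDiscriminator, jsd_mi_loss) and all sizes are illustrative assumptions, not the authors' released code; the paper's local objective and adversarial prior matching are omitted.

# Minimal sketch of a DIM-style global MI objective (hypothetical names, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallEncoder(nn.Module):
    """Maps a 32x32 RGB image to local feature maps and a global code."""
    def __init__(self, code_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.fc = nn.Linear(64 * 8 * 8, code_dim)  # assumes 32x32 inputs

    def forward(self, x):
        feats = self.conv(x)              # local feature maps
        code = self.fc(feats.flatten(1))  # global representation
        return feats, code

class MIDiscriminator(nn.Module):
    """Scores (feature-map, code) pairs: high for matched, low for mismatched."""
    def __init__(self, feat_dim=64 * 8 * 8, code_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + code_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, feats, code):
        return self.net(torch.cat([feats.flatten(1), code], dim=1))

def jsd_mi_loss(disc, feats, code):
    """Jensen-Shannon MI lower bound: positives are matched pairs; negatives
    pair each code with features from another image in the batch."""
    pos = disc(feats, code)
    neg = disc(torch.roll(feats, shifts=1, dims=0), code)
    # softplus(-pos) rewards matched pairs, softplus(neg) penalizes mismatched ones.
    return F.softplus(-pos).mean() + F.softplus(neg).mean()

if __name__ == "__main__":
    enc, disc = SmallEncoder(), MIDiscriminator()
    opt = torch.optim.Adam(list(enc.parameters()) + list(disc.parameters()), lr=1e-4)
    x = torch.randn(16, 3, 32, 32)  # stand-in batch of images
    feats, code = enc(x)
    loss = jsd_mi_loss(disc, feats, code)
    loss.backward()
    opt.step()
    print(float(loss))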
PUBLICATION RECORD
- Publication year
2018
- Venue
International Conference on Learning Representations
- Publication date
2018-08-20
- Fields of study
Mathematics, Computer Science
- Source metadata
Semantic Scholar