Adversarial Training for Unsupervised Bilingual Lexicon Induction

Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun

Published 2017 in Annual Meeting of the Association for Computational Linguistics

ABSTRACT

Word embeddings are well known to capture linguistic regularities of the language on which they are trained. Researchers also observe that these regularities can transfer across languages. However, previous endeavors to connect separate monolingual word embeddings typically require cross-lingual signals as supervision, either in the form of a parallel corpus or a seed lexicon. In this work, we show that such a cross-lingual connection can actually be established without any form of supervision. We achieve this end by formulating the problem as a natural adversarial game and investigating techniques that are crucial to successful training. We evaluate on the unsupervised bilingual lexicon induction task. Even though this task appears intrinsically cross-lingual, we are able to demonstrate encouraging performance without any cross-lingual clues.
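
The abstract does not spell out the model, so the following is only a minimal sketch of what an adversarial game over two embedding spaces could look like, not the authors' implementation. It assumes a linear generator G that maps source embeddings into the target space and a logistic-regression discriminator (w, b) that tries to tell mapped source vectors from real target vectors. The function names adversarial_step and induce_lexicon, the synthetic data, and all hyperparameters are hypothetical. A linear discriminator only sees first-order differences between the two distributions, so a practical system would likely need a nonlinear one.

```python
import numpy as np

def sigmoid(z):
    # Clip logits so np.exp never overflows.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def adversarial_step(G, w, b, x, y, lr=0.05):
    """One alternating update (hypothetical helper).

    x: batch of source embeddings, shape (n, d)
    y: batch of target embeddings, shape (n, d); NOT aligned with x
    """
    n = x.shape[0]
    fake = x @ G  # source vectors mapped into the target space

    # Discriminator: real target vectors get label 1, mapped source gets 0.
    p_real = sigmoid(y @ w + b)
    p_fake = sigmoid(fake @ w + b)
    # Gradient of binary cross-entropy w.r.t. each logit is (p - label).
    grad_w = (y.T @ (p_real - 1.0) + fake.T @ p_fake) / n
    grad_b = (np.sum(p_real - 1.0) + np.sum(p_fake)) / n
    w = w - lr * grad_w
    b = b - lr * grad_b

    # Generator: update G so mapped source vectors are scored as real.
    p_fake = sigmoid(fake @ w + b)
    grad_G = np.outer(x.T @ (p_fake - 1.0), w) / n
    G = G - lr * grad_G
    return G, w, b

def induce_lexicon(G, X, Y):
    """For each source word, return the index of its nearest target word
    by cosine similarity after mapping (hypothetical helper)."""
    mapped = X @ G
    mapped = mapped / (np.linalg.norm(mapped, axis=1, keepdims=True) + 1e-12)
    Yn = Y / (np.linalg.norm(Y, axis=1, keepdims=True) + 1e-12)
    return np.argmax(mapped @ Yn.T, axis=1)

# Synthetic stand-in for two monolingual embedding spaces: the "target"
# space is an unknown rotation of the "source" space (an assumption made
# purely for illustration).
rng = np.random.default_rng(0)
d, n = 4, 200
X = rng.normal(size=(n, d)) + np.array([1.0, -0.5, 0.25, 0.0])
R, _ = np.linalg.qr(rng.normal(size=(d, d)))
Y = X @ R

G = np.eye(d)
w = np.zeros(d)
b = 0.0
for _ in range(100):
    # Batches are sampled independently: no cross-lingual pairing is used.
    xb = X[rng.choice(n, 32, replace=False)]
    yb = Y[rng.choice(n, 32, replace=False)]
    G, w, b = adversarial_step(G, w, b, xb, yb)

pred = induce_lexicon(G, X, Y)
```

The two batches are drawn independently in every step, which is what makes the setup unsupervised: the only training signal is whether the discriminator can separate the two sets of vectors. This toy loop is not expected to recover the rotation reliably; the abstract itself notes that additional techniques are crucial to successful training.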

PUBLICATION RECORD

Publication year: 2017
Venue: Annual Meeting of the Association for Computational Linguistics
Publication date: 2017-07-01
Fields of study: Linguistics, Computer Science
Identifiers: DOI 10.18653/v1/P17-1179
Source metadata: Semantic Scholar

REFERENCES

Generative Adversarial Nets
2018, influential reference
Offline bilingual word vectors, orthogonal transformations and the inverted softmax
2017, cited by this paper
Bilingual Lexicon Induction from Non-Parallel Data with Minimal Supervision
2017, cited by this paper
Towards Principled Methods for Training Generative Adversarial Networks
2017, cited by this paper
f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization
2016, cited by this paper
Ten Pairs to Tag – Multilingual POS Tagging via Coarse Mapping between Embeddings
2016, cited by this paper
Learning principled bilingual mappings of word embeddings while preserving monolingual invariance
2016, cited by this paper
Massively Multilingual Word Embeddings
2016, cited by this paper
Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders
2016, cited by this paper
Improved Techniques for Training GANs
2016, cited by this paper
Cross-Lingual Word Representations via Spectral Graph Embeddings
2016, cited by this paper
Inducing Bilingual Lexica From Non-Parallel Data With Earth Mover’s Distance Regularization
2016, influential reference
Cross-lingual Models of Word Embeddings: An Empirical Comparison
2016, cited by this paper
A Distribution-based Model to Learn Bilingual Word Embeddings
2016, influential reference
Dual Learning for Machine Translation
2016, cited by this paper
Amortised MAP Inference for Image Super-resolution
2016, cited by this paper
Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
2016, cited by this paper
Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections
2016, cited by this paper
Learning Crosslingual Word Embeddings without Bilingual Corpora
2016, cited by this paper
Learning in Implicit Generative Models
2016, cited by this paper
On the Role of Seed Lexicons in Learning Bilingual Word Embeddings
2016, influential reference
Minimally-Constrained Multilingual Embeddings via Artificial Code-Switching
2016, cited by this paper
LORELEI Language Packs: Data, Tools, and Resources for Technology Development in Low Resource Languages
2016, cited by this paper
Adversarial Autoencoders
2015, cited by this paper
Deep Multilingual Correlation for Improved Word Embeddings
2015, cited by this paper
Unifying Bayesian Inference and Vector Space Models for Improved Decipherment
2015, cited by this paper
Simple task-specific bilingual word embeddings
2015, cited by this paper
Trans-gram, Fast Cross-lingual Word-embeddings
2015, cited by this paper
Domain-Adversarial Training of Neural Networks
2015, cited by this paper
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
2015, cited by this paper
Hubness and Pollution: Delving into Cross-Space Mapping for Zero-Shot Learning
2015, cited by this paper
Bilingual Word Embeddings from Non-Parallel Document-Aligned Data Applied to Bilingual Lexicon Induction
2015, cited by this paper
Bilingual Word Representations with Monolingual Quality in Mind
2015, cited by this paper
On the universal structure of human lexical semantics
2015, cited by this paper
Learning Cross-lingual Word Embeddings via Matrix Co-factorization
2015, cited by this paper
Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation
2015, cited by this paper
Adam: A Method for Stochastic Optimization
2014, cited by this paper
An Autoencoder Approach to Learning Bilingual Word Representations
2014, cited by this paper
Dropout: a simple way to prevent neural networks from overfitting
2014, cited by this paper
Improving zero-shot learning by mitigating the hubness problem
2014, cited by this paper
BilBOWA: Fast Bilingual Distributed Representations without Word Alignments
2014, cited by this paper
Learning Bilingual Word Representations by Marginalizing Alignments
2014, cited by this paper
Improving Vector Space Word Representations Using Multilingual Correlation
2014, cited by this paper
Distributed Representations of Words and Phrases and their Compositionality
2013, influential reference
Dropout Training as Adaptive Regularization
2013, cited by this paper
Dependency-Based Decipherment for Resource-Limited Machine Translation
2013, cited by this paper
Bilingual Word Embeddings for Phrase-Based Machine Translation
2013, cited by this paper
Multilingual Distributed Representations without Word Alignment
2013, cited by this paper
Combining Bilingual and Comparable Corpora for Low Resource Machine Translation
2013, cited by this paper
Learning with Marginalized Corrupted Features
2013, cited by this paper
Exploiting Similarities among Languages for Machine Translation
2013, influential reference
Cross-Lingual Semantic Similarity of Words as the Similarity of Their Semantic Word Responses
2013, influential reference
Large Scale Decipherment for Out-of-Domain Machine Translation
2012, cited by this paper
Identifying Word Translations from Comparable Corpora Using Latent Topic Models
2011, cited by this paper
Domain Adaptation for Machine Translation by Mining Unseen Words
2011, cited by this paper
Learning Bilingual Lexicons from Monolingual Corpora
2008, cited by this paper
Mining Very-Non-Parallel Corpora: Parallel Sentence and Lexicon Extraction via Bootstrapping and E
2004, cited by this paper
A Geometric View on Bilingual Lexicon Extraction from Comparable Corpora
2004, cited by this paper
Learning a Translation Lexicon from Monolingual Corpora
2002, cited by this paper
Automatic Identification of Word Translations from Unrelated English and German Corpora
1999, cited by this paper

CITED BY

Cross-Lingual Representation Alignment Through Contrastive Image-Caption Tuning
2025, cites this paper
AZIM: Arabic-Centric Zero-Shot Inference for Multilingual Topic Modeling With Enhanced Performance on Summarized Text
2025, cites this paper
mini-vec2vec: Scaling Universal Geometry Alignment with Linear Transformations
2025, cites this paper
Word-level Cross-lingual Structure in Large Language Models
2025, cites this paper
Reshaping Word Embedding Space With Monolingual Synonyms for Bilingual Lexicon Induction
2025, cites this paper
Lost in Alignment: A Survey on Cross-Lingual Alignment Methods for Contextualized Representation
2025, cites this paper
Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs
2025, cites this paper
A domain-specific cross-lingual semantic alignment learning model for low-resource languages
2025, cites this paper
Aligning Embeddings and Geometric Random Graphs: Informational Results and Computational Approaches for the Procrustes-Wasserstein Problem
2024, cites this paper
LinguaLIFT: An Effective Two-stage Instruction Tuning Framework for Low-Resource Language Tasks
2024, cites this paper
SeNSe: embedding alignment via semantic anchors selection
2024, cites this paper
Alignment of Multilingual Embeddings to Estimate Job Similarities in Online Labour Market
2024, cites this paper
DM-BLI: Dynamic Multiple Subspaces Alignment for Unsupervised Bilingual Lexicon Induction
2024, cites this paper
The Shape of Word Embeddings: Quantifying Non-Isometry with Topological Data Analysis
2024, cites this paper
A survey of neural-network-based methods utilising comparable data for finding translation equivalents
2024, influential citation
Cross-Network Embeddings Transfer for Traffic Analysis
2024, cites this paper
Decipherment-Aware Multilingual Learning in Jointly Trained Language Models
2024, cites this paper
Toward Cross-Lingual Social Event Detection with Hybrid Knowledge Distillation
2024, cites this paper
TLS-WGAN-GP: A Generative Adversarial Network Model for Data-Driven Fault Root Cause Location
2023, cites this paper
LAPCA: Language-Agnostic Pretraining with Cross-Lingual Alignment
2023, cites this paper
Listen, Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition
2023, cites this paper
ADMit: Improving NER in automotive domain with domain adversarial training and multi-task learning
2023, cites this paper
Accessing Higher Dimensions for Unsupervised Word Translation
2023, cites this paper
ProMap: Effective Bilingual Lexicon Induction via Language Model Prompting
2023, cites this paper
Neural Machine Translation: A Survey of Methods used for Low Resource Languages
2023, cites this paper
Enabling Unsupervised Neural Machine Translation with Word-level Visual Representations
2023, cites this paper
An approach to lexicon filtering for author profiling
2023, cites this paper
DA-DAN: A Dual Adversarial Domain Adaption Network for Unsupervised Non-overlapping Cross-domain Recommendation
2023, cites this paper
Bilingual word embedding fusion for robust unsupervised bilingual lexicon induction
2023, cites this paper
An Unsupervised Approach to Bilingual Lexicon Induction with Comparable Corpora
2023, cites this paper
Few-shot Named Entity Recognition: Definition, Taxonomy and Research Directions
2023, cites this paper
A Novel Unsupervised Approach for Cross-Lingual Word Alignment in Low Isomorphic Embedding Spaces
2023, cites this paper
Transformer: A General Framework from Machine Translation to Others
2023, cites this paper
MUSEDA: Multilingual Unsupervised and Supervised Embedding for Domain Adaption
2023, cites this paper
Investigating Unsupervised Neural Machine Translation for Low-resource Language Pair English-Mizo via Lexically Enhanced Pre-trained Language Models
2023, cites this paper
Leveraging Context Patterns for Medical Entity Classification
2023, cites this paper
English-Manipuri Cross-Lingual Embedding: A Preliminary Study
2023, influential citation
Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings
2023, cites this paper
Learning bilingual word embedding for automatic text summarization in low resource language
2023, cites this paper
Mining parallel sentences from internet with multi-view knowledge distillation for low-resource language pairs
2023, cites this paper
GRI: Graph-based Relative Isomorphism of Word Embedding Spaces
2023, cites this paper
Multi-Stage Framework with Refinement Based Point Set Registration for Unsupervised Bi-Lingual Word Alignment
2022, cites this paper
Cross-lingual Feature Extraction from Monolingual Corpora for Low-resource Unsupervised Bilingual Lexicon Induction
2022, cites this paper
Unsupervised Alignment of Distributional Word Embeddings
2022, cites this paper
Deep Multilabel Multilingual Document Learning for Cross-Lingual Document Retrieval
2022, cites this paper
Topic-Based Unsupervised and Supervised Dictionary Induction
2022, cites this paper
Domain Mismatch Doesn’t Always Prevent Cross-lingual Transfer Learning
2022, cites this paper
Aligning Word Vectors on Low-Resource Languages with Wiktionary
2022, cites this paper
English-Malay Cross-Lingual Embedding Alignment using Bilingual Lexicon Augmentation
2022, cites this paper
Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment
2022, cites this paper
Robust Unsupervised Cross-Lingual Word Embedding using Domain Flow Interpolation
2022, cites this paper
Towards Unsupervised Speech-to-Speech Translation
2022, cites this paper
A cross-lingual sentence pair interaction feature capture model based on pseudo-corpus and multilingual embedding
2022, cites this paper
Sub-Word Alignment is Still Useful: A Vest-Pocket Method for Enhancing Low-Resource Machine Translation
2022, cites this paper
Robust Question Answering on Out-of-Domain Data
2022, cites this paper
Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation
2022, cites this paper
Don’t Forget Cheap Training Signals Before Building Unsupervised Bilingual Word Embeddings
2022, cites this paper
Cross-Lingual Transfer with Class-Weighted Language-Invariant Representations
2022, cites this paper
Unsupervised Parallel Sentences of Machine Translation for Asian Language Pairs
2022, cites this paper
Low-resource Neural Machine Translation: Methods and Trends
2022, cites this paper
Sub-word based unsupervised bilingual dictionary induction for Chinese-Uyghur
2022, cites this paper
Learning Bilingual Word Embedding Mappings with Similar Words in Related Languages Using GAN
2022, cites this paper
Understanding models understanding language
2022, cites this paper
Inducing Generalizable and Interpretable Lexica
2022, cites this paper
QA Domain Adaptation using Data Augmentation and Contrastive Adaptation
2021, cites this paper
Bilingual Terminology Extraction from Comparable E-Commerce Corpora
2021, cites this paper
Improving bilingual word embeddings mapping with monolingual context information
2021, influential citation
Unsupervised Neural Machine Translation with Universal Grammar
2021, cites this paper
Adversarial Domain Adaptation for Cross-lingual Information Retrieval with Multilingual BERT
2021, cites this paper
English–Welsh Cross-Lingual Embeddings
2021, cites this paper
Embedding Semantic Anchors to Guide Topic Models on Short Text Corpora
2021, cites this paper
Fully unsupervised word translation from cross-lingual word embeddings especially for healthcare professionals
2021, cites this paper
Exploring Cross-Lingual Transfer Learning with Unsupervised Machine Translation
2021, cites this paper
Cross-Lingual BERT Contextual Embedding Space Mapping with Isotropic and Isometric Conditions
2021, cites this paper
Multi-Granularity Contrasting for Cross-Lingual Pre-Training
2021, cites this paper
Combining Static Word Embeddings and Contextual Representations for Bilingual Lexicon Induction
2021, cites this paper
AdvPicker: Effectively Leveraging Unlabeled Data via Adversarial Discriminator for Cross-Lingual NER
2021, cites this paper
Exploiting Language Relatedness for Low Web-Resource Language Model Adaptation: An Indic Languages Study
2021, cites this paper
Does Robustness Improve Fairness? Approaching Fairness with Word Substitution Robustness Methods for Text Classification
2021, cites this paper
Improving Zero-Shot Cross-lingual Transfer for Multilingual Question Answering over Knowledge Graph
2021, cites this paper
An exploration of semi-supervised and language-adversarial transfer learning using hybrid acoustic model for hindi speech recognition
2021, cites this paper
Word Embedding Transformation for Robust Unsupervised Bilingual Lexicon Induction
2021, cites this paper
Bilingual alignment transfers to multilingual alignment for unsupervised parallel text mining
2021, cites this paper
Filtered Inner Product Projection for Crosslingual Embedding Alignment
2021, influential citation
A Survey on Low-Resource Neural Machine Translation
2021, cites this paper
Cross-lingual Transferring of Pre-trained Contextualized Language Models
2021, cites this paper
Bilingual Terminology Extraction from Non-Parallel E-Commerce Corpora
2021, cites this paper
Adversarial training with Wasserstein distance for learning cross-lingual word embeddings
2021, influential citation
Transferring Knowledge Distillation for Multilingual Social Event Detection
2021, cites this paper
Towards mining bilingual lexicons and parallel phrases from large-scale monolingual corpora
2021, cites this paper
A word embedding-based approach to cross-lingual topic modeling
2021, cites this paper
Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment
2021, cites this paper
Commonsense Knowledge Augmentation for Low-Resource Languages via Adversarial Learning
2021, cites this paper
Do not neglect related languages: The case of low-resource Occitan cross-lingual word embeddings
2021, cites this paper
El Volumen Louder Por Favor: Code-switching in Task-oriented Semantic Parsing
2021, cites this paper
Efficient bilingual lexicon extraction from comparable corpora based on formal concepts analysis
2021, influential citation
Empirical Regularization for Synthetic Sentence Pairs in Unsupervised Neural Machine Translation
2021, cites this paper
Filtered Inner Product Projection for Multilingual Embedding Alignment
2020, cites this paper
Cross-lingual Sentiment Analysis via AAE and BiGRU
2020, cites this paper
Attend, Translate and Summarize: An Efficient Method for Neural Cross-Lingual Summarization
2020, cites this paper