On Measuring and Mitigating Biased Inferences of Word Embeddings
Sunipa Dev, Tao Li, J. M. Phillips, Vivek Srikumar
Published 2019 in AAAI Conference on Artificial Intelligence

ABSTRACT
Word embeddings carry stereotypical connotations from the text they are trained on, which can lead to invalid inferences in downstream models that rely on them. We use this observation to design a mechanism for measuring stereotypes using the task of natural language inference. We demonstrate a reduction in invalid inferences via bias mitigation strategies on static word embeddings (GloVe). Further, we show that for gender bias, these techniques extend to contextualized embeddings when applied selectively only to the static components of contextualized embeddings (ELMo, BERT).
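The bias mitigation the abstract alludes to operates on the embedding vectors themselves. A common family of such strategies removes the component of each word vector along a learned bias direction (e.g., a gender direction). The sketch below is illustrative only, not the paper's exact method: the `debias` function and the toy 3-d vectors are assumptions for demonstration, not real GloVe embeddings.

```python
import numpy as np

def debias(embedding: np.ndarray, bias_dir: np.ndarray) -> np.ndarray:
    """Return `embedding` with its component along `bias_dir` projected out.

    This is the generic linear-projection debiasing idea; the specific
    bias direction would in practice be estimated from word pairs such as
    (he, she).
    """
    v = bias_dir / np.linalg.norm(bias_dir)  # unit bias direction
    return embedding - np.dot(embedding, v) * v

# Toy example with made-up 3-d vectors (hypothetical, not real GloVe data).
bias_dir = np.array([1.0, 0.0, 0.0])   # assumed gender direction
w = np.array([0.5, 0.2, -0.1])         # assumed word embedding
w_debiased = debias(w, bias_dir)
```

After projection, the debiased vector is orthogonal to the bias direction, so a downstream model can no longer pick up that component when making inferences.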
PUBLICATION RECORD
- Publication year
2019
- Venue
AAAI Conference on Artificial Intelligence
- Publication date
2019-08-25
- Fields of study
Computer Science
- Source metadata
Semantic Scholar