Publication · Preprint · Conference object · 2019

Measuring Bias in Contextualized Word Representations

Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W. Black, Yulia Tsvetkov
Open Access · English
  • Published: 01 Jan 2019
Contextual word embeddings such as BERT have achieved state-of-the-art performance in numerous NLP tasks. Because they are optimized to capture the statistical properties of their training data, they tend to pick up on, and amplify, social stereotypes present in that data as well. In this study, we (1) propose a template-based method to quantify bias in BERT; (2) show that this method obtains more consistent results in capturing social biases than the traditional cosine-based method; and (3) conduct a case study evaluating gender bias in the downstream task of gender pronoun resolution. Although our case study focuses on gender bias, the proposed technique is generalizable ...
Free-text keywords: Computer Science - Computation and Language, Natural language processing, Computer science, Training set, Artificial intelligence, Pronoun resolution, Gender bias
Funded by
NSF| NSF-BSF: RI: Small: Collaborative Research: Modeling Crosslinguistic Influences Between Language Varieties
  • Funder: National Science Foundation (NSF)
  • Project Code: 1812327
  • Funding stream: Directorate for Computer & Information Science & Engineering | Division of Information and Intelligent Systems