Bias in Automated Speaker Recognition

Publication type: Article, Preprint, Conference object
Date: 20 Jun 2022
Embargo end date: 01 Jan 2022
Country: Netherlands
Publisher: ACM
Journal: 2022 ACM Conference on Fairness Accountability and Transparency
Funded by: EC | SPATIAL, EC | TAILOR

Authors: Wiebke Toussaint Hutiri; Aaron Yi Ding;

doi: 10.1145/3531146.3533089 , 10.48550/arxiv.2201.09486

arXiv: 2201.09486

Abstract

Automated speaker recognition uses data processing to identify speakers by their voice. Today, automated speaker recognition is deployed on billions of smart devices and in services such as call centres. Despite its wide-scale deployment and known sources of bias in related domains like face recognition and natural language processing, bias in automated speaker recognition has not been studied systematically. We present an in-depth empirical and analytical study of bias in the machine learning development workflow of speaker verification, a voice biometric and core task in automated speaker recognition. Drawing on an established framework for understanding sources of harm in machine learning, we show that bias exists at every development stage in the well-known VoxCeleb Speaker Recognition Challenge, including data generation, model building, and implementation. Most affected are female speakers and non-US nationalities, who experience significant performance degradation. Leveraging the insights from our findings, we make practical recommendations for mitigating bias in automated speaker recognition, and outline future research directions.

Country

Netherlands

Related Organizations

Delft University of Technology, Netherlands

Keywords

FOS: Computer and information sciences; Computer Science - Machine Learning; Sound (cs.SD); bias; evaluation; Computer Science - Computation and Language; speaker recognition; fairness; audit; Computer Science - Sound; Machine Learning (cs.LG); Computer Science - Computers and Society; Audio and Speech Processing (eess.AS); Computers and Society (cs.CY); FOS: Electrical engineering, electronic engineering, information engineering; speaker verification; Computation and Language (cs.CL); Electrical Engineering and Systems Science - Audio and Speech Processing

Impact by BIP!

- selected citations: 36. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 1%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.