
Language models deployed in high-impact sectors like healthcare, education, and law risk perpetuating discrimination against marginalized groups. Existing efforts to reduce biases in models struggle to remove complex prejudices beyond the surface level. Here, we propose a novel framework for evaluating how language models encode and represent gender, rooted in decades of sociological research on gender and language. We identify three key requirements: avoiding essentialism, ensuring meaningful embeddings for all gender identities, and eliminating harmful stereotypes. Testing these requirements on multiple prominent language models, we reveal persistent patterns of gender essentialism, inadequate representations of nonbinary and transgender identities, and harmful pathologizing stereotypes. These findings highlight the need for critical engagement with concepts like gender when auditing language models for their representations. Addressing these issues is crucial to preventing harmful outcomes in high-risk applications, such as biased medical diagnoses, misinformed educational assessments, and weakened legal protections for marginalized communities.
