The bias is in the eye of the beholder: an epistemological reframing of LLM fairness research

Characterizing a large language model (LLM) from its outputs raises a methodological problem that remains insufficiently examined. LLMs are stateless systems. They do not maintain identity across interactions and retain no memory of prior decisions. There is therefore no stable bearer on which intrinsic properties could reside. Outputs vary with prompt wording, model version, deployment context, and decoding parameters. They may also vary across runs under identical conditions. Part of the current literature does not account for this architecture. Studies often attribute racism, political bias, deceptive intent, or moral competence to LLMs by measuring output distributions under specific prompts. The conclusions are then extended to the system itself. This collapses two distinct levels of analysis. Conditional output behaviour is observable. System-level properties require demonstrating stability across the interaction space. Most evaluations do not establish this condition. The result is an attribution error with implications for research, regulation, and governance. We propose a change in the evaluation question. Instead of asking what an LLM is, the analysis should focus on what a given output does. This requires specifying the input conditions, the deployment context, and the observable effect of the response. Outputs are conditioned on inputs constructed by human actors. They therefore do not reveal intrinsic system properties. What they describe is a specific human-machine interaction. Responsibility should be located at that level.

Related Organizations

University of Milan
Italy
Fondazione IRCCS Istituto Nazionale dei Tumori
Italy

Impact by BIP!

Selected citations (derived from selected sources): 0
Popularity (current attention, based on the underlying citation network): Average
Influence (overall impact, based on the underlying citation network): Average
Impulse (initial momentum directly after publication, based on the underlying citation network): Average
