
# The Prototype Fairness Illusion: Why Prototype-Based Fair Representations Fail on Image Data

This repository contains the research paper studying the behavior of prototype-based fairness methods when applied to high-dimensional visual data.

## Abstract

Prototype-based fair representation learning, introduced by Zemel et al. (2013) through Learning Fair Representations (LFR), has demonstrated strong fairness guarantees on tabular data by enforcing statistical parity in prototype assignments. In this work, we systematically study what happens when this framework is extended to high-dimensional image data. We make three primary contributions:

1. **Geometric analysis of vanilla LFR on images.** We provide both a theoretical argument and empirical evidence showing that vanilla LFR fails fundamentally on image data. Euclidean distance in pixel space is not semantically meaningful for faces, causing prototype assignments to become nearly uniform. As a result, the fairness constraint L_z approaches zero without actually removing sensitive-attribute information from the learned representation.

2. **Deep Semantic LFR (DS-LFR).** We propose an extension that moves the prototypes into a learned semantic convolutional latent space and replaces the pixel-wise reconstruction loss with a VGG-based perceptual loss. This modification substantially improves classification performance, increasing accuracy from 51.95% to 92.08% on the CelebA benchmark.

3. **Fairness Activation Threshold.** We identify a previously unreported optimization phenomenon. During early training (epochs 1–14), the model exhibits complete collapse, with L_z = 0 and L_y = 0.693 (random guessing). At epoch 15, a sharp phase transition occurs in which the classification and fairness objectives activate simultaneously.

Despite the architectural improvements, sensitive attribute accuracy (sAcc) remains approximately 0.926 across all hyperparameter configurations.
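The geometric failure mode of vanilla LFR can be reproduced in a few lines. The sketch below is not the paper's code: the prototype count, the group shift, and the normalization of distances by the number of coordinates are illustrative assumptions. It soft-assigns random "images" to random prototypes via a softmax over negative mean squared distances. In low dimension the assignments are peaked and the group-parity gap is large; at a CelebA-like pixel dimensionality (64×64×3), distance concentration makes the assignments nearly uniform, driving the L_z-style parity gap toward zero even though the sensitive attribute is still plainly encoded in the pixels.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_assign(X, protos):
    # M_k(x) = softmax_k(-mean squared distance to prototype v_k);
    # averaging over coordinates is an illustrative normalization choice.
    d2 = ((X ** 2).sum(1, keepdims=True)
          + (protos ** 2).sum(1)
          - 2.0 * X @ protos.T) / X.shape[1]
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def parity_gap(M, s):
    # L_z-style term: L1 distance between group-average prototype assignments.
    return np.abs(M[s == 0].mean(0) - M[s == 1].mean(0)).sum()

K, n = 8, 400
results = {}
for D in (10, 64 * 64 * 3):  # toy dimension vs. CelebA-like pixel dimension
    protos = rng.standard_normal((K, D))
    s = rng.integers(0, 2, size=n)       # sensitive attribute
    X = rng.standard_normal((n, D))
    X[s == 1, :5] += 2.0                 # attribute clearly encoded in the pixels
    M = soft_assign(X, protos)
    results[D] = (M.max(axis=1).mean(), parity_gap(M, s))
    print(f"D={D:6d}  mean max assignment={results[D][0]:.3f}  "
          f"parity gap={results[D][1]:.4f}")
```

At image dimensionality the near-uniform assignments let the parity penalty be satisfied essentially for free, which is consistent with the observation above that L_z approaches zero without removing sensitive-attribute information.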
This demonstrates that statistical parity in prototype assignments is fundamentally weaker than representation-level disentanglement. These findings suggest that the L_z objective, regardless of weighting, cannot remove protected-attribute information from useful visual representations. Instead, adversarial disentanglement mechanisms are likely required to achieve true representation-level fairness in image models.

**Dataset:** CelebA (Liu et al., 2015)
**Code and experiments:** https://www.kaggle.com/code/proprak01/prototype-fairness-main-image-data
**DOI:** https://doi.org/10.5281/zenodo.19016833
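The gap between assignment-level parity and representation-level disentanglement can also be illustrated directly. In the sketch below (the dimensions, the synthetic group shift, and the nearest-centroid probe are all illustrative stand-ins, not the paper's sAcc classifier), a representation yields a near-zero parity gap over prototype assignments while a trivial probe still recovers the sensitive attribute almost perfectly:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical representation: the sensitive attribute s shifts a few
# coordinates, mimicking a useful visual embedding that still encodes s.
D, n, K = 64 * 64 * 3, 600, 8
s = rng.integers(0, 2, size=n)
Z = rng.standard_normal((n, D))
Z[s == 1, :10] += 3.0

# Prototype assignments: softmax over negative mean squared distances.
protos = rng.standard_normal((K, D))
d2 = ((Z ** 2).sum(1, keepdims=True) + (protos ** 2).sum(1)
      - 2.0 * Z @ protos.T) / D
logits = -d2
logits -= logits.max(axis=1, keepdims=True)
M = np.exp(logits)
M /= M.sum(axis=1, keepdims=True)
gap = np.abs(M[s == 0].mean(0) - M[s == 1].mean(0)).sum()  # L_z-style parity gap

# Nearest-centroid probe: can s be read off the representation anyway?
tr, te = slice(0, n // 2), slice(n // 2, None)
mu0 = Z[tr][s[tr] == 0].mean(0)
mu1 = Z[tr][s[tr] == 1].mean(0)
closer1 = ((Z[te] - mu1) ** 2).sum(1) < ((Z[te] - mu0) ** 2).sum(1)
s_acc = (closer1.astype(int) == s[te]).mean()
print(f"parity gap={gap:.4f}  probe sAcc={s_acc:.3f}")
```

Parity over prototype assignments constrains only K group-averaged statistics, whereas a probe can exploit any direction of the representation; this is consistent with the finding that sAcc stays near 0.926 regardless of the L_z weight.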
