Hinglish Gender and Racial Bias Mitigation Dataset

This dataset contains 2,031 manually annotated Hinglish prompt-response instances developed for research on gender and racial bias mitigation in Large Language Models (LLMs). Each record consists of an input prompt, a biased response, a corresponding neutralized response, and a bias category label (gender or racial). The authors constructed and annotated the dataset by hand to support research on bias detection, fairness evaluation, and bias mitigation in code-mixed Hinglish settings. The corpus is intended for academic research in responsible AI, hate speech analysis, and fair language generation.

Total Records: 2,031
Language: Hinglish (Hindi-English Code-Mixed)
Categories: Gender Bias, Racial Bias
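The record layout described above (prompt, biased response, neutralized response, category label) could be represented and sanity-checked roughly as follows. This is a minimal sketch, not the authors' tooling: the field names, the label strings "gender" and "racial", and the helper category_counts are all assumptions made for illustration, and the actual column names and label values in the released files may differ.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical label strings; the dataset's real label values may differ.
ALLOWED_CATEGORIES = {"gender", "racial"}


@dataclass(frozen=True)
class BiasRecord:
    """One instance; field names are assumed, not taken from the dataset."""

    prompt: str
    biased_response: str
    neutralized_response: str
    bias_category: str

    def __post_init__(self):
        # Reject labels outside the two categories the dataset describes.
        if self.bias_category not in ALLOWED_CATEGORIES:
            raise ValueError(f"unexpected bias category: {self.bias_category!r}")
        # Every text field should carry some content.
        for name in ("prompt", "biased_response", "neutralized_response"):
            if not getattr(self, name).strip():
                raise ValueError(f"empty field: {name}")


def category_counts(records):
    """Count records per bias category (hypothetical helper)."""
    return Counter(r.bias_category for r in records)


if __name__ == "__main__":
    # Placeholder strings stand in for real Hinglish text.
    sample = [
        BiasRecord("<prompt 1>", "<biased 1>", "<neutralized 1>", "gender"),
        BiasRecord("<prompt 2>", "<biased 2>", "<neutralized 2>", "racial"),
    ]
    print(category_counts(sample))
```

After loading the full corpus into such records, the per-category counts should sum to the stated total of 2,031; a mismatch would point to dropped or malformed rows.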
