
doi: 10.21203/rs.3.rs-3499674/v1, 10.1038/s41598-024-55686-2, 10.2139/ssrn.4574226, 10.48550/arxiv.2309.09825
pmid: 38433238
pmc: PMC10909834
arXiv: 2309.09825
Abstract: Large language models (LLMs) have the potential to transform our lives and work through the content they generate, known as AI-Generated Content (AIGC). To harness this transformation, we need to understand the limitations of LLMs. Here, we investigate the bias of AIGC produced by seven representative LLMs, including ChatGPT and LLaMA. We collect news articles from The New York Times and Reuters, both known for their dedication to providing unbiased news. We then apply each examined LLM to generate news content, using the headlines of these news articles as prompts, and evaluate the gender and racial biases of the resulting AIGC by comparing it with the original news articles. We further analyze the gender bias of each LLM under biased prompts by adding gender-biased messages to prompts constructed from these news headlines. Our study reveals that the AIGC produced by each examined LLM demonstrates substantial gender and racial biases. Moreover, the AIGC generated by each LLM exhibits notable discrimination against females and individuals of the Black race. Among the LLMs, the AIGC generated by ChatGPT demonstrates the lowest level of bias, and ChatGPT is the sole model capable of declining content generation when provided with biased prompts.
Male, FOS: Computer and information sciences, 330, 070, Computer Science - Artificial Intelligence, Science, Sexism, Aortic Valve Insufficiency, Article, Gender bias, Bias, Large language model (LLM), Humans, Animals, AI-generated content (AIGC), racial bias, prompt, Language, Q, R, ChatGPT, Artificial Intelligence (cs.AI), Generative AI, Medicine, Female, Camelids, New World
| selected citations | 110 | Citations derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). |
| popularity | Top 1% | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. |
| influence | Top 10% | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). |
| impulse | Top 0.1% | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. |
