On the Truthfulness of Surprisingly Likely Responses of Large Language Models

Name: On the Truthfulness of Surprisingly Likely Responses of Large Language Models
Creator: Naman Goel
Keywords: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Science - Computer Science and Game Theory, Computation and Language (cs.CL), Machine Learning (cs.LG), Computer Science and Game Theory (cs.GT)

arXiv.org e-Print Archive

Preprint . 2023

Data sources: arXiv.org e-Print Archive

https://doi.org/10.1145/371592...

Article . 2025 . Peer-reviewed

Data sources: Crossref

https://dx.doi.org/10.48550/ar...

Article . 2023

License: CC BY

Data sources: Datacite

Publication: Article, Preprint
Date: 03 Aug 2025
Embargo end date: 01 Jan 2023
Publisher: ACM
Journal: Proceedings of the ACM Collective Intelligence Conference

Authors: Naman Goel

doi: 10.1145/3715928.3737471, 10.48550/arxiv.2311.07692

arXiv: 2311.07692

Abstract

The principle of rewarding a crowd for surprisingly common answers has been used in the literature to design a number of truthful information elicitation mechanisms. A related method has also been proposed for better aggregation of crowd wisdom. Drawing a comparison between crowd-based collective intelligence systems and large language models, we define the notion of a 'surprisingly likely' textual response of a large language model. This notion is inspired by the surprisingly common principle, but tailored for text in a language model. Using benchmarks such as TruthfulQA and the openly available LLMs GPT-2 and LLaMA-2, we show that the surprisingly likely textual responses of large language models are more accurate in many cases than standard baselines. For example, we observe an aggregate improvement of up to 24 percentage points on TruthfulQA and of up to 70 percentage points on individual categories of questions in this benchmark. We also provide further analysis of the results, including the cases in which surprisingly likely responses are less accurate or no more accurate.
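
The abstract does not state the scoring rule, so the following is only a minimal sketch of one plausible reading, by analogy with the surprisingly common principle: each candidate answer is scored by how much more probable the model finds it given the question than without the question. The function name `surprisingly_likely_choice`, the use of a question-free prior, and all numbers are hypothetical and are not taken from the paper.

```python
import math

def surprisingly_likely_choice(cond_logprob, prior_logprob):
    """Pick the candidate whose log-probability rises most once the
    question is given (assumed score: log P(a | q) - log P(a))."""
    scores = {a: cond_logprob[a] - prior_logprob[a] for a in cond_logprob}
    best = max(scores, key=scores.get)
    return best, scores

# Toy probabilities, invented for illustration only.
# "myth" is a common misconception: likely with or without the question.
# "fact" is rare in general text but much more likely given the question.
cond = {"myth": math.log(0.6), "fact": math.log(0.4)}   # log P(answer | question)
prior = {"myth": math.log(0.5), "fact": math.log(0.1)}  # log P(answer)

# Standard baseline: the most likely answer given the question.
baseline = max(cond, key=cond.get)

# Surprisingly likely answer under the assumed scoring rule.
choice, scores = surprisingly_likely_choice(cond, prior)

print("baseline:", baseline)
print("surprisingly likely:", choice)
```

In this toy setting the baseline selects the answer with the highest conditional probability, while the assumed ratio score favours the answer whose probability increases the most once the question is known. With a real model, the two log-probabilities would have to come from scoring each candidate with and without the question in the prompt.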

Impact by BIP!

	selected citations	0
	popularity	Average
	influence	Average
	impulse	Average

Green

Related to Research communities

UArctic