Taxonomy Construction of Factual Claims from Social Media

This dataset accompanies the paper "LLMTaxo: Leveraging Large Language Models for Constructing Taxonomy of Factual Claims from Social Media" (Findings of ACL 2025). It contains the curated data used in the paper's taxonomy construction experiments, which focus on factual claims extracted from social media discussions in three topic domains: COVID-19 vaccine, climate change, and cybersecurity. The dataset is intended to support research on taxonomy construction and factual claim analysis.

Contents

tweets.csv: The IDs of 384,676 tweets collected from X (formerly Twitter) for the three domains above. (Note: the Facebook data used in the paper are not included because of data-sharing restrictions and privacy policies.)

Taxonomies: Nine final taxonomies of factual claims, generated by three LLMs (Zephyr, GPT-4o mini, Gemini 2.0 Flash) across the three datasets. Each taxonomy has three hierarchical levels: broad, medium, and detailed topics.
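As a rough illustration of the three-level structure, the sketch below nests (broad, medium, detailed) topic triples into a dictionary. The column names `broad`, `medium`, and `detailed`, the CSV layout, and the placeholder labels are all assumptions made for this example; the actual taxonomy files may be organized differently.

```python
import csv
import io
from collections import defaultdict


def build_taxonomy(rows):
    """Nest (broad, medium, detailed) triples as broad -> medium -> [detailed]."""
    tree = defaultdict(lambda: defaultdict(set))
    for broad, medium, detailed in rows:
        tree[broad][medium].add(detailed)
    # Convert to plain dicts with sorted lists so the result is stable.
    return {
        broad: {medium: sorted(details) for medium, details in mediums.items()}
        for broad, mediums in tree.items()
    }


# Hypothetical sample rows; labels are placeholders, not topics from the dataset.
sample = io.StringIO(
    "broad,medium,detailed\n"
    "A,A1,A1a\n"
    "A,A1,A1b\n"
    "A,A2,A2a\n"
    "B,B1,B1a\n"
)

reader = csv.DictReader(sample)
taxonomy = build_taxonomy(
    (row["broad"], row["medium"], row["detailed"]) for row in reader
)
```

To try this on a real file, replace the in-memory sample with an opened file handle and adjust the column names to match the file's header.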

Related Organizations

The University of Texas at Arlington
United States

Keywords

Factual Claim, Social Media, Taxonomy

EOSC Subjects

Twitter Data

Related to Research communities

Corona Virus Disease