Spanish-PoliCorpus-2020: A Spanish Twitter Corpus for Author Profiling and Attribution

Spanish-PoliCorpus-2020 is a Spanish Twitter corpus designed for author profiling and author attribution tasks in the political domain. The dataset includes pseudonymised author identifiers and is annotated with multiple author-level traits, such as gender, age range, and ideological orientation. Independent data splits are provided for the two evaluation scenarios, author profiling and author attribution. This record provides a consolidated, FAIR-compliant public version of the dataset that contains tweet identifiers and author-level annotations and excludes any real user identifiers. Tweet text and additional derived representations are not included in the public release. The dataset was originally introduced in the associated scientific publication and is preserved here to ensure reproducibility and long-term reuse.

Related Organizations

University of Murcia, Spain

Keywords

political discourse, author profiling, author attribution

EOSC Subjects

Twitter Data
