Spanish Morality Corpus

Description of the files included in this dataset

This dataset contains the first publishable version of a corpus of Spanish-language online comments annotated for moral foundations, developed as part of the AMOR project. The annotations were carried out by trained participants using the Qualtrics platform and coordinated via Prolific. The comments were extracted from Spanish-speaking Reddit communities, filtered and manually curated to ensure linguistic quality, moral relevance, and geographical diversity.

The dataset comprises two main components:

- Corpus Files (.json and .jsonl formats): These files include the annotated texts, each accompanied by metadata such as subreddit, author, and comment thread information. Annotations capture moral foundations (e.g., care, loyalty, authority) and include additional information on polarity (virtue or vice) and the annotator’s confidence level (low, medium, high). Three versions are provided for each annotation task: including all confidence levels, high confidence only, and medium+high confidence.
- Annotator Profiles (.json and .jsonl formats): These files describe the demographic and moral profile of each annotator using the MFQ30 (Moral Foundations Questionnaire) adapted to Spanish. Identifiers are anonymized for privacy.

The dataset is designed for use in computational linguistics and social science research on moral language and value expression in Spanish-speaking online discourse.

AMOR-Corpus V2 – Format and Annotator Profiling Description

The AMOR-Corpus V2 is a curated, high-quality dataset of Spanish-language Reddit comments annotated for moral content. It was created as part of the AMOR project, focused on affective and moral reasoning in online discourse. The dataset is composed of two structured JSON-based files:

1. AMOR-Corpus_V2-high.jsonl – Corpus and Annotations

This file contains Reddit comments annotated for the presence of moral foundations, based on an expanded version of Moral Foundations Theory (MFT).
Each line in this JSON Lines file represents:

- id: A unique identifier for the comment.
- text: The original Reddit comment in Spanish.
- annotations: A list of five independent annotations per comment, each including:
  - The moral foundation(s) perceived (if any).
  - Subtypes (Virtue/Vice distinctions).
  - Annotator confidence level (Low, Medium, High).

Annotations were collected through the Qualtrics platform, and annotators were recruited via Prolific. Each task included 60 comments and was annotated by 5 different individuals. Annotators had the option to mark “None” if no moral content was identified.

Manual pre-filtering of texts ensured content was:

- Written in Spanish,
- Interpretable without deep context,
- Potentially moral in nature,
- Free of extreme offensive language.

2. AMOR-Corpus_V2-annotators.json – Annotator Metadata

This file contains anonymized demographic and psychometric profiles of all annotators. Each entry includes:

- ID: A unique anonymized user ID.
- genre, age, political orientation, income, religious: Self-reported demographic information.
- Responses to the Moral Foundations Questionnaire (MFQ), described below.

Moral Foundations Questionnaire (MFQ)

To assess how personal values may influence moral annotation behavior, each annotator completed the Moral Foundations Questionnaire (MFQ), a validated psychometric instrument based on MFT. The MFQ measures sensitivity across five foundational moral domains:

1. Care/Harm
2. Fairness/Cheating
3. Loyalty/Betrayal
4. Authority/Subversion
5. Purity/Degradation

Each foundation includes:

- Relevance items (e.g., “Whether or not someone suffered emotionally”), measuring importance,
- Judgment items (e.g., “Compassion for those who are suffering is the most crucial virtue”), measuring agreement.

Annotators responded using a 6-point Likert scale.
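The corpus file layout described above (one JSON object per line, with id, text, and a list of annotations) could be read along the following lines. This is only a minimal sketch: the description does not give the key names used inside each annotation, so "foundation" and "confidence" are hypothetical, and the sample record is made up for illustration rather than taken from the corpus.

```python
import json
from collections import Counter


def parse_jsonl(lines):
    """Parse an iterable of JSON Lines strings, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def majority_foundation(record, key="foundation"):
    """Return (label, votes) for the most frequent label among a comment's annotations.

    `key` is a hypothetical field name; adjust it to the real schema.
    """
    counts = Counter(a.get(key, "None") for a in record["annotations"])
    label, votes = counts.most_common(1)[0]
    return label, votes


# Made-up record for illustration only (not taken from the corpus).
sample = json.dumps({
    "id": "c1",
    "text": "Texto de ejemplo",
    "annotations": [
        {"foundation": "care", "confidence": "High"},
        {"foundation": "care", "confidence": "Medium"},
        {"foundation": "None", "confidence": "Low"},
        {"foundation": "care", "confidence": "High"},
        {"foundation": "loyalty", "confidence": "Medium"},
    ],
})

records = parse_jsonl([sample, ""])
print(majority_foundation(records[0]))
```

To apply the same functions to the published file, an open file handle can be passed instead of the list, for example parse_jsonl(open("AMOR-Corpus_V2-high.jsonl", encoding="utf-8")). Majority voting is just one possible way to aggregate the five annotations; the description does not prescribe an aggregation method.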
Item codes in the data are prefixed as follows:

- CA_# = Care
- EQ_# = Fairness (EQ = Equality)
- LO_# = Loyalty
- AU_# = Authority
- PU_# = Purity
- PR_# = Proportionality-related fairness

The inclusion of MFQ scores enables researchers to analyze how individual moral orientations influence perception and annotation of moral language.

More information on MFQ can be found at:

- MFQ1 (original): https://moralfoundations.org/questionnaires/
- MFQ2 (updated): https://yourmorals.org/
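The item-code prefixes listed above suggest one way to summarise an annotator profile: group the MFQ responses by prefix and average them per foundation. The sketch below assumes that responses are stored as numbers under keys such as "CA_1" and that a simple mean is an acceptable summary; neither the exact key format, the numeric coding of the Likert scale, nor any official scoring procedure is confirmed by the description, and the sample profile is invented.

```python
from collections import defaultdict

# Prefixes as listed in the dataset description.
PREFIX_TO_FOUNDATION = {
    "CA": "Care",
    "EQ": "Fairness (Equality)",
    "LO": "Loyalty",
    "AU": "Authority",
    "PU": "Purity",
    "PR": "Proportionality",
}


def foundation_means(profile):
    """Average numeric MFQ responses per foundation, based on the key prefix.

    Keys without a known prefix (e.g. demographic fields) are ignored.
    """
    grouped = defaultdict(list)
    for key, value in profile.items():
        prefix, sep, _ = key.partition("_")
        if sep and prefix in PREFIX_TO_FOUNDATION and isinstance(value, (int, float)):
            grouped[PREFIX_TO_FOUNDATION[prefix]].append(value)
    return {name: sum(vals) / len(vals) for name, vals in grouped.items()}


# Invented profile for illustration only; real entries may differ in shape.
profile = {"ID": "a01", "age": 30, "CA_1": 4, "CA_2": 5, "EQ_1": 3, "PR_1": 2}
print(foundation_means(profile))
```

Foundations with no items in a given profile are simply absent from the result, which makes missing data visible instead of reporting it as zero.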

Related Organizations

Universidad Politécnica de Madrid
Spain
