Research data . Dataset . 2021
Claim Detection and Matching for Indian Languages
Kazemi, Ashkan; Garimella, Kiran; Gaffney, Devin; Hale, Scott A.
Open Access Hindi
Published: 01 Jun 2021
Publisher: Zenodo
Abstract
This repository includes two datasets: one for claim matching and one for claim detection. Both contain data in five languages: Bengali, English, Hindi, Malayalam and Tamil. The "claim detection" dataset contains textual claims from social media and fact-checking websites, annotated for the "fact-check worthiness" of the claims in each message. Each data point has one of three labels: "Yes" (the text contains one or more check-worthy claims), "No" or "Probably". The "claim matching" dataset is a curated collection of pairs of textual claims from social media and fact-checking websites, intended for automatic, multilingual claim matching. Each pair has one of four labels: "Very Similar", "Somewhat Similar", "Somewhat Dissimilar" or "Very Dissimilar". All personally identifiable information (PII), including phone numbers, email addresses, license plate numbers and addresses, has been replaced with general tags (e.g. <PHONE#>, <ADDRESS>) to protect user anonymity. A detailed explanation of the curation and annotation process is provided in our ACL 2021 paper: Kazemi, A.; Garimella, K.; Gaffney, D.; and Hale, S. A. 2021. Claim Matching Beyond English to Scale Global Fact-Checking. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, ACL 2021.
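As a rough illustration of the label scheme and the PII tagging described above, the following Python sketch checks labels against the two label sets and masks phone-like strings with the <PHONE#> tag. The label strings and the <PHONE#> tag come from the abstract; the function names, the task keys and the phone regular expression are assumptions made for this example and are not the authors' anonymization procedure or the datasets' file schema.

```python
import re

# Label sets exactly as listed in the abstract.
CLAIM_DETECTION_LABELS = {"Yes", "No", "Probably"}
CLAIM_MATCHING_LABELS = {
    "Very Similar",
    "Somewhat Similar",
    "Somewhat Dissimilar",
    "Very Dissimilar",
}

# Hypothetical task keys, chosen for this sketch only.
LABELS_BY_TASK = {
    "detection": CLAIM_DETECTION_LABELS,
    "matching": CLAIM_MATCHING_LABELS,
}

# Assumed pattern: an optional "+", then at least ten digits that may be
# separated by spaces or hyphens. The real rules are not given in the abstract.
PHONE_RE = re.compile(r"\+?\d[\d\s-]{8,}\d")


def is_valid_label(label: str, task: str) -> bool:
    """Return True if `label` belongs to the label set for `task`."""
    return label in LABELS_BY_TASK[task]


def mask_phone_numbers(text: str) -> str:
    """Replace phone-like substrings with the <PHONE#> tag."""
    return PHONE_RE.sub("<PHONE#>", text)


if __name__ == "__main__":
    print(is_valid_label("Probably", "detection"))
    print(is_valid_label("Probably", "matching"))
    print(mask_phone_numbers("Call +91 98765 43210 now"))
```

A real anonymization pipeline would also need to cover email addresses, license plates and postal addresses, which a single regular expression like this one does not attempt.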
Subjects
nlp, fact-checking, misinformation, multilingual, claim matching, claim detection, whatsapp, tamil, malayalam, bengali