SoK: Machine Learning for Misinformation Detection

Annotations and replication materials for 'SoK: Machine Learning for Misinformation Detection'. We've included descriptions of the file contents below.

annotations_aec.tsv: Contains annotations for our full paper corpus, comprising 248 published works. We annotated these papers for target, dataset curation, model choice, feature selection, and evaluation.

paper_selection_criteria.txt: Our criteria for assembling the full and focus coding sets, adapted from pages 3, 5 ('Paper selection') and 6 of the manuscript.

replications.zip: Within this zip archive, you'll find three subfolders, each corresponding to one of the three replication analyses found on pages 11-13 of the manuscript. We've included the subsection header in the manuscript where each dataset / codebase is discussed:

articles (5.1): Includes original and modified Reuters and NYTimes texts and accompanying labels (these are new datasets that we introduced for the sake of robustness testing). Also includes the FA-KES and ISOT datasets and the classifier (new_RNN_CNN.py) used by the original study authors.

users (5.2): Includes troll and non-troll summary statistics, by account, with accompanying labels. Also includes the classifier used by the original study author.

sources (5.3): Includes the splits, classifier, and datasets used by the original author.

Notes on open-source availability for each codebase: The source-scoped replication code is freely available online. We received permission from the authors of the article-scoped study to open-source their code. We previously contacted the author of the user-scoped work (TrollMagnifier) and have not received a response. We are sharing their code here for the sake of artifact evaluation; open-source availability is pending an affirmative response from the author.
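
For readers who want to check the layout of replications.zip before extracting it, the following is a minimal sketch that groups archive members by their top-level folder. The helper name group_by_top_folder is hypothetical and not part of the released materials, and the sketch assumes the three subfolders (articles, users, sources) sit at the archive root; if the archive wraps them in an outer directory, that directory will appear as the single top-level key instead.

```python
import zipfile
from collections import defaultdict


def group_by_top_folder(archive):
    """Map each top-level folder in a zip archive to the files beneath it.

    `archive` may be a path or a binary file-like object.
    Hypothetical helper for illustration; not part of the released code.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            top, _, rest = name.partition("/")
            if rest:
                groups[top].append(rest)
            else:
                groups["."].append(top)  # file stored at the archive root
    return dict(groups)
```

Calling group_by_top_folder("replications.zip") on the downloaded archive should then return one key per subfolder, which can be compared against the descriptions above.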
