
The need to efficiently distinguish fake from real media grows as the ubiquity of generative AI and the presence of motivated bad actors make producing fake news ever easier. Researchers have estimated that in 2021, $2.6 billion of ad revenue could be attributed to misinformation publishing sites (Skibinski, 2021), providing ample motivation for such bad actors to fabricate stories. This paper seeks to create an effective machine learning solution that lets readers classify articles they want to read as fake or real, helping them consume only accurate news. As users tend to prefer simple solutions, we provide a parsimonious model that uses only 5 features yet still achieves 71% testing accuracy. Among the most effective predictors of a real article is “perceived effort”, as reflected in an article’s length, number of authors, and readability.
