The spread of fake news on social media platforms is an increasingly alarming problem, as fake news becomes more deceptive and harder to detect. Twitter in particular poses a significant threat: fake news spreads faster than real news on the platform, amplifying misinformation and leading to serious consequences. This project presents a machine learning-based approach for detecting fake news tweets on Twitter using the TruthSeeker 2023 dataset from the University of New Brunswick. As the largest ground-truth dataset for fake news detection on social media, it contains over 130,000 crowdsourced tweets, enabling the creation of a broader model that is more applicable to real-world scenarios. The algorithm employed in this study uses gradient-boosted decision trees (XGBoost) to classify tweets as fake or real news. The proposed model preprocesses the data by extracting additional features for each tweet, such as detailed sentiment analysis of both the tweet and the related news statement, as well as features pertaining to the author. These features are appended to the tweet's feature vector. The enhanced feature vectors are then fed into an XGBoost model, whose hyperparameters are tuned through grid search, to perform binary classification. The additional extracted features increase the robustness of the model by highlighting key differentiating factors between real and fake tweets. The results demonstrate the effectiveness of the proposed algorithm, which achieves an accuracy of 0.9335 on over 13,000 unseen tweets.
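The feature-augmentation and grid-search steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything named in it is an assumption: the abstract does not say which sentiment analyzer or author features were used, so the toy word lexicon, the `followers` and `verified` fields, and the function names are hypothetical stand-ins. The `grid_search` helper is a generic exhaustive search; in the setting described here, the scoring function would train and validate an XGBoost classifier, which is left out to keep the example self-contained.

```python
from itertools import product

# Hypothetical toy lexicon; a stand-in for the (unspecified) sentiment analyzer.
POSITIVE = {"good", "great", "true", "confirmed"}
NEGATIVE = {"bad", "fake", "hoax", "lie"}


def sentiment_score(text):
    """Return (positive - negative) word count divided by word count, in [-1, 1]."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def enhance_features(base_vector, tweet_text, statement_text, author):
    """Append sentiment and author features to an existing tweet feature vector."""
    extra = [
        sentiment_score(tweet_text),      # sentiment of the tweet itself
        sentiment_score(statement_text),  # sentiment of the related news statement
        float(author["followers"]),       # hypothetical author feature
        float(author["verified"]),        # hypothetical author feature (0.0 or 1.0)
    ]
    return list(base_vector) + extra


def grid_search(param_grid, score_fn):
    """Score every parameter combination and return the best (params, score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)  # e.g. validation accuracy of a model built with params
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Calling `enhance_features` on an existing vector returns a new, longer list and leaves the input unchanged. `grid_search` accepts any callable that maps a parameter dictionary to a score, so a function that fits a gradient-boosted model and returns its validation accuracy could be passed in unchanged.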