
Since its inception in 2009, Bitcoin has been mired in controversy for providing a haven for illegal activities. Several types of illicit users hide behind its blanket of anonymity, and uncovering these entities is key to forensic investigations. Current methods use machine learning to identify illicit entities, but existing approaches focus on only a limited number of categories of illicit users. This paper addresses the issue with an ensemble of decision trees for supervised learning. The larger number of parameters allows the ensemble model to learn discriminating features that separate multiple groups of illicit users from licit users. To evaluate the model, a dataset of 2059 real-life Bitcoin entities was extracted from the blockchain, and nine features were engineered to train the model to distinguish 28 licit-illicit categories of users. The proposed model was evaluated empirically against three existing benchmark models to highlight its efficacy. Experiments showed that its specificity and sensitivity were comparable to those of the other models, so the proposed model provides a reliable tool for forensic study.
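The abstract reports sensitivity and specificity for a multi-category classifier. As a minimal sketch of how these two metrics can be computed per category in a one-vs-rest fashion, the Python below uses made-up labels (`exchange`, `ransomware`, `darknet`) and a hypothetical helper `sensitivity_specificity`; neither the labels nor the function come from the paper, and this is not the authors' evaluation code.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the category `positive`.

    Hypothetical helper for illustration only.
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # category correctly identified
        elif truth == positive:
            fn += 1  # category missed
        elif pred == positive:
            fp += 1  # another category wrongly flagged as this one
        else:
            tn += 1  # another category correctly left out
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels, assumed purely for illustration (not data from the paper).
y_true = ["exchange", "exchange", "ransomware", "ransomware", "darknet", "darknet"]
y_pred = ["exchange", "ransomware", "ransomware", "ransomware", "darknet", "exchange"]

for category in ("exchange", "ransomware", "darknet"):
    sens, spec = sensitivity_specificity(y_true, y_pred, category)
    print(f"{category}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Treating each category in turn as the positive class keeps both metrics well defined when there are many categories, as in the 28-category setting the abstract describes.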
Fraud detection, [INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC], Data Mining, [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS], [INFO.INFO-IT] Computer Science [cs]/Information Theory [cs.IT], Supervised Learning, Boosting Bitcoin, CCS Concepts: • Computing methodologies → Artificial intelligence
