
handle: 1959.4/101813
Anonymity networks are becoming increasingly popular as more users attempt to safeguard their online privacy. Tor is currently the most popular anonymity network and provides anonymity to both users and services (hidden services). However, the anonymity provided by Tor is also misused in various ways: hosting illegal sites for selling drugs, hosting command and control servers for botnets, and distributing censored content are a few examples. As a result, various parties, including governments and law enforcement agencies, are interested in techniques that assist in de-anonymising the Tor network, disrupting its operations, and bypassing its censorship circumvention mechanisms. However, the encrypted traffic used by Tor makes de-anonymisation more difficult. In this thesis, we focus on three distinct but interrelated tasks for monitoring anonymity network traffic (with a focus on the Tor network), which can help detect potential security threats and malicious actors. In the first work, we assess the classifiability of hidden service traffic. Hidden services are web services that can only be accessed via the Tor network and are responsible for a significant portion of the dark web. We employ machine learning to distinguish hidden service traffic from other Tor traffic with >99% accuracy. We then investigate how certain modifications made to Tor traffic to obfuscate its information leakage affect our techniques, and identify the most influential feature combinations for our classification task. In the second work, we explore website fingerprinting, one of the main de-anonymisation techniques against Tor users, which can be used to confirm the online activities of target users over Tor.
In our experiments, we found that decentralised applications (DApps) are harder to fingerprint than conventional websites and that reload traffic can reduce the accuracy of current website fingerprinting techniques considerably (more than 40% in some cases). We also propose two new Graph Neural Network-based website fingerprinting techniques that outperform existing techniques when applied to reload traffic and DApp traffic. In the final work, we focus on one of the most concerning but less studied areas related to Tor: we investigate the existence of malware traffic concealed in Tor and classify this malware into different classes. We utilise machine learning techniques to improve the accuracy of malware class identification. The techniques we use improve the micro-average precision and recall of existing techniques by ~20% and ~10%, respectively. In addition, we use Explainable Artificial Intelligence (XAI) techniques to interpret our results and investigate the resilience of the classifiers against evasion attacks. Finally, we develop a testbed to simulate botnet communications in Tor and collect a new dataset for future research.
Tor, Anonymity Networks, Traffic classification, Website Fingerprinting, Machine learning, anzsrc-for: 4604 Cybersecurity and privacy, anzsrc-for: 460407 System and network security, anzsrc-for: 4611 Machine learning, 004, 620
