The subject of the graduate qualification work is “Identify access to banned sites by analyzing Tor traffic using machine learning methods”. The work analyzes the applicability of machine learning methods to deanonymizing Tor network traffic. The research addressed the following tasks: 1) analyzing the architecture of the Tor network; 2) analyzing existing machine learning methods; 3) analyzing feature selection methods that reduce the number of network packet metrics; 4) conducting experiments with various machine learning methods; 5) analyzing the quality of detection of network packets containing requests to banned sites. For this work, a sample of traffic to banned and legitimate sites was collected. Site metrics were extracted with the CICFlowMeter tool, which provides time-based characteristics of the traffic. The Weka tool was used to run the experiments with machine learning and feature selection methods. The RandomForest method combined with WrapperSubsetEval achieved the highest accuracy, 98%. Experiments were also conducted to identify the specific site being accessed; the best result, 69% accuracy, was achieved with the RandomForest algorithm. These results can be applied in systems that filter traffic on the provider side.
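The idea of pairing a wrapper-style subset evaluator with a classifier, as the abstract describes, can be illustrated with a minimal sketch. This is not the Weka configuration used in the work: a nearest-centroid classifier stands in for RandomForest, the search is a plain greedy forward selection, and the data set and the feature names (`noise_a`, `timing`, `noise_b`) are synthetic and made up purely for illustration.

```python
def predict(train, cols, row):
    """Nearest-centroid prediction using only the feature columns in `cols`."""
    best_label, best_dist = None, None
    for label in (0, 1):
        members = [feats for feats, lab in train if lab == label]
        centroid = [sum(m[c] for m in members) / len(members) for c in cols]
        dist = sum((row[c] - mu) ** 2 for c, mu in zip(cols, centroid))
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def cv_accuracy(data, cols, folds=5):
    """Mean accuracy over `folds` folds, restricted to the columns in `cols`.

    Rows 2k and 2k+1 are kept in the same fold (an assumption tied to the
    synthetic data below; real data would be shuffled or stratified).
    """
    scores = []
    for f in range(folds):
        test = [d for i, d in enumerate(data) if (i // 2) % folds == f]
        train = [d for i, d in enumerate(data) if (i // 2) % folds != f]
        hits = sum(predict(train, cols, feats) == lab for feats, lab in test)
        scores.append(hits / len(test))
    return sum(scores) / folds


def forward_select(data, n_features, folds=5):
    """Greedily add the feature that most improves cross-validated accuracy."""
    selected, best = [], 0.0
    while len(selected) < n_features:
        scored = [(cv_accuracy(data, selected + [c], folds), c)
                  for c in range(n_features) if c not in selected]
        score, col = max(scored, key=lambda t: t[0])
        if score <= best:  # stop when no candidate improves the score
            break
        selected.append(col)
        best = score
    return selected, best


# Synthetic data: label 0 = "legitimate", label 1 = "banned" (illustrative).
# Only the middle column depends on the label; the other two are noise.
FEATURES = ["noise_a", "timing", "noise_b"]
data = []
for i in range(40):
    label = i % 2
    pair = i // 2
    data.append(([pair % 5, 10 * label + pair % 3, (pair * 3) % 7], label))

selected, best = forward_select(data, len(FEATURES))
print([FEATURES[c] for c in selected], best)
```

In this sketch `cv_accuracy` plays the role that WrapperSubsetEval plays in Weka: it scores a candidate feature subset by the cross-validated performance of a chosen classifier, while a separate search procedure (here `forward_select`) proposes the subsets.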
Machine learning, information systems, information, traffic analysis, feature selection of Tor traffic, Weka classification methods