Powered by OpenAIRE graph

This Research product is the result of merged Research products in OpenAIRE.

Development of an Anti-Spam Core for Email Services

Bachelor's final qualification thesis

Abstract

Developing spam protection tools is relevant because spam in emails and messages is a serious problem for individuals, businesses, and organizations. These unwanted messages can overload email systems, slow down network traffic, and even spread malware and viruses. This work describes the development of an anti-spam core that classifies emails as spam or non-spam. Classification combines machine learning, such as naive Bayes classification, with blacklists of IP addresses, phone numbers extracted from the message text, sender email addresses, and links found in the message. The technologies and tools used in development were C++17, CMake, Conan, Git, and Microsoft Visual Studio. The result is an anti-spam core that implements Bayesian filtering and blacklists. Testing on a large set of emails showed that every tenth email is classified incorrectly; that is, the anti-spam core gives the correct result in 90% of cases, which the work describes as a good result for email classification software.
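
The two techniques named in the abstract can be illustrated with a minimal C++17 sketch. It is not the thesis implementation: the names `Blacklist`, `NaiveBayes`, `tokenize`, and `classifyAsSpam` are hypothetical, and whitespace tokenization, Laplace smoothing, and the zero decision threshold are assumptions chosen for brevity. Only a sender-address blacklist is shown; lists of IP addresses, phone numbers, or links would presumably follow the same lookup pattern.

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <sstream>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical helper: lowercase the text and split it on whitespace.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::istringstream in(text);
    std::string tok;
    while (in >> tok) {
        for (char& c : tok) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        tokens.push_back(tok);
    }
    return tokens;
}

// Hypothetical blacklist: exact-match lookup of known bad entries.
class Blacklist {
public:
    void add(const std::string& entry) { entries_.insert(entry); }
    bool contains(const std::string& entry) const { return entries_.count(entry) > 0; }
private:
    std::unordered_set<std::string> entries_;
};

// Multinomial naive Bayes over word counts (an assumed variant).
class NaiveBayes {
public:
    void train(const std::string& text, bool spam) {
        auto& counts = spam ? spamCounts_ : hamCounts_;
        auto& total = spam ? spamTokens_ : hamTokens_;
        for (const auto& tok : tokenize(text)) {
            ++counts[tok];
            ++total;
            vocab_.insert(tok);
        }
        ++(spam ? spamDocs_ : hamDocs_);
    }

    // Returns log P(spam | text) - log P(ham | text) with Laplace smoothing.
    // A positive value means spam is the more likely class.
    double spamLogOdds(const std::string& text) const {
        assert(spamDocs_ > 0 && hamDocs_ > 0);  // needs examples of both classes
        double score = std::log(static_cast<double>(spamDocs_)) -
                       std::log(static_cast<double>(hamDocs_));
        const double v = static_cast<double>(vocab_.size());
        for (const auto& tok : tokenize(text)) {
            score += std::log((countOf(spamCounts_, tok) + 1.0) / (spamTokens_ + v));
            score -= std::log((countOf(hamCounts_, tok) + 1.0) / (hamTokens_ + v));
        }
        return score;
    }

private:
    static double countOf(const std::unordered_map<std::string, int>& m,
                          const std::string& tok) {
        auto it = m.find(tok);
        return it == m.end() ? 0.0 : static_cast<double>(it->second);
    }

    std::unordered_map<std::string, int> spamCounts_, hamCounts_;
    std::unordered_set<std::string> vocab_;
    int spamTokens_ = 0, hamTokens_ = 0;
    int spamDocs_ = 0, hamDocs_ = 0;
};

// A blacklisted sender is rejected outright; otherwise the Bayes score decides.
bool classifyAsSpam(const Blacklist& senders, const NaiveBayes& model,
                    const std::string& sender, const std::string& body) {
    return senders.contains(sender) || model.spamLogOdds(body) > 0.0;
}
```

Checking the blacklist first keeps the common case cheap, since a hash lookup avoids scoring the message body at all. Working in log space avoids numeric underflow when many small word probabilities are multiplied.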

Keywords

blacklists, classification, machine learning, antispam, C++

  • Impact by BIP!
    selected citations: 0
    popularity: Average
    influence: Average
    impulse: Average