
Разработка ÑредÑтв защиты от Ñпама актуальна, поÑкольку Ñпам в Ñлектронных пиÑьмах и ÑообщениÑÑ… предÑтавлÑет Ñобой Ñерьезную проблему Ð´Ð»Ñ Ð¾Ñ‚Ð´ÐµÐ»ÑŒÐ½Ñ‹Ñ… лиц, предприÑтий и организаций. Ðти нежелательные Ñо-Ð¾Ð±Ñ‰ÐµÐ½Ð¸Ñ Ð¼Ð¾Ð³ÑƒÑ‚ перегружать почтовые ÑиÑтемы, замедлÑть Ñетевой трафик и даже раÑпроÑтранÑть вредоноÑные программы и вируÑÑ‹. Ð’ рамках данной работы показан процеÑÑ Ñ€Ð°Ð·Ñ€Ð°Ð±Ð¾Ñ‚ÐºÐ¸ антиÑпам-Ñдра Ð´Ð»Ñ ÐºÐ»Ð°ÑÑификации пиÑем на Ñпам и не Ñпам. КлаÑÑÐ¸Ñ„Ð¸ÐºÐ°Ñ†Ð¸Ñ Ð¿Ñ€Ð¾Ð¸Ð·Ð²Ð¾Ð´Ð¸Ñ‚ÑÑ Ñ Ð¿Ð¾Ð¼Ð¾Ñ‰ÑŒÑŽ машинного обучениÑ, такого как Ð½Ð°Ð¸Ð²Ð½Ð°Ñ Ð‘Ð°Ð¹ÐµÑовÑÐºÐ°Ñ ÐºÐ»Ð°ÑÑификациÑ, метода объÑÐ²Ð»ÐµÐ½Ð¸Ñ Ñ‡ÐµÑ€Ð½Ñ‹Ñ… ÑпиÑков Ð´Ð»Ñ IP адреÑов, номеров телефонов из текÑта пиÑьма, email адреÑов отправителей, ÑÑылок из пиÑьма. Ð’ рамках разработки иÑпользовалиÑÑŒ такие технологии и инÑтрументы: C++ 17, Cmake, conan, git, Ms Visual Studio. Ð’ результате данной работы было Ñоздано антиÑпам-Ñдро, в котором реализованы методы БайеÑовÑкой фильтрации и черные ÑпиÑки, при теÑтировании на большом наборе пиÑем было уÑтановлено, что каждое деÑÑтое пиÑьмо определÑетÑÑ Ð½ÐµÐ¿Ñ€Ð°Ð²Ð¸Ð»ÑŒÐ½Ð¾, то еÑть антиÑпам-Ñдро выдает правильный результат в 90% Ñлучаев, что ÑвлÑетÑÑ Ñ…Ð¾Ñ€Ð¾ÑˆÐ¸Ð¼ результатом Ð´Ð»Ñ Ð¿Ñ€Ð¾Ð³Ñ€Ð°Ð¼Ð¼ из Ñферы клаÑÑификации пиÑем.
Developing spam protection tools is relevant because spam in emails and messages is a serious problem for individuals, businesses, and organizations. These unwanted messages can overload mail systems, slow down network traffic, and even spread malware and viruses. This paper describes the development of an anti-spam core that classifies emails as spam or non-spam. Classification combines machine learning, specifically naive Bayes classification, with blacklists of IP addresses, phone numbers found in the message text, sender email addresses, and links found in the message. The technologies and tools used in development were C++17, CMake, Conan, Git, and Microsoft Visual Studio. The resulting anti-spam core implements Bayesian filtering and blacklists. Testing on a large set of emails showed that one message in ten is classified incorrectly; that is, the anti-spam core gives the correct result in 90% of cases, which is a good result for email classification programs.
blacklists, classification, machine learning, antispam, C++
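The combination of blacklists and naive Bayes classification described in the abstract can be illustrated with a minimal C++17 sketch. It is not the paper's implementation: the names `tokenize`, `NaiveBayesFilter`, `Email`, `Blacklists`, and `isSpam` are hypothetical, the tokenizer is deliberately simple, and the filter assumes a multinomial model with add-one (Laplace) smoothing. Only sender address and IP blacklists are shown; phone numbers and links would first have to be extracted from the message text.

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Split text into lowercase alphanumeric tokens (illustrative tokenizer).
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            current.push_back(static_cast<char>(std::tolower(ch)));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// Hypothetical multinomial naive Bayes filter with add-one smoothing.
class NaiveBayesFilter {
public:
    // Count the tokens of one labeled training message.
    void train(const std::string& text, bool isSpam) {
        auto& counts = isSpam ? spamCounts_ : hamCounts_;
        auto& total = isSpam ? spamTokens_ : hamTokens_;
        for (const auto& token : tokenize(text)) {
            ++counts[token];
            ++total;
            vocabulary_.insert(token);
        }
        ++(isSpam ? spamDocs_ : hamDocs_);
    }

    // log P(spam | text) - log P(ham | text); positive means "more likely spam".
    double spamLogOdds(const std::string& text) const {
        assert(spamDocs_ > 0 && hamDocs_ > 0 && "train on both classes first");
        double score = std::log(static_cast<double>(spamDocs_)) -
                       std::log(static_cast<double>(hamDocs_));
        const double v = static_cast<double>(vocabulary_.size());
        for (const auto& token : tokenize(text)) {
            score += std::log((count(spamCounts_, token) + 1.0) / (spamTokens_ + v));
            score -= std::log((count(hamCounts_, token) + 1.0) / (hamTokens_ + v));
        }
        return score;
    }

private:
    using Counts = std::unordered_map<std::string, std::size_t>;

    static double count(const Counts& counts, const std::string& token) {
        auto it = counts.find(token);
        return it == counts.end() ? 0.0 : static_cast<double>(it->second);
    }

    Counts spamCounts_, hamCounts_;
    std::unordered_set<std::string> vocabulary_;
    std::size_t spamTokens_ = 0, hamTokens_ = 0;
    std::size_t spamDocs_ = 0, hamDocs_ = 0;
};

// Hypothetical message representation; field names are assumptions.
struct Email {
    std::string senderAddress;
    std::string senderIp;
    std::string body;
};

// Exact-match blacklists for two of the attributes named in the abstract.
struct Blacklists {
    std::unordered_set<std::string> senderAddresses;
    std::unordered_set<std::string> ips;
};

// A blacklist hit decides immediately; otherwise the Bayes score decides.
bool isSpam(const Email& mail, const Blacklists& lists, const NaiveBayesFilter& filter) {
    if (lists.ips.count(mail.senderIp) > 0 ||
        lists.senderAddresses.count(mail.senderAddress) > 0) {
        return true;
    }
    return filter.spamLogOdds(mail.body) > 0.0;
}
```

In this sketch the blacklists are checked first because a lookup is cheap and unambiguous, while the Bayes score handles messages from unknown senders. The decision threshold of 0 on the log-odds is an assumption; a real filter would tune it against the cost of misclassifying legitimate mail.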
