
Topic of the final qualification work: "Assessing the solvency of bank customers using machine learning methods". The work studies approaches to assessing the solvency of bank customers with machine learning methods. The study addressed the following tasks: a review of the literature on the use of machine learning in the banking sector; examination and preparation of a data set for subsequent analysis; construction of solvency assessment models using machine learning methods. The work was carried out in the R language, with a significant part of the research performed in the RStudio integrated development environment. It includes a preliminary analysis of the data set, conversion of attribute values to the correct data types, and a description of the data set structure. Solvency was assessed using four classification methods: logistic regression, support vector machines, random forest, and decision trees. For each model, ROC curves were plotted to show its predictive quality. Additional quality metrics are reported for the models, and the best model is selected on the basis of these metrics.
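The comparison of classifiers by ROC curve can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the thesis's R code. The function names `roc_points` and `auc` and the sample labels and scores are hypothetical and chosen only for illustration. The sketch sweeps the decision threshold over the predicted scores, collects the (false positive rate, true positive rate) points, and integrates them with the trapezoidal rule.

```python
def roc_points(labels, scores):
    """Return ROC points (fpr, tpr) by sweeping the threshold from high to low.

    labels: 0/1 values, where 1 marks the positive class.
    scores: predicted scores; a higher score means "more likely positive".
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")

    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Observations with tied scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data: six customers and the scores of one model.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(round(auc(roc_points(labels, scores)), 4))  # prints 0.7778
```

Running the same two functions on the scores of each candidate model gives one AUC value per model, which is one common way to rank classifiers alongside other quality metrics.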
classification, artificial intelligence, logistic regression, scoring, random forest
