
Financial literacy (FL) represents a person's ability to turn assets into income, and the modern definition has been extended to include an understanding of digital currencies. FL can be predicted by exploiting unlabelled recorded data in financial networks via semi-supervised learning (SSL). Measuring and predicting FL has not been widely studied, so the consequences of customers' financial engagement remain poorly understood. Previous studies have shown that low FL increases the risk of social harm. It is therefore important to estimate FL accurately so that specific intervention programs can be allocated to less financially literate groups. This will not only increase company profitability, but will also reduce government spending. Some studies have treated FL prediction as a classification task, whereas others have focused on defining FL and its impacts. This paper investigates mechanisms to learn customers' FL level from their financial data using the synthetic minority over-sampling technique for regression with Gaussian noise (SMOGN). We propose the SMOGN-COREG model for semi-supervised regression, applying SMOGN to deal with unbalanced datasets and a nonparametric multi-learner co-regression (COREG) algorithm for labeling. We compared the SMOGN-COREG model with six well-known regressors on five datasets to evaluate the proposed model's effectiveness on unbalanced and unlabelled financial data. Experimental results confirmed that the proposed method outperformed the comparator models on such data. SMOGN-COREG is therefore a step towards using unlabelled data to estimate FL level.
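The co-regression labeling step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the commonly described COREG setup of two k-nearest-neighbour regressors that differ only in their Minkowski distance order, each of which pseudo-labels the unlabelled example that most reduces its own error on nearby labelled points and hands that example to the other regressor. The function names, the defaults (`k=3`, orders 2 and 5, `max_rounds=20`), and the leave-one-out scoring are illustrative assumptions, and the SMOGN resampling step is omitted.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two feature vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_indices(X, query, k, p, skip=None):
    """Indices of the k points in X closest to query (optionally skipping one)."""
    candidates = (i for i in range(len(X)) if i != skip)
    return sorted(candidates, key=lambda i: minkowski(X[i], query, p))[:k]


def knn_predict(X, y, query, k, p, skip=None):
    """Plain kNN regression: mean target of the k nearest points."""
    idx = knn_indices(X, query, k, p, skip)
    return sum(y[i] for i in idx) / len(idx)


def coreg(X_lab, y_lab, X_unlab, k=3, orders=(2, 5), max_rounds=20):
    """Simplified co-training regression with two kNN learners (hypothetical helper)."""
    # Each learner keeps its own copy of the labelled set.
    sets = [(list(X_lab), list(y_lab)) for _ in orders]
    pool = list(X_unlab)
    for _ in range(max_rounds):
        if not pool:
            break
        picks = []
        for j, p in enumerate(orders):
            X, y = sets[j]
            best = None  # (gain, pool index, pseudo-label)
            for u, xu in enumerate(pool):
                yu = knn_predict(X, y, xu, k, p)      # pseudo-label for xu
                neigh = knn_indices(X, xu, k, p)       # labelled neighbours of xu
                X2, y2 = X + [xu], y + [yu]
                gain = 0.0
                for i in neigh:
                    # Leave-one-out error on neighbour i, before and after adding xu.
                    before = knn_predict(X, y, X[i], k, p, skip=i)
                    after = knn_predict(X2, y2, X[i], k, p, skip=i)
                    gain += (y[i] - before) ** 2 - (y[i] - after) ** 2
                if gain > 0 and (best is None or gain > best[0]):
                    best = (gain, u, yu)
            picks.append(best)
        if all(b is None for b in picks):
            break  # neither learner found a helpful example
        chosen = set()
        for j, b in enumerate(picks):
            if b is None:
                continue
            _, u, yu = b
            other = 1 - j  # the pick of one learner teaches the other
            sets[other][0].append(pool[u])
            sets[other][1].append(yu)
            chosen.add(u)
        pool = [x for i, x in enumerate(pool) if i not in chosen]

    def predict(query):
        # Final estimate: average of the two refined learners.
        preds = [knn_predict(X, y, query, k, p) for (X, y), p in zip(sets, orders)]
        return sum(preds) / len(preds)

    return predict


if __name__ == "__main__":
    X_lab = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
    y_lab = [2.0 * x[0] for x in X_lab]
    X_unlab = [[0.5], [1.5], [2.5], [3.5], [4.5]]
    model = coreg(X_lab, y_lab, X_unlab)
    print(model([2.2]))
```

In a pipeline like the one the abstract describes, an over-sampling step for rare target values would presumably run on the labelled set before a routine of this kind; how the two are combined in SMOGN-COREG is not specified here.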
12 pages
Financial literacy, semi-supervised regression, unbalanced datasets, unlabelled data, Machine Learning (cs.LG), Computers and Society (cs.CY), Computational Engineering, Finance, and Science (cs.CE), Econometrics (econ.EM), [INFO] Computer Science [cs], FOS: Computer and information sciences, FOS: Economics and business
