An Investigation on Disparity Responds of Machine Learning Algorithms to Data Normalization Method

Publication: Article, 19 Sep 2022. Publisher: Koya University. Journal: ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY, volume 10, pages 29-37 (ISSN: 2410-9355, eISSN: 2307-549X)

Authors: Haval A. Ahmed; Peshawa J. Muhammad Ali; Abdulbasit K. Faeq; Saman M. Abdullah

doi: 10.14500/aro.10970

Abstract

Data normalization can eliminate the effect of inconsistent feature ranges in some machine learning (ML) techniques and speed up the optimization process in others. Many studies apply different data normalization methods to reduce or eliminate the impact of data variance on the accuracy of ML-based models. However, how the significance of this impact relates to the mathematical basis of each ML algorithm still needs further investigation and testing. To that end, this work proposes an investigation methodology involving three ML algorithms: support vector machine (SVM), artificial neural network (ANN), and Euclidean-based K-nearest neighbor (E-KNN). Five datasets are used, each taken from a different application field and with different statistical properties. Although many data normalization methods are available, this work focuses on the min-max method because it directly eliminates the effect of inconsistent ranges in the datasets. Other factors that complicate min-max normalization, such as including or excluding outliers or the least significant feature, are also considered. The findings show that each ML technique responds differently to min-max normalization. The performance of SVM models improved, while ANN models showed no significant improvement. The performance of E-KNN models may improve or degrade with min-max normalization, depending on the statistical properties of the dataset.
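As a rough illustration of why a Euclidean-distance method such as E-KNN is sensitive to feature ranges, the sketch below applies min-max rescaling, x' = (x - min) / (max - min), to a made-up four-row dataset and compares the nearest neighbor of the first row before and after scaling. The data, the feature meanings, and the helper names (`min_max_normalize`, `euclidean`, `nearest_neighbor`) are hypothetical and are not taken from the paper; this is not the authors' implementation.

```python
import math


def min_max_normalize(rows):
    """Rescale every feature (column) to [0, 1] using (x - min) / (max - min)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            # A constant column has no range; map it to 0.0 to avoid dividing by zero.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(points, query_index):
    """Index of the point closest to points[query_index], excluding itself."""
    query = points[query_index]
    others = [i for i in range(len(points)) if i != query_index]
    return min(others, key=lambda i: euclidean(points[i], query))


# Hypothetical data: column 0 spans tens of units, column 1 spans tens of thousands.
rows = [
    [25, 50000],
    [60, 50200],
    [27, 51000],
    [40, 90000],
]

raw_neighbor = nearest_neighbor(rows, 0)
scaled_rows = min_max_normalize(rows)
scaled_neighbor = nearest_neighbor(scaled_rows, 0)

print(raw_neighbor)     # 1: the wide-range column dominates the raw distance
print(scaled_neighbor)  # 2: after scaling, both columns contribute comparably
```

On the raw values the second column dominates the distance, so row 1 is nearest to row 0. After rescaling, row 2 is nearest. Whether such a change helps or hurts a classifier depends on the dataset, which is consistent with the mixed E-KNN result reported in the abstract.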

Related Organizations

Koya University
Iraq
Ishik University
Iraq

Keywords

Artificial neural network, Technology, Support vector machine, T, Science, Q, Mean squared error, Min-max normalization, Euclidean-based K-nearest neighbor
