Single and Ensemble Based Filters in Environmental Data

Publication: Article, 12 Jun 2025, English
Publisher: Wiley
Journal: Expert Systems, volume 42 (ISSN: 0266-4720, eISSN: 1468-0394)

Authors: Yousra Cherif; Ali Idri

doi: 10.1111/exsy.70076


Abstract

Researchers rely on species distribution models (SDMs) to relate species occurrence records to environmental data. These models offer insights into the ecological and evolutionary aspects of the subject. Feature selection (FS) aims to retain useful, interrelated features and remove unnecessary or redundant ones, which also makes the induced model easier to understand. Although FS plays a crucial role in SDMs, only a limited number of studies have addressed it, and they share several key shortcomings: no use of multivariate techniques, no comparison between univariate and multivariate filters, and no comparison between ensembles of univariate filters and ensembles of multivariate filters. This study therefore presents a rigorous empirical evaluation that assesses and compares six univariate filter-based FS methods, each applied with two thresholds, with two multivariate techniques, using four classifiers: Extreme Gradient Boosting (XGB), Random Forest (RF), Decision Tree (DT), and Light Gradient-Boosting Machine (LGBM). The study also proposes a novel approach to ensemble construction: univariate filter ensembles that keep 40% of the features as ranked by Borda Count or Reciprocal Rank, and multivariate filter ensembles built by fusion or by intersection. We evaluated and compared the performance of the univariate and multivariate techniques against their ensembles, and likewise compared the best ensemble techniques across datasets. The empirical evaluation relies on 5-fold cross-validation, the Scott-Knott (SK) test, and Borda Count, with three performance metrics: accuracy, Kappa, and F1-score.
Experiments showed that Consistency-based subset selection combined with RF outperformed all other univariate and multivariate FS techniques, with an accuracy of 91.63% across all datasets. However, the Fisher score combined with RF was the best choice when the number of features is also taken into account. In general, both the univariate and the multivariate ensembles outperformed their single-filter counterparts. Among the ensembles, the fusion-based ensemble performed best, achieving an accuracy of 91.77% with RF across datasets. Nevertheless, in terms of both performance and number of features, the ensemble built with Reciprocal Rank performed better than all other FS techniques regardless of the classifier used, achieving an accuracy of 91.61% with RF across datasets.
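
The ensemble constructions named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes the common definitions of Borda Count (points equal to the number of features ranked at or below a feature) and Reciprocal Rank (sum of 1/rank across filters), and it assumes that "fusion" means the union of the selected subsets. All function names and the toy feature names are hypothetical.

```python
import math


def borda_count(rankings):
    """Aggregate best-first rankings; a feature at position p of n gets n - p points."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] = scores.get(feature, 0) + (n - position)
    # Higher total score is better; ties are broken alphabetically.
    return sorted(scores, key=lambda f: (-scores[f], f))


def reciprocal_rank(rankings):
    """Aggregate best-first rankings by summing 1/rank over all filters."""
    scores = {}
    for ranking in rankings:
        for rank, feature in enumerate(ranking, start=1):
            scores[feature] = scores.get(feature, 0.0) + 1.0 / rank
    return sorted(scores, key=lambda f: (-scores[f], f))


def top_fraction(ranked, fraction=0.4):
    """Keep the leading share of an aggregated ranking (40% by default)."""
    k = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:k]


def fusion(subsets):
    """Assumed meaning of a fusion-based ensemble: union of the selected subsets."""
    return sorted(set().union(*subsets))


def intersection(subsets):
    """Intersection-based ensemble: features selected by every multivariate filter."""
    return sorted(set.intersection(*map(set, subsets)))


# Toy rankings from three hypothetical univariate filters (best feature first).
rankings = [
    ["a", "b", "c", "d", "e"],
    ["b", "a", "c", "e", "d"],
    ["a", "c", "b", "d", "e"],
]
borda_selected = top_fraction(borda_count(rankings))
rr_selected = top_fraction(reciprocal_rank(rankings))

# Toy subsets from two hypothetical multivariate filters.
subsets = [{"a", "b", "d"}, {"b", "c", "d"}]
fused = fusion(subsets)
common = intersection(subsets)
```

The selected feature lists would then be passed to a classifier such as RF and scored with 5-fold cross-validation, as the abstract describes.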

Related Organizations

Mohammed V University
Morocco

Impact by BIP!

selected citations: 2. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
