
doi: 10.3233/ida-160840
Feature selection is an important area of machine learning that plays a key role in the challenging problem of classifying high-dimensional data. The problem is to find, among the set of all features, an effective subset such that the final feature set improves accuracy and reduces complexity. Since feature selection is an NP-hard problem, many heuristic algorithms have been studied to solve it. In this paper, we propose a novel method based on a hyper-heuristic approach to find an efficient feature subset, which we name Hyper-Heuristic Feature Selection (HHFS). In the proposed method, low-level heuristics are categorized into two groups: the first group contains exploiters, which exploit the search space efficiently by improving the quality of the candidate solution at hand; the second group contains explorers, which explore the solution space by relying on random perturbations. Since each region of the solution space can have its own characteristics, an appropriate low-level heuristic should be selected and applied to the current solution. We propose a genetic algorithm to select among the set of low-level heuristics and to balance exploitation and exploration. It chooses the low-level heuristic based on the recorded performance history of the low-level heuristics. We aim to investigate the role of cooperation between low-level heuristics within a hyper-heuristic framework in finding the best feature subset. Since different low-level heuristics have different strengths and weaknesses, we believe that cooperation can allow the strengths of one low-level heuristic to compensate for the weaknesses of another. In this study, we also propose Adaptive Hyper-Heuristic Feature Selection (AHHFS), which is an extension of HHFS. An empirical study of the proposed method on several commonly used data sets from the UCI repository indicates that it outperforms recent feature selection methods in the literature.
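The explorer/exploiter split and the history-based choice of heuristic described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no algorithmic details, so the genetic-algorithm selector is replaced here by a simple score-weighted random choice, every move is accepted, and the objective is a hypothetical toy function (a real setup would score a subset by classifier accuracy). All names, such as `fitness`, `RELEVANT`, and `hyper_heuristic_search`, are assumptions made for illustration.

```python
import random

# Hypothetical toy problem: features 0-4 are "relevant". A real objective
# would be, for example, cross-validated classifier accuracy on the subset.
N_FEATURES = 20
RELEVANT = {0, 1, 2, 3, 4}


def fitness(mask):
    """Reward selected relevant features and penalize subset size."""
    hits = sum(1 for i in RELEVANT if mask[i])
    return hits - 0.1 * sum(mask)


def explorer_flip(mask, rng):
    """Explorer: flip one random bit, whatever its effect on fitness."""
    out = mask[:]
    i = rng.randrange(len(out))
    out[i] = 1 - out[i]
    return out


def exploiter_best_flip(mask, rng):
    """Exploiter: apply the single bit flip that most improves fitness.

    Returns the mask unchanged if no flip improves it. `rng` is unused and
    kept only so both heuristics share one signature.
    """
    best, best_fit = mask, fitness(mask)
    for i in range(len(mask)):
        cand = mask[:]
        cand[i] = 1 - cand[i]
        f = fitness(cand)
        if f > best_fit:
            best, best_fit = cand, f
    return best


HEURISTICS = [explorer_flip, exploiter_best_flip]


def hyper_heuristic_search(steps=200, seed=0):
    """Pick a low-level heuristic each step, weighted by its past gains."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(N_FEATURES)]
    best = current[:]
    scores = [1.0] * len(HEURISTICS)  # performance history per heuristic
    for _ in range(steps):
        # Assumption: score-weighted random choice stands in for the
        # genetic-algorithm selector mentioned in the abstract.
        k = rng.choices(range(len(HEURISTICS)), weights=scores)[0]
        cand = HEURISTICS[k](current, rng)
        gain = fitness(cand) - fitness(current)
        # Decay old credit, add new improvement, keep weights positive.
        scores[k] = max(0.1, 0.9 * scores[k] + max(gain, 0.0))
        current = cand  # assumption: every move is accepted
        if fitness(current) > fitness(best):
            best = current[:]
    return best, fitness(best)


if __name__ == "__main__":
    subset, score = hyper_heuristic_search()
    print([i for i, bit in enumerate(subset) if bit], score)
```

Keeping a positive floor on each score means the explorer is never starved entirely, which is one simple way to keep some exploration alive once the exploiter stops producing gains.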
