Publication . Article . Preprint . 2014

Feature selection for classification with class-separability strategy and data envelopment analysis

Yishi Zhang; Chao Yang; Anrong Yang; Chan Xiong; Xingchi Zhou; Z. Zhang
Open Access
English
Abstract

This paper presents a novel feature selection method based on a Class-Separability (CS) strategy and Data Envelopment Analysis (DEA). To better capture the relationship between the features and the class, the class labels are separated into individual variables, and relevance and redundancy are handled explicitly for each class label. Super-efficiency DEA is employed to evaluate and rank features by their conditional dependence scores on all class labels. In each iteration, the feature with the maximum super-efficiency score is added to the conditioning set used for conditional dependence estimation in the next iteration, and the process repeats until the final feature subset is obtained. Finally, experiments compare the proposed method with four state-of-the-art methods in terms of classification accuracy. The empirical results verify the feasibility and superiority of the proposed feature selection method.
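
The iterative procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the conditional dependence estimator, so empirical conditional mutual information on discrete features is used as a stand-in, and the DEA model is assumed to be an input-oriented super-efficiency model with a single constant input and one output per class label. The function names (`mutual_info`, `cond_mutual_info`, `super_efficiency`, `select_features`) are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def mutual_info(x, y):
    """Empirical mutual information (nats) between two discrete arrays."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            p_xy = np.mean((x == xv) & (y == yv))
            if p_xy > 0:
                mi += p_xy * np.log(p_xy / (np.mean(x == xv) * np.mean(y == yv)))
    return mi


def cond_mutual_info(x, y, z):
    """Empirical I(x; y | z): mutual information averaged over the strata of z."""
    return sum(np.mean(z == zv) * mutual_info(x[z == zv], y[z == zv])
               for zv in np.unique(z))


def super_efficiency(Y, k):
    """Super-efficiency of row k of the output matrix Y (assumed constant input).

    Solves: min sum(lam)  s.t.  sum_j lam_j * Y[j] >= Y[k], lam >= 0, j != k.
    """
    others = np.delete(Y, k, axis=0)
    res = linprog(c=np.ones(len(others)), A_ub=-others.T, b_ub=-Y[k],
                  bounds=(0, None), method="highs")
    # A non-optimal status is treated as infeasible: no mix of the other
    # features matches row k on every class label, so k ranks first.
    return res.fun if res.status == 0 else np.inf


def select_features(X, y, n_select):
    """Greedy selection: rank candidates by super-efficiency, add the best."""
    classes = np.unique(y)
    selected, remaining = [], list(range(X.shape[1]))
    cond = np.zeros(len(y), dtype=int)  # empty conditioning set: one stratum
    while remaining and len(selected) < n_select:
        # Class separability: one score per (candidate, class label) pair,
        # each class label treated as its own binary variable.
        scores = np.array([[cond_mutual_info(X[:, f], (y == c).astype(int), cond)
                            for c in classes] for f in remaining])
        scores = np.maximum(scores, 0.0)  # clip tiny negative rounding errors
        if len(remaining) == 1:
            best = 0
        else:
            best = int(np.argmax([super_efficiency(scores, k)
                                  for k in range(len(remaining))]))
        selected.append(remaining.pop(best))
        # Encode the joint state of the selected features as one variable.
        cond = np.unique(X[:, selected], axis=0, return_inverse=True)[1].ravel()
    return selected
```

Calling `select_features(X, y, n_select)` on an integer-coded feature matrix `X` and label vector `y` returns the column indices in selection order. Conditioning on the joint state of all selected features is only practical for small discrete data; a real implementation would need a more scalable estimator.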

23 pages, 12 figures

Subjects by Vocabulary

Microsoft Academic Graph classification: Set (abstract data type), Selection (genetic algorithm), Pattern recognition, Class (biology), Feature (machine learning), Data envelopment analysis, Artificial intelligence, Redundancy (engineering), Feature selection, Computer science

Subjects

Computer Science - Learning, Computer Science - Information Theory, Statistics - Machine Learning, 68T10, 90C05, 94A17, 62B10, 68U35, I.5.2, G.1.6, H.1.1, Artificial Intelligence, Cognitive Neuroscience, Computer Science Applications, Machine Learning (cs.LG), Information Theory (cs.IT), Machine Learning (stat.ML), FOS: Computer and information sciences

