Publication . Article . Preprint . 2014

Feature selection for classification with class-separability strategy and data envelopment analysis

Yishi Zhang; Chao Yang; Anrong Yang; Chan Xiong; Xingchi Zhou; Z. Zhang
Open Access
Published: 05 May 2014. Journal: Neurocomputing, volume 166, pages 172-184 (ISSN: 0925-2312)
Publisher: Elsevier BV

This paper presents a novel feature selection method based on a Class-Separability (CS) strategy and Data Envelopment Analysis (DEA). To better capture the relationship between features and the class, class labels are separated into individual variables, and relevance and redundancy are handled explicitly on each class label. Super-efficiency DEA is employed to evaluate and rank features via their conditional dependence scores on all class labels. The feature with the maximum super-efficiency score is then added to the conditioning set used for conditional dependence estimation in the next iteration, so that features are selected iteratively until the final feature subset is obtained. Finally, experiments compare the proposed method with four state-of-the-art methods in terms of classification accuracy. The empirical results verify the feasibility and superiority of the proposed feature selection method.
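The iterative loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: conditional dependence is estimated here as empirical conditional mutual information on discrete features, the conditioning set is treated as the joint value of all features selected so far, and the super-efficiency DEA ranking is replaced by a plain placeholder (the mean of the per-class scores). The names `cond_mutual_info` and `select_features` and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def cond_mutual_info(x, y, z):
    """Empirical I(X; Y | Z) in bits for equal-length discrete sequences."""
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    total = 0.0
    for (xi, yi, zi), n_xyz in c_xyz.items():
        # p(z) p(x,y,z) / (p(x,z) p(y,z)) written in terms of raw counts.
        ratio = n_xyz * c_z[zi] / (c_xz[(xi, zi)] * c_yz[(yi, zi)])
        total += (n_xyz / n) * log2(ratio)
    return total


def select_features(columns, labels, k):
    """Greedily pick up to k feature names from `columns` (name -> values)."""
    classes = sorted(set(labels))
    # Class separation: one binary indicator variable per class label.
    indicators = {c: [int(lab == c) for lab in labels] for c in classes}
    selected = []
    cond = [()] * len(labels)  # joint value of the selected features, per sample
    remaining = list(columns)
    while remaining and len(selected) < k:
        scores = {}
        for name in remaining:
            per_class = [
                cond_mutual_info(columns[name], indicators[c], cond)
                for c in classes
            ]
            # ASSUMPTION: the mean stands in for the super-efficiency DEA score.
            scores[name] = sum(per_class) / len(per_class)
        best = max(remaining, key=lambda f: scores[f])
        selected.append(best)
        remaining.remove(best)
        # Grow the conditioning set with the newly selected feature.
        cond = [c + (v,) for c, v in zip(cond, columns[best])]
    return selected


# Hypothetical toy data: the class is determined jointly by `a` and `b`.
toy_columns = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "b": [0, 1, 0, 1, 0, 1, 0, 1],
    "a_copy": [0, 0, 1, 1, 0, 0, 1, 1],
    "noise": [0, 0, 0, 0, 0, 0, 0, 0],
}
toy_labels = [2 * a + b for a, b in zip(toy_columns["a"], toy_columns["b"])]
print(select_features(toy_columns, toy_labels, 2))
```

Because each candidate is scored conditionally on the features already chosen, a column that duplicates a selected feature contributes no further dependence on any class indicator, which is how redundancy is penalized in this sketch. A faithful version would replace the mean with a super-efficiency DEA model solved over the per-class score vectors.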

23 pages, 12 figures

Subjects by Vocabulary

Microsoft Academic Graph classification: Set (abstract data type), Pattern recognition, Class (biology), Redundancy (engineering), Selection (genetic algorithm), Feature selection, Computer science, Artificial intelligence, Data envelopment analysis, Feature (machine learning)


Artificial Intelligence, Cognitive Neuroscience, Computer Science Applications, Computer Science - Learning, Computer Science - Information Theory, Statistics - Machine Learning, 68T10, 90C05, 94A17, 62B10, 68U35, I.5.2, G.1.6, H.1.1, Machine Learning (cs.LG), Information Theory (cs.IT), Machine Learning (stat.ML), FOS: Computer and information sciences

