A new discrete particle swarm algorithm applied to attribute selection in a bioinformatics data set

Publication: Article, Part of book or chapter of book
Date: 08 Jul 2006
Publisher: ACM
Journal: Proceedings of the 8th annual conference on Genetic and evolutionary computation

Authors: Correa, Elon S.; Freitas, Alex A.; Johnson, Colin G.

doi: 10.1145/1143997.1144003

Abstract

Many data mining applications involve the task of building a model for predictive classification. The goal of such a model is to classify examples (records or data instances) into classes or categories of the same type. The use of variables (attributes) not related to the classes can reduce the accuracy and reliability of a classification or prediction model. Superfluous variables can also increase the costs of building a model, particularly on large data sets. We propose a discrete Particle Swarm Optimization (PSO) algorithm designed for attribute selection. The proposed algorithm deals with discrete variables, and its population of candidate solutions contains particles of different sizes. The performance of this algorithm is compared with the performance of a standard binary PSO algorithm on the task of selecting attributes in a bioinformatics data set. The criteria used for comparison are: (1) maximizing predictive accuracy; and (2) finding the smallest subset of attributes.
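To make the baseline in this comparison concrete, the following is a minimal sketch of a binary PSO used for attribute selection, in which each particle is a 0/1 mask over the attributes. It is not the discrete PSO proposed in the paper, whose details are not given in the abstract. The inertia weight `w`, the velocity clamp `v_max`, the parameter values, and the function `toy_fitness` (a hypothetical stand-in for classifier accuracy with a small penalty on subset size) are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_fitness(mask, relevant):
    """Hypothetical stand-in for predictive accuracy (not from the paper):
    rewards selecting the 'relevant' attributes and lightly penalises size."""
    hits = sum(1 for i in relevant if mask[i])
    return hits / len(relevant) - 0.01 * sum(mask)


def binary_pso(n_attrs, fitness, n_particles=20, n_iters=50,
               w=0.8, c1=2.0, c2=2.0, v_max=4.0, seed=0):
    """Binary PSO sketch: velocities are real-valued, positions are bits.
    All parameter values here are illustrative assumptions."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_attrs)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_attrs for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_attrs):
                # Pull the velocity toward the personal and global bests.
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # The sigmoid of the velocity is the probability of bit = 1.
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Usage: 10 attributes, of which indices 0, 3 and 7 are assumed relevant.
RELEVANT = [0, 3, 7]
best_mask, best_fit = binary_pso(10, lambda m: toy_fitness(m, RELEVANT))
print(best_mask, round(best_fit, 3))
```

In a real experiment the fitness would be the predictive accuracy of a classifier trained on the selected attributes. Folding subset size into a single weighted score, as done here, is only one way to handle the two comparison criteria and is not claimed to be the authors' approach.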

Related Organizations

University of Kent
United Kingdom

Keywords

QH324.2, QA76

Impact by BIP!

- Selected citations: 41. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Open Access route: Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering