Publication . Conference object . 2010

A Population-Based Incremental Learning approach to microarray gene expression feature selection

Meir Perez; David M. Rubin; Tshilidzi Marwala; Lesley Scott; Wendy Stevens
Published: 01 Nov 2010
Publisher: IEEE

Identifying a differentially expressed set of genes in microarray data analysis is essential both for novel oncogenic pathway identification and for automated diagnostic purposes. This paper assesses the effectiveness of the Population-Based Incremental Learning (PBIL) algorithm in identifying a class-differentiating gene set for sample classification. PBIL iteratively evolves the genome of a search population by updating a probability vector, guided by the extent of class-separability demonstrated by a combination of features. PBIL is compared both to a standard Genetic Algorithm (GA) and to an Analysis of Variance (ANOVA) approach. The algorithms are tested on a publicly available three-class leukaemia microarray data set (n=72). Over 30 repeats of both GA and PBIL, PBIL found an average feature-space separability of 97.04%, while GA achieved an average class-separability of 96.39%. PBIL also found smaller feature spaces than GA (326 genes for PBIL versus 2652 for GA), thus excluding a large percentage of redundant features. On average, it also outperformed the ANOVA approach for n = 2652 (91.62%), q < 0.05 (94.44%), q < 0.01 (93.06%) and q < 0.005 (95.83%). The best PBIL run (98.61%) even outperformed ANOVA for n = 326 and q < 0.001 (both 97.22%). PBIL's performance is ascribed to its ability to direct the search not only towards the optimal solution but also away from the worst.
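The probability-vector update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it follows one common formulation of PBIL (a positive update towards the best individual, plus a negative-learning-rate update on the bits where the best and worst individuals disagree) and omits mutation. The function name `pbil_select`, the learning rates, the population size, and the toy fitness function are all assumptions chosen for illustration. In particular, `toy_fitness` is a hypothetical stand-in for the class-separability measure used in the paper.

```python
import random

def pbil_select(fitness, n_features, pop_size=30, generations=60,
                lr=0.1, neg_lr=0.075, seed=0):
    """Illustrative PBIL: evolve a probability vector over binary feature masks.

    All parameter defaults are assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    prob = [0.5] * n_features  # prob[i] = P(feature i is selected)
    best_mask, best_score = None, float("-inf")
    for _ in range(generations):
        # Sample a population of binary masks from the probability vector.
        population = [[1 if rng.random() < p else 0 for p in prob]
                      for _ in range(pop_size)]
        # Sort by fitness only, so ties never compare the masks themselves.
        scored = sorted(((fitness(m), m) for m in population),
                        key=lambda t: t[0])
        worst = scored[0][1]
        top_score, top = scored[-1]
        if top_score > best_score:
            best_score, best_mask = top_score, top
        for i in range(n_features):
            # Positive update: move towards this generation's best mask.
            prob[i] = (1 - lr) * prob[i] + lr * top[i]
            # Negative update: where best and worst disagree, move further
            # towards the best, i.e. away from the worst.
            if top[i] != worst[i]:
                prob[i] = (1 - neg_lr) * prob[i] + neg_lr * top[i]
    return best_mask, best_score, prob

# Hypothetical toy problem: features 0-2 are "informative"; every selected
# feature costs 0.1, so small masks that keep the informative ones score best.
INFORMATIVE = {0, 1, 2}

def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)

mask, score, prob = pbil_select(toy_fitness, n_features=20)
print("selected features:", [i for i, bit in enumerate(mask) if bit])
print("best score:", round(score, 2))
```

In a real feature-selection setting, `toy_fitness` would be replaced by a measure of how well the selected genes separate the sample classes. The size penalty here is only a device for the toy problem and is not taken from the paper.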

Subjects by Vocabulary

Microsoft Academic Graph classification: Feature selection; Computer science; Population; Population-based incremental learning; Sample classification; Microarray gene expression; Set (abstract data type); Artificial intelligence; Probability vector; Pattern recognition; Genetic algorithm

Providers: Crossref