publication . Other literature type . Preprint . 2017

Benchmarking Relief-Based Feature Selection Methods for Bioinformatics Data Mining

Urbanowicz, Ryan J.; Olson, Randal S.; Schmitt, Peter; Meeker, Melissa; Moore, Jason H.
Open Access English
  • Published: 22 Nov 2017
Abstract
Modern biomedical data mining requires feature selection methods that can (1) be applied to large scale feature spaces (e.g. ‘omics’ data), (2) function in noisy problems, (3) detect complex patterns of association (e.g. gene-gene interactions), (4) be flexibly adapted to various problem domains and data types (e.g. genetic variants, gene expression, and clinical data) and (5) be computationally tractable. To that end, this work examines a set of filter-style feature selection algorithms inspired by the ‘Relief’ algorithm, i.e. Relief-Based algorithms (RBAs). We implement and expand these RBAs in an open source framework called ReBATE (Relief-Based Algorithm Tr...
Subjects
free text keywords: Article, Computer Science - Learning
Related Organizations
Funded by
NIH| Penn integrated Human Pancreas procurement and Analysis Program
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 1UC4DK112217-01
  • Funding stream: NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES
NIH| Approaches to Genetic Heterogeneity of Obstructive Sleep Apnea
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5R01HL134015-03
  • Funding stream: NATIONAL HEART, LUNG, AND BLOOD INSTITUTE
NIH| Bioinformatics Approaches to Visual Disease Genetics
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 1R01EY022300-01
  • Funding stream: NATIONAL EYE INSTITUTE
NIH| VASCULAR REFLEX MECHANISMS IN EXERCISE AND HEART FAILURE
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 1F32HL009012-01
  • Funding stream: NATIONAL HEART, LUNG, AND BLOOD INSTITUTE
NIH| CORE--Endocrine and Reproduction Disruption
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5P30ES013508-04
  • Funding stream: NATIONAL INSTITUTE OF ENVIRONMENTAL HEALTH SCIENCES

Ryan J Urbanowicz, Jeff Kiralis, Nicholas A Sinnott-Armstrong, Tamra Heberling, Jonathan M Fisher, and Jason H Moore. GAMETES: a fast, direct algorithm for generating pure, strict, epistatic models with random architectures. BioData Mining, 5(1):16, 2012c.

Ryan J Urbanowicz, Gediminas Bertasius, and Jason H Moore. An extended Michigan-style learning classifier system for flexible supervised learning, classification, and data mining. In International Conference on Parallel Problem Solving from Nature, pages 211–221. Springer, 2014.
