Family Rank: a graphical domain knowledge informed feature ranking algorithm

Publication: Article, 19 May 2021, English
Publisher: Oxford University Press (OUP)
Journal: Bioinformatics, volume 37, pages 3626-3631 (ISSN: 1367-4803, eISSN: 1367-4811)

Authors: Michelle Saul; Valentin Dinu

doi: 10.1093/bioinformatics/btab387

pmid: 34009295

Abstract

Motivation: When designing prediction models built with many features and relatively small sample sizes, feature selection methods often overfit the training data, leading to the selection of irrelevant features. One way to potentially mitigate overfitting is to incorporate domain knowledge during feature selection. Here, a feature ranking algorithm called ‘Family Rank’ is presented, in which features are ranked based on a combination of graphical domain knowledge and feature scores computed from empirical data.

Results: A simulated dataset is used to demonstrate a scenario in which Family Rank outperforms other state-of-the-art graph-based ranking algorithms, decreasing the sample size needed to detect true predictors by 2- to 3-fold. An example from oncology is then used to explore a real-world application of Family Rank.

Availability and implementation: An implementation of Family Rank is freely available at https://cran.r-project.org/package=FamilyRank.

Supplementary information: Supplementary data are available at Bioinformatics online.

Related Organizations

Caris Life Sciences (United States)
Arizona State University (United States)

Impact by BIP!

Selected citations: 3
Popularity: Average
Influence: Average
Impulse: Average

Open access route: gold

Fields of Science

engineering and technology

medical engineering

Related to Research communities

ELIXIR GR