Bayesian consensus clustering

Publication type: Article, Preprint

Published: 28 Aug 2013

Embargo end date: 01 Jan 2013

Language: English

Publisher: Oxford University Press (OUP)

Journal: Bioinformatics, volume 29, pages 2610-2616 (ISSN: 1367-4803, eISSN: 1367-4811)

Funded by: NIH | Bayesian Methods for Asse...

Authors: Eric F. Lock; David B. Dunson

doi: 10.1093/bioinformatics/btt425, 10.48550/arxiv.1302.7280

pmid: 23990412

pmc: PMC3789539

arXiv: 1302.7280

Abstract

Motivation: In biomedical research a growing number of platforms and technologies are used to measure diverse but related information, and the task of clustering a set of objects based on multiple sources of data arises in several applications. Most current approaches to multisource clustering either independently determine a separate clustering for each data source or determine a single ‘joint’ clustering for all data sources. There is a need for more flexible approaches that simultaneously model the dependence and the heterogeneity of the data sources.

Results: We propose an integrative statistical model that permits a separate clustering of the objects for each data source. These separate clusterings adhere loosely to an overall consensus clustering, and hence they are not independent. We describe a computationally scalable Bayesian framework for simultaneous estimation of both the consensus clustering and the source-specific clusterings. We demonstrate that this flexible approach is more robust than joint clustering of all data sources, and is more powerful than clustering each data source independently. We present an application to subtype identification of breast cancer tumor samples using publicly available data from The Cancer Genome Atlas.

Availability: R code with instructions and examples is available at http://people.duke.edu/%7Eel113/software.html.

Contact: Eric.Lock@duke.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
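
The idea in the Results paragraph, that each source-specific clustering adheres loosely to one consensus clustering, can be illustrated with a small simulation. The sketch below is not the authors' R code. It assumes a hypothetical per-source adherence probability `alpha`: each object keeps its consensus label with probability `alpha` and otherwise moves uniformly at random to one of the other clusters. The function names, the number of clusters, and the two `alpha` values are all illustrative choices.

```python
import random


def sample_source_labels(consensus, alpha, n_clusters, rng):
    """Draw one source-specific label per object (illustrative assumption).

    Each object keeps its consensus label with probability `alpha`;
    otherwise it is assigned uniformly to one of the other clusters.
    Requires n_clusters >= 2.
    """
    labels = []
    for c in consensus:
        if rng.random() < alpha:
            labels.append(c)
        else:
            others = [k for k in range(n_clusters) if k != c]
            labels.append(rng.choice(others))
    return labels


def agreement(a, b):
    """Fraction of objects that receive the same label in both clusterings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)  # fixed seed so the run is repeatable
K = 3                   # hypothetical number of clusters
consensus = [rng.randrange(K) for _ in range(2000)]

# Two hypothetical data sources: one follows the consensus closely, one loosely.
tight = sample_source_labels(consensus, 0.95, K, rng)
loose = sample_source_labels(consensus, 0.40, K, rng)

print("tight source agreement:", agreement(consensus, tight))
print("loose source agreement:", agreement(consensus, loose))
```

Under this assumption the two sources are dependent through the shared consensus labels, yet each can deviate from it to a different degree, which is the middle ground between fully separate and fully joint clustering that the abstract describes.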

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Models, Statistical, Gene Dosage, Bayes Theorem, Machine Learning (stat.ML), Genomics, Machine Learning (cs.LG), Statistics - Machine Learning, Cluster Analysis, Humans, Algorithms

Impact by BIP!

- Selected citations: 246. These citations are derived from selected sources. This is an alternative to the "Influence" indicator.
- Popularity: Top 1%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Top 1%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.