Efficient utility-based clustering over high dimensional partition spaces

Article, Other literature type English OPEN
Liverani, Silvia ; Anderson, Paul E. ; Edwards, Kieron D. ; Millar, A. J. ; Smith, J. Q. (2009)
  • Publisher: Int Soc Bayesian Analysis
  • Journal: (issn: 1936-0975)
  • Related identifiers: doi: 10.1214/09-BA420
  • Subject: Circadian Expression Profiles | QA | Bayesian | Genetics | Posterior Probability Distribution
    acm: ComputingMethodologies_PATTERNRECOGNITION

The number of partitions of even a moderately sized dataset is huge, so in model-based clustering a comprehensive search for the highest scoring (MAP) partition is usually impossible, even when Bayes factors have a closed form. However, when each cluster in a partition has a signature and it is known that some signatures are of scientific interest whilst others are not, it is possible, within a Bayesian framework, to develop search algorithms which are guided by these cluster signatures. Such algorithms can be expected to find better partitions more quickly. In this paper we develop a framework within which these ideas can be formalized. We then briefly illustrate the efficacy of the proposed guided search on a microarray time course dataset where the clustering objective is to identify clusters of genes with different types of circadian expression profiles.
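
To give a sense of the scale behind the abstract's remark about the number of partitions: the number of ways to partition a set of n labelled items is the Bell number B_n. The short Python sketch below computes B_n with the Bell triangle recurrence. It is an illustration only and is not taken from the paper; the function name `bell_number` and the sample sizes printed are assumptions chosen for this example.

```python
def bell_number(n: int) -> int:
    """Return the Bell number B_n: the number of partitions of n labelled items.

    Illustrative helper (hypothetical name, not from the paper), computed
    with the Bell triangle: each row starts with the last entry of the
    previous row, and each further entry is the sum of its left neighbour
    and the entry above that neighbour.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    row = [1]  # first row of the Bell triangle; its first entry is B_0
    for _ in range(n):
        new_row = [row[-1]]  # new row starts with the previous row's last entry
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]  # the first entry of row n is B_n


if __name__ == "__main__":
    # Sample sizes are arbitrary; they only show how quickly the count grows.
    for n in (5, 10, 20, 50):
        print(f"n = {n:2d}: {bell_number(n)} partitions")
```

Even for a few dozen items the count is far beyond exhaustive enumeration, which is why a search has to be selective about the partitions it visits.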
