Separation measures and the geometry of Bayes factor selection for classification

Smith, J. Q. ; Anderson, Paul E. ; Liverani, Silvia (2008)

Conjugacy assumptions are often used in Bayesian selection over a partition because they allow the otherwise unfeasibly large model space to be searched very quickly. The implications of such models can be analysed algebraically. We use the explicit forms of the associated Bayes factors to demonstrate that such methods can be unstable under common settings of the associated hyperparameters. We then prove that the regions of instability can be removed by setting the hyperparameters in an unconventional way. Under this family of assignments we prove that model selection is determined by an implicit separation measure: a function of the hyperparameters and the sufficient statistics of clusters in a given partition. We show that this family of separation measures has plausible properties. The methodology proposed is illustrated through the selection of clusters of longitudinal gene expression profiles.
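As a rough illustration of why conjugacy makes partition search fast, the sketch below computes a closed-form log Bayes factor for merging two clusters from their sufficient statistics alone. It is not the model used in the paper: it assumes a deliberately simplified Normal likelihood with known variance and a Normal prior on each cluster mean. The function names and the hyperparameters m0, tau2 and sigma2 are hypothetical choices made only for this example.

```python
import math


def log_marginal(ys, m0=0.0, tau2=100.0, sigma2=1.0):
    """Log marginal likelihood of one cluster under an assumed toy model:
    y_i | mu ~ N(mu, sigma2), mu ~ N(m0, tau2), with sigma2 known.
    Only the count, sum and sum of squares of the cluster are needed."""
    n = len(ys)
    s = sum(y - m0 for y in ys)            # sum of deviations from the prior mean
    ss = sum((y - m0) ** 2 for y in ys)    # sum of squared deviations
    # Quadratic form of the marginal covariance sigma2*I + tau2*11^T
    quad = ss - tau2 / (sigma2 + n * tau2) * s * s
    return (-0.5 * n * math.log(2.0 * math.pi * sigma2)
            - 0.5 * math.log(1.0 + n * tau2 / sigma2)
            - 0.5 * quad / sigma2)


def log_bayes_factor_merge(a, b, **hyper):
    """Log Bayes factor of 'a and b share one cluster' against
    'a and b are separate clusters' under the toy model above."""
    merged = list(a) + list(b)
    return (log_marginal(merged, **hyper)
            - log_marginal(a, **hyper)
            - log_marginal(b, **hyper))


if __name__ == "__main__":
    near = log_bayes_factor_merge([0.0, 0.1, -0.1], [0.05, -0.05, 0.0])
    far = log_bayes_factor_merge([0.0, 0.1, -0.1], [5.0, 5.1, 4.9])
    print("similar clusters:", near)
    print("well-separated clusters:", far)
```

Positive values favour merging and negative values favour keeping the clusters apart. Because the score depends on the data only through per-cluster summaries and on the chosen hyperparameters, varying tau2 in this toy setting is one way to see how strongly the prior scale can influence which partition is selected.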
