
doi: 10.2307/2532201
Summary: The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the criterion of \textit{H. P. Friedman} and \textit{J. Rubin} [J. Am. Stat. Assoc. 62, 1159-1178 (1967)]. However, as currently implemented, it does not allow the specification of which features (orientation, size, and shape) are to be common to all clusters and which may differ between clusters. Also, it is restricted to Gaussian distributions and it does not allow for noise. We propose ways of overcoming these limitations. A reparameterization of the covariance matrix allows us to specify that some, but not all, features be the same for all clusters. A practical framework for non-Gaussian clustering is outlined, and a means of incorporating noise in the form of a Poisson process is described. An approximate Bayesian method for choosing the number of clusters is given. The performance of the proposed methods is studied by simulation, with encouraging results. The methods are applied to the analysis of a data set arising in the study of diabetes, and the results seem better than those of previous analyses. A magnetic resonance image (MRI) of the brain is also analyzed, and the methods appear successful in extracting the main features of anatomical interest. The methods described here have been implemented in both Fortran and S-PLUS versions, and the software is freely available through StatLib.
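The summary names three features of a cluster covariance matrix (orientation, size, and shape) but does not give the reparameterization itself. As a minimal sketch, assuming the reparameterization is an eigenvalue decomposition of the form Sigma = lambda * D * A * D^T, the following NumPy code splits a covariance matrix into those three parts and rebuilds it. The function names `decompose_covariance` and `compose_covariance` are hypothetical, and taking the largest eigenvalue as the "size" is an assumed normalization, not something stated in the summary.

```python
import numpy as np


def decompose_covariance(sigma):
    """Split a symmetric positive-definite matrix into (size, shape, orientation).

    Assumed convention (not stated in the summary): size is the largest
    eigenvalue, shape holds the eigenvalues divided by that size, and
    orientation is the matrix of eigenvectors.
    """
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    size = eigvals[0]                          # scalar "size" (lambda)
    shape = eigvals / size                     # diagonal of A, first entry is 1
    return size, shape, eigvecs                # eigvecs plays the role of D


def compose_covariance(size, shape, orientation):
    """Rebuild Sigma = size * D * diag(shape) * D^T."""
    return size * orientation @ np.diag(shape) @ orientation.T


sigma = np.array([[4.0, 1.0],
                  [1.0, 2.0]])
size, shape, orientation = decompose_covariance(sigma)
rebuilt = compose_covariance(size, shape, orientation)

# A second cluster that shares shape and orientation but is twice as large.
sigma_big = compose_covariance(2.0 * size, shape, orientation)
```

Under this sketch, holding `shape` and `orientation` fixed while letting `size` vary gives clusters that differ only in volume, which is one way to read "some, but not all, features be the same for all clusters".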
noise, Classification and discrimination; cluster analysis (statistical aspects), hierarchical agglomeration, diabetes, Bayesian inference, iterative relocation, reparameterization of the covariance matrix, approximate Bayesian method, Poisson process, magnetic resonance image of the brain, simulation, Applications of statistics to biology and medical sciences; meta analysis, Bayes factors, sum of squares criterion, mixture models, classification maximum likelihood approach, clustering algorithms, non-Gaussian clustering
| selected citations | 2K |
| popularity | Top 0.1% |
| influence | Top 0.01% |
| impulse | Top 10% |
