
doi: 10.1002/wics.1219
Abstract: The need for classification arises in most scientific pursuits. Typically, there is interest in ‘classifying’ an entity, say, an individual or object, on the basis of some characteristics (feature variables) measured on the entity. This article focuses on the form of classification known as supervised classification or discriminant analysis. It is applicable in situations where there are data of known origin with respect to the predefined classes, from which a classifier can be constructed to assign an unclassified entity to one of these classes. We consider nonparametric and parametric approaches to the construction of classifiers. Consideration is given to recent results on the formation of classifiers in situations where the number of variables p is very large relative to the number of observations n. Methods for estimating the error rates of a classifier are described, including the situation where the classifier has been formed in some optimal way from a relatively small subset of the p available variables. In such situations care has to be taken to avoid the selection bias inherent in the ordinarily used error‐rate estimators.

WIREs Comput Stat 2012. doi: 10.1002/wics.1219

This article is categorized under: Statistical and Graphical Methods of Data Analysis > Multivariate Analysis
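The selection bias mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the pure-noise data, the t-like ranking statistic, the nearest-centroid classifier, and the helper names `top_features`, `nearest_centroid_predict`, and `cv_error` are all assumptions chosen for illustration. The sketch compares a cross-validated error rate when the variable subset is chosen once from all n observations (selection outside the folds) with the rate obtained when the subset is re-chosen within each training fold.

```python
import numpy as np

def top_features(X, y, k):
    """Indices of the k variables with the largest absolute two-sample t-like statistic."""
    a, b = X[y == 0], X[y == 1]
    diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.argsort(-np.abs(diff / se))[:k]

def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test row to the class whose training mean is closer (Euclidean)."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_test - c0) ** 2).sum(axis=1)
    d1 = ((X_test - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

def cv_error(X, y, k, folds, select_inside):
    """Cross-validated error rate; variables are selected inside or outside the folds."""
    n = len(y)
    # Outside-the-folds selection uses all n observations, including future test cases.
    fixed = None if select_inside else top_features(X, y, k)
    errors = 0
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        feats = top_features(X[train], y[train], k) if select_inside else fixed
        pred = nearest_centroid_predict(
            X[train][:, feats], y[train], X[test_idx][:, feats]
        )
        errors += int((pred != y[test_idx]).sum())
    return errors / n

# Assumed toy setting: p much larger than n, and no variable carries class information,
# so an honest error-rate estimate should be near 0.5.
rng = np.random.default_rng(0)
n, p, k = 40, 2000, 10
X = rng.standard_normal((n, p))
y = np.repeat([0, 1], n // 2)
folds = np.array_split(rng.permutation(n), 10)

biased = cv_error(X, y, k, folds, select_inside=False)
unbiased = cv_error(X, y, k, folds, select_inside=True)
print(f"selection outside CV: {biased:.3f}")
print(f"selection inside CV:  {unbiased:.3f}")
```

Because the labels here are independent of the variables, any apparent accuracy under outside-the-folds selection comes from the selection step having already seen the held-out observations.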
Bayes’ rule of allocation, High-dimensional data, Fisher’s linear discriminant function, Parametric and nonparametric rules, 2613 Statistics and Probability, Error-rate estimation, 310
