
arXiv: 1612.07971
handle: 2268/204699
Quadratic and linear discriminant analysis (QDA and LDA) are the most commonly applied classification rules under normality. In QDA, a separate covariance matrix is estimated for each group. If there are more variables than observations in the groups, the usual estimates are singular and can no longer be used. Assuming homoscedasticity, as in LDA, reduces the number of parameters to estimate. This rather strong assumption is, however, rarely verified in practice. Regularized discriminant techniques that are computable in high dimension and cover the path between the two extremes, QDA and LDA, have been proposed in the literature. However, these procedures rely on sample covariance matrices. As such, they become inappropriate in the presence of cellwise outliers, a type of outlier that is very likely to occur in high-dimensional datasets. In this paper, we propose cellwise robust counterparts of these regularized discriminant techniques by inserting cellwise robust covariance matrices. Our methodology results in a family of discriminant methods that (1) are robust against outlying cells, (2) cover the gap between LDA and QDA, and (3) are computable in high dimension. The good performance of the new methods is illustrated through simulated and real data examples. As a by-product, visual tools are provided for the detection of outliers.
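The idea in the abstract can be illustrated with a minimal sketch; it is not the paper's estimator. It makes several assumptions that do not come from the record: the cellwise robust covariance is built from Spearman rank correlations rescaled by MAD scales, the path between QDA and LDA is a convex blend controlled by a hypothetical parameter `alpha` (0 gives group-specific matrices, 1 gives a common pooled matrix), and a simple ridge term stands in for whatever penalized estimation the authors actually use to keep the matrices invertible in high dimension. The function names `robust_cov`, `fit`, and `predict` are illustrative only.

```python
import numpy as np

def robust_cov(X):
    """Cellwise robust covariance sketch: Spearman correlation rescaled by MAD scales."""
    med = np.median(X, axis=0)
    scales = 1.4826 * np.median(np.abs(X - med), axis=0)  # MAD, consistent at the normal
    ranks = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)  # column-wise ranks
    R = np.corrcoef(ranks, rowvar=False)  # Pearson correlation of ranks = Spearman
    return R * np.outer(scales, scales)

def fit(X, y, alpha, ridge=1e-2):
    """Blend group covariances toward the pooled one; alpha=0 is QDA-like, alpha=1 is LDA-like."""
    classes = np.unique(y)
    centers, covs, priors = [], [], []
    for k in classes:
        Xk = X[y == k]
        centers.append(np.median(Xk, axis=0))  # coordinate-wise median as robust location
        covs.append(robust_cov(Xk))
        priors.append(len(Xk) / len(y))
    pooled = sum(p * S for p, S in zip(priors, covs))
    eye = np.eye(X.shape[1])
    # The ridge term is an assumed stand-in for a penalized precision-matrix estimate.
    blended = [(1 - alpha) * S + alpha * pooled + ridge * eye for S in covs]
    return classes, centers, blended, priors

def predict(model, X):
    """Assign each row to the group with the largest Gaussian discriminant score."""
    classes, centers, covs, priors = model
    scores = []
    for m, S, p in zip(centers, covs, priors):
        diff = X - m
        _, logdet = np.linalg.slogdet(S)
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(S), diff)
        scores.append(np.log(p) - 0.5 * logdet - 0.5 * maha)
    return classes[np.argmax(np.vstack(scores), axis=0)]

# Two Gaussian groups; about 5% of the training cells are replaced by a gross outlier.
rng = np.random.default_rng(0)
X_clean = np.vstack([rng.normal(size=(60, 5)), rng.normal(size=(60, 5)) + 3.0])
y = np.array([0] * 60 + [1] * 60)
X_train = X_clean.copy()
X_train[rng.random(X_train.shape) < 0.05] = 50.0

lda_like = fit(X_train, y, alpha=1.0)
qda_like = fit(X_train, y, alpha=0.0)
accuracy = np.mean(predict(lda_like, X_clean) == y)
```

Because each matrix entry depends only on robust scales and ranks of the columns involved, a single contaminated cell affects the estimate far less than it would affect a sample covariance, which is the property the abstract relies on.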
FOS: Computer and information sciences, Technology, Science & Technology, Physical Sciences, Physical, chemical, mathematical & earth sciences, Statistics & Probability, Computer Science, Artificial Intelligence, Computer Science, Interdisciplinary Applications, Methodology (stat.ME), Statistics - Methodology, 0104 Statistics, 4905 Statistics, 4605 Data management and data science, Statistics, Mathematics, Computer science, Classification, Discriminant analysis, Penalized estimation, Cellwise robust precision matrix, Inverse covariance estimation, Multivariate location, Scatter, Graphical lasso
| selected citations | 8 |
| popularity | Top 10% |
| influence | Average |
| impulse | Average |
