
Abstract

Covariance matrix estimation is a fundamental statistical task in many applications, but the sample covariance matrix is suboptimal when the sample size is comparable to or less than the number of features. Such high-dimensional settings are common in modern genomics, where covariance matrix estimation is frequently employed as a method for inferring gene networks. To achieve estimation accuracy in these settings, existing methods typically either assume that the population covariance matrix has some particular structure, for example, sparsity, or apply shrinkage to better estimate the population eigenvalues. In this paper, we study a new approach to estimating high-dimensional covariance matrices. We first frame covariance matrix estimation as a compound decision problem. This motivates defining a class of decision rules and using a nonparametric empirical Bayes g-modeling approach to estimate the optimal rule in the class. Simulation results and gene network inference in a mouse RNA-seq experiment show that our approach performs comparably to, and can outperform, a number of state-of-the-art proposals.
FOS: Computer and information sciences, 62C12 (Primary) 62C25 (Secondary), \(g\)-modeling, Mathematics - Statistics Theory, Bayes Theorem, Genomics, Statistics Theory (math.ST), compound decision theory, Applications of statistics to biology and medical sciences; meta analysis, nonparametric maximum likelihood, Methodology (stat.ME), Mice, Sample Size, FOS: Mathematics, Animals, Computer Simulation, Gene Regulatory Networks, separable decision rule, Statistics - Methodology
| Indicator | Value |
| --- | --- |
| Selected citations | 2 |
| Popularity | Average |
| Influence | Average |
| Impulse | Average |
