
arXiv: 1601.06259
Linear independence testing is a fundamental information-theoretic and statistical problem that can be posed as follows: given $n$ points $\{(X_i,Y_i)\}^n_{i=1}$ from a $p+q$ dimensional multivariate distribution where $X_i \in \mathbb{R}^p$ and $Y_i \in\mathbb{R}^q$, determine whether $a^T X$ and $b^T Y$ are uncorrelated for every $a \in \mathbb{R}^p, b\in \mathbb{R}^q$ or not. We give a minimax lower bound for this problem (when $p+q,n \to \infty$, $(p+q)/n \leq \kappa < \infty$, without sparsity assumptions). In summary, our results imply that $n$ must be at least as large as $\sqrt {pq}/\|\Sigma_{XY}\|_F^2$ for any procedure (test) to have non-trivial power, where $\Sigma_{XY}$ is the cross-covariance matrix of $X,Y$. We also provide some evidence that the lower bound is tight, by connections to two-sample testing and regression in specific settings.
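To make the sample-size scaling concrete, the following minimal sketch evaluates $\sqrt{pq}/\|\Sigma_{XY}\|_F^2$ for a given cross-covariance matrix. The function names and the example matrix are hypothetical and not taken from the paper. The result ignores constants, so it should be read as an order of magnitude under the abstract's asymptotic regime rather than an exact requirement.

```python
import math

def frobenius_norm_sq(matrix):
    """Squared Frobenius norm: the sum of squared entries."""
    return sum(x * x for row in matrix for x in row)

def sample_size_threshold(sigma_xy):
    """Return sqrt(p*q) / ||Sigma_XY||_F^2 for a p-by-q cross-covariance.

    Hypothetical helper: constants in the lower bound are ignored,
    so this is only the scaling stated in the abstract.
    """
    p = len(sigma_xy)
    q = len(sigma_xy[0])
    signal = frobenius_norm_sq(sigma_xy)
    if signal == 0:
        # Zero cross-covariance is the null hypothesis: nothing to detect.
        return math.inf
    return math.sqrt(p * q) / signal

# Illustrative (made-up) cross-covariance with p = 2, q = 3.
sigma_xy = [[0.1, 0.0, 0.0],
            [0.0, 0.1, 0.0]]
threshold = sample_size_threshold(sigma_xy)
print(f"n should be at least on the order of {threshold:.1f}")
```

A weaker dependence (smaller $\|\Sigma_{XY}\|_F$) or larger dimensions $p, q$ both push the threshold up, which matches the intuition that faint correlations in high dimensions need more samples to detect.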
9 pages
FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Information Theory, Information Theory (cs.IT), Mathematics - Statistics Theory, Machine Learning (stat.ML), Statistics Theory (math.ST), Machine Learning (cs.LG), Statistics - Machine Learning, FOS: Mathematics
