
doi: 10.1007/s11222-022-10148-5 , 10.17863/cam.93008 , 10.48550/arxiv.2201.00450 , 10.17863/cam.94262
pmid: 36691583
pmc: PMC9852177
arXiv: 2201.00450
doi: 10.1007/s11222-022-10148-5 , 10.17863/cam.93008 , 10.48550/arxiv.2201.00450 , 10.17863/cam.94262
pmid: 36691583
pmc: PMC9852177
arXiv: 2201.00450
AbstractThere is an increasing body of work exploring the integration of random projection into algorithms for numerical linear algebra. The primary motivation is to reduce the overall computational cost of processing large datasets. A suitably chosen random projection can be used to embed the original dataset in a lower-dimensional space such that key properties of the original dataset are retained. These algorithms are often referred to as sketching algorithms, as the projected dataset can be used as a compressed representation of the full dataset. We show that random matrix theory, in particular the Tracy–Widom law, is useful for describing the operating characteristics of sketching algorithms in the tall-data regime when the sample size n is much greater than the number of variables d. Asymptotic large sample results are of particular interest as this is the regime where sketching is most useful for data compression. In particular, we develop asymptotic approximations for the success rate in generating random subspace embeddings and the convergence probability of iterative sketching algorithms. We test a number of sketching algorithms on real large high-dimensional datasets and find that the asymptotic expressions give accurate predictions of the empirical performance.
FOS: Computer and information sciences, random projection, Sketching, Random projection, Numerical Analysis (math.NA), random matrix theory, Statistics - Computation, Article, Random matrix theory, sketching, FOS: Mathematics, Mathematics - Numerical Analysis, Computational methods for problems pertaining to statistics, Computation (stat.CO)
FOS: Computer and information sciences, random projection, Sketching, Random projection, Numerical Analysis (math.NA), random matrix theory, Statistics - Computation, Article, Random matrix theory, sketching, FOS: Mathematics, Mathematics - Numerical Analysis, Computational methods for problems pertaining to statistics, Computation (stat.CO)
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
