
arXiv: 1609.05772
This paper considers a restriction to non-negative matrix factorization in which at least one matrix factor is stochastic. That is, the elements of the matrix factors are non-negative and the columns of one matrix factor sum to 1. This restriction includes topic models, a popular method for analyzing unstructured data. It also includes a method for storing and finding pictures. The paper presents necessary and sufficient conditions on the observed data such that the factorization is unique. In addition, the paper characterizes natural bounds on the parameters for any observed data and presents a consistent least squares estimator. The results are illustrated using a topic model analysis of PhD abstracts in economics and the problem of storing and retrieving a set of pictures of faces.
24 pages, 4 figures, 5 tables
FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 3 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
