Model Selection for Non-Negative Tensor Factorization with Minimum Description Length

Publication type: Article, Other literature type

Date: 27 Jun 2019

Language: English

Publisher: MDPI AG

Journal: Entropy, volume 21, page 632 (eISSN: 1099-4300)

Authors: Yunhui Fu; Shin Matsushima; Kenji Yamanishi

doi: 10.3390/e21070632

pmid: 33267345

pmc: PMC7515125


Abstract

Non-negative tensor factorization (NTF) is a widely used multi-way analysis approach that factorizes a high-order non-negative data tensor into several non-negative factor matrices. In NTF, the non-negative rank has to be predetermined to specify the model and it greatly influences the factorized matrices. However, its value is conventionally determined by specialists’ insights or trial and error. This paper proposes a novel rank selection criterion for NTF on the basis of the minimum description length (MDL) principle. Our methodology is unique in that (1) we apply the MDL principle on tensor slices to overcome a problem caused by the imbalance between the number of elements in a data tensor and that in factor matrices, and (2) we employ the normalized maximum likelihood (NML) code-length for histogram densities. We employ synthetic and real data to empirically demonstrate that our method outperforms other criteria in terms of accuracies for estimating true ranks and for completing missing values. We further show that our method can produce ranks suitable for knowledge discovery.
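To make the size imbalance mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a three-way tensor and the common CP-style form of NTF, in which the tensor is a sum of R rank-one terms built from three non-negative factor matrices. The sizes `I`, `J`, `K`, the rank `R`, and the helper name `cp_reconstruct` are hypothetical choices made only for illustration.

```python
import numpy as np


def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from factor matrices (assumed CP form).

    X[i, j, k] = sum over r of A[i, r] * B[j, r] * C[k, r]
    """
    return np.einsum("ir,jr,kr->ijk", A, B, C)


# Hypothetical sizes and rank, chosen only for illustration.
I, J, K, R = 20, 30, 40, 3

rng = np.random.default_rng(0)
# Uniform samples in [0, 1) give non-negative factor matrices.
A = rng.random((I, R))
B = rng.random((J, R))
C = rng.random((K, R))

X = cp_reconstruct(A, B, C)

# Number of elements in the data tensor versus in the factor matrices.
n_data = X.size              # I * J * K
n_params = R * (I + J + K)   # entries across the three factor matrices

print(f"tensor elements: {n_data}, factor-matrix entries: {n_params}")
```

With these illustrative sizes the tensor holds far more elements than the factor matrices do, which is the kind of imbalance the abstract says the slice-based use of the MDL principle is meant to address.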

Related Organizations

University of Tokyo
Japan

Keywords

model selection, Science, Physics, QC1-999, Q, Astrophysics, Article, minimum description length, QB460-466, non-negative tensor factorization, normalized maximum likelihood code length

Impact by BIP!

- Selected citations: 3
- Popularity: Average
- Influence: Average
- Impulse: Average
