Preprint · Conference object · 2018

Sparsemax and Relaxed Wasserstein for Topic Sparsity

Tianyi Lin, Zhiyue Hu, Xin Guo
Open Access · English
  • Published: 22 Oct 2018
Abstract
Topic sparsity refers to the observation that individual documents usually focus on a few salient topics instead of covering a wide variety of topics, and that a real topic adopts a narrow range of terms instead of a wide coverage of the vocabulary. Understanding this topic sparsity is especially important for analyzing user-generated web content and social media, which typically take the form of extremely short posts and discussions. As the topic sparsity of individual documents in online social media increases, so does the difficulty of analyzing these online text sources with traditional methods. In this paper, we propose two novel neural models by providing sparse ...
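The sparsemax transformation named in the title is a sparse alternative to the softmax function listed among the keywords: it projects a score vector onto the probability simplex and can assign exact zeros to low-scoring entries, which is what yields sparse topic distributions. A minimal NumPy sketch of sparsemax is given below; the function name and implementation are illustrative assumptions, not the authors' code.

```python
import numpy as np

def sparsemax(z):
    """Sparsemax: Euclidean projection of a score vector z onto the simplex.

    Unlike softmax, the output can contain exact zeros. Illustrative sketch
    only; not taken from the paper's implementation.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]            # scores in decreasing order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cumsum    # coordinates kept in the support
    k_z = k[support][-1]                   # support size
    tau = (cumsum[support][-1] - 1) / k_z  # threshold subtracted from z
    return np.maximum(z - tau, 0.0)

# A peaked score vector yields a sparse distribution that still sums to 1.
print(sparsemax([2.0, 1.5, 0.1, -1.0]))   # -> [0.75, 0.25, 0., 0.]
```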
Subjects
Free-text keywords: Computer Science - Machine Learning, Computer Science - Information Retrieval, Statistics - Machine Learning, Computer science, Probabilistic logic, Data mining, Text corpus, Softmax function, Gaussian, Inference, Web content, Vocabulary, Backpropagation