Hash Embeddings for Efficient Word Representations

Preprint English OPEN
Svenstrup, Dan; Hansen, Jonas Meinertz; Winther, Ole
(2017)
  • Subject: Computer Science - Computation and Language
    acm: Computer Systems Organization: Special-Purpose and Application-Based Systems

We present hash embeddings, an efficient method for representing words in a continuous vector form. A hash embedding may be seen as an interpolation between a standard word embedding and a word embedding created using a random hash function (the hashing trick). In hash ...
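
The two endpoints mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The class name `HashEmbeddingSketch`, the helper `bucket`, all parameter names, and the use of MD5 as the hash function are hypothetical choices made for illustration. The weighted-sum step in `embed` is an assumption about one way the interpolation could be realized, since the abstract above is truncated before it describes the mechanism. Values are randomly initialized here; in practice they would be learned.

```python
import hashlib
import random


def bucket(token, seed, num_buckets):
    """Map a token to a bucket index in [0, num_buckets) deterministically."""
    digest = hashlib.md5(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little") % num_buckets


class HashEmbeddingSketch:
    """Illustrative only: names and structure are assumptions, not the paper's code."""

    def __init__(self, num_buckets, dim, num_hashes=2, num_weight_rows=1000, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        self.dim = dim
        self.num_hashes = num_hashes
        self.num_weight_rows = num_weight_rows
        # Shared pool of vectors; its size does not depend on the vocabulary size.
        self.pool = [[rng.gauss(0.0, 0.1) for _ in range(dim)]
                     for _ in range(num_buckets)]
        # Assumed: one mixing weight per hash function, stored in a hashed table.
        self.weights = [[rng.gauss(0.0, 0.5) for _ in range(num_hashes)]
                        for _ in range(num_weight_rows)]

    def hashing_trick(self, token):
        """Plain hashing trick: a single hash function picks a single vector."""
        return list(self.pool[bucket(token, 0, self.num_buckets)])

    def embed(self, token):
        """Assumed interpolation: weighted sum of vectors picked by several hashes."""
        w = self.weights[bucket(token, "w", self.num_weight_rows)]
        out = [0.0] * self.dim
        for i in range(self.num_hashes):
            vec = self.pool[bucket(token, i, self.num_buckets)]
            for j in range(self.dim):
                out[j] += w[i] * vec[j]
        return out


emb = HashEmbeddingSketch(num_buckets=50, dim=4, num_hashes=3)
vector = emb.embed("hello")
```

With a single hash function and a fixed weight of 1, `embed` reduces to `hashing_trick`. With as many buckets as vocabulary entries and no collisions, the lookup behaves like a standard embedding table.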