Remedies against the vocabulary gap in information retrieval

Doctoral thesis, Preprint English OPEN
Van Gysel, Christophe;
(2017)
  • Subject: Computer Science - Computation and Language | Computer Science - Artificial Intelligence | Computer Science - Information Retrieval
    acm: InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL

Search engines rely heavily on term-based approaches that represent queries and documents as bags of words. A text (a document or a query) is represented by a bag of its words that ignores grammar and word order but retains word frequency counts. When presented with a ...
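The bag-of-words representation described above can be illustrated with a minimal sketch. The helper names (`bag_of_words`, `shared_terms`) and the example texts are hypothetical and chosen only for illustration; tokenization here is plain whitespace splitting, which is far simpler than what a real search engine does.

```python
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def shared_terms(bag_a, bag_b):
    """Return the set of terms that occur in both bags."""
    return set(bag_a) & set(bag_b)


# Word order is discarded, so these two sentences map to the same bag.
doc_a = bag_of_words("the dog bit the man")
doc_b = bag_of_words("the man bit the dog")
same_bag = doc_a == doc_b

# Word frequency counts are retained.
the_count = doc_a["the"]

# Texts about a similar topic may share no terms at all, which is one way
# to picture the vocabulary gap named in the title (illustrative example).
query = bag_of_words("cheap flights")
document = bag_of_words("low-cost airfare deals")
overlap = shared_terms(query, document)
```

In this sketch the two sentences with opposite meanings are indistinguishable, while the query and the topically related document have an empty term overlap, so a purely term-based matcher would find nothing to match on.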
  • References (191)

    17. pyndri (https://github.com/cvangysel/pyndri) - a Python interface to the Indri search engine. [Ch. 3, 4, 8 and App. A]

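    import pyndri

    # Open an existing Indri index and configure Dirichlet-smoothed retrieval (mu = 5000).
    index = pyndri.Index('/opt/local/clueweb09')
    query_env = pyndri.QueryEnvironment(
        index, rules=('method:dirichlet,mu:5000',))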
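The rule string `method:dirichlet,mu:5000` in the snippet above selects query likelihood scoring with Dirichlet smoothing. As a rough illustration of what that setting means, the following pure-Python sketch applies the standard textbook formula p(w|d) = (tf(w, d) + mu * p(w|C)) / (|d| + mu). It is not Indri's or pyndri's implementation; the function name, the toy collection, and the query are hypothetical.

```python
import math
from collections import Counter


def dirichlet_log_likelihood(query_terms, doc_terms, collection_tf,
                             collection_len, mu=5000.0):
    """Log query likelihood of a document under Dirichlet smoothing."""
    doc_tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        # Background probability of the term in the whole collection.
        p_coll = collection_tf.get(term, 0) / collection_len
        # Smoothed document probability.
        p = (doc_tf[term] + mu * p_coll) / (doc_len + mu)
        if p == 0.0:
            # The term never occurs in the collection.
            return float("-inf")
        score += math.log(p)
    return score


# Toy collection (hypothetical data).
docs = [
    ["search", "engines", "rank", "documents"],
    ["neural", "models", "rank", "text"],
]
collection_tf = Counter(term for doc in docs for term in doc)
collection_len = sum(len(doc) for doc in docs)

query = ["search", "rank"]
scores = [
    dirichlet_log_likelihood(query, doc, collection_tf, collection_len)
    for doc in docs
]
```

With mu as large as 5000 and documents this short, the collection statistics dominate and the scores differ only slightly; the first document still scores higher because it contains both query terms.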

    [1] The knowledge-based economy. Technical report, Organisation for Economic Co-operation and Development, 1996. (Cited on page 51.)

    [2] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. (Cited on page 114.)

    [3] Q. Ai, L. Yang, J. Guo, and W. B. Croft. Analysis of the paragraph vector model for information retrieval. In ICTIR, pages 133-142. ACM, 2016. (Cited on page 13.)

    [4] Q. Ai, L. Yang, J. Guo, and W. B. Croft. Improving language estimation with the paragraph vector model for ad-hoc retrieval. In SIGIR, pages 869-872. ACM, 2016. (Cited on pages 13 and 108.)

    [5] Q. Ai, S. T. Dumais, N. Craswell, and D. Liebling. Characterizing email search using large-scale behavioral logs and surveys. In WWW, pages 1511-1520, 2017. (Cited on pages 10 and 29.)

    [6] Q. Ai, Y. Zhang, K. Bi, X. Chen, and W. B. Croft. Learning a hierarchical embedding model for personalized product search. In SIGIR, 2017. (Cited on page 137.)

    [7] J. Allan, B. Croft, A. Moffat, and M. Sanderson. Frontiers, challenges, and opportunities for information retrieval. In SIGIR Forum, volume 46, pages 2-32. ACM, 2012. (Cited on page 10.)

    [8] E. Amigó, J. Gonzalo, J. Artiles, and F. Verdejo. A comparison of extrinsic clustering evaluation metrics based on formal constraints. Information Retrieval, 12(4):461-486, 2009. ISSN 1386-4564. (Cited on page 80.)
