# Let's Replace Term-Based Retrieval with k-NN Search

- Published: 31 Oct 2016
- Publisher: ACM

- Masaryk University Czech Republic
- Carnegie-Mellon University United States
- Carnegie Mellon University Australia

[1] A. Andoni, P. Indyk, T. Laarhoven, I. Razenshteyn, and L. Schmidt. Practical and optimal lsh for angular distance. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 1225-1233. Curran Associates, Inc., 2015.

[2] S. Arya and D. M. Mount. Approximate nearest neighbor queries in fixed dimensions. In Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms, pages 271-280. Society for Industrial and Applied Mathematics, 1993.

[3] V. Athitsos, M. Potamias, P. Papapetrou, and G. Kollios. Nearest neighbor retrieval using distance-based hashing. In 2008 IEEE 24th International Conference on Data Engineering, pages 327-336. IEEE, 2008. [OpenAIRE]

[4] A. Berger, R. Caruana, D. Cohn, D. Freitag, and V. Mittal. Bridging the lexical chasm: Statistical approaches to answer-finding. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '00, pages 192-199, New York, NY, USA, 2000. ACM.

[5] K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft. When is “nearest neighbor” meaningful? In Database theory-ICDT'99, pages 217-235. Springer, 1999.

[6] M. W. Bilotti, P. Ogilvie, J. Callan, and E. Nyberg. Structured retrieval for question answering. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 351-358. ACM, 2007.

[7] L. Boytsov and B. Naidan. Engineering efficient and effective non-metric space library. In Similarity Search and Applications, pages 280-293. Springer, 2013. [OpenAIRE]

[8] A. Z. Broder. On the resemblance and containment of documents. In Compression and Complexity of Sequences 1997. Proceedings, pages 21-29. IEEE, 1997.

[9] G.-I. Brokos, P. Malakasiotis, and I. Androutsopoulos. Using centroids of word embeddings and word moverâA˘ Z´ s distance for biomedical document retrieval in question answering. In Proceedings of the BIONLP 2016 Workshop, 2016. [OpenAIRE]

[10] P. F. Brown, V. J. D. Pietra, S. A. D. Pietra, and R. L. Mercer. The mathematics of statistical machine translation: Parameter estimation. Computational linguistics, 19(2):263-311, 1993.

[11] D. Carmel, D. Cohen, R. Fagin, E. Farchi, M. Herscovici, Y. S. Maarek, and A. Soffer. Static index pruning for information retrieval systems. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '01, pages 43-50, New York, NY, USA, 2001. ACM. [OpenAIRE]

[12] C. Carpineto and G. Romano. A survey of automatic query expansion in information retrieval. ACM Comput. Surv., 44(1):1:1-1:50, Jan. 2012.

[13] E. Chávez, G. Navarro, R. Baeza-Yates, and J. L. Marroquín. Searching in metric spaces. ACM Comput. Surv., 33(3):273-321, 2001.

[14] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa. Natural language processing (almost) from scratch. J. Mach. Learn. Res., 12:2493-2537, Nov. 2011. [OpenAIRE]

[15] S. Ding and T. Suel. Faster top-k document retrieval using block-max indexes. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '11, pages 993-1002, New York, NY, USA, 2011. ACM. [OpenAIRE]

##### Related research

- Masaryk University Czech Republic
- Carnegie-Mellon University United States
- Carnegie Mellon University Australia

[1] A. Andoni, P. Indyk, T. Laarhoven, I. Razenshteyn, and L. Schmidt. Practical and optimal lsh for angular distance. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 1225-1233. Curran Associates, Inc., 2015.

[2] S. Arya and D. M. Mount. Approximate nearest neighbor queries in fixed dimensions. In Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms, pages 271-280. Society for Industrial and Applied Mathematics, 1993.

[3] V. Athitsos, M. Potamias, P. Papapetrou, and G. Kollios. Nearest neighbor retrieval using distance-based hashing. In 2008 IEEE 24th International Conference on Data Engineering, pages 327-336. IEEE, 2008. [OpenAIRE]

[4] A. Berger, R. Caruana, D. Cohn, D. Freitag, and V. Mittal. Bridging the lexical chasm: Statistical approaches to answer-finding. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '00, pages 192-199, New York, NY, USA, 2000. ACM.

[5] K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft. When is “nearest neighbor” meaningful? In Database theory-ICDT'99, pages 217-235. Springer, 1999.

[6] M. W. Bilotti, P. Ogilvie, J. Callan, and E. Nyberg. Structured retrieval for question answering. In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 351-358. ACM, 2007.

[7] L. Boytsov and B. Naidan. Engineering efficient and effective non-metric space library. In Similarity Search and Applications, pages 280-293. Springer, 2013. [OpenAIRE]

[8] A. Z. Broder. On the resemblance and containment of documents. In Compression and Complexity of Sequences 1997. Proceedings, pages 21-29. IEEE, 1997.

[9] G.-I. Brokos, P. Malakasiotis, and I. Androutsopoulos. Using centroids of word embeddings and word moverâA˘ Z´ s distance for biomedical document retrieval in question answering. In Proceedings of the BIONLP 2016 Workshop, 2016. [OpenAIRE]

[10] P. F. Brown, V. J. D. Pietra, S. A. D. Pietra, and R. L. Mercer. The mathematics of statistical machine translation: Parameter estimation. Computational linguistics, 19(2):263-311, 1993.

[11] D. Carmel, D. Cohen, R. Fagin, E. Farchi, M. Herscovici, Y. S. Maarek, and A. Soffer. Static index pruning for information retrieval systems. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '01, pages 43-50, New York, NY, USA, 2001. ACM. [OpenAIRE]

[12] C. Carpineto and G. Romano. A survey of automatic query expansion in information retrieval. ACM Comput. Surv., 44(1):1:1-1:50, Jan. 2012.

[13] E. Chávez, G. Navarro, R. Baeza-Yates, and J. L. Marroquín. Searching in metric spaces. ACM Comput. Surv., 33(3):273-321, 2001.

[14] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa. Natural language processing (almost) from scratch. J. Mach. Learn. Res., 12:2493-2537, Nov. 2011. [OpenAIRE]

[15] S. Ding and T. Suel. Faster top-k document retrieval using block-max indexes. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '11, pages 993-1002, New York, NY, USA, 2011. ACM. [OpenAIRE]