
Topic-based search systems retrieve items by contextualizing the information-seeking process around a topic of interest to the user. A key issue in topic-based search of text resources is how to automatically generate multiple queries that reflect the topic of interest in such a way that precision, recall, and diversity are achieved. The problem of generating topic-based queries can be effectively addressed by Multi-Objective Evolutionary Algorithms, which have shown promising results. However, two common problems with such an approach are loss of diversity and low global recall when combining results from multiple queries. This work proposes a family of Multi-Objective Genetic Programming strategies based on objective functions that attempt to maximize precision and recall while minimizing the similarity among the retrieved results. To this end, we define three novel objective functions based on result set similarity and on the information-theoretic notion of entropy. Extensive experiments allow us to conclude that while the proposed strategies significantly improve precision after a few generations, only some of them are able to maintain or improve global recall. A comparative analysis against previous strategies based on Multi-Objective Evolutionary Algorithms indicates that the proposed approach is superior in terms of precision and global recall. Furthermore, when compared to query-term-selection methods based on existing state-of-the-art term-weighting schemes, the presented Multi-Objective Genetic Programming strategies demonstrate significantly higher levels of precision, recall, and F1-score, while maintaining competitive global recall. Finally, we identify the strengths and limitations of the strategies and conclude that the choice of objectives to be maximized or minimized should be guided by the application at hand.
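To make the competing objectives concrete, the following is a minimal Python sketch of the kinds of quantities the abstract refers to: per-query precision and recall, global recall over the union of several queries' results, a result-set similarity score, and an entropy score. The abstract does not give the paper's actual objective functions, so everything here is an assumption made for illustration: the function names are hypothetical, mean pairwise Jaccard overlap stands in for "result set similarity", and Shannon entropy of how retrieved occurrences are spread over documents stands in for the entropy-based objective.

```python
from collections import Counter
from itertools import combinations
from math import log2


def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant to the topic."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def global_recall(result_sets, relevant):
    """Recall of the union of the results of all queries."""
    return recall(set().union(*result_sets), relevant)


def mean_jaccard(result_sets):
    """Illustrative similarity objective (assumed): mean pairwise Jaccard overlap."""
    pairs = list(combinations(result_sets, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        union = a | b
        total += len(a & b) / len(union) if union else 0.0
    return total / len(pairs)


def retrieval_entropy(result_sets):
    """Illustrative entropy objective (assumed): Shannon entropy, in bits,
    of the distribution of retrieved occurrences over documents."""
    counts = Counter(doc for rs in result_sets for doc in rs)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Toy data: document identifiers are arbitrary integers.
relevant_docs = {1, 2, 3, 4}
query_results = [{1, 2, 5}, {2, 3, 6}]

per_query = [(precision(r, relevant_docs), recall(r, relevant_docs))
             for r in query_results]
overall_recall = global_recall(query_results, relevant_docs)
overlap = mean_jaccard(query_results)
spread = retrieval_entropy(query_results)
```

In this toy setting, two queries that each reach only half of the relevant documents together cover more of them, which is why global recall is tracked separately from per-query recall. A multi-objective strategy would maximize precision and recall while minimizing a similarity score such as `overlap`; whether an entropy score is maximized or minimized depends on how it is defined, which the abstract leaves open.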
Artificial intelligence, LEARNING COMPLEX QUERIES, Topic-based search, Genetic Programming, Weighting, Genetic programming, Selection (genetic algorithm), https://purl.org/becyt/ford/1.2, Similarity (geometry), Diversity preservation, Multi-Objective Optimization, Physics, GLOBAL RECALL, INFORMATION RETRIEVAL, Global recall, FOS: Philosophy, ethics and religion, Multi-objective genetic programming, Computational Theory and Mathematics, Application of Genetic Programming in Machine Learning, Physical Sciences, Medicine, DIVERSITY PRESERVATION, Radiology, Quantum mechanics, Semantic Genetic Programming, Artificial Intelligence, Automatic query formulation, Machine learning, Image (mathematics), Entropy (arrow of time), Information retrieval, Swarm Intelligence Optimization Algorithms, https://purl.org/becyt/ford/1, Data mining, TOPIC-BASED SEARCH, Precision and recall, Global Optimization, MULTI-OBJECTIVE GENETIC PROGRAMMING, Linguistics, QA75.5-76.95, Computer science, INFORMATION-THEORETIC FITNESS FUNCTIONS, Philosophy, AUTOMATIC QUERY FORMULATION, Electronic computers. Computer science, Computer Science, Nature-Inspired Algorithms, FOS: Languages and literature, Recall, DIVERSITY MAXIMIZATION, Multiobjective Optimization in Evolutionary Algorithms
