
The number of research articles in the biomedical domain has increased exponentially, making it increasingly difficult for biologists to manually find all the information they need. Information retrieval technologies can help retrieve the information users need automatically. However, applying these technologies directly to the biomedical domain is a great challenge because of domain-specific characteristics such as the abundance of terminology. To improve the effectiveness of biomedical information retrieval, we propose a novel framework based on a state-of-the-art family of information retrieval methods, learning to rank, which has proved effective at ranking documents by their degree of relevance. Within the framework, we tackle the abundance of terminology by constructing ranking models that not only retrieve the most relevant documents but also diversify the search results, increasing the completeness of the result list for a given query. For model training, we propose two novel document labeling strategies and combine several traditional retrieval models as learning features. We also investigate the usefulness of different learning to rank approaches in our framework. Experimental results on the TREC Genomics datasets demonstrate that the proposed framework is effective in improving the performance of biomedical information retrieval.
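To make concrete the idea of combining several traditional retrieval models as learning features, the following is a minimal sketch of a pairwise learning to rank model. It is not the framework described above: the linear perceptron-style learner, the three unnamed feature columns, the toy scores, and the graded labels are all assumptions chosen for illustration, and diversification of the result list is not modelled.

```python
from typing import List, Sequence


def train_pairwise_ranker(
    docs: Sequence[Sequence[float]],
    labels: Sequence[int],
    epochs: int = 50,
    lr: float = 0.1,
) -> List[float]:
    """Fit a linear scoring function with perceptron-style pairwise updates.

    Each row of `docs` holds scores from several retrieval models for one
    document (hypothetical features); `labels` are graded relevance labels.
    """
    weights = [0.0] * len(docs[0])
    # Every (i, j) pair where document i should be ranked above document j.
    pairs = [
        (i, j)
        for i in range(len(docs))
        for j in range(len(docs))
        if labels[i] > labels[j]
    ]
    for _ in range(epochs):
        for i, j in pairs:
            diff = [a - b for a, b in zip(docs[i], docs[j])]
            margin = sum(w * d for w, d in zip(weights, diff))
            if margin <= 0:  # pair is misordered (or tied): nudge the weights
                weights = [w + lr * d for w, d in zip(weights, diff)]
    return weights


def rank(docs: Sequence[Sequence[float]], weights: Sequence[float]) -> List[int]:
    """Return document indices sorted by descending learned score."""
    scores = [sum(w * x for w, x in zip(weights, doc)) for doc in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy data for a single query (assumed values, not from TREC Genomics).
# Columns: scores from three unspecified traditional retrieval models.
docs = [
    [0.9, 0.8, 0.7],  # highly relevant
    [0.2, 0.9, 0.1],  # not relevant, but one model scores it highly
    [0.6, 0.4, 0.5],  # partially relevant
    [0.1, 0.3, 0.2],  # not relevant
]
labels = [2, 0, 1, 0]

weights = train_pairwise_ranker(docs, labels)
ranking = rank(docs, weights)
print(ranking)
```

In this toy setting the learned weights down-weight the second feature, which is misleading for document 1, so the two relevant documents end up at the top of the list. A real system would train over many queries and would more likely use an established learning to rank implementation.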
