Because of the complex structure of the Web, most focused crawling approaches rely on a local search algorithm, which explores only pages within a sub-graph of the Web. In addition, Web pages often cover multiple topics, which makes it difficult to determine the relevance of a page to a given topic. To address these two issues, this paper presents a new hybrid approach to focused crawling based on meta-search and the VIPS (VIsion-based Page Segmentation) algorithm. Meta-search gives the crawler a wider crawling range than traditional local search algorithms. To obtain better recall and precision, a VIPS-based algorithm computes the relevance of a Web page by first partitioning the page into a set of blocks that reflect its semantic structure. After a short review of related work, we discuss the system architecture of the hybrid focused crawler and then present the framework of the hybrid focused crawling approach.
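The block-level relevance idea can be illustrated with a minimal sketch. The abstract only states that the page is first partitioned into blocks, so everything else here is an assumption made for illustration: the blocks are supplied as plain strings (the VIPS segmentation itself is not implemented), relevance is measured with cosine similarity over raw term counts, and the page score is taken as the maximum block score. The names `term_vector`, `cosine`, and `page_relevance` are hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words term counts (a deliberately simple tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def page_relevance(blocks, topic):
    """Score a page by its best-matching block (assumed aggregation rule)."""
    topic_vec = term_vector(topic)
    return max((cosine(term_vector(b), topic_vec) for b in blocks), default=0.0)


# Hypothetical multi-topic page; the strings stand in for segmenter output.
blocks = [
    "focused crawler topic relevance",
    "cheap flights hotel deals weather sports",
]
topic = "focused crawler"

block_score = page_relevance(blocks, topic)
whole_page_score = cosine(term_vector(" ".join(blocks)), term_vector(topic))
print(f"best block: {block_score:.3f}, whole page: {whole_page_score:.3f}")
```

In this toy case the on-topic block scores higher than the page treated as a single document, because the unrelated block no longer dilutes the term vector. Other aggregation rules (for example, a weighted sum over blocks) would fit the same structure.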
citations | This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 7
popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average
influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average
impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average