Metadata Propagation in the Web Using Co-Citations

Type: Article, Conference object
Published: 18 Oct 2005
Publisher: IEEE
Journal: The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05)

Authors: Camille Prime-Claverie; Michel Beigbeder; Thierry Lafouge

doi: 10.1109/wi.2005.95


Abstract

Given the great heterogeneity of the World Wide Web, using metadata on the search-engine side seems a promising direction for information retrieval. However, because manual qualification at Web scale is not feasible, this direction has been little explored. We propose a semi-automatic method for propagating metadata. In the first step, homogeneous corpora are extracted. In our study we used the following properties: the authority type, the site type, the information type, and the page type. This first step is carried out by a clustering that uses a similarity measure based on the co-citation frequency between pages. Given the cluster hierarchy, the second step selects a small number of documents to be manually qualified and propagates the resulting metadata values to the other documents belonging to the same cluster. A qualitative evaluation and a preliminary study of the scalability of this method are presented.
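The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity formula or the clustering algorithm, so the sketch assumes raw co-citation counts as the similarity and a single-threshold, single-link grouping as a stand-in for one cut of the cluster hierarchy. The function names, the toy link graph, and the "site type" values are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cocitation_counts(links):
    """links maps a citing page to the set of pages it links to.

    Returns {(a, b): number of pages that cite both a and b}, with a < b.
    """
    counts = defaultdict(int)
    for targets in links.values():
        for a, b in combinations(sorted(targets), 2):
            counts[(a, b)] += 1
    return dict(counts)


def cluster(pages, counts, min_cocitations):
    """Group pages connected by at least min_cocitations co-citations.

    Assumption: single-link grouping at one threshold (union-find),
    standing in for a single level of a cluster hierarchy.
    """
    parent = {p: p for p in pages}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for (a, b), n in counts.items():
        if n >= min_cocitations:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in pages:
        groups[find(p)].add(p)
    return list(groups.values())


def propagate(clusters, manual_labels):
    """Copy the value of a cluster's manually qualified page to all members."""
    result = {}
    for members in clusters:
        labelled = [p for p in sorted(members) if p in manual_labels]
        if not labelled:
            continue  # no qualified page in this cluster: leave it unlabelled
        value = manual_labels[labelled[0]]
        for p in members:
            result[p] = value
    return result


# Toy link graph (hypothetical): citing page -> cited pages.
links = {
    "hub1": {"a", "b", "c"},
    "hub2": {"a", "b"},
    "hub3": {"c", "d"},
    "hub4": {"c", "d"},
}
pages = sorted({t for targets in links.values() for t in targets})
counts = cocitation_counts(links)
clusters = cluster(pages, counts, min_cocitations=2)

# One page per cluster is qualified by hand; the values are made up.
site_type = propagate(clusters, {"a": "academic", "c": "commercial"})
```

In this toy graph, pages a and b are co-cited twice, as are c and d, so two clusters form and each inherits the value of its single hand-qualified page. A real system would need a normalized similarity and a way to choose which documents to qualify, neither of which the abstract details.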


Impact by BIP!

selected citations: 3
	These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average
	This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 10%
	This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average
	This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.


Fields of Science

social sciences

other social sciences
