
doi: 10.3233/his-200288
Co-citation analysis is a bibliometric technique for mining information on the relationships between scientific papers. Existing methods, however, rely on co-citation counting techniques that take the semantic aspect into account only marginally. The present study proposes a semantics-driven bibliometric technique for co-citation analysis that measures the semantic similarity (SS) between the titles of co-cited papers. Several computational measures rely on knowledge resources, such as the WordNet “is a” taxonomy, to quantify semantic similarity. Our proposal analyzes the SS between the titles of co-cited papers using word-based SS measures. Two major analytical experiments are performed: the first uses the benchmarks designed for testing word-based SS measures, with correlation coefficients expressing the measures' efficiency; the second exploits the DBLP citation network dataset. The results show that the semantic similarity measures perform well: the automatically estimated similarities agree closely with human judgements. Lexical similarity can therefore be used for the automatic assessment of similarity between co-cited papers. The analysis of highly repeated co-citations demonstrates that the different SS measures display very similar behaviours, with slight differences due to the distribution of the SS values they produce. Furthermore, we note that only a low percentage of co-cited papers are similar to each other.
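The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny `TOY_IS_A` taxonomy is a hypothetical stand-in for the WordNet "is a" hierarchy, the path-based word measure is only one of several possible word-based SS measures, and the title-level aggregation (symmetric average of best word matches) is an assumption made for illustration. All function names are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy "is a" taxonomy (child -> parent), standing in for WordNet.
TOY_IS_A = {
    "entity": None,
    "method": "entity",
    "data": "entity",
    "clustering": "method",
    "classification": "method",
    "graph": "data",
    "network": "graph",
    "text": "data",
}


def ancestors(word):
    """Return the chain from `word` up to the root, with `word` first."""
    chain = []
    while word is not None:
        chain.append(word)
        word = TOY_IS_A[word]
    return chain


def path_similarity(a, b):
    """Return 1 / (1 + path length) via the lowest common ancestor."""
    if a not in TOY_IS_A or b not in TOY_IS_A:
        return 0.0
    depth_in_b = {w: i for i, w in enumerate(ancestors(b))}
    for i, w in enumerate(ancestors(a)):
        if w in depth_in_b:
            return 1.0 / (1 + i + depth_in_b[w])
    return 0.0


def title_similarity(title_a, title_b):
    """Symmetric average of the best word-to-word matches of two titles.

    Words missing from the toy taxonomy are ignored (an assumption).
    """
    words_a = [w for w in title_a.lower().split() if w in TOY_IS_A]
    words_b = [w for w in title_b.lower().split() if w in TOY_IS_A]
    if not words_a or not words_b:
        return 0.0
    best_a = sum(max(path_similarity(x, y) for y in words_b) for x in words_a)
    best_b = sum(max(path_similarity(x, y) for x in words_a) for y in words_b)
    return (best_a / len(words_a) + best_b / len(words_b)) / 2


def cocitation_counts(reference_lists):
    """Count how often each unordered pair of papers is cited together."""
    counts = Counter()
    for refs in reference_lists:
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts
```

Under these assumptions, `cocitation_counts` yields the frequently co-cited pairs, and `title_similarity` can then be applied to the titles of each pair to check whether papers that are cited together are also semantically close.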
