
doi: 10.1155/2021/5577692
Massive online academic resources have made scientific study and research more convenient. However, author name ambiguity degrades the user experience when retrieving records from literature bases. Mainstream name disambiguation approaches extract features from papers and compute similarities for clustering, and they fall into two branches: clustering based on attribute features and clustering based on linkage information. Neither branch, however, achieves high performance. To improve the efficiency of literature retrieval and provide technical support for the accurate construction of literature bases, we propose a name disambiguation method based on the Graph Convolutional Network (GCN). The GCN-based disambiguation model designed in this paper combines attribute features with linkage information. We first build paper-to-paper graphs, coauthor graphs, and paper-to-author graphs for each reference item of a name. The nodes in the graphs carry attribute features and the edges carry linkage features. The graphs are then fed to a specialized GCN, which outputs a hybrid representation. Finally, we use a hierarchical clustering algorithm to divide the papers into disjoint clusters. The experimental results show that the proposed model achieves an average F1 value of 77.10% on three name disambiguation datasets. To let the model automatically select the appropriate number of convolution layers and adapt to the structure of different local graphs, we improve upon the prior GCN model with an attention mechanism. Compared with the original GCN model, this increases the average precision and F1 value by 2.05% and 0.63%, respectively. In addition, we build a bilingual dataset, BAT, which contains various forms of academic achievements and offers an alternative for future research on name disambiguation.
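The pipeline described above (graph construction, graph convolution, hierarchical clustering) can be illustrated with a small sketch. This is not the model from the paper: it assumes a single toy paper-to-paper graph, uses untrained neighborhood averaging with the standard symmetric normalization in place of a trained, attention-based GCN, and substitutes a simple average-linkage clustering with a known cluster count. The names `gcn_propagate` and `agglomerative` and all data values are hypothetical.

```python
import numpy as np


def gcn_propagate(adj, feats, num_layers=2):
    """Smooth node features over the graph with D^-1/2 (A + I) D^-1/2.

    Assumption: no learned weights or nonlinearity, so this only shows how
    linkage information (edges) mixes into attribute features (nodes).
    """
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    h = feats
    for _ in range(num_layers):
        h = norm @ h
    return h


def agglomerative(emb, num_clusters):
    """Average-linkage agglomerative clustering on Euclidean distance."""
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    clusters = [[i] for i in range(len(emb))]
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist[np.ix_(clusters[i], clusters[j])].mean()
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    labels = np.empty(len(emb), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels


# Toy data (hypothetical): six papers under one ambiguous name.
# Papers 0-2 and papers 3-5 are linked among themselves, e.g. by coauthors.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

# Two-dimensional attribute features, standing in for text embeddings.
feats = np.array([
    [1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
    [0.1, 0.9], [0.0, 1.0], [0.2, 0.8],
])

embeddings = gcn_propagate(adj, feats)
labels = agglomerative(embeddings, num_clusters=2)
print(labels)
```

In this toy setting the two linked groups of papers end up in separate clusters, each cluster standing for one real author. In practice the number of authors is unknown, so the fixed `num_clusters` here is a simplification of whatever stopping criterion a real system would use.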
