
Knowledge graphs (KGs) that follow the Linked Data principles are created daily. However, there are no holistic models for the Linked Open Data (LOD). Building these models (i.e., engineering a pipeline system) is still a big challenge in making the LOD vision come true. In this paper, we address this challenge by presenting NELLIE, a pipeline architecture that builds a chain of modules, in which each module addresses one data augmentation challenge. The ultimate goal of the proposed architecture is to build a single fused knowledge graph out of the LOD. NELLIE starts by crawling the available knowledge graphs in the LOD cloud. It then finds a set of matching KG pairs. NELLIE uses a two-phase linking approach for each pair (first an ontology matching phase, then an instance matching phase). Based on the ontology and instance matching, NELLIE fuses each pair of knowledge graphs into a single knowledge graph. The resulting fused KG is then an ideal data source for knowledge-driven applications such as search engines, question answering, digital assistants and drug discovery. Our evaluation shows an improved Hit@1 score on the link prediction task on the fused knowledge graph produced by NELLIE in up to 94.44% of the cases. Our evaluation also shows a runtime improvement of several orders of magnitude when comparing our two-phase linking approach with the estimated runtime of linking using a naïve approach.
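The two-phase linking idea can be illustrated with a minimal sketch. It is not NELLIE's implementation: the abstract does not specify the matching algorithms, so exact matching on normalized names stands in for both phases, the toy data and all function names are hypothetical, and the naïve baseline is assumed to compare every instance of one KG with every instance of the other.

```python
from itertools import product

# Toy KGs (hypothetical data): class name -> set of instance labels.
kg_a = {
    "Person": {"Ada Lovelace", "Alan Turing"},
    "City": {"London", "Paris"},
}
kg_b = {
    "person": {"alan turing", "grace hopper"},
    "city": {"paris", "berlin"},
}


def normalize(label):
    """Lowercase and trim a label so trivially different spellings compare equal."""
    return label.strip().lower()


def match_classes(a, b):
    """Phase 1 (ontology matching stand-in): pair classes with equal normalized names."""
    b_index = {normalize(name): name for name in b}
    return [
        (name, b_index[normalize(name)])
        for name in a
        if normalize(name) in b_index
    ]


def match_instances(a, b, class_pairs):
    """Phase 2 (instance matching stand-in): compare instances only inside matched classes."""
    links, comparisons = [], 0
    for class_a, class_b in class_pairs:
        for x, y in product(sorted(a[class_a]), sorted(b[class_b])):
            comparisons += 1
            if normalize(x) == normalize(y):
                links.append((x, y))
    return links, comparisons


def naive_comparisons(a, b):
    """Assumed naive baseline: every instance of one KG against every instance of the other."""
    total_a = sum(len(instances) for instances in a.values())
    total_b = sum(len(instances) for instances in b.values())
    return total_a * total_b


class_pairs = match_classes(kg_a, kg_b)
links, two_phase_count = match_instances(kg_a, kg_b, class_pairs)
naive_count = naive_comparisons(kg_a, kg_b)

print(class_pairs)
print(links)
print(two_phase_count, "comparisons (two-phase) vs", naive_count, "(naive)")
```

In this toy setting, restricting instance comparisons to matched classes halves the work; the saving grows with the number of classes, which is the intuition behind the runtime gap reported in the abstract.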
IEEE Access, volume 11, p. 84957-84973 (2023)
data fusion, semantic web, link discovery, Knowledge graphs, linked data, Electrical engineering. Electronics. Nuclear engineering, data augmentation, TK1-9971
