Gollum: A Gold Standard for Large Scale Multi Source Knowledge Graph Matching

The set of Knowledge Graphs (KGs) generated with automatic and manual approaches is constantly growing. For an integrated view and usage, these KGs need to be aligned on the schema level as well as the instance level. Approaches that tackle this multi source knowledge graph matching problem already exist, but large gold standards for evaluating their effectiveness and scalability are missing. In particular, most existing gold standards are fairly small and can be solved by matchers that match exactly two KGs (1:1), which make up the majority of existing matching systems. We close this gap by presenting Gollum, a gold standard for large-scale multi source knowledge graph matching with over 275,000 correspondences between 4,149 different KGs. The KGs were derived by applying the DBpedia extraction framework to a large wiki farm. Three variations of the gold standard are made available: (1) a version with all correspondences for evaluating unsupervised matching approaches, and two versions for evaluating supervised matching: (2) one where each KG is contained in both the train and the test set, and (3) one where each KG is contained exclusively in either the train or the test set. We plan to extend our KG track at the Ontology Alignment Evaluation Initiative (OAEI) to allow for matching systems that are specifically designed to solve the multi KG matching problem. As a first step in this direction, we evaluate multi source matching approaches that reuse two-KG (1:1) matchers from past OAEI campaigns.

Due to the size of the KG files, they are hosted at the institute:
http://data.dws.informatik.uni-mannheim.de/dbkwik/gollum/40K.tar (50.3 GB)
http://data.dws.informatik.uni-mannheim.de/dbkwik/gollum/all.tar (74.7 GB)
http://data.dws.informatik.uni-mannheim.de/dbkwik/gollum/gold.tar (25.3 GB)
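The difference between the two supervised variations, (2) and (3), can be illustrated with a minimal Python sketch. It is not the procedure used to build Gollum, which the description above does not specify. The `Correspondence` record, both function names, and the `test_fraction` and `seed` parameters are hypothetical and introduced only for illustration. The first function splits at the level of correspondences, so a KG can occur on both sides (a plain random split permits this but does not guarantee it for every KG). The second assigns each KG to exactly one side and discards correspondences that would cross the boundary.

```python
import random
from collections import namedtuple

# Hypothetical record: one correspondence between entities of two KGs.
Correspondence = namedtuple(
    "Correspondence", ["kg_a", "entity_a", "kg_b", "entity_b"]
)


def split_by_correspondence(correspondences, test_fraction=0.2, seed=0):
    """Split correspondences at random; a KG may appear in train and test."""
    shuffled = list(correspondences)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def split_by_kg(correspondences, test_fraction=0.2, seed=0):
    """Assign each KG to exactly one side; drop pairs that cross sides."""
    kgs = sorted(
        {c.kg_a for c in correspondences} | {c.kg_b for c in correspondences}
    )
    random.Random(seed).shuffle(kgs)
    cut = int(len(kgs) * (1 - test_fraction))
    train_kgs, test_kgs = set(kgs[:cut]), set(kgs[cut:])
    train = [
        c for c in correspondences
        if c.kg_a in train_kgs and c.kg_b in train_kgs
    ]
    test = [
        c for c in correspondences
        if c.kg_a in test_kgs and c.kg_b in test_kgs
    ]
    return train, test
```

Under the KG-exclusive split, a supervised matcher is tested only on KGs it has never seen during training, which is the stricter of the two settings. The price is that correspondences linking a train KG to a test KG cannot be used on either side.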

Related Organizations

University of Mannheim
Germany
