WikiDataSets: Standardized sub-graphs from Wikidata

Developing new ideas and algorithms in the fields of graph processing and relational learning requires public datasets. While Wikidata is the largest open-source knowledge graph, involving more than fifty million entities, it is larger than needed in many cases and even too large to be processed easily. Still, it is a goldmine of relevant facts and relations. Using this knowledge graph directly is time-consuming and prone to task-specific tuning, which can affect the reproducibility of results. Providing a unified framework to extract topic-specific subgraphs solves this problem and allows researchers to evaluate algorithms on common datasets. This paper presents various topic-specific subgraphs of Wikidata along with the generic Python code used to extract them. These datasets can help develop new methods of knowledge graph processing and relational learning.
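The idea of a topic-specific subgraph can be illustrated with a minimal sketch. This is not the authors' extraction code: it assumes the graph is available as an in-memory list of (head, relation, tail) triples, and it assumes a topic is defined as every entity that is an instance of the topic class or of one of its subclasses. The function names `topic_classes` and `extract_subgraph`, the toy entities, and the relation names other than the Wikidata properties P31 ("instance of") and P279 ("subclass of") are hypothetical.

```python
from collections import deque

INSTANCE_OF = "P31"   # Wikidata property "instance of"
SUBCLASS_OF = "P279"  # Wikidata property "subclass of"


def topic_classes(triples, topic):
    """Return the topic class plus every class below it via subclass-of edges."""
    children = {}
    for head, rel, tail in triples:
        if rel == SUBCLASS_OF:
            children.setdefault(tail, set()).add(head)
    seen, queue = {topic}, deque([topic])
    while queue:
        for child in children.get(queue.popleft(), ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def extract_subgraph(triples, topic):
    """Keep the entities belonging to the topic and the facts linking them."""
    classes = topic_classes(triples, topic)
    nodes = {h for h, r, t in triples if r == INSTANCE_OF and t in classes}
    edges = [(h, r, t) for h, r, t in triples if h in nodes and t in nodes]
    return nodes, edges


# Toy data for illustration only (readable labels instead of Wikidata IDs).
toy_triples = [
    ("sovereign_state", SUBCLASS_OF, "country"),
    ("France", INSTANCE_OF, "sovereign_state"),
    ("Germany", INSTANCE_OF, "country"),
    ("Paris", INSTANCE_OF, "city"),
    ("France", "shares_border_with", "Germany"),
    ("France", "capital", "Paris"),
]
nodes, edges = extract_subgraph(toy_triples, "country")
```

On the toy data, only the two entities typed under "country" are kept, along with the single fact that links them. Facts pointing outside the topic, such as the capital relation, are dropped in this sketch; a real extraction might instead keep them as entity attributes.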

Related Organizations

Télécom ParisTech, France
Institut Polytechnique de Paris, France

Keywords

Social and Information Networks (cs.SI), FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Computer Science - Social and Information Networks, Machine Learning (stat.ML), Machine Learning (cs.LG)

Related research products

3d-force-graph software on GitHub (IsRelatedTo)

Impact by BIP!

selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.


Fields of Science (4)

natural sciences

