WD50K dataset: A hyper-relational dataset derived from Wikidata statements.

The dataset is constructed by the following procedure, based on the [Wikidata RDF dump](https://dumps.wikimedia.org/wikidatawiki/20190801/) of August 2019:

- A set of seed nodes, corresponding to entities from FB15K-237 that have a direct mapping in Wikidata (P646 "Freebase ID"), is extracted from the dump.
- For each seed node, all statements whose main object and qualifier values correspond to wikibase:Item are extracted from the dump.
- All literals are filtered out of the qualifiers of the statements obtained above.
- All entities with fewer than two mentions in the dataset are dropped, together with the statements that contain them.
- The remaining statements are randomly split into train, validation, and test sets.
- All statements that share the same main triple (s,p,o) with a test statement are removed from the train and validation sets.
- WD50K_33, WD50K_66, and WD50K_100 are then sampled from the above statements. Here 33, 66, and 100 denote the approximate percentage of hyper-relational facts (statements with qualifiers) in each dataset.
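The filtering, splitting, and leakage-removal steps above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' actual pipeline: the tuple layout `(s, p, o, qualifiers)`, the function names, the single filtering pass, and the 70/10/20 split ratios are all assumptions made for the example.

```python
import random
from collections import Counter

# Assumed layout: a statement is (s, p, o, qualifiers), where qualifiers
# is a tuple of (qualifier_relation, qualifier_entity) pairs.

def count_mentions(statements):
    """Count how often each entity appears as subject, object, or qualifier value."""
    counts = Counter()
    for s, _, o, quals in statements:
        counts[s] += 1
        counts[o] += 1
        for _, qv in quals:
            counts[qv] += 1
    return counts

def drop_rare_entities(statements, min_mentions=2):
    """Drop statements containing an entity with fewer than min_mentions mentions.

    Single pass only; whether the original procedure iterates is not stated.
    """
    counts = count_mentions(statements)
    kept = []
    for s, p, o, quals in statements:
        entities = [s, o] + [qv for _, qv in quals]
        if all(counts[e] >= min_mentions for e in entities):
            kept.append((s, p, o, quals))
    return kept

def split_statements(statements, ratios=(0.7, 0.1, 0.2), seed=0):
    """Randomly split into train/valid/test. The ratios are hypothetical."""
    shuffled = list(statements)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_valid = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

def remove_test_leakage(train, valid, test):
    """Remove train/valid statements whose main triple (s, p, o) occurs in test."""
    test_triples = {(s, p, o) for s, p, o, _ in test}

    def is_clean(statement):
        s, p, o, _ = statement
        return (s, p, o) not in test_triples

    return [st for st in train if is_clean(st)], [st for st in valid if is_clean(st)]
```

Comparing only the main triple in `remove_test_leakage` means a train statement is removed even when its qualifiers differ from those of the matching test statement, which follows the wording of the procedure above.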
The table below provides some basic statistics of our dataset and its three further variations:

| Dataset | Statements | With qualifiers (%) | Entities | Relations | Entities only in qualifiers | Relations only in qualifiers | Train | Valid | Test |
|-------------|------------|----------------|----------|-----------|-----------------|-----------------|---------|--------|--------|
| WD50K | 236,507 | 32,167 (13.6%) | 47,156 | 532 | 5,460 | 45 | 166,435 | 23,913 | 46,159 |
| WD50K (33) | 102,107 | 31,866 (31.2%) | 38,124 | 475 | 6,463 | 47 | 73,406 | 10,668 | 18,133 |
| WD50K (66) | 49,167 | 31,696 (64.5%) | 27,347 | 494 | 7,167 | 53 | 35,968 | 5,154 | 8,045 |
| WD50K (100) | 31,314 | 31,314 (100%) | 18,792 | 279 | 7,862 | 75 | 22,738 | 3,279 | 5,297 |

When using the dataset please cite:

@inproceedings{StarE,
  title={Message Passing for Hyper-Relational Knowledge Graphs},
  author={Galkin, Mikhail and Trivedi, Priyansh and Maheshwari, Gaurav and Usbeck, Ricardo and Lehmann, Jens},
  booktitle={EMNLP},
  year={2020}
}

For any further questions, please contact: mikhail.galkin@iais.fraunhofer.de
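As a small worked example, the share of statements with qualifiers reported in the statistics table can be recomputed from the two count columns. The dictionary and helper name below are illustrative assumptions; the counts are copied from the table.

```python
# (total statements, statements with qualifiers), copied from the statistics table.
rows = {
    "WD50K": (236_507, 32_167),
    "WD50K (33)": (102_107, 31_866),
    "WD50K (66)": (49_167, 31_696),
    "WD50K (100)": (31_314, 31_314),
}

def qualifier_share(total, with_quals):
    """Percentage of statements that carry at least one qualifier, to one decimal."""
    return round(100.0 * with_quals / total, 1)

shares = {name: qualifier_share(total, quals) for name, (total, quals) in rows.items()}

for name, share in shares.items():
    print(f"{name}: {share}% of statements have qualifiers")
```

The resulting shares are close to, but not exactly, the 33, 66, and 100 in the variant names.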
Funding sources

- SPEAKER: 01MK20011A
- JOSEPH: Fraunhofer Zukunftsstiftung
- Cleopatra: 812997
- ML2R: 01 15 18038 A/B/C
- MLwin: 01IS18050 D/F
- ScADS: 01IS18026A
Wikidata, Knowledge Graph, Link prediction, Hyper Relational Graph, Graph Convolutional Network, Natural Language Processing
| Indicator | Description | Value |
|-----------|-------------|-------|
| Citations | An alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| Views | | 61 |
| Downloads | | 17 |

Views provided by UsageCounts
Downloads provided by UsageCounts