ZENODO
Dataset . 2023
License: CC BY
Data sources: ZENODO
Data sources: Datacite

Freebase Datasets for Robust Evaluation of Knowledge Graph Link Prediction Models

Authors: Shirvani Mahdavi, Nasim; Akrami, Farahnaz; Saeef, Mohammed Samiul; Shi, Xiao; Li, Chengkai

Abstract

Freebase is amongst the largest public cross-domain knowledge graphs. It possesses three main data modeling idiosyncrasies: it has a strong type system; its properties are purposefully represented in reverse pairs; and it uses mediator objects to represent multiary relationships. These design choices are important in modeling the real world, but they also pose nontrivial challenges in research on embedding models for knowledge graph completion, especially when models are developed and evaluated agnostically of these idiosyncrasies. We make available several variants of the Freebase dataset by inclusion and exclusion of these data modeling idiosyncrasies. This is the first-ever publicly available full-scale Freebase dataset that has gone through proper preparation.

Dataset Details

The dataset consists of the four variants of the Freebase dataset as well as related mapping/support files. For each variant, we made three kinds of files available:

Subject matter triples file

  • fb+/-CVT+/-REV: One folder for each variant. In each folder there are 5 files: train.txt, valid.txt, test.txt, entity2id.txt, relation2id.txt. Subject matter triples are the triples belonging to subject matter domains, that is, domains describing real-world facts.
    Example of a row in train.txt, valid.txt, and test.txt: 2, 192, 0
    Example of a row in entity2id.txt: /g/112yfy2xr, 2
    Example of a row in relation2id.txt: /music/album/release_type, 192
    Explanation: "/g/112yfy2xr" and "/m/02lx2r" are the MIDs of the subject entity and object entity, respectively. "/music/album/release_type" is the relationship between the two entities. 2, 192, and 0 are the IDs assigned by the authors to the subject entity, the relationship, and the object entity, respectively.

Type system file

  • freebase_endtypes: Each row maps an edge type to its required subject type and object type.
    Example: 92, 47178872, 90
    Explanation: "92" and "90" are the type IDs of the subject and object required by the relationship with ID "47178872".
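The triple and ID-mapping rows described above can be decoded back into MIDs and relation labels with a few lines of Python. This is a minimal sketch, not the authors' loader: it assumes the rows are comma-separated exactly as in the examples (the shipped files may use a different delimiter), the function names are hypothetical, and the entity2id row for "/m/02lx2r" is inferred from the explanation rather than quoted from the dataset.

```python
from typing import Dict, Iterable, List, Tuple


def parse_id_map(lines: Iterable[str]) -> Dict[int, str]:
    """Parse 'label, id' rows (entity2id.txt / relation2id.txt style) into {id: label}."""
    id_to_label: Dict[int, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the LAST comma so labels containing commas stay intact.
        label, raw_id = line.rsplit(",", 1)
        id_to_label[int(raw_id)] = label.strip()
    return id_to_label


def parse_triples(lines: Iterable[str]) -> List[Tuple[int, int, int]]:
    """Parse 'subject_id, relation_id, object_id' rows (train/valid/test style)."""
    triples: List[Tuple[int, int, int]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        subject_id, relation_id, object_id = (int(part) for part in line.split(","))
        triples.append((subject_id, relation_id, object_id))
    return triples


def decode(
    triples: Iterable[Tuple[int, int, int]],
    entities: Dict[int, str],
    relations: Dict[int, str],
) -> List[Tuple[str, str, str]]:
    """Replace integer IDs with the MIDs and relation labels they stand for."""
    return [(entities[s], relations[r], entities[o]) for s, r, o in triples]


# Rows taken from the examples above; the second entity row is an ASSUMPTION
# inferred from the explanation (object entity "/m/02lx2r" with ID 0).
entity_rows = ["/g/112yfy2xr, 2", "/m/02lx2r, 0"]
relation_rows = ["/music/album/release_type, 192"]
triple_rows = ["2, 192, 0"]

decoded = decode(
    parse_triples(triple_rows),
    parse_id_map(entity_rows),
    parse_id_map(relation_rows),
)
print(decoded)
```

For the real files, the same functions can be fed an open file handle for each of train.txt, entity2id.txt, and relation2id.txt, once the actual delimiter has been checked.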
Metadata files

  • object_types: Each row maps the MID of a Freebase object to a type it belongs to.
    Example: /g/11b41c22g, /type/object/type, /people/person
    Explanation: The entity with MID "/g/11b41c22g" has the type "/people/person".
  • object_names: Each row maps the MID of a Freebase object to its textual label.
    Example: /g/11b78qtr5m, /type/object/name, "Viroliano Tries Jazz"@en
    Explanation: The entity with MID "/g/11b78qtr5m" has the name "Viroliano Tries Jazz" in English.
  • object_ids: Each row maps the MID of a Freebase object to its user-friendly identifier.
    Example: /m/05v3y9r, /type/object/id, "/music/live_album/concert"
    Explanation: The entity with MID "/m/05v3y9r" can be interpreted by a human as a music concert live album.
  • domains_id_label: Each row maps the MID of a Freebase domain to its label.
    Example: /m/05v4pmy, geology, 77
    Explanation: The object with MID "/m/05v4pmy" in Freebase is the domain "geology", and has ID "77" in our dataset.
  • types_id_label: Each row maps the MID of a Freebase type to its label.
    Example: /m/01xljxh, /government/political_party, 147
    Explanation: The object with MID "/m/01xljxh" in Freebase is the type "/government/political_party", and has ID "147" in our dataset.
  • entities_id_label: Each row maps the MID of a Freebase entity to its label.
    Example: /g/11b78qtr5m, Viroliano Tries Jazz, 2234
    Explanation: The entity with MID "/g/11b78qtr5m" in Freebase is "Viroliano Tries Jazz", and has ID "2234" in our dataset.
  • properties_id_label: Each row maps the MID of a Freebase property to its label.
    Example: /m/010h8tp2, /comedy/comedy_group/members, 47178867
    Explanation: The object with MID "/m/010h8tp2" in Freebase is a property (relation/edge); it has the label "/comedy/comedy_group/members" and has ID "47178867" in our dataset.
  • uri_original2simplified and uri_simplified2original: The mapping from original URI to simplified URI and the mapping from simplified URI to original URI, respectively.
    Example (uri_original2simplified): "http://rdf.freebase.com/ns/type.property.unique": "/type/property/unique"
    Example (uri_simplified2original): "/type/property/unique": "http://rdf.freebase.com/ns/type.property.unique"
    Explanation: The URI "http://rdf.freebase.com/ns/type.property.unique" in the original Freebase RDF dataset is simplified into "/type/property/unique" in our dataset. The identifier "/type/property/unique" in our dataset has the URI http://rdf.freebase.com/ns/type.property.unique in the original Freebase RDF dataset.
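The URI example above suggests a simple rewriting rule: drop the "http://rdf.freebase.com/ns/" prefix and turn dots into slashes. The sketch below implements that rule as inferred from the single example only; it is an assumption, not the authors' documented procedure, and the function names are hypothetical. Identifiers that contain literal dots or slashes would not round-trip under this rule, so the shipped uri_original2simplified and uri_simplified2original files remain the authoritative mapping.

```python
# Namespace prefix as it appears in the example URI above.
FREEBASE_NS = "http://rdf.freebase.com/ns/"


def simplify_uri(uri: str) -> str:
    """Turn an original Freebase RDF URI into the slash-separated simplified form.

    ASSUMPTION: the rule (strip namespace, replace '.' with '/') is inferred
    from one example and may not cover every URI in the dataset.
    """
    if not uri.startswith(FREEBASE_NS):
        raise ValueError(f"not a Freebase namespace URI: {uri!r}")
    local_name = uri[len(FREEBASE_NS):]
    return "/" + local_name.replace(".", "/")


def restore_uri(simplified: str) -> str:
    """Inverse of simplify_uri under the same inferred rule."""
    if not simplified.startswith("/"):
        raise ValueError(f"simplified identifiers start with '/': {simplified!r}")
    return FREEBASE_NS + simplified[1:].replace("/", ".")


original = "http://rdf.freebase.com/ns/type.property.unique"
simplified = simplify_uri(original)
print(simplified)
print(restore_uri(simplified))
```

Looking identifiers up in the two mapping files is the safer choice for real use; the functions above are mainly useful for understanding how the simplified identifiers relate to the RDF dump.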

Keywords

Benchmark Dataset, Knowledge Graph, Knowledge Graph Embedding, Link Prediction, Knowledge Graph Completion
