ZENODO
Dataset . 2019
License: CC BY
Data sources: Datacite
Data sources: ZENODO

Billion Triple Challenge (BTC) 2019 Dataset

Authors: Herrera, Jose Miguel; Hogan, Aidan; Käfer, Tobias;

Abstract

The Billion Triple Challenge (BTC) 2019 Dataset is the result of a large-scale RDF crawl (accepting RDF/XML, Turtle and N-Triples) conducted from 2018/12/12 until 2019/01/11 using LDspider. The data are stored as quads where the fourth element encodes the location of the Web document from which the associated triple was parsed. The dataset contains 2,155,856,033 quads, collected from 2,641,253 RDF documents on 394 pay-level domains. Merging the data into one RDF graph results in 256,059,356 unique triples. These data (as quads or triples) contain 38,156 unique predicates and instances of 120,037 unique classes.

If you would like to use this dataset as part of a research work, we would ask you to please consider citing our paper:

José-Miguel Herrera, Aidan Hogan and Tobias Käfer. "BTC-2019: The 2019 Billion Triple Challenge Dataset". In the Proceedings of the 18th International Semantic Web Conference (ISWC), Auckland, New Zealand, October 26–30, 2019 (Resources track).

The dataset is published in three main parts:

  • Quads (*nq.gz): contains the quads retrieved during the crawl (N-Quads, GZipped). These data are divided into individual files for each of the top 100 pay-level-domains by number of quads contributed (btc2019-[domain]_000XX.nq.gz). While most domains have one file, larger domains are further split into parts (000XX) with approximately 150,000,000 quads each. Finally, quads from the 294 domains not in the top 100 are merged into one file: btc2019-other_00001.nq.gz.
  • Triples (btc2019-triples.nt.gz): contains the unique triples resulting from taking all quads, dropping the fourth element (indicating the location of the source document) and computing the unique triples.
  • VoID (void.nt): contains a VoID file offering statistics about the dataset.

For parsing the files, we recommend a streaming parser, such as Raptor, RDF4j/Rio, or NxParser.

The data are sourced from 2,641,253 RDF documents.
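The way the triples file is described above (drop the fourth element, then keep unique triples) can be illustrated with a small streaming pipeline. This is only a sketch under stated assumptions, not the procedure the authors used: it assumes every line is a well-formed quad whose fourth element is an IRI, and the function names `quads_to_triples` and `sample` are hypothetical. For real data, one of the recommended streaming parsers is the safer choice.

```shell
#!/bin/sh
# Sketch: derive unique triples from N-Quads read as a stream.
# Assumption: each line ends with "<graph-IRI> ." (the source document).

# Remove the trailing graph IRI from every quad on stdin.
quads_to_triples() {
  sed -E 's/ <[^<> ]*> \.[[:space:]]*$/ ./'
}

# Inline sample standing in for: gzip -dc some-file.nq.gz
sample() {
  printf '%s\n' \
    '<http://ex.org/a> <http://ex.org/p> "x <y> z" <http://ex.org/doc1> .' \
    '<http://ex.org/a> <http://ex.org/p> "x <y> z" <http://ex.org/doc2> .' \
    '<http://ex.org/a> <http://ex.org/q> <http://ex.org/b> <http://ex.org/doc1> .'
}

# The first two quads differ only in their source document,
# so they collapse into a single triple.
sample | quads_to_triples | LC_ALL=C sort -u
```

On a downloaded file, the same function could be fed with `gzip -dc btc2019-dbpedia.org_00001.nq.gz | quads_to_triples | LC_ALL=C sort -u`, which decompresses on the fly; `sort` spills to temporary files on disk when its input does not fit in memory.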
The top-10 pay-level-domains in terms of documents contributed are:

  • dbpedia.org 162,117 documents (6.14%)
  • loc.gov 150,091 documents (5.68%)
  • bnf.fr 146,186 documents (5.53%)
  • sudoc.fr 144,877 documents (5.49%)
  • theses.fr 141,228 documents (5.35%)
  • wikidata.org 141,207 documents (5.35%)
  • linkeddata.es 130,459 documents (4.94%)
  • getty.edu 130,398 documents (4.94%)
  • fao.org 92,838 documents (3.51%)
  • ontobee.org 92,812 documents (3.51%)

The data contain 2,155,856,033 quads. The top-10 pay-level-domains in terms of quads contributed are:

  • wikidata.org 2,006,338,975 quads (93.06%)
  • dbpedia.org 36,686,161 quads (1.70%)
  • idref.fr 22,013,225 quads (1.02%)
  • bnf.fr 12,618,155 quads (0.59%)
  • getty.edu 7,453,134 quads (0.35%)
  • sudoc.fr 7,176,301 quads (0.33%)
  • loc.gov 6,725,390 quads (0.31%)
  • linkeddata.es 6,485,114 quads (0.30%)
  • theses.fr 4,820,874 quads (0.22%)
  • ontologycentral.com 4,633,947 quads (0.21%)

The data contain 256,059,356 unique triples. The top-10 pay-level-domains in terms of unique triples contributed are:

  • wikidata.org 133,535,555 triples (52.15%)
  • dbpedia.org 32,981,420 triples (12.88%)
  • idref.fr 16,820,681 triples (6.57%)
  • bnf.fr 11,769,268 triples (4.60%)
  • getty.edu 6,571,525 triples (2.57%)
  • linkeddata.es 5,898,762 triples (2.30%)
  • loc.gov 5,362,064 triples (2.09%)
  • sudoc.fr 4,972,647 triples (1.94%)
  • ontologycentral.com 4,471,962 triples (1.75%)
  • theses.fr 4,095,897 triples (1.60%)

If you wish to download all N-Quads files, the following may be useful to copy and paste in Unix:

wget https://zenodo.org/record/2634588/files/btc2019-acropolis.org.uk_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-aksw.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-babelnet.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-bbc.co.uk_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-berkeleybop.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-bibliotheken.nl_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-bl.uk_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-bne.es_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-bnf.fr_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-camera.it_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-cervantesvirtual.com_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-chemspider.com_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-cnr.it_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-comicmeta.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-crossref.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-cvut.cz_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-d-nb.info_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-datacite.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-dbpedia.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-dbtune.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-drugbank.ca_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-ebi.ac.uk_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-ebu.ch_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-ebusiness-unibw.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-edamontology.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-europa.eu_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-fao.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-gbv.de_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-geonames.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-geospecies.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-geovocab.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-gesis.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-getty.edu_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-github.io_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-githubusercontent.com_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-glottolog.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-iconclass.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-idref.fr_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-iflastandards.info_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-ign.fr_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-iptc.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-kanzaki.com_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-kasei.us_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-kit.edu_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-kjernsmo.net_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-korrekt.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-kulturarvsdata.se_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-kulturnav.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-l3s.de_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-lehigh.edu_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-lexvo.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-linkeddata.es_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-linkedopendata.gr_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-linkedresearch.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-loc.gov_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-lu.se_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-mcu.es_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-medra.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-myexperiment.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-ndl.go.jp_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-nih.gov_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-nobelprize.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-okfn.gr_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-ontobee.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-ontologycentral.com_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-openei.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-openlibrary.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-orcid.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-ordnancesurvey.co.uk_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-oszk.hu_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-other_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-persee.fr_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-pokepedia.fr_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-princeton.edu_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-productontology.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-rdaregistry.info_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-rdvocab.info_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-reegle.info_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-rhiaro.co.uk_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-schema.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-sf.net_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-simia.net_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-sti2.at_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-stoa.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-sudoc.fr_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-taxonconcept.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-theses.fr_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-timbl.com_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-uba.de_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-unesco.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-uni-mannheim.de_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-uniprot.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-unitn.it_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-verborgh.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-w3.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wals.info_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-walsh.name_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00002.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00003.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00004.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00005.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00006.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00007.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00008.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00009.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00010.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00011.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00012.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00013.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-wikidata.org_00014.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-worldcat.org_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-xmlns.com_00001.nq.gz
wget https://zenodo.org/record/2634588/files/btc2019-zbw.eu_00001.nq.gz

To merge these files into one (again on Unix):

cat *.nq.gz > btc2019-quads.nq.gz_bak
mv btc2019-quads.nq.gz_bak btc2019-quads.nq.gz

Links to previous BTC datasets:

  • BTC 2014
  • BTC 2012
  • BTC 2011
  • BTC 2010
  • BTC 2009
  • BTC 2008 (no link available)
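The merge commands above work because the gzip format allows several compressed members to be concatenated into one valid stream, and the temporary `_bak` suffix keeps the output file from matching the `*.nq.gz` glob while `cat` is still reading. The following is a minimal sketch that checks this behaviour on two tiny stand-in files; the file names and sample quads are made up for illustration and are not part of the dataset.

```shell
#!/bin/sh
# Sketch: concatenated gzip files decompress as one continuous stream.
tmp=$(mktemp -d)

# Two tiny stand-in parts, one quad each (hypothetical content).
printf '%s\n' '<http://ex.org/a> <http://ex.org/p> <http://ex.org/b> <http://ex.org/d1> .' \
  | gzip > "$tmp/part_00001.nq.gz"
printf '%s\n' '<http://ex.org/a> <http://ex.org/p> <http://ex.org/c> <http://ex.org/d2> .' \
  | gzip > "$tmp/part_00002.nq.gz"

# Write under a name the glob does not match, then rename.
cat "$tmp"/*.nq.gz > "$tmp/merged.nq.gz_bak"
mv "$tmp/merged.nq.gz_bak" "$tmp/merged.nq.gz"

# Verify integrity and count the quads in the merged file.
gzip -t "$tmp/merged.nq.gz"
merged_lines=$(gzip -dc "$tmp/merged.nq.gz" | wc -l | tr -d ' ')
echo "$merged_lines"   # prints 2

rm -rf "$tmp"
```

Running `gzip -t` on the real merged file before deleting the parts is a cheap way to catch a truncated download.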

Keywords

semantic web, billion triple challenge, linked data, btc
