
This is a snapshot of QAWiki from 2025-09-09: a dataset for knowledge graph question answering (KGQA) and/or SPARQL query generation over Wikidata. The dataset is provided in two formats. The simple format is a TSV file containing language-tagged questions and their paraphrases, paired with SPARQL queries. The full format is a TTL file containing a complete RDF dump of QAWiki, which additionally includes entity mentions, relation mentions, question relations, quality tags, etc. The dataset contains 518 question/query pairs, with questions in English and Spanish, plus 8 additional ambiguous questions without queries. Some questions also have Italian and Danish translations provided by the community.
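
As a rough illustration of working with the simple format, the sketch below reads tab-separated rows and groups question/query pairs by language tag. It is only a sketch under stated assumptions: the column names (`lang`, `question`, `sparql`), the presence of a header line, and the sample rows are all hypothetical and are not taken from the actual QAWiki file, so they should be adjusted to match the real TSV.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names: the actual QAWiki TSV header may differ.
LANG_COL = "lang"
QUESTION_COL = "question"
QUERY_COL = "sparql"


def load_pairs(stream):
    """Read tab-separated rows (with a header line) and group
    (question, query) pairs by their language tag."""
    # QUOTE_NONE keeps any quote characters inside SPARQL queries untouched.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    by_lang = defaultdict(list)
    for row in reader:
        by_lang[row[LANG_COL]].append((row[QUESTION_COL], row[QUERY_COL]))
    return dict(by_lang)


# Made-up sample rows for illustration only; not real QAWiki content.
# The query is a generic placeholder, not a meaningful answer query.
SAMPLE = (
    "lang\tquestion\tsparql\n"
    "en\tWho wrote Hamlet?\tSELECT ?x WHERE { ?x ?p ?o } LIMIT 1\n"
    "es\t¿Quién escribió Hamlet?\tSELECT ?x WHERE { ?x ?p ?o } LIMIT 1\n"
)

pairs = load_pairs(io.StringIO(SAMPLE))
```

To use this with the downloaded file, one would pass an open file handle instead of the in-memory sample, for example `open(path, encoding="utf-8", newline="")`, where `path` points to the TSV.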
