
tBiomed is a dataset for tabular data to knowledge graph matching. It is derived for the biomedical domain and contains two types of tables: Horizontal Relational Tables, where each table represents a collection of entities, and Entity Tables, where each table represents a single entity. Ground truth data is provided with Wikidata as the target knowledge graph (KG). tBiomed is generated by KG2Tables using two levels of a recursive hierarchy of related concepts in Wikidata.

tBiomed contains 26,778 entity and horizontal tables. This repository contains only the validation fold, which represents 20% of the entire benchmark, together with its ground truth (gt) data. The full size of the dataset is 1 GB. We will update this repository with the full dataset, including the test fold with its ground truth data, in the future. Please get in touch if you are interested in the full dataset.

The supported tasks for semantic table annotation are:
Topic Detection (TD) links the entire table to an entity or a class from the target KG.
Cell Entity Annotation (CEA) maps individual table cells to entities from the target KG.
Column Type Annotation (CTA) links individual table columns to classes from the target KG.
Column Property Annotation (CPA) detects the relations between column pairs from the target KG.
Row Annotation (RA) annotates an entire row with a KG entity or property.
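The five tasks differ mainly in which part of a table receives an annotation: the whole table, a cell, a column, a column pair, or a row. The following is a minimal Python sketch of that distinction. It is illustrative only and does not describe the file layout of this dataset; the `Annotation` class, its field names, the `REQUIRED` mapping, and the table identifier `"t1"` are all hypothetical, and the Wikidata identifier is used purely as a sample value.

```python
from dataclasses import dataclass
from typing import Optional

# Which table coordinates each task needs (hypothetical encoding, not the
# dataset's own format): TD targets the table, CEA a cell, CTA a column,
# CPA a pair of columns, RA a row.
REQUIRED = {
    "TD": (),
    "CEA": ("row", "col"),
    "CTA": ("col",),
    "CPA": ("col", "col2"),
    "RA": ("row",),
}


@dataclass(frozen=True)
class Annotation:
    task: str                   # one of the keys of REQUIRED
    table_id: str               # hypothetical table identifier
    kg_id: str                  # identifier in the target KG (sample value)
    row: Optional[int] = None
    col: Optional[int] = None
    col2: Optional[int] = None  # second column, used by CPA only

    def __post_init__(self):
        if self.task not in REQUIRED:
            raise ValueError(f"unknown task: {self.task}")
        given = {
            name for name in ("row", "col", "col2")
            if getattr(self, name) is not None
        }
        if given != set(REQUIRED[self.task]):
            raise ValueError(
                f"{self.task} expects coordinates {REQUIRED[self.task]}, "
                f"got {sorted(given)}"
            )


# A cell-level annotation: row 1, column 0 of table "t1".
cell = Annotation("CEA", "t1", "Q12136", row=1, col=0)
# A table-level annotation needs no coordinates.
topic = Annotation("TD", "t1", "Q12136")
```

Validating the coordinates at construction time keeps annotations for different tasks from being mixed up, for example a column-level CTA annotation that accidentally carries a row index.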
Topic Detection (TD), Biomedical, KG2Tables, Semantic Table Annotation, Row Annotation (RA), Benchmark, Cell Entity Annotation (CEA), Column-Column Property Annotation (CPA), Column Type Annotation (CTA)
