ZENODO
Other literature type . 2025
License: CC BY
Data sources: ZENODO
ZENODO
Conference object . 2025
License: CC BY
Data sources: Datacite

AI-assisted research data annotation in biomedical consortia

Authors: Engel, Felix; Watter, Manuel; Benadi, Gita; Giuliani, Claudia; Kalantari Sarcheshmeh, Aref; Binder, Harald; Kaier, Klaus

Abstract

Annotation of research data is a key element of Open Science and has gained additional value as training input for artificial intelligence. However, developing metadata schemas poses a series of challenges, including optimising the schemas and ensuring complete coverage as well as consistent completeness and quality. We employ large language models (LLMs) to address some of these challenges while keeping researchers in the loop to ensure the reliability of annotations.

Our research data management group currently supports seven biomedical research consortia. We develop customised metadata schemas together with consortium members, drawing on established controlled vocabularies (Engel et al. 2025). Schemas are implemented on the fredato research data platform developed at the IMBI (Watter et al. 2023). They are documented and published as knowledge graphs adhering to the Resource Description Framework (RDF), relating metadata to research processes as modelled by commonly used ontologies.

LLMs are employed to develop initial schema drafts from related research literature and to predict dataset annotations from scientific papers (Giuliani et al. 2025). The models have performed well on these tasks, helping researchers improve metadata coverage in their consortia.

Keywords

Metadata annotation prediction, Metadata, Large Language Models, Research data management, Data annotation, Data schemas
