ZENODO
Article . 2025
License: CC BY
Data sources: Datacite

KONDA: An LLM-based Tool for Semantic Annotation and Knowledge Graph Creation Using Ontologies for Research Data

Authors: Kim, Soo-Yon; Görz, Martin; Geisler, Sandra;

Abstract

Achieving semantic interoperability of research data is key to enabling cross-domain data integration, reuse, and knowledge discovery [1]. While the need to align heterogeneous datasets using shared vocabularies and ontologies is widely recognized, doing so remains a considerable challenge in practice [2], [3]. Researchers face several challenges:

• (C1) Lack of expertise in ontologies: Many researchers are unfamiliar with ontology engineering and semantic annotation.
• (C2) Absence of established domain ontologies: While some domains, such as medicine, have well-established vocabularies, other domains, such as production engineering, may lack suitable or widely adopted ontologies, making it difficult to identify reusable options.
• (C3) Technical barriers: The knowledge required to work with technologies such as RDF or mapping tools often presents an entry barrier.
• (C4) Tool heterogeneity: Working with multiple disconnected tools adds cognitive and technical overhead.
• (C5) Limited resources: Researchers typically face time constraints, making it difficult to invest in familiarizing themselves with complex tools or processes.
• (C6) Proprietary solutions: Many semantic mapping tools (e.g., Talend [4]) are proprietary and not suitable for scientific work.

To address these challenges, we present KONDA, an LLM-based tool that supports semantic enrichment of research datasets and the construction of explorable knowledge graphs within a single integrated workflow. The KONDA workflow is as follows:

• An interface prompts the user to upload their research dataset, along with optional supplementary documents (e.g., protocols, DMPs, README files) to provide the tool with context.
• The user is supported in the selection of suitable ontologies via a direct integration with the TIB Terminology Service [5], with the option to add custom ontologies.
• The tool performs automated LLM-based semantic annotation of the dataset using the provided context and selected ontologies.
• A feedback screen enables the user to review and correct annotations.
• The annotated data is provided in RDF format with an immediate visualization as a knowledge graph.

KONDA's architecture comprises a user interface, a server backend managing sessions and data processing, and an API layer that connects the tool to an LLM, where the semantic enrichment is conducted with techniques such as named entity recognition, relation extraction, and ontology-based annotation.

Through KONDA, a guided, interactive tool is provided in which users receive LLM-assisted suggestions and the opportunity to intuitively explore their enriched data directly through automated knowledge graph creation, thus reducing required technical or formal training in semantic technologies (C1, C3). The discovery of reusable ontologies is enabled through the integration of terminology services (C2). KONDA unifies the pipeline within a single, cohesive environment (C4). The tool's semi-automated workflow provides fast and visually supported results with minimal manual effort (C5) while retaining opportunities for human feedback to ensure output quality. Finally, KONDA's modular backend supports the deployment of both proprietary and open LLMs (C6).

KONDA empowers researchers to semantically enrich their datasets with minimal effort, offering an integrated and adaptable solution. Future development will focus on persistent graph storage, automated ontology recommendations, and evaluation in real-world settings. By leveraging LLMs and emphasizing usability, KONDA provides a robust foundation for advancing data interoperability across disciplines.
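To make the last workflow step more concrete, the following is a minimal sketch of how reviewed column annotations might be turned into RDF triples in N-Triples syntax. It is not KONDA's implementation, which the abstract does not describe at this level. The function names `literal` and `rows_to_ntriples`, the `https://example.org/...` IRIs, and the row-per-subject modelling are all assumptions made for illustration only.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value):
    """Serialize a value as a plain N-Triples string literal (hypothetical helper)."""
    text = str(value)
    # Escape the characters that N-Triples requires to be escaped in literals.
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def rows_to_ntriples(rows, annotations, base_iri, class_iri=None):
    """Turn tabular rows into N-Triples lines.

    rows        -- list of dicts, one per dataset row
    annotations -- column name -> ontology property IRI (assumed to be the
                   user-reviewed output of the annotation step)
    base_iri    -- prefix used to mint one subject IRI per row (assumption)
    class_iri   -- optional ontology class assigned to every row
    """
    lines = []
    for index, row in enumerate(rows):
        subject = f"<{base_iri}row/{index}>"
        if class_iri:
            lines.append(f"{subject} <{RDF_TYPE}> <{class_iri}> .")
        for column, value in row.items():
            predicate = annotations.get(column)
            # Columns without an accepted annotation and empty cells are skipped.
            if predicate is None or value in (None, ""):
                continue
            lines.append(f"{subject} <{predicate}> {literal(value)} .")
    return lines


if __name__ == "__main__":
    # Hypothetical production-engineering data and ontology IRIs.
    data = [{"spindle_speed": 1200, "operator_note": "tool change", "internal_id": 7}]
    reviewed = {
        "spindle_speed": "https://example.org/onto#spindleSpeed",
        "operator_note": "https://example.org/onto#comment",
    }
    for line in rows_to_ntriples(
        data, reviewed, "https://example.org/data/", "https://example.org/onto#Measurement"
    ):
        print(line)
```

In this sketch, unannotated columns such as `internal_id` are simply dropped, and every value is emitted as a plain string literal. A real pipeline would likely add datatypes and link values to ontology individuals where the annotation identifies entities rather than literals.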

Keywords

Large Language Models, Semantic Annotation, Ontologies, Research Data Management, Knowledge Graphs, Interoperability
