ZENODO
Other literature type . 2025
License: CC BY
Data sources: ZENODO
ZENODO
Presentation . 2025
License: CC BY
Data sources: Datacite

Managing distributed research data in the engineering sciences with a Data Mesh approach

Authors: Moser, Mario

Abstract

Presentation at the distribits conference 2025 in Düsseldorf.

Research data management in the engineering sciences is decentralised, both organisationally and technically with respect to tools and data sources. A Data Mesh approach is therefore applied, and first experiences with it are presented.

To make data-driven research transparent and comprehensible, research data management (RDM) covers tools and methods that make research data FAIR (findable, accessible, interoperable, reusable) and maintain it as a valuable resource. Moreover, a cultural change has begun towards building analyses on existing data instead of necessarily collecting new data at the beginning of each experiment.

RDM, especially in the engineering sciences, is characterised by heterogeneity and decentralisation: Thousands of research institutions in Germany work independently of each other unless they collaborate within projects. The engineering sciences comprise different domains, ranging from mechanical and electrical to civil engineering. Engineering is highly interdisciplinary, so data might be reused for a purpose that was unknown when the data was generated. Data formats and structures are heterogeneous, covering relational sensor data and material models as well as images and audio. Data is provided in several repositories, which may be generic, institutional, or specialised. Overall, this initial situation makes it hard to discover existing data, to make use of it both in terms of content and technically, and to assess the data quality of a reused dataset.

The Data Mesh approach from industrial data management appears appropriate for this setting: Instead of centralised IT teams, domains and their domain owners manage their own datasets; they are able to answer specific questions about the data and ensure its quality.
Data is provided in the form of data products, which ensure that the relevant elements are included: metadata for context, code for processing, a handle for identification, provenance as history, a license from a legal perspective, etc. Data remains in its original source, leveraging existing and potentially more specialised infrastructure. Compared to a monolithic ‘one-size-fits-all’ solution, this is less complex to maintain and adapts more easily to future requirements. No complex ETL pipelines are required for data integration, although the data must be accessible in its source, e.g. via an API.

Based on metadata, the decentralised data is registered in a central platform, e.g. a data catalogue or graph, for increased findability. On such a self-serve platform, owners can onboard their data and users can find and access it.

To achieve interoperability within a Data Mesh, federated governance is applied, consisting of global and local elements: Designed according to the respective requirements, global governance ensures standardisation across the data within the whole mesh, while local rules leave room for domain-specific design decisions.

In this talk, the characteristics of RDM in the engineering sciences and of Data Mesh will be presented and mapped against each other. First results on the suitability and the areas of adaptation will be presented. Although not applicable 1:1 to RDM, the Data Mesh approach addresses the main challenges identified above. In particular, the domain-oriented approach and the federated governance go beyond purely technical or centralised solutions.
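As an illustration only, the ideas of data products, metadata-based registration in a central catalogue, and federated governance described in this abstract could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the talk: the names `DataProduct`, `Catalogue`, `GLOBAL_RULES`, `LOCAL_RULES`, the individual rule functions, and the example URL are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical illustration: none of these names come from the talk
# or from an existing library.


@dataclass
class DataProduct:
    handle: str       # persistent identifier of the product
    domain: str       # owning domain, e.g. "mechanical"
    owner: str        # domain owner who can answer questions about the data
    access_url: str   # pointer to the source; the data itself is not copied
    license: str      # legal terms for reuse
    metadata: dict = field(default_factory=dict)    # context for findability
    provenance: list = field(default_factory=list)  # history of the data


Rule = Callable[[DataProduct], bool]


def has_handle(product: DataProduct) -> bool:
    return bool(product.handle)


def has_license(product: DataProduct) -> bool:
    return bool(product.license)


def mechanical_requires_unit(product: DataProduct) -> bool:
    return "unit" in product.metadata


# Federated governance: global rules apply to every product in the mesh,
# local rules only to products of the respective domain.
GLOBAL_RULES: list[Rule] = [has_handle, has_license]
LOCAL_RULES: dict[str, list[Rule]] = {"mechanical": [mechanical_requires_unit]}


class Catalogue:
    """Central registry that stores metadata and pointers, not the data."""

    def __init__(self) -> None:
        self._entries: dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        """Onboard a product if it passes the global and local rules."""
        rules = GLOBAL_RULES + LOCAL_RULES.get(product.domain, [])
        failed = [rule.__name__ for rule in rules if not rule(product)]
        if failed:
            raise ValueError(f"governance rules failed: {failed}")
        self._entries[product.handle] = product

    def find(self, **criteria: object) -> list[DataProduct]:
        """Return all products whose metadata matches every criterion."""
        return [
            product
            for product in self._entries.values()
            if all(product.metadata.get(k) == v for k, v in criteria.items())
        ]


catalogue = Catalogue()
catalogue.register(
    DataProduct(
        handle="hdl:example/1",
        domain="mechanical",
        owner="Lab A",
        access_url="https://example.org/api/datasets/1",
        license="CC-BY-4.0",
        metadata={"kind": "sensor", "unit": "N"},
    )
)
matches = catalogue.find(kind="sensor")
```

In this sketch the catalogue holds only the handle, metadata, and access URL, so the data stays in its original source, and a domain can add its own rules without changing the global ones.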

Mario Moser et al. would like to thank the Federal Government and the Heads of Government of the Länder, as well as the Joint Science Conference (GWK), for their funding and support within the framework of the NFDI4ING consortium. Funded by the German Research Foundation (DFG) - project number 442146713. 
