ZENODO
Other literature type . 2024
License: CC BY
Data sources: ZENODO
ZENODO
Conference object . 2024
License: CC BY
Data sources: Datacite

DQ-Kit web app: Evaluating and Improving Data Quality for Soil and Agricultural Data in the BonaRes Repository

Authors: Lachmuth, Susanne; Hoffmann, Carsten; Nguyen, Viet Hoang; Silva de Almeida, Igo; Thalheim, Torsten; Kynn, Bethia; Lesch, Stephan; +4 Authors

Abstract

Well-curated data repositories enhance the discovery, access, integration, and analysis of scientific data. They maximize research impact and ensure the accuracy of data-driven technologies. The BonaRes Repository is a FAIR and open infrastructure for publishing soil and agricultural research data. Alongside this repository, we are developing DQ-Kit, a web application that automates comprehensive data quality assurance and offers automated guidance on data elements that require review and confirmation. DQ-Kit's checks fall into four main categories. First, it checks formal criteria such as atomization of data, structural consistency, and other formatting issues. Second, it provides an overview of the variables, their properties, and summary statistics. Third, it allows users to explore relationships among variables and patterns of missingness. Lastly, we plan to implement data plausibility checks that flag variables containing theoretically "impossible" values and values that seem empirically implausible based on existing knowledge. Initially, this functionality may be limited to soil data, where our team has the necessary expertise. We focus on "data fitness for use," emphasizing data suitability for specific purposes and amplifying the impact of data providers. We plan to enrich the metadata at the BonaRes Repository with DQ-Kit results, enabling seamless quality control and facilitating dataset comparisons. Ultimately, we aim to offer DQ-Kit as open-source software and invite community contributions to its development, including from the FAIRagro community. In summary, DQ-Kit ensures the integrity and reliability of scientific data at the BonaRes Repository and beyond, supporting various research endeavors.
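
To make the second and fourth check categories more concrete, the following is a minimal illustrative sketch in Python. It is not DQ-Kit code: the function names, the `ph` variable, and both sets of limits are assumptions chosen for this example. The hard limits stand in for theoretically "impossible" values (here, the conventional 0 to 14 pH scale), and the soft limits are placeholder values for "empirically implausible" ones, not thresholds used by the BonaRes Repository.

```python
from statistics import mean

# Hypothetical limits for illustration only; these are NOT DQ-Kit's rules.
HARD_LIMITS = {"ph": (0.0, 14.0)}  # values outside are flagged "impossible"
SOFT_LIMITS = {"ph": (3.0, 10.0)}  # values outside are flagged "implausible"


def summarize(rows, variable):
    """Return count, missing count, min, max, and mean for one variable."""
    values = [row.get(variable) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "n": len(values),
        "missing": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": mean(present) if present else None,
    }


def plausibility_flags(rows, variable):
    """Return (row_index, value, flag) for values outside the configured limits."""
    hard_lo, hard_hi = HARD_LIMITS[variable]
    soft_lo, soft_hi = SOFT_LIMITS[variable]
    flags = []
    for index, row in enumerate(rows):
        value = row.get(variable)
        if value is None:
            continue  # missing values are counted in summarize(), not flagged here
        if not hard_lo <= value <= hard_hi:
            flags.append((index, value, "impossible"))
        elif not soft_lo <= value <= soft_hi:
            flags.append((index, value, "implausible"))
    return flags


# Made-up sample records, used only to exercise the two functions.
sample = [
    {"plot": "A", "ph": 6.5},
    {"plot": "B", "ph": None},
    {"plot": "C", "ph": 15.2},
    {"plot": "D", "ph": 2.1},
]

print(summarize(sample, "ph"))
print(plausibility_flags(sample, "ph"))
```

Keeping the limits in plain dictionaries, separate from the checking logic, is one possible design: domain experts could then adjust thresholds per variable without touching the code.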

Keywords

NFDI, M3.4, Data Fitness for Use, Research Data Management, Agrosystems, Soil and Agricultural Science, FAIRagro, Community Summit 2024, Data Quality, Research Data Repository, FAIR Data Principles
