Other literature type . 2024
License: CC BY
Data sources: ZENODO

D4.6 Definition of Data Quality Metrics

Authors: Kalra, Dipak


Abstract

Reusing poor quality data has limited value. When developing the requirements for the AIDAVA curation virtual assistant, data users repeatedly asked the same question: how reliable is the data? The answer differs depending on the state of the data: i) for data sources, a quality label can be established based on the quality level provided by the data holder, if available, including the credentials of the persons who created and validated the data; ii) for the curated data (i.e. the PHKG), the quality label will be linked to the quality of the source, the level of quality and certification of the curation tools used during transformation, the level of health literacy of the humans who provided answers when there were semantic gaps, and the number of data quality checks that could not be resolved; iii) for published data, the quality label will be linked to the level of the curated data, the compliance with the target format, the completeness of the content, the absence of bias, as well as the quality, reliability and certification of the imputation algorithm, if applicable.

This document provides a detailed overview of AIDAVA deliverable 4.6, focusing on data quality and metadata across the health data life cycle. This deliverable is a key component of AIDAVA, aimed at developing a comprehensive data quality assessment methodology. This methodology is crucial for ensuring the reliability, transparency, and effective reuse of health data. The document highlights the importance of maintaining high standards of health data quality and covers data quality dimensions, methodologies, and tools. Furthermore, deliverable 4.6 is linked with other integral parts of the project, namely deliverables 1.3 (Business requirements for R1) [1], 1.4 (Definition of assessment study including test scenarios & metrics, and study initiation package) [2], 2.1 (Global data sharing standard) [3], and 2.2 (Details on data curation & publishing process) (deliverable on request). These deliverables introduce SHACL (Shapes Constraint Language) rules and specific data quality guidelines, contributing to the establishment of data quality practices.
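The three-state quality labelling described in the abstract can be sketched as a toy data model. All class names, fields, weights, and the scoring formula below are illustrative assumptions for exposition only, not the metrics actually defined in deliverable 4.6:

```python
from dataclasses import dataclass

# Hypothetical, simplified model of two of the quality-label states the
# abstract describes (source data and curated data / PHKG). Field names
# and the weighting scheme are placeholders, not AIDAVA's actual schema.

@dataclass
class SourceLabel:
    holder_declared_level: float   # 0.0-1.0 quality level from the data holder, if available
    creator_credentials_ok: bool   # credentials of the persons who created/validated the data

@dataclass
class CuratedLabel:
    source: SourceLabel
    tool_certification: float      # 0.0-1.0 certification level of the curation tools
    annotator_health_literacy: float  # 0.0-1.0 literacy of humans resolving semantic gaps
    unresolved_checks: int         # data quality checks that could not be resolved

def curated_quality_score(label: CuratedLabel) -> float:
    """Combine the curated-data factors into a single 0-1 score.

    The weights and the per-check penalty are arbitrary illustrative
    choices; a real methodology would derive them from the project's
    data quality dimensions.
    """
    base = (label.source.holder_declared_level
            + (1.0 if label.source.creator_credentials_ok else 0.5)) / 2
    penalty = min(0.05 * label.unresolved_checks, 0.5)  # cap the penalty
    score = (0.4 * base
             + 0.3 * label.tool_certification
             + 0.3 * label.annotator_health_literacy
             - penalty)
    return max(0.0, min(1.0, score))  # clamp to the 0-1 range
```

For example, a source declared at 0.8 with validated credentials, certified tools (0.9), literate annotators (0.8), and two unresolved checks would score 0.77 under this toy weighting.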

Keywords

data quality framework, data quality assessment, data quality, secondary use, health data

  • Impact indicators (provided by BIP!):
    Selected citations: 0 (derived from selected sources; an alternative to the "Influence" indicator, which reflects the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)
    Popularity: Average (the "current" impact/attention, the "hype", of an article in the research community at large, based on the underlying citation network)
    Influence: Average (the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)
    Impulse: Average (the initial momentum of an article directly after its publication, based on the underlying citation network)