Powered by OpenAIRE graph
Edinburgh Research A...

Dr.Aid: a formal framework assisting compliance with data governance rules

Authors: Zhao, Rui

Abstract

Data “Terms of Use” (ToU) are widespread and go by different names, such as “Privacy Policy” or “Data Consent”, and everyone who handles data has to deal with them. In general they are a form of data-governance rules, which include not only access controls but also more general terms such as obligations. At present, the design and handling of data-governance rules is often polarized: either open data with almost no governance rules, or restricted data with tight governance rules accompanied by applications, training, supervision, etc. This poses challenges for researchers, especially when they combine data from different sources and share their results with others. Existing research on automated compliance handling falls into two major categories: single-infrastructural and data-flow tracking. The two have different properties and features, but both normally target only access-control policies, and both fall short in supporting rule combination for multi-input-multi-output (MIMO) processes and therefore for arbitrary directed acyclic graphs (DAGs).

This thesis introduces a novel extensible language designed for MIMO processes and the DAGs composed from them. It has two parts: the data rule, for writing data terms of use, and the flow rule, for writing how processes affect those terms of use. In addition to the policy derivation expected of the data-flow tracking category, it supports obligations, which also mimic access controls, to demonstrate the language features. The language is formalised using situation calculus, and the reasoning process is explained. Proofs are given to demonstrate the correctness of whole-graph reasoning for any DAG, which enables further optimisation of the reasoning.

Dr.Aid, the prototype system implementation, is then introduced and discussed; its name is an abbreviation of Data Rule Aid. It takes provenance as the source of data-flow graphs, uses Golog as the situation calculus reasoner, and supports rule identification through the recognizer component. Two provenance schemas, CWL-Prov and S-Prov, are supported to demonstrate generality across the two main types of workflow management systems, file-oriented and data-streaming.

Evaluations of the language and the system are then presented. Apart from the already-introduced proofs of correctness of the reasoning, the evaluation covers how the proposed language meets all five principles used to evaluate related research, the capacity of the language to encode real-world data ToU, and the capability of the system to support real-world data-use activities in different scientific communities. The evaluation shows that our language model can encode a substantial proportion of real-life data ToU (90% for actioning rules and 74% for all rules), and that our framework has the potential to be used in a wide range of applications. Limitations and future work are discussed afterwards, along with our vision of future data activities using technologies similar to those proposed in this thesis. We believe the work presented pioneers a productive direction for research in this domain.
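
The two-part idea in the abstract (data rules attached to input data, flow rules describing how a process changes them, evaluated over a whole DAG) can be illustrated with a small sketch. This is not Dr.Aid's rule language, and it does not use situation calculus or Golog. It is a simplified stand-in that assumes rules are plain strings, that a process passes on the union of its inputs' rules, and that a flow rule can only drop named rules. All node names, rule strings, and the `propagate` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow DAG: each node maps to the set of nodes it reads from.
# "merge" has two inputs and two consumers, so it is a MIMO process.
predecessors = {
    "clean": {"survey_data"},
    "merge": {"clean", "reference_data"},
    "report": {"merge"},
    "archive": {"merge"},
}

# Assumed "data rules": obligations attached to each source dataset.
data_rules = {
    "survey_data": {"acknowledge:provider_A", "no_redistribution"},
    "reference_data": {"acknowledge:provider_B"},
}

# Assumed "flow rules": how a process alters the rules passing through it.
# In this simplification a process can only drop named rules.
flow_rules = {
    "merge": {"drop": {"no_redistribution"}},
}


def propagate(predecessors, data_rules, flow_rules):
    """Derive the rule set at every node by visiting the DAG in topological order."""
    derived = {}
    for node in TopologicalSorter(predecessors).static_order():
        inputs = predecessors.get(node, set())
        if not inputs:
            # Source data: start from its own data rules.
            derived[node] = set(data_rules.get(node, set()))
            continue
        # Combine the rules of all inputs, then apply this node's flow rule.
        combined = set().union(*(derived[p] for p in inputs))
        dropped = flow_rules.get(node, {}).get("drop", set())
        derived[node] = combined - dropped
    return derived


derived = propagate(predecessors, data_rules, flow_rules)
for node in sorted(derived):
    print(node, sorted(derived[node]))
```

In this sketch every output of a process receives the same rules. A fuller model would let a flow rule differ per output port, which is the situation the abstract describes for MIMO processes.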

Country
United Kingdom
Keywords

MIMO, DAGs, directed acyclic graphs, S-Prov, CWLProv, Data Rule Aid, Dr.Aid, data-flow tracking, data governance rules, multi-input-multi-output processes, automated compliance handling, single-infrastructural, data governance
