PROX: Approximated Summarization of Data Provenance.

Publication type: Article, Conference object
Date: 01 Jan 2016
Country: France
Language: English
Journal: Advances in database technology : proceedings. International Conference on Extending Database Technology, volume 2016
Funded by: NSF | III: Medium: Collaborativ..., EC | MODAS

Authors: Ainy, Eleanor; Bourhis, Pierre; Davidson, Susan; Deutch, Daniel; Milo, Tova

pmid: 27570843

pmc: PMC5001561

Abstract

Many modern applications involve collecting large amounts of data from multiple sources, and then aggregating and manipulating it in intricate ways. The complexity of such applications, combined with the size of the collected data, makes it difficult to understand the application logic and how information was derived. Data provenance has proven helpful in this respect in different contexts; however, maintaining and presenting the full and exact provenance may be infeasible due to its size and complex structure. For that reason, we introduce the notion of approximated summarized provenance, where we seek a compact representation of the provenance at the possible cost of information loss. Based on this notion, we have developed PROX, a system for the management, presentation and use of data provenance for complex applications. We propose to demonstrate PROX in the context of a movie-rating crowdsourcing system, letting participants view provenance summarization and use it to gain insights into the application and its underlying data.
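The trade-off named in the abstract, a more compact provenance representation at the possible cost of information loss, can be illustrated with a toy sketch. This is not PROX's algorithm: the data, the user-to-group mapping, and the function names `exact_provenance`, `summarize`, and `average` are all hypothetical and chosen only for illustration. Exact provenance keeps one term per contributing user; the summary merges users into groups, so it is smaller but can no longer answer questions about an individual user.

```python
from collections import defaultdict

# Hypothetical toy data: (user, group, rating) for a single movie.
RATINGS = [
    ("u1", "critic", 4.0),
    ("u2", "critic", 5.0),
    ("u3", "audience", 2.0),
    ("u4", "audience", 3.0),
    ("u5", "audience", 1.0),
]

def exact_provenance(ratings):
    """One term per contributing user: {annotation: (rating_sum, count)}."""
    return {user: (rating, 1) for user, _group, rating in ratings}

def summarize(ratings):
    """Map each user annotation to its group and merge the terms."""
    merged = defaultdict(lambda: (0.0, 0))
    for _user, group, rating in ratings:
        total, count = merged[group]
        merged[group] = (total + rating, count + 1)
    return dict(merged)

def average(provenance, excluded=()):
    """Average rating over all terms whose annotation is not excluded."""
    kept = [term for ann, term in provenance.items() if ann not in excluded]
    total = sum(t for t, _ in kept)
    count = sum(c for _, c in kept)
    return total / count if count else None

exact = exact_provenance(RATINGS)
summary = summarize(RATINGS)

# Both representations agree on the overall average.
print(len(exact), len(summary), average(exact), average(summary))
# Group-level "what if" questions still work on the summary.
print(average(summary, excluded={"audience"}))
# User-level questions are only answerable from the exact provenance.
print(average(exact, excluded={"u3"}), average(summary, excluded={"u3"}))
```

In this sketch the summary shrinks five terms to two and still supports removing a whole group, but excluding a single user leaves the summarized result unchanged, which is the kind of information loss the abstract refers to.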

Country

France

Related Organizations

- French Institute for Research in Computer Science and Automation, France
- French National Centre for Scientific Research, France
- Inria centre at the University of Lille, France
- University of Pennsylvania, United States
- University of Lille, France

Keywords

[INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]

Impact by BIP!

- selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.