ZENODO
Project deliverable . 2020
License: CC BY
Data sources: Datacite
ZENODO
Other literature type . 2020
License: CC BY
Data sources: ZENODO

BigDataStack - D2.2 Requirements & State of the Art Analysis – II

Authors: Orlando Avila-García; Paula Ta-Shma; Yosef Moatti; Everton Luís Berz; Ana Juan Ferrer; Ana Belén González Méndez; Bernat Quesada; +19 Authors

Abstract

This is the second in a series of three deliverables specifying the stakeholder and technical (software and technology) requirements for BigDataStack. The requirements analysis in this document takes a top-down approach to the user requirements, which were collected from the BigDataStack use case providers. This is complemented by a bottom-up approach that aims to identify, collect, and analyse the remaining stakeholder requirements, as well as the technical requirements, from the BigDataStack technology providers.
