publication . Article . Conference object . Other literature type . 2016

I'll take that to go: Big data bags and minimal identifiers for exchange of large, complex datasets

Chard, Kyle; D'Arcy, Michel; Heavner, Ben; Foster, Ian; Kesselman, Carl; Madduri, Ravi; Rodriguez, Alexis; Soiland-Reyes, Stian; Goble, Carole; Clark, Kristi; ...
  • Published: 08 Dec 2016
  • Country: Switzerland
Abstract
Big data workflows often require the assembly and exchange of complex, multi-element datasets. For example, in biomedical applications, the input to an analytic pipeline can be a dataset consisting of thousands of images and genome sequences assembled from diverse repositories, requiring a description of the contents of the dataset in a concise and unambiguous form. Typical approaches to creating datasets for big data workflows assume that all data reside in a single location, requiring costly data marshaling and permitting errors of omission and commission because dataset members are not explicitly specified. We address these issues by proposing simple methods and...
Subjects
free text keywords: data analysis, BDBags, Big Data analysis, Big Data bags, Big Data sharing, Minid, data assembling, data collections, data descriptions, datasets, identifiers, research objects, Encoding, Payloads, Robustness, Uniform resource locators, bdbag, ResearchInstitutes_Networks_Beacons/02/04, Data Science Institute, Data mining, computer.software_genre, computer, Metadata, Marshalling, Identifier, Big data, business.industry, business, Encoding (memory), Workflow, Robustness (computer science), Software, Computer science
Funded by
EC| BioExcel
Project
BioExcel
Centre of Excellence for Biomolecular Research
  • Funder: European Commission (EC)
  • Project Code: 675728
  • Funding stream: H2020 | RIA