
Parallel query processing in a polystore

Publication: Article, 03 Feb 2021, France, Spain, English
Publisher: Springer Science and Business Media LLC
Journal: Distributed and Parallel Databases, volume 39, pages 939-977 (issn: 0926-8782, eissn: 1573-7578)

Authors: Kranas, Pavlos; Kolev, Boyan; Levchenko, Oleksandra; Pacitti, Esther; Valduriez, Patrick; Jiménez-Peris, Ricardo; Patiño-Martinez, Marta

doi: 10.1007/s10619-021-07322-5


Abstract

The blooming of different data stores has made polystores a major topic in the cloud and big data landscape. As the amount of data grows rapidly, it becomes critical to exploit the inherent parallel processing capabilities of underlying data stores and data processing platforms. To fully achieve this, a polystore should: (i) preserve the expressivity of each data store's native query or scripting language and (ii) leverage a distributed architecture to enable parallel data integration, i.e. joins, on top of parallel retrieval of underlying partitioned datasets. In this paper, we address these points by: (i) using the polyglot approach of the CloudMdsQL query language that allows native queries to be expressed as inline scripts and combined with SQL statements for ad-hoc integration and (ii) incorporating the approach within the LeanXcale distributed query engine, thus allowing for native scripts to be processed in parallel at data store shards. In addition, (iii) efficient optimization techniques, such as bind join, can take place to improve the performance of selective joins. We evaluate the performance benefits of exploiting parallelism in combination with high expressivity and optimization through our experimental validation.
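The bind join mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation or the CloudMdsQL/LeanXcale API; the two in-memory "data stores", the function names (`fetch_customers`, `fetch_orders`, `bind_join`), and the sample rows are all hypothetical. The idea shown is only the general one: retrieve the selective side of the join first, then push its join keys down as a filter on the other store so that only matching rows are transferred.

```python
# Hypothetical in-memory stand-ins for two separate data stores.
CUSTOMERS = [
    {"customer_id": 10, "name": "Ana", "country": "ES"},
    {"customer_id": 11, "name": "Luc", "country": "FR"},
    {"customer_id": 12, "name": "Eva", "country": "ES"},
]
ORDERS = [
    {"order_id": 1, "customer_id": 10, "total": 25.0},
    {"order_id": 2, "customer_id": 11, "total": 40.0},
    {"order_id": 3, "customer_id": 12, "total": 15.0},
    {"order_id": 4, "customer_id": 10, "total": 60.0},
]


def fetch_customers(country):
    """Simulate a selective native query against the customer store."""
    return [c for c in CUSTOMERS if c["country"] == country]


def fetch_orders(customer_ids=None):
    """Simulate a native query against the order store.

    When customer_ids is given, it plays the role of a pushed-down
    filter (comparable to an IN list), so only matching rows are returned.
    """
    if customer_ids is None:
        return list(ORDERS)
    return [o for o in ORDERS if o["customer_id"] in customer_ids]


def bind_join(country):
    """Join customers and orders, binding the order query to the customer keys."""
    customers = fetch_customers(country)               # step 1: selective side first
    keys = {c["customer_id"] for c in customers}       # step 2: collect join keys
    orders = fetch_orders(keys)                        # step 3: push keys to other store
    by_id = {c["customer_id"]: c for c in customers}
    return [{**by_id[o["customer_id"]], **o} for o in orders]  # step 4: combine rows


result = bind_join("FR")
```

In this toy setup only one of the four orders is fetched for `"FR"`, instead of scanning and transferring all of them before joining. In a distributed engine, the same filtered retrieval could additionally run in parallel at each shard, which this sketch does not model.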

Countries

France, Spain

Related Organizations

Université de Montpellier, France
Université de Montpellier (EPE), France
Université Montpellier, France
University Montpellier 2, France
French National Centre for Scientific Research, France


Keywords

Computer science (Informática), Query processing, Distributed and parallel databases, Database integration, Polystores, Query languages, [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB], Heterogeneous databases

Impact by BIP!

- citations: 5. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.