High-Performance Query Processing with NVMe Arrays: Spilling without Killing Performance

Publication type: Article, Other literature type

Date: 18 Dec 2024

Language: English

Publisher: Association for Computing Machinery (ACM)

Journal: Proceedings of the ACM on Management of Data, volume 2, pages 1-27 (eISSN: 2836-6573)

Funded by: EC | CODAC

Authors: Kuschewski, Maximilian; Giceva, Jana; Neumann, Thomas; Leis, Viktor

doi: 10.1145/3698813

Abstract

This paper aims to bridge the gap between fast in-memory query engines and slow but robust engines that can utilize external storage. We find that current systems have to choose between fast in-memory operators and slower out-of-memory operators. We present a solution that leverages two independent but complementary techniques: First, we propose adaptive materialization, which can turn any hash-based in-memory operator into an out-of-memory operator without reducing in-memory performance. Second, we introduce self-regulating compression, which optimizes the throughput of spilling operators based on the current workload and available hardware. We evaluate these techniques using the prototype query engine Spilly, which matches the performance of state-of-the-art in-memory systems, but also efficiently executes large out-of-memory workloads by spilling to NVMe arrays.

Related Organizations

Technical University of Munich
Germany

Keywords

OLAP, high-performance, out-of-memory, spilling, out-of-core, NVMe, SSD

Impact by BIP!

Selected citations: 3

Popularity: Top 10%

Influence: Average

Impulse: Average

Fields of Science

natural sciences

computer and information sciences
