Algorithmic Complexity Attacks on Dynamic Learned Indexes

Publication type: Article, Preprint
Date: 01 Dec 2023 (embargo end date: 01 Jan 2024)
Language: English
Publisher: Association for Computing Machinery (ACM)
Journal: Proceedings of the VLDB Endowment, volume 17, pages 780-793 (ISSN: 2150-8097)
Funded by: NSF | SPX: Collaborative Resear..., NSF | Collaborative Research: S..., NSF | SPX: Collaborative Resear... +1 projects

Authors: Yang, Rui; Kornaropoulos, Evgenios M.; Cheng, Yue

DOI: 10.14778/3636218.3636232, 10.48550/arxiv.2403.12433

arXiv: 2403.12433

Abstract

Learned Index Structures (LIS) view a sorted index as a model that learns the data distribution, takes a data element key as input, and outputs the predicted position of the key. The original LIS handles only lookup operations and does not support updates, which makes it impractical for typical workloads. To address this limitation, recent studies have focused on designing efficient dynamic learned indexes. ALEX, the first and one of the most representative dynamic learned index structures, supports updates through a series of design choices, including adaptive key space partitioning, dynamic model retraining, and sophisticated engineering and policies that prioritize read/write performance. While these design choices improve average-case performance, the emphasis on flexibility and performance enlarges the attack surface by allowing adversarial behaviors that maximize ALEX's memory space and time complexity in worst-case scenarios. In this work, we present the first systematic investigation of algorithmic complexity attacks (ACAs) targeting the worst-case scenarios of ALEX. We introduce new ACAs that fall into two categories, space ACAs and time ACAs, which target the memory space and time complexity, respectively. First, our space ACA on data nodes exploits ALEX's gapped array layout and uses Multiple-Choice Knapsack (MCK) to generate an optimal adversarial insertion plan that maximizes memory consumption at the data node level. Second, our space ACA on internal nodes exploits ALEX's catastrophic cost mitigation mechanism, causing an out-of-memory (OOM) error with only a few hundred adversarial insertions. Third, our time ACA generates pathological insertions that increase the disparity between the actual key distribution and the linear models of data nodes, degrading runtime performance by up to 1,641× compared to ALEX operating under legitimate workloads.
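
To make the idea of a model that "outputs the predicted position of the key" concrete, the following is a minimal sketch, not ALEX's implementation and not the attack from the paper. It fits a single linear model to a sorted key array and measures how far the predicted positions fall from the true ones, first on evenly spaced keys and then after a dense cluster of keys is inserted. All names (`fit_linear`, `predict`, `mean_abs_error`) and the key values are assumptions chosen for illustration. Gapped arrays, node splitting, and ALEX's cost model are deliberately left out.

```python
def fit_linear(keys):
    """Least-squares fit of position ~ slope * key + intercept over sorted keys."""
    n = len(keys)
    mean_key = sum(keys) / n
    mean_pos = (n - 1) / 2
    var = sum((k - mean_key) ** 2 for k in keys)
    if var == 0:
        # All keys identical: fall back to a constant prediction.
        return 0.0, mean_pos
    cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(keys))
    slope = cov / var
    return slope, mean_pos - slope * mean_key


def predict(model, key, n):
    """Predicted array position for key, clamped to the valid range [0, n - 1]."""
    slope, intercept = model
    pos = int(round(slope * key + intercept))
    return min(max(pos, 0), n - 1)


def mean_abs_error(keys, model):
    """Average distance between predicted and true positions of the stored keys."""
    n = len(keys)
    return sum(abs(predict(model, k, n) - i) for i, k in enumerate(keys)) / n


# Hypothetical benign workload: 100 evenly spaced keys, which a line fits well.
benign = [1000 * i for i in range(100)]
benign_error = mean_abs_error(benign, fit_linear(benign))

# Hypothetical skewed workload: 100 extra keys packed into a narrow key range.
# Even after refitting, a single linear model cannot follow the sudden jump.
skewed = sorted(benign + list(range(50001, 50101)))
skewed_error = mean_abs_error(skewed, fit_linear(skewed))

print(f"benign mean error: {benign_error:.2f} positions")
print(f"skewed mean error: {skewed_error:.2f} positions")
```

The prediction error is the quantity a lookup has to correct by searching around the predicted slot, so a larger error means more work per operation. The sketch only illustrates that relationship; the insertion strategies and the measured slowdowns reported in the abstract come from the paper's own experiments on ALEX.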

Keywords

FOS: Computer and information sciences, Computer Science - Cryptography and Security, Computer Science - Databases, Databases (cs.DB), Cryptography and Security (cs.CR)

Related research products

aca-dlis software on GitHub (relation: IsRelatedTo)

Impact by BIP!

Selected citations: 2
Popularity: Average
Influence: Average
Impulse: Average