Publication: Article, Preprint, Conference object, 01 Nov 2021
Embargo end date: 01 Jan 2021
Publisher: IEEE
Journal: 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE)

Authors: Remil, Youcef; Bendimerad, Anes; Mathonat, Romain; Chaleat, Philippe; Kaytoue, Mehdi

doi: 10.1109/ase51524.2021.9678915 , 10.48550/arxiv.2108.03906

arXiv: 2108.03906

"What makes my queries slow?": Subgroup Discovery for SQL Workload Analysis

Abstract

Among the daily tasks of database administrators (DBAs), analysing query workloads to identify schema issues and improve performance is crucial. Although DBAs can easily pinpoint queries that repeatedly cause performance issues, it remains challenging to automatically identify subsets of queries that share some properties (a pattern) and simultaneously foster some target measure, such as execution time. Patterns are defined on combinations of query clauses, environment variables, database alerts and metrics, and help answer questions such as: What makes SQL queries slow? What makes I/O communications high? Automatically discovering these patterns in a huge search space, and providing them as hypotheses that help localize issues and root causes, is important in the context of explainable AI. To tackle this problem, we introduce an original approach rooted in Subgroup Discovery. We show how to instantiate and develop this generic data-mining framework to identify potential causes of SQL workload issues. We believe that such a data-mining technique is not trivial for DBAs to apply. As such, we also provide a visualization tool for interactive knowledge discovery. We analyse a one-week workload from hundreds of databases from our company, make both the dataset and source code available, and experimentally show that insightful hypotheses can be discovered.
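
To make the idea of a pattern that fosters a target measure concrete, here is a minimal Python sketch of subgroup discovery on a toy workload. It is not the authors' implementation: the abstract does not state the quality measure or the search strategy, so the sketch assumes a simple mean-shift quality (square root of the subgroup size times the gap between the subgroup's mean execution time and the overall mean) and an exhaustive search over small patterns. The data, the property names (`has_join`, `no_index`, `order_by`) and the functions `quality` and `top_patterns` are all hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy workload: (set of query properties, execution time in ms).
QUERIES = [
    ({"has_join", "no_index"}, 900.0),
    ({"has_join", "no_index", "order_by"}, 1100.0),
    ({"has_join"}, 120.0),
    ({"order_by"}, 80.0),
    ({"no_index"}, 300.0),
    (set(), 50.0),
]


def quality(pattern, queries):
    """Assumed mean-shift quality: sqrt(n) * (subgroup mean - overall mean)."""
    # A query is covered when it has every property in the pattern.
    covered = [t for props, t in queries if pattern <= props]
    if not covered:
        return float("-inf")
    overall = mean(t for _, t in queries)
    return len(covered) ** 0.5 * (mean(covered) - overall)


def top_patterns(queries, max_len=2, k=3):
    """Score every pattern of up to max_len properties and keep the k best."""
    items = sorted(set().union(*(props for props, _ in queries)))
    candidates = [
        frozenset(combo)
        for size in range(1, max_len + 1)
        for combo in combinations(items, size)
    ]
    scored = [(quality(p, queries), sorted(p)) for p in candidates]
    # Highest quality first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


if __name__ == "__main__":
    for score, pattern in top_patterns(QUERIES):
        print(f"{' AND '.join(pattern)}: quality={score:.1f}")
```

On this toy data the best-scoring pattern is `has_join AND no_index`: it covers only two queries, but both are far slower than the workload average. Exhaustive enumeration is used here only for readability; on a real workload the search space grows combinatorially, which is the difficulty the abstract points to.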

Related Organizations

French National Centre for Scientific Research
France
Claude Bernard University Lyon 1
France
Institut National des Sciences Appliquées de Lyon
France
University of Lyon System
France
UNIVERSITE LUMIERE LYON 2
France

Keywords

[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], FOS: Computer and information sciences, Computer Science - Artificial Intelligence, Databases (cs.DB), Database, Software Engineering (cs.SE), Computer Science - Software Engineering, Artificial Intelligence (cs.AI), Workload Analysis, Computer Science - Databases, Explainable AI, Data Mining, Subgroup Discovery, Data Visualisation

Related research products (2)

pgsentinel, software on GitHub (IsRelatedTo)
sd-4sql, software on GitHub (IsRelatedTo)

Impact by BIP!

Selected citations: 8. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.