LLM-Augmented Exploratory Testing: A Framework for Intelligent Risk Discovery, Hypothesis Generation, and Cognitive Enhancement in Software Quality Engineering

Exploratory testing has traditionally depended on the creativity, intuition, and domain knowledge of skilled human testers, who design and execute tests dynamically while learning about the system under evaluation. Large Language Models (LLMs) open new augmentation pathways: automated hypothesis generation, risk discovery, test idea expansion, and structured session-level reasoning that extend human exploratory abilities. This paper introduces a framework for LLM-augmented exploratory testing that integrates established methods such as Session-Based Test Management (SBTM) with AI-driven techniques drawn from exploratory testing research, risk-based testing strategies, automated and search-based test generation, and machine-learning system verification. Within this framework, LLMs help shape richer and more comprehensive test charters, identify latent or emergent risks, reason through complex multi-step failure scenarios, and broaden exploratory coverage through adaptive test idea generation. Three conceptual diagrams (the Transformer architecture underlying LLM reasoning, the SBTM work breakdown supporting structured exploration, and DeepTest's transformation-based defect-exposure approach) illustrate how these research areas converge to support augmented exploratory practice. Together, these components enable an approach that strengthens test coverage, reduces tester cognitive load, and improves the consistency and documentation of exploratory insights across teams and testing contexts.

Keywords

Exploratory Testing; Large Language Models; Risk-Based Testing; Session-Based Test Management; Test Automation; Hypothesis Generation; Machine Learning Testing; Test Idea Generation; AI-Assisted Quality Engineering.
