
# DELTA

This repository contains supplementary materials for the research paper "Can Large Language Models Decompose User Stories into Tasks? Exploring the Role of Prompting Strategies and Models", submitted to the Automated Software Engineering Conference 2026.

## Structure

- **DELTA** – Pipeline implementation for decomposing user stories into tasks and evaluating the quality of the resulting decompositions
- **Prompt iterations** – Prompt engineering experiments and observations
- **Descriptive statistics** – Task title length statistics
- **Evaluation session** – Human evaluation session materials, including the informed consent form and the evaluation session protocol
- **Statistics** – Statistical results per research question (RQ)

## Research Questions

- **RQ1:** How do persona-based zero- and few-shot prompting strategies affect text similarity between LLM-generated and human-created decompositions?
- **RQ2:** How do model family and size affect text similarity between LLM-generated and human-created decompositions?
- **RQ3:** To what extent do text similarity metrics align with human expert judgments of decomposition quality?
- **RQ4:** To what extent does the criteria-based automated evaluation module of DELTA agree with human expert judgments of decomposition quality?
