Powered by OpenAIRE graph
Preprint
Data sources: ZENODO

A Phased Co-Construction Methodology for Reverse Specification Generation: Why AI-Native Documentation Requires the Human

Authors: Hirashima, Daishiro

Abstract

Reverse specification generation, the task of inferring documentation from an existing codebase, remains a persistent challenge across practical software engineering domains: legacy system modernization, onboarding of new contributors, contractual deliverable preparation, and regulatory compliance. The widespread adoption of large language models (LLMs) has made this task ostensibly automatable, yet pure-AI approaches cannot reach tacit knowledge that is absent from the code itself; they fill the gap with fluent but groundless text that erodes the practical reliability of the generated documentation. Conversely, pure-human authorship lacks the exhaustiveness and consistency needed to complete the task within realistic time bounds. Generic Human-in-the-Loop (HITL) patterns leave when, why, and how the human contributes to ad hoc decisions, and therefore fail to support the epistemic structure specific to reverse specification. This paper proposes the Phased Co-Construction Methodology for reverse specification generation from existing codebases. The methodology rests on three foundational structures: (i) a formalization that segments codebase knowledge into three layers (explicit, implicit, and tacit); (ii) a Responsibility Allocation Triad that explicitly assigns each phase to one of three actors (Human, AI, or Mechanical); and (iii) a six-phase state machine spanning reconnaissance, planning, parallel investigation, mechanical verification, dialogue-based refinement, and delivery. We further introduce (iv) a statement-level confidence convention that embeds hallucination control directly into the output protocol rather than relegating it to downstream filters, and (v) an Abandoned-as-First-Class principle that delivers permanently unresolvable uncertainties as a first-class chapter of the final artifact.
Surveying related work across pure-AI specification synthesis, forward requirements elicitation, generic HITL LLMOps, abstention and calibrated uncertainty, tacit knowledge elicitation, AI-native SDLC, and industrial hybrid approaches, we show that no prior work integrates all five of these elements. Implementability is demonstrated through cc-rsg (Claude Code Reverse Spec Generator), an open-source reference implementation realized as a Claude Code skill. A case study on a Tetris codebase shows 100% inventory coverage, zero cross-chapter inconsistencies, and the structural emergence of abandoned questions (5/6) under SME-inaccessible conditions. This paper is positioned as a methodology paper, and cc-rsg is presented as one instantiation rather than the sole implementation.
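
As a reading aid, the six-phase state machine, the three-actor triad, and the statement-level confidence convention named in the abstract might be modeled roughly as follows. This is a minimal sketch and is not taken from cc-rsg. The phase names and actor names come from the abstract; the phase-to-actor mapping in `PHASE_OWNER`, the `Confidence` labels, and the helper names `next_phase` and `split_abandoned` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Actor(Enum):
    HUMAN = "human"
    AI = "ai"
    MECHANICAL = "mechanical"


class Phase(Enum):
    # Order follows the phase list given in the abstract.
    RECONNAISSANCE = 1
    PLANNING = 2
    PARALLEL_INVESTIGATION = 3
    MECHANICAL_VERIFICATION = 4
    DIALOGUE_REFINEMENT = 5
    DELIVERY = 6


# ASSUMPTION: the abstract says each phase is assigned to exactly one actor
# but does not say which one. This mapping is a placeholder, not the paper's.
PHASE_OWNER = {
    Phase.RECONNAISSANCE: Actor.AI,
    Phase.PLANNING: Actor.HUMAN,
    Phase.PARALLEL_INVESTIGATION: Actor.AI,
    Phase.MECHANICAL_VERIFICATION: Actor.MECHANICAL,
    Phase.DIALOGUE_REFINEMENT: Actor.HUMAN,
    Phase.DELIVERY: Actor.MECHANICAL,
}


def next_phase(current):
    """Return the phase that follows `current`, or None after delivery."""
    ordered = list(Phase)
    index = ordered.index(current)
    return ordered[index + 1] if index + 1 < len(ordered) else None


class Confidence(Enum):
    # ASSUMPTION: hypothetical labels; the abstract names the convention only.
    CONFIRMED = "confirmed"
    INFERRED = "inferred"
    ABANDONED = "abandoned"


@dataclass(frozen=True)
class Statement:
    text: str
    confidence: Confidence


def split_abandoned(statements):
    """Separate abandoned statements so they can form their own chapter."""
    kept = [s for s in statements if s.confidence is not Confidence.ABANDONED]
    abandoned = [s for s in statements if s.confidence is Confidence.ABANDONED]
    return kept, abandoned
```

Carrying the confidence label on each statement, rather than filtering afterwards, is one way to read the abstract's point that hallucination control belongs in the output protocol; keeping the abandoned statements instead of discarding them reflects the Abandoned-as-First-Class principle.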
