Process Discipline as the Key Variable in AI-Assisted Enterprise Software Development: A Natural Experiment

Findings: Same team, same AI tools, different process conditions. Under a structured SDLC, the team satisfied 18/18 enterprise dimensions at 8-10x elite productivity benchmarks. When process authority shifted to non-technical stakeholders, it satisfied 2/18 dimensions, with no tests, no quality pipelines, and no code review. Measured across 287 FTE-days, 1.48 million lines added, 811,000 lines removed, and 18 enterprise dimensions derived from SOC 2, NIST, OWASP, DORA, CIS, and CNCF frameworks.

Scale: No comparable study in the AI-assisted development literature approaches this duration or granularity. Peng et al. (2023) measured a single isolated task. METR (2025) tested 16 developers on individual issues. DeputyDev (2025) observed 300 engineers but had no unstructured comparison arm. This study spans 13.7 FTE-months of sustained enterprise development with commit-level traceability across multiple codebases.

Literature gaps addressed:
(1) No published study isolates process discipline as a controlled variable in AI-assisted development. This paper presents a natural experiment that holds team and tools constant while the development process varies through an exogenous organizational change.
(2) The speed literature produces contradictory findings (55.8% speedups vs. 19% slowdowns) with no reconciliation. This paper argues that these are measurements of different process conditions, not conflicting results.
(3) No published benchmark exists for sustained AI-assisted commit rates; this paper reports 10.7 commits/FTE-day over 287 FTE-days.
(4) DORA's "amplifier" thesis rests on correlational survey data; this paper provides project-level evidence with a causal mechanism.
(5) Model collapse research (Shumailov et al. 2024, Nature) has not been connected to practical codebase quality; this paper identifies the clean starting codebase as a multiplicative requirement grounded in generation loss theory.
(6) The role of organizational governance in AI-assisted development quality has not been empirically demonstrated; this paper documents a quality collapse caused by an organizational decision, not a technical one.

All metrics are derived from git history, source code analysis, and published industry benchmarks. The methodology is described for replication.

Keywords

code quality, technical debt, software development lifecycle, AI-assisted software development, enterprise software quality, SDLC, function point analysis, software productivity, process discipline, natural experiment
