
The integration of generative AI into humanistic research introduces a structural tension: large language models (LLMs) can produce fluent, authoritative-sounding text, yet their outputs often lack verifiable sources—the very foundation upon which humanistic scholarship rests. This paper argues that traceability—the ability to trace any claim back to its source page number, edition, and material context—constitutes a non-negotiable constraint, rather than an optional feature, for AI-assisted humanistic research. We liken page anchors to Ariadne's thread: in the labyrinth of generative fluency, it is this thread that ensures scholars can find their way back to the source. We situate this argument within the longue durée history of knowledge infrastructure revolutions—from Herder's 1792 insight (that the progress of understanding depends not on individual intellect but on transformations in infrastructure), through print and digitization, to the current generative turn—to demonstrate that traceability has remained the invariant core principle across all three revolutions. We then diagnose the structural deficiencies of current RAG systems, vision-language models (VLMs), and TEI standards when confronted with the specific demands of humanistic inquiry: page-level verification, version awareness, uncertainty disclosure, and falsifiability. The paper proposes the AIH Infra normative framework, centered on four mechanisms—page anchors as the sole hard threshold, human-in-the-loop as a scholarly obligation, NO_EVIDENCE as epistemic honesty, and a four-level compliance schema (Level 0–3)—and translates these into implementation-agnostic architectural requirements. A pilot evaluation across three corpora demonstrates feasibility and identifies key failure modes. The framework is designed for technology-evolution resilience: what it constrains is not any specific implementation but an epistemological baseline that does not change with technological iteration.
retrieval-augmented generation, generative AI ethics, citation-first generation, page anchors, traceable scholarship, human-in-the-loop, digital humanities infrastructure
