
This paper formalizes the AI Visibility Aggregation Threshold Theorem, which states that stable entity representation in large language model (LLM) training requires structured corpus aggregation above a minimum survival threshold. Drawing on documented multi-platform ingestion observations, the study identifies a measurable discontinuity: an entity with no representation under minimal corpus conditions achieved consistent multi-model recall after structured corpus expansion under low-authority constraints. The theorem situates this threshold behavior within previously documented shallow-pass selection mechanisms, budget-constrained ingestion, and structured signal compression dynamics. It defines aggregation not as a simple document count, but as survival-eligible structured signal mass capable of persisting through crawl inclusion, selection filtering, and training compression. The result provides an existence proof for aggregation-driven entity formation under constrained authority conditions and clarifies the boundary variables that influence the threshold's magnitude, including authority, redundancy coherence, and positional weighting. This work extends the AI Visibility framework by formally defining the conditions under which an entity-level signal transitions from non-representation to stable representation in LLM training systems.
LLM Corpus Aggregation, AI Visibility Framework, LLM Visibility Theorem, AI Visibility, AI Visibility Aggregation Threshold, AI Visibility Theorem
