
This report documents the first structural analysis of the arXiv archive within the ComputeCosts observatory. The archive records metadata and abstracts of recently published scientific papers and treats this collection as a measurable signal of how computational infrastructure is described in scientific writing.

The dataset analysed in this report contains 6690 unique papers obtained through a deterministic ingestion pipeline. The epoch 1 observation period spans from 1 January 2025 00:00 UTC until 19 February 2026, corresponding to the ingestion state frozen at the time of this report. The analysis deliberately collapses the entire dataset into a single baseline snapshot rather than attempting to interpret short-term variation. The objective is not to measure trends but to determine whether the archive contains measurable signals relating to cloud infrastructure, local computation and operational constraints.

Initial measurements show that abstracts already contain operationally relevant signals, even though detailed deployment narratives are not typically expected in scientific papers. Mentions of self-hosted and local execution vocabulary indicate that local compute setups are explicitly used in a measurable subset of the dataset. Hardware references to consumer GPUs, specifically NVIDIA RTX-class devices including the RTX 4090, demonstrate that workstation-grade hardware is used for scientific computing in published research.

These observations are particularly valuable because the archive is accompanied by a complete local PDF collection for the full dataset. The present report does not analyse those PDFs, but their availability means that the abstract-level signals identified here can be deepened later through full-text extraction, context reconstruction and evidence snippets. The abstract results therefore function as a calibration step that justifies a second phase of deeper paper-level analysis.
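
As an illustration of what an abstract-level signal measurement of this kind could look like, the following is a minimal sketch and not the observatory's actual pipeline. The pattern names, the vocabulary in `SIGNAL_PATTERNS`, the function `count_signal_papers` and the sample abstracts are all hypothetical and chosen only for demonstration. It counts how many abstracts mention each signal at least once, which matches the single-snapshot framing of the analysis.

```python
import re
from collections import Counter

# Hypothetical signal vocabulary: the report's real term lists are not reproduced here.
SIGNAL_PATTERNS = {
    "local_execution": re.compile(
        r"\b(self[- ]hosted|on[- ]premises?|locally (?:run|hosted|deployed))\b",
        re.IGNORECASE,
    ),
    "consumer_gpu": re.compile(r"\bRTX\s?\d{4}\b", re.IGNORECASE),
    "cloud": re.compile(
        r"\b(AWS|Azure|Google Cloud|cloud (?:computing|infrastructure))\b",
        re.IGNORECASE,
    ),
}


def count_signal_papers(abstracts):
    """Return, per signal, the number of abstracts mentioning it at least once."""
    counts = Counter()
    for text in abstracts:
        for name, pattern in SIGNAL_PATTERNS.items():
            if pattern.search(text):
                counts[name] += 1
    return counts


# Invented sample abstracts, used only to exercise the function.
sample_abstracts = [
    "We train the model on a single NVIDIA RTX 4090 using a self-hosted pipeline.",
    "All experiments run on AWS with managed cloud infrastructure.",
    "We prove a bound on the spectral gap of random graphs.",
]

signal_counts = count_signal_papers(sample_abstracts)
print(dict(signal_counts))
```

Counting papers rather than raw term occurrences keeps a single abstract that repeats a term from inflating the signal. Dividing each count by the dataset size would give the share of papers carrying that signal.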
