
We construct a large-scale, empirically grounded dataset from Reddit to support the development and evaluation of agent-based social simulations. The dataset includes 33 technology-focused, 14 climate-focused, and 7 COVID-related agents, with each domain encompassing over one million posts and comments. Using publicly available posts and comments, we define agent categories based on content and interaction patterns, derive inter-agent relationships from temporal commenting behaviors, and build a directed, weighted network that reflects empirically observed user connections. The resulting dataset enables researchers to calibrate and benchmark agent behavior, network structure, and information diffusion processes against real social dynamics. Quantitative and qualitative analyses reveal distinctive patterns in user connectivity, engagement life cycles, and triadic closure growth, illustrating the potential of Reddit-derived interaction networks for realistic social simulation.
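As a rough illustration of how a directed, weighted network might be derived from commenting behavior, the sketch below counts replies between users. It is not the authors' pipeline: the record layout `(author, parent_author, timestamp)`, the sample users, and the rule that one reply adds one unit of edge weight are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical comment records: (author, parent_author, unix_timestamp).
# parent_author is None for a top-level post with nobody to reply to.
comments = [
    ("alice", "bob", 100),
    ("carol", "bob", 130),
    ("alice", "bob", 160),
    ("bob", "alice", 200),
    ("dave", None, 210),
]

def build_reply_network(records):
    """Return {(source, target): weight}, where weight counts replies
    written by source to content authored by target (assumed edge rule)."""
    weights = Counter()
    for author, parent_author, _timestamp in records:
        # Skip top-level posts and self-replies; they create no edge.
        if parent_author is None or parent_author == author:
            continue
        weights[(author, parent_author)] += 1
    return dict(weights)

network = build_reply_network(comments)
for (source, target), weight in sorted(network.items()):
    print(f"{source} -> {target}: {weight}")
```

The timestamp is carried in each record but unused here; a fuller version could use it to restrict edges to a time window or to track when a connection first appeared.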
