
# Data Files Description for reddit_Paper1_main_NHB_plots.ipynb
# ================================================================

## Main Data Files
- paper1_fastopic_themes.json: Topic-to-theme mapping (Solution, Cause, Catastrophic Impact, Societal/Scientific Response) for 100 FasTopic topics
- df_joined_w_FasTopics.pkl: Main Reddit dataset (submissions + comments) with assigned topic IDs and metadata (scores, timestamps, etc.)

## Statistical Analysis
- paper1_fig1_KStest.json: Kolmogorov-Smirnov test results comparing topic distributions against the global baseline (endorsement, engagement metrics)

## Emotion Analysis - Climate Data
- df_emotions_w_topics/*.pkl (20 files): Chunked emotion predictions for climate-related Reddit comments, merged with topic assignments. Contains 28 emotion scores per comment (joy, fear, anger, etc.) from an emotion classification model.

## Emotion Analysis - Baseline Comparisons
- emotions-casualconversation-baseline/: CasualConversation subreddit baseline (100k sample) - neutral conversation control
- emotions-nostupidquestions-baseline/: NoStupidQuestions subreddit baseline (100k sample) - informational Q&A control
- emotions-fitness-baseline/: Fitness subreddit baseline (100k sample) - lifestyle/health control

Each baseline contains:
- emotions_model_outputs_*.pkl: Emotion predictions for baseline comments
- *_100k_sample.parquet: Original comment text and metadata

## Output Data
- 00_fig3_communication.json: Communication pattern analysis results (engagement vs emotional content) for Figure 3

## Purpose
This dataset supports analysis of climate discourse on Reddit: topic volume/endorsement/engagement patterns (Fig 1), emotion distributions vs baselines (Fig 2), and communication-emotion relationships (Fig 3).
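
Before running the notebook, it can help to confirm that the files listed above are actually in place. The following is a minimal sketch, not part of the original notebook: it uses only the Python standard library, the file names and patterns are copied from this description, and the helper name `check_data_files` and the assumption that everything sits directly under one root directory are hypothetical.

```python
from pathlib import Path

# Names and patterns come from the description above. The flat layout
# under a single root directory is an assumption of this sketch.
EXPECTED_FILES = [
    "paper1_fastopic_themes.json",
    "df_joined_w_FasTopics.pkl",
    "paper1_fig1_KStest.json",
    "00_fig3_communication.json",
]
# Glob pattern -> number of files the description says to expect.
EXPECTED_GLOBS = {"df_emotions_w_topics/*.pkl": 20}
BASELINE_DIRS = [
    "emotions-casualconversation-baseline",
    "emotions-nostupidquestions-baseline",
    "emotions-fitness-baseline",
]
BASELINE_GLOBS = ["emotions_model_outputs_*.pkl", "*_100k_sample.parquet"]


def check_data_files(root):
    """Return a list of problems found; an empty list means nothing is missing."""
    root = Path(root)
    problems = []
    for name in EXPECTED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    for pattern, expected in EXPECTED_GLOBS.items():
        found = len(list(root.glob(pattern)))
        if found != expected:
            problems.append(f"{pattern}: expected {expected} files, found {found}")
    for dirname in BASELINE_DIRS:
        baseline = root / dirname
        if not baseline.is_dir():
            problems.append(f"missing directory: {dirname}/")
            continue
        for pattern in BASELINE_GLOBS:
            if not any(baseline.glob(pattern)):
                problems.append(f"{dirname}/: no file matching {pattern}")
    return problems


if __name__ == "__main__":
    # Check the current working directory and report anything missing.
    for problem in check_data_files("."):
        print(problem)
```

The check only looks at names and counts; it does not open the pickle or parquet files, so it runs without pandas or any other third-party package installed.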
