
# G-quadruplex Distribution in Coronavirus Genomes: Analysis Code and Data

## Overview

This dataset contains all code, data, and supplementary materials for the manuscript "Paradoxical G-quadruplex distribution in coronavirus genomes reveals functional constraints and antiviral therapeutic opportunities" published in *Virus Research* (2026).

## Key Findings

- **Genome-wide G4 depletion**: Fold-change = 0.56 (95% CI: 0.24-2.30)
- **Regional enrichment in critical proteins**:
  - Spike protein: IRR = 17.9 (95% CI: 11.7-27.6)
  - Nucleocapsid protein: IRR = 15.2 (95% CI: 8.7-26.6)
- **Therapeutic targets**: 38 thermodynamically stable G4 candidates (ΔG < -5 kcal/mol)
- **Primary target**: GGCTGGCAATGGCGG (ΔG = -7.35 kcal/mol, 54.8% conservation)

## Dataset Contents

### Genomes (31 files)

- **SARS-CoV-2 variants** (n=20): Alpha, Beta, Gamma, Delta, Omicron sublineages (BA.1, BA.2, BA.5, BQ.1.1, XBB.1.5, XBB.1.16, JN.1), Epsilon, Eta, Kappa, Lambda, Mu, D614G, B.1, Wuhan reference, HKU-SZ-005b (Hong Kong early isolate)
- **Other coronaviruses** (n=11): SARS-CoV, MERS-CoV, bat coronaviruses (RaTG13, BANAL-52), pangolin coronaviruses (GX, GD, GD1), HCoV-OC43, HCoV-HKU1, HCoV-229E, HCoV-NL63

### Code (88 Python scripts)

- G4 detection and regional mapping
- Statistical robustness analysis (Bootstrap, Poisson GLM, GEE models)
- Thermodynamic stability assessment (ΔG calculations)
- Bayesian network analysis (pgmpy)
- Machine learning predictions (XGBoost)
- Cross-virus comparative analysis

### Results

- Supplementary Tables S1-S7, S6A
- Complete analysis outputs (CSV format)
- Reproducible analysis pipeline

## Usage

```bash
# Install dependencies
pip install -r requirements.txt

# Run complete analysis pipeline
./run_all.sh
```

## Citation

Tanigawa, M., & Iwaki, T. (2026). Paradoxical G-quadruplex distribution in coronavirus genomes reveals functional constraints and antiviral therapeutic opportunities. Virus Research.
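
## Illustrative Example

As a rough illustration of what a G4 detection step can look like, the sketch below scans a sequence for putative two-tetrad G4 motifs with a regular expression. It is not taken from the scripts in this dataset: the pattern (four runs of at least two guanines separated by loops of 1-7 nucleotides), the name `G4_PATTERN`, and the function `find_putative_g4` are all assumptions made for illustration, and the actual pipeline may use different criteria. The input is the primary target sequence listed under Key Findings.

```python
import re

# Assumed motif definition (hypothetical, not necessarily the authors' criteria):
# four G-runs of length >= 2, separated by three loops of 1-7 nucleotides.
# U is accepted alongside T so RNA-style sequences can be scanned as well.
G4_PATTERN = re.compile(r"G{2,}(?:[ACGTU]{1,7}?G{2,}){3}")


def find_putative_g4(sequence):
    """Return (start, end, motif) for each non-overlapping putative G4 match.

    Coordinates are 0-based, with an exclusive end.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Primary target sequence reported in Key Findings.
    for start, end, motif in find_putative_g4("GGCTGGCAATGGCGG"):
        print(start, end, motif)
```

A regex scan of this kind only flags candidate motifs. It says nothing about thermodynamic stability or conservation, which this dataset assesses in separate steps.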
