
Abstract:
Background: Rigorous pre-analysis inferential validation is critical for studies that use large administrative health data, especially following reports suggesting increased cancer risk after COVID-19 vaccination. This study aims to formally validate a severe external validity discrepancy caused by a dual structural bias in one such influential cohort.
Methods: We applied two Z-tests for a single proportion to validate the causal chain of bias: 1) the non-representativeness of the cohort's demographic composition (>= 65 years) against the national gold standard (root cause); 2) the incompatibility of the cancer incidence rate in the non-vaccinated control subgroup (>= 65 years) with the national rate (external validity flaw).
Results: The Z-test for demographic representativeness yielded a Z-score of -260.39 (p-value = 65). The Z-test on cancer incidence yielded a Z-score of -15.23 (p-value < 10^-50), formally validating a -45% structural deficit in the baseline cancer risk of the non-vaccinated group, as anticipated in a preliminary analysis.
Findings: The combined inferential evidence confirms a fatal structural bias in the scrutinized cohort. Because the baseline cancer incidence (the denominator) was statistically suppressed, the relative Hazard Ratios calculated from that cohort were mathematically inflated. Our work establishes inferential validation against gold standards as a methodological mandate before any complex statistical modeling is applied.
Keywords: Biomathematics, Computational Epidemiology, Inferential Statistics, Selection Bias, COVID-19 vaccination, Cancer Research
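The one-sample Z-test for a proportion named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `one_proportion_z_test` and `relative_deficit` are hypothetical, the test is assumed to be two-sided with the standard error computed under the null hypothesis, and the counts in the usage example are made-up placeholders rather than figures from the study.

```python
import math


def one_proportion_z_test(successes, n, p0):
    """Two-sided one-sample Z-test of an observed proportion against a reference p0.

    Assumption: the standard error is evaluated under the null hypothesis
    (p = p0), the usual large-sample form of the test.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def relative_deficit(successes, n, p0):
    """Relative gap between the observed proportion and the reference p0."""
    return (successes / n - p0) / p0


if __name__ == "__main__":
    # Placeholder inputs for illustration only (NOT data from the study):
    # 450 events among 10,000 people, compared with a reference rate of 5%.
    z, p = one_proportion_z_test(450, 10_000, 0.05)
    gap = relative_deficit(450, 10_000, 0.05)
    print(f"Z = {z:.2f}, p-value = {p:.4f}, relative gap = {gap:+.0%}")
```

A large negative Z-score means the observed proportion lies far below the reference value. The relative gap expresses the same difference as a fraction of the reference rate, which is the form in which the abstract reports its deficit.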
Epidemiology, COVID-19, Biostatistics, Applied mathematics, Cancer
