
doi: 10.7302/6870
handle: 2027.42/175656
The reproducibility of scientific discoveries is a hallmark of scientific research. Although its centrality is widely appreciated in the scientific community, precise definitions of reproducibility and quantitative approaches for replicability assessment are still lacking. This dissertation aims to address the foundational issues and resolve methodological challenges of reproducibility research.

Chapter II provides guidance on constructing statistical models and selecting proper inferential strategies for the quantification and control of replicability. We first address the classification of different types of reproducibility and the specification of replication success in different contexts. Next, we focus our discussion on one specific type, results reproducibility, in which the same set of variables is investigated across experiments based on different data sets but with the same analytical method. We then propose a novel definition of replicable signals that emphasizes the directional consistency of the signed effects of interest across experiments. Finally, we discuss inference principles for replicability assessment.

Chapter III proposes a computational method, INTRIGUE, to evaluate and control replicability in high-throughput settings. High-throughput experiments enable simultaneous measurements of a large number of biological variables, but the accuracy and replicability of the results are often susceptible to unobserved confounding factors or batch effects. The same framework can also be applied to study genuine biological heterogeneity. The proposed methods are based on a model comparison strategy and are designed to (1) assess the overall reproducible quality of multiple studies and (2) evaluate replicability at the level of individual signals. We demonstrate the proposed methods for detecting unobserved batch effects via simulations and further illustrate their versatility in applications to transcriptome-wide association studies.

Chapter IV extends the proposed directional-consistency criterion to frame two types of Bayesian model criticism procedures for general replicability assessment. The proposed methods aim to identify potentially inconsistent results in distinct application scenarios, e.g., the two-group and exchangeable-group settings. The methods are motivated by established Bayesian prior and posterior predictive model-checking procedures and generalize many existing replicability assessment approaches. We discuss the statistical properties of the proposed approaches and illustrate their utility through simulations and real data analyses, including a re-analysis of the data sets gathered in the Reproducibility Project: Psychology (RP:P) and publication bias detection in COVID-19-related replication studies.

As an extension, Chapter V proposes multivariate model criticism approaches for replicability assessment when multiple variables are investigated simultaneously (e.g., in multivariate regression and ANOVA analyses). The approaches discussed in this chapter further develop the methods proposed in Chapter IV by accounting for the correlation structure among variables.

The final chapter advances the replicability assessment criterion. Instead of emphasizing sign consistency, we define replicable signals based on a constrained probability that signals from a target group are misclassified into a reference group. The development of the criterion is motivated by the comparison of genetic results in distinct populations in the presence of gene-environment interactions. We incorporate the newly proposed criterion into the model comparison and criticism framework. Numerical experiments, including simulations of batch effect contamination and publication bias, are carried out. Results from applying the methods to multiple real data applications are also presented, including a re-analysis of the RP:P data and RNA-seq differential expression analysis data sets.
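To make the directional-consistency idea concrete, the following is a minimal sketch, not the dissertation's actual model: it assumes each study reports an effect estimate and standard error, approximates each study-level posterior by a normal distribution, and flags a signal as potentially non-replicable when the probability that all study-level effects share the same sign is low. The function name and numbers below are illustrative only.

```python
# Minimal illustration of a sign-consistency check across studies.
# Assumption (hypothetical, not the dissertation's exact method): each study i
# reports an effect estimate beta_hat[i] and standard error se[i], and the true
# effect in study i is approximated by a Normal(beta_hat[i], se[i]^2) posterior,
# with studies treated as independent.

import numpy as np
from scipy.stats import norm

def sign_consistency_prob(beta_hat, se):
    """Probability that all study-level effects are positive or all are negative."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    se = np.asarray(se, dtype=float)
    p_pos = norm.sf(0.0, loc=beta_hat, scale=se)   # P(effect_i > 0) for each study
    p_neg = norm.cdf(0.0, loc=beta_hat, scale=se)  # P(effect_i < 0) for each study
    return p_pos.prod() + p_neg.prod()

# Three concordant studies: probability close to 1 (directionally consistent).
print(sign_consistency_prob([0.8, 1.1, 0.6], [0.3, 0.4, 0.25]))
# One study pointing the other way: probability drops sharply.
print(sign_consistency_prob([0.8, -1.1, 0.6], [0.3, 0.4, 0.25]))
```

In the dissertation, this intuition is formalized through Bayesian model comparison (Chapter III) and predictive model criticism (Chapters IV and V) rather than the independent-study approximation used in this sketch.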
Statistics and Numeric Data, Science, Replicability assessment, Bayesian modeling, Reproducibility
| Indicator | Description | Value |
|---|---|---|
| Citations | An alternative to the "Influence" indicator; also reflects the overall/total impact of the article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| Popularity | Reflects the "current" impact/attention (the "hype") of the article in the research community at large, based on the underlying citation network. | Average |
| Influence | Reflects the overall/total impact of the article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | Reflects the initial momentum of the article directly after its publication, based on the underlying citation network. | Average |
