Publication: Article, Preprint, Other literature type
Date: 01 Sep 2020
Embargo end date: 01 Jan 2019
Publisher: Institute of Mathematical Statistics
Journal: The Annals of Applied Statistics, volume 14 (issn: 1932-6157,

Authors: Hung, Kenneth; Fithian, William

doi: 10.1214/20-aoas1336, 10.48550/arxiv.1903.08747

arXiv: http://arxiv.org/abs/1903.08747

Statistical methods for replicability assessment

Abstract

Large-scale replication studies like the Reproducibility Project: Psychology (RP:P) provide invaluable systematic data on scientific replicability, but most analyses and interpretations of the data fail to agree on the definition of "replicability" and to disentangle the inexorable consequences of known selection bias from competing explanations. We discuss three concrete definitions of replicability based on (1) whether published findings about the signs of effects are mostly correct, (2) how effective replication studies are in reproducing whatever true effect size was present in the original experiment, and (3) whether true effect sizes tend to diminish in replication. We apply techniques from multiple testing and post-selection inference to develop new methods that answer these questions while explicitly accounting for selection bias. Our analyses suggest that the RP:P dataset is largely consistent with publication bias due to selection of significant effects. The methods in this paper make no distributional assumptions about the true effect sizes.
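
To make the phrase "explicitly accounting for selection bias" concrete, here is a minimal sketch of a textbook selection adjustment. It is not the authors' method. It assumes a single z-statistic distributed as N(theta, 1) that was published only because |z| exceeded a significance cutoff of 1.96, and it compares the naive two-sided p-value for theta = 0 with the p-value conditional on that selection event. The function names and the cutoff are illustrative assumptions, not taken from the paper or its software.

```python
import math


def normal_sf(x: float) -> float:
    """Survival function P(Z > x) of a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def naive_two_sided_p(z: float) -> float:
    """Two-sided p-value for theta = 0 that ignores selection."""
    return 2.0 * normal_sf(abs(z))


def selection_adjusted_p(z: float, cutoff: float = 1.96) -> float:
    """Two-sided p-value for theta = 0, conditional on |Z| > cutoff.

    Assumption (hypothetical setup): the result was reported only
    because |z| exceeded `cutoff`, so the null distribution is a
    standard normal truncated to the region |Z| > cutoff.
    """
    if abs(z) <= cutoff:
        raise ValueError("z must pass the selection cutoff")
    # P(|Z| >= |z|) / P(|Z| > cutoff); the factors of 2 cancel.
    return normal_sf(abs(z)) / normal_sf(cutoff)


if __name__ == "__main__":
    z = 2.5  # illustrative value, not from the RP:P data
    print("naive p-value:   ", naive_two_sided_p(z))
    print("adjusted p-value:", selection_adjusted_p(z))
```

Under these assumptions the adjusted p-value is always larger than the naive one, and it approaches 1 as the statistic approaches the cutoff. This is the sense in which a result that barely cleared the significance filter carries little evidence once the filter is taken into account.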

Related Organizations

University of California, San Francisco
United States
University of California System
United States
University of California, Berkeley
United States

Keywords

postselection inference, publication bias, FOS: Computer and information sciences, multiple testing, 62F03, 62P25, Statistics - Applications, meta-analysis, Methodology (stat.ME), Replicability, Applications (stat.AP), Statistics - Methodology

Related research products (2)

psychology_resolution software on GitHub
IsRelatedTo
assessing-replicability software on GitHub
IsRelatedTo

Impact by BIP!

	citations: 7. An alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
	popularity: Top 10%. Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
	influence: Average. Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
	impulse: Top 10%. Reflects the initial momentum of an article directly after its publication, based on the underlying citation network.