
doi: 10.1037/met0000302
pmid: 33617275
Recent empirical evaluations of replication in psychology have reported startlingly few successful replication attempts. At the same time, these programs have noted that the proper way to analyze replication studies is far from a settled matter and have analyzed their data in several different ways. This presents 2 challenges to interpreting the results of these programs. First, different analysis methods assess different operational definitions of replication. Second, the properties of these methods are not necessarily common knowledge; it is possible for a successful replication to be deemed a failure by nearly all of the metrics used, and it is not always immediately clear how likely such errors are to occur. In this article, we describe the methods commonly used in replication research and how they imply specific operational definitions of replication. We then compute the probability of false failure (i.e., a successful replication is concluded to have failed) and false success determinations. These are shown to be high (often over 50%) and in many cases uncontrolled. We then demonstrate that errors are probable in the data to which these methods have been applied in the literature. We show that the probability that some reported conclusions about replication are incorrect can be as high as 75-80%. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
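As an illustration of how a false failure rate can exceed 50%, here is a minimal sketch for one possible operational definition of replication: the replication study is "successful" if its estimate is statistically significant in the same direction as the original. This is not the authors' code or their exact computation. The function name `false_failure_probability`, the normal approximation, the standard error approximation sqrt(2/n) for a standardized mean difference, and the example values (true effect 0.3, 50 participants per group) are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def false_failure_probability(effect: float, se: float, alpha: float = 0.05) -> float:
    """Probability that a replication is NOT significant in the positive
    direction when the true effect equals `effect` (hypothetical helper).

    Assumes the replication estimate is normally distributed around the
    true effect with standard error `se`, and a two-sided test at level
    `alpha` whose positive-direction rejection region is z > z_crit.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Power: chance that estimate / se exceeds the critical value.
    power = 1 - std_normal.cdf(z_crit - effect / se)
    return 1 - power


# Assumed example: true standardized effect of 0.3, 50 participants per group.
n_per_group = 50
se = sqrt(2 / n_per_group)  # rough approximation for a standardized mean difference
p_fail = false_failure_probability(effect=0.3, se=se)
print(f"Probability of a false failure: {p_fail:.2f}")
```

Under these assumed values, a replication of an effect that is genuinely present and identical to the original would be judged a failure roughly two times out of three, simply because the replication study has low power. With a true effect of zero, the same function gives a false success probability of `alpha / 2` in the positive direction.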
Data Interpretation, Statistical, Humans, Psychology, Reproducibility of Results
