Most psychology findings are not replicable. What can be done? In his Psychological Science editorial, Stephen Lindsay advertises pre-registration as a solution, writing that “Personally, I aim never again to submit for publication a report of a study that was not preregistered”. I took a look at whether pre-registrations are effective and feasible [TL;DR: maybe and possibly].
[I updated the blog post using comments by Cortex editor Chris Chambers, see below for full comments. It turns out that many of my concerns have already been addressed. Updates in square brackets.]
A recent study published in Science found that the majority of psychological research cannot be reproduced by independent replication teams (Open Science Collaboration, 2015). I believe that this is due to questionable research practices (LINK) and that internal replications are no solution to this problem (LINK). However, might pre-registrations be the solution? I don’t think so. The reason why I am pessimistic is threefold.
What is a pre-registration? A pre-registered study submits its design and analysis plan before any data are acquired. After data acquisition the pre-registered analysis plan is executed and the results can confidently be labelled confirmatory (i.e. more believable). Analyses not specified beforehand are labelled exploratory (i.e. less believable). Some journals offer peer-review of the pre-registration document. Once it has been approved, the chances of the journal accepting a manuscript based on the proposed design and analysis are supposedly very high. [Chris Chambers: “for more info on RRs see https://osf.io/8mpji/wiki/home/”]
1) Pre-registration does not remove all incentives to employ questionable research practices
Pre-registrations should enforce honesty about post hoc changes in the design/analysis. Ironically, the efficacy of pre-registrations is itself dependent on the honesty of researchers. The reason is simple: including the information that an experiment was pre-registered is optional. So, if the planned analysis is optimal, a researcher can boost its impact by revealing that the entire experiment was pre-registered. If not, s/he deletes the pre-registration document and proceeds as if it had never existed, a novel questionable research practice (anyone want to invent a name for it? Optional forgetting?).
Defenders of pre-registration could counter that peer-reviewed pre-registrations are different because there is no incentive to deviate from the planned design/analysis. Publication is guaranteed if the pre-registered study is executed as promised. However, two motives remove this publication advantage:
1a) the credibility boost of presenting a successful post hoc design or analysis decision as a priori can still be achieved by publishing the paper in a different journal which is unaware of the pre-registration document.
1b) the credibility loss of a wider research agenda due to a single unsuccessful experiment can still be avoided by simply withdrawing the study from the journal and forgetting about it.
The take-home message is that one can opt in and out of pre-registration as one pleases. The maximal cost is the rejection of one peer-reviewed pre-registered paper at one journal. Given that paper rejection is the most normal thing in the world for a scientist these days, this threat is not effective.
[Chris Chambers: “all pre-registrations made now on the OSF become public within 4 years – so as far as I understand, it is no longer possible to register privately and thus game the system in the way you describe, at least on the OSF.”]
2) Pre-registrations did not clean up other research fields
Note that the argument so far assumes that when the pre-registration document is revealed, it is effective in stopping undisclosed post hoc design/analysis decisions. The medical sciences, in which randomized controlled trials have had to be pre-registered since a 2004 decision by journal editors, teach us that this is not so. There are four aspects to this surprising ineffectiveness of pre-registrations:
2a) Many pre-registered studies are not published. For example, Chan et al. (2004a,b) could not locate the publications of 54% – 63% of the pre-registered studies. It’s possible that this is due to the aforementioned publication bias (see 1b above), or other reasons (lack of funding, manuscript under review…).
2b) Medical authors frequently deviate from their planned designs/analyses. For example, 31% – 62% of randomized controlled trials changed at least one primary outcome between pre-registration and publication (Mathieu et al., 2009; Chan et al., 2004a,b). If you thought that psychological scientists are somehow better than medical ones, early indications are that this is not so (Franco et al., 2015).
2c) Deviations from pre-registered designs/analyses are not discovered because 66% of journal reviewers do not consult the pre-registration document (Mathieu et al., 2013).
2d) In the medical sciences pre-registration documents are usually not peer-reviewed and quite often sloppy. For example, Mathieu et al. (2013) found that 37% of trials were post-registered (the pointless exercise of registering a study which has already taken place), and that 17% of pre-registrations were too imprecise to be useful.
[Chris Chambers: “The concerns raised by others about reviewers not checking protocols apply to clinical trial registries but this is moot for RRs because checking happens at an editorial level (if not at both an editorial and reviewer level) and there is continuity of the review process from protocol through to study completion.”]
3) Pre-registration is a practical nightmare for early career researchers
Now, one might argue that pre-registering is still better than not pre-registering. In terms of non-peer-reviewed pre-registration documents, this is certainly true. However, their value is limited because they can be written so vaguely as to be useless (see 2d) and they can simply be deleted if they ‘stand in the way of a good story’, i.e. if an exploratory design/analysis choice gets reported as confirmatory (see 1a).
The story is different for peer-reviewed pre-registrations. They are impractical because of one factor which tenured decision makers sometimes forget: time. Most research is done by junior scientists who have temporary contracts running anywhere between a few months and five years [reference needed]. These people cannot wait for a peer-review decision which, on average, takes something like one year and ten months (Nosek & Bar-Anan, 2012).
[Figure: submission-to-publication-time distribution for one prominent researcher (Brian Nosek)]
What does this mean? As a case study, let’s take Richard Kunert, a fine specimen of a junior researcher, who was given three years of funding by the Max-Planck-Gesellschaft in order to obtain a PhD. Given Brian Nosek’s experience with his articles, and assuming Richard submits three pre-registration documents on day 1 of his 3-year PhD, each individual document has an 84.6% chance of being accepted within three years. The chance that all three will be accepted is 60.6% (0.846^3). This scenario is obviously unrealistic because it leaves no time for setting up the studies and for actually carrying them out.
For the more realistic case of one year of piloting and one year of actually carrying out the studies, which leaves only one year for peer-review, Richard has a 2.2% chance (0.282^3) that all three studies are peer-reviewed at the pre-registration stage and published. However, Richard is not silly (or so I have heard), so he submits 5 studies, hoping that at least three of them will eventually be carried out. In this case he has a 14% chance that at least three studies are peer-reviewed at the pre-registration stage and published. Only if Richard submits 10 or more pre-registration documents for peer-review after 1 year of piloting does he have a more than 50% chance of being left with at least 3 studies to carry out within 1 year.
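For readers who want to check these numbers, here is a minimal sketch of the arithmetic in Python. It assumes that each pre-registration document is accepted independently of the others, which is what multiplying the probabilities implies, and it uses the rounded per-document rates quoted above (0.846 and 0.282). The function and variable names are my own. Because the inputs are rounded, the first result comes out at roughly 0.605 rather than exactly the 60.6% given above.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): at least k of n documents accepted."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Per-document acceptance chances taken from the text (rounded values).
P_WITHIN_3_YEARS = 0.846  # decision within the full three years
P_WITHIN_1_YEAR = 0.282   # decision within the one year left for peer-review

# All three documents accepted (independence assumed).
all_three_in_3y = P_WITHIN_3_YEARS ** 3
all_three_in_1y = P_WITHIN_1_YEAR ** 3

# Submit five documents, need at least three accepted within one year.
three_of_five = prob_at_least(3, 5, P_WITHIN_1_YEAR)

# Smallest number of submissions giving a better-than-even chance
# of at least three acceptances within one year.
min_submissions = next(
    n for n in range(3, 100) if prob_at_least(3, n, P_WITHIN_1_YEAR) > 0.5
)

print(f"all three within 3 years: {all_three_in_3y:.3f}")
print(f"all three within 1 year:  {all_three_in_1y:.3f}")
print(f"at least 3 of 5:          {three_of_five:.3f}")
print(f"submissions needed:       {min_submissions}")
```

The "at least three of five" case is a binomial tail probability rather than a simple product, which is why it needs the helper function.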
For all people who hate numbers, let me put it into plain words. Peer-review is so slow that requiring PhD students to only perform pre-registered studies means the overwhelming majority of PhD students will fail their PhD requirements in their funded time. In this scenario cutting-edge, world-leading science will be done by people flipping burgers to pay the rent because funding ran out too quickly.
[Chris Chambers: “Average decision times from Cortex, not including time taken by authors to make revisions: initial trial = 5 days; Stage 1 provisional acceptance = 9 weeks (1-3 rounds of in-depth review); Stage 2 full acceptance = 4 weeks”]
What to do
The arrival of pre-registration in the field of Psychology is undoubtedly a good sign for science. However, given what we know now, no one should be under the illusion that this instrument is the solution to the replication crisis which psychological researchers are facing. At most, it is a tiny piece of a wider strategy to make Psychology what it has long claimed to be: a robust, evidence-based, scientific enterprise.
[Please do yourself a favour and read the comments below. You won’t get better people commenting than this.]
— — —
Chan AW, Krleza-Jerić K, Schmid I, & Altman DG (2004). Outcome reporting bias in randomized trials funded by the Canadian Institutes of Health Research. CMAJ : Canadian Medical Association journal = journal de l’Association medicale canadienne, 171 (7), 735-40 PMID: 15451835
Chan, A., Hróbjartsson, A., Haahr, M., Gøtzsche, P., & Altman, D. (2004). Empirical Evidence for Selective Reporting of Outcomes in Randomized Trials. JAMA, 291 (20). DOI: 10.1001/jama.291.20.2457
Franco, A., Malhotra, N., & Simonovits, G. (2015). Underreporting in Psychology Experiments: Evidence From a Study Registry. Social Psychological and Personality Science. DOI: 10.1177/1948550615598377
Lindsay, D. (2015). Replication in Psychological Science. Psychological Science. DOI: 10.1177/0956797615616374
Mathieu, S., Boutron, I., Moher, D., Altman, D.G., & Ravaud, P. (2009). Comparison of Registered and Published Primary Outcomes in Randomized Controlled Trials. JAMA, 302 (9). DOI: 10.1001/jama.2009.1242
Mathieu, S., Chan, A., & Ravaud, P. (2013). Use of Trial Register Information during the Peer Review Process. PLoS ONE, 8 (4). DOI: 10.1371/journal.pone.0059910
Nosek, B., & Bar-Anan, Y. (2012). Scientific Utopia: I. Opening Scientific Communication. Psychological Inquiry, 23 (3), 217-243. DOI: 10.1080/1047840X.2012.692215
Open Science Collaboration (2015). PSYCHOLOGY. Estimating the reproducibility of psychological science. Science (New York, N.Y.), 349 (6251) PMID: 26315443
— — —