Who hasn't reacted with shock to a devastatingly negative review of a manuscript representing years of work by graduate students and postdoctoral fellows on a difficult, unsolved question? Detailed in its critique, it relentlessly measures the work against a 'gold standard of excellence' using the latest and best techniques, before dismissing the years of labor and stating that the manuscript can only be reconsidered with substantially more data providing definitive proof of each claim. The other two reviews may be favorable – even recommending publication with few revisions – but how can an editor ignore that complete and negative review? Your manuscript is declined, with encouragement to resubmit when new data are added.
I confess. I'm partly responsible for training the pit-bull reviewer, and I bet you are too. Graduate students read, discuss and dissect classic papers as a key part of their training. At Stanford, these discussion sections are led by faculty. The 'best practice' papers chosen for close reading provide training in how to frame a question, how to mine the literature for relevant biological materials to conduct new experiments, and how to construct studies with appropriate controls and analyses to extract conclusions. Faculty ask students to summarize the article's claims, gleaned from the abstract and discussion, and then to judge the quality of the evidence for each claim by a careful reanalysis of the data. Some of these papers have been the turning point in a field or the first in a field – papers completely worthy of this exercise.
We also teach using papers, published in prominent journals, that contain fatal flaws: not fraud, just faulty assumptions about the properties of organisms or reagents, a lack of appropriate controls, or a failure to consider alternative interpretations or to mine the literature completely. A favorite in plant biology is a paper claiming massive and dynamic movement of sequences from the mitochondrial into the nuclear genome, followed by amplification of these mitochondrial sequences – perhaps in the manner of transposons. The paper opens with the statement that plants contain three genetic compartments: nucleus, mitochondrion, and plastid. Too bad the authors, the reviewers, and the editors did not take this instructive sentence to heart. All of the data are DNA blot hybridization assays depicting wide fluctuations in hybridization of a particular probe to the nuclear fraction, while mitochondrial hybridization remains constant. Students reading the paper identified a key 'missing' control, namely inclusion of purified plastid DNA. In fact, further work showed that there was a historic transfer of a tRNA gene from the plastid to the mitochondrial genome; hence the study had been tracking relative plastid DNA content (a type of contamination) in nuclear DNA samples.
There's nothing wrong with using either classic or fatally flawed papers in our teaching, provided we also instruct our students about what constitutes a more typical publication. Few of us will ever write a classic paper – the simply outstanding paper that might garner the authors a Nobel Prize or provide a completely surprising new insight or a significant new technique. The papers that represent great leaps forward are few in number. And we all work to avoid submitting manuscripts with fatal flaws – internal review at lab group meetings and by colleagues is designed to catch horrible mistakes.
Most of our collective publications, and hence most scientific progress, come from incremental insights whose context is provided by the ongoing struggle to resolve a number of outstanding questions in a field. A series of papers, often from different labs over a span of several years, will add up to the solution to one or several questions. Each publication was timely when published, but may be wrong in some of the details of interpretation – the focus in the discussion may have dealt primarily with the most popular model, missing the chance to 'redesign' that model to better fit all of the data. None of these papers is a complete answer: the new insights will eventually be summarized in a short review article weaving the incremental threads of data into one story that becomes the new paradigm, at least for a while.
Taking a phrase from the current US political scene, these experimentally solid papers are "timely, targeted, and temporary". That is, they address unanswered issues that are on the minds of those in the field, they target specific issues amenable to experimental or theoretical resolution, and in some ways their impact is temporary, because subsequent papers using the emerging insights and new methodologies will supersede these solid papers. Yet these solid papers are the foundation for progress most of the time.
Students are trained to be pit bulls in finding even the tiniest faults in great papers. Nearly all the truly remarkable papers we teach contain a few 'typographical' errors, such as a reference to the incorrect panel of a figure, a small mistake in a large table, or the wrong initials for an author in the reference list. These errors do not detract from the impact of the work, but they teach students to be vigilant, because even the deservedly famous can make mistakes. This insight may even inspire some students to use a spell-checker and other automated tools to eliminate such errors. Similarly, the papers with fatal flaws, particularly those in which a critical control is simply missing, are highly instructive. These papers highlight the dangerous 'snow globe world' of belief in a particular theory – a world circumscribed to consider only those things within view – and even then only when obscured by snow. It's instructive to point out that the meaning of 'belief' is to accept as true in the absence of facts. The papers with fatal flaws help students appreciate that maintaining skepticism about current interpretations is essential for progress.
How then can we teach students to appreciate the bulk of our own contributions to the literature? Great manuscripts with minute flaws and bad papers with fatal flaws will represent a tiny minority of the manuscripts that our fledgling reviewer will actually encounter. The majority of manuscripts will be sound in conception and fair in data presentation, and contain some new information. How do we teach judgment of where in the pantheon of journal quality a particular study belongs? How do we teach what constitutes a timely 'publishable unit' – not complete proof of a major concept but a defined step in that direction? Here are a few suggestions – ideas that I hope will start a conversation about training reviewers and better scientists.