Although peer reviewers play a major role in selecting which science is published (and thereby “endorsed”), little is understood about how to select the best reviewers or improve the skills of existing ones. Past studies of training and mentoring, including a controlled trial of performance feedback to new reviewers, have shown no objective benefit.
We hypothesized that these failures were due to an insufficiently focused and detailed mentoring process, a level of focus that has been deemed necessary in previous studies of teaching complex writing skills []. We therefore attempted to develop a more individualized and detailed approach that would still not impose too great a burden on the journal or its editors. All reviewers newly added to Annals of Emergency Medicine during a four-year period were randomly assigned to a control group or a mentoring group. Both groups were assigned papers in our usual fashion, based on their availability and topic expertise. The control group performed their reviews, were informed of the editor’s final decision, and were given access to the full reviews of the same manuscript by the other reviewers after the decision was made. The mentee group was treated identically except that they were told at the start of the study that a specific, named volunteer mentor was available, and they were encouraged to discuss papers individually with that mentor by phone or email. The enrolled reviewers were surveyed for relevant experience and training in peer review and critical analysis, based on a review of their curricula vitae and a questionnaire. We found no differences between the two groups from this review, except that the control group had had fewer formal training experiences in total than the mentored group (Table ).
Despite this one-on-one mentoring in the intervention group, there were no differences in mean reviewer quality scores between the groups, as measured with a validated scale used routinely at this journal for over 20 years []. One might expect the mentoring intervention to have its greatest impact on the first three reviews performed, compared with all reviews performed during the course of the study; however, we found no difference in this sub-analysis. We also examined the performance trend of all reviewers (the change in quality scores over time) using a mixed linear effects model, reported in detail elsewhere, which corrects for editor and reviewer variables []. This method of performance measurement also showed no difference between the groups.
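For readers unfamiliar with this class of model, the sketch below shows one common way such an analysis can be set up; it is a minimal illustration on synthetic data, not the study’s analysis code. The column names and the data-generating values are hypothetical, and the editor covariates used in the real model (reported elsewhere) are omitted for brevity.

```python
# Illustrative sketch of a mixed linear effects model of review quality
# over time, with a random intercept per reviewer. Synthetic data; all
# names and values are hypothetical placeholders, not the study data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for r in range(30):                        # 30 hypothetical reviewers
    group = "mentored" if r % 2 == 0 else "control"
    base = rng.normal(3.6, 0.3)            # reviewer-specific baseline quality
    for seq in range(1, 7):                # 6 reviews each
        rows.append({
            "reviewer_id": r,
            "group": group,
            "review_seq": seq,
            "quality": base - 0.04 * seq + rng.normal(0, 0.4),
        })
df = pd.DataFrame(rows)

# Fixed effects: study arm, review sequence, and their interaction
# (does the quality trend over successive reviews differ by arm?).
# The random intercept, grouped by reviewer, absorbs reviewer-to-reviewer
# variation in baseline quality.
model = smf.mixedlm("quality ~ group * review_seq", data=df,
                    groups="reviewer_id")
print(model.fit().summary())
```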
We asked the study participants for free-text comments about the mentoring experience, and approximately half provided comments. The majority were neutral (including from those who received mentoring elsewhere or felt that what they learned was mostly journal-specific format or style), three were positive, and only three were very positive.
Since the majority of reviewers at the journal Medical Education wanted formal training in reviewing and 80% would have liked to seek a colleague’s opinion [], we thought that pairing a junior reviewer with a senior reviewer of similar topic expertise might serve this purpose. However, this was apparently not the case. Previous studies have reported that written feedback to reviewers, workshops, and self-taught training packages did not produce lasting improvements in peer review (4–7).
For those who might think this lack of efficacy is aberrant or unique to our journal’s environment, similar results have been reported for teaching physicians critical appraisal skills in other settings. A Cochrane review found only one randomized trial of teaching critical appraisal skills that was sufficiently rigorous, and concluded that the effects of teaching critical appraisal remain debatable []. Another educational trial, which randomized practitioners to a half-day critical appraisal skills workshop or a wait-list control, found that those who took the course had a higher overall knowledge score but no differences in overall attitude toward evidence, perceived confidence, or other areas of critical appraisal ability (methodology or generalizability) []. Finally, a systematic review of journal clubs reported that studies showed an improvement in knowledge of clinical epidemiology and biostatistics but no improvement in critical appraisal skills.
Our study was limited by several factors. The sample size was small, but it included all new reviewers over a four-year period, and the confidence intervals on our mean scores were not wide, limiting the potential for type 2 error (the effect size was 0.1, with a 95% CI of −0.4 to 0.6). The study was also conducted at a single specialty journal; however, given other studies demonstrating that our reviewers are similar in their characteristics to those at other journals and specialties [], it seems unlikely that this intervention would yield significantly different results elsewhere. Our mentors were not given specific training in mentoring techniques, although most worked in academic settings where mentoring skills would be expected to be strong; they did, however, have a proven track record of high-quality reviews over a long period and an expressed willingness to mentor others. Mentors and mentees were encouraged to communicate by email or phone but were not given more explicit or rigid guidelines, since throughout we were aiming for an intervention that was logistically feasible and likely to be implemented by journals. In addition, the actual mentoring and communication were not observed or evaluated by any outside party, so we cannot comment on their consistency. It is possible that the results of this study would have been different had the mentors all received formal training specific to this goal, or had the communication between mentor and mentee been more standardized, more frequent, or mandated for a longer period. We did not impose these requirements because we felt they would limit compliance and make it much less likely that a typical journal would invest the energy to implement this technique.
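As a rough illustration of the confidence-interval argument, the standard error implied by the reported interval can be recovered by simple arithmetic. The sketch below does only that, using the numbers given in the text; it assumes a normal-approximation CI, which the paper does not state.

```python
# Arithmetic only: recover the standard error implied by the reported
# between-group difference (0.1) and its 95% CI (-0.4 to 0.6), assuming
# a normal-approximation interval (an assumption, not stated in the paper).
diff, lo, hi = 0.1, -0.4, 0.6
half_width = (hi - lo) / 2       # 0.5 quality-scale points
se = half_width / 1.96           # ~0.26 points
print(f"implied SE = {se:.2f}")
print(f"largest benefit consistent with the data = {hi} points")
```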
One might theorize that the absence of observed efficacy in our study occurred because all of our new reviewers already perform at a relatively high level, so the potential margin for improvement is too small to be detectable. Our routine processes may prepare our reviewers for their tasks better than those of some other journals. Upon recruitment we refer all new reviewers to our training module [] (although we do not enforce its use), and upon completion of each review we give them access to the comments of the other reviewers and the editor. Additionally, many of our reviewers, although new to our journal, had prior experience reviewing at other journals, had taken formal training, or both, and all had had at least one form of prior training or mentorship (Table ).
However, this explanation seems less likely because this sample included all new reviewers, including those who were self-referred, and no screening was performed (or possible) to select higher-quality reviewers in advance []. The study cohort’s average quality score was not significantly different from the 3.61 (95% CI 3.57 to 3.61) of the larger, longer-term reviewer pool []. Similarly, their slope of change over time was similar to the slope of −0.04 (−0.039 to −0.042) for the larger pool []. Since on our scale these scores fall between “acceptable” and “commendable”, there is plenty of room left for improvement. As further evidence against the explanation that our reviewers were atypically trained and adept, the mentored group had a significantly greater number of training experiences than the control group (Table ), but the expected better performance did not materialize.
This study adds to the list of those that have not found a successful formula for improving reviewer performance. The reasons remain unproven, but a major one may be that teaching and improving writing skills is a very complex task that can be accomplished only through very extensive mentoring, ideally provided promptly, with a rapid opportunity for the learner to absorb feedback, practice, and improve. None of these characteristics is present in the peer review process of most journals: feedback is minimal and arrives long after the reviewer’s critical thinking is completed. The feedback needed to improve high-level analytic and writing skills is particularly detailed and time-consuming for both advisor and advisee, far beyond the resources of even the largest journals to provide []. This is especially true since most participants in the process are unpaid volunteers, and internal quality assurance programs at journals are uncommon.