What is peer review assumed to accomplish?
Peer review is considered 1) a screening instrument which lets some material through the gates but refuses entry to other submittals, and 2) an editing instrument that turns articles allowed through the gates into better-written or better-edited texts. Experts in peer review have suggested that "the two principal functions of peer review" are "filtering out incorrect or inadequate work and improving the accuracy and clarity of published reports." [2]. These functions have been further categorized as (1) "selecting submissions for publication" and "rejecting those with irrelevant, trivial, weak, misleading, or potentially harmful content," and "(2) improving the clarity, transparency, accuracy, and utility of the selected submissions." [3] Distinguishing between the ability to evaluate the scientific content (i.e., the "selection," "gatekeeping," "screening" or "deciding what gets published" functions of peer review) and the ability to provide effective feedback on the content, writing or language (i.e., the "improving what gets accepted" function of peer review) would help make explicit which skills make peer reviewers useful to editors and authors. This is important because the ability of peer review to perform the "improving" function effectively has been questioned not only by wordface professionals [4] but by researchers in peer review [5].
Some editors [6] have found that even careful, prospective research cannot reliably identify characteristics of good reviewers, ways to train reviewers to become better, or characteristics that contribute to good reviewing skills. A recent editorial in Nature also recognized the problem with peer review quality:
What right has [an author] to expect a high quality of peer review? What training is being given in his or her own lab to ensure that the next generation understands how to do a good job of critically appraising others' work? And as the pressures on researchers grow–bureaucracy from institutions and funding agencies, incentives to apply the outcomes of research–the very motivation to do a conscientious job of peer review is itself under pressure [7].
Many editors seem to be unaware that the ability to provide helpful feedback on different quality dimensions requires skills that cannot be assumed to be "standard equipment" in all potential reviewers. A hypothesis worth considering is that discipline-specific content is more likely to be judged objectively because this is where gatekeepers' expertise is greatest. In contrast, language and writing features are more likely to be judged subjectively because gatekeepers' expertise in this dimension varies widely. The latter is probably influenced by individual characteristics such as the reader's native language and culture, and personal preference for language and writing style [8]. As a result, feedback about the language and writing may be less likely to help authors improve their manuscripts than feedback about the specialized content.
Evidence of unhelpful feedback about the language and writing
Author's editors and translators who help authors interpret reviewers' feedback frequently observe that reviewers are quick to complain about "the English." Although reviewers sometimes correctly identify problems with technical language or first-language interference, they often claim that a manuscript requires "substantial review and editing by a native English speaker" when in fact they may be reacting to usage or argumentation that is appropriate but different from their preferred style. Below I list some of the changes made or requested by gatekeepers that can make the text harder instead of easier to understand.
1. Edits to improve "good scientific English style": the corrections can introduce unfortunate word choices, jargon, undefined or unneeded abbreviations, and other technical editing errors.
2. Changes in terminology and nomenclature: the reviewer's knowledge may not be up-to-date.
3. Corrections in grammar and syntax: reviewers may overestimate their proficiency in written English.
4. Changes in organization: reviewers may request changes that disrupt the logical flow of ideas.
5. Changes in argumentation and rhetoric: sometimes "non-standard" rhetorical strategies used by authors are more appropriate than the type of writing the reviewer prefers.
Wordface professionals often agree with researchers who feel reviewers have provided contradictory feedback about the writing or complained about "the English" even when native speakers of English wrote, translated or revised the material. Table shows the frequency with which feedback about the English or the writing was considered unhelpful by a sample of experienced STM translators, author's editors and medical writers.
Native-English-speaking author's editors' perceptions of the usefulness of feedback from journal gatekeepers about the language. Questionnaire survey, October 2007. N = 25, response rate 40%.
Although consensus between reviewers is not necessarily one of the aims of peer review, contradictory feedback about the writing is unhelpful if not accompanied by guidance from the editor. The unhelpful comments made by some reviewers may reflect their tendency to consider their role as "one of policing rather than identification of work that is interesting and worth publishing." [9] As gatekeepers, some reviewers may assume it is more important to find reasons to reject a submittal than to help make worthy but imperfectly polished manuscripts better. As busy professionals with limited time to spare for non-remunerated but demanding work, reviewers may be more highly motivated to find a few fatal flaws than to undertake the more time-consuming task of providing constructive feedback.
Although many additions reviewers suggest do improve research articles, an undesirable outcome of peer review is the introduction of changes that the authors know to be wrong but which are added "to conform to the referee's comments." [9] Reviewers' comments that force authors to rewrite a paper "in ways that sometimes do not support, but rather weaken" their arguments have been a concern in social science disciplines for decades [10]. Researchers I have worked with have, at the reviewer's behest, added unnecessary citations and even whole paragraphs which had the unfortunate side effect of disrupting the logical flow of ideas. As a result, published articles may be less coherent, less persuasive, and less attractive to readers than they might have been if the reviewers had shown more flexibility and asked themselves whether their suggested changes actually improved the text.
Decline in editorial tolerance for writing that departs from readers' expectations
Many authors do not have ready access to professional editorial help – a problem with the potential to worsen the North-South and West-East information imbalance [11]. Moreover, reviewers and editors may no longer be as willing or able as they were before to provide extensive help with the writing or language [13]. Programs such as AuthorAID will attempt to palliate geographical imbalance in access to high-quality author editing and language help [11].
Meanwhile, journals in some disciplines seem to be abandoning manuscript editing, a trend which seems to parallel a similar decline in editorial tolerance for imperfect English. To study the trend among STM journals to dispense with editing, I compared policies at four large commercial publishers: Springer, Elsevier, Wiley and Blackwell. (The latter two publishers merged in February 2007.) Current policies, discussed here, [See additional file 1: Publishers' language policies] reflect a range of positions from an appreciation of authors' difficulties in writing well to explicit statements that the publisher is not prepared to edit accepted manuscripts.
Although trends differ between disciplines, recent years have seen a decrease in the number of journals that are willing or able to undertake high-quality editing. For example, in 1993 Jill Whitehouse, then Executive Editor of Physiotherapy, published an article titled "Readability and clarity" in which she described "the responsibilities of reviewers of articles in helping authors improve their writing style." Reviewers for this journal were expected to provide feedback on both the content and the "style," defined by this editor as features that enhanced "clarity of communication and elegance." [14]
Currently the journal, published by Elsevier, offers sparse advice about the standard of writing or language authors are expected to meet: "Please write your text in good English (American or British usage is accepted, but not a mixture of these)." [15] There is no longer any indication that reviewers or editors consider it their job to attend to "style."
Debate among editors on the WAME listserve in late 1999 reflected the change in attitude toward the effect of language and writing on a manuscript's chances of acceptance. Robin Fox wondered whether "pragmatism will prevail over fairness," and editors debated what could be done to ensure that the quality of the writing was as good as the quality of the content [16]. Some editors felt the language burden created an uneven playing field that posed additional obstacles to publication for researchers whose first language is not English. Some said they were glad to spend extra time on manuscripts with language or writing problems. However, a few editors admitted that because of practical considerations it might be necessary to reject manuscripts that reported good work if they needed too much editing (i.e., more editing than the editor or publisher could afford to provide).
The latest edition of the American Medical Association (AMA) style manual offers no advice on writing or text revision but contains an abundance of rules on specific points of grammar, usage and technical style [17]. Although it is considered a de facto standard for medical publishing in English (at least in the USA), the AMA manual lacks advice on the type of writing gatekeepers at biomedical journals are likely to find acceptable. It does, however, note that poor writing is considered a legitimate reason to reject a manuscript (p. 265).
To compare policies across disciplines I also looked at how the style manuals of the American Psychological Association and American Chemical Society [See additional file 2: American Psychological Association and American Chemical Society language policies] handle peer review of the language and writing.
My own experience with manuscripts published in different journals since the mid-1980s suggests that in general, only the biggest, wealthiest, highest-impact-factor journals continue to provide good copyediting as part of their added value services. Current practices are changing and differ between journals and between publishers, so reviewers may feel confused as to what they are expected to comment on. As a result they may assume that they should attempt to improve the writing or language even if (or perhaps precisely because) it is no longer the journal's or publisher's policy to provide this service.
Application of a two-category coding system for content analysis
Analyzing the guidelines for reviewers according to the two quality dimensions suggested here–specialized content and writing–will show which criteria are likely to be evaluated more objectively and which are likely to be evaluated more subjectively. The criteria used to judge the specialized content should help answer the question, "Does the manuscript report questions, findings and ideas that readers ought to know about?" The criteria used to judge the writing should help answer the question, "Will readers understand well enough what the authors are trying to say?"
Coding advice reliably as pertaining to either the content or the writing requires a taxonomy of features that can be identified easily and reproducibly. Table shows a tentative list of words and phrases that label instructions or comments as relating to one dimension or the other.
Markers of content-related and writing-related information in guidelines and feedback intended for authors and reviewers
As a preliminary test of the usefulness of using just two categories to classify the content, I analyzed different types of texts that contain advice for authors or reviewers. The results of this exercise are reported here. [See additional file 3: Test of the 2-category coding system]
These preliminary quantitative analyses suggest that the 2-category system is applicable, but replication by many more raters is needed with a large sample of instructions to reviewers, reviewers' reports and instructions to authors.
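As a rough illustration of how a marker-based coding pass of this kind might be automated before human raters check the result, consider the minimal sketch below. It is not the procedure used for the analysis reported in the additional file. The marker stems, the function names `code_criterion` and `category_shares`, and the example criteria are all hypothetical placeholders, because the table of markers is not reproduced in this text.

```python
import re
from collections import Counter

# Hypothetical marker stems, for illustration only. They are NOT the
# article's own list of content-related and writing-related markers.
MARKERS = {
    "content": ["hypothesis", "design", "data", "analysis", "statistic", "method"],
    "writing": ["clarity", "style", "grammar", "english", "readab", "concise"],
}


def code_criterion(text):
    """Code one criterion as 'content', 'writing', 'both' or 'uncoded'."""
    lowered = text.lower()
    hits = {
        category
        for category, stems in MARKERS.items()
        # A stem counts as a hit when it starts a word in the criterion.
        if any(re.search(r"\b" + re.escape(stem), lowered) for stem in stems)
    }
    if len(hits) == 2:
        return "both"
    if hits:
        return hits.pop()
    return "uncoded"


def category_shares(criteria):
    """Return the proportion of criteria assigned to each code."""
    counts = Counter(code_criterion(c) for c in criteria)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: count / total for category, count in counts.items()}


# Invented example criteria, not taken from any journal's guidelines.
criteria = [
    "Is the study design appropriate to the hypothesis?",
    "Are the statistical methods described adequately?",
    "Is the manuscript written with clarity and in good English?",
    "Is the topic of interest to readers?",
]
print(category_shares(criteria))
```

Criteria that match no marker are left "uncoded" rather than forced into a category, and criteria that match both lists are flagged as "both", so that a human rater can resolve them. A keyword pass of this sort could only supplement, not replace, the replication by multiple raters called for above.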
Other content analysis studies of quality criteria
As shown in an analysis of 35 sets of instructions to authors by Schriger et al. [18], there are unresolved issues with content validity. Study 2 in this article counted the frequencies of words pertaining to 18 different categories grouped into 4 major classes. Only 5 journals devoted more than 10% of the words to scientific content. Although differences in the classification method and the type of document analyzed make comparisons problematic, their low figures for content-related criteria contrast with my preliminary finding that 71% of the criteria reviewers were asked to consider pertained to the content (Table 3 in reference 18). None of the 18 categories considered by Schriger and colleagues were related specifically to the quality of the language or writing. However, their "scientific content" class included 3 categories for "content or style," "methodology or statistics" and "general content." This last category included instructions about format and style along with information that could not be assigned to any of the other 17 categories.
The large difference in content-related criteria between the classification by Schriger and colleagues and the 2-category system proposed here probably arises because what Schriger and colleagues called "content" comprised a mixture of advice on format, style and reporting. It therefore cannot be compared to "content" as considered here, i.e., hypothesis, experimental design, data and analysis.
At issue, however, is not the magnitude of the difference in the proportion of comments considered to pertain to content. The methodological issue here is that the two analyses cannot be compared because of the differences in how content-related comments are defined and classified by different authors. Difficulties in defining text-based variables for content analysis were noted in a similar study that compared comments to authors provided by methodology and regular reviewers [19]. The methodological pitfalls of content analysis aimed at "deciding which comments refer to which text features" were also pointed out by Belcher in a study of reviewer feedback to authors whose first language was not English [20].
Other categories in addition to content and writing hold potential to shed light on the peer review process. One potentially useful category is "reporting," since the damage weak reporting does to scientific communication is now clear [21]. So much weak reporting reaches print because peer review fails to detect and correct faults, so training gatekeepers in how to identify problems with study design, methodology, statistical analysis and data reporting is one way to make peer review more effective.
A recent paper in BMC Medical Research Methodology
] classified comments about manuscripts as pertaining to science (i.e., content), journalism or writing. The JAMA study used a third category (journalism) because this leading journal, like other high-impact publications, considers many non-content-related factors in its peer review decisions [23]. Most journals, however, could probably obtain useful information with content and writing as the sole classification criteria.
Insights into peer review by language and writing specialists and wordface professionals
Academic research in communication disciplines is helping to bring into focus some of the issues peer review research by gatekeepers has so far failed to consider. Some of this research is reviewed here. [See additional file 4: Academic research] Joy Burrough-Boenisch, a translator, author's editor and specialist in language for specific purposes, has worked with researchers from different linguistic, cultural and academic backgrounds to investigate readers' expectations for academic texts across a range of disciplines and native languages [24]. Her groundbreaking multidisciplinary research yielded findings that gatekeepers interested in serving their readers well might find stimulating. The findings, summarized here, [See additional file 5: Wordface research] support the notion that advice on "the writing" offered by scientific peers may be less helpful to authors than advice offered by professional editors or other communication professionals.
The reasons for this are not hard to grasp when the skills of discipline specialists and communication specialists are compared. Text revisers such as translators, language editors and copyeditors tend to make changes to improve readability, at least on a sentence or paragraph level. But if they are not subject experts, language professionals or copyeditors may miss deficiencies in the logic and argumentation because they do not grasp the scientific content. In contrast, peer reviewers (ideally) focus on the validity of the actual scientific content and reporting, and flag for the editor failings in the methods (for example, in the experimental design and statistical analysis) or reasoning (for example, interpreting the results within the context of previous knowledge). However, because of their diverse cultural backgrounds, not all reviewers and editors will have the same expectations for argumentation and internal coherence.
When gatekeepers and writing professionals work together
More than 10 years ago Richard Horton reflected on the suggestion that peer review was the equivalent of nothing more than good technical editing. Horton understood that peer review processes take place within two spheres: subject expertise and language expertise. Missing from peer review, he maintained, was the ability to provide authors with feedback on how persuasive their arguments were. He suggested that critical review of manuscripts by linguists could determine how effectively the authors had used language to support their point of view. "Such an analysis is part of the critical culture of science and would be a very welcome third component of peer review, in addition to qualitative and statistical assessment." [25] The reason why no journals seem to have acted upon Horton's suggestion to add rhetorical review to their peer review process may be related to editors' and reviewers' understandable lack of skill in the specialized task of applying "textual criticism of scientific discourse" to judge how persuasive a manuscript is. Such analyses are the domain of applied linguistics and discourse analysis, and require specialized knowledge to perform competently.
However, a few bold medical journal editors have ventured to work with experts in applied linguistics to investigate the challenges authors face when they try to write their research articles well in English. Thoracic surgeon and editor John R. Benfield, working with linguist Christine B. Feak, suggested that authors who use English as an international language need input from both language professionals and experienced peers [26]. This view–that two separate skill sets are involved in providing useful feedback that will help researchers become proficient, successful writers–echoes the evidence from research in language and writing [24]. Benfield had become convinced that "peers and language professionals working together are more effective as editors" than either type of corrector alone in improving research articles written by authors whose first language is not English [32].
At the Croatian Medical Journal, gatekeeper editors together with a manuscript editor analyzed how peer review could be used to teach researchers how to write well [33]. These editors perceived a need to provide intensive support to authors because they recognized that researchers often had valuable hypotheses and data but lacked the skills to present them. This led the editors to develop "an instructional editorial policy to increase the critical mass of researchers competent in scientific writing." As a result, the editors of Croatian Medical Journal developed author-helpful interventions to improve writers' competencies in four dimensions: study design, narrative, scientific reporting style and language.
These editors observed that translators used by the authors in their setting (a small central European country) often had "insufficient knowledge of medicine and the rules of scientific writing," but nonetheless believed that "the translator or language professional aware of [the] deep intellectual and informational need behind every recommendation within the ICMJE recommendations could substantially contribute to the quality of the manuscript by correcting or pointing out drawbacks (content-, structure- or language-related) of the manuscript to authors before they submit it for publication" (p. 130). This type of editorial input is in fact exactly within the remit of author's editors and "translators as editors" who work with researchers [34]. Wordface experts are already offering workshops to train non-subject-specialist language and writing professionals to handle specialist material competently [39].
Editors at Annals of Emergency Medicine have defined the two main functions of peer review in these words: "[w]e perform peer review not merely to select the best science but to improve it before publication." [41] Accordingly, this journal recommends that authors use "clear, succinct prose" and that they consider research reports as a "story," i.e., "an attempt to communicate an experience" that "brings the reader as close to the actual experience as possible." Its instructions to authors emphasize that manuscripts should be written in "the most direct" and "the clearest" manner possible. But the editors' criteria for clarity, succinctness or directness are not made specific. Readers' perceptions of these features may vary considerably, and may not be shared by all the journal's reviewers.
To clarify what this journal expects its peer review process to achieve, it made public its criteria for rating review quality [42] and subsequently explained these criteria more fully in the journal's Guide for Reviewers [43]. Two of the six criteria this journal uses to evaluate the quality of the reviews show an awareness that writing quality should be considered separately from scientific quality (from Table in reference 42):
The reviewer commented upon major strengths and weaknesses of the manuscript as a written communication, independent of the design, methodology, results, and interpretation of the study.
The reviewer provided the author with useful suggestions for improvement of the manuscript. ("improvement of the manuscript" could refer to the content or the language/writing, or to both).
It will be interesting to see how useful the explicit distinction between content and writing has been in helping reviewers to provide more useful feedback to authors.