More than 50% of systematic reviews (both Cochrane reviews and reviews published in paper based journals) did not specify in the methods whether and how they would use quality assessment in the analysis and interpretation of results. Cochrane reviews fared better than paper based reviews in carrying out quality assessment but were equally unsuccessful in taking it into account.
Since 1987 several authors have explored the extent to which systematic reviews and meta-analyses incorporate quality assessment into the results, with unsatisfactory findings.22,31-34
Moher et al analysed 240 systematic reviews19: quality assessment was carried out more often in Cochrane reviews than in reviews published in paper based journals (36/36, 100% v 78/204, 38%; P < 0.001), although only 29 took such assessment into account during their analysis (paper based reviews 27/78, 34% v Cochrane reviews 2/36, 6%; P < 0.001).
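The two comparisons quoted above can be rechecked from the reported counts alone. The following is a minimal sketch, assuming an uncorrected Pearson chi-square test on each 2x2 table; the text does not state which test produced the reported P values, so the choice of test, the helper names (`pearson_chi2_2x2`, `compare`), and the dictionary layout are illustrative assumptions rather than the original authors' method.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square statistic and P value (1 df)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Counts quoted in the text: (reviews with the feature, reviews examined).
assessed = {"Cochrane": (36, 36), "paper based": (78, 204)}
incorporated = {"Cochrane": (2, 36), "paper based": (27, 78)}


def compare(groups):
    """Build the 2x2 table (feature present v absent) and test it."""
    (x1, n1), (x2, n2) = groups["Cochrane"], groups["paper based"]
    return pearson_chi2_2x2(x1, n1 - x1, x2, n2 - x2)


chi2_assessed, p_assessed = compare(assessed)
chi2_incorporated, p_incorporated = compare(incorporated)

for label, chi2, p in [
    ("quality assessment carried out", chi2_assessed, p_assessed),
    ("assessment taken into account", chi2_incorporated, p_incorporated),
]:
    print(f"{label}: chi2 = {chi2:.2f}, P = {p:.2g}")
```

Under this assumption both comparisons fall below the P < 0.001 threshold given in the text. A continuity corrected or exact test could give a somewhat larger P value for the second, smaller table.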
During the past 15 years research has concentrated on two main issues: which components of the quality assessment (for example, allocation concealment) are predictive of valid results and what tool (scales or checklists) best assesses quality. In 2003 Egger et al found that allocation concealment and double blinding were strongly related to treatment effects.4,20,35
Despite the dozens of quality scales and checklists that have been proposed,5,7,18 the answer is still unclear and many doubt that a generic quality assessment tool that would prove valid in all cases can ever be found. In our study the most frequently used tool was the Jadad scale, a tool that has been criticised for its low sensitivity and that does not consider allocation concealment because it was developed before the importance of concealment was established.21
Moreover, less attention has been paid to exploring how quality can be used in the interpretation of the results of systematic reviews.8,9,13,36
As Cochrane reviews are preceded by a published protocol and the Cochrane handbook mandates that some form of quality assessment of primary studies is to be done,37 it is not surprising that authors of Cochrane reviews state that they will carry out quality assessment and do so. Yet when it comes to the more complex, but potentially relevant, aspect of incorporating quality into the results, Cochrane reviews fared no better than their paper based counterparts.
These findings may have several explanations. That Cochrane reviews provide more details may be due to the absence of limitations on space in electronic publications; however, most medical journals now publish a web version of their papers, with different space restrictions. We investigated this potential of the electronic versions but found that none of the paper based reviews was supplemented with an electronic appendix of quality assessment. It is also possible that authors are unaware of the option of publishing extensive electronic versions, or that editors are not interested in publishing them. Moreover, most authors of paper based articles may be aware of space limitations imposed by journals and thus omit details of quality assessment because they believe that they are not relevant to their results. This could be a reflection of a bad practice: a more common use of unplanned outcome dependent subgroup analyses. Where subgroup analyses are not predefined, the risk exists that data may have been dredged in search of a significant result.38
Cochrane systematic reviews are, at least in principle, protected against post hoc analyses through the preliminary publication of a detailed study protocol.
Limitations of the study
One limitation of our study is that the Cochrane reviews were published between 1995 and 2002, whereas the paper based articles were first published in 2001. Although older, the Cochrane reviews still fared better for frequency of quality assessment. The Cochrane reviews were, however, remiss in their efforts to incorporate the quality assessment findings in the presentation and interpretation of results. We believe that this difference in publication dates did not affect our results because between 1995 and 2000 no major methodological advances or new consensus emerged in the literature on systematic reviews. Another possible limitation concerns whether DARE39 was an appropriate source from which to sample paper based articles. This is a legitimate concern as it may have led to the selection of a control group with better than average quality. Any such selection bias would move our estimate towards the null effect, that is, it would minimise the difference between the two types of reviews. Thus, our results could be understated.
A third possible limitation is that incomplete reporting might have influenced our assessment. Evidence, however, suggests that what is reported about important aspects of the conduct of a study typically does reflect what is actually done.40,41
Finally, we assessed quality assessment using a checklist that had been developed ad hoc. Although the lack of validation may be criticised, we believe that the items have good face validity: we attempted to reduce assessors' subjectivity, and the inter-rater reliability was acceptably high. We did have trouble with the classification of quality items, tools, and approaches, as there are innumerable ways to define study quality.7,20 It is possible that we recorded quality data with slightly different meanings from those intended by the authors of the studies. Given that almost all systematic reviews were incomplete in their reporting and given the narrative nature of the quality assessment, our checklist probably had a reduced ability to reflect what authors truly wanted to do and did. It was beyond the scope of our study to assess the appropriateness of the methods chosen for quality assessment by the authors.
Within the Cochrane Collaboration there is room for better standardisation of approaches to quality assessment. The Cochrane handbook should provide clearer guidelines on how to do it. Less clear is how to improve quality assessment in articles published in paper based journals. For such reviews, improvement may come only once a consensus on methodology for systematic reviews has been reached. Peer reviewers and editors should scrutinise systematic reviews to ensure consistency among the various sections and to avoid outcome dependent analyses.
We believe that more research is needed to understand how best to assess and incorporate the methodological quality of primary studies into the results of systematic reviews. Progress towards the necessary improvements highlighted by the results of this study may come from two international meetings planned for May and June 2005 that will be dedicated to, respectively, an improvement of the current section of the Cochrane handbook on quality assessment in systematic reviews and a revision of the QUOROM statement.2
What is already known on the topic
Appraisal of the methodological quality of primary studies is essential in systematic reviews
No consensus exists on the ideal checklist and scale for assessing methodological quality
The Cochrane Collaboration encourages a simple approach to quality assessment based on individual components such as allocation concealment
What this study adds
Approaches to quality assessment of primary studies by systematic reviews are heterogeneous and reflect a lack of consensus on best practice
Cochrane reviews assess methodological quality more often than paper based reviews
Both types of review failed to link the quality assessment to the interpretation of results in almost half of cases