What these examples show is that neither the competence of the authors nor the prestige of the journal is any guarantee that the results of a meta-analysis do not need checking. Expert authors make mistakes that the review process does not correct. It therefore follows that an important standard by which a meta-analysis is to be judged is checkability. I propose that the following five points should be adopted by the community of meta-analysts and users if we are to improve the reliability of meta-analysis.
1. Be vigilant about double counting.
2. Make results checkable.
3. Describe approaches to analysis in detail.
4. Judge the meta-analysis not the analyst.
5. Create a culture of correction.
As regards the first of these, I hope that I have given sufficient examples to put potential users on guard. Although I consider that quality checklists, however good, are of little relevance when deciding whether to trust a meta-analysis, they are potentially useful in warning would-be analysts what to consider. In this respect, however, the current favourite, the Oxman and Guyatt score, is quite inadequate as it does not warn the user of potential problems. Furthermore it has a bias in favour of inclusion. The ten points included (see Oxman et al [2], page 1272) are
1. Were the search methods reported?
2. Was the search comprehensive?
3. Were the inclusion criteria reported?
4. Was selection bias avoided?
5. Were the validity criteria reported?
6. Was validity assessed appropriately?
7. Were the methods used to combine studies reported?
8. Were the findings combined appropriately?
9. Were the conclusions supported by the reported data?
10. What was the overall scientific quality of the overview?
Of the ten points, one (point 2) explicitly stresses the importance of being comprehensive and five (points 1, 3, 4, 5 and 6) also address inclusion, whereas only a researcher who was already sensitised to the problem of double counting (say) would take point 8 as a reminder to pay attention to it.
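As an illustration of the kind of vigilance about double counting that such a checklist does not prompt, the following is a minimal sketch, not a method taken from any of the papers discussed. It assumes that each comparison entering a pooled analysis can be labelled with the study arms it uses. The function name `find_reused_arms` and the example trials are hypothetical.

```python
from collections import defaultdict


def find_reused_arms(comparisons):
    """Return the arms that enter more than one comparison in a pooled analysis.

    Each comparison is a (label, treatment_arm, control_arm) tuple, and each
    arm is identified by a (study, arm_name) pair. This data layout is an
    assumption made for illustration only.
    """
    uses = defaultdict(list)
    for label, treatment_arm, control_arm in comparisons:
        uses[treatment_arm].append(label)
        uses[control_arm].append(label)
    # An arm that appears in two or more comparisons is a candidate for
    # double counting if those comparisons are pooled as if independent.
    return {arm: labels for arm, labels in uses.items() if len(labels) > 1}


# Hypothetical example: a three-arm trial entered as two comparisons
# that share the same placebo patients.
comparisons = [
    ("Trial A: low dose vs placebo", ("Trial A", "low dose"), ("Trial A", "placebo")),
    ("Trial A: high dose vs placebo", ("Trial A", "high dose"), ("Trial A", "placebo")),
    ("Trial B: active vs placebo", ("Trial B", "active"), ("Trial B", "placebo")),
]
reused = find_reused_arms(comparisons)
for arm, labels in reused.items():
    print(arm, "is used in", labels)
```

A flagged arm is only a warning: whether it amounts to double counting depends on how the analysis handled the shared patients, which is exactly what a detailed description of the analysis should make checkable.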
The implementation of my second proposal is partly constrained by resources. One inherent advantage that meta-analyses of the Cochrane Collaboration have over others is the amount of space that is allowed compared to journals. This is a point in their favour. However technological advances are making it easier for journals to match this through supplementary material provided on the web and this is what we have to strive for.
The third point requires a recognition and acceptance that meta-analysis is, contrary to what is sometimes maintained, not simple after all. It is not just a question of pushing data into some software sausage machine and waiting for a summary to appear. Empowering the statistically innocent to perform statistical analyses has its drawbacks. Many choices have to be made along the way and not all are uncontroversial. In consequence it is necessary to describe those choices in some detail.
The fourth point is that we should recognise that even experts can make mistakes and even those with motives we mistrust can have good arguments. There is a rather silly secondary literature of meta-analysis that seeks to award quality points for overviews from this or that source. Even if the quality instruments being used were appropriate (and they are not) the false positives and negatives in any screening procedure based on such class scores would be so numerous as to make the information nearly worthless in judging whether to trust an individual analysis. Consider the case of Lee's checks [17] and Hackshaw et al's meta-analysis [16]. Lee works as a consultant to the tobacco industry – enough reason to distrust him when passive smoking is being discussed, many would say. Hackshaw et al are public health experts with a considerable reputation. Enough grounds to trust them, many (including me) would claim. However, the trust or mistrust we have in the meta-analysts is irrelevant once we have got to the point of debating a scientific issue such as whether a quoted standard error must be too small.
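Whether a quoted standard error must be too small is a question that can be settled by arithmetic rather than by reputation. The following is a minimal sketch of one such check, offered as an illustration and not as a reconstruction of the check Lee actually carried out. It assumes an unadjusted log odds ratio from a case-control study, whose variance is 1/a + 1/b + 1/c + 1/d for cell counts a, b (cases) and c, d (controls). For fixed numbers of cases and controls this is smallest when each group splits evenly, giving a lower bound of 4/n_cases + 4/n_controls. The function names are hypothetical, and adjusted estimates may need a different argument.

```python
import math


def min_se_log_odds_ratio(n_cases, n_controls):
    """Smallest possible standard error of an unadjusted log odds ratio.

    The variance 1/a + 1/b + 1/c + 1/d, with a + b = n_cases and
    c + d = n_controls, is minimised when a = b and c = d.
    """
    return 2.0 * math.sqrt(1.0 / n_cases + 1.0 / n_controls)


def se_from_ci(lower, upper, z=1.959964):
    """Standard error of the log odds ratio implied by a quoted 95% interval.

    Assumes the interval is symmetric on the log scale.
    """
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def se_is_impossibly_small(quoted_se, n_cases, n_controls):
    """True if the quoted standard error is below the attainable minimum."""
    return quoted_se < min_se_log_odds_ratio(n_cases, n_controls)


# Hypothetical numbers for illustration only.
bound = min_se_log_odds_ratio(100, 100)
print(round(bound, 4))
print(se_is_impossibly_small(0.10, 100, 100))
print(se_is_impossibly_small(0.40, 100, 100))
```

The check needs only the group sizes and the quoted interval, both of which a checkable report should provide, and its outcome does not depend on who carries it out.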
My final point is that journals should devote more space to the correction of previous work and that we need a mechanism for flagging problems with papers once identified. For example, as far as I am aware, the BMJ has not issued notes correcting either of the two meta-analyses [12, 37] mentioned in this article, despite the fact that the problems have been pointed out to the editors. Peter Lee [17] drew attention in Statistics in Medicine (SiM) to the problem with the BMJ paper on passive smoking but a recent paper [38] in SiM not only does not cite Lee but cites the paper on passive smoking and uses it to illustrate a method to deal with missing studies, the opposite of the known problem! The editors of the Journal of The Royal Statistical Society Series C refused to publish a letter by Andy Grieve and me pointing to some problems with Peters et al [22], including that mentioned here. Over two years after I informed the Cochrane Collaboration regarding the double counting in the otitis media meta-analysis [39], there is still no correction. The editors of JAMA initially declined to take any action regarding the corresponding paper [9] when I brought it to their attention and I still wait to see what they will do about it.