The present study investigated the quality of reporting of studies using the anti-CCP2 assay in RA patients according to the STARD statement. Differences between high- and low-quality studies were explored, and the effect of reporting quality on pooled estimates of diagnostic accuracy was examined. Our analysis focused on the reporting of methodological items (items in the methods and results sections). In total, the 103 articles (corresponding to 132 studies) covered a publication period of 23 years. Almost all of the articles were published after the introduction of the STARD statement (only 4 were published in 2003, the year of its appearance).
Although the overall reporting quality was relatively good (13 items were reported by 70% or more of the studies), some essential methodological aspects (such as the number/training/expertise of persons executing the tests, blinding of readers to results, information on recruitment, adverse events from performing the tests, and handling of missing responses and outliers) were seldom reported, making it difficult for the reader to assess the validity of a study explicitly. Comparing the quality of reporting in high- versus low-quality articles, significant differences were seen in a relatively large number of methodological items (11 items, referring to: study population, data collection, reference standard, definition of units/cut-offs, number/training/expertise of persons executing and reading the tests, methods for calculating diagnostic measures, dates of recruitment, clinical/demographic characteristics, information on recruitment, time interval between tests, and estimates of diagnostic accuracy).
Overall, the STARD quality score (high/low) had no effect on pooled sensitivity or pooled specificity. However, the meta-analysis showed an effect for specific STARD items. Studies that did not sufficiently report the methods used to calculate measures of diagnostic accuracy (item 1) may have overestimated sensitivity. In addition, the reporting of demographic and clinical characteristics/features of the study population (items 13 and 16) affected the effect size of specificity, i.e. it was overestimated, also indicating a spectrum bias [19].
However, the findings of the present synthesis (sensitivity of anti-CCP2, 71%; specificity, 96%) are compatible with those of earlier reviews (Nishimura et al. [27]: sensitivity, 67%; specificity, 95%; Whiting et al. [13]: sensitivity, 67%; specificity, 96%). Our overall sensitivity may have been overestimated because of the lack of stratification by study design or disease duration in the analysis.
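As an illustrative aside, pooled estimates of this kind are derived from per-study 2×2 counts; the minimal sketch below uses hypothetical counts (not data from this review) and simple inverse-variance pooling of logit-transformed proportions, whereas a full meta-analysis would typically fit a bivariate random-effects model:

```python
import math

# Hypothetical per-study 2x2 counts: (TP, FN, TN, FP).
# These numbers are illustrative only, not data from the review.
studies = [
    (150, 60, 280, 12),
    (95, 40, 190, 8),
    (210, 90, 400, 15),
]

def pooled_proportion(pairs):
    """Fixed-effect inverse-variance pooling of logit-transformed
    proportions; `pairs` is a list of (events, non_events) counts."""
    num = den = 0.0
    for events, non_events in pairs:
        p = events / (events + non_events)
        logit = math.log(p / (1 - p))
        var = 1 / events + 1 / non_events   # variance of the logit
        weight = 1 / var
        num += weight * logit
        den += weight
    pooled_logit = num / den
    return 1 / (1 + math.exp(-pooled_logit))  # back-transform to a proportion

# Sensitivity pools (TP, FN); specificity pools (TN, FP).
sens = pooled_proportion([(tp, fn) for tp, fn, tn, fp in studies])
spec = pooled_proportion([(tn, fp) for tp, fn, tn, fp in studies])
print(f"pooled sensitivity {sens:.2f}, pooled specificity {spec:.2f}")
```

Stratifying such a pooling by study design or disease duration simply means running it separately within each stratum, which is the analysis step whose omission may have inflated the overall sensitivity noted above.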
In a recent review, Whiting et al. [13] compared the accuracy of ACPA with that of RF for diagnosing RA in patients with early symptoms of the disease. They also assessed the included studies for methodological quality using a modification of the QUADAS criteria (items related to reporting quality were removed). However, the impact of methodological quality on diagnostic accuracy was not evaluated further. Nevertheless, the primary aim of the present study was to evaluate the effect of quality of reporting (according to STARD) on diagnostic accuracy, rather than the effect of methodological quality (according to QUADAS); both tools can be useful for assessing the quality of diagnostic studies from different perspectives [28].
Applications of the STARD statement for assessing the quality of reporting in diagnostic accuracy studies have been conducted in various medical fields, such as diagnostic endoscopy [29], juvenile idiopathic arthritis in peripheral joints [30], diabetic retinopathy screening [31], glucose monitor studies [32], optical coherence tomography in glaucoma [33], ultrasonography for the diagnosis of developmental dysplasia of the hip [34], and screening ultrasonography for trauma [35].
A limitation of the present study is that the literature search was restricted to PubMed. In addition, some studies may have been missed, since we included only studies providing data to estimate both sensitivity and specificity. However, the number of articles included is relatively large, allowing an overview of reporting quality to be obtained, and the conclusions reached are unlikely to be affected by omitted studies. We would like to stress that lack of reporting of a STARD item does not necessarily imply that this item was not performed; conversely, a badly performed but well-reported study will still receive full credit. Finally, the published studies had different design settings and involved different stages of rheumatoid arthritis (study design, disease duration), which may call the synthesis of information, and therefore the generalizability of the results, into question.
In conclusion, our attempt to assess the reporting quality of diagnostic accuracy studies in RA highlights the need for further improvement. Implementation of quality-of-reporting statements (e.g. CONSORT) has already improved the quality of reporting in other fields of medical research [36]. Thus, guidelines on the reporting of diagnostic accuracy studies are expected to improve the quality of reports of diagnostic studies as well. Finally, the overall study quality score had no effect on the pooled estimates of diagnostic accuracy.