This study investigated the association between the quality of an article's statistical reporting and analysis and the number of citations it received. In this set of articles, failing to state essential information, such as the primary research question or the primary outcome variable did not affect the number of citations the article received. However, a sufficient description of the methods used was an important factor in increasing the number of citations received in two of the four journals. Statistical errors and sample size were not associated with number of citations received. Reporting quality was associated with the journal visibility and prestige.
West and McIlwaine [
14] have analyzed citation counts in the field of addiction studies. They report that there was no correlation between number of citations and expert ratings of article quality. Callaham et al [
15] examined a cohort of published articles originally submitted to an emergency medicine meeting and also reported that the impact factor of the publishing journal, not the peer rating quality of the research, was the strongest predictor of citations per year. Our findings concerning the statistical quality are in line with these findings.
The importance of stating the purpose and a priori hypotheses of a research project in the report is obvious, but such a statement was often (in 34.6% of papers) missing. In these cases, the results cannot be interpreted in light of a priori hypotheses. Further, unless the research question is clearly stated, the appropriateness of the study design, data collection methods and statistical procedures cannot be judged. For other researchers to cite the paper, however, it does not appear to matter whether the initial purpose of the cited study was clear, or whether the analyses are exploratory and speculative.
We found that 25% of the articles were difficult to read due to an unclear definition of the primary response or outcome variable. Although it is valuable for medical studies to evaluate several aspects of patients' responses, it is important to identify a small set of primary outcome or response variables in advance [
16]. It is also important that the results for primary responses (including any non-significant findings) are fully reported [
9]. Focusing on clearly stated primary response measure(s) helps both the investigators to write an understandable and compact report and the readers to evaluate the findings. Again, though, our results indicate that having an unclear primary response or outcome variable does not lower the citation count and so does not appear to restrain other researchers from using the paper.
Articles with clearly documented research methods did receive more citations. This association was more marked in papers published in AJP and BJP. In our sample, documentation of statistical methods used was generally sufficient in AGP (92.2%), consistent with the editorial policy of the journal which requires an extended methods section in submitted manuscripts.
We included in our review four general psychiatric journals with different prestige and visibility. By involving several journals we were able to control for the effect of journal visibility on the number of citations received and compare the prestige of a journal with the quality of statistical presentation. The reporting of statistical information was more detailed, comprehensive and useful for the reader in the two leading journals (AGP and AJP). Again, this is consistent with their detailed guidelines for presenting statistical results, and also a more rigorous review process, including extensive statistical reviewing [
17]. In low-impact journals the peer review is undoubtedly less thorough [
6,
18]. Thus our results provide an important confirmation, for editors, authors and consumers of research, on the value of guidelines and rigorous statistical reviewing.
Several findings have demonstrated that a non-negligible percentage of articles – even those published in 'high -prestige' journals –, are not statistically faultless [
6,
8,
19,
20]. Our findings are in line with these studies, and also demonstrate inadequate reporting of research methods and hypotheses. However, most of the statistical problems in medical papers are probably relatively unimportant or more a matter of judgment. As there is also no general agreement on what constitutes a statistical error, the comparison of different statistical reviews is difficult [
8,
21]. There may be several valid ways of analyzing a data set.
It has been claimed that researchers prefer to cite large studies rather than small studies [
22]. Our data does not support this hypothesis: sample size was not associated with the frequency of citations. Callaham et al [
15] came to the same conclusion when they analyzed a set of emergency medicine articles. Textbooks of medical statistics require that the sample size should be large enough (or as large as possible) and that some justification for the size chosen should be given [
23]. Unfortunately, our results suggest the concept of sample size calculations seems to be almost unknown in psychiatric research outside the field of clinical trials; less than 4 % of the evaluated articles included sample size calculations, power analysis or any other justification for the sample size.