The number of citations an article accrues by two years after publication can be predicted with about 60% surety using data available within three weeks of publication. The ability to predict citation counts in this study is higher than others have reported. One study was able to predict only 14% of the variance in annual citation rates using papers on emergency medicine and 12 possible predictor variables,2 whereas another study predicted 20% using articles from JAMA, the Lancet, and the New England Journal of Medicine. Our sample, however, represents a select group of articles that passed methodological criteria, and the greater predictability could result from lower variability in the dataset.
We have also shown that physician rated clinical relevance at the time of publication is related to citation counts at two years. This replicates results from a study which showed that the citation counts of papers important to rhinology were highly related to clinical utility.26
The quality of studies has been shown to be weakly or moderately related to citation counts.11 In our study we could not assess the influence of the quality of research methods because we included only articles that had passed basic methodological criteria, and we did not grade quality within this higher quality dataset.
Several groups have predicted citations using journal impact factors. When impact factors from 2004 were included in our derivation regression, the sample size was reduced by 182 articles because Cochrane reviews and HTA reports do not have impact factors. The resulting regression on the smaller dataset gave an R² of 0.47 (95% confidence interval 0.39 to 0.51). Given the importance of Cochrane reviews and HTA reports, we chose to include them and not to use the impact factor as a predictor variable.
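The effect of adding a predictor that is missing for some articles can be illustrated with a minimal sketch. It is not the analysis reported here: the data are synthetic, the variable names (relevance, impact_factor, log_citations) are hypothetical, and the percentile bootstrap is assumed only as one common way of obtaining a confidence interval for R². The sketch fits an ordinary least squares regression, drops the rows that lack an impact factor, and reports R² with an interval for the smaller dataset.

```python
import numpy as np

def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def bootstrap_r2_ci(X, y, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for R^2 (an assumed method, see lead-in)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample articles with replacement
        stats.append(r_squared(X[idx], y[idx]))
    low, high = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(low), float(high)

# Synthetic data only: 400 hypothetical articles.
rng = np.random.default_rng(42)
n = 400
relevance = rng.normal(size=n)
impact_factor = rng.normal(size=n)
impact_factor[:60] = np.nan  # 60 articles with no impact factor
log_citations = (0.8 * relevance
                 + 0.5 * np.nan_to_num(impact_factor)
                 + rng.normal(size=n))

# Model without the impact factor keeps every article.
X_full = relevance.reshape(-1, 1)
r2_without_if = r_squared(X_full, log_citations)

# Model with the impact factor must drop articles that lack one.
has_if = ~np.isnan(impact_factor)
X_sub = np.column_stack([relevance[has_if], impact_factor[has_if]])
y_sub = log_citations[has_if]
r2_with_if = r_squared(X_sub, y_sub)
ci_low, ci_high = bootstrap_r2_ci(X_sub, y_sub)

print(f"all articles, no impact factor: R2 = {r2_without_if:.2f}")
print(f"subset with impact factor:      R2 = {r2_with_if:.2f} "
      f"(95% CI {ci_low:.2f} to {ci_high:.2f})")
```

The point of the sketch is the trade-off described above: the second model gains a predictor but is estimated on fewer articles, so the two R² values refer to different samples and are not directly comparable.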
Cochrane reviews and HTA reports are systematic reviews of evidence, which are typically lengthy and have structured abstracts. Their average citation rates were low compared with the journal articles in our study. We believe that this underlies our findings of negative correlations between citation counts and the number of pages, the presence of a structured abstract, and being a review article. Without Cochrane reviews and HTA reports in our analysis, the associations of citation counts with the number of pages and with article type (original article versus review article) were no longer statistically significant, although structured abstracts remained negatively correlated with citation counts. As we had no prior hypothesis on the influence of Cochrane reviews and HTA reports on citation counts, however, we kept them in our regression model. We are aware that the reduced variability among these publications, as seen by the clustering of the residuals in figure 2, improved the performance of our regression.
Predicting citation counts early could allow providers of information resources to quickly identify those articles that are likely to have an impact on clinical practice. We hope to use this model to refine our approach to “pushing” detected articles to practising clinicians, authors, and publishers.
The positive association that we found between clinical ratings and citation counts could be because early dissemination of such articles actually leads to higher citation rates, not merely predicts citation. We could not assess this possibility, but the intended targets of the dissemination process are practising clinicians, not scientists who write papers. Thus our findings support an association that constitutes “criterion validity,” in that ratings predict an accepted measure of research merit.
Strengths and limitations
The strengths of this study include the large number of included articles (n=1261), the magnitude of the association in the derivation dataset, and the agreement between the results from the derivation and validation datasets. We have shown a statistically significant relation between ratings of the clinical relevance of an article and its citation count.
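Agreement between a derivation and a validation dataset can be checked along the following lines. This is a sketch under stated assumptions rather than the procedure used in this study: the data are synthetic, the 60:40 split and the two predictors are hypothetical, and R² in the validation set is computed from predictions made with the coefficients estimated in the derivation set.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients (intercept first) for predictors X."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r2_from_coef(coef, X, y):
    """R^2 of fixed coefficients on (X, y); can be negative on new data."""
    A = np.column_stack([np.ones(len(y)), X])
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

# Synthetic data only: 1000 hypothetical articles, two predictors.
rng = np.random.default_rng(7)
n = 1000
X = rng.normal(size=(n, 2))
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

# Hypothetical 60:40 split into derivation and validation sets.
order = rng.permutation(n)
deriv, valid = order[:600], order[600:]

coef = fit_ols(X[deriv], y[deriv])            # estimated on derivation set only
r2_deriv = r2_from_coef(coef, X[deriv], y[deriv])
r2_valid = r2_from_coef(coef, X[valid], y[valid])

print(f"derivation R2 = {r2_deriv:.2f}, validation R2 = {r2_valid:.2f}")
```

A validation R² close to the derivation R² suggests that the model is not simply fitted to noise in the derivation set; a large drop would suggest overfitting.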
Several weaknesses are present in our study. Our journal subset included only 105 of the most important clinical journals, a relatively small proportion of all such journals. Therefore our results may not be readily transferable to articles in less important clinical journals or to basic science articles or journals. Also, our selected articles were limited to those clinical articles that passed basic criteria for critical appraisal, and they represent a small proportion of the articles published in any given journal. Such a select sample would have reduced variability, resulting in greater predictability. The inclusion of the Cochrane reviews and HTA reports also reduced the variability and led to greater predictability.
We collected data on 20 journal specific and article specific characteristics of 1261 articles from 105 clinical journals to determine whether we could predict citation counts at two years. Eleven of these characteristics remained statistically significant in our regression model. Therefore we can predict citation counts of methodologically sound clinical studies and review articles at two years with about 60% surety using data available within three weeks after publication.
What is already known on this topic
- Citation counts are markers of an article’s importance but are not available for months after publication
- Research shows that various attributes of an article are related to higher citation rates, but the predictive value of these factors is limited
What this study adds
- Features of methodologically sound articles predicted citation counts with higher reliability than previously found
- Ratings of clinical relevance by practising clinicians are significantly associated with citation counts at two years