Objective. There is substantial uncertainty regarding the prevalence of depression in RA. We conducted a systematic review aiming to describe the prevalence of depression in RA.
Methods. Web of Science, PsycINFO, CINAHL, Embase, Medline and PubMed were searched for cross-sectional studies reporting a prevalence estimate for depression in adult RA patients. Studies were reviewed in accordance with the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines and a meta-analysis was performed.
Results. A total of 72 studies, including 13 189 patients, were eligible for inclusion in the review. Forty methods of defining depression were reported. Meta-analyses revealed the prevalence of major depressive disorder to be 16.8% (95% CI 10%, 24%). According to the PHQ-9, the prevalence of depression was 38.8% (95% CI 34%, 43%), and prevalence levels according to the HADS with thresholds of 8 and 11 were 34.2% (95% CI 25%, 44%) and 14.8% (95% CI 12%, 18%), respectively. The main influence on depression prevalence was the mean age of the sample.
Conclusion. Depression is highly prevalent in RA and associated with poorer RA outcomes. This suggests that optimal care of RA patients may include the detection and management of depression.
Depression is more common in RA than in the general population and has been associated with increased pain, fatigue, reduced health-related quality of life, increased levels of physical disability and increased health care costs. Depressed RA patients have poorer long-term outcomes, including increased pain, more comorbidities and increased mortality levels. Depression may therefore be a useful target for interventions aimed at improving subjective health and quality of life in RA patients. However, prevalence estimates for depression in RA range between 9.5% and 41.5%, making it difficult to establish the likely impact of depression in this patient group.
There are various reasons why this variation in prevalence estimates may exist. First, the term depression is not clear-cut. Making sense of depressive symptoms in the context of chronic physical disease is challenging: it may be difficult to distinguish between patients with a depressive disorder and those demonstrating a normal reaction to living with a chronic, debilitating condition. Further, a number of somatic symptoms of depression (e.g. fatigue, poor sleep and loss of appetite) might be expected to occur in RA as part of the disease process. To overcome this, researchers have adapted diagnostic thresholds to define caseness or removed items that may be confounded by RA symptoms, for example, items assessing fatigue or sleep quality. Such variations in definitions of depression may influence prevalence estimates.
Second, there are a multitude of methods available to detect depression. The gold standard method is psychiatric interview and diagnosis according to Diagnostic and Statistical Manual (DSM) or International Classification of Diseases (ICD) criteria. However, such interviews are time consuming and expensive and therefore often not ideal for examining patients in a busy hospital environment. Alternatively, self-report screening questionnaires, such as the Patient Health Questionnaire (PHQ) and the Hospital Anxiety and Depression Scale (HADS), may be used. These self-report tools are quick and easy to complete, meaning they are often preferred by researchers attempting to collect a large amount of data from a large sample; they are also cheaper to use than diagnostic interviews. Prevalence estimates according to screening tools are often based on predefined thresholds, which may result in overestimations of prevalence, as screening questionnaires tend to prioritize sensitivity over specificity.
Study quality may be a further explanation for the variance in prevalence estimates. Small studies lead to variable and imprecise prevalence estimates. Sampling strategies may influence prevalence estimates, with studies using convenience sampling or low participation rates giving unrepresentative samples that may be healthier than the target population. Furthermore, the population studied can impact prevalence estimates; some studies may include patients with specific disease durations, or those using a particular type of medication, which may impact prevalence levels [19, 20].
There has only been one previous systematic review of depression in RA, which examined the strength of the association between depression and RA. As yet no systematic review has provided pooled prevalence estimates of depression in RA. The present study aims to fill this gap. We aimed (i) to present a pooled prevalence level of depression in RA patients; (ii) to provide a summary of the methods used to define depression in RA and (iii) to explore the impact of study characteristics on prevalence estimates.
The systematic review protocol and data extraction forms were designed in accordance with the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) by F.M. and L.R. F.M. conducted a systematic search of Web of Science, CINAHL, PsycINFO, Medline, Embase and PubMed, from inception to October 2012. Sample search terms can be found in supplementary Appendix S1, available at Rheumatology Online.
Studies met the following inclusion criteria: (i) Cross-sectional design, baseline cross-sectional data from a longitudinal study or baseline cross-sectional data from a trial, before group allocation. (ii) Reported a prevalence level for depression using diagnostic criteria, a research diagnostic tool or a validated screening tool (Table 1). (iii) Reported prevalence level as the number of participants meeting predefined criteria for depression, or a percentage from which the number of participants meeting criteria for depression could be calculated. (iv) The sample size was ≥50.
Studies were excluded if they: (i) used a selective sample (e.g. intervention trials after group allocation); (ii) used a paediatric sample; (iii) retrospectively reviewed medical records to establish depressive symptomatology.
For the meta-analysis, studies using a screening tool without stating the cut-off threshold used to detect depression were excluded. Table 2 provides a full list of the eligible methods of detecting depression, alongside the numbers of articles utilizing each method and the number of participants assessed.
F.M. conducted the primary data extraction. All articles were examined independently by a second reviewer (L.R.). Inter-rater disagreement was minimal, and any disagreements were resolved through discussion and reexamination of the article in consultation with M.H. When multiple publications spanned the years of longitudinal studies, baseline prevalence levels were reported. A 10-point quality assessment tool (supplementary Appendix S2, available at Rheumatology Online) was devised to assess sampling method, sample size, participation rate, criteria used to determine depression and the eligibility criteria for participation in the studies. Articles were scored as follows: 0–3 = low quality; 4–6 = low to medium quality; 7–8 = medium to high quality; 9–10 = high quality.
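The scoring bands above can be expressed as a small lookup. The following is a minimal sketch only: the function name `quality_band` is hypothetical, and the individual items of the 10-point tool (in supplementary Appendix S2) are not reproduced here.

```python
def quality_band(score: int) -> str:
    """Map a 0-10 quality score to the band labels given in the text."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 3:
        return "low quality"
    if score <= 6:
        return "low to medium quality"
    if score <= 8:
        return "medium to high quality"
    return "high quality"


# Hypothetical scores, used only to show the mapping.
for s in (0, 3, 5, 9):
    print(s, quality_band(s))
```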
Outcomes were major depression, minor depression, depressive/mood/affective disorder, dysthymic disorder or adjustment disorder, defined by diagnostic interview or according to a defined threshold on a screening tool.
Data were pooled according to diagnosis of depression or screening tool and threshold used to detect caseness. Heterogeneity was found to be moderately high between studies, and therefore random-effects meta-analyses with 95% CIs were conducted with STATA (version 10.0). Heterogeneity was assessed using I², with thresholds of ≥25%, ≥50% and ≥75% indicating low, moderate and high heterogeneity, respectively.
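For readers unfamiliar with random-effects pooling, the calculation can be sketched as follows. The text states only that random-effects meta-analyses were run in STATA. The DerSimonian-Laird estimator, the untransformed proportion scale and the function name `pool_random_effects` are assumptions made for illustration, and the input counts are hypothetical.

```python
import math


def pool_random_effects(events, totals):
    """Pool study proportions with a DerSimonian-Laird random-effects model.

    Assumption: proportions are pooled untransformed, with within-study
    variance p * (1 - p) / n. The source does not state the scale used.
    Returns (pooled proportion, 95% CI as a tuple, I-squared in percent).
    """
    if len(events) != len(totals) or len(events) < 2:
        raise ValueError("need at least two studies with matching lengths")
    props = [e / n for e, n in zip(events, totals)]
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]
    if any(v == 0 for v in variances):
        raise ValueError("a study with 0% or 100% prevalence has zero variance")

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * p for wi, p in zip(w, props)) / sum(w)
    q = sum(wi * (p - fixed) ** 2 for wi, p in zip(w, props))
    df = len(props) - 1

    # Between-study variance (tau squared), truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # I-squared: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau squared to each study's variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * p for wi, p in zip(w_re, props)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Hypothetical counts of depressed patients and sample sizes.
pooled, ci, i2 = pool_random_effects([12, 30, 9], [80, 120, 60])
print(f"pooled={pooled:.3f} CI=({ci[0]:.3f}, {ci[1]:.3f}) I2={i2:.1f}%")
```

Under these assumptions the interval is a symmetric normal approximation, so with few, highly heterogeneous studies its lower limit can fall below zero.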
Sensitivity analyses explored whether prevalence estimates were influenced by study design. Planned sensitivity analyses included the following: exclusion of studies with a participation rate ≤75%, or non-reported participation rate; exclusion of studies not stating a sampling strategy, or using a convenience/non-randomized sampling strategy; exclusion of studies that did not state eligibility criteria for inclusion in the study and exclusion of studies using subsets of patients (for example, a female-only sample or patients with limited disease duration). Subgroup analyses were planned by overall study quality, sample size, country of origin and publication year, if there was more than one study in the subgroup. Spearman’s correlation analyses with adjusted r² assessed the impact of study variables on prevalence estimates. Funnel plots were produced to explore the possibility of publication bias due to preferential publication of small studies reporting high prevalence estimates; Begg-Mazumdar and Egger’s tests of publication bias were also performed.
The search yielded 28 328 relevant articles (Fig. 1). After removal of duplicates, titles and then abstracts were screened for potential eligibility. All non-RA articles were removed, resulting in 806 potentially eligible studies. These were screened according to the inclusion and exclusion criteria for entry into the study, resulting in a total of 101 eligible studies. After taking into account multiple publications from the same sample, 72 articles were included in the review.
Table 1 presents the 72 papers included in the review (see supplementary Appendix S3, available at Rheumatology Online). Seven studies used diagnostic criteria (DSM or ICD), and the remaining 65 used (one or more) screening tools to detect depression (PHQ-9, IDD, HADS, CESD, BDI, SDS or GDS), the most popular being the HADS and the CESD. The studies represented a total of 13 189 patients with RA; the median of mean ages was 53.7 years [interquartile range (IQR) 51.0–56.5], and the median percentage of females represented in the sample was 77.0% (IQR 70.4–82.9%). Sample sizes ranged from 50 to 988 participants (median = 96.0; IQR 75.0–159.0).
Table 1 presents the quality assessments for the 72 papers, according to the quality assessment tool (supplementary Appendix S2, available at Rheumatology Online). The overall quality of the articles was poor with a median quality score of 3/10 (IQR 1–5). Eleven papers (15%) scored 0/10, and 82% of papers scored 5/10 or lower. No papers achieved the maximum score of 10; however, one received 9 out of 10. Specifically, 16.6% of studies had a sample size larger than 300, only 41.7% stated a participation rate and of these, only 40% had a participation rate ≥75%. Only 55.6% reported participant eligibility criteria for entry into the study.
Depression was defined in 40 different ways (Table 2). The studies using diagnostic interviews reported three different subtypes of depression: major depressive disorder (MDD), minor depression (MD) and dysthymic disorder (DD), as well as combinations of disorders (depression with adjustment disorders or anxiety) and unspecified depression. Studies using screening questionnaires defined possible or probable caseness using multiple thresholds or detected any depression using one threshold. According to diagnostic criteria, MDD and DD were the most commonly diagnosed depressive subtypes. A full explanation of the differences between depressive diagnoses can be found in supplementary Appendix S4, available at Rheumatology Online.
The most commonly used screening questionnaire was the HADS, with 30 studies using this screening tool. However, six different thresholds were presented in the articles, with the conventional cut-offs of 8 (probable depression) and 11 (definite depression) being the most commonly used. Twenty-five articles used the CESD; nine different cut-off points were presented, the most commonly used being 16. Eight papers used the BDI, with five different thresholds for depression.
Prevalence of depression alone (excluding combination disorders) ranged between 0.04% and 66.3% in individual studies (Table 1). Table 2 presents the summary of meta-analyses and heterogeneity assessments. Meta-analytical pooled prevalence of MDD (Fig. 2) according to the DSM diagnostic criteria was 16.8% (95% CI 10.0%, 24.0%), with moderate heterogeneity (I² = 73.4%). Dysthymic disorder (according to DSM criteria) showed a pooled prevalence level of 18.7% (95% CI −2.0%, 39.0%), with high heterogeneity (I² = 97.2%).
Prevalence of depression according to the PHQ-9, with a threshold of 10 indicating moderate-severe depressive symptoms, was 38.8% (95% CI 34.0%, 43.0%), with low heterogeneity (I² = 19.8%).
Analyses of screening questionnaires according to the threshold used to detect depression were conducted. As expected, higher thresholds yielded lower prevalence estimates. For example, the HADS shows an estimated prevalence of 34.2% when used with a threshold of 8, and a prevalence of 14.8% when used with a threshold of 11 (Fig. 2).
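The dependence of prevalence on the chosen cut-off can be illustrated with a short sketch. The scores below are hypothetical and are not taken from any included study. The sketch assumes that a case is defined as a score at or above the threshold, and the function name `prevalence_at_threshold` is invented for this example.

```python
def prevalence_at_threshold(scores, threshold):
    """Return the share of respondents scoring at or above the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    cases = sum(1 for s in scores if s >= threshold)
    return cases / len(scores)


# Hypothetical questionnaire subscale scores for ten respondents.
hypothetical_scores = [2, 5, 7, 8, 9, 10, 11, 12, 3, 6]

for cutoff in (8, 11):
    share = prevalence_at_threshold(hypothetical_scores, cutoff)
    print(f"threshold {cutoff}: {share:.0%} classified as cases")
```

Because every respondent who meets the higher cut-off also meets the lower one, the estimate at the higher threshold can never exceed the estimate at the lower threshold within the same sample.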
Assessment of publication bias (see supplementary Appendix S5, available at Rheumatology Online) indicated significant publication bias, according to the Egger’s test, in studies reporting MDD according to DSM criteria [Begg-Mazumdar: Kendall’s τ = 1.36, P = 0.17, Egger: bias = 4.59 (95% CI 1.36%, 7.82%), P = 0.03]. There was no significant evidence of publication bias in any other analyses.
Table 3 shows prevalence estimates according to each sensitivity and subgroup analysis, in comparison with the primary analysis. The results of the sensitivity analyses indicated no particular trend or pattern according to the exclusion of studies with only abstracts available, the exclusion of studies with unreported participation rates or participation rates ≤75%, the removal of studies using convenience, non-randomized, or with unreported sampling strategies, or the exclusion of studies using subsets of patients. Exclusion of studies with no reported eligibility criteria tended to increase prevalence estimates, with the exception of the CESD (threshold 16). The subgroup analyses were conducted according to sample size, overall quality and publication year. The subgroup analyses for sample size and overall quality showed no clear patterns. However, more recent publications tended to yield higher prevalence estimates.
Spearman’s correlation analyses with adjusted r² were used to assess the associations between linear variables including participation rate, sample size, overall study quality, publication year, proportion of female participants, mean age of participants and mean duration of illness. Table 4 shows the results of these analyses.
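As an illustration of the rank-based correlation used here, the following sketch computes Spearman's coefficient as the Pearson correlation of average ranks. It is not the authors' code: the function names and the study-level values are hypothetical, and the adjusted r² and significance test reported in the paper are not reproduced.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (sx * sy)


# Hypothetical study-level data: mean sample age and depression prevalence (%).
mean_age = [48.0, 51.5, 53.7, 56.0, 60.2]
prevalence = [41.0, 22.0, 35.0, 30.0, 18.0]
print(round(spearman_rho(mean_age, prevalence), 2))
```

A negative coefficient in such an analysis would indicate that studies with older samples tend to report lower prevalence.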
A significant relationship was found between mean age and prevalence estimate; lower age was associated with increased depression prevalence (r = −0.3, P = 0.02). No other study characteristics showed a significant association with prevalence estimate.
Depression is highly prevalent in RA patients. Estimates varied according to the way in which depression was measured, but our pooled estimates from the small number of studies using gold standard clinical interviews suggest that major depression is present in 16.8% of RA patients. The larger number of studies using screening tools found significant depressive symptoms present in 38.8% using the PHQ-9 and between 14.8% and 48% using the HADS. These prevalence estimates are considerably higher than those observed in the general population and are similar to, or higher than, those observed in patients with diabetes, Parkinson’s disease and cancer. Although studies varied widely in terms of quality (and many were of poor quality), our sensitivity analyses indicate that prevalence estimates were reasonably stable. Apart from the measurement tool used to ascertain depression, study quality and study population had little impact on the estimates detected.
The RA patient population represents a largely female, older adult population. It could be suggested that the inflated levels of depression found in this sample represent the increased risk of depression found in females and the elderly [28, 29], regardless of the presence of RA. However, as we found a significant negative association between age and depression prevalence estimate, it is more likely that our findings represent an increased risk of depression in RA patients in comparison with the general population.
A bewildering diversity of assessment measures was used to ascertain depression. This is similar to the situation in other physical diseases. In this review, we did not include many studies that did not use validated measures of depression or questionnaires that assess a broader overlapping concept of psychological distress. Nevertheless, we found that many studies used idiosyncratic cut-off scores on screening tools, meaning that the range of estimates for one such measure (the HADS) varied from 14.8% to 48%. Because there have not been validation studies to determine the best cut-point for such screening tools in this population, one clear recommendation is that investigators justify the use of idiosyncratic thresholds, and always report prevalence at conventional cut-points as well, to allow cross-study comparisons.
We used rigorous methods to conduct the review, with a sensitive search, and a reproducible, structured approach to data extraction and synthesis. We took a broadly inclusive approach to inclusion of studies, preferring to include less rigorous studies and explore the impact of study design in sensitivity analyses than to exclude such studies from the outset. It is possible that publication bias affected our results. We explored this using funnel plots and Egger’s test where the assumption made was that small studies reporting low prevalence of depression would be less likely to be published than small studies reporting high prevalence. We only found evidence of potential publication bias in the studies that used diagnostic interviews. This is surprising since the efforts taken to conduct such studies are considerable and we would have anticipated these to be least likely to be affected by publication bias.
There are, however, additional important shortcomings in the evidence on prevalence of depression in RA that need to be addressed. The limited number of studies using structured clinical interview and determining depression according to DSM and ICD criteria is a concern. The high rates of depressive symptomatology detected through the screening tools could be due to the overlap between the somatic symptoms of depression and symptoms of RA. Symptoms frequently associated with depression (such as fatigue and reduced sleep quality) may be experienced by RA patients regardless of whether depressive symptoms are present or not. For example, 7 out of the 21 BDI items assess somatic symptoms, leading to concerns about the validity of this questionnaire in medical patients. Similarly, a modified version of the CESD has been suggested for use with patients with RA, due to the symptom overlap; however, only two articles in the current review used the modified versions available [33, 34].
A further consideration is the representativeness of the sample from which prevalence levels are estimated. Low socio-economic status (SES) patients are often under-represented in research samples. This can be problematic, as low SES is associated with increased susceptibility to depression and RA. Many of the studies included in this review did not measure SES appropriately, with most studies using a single measure of education level or monthly income to indicate SES. This level of heterogeneity makes it difficult to establish the representativeness of the samples included with regard to SES. However, it is possible that a selection bias favouring high SES patients exists and the results of this systematic review may therefore underestimate the prevalence of depression.
The meaning of depression in the context of RA is not straightforward. Emotional responses to a physical illness characterized by pain and debility are understandable, and somatic symptoms of depression (e.g. loss of appetite and poor sleep) might be expected as part of RA. Therefore there is a need to ensure that measures of depression used in clinical practice are validated, both against a recognized criterion (e.g. the ‘gold standard’ clinical interviews) and also in terms of predictive validity (i.e. to determine the impact of depression on RA outcomes). Psychometric approaches utilizing longitudinal data may further be able to distinguish subtypes of depressive symptoms and thereby distinguish symptoms that are most likely to be core to the depressive syndrome.
Ultimately the key question is whether improved patient outcomes can be attained by recognizing and managing depression more effectively. There is growing evidence that incorporating a system of collaborative and stepped care of depression in patients with physical illness, which might include routine screening for depression with referral for highly structured manualized therapies depending on the outcome of screening, is an effective treatment. The high prevalence of depression in RA suggests that this would be a suitable patient group in which to test such strategies.
We thank Dr William Lee for his statistical advice and Dr Katherine Beck for her assistance with data extraction. F.M., L.R. and M.H. receive salary support from the National Institute for Health Research (NIHR) Mental Health Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.
Funding: This work was performed on behalf of the IMPARTS Project: Integrating Mental and Physical Healthcare: Research, Training and Services. IMPARTS is funded by King’s Health Partners Academic Health Science Centre, with the overall aim to improve integration of mental and physical healthcare in general hospital settings.
Disclosure statement: The authors have declared no conflicts of interest.
Supplementary data are available at Rheumatology Online.