Comorbid depression is common among substance abusers, making routine assessment of depression critical for high quality care. We evaluated two of the most commonly used depressive symptomatology measures in a sample of clients (N=240) in residential substance abuse treatment settings. The Beck Depression Inventory (BDI-II) has previously been used in clients receiving substance abuse treatment. The Patient Health Questionnaire (PHQ-9), originally developed for primary care settings, has not been used as frequently in substance abuse treatment settings, and it is unknown how it performs in this population. The measures were highly correlated with each other (r = 0.76) and demonstrated good internal consistency reliability (BDI-II = 0.91; PHQ-9 = 0.87); however, the PHQ-9 classified more individuals as having ‘mild’ depression symptoms than the BDI-II, which tended to place these individuals in its ‘minimal’ category. Implications for assessing depression symptoms in individuals receiving substance abuse treatment are discussed.
Depressive symptoms are highly prevalent among individuals with alcohol and other drug disorders and are often the result of a co-occurring depressive disorder. Rates of current major depression are two to four times higher among those with an alcohol and other drug disorder than in the general population (Compton, Thomas, Stinson, & Grant, 2007; Hasin, Stinson, Ogburn, & Grant, 2007), affecting 30–45% of people seeking substance abuse treatment (Grant et al., 2004). Although depressive symptoms may be the result of organic brain syndromes caused by substance use, in most cases, the depression is an independent, co-occurring disorder (Gilman & Abraham, 2001; Grant et al., 2004). As a result, many individuals in substance abuse treatment are in need of specific care for their depressive symptoms (American Psychiatric Association, 1995).
Depression is typically not adequately diagnosed and treated in substance abuse treatment settings and this may lead to poorer outcomes. Few substance abuse treatment providers have qualified mental health professionals on staff for assessment or treatment provision (Grella & Hser, 1997). Further, national data suggest that fewer than 7% of people with co-occurring mental health disorders in substance abuse treatment have received a mental health evaluation or appropriate treatment (Watkins, Burnam, Kung, & Paddock, 2001). Untreated affective disorders, including both major and minor depression, are a leading cause of disability and reduced quality of life (Murray & Lopez, 1996; Rapaport et al., 2002). Compared to individuals with depression, alcoholism, or drug dependence alone, those with co-morbid disorders experience even greater impairment (Kirchner et al., 2002; Schmitz et al., 2001) and have shorter lengths of stay in substance abuse treatment and an increased risk of relapse (Brown et al., 1998; Compton III, Cottler, Jacobs, Ben-Abdallah, & Spitznagel, 2003).
One way to increase detection of depression in substance abuse treatment settings is to provide a brief self-report instrument that can be reliably administered without specialized clinical training. Two of the most commonly used instruments to assess depression symptomatology are the Beck Depression Inventory-II (BDI-II; Beck, Steer, & Brown, 1996) and the Patient Health Questionnaire (PHQ-9; Kroenke, Spitzer, & Williams, 2001). There is evidence that the BDI-II performs well in treatment seeking substance abusers (Buckley, Parker, & Heggie, 2001). Its predecessor (BDI) has been successfully used to screen for depression in IV drug users (Steer, Iguchi, & Platt, 1992), methadone patients (Dorus & Senay, 1980; Steer, Emery, & Beck, 1980), opiate addicts (Strain, Stitzer, & Bigelow, 1991), and cocaine abusers (Mallow, West, Penz, & Lott, 1990; Weiss, Griffin, & Mirin, 1989). Although the BDI-II has been frequently employed in research settings, there are some factors that may limit the feasibility of this measure for use in substance abuse treatment settings. First, the BDI-II has a per-administration cost, which may be a salient barrier for substance abuse treatment settings, many of which have financial constraints. Second, the measure is somewhat longer and more complex than the PHQ-9. A respondent must complete 21 items and consider up to 6 statements at a time and select the one that best fits how they have been feeling in the past 2 weeks. Given these reasons, substance abuse treatment settings may need to consider other options to assess depressive symptoms.
The PHQ-9 is the depression module of the Primary Care Evaluation of Mental Disorders (PRIME-MD; Spitzer & Williams, 1994). It was developed and validated for use in primary care settings (Spitzer, Kroenke, & Williams, 1999; Spitzer, Williams, Kroenke, Hornyak, & McMurray, 2000), but has not been well tested for use in substance abuse treatment settings. It addresses many of the limitations of the BDI-II: it is free to use, easy to administer and score, and consists of nine items that map onto the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV) criteria. Recently, Dum and colleagues (Dum, Pickren, Sobell, & Sobell, 2008) examined the performance of the BDI-II and PHQ-9 among individuals who voluntarily sought outpatient substance abuse treatment. Among this population, both measures were found to be reliable and highly correlated. Similar to some previous studies of these measures, a 3-factor solution was found for the BDI-II (cognitive, affective and somatic) whereas a 1-factor solution was found for the PHQ-9, suggesting the measures’ appropriateness for this population (Beck, Steer, & Garbin, 1988). The authors noted that further research should examine the performance of these measures among more severe substance using populations.
We sought to compare the performance of the BDI-II and PHQ-9 among a more severe substance using population and to evaluate the suitability of using these measures in this setting. This paper presents data from individuals referred to one of four residential substance abuse treatment settings in Los Angeles County. We assessed the internal consistency reliability of the BDI-II and PHQ-9 to understand how the different survey items associated with each scale measured depressive symptoms in this population and examined the correlation between the two measures. We also evaluated the factorial validity of the instruments to determine whether factor structure observed in prior studies holds for individuals seeking residential substance abuse treatment. Because of the overlap in symptoms related to depression and alcohol disorders (e.g., sleep and appetite disturbances), we investigated the correlation between the measures and problem alcohol use. We also looked at the relationship between depressive symptoms scores and length of stay since co-occurring substance and depressive disorders have been associated with shorter lengths of stay and relapse in prior studies.
These data were collected to facilitate planning for a clinical trial that would entail delivery of a group cognitive behavioral therapy to residential substance abuse treatment clients that reported persistent depressive symptoms. Since data collection was primarily designed to assess feasibility of delivering these brief assessment tools as part of standard substance abuse treatment, data collection beyond the assessments was limited to a small set of descriptive variables (described below).
Clients entering one of four adult residential substance abuse treatment facilities in Los Angeles County were eligible to participate in the study. Residential treatment staff used both the BDI-II and PHQ-9 with clients so that the research staff could compare their performance and determine an appropriate cut-off point using the PHQ-9 for inclusion in the research study. The order of the measures was counterbalanced across the sites; two sites administered the BDI-II followed by the PHQ-9 and the two other sites administered the PHQ-9 first followed by the BDI-II, so that about half the sample received the BDI-II first and the other half completed the PHQ-9 first. Treatment center staff administered the self-report measures of alcohol use at admission and measures of depression between 14 and 30 days after admission. By administering the depression symptomatology measures at least 2 weeks after treatment entry, we were able to assess for persistent depressive symptoms, excluding individuals whose impaired mood improved quickly after treatment entry. Also, by assessing at 2 weeks post-residential treatment admission, we could be reasonably confident that our sample was abstinent at the time of the assessment. Among cocaine addicts (Husband et al., 1996), opiate addicts (Strain et al., 1991), and alcoholics (Brown & Schuckit, 1988), depressive symptoms decrease within the first 7–14 days after treatment admission and then remain stable over the next 4–8 weeks.
All participants received a psychosocial assessment prior to treatment entry that determined that the appropriate level of substance abuse treatment care was a residential setting, rather than less intensive levels of care such as outpatient treatment. The settings’ administrative records offer additional information about the participants they serve. The main referral sources were: 36% self-referred, 30% criminal justice (i.e., court, probation, jail or prison), and 30% another substance abuse treatment program or health care provider. Approximately 45% reported some criminal justice involvement (under probation or parole, in a diversion program or awaiting trial), indicating a significant proportion of mandated program participants. About 65% of clients in these treatment settings reported being poly-substance users. The substances most frequently reported were: alcohol (by 69% of the sample, with over 55% reporting drinking until intoxication), amphetamines (41%), marijuana (39%), cocaine (35%), heroin (16%) and sedatives (12%).
Clients were asked to complete two depression symptom measures, the BDI-II and the PHQ-9. The BDI-II (Beck et al., 1996) is a 21-item self-report measure. Each item is scored on a 4-point scale (0–3), and total scores range from 0 to 63. Summed scores can be used to describe the individual’s symptoms in one of four interpretive categories: minimal (0–13), mild (14–19), moderate (20–28), and severe (29–63).
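The BDI-II banding described above can be sketched as a small lookup. This is only an illustration of the published cut-points quoted in the text; the function name `bdi_ii_category` is hypothetical and is not part of the instrument or of the study's analysis code.

```python
def bdi_ii_category(total_score: int) -> str:
    """Map a BDI-II summed score (0-63) to its interpretive category.

    Cut-points follow the ranges quoted in the text:
    minimal 0-13, mild 14-19, moderate 20-28, severe 29-63.
    """
    if not 0 <= total_score <= 63:
        raise ValueError("BDI-II total must be between 0 and 63")
    if total_score <= 13:
        return "minimal"
    if total_score <= 19:
        return "mild"
    if total_score <= 28:
        return "moderate"
    return "severe"
```

For example, a total of 13 falls in the minimal band, while a total of 14 is the lowest score classified as mild.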
The PHQ-9 is a 9-item self-report measure that assesses the nine depression symptoms from the DSM-IV depression criteria (Spitzer et al., 1999; Spitzer & Williams, 1994). Each item is scored on a 4-point scale (0–3) and scores range from 0 to 27. As with the BDI-II, summed scores can be used to describe the patient’s symptoms in interpretive categories, of which the PHQ-9 has five: none (0–4), mild (5–9), moderate (10–14), moderately severe (15–19), and severe (20–27). The PHQ-9 has been shown to have good reliability and validity in primary care populations.
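As a minimal sketch of the PHQ-9 scoring just described, the nine item responses are summed and the total is mapped onto the five bands. The names `phq9_total`, `phq9_category`, and `PHQ9_BANDS` are hypothetical helpers introduced for illustration only.

```python
# Upper bound of each band, paired with its label, per the ranges in the text.
PHQ9_BANDS = [
    (4, "none"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]


def phq9_total(item_responses):
    """Sum nine PHQ-9 item responses, each scored 0-3."""
    if len(item_responses) != 9:
        raise ValueError("PHQ-9 requires exactly nine item responses")
    if any(r not in (0, 1, 2, 3) for r in item_responses):
        raise ValueError("each PHQ-9 item must be scored 0, 1, 2, or 3")
    return sum(item_responses)


def phq9_category(total_score):
    """Map a PHQ-9 summed score (0-27) to its interpretive category."""
    if not 0 <= total_score <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    for upper_bound, label in PHQ9_BANDS:
        if total_score <= upper_bound:
            return label
```

A respondent answering 1 on every item would total 9 and fall in the mild band; this version assumes complete responses and does not handle missing items.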
Clients were also asked to complete the Alcohol Use Disorders Identification Test – Consumption (AUDIT-C; Bradley et al., 1998; Bush, Kivlahan, McDonell, Fihn, & Bradley, 1998) upon admission into the residential treatment program. The AUDIT-C includes three items from the full AUDIT dealing with alcohol consumption. The AUDIT-C has performed better than the full AUDIT for detecting heavy drinking and similarly to the full AUDIT for detecting heavy drinking and/or alcohol abuse or dependence (Bush et al., 1998). The AUDIT-C has also demonstrated excellent reliability and responsiveness to change (Bradley et al., 1998). Scores range from 0 to 12. A score of 5 for men and 4 for women indicated a probable alcohol use disorder (Dawson, Grant, Stinson, & Zhou, 2005).
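The sex-specific AUDIT-C screening rule can be sketched as follows. The text gives cut-points of 5 for men and 4 for women; this sketch assumes the rule is "at or above the cut-point", which is the usual reading but is not stated explicitly above. The function name and the `"male"`/`"female"` labels are hypothetical.

```python
def audit_c_probable_aud(score: int, sex: str) -> bool:
    """Flag a probable alcohol use disorder from an AUDIT-C score (0-12).

    Assumption: a score at or above the cut-point (5 for men, 4 for women)
    counts as a positive screen.
    """
    cut_points = {"male": 5, "female": 4}
    if sex not in cut_points:
        raise ValueError("sex must be 'male' or 'female' for this sketch")
    if not 0 <= score <= 12:
        raise ValueError("AUDIT-C score must be between 0 and 12")
    return score >= cut_points[sex]
```

Under this assumption, a score of 4 screens positive for a woman but negative for a man.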
Gender, race/ethnicity and length of stay for each client were also collected. Length of stay was computed as the number of days from admission into the residential treatment program to discharge from the facility.
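The length-of-stay computation amounts to a date difference. A minimal sketch, assuming admission and discharge are recorded as calendar dates (the function name and the example dates are hypothetical, not study data):

```python
from datetime import date


def length_of_stay_days(admission: date, discharge: date) -> int:
    """Number of days from admission to discharge."""
    if discharge < admission:
        raise ValueError("discharge date precedes admission date")
    return (discharge - admission).days
```

For instance, an admission on 1 December 2005 and a discharge on 19 December 2005 gives a stay of 18 days.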
The BDI-II and PHQ-9 were evaluated in terms of their internal consistency reliability, correlation with each other, their classification of depression symptoms into interpretive categories (e.g., mild, moderate, severe), and their correlation with alcohol use and length of stay. We wanted to explore whether these measures of depressive symptomatology correlated with a screener for probable alcohol disorder, as the symptoms associated with both disorders may overlap. We also examined length of stay and its relationship to depressive symptomatology because co-occurring depressive and substance disorders have been linked to reduced treatment stays in these settings.
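Internal consistency reliability is most often summarized with Cronbach's alpha. The text does not name the coefficient used, so treating it as alpha is an assumption here. The sketch below computes alpha from complete item-level responses using only the standard library; `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete item responses.

    rows: one list per respondent, each holding k item scores with no
    missing values. Uses sample variances throughout.
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(r) != k for r in rows):
        raise ValueError("every respondent must have the same number of items")

    # Variance of each item across respondents.
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

Alpha approaches 1 when items rise and fall together across respondents and drops toward 0 when item scores are unrelated.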
Categorical confirmatory factor analyses, estimated using robust weighted least squares, were used to evaluate the factor structure of the instruments. Previous evaluations of the BDI-II have supported a multi-factor model, although the factor structure has varied (see Johnson, Neal, Brems, & Fisher, 2006, for a summary of previous results) in non-substance using populations. Three studies of the factor structure in substance users supported a 3-factor structure (cognitive, affective, somatic; Buckley et al., 2001; Dum et al., 2008; Johnson et al., 2006). Therefore, we hypothesized that the 3-factor structure supported previously would provide the best fit for our data (cognitive: items 1–3, 5–9, 14; affective: items 4, 10, 12, 13; somatic: items 11, 15–21). We tested two alternative models including a single factor model and a 2-factor model shown to be the best fitting model in psychiatric outpatients (Beck et al., 1996; cognitive = 1–14, 17, 21; somatic = 15, 16, 18–20). The PHQ-9 has consistently demonstrated a single-factor structure; therefore, we were primarily interested in the fit of that model.
We report the chi-squared statistic, but trivial departures from fit can lead to statistical rejection of a model because chi-square is sensitive to sample size. Hence, we relied upon practical fit indices to evaluate model fit including the Comparative Fit Index (CFI) and Root Mean Square Error of Approximation (RMSEA). CFI ranges from 0.00 to 1.00, with a value greater than 0.90 being generally taken to indicate an acceptable fit to the data (Byrne, 1994), although values approaching 0.80 are considered acceptable (Hu & Bentler, 1999). Whereas CFI is considered a goodness-of-fit index, RMSEA is a “badness of fit” index (Loehlin, 1998) because a value of zero indicates perfect fit. RMSEA values less than 0.10 are considered acceptable, and the lower the better (Loehlin, 1998). CFA analyses were conducted using Mplus (Muthen & Muthen, 1998–2004).
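To make the two practical fit indices concrete, the sketch below uses their standard textbook definitions based on the chi-square statistic and its degrees of freedom. These are the common maximum-likelihood forms; robust weighted least squares estimators adjust the chi-square, and some programs divide by N rather than N - 1, so values reported by Mplus need not match a hand calculation. The function names and all example numbers are hypothetical.

```python
from math import sqrt


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root Mean Square Error of Approximation (0 indicates perfect fit)."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int,
        chi2_baseline: float, df_baseline: int) -> float:
    """Comparative Fit Index: improvement over the baseline (null) model."""
    # Noncentrality estimates, floored at zero.
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model)
    if d_baseline == 0.0:
        return 1.0  # neither model shows any misfit
    return 1.0 - d_model / d_baseline
```

With a hypothetical model chi-square of 100 on 50 degrees of freedom and n = 201, `rmsea` gives about 0.071, below the 0.10 threshold cited above. Because both indices depend on chi-square minus its degrees of freedom rather than on the raw chi-square, they are less affected by sample size than the significance test.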
Very little missing data were observed on the depression measures. On the PHQ-9, 5% were missing at least one item. On the BDI-II, 9% were missing at least one item. Internal consistency reliability estimates were conducted without imputing for missing data. For analyses that use the summed depression score (e.g., descriptive and correlational analyses), we imputed the sum score using mean score imputation (i.e., imputed based on mean from available items). For the factor analyses, we used the missing option in Mplus, which uses maximum likelihood estimation.
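Mean score imputation of a summed score can be sketched as prorating: the mean of the answered items stands in for each missing item, which is equivalent to multiplying that mean by the number of items. The helper name `prorated_sum`, the use of `None` for a missing response, and the `min_fraction_answered` parameter are assumptions for illustration. The default of 0.5 mirrors the study's exclusion of clients who answered fewer than half of the items. Whether imputed totals were rounded is not stated, so this sketch leaves them unrounded.

```python
def prorated_sum(item_responses, min_fraction_answered=0.5):
    """Sum score with missing items replaced by the respondent's item mean.

    item_responses: list of item scores, with None marking a missing item.
    Returns None when fewer than min_fraction_answered of items are present.
    """
    n_items = len(item_responses)
    answered = [r for r in item_responses if r is not None]
    if n_items == 0 or len(answered) < min_fraction_answered * n_items:
        return None  # too little data to impute a total

    mean_item_score = sum(answered) / len(answered)
    # Equivalent to filling each missing item with the mean, then summing.
    return mean_item_score * n_items
```

For a PHQ-9 record with one missing item and answered items summing to 12, the answered-item mean is 1.5 and the imputed total is 13.5.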
During the data collection period (December 2005 – July 2006), 78% of individuals entering the treatment facilities (n = 243) completed the depression symptomatology measures. Of those who did not participate, 16% dropped out of treatment prior to 14 days, and treatment staff forgot to assess the remaining 6% before they were discharged. Three clients were eliminated from the study sample because they did not respond to at least half of the items from each depression measure, resulting in a final analytic sample of 240. Over half of the clients were male and represented a range of ethnic backgrounds (see Table 1). On average, the clients endorsed mild depression symptoms on both measures (BDI mean = 14.9; PHQ-9 = 7.4). Based on the alcohol use screening, 42% were assessed as having a probable alcohol use disorder (Mean AUDIT-C Score = 4.6).
Clients completed the depression symptomatology measures an average of 18.6 days (range 0–128 days) after admission to the treatment program. Sixty-five percent of the sample (n=157) completed the measures within the suggested window of 14 to 30 days after admission. In addition to conducting analyses of our full sample, we conducted the same analyses with this subsample of clients who completed the depression measures within the recommended time window. These results were similar to the results from the full sample and are discussed in more detail in the Discussion.
As expected, the BDI-II and PHQ-9 were highly correlated with each other (r = 0.76, p<.0001). Internal consistency reliability estimates were high for both measures (BDI-II = 0.91; PHQ-9 = 0.87) and were consistent with results from past studies. The distribution of scores into each of the interpretive categories is shown in Figure 1. Note that the interpretive categories are largely the same, with the exception that the PHQ-9 has an additional category of moderately severe. The BDI-II and PHQ-9 appeared to classify moderate to severe symptoms in a similar way. As can be seen in Figure 1, on the BDI-II, 30% had ‘moderate’ or ‘severe’ depression symptoms. On the PHQ-9, 30% had ‘moderate’, ‘moderately severe’, or ‘severe’ depression symptoms. The depression measures varied, however, in the number of clients that were classified as ‘mild’. The PHQ-9 suggested that 28% had ‘mild’ depression symptoms; however, the BDI-II indicated that only 16% had ‘mild’ depression symptoms. As can be seen in Figure 2, the categories of ‘none’ and ‘severe’ on the PHQ-9 correspond well with ‘minimal’ and ‘severe’ on the BDI-II, and the PHQ-9 category of ‘moderately severe’ consists predominantly of clients who score in the ‘moderate’ and ‘severe’ categories of the BDI-II. The figure highlights the large proportion of clients who are ‘minimal’ on the BDI-II and ‘mild’ on the PHQ-9.
While the BDI-II had a small, but significant correlation with the AUDIT-C score (r = 0.17, p<.01) and probable alcohol use disorder (r = 0.15, p<.05), the PHQ-9 did not. Neither depression measure was significantly correlated with length of stay.
The fit indices from the confirmatory factor analyses are presented in Table 2. As expected, the 1-factor model did not provide the best fit to the data for the BDI-II, as indicated by fit indices that are worse than the pre-specified cutoffs. The fit indices suggest both the 2-factor and 3-factor model provide a good fit to the data. In comparing the 2-factor and 3-factor model, the fit indices are quite similar, suggesting nearly equal fit for these two models. The 2-factor model, however, provides a parsimonious explanation of the underlying structure of the data.
For the PHQ-9, the single factor model fit relatively well, although the fit indices were not consistent. The CFI suggests adequate fit, while the chi-square value was significant (indicating some misfit) and the RMSEA is higher than the recommended cutoff of 0.10. We attempted to improve the model fit by allowing two correlated error terms to be estimated (item 3 with item 4; item 7 with item 8). Correlated error terms may be necessary to improve model fit when covariance between the items is not fully accounted for by the latent factor. In this case, item 3 (‘trouble falling or staying asleep, or sleeping too much’) and item 4 (‘feeling tired or having little energy’) are both related to sleep disturbance. In addition, item 7 (‘trouble concentrating on things, such as reading the newspaper or watching television’) and item 8 (‘moving or speaking so slowly that other people have noticed. Or the opposite – being so fidgety or restless that you have been moving around more than usual’) are both related to psychomotor agitation. The significance of these correlated variances suggests that variance in sleep disturbance and agitation is not fully accounted for by the latent variable (depression). Allowing these correlated errors improved the model fit to acceptable levels, as indicated by both the CFI and RMSEA.
Both the BDI-II and PHQ-9 appeared to perform well in our sample of residential substance abuse treatment clients. The measures detected similar rates of moderate to severe depression symptoms, but varied in assessing less severe depression symptoms. Specifically, the PHQ-9 suggested a higher proportion of clients were suffering from mild depression symptoms as compared to the BDI-II, which was more likely to categorize clients as having minimal symptoms. Therefore, the PHQ-9 may be useful in detecting less severe levels of depressive symptoms. This may be particularly important in substance abuse treatment settings in which individuals may be reluctant to report psychological symptoms.
The measures were not as highly correlated with the AUDIT-C as expected based on results from a recent study of outpatient substance abuse (Dum et al., 2008). Though the BDI-II was correlated with the AUDIT-C, the correlation was lower (r = 0.17) than the correlation previously reported with the full AUDIT (r=0.33; Dum et al., 2008). The PHQ-9 was not significantly correlated with the AUDIT-C, whereas a significant correlation was reported previously with the full AUDIT (r=0.33; Dum et al., 2008). The correlation between the alcohol and depression measures was likely attenuated by administering the depression measures several days after the alcohol measures, an aspect of the design that was necessary to reduce the prevalence of substance-induced depressive symptoms. It is also possible that the full AUDIT is more highly correlated with measures of depression as the AUDIT-C focuses specifically on consumption and not other aspects of the disorder. These findings suggest that these measures of depressive symptomatology, administered several days after treatment entry, may accurately identify symptoms associated with a mood disorder as compared to an alcohol disorder.
The results from our factor analyses of the PHQ-9 replicated the 1-factor structure observed in other populations (e.g., primary care) and support the use of the measure in this population. Our analyses of the BDI-II suggest that both the 2-factor and 3-factor structure provide a good fit to the data. In this case the 2-factor solution is preferred as it provides a more parsimonious explanation of the underlying structure. Although a 3-factor structure has provided the best fit to the data in previous examinations of substance abusing populations, several studies have supported a 2-factor structure in a variety of populations including psychiatric outpatients (e.g., Beck et al., 1996), psychiatric inpatients (Cole, Grossman, Prilliman, & Hunsaker, 2003) and depressed geriatric inpatients (Steer, Rissmiller, & Beck, 2000). Given the mixed results across populations, it is not surprising that both the 2- and 3-factor structure provide adequate fit. Future work should continue to evaluate the factor structure in this population.
We considered some limitations in interpreting our results. First, while the participants completed the depression measures an average of 18 days after admission, only 65% of the sample were administered the measures within the recommended 14 to 30 days after admission. While screening before 14 days is not recommended because it may be difficult to distinguish depressive symptoms from substance-induced or withdrawal symptoms, using responses administered outside this window should not influence the relationship between the two depression measures. In fact, supplementary analyses on the subsample of participants who completed the measures between 14 and 30 days (n = 157) suggest that the instruments perform similarly to what we observed in the full sample. The internal consistency reliability estimates were very similar to estimates from the full sample (BDI = 0.91; PHQ-9 = 0.87); however, the correlation between the measures was slightly lower (r = 0.71, p<.0001 compared to 0.76 in the full sample). In addition, the results from the factor analyses were similar.
An additional limitation is that we were not able to evaluate the validity of these two depressive symptomatology measures by comparing responses to depression diagnoses. We used these measures because the level of depression symptomatology is important in evaluating a client’s prognosis and treatment plan; however, these measures may differ in their ability to predict depression diagnoses. Future work should look at the relationship between these depressive symptomatology measures and depression diagnoses among this population.
These results suggest that both the BDI-II and the PHQ-9 are appropriate for use in assessing depression symptoms in patients with comorbid alcohol and substance abuse. The PHQ-9 may be more sensitive in identifying patients with lower depression symptoms. While this may lead to the need for additional follow up with these patients to determine whether depression treatment is necessary, we believe that a more sensitive measure may be particularly appropriate in substance abuse treatment settings where clients are often reluctant to report mental health symptoms. The PHQ-9 has the additional advantages of being shorter, easier to administer and having no per-administration cost, which are important advantages for substance abuse treatment programs that often have limited resources to appropriately identify, assess and treat clients with comorbidity.
We would like to thank our collaborators at Behavioral Health Services, Inc., and their tireless staff, who devoted their time and energy to administering, collecting, and documenting the data reported herein so that we could better understand the population they serve. This research was supported by the National Institute on Alcohol Abuse and Alcoholism (R01AA014699). A portion of this research was presented at the 2008 American Psychological Association Annual Meeting.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.