Clinical guidelines recommend depression screening in patients with coronary artery disease (CAD), but how to accomplish this is unclear.
We evaluated the test characteristics of the two-item Patient Health Questionnaire (PHQ-2), the nine-item Patient Health Questionnaire (PHQ-9), and a two-step screening approach (PHQ-2 then PHQ-9 if positive on PHQ-2), compared with the Computerized Diagnostic Interview Schedule (C-DIS) for major depression. We also evaluated a “PHQ diagnosis” of depression, requiring five of nine symptoms “more than half the days,” compared with the C-DIS.
Cross-sectional study of 1,024 outpatients with CAD.
Two hundred twenty-four patients (22%) had current major depression. Optimal cutpoints were ≥2 for the PHQ-2 (82% sensitive, 79% specific) and ≥6 for the PHQ-9 (83% sensitive, 76% specific). The two-step screening approach was less sensitive (75%), but more specific (84%), than the PHQ-2 or PHQ-9 alone. The “PHQ diagnosis” had low sensitivity (28%), but high specificity (96%).
Cutpoints of ≥2 on the PHQ-2 and ≥6 on the PHQ-9 had similar test characteristics. A two-step approach using the PHQ-2 followed by the PHQ-9 was no better than either instrument alone. A “PHQ diagnosis” of depression had high specificity, but poor sensitivity.
Major depressive disorder (MDD) is present in approximately 20% of patients with cardiovascular disease (CVD).1,2 Several clinical guidelines3,4,5 recommend depression screening in patients with CVD, although none specifies what procedures or instruments should be used.
A recent National Heart, Lung, and Blood Institute (NHLBI) Working Group6 recommended a two-step approach to screening in research studies in which the two-item version of the Patient Health Questionnaire (PHQ-2)7 is used as an initial screen, and the nine-item version (PHQ-9)8 is administered to patients positive on the PHQ-2 to identify patients likely to have MDD based on a structured clinical interview. The PHQ-9 is self-administered and easily scored, maps onto the nine symptoms from the DSM-IV classification for MDD, and can be used to track symptoms.8,9 Recommended cutoff scores to identify patients in primary care who would likely be positive for MDD based on a clinical interview are ≥10 for the PHQ-9 and ≥3 for the PHQ-2.7
It has also been suggested that a “PHQ diagnosis” of MDD can be obtained from the PHQ-9 based on five of nine depressive symptoms present at least half of the days in the past 2 weeks, including depressed mood or anhedonia.8 However, two systematic reviews9,10 reported that this method had similar accuracy to the cutoff score method for identifying patients who met MDD criteria based on a structured clinical interview. These reviews found that the PHQ-9 had good sensitivity (77% and 80%) and specificity (94% and 92%) in primary care settings,9,10 but one review showed that recommended cutoff scores for the PHQ-9 had poor sensitivity in three of six specialty medicine samples (50% to 69%).10 McManus et al.11 found that cutoff points recommended for primary care patients resulted in poor sensitivity for the PHQ-2 (39%) and PHQ-9 (54%), compared to a diagnosis of MDD based on the Computerized Diagnostic Interview Schedule (C-DIS),12 among 1,024 outpatients with stable coronary artery disease (CAD) from the Heart and Soul Study, but did not identify an optimal screening cutoff. Stafford et al.13 reported that a cutoff score of ≥6 optimized sensitivity (83%) and specificity (79%) among CAD outpatients, but this was based on a relatively small sample (N=193, MDD=35).
The objective of this study was to assess the test characteristics of the PHQ-2 and PHQ-9 compared to an MDD diagnosis using the C-DIS in CAD patients from the Heart and Soul Study using recommended primary care cutoffs, alternative cutoffs, the two-step approach recommended by the NHLBI Working Group, and a “PHQ diagnosis.”
Methods of the Heart and Soul Study have been described previously.11 Eligible patients were identified through administrative databases as having CAD, defined as history of MI, angiographic evidence of ≥50% stenosis in ≥1 coronary vessel, previous evidence of exercise-induced ischemia by cardiac stress testing, history of coronary revascularization, and/or diagnosis of CAD by an internist or cardiologist. Invitations to participate in the study were mailed to 15,438 eligible patients; 2,495 responded by mail and received a follow-up telephone call. Of these, 505 could not be reached, 596 declined participation, and 370 were excluded due to an MI in the prior 6 months, self-assessed inability to walk one block, or pending move from the area. Between September 2000 and December 2002, 1,024 patients were enrolled. At their initial study appointment, patients completed the PHQ-2 and PHQ-9 and were assessed for current (past month) MDD with the C-DIS. The appropriate Institutional Review Boards approved all study procedures, and all participants provided written, informed consent.
The PHQ-9 includes nine items (scored 0–3; total score range 0 to 27).8,14 The PHQ-2 includes the first two items of the PHQ-9 (anhedonia and depressed mood) with a total score range of 0 to 6. For the “PHQ diagnosis,” subjects were considered depressed if they reported a total of five of nine PHQ symptoms, including anhedonia or depressed mood, “more than half the days” (thoughts of death counted if present at all).14 The C-DIS was the gold standard used to assess MDD in the previous month12,15 by research assistants blind to results of the PHQ.
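The scoring rules described above can be sketched in a few lines of Python. This is an illustrative sketch, not the study's scoring code. It assumes the usual PHQ response coding (0 = not at all, 1 = several days, 2 = more than half the days, 3 = nearly every day) and that thoughts of death is the ninth item; the function names are hypothetical.

```python
from typing import Sequence


def phq9_total(items: Sequence[int]) -> int:
    """Sum the nine PHQ-9 items (each scored 0-3), giving a total of 0 to 27."""
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("expected nine item scores, each between 0 and 3")
    return sum(items)


def phq2_total(items: Sequence[int]) -> int:
    """Sum the first two PHQ-9 items (anhedonia, depressed mood), giving 0 to 6."""
    phq9_total(items)  # reuse the validation above
    return items[0] + items[1]


def phq_diagnosis(items: Sequence[int]) -> bool:
    """Apply the 'PHQ diagnosis' rule: five of nine symptoms, including
    anhedonia or depressed mood, present 'more than half the days'."""
    phq9_total(items)  # reuse the validation above
    # Assumption: a score of 2 or more means "more than half the days".
    counted = [score >= 2 for score in items[:8]]
    # Assumption: item 9 is thoughts of death, counted if present at all.
    counted.append(items[8] >= 1)
    has_cardinal_symptom = counted[0] or counted[1]
    return has_cardinal_symptom and sum(counted) >= 5


# Hypothetical responses, for illustration only.
responses = [2, 2, 2, 2, 1, 0, 0, 0, 1]
print(phq9_total(responses), phq2_total(responses), phq_diagnosis(responses))
```

Note that a respondent can exceed a screening cutoff on the total score without meeting the “PHQ diagnosis” rule, because the rule counts symptoms rather than summing item scores.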
Sensitivity, specificity, positive predictive value, negative predictive value, likelihood ratios, and area under the receiver-operating characteristic curve (AUC)16 were calculated. Each of 18 patients was missing one item on the PHQ-9. Missing values were imputed using the SPSS Missing Values Analysis module expectation maximization algorithm (version 15.0, Chicago, IL).
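For readers who want to reproduce these quantities, the following sketch computes them from the four cells of a 2×2 table (screen result versus C-DIS diagnosis) using the standard definitions. The function and class names are hypothetical, and the counts in the example are invented for illustration; they are not study data. AUC is not covered because it requires the full score distribution rather than a single 2×2 table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCharacteristics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    lr_positive: float
    lr_negative: float


def screening_characteristics(tp: int, fp: int, fn: int, tn: int) -> ScreeningCharacteristics:
    """Compute test characteristics from true/false positive/negative counts.

    Assumes every row and column of the 2x2 table is non-empty.
    """
    sensitivity = tp / (tp + fn)   # positives among those with the diagnosis
    specificity = tn / (tn + fp)   # negatives among those without it
    ppv = tp / (tp + fp)           # diagnosis among screen positives
    npv = tn / (tn + fn)           # no diagnosis among screen negatives
    # Likelihood ratios; guard against division by zero at the extremes.
    lr_positive = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_negative = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return ScreeningCharacteristics(
        sensitivity, specificity, ppv, npv, lr_positive, lr_negative
    )


# Hypothetical counts, for illustration only.
example = screening_characteristics(tp=80, fp=30, fn=20, tn=170)
print(f"sensitivity={example.sensitivity:.2f}, specificity={example.specificity:.2f}")
```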
A total of 224 patients (22%) had MDD diagnoses. Patient characteristics, including MDD prevalence by subgroup, are shown in Table 1. As shown in Table 2, and as previously reported by McManus et al.,11 cutoffs for the PHQ-2 (≥3) and PHQ-9 (≥10) recommended for primary care resulted in good specificity (93% and 90%, respectively), but poor sensitivity (39% and 54%, respectively). Optimal cutpoints were ≥2 for the PHQ-2 (82% sensitive, 79% specific) and ≥6 for the PHQ-9 (83% sensitive, 76% specific). The two-step procedure (PHQ-9 ≥6 for patients with PHQ-2 ≥2) resulted in somewhat lower sensitivity (75%) and somewhat higher specificity (84%) compared to the PHQ-9 or PHQ-2 alone. The “PHQ diagnosis” approach was highly specific (96%), but poorly sensitive (28%). There were no significant differences in AUC or sensitivity and specificity for the PHQ-2 or PHQ-9 based on sex or age (<70 years versus ≥70 years).
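The two-step rule can be written as a simple decision function, sketched below with the cutpoints used here (PHQ-2 ≥2, then PHQ-9 ≥6) as defaults. The function name is hypothetical. Because a patient is positive on the two-step procedure only if positive at both steps, the set of two-step positives is contained in the set of positives for each instrument alone. The two-step approach therefore cannot be more sensitive, or less specific, than either instrument used alone at the same cutpoints, which is consistent with the pattern reported above.

```python
def two_step_positive(phq2: int, phq9: int,
                      phq2_cutoff: int = 2, phq9_cutoff: int = 6) -> bool:
    """Return True if a patient screens positive on the two-step procedure.

    Step 1: PHQ-2 at or above its cutoff; otherwise the screen is negative
    and the PHQ-9 would not be administered.
    Step 2: PHQ-9 at or above its cutoff.
    """
    if phq2 < phq2_cutoff:
        return False
    return phq9 >= phq9_cutoff


# Hypothetical (PHQ-2, PHQ-9) score pairs, for illustration only.
patients = [(0, 3), (2, 5), (1, 7), (3, 9), (4, 12)]
for phq2, phq9 in patients:
    print(phq2, phq9, two_step_positive(phq2, phq9))
```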
In outpatients with stable CAD, we found that either a PHQ-2 cutpoint of ≥2 or a PHQ-9 cutpoint of ≥6 optimized combined sensitivity and specificity for detecting MDD based on a structured clinical interview. A two-step screening approach using both instruments had similar overall diagnostic accuracy to using either alone. As compared with a structured clinical interview for MDD, a “PHQ diagnosis” using the PHQ-9 responses to diagnose MDD was highly specific, but resulted in many false negatives.
Our results build on the work of Stafford et al. who examined a group of individuals in Australia 3 months after discharge from hospitalization for an acute MI or a coronary revascularization procedure.13 They also found that a PHQ-9 cutoff score of ≥6 optimized sensitivity and specificity. We added to this work by demonstrating that the PHQ-2 performs similarly to the PHQ-9 and that two-step screening with the PHQ-2 followed by the PHQ-9 does not improve results compared to screening with either the PHQ-2 or the PHQ-9 alone. An obvious benefit to using the PHQ-2 is its relative brevity. On the other hand, the PHQ-9 may be a better tool for tracking depressive symptoms over time.
We generated cutoff scores that optimized the balance between sensitivity and specificity. In clinical settings, scores can be used to assess depression severity, to monitor the efficacy of treatment, or to identify patients likely to have a diagnosis of MDD based on further assessment. Cutoff points can also be used for research purposes, but not for the formal diagnosis of MDD. A formal diagnosis of MDD requires a clinical interview that assesses specific symptom patterns, as well as evidence of functional limitations.
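One common way to operationalize a “balance between sensitivity and specificity” is to choose the cutoff that maximizes Youden's index (sensitivity + specificity − 1). The sketch below uses that criterion as an assumption; the exact optimization criterion is not specified here, so this should not be read as the procedure actually used. The function names and the small data set are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def sens_spec_at_cutoff(scores: Sequence[int], has_mdd: Sequence[bool],
                        cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity when 'score >= cutoff' is a positive screen."""
    pairs = list(zip(scores, has_mdd))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)
    fn = sum(1 for s, d in pairs if d and s < cutoff)
    tn = sum(1 for s, d in pairs if not d and s < cutoff)
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores: Sequence[int], has_mdd: Sequence[bool],
                candidates: Iterable[int]) -> int:
    """Return the candidate cutoff with the largest Youden index.

    Assumption: Youden's index is the criterion; ties go to the lowest
    cutoff listed first in `candidates`.
    """
    def youden(cutoff: int) -> float:
        sensitivity, specificity = sens_spec_at_cutoff(scores, has_mdd, cutoff)
        return sensitivity + specificity - 1

    return max(candidates, key=youden)


# Hypothetical scores and diagnoses, for illustration only.
scores = [0, 0, 1, 1, 3, 2, 4, 5, 6]
has_mdd = [False, False, False, False, False, True, True, True, True]
print(best_cutoff(scores, has_mdd, range(1, 7)))
```

Other criteria are possible, such as requiring a minimum sensitivity and then maximizing specificity, and they can select a different cutoff from the same data.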
As demonstrated in primary care settings, improved depression outcomes are likely to occur only when a collaborative care model is used, including the use of evidence-based protocols for treatment, active collaboration between primary care providers and mental health specialists, active monitoring of adherence to therapy, and access to structured psychotherapy.17 In the absence of these services, there is no evidence that screening alone is of benefit to patients in CVD settings. It must be noted, however, that this conclusion is made from the perspective of depression care alone and does not take into account the possibility that depression screening may have other potential benefits to patients with CVD. Many studies have now shown that patients with positive depression screens are at increased risk of cardiovascular morbidity and mortality.18 If depression screening identifies a group of high-risk patients who derive particular benefit from certain cardiac procedures or from interventions focused on enhancing adherence to medication or to secondary prevention behaviors, for example, then screening may be useful in therapeutic decision making even in the absence of mechanisms for formal depression diagnosis, treatment, and follow-up.
It must also be noted that this analysis is based on data from a study of outpatients with stable CAD, and the degree to which conclusions generalize to patients hospitalized with acute coronary syndromes is unknown. Furthermore, since only 7% of eligible patients actually enrolled in the study, results may not generalize well to other groups of CAD patients, although this response rate is comparable to other large cohort studies, such as the Coronary Artery Disease in Young Adults Study and the Cardiovascular Health Study.19,20 Additional research is needed on screening in acute care settings. Studies are also needed that examine paradigms, such as multiple positive screens prior to initiating formal evaluation, with the goal of reducing the high number of false positives generated in initial screening. Furthermore, clinical management paradigms are needed to establish whether screening in cardiovascular care settings leads to net benefits for patients.
The Heart and Soul Study was funded by the Department of Veterans Affairs Epidemiology Merit Review Program, the Department of Veterans Affairs Health Services Research and Development service, the National Heart Lung and Blood Institute (R01 HL079235), the American Federation for Aging Research (Paul Beeson Scholars Program), the Robert Wood Johnson Foundation (Generalist Physician Faculty Scholars Program), and the Ischemia Research and Education Foundation. Dr. Thombs is supported by a New Investigator Award from the Canadian Institutes of Health Research and an Établissement de Jeunes Chercheurs award from the Fonds de la Recherche en Santé Québec. Dr. Ziegelstein is supported by grant no. R24AT004641 from the National Center For Complementary and Alternative Medicine and by the Miller Family Scholar Program.
Conflict of interest: None disclosed.