In high-income countries, depression is prevalent in HIV patients and is associated with lower medication adherence and poorer clinical outcomes. Emerging evidence from low-income countries supports similar relationships. Yet little research has validated rapid depression screening tools integrated into routine HIV clinical care.
Using qualitative methods, we adapted the Patient Health Questionnaire-9 (PHQ-9) depression screening instrument for use with Cameroonian patients. We then conducted a cross-sectional validity study comparing an interviewer-administered PHQ-9 to the reference standard Composite International Diagnostic Interview in 400 patients on antiretroviral therapy attending a regional HIV treatment center in Bamenda, Cameroon.
The prevalence of major depressive disorder (MDD) in the past month was 3% (n=11 cases). Using a standard cutoff score of ≥10 as a positive depression screen, the PHQ-9 had estimated sensitivity of 27% (95% confidence interval: 6–61%) and specificity of 94% (91–96%), corresponding to positive and negative likelihood ratios of 4.5 and 0.8. There was little evidence of variation in specificity by gender, number of HIV symptoms, or result of a dementia screen.
The low prevalence of MDD yielded very imprecise sensitivity estimates. Although the PHQ-9 was developed as a self-administered tool, we assessed an interviewer-administered version due to the literacy level of the target population.
The PHQ-9 demonstrated high specificity but apparently low sensitivity for detecting MDD in this sample of HIV patients in Cameroon. Formative work to define the performance of proven screening tools in new settings remains important as research on mental health expands in low-income countries.
In high-income nations, depression in HIV patients has received much attention as a highly prevalent comorbidity and a predictor of reduced antiretroviral (ARV) medication adherence, higher transmission risk behaviors, and poorer clinical outcomes.1–3 As access to HIV clinical care and ARVs expands in sub-Saharan Africa,4 increasing attention has been focused on the prevalence and consequences of depression for HIV-infected patients in these settings.3 Evidence to date suggests that in sub-Saharan Africa, as in other parts of the world, the prevalence of depression is higher in HIV patients than in the general population and that depression is associated with worse ARV adherence and HIV clinical outcomes.2,5
In order to respond to the potential negative impact of depression on HIV clinical care, HIV clinical settings require rapid, validated screening tools to identify patients with a likely depressive disorder. Brief depression screening tools such as the Patient Health Questionnaire-9 (PHQ-9)6 are becoming widely used in the United States and Europe, but these tools were generally developed for self-administration by a literate population. Few studies7 have been conducted in African countries validating such tools against reference standard diagnostic assessments, especially in low-literacy populations, or have considered their application in HIV-infected populations specifically.8
Accordingly, we undertook to validate the PHQ-9 as a depression screening tool compared to the internationally calibrated Composite International Diagnostic Interview (CIDI) at a regional hospital serving a predominantly poor and low-literacy HIV patient population in the Northwest Region of Cameroon. Cameroon is a country of approximately 20 million people in Central Africa9 with an estimated HIV prevalence rate of 5.3%.10
This study was approved by the Cameroon National Ethics Committee (No. 111/CNE/SE/09), the University of North Carolina at Chapel Hill's Biomedical Institutional Review Board (# 09-0852) and the Duke University Health System IRB (# Pro00016937). The study also received administrative approval from the Ministry of Public Health in Cameroon.
We conducted a cross-sectional validation study to compare depression screening results from the PHQ-9 to a reference standard tool for depression diagnosis among HIV-positive patients on antiretroviral therapy (ART) and attending the Bamenda Regional Hospital AIDS Treatment Center (BRHATC) in Cameroon. The BRHATC is a health facility dedicated to the care of HIV-positive patients, the vast majority of whom are on ART. The center provides care to over 4,000 patients annually. Patients were eligible if they were HIV-infected, were on ART, were attending the BRHATC for any service (including counseling, clinical follow-up or drug refill), spoke English, were aged 18–55, and were willing to provide informed consent. We excluded patients >55 years because the present study was to lay the groundwork for a medication trial which would exclude patients >55 years for safety reasons. Patients who could not give consent or refused to give consent were excluded. Although more than 30 local languages exist, English is the language used for regular clinical care in the BRHATC and throughout the Northwest Region of Cameroon; more than 95% of clinic attendees can communicate effectively in English. Each participant could be enrolled only once during the study period.
Our sampling goal was to recruit a study sample representative of HIV-positive patients receiving ART at BRHATC. In the absence of a daily patient register to provide a sampling frame, study staff approached each patient consecutively as the patients passed through a central point in the registration process (weight measurement) until a patient indicated willingness to participate in the study. The recruiting staff member then obtained written informed consent from the interested patient and completed the first part of data collection. The patient would then complete the second part of data collection with a second staff member while the recruiting staff member resumed approaching patients consecutively at registration.
The PHQ-9 is a widely used 9-item screener for depression that assesses the presence within the past two weeks of the 9 core symptoms of depression as specified by the DSM-IV.11 Each symptom is rated as being present none of the time, a few days, more than half the days, or nearly every day in the past two weeks. The PHQ-9 total score can range from 0–27, with a score of 10 or above being considered in many settings as an indication of a likely depressive disorder.
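The scoring rule described above can be illustrated with a minimal Python sketch. The function name `score_phq9` and the integer coding of the four response options as 0 to 3 are assumptions made for illustration; they are not taken from the study materials.

```python
def score_phq9(responses, cutoff=10):
    """Return (total, positive_screen) for nine PHQ-9 item responses.

    Assumed coding (illustrative): 0 = none of the time, 1 = a few days,
    2 = more than half the days, 3 = nearly every day.
    """
    if len(responses) != 9:
        raise ValueError("PHQ-9 requires exactly 9 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be an integer from 0 to 3")
    total = sum(responses)  # possible range is 0 to 27
    return total, total >= cutoff


if __name__ == "__main__":
    # Hypothetical respondent: five symptoms on more than half the days.
    print(score_phq9([2, 2, 2, 2, 2, 0, 0, 0, 0]))
```

The `cutoff` parameter is left adjustable so that the same function could be reused with alternative thresholds.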
In order to assess any needed adaptations of the wording of the PHQ-9 for use in our setting, we conducted four focus groups with hospital patients and family members from our study site. Groups were separated by gender and religion (Christian and non-Christian). Study staff trained in qualitative methods led the focus groups, which were audio-recorded and transcribed for content analysis. The analysis suggested minor wording changes to two of the questions and optional clarifications (that were read only if the original question was not understood) for six additional questions.
Diagnoses of major depression were established with the reference standard Composite International Diagnostic Interview (CIDI) from the World Health Organization. The CIDI is a lay-administered diagnostic instrument whose performance has been validated widely in multiple international settings, including HIV settings.12,13 The CIDI is a comprehensive, fully structured interview designed to be used by trained lay interviewers for the assessment of mental disorders in epidemiological and cross-cultural studies as well as for clinical and research purposes. We used the World Mental Health Survey Initiative Version of the CIDI (WMH-CIDI), which allows the assessment of mental disorders according to the definitions and criteria of ICD-10 and DSM-IV. The diagnostic section of the interview expands upon earlier versions of the CIDI by adding detailed questions about disorder severity, impairment, service use, and treatment, and has improved generalizability with increased involvement of less wealthy countries. This version has been successfully used in sub-Saharan Africa (Nigeria).14
After consent, participants completed the PHQ-9 with one study team member. Given the low level of literacy in the target population, the PHQ-9 was read to all participants and responses were recorded on the form by the study team member, who was trained in PHQ-9 administration. Previous studies have found comparable results from interviewer administration and self-administration of the PHQ-9.15 Participants also provided socio-demographic information, completed the International HIV Dementia Scale,16 and were asked whether in the past 6 months they had experienced each of 13 symptoms commonly associated with HIV infection: new or persistent headaches, fevers, oral pain, white patches in the mouth, rashes, nausea, trouble with eyes, sinus infection, numbness in the hands or feet, persistent cough, diarrhea, weight loss, or (for women only) abnormal vaginal discharge.
Participants then completed the Screening and Depression modules of the CIDI in an in-person interview with a second study team member who was blinded to the results of the PHQ-9 screening. The CIDI interviewer was a health care professional who received formal CIDI training from a WHO-certified trainer. The CIDI interviewer completed regular supervision and record review with a psychiatrist (BNG) to ensure consistency and accuracy of diagnoses.
Based on validation studies of the PHQ-9 in other settings, we defined a positive screen for depression as a PHQ-9 total score of ≥10.7,17 In sensitivity analyses we also considered alternative cutoffs of ≥8 and ≥12.
Standard CIDI scoring methodology requires the following components for a lifetime major depressive episode: a period lasting at least two weeks characterized by at least 5 out of 9 core depressive symptoms, with at least one symptom being either depressed mood or anhedonia, and representing a change from previous functioning; the symptoms must have caused clinically significant distress or impairment in social, occupational, or other important areas of functioning; the symptoms must not be better explained by substance use or a general medical condition; and the symptoms must not be better explained by bereavement, or if secondary to bereavement, the episode either must last more than two months or be characterized by at least one of the following: marked functional impairment, morbid preoccupation with worthlessness, suicidal ideation, psychotic symptoms, or psychomotor retardation.
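The decision logic in the preceding paragraph can be summarized in a short Python sketch. This is a simplified illustration of the stated criteria only, not the CIDI scoring algorithm itself; the class and field names are hypothetical, and the use of 14 days for "two weeks" and 60 days for "two months" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class EpisodeReport:
    """Hypothetical summary of one reported episode (field names are illustrative)."""
    duration_days: int
    core_symptom_count: int            # out of the 9 core depressive symptoms
    depressed_mood_or_anhedonia: bool
    change_from_previous_functioning: bool
    distress_or_impairment: bool
    explained_by_substance_or_medical: bool
    secondary_to_bereavement: bool
    severe_feature_present: bool = False  # e.g. suicidal ideation, psychotic symptoms


def meets_mde_criteria(ep: EpisodeReport) -> bool:
    """Apply the criteria as paraphrased in the text (a sketch, not the CIDI code)."""
    symptoms_ok = (
        ep.duration_days >= 14               # assumed: two weeks = 14 days
        and ep.core_symptom_count >= 5
        and ep.depressed_mood_or_anhedonia
        and ep.change_from_previous_functioning
    )
    if not (symptoms_ok and ep.distress_or_impairment):
        return False
    if ep.explained_by_substance_or_medical:
        return False
    if ep.secondary_to_bereavement:
        # Bereavement-related episodes count only if prolonged or severe.
        return ep.duration_days > 60 or ep.severe_feature_present  # assumed: two months = 60 days
    return True
```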
On the CIDI, participants reporting any lifetime major depressive episode (MDE) were asked if they had experienced any similar episodes in the past year, past 6 months, past month, and currently. Participants reporting an episode in the past year or more recently completed a standard depressive severity rating scale called the Quick Inventory of Depressive Symptomatology (QIDS)4 that is embedded within the CIDI. The QIDS total score can range from 0–27 and has standard categories that correspond to very severe (21–27), severe (16–20), moderate (11–15), mild (6–10), and no (0–5) depressive symptoms. Participants were classified as having a diagnosis of an MDE in the past year, past 6 months, and past month if they endorsed a depressive episode in those time frames and received a score of 11 or above on the QIDS.
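A minimal Python sketch of the severity categories and the classification rule just described follows. The function names are hypothetical, and the sketch assumes that a single boolean indicates whether the participant endorsed an episode in the time frame of interest.

```python
def qids_category(score):
    """Map a QIDS total score (0 to 27) to the severity categories given in the text."""
    if not 0 <= score <= 27:
        raise ValueError("QIDS total score must be between 0 and 27")
    if score >= 21:
        return "very severe"
    if score >= 16:
        return "severe"
    if score >= 11:
        return "moderate"
    if score >= 6:
        return "mild"
    return "none"


def classify_mde(endorsed_episode_in_window, qids_score, threshold=11):
    """Classify an MDE in a time frame: episode endorsed AND QIDS at or above threshold."""
    return bool(endorsed_episode_in_window) and qids_score >= threshold
```

Under this rule, a participant who endorses an episode in the past month but scores in the mild range on the QIDS would not be counted as a past-month case.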
The International HIV Dementia Scale (IHDS) was scored according to standard methodology.16 Each of three tasks (motor speed – number of finger taps in 5 seconds; psychomotor speed – number of sequences of a pattern of hand movements in 10 seconds; memory-recall – number of words recalled) is scored on a 0–4 scale, with higher scores indicating better functioning and a maximum possible score of 12. A score of 10 or below is considered a positive screen for possible dementia.
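As an illustration of the summary rule above, the following Python sketch sums the three subscale scores and applies the cutoff. It assumes each task has already been converted to its 0–4 subscale score; the conversion from raw counts (taps, sequences, words) is not specified here and is not implemented. The function name is hypothetical.

```python
def score_ihds(motor_speed, psychomotor_speed, memory_recall, cutoff=10):
    """Return (total, positive_screen) from three IHDS subscale scores.

    Each argument is assumed to be an already-derived subscale score
    on the 0-4 scale; a total at or below the cutoff is a positive screen.
    """
    subscores = (motor_speed, psychomotor_speed, memory_recall)
    if any(not 0 <= s <= 4 for s in subscores):
        raise ValueError("each subscale score must be between 0 and 4")
    total = sum(subscores)  # maximum possible total is 12
    return total, total <= cutoff
```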
Participant characteristics are described with proportions or with means and standard deviations (SD). To calculate test characteristics of the PHQ-9 relative to the CIDI, we ideally would have compared a positive PHQ-9 screen (which references the past two weeks) to CIDI diagnoses of current MDE. However, the overall prevalence of MDE in the sample was lower than anticipated and an initial examination of the prevalence of MDE in different time frames indicated that there were too few cases of current MDE to allow meaningful estimation of sensitivity. We therefore made the decision for the primary analysis to compare the PHQ-9 to CIDI diagnoses of MDE in the past month, as has been done in prior studies,18,19 recognizing that the time frames referenced by the two instruments would differ.
We calculated the sensitivity, specificity, likelihood ratio positive (LR+), and likelihood ratio negative (LR−) for this comparison. Since positive and negative predictive values (PPV and NPV) vary by prevalence, we present the PPV and NPV for a range of different prevalence values for depression. We examined whether there was evidence that the test characteristics of the PHQ-9 differed in certain subgroups (gender, HIV physical symptoms, English proficiency, and dementia score). In secondary analyses, we considered different thresholds for the PHQ-9 (8, 10, and 12) and different time frames for the CIDI (current versus past month).
Between May and October, 2010, 461 patients were approached; 48 were not eligible (26 did not speak English well enough to complete study activities, and 22 were >55 years old) and 13 declined to participate (10 did not have time and 3 wished to consult their spouse before participating). Informed consent was provided by the remaining 400 participants. Participants had a median age of 41 years (interquartile range: 34–47 years); 74% were female, 99% were Christian (religious affiliations were indicated as Muslim by four participants and traditional by one participant), and 61% had completed only primary education (Table 1). The self-reported estimated median household daily expenditures were FCFA 700 (IQR 400–1400) (approximately US $1 [$1–$3]). Participants endorsed a median of 5 HIV-related symptoms (IQR: 3–6). A majority of participants (83%) met criteria for possible dementia. More than half of the subjects performed poorly on the memory sub-scale (54% with a score ≤3 out of 4).
Of 398 individuals with a complete CIDI assessment, 11 (3%) met diagnostic criteria for a major depressive episode in the past month (Table 2). Of these 11 cases, 3 had a positive PHQ-9 screen for depression at a cutoff of 10 or above (true positives) (Table 3). Of the 387 participants without a diagnosis of an MDE in the past month, 364 had a negative PHQ-9 screen. Thus the standard PHQ-9 cutoff of 10 or above had a sensitivity of 27% (95% CI: 6–61%) and a specificity of 94% (95% CI: 91–96%) relative to an MDE diagnosis on the CIDI in the past month. These test characteristics correspond to a LR+ of 4.5 and a LR− of 0.8.
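The arithmetic behind these test characteristics can be reproduced with a short Python sketch. The function name is hypothetical. The counts of 8 false negatives and 23 false positives are derived from the figures above (11 − 3 and 387 − 364).

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, LR+ and LR- from 2x2 cell counts.

    Assumes specificity is strictly between 0 and 1 so both ratios are defined.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


if __name__ == "__main__":
    # Past-month MDE versus PHQ-9 >= 10, using the counts reported in the text.
    for name, value in diagnostic_metrics(tp=3, fn=8, tn=364, fp=23).items():
        print(f"{name}: {value:.3f}")
```

With unrounded inputs the LR+ comes out slightly above the reported 4.5, which is consistent with the reported value having been calculated from the rounded sensitivity and specificity (0.27 / 0.06).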
In sensitivity analyses, we compared the standard PHQ-9 cutoff score of 10 to current MDE diagnoses at the time of interview. Four participants (1%) met criteria for a current MDE. When considering current MDE, the sensitivity of the PHQ-9 was higher than when considering past-month MDE (50%, 95% CI: 7–93%). Specificity was unchanged at 94% (95% CI: 91–96%).
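The text does not state how the confidence intervals were computed. As a sketch under the assumption that exact (Clopper-Pearson) binomial intervals were used, which appears consistent with the reported ranges, the following Python code finds the interval limits by bisection using only the standard library. The function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05, iters=60):
    """Exact two-sided binomial confidence interval for x successes in n trials."""

    def root(tail_prob, increasing):
        # Bisection for the p at which tail_prob(p) equals alpha / 2.
        lo, hi = 0.0, 1.0
        for _ in range(iters):
            mid = (lo + hi) / 2
            if (tail_prob(mid) < alpha / 2) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= x | p) = alpha/2, a function increasing in p.
    lower = 0.0 if x == 0 else root(lambda p: 1 - binom_cdf(x - 1, n, p), True)
    # Upper limit: P(X <= x | p) = alpha/2, a function decreasing in p.
    upper = 1.0 if x == n else root(lambda p: binom_cdf(x, n, p), False)
    return lower, upper


if __name__ == "__main__":
    print(clopper_pearson(3, 11))  # sensitivity for past-month MDE
    print(clopper_pearson(2, 4))   # sensitivity for current MDE
```

The width of these intervals with only 11 or 4 cases illustrates why the sensitivity estimates are so imprecise.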
Returning to the original time frame of MDE in the past month, use of a lower PHQ-9 cutoff of 8 yielded a slightly higher sensitivity and lower specificity than the standard cutoff of 10, while use of a higher PHQ-9 cutoff of 12 yielded a slightly lower sensitivity and higher specificity (Table 3). Given the low number of cases and the resulting wide uncertainty around the sensitivity estimate, we do not present a full ROC curve, as ROC estimates would be imprecise.
As PPV and NPV are functions of prevalence as well as sensitivity and specificity, we illustrate in Figure 1 the PPV and NPV for the observed sensitivity and specificity over a range of possible prevalence values of MDE. At the MDE prevalence of 3% observed in this population (indicated by the red vertical line), the PHQ-9 had a PPV of 12% (2–30%) and an NPV of 98% (96–99%). In a hypothetical population with a higher MDE prevalence of 20%, the PPV of the PHQ-9 would improve to 53% while the NPV would decline to 84%.
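The dependence of PPV and NPV on prevalence follows from Bayes' rule and can be sketched in Python as below. The function name is hypothetical, and the sensitivity and specificity passed in are the unrounded proportions implied by the reported counts (3/11 and 364/387).

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    sens, spec = 3 / 11, 364 / 387
    for prev in (11 / 398, 0.10, 0.20):
        ppv, npv = predictive_values(sens, spec, prev)
        print(f"prevalence {prev:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

At the observed prevalence this reproduces a PPV of about 12% and an NPV of about 98%, and at 20% prevalence a PPV of about 53% and an NPV of about 84%.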
The number of MDE cases was too small to assess whether the sensitivity of the PHQ-9 varied for different subgroups. There was little evidence of variation in specificity by gender, number of HIV symptoms endorsed, or outcome of the International HIV Dementia Scale assessment (Table 4).
In this population of predominantly low-literacy HIV-infected individuals in Cameroon, an interviewer-administered PHQ-9 demonstrated high specificity but low sensitivity in identifying cases of major depressive disorder as measured by the gold standard Composite International Diagnostic Interview (CIDI). The estimate of PHQ-9 specificity at the standard cutoff of 10 or above was comparable with other published results from a range of settings.17 The estimate of sensitivity at the standard cutoff was lower than in many previous studies, especially those in primary care populations,17 but was comparable to some studies done in chronically ill medical populations20 and to a recent large validation study in the Netherlands that also used the CIDI as its gold standard.19 The sensitivity was imprecisely estimated because of the small number of MDD cases in this population. As expected, the PHQ-9 (which asks about symptoms in the past two weeks) demonstrated higher sensitivity in detecting MDD that was present at the time of interview than MDD diagnoses from any point in the past month, although conclusions are limited by the small number of cases available to calculate sensitivity.
The prevalence of depression was somewhat lower in this sample than in other recently reported studies of HIV-infected individuals in sub-Saharan Africa,21 although the majority of prevalence estimates from the region have been based on screening instruments rather than diagnostic instruments. Studies in the region using diagnostic instruments have reported MDD prevalence estimates ranging from 2.7–34.9%,22–28 with most studies employing the Mini International Neuropsychiatric Interview.29 The WHO Neuropsychiatric AIDS Study from the early 1990s, which was the only published study we could identify which used the CIDI to measure depression among HIV-infected individuals in the region, reported prevalence estimates in Nairobi, Kenya, of 3.0% and 5.5% in asymptomatic and symptomatic HIV-infected patients, respectively, and prevalence estimates in Kinshasa, Democratic Republic of the Congo, of 0% and 4.4% in asymptomatic and symptomatic HIV-infected patients.12
Although several studies have considered the internal and construct validity of the PHQ-9 in sub-Saharan African populations,30,31 including in HIV patients,8 we identified only one formal validation study from Africa that reported sensitivity and specificity of the PHQ-9 relative to a reference standard diagnostic instrument.7 This validation study compared a self-administered PHQ-9 to the Mini International Neuropsychiatric Interview (MINI) among a highly educated sample of Nigerian university students and reported a sensitivity of 85% and specificity of 99%. Although these results are promising, the performance of the PHQ-9 and other depression screening instruments needs to be better understood in Cameroon and other low-income countries.
This study's findings should be understood in light of its strengths and limitations. The generalizability of the present study is enhanced by the consecutive sampling approach, broad inclusion criteria, and recruitment from a large regional HIV/AIDS treatment center. The validity of our screening and diagnostic measures of depression was enhanced by adaptation of the PHQ-9 for use in Cameroon through focus group feedback, rigorous training of interviewers, regular review of diagnostic decisions by a supervising psychiatrist, and blinding of the CIDI interviewer to the results of the PHQ-9 screening. Due to the low level of literacy in the target population, we chose to have the PHQ-9 be interviewer-administered rather than self-administered. The PHQ-9 was developed as a self-administered tool and only a small number of studies have validated an interviewer-administered version.32 Although previous studies have reported consistent performance between self-administration and interviewer administration of the PHQ-9,15 it is possible that administration mode in this population affected the estimated test characteristics. For example, participants may be less likely to report depressive symptoms to an interviewer than to endorse them on paper, which would be expected to bias the estimate of sensitivity downward. The prevalence of positive screens for dementia of 83% was higher than in previous research among HIV-infected patients in Cameroon,33 suggesting some degree of memory impairment in the study subjects, which could also have biased the sensitivity downward in this interviewer-administered version of the PHQ-9. The estimates of test characteristics may also have been influenced by the difference in time frames referenced by the PHQ-9 (past two weeks) and the CIDI (past month), which would also be expected to bias the estimate of sensitivity downward. Our approach, however, was consistent with18 or more stringent than19 similar studies using the CIDI as the reference standard.
Insufficient evidence was available in this study to distinguish between the competing hypotheses described above for the observed low sensitivity of the PHQ-9. Efforts to identify depression screening protocols with higher sensitivity might focus on (1) conducting validation studies in larger populations or populations with a higher prevalence of depression to permit more precise estimation of sensitivity; (2) testing whether mode of administration affects sensitivity, for example by comparing interviewer administration to audio computer-assisted self-interview (ACASI); and (3) testing whether medical comorbidity, and particularly co-occurrence of dementia, affects sensitivity. Given other reports of low sensitivity of the PHQ-9 in chronically medically ill and non-US populations,19,20,34,35 additional research should also directly compare various depression screening instruments to identify which instrument has the best characteristics in a given population.
In conclusion, the PHQ-9 demonstrated high specificity but apparently low sensitivity at the standard cutoff in detecting past-month MDD as measured by the CIDI in this sample of HIV patients receiving ART in an urban center in Cameroon. The high specificity suggests that any positive screen will require immediate clinical attention to confirm a diagnosis. On the other hand, the low estimated sensitivity implies a high false negative rate, suggesting that methods to increase opportunities for identifying patients with MDD may be useful. However, sensitivity was extremely imprecisely measured because of the low prevalence of MDD in the sample. Further research may benefit from identifying higher-prevalence populations to improve sensitivity estimates, examining the role of administration mode on PHQ-9 performance, assessing the impact of dementia and other medical comorbidities on test characteristics, and validating alternative depression screening tools. In general, the results of this study underline the importance of formative work such as diagnostic validation studies when applying a depression screening instrument in a new population.
This work was made possible by our study participants and by study personnel Mrs. Shantal Asanji, Dr. Awasum Charles, Mr. Andrew Goodall, Mr. Fru Johnson, Dr. Charles Arrey Kefie, Mrs. Irene Numfor, Mr. Joseph Nyingcho, and Ms. Seema Parkash.
Role of funding source: This study was supported by grant R34 MH084673 of the National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA. BWP is an investigator with the Implementation Research Institute (IRI), at the George Warren Brown School of Social Work, Washington University in St. Louis; through an award from the National Institute of Mental Health (R25 MH080916-01A2) and the Department of Veterans Affairs, Health Services Research & Development Service, Quality Enhancement Research Initiative (QUERI). BNG receives funding from the NC TRACS Institute, which is supported by grants UL1RR025747, KL2RR025746, and TLRR025745 from the NIH National Center for Research Resources and the National Center for Advancing Translational Sciences, National Institutes of Health. This publication was made possible with help from the Duke University Center for AIDS Research (CFAR), an NIH funded program (P30 AI064518). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIMH or the NIH. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Conflicts of interest: The authors state that no relevant conflicts of interest exist.
Author contributions: BWP, BNG, JA, KW, RW, PN obtained funding and designed the study. BWP, BNG, JA, GT, RW, PN oversaw training and data collection. BWP and JKO analyzed the data. BWP, BNG, JA, JKO drafted the manuscript. DK reviewed the literature. GT, DK, RW, KW, PN reviewed the manuscript for important intellectual content. All authors contributed to and have approved the final manuscript.