Depression as a clinical and categorical diagnosis requires specific criteria: persistent depressed mood or anhedonia (i.e., loss of interest or pleasure), along with at least four additional symptoms including sleep problems, excessive feelings of guilt or worthlessness, loss of energy, concentration problems, appetite or weight changes, psychomotor retardation or agitation, and suicidality.(33) The diagnosis of depression is generally (and optimally) made after a face-to-face clinical interview by a trained mental health professional based on either ICD or DSM criteria. In the field, detection and diagnosis of depression often comprises two stages: screening and then referral for diagnosis by clinical assessment. Note, however, that symptoms of depression that do not meet diagnostic criteria also can be distressing and can affect health behaviors. Accordingly, depression can be examined continuously (i.e., in terms of symptom severity) as well as categorically (i.e., in terms of the presence or absence of a clinically significant diagnosis). Researchers should keep in mind that screening, diagnosing, and quantifying the symptoms of depression are somewhat different tasks, and that instrument choice should be driven by several considerations, including the population of interest, study hypotheses, and comparisons that the investigators might want to draw with other studies and populations.
In research settings, standardized diagnostic interviews are commonly used as the “gold standard” to assess the categorical diagnosis of depression based on DSM or ICD criteria. However, they are lengthy and usually require a clinician or an interviewer with extensive training to administer reliably. In international settings, the requirement of highly trained interviewers is particularly burdensome. Numerous depression screening instruments have been developed that do not require much time or a clinician to administer. They provide empirically based cut-offs useful as the basis for referrals to more comprehensive evaluations or to estimate the prevalence of possible depression. Symptom-rating scales offer a continuous measure of the extent of depressive severity. They usually do not require a clinician to administer; in fact, they are typically brief and easy to self-administer. They may or may not include a recommended cutoff for screening purposes. Symptom rating scales can be useful for monitoring change in depression symptoms over time.
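The difference between using a symptom-rating scale continuously and using it categorically can be illustrated with a minimal Python sketch. The function name, the nine items scored 0 to 3, and the cutoff of 10 are illustrative assumptions only; they are not the scoring rules of any particular instrument reviewed here, and investigators should follow each instrument's published scoring instructions.

```python
def score_scale(item_responses, cutoff=None):
    """Return the continuous total score and, if a cutoff is supplied,
    whether the respondent screens positive (None when no cutoff is used)."""
    total = sum(item_responses)
    screen_positive = None if cutoff is None else total >= cutoff
    return total, screen_positive


# Hypothetical responses to a 9-item scale, each item scored 0-3.
responses = [1, 2, 0, 3, 1, 2, 1, 0, 1]

# Continuous use: the total alone serves as a symptom-severity measure.
severity, _ = score_scale(responses)

# Categorical use: the same total compared with an illustrative cutoff.
total, flagged = score_scale(responses, cutoff=10)
```

The same item responses thus yield both a severity score for monitoring change over time and a screening flag for referral. The flag is not a diagnosis.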
In Table 1, we provide information on 23 instruments assessing depression, a sampling of the most commonly used and the potentially most useful. For each, we indicate whether it can be used for diagnostic, screening, or symptom-rating purposes; who can administer it; whether it has a specific cut-off for screening; whether it directly assesses suicidality; and examples of studies among HIV-positive populations in which it has been used.
Table 1. Brief Description of Diagnostic, Screening, and Symptom-Rating Measures of Depression
There are a variety of instruments that have been shown to be valid in primary care settings, although relatively few of these instruments have been evaluated specifically in PLWH. Williams and colleagues (2002) reviewed the literature from 1970 to 2000 seeking instruments tested in general primary care settings against a standard interview such as the Structured Clinical Interview (SCID) to make a criterion-based diagnosis of depression.(34) They found 28 studies using 11 different instruments, all of which performed acceptably. Of the measures, the BDI and the CES-D have perhaps the longest history of use in behavioral studies of HIV.(8) The BDI and PHQ are considered sensitive to clinically relevant change and are often used to measure outcomes in depression intervention research. Note there are several versions of the BDI. The original version was published in 1961 and revised in 1978 (BDI-1A). The current version, the BDI-II, was published in 1996 in response to the publication of the DSM-IV, which changed the criteria for the diagnosis of depression. The BDI-II can be separated into 2 subscales, an affective component and a somatic component. The BDI was developed to assess depression severity, not as a diagnostic instrument, but a 7-item version called the BDI-PC (PC for Primary Care) was introduced in 1997 that was intended for use in screening.
A number of studies have examined the diagnostic performance of depression screening instruments in PLWH,(35–40) four of which used diagnostic interviews as the basis for making a criterion diagnosis.(35–37, 39) One reason to specifically examine the diagnostic performance of an instrument in an HIV-specific group is that the somatic symptoms of depression can be difficult to disentangle from the disease-associated symptoms of HIV (e.g., fatigue, weight loss, poor concentration). For example, Voss and colleagues studied the relation between symptoms of depression and of fatigue in PLWH and found correlations of >0.60.(41) Kalichman et al. tried to separate somatic symptoms of HIV from depression symptoms using both the BDI and the CES-D, concluding that the clinical utility of both instruments was improved when these HIV-related somatic symptoms were removed.(40) Cockram and colleagues conducted a small pilot study among hospitalized HIV patients referred for depression (N = 34), comparing two self-administered symptom rating scales (BDI and CES-D) and two clinician-administered scales (HAMD and MADRS); they used a standard clinical interview to diagnose major depression by DSM-III-R criteria. In contrast to earlier reports in non-HIV populations,(42) investigators found all four scales performed equally well, despite the inclusion of somatic/vegetative symptoms in two of the scales.(39) This problem is in no way unique to HIV. Others have used a variety of methods to distinguish disease-related and depression-related somatic symptoms in the elderly and in cancer patients and have concluded that there is little advantage to attempting to do so.(43–46)
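Diagnostic performance of a screening instrument against a criterion diagnosis is conventionally summarized as sensitivity and specificity. The sketch below shows the arithmetic in Python. The function name and the eight paired observations are hypothetical and are not drawn from any of the studies cited above.

```python
def screening_performance(screen_positive, interview_diagnosis):
    """Compare screening results with criterion diagnoses from an interview.

    Both arguments are equal-length sequences of booleans, one pair per
    participant. Returns (sensitivity, specificity); a value is None when
    its denominator is zero.
    """
    tp = fp = fn = tn = 0
    for screened, diagnosed in zip(screen_positive, interview_diagnosis):
        if diagnosed and screened:
            tp += 1  # depressed and correctly flagged
        elif diagnosed and not screened:
            fn += 1  # depressed but missed by the screen
        elif screened:
            fp += 1  # not depressed but flagged
        else:
            tn += 1  # not depressed and correctly not flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical data: screening flag and interview diagnosis per participant.
screen = [True, True, False, True, False, False, True, False]
interview = [True, True, True, False, False, False, False, False]

sensitivity, specificity = screening_performance(screen, interview)
```

In this made-up sample the screen detects two of three interview-diagnosed cases (sensitivity 2/3) and correctly clears three of five non-cases (specificity 3/5). Removing somatic items from a scale would change the screening flags and therefore these values, which is the comparison at issue in the studies above.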
We would encourage investigators to continue to examine this question in their studies of PLWH. The size of this potential effect will likely be related to HIV stage and the presence of co-morbid conditions such as hepatitis C and complications of HIV disease such as opportunistic infections. Note the impact of somatic symptoms may differ depending on whether one is assessing overall levels of depression versus examining whether the symptom level of depression is associated with health behaviors or health outcomes. In testing a depression treatment, for example, one may not expect to see changes in the somatic items if they are related directly to HIV disease. However, if the depression treatment affects the cognitive/affective items on a scale, this would still be apparent in an overall change score. When the research focuses on the association between depression and other health behaviors or outcomes, it may be more important to isolate the cognitive/affective items, which may be uniquely related to the health outcomes of interest.
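One simple way to examine this question is to score the somatic and the cognitive/affective items separately as well as together. The following Python sketch assumes a hypothetical six-item scale; the item names and their assignment to the somatic subscale are illustrative assumptions and do not reproduce the item structure of the BDI, the CES-D, or any other published instrument.

```python
def subscale_scores(responses, somatic_items):
    """Split item scores into somatic and cognitive/affective subtotals.

    responses: dict mapping item name to numeric score.
    somatic_items: set of item names treated as somatic (an assumption
    the investigator must justify for the instrument in use).
    """
    somatic = sum(score for item, score in responses.items()
                  if item in somatic_items)
    cognitive_affective = sum(score for item, score in responses.items()
                              if item not in somatic_items)
    return {
        "somatic": somatic,
        "cognitive_affective": cognitive_affective,
        "total": somatic + cognitive_affective,
    }


# Hypothetical item scores for one participant.
participant = {
    "sad_mood": 2, "anhedonia": 1, "guilt": 2,
    "fatigue": 3, "appetite": 2, "sleep": 1,
}
scores = subscale_scores(participant, {"fatigue", "appetite", "sleep"})
```

Reporting the subtotals alongside the total lets the analyst check whether an association with a health behavior, or a treatment effect, is carried by the cognitive/affective items rather than by items that may reflect HIV disease.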
Researchers should remember that the screening and symptom-rating instruments that we note here (and that are described in more detail in the reviews cited(34, 47)) are not themselves diagnostic instruments. If the goal of the research study is to identify individuals with a clinical diagnosis of major depressive disorder, investigators should consider following up individuals who screen positive with a full diagnostic interview such as the SCID or Composite International Diagnostic Interview (CIDI), which has been validated with PLWH in both developed and developing countries.(48)
Given that the diagnostic performance of commonly used instruments is similar, more practical considerations should guide decisions of which instrument to use. These include the length of the instrument; whether it measures only depression or other mental health concepts (e.g., anxiety, substance use) that might be relevant to a particular study; if it assesses specific aspects of depression, such as suicidality or somatic symptoms separately; how user friendly the instrument is in terms of reading level, response formats, need for training to administer; and whether it can be used to assess depression severity and treatment response in addition to providing a diagnosis of depression. For example, the HAM-D and the BDI both emphasize somatic symptoms of depression (e.g., fatigue) more than some other instruments. Depending on the clinical population, the outcomes of interest, and the comparisons of interest, such a somatic emphasis could be either a positive or a negative factor in an investigator’s instrument choice. Researchers also might want to consider how often the instrument has been used in the area of HIV, which could bolster the acceptability of the approach and facilitate comparison of results.
Several other practical and conceptual factors may influence instrument selection. For example, the BDI uses different response options for each item, which increases the time of administration, makes it difficult for individuals with impaired cognitive or attentional abilities, and complicates telephone administration. The CES-D, which was designed for telephone administration, has 20 items with the same response format, but not all versions are DSM-IV compatible (i.e., early versions do not include items that map directly onto DSM symptoms). In contrast, the depression scale of the PRIME-MD PHQ (i.e., PHQ-9) has only 9 items (one for each of the nine DSM-IV depression symptoms) and is therefore relatively simple to administer. It also has undergone preliminary performance testing in PLWH (38, 49) and its psychometric properties among HIV-infected persons have recently been assessed.(50) While the PHQ-9 had high reliability and validity, suggesting it is appropriate for use with individual HIV-positive patients, differences in the interpretation of questions across subgroups of the population (e.g., differing by race, gender, age) indicate it may not be ideal for longitudinal studies across groups.(50)
One consideration worthy of mention with respect to depression assessment and its relation to health behaviors has to do with a potential for a curvilinear association. For example, some health behaviors such as adherence likely have a linear relation with depression: the more severe the depression, the greater the impact on the behavior. With other behaviors, such as sexual risk-taking, the relation with depression may be curvilinear.(51) In this respect, individuals with increasing depression may be proportionately more risky, except during severe depression, during which individuals may experience a loss of interest in sex and, hence, be essentially abstinent. In selecting the most appropriate depression instrument, researchers should, therefore, consider the potential pattern of association between depression and the health behavior of interest. If a curvilinear relation is possible, a continuous measure of depression is required.
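The following Python sketch shows why a dichotomized depression measure can hide a curvilinear pattern that a continuous score preserves. The data are fabricated purely for illustration, and the score bands and the cutoff of 10 are arbitrary assumptions. This is a descriptive check only, not a formal test; in practice a regression model with a squared depression term is one common way to test for curvature.

```python
from statistics import mean


def mean_by_band(scores, outcomes, bands):
    """Mean outcome within each inclusive (low, high) depression-score band."""
    result = {}
    for low, high in bands:
        values = [y for s, y in zip(scores, outcomes) if low <= s <= high]
        result[(low, high)] = mean(values) if values else None
    return result


# Fabricated data: depression scores (0-27) and a count of risk behaviors.
depression = [2, 4, 6, 9, 11, 13, 15, 17, 21, 23, 25, 26]
risk_acts = [1, 1, 2, 2, 4, 5, 5, 4, 1, 1, 0, 1]

# Categorical view: below versus at or above an illustrative cutoff of 10.
binary = mean_by_band(depression, risk_acts, [(0, 9), (10, 27)])

# Continuous score grouped into three severity bands.
banded = mean_by_band(depression, risk_acts, [(0, 9), (10, 19), (20, 27)])
```

With these made-up values, the binary split shows only a modest difference (means of 1.5 and 2.625), whereas the three bands show risk rising in the middle range (4.5) and falling at the highest severity (0.75). Only the continuous score makes that inverted-U pattern recoverable.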