We have provided extensive details on tool psychometrics, as well as details on types of tools and extent of validation, to guide clinicians’ own choice of an assessment instrument for routine emotional distress screening. Making recommendations about which screening tools should be used depends on the context in which tools are going to be implemented and the intended objectives that may vary across settings and users. The following recommendations were based on composite quality criteria that we defined using transparent decision rules.
Among ultrashort measures, the two-item combination depression questions had the best psychometric properties. The widely used DT had been subjected to the most validation studies on the largest patient samples but was not validated against a structured clinical interview with established sufficient psychometrics. For the DT, the sensitivity and specificity findings were lower than 80% in about half and two-thirds, respectively, of the validation studies. However, some evidence suggests that modifications of the DT, such as the Mood Thermometer (33), or expansions, such as the Impact Thermometer (32), may represent improvements over the original scale.
Our findings regarding ultrashort measures differ in part from the results of other meta-analyses and reviews on screening tool validity. Meta-analyses (16) as well as studies in primary care (151) have demonstrated a lack of specificity in ultrashort measures (including the DT) for identifying depression. However, our results reveal that this criticism does not apply to the combination depression questions as these were found to demonstrate high specificity.
When it comes to ultrashort measures, patients have reported that a single-item interview format did not accurately describe or capture their mood (38). In line with these findings, Ohno et al. (153) reported that 65% of patients responded to the question “Are you depressed or not?” with “neither,” which indicates their uncertainty when rating emotional distress with such a simple question, even though their HADS scores suggested that they had clinical depression. Furthermore, agreement between ultrashort and longer measures in identifying distressed patients detected by structured clinical interviews was poor (115). Problems with determining the face validity of single-item measures as well as patients’ difficulty with scaling on single-item screening tools could explain these discrepant findings. Consequently, further comparison studies investigating tools of different lengths should be conducted.
Among the short measures, we can recommend the CES-D as a screening tool for depression because it met all criteria for quality. The most extensive validation existed for the HADS, and this was the case across disease types and stages as well as across languages and cultures. The scale has been extensively tested against criterion standards.
Note that many other tools relied on the HADS for discriminant validation. Studies that compared the discriminant validity of the HADS against other scales found that the HADS was superior (26) or equivalent (65) to other measures. With regard to whether to use the total score or the subscale scores of the HADS, several studies showed that the total score was superior in nonpsychiatric patients (49).
The BSI-18 and the GHQ-12 are short measures that also demonstrated good psychometric properties. Nevertheless, ROC analyses of the BSI-18 were based on comparisons of the short form with the long form of the same instrument and do not, therefore, represent independent validation (52). In addition, the GHQ-12 consistently performed worse than the HADS (62). Nonetheless, both scales have also been used as criterion measures in validation attempts of other scales.
The Post Traumatic Stress Disorder Checklist-Civilian Version, the Psychological Distress Inventory, and the Hornheide Questionnaire Short Form are short measures that demonstrated adequate psychometric properties. However, their use to date is limited to specific cancer types or language applications. For patients receiving palliative care, the Edinburgh Postnatal Depression Scale or its short form, the six-item Brief Edinburgh Depression Scale, demonstrated adequate psychometric properties. Because of the strong psychometric properties of the PHQ-9 in large samples of primary care and obstetrics and gynecology patients (132), this scale deserves further empirical evaluation of its value for distress screening of cancer patients.
Among the long measures, the BDI and the GHQ-28 met all quality criteria. The Psychosocial Screen for Cancer has not been validated against a structured clinical interview but otherwise met all criteria. In addition, the Psychosocial Screen for Cancer provides information on the social support that a patient desired and actually received, which may also guide decision making in psycho-oncological follow-up. The Questionnaire on Stress in Cancer Patients–Revised was validated in a large sample of cancer patients and provided good psychometric properties. The existing English version of the scale, therefore, deserves recommendation as a screening tool for emotional distress in cancer patients. Finally, the RSCL is a long measure that demonstrated adequate psychometric properties for distress screening.
Cancer-specific tools may provide more relevant information than generic scales on patients with a specific type of cancer; however, some of these tools, such as the Memorial Anxiety Scale for Prostate Cancer (97), require additional validation. Furthermore, the routine use of cancer-specific tools is particularly likely to be implemented in specialized centers such as those that treat breast or prostate cancer patients. Facilities that treat patients with a broader disease spectrum may benefit most from a screening tool that can be applied to a mixed patient population, such as well-established scales including the BDI, the CES-D, the GHQ-28, or the HADS. Furthermore, the use of a scale that assesses anxiety as well as depressive symptoms, such as the BSI-18, GHQ-28, the HADS, the Psychosocial Screen for Cancer, or the RSCL, may prevent anxiety disorders from being overlooked within a routine screening program.
We argue that, depending on the physical condition of the patients and the treatment setting, relatively short tools should be used for the screening of palliative care patients or patients who are undergoing strenuous treatment. Furthermore, the use of shorter tools for routine screening in an inpatient setting is easier to justify and implement. By contrast, patients who have completed treatment, have follow-up appointments, or are attending rehabilitative care may have more physical resources (eg, compared with patients under chemotherapy treatment or palliative care patients) and more time to complete longer questionnaires. Moreover, cancer patients who are undergoing treatment may require immediate psychological support, whereas cancer survivors may need to adapt to the disease in the long term. For the latter patients, a more extensive psychological assessment seems to be needed.
Although single-item interviews may have a useful role in assessing distress in palliative care patients by minimizing patient burden, it is also true that somewhat longer scales may have higher content validity and may be better suited for longitudinal assessments. Future research should compare the accuracy and appropriateness of tools of differing lengths in specific treatment settings.
Choosing a tool for routine screening of cancer patients requires a trade-off between a measure with adequate psychometric properties and one with a reasonable length. It has been shown that computerized versions of screening instruments that use touch screen technology can be used successfully, including by older patients (155). The use of fully computerized touch screen and autoscoring technology minimizes the workload of oncology treatment personnel, further reduces costs, and ensures the continuity and standardization of its application.
The usefulness of a screening program for emotional distress can be evaluated according to whether or not screened patients accept referral to a mental health professional. Shimizu et al. (156) found that neither patient demographic variables nor the level of physical functioning, disease stage, or treatment status was associated with acceptance of a referral by the patient, whereas level of distress was, thus providing evidence that screening for emotional distress can result in enhanced utilization of psychological treatment. Compared with structured clinical interviews, distress screening instruments tend to overestimate the prevalence rates of depressive disorders in cancer patients (116). In this regard, measures that have superior psychometric properties may, therefore, reduce the workload of psycho-oncology staff and allow for the accurate forecasting of resource needs. When clinic staff, alone or in cooperation with researchers, want to undertake distress tracking over time to assess treatment outcomes and/or learn more about adjustment processes longitudinally, then ultrashort screening tools tend to fall short because they lack a range of scores. Only the longer versions of measures can accomplish such objectives.
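The link between a tool's psychometric properties and staff workload can be illustrated with simple arithmetic. The following is a minimal sketch, not a procedure used in this review: the function name `expected_screen_results` and all input values (clinic size, prevalence, sensitivity, specificity) are hypothetical and chosen only for illustration.

```python
def expected_screen_results(n_patients, prevalence, sensitivity, specificity):
    """Expected screening outcomes for a clinic, given assumed test properties.

    All arguments are illustrative assumptions, not values from this review.
    """
    cases = n_patients * prevalence            # patients who truly have the disorder
    non_cases = n_patients - cases             # patients who do not
    true_pos = cases * sensitivity             # cases correctly flagged
    false_pos = non_cases * (1 - specificity)  # non-cases incorrectly flagged
    return {
        "screen_positive": true_pos + false_pos,
        "true_positive": true_pos,
        "false_positive": false_pos,
    }


# Hypothetical clinic: 1000 patients screened, 20% true prevalence, sensitivity 0.80.
for spec in (0.70, 0.90):
    result = expected_screen_results(1000, 0.20, 0.80, spec)
    print(
        f"specificity={spec:.2f}: "
        f"{result['screen_positive']:.0f} screen-positive, "
        f"{result['false_positive']:.0f} of them false positives"
    )
```

With these made-up inputs, raising specificity from 0.70 to 0.90 lowers the expected number of screen-positive patients from about 400 to about 240, while the number of true cases detected stays at about 160. The difference consists entirely of false positives that would otherwise require follow-up assessment.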
Several limitations of this systematic review must be noted. Some validation studies or measures could have been overlooked because only peer-reviewed articles were included in this review. On the other hand, the scientific accuracy of such studies or measures would have remained unclear because of their lack of peer review. Furthermore, we only included validation studies that provided information on construct validity, discriminant validity, and/or concurrent validity for at least one additional measure, and we excluded feasibility studies that only reported on the measure itself or on a translation of the measure. Many of the included studies reported on only limited aspects of validation. Of these, several described results of factor analyses, as well as subscale and total scale reliabilities, whereas others provided data from ROC analyses without information on reliability. Also, many included studies did not provide sufficient descriptive statistics to allow us to compute missing indices of sensitivity, specificity, positive predictive values, and negative predictive values. Consequently, the conclusions we draw in this review depend on the information given in the original reports. However, the strength of a systematic review is that it provides a broader scope than meta-analyses, which typically combine studies of varying types and consequently provide only summary statistics. Hence, this systematic review is, to our knowledge, the most comprehensive review to date that addresses a broad range of screening tools, varying types of cancers, and disease stages.
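To show what "sufficient descriptive statistics" means in practice, the four indices named above follow directly from the cell counts of a 2 × 2 table that cross-classifies screening result against the criterion standard. The sketch below uses the standard definitions; the function name `diagnostic_indices` and the counts are hypothetical and do not come from any study in this review.

```python
def diagnostic_indices(tp, fp, fn, tn):
    """Standard accuracy indices from the four cells of a 2 x 2 table.

    tp: screen-positive, criterion-positive   fp: screen-positive, criterion-negative
    fn: screen-negative, criterion-positive   tn: screen-negative, criterion-negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # proportion of true cases flagged
        "specificity": tn / (tn + fp),  # proportion of non-cases cleared
        "ppv": tp / (tp + fp),          # proportion of positives that are true cases
        "npv": tn / (tn + fn),          # proportion of negatives that are non-cases
    }


# Hypothetical validation sample of 200 patients, 50 of whom are criterion-positive.
indices = diagnostic_indices(tp=40, fp=30, fn=10, tn=120)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

When a report gives only some of these quantities without the sample size and the number of criterion-positive patients, the remaining cells cannot be reconstructed, and the missing indices cannot be computed.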
In conclusion, several generic and newly developed cancer-specific instruments meet high-quality criteria for use in emotional distress screening of cancer patients. Many general emotional distress screening tools focus on depression. Nonetheless, highly prevalent transient anxiety or mixed emotional disorders that occur during the cancer diagnosis and treatment trajectory deserve the attention of clinicians. Hence, the exclusive use of a depression scale may overlook other disorders (eg, anxiety disorders). Consequently, a scale that measures mixed emotional states rather than depression only has clear merit for clinical practice.
Apart from purely psychometric considerations, large-scale implementation of screening for emotional distress may not occur if a given test has to be purchased for each use. This factor alone may have an impact on the choice of a screening tool, given that some well-validated screening tools have to be purchased for every use, whereas others are available at no cost. Another useful criterion for deciding which tool to use is the treatment setting. For example, treatment centers that specialize in breast or prostate cancer may prefer to use disease-specific measures.
In terms of actual decision making, it is important to recognize that a measure's sensitivity and specificity are a function of the cutoff that is used to distinguish anxious or depressed patients from nonanxious or nondepressed patients. Higher cutoffs improve the measure's specificity, and treatment facilities can decide upfront, by consciously choosing a specific cutoff, the amount of psychological and psychotherapeutic follow-up treatment they are willing to or can provide. Given that we were able to find a large number of well-executed validation studies on distress screening tools, we raise the question of whether the development of additional tools should be discouraged at this time to avoid redundancy. However, it may be worthwhile to initiate additional validation work on the tools that have good psychometric properties but that have not yet been validated against criterion standards.
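The dependence of sensitivity and specificity on the cutoff can be seen in a small worked example. This is a minimal sketch under stated assumptions: the scores and criterion diagnoses are invented, the function name `sens_spec_at_cutoff` is ours, and a score at or above the cutoff is assumed to count as screen-positive.

```python
from typing import List, Tuple


def sens_spec_at_cutoff(
    scores: List[int], is_case: List[bool], cutoff: int
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff counts as screen-positive."""
    pairs = list(zip(scores, is_case))
    tp = sum(1 for s, c in pairs if c and s >= cutoff)
    fn = sum(1 for s, c in pairs if c and s < cutoff)
    tn = sum(1 for s, c in pairs if not c and s < cutoff)
    fp = sum(1 for s, c in pairs if not c and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Invented questionnaire scores for 10 patients and their criterion diagnoses.
scores = [3, 5, 7, 8, 9, 11, 12, 14, 15, 18]
is_case = [False, False, False, True, False, False, True, True, True, True]

for cutoff in (8, 11, 15):
    sens, spec = sens_spec_at_cutoff(scores, is_case, cutoff)
    print(f"cutoff >= {cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this invented sample, moving the cutoff from 8 to 11 to 15 raises specificity from 0.60 to 0.80 to 1.00 while sensitivity falls from 1.00 to 0.80 to 0.40. A facility with limited follow-up capacity might therefore accept a higher cutoff, knowing that more true cases will be missed.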
Worthy of note is an ongoing National Institutes of Health project, the Patient-Reported Outcomes Measurement Information System network (http://www.nihpromis.org/default.aspx), which aims to improve measures of patient-reported outcomes. A number of tools for the assessment of emotional distress in patients with chronic diseases are being developed within this network and may prove useful as screening tools for emotional distress in cancer patients in the future.
Empirical findings published to date do not allow us to judge the predictive validity of screening tools for emotional distress. Nonetheless, the screening tools recommended here are effective for routine screening of emotional distress based on their high sensitivity and specificity. However, further information is needed about how screening affects long-term outcomes and patient quality of life.