To identify psychometrically sound measures of outcomes in end-of-life care and to characterize their use in intervention studies.
English language articles from 1990 to November 2005 describing measures with published psychometric data and intervention studies of end-of-life care.
Systematic review of end-of-life care literature.
Two reviewers organized identified measures into 10 major domains. Eight reviewers extracted and characterized measures from intervention studies.
Of 24,423 citations, we extracted 200 articles that described 261 measures, accepting 99 measures. In addition to 35 measures recommended in a prior systematic review, we identified an additional 64 measures of the end-of-life experience. The most robust measures were in the areas of symptoms, quality of life, and satisfaction; significant gaps existed in continuity of care, advance care planning, spirituality, and caregiver well-being. We also reviewed 84 intervention studies in which 135 patient-centered outcomes were assessed by 97 separate measures. Of these, 80 were used only once and only eight measures were used in more than two studies.
In general, most measures have not undergone rigorous development and testing. Measure development in end-of-life care should focus on areas with identified gaps, and testing should be done to facilitate comparability across the care settings, populations, and clinical conditions. Intervention research should use robust measures that adhere to these standards.
For the vast majority of Americans, the end of life includes a prolonged experience of chronic progressive disease, often associated with uncertainty, pain, suffering, and cost (Hogan et al. 2000; Lunney et al. 2003; National Consensus Project 2004; Teno 2005). For these reasons, coupled with an expansion of knowledge and research in managing and improving the dying experience, end-of-life care has been designated as a priority area for quality improvement, policy change and implementation, and cost effectiveness of work (Institute of Medicine 1997; Institute for the Future 2003; National Consensus Project 2004). In order to address end-of-life care, clinicians, administrators, researchers, and policy makers need to be able to evaluate the patient and caregiver experience.
However, end-of-life care is not a singular entity with a universal experience and an agreed-upon course of care. Prognostication is poor and imprecise for most conditions, and patients and caregivers cope with illness for years, often with different trajectories of functional decline and needs (Lynn et al. 1997; Steinhauser et al. 2000, 2001; Patrick, Engelberg, and Curtis 2001; Lunney et al. 2003; Patrick et al. 2003; National Consensus Project 2004). The differing patterns of decline, as well as variable factors such as age, race, gender, culture, and preferences, suggest the need for diverse measures to assess the full spectrum of patients' and caregivers' end-of-life experience. From the health care system and policy perspective, it is also useful to consider settings of care: nursing homes, hospitals, home care, and hospice all play important roles and face different challenges in delivering care.
Substantial consensus derived from expert opinion and confirmed through nationally representative surveys of patients, families, and providers has established the principal domains to guide evaluation of the end-of-life experience (Hanson, Danis, and Garrett 1997; Lynn et al. 1997; Singer, Martin, and Kelner 1999; Steinhauser et al. 2000, 2001; Patrick, Engelberg, and Curtis 2001; Patrick et al. 2003; Wenrich et al. 2003; National Consensus Project 2004; Teno et al. 2004). Patients and families endorse important concerns including the extent to which care addresses pain and other physical and emotional symptoms, advance care planning, continuity of care, spiritual well-being, practical support for caregiving, and overall satisfaction. End-of-life care and palliative care have evolved over the last two decades to apply increasingly rigorous scientific methods to assess outcomes in these domains (National Consensus Project 2004; Teno 2005). Developing an adequate evidence base for improving end-of-life care requires reliable and valid measures of the patient and caregiver experience that also allow for comparability across the important care settings, populations, and clinical conditions.
As part of a systematic review of end-of-life care and outcomes, we identified measures that evaluate domains of the patient and caregiver experience and described their psychometric properties. Our objectives were to characterize the measures available for evaluating end-of-life care and to characterize their use in the highest quality intervention studies conducted in the field.
We conducted a systematic review of end-of-life care outcome measures in response to an Agency for Healthcare Research and Quality (AHRQ) task order for an evidence-based review of end-of-life care and outcomes for a National Institutes of Health State-of-the-Science conference (Lorenz et al. 2004, 2005). Our initial literature search (1990 to September 2004) included Medline, the Database of Abstracts of Reviews of Effects (DARE), and the bibliography of the National Consensus Project (NCP) for palliative care (National Consensus Project 2004). Major search terms included: palliative care, death, terminal care, hospice care, dying, end of life, and limited life; the comprehensive search strategy is described in detail elsewhere (Lorenz et al. 2004, 2005). We limited searches to English-language citations involving human subjects and excluded case reports.
Exclusion criteria included: citations that only addressed a population age 18 and under; case studies with fewer than 30 cases; reports that did not consider palliative care; studies of non-Western populations; nonsystematic reviews; clinical trials of chemotherapy, radiotherapy, stents, laser treatment, endoscopy, or surgery; descriptions of ethical, legal, or regulatory issues; descriptions of research processes; editorials, histories, personal narratives, and other descriptive nonclinical articles; articles about professional education; articles about organ transplantation or donation; articles that presented data only from before the mid-1980s; and studies in which the outcomes were lab or radiological tests or other physiological indicators. The inclusion and exclusion criteria were selected to focus the review on our task order questions. We were advised periodically in these choices by a technical expert panel that comprised an international group of experts in palliative care; more details of our methods are available (Lorenz et al. 2004, 2005) and the full report is available at http://www.ahrq.gov/clinic/epcsums/eolsum.htm.
We conducted a three-stage process, first screening titles, then abstracts, and then abstracting data from accepted citations. Efforts to standardize the review included the use of a training set, weekly team discussions, and the use of standardized review and abstraction forms. Because of the large volume of citations, we employed single review with one of the PIs (K. L.) serving as a gold-standard reviewer of random subsets of titles and abstracts. Eight reviewers, all with clinical and/or research backgrounds in palliative care, worked in teams organized by domain. Review teams evaluated intervention studies from 1990 through September 2004 and abstracted the outcomes and associated measures used by these reports. Intervention studies were also characterized in terms of the diseases, stage of illness, and patient populations they addressed. Two reviewers with expertise in systematic reviews (P. S., S. M.) characterized the methodological quality of all systematic reviews and only high-quality reviews were included.
Two reviewers (R. M., S. D.) descriptively characterized the number and quality of measures, organizing them according to domains of interest by using a modified approach from a prior review of measures of end-of-life care, the Toolkit of Instruments to Measure End-of-Life Care (National Consensus Project 2004; Teno 2005). A major consideration in accepting the Toolkit as the starting point for our review was its widespread use and acceptance by the field. We went beyond it by focusing on more specific measures of palliative or end-of-life care. In addition, we accepted only reports that included some description of psychometric properties that assess aspects of validity, reliability, or responsiveness. We characterized measures according to the care settings, clinical conditions, and populations in which they have been tested. We updated the literature search for measures from September 2004 through November 2005. Descriptive statistics were used to explore gaps in measures by domain. We highlight (see Appendix A) several measures that are most robust and useful on the basis of the strength of existing psychometric testing, the conceptual basis for their development, and current use and/or potential for acceptance in the field of palliative care.
We also characterized the use of these measures in the intervention studies accepted in the review. We were interested in the extent to which measures used in the highest quality end-of-life research were generic or developed for use in late-life populations, the extent to which similar measures were used across the studies to facilitate comparability, and the degree to which use of similar measures in different populations allowed for comparisons of the end-of-life experience across important populations, clinical conditions, and settings of care.
We examined 21,245 titles identified through literature searches and an additional 3,178 titles from other sources, of which 6,381 were considered possibly relevant to our topic areas and continued to abstract review (see Figure 1 for article flow). Of these, 921 met inclusion criteria for the systematic review, including an additional 10 measure articles obtained from the update; 200 articles described 261 measures that were potentially applicable to end-of-life care measures. We accepted 152 reports and one comprehensive review (Toolkit of Instruments to Measure End-of-Life Care or “Toolkit”) (Teno 2005). We identified 99 measures of the end-of-life experience that had published psychometric properties.
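The counts quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the authors' methods; the variable names are illustrative assumptions, and the reading that the 99 accepted measures are the 35 Toolkit measures plus the 64 additional ones is inferred from the figures reported elsewhere in the article.

```python
# Figures quoted in the text; variable names are illustrative assumptions.
titles_from_searches = 21_245
titles_from_other_sources = 3_178
total_citations = titles_from_searches + titles_from_other_sources

# Assumed reading: accepted measures = Toolkit-recommended + newly identified.
toolkit_measures = 35
additional_measures = 64
accepted_measures = toolkit_measures + additional_measures

possibly_relevant = 6_381
share_to_abstract_review = possibly_relevant / total_citations

print(total_citations)                    # 24423
print(accepted_measures)                  # 99
print(f"{share_to_abstract_review:.1%}")  # 26.1%
```

Under these assumptions, roughly a quarter of the screened titles continued to abstract review.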
The Toolkit (Teno 2005) is a published review of over 928 articles identified from 1967 through 2000 which selected 293 measures as potentially relevant to end-of-life care research and recommended measures based on the following characteristics: (1) measures were patient-focused, family centered, clinically meaningful, and manageable in their application; (2) measures demonstrated reliability, validity, and responsiveness; (3) measures were user-friendly and relevant to quality evaluation and improvement; (4) measures incorporated both the patient and family perspectives; and (5) measures examined both the process and the outcomes of care. The Toolkit recommended 35 measures (see Table 1) across the spectrum of end-of-life domains that are extensively described at http://www.chcr.brown.edu/pcoc/toolkit.htm. Our search strategy sought only measures with psychometric properties described in end-of-life care research and thus differs in scope and result from the Toolkit.
We identified 64 additional measures of the end-of-life experience that supplement the end-of-life care instruments included in the Toolkit review. The measures are summarized alphabetically in Appendix A and are described briefly below, organized by primary domain of measurement. Multidimensional measures of quality of care are included in the domain of satisfaction and quality of care. Based on our implicit process criteria, we recommend three measures for wide general use: the Quality of Life at End of Life (QUAL-E) instrument, the Quality of Dying and Death (QODD) instrument, and the Palliative Care Outcome Scale (POS).
We identified 10 measures of QOL in addition to the four Toolkit recommended measures (see Appendix A), including four tested specifically in end-of-life care populations: QUAL-E (Steinhauser et al. 2002), Life Evaluation Questionnaire (LEQ) (Salmon, Manzi, and Valori 1996), Palliative Care Quality of Life Instrument (PQLI) (Mystakidou et al. 2004), and the Brief Hospice Inventory (Guo et al. 2001). The QUAL-E measure consists of 24 items across five domains: life completion, relationships with the health care system, preparation/anticipatory concerns, symptom impact, and connectedness and affective social support (Steinhauser et al. 2002). The LEQ (Salmon, Manzi, and Valori 1996) is a self-administered, 121 item measure across five subscales: freedom, appreciation of life, contentment, resentment, social integration. The PQLI includes 28 items in scales that include: overall QOL, function, symptom, choice of treatment, and psychological (Mystakidou et al. 2004). The Brief Hospice Inventory addresses similar domains to other measures, but was developed specifically for the hospice setting (Guo et al. 2001). Other measures of QOL include: the Hebrew Rehabilitation Center for the Aged Quality of Life (HRCA-QL) index (an adapted version of the Spitzer Quality of Life Index for patients with advanced cancer) (Llobera et al. 2003), the Brief Scale (also adapted from the Spitzer Quality of Life Index) (Abratt and Viljoen 1995), the McMaster Quality of Life Scale (Sterkenburg, King, and Woodward 1996), Kansas City Cardiomyopathy Questionnaire (Green et al. 2000), a five-item Linear Analog Scale (LAS) for cancer patients (Giorgi et al. 1996), and the Duke–UNC Social Support Scale (Herndon et al. 1997).
Five measures were recommended by the Toolkit for assessing either pain or overall symptoms. For the Toolkit measures, we identified two additional validation trials for the Memorial Symptom Assessment Scale (Chang et al. 1998; Tranmer et al. 2003) and two for the Edmonton Symptom Assessment System (see supplemental data to Table 1) (Chang, Hwang, and Feuerman 2000; Chow et al. 2001). We identified 10 additional measures that are predominantly intended to assess physical symptoms.
The Cambridge Palliative Assessment Schedule—revised (CAMPAS-R) was developed for assessing a range of symptoms in primary care settings (Ewing et al. 2004). The Symptom Monitor is a 10-item diary for physical symptoms, developed for feasibility in patients with advanced illness (Hoekstra et al. 2004). We identified two validation reports for the Lung Cancer Symptom Scale (LCSS) (Hollen et al. 1993, 1994). The measure uses nine patient-scored visual analog scales and six observer-scored items to measure symptoms prevalent in lung cancer. Normative data and trends for health-related QOL in stages III and IV lung cancer are available for the LCSS (Hollen et al. 1999). The LCSS has also been revised and psychometrically evaluated for use in patients with mesothelioma as the LCSS-meso (Hollen et al. 2004). Sarna and Brecht (1997) applied the Symptom Distress Scale (SDS) to female lung cancer patients. Hopwood, Howell, and Maguire (1991) conducted a validation appraisal of the Rotterdam Symptom Checklist (RSCL) in patients with breast cancer. Parshall et al. (2001) report the Dyspnea Descriptor Questionnaire in heart failure patients evaluated in an emergency department. The Edmonton Staging System (revised, rESS) is a clinician assessment tool designed for classifying cancer pain (Fainsinger et al. 2005).
We identified three measures for symptom assessment in dementia. The Pain Assessment in Advanced Dementia (PAINAD) is a five-item observer assessment (Warden, Hurley, and Volicer 2003). Volicer, Hurley, and Blasi (2001) report the evaluation of two symptom scales for dementia patients: the Symptom Management at the End of Life in Dementia (SM-EOLD), a nine-item scale, and the Comfort Assessment in Dying With Dementia (CAD-EOLD), a 14-item scale with four subscales (physical distress, dying symptoms, emotional distress, and well-being).
We identified eight reports describing measures predominantly addressing emotional and cognitive symptoms in addition to five measures recommended in the Toolkit. Two relate specifically to cognitive symptoms: the Communication Capacity Scale is a five-item clinician rating scale developed for palliative care populations (Morita et al. 2001) and the Cornell Scale for Depression in Dementia (CSDD) is a 19-item clinician interview that was tested in nursing homes (Kurlowicz et al. 2002).
The Structured Interview for Symptoms and Concerns briefly addresses 13 emotional and physical symptoms in palliative care patients (Wilson et al. 2004). Other measures that may be applicable to the palliative population include: the Mental Adjustment to Cancer scale (revised as the G-MAC) (Mystakidou et al. 2005), the Demoralization Scale (Kissane et al. 2004), the Agitation Distress Scale, evaluated in a mixed cancer population (Morita et al. 2001), and the Hospital Anxiety and Depression Scale (HADS), evaluated in breast cancer patients (Hopwood, Howell, and Maguire 1991). We identified one study evaluating a single-item screening for depression (“Are you depressed?”) in terminally ill patients (Chochinov et al. 1997), as well as a study comparing this question to a visual analog scale for depression and the Edinburgh depression scale in a terminally ill population (Lloyd-Williams, Dennis, and Taylor 2004).
We identified four measures in addition to six recommended measures from the Toolkit within this domain. Two reports described refinements to the Toolkit measure, the Edmonton Functional Assessment Tool (EFAT), called the EFAT-2 (Kaasa et al. 2000; Kaasa and Wessel 2001). The EFAT-2 is a 10-item health-professional rating of symptoms and functional status. Two other reports describe measures for the frail elderly that may be relevant to an end-of-life population. The 54-item Physical Disability Index (PDI) requires calibrated specialized performance measuring equipment (Gerety et al. 1993), and the 19-item Frail Elderly Functional Assessment Questionnaire (FEFA) was designed for the elderly with very low activity levels (Gloth et al. 1995).
We identified five additional measures related to advance care planning that supplement the Toolkit measure. Schwartz et al. (2004) developed a revised version of the Emanuel and Emanuel Medical Directive, including two goals and four treatment preferences for each of six hypothetical scenarios. Discriminant validity testing in a seriously ill population found significant differences between patients who had chosen hospice and those who had not. Another scenario-based measure, the Willingness to Accept Life-sustaining Treatment instrument (WALT) (Fried, Bradley, and Towle 2002), demonstrated significant associations with a simpler measure of preference and with functional status. Koedoot et al. (2001) found that a widely used instrument for evaluating decision-making processes, the Decisional Conflict Scale (DCS), had moderate reliability and validity in a population including a group making decisions about palliative chemotherapy. Additional measures included a 21-item questionnaire assessing families' attitudes, perceptions, and patterns of choice in the management of terminal cancer patients (Mystakidou et al. 2002) and a four-item patient assessment of the quality of end-of-life communication (Curtis et al. 1999), which was correlated with whether clinicians knew if a patient had a durable power of attorney in an HIV population.
We identified no measures for evaluating continuity of care beyond the four recommended in the Toolkit.
We identified two measures that supplement the six recommended in the Toolkit. The 45-item Life Closure Scale was developed to measure psychological adaptation in the dying and was tested in hospice patients (Dobratz 1990). The Santa Clara Strength of Religious Faith Questionnaire (SCSORF) (Sherman et al. 2001) is a 10-item scale that may be applicable to end-of-life populations but has not been tested in patients with advanced illness.
We identified six measures in addition to two from the Toolkit. The Hogan Grief Reaction Checklist (HGRC) (Hogan, Greenfield, and Schmidt 2001) is a 61-item measure across six constructs (despair, panic behavior, blame and anger, disorganization, detachment, and personal growth). The 17-item Core Bereavement Items (CBI) (Burnett et al. 1997), developed from the Bereavement Phenomenology Questionnaire, demonstrated both time and group effects in discriminant validity testing (Kissane, Bloch, and McKenzie 1997). Evaluation of the 102-item Grief Experience Inventory (GEI) (Feldstein and Gemma 1995) showed significant differences between bereaved and nonbereaved groups (p<.001). Testing of an eight-item adaptation of the Bereavement Risk Index found significant differences over a 25-month period between high- and low-risk groups (Robinson et al. 1995). The Grief Evaluation Measure (Jordan et al. 2004), including 58-item experiences and 33-item problems sections, was developed as a predictive measure for complicated grief; scores were correlated with related measures such as the Inventory of Traumatic Grief.
We identified one additional validation study for the FAMCARE scale recommended in the Toolkit (Kristjanson et al. 1997) and 16 additional measures in this domain that supplement the four in the Toolkit. Five instruments predominantly measure quality of care or quality of the experience in the end-of-life population. The QODD measure, a 31-item family after-death interview across six domains (Patrick, Engelberg, and Curtis 2001; Curtis et al. 2002), was evaluated in the intensive care unit (Mularski et al. 2004) and as a shorter 14-item nursing version (Hodde et al. 2004).
The 12-item POS (Hearn and Higginson 1998; Bausewein et al. 2005), developed for hospice patients, includes both patient self-administered and staff-assessment items. The Concept of a Good Death measure includes 17 descriptive statements in three subscales, closure, personal control, and clinical criteria, and showed small to moderate associations with other measures (Schwartz et al. 2003). The Resident Assessment Instrument for Palliative Care (RAI-PC) is a nursing home clinician assessment tool adapted for palliative care (Steel et al. 2003). Carson, Fitch, and Vachon (2000) report validity and reliability testing of the Support Team Assessment Schedule (STAS), an instrument for palliative care cancer support team patient assessment, in acute care oncology and palliative care units.
Six measures predominantly assess satisfaction at the end of life. The Quality of End-of-Life Care and Satisfaction with Treatment (QUEST) measure includes four scales: physician care, physician satisfaction, nursing care, and nursing satisfaction (Sulmasy et al. 2002). The Family Perception of Care Scale (FPCS), designed for end-of-life care in long-term care facilities, demonstrated higher satisfaction when patients died in the facility than in the hospital (Vohra et al. 2004). The F-Care Expectations and Perceptions Scales assess family members' care expectations (Kristjanson et al. 1997). An 89-item after-death survey was used to examine caregiver satisfaction with palliative care in the United Kingdom (Jacoby et al. 1999). The 10-item Satisfaction with Care at the End of Life in Dementia (SWC-EOLD) includes three scales but is limited to the dementia population (Volicer, Hurley, and Blasi 2001).
We also identified several needs assessment tools and clinical tools that are related to quality and satisfaction. The Cancer Patient Needs Survey has 51 items in five categories, including coping, help, information, work, and cancer shock, and was developed for the general cancer population (Gates, Lackey, and White 1995). The Needs Assessment for Advanced Cancer Patients (NA-ACP) has 132 items in seven domains (Rainbird, Perkins, and Sanson-Fisher 2005), and the Problems and Needs in Palliative Care Questionnaire (PNPC) (Osse et al. 2005) has 138 items across 13 dimensions. The Hospice Pressure Ulcer Risk Assessment Scale (in Swedish, Hospice Riskbedoming Trycksar, HoRT) measures physical activity, age, and mobility (Henoch and Gustafsson 2003). A report from the Canadian Study of Health and Aging (CSHA) (Kristjansson, Breithaupt, and McDowell 2001) describes a six-item measure as an index of social support available to the elderly.
Our literature search identified four reports in addition to two measures recommended from the Toolkit. The Caregiving at Life's End Questionnaire, designed for evaluating hospice care giving, includes seven scales such as caregiver comfort and the importance of care giving (Salmon et al. 2005). The General Functioning Scale of the Family Assessment Device (FAD), a 12-item scale assessing family functioning, had acceptable psychometrics in an advanced cancer population (Kristjanson et al. 1997). The Family Caregiver Medication Administration Hassles Scale is designed to capture problems caregivers experience with assisting elderly with medications (Travis et al. 2003). The Cost and Reciprocity Index (CRI) is a 25-item scale that was modified for use with hospice caregivers and measures social support and social conflict (Kirschling, Tilden, and Butterfield 1990).
Almost exclusively, measures were evaluated in a single setting (hospice, palliative care service, nursing home, community, or hospital). Few of the reports we reviewed compared the performance of measures across settings, and only two measures reported translations to languages other than their primary language. With respect to patient conditions, 25 measures were evaluated in mixed or unspecified populations and 17 in mixed or unspecified cancer-predominant populations. Only two measures were evaluated in CHF patients, five in dementia, one in HIV/AIDS, and five in breast or lung cancer patients.
We identified 84 individual intervention studies (see Figure 2) that evaluated strategies to affect outcomes in end-of-life care and abstracted the measures that were used to assess outcomes. Twenty-eight studies addressed continuity, 27 studies addressed symptoms, 19 studies addressed advance care planning, 13 studies addressed satisfaction, and 12 studies addressed caregiver issues (some studies related to more than one domain). We identified 175 outcomes that were studied in these intervention trials. Thirteen outcomes were measured using visual analog scales and 27 outcomes were utilization measures such as length of stay. The remaining 135 patient-centered outcomes were evaluated using 97 separate measures.
Different studies rarely used the same measures; 80 of 97 measures were used only once, and only eight measures were used in more than two studies (SF-36, Cohen–Mansfield Agitation Inventory, Mini-mental State Examination, EORTC-C30 QOL, Hospital Anxiety Depression Scale, Minnesota Living with Heart Failure Questionnaire, and the RSCL). Thirty-three of the 97 measures used were not identified in our review of end-of-life care measures and are mostly adapted from other sources. The remaining 64 measures are dispersed among the domains of QOL, physical symptoms, emotional and cognitive symptoms, functional status, spirituality, satisfaction, and caregiver well-being (see Figure 2). No studies used measures in the domains of advance care planning, grief and bereavement, or continuity of care that were identified in our review.
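The tallies in the two preceding paragraphs follow from a few subtractions and ratios. The sketch below only re-derives them from the quoted figures; the variable names are illustrative assumptions, and the count of measures used in exactly two studies is inferred by subtraction rather than stated in the text.

```python
# Figures quoted in the text; variable names are illustrative assumptions.
total_outcomes = 175
visual_analog_outcomes = 13
utilization_outcomes = 27
patient_centered_outcomes = (
    total_outcomes - visual_analog_outcomes - utilization_outcomes
)

measures_used = 97
used_once = 80
used_in_more_than_two = 8
# Inferred by subtraction, not reported directly in the text.
used_in_exactly_two = measures_used - used_once - used_in_more_than_two

not_identified_in_review = 33
identified_in_review = measures_used - not_identified_in_review

share_used_once = used_once / measures_used
share_more_than_two = used_in_more_than_two / measures_used

print(patient_centered_outcomes)     # 135
print(used_in_exactly_two)           # 9
print(identified_in_review)          # 64
print(f"{share_used_once:.1%}")      # 82.5%
print(f"{share_more_than_two:.1%}")  # 8.2%
```

On these figures, about four in five measures appeared in a single study, which illustrates how little overlap there is to support comparison across studies.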
One previous systematic review, the Toolkit (Teno 2005), described a large number of patient-centered measures potentially applicable to the end of life and we identified 64 additional measures with published psychometric properties. Data on reliability or validity are available for only about one-third of published measures. Reliability and validity testing was often limited in scope and performed in small populations that were often not representative of the dying population as a whole.
Researchers faced with the challenge of selecting instruments for study will find a number of robust instruments (in the Toolkit and our review) that address pain and other symptoms, QOL, and quality of or satisfaction with terminal care described by after-death report. We identified several measures beyond those previously put forward by the Toolkit that were particularly noteworthy on the basis of conceptual grounding, psychometric evaluation, and/or acceptance in the field of palliative care. These measures include: the QUAL-E instrument, the QODD instrument, and the POS.
The current evidence base for palliative and end-of-life care is anchored in cancer care, and progress requires reliable and valid measures of the patient and caregiver experience for other conditions. At present, most measures and evaluations are limited to a single setting, usually the hospital or hospice. As patients at the end of life often use multiple sites of care, measures should be useful longitudinally, over time and across settings. Culture may influence the patient and caregiver end-of-life experience, but measure development that addresses population differences such as race or ethnicity is uncommon.
We also found that measures mostly addressed the domains of QOL, quality of care and satisfaction, and pain and physical symptoms. Gaps in measuring important domains of end-of-life care include continuity of care, advance care planning, spirituality, and caregiver well-being. In more developed domains, such as QOL or satisfaction, different projects almost always used different measures. The large number of measures of uncertain quality makes it difficult to compare findings or to synthesize insights across research or quality improvement studies; we recommend uniformity and use of the highest quality measures as important goals among others noted by the 2004 National Institutes of Health State-of-the-Science Conference Statement on Improving End-of-Life Care (available at http://consensus.nih.gov/).
In the published intervention studies we reviewed, fewer than 9 percent of measures (8 of 97) were used in more than two studies. Prior recommendations for measures in the field of palliative care (e.g., the Toolkit), as well as the existing interventional literature, have capitalized on existing measures that were often developed for other uses. Future interventional research should increasingly emphasize the use of measures that are more specific and appropriate for palliative care. Support for collaborative research may be one helpful approach to facilitating robust development of the strongest measures and comparisons across studies.
Our systematic review has several limitations. For some purposes, our broad definition of “end of life” would be overinclusive; and for other purposes, our exclusions might have left out some important elements. Some measures developed for other populations might be useful for measuring certain aspects of the end-of-life patient and caregiver experience, although our review focused specifically on the extent of progress in measurement in the field of end-of-life care. Neither the Toolkit nor our systematic review identified all of the measures used to assess the outcomes of intervention studies, a fact that suggests that many of the measures of interest may not be indexed as “end of life” or previously developed or used in the end-of-life care population. The current review was unable to address literature on technical interventions, nor was it feasible to address children, and those limitations highlight the need for additional focused review of those areas. Our recommendations for measures are limited as the field lacks an agreed upon method for grading instruments and our implicit process criteria may not consider all the appropriate factors; our recommendations should be considered on the basis of these limitations.
Priorities for future research include developing measures of continuity of care, advance care planning, spirituality, and grief and bereavement. Studies should compare the highest quality measures across diseases, settings, and among important populations. Measures are also needed that are appropriate for a variety of study purposes: clinical research, health services, and quality assessment and improvement. Research funders should emphasize the use of high-quality, comparable metrics.
Advancement of the quality of end-of-life measurement will aid reforms to improve the quality of end-of-life care. Better availability of well-developed, widely evaluated end-of-life-relevant measures is a critical step in improving the knowledge base and the quality of end-of-life care. This review provides a reference for researchers seeking guidance in choosing domains and the highest quality measures for evaluating the outcomes of interventions relevant to the end of life.
This study was funded through contract #290-02-0003 from the AHRQ on behalf of the National Institute for Nursing Research (NINR). Dr. Sydney Dy was supported by Grant K07CA96783 from the National Cancer Institute. Dr. Karl Lorenz was supported by a VA Health Services Research Career Development Award.
The following supplementary material for this article is available:
Identified measures with published psychometric data supplemental to Toolkit.
This material is available as part of the online article from http://www.blackwellsynergy.com/doi/abs/10.1111/j.1475-6773.2007.00721x (this link will take you to the article abstract).