The Patient-Reported Outcomes Measurement Information System® (PROMIS) was an NIH-funded initiative to develop measures of symptoms and function. Responsiveness is the degree to which a measure can detect underlying changes over time. Our objective was to document the responsiveness of 8 PROMIS measures in a large population-based cancer cohort.
The Measuring Your Health study recruited 2,968 patients diagnosed with one of seven cancers between 2010 and 2012 through 4 SEER registries. Participants completed a baseline (6–13 months post diagnosis) and a 6-month follow-up survey. Changes in 8 PROMIS scores were compared to global ratings of transition, changes in performance status, and clinical events.
Measures were responsive to 6-month declines and improvements in performance status with small to large effect sizes (ES) (Cohen’s d=0.34–0.71, p<0.01). Mean changes and effect sizes were larger for participants reporting declines than for those reporting improvements. We identified small to medium ES for participants who reported being “a little” worse (d=0.31–0.56), and medium to large ES for participants who reported being “a lot” worse (d=0.53–0.72). Hospitalized participants reported significant score increases, indicating worsening, in pain (d=0.51), fatigue (d=0.35), and depression (d=0.57; all p<0.01). Cancer recurrence and progression were associated with smaller increases in pain, fatigue, and sleep disturbance (d=0.22–0.27).
We found that all 8 PROMIS measures were sensitive to patient-perceived worsening and improvement and major clinical events. These findings will be able to inform the design and interpretation of future research studies and clinical initiatives administering PROMIS measures.
This study presents strong evidence across multiple evaluation methods that PROMIS measures are responsive to both improvements and declines in symptoms and function experienced by cancer patients. Our findings will be able to inform the design and interpretation of future research studies and clinical initiatives administering PROMIS measures.
Patient-reported outcomes (PROs) are measures of functioning and well-being in physical, mental and social spheres of health1. In 2004, the National Institutes of Health (NIH) launched the Patient-Reported Outcomes Measurement Information System® (PROMIS®) as part of an NIH Roadmap (and later NIH Common Fund) initiative assessing all disease and health areas2, 3. PROMIS was designed using extensive qualitative and quantitative methods to develop a comprehensive set of item banks and short-form measures.
Responsiveness is an important aspect of scale evaluation, measuring the degree to which a PRO measure can detect underlying true changes4. Anchors used as indicators of change include patient transition reports and documented clinical events over the study period. To date, only a handful of studies have been published examining the responsiveness of PROMIS measures5–7. One study examined PROMIS responsiveness and minimally important differences in a clinic-based sample of advanced-stage cancer patients (n=101)8.
Our study builds upon these findings, evaluating the responsiveness of 8 PROMIS Short Form measures in the Measuring Your Health (MY-Health) Study9. MY-Health was designed to conduct a large-scale psychometric evaluation of PROMIS measures across a diverse cancer sample10, using community-based sampling to represent the full range of known health disparities across age and racial/ethnic groups11. This study presents an ideal environment for evaluating responsiveness in a large sample of patients with 7 different cancers, providing 6-month prospective data and capturing both patient-reported and clinical indicators. Demonstrating responsiveness in this heterogeneous cohort of cancer patients will support using PROMIS measures in population-based studies, comparative effectiveness research, clinical trials, and other longitudinal studies.
We recruited cancer patients as part of the Measuring Your Health (MY-Health) study. Four population-based cancer registries that are part of the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program in three states (California, Louisiana, New Jersey) identified participants within 6–13 months of diagnosis with a primary colorectal, lung, non-Hodgkin’s lymphoma, breast, gynecologic (uterine, cervical), or prostate cancer. We oversampled younger age groups (21–49; 50–64) and non-white racial/ethnic groups (Hispanic, Black, Asian), targeting 20% of the full sample (n=1,000) in each group. Enrolled participants completed a paper baseline survey and a 6-month follow-up survey. Among participants who completed our follow-up survey, we conducted a medical record abstraction of cancer-related procedures, hospitalizations, and medical events for a random sub-sample. Further details on study design, eligibility, and baseline data collection procedures have been described in depth elsewhere12.
Trained SEER study abstractors conducted a medical record abstraction (MRA) on a 40% sub-sample of participants who completed both baseline and follow-up surveys. Stage III and IV cancer patients were oversampled to ensure this group had sufficient events for evaluation between the baseline and follow-up surveys. Abstractors reviewed hospital and out-patient records for cancer-related treatment (chemotherapy, hormonal, targeted therapies, radiation, and surgical procedures), hospitalizations, medical events, cancer status (recurrence, progression, remission) and vital status.
Data on date of cancer diagnosis, cancer type, and cancer stage were obtained from SEER registry routine databases. Cancer stage was defined using the American Joint Committee on Cancer (AJCC) criteria. Information on hospitalization (dates) and documented cancer status change (type and date of change: remission, recurrence, progression, missing) was collected via medical record review. The participant baseline survey collected the patient-reported demographic and clinical information (surgery, radiation, chemotherapy, hormone therapy) used in this study, including age, race/ethnicity, education level, and birthplace (U.S. or outside the U.S.). Race/ethnicity categories (non-Hispanic White [White], Black, Hispanic, Asian) used for analysis were created following U.S. Census (2010) classification algorithms13. When self-reported race/ethnicity was missing, we used information from the SEER registry database (<0.4% of participants).
Two non-PROMIS PRO measures were collected at both baseline and follow-up to examine PROMIS 6-month responsiveness: (1) the 7-item FACT-G physical well-being (PWB) subscale (full measure was not administered due to overall survey length)14; and (2) the single-item, patient self-report ECOG Performance Status (ECOG PS) scale often used in cancer clinical trials to assess disease impact on daily living abilities15. The ECOG PS scale has 5 response options ranging from “normal activity without symptoms” to “unable to get out of bed”.
We created 8 PROMIS measures administered at baseline and follow-up timepoints: Physical Function (15-items), Fatigue (14-items), Pain Interference (11-items), Anxiety (11-items), Depression (10-items), Ability to Participate in Social Roles v2 (“Social Function” 10-items), Cognitive Function v2 (8-items), and Sleep Disturbance (8-items)16, 17. Each measure is a custom short form created for the MY-Health Study. These measures were designed to include multiple “off the shelf” PROMIS Short Forms, as well as frequent Computer Adaptive Testing (CAT) selections for lower functioning patients (0.5 and 1.0 SD below the U.S. population mean).
On the follow-up survey, a single-item patient-reported global rating of change was administered immediately after each of the 8 PROMIS measures18. This global change item asked the following: “compared to six months ago, how is your (PROMIS symptom or function) now?” Two different response sets were used: (1) responses to Pain, Fatigue, and Anxiety: “A lot less”, “A little less”, “About the same”, “A little more”, “A lot more”; and (2) responses to physical function, social function, cognitive function, depression (labeled: “feelings”), and sleep disturbance (labeled: “sleep quality”): “A lot better”, “A little better”, “About the same”, “A little worse”, “A lot worse”.
First, we evaluated our follow-up survey completion rate, examining demographic and clinical differences from the full baseline sample. We then conducted descriptive analyses of our sample to examine 6-month change between our baseline and follow-up surveys for each PROMIS measure. We calculated means, standard deviations, and effect sizes for baseline and follow-up mean scores. Unadjusted change scores (baseline – 6 months scores) were evaluated within each anchor response option (self-report measure of 6-month change) for each PROMIS measure.
We evaluated 6-month responsiveness across the 8 PROMIS measures using retrospective ratings of global change. Retrospective anchors were collected using self-reported 5-point global change ratings corresponding to each PROMIS domain measure. For each PROMIS measure, we calculated the mean score change and effect size across each change rating (e.g., “a lot better”).
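The anchor-based summary described here can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's SAS code: the function name `change_by_anchor`, the tuple layout, and the example scores are hypothetical. It also assumes that change is computed as follow-up minus baseline and that the effect size denominator is the baseline SD of the whole sample, neither of which is stated explicitly in the text.

```python
from collections import defaultdict
from statistics import mean, stdev


def change_by_anchor(records):
    """Summarize score change within each global-rating category.

    records: list of (baseline_score, followup_score, anchor_rating) tuples
    for one PROMIS measure (hypothetical layout, not the study's data format).
    """
    # Assumption: the denominator is the baseline SD of the whole sample.
    baseline_sd = stdev([baseline for baseline, _, _ in records])

    # Assumption: change = follow-up score minus baseline score.
    changes = defaultdict(list)
    for baseline, followup, rating in records:
        changes[rating].append(followup - baseline)

    summary = {}
    for rating, values in changes.items():
        mean_change = mean(values)
        summary[rating] = {
            "n": len(values),
            "mean_change": mean_change,
            # Effect size: absolute mean change over the baseline SD.
            "effect_size": abs(mean_change) / baseline_sd,
        }
    return summary


# Made-up scores for illustration only; these are not study data.
example = [
    (50, 55, "A lot more"),
    (60, 65, "A lot more"),
    (40, 40, "About the same"),
    (50, 50, "About the same"),
]
for rating, stats in change_by_anchor(example).items():
    print(rating, stats)
```

Grouping by the rating first keeps each category's sample size visible next to its mean change, which matters when a category such as “a lot worse” has few respondents.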
We analyzed prospective change using five known-group contrasts based on both survey self-report (ECOG PS and current cancer status) and medical record information (number of hospitalizations and recurrence/progression versus remission). Each comparison examined the relative mean score difference between a group in which change was expected and a stable contrast group in which no change was anticipated. We also evaluated the Spearman rank-order correlations of change between the FACT PWB and the PROMIS measures whose baseline correlations with it were above r=0.7 (pain, fatigue, social function, physical function).
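As a rough illustration of the correlation-of-change analysis, the sketch below computes a Spearman rank-order correlation between change scores on two instruments using only the Python standard library. All function names and data are hypothetical, and change is again assumed to be follow-up minus baseline. The sign of the result depends on how each instrument is scored; the sketch does not take an absolute value.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_of_change(base_a, follow_a, base_b, follow_b):
    """Spearman correlation between change scores on two instruments."""
    change_a = [f - b for b, f in zip(base_a, follow_a)]
    change_b = [f - b for b, f in zip(base_b, follow_b)]
    # Spearman's rho is the Pearson correlation of the ranks.
    return pearson(average_ranks(change_a), average_ranks(change_b))


# Made-up scores for illustration only; these are not study data.
rho = spearman_of_change(
    [50, 52, 48, 60], [51, 54, 51, 64],   # symptom scores rise
    [20, 20, 20, 20], [19, 16, 11, 4],    # well-being scores fall
)
print(round(rho, 3))
```

Because the correlation is computed on ranks, it captures monotone agreement between the two sets of change scores without assuming a linear relationship.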
We calculated the effect size for all responsiveness comparisons by dividing the absolute value of the mean 6-month change in each PROMIS score by the baseline standard deviation (SD). We applied Cohen’s interpretation of effect size magnitude: d=0.2 (small); d=0.5 (moderate); d=0.8 (large)19. Past studies suggest that change scores around d=0.2 are probably too low to be classified as an estimate of clinically meaningful change.18, 20, 21 Therefore, we considered values at or above d=0.3 as clinically meaningful change. We used SAS version 9.4 (SAS Institute, Cary, NC).
The overall follow-up survey response rate was 54%. Participants completing the follow-up survey were more likely to be non-Hispanic White (62%), 65 or older (57%), or report a college degree or higher (62%). Prostate cancer patients had the highest (61%) and cervical cancer patients the lowest (34%) follow-up rates. Follow-up rates decreased as cancer stage at diagnosis increased (57% for stage I vs. 46% for stage IV) (Table 1). Overall, follow-up responders reported significantly better function and lower symptom severity at baseline than participants who were lost to follow-up.
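Written out, the effect size described above takes the following form. The notation is ours rather than the authors', the sign convention for the change score is an assumption, and the SD of 10 points in the second line is purely illustrative: PROMIS T-scores are scaled to an SD of 10 in the reference population, but the cohort's baseline SDs are not reported in this section.

```latex
\[
\Delta_i = x_i^{\text{6mo}} - x_i^{\text{base}},
\qquad
d = \frac{\left|\bar{\Delta}\right|}{SD_{\text{base}}}
\]
% Illustrative only: smallest mean change meeting d >= 0.3 if SD_base were 10 points
\[
\left|\bar{\Delta}\right|_{\min} = 0.3 \times SD_{\text{base}} = 0.3 \times 10 = 3 \text{ points}
\]
```

Under that assumed SD, the d=0.3 threshold corresponds to a mean change of about 3 points; a smaller or larger baseline SD would shift the threshold proportionally.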
Baseline characteristics of participants who completed both a baseline and 6-month follow-up MY-Health Survey (n=2968), and supplemental medical record information (n=844) are provided in Table 1. Most patients were 50 years of age or older (81%) and 53% of our respondents were members of a racial/ethnic minority group. While most participants were diagnosed with breast and prostate cancer, seven different cancer types were represented in the sample and 26% (n=780) were diagnosed with advanced disease. Half of participants reported “normal” performance status at baseline. The medical record participant sub-set reflects an intentional over-sampling of advanced stage cancer patients (III/IV).
Overall, we found that participants were largely unchanged or showed small improvement across all measures over a 6-month period. Among PROMIS symptom measures, we found small declines in pain interference (mean change: −1.6, p<0.001) and fatigue (mean change: −1.1, p<0.001). Improvements were identified for physical and social function (mean change: 0.8 and 1.1, respectively, both p<0.001). Self-reported 6-month retrospective global change indicated that more participants reported improvement (a little or a lot) for pain (51%), fatigue (49%), anxiety (47%), and social function (44%). However, symptom increases/functional declines reported over this period were highest for fatigue (13%) and lowest for physical function (7%) (Table 2).
Across symptom and functional status measures, “a lot better/less” was associated with mean changes of 2 to 4 points (d range = 0.22 to 0.44), “a little worse/more” with mean changes of 3 to 6 points (d range = 0.31 to 0.56), and “a lot worse” with mean changes of 5 to 9 points (d range = 0.53 to 0.72) (Table 3a–c). Mean PROMIS changes and effect sizes were larger for those reporting declines in function or worsening symptoms. For example, the mean change from baseline in the PROMIS-Fatigue score was 5.42 (d=0.62) among those reporting “a lot more fatigue”, and −3.26 (d=0.38) among those reporting “a lot less fatigue” (Table 3a). Depression and cognitive function measures were the most responsive to declines (8.5 and 8.7, respectively) and the least responsive to improvement (−2.42 and 2.12, respectively) (Tables 3b and 3c).
Patients with a 1-point improvement on the ECOG PS had clinically meaningful (d≥0.30) improvements on 5 PROMIS measures. Physical function showed the largest improvement (mean change=3.4 points, d=0.53), followed by pain (mean change=−4.5; d=0.45) (Table 4). All measures were more sensitive to worsening performance status, with fatigue showing the largest change and effect size (mean change: 4.3, d=0.63). Data from our MRA cohort indicated that pain, fatigue, and depression were responsive to hospitalization, with mean score increases of 3–5 points and small to moderate effect sizes (d=0.35 [fatigue], 0.51 [pain], and 0.57 [depression]). Correlations of change between PROMIS measures and the FACT-G PWB scale ranged from r=0.33 (pain) to r=0.47 (fatigue) (Table 5).
This study provides support for the responsiveness of 8 PROMIS measures in a diverse cohort of cancer patients, supporting past evaluations in cancer and other chronic conditions.22 Most notably, our study supports and extends research findings from a study of advanced-stage cancer patients which identified a similar range for longitudinal anchor-based change in PROMIS pain, fatigue, anxiety, depression, and physical function measures (d=0.36 – 0.67).8 Using a larger, more diverse, national sample of cancer patients, our findings provide evidence that a 3 to 5 point change is sufficient across all PROMIS measures to identify clinically meaningful change.
This study also establishes that these PROMIS measures are responsive to functional recovery and symptom improvement in cancer. However, absolute changes in PROMIS scores tended to be smaller for patients who retrospectively reported a functional improvement/symptom decrease than for those who reported a functional decline/symptom increase on global change ratings. This imbalance in change score magnitudes across retrospective ratings of global health change has been reported in other cancer-specific, patient-reported outcome measures (e.g., the FACT-G).23 Despite attenuated responsiveness to retrospectively rated improvements, our study did show similar positive and negative responsiveness on PROMIS measures across a prospective assessment of change (patient-rated ECOG performance status). Given known methodological concerns with global retrospective change ratings (e.g., recall bias, implicit evaluation of changes), our findings suggest prioritizing prospective change assessment in similar validation efforts.
These findings also present evidence that PROMIS measures are responsive to both medical and cancer-specific clinical events. This study found that a hospitalization within this 6-month period, between 6 and 18 months post diagnosis, was linked to clinically meaningful increases (3–5 points) in pain, fatigue, and depression. This finding complements recent work examining PROMIS responsiveness in surgical recovery after a heart transplant7. Small increases in pain, fatigue, and sleep disturbance due to a documented recurrence/progression of cancer (1.8–2.3 points) provide evidence of responsiveness to cancer-specific clinical events. In contrast, anxiety and depression scores worsened only with patient self-report of cancer at follow-up, not with clinical documentation alone. This difference suggests that anxiety and depression may be more sensitive to patient perception of recurrence or progression than to clinical identification alone.
Limitations include the long time gap between ratings of change (6 months), which has been found to be difficult for patients to report accurately.24 Nevertheless, the association between global ratings of change and actual change scores was reasonably high. The MRA comparisons rely on a short time period (6 months) to document recurrence/progression and in some cases (hospitalization) involve a very small sample size. Furthermore, we did not subdivide the 6-month window to evaluate the timing of events and additional cancer-related treatments after initial treatment. Therefore, events could have occurred at any point within this 6-month period, potentially reducing the degree of responsiveness measured by the PROMIS short forms. Further research is necessary, focused on tracking the impact of cancer-specific medical events (e.g., hospitalizations, adjuvant therapies, recurrence). Lastly, it is possible that a scale recalibration response shift occurred over this time period, so that the computed change scores do not fully reflect the true change that took place. However, this study was not designed to formally identify or evaluate this occurrence.
These observational data provide a necessary first step in detailing sensitivity to change for 8 PROMIS measures when reported by cancer patients. They lay the groundwork for incorporating the PROMIS measures into oncology clinical trials by helping inform sample size needs and power calculations for studies aimed at detecting a specific magnitude of change in one of these PRO endpoints. In addition, they can aid interpretation of the magnitude of change or difference in results observed with one or more of these PROMIS measures. These data are timely, given the increased interest by the Food and Drug Administration in patient-focused drug development and their release of a clinical outcomes assessment (COA) compendium to help guide applications to their COA Qualification Program, as industry adoption of PROs increases25.
Finally, while we have many cancer-specific PRO measures to choose from for use in cancer trials, there are few generic PRO measures that have been validated in cancer patient samples with an appropriate degree of responsiveness. These findings are among the first to indicate that PROMIS measures may be used to compare the magnitude of benefit or harm from treatments across both cancer and non-cancer clinical trials (e.g., pain or symptom management studies, or for measures of clinical benefit). This provides an opportunity to make meaningful comparisons across health conditions and a broader context for the interpretation of PRO endpoints.
In conclusion, this study presents strong evidence across multiple evaluation methods that PROMIS measures are responsive to both improvements and declines in symptoms and function experienced by cancer patients. It extends past work, presenting further evidence that clinically meaningful change across PROMIS measures ranges from 3 to 5 points. This study also highlights the utility of using PROMIS measures in research (clinical trials, observational cohorts, and comparative effectiveness evaluations).
This work was supported by the National Institutes of Health (U01 AR057971 to R.E.J., T.L., A.L.P., C.M.M.; P30 CA051008 to R.E.J., T.L., A.L.P.); National Center for Research Resources (NCRR), National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), through the Clinical and Translational Science Awards Program (CTSA) (KL2 TR000102 to R.E.J.).
Special thanks to the MY-Health Clinical Advisory Committee (Drs. Patricia Ganz, Julie Gralow, Maurie Markman, Jimmy Hwang, Anthony Back, and Bonnie Teschendorf), SEER study site collaborators (Laura Allen, Lauren S. Maniscalco MPH, Lisa Moy MPH, Natalia Herman MPH, Rosemary Cress DrPH, Wendy Ringer), and the Georgetown MY-Health Research team (Caroline Moore, Charlene Kuo MPH, Marin Rieger MS, Deena Loeffler MA, Aaron Roberts).
The Patient-Reported Outcomes Measurement Information System® (PROMIS®) is an NIH Roadmap initiative to develop valid and reliable patient-reported outcome measures to be applicable across a wide range of chronic diseases and demographic characteristics. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.
Conflicts of Interest: David Cella serves on the PROMIS Health Organization Board of Directors.
Author Contributions
Conception and Design: Roxanne Jensen, Carol Moinpour, Arnold Potosky, Ashley Wilder Smith, David Cella, Ron Hays
Financial Support: Arnold L. Potosky, Carol Moinpour, Ashley Wilder Smith
Acquisition and Assembly of Data: Arnold Potosky, Carol Moinpour, Theresa Keegan, Xiao-Cheng Wu, Lisa Paddock, Antoinette Stroup, Tania Lobo, Roxanne Jensen
Data Analysis and Interpretation: Roxanne Jensen, Carol Moinpour, Arnold Potosky, Elizabeth Hahn, David Cella, Ashley Wilder Smith, David Eton, Tania Lobo
Manuscript Writing: All authors
Final Approval of Manuscript: All authors
Accountable for All Aspects of Work: All authors