To evaluate the responsiveness to change of the PROMIS® negative affect measures (Depression, Anxiety, and Anger) using longitudinal data collected in six chronic health conditions.
Individuals with major depressive disorder (MDD), back pain, chronic obstructive pulmonary disease (COPD), chronic heart failure (CHF), and cancer completed PROMIS negative affect instruments as a computerized adaptive test (CAT) or as a fixed-length short form (SF) at baseline and at a clinically relevant follow-up interval. Participants also completed global ratings of health. Linear mixed effects models and standardized response means (SRM) were estimated at baseline and follow-up.
903 individuals participated (back pain, n = 218; cancer, n = 304; CHF, n = 60; COPD, n = 125; MDD, n = 196). All three negative affect instruments improved significantly for treatments of depression and pain. Depression improved for CHF patients (anxiety and anger not administered), while anxiety improved significantly in COPD groups (stable and exacerbation). Response to treatment was not assessed in cancer. Subgroups of patients reporting better or worse health showed a corresponding positive or negative average SRM for negative affect across samples.
This study provides evidence that the PROMIS negative affect scores are sensitive to change in intervention studies in which negative affect is expected to change. These results inform the estimation of meaningful change and enable comparative effectiveness research.
Researchers and clinicians wishing to assess negative affect in a clinical or community population must choose from among numerous assessment options, many of which purport to measure the same or a similar construct [1–3]. Not all the available instruments meet high levels of instrument development standards for reliability, validity, appropriate reading level, and minimal respondent burden [4, 5]. In an effort to improve the existing measures, the Patient-Reported Outcomes Measurement Information System (PROMIS®) employed a multi-step, mixed methods approach to develop computerized adaptive tests (CAT) and fixed-length short forms to assess health-related quality of life, including symptoms and functional domains across physical, mental and social health. Moreover, the goal of PROMIS, as an NIH Roadmap initiative, was to create a system that could standardize the measurement of patient-reported outcomes across chronic conditions, thus enabling comparisons of the burden of disease and the benefits of treatment across these chronic diseases. Included in that system is a set of item banks and short forms for negative affect, specifically depression, anxiety and anger.
This paper reports on an important subsequent step in the validation processes for PROMIS measures: longitudinal analysis of the PROMIS negative affect scores in adult samples of patients with specified chronic health conditions. These analyses have the potential to deepen the PROMIS validity base, help define anchor-based clinically-important differences, and further enable comparative effectiveness research by identifying subpopulation reference values and observed change scores based on receipt of conventional treatment. In the present study, these conditions comprised back pain, cancer, chronic obstructive pulmonary disease (COPD), chronic heart failure (CHF), and major depressive disorder (MDD). Self-reported negative affect may distinguish important features among patients suffering from these medical conditions, including level of risk, disability, [9–11] or recovery.
Although this investigation is an exploratory “test drive” of PROMIS measures, the nature of the clinical groups and interventions allows us to articulate some hypotheses. For each of the PROMIS negative affect measures, we hypothesized that longitudinal improvements would occur during treatment for MDD (psychotropic medications and/or psychotherapy), chronic heart failure (heart transplant surgery), back pain (spinal injection), and the resolution of COPD exacerbation. Further, we expected the greatest change on all three negative affect measures for those being treated for MDD relative to those being treated for physical conditions. Given the progressive nature of cancer and the absence of any change in treatment for the COPD stable subgroup, we did not have a priori hypotheses for longitudinal changes of these groups. Our cross-sample hypotheses were that the MDD sample should have more severe scores on PROMIS Depression compared to those with other ailments, while patients with COPD exacerbation should have worse negative affect scores compared to the stable group.[9, 13, 14]
While we have articulated some hypotheses above, our ability to develop these more fully is somewhat hampered by the secondary nature of the data analysis. As discussed in the overview paper of this series, a more thorough validation study developed with an a priori design, analytic approach, and data collection focused on across-study and across-disease validation would be useful and possibly more elegant. It should also be emphasized that the purpose of this report is not to demonstrate treatment effectiveness, but to investigate the responsiveness and validity of the PROMIS negative affect instruments.
At the time of this study, there were three PROMIS negative affect item banks, consisting of Depression (28 items), Anxiety (29 items), and Anger (29 items). The items in the PROMIS negative affect banks use a 7-day time frame and a 5-point rating scale that ranges from 1 (“Never”) to 5 (“Always”) [6, 7]. Each item bank was developed using comprehensive mixed (qualitative and quantitative) methods [16, 17]. After confirming essential unidimensionality and fit to the graded response model, items were calibrated with regard to their location (severity) and discrimination (ability to distinguish people at different levels of distress). This produced a bank of questions that can accurately measure levels of negative affect across its observed continuum, and provides the basis for innovative administration strategies such as CAT (in which item administration selection is based on responses to prior items) and short forms targeted to the particular sample being assessed. Each item bank provided more information than conventional measures across a wider range of severity, ranging from normal to severely distressed.
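To make the calibration terms concrete, the following is a minimal sketch of how a graded response model turns a discrimination parameter and a set of ordered location (threshold) parameters into response-category probabilities for a 5-point item. The function names and the parameter values are hypothetical illustrations, not PROMIS calibrations, and the conversion to the T-score metric assumes the usual convention of a mean of 50 and a standard deviation of 10 in the reference population.

```python
import math


def grm_category_probabilities(theta, discrimination, thresholds):
    """Probability of each ordered response category under a graded response model.

    theta          -- latent trait level (e.g., severity of depression)
    discrimination -- item slope (hypothetical value in the example below)
    thresholds     -- increasing location parameters, one fewer than the
                      number of response categories
    """
    # Cumulative probabilities of responding in category k or higher.
    # The lowest category is reached with probability 1; nothing lies
    # above the highest category, hence the trailing 0.
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)
    # Category probabilities are differences of adjacent cumulative values.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


def theta_to_t_score(theta):
    """Convert a standardized trait estimate to a T-score (assumed mean 50, SD 10)."""
    return 50.0 + 10.0 * theta


# Hypothetical item: one slope and four thresholds for a 5-point scale.
probabilities = grm_category_probabilities(1.0, 2.5, [0.0, 0.8, 1.6, 2.4])
t_score = theta_to_t_score(1.0)
```

A higher discrimination value makes the category probabilities change more sharply around each threshold, which is why such items contribute more information near their locations.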
The PROMIS Depression bank focuses on affective and cognitive manifestations of depression rather than somatic symptoms such as appetite, fatigue and sleep. PROMIS Anxiety content focuses on fear (e.g., worry, feelings of panic), anxious misery (e.g., dread), hyperarousal (e.g., tension, nervousness, restlessness), and somatic symptoms related to arousal (e.g., cardiovascular symptoms, dizziness). The Anger bank included items that were affective and cognitive, but also included indicators of behavioral activation and anger expression. See http://www.nihpromis.org/measures/domainframework1 for full definitions of these banks.
The administration format of the PROMIS measures differed slightly across the condition and disease groups evaluated in this project. For most studies, the banks were administered via CAT. For the cancer study, however, customized short forms that predated the release of PROMIS short forms (Version 1.0) were administered. We only scored items on the two cancer short forms that are also in the PROMIS negative affect item banks: eight items for depression and seven for anxiety. These items do differ, however, from the established short forms: for depression, six out of the eight items are also in the Depression Short Form 8b (SF-8b); for anxiety, two out of the seven items are also in the Anxiety Short Form 7a. Nevertheless, because the cancer study items were taken from the PROMIS negative affect item banks, we could score them with the established PROMIS discrimination and threshold parameters; therefore, scores were estimated using the same T-score metric as the existing short forms and CATs. These cancer study short forms correlated very highly with the full bank (r = .97 for both anxiety and depression).
For each data set, we identified one general health and one general emotional distress item (outside of PROMIS items) to serve as a clinical indicator of change. These items either assessed change over the time of treatment directly or via a calculated change score. For example, patients in the cancer study answered the following question at each administration, “In general, would you say your health is…” with five response choices ranging from “excellent” to “poor.” In this case, we subtracted the earlier score from the later score to determine self-reported change in general health. Cancer patients also answered a domain-specific question at the second administration, namely, “Since the last time you filled out a questionnaire, your level of anxiety is…,” with seven response choices ranging from “very much better” to “very much worse.” Patients were then categorized into three groups, reflecting health changes that were better, about the same, or worse. Details on the anchors used for each study are described in the overview paper in this issue.
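The two anchor types described above can be sketched as follows. This is an illustration only: the numeric coding of the response choices (1 = “excellent” to 5 = “poor”; 1 = “very much better” to 7 = “very much worse”) and the cut points used to collapse the seven-point rating into three groups are assumptions made for the example, and the function names are hypothetical. The coding and cut points actually used in each study are given in the overview paper.

```python
def classify_global_health_change(earlier, later):
    """Group a patient by change in a repeated 5-point global health item.

    Assumed coding (not from the source): 1 = "excellent" ... 5 = "poor",
    so a negative change (later minus earlier) means health improved.
    """
    change = later - earlier
    if change < 0:
        return "better"
    if change > 0:
        return "worse"
    return "about the same"


def classify_retrospective_rating(rating):
    """Group a patient by a 7-point retrospective rating of change.

    Assumed coding: 1 = "very much better" ... 7 = "very much worse".
    The cut points (1-3 better, 4 same, 5-7 worse) are hypothetical.
    """
    if not 1 <= rating <= 7:
        raise ValueError("rating must be between 1 and 7")
    if rating <= 3:
        return "better"
    if rating == 4:
        return "about the same"
    return "worse"


# Hypothetical patient: "fair" (4) at baseline, "very good" (2) at follow-up.
global_group = classify_global_health_change(earlier=4, later=2)
distress_group = classify_retrospective_rating(2)
```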
The PROMIS negative affect measures were administered as part of the PROMIS studies designed to validate PROMIS measures in a variety of clinical populations. Five patient groups completed one or more forms of the PROMIS negative affect banks: 1) MDD, 2) back pain, 3) COPD, both exacerbating and stable patients, 4) CHF, and 5) cancer. The studies of MDD, back pain, and CHF followed patients as they enrolled in new treatments. COPD exacerbation patients were treated for their condition, which was expected to resolve over the course of the study. We examined the longitudinal data at baseline and follow-up, namely, 3 months after start of study (MDD, back pain, and COPD), 8–12 weeks after transplantation (CHF), and 6–12 weeks after enrollment (cancer). (Although cancer and COPD-stable groups were not enrolled in new treatments, we apply the terms “baseline” and “follow-up” to all study groups for consistency.) The percentages of missing follow-up data for PROMIS instruments were as follows: 5% for MDD, 10% for cancer, 20% for CHF, and 20% for back pain. For COPD, the overall missing data rate was 7% for PROMIS Depression and Anger and 8% for Anxiety. For COPD exacerbation, the percentage was 4% for all three measures; for COPD stable, the percentage was 9% for PROMIS Depression and Anger, and 10% for Anxiety. Detailed information on patient characteristics and treatments may be found in the overview article elsewhere in this journal.
PROMIS Depression, Anxiety, and Anger were administered to 5, 4, and 3 different clinical groups, respectively. Least square means were estimated for these longitudinal data. Linear mixed models were estimated with random subject effects to account for the similarity among repeated observations within individuals [19, 20]. Since it was reasonable to consider the missing data to be missing completely at random (MCAR) or missing at random (MAR), a mixed model is advantageous because all available data can be used; in other words, the analyses were not restricted to those respondents with data at both time points [21, 22]. Least squares means, standard errors and 95% confidence intervals were estimated from the models.
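As an illustration, a random-intercept model of the kind described here could be written as below. The notation is ours and is offered as a sketch; the exact specification fitted in each study (for example, any covariates) is not stated in this section.

```latex
y_{ij} = \beta_0 + \beta_1 t_{ij} + u_i + \varepsilon_{ij},
\qquad u_i \sim N(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)
```

Here $y_{ij}$ is the T-score of participant $i$ at occasion $j$, $t_{ij}$ equals 0 at baseline and 1 at follow-up, $\beta_1$ is the mean change between occasions, and the random subject effect $u_i$ captures the similarity among repeated observations from the same individual. A participant with only a baseline score still contributes to the estimates of $\beta_0$ and the variance components, which is why the analysis is not restricted to complete cases.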
To clinically anchor the changes in PROMIS negative affect measures, we used items that assessed changes in overall health, negative affect, or both, as described above. For each clinically anchored subgroup, we calculated the change in T-score and the standardized response mean (SRM). The latter is the ratio of the mean change to the standard deviation of that change, [23, 24] which is a form of Cohen’s effect size index. Based on previous studies, we assume for the purposes of this study that an SRM of .30 would be a minimally important difference in outcome [10, 26, 27]. Because of missing data at follow-up, subgroup sample sizes for the anchor-based analysis do not sum to the sample sizes for the mixed model analysis.
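The SRM defined above can be computed as in the following minimal sketch. The function name and the T-scores are hypothetical, change is taken as follow-up minus baseline so that a negative SRM indicates reduced distress, and the sample standard deviation (n − 1 denominator) is assumed because the text does not specify which form was used.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean of the change scores divided by the SD of the change scores."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired")
    # Change is follow-up minus baseline for each participant.
    changes = [after - before for before, after in zip(baseline, follow_up)]
    if len(changes) < 2:
        raise ValueError("need at least two paired observations")
    sd_change = stdev(changes)  # sample SD, n - 1 denominator (assumed)
    if sd_change == 0:
        raise ValueError("SD of change is zero; SRM is undefined")
    return mean(changes) / sd_change


# Hypothetical T-scores for five participants in a "better" anchor subgroup.
baseline_scores = [62.0, 58.0, 65.0, 60.0, 55.0]
follow_up_scores = [56.0, 57.0, 58.0, 59.0, 53.0]
srm = standardized_response_mean(baseline_scores, follow_up_scores)

# Compare the magnitude against the .30 threshold assumed in the text.
exceeds_mid = abs(srm) >= 0.30
```

Only participants with scores at both time points can enter this calculation, which is one reason the anchored subgroup sizes are smaller than the mixed model samples.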
Because the cancer sample was sufficiently large (enrolled N = 310) and some patients were expected to improve while others would deteriorate, we also computed least square means for the subsamples of patients who reported that their overall health got better, worse, or remained about the same over the course of the study.
The results of the mixed models are summarized in Tables 1 and 2, while Figure 1 shows the least squares means of each measure in each clinical group. Consistent with our expectations, scores on the negative affect measures decreased significantly for patients receiving treatment for back pain, MDD, and CHF. The COPD exacerbation group, however, improved only on PROMIS Anxiety and not (significantly) on Anger and Depression on the second administration following their exacerbation. Furthermore, the COPD stable group also showed significant reductions on Anxiety (but not on Anger and Depression). As Figure 1 illustrates, both COPD groups were also more highly elevated on Anxiety at baseline relative to Anger and Depression.
In the cancer sample, we observed small improvements in mean levels of the entire group, but only the improvement on PROMIS Anxiety (−0.8 T-score points) was significant. However, as Figure 1 and Table 2 show, the clinically anchored subgroups produced significant improvements in PROMIS Anxiety and Depression in the predicted direction. Those reporting global improvements on the domain-specific anchor (“Since the last time you filled out a questionnaire, your level of depression [anxiety] is…,”) also improved significantly on PROMIS Anxiety and Depression; patients reporting deterioration on the domain-specific anchor also showed significantly higher Anxiety and Depression scores.
As expected, the mean score for PROMIS Depression (at baseline) was highest in the MDD sample compared to the other clinical samples (≥ 5 T-score points). The negative affect measures also distinguished the COPD stable group from COPD exacerbation for PROMIS Anxiety, t (119) = 3.68, p < .001, PROMIS Depression, t (119) = 2.76, p < .01, and marginally for PROMIS Anger, t (119) = 1.87, p = .06, with the COPD exacerbation group showing more distress. These group differences were maintained at follow-up (3 months) for Anxiety (t (110) = 2.85, p < .01) and Depression (t (111) = 2.61, p < .05), but the difference was not significant for Anger (t (111) = 1.06, p = .29).
The analysis of clinically anchored subgroups generally showed that individuals grouped in the better health category saw greater improvements on PROMIS negative affect compared to those in the worse health category (see Tables 3–5); this was particularly the case when the anchor was domain-specific (negative affect) compared to global (health). For PROMIS Depression, individuals reporting better health between administrations showed an average SRM across conditions of −0.54 compared to an average of −0.10 for those in the worse health group. Changes in the groups defined by the domain-specific distress anchor were even larger: the average SRM was −0.71 for better compared to 0.49 for worse. For PROMIS Anxiety, the better health groups showed an average of −0.66 across conditions compared to an average of −0.17 for those reporting worse health. Using the distress anchor, the SRM averages were −0.83 for better and 0.38 for worse. For Anger, the globally anchored groups showed less of a difference, with an SRM of −0.44 for better health compared to −0.01 for worse health. With the distress anchor, however, the average was −0.62 for better health and 0.56 for worse health.
The PROMIS negative affect measures were created using an extensive instrument development process, including qualitative and cognitive interviews, dimensionality analyses, IRT calibration, and concurrent validity evaluation, [6, 7] and scale-setting to match the US population. The current study examines how well the PROMIS emotional distress instruments differentiate among diverse clinical groups and if changes over time are consistent with expected changes during treatment.
Our results were largely consistent with a priori hypotheses. As expected, we found statistically significant longitudinal reductions on negative affect measures after treatment for MDD (psychotropic medications and/or psychotherapy), chronic heart failure (heart transplantation), and back pain (spinal injection). In addition, we found the greatest change on all three emotional distress measures for the treatment of MDD relative to the physical conditions and other treatments. Our cross-sample hypotheses were also met: patients with MDD scored higher on the emotional distress measures compared to the other groups, while COPD exacerbation patients scored higher on Depression and Anxiety than the stable group (but the two groups were not significantly different for Anger). We did not have a priori expectations for changes in cancer and COPD-stable groups; significant improvements were found in both groups on Anxiety.
While we expected that all three emotional distress measures would show improvement upon the resolution of COPD exacerbation, only PROMIS Anxiety improved for this group (−4.3 T-score points). The COPD-stable group likewise improved significantly on PROMIS Anxiety (−3.0 T-score points). For each COPD group, the PROMIS Anxiety score was also the highest among the three emotional distress measures at baseline. In fact, the mean level of anxiety at baseline in the COPD exacerbation group (60.2) was nearly as high as the level of anxiety in the MDD sample (61.7). Anxiety also clearly distinguished COPD exacerbation from COPD stable, both at baseline and follow-up (difference ≥ 5 T-score points). Our results are consistent with research suggesting a possible unique role for anxiety in COPD progression and treatment. The prevalence and severity of negative affect, especially anxiety, in people with COPD are well documented, [9, 13, 14] and anxiety in women with COPD is independently associated with increased risk of death.
Mean scores on the emotional distress measures in the cancer group did not change much over the course of treatment (< 1 T-score point). However, as this group consisted of a diverse set of participants in different stages and types of cancer (see overview paper elsewhere in this issue), we would expect some individuals to improve and some to deteriorate. As Figure 1 illustrates, groups anchored to the global, domain-specific rating of improvement also showed significant parallel improvement on PROMIS Depression and Anxiety (and vice versa for the global, domain-specific rating of deterioration). This pattern of scores has been observed previously, and appeared to be robust in these data, with the absolute value of SRMs ranging from .35 to .72 (Tables 3 and 4).
The analysis of SRMs by clinically anchored subgroups supported the hypothesis that PROMIS emotional health measures are responsive to change in diverse clinical groups. For the domain-specific ratings of change, all SRMs for the better and worse subgroups were higher than the MID value of .30 across the three emotional distress measures. The SRMs for subgroups determined by general health ratings, however, were somewhat inconsistent. While the better health subgroups showed improvement in the predicted direction on emotional distress, the worse health subgroups also showed improvement on the emotional distress measures (albeit below the .30 MID threshold in most cases). This discrepancy is plausible, as participants’ general health may be influenced by factors external to treatment or otherwise independent from their emotional state.
Our results are also relevant to psychopathology research. Most research on the structure of depressive and anxiety symptoms is done on cross-sectional samples; [31–33] however, longitudinal and treatment data can also support or refute models developed on cross-sectional data. Figure 1 illustrates that the courses of depression and anxiety are virtually identical for the outpatients with MDD. This may reflect the correlation/comorbidity of depression and anxiety and suggests that it would be useful to investigate how they can be modeled and examined together.
The present study is not without limitations. First, while the use of global ratings of health or domain-specific change may be face valid and clinically relevant, they introduce some methodological complications. Because retrospective ratings are assessed at follow-up, they are typically correlated more with follow-up (current) scores than with pre-test or change scores. Second, sample sizes for COPD and CHF were modest; when anchored to different change groups, several of the subsamples for COPD and CHF were below 20. Consequently, results for these subgroups should be interpreted cautiously. Third, as stated in the introduction, this paper represents a secondary analysis of a range of studies with divergent goals. One of these goals (not addressed here) was to compare the responsiveness of PROMIS measures to established legacy measures; for depression, this question has been addressed in a separate manuscript. For the purposes of this particular report, however, it would have been ideal to have the research design (e.g., choice of population and instruments) be exclusively informed by a priori hypotheses in the negative affect domain. Finally, we acknowledge that the anger bank was administered to fewer clinical patients with less extreme anger scores (relative to depression and anxiety). It would be important to test the responsiveness of this bank in populations that have known anger problems.
Despite these limitations, the current study extends previous validity research on PROMIS negative affect measures by examining longitudinal change in a diverse set of clinical groups. The study demonstrates predictable change on PROMIS negative affect for various treated clinical samples over time, with parallel changes for clinically anchored subgroups, and meaningful differences between these clinical groups. These data help inform an evidence base for defining treatment responders and conducting or interpreting the results of comparative effectiveness research.
PROMIS® was funded with cooperative agreements from the National Institutes of Health (NIH) Common Fund Initiative (Northwestern University, PI: David Cella, PhD, U54AR057951, U01AR052177, R01CA60068; Northwestern University, PI: Richard C. Gershon, PhD, U54AR057943; American Institutes for Research, PI: Susan (San) D. Keller, PhD, U54AR057926; State University of New York, Stony Brook, PIs: Joan E. Broderick, PhD and Arthur A. Stone, PhD, U01AR057948, U01AR052170; University of Washington, Seattle, PIs: Heidi M. Crane, MD, MPH, Paul K. Crane, MD, MPH, and Donald L. Patrick, PhD, U01AR057954; University of Washington, Seattle, PI: Dagmar Amtmann, PhD, U01AR052171; University of North Carolina, Chapel Hill, PI: Harry A. Guess, MD, PhD (deceased), Darren A. DeWalt, MD, MPH, U01AR052181; Children’s Hospital of Philadelphia, PI: Christopher B. Forrest, MD, PhD, U01AR057956; Stanford University, PI: James F. Fries, MD, U01AR052158; Boston University, PIs: Alan Jette, PT, PhD, Stephen M. Haley, PhD (deceased), and David Scott Tulsky, PhD (University of Michigan, Ann Arbor), U01AR057929; University of California, Los Angeles, PIs: Dinesh Khanna, MD (University of Michigan, Ann Arbor) and Brennan Spiegel, MD, MSHS, U01AR057936; University of Pittsburgh, PI: Paul A. Pilkonis, PhD, U01AR052155; Georgetown University, PIs: Carol M. Moinpour, PhD (Fred Hutchinson Cancer Research Center, Seattle) and Arnold L. Potosky, PhD, U01AR057971; Children’s Hospital Medical Center, Cincinnati, PI: Esi M. Morgan DeWitt, MD, MSCE, U01AR057940; University of Maryland, Baltimore, PI: Lisa M. Shulman, MD, U01AR057967; and Duke University, PI: Kevin P. Weinfurt, PhD, U01AR052186).
NIH Science Officers on this project have included Deborah Ader, PhD, Vanessa Ameen, MD (deceased), Susan Czajkowski, PhD, Basil Eldadah, MD, PhD, Lawrence Fine, MD, DrPH, Lawrence Fox, MD, PhD, Lynne Haverkos, MD, MPH, Thomas Hilton, PhD, Laura Lee Johnson, PhD, Michael Kozak, PhD, Peter Lyster, PhD, Donald Mattison, MD, Claudia Moy, PhD, Louis Quatrano, PhD, Bryce Reeve, PhD, William Riley, PhD, Peter Scheidt, MD, Ashley Wilder Smith, PhD, MPH, Susana Serrate-Sztein, MD, William Phillip Tonkins, DrPH, Ellen Werner, PhD, Tisha Wiley, PhD, and James Witter, MD, PhD. The contents of this article use data developed under PROMIS. These contents do not necessarily represent an endorsement by the US Federal Government or PROMIS. See www.nihpromis.org for additional information on the PROMIS® initiative.
CONFLICT OF INTEREST
Benjamin D. Schalet: None
Paul A. Pilkonis: None
Lan Yu: None
Nathan Dodds: None
Kelly L. Johnston: None
Susan Yount: None
William Riley: None
David Cella is an unpaid member of the board of directors and officer of the PROMIS Health Organization