In cancer treatment trials, the standard source of adverse symptom data is clinician reporting by use of items from the National Cancer Institute’s Common Terminology Criteria for Adverse Events (CTCAE). Patient self-reporting has been proposed as an additional data source, but the implications of such a shift are not understood.
Patients with lung cancer receiving chemotherapy and their clinicians independently reported six CTCAE symptoms and Karnofsky Performance Status longitudinally at sequential office visits. To compare how patient's vs clinician's reports relate to sentinel clinical events, a time-dependent Cox regression model was used to measure associations between reaching particular CTCAE grade severity thresholds with the risk of death and emergency room visits. To measure concordance of CTCAE reports with indices of daily health status, Kendall tau rank correlation coefficients were calculated for each symptom with EuroQoL EQ-5D questionnaire and global question scores. Statistical tests were two-sided.
A total of 163 patients were enrolled for an average of 12 months (range = 1–28 months), with a mean of 11 visits and 67 (41%) deaths. CTCAE reports were submitted by clinicians at 95% of visits and by patients at 80% of visits. Patients generally reported symptoms earlier and more frequently than clinicians. Statistically significant associations with death and emergency room admissions were seen for clinician reports of fatigue (P < .001), nausea (P = .01), constipation (P = .038), and Karnofsky Performance Status (P < .001) but not for patient reports of these items. Higher concordance with EuroQoL EQ-5D questionnaire and global question scores was observed for patient-reported symptoms than for clinician-reported symptoms.
Longitudinally collected clinician CTCAE assessments better predict unfavorable clinical events, whereas patient reports better reflect daily health status. These perspectives are complementary, each providing clinically meaningful information. Inclusion of both types of data in treatment trial results and drug labels appears to be warranted.
Clinicians report adverse events in clinical trials using the National Cancer Institute's Common Terminology Criteria for Adverse Events; whether patient self-reports are useful is unknown.
Lung cancer patients who were receiving chemotherapy and their clinicians independently reported six Common Terminology Criteria for Adverse Events symptoms and Karnofsky Performance Status at office visits. Patient and clinician reports were compared by assessing associations between Common Terminology Criteria for Adverse Events grade severity thresholds and risk of death and emergency room visits. Relationships of Common Terminology Criteria for Adverse Events reports with indices of daily health status were also measured.
Patients reported symptoms earlier and more frequently than clinicians. Associations with death and emergency room admissions were seen for clinician reports of fatigue, nausea, constipation, and performance status but not for patient reports of these items. Patient-reported symptoms were more often in agreement with measures of daily health status than clinician-reported symptoms.
The perspectives of clinicians and patients regarding adverse events are complementary—clinician reports better predict unfavorable clinical events and patient reports better reflect daily health status.
Because the study included patients from one center with one cancer type and specific survey questions, the findings may not apply to other patient populations.
From the Editors
In cancer treatment trials, the standard source of adverse symptom data is clinician reporting by use of items from the National Cancer Institute's (NCI) Common Terminology Criteria for Adverse Events (CTCAE) (1). Patient self-reporting is not currently an accepted source of this information (2,3). Consequently, during the drug approval process, industry sponsors and the Food and Drug Administration (FDA) have access only to clinician impressions of adverse symptoms but not to patient accounts of these events (4). Toxicity tables in oncology drug labels include clinician-based information only. This approach has been criticized because of its exclusion of the patient's perspective, with suggestions that patients are in the best position to describe their own symptoms and that their perspective is relevant for labeling to inform future users of treatments about anticipated effects (5–7).
Recently, interest has increased at the FDA and NCI to adopt patient-reported outcomes in trials as a standard data source for measuring subjective phenomena (8–11). In 2006, a draft FDA Guidance was issued outlining standards for development of patient-reported outcome measures to support labeling claims (12), followed by an NCI-sponsored conference on patient-reported outcomes in cancer research (13,14).
NCI is currently considering including a set of patient items in future versions of the CTCAE (2,15). The feasibility of collecting such data directly from patients has previously been demonstrated, as has the willingness of practitioners to use this information to inform treatment decisions (16–18). Although the narrowing digital divide is likely to make electronic collection of such data more efficient and affordable in the future (19,20), widespread adoption of patient reporting would require substantial new resources at study sites (5) and would ultimately change how toxic drugs appear in their labels, because patients receiving cancer treatment report greater symptom severity more often than their clinicians do (6,21,22).
Whether patient or clinician reports of adverse symptoms, or both, should be included in treatment trials and drug labels is a topic of recent debate (2,5). It is germane to this debate to determine which data source is more informative about the overall toxic impact of medical products on the patient experience because adverse event information is generally considered in aggregate during regulatory review as an indicator of safety (23). To make such a determination, we designed a study to compare the associations of longitudinally collected patient vs clinician adverse symptom reports with three established clinical outcomes that are central to the patient experience with cancer care: global health status, risk of emergency room visits, and risk of death.
A study protocol was approved by the Institutional Review Board of Memorial Sloan-Kettering Cancer Center. Patients with lung cancer starting noninvestigational chemotherapy regimens at Memorial Sloan-Kettering Cancer Center who were able to read and understand an English-language questionnaire were invited to enroll. All participants provided written informed consent. Patients were observed for up to 28 months or until death.
At baseline and each clinic visit, patients were asked to complete a seven-item toxicity questionnaire via a touch screen tablet computer interface. Items included patient adaptations of CTCAE version 3.0 symptom items salient to individuals receiving chemotherapy (including fatigue, pain, nausea, vomiting, diarrhea, and constipation) and Karnofsky Performance Status (KPS). These items were previously developed for patient use through focus groups, interviews, and comparisons of patient vs clinician responses and have been previously described (16,17,21). Because these items were derived from the CTCAE, their intention by design is adverse symptom screening. Patients completed items before their encounters with clinicians because previous work has demonstrated no effects on either patient or clinician toxicity reporting, regardless of whether patients report before or after seeing clinicians (21). No incentives were offered for completing questionnaires. The study was conducted before the development of CTCAE version 4.0.
Clinician assessments of these same symptoms using the CTCAE, as well as KPS, are a part of standard chart documentation at Memorial Sloan-Kettering Cancer Center. At each clinical encounter, this information is completed by a nurse and/or a medical oncologist using a preprinted form. These data were abstracted from medical charts of enrollees for visit days during the study by two trained research assistants (A. Barz and M. Appawu). It was documented whether the data source was a nurse or a physician, and if both reported independently at a given visit, each score was coded separately for use in a sensitivity analysis. In cases of coding disagreement or unclear reporting, the chart was reviewed by a senior data manager and medical oncologist for adjudication.
CTCAE items were graded for severity by use of a five-point ordinal scale, whereas an 11-point scale was used for KPS to assess level of physical capability (assigned in units of 10 between 0 and 100) and for pain as noted in the medical chart (assigned between 0 and 10). Per NCI instructions for the CTCAE, a score of 1 was defined as “mild,” 2 as “moderate,” 3 as “severe,” and 4 as “disabling” (1). For the 0–10 pain scale, previously established score ranges associated with mild, moderate, or severe designations were used (24–26).
To compare the extent to which patient-reported vs clinician-reported items, which had been collected at multiple time points longitudinally, predicted discrete clinical outcomes, a time-dependent Cox model was used in which each possible grade level was considered as a threshold for each item of interest (ie, mild, moderate, severe, or disabling). Clinical outcomes included death and emergency room visits. The association of each specified grade threshold with the risk of each clinical outcome was calculated. This analysis was conducted separately for the patient-reported version and clinician-reported version of each item to compare their relative strengths of association. P values were generated with Wald statistics. Only those predictor values that were recorded before a clinical outcome of interest were used in the Cox model.
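The data preparation implied by such a threshold-based, time-dependent analysis can be sketched as follows. This is a minimal illustration rather than the study's actual code: the function names, the representation of visits as (month, grade) pairs, and the rule that the covariate stays switched on once a threshold is first reached are all assumptions made for the example. The resulting (start, stop, reached, event) rows are in the counting-process layout that standard Cox regression software accepts for time-dependent covariates.

```python
from typing import List, Optional, Tuple

# Severity labels as given in the text; treating grade 0 as "symptom absent"
# is an assumption of this sketch.
GRADE_LABELS = {0: "none", 1: "mild", 2: "moderate", 3: "severe", 4: "disabling"}

Visit = Tuple[float, Optional[int]]      # (months since enrollment, grade)
Row = Tuple[float, float, int, int]      # (start, stop, reached, event)


def first_time_at_threshold(visits: List[Visit], threshold: int) -> Optional[float]:
    """Earliest visit time with a recorded grade at or above the threshold."""
    times = [t for t, g in sorted(visits, key=lambda v: v[0])
             if g is not None and g >= threshold]
    return times[0] if times else None


def counting_process_rows(visits: List[Visit], threshold: int,
                          end_time: float, event: bool) -> List[Row]:
    """Split one patient's follow-up into intervals for a time-dependent Cox model.

    `end_time` is the time of the outcome (or of censoring) and `event` says
    whether the outcome occurred. Only reports recorded before `end_time`
    are allowed to inform the covariate.
    """
    prior = [(t, g) for t, g in visits if t < end_time]
    switch = first_time_at_threshold(prior, threshold)
    if switch is None:
        # Threshold never reached before the outcome: one unexposed interval.
        return [(0.0, float(end_time), 0, int(event))]
    if switch <= 0:
        # Threshold already met at baseline: one exposed interval.
        return [(0.0, float(end_time), 1, int(event))]
    # Unexposed until the threshold is first reached, exposed afterwards.
    return [(0.0, float(switch), 0, 0),
            (float(switch), float(end_time), 1, int(event))]
```

For example, a hypothetical patient graded 1, 2, and 3 at months 0, 2, and 5 who died at month 9 would contribute an unexposed interval from month 0 to 2 and an exposed interval from month 2 to 9 for the "moderate" threshold. Building one such table per item, per threshold, and per reporter (patient or clinician) would allow the separate models described above to be fitted.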
To compare the extent to which patient-reported vs clinician-reported items reflect global health status, patients completed two measures during the study: 1) the EuroQoL EQ-5D questionnaire (27) and 2) a global question that uses a visual analog scale to rate current health state from worst to best imaginable (28). Both measures have been shown to be reliable and valid assessments of health status in cancer patients (28,29). The EQ-5D consists of five items (including mobility, self-care, daily activities, pain and/or discomfort, and anxiety and/or depression), which are combined to render a single score that is adjusted for US population preference weights. The global question yields a single score between 0 and 100. Patients completed the EQ-5D questionnaire at each clinic visit and the global question only at baseline. To compare the strength of concordance of each patient-reported and clinician-reported item with these two measures of health status, Kendall tau rank correlation coefficients were calculated for each relationship, with −1 representing perfect negative concordance, 0 representing independence, and +1 representing perfect positive concordance. Confidence intervals (CIs) were derived from bootstrap resamplings on patients to account for correlations among repeated measures.
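A concordance estimate with patient-level resampling of this kind might be computed along the following lines. The sketch is illustrative only: the text does not state which variant of Kendall tau or which bootstrap interval was used, so the tau-b tie correction, the percentile interval, the number of resamples, and all function and variable names here are assumptions.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple


def kendall_tau_b(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall tau-b: (concordant - discordant) / sqrt((n0 - tx) * (n0 - ty))."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n0 = n * (n - 1) // 2
    denom = math.sqrt((n0 - ties_x) * (n0 - ties_y))
    return (concordant - discordant) / denom if denom > 0 else float("nan")


def patient_bootstrap_ci(data: Dict[str, List[Tuple[float, float]]],
                         n_boot: int = 1000, alpha: float = 0.05,
                         seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for tau, resampling whole patients.

    `data` maps a patient identifier to that patient's repeated
    (symptom grade, health status score) pairs. Drawing patients rather
    than individual visits keeps each patient's correlated visits together.
    """
    rng = random.Random(seed)
    ids = sorted(data)
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(ids) for _ in ids]
        pairs = [pair for pid in sample for pair in data[pid]]
        tau = kendall_tau_b([p[0] for p in pairs], [p[1] for p in pairs])
        if not math.isnan(tau):
            estimates.append(tau)
    estimates.sort()
    lower = estimates[int((alpha / 2) * (len(estimates) - 1))]
    upper = estimates[int((1 - alpha / 2) * (len(estimates) - 1))]
    return lower, upper
```

The pairwise loop is quadratic in the number of observations, which is acceptable for a dataset of this size; a library implementation would be preferable for larger data.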
For each successive visit, the proportion of symptom reports completed by patients vs clinicians was tabulated to compare the extent of missing data between these two reporting approaches. If a patient's or a clinician's report was missing at a visit, bracketed by completed reports at previous and subsequent visits, then the most recent previously reported score was used for the analysis.
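The carry-forward rule for bracketed gaps can be illustrated with a short sketch. The function name and the use of `None` to mark a missing report are assumptions of the example; gaps before a reporter's first completed report or after the last one are left missing, because they are not bracketed by completed reports.

```python
from typing import List, Optional


def fill_bracketed_gaps(scores: List[Optional[int]]) -> List[Optional[int]]:
    """Carry the last reported score forward into gaps that have a completed
    report both before and after them; leave all other gaps missing."""
    filled = list(scores)
    observed = [i for i, s in enumerate(scores) if s is not None]
    if not observed:
        return filled
    last_observed = observed[-1]
    previous = None
    # Stop at the last completed report: later gaps are not bracketed.
    for i in range(last_observed + 1):
        if filled[i] is None:
            if previous is not None:
                filled[i] = previous
        else:
            previous = filled[i]
    return filled
```

Applied to a hypothetical visit sequence of grades 1, missing, missing, 2, missing, the function returns 1, 1, 1, 2, missing.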
P values were generated by use of the two-sided Wald test for parameter estimates in the Cox models; two-sided P values and 95% confidence intervals for the Kendall tau correlation coefficients were generated by bootstrap resamplings on patients to account for correlations among repeated measurements. P values of less than .05 were considered as statistically significant.
Between June 15, 2005, and June 14, 2006, 190 consecutive patients with lung cancer were approached, of whom 185 were eligible and 163 consented to participate. Baseline characteristics are shown in Table 1. The median age was 63 years (range = 35–85 years). Most patients (69%) were diagnosed with advanced or metastatic non–small cell lung cancer, with predominantly good baseline provider–reported performance status. A majority of patients received cytotoxic chemotherapy in clinic once every 3 weeks. Mean enrollment was 12 months (range = 1–28 months), and the average number of clinic visits was 11 (range = 1–40). Sixty-seven (41%) of the 163 patients died during the study period. Twenty clinicians cared for these patients and documented their adverse symptoms in medical charts, including nine attending oncologists and 11 oncology nurses. The average number of patients cared for by any given oncologist was 16, with three oncologists treating fewer than 10 patients, three treating 10–20, and three treating 30–40. During the study, there were a total of 1712 clinic visits at which adverse symptom forms were submitted by patients at 1362 visits (80%) and by clinicians at 1601 visits (95%). Rates of missing data varied by symptom (Table 2). In general, rates of survey completion were higher among clinicians than patients. Reasons cited by patients for not completing forms included “no changes in symptoms from the previous time reporting” (56%), “inconvenient timing” (15%), “staff forgot to administer form” (14%), “felt too sick” (8%), and “did not feel like it” (6%). There was no effect of treatment type on compliance levels. At most visits, clinician documentation of adverse symptoms was completed by nurses (53%), followed by attending oncologists (38%), and oncology fellows (7%).
Figure 1 shows the cumulative incidence of attaining various severity levels of symptom items and performance status as reported by patients vs clinicians, with death considered as a competing risk. In general, each symptom severity level (ie, threshold) was reached earlier in the course of care and more frequently according to patient reports as opposed to clinician reports.
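A cumulative incidence curve that treats death as a competing risk can be estimated nonparametrically. The sketch below uses the standard Aalen-Johansen form, in which each increment is the probability of remaining free of any event up to that time multiplied by the cause-specific hazard at that time. It is offered as an illustration of the general technique, not as the estimator used to produce Figure 1; the status coding and function name are assumptions.

```python
from typing import List, Tuple


def cumulative_incidence(records: List[Tuple[float, int]],
                         cause: int = 1) -> List[Tuple[float, float]]:
    """Nonparametric cumulative incidence for one cause with competing risks.

    Each record is (time, status) with status 0 = censored, 1 = reached the
    symptom threshold, 2 = died before reaching it (assumed coding).
    Returns (time, cumulative incidence) at each time the cause occurs.
    """
    at_risk = len(records)
    event_free = 1.0   # probability of no event of any kind before this time
    incidence = 0.0
    curve = []
    for t in sorted({time for time, _ in records}):
        statuses = [s for time, s in records if time == t]
        d_cause = sum(1 for s in statuses if s == cause)
        d_any = sum(1 for s in statuses if s != 0)
        if at_risk > 0 and d_any > 0:
            incidence += event_free * d_cause / at_risk
            event_free *= 1.0 - d_any / at_risk
        if d_cause > 0:
            curve.append((t, incidence))
        at_risk -= len(statuses)   # events and censorings both leave the risk set
    return curve
```

Unlike one minus a Kaplan-Meier estimate that censors patients at death, this estimator does not assume that patients who died could still have gone on to reach the threshold, so the cumulative incidences for the two event types never sum to more than 1.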
Table 3 shows the associations of symptoms reported by patients or clinicians with the risk of death and the risk of emergency room visits. For the threshold of moderate fatigue in a univariate analysis, there was no relationship between patient self-reports and the risk of death (P = .23), whereas there was a highly statistically significant relationship for clinician reports of this symptom severity level (P < .001), with a hazard ratio for death of 2.75. The threshold was reached by a similar number of patients by each reporting approach, with a similar number of deaths among those reaching the threshold (46% and 48%). Raising the threshold to severe fatigue yielded similar results, with a non-statistically significant relationship for patient self-reports (P = .15), but a statistically significant relationship for clinician reports (P = .003), with a hazard ratio for death of 2.39. In this case, fewer patients reached the threshold by the clinician's report compared with the patient's report (17% vs 37%), yet a relatively higher proportion of those identified by clinicians as having severe fatigue died (57% vs 48%).
A similar pattern was observed for the relationships of moderate nausea and constipation thresholds with the risk of death, with non-statistically significant relationships for patient reports (P = .38 and P = .40, respectively) and statistically significant relationships for clinician reports (P = .01 and P = .038, respectively). Too few reports of severe nausea or vomiting existed to conduct analyses of these thresholds. The pattern was repeated for KPS, for which clinician reports yielded the strongest observed association (hazard ratio = 6.39, 95% CI = 3.88 to 11.08; P < .001) compared with patient reports (hazard ratio = 1.40, 95% CI = 0.84 to 2.35; P = .20). The cutoff KPS score of 70 or less [defined as “cares for self; unable to carry on normal activity or to do active work” (30)] was the most predictive threshold, although other cutoffs yielded similar results. Notably, worse survival was seen in patients for whom clinicians reported severe nausea, pain, fatigue, or KPS score of 70 or less during the initial month of observation (with a median survival time of 12 months) compared with patients for whom severe grades were not assigned (who did not reach a median survival time during the study).
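The relationship between a reported hazard ratio, its 95% confidence interval, and a two-sided Wald P value can be checked with a short calculation. The sketch assumes that the interval is a symmetric Wald interval on the log hazard scale, which is a common convention but is not stated in the text; the function name is illustrative.

```python
import math
from typing import Tuple


def wald_p_from_ci(hazard_ratio: float, lower: float, upper: float,
                   z_crit: float = 1.959964) -> Tuple[float, float]:
    """Recover the Wald z statistic and two-sided P value from a hazard ratio
    and its confidence limits, assuming a symmetric interval on the log scale."""
    # The interval spans 2 * z_crit standard errors on the log hazard scale.
    std_error = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hazard_ratio) / std_error
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z_patient, p_patient = wald_p_from_ci(1.40, 0.84, 2.35)
z_clinician, p_clinician = wald_p_from_ci(6.39, 3.88, 11.08)
```

Under this assumption, the patient-reported KPS values above give a P value of approximately .20 and the clinician-reported values give a P value well below .001, in line with the reported results.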
Analyses were generally not informative for symptoms when using mild as a threshold (because of commonness at baseline by both patient and clinician reports) or when using disabling as a threshold (because of its overall rarity in this outpatient population). Too few events existed for any threshold levels of vomiting or diarrhea to conduct analyses of these symptoms. No effect on any estimates was observed for patient or care characteristics, including sex, age, education level, clinician type (physician or nurse), treating oncologist, disease stage, or chemotherapy type.
A sensitivity analysis evaluated the associations between the second time patients reached each severity threshold and the outcomes of interest, and results were similar. In a bivariate analysis, patient-reported data did not augment the predictive ability of clinician reporting. In a multivariable model that included all items, KPS dominated in its ability to predict the measured endpoints.
Results were similar for the association of symptom thresholds with risk of emergency room admission, with statistically significant relationships observed for clinician-reported moderate fatigue, pain, and KPS thresholds but not for patient reports. There were too few events for other thresholds to conduct analyses.
Figure 2 shows the relative strengths of concordance between patient-reported and clinician-reported CTCAE symptoms and KPS with the two validated measures of health status (EuroQoL EQ-5D questionnaire and 0–100 global question). In general, higher levels of concordance were seen for patient reports compared with clinician reports. Results were more pronounced for EQ-5D scores, which were measured at each sequential clinic visit, compared with 0–100 global question scores, which were measured only at baseline.
In this longitudinal study of patients with advanced lung cancer, we found that clinician reporting of CTCAE symptoms and performance status was statistically significantly associated with unfavorable clinical outcomes, such as death and emergency room admissions, whereas patient reporting was more strongly associated with measures of daily health status. Patients generally reported symptoms earlier and more frequently than clinicians. Because clinicians are attuned to patients’ trajectories to major disease benchmarks, they appeared to reserve assignment of more severe grades until it was clear that such a benchmark was impending at the expense of sensitivity to granular aspects of patients’ daily experiences. In contrast, patient reporting appeared to better reflect real-time suffering at the expense of sensitivity to impending sentinel events, such as death or hospitalization. These results expand on previous studies that have reported statistically significant associations of clinician-reported performance status with survival (31–33) and of patient-reported outcomes including pain and health-related quality of life with survival (34–41).
Patient and clinician perspectives of adverse symptoms appear to be complementary, together providing a more complete picture of the toxic impact of treatments compared with either perspective alone. Clinicians bring professional training and experience to their evaluations, whereas patients are in a better position to communicate their own subjective experiences. Currently, in cancer treatment trials, clinicians but not patients serve as the source of adverse symptom data (1). Our findings suggest that this approach results in the loss of information that might be valuable to prospective prescribers and patients in understanding the anticipated effects of treatment.
As this study and previous work (6,21,22) have demonstrated, patients more frequently report worse symptom severity than clinicians. We further found that patients tend to report adverse symptoms earlier in the course of care than their clinicians. Therefore, drug labels that are based exclusively on clinician reporting may underestimate the frequency and severity of adverse symptom events as compared with the patient's perspective. In contrast, if patient reporting were adopted as a new standard, then a frameshift would occur whereby drugs would appear more toxic in their labels. This frameshift effect would balance out in controlled studies because symptoms unrelated to treatment would presumably equilibrate between groups. Because the incidence of reported symptoms would rise across the entire patient population, such trials would have greater power to discern between background symptoms and those related to interventions. In single-arm trials, comparisons of patient-reported adverse symptoms with baseline values would help distinguish between preexisting symptoms and those that developed during a study. However, single-arm trials are uncommonly used as data sources for cataloging adverse events in labels, and symptom reports from phase III trials would have the advantage of identifying the incremental symptom burden attributable to new treatments.
Given our findings, which source of adverse symptom data should be included in oncology trials and drug labels: clinicians, patients, or both? Because clinical trials already capture survival and hospitalization data, and clinician-reported symptoms appear to be closely related to these events but not to daily health status, we suggest that clinician-reported symptoms will add only limited information beyond what is already collected in trials. In contrast, patient reports are less frequently associated with clinical endpoints but do reflect daily health status and, hence, provide additional information beyond what is currently collected in trials or reported in labels. The addition of patient-reported adverse symptoms to drug labels may therefore be warranted.
Even if patient reporting of adverse symptoms is adopted in oncology trials, it is unlikely that clinician reporting of this information will be discontinued. Clinician reporting is well established and, as demonstrated in this study, is generally more complete than patient reporting. Therefore, the questions are whether the current model should be expanded to include documentation of patient-reported outcomes in addition to clinician reports and whether it would cause confusion if clinician and patient adverse symptom data were both presented in study results and labels, given that they are often discrepant (6,21,22) (eg, would a label confuse readers if 30% of clinicians but 55% of patients reported severe nausea?).
In fact, there are many examples of reporting structures outside of medicine in which professional and nonprofessional ratings of the same phenomena are presented together and are well accepted. Highly trafficked Internet review sites for books, hotels, movies, and consumer electronics provide expert and consumer ratings side-by-side, despite frequent discrepancies (19). Implicit in these sites is an understanding that professionals and nonprofessionals consider different criteria when reviewing the same product, and each perspective brings value. When considering a hotel, the opinion of an expert with a broad and technical standpoint is useful, as are the views of individuals who have actually stayed at the hotel. However, drug labels are not like these Web sites and, by comparison, are missing half the picture. Consequently, when an oncologist and patient sit down to discuss starting a new regimen and wish to consider its potential toxic effects, the impressions of that patient's peers are not available; only the impressions of the oncologist's colleagues are included in the label.
The value of patient input in clinical research is already well established in other contexts, including patient advocate participation in NCI study sections and FDA advisory committee meetings and patient involvement in trials directly reporting health-related quality of life and symptom endpoint data. Results of this study provide direction for extending this model to incorporate the patient's perspective in adverse event documentation. Such a change would potentially have implications for multiple stakeholders, including patients, investigators, FDA reviewers, industry sponsors, and clinicians (Table 4) (18,42). In the future, this patient-reporting model could be expanded into the routine care setting to assist clinicians with symptom management and could also contribute real-time data to postmarket drug safety surveillance systems (43,44).
This study had several limitations. It was conducted at an urban tertiary cancer center in patients with a single cancer type, potentially limiting the generalizability of our findings. The results are based on the use of specific symptom and performance status items and may be subject to unique measurement properties that may not extend to other items, although in general, the items used are standard for assessing patient status in NCI-sponsored treatment trials (1). Corroboration of the results is warranted in a multicenter study with a diverse patient population, and such a study is under way in the NCI cooperative group setting. Nonresponse bias is a common challenge in symptom research, although response rates were high overall in this study and reasons for nonresponses were tracked. It is not known whether patients and their clinicians would alter treatment decisions when provided with patient-reported data about drug toxicity. Future studies in this area would provide an indication of the extent to which patterns of care would change if drug labels included patient-reported adverse symptom information.
Funding to conduct this research was received from the National Cancer Institute, American Society of Clinical Oncology, and the Steps-for-Breath philanthropic fund of Memorial Sloan-Kettering Cancer Center.
The sponsors had no role in the study design, data collection or analysis, interpretation of the results, or in the preparation of the manuscript or in the decision to submit the manuscript for publication.
The authors wish to thank Drs Elena Elkin and Yuelin Li for reviewing the manuscript and providing valuable comments, and Tony Riley and Joseph Christoff for their assistance with graphics and document preparation, respectively.