Randomized controlled trials remain the gold standard for evaluating the efficacy of cancer interventions. They are not always feasible, practical, or timely, however, and often do not adequately reflect patient heterogeneity and real-world clinical practice. Comparative effectiveness research can leverage secondary data to help fill knowledge gaps that randomized trials leave unaddressed; however, comparative effectiveness research also faces shortcomings. The goal of this project was to develop a new model and inform an evolving framework articulating the data needs of cancer comparative effectiveness research.
We examined prevalent models and conducted semi-structured discussions with 76 clinicians and comparative effectiveness researchers affiliated with the Agency for Healthcare Research and Quality's cancer comparative effectiveness research programs.
A new model was iteratively developed. It presents cancer comparative effectiveness research and its important measures within a patient-centered, longitudinal chronic care model that better reflects contemporary cancer care in the context of the cancer care continuum, rather than a single-episode, acute-care perspective.
Immediately relevant for federally funded comparative effectiveness research programs, the model informs an evolving framework articulating cancer comparative effectiveness research data needs, including evolutionary enhancements to registries and epidemiologic research data systems. We discuss elements of contemporary clinical practice, methodology improvements, and related needs affecting comparative effectiveness research's ability to yield findings that clinicians, policymakers, and stakeholders can confidently act on.
The accelerated pace of scientific discovery has yielded rapid advances in cancer care, underscoring the need for timely, evidence-based information to guide implementation of new interventions into clinical practice. Randomized controlled trials are valid tools for conducting comparative effectiveness research and remain the gold standard in determining efficacy of new interventions;1 however, this research design is not always feasible, practical, or sufficiently timely. Moreover, because of their highly selective inclusion criteria and limited enrollment, randomized trials are commonly unable to characterize the myriad combinations of interventions and heterogeneous patient characteristics, and thus fall short of informing "real world" clinical practice.2–6
Thanks to advances in information technology and recently developed statistical methods, ever-growing repositories of observational data may be leveraged to conduct cancer comparative effectiveness research. Drawing on secondary data collected through patient registries, electronic health records, administrative data, interventional clinical trials, and elsewhere, non-experimental cancer comparative effectiveness research holds great promise for addressing many of the shortcomings of randomized trials and filling the knowledge gaps they leave unaddressed.7–10 However, a primary challenge in using secondary data is comprehensively and confidently characterizing important processes of care and outcomes while effectively controlling for potential confounders. As such, it remains a challenge for cancer comparative effectiveness research to generate valid, timely, and broadly generalizable new information reflecting diverse, "real-world" patients and meaningful outcomes, and many cancer care stakeholders remain skeptical or suspicious of non-experimental comparative effectiveness research.11–13 To overcome these challenges and generate findings that meet different evidentiary standards with sufficient confidence, we must clearly understand data shortcomings and use this knowledge to improve future data and methods.
To inform an evolving framework for understanding cancer comparative effectiveness research data needs, we reviewed prevalent data models and incorporated the feedback of over 70 cancer comparative effectiveness research and outcomes researchers, clinicians, and other stakeholders to develop a conceptual model for examining secondary data in cancer comparative effectiveness research. This model provides a template for informing future data collection and methods development efforts relevant to not only secondary data but also prospective research, with the ultimate goal of advancing the utility and acceptability of cancer comparative effectiveness research for clinical and policy-relevant decision-making.
Comparative effectiveness research can take numerous forms, but for this discussion, we focus on the use of secondary data (i.e., data used for reasons other than that for which it was originally collected) and non-experimental studies in the context of the Institute of Medicine’s definition of comparative effectiveness research:
Comparative effectiveness research is the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat, and monitor a clinical condition or to improve the delivery of care. The purpose of CER is to assist consumers, clinicians, purchasers, and policy makers to make informed decisions that will improve health care at both the individual and population levels.14,15
Grounded in this definition, members of the study team developed a baseline conceptual model from prevalent models in the recent literature. Models from multiple specialties were considered, though the focus was on the cancer-specific context. Search topics included multiple forms of the following concepts: cancer care continuum, data requirements of non-experimental studies, and models of cancer care, with an emphasis on framework, quality of care, outcomes, measures, measure inter-relationships, and key points of assessment. These topics were examined with regard to not only non-experimental studies, but also randomized trials because they are perceived to be of the highest evidentiary standard, they have been central in informing evidence-based practice and policy, and it is important to understand how models using secondary data may, of necessity, contrast to randomized trials or evolve from them. We concentrated primarily on developing an extensive model of data elements and explicit measures associated with clinical status, treatment selection, and outcomes to inform non-experimental comparative effectiveness research.
The primary study team included an epidemiologist, a pharmacoepidemiologist, a biostatistician, a health services researcher, and two cancer-focused physician researchers, all of whom conduct patient-centered cancer outcomes research. First, we met with a convenience sample of cancer outcomes researchers associated with the Agency for Healthcare Research and Quality (AHRQ)'s Cancer-focused Developing Evidence to Inform Decisions about Effectiveness (Can-DEcIDE) Comparative Effectiveness Research Consortium.16 Discussants came from multiple disciplines and included clinicians, clinical trials experts, epidemiologists, health services researchers, biostatisticians, clinical data managers, state public health workers, and informaticians. Applying snowball sampling, discussants were asked to identify other researchers who might provide additional insight. The final sample consisted of 41 discussants.† Discussions covered the important measures and outcomes in specific datasets relevant to cancer comparative effectiveness research, as well as each dataset's strengths and weaknesses, population/sample, linking capabilities with other data, and data structure and accessibility.
The study team reviewed findings from the literature and notes from key-informant discussions, and organized and integrated these to develop and refine the conceptual model. The model was shared with the key informants and evolved over multiple iterations. Subsequently, it was presented at a meeting of the Agency for Healthcare Research and Quality-funded Registry of Patient Registries (RoPR) project, the goal of which is to promote the use of standard health outcome measures and other data elements across registries.17,18 At this meeting, the model was dissected and further refined through intensive discussion with 35 stakeholders including providers, payers, researchers, government officials, industry representatives, and journal editors.‡
From a large but fragmented literature, we identified and focused closely on the 20 references most useful for this study.8,19–37 The model by Zapka and colleagues (Figure 1)25 characterizes the spectrum of cancer care well and illustrates the overarching context of the cancer care continuum in which comparative effectiveness research studies are situated.5,38,39 Although developed as a process model, it is useful for identifying relevant interventions, data elements, and intermediate outcomes important for effectiveness studies. Survivorship care is not specifically described, though it is inherent in the longitudinality of cancer care presented. The model's categories help demonstrate the broad range of stakeholders whose interests are relevant to this discussion, extending far beyond primary cancer treatment.
Geracci and colleagues40 present a model reflecting randomized trials of cancer treatment efficacy, which provides a foundation from which to build an analogous model for assessing intervention effectiveness using secondary data (Figure 2). Though underspecified compared to the measures actually collected in most modern randomized trials, this model captures the fact that minimal specification suffices in randomized trials, since randomization controls for confounding in treatment selection and outcomes. It also serves as a foil against which to conceptually contrast the extensively specified model required when examining non-experimental data. Its simple temporal unidirectionality also reflects the "intention to treat" principle prevalent in randomized trials, wherein the treatment or intervention (i.e., exposure) is defined by the first care episode or arm to which the person was randomized, regardless of whether that person subsequently changed their course of care.41
Figure 3 represents this study's newly developed conceptual model for examining data for non-experimental cancer comparative effectiveness research, integrating concepts presented in the prior two models, elsewhere in the literature,11–32,40 and our discussions with comparative effectiveness researchers and stakeholders. Among predictor variables, it extends well beyond traditional measures of the person and the cancer by characterizing a hierarchical set of factors relevant to treatment selection and outcomes at the patient, health care provider, and environment levels, each with its own subcategories of measures. These multi-level measures feed into a model reflecting the simultaneously longitudinal and cyclical nature of cancer development, detection, and treatment. Among outcomes, the model extends beyond the traditional and familiar measure of survival or mortality by articulating intermediate outcomes, including treatment selection, and richer long-term outcomes such as overall quality of life. It can help identify confounders, mediators, and the multiple pathways driving treatments and outcomes, and thus serve as a guidepost for researchers developing study designs that mitigate many of the biases inherent in comparative effectiveness research.42
The model reflects the need for more extensive variable specification in the absence of randomization, and enumerates multiple data elements necessary for rigorous analyses that control for bias. Although originally oriented towards treatment-focused interventions, it also reflects other points on the cancer care continuum, including prevention, diagnosis, palliation, and supportive care. Cancer survivorship fits into this framework, as the measures can be tailored to each of the three stages of survivorship – acute, extended, and permanent43 – as the patient's cancer experience continuously evolves over time. Recognizing that the most relevant measures may vary substantially by the specific study question being examined, the model includes illustrative, though not exhaustive, examples of measures relevant to cancer, their interrelationships, and temporality, and specifies the relevance of intermediate outcomes both as important endpoints and as potential moderators of treatment. In taking this approach, the model moves conceptually beyond the intent-to-treat assumption. While intent to treat has its merits, this new model better reflects the realities of clinical practice by addressing multiple lines of therapy, heterogeneous populations, and the interactive and reciprocal feedback among multiple levels of relevant factors, including different modalities of care, treatments and their sequencing, and intermediate outcomes, all of which may evolve and change over time. In doing so, it presents cancer and cancer comparative effectiveness research in the context of a patient-centered, longitudinal chronic care perspective rather than a single-episode, acute-care perspective.
The goal of this study was to inform a framework for understanding cancer comparative effectiveness research data needs to help advance comparative effectiveness research and its ability to meet different evidentiary standards and thus be confidently utilized in clinical and policy decision-making. The model developed by the study team (Figure 3) serves as a baseline guide for addressing these goals. Compared to prior models, it better reflects the contemporary practice of cancer care by articulating measures and outcomes of growing importance and the iterative longitudinal process of cancer care. By doing so, it informs the evolution of current data sources such as cancer registries, the development of future such data resources, and serves as a useful template to help inform future study design and analysis.
The model articulates multiple factors that have grown in relevance as cancer care has evolved. Historically, cancer treatment options were few, and survival was relatively short. Data collected in our national system of cancer registries reflect this legacy, in that they accommodate limited information beyond basic demographics, diagnosis, initial treatment, and duration of survival. Currently, cancer continues to be detected at earlier, more treatable stages, multiple treatment options commonly exist across different modalities, and many different clinical and non-clinical factors determine which treatments are used. With these advances, cancer patients can often expect to live for many years after diagnosis.44 As such, additional measures must be developed and incorporated into existing systems. For example, genetic markers such as the KRAS test are increasingly able to provide predictive insight into treatment response, disease progression, or recurrence risk.45,46 The measures presented in Figure 3 are clearly illustrative and not exhaustive, and are not necessarily intended to represent the key ingredients in a recipe for the single, definitive dataset. Nor does the model imply that all of these measures are a prerequisite for valid inferences from non-experimental comparative effectiveness research studies. Rather, they are categorically representative of measures that should be tailored to the context of specific cancers and populations, since each cancer is unique.
The model also reflects contemporary cancer care in that it characterizes cancer as a chronic disease experienced through distinguishable phases rather than as a single acute-care event. Reflecting the multiple transitions of each individual cancer patient's illness trajectory, data elements and pathways in the model can accommodate disease activity and remission, multiple modalities and waves of treatment and testing, equally relevant periods of non-treatment with varying levels of surveillance,47 and other patient outcomes and needs, all of which can occur through recurrent, iterative steps and change over the span of many years.48–51 Prior therapies, coexisting conditions, time, treatment sequence, and physical and psychosocial characteristics associated with aging can influence the choice of subsequent therapies and outcomes. As people live longer and have more options, the model moves beyond the familiar outcome of survival to include intermediate outcomes that are important in and of themselves and as they modulate a longitudinal care process. It also includes measures of the structure, process, and outcomes of care, which collectively compose quality of care.52–54 Overall quality of life joins survival at the end of the model to reflect the tradeoff between quality and quantity of life that many people make.55,56 In sum, rather than focusing on a particular intervention, outcome, or even stage of the cancer care continuum, the model places the patient at its center, making evident the relevance of multiple levels of measures.
Although the model proved intuitive to practicing oncologists and outcomes researchers, it is critical to point out that almost no research datasets exist that capture all of these measures, the longitudinal course of care, and feedback among measures. Those that are closest were exceptionally expensive to create and maintain, and remain limited in size and/or difficult to access. Examples include myriad experimental and non-experimental studies, as well as the HMO Cancer Research Network,57 the Cancer Care Outcomes Research and Surveillance study,58 and the Women’s Health Initiative-linked Medicare data.59
Linkages among datasets will play an important role in bringing necessary measures into a single dataset for examination. Though not without its own limitations, the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER)-Medicare data exemplify how the linked data "whole" can be greater than the sum of its constituent parts. Electronic health records or clinical patient-reported data systems that rely on standardized diagnosis and procedure coding systems, such as the International Classification of Diseases (ICD) and the Healthcare Common Procedure Coding System (HCPCS), can also add important measures and detail. It is helpful that these systems are nearly ubiquitous; however, it will be increasingly important to capture not only diagnoses and the performance of procedures or tests, but also the indication for the procedures and even test results. To some degree, electronic health records capture this information, though commonly in large text fields that must be abstracted to be useful for researchers. Advances in information technology and the migration from ICD version 9 to ICD version 10 will help address some of these shortcomings.
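As a simple illustration of the abstraction problem, a researcher might scan an EHR free-text field for a biomarker result. The note text, phrasing, and pattern below are entirely hypothetical, and real-world abstraction (whether manual or via natural language processing) is considerably harder than this sketch suggests:

```python
import re

# Hypothetical free-text pathology note; real EHR text is far messier.
note = "Molecular pathology: KRAS mutation detected in codon 12. ER/PR not assessed."

# Look for a KRAS result phrase; this toy pattern would miss many real-world phrasings.
match = re.search(r"KRAS\s+(mutation detected|wild[- ]?type)", note, re.IGNORECASE)
kras_status = match.group(1).lower() if match else "not reported"
print(kras_status)  # prints "mutation detected"
```

Even this trivial example shows why structured capture of test results at the point of care would be preferable to after-the-fact abstraction from narrative text.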
With an adequate trust fabric protecting patient privacy, data-linkage at the individual patient level can couple detailed information like ICD with important markers such as stage and performance status longitudinally. The next generation of linked data includes distributed research networks, which are more dynamic and perpetually updated, helping address many such shortcomings of SEER-Medicare and similarly static linked data.60,61 Such networks may be the key to success for national quality of care improvement efforts, such as have been proposed under the Affordable Care Act,62,63 as well as the foundation for rapid learning healthcare as outlined by the Institute of Medicine, National Cancer Policy Forum, and the Office of the National Coordinator. 62–66
Dataset linkage will thus be of central importance, but growing privacy and confidentiality concerns and increasingly restrictive policies governing protected health information (PHI) severely limit access to unique identifiers (e.g., Social Security numbers, full names), which have been the mainstays of quality data linkages. Indeed, facing these restrictions and privacy concerns, many registries, health care organizations, and public health programs have stopped collecting or transmitting important protected health information elements as a matter of policy. For example, federal agencies now prohibit collection of Social Security numbers for contract research, and Institutional Review Boards are increasingly adopting such restrictions for Social Security numbers and even participant names.36,67 These policies have created substantially greater burden for both health care providers and researchers, and have even been implicated as contributing to research bias.68–74 As a result, there is an urgent and growing need to develop novel linkage methods to preserve and perpetuate the viability of secondary data for cancer comparative effectiveness research.
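One direction for such novel linkage methods is privacy-preserving record linkage, in which data holders exchange keyed hashes of normalized quasi-identifiers rather than raw PHI. The sketch below is illustrative only: the field choices, key handling, and record values are assumptions, and production linkage must additionally handle typos, name changes, and re-identification risk:

```python
import hashlib
import hmac

def linkage_token(last_name: str, birth_date: str, zip3: str, key: bytes) -> str:
    """Derive a keyed hash token from normalized quasi-identifiers so two
    data holders can match records without ever exchanging the raw values."""
    normalized = "|".join([last_name.strip().upper(), birth_date, zip3])
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical shared secret established out of band between the two data holders.
KEY = b"shared-secret-key"

# The same (hypothetical) patient as recorded by a registry and by a claims payer.
t_registry = linkage_token("Smith", "1948-03-12", "275", KEY)
t_claims = linkage_token("smith ", "1948-03-12", "275", KEY)

# A different patient yields a different token.
t_other = linkage_token("Smith", "1952-07-04", "275", KEY)

print(t_registry == t_claims)  # prints True: records link despite formatting differences
print(t_registry == t_other)   # prints False: no spurious link
```

Because only the keyed tokens leave each organization, no unique identifiers need to be transmitted, which is precisely the constraint the restrictive PHI policies impose.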
Linkages will address some challenges and help strengthen data for cancer comparative effectiveness research; however, they will not necessarily address the challenge of data standardization.75 Many stakeholders (e.g., researchers, providers, payers) collect clinical, population, and health services data in numerous ways that often do not interdigitate well. Moreover, even among the multiple disciplines comprising the cancer comparative effectiveness research community, there are substantial differences of opinion on essential variables and how they should be measured. This lack of standardization in measure definitions and data coding can inhibit comparability across and even within health datasets, and threaten the generalizability of findings in the context of population heterogeneity, sometimes even for simple measures such as race and ethnicity. Discussions with key informants and Registry of Patient Registries project participants reflected this variation in disciplinary perspectives, which led to challenges in developing mutually exclusive data categories, their constituent data elements, and definitive empirical definitions. Nonetheless, the overall model withstood this scrutiny.
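The standardization problem can be as mundane as two data sources coding the same concept differently. A minimal crosswalk sketch, with entirely hypothetical codebooks, illustrates the kind of mapping that currently must be hand-built for each dataset pair:

```python
# Hypothetical source-specific codebooks; real sources (e.g., a registry and a
# claims system) each define their own values for the same underlying concept.
REGISTRY_RACE = {"01": "White", "02": "Black", "04": "Asian", "96": "Other/Unknown"}
CLAIMS_RACE = {"W": "White", "B": "Black", "A": "Asian", "U": "Other/Unknown"}

def harmonize(code: str, codebook: dict) -> str:
    """Map a source-specific code to a shared analytic category, defaulting
    to Other/Unknown so unmapped codes are not silently dropped."""
    return codebook.get(code, "Other/Unknown")

# The same patient coded two ways now lands in one analytic category.
print(harmonize("02", REGISTRY_RACE))  # prints "Black"
print(harmonize("B", CLAIMS_RACE))     # prints "Black"
```

Common data models and standard data elements, such as those the Registry of Patient Registries project promotes, aim to make such ad hoc crosswalks unnecessary.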
Finally, many of the challenges facing cancer comparative effectiveness research data cannot be solved through more measures and more data alone. A greater quantity of data will not necessarily make comparative effectiveness research studies more generalizable or reproducible; rather, the design issues of existing studies need to be better understood and overcome, resulting in better quality data to address these needs. Future work should focus on methods to iteratively upgrade data quality, triangulate results to confirm findings, and refine expectations in response to reality. In parallel, the ongoing advancement of complex analytic methods and guidelines for their application is necessary.13 The majority of data currently used for comparative effectiveness research is collected for non-research purposes, yielding several potential sources of significant bias. Advanced statistical approaches such as propensity score trimming76 and instrumental variable analysis77 can help address these structural limitations to extend the utility of current data. While advanced methods alone cannot resolve many limitations of currently available secondary datasets, continued investment in developing and applying better analytic methods can help address design limitations and further enhance what we may learn from non-experimental cancer comparative effectiveness research.
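To make the trimming idea concrete: given propensity scores that have already been estimated (the numbers below are made up for illustration), trimming restricts the analysis to the region of common support before any outcome comparison. This is a minimal sketch, not a full analysis; it omits score estimation, weighting, and variance adjustment:

```python
# Hypothetical records: (estimated propensity score, treated indicator, outcome)
records = [
    (0.03, 0, 9.1),  # almost never treated in practice -> no treated counterpart
    (0.42, 1, 6.4),
    (0.48, 0, 6.9),
    (0.55, 1, 5.8),
    (0.61, 0, 6.2),
    (0.97, 1, 2.0),  # almost always treated -> no untreated counterpart
]

def trim(records, lo=0.1, hi=0.9):
    """Drop records with extreme propensity scores; such patients have no
    comparable counterpart in the other treatment arm, and retaining them
    can badly distort the treated-vs-untreated comparison."""
    return [r for r in records if lo <= r[0] <= hi]

trimmed = trim(records)
print(len(records), len(trimmed))  # prints "6 4": two extreme records removed
```

The trimming bounds themselves (0.1 and 0.9 here) are an analytic choice that should be justified for each study rather than applied by rote.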
By leveraging secondary data, we can fill knowledge gaps and provide timely and valid scientific knowledge to inform health care decisions for people with cancer or at risk for it. The model we present here serves as a starting point for an evolving framework to systematically address cancer comparative effectiveness research data needs, accelerate the pace of cancer comparative effectiveness research, and ultimately enhance the adoption of comparative effectiveness research findings by the multiple stakeholders interested in improving patient care and outcomes. While the focus of this examination was on the use of secondary data, many of the key elements here are also relevant for prospective comparative effectiveness research studies, both with regard to their primary research focus and inasmuch as their data may subsequently be reexamined for other study questions. There are no doubt many challenges facing the comparative effectiveness research community as it seeks to advance this model and the data needed to fulfill its potential.42 However, for the cancer community to leverage and experience the benefit of these data, policy changes will be required before healthcare providers and other data holders will be willing or legally able to make these data available for research, an issue of substantial ongoing discussion.42,78–81 The engagement and substantial coordination of multiple organizations, agencies, and individuals from multiple disciplines will also be essential to address several ongoing challenges.42
We thank Timothy Carey, MD; Lisa DiMartino, MPH; Janet Freburger, PhD; and Deborah Schrag, MD, for their review and feedback, as well as Richard Gliklich and Daniel Campion from Outcome Sciences (AHRQ Contract HHSA290200500351; “American Recovery and Reinvestment Act: Developing a Registry of Registries,”). Work on this study was supported in part by the Integrated Cancer Information and Surveillance System (ICISS), UNC Lineberger Comprehensive Cancer Center with funding provided by the University Cancer Research Fund (UCRF) via the State of North Carolina.
This work was supported by funding from AHRQ through the Cancer DEcIDE Comparative Effectiveness Research Consortium, contract HHSA290-205-0040-I-TO4-WA5 – Data Committee for the DEcIDE Cancer Consortium.
†Associated with UNC Can-DEcIDE from: the University of North Carolina at Chapel Hill, Duke University, the Brigham and Women’s Hospital and the Dana Farber Cancer Institute, the University of Virginia, the Epidemiologic Research and Information Center at the Durham Veteran’s Affairs Medical Center, the North Carolina Central Cancer Registry, Blue Cross and Blue Shield of NC, Agency for Healthcare Research and Quality (AHRQ), the National Cancer Institute (NCI), and the Centers for Disease Control and Prevention (CDC).
‡Associated with the RoPR Project from: AHRQ, the American College of Surgeons Commission on Cancer (CoC), American Society of Clinical Oncology (ASCO), American College of Radiology (ACR), Blue Cross and Blue Shield, the Centers for Medicare and Medicaid Services, Children’s Tumor Foundation, College of American Pathologists, Duke University, the NCI, Johns Hopkins University, National Comprehensive Cancer Networks (NCCN), the National Institutes of Health (NIH), North American Association of Central Cancer Registries, Northwestern University Feinberg School of Medicine, Outcome Science, Patient Advocate Foundation, RTI Health Solutions, University of California at San Francisco, University of Colorado at Denver, the University of North Carolina at Chapel Hill.
The views expressed in this article are those of the authors, and no official endorsement by the Agency for Healthcare Research and Quality or the U.S. Department of Health and Human Services is intended or should be inferred.