Academic radiology is poised to play an important role in the development and implementation of quantitative imaging (QI) tools. This manuscript, drafted by the Association of University Radiologists (AUR) Radiology Research Alliance (RRA) Quantitative Imaging Task Force, reviews current issues in QI biomarker research. We discuss motivations for advancing QI, define key terms, present a framework for QI biomarker research, and outline challenges in QI biomarker development. We conclude by describing where QI research and development is currently taking place and discussing the paramount role of academic radiology in this rapidly evolving field.
Medical imaging has evolved dramatically since the first Roentgenogram nearly 125 years ago (1). Modern techniques including ultrasound, computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET) now provide an unprecedented level of spatial detail and functional information (2). As medical imaging has progressed, older analog techniques have been steadily replaced with newer digital methods of image acquisition, processing, archiving, and display. This evolution has occurred in parallel with advancements in our understanding of the molecular underpinnings of disease and the rise of a more statistical and evidence-based approach to diagnosis and treatment. Medical imaging is now poised to leverage quantitative techniques in support of a wide range of clinical and research goals (3, 4).
In a broad sense, quantitative imaging (QI) refers to the extraction and use of numerical/statistical features from medical images (see Box 1 for definitions of key terms). As a research field, QI includes the development, standardization, optimization, and application of anatomical, functional, and molecular imaging acquisition protocols, data analyses, display methods, and reporting structures, as well as the validation of QI results against relevant biological and clinical data (5, 6). The QI concept is closely tied to that of a biomarker, defined as a characteristic that is objectively measured and evaluated as an indicator of a normal biologic process, pathologic process, or response to a therapeutic intervention (7). A QI biomarker is therefore an objectively measured characteristic, derived from a medical image, that can be correlated with anatomically and physiologically relevant parameters including disease presence, disease severity, disease characterization (particularly on a molecular level), predicted disease course (both with and without treatment), and treatment response. The Quantitative Imaging Biomarkers Alliance (QIBA), organized by the Radiological Society of North America (RSNA), has formally defined a QI biomarker as “an objective characteristic derived from an in vivo image measured on a ratio or interval scale as indicators [sic] of normal biological processes, pathogenic processes, or a response to a therapeutic intervention.” This definition’s emphasis on ratio or interval variables would imply that tumor volumes or PET standardized uptake values (SUVs) would be considered QI biomarkers, because the difference or ratio between two values is meaningful, whereas ordinal variables such as Breast Imaging Reporting and Data System (BIRADS) assessment categories would not. 
This strict definition is meant to guide QI research toward biomarkers that may be assessed and compared with robust statistical calculations including frequency distributions, medians, means, standard deviations, and standard errors of the mean (8).
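As a minimal sketch of the summary statistics named above, the following computes a mean, median, standard deviation, and standard error of the mean for a ratio-scale measurement. The function name `summarize` and the SUV readings are hypothetical and purely illustrative; they are not data from any study. The same arithmetic would not be meaningful for ordinal categories such as BIRADS assessments, where the distance between adjacent categories is undefined.

```python
import math
import statistics

def summarize(values):
    """Descriptive statistics for a ratio- or interval-scale measurement."""
    n = len(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return {
        "n": n,
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": sd,
        "sem": sd / math.sqrt(n),  # standard error of the mean
    }

# Hypothetical SUV readings (illustrative numbers only, not real data)
suv_values = [4.0, 6.0, 8.0, 10.0, 12.0]
summary = summarize(suv_values)
print(summary)
```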
Box 1. Definitions of key terms
Analytical validation – Demonstration of the accuracy, precision, and feasibility of biomarker measurement
Biomarker – A characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or responses to a therapeutic intervention
Predictive biomarker – A biomarker intended to forecast disease course in the presence of a specific treatment
Prognostic biomarker – A biomarker intended to forecast disease course in the absence of treatment
Qualification – Demonstration that a biomarker is associated with a clinical endpoint
Quantitative imaging – The extraction and use of numerical/statistical features from medical images
Quantitative imaging biomarker (modified QIBA definition) – An objective characteristic derived from an in vivo image measured on a ratio or interval scale as an indicator of a normal biological process, a pathogenic process, or a response to a therapeutic intervention (8)
Repeatability – The agreement between successive measurements made under the same conditions
Reproducibility – The agreement between successive measurements made with varying conditions, such as location or operator
Surrogate endpoint – A biomarker intended to substitute for a clinical endpoint
Utilization – Assessment of biomarker performance in the specific context of its proposed use
This manuscript, drafted by the Association of University Radiologists (AUR) Radiology Research Alliance (RRA) Quantitative Imaging Task Force, addresses issues related to QI biomarker research and development. A separate manuscript from our Task Force outlines current clinical applications of QI (9). In this article, we describe motivations for QI biomarker development and discuss challenges for QI research using a three-part framework. We then provide an overview of where QI research and development is currently taking place. We conclude by discussing the particular role of academic radiology in advancing QI. Sections of this manuscript were derived from individual mini-scoping studies based on focused research questions (10).
The promise of QI lies in the potential for increased precision and standardization of image interpretation, in both the research and clinical settings. Potential gains from the growth of QI include increased diagnostic accuracy; decreased variability and subjectivity of image analysis; increased automation of data reporting; more robust association of imaging findings with other biological and clinical parameters, including rigorous statistical correlations between quantitative datasets; and the opportunity for large-scale attempts to link phenotypic imaging patterns with genomic profiles (11). The development of QI is being driven in large part by the environment of evidence-based medicine, in which diagnoses across the clinical spectrum are reinforced with quantitative data (12, 13).
Perhaps the greatest demand for QI at present is from cancer clinical trials, where quantitative measurements of tumor response are used to determine the efficacy of investigational treatments. Imaging-based response assessment guidelines such as the Response Evaluation Criteria in Solid Tumors (RECIST) (14) have been used for decades and have been successfully validated against long-term patient outcomes in certain settings (15, 16). However, in the era of targeted agents that may promote tumor stability rather than tumor regression (17-21), the oncologic imaging community has embarked on developing novel imaging biomarkers to identify and interrogate underlying molecular and functional changes in tissue, with the premise that these measurements will provide earlier and/or more accurate response assessment than tumor size changes (Fig 1) (22). Validated QI biomarkers reporting on different elements of tumor status may enhance drug development by establishing proof of concept for investigational agents, by facilitating selection of candidate agents for promotion to later stage testing, and by determining patient subgroups in which the likelihood of drug response is higher (23, 24). QI biomarkers may also be useful for clinical care by offering the ability to stratify patients to the most appropriate treatments and by promoting earlier identification of patients with a poor response to a particular regimen (25).
Imaging researchers are responding to the demand for QI biomarkers by advancing a broad array of quantitative techniques across a wide spectrum of clinical and research indications (24, 26-35). The common denominator linking all of these efforts is the drive toward producing standardized, unbiased, and precise imaging data in support of the larger medical research and clinical enterprise. This endeavor involves particular research challenges, as presented in the next section.
Rigorous evaluation is required before a QI biomarker can be safely and sensibly adopted (36). This section describes this evaluation process and presents key challenges in QI biomarker development. We have organized our discussion using a framework from the Institute of Medicine (IOM) that considers biomarker evaluation in three parts: (1) analytical validation, (2) qualification, and (3) utilization (37).
Analytical validation involves demonstration of the accuracy, precision, and feasibility of biomarker measurement. If a QI biomarker cannot be reliably measured, it will have little or no use as an indicator for a biological process or clinical outcome. The process of analytical validation includes generating data on limits of detection, limits of quantification, and reference normal values (23). It also includes assessing both repeatability (i.e., the agreement between successive measurements made under the same conditions) and reproducibility (i.e., the agreement between successive measurements made with varying conditions such as location or operator) (8, 38, 39), with both specified by appropriate statistical parameters including the kappa (or weighted kappa), the intra-class correlation coefficient, or the confidence interval of the mean (Fig. 2) (40-50).
Validation studies are also used to generate preliminary reporting standards for QI biomarkers (51). Evaluations of technical performance and measurement error provide the foundation for establishing whether biomarkers should be reported as continuous or categorical variables (e.g., mild, moderate, or severe dysfunction of the left ventricle, as assessed by cardiac MRI with quantification of ejection fraction), and also provide data to inform selection of rational cutoff values.
It is important to note that QI techniques typically rest on a foundation of image processing steps used to generate the values for subsequent biomarker definition. These initial steps present their own research challenges. Examples of challenges at the image processing stage include validating automated feature generation in the absence of a reliable ground truth or plausible simulation model, and achieving accurate data while minimizing radiation dose.
Qualification involves demonstrating that a biomarker is associated with a clinical endpoint. The qualification process establishes the ability of a QI biomarker to serve as a measurable indicator of a biological process, pathologic process, or response to an intervention (31, 52-55). This critical step in biomarker evaluation provides the basis for biomarker adoption in clinical and research applications, as well as for consideration of biomarker data by regulatory authorities as evidence of drug and device efficacy (51). Qualification is fundamentally a statistical challenge, one with important methodological requirements that must be taken into account in the design of biomarker studies (38, 39).
For a prognostic biomarker, i.e., a biomarker intended to forecast disease course in the absence of treatment, a correspondence must be shown between the biomarker and the outcome of interest. Once a relationship has been established in an initial derivation cohort, it must be confirmed independently in an entirely different set of patients (validation cohort) to prove that the initial correspondence was neither due to chance nor the result of overfitting a statistical model to the derivation cohort dataset (56). Initial relationships can be demonstrated through small retrospective studies, but more robust biomarker qualification requires testing in multiple patient cohorts and preferably within a randomized or prospective clinical trial (57). If test performance is standardized rigorously, the biomarker’s ability to predict clinical outcomes can be tested across multiple centers with varying scanners and viewing platforms. Qualification of prognostic biomarkers must also evaluate biomarker performance as a function of time; even if a strong correspondence is established early in a disease between a biomarker and a clinical outcome, comprehensive biomarker qualification must also examine whether the strength of that correspondence wanes over time as the disease progresses.
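The derivation-then-validation logic can be illustrated with a small sketch. Here a cutoff is tuned to maximize accuracy in a derivation cohort and then applied unchanged to an independent validation cohort. The function names, the cohorts, and the use of simple accuracy as the performance measure are all assumptions made for illustration; the data are invented and chosen so that the apparent performance drops on validation, which is the pattern that overfitting tends to produce.

```python
def accuracy(cohort, cutoff):
    """Fraction of patients whose outcome matches the rule 'value >= cutoff'."""
    hits = sum((value >= cutoff) == bool(event) for value, event in cohort)
    return hits / len(cohort)

def best_cutoff(cohort):
    """Pick the observed biomarker value that maximizes accuracy in this cohort."""
    candidates = sorted({value for value, _ in cohort})
    return max(candidates, key=lambda c: accuracy(cohort, c))

# Hypothetical (biomarker value, event) pairs; 1 = adverse outcome occurred
derivation = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]
validation = [(1.5, 0), (2.5, 1), (3.5, 0), (4.5, 0), (5.5, 1), (6.5, 1)]

cutoff = best_cutoff(derivation)                    # tuned on derivation only
derivation_accuracy = accuracy(derivation, cutoff)  # 1.0 on the tuning data
validation_accuracy = accuracy(validation, cutoff)  # about 0.67 on new patients
print(cutoff, derivation_accuracy, validation_accuracy)
```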
For a predictive biomarker, i.e., a biomarker intended to forecast disease course in the presence of a specific treatment, the statistical challenges are greater (58, 59). The same general principle still applies (i.e., establishing an initial relationship between biomarker and outcome and then confirming that relationship in an independent validation cohort), but the analysis requires data from patients with both high and low biomarker levels. Different clinical trial designs exist for analyzing treatment effects in patients stratified by biomarker status. Given the challenges of performing prospective, randomized studies, retrospective analyses of completed trials may be an important source of evidence for predictive biomarker qualification.
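The requirement for data from patients with both high and low biomarker levels can be made concrete with a sketch of a biomarker-stratified comparison. The treatment effect (difference in response rate between treated and control arms) is computed within each biomarker stratum, and the difference between the two effects serves as a crude interaction contrast. The trial counts and function names are hypothetical, and a formal analysis would test the interaction with an appropriate model and confidence interval rather than inspect point estimates.

```python
def response_rate(responders, total):
    """Proportion of patients who responded."""
    return responders / total

def treatment_effect(stratum):
    """Difference in response rate, treated minus control, within one stratum."""
    treated = response_rate(*stratum["treated"])
    control = response_rate(*stratum["control"])
    return treated - control

# Hypothetical (responders, total) counts by biomarker status and trial arm
trial = {
    "high": {"treated": (30, 50), "control": (10, 50)},
    "low": {"treated": (12, 50), "control": (10, 50)},
}

effect_high = treatment_effect(trial["high"])  # about 0.40
effect_low = treatment_effect(trial["low"])    # about 0.04
interaction = effect_high - effect_low         # about 0.36
print(effect_high, effect_low, interaction)
```

In this invented example the treatment benefit is concentrated in the biomarker-high stratum, which is the pattern a predictive biomarker is expected to show. A biomarker that was only prognostic would shift outcomes in both arms without changing the treatment effect.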
One of the greatest statistical challenges for biomarker qualification is in establishing a biomarker as a surrogate endpoint, i.e., a valid substitute for a clinical endpoint. Only a small subset of biomarkers ever meets criteria for surrogacy. In order to qualify as a surrogate for a clinical endpoint, not only must the biomarker forecast the clinical outcome without reference to a specific intervention (“individual level” surrogacy), but also the effect of treatment on the biomarker must correlate closely with the effect of treatment on the clinical outcome (“trial level” surrogacy). Generally, individual-level surrogacy is established using standard correlation coefficients, while trial-level surrogacy can be established only through meta-analysis of multiple randomized trials (60). A major challenge for the validation of surrogate endpoints is the need for separate qualification of surrogate endpoints in the setting of different treatments; if a biomarker is qualified as a surrogate endpoint with one treatment, it cannot be assumed that it automatically qualifies as a surrogate for a novel treatment with a different mechanism of action (59).
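The trial-level idea can be sketched as a correlation across trials between the treatment effect on the biomarker and the treatment effect on the clinical outcome. The per-trial effect sizes below are invented for illustration, and the unweighted Pearson coefficient is a simplifying assumption; meta-analytic approaches typically weight trials by size and account for estimation error in each effect.

```python
import math

def pearson(xs, ys):
    """Unweighted Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-trial treatment effects (one entry per randomized trial)
effect_on_biomarker = [0.10, 0.20, 0.30, 0.40]
effect_on_outcome = [0.05, 0.12, 0.14, 0.21]

r_trial = pearson(effect_on_biomarker, effect_on_outcome)
print(r_trial, r_trial ** 2)  # trial-level correlation and its square
```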
It should be noted that there is not always consensus on the appropriate clinical endpoint against which a biomarker should be qualified. In oncology clinical trials, for example, many observers have commented on the difficulties in using overall survival (OS) as the gold standard clinical endpoint in tumors for which several lines of treatment are available, leading to adoption in many trials of progression-free survival (PFS) as the primary clinical endpoint (61-64). Considerations such as these demand that we develop a pragmatic approach to biomarker qualification based not only on statistics but also incorporating elements of biological plausibility and practical usefulness (65).
Utilization involves the assessment of biomarker performance in the specific context of its proposed use. This important step in biomarker evaluation asks whether the available evidence from validation and qualification provides sufficient support for the intended use of the biomarker (37). Different research and clinical settings may have distinct requirements and performance thresholds for incorporating a QI biomarker. For example, the U.S. Food and Drug Administration (FDA) may require a higher level of qualification when using biomarker data in support of drug approval than required by a pharmaceutical company when using biomarker data to prioritize compounds in its development pipeline (23).
In the clinical environment, contextual consideration of biomarker utilization allows for a holistic evaluation of a biomarker’s usefulness for decision making. For example, even if a QI biomarker is well correlated with clinical response to a drug agent, it may not demonstrate important drug side effects or toxicities (66), which in turn may imply the need for additional information beyond biomarker status (including qualitative information from imaging) for patient management. Proper consideration of QI utilization in the clinical setting must also address the possibility of assigning too much importance to statistical results and too little to clinical intuition and empirical judgment (67). A comprehensive evaluation of QI biomarker utilization would ideally consider the long-term effects of biomarker use on patient outcomes, notwithstanding the well-known difficulties in separating and measuring the effects of diagnostic imaging on improving patient health (68).
Comprehensive evaluation of QI biomarker utilization also addresses practical issues around biomarker incorporation into routine workflows. If biomarker data cannot be extracted and reported efficiently and at a reasonable cost, there is little likelihood of biomarker translation and adoption into standard-of-care clinical practice (69). Critical imperatives for QI biomarker research therefore include development of semi-automated and fully automated methods of data extraction (Fig. 3) (70), refinement of software tools for importing biomarker data into structured reports (71, 72), development of tools to facilitate QI biomarker tracking over time (73), integration of these tools with existing PACS and other radiology information systems, and investigations into the time-efficiency and cost-effectiveness of QI biomarker reporting (74, 75). Integration of QI biomarker archives with other clinical databases will likely assume greater importance with the anticipated transformation of radiology from a transactional to an information management business (76). Finally, implementing QI biomarker reporting in clinical practice may require dedicated insurance reimbursement to cover the costs of equipment upgrades, phantoms, software, image processing and interpretation, altered workflow, and ongoing quality assurance. These changes to reimbursement are difficult to achieve, and are likely to occur only following recommendations by expert panels and incorporation of QI biomarkers into clinical practice guidelines (77).
Conceptualization of QI biomarker research requires an appreciation of the developmental needs for QI and also consideration of the most appropriate environments for conducting QI investigations. The validation-qualification-utilization framework establishes an ambitious agenda for QI research that requires engagement from multiple stakeholders with different skill sets and end objectives.
Government funding for QI biomarker development exists through several arms of the National Institutes of Health (NIH), most notably the National Cancer Institute (NCI). The NCI’s Imaging Response Assessment Teams (IRAT) were an initial effort by the NCI to advance QI biomarkers for assessing therapy response, to increase the use of QI biomarkers in clinical trials, and to strengthen collaborations between basic imaging scientists and clinical oncology investigators. The NCI now encourages QI biomarker development principally through its Quantitative Imaging Network (QIN), which currently includes 17 “centers of imaging excellence” throughout the U.S. (78).
QI biomarker development is also taking place within a number of partnerships and consortia. The Quantitative Imaging Biomarkers Alliance (QIBA), established in 2007 by the RSNA, brings together imaging scientists, radiologists, and industry stakeholders for the advancement and use of QI biomarkers in both research and clinical practice. As of May 2014, QIBA has released publicly reviewed profiles for DCE-MRI quantification, CT tumor volume change, and FDG-PET/CT as an imaging biomarker for measuring response to cancer therapy (79). Meanwhile, the American College of Radiology Imaging Network (which recently merged with the Eastern Cooperative Oncology Group as ECOG-ACRIN) facilitates collaboration in QI clinical trials by academic and community radiologists as well as public and private stakeholders.
Private industry is also involved with QI biomarker development. Pharmaceutical companies are a source of funding for QI biomarker investigations, especially those designed to establish proof-of-concept for compounds in early-stage clinical testing (80). Several small companies are now marketing either stand-alone or plug-in software solutions for automated lesion measurement tracking (e.g., Mint Lesion, Mint Medical, Heidelberg, Germany; Median Lesion Management Solutions, Median Technologies, Valbonne, France; MimVista Software Inc., Cleveland, OH), and several large PACS vendors now offer biomarker tracking and reporting packages for various applications.
As QI biomarker research efforts move across the validation-qualification-utilization spectrum, deployment within clinical trials assumes greater importance (2). It is important to note that in many clinical trials, QI biomarkers are not the primary focus of the trial itself, but are rather deployed as tools to facilitate or accelerate a larger study objective (e.g., demonstrating efficacy of an investigational drug agent). The use of QI biomarkers confers several potential advantages within a clinical trial, including the possibility of populating the trial with biomarker-selected patients who have a higher likelihood of a positive therapeutic response; the opportunity to measure response earlier, more accurately, and/or less invasively than with other methods; and the potential ability to decrease overall study duration and cost by reducing both sample size and patient follow-up requirements (31, 36, 54). Clinical trials also provide a cost-effective environment for conducting QI research because QI correlative studies can be attached to trials with a broader funding appeal. However, conducting QI research within the confines of a larger clinical trial has important limitations, including the “two-variable” problem, i.e., the inherent difficulty in testing an exploratory biomarker and an investigational drug simultaneously (81), and the reluctance of trial sponsors to pay for additional imaging beyond standard-of-care scans.
Academic radiology occupies a crucial role at the interface of basic imaging science and clinical research and is the proving ground for the eventual translation of novel techniques and approaches into routine practice. As such, academic radiology is poised to play a unique role in the development and dissemination of QI methods. Specific roles for academic radiologists include partnering with basic science colleagues to ensure that biomarker efforts are directed toward clinically relevant objectives; coordinating interdisciplinary collaboration between basic science and clinical researchers; designing and participating in analytical validation and qualification studies; and spearheading efforts to establish the potential advantages and appropriate utilization of QI biomarkers. Additional partnership opportunities include working with clinical colleagues from other disciplines to incorporate QI biomarkers into standardized diagnostic and therapeutic algorithms; working with informatics professionals to develop and test technology solutions for efficient QI biomarker extraction, reporting, and management; working with industry stakeholders to promote standardization of biomarker acquisition across vendor platforms; and working with collaborative groups and government agencies to establish data registries and new funding opportunities.
Challenges for academic radiology in QI biomarker research include prioritizing and focusing among a wide set of important objectives, avoiding redundant efforts given a broad array of stakeholders, and staying grounded with respect to basic tenets of standardization and quality assurance while pursuing higher-level technology evaluation. The last of these challenges is especially relevant given the recent heightened interest in QI. It is crucial to address variability in methods before attempting to qualify QI biomarkers for broad clinical use (82). Finally, QI biomarker research output from academic radiology is currently hampered by the lack of training among radiologists in advanced clinical research techniques (83); the academic radiology community is addressing this deficiency through programs such as the AUR GE-Radiology Research Academic Fellowship (GERRAF) program, but additional initiatives would be beneficial.
Researchers and clinicians from across the biomedical spectrum are increasingly demanding QI biomarkers for incorporation into algorithmic decision making. The imaging community is responding to this demand by developing QI biomarkers in numerous modalities across a broad set of functional areas. QI biomarker development requires painstaking evaluation with sequential attention to analytical validation, qualification, and utilization of novel techniques and metrics. Academic radiology is poised to play a significant role in these efforts, especially in framing research questions and facilitating translation of emerging techniques from the laboratory into practice.
The authors gratefully acknowledge Lisa Li, Ph.D., Lori Arlinghaus, Ph.D., and Zhoubing Xu for contributing figures to this manuscript.
Funding acknowledgements: AUR GE Radiology Research Academic Fellowship (RGA), P30 CA068485 (RGA), P50 CA098131 (RGA), NCI U01CA142565 (TEY), RSNA ESCH1319 (RMS), AHRQ HHSA290201200007I (RMS), T32 EB001631 (JPY)
Richard G. Abramson, Department of Radiology and Radiological Sciences Vanderbilt University 1161 21st Ave. S, CCC-1121 MCN Nashville, TN 37232-2675 (615)322-6759 Fax (615) 322-3764 ; Email: firstname.lastname@example.org.
Kirsteen R. Burton, Dept. of Medical Imaging and Institute of Health Policy, Management and Evaluation University of Toronto 263 McCaul Street, 4th Floor Toronto, ON M5T1W7 (416) 978-6801 ; Email: email@example.com.
John-Paul J. Yu, Department of Radiology and Biomedical Imaging University of California, San Francisco 505 Parnassus Ave., M-391 Box 0628 San Francisco, CA 94143-0628 ; Email: firstname.lastname@example.org.
Ernest M. Scalzetti, Department of Radiology SUNY Upstate Medical University 750 E. Adams St. Syracuse NY 13210 ; Email: ude.etatspu@etezlacs.
Thomas E. Yankeelov, Institute of Imaging Science Vanderbilt University 1161 21st Ave. S, AA-1105 MCN Nashville, TN 37232-2310 ; Email: email@example.com.
Andrew B. Rosenkrantz, Department of Radiology NYU Langone Medical Center 550 First Avenue New York, NY 10016 (212) 263-0232 fax: (212) 263-6634 ; Email: gro.cmuyn@ztnarknesoR.werdnA.
Mishal Mendiratta-Lala, Abdominal and Cross-sectional Interventional Radiology Henry Ford Hospital 2799 West Grand Blvd. Detroit, MI 48202 (313) 461-1648 ; Email: ude.hfh.dar@llahsim..
Brian J. Bartholmai, Chair, Division of Radiology Informatics Mayo Clinic Rochester, MN Phone 507-284-4292 FAX: 507-284-8996 ; Email: ude.oyam@nairB.iamlohtraB.
Dhakshinamoorthy Ganeshan, Department of Abdominal Imaging University of Texas MD Anderson Cancer Center Houston, TX 77030 713-792-2486 Fax: 713-745-1151 ; Email: gro.nosrednadm@nahsenagd.
Leon Lenchik, Department of Radiology Wake Forest School of Medicine Medical Center Boulevard Winston-Salem, NC 27157 Phone: 336-716-4316 Fax: 336-716-1278 ; Email: ude.htlaehekaw@kihcnell.
Rathan M. Subramaniam, Russell H Morgan Department of Radiology and Radiological Sciences Johns Hopkins School of Medicine Department of Health Policy and Management Johns Hopkins Bloomberg School of Public Health Johns Hopkins University Baltimore, MD.