On May 3, 2008, a National Cancer Institute (NCI)-sponsored open consensus conference was held in Toronto, Ontario, Canada, during the 2008 International Society for Magnetic Resonance in Medicine Meeting. Approximately 100 experts and stakeholders summarized the current understanding of diffusion-weighted magnetic resonance imaging (DW-MRI) and reached consensus on the use of DW-MRI as a cancer imaging biomarker. DW-MRI should be tested as an imaging biomarker in the context of well-defined clinical trials, by adding DW-MRI to existing NCI-sponsored trials, particularly those with tissue sampling or survival indicators. Where possible, DW-MRI measurements should be compared with histologic indices including cellularity and tissue response. There is a need for tissue equivalent diffusivity phantoms; meanwhile, simple fluid-filled phantoms should be used. Monoexponential assessments of apparent diffusion coefficient values should use two b values (> 100 and between 500 and 1000 sec/mm2 depending on the application). Free breathing with multiple acquisitions is superior to complex gating techniques. Baseline patient reproducibility studies should be part of study designs. Both region of interest and histogram analysis of apparent diffusion coefficient measurements should be obtained. Standards for measurement, analysis, and display are needed. Annotated data from validation studies (along with outcome measures) should be made publicly available. Magnetic resonance imaging vendors should be engaged in this process. The NCI should establish a task force of experts (physicists, radiologists, and oncologists) to plan, organize technical aspects, and conduct pilot trials. The American College of Radiology Imaging Network infrastructure may be suitable for these purposes. There is an extraordinary opportunity for DW-MRI to evolve into a clinically valuable imaging tool, potentially important for drug development.
Imaging biomarkers are important tools for the detection and characterization of cancers as well as for monitoring the response to therapy. Because imaging technology develops rapidly, new methods appear frequently and their utility requires systematic evaluation. Diffusion-weighted magnetic resonance imaging (DW-MRI) depends on the microscopic mobility of water. This mobility, classically called Brownian motion, is due to thermal agitation and is highly influenced by the cellular environment of water. Thus, findings on DW-MRI could be an early harbinger of biologic abnormality. For instance, the most established clinical indication for DW-MRI is the assessment of cerebral ischemia, where DW-MRI findings precede those of all other MR techniques.
In oncologic imaging, DW-MRI has been linked to lesion aggressiveness and tumor response, although the biophysical basis for this is incompletely understood. Parameters derived from DW-MRI are appealing as imaging biomarkers because the acquisition is noninvasive, does not require any exogenous contrast agents, does not use ionizing radiation yet is quantitative and can be obtained relatively rapidly, and is easily incorporated into routine patient evaluations. However, these desirable features are offset by many challenges that face the validation of any imaging-based biomarker.
From the outset, it is important to recognize the many pioneering contributions of Stejskal and Tanner, Le Bihan et al., and Chenevert et al., from which has come the knowledge that tissue diffusivity measurements are not always random because of tissue organization and therefore have components that can be attributed to both the vascular and the extravascular compartments [3–5]. This weighting is imparted by the experimental conditions used in measurements (the b values). This document focuses on extravascular diffusion measurements where the measured signal is related to tissue cellularity, tissue organization and extracellular space tortuosity, and on the intactness of cellular membranes that are intrinsically hydrophobic. Classically, the low values of diffusion found in most tumors have been attributed to their increased cellular density; however, this remains a point of contention because diffusivity is influenced by extracellular fibrosis, the shape and size of the intercellular spaces, and by other microscopic tissue/tumor organizational characteristics such as glandular formations (as in well differentiated adenocarcinomas).
There is an extraordinary opportunity for DW-MRI to evolve into a clinically useful method that is valuable for pharmaceutical drug development and for predicting therapeutic efficacy. Potentially, DW-MRI could have clinical utility at all stages of a cancer patient's journey from detection to diagnosis, for staging and assessing therapy response, and finally, for assessing relapse. As a pharmacodynamic indicator, DW-MRI may have a significant impact on pharmaceutical drug development. It should be recognized that pharmaceutical drug development and clinical therapeutic efficacy assessments are related but are nonetheless different. In pharmaceutical development, the questions revolve around whether a drug produces a measurable effect, the magnitude of that effect, and its potential biologic implications. In clinical trials, questions revolve around whether changes in individual patients can be measured reliably and reproducibly and whether they predict important clinical outcomes related to therapy.
To truly realize its potential, it is imperative for DW-MRI to become robust so as to provide similar information at different institutions using differing equipment. To date, no accepted standards in measurement or analysis methods have been established. Indeed, it is evident that current implementations of imaging protocols and analysis by different companies vary significantly; even the same manufacturer can alter its methods with “upgrades.” There is also a lack of transparency and a divergent nomenclature among the vendors regarding their particular implementation of DW-MRI, which impedes efforts to standardize the technique. Furthermore, it is unclear whether whole body or localized analyses are preferred for clinical use and for drug development. Finally, multiple analytic approaches to DW-MRI have been proposed and it is not clear which is the best for any given situation. For instance, it is unclear what the optimal set of “b values” should be, how parameters should be adjusted with the tumor type and site, and whether the data should be analyzed monoexponentially, biexponentially, or multiexponentially, all of which will influence apparent diffusion coefficient (ADC) values. Diffusion-weighted MRI depends mostly on images obtained with b value >100 sec/mm2 in extracranial tissues. However, it is acknowledged that the usefulness of information depicted by lower b values has not yet been fully investigated.
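The monoexponential analysis mentioned above can be illustrated with a two-point calculation, in which the ADC is the logarithm of the signal ratio divided by the difference in b values. The following is a minimal sketch rather than a recommended implementation; the function name and the signal values are hypothetical and chosen only for illustration.

```python
import math

def adc_two_point(s_low, s_high, b_low, b_high):
    """Monoexponential ADC (mm^2/sec) from signals at two b values (sec/mm^2).

    Assumes S(b) = S0 * exp(-b * ADC), so that
    ADC = ln(S_low / S_high) / (b_high - b_low).
    """
    if b_high <= b_low:
        raise ValueError("b_high must be greater than b_low")
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signals must be positive")
    return math.log(s_low / s_high) / (b_high - b_low)

# Hypothetical voxel: S0 = 1000 (arbitrary units), true ADC = 1.0e-3 mm^2/sec,
# sampled at b = 150 and b = 800 sec/mm^2 (illustrative choices only).
true_adc = 1.0e-3
s_150 = 1000.0 * math.exp(-150 * true_adc)
s_800 = 1000.0 * math.exp(-800 * true_adc)
estimated_adc = adc_two_point(s_150, s_800, 150, 800)
```

Because only the ratio of the two signals enters the calculation, S0 cancels; any non-monoexponential behavior between the two b values is absorbed into the single ADC estimate.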
Our purpose was to attempt to summarize the current understanding of the pathophysiologic basis for DW-MRI imaging, to describe widely accepted methods that depict diffusivity in the extravascular-extracellular space, and to provide recommendations on standards for measurement, image display, and analysis methods. We have done this to bring all stakeholders toward a consensus on how to conduct multi-institutional trials that will assess the efficacy of DW-MRI for tumor assessments.
Diffusion measurements reflect the effective displacement of water molecules allowed to migrate for a given time. Whereas temperature modulates molecular mobility in pure water (by approximately 2.4% per degree Celsius), it is rarely considered a significant factor in DW-MRI of intact tissues because other biophysical properties have far greater influences on tissue water mobility. Signal-to-noise ratio (SNR) and T2-relaxation rates set practical limits on the diffusion measurement interval to within 40 to 80 milliseconds on standard clinical systems.
Using pure water at body temperature (37°C) as a reference standard, the average displacement of water molecules during a 50-millisecond interval is approximately 30 µm. Because this is comparable to or greater than the dimensions of cells, there is a high probability that water molecules will interact with cells, and that hydrophobic cell membranes and macromolecules will impede the motion of water. As such, the observed or “apparent” diffusion of water within tissues is typically several-fold less than in pure water. Moreover, diffusion in biologic systems is affected by water exchange between intracellular and extracellular compartments and the tortuosity of the extracellular space (which in turn is affected by cell sizes, organization, and packing density). Thus, although the spatial resolution of DW-MRI is typically on the order of millimeters, DW-MRI is exquisitely sensitive to changes in diffusion measured on the cellular scale (e.g., micrometers). A clear example of the ability of DW-MRI to document directional diffusion from which architectural features can be derived is the anisotropy depicted in highly directional structures (e.g., myelinated white matter fiber tracts) [3,7].
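The 30 µm figure quoted above is consistent with the Einstein relation for free diffusion in three dimensions. The worked estimate below assumes a diffusion coefficient for pure water at 37°C of about 3.0 × 10⁻³ mm²/sec, a commonly quoted value that is not stated in this document.

```latex
\sqrt{\langle r^{2} \rangle} = \sqrt{6 D t}
  = \sqrt{6 \times \left(3.0 \times 10^{-3}\ \mathrm{mm^{2}/sec}\right) \times \left(0.050\ \mathrm{sec}\right)}
  = \sqrt{9.0 \times 10^{-4}\ \mathrm{mm^{2}}}
  = 0.030\ \mathrm{mm} = 30\ \mu\mathrm{m}
```

Here $D$ is the diffusion coefficient and $t$ is the diffusion time; the factor 6 applies to root-mean-square displacement in three dimensions (it would be 2 for displacement along a single axis).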
Other biophysical processes can potentially increase apparent water mobility. These include active transport, flow and perfusion, and macroscopic/bulk movements such as cardiac and respiratory motions. Flow, perfusion, and motion have detrimental effects on the accuracy of measurements of tissue water diffusion and its change over time. Motion correction, cardiac and respiratory gating, and overaveraging can reduce the magnitude of these effects.
Given these complexities, undisputed consensus on the appropriate biophysical interpretation of water diffusion measurements of in vivo systems is difficult to achieve. Indeed, some authors propose a “low-mobility intracellular space” and a “high-mobility extracellular space,” which are averaged on clinical DW-MRI. Although the latter model has intuitive appeal, it is not well supported by empirical observations [8,9]. Alternative models based on exchange between distinct diffusion domains that coexist in the same physical compartment offer better fits of empirical data, although the development and application of these models require measurements over a broader range of b values and more diffusion settings than are typically used in clinical settings. There are a limited number of validation tools available to confirm that properties, such as water diffusion, compartmental tortuosity, and interactions with cell membranes, actually influence diffusion measurements on DW-MRI.
Blood flow signal is rapidly attenuated at low b values (e.g., b < 100–150 sec/mm2) and may be mistakenly attributed to diffusion. This phenomenon, also known as the intravoxel incoherent motion (IVIM), has been used to assess tissue perfusion; however, for clarity in this communication, we exclude blood flow and perfusion from diffusion phenomena and confine our comments to water mobility in the extravascular space.
Higher minimum b values are required to suppress the perfusion in vascular-rich tissues, indicating that appropriate minimum b values may vary across applications depending on intrinsic vascularity. Even after elimination of perfusion effects, tissues are known to exhibit multiexponential signal decay. Typically, very high b values (b = 1000–5000 sec/mm2) are required to reliably quantify the biexponential decay constants, “Dfast” and “Dslow” [10,11]. Alternatively, nonmonoexponential decay behavior may be fit to a stretched exponential model that yields a distributed diffusion coefficient and an index representing degree of intravoxel diffusion heterogeneity. Therefore, the notion of “low” and “high” b values is relative and dependent on the tissue being studied and the SNR available to maximize diffusion contrast.
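The effect of perfusion on a two-point ADC, and the stretched exponential alternative, can be sketched numerically. The example below is illustrative only: it uses the standard two-compartment IVIM signal form and the standard stretched exponential form, and the parameter values (perfusion fraction, pseudodiffusion coefficient, tissue diffusivity) are hypothetical rather than taken from this document.

```python
import math

def ivim_signal(b, s0, f, d_star, d):
    """Two-compartment IVIM model: a fast perfusion-related component
    (fraction f, pseudodiffusion d_star) plus slower tissue diffusion (d)."""
    return s0 * (f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d))

def stretched_exponential_signal(b, s0, ddc, alpha):
    """Stretched exponential model: ddc is the distributed diffusion
    coefficient; alpha (0 < alpha <= 1) is the heterogeneity index."""
    return s0 * math.exp(-((b * ddc) ** alpha))

def two_point_adc(b_low, b_high, s_low, s_high):
    """Monoexponential ADC from two measurements."""
    return math.log(s_low / s_high) / (b_high - b_low)

# Hypothetical tissue parameters (units: mm^2/sec for d and d_star).
s0, f, d_star, d = 1.0, 0.2, 20.0e-3, 1.0e-3

# ADC computed with b = 0 as the lower b value includes the perfusion component.
adc_from_b0 = two_point_adc(0, 800,
                            ivim_signal(0, s0, f, d_star, d),
                            ivim_signal(800, s0, f, d_star, d))

# ADC computed with b = 150 as the lower b value largely suppresses it.
adc_from_b150 = two_point_adc(150, 800,
                              ivim_signal(150, s0, f, d_star, d),
                              ivim_signal(800, s0, f, d_star, d))
```

With these assumed parameters, the estimate that starts from b = 0 overestimates the tissue diffusivity by roughly 28%, whereas the estimate that starts from b = 150 sec/mm2 is within about 2% of it. With alpha = 1, the stretched exponential reduces to the monoexponential model.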
Diffusion-weighted MRI is already being incorporated into general oncologic imaging practice because of its many clinical advantages [13,14]. A particular advantage of DW-MRI is that it does not require intravenous contrast media, thus enabling its use in patients with reduced renal function. Its clinical uses include improving tissue characterization (differentiating benign from malignant lesions), monitoring treatment response after chemotherapy or radiation, differentiating posttherapeutic changes from residual active tumor, and detecting recurrent cancer. Potential additional roles include predicting treatment outcomes (before and soon after starting therapy), tumor staging, and perhaps also detecting lymph node involvement by cancer.
The reasons why malignant tumors have lower ADC values are poorly understood but are probably related to a combination of higher cellularity, tissue disorganization, and increased extracellular space tortuosity, all contributing to reduced motion of water (Figures 1 and 2). Correlations with cellularity have been found for some primary and secondary neoplasms [15–20] but not for all tumors; adenocarcinomas and necrotic lesions, for example, correlate only weakly [15,21]. Diffusion-weighted MRI is able to differentiate between benign and malignant focal hepatic lesions in many cases based on the higher ADC of benign lesions compared with malignant lesions. However, when cystic, necrotic, or treated metastases are included, results in the liver are not as good. In line with these findings, reduced ADC values of malignant breast tumors compared with those of benign lesions and normal tissue have also been noted [16,24,25]. Sumi et al. showed that lymphomatous nodes had significantly lower ADC than benign nodes. However, in the same study, metastatic cervical lymph nodes in patients with head and neck cancers had significantly higher ADC values than benign nodes. This apparently discrepant result can be explained by the common occurrence of necrosis in nodes with metastatic squamous cell carcinomas. In a confirmatory study, the ADC values of lymphomas were reported to be lower than those of squamous cell carcinomas. These and other studies indicate that false-positive results occur with abscess and infective processes and false-negatives occur with cystic, necrotic lesions and in well-differentiated neoplasms (particularly adenocarcinomas).
Because cellular death and vascular changes in response to treatment can both precede changes in lesion size, changes in DW-MRI may be an effective early biomarker for treatment outcome both for vascular disruptive drugs and for therapies that induce apoptosis [14,28,29]. Preclinical work has shown that DW-MRI is able to discriminate between nonperfused but viable and nonperfused, nonviable (necrotic) tissues, when tumors are treated with the vascular disruptive agent combretastatin-A4-phosphate. In most malignant tumors, successful treatment is reflected by increases in ADC values. Rising ADC values with successful therapy have been noted in several anatomic sites, including breast cancers [31,32], primary and metastatic cancers of the liver [33,34], primary sarcomas of bone [15,35], and brain malignancies.
Soon after initiation of therapy, transient decreases in ADC can also be observed; this seems to be related to cellular swelling or to reductions in blood flow or in the extracellular space (the latter may be mediated by vascular normalization if antiangiogenic drugs are given). For instance, it has been noted that anti-vascular endothelial growth factor therapies in brain tumors lead to an initial reduction in vasogenic edema that lowers ADC values. The extent and duration of such ADC reductions are likely to depend on the type of treatment administered, tumor type, and the timing of imaging with respect to the treatment. Cellular swelling has been noted to occur in the early phases of apoptosis in response to anticancer treatment. Apparent diffusion coefficient values can be reduced by fibrosis and dehydration after successful treatment as reported in rectal cancers and in brain gliomas. These additional observations indicate that ADC changes are dependent on complex interplays of biophysical processes, emphasizing the need to better understand, at a histologic level, the tissue changes reflected in ADC maps.
The differentiation of posttreatment changes and residual or recurrent tumor is a common diagnostic dilemma. In head and neck tumors, for instance, recurrent tumor and chondroradionecrosis are often impossible to distinguish clinically or by imaging. Diffusion-weighted MRI has the potential to distinguish postradiation changes from recurrent cancer based on ADC value differences. Higher ADC values likely represent posttherapeutic extracellular edema, whereas lower values are suspicious for active disease. Simple visual assessments of signal intensity on high-b value DW images may be helpful for image interpretation, with hyperintensity associated with lower ADC values suggesting active tumor. In this respect, DW-MRI may have advantages over fluorodeoxyglucose-positron emission tomography (FDG-PET) assessments, which can be limited shortly after radiation therapy because areas of inflammation often have high uptake on PET scans and can lead to false-positive results.
Diffusion-weighted MRI has been shown to have the potential to prospectively predict the success of some treatments in a number of different tumors [30,33,36,38–43]. For example, strong negative correlations between pretreatment tumor ADC in patients with rectal cancer and size changes after chemotherapy and chemoradiation have been found. This and other similar observations have led to the hypothesis that tumors with higher ADC levels are more likely to have areas of necrosis, which in turn predicts poor outcomes related to hypoxia-mediated radioresistance. This relationship between poor outcomes and high pretherapy ADC may not apply to all tumors and to all therapy types. For example, in an animal tumor model treated with a vascular disruptive agent, tumors with low ADC values still had viable tumor cells on histologic examination after therapy, whereas tumors with higher ADC values had a greater degree of cell kill.
There is early experience indicating that DW-MRI may be helpful for primary tumor staging and for detecting nodal [26,27] and distant metastases. In this regard, whole-body DW-MRI seems to be particularly promising. Determining threshold ADC values and confounding effects that can allow/impede the differentiation of benign and malignant lymph nodes will be important goals for future clinical studies.
Diffusion-weighted MRI has a number of roles in neuroradiology including the diagnosis of acute stroke as early as 30 minutes after the onset of ischemia, compared with the hours-to-days range for computed tomography and other MRI sequences. For this reason, DW-MRI has easily and quickly surpassed all other imaging techniques in the initial evaluation of acute stroke patients. In the acute stroke setting, DW-MRI can also be used with perfusion imaging to predict the likely clinical outcome after thrombolytic therapy and estimate the chances of secondary hemorrhagic transformation [48, Chalela et al., 2007, 83]. Other uses of DW-MRI include differentiating brain abscesses from other abnormalities that may mimic abscesses, with high sensitivity (96%) and specificity (96%), with the exception of differentiating toxoplasmosis from lymphoma in HIV-positive patients. Diffusion-weighted MRI can also be used for predicting the extent of neuronal damage after status epilepticus, for differentiating arachnoid cysts from intracranial epidermoid cysts, and for evaluating residual epidermoid tumor after surgical resection.
In the differential diagnosis of cystic brain lesions, DW-MRI can help distinguish abscesses from necrotic primary brain tumors such as high-grade gliomas, with lower ADC values usually detected in abscesses. Lower water diffusivity in abscesses is probably related to the presence of microorganisms, macromolecules, and intact inflammatory cells. When intracranial masses are solid, the main determinant of diffusivity is the volume of the extracellular space. With a larger extracellular volume, which may be caused by edema/fluid accumulation, ADC values are higher, whereas tumor hypercellularity has the effect of restricting diffusion by decreasing the extracellular volume. Lower ADC values in lymphomas, compared with gliomas, correlate well with measures of cellularity (Figures 3 and 4). Apparent diffusion coefficient values also correlate with tumor cellularity in astrocytomas [20,53], although the value of DW-MRI in tumor grading is still debated. Diffusion-weighted MRI has also been investigated as a biomarker of response to treatment in brain tumors, with increased diffusion values detected shortly after treatment initiation suggesting a favorable outcome [42,54,55].
Diffusion tensor imaging (DTI), which provides directionality to diffusion measurements, can be used to assess the relationships between tumor and nearby white matter tracts, potentially differentiating tumor infiltration of white matter tracts from displacement, which can be useful for preoperative planning. Fractional anisotropy (FA), the degree to which diffusion is oriented along a particular direction, is generally reduced in primary brain tumors owing to disorganized architecture resulting from neuronal death, axonal loss, and irregular tumor cellular growth. Fractional anisotropy reductions also correlate with tumor cellularity and percentage tumor infiltration.
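For readers unfamiliar with how FA is computed, the sketch below uses the standard definition of FA from the three eigenvalues of the diffusion tensor. This formula is not given in the present document, and the function name and example eigenvalues are hypothetical.

```python
import math

def fractional_anisotropy(l1, l2, l3):
    """Fractional anisotropy from the three diffusion tensor eigenvalues.

    FA = sqrt(3/2) * sqrt(sum((l_i - mean)^2)) / sqrt(sum(l_i^2)),
    ranging from 0 (isotropic) to 1 (diffusion along a single axis).
    """
    mean = (l1 + l2 + l3) / 3.0
    numerator = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denominator == 0:
        return 0.0  # no measurable diffusion; treat as isotropic
    return math.sqrt(1.5 * numerator / denominator)

# Hypothetical eigenvalues in units of 1e-3 mm^2/sec.
fa_isotropic = fractional_anisotropy(1.0, 1.0, 1.0)   # equal in all directions
fa_fiberlike = fractional_anisotropy(1.7, 0.3, 0.3)   # one dominant direction
```

The isotropic case evaluates to 0, and a tensor with one dominant eigenvalue gives a value approaching 1, which is the behavior expected of ordered fiber tracts compared with disorganized tumor tissue.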
Diffusion-weighted MRI has the potential to assist in new drug development and in clinical practice. To be accepted in either area requires that DW-MRI be validated as an accurate biomarker. This will require systematically conducted prospective studies where patients are assessed by both DW-MRI and standard criteria. Diffusion-weighted MRI will only be accepted if it is shown to provide accurate information earlier, more quickly, or more easily than current methods, or to provide information unattainable with other modalities.
In clinical drug development, it would be important to see if a drug in a phase 1 study alters ADC and in which direction. It would be relatively easy to add DW-MRI to studies that were already performing serial MRI studies but it would seem sensible to look at earlier time points. This is because of the need to identify the time point of maximal response; it is only after defining this point correctly in phase 1 trials that DW-MRI could be incorporated into phase 2 and 3 studies where the opportunities to perform multiple repeat studies are limited. Furthermore, serial DW-MRI measurements that include early time points are essential to avoid false-negative results. Data from multiple studies with a wide variety of different agents incorporating DW-MRI would have to be available, before “go-no-go” decisions were made based on whether changes in ADC were seen.
If changes in ADC were seen in the phase 1 or early phase 2 studies of a new drug, it would be useful to incorporate DW-MRI measurements into phase 3 trials. Changes in ADC would then be correlated with Response Evaluation Criteria in Solid Tumors (RECIST)-defined responses, progression-free survival, and survival. If there were robust correlations between changes in ADC and one of the other efficacy end points, one could examine whether DW-MRI added value to the trial. One added value would be an earlier prediction of drug activity than is possible with other biomarkers. Another would be a better prediction of response than by standard criteria.
Care should be exercised in comparing responses according to RECIST and responses according to DW-MRI. For general use, it would be necessary to produce internationally accepted definitions for response/nonresponse on DW-MRI (these are not currently available). To produce such definitions requires analyzing numerous trials where DW-MRI results and standard end points are available. Correlation with survival or progression-free survival is far more important than correlation with RECIST response, which is just another surrogate marker of activity. If DW-MRI does predict response, it might indicate responses in some patients where RECIST does not and vice versa. If, for instance, a rise in ADC is associated with apoptosis, this effect might be associated with stable disease rather than partial response, yet such a drug might be valuable in extending survival. In such a situation, RECIST might indicate that a drug was inactive, but higher ADCs might be an early indicator that the drug is effective as maintenance therapy.
To validate DW-MRI as a biomarker, methods similar to those used to validate the serum tumor marker CA-125 could be used. For example, rather than trying to correlate DW-MRI with RECIST response in individual patients, one could determine whether DW-MRI was able to classify the drug as active or not. Such an approach is particularly useful when different patients might be classified as responders by different techniques.
Another area in which DW-MRI could have value in clinical trials is in predicting which patients are most likely to benefit or not benefit from a drug or approach. Enriching a population to increase the proportion of patients benefiting is becoming increasingly important as drug development costs rise. It might, however, be difficult to determine an absolute ADC value that has predictive power (given the complexities of data acquisition and analysis methods, particularly when examinations are done on different scanners).
A technique that can detect drug efficacy at an earlier time point has great potential in clinical oncology practice where there is a desire to stop ineffective therapy as quickly as possible especially if that therapy has severe adverse effects and is very expensive or where there are alternative approaches to treatment available. However, to change therapy based on a new technique requires having confidence about its accuracy. Large studies would be required to determine the sensitivity and specificity of DW-MRI for predicting activity in groups of patients. However, to change therapy on individuals also requires knowledge of measurement error. Depending on the alternative strategy to be used, a very high individual value in the positive predictive value/negative predictive value (PPV/NPV) would be required. Furthermore, it would be necessary to demonstrate that DW-MRI was able to provide information that was more accurate or was not available by other techniques and that it was widely available at a realistic cost. This aspect of DW-MRI development is illustrated in Appendix 2.
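The dependence of PPV and NPV on sensitivity, specificity, and the proportion of true responders can be made concrete with Bayes' rule. The following sketch is illustrative only; the function name and the input values are hypothetical and are not results from any DW-MRI study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test, given its sensitivity and specificity
    and the prevalence of the condition (e.g., true response) in the group."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be proportions between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive value undefined for these inputs")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical test with 90% sensitivity and 90% specificity,
# applied where 50% and then 10% of patients truly respond.
ppv_common, npv_common = predictive_values(0.9, 0.9, 0.5)
ppv_rare, npv_rare = predictive_values(0.9, 0.9, 0.1)
```

With these assumed inputs, the PPV falls from 0.9 to 0.5 when the proportion of true responders drops from 50% to 10%, which illustrates why group-level sensitivity and specificity alone are not sufficient for decisions about individual patients.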
One could envisage that DW-MRI could give information about a tumor that would provide both prognostic and predictive information. The use of imatinib for gastrointestinal stromal tumors and trastuzumab for HER2-positive breast cancers are examples of the few situations where the activity of the new molecular targeted agents can be predicted by pretreatment investigations. There are no tests that predict which patients are most likely to benefit from new antiangiogenic, vascular disruptive agents, or other novel proapoptotic drugs. Only well-designed prospective studies will demonstrate whether DW-MRI can help predict which patients are most likely to benefit from such agents. Such studies might ultimately show that an absolute pretreatment ADC value below a specified range or an absence of change in serial ADC values once treatment has started is associated with a worse prognosis. However, yet another prognostic marker is unlikely to be widely used (given the higher cost of imaging tests compared with serology, for example) unless such information leads to changes in therapy.
Many new therapeutics are presently entering development as a result of increased understanding of the molecular and genetic pathways controlling cellular function. Many of the structural and functional components of the cancer/host cell surface, cytoplasm, and nucleus are being explored for their potential value as therapeutic targets. Targeted molecular approaches typically seek to inhibit the cellular processes characteristic of the cancer phenotype. These new drugs may have complex and possibly even contradictory effects on water diffusion.
Central to the success of this process are the go-no-go decisions made in early clinical trials, an important aim of which is to reduce the high cost of pivotal (phase 3) trials. Because no single biomarker or assay is used to make such judgments, imaging will need to establish its place before being integrated into decision-making processes.
Strategic responses of the pharmaceutical industry to overcome current bottlenecks that result in delays to delivery of new drugs to market include using novel clinical trial designs and investments in novel technologies. Clinical trials are now being designed with the expectation of real-time data analysis and delivery. Imaging, together with other biomarkers, is recognized as providing accurate and reproducible data capable of enhancing decision making at critical milestones in the drug development process.
With this in mind, the important end points for imagers to consider include
Translational and early clinical development is well suited as a focus for DW-MRI in drug development. The nonionizing nature of DW-MRI and avoidance of contrast agents are conducive to serial patient examinations in exploratory early-phase drug trials. By their nature, early-phase clinical trials would allow the timely inclusion of preclinical DW-MRI information to select clinical imaging schedules and assess the likely clinical sensitivity and specificity of the technique. Key questions in early clinical trials include the following: Has there been a change caused by the drug detectable by DW-MRI? What is the confidence that such a change has occurred? What is the magnitude of the change? What is the meaning or predictive value of the changes observed?
The limited sample size in early-phase clinical trials provides the opportunity for the rapid accumulation and interpretation of imaging data across many tumor types and therapeutics. Although this approach needs to be balanced against the limits of small sample size, studies should have a clearly stated aim(s) focused on one or more of the end points given above.
There is no reason to think that DW-MRI will be preferred for any specific class of drug. Any therapeutic could be investigated by DW-MRI, and the mechanism of action of a drug should not exclude consideration of this technique as long as the biology and sensitivity of the technique support its rational use. As a starting point, the drug mechanism(s) most likely to be detected by DW-MRI are those likely to alter the microenvironmental architecture (i.e., apoptosis and angiolysis). The preclinical and evolving clinical experience of DW-MRI in clinical trials suggest its use as a pharmacodynamic biomarker (the manner in which the drug affects its intended target). Such data might provide insight into (1) dose scheduling in single or combination therapy and (2) optimal drug formulation (e.g., oral vs intravenous administration).
Because DW-MRI has the potential to provide unique information for decision making, procedural rigor will be needed to establish it as a biomarker. Technical reproducibility needs to be determined to define significant thresholds of change in diffusion indices. Great emphasis needs to be placed on the need for reproducible examinations suited for the multicenter and global structure of early-phase clinical trials. In addition, there needs to be a rational basis for the choice of scanning times given the mechanism of the drug being assessed. Acquisition sequences, data transfer, and analyses all need to be standardized to allow future meta-analyses and comparisons of results.
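One common way to turn test-retest measurements into a threshold for significant change is the repeatability coefficient, which is approximately 2.77 times the within-subject standard deviation. The sketch below assumes paired baseline ADC measurements per patient; the approach is a standard statistical convention rather than a method prescribed in this document, and the function name and data values are hypothetical.

```python
import math

def repeatability_coefficient(pairs):
    """Repeatability coefficient from test-retest pairs.

    For paired measurements, the within-subject variance is estimated as
    the mean of (difference^2 / 2). The coefficient 1.96 * sqrt(2) * wSD
    is the difference expected to be exceeded by only about 5% of
    repeat measurements when no true change has occurred.
    """
    if not pairs:
        raise ValueError("at least one test-retest pair is required")
    within_variance = sum((a - b) ** 2 / 2.0 for a, b in pairs) / len(pairs)
    within_sd = math.sqrt(within_variance)
    return 1.96 * math.sqrt(2.0) * within_sd

# Hypothetical baseline ADC pairs, in units of 1e-3 mm^2/sec.
baseline_pairs = [(1.00, 1.10), (1.20, 1.10), (0.90, 1.00), (1.30, 1.20)]
rc = repeatability_coefficient(baseline_pairs)
```

Under these assumptions, a change in ADC for an individual patient that is smaller than the repeatability coefficient cannot be confidently distinguished from measurement variability.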
Diffusion-weighted MRI clearly differentiates itself from other imaging modalities both within the MRI “space” and outside as the only imaging modality able to depict water movement at a cellular level. Evaluations of the value of DW-MRI as a “contrast mechanism” leading to new clinical applications could open up new market segments encouraging more sales for MRI equipment and innovations in this area. If DW-MRI can be shown to be a robust and reproducible early biomarker of response to anticancer therapies, which is useful for drug development or for making patient decisions, then this would allow the installed MRI market to increase further. Improved understanding of the mechanisms that determine tissue diffusivity and ADC changes in response to therapy would encourage more rapid dissemination of the method.
To enable such developments, all stakeholders must agree on standards for acquisition protocols, for repeatability/reproducibility assessment, and for postprocessing procedures, to ensure that quantitative ADC values have similar meanings across vendors and institutions. Setting down such standards in cooperation with the scientific imaging community will allow vendors to focus their research and development resources on improving measurement and analysis methods.
Although there are several DW-MRI data acquisition techniques, the most commonly implemented basis sequence is the single or double spin-echo Stejskal-Tanner echo-planar imaging (EPI) experiment. The following comments apply to clinical imaging at 1.5 and 3.0 T but may vary and need adjustment according to field strength. To ensure high-quality images for both qualitative and quantitative assessments, scanning factors should be optimized to maximize SNR and reduce artifacts (e.g., from motion, incomplete fat suppression, residual eddy currents induced by diffusion gradients, and EPI-related artifacts). In the body, DW-MRI can be performed using breath-hold, free breathing, or respiratory/cardiac-triggered techniques as dictated by specific anatomic locations. Scanning parameters should be prescribed to allow accurate and reproducible ADC quantification, and the chosen parameters should ideally be achievable across MR platforms to allow meaningful comparison of results. The scanning parameters should be clearly stated in reports and manuscripts.
The following summarizes the key factors that would help to optimize image quality.
The following imaging techniques can help reduce motion artifacts
To maximize tumor visualization and characterization, adequate suppression of background signals arising from normal tissue is desirable. Diffusion-weighted MRI should be performed with sufficient degrees of diffusion weighting (by appropriate choices of b values), with considerations given for the anatomic region, tissue composition, and pathologic processes as indicated in Table 1. This may require the customization of DW-MRI protocols for different tumor types and tumor locations.
Both native high-b value DW-MR images and the ADC maps are useful for visual assessments of DW-MRI data; both should be evaluated with corresponding morphologic images. Cellular tissues generally demonstrate high signal intensity on high-b value images but yield low ADC values. This pattern is occasionally also seen in viscous fluids, such as those in postoperative cavities and abscesses, emphasizing the need to undertake correlative imaging with other morphologic MRI sequences, including those using contrast medium enhancement. Conversely, cystic or necrotic tissues show greater signal attenuation on high-b value images and have high ADC values (in this context, the definition of “necrosis” needs to be established). The latter pattern is sometimes also seen in well-differentiated neoplasms.
Interestingly, some normal and pathologic tissues exhibit high signal intensity on high-b value DW-MRI and also return high ADC values. In these cases, the high signal intensity observed on DW-MRI cannot be attributed to limitations in water diffusion but results from intrinsically long tissue T2-relaxation times, an effect known as “T2-shine through.” In some anatomic regions (e.g., prostate gland), ADC maps may be more helpful for disease detection because T2-shine through from the normal peripheral zone may mask disease even on high-b value DW-MR images.
Fibrosis may appear low in signal intensity on DW-MRI and return low ADC values, but the range of the imaging appearances of fibrotic tissue has not been fully characterized.
Table 1 summarizes b values that may be used as a guide when performing DW-MRI for qualitative assessment [18,22,39,63–69]. For some tissues (e.g., prostate, lymph nodes), b values >1000 sec/mm2 are occasionally needed to mitigate the effects of “T2-shine through.”
Meaningful comparisons of DW-MRI from different imaging centers with data acquired from different platforms are more likely to be realized if DW-MR images are quantified.
Apparent diffusion coefficient quantification obtained using breath-hold DW-MRI is less reproducible compared with ADC obtained using free breathing techniques. However, ADC values obtained using free breathing techniques mask tissue heterogeneity owing to partial volume averaging effects. The relative balance between these two trends may vary in different parts of the body thereby necessitating different approaches.
In the absence of conclusive data demonstrating the superiority of ADC derived from either breath-hold or free breathing techniques for assessing treatment response, both techniques should be investigated.
The following should be considered when performing DW-MRI for ADC quantification:
To ensure maximum transparency, all pertinent scan parameters should be recorded and clearly stated in reports, manuscripts, and other scientific publications. This would facilitate investigators replicating the imaging technique on similar platforms, translating techniques onto other imaging systems, and enabling comparisons of ADC values between vendors.
Field strength, MR system and model, gradient performance, software version, field of view (FOV), matrix size, technique (e.g., breath-hold, free breathing, respiratory-triggered), type of imaging sequence, TR, TE, number of partitions, section thickness, number of averages (including the use of high-b value averaging), fat suppression technique, choice of b values, use of tetrahedral encoding, receiver bandwidth, duration of application of each diffusion gradient (δ), and time between application of the diffusion gradients (Δ).
Accurate recording of imaging parameters may then be used to retrospectively evaluate the relative value of techniques for optimizing DW-MRI. Many of these parameters are already recorded in DICOM header information and are retrievable. Obtaining access to this header information is critical for accurate image analysis. Magnetic resonance imaging vendors should support access to this information in a move toward greater transparency/standardization.
For many clinical applications, single b-value DW-MRI at relatively high diffusion weighting offers exceptional sensitivity to detect disease (this is evident from experiences in the brain for the early detection of stroke (on high-b value images) and in the liver for the detection of lesions using “black blood” low-b value images). However, image signal analysis from single b-value images is inadequate for even rudimentary quantitative analysis of water mobility in tissue.
Multiple b values are necessary to calculate the ADC. At least two b values are needed for basic ADC calculations and this can be done on most clinical systems. Implicit in two b-value ADC calculations is the application of a monoexponential decay model. That is, an ADC map is generated by the natural logarithm of the ratio of low-b value over high-b value image, scaled by inverse of the b-value difference. Albeit overly simple, this two-point method is adequate in instances where multiexponential features are negligible over the acquired b-value range. Moreover, the adoption of this basic analysis has led to reasonable agreement across centers and MRI vendors for ADC quantification of the human brain (b-value range 0–1000 sec/mm2). Many observers believe this may be sufficient for clinical usage, although it is likely that physical measurement of water diffusion in tissue is more complex than described by the monoexponential decay model.
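The two-point calculation described above can be written as ADC = ln(S_low/S_high)/(b_high − b_low). The following is a minimal sketch of that calculation for a single voxel; the function name `two_point_adc` and the synthetic signal values are illustrative assumptions and are not taken from any vendor software.

```python
import math

def two_point_adc(s_low, s_high, b_low, b_high):
    """Monoexponential ADC (mm^2/sec) from signals at two b values (sec/mm^2)."""
    if b_high <= b_low:
        raise ValueError("b_high must exceed b_low")
    if s_low <= 0 or s_high <= 0:
        return float("nan")  # logarithm undefined for non-positive signal
    # Natural log of the low-b/high-b signal ratio, scaled by 1/(b difference)
    return math.log(s_low / s_high) / (b_high - b_low)

# Synthetic voxel (assumed values): true diffusivity 1.0e-3 mm^2/sec
true_adc = 1.0e-3
s100 = 1000.0 * math.exp(-100 * true_adc)
s800 = 1000.0 * math.exp(-800 * true_adc)
adc = two_point_adc(s100, s800, 100, 800)
print(f"recovered ADC = {adc:.3e} mm^2/sec")
```

For noise-free monoexponential data, the estimate is independent of the pair of b values chosen; with real data, the choice of b values and the SNR at the higher b value both affect the result.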
In highly vascular tissue, blood flow/perfusion may impart significant signal attenuation over the low b-value range (from b = 0 to b = 100 sec/mm2), which artificially inflates diffusion estimates. As described above, nonzero lower b values should be used to eliminate vascular contributions to the calculated ADC. The minimum b value threshold to suppress perfusion effects will depend on the vascular properties of tissues, although for most applications, a lower b value of 100 to 150 sec/mm2 is probably adequate.
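The inflation of diffusion estimates by perfusion can be illustrated with a simple two-compartment signal model in which a small, rapidly decaying fraction represents the vascular contribution. This is a sketch under stated assumptions: the model form, the perfusion fraction `f`, the pseudo-diffusion coefficient `d_star`, and the tissue diffusivity `d` are hypothetical values chosen only for illustration.

```python
import math

def signal(b, f=0.10, d_star=20.0e-3, d=1.0e-3):
    """Normalized signal: fast (vascular) fraction f plus slow (tissue) fraction."""
    return f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d)

def two_point_adc(b_low, b_high):
    """Monoexponential two-point ADC computed from the model signal."""
    return math.log(signal(b_low) / signal(b_high)) / (b_high - b_low)

tissue_d = 1.0e-3
adc_0_800 = two_point_adc(0, 800)      # includes the perfusion-sensitive range
adc_100_800 = two_point_adc(100, 800)  # nonzero lower b value

print(f"ADC (b = 0, 800):   {adc_0_800:.3e} mm^2/sec")
print(f"ADC (b = 100, 800): {adc_100_800:.3e} mm^2/sec")
print(f"assumed tissue D:   {tissue_d:.3e} mm^2/sec")
```

Under these assumed parameters, the estimate that uses b = 0 lies further above the tissue diffusivity than the estimate that uses a lower b value of 100 sec/mm2, which is the behavior the recommendation is intended to avoid.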
It is recommended to also continue to acquire the nominal “b = 0” image to provide anatomic information and to maintain consistency with prior work. Usually, the b = 0 image can be obtained at nearly no cost in scan times using single-shot techniques, particularly because acquisition along three-orthogonal axes is not performed for the b = 0 weighting.
For applications where DW-MRI is acquired over larger ranges of diffusion sensitivities, and assuming perfusion effects have been effectively removed by the proper choice of the lower b value, simple monoexponential models may not adequately characterize the decay curve. Usually, evidence of true multiexponential features (not related to perfusion effects) requires substantially higher b values (e.g., 2000–6000 sec/mm2), much greater than is typically acquired in clinical studies owing to practical SNR limits. Proper analysis of these data types requires multiexponential models where signal decays are modeled as weighted sums of two or more exponentials (provided that the signals at the highest b value are above the noise level) [10,11] or alternative models such as stretched exponentials that allow a distribution of diffusion coefficients in each voxel. As with other curve fitting challenges, the ability to reliably isolate multiple decay coefficients depends on the difference between the true Dfast and Dslow, SNR, b-value range, and number of b values acquired. Rejection of low SNR pixels and/or incorporation of SNR weights in the multiexponential fitting routine should be used to mitigate fitting errors. An unfortunate tradeoff in acquisition of DW-MRI over many b values and/or averaging to increase SNR to support multiexponential diffusion analysis is the commensurate increase in scan times, which may not be practical in many clinical settings.
Diffusion in some tissues is known to be directionally dependent, that is, anisotropic (e.g., in the central nervous system and in muscle). If it is known a priori that the tissue of interest is isotropic (e.g., most tumor models), then a single gradient direction is usually sufficient to properly document diffusion properties. In general, however, it is safer to assume the lesion of interest and its surrounding tissues may have directional dependencies, so it is best to measure water mobility along at least three orthogonal diffusion gradient directions yielding, say, ADCx, ADCy, and ADCz. The simple average of these into a mean diffusivity value effectively removes confounding influences of the relative orientation between tissue and the imaging system. This mean diffusivity bears the same desirable rotational independence as the trace of the full diffusion tensor without having to acquire or process DTI.
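The two families of models mentioned above can be sketched as forward signal equations. The biexponential form is a weighted sum of a fast and a slow component, and the stretched exponential is taken here in the commonly used form S = S0·exp(−(b·DDC)^α). The parameter values below are assumptions chosen for illustration, and the code does not perform any fitting.

```python
import math

def biexponential(b, s0, f_fast, d_fast, d_slow):
    """Weighted sum of two exponentials (fractions f_fast and 1 - f_fast)."""
    return s0 * (f_fast * math.exp(-b * d_fast)
                 + (1.0 - f_fast) * math.exp(-b * d_slow))

def stretched_exponential(b, s0, ddc, alpha):
    """Stretched exponential; alpha = 1 reduces to monoexponential decay."""
    return s0 * math.exp(-((b * ddc) ** alpha))

def apparent_adc(model, b_low, b_high):
    """Two-point monoexponential ADC applied to a model signal curve."""
    return math.log(model(b_low) / model(b_high)) / (b_high - b_low)

# Hypothetical tissue: 70% fast component, 30% slow component
tissue = lambda b: biexponential(b, 1.0, 0.7, 1.5e-3, 0.3e-3)

adc_clinical = apparent_adc(tissue, 100, 1000)  # typical clinical range
adc_extended = apparent_adc(tissue, 100, 4000)  # extended high-b range

print(f"apparent ADC, b = 100-1000: {adc_clinical:.3e} mm^2/sec")
print(f"apparent ADC, b = 100-4000: {adc_extended:.3e} mm^2/sec")
```

When the underlying decay is not monoexponential, the two-point estimate depends on the b-value range used, which is one reason the choice of b values should always be reported alongside ADC values.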
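The rotational independence of the mean of three orthogonal ADC measurements can be checked numerically. The sketch below assumes a hypothetical anisotropic tissue whose principal diffusivities are known and which is rotated about the z axis relative to the scanner; the helper names and values are illustrative only.

```python
import math

def directional_adcs(principal, angle_rad):
    """ADC along scanner x, y, z for a tensor rotated about z by angle_rad."""
    l1, l2, l3 = principal
    c2 = math.cos(angle_rad) ** 2
    s2 = math.sin(angle_rad) ** 2
    # Diagonal elements of R * diag(l1, l2, l3) * R^T for a rotation about z
    return (l1 * c2 + l2 * s2, l1 * s2 + l2 * c2, l3)

def mean_diffusivity(adc_x, adc_y, adc_z):
    """Simple average of ADC measured along three orthogonal directions."""
    return (adc_x + adc_y + adc_z) / 3.0

principal = (1.7e-3, 0.4e-3, 0.3e-3)  # assumed principal diffusivities, mm^2/sec

aligned = directional_adcs(principal, 0.0)
rotated = directional_adcs(principal, math.radians(35.0))

md_aligned = mean_diffusivity(*aligned)
md_rotated = mean_diffusivity(*rotated)

print("ADCx aligned vs rotated:", aligned[0], rotated[0])
print("mean diffusivity aligned vs rotated:", md_aligned, md_rotated)
```

The individual directional values change with orientation, whereas their average does not, because the sum of the three diagonal elements is the trace of the tensor.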
If further information is specifically desired regarding the strength and spatial patterns of anisotropy, at least six gradient directions are required to generate the full diffusion tensor, although additional gradient directions (9–32 commonly) generally improve the quality of the tensor analysis results.
Most MRI vendors offer the option to acquire and process DTI scans in a reasonably efficient manner. Intrascan image registration should be applied before tensor analysis if there are systematic shifts and image distortions between gradient directions. Multiple indices are available to quantify the degree of anisotropy (e.g., FA, relative anisotropy). In addition, the direction and strength of anisotropy can be color-encoded using the principal eigenvector of the diffusion tensor. Furthermore, the connectivity of anisotropic domains can be represented in tractography, which allows visualization of tissue fiber tracts in three dimensions. As suggested above, diffusion anisotropy is relatively strong in the CNS. Outside the CNS, with the exception of the kidney and muscle, however, anisotropy is rather modest, and therefore, most tumor analyses have been directed toward isotropic diffusivity indices (i.e., ADC calculations).
To study the diffusion properties of tumors, lesion boundaries must be properly delineated for subsequent quantification. Ideally, the region of interest (ROI) is contoured around lesions using images with the highest contrast between lesion and normal tissue. Subjective placement of smaller ROIs within lesions is not recommended, particularly for response assessment studies.
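As an illustration of one anisotropy index, fractional anisotropy (FA) can be computed from the three eigenvalues of the diffusion tensor using the standard definition FA = sqrt(3/2)·sqrt(Σ(λi − MD)²)/sqrt(Σλi²), where MD is the mean of the eigenvalues. The sketch below assumes the eigenvalues have already been obtained from a tensor fit; the example values are hypothetical.

```python
import math

def fractional_anisotropy(eigenvalues):
    """FA from the three diffusion tensor eigenvalues (0 = isotropic, 1 = maximal)."""
    l1, l2, l3 = eigenvalues
    md = (l1 + l2 + l3) / 3.0
    numerator = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denominator == 0.0:
        return 0.0  # no measurable diffusion; treat as isotropic
    return math.sqrt(1.5 * numerator / denominator)

fa_isotropic = fractional_anisotropy((1.0e-3, 1.0e-3, 1.0e-3))
fa_fiber = fractional_anisotropy((1.7e-3, 0.3e-3, 0.3e-3))  # assumed fiber-like tissue
fa_limit = fractional_anisotropy((1.0e-3, 0.0, 0.0))        # diffusion along one axis only

print(f"isotropic: {fa_isotropic:.2f}, fiber-like: {fa_fiber:.2f}, limit: {fa_limit:.2f}")
```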
Traditional high-contrast anatomic images, such as T2-weighted and contrast medium-enhanced T1-weighted, which are independent of the DW-MRI sequences are preferred, but translation of ROIs to the DW-MR image set is then required. Transferal of such ROIs to the DW-MRI data set requires image registration unless prescription of the traditional and DW-MRI scans was identical (ignoring for the moment other systematic distortions). In some instances, the DW images themselves can offer strong lesion/tissue contrast, in which case these are sufficient for ROI definition. Ideally, the b0 T2-weighted image (or a very low b value image) should be used, although, occasionally, higher b-value images may have to be used.
There is debate as to which b-value image best delineates tumor from normal tissue/necrotic tissues. When ROIs are drawn on high-b value images for the estimation of ADC values, such ROIs are said to represent “viable tumor” because the detrimental effects of necrosis are ameliorated. However, such a method for defining ROIs is occasionally prone to error because of T2-shine through effects. Furthermore, in the presence of necrosis/cystic structures, lesion extent may be underestimated. It is also important to remember that well-differentiated tumors may not be seen on high-b value images. Whatever the method used to define ROIs, a standard, recorded strategy should be applied to ensure consistency within any given study.
In the ADC calculation methods described above, low SNR pixel values should be eliminated before the ADC map calculation, and these pixels should be flagged as “not-a-number” for exclusion. However, elimination of low SNR pixels for ADC calculations and/or the use of high-b value images for ROI definitions can be problematic when evaluating the therapeutic effects of some drugs. For example, chemotherapy for teratoma can cause a poorly differentiated tumor to become well differentiated (a favorable outcome measure), and some drugs/therapies induce necrosis and cystic degeneration. In both cases, ROIs placed solely on areas of “viable tumor,” however defined, would underestimate/mask favorable therapeutic effects. Counting the pixels with zero ADC values before and after therapy would be one way of dealing effectively with these issues.
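The exclusion and counting of low SNR pixels can be sketched as follows. This is an illustration only: the SNR threshold `snr_min`, the use of a single noise standard deviation for the whole image, and the pixel values are all assumptions, and excluded pixels are flagged as not-a-number so that they can be counted at each time point rather than silently discarded.

```python
import math

def adc_map_with_exclusions(low_b, high_b, b_low, b_high, noise_sd, snr_min=3.0):
    """Two-point ADC per pixel; pixels below the SNR threshold are set to NaN."""
    adc = []
    for s_lo, s_hi in zip(low_b, high_b):
        if s_lo <= 0 or s_hi < snr_min * noise_sd:
            adc.append(float("nan"))  # flagged for exclusion
        else:
            adc.append(math.log(s_lo / s_hi) / (b_high - b_low))
    return adc

def count_excluded(adc):
    """Number of pixels flagged as not-a-number."""
    return sum(1 for v in adc if math.isnan(v))

# Hypothetical signals for three pixels at b = 100 and b = 800 sec/mm^2
low_b = [900.0, 850.0, 40.0]
high_b = [400.0, 10.0, 5.0]
adc = adc_map_with_exclusions(low_b, high_b, 100, 800, noise_sd=10.0)

print("ADC values:", adc)
print("excluded pixels:", count_excluded(adc))
```

Reporting the excluded-pixel count before and after therapy keeps a treatment-induced increase in necrotic or cystic pixels visible in the analysis.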
Conservative ROI definitions would only include apparently viable tumor based on robust Gd contrast enhancement on T1-weighted images. A more generous tumor extent would include contrast-enhanced and hyperintense tissues on T2-weighted images. However, inclusion of necrotic and cystic zones can include extremes in water ADC values, which may adversely bias image analysis. It is important that standardized software be developed in which criteria of undesirable tissue be clearly defined and that individual subjective decision making by observers is kept to a minimum. Different scenarios may be adopted to exclude these nonviable tissue regions. It should be kept in mind that a particular treatment might induce more nonviable voxels, which, if eliminated from analysis, would falsely reduce the apparent impact on ADC.
The entire three-dimensional volume of interest (VOI), a composite of ROIs over multiple slices, of the lesion should be delineated particularly if the tumor is being followed over time.
Volume of interest analyses methods fall into three general areas: whole-tumor summary statistics, histogram, and voxel-wise analyses.
Choice of monoexponential versus multiexponential modeling of signal decay with b value depends on features apparent in the data, SNR, number, and range of acquired b values.
Data typically obtained in most clinical applications for b-value ranges of 100 to 1000 sec/mm2 are reasonably well modeled using monoexponential decay fits.
Tumor ROI/VOI definitions may be done on traditional high-contrast images such as T2-weighted or T1-weighted contrast-enhanced images. High-image contrast, high-b value DW images can also be used.
Descriptions of diffusion properties within lesions or tissues of interest may be reported at several levels classified as follows: 1) traditional summary statistic over the entire ROI/VOI; 2) histogram analysis, which allows segmentation of the tissue based on diffusion properties; and 3) voxel-by-voxel analyses where spatial information is retained over interval examinations such that fractional volume of tissue exhibiting change in diffusion properties is measurable. However, the latter requires methods of tracking individual voxels over time.
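The three levels of reporting listed above can be sketched for a small set of voxel ADC values. The function names, bin edges, change threshold, and data are hypothetical, and the voxel-wise comparison assumes that pretreatment and posttreatment voxels have already been spatially registered.

```python
import math
import statistics

def summarize_adc(voxel_adcs):
    """Level 1: summary statistics over the ROI/VOI, ignoring NaN voxels."""
    valid = [v for v in voxel_adcs if not math.isnan(v)]
    if len(valid) < 2:
        raise ValueError("need at least two valid voxels")
    return {"n": len(valid), "mean": statistics.mean(valid),
            "median": statistics.median(valid), "sd": statistics.stdev(valid)}

def adc_histogram(voxel_adcs, bin_edges):
    """Level 2: voxel counts per ADC bin (lower edge inclusive)."""
    counts = [0] * (len(bin_edges) - 1)
    for v in voxel_adcs:
        if math.isnan(v):
            continue
        for i in range(len(counts)):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts

def fraction_changed(pre, post, threshold):
    """Level 3: fraction of registered voxels whose ADC changed by more than threshold."""
    pairs = [(a, b) for a, b in zip(pre, post)
             if not (math.isnan(a) or math.isnan(b))]
    if not pairs:
        raise ValueError("no valid voxel pairs")
    return sum(1 for a, b in pairs if abs(b - a) > threshold) / len(pairs)

voi = [0.8e-3, 1.0e-3, 1.2e-3, 2.4e-3, float("nan")]
summary = summarize_adc(voi)
counts = adc_histogram(voi, [0.0, 1.0e-3, 2.0e-3, 3.0e-3])
changed = fraction_changed([1.0e-3] * 4, [1.0e-3, 1.5e-3, 1.6e-3, 1.1e-3], 0.3e-3)
print(summary, counts, changed)
```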
The determination of outcome measures or end points must be dictated by the nature of the question being addressed (clinical, biologic, physical, or pharmaceutical). For instance, if the purpose of a trial is to determine whether DW-MRI can characterize the biologic aggressiveness of a tumor, then ADC values need to be correlated with recognized measurements of aggressiveness. This could include tumor grade, time to progression, progression-free survival, or overall survival. However, if the goal is to determine whether DW-MRI is an early marker of treatment success, then intermediate end points such as pathologic response could be used. Potentially, DW-MRI results could be compared with other biomarker changes such as serum markers of cancer (e.g., carcinoembryonic antigen, prostate-specific antigen, etc.), RECIST, and WHO measurements of tumor size. However, firmer and more robust end points reflecting therapy efficacy in patient outcomes are preferred where possible, such as time to progression, progression-free survival, and overall survival.
If DW-MRI is being evaluated as an early biomarker of therapy response then the timing of follow-up studies should be such that DW-MRI is acquired before changes in size are expected to occur. Intermediate time points may become influenced by necrosis and liquefaction, and DW-MRI may become less useful. Long-term data points may “normalize” because liquefactive necrosis resolves and the residual mass contains fibrotic dehydrated tissues.
For response assessment studies, it is important to have predetermined whether DW-MRI changes are expected to occur, the magnitude/direction of the likely change, and the timeline as to when and for how long changes are expected to last. Animal validation studies before human studies may provide information on the appropriateness of using DW-MRI and on the optimal timing for doing imaging in human studies.
Diffusion experiments generate large numbers of magnitude b-value images. When these are combined with morphologic images, many hundreds of images are produced, which need to be reduced for diagnostic interpretations.
The most valuable images required for interpretation are high-b value images and ADC maps, which should always be evaluated with morphologic imaging. Because high-b value DW-MR images have high background suppression, tumor localization is usually straightforward. However, very high signal on high b value may also be due to T2-shine through effects; conversely, liquefaction or necrosis can result in an underestimation of lesion extent, so comparisons with anatomic images are important.
Although no color scales are especially suited for the display of high-b value magnitude images, convention has it that “inverted grayscale” be used (ADC maps, however, are better displayed using conventional grayscale). Indeed, whole-body DW imaging with background suppression can produce images that superficially resemble FDG-PET scans (Figure 5). This is because of the high contrast on high-b value images, which, when used with three-dimensional displays, are amenable to multiplanar reconstructions and three-dimensional renderings (maximum-intensity projections, surface shaded display, volume renderings).
A common method of analyzing high-b value images is to use fusion imaging techniques. Modern three-dimensional fusion imaging visualization software works in three steps. (1) Superimposition: data sets do not need to be acquired in the same plane and to have identical FOVs and matrix sizes, but most ADC data sets are aligned and obtained with similar parameters. (2) Alignment: algorithms work with multiple degrees of freedom (translation and rotation) based on anatomic landmarks with the ability to work automatically with manual overrides if necessary. (3) Visualization: blending of grayscale with pseudo color images with adjustable balance between the two superimposed data sets. When blending is used for data display, the level of blending should be kept constant across a study and reported in manuscripts.
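The visualization step can be illustrated with simple linear blending of a grayscale anatomic pixel and a pseudo color overlay pixel. This is a sketch of one common blending rule, not the method of any particular fusion package; the function name `blend_pixel` and the weight `alpha` are assumptions, with `alpha` representing the blending level that should be held constant and reported.

```python
def blend_pixel(gray, overlay_rgb, alpha):
    """Linear blend of a grayscale value (0..1) with an RGB overlay (0..1 each).

    alpha = 0 shows only the anatomic image; alpha = 1 shows only the overlay.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return tuple((1.0 - alpha) * gray + alpha * c for c in overlay_rgb)

STUDY_ALPHA = 0.4  # assumed value; fixed for every examination in the study

anatomic_gray = 0.5            # mid-gray pixel on the morphologic image
overlay_red = (1.0, 0.0, 0.0)  # pseudo color assigned to the DW signal

fused = blend_pixel(anatomic_gray, overlay_red, STUDY_ALPHA)
print("fused RGB:", fused)
```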
Other potential artifacts appearing on fused images include misregistration of anatomic and DW images due to bladder filling and internal organ movements. Susceptibility artifacts caused by luminal air are exaggerated on high-b value images, although their effects are minimized on ADC maps.
A major challenge to the widespread implementation of DW-MRI is the lack of a standard approach to data collection and analysis. This creates challenges for support of DW-MRI by commercial MRI vendors and makes deployment of DW-MRI techniques limited to sites with significant experimental MRI expertise. Furthermore, the lack of standard approaches impairs validation and makes the ultimate qualification of DW-MRI as a biomarker extremely difficult.
In large part, the lack of standardization is related to the technical challenges in performing DW-MRI acquisitions. In most practical applications of DW-MRI, performing “ideal” data acquisitions is impractical owing to limits in technology and patient compliance.
Approaches that accommodate technical limitations through compromises in acquisition and/or in analysis have been developed to allow the practical implementation of this technique. Examples include reducing the number of b values for modeling of data, reducing spatial resolution, limiting volume of imaging, averaging free breathing studies instead of gating, using empiric analyses (e.g., visual assessments of signal intensity on high-b value images), creative acquisition time-reducing techniques, and so on.
Standardized data sets should be acquired systematically using “ideal techniques” with great intrinsic redundancy to test the effect of various technical compromises on measuring the signal associated with response. Such data should be made widely available for investigators to test their analytic software. These ideal data sets should be limited to single organs/single treatments starting with the least challenging. Ideally these should be documented, anonymous, and be available on the Web.
Similarly, it would be desirable for research groups to make their analysis methods available either by publication of open code or under specific bilateral agreements. In the longer term, specific standardized software for analysis would be advantageous, but this should not restrict the continual evolution of measurement and analysis approaches.
Standard methods of diffusion assessment should be established and validated against phantoms appropriate to specific body locations, with their measurement reproducibility being established.
Basic standards for measurements/analysis and reporting of tissue diffusion coefficient should be established and adhered to. They should be tested against relevant phantoms, and reproducibility should be established.
New techniques need to demonstrate specific advantages over existing methods, providing comparison data that defines the benefit.
Studies should include routine measurement and QA analysis.
Standardized data sets need to be made available to allow testing and comparison of analysis approaches.
Research groups should make analysis methods available, either as open source code or by specific agreements where there are confidential or commercial issues.
Standardization of software for analysis would be desirable.
To support the use of DW-MRI parameters in decision making about pharmaceuticals, it is important to link DW-MRI to underlying pathophysiological processes both before and after interventions.
Initially, this should be performed in well-defined model systems and then, where possible, confirmed by clinical measurements using biopsy specimens or surrogate tissues. The link between DW-MRI biomarker change and therapy response should also be established in xenografts and then clinically using both clinical outcome measures as well as pathologic surrogates of outcomes. Ideally, these biologic end points should relate specifically to the mechanism of action of the compound.
Suggested histologic validation of DW-MRI includes exploring links with measurements of proliferation index (Ki 67), cellularity index (cells/high-power field), tumor grade, and apoptosis. It will also be useful to explore/correlate DW-MRI with other MR measures of perfusion (dynamic contrast-enhanced (DCE) MRI, dynamic susceptibility contrast MRI, blood oxygenation level-dependent MRI, arterial spin labeling) or metabolism (magnetic resonance spectroscopy), and with other imaging tests (e.g., FDG-PET, thymidine-PET, or annexin imaging for apoptosis).
Initially, clinical studies should validate practical approaches developed using the standardization guidelines described above, in more generalized applications such as chemotherapy response at varieties of anatomic sites. Neoadjuvant clinical trials are particularly suitable for these purposes because pathologic materials obtained can serve as rapid intermediate readouts/end points. If these are successful then novel therapeutics in early phase 1/2 studies can be evaluated.
Requires correlation between size and type of biologic effect and relevant DW-MRI parameter, in animal models, supported by clinical biopsy or histology data.
Time course of effects will define the timing of imaging in clinical trials.
Attempt to derive hypothesis-driven relationships between imaging and specific biologic end points.
Biologic end points should relate to the purported mechanism of activity of the compound.
It would be desirable to be able to predict the magnitude of the MR effect based on animal models, allowing trial design to monitor dose-related change.
To allow appropriate study design and to assess the significance of change, centers should demonstrate the reproducibility of their clinical measurements, in a manner that is traceable, providing information on individual and intergroup reproducibility. This information should be combined with evidence of the expected magnitude of therapeutic effect, such that studies can enable assessments of dose-related changes.
Reproducibility assessments are facilitated by incorporating baseline repeated measurements to provide information directly relevant to the body sites chosen. It is important to identify major sources of error leading to nonreproducible results. To determine whether changes in tumors induced by treatments are significant, three factors should be known. These are the natural biologic variability of parameters such as ADC, the variability inherent in the measuring instruments, and knowledge of additional errors induced by appraisers or analysis techniques. This implies that diffusion parameter measurement changes cannot be taken at face value without due consideration of measurement errors. Estimates of measurement errors enable us to decide whether changes in ADCs are “real” for both group and individual observations.
Few published studies have documented measurement error in body DW-MRI, and the major contributors toward errors are not documented. However, from previous studies of other functional imaging techniques (e.g., DCE MRI), it is likely that DW-MRI measurement error will be dependent on a number of factors. These include imaging instrumentation and setup procedures, data acquisition techniques, and the time interval between repeated measurements. Data analysis techniques are also likely to add to measurement error, including the modeling techniques used (e.g., the range and noise of the b-value images used and implicit assumptions such as monoexponential vs biexponential or multiexponential fitting). Patient-related factors include tumor type, anatomic region being evaluated, and underlying physiologic status of patients.
It is important that clinical trials evaluating DW-MRI responses to treatment assess measurement variability as an intrinsic part of clinical trial design. The measurement error estimate component should be of sufficient statistical power (i.e., on enough patients) and needs to be performed on the study patients or in other patients who are representative of those being examined in the main study. To compare measurement errors of DW-MRI parameters at diverse anatomic sites and pathologies, it is important that similar statistical methods be used and that the meaning and limitations of statistical measures are understood.
Before statistical tests are applied, assumptions intrinsic to reproducibility analysis must be verified (e.g., normality of data and the nature of any relationship between measurement error and the magnitude of the parameters). Appropriate statistical parameters, including the within-subject SD, the coefficient of variation, and the intraclass correlation coefficient, should be quoted in communications (as detailed in Appendix 3). The repeatability statistic is a useful parameter for DCE-MRI studies because it informs on whether changes in a particular patient are significant.
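Some of these statistics can be computed from double-baseline measurements as sketched below. The formulas are the standard ones for paired repeat measurements (within-subject SD from the paired differences, and a repeatability coefficient of 1.96·√2 times the within-subject SD); they assume that measurement error is independent of the magnitude of the ADC, which should be verified as noted above. The data are hypothetical, and the intraclass correlation coefficient is not shown.

```python
import math

def repeatability_stats(pairs):
    """Within-subject SD, coefficient of variation, and repeatability coefficient.

    pairs: list of (baseline_1, baseline_2) measurements, one pair per patient.
    """
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one pair of measurements")
    wsd = math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * n))
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    return {
        "wSD": wsd,
        "wCV": wsd / grand_mean,
        "repeatability": 1.96 * math.sqrt(2.0) * wsd,
    }

def exceeds_repeatability(pre, post, repeatability):
    """True if an individual change is larger than expected from measurement error."""
    return abs(post - pre) > repeatability

# Hypothetical double-baseline ADC values (x 10^-3 mm^2/sec) in four patients
baseline_pairs = [(1.0, 1.1), (1.2, 1.1), (0.9, 1.0), (1.3, 1.2)]
stats = repeatability_stats(baseline_pairs)

print(stats)
print("change 1.0 -> 1.3 significant:", exceeds_repeatability(1.0, 1.3, stats["repeatability"]))
print("change 1.0 -> 1.1 significant:", exceeds_repeatability(1.0, 1.1, stats["repeatability"]))
```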
Centers should define reproducibility of data that is traceable, for individuals and intergroup comparisons, allowing the power of studies to be defined prospectively for a defined end point. Where possible, and in the absence of existing reproducibility data specific to the method, two baseline measurements should be incorporated to allow assessment of individual patient reproducibility.
Multiple lesions per organ should be taken into account.
A standardized minimum statistical approach for reproducibility analysis should be reported.
In multicenter trials using identical (preferred) or similar methods (such as maintaining a constant field strength, imaging a single organ, etc.), comparison of precision and accuracy should be determined on phantoms to provide a basis for pooling of data, with account taken of corrections for machine-specific factors, and for sensitivity to motion effects not seen in phantoms.
Site qualification should be undertaken by the performance of measurements validated at a central analysis site before recruitment using standardized data from each site. Readers should refer to Appendix 4 on QA procedures and diffusion phantoms for further details.
Analyses of DW-MRI data in multicenter trials should be performed at a single center using standardized, validated software. The reliability of analyses should be assured using data from each participating center before starting the trial.
In each study, patient and lesion selection, as well as the number of studies per patient including reproducibility assessments, should be defined prospectively. Reproducibility studies should be done at each imaging site because they not only provide estimates of measurement error in multicenter settings but also serve as a quantitative QA measurement of site performance. Standardized QA procedures should be enforced at all participating institutions to keep the data as uniform as possible.
Every effort should be made to ensure that the study can proceed on a given MR unit even if the unit is upgraded to a higher software level. It is important for the viability of DW-MRI as a biomarker that implemented DW-MRI methods be impervious to upgrades and software changes; otherwise, its future as a biomarker is in question.
Robust data acquisition protocols that are able to deal effectively with physiological motions should be instituted and adhered to.
Central data collections should incorporate appropriate QA and quality control procedures. Fast feedback to imaging sites is recommended to minimize data loss due to incomplete or incorrect imaging. The number and causes of failed examinations/analyses should be prospectively recorded. Ideally, failed examination/analysis rates should be <5% to 10%.
Data analysis should use software that is fit for the purpose, is validated, and is preferably Food and Drug Administration 21 CFR Part 11-compliant. 21 CFR Part 11 sets forth the requirements that need to be met for the Food and Drug Administration to consider electronic records and electronic signatures as trustworthy and reliable as paper records and handwritten signatures. Validation of software algorithms via multicenter trials is an essential need for obtaining regulatory approval to use DW-MRI as an accepted surrogate biomarker.
To promote the comparisons of ADC values obtained from different centers and for differing therapies and to overcome the dependence of ADC on the range of b values chosen for any particular study, perfusion-insensitive ADC values (by excluding the b = 0 sec/mm2 image from the ADC calculation) should always be quoted.
Additionally, study data should be publicly available to enable alternative analytic approaches that might be superior to the ones used in the study.
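The recommendation to quote perfusion-insensitive ADC values can be illustrated with a minimal sketch. It assumes a monoexponential model, S(b) = S0 exp(-b ADC), fitted by log-linear least squares; the function name `fit_adc`, the b values, and the synthetic signal with a 10% fast-decaying component are illustrative assumptions, not part of the consensus recommendations.

```python
import math

def fit_adc(b_values, signals, exclude_b0=True):
    """Log-linear least-squares ADC estimate (mm^2/sec) for one voxel or ROI mean.

    b_values are in sec/mm^2. With exclude_b0=True the b = 0 measurement is
    dropped, which reduces the perfusion contribution to the estimate.
    """
    pairs = [(b, s) for b, s in zip(b_values, signals)
             if s > 0 and not (exclude_b0 and b == 0)]
    if len(pairs) < 2:
        raise ValueError("need at least two usable b values")
    n = len(pairs)
    mean_b = sum(b for b, _ in pairs) / n
    mean_ln = sum(math.log(s) for _, s in pairs) / n
    sxx = sum((b - mean_b) ** 2 for b, _ in pairs)
    sxy = sum((b - mean_b) * (math.log(s) - mean_ln) for b, s in pairs)
    if sxx == 0:
        raise ValueError("b values must differ")
    return -sxy / sxx  # slope of ln(S) versus b is -ADC

# Synthetic signal (assumed values): 90% tissue diffusion at 1.0e-3 mm^2/sec
# plus a 10% fast "perfusion-like" component at 20e-3 mm^2/sec.
b_values = [0, 100, 500, 800]
signals = [0.1 * math.exp(-b * 0.02) + 0.9 * math.exp(-b * 0.001) for b in b_values]

adc_all = fit_adc(b_values, signals, exclude_b0=False)
adc_no_b0 = fit_adc(b_values, signals)
print(f"ADC with b=0: {adc_all:.6f}  ADC without b=0: {adc_no_b0:.6f}")
```

In this synthetic case the estimate that includes b = 0 is biased upward by the fast component, whereas the estimate that excludes it lies closer to the assumed tissue diffusivity.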
Animal validation should be undertaken before human studies to provide information on the appropriateness of using DW-MRI and may be able to indicate the optimal timing for doing imaging in human studies.
Double-baseline studies should be done to provide data about measurement error of imaging specific to the study and thus knowledge of what constitutes a significant change in an individual and in a group of patients (powering studies).
Quantified parameters such as ADC should be measured to derive physiological meaning that can be related to drug mechanisms of action. Quantified parameters have the advantage of allowing interpatient and intrapatient comparisons to be made. Good quality control and QA are keys to success for multicenter studies.
Software applications that provide the means to standardize analysis and display of DW-MRI data will facilitate the advancement of this approach for diagnostic and therapeutic assessments in cancer imaging.
Software applications should allow efficient and reproducible analyses of DW-MRI data using tools that allow zoom and pan functionality for two-dimensional and three-dimensional displays. Efficient ROI/VOI delineations using anatomic and b-value images should be available.
It is preferred that software incorporates intraexamination and interexamination image registration. Image intensity-based spatial registration using mutual information is vital because it will allow for voxel-wise changes to be followed over time in individuals.
The software should have a flexible workflow and should be able to generate a variety of quantitative calculations for comparing tumor diffusion values at multiple time points and to visualize therapy response in individual, configurable layouts (Figure 6).
Overall, the standardized software will provide users with the ability to generate automatically a variety of quantitative calculations based on volumetric analysis (histograms) or voxel-based analysis over multiple time points and to visualize therapy response (an example of voxel-based display is functional diffusion maps — fDM — see Appendix 1).
Open source code should be used whenever possible to permit broad dissemination of the method, thus encouraging standardization. The application should allow import and export of DICOM data sets and output of measurements via schema-based XML to data repositories.
Diffusion-weighted MRI data require unique analysis approaches for quantification of results.
Standardization of software applications for analysis of DW-MRI data should provide for volumetric/histogram and for voxel-wise quantification approaches.
Standardized software will provide for robust, standardized approaches for analysis of single and multisite clinical trials data that can be used for regulatory submission.
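As a rough illustration of the volumetric/histogram approach, the sketch below summarizes the ADC values of one ROI or VOI. The function names, the choice of percentiles, the bin width, and the convention that ADC is expressed in units of 10^-3 mm2/sec are all assumptions made for the example; excluded pixels are assumed to be stored as not-a-number.

```python
import math
from collections import Counter

def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("no values")
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def histogram_summary(adc_values, bin_width=0.25):
    """Summarize ROI/VOI ADC values (assumed units: 10^-3 mm^2/sec)."""
    vals = sorted(v for v in adc_values if not math.isnan(v))  # drop flagged pixels
    if not vals:
        raise ValueError("ROI contains no valid pixels")
    counts = Counter(math.floor(v / bin_width) for v in vals)
    # Histogram keyed by the lower edge of each bin.
    histogram = {round(i * bin_width, 6): counts[i] for i in sorted(counts)}
    return {
        "n": len(vals),
        "mean": sum(vals) / len(vals),
        "median": percentile(vals, 50),
        "p10": percentile(vals, 10),
        "p90": percentile(vals, 90),
        "histogram": histogram,
    }

roi = [0.8, 0.9, 1.0, 1.1, 1.2, float("nan")]
summary = histogram_summary(roi)
print(summary)
```

Reporting percentiles alongside the mean keeps some of the heterogeneity information that a single ROI average would hide.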
Development and testing of new acquisition protocols sensitized to multiple directions with the objective of improving image quality and reducing distortions and artifacts.
Improved robustness in whole-body DW-MRI techniques with increased coverage and improved fat suppression.
Improved robustness of acquisition techniques able to deal with physiological motions, as dictated by specific anatomic locations, is needed to minimize smearing of image data, to reduce partial volume averaging effects, and to preserve heterogeneity of ADC maps. The relative benefits of motion-compensated and motion-independent measurement techniques need systematic evaluation.
There is debate on the best phantom materials; options are described in Appendix 4. Whatever material is ultimately used, there is a pressing need for equipment manufacturers to specify QA procedures clearly, as they already do for SNR measurements.
Improved phantoms mimicking the cellular environment of living tissue are required. Simple phantoms filled with liquids of different diffusion coefficients allow measurements of diffusion coefficients and ADC to be checked. Phantoms filled with beads of well-controlled size may help us in understanding the properties of diffusion of extracellular water (provided that susceptibility effects are negligible) but do not tell us anything about other water diffusion components (vascular contributions and intracellular). However, non-Gaussian distributions of the displacement would be seen, so these could be used initially.
It is recognized that improvements in SNR afforded by 3-T systems could be a boost for DW-MRI. It is accepted that diffusivity is likely to be independent of field strength; however, the effect of high field strengths on data acquisitions and analyses still needs to be systematically evaluated.
Improvements in software for analyzing DW-MRI data are needed. Such software should be adapted for “clinical” and “research” uses. Ideally, the ability to generate synthetic b-value images back-calculated from ADC/b values (less noise) would be valuable and could be used to extrapolate very high b value images. The method of calculating ADC should be clear to the reader (via fitting displays, noise of individual b-value images, and calculation error). There should be flexible scaling of ADC images on displays and the ability to generate ADC values for any given range of b values to promote the calculation and reporting of perfusion-insensitive ADC values in clinical studies. Low SNR pixel values should be eliminated before ADC map calculation, and these pixels should be flagged as “not-a-number” for exclusion from subsequent analysis. The ability to segment (threshold) high-b value/ADC images and then to obtain ADC values from “threshold ROIs” using histograms is needed.
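Two of the features described above, flagging low-SNR pixels as not-a-number before ADC calculation and back-calculating a synthetic high b-value image, can be sketched as follows. This is a minimal illustration assuming a two-point monoexponential model; the function names, the SNR floor of 3, and the pixel values are hypothetical choices, not values specified by the workshop.

```python
import math

def adc_map(s_low, s_high, b_low, b_high, noise_sd, snr_min=3.0):
    """Two-point ADC per pixel; pixels below the assumed SNR floor become NaN."""
    out = []
    for lo, hi in zip(s_low, s_high):
        if lo <= 0 or hi <= 0 or min(lo, hi) < snr_min * noise_sd:
            out.append(math.nan)  # flagged for exclusion from later analysis
        else:
            out.append(math.log(lo / hi) / (b_high - b_low))
    return out

def synthetic_image(s_ref, b_ref, adc, b_target):
    """Back-calculate a b_target image from a reference image and the ADC map.

    NaN ADC values propagate, so excluded pixels stay excluded.
    """
    return [s * math.exp(-(b_target - b_ref) * a) for s, a in zip(s_ref, adc)]

# Two pixels measured at b = 100 and b = 800 sec/mm^2 (illustrative numbers).
s100 = [900.0, 20.0]
s800 = [450.0, 5.0]
adc = adc_map(s100, s800, b_low=100, b_high=800, noise_sd=10.0)
s1500 = synthetic_image(s100, 100, adc, b_target=1500)
print(adc, s1500)
```

Because the synthetic image is computed from the fitted model rather than acquired directly, it carries the assumptions of that model; departures from monoexponential decay at very high b values are not represented.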
Fusion imaging techniques are invaluable for DW-MRI, and improved registration in three dimensions using mutual information and warping techniques is needed.
Fusion methods need to be integrated into analysis and reporting/communication tools. Fusion techniques need to be improved and extended with the ability to display, coregister, segment, fuse, and analyze multiple functional imaging methods (DCE, blood oxygenation level-dependent, magnetic resonance spectroscopic imaging, etc.) in an integrated single work space. Such tools should allow for missing functional data sets (not obtained, poor SNR, corruption, and artifacts).
It is becoming clear that DTI techniques provide information that is valuable for investigating some organs such as the kidneys, muscle, and the prostate gland. These efforts should be encouraged, and to this end, robust data acquisition sequences and analysis software should become more available.
Diffusion-weighted MRI is an attractive noninvasive, quantitative technique that yields parameters that relate to tissue structure, cellularity, and necrosis. The plethora of existing methods in the current literature makes it impossible to conclude definitively that DW-MRI is qualified as a pharmacodynamic marker of therapy response. Therefore, the major outcome of this consensus document is to stimulate funding agencies, vendors, and clinical trials to collaborate on the design of multicenter studies that will test the value of DW-MRI in more controlled circumstances than have hitherto been possible. This will require focused groups of researchers to agree on precise clinical scenarios in a variety of cancer types and on the timing of imaging so as to address key biomarker development questions. Agreements will be needed on data acquisition and analysis methods that yield water diffusion biomarkers, on calibration tools, and statistical methods for analyzing trial data. Although this task is complex, the potential benefits are enormous given the ease with which DW-MRI data can be incorporated into clinical trials. We hope that this White Paper will catalyze action in this field so that these challenges can be addressed in the near future.
The International Organization for Standardization (ISO 5725) defines repeatability and reproducibility as follows:
Thus, repeatability refers to the ability of a measurement system to provide consistent readings on a given object. A major requirement is that external sources of error are controlled; in this case, only one observer performs the measurement and the spontaneous variability of the biologic parameter is assumed to change little during the brief period. Repeatability thus informs on equipment variation. In MRI, the repeated measurement of tissue relaxation properties without moving the patient over a short period is thus a repeatability study. It should be noted that repeatability is both a concept and a specific statistical measure (see below).
Reproducibility, however, is the ability for multiple operators/experiments to achieve consistent results on identical objects. Thus, reproducibility measures the typical error between observers when each observes the same quantity. Reproducibility informs on appraiser/experimental variation. A DW-MRI measurement repeated on patients a few days apart is considered to be a reproducibility study. Even if the equipment and analysis technique used for the two measurements are identical, the experimental conditions are different because of the timing element.
Appropriate statistical parameters include the within-subject SD, the coefficient of variation, and the intraclass correlation coefficient. The repeatability statistic is a useful parameter for DW-MRI studies because it informs on whether changes in a particular patient are significant. Before statistical tests are applied, assumptions intrinsic to reproducibility analysis must be verified (normality of data and the nature of any relationship between measurement error and the magnitude of the parameters). Test-retest studies in a series of patients would need to be performed within a short period, on the assumption that no significant changes are expected to occur over such a short period; the same day would do, provided that patients are taken off the imaging couch for a period and a machine reboot takes place.
Measures of the spontaneous variability of parameter estimates
Measures of measurement error
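The within-subject statistics named above can be computed from a double-baseline data set as in the following minimal sketch. It assumes one ADC value per patient per baseline, uses the common root-mean-square form of the within-subject coefficient of variation, and takes the repeatability coefficient as 1.96 × √2 × within-subject SD; the function name and the example values (in units of 10^-3 mm2/sec) are illustrative assumptions.

```python
import math

def repeatability_stats(baseline1, baseline2):
    """Within-subject SD, within-subject CV, and repeatability coefficient
    from paired baseline measurements (one pair per patient)."""
    if len(baseline1) != len(baseline2) or not baseline1:
        raise ValueError("need equal-length, non-empty baselines")
    n = len(baseline1)
    diffs = [a - b for a, b in zip(baseline1, baseline2)]
    means = [(a + b) / 2 for a, b in zip(baseline1, baseline2)]
    # For two measurements, each subject's within-subject variance is d^2 / 2.
    wsd = math.sqrt(sum(d * d for d in diffs) / (2 * n))
    wcv = math.sqrt(sum((d / m) ** 2 / 2 for d, m in zip(diffs, means)) / n)
    return {
        "wSD": wsd,
        "wCV": wcv,
        # A change in one patient larger than this is unlikely (about 5%)
        # to arise from measurement error alone.
        "repeatability": 1.96 * math.sqrt(2) * wsd,
    }

scan1 = [1.00, 1.20, 0.90, 1.10]
scan2 = [1.10, 1.10, 1.00, 1.00]
stats = repeatability_stats(scan1, scan2)
print(stats)
```

These formulas presume that measurement error is roughly normal and unrelated to the magnitude of the parameter, which is why those assumptions should be checked first.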
Statistical calculations also need to take into account that there may be more than one lesion measured per patient. Parameter values cannot be treated as independent samples because tumors within a particular patient share, to a degree, a biologic environment that is not common to tumors in other patients. That is, some correlation between tumors in a patient is expected, and this clustering within patients needs to be taken into account in the methods of statistical analysis. Clustering is likely to affect not only measurement error calculations but also the behavior of tumors in response to treatment: the responses of tumors within a particular patient are likely to be more highly correlated with each other than with those of tumors in different patients.
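One way to gauge how strongly lesions cluster within patients is a one-way random-effects intraclass correlation, sketched below together with the resulting effective sample size (total lesions divided by the design effect 1 + (k0 - 1) × ICC). This is only one of several possible approaches; the function name, the patient labels, and the lesion values are hypothetical.

```python
def lesion_clustering(lesions_by_patient):
    """Return (ICC, effective sample size) for lesion values grouped by patient.

    Uses one-way ANOVA mean squares with the usual adjustment (k0) for
    unequal numbers of lesions per patient.
    """
    groups = [g for g in lesions_by_patient.values() if g]
    n_groups = len(groups)
    n_total = sum(len(g) for g in groups)
    if n_groups < 2 or n_total <= n_groups:
        raise ValueError("need >= 2 patients and at least one patient with >= 2 lesions")
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    k0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (n_groups - 1)
    icc = (ms_between - ms_within) / (ms_between + (k0 - 1) * ms_within)
    effective_n = n_total / (1 + (k0 - 1) * icc)
    return icc, effective_n

# Hypothetical ADC values (10^-3 mm^2/sec), two lesions in each of three patients.
lesions = {"patient_a": [1.0, 1.1], "patient_b": [1.5, 1.6], "patient_c": [0.7, 0.8]}
icc, effective_n = lesion_clustering(lesions)
print(f"ICC = {icc:.3f}, effective sample size = {effective_n:.2f}")
```

In this example the six lesions behave almost like three independent observations, which shows why treating every lesion as an independent sample would overstate the power of a study.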
Quality control procedures for DW-MRI can characterize the performance of measurements obtained on clinical MRI systems and can assist in clinical studies performed in centers where there may be limited specialist experience in diffusion imaging. Key features of QA protocols are as follows: 1) preparation of test objects (phantoms), 2) the systematic application of DW-MRI procedures, and 3) the parameters chosen to analyze the results of QA protocols must be helpful in identifying errors/deviations in the performance of MRI systems. There is debate on the best phantom material (options are outlined below), but whatever material is used, there is a pressing need for machine vendors to define clearly QA procedures for DW-MRI.
Ideally, test objects should be made of material with tissue-equivalent diffusion and MR signal properties (e.g., T2-relaxation rate). Materials should be inexpensive, easy to prepare in a reproducible way, safe to transport, stable over time, and ideally nontoxic. However, it is recognized that multiple, complex phantoms to validate ADC measurements in multicenter clinical studies are difficult to make because the cellular environment of living tissues cannot be easily mimicked. Simple phantoms filled with gels/liquids with different diffusion coefficients may be acceptable initially.
A number of liquids have been suggested as substances for diffusion phantoms including the following:
Whatever the final choice of diffusion phantom material used, the following measurement conditions should apply.
Methods of quantifying artifacts in a systematic way are helpful in developing clinical protocols and maintaining image quality. Nondiffusing phantoms are well suited for this purpose. Apparent diffusion coefficient maps generated from measurements on nondiffusing test objects provide a direct indication of the source and magnitude of artifacts. Nyquist ghosts and distortions can be quantified, and protocols can then be optimized to reduce the magnitude of artifacts as indicated in the next section.
All DW-MRI measurements are influenced by artifacts and machine imperfections. These include B0 inhomogeneity resulting from susceptibility variations within biologic or physical test objects/samples (this includes patients, volunteers, and nonhomogeneous structured phantoms). Chemical shift artifacts result from the presence of more than one chemical species or scalar coupling. Other artifacts are measurement-induced, for example, Nyquist ghosting and geometric distortions from residual MPG-induced eddy currents.
Before deciding on the measurement protocol for clinical studies, artifacts should be characterized, and their effects minimized using QA phantoms. There is extensive published literature on DTI informing on methods for improving DW-MR image quality [81,82]. Unfortunately, most successful DTI correction techniques are not easily applied in extracranial applications because of factors such as motion, patient-induced B0 variations (including time variation), and chemical shift artifacts from fat.
Phantom measurements can be used to improve the quality of DW-MR images. One of the most challenging areas to tackle is geometrical distortion (which occurs over large FOVs typically used in extracranial imaging) resulting from residual eddy currents. Large homogeneous phantoms (>20 cm) are suitable for the evaluation of in-plane geometric distortions and have the advantage that they lack internal structures, which may confound the analysis. Measurements performed on these phantoms are easy to evaluate using vendor measurement tools (distance and grid tools). Subtraction images (the DW image is rescaled by the ratio of mean values from the b0 and DW images and then subtracted from the b0 image) enable the degree of geometric distortion to be visualized easily.
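The subtraction image described above can be sketched in a few lines. The example assumes images stored as nested lists of pixel intensities from a homogeneous phantom; the function name and the one-dimensional test profiles are illustrative only.

```python
def subtraction_image(b0_image, dw_image):
    """Rescale the DW image by mean(b0)/mean(DW), then subtract it from the b0 image.

    For an undistorted homogeneous phantom the result is close to zero
    everywhere; paired positive and negative rims indicate a geometric shift.
    """
    n_pixels = sum(len(row) for row in b0_image)
    mean_b0 = sum(map(sum, b0_image)) / n_pixels
    mean_dw = sum(map(sum, dw_image)) / n_pixels
    if mean_dw == 0:
        raise ValueError("DW image has zero mean intensity")
    scale = mean_b0 / mean_dw
    return [[p0 - scale * pd for p0, pd in zip(row0, row_dw)]
            for row0, row_dw in zip(b0_image, dw_image)]

# One image row through a phantom: the DW profile is displaced by one pixel.
b0_row = [[0.0, 100.0, 100.0, 0.0]]
dw_shifted = [[0.0, 0.0, 50.0, 50.0]]
dw_aligned = [[0.0, 50.0, 50.0, 0.0]]

print(subtraction_image(b0_row, dw_shifted))
print(subtraction_image(b0_row, dw_aligned))
```

The rescaling step removes the overall signal attenuation caused by diffusion weighting, so what remains in the difference image reflects displacement rather than signal loss.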
Image distortion in clinical scanners can be improved in a number of ways:
|FOV* (cm)||23 x 23||28–40|
|Matrix size* (x, y)||112 x 256||128 x 128; interpolate if possible|
|TR (msec)||Shortest ≈ 4000||>2500|
|Parallel imaging factor||N/A||2|
|Section thickness (mm)/gap||5/0||5–7/0–1|
|Directions of MPGs||3 directions||3 directions|
|b factors (sec/mm2)||b = 0, 1000||Use three: b = 0, 50 or 100, and 500 or 1000|
|Pixel bandwidth (Hz)||1833||1446|
|Patient preparation||None||Empty stomach|
|Other comments||None||Free breathing or respiratory-triggered|
Imaging protocols are for 1.5-T systems.
SPIR indicates spectral presaturation with inversion recovery; WE, water excitation.
|Matrix size† (x, y)||160 x 256||128 x 128; interpolate if possible|
|TE (msec)||Minimum||Minimum (<80)|
|Section thickness (mm)/gap||6/1||5–7/0–1|
|Directions of MPGs||3||3|
|b factors (sec/mm2)||0, 100, 800||0, 50–100, 500–1000|
|Pixel bandwidth (Hz)||1000–1500||1000–1500|
|Patient preparation||Antiperistaltic - i.m. for longer action||Empty stomach|
|Other comments||Free breathing||Free breathing or respiratory-triggering|
Imaging protocols are for 1.5-T systems.
The authors thank each one of the many individuals who participated in the workshop (Appendix 6) and others who gave very helpful discussions before and after the workshop. A special acknowledgement is given to Drs. Jeff Evelhoch, Robert Gillies, and Paul Tofts for their many helpful comments and critiques. The authors also acknowledge the leadership and support of Drs. James Tatum and Larry Clarke and Ms. June Hoyt who coordinated the meeting's participant list and other logistics of the workshop.
The writing committee acknowledges workshop participants and thanks them for their ideas and thoughts on improving the quality of this White Paper:
Anwar Padhani, MB, BS, FRCP, FRCR, Mount Vernon Hospital; Guoying Liu, PhD, National Cancer Institute; Peter L. Choyke, MD, National Cancer Institute; Thomas L. Chenevert, PhD, University of Michigan Health System; Brian D. Ross, PhD, University of Michigan Health System; Andy Dzik-Jurasz, MD, PhD, Novartis Pharmaceuticals Corporation; Dow-Mu Koh, Royal Marsden Hospital; Marc Van Cauteren, PhD, Philips Healthcare Asia Pacific; Bachir Taouli, MD, NYU Medical Center; Taro Takahara, MD, PhD, University Medical Center (UMC) Utrecht; A. Gregory Sorensen, MD, Harvard Medical School; Harriet C. Thoeny, University of Bern; Jeffry R. Alger, PhD, University of California, Los Angeles; David L. Buckley, PhD, The University of Manchester; Jean Brittain, GE Healthcare; Cecil Charles, Duke University; Haesun Choi, MD, M.D. Anderson Cancer Center; Laurence (Larry) Clarke, PhD, National Cancer Institute; David A. Clunie, RadPharm, Inc.; Alexandre Coimbra, Merck, Inc.; Patricia E. Cole, Etai Pharmaceuticals; David Collins, Royal Marsden Hospital; Amita Dave, PhD, Memorial Sloan-Kettering Cancer Center; Benjamin M. Ellingson, PhD, Marquette University, Medical College of Wisconsin; Lloyd Estkowski, GE Healthcare; Jeffrey L. Evelhoch, PhD, Amgen; John Froehlich, Guerbet Switzerland; Susan Galbraith, BSc, MB, BChir, PhD, Bristol-Myers Squibb; Robert J. Gillies, PhD, University of Arizona; Ingrid S. Gribbestad, Institutt for sirkulasjon og bildediagnostikk NTNU Norway; Masoom Haider, University of Toronto; Wendy Hayes, DO, Bristol-Myers Squibb Co.; Gwenael Herigault, PhD, Philips Healthcare; Patrice Hervo, GE Healthcare, France; Luna Hilaire, PhD, GE Healthcare Medical Diagnostics R&D; Derek Hill, IXICO Ltd.; Doug Hussey, Cedara Software, Inc.; Nola Hylton, PhD, University of California, San Francisco; Marko Ivancevic, PhD, Philips Medical Systems; Paula M. Jacobs, PhD, SAIC Frederick, National Cancer Institute; Yasushi Kaji, MD, Dokkyo Medical University, School of Medicine; Ihab R. Kamel, MD, PhD, Johns Hopkins Hospital; Berthold Kiefer, Siemens AG; Adrian Knowles, GE Healthcare; Jenny Kuhelj, Varian Medical Systems; Normand Laperriere, MD, FRCPC, Princess Margaret Hospital; Martin Leach, Royal Marsden Hospital; Cynthia Ménard, MD, FRCPC, Princess Margaret Hospital; Chuck Meyer, PhD, University of Michigan; Chrit Moonen, University Victor Segalen Bordeaux; Josephine Naish, University of Manchester; Sarah Nelson, UCSF - Mission Bay; Andrew N. Priest, Cambridge University Hospitals; Marwan Sati, Cedara Software, Inc.; Vahan Sharoyan, Synarc, Inc.; Girish Srinivasan; Choon Hua Thng, National Cancer Centre, Singapore; Paul S. Tofts, Brighton and Sussex Medical School; Nina Tunariu, Hammersmith Hospitals; J. C. Waterton, AstraZeneca; Souhil Zaim, MD, Synarc, Inc., University of Michigan Health System.