Accumulating evidence suggests that characteristics of pre-treatment FDG-PET could be used as prognostic factors to predict outcomes in different cancer sites. Current risk analyses are limited to visual assessment or direct uptake value measurements. We are investigating intensity-volume histogram metrics and shape and texture features extracted from PET images to predict patients' response to treatment. These approaches were demonstrated using datasets from cervix and head and neck cancers, where AUCs of 0.76 and 1.0 were achieved, respectively. The preliminary results suggest that the proposed approaches could potentially provide better tools and discriminant power for utilizing functional imaging in clinical prognosis.
Recent years have witnessed increased use of positron emission tomography (PET) in radiotherapy treatment planning and monitoring. In particular, [18F] Fluoro-2-deoxy-d-glucose (FDG), a glucose metabolism analog, has been frequently used in clinical practice for tumor detection, staging, and radiotherapy target definition of different cancer sites [1-3]. Recently, there has been accumulating evidence that pre-treatment FDG uptake could be used as a prognostic factor for predicting radiotherapy treatment outcomes [4-10]. This was motivated by the fact that tumor uptake is dependent on the characteristics of its microenvironment. More recently, heterogeneity in FDG uptake in head and neck tumors has been reported in animal models. The heterogeneity was attributed to the distribution of different tissue components within the tumor region. Typically, quantitative analysis of FDG uptake is conducted based on observed changes in the standardized uptake value (SUV). However, SUV measurements alone are potentially affected by the initial FDG uptake kinetics and radiotracer distribution, which depend on the initial dose and the time elapsed between injection and image acquisition. In addition, some commonly reported SUV measurements (e.g., maximum SUV) might be sensitive to changes in tumor volume definition. These factors and others might make such an approach prone to significant intra- and inter-observer variability [6, 7]. Alternatively, there have been some efforts in the literature directed towards utilizing variations in the FDG distribution, characterized by its heterogeneous shape and texture, as potentially more robust prognostic metrics. Visual assessment was investigated to evaluate heterogeneity in FDG uptake for patients with locally advanced rectal carcinoma and cervix cancer. Another visual pattern analysis technique was applied by Hicks et al. for grading tumor response and normal tissue toxicity in patients with non-small cell lung cancer. A spatial heterogeneity metric based on deviation from an idealized ellipsoid structure (i.e., eccentricity) was found to have a strong association with survival in patients with sarcoma [14, 15].
In this work, we are exploring two feature-based approaches for summarizing and extracting reliable information from PET images. This information would be used to derive prognostic metrics in outcome analysis and could potentially be incorporated into the clinical planning process to modify patients' treatment based on their predicted failure risk. For instance, this could be done by intensifying the treatment dose for patients who are at high risk of failure and providing less toxic regimens than standard for patients who are at lower risk.
The first approach is a histogram-based approach referred to as the intensity-volume histogram (IVH), which is analogous to the dose-volume histogram (DVH) concept used for evaluating treatment plans in radiotherapy [16, 17] but is applied to functional imaging datasets instead of dose distributions. The IVH approach summarizes the 3D functional imaging intensity information into a single curve for each anatomic structure of interest, which is then used to derive intensity-volume metrics, as discussed later. The IVH may be represented in two forms: the cumulative (integral) IVH and the differential IVH. Only the cumulative IVH is considered here, which is a plot of the volume of a given structure as a function of image intensity (or normalized SUV) that is equal to or higher than a certain value.
The second approach extracts shape and texture features from PET images to characterize the anatomical structures of interest. This would provide the analyst with objective morphological descriptors of PET uptake in these regions. As a proof of principle, we will mainly focus on features that are well established in the pattern recognition community. Hence, in addition to geometrical shape features such as eccentricity and solidity for shape description [15, 18], we also explore second-order histogram statistics or co-occurrence matrix features for texture analysis. These features possess strong discriminant power and the ability to mimic human perception of texture variability. Therefore, they have been widely applied in many complex industrial and medical pattern recognition tasks [20-25]. In PET imaging, texture analysis has been previously applied to evaluate the performance of reconstruction algorithms. In addition, we demonstrate that complementary information among extracted features could be combined to capture differences in tumor metabolic activities between patients who respond to treatment and patients who do not. To our knowledge, this is the first report that utilizes a systematic pattern analysis approach for predicting cancer patients' treatment outcomes based on PET imaging.
As demonstrative examples, we analyzed two datasets of cervix cancer patients and head and neck cancer patients who received chemoradiotherapy as part of their treatment. All patients in both sets underwent pre-chemoradiotherapy diagnostic FDG-PET in our institute. The PET images were acquired using a hybrid PET/CT Siemens Biograph with an in-plane spatial resolution of 5.3×5.3 mm/pixel and a slice thickness of 3.4 mm. The images were reconstructed using an ordered-subsets expectation maximization algorithm with compensation for attenuation using CT-derived μ-maps. The cervix cancer dataset consisted of 14 patients. The data were analyzed for the endpoint of disease persistence (i.e., the tumor was not eradicated as a result of treatment) 3 months post-radiotherapy. Half of the cervix patients had persistent disease after chemoradiotherapy treatment as diagnosed on their follow-up PET scans. The head and neck dataset consisted of 9 patients. Each patient underwent a CT simulation scan and a diagnostic PET/CT scan, providing 3 images per case (27 images in total). The dataset was analyzed for the endpoint of overall survival. The patients had a median follow-up period of 30 months (range: 9-48 months). Four of the head and neck patients died during the follow-up period. Note that traditional disease staging evaluation was not predictive of response in these patients. Interested readers can find review literature on the current role of PET imaging for cervix cancer in Grigsby et al. and for head and neck cancer in Greven et al.
The pre-treatment scans were transferred using the Digital Imaging and Communications in Medicine (DICOM) protocol into the research treatment planning system CERR (Computational Environment for Radiotherapy Research), where the intensity values were converted into SUV. SUV is a decay-corrected measurement of activity per unit volume of tissue (MBq/mL) adjusted for administered activity per unit of body weight (MBq/kg). All patients had a maximum SUV greater than 2. An experienced oncologist outlined and recorded the clinical target volume (CTV), which includes a margin around the tumor to account for subclinical invasion of microdisease extensions. The gross tumor volume (GTV) delineation in the cervix cancer case was performed by thresholding the SUVs within the CTV at 40% of the maximum value, which is the current standard clinical practice in our institute. Figure 1 shows samples of contoured PET images for the cervix cancer case. In the case of head and neck, the delineation of the PET GTV was performed manually by the physician (see figure 2) because the 40% maximum SUV threshold has been reported to be an unreliable criterion for the head and neck case. Contouring of head and neck target volumes was done using a CT simulation scan fused with a PET/CT scan, with the PET window level set at 40% of the maximum value. Delineation of target volumes for head and neck cancer was based on institutional guidelines that have been developed for each head and neck subsite. The standard practice at our institution also includes review of all available information including physical examination, imaging evaluations (CT, PET/CT, MRI, etc.), endoscopic findings, operative reports, and pathologic reports. The current GTV definition guidelines in our institute include the elective nodes in addition to the primary tumor, as shown in figure 2.
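The SUV conversion and the 40%-of-maximum threshold described above can be illustrated with a minimal Python sketch. This is not the CERR implementation; the function names `suv` and `threshold_gtv` are hypothetical, the input activity is assumed to be already decay corrected, and tissue density is assumed to be about 1 g/mL so that the result is in the usual g/mL units.

```python
def suv(activity_mbq_per_ml, injected_mbq, body_weight_kg):
    """Body-weight SUV: tissue activity concentration divided by the
    administered activity per gram of body weight.

    Assumes the activity concentration is already decay corrected."""
    injected_per_gram = injected_mbq / (body_weight_kg * 1000.0)  # MBq/g
    return activity_mbq_per_ml / injected_per_gram


def threshold_gtv(ctv_suv_values, fraction=0.40):
    """Return the indices of CTV voxels whose SUV is at least
    `fraction` of the maximum SUV inside the CTV."""
    cutoff = fraction * max(ctv_suv_values)
    return [i for i, v in enumerate(ctv_suv_values) if v >= cutoff]


if __name__ == "__main__":
    # Illustrative numbers only, not patient data.
    print(suv(0.01, 350.0, 70.0))
    print(threshold_gtv([1.0, 4.5, 10.0, 3.0, 5.0]))
```

In this sketch the CTV is passed as a flat list of voxel SUVs; a real pipeline would keep the 3D voxel coordinates so that the thresholded region can be used as a mask.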
The typical role of the intensity-volume histogram (IVH) is to reduce complicated 3D data to a single, easier-to-interpret curve. In this case, the volume of a selected structure is plotted as a function of image intensity (or normalized SUV) that is equal to or higher than a certain value.
The IVH method enables several useful metrics to be extracted from functional images for outcome analysis, such as Ix (minimum intensity to x% highest intensity volume), Vx (percentage volume having at least x% intensity value), and other common descriptive statistics (mean, minimum, maximum, standard deviation, etc.). Note that in the case of PET, the image intensity values are typically normalized into SUVs prior to calculations.
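As an illustration of these definitions, the following Python sketch computes a cumulative IVH together with Vx and Ix from a flat list of voxel intensities for one structure. The function names and the binning are assumptions made for this example. Ix is interpreted here, by analogy with DVH metrics, as the minimum intensity found within the x% of the volume that has the highest intensity, and x in Vx is taken as a percentage of the maximum intensity.

```python
import math


def cumulative_ivh(values, n_bins=10):
    """Return (thresholds, volume_percent), where volume_percent[k] is the
    percentage of voxels with intensity >= thresholds[k]."""
    vmax = max(values)
    thresholds = [vmax * k / n_bins for k in range(n_bins + 1)]
    n = len(values)
    volume = [100.0 * sum(1 for v in values if v >= t) / n for t in thresholds]
    return thresholds, volume


def v_x(values, x):
    """Vx: percentage of the volume with intensity >= x% of the maximum."""
    cutoff = max(values) * x / 100.0
    return 100.0 * sum(1 for v in values if v >= cutoff) / len(values)


def i_x(values, x):
    """Ix: minimum intensity within the hottest x% of the volume."""
    ranked = sorted(values, reverse=True)
    k = max(1, math.ceil(len(ranked) * x / 100.0))  # number of voxels in the top x%
    return ranked[k - 1]


if __name__ == "__main__":
    # Illustrative SUVs only, not patient data.
    suvs = [2.0, 4.0, 6.0, 8.0, 10.0, 10.0, 3.0, 5.0, 7.0, 9.0]
    print(cumulative_ivh(suvs))
    print(v_x(suvs, 50), i_x(suvs, 50))
```

All voxels are assumed to have equal volume, so counting voxels is equivalent to measuring volume.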
Examples of IVH plots and these metrics for PET data in CERR are shown in figures 3 and 4. Note that the ‘x’ in Vx is chosen as a percentage of the maximum SUV for each patient because SUVs tend to have relatively large variability amongst patients. In our datasets, the maximum SUV range was 3-43 with a median of 12 for cervix patients, and 4-32 with a median of 16 for head and neck patients.
The co-occurrence features are based on the second-order joint conditional probability density function P(i,j;a,d) of a given texture image. The (i, j)th element of the co-occurrence matrix for an anatomical structure of interest represents the number of times that intensity levels i and j occur in two voxels separated by distance (d) in direction (a). In our implementation, d was set to a single voxel size, and a was selected to cover the 26-connected neighborhood in 3D space. For an image with M intensity levels, the co-occurrence matrix size is M × M. The M levels were obtained by applying a non-uniform quantization method (root-squared) to limit the size of the matrix and speed up computation. Note that in this case the main source of noise is actually due to quantization rather than the photon-limited counts of the original image (see footnote a). The values for M are typically selected as powers of 2 (8, 16, 32, etc.). Shown in figures 5 and 6 are sample surface plots from clinical PET images.
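A minimal Python sketch of a 3D co-occurrence matrix with d equal to one voxel and all 26 neighbor directions is given below. It makes several simplifying assumptions that differ from the description above: uniform rather than non-uniform quantization is used, the whole array is analyzed rather than a contoured structure, and only two features (energy and entropy, in their standard Haralick-style definitions) are computed. All function names are hypothetical.

```python
import itertools
import math


def quantize(volume, levels):
    """Uniformly quantize a nested list volume[z][y][x] to integer levels 0..levels-1.
    (Uniform quantization is a simplification chosen for this sketch.)"""
    flat = [v for plane in volume for row in plane for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant volume
    return [[[min(levels - 1, int((v - lo) / span * levels)) for v in row]
             for row in plane] for plane in volume]


def cooccurrence(q, levels):
    """Normalized co-occurrence matrix over the 26-connected neighborhood (d = 1 voxel)."""
    offsets = [o for o in itertools.product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    nz, ny, nx = len(q), len(q[0]), len(q[0][0])
    counts = [[0] * levels for _ in range(levels)]
    for z, y, x in itertools.product(range(nz), range(ny), range(nx)):
        for dz, dy, dx in offsets:
            zz, yy, xx = z + dz, y + dy, x + dx
            if 0 <= zz < nz and 0 <= yy < ny and 0 <= xx < nx:
                counts[q[z][y][x]][q[zz][yy][xx]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def energy(p):
    """Sum of squared matrix entries; equals 1 for a perfectly uniform region."""
    return sum(v * v for row in p for v in row)


def entropy(p):
    """Shannon entropy (in bits) of the matrix entries; higher means more heterogeneous."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)


if __name__ == "__main__":
    # Small synthetic volume, illustrative only.
    vol = [[[float((x + y + z) % 2) for x in range(3)] for y in range(3)] for z in range(3)]
    p = cooccurrence(quantize(vol, 2), 2)
    print(energy(p), entropy(p))
```

Because every direction is counted together with its opposite, the resulting matrix is symmetric. The volume must contain more than one voxel for the normalization to be defined.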
These metrics are independent of tumor position, orientation, size, and brightness, and take into account the local intensity-spatial distribution [19, 32]. These are considered advantages of texture-based analysis over direct (first-order) histogram metrics (e.g., mean and standard deviation), which are currently practiced clinically in some institutes for analyzing PET metabolic uptake [8, 27]. Hence, these features can provide an intensity-spatially dependent map of the tumor metabolic uptake that can potentially be used as a signature to characterize the tumor response to treatment.
In our experiments, these features were mainly calculated for the tumor GTV region; in some cases, figures are also presented for the surrounding CTV background for comparison purposes, as discussed below.
We used Spearman's rank correlation (rs) and the area under the receiver-operating characteristics curve (AUC) to analyze the association between the extracted features and post-radiotherapy outcomes in cervix and head and neck cancers. Spearman's coefficient provides a simple yet robust nonparametric way to estimate bivariate increasing or decreasing trends that are not necessarily linear associations. The receiver-operating characteristics (ROC) curve is a graph of the true positive fraction (sensitivity) versus the false positive fraction (1-specificity) for a continuum of threshold values. An area under the curve (AUC) of 1 is ideal, while a value of 0.5 is equivalent to a random guess. Figures 7 and 8 show ROC curves for outcome prediction based on PET analysis in cervix cancer and head and neck cancer, respectively. To improve the prediction power, we have adopted a multi-metric model building approach based on logistic regression and resampling methods to combine several prognostic factors, in which a logit transformation was used:
$$f(\mathbf{x}_i) = \frac{e^{g(\mathbf{x}_i)}}{1 + e^{g(\mathbf{x}_i)}}, \quad i = 1, \ldots, n \qquad (7)$$
where n is the number of patients and xi is a vector of the input variable values used to predict f(xi) for outcome yi of the ith patient. The ‘x-axis’ summation g(xi) is given by:
$$g(\mathbf{x}_i) = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}, \quad i = 1, \ldots, n \qquad (8)$$
where p is the number of selected feature metrics. The most relevant metrics are selected using a sequential forward search. The parameters of the model (βj) are estimated using maximum likelihood techniques. The validity of the derived logistic regression model was tested using the leave-one-out cross-validation resampling technique. The main rationale for using the logistic function in (7) as a predictor is that treatment responses are thought to follow S-shaped curves. In addition, its simple form is convenient for routine clinical evaluation.
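The model fitting and leave-one-out validation described above can be sketched in a few lines of Python. This is a hedged illustration only: plain gradient ascent on the log-likelihood is used here as one simple way to approximate the maximum likelihood estimate, the learning rate and iteration count are arbitrary choices, the sequential forward search is omitted, and the function names and toy data are invented for the example.

```python
import math


def sigmoid(g):
    """Numerically stable logistic function e^g / (1 + e^g)."""
    if g >= 0:
        return 1.0 / (1.0 + math.exp(-g))
    e = math.exp(g)
    return e / (1.0 + e)


def predict(beta, xi):
    """f(x_i) for g(x_i) = beta_0 + sum_j beta_j * x_ij."""
    g = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
    return sigmoid(g)


def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Approximate the maximum likelihood estimate by gradient ascent
    (an optimizer chosen for this sketch, not necessarily the original one)."""
    p = len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = yi - predict(beta, xi)  # gradient of the log-likelihood
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


def leave_one_out(X, y, **fit_args):
    """Predict each patient from a model fitted on all the other patients."""
    preds = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        preds.append(predict(fit_logistic(X_train, y_train, **fit_args), X[i]))
    return preds


if __name__ == "__main__":
    # Toy single-feature data, not patient data.
    X = [[0.1], [0.3], [0.4], [0.6], [0.7], [0.9]]
    y = [0, 0, 1, 0, 1, 1]
    print(fit_logistic(X, y))
    print(leave_one_out(X, y))
```

With very small cohorts the classes can be perfectly separable, in which case the maximum likelihood estimate diverges; the fixed number of iterations here simply stops the coefficients from growing without bound.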
As a demonstration of applying these methods to improve cancer treatment outcomes analysis in PET, we considered cohorts of 14 cervix cancer patients and 9 head and neck patients who were treated by chemoradiotherapy in our institute.
In figure 3, we show the IVH plots for the clinical target volume (CTV) outlined by the oncologist and the 40% maximum SUV delineated cervix tumor corresponding to the patient's PET scan in figure 1. Note that the IVH of the delineated tumor has a shallower slope compared to the CTV, indicating larger variability in uptake values in the tumor region (due to spanning a wider range of intensities in the cumulative histogram). Shown in figure 5a, b are surface plots of the co-occurrence matrix for the CTV and the cervix tumor. Note that the CTV is relatively smooth (entropy is 3.6) while the tumor shows a ‘noisy’ pattern (entropy is 5.3), which is consistent with the shallow slope of the IVH (larger variability). Summarized in table 1 are the statistical analysis results. It is observed that the texture-based metrics in the table had the highest categorical prediction power of failure risk, while commonly used SUV descriptive statistics had the lowest predictive ability. Specifically, the IVH-based V10–90 and the texture energy were the most significant predictive features. A combined model of these two variables using logistic regression analysis is characterized by the following linear combination:
The model order was manually set to 2 and the variables were chosen by forward selection as explained in section 2.4. This model had an rs=0.49 (p=0.04) and an AUC=0.76 as shown in figure 7.
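For readers who want to reproduce this kind of evaluation, the following Python sketch computes Spearman's rank correlation (with average ranks for ties) and the AUC (via its equivalence to the Mann-Whitney statistic) from model scores and binary outcomes. The function names and the example scores are made up for illustration; p-values are not computed.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def spearman(x, y):
    """Spearman's rs: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def auc(scores, labels):
    """Probability that a random positive case scores higher than a random
    negative case (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Illustrative scores and outcomes only.
    scores = [0.1, 0.4, 0.35, 0.8]
    outcomes = [0, 0, 1, 1]
    print(spearman(scores, outcomes), auc(scores, outcomes))
```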
We carried out a similar analysis for the head and neck case with an endpoint of overall survival. In figure 4, we show the IVH plots for the CTV and the tumor GTV outlined by the oncologist corresponding to the patient's PET scan in figure 2. Note that the IVH slopes of the CTV and the GTV are more comparable in terms of their steepness/shallowness in contrast with the cervix case; this could be due to the relatively more heterogeneous nature of FDG uptake in surrounding normal tissues in head and neck cases. Shown in figure 6a, b are the surface plots of the co-occurrence matrix for the CTV and the head and neck gross tumor. Notice that the texture map of the CTV is again less noisy than that of the tumor (entropy of 3.8 versus 4.7, respectively), but is less smooth than in the cervix case (cf. figure 5a and figure 6a), whereas the cervix tumor's texture map was noisier than that of the head and neck tumor (cf. figure 5b and figure 6b). Summarized results for the statistical analysis of the head and neck data are found in table 2. It is interesting to notice from table 2 that shape-based metrics had the highest categorical prediction power of failure risk, while commonly used SUV descriptive statistics had the lowest predictive ability. The IVH-based V90 had the highest univariate predictive power of survival outcome. However, the tumor volume was also noted to be a good predictor of outcome, which is consistent with previous findings [38-40].
We have chosen again to build a two-metric logistic regression model. The IVH-based V90 and shape extent were chosen for constructing the model, which is characterized by the following linear combination:
This model had an rs=0.87 (p=0.0012) and an AUC=1 as shown in figure 8, which is a perfect fit for this small cohort of 9 patients. Larger datasets are needed to validate the prediction power of this model.
In this work, we have investigated methods based on pattern recognition for analyzing functional imaging data to ameliorate reported shortcomings of relying on visual inspection or sole traditional standardized uptake value (SUV) measurements in assessing treatment relative risk. These methods would allow for extracting potential prognostic factors from functional imaging data that could be more robust, informative, and discriminative than simplistic SUV descriptive statistics to predict treatment outcomes.
Two distinct approaches were explored for analyzing functional imaging information. The first category is histogram-based reduction of 3D functional imaging data and is denoted as the intensity-volume histogram (IVH). The IVH analysis of functional imaging information inherits the same advantages and disadvantages of histogram analyses in analyzing volumetric data. Metrics such as Ix and Vx are derived from IVH plots and could be used to build predictive models of outcomes as demonstrated in Section 3. Nevertheless, IVH, like other first-order histogram approaches, is limited by its spatial insensitivity. To overcome this limitation, we have also investigated image-based features such as texture and shape attributes.
Unlike SUV descriptive statistics and IVH metrics, image-based features have spatial and topological information embedded in them. As a proof of principle, we have used the co-occurrence matrix features to represent texture information because of their relatively simple and intuitive structure. Surface plots of the co-occurrence matrix give a pictorial representation of the spatial-intensity distribution, which is typically masked by first-order histogram analyses. Several metrics could be derived from the co-occurrence matrix to characterize the texture features of the structure of interest. We have chosen only four features from this pool, which we thought were most relevant for analyzing PET prognostic value, mainly focused on characterizing the heterogeneity of uptake texture patterns. However, other invariant texture feature approaches could be investigated as well, such as the multi-channel Gabor filter, wavelet transform, Markov model, etc., which may not suffer from information loss due to quantization. These methods and others are surveyed in .
For shape analysis, we have chosen geometric shape features such as eccentricity, Euler number, solidity, and extent. The Euler number is particularly useful for analyzing necrotic tumors or GTVs that include lymph node structures in addition to the primary tumor, as seen in figure 2, because this shape feature can capture the presence of multiple uptake regions within the region of interest.
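To make two of these shape descriptors concrete, the sketch below works on a binary region given as a set of (z, y, x) voxel coordinates. It computes the extent, taken here in its common definition as the ratio of the region volume to the volume of its axis-aligned bounding box, and counts 26-connected components. The component count is only a simplified stand-in for the Euler number, which in 3D also accounts for tunnels and cavities; the function names are hypothetical.

```python
from itertools import product


def extent(voxels):
    """Fraction of the axis-aligned bounding box that is occupied by the region."""
    dims = [max(v[a] for v in voxels) - min(v[a] for v in voxels) + 1 for a in range(3)]
    return len(voxels) / (dims[0] * dims[1] * dims[2])


def count_components(voxels):
    """Number of 26-connected components, found by a depth-first flood fill."""
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    remaining = set(voxels)  # copy so the caller's set is not modified
    n_components = 0
    while remaining:
        stack = [remaining.pop()]
        n_components += 1
        while stack:
            z, y, x = stack.pop()
            for dz, dy, dx in offsets:
                neighbor = (z + dz, y + dy, x + dx)
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    stack.append(neighbor)
    return n_components


if __name__ == "__main__":
    # A compact "primary" region plus one distant "node" voxel (synthetic example).
    primary = {(z, y, x) for z in range(2) for y in range(2) for x in range(2)}
    region = primary | {(5, 5, 5)}
    print(extent(region), count_components(region))
```

A GTV that contains separate nodal uptake in addition to the primary tumor would yield more than one component and a low extent in this sketch.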
We have selected two cancer sites for demonstrating our methods. Other sites such as lung may be good candidates as well; however, issues related to respiratory motion artifacts need to be addressed carefully. We observed from our preliminary analysis that the predictive variables were site and endpoint specific. In both the cervix and the head and neck examples, an IVH-based variable was among the strongest correlates with outcome. However, the texture features seemed to be more relevant in the cervix case, while shape features appeared to be more prevalent in the head and neck case. The prevalence of shape features in this case might be affected by the GTV definition that included active lymph nodes. We noted that the prediction in the case of cervix cancer was modest compared to head and neck cancer. This could be attributed to the inability of the selected features to fully characterize tumor persistence in cervix cancer based on heterogeneity of FDG uptake only; additional variables related to irradiation resistance, such as oxygenation levels, may need to be included in the analysis as well. In general, we conjecture that heterogeneity metrics such as entropy and local homogeneity may be most relevant if texture matters, while eccentricity and shape convexity would be more prevalent if shape matters. The presented results are based on a small cohort of patients and are intended for demonstrative purposes. Analysis on larger datasets would be required to shed more light on the role of these metrics and their clinical relevance. We expect that these imaging metrics will complement and enrich currently existing clinical knowledge.
For effective use of PET imaging as a biomarker and proper application of pattern recognition features, deliberate attempts should be made to ensure that the PET scanning of the patients follows the same standard with regard to blood glucose concentration, FDG dose, interval between FDG injection and imaging, and imaging duration. In addition, it should be noted that shape metrics are relatively more sensitive than texture metrics to delineation variations. Manual contouring is known to suffer from inter-observer variability. Therefore, automated methods would be preferred to manual contouring when possible. However, currently used threshold-based approaches may be prone to errors. We have recently investigated an automated concurrent PET/CT image segmentation algorithm based on active contours that can potentially provide better performance than its counterparts by jointly analyzing the CT and the PET boundary information.
In our analysis, we have emphasized building a multi-metric model of relevant features to further improve the predictive power by exploiting complementary information in these features . This approach has improved the prediction power for both cervix cancer and head and neck cancer cases in comparison to using a single metric (correlation in cervix case improved by about 14% and in head and neck by about 12%). However, other models based on machine learning [45, 46], which are able to capture nonlinear pattern relationships could provide better discriminant power as in our previous work in mammography .
We believe that using the proposed pattern analysis methodology would potentially enrich the set of tools available to the outcome analyst in oncology to extract the most relevant information from functional imaging data.
We have proposed a systematic pattern recognition approach for analyzing functional imaging data, in particular for FDG-PET. This would allow for extracting robust visual cues and features that could facilitate incorporation of PET imaging information to better understand patients' response to radiotherapy treatment and reduce the subjectivity induced by visual assessment. The intensity-volume histogram (IVH) summarizes intensity-volume relationships in functional images, which could have clinical treatment implications regarding dose coverage adequacy for the oncologist. Image texture and shape features are valuable tools to mimic human perception of visual cues without being prone to inter- and intra-observer variability. In particular, texture features could significantly aid in summarizing tumor uptake characteristics in its microenvironment and its relation to treatment resistance in certain clinical scenarios. Our preliminary results are encouraging and suggest that the proposed methodologies could yield beneficial and more robust tools for treatment response assessment based on functional imaging information. However, further testing using larger datasets is required to validate the quality of extracted relevant features from PET images for clinical decision-making.
This work was supported in part by NIH grant R01 CA 85181 and American Cancer Society IRG-58-010-50.
Issam El Naqa received his B.Sc. (1996) and M.Sc. (1995) in Electrical and Communication Engineering from the University of Jordan, Jordan, Ph.D. (2002) in Electrical Engineering from Illinois Institute of Technology, Chicago, IL, USA, and M.A. (2007) in Biology from Washington University in St. Louis, St. Louis, MO, USA. He is currently an assistant professor in the Department of Radiation Oncology at Washington University School of Medicine, St. Louis, MO, USA. He is a member of IEEE, AAPM, and ASTRO. His current research interests include image processing, pattern recognition, machine learning, and treatment outcomes.
Perry W. Grigsby received his M.D. (1982) from University of Kentucky, Lexington, KY, USA. He is a board-certified radiation oncologist at the Siteman Cancer Center of the Barnes-Jewish Hospital. He is currently a Professor of Radiation Oncology, Nuclear Medicine, and Gynecologic Oncology and a Director of the Brachytherapy & microRT Treatment Center. Dr. Grigsby is a fellow of the American Cancer Society and is an expert (America's best doctors) in the detection, diagnosis and treatment of gynecological and thyroid cancers, with numerous publications in these areas. His current research interests include gynecologic oncology, cervical cancer, thyroid cancer, gynecologic brachytherapy, and PET imaging.
Aditya Apte, M.Sc., is a software engineer and works on developing CERR, an in-house research treatment planning system.
Elizabeth Kidd, M.D. is a medical resident in radiation oncology.
Eric Donnelly, M.D., is a medical fellow in radiation oncology.
Divya Khullar, M.Sc., is a software engineer and works on developing CERR, an in-house research treatment planning system.
Summer Chaudhari, B.Sc., is a PhD candidate in medical physics.
Deshan Yang received his B.Sc. (1992) in Electrical Engineering from Tsinghua University, Beijing, China, M.Sc. (2001) in Computer Science from Illinois Institute of Technologies, Chicago, IL, USA, Ph.D. (2005) in Biomedical Engineering from Wisconsin, Madison. He is currently a post-doctoral fellow in the Department of Radiation Oncology at Washington University School of Medicine, St. Louis, MO, USA. He is an IEEE and AAPM member. His current research interests include image processing, pattern recognition, and motion estimation.
Martin Schmitt, B.Sc., is a PET technologist at the Mallinckrodt Institute of Radiology at Washington University School of medicine, St. Louis, MO, USA.
Richard Laforest received his B.Sc. (1989), M.Sc. (1991), and Ph.D. (1994) in Experimental Nuclear Physics from Laval University, Quebec, Canada. He is currently an associate professor and co-Director of the Small Animal Imaging Laboratory at the Mallinckrodt Institute of Radiology at Washington University School of Medicine, St. Louis, MO, USA. His current research interests involve the development of multimodality (PET/CT) small animal imaging instruments and techniques.
Wade L. Thorstad received his M.D. from the University of Texas Medical School, Houston, Texas, 1991. He is board certified in radiation oncology and nuclear medicine at the Siteman Cancer Center of the Barnes-Jewish Hospital. He is an assistant professor of radiation oncology and specializes in head and neck cancer. His research interests include head and neck oncology, thoracic cancer, tumors, systemic radiotherapy, radioimmunotherapy, radioimmunodiagnosis, nuclear oncology, and PET imaging.
Joseph O. Deasy received his B.Sc. (1984) in electrical engineering and his Ph.D. (1992) in physics from University of Kentucky, Lexington, KY, USA. He is currently a professor and director of the Bioinformatics and Outcomes Research division in the Department of Radiation Oncology at Washington University School of Medicine, St. Louis, MO, USA. He is a member of IEEE, AAPM, and ASTRO. Dr. Deasy has been a principal investigator on many NIH-funded projects. He is an associate editor for the Medical Physics journal and serves as the chair of the AAPM biological effects committee. He was the co-chair of the joint NSF/NCI conference on Operations Research Applications in Radiation Therapy, Virginia, 2002. His current research interests include bioinformatics, statistical modeling, treatment planning optimization, and radiobiology.
Footnote a: For a uniform quantizer, assuming uniformly distributed data, the signal-to-noise ratio is given by 6 dB/bit. This would be improved by about 2 dB for the non-uniform case (Jain, 1989).