To develop an algorithm to identify and quantify BAT from PET/CT scans without radiologist interpretation.
Cases (n = 17) were randomly selected from PET/CT scans with documented “brown fat” by the reviewing radiologist. Controls (n = 18) had no documented “brown fat” and were matched with cases for age (49.7 [31.0-63.0] vs. 52.4 [24.0-70.0] yrs), outdoor temperature at scan date (51.8 [38.9-77.0] vs. 54.9 [35.2-74.6] °F), sex (F/M: 15/2 cases; 16/2 controls) and BMI (28.2 [20.0-45.7] vs. 26.8 [21.4-37.1] kg/m2). PET/CT scans and algorithm-generated images were read by the same radiologist blinded to scan identity. Regions examined included neck, mediastinum, supraclavicular fossae, axilla and paraspinal soft tissues. BAT was scored 0 for no BAT; 1 for faint uptake possibly compatible with BAT or unknown; and 2 for BAT positive.
Agreement between the algorithm and PET/CT scan readings was 85.7% across all regions. The algorithm had a low false negative (1.6%) and higher false positive rate (12.7%). The false positive rate was greater in mediastinum, axilla and neck regions.
The algorithm's low false negative rate combined with further refinement will yield a useful tool for efficient BAT identification in a rapidly growing field particularly as it applies to obesity.
Brown adipose tissue (BAT) is a highly metabolically active endocrine organ characterized by multilocular adipocytes with high mitochondrial expression of uncoupling protein-1 (UCP-1) (1). Until recently, the only known function of BAT in humans was nonshivering thermogenesis during infancy (2-5). It was understood that BAT disappeared after infancy and early childhood, suggesting that it did not contribute to energy homeostasis in adulthood. However, clinical studies have confirmed the presence of BAT in adults, and researchers have postulated that it may be functionally relevant in adulthood (6). In support of this hypothesis, recent studies using 18Fluoro-deoxyglucose positron emission tomography/X-ray computed tomography (18F-FDG-PET/CT) have shown that varying quantities of active BAT are present in adult humans and can contribute to energy expenditure (2,4,5,7,8). Through uncoupled oxidative phosphorylation, BAT inefficiently utilizes energy substrates and releases chemical energy as heat (9). BAT therefore favors energy expenditure through inefficient fuel utilization, which would be beneficial in a state of positive energy balance such as obesity (9). The presence of BAT in adults thus introduces a novel potential therapeutic target to treat obesity.
Although 18F-FDG-PET/CT scans provide a potential means for non-invasive identification of activated BAT in humans, image analysis methodology for efficient quantification of BAT from these scans is very limited. The relatively low prevalence of activated BAT detected in clinical scans (3-7.5%, [4, 10, 11]) implies that large datasets must be evaluated to generate statistically valid results. Hence, an automated tool capable of analyzing large volumes of scans for clinical intervention and population studies is needed. The objective of this study was to design an algorithm to identify and quantify human activated BAT from clinical diagnostic 18F-FDG-PET/CT scans without human interpretation.
A retrospective chart review was approved by the Institutional Review Board at Boston University Medical Center, Boston, MA to identify a cohort of subjects suitable to test a program designed to segment BAT from clinical 18F-FDG-PET/CT scans. Patients who had undergone an 18F-FDG-PET/CT scan were chosen from 1,048 consecutive skull-to-mid-thigh or whole-body 18F-FDG-PET/CT scans taken at Boston Medical Center between 2006 and 2009.
Cases (n = 17) were randomly selected from 143 of these 1,048 scans based on documentation of BAT in their medical record by the reviewing radiologist at the time of the scan. Controls (n = 18) were selected from the pool of 18F-FDG-PET/CT scans without documentation of BAT from their scan. Controls were selected to match cases for age, outdoor temperature at scan date, sex and BMI (Table 1). Subjects selected had undergone diagnostic testing for the following reasons (number of controls/cases): breast (7/9), lung (2/1), gastrointestinal (1/1), lymphoma (4/2), head and neck (2/1), ovarian (0/1), parametrial (0/1), cervical (0/1), pancreatic (1/0) cancer and a benign lung tumor (1/0). The study radiologist independently confirmed the presence or absence of BAT in the scans selected for this study.
The areas examined were limited to regions previously reported to have BAT in humans (4,12), including left and right supraclavicular, paraspinal and axilla regions, as well as the mediastinum. CT and PET images were processed with algorithms written in Matlab (Mathworks, Natick, MA) to identify regions of BAT (Figure 1). CT images were filtered using an adaptive Wiener filter to remove noise, and regions of fat between the diaphragm and the top of the spine were identified as pixels having −200 to −10 Hounsfield units. PET images of standardized uptake value (SUV) were resampled to the CT resolution using linear interpolation. Fat regions with SUV >2 g/mL were delineated, and contiguous regions <10 mm2 in area were removed. To eliminate spillover artifacts, regions whose activity was <90% of the maximum activity within a three-pixel-wide perimeter around the region were removed. Images were created with BAT regions superimposed on the body outline and bone structure (Figure 2) to allow visual assessment of the program results by the radiologist.
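The thresholding and size-filtering steps described above can be illustrated with a minimal sketch. It is written in Python rather than the Matlab used in the study and is not the authors' code. It assumes a single 2D slice with the PET data already resampled to the CT grid, uses 4-connectivity to define contiguous regions (the connectivity rule is not stated in the text), and omits the Wiener filtering and the spillover test. The function name `segment_bat_slice` and its parameters are hypothetical.

```python
from collections import deque


def segment_bat_slice(hu, suv, pixel_area_mm2, hu_range=(-200, -10),
                      suv_threshold=2.0, min_area_mm2=10.0):
    """Return candidate BAT regions as lists of (row, col) pixels.

    hu and suv are equally sized 2D lists (CT values in Hounsfield units
    and PET values as SUV). Connectivity and names are assumptions.
    """
    rows, cols = len(hu), len(hu[0])
    # Candidate pixels: fat on CT and above the SUV threshold on PET.
    mask = [[hu_range[0] <= hu[r][c] <= hu_range[1] and suv[r][c] > suv_threshold
             for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects one 4-connected region.
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Keep only regions at or above the minimum area.
            if len(region) * pixel_area_mm2 >= min_area_mm2:
                regions.append(region)
    return regions
```

With a pixel area of 4 mm2, for example, a three-pixel cluster (12 mm2) would be kept while an isolated pixel (4 mm2) would be discarded as noise.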
We conducted a case-control study to compare the utility of the algorithm against radiologist expertise. There is no set “gold standard” for defining activated BAT from 18F-FDG-PET/CT images, which is required to validate the results of the automated algorithm. We chose our gold standard to be the judgment of an experienced nuclear medicine radiologist who reviewed all 18F-FDG-PET/CT scans, and assigned a likelihood score to activity seen within adipose tissue located between the skull base and the iliac crests. This nuclear medicine radiologist is a senior faculty member with 16 years of experience reading PET scans and 8 years of experience reading PET/CT scans. All cases were reviewed in a clinical workstation (MIMVista, version 4.3, Cleveland, OH) that displayed the CT, PET, and fused PET/CT images in the coronal, sagittal, and axial planes. PET scans were attenuation corrected and images were displayed as standard uptake values (SUV) corrected for body weight using a 7-bit grey scale mapped to SUVs between 0 and 7, a common scale for clinical PET/CT reading. The scans were acquired using Boston Medical Center's standard imaging protocol for oncology (unpublished). Briefly, patients were injected with 10-20 mCi of 18F-FDG after at least 4 hours of fasting and imaged from the skull base to the mid thighs 60 minutes after injection using a Discovery STN-16 PET/CT scanner (GE Healthcare, Waukesha, WI). During the 60 minute period after injection patients rested in a reclining chair in a quiet, lit room. PET emission scans varied with body habitus. For BMI less than 20 kg/m2, the emission scans lasted for 2 minutes per bed position and were acquired in 3D mode. For BMI greater than 30 kg/m2, the emission scans lasted 6 minutes per bed position and were acquired in 2D mode. For BMI between 20-30 kg/m2, the emission scans lasted 3 minutes using 3D mode acquisition. CT imaging was done first using a low-radiation dose protocol. 
CT mode is full helical with tube rotation time of 0.8 seconds. The pitch is 1.75 with a beam collimation of 10 mm (16 × 0.625 mm) and table speed of 17.5 mm/rotation. The peak voltage ranges from 120 to 140 kV, and the current is fixed at 125 mA. CT images are reconstructed using a standard kernel at 3.75 mm slice thickness and displayed every 3.3 mm to match the PET data that is reconstructed in 3.3 mm slice thickness displayed also every 3.3 mm. The CT data was also used to do attenuation correction of the PET data following the procedure recommended by the manufacturer of the scanner. Radiation exposure to patients ranged from 13 to 24.5 mSv, and included the radiotracer and CT radiation dose (13).
The score system employed in defining the gold standard assigned the value of 0 for no BAT, 1 for faint/unknown, and 2 for BAT positive. This scale was used, instead of a binary “BAT present” and “BAT absent”, because there is no rigorous definition for BAT in 18F-FDG-PET/CT imaging. Activated BAT in adult humans was originally reported from 18F-FDG-PET/CT imaging (14), and was described as unexpected uptake in the fat of the supraclavicular fossae and neck that was above background. Most of the studies were done in oncology patients and activity is not seen in fat when attenuation corrected images are displayed using scales common to oncology imaging. Although BAT activity is most common in the supraclavicular and neck regions, experience has shown that unexpected uptake is also seen in the fat of the axillae, mediastinum, paraspinal soft tissues, and even in the retroperitoneum around the adrenal glands (15). For this reason our radiologist only reviewed images from the base of the skull to the top of the diaphragm, and used the scoring system to document his confidence that uptake was related to BAT. The reasons that caused a score of 1, or questionable BAT, included (1) uptake that was barely above background and was possibly an artifact; (2) uptake in fat outside areas where BAT was expected; or (3) misregistration between the PET and CT scans that led to uncertainties with regards to the origin of the uptake, for example, muscle or tumor uptake versus BAT. In defining the gold standard and assigning a score the radiologist did not use a specific SUV threshold or measured activity by drawing regions of interest. Instead the radiologist relied on a qualitative impression from images displayed using the same scale.
After a 4 week break to ensure there was no influence from the PET/CT scan readings, the same radiologist also reviewed and scored the images produced by the algorithm. These images consisted of a body outline and color coded boundaries enclosing a volume of fat tissue with uptake SUV > 2 g/mL, as detailed below. The radiologist scored these images using the same scoring system used for the PET/CT scans (described above).
Finally, three weeks after the algorithm scoring was completed, the radiologist conducted a qualitative side-by-side comparison of the algorithm and PET/CT images to determine the sources of disagreement between the two methods. Using the scoring system, we classified each finding as a true positive (algorithm 1 and PET/CT 1 or algorithm 2 and PET/CT 2), true negative (algorithm 0 and PET/CT 0), false positive (algorithm 2 and PET/CT 0 or 1) or false negative (algorithm 0 and PET/CT 1 or 2).
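The classification rule can be written compactly. The sketch below is illustrative only. The enumerations in the text do not list every possible score pair, so it assumes the general reading that an algorithm score higher than the PET/CT score is a false positive and a lower one is a false negative. The function names are hypothetical.

```python
from collections import Counter


def classify_finding(algorithm_score, petct_score):
    """Classify one region from two scores on the 0/1/2 scale.

    Assumed rule: equal scores agree (0 vs. 0 is a true negative, any
    other equal pair a true positive); otherwise the algorithm either
    overcalls (false positive) or undercalls (false negative).
    """
    if algorithm_score == petct_score:
        return "TN" if algorithm_score == 0 else "TP"
    return "FP" if algorithm_score > petct_score else "FN"


def tally_findings(score_pairs):
    """Count categories over (algorithm_score, petct_score) pairs."""
    return Counter(classify_finding(a, p) for a, p in score_pairs)
```

For instance, `tally_findings([(2, 2), (0, 0), (2, 1), (0, 2)])` counts one finding in each of the four categories.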
To measure the performance of the algorithm relative to the PET/CT reading (congruency), the score of the algorithm was compared to the PET/CT score and the proportions of true positives (1 vs. 1 and 2 vs. 2), true negatives (0 vs. 0), false positives (0 vs. 1 or 2 and 1 vs. 2) and false negatives (1 vs. 0 and 2 vs. 1) were determined. For each region, percentages in a category were calculated by dividing the number of findings in that category by the number of patients studied. The sensitivity of the algorithm was calculated as (true positives/[true positives + false negatives])*100%; the specificity as (true negatives/[false positives + true negatives])*100%; and the positive predictive value as (true positives/[true positives + false positives])*100%. In addition, the agreement between the two scores was determined by weighted Kappa analysis. Based on Landis and Koch (16), agreement for the weighted kappa coefficient was defined as: ≤0 = poor, 0.01-0.20 = slight, 0.21-0.40 = fair, 0.41-0.60 = moderate, 0.61-0.80 = substantial, and 0.81-1.00 = almost perfect (17). The Bonferroni corrected level of significance was set at p < 0.007 for weighted Kappa analysis. The distribution of BAT volumes was not Gaussian and thus the Wilcoxon rank-sum test was used to compare BAT volumes between cases and controls. All statistical analyses were carried out using GraphPad Prism Statistical software (La Jolla, CA).
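As an illustration of the agreement statistic, the following sketch computes a weighted kappa for paired scores on the 0/1/2 scale. The study used GraphPad Prism, and the weighting scheme is not stated in the text. Linear weights are assumed here, and the function name is hypothetical. The last line shows one plausible origin of the 0.007 threshold, assuming a family-wise level of 0.05 divided across the seven regions analyzed.

```python
def weighted_kappa(score_pairs, n_levels=3):
    """Linear-weighted Cohen's kappa for (rater_a, rater_b) score pairs.

    Agreement weights are w = 1 - |i - j| / (n_levels - 1), so adjacent
    scores earn partial credit. Linear weighting is an assumption.
    Undefined (division by zero) if both raters use a single category.
    """
    n = len(score_pairs)
    observed = [[0.0] * n_levels for _ in range(n_levels)]
    for a, b in score_pairs:
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_levels)]
    col = [sum(observed[i][j] for i in range(n_levels)) for j in range(n_levels)]
    p_observed = p_expected = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            weight = 1.0 - abs(i - j) / (n_levels - 1)
            p_observed += weight * observed[i][j]
            p_expected += weight * row[i] * col[j]
    return (p_observed - p_expected) / (1.0 - p_expected)


# Assumed derivation of the corrected threshold: 0.05 over seven regions.
bonferroni_alpha = 0.05 / 7
```

Perfect agreement gives a kappa of 1, while systematic disagreement at opposite ends of the scale gives negative values.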
Agreement between the algorithm and the PET/CT scoring was 85.7% across all regions. Individual analyses by region demonstrated that there was substantial agreement between the algorithm and the PET/CT score for the left axilla (κ = 0.745, p < 0.0001), left (κ = 0.744, p < 0.0001) and right (κ = 0.731, p < 0.0001) paraspinal and left (κ = 0.735, p < 0.0001) and right (κ = 0.684, p < 0.0001) neck regions. There was only moderate agreement between the algorithm and PET/CT scoring for the right axilla (κ = 0.508, p = 0.001) region and fair agreement for the mediastinum region (κ = 0.398, p = 0.001).
The sensitivity of the algorithm across all regions was 86.7% and the specificity was 85.6%. Two examples of true positive findings of the supraclavicular region are demonstrated in Figure 3A and 3B. The overall false negative rate was low (1.6%) and only four examples were found in 35 scans across seven regions (Table 2). Of the four false negatives identified, two were found in one scan and were due to a gross misregistration of the PET scan with the CT scan. In these cases, local increases in 18F-FDG uptake of the BAT regions were misplaced into non-adipose tissue regions, precluding identification. The other two false negatives were found in the paraspinal region of separate scans and were due to underestimation of BAT by the algorithm (pictured in Figure 3C).
The false positive rate of BAT by the algorithm was higher (12.7%) than the false negative rate (Table 2), leading to a positive predictive value of 45%. False positives were mostly due to spillover of heart activity into the BAT region of the mediastinum (Figure 3D), overestimation of BAT by the algorithm relative to the PET/CT reading, underestimation of activity by the radiologist in the PET/CT scan readings, and one case of tumor pathology in the right neck region misidentified as BAT by the algorithm.
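The reported percentages can be checked for internal consistency with a short calculation. Only the four false negatives and the 245 findings (35 scans across seven regions) are stated in the text. The other counts used below (26 true positives, 184 true negatives, 31 false positives) are back-calculated from the reported percentages and are an assumption, not values taken from Table 2. The function name is hypothetical.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Return summary percentages from the four finding counts.

    The false positive and false negative "rates" follow the text's
    convention of dividing by all findings, not by condition totals.
    """
    total = tp + tn + fp + fn
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "agreement": 100.0 * (tp + tn) / total,
        "false_positive_rate": 100.0 * fp / total,
        "false_negative_rate": 100.0 * fn / total,
    }


# Inferred counts (assumption); only fn=4 and the total of 245 are reported.
metrics = diagnostic_metrics(tp=26, tn=184, fp=31, fn=4)
rounded = {name: round(value, 1) for name, value in metrics.items()}
```

Under these inferred counts the sketch gives a sensitivity of 86.7%, a specificity of 85.6%, agreement of 85.7%, and false positive and false negative rates of 12.7% and 1.6%, matching the reported values. The positive predictive value comes out at about 45.6%, close to the reported 45%.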
The total volume of activated BAT that was quantified by the algorithm is presented in Table 3. The volume of activated BAT was significantly higher in BAT cases than in controls (p < 0.01).
The algorithm detected small quantities of BAT in 5 control subjects, ranging from 0.36 to 5.58 mL. For three of these control subjects, the detected BAT was outside the pre-selected areas, and for two of these control subjects there was an occurrence of a false positive in the left neck and a false positive in the right axilla. For all other controls (n = 13) the algorithm detected no activated BAT. In all cases with pre-noted BAT presence (n = 17), the algorithm detected activated BAT, with volumes ranging from 3.36 to 157.52 mL.
We have developed a preliminary tool with the potential to screen thousands of clinical 18F-FDG-PET/CT scans for the presence of activated BAT. We demonstrate that our software can efficiently identify activated BAT with high sensitivity (few false negatives) and specificity, and an acceptable positive predictive value. The usefulness of this tool with regard to cost and time efficiency was very apparent during radiologist review. It took an estimated 10 minutes to adequately review each PET/CT scan image, which equates to ~6 hours for all 35 scans. In comparison, the software analysis takes approximately 1 minute per scan, which can be reduced in future versions by optimizing the program and the platform. The real value in time savings is not just in one-to-one comparisons of identified cases, but in the potential of screening large numbers of cases with high sensitivity automatically with this computer algorithm. For example, consider that the highest prevalence reported for BAT in the population is about 7%. This would mean that 500 scans would have to be evaluated by experienced radiologists to find 35 cases of BAT, an effort tying up valuable radiologist time. Instead, the algorithm has the potential to narrow these scans down to a reduced set of likely brown fat cases for radiologists to study. The utility of this tool is also evident from the increasing interest in understanding the role BAT may have in the prevention and treatment of obesity (18). Through comparative analysis of PET/CT images with the algorithm-generated images, we documented the good performance of the algorithm, and identified sources of error that can be handled by improved algorithms and imaging. Previous work by Cypess et al. (4) required significant human intervention. Our scheme is also noteworthy because we can quickly compute volumes and other metrics, such as total glycolytic activity (equal to the volume × mean SUV), in a region of BAT.
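The arithmetic behind these estimates is simple enough to sketch. The helper names below are hypothetical, and the inputs in the usage note are taken from the paragraph above (7% prevalence, 10 minutes per radiologist review, about 1 minute per software analysis).

```python
def expected_cases(n_scans, prevalence_percent):
    """Expected number of BAT-positive scans at a given prevalence."""
    return n_scans * prevalence_percent / 100


def review_hours(n_scans, minutes_per_scan):
    """Total review time in hours for a batch of scans."""
    return n_scans * minutes_per_scan / 60


def total_glycolytic_activity(volume_ml, mean_suv):
    """Total glycolytic activity, defined in the text as volume x mean SUV."""
    return volume_ml * mean_suv
```

For example, `expected_cases(500, 7)` gives 35 cases, `review_hours(35, 10)` gives about 5.8 hours of radiologist time (the ~6 hours quoted above), and `review_hours(35, 1)` gives about 0.6 hours of software time for the same 35 scans.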
The identification of significant BAT in adult humans has spurred great interest in understanding the potential role this metabolically active tissue may have in energy expenditure. Several studies have identified factors associated with the amount of BAT present in human subjects, including sex, age, BMI, percent body fat, and diabetes (2-4,8,11). Moreover, recent studies demonstrate that cold-induced activation of BAT significantly contributes to whole body energy expenditure (8,11). For example, Ouellet et al. (11) reported that cold activation of BAT increased total energy expenditure by 80%, demonstrating that BAT has great potential as a therapeutic target for obesity. Additionally, recent attempts have been made to elucidate the mechanisms that may facilitate the development of therapeutics that induce greater expression and/or activation of BAT. For example, Bostrom et al. (20) identified a novel blood-secreted hormone they named “irisin”, a cleaved fragment of FNDC5 that is regulated by PGC1-α. Through a series of cell culture and mouse model experiments this study demonstrated that irisin induces browning of white adipose tissue through upregulation of UCP-1 and Cidea mRNA expression. Moreover, mice fed a high fat diet were protected against obesity and diabetes with exogenous administration of irisin (20). Bostrom et al. (20) also demonstrated that irisin is highly conserved and is present in human subjects. Thus, in order to keep pace with this and other studies in this rapidly growing area of research, it is essential that we generate a tool to efficiently quantify BAT in humans.
In this study, we defined a gold standard for identifying BAT from 18F-FDG-PET/CT scans using an expert radiologist and a trinary scoring system. However, this standard is not perfect. It is based on the experience of a single reader who made choices that may not be made by others with similar experience. Indeed, when the radiologist compared the PET/CT scans and algorithm-generated images at the same time, he identified four cases where he felt that he had underestimated the likelihood of BAT. This occurred most often because the radiologist chose to give a low score to areas with faint activity (score of 0 or 1). Because these areas were above the threshold used by the program, the images generated by the algorithm were scored as clearly BAT (score of 2). Use of additional radiologists would provide a more robust gold standard because sites of activated BAT would be identified by a consensus of “experts”. However, this is a preliminary study and the anatomic areas were limited to those where activated brown fat is common in clinical PET/CT imaging. Such restriction reduces the likelihood of discrepancies between radiologists. As the software matures and is subject to additional tests, it will be necessary to improve on our “gold standard” by employing a panel of radiologists.
While the rate of false negatives in our study was low and the agreement was substantial for most regions, one source of error we observed when examining the false negative findings was misregistration of the PET and CT scans. Misregistration causes the algorithm to misidentify regions of white adipose tissue, placing the BAT outside the region of analysis. Typically this is caused by movement of sick patients during the scan because of respiration, muscle relaxation, and discomfort (21). The head and neck region, a region highly likely to have BAT, is most susceptible to misregistration because the head is not fixed in position during the scans and because of the long time lapse between PET and CT imaging (21). In order to minimize these issues, our future studies will involve healthy subjects without respiratory issues; we will also devise methods to fix the head in position and will be attentive to the comfort of the subject. Another possible solution is to develop image processing algorithms to recognize and, if possible, correct for misregistration. In addition, we observed two cases of false negative findings in the paraspinal region. It is possible that these were due to the size restriction the algorithm employs (contiguous regions <10 mm2 are removed). This constraint is designed to avoid false positive findings resulting from imaging noise. In future versions of the algorithm, relaxing this size constraint in regions where BAT volumes are expected to represent small image areas, such as the paraspinal region, may prevent such false negative findings.
We observed a higher false positive rate relative to false negative rate, which was primarily due to spillover of heart activity into the mediastinal fat next to the heart, an observation that was obvious when reading for the gold standard. As such, the agreement between the algorithm and the PET/CT scoring was only fair (κ = 0.398) in this region. We also cannot rule out the possibility that there was BAT present in the heart, as previous studies have confirmed the presence of brown adipocytes in epicardial fat (22). However, the interference of heart muscle activity precluded our ability to identify activated BAT in this organ. This confounding factor can be addressed by avoiding heart uptake through the use of prolonged (greater than 12 hours) fasting (23), which would limit spillover of heart activity into adjacent regions. Additionally, by identifying specific causes of false positive findings and developing the software to recognize these conditions, we can reduce such mislabeling in future versions.
Currently, there is no automated tool available that is capable of quantifying activated BAT in human PET/CT scans and such a device is ultimately necessary to compare quantities among various clinical studies (24). In the present study we demonstrate the novel capabilities of the algorithm to quantify the volume of activated BAT. In case subjects, we observed that the volume of activated BAT ranged from 3.36 mL up to 157.52 mL. This indicates that the algorithm can detect small (<5 mL), yet clinically significant volumes of activated BAT. The algorithm also quantified “BAT” in 5 of the control subjects, three of which had “BAT” detected outside the selected areas and two of which had false positive findings in the selected regions. One limitation of the algorithm in its present form is that it cannot quantify BAT by individual region, but instead gives the total activated BAT volume for the scan. In addition, studies in healthy subjects will use lower radiation doses, which may create more noise that might interfere with the algorithm's ability to identify very small regions of activated BAT. Validation of the algorithm under these experimental conditions in both men and women will be an integral next step in the development and validation of our tool.
Finally, this was a retrospective study and the scans were chosen from a patient population composed primarily of women who required a PET/CT scan for diagnostic purposes, mainly cancer. Therefore, the experimental conditions were not ideal for assessing BAT. In future studies, we will use a healthy population free of any illness that may interfere with the identification of BAT in order to perfect the algorithm. While this is a critical next step, we observed only one case in which the presence of a cancerous lesion produced a false positive finding in the regions of interest. In this example, the radiologist identified tumor pathology of the neck, a region that has a high likelihood of expressing BAT, in the PET/CT scan. However, the algorithm was unable to differentiate the BAT from the cancerous lesion and incorrectly identified the right neck as positive for BAT. Caution must be taken when conducting retrospective analyses of BAT prevalence from patient 18F-FDG-PET/CT images because of the unavoidable presence of cancerous lesions, particularly in patients with head and neck cancers.
We have developed a preliminary image processing algorithm that can efficiently identify activated BAT from clinical 18F-FDG-PET/CT images with very good sensitivity and specificity, and which has substantial agreement with radiologist assessment. We also present a novel tool useful for the quantification of activated BAT in human PET/CT scans. Further refinement of this approach and improvements in methodologies will yield a valuable tool for more efficient identification and more accurate quantification of BAT in large scale studies, where radiologist review would otherwise be extremely inefficient and costly. The utility of this algorithm-based method is evident from the rapidly growing area of research focused on increasing the activity of BAT in animals and humans in order to burn excess calories. Thus, this methodology would be essential for future studies aimed at determining the role of BAT in the treatment of obesity.
This study was supported by a Boston University Department of Medicine Pilot Study grant and the Boston Nutrition & Obesity Research Center (P30DK46200). M.R. Ruth, T. Szabo, T. Wellman, and G. Mercier have no competing interests. C.M. Apovian has served on the advisory boards for Allergan, Amylin, Orexigen, Merck, Johnson and Johnson, Abbott, Arena, Zafgen, Novo Nordisk, and Sanofi-Aventis, and has received research funding from Lilly, Amylin, Pfizer, Sanofi-Aventis, Orexigen, MetaProteomics, and the Dr. Robert C. and Veronica Atkins Foundation.