To compare inter-observer variations in delineating the whole breast for treatment planning using two contouring methods.
Auto-segmented contours were generated by a deformable image registration-based breast segmentation method (DEF-SEG) by mapping the whole breast clinical target volume (CTVwb) from a template case to a new patient case. Eight breast radiation oncologists modified the auto-segmented contours as necessary to achieve a clinically appropriate CTVwb and then recontoured the same case from scratch for comparison. Times to complete each approach as well as inter-observer variations were analyzed. The template case was also mapped to 10 breast cancer patients with body mass indexes ranging from 19.1 to 35.9. Three-dimensional surface-to-surface distances and volume overlapping analyses were computed to quantify contour variations.
The median time to edit the DEF-SEG-generated CTVwb was 12.9 min (range, 3.4–35.9), compared to 18.6 min (range, 8.9–45.2) to contour the CTVwb from scratch (30% faster; p = 0.028). The mean surface-to-surface distance was noticeably reduced from 1.6 mm among contours generated from scratch to 1.0 mm using the DEF-SEG method (p = 0.047). Deformed contours in 10 patients can achieve 94% volume overlap prior to correction and required editing of 5% of the contoured volume (range, 1%–10%).
Significant inter-observer variations suggest that there was a lack of consensus regarding the CTVwb, even among breast cancer specialists. Using the DEF-SEG method produced more consistent results and required less time. The DEF-SEG method can be successfully applied to patients with various body mass indexes.
Radiation plays an integral role in the modern management of breast cancer. Traditionally, radiation is delivered to the intact breast using opposed tangential fields. However, there can be considerable dose heterogeneity with this technique. Intensity-modulated techniques have been developed to overcome these problems (1–3). With the use of more conformal techniques, accurate delineation of the target volumes becomes increasingly important. This is problematic in breast cancer because there is currently no standard delineation of the whole breast. Hurkmans et al. found that intra- and inter-observer variations in delineation of the whole breast can be quite large (4). The use of volume-based planning with standard acceptable volume definitions would lend itself to more precise quality review and comparisons of treatment plans between institutions. In addition, with the increasing use of partial-breast irradiation, delineation of the whole breast has become more important because it permits determination of the amount of breast tissue to be spared (5).
This intra- and inter-observer variability in contouring the whole breast was taken into account in the National Surgical Adjuvant Breast and Bowel Project protocol B-39/Radiation Therapy Oncology Group protocol 0413 randomized study, which compared whole breast irradiation to partial breast irradiation. The protocol specified the whole breast contour to include all tissue within the tangent fields, excluding lung, recognizing that this is a non-anatomic description of the breast but one that would be more reproducible and would allow the process to be automated with treatment planning systems (6).
In addition to being subject to intra- and inter-observer variability, delineation of the whole breast can be a tedious and time-consuming task for the physician. Computer-assisted automatic segmentation is a potential solution for this problem. In this study, we proposed a segmentation technique using a deformable image registration method. In this technique, expert physicians define the treatment target on a model patient to create a reference atlas, and then the deformable image registration transfers the contours to another patient’s computed tomography (CT) image. Physicians can review these deformed contours and modify them as necessary. Our institution successfully implemented a similar approach in treatment planning for head and neck cancers (7, 8).
The goals of this study were (1) to investigate and compare inter-observer variations in delineating the whole breast for radiation treatment planning when using the deformable image registration-based breast segmentation method (DEF-SEG) versus contouring with standard tools from a commercial 3D treatment planning system and (2) to test the feasibility of applying the DEF-SEG method to a population of breast cancer patients with different body shapes.
This study was approved by our institution’s institutional review board. A multidisciplinary team consisting of radiation oncologists, a breast surgical oncologist, and a breast radiologist defined the whole breast clinical target volume (CTVwb) on a CT-dataset of a model patient (used as a template or atlas). All of the patients used in this study had CT simulation performed on an angled board with the arm of the involved side abducted and secured with a Vac-Lok immobilization cradle (MEDTEC, Orange City, IA). A large-bore helical CT scanner (LightSpeed RT, GE Medical Systems, Milwaukee, WI) was used for image acquisition. All images had a slice thickness of 2.5 mm. The axial resolution was 512 × 512, and the pixel size was 1.27 mm.
When defining the CTVwb for the model patient, we attempted to define anatomic boundaries with tissue intensity changes to allow for more reproducible results, although this was not always possible. The anatomic landmarks for defining the CTVwb included the following: (1) superior: mid-clavicular head, with modifications based on patient anatomy; (2) inferior: inferior portion of the xiphoid process, with modifications based on patient anatomy; (3) posterior: clavipectoral-latissimus fascia and anterior border of latissimus dorsi muscle, with modifications based on patient anatomy; (4) anterior: breast skin; (5) medial: medial border of sternum; (6) lateral: anterior portion of latissimus dorsi muscle, with modifications based on patient anatomy. Physicians were encouraged to use these guidelines when contouring the test patients; however, modifications based on patient anatomy and external metallic markers used in the CT simulation were permitted.
We used a previously developed deformable image registration algorithm based on the accelerated “demons” algorithm (9). This is an image intensity-based algorithm and it uses the CT numbers in the breast simulation images for automatic registration between the CT images of the model patient and an arbitrary patient case (the test patient). Because there may be drastic differences between the positions of the model patient and the test patient in the CT images, we started with a rigid registration between the two cases, using the sternum as a bony landmark. This registration does not need to be perfect, because the bony structure may not be identical in shape or orientation among different patients. However, this rigid registration brings the two cases close enough to start the deformable image registration, which accounts for the rest of the differences.
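For orientation, the classic demons update on which intensity-based methods of this kind build is often written as shown here. This is the textbook form and not necessarily the accelerated variant of reference (9). The symbols are assumptions introduced for illustration: $s$ is the intensity of the static (test patient) image at a voxel, $m$ is the intensity of the moving (model patient) image at the same voxel, and $\vec{u}$ is the displacement applied at that voxel in one iteration. Sign conventions vary with the direction in which the displacement is defined.

```latex
\vec{u} = \frac{(m - s)\,\nabla s}{\lvert \nabla s \rvert^{2} + (m - s)^{2}}
```

In this form the update vanishes wherever the two intensities already agree or the static image has no gradient, so the registration is driven entirely by intensity differences near image edges.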
In this project, we used a preliminary implementation of a parallel computing system, which consisted of three Intel-based personal computers with the new quad-core Intel processor (Intel Core 2 Extreme QX6700 Quad-Core Processor at 2.66 GHz). Another personal computer (Dual core, Intel Core 2 Extreme X6800 at 2.93 GHz) was used as a gateway to distribute computing jobs and connect with outside (user) computers. The algorithm to perform deformable image registration was the same, except that each CPU used the deformable registration algorithm on a small portion of the whole CT image in parallel, which greatly improved the calculation speed. After deformable image registration, the displacement vectors (transformation matrix) between the two CT images were used to map the contours from the model patient to the test patient (9), achieving the goal of auto-segmentation for the new patient.
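The contour-mapping step can be illustrated with a minimal sketch. It assumes, purely for illustration, a 2D displacement field stored as `disp[row][col] = (d_row, d_col)` that gives, for each model-image voxel, the offset to the matching location in the test image. The function name `map_contour`, the nearest-voxel lookup, and the toy field are hypothetical and are not taken from the study's software.

```python
def map_contour(points, disp):
    """Shift each (row, col) contour point by the displacement at its nearest voxel.

    Assumption: disp[row][col] holds the (d_row, d_col) offset from the model
    image to the test image. A production system would interpolate the field.
    """
    n_rows, n_cols = len(disp), len(disp[0])
    mapped = []
    for r, c in points:
        # Nearest-voxel lookup, clamped so points on the border stay in range.
        ri = min(max(int(round(r)), 0), n_rows - 1)
        ci = min(max(int(round(c)), 0), n_cols - 1)
        d_row, d_col = disp[ri][ci]
        mapped.append((r + d_row, c + d_col))
    return mapped


# Toy 4 x 4 field: every voxel moves 1 row down and 2 columns to the right.
disp = [[(1.0, 2.0) for _ in range(4)] for _ in range(4)]
model_contour = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
test_contour = map_contour(model_contour, disp)
```

A real deformation field varies from voxel to voxel, so each contour point would move by a different amount. The uniform field here only keeps the example easy to follow.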
We identified patients consecutively treated to the left breast at our institution. For this portion of the study, we chose the test patient by selecting a patient with a body mass index (28.3) closest to the median of the 10 patients. Auto-segmented contours were generated by the DEF-SEG software by mapping the template case (body mass index 31.0) to the test patient. Eight radiation oncologists (seven breast cancer radiation oncologists and one senior resident) modified the auto-segmented contours as necessary and then re-contoured the same case from scratch using a commercial 3D treatment planning system (Philips Pinnacle Treatment Planning System, Philips Medical Systems, Bothell, WA). When contouring from scratch, the external skin contour was first automatically generated using a tool in the commercial treatment planning system. The physician then used the treatment planning software’s editing tools to further contour the CTVwb. This technique, which is frequently used in our clinic, makes it easier to have a smooth contour line around the breast skin and typically saves about 4 minutes in contouring time compared to contouring the CTVwb without the help of the automatically extracted external skin contours. The time to edit the DEF-SEG-generated CTVwb and the time to contour the CTVwb from scratch were recorded for each physician.
All of the calculations were performed using MATLAB software (The Mathworks, Inc., Natick, MA). The breast contours were extracted from the treatment planning system and constructed as 3D binary objects using a polygon filling technique from the contour points. A spatial scoring map (probability map) was then created on the basis of how many physicians' contours included each voxel. The probability map thus created consisted of values from 1 to 8 for each voxel position (the value represents the number of physicians who included the voxel in the contours). Based on the probability map, an “average” surface representing the consensus of at least five physicians (the probability value was greater than or equal to 5) was constructed using thresholding and boundary detection techniques. This average contour represented a simple majority of five out of eight physicians (not necessarily the same five physicians) and was used to evaluate the variations between individual contours from different physicians. The distance between the average 3D surface and the 3D surface from each individual CTVwb was calculated using a Euclidean distance transformation method (10, 11) and was used to quantify the spatial deviations from the average surface. Because the distance measure between two surfaces is not a bijective function (12, 13), we calculated the Euclidean distance in both directions between the two surfaces and used the averaged value as the mean distance between the two surfaces. The mean, standard deviation, and maximum 3D surface-to-surface distances between the average 3D surface and the 3D surface of each individual CTVwb (for each physician) were calculated for each surface point. Subsequently, the variations of mean and maximum 3D surface-to-surface distances were calculated for all eight physician contours.
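The consensus and distance calculations described above can be sketched in a few lines. This is an illustrative Python version under stated assumptions, not the MATLAB code used in the study: volumes are represented as sets of integer voxel coordinates, the surface is taken as voxels with a 6-connected neighbour outside the volume, and a brute-force nearest-neighbour search stands in for the Euclidean distance transform. All function names and the toy box-shaped "contours" are hypothetical, and distances are in voxel units.

```python
from collections import Counter
from math import dist

NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def box(length):
    """Toy 'contour': a length x 3 x 3 block of voxels as a set of (x, y, z)."""
    return {(x, y, z) for x in range(length) for y in range(3) for z in range(3)}


def probability_map(masks):
    """For each voxel, count how many observers included it."""
    counts = Counter()
    for mask in masks:
        counts.update(mask)  # each mask is a set, so a voxel counts once per observer
    return counts


def consensus_volume(counts, min_votes):
    """Voxels included by at least min_votes observers."""
    return {voxel for voxel, votes in counts.items() if votes >= min_votes}


def surface(mask):
    """Voxels of the mask that have at least one 6-connected neighbour outside it."""
    return {v for v in mask
            if any((v[0] + dx, v[1] + dy, v[2] + dz) not in mask
                   for dx, dy, dz in NEIGHBOURS)}


def mean_directed_distance(a, b):
    """Mean distance from each point of surface a to its nearest point on surface b."""
    return sum(min(dist(p, q) for q in b) for p in a) / len(a)


def symmetric_mean_distance(a, b):
    """Average of the two directed mean distances, since they generally differ."""
    return 0.5 * (mean_directed_distance(a, b) + mean_directed_distance(b, a))


# Eight toy observers whose volumes differ only in extent along x.
observer_masks = [box(n) for n in (4, 5, 5, 6, 6, 6, 7, 8)]
votes = probability_map(observer_masks)
average_volume = consensus_volume(votes, min_votes=5)  # at least five of eight
average_surface = surface(average_volume)
deviations = [symmetric_mean_distance(surface(m), average_surface)
              for m in observer_masks]
```

To report distances in millimetres, the voxel coordinates would first be multiplied by the voxel spacing along each axis. The brute-force search is quadratic in the number of surface voxels, which is why a distance transform is preferable for clinical-size volumes.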
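In order to validate the automatically segmented volumes, we also calculated the Dice similarity coefficient (DSC) (14). For two segmentations, S1 and S2, the DSC is defined as the ratio of the volume of their intersection to their average volume:

$$\mathrm{DSC}(S_1, S_2) = \frac{\lvert S_1 \cap S_2 \rvert}{\tfrac{1}{2}\left(\lvert S_1 \rvert + \lvert S_2 \rvert\right)} = \frac{2\,\lvert S_1 \cap S_2 \rvert}{\lvert S_1 \rvert + \lvert S_2 \rvert}$$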
The DSC has a value of 1 for perfect agreement and 0 when there is no overlap.
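As a small illustration of this definition, the sketch below computes the DSC for two segmentations represented as sets of voxel coordinates. The set representation, the function name `dice`, and the convention of returning 1.0 for two empty segmentations are assumptions made for the example.

```python
def dice(s1, s2):
    """Dice similarity coefficient: intersection volume over average volume."""
    if not s1 and not s2:
        return 1.0  # assumed convention: two empty segmentations agree perfectly
    return 2.0 * len(s1 & s2) / (len(s1) + len(s2))


# Two toy segmentations of 10 voxels each that share 8 voxels.
seg_a = {(x, 0, 0) for x in range(0, 10)}
seg_b = {(x, 0, 0) for x in range(2, 12)}
overlap = dice(seg_a, seg_b)  # 2 * 8 / (10 + 10)
```

Because each voxel has the same volume, counting voxels is equivalent to measuring volume in this ratio.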
The template case was mapped to the previously described 10 consecutively treated patients using the DEF-SEG method. The test patient from the inter-observer variation study was included in this cohort. One physician modified the 10 deformed contours according to the guidelines of the template CTVwb. The body mass indexes of the patients ranged from 19.1 to 35.9. Image processing methods were used as described above to compute the 3D surface-to-surface distances and DSCs between the modified and unmodified deformed contours in the 10 test patients. The time to edit the DEF-SEG-generated CTVwb was recorded. This portion of the study was performed using a slower, research version of the treatment planning system, and the contouring task was performed over a network connection instead of at the console of a Pinnacle workstation. However, the relative times (for correcting the contours) for each case remained valid among the 10 cases. Therefore, each recorded time was scaled using the time the same physician required to contour the same test patient used in the inter-observer variation portion of the study on the faster, clinical version of the treatment planning software.
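One plausible reading of this scaling step is a single multiplicative factor derived from the shared test patient, as sketched below. The exact formula is an assumption (the text does not state it), and the function name and all numbers are made up for illustration.

```python
def scale_times(research_times, research_ref, clinical_ref):
    """Rescale times recorded on the slower system to the faster system.

    Assumption: the ratio of the same physician's times for the same test
    patient on the two systems applies equally to every other case.
    """
    factor = clinical_ref / research_ref
    return [t * factor for t in research_times]


# Hypothetical editing times in minutes on the research system.
research_times = [20.0, 30.0, 40.0]
# Hypothetical times for the shared test patient on each system.
scaled = scale_times(research_times, research_ref=30.0, clinical_ref=15.0)
```

A single factor preserves the relative ordering and ratios of the editing times across cases, which is the property the scaling relies on.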
The deformable image registration was completed in 36 seconds, including the time for network transfer of data sets from the user computer to the parallel computing system and retrieval of the results. A similar computing job would take approximately 3 minutes on a single CPU system. Figure 1a shows the contours for the model patient (top panel) and the contours mapped to the test patient after the deformable transformation (bottom panel). It illustrates that the test patient was quite different from the model patient both in body shape and body curvatures. Figure 1b shows more axial slices of the deformably mapped contours for the same test patient. The deformed contours for both the breast and the spinal cord matched well with the CT image of the test patient.
Figure 2 shows two definitions of breast contours using our in-house guidelines (red) and the National Surgical Adjuvant Breast and Bowel Project whole breast definition (green). The top panel shows the reference contours on the model patient in three views: axial (left), coronal (middle), and sagittal (right). The bottom panel shows the corresponding views of the deformed contours mapped by the deformable image registration algorithm. Although the two patients are different in size and shape, the corresponding anatomical landmarks in the breast contours matched very well, as indicated by the red arrows. Figure 2 illustrates that the method remains valid if the contours are defined differently on the model patient. The deformed contours follow the original definition and map it onto the corresponding anatomy of the test patient.
The overall average 3D spatial deviation from the average surface for the eight physicians was 1.0 mm (range, 0.4–1.5) when they edited the CTVwb from the DEF-SEG method versus 1.6 mm (range, 1.0–2.9) when they contoured the CTVwb from scratch (p = 0.049, paired-sample t-test). Details are listed in Table 1. The standard deviation of the distance from the mean surface was reduced from 2.9 mm among contours generated from scratch to 2.0 mm using the DEF-SEG method (p = 0.054). The average DSC relative to the consensus volume also improved from 0.92 (contouring from scratch) to 0.94 (editing the DEF-SEG-delineated contours) (p = 0.049).
Figure 3 illustrates the variations in physicians’ contours from the average (consensus) surfaces. The standard deviations, maximum, and mean 3D surface-to-surface distances were plotted for contouring the CTVwb from scratch (top row) and for editing the CTVwb using the DEF-SEG method (bottom row). The plots in Figure 3 were generated from a beam’s eye view from the inside of the patient looking out, as indicated by the yellow arrow in the bottom left corner. The greatest differences occurred in the posterior-lateral border, the superior and inferior ends, and along the breast-lung interface (Figure 4). Because we used an automatically generated external skin contour as a part of the breast contour, interobserver differences were smallest at the breast skin surface.
The median time required to edit the DEF-SEG-generated CTVwb was 12.9 min (range, 3.4–35.9 min), compared to 18.6 min (range, 8.9–45.2 min) to contour the CTVwb from scratch, even though contouring from scratch was done immediately after editing of the contours from the DEF-SEG method and the physicians might have learned from this previous contouring experience. On average, contouring from the deformed template was 30% faster than contouring from scratch (paired t-test: p = 0.028).
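As a quick check, and assuming the 30% figure refers to the two median times reported above, the relative saving works out as:

```latex
\frac{18.6 - 12.9}{18.6} = \frac{5.7}{18.6} \approx 0.306
```

That is, roughly 30% less time when editing the DEF-SEG contours than when contouring from scratch.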
The overall mean 3D spatial deviation between the unedited DEF-SEG CTVwb and the physician-edited CTVwb was 1.3 mm (range, 0.4–2.2) (Table 2). The data in Table 2 reflect the results of deforming a single model patient’s contours to 10 different patients. The 3D spatial deviations appear comparable to the inter-observer variations between different physicians for a single case (Table 1). The average DSC between the edited and unedited DEF-SEG-delineated volumes was 0.94, which is identical to the average DSC among the eight physicians when editing the DEF-SEG contours (DSC = 0.94). The deformed contours required an average editing of 5% of the contoured volume (range, 1%–10%), especially at the cranial-caudal ends and the posterior-lateral border, as seen in Figure 5. Although the results seem acceptable using a single patient as a template, we found that performance worsened when a patient’s BMI differed greatly from that of the template patient. Figure 6 plots the volume overlap index (DSC) against the absolute BMI difference from the template patient. There is a statistically significant deterioration as the difference becomes larger (p = 0.02; Pearson correlation coefficient = 0.71).
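The kind of analysis behind Figure 6 can be sketched as a Pearson correlation between the absolute BMI difference from the template and the DSC. The BMI and DSC values below are synthetic placeholders invented for the example, not the study's data. Only the template BMI of 31.0 comes from the text, and the function name `pearson` is hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


TEMPLATE_BMI = 31.0  # from the text

# Synthetic placeholder values, NOT the study's measurements.
patient_bmi = [20.0, 24.0, 28.0, 32.0, 36.0]
patient_dsc = [0.90, 0.92, 0.95, 0.96, 0.93]

bmi_difference = [abs(b - TEMPLATE_BMI) for b in patient_bmi]
r = pearson(bmi_difference, patient_dsc)
```

With this sign convention, an overlap that deteriorates as the BMI difference grows shows up as a negative coefficient. A significance test would additionally need the t distribution, which is outside this sketch.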
Our studies demonstrated that there were significant contour variations in the delineation of the whole breast, even among specialists. However, these variations were reduced when the contours were pre-segmented by the deformable registration software using a model patient with a consensus contour definition. We also demonstrated that the model patient’s contours could be successfully applied to different patients with different body shapes and simulation positions. The DEF-SEG delineated contours seem to maintain the same anatomical definition in the model patient. This technique allows the whole breast to be contoured for more consistent and quantitative analyses of dose coverage.
In addition to demonstrating that the computer-assisted contouring method improved consistency, we demonstrated that the computer-assisted method improved efficiency. There was a 30% improvement in the median time required to contour the CTVwb using the DEF-SEG program, compared to contouring from scratch. This is a conservative estimate, as all physicians edited the DEF-SEG CTVwb first and then contoured the CTVwb from scratch. By editing the DEF-SEG contours first, the physicians were more familiar with the patient anatomy and the contouring tools than when they contoured from scratch, and this may have actually reduced contouring time from scratch.
Treatment planning vendors may also improve manual contouring tools, which could speed up the manual contouring process. For example, Eclipse treatment planning software (Varian Medical Systems, Palo Alto, CA) has a “smart paintbrush” function, which allows faster contouring of the skin line. However, our atlas approach has the advantage of preserving the anatomy definition, which can make contouring more consistent among different physicians.
Determination of the template case CTVwb by breast cancer specialists, as done in this study, could allow for more consistency in multi-center trials in breast cancer patients in whom all or part of the breast is to be treated. Our results (in particular, the results shown in Figure 2) indicate that our deformed contours consistently matched the original definition of the breast target closely. As shown in Figure 2 (left column), the posterior-lateral border of the contoured breast was correctly mapped to the latissimus dorsi muscle (bright in intensity) as defined in the model patient. This feature, a result of the deformable image registration, is usually called a “topology-preserving” transformation, in which the relative relationship of contour shapes (topology) is approximately maintained. Figure 2 shows both the in-house-defined breast contours (in red) and the whole breast contours according to the National Surgical Adjuvant Breast and Bowel Project definition (in green) for the model patient and the test patient. The shapes and relative relationships of the contours remained the same in the model and test patients. This means that the same anatomy definition will be maintained across different patients. This consistency is an important feature, especially for large-scale clinical trials.
However, the significant inter-observer contour variations suggest that there was a lack of consensus regarding the CTVwb, even among breast cancer specialists. The greatest inter-observer variability occurred in the posterior-lateral border, as demonstrated in Figure 4. This is consistent with findings from the study by Hurkmans et al., in which the largest deviations in the whole breast planning target volume were in the posterior region (4). Both our study and the Hurkmans study support the need for uniform consensus guidelines in breast target delineation, and in fact the Radiation Therapy Oncology Group has formed a working group to discuss and resolve this issue (15).
In this study, we used a grayscale intensity-based deformable image registration algorithm. It takes advantage of well-defined CT numbers (Hounsfield units) to establish one-to-one correspondence between the model patient and the test patient. However, this algorithm requires sufficient image contrast to detect tissue structures in the breast or nearby anatomy. Where anatomical boundaries show no intensity change, the algorithm is less accurate because the one-to-one correspondence is difficult to establish. This is one of the main reasons that physicians should review and correct these computer-generated contours. Fortunately, our study shows that manual edits are not substantial (only about 5% of the breast volume). Overall, the method still saves approximately 30% of the time compared with contouring from scratch.
The deformable imaging program was able to map the template case CTVwb to 10 different patients with various body shapes and required average editing of only 5% of the contoured volume. Our results showed that there might be a relationship between body mass index and the physician-modified volume. The template patient had a body mass index of 31.0, and the patients with the lowest and highest body mass indexes (19.1 and 35.9, respectively) both required 7% of the volume to be modified by a physician. However, a patient with a body mass index of 28.6 required modification of 10% of the contour volume. We speculate that the success of contouring depends on both the individual shape of the breast and the difference in body mass index from the template patient. In the future, it may be helpful to have several template cases available to achieve the best match with a particular patient to be contoured.
In conclusion, we demonstrated an effective, template-based deformable image registration solution for delineating the CTVwb. This approach improved both consistency and efficiency, although physician editing is still necessary and appropriate. Additional studies will focus on the utility of this approach in patients requiring regional nodal irradiation and patients with gross disease at the time of treatment.
Supported in part by a grant from Varian Medical Systems, Palo Alto, CA and by National Cancer Institute grant T32CA77050.
CONFLICT OF INTEREST NOTIFICATION
None of the authors have a conflict of interest to declare.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.