|Home | About | Journals | Submit | Contact Us | Français|
This paper details an algorithm to simultaneously perform registration of computed tomography (CT) and cone-beam computed (CBCT) images, and image enhancement of CBCT. The algorithm employs a viscous fluid model which naturally incorporates two components: a similarity measure for registration and an intensity correction term for image enhancement. Incorporating an intensity correction term improves the registration results. Furthermore, applying the image enhancement term to CBCT imagery leads to an intensity corrected CBCT with better image quality. To achieve minimal processing time, the algorithm is implemented on a graphic processing unit (GPU) platform. The advantage of the simultaneous optimization strategy is quantitatively validated and discussed using a synthetic example. The effectiveness of the proposed algorithm is then illustrated using six patient datasets, three head-and-neck datasets and three prostate datasets.
Cancer radiotherapy aims to deliver a prescribed radiation dose to a cancerous target region while sparing surrounding critical structures and normal tissues. Conventional cancer radiation therapy designs a treatment plan based on a snapshot of the patient’s anatomy, such as the snapshot obtained from CT, and uses the treatment plan for the entire treatment course minimal to no replanning. However, a patient’s body is a dynamically evolving system. The inter-fractional variation of a patient’s geometry during a long treatment course may seriously compromise the plan’s quality. To overcome this disadvantage, adaptive radiation therapy (ART) has been proposed (Yan et al., 1997; Wu et al., 2002; Birkner et al., 2003). ART seeks to redesign the treatment plan based on a patient’s evolving anatomy, acquired through daily in-room imaging modalities (e.g., CBCT), and hence maintain optimality of the updated plan. In such a process, manual delineation of cancerous target and organs at risk on a daily CBCT replanning regimen is impractical because of its labor intensive nature. Automatic structure delineation utilizing deformable image registration (DIR) has been proposed as a feasible auto-segmentation approach given the initial organ segmentations available from the planning CT (Chao et al., 2008; Samant et al., 2008; Xie et al., 2008; Godley et al., 2009; Paquin et al., 2009). Treatment plans can conveniently transfer the organ contours from the planing CT to CBCT by establishing voxel correspondences between the initial planning CT image and newly acquired CBCT imagery.
Current clinical practice mainly utilizes CBCT for the purpose of treatment setup, i.e. to position the patient in the same way that the CT was taken. There are more demanding applications, such as tumor contouring, electron density conversion and dose calculation (Hatton et al., 2009), that require high-quality CBCT images with accurate CT numbers. However the shading artifacts in CBCT images, as a result of scattered radiation and beam hardening effects, produce errors in CT numbers and lead to object contrast loss (Niu and Zhu, 2011). Many shading or scatter correction algorithms for CBCT have been proposed. Among the literature (Colijn and Beekman, 2004; Zhu et al., 2006; Star-Lack et al., 2009; Niu and Zhu, 2010), a relative simple idea is to warp CT to match the anatomical features in CBCT by DIR. Then the intensity of CBCT images can be corrected either in image space (Marchant et al., 2008) or in the projection space (Niu et al., 2010). However these approaches turn the CT-to-CBCT DIR and image enhancement of CBCT into a “chicken-and-egg” type of problem.
This paper proposes an algorithm that simultaneously performs the registration and CBCT enhancement. The registration algorithm relies on a viscous fluid model by D’Agostino et al. (2002), which offers the freedom to naturally incorporate an intensity correction term into the registration framework. We assume that the intensity difference between CT and CBCT after alignment is spatially smooth and thus regularize it with the H1 semi-norm. An iterative procedure alternates between updating the deformation fields to align CT and CBCT geometrically, and performing intensity fitting to photometrically match CBCT to CT. The additional intensity fitting term improves the registration results. The proposed algorithm is implemented on a graphics processing unit (GPU) to accelerate this computationally heavy task.
The remainder of this paper is organized as follows. The overview of image registration methods is present in Section 2. Section 3 is devoted to the joint registration and image enhancement model and its iterative equations. Section 4 presents the results of our algorithm on one synthetic example in order to demonstrate the presumed benefits of the joint optimization. Furthermore, six representative CT/CBCT datasets of head-and-neck data and prostate data are registered and enhanced. Conclusions and discussions are provided in the final section.
Deformable image registration between CT and CBCT is a challenging problem. One may consider it as an image registration problem between mono-modal images (Horn and Schunck, 1981; Thirion, 1998; Vercauteren et al., 2009; Gu et al., 2010). However, the conventional strategy of using sum-of-squared-differences (SSD) of intensity values between two images as a similarity measure performs poorly, or may even fail, since corresponding points in CT and CBCT do not necessarily attain the same intensity value. One way to circumvent this intensity inconsistency issue is to include an intensity calibration step. For example, Hou et al. (2011) utilize histogram matching to unify the CT and CBCT intensities, followed by application of the Demons algorithm (Thirion, 1998) which is designed for mono-modal registration. A more sophisticated intensity correction method present by Guimond et al. (2001) considers a polynomial relationship of the intensity correspondences between CT and CBCT images. A modification of polynomial intensity correction is made in (Ou and Chefd’Hotel, 2009). The work of Nithiananthan et al. (2011) integrates a tissue-specific intensity correction technique into the registration process, with subsequent application of the Demons algorithm. In a similar topic of aligning an atlas to the image space of magnetic resonance images (Pohl et al., 2005), a set of structure-dependent registration parameters are estimated simultaneously with classification and intensity correction in a statistical framework. However, the relationship between CT and CBCT is neither intensity-dependent nor material-dependent. In Fig. 1, the ring-shaped shadows of two CBCT images in axial view indicate that the intensity difference between CT and CBCT largely depends on the pixel location, which makes the aforementioned approaches sub-optimal.
An alternative perspective regards CT–CBCT registration as a special case of multimodal registration. In a multimodal scenario, some similarity measures are available, such as correlation ratio (Roche et al., 1998), cross-correlation (Gonzales and Woods, 2002) and rank-correlation (Figl et al., 2010). In particular for CT–CBCT registration, there is a sizeable number of work using normalized cross-correlation (NCC), such as (Schreibmann et al., 2006; Yang et al., 2007; Greene et al., 2008, 2009; Lu et al., 2010, 2011). The underlying rationale of these methods is that the intensities of CT and CBCT are related by a linear transformation. However, in the contexts where severe artifacts presents in CBCT, e.g. those shown in Fig. 1, such relationship does not hold and hence NCC may not give satisfactory results. In more general multimodal registration scenario, mutual information (MI) is widely used. It relies on the intensity distributions of two images instead of their intensity values directly. Since the seminal work of Viola and Wells (1997) and independently of Maes et al. (1997), many approaches are based on MI-inspired similarity measures. Please refer to the survey paper by Pluim et al. (2003) and more recent works (Razlighi et al., 2009; Yi and Soatto, 2009; Loeckx et al., 2010; Chappelow et al., 2011). One drawback of MI is the tendency to disregard local consistency in order to match the two intensity distributions globally.
On the assumption that one source of intensity differences between CT and CBCT is a result of slow, or smooth, spatial variation, we propose a similarity measure that is a linear combination of both MI and SSD. The latter similarity score is to account for local information and will also be augmented to perform intensity correction. Furthermore, to enforce the spatial smoothness, the intensity correcting SSD score will be regularized with the H1 semi-norm. We apply the intensity correction term to the CBCT, in essence matching the CBCT to the deformed CT photometrically. A similar approach of intensity correction is examined by Myronenko and Song (2010). There, the registration is formulated in the mono-modality setting. We find that including both MI and SSD is helpful to register CT and CBCT, with the intensity correction of CBCT strengthening the registration results.
We denote a given CT image by and the corresponding CBCT image by . The goal of deformable image registration, defined here, is to compute a geometric transformation mapping the points in I2 onto the corresponding set of points in I1. The choice of applying the transformation to the CT imagery arises from the fact that CT has better image quality than CBCT. Therefore, deforming CT to match CBCT should cause less interpolation errors than the other way around. The deformed, or registered, CT image will be denoted by .
The goal of image enhancement is to estimate an intensity correction term such that and are as close as possible. The intensity corrected image, denoted by , should have better image quality than . Note that the shadow areas in Fig. 1 make discrimination difficult. It is reasonable to assume that is globally smooth, as the CBCT contaminations caused by scatter and bowtie-filter are spatially smooth. Thus, given the input pair of images (I1; I2), the goal of our algorithm is to output another pair of images, which are the geometrically aligned image and the intensity corrected image . The typical approach to computing the transformation and intensity correction relies on establishing a similarity measure to quantify how close the two input image volumes are to each other. Next, the transformations and that maximize the similarity are computed through an optimization process. The remainder of this section details our joint algorithm.
Mutual information (MI), when used as a similarity measure, is computed from the joint histogram of two images. Under such a similarity model, the less dispersed the joint histogram is, the better the two images are aligned (Pluim et al., 2003). The definition of the MI between I1 and I2, given the transformation , is:
where is the joint histogram of the deformed image and the target image . The joint histogram is computable using a 2D Parzen window to approximate the δ function in the overlap region V,
The marginal distributions and p(i2) in (1) are the probability density distributions of and , respectively. The arguments i1 and i2 range over the image intensity values.
MI measures the similarity between two images using global statistical properties from the entire image domain. Accordingly, MI tends to disregard strong neighborhood information in an attempt to increase the overall image similarity, thus producing significant local errors in the estimated transformation. Considering that the image modalities between CT and CBCT are not drastically different, we include the L2 distance between and I2 + a in the similarity measure. The purpose of this inclusion is to incorporate information regarding local intensity matching to guide the registration process. Incorporation of the two objectives is performed through a convex combination of their associated similarity scores plus a regularizing term, arriving at the objective function to be minimized,
where 0 ≤ λ ≤ 1. Notice that the parameter λ should be chosen carefully to control the relative weights between the first two terms, as SSD and MI are usually at different orders of magnitude.
Ignoring the contributions of the intensity correction term a, the model reduces to the intensity-based model for mono-modality registration with λ = 1, while it becomes the MI-based multimodal DIR model at λ = 0. is a regularization term with respect to the intensity correction a. Without the regularization, i.e., μ = 0, there would be too much freedom for a and the registration results would be unreliable due to overfitting of the intensity values. The regularization term is chosen as the H1 semi-norm to ensure global smoothness of a. The smoothness of the vector field is not explicitly modeled in the objective function, but is inherently modeled in a viscous fluid model.
We adopt a viscous fluid model (D’Agostino et al., 2002) to compute the optimal deformation. The model assumes that the deformation is governed by the Navier–Stokes equation of viscous fluid motion. Mathematically, it is expressed as
The deformation velocity is related to by
where is the Jacobian of with respect to the coordinates . The vector function is a force field that drives the deformation in the appropriate direction.
This viscous fluid approach can easily take into account the intensity correction through the force field. In fact, the objective function and the force term are related. Taking the variation of E with respect to gives the expression for the force term , that is,
where is the variation of (1) with respect to ,
To numerically solve the Navier–Stokes equation (4), successive over-relaxation (SOR) is typically applied as in Christensen et al. (1996), which is computationally expensive. A filter-based method is derived by Bro-Nielsen and Gramkow (1996) on account of the Greens function as a fundamental solution to the fluid equation. In particular, the velocity field is obtained by a convolution filter applied to each component of the force field , i.e., , where Φs is the 3D Gaussian kernel with width s (in voxels)
Implementation of the optimization algorithm requires for the associated PDEs to be discretized. With regards to , at each iteration k, the deformation field is updated by
where and the time step δt is chosen adaptively for each iteration through
with δu equal to the maximal voxel displacement allowed in one iteration.
To preserve the topology of the deformed template, re-gridding is performed whenever there exists such that the Jacobian of reaches below a positive threshold (e.g., 0.5). As long as the determinant of Jacobian is larger than this pre-determined value, the deformation field is guaranteed to be invertible, and thus the topology is preserved. Re-gridding is performed in a way that the current deformed image is set to be a new template and the incremental deformation field to be zero. The total deformation is the composition of the incremental deformation fields associated with each propagated template.
To simultaneously optimize the intensity correction, we adopt a gradient-descent method with regards to optimization of a. In particular, computing the gradient flow to minimize the energy (3) with respect to a leads to the following PDE,
The fitting term a and the deformation field are coupled in an alternating manner. For each k, we use ak as initial value and evolve this PDE for several iterations to get ak+1. A semi-implicit discretization scheme is used to make the algorithm stable. The update on a improves the L2 distance, e.g., the intensity matching, between the two images as part of the similarity measure. The improved intensity matching smoothly aligns the support of the two image distributions and p(i2), thus aiding the registration process.
Furthermore, we employ a multi-resolution scheme to increase speed and robustness. Specifically, we construct an image pyramid by down-sampling the images by a factor of two in each dimension. We then perform registration at the level of a low-resolution grid. The deformation field obtained at the low resolution level is then interpolated to become the initial conditions for the higher-resolution computation.
Lastly, we have implemented our algorithm on the GPU. Registration of volumetric data in medical image processing is a computationally intensive task. Computational time is a critical factor that limits the adoption and applications of registration in clinical practice. As the structure of image registration algorithms is compatible with GPU compute architecture, a range of DIR algorithms have been developed on the GPU (Sharp et al., 2007; ur Rehman et al., 2007; Samant et al., 2008; Muyan-Ozcelik et al., 2008; Gu et al., 2010). The papers (Shams et al., 2010; Fluck et al., 2010) survey GPU implementations of DIR.
In summary, a flow chart in Fig. 2 illustrates our joint registration and intensity correction algorithm.
We provide experiments to demonstrate (1) the linear combination of two similarity measures outperforms any individual, (2) efficiency and robustness of the proposed method for both multimodal deformable registration and intensity correction on one synthetic example and six real patient datasets and (3) the influence of parameters. For image registration, we compare the linear combination of SSD and MI with SSD only, MI only, cross-correlation (CC) and residual complexity (RC) as in the recent work of Myronenko and Song (2010). For intensity correction, we compare with the state-of-the-art scatter removal method (Niu et al., 2012) in CBCT imaging.
We synthesize a CT image by imposing smooth deformation flows with maximal displacement d on the Shepp-Logan phantom, while the CBCT image is created by adding a round-shape shadow and Gaussian white noise with standard deviation s to the phantom. The simulated CT and CBCT images are illustrated in Fig. 3a and b. Our proposed method, the combination of SSD and MI, is compared with the methods using SSD only, MI only, CC and RC. CC assumes a linear relationship between two image modalities, while RC is formulated in a mono-modal setting. Therefore, CC and RC perform similar to SSD in this case. MI, as a similarity measure for multi-modal registration, has two noticeable deficiencies in registration: the boundaries are not as sharp nor as uniform as they should be (such as for the white outer region), and the small center element has lost some of its area to the larger circle near it. This is because these regions are indeed mono-modal in nature. On the other hand, MI performs better than SSD, CC and RC on the region of left black oval, where the intensities of two images are significant different and not related by a linear function. The combination of SSD and MI has the benefits of both methods while mitigating the drawbacks of using single of them.
To quantitatively compare the accuracy of these registration methods, percentage of false positives (PFP),
is evaluated, where A denotes the ground truth, B is the deformed region and |·| computes the cardinality of each set. Table 1 lists the PFP of three particular regions, as illustrated in Fig. 4. The deformed regions are computed using various similarity measures, i.e., SSD + MI, SSD, MI, CC and RC, under different maximal displacement d and noise level s. SSD, CC and RC are all better than MI if PFP is computed on the white region, where these similarity measures provide a stronger force than MI to deform one image towards the other. On the gray or black regions, the MI registered image is a stronger performer than those are registered by the other similarity measures. As indicated by Table 1, the combined SSD + MI approach is not the best in each case, but it has a better overall performance.
Our algorithm is applied to- and validated on- six patient cases (three head-and-neck datasets and three prostate datasets). Table 2 lists dimension and voxel resolution of each dataset along with the computational time using GPU. For head-and-neck (A) and (B), prostate (A), the patients were first scanned in a 4-slice GE Light-Speed RT scanner (GE Healthcare, Milwaukee, WI) for treatment plan purposes and then in the on-board imager system on a Varian Trilogy linear accelerator (Varian Medical System, Palo Alto, CA) before each treatment. For head-and-neck (C), prostate (B) and (C), the planning CT scans were taken on a 16-slice Philips Brilliance Big Bore CT simulator (Philips Healthcare Systems, Andover, MA) with the helix scan mode. The CBCT scans were performed on the patients at treatment time using the on-board imager system installed on the Varian Clinac 23IX radiation therapy machine (Varian Medical System, Palo Alto, CA). We first perform a rigid registration so that CT and CBCT have same orientation and voxel size. With the high computing power of a GPU and the carefully designed algorithmic structure tailored for this platform, it only takes 48 s to register a pair of images with size 512 × 512 × 150. The time is reduced to 25 s if no intensity correction is included. Incorporation of intensity correction nearly doubles the computational time. This phenomenon is expected since the degrees of freedom (DOF) of the function a is on the order of the number of voxels in the images. The large DOF consumes more memory and increases the computational cost of the algorithm.
In order to quantitatively evaluate registration performance, we asked a physicist to manually delineate some organs in both CT and CBCT images. In particular, the left parotid from the head- and-neck (C) and the bladder from the prostate (A) are examined. In CT–CBCT registration, one wants to deform CT to match CBCT geometrically. Therefore, the label map of an organ in CBCT image can be regarded as ground-truth. After applying deformation field to the label map of the same organ in CT image, PFP of two label maps is computed. Table 3 demonstrates that the linear combination of SSD and MI is better than using SSD or MI only. Notice that in Table 3 PFP of prostate (A) is much smaller than the one of head- and-neck (C) since the bladder has smaller motion than parotid.
As exact ground-truth is not always available in real data, we compare registration algorithms using edge maps. Let B, C be two 3D volumetric images. We first apply a Canny edge detection algorithm on every 2D axial slice of B and C. Let K be the third dimension of the volumetric data. This implies Canny edge detection is applied K times, each producing Bk, Ck for k = 1, … , K. We define an error percentage (EP) of two images B and C as follows,
EP measures the percentage of the number of mismatched pixels between the edge maps of CBCT and the warped CT by certain registration algorithm. A smaller EP implies a better result of this registration algorithm. We compare SSD + MI with similarity measures, MI only, CC and RC, in terms of EP in Table 4. The achieved smallest value in each case in our method indicated the improved registration accuracy from the aspect of EP. As shown in Fig. 5, Canny edge detection has some defects, such as not getting consistent yet correct edge maps among corresponding slices. The error percentage based on edge maps results in a heuristic comparison between two images.
Visually a checkerboard display technique serves to demonstrate the performance of the joint registration and intensity correction algorithm. In this technique, two images to be compared are merged together in a checkerboard pattern, where black regions are filled by the first image and the white regions are filled the second. This display method highlights the mismatch between anatomical structures in the two images at the transition zone. Ideally, when two images are fully aligned, there is little to no difference at the boundaries between checkerboard squares. Misalignment and miscorrection of intensity will result in discontinuities along edges and enhance perception of the checkerboard pattern, respectively. We use the notation C(I1, I2) for the checker-board fusion of images I1 and I2.
Figs. 6--1111 show the sagittal, axial and coronal views of all patient datasets before and after our algorithm is applied. Our algorithm takes the input images I1 (CT) and I2 (CBCT) and outputs the deformed CT image and intensity corrected CBCT image . The middle column in Figs. 6--99 is to compare the deformed CT to the original CBCT, which validates that our algorithm works in the geometrical alignment of features between two images, and not simply to match intensity. In Fig. 10 and and11,11, we also plot deformation fields overlaid on deformed CT images.
As for three head-and-neck datasets, the anatomical features, such as jaw and spine, align. Notice that we resize the results in superior-interior direction corresponding to the physical space, which results in interpolation errors in sagittal and coronal views, especially in the region of spines as in Fig. 7. Furthermore, the CBCT has the same contrast as CT given the lack of intensity differences in the checkerboard display. There is a small region in the sagittal view of Fig. 6 where CT and CBCT do not match. The patient opens mouth during the CT scan, and closes during the CBCT scan. Consequently, CT and CBCT may have different topological characteristics around the mouth. The assumption of diffeomorphic deformation flow is invalid in the region where topology changes. It is difficult to design registration algorithms that can handle topological changes. This is the future direction we would like to pursue.
We demonstrate the intensity correction or scatter removal part of our algorithm. For two head-and-neck datasets (B) and (C) in Figs. 12 and and13,13, the intensity of shoulder regions is recovered after correction. Two prostate datasets (A) and (B), shown in Figs. 14 and and15,15, are more challenging for intensity correction, as their CBCT images contain a large amount of noise. The corrected CBCT is the sum of the original CBCT and an intensity correction term . If is smooth, then the noise in the original CBCT image will cause the output CBCT to be noisy. On the other hand, we can tune parameter μ so that the correction term is less smooth. But this approach will lead to intensity overfitting in the sense that the corrected CBCT images are heavily biased towards the CT images. Please refer to the discussion on parameter μ in Section 4.4 for more details. Our algorithm enables the corrected CBCT image to have the similar contrast to the CT image, while also removing the shadow. Notice that noise still prevails in the corrected CBCT images. Denoising is beyond the scope of this paper and will be explored in the future.
In Fig. 15, we also compare our method with the state-of–theart scatter removal method (Niu et al., 2012). This approach registers the CT to the CBCT image, and hence primary signals in CBCT scan can be estimated via forward projection on the CT image. The shadow in the axial view, caused by the scatter signals, can be removed using the low-pass filter on the difference between the estimated and the raw CBCT projections. However, the residual scatter and increased noise after scatter correction result in streaking artifacts as shown in the middle column of Fig. 15. Also the truncation errors in the projection yield the sacrum corrupted and incomplete in the coronal view of their reconstruction.
We have two parameters in the algorithm. λ is a weighting factor in the linear combination of SSD and MI, while μ balances between the similarity measure and the intensity fitting. This section is devoted to discussion on the influence of these two parameters on our algorithm.
We first compare our algorithm for different values of parameter λ in (3) without incorporation of the intensity correction term (and its associated smoothing cost). The case λ = 0 corresponds to a MI-based multimodal DIR model and the case λ = 1 corresponds to an intensity-based algorithm for mono-modality registration. The value of λ is indicated in the superscript of the output registered image in Fig. 16. The combination of SSD and MI outperforms using them individually. Ideally the optimal λ is chosen according to the difference of histograms of CT and CBCT images. In practice, we find our algorithm is not sensitive for λ (0, 1) when we rescale the image intensities of CT and CBCT to be [0, 1] so that SSD and MI to have the same magnitude.
In order for the L2 distance in the energy (3) to provide sensible output, it is better to update the intensity correction function a iteratively in (11). To demonstrate the role of a and its optimization, we consider three cases as follows. In Fig. 17a, the correction is fixed a = 0 and not updated. In Fig. 17b, the correction is updated but not applied in the checkerboard test. Lastly, in Fig. 17c the correction is updated and applied in the checkerboard test. Comparing the left two columns, we find that iterative correction of a assists with the registration. Edge continuity is improved for the intensity corrected iterations (top part of zoomed in center region). The bright region of the upper image (in the top row) also has sharper edges for the intensity corrected version. Both of these observations validate that our algorithm works to align the geometrical features between two images, and not simply to match intensity. Moving to the right-most column, which applies the intensity correction, it is clear that the two images have been properly transformed to match. Edges are continuous and intensity variation is smooth over uniform regions. These improvements demonstrate that intensity correction is important for both CBCT image enhancement and deformable CT–CBCT registration.
In Fig. 18 we show the influence of the parameter μ in (3) on the performance of the intensity correction portion of our algorithm. Larger values of μ correspond to a stronger assumption of smooth-ness of the fitting term a. Smaller values of μ lead to intensity overfitting, in the sense that the corrected CBCT images are heavily biased towards the CT images. For example, in Fig. 18, there is a black hole in the center of the corrected CBCT image which inherits from the CT image, while the hole in the bottom left is completely removed from the original CBCT image. On the other hand, small μ yields better image quality of the reconstructed CBCT in consonance with CT. Therefore, there is a trade-off regarding how to choose the parameter μ. For all of our experiments, we use μ = 100.
In this paper, we have presented an algorithm to simultaneously register CT to CBCT and enhance CBCT image quality. In particular, we combine the similarity measure for registration and the L2 distance for intensity correction into a viscous fluid model so that our algorithm can jointly achieve two goals. As for the registration cost functions, SSD is better in mono-modality, while MI is better in multi-modality. We use a linear combination of the two cost functions for the proposed similarity measure, which is better than using single one of them to register CT and CBCT. In order to make SSD a reasonable metric, CBCT intensity is updated according to the deformed CT image, which in turn helps to improve the registration results. Our algorithm is implemented on the GPU, which only takes 45 s to register a pair of 512 × 512 × 150 images.
Before the end of this paper, we would like to provide discussions on a set of relevant issues that could be of interest to readers.
First of all, our work aims at solving the registration problem between a CT image and a CBCT image contaminated by severe artifacts caused by e.g. scatter and bowtie filter. While improved registration results are obtained over other competing methods, the significance of our method depends on the severity of these artifacts in clinical practice. In fact, with the rapid advances in CBCT technologies over the past few years, artifacts of these kinds have been mitigated to a large extent. We also believe that the ultimate solution to the CT–CBCT registration problem should be the improvement of CBCT image quality, so that the intensity inconsistency problem between these two modalities are negligible and conventional registration algorithms can robustly deliver accurate results. Nonetheless, at the current stage, there are a large number of old generations of CBCT machines used in current clinical practice, on which artifacts of our focus still exist to a certain level. The severity also varies depending on the CBCT technology on those machines and the satisfactory level of machine calibration and maintenance. The image artifacts from those machines largely hinder the use of CBCT for advanced clinical applications, e.g. adaptive radiotherapy. Before those old CBCT machines are replaced, our method offers a practical solution to conduct CT-CBCT registration and facilitate clinical tasks.
Second, it is also of interest to discuss the potential benefit of having the intensity modified CBCT image. Actually, the benefits rely on the accuracy of registration. Under ideal situation with perfect registration results, the deformed CT image will have correct anatomical structures as well as accurate CT numbers. Hence, current clinical practice in radiation therapy that relies on CT numbers, e.g., dose calculation on CBCT, can be conducted using the deformed CT image instead. In this regard, the intensity modified CBCT image does not provide much clinical benefit over the deformed CT image. However, the registration accuracy is not always satisfactory in practice. For instance, there could be mismatch of structures between the deformed CT and CBCT due to insufficient number of iterations or local minima of registration algorithms. The intensity corrected CBCT image will offer benefits in dose calculations than the uncorrected one, as the systematic CT number inaccuracy are eliminated to a certain extent. Yet, since it is required to convert the HU values to electron density before dose calculation in radiotherapy via a calibrated conversion curve, another possible approach without correcting CBCT HU values is to use a HU to electron density conversion curve that is calibrated against the original CBCT image with incorrect HU values. However the effectiveness of such a method depends on the type and the amount of artifacts. In some situations, e.g. when the ring artifact due to bow-tie appears, voxels with the same HU number in CT at the center or at the periphery of the image would attain different HU numbers in CBCT. The electron density cannot be correctly interpreted by a single conversion curve. Our method will attain its advantage over the approach of using a new conversion curve. However, given the fact that MeV beam dose calculation is not quite sensitive to small inaccuracy of electron density, whether or not a single electron density conversion curve can deliver sufficiently accurate dose calculation results in the presence of this ring artifact is subject to further investigation, especially when the ring artifacts are mild.
Last, but not least, we would like to point out that, although improved results have been observed over some competing methods, these improvements in this preliminary study are not conclusive and are subject to future investigations. Regarding the CBCT intensity correction, more quantitative studies are needed. Our model may also be adjusted to yield more accurate results. In fact, one key assumption of our method is the smoothness of the correction term a. Yet, in the high contrast regions, the correction term varies largely. Hence a single smoothness term regardless spatial locations may lead to oversmoothed correction term and make the correction insufficient. On the other hand, the improved registration accuracy is only demonstrated indirectly from a few measures, such as PFP. To rigorously demonstrate the accuracy, more investigations, e.g. comparing with manual landmarks, are needed.
There are also a few issues to be studied in future. First, SSD and MI are usually at different orders of magnitudes. This may cause difficulties when tuning the parameter λ. In this paper, we consider to normalize image intensity to [0,1] and we find empirically our algorithm is not sensitive to λ. An alternative is to use normalized similarity measures. For example, it is worth trying the combination of normalized cross-correlation and normalized mutual information in the future. Second, other forms of similarity metrics may be included in our model depending on the contexts. For instance, the aforementioned NCC could be used to replace the SSD term in our model, when a linear relationship between CT and CBCT intensities holds. Third, it is also desirable to design a parametric formula for the intensity correction term a. For example, inspired on the ring-shape shadow, a maybe a function of radius to the center. The parametric form of a could largely reduce its degree of freedom, and thus a higher computational efficiency can be achieved. The validation of the parametric description of a, however, requires further investigations.
This project was supported by grants from the National Center for Research Resources (P41-RR-013218) and the National Institute of Biomedical Imaging and Bioengineering (P41-EB-015902) of the National Institutes of Health. This work was also supported by the NIH grant R01 MH82918 as well as a grant from AFOSR. This work is part of the National Alliance for Medical Image Computing (NA-MIC), funded by the National Institutes of Health through the NIH Roadmap for Medical Research, Grant U54 EB005149. Dr. X. Jia would like to acknowledge the support from the Early Career Award from Thrasher Research Fund. Dr. T. Niu and Dr. L. Zhu are supported by Georgia Institute of Technology new faculty start-up funding and the NIH under the Grant No. 1R21EB012700-01A1. We thank NVIDIA for providing GPU cards on which this work is performed. Information on the National Centers for Biomedical Computing can be obtained from http://nihroadmap.nih.gov/bioinformatics.