Proc IEEE Int Symp Biomed Imaging. Author manuscript; available in PMC 2010 July 9.

Published in final edited form as:

Proc IEEE Int Symp Biomed Imaging. 2010 April 1; 2010: 209–212.

doi: 10.1109/ISBI.2010.5490376

PMCID: PMC2900817

NIHMSID: NIHMS214111

Baiyang Liu,^{†}^{‡} Lin Yang,^{‡}^{§} Casimir Kulikowski,^{†} Jinghao Zhou,^{§} Leiguang Gong,^{#} David J. Foran,^{‡}^{§} Salma J. Jabbour,^{§} and Ning J. Yue^{§}

**Abstract**

Accurate tracking of tumor movement in fluoroscopic video sequences is a clinically significant and challenging problem, owing to blurred appearance, unclear deforming shape, complicated intra- and inter-fractional motion, and other factors. Current offline tracking approaches are not adequate because they lack adaptivity and often require a large amount of manual labeling. In this paper, we present a collaborative tracking algorithm using asymmetric online boosting and an adaptive appearance model. The method was applied to track the motion of lung tumors in fluoroscopic sequences provided by radiation oncologists. Our experimental results demonstrate the advantages of the method.

**1. INTRODUCTION**

Accurate tracking of tumor movement in fluoroscopic video sequences is a clinically significant and challenging problem. Precise target positioning of lung cancer tumors is complicated by intra-fraction target motion: it has been well demonstrated that tumors located in the thorax may exhibit significant respiration-induced motion [1]. These physiologically related motions increase the target positioning uncertainty and have a two-fold impact on radiation treatment. First, the motions blur the images and increase the localization uncertainty of the target at the planning stage. Second, this increased uncertainty may cause the treatment volume to unnecessarily include some normal tissue, which may lead to higher than expected normal tissue damage.

Recent computer-aided methods for tracking the tumor can be categorized into three groups: (1) finding the tumor position based on external surrogates [2]; (2) tumor tracking with the help of fiducial markers implanted inside or near the tumor [3]; (3) tumor tracking without implanted fiducial markers. Optical flow [4] produces promising tracking results when the motion between adjacent frames is relatively small. Respiratory motion also complicates accurate tracking of tumors, making it necessary to apply adaptive trackers. Shape models of individually annotated tumors at different phases of respiration have been learned offline to achieve good tracking results [5]. A motion model with one-step-forward prediction was applied to reliably track the left ventricle in 3D ultrasound [6]. However, these methods require many expensive annotations and can only track tumors by utilizing learned shape or motion priors.

In this paper, we present an adaptive tracking algorithm for lung tumors in fluoroscopic image sequences using online learned collaborative trackers. Through the use of online updating, our method does not require a large number of manual annotations and can adjust to appearance changes adaptively. Adaptive classifiers are “taught” to discriminate the landmark points on the contour from other points and are incrementally updated throughout the tracking process. The appearance of the local window centered at each landmark point is modeled in a low-dimensional subspace. The algorithm was evaluated using two fluoroscopic sequences of lung cancer provided by radiation oncologists.

**2. ONLINE CONTOUR TRACKING**

The tumor contour is represented as a clockwise-ordered list of landmark points, denoted *C* = {*c*_{1}, *c*_{2}, …, *c _{n}*}, where *c _{i}* is the *i*-th landmark point. Let Λ_{t} be the contour state at time *t* and *Z*_{1:t} the observations up to time *t*. The contour posterior is estimated recursively as

$$p({\mathrm{\Lambda}}_{t}\mid {Z}_{1:t})\propto p({Z}_{t}\mid {\mathrm{\Lambda}}_{t})p({\mathrm{\Lambda}}_{t-1}\mid {Z}_{1:t-1})$$

(1)

${\mathrm{\Lambda}}_{t}^{\ast}={\mathit{argmax}}_{{\mathrm{\Lambda}}_{t}}p({\mathrm{\Lambda}}_{t}\mid {Z}_{1:t})$ is the estimate of the contour tracking result.

Assuming the landmark points are independent of each other, let *χ _{ti}* and *z _{ti}* denote the state and the observation of the *i*-th landmark point at time *t*. The contour posterior then factorizes as

$$p({\mathrm{\Lambda}}_{t}\mid {Z}_{1:t})\propto \prod _{i=1}^{n}p({\chi}_{{t}_{i}}\mid {z}_{{1}_{i}},{z}_{{2}_{i}},\dots ,{z}_{{t}_{i}}).$$

(2)

Tracking results for all landmark points are obtained by maximizing each posterior *p*(*χ _{ti}*|*z*_{1i}, …, *z _{ti}*), which is computed recursively by predicting

$$p({\chi}_{t}\mid {z}_{1:t-1})=\int p({\chi}_{t}\mid {\chi}_{t-1})p({\chi}_{t-1}\mid {z}_{1:t-1})d{\chi}_{t-1}$$

(3)

and updating

$$p({\chi}_{t}\mid {z}_{1:t})\propto p({z}_{t}\mid {\chi}_{t})p({\chi}_{t}\mid {z}_{1:t-1}).$$

(4)

The transition model *p*(*χ _{t}*|*χ*_{t−1}) describes the motion of a landmark point between consecutive frames. The observation model *p*(*z _{t}*|*χ _{t}*) is estimated collaboratively by a discriminative model and a generative model,

$$p({z}_{t}\mid {\chi}_{t})\propto {p}_{D}({z}_{t}\mid {\chi}_{t}){p}_{G}({z}_{t}\mid {\chi}_{t}),$$

(5)

where *p _{D}*(*z _{t}*|*χ _{t}*) is the discriminative confidence produced by the asymmetric online boosting classifier described in Section 3, and *p _{G}*(*z _{t}*|*χ _{t}*) is the generative likelihood produced by the dynamic appearance model described in Section 4.
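To make the recursion of Eqs. (3)–(5) concrete, here is a minimal particle-filter sketch for one landmark point. The random-walk transition model and the callables `observe_frame`, `p_d`, and `p_g` are illustrative assumptions standing in for the paper's transition model, boosting classifier, and appearance model; they are not the paper's exact implementation.

```python
import numpy as np

def track_landmark(particles, weights, observe_frame, p_d, p_g,
                   motion_std=2.0, rng=None):
    """One predict/update step of the per-landmark Bayesian filter.

    particles : (N, 2) candidate positions chi_t of one landmark point
    weights   : (N,) normalized posterior weights from the previous frame
    p_d, p_g  : vectorized likelihoods standing in for the discriminative
                and generative models of Eq. (5)
    """
    rng = rng or np.random.default_rng()

    # Prediction, Eq. (3): resample by the old posterior, then diffuse the
    # particles with a random-walk transition model (an assumption here).
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx] + rng.normal(0.0, motion_std, particles.shape)

    # Update, Eqs. (4)-(5): reweight by the collaborative observation
    # likelihood p(z_t | chi_t) = p_D(z_t | chi_t) * p_G(z_t | chi_t).
    z = observe_frame(particles)          # image evidence at each particle
    weights = p_d(z) * p_g(z)
    weights = weights / weights.sum()

    # Report the maximum-posterior particle as this frame's estimate.
    return particles, weights, particles[np.argmax(weights)]
```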

**3. ASYMMETRIC ONLINE BOOSTING**

We define training samples ${\{{x}_{i}\}}_{i=1}^{N}\in {R}^{d}$ and their labels ${\{{y}_{i}\}}_{i=1}^{N}\in \{-1,1\}$, where *y* = 1 denotes a landmark point and *y* = −1 a background point. A function *f*(*x*) : *R ^{d}* → *R* is learned online to discriminate the landmark points from the background.

Boosting constructs a strong classifier as a linear combination of *T* weak classifiers *h _{i}*(*x*):

$$H(x)=\mathit{sign}(f(x)),$$

(6)

$$f(x)=\sum _{i=1}^{T}{\alpha}_{i}{h}_{i}(x).$$

(7)

The weight *α _{i}* of each weak classifier *h _{i}*(*x*) is chosen to minimize the expected asymmetric loss

$$\epsilon (H)=\mathbb{E}(\mathit{ALoss}(H(x),y)),$$

(8)

where

$$\mathit{ALoss}(H(x),y)=\begin{cases}k^{1/2} & \text{if } y=1,\ H(x)=-1\\ k^{-1/2} & \text{if } y=-1,\ H(x)=1\\ 0 & \text{otherwise},\end{cases}$$

(9)

which is proven to be effective in [7]. This asymmetric loss can be integrated into an online boosting algorithm by multiplying the original weights *exp*(−*y _{i}* *f*(*x _{i}*)) by the asymmetric factor *k*^{*y _{i}*/2} implied by (9), so that missing a landmark point costs $\sqrt{k}$ times more than a false alarm when *k* > 1.

The asymmetric online boosting method used in our algorithm is summarized in Algorithm (1). The strong classifier *H*(*x*) is updated incrementally using the tracking results in the current frame. Each weak learner *h _{i}*(*x*) is updated with the newly labeled samples, and the weak learners with the best discriminative power are selected to form the strong classifier.
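Algorithm (1) is not reproduced in this manuscript version, so the following sketch shows one plausible realization of Eqs. (6)–(9): an Oza–Russell-style online boosting loop whose sample weights start at the asymmetric factor *k*^{*y*/2}. The running-mean stumps and the update constants are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

class AsymmetricOnlineBooster:
    """Sketch of asymmetric online boosting for landmark/background
    classification, Eqs. (6)-(9). The running-mean stumps and the
    Oza-Russell style weight propagation are illustrative assumptions,
    not a reproduction of the paper's Algorithm (1)."""

    def __init__(self, n_weak, n_features, k=4.0, seed=0):
        rng = np.random.default_rng(seed)
        self.k = k                                       # asymmetry factor, Eq. (9)
        self.feat = rng.integers(0, n_features, n_weak)  # one feature per stump
        self.mu = np.zeros((2, n_weak))                  # running class means [neg, pos]
        self.cnt = np.zeros(2)
        self.lam_sc = np.full(n_weak, 1e-3)              # weight of correctly classified samples
        self.lam_sw = np.full(n_weak, 1e-3)              # weight of misclassified samples
        self.alpha = np.zeros(n_weak)

    def _h(self, x):
        # Weak learner h_i(x) in {-1, +1}: vote for the class whose
        # running mean is closer to the sample's selected feature value.
        v = x[self.feat]
        return np.where(np.abs(v - self.mu[1]) < np.abs(v - self.mu[0]), 1.0, -1.0)

    def predict(self, x):
        # Strong classifier H(x) = sign(sum_i alpha_i h_i(x)), Eqs. (6)-(7).
        return np.sign(np.dot(self.alpha, self._h(x)))

    def _eps(self, t):
        # Weighted error estimate of weak learner t.
        return self.lam_sw[t] / (self.lam_sc[t] + self.lam_sw[t])

    def update(self, x, y):
        """Online update with one labeled sample, y in {-1, +1}."""
        h = self._h(x)
        c = int(y > 0)
        self.cnt[c] += 1
        self.mu[c] += (x[self.feat] - self.mu[c]) / self.cnt[c]
        # Asymmetric initial weight k^{y/2} from Eq. (9): a missed landmark
        # (y = +1) costs sqrt(k) times more than a false alarm when k > 1.
        lam = self.k ** (y / 2.0)
        # Oza-Russell style propagation: the weight grows on mistakes and
        # shrinks on correct predictions, so later learners see hard samples.
        for t in range(len(self.alpha)):
            if h[t] == y:
                self.lam_sc[t] += lam
                lam *= 0.5 / max(1.0 - self._eps(t), 1e-6)
            else:
                self.lam_sw[t] += lam
                lam *= 0.5 / max(self._eps(t), 1e-6)
            eps = min(max(self._eps(t), 1e-6), 1.0 - 1e-6)
            self.alpha[t] = 0.5 * np.log((1.0 - eps) / eps)
```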

**4. DYNAMIC APPEARANCE MODEL AND SUBSPACE LEARNING**

The online boosting classifier utilizes a limited number of online-selected weak learners with the best discriminative power. However, it remains challenging to discriminate landmarks with similar appearance, which are very common in fluoroscopic images. The dynamic appearance model handles this challenge by learning the target appearance incrementally.

Without loss of generality, we define *I* = (*I*_{1}, *I*_{2}, …, *I _{n}*) as the set of image patches, each being the local window centered at a landmark point in a previous frame.

Let Ω = {*μ*, *U*, Σ} serve as the appearance model for *I*, where *μ* is the mean, *U* contains the top *k* eigenbasis vectors, and Σ is a diagonal matrix containing the *k* largest singular values of *I*, ordered so that *λ*_{1} > *λ*_{2} > … > *λ _{k}*. For a given sample *I*_{*χt*}, its likelihood under Ω is the product of the distance-within-subspace,

$${p}_{U}({I}_{{\chi}_{t}}\mid \mathrm{\Omega})\propto \frac{\exp \left(-\frac{1}{2}{({I}_{{\chi}_{t}}-\mu )}^{T}U{\mathrm{\Sigma}}^{-2}{U}^{T}({I}_{{\chi}_{t}}-\mu )\right)}{{(2\pi )}^{k/2}\prod _{j=1}^{k}{\lambda}_{j}},$$

(10)

and distance-to-subspace,

$${p}_{\overline{U}}({I}_{{\chi}_{t}}\mid \mathrm{\Omega})\propto \exp \left(-{\Vert {I}_{{\chi}_{t}}-\mu -U{U}^{T}({I}_{{\chi}_{t}}-\mu )\Vert}^{2}\right).$$

(11)

The generative model *p _{G}* in (5) is estimated as

$${p}_{G}({z}_{t}\mid {\chi}_{t})={p}_{U}({I}_{{\chi}_{t}}\mid \mathrm{\Omega}){p}_{\overline{U}}({I}_{{\chi}_{t}}\mid \mathrm{\Omega}).$$

(12)
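A minimal sketch of Eqs. (10)–(12) follows, computing the generative score of one candidate patch under the subspace model. Constant factors are dropped since only relative likelihoods matter during tracking; the function name and argument layout are hypothetical.

```python
import numpy as np

def appearance_likelihood(patch, mu, U, lam):
    """Generative score p_G of Eq. (12) for a flattened image patch.

    mu  : (d,) mean of the training patches
    U   : (d, k) eigenbasis of the learned subspace
    lam : (k,) the k largest singular values (lambda_1 > ... > lambda_k)
    """
    d = patch - mu
    c = U.T @ d                          # coordinates within the subspace
    # Eq. (10): Mahalanobis-like distance-within-subspace term,
    # exp(-0.5 * (I - mu)^T U Sigma^{-2} U^T (I - mu)).
    p_within = np.exp(-0.5 * np.sum((c / lam) ** 2))
    # Eq. (11): distance-to-subspace term, the squared reconstruction
    # error of the component orthogonal to the subspace.
    p_ortho = np.exp(-np.sum((d - U @ c) ** 2))
    return p_within * p_ortho            # Eq. (12): p_G = p_U * p_Ubar
```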

Let *A* denote the observations in the previous *n* frames and *B* represent the most recent *m* frames. Incremental subspace learning [8] is performed to merge the new frames into the original subspace learned from *A*. With *A* = *U*Σ*V ^{T}* by singular value decomposition (SVD), let *B̂* denote the new observations and *B̃* an orthonormal basis for the component of *B̂* orthogonal to the subspace *U*. The concatenated data can then be decomposed as

$$[A\phantom{\rule{0.16667em}{0ex}}B]=[U\phantom{\rule{0.16667em}{0ex}}\stackrel{\sim}{B}]R\left[\begin{array}{cc}{V}^{T}& 0\\ 0& I\end{array}\right],$$

(13)

where
$R=\left[\begin{array}{cc}\mathrm{\Sigma}& {U}^{T}\widehat{B}\\ 0& {\stackrel{\sim}{B}}^{T}(\widehat{B}-U{U}^{T}\widehat{B})\end{array}\right].$ After we compute the SVD of $R=\stackrel{\sim}{U}\stackrel{\sim}{\mathrm{\Sigma}}{\stackrel{\sim}{V}}^{T}$, the new subspace is updated as *U*′ = [*U* *B̃*]*Ũ* with singular values Σ′ = Σ̃.
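This update can be sketched directly from Eq. (13), in the style of the incremental SVD of Ross et al. [8]; the forgetting factor and running-mean update described in [8] are omitted, and the function name is illustrative.

```python
import numpy as np

def incremental_svd(U, S, B, k):
    """Merge new mean-subtracted patches B (d x m) into the subspace
    described by U (d x k) and singular values S (k,), per Eq. (13)."""
    # Split B into its component inside the current subspace and an
    # orthonormal basis B_tilde for the part orthogonal to it.
    proj = U @ (U.T @ B)
    B_tilde, _ = np.linalg.qr(B - proj)
    # Assemble the small matrix R from Eq. (13) and take its SVD.
    R = np.block([
        [np.diag(S),                           U.T @ B],
        [np.zeros((B_tilde.shape[1], len(S))), B_tilde.T @ (B - proj)],
    ])
    U_r, S_new, _ = np.linalg.svd(R, full_matrices=False)
    # New subspace U' = [U  B_tilde] U_r, truncated to the top k components.
    U_new = np.hstack([U, B_tilde]) @ U_r
    return U_new[:, :k], S_new[:k]
```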

**5. EXPERIMENTAL RESULTS**

Two sets of fluoroscopic video sequences from one patient were collected to test our algorithm. The fluoroscopic images were acquired from a consented lung cancer patient who had right hilar T2-stage non-small cell lung cancer and was undergoing radiotherapy. These fluoroscopic images were saved in digital format and were readily available for video display and analysis. Each sequence lasts about 10 seconds and covers two to three respiration cycles. The first set is a posterior-anterior (PA) fluoroscopic video sequence, and the second is a lateral sequence. As discussed previously, quantified motion information is extremely important in the radiotherapy management of lung cancers. Using the proposed algorithm, an optimal plan and treatment strategy can be designed to provide the desired conformal dose coverage to a tumor target while sparing as much surrounding normal tissue as possible.

Experiments with the developed algorithm were conducted as follows. The initial contours for these sequences were labeled on a 3D CT by an experienced radiation oncologist. On the 3D CT images, the target (cancer tumor) could be clearly identified and delineated. Digitally reconstructed radiographs, along with the delineated contours, were projected along the PA and lateral directions. Based on the digitally reconstructed radiographs and the projected target, an experienced radiation oncologist manually identified and delineated the target in the first frame of each fluoroscopic video sequence. The tracking procedure was then started automatically. The results are shown in Figure 1. Sequence 1 contains 223 frames, and the results for frames 4, 74, 144, and 214 are displayed in Figure 1a. Sequence 2 contains 120 frames, and the results for frames 2, 42, 82, and 120 are displayed in Figure 1b. The tracking results were compared with optical flow using a 40 × 40 window. The results show that although each sequence exhibits different illumination and contrast changes due to the different acquisition directions, the algorithm we presented can provide reasonably accurate tracking of the tumors.

**6. CONCLUSION**

We have proposed an adaptive tracking algorithm for lung tumors in fluoroscopy using online learned collaborative trackers. No shape or motion priors are required for this tracking algorithm, which saves many expensive expert annotations. The experimental results demonstrate the effectiveness of our method. Instead of building a specific model, all the major steps in our algorithm are based on online updating. This adaptive online learning algorithm is therefore general enough to be extended to other medical tracking applications.

**References**

1. Giraud P, De Rycke Y, Dubray B, Helfre S, Voican D, Guo L, Rosenwald JC, Keraudy K, Housset M, Touboul E, Cosset JM. Conformal radiotherapy (CRT) planning for lung cancer: Analysis of intrathoracic organ motion during extreme phases of breathing. International Journal of Radiation Oncology Biology Physics. 2001;51:1081–1092.

2. Jiang SB. Radiotherapy of mobile tumors. Seminars in Radiation Oncology. 2006;16:239–248.

3. Tang X, Sharp GC, Jiang SB. Fluoroscopic tracking of multiple implanted fiducial markers using multiple object tracking. Physics in Medicine and Biology. 2007;52(14):4081–4098.

4. Xu Q, Hamilton RJ, Schowengerdt RA, Alexander B, Jiang SB. Lung tumor tracking in fluoroscopic video based on optical flow. Medical Physics. 2008;35(12):5351–5359.

5. Xu Q, Hamilton RJ, Schowengerdt RA, Jiang SB. A deformable lung tumor tracking method in fluoroscopic video using active shape models: A feasibility study. Physics in Medicine and Biology. 2007;52(17):5277–5293.

6. Yang L, Georgescu B, Zheng Y, Foran DJ, Comaniciu D. A fast and accurate tracking algorithm of left ventricles in 3D echocardiography. International Symposium on Biomedical Imaging. 2008:221–224.

7. Viola P, Jones M. Fast and robust classification using asymmetric AdaBoost and a detector cascade. Advances in Neural Information Processing Systems. 2002;14:1311–1318.

8. Ross D, Lim J, Lin RS, Yang MH. Incremental learning for robust visual tracking. International Journal of Computer Vision. 2008;77(1):125–141.
