Techniques that originate in computer graphics and computer vision have found prominent applications in the medical domain. In this paper, we combine techniques from computer graphics and computer vision with domain knowledge from medicine to develop an image guided surgical system for medialization laryngoplasty. The technical focus of this paper is the registration of preoperative radiological data to the intraoperative anatomical structure of the patient. After careful analysis of the real-world surgical environment, we developed an ICP-based partial shape matching algorithm to register the partially visible anatomical structure to the preoperative CT data. We extract distinguishable features from the surface of the human thyroid cartilage and apply image space template matching to find the initial guess for the shape matching. Experimental results show that our feature-based partial shape matching method performs better and is more robust than the original ICP-based shape matching method. Although this paper concentrates on the medialization laryngoplasty procedure, the generality of our methods makes them suitable for future applications in other areas of image guided surgery.
Many techniques originally developed in computer graphics and related fields have been adopted in a variety of application areas. Among them, image guided surgery is one of the most rapidly growing. In this paper, we show how image guided medialization laryngoplasty can be improved using computer graphics and computer vision techniques.
Medialization laryngoplasty is a surgical procedure to improve the voice function of patients with vocal fold paresis and/or paralysis. During the procedure, the surgeon implants a uniquely configured structural support lateral to the paretic vocal fold through a window cut in the thyroid cartilage. The implant supports the vocal fold by moving it into a more medial position. The failure rate of this procedure is as high as 24% even for experienced surgeons.1 The objective of this research is to improve the outcome of the procedure and to reduce the revision rate by providing image guided surgery tools.
While various computer graphics and vision techniques have been successfully applied to image guided surgery, our main focus is the registration problem. One of the major challenges in an image guided surgery system is to accurately register the preoperative radiological data to the intraoperative anatomical structure of the patient. Several constraints in this surgical procedure make registration difficult: (1) the system must introduce minimal intrusion into, or modification of, current surgical practice; (2) the delicate anatomy of the thyroid cartilage must be registered during surgery to the preoperative 3D CT data; and (3) the system must be implemented with only a moderate increase in equipment. For medialization laryngoplasty, the thyroid cartilage has ossified into a rigid framework that is only minimally deformable.4 The rigidity and the distinctive features of the cartilage’s surface inspired us to perform surface-based registration in the image guided system.
During the surface-based registration process, we encounter the partial surface occlusion problem. In most cases, only part of the thyroid cartilage surface is exposed for the surgical operation. The Iterative Closest Point (ICP) matching algorithm is a well-known technique in computer vision and computer graphics for matching two point clouds. In this paper, we present a method based on structured light surface reconstruction and ICP-based partial shape matching to register the intraoperative surface of the larynx to the preoperative 3D CT data.
The contribution of this paper is the application of basic knowledge from computer graphics, computer vision, and human anatomy to provide a new registration method for image guided surgery. Basic techniques such as structured light-based surface reconstruction, feature-based shape matching, and volume visualization have been integrated into the image guided surgery application in close collaboration with surgeons. Although this paper is specialized for medialization laryngoplasty, the generality of the approach, which uses a series of 2D images, makes this basic technology suitable for future applications in other vision-based surgical techniques.
The rest of this paper proceeds as follows. The next section presents background and related work. In the section on image guided medialization laryngoplasty, we provide an overview of the system and describe some important implementation details. We also present experimental results for the surface reconstruction and the feature-based partial shape matching method. In the final section, we provide conclusions and future directions.
In this section, we present domain knowledge related to laryngoplasty and summarize previous work that is technically related to our image guided surgery system.
Medialization laryngoplasty (Figure 1) is a kind of thyroplasty procedure, which aims at medializing the membranous portion of the vocal fold. In thyroplasty, a surgical opening is created in the lamina of the thyroid cartilage for the purpose of implanting a permanent structural support that pushes the injured vocal fold medially to approximate the normal vocal fold of the opposite side during phonation. Optimal voice outcomes depend most on the exact placement of the implant relative to the position of the underlying vocal fold, while sub-optimal voice outcomes and high revision rates reflect the significant challenges inherent in the thyroplasty procedure. The major challenge is determining the optimal implant configuration and placing the implant accurately during surgery.
Registration in image guided procedures can be classified into three categories based on the fiducial markers: extrinsic invasive, extrinsic noninvasive, and intrinsic markers. Extrinsic invasive markers are usually fixed to the patient’s large bones,2,3 while extrinsic noninvasive markers are attached to the patient’s skin.4 In the case of laryngoplasty, bone-fixed fiducial markers could damage the thin laryngeal cartilage, and skin-affixed markers may move significantly relative to the laryngeal cartilage. Intrinsic markers are anatomical and geometric landmarks. In our system, these intrinsic markers are used for the registration: surfaces, contours, and points in the CT data set are registered onto the corresponding features captured with intraoperative imaging modalities. Surface-based registration is a research area in image guided surgery in which the exposed anatomical structure is modeled and registered to the preoperative CT or MRI volume.5 Although surface-based registration has mainly been performed on large anatomical structures such as skulls,6,7 knees, and hip joints, we demonstrate its feasibility in image guided laryngoplasty.8,9 The distinctive features of the larynx and thyroid cartilage make it possible to perform surface-based registration for image guidance.
To globally align multiple 3D point sets of surfaces, the ICP algorithm calculates a rigid transformation matrix by minimizing the mean square error (MSE) between the closest point pairs.10 Various techniques including point sub-sampling, outlier removal, k–d tree and error minimization methods have been introduced to accelerate the ICP matching process.11–14 However, if the initial estimation is not accurate, the ICP-based shape matching method usually falls into local minima. To avoid local minima, some researchers applied several perturbations in the initial conditions to select the best results.15 In our case, since we can only acquire partial surface information from the intraoperative stage, the initial pose estimation is critical for the final registration accuracy.
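The alternation that ICP performs can be written compactly. The notation below is introduced here only for illustration and does not appear in the cited works: $p_i$ are the $N$ source points, $q_j$ the target points, $c(i)$ the index of the current closest target point, and $(R_k, t_k)$ the rigid transform at iteration $k$. Each iteration first updates the correspondences and then re-estimates the transform that minimizes the mean square error over the matched pairs:

```latex
c(i) = \arg\min_{j} \left\lVert R_k\, p_i + t_k - q_j \right\rVert^2,
\qquad
(R_{k+1},\, t_{k+1}) = \arg\min_{R \in SO(3),\; t \in \mathbb{R}^3}
\frac{1}{N} \sum_{i=1}^{N} \left\lVert R\, p_i + t - q_{c(i)} \right\rVert^2 .
```

Because the correspondences $c(i)$ depend on the current estimate $(R_k, t_k)$, a poor starting pose can lock the iteration onto wrong pairs, which is why the initial estimate matters so much when only a partial surface is available.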
Recently, shape feature-based registration has drawn great interest from researchers. Feature points are selected using shape descriptors, including global harmonic shape descriptors,16 light field descriptors,17 integral volume descriptors,18 curvature-based local descriptors,19 and integral invariant multi-scale surface characteristics.20 A priority-based search can then be performed to match the surface features between two geometric models.21 For medical applications, Gilles et al.22 introduced a method to automatically model musculoskeletal structure using multi-resolution constraint-based segmentation and to analyze shape variations with high-level shape descriptors. In the case of laryngoplasty, we selected the laryngeal prominence and the ridge between the left and right sides of the thyroid cartilage as the most distinctive features. For the partial shape matching, we propose a simplified shape descriptor to detect the laryngeal prominence and the ridge on the thyroid cartilage surface, and we quickly calculate the initial pose using template matching in image space. At the final stage, the standard ICP method refines the registration.
Our image guided system consists of five functional components: preoperative surface generation from CT data, stereo camera calibration, structured light-based surface reconstruction, shape feature-based ICP matching, and visualization. First, a preoperative CT is taken of the patient’s larynx, and the thyroid cartilage surface is extracted from it using the marching cubes algorithm. Second, a stereo camera pair is calibrated using a 2D planar homography-based calibration method. The stereo camera calibration parameters and the image rectification matrix are used later for surface reconstruction. Third, a structured light-based surface reconstruction algorithm reconstructs the exposed thyroid cartilage surface, which serves as the intraoperative surface. Fourth, the preoperative surface is registered to the intraoperative surface with the ICP-based partial shape matching algorithm. Finally, the registration information is used to place the preoperative CT dataset under the intraoperative surface, and both are visualized simultaneously on the monitor. The suggested optimal location of the implant, calculated by a surgical planning system, can be displayed on the preoperative CT. Technical issues and their solutions are presented in the following sections.
For preoperative surface generation, the standard marching cubes algorithm23 is used to extract the thyroid cartilage surface. The 3D CT data set is displayed as a volume rendered image, and the user can interactively change the iso-surface value to obtain the thyroid cartilage surface. To conduct experiments on the surface reconstruction and registration process, a phantom model and its corresponding CT data set are required. Since it is difficult to obtain an actual-scale phantom model of the laryngeal cartilage, we use a reverse engineering approach to build a phantom model from the Visible Human CT data set (Figure 2). The extracted 3D surface model is converted to a solid CAD model and sent to a 3D prototyping device (Stratasys FDM 3000). The prototyper is capable of constructing a 3D phantom model with an accuracy of 0.1 mm.
Structured light-based surface reconstruction requires a light projection device (an LCD projector) and one or more cameras. In our case, an LCD projector is used with two cameras. For the camera calibration, we used the planar homography-based camera calibration method.24 After calibration, the images from the two cameras are rectified to align the horizontal scan lines. We propose a surface reconstruction method based on Gray code multi-line shifting patterns. The Gray code patterns separate the illuminated area into small regions, each indexed by a unique Gray code. Multiple high intensity lines are then projected onto the surface to provide illumination peaks, with exactly one high intensity line in each Gray code region.
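The region indexing described above can be illustrated with a few lines of Python. This is a minimal sketch of reflected binary Gray coding only, not the projection code of the system; the function names and the choice of three bits are assumptions made for the example.

```python
def gray_encode(n: int) -> int:
    """Return the reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_decode(g: int) -> int:
    """Invert gray_encode by XOR-ing together all right shifts of the code."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def stripe_patterns(num_bits: int) -> list[list[int]]:
    """Build one on/off row per projected pattern.

    Row k holds bit k (most significant first) of the Gray code of every
    region index, so patterns[k][r] says whether region r is lit in frame k.
    """
    regions = range(1 << num_bits)
    return [
        [(gray_encode(r) >> (num_bits - 1 - k)) & 1 for r in regions]
        for k in range(num_bits)
    ]


def decode_region(bits_seen: list[int]) -> int:
    """Recover the region index from the bits one pixel observed over all frames."""
    code = 0
    for bit in bits_seen:
        code = (code << 1) | bit
    return gray_decode(code)


patterns = stripe_patterns(3)
# Bits observed over the three frames by a pixel that lies inside region 5.
observed = [row[5] for row in patterns]
print(observed, decode_region(observed))  # prints: [1, 1, 1] 5
```

Neighboring regions differ in exactly one bit, so a thresholding error at a region boundary shifts the decoded index by at most one region, which is the usual reason for preferring Gray codes over plain binary codes in structured light.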
The key challenge in the Gray code multi-line shifting method is to resolve the ambiguity and mislabeling problems that originate from image segmentation, shadows, and occlusions. We tackle these problems with two-pass dynamic programming. In the first pass, since we know the ground truth projection pattern, each multi-line shifting image is matched to the ground truth pattern sequence with dynamic programming. After this first pass, we obtain the illumination peak sequences that robustly match the ground truth pattern sequence. If only the robustly matched points were used to calculate the 3D positions, many data samples would be lost, so a code error recovery procedure is designed to include the missed samples. In the second pass, we sort all the sub-pixel peak positions from the multi-line shifting images and apply dynamic programming again. The second pass maximizes the chance of sequential matching in both the spatial and the temporal domain.
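As a rough illustration of the first pass, the sketch below aligns the labels detected along one rectified scan line with the known projected order using a longest-common-subsequence dynamic program. The choice of LCS as the matching score, the function name `align_to_pattern`, and the integer labels are all assumptions made for this example; the actual cost function of the two-pass method is not specified in the text above.

```python
def align_to_pattern(detected: list[int], expected: list[int]) -> list[tuple[int, int]]:
    """Align detected peak labels with the projected sequence.

    Returns (detected_index, expected_index) pairs of an order-preserving
    alignment with the maximum number of matching labels (an LCS).
    """
    n, m = len(detected), len(expected)
    # best[i][j] = size of the best alignment of detected[i:] with expected[j:]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if detected[i] == expected[j]:
                best[i][j] = 1 + best[i + 1][j + 1]
            else:
                best[i][j] = max(best[i + 1][j], best[i][j + 1])

    # Walk the table from the start to recover one optimal alignment.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if detected[i] == expected[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif best[i + 1][j] >= best[i][j + 1]:
            i += 1  # skip a detected label (e.g. a mislabeled peak)
        else:
            j += 1  # skip an expected label (e.g. a peak lost in shadow)
    return pairs


expected = [0, 1, 2, 3, 4, 5]  # known order of the projected lines
detected = [0, 1, 7, 2, 4, 5]  # 7 is a spurious label, line 3 was not seen
print(align_to_pattern(detected, expected))
```

In this example the spurious label 7 is left unmatched and the gap at line 3 is skipped, so the remaining peaks keep their correct indices. Samples left unmatched in such a pass are the kind of data that a later recovery step would try to reinstate.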
The 3D surface from the preoperative CT data and the point clouds from the intraoperative stage need to be registered. ICP-based shape matching works well if the whole thyroid cartilage can be captured by the structured light-based surface reconstruction. However, in a typical medialization laryngoplasty procedure, only part of the thyroid cartilage surface is exposed for the surgical operation (Figure 3).
To account for these partially visible cases, we propose a partial shape matching method based on surface features of the thyroid cartilage. The shape of the thyroid cartilage varies between individuals and between sexes, but some common features, such as the laryngeal prominence and the ridge between the left and right sides of the thyroid cartilage (Figure 4), are reliable for shape registration. For the partial shape matching, we detect these distinctive surface features and perform a search in image space to find a proper initial guess for the ICP method. With this initial estimate, the ICP-based shape matching converges to the correct global minimum.
In the first step, we construct a mesh structure from the point cloud of the partially visible thyroid cartilage by simply connecting the closest neighboring points. A Poisson mesh reconstruction algorithm is then used to generate a smooth triangular mesh with normal vectors. In the second step, to detect the prominentia laryngea (laryngeal prominence) and the ridge, we use the k-means clustering algorithm25 to cluster the normal vectors into two groups. The median normal vector of each cluster is used to identify the distinctive surface features. This normal clustering method is applied both to the partial surface from the intraoperative stage and to the 3D surface from the preoperative CT. In the third step, to estimate a proper initial guess, we apply a template matching method in image space, which is an acceleration technique that replaces an exhaustive search in 3D space. We approximate the surface curvature with a simplified term: the cross product between the vertex normal and the median normal of the surface features. For each triangular mesh, the color of each vertex is set to this approximated surface curvature value. The 3D mesh from the CT and the partial surface from surface scanning are rendered into the OpenGL frame buffer. We perform template matching on the two images, and the resulting template matching position is the initial guess for the shape matching. Finally, the original ICP method is used to calculate the refined shape matching result. Since we have estimated a proper initial condition, the ICP matching quickly converges to the correct solution. The proposed feature-based shape matching algorithm is described in more detail in the flow diagram (Chart 1). For the ICP method itself, we use a balanced k–d tree to reduce the search time for the closest point matching. After the closest points are calculated, we reject a pair as an outlier if its closest distance is greater than two times the mean closest distance (Figure 5).
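Two of the numerical rules in this procedure are simple enough to sketch directly: the curvature approximation used for vertex coloring and the outlier rejection rule applied inside ICP. The sketch below is illustrative only. It assumes that the scalar stored per vertex is the magnitude of the cross product (the text does not say how the vector is reduced to a color value), and the function names are hypothetical.

```python
import math

Vec3 = tuple[float, float, float]


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def curvature_proxy(vertex_normal: Vec3, median_normal: Vec3) -> float:
    """Magnitude of vertex_normal x median_normal (assumed scalar reduction).

    For unit normals this equals the sine of the angle between them, so it is
    0 where the surface faces the same way as the feature's median normal.
    """
    return math.sqrt(sum(c * c for c in cross(vertex_normal, median_normal)))


def reject_outliers(pairs: list[tuple[Vec3, Vec3]], factor: float = 2.0) -> list[tuple[Vec3, Vec3]]:
    """Drop closest-point pairs farther apart than factor times the mean distance."""
    if not pairs:
        return []
    distances = [math.dist(p, q) for p, q in pairs]
    threshold = factor * sum(distances) / len(distances)
    return [pair for pair, d in zip(pairs, distances) if d <= threshold]


# Four well-matched pairs (distance 1) and one stray pair (distance 10).
pairs = [((float(i), 0.0, 0.0), (float(i), 1.0, 0.0)) for i in range(4)]
pairs.append(((9.0, 0.0, 0.0), (9.0, 10.0, 0.0)))
kept = reject_outliers(pairs)
print(len(kept))  # prints: 4
```

Here the mean closest distance is 2.8, so the threshold is 5.6 and only the stray pair is discarded. Because the threshold is recomputed from the current pairs, it tightens automatically as the alignment improves over the iterations.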
One problem in image guided visualization systems is to seamlessly deliver a localized detailed view within the global context of the patient. As the amount of information available to a surgeon and the interdependencies within it increase, it becomes more important to let the surgeon explore the dataset for the information of interest, filter out irrelevant information, and represent relationships flexibly. In image guided medialization laryngoplasty, this kind of exploration is natural, since the surgeon searches for the optimal point of entry by exploring the CT dataset in relation to the external imagery. To achieve focus plus context visualization, we use multi-pass volume rendering with different transfer functions for the global context volume and the local region of interest volume (Figure 6). We also provide an interactive virtual endoscope fly-through that allows surgeons to examine the vocal fold and implant locations (Figure 7).
Our image guided medialization laryngoplasty system includes an Intel Xeon 3.2 GHz workstation with 4 GB of memory, two JAI M8-CL cameras, an Epson EMP 1705 LCD projector, and an NDI Polaris Spectro optical tracker (Figure 8). The two cameras are controlled by a Matrox Helios CL frame grabber in dual base mode. The frame grabber can capture 7 frames/second from each camera at an image resolution of 1600 × 1200. The structured light pattern is generated by OpenGL rendering and sent to the LCD projector. The two cameras and the LCD projector are synchronized for the structured light-based surface scanning. The preoperative and intraoperative surfaces of the thyroid cartilage are registered with the shape feature-based ICP method. After registration, the optical tracker is used to track medical devices intraoperatively for visual feedback. The image guided system is developed in Microsoft Visual Studio 2005 with the support of standard libraries including MFC, OpenGL, OpenCV,26 Matlab C++ Library,27 and VTK.28
We applied the structured light-based surface reconstruction method to various models, including the phantom model, male and female larynges dissected from cadavers, a full cadaver model, and an animal bone. We used a sphere model with a radius of 27.5–28.5 mm to verify sub-millimeter accuracy. The reconstructed point cloud has a radius of 28.0375 mm with a mean absolute error of 0.1483 mm and a standard deviation of 0.1948 mm. Figure 9 shows the 3D reconstruction results for the various models. Three cadaveric studies were performed to test the surface reconstruction method on real human anatomy. An animal bone with a bloody surface and specular highlights was used to simulate a real surgical area. The surface reconstruction results show that the proposed method works well in all of these situations.
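One common way to obtain a radius and error statistics of this kind from a reconstructed point cloud is an algebraic least-squares sphere fit followed by a summary of the radial residuals. The sketch below shows that approach under stated assumptions: the text does not say which fitting procedure was used, the function names are hypothetical, and the six sample points are synthetic rather than measured data.

```python
import math
import statistics


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_sphere(points):
    """Algebraic least-squares sphere fit; returns (center, radius).

    Model: x^2 + y^2 + z^2 = 2*cx*x + 2*cy*y + 2*cz*z + d, with d = r^2 - |c|^2.
    Needs at least four non-coplanar points.
    """
    rows = [(2 * x, 2 * y, 2 * z, 1.0) for x, y, z in points]
    rhs = [x * x + y * y + z * z for x, y, z in points]
    # Normal equations (A^T A) u = A^T b for u = (cx, cy, cz, d).
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(4)]
    cx, cy, cz, d = solve_linear(ata, atb)
    return (cx, cy, cz), math.sqrt(d + cx * cx + cy * cy + cz * cz)


def radial_error_stats(points, center, radius):
    """Mean absolute radial error and standard deviation of the signed residuals."""
    residuals = [math.dist(p, center) - radius for p in points]
    mean_abs = sum(abs(r) for r in residuals) / len(residuals)
    return mean_abs, statistics.pstdev(residuals)


# Synthetic points lying exactly on a sphere of radius 28 centered at (1, 2, 3).
samples = [(29.0, 2.0, 3.0), (-27.0, 2.0, 3.0), (1.0, 30.0, 3.0),
           (1.0, -26.0, 3.0), (1.0, 2.0, 31.0), (1.0, 2.0, -25.0)]
center, radius = fit_sphere(samples)
print(center, radius, radial_error_stats(samples, center, radius))
```

With real scan data the residuals would not vanish, and the two returned numbers would play the role of the mean absolute error and standard deviation quoted above.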
Our feature-based partial shape matching method has been tested on three different data sets: the phantom model built from the NIH Visible Human data, and larynges from a male and a female cadaver. In some cases, as shown in Table 1, our feature-based partial shape matching method converges to the correct solution while the original ICP method results in an incorrect solution. Figures 10–12 show the correct shape matching results from our method and the incorrect results from the original ICP-based shape matching. Even in the cases where the original ICP method reaches a correct solution, our method accelerates the overall shape matching process by 35% on average (Table 1). For the fourth female case in Table 1, the feature-based shape matching shows a slight increase in overall execution time because the partial shape and the full thyroid cartilage surface have similar center of mass positions. The overhead of curvature rendering and template matching is the main reason for this increase.
In this paper, we presented an image guided surgery system for medialization laryngoplasty. Our research team consists of laryngoplasty surgeons and computer graphics engineers. Through close collaboration, the domain knowledge and requirements of the surgeons were analyzed and realized in the final system. In addition, comments on the prototypes provided a positive feedback loop throughout the development phase. The overall process is a good example of interdisciplinary research involving computer graphics and medical professionals.
After analyzing the real-world surgical environment, we designed our system to register partially matched intraoperative images even in the presence of occlusions. Based on domain knowledge from laryngoplasty, we extracted distinguishable features from the human thyroid cartilage. By combining this feature extraction process with technical solutions from computer graphics and computer vision, we accelerated the overall registration process by 35% on average in the cases where the original ICP method also converged (Table 1).
At this time, our image guided system has been tested with phantom models and several cadaver models. As a step toward applying our method in the clinical setting, we performed validation experiments with a phantom model. Point-based registration using an optical tracker is one of the standard registration methods in image guided surgery, so we used the optical tracker to measure the target registration accuracy of our feature-based shape registration method. The average target registration error is 2.0502 mm, which is within the clinical requirement. Our future work is to perform clinical trials involving real patients. From the technical point of view, our variation of the ICP algorithm for partially matching cases and our image-space acceleration techniques can be extended to more general cases in various applications.
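A target registration measurement of this kind is usually computed by mapping target points through the estimated rigid transform and comparing them with the positions reported by the tracker. The sketch below shows that arithmetic only; the function names, the row-list matrix layout, and the sample coordinates are assumptions for illustration and are not the validation data.

```python
import math


def apply_rigid(rotation, translation, point):
    """Return rotation @ point + translation for a 3x3 rotation given as row lists."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def mean_target_registration_error(rotation, translation, ct_targets, tracked_targets):
    """Mean distance between registered CT targets and their tracked positions."""
    errors = [
        math.dist(apply_rigid(rotation, translation, p), q)
        for p, q in zip(ct_targets, tracked_targets)
    ]
    return sum(errors) / len(errors)


# Hypothetical registration: 90 degree rotation about z, then a shift along x.
rotation = [[0.0, -1.0, 0.0],
            [1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0]]
translation = (10.0, 0.0, 0.0)
ct_targets = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tracked_targets = [(10.0, 1.0, 2.0), (9.0, 0.0, 0.0)]  # first target is 2 units off
print(mean_target_registration_error(rotation, translation, ct_targets, tracked_targets))
# prints: 1.0
```

The targets should be points that were not used to compute the registration itself, so that the result reflects accuracy at locations of surgical interest rather than the residual of the fit.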
This work was supported by the National Institutes of Health with a grant R01-DC007125-01B1 to develop computer-based tools for medialization laryngoplasty.
Ge Jin received the B.S. degree in computer science from Peking University, China, in 1997 and the M.S. degree in computer science from Seoul National University, Korea, in 2000. He received his Doctor of Science degree in computer science from the George Washington University, USA, in 2007. He is currently an Assistant Professor in the Department of Computer Information Technology and Graphics at Purdue University Calumet. He was a postdoctoral research scientist in the Department of Computer Science at the George Washington University from 2007 to 2008. His research spans the fields of computer graphics, computer animation, medical visualization, medical image processing, and image guided surgery. He is a member of ACM SIGGRAPH, MICCAI, and SPIE.
Nakhoon Baek is an Associate Professor in the School of Electrical Engineering & Computer Science at Kyungpook National University, Korea. During 2008, he is visiting the Department of Computer Science at the George Washington University. His research interests include image-guided surgery, medical image processing, graphics algorithms, and real-time rendering. He received his B.A., M.S., and Ph.D. in Computer Science from Korea Advanced Institute of Science and Technology (KAIST) in 1990, 1992, and 1997, respectively.
James K. Hahn is currently a full Professor in the Department of Computer Science at the George Washington University, where he has been a faculty member since 1989. He is the founding director of the Institute for Biomedical Engineering and the Institute for Computer Graphics. His areas of interest are medical simulation, image-guided surgery, medical informatics, visualization, and motion control. He received his Ph.D. in Computer and Information Science from the Ohio State University in 1989, an M.S. in Physics from the University of California, Los Angeles in 1981, and a B.S. in Physics and Mathematics from the University of South Carolina in 1979.
Steven Bielamowicz, MD, FACS is Professor and Chief of the Division of Otolaryngology and Director of the Voice Treatment Center at The George Washington University. He received his medical degree from the Baylor College of Medicine and completed a residency in Otolaryngology-Head and Neck Surgery at UCLA. He completed a Neurolaryngology fellowship at the NIH and has an interest in the neurologic basis of voice disorders. Dr Bielamowicz evaluates and treats patients with voice and swallowing disorders and has an active clinical research program within the field of neurolaryngology.
Rajat Mittal is currently a full Professor in the Department of Mechanical and Aerospace Engineering at the George Washington University, where he has been a faculty member since 2001. He is the founding director of the GW Center for Biomimetics and Bioinspired Engineering. His areas of interest are computational fluid dynamics, biomechanics, and bioinspired engineering. He received his Ph.D. in Applied Mechanics from the University of Illinois at Urbana-Champaign in 1995, an M.S. in Aerospace Engineering from the University of Florida, Gainesville in 1991, and a B.Tech in Aeronautical Engineering from the Indian Institute of Technology at Kanpur in 1989.
Raymond J. Walsh is currently a Professor of Anatomy and Regenerative Biology at The George Washington University School of Medicine and Health Sciences in Washington, DC, where he has been on the faculty since 1978. He is a member of the Executive Committee of the Institute for Biomedical Engineering. Dr Walsh’s current areas of interest are the design and development of computer-assisted instructional aids and the application of anatomy to virtual environments as they pertain to clinical medicine. Dr Walsh received his Ph.D. in Anatomy from Tufts University in 1976 and a B.S. degree in Zoology from the University of Massachusetts, Amherst, in 1969.