|Home | About | Journals | Submit | Contact Us | Français|
Medical imaging technologies have allowed for in vivo exploration and evaluation of the human musculoskeletal system. Three-dimensional bone models generated using image-segmentation techniques provide a means to optimize individualized orthopedic surgical procedures using engineering analyses. However, many of the current segmentation techniques are not clinically practical due to the required time and human intervention. As a proof of concept, we demonstrate the use of an expectation maximization (EM) algorithm to segment the hand phalanx bones, and hypothesize that this semi-automated technique will improve the efficiency while providing similar definitions as compared to a manual rater. Our results show a relative overlap of the proximal, middle, and distal phalanx bones of 0.83, 0.79, and 0.72 for the EM technique when compared to validated manual segmentations. The EM segmentations were also compared to 3D surface scans of the cadaveric specimens, which resulted in distance maps showing an average distance for the proximal, middle, and distal phalanx bones of 0.45, 0.46, and 0.51 mm, respectively. The EM segmentation improved on the segmentation speed of the manual techniques by a factor of eight. Overall, the manual segmentations had greater relative overlap metric values, which suggests that the manual segmentations are a better fit to the actual surface of the bone. As shown by the comparison to the bone surface scans, the EM technique provides a similar representation of the anatomic structure and offers an increase in efficiency that could help to reduce the time needed for defining anatomical structures from CT scans.
The application of computed tomography (CT) and magnetic resonance (MR) technologies to medical imaging has allowed for in vivo exploration and evaluation of the human musculoskeletal system. Three-dimensional bone models generated using image-segmentation techniques provide a means to perform engineering analyses and to optimize individualized orthopedic surgical procedures. Traditionally, manual raters have performed the segmentation of bone from medical images. However, these techniques are not clinically practical due to the extensive time and human intervention that is required. This study aims to evaluate a preprocessing and segmentation method to accurately, efficiently, and reliably separate bony regions of interest from CT images. The advantages and insight obtained from engineering analysis of surgical procedures has motivated researchers to automate the process and improve on its efficiency and accuracy. Others in our laboratory are working towards producing automatic hexahedral meshes from the surface definitions appropriate for surface contact analysis using the finite element method. Our ultimate goal is to automate the preprocessing, segmentation, and mesh generation procedures to provide a clinically useful means of surgical analysis and planning.
Many techniques have attempted to segment bone from medical images, many of which have been optimized to certain locations in the body due to some of the inherent difficulties in bone segmenting. One may intuitively expect that bone is very different from the surrounding soft tissues, allowing it to be easily segmented from images. However, bone does not have a homogenous composition1. This inhomogeneity results in intensity values that vary greatly throughout a bony structure (e.g. cortical versus trabecular). While cortical bone in younger adults is easy to differentiate from trabecular bone, as the subject ages the density of cortical bone often decreases (e.g. osteoporosis) thereby making the differentiation more difficult. Another complication is the close relationship between bones at the articulating surfaces where the relative positions of these bones is not fixed due to the degrees of freedom allowed by the joint. These close relationships, can make it difficult to differentiate bones of interest especially at the articulating surfaces. The large degree of shape variability at the articulating surfaces throughout the body also makes it difficult to develop a universal process for bone segmentation.
In attempts at overcoming these difficulties, researchers have tried a variety of modifications to well-known techniques including thresholding, region-growing techniques, atlas-based techniques, artificial intelligence, and various combinations of techniques. Many of these methods have aimed towards the automation of bone segmentation yet have varying degrees of human intervention. Most of the methods are specific to certain anatomical locations where the techniques have been optimized. Researchers have optimized bone segmentation using various techniques for the acetabulum2, femoral head2,3, cranium3, pelvis3,4, carpal bones1, mandible3, vertebrae5, ribs6, and various other regions3,7–12. Our laboratory has also developed a segmentation technique using an artificial neural network for the phalanx bones of the hand13.
Ehrhardt et al. demonstrated an automatic method for segmentation of the hip4. This technique used a Thirion atlas-based registration, which was followed by a thresholding and nearest neighbor interpolation. The authors report a 98.5% accuracy of their technique when compared to manual segmentations, but mention that several areas were in need of improvement. In particular, an algorithm was needed to increase the precision of the segmentation of the acetabulum and femoral head joint region. Sebastian et al. surveyed various methods of segmentation in an effort to generate models of the carpal bones from CT images1. An optimum method for the carpal bones was found to be a combination of active contour models, seed growing, and competition between neighboring regions that was coined the “skeletally coupled deformable model” (SCDM). A clinical evaluation of the segmentations was performed; however, a quantitative comparison was not presented. Staal et al. demonstrated a method for automated rib segmentation that utilized four preprocessing steps followed by region growing6. The preprocessing steps were designed to isolate regions of interest from surrounding tissues and included 1D ridge voxel detection, generation of line elements from ridge voxels, classification of primitives to remove background information, and, finally, grouping of generated primitives. Overall, these preprocessing steps were aimed at generating grouped primitives to identify the centerlines of the ribs. This information was then used for a region-growing segmentation. The authors report 97.5% accuracy, 96.8% sensitivity, and 97.8% specificity and a total processing time of around 6.5 min.
In this study, we demonstrate the use of a semi-automated segmentation technique for segmenting the phalanx bones of the hand that face similar challenges to those seen by Ehrhardt et al.4, Sebastian et al.1, and Staal et al.6. The expectation maximization (EM) algorithm coupled with atlas to image registration has been used for brain segmentation using MR images14–17. This approach has been shown to identify and account for image inhomogeneities, incorporate anatomical priors, and produce reliable segmentations of the brain. Using the EM algorithm, we have applied this technique to the phalanx bones of the hand excluding the thumb in this proof-of-concept study. The thumb was excluded since manual segmentations of the thumb were not available. The phalanx bones of the hand were chosen as the bony region of interest due to the fact that a single scan of one hand supplies twelve separate bones that vary in size and geometry. This allows for a large amount of data from a single scan and an evaluation of a segmentation technique’s ability to handle variable bone geometry. Also, the three phalanx bones of each finger are oriented within close proximity, which challenges a segmentation technique’s ability to differentiate not only bone from surrounding tissues, but also bone from neighboring bone. For these reasons, the phalanx bones of the hand were an ideal choice for our study. In the future, we hope to extend our techniques to other structures of the skeletal system.
Fifteen fresh cadaveric upper extremities were amputated above the elbow and screened to rule out any pre-existing pathology, including evidence of prior trauma. The donor set consisted of thirteen female and two male specimens with a mean age of 73.7 years. Each specimen was mounted on a customized Plexiglas frame in the neutral anatomic position. The neutral position was defined by aligning the back of the hand with the back of the forearm and the third metacarpal with the long axis of the forearm. Using a Siemens Sensation 64 slice CT scanner, three-dimensional voxel datasets of the hand were acquired for each specimen (matrix=512×512, FOV=172 mm, KVP=120, Current=94 mA, Exposure=105 mAs) with a 0.34-mm in-plane resolution and a 0.4-mm slice thickness. Slices spanning the entire limb were obtained for each dataset.
Following image acquisition, the data was processed using the BRAINS2 software18. The images were spatially normalized and resampled to 0.2-mm isotropic voxels aligning the vertical plane of the frame along the superior/inferior axis in the coronal view. This vertically aligned the third metacarpal. The images were cropped to contain only the phalanges for ease of data management. Two trained technicians manually traced 15 index fingers and three complete hands using the BRAINS2 software with an emphasis on creating the most accurate segmentations possible. The regions of interest (ROIs) defining the distal, middle, and proximal bones were manually traced by each technician. The average time required to manually segment the three bones of the index finger was 58.47 min, ranging from 39 to 83 min. In order to ensure minimal inter-rater variability a study was conducted to compare the performance of the two tracers by determining the relative overlap. The relative overlap computed between the two raters was 0.89 for all the bones. The individual bones, the proximal, middle, and distal phalanges, had relative overlaps of 0.91, 0.90, and 0.87, respectively19.
To streamline our registration and segmentation techniques, the images of the left hands were transformed into right hands by a mirror function along the x axis of the hand. This produced fifteen right-hand images for our study. One of the specimens was chosen as the atlas since this specimen had manual traces for all the phalanx bones from each of the four digits, and a visual inspection of all the data showed this specimen to be of average size. The atlas image was registered to each of the subjects using a three stage registration technique which included landmark identification. A total of thirty-two landmarks (eight per finger) were identified using the BRAINS2 software. Six of the twelve landmarks were placed at the endpoints of the phalanx bones corresponding to the anterior, middle, and posterior aspect of the metacarpophalangeal (MCP) joint and the superior tip of the distal phalanx. The remaining two landmarks were placed in the middle of the proximal interphalangeal (PIP) and distal interphalangeal (DIP) joints. Landmark placement was visually verified to be within one voxel using three viewing planes provided by the BRAINS2 software. Based on preliminary work, landmark identification was found to be necessary in order to account for the large anatomic variability that existed across subjects. The number of landmarks was optimized using an iterative process. Fewer than eight landmarks per finger were found to decrease the registration accuracy. More than eight landmarks per finger did not significantly improve the registration and only extended the time for the thin-plate spline registration. Following landmark identification, a thin-plate spline registration based on the manually defined landmarks20 was used to initialize a higher order Thirion Demon’s algorithm21 that was used to warp the atlas specimen into correspondence with each of the other specimens. The resulting transform was saved as a deformation field. The resulting deformation field was then applied to the anatomical priors as described in the following section.
The expectation maximization (EM) algorithm is part of the 3D Slicer2.7 toolkit and is provided by the National Alliance for Medical Imaging Computing (NA-MIC), one of the National Centers for Biological Computing sponsored by the National Institutes of Health. The algorithm is a two-stage iterative algorithm that uses probability maps and signal intensity information to generate anatomical labels. This process requires a subject image and a probability map for each bone that is to be segmented from the subject image. The probability maps for each phalanx bone were obtained from manual traces from the atlas CT image. The manual traces defining the border of the structure were first converted to a binary representation and then filtered using a Gaussian filter with a variance of 2.0 mm. The resulting image was then scaled from 0 to 255 and converted to an eight-bit representation. This was done separately for each of the ROIs resulting in 12 probability maps defining the proximal, middle, and distal phalanx bones for the index, middle, ring, and little fingers. The deformation field mapping the atlas image to each subject, as described previously, was applied to the probability maps warping them from the atlas space to each of the individual subjects.
The EM algorithm was then applied to the subject datasets using optimized parameters for bone segmentation. The EM algorithm was organized using a hierarchical method which first defined signal intensity means and variance for the background, soft tissue, and regions of probable phalanx bone. The regions of probable phalanx bone were further divided into 12 separate regions that corresponded to the phalanx bones of the four fingers. Each region of probable bone was then assigned a respective probability map. The twelve regions of probable bone were further differentiated into regions of bone and regions of soft tissue. This procedure proved effective at finding cortical bone, but produced hollow regions within the trabecular bone. The resulting regions of interest from the EM algorithm were then post-processed to keep the largest connected region thereby eliminating islands that had been misclassified. The hollow regions were then filled to generate an anatomical label for each phalanx bone. Finally, a triangle surface was generated from the binary segmentation and smoothed using Laplacian smoothing.
As described previously by DeVries19 the cadaveric specimens were utilized for validation with a 3D laser surface scanner. Specimen preparation and laser scanning are described briefly here, for more details see DeVries et al. In preparation for the physical scanning procedure, the bones were carefully dissected from each hand/wrist. Care was taken not to alter the bony surface during dissection due to nicking or scratching by instruments. The majority of the surrounding soft tissue was removed during dissection. What tissue remained post-dissection was removed following the defleshing process prescribed by Donahue et al.22. The bones were placed in a 5.25% sodium hypochlorite (bleach) solution for approximately 4 to 6 h to remove the remaining tissue. The bones were examined hourly to avoid decalcification and to remove any extraneous loose tissue. Once denuded, the bones were degreased in a soapy water solution followed by a period of air drying. Due to the bone’s natural coloring and texture, a white primer was applied to the surface before scanning. This produced better scan results since the scanner has difficultly with “fuzzy”, dull objects. Three-dimensional surface scans of each physical specimen were ascertained using a Roland LPX-250 3D laser scanner (0.2-in. resolution). The surfaces were cleaned of non-manifold, redundant, and crossing faces. These abnormal faces may cause severe errors in geometric calculations and were removed. Finally, the individual shell was smoothed within a tolerance of 0.01 mm. A tight smoothing tolerance was chosen so the actual surface characteristics were kept intact19.
Three metrics were used to compare the resulting regional definitions from the EM algorithm. First, a relative overlap metric (Eq. 1) and similarity index metric (Eq. 2) were used to compare the binary segmentations generated from the EM algorithm to those created by the manual rater23.
The third method was used to test the validity of the EM segmentation algorithm. We compared models generated from the EM segmentations to bone surface scans performed on five index fingers from specimens including the proximal, middle, and distal phalanx bones (15 bones total). The acquisition orientation of the laser and CT scanners differed; consequently, the bony surface definitions required alignment. Once a common orientation was established, an iterative closest point (ICP) rigid registration24 was used to register the two surfaces. Thereafter, the distance between the two surfaces was computed. The resulting distance map is based on a Euclidean distance metric, or the shortest distance from a source point to the target surface along the surface normal. The surfaces generated from the EM algorithm were considered the source, while the physical laser scans were considered the target surfaces. Custom written software was used to create the Euclidean distance maps as a means of comparison between the surface scans and the EM segmentation models.
In addition to the reliability and validity measures obtained above, a simple time trial was applied to the EM segmentation technique. A stopwatch was synchronized with both the start and the end of the segmentation process including the definition of the anatomical landmarks used in the registration process. Prior work had provided time trial values for the manual segmentations13.
A complete phalanx bone model from the EM-based segmentation from a single subject is shown in Figure Figure1.1. The average relative overlap and average similarity index results for each of the phalanx bones can be seen in Table 1. Overall, the average relative overlap values for all of the fingers were 0.83, 0.79, and 0.72 for the proximal, middle, and distal bones, respectively. The relative overlap for the index finger where 13 subjects’ data was available for comparison showed similar relative overlap measures of 0.87, 0.80, and 0.70 for the proximal, middle, and distal bones. Overall, the average similarity index values for all of the fingers were 0.91, 0.88, and 0.83 for the proximal, middle, and distal bones, respectively. This is as one would expect given the difference in how the metrics are computed. The similarity index for the index finger where 13 subjects’ data was available for comparison showed similarity index measures of 0.93, 0.89, and 0.82 for the proximal, middle, and distal bones. A relative overlap or similarity index of 1.0 would be an exact match between the images being compared and a value of 0 would indicate no overlap of the regions of interest.
The Euclidean distance maps produced an average distance for the proximal, middle, and distal phalanx bones of 0.45, 0.56, and 0.51 mm, respectively. It is evident that Specimen-1 is an outlier in this data set. The segmentation for this specimen appears to have not included the proximal portion of the distal phalanx resulting in large distance measures for this subject. The average Euclidean distances are summarized in Table 2. In most cases, the average distance between the bone surface scan and the EM segmentation model are near the resolution of the laser scanner (0.34 mm) and are therefore close in shape to the physical laser scanned surface of the bone.
The total time required to segment the four fingers (twelve phalanx bones) of the hand was 30 min. This total time is distributed as follows: 8 min for landmark identification, 3 min for image registration, and 19 min for EM segmentation and post-processing. Recall that the average time to segment a single finger manually was 58.47 min, which would equate to an average of 233.88 min for the manual segmentation of all four fingers. The EM segmentation is nearly eight times faster than manual tracing and only requires 5% of the human time required for segmenting the 12 regions of interest.
The comparison of the EM technique to the manual segmentation methods has revealed advantages and disadvantages to both methods. In our previous work, the manual segmentation methods were validated as an accurate representation of the surface of the bone. The EM segmentations were not an exact match to the validated manual segmentations and, thus, are not as accurate a representation as the manual methods. Nevertheless, the average distances from the bone scans show that the EM method is performing well for the proximal and middle phalanx bones as their average distances are approximately the order of a pixel in the raw CT data away from the laser scanned surface. In addition, the relative overlap and similarity index values between the EM segmentation and the manual rater were high for these regions as well.
However, the average distance for distal phalanx was slightly larger. The average location of the inferior aspect of the distal phalanx overlaps with the average location of the superior aspect of the middle phalanx. This naturally leads to an overlap in the probability maps that were used. The probability map overlap is problematic to the sequential segmentation technique utilized by the EMSegment module. This issue results in portions of the distal phalanx erroneously being identified as the middle phalanx. This error has manifested itself as a larger average distance of the distal phalanx shown by the Euclidean distance mapping. Another difficulty in the EM bone segmentation can be attributed to the advanced age of the subjects and the level of cortical bone deterioration. In the absence of the “cortical shell”, it is very difficult for the EM segmentation to differentiate trabecular bone from surrounding tissues due to their similar intensity values on CT images.
The previous works presented in the context of segmentation of bony regions of interest suffer from similar issues as the EM Segmentation algorithm. Sebastian et al. showed that their segmentation technique was sensitive to partial volume artifacts1. In particular, regions with small distances between bones sometimes resulted in joined bone models. Staal et al. suffered from leaky region growing in small joint spaces where portions of the vertebrae were identified as ribs6. In addition, primitives were sometimes misclassified resulting in missing or added ribs. The author suggests the inclusion of shape features and a feedback program to cross check the primitive classification would help to eliminate these errors. The work presented by Ehrhardt et al. is based in part on thresholding4. This step is dependent on the difference in intensity between bone and surrounding tissues. This may not be problematic for regions of bone with thick cortical layers, but it does not address the issue of normal anatomical regions with thin cortical layers that exist in older adults or as a result of osteoporosis. This study utilized cadaveric specimens that had an average age of 73.7 years. Many of these subjects exhibited very little cortical shell in the phalanx bones. The reduced cortical thickness results in intensity values that may not significantly differ from surrounding tissues and trabecular bone. The similarity between these intensity values makes it difficult for the EM segmentation process to segment bone. Images from a younger population may offer a better representation of the EM segmentation ability.
Errors in the EM segmentation algorithm can be introduced in several ways. First, the manual segmentation defined on the atlas image is used as probability that the region is at any given location in the atlas space. This will tend to bias the results towards what the guidelines used to define the region on the atlas subject. To minimize this bias, the manual segmentations were smoothed with a Gaussian filter. This has the effect of weighting the interior voxels more than the edge voxels where the error introduced by the manual rater is the greatest. The algorithm is also susceptible to errors in the definition of the landmarks for registration. Since these landmarks are only used to initialize the Thirion Demons deformable registration algorithm, the errors introduced by the manual rater will be minimized since the registration algorithm will tend toward a global optimum. As long as the landmarks are not significantly different from the actual location, the errors in the landmark location should be minimal. The EM segmentation step depends on the registered subject image and image intensity values to accurately segment bone. The registration step was already discussed, but the image intensity values are also important.
Previous research has shown the average time for manually segmenting the three phalanx bones of the index finger to be 58.47 min13. The EM segmentation improved on the efficiency of the manual techniques by a factor of eight. This improvement in segmentation time brings us one step closer to a clinically practical segmentation technique. As a semi-automated technique, the EM segmentation offers an improvement in efficiency with a minimal loss to segmentation accuracy.
We have previously evaluated an artificial-neural-network (ANN)-based segmentation using the same research subjects. As reported by Gassman et al., the ANN was compared to the manual segmentations and was shown to have a relative overlap metric of 0.87, 0.82, and 0.76 for the proximal, middle, and distal phalanx bones of the index finger, respectively13. These relative overlap values are very close to those of the EM-based segmentation; however, it should be noted that the ANN-based segmentation does result in slightly higher relative overlap values for the distal and middle phalanx bones. When compared to the 3D laser scans, the ANN segmentations were shown to have an average distance (mm) of 0.35, 0.29, and 0.40 for the proximal, middle, and distal phalanx bones, respectively13. When compared to the EM-based segmentation, the ANN average distance values are closer to the scanned surface for all three phalanx bones of the index finger. The ANN was also shown to be ten times faster than the manual segmentations13, which is 1.3 times faster than the EM-based segmentation. The advantage of the EM segmentation method is that the whole hand (excluding the thumb) was being segmented while the ANN was only evaluated on the phalanx bones of the index finger in this previous work.
To make the EM segmentation process even more clinically relevant, it would be necessary to minimize user interaction in the process. Currently, our registration technique requires a user to place eight landmarks for each finger. This is a significant amount of user interaction which could lend itself to user error. Improving this registration technique will help to minimize user interaction. Secondly, the EMSegment module could be altered to better segment bony regions of interest. Instead of sequentially segmenting individual probability maps, a problem described earlier, the algorithm could be amended instead to recognize regions of probability map overlap. In the regions of overlap, a second step could be added to determine the most probable identification between the two overlapping probability maps. This simple modification could help to avoid misidentification of structures where boundaries are ambiguous.
Another approach would be to add shape constraints using a principle components analysis, thereby allowing the bone shape to be optimized in these overlapping probability regions. This has previously been proposed by Pohl et al. for segmentation of subcortical brain structures16. These improvements would help to improve the accuracy of the EM segmentation technique and bring it one step closer to clinical relevancy.
The EM segmentation technique has been shown to offer a reliable and accurate means of bone segmentation in this proof-of-concept study. The slight loss of accuracy when compared to the manual tracing techniques is not disappointing when the segmentation times between the two techniques are compared. Even without modifying the EMSegment module, minimal human intervention could be used to improve accuracy by manually fixing problematic areas of segmentation with minimal loss of overall efficiency. Overall, this work represents a step towards a clinically relevant, fully automated bone segmentation process that attempts to match the accuracy of human manual raters while improving the segmentation speed.
The authors acknowledge the assistance of K. Pohl and gratefully acknowledge the financial support provided by the NIH Awards R21EB001501 and R01EB005973 and The University of Iowa Carver College of Medicine.