Quantitative analysis of short-axis functional cardiac magnetic resonance (CMR) images can be performed using automatic contour detection methods. The resulting myocardial contours must be reviewed and possibly corrected, which can be time-consuming, particularly when performed across all cardiac phases. We quantified the impact of manual contour corrections on both analysis time and quantitative measurements obtained from left ventricular (LV) short-axis cine images acquired from 1555 participants of the Framingham Heart Study Offspring cohort using computer-aided contour detection methods. The total analysis time for a single case was 7.6±1.7 minutes for an average of 221±36 myocardial contours per participant. This included 4.9±1.6 minutes for manual contour correction of 2% of all automatically-detected endocardial contours and 8% of all automatically-detected epicardial contours. However, the impact of these corrections on global LV parameters was limited, introducing differences of 0.4±4.1 ml for end-diastolic volume, −0.3±2.9 ml for end-systolic volume, 0.7±3.1 ml for stroke volume and 0.3±1.8% for ejection fraction. We conclude that LV functional parameters can be obtained in under 5 minutes from short-axis functional CMR images using automatic contour detection methods. Manual correction more than doubles analysis time, with minimal impact on LV volumes and ejection fraction.
Quantitative analyses of short-axis (SA) functional cardiac magnetic resonance (CMR) images over the entire cardiac cycle require a complete delineation of the myocardium. Such a complete delineation typically consists of the left ventricular (LV) endocardial and epicardial contours on 8–12 slices and 20–50 phases/slice, altogether involving up to 1200 (12×50×2) myocardial contours. Consequently, manual delineation is too time-consuming to be performed in clinical routine and therefore automatic delineation tools are essential.
The problem of automatic delineation of the myocardium in SA functional CMR images has proven to be challenging. The papillary muscles are particularly difficult to delineate automatically. These need to be excluded from the myocardial wall to enable measurements of wall thickness, although there are often no image features separating the papillary muscles from the myocardial wall. Consequently, the problem has attracted much attention in the image processing community. The presented solutions include methods based on active contour models (1–3), active shape models (4,5), active appearance models (6–8), level sets (9,10), graph cuts (11,12), fuzzy connectivity (13,14) and combinations of those approaches (15,16). While these methods perform reasonably well in validation studies comparing quantification data obtained from manually and automatically defined contours on medium-sized populations, none provide perfect contours on a case-by-case basis, as shown in the recent review by Petitjean and Dacher (17). Consequently, effective deployment of automatic delineation methods into clinical practice still requires human-operator interaction to enable correction of imperfect contours. In addition, little is known about the effect of manual contour correction on the quantitative analysis results.
In this study we sought to assess the accuracy, time-efficiency, and reproducibility of a computer-aided analysis system incorporating an automatic delineation method (18) together with several (semi-automatic) contour correction mechanisms (1,19). We deployed our system to analyze CMR SA-image data from the Framingham Heart Study (FHS) using computer-aided contouring methods with a specific step-by-step strategy and monitored the evolution of the quantitative results following each of these steps. Moreover, extensive logging of user interactions provided additional insights into the effectiveness of computer-aided analysis of SA functional CMR images.
The study design and selection criteria for the FHS Offspring study have been described previously (20). Briefly, the FHS Offspring cohort participants are the children of the original FHS cohort and the spouses of those children. The Offspring cohort was initiated in 1971 and participants have undergone comprehensive examinations and interval histories every 3–4 years. Offspring were excluded prospectively if there were potential contraindications to CMR imaging, including pacemaker, implanted cardioverter-defibrillator, metallic intraocular or intracranial clips, and history of foreign ocular bodies or severe claustrophobia. Participants with known permanent atrial fibrillation were also excluded. A total of 1837 Offspring underwent CMR imaging. The study was approved by human study committees of both the Boston University School of Medicine and Beth Israel Deaconess Medical Center (Boston, MA). Written informed consent for CMR scanning was obtained from all participants.
CMR imaging was performed with subjects positioned supine in a 1.5-T scanner (Gyroscan NT, Philips Healthcare, Best, The Netherlands) using a 5-element cardiac array coil for radiofrequency signal detection. Following scout images to determine the position and orientation of the heart within the thorax, images were acquired using an electrocardiogram (ECG)-gated steady-state free precession (SSFP) cine sequence (21). A stack of 10-mm thick contiguous SA slices encompassing the left ventricle from base to apex was acquired during a series of end-tidal breath-holds. Imaging parameters included: repetition time = 3.2 ms, echo time = 1.6 ms, flip angle = 60°, 208×256 matrix with 400-mm FOV, temporal resolution of 30–40 ms.
All image analyses were performed by a single expert observer (CJS), with 12 years of CMR analysis experience and >4000 cases previously analyzed, blinded to participant characteristics and clinical history, using a commercially available workstation (Extended MR Workspace 2009, Philips Healthcare). The analysis software provides automatic contour detection in a task-guided application in which the user needs only to indicate the slice range (i.e. from apex to base) deemed suitable for automatic analysis. To account for through-plane (long-axis) motion over the cardiac cycle, slice-extent for automated analysis was determined by the operator at end-systole (ES). The apical limit was selected as the most apical slice for which blood pool could be seen at ES. The basal limit was selected as the most basal slice at ES for which there was myocardium entirely encompassing the LV blood pool. Contour detection was then performed without the need for operator-identification of the LV blood pool, or any other features, in the image.
The automatic contour detection starts by locating the myocardium using a fast and robust ring detection method. This ring detection method uses a variant of the Hough transform tailored to detect circular shapes, which is computed efficiently in the Fourier domain using an analytic Hankel transformation. Thereafter, the precise delineation of the endocardial and epicardial contours at end diastole (ED) is obtained by deforming a geometric template in a coarse-to-fine approach. This geometric template models the myocardium as a ribbon structure consisting of a centerline and a variable width, both described by interpolating splines controlled by as few nodes as possible. The position and width of those nodes are optimized using a greedy algorithm to minimize an energy criterion that includes: 1) circularity and regularity terms to favor smooth circular shapes, 2) boundary terms based on edge and ridge features to drive the contours to an appropriate location and 3) regional terms forcing the template to segment homogeneous regions, so that the papillary muscles are considered part of the LV cavity volume, in accordance with commonly used manual contour delineation conventions. The resulting ED contours are then propagated to the remaining phases by optimizing the match of grey values along profiles in consecutive images perpendicular to the contour (1).
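To illustrate the idea behind ring detection, the following is a minimal sketch of plain Hough voting for the center of a circle of known radius. It is not the Fourier-domain Hankel-transform variant used by the analysis software; the function names, the fixed radius, and the synthetic ring are all assumptions made for this example.

```python
import math
from collections import Counter


def synthetic_ring(cx, cy, radius, n_points=72):
    """Hypothetical test data: pixel coordinates sampled on a circle."""
    points = set()
    for k in range(n_points):
        theta = 2.0 * math.pi * k / n_points
        points.add((round(cx + radius * math.cos(theta)),
                    round(cy + radius * math.sin(theta))))
    return sorted(points)


def hough_circle_center(edge_points, radius, n_angles=72):
    """Plain Hough voting: every edge point votes for all candidate
    centers lying at distance `radius` from it; the most voted
    candidate is returned as (x, y)."""
    votes = Counter()
    for x, y in edge_points:
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            candidate = (round(x - radius * math.cos(theta)),
                         round(y - radius * math.sin(theta)))
            votes[candidate] += 1
    return votes.most_common(1)[0][0]


center = hough_circle_center(synthetic_ring(20, 15, 8), radius=8)
print(center)
```

In practice the radius is unknown, so a search over a range of radii (or a ring-shaped kernel, as the text suggests) would be needed; this sketch only shows the voting principle.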
If manual contour correction is necessary, semi-automatic contour propagation (1) allows efficient correction across all phases. Computer-aided contouring was performed using a strict protocol, including automatic contour detection (18) on all slices in which the complete circumference of the LV cavity was surrounded by myocardial tissue in all phases. The resulting contours were reviewed at end-diastole and necessary corrections were propagated to the remainder of phases (1). Successive review and correction of the remainder of phases was performed by correcting contours at the phase where contour positioning was worst, followed by a dual (forward and backward) propagation (19,22) to guarantee temporal consistency. Finally, if needed, additional basal and apical slices at ED were drawn manually to enable accurate volume quantification at ED. This complete analysis procedure is summarized in Figure 1. The most common modification was addition of at least one additional basal slice at ED. The most basal slice was selected as that containing an arc of ≥ 180° of myocardium, as shown in Figure 2. During the analyses, all user interactions were recorded in log files. Furthermore, the analysis software was augmented with automatic result-export and logging functionality for the purpose of this study.
Global volumetric parameters were determined. LV volumes were computed in all phases using Simpson’s rule for slice summation. The resulting volume vs. time curve was processed to obtain the end-diastolic volume (EDV), end-systolic volume (ESV), stroke volume (SV), ejection fraction (EF), and cardiac output (CO). In addition, LV mass (LVM) was determined from the end-diastolic datasets.
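The slice summation and the derived parameters can be sketched as follows. This is an illustration under stated assumptions, not the workstation's implementation: the function names are hypothetical, EDV and ESV are taken as the maximum and minimum of the volume curve, and the myocardial density of 1.05 g/ml is a commonly used value that the text does not specify.

```python
from typing import Dict, Sequence

MYOCARDIAL_DENSITY_G_PER_ML = 1.05  # assumed value, not stated in the text


def slice_summation_ml(areas_mm2: Sequence[float], thickness_mm: float) -> float:
    """Volume of a stack of contiguous slices: sum(area) * thickness, in ml."""
    return sum(areas_mm2) * thickness_mm / 1000.0  # 1 ml = 1000 mm^3


def global_parameters(volumes_ml: Sequence[float],
                      heart_rate_bpm: float) -> Dict[str, float]:
    """Derive global LV parameters from a volume-vs-time curve.

    Assumes EDV is the largest and ESV the smallest volume on the curve.
    """
    edv = max(volumes_ml)
    esv = min(volumes_ml)
    sv = edv - esv
    return {
        "EDV_ml": edv,
        "ESV_ml": esv,
        "SV_ml": sv,
        "EF_percent": 100.0 * sv / edv,
        "CO_l_per_min": sv * heart_rate_bpm / 1000.0,
        "ES_phase": list(volumes_ml).index(esv),  # index of the ES phase
    }


def lv_mass_g(epi_areas_mm2: Sequence[float],
              endo_areas_mm2: Sequence[float],
              thickness_mm: float) -> float:
    """LV mass at ED: (epicardial - endocardial volume) * assumed density."""
    wall_ml = (slice_summation_ml(epi_areas_mm2, thickness_mm)
               - slice_summation_ml(endo_areas_mm2, thickness_mm))
    return wall_ml * MYOCARDIAL_DENSITY_G_PER_ML


params = global_parameters([120.0, 80.0, 50.0, 90.0], heart_rate_bpm=60.0)
print(params)
```

Because the ES phase is read from the minimum of the curve, it can be re-detected automatically whenever contours are edited.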
Inter- and intra-observer reproducibility was assessed from 48 randomly selected cases from equal strata of sex and Framingham Risk Score. The second observer (MLC), for interobserver reproducibility, had 17 years of CMR experience and >1000 cases analyzed.
Finally, we sought to verify that our methodology of automated contouring followed by manual correction would yield results comparable to fully manual contours. One observer (MLC) analyzed the 48 randomly selected cases without the help of automation by manually drawing LV endocardial contours at ED and ES and LV epicardial contours at ED only, which is sufficient to enable quantification of global volumetric parameters. These manual analyses were performed en bloc over two days at a time point widely separated from other analyses without reference to prior contours and results.
The time spent to perform each task in the analysis protocol was derived from the log files that were stored by the analysis workstation. This analysis included the time spent to select the appropriate apical and basal slices to perform automatic contour detection, to correct the automatic contours, to add the additional slices at ED, and to export the quantitative results. The resulting times were found to be non-Gaussian in distribution. Therefore, we use the median and the inter-quartile range to summarize these results.
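A minimal sketch of this summary is given below. The function name is hypothetical, and the quartile convention (the default "exclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not say which convention was used.

```python
import statistics


def median_and_iqr(values):
    """Return (median, inter-quartile range) of a list of task times."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # three quartile cut points
    return statistics.median(values), q3 - q1


times_s = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up task times in seconds
print(median_and_iqr(times_s))
```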
The accuracy of automatically detected myocardial contours was assessed by measuring the distance between the automatically detected contours and the final contours after review and correction. We have measured the mean, root-mean-square (RMS), and maximum distance after establishing a point-to-point correspondence perpendicular to a centerline obtained using the repeated averaging algorithm (23). These results are summarized using the mean and the standard deviation.
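Once corresponding points on the two contours are available, the three distance measures reduce to simple statistics. The sketch below assumes that the point-to-point correspondence has already been established (the centerline and repeated averaging step is not reproduced here), and the function name is hypothetical.

```python
import math


def contour_distances(points_a, points_b):
    """Mean, RMS and maximum distance between corresponding contour points.

    points_a and points_b are equal-length sequences of (x, y) in mm,
    where points_a[i] is assumed to correspond to points_b[i].
    """
    if len(points_a) != len(points_b) or len(points_a) == 0:
        raise ValueError("contours must be non-empty and of equal length")
    d = [math.hypot(ax - bx, ay - by)
         for (ax, ay), (bx, by) in zip(points_a, points_b)]
    mean = sum(d) / len(d)
    rms = math.sqrt(sum(x * x for x in d) / len(d))
    return mean, rms, max(d)


automatic = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
corrected = [(0.0, 0.0), (1.0, 3.0), (4.0, 1.0)]
print(contour_distances(automatic, corrected))
```

The RMS distance weights large local deviations more heavily than the mean, which is why all three values are reported side by side.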
The quantitative parameters obtained from automatic analyses, i.e. based on automatically detected contours only, were compared with the quantitative parameters obtained from complete analyses, i.e. based on the final contours after review and correction, to assess the impact of manual adjustments made to the contours. Moreover, to assess the impact of adding the additional slices at ED alone, quantitative parameters were also obtained from reconstructed analyses, i.e. based on automatically detected contours complemented with additional ED contours. Note that we refer to these analyses as “reconstructed” because they were computed from previously saved contours, as the addition of ED slices was performed after detailed adjustments of all automatically detected contours. The differences were reported as the mean and standard deviation of the signed difference. In addition, scatter plots were generated including the results of linear regression analysis. The close correspondence between the quantification results from reconstructed analyses and complete analyses also allowed for the generation of Bland-Altman plots (24).
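The signed-difference statistics behind a Bland-Altman plot can be sketched as follows. The function name and the example values are made up for illustration, and the 1.96 SD limits of agreement follow the usual Bland-Altman convention rather than anything stated in the text.

```python
import statistics


def bland_altman(method_a, method_b):
    """Bias, SD and 95% limits of agreement of paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)   # mean signed difference
    sd = statistics.stdev(diffs)    # sample standard deviation
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, sd, limits


reconstructed_edv = [10.0, 12.0, 14.0]  # made-up values in ml
complete_edv = [9.0, 12.0, 12.0]
print(bland_altman(reconstructed_edv, complete_edv))
```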
We allocated 6 months for the image analysis, during which the images of 1,555 participants were transferred to the workstation and analyzed by a single observer (CJS). The median ± inter-quartile total analysis time was 9.1 ± 3.8 minutes/case. Timing results for the individual steps in the analysis protocol are listed in Table 1. Note that exporting all the contours and results to text files was performed for research purposes and is not mandatory in clinical practice. These text files were exported immediately after automatic contour detection and after complete analysis. This added a median 85 seconds (i.e. 42 seconds immediately after contour detection and 43 seconds after complete analysis) to our analysis protocol, which was 16% of the total analysis time. Excluding the export time, the total analysis time for a single case was 7.6 ± 1.7 minutes for an average of 221 ± 36 myocardial contours per participant.
Automatic contour detection was performed on 8148 slices and successfully located the LV in 8072 slices (99% success rate). All 323,202 automatically detected contours were reviewed, of which 16,378 contours (5%) required manual modifications that were propagated using the dual propagation mechanism. This included 3,157 modifications to endocardial contours (2% of all endocardial contours), and 12,948 modifications to epicardial contours (8% of all epicardial contours). The size of these manual corrections was assessed by measuring the mean, RMS, and maximum distance between the automatically detected and manually corrected contours (Table 2). Moreover, Figure 3 shows the cumulative percentage of edits versus the contour displacement. Overall, visual contour verification and manual adjustment (as needed) required 292 ± 97 seconds/case, or 4.87 ± 1.62 minutes/case.
The fidelity of the automatically detected myocardial contours was assessed by measuring the average mean, RMS and maximum distance with respect to the final contours. Table 2 lists the accumulated positioning errors with respect to the final contours across all 323,202 automatically-detected myocardial contours (i.e. all slices and all phases), across all ED contours, and across all manually modified contours. Figure 4 shows scatter plots of the time between the first and last export of results vs. the mean, RMS and maximum contour positioning error for all cases. The intercepts of the trend lines in these plots, 254 s, 245 s, and 243 s for the mean, RMS and maximum contour positioning errors respectively, indicate the time that would be expected for an analysis with perfect automatic contouring. Note that these times exclude the time spent on slice selection, but include the time spent on reviewing automatically defined contours, adding additional slices, and defining myocardial segments, as well as the time spent on exporting the results once. Moreover, the slopes of the trend lines, 437 s mm−1, 338 s mm−1, and 160 s mm−1 for the mean, RMS and maximum contour positioning errors respectively, indicate the strong impact of the accuracy of the automatically detected contours on the total analysis time. An average 0.1 mm improvement in contour detection accuracy, assessed by the mean contour positioning error, is expected to result in a 44 s time gain.
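The trend-line reasoning can be checked with a few lines of arithmetic. The sketch below uses the intercept and slope reported for the mean contour positioning error; the function name is hypothetical, and the linear model is assumed to hold over the range of errors considered.

```python
INTERCEPT_S = 254.0        # reported intercept for the mean positioning error
SLOPE_S_PER_MM = 437.0     # reported slope for the mean positioning error


def expected_analysis_time_s(mean_error_mm: float) -> float:
    """Expected time between first and last export under the linear trend."""
    return INTERCEPT_S + SLOPE_S_PER_MM * mean_error_mm


# Time gained by reducing the mean positioning error by 0.1 mm.
gain_s = expected_analysis_time_s(0.5) - expected_analysis_time_s(0.4)
print(round(gain_s, 1))
```

The gain depends only on the slope, so it is the same 0.1 mm × 437 s/mm regardless of the starting error.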
Image quality affects the time spent during analysis. The images of 89 cases (5.7%) contained significant artifacts at one of the slices. In these artifact-afflicted cases, the RMS distance between the automatically detected and final contours at all phases was 0.49 ± 0.92 mm and 1.38 ± 1.26 mm for the LV endocardium and LV epicardium respectively, as compared to 0.25 ± 0.53 mm and 1.15 ± 1.03 mm in cases without artifacts (p<0.001). The total time spent to correct the automatically detected contours and export the final results for these artifact cases was 551 ± 227 s (vs. 464 ± 222 s for cases without artifacts, p< 0.001).
The final step in our analysis protocol was to add myocardial contours to the basal, and occasionally apical, slice(s) at ED, which were not included for automatic contour detection as these slices did not show LV cavity at ES. To assess the importance of this step, we quantified the volumetric parameters from automatic and reconstructed analyses and compared the results to volumetric parameters from complete analyses. The reconstructed contour sets only included correction for the 1% of slices in which the automatically detected contours did not surround the LV. In total 4,320 slices were added, an average of 2.78 ± 0.92 slices per case. The results for the EDV, ESV, SV and EF are shown in scatter plots with linear regression results in Figure 5, while Table 3 lists the mean ± SD differences and the r2 for all volumetric results. Moreover, because the results from reconstructed and complete analyses correlate closely, their differences are visualized in Bland-Altman plots (Figure 6).
Due to the time-consuming nature of hand-tracing ventricular contours, fully manual analysis is usually performed by drawing contours at the ED and ES phases only. This requires manual selection of the ES phase blinded from volume quantification results. Automatic contour detection provides myocardial contours at all phases. Consequently, the ES phase can be automatically selected afterwards, rather than manually selected beforehand, which may save additional time (and potentially lead to more accurate volume parameters). While the automatically detected ES phase could be over-ridden (explicitly selected manually), this proved to rarely be necessary. As the ES phase is also redetected after modifying the myocardial contours, only small changes in ES phase were observed. The mean difference between the automatically detected and final ES time was 0.2 ± 3.0 ms, with a 0.994 correlation coefficient (r2), indicating excellent agreement between user-determined and automatically-identified ES time.
Intra- and inter-observer variability were determined both by measuring distances between the contours of both observations (Table 4) and by comparing the volumetric parameters obtained in both observations (Table 5). Note that the distances reported in Table 4 are comparable to the distances reported for all contours in Table 2. Moreover, the intra- and inter-observer variability for the volumetric parameters (Table 5) is comparable to the differences between the results derived from the reconstructed and final data (Table 3). The intra- and inter-observer variability for volumetric parameters derived from the reconstructed analyses was comparable to the intra- and inter-observer variability from the complete analyses (Table 5). As the contours of two observations in the reconstructed analyses differ only at a few ED slices, average values for the mean, RMS and maximum distance between the contours of both observations are all < 0.1 mm.
To enable a comparison with other works, the accuracy of the automatically detected myocardial contours was also assessed with respect to the contours from the fully manual analyses (Table 6). Moreover, the volumetric parameters as obtained from reconstructed and complete analyses were compared to the volumetric parameters obtained from manual analyses (Table 7). LVM was underestimated, by approximately 5%, by automatic contouring versus fully manual analysis (p<0.001), and EDV was minimally (1%) but significantly overestimated by automatic contouring (p=0.03). Other global LV parameters did not differ between fully manual and complete analyses. The analysis time for manual segmentation of LV endocardial contours at ED and ES, and epicardial contours at ED, was 361 ± 11 seconds/case, or 6.02 ± 0.18 minutes/case.
Quantitative analysis of SA functional CMR images across the cardiac cycle requires a complete delineation of the LV myocardium, which cannot be obtained manually within the time constraints of clinical routine. Therefore, clinical software for the analysis of SA functional CMR images should provide automated contouring tools. To date, such automated methods are helpful, but require manual interventions by the user to obtain visually satisfactory LV contours. In this study, we assessed the accuracy and time-effectiveness of a clinical application for the analysis of SA functional CMR images, which contains a combination of automatic and semi-automatic contouring algorithms.
We further have assessed the impact of manual contour corrections on the final outcome of clinical measurements. The addition of basal and apical slices at the ED phase caused a significant increase in the EDV (Table 3 and Figure 5). Consequently the derived SV and EF both increased significantly. This result confirms the known importance of correct slice selection for the outcome of global volume measurements in SA functional CMR.
Moreover, while reviewing and correcting the automatically proposed contours doubles the analysis time (Table 1), Table 3 and Figure 6 highlight the limited impact of detailed, manual contour corrections on the outcome of global volume measurements in SA functional CMR. Nevertheless, a review of automated contours remains necessary. Very rarely the automatic contour detection fails, instead producing spurious contours; such instances produce the extreme global-quantification outliers seen in Figures 5 and 6.
Finally we have assessed the accuracy of the automatically detected contours with respect to fully manually drawn contours (Table 6), as well as the accuracy of the global volumetric results before and after correction with respect to the global volumetric results obtained from fully manual analysis (Table 7). Both experiments revealed similar accuracy with respect to our previous experiments (1,18,25), and with respect to many other methods (17).
The overall accuracy of complete analysis was high compared with results obtained from fully manual analysis, with excellent correlation between the two methods (Table 7). Limits of agreement were tight and comparable to those of inter-observer variation between two fully manual observations, which are reported as mean difference ± SD of 8.07 ± 5.58ml for EDV, 2.77 ± 5.1ml for ESV, −0.53 ± 2.32% for EF, and −2.75 ± 5.97 for LVM (26). LV mass was slightly but significantly underestimated, by approximately 5%, by complete analysis versus fully manual analysis. Although EDV was significantly overestimated in the statistical sense, the difference (1.0% of mean EDV) is unlikely to be of clinical relevance. LV ESV, SV and perhaps most importantly EF did not differ between complete and fully manual analyses.
Fully manual analyses involved hand-drawn contouring at ED and ES only. While this was marginally faster than a complete analysis (6.0 minutes vs. 7.6 minutes), the limited impact of elaborate manual corrections, which take a median 4.9 minutes, suggests that accurate global LV parameters can be obtained in 3–4 minutes. Moreover, fully manual analysis requires manual identification of the ES phase, which is a source of intra- and inter-observer variation that can be eliminated using contour detection on all phases.
The timing results reported in this paper are derived from analyses by a single user. It remains to be investigated how these timing results translate to other users, with different levels of experience and in other environments. In many clinical practices, a typical weekly case load might be 10–20 CMR cases, as compared to the 50–100 studies/week performed by this user during this study. Consequently, clinical users may be less experienced with respect to manipulating contours, but also be less fatigued from repetitive analyses. Moreover, users in clinical practice will likely focus more on time-efficiency than on the absolute accuracy of each contour. In the present study the accuracy of each contour took priority, as an additional purpose of our study was to generate reference values. Thus the mean analysis times reported here may not be fully generalizable to any given clinical practice.
It bears mention that our study was conducted on image data obtained from Framingham Offspring cohort participants, the majority of whom were free from clinical cardiovascular disease at the time of CMR study. In clinical practice, analysis of SA functional CMR images is performed for a wide variety of clinical indications. Although similar performance can be expected for many conditions, lower contour accuracy may occur in cases with morphologically deformed ventricles, which may consequently require more analysis time. Additionally, we have shown the impact of detailed corrections on quantification of global functional parameters. The impact of detailed corrections on regional measurements of local myocardial contractile function was not assessed as the vast majority of participants were without wall motion abnormalities. The performance of automated contouring and the effect and time-efficiency of detailed manual correction on assessment of regional function remains to be investigated, and is probably better performed in a population with greater prevalence of both focal and global myocardial dysfunction.
The results reported in this paper are obtained using a particular clinical software application for the analysis of SA functional CMR images, which combines automatic and semi-automatic contouring algorithms with graphics editing tools to enable efficient analysis. It is unknown how timing results may be affected by using different contouring algorithms or other editing paradigms. Similarly, all image data used in this study were acquired with a single CMR scanner. Thus, the time-efficiency of our methods may vary when applied to data from other scanners. Furthermore, our results are not directly comparable to results from benchmarking experiments performed with other algorithms on other image data (27). That said, the accuracy of automatic contouring in this study was comparable to earlier published results (1,17,25) obtained on image data from other medical centers.
Finally, we did not measure the time expended to upload images for analysis. Although this can be important to clinical workflow, the generalizability of such data is extremely limited and depends strongly on local factors such as whether the analysis software operates on a stand-alone workstation or shares resources with other applications, e.g. a PACS, and of course on the specific hardware configuration. Since any report of data-loading times would apply only to our specific site, and would not provide information of general interest, these data were not obtained.
We have assessed the time-effectiveness, reproducibility, and overall performance of automatic contour detection methods for the analysis of SA functional CMR images in an evaluation study on images acquired from 1555 participants of the Framingham Heart Study Offspring cohort. We quantified the impact of manual corrections to automatically-detected contours on both LV volume measurements and analysis time. Whereas the addition of basal slices at ED to handle through-plane (long axis) motion was critical for accurate quantification of LV volumes and ejection fraction, manual contour corrections had minimal impact on global LV parameters, while more than doubling the analysis time. Comparison to fully manual analyses, on a subsample drawn from equal strata of sex and age-tertile, showed that complete analyses (automatic contour detection followed by manual correction as needed) are accurate, and there is at most minimal difference in global LV parameters between the two methods. Therefore we conclude that LV functional parameters and time-volume curves can be obtained in 3–4 minutes from short-axis functional CMR images using automatic contour detection methods in a strict protocol.
This project was supported in part by the National Heart, Lung and Blood Institute’s Framingham Heart Study (Contract No. N01-HC-25195), a subcontract from the National Institutes of Health (RO1 HL70279) and by an unrestricted grant from Philips Healthcare. GLTFH and MB are employees of Philips Healthcare.
1 The ranges of the Bland-Altman plots in Figure 6 exclude 1–3 extreme outlier(s), in which the difference in EDV was 119.7 ml, the difference in ESV was 55.1 ml, the difference in EF was 36.2%, and the difference in SV was 89.2 ml, 27.3 ml, and 29.8 ml, respectively.