Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Pattern Recognit. Author manuscript; available in PMC 2010 November 1.
Published in final edited form as:
Pattern Recognit. 2009 November 1; 42(11): 2514–2526.
doi:  10.1016/j.patcog.2009.04.022
PMCID: PMC2849737

Estimating Myocardial Motion by 4D Image Warping


A method for spatio-temporally smooth and consistent estimation of cardiac motion from MR cine sequences is proposed. Myocardial motion is estimated within a 4-dimensional (4D) registration framework, in which all 3D images obtained at different cardiac phases are simultaneously registered. This facilitates spatio-temporally consistent estimation of motion as opposed to other registration-based algorithms which estimate the motion by sequentially registering one frame to another. To facilitate image matching, an attribute vector (AV) is constructed for each point in the image, and is intended to serve as a “morphological signature” of that point. The AV includes intensity, boundary, and geometric moment invariants (GMIs). Hierarchical registration of two image sequences is achieved by using the most distinctive points for initial registration of two sequences and gradually adding less-distinctive points to refine the registration. Experimental results on real data demonstrate good performance of the proposed method for cardiac image registration and motion estimation. The motion estimation is validated via comparisons with motion estimates obtained from MR images with myocardial tagging.

Index Terms: Image Registration, Cardiac Motion Estimation, Spatio-Temporal Normalization

1. Introduction

According to World Health Organization estimates, 16.7 million people around the world die of cardiovascular diseases (CVD) each year [1]. CVDs which primarily affect the myocardium are called cardiomyopathies, and these can be generally classified into two major groups, extrinsic and intrinsic cardiomyopathies. Extrinsic cardiomyopathies are those where the primary pathology is outside the myocardium itself, e.g., ischemic cardiomyopathy or cardiomyopathy secondary to sarcoidosis. Intrinsic cardiomyopathies are not due to an identifiable external cause, and are usually diffuse and may be difficult to diagnose. Detecting abnormalities in myocardial function can greatly help in the early diagnosis of intrinsic cardiomyopathies.

Intrinsic cardiomyopathies present themselves in different forms, including structural changes such as wall thinning, fat deposition or fibrosis, and functional changes, e.g., variations in ventricular wall motion, ejection fraction, and perfusion. Both of these types of information need to be extracted from the image before an accurate diagnosis can be made. Much work has been done in the area of feature extractors for structural characterization of disease. This has focused primarily on image features such as intensities and gradients [2], moments [3, 4], Gabor features [5, 6], and local frequency representations [7]. One difficulty in analysis of diffuse cardiomyopathies is that not all of the pathology can be characterized in terms of structural changes. Function at rest may be abnormal as a result of ischemic heart disease (including ischemia, infarction, and stunned or hibernating myocardium) or related to myocardial structural changes from infiltrative or genetic cardiomyopathies. During stress testing, new or worsening wall motion abnormalities are indicative of functionally significant coronary artery stenosis [8]. In addition, wall motion imaging to detect regional contractile reserve is an accurate measure of myocardial viability in ischemic heart disease, and the results can help guide coronary revascularization therapy. Thus, characterizing cardiomyopathies based on both structural and functional changes will make the diagnostic algorithm more accurate and robust.

Most clinical modalities used to image myocardial function evaluate passive wall motion (ventriculography) or wall thickening (echocardiography, gated single-photon emission computed tomography, or cine MR imaging). Some types of imaging, such as MRI with myocardial tagging and echocardiography with tissue Doppler imaging, also allow quantitative measurement of regional intramyocardial motion and, subsequently, strain, which can be more sensitive to wall motion abnormalities than just wall thickening alone. MR imaging methods for quantification of intramyocardial wall motion can be loosely classified into two approaches, i.e., those relying on specially developed MR imaging pulse sequences, like MR Tagging [9, 10], to help in the estimation of myocardial motion, and those relying on image analysis techniques to extract motion estimates from conventional cine MR sequences. Recent work has also compared the results obtained from the analysis of tagged vs. cine MR sequences [11]. For the sake of brevity, we limit our review of related work to the latter.

A. Extracting motion from Cine MR images

Cine MR images are acquired in a clinical setting typically with an in-plane resolution of 0.84–1.2mm, and slice thickness in the range of 6–10mm. The temporal resolution generally varies between 25 and 70ms. Many methods have been developed for extracting cardiac motion fields from cine MR sequences [1216]; these can be classified into two main categories. The first approach uses segmentation of the myocardial wall, followed by geometric and mechanical modeling using active contours or surfaces to extract the displacement field and to perform the motion analysis [12, 16, 17]. For matching two contours or surfaces, curvatures are frequently used to establish initial sparse correspondences, followed by dense correspondence interpolation in other myocardial positions by regularization or mechanical modeling [12, 14]. The lack of distinct landmarks on the myocardial wall makes it difficult to estimate the wall motion based on surface tracking. In addition, this approach is very sensitive to the accuracy with which the myocardium can be segmented, and it performs poorly in regions within the myocardium, generally aligning only the myocardial boundaries.

The other approach uses energy-based warping or optical flow techniques to compute the displacement of the myocardium [15, 18, 19]. Perperidis et al [15] use a regular grid with a B-spline basis to model the deformation and use normalized mutual information as the similarity metric which is calculated over the whole image. Although this produces reasonable motion estimates, there are two major issues that need more attention. Firstly, since the similarity function is evaluated over the entire image, it is not very effective at capturing localized small deformations in the myocardial wall. These localized deformations can theoretically be captured by refining the B-Spline grid, however, increasing the grid resolution increases the computational cost of the method. In addition, increasing the grid resolution in regions outside the myocardium, like the blood pool, results in over registration, and creates unwanted artifacts. Since our ultimate goal is to be able to characterize cardiomyopathies, these subtle variations are very important. Again, these approaches are limited by the dimensionality of the transformation that they can estimate. The B-spline based approach would require a very fine grid in order to capture these subtle variations. Secondly, since most motion estimation approaches are built up from standard 3D deformable registration techniques, they do not take temporal smoothness and consistency into account, which is an important part of the estimation process. For example, while registering two cine MR sequences, Perperidis [15] enforces temporal smoothness by considering a 4D B-spline grid to model the transformation and therefore produces better results than those obtained earlier. We hypothesize that incorporating temporal smoothing into the motion estimation process will produce better estimates of myocardial motion. A more detailed and thorough review of cardiac image registration methods can be found in [20], and a more general review of image registration methods can be found in [21].

2. Method

In this paper, we propose a new method for estimation of spatio-temporally consistent myocardial motion fields from MR images, based on a 4D elastic registration algorithm. Unlike other approaches that estimate myocardial motion by performing a series of deformable 3D-registrations, we instead present it as a 4D registration problem, which makes the motion estimates temporally smooth and consistent. We describe our algorithm in detail in the following three subsections. First, we describe the formulation of the myocardial motion estimation problem as one of 4D registration. Then, we describe the use of attribute vectors for correspondence detection, which is fundamental to the registration process. Finally, we present the actual 4D registration algorithm, which is formulated as an energy minimization problem. The outline of the overall motion estimation algorithm is schematically shown in Figure 1.

Figure 1
Overview of the motion estimation algorithm. The algorithm starts with the fuzzy segmentation of the MR cine sequence, followed by the generation of the template sequence and the generation of the attribute vectors. The 4D transformation is then estimated ...

A. Motion estimation using a 4D registration framework

Cardiac motion estimation is the problem of determining a transformation that captures the motion of every point, x, in the myocardium over the cardiac cycle (1≤ tN). If we can register 3D images acquired at different phases, t, of the cardiac cycle to each other, we can then have an estimate of cardiac motion. This is the approach taken in most cardiac motion estimation algorithms, i.e., they register all the frames in the sequence to a reference frame, typically the end-diastolic frame. Thus, for estimating cardiac motion from end-diastole to other times in a periodic sequence of N 3D images I(x,t)={I1(x), … It(x), … IN(x)}, with I1 as the end-diastolic image, we need to register the end-diastolic image I1 to images at all other time-points. As mentioned earlier, this approach does not guarantee temporal smoothness or consistency. In particular, a single frame with bad correspondences can add substantial errors to the motion estimation; solving it within a 4D registration framework may obviate these problems. As long as a majority of neighboring frames have accurate correspondences, the bad correspondences in a bad frame will not contaminate estimates in those neighboring frames under the 4D registration framework. In contrast, the correspondences in bad frames can be potentially corrected by neighboring frames.

In order to use a 4D image registration for simultaneously estimating the motion from the end-diastolic image I1 to all other time-points {It}, we generate a new 4D image, i.e., a new image sequence T (x, t) = {I1, ···, I1}, which repeats the end-diastolic image I1 as images at N different time-points (Figure 2). This image sequence represents the stationary heart of the same patient, and is used as the template sequence. Thus, by registering the 4D image I(x, t) to T (x, t) via a spatial transformation h (x, t), the motion for each point x in the end-diastolic I1 to all other time-points in a cardiac sequence can be estimated. The displacement field u (x, t) defines the mapping from the coordinate system of the end-diastolic image, I1, to the image at time t, I(x, t), while the transformation h (x, t) = x + u (x, t) defines the mapping that transforms the end-diastolic image into the specific shape at time t. Notably, the transformation h (x, t) does not contain any temporal components, and it is thus restricted to 3D spatial deformations at the same time-point, since no temporal differences exist in the generated image sequence T (x, t) and thus there is no need to consider temporal variations.

Figure 2
The formulation of myocardial motion estimation as a 4D image registration problem. The problem of motion estimation can be thought of as a 4D registration problem of the sequence with another 4D sequence which is formed by the replication of the end-diastole ...

B. Attribute Vector

The accuracy and robustness of correspondence detection depends on how effectively the image descriptor can capture the local and global properties of the given point. In a registration context, it is desirable that these descriptors are scale and rotation invariant. The heart does not have many geometrically discernible points and most local descriptors, such as intensity and gradient information, fail to capture the uniqueness of a given voxel. In order to register two image sequences accurately, we design for each point a morphological signature, i.e., an attribute vector a(x,t), for the purpose of minimizing the ambiguity in image matching and correspondence detection during the deformable registration procedure.

The attribute vector defined at each voxel, x, in the image reflects the underlying structure at different scales. Each attribute vector includes not only image intensity, but also boundary, and Geometric Moment Invariants (GMIs) [3], all of which are computed from the 3D spatial images. The attribute vector, a(x), is made up of three components a1(x), a2(x), and a3(x). a1(x) represents the edge type of the voxel x in the image. In cases where hard segmentation is available, it is a scalar value and takes one of 7 discrete values, corresponding to 7 edge types such as non-edge and six combinations of edges between the myocardium (M), blood (B) and the surrounding tissue (S). In this paper, we use results from a fuzzy segmentation algorithm proposed by Pham et al. [22]. Each voxel, x, has the tissue classification result as a vector [CM, CB, CS]T, where CM, CB and CS take the real values between 0 and 1.0, and CM + CB + CS =1.0. The attribute a1(x) is represented by a 3×1 vector which is the maximum difference between the voxel x and its neighboring voxels (in a 3×3 neighbourhood N(x)). That is,


a2(x) is a scalar that represents the gray scale intensity of the voxel x, which has been normalized to have values between 0 and 1. The normalization is performed relative to the minimum and maximum values over the entire sequence.

The vector a3(x) comprises of the GMIs of each tissue at different scales. For each scale and each tissue type, there are thirteen rotation invariants that are calculated from the zero-order, second-order and third-order 3D regular moments [23]. GMIs are computed from different neighborhood sizes, and are concatenated into a long attribute vector. GMIs at a particular scale are calculated by placing a spherical neighborhood, of radius R, around each voxel and calculating a number of parameters that are invariant to rotation. GMIs can characterize the geometric properties of objects and are especially useful in distinguishing a voxel from close neighbors which can have similar intensity and edge types. The spherical neighborhoods are normalized to the unit sphere, which thus normalizes the GMIs in turn. The detailed definitions for attribute vectors are given in [3]. It is worth noting that for the generated image sequence T(x,t), we need only to compute attribute vectors for one 3D image and other identical images will have the same set of attribute vectors.

Attribute Similarity

Cine MR images carry very little information within the myocardium, since generally the blood is bright and the myocardium is dark. Therefore, the blood-myocardium interface provides valuable information for cardiac motion estimation. For this reason, we require that the boundary voxels in the template deform to the same boundary type in the subject. We therefore incorporate the benefits of both the surface-based and image-based motion estimation approaches by requiring the boundaries to match up and allowing the attribute similarities to determine correspondences in other areas. This leads to our definition of the similarity between two voxels x and y as,


where c([a2 (x) a3 (x)],[a2 (y) a3 (y)]) is the similarity of the second and the third parts of the attribute vectors, which is defined as,


where a3i(x) is the i-th element of a3 (x) that has a total of K elements. Since we use fuzzy segmentation and the attribute a1 is a 3×1 vector, the inequality a1 (x) ≠ a1(y) is replaced by left angle bracketa1(x), a1(y)right angle bracket < ε, where left angle bracket•, •right angle bracket is the inner product and ε was selected to be 0.1 in the experiments reported in this paper. In addition, we define the distance between two attribute vectors as d (a(x), a(y)) = 1− m(a(x), a(y)). The effectiveness of the attribute vector in localizing a specific voxel is shown in Figure 3. In Figure 4, we demonstrate the effectiveness of the attribute voxel in detecting correspondence across images acquired at different cardiac phases.

Figure 3
The similarity evaluated in the neighborhood of a voxel based on the attribute vectors. The yellow point marks the voxel in whose neighborhood the similarity is computed. Red represents higher similarity and green lower similarity.
Figure 4
The similarity of the voxels marked in Figure 3 plotted in the mid-systole phase. The original coordinates of the voxels are shown in blue. The most probable correspondences are marked in yellow. It can be seen that the correspondences for the points ...

Distinctiveness of Attribute Vectors

In order to be able to pick focus points for driving image registration, we need to design a criterion for selecting points with distinctive attribute vectors. Notice that the iterative addition of focus points during the progress of image registration also allows us to robustly perform the registration in a fast hierarchical manner. Since, the zero-order GMIs in a3 (x) directly represents the volume of each tissue, they are used to select the focus points. We define the myocardium- blood boundary voxels with relatively high myocardial tissue volume as well as the myocardium-surrounding tissue boundary voxels with relatively low myocardial tissue volume, as being the most distinctive. These are the myocardial regions with high curvature and are therefore more easily identifiable as compared to other boundary voxels. Mathematically, the weight, ωT (x, t), for each voxel x can be computed as ωT (x, t) 2|0.5 −Vm|, where Vm is the total myocardial volume within a unit sphere. The calculated weight is thus used as the distinctiveness of each voxel, and after sorting, to select the focus points during the procedure of energy function evaluation.

C. Energy Function

We solve the 4D image registration problem by hierarchically matching attribute vectors in the two image sequences and estimating the transformation that minimizes the difference between these attribute vectors. We model the 4D registration as an energy minimization problem, where the energy term includes temporal consistency and spatio-temporal smoothness terms in addition to the attribute similarity terms. The energy function can be written as,


Here, the first two terms, EF and EB are the image attribute similarity terms, EC enforces temporal consistency, and ES enforces spatio-temporal smoothness. Each of these terms is now explained in detail.


To make the registration independent of which of the two sequences is treated as the template [3, 24], the energy function that evaluates the match of two image sequences should be symmetric with respect to the two images being registered. Therefore, we evaluate both the forward transformation h(x,t) and the backward transformation h−1(x,t) and force them to be consistent1 with each other. This improves the robustness of the motion estimation algorithm since it enforces inverse-consistency during the correspondence detection. In future, we would like to explore alternative strategies for enforcing inverse consistency [25].

With this consideration, we have two attribute similarity terms, i.e., the forward similarity, EF, and the backward similarity, EB, which are respectively defined as,


The first term EF is defined on the forward transformation h(x,t), and measures the similarity of attribute vectors between each point in the sequence T(x,t) and its corresponding one in the sequence I(x,t). The second energy term EB is similar to the first term, but is instead defined on the inverse transformation h−1(x,t) to ensure that each point in the sequence I(x,t) also finds its best matching point in the sequence T(x,t). Specifically, in the forward similarity term, the importance of each point (x,t) in the image registration is determined by its corresponding parameter ωT(x,t), which is designed to be proportional to the distinctiveness of this point’s attribute vector aT(x,t). The match for each point (x,t) is evaluated in its 4D (3D spatial and 1D temporal) spherical neighborhood n(x,t), by integrating all differences between the attribute vector aT(z,τ) of every neighboring point (z,τ) and the attribute vector aI(h(z,τ)) of the corresponding point h(z,τ) in the sequence I(x,t). The difference of two attribute vectors d(·, ·) ranges from 0 to 1 [3]. The size of the neighborhood n(x,t), r, is large initially and decreases gradually with the progress of the registration, thereby increasing robustness and accuracy of deformable registration.

The radius of the spherical neighborhood, r, is chosen to be the same as the search range, δ, which is given by,


Where, δ is the search range, which depends on the current level at which the registration is being performed, δ0 is the initial search range for the current resolution, and ζ is the current iteration number normalized to be between 0 and 1. The values of the parameters used in this paper are listed in Table 1.

Table 1
List of the parameters used in this paper. We used the same set of parameters for all the results shown in this paper.


The consistency energy term EC measures the attribute-vector matching of corresponding points in different time frames of the sequence I(x,t). This can be written as,


For each point (x,t) in the sequence T(x,t), its corresponding point in the sequence I(x,t) is h(x,t). Since the sequence T(x,t) has identical images at different time-points, i.e., same end-diastolic image, points {(x,t),1 ≤ tN} are the N corresponding points in the sequence T(x,t); accordingly, N transformed points {h(x,t),1 ≤ tN} are the established correspondences in the sequence I(x,t). In this way, we can require the attribute vector aI(h(x,t),t) of a point h(x,t) in the image I(x,t) to be similar to the attribute vector aI(h(x,τ), τ) of its corresponding point h(x,τ) in the neighboring time-point image I(x,τ). This requirement is repeated for each position (z,τ) in a 4D neighborhood n(x,t), and the total attribute-vector difference is weighted by εT(x,t) to reflect the importance of a point (x,t) in the sequence T(x,t). In this paper the weight εT (x, t) used is the same as the distinctiveness weight ωT (x, t) as mentioned above. We use a different notation to stress on the fact that these weights should be biased towards points which are temporally well resolved and not just spatially. The current work does not treat these differently, and this is left as future work.

The use of this energy term makes it easier to solve the 4D registration problem, since the registration of cardiac images at neighboring time-points is relatively easier and thus it can provide a good initialization for 4D image registration by initially focusing only on energy terms of Ec and ES.


The fourth energy term ES is a smoothness constraint for the transformation h(x,t),


For convenience, we separate this smoothness constraint into two components, i.e., a spatial smoothness constraint ESSpatial and a temporal smoothness constraint ESTemporal, and these two constraints are linearly combined by their own weighting parameters α and β. The values α=1,β=0.5 were used in this paper.

For the spatial smoothness constraint, we use a Laplacian operator to impose spatial smoothness, which is defined as,


where the Laplacian operator, [nabla]2, is given by,


where (e1, e2, e3) represents the Cartesian coordinates. The Laplacian is numerically computed from the current estimate of the displacement field using a 7-point stencil, which considers the (x ± 1, y, z), (x, y ± 1, z) and (x, y, z ± 1) neighbors for a given voxel (x, y, z).

It is important to remember that all images in the sequence T(x,t) are the identical end-diastolic image. Since maximum strain is experienced at end-systole, we have to register the end-diastolic image with the end-systolic image using a large nonlinear transformation, which might be over-smoothed by the Laplacian operator. To avoid this over-smoothing problem, we use a multi-resolution framework, i.e., multi-level transformations, to implement our registration algorithm. Each resolution will estimate its own level of transformation based on the total transformations estimated from the previous resolutions, and the final transformation used to register two image sequences is the summation of all levels of transformations respectively estimated from all resolutions. Notably, the Laplacian operator is only allowed to smooth the current level of transformation being estimated at the current resolution, which effectively avoids smoothing the transformations estimated from the previous resolutions.

For the temporal smoothness constraint, we use a Gaussian filter, with kernel size σ, to obtain an average transformation in a 1D temporal neighborhood around each point (x,t), and force the transformation on this point (x,t) to follow its average transformation in the temporal neighborhood. In this paper we used a Gaussian filter with σ=1.


Since the image sequence is periodic, and represents one cardiac cycle that repeats itself, we need to add a constraint for periodicity. This is because the first image I1 and the last image IN are temporal neighbors as indicated by solid curved arrows in Fig 2. In this way, we ensure that each point x moves smoothly along the temporal direction from the end-diastolic image I1 to other time-points, and returns to its original position after completing the 4D image registration, since the first images respectively in the two sequences are identical and the transformation between them is thus forced to be exactly zero during the entire registration procedure.

D. Multi-Resolution Approach

We perform the registration in a multi-resolution fashion by adaptively selecting different sets of image points that are used to drive the registration. For each resolution, the adaptive selection of focus points is done on the basis of the distinctiveness of the voxels. Initially only the most distinctive voxels are selected. At later iterations, additional less-distinctive voxels are selected and used for refining image registration. In particular, we express the total image similarity energy as a weighted sum of the similarity energies of the individual points. The weights are assigned to the points according to the distinctiveness of the attribute vectors, i.e., we assign large weights for the energy terms of points that are distinctive, such as points with high curvature along the left ventricular wall. We assign a weight of zero to all points that are not to be considered for the global energy term at a given registration level. This allows us to focus on the most suitable points for actively driving the image registration.

The adaptive approach using focus points is superior to standard sub-sampling based multi-resolution registration techniques for several reasons. First, it potentially avoids errors in the image similarity terms by using only the most distinctive points at each level, rather than using a regular (or random) sampling of the image space. This increases the robustness of the motion estimation, as it is not affected by the similarity of unreliable points, such as points in the ventricular blood pool. Second, as we pick points adaptively, the estimated transformation captures the actual transformation better than a sub-sampled version, even at the low resolution levels, i.e., we pick more points in regions with more deformation and fewer in regions which are relatively static. Finally, another benefit is that of improved speed of the motion estimation. If we were to solve for the displacement at every grid point on the image, the dimensionality of the cost function that we minimize would be extremely high and the cost function evaluation would be much more expensive. Solving for the displacements at only the focus points reduces the dimensionality of the optimization problem. Thus, the procedure approximates a very high-dimensional cost function (equal to the number of points in the image sequence) by a lower-dimensional cost function of only the active points. This latter function has few local minima, because it is a function of the coordinates of active points, for which relatively unambiguous matches can be found. Therefore, using this strategy, we can speed up the performance of image registration and also reduce the chances of local minima, which in part result from ambiguities in determining the matching pairs of points. The downhill simplex method [26] was used to minimize the cost function.

Estimating the parameters used in this paper

Several parameters have been used in this work and the correct choice of these parameters is important for the successful implementation of this method. The parameters were selected empirically by running experiments on simulated and real data to determine the best parameters for the motion estimation.

3. Results

Several experiments were performed to validate the accuracy of our motion estimation algorithm, including demonstrating the performance by visual inspection of the resulting warps, motion fields, interface tracking, and ventricular volumes. We also validate the motion fields by comparison to motion estimates obtained using tagged MR data and extracted using the method proposed by Chandrashekara et al [27].

We acquired cine MR sequences with and without myocardial tagging for 3 healthy volunteers in order to validate our motion estimation algorithm. The cine MR and tagged images were acquired on a Siemens Sonata 1.5T scanner during the same scanning session at end-expiration, thus the datasets are assumed self-registered. Short and long axis segmented k-space breath-hold cine TrueFISP (SSFP) images were acquired for three healthy volunteers. The image dimensions were 156×192 pixels with a pixel size of 1.14×1.14 mm2. A total of 11 slices were acquired with a slice thickness of 6mm and a slice gap of 2mm. For validation purposes, we also acquired grid tagged, segmented k-space breath-hold cine TurboFLASH images. The image dimensions were 156×192 pixels with a pixel size of 1.14×1.14 mm2. The slice thickness was 8mm. Three short axis (apical, mid-ventricular and basal) and one long-axis images were acquired. The MR images were converted into isotropic volumes using cubic spline interpolation prior to motion estimation. All runs were performed on an SGI® Origin® 300 server, using a single processor, and took on an average 2.5 hours for the entire motion estimation.

A. Visual Inspection of Motion Estimation Results

The first experiment evaluates the performance of the proposed method by visual inspection of the deformation field and the warped images. We evaluate the smoothness of the deformation field by evaluating the determinant of the Jacobian matrix over the entire myocardium, and verify that this is positive at all voxels. The deformation field, and correspondingly the motion estimates, can additionally be evaluated by warping the end-diastole image and comparing the warped images with the original images. All of the cine sequences in our test datasets had 33 time frames; selected frames along with their warped counterparts are shown in Figure 5. The similarity between the original and the warped images can be seen in these images, even for the end-systolic frames where the deformation is large. The deformation vectors over the left ventricular region are shown in Figure 5(c); note that these are in agreement with expectations of myocardial motion based on other methods of estimation.

Figure 5
Estimating deformations from end-diastole to other frames. (a) Five selected cardiac images, (b) the results of warping end-diastole to other time-points, (c) deformations around left ventricle, estimated from the end-diastolic image and cropped here ...

An important benefit of motion estimation is that we can use the estimates to perform boundary tracking and segmentation. In other words, if one of the time frames is labeled or segmented, this information can be propagated to the other frames. This can also be used to evaluate the accuracy of the motion estimation algorithm, as we can compare the segmentation results with those performed by hand. The tracking results are shown in Figure 6, where the myocardial walls are marked in red. For the first frame, these contours are obtained by a semi-automatic segmentation procedure using the method in [22]. These semi-automatically segmented contours can be propagated to the other frames based on the deformation fields obtained from the 4D registration procedure. The deformed contours are shown in both the long and short-axis views of the heart; the blood-myocardium interface is tracked accurately in both views and over different time-frames.

Figure 6
Tracking/labeling the boundaries of interest in a cardiac sequence.

Additionally, a point in the myocardium can also be tracked temporally, as shown in Figure 7. This result demonstrates the temporal smoothness of the motion estimation, which cannot be interpreted easily by observing the motion fields. An alternate way of interpreting this is to analyze the left-ventricular volume, obtained using manual segmentation and by using the motion estimates, as seen for one particular subject in Figure 8. The motion estimation based volume curve is smoother than the one obtained by manual segmentation, but is in general agreement with the latter. We manually segmented the left ventricle for 8 subjects, 3 volunteers and 5 with suspected cardiomyopathy, and evaluated the difference in ventricular volume over the entire cardiac cycle. The average volume error was 3.37%, with standard deviation 2.56%; and average volume overlap error was 7.04%, with standard deviation 3.28%. The correlation coefficient between the manual and automatic segmentations was r = 0.99. In addition, the determinants of the Jacobian of the displacement field are shown in Figure 9. The maximum volumetric change was less than 5% over the myocardium for all subjects. This is consistent with the near incompressibility of the myocardium reported in literature [28]; actually, cardiac motion estimation algorithms have also incorporated incompressibility constraints to obtain better estimates of motion [29].

Figure 7
Tracking/labeling results in temporal views of two short-axis lines. The red colors indicate the labeling results on the left and right ventricular boundaries.
Figure 8
Left ventricular volume of a selected subject, segmented by our algorithm (blue-solid curve) and by hand (red-dotted curve) over all frames in a cardiac cycle.
Figure 9
The determinants of the Jacobian of the deformation field displayed for frames around end-systole. The maximum volumetric change observed is ~ 4%, which is consistent with the nearly incompressible properties of the myocardium.

B. Validation using Kinematic Model of the Left Ventricle

Since the ground truth motion estimation is not known in most cases, we used motion fields generated by a kinematic model of the heart [30], to generate synthetic motion fields and corresponding images. We manually selected points on the left ventricle in a mid-systole image of a volunteer dataset. These points are shown in Figure 10. The motion for these points was then calculated using the Kinematic model. The model described by Arts [30], uses 13 parameters to describe the motion of the left ventricle. The parameter Δk1 is associated with changes in ventricular volume, k2 with torsion, k3 with the ratio of axial length to diameter, k4k7 with asymmetric linear (shear) deformation, k8k10 with rotations, and k11k13 with translations along the x, y and z axes. The values of these parameters (as a function of time) were selected to approximate those reported in [30]. Specifically, the principal modes were approximated by,

Figure 10
The points picked on the mid-systole frame to generate synthetic datasets for validation. Sample points are shown on the short-axis view (left), and in 3D (right).


where, a1 = a2 = 0.1, b1 = 1 and a3 = 0.05. The time parameter, t, was selected such that, t=0 at mid systole, t=−0.5 at end-diastole and t=0.5 at end-systole. These numbers match closely with those reported in [30]. The temporal profiles for the remaining parameters were manually selected to approximate the values reported in [30].

A dense motion field was estimated over the entire image domain by applying a Gaussian smoothing kernel, with σ equal to one half the average distance between the points. The motion field thus obtained was applied to the original mid-systole image to obtain a sequence of images from end-diastole to end-systole. A total of 10 synthetic datasets were created by adding random noise (5–10% of maximum amplitude) to the principal modes of variation (Δk1, k2, k3).

Motion fields were estimated using our algorithm on the synthetic dataset, and the estimated motion fields were compared against the original motion fields (ground truth). The accuracy of the motion estimation was evaluated by computing the angle between the estimated displacement vector and the ground truth displacement. The estimated displacement fields are defined to be in agreement with the ground truth if the angle between them is within a tolerance angle θ. In order to assess the match, we determine the tolerance angle for which 95% and 99% of all the voxels in the left ventricle are in agreement. The results are presented in Figure 10. As can be seen, in all cases there is very good agreement (θ < 7°) in the vector fields. Additionally, the determinant of the Jacobian of the displacement field was compared over the LV myocardium. The RMS error for the estimated determinant of the Jacobian with respect to the ground truth is shown in Figure 11, as well.

Figure 11
Comparison with left ventricular motion generated using a Kinematic model [1]. We show, (a) short-axis and (b) long-axis views of the angular difference between the estimates and ground truth in degrees. The percentage error in the estimation of the determinant ...

C. Validation Against Tagged MR Images

Although the above mentioned validation schemes ascertain to a certain extent the accuracy of the obtained motion field, they are limited to validation based on visual inspection of prominent features like the myocardial boundary, or based on simulated datasets. Evaluating the accuracy of the registration and consequently the motion estimation in other regions, for example inside the myocardium, is not as straightforward. In order to overcome this, we use the motion estimates obtained to place virtual tag-planes on the tagged images. The virtual tag-planes are drawn manually on the end-diastolic image as planes perpendicular to the image plane. These are deformed according to the motion estimate to obtain the virtual tag-planes for the other frames. Visual comparison of these with the actual tags is an effective way of estimating the accuracy of the motion estimation procedure, even in areas without visually discernible features. These results are shown in Figures 12 and and1313.

Figure 12
Virtual tag-planes overlaid on tagged MR images. The virtual tag-planes were obtained by deforming manually drawn lines (and extending perpendicular the end-diastolic image) using the deformation field obtained from the motion estimation procedure. The ...
Figure 13
Virtual tag-planes overlaid on long axis views of tagged MR images. The left frame is at end-diastole, the middle frame at mid-systole and the right frame at end-systole. Observe that there is a good match between the virtual and the real tags, and that ...

To quantize the accuracy of the agreement with the virtual tags, tag-intersection points were manually detected by an observer in three different short-axis (SA) slices and one long-axis (LA) slice for all time frames between end-diastole and end-systole, including 20 points on the SA slices and 14 points on the LA slices. The root mean square error between the in-plane displacements estimated from the registration algorithm and the actual in-plane displacements as measured by the observer are presented in Figure 14. It can be seen that the average error is within acceptable limits (2mm) and support our claim to the estimation of accurate motion estimation.

Figure 14
Plot of the rms error for the displacements estimated using the 4D registration algorithm against those obtained via manual tracking of tag intersection points. RMS errors for short-axis slices close to the apex (apical SA), at the middle of the ventricle ...

In order to further evaluate the accuracy of the motion estimates obtained using our method, we extracted the motion fields from the tagged MR images using the method proposed by Chandrashekara et al [27]. We compared the determinant of the Jacobian, calculated in blocks corresponding to the tag spacing (8 mm). The average error in the determinant of the Jacobian between the two methods was 5.63% with a standard deviation of 3.44%.

D. Strain Computations

In order to further validate the proposed method, strain analysis was performed on the three subjects and compared with strain estimates obtained using the method proposed by Chandrashekara et al [27]. Strain analysis during systole was performed by dividing the LV into the standard segments of the American Heart Association (AHA)2. Myocardial motion was recovered using both methods and used to compute the radial and circumferential strains. The motion estimated for the method proposed by Chandrashekara et al [27] were obtained using tagged MR images, whereas we computed the motion estimates for our method using the Cine MR sequences. The average peak strains and standard deviations at end systole are shown in Table 2. As can be seen, the strains estimated by both methods are in general agreement with each other. It is important to note that the strains obtained from the Cine sequences compare very well with those obtained using tagged MR images. This is a confirmation of the quality of our motion estimation algorithm.

Table 2
Average peak strain and standard deviation at end-systole for 3 healthy subjects.

4. Conclusion

A 4D deformable registration method for estimation of cardiac motion from MR image sequences was presented, and experimentally tested. The cardiac motion estimation was formulated as a 4D image registration problem, which simultaneously considers all images of different time-points and further constrains the spatiotemporal smoothness of estimated motion fields concurrently with the image registration procedure. Also, compared to other motion estimation methods that use relatively simple features such as curvature of the left ventricular border, our method uses a rich set of attributes, including GMIs, to distinguish the corresponding points across different time-points, thereby maximally reducing ambiguity in image matching. Finally, by selecting the active points hierarchically based on the degree of distinctiveness of their attribute vectors, the proposed registration algorithm has the potential to produce a global solution for motion estimation.

The performance of this 4D registration method for cardiac applications has been evaluated by visual inspection as well as quantitative validation using tagged MR images. Experimental results demonstrate good performance of our method in estimating cardiac motion from cine MR sequences. The motion estimates are statistically similar to the motion of tag lines obtained from tagged MR images. In addition, the radial and circumferential strain estimates obtained by our methods from Cine MR images compare favorably with those obtained from co-registered tagged MR images. Cine MR images are commonly acquired in a clinical setting, and the use of our algorithm may obviate the need to acquire additional tagged images to estimate myocardial motion. In addition, our method allows for dense estimates of myocardial motion, which is not restricted to the intersection of tag lines, thus helping detect abnormal myocardial motion and also potentially improving early diagnosis and treatment planning of cardiomyopathies.


1From the perspective of correspondence detection, this merely enforces inverse consistency, i.e., a point y is a consistent match for x if x is also the best match for point y.

2We use a simplified 12 segment model, dividing longitudinally into basal, mid-ventricular and apical sections, and dividing each into 4 segments, Septum, anterior, lateral and inferior.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


1. Heart Disease and Stroke Statistics. American Heart Association; Dallas, Texas: 2004.
2. Roche A, Pennec X, Malandain G, Ayache N. Rigid registration of 3-D ultrasound with MR images: a new approach combining intensity and gradient information. IEEE Transactions on Medical Imaging. 2001;20:1038–1049. [PubMed]
3. Shen D, Davatzikos C. HAMMER: Hierarchical attribute matching mechanism for elastic registration. IEEE Transactions on Medical Imaging. 2002 November;21:1421–1439. [PubMed]
4. Teh CH, Chin RT. On image analysis by the method of moments. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1988;10:496–513.
5. Manjunath BS, Ma WY. Texture Features for Browsing and Retrieval of Image Data. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1996 August;18:837–842.
6. Verma R, Davatzikos C. Matching of Diffusion Tensor Images Using Gabor Features; Proceedings of the IEEE International Symposium on Biomedical Imaging (ISBI); Arlington, Va. 2004. pp. 396–399.
7. Liu J, Vemuri BC, Marroquin JL. Local frequency representations for robust multimodal image registration. IEEE Transactions on Medical Imaging. 2002;21:462–469. [PubMed]
8. Smart SC, Knickelbine T, Malik F, Sagar KB. Dobutamine-Atropine Stress Echocardiography for the Detection of Coronary Artery Disease in Patients With Left Ventricular Hypertrophy: Importance of Chamber Size and Systolic Wall Stress. Circulation. 2000 January 25;101:258–263. [PubMed]
9. Zerhouni E, Parish D, Rogers W, Yang A, Shapiro E. Human heart: tagging with MR imaging--a method for noninvasive assessment of myocardial motion. Radiology. 1988 October 1;169:59–63. [PubMed]
10. Guttman MA, Prince JL, McVeigh ER. Tag and contour detection in tagged MR images of the left ventricle. IEEE Transactions on Medical Imaging. 1994;13:74–88. [PubMed]
11. Baho A, Carbayo ML, Marta CS, David EP, Fernandes MG, Desco M, Santos A. Cardiac Motion analysis from Magnetic resonance Imaging: Cine Magnetic Resonance versus Tagged Magnetic Resonance. Computers in Cardiology. 2007
12. Shi P, Sinusas AJ, Constable RT, Duncan JS. Volumetric Deformation Analysis Using Mechanics-Based Data Fusion: Applications in Cardiac Motion Recovery. International Journal of Computer Vision. 1999/11;35:87–107.
13. Amini AA, Prince JL. Measurement of Cardiac Deformations from MRI. Physical and Mathematical Models; Kluwer: 2001.
14. McEachen JC, Nehorai A, Duncan JS. Multiframe temporal estimation of cardiac nonrigid motion. IEEE Transactions on Image Processing. 2000;9:651–665. [PubMed]
15. Perperidis D, Mohiaddin RH, Rueckert D. Spatio-temporal free-form registration of cardiac MR image sequences. Medical Image Analysis. 2005/10;9:441–456. [PubMed]
16. Wang YP, Chen Y, Amini AA. Fast LV motion estimation using subspace approximation techniques. IEEE Transactions on Medical Imaging. 2001;20:499–513. [PubMed]
17. Papademetris X, Sinusas AJ, Dione DP, Duncan JS. Estimation of 3D left ventricular deformation from echocardiography. Medical Image Analysis. 2001/3;5:17–28. [PubMed]
18. Song SM, Leahy RM. Computation of 3-D velocity fields from 3-D cine CT images of a human heart. IEEE Transactions on Medical Imaging. 1991;10:295–306. [PubMed]
19. Ledesma-Carbayo MaJ, Kybic J, Desco M, Santos As, Unser M. Cardiac Motion Analysis from Ultrasound Sequences Using Non-rigid Registration. (2208) 2001 [PubMed]
20. Makela T, Clarysse P, Sipila O, Pauna N, Pham QC, Katila T, Magnin IE. A review of cardiac image registration methods. IEEE Transactions on Medical Imaging. 2002;21:1011–1021. [PubMed]
21. Zitova B, Flusser J. Image registration methods: a survey. Image and Vision Computing. 2003/10;21:977–1000.
22. Pham DL, Prince JL. Adaptive fuzzy segmentation of magnetic resonance images. IEEE Transactions on Medical Imaging. 1999 September;18:737–752. [PubMed]
23. Lo CH, Don HS. 3-D Moment Forms: Their Construction and Application to Object Identification and Positioning. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1989 October;11:1053–1064.
24. Christensen GE, Johnson HJ. Consistent Image Registration. IEEE Transactions on Medical Imaging. 2001;20:568–582. [PubMed]
25. Tagare HD, Groisser D, Skrinjar O. A Geometric Theory of Symmetric Registration; Proceedings of the 2006 Conference on Computer Vision and Pattern Recognition Workshop; 2006.
26. Press W, Teukolsky S, Vetterling W, Flannery B. Numerical Recipes in C. Cambridge University Press; 1992.
27. Chandrashekara R, Mohiaddin RH, Rueckert D. Analysis of 3-D myocardial motion in tagged MR images using nonrigid image registration. IEEE Transactions on Medical Imaging. 2004;23:1245–1250. [PubMed]
28. Yin FC, Chan CC, Judd RM. Compressibility of perfused passive myocardium. American Journal of Physiology- Heart and Circulatory Physiology. 1996;271:1864–1870. [PubMed]
29. Bistoquet A, Parks WJ, Skrinjar O. Cardiac Deformation Recovery using a 3D Incompressible Deformable Model; Proceedings of the 2006 Conference on Computer Vision and Pattern Recognition Workshop; 2006.
30. Arts T, Hunter WC, Douglas A, Muijtjens AM, Reneman RS. Description of the deformation of the left ventricle by a kinematic model. J Biomech. 1992;25:1119–27. [PubMed]