Shape analysis is useful in a wide variety of disciplines and has many applications. Of the many approaches to shape analysis, one focuses on shapes that are represented by the coordinates of predefined landmarks on the object. This paper discusses tridimensional regression, a technique for comparing and mapping images and shapes, such as 3D anatomical structures, that are represented by sets of three-dimensional landmark coordinates. The degree of similarity between shapes can be quantified using the tridimensional coefficient of determination ($R^2$). An experiment was conducted to evaluate how effectively this technique matches the image of a face with another image of the same face. The results were compared to the $R^2$ values obtained when only two dimensions are used, and they show that using three dimensions increases the ability to correctly match and discriminate between faces.
Tobler proposed bidimensional regression as a tool for computing the degree of similarity between two planar configurations of points and for estimating mapping relations between two objects that are represented by a set of two-dimensional landmarks. Bidimensional regression is an extension of linear regression in which both dependent and independent variables are represented by coordinate pairs instead of scalar values. Specifically, Tobler suggested that bidimensional regression may be useful for comparing signatures, geographical maps, or faces. The latter was done in the context of face recognition by Shi et al. and Kare et al.
Tobler's method has been extended to tridimensional regression for situations in which both dependent and independent variables are represented by three-dimensional coordinates. The purpose of this paper is to provide a summary of that extension, to illustrate the use of tridimensional regression for comparing and mapping anatomical structures, and to compare the effectiveness of the two-dimensional and three-dimensional methods. The widespread use of three-dimensional imaging devices in many areas of research makes this work timely. The technique is broadly applicable to any situation in which spatial configurations of three-dimensional points are compared, such as the three-dimensional mapping and comparison of objects or structures that are represented by their three-dimensional coordinates. The $R^2$ values derived from the regression allow the degree of similarity between two objects to be quantified.
In this section, a brief summary of bidimensional regression is provided, followed by details of its extension to the tridimensional regression models.
Nakaya defines a bidimensional regression model as

$$\begin{pmatrix} u_i \\ v_i \end{pmatrix} = \begin{pmatrix} g(x_i, y_i) \\ h(x_i, y_i) \end{pmatrix} + \begin{pmatrix} \varepsilon_i \\ \eta_i \end{pmatrix},$$

where $(u_i, v_i)$ are the coordinates of the dependent variable, $(x_i, y_i)$ are the corresponding coordinates of the independent variable, $g$ and $h$ are transformation functions used to estimate mapping relations between the independent and dependent variables, and $(\varepsilon_i, \eta_i)$ is an error vector that is assumed to be normally and independently distributed. Both Tobler and Nakaya discuss obtaining estimates for the parameters in $g$ and $h$ using the method of least squares, so that

$$\sum_{i=1}^{n} \left[ \left(u_i - \hat{g}(x_i, y_i)\right)^2 + \left(v_i - \hat{h}(x_i, y_i)\right)^2 \right],$$

where $\hat{g}$ and $\hat{h}$ are the transformation functions evaluated at the parameter estimates, is minimized. Here $n$ is the number of landmark points used in the analysis. The normal equations are obtained in the usual manner, and solving

$$X_j^{\top} X_j \hat{\boldsymbol{\beta}} = X_j^{\top} \mathbf{y},$$

where $X_j$ is the design matrix of transformation $j$ and $\mathbf{y}$ is a $(2n \times 1)$ vector for the dependent variable partitioned by the coordinates, yields the least-squares parameter estimates $\hat{\boldsymbol{\beta}}$. The design matrix $X_j$ depends on the transformation used and the number of parameters to estimate; hence, the dimension of $\hat{\boldsymbol{\beta}}$ is also determined by the type of transformation.
Tobler proposes four bidimensional regression models, three of which are intrinsically linear and one of which is curvilinear. Friedman and Kohler argue that the curvilinear model may be too general for practical use and describe the linear transformations in more detail. Each of the three linear transformations is linearized by reparameterization prior to solving for the parameter estimates.
The three linear transformations yield the Euclidean, affine, and projective models where in each model the original coordinates are scaled, rotated, and translated. These transformations form a hierarchy with the Euclidean being the simplest (fewest parameters) and the projective the most complex (most parameters) of the models.
Details of bidimensional regression models can be found in [1, 5, 7, 8]. Briefly, the Euclidean model is a similarity transformation in that the overall shape remains unchanged. The coordinates are translated, rotated, and isotropically scaled, thus preserving the original shape and angles. The affine model allows the X and Y coordinates to be scaled independently, and the configuration can exhibit shear (γ) (e.g., a square may become a parallelogram; Figure 1). The projective transformation, which is the most complex, allows the size, shape, and orientation to change as a function of viewpoint. An example of a projective transformation is shown in Figure 2.
In the Euclidean and affine transformations, the models are linearized by reparameterization, and then the normal equations can be derived in the usual manner. Once the parameters have been estimated, they can be used to calculate the scale and rotation values for the Euclidean transformation and the scale, shear, and rotation values for the affine transformation.
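The linearized Euclidean fit can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the common reparameterization u = a1 + b1·x − b2·y, v = a2 + b2·x + b1·y, from which scale = sqrt(b1² + b2²) and rotation = atan2(b2, b1). The function name `fit_euclidean_2d`, the parameter names, and the example coordinates are all hypothetical.

```python
import numpy as np

def fit_euclidean_2d(xy, uv):
    """Least-squares fit of u = a1 + b1*x - b2*y, v = a2 + b2*x + b1*y.

    Assumed reparameterization: b1 = scale*cos(angle), b2 = scale*sin(angle).
    Returns the translation (a1, a2), the scale, and the rotation angle (radians).
    """
    xy = np.asarray(xy, dtype=float)
    uv = np.asarray(uv, dtype=float)
    n = len(xy)
    x, y = xy[:, 0], xy[:, 1]
    ones, zeros = np.ones(n), np.zeros(n)
    # Design matrix for the stacked (2n x 1) response [u_1..u_n, v_1..v_n].
    X = np.vstack([
        np.column_stack([ones, zeros, x, -y]),   # rows for the u equations
        np.column_stack([zeros, ones, y, x]),    # rows for the v equations
    ])
    response = np.concatenate([uv[:, 0], uv[:, 1]])
    a1, a2, b1, b2 = np.linalg.lstsq(X, response, rcond=None)[0]
    scale = np.hypot(b1, b2)
    angle = np.arctan2(b2, b1)
    return (a1, a2), scale, angle

# Made-up landmarks: scale by 2, rotate by 30 degrees, translate by (1, -3).
xy = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 2.0]])
true_angle = np.deg2rad(30.0)
rot = np.array([[np.cos(true_angle), -np.sin(true_angle)],
                [np.sin(true_angle),  np.cos(true_angle)]])
uv = 2.0 * xy @ rot.T + np.array([1.0, -3.0])

shift, scale, angle = fit_euclidean_2d(xy, uv)
```

Because the example data are noise-free, the fit should recover the generating translation, scale, and angle up to floating-point error.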
The equations for the projective transformation can be rewritten using homogeneous coordinates and put in matrix notation as shown in (16). Homogeneous coordinates can be used with any of the models to provide a uniform framework for all transformations. For rotation, scaling, and shear, the transformed coordinates can be expressed as the product of a transformation matrix and the original coordinates. For translation, however, the coordinates are derived by addition of the translation vector to the original coordinates. Use of homogeneous coordinates makes all the transformations multiplicative. This is accomplished by adding an additional coordinate (t), called the homogeneous coordinate.
The homogeneous coordinate is added purely for mathematical simplification and has no effect on the transformation of coordinates. For example, it is convenient to represent a sequence of transformations as the product of the corresponding transformation matrices. Thus, in the Euclidean and affine models, the translation parameters become multiplicative and one matrix can be used for all of the transformation parameters. With the projective model, the conversion is used to linearize the model, and once the object is mapped using homogeneous coordinates, the original coordinates are restored by dividing by the homogeneous coordinate, t. However, when this is done, the restriction placed on t constrains the parameter estimates. Consequently, the projective transformation is reduced to the affine transformation and the results are identical. The conversion to homogeneous coordinates is adequate for determining the location of transformed points, but not for obtaining transformation parameter estimates. If left in terms of the original equations, the parameters of the projective transformation can be estimated using nonlinear regression. When extended to three dimensions, a similar approach is used.
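As a small illustration of why homogeneous coordinates make every transformation multiplicative, the sketch below composes a translation with a scale-and-rotation as a single 3 × 3 matrix in two dimensions. The helper names (`translation`, `scale_rotation`) and the numerical values are assumptions chosen for the example.

```python
import numpy as np

def translation(tx, ty):
    """3 x 3 homogeneous matrix that translates by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def scale_rotation(s, theta):
    """3 x 3 homogeneous matrix for isotropic scale s and rotation theta."""
    c, sn = s * np.cos(theta), s * np.sin(theta)
    return np.array([[c, -sn, 0.0],
                     [sn,  c, 0.0],
                     [0.0, 0.0, 1.0]])

points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
# Append the homogeneous coordinate t = 1 to every point.
homog = np.column_stack([points, np.ones(len(points))])

# One matrix carries scale, rotation, and translation (applied right to left).
M = translation(1.0, -3.0) @ scale_rotation(2.0, np.pi / 6)
mapped = homog @ M.T

# The same mapping written with an explicit additive translation.
rot = np.array([[np.cos(np.pi / 6), -np.sin(np.pi / 6)],
                [np.sin(np.pi / 6),  np.cos(np.pi / 6)]])
direct = 2.0 * points @ rot.T + np.array([1.0, -3.0])
```

For these Euclidean and affine-type matrices the last row is (0, 0, 1), so the homogeneous coordinate of every mapped point stays equal to 1 and no division is needed.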
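The similarity of the two objects is assessed using the bidimensional correlation coefficient,

$$r = \sqrt{1 - \frac{\sum_{i=1}^{n} \left[ (u_i - \hat{u}_i)^2 + (v_i - \hat{v}_i)^2 \right]}{\sum_{i=1}^{n} \left[ (u_i - \bar{u})^2 + (v_i - \bar{v})^2 \right]}},$$

where $(\hat{u}_i, \hat{v}_i)$ are the fitted coordinates of the dependent variable and $(\bar{u}, \bar{v})$ are the means of the observed coordinates.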
The bidimensional regression models proposed by Tobler can be extended to instances where three-dimensional data are used for comparison. A specific instance may include anatomical structures that are represented by three-dimensional landmark coordinates, but tridimensional regression can be useful for determining the degree of similarity between any two objects that are represented by three-dimensional coordinates.
In this paper, the linear transformations discussed by Tobler are extended to three dimensions. Extensions to the Euclidean, affine, and projective transformations are described in detail, where the dependent and independent variables are represented by their three-dimensional coordinates, $(u_i, v_i, w_i)$ and $(x_i, y_i, z_i)$, respectively.
The three-dimensional Euclidean transformation is similar to the two-dimensional case in that coordinates are simply translated, rotated, and isotropically scaled. The overall shape and the angles of the original object are preserved, and parallel lines in the original object are mapped to parallel lines in the transformed space. There is an additional translation parameter, and the rotation matrix differs depending on which axis or axes are used for the rotation. In general, the number of rotation parameters is k(k − 1)/2, where k is the number of dimensions. Therefore, there are three rotation parameters for the general three-dimensional Euclidean transformation. However, when it is known that not all three rotations are necessary, the transformation can be reduced to one or two rotations.
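The format of the rotation matrix depends on the axis of rotation. The formats for each of the three rotations are shown below, where γ is the angle of rotation about the x-axis, θ is the angle of rotation about the y-axis, and ϕ is the angle of rotation about the z-axis:

$$R_x(\gamma) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\gamma & -\sin\gamma \\ 0 & \sin\gamma & \cos\gamma \end{pmatrix}, \quad R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}, \quad R_z(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$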
The general form of the three-dimensional Euclidean transformation is
where R is one of the rotation matrices.
As in the two-dimensional case, the transformation can be linearized by reparameterization, where the new transformation matrix (R′) is a combination of the scale and rotation parameters. The reparameterized transformations and their normal equations follow.
For rotation about the x-axis,
and deriving the normal equations in the usual manner yields
Similar details for rotation about the y and z axes follow the same pattern.
When more than one rotation is used, the reparameterization needed to linearize the model is not obvious; therefore, the rotation matrices remain in terms of the rotation parameters and nonlinear regression is used. The advantage of nonlinear regression is that the rotation and scale parameters are estimated directly instead of being solved for in terms of $\beta_i$; the disadvantages are that starting values must be specified and convergence may not be reached. The similarity of the two objects is assessed using the pseudo-$R^2$, which is calculated in the same manner as $R^2$ but, in general, is not guaranteed to be greater than zero. Again, the rotation matrix differs depending on the axes of rotation. An example of a two-rotation Euclidean transformation is shown below.
For rotation about x and y axes,
In the general form of the three-dimensional Euclidean transformation, the order in which the transformations are applied will result in different parameter estimates. Permuting this order will result in different estimates of the rotation parameters, but the measure of similarity will remain the same regardless of the order of transformations. The following system of equations shows the rotations in the order of x-axis, y-axis, and then z-axis:
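The dependence on rotation order can be checked numerically. The sketch below assumes the standard right-handed rotation matrices about the x-, y-, and z-axes (the function names `rot_x`, `rot_y`, `rot_z` and the angle values are illustrative) and shows that the same three angles composed in two different orders give different overall rotations, which is why the estimated angles depend on the chosen order.

```python
import numpy as np

def rot_x(gamma):
    """Rotation by gamma (radians) about the x-axis."""
    c, s = np.cos(gamma), np.sin(gamma)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(theta):
    """Rotation by theta (radians) about the y-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_z(phi):
    """Rotation by phi (radians) about the z-axis."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

gamma, theta, phi = 0.3, -0.5, 0.8

# Applied to a column vector, the rightmost matrix acts first.
R_xyz = rot_z(phi) @ rot_y(theta) @ rot_x(gamma)   # x-axis, then y-axis, then z-axis
R_zyx = rot_x(gamma) @ rot_y(theta) @ rot_z(phi)   # z-axis, then y-axis, then x-axis

same_rotation = np.allclose(R_xyz, R_zyx)
is_orthogonal = np.allclose(R_xyz @ R_xyz.T, np.eye(3))
```

Both products are still proper rotations (orthogonal with determinant 1), so either ordering can represent the same set of rigid rotations, only with different angle values.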
The extension of the affine transformation from two dimensions into three dimensions includes additional parameters for translation, scaling, rotation, and shear. Figure 3 shows an example of a three-dimensional affine transformation. The transformed coordinates in affine transformations are given by
Deriving the normal equations in the usual manner yields
where $I_3$ is a 3 × 3 identity matrix and $\otimes$ is the direct product of the two matrices.
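A three-dimensional affine fit can be sketched as an ordinary least-squares problem with 12 parameters (three translations plus a 3 × 3 linear map). The block below is illustrative only: it assumes the direct-product form corresponds to a block-diagonal arrangement in which the same n × 4 design matrix is reused for each output coordinate, and the function name `fit_affine_3d` and the simulated landmarks are hypothetical.

```python
import numpy as np

def fit_affine_3d(src, dst):
    """Least-squares affine map from src (n x 3) to dst (n x 3).

    Returns a 4 x 3 matrix B: row 0 is the translation and rows 1..3 are the
    transposed linear part, so that dst is approximated by [1, src] @ B.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    X = np.column_stack([np.ones(len(src)), src])   # n x 4 design matrix
    B, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return B

# Simulated, noise-free landmarks for illustration only.
rng = np.random.default_rng(0)
src = rng.normal(size=(10, 3))
A_true = np.array([[1.2, 0.3, 0.0],
                   [0.0, 0.9, -0.2],
                   [0.1, 0.0, 1.1]])
t_true = np.array([0.5, -1.0, 2.0])
dst = src @ A_true.T + t_true

B = fit_affine_3d(src, dst)

# Equivalent normal equations in direct-product (Kronecker) form, with the
# response stacked by coordinate: [u_1..u_n, v_1..v_n, w_1..w_n].
X = np.column_stack([np.ones(len(src)), src])
y = dst.T.reshape(-1)
lhs = np.kron(np.eye(3), X.T @ X)     # 12 x 12 block-diagonal matrix
rhs = np.kron(np.eye(3), X.T) @ y     # 12-vector
beta = np.linalg.solve(lhs, rhs)      # parameters grouped by output coordinate
```

Solving the block-diagonal system gives the same 12 estimates as fitting each output coordinate separately, which is what the column-wise `lstsq` call does.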
The extension of the projective transformation from two to three dimensions involves the conversion to homogeneous coordinates (16). Additional parameters are added corresponding to the coordinate of the third dimension. In a projective transformation, the size, shape, and orientation can all change as a function of viewpoint. While this is a nonlinear transformation, by using homogeneous coordinates, the model can be linearized in order to obtain the normal equations and estimate the parameters. The equations to obtain the transformed coordinates are
and deriving the normal equations in the usual manner yields
where $I_4$ is a 4 × 4 identity matrix and $\otimes$ is the direct product of the two matrices.
As described above for the bidimensional case, this linearization results in parameter estimates that reduce the transformation to affine. The linearization is adequate to determine the transformed points, but not for the optimization to determine the transformation parameters or for measuring the degree of similarity between the two objects. Therefore, the transformation is left in terms of the original equations and nonlinear regression is used to obtain parameter estimates.
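The role of the homogeneous coordinate in the projective mapping can be illustrated as follows. This is a sketch under assumed conventions: points are mapped by a 4 × 4 matrix `P` and the original coordinates are restored by dividing by the fourth (homogeneous) coordinate t. The function name `apply_projective_3d` and the matrices are made up for the example.

```python
import numpy as np

def apply_projective_3d(P, pts):
    """Map n x 3 points with a 4 x 4 matrix P, then divide by t."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack([pts, np.ones(len(pts))])   # append t = 1
    mapped = homog @ P.T
    t = mapped[:, 3:4]                                   # homogeneous coordinate
    return mapped[:, :3] / t

pts = np.array([[1.0, 2.0, 3.0], [0.0, 1.0, -1.0], [2.0, 0.0, 0.5]])

# Last row (0, 0, 0, 1): t stays 1 and the mapping is affine.
P_affine = np.array([[1.0, 0.2, 0.0, 0.5],
                     [0.0, 1.1, 0.0, -1.0],
                     [0.0, 0.0, 0.9, 2.0],
                     [0.0, 0.0, 0.0, 1.0]])
affine_pts = apply_projective_3d(P_affine, pts)

# A nonzero entry in the last row makes t vary from point to point.
P_proj = np.eye(4)
P_proj[3, 0] = 0.5
proj_pts = apply_projective_3d(P_proj, pts)
```

When the last row is fixed at (0, 0, 0, 1), the division has no effect, which is consistent with the text's observation that the restricted form collapses to the affine transformation.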
Nonlinear regression is an extension of linear regression where the expected responses are nonlinear functions of the parameters. Finding least-squares estimates for linear models is straightforward as they have a closed-form solution. For nonlinear models, the least-squares estimates must be found using an iterative procedure. In this paper, the Gauss-Newton algorithm is used. This iterative procedure utilizes a Taylor series expansion to find the least-squares estimates.
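A bare-bones Gauss-Newton iteration for the three-rotation Euclidean model might look like the sketch below. It is not the authors' code: the parameter ordering (three translations, scale, then γ, θ, ϕ), the rotation order (x, then y, then z), the forward-difference Jacobian, the starting values, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def rotation(gamma, theta, phi):
    """Rotation about x, then y, then z (assumed order)."""
    cg, sg = np.cos(gamma), np.sin(gamma)
    ct, st = np.cos(theta), np.sin(theta)
    cp, sp = np.cos(phi), np.sin(phi)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cg, -sg], [0.0, sg, cg]])
    Ry = np.array([[ct, 0.0, st], [0.0, 1.0, 0.0], [-st, 0.0, ct]])
    Rz = np.array([[cp, -sp, 0.0], [sp, cp, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def model(p, src):
    """Euclidean map: p = [tx, ty, tz, scale, gamma, theta, phi]."""
    return p[3] * src @ rotation(p[4], p[5], p[6]).T + p[:3]

def gauss_newton(src, dst, p0, n_iter=50, tol=1e-12, h=1e-7):
    """Iterate p <- p + step, where step solves the linearized least squares."""
    p = np.asarray(p0, dtype=float).copy()
    for _ in range(n_iter):
        base = model(p, src)
        resid = (dst - base).ravel()
        # Forward-difference Jacobian of the model (first-order Taylor term).
        J = np.empty((resid.size, p.size))
        for k in range(p.size):
            dp = np.zeros_like(p)
            dp[k] = h
            J[:, k] = ((model(p + dp, src) - base) / h).ravel()
        step = np.linalg.lstsq(J, resid, rcond=None)[0]
        p += step
        if np.linalg.norm(step) < tol:
            break
    return p

# Simulated, noise-free landmarks for illustration only.
rng = np.random.default_rng(1)
src = rng.normal(size=(8, 3))
p_true = np.array([0.5, -1.0, 2.0, 1.2, 0.2, -0.1, 0.15])
dst = model(p_true, src)

# Starting values: no translation, unit scale, no rotation.
p_hat = gauss_newton(src, dst, p0=[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
```

As the text notes, such an iteration depends on its starting values and is not guaranteed to converge; this example starts close to the generating parameters, and a production routine would typically add step-size control.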
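For all transformations, parameter estimates can be found in the usual manner and subsequently used to solve for the rotation, scale, and shear parameters. The similarity of the two objects can be assessed using the tridimensional correlation coefficient, $R_{3D}$, given by

$$R_{3D} = \sqrt{1 - \frac{\sum_{i=1}^{n} \left[ (u_i - \hat{u}_i)^2 + (v_i - \hat{v}_i)^2 + (w_i - \hat{w}_i)^2 \right]}{\sum_{i=1}^{n} \left[ (u_i - \bar{u})^2 + (v_i - \bar{v})^2 + (w_i - \bar{w})^2 \right]}},$$

where $(\hat{u}_i, \hat{v}_i, \hat{w}_i)$ are the fitted values of the dependent coordinates $(u_i, v_i, w_i)$ and $(\bar{u}, \bar{v}, \bar{w})$ are their means. This coefficient is an extension of the bidimensional correlation coefficient.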
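Computationally, the squared coefficient is one minus the ratio of the residual sum of squares to the total sum of squares, pooled over all coordinates. The helper below is a minimal sketch (the name `r2_landmarks` and the example landmarks are hypothetical); it works for two- or three-dimensional landmarks and can return a negative value when the fitted points are worse than the centroid, as can happen with the pseudo-$R^2$.

```python
import numpy as np

def r2_landmarks(observed, fitted):
    """1 - SSE/SST pooled over all coordinates of n x k landmark arrays."""
    observed = np.asarray(observed, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    sse = np.sum((observed - fitted) ** 2)
    sst = np.sum((observed - observed.mean(axis=0)) ** 2)
    return 1.0 - sse / sst

# Made-up three-dimensional landmarks.
observed = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.5],
                     [1.0, 1.0, 1.0],
                     [0.0, 1.0, 0.2]])

perfect = r2_landmarks(observed, observed)
centroid = np.tile(observed.mean(axis=0), (len(observed), 1))
no_better_than_mean = r2_landmarks(observed, centroid)
poor_fit = r2_landmarks(observed, observed + 10.0)
```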
An experiment was conducted to evaluate the effectiveness of tridimensional regression and its improvement over bidimensional regression. Three-dimensional landmark data obtained from human faces were used for this purpose. The landmarks were obtained by placing reflective markers on the faces of subjects and using automated software to track the marker coordinates as the subjects moved through a series of poses. The landmarks are shown in Figure 4 and described in Table 1.
The landmarks were obtained for three subjects at two different sittings, with five poses per sitting. The objective was to compare the $R^2$ values within a subject to the $R^2$ values between subjects using both tridimensional regression and bidimensional regression. One would expect the degree of similarity, and thus the $R^2$ value, to be higher for two samples from the same person than for samples from two different people. All pairwise $R^2$ values were calculated for bidimensional and tridimensional regressions. Poses of the same individual within a sitting were not compared, since the markers were not removed between poses and using these poses would result in inflated $R^2$ values.
For each transformation, in both two and three dimensions, the distributions of $R^2$ values within and between subjects were obtained by fitting a theoretical distribution to the histograms of observed values. Overlaying these theoretical distributions allowed a threshold value (τ) to be estimated as a cutoff for determining whether two images were from the same subject. $R^2$ values greater than τ lead to the decision that the two images are of the same subject (match), while $R^2$ values less than τ indicate that the images are of two different subjects (nonmatch). The threshold was taken to be the point where the two distributions cross, so as to balance the false-positive and false-negative error rates. A false positive occurs when images of two different subjects are incorrectly determined to be from the same subject (an $R^2$ value greater than τ for different subjects); a false negative occurs when two images from the same subject are incorrectly determined to be from different subjects (an $R^2$ value less than τ for the same subject). In addition to calculating the observed error rates, the expected error rates were found by evaluating the cumulative distribution functions of the $R^2$ values at τ. Table 2 summarizes the observed and expected error rates, and Figures 5, 6, and 7 show the within-subject (dotted line) and between-subject (solid line) distributions for each transformation.
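The thresholding procedure can be sketched as follows. The text does not state which theoretical distribution was fitted, so this example assumes normal distributions purely for illustration; the $R^2$ values are made up, not the study's data, and the function names are hypothetical.

```python
import math
import numpy as np

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sd):
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))

def crossing_threshold(mu_b, sd_b, mu_w, sd_w, n_iter=100):
    """Bisection for the point between the two means where the densities cross.

    Assumes mu_b < mu_w and that each density dominates at its own mean.
    """
    lo, hi = mu_b, mu_w
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        # Left of the crossing, the between-subject density is the larger one.
        if normal_pdf(mid, mu_b, sd_b) > normal_pdf(mid, mu_w, sd_w):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Made-up R^2 values for illustration only.
between = np.array([0.80, 0.84, 0.86, 0.88, 0.90, 0.91, 0.85, 0.83])
within = np.array([0.95, 0.97, 0.96, 0.98, 0.94, 0.99])

mu_b, sd_b = between.mean(), between.std(ddof=1)
mu_w, sd_w = within.mean(), within.std(ddof=1)
tau = crossing_threshold(mu_b, sd_b, mu_w, sd_w)

# Expected error rates from the fitted cumulative distribution functions.
expected_fp = 1.0 - normal_cdf(tau, mu_b, sd_b)   # different subjects, R^2 > tau
expected_fn = normal_cdf(tau, mu_w, sd_w)         # same subject, R^2 < tau

# Observed error rates from the raw values.
observed_fp = float(np.mean(between > tau))
observed_fn = float(np.mean(within < tau))
```

Because $R^2$ values are bounded above by 1, a bounded distribution may describe real data better than a normal curve; the crossing-point logic is the same for any pair of fitted densities.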
Table 2 shows that both the observed and expected error rates for tridimensional regression are much smaller than those for bidimensional regression using any of the three transformations. Bidimensional regression resulted in both error rates being very high, with false-positive rates often over fifty percent. Tridimensional regression shows a substantial decrease in both false-positive and false-negative error rates, which indicates that the three-dimensional method is better at correctly matching a subject to himself or herself.
In this application, the Euclidean and affine transformations were comparable to one another, with the affine performing slightly better. The projective transformation had the largest observed false-positive rate. This result is not surprising, as the flexibility of the projective transformation allows it to map objects into many other shapes. This flexibility makes it possible to match even two very dissimilar objects quite well with certain transformation parameters. Consequently, the $R^2$ values are very high for all matches. This shifts the between-person distribution closer to the within-person distribution, which results in a larger false-positive error rate.
Additionally, a sixth pose was taken of each subject at each sitting. This pose was not used to build the within- and between-subject distributions or to determine the threshold. These six sets of points (two for each subject) were compared to all other poses not taken at the same sitting of the same subject (30 comparisons per pose, 6 possible correct matches). The highest $R^2$ for all six was a correct match. In addition, at least the top 4 matches for each were correct matches, illustrating that tridimensional regression can be very good at identifying correct matches and discriminating between different objects.
Bidimensional regression is a useful tool for comparing two geometric configurations that are each represented by a set of coordinate pairs. The scale, rotation, and translation relating the two configurations can be estimated by first estimating the parameters of the transformation model. As an application of the technique, bidimensional regression analysis has been used for relating faces in landmark-based face recognition [2, 3].
In this paper, the bidimensional technique has been extended to three dimensions. Such an extension may prove useful in the analysis of three-dimensional landmark data. The underlying foundations for tridimensional regression have been developed with different transformations: Euclidean, affine, and projective. Its use is demonstrated through an application to compare human faces using three-dimensional landmarks. Results show that tridimensional regression improves the ability to correctly match objects that are represented by landmark data. Both the Euclidean and affine transformations work well to reduce the error rates. The projective transformation also shows improved error rates, but its flexibility may make it too general for some practical applications. The choice of transformation should be given careful consideration in light of the goals of the application. While there is improvement over bidimensional regression, the observed and expected error rates are likely higher in this experiment because of the small number of subjects involved and the comparison of several poses of the same subject. A larger-scale study is needed to better estimate the expected error rates.
This work can be extended in several different directions. The focus here was on developing the theory of tridimensional regression and conducting an initial investigation of shape matching with a feasibility experiment. An investigation with a larger amount of three-dimensional landmark data is needed to more fully understand its effectiveness. In addition to a larger-scale study, it is also of interest to develop weighted tridimensional regression techniques, which would allow some landmarks to be weighted more or less heavily than others. Weighting landmarks allows less weight to be placed on landmarks that are highly variable. Some landmarks could be more variable because they are less reliably extracted or simply because of greater natural variability. Weighting has been shown to improve the matching ability in bidimensional regression, specifically in a face matching application, and is expected to improve the matching ability and precision in mapping for tridimensional regression as well.
The authors wish to thank Dr. Jordan Green of the University of Nebraska-Lincoln for his help in obtaining the three-dimensional landmark data used for this research.