Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2980332

Formats

Article sections

- Abstract
- 1 Introduction
- 2 Elastic Shape Analysis
- 3 Construction of Prior Shape Models
- 4 Posterior Energy for Boundary Extraction
- 5 Experimental Results
- 6 Statistical Decision Theory
- 7 Conclusion
- References

Authors

Related links

Int J Comput Vis. Author manuscript; available in PMC 2010 November 12.

Published in final edited form as:

Int J Comput Vis. 2009 March 1; 81(3): 331–355.

doi: 10.1007/s11263-008-0179-8PMCID: PMC2980332

NIHMSID: NIHMS245075

Shantanu H. Joshi, Laboratory of Neuroimaging, University of California, Los Angeles, CA 90024, USA;

Shantanu H. Joshi: ude.alcu.inol@ihsojs

See other articles in PMC that cite the published article.

We present a framework for incorporating prior information about high-probability shapes in the process of contour extraction and object recognition in images. Here one studies shapes as elements of an infinite-dimensional, non-linear quotient space, and statistics of shapes are defined and computed intrinsically using differential geometry of this shape space. Prior models on shapes are constructed using probability distributions on tangent bundles of shape spaces. Similar to the past work on active contours, where curves are driven by vector fields based on image gradients and roughness penalties, we incorporate the prior shape knowledge in the form of vector fields on curves. Through experimental results, we demonstrate the use of prior shape models in the estimation of object boundaries, and their success in handling partial obscuration and missing data. Furthermore, we describe the use of this framework in shape-based object recognition or classification.

Appearances of objects in images can be characterized to a certain extent by shapes of their boundaries. Therefore, the task of extracting object boundaries in images may prove important in problems of detection, tracking, and recognition of objects using image data. In some applications this task has to be performed solely on the basis of images (data) while in others some a-priori information about the shape of the boundary (to be extracted) is available to the algorithm. Boundary extraction has been studied in detail for many years but mostly in the former situation. Focusing on the latter, we present a method for *representing, modeling, and incorporating prior information* about shapes of closed curves for use in the process of shape extraction. Examples of these situations include battlefield image analysis where the interest is restricted to certain military vehicles, or medical image analysis where one focuses on certain anatomical parts with known shape variability. Stated differently, we are not seeking “general purpose” segmentation algorithms; our inferences will correspond to curves that form boundaries of known and desired objects in images. From this perspective, our technique can be viewed as a preliminary step in shape-based inferences—classification and recognition of objects—in images. Besides being application oriented, this perspective provides another advantage over the general-purpose segmentation algorithms: a criterion for evaluating boundary estimation performance. Since estimated boundaries are to be used for future inferences, performance in boundary estimation can be measured directly in terms of eventual object classification or recognition performance.

To highlight the novelty and contributions of this paper, we first discuss past and present methodologies in this area.

The seminal paper by Kass et al. (1988) formulated a snake evolving over a landscape of the intensity image, guided by its edges and corners. This model, in its original proposed form, had many limitations such as its limited capture range and preference to local solutions. As a remedy, Cohen and Cohen (1993) included a balloon force which expands the curve until it rests on edges. The Gradient Vector Flow (GVF) (Xu and Prince 1998b) and generalized GVF (GGVF) snakes (Xu and Prince 1998a), have gained popularity due to their ability to attract active contours toward object boundaries from large distances and move contours into object cavities. However, these snakes may fail to converge at weak edges, especially at the locations where strong and weak edges are in close proximity. Li et al. (2005) proposed an Edge Preserving GVF (EPGVF) to overcome this drawback; EPGVF prevents snakes from crossing over weak boundaries, while preserving the desired properties of GVF and GGVF. Caselles et al. (1997) proposed geodesic active contours to detect boundaries in images by searching for minimal-length curves in a space with a suitable Riemannian metric. Kichenassamy et al. (1995) formulated a geometric curve evolution approach based upon curvature flows in images. Both of these approaches use Euclidean curve-shortening flows that evolve a contour according to a normal vector field that is proportional to its curvature. A desirable property of these curvature flows, also called Euclidean heat flows, is the naturally occurring smoothing of the curve (Gage and Hamilton 1986; Grayson 1987). Additionally, curve evolution has also been accomplished using level-set representations, where curves are level sets of an appropriate function such as a signed-distance function (Caselles et al. 1997; Kichenassamy et al. 1995). While level-set representations allow for a change of topologies during curve evolution, an important strength of this approach, it is quite difficult to explicitly characterize shapes in this representation. Staib and Duncan (1992) use (parametric) deformable contours based on the elliptic Fourier decomposition of boundaries for image segmentation. In all these methods, the prior term, if used at all, regulates only the smoothness of the extracted curves; no specific shape information is incorporated in the extraction process.

Shape is a characteristic that is invariant to rigid motion and uniform scaling. How can one represent, model, and incorporate a-priori information about high-probability shapes? A major effort in shape analysis has been on a finite-dimensional or “landmark-based” approach, where shapes are represented by a coarse, expert-driven sampling of objects (Dryden and Mardia 1998; Small 1996). This process however requires expert intervention as the automatic detection of landmarks is not straightforward. Additionally, since the analysis depends heavily on chosen landmarks, this approach is limited in its scope. Grenander's formulation (Grenander 1993; Miller and Younes 2001) considers shapes as points on infinite-dimensional manifolds, where the variations between the shapes are modeled by the action of diffeomorphism groups. Although this approach is relevant for modeling variations in full appearances of objects, i.e. both shapes and textures, its computational complexity makes it expensive for problems limited to curves and their shapes.

There are two general approaches for performing statistical analysis on nonlinear manifolds, including the shape manifolds: extrinsic and intrinsic. We briefly introduce these terms and refer the reader to Patrangenaru and Bhattacharya (2003) for a more detailed discussion.

In this approach a manifold is embedded inside a larger vector space, the statistical analysis is performed in this larger space, and the final inference is projected back on to the manifold. For instance, sample means are computed in the larger space, using classical tools, and projected back to the manifold. While extrinsic analysis enjoys the advantage of being simple and computationally efficient, it is generally not the most natural. The main disadvantage is that solutions often depend on the choices of the ambient space and the embedding. As a simple example of this issue, consider a unit circle that can be embedded inside ^{2} in multiple ways where each embedding possibly leads to a different value of the extrinsic mean on the circle. The active shape model approach (Cootes et al. 1995) is an extrinsic approach that uses principal component analysis (PCA) of shapes, represented via vectors of landmarks, to model shape variability. Despite its simplicity and efficiency, this approach is limited because it ignores the nonlinear geometry of the underlying shape space. A similar idea is to use level sets (e.g. of signed-distance functions) to represent contours, and to compute PCA in the space of signed distance functions (Leventon et al. 2000).

An intrinsic analysis is restricted to the manifold and uses quantities such as distances, probability measures, and moments, that are defined strictly on that manifold. For a Riemannian structure imposed on that manifold, the resulting distances between points are achieved by paths (geodesics) that lie completely on the manifold. In most formulations, shape spaces are quotient spaces of nonlinear manifolds, and one needs to apply tools derived from differential geometry to carry out intrinsic analysis on the underlying shape spaces. There has been a resurgence of ideas in studying shapes of closed, planar curves (Charpiat et al. 2005; Klassen et al. 2004; Michor and Mumford 2006; Mio et al. 2007; Shah 2005) that use differential geometry of manifolds. There has also been substantial progress in developing intrinsic (parametric) statistical models on such spaces (Srivastava et al. 2005a, 2005b).

Several papers have presented approaches for extracting object boundaries using priors on either shapes or some shape related quantities. Grenander et al. (1991, 2000) used shape models learned from past observations to help improve shape estimation in future data. Cremers et al. (2003, 2006) and Rousson and Paragios (2002) have used priors in several formulations, including the level set formulation, to perform segmentation and object extraction.

There are several important criteria, such as invariance, shape metric, intrinsic nature, the use of higher moments, and the computational cost, in evaluating different Bayesian shape estimation methods. We motivate these criteria with a commonly used method. Consider the following way of comparing shapes of two closed curves, either directly or through their level sets. Let *ϕ*(*β*) denote the signed-distance function of a closed, planar curve *β*. Then, possible distances between shapes of two closed curves *β*_{1} and *β*_{2} are:

$$\begin{array}{l}d({\beta}_{1},{\beta}_{2})\hfill \\ \phantom{\rule{1em}{0ex}}=\underset{R\in SO(2),T\in {\mathbb{R}}^{2},a\in {\mathbb{R}}_{+}}{min}{\Vert \phi ({\beta}_{1})-\phi (R\cdot a{\beta}_{2}+T)\Vert}^{2},\hfill \\ \phantom{\rule{1em}{0ex}}\text{or}\underset{R\in SO(2),T\in {\mathbb{R}}^{2},a\in {\mathbb{R}}_{+}}{min}{\Vert {\beta}_{1}-(R\cdot a{\beta}_{2}+T)\Vert}^{2}\hfill \end{array}$$

(1)

where *R* denotes a rotation, *T* denotes a translation, and *a* denotes a scale. These ^{2}-type norms are computed on ambient vector spaces and not on spaces of closed curves or spaces of signed-distance functions. For example, in the case of signed-distance functions, ‖*ϕ*_{1} − *ϕ*_{2}‖ (or its version with the Heaviside step function) is often given by the ^{2} norm (Vemuri and Chen 2003; Cremers et al. 2002). With this example, we discuss different evaluation criteria for Bayesian shape estimation.

Shape is traditionally considered invariant to the action of translation, rotation, parameterization, and scaling, and one would like a representation that removes as many of these groups as possible (ideally all of them) before analysis. The remaining groups have to be removed through computational means by posing them as optimization problems. In certain approaches, such as level-set methods or diffeomorphism based methods, it is difficult to build representations that are maximally invariant. For example, the formulation in (1) tends to be expensive because all the similarity transformations are still present in the representation and have to be removed using an optimization method.

Once the representation is chosen and the shape space is established, the ensuing shape analysis can be performed in either an extrinsic or intrinsic fashion. As mentioned earlier, while the intrinsic analysis is more natural with unique solutions, the extrinsic analysis is often the simpler one. A recent paper (Cremers et al. 2006) has obtained several desired invariances for shape analysis using level set representations, but it remains an extrinsic approach. One consequence is that in the resulting probability models, one will encounter functions that are invalid for representing shapes, with high probability. Another consequence is that shape statistics defined extrinsically will not be unique.

While several metrics are available for comparing shapes, it is preferable to use one that lends to physical interpretations such as bending and stretching of curves. The elastic metric, defined later, is one such preferred metric as it provides a better matching of features through stretching, compression, and bending. However, despite its wide acceptance for shape comparisons, it has not been previously used in Bayesian shape estimation.

Most of the early work in shape priors involved terms of the type exp(−*d*^{2}), where *d* is a distance between shapes. However, a more interesting idea is to involve higher moments, such as the covariance, in building a shape prior. To our knowledge, covariances have seldom been used in past literature on shape extraction.

Our framework has the advantage of having: (i) a maximal invariance to shape preserving transformations, (ii) an intrinsic shape analysis framework, (iii) an attractive elastic metric, and (iv) the ability to use covariance information in the shape prior. As a combination of these properties, the resulting solution is advantageous, although it maybe computationally more complex than certain methods that employ non-elastic metrics or extrinsic solutions.

The main contribution of this paper is to apply fully intrinsic as well as maximally invariant prior probability models on well-defined shape spaces to the problem of boundary extraction. A priori knowledge of shape assumes importance when the image quality is low, perhaps due to low contrast or excess clutter, or when parts of the object have been obscured. Our goal is to use intrinsic, geometric representations of shapes of continuous, closed, planar curves in imposing shape priors and using these priors in a Bayesian framework for shape extraction.

Numerous ideas proposing different shape representations of closed curves and their intrinsic analysis have been presented in the literature (Joshi et al. 2007b; Klassen et al. 2004; Michor and Mumford 2006; Mio et al. 2007; Shah 2005; Yezzi and Mennucci 2005) and others referenced therein. The basic idea is to represent curves denoting object boundaries as parameterized functions (not necessarily by arc-length), appropriately constrained, resulting in a nonlinear manifold . Most of the similarity transformations are already removed, but to remove the remaining transformations one forms a quotient space = */S*, where *S* is the space of the remaining similarity transformations. Consequently, the shapes of closed curves are analyzed as elements of . The main tool in this analysis is the choice of a Riemannian metric and the construction of geodesic paths on between shapes under that metric. While several metrics have been proposed in the literature, the *elastic* metric is gaining popularity lately. Under the elastic metric, the curves are matched to each other using an optimal combination of bending and stretching/compression. This metric was first proposed by Younes (1998), advanced by Mio et al. (2004, 2007), and more recently simplified by Joshi et al. (2007b). The difference between these papers is in the choice of representation of curves.

Using the elastic metric, one can develop a framework for statistical models by defining means, covariances, and other statistics in tangent spaces of (Srivastava et al. 2005b; Vaillant et al. 2004). The problem of boundary extraction can then be posed as a maximum a-posteriori (MAP) estimation, and rewritten in terms of an energy formulation as:

$$\widehat{\beta}=\underset{\beta}{argmin}\phantom{\rule{0.1em}{0ex}}\left({E}_{\mathit{\text{image}}}(\beta )+{E}_{\mathit{\text{smooth}}}(\beta )+{E}_{\mathit{\text{prior}}}(\beta )\right),$$

(2)

where the minimization is over all closed curves in the given image domain. To ensure a fully intrinsic formulation, all the terms will be independent of parameterizations of *β*. Additionally, *E _{smooth}* will be independent of translation and rotation, and

*E _{prior}* is the novel term representing prior energy associated with the shape of curve

- In addition to image information and roughness constraints, it incorporates a priori information about shapes in boundary extraction. This additional information is presented using change of variables in the form of deformation vector fields on curves for efficient energy minimization.
- This approach does not require expert-generated landmarks, diffeomorphic embeddings, or level-set functions for shape analysis. Instead, it studies the set of shapes of curves directly. Since the computations here are performed only on curves, this analysis is also computationally efficient.
- The prior term is fully intrinsic to the shape space. Metrics, shape statistics, and probability models are all defined and computed on a well-defined shape space or its tangent bundle. The metric used is the elastic metric for shape analysis.
- The shape prior is fully invariant not only to the similarity transformations, but also to re-parameterizations of curves by changes in speed. Similarity transformations such as rotation, translation, and scaling are already removed and need not be estimated at each step. However, this representation introduces a variability, associated with re-parameterizations of curves, which is solved numerically.

The main *limitations* of our framework are presented here as follows. Firstly, it does not allow objects to split and merge during the evolution. Shape models are defined and applied to each closed curve individually. Secondly, this approach is restricted to 2D images and does not extend to analysis of surfaces in volumetric data (although surfaces can sometimes be represented as indexed collections of curves and then this framework applies (Samir et al. 2006)). This framework easily extends to shape estimation of curves in * ^{n}* (Joshi et al. 2007b). Lastly, as mentioned above, the representation of curves using parametric forms introduces an additional variability of re-parameterization that has to be removed using optimizations.

The remainder of this paper is organized as follows. Section 2 summarizes our approach for representing and analyzing shapes of closed curves and Sect. 3 presents two possibilities for forming a prior (probability) model on the shape space. Section 4 outlines different energy terms that form the posterior energy and presents an algorithm for finding MAP estimates of curves. Section 5 presents some experimental results that demonstrate the main ideas in the paper, while Sect. 6 shows an example from statistical decision theory that allows selection and testing of different prior shapes models on a given image data. We finish the paper with a section on conclusions and summary.

As mentioned earlier, elastic analysis has emerged as a dominant tool in statistical analysis of variations in shapes of curves. However, the formulation of this analysis is not unique and it depends on the chosen representation of curves. Let *J* denote the interval [0, 2*π*]. The following representations of a curve *β* have been proposed:

- Younes (1998) uses a function
*ξ : J*^{1}, given by $\xi =\frac{\dot{\beta}}{\Vert \dot{\beta}\Vert}$, with*β*viewed as a parameterized curve in the complex space , to represent a curve. For any (increasing) diffeomorphism of*J*, call it*γ*, and a complex function*r : J*^{1}, the mapping (*r, γ*) ○*ξ*=*r*·*ξ*(*γ*) defines a group action on the functions denoting curves. In particular, the component*γ*denotes the stretching and the component*r*denotes the bending of the original curve*β*. - For a curve
*β*, one can denote a deformation in its shape using a normal vector field on it and, further, specify this vector field as a real-valued function on*J*. That is, for a function*a : J*, define a vector field*a*(*t*)**n**, where_{β(s)}**n**denotes a unit normal to the curve at_{β(s)}*β*(*s*). Using this representation, Michor and Mumford (2006) have studied shape analysis of closed curves under a variety of metrics, including the elastic metric. - Mio et al. (2007) represent a closed curve
*β*by a pair of functions: the log-speed function*ϕ*log(‖‖) and the angle function $\theta \equiv \angle (\frac{\dot{\beta}}{\Vert \dot{\beta}\Vert})$, to represent and analyze shapes under the elastic metric. - Joshi et al. (2007b) use a square-root representation, given by $\frac{\dot{\beta}}{\sqrt{\Vert \dot{\beta}\Vert}}$, that studies the shape of closed curves in
using the same elastic metric.^{n}

These different representations lead to different spaces, or manifolds, for analyzing shapes and the elastic shape analysis is performed by choosing an appropriate Riemannian structure on these manifolds. For the purpose of this paper, one can choose any of these representations and metrics. We choose the third representation, i.e. the pair (*ϕ, θ*), as it provides a direct interpretation of deformation between shapes as being elastic, a change in *θ* implies bending and a change in *ϕ* implies stretching/compression.

Next we summarize ideas from elastic shape analysis of planar closed curves; for details the interested reader is referred to Mio et al. (2007). Let *β : J* ^{2} be a non-singular, parameterized curve and we are interested in analyzing its shape. We assume that the curve has been scaled to be of length 2*π*. We represent the velocity vector of the curve in complex polar coordinates, (*t*) = *e ^{ϕ(t)}e^{jθ(t)}*, where

$$\mathcal{C}=\left\{\nu \equiv (\phi ,\theta ):{\int}_{J}{e}^{\phi (t)}dt=2\pi ,\frac{1}{2\pi}{\int}_{J}\theta (t){e}^{\phi (t)}dt=\pi ,{\int}_{J}{e}^{\phi (t)}{e}^{j\theta (t)}dt=0\right\}\subset \mathscr{H},$$

(3)

where is called the *pre-shape space*. The first constraint forces the length of *β* to be 2*π*, the second forces its orientation to be *π* and the last ensures that *β* is closed. Therefore, contains all planar, closed curves with length 2*π* and orientations *π*. In other words, the set of valid shapes lying in are defined as the inverse image ^{−1}(2*π, π*, 0, 0), where

$$\begin{array}{ll}\mathcal{G}\hfill & \equiv ({\mathcal{G}}_{1},{\mathcal{G}}_{2},{\mathcal{G}}_{3},{\mathcal{G}}_{4})\hfill \\ \hfill & =\left({\int}_{J}{e}^{\phi (t)}dt,\frac{1}{2\pi}{\int}_{J}\theta (t){e}^{\phi (t)}dt,{\int}_{J}{e}^{\phi (t)}cos\theta dt,{\int}_{J}{e}^{\phi (t)}sin\theta dt\right).\hfill \end{array}$$

(4)

An arbitrary differentiable curve having a representation (*ϕ, θ*) , is not necessarily an element of . However, we can project it in by moving perpendicular to the level sets of as follows. Using the gradient of , we can specify the space normal to at (*ϕ, θ*):

$${N}_{\phi ,\theta}(\mathcal{G})=\text{span}\left\{\nabla {\mathcal{G}}_{1}=\left(\frac{1}{a},0\right),\nabla {\mathcal{G}}_{2}=\left(\frac{\theta}{a},\frac{1}{b}\right),\nabla {\mathcal{G}}_{3}=\left(\frac{cos(\theta )}{a},\frac{-sin(\theta )}{b}\right),\nabla {\mathcal{G}}_{4}=\left(\frac{sin(\theta )}{a},\frac{cos(\theta )}{b}\right)\right\}.$$

(5)

Then, using Algorithm 1 we can iteratively update (*ϕ, θ*) in a direction in *N _{ϕ,θ}* () that moves (

Although we have removed rotation, translation and scaling variabilities, many elements of can still have the same shape. These elements result from re-parameterizing the same curve and, thus, need to be removed from the representation. Let *Γ* be the set of all diffeomorphisms of a unit circle ^{1} to itself. This set can be identified with the set of all re-parameterizations of all closed curves, using the right composition action, (*β, γ*_{0}) = *β*(*γ*_{0}). Any element *γ*_{0} *Γ* can be written as *γ*_{0} = *r* · *γ* where *r* is an element of ^{1}, *γ* is the set of all origin-preserving diffeomorphisms of ^{1} (call it ), and · is the group operation in *Γ*. Another way to write this is *Γ* = ^{1} · . Stated differently, the re-parameterizations of a closed curve can be broken into two components: (i) The change in the placement of origin on the curve is represented by the action of a unit circle ^{1} on , according to *r* · (*ϕ*(*s*), *θ*(*s*)) = (*ϕ*(*s* − *r*), *θ*((*s* − *r*)_{mod 2}* _{π}*)−

(**a**) A curve, (**b**) it's log-speed function *ϕ*, and (**c**) angle function *θ*. (**d**) An arbitrary warping function *γ* acting on the curve, (**e**) the new log-speed function (*ϕ* ○ *γ* + log ), and (**f**) the **...**

$$\left[\left(\phi ,\theta \right)\right]=\left\{\left(r\cdot \left(\phi ,\theta \right)\right)\cdot \gamma \mid r\in {\mathbb{\text{S}}}^{1},\gamma \in \mathcal{D}\right\}.$$

(6)

This set is also called the *orbit* of (*ϕ, θ*) each orbit in represents a unique shape. Analysis of shapes is performed on the set of such orbits.

Note that the pair *ν* (*ϕ, θ*) represents the shape of *β* and, thus, ignores its placement, orientation, parameterization and scale. These latter variables are called *nuisance variables* in shapes analysis, as they do not contribute to the shape of a curve. However, in extraction of boundaries from images, matching shapes of curves, or even in visual displays of a shape, these nuisance variables become important. In Table 1, we enumerate the shape and the nuisance variables separately.

We summarize the role of these variables used in different parts of the analysis:

- The shape variables (
*ϕ, θ*) and the matching variables*x*(_{m}*r, γ*) play a role in shape analysis and are used in computation of*E*and its gradient._{prior} - The length variable
*l*completely defines the term*E*, while some additional variables (_{smooth}*ϕ, θ, p, τ*) are used to compute the gradient of*E*._{smooth} - The shape variables (
*ϕ, θ*) and the nuisance (appearance) variables*x*(_{a}*p, τ, l*) are involved in computation of*E*and its gradients. This set of variables is also called the_{image}*full appearance variables*. - There are two places where
^{1}enters in our representation: the orientation*τ*^{1}and the placement of origin*r*^{1}. However, they are not completely independent. A change in*r*, while keeping*τ*=*π*, results in a rotation of the curve. Hence, we need to keep either an orientation parameter or a placement of origin, but not both, to capture the rotation variability. We will choose one according to the convenience: for rotating curves in a plane we will vary*τ*while keeping*r*fixed and for optimally matching any two shapes we will vary*r*while keeping*τ*=*π*.

We will use *x* = [*x _{a}, x_{m}*] to denote the whole set of nuisance variables,

$$\begin{array}{l}\beta (s)=p+l{\int}_{0}^{s}{e}^{\stackrel{\sim}{\phi}(s)}{e}^{j(\stackrel{\sim}{\theta}(s)+\tau )}ds,\hfill \\ \phantom{\rule{0.5em}{0ex}}(\stackrel{\sim}{\phi},\stackrel{\sim}{\theta})=(r\cdot (\phi ,\theta ))\cdot \gamma \hfill \end{array}$$

(7)

where *x* = (*p, l, τ, r, γ*) . To emphasize dependence of *β* on shape *ν* and nuisance variables *x*, we will often use an explicit notation *β* *β*(*ν, x*), where *ν* = (*ϕ, θ*).

Geodesics are important in defining and computing statistics of shapes. To specify geodesics, we need to impose a Riemannian structure on the tangent space of shapes first. The tangent space at any point of is itself, while the tangent spaces to and are naturally more restricted. We start by making a Riemannian manifold by imposing the following metric. For a point (*ϕ, θ*) , let *h _{i}* and

$${\langle ({h}_{1},{f}_{1}),({h}_{2},{f}_{2})\rangle}_{(\phi ,\theta )}=a{\int}_{J}{h}_{1}(s){h}_{2}(s){e}^{\phi (s)}ds+b{\int}_{J}{f}_{1}(s){f}_{2}(s){e}^{\phi (s)}ds.$$

(8)

The parameters *a* and *b* control the *tension* and *rigidity* in the metric. A higher ratio of *a/b* favors bending, while a lower value favors stretching/compression.

According to (5), the tangent space * T_{(ϕ,θ)}*() is the set of all pairs in that are orthogonal to the set:

$$\text{span}\{\left(\frac{1}{a},0\right),\left(\frac{\theta}{a},\frac{1}{b}\right),\left(\frac{cos\theta}{a},\frac{-sin\theta}{b}\right),\left(\frac{sin\theta}{a},\frac{cos\theta}{b}\right)\}$$

with orthogonality defined using the inner product in (8). So, one can project an arbitrary pair (*h, f*) into *T _{(ϕ,θ)}*() using:

$${(h,f)}_{\mathit{\text{proj}}}\equiv (h,f)-\sum _{i=1}^{4}{\langle (h,f),{\mathbf{\text{e}}}_{\mathcal{G}}^{i}\rangle}_{(\phi ,\theta )}{\mathbf{\text{e}}}_{\mathcal{G}}^{i},$$

(9)

where **e**_{1}, …, **e**_{4} are the orthonormal basis elements of *N _{(ϕ,θ)}*(). Furthermore, the tangent space

With the metric given in (8), the geodesics in are well defined and the remaining task is to compute geodesics for any two elements, say [(*ϕ*_{1}, *θ*_{1})] and [(*ϕ*_{2}, *θ*_{2})], of . There are two prominent numerical methods for finding geodesics on nonlinear manifolds when analytical expressions are not available.

Here one starts by constructing a geodesic starting from [(*ϕ*_{1}, *θ*_{1})] in an arbitrary initial direction (*h, g*) *T*_{[(ϕ1,θ1)]}() and follows it for unit time. Then, one optimizes over (*h, g*) so that the ^{2} norm, between the end point of the geodesic and [(*ϕ*_{2}, *θ*_{2})], becomes zero. Qualitatively, this corresponds to *shooting* geodesics from the first shape so as to hit the second shape in unit time. The readers are referred to the paper (Mio et al. 2007) for more details on the shooting method.

We emphasize that this approach requires an optimal matching of the two shapes before or during the calculation of a geodesic. In other words, we need to solve an optimization problem over ^{1} × so that the parameterization of the second shape best matches a fixed (usually arc-length) parameterization of the first shape. In our implementation, the search over ^{1} is performed exhaustively while the search over is performed using the dynamic programming (see Mio et al. 2007).

The main idea here is to connect the two shapes [(*ϕ*_{1}, *θ*_{1})] and [(*ϕ*_{2}, *θ*_{2})] by an arbitrary initial path in and to iteratively straighten it until the path results in a geodesic. The straightening is performed by perturbing the path along the gradient of an energy. An interesting point is that the gradient is known analytically and the perturbation is computationally efficient. In general, path straightening is faster and more stable of the two methods.

For the purpose of this paper, either method can be used for finding geodesic paths in and we have chosen the former. Once a geodesic is found, it can be completely characterized by a starting point and a starting direction. Let *Ψ _{t}* (

Specifically, the task of calculating a geodesic between a given pair of shapes (*ϕ*_{1}, *θ*_{1}) and (*ϕ*_{2}, *θ*_{2}), involves finding a tangent vector, (*h, f*) such that the geodesic starting from (*ϕ*_{1}, *θ*_{1}) reaches (*ϕ*_{2}, *θ*_{2}) in unit time. For the pre-shape space, this amounts to solving an optimization problem that involves the exponential map as follows,

$$(\widehat{h},\widehat{f})=\underset{(h,f)}{\text{arg min}}{\Vert exp(h,f)\text{}-{\nu}_{2}\Vert}_{{\nu}_{1}}^{2}.$$

(10)

In the process of finding the geodesic, any update to the vector (*h, f*), may result in the terminal point being located outside the manifold . This point is projected back on the manifold using a mechanism that projects arbitrary points in on the pre-shape space , using Algorithm 1. This projection can be constructed by iterating the point in the direction given by the basis vectors of the normal space of . The initial tangent vector is then parallel-transported to the new point along the infinitesimal geodesic. A first order infinitesimal update to the current estimate of the vector (*h, f*) is given as follows,

$$(\stackrel{\sim}{h},\stackrel{\sim}{f})=(h,f)-\delta \left(\frac{{h}^{2}}{2}-\frac{b}{2a}{f}^{2},hf\right).$$

(11)

This is also the equation of the parallel transport and is derived from the system of differential equations of the geodesics approximated in . Shown in Fig. 2 is an example of a geodesic path under the elastic metric for *a* = 1 and *b* = 1.

We will specify a prior model on shape using *E _{prior}* that in some circumstances can be seen as the negative log of a prior density. Although shape spaces are infinite-dimensional, prior densities are defined on some finite-dimensional subset using standard reference measures (Grenander 1981). In other cases

A simple choice for prior energy is *E _{prior}* =

Let *ν* and *x* be the shape and the nuisance variables of *β*, respectively; we can write *β* = *β* (*ν*, *x*). Since the prior depends only on *ν*, the choice of *x* is completely arbitrary here. In the larger optimization over *β*, *x* is determined by other terms in *E _{total}*.

*The gradient of E _{prior}with respect to β*,

$${\nabla}_{\beta}{E}_{\mathit{\text{prior}}}(s)\approx \frac{1}{\varepsilon}(\beta ({\Psi}_{\varepsilon}(\nu ,g;\mu ,{x}_{m}),{x}_{a})(s)-\beta (\nu ,{x}_{a})(s)),\phantom{\rule{1.5em}{0ex}}\mathit{\text{for}}\phantom{\rule{0.2em}{0ex}}\varepsilon >0\phantom{\rule{0.3em}{0ex}}\mathit{\text{small}},$$

(12)

where

- Ψ
_{t}(ν, g; μ, x_{m}) is a geodesic path from ν to μ having the optimal direction g and the optimal matching x_{m}_{m}. - β (Ψ
_{ε}(ν, g; μ, x_{m}), x_{a}) is the closed curve that results when the shape component of β is perturbed along the tangent direction g by an amount ε. The appearance nuisance variable x_{a}remains unchanged.

Let *g* *T _{ν}* () be such that

Figure 3 shows a sample shape *ν*, a prior mean *μ*, and a prior vector field * _{β}E_{prior}* induced on it.

While this *E _{prior}* provides a mechanism for introducing some prior information in the process of boundary extraction, it does not permit any more additional structure. For instance, this

Since the tangent space at any point on the manifold is a vector space, it has been the preferred domain for imposing probabilities for statistical analysis of shapes. Depending upon the representations used in shape analysis there are several references for this idea: for landmark-based shape analysis (Dryden and Mardia 1998), shape analysis of non-elastic (Srivastava et al. 2005b) and elastic curves (Srivastava et al. 2005a), diffeomorphism of image domains (Vaillant et al. 2004), and for set-theoretic shape analysis using Hausdorff type metrics (Charpiat et al. 2005). In case of functional representations, it is also common to impose a multivariate normal density on a finite-dimensional subspace of the tangent space with respect to the underlying Lebesgue measure. Although such a probability density can be pushed onto the manifold using the exponential (or some other) map, it is difficult to do so analytically due to the nonlinear nature of such mappings. Therefore, one restricts to considering probability densities on tangent spaces rather than on . In the implementation, however, one can generate (random) samples in tangent spaces and map them freely to using the exponential map. We will take this approach to define a *E _{prior}* that allows more structure than the squared-distance energy.

The basic idea is to estimate a mean shape *μ* for an object class, and to use the tangent space *T _{μ}* (), or rather a finite-dimensional subspace of

One starts with an ensemble of *k* observed (in this case, training) shapes {*ν _{i}*,

The next step is to form an orthonormal basis for the set {*g _{i}*,

Denote the projection of each tangent *g _{i}* into

To obtain computational efficiency, it is often useful to restrict to a principal subspace of *M* as follows. Perform PCA of the observed set {*y _{i}*,

One can impose one of several probability densities on *N* as a prior. For example, a Gaussian model for individual principal coefficients, with mean zero and variances given by observed singular values, is a simple choice. Another model that imposes a different mixture of Gaussians for each principal component, independent of each other, has been found to perform better than a simple Gaussian model and a nonparametric model (Srivastava et al. 2005a). For the simplicity of presentation, we take the first idea and form a multivariate normal density on the subspace *M* of *T _{μ}* ().

Let *g* *T _{μ}* () be a tangent vector such that exp

$${\stackrel{\sim}{E}}_{\mathit{\text{prior}}}(y)=\frac{1}{2}{y}^{T}(U{K}^{-1}{U}^{T})y+\frac{1}{2{\delta}^{2}}{\Vert y-U{U}^{T}y\Vert}^{2}.$$

(13)

*K* ^{n}^{×}* ^{n}* is the sample covariance matrix associated with past observations on

The gradient * _{y}Ẽ_{prior}* =

This can be carried one step further, i.e. we can write the posterior energy in terms of the curve *β*, or rather its shape *ν*, using the exponential map *g* *ν* = exp* _{μ}* (

Our interest is in writing the gradient of *E _{prior}* with respect to the shape of

$${\nabla}_{\beta}{E}_{\mathit{\text{prior}}}(s)\approx \frac{1}{\varepsilon}(\beta ({\Psi}_{1}(\nu ,\varepsilon h;\mu ,{x}_{m}),{x}_{a})(s)-\beta (\nu ,x)(s)),\phantom{\rule{1.5em}{0ex}}\text{for}\phantom{\rule{0.2em}{0ex}}\varepsilon >0\phantom{\rule{0.2em}{0ex}}\mathit{\text{small}}.$$

(14)

keeping in mind that *Ψ*_{1} (*μ*, *g*) = *ν*. As earlier, * _{β}E_{prior}* is a vector field on the curve

We have proposed two prior energy terms on *β*—a squared-distance term on and a multivariate normal term on the tangent bundle *T* —and a natural question is to compare their effectiveness in incorporating shape priors. Intuitively, the normal term should provide better results as it additionally contains the directions of major variability in a particular shape class. This idea is further demonstrated pictorially in Fig. 5 where shape evolutions under the (negative) gradients of the two energies are presented.

(**a**) Cartoon diagram of shape evolution under the squared distance, normal energy, and Rayleigh quotient in the space *T*_{μ} (S). (**b**) Successive shapes along gradient paths of the three energies

In this particular example, we use tank shapes and their mean *μ* shown in Fig. 4. Using the TPCA approach, we compute the dominant subspace *N* and a covariance matrix *K* from the observations. Next, we plot shape evolutions under the gradient process with respect to both the energy terms, starting at an arbitrary shape *ν*. Since the distance-squared term is simply the squared length of a tangent vector, the gradient process will simply follow this straight line from *ν* towards *μ* in the tangent space *T _{μ}* (). The normal energy term however will form a curved path from

(**a**) Eigen variation (*column wise*) along dominant vectors *υ*_{1}, *υ*_{2}, *υ*_{3} and some random directions *υ*_{rand} *M*^{} *T*_{μ} (S). (**b**) Shape variation about the mean *μ* (*center*) for the squared-distance **...**

So far we have discussed prior models for the shape component in *β*, *ν*(*β*). Another interesting question is, what prior model can be chosen for the nuisance variable *x*? In this paper, we assume an uniform prior on rotations and translations though, in the future, more informative priors can be pursued.

Boundary extraction typically involves minimizing an energy functional on the space of closed curves. The posterior energy functional usually involves the image information and a smoothing criterion, whereas this paper adds a term on shape prior, resulting in:

$${E}_{\mathit{\text{total}}}(\beta )={\lambda}_{1}{E}_{\mathit{\text{image}}}(\beta )+{\lambda}_{2}{E}_{\mathit{\text{smooth}}}(\beta )+{\lambda}_{3}{E}_{\mathit{\text{prior}}}(\beta ),$$

(15)

where *λ*_{1}, *λ*_{2}, and *λ*_{3} are arbitrary positive constants. The term *E _{image}* forces the curve

$$\widehat{\beta}=\underset{\beta \in \mathcal{B}}{argmin}{E}_{\mathit{\text{total}}}(\beta ),$$

(16)

where the minimization is performed over all closed curves in the given image domain. We will use a gradient approach to solve for .

In the next subsections, we describe the remaining terms, *E _{image}* and

There exists a large literature on the problem of extracting curves from 2D images. Since our goal is to illustrate the influence of a shape prior, any such method can be chosen as a starting point as long as it allows shape analysis of underlying curves. Here we select methods that represent contours as parametric curves explicitly.

Most segmentation methods involving curves, depend on intensity or edge information to evolve over the image. Kass et al. (1988) suggest the use of various external forces like simple image intensity, edge energy functional or termination functionals using the curvature of level lines in an image: *I* (*x*, *y*), − |*I* (*x*, *y*)|^{2} or − | (*G*_{σ} (*x*, *y*) * *I* (*x*, *y*))|^{2}. Caselles et al. (1997) also suggest the energy function:
$\frac{1}{1+{|\nabla I(x,y)|}^{i}}$, *i* = 1 or 2. These image-derived forces, separately or when combined together, emphasize the detection of lines, edges, ridges or corners and serve as a stopping criterion for the curve evolution. In order to reduce sensitivity to initialization, and to smooth these image-driven vector fields, Xu et al. (1998a, 1998b) present an energy directly in terms of vector fields **v**(*x*, *y*) = (*υ _{x}* (

$${\mathcal{E}}_{\mathit{\text{ext}}}(\mathbf{\text{v}})=\int g(|\nabla f|){|\nabla \mathbf{\text{v}}|}^{2}+h(|\nabla f|)\times (\mu {|{J}_{\mathbf{\text{v}}}p|}^{2}+{|\mathbf{\text{v}}-\nabla f|}^{2})\mathit{\text{dxdy}}.$$

(17)

This definition is taken verbatim from (6) (Li et al. 2005), where *f* and *g* weighting functions, and *p* is a normalized vector field that is orthogonal to the gradient vector field of the image. Here *J*** _{v}** is the Jacobian of the vector field

$${E}_{\mathit{\text{image}}}(\beta )={\int}_{\beta}{|\mathbf{\text{v}}(p)|}^{2}dp.$$

(18)

*The energy term E _{image}is invariant to re-parameterizations of β*.

For any parameterized curve *β*, *E _{image}* is given by
${\int}_{0}^{l}{|\mathbf{\text{v}}(\beta (s))|}^{2}(|\dot{\beta}(s)|)ds$. Replacing

$${E}_{\mathit{\text{image}}}(\stackrel{\sim}{\beta})={\int}_{0}^{l}{|\mathbf{\text{v}}(\beta (\gamma (s)))|}^{2}(|\dot{\beta}(\gamma (s))|)\dot{\gamma}(s)\mathit{\text{ds}}$$

which, by a change of variable *t* = *γ* (*s*), becomes same as *E _{image}* (

Although one can derive the gradient of *E _{image}* with respect to

$${\left[\begin{array}{cc}\frac{\partial {\upsilon}_{x}}{\partial x}& \frac{\partial {\upsilon}_{x}}{\partial y}\\ \frac{\partial {\upsilon}_{y}}{\partial x}& \frac{\partial {\upsilon}_{y}}{\partial y}\end{array}\right]\mid}_{\beta (s)}.$$

Figure 7 shows some examples of the EPGVF, and vector fields induced on curves. The top row shows some images with some arbitrary closed curves. The middle row shows EPGVF for the full images, and the bottom row shows * _{β}E_{image}* as the vector field restricted to the curve

A desirable property of the curve evolution is that smoothness of the curve be preserved or even enhanced during extraction. All parametric and free-form active contour models (Kass et al. 1988; Xu and Prince 1998b) enforce smoothness by including a regularization term that penalizes the first and second order derivatives of the curve. While such an approach usually depends on the parameterization, we follow a more intrinsic approach (Caselles et al. 1997; Kichenassamy et al. 1995). We consider an arc-length parameterized curve *β*. Define the smoothing energy functional as:

$${E}_{\mathit{\text{smooth}}}(\beta )={\int}_{0}^{l}{\langle \dot{\beta}(s),\dot{\beta}(s)\rangle}^{1/2}ds.$$

*E _{smooth}* is simply the length of the curve

As shown in many articles, e.g. Kichenassamy et al. (1995), the gradient of *E _{smooth}* with respect to

$${\nabla}_{\beta}{E}_{\mathit{\text{smooth}}}(\beta )=-\ddot{\beta}(s)=-\kappa (s)n(s),$$

(19)

where, *κ* is the curvature along *β* and *n*(*s*) is the inward normal unit vector to *β* at *s*. It has been shown (Gage and Hamilton 1986; Grayson 1987) that these curve-shortening or Euclidean heat flows automatically lead to smoothing of the curve. Furthermore they cause the curve to shrink and become convex, and if this process is continued, the curve then turns into a circle and ultimately collapses to a point.

As an example, Fig. 8 shows the evolution of a curve under: (i) * _{β}E_{smooth}* alone (top row) and (ii) under a combination of

For the total energy given in (15), the gradient vector field is a sum on individual gradients:

$$\nabla {E}_{\mathit{\text{total}}}(\beta )(s)={\lambda}_{1}{J}_{\beta}(\mathbf{\text{v}})(\beta (s))\mathbf{\text{v}}(\beta (s))-{\lambda}_{2}\kappa (s)n(s)+{\lambda}_{3}{\nabla}_{\beta}{E}_{\mathit{\text{prior}}}(s),$$

(20)

and contour estimation is obtained as a solution of the differential equation:

$$\frac{dX(t)}{dt}=-\nabla {E}_{\mathit{\text{total}}}(X(t)),\phantom{\rule{0.5em}{0ex}}X(0)\in \mathcal{\text{B}},$$

(21)

with = lim_{t}_{→∞} *X*(*t*). It is difficult to establish asymptotic properties *of X*(*t*) analytically and we resort to experimental results to analyze this method.

The computational cost in updating *β* is in evaluating the three vector fields on *β* at each iteration. While the other two terms are standard, we discuss the computational cost of the gradient of *E _{prior}*. The main cost in this step comes from the computation of a geodesic path between the current shape

In this section we deibe experimental results on estimating boundaries of objects using real and simulated image data. Bayesian shape extraction is most effective in situations where the image quality is low, or when the image consists of partially obscured or occluded objects and, most importantly, when we have a prior knowledge about the shape of the object contained in the region. This is often times the case in medical image analysis using ultrasound images, echocardiographs or low-resolution PET scans, where signal to noise ratio is rather small and object to background contrast is not sufficient for image-based object extraction. Similar low-contrast images appear in surveillance applications using infrared cameras. Using past observations of those objects: anatomical parts or human subjects, one can develop a shape prior and use it in order to compensate for missing data.

Even though *E _{total}*(

$$(\phi ,\theta )\circ \gamma ,\phantom{\rule{1em}{0ex}}\text{where}\phantom{\rule{0.2em}{0ex}}\gamma (t)={\int}_{0}^{t}{e}^{-\phi (s)}ds.$$

(22)

The warping function *γ* can be computed numerically, and applied to (*ϕ, θ*) to obtain an arc-length parameterized version of *β*.

Firstly, we present two examples where we use images of a tank shown in Fig. 9, to demonstrate Bayesian active curves under a shape prior using the squared-distance energy. The rightmost panel in each case shows the prior mean shape *μ* used in this experiment and the prior energy used here was the squared-distance term. The goal is to find a MAP estimate of the object shape present in *I*. Shown in Fig. 9 are the evolutions of *X*(*t*), without the prior term *(*λ_{3} = 0) and with (λ_{3} *>* 0). In the absence of prior, the algorithm is misled by a partial obscuration of the object in *I*. In contrast, the Bayesian solution gives a better estimate of the tank boundary because of a strong prior on the shape of the curve. We emphasize that the scale, position, and orientation of *β*, along with its shape, are estimated by the algorithm automatically.

Evolution of *β* under *E*_{total}(*β*) with λ_{3} = 0 and λ_{3} *>* 0. The *last panel* shows the prior mean *μ*

Secondly, we consider the effect of using different prior shape averages for Bayesian extraction by comparing the evolution of curves for a fixed, arbitrary test image. In this experiment, the prior mean shape was computed using an independent set of silhouettes from IR tank images, shown later in Fig. 19. Comparisons of curve evolution under the effect of different prior mean shapes are shown in Fig. 10.

Evolution of *β* under *E*_{total}(*β*) with λ_{3} *>* 0 for the same image, but for different prior mean shapes. The *last panel* shows the prior mean *μ* in each row

This experiment involves a set of cardiac images taken using an ultrasound equipment. Each set consists of a sequence of images beginning with the ED (End-Diastole) frame and ending with the ES (End-Systole) frame. An important goal in echocardiographic image analysis is to develop tools for extracting cardiac boundaries in approximately 10–15 image frames that are typically acquired between ED and ES. Different aspects of past efforts on this dataset (Wilson et al. 2000; Chen et al. 2002) have focused on both the construction of geometric figures to model the shape of the heart as well as validation of extracted contours. Given a manual tracing of the boundaries for the ED and ES frames, we aim to extract epicardial and endocardial boundaries in the intermediate frames. We first compute a geodesic interpolation of the curves between the first and the last frame (Joshi et al. 2005), and use these intermediate curves as initial conditions for boundary extraction using minimization of *E _{total}*. Figure 11 shows extraction results for the cardiac boundary for a specific frame in two different image sequences. In this experiment, we have used the squared-distance energy as

In this example, we consider the task of completing shapes of dinosaur's bones using images of bones with missing pieces (left column of Fig. 12). As a prior information, we are given some pictures of complete bones from animals of the same species (second column). For each case, we compute the mean shape *μ* of the prior observations and use the distance squared prior on shapes for Bayesian shape estimation. The estimated shapes are shown in the third column of Fig. 12. Since there is no ground truth available, we use manual segmentations by an expert (right column) for comparison. The visual closeness of the estimated shapes with expert drawings shows the utility of prior information in the shape estimation process. It is also interesting to note some differences in the two results. In each of the first two rows, the Bayes' estimate shows an elongation on the right side whereas the expert's sketch does not have a major change on the right corner.

Next we present boundary extraction results for a few images from the ETH-80 dataset, as well as some partially obscured image consisting of a hand gesture. Each row in Fig. 13 shows curve evolutions with and without the *E _{prior}* term; the upper row is for

At each step of the evolution, the curve is influenced by the relative weights of the terms *λ*_{1}, *λ*_{2}, and *λ*_{3}. The appropriate values for these weights are mostly application dependent. For noisy images lacking in reliable edge information, it is desirable that the final segmentation be largely governed by the prior term, thereby emphasizing *λ*_{3}. Figure 14 shows the changes in the final extracted curve by varying *λ*_{3} while keeping *λ*_{1} and *λ*_{2} fixed, for both types of *E _{prior}*—the squared-distance energy and the normal energy. It is noticed that an increasing contribution by the prior vector field reduces the effect of object boundary edges on the final segmentation, and eventually for large

Bayesian shape extraction is expected to benefit from the use of advanced statistical models, but different choices of *E _{prior}* can lead to different results. The Fig. 5(b) illustrates the difference in evolution when the squared-distance and the normal energies are minimized using a gradient approach. We first compare the effects of these two energy terms using primitive shapes, such as circles and ellipses. The top panel of Fig. 15 shows some circles and ellipses, and their Karcher mean

Next, we perform a similar experiment, this time with a shape prior model derived from real image observations. The data consists of a sequence of infrared images resulting from the rotation of a tank mounted on a pedestal at various elevations. We are interested in a subset of this data that corresponds roughly to an azimuthal rotation of the tank from 200° to 315° at a fixed elevation. This amounts to a change in the angle between the gun and the body of the tank at different perspectives. Figure 16 shows a few examples out of the total number of 108 of such images.

A few examples of IR images of tanks for a sweep of 207° to 312°. *First column* shows the elevation of the tank

The TPCA model is constructed by first computing the mean shape shown in Fig. 17 and then projecting the tangent vectors from all the 108 observations at the mean shape. Figure 17 also shows the decay in the eigenvalues corresponding to each principal component as well as the eigen-shape variation along the dominant eigendirections.

Next, we compare the curve evolution under both the squared-distance and the normal energy on a test image in Fig. 18. The final extracted curve for the squared-distance model closely resembles the mean shape, while converging on the boundaries of the tank image. It is noticed that there is a mismatch in the extracted shape w.r.t. the location of the turret in the image. This problem is avoided in the case of the normal model where the covariance information affects the curve evolution such that the misalignment between the turrets of the extracted curve and the tank image is reduced.

So far the estimated boundary has been obtained by implementing a gradient flow that typically yields locally optimal solutions. This is largely due to the sensitivity of gradient descent type optimizations to initializations. In such cases, one may employ stochastic optimization techniques to achieve global minima.

There are some important questions that arise immediately for this framework: (i) In case we do not have any a priori expectation on the class of shapes, which prior model can be used for Bayesian active contour framework? (ii) How to evaluate performance of such an active contour approach? (iii) How to choose *λ*s for weighting different energy terms appropriately?

We suggest the use of statistical decision theory to address these and similar questions, as described next.

What is the correct choice of the prior model for estimation of contours from a given image? In many situations, the presence of contextual information may help reduce possible hypotheses to a small number, but not necessarily to a single prior shape model. Consider for example an image taken from a battlefield containing military targets. In addition to the possibility that one of the several target types—tank, truck, jeep, hummer, etc.—may be present in that image, the shape model for even a single target will differ depending upon its pose. Rather than fixing an a priori shape model, a Bayesian approach suggests evaluating all possible models under the MAP framework. In other words, under the Bayesian approach one searches over all possible shape models and selects the model that best explains the image data. This also results in Bayesian object recognition as the selection of a shape class being equivalent to object recognition.

Let denote a discrete, finite set of shape models associated with a possible object present in an image. Different models can come from shapes of different objects, or different planar shapes resulting from different poses of the same three-dimensional object, or both. For each class *q* , assume that we have a prior shape model and a prior probability *P*(*q*) of that class. In case of the normal energy prior, this implies that we have a different mean *μ _{q}*, principal space

$$\begin{array}{ll}P(q\mid I)\hfill & =\frac{P(q){\int}_{\mathcal{B}}P(I\mid \beta )P(\beta \mid q)d\beta}{{\sum}_{q}(P(q)\int P(I\mid \beta )P(\beta \mid q)d\beta )}\hfill \\ \hfill & =\frac{P(q){\int}_{\mathcal{B}}{e}^{-{E}_{\mathit{\text{total}}}^{q}(\beta )}d\beta}{{\sum}_{q}(P(q){\int}_{\mathcal{B}}{e}^{-{E}_{\mathit{\text{total}}}^{q}(\beta )}d\beta )},\hfill \end{array}$$

where ${E}_{\mathit{\text{total}}}^{q}={\lambda}_{1}{E}_{\mathit{\text{image}}}+{\lambda}_{2}{E}_{\mathit{\text{smooth}}}+{\lambda}_{3}{E}_{\mathit{\text{prior}}}^{q}$. Since it is difficult to perform this integration analytically, we look for some computational approximations. One idea is to use importance sampling, i.e. sample from the prior and use the likelihood term as weights. Another possibility is to use Laplace's approximation of the nuisance integral using asymptotics (Grenander et al. 2000). A coarse approximation is to evaluate the integrands at , the MAP estimates. That is, choose:

$$\widehat{q}=\underset{q\in \mathcal{\text{Q}}}{argmax}(P(q){e}^{-{E}_{\mathit{\text{total}}}^{q}({\widehat{\beta}}_{q})}),$$

(23)

where * _{q}* is the MAP curve estimate for the model

We conducted an experiment that uses real image data to generate prior shape models for tanks. Figure 19(a) shows a schematic of a battle tank rotated at 3° steps in azimuth while keeping a fixed elevation. The sample infrared images corresponding to different side views of the tank are shown in Fig. 19(c). Figure 19(b) shows the automatically extracted shapes and their sample means grouped together in 9 quadrants for one half-circle. Due to viewing symmetry (the profile of a 3D object is same at antipodal viewing directions), we construct prior shape models for half of a circle, and reflect the shapes to get the models for the other half. In this experiment, the shape model comprises of the mean shape *μ _{q}* corresponding to each quadrant {

How should one evaluate the performance of this framework or any other boundary extraction algorithm? In a general segmentation framework this question is quite difficult to address. However, in our narrow framework where extracted curves correspond to boundaries of objects, and shapes of objects relate to their classes, one can use the resulting classification performance for analyzing extraction algorithm.

If is the result of our MAP estimation in an image *I*, we can use this curve to classify the object in that image. This can be done as suggested above using a Bayesian decision or using a more general classifier, e.g. SVM, nearest neighbor, etc. If we denote the classification performance on a test dataset to be *F*, then we claim that *F* can also be used to quantify performance of the original curve extraction algorithm.

In our framework, the estimate and therefore *F* will actually be a function of *λ*_{1}, *λ*_{2}, and *λ*_{3}. This suggests a technique for selecting these parameters which have been selected quite arbitrarily thus far. One can maximize *F* over these parameters on a validation set to select their values. If one seeks a fully Bayesian framework, then it would be better to use some non-informative priors on *λ*s themselves, e.g. a gamma density with large shape and scale parameters. We have left this study for future work.

This paper presents an intrinsic approach for constructing prior shape models for object extraction. Extraction is based on curve evolution under the influence of image, smoothing and prior vector fields. The strength of this work is the incorporation of object priors using a geometric shape model for improved segmentation. The shape prior is fully invariant not only to the similarity transformations, but also to re-parameterizations of curves by changes in speed. The last step comes at an additional computational cost for removing the re-parameterization group from the shape representation. The prior shape information drastically improves the object extractions under various obscuration to objects in images. The effectiveness of the shape prior in curve extraction is demonstrated using both the squared-distance energy as well as the normal energy using higher-order covariance information. This is evident from results on a variety of images including synthetic, as well as real data consisting of military, biological, and medical images.

We thank Prof. David Wilson of University of Florida for providing us the ultrasound images, Dr. Richard Sims of USAMRDEC for the tank data, and Albert Prieto-Marquez of Department of Biological Sciences, Florida State University for the use of the hadrosaurid bone images. Additionally, we thank Dr. Ian Jermyn of INRIA, Sophia Antipolis, France for several interesting discussions. This research was support in part by the Army Research Office awards W911NF-04-01-0268 and W911-NF-04-1-0113, the Air Force Office of Scientific Research award FA9550-06-1-0324, and an Innovation Alliance Award from the Northrop Grumman Corporation.

Shantanu H. Joshi, Laboratory of Neuroimaging, University of California, Los Angeles, CA 90024, USA.

Anuj Srivastava, Department of Statistics, Florida State University, Tallahassee, FL 32306, USA.

- Amit Y, Grenander U, Piccioni M. Structural image restoration through deformable templates. Journal of the American Statistical Association. 1991;86(414):376–387.
- Caselles V, Kimmel R, Sapiro G. Geodesic active contours. International Journal of Computer Vision. 1997;22(1):61–79.
- Charpiat G, Faugeras O, Keriven R, Maurel P. Approximations of shape metrics and application to shape warping and empirical shape statistics. In: Krim H, Yezzi A, editors. Statistics and analysis of shapes. Berlin: Springer; 2005. pp. 363–395.
- Chen Y, Tagare H, Thiruvenkadam S, Huang D, Wilson D, Gopinath K, Briggs R, Geiser E. Using prior shapes in geometric active contours in a variational framework. International Journal of Computer Vision. 2002;50(3):315–328.
- Cohen LD, Cohen I. Finite-element methods for active contour models and balloons for 2-d and 3-d images. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1993;15(11):1131–1147.
- Cootes TF, Taylor CJ, Cooper DH, Graham J. Active shape models,—their training and application. Computer Vision and Image Understanding. 1995;61(1):38–59.
- Cremers D, Tischhäuser F, Weickert J, Schnörr C. Diffusion snakes: introducing statistical shape knowledge into the Mumford–Shah functional. International Journal of Computer Vision. 2002;50(3):295–313.
- Cremers D, Kohlberger T, Schnörr C. Shape statistics in kernel space for variational image segmentation. Pattern Recognition. 2003;36(9):1929–1943.
- Cremers D, Osher SJ, Soatto S. Kernel density estimation and intrinsic alignment for shape priors in level set segmentation. International Journal of Computer Vision. 2006;69(3):335–351.
- Dryden IL, Mardia KV. Statistical shape analysis. New York: Wiley; 1998.
- Gage M, Hamilton RST. The heat equation shrinking convex plane curves. Journal of Differential Geometry. 1986;23:69–96.
- Grayson M. The heat equation shrinks embedded plane curves to round points. Journal of Differential Geometry. 1987;26:285–314.
- Grenander U. Abstract inference. New York: Wiley; 1981.
- Grenander U. General pattern theory. London: Oxford University Press; 1993.
- Grenander U, Chow Y, Keenan DM. Hands: a pattern theoretic study of biological shapes. Berlin: Springer; 1991.
- Grenander U, Srivastava A, Miller M. Asymptotic performance analysis of Bayesian target recognition. IEEE Transactions on Information Theory. 2000;46:1658–1665.
- Jermyn I. Invariant Bayesian estimation on manifolds. Annals of Statistics. 2005;33(2):583–605.
- Joshi SH, Srivastava A, Mio W. Elastic shape models for interpolations of curves in image sequences. IPMI. 2005:541–552. [PubMed]
- Joshi SH, Klassen E, Srivastava A, Jermyn IH. Removing shape-preserving transformations in square-root elastic (sre) framework for shape analysis of curves. In: Yuille A, et al., editors. LNCS: Vol 4679 Proceedings of energy minimization methods in computer vision and pattern recognition (EMMCVPR) Berlin: Springer; 2007a. pp. 387–398. [PMC free article] [PubMed]
- Joshi SH, Klassen E, Srivastava A, Jermyn IH. An efficient representation for computing geodesics between n-dimensional elastic shapes. IEEE conference on computer vision and pattern recognition (CVPR); June 2007.2007b. [PMC free article] [PubMed]
- Kass M, Witkin A, Terzopolous D. Snakes: Active contour models. International Journal of Computer Vision. 1988;1(4):321–331.
- Kichenassamy S, Kumar A, Olver P, Tannenbaum A, Yezzi A. Gradient flows and geometric active contour models. Proc international conf computer vision. 1995:810–815.
- Klassen E, Srivastava A. Geodesics between 3D closed curves using path straightening. European conference on computer vision. 2006;I:95–106.
- Klassen E, Srivastava A, Mio W, Joshi SH. Analysis of planar shapes using geodesic paths on shape spaces. IEEE Transactions Pattern Analysis and Machine Intelligence. 2004;26(3):372–383. [PubMed]
- Leventon M, Eric W, Grimson L, Faugeras O. Statistical shape influence in geodesic active contours. Proc IEEE conf comp vision and pattern recognition. 2000:316–323.
- Li C, Liu CJ, Fox MD. Segmentation of external force field for automatic initialization and splitting of snakes. Pattern Recognition. 2005;38(11):1947–1960.
- Michor PW, Mumford D. Riemannian geometries on spaces of plane curves. Journal of the European Mathematical Society. 2006;8:1–48.
- Miller MI, Younes L. Group actions, homeomorphisms, and matching: a general framework. International Journal of Computer Vision. 2001;41(1/2):61–84.
- Mio W, Srivastava A. Elastic-string models for representation and analysis of planar shapes. Proc IEEE conf comp vision and pattern recognition. 2004:10–15.
- Mio W, Srivastava A, Joshi SH. On shape of plane elastic curves. International Journal of Computer Vision. 2007;73(3):307–324.
- Patrangenaru V, Bhattacharya R. Large sample theory of intrinsic and extrinsic sample means on manifolds. Annals of Statistics. 2003;31(1):1–29.
- Rousson M, Paragios N. Shape priors for level set representations. European conference in computer vision (ECCV) 2002;II:78–93.
- Samir C, Srivastava A, Daoudi M. Three-dimensional face recognition using shapes of facial curves. IEEE Transactions on Pattern Analysis and Pattern Recognition. 2006;28(11):1858–1863. [PubMed]
- Shah J.
*H*^{0}-type Riemannian metrics on spaces of planar curves. 2005. Preprint http://arxiv.org/abs/math.DG/0510192. - Small CG. The statistical theory of shape. Berlin: Springer; 1996.
- Srivastava A, Jain A, Joshi S, Kaziska D. Statistical shape models using elastic-string representations. In: Naraynan PJ, editor. LNCS: Vol 3851. Proceedings of Asian conference on computer vision (ACCV) Berlin: Springer; 2005a. pp. 612–621.
- Srivastava A, Joshi SH, Mio W, Liu X. Statistical shape analysis: Clustering, learning and testing. IEEE Transactions Pattern Analysis and Machine Intelligence. 2005b;27(4):590–602. [PubMed]
- Staib LH, Duncan JS. Boundary finding with parametrically deformable models. IEEE Transactions Pattern Analysis and Machine Intelligence. 1992;14(11):1061–1075.
- Van Trees HL. Detection, estimation, and modulation theory (Vol I) New York: Wiley; 1971.
- Vaillant M, Miller MI, Younes L, Trouve A. Statistics on diffeomorphisms via tangent space representations. NeuroImage. 2004;23:161–169. [PubMed]
- Vemuri B, Chen Y. Joint image registration and segmentation. In: Osher S, Paragios N, editors. Geometric level set methods in imaging, vision, and graphics. Berlin: Springer; 2003. pp. 251–269.
- Wilson D, Geiser E, Larocca J. Automated analysis of echocardiographic apical 4-chamber images. Proc international society for optical eng in math modeling, estimation, and imaging. 2000:128–139.
- Xu C, Prince J. Generalized gradient vector flow external force for active contours. Signal Processing. 1998a;71(2):131–139.
- Xu C, Prince J. Snakes, shapes, and gradient vector flow. IEEE Transactions Image Processing. 1998b;7(3):359–369. [PubMed]
- Yezzi AJ, Mennucci A. Conformal metrics and true “gradient flows” for curves. Proc IEEE international conf computer vision. 2005:913–919.
- Younes L. Computable elastic distance between shapes. SIAM Journal of Applied Mathematics. 1998;58(2):565–586.