Home | About | Journals | Submit | Contact Us | Français |

**|**Bioinformatics**|**PMC2881375

Formats

Article sections

Authors

Related links

Bioinformatics. 2010 June 15; 26(12): i13–i20.

Published online 2010 June 1. doi: 10.1093/bioinformatics/btq190

PMCID: PMC2881375

* To whom correspondence should be addressed.

Copyright © The Author(s) 2010. Published by Oxford University Press.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article has been cited by other articles in PMC.

**Motivation:** The centrosome is a dynamic structure in animal cells that serves as a microtubule organizing center during mitosis and also regulates cell-cycle progression and sets polarity cues. Automated and reliable tracking of centrosomes is essential for genetic screens that study the process of centrosome assembly and maturation in the nematode *Caenorhabditis elegans*.

**Results:** We have developed a fully automatic system for tracking and measuring fluorescently labeled centrosomes in 3D time-lapse images of early *C.elegans* embryos. Using a spinning disc microscope, we monitor the centrosome cycle in living embryos from the 1- up to the 16-cell stage at imaging intervals between 30 and 50 s. After establishing the centrosome trajectories with a novel method involving two layers of inference, we also automatically detect the nuclear envelope breakdown in each cell division and recognize the identities of the centrosomes based on the invariant cell lineage of *C.elegans*. To date, we have tracked centrosomes in over 500 wild type and mutant embryos with almost no manual correction required.

**Availability:** The centrosome tracking software along with test data is freely available at http://publications.mpi-cbg.de/itemPublication.html?documentId=4082

**Contact:** ed.gbc-ipm@hcsneaj

Genetic screens that produce and require analysis of massive amounts of image data are becoming a major tool in functional genomics. Here, we do so to study the assembly and maturation of centrosomes, an organelle in animal cells that serves as a microtubule organizing centre (Alberts *et al.*, 1994) during mitosis, in the nematode *Caenorhabditis elegans*. In principle, centrosomes can be of any size. The embryo, however, exhibits remarkable precision in setting centrosome size in relation to cell size that get progressively smaller through development. This is crucial because centrosome size in turn sets the length of the mitotic spindle (Greenan *et al.*, 2010) and is involved in signaling activities within the cell (Fletcher and Muschel, 2006). Several studies (Kirkham *et al.*, 2003; Song *et al.*, 2008) have identified genetic factors of centrosome size but the molecular mechanism of how a cell adopts the size of its dynamic organelles remains poorly understood. In these studies, only the spatial but not the temporal information of centrosome growth has been considered and so we aim to overcome this shortcoming by precisely quantifying the *kinetics* of centrosome growth in wild type and mutant embryos.

To visualize centrosomal proteins of interest in living *C.elegans* embryos, we use fluorescent markers such as GFP and reduce their levels by RNA interference (RNAi) in order to assay function. We then produce 3D time-lapse images of such labeled and/or genetically altered centrosomes. In this article we present a suite of algorithms that analyze these images (Fig. 1).

A typical example of a *C.elegans* embryo with γ-TUB::GFP labeled centrosomes in the 4-cell stage (maximum intensity projection). The centrosomes are annotated by our algorithm in terms of cell names and anterior/posterior position and registered **...**

In the earliest related work, Bao *et al.* (2006) report a system that identifies and tracks nuclei and meta-phase chromosomes in *C.elegans* via histone-GFP. Their goal is to trace the cell lineage up to hundreds of cells in one color channel and use a second color channel to monitor gene expression. Compared to our problem, the objects tracked are large, their phase easily distinguished by their shape, and their frame-to-frame movement is small, none of which are true for centrosomes. Recently, Kerekes *et al.* (2009) present an algorithm for tracking centrosomes in neuron cells for the study of neuron migration. Short sequences of only 20 time points at 16 s intervals are recorded and the analysis cannot handle multiple centrosomes at the same x/y-location. These circumstances are thus considerably simpler than those considered here^{1}.

To robustly solve the problem of tracking centrosomes in developing *C.elegans* embryos, we have extended a bipartite matching based framework for object tracking (Kalaidzidis, 2007) to incorporate prior knowledge of the centrosome cycle (see the following subsection). In that, we use the novel approach of first reliably finding those subsegments of a trajectory in which the centrosome is bright and large, and then extending these ‘core’ trajectories backward in time to catch the initiation of the centrosome where it is very small and dim, and forward in time to catch the disappearance of the centrosome when it appears to ‘explode’ or dissolve. Beyond establishing centrosome trajectories, our system also recognizes the nuclear envelope breakdown (Gorjanacz *et al.*, 2007) in each cell division and determines the biological identities of the centrosomes based on the canonical naming scheme of *C.elegans*' invariant cell lineage. The developed algorithms are highly accurate and so the results they produce can be used directly or if perfection is desired, require very little additional human curation. We have thus produced a truly automatic system that analyzes centrosome movies, delivering temporally aligned time series of centrosome kinetics ready for biological interpretation.

During cell division, centrosomes are formed temporarily to orchestrate the organization of the microtubule fibers that segregate two copies of the DNA complement of the cell. The process of centrosome formation and eventual dissolution is in broad outline as follows. At the beginning of each cell division a pair of centrosomes begins to assemble around a pair of orthogonally arranged, barrel-shaped centrioles, each 200-nm in length, located next to the nucleus. The centrosomes then move to opposite poles of the nucleus while continuously increasing in size due to the recruitment of proteins such as γ-Tubulin and SPD-2 from the cytoplasm. As the nucleus breaks down (Gorjanacz *et al.*, 2007), microtubules grow outward from the centrosomes and attach to the chromosomes and to the anterior and posterior cell cortex. Thereafter the centrosomes are pulled apart, toward opposite sides of the cell, by means of these fibers. As the centrosomes do so, they pull copies of each chromosome toward them to form the DNA content of each emerging daughter cell. Finally, the centrosomes come to a stop and are rapidly disassembled. This appears in the images to be analyzed as either an explosion into several pieces that then rapidly dissolve or as a dissolving of the entire centrosome.

We follow this process through the first four to five rounds of cell divisions of a *C.elegans* zygote (fertilized egg). The process becomes slower the higher the cell stage, the centrosomes become smaller, and they reach their maximum size earlier in the division process. Their maximum diameter varies from ~ 1 −5μm depending on the cell stage and genetic factors. Due to the asynchrony of the cell divisions and the fact that we wish also to track mutational variants, one cannot assume that the number of centrosomes is a power of 2 at any given time. The mutational variants can further affect centrosome size, movement, the number per cell and cell cycle duration as well as cell size and embryo size. On the other hand, one can safely assume that the centrosome is always constant or increasing in size up to beginning of its dissolution and they have a biophysical limit on their maximum velocity.

Using GFP-strains of any centrosomal protein such as γ-TUB::GFP, we image centrosomes in developing *C.elegans* embryos by live-cell fluorescence microscopy for up to ~ 2h with a spinning disc microscope. The GFP-signal attenuates significantly towards the bottom of the embryo and also decreases with time due to photo-bleaching. Although it takes only ~ 6 s to acquire a 3D stack with 50 planes, we image only at time intervals between 30 and 50 s to minimize laser-induced damage of the embryo as well as said photo-bleaching. Each 3D stack is 512 × 512 × 50 pixels, and typically 100 time points are sampled resulting in roughly 2 Gb of data per experiment. The image resolution in x and y is 133 nm per pixel and the distance between two z-planes is set to 500 or 600 nm. The embryo is usually stationary, but small movements cannot be excluded and manual adjustments of the focus during the pause between two time point samplings may cause additional movement of the whole embryo in the z-direction. With the laser set to minimize exposure levels, the signal to noise ratio is quite low and autofluorescence as well as random particles further corrupt the images. Nonetheless, the ellipsoidal shape of centrosomes and the temporal coherence of their trajectories allow us to reliable analyze the resulting movies.

Our tracking algorithm begins like many others by over-detecting spot-like objects. It is tuned to very high sensitivity, so that missing detections virtually do not occur. We then find the core sub-trajectories that we are confident are correct, in that the appearance and displacement vector sequence of the centrosome through the trajectory are mutually consistent and the individual spot calls used are clearly not noise. Our approach is further distinctive in that it adaptively determines the scoring and threshold parameters that yield truly reliable core trajectories. Every core trajectory, which is invariably the middle portion of a complete trajectory, is then extended to find the best set of complete trajectories over the set of all called objects. Occasionally, two non-overlapping core trajectories are found for a given complete trajectory in which case, after the extension step, they abut in time. In a final phase, we recognize these situations and concatenate the trajectories involved. The key idea is to understand what is certain, and then extrapolate the rest around that.

We first over-detect possible centrosomes in each 3D stack by finding all objects that are to first approximation a Gaussian spot of some radius. Each stack *I* is first smoothed slightly to give *I*_{S} by applying a 3D-Gaussian filter of radius ~.6 pixels chosen as per Thomann (2003). We consider there to be an object at position *p* with radius σ if (i) the value at *p* in *I*_{S} is locally maximal, and (ii) the normalized cross-correlation (Haralick and Shapiro, 1992) of the raw image *I* with a Mexican hat filter centered at *p* is greater than τ. Several discrete values of σ are used and we find that setting τ to .70 gives a very low false negative rate (i.e. it is rare to not detect a centrosome) while leaving us with a manageable number of false positive signals (e.g. noise and autofluorescent ‘spots’).

Unfortunately, the same centrosome will be detected multiple times at slightly different radii and different positions. To solve this problem, in a post-processing step we first compute for each detected object *p* a refined estimate of its (*x*, *y*)-position, its radius ρ(*p*) and its mean-illumination-to-background ratio η(*p*) as we will explain below. We then iteratively detect pairs of objects *p* and *q* that are closer together than the sum of their radii ρ(*p*)+ρ(*q*) and discard the dimmer object (i.e. the one with the smaller η-value). This eliminates on average 54% ± 10% (SD) (*n* = 40 movies) of the initially detected objects and is clearly conservative as two centrosomes do not intersect in space.

Another problem that may cause tracking errors are objects outside the embryo such as the polar body or remains of the embryo's parent. We remove these by first detecting the contour of the embryo in a maximum intensity projection using Otsu thresholding (Otsu, 1979) and marker-based watershed segmentation (Meyer, 1994), and then discarding all object calls that are outside the convex hull of the detected embryo contour. This step eliminates on average 28% ± 19% (SD) (*n* = 40) of the objects from the last step.

We now have a set of possible centrosome centres and an estimate of their radius. Empirically, it is exceeding rare for a centrosome to not be in this set, and anywhere from 0 to 90% (average: 34% ± 14% (SD), *n* = 40) of the remaining spots are false positive signals (e.g. noise, etc.) depending on the time point and acquisition conditions.

To refine a putative centrosome's centre (*x*(*p*), *y*(*p*)) and compute its radius ρ(*p*), its mean-illumination-to-background ratio η(*p*) as well as its total illumination ι(*p*), we find the best least squares fit of the spot at *p* to an elliptical 2D-Gaussian model with a background offset (using the trust-region-reflective algorithm (Coleman and Li, 1996) implemented in Matlab). We do so only in the 2D plane with the same z-value as p, primarily because 3D statistics are less reliable due to the relatively poor z-resolution of confocal microscopy as well as movement of the centrosome while it is imaged. All parameters are free and real-valued so that in the end we have a sub-pixel centre (*x*(*p*), *y*(*p*)), angle θ(*p*) of the principle axis, SD σ_{p}(*p*), σ_{m}(*p*) along the principal and minor axes, background level *b*(*p*) and peak height *h*(*p*). Figure 2(a) and (b) show an object height map and the fitted Gaussian, respectively. From this Gaussian, we then have radius , η(*p*) = mean intensity within ρ_{G}(*p*)/*b*(*p*), and ι(*p*) = volume under the Gaussian - b(p).

Measuring centrosome radius. (**a**) An image of a centrosome and its intensity profile. (**b**) Gaussian fit (colored surface) overlaid on the centrosome intensity profile (meshed surface). (**c**) A non-Gaussian centrosome intensity profile together with its Gaussian **...**

Occasionally, when centrosomes are very small and near each other, they are recognized as a single object. The signature of these cases are that the ratio *r*=σ_{p}(*p*)/σ_{m}(*p*) is large and σ_{m}(*p*) is small. In these cases, we try fitting the signal to *two* Gaussian models and if we have a better residual and both σ-ratios are smaller than *r* then we accept the two spots scenario and associated Gaussian models.

At the resolution of light microscopy there is no clear boundary between a centrosome and the background. Therefore, the radius of the fitted Gaussian is actually only a measure that is relative to an unknown but fixed scaling constant. Above, we empirically chose this scaling constant to be 2 based on human perception of centrosome size. Note, however, that the value of this constant plays no role when comparing the size of centrosomes relative to each other. Furthermore, while the model above is sufficient for the purposes of computationally inferring centrosome tracks, for the final biological analysis of the data one needs as accurate a measurement of centrosome size as is possible and so we undertake the following two refinements.

In the late stages of a centrosome's trajectory, when the centrosome is effectively dissolving, the intensity profile of the spot is not Gaussian, typically having a dip in the centre as illustrated in Figure 2c. To more accurately measure the radius of the object in these cases, we assume to first approximation that the illumination of the spot is symmetric with respect to the sub-pixel center and elliptical iso-contours of the fitted Gaussian, and compute the radial intensity profile of the spot as a function of ellipse radius. That is for each ellipse centered about (*x*(*p*), *y*(*p*)) with skew σ_{p}(*p*)/σ_{m}(*p*) and angle θ(*p*), we compute the mean intensity along this ellipse. This is illustrated in Figure 2d along with the radial intensity profile of the fitted Gaussian. The spot radius ρ_{S}(*p*) is then estimated as that value for which the radial intensity of the raw image equals the radial intensity of the fitted Gaussian at ρ_{G}(*p*).

Finally, we correct for image distortion. Due to the convolution of each light point with the point spread function of the microscope, objects appear bigger in the images than they actually are. An approximation is given by (convolution of a Gaussian object with a Gaussian PSF) where ρ_{PSF} is the width of the point spread function and ρ(*p*) is our final estimate of the radius of *p*. Our final estimate is within 1% of the true value when tested against beads with a diameter of 1100 nm injected into the embryo.

Let *V*_{t}={*p*^{t}_{1}, *p*^{t}_{2},…, *p*^{t}_{mt}} be the set of detected objects at time point *t*=1…*n* and *V*=*V*_{1}…*V*_{n} be the set of all objects. Note that for each time point the number *m*_{t} of objects can be different. We define a weighted directed graph *G*=(*V*, *E*, *w*) where the set of edges represents all possible matching hypotheses between objects detected at successive time points. There is an edge (*p*, *q*) between objects *p**V*_{t} and *q**V*_{t+1} if and only if:

- ‖
*p*−*q*‖≤*d*_{max}:*d*_{max}models the biophysical limit on centrosome velocity (300 nm/s) and also takes into account cell movement within the embryo as well as movement of the entire embryo. - if ρ(
*p*)>*r*_{init}then ρ(*q*)>*r*_{cont}: Biologically the centrosome is always increasing or constant in size with time up to the beginning of its dissolution but due to measurement and estimation error, the computed radius varies. Empirically, we have observed that once the radius exceeds*r*_{init}=700 nm (well above the resolution limit), it never becomes smaller than*r*_{cont}=600 nm. We call this criterion radius hysteresis and in what follows we require that it not only be true of edges but also of sequences of edges, i.e. once a centrosome becomes larger than*r*_{init}on a trajectory it does not then become smaller than*r*_{cont}later in the trajectory (Fig. 3).

These two simple conditions eliminate a large number of potential correspondences that are certainly incorrect.

The weight *w*(*p*, *q*) of an edge of the graph is intended to reflect the chances that *p* and *q* represent the same centrosome. Biologically we know that centrosomes move in a relatively smooth manner and that facets of their appearance such as their radius and intensity also change gradually. Thus we expect *p* and *q* to be close together and for their appearance to be similar. We thus chose to define the weight or ‘distance’ between *p* and *q* as:

where

where *d*_{S} is a normalized spatial distance, *d*_{A} is a normalized relative-appearance distance and α[0, 1] is a free parameter (to be selected by the algorithm) that weights the contributions of these two aspects of coherence. In our implementation we use two appearance features, _{1}(*p*) the radius of *p* and _{2}(*p*) the total intensity of *p*. The distances are normalized using the concept of a z-score (http://www.stats4students.com/Essentials/Standard-Score/Overview.php) wherein the measure is the number of standard deviations above or below the mean value of a distribution of ‘raw’ scores. In our case, the set of edges *D* contributing to the distribution is decided dynamically during the algorithm and so will be described later.

A solution *S* to the tracking problem is a set of edges in *G* that form a number of vertex-disjoint paths, each corresponding to a centrosome's lifespan. We could add additional constraints on these paths for wild-type data, but since we want solutions for mutants as well, we can in general not require more structure than this. However, suppose that there were no false-positive objects in the graph, that is, every vertex truly represents a centrosome. Then our solution in this special case is the set of paths that involves the maximal number of edges, and if there are several, then we want the one with the set of edges of minimum weight. This is equivalent to finding the best bipartite matching between *V*_{t} and *V*_{t+1} for each time *t* and we can do so using the Hungarian algorithm (Munkres, 1957). It is a standard technique to use bipartite matching even when there are false positive objects and we will do so numerous times with different parameter settings as a way of discovering these false positives. Moreover, as we solve each bipartite matching for successive value of *t*, we eliminate any edges between *V*_{t} and *V*_{t+1} whose addition to the trajectories up to time *t* would result in a trajectory with radius hysteresis (Fig. 3).

The first step is to find the middle part of each centrosome's trajectory where the centrosome is large and bright. We refer to this part of a trajectory as the core trajectory. While it is desirable to establish as much of each trajectory as possible, it is more important that the core trajectories be error-free.

For several thresholds ωΩ, we consider the subgraph of *G* restricted to the objects that are brighter than ω, i.e. *V*(ω)={*p**V*|η(*p*)≥ω}. Note that the larger the threshold ω, the fewer false-positive objects there are in *V*(ω) until at some value, not yet known to us, practically no false-positives remain. We first apply the bipartite matching heuristic described above with α=0 and σ_{S}=1 on *V*(ω) yielding a solution *S*(ω) that is the set of edges in the optimal matching at each time point. This solution is solely based on the spatial distances *d*_{S} between the objects. Given *S*(ω), we let it be the set of edges *D* for computing the means and standard deviations of the spatial and feature differences that then determine normalized scores for *d*_{S} and *d*_{A}. For a series of different values of α>0, we then apply the bipartite matching heuristic where the weights of each edge are determined by D=*S*(ω) and α. We refer to this solution as *S*(ω, α) and observe that it takes object appearance into account as well.

Our key idea is that if *S*(ω)=*S*(ω, α) or nearly so, and also the larger α, then the more likely it is that the edges in *S*(ω)∩*S*(ω, α) are all correct. In the ideal case, *S*(ω) would equal *S*(ω, 1) for some ω. So for a set of choices (ω, α)Ω × *A*, we compute a table of the number of conflicts between *S*(ω) and *S*(ω, α) as *K*(ω, α)=|*S*(ω) − *S*(ω, α)| and find the smallest value *K*^{*}=*min*{*K*(ω, α)}. Almost always *K*^{*}=0 and there are multiple choices of ω and α that give the value *K*^{*}. Among these we select (ω^{*}, α^{*}) where α^{*} is the largest α that gives us *K*^{*} for some ω, i.e. α^{*}=*max*{α|ω : *K*(ω, α)=*K*^{*}}, and ω^{*} is the lowest possible brightness threshold ω, among those at blending factor α^{*} that give *K*^{*}, i.e. ω^{*}=*min*{ω | *V*(ω, α^{*})=*K*^{*}}. To conclude, we let the edges in χ=*S*(ω^{*})∩*S*(ω^{*}, α^{*}) be our core trajectories and in most cases α^{*}≥.8. In effect, we have found the level ω^{*} that minimizes the possibility of having false positives in *V*(ω^{*}) and hence the solution to the bipartite matching heuristic is most likely to be correct. The so established core trajectories include on average 86% ± 11% (SD) (*n*=40) of the centrosome objects of the complete trajectories that we wish to establish.

Given the core trajectory of each centrosome, we now go back to the full set of called objects and extend the trajectories forward and backward in time to obtain the full trajectories. To this end, we apply the same core procedure of our tracking algorithm with edge weights that combine spatial distance and appearance distance as described above with the weighting factor α=0.3 and *D*=χ. The choice of α was empirically chosen and indicates a slight preference towards minimizing distance changes over appearance changes during the extension phase. Moreover, we compute the mean μ_{χ} and SD σ_{χ} of the weights of all the edges in the core trajectories and eliminate from *E*−χ all edges whose weight is >4 SDs above the mean, i.e. μ_{χ}+4σ_{χ}, as being implausible (*E* is the set of all edges as defined in Section 2.2.3).

In the following, we will describe the track extension algorithm for the backward case. Forward extension is analogous. We extend core tracks backward iteratively starting with *t*=*n* and working backward to 1. Suppose we have already iteratively extended backward to time *t* and let *B*(*t*) be the set of edges in our current set of tracks where the induction starts with *B*(*n*)=χ, the set of core tracks computed in the previous subsection. Note that all edges (*a*, *b*)*B*(*t*) with *b**V*_{t′≤t} are core trajectory edges.

To extend to time *t*−1 we first find a bipartite matching between *V*_{t−1} (i.e. the set of all objects at time *t*−1) and *U*(*t*)={*b**V*_{t} | *a* : *a**V*_{t−1} ∧ (*a*, *b*)*B*(*t*)}{*b**V*_{t} | *c* : *c**V*_{t+1} ∧ (*b*, *c*)*B*(*t*)} (i.e. the set all objects at time *t* that are adjacent to an edge in *B*(*t*)) using the Hungarian algorithm with the weights and edges described in the previous paragraph. We remove from consideration any edge whose addition to the established trajectories would result in a trajectory with radius hysteresis.

Suppose the algorithm returns the set of edges *H*_{t} as the best matching between *V*_{t−1} and *U*(*t*). We must reconcile these edges against the core trajectory edges χ_{t}=(*V*_{t−1} × *V*_{t})∩*B*(*t*) that are currently connecting time *t*−1 and *t*. An edge *e*=(*a*, *b*)*H*_{t} is compatible (with *B*(*t*)) if either *e*χ_{t} (i.e. *e* is also a core edge) or there are no edges adjacent to *a* and no edges between any vertex at *t*−1 and b in B(t) (i.e. *a* is not in any core trajectory and *b* is not already matched to another object at *t*−1). There is only one situation in which we will accept non-compatible edges in *H*_{t} as part of the new extension *B*(*t*−1). A pair of edges (*a*, *b*) and (*c*, *d*) in *H*_{t} is an *exchange* [w.r.t. *B*(*t*)] if and only if (*a*, *d*)*B*(*t*) and there are no edges adjacent to *c* or between any vertex at *t*−1 and *b* in *B*(*t*). This scenario typically arises when a centrosome pair first forms and their spots are so dim that at time *t*−1 *c* is not above threshold ω^{*} and so *a* is inadvertently paired with *d* instead of *b*. During the extension phase *c* is present and the exchange straightens the problem out. Figure 4 illustrates an exchange. In conclusion, we add to our current set of edges all the compatible and exchange edges in *H*_{t} and remove those (core) edges that are replaced by an exchange edge to arrive at *B*(*t*−1), formally:

Illustration of an exchange during the backward extension phase (the edges are drawn in the opposite direction only to illustrate that we work backward in time). The core trajectory edge (*a*, *d*)*B*(*t*) is replaced by the exchange edges (*a*, *b*) **...**

We apply the same procedure forward in time to obtain a sequence *F*(1)=χ∩*B*(1), *F*(2),…, *F*(*n*) of forward extensions. Note carefully that we start the forward induction from χ∩*B*(1) (i.e. the set of all core trajectory edges that have not been replaced by an exchange edge during the backward extension phase) and not *B*(1). So it may be that *B*(1) and *F*(*n*) are not mutually compatible exactly as defined above, that is, *B*(1)*F*(*n*) may have tracks with radius hysteresis or tracks that split or merge. First, we resolve every such fork in the merged result by eliminating the edge [which depends on whether it is in *B*(1) or *F*(*n*)] with the higher weight. When such an elimination is required, we wanted to be sure that the extension procedure whose edge was eliminated would not have made a different extension in light of this. So we rerun the forward and backward extensions with the incompatible edges removed. We repeatedly do this until *B*(1) and *F*(*n*) are compatible, typically one or two iterations. We then further remove from *B*(1) and *F*(*n*) those extension edges with highest weight that would link (non-overlapping) core trajectories in the merged result in order to prevent introducing edge sequences with radius hysteresis. The result of the extension phase is the set of tracks *T*=*B*(1)*F*(*n*).

Occasionally, two or more non-overlapping core trajectories are found for a given centrosome, in which case, after the extension step, the trajectory is broken into several parts that abut in time. In a final phase, we search to exhaustion for edges that link two trajectories fulfilling the three conditions below and add the edge, effectively ‘stitching’ together two trajectories.

- The stitched trajectory must not be longer than a generous maximum bound of 45 min on the duration of the centrosome cycle.
- The differences in position, radius and integrated intensity across the stitching edge must not be greater than the respective maximum difference over all trajectories established so far.
- The stitched trajectory must not have radius hysteresis (
*r*_{init}=700 nm,*r*_{cont}=600 nm).

We then repeat the stitching process with less strict conditions for (i) trajectories that are shorter than a minimum of 10 minutes on the duration of the centrosome cycle and (ii) trajectories with initial radius >600 nm. For these rare cases of certainly broken trajectories we allow twice the respective maximum difference (condition 2) and relax the radius hysteresis to *r*_{init}=1000 nm and *r*_{cont>}=550 nm (condition 3).

It is often necessary to know the biological identities of the tracked centrosomes, that is, the identity in the *C.elegans* lineage (Schnabel *et al.*, 1997; Sulston *et al.*, 1983) of the cell it is located in and whether it is the anterior or the posterior associated pole of the spindle. We determine these identities by establishing a binary tree of trajectories representing the pair and parent/child relationships between the tracked centrosomes as follows.

- For centrosome movies that start at the 1-cell stage the anterior/posterior orientation of the embryo is easily determined as the sperm always enters on the posterior side of the embryo (Goldstein and Hird, 1996). Thus, the pole closest to the centrosomes detected in the first frame of the movie is the posterior pole. For movies that start at later cell stages, we additionally take into account the stereotypical timing of the cell divisions to figure out where the posterior pole is. For example, in the 4-cell stage, the cells ABa and ABp divide before EMS, and EMS divides before P
_{2}. - The centrosomes are paired by finding an optimal bipartite matching between their trajectories. There is an edge between trajectory
*k*and*h*iff (i) they overlap in time by at least 6 min, (ii) the distance between the centrosomes is positively correlated with time (i.e. they tend to separate) and (iii) their minimum centrosome separation distance is <12 μm. The weight*w*(*k*,*h*) between*k*and*h*is the product of (i) the minimum centrosome separation distance in the first half of the trajectories and (ii) the absolute difference in radii averaged over time. Intuitively, we expect a pair to be near each other and for their centrosomes to be at the same growth stage. For the pairs so established, we label the one closer to the posterior pole at the end of its trajectory as the posterior child and the other as the anterior child. - In the last step, we determine for each centrosome pair its (common) parent centrosome, again by computing a minimum-weight bipartite matching on trajectories where the edge weight is the distance between the last location of a potential parent and the first location of a potential child. Edges are only present between trajectories where the parent ends no later than 3 min after the child's begin and the child begins no later than 15 min after the parent's end.

Given the invariant cell lineage of *C.elegans* and its canonical naming scheme with respect to anterior/posterior localization, we then only need to traverse the resulting tree level by level to assign a cell name to each centrosome.

In order to compare the time series of centrosome statistics with each other we need to register them in time. It is common practice in biology to define such a registered time axis relative to a cell cycle event. We use nuclear envelope break down (NEBD) (Gorjanacz *et al.*, 2007) because it can be observed in our images without an additional marker. The nucleus is free of centrosomal proteins and so appears as a dark, circular region in the cytoplasm. As the nucleus breaks down, there is an inflow of GFP-labeled molecules from the cytoplasm into the nuclear region and its fluorescent intensity increases. The NEBD is the time point for which the rate of this inflow of luminosity is maximal.

As the dark nuclear region is difficult to delineate based purely on signal, we take advantage of prior knowledge. The nucleus is located roughly halfway between the two centrosomes and in the early stages its diameter is in the range of 5–8 μm. We thus have an estimate of an image region in which the darkest pixels correspond to the nucleus and can compute a time versus nuclear intensity curve.

Given a pair of centrosomes, we consider a circle half way between them in the xy-dimension and consider the minimum intensity projection of the z-planes between them within that region. We estimate the intensity of the nucleus as the average of the darkest half of the pixels in this circle. Plotting this value through time yields a curve illustrated in Figure 5. We first find the time of the maximum intensity of the curve and then set as the NEBD the time of the maximum rate of change of the intensity in the preceding 3 min.

We have analyzed over 500 centrosome movies that cover different stages of embryonic development between the 1- and the 16-cell stage. To quantify the tracking performance, we randomly selected 10 multi-cell stage movies for visual inspection. Due to the small number of movies that contain the 8- and 16-cell stage in our corpus, we additionally inspected three random examples of such movies. Many papers on tracking systems including (Bao *et al.*, 2006) identify three types of errors: failure to incorporate an object, incorporation of a false-positive object and incorporation of an incorrect object in a given track. We use a finer error categorization because some kinds of errors are more severe than others. For example, a completely missed centrosome is a more severe tracking error than a trajectory that misses only a couple of frames at the beginning of the cycle when the centrosome is a tiny dim spot. We use the following error categories that cover the space of all observed errors.

*Short track.*The track is correct but misses a few frames at either end.*Long track.*The track is correct but has a few extra frames at either end.*Broken track.*A centrosome is completely tracked but the trajectory is broken into two or more pieces. Stitching the involved tracks would correct the problem.*Fused track.*Two or more centrosomes are in a track but splitting the track would correct the problem.*Chimera track.*The track contains more than one centrosome or non-centrosome object and it is not a long or fused track.*Missing track.*A centrosome is not tracked at all.*Particle track.*A non-centrosome object is tracked but involves no centrosomes.

Table 1 summarizes the evaluation results. Tracks that are a few frames too short are the most prevalent error but it is also the least severe because it only involves faint, tiny centrosomes at the very beginning of the cycle or diffuse, disassembled centrosomes at the very end of the cycle. Fusion errors and particle tracks never occurred. More severe errors occurred only in the 16-cell stage. For example, two missed centrosomes were due to their being so deep in the stack, and hence so dim, that they were not picked up in the core trajectory extraction step.

To evaluate the performance of our automatic NEBD detection method, we manually determined the NEBD for 30 randomly selected movies containing a total of 95 cells and the difference in frames to that determined automatically. During manual NEBD detection, we experienced that it was often ambiguous as to the exact frame in which NEBD occurred. Therefore, we considered automatic NEBD detection to be correct if it was within one frame of the manually determined NEBD. We measured exact agreement for 64% and agreement within one frame for 34% of the cases, yielding 98% accuracy for NEBD detection.

To validate the utility of our system for studying centrosome size, we checked if it could detect previously described phenotypes. As an example, TAC-1 is a protein essential for microtubule assembly *in vivo* (Bellanger *et al.*, 2003; Srayko *et al.*, 2003). Depletion of TAC-1 causes centrosomes to get less strongly pulled apart during disassembly due to reduced microtubule forces. We knocked down TAC-1 by RNAi and plot in Figure 6(a) the mean centrosome radius with time versus wild type. As expected, centrosomes in tac-1(RNAi) embryos (red curve) are eventually smaller than in wild-type embryos (black curve). Moreover, spindles are shorter compared to wild type embryos [Fig. 6(b)] confirming previous results by (Srayko *et al.*, 2003). Our time-resolved data additionally shows that TAC-1 has no significant effect on centrosome size before centrosome disassembly begins at metaphase (~150 s after NEBD).

TAC-1 affects centrosome size only at the end of the centrosome cycle when microtubules pull the centrosome apart. (**a**) Graph showing centrosome radius over time in one-cell stage γTUB::GFP embryos (*n* = 12 for wild type, *n* = 9 for tac-1(RNAi), **...**

A protein that actually affects the size of centrosomes is the kinase AIR-1. It is required for centrosome maturation and also involved in spindle assembly (Hannak *et al.*, 2001). Partial depletion of AIR-1 leads to significantly smaller centrosomes compared to wild type [Fig. 7(a)] and after nuclear envelope breakdown the centrosome separation cannot be maintained [Fig. 7(b)].

We have described a system for tracking and precisely quantifying the size of GFP-labeled centrosomes in early *C.elegans* embryos. The strengths of our system are its low error rate and its ability to automatically extract and visualize all the data that is necessary to convert raw movies into aligned and correctly labeled time-series of centrosome life cycles ready for biological interpretation. This facilitates quantitative studies of centrosomes to an extent and precision that would not be possible with manual image analysis.

The robustness of our tracking algorithm relies on three things. First, in our layered approach we initially focus on sub-problems that can be solved with very high confidence. These reliable partial solutions are then incrementally refined until the overall tracking problem is solved. However, while the general idea of a layered approach could also be applied to other tracking problems, our concrete implementation is tailored to the problem of centrosome tracking. Second, we make extensive use of prior knowledge but rely only on properties that are also valid for mutants. Third, while our detection algorithm produces a number of false positive object calls, false negative calls are extremely rare and occur only at the very beginning and end of the centrosome cycle.

So far, we have successfully processed over 500 centrosome movies with minimal manual corrections. The most challenging movies were the ones starting at the end of the 4-cell stage / beginning of the 8-cell stage and ending somewhere during the 16-cell stage. In these movies, it was difficult to figure out the identities of the initial centrosomes (the ones with their parent centrosomes not in the movie) and the parent/child relations between the 8- and the 16-cell stage centrosomes.

In our ongoing work we have quantified the dynamics of centrosome maturation over development from the 1- to the 8-cell stage in wild type and mutant embryos and identified several genes that affect this process. We are currently investigating whether one of these genes has a key role in controlling centrosome size. Moreover, in an initial study, our system was also able to track centrosomes in human tissue cultures cells and, with some modifications, nuclei in early *C.elegans* embryos.

We greatly acknowledge the initial work on centrosome tracking in 1-cell stage embryos by Jacques Pecreaux. We also would like to thank Horatiu Fantana for injecting beads into embryos as well as Fuhui Long, Nathan Clack and Hanchuan Peng for helpful discussions on the tracking algorithm.

*Funding*: This work was supported by the Alexander-von-Humboldt Foundation under Max Planck Research Award #018878.

*Conflict of Interest*: none declared.

^{1}An evaluation of the method on our data was not possible because the code is not published and when requested could not be made available for intellectual property reasons.

- Alberts B, et al. Molecular Biology of the Cell. 4th. NY: Garland Publishing; 1994. pp. 1030–1032.
- Bao Z, et al. Automated cell lineage tracing in caenorhabditis elegans. Proc. Natl Acad. Sci. 2006;103:2707–2712. [PubMed]
- Bellanger JM, Gonczy P. TAC-1 and ZYG-9 form a complex that promotes microtubule assembly in C.elegans embryos. Curr. Biol. 2003;13:1488–1498. [PubMed]
- Coleman TF, Li Y. An interior trust region approach for nonlinear minimization subject to bounds. SIAM J. Optim. 1996;6:418–445.
- Fletcher L, Muschel R. The centrosome and the DNA damage induced checkpoint. Cancer Lett. 2006;243:1–8. [PubMed]
- Goldstein B, Hird SN. Specification of the anteroposterior axis in Caenorhabditis elegans. Development. 1996;122:1467–1474. [PubMed]
- Gorjanacz M, et al. What can Caenorhabditis elegans tell us about the nuclear envelope? FEBS Lett. 2007;581(15):2794–2801. [PubMed]
- Greenan G, et al. Centrosome size sets mitotic spindle length in Caenorhabditis elegans embryos. Curr. Biol. 2010;20:353–358. [PubMed]
- Hannak E, et al. Aurora-A kinase is required for centrosome maturation in Caenorhabditis elegans. J. Cell Biol. 2001;155:1109–1116. [PMC free article] [PubMed]
- Haralick RM, Shapiro LG. Computer and Robot Vision. II. Boston, MA, USA: Addison-Wesley; 1992. pp. 316–317.
- Kalaidzidis Y. Intracellular objects tracking. Eur. J. Cell. Biol. 2007;86:569–578. [PubMed]
- Kerekes RA, et al. Automated 3-D tracking of centrosomes in sequences of confocal image stacks. Conf Proc IEEE Eng Med Biol Soc., EMBC 2009. Annual International Conference of the IEEE. 2009:6994–6997. [PubMed]
- Kirkham M, et al. SAS-4 is a C.elegans centriolar protein that controls centrosome size. Cell. 2003;112:575–587. [PubMed]
- Meyer F. Topographic distance and watershed lines. Mathematical morphology and its applications to signal processing. 1994;38:113–125.
- Munkres J. Algorithms for the Assignment and Transportation Problems. Journal of the Society of Industrial and Applied Mathematics. 1957;5:32–38.
- Otsu N. A threshold selection method from grey level histograms. IEEE Trans. Syst. Man Cyber. 1979;9:62–66.
- Sage D, et al. Automatic tracking of individual fluorescence particles: application to the study of chromosome dynamics. IEEE Trans Image Process. 2005;14:1372–1383. [PubMed]
- Song MH, et al. The conserved protein SZY-20 opposes the Plk4-Related kinase ZYG-1 to limit centrosome size. Dev. Cell. 2008;15:901–912. [PMC free article] [PubMed]
- Schnabel R, Priess JR. Specification of the early embryonic cell fates in C.elegans. In: Riddle D, Blumenthal T, Meyer BJ, Priess J, editors. The Nematode C.elegans. Vol. 2. New York: Cold Spring Harbor Laboratory Press, Cold Spring Harbor; 1997.
- Sulston JE, et al. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev. Biol. 1983;100:64–119. [PubMed]
- Srayko M, et al. Caenorhabditis elegans TAC-1 and ZYG-9 form a complex that is essential for long astral and spindle microtubules. Curr. Biol. 2003;13:1506–1511. [PubMed]
- Thomann DM. PhD Thesis. Zurich, Switzerland: Swiss Federal Institute of Technology Zurich; 2003. Algorithms for detection and tracking of objects with super-resolution in 3D Fluorescence Microscopy; p. 12 and 58.

Articles from Bioinformatics are provided here courtesy of **Oxford University Press**

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |