Conceived and designed the experiments: AD UE OA JMS. Performed the experiments: AD UE OA. Analyzed the data: AD UE OA. Contributed reagents/materials/analysis tools: AD UE OA. Wrote the paper: AD UE OA JMS.
Our understanding of dynamic cellular processes has been greatly enhanced by rapid advances in quantitative fluorescence microscopy. Imaging single cells has emphasized the prevalence of phenomena that can be difficult to infer from population measurements, such as all-or-none cellular decisions, cell-to-cell variability, and oscillations. Examination of these phenomena requires segmenting and tracking individual cells over long periods of time. However, accurate segmentation and tracking of cells is difficult and is often the rate-limiting step in an experimental pipeline. Here, we present an algorithm that accomplishes fully automated segmentation and tracking of budding yeast cells within growing colonies. The algorithm incorporates prior information of yeast-specific traits, such as immobility and growth rate, to segment an image using a set of threshold values rather than one specific optimized threshold. Results from the entire set of thresholds are then used to perform a robust final segmentation.
The analysis of behavior in individual cells is essential to understand cellular processes subject to large cell-to-cell variations. Bulk measurements and cell synchronization methods are insufficient to study such processes because a lack of synchrony masks oscillations, all-or-none effects, sharp transitions, and other dynamic processes operating within individual cells. The vast majority of all single cell studies ultimately relies on the ability to accurately segment and track cells. We here refer to segmentation as the process of separating regions of interest (cells) from background (non-cells) in an image. Moreover, high quality data for studying dynamic processes can only be obtained if segmentation is coupled with the ability to track cells, i.e., to correctly identify the same cell over consecutive time points in an experiment.
Segmentation of individual cells relies on the ability to detect cell boundaries and classify all pixels in a given image as ‘cell’ or ‘non-cell’ pixels. This differentiation is accomplished by specifying a threshold or a threshold function. There are several ways in which this threshold can be determined, ranging from simpler intensity based thresholds to usage of more complex functions such as graphical models, pattern recognition, deformable templates, cell contours, or the watershed algorithm. Despite these efforts, we still lack a unified approach that robustly detects all cells for all time points.
This is a major challenge in image analysis as the intensity of areas to be designated as ‘cell’ can vary significantly through time and even within the same cell due to varying cellular morphologies, imaging artifacts, and organelles. Moreover, most cells do not have an easily defined and constant geometrical form; therefore, fitting predefined objects on cells does not work in most cases. Besides, segmentation has to be very accurate for successful analyses of time-lapse microscopy, since the overall time-series segmentation success rate decreases geometrically with the length of the time series. For example, the probability of successfully tracking a cell for 100 time-points given a 99% segmentation success rate is only ~37% (0.99^100 ≈ 0.37, assuming an independent and identically distributed segmentation probability).
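The geometric decay described above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `track_success_probability` is hypothetical and the calculation assumes, as in the text, that per-frame segmentation successes are independent and identically distributed.

```python
def track_success_probability(per_frame_success: float, n_frames: int) -> float:
    """Probability that a cell is segmented correctly in every one of n_frames.

    Assumes independent, identically distributed per-frame success,
    so the per-frame probabilities simply multiply.
    """
    if not 0.0 <= per_frame_success <= 1.0:
        raise ValueError("per_frame_success must lie in [0, 1]")
    if n_frames < 0:
        raise ValueError("n_frames must be non-negative")
    return per_frame_success ** n_frames


# The example from the text: 99% per-frame success over 100 time points.
print(f"{track_success_probability(0.99, 100):.3f}")  # 0.366
```

Under the same assumption, even a 99.9% per-frame success rate leaves roughly a one-in-ten chance of at least one error over 100 frames, which is why small improvements in per-frame accuracy matter for long movies.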
To address these image processing problems for the widely-used model organism Saccharomyces cerevisiae (budding yeast), we developed a novel segmentation and tracking algorithm. Budding yeast is ideal for single cell time-lapse imaging studies because it combines considerable variation in key cell characteristics (protein levels and expression, cell size, shape, and age), with a short generation time and immobility. So far, considerable progress has been made towards solving the yeast segmentation problem by refining algorithms for segmentation as well as for tracking. Additional algorithms exist to characterize morphology and protein localization. However, we still lack a robust approach for the segmentation and tracking of budding yeast that is easy to implement and computationally efficient.
More specifically, our algorithm is based on the idea that summing multiple repeated segmentations of the same phase contrast image using sequentially varying thresholds is more robust than any algorithm based on a single, potentially optimized threshold. Such a strategy generates an unsupervised and accurate final segmentation. We show that this method segments and tracks cells with different morphologies as well as cells within dense colonies with very high accuracy. We also present an example of how this algorithm can be used to determine specific cell cycle phases and dynamics. Our algorithm is fully automated following an initial manual seeding of the cells to be tracked. Moreover, the algorithm is easy to implement, and we have constructed a graphical user interface (GUI) to facilitate its application (see Supporting Information S1).
We here present the main outline of the algorithm. For a detailed step-by-step description of the algorithm see materials and methods section ‘algorithm outline’ and Figures 1, 2, 3, and 4.
Before segmentation, images are typically processed in one or more steps such as filtering and rescaling, and then processed by a threshold function that differentiates ‘cell’ regions from ‘non-cell’ regions. The procedure of generating such a threshold function has proven a major challenge, as a specific threshold that might work for a given cell at some points in time and space might not necessarily work at other times and/or for other cells. In fact, it is not even certain that we can find a good threshold for any given cell since the intensity of boundaries and intercellular regions might vary significantly. Moreover, depending on the complexity of the threshold and pre-processing methods, the segmentation may take excessive processor time, be specific for each imaging pipeline, and require manual input.
To overcome these difficulties, we developed an algorithm that uses all possible thresholds to segment an image. Next, the algorithm uses a ‘plurality vote’ or sum of all segmentations to achieve a robust highly accurate final segmentation.
The algorithm is divided into three parts: First, cells are selected (seeded) semi-manually from the last time frame. Next, the seeds are segmented and tracked backwards in time, and, finally, the data obtained from the experiment are extracted and analyzed.
Segmenting backwards in time provides the following advantages: (i) As all cells are selected in the last frame, no subroutines are needed to identify newborn cells (buds). (ii) As all cells are present at the last time point, backward segmentation allows for the selection of only the cells of interest, avoiding uninteresting (e.g., dead or newborn) cells. (iii) Since we begin segmenting cells at their presumed maximal size, we can set a bound on the upper cell size through the movie to prevent ‘out of control growth’ mis-segmentations. We note that our algorithm relies on budding yeast not moving significantly between frames and requires that cell growth is slow relative to the sampling frequency, so that a cell changes only slightly from one frame to the next.
Manual selection of all cells of interest in the last time point of interest (seeding) is required to initiate the algorithm. Any frame can be chosen as the last time point of interest. Note that we will use the ‘first time point’ to denote the first point we segment even though this is often the last time point in the experiment (t=tmax). Similarly, the term ‘last time point’ refers to the last time point we segment, which most often is the first time point in the experiment (t=0).
To select cells of interest, we segment the first time point semi-manually using a watershed algorithm with an adjustable threshold. Manual correction ensures accurate seeding and is facilitated by a graphical user interface (Figure 1a,b). The outcome of the seeding step is a seed image with all selected cells numbered from left to right (Figure 1c,d,e).
Once the seed has been selected, cells are segmented one by one and tracked backwards through time. To segment images, we apply a watershed algorithm to all possible thresholds in the image (one for each possible level of brightness, here ranging from 0 to 255 as we are using 8-bit images), and sum up the results (Figure 3a–d). The result is a composite image (Figure 3e) in which each pixel has a value between 0 (segmented for no threshold) and 255 (segmented for all thresholds) (cf. Figures 3b and 4c; see also steps 1–10 in the algorithm description in the materials and methods section). This approach resembles statistical bagging, where data is resampled and weighted to achieve a more robust model for estimation. The difference is that the composite image/dataset is here generated systematically using all possible thresholds instead of by bootstrapping.
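The idea of summing segmentations over every threshold can be illustrated with a minimal sketch. It is not the implementation used here: the watershed step is replaced by a simple seeded flood fill (pixels darker than the threshold that are 4-connected to the seed), and cell interiors are assumed to be darker than their surroundings. The names `seeded_region` and `composite_score` are hypothetical.

```python
from collections import deque


def seeded_region(image, seed, threshold):
    """Pixels darker than `threshold` that are 4-connected to `seed`.

    Stand-in for the per-threshold watershed segmentation (assumption).
    """
    rows, cols = len(image), len(image[0])
    if image[seed[0]][seed[1]] >= threshold:
        return set()  # the seed itself is not 'cell' at this threshold
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and image[nr][nc] < threshold):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


def composite_score(image, seed, n_levels=256):
    """Count, for each pixel, how many thresholds label it as 'cell'."""
    rows, cols = len(image), len(image[0])
    score = [[0] * cols for _ in range(rows)]
    for threshold in range(n_levels):  # one segmentation per grey level
        for r, c in seeded_region(image, seed, threshold):
            score[r][c] += 1
    return score


# Toy 8-bit image: a dark cell on the left, a bright boundary column (250),
# and a second dark object on the right that is not the seeded cell.
image = [
    [200, 200, 200, 200, 200],
    [200,  20,  30, 250,  20],
    [200,  25,  20, 250,  20],
    [200, 200, 200, 200, 200],
]
score = composite_score(image, seed=(1, 1))
for row in score:
    print(row)
```

In this toy image the seeded cell receives well over 200 votes per pixel, whereas the equally dark object on the right receives only 55, because it joins the seeded region only at thresholds high enough to include the surrounding background. No single threshold needs to be chosen to see this separation.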
After the final composite image has been generated, it is processed in two steps before segmentation. First, we calculate the Euclidean distance of a pixel to the nearest pixel in the previously segmented image and subtract it from the composite score image to penalize cellular movements (Figure 4a,b). Next, the original phase image is subtracted from the composite score image (Figure 4c). This prevents boundary regions (cf. bright white in Figure 3a) from being scored as ‘cell’ (see also steps 11–13 in the algorithm description in the materials and methods section).
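The two penalties can be sketched as follows. This is a hedged illustration only: the function name `penalize` and the two weights are hypothetical (the text describes a plain subtraction, which corresponds to weights of 1), and the distance is computed by brute force, which is adequate for a small per-cell window but not for full frames.

```python
import math


def penalize(score, previous_mask, phase, distance_weight=1.0, phase_weight=1.0):
    """Subtract a movement penalty and the phase image from a composite score.

    score, phase  : lists of rows of numbers, same shape
    previous_mask : lists of rows of booleans, True where the cell was
                    segmented at the previously processed time point
    The weights are illustrative assumptions, not parameters from the text.
    """
    prev_pixels = [(r, c)
                   for r, row in enumerate(previous_mask)
                   for c, is_cell in enumerate(row) if is_cell]
    if not prev_pixels:
        raise ValueError("previous_mask contains no 'cell' pixels")

    penalized = []
    for r, row in enumerate(score):
        out_row = []
        for c, value in enumerate(row):
            # Euclidean distance to the nearest previously segmented pixel
            distance = min(math.hypot(r - pr, c - pc) for pr, pc in prev_pixels)
            out_row.append(value
                           - distance_weight * distance
                           - phase_weight * phase[r][c])
        penalized.append(out_row)
    return penalized


example = penalize(
    score=[[100, 100, 100]],
    previous_mask=[[True, False, False]],
    phase=[[10, 20, 30]],
)
print(example)  # [[90.0, 79.0, 68.0]]
```

Pixels far from the previous outline and pixels that are bright in the phase image both lose score, so an unchanged composite value is less likely to be accepted as ‘cell’ the further it lies from where the cell was last seen.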
Next, the final composite image is segmented using two thresholds: one permissive for regions classified as ‘cell’ in the previous time-point, and one restrictive for regions that were not classified as ‘cell’ (steps 14–20 in algorithm description; Figure 4d,e). This approach has two main benefits: First, it balances the need to segment the entire cell with the natural shrinkage of the cell (as the segmentation runs backwards in time growing cells tend to get smaller); Second, it automatically tracks the cell. Note that these two thresholds are typically the only ones changed from experiment to experiment and they are easily adjusted upon inspection and do not require complicated optimization procedures. Once a cell has been segmented, the segmentation is stored and used as a seed for the next time point (steps 21–22 in the algorithm).
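A minimal sketch of the two-threshold rule follows. The function name and the numerical values are hypothetical and chosen only to illustrate the logic: pixels that belonged to the cell at the previously segmented time point need only pass the permissive threshold, whereas all other pixels must pass the restrictive one.

```python
def two_threshold_segment(final_score, previous_mask, permissive, restrictive):
    """Classify pixels as 'cell' using a mask-dependent threshold.

    final_score   : lists of rows of numbers (penalized composite image)
    previous_mask : lists of rows of booleans from the previous time point
    permissive    : score needed by pixels that were 'cell' before
    restrictive   : score needed by pixels that were not 'cell' before
    """
    if permissive > restrictive:
        raise ValueError("permissive threshold must not exceed restrictive one")
    return [
        [value >= (permissive if was_cell else restrictive)
         for value, was_cell in zip(score_row, mask_row)]
        for score_row, mask_row in zip(final_score, previous_mask)
    ]


new_mask = two_threshold_segment(
    final_score=[[50, 50, 90, 10]],
    previous_mask=[[True, False, False, True]],
    permissive=40,
    restrictive=80,
)
print(new_mask)  # [[True, False, True, False]]
```

The same score of 50 is accepted where the cell was previously present and rejected where it was not, which is how the rule ties the new segmentation to the previous one and thereby tracks the cell. The returned mask would then serve as `previous_mask` for the next time point.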
To test the performance of our algorithm, we used a commercially available microfluidics system to grow budding yeast for 300 minutes in synthetic complete media containing 2% glucose. Images were taken every three minutes. Next, phase images were exported and subsequently segmented and tracked using our algorithm. To avoid selection bias, we selected all available cells present in the last frame of the movie (first time point for the algorithm; t=300 min). In total, 263 cells were selected in three fields of view, and 11727 segmentations were performed. On average, cells were present in the movie for 134 minutes or ~45 frames with a standard deviation of 93 minutes (many cells are born near the end of the movie; Figure 5a). In total, segmentation took approximately 300 minutes, corresponding to approximately 40 segmentations per minute, using a computer with an Intel(R) Core(TM) Quad CPU at 2.83 GHz and 4.00 GB RAM running a 32-bit Windows Vista operating system. As the experiment itself took 300 minutes to perform, our algorithm can segment data in real time on a standard desktop computer.
To determine the quality of the segmentation, all cells were inspected manually and scored as being either properly segmented, having a minor error or having a major error (Figure 5). We defined a cell as ‘properly segmented’ if over 95% of its area was segmented correctly. We scored cells as having a ‘minor error’ if 90–95% of the area was correctly segmented and as having a ‘major error’ if less than 90% of the cell area was correctly segmented. The vast majority of all segmentation errors were classed as minor. Notably, we observed that major errors only occurred as a result of unexpected cell movements (Figure 5b,c; Table 1; Movies S1, S2, S3). In total, 83% of all cells were segmented without any errors throughout the movie, and our error rate (major or minor) was less than 1 in 140 (Table 1).
To test the effectiveness of the segmentation algorithm when faced with different cell shapes, we took advantage of the fact that yeast cells exposed to low concentrations of mating pheromone (α-factor) exhibit a pseudohyphal-like morphology characterized by elongated, polarized growth. After 90 minutes, growing cells were exposed to a brief pulse (30 minutes) of high (240 nM) α-factor concentration, followed by a long period (450 minutes) of low (3 nM) α-factor concentration. We then applied our algorithm to segment and track the cells using the same metric as above (Figure 5d; Movie S4). As expected, the more irregular shape of the pheromone-arrested cells lowered the performance of the algorithm (see Table 1). However, the performance remained acceptable as almost 90% of all segmentation events were classified as without error. In conclusion, this shows that our algorithm performs well on cells with several different morphologies.
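The scoring bins can be written down explicitly. Scoring in this study was done by manual inspection; the sketch below only encodes the three categories as defined above, with a hypothetical function name and the fraction of correctly segmented area as its input.

```python
def classify_segmentation(fraction_correct: float) -> str:
    """Map the fraction of correctly segmented cell area to a score category.

    Bins follow the definitions in the text: over 95% is properly segmented,
    90-95% is a minor error, and below 90% is a major error.
    """
    if not 0.0 <= fraction_correct <= 1.0:
        raise ValueError("fraction_correct must lie in [0, 1]")
    if fraction_correct > 0.95:
        return "properly segmented"
    if fraction_correct >= 0.90:
        return "minor error"
    return "major error"


for fraction in (0.98, 0.93, 0.80):
    print(fraction, classify_segmentation(fraction))
```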
The purpose of segmentation and tracking is to extract information about some particular property through time (cf. step 21 in the algorithm section). Whereas the segmented phase image allows for the extraction of morphological features such as cell area, minor/major axis, circumference and center point (Figure 6a–c), fluorescent markers can be used to determine dynamics of proteins of interest. To demonstrate the applicability of our algorithm, we investigated cells expressing a C-terminal GFP fusion of the transcriptional inhibitor Whi5 from the endogenous locus. Whi5 is exported from the nucleus during the cell cycle phase G1 (prior to DNA replication) and imported again at the end of the cell cycle, making it an excellent reporter of cell cycle dynamics (Figure 6d,e). Cells were grown, tracked, and segmented, and a 2D Gaussian function was fit around the peak-GFP signal to determine the dynamics of Whi5-GFP. For a practical example of how this method can be used to characterize cell cycle transitions, see .
Even with the use of a hardware-based automatic focus tool such as Definite Focus (Zeiss, Germany), the cell nucleus may still be out of the plane of focus due to its movement (Figure 6f). To correct for variations in a nuclear (e.g., Whi5-GFP) signal that are caused by movement of the nucleus relative to the plane of focus, it is possible to use a more constant-brightness nuclear marker as a standard for comparison. Here, we used a histone (Htb2) fused with a red fluorescent protein (mCherry) expressed from the endogenous locus to identify time periods with a consistent decrease in the mean nuclear intensity. Since the mean nuclear intensity, as measured by the Htb2-mCherry signal per nuclear area, is supposed to only increase during S-phase, we assume that, after smoothing, any decrease in the mean nuclear intensity, especially if accompanied by a decrease in Whi5-GFP, indicates an out-of-focus nucleus in G1. In these sections, we correct the Whi5 signal by increasing it in direct proportion to the fractional decrease in the mean nuclear intensity of the Htb2-mCherry signal in the red channel (Figure 6g).
To verify this approach in general, and the use of directly proportional correction in particular, we applied this algorithm to analyze pheromone-arrested cells. Arrested cells are ideal for testing because we expect them to maintain a high level of nuclear Whi5 (Doncic et al., 2011), so that variation in the Whi5 signal is likely due to nuclear movement. Indeed, when we optimized the coefficient of proportionality to minimize variation in Whi5 levels for individual cells, we found that direct proportionality (a coefficient of 1) was close to, and not statistically different from, optimal. We therefore conclude that direct proportionality can be used to correct nuclear signals for distortion due to nuclear movement relative to the plane of focus.
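One possible reading of this correction is sketched below. It is an interpretation, not the implementation used here: the function name, the `reference` argument (the in-focus Htb2-mCherry mean nuclear intensity, which the caller must supply), and the formula corrected = Whi5 × (1 + coefficient × fractional decrease) are all assumptions. A coefficient of 1 corresponds to direct proportionality.

```python
def correct_for_defocus(whi5, htb2, reference, coefficient=1.0):
    """Scale a nuclear Whi5-GFP trace up where the Htb2-mCherry marker dims.

    whi5, htb2  : sequences of (smoothed) per-frame nuclear intensities
    reference   : assumed in-focus Htb2-mCherry mean nuclear intensity
    coefficient : coefficient of proportionality (1.0 = direct proportionality)
    """
    if reference <= 0:
        raise ValueError("reference intensity must be positive")
    if len(whi5) != len(htb2):
        raise ValueError("traces must have the same length")

    corrected = []
    for whi5_value, htb2_value in zip(whi5, htb2):
        # Only a decrease relative to the reference is treated as defocus.
        fractional_decrease = max(0.0, (reference - htb2_value) / reference)
        corrected.append(whi5_value * (1.0 + coefficient * fractional_decrease))
    return corrected


# Frame 2: the marker drops by 20%, so the Whi5 signal is raised by 20%.
trace = correct_for_defocus(whi5=[100, 80, 100], htb2=[50, 40, 55], reference=50)
print([round(value, 6) for value in trace])  # [100.0, 96.0, 100.0]
```

Frames in which the marker is at or above the reference are left unchanged, which reflects the assumption that only decreases in mean nuclear intensity indicate an out-of-focus nucleus.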
Recent advances in the automation of time-lapse microscopy and microfluidics greatly facilitate the generation of high quality single-cell data. However, analysis of such data relies on accurate cell segmentation and tracking over many frames. We here present an accurate and easy-to-implement algorithm for automatically segmenting budding yeast phase images over long time-courses. The main principle underlying our algorithm is the application and addition of all possible segmentation thresholds, which results in a very accurate and robust segmentation.
The fundamental challenge when segmenting any object is to distinguish between the object of interest and the background. Whereas most segmentation algorithms find an optimal threshold given some separation method (e.g., filtering, edge detection, watershedding), we here use all thresholds with one algorithm: watershedding. This makes the segmentation very accurate, because we capture much more information in the ensemble than could be captured in any one segmentation event. Moreover, although the majority of cells can be segmented with a single optimized threshold, the optimized threshold is rarely able to segment all of the cells, all of the time. Thus, ‘optimal threshold’ methods will necessarily be more error prone as more information is lost when the single threshold is applied.
Our algorithm has several further benefits: It does not require that cells take any particular geometric shape and can therefore segment hyperpolarized and shmooing cells as well as round and oblong ones. The algorithm allows for manual selection of cells in the last time frame, so no time is spent on out of focus cells, buds, dead cells or other uninteresting objects. Our algorithm can segment arbitrarily large colonies, as it segments and tracks cells one-by-one backwards through time.
As a proof of concept, we have also shown how our algorithm can be used as a basis for extracting quantitative dynamic traces of a fluorescent protein translocated from the nucleus in a cell cycle dependent manner. Accuracy is increased when this approach is combined with a subroutine that corrects for nuclear drift. High quality quantification of fluorescent proteins in single cells can later be used to elucidate more complex cellular behavior such as transcriptional dynamics, size control and cell fate decisions. Notably, our algorithm is modular, so that segmentation and tracking (phase) is done by separate subroutines from fluorescence measurement. It is therefore easy to add custom subroutines to detect and quantify specific fluorescent proteins of interest.
The algorithm described here is not limited to yeast phase images because preliminary tests suggest that it can also be applied to bright-field yeast time series and slow moving mammalian cells with distinct boundaries (AD, unpublished data). However, such applications will require extensive modification of the implementation described here, and are beyond the scope of this paper.
Many cellular processes are governed by all-or-none effects, memory, and oscillations. Because these phenomena are masked in bulk assays, their study requires high quality single-cell data. As time-lapse microscopy can be performed by off-the-shelf instrumentation, the bottleneck in the generation of high quality single cell data has become image processing. Here, we applied the simple principle of democratically summing up segmentations for all possible thresholds to develop an efficient algorithm for segmenting and tracking yeast phase images. The simplicity of the underlying idea suggests that this approach could be used to improve single-cell studies in a variety of contexts.
To facilitate the implementation of our algorithm for segmenting and tracking cells, we here present a point-by-point description accompanied by a graphical representation (algorithm flowchart; Figure 2, also note that variable names are written in italics). For users not familiar with programming, we have also provided a user-friendly GUI that implements our algorithm (see Supporting Information S1 for more details).
Segmentation requires that the objects to be segmented remain visible throughout the duration of the experiment. In the case of yeast, this requires that growing colonies be restricted to two dimensions. We used a commercially available flowcell from Cellasic to grow the cells (Table 2) at a temperature of 30°C while being exposed to constantly flowing SCD (synthetic complete media with 2% glucose) at a flow pressure of 5 psi (~34 kPa).
Images were taken every three minutes with a Zeiss Axio Observer Z1 microscope using an automated stage and a plan-apo 63X/1.4NA oil immersion objective. Zeiss Definite Focus hardware was used for automatic focusing. The WHI5-GFP strain was exposed for 100 ms using a Colibri LED 470 module and the HTB2-mCherry strain was exposed using the Colibri 540-80 LED module, both at 25% power.
See Supporting Information S1, Figures S1, S2, S3 and tables S1, S2 for instructions regarding the GUI.
See main text for details.
We thank Jonathan Turner for carefully reading through the manuscript.
Funding provided by National Institutes of Health grant GM092925 and Burroughs Wellcome Fund CASI. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.