Phenotypic assays of intact organisms allow the study of biological pathways and diseases that cannot be reduced to biochemical or cell-based assays. C. elegans is a useful model system for studying biological processes shared with humans [1], high-throughput instrumentation and reagent libraries exist for sample preparation and imaging [2], and deviations from wild-type are often readily apparent because worms are visually transparent and follow a stereotypic developmental pattern [3]. Large-scale chemical and RNAi screens using C. elegans are widespread [4–6] and can probe complex processes such as metabolism, infection, and behavior, but so far the analysis of such experiments has largely been manual, subjective, and onerous.
Much progress has been made in automating the analysis of particular types of C. elegans experiments, such as those involving low-throughput, high-resolution, 3-D, or time-lapse images, or images of embryos [7–11]. However, there is still a strong need to automate the analysis of high-throughput, static images of adult worms in liquid culture, a common screening output. For most assays, the density of worms per microplate well causes the worms to touch or cluster, so that automated analysis has been limited to population-averaged measurements [12,13], hiding population heterogeneity and prohibiting measurements on individual animals.
An alternative to microscopy is flow systems adapted for worms (e.g., COPAS, Union Biometrica), which measure length, optical density, and fluorescence emission at transverse slices along the length of individual worms. However, image-based screens have several benefits: they allow detection of more complex phenotypes by two-dimensional analysis of shape and signal patterns, and they do not require re-suspension of worms in additional liquid prior to analysis, allowing smaller sample volumes and closed culture conditions, an important factor when screening large libraries of small molecules and RNAi clones and when using pathogenic microbes. Also, image-based screening allows for visual confirmation of results, the images form a permanent record that can be re-screened for additional phenotypes, and low-throughput experiments require no more equipment than a microscope and a digital camera.
To improve C. elegans phenotype scoring from images of adult worms in liquid, we developed an image-analysis toolbox that can detect individual worms regardless of crossing or clustering. It can measure hundreds of phenotypes related to shape, biomarker intensity, and staining pattern in relation to the anatomy of the animals.
A typical workflow starts with bright-field images. We pre-process to compensate for illumination variations, detect well edges, and make the image binary. The next step, and the major challenge, is “untangling,” i.e., detecting individual worms among clustered worms and debris. To address this, we first construct a model of the variability in worm size and shape from a representative set of training worms. The model is then used to untangle and identify individual worms. A large number of measurements such as size, shape, intensity, texture, and spot counts can thereafter be made on a per-worm basis using all image channels available, as is common for cell-based assays [14]. Many phenotypes, such as spot area per animal, can be scored directly by such measurements; more complex phenotypes, such as subtle or complex changes in protein expression patterns, can be scored using a combination of measurements and machine learning [15]. If reporter signal location is of interest, we map each worm to a low-resolution atlas, allowing quantification correlated to the worm’s anatomy.
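As a rough illustration of what quantification against a low-resolution atlas can look like, the sketch below averages a head-to-tail intensity trace of one straightened worm into a fixed number of equal-length segments. The function name `longitudinal_profile`, the number of bins, and the example trace are hypothetical and are not taken from the WormToolbox; the actual atlas may be constructed differently.

```python
def longitudinal_profile(intensities, n_bins):
    """Average a head-to-tail intensity trace into n_bins equal-length segments.

    `intensities` is assumed to hold one value per position along the
    midline of a straightened worm (hypothetical input format).
    """
    n = len(intensities)
    if n_bins < 1 or n < n_bins:
        raise ValueError("need at least one sample per bin")
    profile = []
    for b in range(n_bins):
        # Integer arithmetic keeps the segments contiguous and non-empty.
        start = b * n // n_bins
        stop = (b + 1) * n // n_bins
        segment = intensities[start:stop]
        profile.append(sum(segment) / len(segment))
    return profile


# Made-up trace with a strong signal near one end of the worm.
trace = [8, 8, 2, 2, 1, 1]
print(longitudinal_profile(trace, 3))
```

Because every worm is reduced to the same number of segments, profiles from worms of different lengths can be compared position by position.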
We evaluated the untangling performance using images from our prior work [8], where 15 worms were placed in each well of a 384-well plate. Approximately 1,500 worms from 100 wells were manually delineated, revealing that 46% of the worms were clustered or touching other worms (Supplementary Fig. 1). Compared to manual delineation, 51% of the worms were correctly detected with automated foreground-background segmentation followed by connected component labeling. When the untangling algorithms of the WormToolbox were applied, performance increased to 81%, which proved sufficient for the assays presented here. The major source of error was poor image contrast close to well edges; performance improved to 94% when the foreground-background segmentation was manually corrected, decoupling errors caused by untangling from errors in the initial segmentation. We also tested the performance of the untangling in relation to the size of the training set, and found that performance plateaued with a worm model constructed from 50 randomly selected training worms. This means that training can be done on a relatively small number of samples representing the phenotypic variation of a given experiment (Supplementary Fig. 2).
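The baseline of foreground-background segmentation followed by connected component labeling can be sketched as follows. This is a minimal pure-Python illustration, not the CellProfiler implementation: the toy image, the fixed cutoff, and the assumption that worms are darker than the background are all made up for the example. It shows why touching or crossing worms are missed by this baseline: they merge into a single component.

```python
from collections import deque


def threshold(image, cutoff):
    """Binary mask: True where a pixel is darker than `cutoff` (assumed worm)."""
    return [[pixel < cutoff for pixel in row] for row in image]


def label_components(mask):
    """Label 4-connected foreground regions; return (label image, count)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Toy image: one isolated "worm" (row 1) and two crossing "worms" below it.
image = [
    [9, 9, 9, 9, 9, 9, 9],
    [9, 1, 1, 1, 9, 9, 9],
    [9, 9, 9, 9, 9, 9, 9],
    [9, 9, 9, 1, 9, 9, 9],
    [9, 1, 1, 1, 1, 1, 9],
    [9, 9, 9, 1, 9, 9, 9],
    [9, 9, 9, 9, 9, 9, 9],
]
labels, n_components = label_components(threshold(image, cutoff=5))
print(n_components)  # three worms drawn, but only two components found
```

Separating the merged component back into individual worms is the task that the learned worm model addresses.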
We first evaluated the toolbox on data from a different laboratory and imaging system [13]. The challenge was to detect individual adult worms that were partly clustered and mixed with eggs and progeny. We trained the worm model on L4 and adult worms only, and observed that untangling improved the accuracy of finding individual adult worms as compared to thresholding and size-sorting alone (Supplementary Fig. 3). The model efficiently excluded smaller larvae (L1, L2, and L3) and eggs, and performance was relatively robust in the presence of up to 6-fold more progeny than adults (Supplementary Fig. 4). We also evaluated the performance of worm untangling as the number of worms per well increased. Wells contained either L1, L3, or adult worms at increasing concentrations, and we created a separate worm model for each developmental stage. As expected, the performance was higher for the slightly smaller L3 worms, as more space between worms leads to less clustering, but untangling became unstable when the worms were so small (L1) that the image resolution only allowed a few pixels per worm (Supplementary Figs. 5 and 6).
In the second assay, we evaluated the toolbox for scoring viability, which can be read out as a morphological phenotype in bright-field images alone, without the need for a viability stain. Worms in liquid culture tend to be curved and evenly opaque when alive but become rod-shaped and textured when they die. We untangled high-throughput images of worms infected with Enterococcus faecalis and either mock-treated with DMSO or treated with ampicillin [12]. After making shape, intensity, and texture measurements of each untangled worm, we manually selected 150 live and dead training examples from one 384-well plate. We thereafter used the gentle-boosting classifier of CellProfiler Analyst [15] (Supplementary Fig. 7) to identify a combination of measurements that discriminates live and dead worms. Finally, we applied the classifier to 1,500 worms from a different 384-well plate, and verified that it distinguished live and dead worms as well as humans can. To evaluate the performance of the viability scoring on more heterogeneous data from a real high-throughput experiment, we selected 1,766 random images and 200 hits from a 37,200-compound screen [12] and compared the automated scoring with visual scoring based on bright-field images (Supplementary Fig. 8). We achieved an accuracy of 97% and a precision of 83%, indicating that morphology-based viability screening could be a feasible alternative to the viability stain (SYTOX) used in the original screen.
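For readers less familiar with these metrics, the sketch below computes accuracy and precision from confusion-matrix counts. The counts are invented for illustration and are not the data from the screen; which class is treated as "positive" is likewise an assumption of the example.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all images on which automated and visual scoring agree."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Fraction of automatically called positives confirmed by visual scoring."""
    return tp / (tp + fp)


# Hypothetical counts (NOT from the paper): positives are rare.
tp, fp, tn, fn = 50, 10, 930, 10
print(round(accuracy(tp, fp, tn, fn), 3))
print(round(precision(tp, fp), 3))
```

When positives are rare, as is typical for screening hits, accuracy is dominated by the many true negatives, so precision can be noticeably lower than accuracy even for a well-performing classifier.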
In the third assay, we evaluated how well the toolbox could differentiate between a positive and a negative control from an RNAi screen for regulators of fat accumulation [16]. The positive control down-regulated daf-2, and the negative control was an empty vector. We compared two different approaches for pattern quantification. Per-well measurements (using the basic functionality of CellProfiler), where no effort was made to assign fatty regions to individual worms, yielded a false discovery rate (FDR) of 22.2% (Supplementary Fig. 9), whereas per-worm measurements (using the untangling functionality of the WormToolbox) yielded an FDR of 4.5% (Supplementary Fig. 10). The per-worm measurements were superior because they captured the heterogeneity of the population, which was lost in the population averages from per-well measurements.
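The loss of heterogeneity in population averages can be illustrated with a small sketch. The per-worm scores and the cutoff below are hypothetical and chosen only to make the point; they do not correspond to the fat-accumulation measurements used in the assay.

```python
from statistics import mean


def per_well_score(worm_scores):
    """Population average: one number per well."""
    return mean(worm_scores)


def fraction_above(worm_scores, cutoff):
    """Per-worm readout: fraction of worms whose score exceeds `cutoff`."""
    return sum(score > cutoff for score in worm_scores) / len(worm_scores)


# Two hypothetical wells with identical averages but different populations.
uniform_well = [0.5, 0.5, 0.5, 0.5]
mixed_well = [0.25, 0.25, 0.75, 0.75]

print(per_well_score(uniform_well), per_well_score(mixed_well))
print(fraction_above(uniform_well, 0.6), fraction_above(mixed_well, 0.6))
```

Both wells have the same average, so a per-well readout cannot tell them apart, whereas the per-worm readout shows that half of the animals in the mixed well exceed the cutoff.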
Finally, we evaluated the toolbox’s ability to detect worms with a change in the location of GFP expression. We used a C. elegans strain where GFP expression in the intestine is under the control of a promoter that responds to Staphylococcus aureus infection [17]. A pharyngeal stain (mCherry) served as an internal control. The assay could not be scored using simple approaches, such as measuring the total intensity of GFP expression per well or per worm, or counting the number of GFP spots (Supplementary Fig. 11). However, using worm straightening and our atlas mapping, we were able to quantitatively detect elevated expression of clec-60::GFP in the anterior intestine and separate positive and negative controls with a Z’-factor of 0.21. Here we focused on the location of signal along the length of the worm, but asymmetric signal distribution across the width of the worm (e.g., fluorescence in the full worm as compared to only eggs, or only gut) could also be discerned, using the outline of the worm as a spatial reference for the atlas.
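The Z’-factor quoted above is the standard screening-quality statistic, Z’ = 1 − 3(σp + σn) / |μp − μn|, computed from the means and standard deviations of the positive and negative controls. The sketch below shows the calculation; the control readouts are invented, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def z_prime(positive, negative):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1 - 3 * (stdev(positive) + stdev(negative)) / separation


# Hypothetical control readouts (NOT data from the paper).
positive_controls = [10, 12, 14]
negative_controls = [1, 2, 3]
print(round(z_prime(positive_controls, negative_controls), 3))
```

A value of 1 corresponds to perfectly separated controls with no variability, and values at or below 0 mean that the three-standard-deviation bands of the two controls overlap.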
The WormToolbox is the first system to automatically, quantitatively, and objectively score a variety of phenotypes in individual C. elegans in static, high-throughput images. The toolbox is implemented as modules for the open-source CellProfiler software [14,18]; it emphasizes ease of use, is compatible with cluster computing to speed analysis, and can be adapted to new assays developed by the scientific community. Training the worm model takes less than an hour, and once an image analysis pipeline is set up for an assay, a typical analysis takes 10–30 s per image, or much less if a computing cluster is available.
The performance of the WormToolbox depends on the contrast between worms and the surrounding background, making it sensitive to large variations in background illumination and to the worm-like tracks sometimes formed when growing worms on agar medium. The WormToolbox can handle images of worms on agar in large plates, but further optimization is needed for worms on solid medium in 384-well plates. In liquid culture, the untangling can handle up to 20 adult worms per well in 384-well format, and is designed to detect worms within the size and shape range of the training worms used to create the worm model. Unexpected phenotypes are likely to be discarded as debris, but wells with a low fraction of correctly detected worms may be flagged for visual examination. In future work we will extend the WormToolbox by adding further worm-specific measurements based on the worms’ unique anatomy and by improving the handling of mixed populations of worms at various stages of development.