The detection and counting of transcripts within single cells using Fluorescent in situ Hybridization (FISH) [1–6] has allowed researchers to ask quantitative questions about gene expression at the level of individual cells. This method is often preferable to quantitative RT-PCR [7–9], as it does not require destruction of the cells being probed and preserves spatial information that may be of interest. Until now, studies using FISH at single-molecule resolution have only been rigorously carried out in isolated cells (e.g. yeast cells or mammalian cell culture). Here we describe the detection and counting of transcripts within single cells of fixed, whole-mount Drosophila embryos using a combination of FISH, immunohistochemistry, and image segmentation. Our method takes advantage of inexpensive, long RNA probes detected with antibodies [10, 11], and we present novel evidence that we can robustly detect single mRNA molecules. We use this method to characterize transcription at the endogenous locus of the Hox gene Sex combs reduced (Scr), by comparing a stably expressing group of cells to a group that only transiently expresses the gene. Our data provide evidence for transcriptional bursting [2, 5, 12–16], as well as for divergent “accumulation” and “maintenance” phases of gene activity at the Scr locus.
In early Drosophila embryos, the limits of Hox expression domains along the anterior-posterior axis are set by parasegmental boundaries. Parasegments are repeating units of cellular organization that make up the body plan of early embryos, and the Hox gene Sex combs reduced (Scr) displays dynamic differences in expression between parasegments 2 and 3 (PS2 and PS3) (Figure 1). Cells in PS2, which give rise to the posterior mouthparts, stably express Scr from early embryogenesis onward. Cells in ventral PS3, which contribute to the first thoracic segment, display a transient burst of Scr transcription during mid-embryogenesis (Figure 1C–E) [17–19]. Thus, this system offers a convenient way to compare Scr transcriptional dynamics between stably and transiently expressing groups of cells in the same embryo, as well as the opportunity to shed light on the expression of a crucial developmental regulator.
At low magnification, fluorescent signals from a probe directed against Scr mRNAs have a "speckled" appearance (Figure 2A). At high magnification, most cytoplasmic signals are resolvable as ellipsoids of roughly uniform size (~250–300 nm diameter in x and y, Figure 2B). While we believed we were visualizing single mRNA molecules [1–6, 20–22], it was possible that the signals instead represented mRNA aggregates (e.g. P-bodies). One method for demonstrating single-transcript FISH resolution is to show that a spatial shift exists between signals from two different probes targeted to adjacent regions of an mRNA [1, 4, 6]; such a shift should not be present if the signals come from aggregates of randomly oriented transcripts. Consistent with this, we observed a randomly oriented spatial shift between signals from probes directed against the coding region and the 3’ UTR of Scr (Supp. Figure 2). Another method to demonstrate single RNA molecule detection is to show that the fluorescence emitted by specific numbers of direct-labeled oligonucleotide probes bound at each locus is reproducible and predictable [1, 4, 22]. While the long RNA probes used in this study offer a large increase in signal-to-noise ratio compared to oligonucleotide probes, they are indirectly labeled, so the fluorescence they emit is more variable (data not shown). Therefore, we developed a different assay to test whether the punctate cytoplasmic signals represented single transcripts. Single transcripts should contain only a single binding site for a unique probe sequence. If two probes against the same sequence are labeled with different hapten tags and simultaneously hybridized to embryos, the two probes should compete for that site, and very low levels of association between their signals should be observed.
We tested for such competition using two unfragmented probes complementary to the same 330 bp region of the Scr 3' UTR, labeled with either digoxigenin or dinitrophenyl haptens (probes S1 and S2, respectively, Figure 2H). This experiment was done as part of a triple hybridization, using a biotin-labeled coding region probe (ORF, Figure 2H) as a marker for the adjacent Scr mRNA protein coding region (Figure 2D – G). Randomly chosen S1 signals were almost always associated with an ORF signal (79%, n=100; Figure 2D, E, I), but rarely with an S2 signal (17%, n=100; Figure 2E, F, G, I). A given S1 signal was only associated with both an S2 and an ORF signal in a minority of cases (10%, n=100, Figure 2I), which is strong evidence that these locations contain only single binding sites for an S probe. Although the S1→S2 association statistics may seem high, rotating the S2 image stack 90 degrees relative to the S1 image stack, and re-scoring the same 100 S1 signals (to simulate random association) yields a nearly identical association level of 20% (Figure 2I), indicating that this association can be explained by the chance overlap of numerous signals in a finite volume. We therefore conclude that the large majority of the cytoplasmic Scr signals we observe correspond to single mRNA molecules. More pairwise association data, as well as antibody detection and probe binding efficiency data, are shown in Supplementary Figures 1 and 2.
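The rotation control described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: spot centroids are taken as (x, y, z) tuples, "association" is defined as a second-channel spot lying within a fixed distance (`max_dist`, a hypothetical criterion that the text does not specify), and the demo coordinates are randomly generated rather than measured.

```python
import math
import random

def fraction_associated(spots_a, spots_b, max_dist):
    """Fraction of spots in A with at least one spot of B within max_dist."""
    if not spots_a:
        return 0.0
    hits = 0
    for a in spots_a:
        if any(math.dist(a, b) <= max_dist for b in spots_b):
            hits += 1
    return hits / len(spots_a)

def rotate_90_xy(spots, width):
    """Rotate (x, y, z) spots by 90 degrees about the z axis of a square field.

    The field spans [0, width] in x and y; z is left unchanged.
    """
    return [(width - y, x, z) for (x, y, z) in spots]

# Hypothetical demo data: two independent sets of random spots (not real data).
random.seed(0)
WIDTH, DEPTH = 50.0, 10.0  # illustrative field size in micrometres

def random_spots(n):
    return [(random.uniform(0, WIDTH), random.uniform(0, WIDTH),
             random.uniform(0, DEPTH)) for _ in range(n)]

s1 = random_spots(100)
s2 = random_spots(2000)
observed = fraction_associated(s1, s2, max_dist=0.3)
control = fraction_associated(s1, rotate_90_xy(s2, WIDTH), max_dist=0.3)
print(f"observed: {observed:.2f}, rotated control: {control:.2f}")
```

If the observed association is close to the rotated control, as reported for S1→S2, the overlap is compatible with chance alone; a true association (such as S1→ORF) should fall well above the control.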
To group and count transcripts from single cells of the embryo, we used a combination of RNA FISH, immunohistochemistry to detect cell boundaries, manual cell segmentation, and automated transcript signal segmentation. Cells of interest were manually segmented, using cell membranes stained with an anti-spectrin antibody as a guide. The cell segmentation process was accelerated using an ImageJ plugin we developed that allows a user to quickly draw unique regions of interest (ROIs) for each cell outline in a sequence of confocal image slices (Figure 3A, B). These ROIs then defined the 3D boundaries used to group the FISH signals from each cell (Figure 3C).
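The grouping step can be illustrated with a minimal Python sketch. It is not the ImageJ plugin itself: it assumes each cell is represented by one hand-drawn polygon per confocal slice (a hypothetical `rois` dictionary) and that each FISH signal has been reduced to a centroid with a slice index. A spot is assigned to the first cell whose outline on that slice contains it.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in drawing order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_spots_per_cell(spots, rois):
    """Count spot centroids inside each cell's per-slice outlines.

    spots: list of (x, y, z_index) centroids.
    rois:  {cell_id: {z_index: polygon}} (hypothetical layout of the ROIs).
    """
    counts = {cell_id: 0 for cell_id in rois}
    for x, y, z in spots:
        for cell_id, slices in rois.items():
            polygon = slices.get(z)
            if polygon is not None and point_in_polygon(x, y, polygon):
                counts[cell_id] += 1
                break  # a spot belongs to at most one cell
    return counts

# Illustrative example: two square "cells", one spanning two slices.
cell_a = [(0, 0), (10, 0), (10, 10), (0, 10)]
cell_b = [(10, 0), (20, 0), (20, 10), (10, 10)]
rois = {"A": {0: cell_a, 1: cell_a}, "B": {0: cell_b}}
spots = [(5, 5, 0), (15, 5, 0), (5, 5, 1), (15, 5, 1), (25, 5, 0)]
print(count_spots_per_cell(spots, rois))
```

Spots that fall outside every outline, or on a slice where a cell has no outline, are simply left uncounted.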
The punctate FISH signals themselves were automatically segmented and counted using the Volocity® 3D image analysis program (Figure 3D). Almost all FISH transcript signal segmentations appeared correct upon visual inspection, and the algorithm yielded transcript counts that were nearly identical (±6%) to those obtained by manual counting (Supplementary Table 1). Given this variation, and the fact that more than one transcript will occasionally occupy the same volume, we believe this counting method yields transcript numbers that are within ±10% of the actual value.
Intense FISH signals representing sites of transcription in the nucleus are often detected with probes to upstream exons or introns of a gene (e.g. Figure 2B [arrowhead]) [25, 26]. The transcriptional activity of a gene can be roughly quantified by measuring the fluorescence intensity of these spots, which will vary according to the number of nascent transcripts associated with the locus [1, 2, 5, 13, 27]. To characterize transcription at the Scr locus we counted cytoplasmic transcripts with an ORF probe (Figure 3E), and nascent transcript intensity with an Scr intron probe (Figure 1A and Figure 3F).
We first examined several stably expressing cells from PS2, as well as several transiently expressing cells in PS3 (Figure 3E,G). To our surprise, cell groups from both PS2 and PS3 displayed a wide range of cellular transcript numbers (72–262 for PS2, and 3–14 for PS3, Figure 3G). In PS3 cells, the low number of cytoplasmic transcripts was consistent with undetectable levels of nuclear transcription in the same nuclei. However, in PS2, there was not a good correspondence between cytoplasmic and nascent transcript signals; an extreme example of this is shown for two nearby cells (Figure 3E [arrows], H, I, J). While both cells contain over a hundred cytoplasmic mRNAs, one cell has two obvious sites of transcription, while the other has none.
To investigate this further, a more comprehensive analysis was carried out on three embryos during stages 10 and 11 of embryogenesis, and the results are shown in Figure 4. Embryos were chosen that were representative of different phases of Scr transcription: before, during, and after the transient period of Scr expression in PS3 (Figure 4A–C). Approximately 20 ventro-lateral ectodermal cells from both PS2 and PS3 were segmented, and all cells were located ~50 µm from the ventral midline (Figure 4A–F). Strongly expressing PS2 cells had an average of 94 mRNAs per cell, with values ranging widely from 33 to 177 mRNAs per cell (n = 58; Figure 4G). Differences in cell size were not responsible for this heterogeneity, as a similar distribution of values was seen after taking cell volume into account (Figure 4H). For each of the three stages examined, the average number of Scr transcripts per cell in PS2 was similar (100, 104, and 80). For PS3 cells, average Scr mRNA numbers were very low during stage 10 (5 transcripts), increased dramatically during early stage 11 (33 transcripts), and decreased during late stage 11 (12 transcripts) (Figure 4G). See Supplementary Table 2 for data and statistical analyses.
To determine whether cells expressing other Hox genes produced similar numbers of transcripts, mRNAs were also counted for Deformed (Dfd) and Ultrabithorax (Ubx) in areas of abundant transcript accumulation during stage 11 of embryogenesis. Values for these two Hox genes were similar to those found for Scr in PS2, with Dfd having an average of 92 mRNAs per cell, and Ubx an average of 74 per cell (Figure 4G, H).
Graphs plotting number of Scr transcripts per cell and nascent transcription strength along the anterior-posterior axis are shown in Figure 4A'–C' (red and green lines, respectively). Surprisingly, for PS2 cells in the stage 10 embryo the graphs were divergent (Figure 4A'). On the other hand, in stage 11 cells transcript numbers and nascent transcription levels rose and fell largely in unison (Figure 4B', C'). Figures 4I and 4J show scatter plots for the cell groups in PS2 and PS3, and non-parametric correlations were calculated for all cell groups. Consistent with the traces, stage 10 PS2 cells showed a significant negative correlation between cytoplasmic transcript numbers and nascent transcription (r = −0.7, p < 0.05), while early and late stage 11 PS2 cell groups both showed weak but significant positive correlations (r = 0.47 and 0.60, p < 0.05) (Figure 4I). On the other hand, PS3 cells had very significant positive correlations between cellular transcript numbers and nascent transcription for both early and late stage 11 cell groups (r = 0.85 and 0.67 respectively, p < 0.001) (Figure 4J; see Supplementary Table 3 for correlation data). It is possible that the same mode of transcription occurring in PS3 during stage 11 may also be occurring in PS2 during the same period, although the positive correlations are not as striking because they are superimposed upon an existing pool of transcripts.
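For readers who want to reproduce this kind of analysis, a rank-based correlation can be computed as sketched below. The text says only that the correlations were non-parametric; Spearman's rank correlation is assumed here as one common choice, and the input lists are placeholders rather than the measured values. The sketch returns the coefficient only and does not compute a p-value.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Placeholder inputs: transcripts per cell and summed nascent intensity.
transcripts = [120, 95, 60, 150, 80]
nascent = [10.0, 30.0, 45.0, 5.0, 35.0]
print(round(spearman(transcripts, nascent), 2))
```

Because only ranks enter the calculation, the coefficient is insensitive to the arbitrary units of the nascent fluorescence measurement.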
Recent data indicate that transcription is often not only stochastic (meaning transcription initiation is probabilistic), but also occurs in bursts, during which a gene will switch back and forth between prolonged active and inactive states [2, 5, 12–16]. Our observations of large variations in transcript numbers on a cell-by-cell basis, as well as the often poor correlation between nascent transcription and cellular transcript numbers, indicate that transcriptional bursting is taking place at the Scr locus in PS2. One way to capture the relative intensity of these bursts is through the use of the Fano factor (FF), which is essentially a measurement of population heterogeneity. In this case it is defined as the variance of the distribution of transcript numbers per cell divided by the mean. Even stochastically transcribing cell populations can have small FF values (<1) if most cells contain similar transcript numbers, but FF values larger than 1 are suggestive of transcriptional bursting. We observed FF values of 7.1, 8.4, and 16 for the three PS2 cell groups, which are intermediate between an observed FF value of ~4 in bacteria and an FF value of >40 for a transgenic reporter gene in mammalian cells. Whether the heterogeneity we observe is due to intrinsic noisiness in Scr transcription, or due to high variations in activator and repressor input (extrinsic noise) [29–32], is as yet unknown.
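As a concrete illustration of the definition above, the Fano factor of a set of per-cell transcript counts can be computed as follows. The counts in the example are made up for illustration, and the use of the sample variance (n − 1 denominator) is an assumption, since the text does not say which estimator was used. For reference, a Poisson distribution has variance equal to its mean and therefore FF = 1.

```python
import statistics

def fano_factor(counts):
    """Variance of transcript counts per cell divided by their mean.

    Uses the sample variance (n - 1 denominator); this is an assumption,
    not a detail given in the text.
    """
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("Fano factor is undefined when the mean is zero")
    return statistics.variance(counts) / mean

# Hypothetical transcript counts for one group of cells (not real data).
example_counts = [33, 60, 75, 90, 94, 110, 130, 177]
print(round(fano_factor(example_counts), 1))
```

Because the metric has units of molecules, it requires absolute transcript counts; a rescaled fluorescence measurement would change the value.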
Our observations also indicate that there may exist divergent “accumulation” and “maintenance” phases of Scr transcription, characterized by stage 11 PS3 cells and stage 10 PS2 cells, respectively. The Scr gene may begin transcribing in a stochastic, but still relatively constant manner, until a threshold number of transcripts is reached, after which it switches to a bursting mode of transcription to maintain the mRNA pool. The mechanisms whereby a cell might directly sense the concentration of a distinct mRNA species are unclear, although regulation of Scr transcription via downstream targets of the SCR protein could explain this phenomenon. The simultaneous RNA/protein detection procedures described in this paper should allow for more detailed studies in which endogenous transcription factor concentrations can be correlated to target gene activity on a cell-by-cell basis.
In summary, we have characterized endogenous transcription of the Scr locus at single molecule resolution in Drosophila embryos, and provided evidence for transcriptional bursting, as well as for two divergent modes of gene expression. To our knowledge this is the first rigorous analysis of transcription using single molecule FISH performed in a developing metazoan. Using FISH or live imaging in these kinds of studies is crucial, as biochemical methods that extract RNA from cell populations do not detect cell-to-cell variations. Carrying out these analyses at single molecule resolution is similarly crucial, as metrics such as the FF are impossible to derive when using arbitrary whole cell fluorescence measurements. Finally, single molecule measurements are much more objective, and should allow for the comparison of results between disparate experiments.
Haptenylated probes were created by in vitro transcription as described earlier. The intronic probe was directly labeled with AlexaFluor 555 dyes, and was prepared by Invitrogen. Simultaneous RNA and protein detection was carried out using a modified standard FISH protocol, with acetone instead of Proteinase K permeabilization. Dechorionated embryos were fixed in 8% formaldehyde for 25 minutes, devitellinized by vigorous shaking in a 1:1 heptane:methanol mixture, washed with ethanol, rocked in a 1:1 ethanol:xylenes mixture for 30 minutes, washed with methanol, and then gradually rehydrated in a series of methanol:H2O washes (3:1, 1:1, 1:3, 0:1). Embryos were permeabilized in cold 80% acetone for 10 minutes at −20°C, and then transferred into Phosphate Buffered Saline plus 0.1% Tween (PBT). Embryos were then post-fixed in 5% formaldehyde in PBT for 25 minutes and washed with PBT. RNA probe hybridization and immunohistochemistry (including antibody combinations) were carried out as described earlier. Spectrin and engrailed antibodies were obtained from the Developmental Studies Hybridoma Bank (antibodies 3A9 and 4D9, concentrate) and used at a 1:100 dilution.
All images were collected with a Leica SP2 laser-scanning confocal microscope. Gain and offset were set to non-saturating levels such that intensity data would span the entire dynamic range, and line averaging was set to 2. Stacks of at least one cell thickness (~15 µm) were collected and channels were shifted relative to one another to correct for Z-axial chromatic aberration (which was measured independently using Tetraspeck fluorescent beads). All images were deconvolved using the AutoDeblur® software program.
A set of ImageJ (http://rsbweb.nih.gov/ij/) plugins was developed to allow us to manually segment confocal stacks (contact W. Beaver, wbeaver@cs.ucsd.edu). Transcript segmentation and counting were carried out using the image analysis program Volocity®. First, transcripts were counted manually for several cells (n=4), and this training set was used to tune the variables of the Volocity® segmentation algorithm so that it predicted transcript numbers that were nearly identical to manual counts. We then used the algorithm to segment and count transcripts for the training set plus 8 more cells that were not part of the training set (Supplementary Table 1). Overall, the algorithm predicted values that were within ±6% of the manually derived values, and was accurate over a wide range of values (20–153) without any obvious bias towards a certain range.
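The tune-then-validate logic can be sketched generically in Python. This does not reproduce the Volocity® algorithm or its parameters: a single intensity threshold stands in for the tuned variables, the error metric (mean absolute percent deviation from manual counts) is an assumption, and all numbers are invented for illustration.

```python
def mean_abs_percent_error(auto_counts, manual_counts):
    """Mean absolute deviation of automated counts from manual counts, in percent."""
    errors = [abs(a - m) / m * 100 for a, m in zip(auto_counts, manual_counts)]
    return sum(errors) / len(errors)

def count_above(spot_intensities, threshold):
    """Stand-in segmentation: count candidate spots at or above a threshold."""
    return sum(1 for v in spot_intensities if v >= threshold)

def tune_threshold(training_cells, manual_counts, candidates):
    """Pick the candidate threshold that best matches the manual counts.

    Returns (best_threshold, error_percent) on the training cells.
    """
    best = None
    for t in candidates:
        auto = [count_above(cell, t) for cell in training_cells]
        err = mean_abs_percent_error(auto, manual_counts)
        if best is None or err < best[1]:
            best = (t, err)
    return best

# Invented example: candidate spot intensities for two training cells.
training_cells = [[5, 50, 60, 70], [10, 55, 65]]
manual_counts = [3, 2]
threshold, training_error = tune_threshold(training_cells, manual_counts, [0, 20, 58])
print(threshold, training_error)

# Validation on a held-out cell that was not used for tuning.
held_out_auto = [count_above([8, 52, 61, 66, 72], threshold)]
print(mean_abs_percent_error(held_out_auto, [4]))
```

Reporting the error on cells outside the training set, as done in Supplementary Table 1, guards against parameters that fit only the cells used for tuning.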
Volocity® was also used to measure nascent transcript intensities as well as cell volumes. Both transcribing alleles are often distinguishable, but we cannot rule out that cells containing solitary signals represent cases of overlapping alleles; we therefore simply summed the intron probe fluorescence from the entire nucleus.
We thank J. Mahaffey for providing the Scr cDNA plasmid, A. Hermann and A. Arvey for help with the manuscript, as well as E. Tour for troubleshooting and advice. This work was supported by NIH Grant HD28315 (to W.M.) and National Institutes of Health Training Grant T32GM007240 (to A.P. and D.L.).