The examples above illustrate how the expression distribution of a particular transcript or fluorescent protein reporter can be used to quantify the transitions between active and inactive transcription states and to determine the mechanism by which regulators modulate this process. In many of these studies, the analysis of regulatory behavior required the application of an external input or a change in environmental conditions. It is not always easy to introduce such perturbations, but what if they already exist in nature? As discussed above, most cellular proteins undergo stochastic fluctuations, which can activate or repress downstream processes and thereby introduce valuable perturbations. As a result, when multiple transcript or protein species are monitored in the same cell, important additional information can be extracted by analyzing how different species correlate with one another. This correlation analysis was used in experiments on synthetic gene networks in E. coli, where expression levels of several genes were monitored in the same cell using fluorescent reporters. By analyzing the pair-wise correlations between the different fluorescent reporters, the major sources of fluctuation could be determined (23). In a recent study, Stewart-Ornstein et al. used fluorescent proteins to examine the pair-wise correlations of hundreds of different yeast genes, whose expression levels varied over three orders of magnitude. Even without exogenous perturbations, single-cell steady-state measurements revealed clear groups of genes whose stochastic fluctuations were strongly coordinated. These collections of genes, which they labeled “noise regulons,” corresponded to functional groups related to stress response, mitochondrial regulation, and amino acid biosynthesis. Furthermore, Stewart-Ornstein et al. showed that steady-state correlations were strongly predictive of the proteins’ dynamic response to heat shock.
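The logic of reading co-regulation from steady-state fluctuations can be sketched with synthetic data. The following is a minimal illustration and not the analysis pipeline of the studies cited above: the two shared upstream factors, the gene names, the noise amplitudes and the 0.5 cutoff are all hypothetical choices made for the example. Each toy reporter combines a fluctuation shared within its group with gene-specific noise, and the pair-wise Pearson correlation of log-levels across cells recovers the groups even though mean expression spans three orders of magnitude.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells = 5000

# Hypothetical shared upstream fluctuations, one per group of co-regulated genes.
shared_a = rng.normal(0.0, 0.4, n_cells)
shared_b = rng.normal(0.0, 0.4, n_cells)


def reporter(shared, mean_level):
    """Toy reporter: mean level times log-normal shared and gene-specific noise."""
    intrinsic = rng.normal(0.0, 0.2, n_cells)
    return mean_level * np.exp(shared + intrinsic)


# Invented gene names; mean levels deliberately span three orders of magnitude.
names = ["a1", "a2", "b1", "b2"]
levels = np.vstack([
    reporter(shared_a, 10.0),
    reporter(shared_a, 1000.0),
    reporter(shared_b, 50.0),
    reporter(shared_b, 10000.0),
])

# Pearson correlation of log-levels across cells (rows = genes, columns = cells).
corr = np.corrcoef(np.log(levels))

# List gene pairs whose fluctuations are strongly coordinated.
threshold = 0.5  # arbitrary cutoff for this toy example
coordinated = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if corr[i, j] > threshold
]

print(np.round(corr, 2))
print(coordinated)
```

Working with log-levels keeps the comparison insensitive to the absolute expression level of each gene; with real data, a clustering method would normally replace the fixed cutoff used here.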
Using a two-color RNA fluorescent in situ hybridization assay, Gandhi et al. measured pair-wise correlations between RNA species regulated by the same promoter or by two different promoters. The Gal4-regulated genes, including GAL1, were induced with 2% galactose, and their distributions were measured at steady state. As expected, single-cell correlation analyses showed strong correlations between GAL1 and the other Gal4-regulated transcripts. mRNA correlations were also found among other regulatory genes. Transcripts of genes expressed in the G2/M stages of the cell cycle, including SWI5, were strongly correlated with each other, but weakly anti-correlated with NDD1, which dominates during S-phase. On the other hand, constitutive genes such as MDN1 (ribosome biogenesis), PRP8 (pre-mRNA splicing) and KAP104 (nucleocytoplasmic transport) exhibited much less coordination.
Although correlations at a single time point can reveal static relationships among different mRNA and protein species, this view lacks information about the system’s history and causal relationships. If two proteins X and Y are correlated, the questions remain: does X activate Y; does Y activate X; or does a third protein W control them both? To illustrate this, the accompanying figure shows simple motifs by which proteins W, X and Y could relate to one another, together with typical scatter plots of the single-cell expression of proteins X and Y for these motifs. When static correlations cannot discriminate between these motifs, dynamic correlations in single-cell fluctuations may help (26). Such analyses make use of the cross-correlation function (26), $R_{XY}(\tau) = \mathrm{cov}\big(X(t+\tau),\, Y(t)\big) / (\sigma_X \sigma_Y)$, which measures how fluctuations in $Y$ at time $t$ relate to those in $X$ at time $t+\tau$. Here $\mathrm{cov}$ denotes the covariance of two variables, and $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$, respectively. The sign of $R_{XY}(\tau)$ reveals positive or negative regulation, and the timing of peaks in $R_{XY}(\tau)$ reveals causality in this regulation. As examples, the figure plots the cross-correlation functions between proteins X and Y for each of the motifs. For the first motif, where X activates Y, the blue line (left) shows that $R_{XY}(\tau)$ has a maximum, and since X is upstream of Y, this peak occurs at a negative delay time. Conversely, when protein Y is a repressor of X, $R_{XY}(\tau)$ has a minimum at a positive $\tau$ (second column, red line). If X and Y were both controlled by W, the maximum or minimum would occur at $\tau = 0$, and its sign would be positive or negative depending upon whether W has the same or different effects on X and Y (right columns).
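The relationship between motif and peak position can be checked numerically. The sketch below is a toy model under stated assumptions, not the simulation behind the figure: the upstream species is an AR(1) noise process, the downstream species is simply a delayed and noisy copy of it (negated for repression), and the 20-step delay and noise amplitudes are hypothetical. The function `cross_correlation` uses the convention in which X is evaluated at time t + τ and Y at time t, so an upstream X gives an extremum at negative τ.

```python
import numpy as np


def cross_correlation(x, y, max_lag):
    """Estimate R_XY(tau) = cov(X(t + tau), Y(t)) / (sigma_X * sigma_Y)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    norm = x.std() * y.std()
    n = len(x)
    lags = np.arange(-max_lag, max_lag + 1)
    r = np.empty(len(lags))
    for i, tau in enumerate(lags):
        if tau >= 0:
            r[i] = np.mean(x[tau:] * y[:n - tau]) / norm   # pairs X(t+tau), Y(t)
        else:
            r[i] = np.mean(x[:n + tau] * y[-tau:]) / norm  # same pairing, tau < 0
    return lags, r


def ar1(n, rho, rng):
    """Correlated noise: s[t] = rho * s[t-1] + white noise."""
    eps = rng.normal(size=n)
    s = np.zeros(n)
    for t in range(1, n):
        s[t] = rho * s[t - 1] + eps[t]
    return s


rng = np.random.default_rng(0)
n, delay = 20000, 20  # hypothetical trace length and regulatory delay (steps)


def noise():
    return rng.normal(scale=0.5, size=n)  # independent measurement noise


src = ar1(n + delay, 0.9, rng)
upstream, lagged = src[delay:], src[:n]  # lagged[t] equals upstream[t - delay]

# Each entry is (X, Y) for one toy motif.
motifs = {
    "X activates Y": (upstream + noise(), lagged + noise()),
    "Y represses X": (-lagged + noise(), upstream + noise()),
    "W drives both": (upstream + noise(), upstream + noise()),
}

extrema = {}
for name, (x, y) in motifs.items():
    lags, r = cross_correlation(x, y, max_lag=60)
    k = int(np.argmax(np.abs(r)))  # strongest positive or negative correlation
    extrema[name] = (int(lags[k]), float(r[k]))
    print(f"{name}: extremum at tau = {lags[k]} steps, R = {r[k]:.2f}")
```

In this toy setting the activation motif gives a positive extremum near τ = −20, the repression motif a negative extremum near τ = +20, and the common-input motif a positive extremum near τ = 0. Real regulatory links act through production and degradation kinetics rather than a pure delay, which broadens the peaks but does not change their sign or the side of zero on which they fall.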
Figure: Different regulatory motifs yield different steady-state correlations.
Dunlop et al. tested this dynamic correlation approach in live cells by inserting three fluorescent protein reporters of different colors into the E. coli genome. Yellow fluorescent protein (YFP) was fused to the λ CI repressor, which controlled expression of red fluorescent protein (RFP). Cyan fluorescent protein (CFP) was placed under a separate constitutive promoter. Using fluorescence time-lapse microscopy, all three colors could be monitored simultaneously over several hours. Dynamics of the YFP-RFP pair were anti-correlated with a delay of about 120 minutes, clearly revealing that CI-YFP repressed RFP (similar to the second column of the figure, blue line). Conversely, the unregulated YFP-CFP pair exhibited a delay-free correlation characteristic of common upstream regulators (extrinsic noise) that affect both YFP and CFP in a similar fashion (similar to the third column of the figure). Thus, the causal relationships among all three reporters were uniquely determined. Extending this approach to the CRP feed-forward loop in E. coli, they analyzed how the relationship between GalS and GalE varies under different fucose levels and under the influence of GalR.
While correlations at either the mRNA or the protein level can reveal gene regulatory relationships, the two do not always perform equally well. To illustrate this, the accompanying figure shows scatter plots and cross-correlations between mRNA X and mRNA Y, which encode protein X and protein Y, respectively. Although protein X and protein Y are coordinated for all four motifs, this is not the case for their mRNA levels. This can be explained by the disparate time scales of mRNA and protein dynamics. Fast-degrading mRNA may exhibit fluctuations with a broad frequency bandwidth. Conversely, slow degradation of proteins filters out fast fluctuations but keeps slow fluctuations. Constitutively expressed mRNA X has both fast and slow fluctuations, but protein X transmits only the slow fluctuations downstream. The result is that the dynamics of mRNA X and mRNA Y are dominated by uncorrelated fast fluctuations, which overshadow their correlated slow fluctuations. On the other hand, protein X and protein Y contain only the better-correlated slow fluctuations. In other words, two mRNA species can be mostly uncorrelated with one another, yet produce protein in a coordinated fashion. Gandhi et al. observed such a circumstance in budding yeast, where they found very little correlation between pairs of transcripts that encode coordinated proteins of the same protein complex, including proteasome and RNA polymerase II subunits. They even found a lack of correlation between two alleles of the same gene. In a related study, Taniguchi et al. looked at over 1000 genes in E. coli and measured both mRNA and protein copy numbers in single cells. They found that for most genes, even the numbers of mRNA and protein molecules of the same gene were uncorrelated. These studies suggest that understanding regulatory phenomena requires one to consider regulation at both the mRNA and the protein level.
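The time-scale argument above can be illustrated with a linear toy model. This is a sketch under assumptions of our own choosing, not a model fitted to any of the cited data: a slow shared factor W (the common-input motif) drives two mRNA species, each mRNA also receives strong independent fast noise, and each protein tracks its own mRNA with slow turnover. All rate constants and noise amplitudes are hypothetical, and the variables represent deviations from mean levels rather than molecule counts.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells, n_steps = 2000, 3000

# Hypothetical parameters (per time step).
rho_w = 0.995   # persistence of the slow shared upstream factor W
k_m = 0.5       # fast mRNA turnover
k_p = 0.01      # slow protein turnover
sigma_w = 0.05  # amplitude of the noise driving W
sigma_m = 1.0   # amplitude of independent fast noise in each mRNA

w = np.zeros(n_cells)        # shared factor, one value per cell
m = np.zeros((2, n_cells))   # mRNA X and mRNA Y (deviations from the mean)
p = np.zeros((2, n_cells))   # protein X and protein Y (deviations from the mean)

for _ in range(n_steps):
    # Proteins relax slowly toward their own mRNA level (low-pass filter).
    p += k_p * (m - p)
    # mRNAs relax quickly toward W and receive independent fast noise.
    m += k_m * (w - m) + rng.normal(0.0, sigma_m, size=m.shape)
    # W fluctuates slowly and is common to both genes within a cell.
    w = rho_w * w + rng.normal(0.0, sigma_w, size=n_cells)

# Snapshot correlations across the cell population at the final time point.
corr_mrna = np.corrcoef(m[0], m[1])[0, 1]
corr_protein = np.corrcoef(p[0], p[1])[0, 1]

print(f"mRNA X vs mRNA Y correlation:       {corr_mrna:.2f}")
print(f"protein X vs protein Y correlation: {corr_protein:.2f}")
```

With these settings the two mRNA species are only weakly correlated, whereas the two proteins are strongly correlated, because slow protein turnover averages away the independent fast mRNA noise while passing the shared slow fluctuations.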
From these studies, it is now clear that variability in single-cell measurements contains a wealth of information that can reveal new insights into the regulatory phenomena of specific genes and the dynamic interplay of entire gene networks. As modern imaging techniques begin to overcome the diffraction limit of light (28) and flow cytometers become affordable for nearly any laboratory bench (29), we find ourselves in the midst of an explosion in single-cell research. With the advent of single-cell sequencing (30), it might be possible in the near future to determine the full transcriptome of many single cells and, from these, the full expression distributions and correlations for all genes in the genome. We expect that the approaches described in this review, which have been pioneered using model microbial systems, will be readily applied to mammalian cells and tissues (32