In the past, biomedical research has embraced a reductionist approach, primarily focused on characterizing the individual components that comprise a system of interest. Recent technical developments have significantly increased the size and scope of data describing biological systems. At the same time, advances in the field of systems biology have evoked a broader view of how the underlying components are interconnected. In this essay, we discuss how quantitative genetic interaction mapping has enhanced our view of biological systems, allowing a deeper functional interrogation at different biological scales.
In 1977, Charles and Ray Eames produced a short movie entitled “Powers of Ten”, taking viewers on a journey through space that spanned many orders of magnitude, from the atom to the outer universe (www.powersof10.com). The journey can be a humbling experience. Quoting Carl Sagan: “We find that we inhabit an insignificant planet of a hum-drum star lost in a galaxy tucked away in some forgotten corner of a universe in which there are far more galaxies than people”.
Biomedical research has focused on a subset of the orders of magnitude explored by Charles and Ray Eames, from ecosystems (10^6 meters) to the atomic structure of biomolecules (10^−10 meters). Although each of these orders of magnitude is typically explored with different sets of experimental tools, in nature they are intricately connected. For example, point mutations in proteins can lead to changes in signaling circuitry that can change species behavior (de Bono and Bargmann, 1998) with potential impact on inter-species interactions, while behaviors like algal blooms that create phenotypes visible from space are likely to be under genetic control (Erdner and Anderson, 2006). Still, throughout the last century, biological research has largely focused on characterizing the components that make up systems of interest. Only recently, with the advent of systems biology, is the emphasis shifting towards integrative studies that aim to describe how observed biological phenomena depend on the global interplay of these components. Increases in quantitative data and improvements in computational methods have led to the rise of models that, to some extent, can predict the non-intuitive behavior of biological systems at different scales. Examples of these include models of protein binding affinities (Chen et al., 2008), cell decision signaling events (Santos et al., 2007), development (Bergmann et al., 2007), and homeostasis (Novák and Tyson, 2008).
In this essay, we will discuss the rapid pace of technological development of one such method, quantitative genetic interaction mapping, and how it is being used to study different scales of biology. In tribute to the short movie “Powers of Ten”, we will journey from the whole organism to the atomic resolution of single amino acids.
A genetic interaction between two genes implies that they impact each other’s functions. Genetic interactions between two loci can be mapped by measuring how the phenotype of the double mutant differs from that expected when the phenotypes of the single mutations are combined (Figure 1A) (Mani et al., 2008; Phillips, 2008). The most commonly used neutral model assumes that the fitness of the double mutant is equal to the product of individual single mutant fitness. For example, if loss of gene A results in a growth rate 0.9 times the wild type growth rate, while loss of gene B results in a growth rate of 0.8, then the expected growth rate of the double mutant (loss of gene A and gene B) would be 0.72 times that of the wild type (Figure 1A). This neutral model assumes that two genes do not normally impact each other and, in fact, experimental observations support the intuitive idea that the vast majority of genes do not interact (i.e. strong genetic interactions are rare) (Tong et al., 2001; Pan et al., 2004; Schuldiner et al., 2005). Cases where knocking out two genes causes a more deleterious effect than the fitness reduction expected from the combination of the individual knock-outs are referred to as negative or aggravating interactions (e.g. synthetic sickness) (Figure 1A) and often identify proteins that are functioning in distinct but parallel pathways in a given process (Figure 1B). Alternatively, the combined double mutation can have a smaller than expected impact on fitness, and these cases represent positive or alleviating interactions (e.g. suppression) (Figure 1A).
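The arithmetic of the multiplicative neutral model can be made concrete with a short sketch. The function names and the `tolerance` cutoff below are illustrative assumptions, not part of any published scoring scheme; real screens derive scores and significance from replicate measurements rather than from a fixed threshold.

```python
def expected_double_fitness(w_a, w_b):
    """Multiplicative neutral model: expected fitness of the double mutant,
    with all fitness values expressed relative to wild type (wild type = 1.0)."""
    return w_a * w_b


def interaction_score(w_a, w_b, w_ab):
    """Deviation of the observed double-mutant fitness from the expectation.
    Negative values indicate aggravating interactions, positive values
    indicate alleviating interactions."""
    return w_ab - expected_double_fitness(w_a, w_b)


def classify(score, tolerance=0.05):
    """Label a score; `tolerance` is an arbitrary illustrative cutoff."""
    if score < -tolerance:
        return "negative"
    if score > tolerance:
        return "positive"
    return "neutral"


# Worked example from the text: single mutants grow at 0.9 and 0.8 times
# the wild type rate, so the expected double-mutant rate is about 0.72.
expected = expected_double_fitness(0.9, 0.8)

# Hypothetical observed double-mutant growth rates.
for observed in (0.40, 0.72, 0.80):
    score = interaction_score(0.9, 0.8, observed)
    print(observed, round(score, 2), classify(score))
```

With these made-up observations, a double mutant growing at 0.40 is scored as a negative interaction, one growing at the expected 0.72 as neutral, and one growing at 0.80 (no worse than the sicker single mutant) as positive.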
We have previously demonstrated that pairs of mutants that display positive genetic interactions often correspond to proteins that are functioning in the same pathway and/or are physically associated (Figure 1B) (Roguev et al., 2008; Collins et al., 2007). A possible explanation for this trend would be that if removal of a component of a complex results in that complex being functionally disabled, then deleting a second component would have no additional effect, a situation that would result in an epistatic (i.e. positive) interaction (Figure 1A). An additional scenario would be the deletion of one component of a complex that would result in a partially dysfunctional complex that would be detrimental to cell viability. If the removal of an additional component of the complex completely disabled this detrimental function, then this would result in a suppressive relationship, another type of positive interaction. Furthermore, if enough genetic interactions are collected for a set of genes, then each mutant engenders a genetic interaction profile, or phenotypic signature, representing how it genetically interacts with all other mutants tested. Comparison of these profiles is a powerful and unbiased way to identify genes that are functioning in the same biochemical pathway (Figure 1C) (Pan et al., 2004; Schuldiner et al., 2005; Collins et al., 2007; Tong et al., 2004).
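As a rough illustration of profile comparison, the sketch below ranks genes by the Pearson correlation of their interaction profiles. The gene names and scores are invented for the example, and Pearson correlation is only one reasonable similarity measure; we do not claim it is the exact metric used in the cited studies.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no information about similarity
    return cov / sqrt(var_x * var_y)


def most_similar(query, profiles):
    """Rank all other genes by profile correlation with `query`, best first."""
    ranked = [
        (gene, pearson(profiles[query], scores))
        for gene, scores in profiles.items()
        if gene != query
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical interaction scores of three mutants against the same four
# array mutants (negative = aggravating, positive = alleviating).
profiles = {
    "geneA": [-2.0, 0.1, 1.5, -0.3],
    "geneB": [-1.8, 0.0, 1.7, -0.2],
    "geneC": [1.2, -0.5, -1.9, 0.4],
}

for gene, r in most_similar("geneA", profiles):
    print(gene, round(r, 2))
```

In this toy data set geneB ranks first for geneA, which is the pattern expected for two genes acting in the same pathway, while geneC is anti-correlated.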
The multiplicative model described above is useful for quantitative measures such as growth rate but may be less appropriate for complex phenotypes like cell morphology, and, as such, alternative models of epistatic behavior have been proposed and compared (Mani et al., 2008). In fact, the study of epistasis has a strong theoretical basis in the literature of genetic linkage and related areas, and many of these concepts were recently reviewed (Phillips, 2008). Here, we will focus primarily on developments of high-throughput quantification of genetic interactions, analysis methods, and their applications across different species and scales of biological organization.
Genetic studies are traditionally sub-divided into forward and reverse genetics. Forward genetics often involves defining a phenotype of interest and then identifying mutants that contribute to this phenotype. In contrast, reverse genetic studies start with genes of interest and attempt to define their function through mutational analysis. In this context, genetic interaction screening can be defined as a form of reverse genetics.
The development of high-throughput genetic interaction screening was made possible by the creation of deletion libraries for single non-essential S. cerevisiae genes (recently reviewed in Boone et al., 2007). An important landmark was the first implementation, termed Synthetic Genetic Array (SGA), where each S. cerevisiae single gene deletion strain was mated to produce arrays of double mutant strains (Tong et al., 2001). This allowed for the rapid qualitative assessment of synthetic lethal interactions for many thousands of gene pair combinations. Pan and colleagues developed an alternative approach, named dSLAM (Diploid-based Synthetic Lethal Analysis with Microarrays), to detect genetic interactions using pools of barcoded mutants (Pan et al., 2004). In this approach, genetic interactions are determined by the differential enrichment of double mutants growing in competitive culture as measured using barcode microarrays. Although in principle both these methods can be adapted to measure a range of epistatic effects, in practice they were used to identify synthetic sick or lethal (i.e. negative) interactions. The E-MAP (Epistatic Mini-Array Profile) approach introduced protocols to measure colony size in array format, thus quantifying genetic interactions in a high-throughput fashion (Collins et al., 2006; Schuldiner et al., 2005). The barcode approach was also adapted to provide a quantitative genetic score (Decourty et al., 2008), and recently an approach using a flow cytometry device was developed that enabled even more precise quantification of very small epistatic effects (Breslow et al., 2008).
In parallel with the development of genetic interaction screening for S. cerevisiae, methods were developed for screening in C. elegans using RNAi knockdown. In this case, C. elegans strains carrying a specific mutation (i.e. gene knock-out) are fed different RNAi expressing bacteria in a 96-well assay (Byrne et al., 2007; Lehner et al., 2006). RNAi-mutant combinations that differ from the expected phenotypes of the combined single perturbations are defined as synthetic sick/lethal. Quantifying genetic interactions in C. elegans is more challenging than in S. cerevisiae given the added complexity of multicellularity. Nevertheless, Byrne and colleagues were able to provide a semi-quantitative measure of genetic interactions by scoring phenotypes using visual inspection (Byrne et al., 2007).
In the last few years, protocols have been developed to assay genetic interactions in other species, such as E. coli (Typas et al., 2008; Butland et al., 2008), S. pombe (Dixon et al., 2008; Roguev et al., 2007) and Drosophila cell lines (Bakal et al., 2008). These will increase our capacity to probe for epistatic effects across different species. Although methods to assay genetic interactions differ in implementation, and have different advantages and disadvantages, they are converging on a common result – the quantification of genetic interactions on a large scale.
Having briefly reviewed the principles of genetic interaction screening and the methodological developments to date, we describe in the following sections how this approach can be used to study biological systems across different scales of space and time.
In the spirit of the “Powers of Ten” film, we start our journey at the millimeter scale (10^−3 m), the length of a C. elegans worm. From this whole organism viewpoint, the phenotypes of interest relate to survival and development. As mentioned above, quantitatively measuring epistatic effects in a multicellular organism is a difficult task. However, it has recently been shown that genetic interactions can be used to predict the effect of single gene perturbations at the organismal level (Lee et al., 2008). Lee and colleagues used genetic interaction data in combination with additional information (e.g. mRNA co-expression and protein-protein interactions) to define a network of functional interactions in C. elegans. This network allowed them to make predictions for phenotypes visible at the level of the organism. For example, they identified genes that suppress the ectopic vulvae phenotype associated with synMuv A inactivation (Lee et al., 2008). Although this study was based on a predicted network of functional interactions, it clearly shows the potential for understanding genetic interactions at an organismal level.
We move away from C. elegans and zoom down two to three orders of magnitude to about 5×10^−6 meters, the average diameter of an S. cerevisiae cell (Figure 2). Although we leave the complexities of multicellularity behind, we are now concerned with the myriad of functions the cell must sustain in order to survive and replicate. To illustrate the power of quantitative genetic interaction information at different scales, we will use data derived from a single study describing an E-MAP focused on various aspects of chromatin function, including chromosome segregation, transcriptional regulation, DNA repair/replication and chromatin remodeling/modification (Collins et al., 2007). At the cellular and sub-cellular level (10^−6 to 10^−7 meters), quantitative genetic interactions reveal how different biological processes are functionally connected. For example, a strong propensity for negative genetic interactions is observed between DNA replication and DNA recombination/repair genes, as well as between transcription and chromatin modification/remodeling genes (Figure 2), arguing that significant redundancy exists between pathways within these processes.
In an influential paper from a decade ago, Hartwell and colleagues articulated an important challenge: “Cell biology is in transition from a science that was preoccupied with assigning functions to individual proteins or genes, to one that is now trying to cope with the complex sets of molecules that interact to form functional modules” (Hartwell et al., 1999). One clear example of this modular organization is the assembly of protein macromolecular structures (complexes) from smaller modular groups of proteins that cooperate to carry out biochemical tasks (Krogan et al., 2006; Gavin et al., 2006). As we delve deeper towards 10^−8 m, we begin to see these individual complexes. Although datasets comprising only physical protein interactions tend to arrange into distinct (albeit modular) complexes, these in turn are often connected by negative epistatic interactions (Kelley and Ideker, 2005). In fact, using quantitative genetic interactions, we can identify pathways where sets of physically distinct complexes are acting together in linear pathways (Figure 2). Segrè and colleagues first noted that the genetic interactions between genes acting in the same cellular process tended to be predominantly negative or predominantly positive (Segrè et al., 2005). Moreover, if the complex or pathway is non-essential, the components tend to be enriched for positive genetic interactions among themselves and have very similar genetic interaction profiles (Collins et al., 2007; Roguev et al., 2008).
For example, quantitative genetic interactions were used to uncover a pathway required for efficient transcriptional elongation by RNA polymerase II comprising three complexes: the Rad6 histone H2B ubiquitination complex (Sun and Allis, 2002; Wood et al., 2003; Hwang et al., 2003; Robzyk et al., 2000), the Paf1 complex (Mueller and Jaehning, 2002; Squazzo et al., 2002; Krogan et al., 2002) and COMPASS, an eight subunit complex that methylates lysine 4 of histone H3 (Miller et al., 2001; Roguev et al., 2001; Nagy et al., 2002; Krogan et al., 2002) (Figure 2). Interestingly, genetic interaction information alone does not distinguish between mutants of the components of the complexes that function in this pathway, as they all share similar profiles as well as positive genetic interactions (Figures 1B, C and 2). Further analysis of genes associated with this pathway revealed additional stable, stoichiometric complexes where all the components act in a concerted and coherent fashion, including Rpd3C(L), the histone deacetylation complex responsible for regulation of gene expression at promoters of many genes (Keogh et al., 2005; Carrozza et al., 2005) (Figure 2).
Success in studying pathways in this way has spurred the development of unsupervised learning approaches that can make use of genetic and physical interaction information to provide more accurate module predictions (Ulitsky et al., 2008; Bandyopadhyay et al., 2008). The fact that these modules can be derived directly from the genetic interaction information using unsupervised learning methods is in itself evidence that modularity is a property of cellular networks as postulated by Hartwell and colleagues (Hartwell et al., 1999).
As we zoom to 10^−8 meters, we reach the level of the individual protein. The discovery that the histone H2A variant Htz1 is incorporated into chromatin via the SWR-C complex, an event that facilitates transcription, chromosome segregation, replication and DNA repair, relied on the observation of quantitative genetic interactions (Mizuguchi et al., 2004; Krogan et al., 2003; Kobor et al., 2004). Close inspection of these interactions allows us to piece together pathway architecture (Figures 1B and 2). These observations also suggested that a redundancy exists with respect to Htz1 incorporation via the SWR-C complex and histone deacetylation by Rpd3C(L), since strong negative genetic interactions exist between members of these complexes (compared to positive genetic interactions between components within the respective complexes).
In the final part of our journey, we reach the resolution of 10^−10 meters, and ask: how can genetic information allow us to make functional inferences about protein structure? Until this point we have mostly discussed experiments where wild type genes were either knocked down or knocked out. In these cases, the function of a gene is perturbed in its entirety. These same methods could be used to study other mutants, including over-expression alleles or mutations disrupting specific functions of a gene, allowing for the study of structure-function relationships. For example, Haarer and colleagues used alanine mutation variants of the actin gene to test for genetic interactions with genes previously shown to genetically interact with haplo-insufficient actin (Haarer et al., 2007). The authors observed that different mutations recapitulated subsets of the phenotypes previously identified using the null allele. Interestingly, mutations that were near each other on the protein surface tended to share genetic interactions, consistent with the concept of individual functions mapping to local regions of domains within protein sequences. Using the E-MAP approach (Collins et al., 2007), we analyzed PCNA, or Pol30, an essential protein involved in DNA repair, chromatin assembly and DNA replication. Since this protein is multifunctional, we expect that different parts of the protein will be linked to different processes. Indeed, one specific mutation (pol30-79) results in an E-MAP profile that resembles those derived from a hypomorphic allele of POL30 (pol30-DAmP) (Schuldiner et al., 2005) and several canonical replication mutants, including RAD27 or POL32 null alleles (Figure 2). Based on these profiles, we speculate that the pol30-79 mutation could have a general destabilizing effect on PCNA.
However, another mutation, located in a different region of Pol30 (pol30-8 in Figure 2), engenders a strikingly different profile, one that resembles those seen with deletions of CAC2, RLF2 and MSI1, coding for the three components of the CAF-1 chromatin assembly complex. Further evidence that Pol30 functions with CAF-1 comes from the fact that pol30-8 displays positive genetic interactions with components of the complex (Figure 2). These proof-of-principle experiments demonstrate that it should be possible to use quantitative genetic interaction screening to probe the relation between structure and function in a high-throughput and systematic manner.
We used the same chromosome biology-related dataset throughout these different scales as a coherent example. However, many other studies have been conducted on different functional aspects of yeast biology. For instance, Lin and colleagues (Lin et al., 2008) surveyed the histone acetylating and deacetylating enzyme complexes, providing a welcome overview of the organization of these complexes as well as finding links between DNA double-strand break repair and the NuA4 complex. Similarly, the SGA method showed how diverse cellular modules such as cell polarity, the mitotic microtubule complex, and DNA synthesis and repair are integrated into higher order pathways (Tong et al., 2001). The E-MAP approach has also been used to study diverse processes such as the early secretory pathway (Schuldiner et al., 2005), kinase signaling systems (Fiedler et al., 2009), RNA processing (Wilmes et al., 2008), and protein folding in the endoplasmic reticulum (Jonikas et al., 2009).
Having reached the level of amino acids on individual protein molecules (10^−10 meters), we begin to journey back. We now also consider another important dimension: time. Like space, the analysis of time dependent changes in biology spans many orders of magnitude, from the study of picosecond molecular dynamics in single proteins to the analysis of evolutionary changes over millions of years. Again, quantitative genetic interaction studies have shed light on time dependent biological processes (Figure 3).
One example of a time dependent process is the cellular response to signaling inputs. Many pathways and genes are used by the cell to sense and adapt to changes in external conditions. For this reason it is expected that some functions and genetic interactions will become apparent only in the presence of specific external conditions (Figure 3A). The tools described above could be used to study genetic interactions before and after specific changes in external conditions in order to understand how cells react to these insults. For example, St. Onge and colleagues used a DNA-damaging agent (MMS) to search for changes in genetic interactions in the presence of this environmental stress. By studying gene pairs that elicited alleviating genetic interactions in the presence of MMS, the authors identified interactions important for the DNA damage response (St Onge et al., 2007). In another study, Bakal and colleagues showed that instead of colony size or relative growth rate, the pathway activity of a double mutant could be used as a measure of fitness. They used simultaneous repression of combinations of genes by RNA interference to find novel regulators of the Drosophila Jun NH2-terminal kinase (JNK) (Bakal et al., 2008). JNK activity was monitored using FRET to find deviations from neutral expectation. Extrapolating from these two studies, which show that genetic interaction assays can be used to study condition/time dependent cellular functions, we can envision that increasing the number and variety of phenotypic read-outs (e.g. pathway activity, transcriptional output, etc.) will allow the study of cellular pathways in an unprecedented manner.
Evolutionary changes are an important source of time dependent variation, casting light on how nature uses genes and proteins to solve a variety of biological problems. After speciation events, species diverge over time as they adapt to their specific niches, resulting in cross-species differences in their genome organization and cellular interaction networks (Figure 3B). Cross-species studies of genetic interactions have begun to help us understand how mutations at the level of DNA generate phenotypic variation across species. Using RNA interference, Tischler and colleagues perturbed 837 gene pairs in C. elegans that were orthologous to synthetic lethal gene pairs in S. cerevisiae (Tischler et al., 2008). They estimated that, at most, 5% of the synthetic lethal interactions are conserved between these two species. This is in stark contrast to the high conservation of essentiality for orthologous genes across these species (Tischler et al., 2008). This evolutionary question has also been systematically addressed in two genetic interaction studies of S. pombe. Roguev and colleagues developed a strategy for mapping genetic interactions in fission yeast (Roguev et al., 2007) and used it to quantitatively measure pair-wise interactions between 550 genes (Roguev et al., 2008). Analyzing orthologous gene pairs in budding and fission yeasts, it was observed that about 17% of negative interactions and approximately 10% of positive interactions were conserved across species. This high divergence is unlikely to be largely explained by technical reasons, since these same genetic interaction scores show high reproducibility in biological replicates within each of the species (Roguev et al., 2008). Interestingly, positive interactions between genes whose products share the same complex are much more likely to be conserved (50% conservation) than other positive interactions. It was shown that within-module genetic interactions (i.e. within protein complexes) are more likely to be conserved than the genetic interactions across modules. These genetic results are consistent with information obtained with other experimental methods by which it was observed that protein complexes are highly conserved across species (van Dam and Snel, 2008) while their regulation by gene expression or by post-translational modifications appears to change quickly (Jensen et al., 2006; Tuch et al., 2008; Holt et al., 2009; Tan et al., 2009; Beltrao et al., 2009). This is consistent with the notion that modularity increases evolutionary plasticity by allowing the cell to re-use modules to adapt to changing environments (Figure 3B). In Dixon and colleagues' study of S. pombe genetic interactions, a similar level of conservation between these yeast species was reported, with 23 to 29% of negative genetic interactions conserved between orthologous pairs (Dixon et al., 2008).
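To make the notion of a conservation percentage concrete, the following minimal sketch counts how many interactions from one species are also observed for the orthologous gene pair in a second species. It assumes one-to-one orthologs and uses placeholder gene names; the cited studies apply their own ortholog assignments, score thresholds and statistical corrections, which are not reproduced here.

```python
def as_pairs(pairs):
    """Store gene pairs as unordered sets so (x, y) and (y, x) are equal."""
    return {frozenset(p) for p in pairs}


def conserved_fraction(interactions_a, interactions_b, orthologs, tested_b):
    """Fraction of species-A interactions that are also seen in species B,
    counting only pairs whose orthologs exist and were tested in species B.
    Returns None when no pair can be compared."""
    comparable = set()
    for pair in interactions_a:
        if len(pair) != 2:
            continue  # skip malformed self-pairs
        g1, g2 = pair
        if g1 in orthologs and g2 in orthologs:
            mapped = frozenset((orthologs[g1], orthologs[g2]))
            if mapped in tested_b:
                comparable.add(mapped)
    if not comparable:
        return None
    return len(comparable & interactions_b) / len(comparable)


# Hypothetical data: genes a1..a5 in species A, b1..b4 in species B.
negative_a = as_pairs([("a1", "a2"), ("a1", "a3"), ("a2", "a4"), ("a3", "a5")])
orthologs = {"a1": "b1", "a2": "b2", "a3": "b3", "a4": "b4"}  # a5 has none
tested_b = as_pairs([("b1", "b2"), ("b1", "b3"), ("b2", "b4"), ("b3", "b4")])
negative_b = as_pairs([("b1", "b2")])

fraction = conserved_fraction(negative_a, negative_b, orthologs, tested_b)
print(f"{fraction:.0%} of comparable negative interactions are conserved")
```

Restricting the denominator to pairs that were actually tested in the second species matters: untested pairs would otherwise be counted as non-conserved and deflate the estimate.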
These evolutionary studies are the first of many that are sure to follow given the development of methods to assay other species (Typas et al., 2008; Bakal et al., 2008). The mapping of genetic interaction networks for multiple species will allow for comparative studies that promise to advance our understanding of cellular networks in much the same way as comparative genomics has advanced our knowledge of genome architecture.
We have described here how the technological developments in systematic genetic interaction screening can be used to gain insight at different scales of biological organization. Although we have focused this essay on the information derived from genetic interactions, other methods have seen similar developments. For example, the yeast two-hybrid method has recently been adapted to identify the protein domains most likely responsible for a given interaction (Boxem et al., 2008) as well as the potential effects of point mutations (Zhong et al., 2009). Mass spectrometry has also been shown to be extremely valuable for studying time dependent processes (Wolf-Yadlin et al., 2006). Finally, advances in structural approaches such as electron tomography coupled with improved computational methods are starting to provide us with an integrated structural view of living organisms from atomic details of single proteins to whole cells (Aloy and Russell, 2006; Alber et al., 2007).
Systems biology approaches range between bottom-up and top-down methods. Top-down approaches are characterized by the gathering of genome-wide high-throughput data that can be analyzed to identify informative patterns or correlations. Bottom-up methods, on the other hand, are typically concerned with a smaller number of elements that are analyzed to identify important design features in biological systems. Developments in high-throughput experimental assays, especially quantitative genetic interaction mapping, are blurring these lines and showing us that one can zoom in from entire cellular maps to the angstrom resolution of protein structure-function analysis using the same methodologies. This is crucial because, after all, as Eames stated after completing the “Powers of Ten” film, “Eventually, everything connects.”