Epistasis refers to the interaction between genes. Although high-throughput epistasis data from model organisms are being generated and used to construct genetic networks1-3, the extent to which genetic epistasis reflects biologically meaningful interactions remains unclear4-6. We address this question by in silico mapping of positive and negative epistatic interactions amongst biochemical reactions within the metabolic networks of E. coli and S. cerevisiae using flux balance analysis. We find that negative epistasis occurs mainly between nonessential reactions with overlapping functions, whereas positive epistasis usually involves essential reactions, is highly abundant, and, surprisingly, often occurs between reactions without overlapping functions. We offer mechanistic explanations for these findings and validate them experimentally for 61 S. cerevisiae gene pairs.
Epistasis refers to the phenomenon in which the effect of a gene on a trait is masked or enhanced by one or more other genes6,7. Fisher and other population and quantitative geneticists extended the concept to mean non-independent or non-multiplicative effects of genes6,8. The direction, magnitude, and prevalence of epistasis are important for understanding gene function and interaction2,6,9, speciation10, evolution of sex and recombination11,12, evolution of ploidy13, mutation load14, genetic buffering15, human disease4,5, and drug-drug interaction16. Epistasis in fitness between two mutations is commonly defined by ε = W_XY - W_X W_Y, where W_X and W_Y represent the fitness values of two single mutants relative to the wild-type and W_XY represents the fitness of the corresponding double mutant. Epistasis is said to be positive when ε > 0, and negative when ε < 0. When deleterious mutations are concerned, positive epistasis lessens the fitness reduction predicted from individual mutational effects, whereas negative epistasis enhances it. The magnitude of epistasis between different pairs of mutations may be compared using scaled epistasis ε̃ (ref. 17), which is transformed from and has the same sign as ε, but is normally bounded between -1 and 1.
We apply flux balance analysis (FBA) of metabolic networks18 to explore the functional association between biochemical reactions that are epistatic to each other. Assuming a steady state in metabolism, FBA maximizes the rate of biomass production subject to the stoichiometric matrix of all reactions and a set of flux constraints. The maximized rate in a mutant strain relative to that in the wild-type strain can be regarded as the fitness of the mutant relative to the wild-type17. FBA can be used to investigate the fitness of the cell under various environmental and genetic perturbations19,20 and has been used to generate the epistasis map of yeast metabolic genes17,21,22.
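To make the definition ε = W_XY - W_X W_Y concrete, the following minimal Python sketch computes ε and a scaled version from three relative fitness values. The function names are hypothetical. The scaling rule is an assumption: the text states only that scaled epistasis has the same sign as ε and is normally bounded between -1 and 1, so the sketch divides ε by its distance to an extreme reference value (min(W_X, W_Y) for positive ε, zero for negative ε), which is assumed here to be the kind of normalization used in ref. 17.

```python
import math


def epistasis(w_x, w_y, w_xy):
    """Unscaled epistasis: eps = W_XY - W_X * W_Y (fitness relative to wild-type)."""
    return w_xy - w_x * w_y


def scaled_epistasis(w_x, w_y, w_xy, tol=1e-12):
    """Scaled epistasis under an ASSUMED normalization (not given in the text).

    Positive eps is divided by its maximum under full buffering, where the
    double mutant is as fit as the less fit single mutant: min(W_X, W_Y).
    Negative eps is divided by its extreme under synthetic lethality: W_XY = 0.
    """
    eps = epistasis(w_x, w_y, w_xy)
    if abs(eps) <= tol:
        return 0.0
    w_ref = min(w_x, w_y) if eps > 0 else 0.0
    denom = abs(w_ref - w_x * w_y)
    if denom <= tol:
        return math.nan  # reference coincides with the multiplicative expectation
    return eps / denom


# Illustrative fitness values only; they are not data from the paper.
for w_x, w_y, w_xy in [(0.5, 0.5, 0.5), (0.5, 0.5, 0.0), (0.8, 0.5, 0.4)]:
    print(w_x, w_y, w_xy, epistasis(w_x, w_y, w_xy), scaled_epistasis(w_x, w_y, w_xy))
```

Under this assumed scaling, a double mutant that is exactly as fit as its less fit single mutant scores 1, a synthetic lethal pair scores -1, and a pair with purely multiplicative effects scores 0.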
We first study the bacterium Escherichia coli, because its reconstructed metabolic network is of high quality and its FBA predictions have been empirically verified20,23.
Using FBA, we identified from the E. coli metabolic network 270 reactions whose removal reduces fitness in glucose minimal medium. Removing any of the remaining 661 reactions has no such effect, primarily because the reaction has zero flux in this medium, or occasionally because the network has another reaction that can fully compensate for its loss. Among the 270 reactions, 212 are essential, meaning that deleting any one of them results in zero fitness. We considered a genetic perturbation in each reaction that constrains its flux to ≤50% of its wild-type optimal value and then computed the fitness of the mutant by FBA. We similarly computed the fitness values of all possible double mutants and obtained ε and the scaled epistasis ε̃ for all pairs of the 270 reactions, which reveal the global epistasis pattern within the metabolic network (Supplementary Table 1). Constraining the flux to ≤50% rather than to zero (refs 17, 21, 22) allows the investigation of essential reactions. Consequently, the number of pairwise epistasis values obtained here is more than 25 times that obtained previously17. Constraining the flux to other non-zero levels does not alter our results qualitatively (Supplementary Table 1).
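The perturbation scheme described above can be sketched on a toy network. The example below is not the E. coli reconstruction: the five reactions, the nutrient limit of 10, and all names are hypothetical, chosen so that one reaction (RA) is essential and another (RB) is nonessential because a less efficient alternative (RB_alt) exists. It assumes NumPy and SciPy are available and solves the steady-state linear program with scipy.optimize.linprog.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Metabolites: N (nutrient), A and B (biomass constituents).
REACTIONS = ["uptake", "RA", "RB", "RB_alt", "biomass"]
S = np.array([
    [1.0, -1.0, -1.0, -2.0,  0.0],  # N: uptake makes it; RA, RB use 1; RB_alt uses 2
    [0.0,  1.0,  0.0,  0.0, -1.0],  # A: made only by RA, consumed by biomass
    [0.0,  0.0,  1.0,  1.0, -1.0],  # B: made by RB or RB_alt, consumed by biomass
])
UPPER = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])  # uptake is the limiting bound


def fba(upper):
    """Maximize biomass flux subject to S v = 0 and 0 <= v <= upper."""
    c = np.zeros(len(REACTIONS))
    c[-1] = -1.0  # linprog minimizes, so negate the biomass flux
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=[(0.0, u) for u in upper], method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


wild_type = fba(UPPER)


def fitness(perturbed, fraction=0.5):
    """Biomass flux relative to wild-type when each listed reaction is capped
    at `fraction` of its wild-type optimal flux."""
    upper = UPPER.copy()
    for name in perturbed:
        i = REACTIONS.index(name)
        upper[i] = fraction * wild_type[i]
    return fba(upper)[-1] / wild_type[-1]


w_ra = fitness(["RA"])            # essential reaction
w_rb = fitness(["RB"])            # nonessential: RB_alt partly compensates
w_double = fitness(["RA", "RB"])
eps = w_double - w_ra * w_rb
print(f"W_RA={w_ra:.3f}  W_RB={w_rb:.3f}  W_RA,RB={w_double:.3f}  eps={eps:.3f}")
```

Working the toy network by hand gives W_RA = 1/2, W_RB = 5/6 and W_RA,RB = 1/2, so ε = 1/12 > 0. Once the essential reaction RA is constrained, RB no longer needs more than half of its wild-type flux, so constraining it as well costs nothing further.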
To examine whether metabolic reactions with epistatic relationships are functionally associated, we need to identify the function of each reaction in generating the E. coli biomass, which is composed of 49 constituents. If a reaction is important for producing a set of biomass constituents, the removal of these constituents from the biomass function will recover the biomass reduction caused by the deletion of the reaction. Based on this idea, we designed a removal-recovery method to determine the functions of 255 of the 270 important reactions in generating biomass constituents (Fig. 1a). For the remaining 15 reactions, the functions cannot be unambiguously determined and thus they are excluded from our analysis. The majority of the 255 reactions each contribute to only one biomass constituent, whereas a small number of reactions affect many or even all 49 constituents (Fig. 1b). Note that the glucose minimal medium is again used in determining the function of each reaction, because some reactions have variable functions in different media. Functional assignment by our method is generally consistent with the conventional functional annotation of E. coli reactions24, but our assignment is expected to be more precise in identifying the biomass constituents contributed by each reaction.
We found 26 (0.08%) reaction pairs that show apparent negative epistasis (scaled epistasis ε̃ ≤ -0.01). Among them, 25 pairs each share functions in producing at least one biomass constituent (Table 1; Fig. 2a, 2b). The remaining pair consists of reactions MALS (catalyzed by malate synthase) and PPC (phosphoenolpyruvate carboxylase), anaplerotic reactions feeding the Krebs cycle. The lack of shared biomass constituents between them is due to the incomplete identification of MALS and PPC functions caused by their mutual functional compensation (Supplementary Figure 1). A common interpretation of negative epistasis between two genes is that the two genes can individually perform a common function and thus each of them is able to compensate for the loss of the other. Our observation that virtually every pair of reactions with negative epistasis shares at least one function strongly supports this interpretation (Fig. 2b). While negative epistasis might be expected to occur between two nonessential reactions, this is not absolute. For example, two essential reactions (or one essential reaction and one nonessential reaction) may share a nonessential function in producing a biomass constituent and show negative epistasis through this common function (Table 1).
In contrast to the rare occurrence of negative epistasis, >97% of reaction pairs exhibit apparent positive epistasis (scaled epistasis ε̃ ≥ 0.01) (Fig. 2a). However, only ~26% of them occur between reactions that share at least one biomass constituent (Table 1; Fig. 2c). There is also no significant difference in ε or ε̃ between functionally overlapping and non-overlapping reaction pairs with positive epistasis. It is often observed that a reaction is positively epistatic with a large number of apparently unrelated reactions. Use of ε instead of ε̃ in measuring epistasis does not change this pattern. The lack of functional overlap between most positively epistatic reaction pairs challenges the general interpretation of epistasis as functional association2,9,25.
Why does positive epistasis occur so frequently between functionally unrelated reactions? Fig. 2a shows that virtually every essential reaction exhibits strong positive epistasis (scaled epistasis ε̃ ~ 1) with any other reaction, regardless of that reaction's function and essentiality. This can be explained by considering that, when an essential reaction is constrained, almost all other reactions in the network operate below their full capacity so that the composition stoichiometry of the biomass is maintained (Supplementary Figure 2a, 2b). Consequently, a genetic perturbation in a second reaction that reduces its capacity will have a negligible additional effect, resulting in positive epistasis. Note that positive epistasis sometimes occurs between nonessential genes and in these cases ~80% (288/361) show functional overlaps (Fig. 2b).
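The argument above can be written out using ε = W_XY - W_X W_Y, where W_X and W_Y are the relative fitness values of the two single mutants and W_XY that of the double mutant. As a sketch, take the idealized limit in which perturbing a second reaction Y has no additional effect once the essential reaction X is constrained, so that W_XY = W_X. This equality is an illustrative simplification of "negligible additional effect", not a result stated in the text. The second line further assumes that scaled epistasis divides a positive ε by |min(W_X, W_Y) - W_X W_Y|, which is assumed here to be the normalization of ref. 17. Because adding a constraint in FBA cannot raise the optimum, W_XY = W_X implies W_X ≤ W_Y.

```latex
\begin{align*}
\varepsilon &= W_{XY} - W_X W_Y = W_X - W_X W_Y = W_X\,(1 - W_Y) > 0
  \quad \text{for } 0 < W_X \le W_Y < 1, \\
\tilde{\varepsilon} &= \frac{\varepsilon}{\lvert \min(W_X, W_Y) - W_X W_Y \rvert}
  = \frac{W_X\,(1 - W_Y)}{W_X\,(1 - W_Y)} = 1 .
\end{align*}
```

For instance, with the illustrative values W_X = 0.5 and W_Y = 0.9, ε = 0.5 × 0.1 = 0.05, which is small in absolute terms, yet the scaled value under the assumed normalization is 1.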
Why is there no such effect between nonessential reactions? There are three requirements for a metabolic reaction to be considered here as important yet nonessential. First, it must function in producing one or more biomass constituents. Second, there must be alternative reactions that can also make its product. Third, compared with the alternative reactions, it must be more efficient in producing at least one constituent. When the flux of a nonessential reaction is constrained, its less efficient alternative reaction will be turned on (Supplementary Figure 2c). Due to the lower efficiency of the alternative reaction, nutrients that previously went through other reactions for making other biomass constituents can be redistributed in such a way that the biomass reduction by the flux constraint is minimized (Supplementary Figure 2c). It can be shown mathematically that when the number of reactions in the network is large, perturbations of two functionally unrelated nonessential reactions will have a nearly multiplicative effect on biomass production and cause negligibly weak positive epistasis15,17 (Supplementary Note).
Saccharomyces cerevisiae is another species whose reconstructed high-quality metabolic networks have been extensively validated experimentally19,21. We repeated the above FBA in S. cerevisiae and obtained similar general findings on the frequencies of positive and negative epistasis and the functional relationships of epistatic reactions (Table 1; Fig. 3). Specifically, only 0.2% of reaction pairs show negative epistasis (scaled epistasis ε̃ ≤ -0.01), 83% of which have functional overlaps. By contrast, >95% of reaction pairs show positive epistasis (ε̃ ≥ 0.01), only 20% of which have overlapping functions.
Our computational results appear to be robust against several potential caveats in the computational analysis (Supplementary Note). We further pursued experimental validation of our computational predictions in S. cerevisiae, because of the difficulty in conducting partial gene deletion in E. coli. Six essential and two nonessential genes from seven functional categories were examined (Supplementary Tables 2 and 3). We deleted one allele per gene from a diploid S. cerevisiae to achieve partial disruption of a gene. Haploinsufficient genes were used to ensure that partial gene disruption affects fitness. Only non-metabolic genes were examined, because metabolic genes are rarely haploinsufficient26. Non-metabolic genes are expected to behave similarly to metabolic genes in terms of epistasis27, as long as the final product is composed of multiple constituents with a fixed or preferred composition stoichiometry. We measured the fitness of each strain through a growth competition assay with a reference strain followed by cell counting using fluorescence activated cell sorting (FACS). We then calculated the fitness values of all single-deletion strains and all pairwise double-deletion strains relative to the wild-type, which allowed the estimation of epistasis between genes (Online Methods, Supplementary Note). Among the 27 gene pairs that involve at least one essential gene, 23 (85%) have significantly positive ε (P < 0.05, t test), two have significantly negative ε, and the remaining two do not show significant epistasis (Fig. 4a). The mean scaled epistasis ε̃ among the 23 positively epistatic pairs is 0.78, and 11 of them have ε̃ not significantly smaller than 1. The epistasis between the two nonessential genes is not statistically significant. These results strongly support the general findings of our computational predictions that essential genes often show epistasis with functionally unrelated genes.
Because the above experiment could not examine haplosufficient genes, we employed the newly developed DAmP method28 to mimic partial gene deletion, in which a marker gene is inserted into the 3′ untranslated region of a gene such that its protein expression may be reduced to <50%. We studied 9 haplosufficient genes belonging to 8 functional categories, including 4 essential genes that are knocked down by DAmP and 5 nonessential genes that are knocked out (Supplementary Table 2). We were able to measure the epistasis of 33 of the 36 gene pairs in haploid cells (Fig. 4b; Supplementary Table 4). Of the 23 gene pairs that have epistasis estimates and involve at least one essential gene, 20 (87%) show significantly positive ε (P < 0.05, t test), two show significantly negative ε, and the remaining one does not show significant epistasis (Fig. 4b). These results further support our computational result of abundant positive epistasis involving essential genes, even among functionally unrelated ones. In the Supplementary Note, we discuss possible explanations for why selected previous studies examining the extent of epistasis in E. coli, yeast, and other species did not find a comparably high prevalence of positive epistasis1-3,15,17,29.
In summary, our flux balance analysis of the E. coli and yeast metabolic networks and the subsequent experimental validations for 61 gene pairs in S. cerevisiae reveal a high prevalence of positive epistasis involving essential genes. While negative epistasis was usually found amongst genes involved in reactions with overlapping functions, positive epistasis often occurs amongst genes involved in reactions with unrelated functions. The proportion of essential genes is ~7% in E. coli, 17% in S. cerevisiae, and 55% in mouse30, and positive epistasis is therefore likely to be even more prevalent in higher eukaryotes than found here. These findings suggest that genetic interaction should be distinguished from non-multiplicative (or non-additive) gene effects and caution against the use of positive epistasis to infer genetic pathways and gene-gene interactions. While one may argue that, because all metabolic genes share functions in supporting cell growth, their epistasis is not surprising, we suggest that, if epistasis corresponds to such a crude functional relationship, it provides little biological insight. Although our results are presented primarily using the scaled epistasis ε̃, it is clear that positive epistasis is highly abundant and much more prevalent than negative epistasis even when ε is used (Supplementary Figures 3 and 4). This is also the case when the majority of mutations are only slightly deleterious (Supplementary Table 5). These observations also suggest the need for reevaluation of evolutionary theories that depend on overall negative epistasis, such as the mutational deterministic hypothesis of the evolution of sexual reproduction11 and the hypothesis of reduction in mutational load by truncation selection against deleterious mutations14.
Methods and any associated references are available in the online version of the paper at http://www.nature.com/naturegenetics/.
We thank Anuj Kumar for yeast strains and plasmids, Nike Bharucha, Gizem Kalay, Anuj Kumar, Jun Ma, and Barry Williams for advice and assistance in yeast experiments, Bernhard Palsson and his group for instruction on FBA, Ben-Yang Liao for drawing Supplementary Figure 2, and Meg Bakewell, Soochin Cho, Wendy Grus, Ben-Yang Liao, and Calum Maclean for valuable comments. This work was supported by research grants from the National Institutes of Health and University of Michigan Center for Computational Medicine and Biology to J.Z.
Author Contributions: X.H. and J.Z. conceived the research. X.H., Z.W., W.Q. and J.Z. designed the experiments. X.H., W.Q., W.Z., Y.L. and J.Z. conducted the experiments. X.H., W.Q., W.Z. and J.Z. analyzed the data. X.H. and J.Z. drafted the manuscript and all authors contributed to the final manuscript writing and its revisions.
We have no competing financial interests.
URLs: Additional analyses related to this publication can be found at http://www.umich.edu/~zhanglab/download.htm.