The relationship between genotype and phenotype is at the core of classical and modern genetics. However, complex phenotypes, involving many cellular processes, present significant challenges due to the limited sensitivity with which genotype–phenotype relationships can be mapped. Here, we have combined a comprehensive exploration of adaptive potential with a robust modular data analysis approach to reveal the genetic basis of a complex phenotype. Using both transposon insertion and overexpression libraries, we surveyed the adaptive potential of all genetic loci with respect to ethanol tolerance. The fitness consequences of transposon insertion events were more pronounced than those of overexpression. This is not surprising given the nature of these perturbations. In many cases, such as that of core enzymes in the TCA cycle, the overexpression of a single gene has little effect on the output of the pathway as a whole. Nevertheless, we observed three conditions in which overexpression perturbations result in a pronounced fitness effect: (1) overexpression of key regulatory components (e.g. cadB in acid stress response), (2) the upregulation of a key enzyme in the pathway (e.g. slt in cell-wall biogenesis), or (3) the simultaneous overexpression of multiple genes in the same pathway (e.g. bet regulon). Our ability to observe the last of these is a consequence of the size of the cloned fragments in the overexpression library and the cistronic structure of bacterial genomes, in which all the genes in a small pathway exist together as part of a single operon.
Having measured the fitness consequences of both transposon insertions and overexpressions, we used a modular analysis of the fitness profiles to identify the relevant underlying pathways. For example, transposon insertions in the envelope stress response genes cause a slight decrease in fitness that is not significant enough for these genes individually to pass our gene-level statistical threshold. In a modular analysis, however, the significance of this pathway can be detected as the collective effect of all these genes (Goodarzi et al, 2009a).
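The idea that a pathway can be significant even when none of its genes is can be illustrated with a minimal sketch. This is not the modular analysis used in this study (Goodarzi et al, 2009a); it substitutes Stouffer's method for combining z-scores, which assumes the gene-level scores are independent. The gene names, z-scores, and the threshold `ALPHA` are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical gene-level fitness z-scores for one pathway (made-up values;
# negative = decreased fitness on transposon insertion).
gene_z = {
    "gene_1": -1.2,
    "gene_2": -1.0,
    "gene_3": -1.4,
    "gene_4": -0.9,
    "gene_5": -1.3,
    "gene_6": -1.1,
}

ALPHA = 0.01  # hypothetical significance threshold, not the one used in the study


def one_sided_p(z):
    """Lower-tail p-value of a standard normal z-score (tests for decreased fitness)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def stouffer_z(z_scores):
    """Combine independent z-scores into a single pathway-level z-score."""
    return sum(z_scores) / math.sqrt(len(z_scores))


# Gene level: test each gene on its own.
significant_genes = [g for g, z in gene_z.items() if one_sided_p(z) < ALPHA]

# Pathway level: test the collective effect of all genes in the pathway.
pathway_z = stouffer_z(list(gene_z.values()))
pathway_p = one_sided_p(pathway_z)

print("genes passing threshold:", significant_genes)
print("pathway passes threshold:", pathway_p < ALPHA)
```

With these illustrative numbers no single gene passes the threshold, whereas the combined pathway score does, because consistent small effects add up faster than their noise.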
Our study revealed many pathways and processes that collectively contribute to ethanol tolerance in E. coli. We found modifications to endogenous pathways (e.g. upregulation of osmoprotectants and suppression of acid stress response pathway) or metabolic reprogramming to boost ethanol degradation capacity as potential mechanisms for adaptive ethanol tolerance. Our results argue for the dominance of regulatory network perturbations in adaptation to extreme environments. The fitness contribution of genes regulated by a range of transcription factors such as betI, and hns signifies the adaptive potential of regulatory perturbations. This is to be contrasted with a model in which subtle amino-acid modifications in effector proteins are the dominant contributors to adaptation. Discovering adaptive mutations in different environments would ultimately test this hypothesis; nevertheless, we have previously catalogued adaptive mutations in two evolved strains: the ethanol-tolerant strain (HG179) and a strain (ASN*) capable of growing in minimal media plus asparagine eight times faster than the wild-type strain (Goodarzi et al, 2009b). In HG179, we found the major contributor to ethanol tolerance to be a point mutation in rho, the gene coding for the Rho transcription terminator (Goodarzi et al, 2009b). It has been shown earlier that Rho is a global regulator of gene expression and PrpC/D (propionate catabolic process) and CadA (acid stress response pathway) are among the proteins most affected by Rho inhibition (Cardinale et al, 2008). These proteins and their corresponding pathways were also identified as key players in ethanol tolerance in this study. Similarly, in ASN*, we discovered three adaptive mutations (an IS2 insertion, a single-nucleotide insertion, and a mismatch) that were all upstream of their respective ORFs, modifying their expression levels rather than their amino-acid sequence (Goodarzi et al, 2009b). Similar studies in other environments may further highlight the importance of regulatory perturbations in adaptation.
Using metabolomic approaches as a measure of the downstream effects of the adaptation process, we have shown that some of the pathways identified through our global genetic approach are also modified in strains evolved in the laboratory for enhanced ethanol tolerance, most notably biosynthesis of peptidoglycans, colanic acid, and enterobactin. Interestingly, neither of the evolved strains (HG227 and HG228) shows significant changes in glycine or glycine betaine levels (Supplementary Figure S9). The fact that the evolved strains did not show changes in all beneficial pathways is not surprising, as a single strain is unlikely to explore the entire fitness landscape on a relatively short evolutionary timescale. This emphasizes the importance of analyzing the evolution of complex traits through more systematic methods rather than simple strain selection under the desired condition.
Through stable-isotope labeling in the ethanol-tolerant strain, HG228, we observed a substantial boost in ethanol assimilation as compared with the wild-type strain (also see Supplementary Figure S10). As mentioned earlier, ethanol consumption has been associated with ethanol tolerance in bacteria with active ethanol degradation pathways (Heipieper et al, 2000). However, in the case of our laboratory-evolved E. coli strain, ethanol degradation capacity emerges as part of the adaptation process, through regulatory and metabolic rewiring. Moreover, the anti-correlation between ethanol tolerance and ethanol production has been noted earlier in yeast: typically ethanol-tolerant strains are poor ethanol producers and vice versa (del Castillo Agudo, 1985). If ethanol degradation is a mechanism for tolerance, selecting for this phenotype results in an adaptive metabolic rewiring, which maximizes ethanol degradation (i.e. enhancing the reactions that deplete acetyl-CoA) rather than ethanol production.
In this study, we have introduced a framework based on coarse-grained sampling of the fitness landscape followed by a modular analysis to identify pathways that contribute to the emergence of complex adaptations. Given that we are directly assaying fitness, the identified pathways are either directly responsible for the observed effects (e.g. osmoregulation in ethanol tolerance), or function as part of an emerging pathway (e.g. adhE activity in ethanol degradation, in contrast with its normal function as an ethanol-producing enzyme). In parallel, we have used metabolomic approaches to probe the status of the identified pathways in the evolved strains. Validating the function of these pathways in the laboratory-evolved strains highlights the biological relevance of our approach and its ability to reveal the actual genetic mechanisms used during the evolutionary process.