Despite the importance of evolutionary ideas in every aspect of biology, there has been relatively little direct experimental data describing the processes and mechanisms that underlie evolution. Only recently, through rapidly advancing genome technology, has it become practical to study directly the genetic basis of evolutionary change in an experimental setting. In this study, we analyzed experimental evolution in chemostats with DNA microarray technology, to assess genome-wide variation in gene expression and DNA copy number, and with a practical and affordable method for detecting single-nucleotide changes relative to the sequences of our starting yeast strains. We used these tools to begin to understand the phenotypic and genetic changes characteristic of the evolution of yeast in response to consistent glucose, sulfate, and phosphate limitation in the chemostat.
The main findings of our study concern the nature, identity and dynamics of the mutations that occur over the course of these evolution experiments. These mutations confer a selective advantage ranging from ~5% to as much as 50% per generation. One prevalent class of mutations consists of massive structural genomic alterations, consistent with our earlier observations
[26]. The adaptive advantage of a subset of these is readily explained by the fact that amplified genomic segments contain genes encoding transporters that are specific to the applied nutrient limitation. The prevalence of these mutations in adapted populations, the repeatability of their occurrence in independent populations, and the experimental demonstration of their fitness advantage argue for a central role of gene amplification in adaptation to nutrient limitation.
The majority of structural variation, however, is found in other regions of the genome. The reasons for the selective advantage of these variants are less clear, but their repeated observation points to their adaptive value. Interestingly, a recently reported mutation accumulation experiment in yeast also revealed significant aneuploidy
[43] despite the fact that aneuploidies typically cause growth defects
[44]. In our experiments a large fraction of these events are clearly associated with the repetitive sequences found in retrotransposons, which are themselves active during these evolution experiments. However, retrotransposon sequences are not necessary for the generation of structural variation, as illustrated by the repeated but diverse amplifications of the
SUL1 locus. These results point to a structural plasticity of the yeast genome, operating at the supragenic and genic
[45] levels, that facilitates adaptation.
Whole-genome resequencing using tiling microarrays and Sanger sequencing revealed that only a small number of point mutations accumulate during these experiments. We estimate that we have found >85% of these mutations, and as new technologies develop, we hope eventually to detect all mutations, including those in difficult repetitive sequences. The number of mutations we identified is consistent with other microbial experimental evolution studies that have attempted to comprehensively identify mutations. Using a combination of microarray- and mass spectrometry-based sequencing, Herring et al.
[5] found a total of 13 mutations in five
E. coli populations propagated for ~660 generations using serial transfer. Pyrosequencing a “cooperator” strain of
Myxococcus xanthus that appeared after 1,000 generations of selection identified 15 point mutations
[4]. Thus, it would appear that adaptation of microbes proceeds without the requirement for mutator phenotypes in these experiments.
The relative merits of haploid and diploid states with respect to adaptation have been hotly contested
[6],
[40],
[46],
[47]. We find an enrichment for gross chromosomal rearrangements in diploid cells as compared with haploid cells, possibly reflecting more deleterious effects of chromosomal rearrangements in haploids
[48]. Although we did not explicitly test the rate of adaptation in haploid and diploid cells, our results highlight an underappreciated mechanism by which recessive alleles can be important for adaptation of diploid organisms: namely, through homozygosing via gene conversion or chromosomal aneuploidy. We did not observe any striking differences in adaptation related to mating type or strain background, although it is noteworthy that in all clones from the CEN.PK strain background we identified a
HXT6/7 amplification whereas this was found in only one of eight glucose-limitation adapted clones in the S288c background. Recent analyses have indicated that CEN.PK is a mosaic of S288c genome background and divergent sequence
[49]. The genomic region containing
HXT6/7 is identical in S288c and CEN.PK, indicating that the observed difference in amplification frequency is not due to sequence differences at this locus.
We inferred that the batch phase of growth has a large effect on the parallelism of evolutionary paths in our experiments. During the initial batch phase of growth, the population size doubles every generation, which tightly constrains the time at which mutations occur and means that beneficial mutations are very unlikely to be lost by genetic drift. For example, if there is a class of mutations with a total mutation rate such that the first such mutation occurs, on average, after 13 generations, it will almost always occur sometime between generations 10 and 16. However, when a mutation occurs later than average, it will be present at much lower frequency at the end of batch phase, and will hence take substantially longer to spread through the population. The length of this delay depends dramatically on the fitness effect of the mutation: a mutation providing a 50% fitness advantage that occurs 3 generations later than average in batch phase will take 3 extra generations to reach a population frequency of 5%; a mutation of 10% advantage will take 20 extra generations; and a mutation of 1% advantage will take an extra 200 generations.
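The arithmetic behind these delays can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions that are not taken from the study itself: a mutation arising k generations late in batch phase starts 2^k-fold rarer, a rare beneficial lineage then increases in frequency roughly as exp(s·t), and the cumulative expected number of mutations doubles each batch generation with Poisson arrivals. The function names are hypothetical.

```python
import math


def extra_generations(delay_generations: float, s: float) -> float:
    """Extra generations of selection needed to make up for a late start.

    Assumption: a mutation arising `delay_generations` later in batch phase
    starts 2**delay_generations times rarer, and while rare its frequency
    grows roughly as exp(s * t) once selection acts.
    """
    fold_deficit = 2.0 ** delay_generations
    return math.log(fold_deficit) / s


def prob_first_mutation_between(mean_generation: float,
                                start: float, end: float) -> float:
    """Probability that the first mutation arises between two generations.

    Assumption: the cumulative expected number of mutations doubles every
    batch generation and equals 1 at `mean_generation`; arrivals are Poisson.
    """
    def expected_by(generation: float) -> float:
        return 2.0 ** (generation - mean_generation)

    # P(no mutation by `start`) minus P(no mutation by `end`).
    return math.exp(-expected_by(start)) - math.exp(-expected_by(end))


if __name__ == "__main__":
    for s in (0.5, 0.1, 0.01):
        delay = extra_generations(3, s)
        print(f"s = {s:.2f}: about {delay:.1f} extra generations")
    window = prob_first_mutation_between(13, 10, 16)
    print(f"P(first mutation in generations 10-16) = {window:.2f}")
```

Under these assumptions a 3-generation delay costs roughly 4, 21 and 208 generations for advantages of 50%, 10% and 1%, the same order as the figures quoted above, and about 88% of first occurrences fall between generations 10 and 16. The exact values depend on how the selective advantage is defined and on the target frequency, so the sketch should be read as an order-of-magnitude check rather than a reproduction of the calculation used in the text.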
It follows that if a beneficial mutation that provides a large fitness advantage (of order 50%) occurs at a high enough rate to happen during the batch phase, it will almost always reach a substantial frequency within the population by the end of our experiment. On the other hand, a beneficial mutation of small fitness effect (of order 1%) will typically not do so unless it happens to occur very early in the batch phase (and even in this case, if a larger-effect mutation occurs much later, the larger-effect mutation can reach high frequency more quickly). A mutation with effect of order 10% is an intermediate case; it will reach substantial frequency within the population by the end of our experiment only if it occurs early enough in batch phase. However, if mutations of roughly this effect occur at a rate of 10^-7 or more, at least one such mutation will almost always occur early enough in batch phase to be observed at substantial frequency by the end of our experiment. In this case, the mutation that happens to occur first will typically be the one we observe. In other words, if beneficial mutations of large effect (of order 50%) are sufficiently common that they occur during batch phase, they will almost always occur and take over regardless of what smaller-effect mutations are already present. On the other hand, if large-effect mutations are rare, then the first beneficial mutation of intermediate effect (of order 10%) will typically dominate, because later mutations (even of slightly larger fitness effect) will be at a large initial numerical disadvantage.
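The contrast between these two regimes can be made concrete with a small deterministic sketch. It assumes, for illustration only, that two rare beneficial lineages each increase as x·exp(s·t) relative to the ancestor, and it asks how long a later, fitter lineage needs to overtake an earlier one. The starting frequencies used below are hypothetical and are not measurements from our cultures.

```python
import math


def crossover_generations(x_early: float, s_early: float,
                          x_late: float, s_late: float) -> float:
    """Generations until a later, fitter lineage overtakes an earlier one.

    Assumption: both lineages are rare and grow as x * exp(s * t) relative
    to the ancestor, so they are equally abundant when
    x_early * exp(s_early * t) == x_late * exp(s_late * t).
    """
    if s_late <= s_early:
        raise ValueError("the later lineage must have the larger advantage")
    if x_late >= x_early:
        return 0.0  # the later lineage is already at least as abundant
    return math.log(x_early / x_late) / (s_late - s_early)


if __name__ == "__main__":
    # Hypothetical starting frequencies: the later lineage is 1000-fold rarer.
    large = crossover_generations(1e-4, 0.10, 1e-7, 0.50)
    slight = crossover_generations(1e-4, 0.10, 1e-7, 0.12)
    print(f"50% vs 10% advantage: overtakes after ~{large:.0f} generations")
    print(f"12% vs 10% advantage: overtakes after ~{slight:.0f} generations")
```

With a 1000-fold numerical deficit, a lineage with a 50% advantage catches a 10% lineage within a few tens of generations, whereas a lineage with a 12% advantage needs several hundred. The sketch ignores drift, further mutations and the approach to fixation, so it only indicates the scale of the effect.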
These dynamics appear to explain the extent of parallelism we observe between populations. In the sulfate-limited evolution, there is a class of SUL1 amplifications that provide a very large selective advantage (of order 50%). Given this large selective advantage, it is unsurprising that these mutations are observed in almost all of our cultures. This does not necessarily imply that there is only one adaptive pathway in sulfate-limited conditions; it could be the case that there are multiple other mutations that provide alternative ways to adapt to sulfate limitation, but that each of these provides a selective advantage of only order 10% and hence is always eliminated by clonal interference, or is present at much lower frequencies (as is the case for clone S1c1, the only sulfate-limitation adapted clone without a SUL1 amplification in our study). On the other hand, the phosphate- and glucose-limited populations exhibited a broader range of evolutionary responses. In these conditions, we infer that there is no large-effect mutation that is readily accessible, so the result is more dependent on which of the various mutations that confer fitness advantages of order 10% happens to occur first. The fact that we commonly observe certain genomic amplifications (e.g. HXT6/7) in these populations suggests that they occur with a total rate comparable to that of the adaptive point mutations, and confer a similar selective advantage. This argument is consistent with an earlier report that the relative selective advantage of an HXT6/7 amplification in glucose-limited chemostats is 1.094 (Brown et al., 1998).
Our observations point to an important principle for adaptive phenomena in natural populations and disease: the diversity of adaptive outcomes will vary as a function of the distribution of fitness effects of beneficial mutations, which differs dramatically depending on the selective pressure. If there is a single “solution” that confers a vastly greater selective advantage, that path will be repeatedly observed. Conversely, a diversity of equally beneficial “solutions” will reduce the reproducibility of adaptation. An illustrative example of this principle is the recent report of selection for resistance of lung cancer cells to gefitinib or erlotinib using long-term culturing of cells in the presence of the drugs
[50]. Resistance in one-quarter of the specimens could be attributed to amplification of the oncogene
MET, implying that alternative routes to resistance must exist that confer comparable fitness advantages to these tumor cells. In both our microbial system and cancer studies, it would be interesting to block known routes to adaptation in order to enrich for unknown alternative adaptive paths that may confer smaller fitness advantages.
Our experiments have identified the outcomes of adaptation to defined environments and revealed the diversity of genomic variation in clonal representatives of adapted populations. Our findings suggest a number of questions that should be addressed. First, it is critical to determine a neutral mutation rate for genome amplifications, deletions and rearrangements as well as the neutral mutation rate for retrotransposition. A recent report has provided new insights into these rates, indicating that large genome events occur at much greater frequencies than nucleotide changes
[43]. Cells grown in a chemostat grow at a much slower rate than cells grown at maximal rates in batch cultures, the condition under which mutation rates are typically determined. Evidence in bacteria suggests that single base pair mutation rates are increased under slow growth conditions
[51] and in stationary phase
[52]. Therefore, examining the rates at which all classes of genomic variants are generated in chemostat cultures will be informative for interpreting future experiments. Second, extending the duration of selection experiments will shed light on the role of subsequent diversity generated during the continuous phase of growth. It will be of great interest to test whether population diversity increases or becomes increasingly constrained as selection continues. Third, varying population size can be expected to have profound effects on the dynamics of adaptation, and can influence the degree of parallelism between independent cultures
[9],
[10],
[53]. This parameter and others such as growth rate and the complexity of the selective pressures will be fertile areas of investigation. Finally, the mechanisms by which these mutations increase fitness and change gene expression will give insight into the functions of these genes and the cellular systems in which they act.