Gross chromosomal rearrangements (GCRs), or changes in chromosome structure, play central roles in evolution and are central to cancer formation and progression. GCRs underlie copy number variation (CNV), and therefore genomic disorders that stem from CNV. We study amplification in Escherichia coli as a model system to understand mechanisms and circumstances of GCR formation. Here, we summarize observations that led us to postulate that GCR occurs by a replicative mechanism as part of activated stress responses. We report that we do not find RecA to be downregulated by stress on a population basis and that constitutive expression of RecA does not inhibit amplification as would be expected if downregulation of RecA made cells permissive for nonhomologous recombination. Strains deleted for the genes for three proteins that inhibit RecA activity, psiB, dinI, and recX, all show unaltered amplification, suggesting that if they do downregulate RecA indirectly, it does not promote amplification.
Gross chromosomal rearrangements (GCRs) underlie many aspects of evolution, from short-term modulation of levels of gene expression to the formation of new functions by shuffling exons or domains from other genes, reassorting genes and regulatory elements, and providing genetic redundancy that allows progressive change in sequence. The discovery that GCR can occur as a component of stress responses1, 2 opens a wealth of possibilities for processes of adaptive evolution occurring specifically when organisms are not well adapted to their environment, as, for example, during times of changing environment. Stress-inducible genomic instability has been described in several bacterial systems, in yeast, and in human cell cultures, displaying slightly different characteristics in different assay systems, but sharing requirements for activation of stress responses that activate genome-instability pathways (for review, see Refs. 3–5). GCRs also underlie genomic disorders (disease syndromes that stem from variation in gene copy number), and the origin of some and progression of many cancers.6 Thus, the mechanisms that underlie GCR are of broad interest in both basic science and human health.
We have studied starving Escherichia coli using the Lac assay,7 a model system in which gene amplification, a GCR, occurs in response to the stress of starvation in the presence of a potential carbon source that the cells are unable to use. Amplification in this assay depends on the activation of two stress responses2, 8. This report outlines the molecular processes that we have found to be involved, and the molecular mechanisms that these data suggest. We include data that exclude some possible explanations as to why cells undergo nonhomologous instead of homologous recombination during stress. The replication-based molecular mechanisms that we have proposed find support in observation of GCR in other organisms, notably in human copy number variation (CNV) encountered in the genetics clinic.
In the E. coli Lac assay,7 a +1 frameshift mutation in a lacI fusion gene on an F′ plasmid mutates to Lac+ during prolonged starvation on lactose minimal medium. The Lac+ colonies carry either a compensating frameshift mutation9, 10 or a tandem array of 20 or more copies of the weakly functional lac allele, which confers sufficient β-galactosidase activity for growth.1 The processes of frameshift (“point”) mutation and amplification differ in their genetic requirements, and thus represent alternative strategies that allow escape from starvation.
Formation of point mutations differs from mutation formation in growing cells in its requirement for the proteins of homologous recombinational (HR) double-strand-end (DSE) repair (RecA, RecBC and RuvABC),11–13 the SOS DNA-damage response,14 error-prone DNA polymerase (Pol) IV/DinB,15, 16 the RpoS (σS) general-stress response,2, 17 and periplasmic-stress-response controlled by σE.8 Point mutation formation also requires the F-encoded TraI endonuclease or I-SceI-endonuclease-induced DSEs near lac,18 though recent work shows that they also occur at spontaneous DSEs in chromosomes of F− E. coli.19 TraI appears to provide ssDNA nicks that become DSEs by replication fork collapse. Hence, point mutations are thought to arise via DinB/Pol IV errors during DNA replication reinitiated by HR at collapsed replication forks.18, 19
Lac+ colonies also arise in the Lac assay by gene amplification. Amplification, too, is an adaptive change that occurs after starvation has begun.1 Amplification of the leaky lac allele gives Lac+ colonies because 20 or more copies of the frameshift mutant gene provide sufficient β-galactosidase activity to allow growth on lactose medium. lac-amplified colonies are distinguished from point mutants by their instability, seen as blue and white sectoring on rich medium with X-gal, because amplification breaks down by HR between the repeats.20, 21 The repeat units (amplicons) are direct repeats of ~7 to ≥40 kb joined by microhomology junctions of 2 to 15 base pairs.22, 23
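As an illustration of what a microhomology junction means in sequence terms, the following minimal sketch measures the run of identical bases shared by the two breakpoints of a novel junction. It is not taken from the cited work: the function name `junction_microhomology`, the coordinate convention, and the toy reference sequence are all assumptions made for this example.

```python
def junction_microhomology(ref: str, left_end: int, right_start: int) -> int:
    """Length of microhomology at a junction joining ref[:left_end] to ref[right_start:].

    Hypothetical helper for illustration only. It counts the bases that are
    identical immediately before both breakpoints plus the bases that are
    identical immediately after both breakpoints, because the junction could
    have been placed anywhere within that shared run.
    """
    n = len(ref)
    if not (0 <= left_end <= n and 0 <= right_start <= n):
        raise ValueError("breakpoints must lie within the reference")
    if left_end == right_start:
        raise ValueError("identical breakpoints do not form a novel junction")

    # Extend leftwards from both breakpoints while the bases agree.
    back = 0
    while (left_end - back > 0 and right_start - back > 0
           and ref[left_end - back - 1] == ref[right_start - back - 1]):
        back += 1

    # Extend rightwards from both breakpoints while the bases agree.
    fwd = 0
    while (left_end + fwd < n and right_start + fwd < n
           and ref[left_end + fwd] == ref[right_start + fwd]):
        fwd += 1

    return back + fwd


# Toy reference (assumed): "CATGG" occurs at positions 1-5 and 13-17.
TOY_REF = "CCATGGACTTAGGCATGGTCAA"

# A direct repeat formed by joining position 18 back to position 6
# duplicates the segment TOY_REF[6:18].
mh = junction_microhomology(TOY_REF, left_end=18, right_start=6)
print(mh)
```

In this toy case the two breakpoints share five bases, which falls inside the 2 to 15 base-pair range reported for amplicon junctions.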
Like point mutation, amplification requires activation of the RpoS and RpoE stress responses2, 8 and the proteins of DSE repair (which might reflect a requirement for HR in expanding a duplication into an amplified array by unequal crossing-over).23 However, amplification also requires DNA Pol I, which is not required for stress-induced point mutation.23, 24 Amplification is enhanced by mutation in xonA (which encodes the major single-strand 3′-exonuclease ExoI),23 and by providing DSEs with an endogenously expressed I-SceI endonuclease,18 implying that 3′ ends at DSBs are intermediates in amplification. Amplification does not require DinB or the SOS response.15
We proposed that amplification is initiated by template switching during repair of collapsed replication forks in cells unable to use HR,23, 25 a model that is currently important in human CNV work (e.g.,26–28). This model is described below.
The elevated rate of amplification seen in ExoI-defective mutants implies that 3′ single-strand DNA ends are intermediates in the amplification process, and that these ends are frequently removed by ExoI so that amplification is inhibited. Because Pol I functions in excision repair processes and in lagging-strand processing during replication, we tested for a requirement for excision repair in amplification. We find that nucleotide excision repair and base excision repair are not required.23 Neither is mismatch repair (Fig. 1). Thus, the requirement for Pol I implies that the events happen during DNA replication at the replication fork. The presence of microhomology at the novel junctions of amplicons indicates that the initial event involved nonhomologous recombination, because the length of microhomology is too short to allow RecA-mediated homologous recombination.29
An event involving single-strand 3′ DNA-ends at replication forks suggests that the novel junctions are formed by polymerase slippage or template switching during DNA replication. Template switching, as previously described, occurs within the boundaries of a single replication fork. However, amplicons in the Lac assay system average about 20 kb in length,23 presumably much too long to have arisen by template switching within a single replication fork. We initially suggested that template switching occurred between replication forks: the long-distance template switch model.23 This model was encouraged by the observation that some amplification events were complex, having sequence from nearby regions inserted into the junction in either orientation.23 Since then, the same observation has been made for human CNVs,30 suggesting a common mechanism.23, 30
An extensive analysis of the structure of amplicons in the Lac assay by comparative genomic hybridization characterizes this complexity in more detail.31 It also provides evidence of genome-wide instability, witnessed by a significant number of GCR events other than lac amplification occurring in the same cells that had experienced amplification, when compared with cells that have not been stressed, as well as with cells in the same stressed population but in which lac was not found to be amplified. Genome-wide instability is predicted if amplification occurs as a consequence of a stress-response, because the response is a cell-wide phenomenon. We found evidence of genome-wide instability, but only in those cells that also carried amplification, confirming that a cell-wide physiological change underlies stress-induced amplification. At the same time, because only cells in which lac is amplified show additional GCRs, we infer that amplification occurs in a subpopulation of cells that is differentiated to be permissive for nonhomologous recombination.31
About 15% of amplification events were complex, with a mixture of direct and inverted insertions.31 We now interpret this complexity as a product of break-induced replication (BIR), a process by which collapsed (broken) replication forks are repaired and restarted.25, 32 BIR has been shown in yeast to involve repeated rounds of extension by replication of a DSE followed by separation from the template and then reinvasion of the DNA end into new template DNA and priming from replication. This happens several times before a fully processive replication fork is established.33 However, BIR is an HR-mediated process, whereas amplification in the Lac system occurs at microhomologous positions. We have therefore suggested that, in stressed cells, HR is not available, and instead, BIR occurs by annealing of the 3′ DNA-end with any nearby single-stranded DNA with which it shares microhomology.25, 32 This model is discussed below.
We suggest that GCRs are formed by a modified BIR process (Fig. 2).25, 32 Because BIR is a precisely homologous process mediated by RecA (or its orthologue Rad51 in yeast),34, 35 we postulated that a failure to form homologous junctions in amplification could be due to an insufficiency of RecA activity in starving cells. Alternatively, it might result from cells having no sister chromosome at the time of repair, as is expected in 60% of stationary-phase cells.36 To repair a collapsed fork without RecA or homology, one would need to use annealing of ssDNA. One strand is provided by the 3′-end from the processed DNA end at the site of fork collapse. We postulate that the ssDNA end pairs with any other ssDNA nearby. ssDNA is likely to occur, for example, at replication forks, excision repair sites, R-loops formed at sites of transcription, and at secondary structures in DNA, such as hairpins and G-quartet DNA (see Maizels, this volume). Because the homology requirements for strand annealing are so low, the 3′ end could switch templates to almost any exposed ssDNA, and thus cause GCRs joined by microhomology such as those that we see in amplification.23
Hence, the microhomology-mediated break-induced replication (MMBIR) model (Fig. 2) suggests that, when a replication fork collapses in a cell under stress, replication will be restarted by a modified BIR process in which HR functions are unavailable. Resection (exonuclease digestion) of the 5′-DNA end at the break produces a 3′-overhang (Fig. 2C). This 3′-end will anneal with any single-stranded DNA in physical proximity (Fig. 2E) and then prime synthesis (Fig. 2F). This annealing reaction has very low homology requirements, so that replication recommences (Fig. 2F) at a sequence that shows only microhomology and almost any ssDNA sequence will be able to take part. This sequence can be in a region already replicated, producing a duplication, or downstream of where the fork collapsed, leading to deletion. Because the available ssDNA will often be a lagging-strand template, inversion will be frequent. As in BIR, initial synthesis is of low processivity, and the extended end will separate from the template (Fig. 2G). The process must then be repeated to produce a completed viable chromosome, so that complexity at the joints will occur in the form of inserted sequence, often from nearby, in either orientation. After a few such repeated events, fully processive replication is established, and continues to the end of the replicon. The final replication would usually be in direct orientation to obtain a viable product.
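The template-switch step of the MMBIR model can be illustrated with a toy sketch. The code below is not from the original work: the function names (`find_microhomology_sites`, `template_switch`, `classify_outcome`), the example sequence, and the minimum microhomology length are assumptions chosen for illustration. It treats one strand of the chromosome as a string, takes the last few bases of the 3′ end at the site of fork collapse, finds other positions that carry the same short sequence, and resumes copying from one of them. It deliberately ignores inverted templates and the repeated rounds of disengagement and reinvasion described above.

```python
def find_microhomology_sites(genome: str, break_pos: int, mh_len: int) -> list[int]:
    """Positions p (other than break_pos) where genome[p - mh_len:p] matches
    the last mh_len bases of the 3' end, genome[break_pos - mh_len:break_pos].

    Hypothetical helper: a stand-in for annealing of the 3' end to any
    exposed ssDNA sharing microhomology.
    """
    if mh_len <= 0 or break_pos < mh_len or break_pos > len(genome):
        raise ValueError("need 0 < mh_len <= break_pos <= len(genome)")
    three_prime_end = genome[break_pos - mh_len:break_pos]
    return [
        p
        for p in range(mh_len, len(genome) + 1)
        if p != break_pos and genome[p - mh_len:p] == three_prime_end
    ]


def template_switch(genome: str, break_pos: int, restart_pos: int) -> str:
    """Copy the genome up to the collapse site, then resume at restart_pos."""
    return genome[:break_pos] + genome[restart_pos:]


def classify_outcome(break_pos: int, restart_pos: int) -> str:
    """Restarting in already-replicated sequence duplicates it;
    restarting downstream of the collapse deletes the skipped segment."""
    return "duplication" if restart_pos < break_pos else "deletion"


# Toy sequence (assumed): "GC" ends at positions 4, 14 and 20.
GENOME = "TTGCAACCGGTTGCATATGCCC"
BREAK_POS = 14  # fork collapses after the "GC" at positions 12-13

sites = find_microhomology_sites(GENOME, BREAK_POS, mh_len=2)
for site in sites:
    product = template_switch(GENOME, BREAK_POS, site)
    print(site, classify_outcome(BREAK_POS, site), len(product) - len(GENOME), product)
```

In this toy case the upstream site yields a product 10 bases longer than the starting sequence (a direct duplication), and the downstream site yields one 6 bases shorter (a deletion), mirroring the two outcomes the model predicts for annealing behind or ahead of the collapsed fork.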
Other mechanisms for GCR besides template switching during replication must be considered. The most likely candidate is nonhomologous end-joining (NHEJ) (reviewed in Refs. 37, 38), and NHEJ has frequently been cited as a probable mechanism for human GCR. There is sometimes microhomology at junctions formed by NHEJ, and there is also a tendency for fragments of sequence from elsewhere to be incorporated into the junction. Additionally, insertions or deletions of one or two base pairs occur at some of the junctions, and we have observed this in E. coli.23 However, E. coli lacks the necessary proteins for NHEJ,39 and yet the properties of GCR in E. coli and in humans are very similar, suggesting a common mechanism that is not NHEJ. Another argument against NHEJ is that the sequences inserted in both E. coli and humans generally reflect sequence from the immediate genomic region, rather than random fragments from elsewhere in the cell. Moreover, the requirement for Pol I23, 24 implies a replicative mechanism. Evidence for a replicative mechanism is also found in other recombination assays in E. coli (reviewed by Ref. 40) and yeast.41
Because BIR is so efficient and accurate, there is a puzzle as to why nonhomologous recombination (NHR) is allowed to occur, because the homology requirement for BIR would be expected to exclude nonhomologous and microhomologous events. If RecA and sequence homology were present, RecA would catalyze invasion of the homologous sequence and repair would be accurate. We suggest that RecA activity and/or homology is limiting in these starving cells. RecA/Rad51 could be downregulated by stress, as has been demonstrated in human cancer cell lines under hypoxic stress,42, 43 in which HR is reduced and replaced by NHR. A switch from HR to NHR also occurs in Drosophila heterozygous for mutation in a Rad51 homologue, showing again that NHR happens when RecA/Rad51 activity is limiting.44 However, we find no evidence of downregulation of RecA in starved E. coli (Fig. 3A), and a strain with high constitutive expression of RecA is unaltered for amplification (Fig. 3B). It is also possible that there is not enough ATP for RecA activity in starving cells. Because, as discussed above, amplification occurs in a differentiated subpopulation of cells,31 we must consider the possibility that the change that makes cells permissive for NHR applies only to that subpopulation. The Western blots presented here examine only the whole population, and so could miss subpopulation-specific downregulation of RecA.
Another possibility for why repair might not use homologous recombination could be stress-response control of proteins that inhibit RecA activity, so that control of RecA activity would be indirect. However, as shown in Fig. 3C, three inhibitors of RecA activity, PsiB, DinI and RecX (reviewed by Ref. 45), are not required for amplification. These deletion experiments are not subject to concerns about subpopulations. Other possibilities are the absence of a sister molecule with which to interact (expected in 60% of cells36) and downregulation of DNA Pol III, allowing access of Pol I,46 which is more conducive to GCR.
There is now extensive evidence supporting the idea that aberrant DNA replication underlies much GCR. Previous research in E. coli, yeast, and humans suggests that GCRs can be formed through the process of restarting collapsed replication forks by BIR. However, any BIR process that generates amplification in the Lac assay, and GCR in many other systems, does not use homologous recombination involving RecA/Rad51-mediated invasion of homologous duplex DNA to create novel junctions and duplications, because the microhomology at the junctions is too short for HR. HR is necessary to expand a duplication into an amplified array, which probably underlies the requirement for HR proteins in amplification.23 We suggest that, instead of RecA-mediated invasion, novel junction formation involves annealing of single-stranded DNA at the broken end with any other single-stranded DNA nearby at sites of microhomology.23, 25 A key question is how stress changes cellular physiology such that nonhomologous events occur in situations where normally there would be HR; for example, collapsed replication forks usually are repaired using homologous sequences in the sister DNA molecule. We present data showing that, unlike human cells, E. coli in our conditions does not downregulate RecA/Rad51 under stress, at least when viewed on a whole-population basis. Attempts to find indirect regulation of RecA activity under stress have not yielded an answer. Other ways to limit HR, for example by the unavailability of homologous sequence, are under consideration.
Chromosomal rearrangement is an evolutionary engine. It creates new regulatory circuits and new genes derived by reassortment of existing modules, expands the genome in a way that allows sequence diversity to evolve, and varies gene copy number, which can change expression levels. The discovery that stress responses promote amplification implies that the capacity of organisms to evolve can be increased specifically when they are maladapted to their environment, that is, when stressed. This contrasts with early ideas of the neoDarwinian “modern synthesis” about constant and gradual genetic changes underlying evolution,47 and instead implies feedback and responsiveness of the generation of genomic diversity to the environment. Further, that such accelerated genome evolution happens in only a subpopulation of cells suggests that the apparent danger of reshuffling a genome during stress could be mitigated by differentiating a small subpopulation in which to run the potentially dangerous experiment. Both stress-induced GCR and stress-induced point mutation3 may, on a microscopic scale, contribute to the macroscopic phenomenon of evolution seeming to occur in bursts.48
This work was supported by NIH Grants R01 GM53158 to SMR and by R01 GM64022 to PJH.
Conflicts of interest
The authors declare no conflicts of interest.