Activation-induced deaminase (AID) is a B-cell-specific enzyme required for initiating the mechanisms of affinity maturation and isotype switching of antibodies. AID functions by deaminating cytosine to uracil in DNA, which initiates a cascade of events resulting in mutations and strand breaks in the immunoglobulin loci. There is an intricate interplay between faithful DNA repair and mutagenic DNA repair during somatic hypermutation, in that some proteins from accurate repair pathways are also involved in mutagenesis. One factor that shifts the balance from faithful to mutagenic repair is the genomic sequence of the switch regions. Indeed, the sequence of the switch μ region is designed to maximize AID access to increase the abundance of clustered dU bases. The frequency and proximity of these dU nucleotides in turn inhibit faithful repair and promote strand breaks.
Immunoglobulin genes are unique among eukaryotic genes in that they undergo high rates of somatic recombination and mutation in order to generate extremely diverse antibodies. Diversity is achieved at four molecular levels: (a) joining of variable (V), diversity (D), and joining (J) gene segments; (b) hypermutation of rearranged VDJ genes; (c) switching of heavy chain constant (C) genes; and (d) gene conversion of V genes. In detail, (a) joining takes place in pro- and pre-B cells, and is initiated by the recombination proteins RAG-1 and RAG-2. This level produces diversity in rearranged V genes, which are expressed as IgM molecules on naïve B cells. The next two levels happen after antigen stimulation of the B cells in the presence of T-helper cells or stimulation by mitogens. (b) Hypermutation occurs throughout the rearranged V region, and cells expressing mutated antibody receptors with higher affinity for antigen are intensely selected. Thus, the significance of mutation in the V gene is to generate antibodies with high affinity to antigens. (c) Hypermutation also occurs in the switch (S) region preceding each heavy chain C gene, resulting in DNA strand breaks which initiate class switch recombination. Switching of heavy chain classes from IgM to IgG, IgE, and IgA allows the mutated V gene to be associated with several C genes with different effector functions for optimal immune responses to pathogens. (d) Gene conversion of V genes is found in some species, notably chicken and rabbit. Conversion happens after joining, but before antigen stimulation, to diversify the primary repertoire. The activation-induced deaminase (AID) protein is needed for the last three levels of diversification [1–3]. Both mice and men deficient for AID have no hypermutation and no heavy chain class switching, and chicken cells deficient for AID have no gene conversion. 
AID, which is only expressed in activated B cells from germinal centers, is somehow targeted to the V and S regions of immunoglobulin genes. Two major questions remain unanswered in the hypermutation field: what is the mechanism and what proteins control targeting?
Neuberger and colleagues used genetic techniques to show that AID deaminates cytosine to uracil (U) in DNA [4–6], which cracked open the mechanism of mutation (Fig. 1). Uracils are thus ground zero, and, depending on how they are processed, will produce mutations or DNA strand breaks. AID deaminates cytosine on single-stranded DNA substrates in vitro and may be very active in S region DNA because the DNA can form stable secondary structures such as R loops. In vitro, AID has specificity for the WRC motif (W = A or T; R = purine). In vivo, we and others showed that the WGCW sequence, which is composed of overlapping WRC motifs on both strands, may be the entry point for AID to bind in the chromosome [9, 10]. AID has been reported to work processively on DNA, so that after it binds, it can move along DNA to generate mutations at cytosines in different sequence contexts. Deletions or mutations of AID indicate that the N-terminal end has deamination and mutation activity, and the C-terminal end is required for switching, perhaps by interacting with recombination proteins. AID is thus a potent mutator, and its product, a U:G base pair (bp), is handled by two different pathways, which probably occur equally frequently to generate mutations at all four bases.
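The motif definitions above can be made concrete with a short scan. The following is a minimal sketch, assuming only the IUPAC codes given in the text (W = A or T; R = A or G); the function names and the toy sequence are hypothetical and are not taken from any of the cited studies.

```python
import re

# Lookahead patterns so that overlapping hits are all reported.
WRC = re.compile(r"(?=([AT][AG]C))")     # W = A/T, R = A/G, then the target C
WGCW = re.compile(r"(?=([AT]GC[AT]))")   # overlapping WRC on both strands


def find_motifs(seq, pattern):
    """Return 0-based start positions of every (possibly overlapping) match."""
    return [m.start() for m in pattern.finditer(seq.upper())]


def reverse_complement(seq):
    """Return the sequence of the complementary strand, read 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Made-up sequence for illustration only.
toy = "TTAGCTGGAGCAACC"
wrc_top = find_motifs(toy, WRC)
wgcw_top = find_motifs(toy, WGCW)
wrc_bottom = find_motifs(reverse_complement(toy), WRC)
```

Because the reverse complement of WGCW is again WGCW, every WGCW hit carries one WRC motif on each strand, which is the property the text attributes to this hotspot.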
Uracils could lead to mutations at C:G bp in three ways. First, uracil could be left in the DNA and copied by any DNA polymerase, which inserts A opposite U to produce C to T transitions, or G to A transitions if C is deaminated on the complementary strand. In confirmation of this hypothesis, mice deficient for UNG have exclusively transitions of C:G in V regions. Second, uracil could be removed by UNG to produce an abasic site [5, 6, 12], and the abasic site could be copied by a low fidelity polymerase to produce transitions and transversions of C:G. In support of this mechanism, it was found that cells and mice deficient for the Rev1 polymerase have fewer C:G to G:C transversions in V genes [13, 14]. Third, uracil could be removed by UNG, and the abasic site would be nicked by apurinic/apyrimidinic endonuclease (APE) to generate double strand breaks for class switch recombination. Indeed, it was observed that mice deficient for APE have reduced class switching. Thus, these genetic mouse models support the biochemical pathway outlined in Fig. 1A.
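The distinction between transitions and transversions used above can be stated as a small rule: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and anything else is a transversion. The sketch below is illustrative only; the function name is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine).
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


# Copying an unrepaired uracil inserts A opposite U, so C is read as T.
replicated_uracil = classify_substitution("C", "T")
# Rev1-dependent events described in the text are C-to-G changes.
rev1_type = classify_substitution("C", "G")
```

Under this rule, the UNG-deficient phenotype described above corresponds to a spectrum at C:G in which every event falls in the first category.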
Mismatches can be generated before they are fixed as mutations after DNA replication. In eukaryotic mismatch repair, the heterodimer MSH2-MSH6 binds to single nucleotide mismatches, and MSH2-MSH3 binds to loops created by insertions, deletions, or mispairing on one strand. The heterodimer PMS2-MLH1 is then recruited, and PMS2 nicks the DNA. Exonuclease 1 binds to MSH2 and MLH1, and creates a gap at the nick to excise the misincorporated nucleotide from the newly-synthesized strand. This gap can then be filled in by a DNA polymerase. To see which of these proteins are involved in hypermutation, we and others have examined mice deficient for each of them. In wild type mice and mice deficient for MSH3, PMS2, and MLH1 [10, 16–22], the number of mutations at A:T bp and G:C bp is roughly equal, whereas in mice deficient for MSH2 and MSH6 [10, 17, 20, 22–24], mutations of A:T drop precipitously. Mice deficient for exonuclease 1 had a similar phenotype to MSH2 and MSH6 deficiency, that is, fewer mutations of A:T bp. This is consistent with the notion that MSH2-MSH6 binds to a single mismatch, and recruits exonuclease 1 to create a gap. The gap can then be filled in by a DNA polymerase that would synthesize predominantly mutations of A:T bp. Therefore, these mouse models confirm the biochemical pathway illustrated in Fig. 1B.
If U:G remains in the DNA, the uracil could be replicated by any high or low fidelity DNA polymerase, to produce C:G to T:A transitions. Low fidelity polymerases must then generate the C:G transversions and A:T mutations that are abundant in immunoglobulin genes. Eight polymerases have been studied for their role in this process, using mice deficient for the polymerases. Polymerases ι [26, 27], κ, λ, μ, θ [30, 31], and ζ are either not involved or play a minor role. Polymerase η [33–36] and Rev1 are definitely involved, based on the difference in the types of mutations that are generated in their absence. For polymerase η, we first reported that it is an A:T mutator after observing that the frequency of mutations of A and T dropped four-fold in V genes from patients with xeroderma pigmentosum disease, who lack polymerase η. This was confirmed in mice deficient for polymerase η [34, 35], where the frequency of A:T mutations was also diminished in non-coding regions around V genes. An analysis of the S regions in humans and mice deficient for the polymerase showed that polymerase η synthesizes mutations of A and T there as well [34, 35, 37, 38]. For Rev1, several groups have shown that it generates G:C to C:G transversions in cell lines and mice [13, 14], in accord with its activity as a deoxycytidyl transferase. However, other low fidelity polymerases must generate the G:C to T:A transversions, and the residual A:T mutations seen in the absence of polymerase η.
Mice deficient for MSH2, MSH6, exonuclease 1, and DNA polymerase η have the same hypermutation phenotype: fewer mutations of A:T. To see if they function in the same mutagenic pathway, we tested their association in biochemical experiments. We obtained the following evidence for interaction: the MSH2-MSH6 heterodimer binds to a U:G mismatch; MSH2 and polymerase η physically interact in cells; and MSH2-MSH6 stimulates the catalytic activity of polymerase η in vitro. These data indicate that the proteins work together during somatic hypermutation to generate mutations downstream of the initial uracil when polymerase η fills in a gap created by exonuclease 1. As shown in Fig. 1B, AID produces a U:G mismatch, MSH2-MSH6 binds to U:G, and exonuclease 1 and polymerase η are recruited to the site. As polymerase η synthesizes in the gap, it will generate mismatches opposite A and T. If MSH2-MSH6 repeatedly binds to the mismatches, polymerase η would be further stimulated to make more mutations downstream of the original C deamination. Therefore, the observed phenotypes of gene-deficient mice confirm the biochemical interactions of MSH2-MSH6 and polymerase η.
Humans without AID have IgM antibodies with germline sequences, and have an increased incidence of infections because they cannot make high affinity antibodies or switch to IgG or IgA. However, B cells pay a price for expressing AID, since it can also generate tumors. B cell tumors have been linked to both hypermutation run amok in bystander oncogenes, and nicking gone bad which triggers translocation of oncogenes [40–42]. Mice deficient for UNG have a high frequency of B cell tumors with age, indicating that if dU remains in immunoglobulin DNA, it will generate mutations and strand breaks, and can cause oncogenesis. Tumors were not detected until the mice were relatively old (greater than 18 months). The fact that they were B-cell lymphomas, and not tumors of other cell types, strongly implies that UNG normally modifies DNA in the immunoglobulin loci. Interestingly, when AID is overexpressed in transgenic mice, T-cell tumors arise, implying that aberrant AID expression initiates tumorigenesis in non-B cells by mis-targeting.
Mutations are localized to two distinct regions: ~two kilobases of DNA surrounding and including the VDJ gene, and ~four kb of DNA encompassing the Sμ region (Fig. 2) [45, 46]. These regions are downstream of promoters preceding V and S regions; thus, transcription may be involved in targeting the mechanism. Most recently, Neuberger and colleagues have identified a protein in the spliceosome, CTNNBL1, that interacts with AID and impacts mutation, switching and gene conversion. AID has also been reported to interact with RNA polymerase II. We propose that the V and S regions are independently targeted, an idea that is consistent with the observation that hypermutation can occur in the Sμ region in splenic B cells activated ex vivo in the absence of mutation in the nearby V region. In this review, we will focus on targeting to the Sμ region, which contains a well-described structural formation called R-loops.
S regions are found adjacent to C gene exons to mediate the recombinational switch from IgM to other isotypes. Structurally, the S regions are composed of repetitive sequences which contain abundant copies of the WGCW hotspot motif recognized by AID. Furthermore, these regions contain 3–4 bp stretches of poly-C tracts located on the transcriptional template strand, which have been hypothesized to allow the formation of R-loop secondary structure throughout the repetitive region. R-loop structures are defined by a stable RNA-DNA hybrid between the newly synthesized transcript and the DNA template strand. This structure therefore inhibits the reassociation of the two DNA strands, potentially allowing increased access of AID to the single-stranded non-template strand. To examine this hypothesis, we studied the distribution of RNA polymerases and mutations in the Sμ region, which precedes the C gene encoding IgM. We identified a high density of RNA polymerases located in close proximity (~500 bp) to the Sμ repetitive region but not in other locations of the Sμ locus. This pattern of RNA polymerase II accumulation was found to be identical between naïve and activated B cells from both AID-proficient and -deficient mice. Thus RNA polymerase abundance is independent of the activation state and the presence of AID. Significantly, these data correspond well with the proposed R-loop secondary structure, which begins roughly 300–700 bp outside the repetitive region. We propose that RNA polymerases pile up before the repetitive region because they have trouble unwinding the stable RNA-DNA hybrid in the R loops. Thus, the genomic sequence located within and surrounding the Sμ repetitive region causes RNA polymerases to accumulate. Furthermore, the Sμ region seems to be held in a primed state allowing for a quick response to antigenic signaling.
To initiate class switch recombination, AID has to access single-stranded DNA to deaminate dC to dU. The presence of uracils in DNA initiates DNA repair, which, for unknown reasons, processes the dU erroneously into either a DNA strand break or a nucleotide mutation. In the absence of UNG, processing of uracils into strand breaks is inhibited while the mutation frequency increases. We therefore looked for mutations in UNG-deficient mice as an indirect measure of AID activity. Mutations appeared around the intronic μ enhancer at a low frequency (~0.5 × 10⁻³ mutations/bp) and reached a peak before the Sμ repetitive sequence (~2.5 × 10⁻³ mutations/bp). However, on the downstream side of the repetitive core, very few mutations were seen (~0.5 × 10⁻³ mutations/bp) (Fig. 2). This indicates that AID activity is localized specifically to the core and sequences just upstream of this region. The rapid decrease in mutational frequency downstream of the core is significant as mutations to the adjacent C gene would be detrimental to antibody production. Upon closer analysis of the sequenced mutations, a pattern emerged which showed that there is a distinct lack of mutations of dA ~300–700 bp upstream of the repetitive sequence, even though the dA density is high. As stated above, dA mutations are synthesized by DNA polymerase η, suggesting there is a decrease in polymerase η activity in these regions (Fig. 3). Significantly, the loss in dA mutations correlates with the regions hypothesized to form R-loop structure, which may inhibit DNA polymerase η activity because the template strand is bound to RNA. Thus, the formation of R-loop structure in the Sμ region increases the mutational frequency and alters the mutational spectra.
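A mutation frequency in mutations/bp is the number of mutations observed divided by the total number of base pairs sequenced in that window. The sketch below shows the arithmetic under that assumption; the counts are hypothetical, chosen only to reproduce the orders of magnitude quoted above, and are not data from the study.

```python
def mutation_frequency(n_mutations, n_sequences, window_bp):
    """Mutations per base pair sequenced within one window."""
    total_bp = n_sequences * window_bp
    if total_bp <= 0:
        raise ValueError("need at least one sequenced base pair")
    return n_mutations / total_bp


# Hypothetical counts per window: (mutations, sequences read, window length in bp).
windows = {
    "intronic_enhancer": (5, 20, 500),
    "upstream_of_core": (25, 20, 500),
    "downstream_of_core": (5, 20, 500),
}

freqs = {name: mutation_frequency(*counts) for name, counts in windows.items()}
# The window with the highest frequency marks the apparent peak of AID activity.
peak = max(freqs, key=freqs.get)
```

Normalizing by the number of base pairs sequenced, rather than comparing raw counts, is what allows windows of different length or sequencing depth to be compared along the locus.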
It is known that transcription is required for mutation and switching, and there is some evidence that AID interacts with the transcription complex [48, 49]. Taken together with the structural elements of the Sμ region, a model can be proposed where the structure maximizes AID targets while inhibiting repair to promote class switching (Fig. 3). In this model, AID is associated with the transcription complex as polymerases move through the Sμ region. Upon encountering the R-loop structure, RNA polymerases accumulate and give AID abundant access to the single-stranded DNA, increasing the number of deaminations in these regions. As the transcription complex enters the repetitive region, the deamination events will further increase due to the high abundance of WGCW AID hotspot motifs. Therefore, the presence of R-loops and the quantity of AID hotspots together maximize the level of clustered deoxyuracils. The uracils are removed by UNG and the abasic site is cleaved by APE, producing a high density of nicked DNA on both strands. The close proximity of DNA breaks decreases the efficiency of base excision repair by DNA polymerase β and ligase. As a result, the balance swings from faithful DNA repair to non-homologous end-joining and subsequent class switch recombination.
Antibody diversity is a complex balance between advantageous and deleterious mutagenesis. The mechanism is highly regulated to control both the generation and processing of mutagenic deoxyuracils to promote affinity maturation and isotype switching. The mechanism uses intricate means to shift faithful DNA repair to an error-prone process that fixes the mutations into the genome and initiates non-homologous recombination. While the proteins and enzymes involved in processing the deoxyuracils have been extensively studied, further analysis has to be done to understand how they are altered to reduce faithful DNA repair. Specifically, the mechanisms seem to be altered at the DNA synthesis step of repair to utilize highly error-prone polymerases. However, it remains unclear why DNA polymerase η is specifically selected over other low fidelity polymerases under these conditions. Additionally, the understanding of these mechanisms is complicated by the fact that AID is responsible for the initiation of three very different mechanisms: hypermutation, switching, and gene conversion. Specifically, the finding that the Sμ region DNA sequence shifts the repair mechanisms to promote recombination in addition to mutagenesis signifies a distinct second level of regulation distinguishing somatic hypermutation from class switch recombination. Although these studies provide some insight into how AID is targeted to the S regions, there is currently no information on how AID is targeted to the V regions, which do not have R loops. Finally, knowing how AID is initially brought to the loci and how it interacts with the transcription complex will have a significant impact on understanding how AID localizes and mutates the immunoglobulin loci [53–55].
This research was supported by the Intramural Research Program of the NIH, National Institute on Aging.