Ig heavy and light chains are each encoded by a separate multigene family,(9,10) and the individual V and C domains are each encoded by independent elements: V(D)J gene segments for the V domain and individual exons for the C domains. The primary sequence of the V domain is functionally divided into three hypervariable intervals, termed complementarity determining regions (CDRs), that are situated between four regions of stable sequence termed frameworks (FRs).
Each V gene segment typically contains its own promoter, a leader exon, an intervening intron, an exon that encodes the first three framework regions (FR 1, 2, and 3), CDRs 1 and 2 in their entirety, and the amino terminal portion of CDR 3, and a recombination signal sequence (RSS). Each J (for joining) gene segment begins with its own recombination signal, followed by the carboxy terminal portion of CDR 3 and the complete FR 4.
Rearrangement events in the human κ locus
The creation of a V domain is directed by the recombination signal sequences (RSS) that flank the rearranging gene segments. Each RSS contains a strongly conserved seven base pair, or heptamer, sequence (e.g., CACAGTG) that is separated from a less well-conserved nine base pair, or nonamer, sequence (e.g., ACAAAACCC) by either a 12- or 23-base-pair spacer. These spacers place the heptamer and nonamer sequences on the same side of the DNA molecule, separated by either one or two turns of the DNA helix. A one turn recombination signal sequence (12 base pair spacer) will preferentially recognize a two turn signal sequence (23 base pair spacer), thereby avoiding wasteful V-V or J-J rearrangements.
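The pairing preference described above (often called the 12/23 rule) can be sketched in a few lines of Python. This is a minimal illustration, not a model of RAG binding: the `RSS` class and `can_pair` function are hypothetical names, and the 23-bp spacer assigned to JH is an assumption drawn from general knowledge of the heavy chain locus rather than from the text at this point.

```python
from dataclasses import dataclass

# Spacer length in base pairs -> number of helical turns it spans.
HELICAL_TURNS = {12: 1, 23: 2}


@dataclass(frozen=True)
class RSS:
    """A recombination signal sequence, reduced to its spacer length."""
    segment: str    # illustrative label such as "VH" or "DH"
    spacer_bp: int  # 12 (one turn) or 23 (two turns)

    def __post_init__(self):
        if self.spacer_bp not in HELICAL_TURNS:
            raise ValueError("spacer must be 12 or 23 bp")


def can_pair(a: RSS, b: RSS) -> bool:
    """Return True only for a one-turn signal paired with a two-turn signal."""
    return {a.spacer_bp, b.spacer_bp} == {12, 23}


# Heavy chain example: VH and JH carry two-turn signals (JH is an assumption
# here), and DH is flanked by one-turn signals.
vh = RSS("VH", 23)
dh = RSS("DH", 12)
jh = RSS("JH", 23)

print(can_pair(dh, jh))  # D-J joining is permitted
print(can_pair(vh, dh))  # V-DJ joining is permitted
print(can_pair(vh, jh))  # direct V-J joining is not
print(can_pair(vh, vh))  # V-V joining is not
```

Reducing each signal to its spacer length is enough to reproduce the rule; heptamer and nonamer sequence quality, which also influence recombination efficiency, are deliberately left out.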
Initiation of the V(D)J recombination reaction requires recombination activating genes 1 and 2 (RAG-1 and RAG-2), which are almost exclusively expressed in developing lymphocytes.(11) RAG-1 and RAG-2 introduce a DNA double-strand break (DSB) between the terminus of the rearranging gene segment and its adjacent recombination signal sequence. These breaks are then repaired by ubiquitously expressed components of a DNA repair process, known as nonhomologous end-joining (NHEJ), that are common to all cells of the body. Thus, while mutations of RAG affect only lymphocytes, loss or alteration-of-function mutations in NHEJ proteins yield susceptibility to DNA damage in all cells of the body. The NHEJ process creates precise joins between the RSS ends, and imprecise joins of the coding ends. Terminal deoxynucleotidyl transferase (TdT), which is expressed only in lymphocytes, can variably add non-germline encoded nucleotides (N nucleotides) to the coding ends of the recombination product.
Typically, the initial event in recombination will be recognition of the 12-bp spacer RSS by RAG-1. RAG-2 then associates with RAG-1 and the heptamer to form a synaptic complex. Binding of a second RAG-1 and RAG-2 complex to the 23-bp, two-turn RSS permits the interaction of the two synaptic complexes to form what is known as a paired complex, a process that is facilitated by the actions of the DNA-bending proteins HMG1 and HMG2.
After paired complex assembly, the RAG proteins single strand cut the DNA at the heptamer sequence. The 3’ OH of the coding sequence ligates to 5’ phosphate and creates a hairpin loop. The clean cut ends of the signal sequences enable formation of precise signal joints. However, the hairpin junction created at the coding ends must be resolved by re-nicking the DNA, usually within four to five nucleotides from the end of the hairpin. This forms a 3’ overhang that is amenable to further modification. It can be filled in via DNA polymerases, nibbled, or serve as a substrate for TdT-catalyzed N addition. DNA polymerase μ, which shares homology with TdT, appears to play a role in maintaining the integrity of the terminus of the coding sequence.
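How off-center opening of the hairpin produces a palindromic extension can be shown with a small sketch. It assumes a simplified model in which the coding end is represented only by its top strand and the hairpin is nicked on the opposite strand a given number of nucleotides from the tip; `open_hairpin` and `nick_offset` are hypothetical names, and the sequence is a toy example, not a real gene segment.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def open_hairpin(top_strand: str, nick_offset: int) -> str:
    """Return the top strand after the hairpin is opened.

    top_strand is the coding end read 5'->3', ending at the hairpin tip.
    nick_offset is how many nucleotides from the tip the opposite strand
    is nicked. Those nucleotides stay attached to the top strand, so the
    top strand gains the reverse complement of its own last nick_offset
    bases. A nick exactly at the tip (offset 0) adds nothing.
    """
    if not 0 <= nick_offset <= len(top_strand):
        raise ValueError("nick_offset must lie within the strand")
    tail = top_strand[len(top_strand) - nick_offset:]
    return top_strand + reverse_complement(tail)


coding_end = "GATTACA"                 # toy sequence
opened = open_hairpin(coding_end, 2)
print(opened)                          # GATTACATG
print(opened[-4:])                     # CATG, which equals its own reverse complement
```

In this model the last `2 * nick_offset` bases of the opened strand are single-stranded and always form a palindrome, which is why such additions can be recognized in sequenced junctions when they survive subsequent nibbling.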
The cut ends of the coding sequence are then repaired by the non-homologous end joining proteins. NHEJ proteins involved in V(D)J recombination include Ku70, Ku80, DNA-PKcs, Artemis, XRCC4, and ligase 4. Ku70 and Ku80 form a heterodimer (Ku) that directly associates with DNA double-strand breaks to protect the DNA ends from degradation, permit juxtaposition of the ends to facilitate coding end ligation, and help recruit other members of the repair complex. DNA-PKcs phosphorylates Artemis, inducing an endonuclease activity that plays a role in the opening of the coding joint hairpin. Finally, XRCC4 and ligase 4 help rejoin the ends of the broken DNA. Deficiency of any one of these proteins creates sensitivity to DNA breakage and can lead to a SCID phenotype.
The κ locus
The κ locus is located on chromosome 2p11.2.(12) κ V domains represent the joined product of Vκ and Jκ gene segments, whereas the κ C domain is encoded by a single Cκ exon. The locus contains five Jκ and 75 Vκ gene segments upstream of Cκ. One third of the Vκ gene segments contain frameshift mutations or stop codons that preclude them from forming functional protein, and of the remaining sequences fewer than 30 of the Vκ gene segments have actually been found in functional immunoglobulins. V gene segments can be grouped into families on the basis of sequence and structural similarity.(13,14) There are six such families for Vκ.
Representation of the chromosomal organization of the Ig H, κ, and λ gene clusters
Each active Vκ gene segment has the potential to rearrange to any one of the five Jκ elements, generating a potential ‘combinatorial’ repertoire of more than 140 distinct VJ combinations. The Vκ gene segment contains FR1, −2, and −3, CDR1 and −2, and the amino terminal portion of CDR3, whereas the Jκ element contains the carboxy terminus of CDR3 and FR4 in its entirety. The terminus of each rearranging gene segment can undergo a loss of 1 to 5 nucleotides during the recombination process, yielding additional ‘junctional’ diversity. In humans, TdT can introduce random N nucleotides to either replace some or all of the lost Vκ or Jκ nucleotides, or to add to the original germline sequence.(15) Each codon created by N addition increases the potential diversity of the repertoire 20-fold. Thus, the initial diversification of the κ repertoire is focused at the VJ junction that defines the light chain CDR3, or CDR-L3.
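The arithmetic behind these figures is simple enough to write out. The sketch below is illustrative only: the function names are hypothetical, and the value of 29 active Vκ gene segments is an assumption chosen to be consistent with "fewer than 30" and "more than 140", not a number given in the text.

```python
def vj_combinations(n_v: int, n_j: int) -> int:
    """Number of distinct V-J pairings if every V can join every J."""
    return n_v * n_j


def n_addition_factor(n_codons: int) -> int:
    """Upper bound on the fold increase in diversity from N-encoded codons.

    Assumes each added codon can encode any of the 20 amino acids with no
    stop codons or sequence bias, which is an idealization.
    """
    return 20 ** n_codons


# Assumed example: 29 active V-kappa segments and the five J-kappa segments.
print(vj_combinations(29, 5))   # 145
print(n_addition_factor(1))     # 20
print(n_addition_factor(2))     # 400
```

Even one or two N-encoded codons multiply the repertoire by more than the germline pairing itself contributes, which is why diversity concentrates at the junction.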
The λ locus
The λ locus, on chromosome 22q11.2, contains four functional Cλ exons, each of which is associated with its own Jλ. Vλ genes are arranged in three distinct clusters, each containing members of different Vλ families.(16) Depending on the individual haplotype, there are approximately 30–36 potentially functional Vλ gene segments and an equal number of pseudogenes.
During early B cell development, H chains form a complex with unconventional λ light chains, known as surrogate or pseudo light chains (ΨLC), to form a pre-B cell receptor. The genes encoding the ΨLC proteins, λ14.1 (λ5) and VpreB, are located within the λ light chain locus on chromosome 22. Together, these two genes create a product with considerable homology to conventional λ light chains. A critical difference between these unconventional ΨLC genes and other L chains is that λ14.1 and VpreB gene rearrangement is not required for ΨLC expression. The region of the ΨLC that corresponds to CDR-L3 covers CDR-H3 in the pre-B cell receptor, allowing the pre-B cell to avoid antigen-specific selection.(17)
The H chain locus
The H chain locus, on chromosome 14q32.33, is considerably more complex than the light chain clusters. The ~80 VH gene segments near the telomere of the long arm of chromosome 14 can be grouped into seven different families of related gene segments.(18) Of these, approximately 39 are functional. Adjacent to the most centromeric VH, V6-1, are 27 DH (D for diversity) gene segments(19) and six JH gene segments. Each VH gene segment is associated with a two turn recombination signal sequence, which prevents direct V→J joining. A pair of one turn recombination signal sequences flanks each DH. Recombination begins with the joining of a DH to a JH gene segment, followed by the joining of a VH element to the amino terminal end of the DJ intermediate. The VH gene segment contains FR1, −2, and −3, CDR1 and −2, and the amino terminal portion of CDR3; the DH gene segment forms the middle of CDR3; and the JH element contains the carboxy terminus of CDR3 and FR4 in its entirety. Random assortment of one of ~50 active VH and one of 27 DH with one of the six JH gene segments can generate more than 10^4 different VDJ combinations.
The antigen binding site is the product of a nested gradient of diversity
While combinatorial joining of individual V, D, and J gene segments maximizes germline-encoded diversity, the junctional diversity created by VDJ joining is the major source of variation in the pre-immune repertoire. First, DH gene segments can rearrange by either inversion or deletion, and each DH can be spliced and translated in each of the three potential reading frames. This gives each DH gene segment the potential to encode six different peptide fragments. Second, the rearrangement process proceeds through a step that creates a hairpin ligation between the 5’ and 3’ termini of the rearranging gene segment. Nicking to resolve the hairpin structure leaves a 3’ overhang that creates a palindromic extension, termed a P junction. Third, the terminus of each rearranging gene segment can undergo a loss of one to several nucleotides during the recombination process. Fourth, TdT can add numerous N nucleotides to replace or add to the original germline sequence. N nucleotides can be inserted between the V and the D, as well as between the D and the J. The imprecision of the joining process and variation in the extent of N addition permit generation of CDR-H3s of varying length and structure. As a result, more than 10^10 different H chain VDJ junctions, or CDR-H3s, can be generated at the time of gene segment rearrangement. Taken as a whole, somatic variation in CDR3, combinatorial rearrangement of individual gene segments, and combinatorial association between different L and H chains can yield a potential pre-immune antibody repertoire of greater than 10^16 different immunoglobulins.
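The first point, that one DH gene segment can encode six different peptide fragments, can be made concrete with a short translation sketch. It treats rearrangement by inversion as reading the reverse complement of the segment, uses the standard genetic code, and runs on a made-up 12-nucleotide sequence; the function names and the toy sequence are illustrative assumptions, not a real DH gene segment.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; trailing bases are ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(0, len(seq) - 2, 3))


def six_frames(d_segment: str) -> dict:
    """Peptides for both orientations and all three reading frames."""
    strands = {"deletion": d_segment,
               "inversion": reverse_complement(d_segment)}
    return {(orientation, frame): translate(strand[frame:])
            for orientation, strand in strands.items()
            for frame in range(3)}


toy_d = "ATGGCCTGGTAC"  # hypothetical sequence for illustration only
for key, peptide in six_frames(toy_d).items():
    print(key, peptide)
```

In a real junction the reading frame is set by the number of nucleotides lost or added upstream of the D, so the same germline segment contributes very different amino acids from one rearrangement to the next.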
Class switch recombination (CSR)
Located downstream of the VDJ loci are nine functional CH genes.(20) These constant genes consist of a series of exons, each encoding a separate domain, hinge, or terminus. All CH genes can undergo alternative splicing to generate two different types of carboxy termini: either a membrane terminus that anchors immunoglobulin on the B lymphocyte surface or a secreted terminus that occurs in the soluble form of the immunoglobulin. With the exception of CH1δ, each CH1 constant region is preceded by both an exon that cannot be translated (an I exon) and a region of repetitive DNA termed the switch (S). Cocktails of cytokine signals transmitted by T cells or other extracellular influences variably activate the I exon, initiating transcription and thus activating the gene. Through recombination between the Cμ switch region and one of the switch regions of the seven other H chain constant regions (a process termed class switching or class switch recombination [CSR]), the same VDJ heavy chain variable domain can be juxtaposed to any of the H chain classes.(20) This enables the B cell to tailor both the receptor and the effector ends of the antibody molecule to meet a specific need.
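A toy model helps show what juxtaposing the VDJ exon to a new constant region implies for the locus. The sketch assumes the commonly described deletional form of CSR, in which the DNA between the two recombining switch regions is excised, and the usual 5' to 3' order of the nine functional human CH genes; neither detail is spelled out in the text above, and pseudogenes are omitted. `class_switch` is a hypothetical helper.

```python
# Assumed 5'->3' order of the nine functional human CH genes.
CH_ORDER = ["mu", "delta", "gamma3", "gamma1", "alpha1",
            "gamma2", "gamma4", "epsilon", "alpha2"]

# C-delta has no switch region of its own; IgD is produced by
# alternative splicing rather than by switch recombination.
NO_SWITCH_REGION = {"delta"}


def class_switch(locus: list, target: str) -> list:
    """Return the CH genes remaining after switching to `target`.

    Recombination joins the switch region of the currently expressed
    gene (the first in `locus`) to the switch region of `target`;
    everything in between is removed.
    """
    if target in NO_SWITCH_REGION:
        raise ValueError(f"{target} has no switch region")
    if target not in locus:
        raise ValueError(f"{target} was deleted by an earlier switch")
    return locus[locus.index(target):]


after_first = class_switch(CH_ORDER, "gamma1")
print(after_first)   # gamma1 and everything downstream of it
after_second = class_switch(after_first, "epsilon")
print(after_second)  # ['epsilon', 'alpha2']
```

Under this model switching is one-way: a cell that has switched to a downstream class can switch again only to genes that lie further downstream, because the upstream genes are no longer present on that chromosome.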
Somatic hypermutation (SHM)
A final mechanism of immunoglobulin diversity is engaged only after exposure to antigen. With T cell help, the variable domain genes of germinal center lymphocytes undergo somatic hypermutation (SHM) at a rate of up to 10^-3 changes per base pair per cell cycle. SHM is correlated with transcription of the locus, and in humans two separate mechanisms are involved: the first mechanism targets mutation hot spots with the RGYW (purine/G/pyrimidine/A or T) motif(21) and the second mechanism incorporates an error-prone DNA synthesis that can lead to a nucleotide mismatch between the original template and the mutated DNA strand.(22) Other species use gene conversion between functional and non-functional V sequences to introduce additional somatic diversity. SHM allows affinity maturation of the antibody repertoire in response to repeated immunization or exposure to antigen.
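Locating candidate hot spots in a sequence is a simple pattern search. The sketch below assumes the IUPAC reading of the motif, in which R is A or G, Y is C or T, and W is A or T; it scans only the strand it is given, and the input is a toy sequence. `find_rgyw` is a hypothetical helper name.

```python
import re

# R = A/G, Y = C/T, W = A/T (IUPAC codes). The lookahead makes the
# match zero-width so that overlapping motifs are all reported.
RGYW = re.compile(r"(?=[AG]G[CT][AT])")


def find_rgyw(seq: str) -> list:
    """Return 0-based start positions of every RGYW motif in seq."""
    return [match.start() for match in RGYW.finditer(seq.upper())]


toy = "AGCTAGCA"        # illustrative sequence, not a real V gene
print(find_rgyw(toy))   # [0, 4]
```

A fuller analysis would also scan the reverse complement, since a motif on the opposite strand appears as WRCY on the strand being read.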
Activation-induced cytidine deaminase (AID)
AID plays a key role in both CSR and SHM.(11,23) AID is a single strand DNA (ssDNA) cytidine deaminase that can be expressed in activated germinal center B cells.(24) Transcription of an immunoglobulin V domain or of the switch region upstream of the CH1 domain opens the DNA helix to generate ssDNA that can then be deaminated by AID to form mismatched dU/dG DNA base pairs. The base excision repair protein uracil DNA glycosylase (UNG) removes the mismatched dU base, creating an abasic site. Differential repair of the lesion leads to either SHM or CSR. The mismatch repair (MMR) proteins MSH2 and MSH6 can also bind and process the dU:dG mismatch. Deficiencies of AID or UNG underlie some forms of the hyper-IgM syndrome.
Generation of immunoglobulin diversity occurs at defined stages of B cell development
Creation of immunoglobulin diversity is hierarchical. In pro-B cells, DH→JH joining precedes VH→DJH rearrangement, and VL→JL joining takes place at the late pre-B cell stage. Production of a properly functioning B cell receptor is essential for development beyond the pre-B cell stage. For example, loss-of-function mutations in RAG-1/2 and DNA-dependent protein kinase (DNA-PKcs, Ku70/80) preclude B cell development, as well as T cell development, leading to severe combined immune deficiency. In-frame, functional VDJH rearrangement allows the pro-B cell to produce μ H chains, most of which are retained in the endoplasmic reticulum. The appearance of cytoplasmic μ H chains defines the pre-B cell.
Pre-B cells whose μ H chains can associate with VpreB and λ14.1 (λ5), which together form the surrogate light chain (ΨLC), form a pre-B cell receptor. Its appearance turns off RAG-1 and RAG-2, preventing further H chain rearrangement (allelic exclusion). This is followed by four to six cycles of cell division.(25) Late pre-B daughter cells reactivate RAG-1 and RAG-2 and begin to undergo VL rearrangement. Successful production of a complete κ or λ light chain permits expression of conventional IgM on the cell surface (sIgM), which identifies the immature B cell. Immature B cells that have successfully produced an acceptable IgM B cell receptor extend transcription of the H chain locus to include the Cδ exons downstream of Cμ. Alternative splicing permits co-production of IgM and IgD. These newly mature B cells enter the blood and migrate to the periphery, where they form the majority of the B cell pool in the spleen and the other secondary lymphoid organs. The IgM and IgD on each of these cells share the same variable domains.
The life span of mature B cells expressing surface IgM and IgD appears entirely dependent on antigen selection. After leaving the bone marrow, unstimulated cells live only days or a few weeks. As originally postulated by Burnet's "clonal selection" theory, B cells are rescued from apoptosis by their response to a cognate antigen. The reaction to antigen leads to activation, which may then be followed by diversification. The nature of the activation process is critical. T cell independent stimulation of B cells induces differentiation into short-lived plasma cells with limited class switching. T-dependent stimulation adds further layers of diversification, including somatic hypermutation of the variable domains, which permits affinity maturation, class switching to the entire array of classes available, and differentiation into the long-lived memory B cell pool or into the long-lived plasma cell population.