Double-strand DNA breaks are common events in eukaryotic cells, and there are two major pathways for repairing them: homologous recombination and nonhomologous DNA end joining (NHEJ). The diverse causes of DSBs result in a diverse chemistry of DNA ends that must be repaired. Across NHEJ evolution, the enzymes of the NHEJ pathway exhibit a remarkable degree of structural tolerance in the range of DNA end substrate configurations upon which they can act. In vertebrate cells, the nuclease, polymerases and ligase of NHEJ are the most mechanistically flexible and multifunctional enzymes in each of their classes. Unlike repair pathways for more defined lesions, NHEJ repair enzymes act iteratively, act in any order, and can function independently of one another at each of the two DNA ends being joined. NHEJ is critical not only for the repair of pathologic DSBs as in chromosomal translocations, but also for the repair of physiologic DSBs created during V(D)J recombination and class switch recombination. Therefore, patients lacking normal NHEJ are not only sensitive to ionizing radiation, but also severely immunodeficient.
Unlike most other DNA repair and DNA recombination pathways, nonhomologous DNA end joining (NHEJ) in prokaryotes and eukaryotes evolved along themes of mechanistic flexibility, enzyme multifunctionality, and iterative processing in order to achieve repair of a diverse range of substrate DNA ends at double-strand breaks (DSBs) (1-3). Except for very limited protein homology for the Ku protein in prokaryotes and eukaryotes (2), the actual nuclease, polymerase, and ligase components of NHEJ appear to have arisen independently, but converged on these same mechanistic themes in order to handle the challenge of joining two freely diffusing ends of diverse DNA end overhang configuration with a wide range of base or sugar oxidative damage (3).
When double-strand breaks arise in any organism, prokaryotic or eukaryotic, there are two major categories of DNA repair that can restore the duplex structure (Fig. 1). If the organism is diploid (even if the diploidy is only transient, as in replicating bacteria or replicating haploid yeast), then homology-directed repair can be used. The most common form of homology-directed repair is called homologous recombination (abbreviated HR), which has the longest sequence homology requirements between the donor and acceptor DNA. Other forms of homology-directed repair include single-strand annealing (abbreviated SSA) and breakage-induced replication, and these require shorter sequence homology relative to HR (4, 5).
In nondividing haploid organisms or in diploid organisms that are not in S phase, a homology donor is not nearby. Hence, early in evolution, another form of double-strand break repair had an opportunity to provide survival advantage, and nonhomologous DNA end joining (NHEJ) includes a set of DNA enzymes that have the mechanistic flexibility to provide such an advantage (Table 1)(6).
How the cell determines whether HR or NHEJ will be used to repair a break is still an active area of investigation. The HR versus NHEJ determination may be somewhat operational (7). If a homologue is not present near a DSB during S/G2, then HR cannot proceed, and NHEJ is the only option. During S phase, the sister chromatid is physically very close, thereby providing a homology donor for HR. Outside of S/G2, NHEJ is indeed the markedly preferred option. The precise molecular events, beyond issues of proximity and possible competition between Ku and RAD51 or 52, are yet to be deciphered (7-9). Recent data from S. cerevisiae suggest that the DNA ligase IV complex may be key in suppressing the DNA end resection needed to initiate HR (10).
There are an estimated ten double-strand breaks (DSBs) per day per cell, based on metaphase chromosome and chromatid breaks in early passage primary human or mouse fibroblasts (11-13). Estimates of DSB frequency in nondividing cells are difficult to make because methods for assessing DSBs outside of metaphase are subject to even more caveats of interpretation.
In mitotic cells of multicellular eukaryotes, DSBs are all pathologic (accidental) except the specialized subset of physiologic DSBs in early lymphocytes of the vertebrate immune system (Fig. 1). Major pathologic causes of double-strand breaks in wild type cells include replication across a nick, giving rise to chromatid breaks during S phase. Such DSBs are ideally repaired by HR using the nearby sister chromatid.
All of the remaining pathologic forms of DSB are repaired by NHEJ because they usually occur when there is no nearby homology donor and/or because they occur outside of S phase. These causes include reactive oxygen species from oxidative metabolism, ionizing radiation, and inadvertent action of nuclear enzymes (14).
Reactive oxygen species (ROS) are a second major cause of DSBs (Fig. 1). During the course of normal oxidative respiration, mitochondria convert about 0.1 to 1% of the oxygen to superoxide (O₂⁻) (15). Superoxide dismutase in the mitochondrion (SOD2) or cytosol (SOD1) can convert this to hydroxyl free radicals, which may react with DNA to cause single-strand breaks. Two closely spaced lesions of this type on anti-parallel strands can cause a DSB. About 10²² free radicals or ROS species are produced in the human body each hour, and this represents about 10⁹ ROS per cell per hour. A subset of the longer-lived ROS may enter the nucleus via the nuclear pores.
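The per-cell ROS figure follows from dividing the whole-body production rate by the number of cells in the body. The short calculation below is only a sketch of that arithmetic; the cell count of roughly ten trillion is an assumption introduced here (it is the value implied by the two figures quoted in the text), and the variable names are illustrative.

```python
# Order-of-magnitude check of the ROS-per-cell estimate quoted in the text.
ros_per_body_per_hour = 1e22   # from the text
cells_per_body = 1e13          # ASSUMPTION: ~10 trillion cells per human body

ros_per_cell_per_hour = ros_per_body_per_hour / cells_per_body
print(f"{ros_per_cell_per_hour:.0e} ROS per cell per hour")
```

Under that assumption the result is on the order of a billion ROS per cell per hour, consistent with the estimate in the text.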
A third cause of DSBs is natural ionizing radiation of the environment. This includes gamma rays and X-rays. At sea level, ~300 million ionizing radiation particles per hour pass through each person. As these traverse the body, they create free radicals along their path, primarily from water. When the particle comes close to a DNA duplex, clusters of free radicals damage DNA, generating single- and double-strand breaks at a ratio of about 25 to 1 (16). About half of the ionizing radiation that strikes each of us comes from outside the earth. The other half of the radiation that strikes us comes from the decay of radioactive elements, primarily metals, within the earth.
A fourth cause of DSBs is inadvertent action by nuclear enzymes on DNA. These include failures of type II topoisomerases, which transiently break both strands of the duplex. If the topoisomerase fails to rejoin the strands, then a DSB results (17). Nuclear enzymes of lymphoid cells, such as the RAG complex (composed of RAG1 and 2) and activation-induced deaminase (AID), are responsible for physiologic breaks for antigen receptor gene rearrangement; however, they sometimes accidentally cleave the DNA at off-target sites outside the antigen receptor gene loci (18). In humans, these account for about half of all of the chromosomal translocations that result in lymphoma.
Finally, physical or mechanical stress on the DNA duplex is a relevant cause of DSBs. In prokaryotes, this arises in the context of desiccation, which is quite important in nature (19). In eukaryotes, telomere failures can result in chromosomal fusions that have two centromeres, and this results in physical stress by the mitotic spindle (breakage/fusion/bridge cycles) with DSBs (20).
In addition to the above for mitotic cells, meiotic cells have an additional source of DSBs, which is physiologic and is caused by an enzyme called Spo11, a topoisomerase II-like enzyme (21). Spo11 creates DSBs to generate cross-overs between homologues during meiotic prophase I. These events are resolved by HR. Therefore, NHEJ is not relevant to Spo11 breaks. Interestingly, it is not clear that NHEJ occurs in vertebrate meiotic cells, because one group reports the lack of Ku70 in spermatogonia (22). Human spermatogonia remain in meiotic prophase I for about 3 weeks, and human eggs remain in meiotic prophase I for 12 to 50 years; hence, these cells can rely on HR during these long periods. Given the error-prone nature of NHEJ (see below), reliance on HR may be one way to minimize alterations to the germ line at frequencies that might be deleterious to a population.
Perhaps the most intriguing aspect of NHEJ is the diversity of substrates that it can accept and convert to joined products. This demands a remarkable level of mechanistic flexibility at the level of protein-substrate interaction and is unparalleled in most other biochemical processes. Though we have substantial amounts of information on the DNA end configurations at DSBs, there are limitations to the information because of the diverse manner in which ionizing radiation and ROS interact with DNA. Therefore, we know the most about the diversity of physiologic DSBs, specifically V(D)J recombination, because we know where the relevant enzymes initiate the cutting of the two DNA strands. In V(D)J recombination, we can examine many NHEJ outcomes from the same starting substrates. We can also vary the sequence of the two DNA ends being joined. All of these various overhangs are joined in vivo at about the same efficiency, regardless of sequence. Within this range of overhang variation then, NHEJ can accept a wide variety of overhang length, DNA end sequence, and DNA end chemistry.
Like most DNA repair processes, NHEJ requires a nuclease to resect damaged DNA, polymerases to fill in new DNA, and a ligase to restore integrity to the DNA strands (Fig. 2). Functional correspondence between the NHEJ proteins of prokaryotes, yeast (along with plants and invertebrates), and vertebrates can be inferred (Table 1). Prokaryotic NHEJ has been reviewed recently (23), and comparisons between prokaryotes and eukaryotes have been made (3). NHEJ in yeast, which appears to be similar for plants and invertebrates, has been thoroughly reviewed as well (24, 25). Hence, the discussion here will focus on NHEJ in vertebrates, with appropriate comparisons to prokaryotic and yeast NHEJ.
When a double-strand break arises in vertebrates, it is thought that Ku is the first protein to bind, based on its abundance (estimated at ~400,000 molecules per cell) and its strong equilibrium dissociation constant (~10⁻⁹ M) for duplex DNA ends of any configuration (Fig. 3 and 4a)(26-29). Ku is a toroidal protein, based on its crystal structure (30). Ku bound to a DNA end can be considered as a Ku:DNA complex, which serves as a node at which the nuclease, polymerases and ligase of NHEJ can dock (31). One can think of Ku as a toolbelt protein, similar to PCNA in DNA replication, where many proteins can dock. At a DSB, there are two DNA ends. Hence, it is presumed that there is a Ku:DNA complex at each of the two DNA ends being joined, thereby permitting each DNA end to be modified in preparation for joining.
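The argument that Ku binds first rests on its abundance and its nanomolar dissociation constant. A rough sketch of that reasoning, using a simple one-site equilibrium binding model, is given below. The nuclear volume is an assumption introduced here for illustration (the text gives only the copy number and the dissociation constant), and the model further assumes that Ku is in large excess over DNA ends, so that free Ku is approximately equal to total Ku.

```python
# Rough estimate of the fraction of DNA ends occupied by Ku at equilibrium.
AVOGADRO = 6.02214076e23          # molecules per mole

ku_molecules_per_cell = 400_000   # from the text
kd_molar = 1e-9                   # from the text (~10^-9 M)
nuclear_volume_liters = 5e-13     # ASSUMPTION: ~500 cubic micrometres

# Convert copy number to molar concentration within the assumed volume.
ku_molar = ku_molecules_per_cell / (AVOGADRO * nuclear_volume_liters)

# One-site binding: fraction bound = [Ku] / ([Ku] + Kd), valid when Ku >> ends.
fraction_ends_bound = ku_molar / (ku_molar + kd_molar)

print(f"[Ku] ~ {ku_molar:.2e} M, fraction of ends bound ~ {fraction_ends_bound:.4f}")
```

With these inputs the Ku concentration is in the micromolar range, about a thousand-fold above the dissociation constant, so nearly every DNA end would be expected to carry Ku at equilibrium. A smaller or larger assumed volume changes the concentration proportionally but not the qualitative conclusion.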
The Ku:DNA end complex can recruit the nuclease, polymerase and ligase activities in any order (31, 32). This flexibility is the basis for the diverse array of outcomes that can arise from identical starting ends. The processing of the two DNA ends may transiently terminate when there is some small extent of annealing between the two DNA ends. The processing may permanently terminate when one or both strands of the left and right duplexes are ligated.
Ku likely changes conformation when bound to a DNA end versus when Ku is in free solution. The basis for this inference is that Ku does not form stable complexes with DNA-PKcs in the absence of DNA ends (33), and the same appears to apply for its interactions with polymerases mu and lambda and with XRCC4:DNA ligase IV (34, 35). The crystal structure for Ku lacks the C-terminal 19 kDa (167 aa) of Ku86 (an important region of interaction with DNA-PKcs and other proteins), and this may be a region for conformational change upon Ku binding to DNA (36).
The Artemis:DNA-PKcs complex has a diverse array of nuclease activities, including 5' endonuclease activity, 3' endonuclease activity, and hairpin opening activity, in addition to an apparent 5' exonuclease activity of Artemis alone (Fig. 4b and 4c and Suppl. Fig. 1)(37). The Artemis:DNA-PKcs complex is able to endonucleolytically cut a variety of types of damaged DNA overhangs (38, 39). Hence, there is no obvious need for additional nucleases, though the 3' exonuclease of PALF (APLF) and others are possibilities (see Future Questions section).
Polymerases mu and lambda are both able to bind to the Ku:DNA complexes by way of their BRCT domains located in the N-terminal portion of each polymerase (Fig. 4d)(32). Additional polymerases appear able to contribute when neither of these two polymerases is present (40, 41). As discussed later, pol mu is particularly well-suited for functioning in NHEJ because it is capable of template-independent synthesis, in addition to template-dependent synthesis. Pol lambda also has more flexibility than replicative polymerases.
DNA ligase IV is the most flexible ligase known, with the ability to ligate across gaps and ligate incompatible DNA ends (Fig. 4e)(42, 43). It can also ligate one strand when the other has a complex configuration (e.g., bearing flaps), and it can ligate single-stranded DNA, though with limited and substantial sequence preferences.
Therefore, the nuclease, polymerases and ligase of NHEJ all have much greater mechanistic flexibility than their counterparts in other repair pathways. This flexibility permits these structure-specific proteins to act on a wider range of starting DNA end structures. One consequence of such flexibility in vertebrates may be the substantial diversity of junctional outcomes observed, even from identical starting ends, as discussed in the next section.
If we arbitrarily designate the two DNA ends as left and right, then the Ku bound at the left end could conceivably recruit the nuclease, and the Ku at the right DNA end might recruit the polymerase, or vice versa. It is likely that there are multiple rounds of action by the nuclease, polymerases, and ligase at the left and right ends of the DSB until the ‘top’ or ‘bottom’ strand is ligated. Therefore, the joining of the two ends is likely to be an iterative process with multiple possible routes all leading to a joining event, but with wide variation in the precise junctional sequence of the products.
Unlike pathologic causes of DSBs which generally cannot generate predictable pairs of starting DNA ends, V(D)J recombination always generates two hairpinned coding ends (see section on V(D)J recombination for a fuller discussion, including signal ends). Though these two coding ends can be opened in a few different ways, the predominant hairpin opening position is 2 nucleotides 3' of the hairpin tip in vivo (44) and in vitro (37). Hence, the two coding ends are often 3' overhangs of length 4 nucleotides each. Assuming that NHEJ within vertebrate B cells is representative of NHEJ in other tissue cell types (these junctions can be analyzed in cells that do not express terminal transferase), several inferences can be drawn from these relatively defined starting DNA ends.
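The relationship between the hairpin opening position and the resulting overhang can be made concrete with a small sketch. The function below is a hypothetical illustration (its name, arguments, and the example sequence are not from the original work): it treats the coding end as a top strand whose 3' end is covalently joined at the hairpin tip to the complementary bottom strand, and models a nick placed a given number of nucleotides 3' of the tip.

```python
# Toy model of hairpin opening at a V(D)J coding end.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def open_hairpin(top_strand, offset=2):
    """Open a hairpin by nicking `offset` nt 3' of the tip.

    top_strand: 5'->3' sequence of the top strand, ending at the hairpin tip.
    Returns (top strand after opening, resulting 3' overhang).
    """
    if not 0 <= offset <= len(top_strand):
        raise ValueError("offset must lie within the duplex")
    if offset == 0:
        return top_strand, ""  # nick exactly at the tip gives a blunt end
    # The first `offset` nt of the bottom strand stay attached to the top strand.
    transferred = revcomp(top_strand[-offset:])
    opened_top = top_strand + transferred
    # Unpaired region: `offset` nt that lost their partners plus the transferred nt.
    overhang = opened_top[-2 * offset:]
    return opened_top, overhang

opened, overhang = open_hairpin("GATTACA", offset=2)
print(opened, overhang)
```

In this model, opening 2 nucleotides 3' of the tip always yields a 4-nucleotide 3' overhang, and that overhang is self-complementary because half of it is copied from the opposite strand. Other opening positions give overhangs of twice the offset.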
First, the amount of nucleolytic resection (loss) from each DNA end varies, usually over a range of 0 to 14 bp; but there are less frequent examples with resection up to ~25 bp (45). The rare instances where there is loss greater than 25 bp may represent cases where the DNA end is released prematurely from whatever factors retain the two DNA ends in some proximity (see below). In vertebrate NHEJ, a complex of Artemis:DNA-PKcs is capable of endonucleolytically resecting a wide range of DNA end configurations. In yeast, plants and invertebrates, the MRX complex appears to be critical for some of the DNA end resection (24). The evolutionary inception of Artemis and DNA-PKcs coincides with the inception of V(D)J recombination (the vertebrate/invertebrate transition). The MRX nuclease system and the DNA-PKcs system both rely on the same conserved C-terminal tail for protein-protein interaction, also suggesting that the Artemis:DNA-PKcs complex may have evolved to replace the MRX complex for vertebrate NHEJ (46).
Second, nucleotide addition can occur at the DNA junction, even when terminal transferase is not present. In mammals, pol mu can add in a template-independent manner under physiologic conditions (7, 8). Mammalian pol lambda does not appear to add in a template-independent manner except when Mg2+ is replaced with Mn2+ (47, 48). The precise biochemical properties of the Pol X polymerases in other eukaryotes are not as clear. Interestingly, in bacteria, the polymerase activity intrinsic to the LigD protein is capable of adding one nucleotide or ribonucleotide in an entirely template-independent manner (23), perhaps reflecting convergent evolution.
Pol mu and pol lambda both seem to have much greater flexibility than most polymerases during template-dependent synthesis also (47, 48). The template-independent addition by pol mu would sometimes be expected to fold back on itself (42), and the resulting stem-loop structure might function as a primer/template substrate (see step 1 in Suppl. Fig. 2)(48). This may account for the observed inverted repeats at many NHEJ junctions from chromosomal translocations in humans (49, 50). Both pol mu and pol lambda can slip back on their template strand (51-53), and this may permit generation of direct repeats, accounting for such events seen in vivo (54-56). Direct repeats are also often seen at NHEJ junctions from human chromosomal translocations (49, 50). The direct and inverted repeats seen at these NHEJ junctions have been termed T-nucleotides, where the T stands for templated (49, 50).
Therefore, even from a relatively homogeneous set of starting DNA ends as substrates, there is substantial variation in the nucleotide resection from each end and variation in the amount of template-independent addition to the two DNA ends. These two sources of variation are the basis for the heterogeneity at the joining site.
In the context of considering diverse substrates and diverse joining products, it is worth noting an additional facet of that flexibility: the proteins involved and the order of their action in NHEJ can vary at either of the two DNA ends. Each DNA end, especially when bound by Ku, is best considered as a node at which any of the NHEJ proteins can dock. If one of the polymerases arrives first at the ‘left’ end, then this might be the first step at that end. However, if the nuclease binds first at that end, then resection will occur first (32).
In addition to the theme of mechanistic flexibility, there is the theme of iterative processing of the junction (32, 38). For any given joining event, there might, for example, be only three steps: one each involving a nuclease, a polymerase, and then a ligase. However, in another joining scheme, there might be 10 steps with multiple appearances by each enzyme activity. Hence, each of the enzymatic components might be involved not at all, once, or many times.
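The combination of order-independent recruitment and iterative processing can be illustrated with a toy simulation. The sketch below is not a biochemical model: the function name, the equal probabilities for each enzyme and each end, the single-strand representation of the junction, the starting sequences, and the cap of 10 steps are all assumptions chosen only to show how identical starting ends can give many different junctions.

```python
import random

ENZYMES = ("nuclease", "polymerase", "ligase")

def simulate_join(left, right, rng, max_steps=10):
    """Simulate one toy joining event.

    left, right: sequences flanking the break, written 5'->3' on one strand.
    Returns (junction sequence, number of nuclease/polymerase steps taken).
    """
    left, right = list(left), list(right)
    steps = 0
    while steps < max_steps:
        enzyme = rng.choice(ENZYMES)      # any enzyme may arrive next
        at_left = rng.random() < 0.5      # at either of the two ends
        if enzyme == "ligase":
            break                         # ligation terminates processing
        steps += 1
        if enzyme == "nuclease":
            # Resect one nucleotide from the break-proximal terminus, if any remain.
            if at_left and left:
                left.pop()
            elif not at_left and right:
                right.pop(0)
        else:
            # Template-independent addition of one random nucleotide.
            nt = rng.choice("ACGT")
            if at_left:
                left.append(nt)
            else:
                right.insert(0, nt)
    # Ligation (or a forced join once the step cap is reached).
    return "".join(left) + "".join(right), steps

rng = random.Random(0)
outcomes = [simulate_join("GGCCAT", "TAGGCC", rng) for _ in range(1000)]
distinct_junctions = {junction for junction, _ in outcomes}
print(len(distinct_junctions), "distinct junctions from identical starting ends")
```

Some simulated events are ligated immediately with no processing, while others pass through several rounds of resection and addition, mirroring the idea that each enzymatic activity may act not at all, once, or many times.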
Related to, but distinct from, the theme of iterative processing is the independent function of the nuclease, polymerases, and ligase from one another and even from Ku. Each of the enzymatic activities has a substantial range and level of activity without one another, and even without Ku, when examined in purified biochemical systems. For example, Ku is entirely unnecessary in ligation of DNA ends by the ligase IV complex when those ends share 4 bp of terminal microhomology, but Ku is stimulatory for shorter microhomology lengths (42). Polymerases mu and lambda are able to carry out fill-in synthesis, and polymerase mu does not always require Ku or XRCC4:DNA ligase IV in order to have TdT-like activity at a DNA end (42). The Artemis:DNA-PKcs complex does not require Ku or any other component to carry out its endonucleolytic functions (37). Therefore, independent function of each enzymatic activity, along with iterative processing and mechanistic flexibility, are all noteworthy features of vertebrate NHEJ. S. cerevisiae NHEJ manifests mechanistic flexibility, but within a narrower range of junctional outcomes than mammalian NHEJ (24, 57).
In the context of iterative processing, the Artemis:DNA-PKcs nuclease complex is able to nick within the single-stranded portion of a gapped structure and within a bubble structure (38). Joinings where only one strand is ligated would often have a gapped configuration. Nicking of such a gap could permit nucleotides that were originally part of the arbitrarily designated left DNA end to become separated from that left end and become associated with the right DNA end. Then further nucleotide addition at the left end could separate these nucleotides from the left end. One can find potential examples of this at in vivo junctions. In other scenarios, with the flexibility of the ligase, there may be more nucleotides on the top strand than the bottom strand (42). The activated Artemis:DNA-PKcs complex can nick mismatched or bubble structures on either strand, thereby permitting additional rounds of junctional revision (38).
Based on both S. cerevisiae and mammalian in vivo NHEJ studies, the variation of the resulting junction is usually less when there is terminal microhomology at the ends (24, 57-60). This may reflect the involvement of fewer NHEJ proteins, and genetic studies support the view that not all of the NHEJ components are essential when the two DNA ends share terminal microhomology (60-66).
As mentioned, when the two DNA ends happen to share four nucleotide overhangs that are perfectly complementary, then the only component needed in purified biochemical systems is the XRCC4:DNA ligase IV (42). No nucleolytic resection or polymerase action is needed. In vitro, using purified proteins, XRCC4:ligase IV is adequate to join such a junction, and even ligase IV alone may be sufficient, all without Ku. Moreover, even ligase I or III alone is sufficient for such joins, though at a lower efficiency (42). In vivo, generation of defined DNA end configurations at DSBs is not simple, but there are two approaches that have been used. In S. cerevisiae, short oligonucleotide duplexes can be ligated onto the DNA ends of a linear plasmid, and then these can be transfected into cells (57). One can then harvest the joined circular molecules for analysis. Based on this, it is clear that the joining dependence is simplified when there is terminal microhomology at the DNA ends. A second system for generating defined DNA ends is V(D)J recombination, as mentioned earlier. These ends are not as precisely defined as in the yeast system (because the precise DNA end configuration depends on how hairpin intermediates are opened), but there is the advantage that the ends are actually generated inside the nucleus. In V(D)J recombination, the coding ends (defined in Suppl. Fig. 3) are usually configured with a four nucleotide 3' overhang. If the DNA ends are chosen to be complementary, then the dependence of the V(D)J recombination on Ku can be very minimal (60). However, NHEJ repairs such ends so as to align the microhomology in a disproportionate fraction of the joins (58, 67).
One of the strengths of NHEJ is that microhomology does not appear to be essential in mammalian cells (58, 67). The joining of incompatible DNA ends may be a key selective advantage that drove further evolutionary development of NHEJ in higher eukaryotes. The fact that some of this evolution was convergent rather than divergent further illustrates the strength of this selective advantage. Most natural DSBs generate incompatible ends with little or no microhomology within the first few nucleotides. (S. cerevisiae shows mechanistically interesting differences from mammalian NHEJ insofar as yeast are very poor at blunt end ligation and perhaps more reliant on at least one base pair of terminal microhomology (24, 25, 68).) For mammalian NHEJ joins, the most common amount of observed terminal microhomology is zero nucleotides (45, 69). The next most common is one nucleotide, and longer microhomologies are less common in proportion to their length. As mentioned above, when microhomology is present, then usage of that microhomology for a given pair of DNA ends can be dominant (57-60). Overhangs with substantial terminal microhomology are uncommon in nature and primarily are limited to regions containing repetitive DNA. In wild type cells or neoplastic cells arising in normal animals, the most common amount of terminal microhomology at NHEJ junctions is zero. Exceptions to this arise from two circumstances. First, if the experimental system being used specifically positions terminal microhomology at or near the DNA ends, then the high-microhomology outcome might be observed. Second, in animals lacking a complete wild type NHEJ system, the NHEJ process may be slower, show more resection, and may seek end alignments that are stabilized by more terminal microhomology (2 or 3 bp), as discussed in the next section.
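To make the notion of terminal microhomology concrete, the sketch below scores how many terminal nucleotides of two 3' overhangs could base-pair with each other. It is a simplified illustration only: the function name and example sequences are hypothetical, each overhang is written 5'->3' on its own strand, and only perfect Watson-Crick pairing of the extreme 3' termini is counted (no internal alignments, mismatches, or 5' overhangs).

```python
# Score terminal microhomology between two 3' overhangs.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def terminal_microhomology(left_overhang, right_overhang):
    """Largest k such that the last k nt of each overhang can pair antiparallel."""
    longest = 0
    for k in range(1, min(len(left_overhang), len(right_overhang)) + 1):
        # The 3'-terminal k nt of one overhang must be the reverse
        # complement of the 3'-terminal k nt of the other.
        if left_overhang[-k:] == revcomp(right_overhang[-k:]):
            longest = k
    return longest

print(terminal_microhomology("ACGT", "ACGT"))  # fully complementary 4-nt overhangs
print(terminal_microhomology("GGAT", "CCAT"))  # partial terminal pairing
print(terminal_microhomology("AAAA", "AAAA"))  # incompatible ends
```

In these terms, the perfectly complementary four-nucleotide overhangs discussed above score 4, whereas most naturally arising end pairs would be expected to score 0 or 1, matching the distribution described for junctions in wild type mammalian cells.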
It has been elegantly demonstrated in murine and yeast genetic studies that end joining can occur in the absence of ligase IV (57, 69-71). Insofar as the only remaining ligase activity in the cells is due to ligase I in S. cerevisiae or ligase I or III in vertebrate cells, these joinings must be done by ligase I or III. Most of these joinings rely more heavily on the use of terminal microhomology than NHEJ in wild type cells. In wild type cells, plots of end joining frequency versus microhomology length show a peak at zero nucleotides of microhomology and decline for increasing lengths (69). But for joinings without ligase IV, the peak is at 2.5 bp and the frequency declines on both sides of this peak (69-71).
In biochemical systems using purified NHEJ proteins, it has been shown that human ligase I and III are able to join DNA ends that are not fully compatible (e.g., joining across gaps in the ligated strand), though it is still substantially less efficient than joining by the XRCC4:ligase IV complex (42, 43). Though relatively inefficient, this joining by ligase I or III is somewhat more efficient with 2 or more bp of terminal microhomology to stabilize the ends. Therefore, in the absence of the ligase IV complex, it may not be surprising that the peak microhomology usage changes from zero to between 2 and 3 bp.
The in vivo joining efficiency by mammalian ligase IV relative to ligase I and III is difficult to measure. Two measurements have been done in murine cells in which the joining occurred at DNA ends that have some increased opportunities for terminal microhomology, the class switch recombination sequences. In one study, cells lacking ligase IV are removed from mice and stimulated in culture to undergo class switch recombination (70). Measurements of switch recombination can be done as early as 60 hrs after stimulation. In this case, end joining without ligase IV is reduced only 2.5-fold. In another case, a murine cell line was used to make the genetic knock out and measurements could be done as early as 24 hrs, at which time the joining without ligase IV was reduced about 9-fold (69). In both cases, the joining is almost certainly done by ligase I or III. The latter study suggests that the joining by ligase I or III is substantially less efficient at early times. In both studies, given sufficient time, the joining by ligase I or III improves to about half that of the wild type cells (where nearly all joining is likely, though not proven, to be due to ligase IV). Also, in both studies, the joining in wild type cells was much less dependent on terminal microhomology than joining in the ligase IV knock out cells. One reasonable interpretation of these two studies is that ligase IV is more efficient (and perhaps faster) at joining incompatible DNA ends in vivo, but that ligase I or III can join ends at a lower efficiency, especially when terminal microhomology can stabilize the DNA ends.
In S. cerevisiae, end joining can occur in the absence of ligase IV, but it is at least 10-fold less efficient (24). Moreover, when the joining does occur, it tends to use microhomology (usually >4 bp) that is longer and more internal to the two DNA termini than is seen for wild type yeast (57).
Ku-independent end joining also occurs in both S. cerevisiae and mammalian cells. In yeast, such events can even be as efficient as end joining in the corresponding wild type cells (24, 57). For ligase IV mutants in yeast, the joining relies on longer microhomology (usually >4 bp) that is more internal to the two DNA ends. Ku-independent end joining also can be seen in mammalian cells (60). Even in vertebrate V(D)J recombination, when the two DNA ends share 4 bp of terminal microhomology, the dependence on Ku for joining efficiency can be small (2.5-fold) (60). This indicates that terminal microhomology can substitute for the presence of Ku. We do not know with certainty what Ku-independent joining means mechanistically, but one possibility is that the ligase IV complex holds the DNA ends and Ku stabilizes the ligase IV complex. But when Ku is absent, terminal microhomology may provide some of this stability, consistent with observations in biochemical systems (42).
For some ends joined in the absence of ligase IV, no microhomology is obvious (66). This raises the question whether ligase I or III can join ends with no terminal microhomology. XRCC4:ligase IV can ligate blunt ends (72), and ligase III is also able to do this at low efficiencies (35). All three ligases can do so when macromolecular volume excluders are present (35). Nevertheless, blunt end ligation is much less efficient for all three ligases.
Another explanation of such events is that they involve template-independent synthesis by the Pol X polymerase (42). As discussed, in mammalian cells, pol mu and pol lambda participate in NHEJ (42, 73, 74), as does POL4 in S. cerevisiae (75). In Mn2+ buffers, both pol mu and pol lambda can add nucleotides template-independently, and in the more physiologic Mg2+ buffers, pol mu still shows robust template-independent addition. Such template-independent activity could permit additions to DNA ends that provide microhomology with another end; because that addition would be random, it would not have been scored as microhomology. One could consider such inapparent microhomology as polymerase-generated microhomology.
In addition to template-independent synthesis by pol mu (which pol mu exhibits alone or in the context of other NHEJ proteins), pol mu together with Ku and XRCC4:ligase IV can synthesize across a discontinuous template, and this would also generate microhomology. However, this mechanism requires that the DNA end providing the ‘template’ have a 3' overhang to permit the polymerase to extend into that end. Hence, only a subset of DNA ends could be handled in this manner.
The ratio of template-independent versus template-dependent synthesis by pol mu at NHEJ junctions in such cases is not entirely clear, but both mechanisms occur under physiologic conditions in biochemical systems, and there is clearly some evidence for the template-independent pol mu addition within mammalian cells (41).
In all organisms in which there is NHEJ, there are examples of DNA end joining in the absence of the major NHEJ ligase of that organism (76); even in mycobacteria, LigC can function in NHEJ when LigD is absent (77). Given that the ligase is regarded as the signature enzymatic requirement of NHEJ, these joining events have been proposed to be due to ‘alternative NHEJ’, or ‘backup NHEJ’. As mentioned, at most of these end joinings, there is substantial terminal microhomology. Hence, in S. cerevisiae, this joining has also been called microhomology-mediated end joining (MMEJ); however, it is essentially an alternative NHEJ.
For eukaryotic end joining, one reasonable nomenclature is that NHEJ be the general term, and that exceptions simply be noted for their exception (e.g., ligase IV-independent NHEJ, Ku-independent NHEJ, or DNA-PKcs-independent NHEJ; i.e., X-independent NHEJ, where X is the omitted protein). Until a specific pathway is delineated, this is a practical solution. It is quite conceivable (even likely) that ligase IV-independent NHEJ is merely NHEJ in which ligase I or III completes the ligation at a somewhat lower efficiency than the ligase IV complex.
Initially, NHEJ was thought to be restricted to eukaryotes because the best-studied prokaryote, E. coli, cannot recircularize linear plasmids. However, when bioinformatists discovered a distantly diverged Ku-like gene in prokaryotic genomes, the existence of a similar NHEJ pathway in bacteria became clear (2, 78, 79). The bacterial Ku homologue appears to form a homodimer with a structure similar to the ring-shaped eukaryotic Ku heterodimer (80). The gene for an ATP-dependent ligase named LigD was typically found to be adjacent to the Ku gene on the bacterial chromosome (2, 78). This linkage between Ku and an ATP-dependent ligase prompted further extensive studies and later defined a bacterial NHEJ pathway. In most bacterial species, unlike the eukaryotic NHEJ ligase IV, LigD is a multidomain protein that contains three components within a single polypeptide: a polymerase (POL) domain, a phosphoesterase (PE) domain, and a ligase (LIG) domain (23).
Why do not all bacteria have an NHEJ pathway? Bacterial NHEJ is nonessential under conditions of rapid proliferation because homologous recombination is active and a duplicate genome is present to provide homology donors (23, 81). Those bacteria that have the NHEJ pathway spend much of their life cycle in stationary phase, at which point HR is not available for DSB repair for lack of homology donors. In addition, desiccation and dry heat are two naturally occurring physical processes that produce substantial numbers of DSBs in bacteria. Therefore, bacterial Ku and LigD are present in species that often form endospores, because NHEJ is important for repair of DSBs arising during long periods of sporulation.
Each of the individual NHEJ proteins carries an interesting detailed functional and structural literature, and more detailed reviews of each individual component are cited.
Ku was named based on protein gel mobilities (actually 70 kDa and 83 kDa) of an autoantigenic protein from a scleroderma patient with the initials K.U. Ku86 is also equivalently called Ku80. The toroidal shape of Ku is consistent with studies showing that purified Ku can bind at DNA ends but also slide internally at higher Ku concentrations (82). Ku can only load and unload at DNA ends. When linear molecules bearing Ku are circularized, the Ku proteins are trapped on the circular DNA. A minimal footprint size for Ku is ~14 bp at a DNA end (83). The key aspects of Ku in NHEJ have been discussed earlier, and the reader is referred to detailed reviews about Ku for additional information (84).
DNA-PKcs has a molecular weight of 469 kDa and is 4128 aa. It is the largest protein kinase in biology, and the only one that is specifically activated by binding to duplex DNA ends of a wide variety of end configurations (33, 85-87). DNA-PKcs alone has an equilibrium dissociation constant of 3 × 10⁻⁹ M for blunt DNA ends, and this tightens to 3 × 10⁻¹¹ M when Ku is also present at the end (88). Once bound, DNA-PKcs acquires serine/threonine kinase activity (89). But its initial phosphorylation target seems to be itself, with more than 15 autophosphorylation sites and probably an equal number yet to be defined (90). In addition to the relationship with Artemis discussed earlier, DNA-PKcs interacts with XRCC4 and phosphorylates (91) a very long list of proteins in vitro (26). In vivo evidence for functional effects of those additional protein phosphorylation targets is limited.
The best current structural information concerning DNA-PKcs alone is at 7 Å resolution by cryo-EM (92). At this resolution α-helices are resolved, but this cryo-EM structure only contains a fraction of the total number of α-helical densities expected and therefore could not definitively reveal which portions of the structure are related to the primary amino acid sequence of DNA-PKcs. The ‘crown’ in that structure is thought to contain the FAT domain and possibly parts of the kinase domain (Fig. 4b)(92). The ‘base’ in that structure is the same as ‘proximal claws 1 and 2’ in a cryo-EM structure of Ku:DNA-PKcs:DNA by another group and was shown to contain HEAT-like repeats at 7 Å resolution (93). In the Ku:DNA-PKcs:DNA and DNA-PKcs:DNA structures, the path of the duplex DNA is not entirely certain, and it is not clear which side of Ku is bound to DNA-PKcs (93, 94). Positioning of the C-terminal portion of Ku when bound to DNA-PKcs is also not determined, which is important because this interaction activates DNA-PKcs and is defined at the primary sequence level (46). Continued work using cryo-EM and other structural methods will undoubtedly be of great value.
It is not clear whether DNA-PKcs remains bound to the DNA ends throughout all processing steps of NHEJ (31, 95). Phosphorylation at the ABCDE cluster appears to increase the ability of other proteins, such as ligases, to gain access to the DNA ends, suggesting that DNA-PKcs may dissociate more readily after autophosphorylation at these sites (90, 95, 96).
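The two dissociation constants quoted above can be put side by side with a short calculation. The sketch below assumes simple 1:1 equilibrium binding with free protein in excess, so that the fraction of occupied ends is [P]/([P] + Kd). That binding model, the function name, and the 1 nM example concentration are illustrative assumptions and are not taken from the cited measurements.

```python
# Dissociation constants quoted in the text (88), in molar units.
KD_ALONE_M = 3e-9      # DNA-PKcs alone at a blunt DNA end
KD_WITH_KU_M = 3e-11   # DNA-PKcs when Ku is also present at the end


def fraction_bound(free_protein_m: float, kd_m: float) -> float:
    """Fraction of DNA ends occupied, assuming simple 1:1 equilibrium binding."""
    return free_protein_m / (free_protein_m + kd_m)


# How much tighter the binding becomes when Ku is present.
fold_tighter = KD_ALONE_M / KD_WITH_KU_M

# Hypothetical free DNA-PKcs concentration, chosen only for illustration.
example_conc_m = 1e-9

print(f"Ku tightens binding roughly {fold_tighter:.0f}-fold")
print(f"Occupancy without Ku: {fraction_bound(example_conc_m, KD_ALONE_M):.2f}")
print(f"Occupancy with Ku:    {fraction_bound(example_conc_m, KD_WITH_KU_M):.2f}")
```

Under these assumptions the hundredfold difference in Kd means that a protein concentration that leaves most ends unoccupied without Ku would occupy nearly all ends when Ku is present.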
DNA-PKcs interaction with other proteins is also important. As mentioned, DNA-PKcs is critical for the endonucleolytic activities of Artemis (31, 37, 39, 97, 98). Activated DNA-PKcs stimulates the ligase activity of XRCC4:DNA ligase IV (90, 95, 96). Interestingly, presence of XRCC4:DNA ligase IV stimulates the autophosphorylation activity of DNA-PKcs (96). Hence, DNA-PKcs may be critical for the nucleolytic step, but also stimulatory for the ligation step.
The Artemis:DNA-PKcs complex has 5' endonuclease activity with a preference to nick a 5' overhang so as to leave a blunt duplex end (37). The Artemis:DNA-PKcs complex also has 3' endonuclease activity with a preference to nick a 3' overhang so as to leave a 4 nt 3' overhang. In addition, the Artemis:DNA-PKcs complex has the ability to nick perfect DNA hairpins at a position that is 2 nts past the tip. These three seemingly diverse endonucleolytic activities at single- to double-strand DNA transitions are similar to one another if one infers the following model for binding of the Artemis:DNA-PKcs complex to DNA (Suppl. Fig. 1)(37). The complex appears to localize to a 4 nucleotide stretch of single-stranded DNA adjacent to a single-/double-strand transition, and then nick on the 3' side of that 4 nucleotide region. This would explain why 5' overhangs are preferably removed to generate a blunt DNA end, but 3' overhangs are nicked so as to preferably leave a 4 nucleotide 3' overhang. Moreover, it explains why a hairpin is nicked not at the tip, but 2 nucleotides 3' of the tip (37). In perfect DNA hairpins, the last two base pairs do not form well, which means the tip is actually similar in many ways to a 4 nucleotide single-stranded loop. Artemis nicks the hairpin on the 3' side of that loop. The opened hairpin then becomes a 3' overhang of 4 nucleotides.
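The unifying model described above (occupy 4 nt of single-stranded DNA next to the single-/double-strand transition, then nick on the 3' side of that stretch) can be written as a small lookup of preferred products. This is only a sketch of the preferred outcomes stated in the text; the class and function names are hypothetical, and the choice to leave 3' overhangs of 4 nt or shorter untouched is an assumption made here for simplicity.

```python
from dataclasses import dataclass

# Length of the single-stranded stretch the complex is inferred to occupy.
FOOTPRINT_NT = 4


@dataclass(frozen=True)
class DnaEnd:
    kind: str             # "5_overhang", "3_overhang", "hairpin", or "blunt"
    overhang_nt: int = 0  # overhang length in nt (0 for blunt ends and hairpins)


def artemis_preferred_product(end: DnaEnd) -> DnaEnd:
    """Return the preferred product of Artemis:DNA-PKcs under the 4-nt model."""
    if end.kind == "5_overhang":
        # Read 5'->3', the overhang runs into the duplex, so the 3' side of the
        # occupied stretch is the transition itself: the overhang is removed.
        return DnaEnd("blunt", 0)
    if end.kind == "3_overhang":
        # Assumption: overhangs no longer than the footprint are left unchanged.
        if end.overhang_nt <= FOOTPRINT_NT:
            return end
        # The nick falls 4 nt into the overhang, leaving a 4-nt 3' overhang.
        return DnaEnd("3_overhang", FOOTPRINT_NT)
    if end.kind == "hairpin":
        # The poorly paired tip behaves like a 4-nt loop; nicking on its 3' side
        # (2 nt past the tip) opens the hairpin into a 4-nt 3' overhang.
        return DnaEnd("3_overhang", FOOTPRINT_NT)
    if end.kind == "blunt":
        return end
    raise ValueError(f"unknown end type: {end.kind}")


for example in (DnaEnd("5_overhang", 10), DnaEnd("3_overhang", 15), DnaEnd("hairpin")):
    print(example, "->", artemis_preferred_product(example))
```

The three substrate classes reduce to the same rule, which is the point of the model: only the position of the single-stranded stretch relative to the duplex differs.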
In V(D)J recombination, null mutants of DNA-PKcs and of Artemis are very similar (63, 99-101). Both result in failure to open the DNA hairpins, but signal ends are joined. Biochemically, when a purified complex of Artemis:DNA-PKcs binds to an individual DNA hairpin molecule, that hairpin can activate the kinase activity of that DNA-PKcs protein (in cis) to phosphorylate itself and the bound Artemis within the C-terminal portion (96, 102). With respect to hairpin opening, and its other endonucleolytic activities, Artemis:DNA-PKcs functions as if it were a heterodimer in which mutation of either subunit results in failure of DNA end processing. A recent DNA-PKcs point mutation in a patient supports that view (103), as does a murine knock-in model that recreates a truncation mutant of Artemis that removes the C-terminal portion where the sites of DNA-PKcs phosphorylation are located.
Polymerase mu has several remarkable activities under physiologic buffer conditions. First, it can carry out template-dependent synthesis with dNTP and rNTP, and it has substantial template-independent synthesis capability, like TdT (104). No other higher eukaryotic polymerase has this range of activities. The ability to add rNTP may be important for NHEJ during G1, when dNTP levels are low, but rNTP levels are high (74). Incorporation of U into the junction might then mark the junction for possible revision using uracil glycosylases at a later point in time. (The highly homologous POL4 of S. cerevisiae also efficiently incorporates rNTPs (105).) Interestingly, the bacterial polymerase for NHEJ (part of LigD) has the ability to incorporate ribonucleotides as well (23). Second, like many error-prone polymerases, polymerase mu can slip on the template strand (48, 51, 52). Third, and as mentioned earlier, pol mu, when together with Ku and XRCC4:DNA ligase IV, can polymerize across a discontinuous template strand, essentially crossing from one DNA end to another (106, 107). Fourth, and also mentioned earlier, pol mu has template-independent activity, which pol mu exhibits whether alone or together with Ku and XRCC4:DNA ligase IV (42).
Both the template-independent and the discontinuous template polymerase activities are likely to be of great importance in the joining of two incompatible DNA ends. For example, in the case of two blunt DNA ends, the TdT-like activity of pol mu allows pol mu to add random nucleotides to each end. As soon as the resulting short 3' overhangs share even one nucleotide of complementarity (polymerase-generated microhomology), then ligation is much more efficient (42). (This type of microhomology would not have been present in the two original DNA end sequences, and in that sense, one could refer to it as polymerase-generated microhomology.) In contrast, in the mechanism where pol mu (with Ku and XRCC4:ligase IV present) crosses from one DNA end to the other (template-dependent synthesis across a discontinuous template strand), the duplex end receiving the new synthesis must be a 3' overhang to permit extension of the incoming polymerase (107). Hence, there are two mechanisms by which polymerase mu can create microhomology during the joining process (and these ‘reaction intermediates’ would not be scored as microhomology events based merely on the final DNA sequence of the junctional product).
Structural studies of polymerase mu, TdT and pol lambda are defining the basis for the intriguing differences between these three highly-related DNA polymerases (104). A region called loop 1 (and other positions, such as H329) is important for substituting for the template strand as TdT (always) and pol mu (sometimes) polymerize in their template-independent mode (108, 109). Importantly, the crystal structures are on single-strand break DNA, and hence, we do not know how these enzymes configure on DSBs.
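The idea of polymerase-generated microhomology at two blunt ends can be illustrated with a toy simulation. The sketch below assumes that one random nucleotide is added to each 3' end per round, with equal base probabilities, and that the process stops once the two 3' termini can pair over at least one nucleotide. The function names, the round limit, and the uniform base choice are illustrative assumptions, not measured properties of pol mu.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def terminal_microhomology(left_overhang: str, right_overhang: str) -> int:
    """Longest stretch over which the 3' termini of two 3' overhangs can pair.

    Both overhangs are written 5'->3'. Because the strands anneal antiparallel,
    the last k nt of one must equal the reverse complement of the last k nt
    of the other.
    """
    best = 0
    for k in range(1, min(len(left_overhang), len(right_overhang)) + 1):
        if left_overhang[-k:] == reverse_complement(right_overhang[-k:]):
            best = k
    return best


def extend_blunt_ends(rng: random.Random, max_rounds: int = 20) -> tuple[str, str]:
    """Add random nucleotides to two blunt ends until their termini can pair."""
    left, right = "", ""
    for _ in range(max_rounds):
        left += rng.choice("ACGT")
        right += rng.choice("ACGT")
        if terminal_microhomology(left, right) >= 1:
            break
    return left, right


left, right = extend_blunt_ends(random.Random(0))
print(left, right, terminal_microhomology(left, right))
```

Because the added nucleotides are absent from the original end sequences, any pairing found this way would not be scored as microhomology from the final junction sequence alone, which is the point made in the text.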
Mouse in vivo systems, crude extract NHEJ studies, and purified NHEJ systems support a role for pol lambda in NHEJ (41, 73, 110). Pol lambda functions primarily in a standard template-dependent manner in Mg2+ buffers, but has template-independent activity in Mn2+ (48, 104). The lyase domain in pol lambda is functional, whereas the one in pol mu does not appear to be functional. This permits pol lambda to function after action by a glycosylase to remove a damaged base. Like pol mu, pol lambda also slips on the template strand.
TdT or terminal transferase is only expressed in pro-B/pre-B and pro-T/pre-T stages of lymphoid differentiation (111). Like the other two Pol X polymerases of NHEJ, TdT has an N-terminal BRCT domain. (Polymerase beta is the only Pol X polymerase that is not involved in NHEJ, and it lacks any BRCT domain.) TdT only adds in a template-independent manner, consistent with a different loop 1 from pol mu and pol lambda (104). TdT has a preference to stack the incoming dNTP onto the base at the 3'OH, accounting for its tendency to add runs of purines or runs of pyrimidines (45). TdT also has a lower Km for dGTP, and this also biases its template-independent synthesis in vitro and in vivo (111, 112).
DNA ligase IV (also called ligase 4 or DNL4) is mechanistically flexible. In the absence of XRCC4, DNA ligase IV appears to still be capable of ligating not only nicks, but even compatible (4 nt overhang) ends of duplex DNA (72). With XRCC4, ligase IV is able to ligate ends that share 2 bp of microhomology and have 1 nt gaps, but addition of Ku improves this 10-fold (42). When Ku is present, XRCC4:DNA ligase IV is able to ligate even incompatible DNA ends at low efficiency (42). When XLF is also added, then XLF:XRCC4:DNA ligase IV, in the presence of Ku, can ligate incompatible DNA ends much more efficiently (43, 113).
Even one nucleotide of terminal microhomology markedly increases the efficiency of ligation by Ku plus XRCC4:DNA ligase IV (43, 113). But some junctions formed within cells have no apparent microhomology (45). These could be cases where Ku plus XLF:XRCC4:DNA ligase IV ligate incompatible DNA ends or blunt ends. As mentioned earlier, pol mu may add either template-independently or across a discontinuous template strand from the left to the right DNA end, and either of these mechanisms would not be scored as use of microhomology upon inspection of the sequence of the joined product junction. As mentioned, one could call this polymerase-generated microhomology, and it is basically a reaction intermediate.
DNA ligase IV is predominately pre-adenylated as it is purified from crude extracts. The reader is referred to specialty reviews for more details (114).
XRCC4 can tetramerize by itself, but it is unclear if this serves a function (115). The crystal structure demonstrates a globular head domain and a coiled coil C-terminus when it forms a dimer (116, 117).
The crystal structure of XLF (Cernunnos) suggests a structure similar to XRCC4, with a globular head domain and a coiled-coil C-terminus where multimerization is driven (118, 119). When XLF is missing in humans, patients are IR sensitive and lack V(D)J recombination (120, 121). In mice, the IR defect is the same as in humans, but the V(D)J recombination defect is less severe in pre-B cells and yet is severe in mouse embryonic fibroblasts (when given exogenous RAGs) from the same mice (122). Considering the biochemical role of XLF in the joining of incompatible DNA ends, it has been suggested that TdT in the pre-B cells can provide ‘occult’ or polymerase-generated microhomology, making joining less reliant on XLF, and this seems like a reasonable explanation of the data thus far (122).
The interactions between XLF, XRCC4 and DNA ligase IV have been studied genetically and biochemically (120, 121, 123, 124). Gel filtration studies of XLF, XRCC4 and DNA ligase IV are most consistent with a stoichiometry of 2 XLF, 2 XRCC4 and 1 ligase IV (120). Complexes of XRCC4 and ligase IV are most consistent with a stoichiometry of 2 XRCC4 and 1 DNA ligase IV (115, 117). Further functional and structural work on the ligase complex will be of great value.
For both S. cerevisiae and mammalian purified proteins, Ku is able to improve the binding of XRCC4:DNA ligase IV at DNA ends. This interaction requires both Ku70 and Ku86 and the first BRCT domain within the C-terminal portion of ligase IV (aa 644 to 748) (91). The presence of DNA-PKcs enhances this complex formation, perhaps through interactions with XRCC4 (125-127). XRCC4:DNA ligase IV is able to stimulate DNA-PKcs kinase activity (96). The ligase complex also stimulates the pol mu and lambda activities in the context of Ku (96). All of these findings suggest that the NHEJ components, while capable of acting independently, also evolved to function in a manner that is synergistic when in close proximity.
PNK, APTX and PALF (also called APLF) all interact with XRCC4 (Figs. 2 and 4f). PNK and XRCC4 form a complex via the PNK FHA domain, but only after the CK2 kinase phosphorylates XRCC4 (128). This same interaction occurs between APLF and XRCC4, and APTX and XRCC4.
For pathologic breaks caused by ionizing radiation or free radicals, PNK plays an important role in several ways that illustrate a corollary theme of NHEJ: enzymatic multifunctionality (129-131). Mammalian PNK is both a kinase and a phosphatase. PNK has a kinase domain for adding a phosphate to a 5'OH. PNK has a phosphatase domain that is important for removing 3' phosphate groups, as can remain after some oxidative damage or partial processing (or after NEIL1 or NEIL2 removes an abasic sugar, leaving a 3' phosphate group). Interestingly, the short 3' overhang that the Artemis:DNA-PKcs complex leaves after cleaving a long 3' overhang represents an ideal substrate for PNK to add a 5' phosphate at a recessed 5'OH.
Oxidative damage often causes breaks that leave a 3'-phosphoglycolate group, and these can be removed in either of two major ways. First, Artemis:DNA-PKcs can remove such groups using its 3' endonucleolytic activity (39, 98). Second, 3'-phosphoglycolates can be converted to 3'-phosphate by tyrosyl DNA phosphodiesterase 1 (Tdp1), whose major role is the removal of tyrosyl-phosphate linkages that arise when topoisomerase fails to religate transient DNA single-strand break reaction intermediates. Then PNK can remove the 3'-phosphate group.
PALF and APLF are the same protein (511 aa, 57 kDa). The PALF designation stands for PNK and APTX-like FHA protein (134). APLF stands for aprataxin- and PNK-like factor (135, 136). Previously, it was also called C2orf13. APLF is an endonuclease and a 3' exonuclease (134). This is interesting, given that Artemis lacks a 3' exonuclease.
V(D)J recombination is one of the two physiologic systems for creating intentional DSBs in somatic cells, specifically in early B or T cells for the purpose of generating antigen receptor genes. RAG1 and RAG2 (both expressed in early B and T cells only) form a complex that can bind sequence-specifically at recombination signal sequences (RSS) that consist of a heptamer and nonamer consensus sequence, separated by either 12 or 23 bp of nonconserved spacer sequence (Suppl. Fig. 3 or sidebar figure). (HMGB1 or 2 is thought to be part of this RAG complex, based on in vitro studies (137).) A given recombination reaction requires two such RSS sites, one 12-RSS and one 23-RSS (the 12/23 rule). The RAG complex initially nicks directly adjacent to each RSS and then uses the 3'OH at that nick as a nucleophile to attack the anti-parallel strand at each of the non-RSS ends (138). The two non-RSS ends are called coding ends because these regions join to encode a new antigen receptor exon. The nucleophilic attack generates a DNA hairpin at each of the two coding ends. The NHEJ proteins take over at this point, beginning with the opening of the two hairpins by Artemis:DNA-PKcs, followed by NHEJ joining (37). Like vertebrate NHEJ generally, most coding ends do not share significant terminal microhomology (45, 139). The NHEJ junctions formed in V(D)J recombination have proven to be useful for understanding NHEJ more generally.
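The 12/23 rule stated above is simple enough to express as a check on spacer lengths. The sketch below models an RSS only by its spacer length and ignores the heptamer and nonamer sequences; the class name, the function name, and the segment labels are hypothetical and serve only to illustrate the rule.

```python
from dataclasses import dataclass

# Spacer lengths (bp) named in the text for recombination signal sequences.
ALLOWED_SPACERS_BP = (12, 23)


@dataclass(frozen=True)
class RSS:
    label: str      # hypothetical label for the flanked gene segment
    spacer_bp: int  # length of the nonconserved spacer

    def __post_init__(self) -> None:
        if self.spacer_bp not in ALLOWED_SPACERS_BP:
            raise ValueError(f"spacer must be 12 or 23 bp, got {self.spacer_bp}")


def obeys_12_23_rule(first: RSS, second: RSS) -> bool:
    """True when the pair consists of one 12-RSS and one 23-RSS."""
    return {first.spacer_bp, second.spacer_bp} == {12, 23}


pairs = [
    (RSS("segment A", 12), RSS("segment B", 23)),
    (RSS("segment A", 12), RSS("segment C", 12)),
]
for first, second in pairs:
    print(first.label, second.label, obeys_12_23_rule(first, second))
```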
The DSBs at the two RSS ends are called signal ends, and these are blunt and 5' phosphorylated (140, 141). In cells that express terminal transferase, nucleotide addition can occur at these ends (142). But these ends only rarely suffer nucleolytic resection, presumably due to tight binding by the RAG complex (138). Joining of the two signal ends together to form a signal joint is also reliant on Ku and the ligase IV complex, but not dependent on Artemis or DNA-PKcs (65, 138). (The fact that DNA-PKcs is required for coding joint formation (for Artemis:DNA-PKcs opening of hairpins), but not for signal joint formation was a point of importance in the original description of scid mice (143). Scid mice have a mutant DNA-PKcs gene (144). Artemis null mice behave similarly in this respect (63).)
Class switch recombination (CSR) occurs only in B cells, after they have already completed V(D)J recombination. It is the second of the two physiologic forms of DSB formation in somatic cells (64). CSR is essential for mammalian B cells to change their immunoglobulin heavy chain gene from producing Igμ for IgM to Igγ, Igα or Igε for making IgG, IgA, or IgE, respectively (Suppl. Fig. 4). The process requires a B-cell specific cytidine deaminase called activation-induced deaminase (AID) which only acts to convert C to U within regions of single-stranded DNA (ssDNA). In mammalian CSR, the single-strandedness appears to be largely due to formation of kilobase-length R-loops that form at specialized CSR switch sequences due to the extremely (40-50%) G-rich RNA transcript that is generated at these specialized recombination zones (145, 146). This permits AID action on the nontemplate DNA strand. RNase H can resect portions of the RNA strand that pairs with the template strand, thereby exposing regions of ssDNA for AID action on that strand as well. Once AID introduces C to U changes in the switch region, then uracil glycosylase converts these to abasic sites, and APE1 can, in principle, nick at these abasic sites. Participation of other enzymes, such as Exo1, may assist in converting the nicks on the top and bottom strands into large overlapping gaps, resulting in DSBs. NHEJ is largely responsible for joining these DSBs, but, as mentioned, elegant work has demonstrated the role of either ligase I or III, when ligase IV is missing (70).
Chromosomal translocations and genome rearrangements can occur in somatic cells, most notably in cancer. In addition, such genome rearrangements can occur in germ cells, giving rise to heritable genome rearrangements. Though the breakage mechanisms vary, the joining mechanism is usually via NHEJ.
The vast majority of genome rearrangement-related DSBs (translocations and deletions) in neoplastic cells are joined by NHEJ, though there is ample opportunity for participation of alternative ligases, if ligase IV is missing, as in experimental systems or extremely rare patients (66, 70). The breakage mechanisms in neoplastic cells include the following: random or near-random breakage mechanisms (due to ROS, ionizing radiation, or topoisomerase failures) in any cell type and V(D)J-type or CSR-type breaks in lymphoid cells (147). The lymphoid-specific breakage mechanisms can combine antigen-receptor loci with off-target loci at sequences that are similar to the RSS or CSR sequences (18, 148). In some lymphoid neoplasms, two off-target loci are recombined, and the breakage at each of the two sites can occur by any of the above mechanisms. Sequential action by AID followed by the RAG complex at CpG sites appears likely in some of the most common breakage events (called CpG-type events) in human lymphoma (147). In both CSR-type and CpG-type breaks, AID requires single-stranded DNA to initiate C to U or meC to T changes, respectively. Departures from B-form DNA are relevant to such sites (149, 150).
The breakage mechanisms in germ cells are presumably due to random causes primarily (ROS, ionizing radiation, or topoisomerase failures). Deviations from B-DNA are known to be relevant at long inverted repeats where the most common constitutional translocations occur. The most common constitutional chromosomal rearrangement is the t(11;22) in the Emanuel Syndrome (151). In this case, inverted repeats result in cruciform formation, creating a DNA structure that is vulnerable to DNA enzymes that can act on various portions of the cruciform. Once broken, the DNA ends are likely joined by NHEJ, based on observed junctional sequence features.
During evolution, some of the chromosomal rearrangements that arise during speciation are almost certain to share themes with those discussed here, including breakage at sites of DNA structural variation and joining by NHEJ. Replication-based mechanisms are also likely to be very important for major genomic rearrangements (152, 153).
It is not yet clear how much disassembly of histone octamers must occur at a DSB for NHEJ proteins to function. In contrast to homologous recombination (HR), where kilobases of DNA are involved and γ–H2AX alterations are important, NHEJ probably requires less than 30 bp of DNA on either side of a break.
If randomly distributed, 80% of DSBs would occur on DNA that is wrapped around histone octamers and 20% would occur internucleosomally. For those breaks within a nucleosome, one study showed that Ku can bind, implying that the duplex DNA can separate from the surface of the nucleosome sufficiently to permit Ku to bind (154).
Several studies propose that γ–H2AX is important for NHEJ (16, 155). Much of the evidence is based on immunolocalization studies where the damage site may contain a mixture of HR and NHEJ events within the 2000 angstrom confocal microscope section thickness. Differences in access within the euchromatic versus the heterochromatic regions are likely, but even early genetic insights concerning this are limited to yeast (156).
H2AX is only present, on average, in one of every ten nucleosomes because H2A is the predominant species in histone octamers (16). Therefore, most DSBs would occur about 5 nucleosomes away (about 1 kb) from the nearest octamer containing an H2AX that is eligible for conversion to γ–H2AX via phosphorylation by ATM or DNA-PKcs at serine 139 of H2AX. Given this substantial distance from the site of the enzymatic repair, it is not clear that such γ–H2AX phosphorylation events are critical for NHEJ.
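The distance estimate above follows from simple arithmetic, sketched below. The one-in-ten H2AX frequency and the figure of less than 30 bp needed by NHEJ come from the text; the nucleosome repeat length of 200 bp and the assumption that H2AX-containing nucleosomes are evenly spaced are simplifications introduced here, so the result should be read as an order-of-magnitude estimate only.

```python
# From the text: about one nucleosome in ten carries H2AX.
NUCLEOSOMES_PER_H2AX = 10
# From the text: NHEJ probably needs less than 30 bp on either side of a break.
NHEJ_FOOTPRINT_BP = 30
# Assumption (not from the text): DNA per nucleosome, including linker.
REPEAT_LENGTH_BP = 200


def half_spacing_nucleosomes(nucleosomes_per_h2ax: int) -> float:
    """Half the spacing between evenly spaced H2AX-containing nucleosomes."""
    return nucleosomes_per_h2ax / 2


def distance_bp(n_nucleosomes: float, repeat_bp: int = REPEAT_LENGTH_BP) -> float:
    """Convert a distance in nucleosomes to base pairs."""
    return n_nucleosomes * repeat_bp


half_spacing = half_spacing_nucleosomes(NUCLEOSOMES_PER_H2AX)
distance = distance_bp(half_spacing)
print(f"Half-spacing: {half_spacing:.0f} nucleosomes, about {distance:.0f} bp")
print(f"That is roughly {distance / NHEJ_FOOTPRINT_BP:.0f} times the NHEJ footprint")
```

With these inputs the half-spacing is 5 nucleosomes, or about 1 kb, which is far larger than the stretch of DNA that the NHEJ enzymes themselves are thought to require.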
When DNA-PKcs does phosphorylate H2AX, this increases vulnerability of H2AX to the histone exchange factor called FACT (which consists of a heterodimer of Spt16 and SSRP1). Phosphorylated H2AX (γ-H2AX) is more easily exchanged out of the octamer, thereby leaving only a tetramer of (H3)2(H4)2 at the site, and this is more sterically flexible, thereby perhaps permitting DNA repair factors to carry out their work (157).
PARP-1 is able to downregulate the activity of FACT by ADP-ribosylation of the Spt16 subunit of FACT. This may be able to shift the equilibrium of γ-H2AX and H2AX in the nucleosomes. That is, PARP-1 activation at a site of damage might shift the equilibrium toward retention of γ-H2AX in the region, perhaps thereby aiding in recruitment or retention of repair proteins (157).
Hence, FACT may initially act proximally at the most immediate nucleosome (or closest one) to exchange γ-H2AX out and leave an (H3)2(H4)2 tetramer at the site of damage for purposes of flexibility of the DNA. FACT may act more regionally (distally) to favor the retention of γ-H2AX for purposes of integrating the repair process with repair protein recruitment, protein retention, and cell cycle aspects (157).
Mechanistic flexibility by multifunctional enzymes and iterative processing of each DNA end are themes that apply to all NHEJ across billions of years of prokaryotic and eukaryotic evolution. Because much of this evolution was convergent, it illustrates that these themes are important for solving this particular biological problem: joining of heterogeneous DNA ends at double-strand breaks.
In biochemical systems, XRCC4:DNA ligase IV does not appear to require a dedicated protein to help it bring two DNA ends together. This is especially clear when there are 4 bp of terminal microhomology, in which case, addition of Ku does not markedly stimulate joining. However, at 2 bp or less of terminal microhomology, Ku does improve XRCC4:DNA ligase IV ligation. This could occur because Ku is known to stabilize XRCC4:DNA ligase IV at DNA ends, rather than for any intrinsic ability of Ku to bring DNA ends together (which appears to be minimal for Ku alone). Some data suggest that DNA-PKcs might help bring DNA ends together, but this is seen only at 30 mM monovalent salt or less and was not observed at higher salt concentrations. Nevertheless, this is an active area of research and is subject to further definition.
This issue is relevant to whether the two DNA ends generated at a single DSB (proximal) are joined more readily than two DNA ends that arise far apart (as in a chromosomal translocation where two DSBs far apart are involved). The issue of whether close DNA ends are joined more efficiently than ends that are far apart is a point of active study.
NHEJ at a single DSB may be so rapid and physically confined that the damage response pathways (DDR) involving ATM, the RAD50:MRE11:NBS1 complex, γ-H2AX, and 53BP1 are not activated, but this is quite unclear and subject to speculation. Experimentally or with environmental extremes, a cell may be challenged with many DSBs, in which case activation of the DDR pathways is increasingly likely. As these activate, the impact on the enzymology of NHEJ is not entirely clear. Obviously, competition between HR and NHEJ components may be one aspect that arises. With experimental systems in which each cell has hundreds or thousands of DSBs, titration out of NHEJ components becomes a concern, and at that point it is not clear how much such extreme experimental systems inform us about the typical situation for which NHEJ evolved, which is presumably one or a few DSBs per cell. This all assumes that cells with hundreds of DSBs are optimally allowed to die rather than recover, and death would seem best in a multicellular organism irradiated at high levels. This is an active area of research and subject to continuing study.
Are there additional NHEJ enzymatic components missing? Like most repair pathways, vertebrate NHEJ already has nuclease, polymerase, and ligase activities. Are there additional enzymes missing? Each new candidate must demonstrate some genetic role such as reduced IR resistance when missing. In addition, biochemical studies with purified proteins are important.
The Werner's 3' exonuclease/helicase enzyme has been proposed as one candidate, but the IR-sensitivity data fail to show a large effect (158). WRN does interact with Ku and PARP1, but it has been proposed that this may reflect a role in replication fork repair rather than NHEJ, and this seems reasonable (159). Nevertheless, further work on this and any other new NHEJ candidates will be interesting.
Metnase has been proposed as a possible NHEJ nuclease and helicase, but it also has decatenating activity (160, 161). Metnase is present in humans but not in apes, mice, or apparently any other vertebrates, and there is no yeast homologue. Moreover, there is no genetic knockout to demonstrate a role in NHEJ. The NHEJ biochemical studies have not been done with fully purified protein. Further work may define the nuclear process to which Metnase contributes in human cells, but it is unlikely to be a key role in NHEJ, given its absence from organisms other than humans.
One major online translocation database is GRABD: http://archive.uwcm.ac.uk/uwcm/mg/grabd/.
The Atlas of Genetics and Cytogenetics in Oncology and Haematology is an on-line journal and database. Some chromosomal translocation data is available on the web site: http://atlasgeneticsoncology.org//
A chromosomal translocation database related to the citation Cell 135: 1130-42 is at: http://lieber.usc.edu/data/2008_cell_135_1130/.
A small subset of p53 mutations may be due to DSB followed by NHEJ. All p53 mutations can be found at the IARC p53 mutation database: http://www-p53.iarc.fr.
I thank T. E. Wilson, D. Williams, W. An, N. Adachi, and K. Schwarz for comments. I thank Jiafeng Gu and Xiaoping Cui, as well as other current members of my lab for comments. I apologize to those not cited due to length restrictions.