The NHEJ pathway is over 3 billion years old and is found in bacteria, archaea, and eukaryotes. Despite the ancient conservation of the pathway, we have identified five NHEJ genes that have evolved under positive selection during the evolution of simian primates: NBS1, and POLλ. An analysis of polymorphism data supports positive selection of XRCC4 in modern humans as well. Interestingly, the yeast ortholog of NBS1 (XRS2) was also identified as one of the two Saccharomyces NHEJ genes with the most extreme signatures of positive selection. One hypothesis is that these signatures of positive selection reflect natural selection for more efficient DNA repair. Under this model, as certain NHEJ components evolve, compensatory mutations may arise in other NHEJ components to re-optimize protein-protein interactions between the various components. We consider this model unlikely. In the absence of an antagonizing force, there is no reason that recurrent adaptive change should be required of any member of this pathway, which would then need to be followed by compensatory change. Four observations from our study additionally argue against this model. First, our biochemical experiments with Nbs1 suggest that positive selection of at least one of the three sites identified has not altered interactions with other repair proteins. Second, although there are several core complexes involved in NHEJ (the MRN complex and the Lig4/XRCC4/XLF complex), only one component of each of these was identified as evolving under positive selection. Third, the clustered sites of positive selection in XRCC4 fall within the C-terminal protein domain that is not essential for DNA repair. Fourth, the positive selection of the NHEJ pathway is not a primate-specific phenomenon, but is also found in Saccharomyces, arguing against a model where some novel role for DNA repair during primate evolution has driven this selection.
The finding of multiple primate NHEJ components evolving under positive selection, supported by parallel findings in Saccharomyces yeast, indicates a systematic perturbation of the NHEJ pathway. Because positive selection is observed in two highly divergent eukaryotic clades, a model for the cause of this rapid evolution must apply across such diverse species groups. We propose that NHEJ genes may be antagonized by genetic parasites, which in primates comprise viruses and retrotransposons.
Proteins of the NHEJ repair pathway have been shown to act as antiviral factors in the lifecycle of human adenovirus, a linear double-stranded DNA virus. Adenoviruses are a major cause of upper respiratory and other infections in humans. During infection, components of the NHEJ pathway join together viral genome ends, causing “dead-end” viral genome concatenation. To counteract this antiviral tactic, adenovirus proteins (encoded by the E4 genes) sequester and target for degradation a number of components of the NHEJ pathway, including components of the Mre11/Rad50/Nbs1 and Lig4/XRCC4/XLF complexes. CtIP has also been implicated in the adenovirus lifecycle through its interaction with the adenovirus early region 1A (AdE1A) protein. If primate NHEJ genes are continually selected to encode variants that can evade interaction with these adenoviral antagonists, while the viral antagonists continually counter-evolve, this could drive positive selection of primate NHEJ genes. Adenovirus has been found in stool samples from great apes and macaques, indicating a possible long-standing co-evolution between this virus and primates.
Retroviruses like HIV may also provide the selective pressure that shapes the recurrent positive selection of NHEJ genes. There is abundant genetic evidence suggesting a role for NHEJ in the retroviral lifecycle. Upon cellular entry, the retroviral RNA genome is reverse transcribed into double-stranded DNA. The ultimate destination for this retroviral cDNA is integration into the genome of the host, but it must first survive passage through the nucleus without being detected as broken DNA by the cell. NHEJ proteins have been found to physically associate with retroviral proteins, cDNA, and pre-integration complexes in vivo and in two-hybrid interactions. Several models have been proposed to explain this association. In one model, NHEJ proteins are recruited by the viral complex to protect free viral cDNA ends from degradation or from triggering apoptosis. In another model, the viral complex recruits host NHEJ proteins to promote the repair of breaks created at sites of retroviral cDNA integration into the host genome. In a third model, NHEJ proteins act as antivirals, joining the two long-terminal repeat (LTR) ends of the viral cDNA into dead-end “2-LTR circles.” These 2-LTR circles are ubiquitously observed in the nuclei of infected cells. Regardless of the model, allelic variants of NHEJ genes that result in lower infection rates would be selectively advantageous to the host. Should such alleles rise to high frequency or fixation, retroviruses would be expected to counter-evolve, and the back-and-forth interplay would drive recurrent positive selection of NHEJ genes. Retroviruses and primates have co-evolved for tens of millions of years, as illustrated by the fact that all sequenced primate genomes contain the remnants of hundreds of thousands of integrated retroviruses.
It is unknown whether the positive selection observed in NHEJ genes represents a response to a single selective force, or whether multiple forces are shaping their evolution. At least eight additional viral families have been shown to evade or exploit the host DNA damage response. Several NHEJ proteins include one or more “BRCT” domains, which have been linked to viral infection in multiple instances. The Epstein-Barr viral protein Zta has been shown to interact with the BRCT domains of 53BP1, a component of the DNA damage response, to prevent apoptosis that is activated in response to viral replication. HIV-1 Tat has also been shown to interact with the BRCT domain of the human replication protein FCP1. In both Polλ and Nbs1, we find an amino acid position at the C-terminal end of the BRCT domain to be evolving under positive selection (Q185 in Nbs1 and Q102 in Polλ). The single site found to be under positive selection in Saccharomyces Xrs2 also falls near the end of the BRCT domain (site 298). BRCT domains could be a critical link in the interaction between viruses and the NHEJ pathway. Antagonism of host NHEJ proteins by genetic parasites may be a universal feature of cellular life, as yeast Ty retrotransposons also interact genetically and physically with NHEJ machinery. LINE-1 retrotransposons are major drivers of primate genome evolution, and LINE-1 retrotransposition rates are reduced in the absence of NHEJ genes. The Corndog and Omega bacteriophages of mycobacteria have even incorporated the first gene in the bacterial NHEJ pathway, Ku, into their own genomes. This viral Ku now evolves under the selective pressures of the virus in order to recruit the bacterial NHEJ ligase, LigD, to circularize phage DNA.
In summary, we have documented abundant signatures of positive selection in genes of the NHEJ pathway, which is the major pathway for repairing double-strand chromosomal breaks in mammalian cells. We hypothesize that these signatures result from long-term co-evolution between NHEJ genes and genetic parasites. While it is well known that genetic parasites shape genome architecture through insertion and subsequent inter-element recombination, the present study suggests that selective pressures imposed by genetic parasites can also drive the evolution of protein sequence in critical human proteins.