The tandem BRCT domain organization has undergone significant evolutionary expansion from prokaryotes to higher-order metazoans (Figure 2). Although the specific mechanisms that fostered this expansion remain speculative, interesting correlations can be made between the phyletic distribution of tandem BRCT proteins and aspects of DNA repair and cell cycle checkpoints in diverse lineages.
Figure 2. Taxonomic distribution of tandem BRCT-containing proteins. Distant orthologs of human tandem BRCT-containing proteins were identified in different taxa and curated individually. Orthologs were retrieved using human tandem BRCT-containing proteins (including ...
The minimal complement of BRCT-containing proteins in the human genome is 24 (Mesquita and Monteiro, unpublished results). Of those proteins, 12 contain more than one BRCT unit. We hypothesize that these 12 proteins might therefore make up the core of the phosphoserine recognition system during the DDR.
One of the earliest analyses of DDR-associated domain architecture, by Aravind et al., did not find any instances of the BRCT domain in Archaea, although it was evident in the carboxy terminus of bacterial DNA ligases [8].
Additionally, the presence of the BRCT domain in Trypanosomes (which are not considered part of the early radiation of eukaryotes) prompted speculation that horizontal gene transfer from prokaryotes could be the mechanism by which eukaryotes acquired BRCT domains [8].
However, instances of BRCT domains have since been found in Archaea [43].
A search of currently deposited sequences also identifies a BRCT domain in proteins with similarity to human LIG4 in several archaeal genera (e.g., Halorhabdus and Haloquadratum). The presence of the BRCT domain in all 3 superkingdoms suggests that vertical descent of this domain from the last common ancestor, or cenancestor, could explain its phyletic distribution. In addition, 2 “transition points,” the eukaryotic/prokaryotic division and the emergence of the metazoans, can be identified.
Interestingly, SH2 domains are present almost exclusively in metazoans, suggesting that their origin is close to the emergence of multicellularity [25].
Although the tandem BRCT domain clearly has roots in single-cell organisms, multicellularity requires a more complex and tightly controlled protection of DNA integrity, as a genetic insult to a single cell can be detrimental to the whole organism. As organisms expand in complexity, so too does the number of tandem BRCT domain–containing proteins. This could be attributed to the cell’s penchant for highly redundant systems that provide added insurance that vital cellular processes such as chromatin regulation, DNA damage repair, and cell cycle checkpoints are completed successfully.
It is likely that the first tandem BRCT domain arose from the duplication of a single domain, either by gene duplication or by domain shuffling between 2 different genes. Hints of a possible origin of tandem BRCT domains can be extrapolated from the existence of an apparent third, less defined, subclass of BRCT domains found in RFC1, PARP1, and NAD+-dependent bacterial DNA ligases. These BRCTs are defined by replacement of the highly conserved tryptophan residue found in helix α3 as well as several other specific structural alterations [44].
Most striking is that some of the individual domains within this subclass can bind DNA. Binding is mediated by the phosphate binding pocket of the BRCT domain together with an N-terminally located helix outside the BRCT domain [44-46].
Compellingly, in the case of RFC1 binding to the 5′-phosphate of DNA, there is excellent conservation of both the 3-dimensional structure and the chemical nature of the phosphate binding site found in other BRCT domains. Furthermore, there is conservation between the DNA binding residues of RFC1 and bacterial NAD+-dependent ligases, and mutation of these residues in the ligases severely affects their ability to bind DNA, suggesting conservation in the mode of DNA binding [44,46].
These studies suggest that the BRCT domain could have originated as a DNA binding motif through recognition of the 5′-phosphate moiety and that binding to phosphorylated peptides may have developed later.
To probe further into this problem, we generated a tree comparing BRCTs (not the full proteins) from 1) the E. coli ligase, 2) all Archaea BRCTs (all of them from DNA ligases), and 3) all 24 BRCT-containing human proteins (Figure 3). Surprisingly, the RFC1 BRCT clustered with the Archaea BRCTs, followed by DBF4 and PARP1. These results strengthen the notion raised in the previous paragraph, although a more comprehensive and exhaustive analysis is needed to fully investigate these issues. Importantly, the presence of the BRCT domain in Archaea does not necessarily exclude the possibility that horizontal gene transfer contributed to the expansion of this domain.
Figure 3. RFC1 BRCT clusters with Archaea BRCTs. RFC1 BRCT is indicated by an arrow. Blue, yellow, and red lines indicate major branches of BRCT domains. Tree comparing single BRCT units from E. coli ligase BRCT, all BRCTs from Archaea (all derive from DNA ligases), ...
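The text does not state which method was used to build the tree of single BRCT units. As a purely illustrative sketch of how domains can be grouped by sequence similarity, the following assumes pre-aligned sequences of equal length, uses the simple p-distance (fraction of differing positions), and clusters with UPGMA (average linkage). The function names, the domain labels, and the toy sequences are all hypothetical; they are not real BRCT sequences and do not reproduce Figure 3.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions that differ, skipping columns with a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(seqs: dict) -> str:
    """Cluster aligned sequences by average linkage; return a Newick-style topology."""
    # Each cluster label maps to the list of leaf names it contains.
    clusters = {name: [name] for name in seqs}
    # Pairwise distances between the original leaves only.
    dist = {
        frozenset((a, b)): p_distance(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)
    }

    def average_distance(c1: str, c2: str) -> float:
        members1, members2 = clusters[c1], clusters[c2]
        total = sum(dist[frozenset((m, n))] for m in members1 for n in members2)
        return total / (len(members1) * len(members2))

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average leaf-to-leaf distance.
        a, b = min(
            combinations(sorted(clusters), 2),
            key=lambda pair: average_distance(*pair),
        )
        clusters[f"({a},{b})"] = clusters.pop(a) + clusters.pop(b)

    return next(iter(clusters)) + ";"


# Hypothetical toy alignment (NOT real BRCT sequences), for illustration only.
toy_domains = {
    "domA": "MKVLAGSTW",
    "domB": "MKVLAGSTF",
    "domC": "MRVIAGDTF",
    "domD": "QRCIPEDNY",
}

print(upgma(toy_domains))
```

In a real analysis the domains would first be aligned with a dedicated multiple-alignment tool and the tree inferred with an established phylogenetics package; UPGMA also assumes a constant rate of change across lineages, which is rarely justified for domains as divergent as these.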
Of all human tandem BRCT proteins, DNA ligase IV (LIG4) has the earliest identifiable ortholog, which is found in all 3 superkingdoms, although it occurs as a single BRCT unit in Archaea and bacteria. In early prokaryotes and Archaea, where there is less pressure to maintain genomic fidelity and error-prone repair could aid in adaptability of the species, the error-prone NHEJ pathway appears to be the first to utilize the tandem BRCT domain. LIG4 complexes with XRCC4 at sites of double-strand breaks through recruitment by DNA-PK and KU70/KU86 and completes the final step in repair by ligating the 2 strands of DNA.
Interestingly, in the case of the tandem BRCT domain of LIG4, it is not the domains themselves but rather the linker between the domains that mediates the important interaction with XRCC4 [18].
Indeed, this mode of interaction is conserved in other tandem BRCT domains, such as the phospho-independent binding of the TP53BP1 tandem BRCT to p53, which utilizes the linker region between the BRCT domains for binding [30,32].
However, the linker region does not always directly bind to the interaction partner. Indeed, each of the BRCT domains found in the tandems of both BRCA1 and MDC1 confers sequence binding specificity for phosphopeptides [38].
Despite the conservation of BRCT domains in structure and composition, their scaffolding properties are clearly utilized in diverse ways within the BRCT family. It is also important to note that BRCT domains have been identified in plants [47] and that a BRCT-related fold has been observed in the T antigen helicase domain of polyoma viruses [48].
This apparent plasticity in binding functionality is likely to be a driving force in the utilization of these domains as adaptor proteins in the DNA damage response. Furthermore, it is tempting to speculate that combination of the most ancient singleton BRCT domains with other non-BRCT regions (or with other duplicated BRCT units) led to the emergence of new binding interfaces not present in the single domains alone that could confer binding selectivity and specificity. This ability to specifically bind new proteins may have provided the basis for the expansion of this motif as a tandem domain in a single protein.
Perhaps more perplexing is how these conserved domains with diverse modes of mediating interactions do not appear to have branched out into other cellular processes, especially those that also involve serine and threonine kinase signaling. The use of the BRCT domain furthest removed from DNA damage described so far is in ECT2 control of cleavage furrow formation [49].
However, since this process involves coordination of cytokinesis with chromosome segregation, this BRCT domain evidently still functions in maintaining genome fidelity. Compared with the FHA and 14-3-3 scaffolding domains involved in the DDR, which participate in a number of diverse cellular pathways, BRCT domains have remained relatively confined to the DDR. This could possibly be explained by evolutionary determinants that prevent their utilization in other processes or by constraints on recognizing consensus regions that are preferred by kinases involved in the DDR.