BRCT domains are versatile protein modular domains found as single units or as multiple copies in more than twenty different proteins in the human genome. Interestingly, most BRCT-containing proteins function in the same biological process, the DNA damage response network, but show specificity in their molecular interactions. BRCT domains have been found to bind a wide array of ligands, including proteins, phosphorylated linear motifs, and DNA. Here we discuss the biology of BRCT domains and how a domain-centric analysis can aid in the understanding of signal transduction events in the DNA damage response network.
When damage to DNA is detected, a number of signaling events are initiated at the site of damage in the chromatin and radiate to other subcellular compartments. These signaling events perform several functions, including the tagging of damage sites, the hierarchical recruitment of proteins required for repair of the lesion, and the temporal coordination of repair processes with progression of the cell cycle. They are part of the DNA damage response (DDR) network, and several of its main components have been uncovered in the last 15 years [2,3]. Because several excellent reviews have been published on the subject [1–8], here we will focus on one specific topic: the modular nature of the DDR network.
The identification and characterization of DDR upstream kinases ATM, ATR, and DNA-PK and several of their substrates revealed an extensive signaling network in which phosphorylation played a major role [8–10]. In addition, several protein phosphatases have also been implicated in the DDR [11,12]. Analysis of target sequences for the DDR kinases and phosphatases reveals a preponderance of phosphorylation events on serine and threonine residues rather than phosphorylation on tyrosine residues as is commonly found in several canonical growth factor receptor signaling pathways [13–15].
Together with protein kinases and phosphatases, protein modular domains and short linear motifs (SLiMs) make up the DDR signaling toolkit. Protein modular domains are protein regions that can fold independently. Many modular domains have been implicated in mediating interactions with ligand proteins via short (8–10 amino acid) linear motifs located in loops or disordered regions. Inspection of the known components of the DDR reveals the prevalence of two modular domains in addition to 14-3-3 proteins: the BRCT (BRCA1 C-terminal) and FHA (forkhead-associated) domains [18,19].
BRCT (BRCA1 C-terminus; PFAM PF00533) domains, initially identified in the breast and ovarian cancer susceptibility gene product BRCA1, are protein-protein interaction modules found in a wide array of prokaryotic and eukaryotic proteins, in arrangements ranging from one to eight units [20–24]. Germline mutations that disrupt the BRCT domains of BRCA1 are associated with a significantly increased risk for breast and ovarian cancers [25–28]. The human genome contains at least 23 genes coding for proteins with BRCT domains, and most are implicated in the DDR (Woods et al., unpublished). BRCT domains are found in all three superkingdoms, Archaea, Eubacteria, and Eukarya, which supports the notion that they have an early origin [20,29].
The FHA domain (PFAM PF00498) is formed by 65–100 amino acid residues. It was initially recognized in forkhead transcription factors and is found in both prokaryotic and eukaryotic proteins. The FHA domain was shown to be required for the development of the fruiting body in the proteobacterium Myxococcus xanthus, which undergoes a multicellular stage. Human proteins that contain FHA domains, such as CHK2, RNF8, CHFR, NBS1, and MDC1, have well characterized roles in the DDR [32–38]. Notably, two key DDR proteins, NBS1 and MDC1, contain both BRCT and FHA domains [39,40].
At least a subset of BRCT and FHA domains have been shown to bind SLiMs phosphorylated by DNA damage-activated kinases, with BRCTs showing a preference for phosphoserine (pSer) and FHAs preferring phosphothreonine (pThr) [41–49]. SLiM binding specificity for characterized tandem BRCT domains and a subset of FHA domains is determined by a bipartite recognition that involves two distinct pockets: one that recognizes the phosphorylated serine or threonine, and another that recognizes the +3 residue (pSer/pThr is considered the zero position) and is thought to provide specificity (Supplementary Table 1) [41,42,44,47–52]. The recognition and binding of cognate phosphorylated motifs by proteins with BRCT or FHA domains can lead to changes in protein function and location. However, our knowledge about the specificity determinants for these linear motifs is still incomplete (Supplementary Material).
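The bipartite recognition described above (a Ser/Thr at position zero plus a specificity-conferring residue at +3) can be illustrated with a minimal sequence scan. This is only a sketch: the function name `find_bipartite_motifs`, the default +3 residue (Phe), and the toy peptide are illustrative assumptions, not data from this article, and sequence alone cannot tell whether a site is actually phosphorylated.

```python
import re

def find_bipartite_motifs(sequence, plus3_residues="F"):
    """Return (position, 4-mer) for each Ser/Thr with an allowed residue at +3.

    Positions are 1-based and refer to the Ser/Thr (the "zero" position).
    Hits are candidate sites only: phosphorylation is not encoded in sequence.
    """
    # A lookahead is used so that overlapping candidate motifs are all reported.
    pattern = re.compile(r"(?=([ST]..[%s]))" % re.escape(plus3_residues))
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]

# Hypothetical peptide, made up for illustration.
toy_peptide = "AASPTFKKTAAYSQEF"
hits_phe = find_bipartite_motifs(toy_peptide)            # +3 must be Phe
hits_aromatic = find_bipartite_motifs(toy_peptide, "FY")  # +3 may be Phe or Tyr
```

Passing a different `plus3_residues` string is one simple way to compare how strongly the +3 position alone narrows the set of candidate sites.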
FHA domains have only been found as isolated instances, but BRCT domains present a more diverse domain arrangement: besides many occurrences of single and tandem BRCT domains, they are also found as a triplet in TOPBP1 [53,54]. It is also important to note that instances of either FHA or BRCT domains might also bind other proteins via more extensive surface interactions or via other linear motifs that do not depend on phosphorylation or other post-translational modifications. This is the case for the structurally characterized TP53BP1-TP53 and LIG4-XRCC1 interactions [55,56]. Also, we have recently identified a poly-lysine stretch that mediates the interaction between the BRCT of LIG4 and PA2G4. Some FHA domains require extended surface interactions, such as the binding of the KI-67 FHA to the hNIFK phosphopeptide [58,59].
Besides the BRCT and FHA domains, other modular domains have also been shown to play critical roles in the DDR. The role of the tandem Tudor domains found in TP53BP1 is an example that also illustrates a cooperative relationship with BRCT domains. Tudor domains (PFAM PF00567) are formed by a strongly bent anti-parallel β-sheet consisting of five β-strands with a barrel-like fold. While recognition of phosphorylated Ser139 in Histone H2AX by the TP53BP1 BRCT domain is required for TP53BP1 retention at DNA damage sites, initial recruitment depends on the recognition of methylated lysines, preferentially Histone H4 dimethylated on Lys20, by its Tudor domains [61–64].
From the analysis of specific protein-protein interactions we can determine the mechanistic basis of signaling in the DDR. However, an operational understanding of DDR dynamics will require a network-level approach at modular domain resolution (in which regulatory domains, SLiMs, and the enzymes that modify them are well annotated). Here we focus on BRCT domains and how their global analysis can help in our understanding of the DDR.
The organization of DDR signaling events in time and space depends on a large number of protein-protein interactions, some constitutive and some inducible. A fascinating problem in signal transduction is how a set of proteins with multiple overlapping functions achieves specificity. An analysis of mitotic kinases and protein complexes reveals that specificity can be achieved through a combination of selection among kinase target motifs (motif space) and distinct subcellular localization (localization space).
Using Gene Ontology (GO) terms for subcellular compartment, we see that there is considerable overlap in the localization of proteins with tandem BRCT domains (Fig. 1A) (Supplementary Material). This suggests that selection of binding motifs might play a critical role in defining specificity, as most of these BRCT-containing proteins share the same subcellular compartment. Interestingly, clustering all BRCT proteins according to their GO annotation shows that BRCT proteins sharing structural and sequence similarities do not necessarily cluster according to subcellular compartment (Fig. 1B). This is perhaps not surprising considering that many signals and motifs that control localization, such as nuclear localization or nuclear export sequences, are located outside of BRCT domains. Important caveats to this analysis are that GO terms may not reflect the most recent findings in the literature for several proteins, and that proteins that are better studied have more detailed localization annotation than less studied ones (e.g. compare BRCA1 with ANKRD32). In addition, while some proteins may overlap in their general location (e.g. associated with chromosomes), their distribution may vary within that compartment (e.g. associated with DNA lesions versus evenly distributed in chromatin).
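One simple way to quantify the kind of localization overlap discussed above is a pairwise Jaccard index over each protein's set of GO cellular-compartment terms. The sketch below is an assumption-laden illustration, not the procedure used for Fig. 1: the protein names and annotations are hypothetical placeholders, and the Jaccard index is merely one reasonable similarity measure.

```python
from itertools import combinations

def jaccard(terms_a, terms_b):
    """Jaccard index of two term sets: |intersection| / |union| (0.0 if both empty)."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pairwise_overlap(annotations):
    """Map each unordered protein pair (sorted by name) to its Jaccard index."""
    return {
        (p, q): jaccard(annotations[p], annotations[q])
        for p, q in combinations(sorted(annotations), 2)
    }

# Hypothetical annotations; real input would come from GO annotation files.
toy_annotations = {
    "PROT_A": {"nucleus", "chromosome"},
    "PROT_B": {"nucleus", "chromosome", "cytoplasm"},
    "PROT_C": {"cytoplasm"},
}
overlap = pairwise_overlap(toy_annotations)
```

The resulting pairwise values could be fed to any standard hierarchical clustering routine. Note that the caveat raised in the text applies directly: better-studied proteins carry more terms, which inflates their unions and lowers their apparent overlap with sparsely annotated proteins.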
Despite the caveats of these preliminary analyses and our incomplete knowledge of which motifs are recognized by BRCT domains and of the precise localization of these proteins, we can derive insights about the specificity of modular domain interactions by turning our attention to the structure of BRCT domains. Here we focus on eight tandem BRCT domains in human proteins for which crystallographically determined structures are available in the Protein Data Bank: BRCA1 (1Y98_A), TP53BP1 (1KZY), MDC1 (2AZM_A), BARD1 (2NTE_B), LIG4 (3II6), PAXIP1 (3SQD_A), TOPBP1 (2XNH), and MCPH1 (3SQD).
Known BRCT domains share a general topology (arrangement) of secondary structure elements in which four or sometimes five parallel β-strands in their core are sandwiched between α-helices or loop segments, in a three-layered fold. Visual inspection of the protein backbones of the tandem BRCTs reveals a striking similarity of their three-dimensional module arrangement in six examples (the maximum sequence identity between any two structures is 33%) (Fig. 2A). In LIG4 and TOPBP1 we find some unusual domain arrangements (Fig. 2A). For a detailed discussion of structural aspects of BRCT function the reader is referred to Leung et al.
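A figure such as the maximum pairwise sequence identity quoted above depends on how identity is defined. The sketch below uses one common convention (identical residues divided by the number of columns in which neither sequence has a gap); this convention and the toy sequences are assumptions for illustration, since the text does not state which definition was used.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over alignment columns where neither sequence is gapped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)

# Hypothetical aligned fragments, made up for illustration.
identity = percent_identity("AC-DEF", "ACGDQF")
```

Other denominators (full alignment length, or the length of the shorter sequence) give different values for the same alignment, so reported identities are only comparable when the convention is the same.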
Despite this similarity of fold, visual inspection of space-filling models with electrostatic properties mapped onto their surfaces reveals great differences, and thus suggests potential for very different binding properties outside the pSer/pThr pocket (Fig. 2B). Obviously, not all differences lie at ligand binding sites or reflect differing binding specificities. However, comparative analyses of electrostatic surfaces can help discern neutral from adaptive changes and in this way point to differing specificities (for example in eF1A1 and eF1A2) [68,69]. BRCT domains can interact with protein ligands by recognition of linear motifs or through surface interactions. Linear motif and surface-based interactions could conceptually be further subdivided into constitutive or inducible interactions, i.e. according to whether binding is influenced by post-translational modifications.
To enable finer analyses, we produced a structure-based sequence alignment of seven tandem BRCT fragments and included their orthologs from six mammalian species: man, dog, cow, mouse, elephant, and opossum (Supplementary Materials). The aligned protein sequences were used to align the encoding DNA sequences (data not shown). Using these codon alignments we were able to identify sites in the BRCTs that likely evolved under purifying selection, as inferred from a significant overrepresentation of synonymous relative to non-synonymous codon changes (Fig. 3). Sites under negative selection for all tandem BRCT domains seem to coincide with the phosphopeptide binding pocket at the “top”, although not exclusively. By comparison, we note a paucity of negatively selected sites over the remainder of the protein surface. Besides corroborating our current understanding of phosphopeptide binding as a main function of these tandem BRCT domains, analyses like these provide new testable hypotheses regarding which amino acids play critical roles in this type of interaction. In addition, these analyses may identify other negatively selected sites that do not coincide with the phosphopeptide binding pocket and that may be important for the regulation and function of BRCT-containing proteins.
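The idea behind the per-site analysis above can be sketched by tallying, for each codon column, how many sequences differ from a reference synonymously versus non-synonymously. This is a deliberately crude illustration under stated assumptions: it uses the standard genetic code, treats the first sequence as the reference, ignores phylogeny, and performs no significance test, so it is not the method behind Fig. 3. The function name and toy alignment are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def site_substitutions(codon_alignment):
    """Per codon column, count (synonymous, non-synonymous) differences
    between each sequence and the first (reference) sequence."""
    n_codons = len(codon_alignment[0]) // 3
    results = []
    for i in range(n_codons):
        column = [seq[3 * i:3 * i + 3].upper() for seq in codon_alignment]
        ref = column[0]
        syn = nonsyn = 0
        for codon in column[1:]:
            # Skip identical codons and anything not translatable (gaps, Ns).
            if codon == ref or codon not in CODON_TABLE or ref not in CODON_TABLE:
                continue
            if CODON_TABLE[codon] == CODON_TABLE[ref]:
                syn += 1
            else:
                nonsyn += 1
        results.append((syn, nonsyn))
    return results

# Hypothetical three-sequence codon alignment, made up for illustration.
toy_alignment = ["TCTAAAGGG", "TCCAAAGGA", "TCAAGAGGG"]
counts = site_substitutions(toy_alignment)
```

Columns with many synonymous but no non-synonymous differences are the intuitive signature of purifying selection; a real analysis would replace these raw counts with a codon-substitution model and a statistical test.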
The DNA damage response is a fundamental cellular process, and understanding the events involved and their regulation will not only illuminate an important aspect of the life of a cell but will also be critical to understanding disease states and improving treatment. Because the DDR has been proposed to constitute an early barrier to tumorigenesis [70,71] and DNA damage is the basis of common chemotherapy and radiotherapy, this information is going to be particularly relevant in the fight against cancer.
From the perspective of signal transduction we can understand the DDR as an integrated system with kinases and phosphatases, linear motifs, and modular domains. Importantly, events in the DDR are coordinated by the use of modular domains, including the FHA and BRCT domains. In the brief analysis described here we explored the commonalities and specificities of select tandem BRCT domains and provided a glimpse of how exploring the modular nature of the DDR can improve our understanding of it. We show that tandem BRCT domains display a remarkable similarity of backbone arrangement but divergent surfaces, suggesting how a structurally conserved module present in a number of proteins that share subcellular compartments can achieve versatility in specific interactions with other proteins.
BRCT domain-centered functionality has not been extensively explored beyond analysis of structural/surface aspects and SLiM binding capabilities, which provide consistent predictions but lack contextual analysis of the domains' role as scaffolding modules in the DDR. BRCT-SLiM interactions utilize a minimal amount of BRCT solvent-accessible area, leaving significant areas on the BRCT domains well suited for additional protein interactions. Therefore, domain-centric protein interaction studies are required to gather a comprehensive understanding of BRCT function within the cell. Data sets cataloging BRCT-interacting proteins will provide a framework for understanding the cellular processes in which BRCT domains participate. Interestingly, our results suggest that this type of domain-centric approach can differentiate BRCT domains that exhibit divergent interaction profiles within common cellular processes of the cell cycle and DDR (Woods et al., unpublished results).
As an extension, studies that take into consideration the temporal control of interactions determined by cell cycle progression or ionizing radiation are required to understand the inducible nature of the interactions and the role of the BRCT domains in these complex signaling pathways. Adequate depth in delineating these BRCT-mediated pathways could allow the integration of genetic profiling and modern therapeutics to optimize cancer treatments and patient outcomes.
Our ability to model structures on a network will also be instrumental to advancing our understanding. Although still a future goal, such networks can also be contextualized (i.e. according to tissue characteristics, tissue-specific expression, stimuli, or temporal series). Perhaps more exciting is the possibility of personalizing these networks by overlaying genetic information. Conceivably, the impact of germline genetic variation on protein interactions could be used to identify patients more likely to suffer from side effects. Likewise, the impact of somatic genetic changes on protein interactions in a tumor could be used to predict response to chemotherapy or targeted therapies. Although many hurdles remain before all these data sources can be seamlessly integrated, we are now in a position to put them to good (scientific and clinical) use.
A) Phosphorylated short linear motifs recognized by BRCT or FHA modular domains. Includes a survey of phosphorylation SLiMs that have been documented in the literature. B) GO terms for cellular compartment. C) Domain structure-based master alignment of structurally characterized tandem BRCT families. Orthologs from six vertebrate species are included if they were available: man (Homo sapiens; Hs), dog (Canis familiaris; Cf), cow (Bos taurus; Bt), mouse (Mus musculus; Mm), elephant (Loxodonta africana; La) and opossum (Monodelphis domestica; Md). Ortholog protein sequences were identified in either OrthoDB or OMA and derived from ENSEMBL transcript sequences. Sequence quality/accuracy was verified by sequence searches in the non-redundant (nr) database at NCBI, and minor corrections were made where indicated. To produce a 3-D structure-based alignment we separately superimposed crystal structure fragments of all first and all second BRCT domains (see main text) using the Combinatorial Extension (CE) program. The derived sequence alignments for each domain were joined and the linker regions in between were aligned using ClustalX. This procedure yields more accurate alignments than those found in protein family databases that do not consider protein structure. Note that the close relatedness of the orthologs makes it difficult to attribute relevance to conserved positions at the protein sequence level. The codon alignment we derived (not shown) enables deeper evolutionary sequence analyses in individual families.
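Deriving a codon alignment from a protein alignment amounts to threading each ungapped coding sequence onto its gapped protein sequence, three bases per residue and three gap characters per alignment gap. The sketch below shows that step only; the function name `thread_codons` and the toy input are hypothetical, and the code assumes the coding sequence is in frame and does not check that it actually translates to the given protein.

```python
def thread_codons(aligned_protein, cds, gap="-"):
    """Thread an ungapped coding sequence onto a gapped protein sequence.

    Each residue consumes one codon (three bases) from `cds`;
    each protein gap becomes three gap characters in the output.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is shorter than the protein requires")
    out, pos = [], 0
    for aa in aligned_protein:
        if aa == gap:
            out.append(gap * 3)
        else:
            out.append(cds[pos:pos + 3])
            pos += 3
    return "".join(out)

# Hypothetical aligned protein fragment and matching coding sequence.
codon_aligned = thread_codons("M-KF", "ATGAAATTT")
```

Applying the same function to every row of the protein alignment yields DNA rows of equal length whose columns can be read in groups of three, which is the input expected by codon-level evolutionary analyses.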
We thank members of the Protein Modules Consortium (http://www.proteinmodules.org/) and members of the Monteiro and Gerloff labs for helpful suggestions. Work in the Monteiro lab is supported by the NIH/NCI awards (U19 CA148112-01 and SPORE in Lung Cancer P50 CA119997). N.W. is supported by a Florida Breast Cancer Foundation fellowship.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.