The Fanconi anemia complementation group A (FANCA) gene is one of 15 disease-causing genes and has been found to be mutated in ~60% of Fanconi anemia patients. Using purified protein, we report that human FANCA has intrinsic affinity for nucleic acids. FANCA binds to both single-stranded (ssDNA) and double-stranded (dsDNA) DNAs; however, its affinity for ssDNA is significantly higher than for dsDNA in an electrophoretic mobility shift assay. FANCA also binds to RNA with an intriguingly higher affinity than its DNA counterpart. FANCA requires a certain length of nucleic acids for optimal binding. Using DNA and RNA ladders, we determined that the minimum number of nucleotides required for FANCA recognition is ~30 for both DNA and RNA. By testing the affinity between FANCA and a variety of DNA structures, we found that a 5′-flap or 5′-tail on DNA facilitates its interaction with FANCA. A patient-derived FANCA truncation mutant (Q772X) has diminished affinity for both DNA and RNA. In contrast, the complementing C-terminal fragment of Q772X, C772–1455, retains the differentiated nucleic acid-binding activity (RNA > ssDNA > dsDNA), indicating that the nucleic acid-binding domain of FANCA is located primarily at its C terminus, where most disease-causing mutations are found.
Fanconi anemia (FA)3 is an autosomal recessive or X-linked disorder characterized by bone marrow failure, developmental abnormalities, predisposition to cancer, and hypersensitivity to cross-linking agents (1–10). Thus far, 15 distinct genes have been identified to cause the deadly disease (6, 11–14). Eight of them are components of the FA core complex (FANCA, FANCB, FANCC, FANCE, FANCF, FANCG, FANCL, and FANCM) that monoubiquitinates both FANCD2 and FANCI, a key event that initiates interstrand cross-link (ICL) repair (6, 8, 9, 15). Downstream of the FANCD2 and FANCI monoubiquitination are double-strand break repair proteins (i.e. FANCD1/BRCA2, FANCJ/BRIP1, FANCN/PALB2, FANCO/RAD51C, and FANCP/SLX4) that are critical to achieving successful ICL repair (6, 11–14, 16, 17). Although a deficiency in each FA gene shows similar clinical and cellular phenotypes, ~85% of FA patients present defective FANCA (~60%), FANCC (~15%), or FANCG (~10%) genes, and ~15% of them have defects in the other FA genes (8, 9, 18).
FANCA physically interacts with FANCG and the transcription factor HES1 within the FA core complex (19–25), which has been found to be localized to chromatin (26, 27). FANCA was also found to be involved in psoralen ICL-induced mutagenesis and in spontaneous and UV light-induced base substitution mutagenesis in human cells, implying its involvement in the mutagenic translesion synthesis of DNA damage (28, 29). Additionally, FANCA was shown to be required for recruiting FANCO/RAD51 and FANCD1/BRCA2 into mitomycin C-induced nuclear foci, indicating its role in the homologous recombination repair of ICLs (12, 30, 31). These data imply that FANCA may play multiple roles in DNA metabolism and transactions. However, because FANCA is not evolutionarily conserved and lacks identifiable domains/motifs, it remains largely unknown how FANCA is involved in these biological processes.
In this study, we report that purified human FANCA binds to nucleic acids with strong preference for single-stranded forms. This novel property of FANCA supports its role in DNA damage repair and should be helpful in understanding how FANCA and the whole FA core complex contribute to the maintenance of genomic stability.
cDNA for human FANCA was obtained by PCR amplification from a universal cDNA pool (BioChain Institute, Inc.). The full-length open reading frame of FANCA was sequenced and found to exactly match NCBI Reference Sequence NM_000135. Overexpression of hexahistidine-tagged FANCA was achieved in insect High Five cells using the Bac-to-Bac expression system (Invitrogen). Truncation mutants were produced through a PCR-based method (32). Expression of FANCA and its mutants was confirmed by Western blot analysis using a Pierce ECL kit. Antibodies against FANCA were kindly provided by the Fanconi Anemia Research Fund. Monoclonal antibody THE against the His6 tag (GenScript, Piscataway, NJ) was also used to confirm expression and subsequent purification. Upon expression of the recombinant proteins in insect cells, the cells were homogenized using a Dounce homogenizer to prepare extracts. Wild-type FANCA and various truncation mutants were purified using a HiTrap Q-Sepharose Fast Flow column; a 5-ml HiTrap Blue column; a Mono S, Mono Q, and/or Superdex 200 gel filtration column (GE Healthcare); and/or a 2-ml high-resolution hydroxylapatite column (Calbiochem) and by tracing FANCA protein through SDS-PAGE and Western blotting. Protein concentration was determined using the Coomassie (Bradford) protein assay reagent (Pierce). The purified proteins were stored at −80 °C in aliquots. Purified replication protein A (RPA) was prepared as described previously (33).
Oligonucleotides that were used to create single-stranded DNA (ssDNA; 61-mer), double-stranded DNA (dsDNA; 61 bp), the 5′-tail (30-mer for the single-stranded part and 31 bp for the double-stranded part), the 3′-tail (30 bp for the double-stranded part and 31-mer for the single-stranded part), the splayed arm (30 bp for the double-stranded part and 31-mer for the single-stranded part), the 5′-flap (with a 31-mer flap), the 3′-flap (with a 31-mer flap), the static fork (all arms are 30 bp), and the static Holliday junction (all four arms are symmetrically 30 bp) were adopted from a design by Gari et al. (34) with the same sequences. It should be noted that there is a 1-base 5′-overhang, originally designed to label the 3′-end of the substrates, on the double-stranded area of the 5′- and 3′-tails, splayed arm, and 5′- and 3′-flaps and on one arm of the static fork and Holliday junctions. Annealing was carried out in a water bath within ~5 h by slowly cooling from 85 °C to 20 °C. The quality of annealing was monitored by native gel electrophoresis. Proper annealing was verified by the mobility of a corresponding substrate, e.g. the static Holliday junction moves slowest because of its largest size. RNA was chemically synthesized by Integrated DNA Technologies, Inc. using the same sequence as the 61-mer ssDNA.
DNA binding EMSA analysis was performed as described previously (35) in a 10-μl reaction containing 25 mM Tris-HCl (pH 7.5), 100 mM NaCl, 5 mM EDTA, 1 mM DTT, 6% glycerol, 1 nM 5′-32P-labeled oligonucleotide substrates, and the indicated amounts of protein. The reactions were incubated at 18 °C for 45 min, followed by the addition of 4 μl of 50% (w/v) sucrose. The reaction mixtures were resolved by electrophoresis through a 4% nondenaturing polyacrylamide gel in 40 mM Tris acetate (pH 7.6) and 10 mM EDTA with 6% glycerol using the Owl P9DS electrophoresis system (Thermo Scientific). The setting was 100 V (~1.5 watts/gel) for 40 or 90 min as indicated. DNA substrates and shifted bands were visualized by autoradiography. Quantitation of the bands was performed using NIH ImageJ software.
At steady state (equilibrium), Kd can be determined through the following equation: Kd = [A][B]/[AB], where [A], [B], and [AB] are the concentrations of free FANCA, free nucleic acids, and the FANCA-nucleic acid complex, respectively. Because the concentration of nucleic acids was very low in our EMSA experiments (1 nM), the free FANCA concentration was approximated by the total FANCA concentration added, and the FANCA protein concentration ([A]) that shifted 50% of nucleic acids ([AB] = [B], thus [B]/[AB] = 1) was used to estimate Kd.
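The half-maximal shift estimate described above can be illustrated with a minimal Python sketch. It assumes that band intensities have already been quantified per lane and that the 50% point is found by linear interpolation between neighboring titration points; the interpolation scheme, the function names, and all intensity values are hypothetical and are not taken from this study.

```python
from typing import Sequence


def fraction_bound(shifted: float, free: float) -> float:
    """Fraction of labeled probe found in the shifted band(s) of one lane."""
    total = shifted + free
    if total <= 0:
        raise ValueError("lane has no signal")
    return shifted / total


def estimate_kd(protein_nm: Sequence[float], bound: Sequence[float]) -> float:
    """Protein concentration (nM) at which 50% of the probe is shifted.

    Assumes concentrations are sorted in ascending order and that the probe
    is dilute enough for free protein to approximate total protein.
    """
    points = list(zip(protein_nm, bound))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        # Linear interpolation within the interval that crosses 50% bound.
        if f0 <= 0.5 <= f1 and f1 > f0:
            return c0 + (0.5 - f0) * (c1 - c0) / (f1 - f0)
    raise ValueError("titration does not cross 50% bound")


# Hypothetical band intensities (arbitrary units), not data from this study.
concentrations = [0.0, 4.0, 8.0, 16.0, 32.0, 64.0]
shifted = [0.0, 10.0, 30.0, 60.0, 80.0, 95.0]
free = [100.0, 90.0, 70.0, 40.0, 20.0, 5.0]

fractions = [fraction_bound(s, f) for s, f in zip(shifted, free)]
kd = estimate_kd(concentrations, fractions)
print(f"Estimated Kd ~ {kd:.1f} nM")
```

Linear interpolation is only one possible choice; fitting a binding isotherm to all titration points would use the data more fully but requires a fitting library.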
DNA ladders (10-, 20-, 30-, 40-, 50-, 61-, 71-, and 99-mers) were created by mixing oligonucleotides with different sequences and labeling with 32P. RNA ladders (10-, 20-, 30-, 40-, 50-, 60-, 70-, 80-, 90-, and 100-mers) were created as recommended by the manufacturer (Ambion) and labeled with 32P. 10-mer, 15-mer, and 30-mer oligonucleotides were the 3′-truncated forms of a 61-mer oligonucleotide: GACGCTGCCGAATTCTACCAGTGCCTTGCTAGGACATCTTTGCCCACCTGCAGGTTCACCC. Corresponding dsDNAs were prepared by annealing with the complementing oligonucleotides.
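As an illustration of how the 3′-truncated oligonucleotides relate to the 61-mer, the following Python sketch derives the 10-, 15-, and 30-mers by keeping the 5′-proximal nucleotides and computes a fully complementary strand for each. It assumes that the complementing oligonucleotides are exact reverse complements with no overhang, which may differ from the oligonucleotides actually ordered; the function names are hypothetical.

```python
FULL_61MER = "GACGCTGCCGAATTCTACCAGTGCCTTGCTAGGACATCTTTGCCCACCTGCAGGTTCACCC"

# Translation table for Watson-Crick complements of the DNA alphabet.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def truncate_3prime(seq: str, length: int) -> str:
    """Keep the first `length` nucleotides from the 5' end (3'-truncation)."""
    if not 0 < length <= len(seq):
        raise ValueError("length out of range")
    return seq[:length]


def reverse_complement(seq: str) -> str:
    """Complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


truncations = {n: truncate_3prime(FULL_61MER, n) for n in (10, 15, 30, 61)}
for n, oligo in truncations.items():
    print(f"{n}-mer  5'-{oligo}-3'  complement 5'-{reverse_complement(oligo)}-3'")
```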
To study its biochemical properties, we overexpressed wild-type human FANCA protein in insect cells using the Bac-to-Bac expression system. As shown in Fig. 1B, we purified WT FANCA to near homogeneity. Purified WT FANCA migrated to a position corresponding to its calculated molecular mass of 164 kDa on SDS-polyacrylamide gel, indicating that it was the full-length protein. The identity of FANCA was further confirmed by Western blotting using a FANCA-specific antibody and an anti-His6 antibody (Fig. 1B).
Because FANCA has been shown to be involved in many steps of DNA repair, we reasoned that FANCA is likely to interact directly with DNA. Indeed, EMSA analysis in which increasing amounts of purified FANCA were incubated with 32P-labeled ssDNA or its dsDNA counterpart showed that FANCA bound to both ssDNA and dsDNA in a concentration-dependent manner (Fig. 1C). Intriguingly, FANCA had significantly greater affinity for ssDNA than for dsDNA. Using the paired t test, we determined the statistical significance of shifts between ssDNA and dsDNA (Fig. 1D). The p values for 16, 32, and 64 nM FANCA between ssDNA and dsDNA were 0.013, 0.031, and 0.014, respectively, which indicates a significant difference (p < 0.05). Furthermore, the Kd determined by protein titration showed that FANCA bound to ssDNA ~4-fold better than to dsDNA (11.1 nM for ssDNA and 42.3 nM for dsDNA) (Table 1).
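The paired comparison at a single FANCA concentration can be sketched as follows in Python. This is a minimal illustration of the paired t statistic only; the replicate values are hypothetical and are not the measurements behind Fig. 1D, and the number of replicates is an assumption.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical percent-shifted values from three paired replicates at one
# FANCA concentration; illustrative only, not data from this study.
ssdna = [62.0, 58.0, 66.0]
dsdna = [30.0, 31.0, 29.0]

t_stat, dof = paired_t(ssdna, dsdna)
print(f"t = {t_stat:.2f}, df = {dof}")
```

Converting the statistic to a p value additionally requires the t distribution, for example through `scipy.stats.ttest_rel` if SciPy is available.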
Because FA is also a developmental disease and FANCA is involved in the regulation of gene expression (24, 25, 36), it is conceivable that FANCA could somehow be involved in RNA transactions, e.g. RNA stability, transcription, or translation, to carry out its functions. When FANCA was incubated with a single-stranded RNA (ssRNA) oligonucleotide with the same sequence as the ssDNA, we indeed observed that FANCA possessed affinity for RNA. In fact, its affinity for ssRNA determined by Kd was significantly better than for ssDNA (2.8 nM for ssRNA and 11.1 nM for ssDNA) (Table 1). The p values determined by paired t test for 4 and 8 nM FANCA between ssDNA and ssRNA showed a significant difference (0.041 and 0.033 for 4 and 8 nM FANCA, respectively) (Fig. 1D).
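The relative affinities implied by the three Kd values quoted in the text (2.8 nM for ssRNA, 11.1 nM for ssDNA, and 42.3 nM for dsDNA) can be worked out with a short Python sketch. The dictionary and function names are hypothetical; only the three Kd values come from the text.

```python
# Apparent Kd values (nM) quoted in the text for the 61-mer substrates.
KD_NM = {"ssRNA": 2.8, "ssDNA": 11.1, "dsDNA": 42.3}


def fold_tighter(reference: str, other: str) -> float:
    """How many fold lower the Kd of `reference` is than that of `other`."""
    return KD_NM[other] / KD_NM[reference]


# Lower Kd means higher affinity, so sort in ascending order of Kd.
ranking = sorted(KD_NM, key=KD_NM.get)
print(" > ".join(ranking))

for ref, other in [("ssDNA", "dsDNA"), ("ssRNA", "ssDNA"), ("ssRNA", "dsDNA")]:
    print(f"{ref} vs {other}: {fold_tighter(ref, other):.1f}-fold")
```

Both the ssDNA over dsDNA and the ssRNA over ssDNA ratios come out close to 4-fold, consistent with the ~4-fold preference stated for ssDNA over dsDNA.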
The preferential binding of FANCA to ssDNA resembles that of RPA, a well known ssDNA-binding protein that is involved in DNA replication, damage signaling, recombination, and repair (37–39). EMSA analysis of purified RPA (33) with ssDNA and dsDNA of different lengths indicated that RPA is very specific for ssDNA and forms protein-DNA filaments with increasing amounts of RPA and increasing sizes of ssDNA (multiple shifted bands in Fig. 2A, right panel). A high concentration of RPA could shift 10-nucleotide ssDNA (10-mer ssDNA in Fig. 2A, right panel); however, it did not efficiently interact with dsDNA, even when it was a 61-mer.
Unlike RPA, purified FANCA did not bind to ssDNA efficiently when it was shorter than 30 nucleotides (Fig. 2A, left panel). A nitrocellulose filter binding assay using labeled 25-mer ssDNA and RNA also demonstrated negative results for FANCA-nucleic acid interaction. When a 30-mer ssDNA oligonucleotide was used for EMSA, we observed an unstable interaction between FANCA and the ssDNA (smear bands in Fig. 2A, left panel). However, the interaction was dramatically improved when a 61-mer oligonucleotide was used, indicating that FANCA requires a larger area of DNA compared with RPA for optimal interaction. Also unlike RPA, FANCA began to interact with dsDNA when it was a 61-mer (61-mer dsDNA in Fig. 2A, left panel).
To further define the minimum size of nucleic acids for optimal interaction, we first incubated a set of ssDNA or RNA ladders with purified FANCA. After EMSA, we recovered the nucleic acids in the shifted bands and reran them on a denaturing sequencing gel. As shown in Fig. 2B, with increasing amounts of FANCA, more ssDNA and RNA species were identified in the shifted band. However, even with high concentrations of FANCA, ssDNA and RNA oligonucleotides shorter than 30 nucleotides were barely detectable in the shifted band, further supporting that FANCA requires ~30 nucleotides of nucleic acids for efficient interaction.
On the basis of the molecular masses and migration distances of FANCA-DNA and RPA-DNA complexes on the EMSA gel, we reasoned that only one FANCA is bound to the 61-mer ssDNA oligonucleotide (61-mer ssDNA in Fig. 2A, left panel). Our next question was whether longer ssDNA has the capacity to accommodate more FANCA molecules and to form a FANCA-DNA filament. Using a synthetic 116-mer ssDNA oligonucleotide, which was almost double the length of the 61-mer oligonucleotide, we found two distinct shifted bands with increasing amounts of FANCA (compare the shifted bands of the 116-mer and 61-mer oligonucleotides in Fig. 2C), indicating two FANCA molecules on one 116-mer ssDNA molecule. In contrast, only one shifted band corresponding to the faster band in the 116-mer experiment was observed with the 61-mer oligonucleotide. FANCA had greater overall affinity when the ssDNA was longer (compare the shifted bands between the 116- and 61-mers in Fig. 2, C and D).
Because FA proteins are involved in maintaining the stability of replication forks (40, 41), we reasoned that the DNA-binding activity of FANCA is likely involved in recognition of branched structures. To test this possibility, we performed EMSA with purified WT FANCA and a variety of DNA structures, including the 5′- and 3′-tails, splayed arm, 5′- and 3′-flaps, static fork, and Holliday junction (Fig. 3) (35). Intriguingly, the affinity between FANCA and different DNA structures could be divided into three groups: Group I, the 5′-tail, 5′-flap, and splayed arm structures showed the highest close-to-ssDNA level affinity (Fig. 3, solid red lines); Group II, the 3′-tail and 3′-flap structures had lower affinity compared with Group I (Fig. 3, dashed green lines); and Group III, the static fork and Holliday junction possessed the lowest affinity for FANCA (Fig. 3, solid blue lines). The apparent Kd constants of FANCA for all structures also support this grouping (Table 1).
Whereas the common feature of Group I structures is that they contain either a 5′-flap or a 5′-tail (31-mer), Group II DNAs have a 3′-flap or a 3′-tail (31-mer as well), and Group III structures barely have any free ends (only a 1-base 5′-overhang designed for 3′ labeling). On the basis of these observations, we conclude that FANCA prefers DNA structures with a 5′-flap or a 5′-tail.
FANCA does not have any identifiable domains for its interaction with nucleic acids. To define the nucleic acid-binding domain of FANCA, we first analyzed the primary structure of FANCA using the PSIPRED protein structure prediction server (University College London) and found that FANCA is predicted to contain extensive α-helices with a few short β-strands. On the basis of this prediction, we first chose a truncation mutant of FANCA (Q772X) derived from FA patients. Q772X is located roughly in the middle of the 1455-amino acid protein and in the middle of a coiled structure. EMSA of the purified truncation mutant (Fig. 1, A and B) indicated that the N-terminal moiety of FANCA had diminished nucleic acid-binding activity (Fig. 4, first panel).
However, the complementing fragment of Q772X, i.e. C-terminal amino acids 772–1455 of FANCA (Fig. 1, A and B), bound to nucleic acids more efficiently and retained the differentiated binding activity in the order of RNA > ssDNA > dsDNA (Fig. 4, second panel). To further define the nucleic acid-binding domain, we created two additional C-terminal truncation mutants (Fig. 1, A and B). EMSA tests indicated that the DNA-binding activity of both mutants was severely compromised (Fig. 4, third and fourth panels), although the leucine zipper-containing C772–1197 fragment partially retained affinity for ssDNA. The nucleic acid binding results are summarized in Fig. 1A. Overall, these data demonstrate that the nucleic acid-binding domain is located primarily at the C terminus of FANCA.
FA proteins have been generally believed to be involved in the repair of DNA ICLs that block replication and transcription (1–9). Although there are at least 15 FA disease-causing genes, ~60% of the disease is caused by a deficiency in FANCA (8, 18). Thus far, the established DNA-interacting components (FANCM, FANCI, FANCD2, FANCJ, FANCO, and FANCP) account for only ~5% of FA, an observation that does not seem to support the role of FA proteins in DNA repair (9). Discovery of the robust nucleic acid-binding activity of FANCA is important because it not only explains how FANCA localizes to chromatin but also provides solid biochemical support for its role in DNA repair.
An intriguing observation of this study is the preferential binding activity of FANCA for ssDNA over dsDNA. It has been known that recruitment of the FA core complex to chromatin relies strictly on replication (40, 41). Therefore, it is likely that FANCA recognizes the exposed ssDNA in the stalled replication forks, which look like a 5′-flap structure or a splayed arm, and contributes to assembly of the FA core complex on the stalled forks. It would be interesting to further investigate whether and how the DNA-binding activity of FANCA facilitates repair of cross-links through damage recognition, translesion synthesis (29), and homologous recombination events (30).
Thus far, FANCM-FAAP24 is the only identified DNA-binding component in the FA core complex (42, 43). FANCM can remodel stalled replication forks through fork reversal and branch migration, thus stabilizing the stalled replication forks and providing temporal and spatial access for the damage to be repaired (34, 44). FANCM appears to be responsible for recruitment of the FA core complex to chromatin (21, 42, 43, 45–48). The monoubiquitinated FANCI-FANCD2 complex may also be recruited to chromatin through a FANCM-dependent mechanism (15, 49–51). However, unlike other factors in the core complex, FANCM is not required for the formation of the eight-subunit (but not the 10-subunit) core complex (45), and FANCM−/− cells are only partially deficient in damage-induced FANCD2 monoubiquitination (52, 53). Fancm−/− knock-out mice further support that FANCM may have a stimulatory but not essential role in monoubiquitinating FANCD2 (54). Additionally, a direct interacting partner for FANCM-FAAP24 in the FA core complex has not been identified thus far, although FANCM-FAAP24 was originally identified through protein association in a FANCA-specific immunoprecipitation assay (5, 43, 55). FANCM−/− cells are sensitive to camptothecin, a topoisomerase inhibitor. Susceptibility to camptothecin is a unique feature identified only for FANCD1/BRCA2 and FANCN/PALB2 but not for components of the FA core complex (52). These observations suggest that FANCM may act downstream of FANCD2, and therefore, the upstream FA core complex may be recruited to DNA through other mechanisms, such as the DNA-binding activity of FANCA. Based on our observations, FANCA seems to be capable of recruiting the FA core complex to stalled replication forks through its ssDNA-binding activity and its preferential recognition of 5′-flap and splayed arm structures.
Another interesting insight to emerge from these studies is that FANCA has a higher affinity for ssRNA than for ssDNA. There is currently limited information to explain how the RNA-binding activity of FANCA could be linked to its functions. However, we think this activity may be physiologically relevant to RNA-related processes, such as transcription, translation, and RNA stability. First, the FA core complex has been reported to be involved in regulating gene expression through transcriptional (25) and post-transcriptional (56) mechanisms. Second, besides the nucleus, FANCA does localize to the cytoplasm (57–59), which supports a possible function in RNA metabolism. Third, FANCA has been shown to functionally interact with PKR (protein kinase regulated by RNA), a critical factor in translational control as well as regulation of cell proliferation and apoptosis (60). We speculate that, through its RNA-binding activity, FANCA may actively participate in these important biological processes. Further investigation into this issue should help us understand the unusually disproportional contribution of FANCA to FA.
The nucleic acid-binding domain of FANCA is located at its C terminus, where an imperfect leucine zipper and an ATR phosphorylation site are found (Fig. 1A) (61). It would be interesting to test whether the partial leucine zipper and the phosphorylation site have any effect on nucleic acid binding. It is very intriguing that, by analyzing the FANCA variants (1380 public entries as of March 23, 2011) available in the Fanconi Anemia Mutation Database, we found that ~90% of the reported disease-causing point mutations of FANCA are located at the C terminus (from amino acids 772 to 1455), where the nucleic acid-binding domain is identified, further supporting the idea that FANCA is likely to exert its functions through its affinity for nucleic acids.
We thank Drs. Murray Deutscher and Mary Lou King (University of Miami) for reagents. We are grateful to Dr. Wei Yang (NIDDK, National Institutes of Health) and Dr. Murray Deutscher for critical comments and discussion.
*This work was supported, in whole or in part, by National Institutes of Health Grant R01 HL105631 (to Y. Z.). This work was also supported by a new investigator research grant from the Florida Biomedical Research Program (to Y. Z.).
3The abbreviations used are: