Both DNA and chromatin need to be duplicated during each cell division cycle. Replication happens in the context of defects in the DNA template and other forms of replication stress that present challenges to both genetic and epigenetic inheritance. The replication machinery is highly regulated by replication stress responses to accomplish this goal. To identify important replication and stress response proteins, we combined isolation of proteins on nascent DNA (iPOND) with quantitative mass spectrometry. We identified 290 proteins enriched on newly replicated DNA at active, stalled, and collapsed replication forks. Approximately 16% of these proteins are known replication or DNA damage response proteins. Genetic analysis indicates that several of the newly identified proteins are needed to facilitate DNA replication, especially under stressed conditions. Our data provide a useful resource for investigators studying DNA replication and the replication stress response and validate the use of iPOND combined with mass spectrometry as a discovery tool.
Chromosomal replication requires the coordinated action of a large molecular machine, called the replisome, consisting of multiple subunits, including helicases, polymerases, histone chaperones, and chromatin-modifying enzymes. The replisome must work with speed and precision to replicate the DNA and chromatin during each cell division cycle. Damage to the DNA template from endogenous and environmental genotoxins, depletion of nucleotide precursors, and even difficult-to-replicate DNA sequences can impede replication fork progression. Multiple mechanisms respond to this stress to repair the damaged DNA, signal checkpoint activation, ensure the completion of DNA replication, and maintain genome stability. Defects in replication stress response mechanisms cause diseases that are characterized by developmental abnormalities, premature aging, and cancer predisposition.
The ataxia-telangiectasia- and Rad3-related (ATR)2 protein kinase signaling pathway is a primary regulator of the replication stress response (1). A complex of ATR and its obligate partner ATRIP is activated by interactions with TOPBP1 when DNA polymerase and helicase activities at the replication fork are uncoupled (2–5). Activated ATR stabilizes the stalled fork, promotes fork restart, and regulates cell cycle checkpoints to ensure completion of DNA synthesis prior to mitosis. If ATR is not functional, then forks collapse into double-strand breaks because of the action of unregulated fork remodeling and nuclease activities (6).
The continued high rate of discovery of new replication stress response proteins suggests that our inventory of replication regulators remains incomplete. Thus, identifying proteins that function at active and damaged replication forks and characterizing how they work in a coordinated fashion to maintain genome integrity remain critically important research goals. We recently developed a technology called isolation of proteins on nascent DNA (iPOND) that can be used to track protein recruitment to active and damaged replication forks as well as study the processes of chromatin deposition and maturation (7, 8). Importantly, the technique provides high resolution and sensitivity and is compatible with unbiased approaches such as mass spectrometry.
iPOND uses the nucleoside analog 5-ethynyl-2′-deoxyuridine (EdU) and click chemistry (8). EdU is rapidly incorporated into newly synthesized DNA when added to cell culture medium and does not interfere with replication or cause detectable DNA damage when used in short term cell culture (8, 9). An alkyne functional group on EdU can be reacted with an azide linked to biotin using click chemistry. This facilitates a streptavidin-biotin method of purification of the EdU-labeled nascent DNA with associated proteins. Fixation of cells with a reversible cross-linking agent prior to click chemistry and cell lysis permits purification under denaturing conditions, making a single-step isolation procedure possible. Cross-link reversal separates the proteins from the DNA fragments, which can then be detected by immunoblotting or mass spectrometry. Here we coupled iPOND to unbiased shotgun proteomics to probe the changes in replisome composition at active, stalled, and collapsed replication forks.
iPOND was performed largely as described previously (7), with the following modifications. A 500-ml suspension of logarithmically growing 293T cells (3.3 × 10⁶ cells/ml; a total of 1.6 × 10⁹ cells) was labeled with 12 μM EdU for 15 min. An EdU labeling period of this length may label ~15–20 kb of DNA depending on the rate of polymerization and how rapidly EdU is imported into the cell and phosphorylated by thymidine kinase (8). Following EdU incorporation, the stalled fork sample was incubated in 3 mM hydroxyurea for 2 h, and the collapsed fork sample was treated with 3 mM hydroxyurea and 3 μM ATR inhibitor for 2 h to induce fork collapse (10). After EdU labeling, the thymidine chase sample was centrifuged at 1000 rpm for 4 min, medium was decanted, and cells were resuspended in medium equilibrated for temperature and pH containing 10 μM thymidine. The thymidine chase was conducted for 60 min. All samples were fixed with 1% formaldehyde for 20 min at room temperature, followed by a 5-min incubation with 0.125 M glycine to quench the formaldehyde.
Fixed samples were split evenly into six 50-ml conical tubes, centrifuged at 2000 rpm at 4 °C for 6 min, washed three times with PBS, and frozen at −80 °C. Five of the six tubes were processed independently on a scale of 2.7 × 10⁸ cells/sample for iPOND purifications. Briefly, click chemistry reactions were performed to conjugate biotin to the EdU-labeled DNA. Streptavidin beads were used to capture the biotin-conjugated DNA-protein complexes. Captured complexes were washed extensively using SDS and high-salt wash buffers. Purified replication fork proteins were eluted under reducing conditions by boiling in 2× SDS sample buffer for 25 min. One-sixth of the eluted protein sample was resolved 1 cm into a 10% Novex precast gel (Invitrogen), excised from the gel slice, alkylated, and in-gel trypsin-digested using standard procedures.
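The sample scale quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are not from the original protocol, and the implied labeling rate is simply the reported ~15–20 kb tract divided by the 15-min pulse, not a measured fork speed.

```python
# Values taken from the text; the names are illustrative.
culture_volume_ml = 500
cell_density_per_ml = 3.3e6
n_tubes = 6

total_cells = culture_volume_ml * cell_density_per_ml  # reported as ~1.6 x 10^9
cells_per_tube = total_cells / n_tubes                 # reported as ~2.7 x 10^8


def implied_rate_kb_per_min(tract_kb, pulse_min):
    """Effective labeling rate implied by a labeled tract length and pulse time."""
    return tract_kb / pulse_min


low = implied_rate_kb_per_min(15, 15)
high = implied_rate_kb_per_min(20, 15)

print(f"total cells: {total_cells:.2e}; per tube: {cells_per_tube:.2e}")
print(f"implied labeling rate: {low:.2f}-{high:.2f} kb/min")
```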
Recovered tryptic peptides were subjected to two-dimensional LC-MS/MS (multidimensional protein identification technology) separation as described previously (11). Briefly, digested peptides were separated by a combined strong cation exchange and reversed-phase chromatographic strategy. Subsets of peptides were eluted from the strong cation exchange onto the reverse phase using a series of ammonium acetate pulses of increasing concentration. This was performed for eight steps, each followed by a 105-min aqueous to organic separation on the reversed-phase column. Eluted peptides were directly nanoelectrospray-ionized and introduced into an LTQ-XL mass spectrometer (Thermo Fisher Scientific) where peptide tandem mass spectra (MS/MS) were collected in a data-dependent manner. The peptide spectral data were searched against the canonical human proteome subset of UniProtKB (v. 155) using the Myrimatch (v. 1.6.75) (12), Sequest (v. 27) (13), and Myrimatch and Sequest (14) database search engines. Protein groups were assembled using IDPicker, which uses parsimony to report the minimum number of confident protein identifications (15). Matched peptides were filtered at a 5% peptide and protein false discovery rate, and each protein required a minimum of two independent peptides for identification. Protein identifiers were converted to EntrezID unique identifiers using the UniProt ID mapping database (16) and the DAVID bioinformatics database (17).
To determine fold enrichments of proteins relative to the negative controls, spectral count data were imported into the statistical software program QuasiTel (18) for pairwise comparisons. QuasiTel applies a quasi-likelihood model to raw spectral count data and reports protein fold enrichment and statistical significance as a quasi p value. Spectral count data are normalized for each multidimensional protein identification technology run using the total number of spectra reported for the run. The threshold for spectral counts was set at an average of one spectral count per experimental sample. For example, when comparing the five replicates from the replication fork sample to the five replicates from the chromatin chase sample, a minimum of 10 total spectral counts was required from the 10 samples for QuasiTel comparisons. Furthermore, to be considered a protein significantly enriched on nascent DNA, the filtering criteria required a minimum of 1.5-fold enrichment above both of the negative controls and a quasi p value of ≤ 0.05.
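The filtering logic described above can be summarized in a short sketch. It does not reimplement the QuasiTel quasi-likelihood model; fold enrichments and quasi p values are assumed to be supplied by QuasiTel, and the function names and the use of a single p value per protein are simplifications introduced here for illustration.

```python
def normalize_counts(counts, run_total):
    """Scale each protein's spectral count by the total spectra in its run."""
    return {protein: n / run_total for protein, n in counts.items()}


def meets_count_threshold(experimental, control):
    """Require an average of at least one spectral count per compared sample.

    For five experimental and five control replicates this means at least
    10 spectral counts summed over the 10 samples.
    """
    n_samples = len(experimental) + len(control)
    return sum(experimental) + sum(control) >= n_samples


def is_enriched(fold_vs_no_click, fold_vs_chase, quasi_p,
                min_fold=1.5, max_p=0.05):
    """Apply the fold-enrichment and quasi p value cutoffs from the text."""
    return (fold_vs_no_click >= min_fold
            and fold_vs_chase >= min_fold
            and quasi_p <= max_p)


# Hypothetical replicate counts for one protein (not data from the study).
fork_counts = [3, 2, 4, 1, 2]
chase_counts = [0, 0, 1, 0, 0]
print(meets_count_threshold(fork_counts, chase_counts))  # 13 counts over 10 samples
print(is_enriched(fold_vs_no_click=4.0, fold_vs_chase=2.1, quasi_p=0.01))
```

A protein must pass both checks: enough spectra to be compared at all, and sufficient enrichment over both negative controls.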
These filtering criteria were applied to proteins identified using each of the three protein identification search algorithms (Myrimatch plus Sequest, Myrimatch alone, and Sequest alone). Therefore, three lists of enriched proteins were generated independently. The final data reported in supplemental Tables S1–S3 represent the union of all three lists and report the median fold enrichment relative to the chromatin-bound negative control, median p value, and median spectral counts. The median p value, in some cases, is > 0.05 because three independent p values were calculated by QuasiTel for each protein identified by the three different search algorithm methods. If any one of the analyses yielded a p value < 0.05, that protein is reported in supplemental Tables S1–S3 along with the median p value from the three analyses. It should also be noted that when no spectra were detected in the thymidine chase negative control, QuasiTel calculated relative fold enrichment using a small, non-zero value in the denominator. This factor may lead to an overestimation of protein enrichment. Although these values are included in supplemental Tables S1–S3, they are omitted from Figs. 2–4.
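The way the three search-engine analyses are merged explains why a reported median p value can exceed 0.05. The following is a minimal sketch of that merge, assuming each analysis supplies a fold enrichment, a p value, a spectral count, and a pass/fail flag per protein; the data structure, the protein names, and the numbers are hypothetical.

```python
from statistics import median


def merge_analyses(results):
    """Union the per-engine lists and report medians across analyses.

    results: {engine: {protein: {"fold": float, "p": float,
                                 "counts": float, "passed": bool}}}
    A protein is reported if at least one analysis passed the filters.
    """
    proteins = set()
    for per_engine in results.values():
        proteins.update(per_engine)

    merged = {}
    for protein in sorted(proteins):
        records = [r[protein] for r in results.values() if protein in r]
        if not any(rec["passed"] for rec in records):
            continue
        merged[protein] = {
            "median_fold": median(rec["fold"] for rec in records),
            "median_p": median(rec["p"] for rec in records),
            "median_counts": median(rec["counts"] for rec in records),
        }
    return merged


# Hypothetical values: PROT_A passes in one analysis only.
example = {
    "myrimatch": {"PROT_A": {"fold": 2.0, "p": 0.04, "counts": 12, "passed": True},
                  "PROT_B": {"fold": 1.2, "p": 0.30, "counts": 8, "passed": False}},
    "sequest": {"PROT_A": {"fold": 1.8, "p": 0.06, "counts": 10, "passed": False}},
    "both": {"PROT_A": {"fold": 1.9, "p": 0.08, "counts": 11, "passed": False}},
}
print(merge_analyses(example))
```

In this example PROT_A is reported with a median p value of 0.06 even though only one of the three analyses met the cutoff.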
Proteins identified at elongating, stalled, and collapsed replication forks were classified on the basis of gene ontology using ToppGene (19). To display the data, the median fold enrichment relative to the thymidine chase negative control, the median quasi p value, and the median spectral counts from the experimental sample were graphed using R. Protein network modeling was performed using the GeneMANIA prediction server (20).
The antibodies used were as follows: H2A, H2B, MSH2, and SNF2H (Abcam); H1 (Millipore); PCNA (Santa Cruz Biotechnology); SNF2L (Cell Signaling Technology); BAZ1B (Novus).
Four individual siRNAs for each of the genes arrayed in 384-well dishes were transfected into U2OS cells at 10 nM final siRNA concentrations. Three days after transfection, cells were treated with 2 mM hydroxyurea for 24 h. Hydroxyurea was removed, and cells were incubated with 10 μM EdU for 4 h. Cells were then fixed with paraformaldehyde and processed with Alexa Fluor 488-coupled biotin azide followed by labeling with antibodies to γH2AX as described previously (21). Images were obtained on a PerkinElmer Life Sciences Opera automated microscope, and the intensities of EdU and γH2AX per nucleus were quantitated by Columbus image analysis software. The ratio of γH2AX to EdU intensities was used as the final scoring criterion. Samples with elevated ratios were identified using the Wilcoxon rank-sum test requiring a false discovery rate-adjusted p value of < 0.001 and a ratio of at least 2.0. As a comparison, the average ratio for the negative control siRNA was 1.07 with an S.E. of 0.026.
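The scoring step combines a multiple-testing correction with a ratio cutoff. The text does not name the false discovery rate procedure, so the sketch below assumes the Benjamini-Hochberg adjustment; the raw p values would come from the Wilcoxon rank-sum test, which is not reimplemented here, and the function names are illustrative.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p values (assumed FDR method)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_elevated(ratio, adj_p, min_ratio=2.0, max_adj_p=0.001):
    """Cutoffs from the text: ratio of at least 2.0 and adjusted p < 0.001."""
    return ratio >= min_ratio and adj_p < max_adj_p


# Hypothetical raw p values for four wells (not data from the screen).
raw = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(raw))
```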
To identify proteins associated with nascent DNA at active, stalled, and collapsed replication forks, we coupled iPOND purifications to mass spectrometry. Five samples were prepared for iPOND-MS (Fig. 1A). For all samples, cells were treated for 15 min with EdU to label nascent DNA. To examine proteins at active replication forks, EdU-labeled cells were collected immediately. To monitor proteins associated with stalled replication forks, EdU-labeled cells were treated with 3 mM hydroxyurea for 2 h to arrest fork movement and induce a replication stress response. To identify proteins associated with fork collapse, cells were treated with hydroxyurea and a selective ATR inhibitor (10) for 2 h. These conditions elicit fork collapse, including accumulation of double-strand breaks and excess single-stranded DNA (ssDNA) at the replication fork (6). EdU remained in the growth media during the hydroxyurea treatments.
The specificity of replication fork protein purifications was tested relative to two negative controls. The first were cells treated identically to the normal replication fork sample, but the biotin azide was omitted during the iPOND procedure. Proteins purified in this “no click reaction” sample represent those that interact nonspecifically with streptavidin-conjugated beads. For the second negative control, cells labeled with EdU were washed and then incubated with medium containing a small amount of thymidine for 1 h. This procedure allows replication to continue without additional EdU incorporation. The small concentration of thymidine does not interfere with replication but is used to compete for whatever EdU is left in the cell after removing it from the growth medium. Thus, this negative control monitors proteins bound to mature chromatin that are no longer close to the replication fork. Proteins detected in this “thymidine chase” sample represent chromatin-bound proteins that are not specifically enriched at replisomes (8).
To test the relative enrichment of replication proteins in the samples submitted for mass spectrometry analyses, iPOND purifications were examined for PCNA levels. As observed previously, PCNA was detected at elongating replication forks, and its levels decreased in the thymidine chase sample (Fig. 1B). Although still detectable, PCNA levels at stalled and collapsed replication forks were also decreased compared with the active fork sample, likely because of unloading of PCNA from the mature Okazaki fragments (8). The equal level of histone H2B detected on isolated chromatin (Fig. 1B) indicates that an equivalent amount of EdU-labeled DNA was purified in each sample.
The five experimental samples were purified independently five times each using the iPOND procedure (Fig. 1C). Eluted proteins were analyzed using two-dimensional liquid chromatography coupled with tandem mass spectrometry (multidimensional protein identification technology). The MS/MS spectra were matched to the human protein database using the Myrimatch and Sequest search engines (12–14).
QuasiTel was used to compute fold enrichment values of each experimental sample relative to both of the negative control samples (18). The final lists include proteins enriched at least 1.5-fold relative to both negative controls and for which at least one of the search engine analyses yielded a p value ≤ 0.05 as computed by QuasiTel.
These filtering criteria led to the identification of a total of 290 proteins that were enriched in at least one of the three experimental samples compared with both negative controls. Approximately 16% of the enriched proteins have been documented previously to function in DNA replication or DNA damage responses. Functional characterization of the dataset revealed that gene ontology categories such as DNA repair, response to DNA damage, DNA metabolic process, DNA replication, and cell cycle were highly overrepresented relative to what would be expected by chance (Fig. 1D). This provides confidence that the iPOND-MS screen successfully identified DNA replication and replication stress response proteins. As expected, abundant chromatin-associated proteins like histones were detected by mass spectrometry but not enriched above the controls in any of the experimental samples.
Of the total proteins enriched on nascent DNA, 84 were found to accumulate at active forks (supplemental Table S1), 139 at stalled forks (supplemental Table S2), and 137 at collapsed forks (supplemental Table S3). Several established genome maintenance proteins were among the proteins enriched in all three experimental conditions. For example, the interstrand cross-link repair factor FANCI, which is found mutated in Fanconi anemia, the ATR-activating and replication initiation protein TOPBP1, and the chromatin remodeler SMARCAD1 were enriched at replication forks in unperturbed and stressed conditions.
Overall, the highest confidence proteins from iPOND-MS have low p values, are highly enriched relative to both negative controls, and are detected with large spectral count numbers. A majority of these high-confidence proteins at active replication forks are known replisome components such as PCNA, the RFC complex, and polymerase subunits including POLD1 and POLE (Fig. 2A).
Bioinformatics searches using the GeneMANIA prediction server (20) indicate that approximately one-third of the proteins identified at active forks form an interacting network (Fig. 2B). Not surprisingly, PCNA represents a prominent node in this network because it serves as a binding scaffold for numerous replication and DNA damage proteins (23). Eleven of the iPOND-MS proteins contain predicted PCNA-interacting motifs, which is greater than predicted by chance (p = 2 × 10⁻⁷) (Fig. 2C). Proteins containing a PCNA interaction protein motif, or PIP box, include the Williams syndrome transcription factor (also known as BAZ1B), DNA methyltransferase (DNMT1), ligase 1, the mismatch repair proteins MSH3 and MSH6, the chromatin remodelers SNF2L and SNF2H, and the E3 ubiquitin ligase UBR5. The centrosomal protein CP110, the DNMT1 recruiting protein UHRF1, and the euchromatic histone methyltransferase EHMT1 have predicted AlkB homologue 2 PCNA-interacting motifs (24).
To further analyze the proteins, the dataset was compared with two published proteomics screens that identified substrates of the ATM or ATR checkpoint kinases (25, 26). At least 19 of the iPOND-MS enriched proteins are putative ATM/ATR substrates that were identified in these proteomic screens (Fig. 2C). This represents a statistically significant overrepresentation of checkpoint kinase substrates compared with what would be expected by chance (p = 1 × 10⁻¹¹).
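Overrepresentation statements of this kind are commonly evaluated with a one-sided hypergeometric test. The text does not state which test was used, so the sketch below is an assumption, and the background proteome size and the number of annotated substrates in the example are hypothetical placeholders rather than values from the study.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k) when `drawn` proteins are sampled without replacement from
    `population` proteins, of which `annotated` carry the feature of interest."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(comb(annotated, i) * comb(population - annotated, drawn - i)
               for i in range(k, upper + 1))
    return tail / total


# 19 substrates among 84 active-fork proteins are taken from the text;
# the background of 20000 proteins and 700 annotated substrates is hypothetical.
p = hypergeom_upper_tail(k=19, population=20000, annotated=700, drawn=84)
print(f"p = {p:.2e}")
```

The same function applies to the PCNA-interacting motif comparison, given an appropriate background count of motif-containing proteins.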
The majority of these kinase substrates are known replication or DNA damage response proteins such as MSH2, MSH3, MSH6, POLE, RFC1, RFC3, TOPBP1, the TOPBP1 ubiquitin ligase UBR5, FANCI, the exonuclease EXO1, the replication initiating factor WDHD1 (also known as AND1), and the alternative PCNA clamp loader ATAD5 (also known as ELG1). Other ATM/ATR substrates that localized to active forks are involved in chromatin assembly and maturation. These include the histone chaperones CAF1A and CAF1B, the chromatin remodeler SMARCAD1, and EHMT1. tRNA methyltransferase (TRMT6) and vacuolar protein-sorting homolog B (VPS26) are ATM/ATR substrates that have not been linked previously to DNA replication or replication stress responses but were identified at elongating forks using iPOND-MS.
The iPOND-MS list was also cross-referenced with large-scale siRNA screens that identified genes that, when silenced, activate the DNA damage response (27, 28). Many of those genes encode DNA replication or replication stress response proteins whose inactivation leads to replication-associated checkpoint signaling. Eleven iPOND-MS proteins cause increased H2AX phosphorylation when silenced, including several with no previously described functions in DNA replication, such as EP400, HSD17B7, PDCD4, PLOD1, SMARCA1, SNRPD1, or TRMT6 (Fig. 2C).
Finally, the strong enrichment of the mismatch repair proteins MSH2, MSH3, and MSH6 at active elongating forks is consistent with recent data from yeast systems indicating that these proteins travel with the replisome (29). We verified that both MSH2 and MSH6 are associated with the replisome in a pattern mirroring PCNA using conventional immunoblotting (data not shown).
Several known replisome proteins were not detected. In some cases, not enough peptides were identified, or the fold enrichment values and statistical reproducibility did not meet our stringent criteria. Thus, the dataset should not be considered a full list of replisome proteins.
Proteins enriched near stalled replication forks are listed in supplemental Table S2 and Fig. 3A. The dataset is significantly enriched in gene ontologies classified under cellular response to stress (p = 8 × 10⁻⁵), DNA metabolic process (p = 6 × 10⁻⁴), and the cell cycle (p = 1.7 × 10⁻³).
Several known DNA damage response proteins, including MDC1, RPA, RECQL1, XRCC1, FANCD2, FANCI, RAD1, and TOPBP1, were enriched. The identified proteins are also enriched for ATM/ATR substrates (19 proteins, p = 5 × 10⁻⁷). Five contain PCNA-interacting motifs, although this is not larger than expected by chance (p = 0.07), and 16 cause elevated DNA damage signaling when depleted (Fig. 3B). Over 50% of these proteins have not been implicated previously in DNA replication or stress responses.
One of the ATM/ATR substrates identified at the stalled fork is EHMT2 (also known as G9A). This protein methylates H3K56, which has well studied functions in DNA replication and repair in yeast. Recent experiments also indicate that this histone modification is important in mammalian cells during DNA replication (30), and the presence of EHMT2 at stalled forks supports this observation.
A number of DNA damage response proteins that are known to be recruited to stalled forks, including ATR, were not identified. In some cases we suspect this is because iPOND only purifies nascent, EdU-labeled DNA. Thus, the unlabeled parental ssDNA signaling platform created by the uncoupling of helicase and polymerase activities along with bound checkpoint proteins is only purified if it remains attached to the double-stranded, newly synthesized DNA containing EdU (8). Less aggressive DNA fragmentation may be needed to capture the parental ssDNA adjacent to the labeled, nascent, double-stranded DNA.
Proteins enriched at collapsed forks after combined hydroxyurea and ATR inhibitor treatment are shown in supplemental Table S3 and Fig. 4A. The most striking difference between the collapsed fork sample and the other two experimental conditions is a large increase in DNA double-strand break repair and RPA-associated proteins. At least one-fourth of the identified proteins form an interacting network (Fig. 4B). ATM and RPA are major nodes in this interaction network. The recruitment of ATM is consistent with studies demonstrating that ATR inhibition leads to ATM activation (1). The strong enrichment of RPA subunits and RPA-interacting proteins at collapsed forks is consistent with our previous observation that large amounts of nascent-strand ssDNA are generated in these conditions due, in part, to resection of a double-strand break (6). In addition to all three subunits of RPA, this sample contained the RPA-interacting helicases BLM and WRN, the fork regression enzyme SMARCAL1, and the double-strand break response proteins ATM, MDC1, RAD51, and BRIP1. SMARCAL1 is one of the most highly enriched proteins. ATR phosphorylates SMARCAL1 to limit enzymatic processing of stalled forks, and unregulated SMARCAL1 contributes to fork collapse (6). ATR was also enriched in this dataset (2.5-fold). However, its p value was just outside of the cutoff for significance (p = 0.054).
Two of the most enriched proteins at collapsed forks are MMS22L and TONSL. The MMS22L-TONSL complex is recruited to sites of RPA-coated ssDNA to promote recombination repair of damaged replication forks (31–33). The MMS22L-TONSL complex facilitates homologous recombination after DNA end resection through promoting RAD51 filament formation. These results are also consistent with the idea that ATR prevents the formation of double-strand breaks and nascent ssDNA at replication forks (6).
The collapsed fork dataset is enriched in ATM/ATR substrates (15 proteins, p = 3 × 10⁻⁵). Seven have PCNA-interacting motifs (p = 0.008), and 23 of the proteins cause increased DNA damage signaling when silenced by siRNA (Fig. 4C).
Of the ~240 new putative replication/replication stress response proteins identified in the iPOND-MS screens, we selected 148 for further analysis. Specifically, we were interested in identifying new proteins that might be important for continued replication under stressful conditions. Therefore, we performed an RNA interference screen using four siRNAs targeting many of the genes encoding proteins without clear functions that had higher enrichment or statistical significance values or for which some published literature or domain structure suggested a function in nucleic acid metabolism. Following siRNA transfection, cells were treated with hydroxyurea overnight to stall replication and then allowed to recover for 4 h in the presence of EdU. Cells were then fixed and stained for EdU and γH2AX intensity. Control U2OS cells recover quickly from this acute replication stress challenge and complete DNA replication with very little loss of viability within the time frame of the experiment (6, 34). The expectation is that knockdown of genes encoding proteins needed to maintain replication fork stability or facilitate replication fork recovery will yield low levels of EdU and high levels of γH2AX. Indeed, as a positive control, ATR silencing results in a high ratio of γH2AX to EdU values because forks collapse into double-strand breaks and do not resume DNA synthesis (Fig. 5A). Thus, knockdown of a gene that functions in an ATR or related replication stress response pathway would be predicted to cause high γH2AX/EdU ratios. The results of the screen are shown in Fig. 5A and supplemental Table S4. Seventeen genes passed our stringent criteria and had at least two individual siRNAs yielding an elevated score (> 2) with a false discovery rate-adjusted p value < 0.001.
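The gene-level criterion (at least two of four siRNAs scoring) can be expressed compactly. This is a minimal sketch with hypothetical gene names and values; it assumes each siRNA has already been assigned a γH2AX/EdU ratio and a false discovery rate-adjusted p value.

```python
def scoring_sirnas(sirna_results, min_ratio=2.0, max_adj_p=0.001):
    """Count siRNAs with an elevated score (> 2) and adjusted p < 0.001."""
    return sum(1 for ratio, adj_p in sirna_results
               if ratio > min_ratio and adj_p < max_adj_p)


def is_hit(sirna_results, min_sirnas=2):
    """A gene is a hit when at least `min_sirnas` of its siRNAs score."""
    return scoring_sirnas(sirna_results) >= min_sirnas


# Hypothetical (ratio, adjusted p) pairs for four siRNAs per gene.
screen = {
    "GENE_X": [(3.1, 1e-5), (2.4, 5e-4), (1.2, 0.2), (2.8, 2e-4)],
    "GENE_Y": [(2.6, 1e-4), (1.1, 0.5), (1.3, 0.3), (0.9, 0.8)],
}
hits = [gene for gene, results in screen.items() if is_hit(results)]
print(hits)
```

Requiring agreement between independent siRNAs reduces the chance that a single off-target effect produces a false positive.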
Two genes, PPP1R10 (PNUTS) and SMARCA1 (SNF2L) had three of four siRNAs yield an elevated score. PNUTS is a targeting subunit for protein phosphatase 1. It is recruited to sites of ionizing radiation-induced DNA damage, is important for DNA repair, and loss of PNUTS function causes G2 checkpoint activation in unperturbed cells (35). Thus, the PNUTS-PP1c phosphatase likely has critical functions at stalled replication forks or after fork collapse to promote fork restart.
SNF2L is an ATP-dependent chromatin remodeling protein. Interestingly, the highly related protein SNF2H (SMARCA5) did not phenocopy loss of SNF2L in this screen (Fig. 5B) even though we confirmed that both SNF2L and SNF2H are enriched at active elongating replication forks using standard iPOND combined with immunoblotting (Fig. 5, C and D). Both of these proteins are motor proteins in ISWI chromatin remodeling complexes, which reposition nucleosomes during transcription and other nucleic acid metabolic processes (36). Each protein forms several protein complexes with accessory factors, including BAZ1A and BAZ1B (37). BAZ1B (also known as Williams syndrome transcription factor) was also identified in our iPOND-MS dataset at active forks and confirmed by immunoblotting (Fig. 5, C and D), but, like SNF2H, it did not yield an elevated γH2AX/EdU ratio compared with controls. A BAZ1B-SNF2H complex interacts with PCNA and regulates chromatin compaction during replication (38). SNF2H is also recruited to double-strand breaks where it functions to help unfold chromatin (39). Less is known about SNF2L function in DNA damage responses, but silencing SNF2L by RNA interference increases the amount of DNA damage signaling in cells (40). Because iPOND selectively purifies proteins behind the fork in complexes with newly synthesized DNA, our data suggest that SNF2L and SNF2H function on newly deposited chromatin. Silencing of the two chromatin remodelers does not yield identical phenotypes, indicating that the proteins perform non-redundant roles and that SNF2L may have an especially important function in the context of replication stress.
Coupling iPOND with two-dimensional LC-MS/MS is a powerful discovery tool. We identified 290 proteins at active, stalled, and collapsed forks. Providing validation of the approach, the dataset is highly enriched in proteins known to function in DNA damage responses, cell cycle control, DNA repair, and replication. For example, at normally elongating replication forks, 15 of the top 20 proteins, as measured by fold enrichment and p value, are established replisome components and chromatin replication factors. These include the replicative polymerases, PCNA, the replication-loading complex RFC (RFC1–5), and the chromatin assembly factors CAF1A and CAF1B. The stalled fork dataset was enriched for DNA damage response proteins beyond what would be expected by chance. Collapsed replication forks exhibited strong enrichment of RPA and RPA-interacting proteins, double-strand break repair proteins, and fork-remodeling helicases.
While this manuscript was in preparation, the Fernandez-Capetillo group (41) completed an iPOND-MS study that examined only proteins enriched at active forks. They identified many of the same replisome components, including ATAD5, BAZ1B, CHAF1A, CHAF1B, DNMT1, EXO1, LIG1, MSH2, MSH3, MSH6, PCNA, POLD1, POLE, RFC1–5, UHRF1, and WIZ. They also identified the MCM helicase complex, which was not enriched in our datasets. In immunoblotting experiments we have observed variable results in detecting the MCM proteins. We suspect that the differences are due to how much cleavage of the ssDNA at the fork happens during the iPOND processing. The MCM proteins function to unwind parental DNA and are not directly associated with newly synthesized, EdU-labeled DNA. Thus, detection of the MCM helicase would rely on purifying larger fragments of DNA containing both nascent and parental strands. Other differences in methodologies may further explain differences in the datasets. Most notably, we used ~10-fold fewer cells in our samples and examined hydroxyurea-stalled and hydroxyurea/ATR inhibitor-induced collapsed forks in addition to active forks. The decreased amount of starting material may also explain why many known replication and stress response proteins were not identified. Nonetheless, both datasets provide useful resources for investigators interested in replication and replication stress responses. Finally, we would caution that although we applied stringent criteria for protein identification and enrichment, further validation of the candidate proteins is required, especially in cases with higher p values and lower enrichment scores.
The mismatch repair proteins MSH2, MSH3, and MSH6 were some of the most highly enriched proteins at unperturbed replication forks. The high level of enrichment of mismatch repair proteins is unlikely to be due to the need to remove true mismatches because the polymerase error rate is low. More likely, the mismatch repair proteins are scanning for errors in conjunction with replication, as shown recently for the yeast MMR system (29), or possibly involved in removing ribonucleotides from the DNA (42). It is also possible that the MMR proteins may recognize EdU-labeled DNA. However, any DNA damage because of EdU incorporation does not activate a DNA damage signaling pathway in the time frame of these experiments, and very little (if any) of the EdU is removed from the DNA because we do not observe a decrease in chromatin capture after growing cells for hours after the EdU labeling (8).
The FANCI and FANCD2 proteins are highly enriched at stalled and collapsed replication forks. FANCI was also detected at active forks. FANCI and FANCD2 function during interstrand cross-link repair (43). These lesions are some of the most difficult-to-repair substrates, requiring specialized repair mechanisms governed by genes mutated in patients with Fanconi anemia as well as components of nucleotide excision and double-strand break repair (44, 45). FANCD2 is ubiquitylated in response to hydroxyurea and even as cells enter a normal S phase. Thus, the FANCI and FANCD2 proteins may recognize DNA structures generated during replication stress, such as ssDNA-dsDNA junctions (46), and may have functions outside of cross-link repair. Indeed, FANCD2 promotes the restart of aphidicolin-stalled replication forks (47). Alternatively, it is possible that the FANCI and FANCD2 proteins were identified because of a small amount of continued EdU incorporation during the beginning of the formaldehyde fixation. If this were the case, it might create protein cross-links to the DNA that could be recognized by the Fanconi proteins (22).
The high-level enrichment of the heterotrimeric ssDNA-binding protein RPA is a striking feature of stalled replication forks that collapse after ATR inhibition. Concomitant with RPA accumulation, we observed enrichment of the disease-associated helicases BLM, CHD1L, SMARCAL1, and WRN as well as many other RPA interacting proteins. These data are consistent with the recent observation that ATR inhibition causes the extensive production of nascent-strand ssDNA at replication forks through a process involving fork reversal, enzymatic cleavage, and end resection (6).
Finally, our data confirm important functions for chromatin remodeling enzymes, including SNF2L and SNF2H, at replication forks. The highly related SNF2H and SNF2L chromatin remodelers are the motor enzymes of ISWI complexes (36). In complex with BAZ1B, SNF2H is recruited to replication forks via an interaction with PCNA to maintain the chromatin landscape through DNA replication (38). The activity of SNF2L at replication forks has not been described, but our data indicate that it must have an important non-redundant function that, perhaps, is especially needed in the context of replication stress.
Collectively, our data indicate that iPOND can be combined with mass spectrometry to provide a powerful discovery approach. In addition to analyzing normal, stalled, and collapsed forks, there are many other instances in which iPOND-MS analysis would be useful to understand DNA repair, replication, and chromatin biology.
*This work was supported, in whole or in part, by National Institutes of Health Grants R01 CA136933 and R21 ES22319 (to D. C.) and P30 ES000267. This work was also supported by Department of Defense Breast Cancer Research Program Predoctoral Fellowship W81XWH-10-1-0226 (to B. M. S.), by a grant from Swim Across America (to B. M. S.), by a Center of Molecular Toxicology Grant for Vanderbilt Proteomics Core use, and by the Vanderbilt-Ingram Cancer Center.
This article contains supplemental Tables S1–S4.
2The abbreviations used are: