Nineteen Thermococcus kodakarensis strains have been constructed, each of which synthesizes a different His6-tagged protein known or predicted to be a component of the archaeal DNA replication machinery. Using the His6-tagged proteins, stable complexes assembled in vivo have been isolated directly from clarified cell lysates and the T. kodakarensis proteins present have been identified by mass spectrometry. Based on the results obtained, a network of interactions among the archaeal replication proteins has been established that confirms previously documented and predicted interactions, provides experimental evidence for previously unrecognized interactions between proteins with known functions and with unknown functions, and establishes a firm experimental foundation for archaeal replication research. The proteins identified and their participation in archaeal DNA replication are discussed and related to their bacterial and eukaryotic counterparts.
DNA replication is a central and essential event in all cell cycles. Historically, the biological world was divided into prokaryotes and eukaryotes, based on the absence or presence of a nuclear membrane, and many components of the DNA replication machinery have been identified and characterized as conserved or nonconserved in prokaryotic versus eukaryotic organisms. However, it is now known that there are two evolutionarily distinct prokaryotic domains, Bacteria and Archaea, and to date, most prokaryotic replication research has investigated bacterial replication. Here, we have taken advantage of recently developed genetic techniques to isolate and identify many proteins likely to be components of the archaeal DNA replication machinery. The results confirm and extend predictions from genome sequencing that the archaeal replication system is less complex but more closely related to a eukaryotic than to a bacterial replication system.
The replisome, the chromosomal DNA replication machinery, is composed of subcomplexes that separate the duplex DNA, prime and synthesize DNA, mature and ligate Okazaki fragments, and facilitate and stabilize events in replication (Fig. 1) (1). Extensive biochemical and genetic research has led to the identification of conserved and domain-specific protein-protein and protein-DNA interactions that direct the assembly, and are required for the movement, functions, and stability of bacterial and eukaryotic replisomes. To date, archaeal replication has received far less experimental attention. Archaea are prokaryotes and, in common with Bacteria, most have a single circular chromosome (~0.8 to 8 Mbp) that is replicated bidirectionally from an origin of replication. With no nuclear membrane, the archaeal replisome is also assembled directly from proteins in the cytoplasm, but based on sequence conservation, most of the archaeal proteins predicted to participate in DNA replication are more closely related to eukaryotic than bacterial proteins (reviewed in reference 2). Only a few of these archaeal proteins have, however, been functionally characterized, and there are some obvious and intriguing absences of archaeal homologues of conserved bacterial and/or eukaryotic replisome proteins. For example, there are no known archaeal homologues of the Escherichia coli τ subunit that couples the leading and lagging strand DNA polymerases and helicase, or of Cdc45 and MCM10, proteins essential for eukaryotic chromosome replication. The functions of these proteins may therefore be carried out by unrelated archaeal proteins or by archaeal homologues with such divergent sequences that they are not readily identified by bioinformatics. Such divergence is exhibited by DNA replication processivity factors. 
Bacterial processivity factors (the β subunit of DNA polymerase III [PolIII]) are homodimers of ~40-kDa subunits, whereas eukaryotic and euryarchaeal processivity factors (proliferating cell nuclear antigen [PCNA]) are homotrimers of ~29-kDa subunits. Members of the phylum Crenarchaeota contain three different PCNA homologues that assemble into heterotrimers (3). The bacterial and eukaryotic/euryarchaeal processivity factors have only ~15% sequence identity but still form complexes with almost identical three-dimensional structures and retain the same functions (4). Here we report the results of experiments that identify many of the proteins likely to participate in DNA replication in the euryarchaeon Thermococcus kodakarensis. With this experimentally documented database available, a firm foundation is established for focused research on individual archaeal replication components and for investigative exploitation of this simpler prokaryotic model for eukaryotic replication.
Obtaining the information reported was made possible by the recent development of genetic tools for T. kodakarensis KOD1, a heterotrophic hyperthermophile with a 2.09-Mbp genome that has ~2,300 annotated genes (5). T. kodakarensis is naturally competent for DNA uptake and incorporates added DNA into its chromosome by homologous recombination. By constructing DNA molecules with a target gene flanked by chromosomal sequences, the gene can be deleted, inactivated, or replaced with an allele that encodes a modified protein. For this project, we constructed 19 T. kodakarensis strains, each of which has a gene encoding a known or predicted replication protein replaced with the same gene with a hexahistidine-encoding sequence (His6 tag) added in frame at either the 5′ or the 3′ terminus. As the modified genes were expressed from the wild-type loci, they were subject to the same regulation as the wild-type genes. The His6-tagged proteins synthesized in vivo were isolated directly by Ni2+ affinity from clarified cell lysates, and the T. kodakarensis proteins that were coisolated, as components of stable complexes assembled in vivo, were then identified by mass spectrometry. As reported and discussed, the identities of these proteins confirm some, but not all, of the predicted archaeal replisome interactions, reveal unpredicted associations, and provide experimental evidence for additional replication components.
Lysates were generated from exponentially growing but not synchronized cell populations and so contained complexes present at all stages of the replication cycle. The His6-tagged proteins (Table 1) were all synthesized as soluble proteins and were present in readily detectable amounts in the clarified lysates. All of the putative protein-protein interactions detected, based on the coisolation of a protein with a His6-tagged protein, are documented in Table S1 in the supplemental material. The consistent interactions that remained, after the exclusion of proteins whose annotated functions argue strongly against a role in nucleic acid metabolic processes, are listed in Table 2 and illustrated as a network in Fig. 2. Many of the interactions were confirmed by coisolation of the same proteins when different interacting partners were His6 tagged and used to isolate the complex. The results include both previously established and previously unknown interactions between documented, predicted, and previously unrecognized components of the archaeal replisome. In some cases, when two or three homologous proteins with very similar sequences were present and different homologues were His6 tagged, the same proteins were coisolated, consistent with functional redundancy. When this was not the case, the results argue for divergence of the homologues to the extent that different interactions are made, suggesting different functions.
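The reciprocal-confirmation logic described above can be sketched in a few lines of Python. This is only an illustration, not the analysis pipeline used in this study: the (bait, prey) pairs are a small subset transcribed from statements in the text rather than from Table 2, and the function name `reciprocal_pairs` is hypothetical.

```python
# (bait, prey) pairs: the His6-tagged protein and a protein coisolated with it.
# Assumption: this short list is transcribed from the text, not from Table 2.
COISOLATIONS = [
    ("PCNA1", "RFC-S"),
    ("PCNA1", "RFC-L"),
    ("PCNA1", "Cdc6"),
    ("PCNA1", "MCM1"),
    ("PCNA1", "MCM2"),
    ("Cdc6", "PCNA1"),
    ("RFC-L", "PCNA2"),
    ("GINS15", "PolD-L"),
    ("GINS15", "PolD-S"),
    ("GINS23", "PCNA1"),
    ("GINS23", "PCNA2"),
]


def reciprocal_pairs(pairs):
    """Return the interactions seen in both bait/prey orientations."""
    observed = set(pairs)
    return {
        frozenset((bait, prey))
        for bait, prey in observed
        if bait != prey and (prey, bait) in observed
    }


confirmed = reciprocal_pairs(COISOLATIONS)
```

In this subset, only the PCNA1-Cdc6 interaction is detected with either partner tagged; the other pairs were observed in one orientation only.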
As some replisome complexes were already well documented, the coisolation of the proteins known to be components of these complexes validated and provided a measure of the sensitivity of the His6 tag-dependent coisolation technology. Some examples of these validating interactions are described individually below, and all are listed in Table 2 and documented in Fig. 2.
(i) Polypeptide subunits of DNA polymerase D (PolD), primase, replication factor C (RFC), and the GINS complex. It is well established that the archaeal replisome components PolD, primase, RFC, and the GINS complex (from the Japanese go-ichi-ni-san, meaning 5-1-2-3, after the four related subunits of the eukaryotic complex, Sld5, Psf1, Psf2, and Psf3) are each formed by the assembly of two different polypeptides (2, 6, 7). Consistent with this, the two polypeptides annotated as the subunits of these replisome proteins in T. kodakarensis were coisolated with very high MASCOT scores (Table 2). Additional experiments with recombinant proteins also confirmed that, as predicted, the two primase, the two GINS, and the two RFC subunits assembled in vitro to form a heterodimer, a heterotetramer, and a heteropentamer, respectively (data not shown).
(ii) Replication protein A (RPA) [single-stranded DNA (ssDNA)-binding protein] heterotrimers. T. kodakarensis has three genes (TK1959, TK1960, and TK1961) that encode homologues of the polypeptides that form the eukaryotic trimeric RPA complex (RPA1, RPA2, and RPA3, respectively). In Pyrococcus furiosus, three RPA homologues have also been identified and shown to form an active heteromeric complex (8). Consistent with this, T. kodakarensis RPA1 and RPA2 were coisolated by Ni2+ binding of His6-tagged RPA3 (Table 2).
(iii) RFC-PCNA complex formation. RFC-PCNA binding has been reported in all of the replication systems investigated (9). T. kodakarensis has two genes that encode PCNA homologues, PCNA1 and PCNA2 (TK0535p and TK0582p, respectively). Both the small (RFC-S; TK2218p) and large (RFC-L; TK2219p) subunits of RFC were coisolated with His6-tagged PCNA1, and PCNA2 was coisolated with His6-tagged RFC-L (Table 2). These coisolation results are consistent with both PCNA homologues participating in DNA replication and with replication complexes assembled in vivo containing a mixture of PCNA1 and PCNA2. In vitro experiments have also confirmed functional interactions between RFC and both PCNA proteins (J. Hurwitz; personal communication).
(iv) PCNA-PolD-Fen1-ligase interactions. PCNA interactions with DNA polymerases increase their processivity (10). PCNA also binds and regulates the activity of a number of enzymes participating in Okazaki fragment maturation and postreplication processes (summarized in references 11 and 12). Consistent with these reports, PCNA1 was coisolated in complexes with PolD-L (large subunit of euryarchaeon-specific PolD), Fen1, and DNA ligase (Table 2). There was no evidence for a PCNA-PolB interaction when either PolB or the PCNA proteins were tagged. Such an interaction may not, however, be detectable in a soluble extract given that the bacterial and eukaryotic processivity factors (the β subunit and PCNA, respectively) must encircle the DNA to form a stable complex with the polymerase. PolD was also coisolated with His6-tagged DNA ligase, adding support to the hypothesis that DNA ligase is associated with the archaeal replication fork (13).
(i) PCNA interactions. Many proteins have been reported to interact with eukaryotic PCNA (11), but only a few of these have recognizable homologues in Archaea. Most of the proteins that bind to PCNA do so via a PIP box sequence (14, 15). In addition to proteins expected to copurify with PCNA (see above), Cdc6 (TK1901p), MCM1 (TK0096p), and MCM2 (TK1361p) were copurified with His6-tagged PCNA1, and these do contain PIP box-related sequences (QRAKEAFY in Cdc6p, QKPYENFW and QSKPGFY in MCM1p, and QERVIGFL in MCM2p). Three additional proteins that have no known functions but contain PIP box-related sequences were also routinely coisolated in complexes with His6-tagged PCNA2, namely, TK0569p (QPRSPFYP), TK0953p (QALAEWYA), and TK1046p (QGYRESFA). MCM and PCNA are both established replisome participants, but this is the first experimental evidence for their copresence within a stable complex, and the presence of the PIP box sequence suggests a direct MCM-PCNA interaction. The possible roles of a PCNA-Cdc6 interaction are discussed below. Homologues of TK0569p are present in Archaea and Bacteria, and homologues of TK1046p are present in all three domains (discussed below). Homologues of TK0953p are present in a small number of archaeal and bacterial species, and each appears to have an ATPase domain.
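To illustrate what "PIP box-related" means for the sequences listed above, the following Python sketch compares them with two patterns. Both patterns are assumptions made for illustration: `STRICT_PIP` encodes a commonly cited form of the PIP box consensus (Q, two residues, I/L/M, two residues, two aromatic F/Y residues), and `RELAXED_PIP` is a hypothetical loosened pattern (Q, four or five residues, then F, W, or Y). Neither is the criterion used in this study.

```python
import re

# Sequences exactly as listed in the text above.
PIP_RELATED = {
    "Cdc6p": ["QRAKEAFY"],
    "MCM1p": ["QKPYENFW", "QSKPGFY"],
    "MCM2p": ["QERVIGFL"],
    "TK0569p": ["QPRSPFYP"],
    "TK0953p": ["QALAEWYA"],
    "TK1046p": ["QGYRESFA"],
}

# Assumption: a commonly cited form of the PIP box consensus.
STRICT_PIP = re.compile(r"Q..[ILM]..[FY][FY]")
# Hypothetical relaxed pattern, written only for this illustration.
RELAXED_PIP = re.compile(r"Q.{4,5}[FWY]")


def classify(sequence):
    """Label a peptide as 'strict', 'relaxed', or 'none'."""
    if STRICT_PIP.search(sequence):
        return "strict"
    if RELAXED_PIP.search(sequence):
        return "relaxed"
    return "none"


labels = {
    protein: [classify(seq) for seq in seqs]
    for protein, seqs in PIP_RELATED.items()
}
```

Under these assumed patterns, every listed sequence matches the relaxed form and none matches the strict one, which is consistent with describing them as related to, rather than identical to, the PIP box.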
(ii) GINS interactions. In eukaryotes, the GINS complex is an assembly of four different polypeptides (designated Sld5, Psf1, Psf2, and Psf3) that interact with several replisome components, including MCM and the Pol α-primase complex (summarized in references 16 and 17). The GINS complex plays a role in both the initiation and elongation phases of DNA replication. All archaeal genomes contain a single protein, designated GINS15, that has sequence similarity to Sld5 and Psf1. Some Archaea, including T. kodakarensis, also have a protein designated GINS23 that is related to Psf2 and Psf3 (18) and forms a tetrameric complex that contains two GINS15 and two GINS23 subunits (18). Both subunits of PolD were coisolated using His6-tagged GINS15, and PCNA1 and PCNA2 were both coisolated with His6-tagged GINS23, providing the first experimental evidence for a stable replisome association of the GINS complex with PolD and PCNA.
TK1252p, a protein coisolated with His6-tagged GINS15 (Table 2), is annotated as an ssDNA-specific exonuclease with some homology to bacterial RecJ. Intriguingly, a protein (SSO0295p) predicted to have a DNA-binding domain similar to that in RecJ, copurified with the GINS complex from Sulfolobus solfataricus (19). RecJ plays a role in stalled replication fork activation in E. coli (20), suggesting that SSO0295p and TK1252p may similarly help in maintaining replication fork progression. SSO0295p and TK1252p are not, however, related proteins. These observations suggest that the eukaryotic GINS complex may also associate with an as-yet-unidentified nuclease.
(iii) Rad50 interactions. Eukaryotic Rad50 is part of a complex with Mre11 and Nbs1 that is required for double-strand DNA break repair (reviewed in reference 21) and also plays a role during replication. This complex may help prevent replication fork-associated damage by serving as a scaffold that maintains the fork during replication pauses (for example, see reference 22). T. kodakarensis Rad50 (TK2211p) was coisolated in complexes using His6-tagged MCM1, primase, and RPA2 (Table 2). This is consistent with Rad50 also being present in the archaeal replisome and participating in a replication-related function in both eukaryotes and archaea. An interaction of eukaryotic Rad50 and RPA has also been reported (23), and based on the results obtained with T. kodakarensis Rad50, it seems reasonable to predict that eukaryotic Rad50 also interacts with helicase and primase.
(iv) MCM interactions. The MCM proteins are generally considered to function as replicative helicases (24, 25), but the coisolation of both Rad50 and MutS (TK0682p) with His6-tagged MCM1 (Table 2) predicts that the MCM proteins may also participate in DNA repair.
(v) TK1046p interactions. Homologues of TK1046p are present in all three domains. The function(s) of this large protein (147.4 kDa) is unknown, although it does share some sequence similarity with nucleases and it is predicted to have an OB fold, a motif often used for nucleic acid recognition. TK1046p was coisolated in complexes using His6-tagged Fen1, GINS15, MCM3, PCNA2, PolB, RFC-L, RPA2, and TK1792p with very high MASCOT scores (Table 2). This large number of interactions with known replisome enzymes argues strongly that TK1046p is a component of the replication machinery. By extrapolation from the OB fold prediction, TK1046p may be the first recognized example of a conserved nuclease that participates in DNA replication in all three domains.
TK1410p is predicted by sequence similarity to be related to the bacterial primase DnaG, and limited primase activity has been reported for a recombinant version of the TK1410p homologue from S. solfataricus (SSO0079p) (26). These observations suggested that this protein might be part of the replisome. However, the complexes isolated using His6-tagged TK1410p contained no known replisome proteins; instead, they contained components of the exosome (TK1633p and TK1634p). TK1410p was similarly not present in any complex isolated using a known His6-tagged replication protein. Consistent with TK1410p being a part of the exosome, purified exosomes and exosome-containing membrane fractions from S. solfataricus also contain SSO0079p (27, 28). Taken together, the results argue that TK1410p participates in exosome activity rather than in DNA replication.
The interaction network (Fig. 2) and the interactions listed in Table 2 and in Table S1 in the supplemental material were documented using a systematic approach to isolate and identify all of the proteins that copurified with known or predicted archaeal replisome components in T. kodakarensis. All of the T. kodakarensis strains that synthesized His6-tagged proteins grew at the wild-type rate in all of the media tested, minimizing any concerns for the accumulation of aberrant structures or assembly into nonnative complexes. To ensure the same regulation and expression levels, the genes encoding the His6-tagged proteins were expressed from the native chromosomal locations using the wild-type gene expression signals. The results reported provide in vivo confirmation and validation of archaeal replication protein interactions previously documented in vitro and experimental evidence for several previously unrecognized replisome interactions that likely contribute to archaeal and potentially, by extrapolation, also to eukaryotic replication fork assembly, maintenance, and function.
The archaeal Cdc6 proteins bind to the origin of replication, where they are thought to direct the DNA strand separation needed for the initiation of DNA replication and also to recruit other components of the replisome to the origin of replication (summarized in references 6 and 29). Thus, they are functional homologues of bacterial DnaA proteins. The complexes isolated from T. kodakarensis using His6-tagged Cdc6 contained PCNA1, providing the first direct experimental support for an archaeal Cdc6-PCNA interaction, an observation that may be of major significance. A regulatory event known as the regulatory inactivation of DnaA (RIDA) ensures that the E. coli chromosome is replicated only once per cell cycle (reviewed in references 30 and 31). RIDA stimulates the hydrolysis of the active replication initiator ATP-DnaA complex, resulting in inactive ADP-DnaA complexes. The β subunit of PolIII (the functional homologue of PCNA) and the homologous-to-DnaA (Hda) protein are required for this regulation (reviewed in references 30 and 31). The coisolation of PCNA1 and Cdc6 is consistent with a mechanism similar to RIDA existing in Archaea. Hda belongs to the AAA+ family of ATPases and has sequence similarity to the ATPase region of DnaA. As there is no identifiable archaeal Hda homologue, one of the proteins that coisolated with His6-tagged Cdc6 or PCNA proteins may embody the Hda function.
In eukaryotes, Cdc45, MCM, and GINS form a tight complex (referred to as the CMG complex) that moves with the replication fork and is thought to function as the replicative helicase. The GINS complex also interacts with the Pol α-primase complex, which is responsible for primer synthesis on the lagging strand (reference 17 and references therein). To date, no archaeal homologue of Cdc45 has been identified, but several proteins copurified with His6-tagged GINS15 or GINS23 and are therefore candidate functional homologues of Cdc45, including TK0569p, TK1046p, and TK1186p, which also copurified with His6-tagged primase (Fig. 2; Table 2). It has also been proposed that the GINS proteins maintain the integrity of the replisome by linking the replicative polymerase, primase, and helicases, but a direct interaction of GINS with DNA polymerase has not been documented. In Archaea, GINS was previously shown to interact with primase and MCM (19, 32) but not with DNA polymerase. The results now reported (Fig. 2; Table 2) show that both subunits of PolD form a complex with His6-tagged GINS15 and that both PCNA1 and PCNA2 interact with His6-tagged GINS23. When added to the previously reported interactions, these results add substantial experimental support to the hypothesis that GINS functions as the center of the replisome, linking the polymerase, helicase, and primase components.
In eukaryotes, the activities of PCNA and MCM are modulated by ubiquitination and sumoylation (reviewed in references 33 and 34). A small protein similar in size to ubiquitin (~8 kDa; TK0808p) was consistently coisolated with His6-tagged PCNA1, and a second similarly sized protein (~8.5 kDa; TK0590p) was coisolated in complexes using His6-tagged Fen1, MCM1, MCM2, and MCM3 (Table 2). Currently, very little is known of protein modification in Archaea (35–37), but it seems possible that TK0590p and/or TK0808p could form protein conjugates that regulate archaeal replication, as do ubiquitin and SUMO modifications of replication proteins in eukaryotes. Some support for this notion is provided by the observation that PCNA in Haloferax volcanii is stabilized by proteasome disruption (38).
MCM is a hexameric complex that assembles at the leading edge of the replication fork and unwinds the two DNA strands ahead of the replicative polymerase (24, 39, 40). In eukaryotes, MCM is a heterocomplex of six different polypeptides (MCM2 through MCM7). Most of the archaeal species studied in detail to date contain only one MCM polypeptide that assembles to form a homohexamer. Recently, some Archaea have been identified (41–43) with several MCM homologues that are thought to have resulted from gene duplication and/or lateral gene transfer from other Archaea (3, 5, 42, 43).
T. kodakarensis has three genes (TK0096, TK1361, and TK1620) encoding MCM homologues, MCM1, MCM2, and MCM3, respectively, that could assemble to form three different MCM homohexamer complexes and/or many different MCM heterohexamer complexes. The coisolation results argue for the assembly of only homohexameric MCM complexes. MCM2 and MCM3 were not coisolated with His6-tagged MCM1, MCM1 and MCM3 were not coisolated with His6-tagged MCM2, and MCM1 and MCM2 were not coisolated with His6-tagged MCM3. The results obtained are consistent with both MCM1 and MCM2 being part of the replisome, and based on the similarity of their interactions, they may be functionally redundant. In contrast, the results argue that MCM3 participates in complexes that differ from those formed by MCM1 and MCM2 (Fig. 2; Table 2). Only proteins with unknown functions were coisolated using His6-tagged MCM3, and MCM3 was never coisolated with a known His6-tagged replication enzyme. MCM3 appears to be a member of the McmD group (42), one of the two groups of MCM proteins conserved within the order Methanococcales that overall contain four to eight MCM homologues. In eukaryotes, MCM homologues are thought also to participate in transcription, DNA repair, and chromatin remodeling and it seems possible that the archaeal McmD group of MCM proteins might similarly participate in one or more of these processes in Archaea, rather than in DNA replication.
Genes encoding 19 known or putative replication proteins were amplified from T. kodakarensis genomic DNA (Table 1), and the His6-encoding sequence (5′ CATCATCATCATCATCAT 3′) was added, in frame, to either the 3′ or the 5′ terminus by overlapping PCR (44). Full details of the primers used are available upon request. The amplified genes were cloned into pUMT2 (45) using restriction enzymes adjacent to trpE (TK0254) and flanked by ~2-kbp DNA molecules that were amplified from immediately upstream and downstream of the gene of interest. The DNA molecules and the organization of genes cloned into pUMT2 to generate the plasmids used to transform T. kodakarensis KW128 are illustrated in Fig. 3. Plasmid preparations were isolated from E. coli DH5α cells and used directly to transform T. kodakarensis KW128 (ΔpyrF ΔtrpE::pyrF) as previously described (45, 46). Transformants were selected by colony growth at 85°C on plates containing GELRITE-solidified minimal medium that lacked tryptophan. Cultures of representative transformants were grown to stationary phase in MA-YT medium (46) that contained 2 g S/liter. The cells were harvested, and genomic DNA was isolated. The presence of the desired chromosomal construction was confirmed by diagnostic PCR amplification and DNA sequencing as previously described (46). Homologous recombination within the flanking sequences directed integration of the transforming DNA into the T. kodakarensis chromosome. In each case, the wild-type gene of interest was replaced with trpE and the gene that encoded the His6-tagged version of the replication protein.
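As a minimal sketch of what adding the His6-encoding sequence "in frame" means, the Python below inserts the 18-nucleotide tag into a toy open reading frame (ORF) without shifting the reading frame. The toy ORF, the function names, and the exact insertion points (directly after the first codon for a 5′ tag, directly before the stop codon for a 3′ tag) are assumptions for illustration; the constructs in this study were made by overlapping PCR, and the precise junctions are not specified here.

```python
HIS6_DNA = "CATCATCATCATCATCAT"  # six CAT (histidine) codons, as given in the text
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split a DNA sequence into codons; reject lengths that break the frame."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def tag_3prime(orf, tag=HIS6_DNA):
    """Insert the tag just before the stop codon (assumed junction)."""
    codons = split_codons(orf)
    if codons[-1] not in STOP_CODONS:
        raise ValueError("ORF does not end with a stop codon")
    return "".join(codons[:-1]) + tag + codons[-1]


def tag_5prime(orf, tag=HIS6_DNA):
    """Insert the tag just after the first codon (assumed junction)."""
    codons = split_codons(orf)
    return codons[0] + tag + "".join(codons[1:])


TOY_ORF = "ATGGCTAAATAA"  # hypothetical ORF: Met-Ala-Lys-stop
c_tagged = tag_3prime(TOY_ORF)
n_tagged = tag_5prime(TOY_ORF)
```

Because the tag length is a multiple of three, both tagged sequences keep the downstream codons in the original frame.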
The His6-encoding sequence was also added to genes that encode a subunit of RPA (RPA1; TK1959), the small subunit of euryarchaeal DNA polymerase D (PolD-S; TK1902), and the small subunit of the dimeric primase (Pri-S; TK1791). Transformation with these constructs failed to generate viable T. kodakarensis transformants, suggesting that the His6 extension resulted in defective enzymes.
T. kodakarensis cells were harvested by centrifugation from 5-liter cultures grown to late exponential phase (optical density at 600 nm of ~0.8) (see Fig. S1 in the supplemental material) at 80°C in MA-YT medium supplemented with 5 g sodium pyruvate/liter using a BioFlow 415 fermentor (New Brunswick Scientific). The cells were resuspended in 30 ml of buffer A (25 mM Tris-HCl [pH 8], 500 mM NaCl, 10 mM imidazole, 10% glycerol) and lysed by sonication. After centrifugation, the resulting clarified lysate was loaded onto a 1-ml HiTRAP chelating column (GE Healthcare) preequilibrated with NiSO4. The column was washed with buffer A, and proteins were eluted using a linear imidazole gradient from buffer A to 67% buffer B (25 mM Tris-HCl [pH 8], 100 mM NaCl, 150 mM imidazole, 10% glycerol). Fractions that contained the tagged protein were identified by Western blotting, pooled, and dialyzed against buffer C (25 mM Tris-HCl [pH 8], 500 mM NaCl, 0.5 mM EDTA, 2 mM dithiothreitol). Thirty-microgram aliquots of the proteins present in solution were precipitated by adding trichloroacetic acid (TCA; 15% final concentration).
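For orientation, the composition at the end of the gradient (67% buffer B) can be estimated from the buffer recipes above. The Python sketch below assumes ideal linear mixing by volume; the function name `mix` is hypothetical, and only the two components that differ between buffers A and B are tracked.

```python
# Concentrations in mM, from the buffer recipes in the text.
BUFFER_A = {"NaCl": 500.0, "imidazole": 10.0}
BUFFER_B = {"NaCl": 100.0, "imidazole": 150.0}


def mix(fraction_b, a=BUFFER_A, b=BUFFER_B):
    """Concentrations (mM) at a given volume fraction of buffer B.

    Assumes ideal linear mixing of the two buffers.
    """
    if not 0.0 <= fraction_b <= 1.0:
        raise ValueError("fraction_b must be between 0 and 1")
    return {
        component: (1.0 - fraction_b) * a[component] + fraction_b * b[component]
        for component in a
    }


end_of_gradient = mix(0.67)
```

Under this assumption, the gradient ends at approximately 104 mM imidazole and 232 mM NaCl, so the salt concentration falls as the imidazole concentration rises.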
The TCA-precipitated proteins were identified by multidimensional protein identification technology at the Ohio State University mass spectrometry facility (http://www.ccic.ohio-state.edu/MS/proteomics.htm) using the MASCOT search engine. A MASCOT score of >100 was considered meaningful. To obtain such a score, a minimum of two unique peptide fragments usually had to be identified from the same protein. Protein isolation and mass spectrometry analyses of lysates from two independent cultures of T. kodakarensis KW128 were also undertaken. From these controls, several T. kodakarensis proteins were identified that bound and eluted from the Ni2+-charged matrix in the absence of a His6-tagged protein. All of the proteins identified in the experimental samples that had MASCOT scores of >100 and were not also present in the control samples are listed in Table S1 in the supplemental material.
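The selection rule described above (MASCOT score of >100 and absent from the untagged KW128 controls) can be sketched as follows. The protein names and scores are invented placeholders, not data from this study, and the function name `specific_hits` is hypothetical.

```python
MASCOT_THRESHOLD = 100  # a score of >100 was considered meaningful


def specific_hits(sample_scores, control_proteins, threshold=MASCOT_THRESHOLD):
    """Keep proteins scoring above the threshold that are absent from controls."""
    return {
        protein: score
        for protein, score in sample_scores.items()
        if score > threshold and protein not in control_proteins
    }


# Hypothetical example data (placeholder names and scores).
sample = {
    "protein_A": 850,  # kept: high score, not in controls
    "protein_B": 100,  # dropped: score is not greater than 100
    "protein_C": 400,  # dropped: also binds the matrix without a His6 tag
    "protein_D": 60,   # dropped: low score
}
controls = {"protein_C"}

hits = specific_hits(sample, controls)
```

Note that the comparison is strict, so a score of exactly 100 is excluded.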
We thank J. Hurwitz for providing research results before publication.
This work was supported by grants from the National Science Foundation (MCB-0815646 to Z.K.) and the National Institutes of Health (R01GM53185 to J.N.R. and 1F32-GM073336 to T.J.S.).
Citation: Li, Z., T. J. Santangelo, L. Čuboňová, J. N. Reeve, and Z. Kelman. 2010. Affinity purification of an archaeal DNA replication protein network. mBio 1(5):e00221-10. doi:10.1128/mBio.00221-10.