Comparative analysis of the protein sequences encoded in the genomes of three families of large DNA viruses that replicate, completely or partly, in the cytoplasm of eukaryotic cells (poxviruses, asfarviruses, and iridoviruses) and phycodnaviruses that replicate in the nucleus reveals 9 genes that are shared by all of these viruses and 22 more genes that are present in at least three of the four compared viral families. Although orthologous proteins from different viral families typically show weak sequence similarity, because of which some of them have not been identified previously, at least five of the conserved genes appear to be synapomorphies (shared derived characters) that unite these four viral families, to the exclusion of all other known viruses and cellular life forms. Cladistic analysis with the genes shared by at least two viral families as evolutionary characters supports the monophyly of poxviruses, asfarviruses, iridoviruses, and phycodnaviruses. The results of genome comparison allow a tentative reconstruction of the ancestral viral genome and suggest that the common ancestor of all of these viral families was a nucleocytoplasmic virus with an icosahedral capsid, which encoded complex systems for DNA replication and transcription, a redox protein involved in disulfide bond formation in virion membrane proteins, and probably inhibitors of apoptosis. The conservation of the disulfide-oxidoreductase, a major capsid protein, and two virion membrane proteins indicates that the odd-shaped virions of poxviruses have evolved from the more common icosahedral virion seen in asfarviruses, iridoviruses, and phycodnaviruses.
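The counting behind statements such as "shared by all of these viruses" and "present in at least three of the four compared viral families" can be illustrated with a minimal sketch. It assumes that orthologous genes have already been assigned to families; the function name `conserved_genes` and the gene labels are hypothetical toy values, not data from the study.

```python
from collections import Counter

def conserved_genes(presence, min_families):
    """Return genes found in at least `min_families` of the given families.

    `presence` maps a family name to the set of (orthologous) gene labels
    detected in that family. Labels here are placeholders.
    """
    counts = Counter(gene for genes in presence.values() for gene in set(genes))
    return sorted(gene for gene, n in counts.items() if n >= min_families)

# Toy presence/absence data (hypothetical, for illustration only).
toy = {
    "poxviruses": {"geneA", "geneB", "geneC"},
    "asfarviruses": {"geneA", "geneB", "geneD"},
    "iridoviruses": {"geneA", "geneB", "geneC"},
    "phycodnaviruses": {"geneA", "geneC"},
}

core = conserved_genes(toy, min_families=4)         # present in all four
near_core = conserved_genes(toy, min_families=3)    # present in three or more
```

In this toy set, `core` contains only `geneA`, while `near_core` also includes `geneB` and `geneC`.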
The mitochondrial DNA of trypanosomes contains two types of circular DNAs, minicircles and maxicircles. Both minicircles and maxicircles replicate from specific replication origins by unidirectional theta-type intermediates. Initiation of the minicircle leading strand and also that of at least the first Okazaki fragment involve RNA priming. The Trypanosoma brucei genome encodes two mitochondrial DNA primases, PRI1 and PRI2, related to the primases of eukaryotic nucleocytoplasmic large DNA viruses. These primases are members of the archeoeukaryotic primase superfamily, and each of them contains an RNA recognition motif and a PriCT-2 motif. In Leishmania species, PRI2 proteins are approximately 61 to 66 kDa in size, whereas in Trypanosoma species, PRI2 proteins have additional long amino-terminal extensions. RNA interference (RNAi) of T. brucei PRI2 resulted in the loss of kinetoplast DNA and accumulation of covalently closed free minicircles. Recombinant PRI2 lacking this extension (PRI2ΔNT) primes poly(dA) synthesis on a poly(dT) template in an ATP-dependent manner. Mutation of two conserved aspartate residues (PRI2ΔNTCS) resulted in loss of enzymatic activity but not loss of DNA binding. We propose that PRI2 is directly involved in initiating kinetoplast minicircle replication.
Primase and GINS are essential factors for chromosomal DNA replication in eukaryotic and archaeal cells. Here we describe a previously undetected relationship between the C-terminal domain of the catalytic subunit (PriS) of archaeal primase and the B-domains of the archaeo-eukaryotic GINS proteins in the form of a conserved structural domain comprising a three-stranded antiparallel β-sheet adjacent to an α-helix and a two-stranded β-sheet or hairpin. The presence of a shared domain in archaeal PriS and GINS proteins, the genes for which are often found adjacent on the chromosome, suggests simple mechanisms for the evolution of these proteins.
This article was reviewed by Zvi Kelman (nominated by Michael Galperin) and Kira Makarova.
Viruses with genomes greater than 300 kb and up to 1200 kb are being discovered with increasing frequency. These large viruses (often called giruses) can encode up to 900 proteins and also many tRNAs. Consequently, these viruses have more protein-encoding genes than many bacteria, and the concept of small particle/small genome that once defined viruses is no longer valid. Giruses infect bacteria and animals although most of the recently discovered ones infect protists. Thus, genome gigantism is not restricted to a specific host or phylogenetic clade. To date, most of the giruses are associated with aqueous environments. Many of these large viruses (phycodnaviruses and Mimiviruses) probably have a common evolutionary ancestor with the poxviruses, iridoviruses, asfarviruses, ascoviruses, and the recently discovered Marseillevirus. One issue that is perhaps not appreciated by the microbiology community is that large viruses, even ones classified in the same family, can differ significantly in morphology, lifestyle, and genome structure. This review focuses on some of these differences rather than providing extensive details about individual viruses.
algal virus; phycodnavirus; Mimivirus; White spot shrimp virus; jumbo phage; NCLDVs
Repair of double-strand breaks in chromosomal DNA is essential. Unfortunately, a paradigm central to most DNA repair pathways—damaged DNA is replaced by polymerases, by using an intact, undamaged complementary strand as a template—no longer works. The nonhomologous end joining (NHEJ) pathway nevertheless still uses DNA polymerases to help repair double-strand breaks. Bacteria use a member of the archaeo-eukaryal primase superfamily, whereas eukaryotes use multiple members of the polymerase X family. These polymerases can, depending on the biologic context, accurately replace break-associated damage, mitigate loss of flanking DNA, or diversify products of repair. Polymerases specifically implicated in NHEJ are uniquely effective in these roles: relative to canonic polymerases, NHEJ polymerases have been engineered to do more with less. Antioxid. Redox Signal. 14, 2509–2519.
Nucleo-Cytoplasmic Large DNA viruses (NCLDV), a diverse group that infects a wide range of eukaryotic hosts, exhibit a large heterogeneity in genome size (between 100 kb and 1.2 Mb) but have been suggested to form a monophyletic group on the basis of a small subset of approximately 30 conserved genes. NCLDV were proposed to have evolved by simplification from a cellular organism, although some of the giant NCLDV have clearly grown by gene accretion from a bacterial origin.
We demonstrate here that many NCLDV lineages appear to have undergone frequent gene exchange in two different ways. Viruses that infect protists directly (Mimivirus) or algae that exist as intracellular symbionts of protists (Phycodnaviruses) acquire genes from a bacterial source. Metazoan viruses such as the Poxviruses show a predominant acquisition of host genes. In both cases, the laterally acquired genes show a strong tendency to be positioned at the tip of the genome. Surprisingly, several core genes believed to be ancestral in the family appear to have undergone lateral gene transfers, suggesting that the NCLDV ancestor might have had a smaller genome than previously believed. Moreover, our data show that the larger the genome, the higher the number of laterally acquired genes. This pattern is incompatible with a genome reduction from a cellular ancestor.
We propose that the NCLDV viruses have evolved by significant growth of a simple DNA virus by gene acquisition from cellular sources.
Eukaryotic DNA replication involves the synthesis of both a DNA leading and lagging strand, the latter requiring several additional proteins including flap endonuclease (FEN-1) and proliferating cell nuclear antigen (PCNA) in order to remove RNA primers used in the synthesis of Okazaki fragments. Poxviruses are complex viruses (dsDNA genomes) that infect eukaryotes, but surprisingly little is known about the process of DNA replication. Given our previous results that the vaccinia virus (VACV) G5R protein may be structurally similar to a FEN-1-like protein and a recent finding that poxviruses encode a primase function, we undertook a series of in silico analyses to identify whether VACV also encodes a PCNA-like protein.
An InterProScan of all VACV proteins using the JIPS software package was used to identify any PCNA-like proteins. The VACV G8R protein was identified as the only vaccinia protein that contained a PCNA-like sliding clamp motif. The VACV G8R protein plays a role in poxvirus late transcription and is known to interact with several other poxvirus proteins including itself. The secondary and tertiary structure of the VACV G8R protein was predicted and compared to the secondary and tertiary structure of both human and yeast PCNA proteins, and a high degree of similarity between all three proteins was noted.
The structure of the VACV G8R protein is predicted to closely resemble the eukaryotic PCNA protein; it possesses several other features including a conserved ubiquitylation and SUMOylation site that suggest that, like its counterpart in T4 bacteriophage (gp45), it may function as a sliding clamp ushering transcription factors to RNA polymerase during late transcription.
The family Mimiviridae belongs to the large monophyletic group of Nucleo-Cytoplasmic Large DNA Viruses (NCLDV; proposed order Megavirales) and encompasses giant viruses infecting amoeba and probably other unicellular eukaryotes. The recent discovery of the Cafeteria roenbergensis virus (CroV), a distant relative of the prototype mimiviruses, led to a substantial expansion of the genetic variance within the family Mimiviridae. In the light of these findings, a reassessment of the relationships between the mimiviruses and other NCLDV and reconstruction of the evolution of giant virus genomes emerge as interesting and timely goals.
Database searches for the protein sequences encoded in the genomes of several viruses originally classified as members of the family Phycodnaviridae, in particular Organic Lake phycodnaviruses and Phaeocystis globosa viruses (OLPG), revealed a greater number of highly similar homologs in members of the Mimiviridae than in phycodnaviruses. We constructed a collection of 898 Clusters of Orthologous Genes for the putative expanded family Mimiviridae (mimiCOGs) and used these clusters for a comprehensive phylogenetic analysis of the genes that are conserved in most of the NCLDV. The topologies of the phylogenetic trees for these conserved viral genes strongly support the monophyly of the OLPG and the mimiviruses. The same tree topology was obtained by analysis of the phyletic patterns of conserved viral genes. We further employed the mimiCOGs to obtain a maximum likelihood reconstruction of the history of gene losses and gains among the giant viruses. The results reveal massive gene gain in the mimivirus branch and modest gene gain in the OLPG branch.
The phylogenomic results reported here suggest a substantial expansion of the family Mimiviridae. The proposed expanded family encompasses a greater diversity of viruses including a group of viruses with much smaller genomes than those of the original members of the Mimiviridae. If the OLPG group is included in an expanded family Mimiviridae, it becomes the only family of giant viruses currently shown to host virophages. The mimiCOGs are expected to become a key resource for phylogenomics of giant viruses.
The mitochondrial genome of trypanosomes is composed of thousands of topologically interlocked circular DNA molecules that form the kinetoplast DNA (kDNA). Most genes encoded by the kDNA require a posttranscriptional modification process called RNA editing to form functional mRNAs. Here, we show that alternative editing of the mitochondrial cytochrome c oxidase III (COXIII) mRNA in Trypanosoma brucei produces a novel DNA binding protein, alternatively edited protein 1 (AEP-1). AEP-1 localizes to the region of the cell between the kDNA and the flagellum and purifies with the tripartite attachment complex, a structure believed to physically link the kDNA and flagellar basal bodies. Expression of the DNA binding domain of AEP-1 results in aberrant kDNA structure and reduced cell growth, indicating that AEP-1 is involved in the maintenance of the kDNA. Perhaps most important, our studies show a gain of function through an alternatively edited mRNA and, for the first time, provide a link between the unusual structure of the kDNA and RNA editing in trypanosome mitochondria.
Primases are specialized DNA-dependent RNA polymerases that synthesize a short oligoribonucleotide complementary to single-stranded template DNA. In the context of cellular DNA replication, primases are indispensable since DNA polymerases are not able to start DNA polymerization de novo.
The primase activity of the replication protein from the archaeal plasmid pRN1 synthesizes a rather unusual mixed primer consisting of a single ribonucleotide at the 5′ end followed by seven deoxynucleotides. Ribonucleotides and deoxynucleotides are strictly required at the respective positions within the primer. Furthermore, in contrast to other archaeo-eukaryotic primases, the primase activity is highly sequence-specific and requires the trinucleotide motif GTG in the template. Primer synthesis starts outside of the recognition motif, immediately 5′ to the recognition motif. The fidelity of the primase synthesis is high, as non-complementary bases are not incorporated into the primer.
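The primer rules described above (a GTG recognition motif in the template, a start immediately 5′ to the motif, one ribonucleotide followed by seven deoxynucleotides) can be sketched as a small prediction routine. This is only an illustration under stated assumptions: the template is written 5′→3′, the primase copies the eight template bases directly upstream (5′) of each GTG while moving toward the template's 5′ end, and the function name `predict_primers` and the example sequence are hypothetical.

```python
# Base-pairing tables: the first primer position is a ribonucleotide,
# so a template A pairs with U there and with T at the deoxy positions.
COMPLEMENT_DNA = {"A": "T", "C": "G", "G": "C", "T": "A"}
COMPLEMENT_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}

def predict_primers(template, motif="GTG", n_deoxy=7):
    """Predict mixed primers for a single-stranded template written 5'->3'.

    Returns a list of (motif_index, ribonucleotide, deoxynucleotides) tuples,
    with the primer read 5'->3'. Motif occurrences that lack enough upstream
    template bases are skipped (an assumption of this sketch).
    """
    template = template.upper()
    primer_len = 1 + n_deoxy
    primers = []
    start = template.find(motif)
    while start != -1:
        if start >= primer_len:
            # Template bases in the order they are copied: beginning with the
            # base immediately 5' of the motif and moving toward the 5' end.
            copied = template[start - primer_len:start][::-1]
            ribo = COMPLEMENT_RNA[copied[0]]
            deoxy = "".join(COMPLEMENT_DNA[base] for base in copied[1:])
            primers.append((start, ribo, deoxy))
        start = template.find(motif, start + 1)  # allow overlapping motifs
    return primers

example = predict_primers("ACGTTAGCAAGTGCC")
```

For the made-up template `ACGTTAGCAAGTGCC`, the single GTG at index 10 yields a primer that begins with ribo-U and continues with the deoxynucleotides `TGCTAAC`.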
African swine fever virus (ASFV) is a member of a family of large nucleocytoplasmic DNA viruses that include poxviruses, iridoviruses, and phycodnaviruses. Previous ultrastructural studies of ASFV using chemical fixation and cryosectioning for electron microscopy (EM) have produced uncertainty over whether the inner viral envelope is composed of a single or double lipid bilayer. In this study we prepared ASFV-infected cells for EM using chemical fixation, cryosectioning, and high-pressure freezing. The appearance of the intracellular viral envelope was determined and compared to that of mitochondrial membranes in each sample. The best resolution of membrane structure was obtained with samples prepared by high-pressure freezing, and images suggested that the envelope of ASFV consisted of a single lipid membrane. It was less easy to interpret virus structure in chemically fixed or cryosectioned material, and in the latter case the virus envelope could be interpreted as having two membranes. Comparison of membrane widths in all three preparations indicated that the intracellular viral envelope of ASFV was not significantly different from the outer mitochondrial membrane (P < 0.05). The results support the hypothesis that the intracellular ASFV viral envelope is composed of a single lipid bilayer.
The set of conserved eukaryotic protein-coding genes includes distinct subsets, one of which appears to be most closely related to and, by inference, derived from archaea, whereas another one appears to be of bacterial, possibly endosymbiotic, origin. The “archaeal” genes of eukaryotes primarily encode components of information-processing systems, whereas the “bacterial” genes are predominantly operational. The precise nature of the archaeo–eukaryotic relationship remains uncertain, and it has been variously argued that eukaryotic informational genes evolved from the homologous genes of Euryarchaeota or Crenarchaeota (the major branches of extant archaea) or that the origin of eukaryotes lies outside the known diversity of archaea. We describe a comprehensive set of 355 eukaryotic genes of apparent archaeal origin identified through ortholog detection and phylogenetic analysis. Phylogenetic hypothesis testing using constrained trees, combined with a systematic search for shared derived characters in the form of homologous inserts in conserved proteins, indicates that, for the majority of these genes, the preferred tree topology is one with the eukaryotic branch placed outside the extant diversity of archaea, although small subsets of genes show crenarchaeal and euryarchaeal affinities. Thus, the archaeal genes in eukaryotes appear to descend from a distinct, ancient, and otherwise uncharacterized archaeal lineage that acquired some euryarchaeal and crenarchaeal genes via early horizontal gene transfer.
archaea; eukaryotes; Euryarchaeota; Crenarchaeota; phylogenetic analysis
The rolling circle DNA replication structures generated by the in vitro phage T4 replication system were analyzed using two-dimensional agarose gels. Replication structures were generated in the presence or absence of T4 primase (gp61), permitting the analysis of replication forks with either duplex or single-stranded tails. A characteristic arc shape was visualized when forks with single-stranded tails were cleaved by a restriction enzyme with the help of an oligonucleotide that anneals to restriction sites in the single-stranded tail. After calibrating the gel system with this well-studied rolling circle replication reaction, we then analyzed the in vivo replication directed by a T4 replication origin cloned within a plasmid. DNA samples were generated from infections with either wild-type or primase-deletion mutant phage. The only replicative arc that could be detected in the wild-type sample corresponded to duplex Y forms, consistent with very efficient lagging strand synthesis. Surprisingly, we obtained evidence for both duplex and single-stranded DNA tails in the samples from the primase-deficient infection. We conclude that a relatively inefficient mechanism primes lagging strand DNA synthesis in vivo when gp61 is absent.
Using sequence profile methods and structural comparisons we characterize a previously unknown family of nucleic acid polymerases in a group of mobile elements from genomes of diverse bacteria, an algal plastid and certain DNA viruses, including the recently reported Sputnik virus. Using contextual information from domain architectures and gene-neighborhoods we present evidence that they are likely to possess both primase and DNA polymerase activity, comparable to the previously reported prim-pol proteins. These newly identified polymerases help in defining the minimal functional core of superfamily A DNA polymerases and related RNA polymerases. Thus, they provide a framework to understand the emergence of both DNA and RNA polymerization activity in this class of enzymes. They also provide evidence that enigmatic DNA viruses, such as Sputnik, might have emerged from mobile elements coding these polymerases.
This article was reviewed by Eugene Koonin and Mark Ragan.
Complex viruses that encode their own initiation proteins and subvert the host’s elongation apparatus have provided valuable insights into DNA replication. Using purified bacteriophage SPP1 and Bacillus subtilis proteins, we have reconstituted a rolling circle replication system that recapitulates genetically defined protein requirements. Eleven proteins are required: phage-encoded helicase (G40P), helicase loader (G39P), origin binding protein (G38P) and G36P single-stranded DNA-binding protein (SSB); and host-encoded PolC and DnaE polymerases, processivity factor (β2), clamp loader (τ-δ-δ′) and primase (DnaG). This study revealed a new role for the SPP1 origin binding protein. In the presence of SSB, it is required for initiation on replication forks that lack origin sequences, mimicking the activity of the PriA replication restart protein in bacteria. The SPP1 replisome is supported by both host and viral SSBs, but phage SSB is unable to support B. subtilis replication, likely owing to its inability to stimulate the PolC holoenzyme in the B. subtilis context. Moreover, phage SSB inhibits host replication, defining a new mechanism by which bacterial replication could be regulated by a viral factor.
The Rudiviridae are a family of rod-shaped archaeal viruses with covalently closed, linear double-stranded DNA (dsDNA) genomes. Their replication mechanisms remain obscure, although parallels have been drawn to the Poxviridae and other large cytoplasmic eukaryotic viruses. Here we report that a protein encoded in the 34-kbp genome of the rudivirus SIRV1 is a member of the replication initiator (Rep) superfamily of proteins, which initiate rolling-circle replication (RCR) of diverse viruses and plasmids. We show that SIRV Rep nicks the viral hairpin terminus, forming a covalent adduct between an active-site tyrosine and the 5′ end of the DNA, releasing a 3′ DNA end as a primer for DNA synthesis. The enzyme can also catalyze the joining reaction that is necessary to reseal the DNA hairpin and terminate replication. The dimeric structure points to a simple mechanism through which two closely positioned active sites, each with a single tyrosine residue, work in tandem to catalyze DNA nicking and joining. We propose a novel mechanism for rudivirus DNA replication, incorporating the first known example of a Rep protein that is not linked to RCR. The implications for Rep protein function and viral replication are discussed.
DNA polymerases cannot synthesize DNA without a primer, and DNA primase is the only specialized enzyme capable of de novo synthesis of short RNA primers. In eukaryotes, primase functions within a heterotetrameric complex in concert with a tightly bound DNA polymerase α (Pol α). In humans, the Pol α part is comprised of a catalytic subunit (p180) and an accessory subunit B (p70), and the primase part consists of a small catalytic subunit (p49) and a large essential subunit (p58). The latter subunit participates in primer synthesis, counts the number of nucleotides in a primer, assists the release of the primer-template from primase and transfers it to the Pol α active site. Recently reported crystal structures of the C-terminal domains of the yeast and human enzymes' large subunits provided critical information related to their structure, possible sites for binding of nucleotides and template DNA, as well as the overall organization of eukaryotic primases. However, the structures also revealed a difference in the folding of their proposed DNA-binding fragments, raising the possibility that yeast and human proteins are functionally different. Here we report a new structure of the C-terminal domain of the human primase p58 subunit. This structure exhibits a fold similar to that reported for the yeast protein but different from that reported previously for the human protein. Based on a comparative analysis of all three C-terminal domain structures, we propose a mechanism of RNA primer length counting and dissociation of the primer-template from primase by a switch in conformation of the ssDNA-binding region of p58.
DNA primase; prim1; prim2; replication; 4Fe-4S cluster; crystal structure; DNA polymerase α
The XPR2 gene encoding an alkaline extracellular protease (AEP) from Yarrowia lipolytica was cloned, and its complete nucleotide sequence was determined. The amino acid sequence deduced from the nucleotide sequence reveals that the mature AEP consists of 297 amino acids with a relative molecular weight of 30,559. The gene codes for a putative 22-amino-acid prepeptide (signal sequence) followed by an additional 135-amino-acid propeptide containing a possible N-linked glycosylation site and two Lys-Arg peptidase-processing sites. The final Lys-Arg site occurs at the junction with the mature, extracellular form. The mature protease contains two potential glycosylation sites. AEP is a member of the subtilisin family of serine proteases, with 42.6% homology to the fungal proteinase K. The functional promoter is more than 700 base pairs long, allowing for the observed complex regulation of this gene. The 5' and 3' flanking regions of the XPR2 gene have structural features in common with other yeast genes.
Until recently there was little interest or information on viruses and viruslike particles of eukaryotic algae. However, this situation is changing. In the past decade many large double-stranded DNA-containing viruses that infect two culturable, unicellular, eukaryotic green algae have been discovered. These viruses can be produced in large quantities, assayed by plaque formation, and analyzed by standard bacteriophage techniques. The viruses are structurally similar to animal iridoviruses, their genomes are similar to but larger (greater than 300 kbp) than that of poxviruses, and their infection process resembles that of bacteriophages. Some of the viruses have DNAs with low levels of methylated bases, whereas others have DNAs with high concentrations of 5-methylcytosine and N6-methyladenine. Virus-encoded DNA methyltransferases are associated with the methylation and are accompanied by virus-encoded DNA site-specific (restriction) endonucleases. Some of these enzymes have sequence specificities identical to those of known bacterial enzymes, and others have previously unrecognized specificities. A separate rod-shaped RNA-containing algal virus has structural and nucleotide sequence affinities to higher plant viruses. Quite recently, viruses have been associated with rapid changes in marine algal populations. In the next decade we envision the discovery of new algal viruses, clarification of their role in various ecosystems, discovery of commercially useful genes in these viruses, and exploitation of algal virus genetic elements in plant and algal biotechnology.
The plasmid pRN1 encodes a multifunctional replication protein with primase, DNA polymerase and helicase activity. The minimal region required for primase activity encompasses amino-acid residues 40–370. While the N-terminal part of that minimal region (residues 47–247) folds into the prim/pol domain and bears the active site, the structure and function of the C-terminal part (residues 248–370) are unknown. Here we show that the C-terminal part of the minimal region folds into a compact domain with six helices and is stabilized by a disulfide bond. Three helices superimpose well with the C-terminal domain of the primase of the bacterial broad host range plasmid RSF1010. Structure-based site-directed mutagenesis shows that the C-terminal helix of the helix bundle domain is required for primase activity although it is distant from the active site in the crystallized conformation. Furthermore, we identified mutants of the C-terminal domain that are defective in template binding, dinucleotide formation and conformational change prior to DNA extension.
Understanding the composition and structure of the acquired enamel pellicle (AEP) has been a major goal in oral biology. Our lab has conducted studies on the composition of AEP formed on permanent enamel. The exhaustive exploration has provided a comprehensive identification of more than 100 proteins from AEP formed on permanent enamel. The AEP formed on deciduous enamel has not been subjected to the same biochemical characterization scrutiny as that of permanent enamel, despite the fact that deciduous enamel is structurally different from permanent enamel. We hypothesized that the AEP proteome and peptidome formed on deciduous enamel may also be composed of unique proteins, some of which may not be common with AEP of permanent enamel explored previously. Pellicle material was collected from 10 children (aged 18–54 months) and subjected to mass spectrometry analysis. A total of 76 pellicle proteins were identified from the deciduous pellicle proteome. In addition, 38 naturally occurring AEP peptides were identified from 10 proteins, suggesting that primary AEP proteome/peptidome presents a unique proteome composition. This is the first study to provide a comprehensive investigation of in vivo AEP formed on deciduous enamel.
acquired enamel pellicle; saliva; proteomics; primary teeth; proteins; oral; LC-MS/MS
Phosphonatase functions in the 2-aminoethylphosphonate (AEP) degradation pathway of bacteria, catalyzing the hydrolysis of the C-P bond in phosphonoacetaldehyde (Pald) via formation of a bi-covalent Lys53ethylenamine/Asp12 aspartylphosphate intermediate. Because phosphonatase is a member of the haloacid dehalogenase superfamily, a family predominantly comprised of phosphatases, the question arises as to how this new catalytic activity evolved. The source of general acid-base catalysis for Schiff-base formation and aspartylphosphate hydrolysis was probed using pH-rate profile analysis of active-site mutants and X-ray crystallographic analysis of modified forms of the enzyme. The 2.9 Å X-ray crystal structure of the mutant Lys53Arg complexed with Mg2+ and phosphate shows that the equilibrium between the open and the closed conformation is disrupted, favoring the open conformation. Thus, proton dissociation from the cap domain Lys53 is required for cap domain-core domain closure. The likely recipient of the Lys53 proton is a water-His56 pair that serves to relay the proton to the carbonyl oxygen of the phosphonoacetaldehyde (Pald) substrate upon addition of the Lys53. The pH-rate profile analysis of active-site mutants was carried out to test this proposal. The proximal core domain residues Cys22 and Tyr128 were ruled out, and the role of cap domain His56 was supported by the results. The X-ray crystallographic structure of wild-type phosphonatase reduced with NaBH4 in the presence of Pald was determined at 2.4 Å resolution to reveal Nε-ethyl-Lys53 juxtaposed with a sulfate ligand bound in the phosphate site. The position of the C(2) of the N-ethyl group in this structure is consistent with the hypothesis that the cap domain Nε-ethylenamine-Lys53 functions as a general base in the hydrolysis of the aspartylphosphate bi-covalent enzyme intermediate.
Because the enzyme residues proposed to play a key role in P-C bond cleavage are localized on the cap domain, this domain appears to have evolved to support the diversification of the HAD phosphatase core domain for catalysis of hydrolytic P-C bond cleavage.
Schiff-base; phosphoryl transfer; phosphoenzyme; phosphoaspartate; structural enzymology; HAD superfamily; general acid catalysis; general base catalysis; enamine; phosphate ester hydrolysis; phosphonatase; phosphonate; cap domain; core domain; electrophilic catalysis
The Arabidopsis thaliana genome encodes a homologue of the full-length bacteriophage T7 gp4 protein, which is also homologous to the eukaryotic Twinkle protein. While the phage protein has both DNA primase and DNA helicase activities, in animal cells Twinkle is localized to mitochondria and has only DNA helicase activity due to sequence changes in the DNA primase domain. However, Arabidopsis and other plant Twinkle homologues retain sequence homology for both functional domains of the phage protein. The Arabidopsis Twinkle homologue has been shown by others to be dual targeted to mitochondria and chloroplasts.
To determine the functional activity of the Arabidopsis protein we obtained the gene for the full-length Arabidopsis protein and expressed it in bacteria. The purified protein was shown to have both DNA primase and DNA helicase activities. Western blot and qRT-PCR analysis indicated that the Arabidopsis gene is expressed most abundantly in young leaves and shoot apex tissue, as expected if this protein plays a role in organelle DNA replication. This expression is closely correlated with the expression of organelle-localized DNA polymerase in the same tissues. Homologues from other plant species show close similarity by phylogenetic analysis.
The results presented here indicate that the Arabidopsis phage T7 gp4/Twinkle homologue has both DNA primase and DNA helicase activities and may provide these functions for organelle DNA replication.
DNA primase; DNA helicase; bacteriophage T7 gp4; Twinkle; Organelle DNA replication
DNA synthesis during replication relies on RNA primers synthesised by the primase, a specialised DNA-dependent RNA polymerase that can initiate nucleic acid synthesis de novo. In archaeal and eukaryotic organisms, the primase is a heterodimeric enzyme resulting from the constitutive association of a small (PriS) and large (PriL) subunit. The ability of the primase to initiate synthesis of an RNA primer depends on a conserved Fe-S domain at the C-terminus of PriL (PriL-CTD). However, the critical role of the PriL-CTD in the catalytic mechanism of initiation is not understood.
Here we report the crystal structure of the yeast PriL-CTD at 1.55 Å resolution. The structure reveals that the PriL-CTD folds in two largely independent alpha-helical domains joined at their interface by a [4Fe-4S] cluster. The larger N-terminal domain represents the most conserved portion of the PriL-CTD, whereas the smaller C-terminal domain is largely absent in archaeal PriL. Unexpectedly, the N-terminal domain reveals a striking structural similarity with the active site region of the DNA photolyase/cryptochrome family of flavoproteins. The region of similarity includes PriL-CTD residues that are known to be essential for initiation of RNA primer synthesis by the primase.
Our study reports the first crystallographic model of the conserved Fe-S domain of the archaeal/eukaryotic primase. The structural comparison with a cryptochrome protein bound to flavin adenine dinucleotide and single-stranded DNA provides important insight into the mechanism of RNA primer synthesis by the primase.
Eukaryotic Nucleo-Cytoplasmic Large DNA Viruses (NCLDV) encode most if not all of the enzymes involved in their DNA replication. It has been inferred that genes for these enzymes were already present in the last common ancestor of the NCLDV. However, the details of the evolution of these genes that bear on the complexity of the putative ancestral NCLDV and on the evolutionary relationships between viruses and their hosts are not well understood.
Phylogenetic analysis of the ATP-dependent and NAD-dependent DNA ligases encoded by the NCLDV reveals an unexpectedly complex evolutionary history. The NAD-dependent ligases are encoded only by a minority of NCLDV (including mimiviruses, some iridoviruses and entomopoxviruses), but phylogenetic analysis clearly indicated that all viral NAD-dependent ligases are monophyletic. Combined with the topology of the NCLDV tree derived by consensus of trees for universally conserved genes, this finding suggests that this enzyme was represented in the ancestral NCLDV. Phylogenetic analysis of ATP-dependent ligases that are encoded by chordopoxviruses, most of the phycodnaviruses and Marseillevirus failed to demonstrate monophyly and instead revealed an unexpectedly complex evolutionary trajectory. The ligases of the majority of phycodnaviruses and Marseillevirus seem to have evolved from bacteriophage or bacterial homologs; the ligase of one phycodnavirus, Emiliania huxleyi virus, belongs to the eukaryotic DNA ligase I branch; and ligases of chordopoxviruses unequivocally cluster with eukaryotic DNA ligase III.
Examination of phyletic patterns and phylogenetic analysis of DNA ligases of the NCLDV suggest that the common ancestor of the extant NCLDV encoded an NAD-dependent ligase that most likely was acquired from a bacteriophage at the early stages of evolution of eukaryotes. By contrast, ATP-dependent ligases from different prokaryotic and eukaryotic sources displaced the ancestral NAD-dependent ligase at different stages of subsequent evolution. These findings emphasize complex routes of viral evolution that become apparent through detailed phylogenomic analysis but not necessarily in reconstructions based on phyletic patterns of genes.
This article was reviewed by: Patrick Forterre, George V. Shpakovski, and Igor B. Zhulin.