Pif1s and RecDs are all superfamily Ib helicases that share seven conserved motifs (I, Ia, II, III, IV, V, and VI) common to SFI enzymes, as well as three additional motifs (A, B, and C) found only in Pif1 and RecD family helicases (see ). Only Pif1-like helicases, however, contain the Pif1 family signature sequence (DKLeXvARaiRkqXkPFGGIQ), a degenerate sequence located between motifs II and III, the function of which is currently unknown (Bochman et al., 2010). Aside from CLUSTAL alignments of protein sequences to determine phylogenies, one way to differentiate between Pif1 and RecD helicases is with the signature sequence. Using the sequence above as a Basic Local Alignment Search Tool (BLAST) query of the NCBI database returns only Pif1 homologues, although the more distantly related a Pif1 family helicase is to the canonical ScPif1, the more degenerate its signature sequence becomes (compare BaPif1 and BbPif1 in ). Careful analysis using CLUSTAL alignments nevertheless reveals that prokaryotic Pif1 enzymes are generally more closely related to eukaryotic Pif1 helicases than they are to any of the various RecD subgroups ().
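As a rough illustration of how the signature sequence can serve as a discriminator, the following minimal Python sketch slides the 21-residue signature along a protein sequence and reports the best-matching window. It is not a substitute for a BLAST search. The function names and the scoring rule (the fraction of non-X signature positions matched, treating uppercase and lowercase residues alike) are assumptions made here for illustration, and the toy protein in the usage lines is invented.

```python
# Pif1 family signature sequence as given in the text; X marks any residue.
SIGNATURE = "DKLeXvARaiRkqXkPFGGIQ"


def signature_score(window: str) -> float:
    """Return the fraction of non-X signature positions matched by `window`.

    Assumption: uppercase and lowercase signature residues are weighted
    equally; a real analysis would likely use a substitution matrix.
    """
    if len(window) != len(SIGNATURE):
        raise ValueError("window must be as long as the signature")
    # Pair each signature residue with the window residue, skipping X positions.
    scored = [(s.upper(), w) for s, w in zip(SIGNATURE, window.upper()) if s != "X"]
    matches = sum(1 for s, w in scored if s == w)
    return matches / len(scored)


def best_window(protein: str):
    """Return (start, score) of the best-scoring window, or None if too short."""
    n = len(SIGNATURE)
    if len(protein) < n:
        return None
    starts = range(len(protein) - n + 1)
    # max() keeps the first (leftmost) window when scores tie.
    best = max(starts, key=lambda i: signature_score(protein[i:i + n]))
    return best, signature_score(protein[best:best + n])


if __name__ == "__main__":
    # Invented toy protein with a perfect signature match embedded at offset 3.
    toy = "MMM" + "DKLEAVARAIRKQAKPFGGIQ" + "TTT"
    print(best_window(toy))
```

Because the signature becomes more degenerate in distantly related Pif1 helicases, any score threshold used to call a protein "Pif1-like" would have to be chosen empirically; none is proposed here.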
Why do prokaryotes need Pif1 family helicases?
We hypothesize that Pif1 family helicases are important for resolving common issues that arise during DNA metabolism rather than performing eukaryotic-specific functions. Thus, based on what we know about eukaryotic Pif1 helicases, prokaryotic Pif1s may serve multiple functions, including maintaining prokaryotic telomeres (in bacteria with long linear chromosomes), resolving DNA and DNA–RNA secondary structures, and complementing the lack of other helicases in vivo. Each of these possibilities is discussed separately.
Prokaryotic telomere maintenance
Whereas Escherichia coli has a circular chromosome, many bacteria have linear chromosomes, plasmids, and/or phages that replicate without a circular intermediate (reviewed in Stewart et al., 2004, and Casjens and Huang, 2008). Well-known examples include Borrelia burgdorferi (the causative agent of Lyme disease), many Streptomyces species (sources of various antibiotics), and the φ29 Bacillus subtilis bacteriophage. To solve the end replication problem, at least two different types of prokaryotic telomeres have evolved: 1) those with ends protected by a protein that is covalently linked to the terminal nucleotide (e.g., Streptomyces and φ29), and 2) those with covalently closed terminal hairpins (e.g., Borrelia spp.). In eukaryotes, various Pif1 family helicases are known or hypothesized to associate with telomeres. For instance, ScPif1 is a catalytic inhibitor of telomerase, ScRrm3 is important for replication through telomeric DNA, and mammalian Pif1 helicases interact with telomerase in vivo (Bochman et al., 2010). Although the structures and sequences of prokaryotic telomeres are unrelated to the well-known eukaryotic telomerase-generated telomeres, Pif1 family helicases may nevertheless be involved in prokaryotic telomere maintenance.
The ends of linear chromosomes and plasmids are inherently recombinogenic (Stewart et al., 2004; Casjens and Huang, 2008). In some instances, such as generating antigen variability in Borrelia spp., this recombination is beneficial, and bacterial Pif1 helicases that resolve recombination intermediates may promote such processes. Indeed, ScPif1 promotes recombination in the linear mtDNA of S. cerevisiae (Foury and Dyck, 1985). Alternatively, prokaryotic Pif1 helicases may possess a nucleoprotein disruption activity similar to that hypothesized for ScRrm3 (Ivessa et al., 2003; Azvolinsky et al., 2009) that is used to displace DNA end-binding proteins during the replication of linear chromosomes and plasmids. Such an activity could be important for the complete replication of DNA ends.
More broadly, it is widely accepted that the genomic DNA in eukaryotic organelles, such as mitochondria, has a prokaryotic origin. It was long overlooked and often disputed, however, that some eukaryotic mtDNA molecules, including S. cerevisiae mtDNA, are linear (Nosek et al., 1998), perhaps reflecting an origin from a prokaryote species with a linear genome. Because Pif1 family helicases are important for mtDNA maintenance (Bochman et al., 2010), it is tempting to speculate that the first eukaryotic Pif1 helicase may have had a mitochondrial (and hence prokaryotic) origin, with nuclear isoforms evolving later. Indeed, there are several examples of mitochondrial proteins with putative prokaryotic origins (e.g., mitochondrial ribosomal proteins; Henze and Martin, 2001) that are encoded in the nuclear genome.
DNA/RNA secondary structure resolution
Eukaryotic Pif1 helicases have recently been implicated in the processing of G4 structures (Ribeyre et al., 2009; Sanders, 2010; Paeschke et al., 2011). A G4 structure is a four-stranded RNA or DNA secondary structure that is held together by Hoogsteen G-G base pairing. Human Pif1 and ScPif1 both bind and efficiently unwind G4 structures in vitro (Ribeyre et al., 2009; Sanders, 2010). Guanine-rich sequences with the potential to form G4 structures are enriched in telomeric DNA and rDNA, and are also found at transcriptional regulatory regions and preferred meiotic double-stranded break sites (Neidle and Balasubramanian, 2006; Capra et al., 2010). Thus G4 sequences are likely to have roles in diverse cellular functions, such as telomere maintenance (Paeschke et al., 2008) and transcriptional regulation (Rawal et al., 2006; Huppert et al., 2008). Moreover, a recent analysis of 18 prokaryotic genomes revealed that the association of G4 motifs with transcriptional regulatory regions is not limited to eukaryotes, as evidenced by conserved G4 motifs in the promoters of gene orthologues in distantly related bacteria (Rawal et al., 2006). Although helicases such as E. coli RecQ are also known to unwind G4 structures (Wu and Maizels, 2001), not all bacterial genomes encode RecQ homologues (). It is therefore plausible that prokaryotes that lack RecQ helicases may instead encode Pif1 helicases to resolve G4 structures. Furthermore, as might be expected, bacteria with GC-rich genomes are predicted to contain more G4-forming sequences than are organisms with lower GC content (Rawal et al., 2006). It is therefore intriguing that Pif1 family helicases are found throughout the Bifidobacteriales (a large order of Actinobacteria) and the phylum Bacteroidetes (which includes human gut commensals), the genomes of which are particularly GC-rich.
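To make the notion of a predicted G4-forming sequence concrete, the following minimal Python sketch scans both strands of a DNA sequence for candidate motifs. The pattern used (four runs of at least three guanines separated by loops of one to seven nucleotides) is a commonly used heuristic and an assumption of this example; it is not necessarily the definition applied in the genome analyses cited above. The function names are likewise hypothetical.

```python
import re

# Heuristic G4 motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (an assumption, see text).
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4_motifs(seq: str):
    """Return (strand, start, motif) for each non-overlapping candidate motif.

    `start` is the 0-based position of the leftmost base of the hit on the
    forward strand; `motif` is the G-rich sequence read 5'-3' on its own strand.
    """
    seq = seq.upper()
    hits = [("+", m.start(), m.group()) for m in G4_PATTERN.finditer(seq)]
    # Minus-strand motifs appear as G-rich runs in the reverse complement;
    # convert their coordinates back to the forward strand.
    rc = reverse_complement(seq)
    hits += [("-", len(seq) - m.end(), m.group()) for m in G4_PATTERN.finditer(rc)]
    return hits


if __name__ == "__main__":
    # Four copies of the TTAGGG repeat typical of vertebrate telomeres.
    print(find_g4_motifs("TTAGGGTTAGGGTTAGGGTTAGGGTT"))
```

A regular-expression match only indicates the potential to form a G4 structure; whether a given sequence folds in vivo, or is a substrate for a Pif1 helicase, has to be established experimentally.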
Previously, it was shown that ScPif1 preferentially unwinds DNA–RNA hybrids relative to DNA–DNA substrates (Boule and Zakian, 2007), much like the E. coli UvrD helicase (Matson, 1989). It is hypothesized that ScPif1 inhibits telomerase activity by unwinding the DNA–RNA hybrid formed between telomeric DNA and the telomerase RNA template (Boule et al., 2005). Although the prokaryotic telomeres just discussed are not replicated by a conventional telomerase-based mechanism, perhaps Pif1 helicases function in bacteria to resolve DNA–RNA hybrids (i.e., R-loops) that form during transcription. Because R-loop accumulation leads to genomic instability in organisms from E. coli to mammals, resolution of R-loops is important for maintaining genome integrity (Li and Manley, 2006).
Complementing the lack of other helicases
Most genomes encode multiple distinct helicases that have specialized functions in DNA and RNA metabolism. For instance, there are 134 open reading frames in the S. cerevisiae genome that encode proteins containing helicase structural motifs (Shiratori et al., 1999). The E. coli genome also encodes a variety of predicted helicases, with more than a dozen such enzymes verified by direct biochemical assays. A universal collection of helicase types that is required for viability, however, has not been identified in bacteria or eukaryotes, likely due to functional redundancy in vivo. Thus, although E. coli encodes one member each of the DinG, RecD, RecQ, Rep, and UvrD helicase families (), other bacteria contain multiple homologues of these helicases, are devoid of them altogether, and/or possess other types of helicases (e.g., PcrA and Pif1) that may complement the lack of one or more of these enzymes. In E. coli, DinG is a 5′-3′ helicase with a putative role in homologous recombination (HR), RecD is the 5′-3′ member of the RecBCD complex (which is also involved in HR), RecQ (3′-5′) is involved in HR and double-stranded break repair, Rep is a 3′-5′ accessory replicative helicase needed for timely DNA replication, and UvrD is a 3′-5′ repair helicase that can remove RecA from single-stranded DNA (see Wu and Maizels, 2001; Montague et al., 2009; Boubakri et al., 2010; and references therein).
Given their sequence similarity to RecD enzymes, one might predict that Pif1 family helicases are found in the subset of bacterial species that lack a RecD homologue. Indeed, this is the case in species such as Rhodomicrobium vannielii and Gardnerella vaginalis, which lack RecD. Other bacterial species, however, encode one or more homologues of both Pif1 and RecD (e.g., Psychrobacter sp. PRwf-1 and Desulfobacterium autotrophicum; ). Additionally, across the diverse prokaryotic phyla (Ciccarelli et al., 2006), many prokaryotes that encode Pif1 family helicases appear to lack both RecQ and DinG helicase homologues. As discussed above, Pif1 helicases may function in place of RecQs to resolve G4 DNA and/or to remove R-loops, as DinG does in E. coli (Boubakri et al., 2010).
Additionally, DinG acts in concert with either Rep or UvrD in E. coli to remove RNA polymerase from the path of replication forks when replication and transcription complexes collide (Boubakri et al., 2010). This activity is reminiscent of the nucleoprotein disruption activity of ScRrm3 (Ivessa et al., 2003), and the increased replication fork pausing seen at inverted ribosomal operons in E. coli dinG rep mutants is similar to the fork pausing observed at rDNA repeats in rrm3Δ S. cerevisiae (Ivessa et al., 2000). Therefore, because eukaryotic Pif1 helicases have functions similar to those observed for the E. coli DinG, UvrD, and Rep helicases, prokaryotic Pif1 helicases may complement the lack of one or more of these enzymes in bacteria that do not encode DinG, UvrD, and/or Rep homologues.