The tad (tight adherence) locus encodes a protein translocation system that produces a novel variant of type IV pili. The pilus assembly protein TadZ (called CpaE in Caulobacter crescentus) is ubiquitous in tad loci, but is absent in other type IV pilus biogenesis systems. The crystal structure of TadZ from Eubacterium rectale (ErTadZ), in complex with ATP and Mg2+, was determined to 2.1 Å resolution. ErTadZ contains an atypical ATPase domain with a variant of a deviant Walker-A motif that retains ATP binding capacity while displaying only low intrinsic ATPase activity. The bound ATP plays an important role in dimerization of ErTadZ. The N-terminal atypical receiver domain resembles the canonical receiver domain of response regulators, but has a degenerate, stripped-down “active site”. Homology modeling of the N-terminal atypical receiver domain of CpaE indicates that it has a conserved protein-protein binding surface similar to that of the polar localization module of the social motility protein FrzS, suggesting a similar function. Our structural results also suggest that TadZ localizes to the pole through the atypical receiver domain during an early stage of pili biogenesis, and functions as a hub for recruiting other pili components, thus providing insights into the Tad pilus assembly process.
Pili (or fimbriae) consist of filaments of pilin subunits that form hair-like appendages on the surface of many bacteria (Proft et al., 2009). They play important roles in processes such as adhesion to other bacteria (especially in the context of bacterial conjugation), bacteriophage adsorption, motility, and biofilm formation. The tad (tight adherence) locus is a widespread colonization island that is found in both Gram-negative and Gram-positive bacteria, as well as the archaea (Tomich et al., 2007). In Aggregatibacter actinomycetemcomitans, the tad locus encodes a protein translocation system for the assembly of Flp pili. These pili of A. actinomycetemcomitans mediate strong, nonspecific adherence to solid surfaces (such as teeth), and are important in colonization and pathogenesis (Schreiner et al., 2003; Planet et al., 2003; Kachlany et al., 2001a; Bhattacharjee et al., 2001; Kachlany et al., 2000). Similar loci have also been characterized in Caulobacter crescentus (Skerker et al., 2000; Viollier et al., 2002b; Viollier et al., 2002a), Haemophilus ducreyi (Nika et al., 2002) and Pseudomonas aeruginosa (de Bentzmann et al., 2006). The A. actinomycetemcomitans tad locus consists of 14 genes, flp-1-flp-2-tadV-rcpCAB-tadZABCDEFG (Fig. 1A). The tad operon in C. crescentus consists of at least 7 genes, including pilA and cpaABCDEF. In Gram-negative bacteria, the Tad pilus spans the periplasmic region, possibly with the help of proteins TadD, TadE, TadF and TadG, and exits through the secretion channel in the outer membrane formed by the secretin RcpA (CpaC) and the associated membrane proteins RcpB and RcpC (Tomich et al., 2007; Chen et al., 2005). The cytoplasmic ATPase TadA (Bhattacharjee et al., 2001) may facilitate the assembly of the pilin subunits into the pilus filament. TadV is a peptidase that cleaves the prepilin subunits prior to assembly into a pilus (and also processes the pseudopilins TadE and TadF).
The Tad secretion system is evolutionarily related to the type II secretion systems (T2SSs) (Peabody et al., 2003; Tomich et al., 2007); however, several components are unique (e.g. TadZ, TadE, and TadF) and not found in other T2SSs. Thus, the tad-encoded Flp pili are classified as a subfamily of type IVb pili (Kachlany et al., 2001b). Due to the absence of an outer membrane, the tad loci of Gram-positive bacteria are shorter, usually consisting of tadZABC, and lack the components responsible for outer membrane secretion.
The gene tadZ is present in all tad loci, and TadZ is an essential component of the Tad secretion system. Studies of TadZ of A. actinomycetemcomitans (AaTadZ) and the homologous CpaE of C. crescentus (referred to as CcTadZ hereafter) indicated that they likely function as localization factors (Viollier et al., 2002b; Viollier et al., 2002a; Skerker and Shapiro, 2000; Christen et al., 2010) (B.A. Perez-Cheeks, unpublished, submitted as a companion paper). In C. crescentus, CcTadZ mediates polar positioning of the pilus secretin protein CpaC, as well as the histidine kinase PleC. However, the molecular mechanism of TadZ function in the pilus assembly process remains unclear. Here, we report the crystal structure of a TadZ protein from E. rectale (ErTadZ) as part of our ongoing structural genomics effort targeting novel protein structures from the human microbiome (Elsliger et al., 2010; Xu et al., 2010b; Xu et al., 2010a). E. rectale is an anaerobic Gram-positive bacterium that is prevalent in the human colon. It is one of the major bacterial producers of butyrate, a short-chain fatty acid that is the preferred energy source for colonocytes, and thus is important for maintaining colon health (Duncan et al., 2008). The ErTadZ structure in complex with ATP and a magnesium ion exhibits a unique molecular architecture combining components from the bacterial cytoskeleton (MinD/ParA/Soj ATPases) and two-component signal transduction (response regulators). However, both domains have lost the canonical active sites. The ATPase domain, with a degenerate Walker-A motif, mediates dimerization in the presence of ATP while exhibiting only low intrinsic ATPase activity. The atypical receiver domain of CcTadZ is structurally similar to that of the social motility protein FrzS, which is involved in polar localization to the leading pole (Fraser et al., 2007; Mignot et al., 2007), indicating a conserved role for these modules in the biogenesis of type IV pili.
Thus, we now provide a structural basis for understanding the role of TadZ in the pili assembly process and in polar localization.
Sequence similarities between ErTadZ and CcTadZ or AaTadZ are weak using direct amino–acid sequence comparison (sequence identity <10%). However, the conservation of tad locus genes as a single operon (Tomich et al., 2007) provides reliable evidence for the functional assignment of ErTadZ. The E. rectale genome contains the typical tad operon found in Gram-positive bacteria (Fig. 1A). The tadZ gene is always located immediately upstream of the ATPase gene tadA. In addition to conserved proteins, the E. rectale tad operon also contains two proteins of unknown function (EUBREC_1105 and EUBREC_1108, not shown). A large third protein in this operon (EUBREC_1107, 1082aa) contains a TadE-like region and a Colicin IA-like domain. ErTadZ is related by sequence to proteins in Ruminococcus bromii (gi 291541803), Oribacterium sinus (gi 227872973), Erysipelotrichaceae bacterium (gi 309777432), and Clostridium asparagiforme (gi 225389010) (seq id ~25%, Fig. S1). A search of distant homologs identifies a large number of ATPases, including a few CcTadZ-like proteins, indicating a remote evolutionary relationship between ErTadZ and CcTadZ.
We investigated the domains of the TadZ family of proteins using separate profiles for the receiver domain (PF00072; RD hereafter) and the NifH/FrxC family ATPase Pfam domain (PF00142) (Bateman et al., 2004). Of the top 250 hits of CcTadZ homologs identified by a BLAST search, 242 and 233 proteins were found to contain an N-terminal RD or a C-terminal ATPase domain, respectively, suggesting a similar domain architecture, as subsequently revealed by the ErTadZ structure (see below). Further examination of the RD of these TadZ homologs indicates that they have lost critical residues corresponding to the phosphorylation site of canonical RDs (Bourret, 2010). The C-terminal ATPase domain of TadZ contains a sequence motif ([KR]GGxGs[ST]) that is a variant of the deviant Walker-A motif (KGGxGK[ST]), with the second invariant lysine replaced by a residue with a small side-chain (s) (see below).
Thus, ErTadZ is a member of the diverse TadZ family, and part of the E. rectale tad gene cluster. Most members of this family have the same two-domain architecture as ErTadZ and AaTadZ (Fig. 1B). CcTadZ contains an additional ~120aa proline-rich region, which is expected to be only partially ordered. TadZ from Streptomyces avermitilis is fused to a TadE-like tail, while the N-terminal RD is absent in TadZ from Chlorobium tepidum.
Full-length ErTadZ contains 354 residues with a molecular weight of 39.8 kDa and a calculated pI of 4.9. The crystal structure of ErTadZ was determined by the multiple-wavelength anomalous diffraction (MAD) phasing method using the high-throughput structural genomics pipeline implemented at the Joint Center for Structural Genomics (JCSG, http://www.jcsg.org) (Lesley et al., 2002; Elsliger et al., 2010). The selenomethionine (SeMet) derivative of ErTadZ was expressed in E. coli with an N-terminal His-tag and was purified by metal affinity chromatography. Three-wavelength MAD data were collected to 2.1 Å resolution at Stanford Synchrotron Radiation Lightsource (SSRL) beamline BL11-1. The data were indexed and processed in monoclinic space group C2 with unit cell dimensions of a=121.5 Å, b=79.5 Å, c=55.0 Å and β=96.2°. The asymmetric unit contains one protein molecule, with a solvent content of 60.3% (Vm=3.1). The final model was refined to an Rcryst of 19.0% and an Rfree of 21.8%. TLS parameters were refined with each domain defined as a rigid body unit (residues 1–114,115–354). The electron density map showed clear density for the full-length protein. The model of ErTadZ displays good geometry with an all-atom clash score of 7.1, and the Ramachandran plot produced by MolProbity (Davis et al., 2004) shows that all residues are in allowed regions, with 98.3 % in favored regions. The final model of ErTadZ contains all residues of the full-length protein, one ATP, one magnesium ion (Mg2+), one chloride ion, one sulfate ion, two glycerol molecules, two polyethylene glycol fragments and 140 water molecules. The side-chains of nine residues (Met1, Lys12, Glu13, Arg17, Lys46, Arg49, Arg189, Arg300 and Lys312) on the protein surface, and the purification tag are disordered and are absent from the final model. The data processing and refinement statistics are summarized in Table 1.
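The packing figures quoted above (Vm and solvent content) can be cross-checked from the unit cell. The following is a minimal sketch, not the JCSG pipeline: it assumes four asymmetric units per cell for space group C2, uses the 42.46 kDa mass of the tagged SeMet protein reported by LC-MS later in the text, and applies the commonly used 1.23/Vm approximation for the protein volume fraction. The function names are hypothetical.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume (cubic angstroms) for a monoclinic cell: V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(volume, n_asu, mw_da, copies_per_asu=1):
    """Matthews coefficient Vm (cubic angstroms per dalton)."""
    return volume / (n_asu * copies_per_asu * mw_da)

def solvent_fraction(vm, protein_factor=1.23):
    """Approximate solvent fraction, assuming the usual 1.23/Vm protein fraction."""
    return 1.0 - protein_factor / vm

# Cell dimensions from the text; C2 is assumed to contain 4 asymmetric units.
volume = monoclinic_cell_volume(121.5, 79.5, 55.0, 96.2)
# Assumption: the crystallized species is the tagged SeMet protein (42.46 kDa by LC-MS).
vm = matthews_coefficient(volume, n_asu=4, mw_da=42460.0)
solvent = solvent_fraction(vm)

print(f"V = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da, solvent = {100 * solvent:.1f}%")
```

Under these assumptions the result agrees closely with the reported Vm of 3.1 and solvent content of about 60%; using the untagged mass of 39.8 kDa instead would give a noticeably larger Vm.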
The structure of ErTadZ consists of two domains (Fig. 2A). The N-terminal domain (residues 1–114) is structurally similar to the RD of response regulators. The C-terminal domain (residues 125–354) shows structural homology to a class of ATPases that are involved in diverse activities such as cell division (MinD), DNA segregation (ParA/Soj), and nitrogen fixation (NifH). An ATP molecule and a magnesium ion are located in the canonical ATP binding site. We refer to these domains as the atypical receiver domain (ARD) and the atypical ATPase domain (AAD), since they have lost the catalytic residues of canonical domains (see below).
The ARD is connected to the AAD through a short linker (residues 115–124). This linker and the C-terminal end of the last helix of the ARD (residues 108–114) interact with the backside of the AAD (distal to the ATP binding site) and together bury 1384 Å2 of total surface area. ErTadZ dimerizes via the AAD domain through the crystallographic two-fold axis, forming a V-shaped molecule with molecular dimensions of ~106 Å × 75 Å × 53 Å (Fig. 2B). The ARDs are located as appendages emanating from the central dimer core with a median intermolecular distance of ~74 Å.
The AAD domain of ErTadZ is structurally most similar to a truncated CT0433 (residues 36–277) of the Gram-negative green sulfur bacterium C. tepidum with an rmsd of 3.0 Å for 221 equivalent Cα atoms and 18% sequence identity (PDB code 3ea0, the Midwest Structural Genomics Center, unpublished, Dali Z=19.8) (Fig. S2). CT0433 is located in a conserved tad locus and, thus, is also a TadZ homolog (referred to as CtTadZ hereafter). AAD of ErTadZ is also highly similar to MinD of Pyrococcus furiosus (PDB code 1g3r, Z=19.4) (Hayashi et al., 2001) and the chromosome segregation protein Soj from Thermus thermophilus (PDB code 2bek, Z=17.9) (Leonard et al., 2005). Compared to the AAD of ErTadZ, the AAD of CtTadZ is phylogenetically closer to CcTadZ, AaTadZ, MinD and Soj with higher sequence and structural similarities [e.g. Z=26.5 for 1g3r (MinD) and Z=24.3 for 2bek (Soj)]. However, ErTadZ does not possess a C-terminal “amphipathic helix” present in MinD. AaTadZ, on the other hand, has a slightly longer sequence that could accommodate a C-terminal “amphipathic helix” (Fig. S2).
ErTadZ contains a bound ATP molecule, supported by well-defined electron density maps and detection of ATP in the purified ErTadZ sample, used for crystallization, by MALDI mass spectrometry (Fig. 3). A magnesium ion interacts with β- and γ- phosphates of ATP, Ser139 Oγ1, Glu162 Oε2 and two waters with a near perfect hexacoordinate geometry (Fig. 4A). ATP and Mg2+ were not added during ErTadZ purification or crystallization stages and, thus, were co-purified from the expression host. ATP and Mg2+ were also detected in the crystal structure of CtTadZ; however, the location of the magnesium ion and the γ-phosphate of ATP are altered (Fig. 4B). The conformation and location of the ATP and Mg2+ in ErTadZ are essentially identical to those of the Soj ATPase mutant D44A structure (PDB code 2bek) (Leonard et al., 2005) (Fig. 4C). MinD/Soj-like ATPases contain a deviant Walker-A (P-loop) motif (xKGGxGK[ST], compared to the canonical GxxGxGK[ST]) with two lysines, one near the carboxyl end of the motif that is common to all Walker-A motifs, and a second at the beginning of the motif that is unique to this subgroup of ATPases (Koonin, 1993; Lutkenhaus et al., 2003). The Walker-A motif of ErTadZ is further degraded (132PCGGVGTS139) with the loss of both lysines (Fig. 4D). The P-loop of CtTadZ contains an alanine at the second lysine position, but has the first conserved lysine.
The loss of the second lysine in the Walker-A motif is also observed in other TadZ homologs (sequence motif s[KR]GGsGs[ST], where s denotes a residue with a small side chain, A/S/T/V; Fig. 4D), suggesting that this is a common property of the TadZ family. Nevertheless, the P-loop regions containing the degraded Walker-A motif are still highly conserved (Figs. S1 and S2), indicating that the TadZ proteins have likely retained the ATP binding capability, as supported by the ErTadZ and CtTadZ structures.
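The three P-loop patterns discussed above can be written as regular expressions to make the differences concrete. This is an illustrative sketch only: the pattern strings are transcribed from the motifs in the text, while the dictionary keys, the helper `classify_p_loop`, and the two test peptides other than the ErTadZ segment are hypothetical and were not part of the original analysis.

```python
import re

SMALL = "ASTV"  # residues with small side chains, as defined in the text

MOTIFS = {
    # canonical Walker-A: GxxGxGK[ST]
    "canonical": re.compile(r"G..G.GK[ST]"),
    # deviant Walker-A (MinD/Soj-like): KGGxGK[ST]
    "deviant": re.compile(r"KGG.GK[ST]"),
    # TadZ family variant: s[KR]GGsGs[ST]
    "tadz_variant": re.compile(rf"[{SMALL}][KR]GG[{SMALL}]G[{SMALL}][ST]"),
}

def classify_p_loop(segment):
    """Return the names of all motif patterns found in a peptide segment."""
    return [name for name, pattern in MOTIFS.items() if pattern.search(segment)]

# ErTadZ residues 132-139 from the text; the other two peptides are made-up examples.
for peptide in ("PCGGVGTS", "KGGVGKT", "AKGGVGAT"):
    print(peptide, classify_p_loop(peptide))
```

The ErTadZ segment matches none of the three patterns, which is consistent with the description of its P-loop as further degraded, with both lysines lost.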
To probe whether the ATP binding site has catalytic activity, we analyzed the nucleotide composition of the ErTadZ sample used for crystallization by MALDI mass spectrometry. In addition to ATP (m/z 508.03), smaller amounts of ADP (m/z 428.5) were identified in the ErTadZ sample (Fig. 3B, see Experimental procedures), suggesting that ATP bound to ErTadZ is either slowly hydrolyzed or ADP co-purified with ErTadZ. Further biochemical analysis using radiolabeled ATP and nucleotide separation by TLC-chromatography revealed that ErTadZ exhibits a low intrinsic ATP hydrolysis rate under physiological buffer conditions, but lost ATPase activity under the buffer conditions used for crystallization (Fig. 5). In addition, we found that an ErTadZE162A mutation that substitutes the Mg2+-chelating glutamate residue in the ATP binding pocket with an alanine also lost ATPase activity. These results suggest that ErTadZ exhibits a low intrinsic ATPase activity.
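As a rough cross-check of the nucleotide assignments, the expected m/z values can be computed from the elemental compositions of the free acids. This sketch assumes singly protonated [M+H]+ ions; the ionization mode is not stated in this section, so that is an assumption, and the helper names are hypothetical.

```python
# Monoisotopic masses (Da) of the relevant elements and of the proton.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737620,
}
PROTON = 1.00727646

def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a composition given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())

def mz_protonated(formula, charge=1):
    """m/z of an [M + nH]n+ ion (assumed positive-ion mode)."""
    return (monoisotopic_mass(formula) + charge * PROTON) / charge

ATP = {"C": 10, "H": 16, "N": 5, "O": 13, "P": 3}  # free acid, C10H16N5O13P3
ADP = {"C": 10, "H": 15, "N": 5, "O": 10, "P": 2}  # free acid, C10H15N5O10P2

atp_mz = mz_protonated(ATP)
adp_mz = mz_protonated(ADP)
print(f"ATP [M+H]+ = {atp_mz:.3f}, ADP [M+H]+ = {adp_mz:.3f}")
```

The calculated values can then be compared with the observed peaks in Fig. 3B, keeping in mind the mass accuracy of the instrument.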
It is generally believed that the second lysine in the Walker-A motif is required for ATP hydrolysis. For example, mutations of the second conserved lysine in MinD cause it to lose activity (de Boer et al., 1991; Hayashi et al., 2001; Hu et al., 2002). Our results indicate that ErTadZ has a low intrinsic ATPase activity at 35°C under physiologically relevant buffer conditions, despite the absence of this highly conserved lysine. It remains possible, but highly speculative at present, that another protein can further stimulate the ATPase activity of ErTadZ in vivo. Thus, since the crystallization buffer completely abolishes the ATPase activity, the crystal structure may represent a snapshot of ErTadZ locked in the ATP-bound state.
Liquid chromatography–mass spectrometry (LCMS) indicated that the molecular weight of an ErTadZ monomer was 42.46 kDa, which corresponds to the SeMet protein containing the purification tag. Analytical size exclusion chromatography estimated the molecular weight of the oligomer as 72.1 kDa (equivalent to 1.7 monomers, Fig. S3). A more accurate molar mass, estimated from static light scattering (SLS) averaged across the majority of the peak, was 85.95 kDa as calculated by the ASTRA software (Wyatt Technology), giving an oligomer number of 2.02 and indicating that a dimer is the dominant species in solution (Fig. 6A).
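The oligomer numbers quoted above follow directly from dividing each measured mass by the LC-MS monomer mass. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def oligomer_number(observed_kda, monomer_kda):
    """Ratio of the measured oligomer mass to the monomer mass."""
    return observed_kda / monomer_kda

MONOMER_KDA = 42.46  # tagged SeMet ErTadZ monomer, from LC-MS

sec_estimate = oligomer_number(72.1, MONOMER_KDA)   # analytical size exclusion
sls_estimate = oligomer_number(85.95, MONOMER_KDA)  # static light scattering

print(f"SEC: {sec_estimate:.2f} monomers, SLS: {sls_estimate:.2f} monomers")
```

Both estimates round to the values given in the text (1.7 and 2.02), with the light scattering value lying closer to an integer dimer.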
The dimer interface involves a relatively flat surface encompassing the bound ATP (Fig. 6B) and buries a surface area of ~2630 Å2 per monomer. The dimerization brings the ATPs and the glycine-rich P-loops of the two protomers together. The two equivalent helices (residues 138–152) following the P-loop align collinearly in a head-to-head arrangement, with the phosphoryl groups of the ATP molecules sandwiched in between. Similar dimeric assemblies are also conserved in CtTadZ and Soj (Fig. S4). In ErTadZ and CtTadZ, a helix (residues 274–293 in ErTadZ; Fig 6B) makes more significant contribution to the dimer interface, compared to Soj (Fig. 6C).
The first conserved lysine of the deviant Walker-A motif promotes dimer formation upon ATP binding, through interactions with the phosphoryl groups of the neighboring protomer (Leonard et al., 2005). Interestingly, Lys279 in ErTadZ, located on an α-helix adjacent to the P-loop, interacts with the β- and γ-phosphate of the neighbor ATP in the same manner, thus fulfilling a similar role as the first lysine in Walker-A motif (Fig. 6C). ATP plays an important role in the dimerization of ErTadZ by contributing 40% of the interaction surface. The ATP-mediated dimerization in CtTadZ is more similar to canonical ATPases as it interacts with ATP through the first conserved lysine of the deviant Walker-A motif. Moreover, this residue is also highly conserved (or substituted by an arginine) in other TadZ proteins (Fig. 4D). Thus, it is likely that TadZ proteins have preserved the function of utilizing ATP as a “molecular switch” in promoting dimer formation (Leonard et al., 2005).
The ARD is structurally similar to the RD of response regulators, such as the prototypical chemotaxis protein CheY (Stock et al., 1993). CheY functions as a molecular switch controlled by the phosphorylation state of Asp57. Several additional residues are conserved near the phosphorylation site, including Asp12, Asp13 and Lys109, which are needed either for Mg2+ coordination or for phosphorylation-induced conformational changes. CheY consists of a (βα)5 fold forming a three-layer sandwich with a hydrophobic β-sheet core protected on both sides by amphipathic helices. The structures of CheY and of the ARDs from ErTadZ, a recently solved TadZ homolog from Mesorhizobium loti (MlTadZ, PDB code 3snk, JCSG, unpublished), and the Myxococcus xanthus social motility protein FrzS (Fraser et al., 2007) are shown in Fig. 7A. The overall fold of the ARD of ErTadZ is similar to that of CheY, except that helix α3 is substituted by a 3₁₀ helix and α4 is absent. The “active site” of ErTadZ still partially resembles that of CheY, with several conserved residues, such as Asp9 and Lys92, suggesting its evolutionary origin from an RD. A glutamate (Glu56) replaces the aspartate at the site of phosphorylation, and is coupled with another change (D to K mutation) at one of the residues (Lys10) that chelate Mg2+ in canonical RDs. As a result of these changes, the environment in the “active site” of ErTadZ differs from that of CheY (Fig. 7B). In ErTadZ, the conserved Lys92 interacts directly with Asp9. MlTadZ and FrzS also have degraded “active sites”, but both preserve the interaction between the conserved lysine and an acidic residue in the “active site”. Thus, we conclude that the ARD of ErTadZ has lost its ability to bind Mg2+ or to undergo phosphorylation due to the dramatic changes in the canonical active site.
It is also interesting to note that the Asp to Glu substitution at the phosphorylation site mimics phosphorylation-based activation in canonical response regulators (Klose et al., 1993).
ARDs have been characterized in a number of proteins involved in different activities, such as bacteria CikA (Zhang et al., 2006; Gao et al., 2007), KaiA (Ye et al., 2004; Williams et al., 2002), plant pseudo-response regulators that control circadian rhythms (Mizuno et al., 2005), FrzS (Fraser et al., 2007), and Vibrio cholerae transcription regulator VpsT (Krasteva et al., 2010). Despite the structural similarities to canonical RDs, the sequence similarities among ARDs are very limited (Fig. S5). The activity of ARDs is phosphorylation-independent because they lack the invariant phospho-accepting Asp or other residues essential for phosphorylation (Makino et al., 2000; Fraser et al., 2007; Williams et al., 2002).
Phosphorylation of an RD induces conformational changes that are mapped to a region defined by α4β5α5 (the “output surface”), which triggers the switching of a conserved tyrosine or phenylalanine from a buried to an exposed conformation (Tyr106 in CheY), commonly referred to as the “Tyr/Phe switch” (Bourret, 2010). ARDs seem to have retained the ability to form protein-protein interactions using a similar interface, but in a phosphorylation-independent manner, as demonstrated in FrzS (Fraser et al., 2007). The same region may undergo structural modification through the attachment of additional structural elements, such as an additional helix in VpsT that allows the binding of c-di-GMP (Krasteva et al., 2010). The corresponding region of ErTadZ is not structurally conserved (α5 is replaced by a loop), compared to CheY and other ARDs. Furthermore, the ARD of ErTadZ does not contain the “Tyr/Phe switch” (the corresponding residue is Ala89). Nonetheless, a conserved surface among close homologs of ErTadZ (Asp9, Asp11, Tyr14, Leu18, Lys92, Tyr93, and Gln94) near the “active site” suggests that this site may be functionally important (Fig. S1C). Thus, ErTadZ contains a more divergent ARD, whereas sequence analysis indicated that the ARDs of AaTadZ and CcTadZ still contain the “Tyr/Phe switch” (Fig. S5).
Evidence from both structure and sequence of the TadZ family clearly suggest that they have evolved from fusion of RD and MinD-like ATPase domains, coupled with the loss of catalytic residues of their ancestors. The absence of canonical active sites in these proteins indicates that they are likely involved in mediating protein-protein interactions, consistent with their roles in pili assembly that involves the formation of multi-protein complexes. In order to better understand how the two domains of TadZ are utilized in the pili assembly process, we attempted to identify the potential protein docking sites on TadZ by identifying conserved residues on the protein surface. Clustering of conserved residues on the protein surface is often indicative of functionally important sites.
Sequences of the TadZ family are highly divergent, particularly in the ARD region. Most members of this family are more closely related to the last two domains of CcTadZ than to AaTadZ or ErTadZ. Consequently, we built a homology model for the more representative CcTadZ. The sequence of the ARD of CcTadZ (residues 125–251) was submitted to the I-TASSER server (Zhang, 2008), and the top-scoring model was selected for subsequent rounds of manual adjustment and energy minimization to reduce clashes. The resulting model is consistent with the properties of the RD architecture, which consists of conserved hydrophobic β-strands and amphipathic helices.
The ARD of CcTadZ is expected to adopt a (βα)5 fold, as for canonical RDs (Fig. S5). Similar to ErTadZ, the “phosphorylation site” of CcTadZ is also occupied by a glutamate (Glu182), indicating that it is also an ARD (Fig. 8A). More interestingly, the “output surface” of the CcTadZ ARD is highly conserved, featuring an arginine-tyrosine pair (Arg220 and Tyr230). Many of the residues on the “output surface” of CcTadZ are also conserved in the ARD of FrzS (Fig. 8B).
A model for the AAD of CcTadZ was also constructed in a similar fashion (Fig. 8C). Models of AAD and ARD of CcTadZ were then superimposed on the respective domains of the full length ErTadZ to generate a two-domain model for CcTadZ (residues 125–517, Fig. 8D). Conserved residues on the surface of AAD are all located on or near the dimer interface, indicating the functional importance of this interface (Fig. 8C). Several conserved arginines (Arg323, Arg331, and Arg421) are close to the domain interface between AAD and ARD.
Type IV pili are generally localized to a single pole of the cell (Wall et al., 1999). It is not well understood how this cell polarity of type IV pili is established. A model was proposed based on studies of C. crescentus (Lawler et al., 2006; Huitema et al., 2006; Lam et al., 2006), which divides asymmetrically to produce two daughter cells that are functionally and morphologically different: a swarmer cell containing a flagellum and pili at one pole and a stalked cell with a stalk at the opposite pole. TipN localizes to the middle of the predivisional cell and is inherited by both daughter cells at the new poles after cell division, and then serves as a “molecular beacon” for establishing the new pole. In the stalked cell, which can further divide, TipN serves as a hub for proper localization of the developmental regulator PleC, the polar organelle development protein PodJ (Viollier et al., 2002b), and the pilus assembly protein CcTadZ to the new pole (where new pili will be generated in the swarmer daughter cell). CcTadZ is localized to one pole in the swarmer cell. The accompanying paper by Perez-Cheeks et al. shows polar localization of TadZ in A. actinomycetemcomitans (B.A. Perez-Cheeks, unpublished, submitted as a companion paper). Thus, TadZ likely plays a critical localization role in pili biogenesis. The structure of TadZ from E. rectale reveals a novel protein architecture consisting of an atypical receiver domain (ARD) and an atypical ATPase domain (AAD), thus representing an interesting example of gene fusion.
ARDs were previously found to be involved in mediating polar protein localization. The polar localization of the social motility protein FrzS and the cyanobacterial circadian clock kinase CikA requires ARDs (Fraser et al., 2007; Zhang et al., 2006). The ARD of FimX of P. aeruginosa, a phosphodiesterase that governs twitching motility, is also essential for its localization to a single cell pole (Kazmierczak et al., 2006). These ARDs likely evolved from canonical polar targeting RDs. Several characterized canonical RDs are capable of localizing to the pole. For example, M. xanthus RomR localizes in a bipolar, asymmetric pattern, with a large cluster at the lagging pole, using its RD as a localization module (Leonardy et al., 2007). C. crescentus CpdR has a phosphorylation-dependent localization pattern and plays an important role in cell cycle control (Iniesta et al., 2006). Two other C. crescentus response regulators, DivK and PleD, also localize to the cell poles, and this localization is regulated by the phosphorylation states of their RDs (Chan et al., 2004; Lam et al., 2003).
Since ARDs evolved from RDs, it has been speculated that an ARD may interact with a histidine kinase (HK) by mimicking the His-Asp phosphotransfer complex. Analysis of the clustering of highly conserved surface residues reveals two potential protein-protein docking sites on the ARD of TadZ, which map to the canonical active site and the “output surface”, respectively. The putative binding surface of ErTadZ-ARD is near Lys92/Tyr93 (Fig. S1), corresponding to a location that is directly involved in the interaction between RD and HK in a two-component system, suggesting that ErTadZ may have retained the ability to bind an HK. Recognition of an HK by the ARD in TadZ raises the possibility of regulating pili biogenesis by a two-component signaling system. Computational analysis of CcTadZ and its close homologs suggests another protein-protein docking site that overlaps with the “output surface” of the canonical RD (see previous section), which is absent in ErTadZ. This site is close to, but does not overlap with, the conserved site in ErTadZ. More interestingly, we showed that the ARD of CcTadZ has an “output surface” similar to that of FrzS (Fig. 8B). This finding is surprising since no functional relationship between CcTadZ and FrzS was expected, even though both are involved in polar localization and are functionally related to type IV pili. FrzS is essential for the social motility of M. xanthus, which is powered by type IV pili. FrzS consists of an N-terminal ARD and a coiled-coil tail, which is believed to interact with an ATPase motor (Ward et al., 2000; Mauriello et al., 2010). The architecture of an ARD associated with an ATPase is similar in FrzS and TadZ. Therefore, the structural analysis suggests that the ARD of TadZ likely functions as a localization module.
Based on the significant structural similarities, the AAD of TadZ may have functional similarities to MinD. The dimerization of MinD allows its stable binding to the membrane via a C-terminal amphipathic helix and the subsequent recruitment of MinC and MinE (Zhou et al., 2005). It is feasible that the AAD may also bind membrane in a similar manner once TadZ is localized to the pole, if a similar amphipathic helix exists. In contrast to MinD that oscillates from pole to pole at the expense of ATP, TadZ is localized to a single pole.
Thus, we propose that TadZ localizes to the leading pole via the ARD at an early stage in pili biosynthesis. ARD likely recognizes a landmark that has previously been established at the pole, such as a component of the TipN-PodJ-PleC multi-protein complex in C. crescentus. The AAD, or other regions of ARD, may then serve as a docking portal for recruiting/assembling other pili biosynthesis components. Interacting partners of TadZ are currently unknown. The first likely candidates are the TadA ATPase motor and inner membrane proteins, TadB and TadC, since they always appear together in genomes. Pseudo-pilins (TadE and TadF), which are generally believed to locate at the base of type IV pili, are also possible binding partners. In a few bacteria, such as Streptomyces avermitilis, TadZ is fused to a C-terminal TadE tail (Fig. 1B).
Figurski and coworkers have undertaken an analysis of the functional roles of AaTadZ and its domains (B.A. Perez-Cheeks, et al., unpublished, submitted as a companion paper). Remarkably, they find that AaTadZ is also localized to the leading pole despite significant sequence divergence, indicating a conserved role for TadZ. Both ARD and AAD of AaTadZ are required for the localization to the pole since both the N-terminal and the C-terminal deletion mutants of AaTadZ failed to localize. They show that the AAD of TadZ indeed shares functional similarity with MinD by interacting with the membrane through the C-terminal amphipathic helix. Thus, our structural studies have revealed new potential regulatory functions mediated by the ErTadZ family proteins.
Clones were generated using the Polymerase Incomplete Primer Extension (PIPE) cloning method (Klock et al., 2008). The gene encoding ErTadZ was amplified by polymerase chain reaction (PCR) from E. rectale genomic DNA using PfuTurbo DNA polymerase (Stratagene) and I-PIPE (Insert) primers (forward primer ErTadZfw, reverse primer ErTadZrv, Table S1) that included sequences for the predicted 5' and 3' ends. The expression vector, pSpeedET, which encodes an amino-terminal tobacco etch virus (TEV) protease-cleavable expression and purification tag (MGSDKIHHHHHHENLYFQ/G), was PCR amplified with V-PIPE (Vector) primers (forward primer: 5’-taacgcgacttaattaactcgtttaaacggtctccagc-3’, reverse primer: 5’-gccctggaagtacaggttttcgtgatgatgatgatgatg-3’). V-PIPE and I-PIPE PCR products were mixed to anneal the amplified DNA fragments together. Escherichia coli GeneHogs (Invitrogen) competent cells were transformed with the I-PIPE / V-PIPE mixture and dispensed on selective LB-agar plates. The cloning junctions were confirmed by DNA sequencing. Expression was performed in a selenomethionine-containing medium at 37°C with suppression of normal methionine synthesis. At the end of fermentation, lysozyme was added to the culture to a final concentration of 250 µg/ml, and the cells were harvested and frozen. After one freeze/thaw cycle, the cells were homogenized in lysis buffer [50 mM HEPES pH 8.0, 50 mM NaCl, 10mM imidazole, 1 mM Tris(2-carboxyethyl)phosphine-HCl (TCEP)] and passed through a Microfluidizer (Microfluidics). The lysate was clarified by centrifugation at 32,500 g for 30min and loaded onto a nickel-chelating resin (GE Healthcare) pre-equilibrated with lysis buffer, the resin washed with wash buffer [50 mM HEPES pH 8.0, 300 mM NaCl, 40 mM imidazole, 10% (v/v) glycerol, 1 mM TCEP], and the protein eluted with elution buffer [20 mM HEPES pH 8.0, 300 mM imidazole, 10% (v/v) glycerol, 1 mM TCEP]. 
The eluate was buffer exchanged with HEPES crystallization buffer [20 mM HEPES pH 8.0, 200 mM NaCl, 40 mM imidazole, 1 mM TCEP] and concentrated to 18.1 mg/ml using centrifugal ultrafiltration (Millipore). ErTadZ was crystallized by mixing 200 nl protein with 200 nl crystallization solution over a 50 µl reservoir volume using the nanodroplet vapor diffusion method (Santarsiero et al., 2002) with standard JCSG crystallization protocols (Lesley et al., 2002). The crystallization reagent that produced the ErTadZ crystal used for structure solution consisted of 2.1 M ammonium sulfate, 2.25% polyethylene glycol 400, 15.0% glycerol, and 0.1 M HEPES pH 7.57. The crystal used for structure determination was harvested after 36 days at 293 K. MPD was added to the crystal as a cryoprotectant to a final concentration of 15% (v/v). Initial screening for diffraction was carried out using the Stanford Automated Mounting (SAM) system (Cohen et al., 2002) and an X-ray microsource installed on an SSRL beamline (Menlo Park, CA). The crystal was indexed in monoclinic space group C2.
MAD data were collected at 100 K at wavelengths corresponding to the inflection, high energy remote, and peak of a selenium MAD experiment, using a Mar CCD 325 detector (Rayonix) at SSRL beamline 11-1. Data processing and structure solution were carried out using automated structure solution protocols developed at the JCSG. In summary, the MAD data were integrated and reduced using MOSFLM (Leslie, 1992) and then scaled with the program SCALA (Evans, 2006). Location of selenium sites, initial phasing, and identification of the space group were carried out using SHELXD (Sheldrick, 2008). Phase refinement and model building were performed using autoSHARP (Bricogne et al., 2003) and ARP/wARP (Cohen et al., 2004). This process produced excellent density maps and resulted in an initial model that was ~96% complete. Further model completion and refinement were performed manually with COOT (Emsley et al., 2004) and REFMAC5 (Murshudov et al., 1997). The refinement included experimental phase restraints in the form of Hendrickson-Lattman coefficients and TLS refinement with 2 TLS groups (residues 1–114, 115–354). Data and refinement statistics are summarized in Table 1. The stereochemical quality of the model was analyzed using MolProbity (Davis et al., 2004). Molecular graphics were prepared with PyMOL (DeLano Scientific). The sequence logo was produced with WebLogo (Crooks et al., 2004).
The oligomeric state of ErTadZ in solution was determined using a 1 × 30 cm² Superdex 200 column (GE Healthcare) coupled with miniDAWN static light scattering (SEC/SLS) and Optilab differential refractive index detectors (Wyatt Technology). The mobile phase consisted of 20 mM Tris pH 8.0, 150 mM sodium chloride, and 0.02% (w/v) sodium azide. The molecular weight was calculated using ASTRA 5.1.5 software (Wyatt Technology).
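As a rough illustration of how a molecular weight obtained from SEC/SLS can be translated into an oligomeric state, the sketch below divides a measured mass by a calculated monomer mass and rounds to the nearest integer. The function name, the tolerance, and the example masses are hypothetical placeholders chosen for illustration; they are not values reported in this study, and this is not the ASTRA calculation itself.

```python
def oligomeric_state(measured_mw_kda: float, monomer_mw_kda: float,
                     tolerance: float = 0.2) -> tuple[int, float]:
    """Return (nearest integer number of subunits, raw mass ratio).

    Raises ValueError if the ratio is further than `tolerance` from an
    integer, since the sample may then be a mixture of species.
    The tolerance of 0.2 is an arbitrary, illustrative choice.
    """
    if measured_mw_kda <= 0 or monomer_mw_kda <= 0:
        raise ValueError("molecular weights must be positive")
    ratio = measured_mw_kda / monomer_mw_kda
    n_subunits = max(1, round(ratio))
    if abs(ratio - n_subunits) > tolerance:
        raise ValueError(f"ratio {ratio:.2f} is not close to an integer")
    return n_subunits, ratio


if __name__ == "__main__":
    # Hypothetical masses for illustration only (not measurements from this work).
    n, ratio = oligomeric_state(measured_mw_kda=82.0, monomer_mw_kda=40.0)
    print(f"mass ratio = {ratio:.2f}, nearest oligomeric state = {n}")
```

When applying such a check to a tagged construct, the monomer mass should include the expression tag if it has not been cleaved.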
A protein sample of 100 µM ErTadZ in 20 mM HEPES buffer pH 8.0, 200 mM NaCl, 5 mM β-mercaptoethanol was heat-inactivated at 94°C for 5 min and centrifuged at 10,000 × g for 5 min. Nucleotide pools in the supernatant were analyzed on a MALDI-MS instrument (Voyager-DE, Applied Biosystems).
Primers ErTadZE162Afw, pSpeedETfw, ErTadZE162Arv, and pSpeedETrv (Table S1) were used to PCR-amplify DNA sequences covering the first 508 bp and the last 606 bp of the ErTadZ ORF from the pSpeedET::ErTadZ expression vector. PCR products were spliced together in a subsequent SOE-PCR step using the outside primers pSpeedETfw and pSpeedETrv. The resulting SOE-PCR product was cloned into the PacI and AgeI sites of pSpeedET::ErTadZ to produce the pSpeedET::ErTadZE162A construct. The engineered mutation in the ErTadZE162A allele that changed Glu162 (GAA) into Ala (GCA) was confirmed by sequencing.
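The sequence confirmation of a single-codon substitution such as Glu (GAA) to Ala (GCA) can be sketched in a few lines. The helper names and the toy ORF below are hypothetical and are not the ErTadZ sequence; the sketch also assumes that residue numbering starts at the first codon of the sequence passed in, so any vector-encoded tag would have to be trimmed or accounted for first.

```python
# Minimal codon table covering only the two residues relevant to this check.
CODON_TO_RESIDUE = {
    "GAA": "Glu", "GAG": "Glu",
    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCT": "Ala",
}


def codon_at(orf: str, residue_number: int) -> str:
    """Return the codon encoding the 1-based residue_number of an ORF."""
    start = 3 * (residue_number - 1)
    codon = orf[start:start + 3].upper()
    if residue_number < 1 or len(codon) != 3:
        raise ValueError("residue number lies outside the ORF")
    return codon


def confirm_substitution(wild_type: str, mutant: str, residue_number: int,
                         expected_wt: str = "Glu",
                         expected_mut: str = "Ala") -> bool:
    """True if only the target codon differs and it encodes the expected change."""
    wt, mut = wild_type.upper(), mutant.upper()
    start = 3 * (residue_number - 1)
    # The sequences must be identical everywhere except the target codon.
    flanks_identical = (
        len(wt) == len(mut)
        and wt[:start] == mut[:start]
        and wt[start + 3:] == mut[start + 3:]
    )
    wt_residue = CODON_TO_RESIDUE.get(codon_at(wt, residue_number))
    mut_residue = CODON_TO_RESIDUE.get(codon_at(mut, residue_number))
    return (flanks_identical
            and wt_residue == expected_wt
            and mut_residue == expected_mut)


if __name__ == "__main__":
    # Toy ORF (Met-Glu-Lys-stop), purely illustrative.
    toy_wt = "ATGGAAAAATAA"
    toy_mut = "ATGGCAAAATAA"
    print(confirm_substitution(toy_wt, toy_mut, residue_number=2))
```

Under the same numbering assumption, codon 162 would occupy nucleotides 484 to 486 of the ORF.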
Hexahistidine-tagged ErTadZ (10 µM) or the ATPase MipZ (10 µM) was preincubated in 20 mM HEPES buffer pH 8.0, 200 mM NaCl, 10 mM MgCl2, 5 mM β-mercaptoethanol. To assess the ATPase turnover rate of ErTadZ under crystallization conditions, a sample of ErTadZ (10 µM) was preincubated in crystallization buffer (0.1 M HEPES pH 7.57, 2.1 M ammonium sulfate, 15.0% glycerol) supplemented with 10 mM MgCl2. Reactions were supplemented with 1 mM ATP containing 30 mCi of α-labeled [33P] ATP (3000 Ci/mmol, Perkin Elmer) and incubated at 35°C. Aliquots were taken at regular time intervals and heat-inactivated at 94°C for 5 min. Samples were spotted onto polyethyleneimine-cellulose thin-layer chromatography plates (Merck, Darmstadt) and nucleotides were separated using a mobile phase consisting of 0.5 M KH2PO4 pH 3.4. Radioactive ATP and ADP species were detected on a Storage PhosphorScreen (Amersham Bioscience) and quantified using ImageJ software v1.43.
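One way to convert quantified ATP and ADP spot intensities into a turnover rate is sketched below. It is a minimal example under stated assumptions, not the analysis actually performed here: it assumes that the ADP fraction of the total signal equals the fraction of ATP hydrolyzed (reasonable for an α-labeled nucleotide, where both species carry the label), that the time course is in the linear initial-rate regime, and that the default concentrations match the reaction conditions described above (1 mM ATP, 10 µM enzyme). All function names and intensity values are hypothetical.

```python
def fraction_adp(adp_signal: float, atp_signal: float) -> float:
    """Fraction of labeled nucleotide present as ADP in one TLC lane."""
    total = adp_signal + atp_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return adp_signal / total


def linear_slope(x: list[float], y: list[float]) -> float:
    """Ordinary least-squares slope of y against x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def turnover_per_min(times_min: list[float], adp_signals: list[float],
                     atp_signals: list[float], atp_total_um: float = 1000.0,
                     enzyme_um: float = 10.0) -> float:
    """ATP molecules hydrolyzed per enzyme molecule per minute."""
    if len(adp_signals) != len(atp_signals):
        raise ValueError("each lane needs both an ADP and an ATP signal")
    # Convert each lane's ADP fraction into an ADP concentration (µM).
    adp_um = [fraction_adp(adp, atp) * atp_total_um
              for adp, atp in zip(adp_signals, atp_signals)]
    return linear_slope(times_min, adp_um) / enzyme_um


if __name__ == "__main__":
    # Hypothetical spot intensities (arbitrary units), for illustration only.
    times = [0.0, 10.0, 20.0, 30.0]
    adp = [0.0, 10.0, 20.0, 30.0]
    atp = [1000.0, 990.0, 980.0, 970.0]
    print(f"turnover = {turnover_per_min(times, adp, atp):.3f} per min")
```

In practice a no-enzyme control lane would be subtracted first to correct for spontaneous hydrolysis and for ADP contamination in the labeled stock.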
We thank the members of the JCSG high-throughput structural biology pipeline for their contribution to this work. The JCSG is supported by the NIH, National Institute of General Medical Sciences, Protein Structure Initiative grants U54 GM094586 and GM074898. Portions of this research were carried out at the Stanford Synchrotron Radiation Lightsource (SSRL), SLAC National Accelerator Laboratory. The SSRL is a national user facility operated by Stanford University on behalf of the U.S. Department of Energy, Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research, and by the National Institutes of Health (National Center for Research Resources, Biomedical Technology Program, and the National Institute of General Medical Sciences). The genome of E. rectale was a gift of Dr. Jeffrey Gordon, Washington University in St. Louis, School of Medicine. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the National Institutes of Health.
Atomic coordinates and experimental structure factors for ErTadZ at 2.1 Å resolution have been deposited in the PDB under accession code 3fkq.
Additional supporting information may be found in the online version of this article.