Encased within the 280 kb genome in the capsid of the giant myovirus KZ is an unusual cylindrical proteinaceous “inner body” of highly ordered structure. We present here mass spectrometry, bioinformatic, and biochemical studies that reveal novel information about the KZ head and the complex inner body. The identification of 39 cleavage sites in 19 KZ head proteins indicates cleavage of many prohead proteins forms a major morphogenetic step in KZ head maturation. The KZ head protease, gp175, is newly identified here by a bioinformatics approach, as confirmed by a protein expression assay. Gp175 is distantly related to T4 gp21 and recognizes and cleaves head precursors at related but distinct S/A/G-X-E recognition sites. Within the KZ head there are six high copy number proteins that are probable major components of the inner body. The molecular weights of five of these proteins are reduced 35–65% by cleavages making their mature form similar (26–31 kDa), while their precursors are dissimilar (36–88 kDa). Together the six abundant proteins sum to the estimated mass of the inner body (15–20 MDa). The identification of these proteins is important for future studies on the composition and function of the inner body.
The Pseudomonas aeruginosa phage KZ is the type virus of a genus of “giant” myoviruses, the KZ-like phages, whose members have been isolated from diverse geographic locations and are infective for a variety of Pseudomonas species (Krylov et al., 2007). The KZ-like phages and their gene products are important due to their potential to control pathogenic pseudomonads (Briers et al., 2008, Miroshnikov et al., 2006, Paradis-Bleau et al., 2007, Mesyanzhinov et al., 2002) and genetically manipulate their hosts (Monson et al., 2011). Some, such as KZ, have already been incorporated into traditional phage therapy cocktails and continue to be examined for novel therapeutic applications [e.g., (Golshahi et al., 2011, Matinkhoo et al., 2011)]. KZ has a long, 280 kb linear dsDNA genome that encodes numerous proteins that are remarkably diverged from those in other phages, thus making straightforward functional assignments very difficult (Mesyanzhinov et al., 2002).
The KZ virion also presents a singular research focus as its large T=27 head (Fokine et al., 2005a) contains an unusual immense cylindrical protein “inner body” (IB) encased within the genomic DNA (Krylov et al., 1984). The highly ordered structure of this IB has recently been solved (Wu et al., 2012). The IB is a conserved feature in related phages [e.g., 201 2-1, PA3, EL, OBP (Krylov et al., 2007)], suggesting it is an essential component of the KZ-like phages. However, the function, composition, and role in infection of the IB have remained elusive since it was first identified (Krylov et al., 1984). The IB may well prove to be multi-functional and could have roles related to DNA packaging, packaged genome structure, DNA ejection, or it may play a role in all of these aspects of phage development. More specifically, the structurally defined IB may be important for size determination of the T=27 capsid and capsid stability; spooling of DNA tightly around the IB could reduce some of the DNA pressure that might otherwise be exerted on the capsid shell. Spooling of the DNA around the IB may also be necessary to organize such a length of dsDNA at ~500 mg/ml in a sufficiently ordered structure that allows efficient and rapid ejection (Black and Thomas, 2011). The IB may also represent a reservoir of proteins injected into the host cell together with the DNA that are necessary for host takeover. Supporting this hypothesis is that the IB is not detected after infection and injection of the packaged DNA (Krylov et al., 1984). Moreover, in addition to the IB, a number of enzymes [e.g., multi-subunit-β,β′-RNA polymerase (RNAP)] are also found within the KZ head with an unknown structural relationship to it.
Previous biochemical studies have illustrated the complexity of the KZ head, containing at least 30 (Lecoutere et al., 2009), but possibly more than 40 (Mesyanzhinov et al., 2002) different proteins. This is in contrast to the head of T4 which contains 13 different proteins (Black et al., 1993). The locations of two head proteins (gp120 and gp89) in the complex KZ SDS-PAGE profile were identified (Mesyanzhinov et al., 2002); however, ambiguity regarding the exact locations of other head proteins presented a major hurdle for even simple studies of the head. Also, other than gp89 and gp120, the identity of any other major or high copy number proteins in the KZ head was unknown. However, the identity of high copy number head proteins is critical for further studies of the IB as any such proteins would become candidates for components of this large structure. Similarly, while gp120 and gp89 were identified as processed (Mesyanzhinov et al., 2002), it was undefined whether other KZ head proteins were cleaved or not, or which protein represented the prohead protease in any of the KZ-like phages. Comprehensive mass spectral analyses of P. chlororaphis phage 201 2-1 demonstrated that thirteen of its proteins were cleaved at a motif reminiscent of the T4 processing motif (Thomas et al., 2010). However, the exact cleavage sites in all KZ head proteins could not be inferred from the 201 2-1 analyses since these phages are not closely related, as illustrated by their homologous virion proteins having ~20–70 percent identity as determined by Blastp (Thomas et al., 2008).
Here we report mass spectrometry (MS) and biochemical analyses of the KZ tailless full head mutant and a prohead precursor and show that cleavage of numerous head proteins occurs in KZ. This processing must play a major role in the formation of the IB, in addition to the capsid. Furthermore, using advanced bioinformatic approaches we have identified the KZ prohead protease which cleaves multiple head proteins at a consensus amino acid recognition motif. This protease is strongly conserved among all known members of this group of phages, reflecting its important role in morphogenesis. We also identify six high copy number proteins (~100–200 copies) within the KZ head as being IB proteins. This determination was made based on their release from full heads and on their presence in DNA-free proheads from terminase-inhibited infections, in which the protein precursors have nevertheless been processed. These characteristics lay the groundwork necessary for a genetic determination of the function(s) of the extraordinary IB structure.
HPLC-electrospray ionization tandem mass spectrometry (HPLC-ESI-MS/MS) analyses of CsCl-purified full heads of the KZ tailless mutant (Lavigne et al., 2009, Fokine et al., 2005a, Plotnikova et al., 1982) permitted assignment of peptides to 50 KZ gene products (Figure 1). Twenty-nine of these proteins have previously been identified in tailless heads (Lecoutere et al., 2009). However, one protein identified by Lecoutere et al., gp128, was not found in our analyses. Conversely, 19 proteins were identified in the current study that were not reported to be present in tailless heads and/or wild type KZ particles (Lecoutere et al., 2009). Detection of these additional proteins most likely resulted from the approach used here which permitted sampling of low molecular weight proteins and low abundance proteins. Eleven proteins identified in this study (gps 27, 85, 91, 92, 119, 144, 153, 157, 184, 244 and 298) were not identified previously in the tailless heads, although they are KZ virion proteins (Lecoutere et al., 2009). Eight proteins (gps 42, 81, 96, 129, 177_2, 199, 203 and 228) were identified in the tailless heads in this study that were not previously found in the KZ virion. These proteins are unlikely to be contaminants as the heads were judged to be highly purified by electron microscopy and the reversion rate of the mutant is about 10⁻⁵–10⁻⁶. Many of these newly identified proteins are expected to be additional head proteins based on their possessing characteristics indicative of head proteins. For instance, the predicted function of gp129 is that it forms the portal (Cornelissen et al., 2011). Similarly, other proteins have qualities that support them as being head proteins, such as being cleaved at a prohead protease cleavage motif (e.g., gps 92, 96, 153, 129 and 203), and/or their gene being positioned within, or close to, a major head gene module (e.g., gps 91 and 177_2) (Figure 2).
Information gained by comparing observed and expected MS sequence coverage and 1-D gel migration provided insight into the catalog of KZ head proteins that had undergone proteolytic cleavage (Table 1). However, the strongest evidence for cleavage was provided by the presence of a “semi-tryptic peptide”, that is, a peptide with a non-tryptic terminus (i.e., no arginine or lysine) in high confidence peptide assignments; these semi-tryptic peptides were detectable by including “semi-tryptic” cleavage as a parameter in the database searches of the tandem MS data. These efforts were facilitated by the detection of peptides from four different KZ-derived preparations that underwent MS and are reported in this manuscript. High peptide coverage, particularly for the abundant proteins, was frequently observed (see Figure 2). Forty peptides were found that had been produced by cleavages immediately after a glutamic acid, confirming a consensus processing site(s) for 19 different proteins (Figure 2, Supplementary Table 1). Summation of the number of KZ head proteins and their observed cleavage sites indicates the remarkable amount of processing that must be undertaken by the morphogenetic protease – the cleavage of at least 6200 peptide bonds per prohead, perhaps several fold more.
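The semi-tryptic criterion described above can be illustrated with a minimal sketch. It assumes a simplified trypsin rule (cleavage C-terminal to lysine or arginine, ignoring the proline exception) and uses a short hypothetical sequence, not a KZ protein; the function name `classify_peptide` is likewise an illustrative choice and does not correspond to the search software used in this study.

```python
TRYPSIN_RESIDUES = "KR"  # simplified rule: trypsin cleaves C-terminal to Lys or Arg


def classify_peptide(protein: str, peptide: str) -> dict:
    """Classify a peptide as tryptic, semi-tryptic or non-tryptic.

    Only the first occurrence of the peptide in the protein is considered.
    """
    if not peptide:
        raise ValueError("peptide must not be empty")
    start = protein.find(peptide)
    if start < 0:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)

    # A terminus counts as tryptic if it coincides with a protein end
    # or follows (N-terminus) / ends in (C-terminus) a Lys or Arg.
    n_tryptic = start == 0 or protein[start - 1] in TRYPSIN_RESIDUES
    c_tryptic = end == len(protein) or peptide[-1] in TRYPSIN_RESIDUES

    if n_tryptic and c_tryptic:
        kind = "tryptic"
    elif n_tryptic or c_tryptic:
        kind = "semi-tryptic"
    else:
        kind = "non-tryptic"

    preceding = protein[start - 1] if start > 0 else None
    return {
        "kind": kind,
        "preceding_residue": preceding,
        # True when the non-tryptic N-terminus lies immediately after a Glu,
        # the pattern expected for a prohead protease cleavage.
        "after_glu": (not n_tryptic) and preceding == "E",
    }


# Hypothetical toy sequence for illustration only.
toy_protein = "MSKTLSAEGVDLTRNPWK"
print(classify_peptide(toy_protein, "GVDLTR"))       # N-terminus follows E
print(classify_peptide(toy_protein, "TLSAEGVDLTR"))  # both termini tryptic
```

In this toy case the first peptide begins immediately after a glutamic acid and ends at an arginine, so it would be reported as semi-tryptic, whereas the second is an ordinary tryptic peptide spanning the same site.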
Inspection of the 39 KZ cleavage sites showed that in addition to cleavage occurring invariably immediately after a glutamic acid there is a strong preference for a serine in the third position upstream of the cleavage site (26 instances), although an alanine or a glycine is also accepted in this position (8 and 4 instances, respectively, Supplementary Table 1). In one instance (gp303) cleavage apparently occurred where phenylalanine is in the third position upstream of the cleavage site. Notably, there is little requirement for any other specific residues, particularly downstream of the cleavage site (Supplementary Figure 2). Overall, the KZ consensus cleavage sequence (S/A/G-X-E, where X is any amino acid) is similar to, but not the same as that of the T4 prohead protease which cleaves after the motif I/L-X-E (more than 13 cleavages in eight different precursor proteins)(Black et al., 1993). In T4, there is also no requirement for specific residues downstream of the cleavage site. Also, in T4 there is evidence that a shift in the processing motif within a polypeptide results in a retargeting of the cleavage site by the protease (Black et al., 1993, Mullaney and Black, 1996).
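As a rough illustration of how the two consensus sequences differ, the sketch below scans a sequence for the KZ-type (S/A/G-X-E) and T4-type (I/L-X-E) motifs using regular expressions. The toy sequence and the function name `candidate_sites` are hypothetical, and a motif match only marks a candidate: it does not imply that the site is actually cleaved.

```python
import re

# Lookahead patterns so that overlapping motif occurrences are all reported.
KZ_MOTIF = re.compile(r"(?=([SAG].E))")  # S/A/G-X-E
T4_MOTIF = re.compile(r"(?=([IL].E))")   # I/L-X-E


def candidate_sites(sequence: str, motif: re.Pattern = KZ_MOTIF) -> list:
    """Return (1-based position of the Glu, tripeptide) for each motif match.

    Cleavage, if it occurs, would be immediately after the reported Glu.
    """
    return [(m.start() + 3, m.group(1)) for m in motif.finditer(sequence)]


# Hypothetical toy sequence for illustration only.
toy = "MKSAEELIEGFEALEK"
print(candidate_sites(toy, KZ_MOTIF))  # [(5, 'SAE'), (6, 'AEE'), (12, 'GFE'), (15, 'ALE')]
print(candidate_sites(toy, T4_MOTIF))  # [(9, 'LIE')]
```

A scan of this kind is deliberately permissive and is best treated as a first filter ahead of experimental evidence such as semi-tryptic peptides.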
Identification of cleavage sites was critical for length determination of the mature polypeptide(s) for each KZ protein (Table 1). Notably, for some proteins, processing was so extensive that only 30–50% of the precursor protein is retained in the mature head (e.g., gp97, see Table 2). The manner in which several proteins were processed is intriguing, with several proteins cleaved into two or three polypeptides that remain in the mature head (e.g., gps 89, 97, 177, 303). Some of the smaller fragments (e.g., gp89N and gp97N) may be non-functional debris and unable to be cleared from the capsid due to their locale or amino acid composition, similar to the case for the phage T4 “acid soluble peptides” that are found within the mature capsid; alternatively these fragments may have unknown functional significance.
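The length bookkeeping implied here can be sketched as follows. The precursor length and cleavage positions in the example are hypothetical and are not taken from Table 1 or Table 2; the function names are likewise illustrative.

```python
def mature_fragments(precursor_length: int, cleavage_after: list) -> list:
    """Split residues 1..precursor_length at the given cleavage positions.

    Each position is the 1-based residue after which cleavage occurs.
    Returns (first_residue, last_residue) for every resulting fragment.
    """
    cuts = sorted(set(cleavage_after))
    if any(c < 1 or c >= precursor_length for c in cuts):
        raise ValueError("cleavage positions must lie inside the precursor")
    bounds = [0] + cuts + [precursor_length]
    return [(bounds[i] + 1, bounds[i + 1]) for i in range(len(bounds) - 1)]


def retained_fraction(precursor_length: int, retained: list) -> float:
    """Fraction of precursor residues present in the retained fragments."""
    kept = sum(end - start + 1 for start, end in retained)
    return kept / precursor_length


# Hypothetical precursor of 600 residues cleaved after residues 50 and 400.
fragments = mature_fragments(600, [50, 400])
print(fragments)                                   # [(1, 50), (51, 400), (401, 600)]
print(retained_fraction(600, [fragments[-1]]))     # only the C-terminal fragment kept
```

In this made-up case, keeping only the C-terminal fragment retains one third of the precursor, which falls in the 30–50% range mentioned above for the most extensively processed proteins.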
Cleavage sites identified in both gps 89 and 120 were consistent with the N-termini identified by Edman degradation (Mesyanzhinov et al., 2002). However, additional processing sites, N-terminal to those found by Edman degradation were also identified in both these proteins, as evidenced by the presence of low abundance semi-tryptic peptides (see Figure 3A for gp120). Similar semi-tryptic peptides that indicated that additional cleavages had occurred upstream of the maturation cleavage site were also identified in a number of other proteins [e.g., gp93 (Figure 4B) and gps 94, 95, 97, 162, 303 (Figure 2)]. The low abundance of these semi-tryptic peptides suggests that they are not present in every particle; the fact that they were identified at all is a consequence of the low detection limits of the MS analyses. Identification of these additional processing sites is important because they demonstrate that there is a multiplicity of processing in the precursor or immature regions of the KZ head proteins. Examination of the regions in which multiple cleavages have been confirmed by semi-tryptic peptides typically also revealed other potential cleavage motifs in these regions (see Figure 3A and B), suggesting that the detected low abundance cleavages represent a fraction of what actually occurs in the KZ head during maturation. It seems feasible that there is a redundancy of cleavage sites in these regions that may have evolved as a protective measure to ensure removal of regions of the protein not required in the mature head, raising the question as to whether there is a preferred length for cleaved fragments to ensure optimal removal from the prohead. 
Observation of sequences in the mature regions of proteins that adhere to the cleavage motif but are apparently never cleaved (e.g., GFE-457 or ALE-723 in gp120, Figure 3A) indicates that there are likely to be specificity or structural determinants required for cleavage beyond the S/A/G-X-E motif.
The identification of so many cleavage sites in KZ head proteins enables us to draw conclusions as to the cleavage of head proteins in other phages, particularly 201 2-1 and PA3, phages that are members of the KZ-like phage genus (Cornelissen et al., 2011). Comparison of the identified cleavage sites in this study in KZ and those identified in 201 2-1 indicates that their respective proteases have very similar cleavage site residue requirements (S/A/G-X-E) and this requirement is likely to be a feature of phages in the KZ genus. The cleavage sites in 12 KZ proteins (e.g., gps 86, 89, 96, 153, and 162) are conserved with identified cleavage sites in their respective homolog in 201 2-1 (Thomas et al., 2010) and from this work we can infer about another 15 cleavage sites in 201 2-1 head proteins (Table S2). From the identified cleavage sites in KZ and 201 2-1 proteins we can also infer cleavages occur at about 27 sites in homologous proteins in PA3 (Table S2).
We observed that frequently the degree in which a KZ cleavage site was conserved in its homologs correlated with the overall degree of conservation of the homologs themselves (see Table S4). For instance, the cleavage sites in two highly conserved proteins (major capsid and protease, see below) are the only sites perfectly conserved in their 201 2-1 and PA3 homologs with regard to the three residues upstream of the processing site (Figure 3Ci and ii). However, despite more variations in cleavage site residues with more diverse homologs, we typically observed a conservation of putative cleavage site residues that mirrored the pattern seen in KZ cleavages. A number of the KZ cleavage sites are conserved in homologous proteins only with regard to the first and third residues upstream of the processing site, as seen in the homologs to gps 84 and 92 (SXE, Figure 3Ciii and iv). Similarly, in some homologous proteins the third residue upstream of a proposed cleavage site is replaced with an A or G as observed in some KZ cleavages (e.g., Figure 3Cv and vi). In homologs of other KZ processed proteins, sequences that adhere to the cleavage motif were frequently found in a position not aligning with, but within a few residues of the actual point of cleavage in the KZ protein. We believe these motif-like sequences also represent cleavage sites (e.g., Figure 3Cvii).
Notably, we were unable to infer corresponding cleavage sites in the homologs to several KZ proteins (e.g., gps 93, 95, 97 and 303, Table S2). Typically, these are not strongly conserved proteins; however, we suspect that a number of these proteins in KZ, and most likely their homologs in related phages, have an important role in the mature particle based on their abundance in the KZ head (see below). Overall, the identification of numerous cleavage sites in KZ greatly advances our knowledge and expectations of cleavage events in the KZ-like phages. It also emphasizes that prohead protease cleavages form a major step in capsid maturation for these phages.
Sequence-sequence based searches were unable to identify the enzyme responsible for the extensive number of cleavages in KZ head; hence we sought to identify the KZ prohead protease using Hidden Markov Model (HMM) based searches. To do this we used local implementations of the Sequence Analysis and Modeling System [SAM (Hughey and Krogh, 1996, Karplus et al., 1998)] and HHpred (Söding, 2005). Initially, Hmmscore (a profile to sequence search) of a custom T4 gp21 model against a library of proteins from four KZ-like phages (KZ, EL, PA3 and 201 2-1) identified homologous proteins in the four phages: EL gp192 (E-8), KZ gp175 (E-6), 201 2-1 gp168 (E-5) and PA3 gp205 (E-3).
To challenge these matches, an alignment of these four protease candidates was constructed using SAM and then converted into an HHM and used for HHpred searches (profile-profile comparisons). An HHsearch of the KZ gp175 family HHM against all HHpred libraries (from ftp://toolkit.lmb.uni-muenchen.de/HHsearch/databases/) found the best match to be to PHA00911 gp21 prohead core scaffolding protein and protease. The E-value of this match was only 3; however, the HHpred P-value, which takes secondary structure into account, was 2.5E-05. This match was solidified using a custom T4 gp21 HHM powerful enough to detect an HHM based on the U35 peptidase family at E-values of 3.5E-05. Members of the U35 peptidase family contain diverged, but known, homologs to the T4 protease, such as the HK97 protease (Liu and Mushegian, 2004, Cheng et al., 2004). When searched against the custom gp21 HHM the KZ gp175 HHM scored slightly better (2.3E-06, Figure 4) than the U35 family HHM, indicating that KZ gp175 and its homologs in the KZ-related phages are homologous to, albeit extremely diverged from, T4 gp21.
To confirm the identity of gp175 as the KZ prohead protease we expressed its predicted mature form in Escherichia coli DH10B. An extract incubation assay of gp175 using expressed full length gp93 as a substrate showed little non-specific proteolysis of other proteins, whereas the majority of gp93 observable by SDS-PAGE was cleaved in a good yield to a length consistent with that found in the mature virion (Figure 5). Protease active site knockout control incubations (data not shown) will be published together with a more detailed enzymatic characterization.
KZ gp175 conserves the proposed serine protease active site residues H-85 and S-140 in T4 gp21 (Figure 4); these residues have been shown to be conserved across the phage U9 and U35 families to the S21 family that contains herpesvirus protease (Liu and Mushegian, 2004, Cheng et al., 2004). The final residue of the serine protease catalytic triad (an aspartate or glutamate in classical serine proteases) is less well conserved throughout the phage/virus phylogenetic spectrum (Cheng et al., 2004). In the U9 family this amino acid position is most often an alanine or glycine (Cheng et al., 2004), while in the KZ-like phages it is a threonine (Figure 4B). This third residue of the catalytic triad is not considered to play a major role in catalysis and it has been proposed that for the U9 family catalysis might only require a catalytic dyad (Cheng et al., 2004). Alternately, the third residue of the catalytic triad might have migrated—possibly to the well-conserved aspartate two residues downstream of the alanine/glycine residue (Cheng et al., 2004), which is also conserved in the KZ-like protease candidates (Figure 4B).
The structural elements from a major portion of the herpesvirus protease (β1 – β6), and other members of the peptidase S21 superfamily, also have predicted secondary structure counterparts in phage proteases, with the exception of the herpesvirus αB, which is involved in homodimer formation (Liu and Mushegian, 2004, Cheng et al., 2004). Counterparts to each of the herpesvirus protease elements from β1 – β5, with the exception of β6, are also predicted for KZ gp175 (Figure 4).
KZ gp175 is a low copy head protein in the mature head based on an assignment of a relatively low number of spectra to the ~24 kDa protein in our MS experiments (29, compared to 1405 spectra for gp120, the ~65 kDa major capsid protein). This is similar to T4 where only three copies of the T4 protease remain in the mature head, although there are 72 copies of the zymogen present in the unprocessed prohead (Black et al., 1993). Based on the T4 precedent, it is likely to be biologically important that KZ gp175 is C-terminally cleaved at E-210. In T4 the protease is the only head protein known to be C-terminally cleaved, probably at LAE-206 (Keller and Bickle, 1986). Notably, this cleavage of the protease zymogen is an autocleavage event that activates the protease (Showe et al., 1976a, Showe et al., 1976b). The protease auto-activation step is closely controlled by prohead assembly, likely regulated by addition of the vertex proteins (Black et al., 1993). We expect that the processing of KZ gp175 also activates its zymogen form.
Only about one quarter of the processed KZ proteins have either a known function or one that can be inferred by bioinformatics. These are the major capsid protein (gp120), the portal protein (gp129), the protease (gp175), the virion RNAP beta’ subunit (gp180), and a helicase (gp203). Several of these proteins have novel or unusual properties. For instance, the fact that the KZ portal is processed is unusual as processing of the portal protein is an event rarely reported in tailed phages, although it is known that 21 and 26 residues are removed from the N-termini of the λ and P2 portal proteins, respectively (Walker et al., 1982, Rishovd et al., 1994, Rishovd et al., 1998). In fact, in T4 the portal is the only early prohead protein that is not processed despite this phage having a relatively high number (13) of processed proteins in its capsid (Black et al., 1993). That the KZ portal may have such an unusually large propeptide (262 residues) removed suggests it may have more than just a core-targeting function and may contribute significantly to the core itself, possibly connecting the portal to the precursor form of the IB in the prohead.
Helicases and multi-subunit RNAPs have not commonly been found in the virions of characterised phages. In KZ, presumably the RNAP subunits are injected into the host for transcription of early genes (Thomas et al., 2008). Possibly, they could have an additional role in genome translocation out of the capsid as in T7 (Grayson and Molineux, 2007). Possible functions for gp203 (one of three helicases encoded by KZ) are more speculative. Gp203, homologous to the SNF-2 helicases, might be injected into the host where it may have a role in DNA transcription or replication, or it might be involved in organization of the DNA within the head. The identification of C-terminal cleavages at a protease motif in the RNAP beta’ subunit and gp203 helicase is a novel finding and supports these enzymes being incorporated into the head based on our knowledge of T4 head morphogenesis (Black et al., 1993).
The major capsid protein for KZ has been reported to be present in 1560 copies per virion (Fokine et al., 2005a). In this study, the relative abundances of the proteins in the KZ head were estimated from the number of peptide mass spectra assigned with high confidence, after normalization on the basis of calculated molecular weight (SC/Mw), and densitometry. SC/Mw for gps 89, 90, 93, 95, 97 and 162 were substantially higher than those for most of the other head proteins, ranging from 4.5–5.7 (Table 1). The SC/Mw for the major capsid protein, gp120, was 25.8. However, as a consequence of saturation effects on relative quantification of the major capsid protein using this approach (Thomas et al., 2010), the copy numbers derived from SC/Mw are expected to be over-estimated for gps 89, 90, 93, 95, 97 and 162. In order to obtain an alternative assessment of relative quantities, image analysis was conducted of the Coomassie-stained SDS-PAGE bands containing these six proteins in both purified phage and tailless heads. The densitometry results were in agreement with the MS data, supporting the conclusion that these six proteins are present in high abundance (>100 copies per head, Table 2). This indicates that each KZ head is composed of more than 2000 protein subunits.
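The SC/Mw normalization can be sketched in a few lines. The proportional scaling to the 1560 capsid protein copies is an assumption made here for illustration and is not necessarily the calculation used for Table 1 or Table 2; the function names are hypothetical, and the spectral count and mass for gp175 are the rounded values quoted in the text.

```python
def sc_per_mw(spectral_count: int, mw_kda: float) -> float:
    """Spectral count normalized by molecular weight (SC/Mw)."""
    return spectral_count / mw_kda


def copies_relative_to_capsid(sc_mw: float,
                              capsid_sc_mw: float = 25.8,
                              capsid_copies: int = 1560) -> float:
    """Naive copy number estimate, assuming SC/Mw scales linearly with copies.

    Defaults are the gp120 SC/Mw and copy number quoted in the text.
    """
    return sc_mw / capsid_sc_mw * capsid_copies


# Range of SC/Mw reported for the six abundant proteins (4.5-5.7).
low = copies_relative_to_capsid(4.5)
high = copies_relative_to_capsid(5.7)
print(round(low), round(high))  # 272 345

# Low copy protease gp175: 29 spectra assigned to a ~24 kDa protein.
print(round(sc_per_mw(29, 24.0), 2))  # 1.21
```

Under this simple scaling the six abundant proteins would come out at roughly 270–345 copies each, higher than the ~100–200 copies supported by densitometry, which is in line with the expectation that saturation of the major capsid protein signal inflates SC/Mw-derived copy numbers.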
A number of lines of evidence identify these six abundant head proteins as probable major components of the IB. First, the genes encoding five of the abundant head proteins (gps 89, 90, 93, 95 and 97) are located within a single gene module (Figure 2). Although genes with related morphogenetic functions are often observed to be clustered in other phage genomes, the clustered location of these five genes is more significant in KZ because of the otherwise overall absence of the normal head gene synteny in KZ (Figure 2). Second, the sum of the observed masses of the IB candidate proteins is consistent with the total estimated mass of the IB (15–20 MDa) (Wu et al., 2012). It is important to note that as a consequence of processing, the molecular weights of the mature forms of the six abundant KZ IB proteins are remarkably similar (Table 2), despite major differences in the sizes of their precursors. Notably, three of the abundant proteins, gps 93, 95 and 162, contain a paralogous domain (PF12699) that is also present in two of the non-abundant head proteins, gp94 and gp163. By analogy to gene duplication events in other phages [e.g., T4 gps 23 and 24 (Fokine et al., 2005b)], the KZ proteins containing the PF12699 domain are likely to have evolved specialized, although related, functions. Each of the KZ-related phages has a set of paralogs belonging to this family which we predict to have a major influence on the IB structure.
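The mass argument can be checked to an order of magnitude with a short calculation. The sketch below assumes, purely for illustration, that all six proteins are present at the same copy number (about 100 per head) and share a mature mass in the 26–31 kDa range; the actual per-protein values are those of Table 2, which is not reproduced here, and the function name is hypothetical.

```python
def total_mass_mda(copies_per_protein: int,
                   mature_mass_kda: float,
                   n_proteins: int = 6) -> float:
    """Combined mass in MDa of n_proteins, each at the same copy number and mass."""
    return n_proteins * copies_per_protein * mature_mass_kda / 1000.0


# Illustrative assumption: ~100 copies of each of the six abundant proteins.
lower = total_mass_mda(100, 26.0)
upper = total_mass_mda(100, 31.0)
print(lower, upper)  # about 15.6 and 18.6 MDa
```

With these assumed inputs the total falls inside the 15–20 MDa estimated for the IB. The result scales linearly with the assumed copy number, so it should be read only as a plausibility check.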
The similarity in the proteolysis of the KZ head proteins to those in T4 suggests similar overall head morphogenesis. In T4 an immature, uncleaved form of the prohead is first assembled around a structure-determining scaffolding core. The immature prohead is then converted by the gp21 protease to a more mature prohead whose outer shell is composed of the cleaved capsid proteins gp23 and gp24. Additionally, in the mature prohead most of the internal form-determining scaffolding components (gps 22, 21, 67 and 68) have been proteolytically removed. Other scaffolding components nonessential for prohead assembly undergo only limited N-terminal processing and are retained by the mature prohead into which DNA is then packaged by the terminase. Upon phage infection these proteins [the internal proteins (IPs), IPI, IPII, IPIII and Alt] are injected into the host together with the DNA (Black et al., 1993). Addition of the phage T4 terminase inhibitor 9-aminoacridine (9AA) to infected bacteria blocks DNA packaging and leads to the accumulation of fully processed mature T4 proheads that contain the cleaved IPs but lack DNA (Schaerli and Kellenberger, 1980).
Host uptake of 9-aminoacridine has a strikingly similar effect on KZ head morphogenesis enabling us to isolate a comparable packaging-inhibited KZ prohead in which the abundant, candidate IB processed head proteins are components (Figure 6). The 9AA particles isolated by glycerol gradient centrifugation (Figure 6A) contained the mature capsid protein and a number of other abundant inner body proteins in proportions comparable to those found in the mature capsid (Figure 6B). The particles contained little, if any DNA (Figure 6C). Significantly, electron microscopy of these particles revealed a kernel-like object within the DNA-free prohead which apparently represents a precursor or degraded form of the IB that evidently can assemble in the absence of DNA packaging (Figure 7, Supplemental Figures 3 and 4). Isolation of these processed mature proheads shows that the KZ protease operates like the T4 protease, prior to and independent of DNA packaging, and unlike the adenovirus protease that translocates along the packaged DNA to access its cleavage sites in the capsid (Mangel et al., 2003).
Proteins released from the T4 capsid shell by osmotic shock were observed to be either scaffolding core-related or IP-related (Black et al., 1993). Osmotic shock treatment of KZ purified tailless heads followed by separation of the empty shells (Supplemental Figure 5) from the released proteins, enabled the identification of proteins released from the capsid, and therefore, most likely packaged within the capsid itself and not tightly associated with the capsid shell (Figure 8). The released proteins included the abundant IB proteins listed in Table 2 (Figure 8) and based on the T4 precedent are likely to have scaffolding core- and/or IP-related functions.
An additional experiment also indicates that most of the abundant KZ IB proteins do not have one T4 IP-like signature characteristic. The T4 IPs and Alt precipitate with the still-condensed genomic DNA released from full heads by disruption of the capsid by treatment with 70% acetic acid (generally a protein solubilizing solvent) whereas these proteins are soluble as isolated from heads or proheads lacking DNA (Zachary and Black, 1991). The other T4 head proteins, with the exception of the portal protein, are found in the supernatant (Zachary and Black, 1991). The same treatment of KZ tailless heads also resulted in most proteins being enriched in the supernatant as assessed by SDS-PAGE and the SC/Mw values for each protein in both fractions (Figure 9). As expected, most of the DNA was found in the pellet. Only two processed, encapsidated proteins – the abundant protein (gp93) and the low abundance helicase (gp203) – were enriched in the pellet fraction. By this criterion therefore the soluble gp93 IB protein has a unique T4 IP-like association with the DNA unlike the remainder of the IB proteins. This suggests that most of the IB proteins are segregated away from the DNA in the protein rich cylinder (Wu et al., 2012), whereas the T4 IPs are dispersed within the DNA (Black and Thomas, 2011).
Comparison of the conservation of the homologs to the processed proteins in T4 and KZ among the members of their respective families supports that the candidate KZ IB proteins, and in fact the majority of KZ processed proteins, do not behave like T4 IP proteins, but more like the T4 scaffolding core proteins. Based on the degree by which the T4 processed proteins are conserved through the T4-like phages (from closely related phages, to more diverged members, Table S3) these proteins can roughly be divided into three groups: (1) proteins that are highly conserved to approximately the same degree as known essential proteins (e.g., terminase, portal and tail sheath proteins), and include the major capsid protein, gp21 and the core-related protein, gp68; (2) proteins that are more diverged than the highly conserved proteins, but for most of which homologs could still be identified in the more distantly related T4-related phages (e.g., gp67 and gp22); and (3) proteins that are poorly conserved, and for which counterparts were not identifiable in many phages (e.g., IP and Alt proteins). Notably, the genes encoding the proteins in categories (1) and (2) are all located in one major region, whereas the genes encoding the IPs and Alt are in three different locales in the T4 genome.
A comparison between the processed proteins in KZ and their homologs in the related phages 201 2-1, PA3 and EL (Table S4) revealed that all processed KZ proteins have identifiable homologs in 201 2-1 and PA3, and the more highly conserved of these proteins have very diverged counterparts in EL [recently proposed to belong to a separate genus, the EL-like phages (Cornelissen et al., 2011)]. Generally, the proteins of the phages 201 2-1 (and PA3) have similar degrees of divergence from their KZ homologs as KVP40 proteins have from their T4 homologs (Thomas et al., 2008). Therefore the very existence of identifiable homologs to the KZ processed proteins in 201 2-1 and PA3 illustrates that their conservation is very dissimilar to the poor conservation of the IP proteins in the T4-like phages.
Several of the KZ processed proteins are conserved to a degree comparable to that of the major morphogenetic proteins terminase and sheath, including the major capsid protein, portal, protease, virion RNAP subunits, helicase-gp203, gp89 (an abundant protein) and gp184 (Table S4). KZ processed proteins that are more moderately conserved (divergence of 50–70%) include gps 84, 92, 163, 177 and 90, the latter an abundant head protein for which processing has not been observed. Less conserved processed KZ proteins (70–80% divergence) include gps 86, 93, 94, 95, 96, 97 and 162. The degree of conservation of these moderately and less conserved KZ processed proteins is reminiscent of the conservation of the various T4 proteins with scaffolding core-related functions relative to their homologs in KVP40.
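The divergence bins used in this grouping can be expressed as a short Python sketch. Only the 50–70% and 70–80% ranges come from the text; the cutoff below 50% for the highly conserved group, the handling of the shared 70% boundary, and the example protein names and values are assumptions made purely for illustration.

```python
def conservation_class(divergence_pct):
    """Assign a conservation group from percent divergence to a homolog."""
    if divergence_pct < 50:
        # Assumed cutoff: the text does not give a range for this group.
        return "highly conserved"
    if divergence_pct < 70:
        # 50-70% divergence, as stated in the text.
        return "moderately conserved"
    if divergence_pct <= 80:
        # 70-80% divergence; the 70% boundary is placed here by assumption.
        return "less conserved"
    # The text does not describe processed proteins beyond 80% divergence.
    return "unclassified"


# Hypothetical proteins and divergence values, not taken from Table S4.
example = {"protein_A": 35.0, "protein_B": 62.5, "protein_C": 76.0}
classes = {name: conservation_class(d) for name, d in example.items()}
print(classes)
```

Actual group assignments should be read from Table S4 rather than derived from these illustrative thresholds.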
Proteolysis of head proteins is a critical step in the morphogenesis of many tailed phages, facilitating the conversion of the prohead to the mature capsid. In this study we report the sequences of the cleavage sites in nineteen head proteins in KZ. We identify, both informatically and directly by enzyme activity, the protease likely responsible for all of these cleavages, as judged by its action at consensus S/A/G-X-E recognition sequences at 39 cleavage sites. Processing of KZ head proteins thus occurs at a sequence similar to, but distinct from, the I/L-X-E recognition sequence of the distantly related T4 phage protease. We found evidence of multiple processing sites in nine KZ proteins. This appears to represent a built-in mechanism by which the phage ensures that the majority of the propeptide regions are removed, and it emphasizes the essential nature of processing in KZ head morphogenesis.
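As an illustration of how the two recognition motifs differ, the following Python sketch scans a protein sequence for candidate S/A/G-X-E and I/L-X-E sites. It is not the method used in this study, where cleavage sites were identified by mass spectrometry. The toy sequence is invented, and placing the cut immediately C-terminal to the glutamate is an assumption made here by analogy with the T4 protease.

```python
import re

# Lookahead patterns so that overlapping motif occurrences are all reported.
KZ_MOTIF = re.compile(r"(?=([SAG].E))")  # S/A/G-X-E
T4_MOTIF = re.compile(r"(?=([IL].E))")   # I/L-X-E


def candidate_sites(sequence, motif):
    """Return (motif_text, cut_position) pairs.

    cut_position is the number of residues N-terminal to the putative cut,
    assuming cleavage directly after the Glu of the motif (an assumption).
    """
    return [(m.group(1), m.start() + 3) for m in motif.finditer(sequence)]


# Invented toy sequence, not a KZ or T4 protein.
toy = "MSTEAKLGVEQQIAELS"
kz_sites = candidate_sites(toy, KZ_MOTIF)
t4_sites = candidate_sites(toy, T4_MOTIF)
print(kz_sites)
print(t4_sites)
```

A motif match alone does not establish cleavage; such a scan only lists positions that are compatible with the consensus.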
The other main conclusion of our study is the identification of six similar size, similar abundance internal head proteins that provide the biochemical basis for the regular multi-tiered IB cylinder structure (Wu et al., 2012). Potentially, these six proteins could be segregated within individual tiers, or mixed in few or many combinations in several or more of these tiers. Further work will be required to determine their structural organization. Further study is also required on the IB precursor; the extensive cleavages in the IB proteins suggest that this structure may be quite different and not readily inferred from its mature form. Studies of related phages with differences in IB protein composition will enhance our understanding of the KZ IB, and differences in IB structure among these phages may also help explain the major differences in genome length (280 kb, KZ; 316 kb, 201 2-1; 308 kb, PA3; 211 kb, EL) found among these phages whose capsids are of similar size (Krylov et al., 2007). The determination of whether or not the (thus far) unique IB of the PhiKZ-like phages is essential and, if so, the nature of its function(s) may be answered by a combined genetic, biochemical and structural approach.
The Lory strain of Pseudomonas aeruginosa PAO1 (BEO3) was kindly provided by Dr Robert Ernst. Phage KZ (Mesyanzhinov et al., 2002, Fokine et al., 2007) and its tailless mutant (Lavigne et al., 2009, Fokine et al., 2005a, Plotnilova, et al., 1982) were generous gifts of Drs Andrei Fokine and Konstantin Miroshnikov, respectively. We also thank Dr Miroshnikov for advice on propagation of the tailless mutant.
KZ was propagated as described previously (Wu et al., 2012). Purification of the tailless mutant was by step CsCl gradient as described previously (Fokine et al., 2005a), with the exception that particles were grown in liquid culture and concentrated by PEG precipitation. An overnight culture of P. aeruginosa was diluted 1:1000 into 1 L of LB and grown with shaking at 37°C until OD600 was 0.33. The culture was then transferred to 40°C and infected with the tailless mutant phage at a MOI of 5 and grown for 2 hr 15 min until lysis was observed. Bacterial debris was removed by a low speed spin (10,400 x g, 5 min) and the sample treated with Pancreatic DNAase I (Roche) (1500 U, 37°C, 30 min). Heads were concentrated by slow stirring overnight at 4°C in 1 M NaCl and 10% PEG (final concentration) and then pelleted (11,300 x g, 4°C, 30 min). Pellets were resuspended in 20 ml SM buffer [50 mM Tris-HCl (pH 7.5), 100 mM NaCl, 8 mM MgSO4, 0.002% gelatin] containing Complete Protease Inhibitor (Roche) and then layered (5.8 ml/tube) onto CsCl step gradients composed of the following concentrations of CsCl: 1.59 g ml−1 (1.0 ml), 1.52 g ml−1 (0.75 ml), 1.5 g ml−1 (1.5 ml), 1.30 g ml−1 (1.5 ml) and 1.21 g ml−1 (1.5 ml). The buffer used throughout the gradient was 10 mM Tris-HCl (pH 7.5) and 1 mM MgCl2. Tubes were spun (108,000 x g, 10°C, 3 h) in a Beckman Coulter SW41 rotor. The band judged to contain only heads by SDS-PAGE and transmission electron microscopy had a buoyant density of 1.46 g ml−1. This is higher than the buoyant density of purified intact phage (1.36 g ml−1). Heads were found to be unstable upon dialysis unless the CsCl was removed by dialyzing initially against a buffer composed of 50 mM Tris-HCl (pH 7.5), 5 M NaCl and 100 mM MgCl2 (4°C), followed by a series of buffers composed of reducing concentrations of salts, concluding at 50 mM Tris-HCl (pH 7.5) and 10 mM MgCl2.
Treatment of KZ with the T4 terminase inhibitor 9-aminoacridine (9AA) was generally as described for T4 (Schaerli and Kellenberger, 1980), with the exception that preliminary experiments were undertaken to determine the concentration of 9AA required to inhibit phage particle production (15 μg ml−1); the concentration that was subsequently used was 10-fold higher than required for the same effect on T4. This can be attributed, in part, to the higher tolerance of P. aeruginosa to 9AA than E. coli (Wainwright et al., 1997). P. aeruginosa was propagated in 100 ml of M9S media at 37°C with shaking until OD600 0.45. Purified KZ was used to infect at a MOI of 5; 9AA was added 30 seconds after infection. At OD600 0.68 the cells were harvested by centrifugation at 10,400 x g for 5 min. The pellet was resuspended in 800 μl of SM buffer, DNAase (100 units) and CHCl3 (40 μl) were added, and the mixture was incubated for 10 min (37°C) and then centrifuged for 5 min (4,300 x g, 4°C). The supernatant (200 μl) was immediately loaded onto a pre-formed 15–45% glycerol gradient (50 mM Tris-HCl, pH 7.5; 5 mM MgCl2) and centrifuged in a Beckman Coulter SW50.1 rotor (147,000 x g, 1 hr, 4°C). Fractions (~350 μl) were collected from the bottom of the tube and run on SDS-PAGE (10% polyacrylamide) and DNA agarose gels (0.7–1%). Non-dialysed fractions of interest were concentrated approximately 10-fold by initially diluting the sample with 1280 μl of SM and then centrifuging (38,700 x g, 30 min, 4°C) in a SS34 rotor. The pellet was resuspended in SM buffer.
Purified heads in CsCl were osmotically shocked by dialysis from CsCl to a buffer containing 50 mM Tris-Cl (pH 7.5); 10 mM MgCl2. Particles were determined to be disrupted by a large increase in the viscosity of the sample, such that it could not be pipetted. This viscosity was reduced by treatment with DNAase (10 U/200 μl of sample) for 10 min (37°C) and the sample was then loaded onto a pre-poured, chilled glycerol gradient and treated as described for the 9AA samples above. SDS-PAGE bands containing proteins released from the capsid underwent MS analyses as described above.
Acetic acid treatment of KZ tailless heads was as previously described (Zachary and Black, 1991). Heads were propagated as described above but, instead of purification by CsCl gradient, underwent differential centrifugation (2x 7,600 x g, 10 min; 38,700 x g, 30 min, 4°C). Pellets were resuspended in SM buffer and judged to contain only a small percentage of intact phage by comparison of their SDS-PAGE profile with those of intact phage and purified heads. Acetic acid (70% final v/v) was added to the sample, which was vortexed, left at room temperature for 20 min, and then centrifuged in a benchtop centrifuge (16,000 x g, 4°C, 10 min). Pellets were dried by Speedivac centrifugation (20 min, 60°C) then resuspended in 10 mM Tris-HCl (pH 7.5). The supernatant was dialyzed against water (1 hr, 4°C), underwent Speedivac centrifugation, and was resuspended in a volume of 10 mM Tris-HCl (pH 7.5) equal to that used for the pellet fraction. Samples underwent SDS-PAGE, DNA gels and MS analyses as described above. For the MS analyses, double the volume of pellet fraction was loaded on the SDS-PAGE gel as for the supernatant fraction. As a consequence, the number of spectra identified for each protein in the pellet fraction was halved for calculations of SC/Mw to enable a more realistic comparison of the abundance of each protein species in pellet versus supernatant fractions.
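The loading correction described above amounts to simple arithmetic, sketched here in Python. The spectral counts and molecular weight are invented for illustration, and expressing Mw in kDa is an assumption, since the unit used for SC/Mw is not stated.

```python
# Pellet was loaded at twice the volume of the supernatant (from the text),
# so pellet spectral counts are halved before computing SC/Mw.
PELLET_LOAD_FACTOR = 2.0


def sc_per_mw(spectral_count, mw_kda):
    """Spectral counts normalized by molecular weight (kDa assumed)."""
    return spectral_count / mw_kda


def fraction_abundances(pellet_sc, supernatant_sc, mw_kda):
    """Return (pellet SC/Mw after load correction, supernatant SC/Mw)."""
    pellet = sc_per_mw(pellet_sc / PELLET_LOAD_FACTOR, mw_kda)
    supernatant = sc_per_mw(supernatant_sc, mw_kda)
    return pellet, supernatant


# Hypothetical protein: 120 pellet spectra, 30 supernatant spectra, 30 kDa.
pellet, supernatant = fraction_abundances(120, 30, 30.0)
enriched_in_pellet = pellet > supernatant
print(pellet, supernatant, enriched_in_pellet)
```

Because the same Mw divides both fractions, the pellet-to-supernatant comparison for a single protein depends only on the load-corrected spectral counts.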
Samples were boiled for 10 min in SDS sample buffer (Bio-Rad) prior to electrophoresis on Criterion XT 12% SDS-PAGE reducing gels (Bio-Rad) in the presence of MOPS buffer (Bio-Rad). Proteins were visualized by staining with Coomassie blue. Gel lanes were divided into the following number of slices: acetic acid treatment of KZ tailless heads, 7 slices; purified heads, 17 slices; disrupted KZ tailless particles, 7 slices. Efforts were made to avoid transecting visible stained bands. Proteins in the gel slices were digested in situ with trypsin (Promega) and analyzed by HPLC-ESI-tandem mass spectrometry on either a Thermo Fisher LTQ mass spectrometer (acetic acid treatment of KZ tailless heads) or a Thermo Fisher LTQ Orbitrap Velos mass spectrometer (purified heads and disrupted KZ tailless particles). For the LTQ analyses, on-line HPLC separation of the digests was accomplished with an Eksigent NanoLC fitted with a PicoFrit™ (New Objective; 75 μm i.d.) column packed to 11 cm with C18 adsorbent (Vydac; 218MS 5 μm, 300 Å). A strategy was employed in which a survey scan was acquired followed by data-dependent collision-induced dissociation (CID) spectra of the seven most intense ions in the survey scan above a set threshold. For analyses on the Orbitrap, an Eksigent NanoLC-Ultra 2-D HPLC system was used with separation accomplished by a PicoFrit™ (New Objective; 75 μm i.d.) column packed to 15 cm with C18 adsorbent (Vydac; 218MS 5 μm, 300 Å). Precursor ions were acquired on the Orbitrap at a resolution of 60,000 (m/z 400). Data-dependent CID spectra of the six most intense ions in the survey scan were acquired from the linear trap while the precursor ion spectra were being collected. Mascot (Matrix Science; London, UK) was used to search the uninterpreted CID spectra against a locally-generated KZ protein database that had been concatenated with the SwissProt (version 51.6) database.
Methionine was considered as a variable modification and semi-trypsin was specified as the proteolytic agent. Determination of probabilities of protein identifications and cross correlation of the Mascot results with X! Tandem were accomplished by Scaffold (Proteome Software). The tandem MS results obtained from the digest of each gel slice were searched separately using Mascot, and the data files were either evaluated individually or combined into datasets for processing by Scaffold (using the “MudPIT” option).
An estimate of abundance for each protein in the purified head sample was made by dividing its spectral counts by its molecular weight (SC/Mw). To obtain the number of spectra assigned to the mature region(s) of the processed proteins, a “Spectrum report” was exported from Scaffold into Microsoft Excel, which was then saved as tab-delimited text. Spectra over defined regions of each protein were then summed using the Perl script MSsort (Thomas et al., 2010). The program may be obtained from S.C. Hardies through the website at http://biochem.uthscsa.edu/~hs_lab/scripting/MSprogs.html. Estimates of the copy number of each protein identified as abundant by SC/Mw were made by densitometry of the SDS-PAGE bands containing these proteins. Gels prepared with serially-diluted samples of purified KZ and the tailless mutant were analyzed using ImageJ software (Rasband, 1997–2011). Controls used for copy number estimation of the abundant proteins were the bands containing the major capsid (gp120) and the major sheath (gp29).
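The region-restricted SC/Mw estimate can be illustrated with a minimal Python sketch. This is not the MSsort script and does not reproduce its input format or counting rules. The peptide coordinates, the region boundaries, the mature molecular weight, and the rule that a spectrum counts only when its peptide lies entirely within the region are all assumptions made for illustration.

```python
def spectra_in_region(peptide_spans, region_start, region_end):
    """Count spectra whose peptide lies entirely within the given region.

    peptide_spans holds one (start, stop) residue pair per assigned spectrum,
    so a peptide observed twice appears twice.
    """
    return sum(
        1
        for start, stop in peptide_spans
        if start >= region_start and stop <= region_end
    )


def sc_per_mw(spectral_count, mw_kda):
    """Spectral counts divided by molecular weight (kDa assumed)."""
    return spectral_count / mw_kda


# Hypothetical processed protein: mature region spans residues 36-250,
# with an assumed mature molecular weight of 26 kDa.
spans = [(5, 15), (40, 52), (60, 75), (60, 75), (190, 204)]
mature_count = spectra_in_region(spans, 36, 250)
abundance = sc_per_mw(mature_count, 26.0)
print(mature_count, round(abundance, 3))
```

Restricting the count to the mature region keeps spectra from a removed propeptide from inflating the abundance estimate of the processed protein.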
Cryo-EM was performed using a Philips CM20-FEG electron microscope as described (Cheng et al., 2002).
Tomographic data collection and reconstruction were carried out as described (Harris et al., 2006). Briefly, a sample of 9AA treated particles was mixed 50:50 with a BSA gold suspension (Aurion, Electron Microscopy Sciences, Hatfield, PA), applied to a Quantifoil R2/2 holey carbon grid (Quantifoil, SPI, West Chester, PA) and plunge-frozen using a Vitrobot (FEI, Hillsboro, OR). Tilt series covering the range from −60° to +60° in 2° increments were collected at −4 μm nominal defocus, 38,500x magnification (7.8 Å pixel size), and a total electron dose per series of ~70 e− Å−2 using SerialEM (David N, 2005) on a Tecnai T12 electron microscope (FEI) equipped with a GIF 2002 energy filter (Gatan, Pleasanton, CA) operating in zero-loss mode. Reconstructions were calculated with IMOD (Kremer et al., 1996). Capsids were extracted from the tomogram as subvolumes and denoised by 30 iterations of anisotropic non-linear diffusion (Frangakis and Hegerl, 2001) as implemented in Bsoft (Heymann et al., 2008).
The gene encoding the full length form of gp93 was amplified by PCR using the primers 5′-GCGCCATGGCTCTACTTAAAATGCTAGACA-3′ and 5′-GCGGGTACCTTACTTCTCATCATTCTCAGG-3′. The gene encoding the mature form of gp175 was amplified with the primers 5′-GCGCCATGGAACCATATTCTAATTTTTCTGGT-3′ and 5′-GCGTCTAGATTACTCTAATGCAGGGGAATTCCATTTA-3′. Both genes were cloned using the NcoI and XbaI sites of the vector pHERD20T (Qiu et al., 2008), kindly provided by Dr Hongwei Yu, and transformed into Escherichia coli DH10B. Full length gp93 and mature gp175 were induced individually and the cells disrupted by freeze-thaw and lysozyme treatment. Cell lysates containing either gp93, gp175, or the two mixed together, were incubated for 30 minutes at 37°C. Samples were then spun in a benchtop centrifuge to separate soluble and insoluble fractions and the fractions assayed by SDS-PAGE. Only the soluble fractions are shown in Figure 5.
KZ and the KZ tailless mutant were generously provided by Dr. A. Fokine (Purdue University) and Dr. K. Miroshnikov (Russian Academy of Sciences), respectively. Pseudomonas aeruginosa PAO1 was kindly provided by Dr R. Ernst (Univ. of Maryland Baltimore). The pHERD20T vector was provided by Dr Yu (Marshall University). We thank Kevin Hakala and Sam Pardo for the excellent MS analyses that were conducted in the UTHSCSA Institutional Mass Spectrometry Laboratory. We thank Ru-ching Hsia (Univ. of Maryland Baltimore) for her help with TEM and Qin Dan for her technical support. We are extremely grateful to Dr. S.C. Hardies for allowing access to his collection of bioinformatics software and the UTHSCSA Bioinformatics Center for assistance with computational aspects of the project. This work was supported by NIH grant AI11676 to L.W.B. and by the Intramural Research Program of NIAMS.