The amino acid sequence of viral capsid proteins contains information about their folding, structure and self-assembly processes. While some viruses assemble from small preformed oligomers of coat proteins, other viruses such as phage P22 and herpesvirus assemble from monomeric proteins (Fuller and King, 1980; Newcomb et al., 1999). The subunit assembly process is strictly controlled through protein:protein interactions such that icosahedral structures are formed with specific symmetries, rather than aberrant structures. dsDNA viruses commonly assemble by first forming a precursor capsid that serves as a DNA packaging machine (Earnshaw, Hendrix, and King, 1980; Heymann et al., 2003). DNA packaging is accompanied by a conformational transition of the small precursor procapsid into a larger capsid for isometric viruses. Here we highlight the pseudo-atomic structures of phage P22 coat protein and rationalize several decades of data about P22 coat protein folding, assembly and maturation generated from a combination of genetics and biochemistry.
The morphogenic pathway of the T=7 Salmonella bacteriophage P22 involves the co-assembly of 415 molecules of monomeric coat protein (product of gene 5; gp5) with about 60–300 molecules of scaffolding protein (gp8) as well as some minor proteins (gp7, 16 and 20, referred to as ejection proteins) and the portal protein complex (gp1) into a procapsid (King et al., 1976; Prevelige and King, 1993) (Figure 1, Table 1). Scaffolding protein directs the assembly of the procapsid but is not found in mature phage, and so acts as a true assembly chaperone. Without scaffolding protein, coat protein assembles into T= 4 capsids and aberrant spiral structures. Spirals have pentameric capsomers located inappropriately, and not at icosahedral 5-fold vertices, so that closed procapsid structures do not form (Earnshaw and King, 1978; Lenk et al., 1975; Thuman-Commike et al., 1998). The minor ejection proteins are incorporated early in assembly, also through interactions with scaffolding protein (Greene and King, 1996). The dsDNA genome is actively packaged in an ATP-dependent fashion into the procapsid through the unique portal vertex (Bazinet and King, 1988) by the terminase complex (gp2 and gp3). Concomitant with DNA packaging, scaffolding protein exits from the procapsid to take part in additional rounds of assembly, and the capsid matures, resulting in a 10–15% increase in volume (Earnshaw, Casjens, and Harrison, 1976; Prasad et al., 1993). The dsDNA is stabilized by the addition of plug proteins (gp4, 10 and 26) that close the portal vertex, and finally tailspikes (gp9), the cell recognition and attachment proteins, are added (King, Lenk, and Botstein, 1973). None of the proteins are covalently modified or proteolyzed during folding and assembly.
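The figure of 415 coat protein molecules follows from simple lattice arithmetic. A minimal sketch, assuming the standard rule that a complete icosahedral shell contains 60T subunits and that the portal complex replaces one penton (five subunits) at the unique vertex; the function names are illustrative only:

```python
def capsomer_counts(t_number: int) -> tuple[int, int]:
    """Return (pentons, hexons) for a complete icosahedral lattice."""
    pentons = 12                      # one per icosahedral 5-fold vertex
    hexons = 10 * (t_number - 1)      # standard quasi-equivalence count
    return pentons, hexons


def coat_subunits(t_number: int, portal_vertices: int = 1) -> int:
    """Coat protein copies when some pentons are replaced by a portal.

    Assumption: each portal vertex removes exactly one penton (5 subunits).
    """
    pentons, hexons = capsomer_counts(t_number)
    return 5 * (pentons - portal_vertices) + 6 * hexons


print(coat_subunits(7))      # T=7 procapsid with one portal vertex
print(coat_subunits(4, 0))   # closed T=4 shell without a portal
```

For T=7 the complete lattice has 12 pentons and 60 hexons (420 subunits); removing one penton for the portal leaves 415.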
Phage P22 has long been a model used for understanding assembly mechanisms of dsDNA viruses (Capen and Teschke, 2000; Casjens, 1979; Casjens et al., 1985; Fuller and King, 1980; Fuller and King, 1981; Fuller and King, 1982; Prevelige, Thomas, and King, 1993; Prevelige, Thomas, and King, 1988). An important advantage of phage P22 as an assembly model is its simplicity. The proteins needed for assembling procapsid-like particles (PLP) of P22 can be purified and are active for assembly (Fuller and King, 1981). In vitro, only coat and scaffolding proteins are required to assemble a PLP (Fuller and King, 1980; Prevelige, Thomas, and King, 1988). Both in vivo and in vitro, the number of scaffolding protein molecules incorporated into a procapsid (or a PLP) varies, depending on the conditions of assembly (Casjens et al., 1985; Parent, Ranaghan, and Teschke, 2004; Prevelige, Thomas, and King, 1993). In vitro assembled PLPs have the same morphology and size as in vivo generated procapsids (Fuller and King, 1980). Thus, data generated from in vitro assembly experiments can be readily compared to in vivo experiments, allowing the rationalization of phage phenotypes due to coat variants with observable differences in folding and assembly of coat protein. A nearly complete structure of the coat protein is now available (Parent et al., 2010), which allows us to examine the structure/function relationship of the coat protein during the entire phage life cycle.
HK97 is currently the only example of a bacteriophage where near-atomic structures (better than 4 Å resolution, solved by X-ray crystallography) are available for both precursor and mature particles, allowing for a detailed understanding of the mechanisms of assembly and maturation (Gertsman et al., 2009; Wikoff et al., 2000). P22 (family Podoviridae) and HK97 (family Siphoviridae) have much in common in terms of capsid morphology, assembly and maturation (Johnson and Chiu, 2007). Although the classification into separate families suggests a distant evolution (especially evident in the dramatic differences in tail appendages), their capsid assembly pathways are generally analogous. Both phage have nearly icosahedral lattices that exhibit T=7 laevo symmetry, with a portal complex at a unique vertex. The structures of their respective coat proteins are highly homologous (Parent et al., 2010) and the “HK97-fold” is often considered to be a common ancestor for many dsDNA phage, including P22 (Bamford, Grimes, and Stuart, 2005).
In HK97, the formation of the precursor particle (Prohead I) occurs by co-polymerization of the major capsid protein (gp5), which exists in equilibrium between pre-assembled hexons and pentons, and a protease (gp4)(Duda et al., 1995a; Xie and Hendrix, 1995). Once assembly of Prohead I is complete, proteolysis of the first 103 residues (the Δ-domain, which functions as a scaffold for assembly) converts gp5 to gp5*, forming Prohead II (Duda et al., 1995b). Maturation is triggered by dsDNA packaging and proceeds through at least three expansion intermediates (EI I, II, and III) (Lata et al., 2000). HK97 morphogenesis culminates in formation of Head II, characterized by extensive chainmail crosslinking (Popa et al., 1991). The presence of these multiple, discrete structures has made HK97 a model system, as detailed information about maturation has been gleaned (Conway et al., 1995; Gan et al., 2006; Ross et al., 2006).
Although many general features of bacteriophage assembly are common to both systems, there are some important distinctions between P22 and HK97. The major differences include: 1) P22 assembles from monomeric coat proteins added one at a time through the aid of a separately expressed scaffolding protein, gp8 (Casjens and King, 1974; Prevelige, Thomas, and King, 1988); 2) P22 maturation does not involve proteolytic cleavage of coat protein; and 3) P22 capsids are not stabilized by chemical crosslinks. Though the mechanisms of subunit and capsid stabilization differ between P22 and HK97 (Parent et al., 2010), the common HK97-fold indicates that this building block is well suited to promote effective assembly of a capsid. In addition, as covered in detail in this review, P22 coat protein can tolerate variation and can adapt under various environmental pressures. The widespread functionality, and ability to tolerate insertions and adaptations may be the reason this protein fold is so prevalent in dsDNA tailed bacteriophage.
The first complete description of P22 head assembly used amber mutations in genes that code for head assembly products to determine which products would accumulate in the absence of each head protein (Botstein, Waddell, and King, 1973). Procapsids, filled phage heads, empty heads, and some aberrant structures were observed. These particles were further characterized using small angle X-ray scattering (SAXS) (Earnshaw, Casjens, and Harrison, 1976), which showed that procapsids are smaller in diameter (584 Å) than phage or empty heads (628 Å). Treatment of procapsids with 0.8% SDS caused them to expand to the diameter of phage heads. Expansion was postulated to be an important regulatory step for DNA packaging and scaffolding protein release. For the first time, in vitro assembly from purified subunits was shown to recapitulate in vivo procapsid assembly, resulting in PLPs with the same morphology and dimensions, but lacking the portal complex (Fuller and King, 1981; Prevelige, Thomas, and King, 1988). The first structures of empty procapsid shells (procapsids isolated from phage infected cells that are stripped of scaffolding protein, portal, and minor proteins) and DNA filled heads (phage without the tail machinery) examined by three-dimensional (3D) image reconstruction of vitrified samples to 28 Å, using icosahedral averaging, confirmed the differences in size and morphology that had been observed earlier (Prasad et al., 1993). The procapsid reconstruction revealed large holes at the center of the capsomers, where scaffolding protein was hypothesized to exit during DNA packaging. In addition, the reconstruction of the head showed clear conformational rearrangements that occurred during expansion, which closed the holes primarily through large shifts of coat protein mass.
Reconstruction of in vivo generated procapsids, which contain scaffolding protein, revealed that most of the scaffolding protein was not icosahedrally ordered within the procapsid (Thuman-Commike et al., 1996). However, a small region of scaffolding protein density could be seen on the interior surface of procapsids at the tips of some coat subunits. Scaffolding dimers were proposed to link coat subunits across the strict and quasi-three fold axes of symmetry, thereby causing interactions between subunits in adjacent capsomers and driving icosahedral assembly (Thuman-Commike et al., 1996; Thuman-Commike et al., 1999).
The first identification of P22 coat protein secondary structure was accomplished with subnanometer reconstructions of DNA filled heads (9.5 Å) and empty procapsid shells (8.5 Å), which had delineated subunit interfaces so that each coat subunit could be distinguished in the capsid lattices (a lattice refers to the arrangement of coat protein in the icosahedron) (Jiang et al., 2003). Several α-helices were assigned based on the density and revealed significant conformational changes and coat protein refolding during expansion. In these reconstructions, P22 coat protein was shown to have a fold similar to the HK97 capsid protein, in spite of the low sequence identity between the proteins (<10 %). Though this was a major improvement in resolution, and added to the growing knowledge that the HK97 fold is common to dsDNA tailed phage and viruses, threading the complete P22 protein sequence through the electron density was not yet possible.
More recently, two asymmetric reconstructions of phage P22 were completed (Chang et al., 2006; Lander et al., 2006). Asymmetric reconstructions differ from icosahedrally-averaged reconstructions in that no impositions of symmetry are made. This technique allows the visualization of nucleic acids and proteins that are not icosahedrally localized in a capsid such as dsDNA, the portal complex, ejection proteins and the tail machinery. Furthermore, these reconstructions revealed subunit details such as the symmetry mismatch between the portal complex and the capsid, highlighting structural interactions between P22 coat protein and portal subunits that were not visualized in maps produced from icosahedrally averaged reconstructions. However, these reconstructions (~17 Å resolution) were not at a high enough resolution to trace the backbone of coat protein.
One important prerequisite to attaining a pseudo-atomic structure by cryo-TEM is to have a pure, homogeneous sample. A conformational transition similar to the expansion caused by DNA packaging in vivo can be induced in vitro (Earnshaw, Casjens, and Harrison, 1976; Galisteo and King, 1993). During heat expansion the pentons are released but the overall structure of the particle corresponds to the mature form (Teschke, McGough, and Thuman-Commike, 2003). We took advantage of the high stability of the heat-expanded heads, which allowed us to remove all the minor proteins, leaving a very pure sample (Parent et al., 2010). The heat-expanded capsid is structurally simple, having coat protein only in hexon positions. Recently, structures of in vitro heat-expanded heads and empty procapsid shells were reported at resolutions of 7.0 Å and 9.1 Å, respectively (Parent et al., 2010).
The majority of the backbone of P22 coat protein (~91 %) was separately traced through electron density maps of both structures. To accurately guide the threading of the amino acid sequence through the electron density, landmarks were required. The amino-terminus (amino acids 1–42) was identified by difference map analysis of protease-treated compared to untreated expanded heads. Protease treatment removes the first 42 amino acids of coat protein (Parent et al., 2010). Other threading landmarks were generated via engineered cysteine variants within coat protein. Procapsids of these variants were labeled with 1.4 nm gold beads covalently modified with maleimide. The location of the electron-dense gold beads was readily observed by difference map analysis for three procapsid shell reconstructions. Together, these guideposts allowed us to use the density maps from expanded heads and procapsids along with iterative homology modeling to produce a pseudo-atomic model of coat protein before and after maturation (PDB entries 3IYI and 3IYH, respectively) (Parent et al., 2010).
The P22 capsid and procapsid subunit backbone structures are shown as ribbon diagrams next to HK97 subunits (Head II PDB entry 1OHG and Prohead II PDB entry 3E8K) (Figure 2a–d). In HK97, the domains are named according to their location in the capsomer. The A domain is axial to the center, the P domain is found at the periphery, the N-arm at the amino terminus, and the E-loop is the ‘elongated’ loop where the subunits undergo cross-linking (Wikoff et al., 2000). The corresponding regions of P22 coat protein are labeled in the convention established for the HK97 capsid protein (Gertsman et al., 2009); however, the cross-linking reaction does not occur in P22, and the E-loop is shorter. P22 coat protein has the HK97 fold for ~68% of the protein, which we will refer to as the “core” of P22 coat protein. The major structural difference between P22 and HK97 is a telokin domain inserted in the P22 coat protein (residues 223–359 as determined through homology modeling), which is the protrusion called the “extra density” domain in Jiang et al., 2003. Telokin is a member of the immunoglobulin-like (Ig) family of domains, which are found in many phage proteins, primarily as additions to either the capsid or tail proteins (Fraser, Maxwell, and Davidson, 2007; Fraser et al., 2006). We show the telokin domain of P22 coat protein next to the telokin domain from myosin light chain kinase (PDB entry 1FHG) (Figure 2e and 2f).
As mentioned above, several lines of evidence have indicated that there are conformational rearrangements and coat protein refolding during expansion (Jiang et al., 2003; Kang and Prevelige, 2005), and now we can see that these rearrangements primarily occur in the A-domain, the P-loop that connects the strands in the P-domain, and the N-arm. In the supplementary movie we compare the structure of the N-arm plus the P-loop and the A domain to highlight the conformational rearrangements that occur during expansion. During expansion, the first helix in the N-arm (residues 3 – 11) and the P-loop (residues 376 – 398) are drawn closer together by a combination of events where the helix moves to a higher radius, and the P-loop moves to a lower radius. The radii described are relative to the particle origin, where a lower radius refers to a position closer to the particle interior, and a higher radius is toward the outside of the particle. Together these make significant inter-subunit contacts, which likely contribute to the increased stability of the expanded head. In the A-domain (residues 127 – 222) there is a significant movement that closes the hole at the center of the capsomers, along with some refolding of the protein from coil to helix.
The observed conformational rearrangements must be sufficient to stabilize P22 capsids at strict and quasi three-fold axes, since P22 coat protein neither crosslinks in the fashion of HK97 (Wikoff et al., 2000) nor requires accessory capsid proteins, as seen with herpesviruses and phages T4 and lambda (Ishii, Yamaguchi, and Yanagida, 1978; Spencer et al., 1998; Yang et al., 2000). Stabilization upon maturation is likely needed in P22, since procapsids are labile enough that they slowly disassemble at low concentration over a period of days (Parent, Suhanovsky, and Teschke, 2007a). Additionally, monomers can exchange with assembled coat subunits, which suggests a relatively weak intersubunit interaction energy. In fact, the energy is only −6 to −7 kcal/mol (Parent, Zlotnick, and Teschke, 2006). Summing all of the coat subunit interactions in the procapsid leads to a robust lattice, with an overall energy of ~ −3000 kcal/mol (Parent, Zlotnick, and Teschke, 2006). Expansion locks the subunits at the quasi and strict three-fold axes and dramatically increases capsid stability, as dissociation of expanded heads has not been observed. The expanded capsid lattice even resists disruption in 7 M urea, indicating strong hexon:hexon interactions (Parent et al., 2010). However, the interaction energy between the penton subunits and their neighboring hexon subunits can be overcome with sufficient thermal energy, resulting in loss of pentons during heat expansion (Teschke, McGough, and Thuman-Commike, 2003). Complete heat denaturation of P22 empty capsids requires only slightly less energy than for empty HK97 heads; the Tm for WT P22 capsids is 82.5 °C (Capen and Teschke, 2000). Therefore, P22 is able to achieve high stability through maturation even in the absence of intersubunit crosslinks.
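To give a feel for how weak a contact of −6 to −7 kcal/mol is, the free energy can be converted to an apparent dissociation constant with the textbook relation K_d = exp(ΔG/RT). This is only an illustrative sketch: the temperature (20 °C), the 1 M standard state, and the function name are assumptions made here, not values or methods taken from the cited work.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_assoc: float, temp_k: float = 293.15) -> float:
    """Apparent K_d (in M, 1 M standard state) from an association
    free energy in kcal/mol. temp_k defaults to 20 C (an assumption)."""
    return math.exp(delta_g_assoc / (R_KCAL * temp_k))


for dg in (-6.0, -7.0):
    kd_micromolar = dissociation_constant(dg) * 1e6
    print(f"dG = {dg:+.1f} kcal/mol -> K_d of roughly {kd_micromolar:.0f} uM")
```

Under these assumptions a single contact corresponds to a micromolar-range dissociation constant, which is consistent with the observation that subunits can exchange with free monomers, while the large number of contacts in a complete lattice accounts for its overall robustness.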
Significant step-wise stabilization occurs as coat protein assembles from monomers to form procapsids, and then matures (Anderson and Teschke, 2003; Galisteo, Gordon, and King, 1995; Galisteo and King, 1993). One initial conformational change involves the A domain (residues 127–222). Amino acids 156–207 of coat protein are protease accessible in monomers and procapsids. The time required for digestion is much longer for procapsids indicating a change in accessibility to proteases and H/D exchange upon assembly (Capen and Teschke, 2000; Lanman, Tuma, and Prevelige, 1999; Teschke, 1999). However, the A domain becomes protease resistant and inaccessible to H/D exchange only once expanded to the mature form (Capen and Teschke, 2000; Kang and Prevelige, 2005; Lanman, Tuma, and Prevelige, 1999). The subunit structure in both procapsid and expanded heads agrees with Raman spectroscopy, which indicates only small changes in coat protein secondary structure during maturation (Tuma, Prevelige, and Thomas, 1998). The large hole at the center of the capsomers becomes occluded by the A-domain rearrangements in expansion (Jiang et al., 2003; Prasad et al., 1993).
Procapsids that have been cleaved in the A-domain with protease expand at a lower temperature than uncleaved procapsids (Lanman, Tuma, and Prevelige, 1999). These data show that the resulting decrease in conformational constraint of the A-domain allows expansion to occur more readily by lowering the activation energy for this process. Thus, we conclude that movement and refolding of the A-domain may be the rate-limiting step for capsid maturation.
Assembly of phage P22 procapsids occurs via the addition of coat monomers to the growing edge of the lattice (Fuller and King, 1982; Prevelige, Thomas, and King, 1988). Coat monomers have been extensively studied (Anderson and Teschke, 2003; Fuller and King, 1981; Teschke and King, 1993; Tuma et al., 2001). Assembly-competent monomers are generated by denaturation of empty procapsid shells, followed by extensive dialysis to refold the protein (Anderson and Teschke, 2003; Fuller and King, 1982; Prevelige, Thomas, and King, 1988). In the absence of scaffolding protein, coat protein mostly remains monomeric at low concentrations, though a very slow, uncontrolled assembly reaction does occur over a period of days at 20 °C (Suhanovsky, unpublished). The overall thermodynamic stability of native WT coat monomers, determined by equilibrium urea denaturation experiments, is only ~ −7 kcal/mol (Anderson and Teschke, 2003), which is rather low for a protein of its size (Fersht, 1999; Finn et al., 1992).
The pseudo-atomic model (Parent et al., 2010) provides an excellent approximation of the structure, but it is not an accurate, full atomic description of the coat protein. Nevertheless, it can be used to rationalize previous biochemical experiments. For example, four of the six tryptophans in the P22 subunit appear to be in quite similar environments, likely interacting with phenylalanines or other tryptophans (Figure 3b). W48, W61 and W410 may cluster in the P-domain. W354 (P-domain) is potentially in a hydrophobic environment with several phenylalanine and tyrosine residues in close proximity. W145 (A-domain) and W241 (telokin domain) are probably solvent exposed in monomers, and are therefore quenched and do not contribute to the fluorescence. Coat protein tryptophans have been shown by fluorescence lifetime measurements and Raman spectroscopy to reside in equivalent environments in folded monomers (Prevelige, King, and Silva, 1994; Tuma et al., 2001), which is consistent with the positions of the tryptophans in the assembled subunit. W48 and W410 appear to be involved in contacts between capsomers (Figure 3c and d), and we hypothesize that this explains the observed increase in fluorescence upon assembly (Teschke and King, 1993). W48 contributes to the entropic stabilization of procapsids, but not monomers, consistent with its location at a subunit interface (Prevelige, King, and Silva, 1994).
SAXS experiments revealed that coat protein monomers are ~ 100 Å long (Tuma et al., 2001). Since the end-to-end distance of the coat subunit in procapsids is ~ 87 Å (Parent et al., 2010), coat protein is only slightly more compact upon assembly. In addition, only 4 % of the peptide backbone is altered when monomers assemble, indicating that assembly does not induce major folding changes (Tuma et al., 2001). Together, these data suggest that the coat protein subunit structure as determined from procapsids is a reasonable approximation of the monomer structure. We will therefore use it to analyze previously collected data on coat monomer folding and assembly.
“Let the phage do the work” is a quote from Professor Jonathan King, an expert in phage P22 genetics and biochemistry. By this he meant that results from phage genetics could guide biochemical experiments and lead to an understanding of phage protein function. For example, over the last several decades many P22 coat protein variants have been investigated for their effects on folding and assembly. The first coat protein mutants were isolated during searches for mutants in all phage P22 genes and were used to elucidate the sequence of events that occur during phage morphogenesis (Botstein, Waddell, and King, 1973; Jarvik and Botstein, 1973; Jarvik and Botstein, 1975; King, Lenk, and Botstein, 1973). Phage were generally mutagenized with nitrosoguanidine or ultraviolet light and screened for conditional-lethal and lethal phenotypes: temperature-sensitive (conditional lethal, “ts”), cold-sensitive (conditional lethal, “cs”), and amber mutants (lethal, “am”). The position of each mutation was mapped by complementation to the P22 circular genome (Gough and Levine, 1968; Kolstad and Prell, 1969; Lew and Roth, 1970). Infections with these conditional-lethal or lethal mutants were then used to determine which assembly products accumulated. Thus, with phage carrying mutations in the late genes, which are involved in capsid assembly and completion, procapsids were shown to be precursors to DNA filled heads, followed by the addition of head completion proteins.
Eighteen amino acid substitutions in P22 coat protein result in a temperature-sensitive-folding (tsf) phenotype (Figure 4a). In vivo, tsf substitutions significantly reduce the yield of soluble coat protein at high temperatures because the newly synthesized tsf coat polypeptides aggregate and form inclusion bodies prior to reaching an assembly-competent state (Gordon et al., 1994; Nakonechny and Teschke, 1998). Thus, these positions are crucial for the proper folding of coat protein. Such tsf defects lead to a decrease in the rate and yield of capsid assembly both in vitro and in vivo. The folding and assembly of the tsf coat proteins can be rescued at high temperature in vivo by growing phage in cells that overproduce GroEL/S (Gordon et al., 1994; Nakonechny and Teschke, 1998).
In vitro, tsf coat protein monomers have altered secondary and tertiary structure, as well as increased surface hydrophobicity, as compared to WT coat protein (Doyle et al., 2003). The major effect of the tsf substitutions is a destabilization of the monomers, which rapidly “flicker” between the folded state and unfolding intermediates. Because of the increased population of intermediate species, when tsf coat protein monomers are incubated with GroEL in vitro they are bound efficiently (Doyle et al., 2003). The coat proteins in the binary complex showed conformations with substantial secondary and tertiary structure, similar to a late folding intermediate (de Beus, Doyle, and Teschke, 2000).
The relative positions of the known conditional lethal mutations are highlighted in the primary sequence, and in topology and ribbon diagrams (Figure 4a–d). The substitutions appear to be well distributed in the primary sequence (Figure 4a), but close inspection of the topology diagram and the folded subunit clearly shows that the majority of tsf substitutions cluster in the telokin domain (Figure 4b and 4c). Three other substitutions are found in either the P- or A-domains. In addition, three tsf substitutions near the C-terminus reside in a ‘β-hinge’ that connects the A- and telokin domains. This region comprises four anti-parallel β-sheets (Figure 4c, center light orange box).
Intragenic second site suppressors (su) of the tsf gene 5 mutations were isolated by plating tsf phage at elevated temperatures to identify additional residues involved in coat protein folding (Aramli and Teschke, 1999). Three global su substitutions (D163G, T166I, F170L) were repeatedly and independently isolated from different gene 5 tsf parents, scattered throughout the gene. The tsf:su coat proteins demonstrate both stabilized native and intermediate states as compared to their tsf parents, so that these proteins remain GroEL substrates even though their folding is less defective (Doyle et al., 2004; Parent and Teschke, 2007). The three global su substitutions are located in the β-hinge, which implicates this region as important for folding (Figure 4d).
The folding of WT coat protein monomers has been thoroughly investigated (Anderson and Teschke, 2003; Teschke and King, 1993). In equilibrium urea denaturation experiments, coat protein is incubated in 0–7 M urea until equilibrium is established, and the unfolding transition is monitored by circular dichroism (CD) at 222 nm for secondary structure and tryptophan fluorescence (trp) at 340 nm as a marker of tertiary structure. A three-state model best describes the reversible folding transition of coat protein: native (N) ↔ intermediate ensemble (I) ↔ unfolded ensemble (U). The equilibrium intermediate ensemble is characterized by the loss of secondary structure but native-like tryptophan fluorescence, which would typically suggest the existence of tertiary structure in the intermediate ensemble (Anderson and Teschke, 2003). Here this is more likely the result of clustering of tryptophans.
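The three-state equilibrium scheme can be made concrete with a short calculation of how the N, I and U populations shift with urea concentration. The sketch below assumes the commonly used linear extrapolation model, in which the free energy of each step decreases linearly with denaturant. The ΔG and m values are hypothetical placeholders chosen only so that the two steps sum to roughly 7 kcal/mol; they are not the fitted parameters from the cited studies.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def three_state_fractions(urea_molar, dg_ni, m_ni, dg_iu, m_iu, temp_k=293.15):
    """Return (f_N, f_I, f_U) for N <-> I <-> U.

    dg_* are free energies in water (kcal/mol), m_* are denaturant
    dependencies (kcal/mol/M). Linear extrapolation is an assumption.
    """
    rt = R_KCAL * temp_k
    k_ni = math.exp(-(dg_ni - m_ni * urea_molar) / rt)  # [I]/[N]
    k_iu = math.exp(-(dg_iu - m_iu * urea_molar) / rt)  # [U]/[I]
    total = 1.0 + k_ni + k_ni * k_iu
    return 1.0 / total, k_ni / total, k_ni * k_iu / total


# Hypothetical parameters for illustration only (not fitted values)
PARAMS = dict(dg_ni=3.0, m_ni=1.5, dg_iu=4.0, m_iu=1.0)

for urea in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 7.0):
    f_n, f_i, f_u = three_state_fractions(urea, **PARAMS)
    print(f"{urea:3.1f} M urea: N={f_n:.3f}  I={f_i:.3f}  U={f_u:.3f}")
```

In such a model, a probe that reports only on N (as CD does here, since I has lost secondary structure) and a probe that reports on N plus I (as tryptophan fluorescence does) give non-coincident transition curves, which is the usual signature of a populated equilibrium intermediate.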
When unfolded coat protein is diluted into buffer that promotes folding, 80% of the coat protein molecules follow a slow kinetic path. Kinetic folding experiments have demonstrated the existence of two kinetic intermediates (Doyle et al., 2003), so that coat protein must pass through four distinct states during folding (Figure 5). The U → I1 transition is observed in both trp fluorescence and CD experiments; 65% of the total trp fluorescence change occurs and 33% of the secondary structure is formed in this ‘burst phase’ (< 5 sec) leading from U to I1. I2 is a later kinetic intermediate, characterized by fully buried tryptophans but incomplete secondary structure, and is similar to the equilibrium intermediate ensemble. The I1 to I2 transition, which is observed by trp fluorescence, has a relaxation time of 60 sec at 20 °C. The I2 to N transition is characterized by a slow increase in secondary structure, with a relaxation time of ~ 370 sec (Anderson and Teschke, 2003). When bisANS, a hydrophobic dye, is used to monitor burial of hydrophobic patches during folding, there is a large burst (U → I1), followed by a slower continued increase in bisANS fluorescence (I1 → I2), and finally a very slow loss in bisANS binding (~800 sec), after which the native state (N) is formed. The I1 to I2 transition rate is consistent between bisANS and trp fluorescence, whereas the hydrophobic burial during I2 to N is slower than secondary structure formation (Anderson and Teschke, 2003; Doyle et al., 2003; Teschke and King, 1993).
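The time course implied by these relaxation times can be illustrated with a simple sequential model. The sketch below makes several simplifying assumptions that are not part of the original analysis: the burst phase is treated as complete at t = 0 (all molecules start in I1), each later step is treated as irreversible and first order with a rate constant equal to the inverse of the reported relaxation time (60 s and 370 s), and the minority of molecules that do not follow the slow path is ignored.

```python
import math


def sequential_populations(t, tau1=60.0, tau2=370.0):
    """Populations (I1, I2, N) at time t (s) for I1 -> I2 -> N.

    Assumes irreversible first-order steps and tau1 != tau2.
    """
    k1, k2 = 1.0 / tau1, 1.0 / tau2
    i1 = math.exp(-k1 * t)
    # Standard analytic solution for the middle species of A -> B -> C
    i2 = k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    n = 1.0 - i1 - i2
    return i1, i2, n


def i2_peak_time(tau1=60.0, tau2=370.0):
    """Time (s) at which the I2 population is maximal."""
    k1, k2 = 1.0 / tau1, 1.0 / tau2
    return math.log(k1 / k2) / (k1 - k2)


for t in (0, 30, 60, 120, 300, 600, 1200, 3000):
    i1, i2, n = sequential_populations(t)
    print(f"t={t:5d} s  I1={i1:.3f}  I2={i2:.3f}  N={n:.3f}")
print(f"I2 peaks near t = {i2_peak_time():.0f} s")
```

With these assumptions I2 accumulates transiently before N dominates, which is the qualitative behavior expected when formation of the native state is the slowest step.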
The telokin domain has only one tryptophan residue (W241), which is probably solvent exposed, suggesting that changes in tryptophan fluorescence are not related to telokin domain folding (Figure 3). Three out of the six tryptophans (241, 354, and 410) in coat protein are included in the C-terminal half, which is largely composed of β-sheets and flexible loops. Conversely, the HK97-like core of the P22 coat protein has all of the α-helices and should contribute most strongly to the CD signal at 222 nm. Refolding experiments using separately expressed halves of P22 coat protein (N-terminal 1–190, and C-terminal 191–430) showed that only the C-terminal half, which contains the telokin domain, was capable of refolding independently, although it formed dimers rather than monomers (Kang and Prevelige, 2005). Based on this result, the C-terminal half was suggested to be the folding scaffold for the entire subunit (Kang and Prevelige, 2005), which is consistent with our folding experiments.
The majority of the tsf mutants are located in the telokin domain. The tsf substitutions generally affect the conformation of the folded monomers so that they appear to have more β-sheet or random coil than WT coat protein when monitored by CD (Doyle et al., 2004; Galisteo, Gordon, and King, 1995; Parent, 2007; Teschke and King, 1995). A three-state equilibrium folding model best describes the folding of the tsf coat variants, like WT coat protein. The major effect of the tsf substitutions on folding is to dramatically destabilize the I ↔ N transition as measured by equilibrium experiments, but the stability of the U ↔ I transition is also affected, so that the native state is not well-populated (Doyle et al., 2004; Doyle et al., 2003; Parent, 2007).
Folding kinetics of the A108V, D174N, and F353L tsf coat proteins, whose substitutions are located in the HK97 core of the coat subunit, showed that the only secondary structure generated occurs in the burst phase, though even this is destabilized by the substitutions. (Homology modeling assigned F353 to the telokin domain, but closer inspection of the folding data presented here more logically places F353 in the β-hinge.) The I2 to N transition is so destabilized for these protein variants that the N state is only transiently populated and flickers between N, I2, and I1 (Doyle et al., 2004). Thus, these proteins are very aggregation prone and dependent on GroEL/S for folding (Gordon et al., 1994; Nakonechny and Teschke, 1998; Teschke and King, 1995).
S223F and G232D are the only two tsf mutants in the telokin domain that have been thoroughly investigated (Doyle et al., 2004; Doyle et al., 2003). Their native states are destabilized compared to WT coat protein in equilibrium denaturation experiments. However, helix formation in the I2 to N transition for S223F is observable in folding kinetics measured by CD. This is unlike tsf mutants in the core, where only the CD change in the burst phase is observable. Other tsf mutants in the telokin domain (P238S, S262F, and V300A) also show CD folding kinetics, as does G232D in a recent reanalysis of folding data (Teschke, unpublished data). However, these proteins still rapidly unfold from N back to intermediates, consistent with a destabilized native state. These data suggest that if the tsf substitutions are in the telokin domain, the ability to fold the core is retained, but if the amino acid substitutions are in the core, there is a more severe folding defect.
All of our kinetic and equilibrium folding data of WT, tsf variants, and the tsf:su variants (discussed below) are consistent with the notion that coat protein folds as two autonomous domains: the telokin domain and the core (Figure 5). Based on our experiments with the tsf coat proteins, we propose the following model. First, our data suggest that the telokin domain is the folding nucleus, since this part of the protein has the majority of the tsf substitutions, it is able to fold autonomously, and amino acid substitutions in this domain still allow for final formation of helical secondary structure in the core of the protein. Our data are consistent with telokin folding in the burst phase of the folding reaction, along with burial of some of the tryptophans in the core (48, 61, 354, 410), followed by formation of α-helices as the rate-limiting step in the reaction. When there are amino acid substitutions in the telokin domain, the rest of the protein has difficulty achieving the native state, causing the tsf variants to be aggregation-prone. These data are consistent with the observation that the tsf coat proteins induce GroEL/S and require GroEL/S for folding at high temperature (Gordon et al., 1994; Nakonechny and Teschke, 1998; Parent, Ranaghan, and Teschke, 2004).
As mentioned previously, the major conformational changes that occur during expansion are reorganization of the A-domain and movement of the N-arm and P-loops. WT procapsids expand with what appears to be simple first-order kinetics, without indication of any populated intermediate, which is consistent with a cooperative reaction (Capen and Teschke, 2000). However, much evidence suggests that there are multiple steps during expansion. For instance, some tsf variants exhibited sigmoidal expansion kinetics, which indicates that capsid expansion is at least a two-step process (Capen and Teschke, 2000). The tsf mutants that affect the kinetics of expansion are distributed throughout the protein. These data suggest that conformational information must be communicated between different regions of the protein prior to expansion (step 1). Consistent with the data presented thus far is the possibility that the reorganization and movement of the A domain is the rate-limiting intermediate step of the reaction during expansion (step 2), which is followed by the physical expansion in volume (step 3).
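The inference from sigmoidal kinetics to a multi-step process can be illustrated with a generic kinetic sketch. The Python code below compares a single irreversible first-order step with two sequential irreversible first-order steps (A → B → C). The scheme, the function names, and the rate constants are assumptions made for illustration only and are not fitted to the expansion data discussed here.

```python
import math

def one_step_fraction(t, k):
    """Fraction converted at time t for a single first-order step A -> C."""
    return 1.0 - math.exp(-k * t)

def two_step_fraction(t, k1, k2):
    """Fraction of final product C at time t for A -> B -> C.

    Both steps are treated as irreversible and first order (an assumption).
    """
    if math.isclose(k1, k2):
        # Limiting form when the two rate constants are equal.
        return 1.0 - math.exp(-k1 * t) * (1.0 + k1 * t)
    return 1.0 - (k2 * math.exp(-k1 * t) - k1 * math.exp(-k2 * t)) / (k2 - k1)

# Hypothetical rate constants (arbitrary time units).
early_single = one_step_fraction(0.1, k=1.0)
early_double = two_step_fraction(0.1, k1=1.0, k2=1.0)   # shows a lag
fast_second = two_step_fraction(1.0, k1=1.0, k2=1000.0)  # looks first order
```

When the two steps have comparable rates, product formation starts with zero slope and shows a lag, which gives a sigmoidal progress curve. When one step is much faster than the other, the curve becomes indistinguishable from simple first-order kinetics, so apparent first-order behavior does not by itself exclude additional steps.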
Evidence for the second step in expansion comes from the observation that amino acid substitutions in the A domain both enhance and inhibit the capsid maturation reaction. Empty procapsid shells with tsf substitutions at one position (D174N or D174G) show sigmoidal expansion kinetics and a decrease in the temperature required for complete expansion as compared to WT shells. Thus, there must be a lower activation energy for the rate-limiting step, which is consistent with our hypothesis that A domain movements predominantly affect the rate-limiting step in expansion. Conversely, procapsids of coat su variants (D163G, T166I and F170L) do not display sigmoidal kinetics and expand more slowly than WT even at high temperatures. This suggests an increased energy requirement for expansion. Indeed, empty procapsid shells of these variants are sometimes disrupted rather than smoothly transitioning to the expanded state (Parent, Suhanovsky, and Teschke, 2007b). Though D174 occurs near the global su sites, it resides in one of the A-domain helices, whereas all three global su residues are in the β-hinge that connects the core of the protein and the telokin domain. This strand undergoes significant movement in response to the refolding of the A domain. We propose that this movement is a prerequisite for expansion. These data therefore suggest that there may be as many as three discrete events during expansion: communication between the domains that expansion is occurring, refolding of the A domain, and movement of the strands in the hinge. In WT procapsids, these events may occur concomitantly or in rapid succession.
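The link between a lower expansion temperature and a lower activation energy can be made explicit with a simple Arrhenius sketch. The Python code below assumes that the rate-limiting step follows the Arrhenius equation with the same pre-exponential factor for both shells, which is an assumption of this illustration and not something established by the data. The activation energies, prefactor, and target rate are hypothetical.

```python
import math

R_J = 8.314  # gas constant in J/(mol*K)

def arrhenius_rate(ea, temp_k, prefactor=1.0e13):
    """Rate constant k = A * exp(-Ea / (R*T)); Ea in J/mol, T in kelvin."""
    return prefactor * math.exp(-ea / (R_J * temp_k))

def temperature_for_rate(ea, rate, prefactor=1.0e13):
    """Temperature at which the Arrhenius rate constant equals `rate`."""
    return ea / (R_J * math.log(prefactor / rate))

# Hypothetical activation energies for two kinds of shell.
target_rate = 1.0e-3                                      # arbitrary target, s^-1
t_reference = temperature_for_rate(100.0e3, target_rate)  # higher barrier
t_variant = temperature_for_rate(90.0e3, target_rate)     # lower barrier
```

Under these assumptions, the shell with the lower barrier reaches the same rate at a lower temperature, which is the direction of the effect described for the D174 variants.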
The su substitutions D163G, T166I, and F170L were determined to be global suppressors, i.e., each suppresses the phenotype of several tsf mutants in coat protein, which are distributed throughout the sequence. All these su substitutions occur in a β-strand and turn in the β-hinge region (Figure 4d). Each global su substitution remedies the folding defects of multiple tsf parents by a common mechanism. Yet, the three global su substitutions do not function the same way. For example, T166I and D163G both fix the original tsf defects by a shared mechanism. However, F170L corrects tsf protein folding defects by a second mechanism (Aramli and Teschke, 2001; Doyle et al., 2004; Parent and Teschke, 2007).
T166I and F170L have been examined extensively in the context of several tsf substitutions (denoted, for example, tsf:T166I). T166I is the most frequently isolated suppressor substitution (Aramli and Teschke, 1999). Although stabilization of the native state is the obvious and most frequently observed mechanism by which su substitutions work (Sekijima et al., 2006; Van der Schueren, Robben, and Volckaert, 1998), surprisingly the T166I substitution actually increases monomer aggregation (Aramli and Teschke, 2001). In equilibrium and kinetic folding experiments, the T166I substitution increases the population of the intermediates. Consistent with these data, the tsf:T166I proteins increase expression of GroEL/S even compared to the tsf parents (Parent, Ranaghan, and Teschke, 2004). However, the S223F:T166I coat protein more readily folds to N (Doyle et al., 2004). We hypothesized that the increase in the population of intermediate species allowed for more effective interactions with GroEL/S in vivo. Not only GroEL/S but also scaffolding protein is essential for successful in vivo folding and assembly of the tsf:T166I proteins (Parent, Ranaghan, and Teschke, 2004). The small population of tsf:T166I coat proteins that fold to the native state are trapped in procapsids by scaffolding protein. GroEL/S is unable to rescue the folding of the tsf:T166I coat proteins when there are decreased levels of intracellular scaffolding protein. The D163G substitution appears to function by the same mechanism as T166I in vivo (Parent and Teschke, 2007).
The F170L su substitution functions via a completely different mechanism than the T166I substitution. The addition of F170L to tsf coat proteins causes stabilization of the native state (Parent and Teschke, 2007). Unlike the tsf:T166I coat proteins, expression of tsf:F170L coat proteins does not induce GroEL/S, though there remains a requirement for the chaperone complex. Tsf:F170L coat proteins are resistant to unfolding at elevated temperatures, where tsf:T166I coat proteins completely aggregate (Aramli and Teschke, 2001).
If the su substitutions (D163G, T166I, F170L) promote folding, and these mutants can assemble into infectious phage, then why have these amino acids not become part of the WT coat protein sequence? We investigated this question by generating the su substitutions on a plasmid that expresses coat protein (Parent, Suhanovsky, and Teschke, 2007b). The individual, double (D163G:T166I, D163G:F170L, and T166I:F170L), and triple (D163G:T166I:F170L) mutants were all generated. The phenotypes were determined by trans complementation using a phage that is unable to produce coat protein. The single su substitutions showed no phenotype, but the double and triple variants had cs phenotypes, with the triple mutant being extremely cs and unable to support phage growth even at 25 °C. Hence, su substitutions likely failed to become part of the WT sequence owing to a negative consequence for phage production in certain conditions.
The products of in vivo assembly of the su coat proteins were visualized by electron microscopy. All mutants led to the formation of aberrant products (primarily spiral forms and broken procapsids) at 16 °C. Spirals result when the placement of the five-fold vertices is inaccurate (Earnshaw and King, 1978). In vitro assembly reactions recapitulate in vivo results except that the F170L coat variant also forms tubes, called “polyheads” (Parent, Suhanovsky, and Teschke, 2007b). In the absence of scaffolding protein, more F170L coat protein forms polyheads (Suhanovsky, et al., unpublished data). Image reconstruction of vitrified F170L polyheads demonstrates that they are solely composed of hexons (Parent et al., unpublished data). We propose that interaction with scaffolding protein during assembly can alter the conformation of F170L coat protein monomers so that subunits at the pentameric vertices can be added properly.
Three P22 cs coat mutants, T10I, R101C, and N414S, have been well characterized. R101C and N414S were isolated in general searches for cs mutants, whereas T10I was isolated as a suppressor of the coat tsf mutant, F353L (Jarvik and Botstein, 1973; Jarvik and Botstein, 1975). The cs mutants cause assembly problems, but do not negatively affect folding (Fong, Doyle, and Teschke, 1997; Gordon, 1993; Teschke and Fong, 1996). Cs coat proteins are able to fold and assemble at low temperatures but the infections are unable to progress beyond procapsids, owing to altered interaction with scaffolding protein (Gordon, 1993; Teschke and Fong, 1996). In vivo, N414S binds scaffolding protein tightly enough that DNA packaging is precluded, as scaffolding protein cannot exit the procapsids. Conversely, T10I and R101C coat proteins interact weakly with scaffolding protein, observed both as scaffolding protein leaking from procapsids and the inability to assemble PLPs in vitro. However, these mutants interact with scaffolding at least on a basal level to incorporate the portal protein complex (Gordon, 1993; Teschke and Fong, 1996). Weak interaction with scaffolding protein does not explain the inability of these mutants to package DNA in vivo. We propose that these substitutions affect the ability of the N-arm to move appropriately during maturation (Figure 6).
The pseudo-atomic model of the coat subunit can easily explain T10I and R101C phenotypes (Figure 6). The N-arm and P-loop have been hypothesized to be important for scaffolding protein binding (Parent et al., 2010). The coat-binding domain of scaffolding protein is rich in basic amino acids (Tuma et al., 1998). Intriguingly the N-arm of P22 coat protein is very acidic, which suggests this may be a site to which scaffolding protein binds. Conformational changes in the N-arm and P-loop during maturation conceal some of the electrostatic residues and probably encourage scaffolding protein release through the holes at the center of the capsomers. The N414S substitution is more difficult to rationalize since the location of this residue is quite distant from the suggested sites of scaffolding interaction.
The recently published pseudo-atomic model of P22 coat protein has allowed us to interpret our folding and assembly data. The telokin domain appears to have been added and maintained during evolution because it stabilizes monomers. Indeed, it is likely the folding nucleus and it is the site of many tsf mutants, which cause protein aggregation at high temperature. The C-terminal half of the protein, containing the telokin domain, can fold autonomously. It will be revealing to investigate the folding and stability of the isolated telokin domain to determine if its folding is rapid compared to the whole protein.
The su substitutions identify an additional region of coat protein that is important for folding. We propose that the formation of the β-hinge is the final “lock” for stabilizing the folded state and that is why substitutions in that region affect folding, both positively and negatively. The β-hinge is composed of regions N- and C-terminal to the telokin domain, by which we infer that the whole protein must be post-translationally folded. This is consistent with a hypothesis that the C-termini of phage proteins are likely to be critical for folding to prevent misassembly of defective or incomplete proteins, i.e., only completely synthesized proteins can fold and assemble (Casjens et al., 1991). Additionally, the regions in coat protein without known mutants may also indicate areas that are important for folding. For instance, the carboxyl terminal half of coat protein has 14 tsf sites, while the amino terminal half has only three, which may indicate that the formation of the helices in the P-domain is crucial for folding. In the past, we have used phage genetics to identify crucial sites in the protein, but now the recent structures will provide the information to directly test how different regions of coat protein affect folding or assembly.
Finally, the mutants that have been isolated over the years have provided a framework to validate the newly released structures, which have proven consistent with the biochemical and genetic data of many labs. To ‘Let the phage do the work’ in determining the important regions in coat protein has truly been an instructive and productive approach.
This movie was created with the Yale Morph Server (http://molmovdb.org/) (Flores et al., 2006), which creates by linear interpolation a series of hypothetical snap-shots of the capsid structure from the known procapsid and expanded head models (Parent et al., 2010). “D” subunits from neighboring hexons are distinguished by different colors, with the N-arms of each chain in darker shades of the same colors. The P-loops are shown in black. The top panel represents a view from outside the capsid, along a strict 3-fold axis of symmetry. The bottom panel shows a side view of what is displayed in the top panel.
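For readers unfamiliar with morphing, the idea of generating hypothetical snapshots by linear interpolation can be sketched in a few lines of Python. This is only an illustration of the interpolation principle as described in the legend. It is not the morph server's code, and the function name and the toy coordinates are hypothetical.

```python
def interpolate_frames(start, end, n_frames):
    """Linearly interpolate between two sets of matched atomic coordinates.

    start, end: lists of (x, y, z) tuples for the same atoms in the same
                order (for example, procapsid and expanded-head models).
    n_frames:   number of snapshots to return, including both end states.
    """
    if len(start) != len(end):
        raise ValueError("start and end must contain the same atoms")
    if n_frames < 2:
        raise ValueError("need at least the two end states")
    frames = []
    for i in range(n_frames):
        fraction = i / (n_frames - 1)  # 0.0 at start, 1.0 at end
        frame = [
            tuple(a + fraction * (b - a) for a, b in zip(p, q))
            for p, q in zip(start, end)
        ]
        frames.append(frame)
    return frames

# Toy two-atom example with made-up coordinates.
procapsid_like = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
expanded_like = [(10.0, 0.0, 0.0), (3.0, 2.0, 7.0)]
snapshots = interpolate_frames(procapsid_like, expanded_like, n_frames=5)
```

Because the intermediate frames are purely geometric, they should be read as hypothetical snapshots between two known end states and not as observed intermediates.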
We would like to thank Dr. Jonathan King for inspiring much of the work presented in this paper, and the rest of the P22 community. We also thank Dr. Timothy S. Baker and members of the Teschke lab for critical reading of the manuscript and helpful discussion. This work was supported by R01 GM76661 to CMT and by NIH fellowship F32A1078624 to KNP. KNP was also supported in part by R37 GM33050 to TSB.