Bacteriophage P22 is believed to contain a total of 521 copies of 9 different proteins and a 41,724 base pair genome. Despite its enormous size and complexity, phage P22 can be electrosprayed, and it remains intact in ultra-high vacuum where its molar mass distribution has been measured.
Phage P22 virions were generated by complementation in Salmonella enterica and purified. They were transferred into 100 mM ammonium acetate and then electrosprayed. The masses of individual virions were determined using charge detection mass spectrometry.
The stoichiometry of the protein components of phage P22 is sufficiently well-known that the theoretical molar mass can be determined to within a narrow range. The measured average molar mass of phage P22, 52,180±59 kDa, is consistent with the theoretical molar mass and supports the proposed stoichiometry of the components. The intrinsic width of the phage P22 mass distribution can be entirely accounted for by the distribution of DNA packaged by the headful mechanism.
At over 50 MDa, phage P22 is the largest object with a well-defined molar mass to be analyzed by mass spectrometry. The narrow measured mass distribution indicates that the virions survive the transition into the gas phase intact.
A characteristic of a molecule is that it has a well-defined molar mass. This is true for large molecules like proteins and even some large protein complexes.[1,2,3,4,5,6,7,8] A striking example of the latter is provided by the recent measurement of the molar mass of the Prohead-1gp5 of the virus HK97 by mass spectrometry. In this system, the capsid protein (gene product 5, or gp5) was expressed from a plasmid in the absence of the viral protease and as such, the empty immature virus proheads consist of exactly 420 copies of the capsid protein. Each HK97 capsid protein has a molar mass of 42,154 Da, leading to an overall theoretical molar mass for the Prohead-1gp5 of 17,704,680 Da.
With increasing size comes the possibility of heterogeneity. While there is no intrinsic heterogeneity in the HK97 Prohead-1gp5 mentioned above, in general heterogeneity in biological assemblies can arise from many sources, including variations in the number and composition of lipids, sugars, proteins and nucleic acids. Measuring the masses of heterogeneous objects by mass spectrometry is challenging. However, the measured mass distributions yield not only the average molar mass but also the dispersion, which can provide important information about compositional variability. Such information cannot be obtained from ensemble methods which provide only average values. A number of recent studies have demonstrated the value of measuring the masses of large objects, such as intact viruses.[7,8,9,10,11,12,13,14] For example, mass spectrometry has been used to resolve heterogeneity in recombinant adeno-associated virus (AAV) capsids, and examine genome packaging in AAV gene therapy vectors.
Here, we use charge detection mass spectrometry[15,16,17,18] (CDMS) to accurately measure the mass of infectious bacteriophage P22, a somewhat heterogeneous virus with a theoretical average molar mass of 51,613,585 Da. At >50 MDa and with 521 proteins, it is a remarkably massive and complex species. While masses have been measured for larger objects than phage P22 (including droplets, polymers, and nanoparticles) using specialized methods,[19,20,21,22,23,24,25] in those cases the molar mass is not well defined and a theoretical molar mass cannot be calculated; the measured mass distributions are very broad and depend on the conditions used to prepare the sample. On the other hand, biological assemblies generally have a defined composition and mass spectrometry can give important insight into their stoichiometry and heterogeneity. Phage P22 is by far the largest object with a well-defined molar mass to be analyzed by mass spectrometry.
P22 is a Salmonella-infecting bacteriophage which is widely studied as a model for tailed, double-stranded DNA (dsDNA) phages, the most common class of all viruses. The structure and assembly of P22 are fairly well understood. Figure 1 shows the components of the mature P22 virion. P22 first assembles into a T=7 icosahedral procapsid containing 415 coat proteins (gp5), a dodecameric portal protein (gp1) complex, which takes the place of one of the twelve vertices of the icosahedral capsid, ~100 copies of the assembly chaperone scaffolding protein (gp8), and multiple copies each of the ejection proteins gp7, gp16, and gp20. The DNA is packaged into the capsid in an ATP-dependent fashion through a channel in the portal complex by the terminase complex (gp2 and gp3), which is not incorporated into the mature phage. DNA is actively packaged by the headful mechanism[28,30] from a long concatemer of P22 DNA in the cell. It is spooled into the capsid until the capsid is full, and then the DNA is cleaved. Slightly more than one genome is packaged per phage, so the ends of the DNA are redundant. During DNA packaging, the capsid undergoes an expansion, and gp8 leaves through pores in the capsid. Plug proteins (12 copies of gp4 and 6 copies of gp10) and tail needle proteins (3 copies of gp26) are added to hold the DNA inside the capsid. Finally, tailspike proteins (18 copies of gp9) are added for cell recognition and attachment. The mature phage P22 therefore contains 9 different proteins, each with its own copy number, along with a distribution of DNA. We show below that its theoretical mass can be calculated to within a fairly narrow range.
The phage P22 was generated by complementation in vivo. See supplemental information for details. CDMS measurements show that authentic phage P22 and phage P22 generated by complementation have the same masses. Samples for electron microscopy were prepared by applying phage onto 300-mesh carbon-coated copper grids (Electron Microscopy Sciences). The grids were stained with 1% uranyl acetate and visualized by an FEI Tecnai G2 Spirit BioTwin Transmission Electron Microscope as described elsewhere. Figure 2 shows a TEM micrograph of infectious phage P22. The brightness of the particles indicates that they are filled, in this case with DNA and some protein. The phage diameter is ~65 nm, as expected. The claw-like tail machinery is visible on most particles.
CDMS is a single-molecule technique where the m/z ratio and charge of individual ions are measured simultaneously, and then multiplied to give the mass. This process is repeated for thousands of ions to obtain a mass spectrum. The CDMS instrument[33,34,35] and data analysis methods[36,37] used in this work have been described in detail. Briefly, positive ions are generated by nano-electrospray. They are transmitted through several differentially-pumped regions containing RF ion guides, where the pressures are adjusted to optimize the signal. The ions are then accelerated through a 100 V potential difference, and focused into a dual hemispherical deflection analyzer (HDA). The HDA only transmits ions with a narrow band of kinetic energies centered on 100 eV/z. Finally, the ions pass into an ultra-high vacuum region containing an electrostatic ion trap with a charge detector tube embedded in it. When an ion enters the detector tube, it induces a charge that is picked up by a charge-sensitive preamplifier. The ion oscillates back and forth in the trap and passes through the detector tube many times which greatly improves the charge precision. A trapping time of 91 ms was employed. The output from the preamplifier is digitized and analyzed with a Fortran program using fast Fourier transforms (FFTs). The signals for ions that are not trapped for the full period are discarded. The m/z is inversely proportional to the square of the ion’s fundamental frequency in the trap, and the charge is proportional to the sum of the fundamental and second harmonic peak magnitudes in the FFT. Thus, the FFT provides both the m/z and the charge measurement of each ion, which are multiplied to obtain mass. For CDMS, the phage P22 was transferred into 100 mM ammonium acetate by spin-column SEC (Bio-Rad Laboratories, Inc.).
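The per-ion calculation described above can be sketched in a few lines of Python. This is only an illustration of the stated relationships (m/z inversely proportional to the square of the fundamental frequency, charge proportional to the summed FFT magnitudes); the calibration constants `CAL_MZ` and `CAL_CHARGE`, the function name, and the example inputs are hypothetical and are not taken from the instrument or the authors' Fortran program.

```python
# Sketch of the per-ion CDMS mass calculation. The FFT step is assumed to
# have already produced the fundamental frequency and the two peak magnitudes.

CAL_MZ = 1.0649e13   # hypothetical calibration: (Da/e) * Hz^2, so m/z = CAL_MZ / f^2
CAL_CHARGE = 1.0     # hypothetical calibration: elementary charges per unit FFT magnitude


def ion_mass(fundamental_hz, mag_fundamental, mag_second_harmonic):
    """Return (m/z in Da/e, charge in e, mass in Da) for one trapped ion."""
    mz = CAL_MZ / fundamental_hz ** 2
    charge = CAL_CHARGE * (mag_fundamental + mag_second_harmonic)
    return mz, charge, mz * charge


# Illustrative inputs chosen to land near the phage P22 peak.
mz, z, mass = ion_mass(10_000.0, 400.0, 90.0)
print(f"m/z = {mz:.0f} Da/e, z = {z:.0f} e, mass = {mass / 1e6:.2f} MDa")
```

Repeating this calculation for every ion that survives the full trapping period and collecting the resulting masses gives the mass spectrum.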
Before presenting the measured mass spectrum of the phage, we tabulate the expected mass. The masses and stoichiometry of the components are given in Table 1. The P22 genome is linear dsDNA that is 41,724 base pairs long with a known sequence. The number of redundant base pairs as a consequence of headful packaging described above has been reported to range from 850 to 2350. We use an average value of 1600 additional base pairs to give the mass of the dsDNA in Table 1. The stoichiometry of most of the proteins is precisely known;[32,40] the main source of uncertainty is the ejection proteins (gp7, gp16, and gp20). A variety of estimates have recently been given: 6–20 copies of each ejection protein, and 12–20 copies. The lower end of these ranges is incompatible with the measured mass of the intact phage. The most recent estimate of 11, 12, and 32 copies of gp7, gp16, and gp20, respectively, is used in Table 1. Adding the masses of all the components together, we calculate the average total mass of the intact P22 virion as 51,613 kDa (Table 1).
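As a bookkeeping check on the stoichiometry quoted in the text, the short Python sketch below sums the protein copy numbers and the packaged DNA length. The copy numbers are those stated above, with the portal read as 12 copies of gp1. The individual protein masses live in Table 1 and are not reproduced here, so no total mass is computed.

```python
# Copy numbers as stated in the text; the ejection-protein values are the
# most recent estimate adopted for Table 1.
PROTEIN_COPIES = {
    "gp5 (coat)": 415,
    "gp1 (portal)": 12,   # assumes the portal complex contains 12 copies
    "gp4 (plug)": 12,
    "gp10 (plug)": 6,
    "gp26 (tail needle)": 3,
    "gp9 (tailspike)": 18,
    "gp7 (ejection)": 11,
    "gp16 (ejection)": 12,
    "gp20 (ejection)": 32,
}

GENOME_BP = 41_724
REDUNDANT_BP_RANGE = (850, 2350)  # reported range from headful packaging

total_proteins = sum(PROTEIN_COPIES.values())
mean_redundant_bp = sum(REDUNDANT_BP_RANGE) // 2   # midpoint of the range
packaged_bp = GENOME_BP + mean_redundant_bp

print(f"{len(PROTEIN_COPIES)} proteins, {total_proteins} copies in total")
print(f"average packaged DNA: {packaged_bp} bp")
```

The sum reproduces the 521 protein copies quoted for the virion, and the midpoint of the redundancy range gives the 1600 additional base pairs used for the DNA mass.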
Figure 3 shows a typical CDMS mass spectrum of the intact phage generated by complementation in vivo. The spectrum was obtained by binning the measured masses for 2563 ions into 100 kDa bins. The spectrum shows a single peak centered around 52 MDa. There are no ions below 20 MDa because they are cut off by the instrumental settings. The inset in Figure 3 shows an expanded view of the main peak. The experimental spectrum is represented by the blue points and the black line shows a Gaussian fit to the points. The Gaussian is a good fit to the measured peak and was used to determine the mean and the width of the mass distribution. The average mass from four independent measurements of the mean on two different samples is 52,180±59 kDa, where the uncertainty is the standard deviation of the mean. The average full width at half maximum (FWHM) is 1,330±36 kDa. The width results from a combination of instrumental resolution, salt and solvent adducts, and intrinsic heterogeneity, as described in more detail below. The center of the mass distribution (the average mass) can be defined with an uncertainty that is much smaller than the width of the distribution.
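The binning and peak characterization can be illustrated with a minimal Python sketch. Two caveats apply: the ion masses below are synthetic (drawn from a Gaussian with the reported center and FWHM, not the measured data), and the peak is summarized with sample moments rather than the least-squares Gaussian fit used in the paper. The function names are hypothetical.

```python
import math
import random
import statistics
from collections import Counter

FWHM_PER_SIGMA = 2 * math.sqrt(2 * math.log(2))  # ~2.355 for a Gaussian
BIN_WIDTH_KDA = 100


def histogram(masses_kda, bin_width=BIN_WIDTH_KDA):
    """Count ions per bin; keys are the lower bin edges in kDa."""
    return Counter(int(m // bin_width) * bin_width for m in masses_kda)


def peak_summary(masses_kda):
    """Moment-based estimate of the peak center and FWHM (not a fit)."""
    mean = statistics.fmean(masses_kda)
    sigma = statistics.stdev(masses_kda)
    return mean, FWHM_PER_SIGMA * sigma


# Synthetic stand-in for real data: 2563 ions, NOT the measured masses.
rng = random.Random(0)
masses = [rng.gauss(52_180, 1_330 / FWHM_PER_SIGMA) for _ in range(2563)]

counts = histogram(masses)
mean_kda, fwhm_kda = peak_summary(masses)
print(f"{len(counts)} occupied bins, mean = {mean_kda:.0f} kDa, FWHM = {fwhm_kda:.0f} kDa")
```

With a few thousand ions, the standard error of the mean is far smaller than the peak width, which is why the center can be located much more precisely than the FWHM might suggest.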
The points that overlay the mass spectrum in Figure 3 are a scatter plot of charge versus mass. There is a single cluster of points centered at around 490 elementary charges (e). Thus, the average m/z of the phage P22 ions is over 100 kDa per elementary charge (52,180 kDa divided by 490 e is roughly 106 kDa/e). We have previously reported the mass distribution for immature P22 procapsid-like particles. The procapsid-like particles consisted of 420 copies of coat protein (gp5) and a distribution of scaffolding protein (gp8), as those were the only proteins expressed from the plasmid. The average charge determined here for the phage P22 (490 e) is larger than for the procapsid-like particles (425 e), which is consistent with the larger diameter of the phage according to the charge residue model of electrospray. However, instrumental conditions and settings can have a significant effect on the charge of electrosprayed ions, so it is difficult to compare absolute charges measured in different experiments.
The measured average mass (52,180±59 kDa) is 567 kDa higher than the theoretical mass (51,613 kDa). Electrospray from non-denaturing solvents[45,46] (native electrospray), while enabling the analysis of complex and massive biological analytes by mass spectrometry, is known to lead to masses that are slightly larger than the theoretical mass due to adduction by salts, counter ions, and solvent. In the recent measurement of the mass of the Prohead-1gp5 of HK97, the measured mass was 1.3% larger than the theoretical mass. Here the percentage deviation is 1.1%. The reason that the percentage deviation is smaller here may be that the encapsidated DNA prevents adduction to the capsid interior, while the Prohead-1gp5 of HK97 is an empty shell.
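For reference, the percentage deviation quoted above follows directly from the two masses given in the text (values in kDa):

$$\frac{52{,}180 - 51{,}613}{51{,}613} = \frac{567}{51{,}613} \approx 0.011 = 1.1\%$$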
The average FWHM of the peak due to phage P22 is 1,330±36 kDa. There are three main contributions to the width of the peak: the mass resolution (which is known), the mass distribution due to adduct formation (which can be estimated), and the intrinsic heterogeneity in the mass of the phage P22 (which we would like to obtain). Assuming that all distributions are Gaussian, the widths add in quadrature, and the intrinsic heterogeneity of the phage P22 is given by

$$W_{\mathrm{IH}} = \sqrt{W_{\mathrm{MEAS}}^{2} - W_{\mathrm{RES}}^{2} - W_{\mathrm{ADD}}^{2}}$$
where $W_{\mathrm{IH}}$ is the FWHM of the intrinsic mass distribution, $W_{\mathrm{MEAS}}$ is the FWHM of the measured peak, $W_{\mathrm{RES}}$ is the FWHM from the mass resolution, and $W_{\mathrm{ADD}}$ is the FWHM due to adduct formation. The mass resolution depends on the uncertainties in the m/z and charge measurements. These uncertainties are well characterized and lead to a combined relative root mean square deviation (RMSD) (Δm/m) of 0.0087 for ions with the trapping time, oscillation frequency and charge measured here. An estimate of the mass distribution due to adduct formation can be obtained from the effective resolving power (194) reported by Heck and collaborators for the Prohead-1gp5 of HK97. We use the same value to estimate the width of the distribution for phage P22. With these two quantities and the average width of the measured peak, the intrinsic width of the phage P22 mass distribution is 747±66 kDa. If this distribution is entirely due to the distribution of packaged DNA, then the distribution has a FWHM of 1,213±107 base pairs. This value compares favorably with an earlier estimate of the width of 1,500 base pairs. Thus, all the heterogeneity in the mass of the phage P22 can be attributed to the distribution of packaged DNA. Surprisingly then, these data indicate that there is little or no heterogeneity in the protein stoichiometries. While the proteins associated with the capsid and with the tail machine are forced into specific stoichiometries by geometry and their interactions within the phage, this does not appear to be true for the ejection proteins, so there could conceivably be a distribution for these proteins. However, the CDMS results show that if there is a distribution, it is narrow. The narrow intrinsic mass distribution measured for phage P22 also shows that almost all the ions in the spectrum are due to intact phage. A close inspection of Figure 3 shows that there are only a few ions in the spectrum that do not fall within the main peak. These may be a small fraction of procapsids which did not package DNA, or they may be misassembled particles. Thus, despite the size and complexity of phage P22, it survives the transition into the gas phase and remains intact in ultra-high vacuum.
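The width decomposition can be reproduced approximately with the following Python sketch. It assumes that the relative RMSD converts to a FWHM through the Gaussian factor 2√(2 ln 2), that the resolving power of 194 is defined as mass divided by FWHM, and that the average mass of an un-ionized base pair is about 0.616 kDa (a value inferred here from the ratio of the quoted widths, not stated in the text). The function name is hypothetical.

```python
import math

FWHM_PER_SIGMA = 2 * math.sqrt(2 * math.log(2))  # Gaussian sigma -> FWHM
KDA_PER_BP = 0.616  # assumption: average un-ionized base-pair mass in kDa


def intrinsic_fwhm(w_meas, mass, rel_rmsd, resolving_power):
    """Subtract resolution and adduct widths in quadrature (all in kDa)."""
    w_res = FWHM_PER_SIGMA * rel_rmsd * mass   # instrumental resolution
    w_add = mass / resolving_power             # adduct broadening (assumed m/FWHM)
    w_ih = math.sqrt(w_meas ** 2 - w_res ** 2 - w_add ** 2)
    return w_ih, w_res, w_add


w_ih, w_res, w_add = intrinsic_fwhm(1_330.0, 52_180.0, 0.0087, 194.0)
print(f"W_RES = {w_res:.0f} kDa, W_ADD = {w_add:.0f} kDa, W_IH = {w_ih:.0f} kDa")
print(f"equivalent DNA width = {w_ih / KDA_PER_BP:.0f} bp")
```

Under these assumptions the intrinsic width comes out at roughly 744 kDa, well within the quoted 747±66 kDa. The instrumental resolution is by far the largest of the three contributions, so the intrinsic width is sensitive to how well the resolution is known.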
The mass for the dsDNA given in Table 1 was calculated assuming that the DNA is packaged in an un-ionized state. In solution, however, the phosphates are thought to ionize, with neutrality retained through counter ions. The phosphate groups in the DNA backbone are known to have a strong affinity for Na+. If all the backbone phosphates were ionized and had Na+ counter ions, the mass of the genome would increase by 1,906 kDa. This is incompatible with the measured mass, which indicates that the DNA is packaged predominantly in the un-ionized state. There is strong support for this conclusion. In recent CDMS measurements for adeno-associated virus (AAV), we were able to determine the mass of the empty capsid and the capsid with a full genome, for both single-stranded and self-complementary genomes. The mass difference between the empty and full particles was in good agreement with the difference expected for un-ionized DNA. Even for electrosprayed DNA it appears that the number of counter ions is relatively small. Schultz and coworkers have measured the masses of electrosprayed, MDa-sized DNA and found that the fraction of counter ions decreased as the length of the DNA increased, and the mass approached the un-ionized mass for DNA masses of around 5 MDa.
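The size of the hypothetical Na+ contribution can be estimated with a short Python sketch. It assumes one phosphate per nucleotide (two per base pair, ignoring end groups), replaces one proton with one sodium ion at each phosphate, and uses standard atomic masses; the variable names are illustrative.

```python
NA_MASS_DA = 22.990   # atomic mass of sodium
H_MASS_DA = 1.008     # atomic mass of hydrogen

GENOME_BP = 41_724
MEAN_REDUNDANT_BP = 1_600          # average headful redundancy used in the text

packaged_bp = GENOME_BP + MEAN_REDUNDANT_BP
phosphates = 2 * packaged_bp       # assumption: one phosphate per nucleotide

# Each ionized phosphate loses H+ and gains Na+.
delta_kda = phosphates * (NA_MASS_DA - H_MASS_DA) / 1000.0

observed_excess_kda = 52_180 - 51_613  # measured minus theoretical mass
print(f"Na+ form would add about {delta_kda:.0f} kDa")
print(f"observed excess over theory: {observed_excess_kda} kDa")
```

The estimate is about 1.9 MDa, in line with the 1,906 kDa quoted above, and more than three times the 567 kDa excess actually observed.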
In conventional mass spectrometry, where only the m/z is measured, the charge must be inferred from resolved charge-state peaks in the ensemble m/z spectrum before the mass can be determined. The intrinsic mass heterogeneity for phage P22 is relatively small (747 kDa FWHM) compared to the average mass (52,180 kDa). However, with an average charge of 490 e it is impossible to resolve charge states in the m/z spectrum. Thus, it would be impossible to determine the mass distribution for phage P22 from the m/z spectrum alone. In this work we used CDMS, where the m/z and charge of each ion are measured simultaneously, yielding the mass of each ion directly. CDMS is one of several single-molecule mass spectrometry methods that can be used to analyze large and heterogeneous objects.
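A rough calculation shows why the charge states cannot be resolved. The Python sketch below compares the m/z spacing between adjacent charge states with the m/z width that the intrinsic mass heterogeneity alone would produce at a single charge state. It uses only values quoted in the text and ignores adducts and instrument resolution, both of which would broaden the peaks further.

```python
mass_kda = 52_180.0            # average measured mass
charge = 490                   # average charge in elementary charges
intrinsic_fwhm_kda = 747.0     # intrinsic mass heterogeneity (FWHM)

mz_kda = mass_kda / charge
# Spacing between the z and z+1 charge states of an ion of the same mass.
spacing_kda = mass_kda / charge - mass_kda / (charge + 1)
# m/z width of a single charge state from intrinsic heterogeneity alone.
mz_width_kda = intrinsic_fwhm_kda / charge

print(f"m/z = {mz_kda:.1f} kDa/e")
print(f"charge-state spacing = {spacing_kda * 1000:.0f} Da/e")
print(f"single charge-state width = {mz_width_kda * 1000:.0f} Da/e")
```

Each charge-state peak would be several times wider than the gap to its neighbor, so the peaks merge into one unresolved envelope and the charge cannot be read off the m/z spectrum.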
We have measured the molar mass distribution for infectious phage P22. Phage P22 contains a total of 521 copies of 9 different proteins along with dsDNA. Despite its enormous size and complexity, P22 phage survives the transition into the gas phase and remains intact in ultra-high vacuum. The average measured molar mass (52,180±59 kDa) is consistent with the theoretical molar mass and supports the proposed stoichiometry. The intrinsic width of the molar mass distribution (747±66 kDa) is entirely attributable to the distribution of DNA packaged by the headful mechanism, indicating that there is little or no heterogeneity in the protein components. Moreover, the average mass measurement shows that the DNA packaged within phage P22 is un-ionized rather than neutralized by counter ions. From a broader perspective, these results show that valuable information can be obtained from single-molecule mass measurements on objects with masses well beyond those usually studied by conventional mass spectrometry, providing important information on their composition and heterogeneity.
We thank Dr. Marie Cantino and Dr. Xuanhao Sun of the University of Connecticut Bioscience Electron Microscopy Laboratory for assistance with transmission electron microscopy. We gratefully acknowledge the support of the NSF through award number CHE-1531823 to MFJ and the NIH for grant GM076661 to CMT.