Gas vesicles are organelles that provide buoyancy to the aquatic microorganisms that harbor them. The gas vesicle shell consists almost exclusively of the hydrophobic 70-residue protein GvpA, arranged in an ordered array. Solid-state NMR spectra of intact, collapsed gas vesicles from the cyanobacterium Anabaena flos-aquae show duplication of certain GvpA resonances, indicating that specific sites experience at least two different local environments. Interpretation of these results in terms of an asymmetric dimer repeat unit can reconcile otherwise conflicting features of the primary, secondary, tertiary and quaternary structures of the gas vesicle protein. In particular, the asymmetric dimer can explain how the hydrogen bonds in the β–sheet portion of the molecule can be oriented optimally for strength while promoting stabilizing aromatic and electrostatic side-chain interactions among highly conserved residues and creating a large hydrophobic surface suitable for preventing water condensation inside the vesicle.
Gas vesicles are buoyancy organelles that are found in a wide range of aquatic microorganisms. By assembling and disassembling these vesicles, organisms are able to regulate their depth in the water column according to their needs for light, air and nutrients.
The gas content of the hollow vesicles reflects passive equilibrium with gas molecules dissolved in the aqueous phase. Given that permeable species include not only such small, non-polar molecules as H2, N2, O2, Ar, and CO2, but also the polar CO molecule and the large perfluorocyclobutane molecule, C4F8 (6.3 Å diameter)1, it is assumed that H2O is also permeable and that the absence of condensed water inside the vesicles reflects a highly hydrophobic and highly concave inner surface without suitable nucleation sites1; 2. This view is supported by accumulating evidence that proximal hydrophobic surfaces in bulk water produce bubbles between them3.
Electron microscopy of intact vesicles4; 5 shows shapes ranging from acorn-like spindles to regular cylinders with conical end caps. In Anabaena flos-aquae, the cylinders dominate (see Figure 1) and are typically about 5000 Å long and 750 Å wide. Close examination shows that the vesicle is bipolar, with ribs forming a low-pitch helix6 on each side of an apparent insertion seam located in the cylindrical region. A 45.7 Å rib-rib distance has been measured by X-ray diffraction7 and by atomic force microscopy.8
The gas vesicle wall comprises a shell formed almost exclusively by repeats of the 70-residue gas vesicle protein A (GvpA)9, with a small amount of the ~3-fold larger gas vesicle protein C (GvpC) adhering loosely to the outer surface10. However, little is known about the GvpA fold: high resolution electron microscopy is not possible because of multiple scattering, and solution NMR of intact vesicles is not possible because of their large size. Furthermore, since gas vesicles dissolve only under denaturing conditions and subsequent dialysis yields only amorphous precipitates, neither crystallographic nor solution NMR studies have been feasible. Nevertheless, FTIR spectra (obtained in collaboration with Mark Braiman) provide indications of anti-parallel β–sheet, and X-ray diffraction7 and AFM8 agree that β–strands are tilted 36° from the cylindrical axis of the vesicle. The corresponding orientation of the inter-strand hydrogen bonds, at 54° to the cylinder axis, is ideal for mechanical stability in both the length and width directions1.
Figure 2 shows the amino acid sequence of GvpA in Anabaena flos-aquae. There are six positively charged residues (three R and three K) and nine negatively charged residues (three D and six E), for a net charge of –3. A MALDI-TOF study11 has shown that (1) there is no post-translational modification of GvpA, (2) the only one of the six R-X and K-X bonds that is accessible to trypsin is the one in the N-terminus, (3) none of the nine D-X and E-X bonds are accessible to endoproteinase GluC, and (4) the C-terminus is accessible to carboxypeptidase only as far as the S65-A66 bond.
Secondary structure prediction using PSIPRED12 (and other algorithms, not shown) defines likely α–helix and β–sheet regions with high confidence (as shown in Figure 2a). A coil prediction approximately midway through a long β–sheet stretch (too long to fit in one rib of the vesicle) suggests the location of a β–turn. Charges in each half of the predicted β–sheet are conspicuously arranged as oppositely charged pairs (D26,R30 and E42,R44) that could form salt bridges with the same pairs in a neighboring anti-parallel strand. This arrangement would also allow the unique tryptophan (W28) in each subunit to interact with that of a neighboring subunit.
BLAST13 results comparing the Anabaena flos-aquae GvpA sequence with those of other cyanobacteria and with aquatic micro-organisms generally (Figure 2b) show a remarkably conserved core in the sequence, coinciding with the predicted α–helix and β–sheet segments. The putative β–turn region is absolutely conserved and the flanking β–sheet segments show absolute conservation or conservative substitution of all the aromatic and charged residues, including the tryptophan and all of the charge pairs noted above. This conservation suggests the importance of aromatic and electrostatic interactions for the structure of gas vesicles.
However, it is not easy to satisfy these side chain interactions for the anti-parallel β–strands while also orienting the strands at the observed tilt of 36° from the vesicle axis, which requires a translation of four residues over two subunits (as shown in Figure 3). Figure 4 shows that this can be done with an even-numbered β–turn centered between V34 and G35, but even-numbered turns put charged side chains on both sides of the β–sheet, which precludes participation of the sheet in the hydrophobic inner surface of the gas vesicle. The alternative, that the hydrophobic faces of the amphipathic α–helices make up the inner surface of the vesicle, seems unlikely, given that both the N- and C-termini are accessible to digestion by proteases11.
Figures 5–7 show options for odd-numbered β–turns. Centering the turn on G35 or I36 (Figure 5a,b) leads to a translation of two residues over two subunits in either direction, resulting in a tilt of only 20°. On the other hand, centering an odd-numbered turn on V34 (Figure 5c) leads to a translation of six residues over two subunits, resulting in a tilt of 48°. Evidently, achieving the correct tilt with odd-numbered turns requires either shifting the register of subunits or relaxing the assumption of equivalent subunits. These two approaches are illustrated in Figures 6 and 7, respectively. Figure 6 shows the same turn as in Figure 5a with a two-residue shift in one subunit interface or the other: in Figure 6a, the DAWVR registration is preserved while the EAR motifs are two residues out-of-register, whereas in Figure 6b, the DAWVR motifs are two residues out-of-register while the EAR registration is preserved. In both cases, interactions between the out-of-register residues would require side chains to be stretched along a diagonal between the strands. In contrast, the model in Figure 7 preserves the close aromatic and electrostatic interactions while achieving the correct tilt by centering the odd-numbered β–turns in alternating subunits on different residues, specifically V34 and G35, to give the correct translation of four residues over two subunits.
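The relation between residue translation and strand tilt can be illustrated with a short calculation. The following Python sketch is not the authors' method: it assumes an idealized β–sheet with a rise of 3.4 Å per residue and a spacing of 4.8 Å between adjacent strands (both textbook-style values, not taken from this paper), and two strands per subunit as in the hairpin models. The function names are hypothetical.

```python
import math

# Assumed idealized beta-sheet geometry (NOT values from the paper).
RISE_PER_RESIDUE = 3.4    # angstroms advanced along a strand per residue
STRAND_SPACING = 4.8      # angstroms between neighboring strands
STRANDS_PER_SUBUNIT = 2   # one beta-hairpin per GvpA subunit, as in the models


def strand_tilt_deg(residue_shift, subunits=2):
    """Strand tilt from the cylinder axis, in degrees.

    residue_shift is the translation along the strand direction, counted in
    residues, that accumulates over `subunits` subunits along a rib.
    """
    along_strand = residue_shift * RISE_PER_RESIDUE
    across_strands = subunits * STRANDS_PER_SUBUNIT * STRAND_SPACING
    return math.degrees(math.atan2(along_strand, across_strands))


def hydrogen_bond_angle_deg(tilt_deg):
    """Inter-strand hydrogen bonds run perpendicular to the strands."""
    return 90.0 - tilt_deg


if __name__ == "__main__":
    for shift in (2, 4, 6):
        tilt = strand_tilt_deg(shift)
        print(f"{shift} residues over 2 subunits: tilt of about {tilt:.0f} degrees")
```

Under these assumptions, shifts of two, four and six residues give tilts of roughly 20°, 35° and 47°, close to the 20°, 36° and 48° quoted above; the small differences depend on the geometric parameters chosen.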
The model in Figure 7 uniquely invokes an asymmetric dimer as the fundamental building block of gas vesicles. In this case, NMR spectra would exhibit two different chemical shifts for at least some residues. Since solid-state NMR does not rely on fast molecular tumbling or large three-dimensional crystals, it is an ideally suited technique for studying gas vesicles. Here we present the first solid-state NMR results for vesicles from Anabaena flos-aquae in their intact, collapsed state. Our data indicate that there are inequivalent GvpA subunits in the gas vesicle structure, lending plausibility to the β–sheet structure proposed in Figure 7.
We have obtained NMR data with line widths (full width at half height) of 70–110 and 60–80 Hz in the 13C and 15N dimensions, respectively. The 13C line widths are comparable to those in the microcrystalline proteins BPTI14 and ubiquitin15 (in both cases 100 Hz). Like amyloid fibrils, such as α-synuclein16; 17 and HET-s(218–289)18, where 13C line widths in the 30–100 Hz range are observed, gas vesicles can be regarded as natural 2-dimensional protein crystals with a high degree of short-range order, producing narrow NMR lines. Still narrower line widths, such as those observed for the microcrystalline proteins GB1 and Crh (15–30 Hz line widths)19; 20, are required in order to resolve one-bond 13C-13C J-couplings. With the line widths that we observe in gas vesicles, spectroscopy at high field is required to resolve the peaks. Therefore, the data in this paper were acquired at 1H Larmor frequencies of 700 to 900 MHz. Partial assignments have been obtained, and will be reported in a later paper.
Figure 8 shows the 44–47 ppm 13C/105–117 ppm 15N region of a NCACX correlation spectrum, where only Gly CA-N cross peaks are expected. There are 3 glycine residues in the sequence, but 6 peaks are observed in this region. It is possible to assign three of these peaks sequentially to G22, G35, and G61. The remaining peaks are labeled A, B, and C. While it is not yet possible to assign peaks A, B, and C sequentially, it has been determined that peak C has a valine neighbor and therefore must be due to either G35 or G61.
Figure 9 shows N-C correlations for the unique A-S pair in the GvpA sequence. The left two panels show the N-CA and N-CB correlation of S49 in a NCACX spectrum, and the two right panels show the correlations of the S49 N to the preceding A48 CA and CB sites in a NCOCX spectrum. Since there is only one A-S pair in the sequence, all peaks must be assigned to this pair. In the 15N dimension, it is clear that all the correlation signals are found at two distinct chemical shifts for S49, 1.8 ppm apart. In the 13C dimension, two distinct chemical shifts are also observed for S49-CB, 0.5 ppm apart. On the other hand, at S49-CA, A48-CA, and A48-CB different chemical shifts are not distinguishable within the experimental error.
Figure 10 shows N-C correlations for the unique T-Y pair in the GvpA sequence. The left panel shows the N-CA correlation of Y53 in a NCACX spectrum, and the right panel shows the correlations of the Y53 N to the preceding T52 CA and CB sites in a NCOCX spectrum. Since there is only one T-Y pair in the sequence, all peaks must be assigned to this pair. In the 15N dimension, it is clear that all the correlation signals are found at two distinct chemical shifts for Y53, 1.0 ppm apart. In the 13C dimension, two distinct chemical shifts are also observed for Y53-CA, 0.4 ppm apart, and T52-CA, also 0.4 ppm apart.
Figure 11 shows the alanine CA-CB region of an RFDR spectrum. Although there are only 11 alanine residues in the sequence, at least 14 peaks are observed in this region, of which at least one represents two residues. From the distribution of the peaks, with secondary chemical shifts21; 22 for both α-helix and β-sheet, it is clear that GvpA is a protein with mixed secondary structure. Although all three predicted α-helical alanine residues are assigned to one peak each, two peaks (D and E) still remain unassigned in the α-helix region. In addition, at least 10 peaks are observed for the 8 predicted coil and β-sheet residues, indicating duplication of peaks in various parts of the sequence.
We note that the duplicated peaks in Figures 8–11 have comparable line widths in both the 13C and 15N dimensions. Hence, the extra peaks appear to represent two well-ordered, but structurally different, protein fractions.
Several sites in GvpA especially lend themselves to the identification of subunit inequivalence in gas vesicles. The glycine NCA correlations are found in a spectral region that does not contain other resonances, and there are only three glycine residues in this 70-residue protein. The unique AS and TY pairs are resolved in NCACX spectra. Of these residues, G22, G35, A48 and T52 are absolutely conserved across all organisms, while S49, Y53 and G61 are absolutely conserved among cyanobacteria, but not conserved more generally. All of these residues are located at or near predicted transitions in secondary structure (Figure 2).
As expected, the duplicated signals show larger variations in 15N chemical shifts than in 13C chemical shifts: the lone pair of nitrogen makes its shielding much more sensitive to changes in the local environment23. A sensitive reporter on the peptide backbone is clearly ideal for detecting secondary structure variations, although it may also reflect higher order variations.
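The comparison between 15N and 13C splittings can be made explicit by tabulating the values quoted for Figures 9 and 10. The following is a minimal Python sketch; the table layout and the function name are illustrative choices, and only the ppm values come from the text.

```python
# Splittings between duplicated peaks, in ppm, as quoted in the text for
# Figures 9 and 10. Keys are site labels; values are (nucleus, splitting).
splittings_ppm = {
    "S49 N": ("15N", 1.8),
    "S49 CB": ("13C", 0.5),
    "Y53 N": ("15N", 1.0),
    "Y53 CA": ("13C", 0.4),
    "T52 CA": ("13C", 0.4),
}


def group_by_nucleus(table):
    """Collect the splittings into one list per nucleus type."""
    grouped = {}
    for _site, (nucleus, delta) in table.items():
        grouped.setdefault(nucleus, []).append(delta)
    return grouped


grouped = group_by_nucleus(splittings_ppm)
smallest_15n = min(grouped["15N"])
largest_13c = max(grouped["13C"])

if __name__ == "__main__":
    print("15N splittings (ppm):", sorted(grouped["15N"]))
    print("13C splittings (ppm):", sorted(grouped["13C"]))
    print("every 15N splitting exceeds every 13C splitting:",
          smallest_15n > largest_13c)
```

For these sites, even the smallest 15N splitting is larger than the largest 13C splitting, in line with the greater sensitivity of the amide nitrogen.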
The peaks in Figures 9 and 10 offer unambiguous evidence for structural variations in the GvpA subunits of Anabaena flos-aquae at the C-terminal end of the predicted β-sheet region. Given the similar intensities within the pairs, it is likely that each peak in a duplicated pair comes from the same number of subunits, consistent with the asymmetric dimer model shown in Figure 7. The alternative of sample heterogeneity is implausible since X-ray diffraction shows just one characteristic distance in each of the three dimensions of the wall (i.e., the rib spacing, the wall-thickness, and the subunit repeat along a rib) and electron microscopy also shows just one rib spacing. Of course, electron microscopy also shows conical end-caps on otherwise cylindrical vesicles. However, Anabaena vesicles are long and the relatively small content of the end-caps is not expected to give signals of the strength of those seen in our duplicated peaks.
Figure 8 also shows peak duplication, with larger 15N chemical shift variations, but it is more difficult to interpret. There are two options for assigning peak C because both G35 and G61 are preceded by a valine residue. However, given the strength of the peak already assigned to G61, it seems unlikely that any other peak belongs to G61 and reasonable to infer instead that the G61 signal is not split. With the alternative assignment of peak C to G35, the two G35 peaks are of comparable intensity and there is a difference in 15N and 13C chemical shifts of 4.5 and 1.0 ppm, respectively, indicating a significant difference between the two subunit conformations at G35. In this scenario, the remaining peaks A and B are, given their intensities, most reasonably assigned to G22. The chemical shift differences between peaks A, B, and G22 are small, and suggest correspondingly small conformational variations at this position.
The tentative glycine peak assignments obtained by the above consideration of peak intensities are in good agreement with the duplications expected for the asymmetric dimer model shown in Figure 7. G35 will be in significantly different local environments in alternating subunits depending on whether it is or is not at the center of the β-turn. In the above assignment, this is the glycine residue that displays the largest chemical shift change between the duplicated peaks. G22 is situated in a loop at the N-terminal end of the predicted β–sheet region and, like A48 at the C-terminal end, is expected to be less affected by the different turn positions in alternating subunits. G61, located on the C-terminal side of the C-terminal α-helix, would be affected still less. Thus, the asymmetric model shown in Figure 7 provides a clear rationale for the glycine peaks seen in Figure 8.
Taken together, the multiplicity of cross-peaks observed in well-resolved regions of heteronuclear 2D solid-state NMR spectra of intact gas vesicles supports a model of the β-sheet portion of GvpA that achieves one completely hydrophobic face, complementary charges and aromatic-aromatic interactions at subunit interfaces, and the stabilizing 36° strand tilt, all by a small folding variation in alternating GvpA subunits. That two conformations appear to contribute to function places GvpA in a growing group of “metamorphic” proteins that adopt different folded states under native conditions24.
Uniformly 13C and 15N labeled gas vesicles from Anabaena flos-aquae provide well-resolved solid-state NMR spectra at high magnetic fields. The sensitivity of the amide 15N shifts provides a probe of subtle differences in protein structure. In gas vesicles we find that it distinguishes at least two different forms of the GvpA subunits. In particular, 15N-13C correlation spectra detect two different environments for the amides of S49 and Y53, and at least two different environments for the amides of G22 and G35. Thus the GvpA monomers are incorporated into gas vesicles in at least two different ways. The simplest explanation is that gas vesicles are formed by asymmetric dimers of GvpA. Such dimers can explain how the β–strands can be tilted at 36° relative to the vesicle axis while accommodating stabilizing interactions between highly conserved residues and forming a large hydrophobic surface suitable for inhibiting water condensation inside the vesicle. Future resonance assignments and structural constraints will test this interpretation and provide a larger context.
To uniformly 13C and 15N label gas vesicles in Anabaena flos-aquae (Cambridge Collection of Algae and Protozoa (CCAP), Cambridge, UK, strain 1403/13f), the cells were grown under 13CO2 and 15N2. Floating cells were collected and lysed by osmotic shrinkage of the protoplasts using 0.7 M sucrose25. The released vesicles were isolated and washed by several rounds of centrifugally accelerated floatation at 100 x g in 5.0 mM NaCN, 10 mM potassium phosphate buffer (pH 8.0)26, followed by filtration on Millipore membrane filters with 0.65 and 0.45 μm pores27. No attempt was made to retain GvpC, and any that might be retained would not disturb NMR experiments, as a signal from less than 5 mol% of the sample is not detectable.
The isolated gas vesicles were collapsed by a sudden application of pressure to the plunger of a syringe holding a suspension of vesicles. Vesicle collapse was observed by clearing of the initial milky appearance. The collapsed vesicles were pelleted by 45 min of centrifugation at 158,420 x g, washed with 50 mM NaH2PO4, 1 mM NaN3, pH 7.0, and then resuspended in the same buffer with 15% (w/w) D8-glycerol (Cambridge Isotope Laboratories, Andover, MA). Glycerol prevents protein dehydration, and the deuteration prevents cross-polarization of the natural abundance carbon. This suspension was centrifuged for 24 hours at 324,296 x g, and the resulting gel-like pellet was drained and packed into rotors using a tabletop centrifuge. Samples of 10.8 and 23.7 mg were packed into 2.5 and 3.2 mm rotors, respectively, and we estimate that less than half of this is protein. After closing the rotors, there was no dehydration over several months (as monitored by weight).
CP/MAS (cross-polarization/magic-angle-spinning) NMR spectra were acquired on custom-designed spectrometers (courtesy of D. J. Ruben, Francis Bitter Magnet Laboratory, Massachusetts Institute of Technology) operating at 700 MHz and 750 MHz 1H Larmor frequencies and a Bruker spectrometer (Billerica, MA) operating at 900 MHz 1H Larmor frequency. The 700 MHz spectrometer was equipped with a triple-resonance 1H/13C/15N Varian-Chemagnetics (Palo Alto, CA) probe with a 3.2 mm stator, and the 750 and 900 MHz spectrometers were equipped with Bruker (Billerica, MA) probes with 2.5 mm stators. Samples were cooled with a stream of air during experiments, maintaining the exit gas temperature at –5 to 5°C. The MAS frequency was controlled to ±5 Hz using Bruker spinning speed controllers. All spectra were referenced to external DSS according to IUPAC convention28, using the Ξ conversion factor29.
The 2D NCOCX correlation spectrum was obtained at 900 MHz 1H Larmor frequency, with 20 kHz MAS and 100 kHz TPPM 1H decoupling30, using 1H-15N CP, followed by specific DCP31 for 15N-13CO polarization transfer, and PDSD32 for subsequent 13CO-13CX transfer. 128 real and 128 imaginary points were acquired in the indirect dimension with a dwell time of 100 μs, and 2048 points were acquired in the direct dimension with a dwell time of 10 μs. 384 scans were acquired for each t1 point.
The 2D NCACX correlation spectrum was obtained at 700 MHz 1H Larmor frequency, with 12.5 kHz MAS and 83 kHz TPPM 1H decoupling30, using 1H-15N CP, followed by specific DCP31 for 15N-13CA polarization transfer, and 2.56 ms RFDR33; 34 with 20 kHz 13C pulses for mixing from CA to other aliphatic 13C nuclei. At this field, the typical 120-ppm difference between CO and CA resonances corresponds to 21 kHz and since the bandwidth of a 20 kHz pulse is less than 20 kHz, mixing occurs only between aliphatic carbons. This is advantageous in terms of sensitivity, as the magnetization is spread among fewer sites. 192 real and 192 imaginary points were acquired in the indirect dimension with a dwell time of 80 μs, and 1536 points were acquired in the direct dimension with a dwell time of 16 μs. 296 scans were acquired for each t1 point.
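The 21 kHz figure for the CO-CA separation follows from converting ppm to Hz at the 13C Larmor frequency. A minimal Python sketch of that arithmetic is given here; the 13C/1H frequency ratio is an approximate value quoted from memory and should be treated as an assumption, and the function name is hypothetical.

```python
# Approximate ratio of the 13C to 1H resonance frequencies (about 25.1%).
# ASSUMPTION: quoted from memory, not taken from this paper.
XI_13C = 0.2514495


def ppm_to_hz(delta_ppm, proton_mhz, xi=XI_13C):
    """Convert a 13C shift difference in ppm to Hz at a given 1H frequency."""
    carbon_mhz = proton_mhz * xi    # 13C Larmor frequency in MHz
    return delta_ppm * carbon_mhz   # ppm times MHz gives Hz


co_ca_hz = ppm_to_hz(120, 700)

if __name__ == "__main__":
    print(f"120 ppm at 700 MHz 1H is about {co_ca_hz / 1000:.1f} kHz")
```

With these inputs the separation comes out at about 21 kHz, matching the value stated in the text.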
The 2D RFDR33; 34 spectrum was obtained at 750 MHz 1H Larmor frequency, with 18.182 kHz MAS and 83 kHz XiX 1H decoupling35. 13C RFDR pulses of 40 kHz were used, and the 13C-13C mixing time was 3.52 ms. 512 real and 512 imaginary points were acquired in the indirect dimension with a dwell time of 28 μs, and 2048 points were acquired in the direct dimension with a dwell time of 12 μs. 112 scans were acquired for each t1 point.
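The acquisition parameters above can be cross-checked with simple arithmetic: a dwell time fixes the spectral width (its reciprocal), and an RFDR mixing time can be expressed as a number of rotor periods. The Python sketch below does this for the quoted values; the dictionary layout and function names are illustrative, not part of the original methods.

```python
# Parameters quoted in the text:
# name -> (MAS rate in Hz, RFDR mixing time in s or None,
#          indirect dwell in s, direct dwell in s)
experiments = {
    "NCOCX": (20_000.0, None, 100e-6, 10e-6),
    "NCACX": (12_500.0, 2.56e-3, 80e-6, 16e-6),
    "RFDR": (18_182.0, 3.52e-3, 28e-6, 12e-6),
}


def rotor_periods(mas_hz, duration_s):
    """Number of rotor periods that fit in a given duration."""
    return duration_s * mas_hz


def spectral_width_hz(dwell_s):
    """Spectral width implied by a dwell time."""
    return 1.0 / dwell_s


if __name__ == "__main__":
    for name, (mas, mix, dwell_t1, dwell_t2) in experiments.items():
        line = (f"{name}: indirect width {spectral_width_hz(dwell_t1) / 1e3:.1f} kHz, "
                f"direct width {spectral_width_hz(dwell_t2) / 1e3:.1f} kHz")
        if mix is not None:
            line += f", mixing of about {rotor_periods(mas, mix):.1f} rotor periods"
        print(line)
```

The two RFDR mixing times correspond to very nearly 32 and 64 rotor periods, as expected for a rotor-synchronized pulse train.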
This work was funded by NIH grants EB002175, EB001960, and EB002026. We would like to thank Drs. Patrick van der Wel and Anthony Bielecki for helpful discussions and technical support.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.