Noncovalent self-assembly of biopolymers is driven by molecular interactions between functional groups on complementary biopolymer surfaces, replacing interactions with water. Since individually these interactions are comparable in strength to interactions with water, they have been difficult to quantify. Solutes (osmolytes, denaturants) exert often-large effects on these self-assembly interactions, determined in sign and magnitude by how well the solute competes with water to interact with the relevant biopolymer surfaces. Here, an osmometric method and a water-accessible surface area (ASA) analysis are developed to quantify and interpret the interactions of the remarkable osmolyte glycine betaine (GB) with molecular surfaces in water. We find that GB, lacking hydrogen bond donors, is unable to compete with water to interact with anionic and amide oxygens; this explains its effectiveness as an osmolyte in the E. coli cytoplasm. GB competes effectively with water to interact with amide and cationic nitrogens (hydrogen bonding) and especially with aromatic hydrocarbon (cation-pi). The large stabilizing effect of GB on lac repressor-lac operator binding is predicted quantitatively from ASA information and shown to result largely from dehydration of anionic DNA phosphate oxygens in the protein-DNA interface. The incorporation of these results into theoretical and computational analyses will likely improve the ability to accurately model intra- and inter-protein interactions. Additionally, these results pave the way for development of solutes as kinetic/mechanistic and thermodynamic probes of conformational changes and formation/disruption of molecular interfaces that occur in the steps of biomolecular self-assembly processes.
Biopolymer self-assembly (folding, binding) in vivo and in vitro involves the replacement of interactions with water by more favorable interactions between biopolymer functional groups.1 The ability (or inability) of solutes and Hofmeister salt ions to compete with water to interact with biopolymer functional groups results in often-large destabilizing (or stabilizing) effects on these assembled states.2,3 To understand the energetics of self-assembly and how solutes modulate these processes, the strength of interactions of functional groups with water relative to the strength of their interactions with one another must be determined.4 To accomplish this, new methodologies and analyses are required. Here, we use GB as proof of principle of a method and analysis that allow us to systematically obtain fundamental, previously unavailable, information about hydrogen bonding, ion pairing, and other interactions between biomolecular functional groups.
Glycine betaine (N,N,N-trimethyl glycine) is the most effective E. coli osmolyte characterized to date, allowing the cell to efficiently retain intracellular water, maintain cytoplasmic volume and dilution of cellular biopolymers, and therefore grow well under dehydrating conditions.5 Many bacterial pathogens accumulate cytoplasmic GB to adapt to osmotic stress, increasing their growth rate and thus affecting colonization and infectivity.5,6 In vitro, GB drives self-assembly: it strongly stabilizes site-specific protein-DNA complexes (e.g.7–10), moderately stabilizes the globular (folded) conformation of proteins,11,12 and promotes some (but not all) tertiary interactions in folded RNA.13 GB has little if any effect on the stability of all-AT DNA duplexes but destabilizes GC-containing nucleic acid duplexes.13,14 To explain several of these effects, GB has been proposed to be excluded from anionic oxygens (phosphate, carboxylate),15,16 the peptide backbone,17 and hydrocarbon groups.18
Effects of solutes like urea and GB (and noncoulombic Hofmeister effects of salts) on biopolymer processes like folding and binding are quantified by m-values, defined as derivatives with respect to solute or salt concentration (m3) of the observed standard free energy change of the process, ΔG°obs = −RT ln Kobs, where Kobs is the observed equilibrium concentration quotient (expressed in terms of concentrations and not thermodynamic activities):
In Eq. 1, Kγ is the quotient of biopolymer activity coefficients corresponding to the concentration quotient Kobs, and μ23 = RT dlnγ2/dm3; μ23 is closely related to the preferential interaction coefficient Γμ3,2,3 which can be predicted if the solute distribution near the biopolymer is known.19,20 (Throughout, subscripts 1, 2, and 3 refer to water, biopolymer or model compound, and glycine betaine, respectively.)
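Written out from the definitions above, the m-value relation plausibly takes the following form. This is a reconstruction rather than a quotation: it assumes ΔG°obs = −RT ln Kobs, a thermodynamic equilibrium constant K = Kobs·Kγ that does not depend on m3, and Δ denoting the difference between products and reactants of the process.

```latex
\[
m\text{-value} \equiv \frac{d\,\Delta G^{\circ}_{\mathrm{obs}}}{dm_3}
  = -RT\,\frac{d\ln K_{\mathrm{obs}}}{dm_3}
  = RT\,\frac{d\ln K_{\gamma}}{dm_3}
  = \Delta\mu_{23},
\qquad
\mu_{23} = RT\,\frac{d\ln\gamma_2}{dm_3}
\]
```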
Lee and Richards pioneered the calculation of water-accessible surface area (ASA) of proteins and model compounds, and initiated the use of ASA in analyses of protein stability.21 Subsequently, ASA-based analyses were used to interpret the thermodynamics of protein folding and other protein processes and to predict coupled folding vs. rigid body binding.22–24 The first systematic quantitative application of ASA to analyze solute effects on protein processes was the landmark study of Myers, Pace, and Scholtz,25 wherein the magnitudes of urea and GuHCl m-values for globular protein unfolding were observed to be proportional to the ΔASA of unfolding, calculated assuming an extended chain model for the unfolded state. Urea and GuH+ m-value/ΔASA ratios for unfolding of α-helical peptides are 3–4 fold larger than for globular proteins, which correlates with the 3–4 fold greater fraction of amide surface in the ΔASA of unfolding α-helical peptides as compared to globular proteins26 and demonstrates that the primary interaction of these denaturants is with amide surface.8,26
In the present study, we develop the use of vapor pressure osmometry (VPO) to quantify the thermodynamics of interaction of a solute like GB with model compounds that display the principal functional groups of proteins and nucleic acids. The osmolality Osm of a solution, which may be thought of as an effective (nonideality-corrected) total solute concentration, is directly related to the water activity (Osm = −55.5 ln a1). The excess osmolality of a three component solution (ΔOsm) is a quantitative, rigorous measure of the favorable or unfavorable interaction of the two solute components, relative to their interactions with water:
Hence, μ23/RT is the slope of a plot of ΔOsm, obtained from Eq. 3 for given choices of molal concentrations m2 and m3, vs. the m2m3 product. Here, we experimentally determine interactions of GB with model compounds containing the functional groups of biopolymers, interpret these as interactions of GB with the water accessible areas of the major different types of hydrocarbon, nitrogen and oxygen surfaces of biopolymers, and apply these values to predict GB interactions and GB effects on processes, using ASA information.
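The osmometric relations described in the two preceding paragraphs can be written out as follows. This is a reconstruction from the surrounding definitions. It assumes that ΔOsm is the osmolality of the three-component solution minus the osmolalities of the two corresponding two-component solutions at the same molalities; how these expressions map onto the numbered equations is not assumed here.

```latex
\[
\mathrm{Osm} = -55.5\,\ln a_1
\]
\[
\Delta\mathrm{Osm} \equiv \mathrm{Osm}(m_2, m_3) - \mathrm{Osm}(m_2, 0) - \mathrm{Osm}(0, m_3)
\]
\[
\Delta\mathrm{Osm} \approx \frac{\mu_{23}}{RT}\, m_2 m_3
\]
```

With this sign convention, a positive slope corresponds to an unfavorable interaction (exclusion of GB from the model compound) and a negative slope to a favorable one.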
Nitrilotriacetic acid trisodium salt monohydrate (>98%), glycine betaine monohydrate (>99%), potassium acetate (>99%), urea, mannitol, lysine and arginine hydrochlorides, potassium oxalate monohydrate and sodium benzoate (all >99.5%) were from Fluka. Sodium aspartate, sodium oxamate, potassium citrate tribasic monohydrate (all >98%), glycine, diglycine, sodium chloride, and potassium glutamate (all >99%) were from Sigma. Sodium chloride (>99%), dibasic potassium and sodium phosphates (>99%), sodium acetate (>99.5%), and sucrose (>99.9%) were from Thermo Fisher Scientific. Glycerol (>99.5%) was from Aldrich. Acetyl-Ala-methylamide (>99%) was from Bachem. All samples were dissolved in water purified with a Barnstead E-pure system (Thermo Fisher Scientific).
Samples were prepared using all-gravimetric methods in plastic, capped microfuge tubes. Typically, ~30–250 mg of the model compound (e.g. carboxylate salt) was added as the solid to a preweighed tube, and its weight was accurately determined using an analytical balance (Mettler). Approximately 1 mL of a glycine betaine solution of precisely known concentration (gravimetrically prepared on a larger scale) was added, and the tube was weighed again. Samples were mixed until the solid was completely dissolved and then stored at 4°C until reading, no more than 24 hours later. In some cases, separate stock solutions of the model compound and of GB were prepared gravimetrically. Since the purer form of commercially available GB is the monohydrate, these experiments were most conveniently performed as series in which the GB molality was held constant and the molality of the model compound was varied. To hold the concentration of the model compound constant in a series of experiments at different GB concentrations, the model compound solution was added first and its weight determined. Weights of GB solution and of water required to achieve a constant concentration of the model compound were calculated and added. To avoid addition of a fourth component as well as protonation of carboxylates of model compounds, no buffer component was added and solution pH (ranging from 5–11) was not adjusted. In no case was the pH of the model compound solution sufficiently acidic to protonate the carboxylate group of GB.
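Because GB is weighed as the monohydrate, the water of hydration has to be counted as solvent when a molality is computed from weights. The following is a minimal sketch of one way to do that bookkeeping. It is not the authors' procedure; the function name and the example masses are illustrative, and the molar masses (117.15 g/mol for anhydrous GB, 18.015 g/mol for water) are standard values rather than figures from the text.

```python
WATER_MOLAR_MASS = 18.015  # g/mol


def molality_from_hydrate(mass_hydrate_g, mass_added_water_g,
                          anhydrous_molar_mass, waters_per_formula=1):
    """Molality (mol solute per kg water) of a solution made from a hydrate.

    The water carried in by the hydrate is added to the weighed water.
    """
    hydrate_molar_mass = anhydrous_molar_mass + waters_per_formula * WATER_MOLAR_MASS
    moles_solute = mass_hydrate_g / hydrate_molar_mass
    water_from_hydrate_g = moles_solute * waters_per_formula * WATER_MOLAR_MASS
    total_water_kg = (mass_added_water_g + water_from_hydrate_g) / 1000.0
    return moles_solute / total_water_kg


GB_ANHYDROUS_MOLAR_MASS = 117.15  # g/mol, C5H11NO2

# Illustrative masses only: 0.01 mol of GB monohydrate plus enough water
# to bring the total water to 10 g.
example_molality = molality_from_hydrate(1.35165, 9.81985, GB_ANHYDROUS_MOLAR_MASS)
print(f"{example_molality:.4f} molal")
```

Ignoring the hydrate water in this example would overstate the molality by about 2%, which is comparable to the effects being measured.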
Osmometry was performed using a Wescor Vapro 5520 vapor pressure osmometer under conditions of controlled humidity. These experiments typically spanned the range of GB concentrations from 0.4 to 1.2 molal and model compound concentrations from 0.05 to 1.0 molal for tricarboxylate salts and 0.1 to 1.25 molal for monocarboxylate salts. Triplicate measurements of osmolality were performed on each sample. The thermocouple was cleaned extensively before (and often during) assays using 2 molar ammonium hydroxide and filtered deionized water (Barnstead E-Pure), and then calibrated extensively until stable readings were obtained. The calibration was checked frequently during the assay. Standard solutions of NaCl at osmolalities of 0.100, 0.290, and 1.000 osmolal for use in calibration of the osmometer were prepared gravimetrically at the appropriate molalities (0.05351, 0.1567, and 0.5422 molal NaCl, respectively), calculated from isopiestic distillation (ID) data.31 Use of these standards significantly improved the agreement between our VPO and literature ID data for NaCl, KCl, GB, and urea. Above 1 osmolal, the highest calibration osmolality, small systematic differences between VPO and literature ID data for NaCl were observed. Fitted values of these differences were used to correct experimental VPO data obtained for 1–3 osmolal solutions measured on the same osmometer.
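The three calibration standards imply an osmotic coefficient for NaCl at each molality. A minimal sketch of that arithmetic follows, assuming the usual definition Osm = φ·ν·m with ν = 2 ions per formula unit of NaCl; the function name is illustrative, and the molality/osmolality pairs are the ones quoted above.

```python
def osmotic_coefficient(osmolality, molality, ions_per_formula=2):
    """Osmotic coefficient implied by Osm = phi * nu * m."""
    return osmolality / (ions_per_formula * molality)


# (molality of NaCl, target osmolality) pairs for the calibration standards
nacl_standards = [(0.05351, 0.100), (0.1567, 0.290), (0.5422, 1.000)]

coefficients = [osmotic_coefficient(osm, m) for m, osm in nacl_standards]
for (m, osm), phi in zip(nacl_standards, coefficients):
    print(f"{m:.5f} molal NaCl -> {osm:.3f} osmolal, phi = {phi:.3f}")
```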
Various online resources were utilized to generate pdb files of the model compounds: PubChem (http://pubchem.ncbi.nlm.nih.gov), ChemDB (http://cdb.ics.uci.edu/), Biological Magnetic Resonance Data Bank (http://www.bmrb.wisc.edu), and the SMILES (simplified molecular input line entry specification) translator at http://cactus.nci.nih.gov/translate. Water-accessible surface areas (ASA) are calculated using Surface Racer32 with the Richards’ set of van der Waals radii33 and a 1.4 Å probe radius for water. Each ASA is divided into contributions from eight coarse-grained surface types (Tables S1-S2): aliphatic carbon, aromatic carbon, hydroxyl oxygen, amide oxygen, anionic carboxylate oxygen, anionic phosphate oxygen, amide nitrogen, and cationic nitrogen. A unified atom model is used, wherein hydrogens which are covalently bonded to these atoms are treated as part of the atom in calculating its van der Waals radius. For the lac DBD, the 20 conformers of PDB34 entry 1OSL35 were used as the model of the folded state; averages for total ASA and composition were calculated (as above) using the first 51 residues of each headpiece monomer (40 conformers). The web application ProtSA36,37 was used to generate an unfolded ensemble for these 51 residues, and water-accessible surface areas were calculated for 1919 conformations. The resulting average ΔASA composition, as well as details for other biopolymer ASA and ΔASA calculations, are presented in Table S2.
To obtain values of μ23/RT for interaction of GB with model compounds, values of ΔOsm (Eq. 2) were plotted as a function of the m2m3 product. No evidence for significant concentration dependence of μ23/RT or deviation from proportionality of ΔOsm to m2m3 is observed for interactions of GB with the nonelectrolytes, zwitterions, and salts studied. For all model compounds investigated, these data are well fit by a line with fixed zero intercept, as predicted for nonelectrolytes by Eq. 3. If the intercept is floated, small non-zero intercepts are observed, but these deviations are not systematic and do not have a significant effect on the slopes. Values of μ23/RT and corresponding uncertainties were obtained from the slopes of plots of ΔOsm vs. m2m3 with the intercept fixed at zero. Values of μ23/RT for interactions of GB with model compounds (with ASA and surface composition calculated as above) were analyzed by Igor 5.04B (multiple linear regression) to obtain values of GB interaction potentials (μ23/RTASA)i for different surface types.
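The zero-intercept fit described above can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code used in the study: the data points are synthetic, the function names are hypothetical, and the standard error uses n − 1 degrees of freedom because only the slope is fitted.

```python
import math


def excess_osmolality(osm_mixture, osm_solute2_only, osm_solute3_only):
    """Delta-Osm: three-component osmolality minus the two two-component ones."""
    return osm_mixture - osm_solute2_only - osm_solute3_only


def fit_through_origin(x, y):
    """Least-squares slope of y = slope * x, with its standard error."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    slope = sxy / sxx
    residual_ss = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    dof = len(x) - 1  # one fitted parameter
    std_err = math.sqrt(residual_ss / dof / sxx)
    return slope, std_err


# Synthetic example: m2*m3 products (molal^2) and excess osmolalities (osmolal)
m2m3 = [0.1, 0.2, 0.4, 0.8]
delta_osm = [0.021, 0.039, 0.081, 0.160]

mu23_over_rt, uncertainty = fit_through_origin(m2m3, delta_osm)
print(f"mu23/RT = {mu23_over_rt:.3f} +/- {uncertainty:.3f} per molal")
```

Fixing the intercept at zero reflects the expectation that ΔOsm vanishes when either solute is absent; floating the intercept is a useful check, as noted above, but adds a parameter the model does not call for.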
Effects of GB on binding lac repressor to SymL operator (40 bp) DNA were determined at 25°C, 0.40 M K+ by nitrocellulose filter binding of equilibrium mixtures of repressor (0.5 nM) and operator (0.1 nM) at GB concentrations from 0–1 molal.8,38 Binding constants are averages of four independent experiments, each with duplicate samples. The predicted contribution from burial of DNA phosphate O ASA in the repressor-operator interface (Fig. 3) is reduced by 0.25 m−1 to correct for the observed increase in KCl activity with increasing GB concentration (Table 1).
We used osmometry to quantify the interactions of GB with 23 model compounds containing carboxylate, phosphate, amide, hydroxyl, ammonium, guanidinium, and aliphatic and aromatic hydrocarbon moieties. (Literature solubility data for four cyclic dipeptides as a function of GB concentration (0–4M)39 were analyzed as well.) Values of ΔOsm quantifying interactions of GB with model compounds investigated to date are plotted as a function of the m2m3 product in Fig. 1A–C. In all cases, these plots are linear over the concentration ranges examined, demonstrating that μ23/RT for each GB interaction is independent of concentration. Values of μ23/RT obtained from the slopes (or solubility data) are listed in Table 1. From the plots of Fig. 1A–C, we observe:
Eight coarse-grained classes of surface were considered in the analysis of the osmometric data: aliphatic carbon, aromatic carbon, hydroxyl oxygen, amide oxygen, anionic carboxylate and phosphate oxygen, amide nitrogen, and cationic nitrogen. For salts, contributions of the interactions of GB with K+, Na+, or Cl− ions were also included. (Table S1 lists amounts of each type of surface for all model compounds investigated here, calculated as described in Methods.) Therefore, as a first level of interpretation, we dissect experimental values of Δμ23/RT or μ23/RT (see Eqs. 1 and 3) into additive contributions from chemically distinct, coarse-grained surface types.15,30,40 This is analogous to the approach of Tanford41 and Bolen,42 which assumes that a solute m-value for protein unfolding can be decomposed into additive contributions from the 20 side chains and the peptide backbone units exposed in unfolding. We propose that the contribution of each type of surface (i) to μ23/RT is the product of a solute interaction potential (contribution per unit of ASA; (μ23/RTASA)i) and the ASA of that surface. The experimental value of the chemical potential derivative μ23/RT is therefore represented as the sum of terms:
where the interaction potential (μ23/RTASA)i quantifies the interaction of the solute of interest with one Å2 of surface of type i on any compound or biopolymer, (ASA)i is the water accessible area in Å2 of surface type i on the model compound being analyzed, and νj(μ23/RT)j is the product of the number of salt ions (νj) per formula unit of a salt and the assigned contribution (μ23/RT)j of that type of ion to μ23/RT. The observed μ23/RT are model-independent thermodynamic quantities; the solute potentials (μ23/RTASA)i, which quantify the effect of the solute per unit area of a particular type of water-accessible surface on the biomolecule or model compound, require a structural model.
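Written out, the additive decomposition described above (Eq. 4) reads as follows. The notation follows the definitions in the text; the index i runs over surface types and j over the salt ions, and the second sum is absent for nonelectrolytes and zwitterions.

```latex
\[
\frac{\mu_{23}}{RT}
  = \sum_i \left(\frac{\mu_{23}}{RT\,\mathrm{ASA}}\right)_i (\mathrm{ASA})_i
  + \sum_j \nu_j \left(\frac{\mu_{23}}{RT}\right)_j
\]
```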
We use an ASA-based analysis of coarse-grained surface types instead of a functional group or atom-by-atom analysis for two reasons. First, the ASA of a particular type of surface takes account of variations in the accessibility of different functional groups or atoms resulting from the global conformation of the molecule and/or local steric effects of neighboring atoms or groups. Second, ASA is a fundamental variable in the solute partitioning model (SPM) of preferential interactions, which proposes that the hydration of a particular type of surface is proportional to its ASA. An SPM-based molecular thermodynamic analysis, using ASA, has been successfully applied to experimental data characterizing the effects of the spectrum of Hofmeister salts on the process of forming a nonpolar air-water surface43 and on processes which expose hydrocarbon and amide molecular surface to water.40
Each solute interaction potential is readily interpreted using the SPM;40 for a nonelectrolyte solute at low concentration:
In Eq. 5, Kp is the microscopic analog of a macroscopic thermodynamic partition coefficient (equilibrium concentration quotient) characterizing the distribution of a solute like GB or urea between the local water of hydration of a given type of biopolymer or model compound surface and bulk water, and b1 is the surface density of the local water.
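In the low-concentration limit, the SPM expression for an interaction potential is commonly written in the form below. This is a reconstruction consistent with the definitions of Kp and b1 given here, with 55.5 taken as the number of moles of water per kilogram; it is not quoted from the original equation.

```latex
\[
\left(\frac{\mu_{23}}{RT\,\mathrm{ASA}}\right)_i
  \approx -\,\frac{\left(K_{p,i} - 1\right) b_{1,i}}{55.5}
\]
```

In this form, complete exclusion of the solute from the local water (Kp = 0) gives a potential of b1/55.5, which is why an unfavorable potential sets a lower bound on the hydration b1 of that surface type.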
Initially, values of μ23/RT for interactions of GB with nine uncharged solutes (six amides, three polyols; cf. Fig. 1A and Table 1) were analyzed using Eq. 4 and ASA compositions to obtain values of reduced GB interaction potentials (μ23/RTASA) for amide O, amide N, hydroxyl O, and hydrocarbon C surface types (Table 2). Interactions of GB with these solutes were analyzed first because they are uncharged and so may interact more simply with the GB zwitterion than other model compounds in the data set, which are zwitterions or salts. Confirming the qualitative conclusions presented above, the results show that GB interacts unfavorably with amide O but favorably with amide N. Unfavorable interactions of GB with aliphatic C and hydroxyl O are smaller in magnitude.
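The first-stage fit described above is an ordinary multiple linear regression without an intercept: each model compound contributes one equation of the form of Eq. 4. The sketch below shows the idea with a dependency-free least-squares solver. Every number in it is synthetic: the ASA rows and the "true" potentials are hypothetical values chosen only to have plausible signs, not entries from Tables 1, 2, or S1, and the function names are illustrative.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_potentials(asa_rows, mu23_over_rt):
    """Least-squares interaction potentials from the normal equations."""
    k = len(asa_rows[0])
    ata = [[sum(row[i] * row[j] for row in asa_rows) for j in range(k)]
           for i in range(k)]
    aty = [sum(row[i] * y for row, y in zip(asa_rows, mu23_over_rt))
           for i in range(k)]
    return solve_linear(ata, aty)


# Columns: amide O, amide N, hydroxyl O, aliphatic C (synthetic ASA, in A^2)
asa_rows = [
    [40.0, 20.0, 0.0, 150.0],
    [35.0, 60.0, 0.0, 90.0],
    [0.0, 0.0, 120.0, 100.0],
    [0.0, 0.0, 200.0, 60.0],
    [45.0, 30.0, 0.0, 210.0],
    [30.0, 15.0, 50.0, 80.0],
]
# Hypothetical potentials (per molal per A^2), used only to generate test data
true_potentials = [3.0e-3, -1.5e-3, 0.5e-3, 0.2e-3]
observed = [sum(p * a for p, a in zip(true_potentials, row)) for row in asa_rows]

recovered = fit_potentials(asa_rows, observed)
print(recovered)
```

For the second stage described in the next paragraph, the contributions of the already-fitted surface types would be subtracted from each observed μ23/RT before fitting the remaining potentials.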
These GB interaction potentials were held constant in the application of Eq. 4 to analyze interactions of GB with the zwitterions and salts in Figs. 1B and 1C and Table 1; using the ASA information of Table S1, we obtained reduced GB interaction potentials μ23/RTASA for carboxylate O, phosphate O, aromatic C, and cationic (ammonium, guanidinium) N surface types (Table 2). In this analysis, we assigned a μ23/RT value of 0 m−1 to Na+. This value, which is qualitatively consistent with a small net interaction due to strong exclusion from hydrocarbon surface40 and a presumed strong favorable interaction with the carboxylate of GB,44 results in values of μ23/RT for K+ and Cl− of 0.09 m−1 and −0.03 m−1. The most unfavorable potentials are for anionic oxygens, and the interaction with phosphate O is more unfavorable than that with carboxylate O. These results provide a higher resolution separation of the previously-reported composite GB-anionic oxygen interaction potential, obtained from an analysis of GB-DNA and GB-protein interactions and GB effects on protein folding.8 Interpretation of the GB-phosphate O and GB-carboxylate O interaction potentials (Table 2) using the SPM (Eq. 5) indicates that the extents of hydration of anionic carboxylate and phosphate oxygens are at least 0.16±0.01 H2O Å−2 (~2 layers) and 0.27±0.02 H2O Å−2 (~3 layers), respectively. We deduce that GB is highly excluded from this water because of its inability to compete with water as a hydrogen bond donor to interact with these oxygens.
The favorable interaction of GB with cationic nitrogens (Table 2) is presumably due to a hydrogen bonded ion pair or salt bridge between the ammonium or guanidinium group (donor) and the anionic carboxylate of GB (acceptor). If the hydration of these cationic nitrogens (in the absence of GB) is two layers of water (as determined for hydrocarbon40 and anionic carboxylate oxygen surfaces), then the partition coefficient characterizing this interaction is Kp=1.3±0.1. The GB-amide O interaction potential is similar in sign and magnitude to that for carboxylate oxygen, indicating that GB is quite highly excluded from the water of hydration of amide oxygens (Kp ~0.1). The chemical basis of this exclusion, likewise, must be the inability of GB to compete with water to hydrogen bond to the amide oxygen. The favorable preferential interaction of GB with amide nitrogen (similar to that with cationic nitrogen) likely indicates that the amide NH is a better hydrogen bond donor than water for the GB carboxylate oxygen acceptor. Interpretation of this favorable interaction using the SPM yields a GB-amide N partition coefficient Kp=1.5±0.1. Auton and Bolen concluded that GB is excluded from the peptide backbone;17 our potentials quantify this moderate (net) exclusion and decompose it into contributions from the oxygen, carbon, and nitrogen surfaces.
The most favorable interaction determined here is that between GB and the aromatic carbon surface of the benzyl group of benzoate. Since the hydration of benzyl surface is 0.18 H2O Å−2,40 the GB-benzyl partition coefficient is 1.7±0.2. This is almost certainly a cation-pi interaction between the (CH3)3N+ group of GB and aromatic surface, like those observed in the crystal structures of complexes of GB with two proteins involved in its transport across the periplasmic membrane. The periplasmic binding protein ProX45,46 and the transporter protein BetP47 both feature GB trapped in a “box” formed by aromatic side chains, and the pore of BetP is lined with aromatic amino acids. The strength of the cation-pi interaction between GB and the benzyl group supports the structural proposals of the key role of cation-pi interactions in both the binding and transport of GB.
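The conversion between an interaction potential, a hydration density b1, and a partition coefficient Kp can be sketched as below. The functions assume the low-concentration SPM form (μ23/RTASA) ≈ −(Kp − 1)·b1/55.5, which is a reconstruction rather than a quoted equation, and the function names are illustrative. The inputs are the hydration and Kp values stated in the text; the potentials printed are what those inputs imply under this assumed form, not values taken from Table 2.

```python
MOLES_WATER_PER_KG = 55.5


def potential_from_kp(kp, b1):
    """Interaction potential (per molal per A^2) for a given Kp and hydration b1."""
    return -(kp - 1.0) * b1 / MOLES_WATER_PER_KG


def kp_from_potential(potential, b1):
    """Partition coefficient implied by a potential and a hydration b1."""
    return 1.0 - potential * MOLES_WATER_PER_KG / b1


def minimum_hydration(potential):
    """Lower bound on b1 (H2O per A^2): the value for complete exclusion, Kp = 0."""
    return potential * MOLES_WATER_PER_KG


# Aromatic (benzyl) surface: b1 = 0.18 H2O/A^2 and Kp = 1.7, as stated in the text
benzyl_potential = potential_from_kp(1.7, 0.18)
print(f"implied aromatic C potential: {benzyl_potential:.2e}")

# Carboxylate O: a minimum hydration of 0.16 H2O/A^2 corresponds to Kp = 0
carboxylate_potential = potential_from_kp(0.0, 0.16)
print(f"implied carboxylate O potential: {carboxylate_potential:.2e}")
```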
Values of μ23/RT for all model compounds studied were calculated using the results of the global fit (Table 2) and are compared to the experimentally observed values in Table 1 (see also Fig. S1). Agreement is generally very good, showing the merit of the approach and of the assignment of a single-ion μ23/RT value of 0 for Na+.
Using GB interaction potentials from Table 2 and ASA analyses from structural data (Table S2) as inputs to Eq. 4, we predict the strength (μ23) of GB-biopolymer interactions and of GB m-values (Δμ23) for protein folding and protein-DNA binding. Figure 2 is a log-log plot comparing predicted and observed magnitudes of μ23/RT (or Δμ23/RT) for both model compounds and biopolymers. Agreement is quantitative (within ~15%) for interaction of GB with double-stranded DNA and effects of GB on folding of the lac repressor DNA binding domain (DBD)15 and binding of lac repressor to lac operator DNA. Predictions for interactions of GB with native BSA and HEWL are only semiquantitative; both are 40–50% less than the observed values (Fig. S2).
GB effects on processes involving DNA can be attributed in large part to the extraordinary exclusion of GB from anionic phosphate oxygens. Exclusion of GB from the 23,700 Å2 of anionic phosphate oxygen ASA of 160bp DNA is predicted to contribute 120 m−1 to the observed value of μ23/RT (130±13; see Fig. S2), and the 320 K+ counterions are predicted to contribute an additional 30 m−1. Since contributions of other surface types are predicted to be small and somewhat compensating, these two contributions determine the observed exclusion of GB from duplex DNA.
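The two dominant terms for 160bp DNA can be checked with a few lines of arithmetic. This sketch uses only numbers quoted in the text (the phosphate O ASA and its predicted contribution, the 320 K+ counterions, and the single-ion value of 0.09 m−1 for K+ reported above); the per-area figure it prints is simply the ratio of the two quoted phosphate numbers, and the variable names are illustrative.

```python
phosphate_o_asa = 23700.0          # A^2 of anionic phosphate O on 160 bp DNA
phosphate_o_contribution = 120.0   # predicted contribution to mu23/RT, per molal
k_ions = 320                       # K+ counterions per 160 bp duplex
k_ion_mu23_over_rt = 0.09          # assigned single-ion value for K+, per molal

# Per-area potential implied by the two quoted phosphate numbers
implied_phosphate_potential = phosphate_o_contribution / phosphate_o_asa

# Counterion term of the additive model: nu_j * (mu23/RT)_j
counterion_contribution = k_ions * k_ion_mu23_over_rt

print(f"implied phosphate O potential: {implied_phosphate_potential:.2e} per molal per A^2")
print(f"K+ contribution: {counterion_contribution:.1f} per molal (quoted as ~30)")
print(f"sum of the two terms: {phosphate_o_contribution + counterion_contribution:.0f} per molal")
```

The sum of these two terms is somewhat above the observed 130±13 m−1, consistent with the statement that the remaining surface types make small, partly compensating contributions.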
Site-specific binding of proteins to helical DNA partially or fully dehydrates and buries significant amounts of anionic DNA phosphate oxygen surface. Since GB is completely excluded from the water of hydration of this surface on free DNA, the reduction in water activity caused by addition of GB drives formation of these complexes. Figure 3 is a semilog plot of binding constants Kobs as a function of GB concentration (0–1 molal), determined from GB titrations of mixtures of lac repressor tetramer and 40bp lac operator DNA at constant salt molality. From the slope, the GB m-value/RT (i.e. Δμ23/RT) is 2.6±0.4 m−1, ~25% larger than that previously estimated from experiments at higher GB concentration.8 The burial of 6900 Å2 of C, N, and O surface in the protein-DNA and protein-protein interfaces formed in complexation (Table S2) gives rise to a net predicted GB m-value/RT=2.7 m−1 at constant salt molality, in good agreement with experiment (Fig. 3). Formation of the repressor-operator interface, which buries 630 Å2 of anionic DNA phosphate O surface, is predicted to contribute 2.2 m−1 (81%) to the GB slope; folding the hinge helices, which buries 520 Å2 of amide surface, is predicted to contribute 0.5 m−1 (19%); and formation of the core repressor-DBD interface is predicted to make no significant contribution due to small compensating effects (Fig. 3).
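The budget of predicted contributions, and what the observed slope means for Kobs, can be made concrete with a short calculation. The contributions are the values quoted above. Taking the slope of ln Kobs versus GB molality as +2.6 m−1 is an assumption about sign convention, made because GB stabilizes the complex, and the assumption of a constant slope over 0–1 molal follows from the linear semilog plot described in the text.

```python
import math

# Predicted contributions to the GB m-value/RT for repressor-operator binding
predicted = {
    "repressor-operator interface (phosphate O burial)": 2.2,
    "hinge helix folding (amide burial)": 0.5,
    "core repressor-DBD interface": 0.0,
}

total_predicted = sum(predicted.values())
shares = {name: 100.0 * value / total_predicted for name, value in predicted.items()}
for name, share in shares.items():
    print(f"{name}: {predicted[name]:.1f} per molal ({share:.0f}%)")

# Observed slope magnitude of ln(Kobs) vs GB molality
observed_slope = 2.6
fold_increase_at_1_molal = math.exp(observed_slope * 1.0)
print(f"predicted total: {total_predicted:.1f}; observed: {observed_slope:.1f}")
print(f"Kobs increases about {fold_increase_at_1_molal:.1f}-fold from 0 to 1 molal GB")
```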
The GB m-value/RT for unfolding of the globular lac repressor DNA binding domain is 1.2±0.1 m−1 at low GB concentration.11 From the potentials of Table 2 and ΔASA of Table S2, weak exclusion of GB from aliphatic hydrocarbon surface (63% of ΔASA) is predicted to contribute 0.6 m−1 to the GB m-value. Exclusion from amide oxygen and accumulation at amide nitrogen contributes a net 0.8 m−1, and exposure of 150 Å2 of tyrosine aromatic ASA in unfolding contributes −0.3 m−1. Other contributions are smaller and largely compensate each other. The predicted m-value/RT for the stabilizing effect of GB on unfolding is 1.0 m−1, within 20% of the experimental value. We conclude that GB modestly stabilizes globular proteins like the lac DBD because of roughly equal contributions from exclusion of GB from amide O and aliphatic C surface, the effects of which are mitigated by accumulation at aromatic C and amide N surface.
Our results explain why GB is such an effective osmolyte: it is unable to compete with the water of hydration of anionic (phosphate, carboxylate) and amide oxygens of cellular biopolymers, lipids, metabolites and GB itself. Cytoplasmic GB is therefore concentrated in the remaining intracellular water, thus increasing cytoplasmic osmolality more than a uniformly distributed or accumulated solute could. Our results also predict that steps in biopolymer assembly mechanisms and other processes which bury (or expose) anionic or amide oxygen surface, as in forming a protein-nucleic acid interface, will be especially sensitive to changes in GB concentration,8,48 making GB a valuable quantitative probe in mechanistic studies. How cells maintain regulation of protein-nucleic acid interactions of gene expression, replication, and other nucleic acid processes while accumulating strongly perturbing osmolytes to high concentrations during osmotic stress remains an in vivo-in vitro paradox.5
The authors thank the reviewers for their thoughtful comments and suggestions and Jorge Estrada for providing the pdb files for the lacDBD unfolded ensemble.
*This work was supported by National Institutes of Health Grants GM47022 and GM23467 (to M.T.R.).
Tables containing water accessible surface areas (ASA) or ΔASA for model compounds, biopolymers, and biopolymer processes. Figure comparing experimental values of μ23/RT for model compounds with those calculated from the global fit of the experimental data. Figure illustrating the determination of ΔOsm using VPO data on 0.005m BSA solutions as a function of GB concentration and comparing ΔOsm values for BSA, DNA, and HEWL. This material is available free of charge via the Internet at http://pubs.acs.org.