Eur Biophys J. Author manuscript; available in PMC 2011 February 1.
Published in final edited form as:
PMCID: PMC2872189
NIHMSID: NIHMS199084

The implementation of SOMO (SOlution MOdeller) in the UltraScan analytical ultracentrifugation data analysis suite: enhanced capabilities allow the reliable hydrodynamic modeling of virtually any kind of biomacromolecule

Abstract

The interpretation of solution hydrodynamic data in terms of macromolecular structural parameters is not a straightforward task. Over the years, several approaches have been developed to cope with this problem, the most widely used being bead modeling in various flavors. We report here the implementation of the SOMO (SOlution MOdeller; Rai et al. in Structure 13:723–734, 2005) bead modeling suite within one of the most widely used analytical ultracentrifugation data analysis software packages, UltraScan (Demeler in Modern analytical ultracentrifugation: techniques and methods, Royal Society of Chemistry, UK, 2005). The US-SOMO version is now under complete graphical interface control, and has been freed from several constraints present in the original implementation. In the direct beads-per-atoms method, virtually any kind of residue as defined in the Protein Data Bank (e.g., proteins, nucleic acids, carbohydrates, prosthetic groups, detergents, etc.) can now be represented with beads whose number, size and position are all defined in user-editable tables. For large structures, a cubic grid method based on the original AtoB program (Byron in Biophys J 72:408–415, 1997) can be applied either directly on the atomic structure, or on a previously generated bead model. The hydrodynamic parameters are then computed in the rigid-body approximation. An extensive set of tests was conducted to further validate the method, and the results are presented here. Owing to its accuracy, speed, and versatility, US-SOMO should allow users to take full advantage of the potential of solution hydrodynamics as a complement to higher resolution techniques in biomacromolecular modeling.

Keywords: Macromolecular hydrodynamics, Bead modeling, Analytical ultracentrifugation, Protein structure and dynamics, NMR spectroscopy, X-ray crystallography

Introduction

Solution hydrodynamic parameters of macromolecules, such as the translational (Dt) and rotational (Dr) diffusion and sedimentation (s) coefficients, and the intrinsic viscosity ([η]), can be experimentally determined by well-established techniques. Since the size, detailed shape, and time-dependent conformation determine the macromolecules’ frictional properties, the computation of these parameters from their structures has been a field of intense research. These calculations are, however, not straightforward. Well-defined geometrical objects, such as cylinders and ellipsoids, were initially used to build very low resolution models of proteins (Tanford 1961; Cantor and Schimmel 1980) and other biopolymers, and are still in use today in an enhanced version (Harding et al. 2004). A big step forward was the development of the theory for the computation of translational and rotational frictional coefficients and intrinsic viscosity of ensembles of non-overlapping spheres (beads) of differing radii (reviewed in García de la Torre and Bloomfield 1981; Spotorno et al. 1997; Carrasco and García de la Torre 1999). This procedure has been extended in a number of different ways to model proteins and other biomacromolecules of known 3D structure, ranging from shell modeling to grid-based methods (see Byron 2000). However, the calculation of the hydrodynamic parameters of an ensemble of beads can be computationally demanding, requiring a compromise between the bead model resolution and the number and size of the beads employed. Furthermore, the effect of the so-called water of hydration (Halle and Davidovic 2003) should be correctly taken into account (see Rai et al. 2005).

Currently, three principal bead modeling methods are available, implemented in public-domain computer programs. A “grid” method was implemented by O. Byron in the program AtoB (Byron 1997). Here, the protein is subdivided into equally sized cubes and each residue is assigned to a particular cube. Then, according to user choice, beads of either equal or differing radii are generated and placed at the center of gravity of each cube, the resolution of the final model depending on the spacing of the initial cubic grid. AtoB (Byron 1997) was tested against a large globular protein (aldolase) and a spherical hollow protein (apoferritin). Bead models were generated and the calculated s(20,w)0 and [η] values agreed well with experimental values, provided that an appropriate grid spacing was used and after radial expansion of the beads to compensate for the water of hydration. Zipper and Durchschlag (1997, 1998) also followed a similar approach, and the usefulness of such methods appears to rest mainly in the modeling of very large structures.
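As a rough illustration of the grid idea described above, the following Python sketch bins points (atoms or residue centroids) into cubes of a chosen spacing and places one bead at the mass-weighted center of each occupied cube. It is not the AtoB code: the function name `grid_beads`, the input tuple layout, and the volume-conserving choice of bead radius are assumptions made for this example only.

```python
import math
from collections import defaultdict


def grid_beads(points, spacing):
    """Reduce points to one bead per occupied cubic grid cell.

    points  : iterable of (x, y, z, mass, volume) tuples (hypothetical layout)
    spacing : edge length of the cubic grid, same length unit as x, y, z
    Returns a list of (x, y, z, radius, mass) beads.
    """
    cells = defaultdict(list)
    for p in points:
        # Integer cell index along each axis
        key = tuple(math.floor(c / spacing) for c in p[:3])
        cells[key].append(p)

    beads = []
    for members in cells.values():
        mass = sum(m[3] for m in members)
        # Mass-weighted center (center of gravity) of the cell contents
        center = tuple(sum(m[i] * m[3] for m in members) / mass for i in range(3))
        # Assumption: bead volume equals the summed volume of the members
        volume = sum(m[4] for m in members)
        radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
        beads.append((*center, radius, mass))
    return beads
```

With this kind of scheme a coarser spacing yields fewer, larger beads, which is the resolution trade-off mentioned in the text.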

At the other end of the spectrum lies the “shell modeling” approach implemented in the currently most widely used bead modeling program, Hydropro, developed by García de la Torre and collaborators (García de la Torre et al. 2000; García de la Torre 2001). In this approach, all atoms in a protein are first replaced by equally sized beads of a certain radius. Then, the surface of this “primary” model is covered with a “shell” of smaller beads, and the procedure is iterated, decreasing the shell beads’ radius, allowing extrapolation to zero bead size. This approach has undergone more extensive testing (García de la Torre et al. 2000; García de la Torre 2001), and the models can on average reasonably reproduce the hydrodynamic parameters determined experimentally (albeit without a critical evaluation of the literature data, see below). Furthermore, to reach a consensus agreement across the test proteins, the primary beads’ radius was adjusted until a mean satisfactory value was found. In addition, to avoid excessive memory requirements and very long computing times, Hydropro currently has an upper limit of ~3,000 shell beads, whose radius is a function of the protein’s size, potentially limiting its precision when large structures are analyzed.

A third approach is to build a bead model with direct correspondence between the atoms in the macromolecules’ residues, such as amino acids in proteins, sugar units in carbohydrates and nucleosides in nucleic acids, and the beads that are used to represent them. This approach can overcome some of the limitations of the other methods, and was chosen for the development of SOMO (SOlution MOdeller; Rai et al. 2005), where, for instance, amino acid residues are represented each by two beads, one for the main-chain and another for the side-chain segments. The beads’ volumes are initially determined by summing the volumes of the atoms which they represent, and are then augmented by adding the volume of the water molecules which were experimentally found to be statistically bound to each residue (e.g., Kuntz and Kauzmann 1974). Beads are positioned according to the characteristics of the residues they represent, and the overlaps between them are then removed by proportional radial reduction, trying to preserve the original anhydrous surface envelope as much as possible. This is aided by an accessible surface area (ASA) computation initially performed on the atomic structure to separate exposed beads from buried beads. Buried beads can then be excluded from subsequent hydrodynamic parameter computations, which in the original SOMO implementation were carried out separately by the program SUPCW (Spotorno et al. 1997; Rai et al. 2005). SOMO was extensively tested against three small proteins, BPTI, RNase A and lysozyme, for which a very large body of hydrodynamic data exists (which were critically evaluated), plus two larger proteins, fibrinogen fragment D and citrate synthase dimer, with very good results (Rai et al. 2005). SOMO has already been instrumental in discriminating between alternative conformations of integrins in a recently published study (Rocco et al. 2008). However, the original SOMO implementation suffered from a number of drawbacks.
SOMO consisted of a collection of separate, command-line driven executables running under the Linux operating system, with a rather rigid user interface, recognizing only residues hard-coded in the programs. To overcome these flaws, an entirely re-designed and enhanced version of the SOMO program was developed by the authors by integrating the basic functionality of SOMO under the open source software UltraScan (US). First, we added a graphical user interface (GUI) and replaced the hard-coded residue representation by implementing user-modifiable reference tables which code for the atomic groups and residues present in the Protein Data Bank (PDB; Berman et al. 2000) structures. A full range of options controlling details of the modeling process and of the hydrodynamic computations (performed with an integrated version of SUPCW; see Spotorno et al. 1997 and Rai et al. 2005) can be accessed through dedicated menus. In the process, we have also corrected some mistakes that went unnoticed in the original SOMO release, and added new features. Finally, a module for the creation of bead models based on the AtoB grid method (Byron 1997), further developed by M. Nöllmann (Centre de Biochimie Structurale, CNRS-INSERM, Montpellier, FR) for the original SOMO program (Rai et al. 2005), has been coded for US-SOMO. This allows either a further reduction of resolution starting from a previously generated bead model, or the direct generation of grid-based models from PDB files or small-angle X-ray scattering (SAXS)-derived dummy atom models. This first US-SOMO release was tested with an expanded number of X-ray crystallography and NMR spectroscopy structures, as presented by García de la Torre (2001), whose experimental hydrodynamic parameters were, however, critically re-evaluated. In the Electronic Supplementary Material (ESM) of this paper, a detailed description of the operation and main features of US-SOMO is presented.
The very satisfactory results of US-SOMO in reproducing most experimental parameters of the test proteins are here reported and discussed, highlighting its potential as a powerful tool for many hydrodynamic modeling applications.

Methods

US-SOMO implementation: general layout, reference tables and options

In Fig. 1 we present the main GUI panel of the new SOMO implementation under UltraScan, which can be accessed from the US “Simulation” drop down menu. The program is divided into three sub-menus. “Modify Lookup Tables:” refers to the reference files needed to operate the program; settings of various modeling and computational options are listed under “Modify SOMO Options for:”; and “Run SOMO Program:” refers to the various runtime operations. The right-side window updates the user about the operation(s) in progress. The US-SOMO operations are described in detail in the ESM.

Fig. 1
Main panel of the US-SOMO program, shown after processing the 8RAT.pdb file. The font size in the right-side window has been artificially reduced to show the entire process

At the core of the program lies its capability to read and interpret PDB-formatted structural files. US-SOMO will load a PDB file and recognize only the relevant records, discarding all others. Currently, these include the ATOM, HETATM, MODEL, ENDMDL, TER, and END records. Within the ATOM and HETATM records, US-SOMO extracts and loads the atom name, the residue name, the chain identifier, the residue sequence number, and the x, y and z coordinates into appropriate data structures. The atom and residue names are then compared with the records present in the somo.residue table, which can be edited by the user through a pop-up window accessed by pressing the “Add/Edit Residue” button in the main panel.
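The record filtering and field extraction described above can be sketched in a few lines of Python. This is only an illustration based on the fixed-column layout of the PDB format, not the US-SOMO C++ reader; the function name `parse_pdb_atoms` and the dictionary keys are hypothetical.

```python
def parse_pdb_atoms(lines):
    """Extract the fields named in the text from ATOM/HETATM records.

    Column positions follow the fixed-width PDB format. Other record
    types are skipped, and reading stops at the END record.
    """
    atoms = []
    model = None  # stays None for files without MODEL records
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            model = int(line[10:14])          # model serial, columns 11-14
        elif record in ("ATOM", "HETATM"):
            atoms.append({
                "model": model,
                "atom_name": line[12:16].strip(),  # columns 13-16
                "res_name": line[17:20].strip(),   # columns 18-20
                "chain_id": line[21].strip(),      # column 22
                "res_seq": int(line[22:26]),       # columns 23-26
                "x": float(line[30:38]),           # columns 31-38
                "y": float(line[38:46]),           # columns 39-46
                "z": float(line[46:54]),           # columns 47-54
            })
        elif record == "END":
            break
    return atoms
```

Each returned entry carries the (residue name, atom name) pair that would then be looked up in a residue table such as somo.residue.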

Each residue type present in the PDB file must be correctly described in the somo.residue table, and, in order to have maximum flexibility in coding for all possible residues, two other tables were defined. In the first one, somo.hybrid, the different atomic groups are listed, together with their fundamental properties, i.e., the mass and the atomic van der Waals (VdW) radius, according to the “hybridizations” described by Tsai et al. (1999). The current content of the somo.hybrid file is shown in Table S1 in the ESM, and users can edit the current definitions or add new atomic groups through the “Add/Edit Hybridization” menu (not shown). The atomic groups listed in somo.hybrid are then used to build the somo.atom table through the “Add/Edit Atom” menu (not shown). A brief excerpt of the current entries in this table is shown in Table S2 in the ESM, which shows that the PDB coding for atoms does not discriminate between different hybridization states. For instance, CB is bound to three hydrogen atoms in alanine (C4H3), to just one in valine, isoleucine and threonine (C4H1), and to two in all other amino acids (C4H2), implying a mass difference as shown in Table S2. A more profound difference is found, as an example, between the CG in leucine, having four single bonds and one hydrogen atom bound (C4H1), and that in histidine, having two single bonds and one double bond, and no hydrogens bound (C3H0). In this case, not only the molecular weight but also the atomic VdW radius is different. Thus, the somo.atom file allows selection of the correct atom name/molecular weight/VdW radius combination for each of the atoms within a residue. A full description of the operations necessary to enter/edit a residue in the somo.residue table is presented in the ESM.
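The two-level lookup described above (PDB atom name within a residue, then atomic group, then physical properties) can be pictured with a small Python sketch. The dictionaries below are illustrative stand-ins, not the contents or file format of somo.hybrid and somo.atom; the group masses are simply computed from standard atomic weights, and the VdW radii from Tsai et al. (1999) that the real tables also carry are deliberately left out rather than guessed.

```python
# Standard atomic weights (g/mol)
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008}

# Hypothetical stand-in for somo.hybrid: group label -> (heavy atom, bound H)
HYBRID = {
    "C4H3": ("C", 3),
    "C4H2": ("C", 2),
    "C4H1": ("C", 1),
    "C3H0": ("C", 0),
}

# Hypothetical stand-in for somo.atom: (residue, PDB atom name) -> group label
ATOM_TABLE = {
    ("ALA", "CB"): "C4H3",
    ("SER", "CB"): "C4H2",
    ("THR", "CB"): "C4H1",
    ("LEU", "CG"): "C4H1",
    ("HIS", "CG"): "C3H0",
}


def group_mass(res_name, atom_name):
    """Mass of the atomic group (heavy atom plus bound hydrogens)."""
    element, n_h = HYBRID[ATOM_TABLE[(res_name, atom_name)]]
    return ATOMIC_WEIGHT[element] + n_h * ATOMIC_WEIGHT["H"]


# The same PDB name "CB" maps to different groups in different residues
for key in [("ALA", "CB"), ("SER", "CB"), ("THR", "CB"), ("HIS", "CG")]:
    print(key, ATOM_TABLE[key], round(group_mass(*key), 3))
```

Keying the second table on the (residue, atom name) pair is what lets an ambiguous PDB name such as CB or CG resolve to the right group.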

Technical details

US-SOMO is written in C++ and linked against the UltraScan (Demeler 2005) and Qt (TrollTech.com: Qt: a cross-platform application framework. http://www.trolltech.com/) libraries. The code is licensed under the GPL (The GNU General Public License Version 3. http://www.gnu.org/copyleft/gpl.html) and can be downloaded from the UltraScan wiki (The UltraScan Trac Wiki. http://wiki.bcf.uthscsa.edu/ultrascan/). Binaries for all major platforms (Linux/X11, Microsoft Windows, Macintosh OS-X) can be downloaded from the UltraScan website at http://www.ultrascan.uthscsa.edu.

Experimental hydrodynamic data

All experimental hydrodynamic parameters of the proteins used to test US-SOMO were taken from the literature, but with a critical evaluation of the conditions used and of the correctness, whenever possible, of their reduction to standard conditions (water at 20°C). A full list is presented in the ESM, with the appropriate references.
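For readers unfamiliar with the reduction to standard conditions mentioned above, the usual textbook corrections for the diffusion and sedimentation coefficients can be sketched as follows. This is a generic illustration, not the procedure used to build the ESM tables; the function names are hypothetical, the water constants are rounded reference values, and the partial specific volume is assumed to be the same in the experimental solvent and in water at 20 °C.

```python
T_STD = 293.15    # 20 °C in kelvin
ETA_W20 = 1.002   # viscosity of water at 20 °C, mPa·s (rounded)
RHO_W20 = 0.9982  # density of water at 20 °C, g/cm^3 (rounded)


def d_to_20w(d_obs, temp_k, eta_solvent):
    """Diffusion coefficient corrected to water at 20 °C.

    D scales as T / eta, so the observed value is multiplied by
    (T_STD / T) * (eta_solvent / ETA_W20).
    """
    return d_obs * (T_STD / temp_k) * (eta_solvent / ETA_W20)


def s_to_20w(s_obs, vbar, rho_solvent, eta_solvent):
    """Sedimentation coefficient corrected to water at 20 °C.

    s scales as (1 - vbar * rho) / eta; vbar (cm^3/g) is assumed
    unchanged between the two conditions.
    """
    buoyancy = (1.0 - vbar * RHO_W20) / (1.0 - vbar * rho_solvent)
    return s_obs * buoyancy * (eta_solvent / ETA_W20)
```

Both corrections depend on the solvent viscosity actually used in the experiment, which is why the experimental conditions have to be checked before literature values are compared with computed ones.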

Protein structures

The high-resolution structures of the test proteins were taken from the PDB (http://www.rcsb.org/pdb/home/home.do). Whenever possible, we sought structures deriving from the same species from which the solution data were available. This explains some differences between the structures we have employed and those previously used (García de la Torre et al. 2000; García de la Torre 2001). The completeness of each structure was checked and ensured at two levels: missing atoms within side chains were automatically added by the WHATIF webserver (Vriend 1990; http://swift.cmbi.ru.nl/servers/html/index.html) while missing residues were mostly manually modeled using O (Jones et al. 1991). A relatively long C-terminal sequence in nitrogenase MoFe was generated by Robetta (Chivian et al. 2005; http://robetta.bakerlab.org/) using the ab initio protocol (Bonneau et al. 2002), and then pasted in the original structure using O.

Results and discussion

The US-SOMO implementation was first thoroughly tested against the original SOMO software (Rai et al. 2005). In the process, several minor bugs were fixed, the most significant of which involved an incorrect formulation of the outward translation when reducing exposed side chain beads (see Rai et al. 2005). Of the two ASA algorithms implemented, SurfRace (Tsodikov et al. 2002) was found to be very reliable for small, compact structures, but presented some problems with larger, multisubunit structures. Therefore, in all subsequent work we used the ASAB1 option based on the Lee and Richards (1971) rolling sphere method, which is also the only option implemented for re-checking the beads’ exposure after overlap reduction. Another change affects the threshold detection for activating bead fusion (“popping”), which is now done by computing the intersection volume of pairs of beads. The pair of beads is fused when the volume of either bead multiplied by the user-defined percentage overlap is less than the volume of intersection. The volume of the fused bead is the total volume of the pair of beads.
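The intersection-volume test for bead fusion can be made concrete with a short Python sketch. It uses the standard closed-form volume of the lens shared by two overlapping spheres and reads the criterion as: fuse once the intersection volume exceeds the threshold fraction of either bead's volume. That reading, the function names, and the omission of where the fused bead is placed are assumptions of this example, not a transcription of the US-SOMO code.

```python
import math


def sphere_volume(r):
    return 4.0 / 3.0 * math.pi * r ** 3


def intersection_volume(r1, r2, d):
    """Volume shared by two spheres of radii r1, r2 whose centers are d apart."""
    if d >= r1 + r2:
        return 0.0                            # no overlap
    if d <= abs(r1 - r2):
        return sphere_volume(min(r1, r2))     # smaller sphere fully inside
    return (math.pi * (r1 + r2 - d) ** 2
            * (d * d + 2.0 * d * (r1 + r2) - 3.0 * (r1 - r2) ** 2)
            / (12.0 * d))


def should_fuse(r1, r2, d, threshold_percent):
    """True when the overlap exceeds the threshold for either bead."""
    v_int = intersection_volume(r1, r2, d)
    frac = threshold_percent / 100.0
    return v_int > frac * sphere_volume(r1) or v_int > frac * sphere_volume(r2)


def fused_radius(r1, r2):
    """Radius of a bead whose volume is the total volume of the pair."""
    return (r1 ** 3 + r2 ** 3) ** (1.0 / 3.0)
```

For two unit spheres one radius apart the shared volume is 5π/12, about 31% of either sphere, so under this reading the pair would be fused with a 30% threshold but not with a 40% one.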

The testing against protein structures was performed in three phases. In the first, multiple structures for the same protein, originating from both X-ray crystallography and NMR spectroscopy, were used. For the latter, the hydrodynamic parameters computed for each of the multiple conformations present in the models were averaged. The test proteins chosen, for which an extensive body of experimental hydrodynamic data exists, are the same utilized in Rai et al. (2005), namely bovine pancreatic trypsin inhibitor (BPTI), bovine pancreatic ribonuclease (RNase), and hen egg white lysozyme, to which myoglobin was added. A second set included the other proteins utilized by García de la Torre and collaborators (García de la Torre et al. 2000; García de la Torre 2001), excluding some less characterized proteins (trypsin, pepsin and subtilisin). Instead, we have examined in more detail hemoglobin, glyceraldehyde-3-phosphate dehydrogenase (G3PD) and lactate dehydrogenase (LDH), for which data and structures coming from different species exist. For these two sets, we computed and compared Dt(20,w)0, s(20,w)0, τc(20,w)h, and [η], whose experimental values were critically assessed as reported in Tables S3–S5 of the ESM. Finally, only the τc(20,w)h values were computed for the full protein set presented in Table 2 of García de la Torre (2001), after recalculation of the reduction to standard conditions of the experimental values as presented in Table S6. In all our modeling, we kept some options fixed, including: (1) a popping threshold of 40% for exposed side chains and of 60% for exposed main chain beads (this differs from what was used by Rai et al. (2005) because of the new definition of overlap threshold implemented in US-SOMO, see above); (2) the hierarchical overlap removal procedure was used in all cases, with outward translation for the exposed side chain beads; (3) the computations of the hydrodynamic parameters were done with stick boundary conditions, referred to the diffusion center, and with exclusion of the buried beads from both the full computations and from the volume correction; (4) the molecular weights and partial specific volumes used for the computation of s(20,w)0 and [η] were those computed by US-SOMO from the composition.

In Table 1, the comparisons between experimental and calculated Dt(20,w)0 and s(20,w)0 for BPTI, RNase, lysozyme and myoglobin are presented. Taking full advantage of the ease with which some modeling options can now be set in US-SOMO, we explored the influence of ASA thresholds on these parameters. Practically, increasing the residues’ ASA threshold labels more beads as buried, and increasing the ASA re-check threshold also keeps more beads in the buried category. The effect on the two parameters examined in Table 1 is, therefore, entirely due to the number and position of the beads employed in the computations. However, as we will see later in Table 2 (and Tables 4 and 5), it has an additional impact on the τc(20,w)h and [η] values because of the exclusion of the buried beads from the volume correction. The three conditions examined are residues’ ASA thresholds of 10, 20 and 40 Å² (A10, A20, A40), coupled respectively with beads’ ASA re-check thresholds of 30, 50 and 60% (R30, R50, R60). For comparison, the A10/R30 condition is equivalent to that employed by Rai et al. (2005) in their modeling study.

Table 1
Comparison between experimental and calculated Dt(20,w) and s(20,w) values for US-SOMO bead models derived from test proteins
Table 2
Comparison between experimental and calculated τc(20,w) and [η] values for US-SOMO bead models derived from test proteins
Table 4
Further comparison between experimental and calculated τc(20,w) and [η] values for US-SOMO bead models derived from test proteins
Table 5
Comparison between NMR-derived correlation times, τc(20,w)exp, with those computed by US-SOMO with A20/R50, τc(20,w)SOMO, and by HYDROPRO, τc(20,w)HP (García de la Torre, 2001), for a set of test proteins

The first interesting result from Table 1 is that the Dt(20,w)0 value of three out of the four test proteins examined here is reproduced extremely well, within the 3% error of the experimental data. Moreover, it is independent of the structure used to generate the models, with no appreciable differences between X-ray and NMR models. The lone exception is lysozyme, for which the X-ray-derived structures perform slightly worse than in the other cases examined, while the NMR structure is in excellent agreement (<1%). This was already noticed by Rai et al. (2005), who suggested that this effect is mainly due to the high number of long, hydrophilic residues on the lysozyme surface, which are not fully extended in crystal structures because of crystal packing. As for the s(20,w)0, the results are mixed, with an excellent agreement for RNase (≤3%) and for the NMR-derived model of lysozyme (≤2%), while the X-ray-derived models of the latter suffer from the same problem seen with Dt(20,w)0. The poor agreement of the myoglobin CO s(20,w)0 data is instead likely due to a suspicious experimental value (“?” in Table 1), since only a minor change is observed in the Dt(20,w)0 data between the CO and apo forms. As will be discussed in more detail below, the computed value of the partial specific volume v2 could also affect the reliability of these numbers.

The other important evidence derived from Table 1 is the very small effect of greatly reducing the number of the beads used in the computations by increasing the ASA thresholds. Practically, halving the number of beads decreases the accuracy by roughly 1%. This is a surprising result, and it will be further discussed below in conjunction with some graphical images of larger protein models. From the data presented in Table 1, it seems safe to use a residue ASA threshold of 20 Å² coupled with a bead ASA re-check threshold of 50%, effectively obtaining a factor of ~10 in the reduction of the number of frictional points with respect to the starting atomic structures.

In Table 2, the results of the comparisons between experimental and computed τc(20,w)h and [η] values are presented for the same proteins of Table 1. The first thing to notice is the larger error present in the experimental τc(20,w)h values, between 6 and 10%, with respect to the Dt(20,w)0, s(20,w)0 and [η] values. The second is that the ASA threshold values have a relevant effect on the calculated parameters: increasing the ASA threshold decreases the computed τc(20,w)h and [η] values, because fewer beads are included in the volume correction. Examining in detail the τc(20,w)h values, it seems that the A20/R50 values produce the best match between experimental and computed data, well below the experimental errors. The exceptions are BPTI, for which it seems that the experimental data might underestimate the rotational tumbling (supported by the lack of differences between X-ray and NMR structures), and the NMR-derived lysozyme model(s). For the latter, this effect was again noticed and tentatively explained by Rai et al. (2005) as deriving from the opposite effect of the long, hydrophilic and flexible surface side-chains on translational and rotational diffusion. As for [η], the two available datasets confirm that choosing a 20 Å² residue ASA threshold coupled with a 50% beads ASA re-check threshold produces an excellent match between experimental and computed data. Again, the NMR-derived model(s) of lysozyme are in poor agreement for the reasons given above.

Next, we examine a series of structures with increasing size. In Table 3, the Dt(20,w)0 and s(20,w)0 values are reported, and it can be seen that most Dt(20,w)0 values computed with A10/R30 are within 1% of the experimental values. The exceptions are catalase (+6.3%), α-lactalbumin (+3.7%), the oxy form of hemoglobin (−3.6%), the holo form of G3PD (−4.2%), and nitrogenase MoFe (−5.4%). Given the uncertainty associated with the experimental data, some of which are more than 60 years old (see ESM Table S3), we can consider this an excellent result. As for the computed s(20,w)0 values, most of them are within 5% of the experimental data. The deoxy form of hemoglobin and again nitrogenase MoFe are here the worst performers (+9.7 and +11%, respectively), while in the case of NAD-bound pig muscle lactate dehydrogenase (+8.6%) the experimental value is suspiciously equal to that of the pig heart form. This is in contrast to the corresponding Dt(20,w)0 values, where there is a net difference extremely well matched by the relative models. As for Table 1, we have not investigated the effect of using experimental v2 values in the computations instead of calculated values, and similar effects could have affected the conversion to standard conditions of the original data. In this light, the agreement of the computed and experimental s(20,w)0 data in Table 3 can be considered satisfactory. Moreover, it can be seen that the A20/R50 combination works here as well as the original A10/R30.

Table 3
Further comparison between experimental and calculated Dt(20,w) and s(20,w) values for US-SOMO bead models derived from test proteins

τc(20,w)h and [η] data are also available for a restricted set of the same proteins, presented in Table 4. Using the A20/R50 combination, the τc(20,w)h of all proteins, except ovalbumin, are within 15% of the experimental values, which is good considering the errors associated with the experimental data. More data are available for [η], and here the results are mixed, with four protein models having computed values within 2% of the experimental data, and another four lying between 10 and 15% (still considering the A20/R50 framework). Again, some experimental data are suspicious, like the 4 cm³/g value for ovalbumin, but a full examination of these issues is beyond the scope of this paper.

The results described in the previous section can be better interpreted by comparing the original atomic structures and the US-SOMO-generated models. In Fig. 2, panels a–d, the original β-lactoglobulin (1BEB.pdb) structure is shown (panel a) together with the three bead models generated by US-SOMO (panels b–d) whose parameters are reported in Tables 3 and 4. The color-coding, fully described in the Fig. 2 legend, refers to the characteristics of the residues’ side chains and distinguishes also the buried beads (orange) from all the other beads. Note the increasing proportion of the buried beads in going from panel b (model generated with A10/R30) to panel d (A20/R50), to panel c (A40/R60). From the data in Table 3, the loss of prediction accuracy for Dt(20,w)0 is only about 0.5% in going from model b (A10/R30) to model c (A40/R60), while the number of beads used to calculate the parameters drops from 325 to 139. Evidently, the translational motion of the protein is dominated by a restricted number of highly exposed frictional centers, clearly seen in Fig. 2 by comparing panels b and c. A similar situation is found with a larger protein made of four subunits, pig heart lactate dehydrogenase, whose atomic structure (5LDH.pdb) is shown in panels e and f of Fig. 2. The two different representations were made to show both the subunit composition of LDH (panel e) and the US-SOMO residue coding (panel f). The two US-SOMO bead models in panels g and h were generated with A10/R30 and A40/R60, respectively. Again, the huge increase of “buried” beads between the two models corresponds only to a modest loss of accuracy of about 0.4% in Dt(20,w)0 (Table 3). Overall, these data cast doubt on the necessity of an accurate modeling of protein surfaces for translational friction, which appears to be dominated by a subset of frictional centers.
As for the rotational dynamics and intrinsic viscosity, the interpretation is complicated by the role of the excluded beads in the volume correction, and a more in depth analysis should await further studies.

Fig. 2
Atomic structures, shown in space filling mode, of β-lactoglobulin (1BEB.pdb, panel a) and pig heart lactate dehydrogenase with NAD bound (5LDH.pdb, panels e and f) with their corresponding US-SOMO-generated bead models (β-lactoglobulin, ...

Nevertheless, we can now examine in detail the performance of the US-SOMO-generated models in matching the NMR-derived τc(20,w)h values for a set of relatively small proteins, as presented by García de la Torre (2001). To ensure a proper comparison, the reduction to standard conditions of the experimental data was again critically assessed, as reported in Table S6. It was found that most likely the effect of D2O on the solution viscosity was not accounted for, leading to values different from those reported in Table 2 of García de la Torre (2001). In Table 5, the corrected τc(20,w)exp are thus reported, and compared with those computed by US-SOMO (τc(20,w)SOMO) using the A20/R50 combination that performed best in the previously described testing phase. In addition, we have also reported in Table 5 the Hydropro-generated values, τc(20,w)HP, presented in Table 2 of García de la Torre (2001), with their % differences from the newly recalculated experimental values. Furthermore, we have expanded the set of structures to include additional NMR-derived structures, for which the τc(20,w)SOMO averages were computed. This was facilitated by the fully automated processing implemented in US-SOMO, allowing the generation of most of the dataset presented in Table 5 in a mere 5 h of work, including the retrieval of the structures from the PDB and the coding of new residues (ligands and co-factors) not originally present in the somo.residue file. A few structures needed more work because they were incomplete, requiring additional operations. In particular, in the 1LKI leukemia inhibitory factor X-ray structure the first eight N-terminal residues were missing, and were taken from the 1A7M NMR structure (three different conformations were selected, and averages computed).
Likewise, in the 1STN staphylococcal nuclease SN X-ray structure, the first five N-terminal and the last eight C-terminal residues were missing, and were taken from the 1JOR NMR structures, again generating three different models whose parameters were then averaged. For comparison, the original incomplete structures were also processed and their computed values are presented in Table 5. We must also underscore that the data reported in Table 2 of García de la Torre (2001) were computed on a single structure for each protein examined, while the full datasets could be processed in our study.
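The solvent viscosity issue raised above for the NMR-derived correlation times can be illustrated with a short Python sketch. It assumes the usual Stokes-Einstein-Debye scaling of a rigid-body correlation time with η/T; the function name and default constants are hypothetical, and the actual recalculations are those documented in Table S6.

```python
def tau_c_to_20w(tau_obs, temp_k, eta_solvent, eta_w20=1.002, t_std=293.15):
    """Rotational correlation time corrected to water at 20 °C.

    tau_c scales as eta / T, so the observed value is multiplied by
    (T / t_std) * (eta_w20 / eta_solvent). Viscosities are in mPa·s
    and eta_solvent must be that of the solvent actually used
    (for example a D2O-containing buffer), not that of H2O.
    """
    return tau_obs * (temp_k / t_std) * (eta_w20 / eta_solvent)


# Hypothetical numbers: a solvent 25% more viscous than water at 20 °C
print(tau_c_to_20w(5.0, 293.15, 1.2525))
```

Using the H2O viscosity for a measurement made in a more viscous solvent leaves the corrected value too high, which is the kind of discrepancy the reassessment addresses.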

Examining in detail the data presented in Table 5, we notice that our models in general match the experimental data better than those produced by Hydropro, the apparent exceptions being interleukin-1β, lysozyme and eglin-c. However, these data must also be interpreted in the light of the evidence, documented in Table 5, that the average data from the NMR-derived structures in many cases perform worse than the corresponding X-ray structures, when both can be compared. An examination of the structures (not shown) reveals that in these cases the residues at the N- and C-terminal ends are very disordered, giving rise to quite different conformations. It is thus likely that this disorder reflects true conformational flexibility, which cannot be properly modeled in the rigid-body approximation used by the current implementation of US-SOMO and by Hydropro. Obviously, it would be possible to choose a single conformation matching the experimental parameters, but this would be clearly incorrect. In any case, overall the data presented in Table 5 confirm the reliability of the US-SOMO hydrodynamic modeling, while suggesting that other approaches, like Brownian dynamics (Ermak and McCammon 1978) or discrete molecular dynamics (Dokholyan et al. 1998) simulations, should be used to properly account for local flexibility effects.

In conclusion, we have shown that the bead-modeling scheme implemented in US-SOMO could be a very valuable tool in biomacromolecular hydrodynamic studies. The limitations present in the original SOMO (Rai et al. 2005) have been removed, and the program is now controlled from a GUI. When the pre-set default parameters are used and no nonstandard residues or ligands are present in a structure, the computation of all the hydrodynamic parameters is fast and reliable, and very accurate for at least the translational diffusion parameters. When the proper ASA/ASA re-check combination is used, the computational time required is minimal on standard personal computers, ranging from seconds for structures in the range 5–50 kDa to a few minutes for structures up to 250 kDa, like catalase. While defining new residues still requires a detailed knowledge of their physical-chemical characteristics, these operations are also greatly aided by the GUI, effectively allowing the modeling of any kind of biomacromolecule, from proteins to nucleic acids, carbohydrates, lipids and their complexes. The somo.atom and somo.residue files already contain, respectively, 300 and 64 entries covering amino acids, nucleotides, sugars, co-factors like heme and NAD/NADPH, and prosthetic groups like N-acetyl and phosphate. These files will be constantly updated, we hope also with the help of the analytical ultracentrifugation and other hydrodynamic techniques communities, for which this enhanced and powerful tool was mainly developed.

Acknowledgments

We thank M. Nöllmann for providing his newAtoB code, and O. Byron for suggestions. The development of UltraScan and US-SOMO is supported by National Institutes of Health Grant # RR022200 (to B. D.). M. R. gratefully acknowledges support from the Istituto Superiore della Sanità, program Italia-USA.

Footnotes

AUC&HYDRO 2008—Contributions from 17th International Symposium on Analytical Ultracentrifugation and Hydrodynamics, Newcastle, UK, 11–12 September 2008.

Electronic supplementary material: The online version of this article (doi:10.1007/s00249-009-0418-0) contains supplementary material, which is available to authorized users.

Contributor Information

Emre Brookes, Department of Biochemistry, The University of Texas Health Science Center at San Antonio, San Antonio, TX, USA.

Borries Demeler, Department of Biochemistry, The University of Texas Health Science Center at San Antonio, San Antonio, TX, USA.

Camillo Rosano, Nanobiotecnologie, Istituto Nazionale per la Ricerca sul Cancro (IST), Genoa, Italy.

Mattia Rocco, Biopolimeri e Proteomica, Istituto Nazionale per la Ricerca sul Cancro (IST), IST c/o CBA, Largo R. Benzi 10, 16132 Genoa, Italy.

References

  • Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
  • Bonneau R, Strauss CE, Rohl CA, Chivian D, Bradley P, Malmstrom L, Robertson T, Baker D. De novo prediction of three-dimensional structures for major protein families. J Mol Biol. 2002;322:65–78. doi: 10.1016/S0022-2836(02)00698-8.
  • Byron O. Construction of hydrodynamic bead models from high-resolution X-ray crystallographic or nuclear magnetic resonance data. Biophys J. 1997;72:408–415. doi: 10.1016/S0006-3495(97)78681-8.
  • Byron O. Hydrodynamic bead modeling of biological macromolecules. Methods Enzymol. 2000;321:278–304. doi: 10.1016/S0076-6879(00)21199-3.
  • Cantor CR, Schimmel PR. Biophysical chemistry. Part II: techniques for the study of biological structure and function. W.H. Freeman; San Francisco: 1980.
  • Carrasco B, García de la Torre J. Hydrodynamic properties of rigid particles: comparison of different modeling and computational procedures. Biophys J. 1999;76:3044–3057. doi: 10.1016/S0006-3495(99)77457-6.
  • Chivian D, Kim DE, Malmstrom L, Schonbrun J, Rohl CA, Baker D. Prediction of CASP6 structures using automated Robetta protocols. Proteins. 2005;61(Suppl 7):157–166. doi: 10.1002/prot.20733.
  • Demeler B. UltraScan. A comprehensive data analysis software package for analytical ultracentrifugation experiments. In: Scott DJ, Harding SE, Rowe AJ, editors. Modern analytical ultracentrifugation: techniques and methods. Royal Society of Chemistry; UK: 2005. pp. 210–229.
  • Dokholyan NV, Buldyrev SV, Stanley HE, Shakhnovich EI. Discrete molecular dynamics studies of the folding of a protein-like model. Fold Des. 1998;3:577–587. doi: 10.1016/S1359-0278(98)00072-8.
  • Ermak DL, McCammon JA. Brownian dynamics with hydrodynamic interactions. J Chem Phys. 1978;69:1352–1360. doi: 10.1063/1.436761.
  • García de la Torre J. Hydration from hydrodynamics. General considerations and applications of bead modelling to globular proteins. Biophys Chem. 2001;93:159–170. doi: 10.1016/S0301-4622(01)00218-6.
  • García de la Torre J, Bloomfield VA. Hydrodynamic properties of complex, rigid, biological macromolecules: theory and applications. Q Rev Biophys. 1981;14:81–139.
  • García de la Torre J, Huertas ML, Carrasco B. Calculation of hydrodynamic properties of globular proteins from their atomic-level structure. Biophys J. 2000;78:719–730. doi: 10.1016/S0006-3495(00)76630-6.
  • Halle B, Davidovic M. Biomolecular hydration: from water dynamics to hydrodynamics. Proc Natl Acad Sci USA. 2003;100:12135–12140. doi: 10.1073/pnas.2033320100.
  • Harding SE, Longman E, Carrasco B, Ortega A, García de la Torre J. Studying antibody conformations by ultracentrifugation and hydrodynamic modeling. Methods Mol Biol. 2004;248:93–113.
  • Jones TA, Zou JY, Cowan SW, Kjeldgaard M. Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr A. 1991;47:110–119. doi: 10.1107/S0108767390010224.
  • Kuntz ID, Kauzmann W. Hydration of proteins and polypeptides. Adv Protein Chem. 1974;28:239–345. doi: 10.1016/S0065-3233(08)60232-6.
  • Lee B, Richards FM. The interpretation of protein structures: estimation of static accessibility. J Mol Biol. 1971;55:379–400. doi: 10.1016/0022-2836(71)90324-X.
  • Rai N, Nöllmann M, Spotorno B, Tassara G, Byron O, Rocco M. SOMO (SOlution MOdeler): differences between X-ray- and NMR-derived bead models suggest a role for side chain flexibility in protein hydrodynamics. Structure. 2005;13:723–734. doi: 10.1016/j.str.2005.02.012.
  • Rocco M, Rosano C, Weisel JW, Horita DA, Hantgan RR. Integrin conformational regulation: uncoupling extension/tail separation from changes in the head region by a multi-resolution approach. Structure. 2008;16:954–964. doi: 10.1016/j.str.2008.02.019.
  • Spotorno B, Piccinini L, Tassara G, Ruggiero C, Nardini M, Molina F, Rocco M. BEAMS (BEAds Modelling System): a set of computer programs for the generation, the visualization and the computation of the hydrodynamic and conformational properties of bead models of proteins. Eur Biophys J. 1997;25:373–384. Erratum 26:417.
  • Tanford C. Physical chemistry of macromolecules. Wiley; New York: 1961.
  • Tsai J, Taylor R, Chothia C, Gerstein M. The packing density in proteins: standard radii and volumes. J Mol Biol. 1999;290:253–266. doi: 10.1006/jmbi.1999.2829.
  • Tsodikov OV, Record MT, Jr, Sergeev YV. Novel computer program for fast exact calculation of accessible and molecular surface areas and average surface curvature. J Comput Chem. 2002;23:600–609. doi: 10.1002/jcc.10061.
  • Vriend G. WHAT IF: a molecular modeling and drug design program. J Mol Graph. 1990;8:52–56. doi: 10.1016/0263-7855(90)80070-V.
  • Zipper P, Durchschlag H. Calculation of hydrodynamic parameters of proteins from crystallographic data using multi-body approaches. Prog Colloid Polym Sci. 1997;107:58–71. doi: 10.1007/BFb0118015.
  • Zipper P, Durchschlag H. Recent advances in the calculation of hydrodynamic parameters from crystallographic data by multi-body approaches. Biochem Soc Trans. 1998;26:726–731.