As larger macromolecular structures become available, there is a growing need to understand their ‘internal’ volumes—such as deep clefts, channels and cavities—as these often play critical roles in their function. The 3V web server can automatically extract and comprehensively analyze all the internal volumes from input RNA and protein structures. It rapidly finds internal volumes by taking the difference between two rolling-probe solvent-excluded surfaces, one with as large as possible a probe radius and the other with a solvent radius (typically 1.5 Å for water). The outputs are volumetric representations, both as images and downloadable files, which can be used for further analysis.
The 3V server and source code are available from http://3vee.molmovdb.org.
The number of large macromolecular structures available is increasing and these new structures contain internal features with biological relevance. There are several examples of biologically relevant channels in the Protein Data Bank (1). One example is the ribosomal exit tunnel, a large protruding channel (~22 000 Å³ in channel volume) (2) in the 50S ribosomal subunit that all naturally synthesized proteins must pass through (3). Channels also play an important role in many membrane proteins, including ion channels (~8000 Å³) (4) and mechanosensitive channels (~10 000 Å³) (5). Additionally, large cavities are utilized as isolation chambers for protein folding chaperones, such as GroEL (~500 000 Å³) (6). While it is interesting to study these channels, extracting them from a structure is not straightforward.
Several programs are available for finding potential binding sites in macromolecules (7–18) and pores in membrane proteins (19,20). However, these programs and web servers are typically designed for small proteins and are not necessarily useful for large macromolecular complexes. More recently, a few methods have been published to traverse known channels in structures (21–24). These tools determine the trajectory of a channel from a starting location in the protein structure; their output is that trajectory and the maximal radius at each point along the path. The Voss Volume Voxelator (3V) web server provides similar functionality, but uses a different technique that also provides the overall shape of the channel. The 3V method requires no starting point, and the probe radii are adaptable to the size of any structure and its channels, which is ideal for investigations relating to drug-binding sites.
To calculate macromolecular volumes, the web server uses the rolling probe method (25–27). This method works by rolling a virtual probe, a ball of a given radius, around the van der Waals surface of a macromolecule (Figure 1). There are two approaches to determining the volume of an enclosed surface with the rolling probe method: discrete and analytical. The 3V web server uses the discrete volume method, i.e. a 3D grid of voxels, rather than the analytical one [e.g. (28)]. The discrete method is scalable to structures of any size and complexity, and readily lends itself to subtracting any two volumes created using this method. This subtraction is required to obtain the channel information. The time complexity of the discrete volume method is linear, O(n), in the number of particles, but it increases as the cube, O(r³), of the probe radius (in grid units) and as the inverse cube, O(g⁻³), of the 3D grid resolution. Thus, for a given probe and grid size, macromolecules containing a large number of atoms can be computed with only a small increase in calculation time relative to smaller macromolecules. In short, the discrete volume method allows volumes to be subtracted and large macromolecules to be analyzed efficiently.
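The discrete rolling-probe idea can be illustrated with a minimal sketch. This is not the 3V source code; it assumes a common grid formulation in which voxels within (van der Waals radius + probe radius) of any atom are first marked as inaccessible to the probe center, and then every voxel lying within one probe radius of an accessible probe center is carved back out. The function names and the `(x, y, z, radius)` atom format are hypothetical, chosen for illustration only.

```python
import itertools
import math


def ball_offsets(radius, grid):
    """Integer voxel offsets lying within `radius` (angstroms) of the origin."""
    n = int(math.floor(radius / grid))
    rng = range(-n, n + 1)
    return [(i, j, k) for i, j, k in itertools.product(rng, rng, rng)
            if (i * i + j * j + k * k) * grid * grid <= radius * radius]


def excluded_voxels(atoms, probe, grid):
    """Voxel indices inside the probe-excluded volume.

    atoms: list of (x, y, z, vdw_radius) tuples in angstroms (assumed format)
    probe: probe radius in angstroms; grid: voxel edge length in angstroms
    """
    pad = probe + grid
    lo = [min(a[i] - a[3] for a in atoms) - pad for i in range(3)]
    hi = [max(a[i] + a[3] for a in atoms) + pad for i in range(3)]
    shape = [int(math.ceil((hi[i] - lo[i]) / grid)) + 1 for i in range(3)]

    # Step 1: voxels where the probe center cannot be placed.
    blocked = set()
    for idx in itertools.product(*(range(n) for n in shape)):
        p = [lo[i] + idx[i] * grid for i in range(3)]
        if any((p[0] - x) ** 2 + (p[1] - y) ** 2 + (p[2] - z) ** 2
               <= (r + probe) ** 2 for x, y, z, r in atoms):
            blocked.add(idx)

    # Step 2: keep only blocked voxels that no allowed probe center can reach.
    # Indices outside the grid are never in `blocked`, so they count as free.
    offsets = ball_offsets(probe, grid)
    return {v for v in blocked
            if all((v[0] + i, v[1] + j, v[2] + k) in blocked
                   for i, j, k in offsets)}


def volume(voxels, grid):
    """Volume in cubic angstroms of a set of voxels."""
    return len(voxels) * grid ** 3
```

With a probe radius of zero the second step removes nothing, so the sketch returns the van der Waals volume; larger probes fill in gaps between atoms that the probe cannot enter. The cost of step 2 grows with the number of offsets in the probe ball, which is the cubic dependence on probe radius in grid units noted above.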
When using the rolling probe method, the size of the probe has a profound effect on the shape of the resulting surface. Figure 2A shows the surfaces of the macromolecule at different probe radii for both the hammerhead ribozyme (29) and lysozyme (30) structures. When a probe of zero size is used, the van der Waals surface is obtained. However, as the probe radius increases, the surface features are filled in (as in Figure 2A, left). In the two-probe method for extracting internal volumes from a macromolecule, three types of volume are used:
(i) the shell volume, obtained with the large probe radius;
(ii) the solvent-excluded volume, obtained with the small (solvent) probe radius;
(iii) the solvent volume, the difference between the shell and solvent-excluded volumes.
These two probe radii can be adjusted from their 1.5 and 6 Å default values. There is a large amount of freedom in choosing these two probe radii, which can be optimized for a specific problem depending on the size of the channel of interest. For example, the solvent-excluded probe could be adapted to correspond to the size of a small molecule or drug. A shell probe of 9 Å and a solvent probe of 3 Å were used to find the drug-binding pocket for the p-glycoprotein (Figure 4). For the ribosomal exit tunnel, a 10 Å shell probe radius and a ~3 Å solvent probe radius were ideal for extracting the exit tunnel (2).
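The subtraction at the heart of the two-probe method can be shown on a toy 2D cross-section. The sketch below is an illustration under stated assumptions, not the server's implementation: the "macromolecule" is an already rasterized set of grid cells, the rolling probe is approximated by a morphological closing (dilate, then erode) with a disc, and radii are given in grid units. All names are hypothetical.

```python
import itertools


def disc(radius):
    """Integer offsets within `radius` grid units of the origin."""
    r = range(-radius, radius + 1)
    return [(i, j) for i, j in itertools.product(r, r)
            if i * i + j * j <= radius * radius]


def closing(solid, radius):
    """Cells that a probe of the given radius cannot reach (dilate, then erode)."""
    offs = disc(radius)
    dilated = {(x + i, y + j) for x, y in solid for i, j in offs}
    return {(x, y) for x, y in dilated
            if all((x + i, y + j) in dilated for i, j in offs)}


def solvent_cells(solid, shell_radius, solvent_radius):
    """Large-probe (shell) region minus small-probe (solvent-excluded) region."""
    return closing(solid, shell_radius) - closing(solid, solvent_radius)


# Toy example: a 12 x 10 block with a slot four cells wide cut into its top.
solid = {(x, y) for x in range(12) for y in range(10)}
slot = {(x, y) for x in range(4, 8) for y in range(3, 10)}
solid -= slot
cleft = solvent_cells(solid, shell_radius=4, solvent_radius=1)
```

In this toy case the large probe bridges the slot while the small probe enters it, so the difference recovers cells inside the slot. The large probe still dips slightly into the mouth of the slot, which is why the choice of shell radius affects where an extracted channel ends.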
The solvent volume can be further broken down into two types of internal volumes: cavities and channels. The cavity volume comprises all volumes that are large enough to accommodate a solvent molecule but are not connected to the macromolecule exterior (Figure 3, red). Cavities are essentially isolated from the exterior, because solvent is unable to escape to the surface without structural rearrangements. The second type of internal volume is loosely defined as the ‘channel volume’; this generalized channel volume includes surface invaginations, grooves, pockets or clefts, as well as long internal channels, large chambers and deep pores. In practice, it is difficult to segregate a channel from a cleft or a pocket without detailed analysis of the volume surface. The channel volume is, therefore, defined as the solvent volume minus the cavity volume, i.e. all volumes large enough to accommodate a spherical probe that are connected to the exterior surface (Figure 3, blue). In tightly packed structures, such as lysozyme, the solvent channels are merely surface invaginations (Figure 3A, blue); however, for larger structures such as the 50S ribosomal subunit, channels can form large networks within the macromolecule (Figure 3B, blue).
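On a voxel grid, the distinction between channels and cavities reduces to a connectivity test. The following sketch assumes that a solvent voxel "touches the exterior" when one of its six face neighbors lies outside the shell volume, and then flood-fills through the solvent voxels from those boundary voxels. The function name, the six-neighbor connectivity and the set-of-index-tuples representation are assumptions made for illustration.

```python
from collections import deque

FACE_STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def split_solvent(solvent, shell):
    """Split solvent voxels into (channels, cavities).

    solvent: set of (i, j, k) voxels in the solvent volume
    shell:   set of (i, j, k) voxels in the shell volume (contains `solvent`)
    """
    def neighbors(v):
        return [(v[0] + d[0], v[1] + d[1], v[2] + d[2]) for d in FACE_STEPS]

    # Solvent voxels with a face neighbor outside the shell are open to the exterior.
    seeds = [v for v in solvent if any(n not in shell for n in neighbors(v))]
    channels = set(seeds)
    queue = deque(seeds)
    while queue:
        v = queue.popleft()
        for n in neighbors(v):
            if n in solvent and n not in channels:
                channels.add(n)
                queue.append(n)
    # Whatever the fill never reached is sealed off from the exterior.
    return channels, solvent - channels


# Toy example: a 7 x 7 x 7 shell with one tunnel to the surface and one buried voxel.
shell = {(x, y, z) for x in range(7) for y in range(7) for z in range(7)}
solvent = {(3, 3, 4), (3, 3, 5), (3, 3, 6), (1, 1, 1)}
channels, cavities = split_solvent(solvent, shell)
```

Here the three tunnel voxels are classified as channel volume because the top one borders the outside of the shell, while the isolated buried voxel is classified as a cavity.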
There are six tools provided in the 3V web server: two external volume tools and four internal volume ones (Table 1). The first external volume tool is ‘Volume Calculation’, which is commonly used to study the surface of the macromolecule. Figure 4 (upper left) shows a typical result from this tool for a solvent-excluded surface of the macromolecule. The second tool, ‘Volume Range Calculation’, is an extension of the first tool and performs the rolling probe calculation using a series of probe radii. This tool is useful for examining how different probe radii affect the surface of the macromolecule. As shown in Figure 4 (left), for small probe radii, the gap between the two arms is completely defined; however, as the probe radius increases, the probe no longer fits between the protein arms and the volume is filled in.
Three general tools are provided to locate and extract channels within a macromolecular structure (Table 1). All three employ the two-probe method to extract the solvent volume (Figure 2B) and, therefore, require both a small and a large probe radius as input for their solvent-excluded and shell surfaces, respectively. The first internal volume tool is the ‘Solvent Extract’ tool. This tool extracts all of the solvent volume at once and requires no additional parameters (Figure 4). This can be useful for identifying all the channels of a macromolecule (Figure 4). However, in many cases, it finds too many channels, making the ‘Channel Find’ tool the most useful for beginners. The ‘Channel Find’ tool calculates the solvent volume, locates all the individual channels in the solvent volume, and extracts any channels larger than a volume cutoff criterion. The volume cutoff criterion can be provided as a requested number of volumes, a percentage of the shell volume or an absolute value in Å³. In most cases, the default values are sufficient to extract all significant channels from a structure. Figure 4 shows an example of a channel that was extracted using the ‘Channel Find’ tool. Once the location of a channel is known, the ‘Channel Extract’ tool can be used. The ‘Channel Extract’ tool requires the coordinates of a point located within the channel and then extracts any solvent volume connected to that point (Figure 4). The fourth internal volume tool, ‘Exit Tunnel Extract’, is a specialized tool that can only extract the exit tunnel from the 50S ribosomal subunit; it is provided for replicating the results from Voss et al. (2).
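The behavior described for ‘Channel Find’ and ‘Channel Extract’ can be approximated with connected-component labeling on the solvent voxels. The sketch below is an assumption about how such tools could work, not a description of the server's code: `flood` mimics extraction from a seed point, and `channel_find` mimics the absolute-volume and number-of-volumes cutoffs. The function names, parameters and six-neighbor connectivity are all hypothetical.

```python
from collections import deque

STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def flood(voxels, seed):
    """Return every voxel connected to `seed` (empty set if seed is not solvent)."""
    if seed not in voxels:
        return set()
    seen = {seed}
    queue = deque([seed])
    while queue:
        v = queue.popleft()
        for d in STEPS:
            n = (v[0] + d[0], v[1] + d[1], v[2] + d[2])
            if n in voxels and n not in seen:
                seen.add(n)
                queue.append(n)
    return seen


def channel_find(voxels, grid, min_volume=0.0, max_count=None):
    """Connected pieces of `voxels`, largest first, filtered by the cutoffs.

    grid: voxel edge in angstroms; min_volume: cutoff in cubic angstroms;
    max_count: optional limit on the number of pieces returned.
    """
    remaining = set(voxels)
    pieces = []
    while remaining:
        piece = flood(remaining, next(iter(remaining)))
        remaining -= piece
        pieces.append(piece)
    pieces.sort(key=len, reverse=True)
    pieces = [p for p in pieces if len(p) * grid ** 3 >= min_volume]
    return pieces if max_count is None else pieces[:max_count]


# Toy example: one five-voxel channel and one two-voxel pocket.
solvent = {(x, 0, 0) for x in range(5)} | {(0, 5, 0), (1, 5, 0)}
large = channel_find(solvent, grid=1.0, min_volume=3.0)
pocket = flood(solvent, (0, 5, 0))
```

A percentage-of-shell cutoff, the third option mentioned above, would follow the same pattern by converting the percentage to an absolute volume before filtering.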
The first page presented to the user is the tool selection page (Figure 5A, left). The tool selection page presents all six web server tools, which are divided into two categories: external volumes and internal ones (Table 1). The title and a brief description are given for each tool. To the right of the tool selection table, the most recent runs of all of the tools are listed, providing both the PDB ID and tool name. Clicking on any recent run allows the user to see the results page for that run. At the bottom of every page are links to a glossary of terms, the source code, external resources and reference information.
The tool interface page is similar for all tools (Figure 5A). The white box on the left gives common parameters for each tool, while unique parameters are included in the gray box on the right. Each parameter comes with detailed help information, which can be accessed by moving the mouse pointer over the parameter name. There are five common parameters for all tools (Table 2): PDB input, biological unit, hetero atoms, grid size and allow public use. Each page also contains custom parameters that depend on which tool is being used. Typical custom parameters are rolling probe radii, but some tools require more input. For example, in the ‘Channel Find’ tool, a volume cutoff is required to skip any unimportant surface invaginations (Table 1). For users uncertain about how to use a tool, preset examples are provided. When a preset example is used, all the form entries are filled out with values for a given working example. The user can then either run the tool with the values unchanged or modify them and see the result.
After launching the web server tool, the page redirects to the job progress page with its associated job ID (Figure 5A). Each job submission is provided with a unique job ID based on the current date and three random hexadecimal characters (e.g. 10jan13.0e7) that serves as a permanent bookmarkable link to the data. Many web servers take an email address and send an email upon completion. Keeping users up to date on long calculations through email is challenging. Moreover, emails getting lost in spam folders, or users having too many emails to track, render the email method unreliable. Therefore, 3V has been designed to use the unique job ID to track the progress of these longer processes. The job ID link shows the progress of the calculations and, upon completion, presents the final results. By saving the web link for a particular job ID, users can close the web page and return at any time to see the results or check on the progress. During processing, there are two ways to track the progress of the job. First, there is an information box at the top that describes the current processing step being carried out. Second, raw output from the programs is displayed in a terminal-like interface.
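A job ID of the form shown above could be generated as in the following sketch. The server itself is written in PHP and its actual code is not reproduced here; this Python version only illustrates the pattern. The ordering of the date fields (two-digit year, month abbreviation, day) is an assumption inferred from the single example `10jan13.0e7`, and the function name is hypothetical.

```python
import datetime
import secrets

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def make_job_id(today=None):
    """Build an ID such as '10jan13.0e7': a date stamp plus three random hex digits."""
    today = today or datetime.date.today()
    # Assumed field order: two-digit year, month abbreviation, two-digit day.
    stamp = f"{today.year % 100:02d}{MONTHS[today.month - 1]}{today.day:02d}"
    # token_hex(2) yields four hex characters; keep the first three.
    suffix = secrets.token_hex(2)[:3]
    return f"{stamp}.{suffix}"
```

Three hexadecimal characters give 4096 possible suffixes per day, so a real deployment would presumably also check that a generated ID is not already in use before accepting it.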
The output pages have a common layout that depends on the number of extracted volumes (Figure 5A, right). At the top of the page is information about the input parameters; this is followed by information about each output volume, including snapshot images, volume statistics and downloadable files with instructions for viewing them. The input parameters are provided to help reproduce the output, if necessary (Figure 5B). Snapshot images give a quick indication of the shape of the volume; they are randomly colored to help visualize the 3D shape and to provide a reference across the different rotations of the volume (Figure 5C). Volume statistics include the voxel size, total volume, surface area, sphericity, effective radius and center of mass (Figure 5D). The total volume, surface area and effective radius give a general idea of the size of the volume. Sphericity, Ψ, is a measure of how much the volume resembles a sphere, defined as:
$$\Psi = \frac{\pi^{1/3}\,(6V)^{2/3}}{A}$$
where V is the volume and A is the surface area. A sphericity of 1.0 implies that the volume is an exact sphere, whereas values below 1.0 indicate less spherical shapes. For example, a cube has a sphericity of 0.806 (Figure 5D). Another way to measure the shape of a volume is the effective radius. The effective radius is defined as the radius of a sphere with the same surface area to volume ratio as the volume of interest, determined by:
$$r_{\mathrm{eff}} = \frac{3V}{A}$$
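Both shape statistics are simple to compute from a volume V and a surface area A. The sketch below assumes the standard definition of sphericity, Ψ = π^(1/3)(6V)^(2/3)/A, which reproduces the value of 0.806 quoted for a cube, and takes the effective radius as 3V/A, which follows from the area to volume ratio of a sphere, 3/r. The function names are illustrative.

```python
import math


def sphericity(volume, area):
    """Surface area of the equal-volume sphere divided by the actual surface area."""
    return math.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / area


def effective_radius(volume, area):
    """Radius of the sphere that has the same area-to-volume ratio (A/V = 3/r)."""
    return 3.0 * volume / area


# Unit cube: V = 1, A = 6.
cube_psi = sphericity(1.0, 6.0)

# Sphere of radius 2: both statistics should recover the sphere itself.
r = 2.0
sphere_v = 4.0 / 3.0 * math.pi * r ** 3
sphere_a = 4.0 * math.pi * r ** 2
sphere_psi = sphericity(sphere_v, sphere_a)
sphere_r = effective_radius(sphere_v, sphere_a)
```

For long, thin channels the effective radius is dominated by the narrow dimension, so it is better read as a typical cross-sectional scale than as an overall extent.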
The center of mass reported by the Channel Find tool is most useful for pinpointing the location of a channel. The center of mass for a given channel can then be input into the Channel Extract tool or used in other channel extraction tools (21–24). Finally, a downloadable file of the volume is available (Figure 5E) for visualization along with the original PDB file. A viewing guide is provided on the web server to help new users manipulate their volume in the available visualization software.
The 3V collection of tools provides researchers with a simple method to investigate macromolecule volumes, including internal volumes that have biological significance. The web server is located at http://3vee.molmovdb.org. The intention is for this server to store the results indefinitely. In addition, the source code for all the programs is freely available to download as packaged releases or from a SourceForge.net subversion repository and can be modified for more in-depth studies. The source uses only basic C libraries, which makes it multi-platform, and optionally includes parallel code using the OpenMP library.
National Institutes of Health and the AL Williams Professorship funds. Funding for open access charge: National Institutes of Health (to M.G.).
Conflict of interest statement. None declared.
The authors acknowledge that advice and comments from Peter B. Moore and Thomas A. Steitz were fundamental in the development of this software. NRV would like to thank Denis Fellmann for writing the base libraries for the PHP server. All molecular graphics images were produced using the UCSF Chimera package (32).