The region surrounding a protein, known as the surface of interaction or molecular surface, can provide valuable insight into its function. Unfortunately, because both the geometry and the surface fields of these surfaces are complex, studying them can be slow and difficult, and important features may be hard to identify. Here, we describe our GRaphical Abstracted Protein Explorer, or GRAPE, a web server that allows users to explore abstracted representations of proteins. These abstracted surfaces reduce the level of detail of the surface of a macromolecule, using a specialized algorithm that removes small bumps and pockets while preserving large-scale structural features. Scalar fields, such as electrostatic potential and hydropathy, are smoothed to further reduce visual complexity. This entirely new way of looking at proteins complements more traditional views of the molecular surface. GRAPE includes a thin 3D viewer that allows users to quickly flip back and forth between the two views. Abstracted views provide a fast way to assess both a molecule's shape and its different surface field distributions. GRAPE is freely available at http://grape.uwbacter.org.
One goal of structural biology is to understand the chemical and physical properties of macromolecules (especially proteins) and how these properties enable the chemical reactions behind life's processes. In order to study these large and complex molecules, researchers rely on visualizations that provide various levels of abstraction. The more abstract visualizations, such as ribbon diagrams, are limited to the portrayal of a molecule's internal structure. However, protein interactions involve the ‘functional surface’ presented: to a large degree, the internal structure simply exists as scaffolding to place various forces and chemical properties in proper spatial relationships with one another. This article describes a web server that can generate and display abstract visualizations of this surface.
Popular molecular rendering programs, such as PyMOL (1) and Chimera (2), build a visual representation of the functional surface of a protein by sampling fields, such as charge, onto a triangle mesh, resulting in an image such as that in Figure 1a.
This triangle mesh can be thought of as the result of rolling a probe sphere over all the atoms in the protein. Connolly (3,4) provided practical methods for sampling these surfaces, which have subsequently been refined in both efficiency and quality (5–8). The resulting surface, called a ‘solvent-excluded surface’, is locally smooth, but at scales larger than an atom, it exhibits high-frequency detail. Even for the smallest proteins, this detail can obscure significant surface features, such as pockets and clefts. Charge sampled on the surface often exhibits similar high-frequency detail, which can obscure significant patches of uniform charge, in addition to making the shape of the surface even harder to discern.
In Cipriano and Gleicher (9), we describe a method to overcome these limitations. Termed ‘Molecular Surface Abstraction’, the technique produces a simplified representation of both the geometry of the molecular surface and the surface fields surrounding it. Inspired, in part, by the works of David Goodsell and Olsen (10), this combination not only allows users to quickly see a gestalt of the surface, but also serves as a convenient canvas on which to place surface labels (or ‘stickers’) representing additional information, such as the location of external hydrogen bond donors and acceptors, or regions of known molecular interaction.
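To give a concrete sense of what smoothing a surface field involves, the following minimal sketch blends a per-vertex scalar value (for example, sampled charge) toward the mean of its mesh neighbors. This is a generic neighbor-averaging filter chosen for illustration only; it is not the algorithm of Cipriano and Gleicher (9), and the function names, the `weight` parameter and the `iterations` parameter are all assumptions.

```python
from collections import defaultdict


def vertex_neighbors(triangles):
    """Map each vertex index to the set of vertices it shares an edge with."""
    neighbors = defaultdict(set)
    for a, b, c in triangles:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    return neighbors


def smooth_field(values, triangles, iterations=1, weight=0.5):
    """Blend each vertex value toward the mean of its neighbors' values.

    Hypothetical illustration: weight=0 leaves the field unchanged,
    weight=1 replaces each value with its neighborhood mean.
    """
    neighbors = vertex_neighbors(triangles)
    current = list(values)
    for _ in range(iterations):
        updated = []
        for i, value in enumerate(current):
            ring = neighbors.get(i)
            if not ring:
                # Isolated vertex: nothing to average with.
                updated.append(value)
                continue
            mean = sum(current[j] for j in ring) / len(ring)
            updated.append((1 - weight) * value + weight * mean)
        current = updated
    return current


# Two triangles sharing an edge, with a single "spike" of charge at vertex 0.
triangles = [(0, 1, 2), (0, 2, 3)]
smoothed = smooth_field([1.0, 0.0, 0.0, 0.0], triangles)
```

Repeated passes spread the spike over a wider neighborhood, which is the qualitative effect wanted when high-frequency detail hides larger patches of uniform charge.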
To date, however, software implementing surface abstractions has not been made widely available. While we have written a stand-alone application to test our ideas, the requirement that users download, install and then learn a completely new application before exploring abstractions has inhibited their wider adoption.
GRAPE represents the first completely web-based system for constructing and displaying abstracted molecular surfaces, using Protein Data Bank (PDB) data as input. Its functionality can be broken into two categories:
GRAPE is motivated by the dual goals of making surface abstractions easy to generate and giving as many users as possible the ability to use them on their own molecules. We have organized GRAPE so that all computations, including those for high-quality lighting and mesh generation, are performed on the server; the viewer is a thin client that merely reads in the results of those computations and renders them. This reduces the computational burden on the user's computer and ensures that our system is available to users with low-end hardware.
In recent years, many online protein viewers have come to be widely used, such as Jmol (11). These viewers are ideal for presenting a low barrier to entry for exploration: no software needs to be installed and results are available from any computer regardless of platform. GRAPE is designed to be similarly accessible: the client software itself is quite small and requires only modest graphics hardware for texture rendering. For this version, because we do require graphics hardware, layering our functionality on top of existing molecular viewers was not feasible.
The process of creating and using an abstraction in GRAPE has three steps: first, obtain the data about a molecule; then, abstract it into a useful form; finally, load this data into a viewer. The first two steps are performed on the server and result in a compressed data file.
GRAPE takes as input a PDB file that may be either uploaded from a local copy or fetched directly from the PDB (12). Optionally, users may also upload a PQR file to supply a custom protonation state, overriding the automatic protonation computations done within GRAPE.
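As an illustration of the kind of input involved, the sketch below reads atom names, residues and coordinates from the fixed-width ATOM and HETATM records of a PDB file. It is not GRAPE's own parser; the `Atom` type and both function names are assumptions made for this example, and PQR input is not covered.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    name: str
    residue: str
    chain: str
    residue_number: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one fixed-width ATOM/HETATM record.

    The slices are the 0-based equivalents of the 1-based PDB columns:
    name 13-16, residue 18-20, chain 22, residue number 23-26,
    x 31-38, y 39-46, z 47-54.
    """
    return Atom(
        name=line[12:16].strip(),
        residue=line[17:20].strip(),
        chain=line[21],
        residue_number=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def parse_pdb_atoms(text: str) -> List[Atom]:
    """Collect every ATOM and HETATM record from the text of a PDB file."""
    return [
        parse_atom_line(line)
        for line in text.splitlines()
        if line.startswith(("ATOM  ", "HETATM"))
    ]


# Build a correctly aligned example record instead of typing columns by hand.
record = "ATOM  {:5d} {:<4s}{:1s}{:3s} {:1s}{:4d}{:1s}   {:8.3f}{:8.3f}{:8.3f}".format(
    1, " N", " ", "MET", "A", 1, " ", 27.340, 24.430, 2.614
)
atom = parse_atom_line(record)
```

Because the format is column-based rather than whitespace-delimited, slicing by position is safer than splitting on spaces, which fails when adjacent fields run together.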
Abstracting a protein can take a long time, depending on its size and complexity, so the GRAPE server creates a separate job for each submitted protein. Jobs run asynchronously on the server; after a submission, users are redirected to a job queue to monitor the status of their job.
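The submit-then-monitor pattern described above can be sketched with a thread pool: submission returns a job identifier immediately, and the status is polled separately. This is only a small in-process model, not the GRAPE server code; the `JobServer` class and the status labels `queued`, `running`, `finished` and `failed` are hypothetical names.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


class JobServer:
    """Hypothetical in-process model of an asynchronous job queue."""

    def __init__(self, work, max_workers=2):
        self._work = work  # function that processes one submission
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._ids = itertools.count(1)
        self._futures = {}

    def submit(self, pdb_text):
        """Queue a job and return its identifier without waiting for it."""
        job_id = next(self._ids)
        self._futures[job_id] = self._pool.submit(self._work, pdb_text)
        return job_id

    def status(self, job_id):
        """Report the state a job-queue page might display."""
        future = self._futures[job_id]
        if not future.done():
            return "running" if future.running() else "queued"
        return "failed" if future.exception() else "finished"

    def result(self, job_id, timeout=None):
        """Block until the job completes and return its output."""
        return self._futures[job_id].result(timeout=timeout)

    def shutdown(self):
        """Wait for all outstanding jobs, then release the worker threads."""
        self._pool.shutdown(wait=True)


# Stand-in for the real abstraction work: count the lines of the input.
server = JobServer(lambda text: len(text.splitlines()))
job_id = server.submit("ATOM ...\nATOM ...\n")
line_count = server.result(job_id, timeout=5)
server.shutdown()
```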
Along with the PDB file itself, each job has the following metadata associated with it:
This data can be seen by browsing to the ‘job queue’ page, as shown in Figure 2.
To allow users to better manage their queue, GRAPE provides an optional authentication mechanism. Users who wish to authenticate may create an account by first providing a username and password. They then gain the ability to filter the job queue to show only their jobs, to receive an email when their jobs finish and to mark specific jobs as ‘private’.
By default, all jobs in GRAPE are marked as public, which means that all GRAPE users will be able to view the results of a job. For users with sensitive data, such as prepublication proteins, the optional ability to mark a job as ‘private’ ensures that only that user will see the results.
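The visibility rule described above is small enough to state as code. The sketch below assumes a hypothetical `Job` record with an optional owner and a `private` flag that defaults to false; it illustrates the rule and is not GRAPE's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Job:
    job_id: int
    owner: Optional[str] = None  # None models an unauthenticated submission
    private: bool = False        # jobs are public unless marked otherwise


def visible_jobs(jobs: List[Job], viewer: Optional[str] = None) -> List[Job]:
    """Return the jobs a viewer may see: all public jobs plus the viewer's own private ones."""
    return [
        job
        for job in jobs
        if not job.private or (viewer is not None and job.owner == viewer)
    ]


def own_jobs(jobs: List[Job], viewer: str) -> List[Job]:
    """Filter the queue down to the jobs submitted by one authenticated user."""
    return [job for job in jobs if job.owner == viewer]


queue = [
    Job(1),                                # anonymous, public
    Job(2, owner="alice"),                 # public by default
    Job(3, owner="alice", private=True),   # visible to alice only
    Job(4, owner="bob", private=True),     # visible to bob only
]
seen_by_alice = [job.job_id for job in visible_jobs(queue, "alice")]
```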
The authentication and queue management infrastructure used in GRAPE is derived from work done for the KFC server (13).
All major processing takes place on the server back end, where jobs are farmed out to a cluster of computers in first-come, first-served order. Each computer takes on the task of abstracting a single protein. This task can be divided into two phases: the ‘data collection phase’, in which the shape and electrochemical properties of the original ‘solvent-excluded’ surface are computed, and the ‘abstraction phase’.
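The first-come, first-served dispatch described above can be illustrated with a small simulation in which each job, taken in submission order, goes to whichever machine becomes free first. The function name, the job durations and the machine count are assumptions made for the example; the actual cluster scheduler is not described in this article.

```python
import heapq
from collections import deque


def fcfs_schedule(durations, n_workers):
    """Simulate first-come, first-served dispatch of jobs to identical machines.

    Returns one (worker, start, end) tuple per job, in submission order.
    Each machine handles a single job at a time.
    """
    if n_workers < 1:
        raise ValueError("at least one worker is required")
    pending = deque(enumerate(durations))           # jobs in submission order
    free_at = [(0.0, w) for w in range(n_workers)]  # (time machine is free, machine id)
    heapq.heapify(free_at)
    schedule = [None] * len(durations)
    while pending:
        job, duration = pending.popleft()
        start, worker = heapq.heappop(free_at)      # earliest available machine
        end = start + duration
        schedule[job] = (worker, start, end)
        heapq.heappush(free_at, (end, worker))
    return schedule


# One large protein followed by three small ones, on two machines.
plan = fcfs_schedule([5, 1, 1, 1], n_workers=2)
```

In this example the large job occupies one machine while the three small jobs run back to back on the other, so no job starts before one submitted earlier.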
All of the server-side code described below is based on the algorithms found in Cipriano and Gleicher (9).
The data collection phase breaks down as follows:
After collecting surface data, the second phase of a server job is to abstract this data, transforming the detailed results of the first phase into a (visually) simpler form. This again can be broken down into a series of steps:
The final results of the abstraction process are compressed into a single ZIP file that stores the information required for the client to draw both traditional and abstracted views of the protein. This ZIP container contains a number of smaller files, in three major categories:
The ZIP file, which ranges in size from 200 KB to 12 MB, depending on the size of the protein, is stored on the server in a job-specific location.
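Bundling many small result files into one compressed container, and reading them back on the client side, can be sketched with Python's standard `zipfile` module. The entry names used here (`mesh/original.bin` and so on) are hypothetical placeholders; the actual layout of the GRAPE archive is not reproduced.

```python
import io
import zipfile


def pack_results(files):
    """Compress a mapping of entry name -> bytes into a single in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, payload in files.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def unpack_results(blob):
    """Load every entry of a ZIP archive into memory as entry name -> bytes."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# Hypothetical entries standing in for surface meshes and sampled fields.
results = {
    "mesh/original.bin": b"\x00" * 1024,
    "mesh/abstracted.bin": b"\x01" * 256,
    "fields/charge.bin": b"\x02" * 512,
}
archive_bytes = pack_results(results)
restored = unpack_results(archive_bytes)
```

A single archive keeps the download to one request, and a viewer can unpack it entirely in memory without touching the local file system.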
After the GRAPE server has completely finished a job, its status changes to a link titled ‘LaunchView’. Clicking this link brings up the results of all abstraction computations. We have built a GRAPE viewer, shown in Figure 3, which can be run directly within the output web page. Standard viewing controls are provided to let users navigate the molecular surface.
The viewer itself is written in Java and uses the Java OpenGL, or JOGL, binding library (http://opengl.j3d.org/) to render the surface. On page load, a small JAR file is downloaded for the GRAPE applet, followed by native JOGL libraries, if necessary. Finally, the ZIP file described in the ‘Surface data format’ section above is downloaded from the server and loaded into memory.
A link is also provided on this page to download abstraction data as a raw ZIP file. Currently, the GRAPE viewer is the only tool that can completely use this data, though we envision plugins for existing protein viewers that would allow us to merge abstract surfaces into existing methods of display.
GRAPE uses Google Friend Connect throughout to foster discussion about protein surfaces, to link researchers together, to allow new users to quickly discover the surfaces that others have found interesting and to provide a mechanism for others to give us feedback about our tool.
Friend Connect gadgets expand the usefulness of the site in two ways. First, GRAPE uses the ‘recommendation’ gadget to give users the ability to recommend proteins to one another, as in Figure 6. This way, experienced users can discover interesting new models, and new users can quickly see the benefits of abstraction on existing proteins before they try their own. In addition to these recommended proteins, we have added several of our own curated examples in a separate gadget, to ensure that new users always have a set of high-quality examples with which to begin their exploration of GRAPE.
Second, the viewer page for each job has an additional ‘ratings and reviews’ gadget, where users can discuss aspects of a single protein. An example can be seen in Figure 3.
In making abstractions more accessible for researchers to use, we made several trade-offs. As described in the ‘Project goals’ section above, our primary goal was to give researchers the ability to quickly use abstractions and to judge their utility for themselves. Rather than attempting to fit into the many different workflows used by researchers, we chose instead to build a simple, easy-to-use server that provides quick results. This decision does ultimately limit how useful our software can be: as it does not integrate into researchers' workflows, they cannot use the tools to which they have grown accustomed. In the future, we would like to address this limitation by providing a PyMOL plug-in that can understand and display our output data format.
We also chose to make the client hardware requirements as low as possible: the GRAPE viewer itself is very thin and all abstraction is done on the server. While a heavyweight client application could have provided more functionality, and potentially lowered the time between submitting a protein and viewing its abstraction, this would shift the burden to the client's computer, which would in turn limit the number of users who could use our software.
Nevertheless, the GRAPE viewer is currently missing several important features: there is no way to identify which regions of the surface are proximal to specific amino acids in the sequence, ribbon and ball-and-stick visualizations are not available, and certain parameters (such as the color maps for electrostatic potential and the degree of abstraction) are fixed. These features will be added in future revisions.
We have presented GRAPE, a web server that computes and displays abstracted views of the solvent-excluded surface of proteins. The server gives researchers a quick means to explore the surface of a protein of interest, in both abstracted and solvent-excluded views. In addition, GRAPE leverages social networking, in the form of Google Friend Connect, to foster discussion about any aspect of a protein, and to allow the community to share their most compelling, interesting and surprising proteins with one another.
National Institutes of Health (NIH) training grant (NLM-5T15LM007359, in part); US Department of Energy Genomics:GTL and SciDAC Programs (DE-FG02-04ER25627, in part); National Science Foundation (NSF) awards (CMMI-0941013 and IIS-0946598, in part). Funding for open access charge: US Department of Energy (DE-FG02-04ER25627) funds.
Conflict of interest statement. None declared.
We thank Julie Mitchell for providing much of the authentication and job management infrastructure and for encouraging our efforts to deliver abstractions as a web server. We also thank Joshua Oakgrove for designing and implementing an early version of the Java viewer and Aaron Bryden for helping us port server code to the Macintosh.