The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact firstname.lastname@example.org
We present a set of programs and a website designed to facilitate protein structure comparison and protein structure modeling efforts. Our protein structure analysis and comparison services use the LGA (local-global alignment) program to search for regions of local similarity and to evaluate the level of structural similarity between compared protein structures. To facilitate the homology-based protein structure modeling process, our AL2TS service translates given sequence–structure alignment data into the standard Protein Data Bank (PDB) atom records (coordinates). For a given sequence of amino acids, the AS2TS (amino acid sequence to tertiary structure) system calculates (e.g. using PSI-BLAST PDB analysis) a list of the closest proteins from the PDB, and then a set of draft 3D models is automatically created. Web services are available at http://as2ts.llnl.gov/.
Determination of protein structures via X-ray crystallography or NMR is a relatively slow and expensive process. The difficulty in increasing the rate of experimental determination of protein structures has led to the emphasis on ‘computational prediction’ and ‘analysis’ of protein structures. The web page described below has been designed to provide access to several computational protein structure comparison (LGA) and protein structure modeling (AS2TS) services.
The ability to verify sequence-based alignments by comparing with the correct structural alignments plays a crucial role in improving the quality of protein structure modeling, protein classification and protein function recognition. The LGA program (1) facilitates this analysis of sequence–structure correspondence. LGA allows detailed pairwise structural comparison of a submitted pair of proteins and also comparison of protein structures or fragments of protein structures with a selected set of proteins from the Protein Data Bank (PDB) (2). The data generated by LGA can be successfully used in a scoring function to rank the level of similarity between compared structures and to allow structural classification when many proteins are being analyzed. LGA also allows the clustering of similar fragments of protein structures. While comparing protein structures, the program generates data that provide detailed information not only about the degree of global similarity but also about regions of local similarity in protein structures. Searching for the best superposition between two structures, LGA calculates the number of residues from the second structure (the target) that are close enough under the specified distance cut-off to the corresponding residues of the first structure (the model). The distance cut-off can be chosen from 0.1 to 10.0 Å in order to calculate a more accurate (tight) or a more relaxed superposition.
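The residue count under a distance cut-off can be illustrated with a short sketch. It assumes that the two structures are already superimposed and that residues are paired by list position; LGA itself also searches for the superposition that maximizes this count, which is not attempted here. The function name `count_within_cutoff` and the toy coordinates are hypothetical.

```python
import math

def count_within_cutoff(model_xyz, target_xyz, cutoff):
    """Count target residues lying within `cutoff` (in angstroms) of their
    corresponding model residues.

    Both inputs are equal-length lists of (x, y, z) tuples, one point per
    residue (e.g. C-alpha atoms), already expressed in the same frame.
    """
    if len(model_xyz) != len(target_xyz):
        raise ValueError("coordinate lists must have the same length")
    count = 0
    for m, t in zip(model_xyz, target_xyz):
        # Euclidean distance between the paired residues
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(m, t)))
        if d <= cutoff:
            count += 1
    return count

# Toy example: four residue pairs separated by 0.5, 1.5, 3.0 and 6.0 angstroms.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
target = [(0.0, 0.0, 0.5), (1.0, 0.0, 1.5), (2.0, 0.0, 3.0), (3.0, 0.0, 6.0)]

# A tight cut-off accepts few pairs, a relaxed one accepts more.
counts = {c: count_within_cutoff(model, target, c) for c in (1.0, 2.0, 5.0, 10.0)}
```

Reporting the count at several cut-offs, as in `counts`, gives a simple profile of how similarity degrades from tight to relaxed thresholds.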
There are two provided structural comparison services:
Note that when the LGA program is run with options ‘−1, −2, −3’ it does not calculate structure-based alignments; it calculates only the structural superposition for a given (fixed) residue–residue correspondence. If the user needs to calculate a structural alignment (i.e. to establish the residue–residue correspondence automatically), then option ‘−4’ should be selected. An explanation of how to select the desired set of residues from both structures for LGA calculations, together with several examples, is provided in the service description on the website.
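For the fixed-correspondence case, one standard way to compute an optimal rigid superposition is the Kabsch algorithm. The sketch below uses it as a stand-in: it is not LGA's own procedure, and it minimizes the RMSD over all given pairs rather than working with distance cut-offs. The function name `superpose_fixed` and the use of NumPy are assumptions made for illustration.

```python
import numpy as np

def superpose_fixed(model_xyz, target_xyz):
    """Least-squares rigid superposition of model onto target (Kabsch).

    Row i of `model_xyz` is assumed to correspond to row i of `target_xyz`;
    no alignment search is performed. Returns (R, moved_model, rmsd).
    """
    P = np.asarray(model_xyz, dtype=float)
    Q = np.asarray(target_xyz, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("inputs must both be N x 3 arrays of equal shape")

    # Center both coordinate sets on their centroids.
    p_center = P.mean(axis=0)
    q_center = Q.mean(axis=0)
    P0 = P - p_center
    Q0 = Q - q_center

    # Covariance matrix and its singular value decomposition.
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate the centered model and move it onto the target centroid.
    moved = P0 @ R.T + q_center
    rmsd = float(np.sqrt(((moved - Q) ** 2).sum(axis=1).mean()))
    return R, moved, rmsd
```

Because the pairing is supplied by the caller, the result depends entirely on which residues are selected from each structure; establishing that pairing automatically is the separate alignment problem.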
The discovery that proteins with even negligible sequence similarity can have similar 3D structures, and can perform similar functions, serves as a foundation for the development of many computational protein structure prediction methods. CASP (3) experiments have shown that protein structure prediction methods based on homology search techniques are still the most reliable prediction methods (4). To facilitate the process of homology-based structural modeling, we have developed a set of services called AS2TS. Provided services are as follows:
For a given sequence of amino acids, our AS2TS system performs a quick search for the closest PDB homologs that can be used for 3D protein structure modeling. In our system the NR and the PDB data are updated weekly, so generated template information helps the user to estimate the quality of homology-based 3D models that can be currently calculated for a given protein sequence.
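Once a template has been chosen, the kind of translation from a sequence–structure alignment to PDB atom records that the AL2TS service performs can be sketched roughly as follows. This is a minimal illustration and not the AL2TS implementation: it copies only Cα coordinates from the template to the aligned target residues, skips every alignment column that contains a gap, and writes fixed occupancy and B-factor values. The function name `alignment_to_atom_records` and the input layout are hypothetical.

```python
# One-letter to three-letter codes for the 20 standard amino acids.
THREE_LETTER = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}

def alignment_to_atom_records(target_aln, template_aln, template_ca, chain="A"):
    """Build C-alpha ATOM records for the target from a pairwise alignment.

    target_aln, template_aln: aligned sequences of equal length, '-' for gaps.
    template_ca: (x, y, z) tuples, one per template residue in sequence order.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    records = []
    target_resnum = 0   # running residue number in the target sequence
    template_index = 0  # position in template_ca
    serial = 0          # atom serial number in the output
    for q, t in zip(target_aln, template_aln):
        coords = None
        if q != "-":
            target_resnum += 1
        if t != "-":
            coords = template_ca[template_index]
            template_index += 1
        if q == "-" or coords is None:
            continue  # insertion or deletion: no coordinates to transfer
        serial += 1
        x, y, z = coords
        # Fixed-column PDB ATOM layout; occupancy 1.00, B-factor 0.00.
        records.append(
            f"ATOM  {serial:5d}  CA  {THREE_LETTER[q]:>3s} {chain}"
            f"{target_resnum:4d}    {x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}"
        )
    return records

# Toy alignment: target residue 4 (V) has no template partner, and
# template residue 3 (A) has no target partner.
example = alignment_to_atom_records(
    "MK-LV", "MKAL-",
    [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0), (10.0, 11.0, 12.0)],
)
```

Target residues that fall in gapped columns are simply left out here, so a draft model produced this way has missing segments wherever the template provides no coordinates.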
Figure 1 shows screenshots of the results obtained when our AS2TS system was used to search quickly for the closest PDB homologs that could serve as templates for 3D model building of the capsid protein sequences of bovine enterovirus (BEV)-2 strain PS-87. BEVs are members of the Picornaviridae family, genus Enterovirus. Detailed 3D protein structure models for three BEV strains were created. This modeling effort was performed in two steps: (i) the structure of the closest template (PDB entry: 1bev) was modified/corrected in several regions, and some missing residues were modeled; and (ii) the modified 1bev structure was used as a template to build 3D models for capsid proteins of the three BEV strains of interest.
We have created complete 3D models of the capsids (Figure 2, right) for three BEV strains and for some related PDB templates. Calculated structures will be used for detailed analysis of the ‘canyon regions’ and for identifying structural differences and similarities among various animal picornaviruses. Modeling of the BEV-2 capsid structure supports the generally accepted idea that the region of the VP-1 protein that connects the eight β-strands making up the wedge-shaped region of each capsid protein is part of the variable region specifying the antigenically variable sites. The details of this work were published previously (9).
The AS2TS system has been used to facilitate the molecular replacement (MR) phasing technique in experimental X-ray crystallographic determination of the protein structure of Mycobacterium tuberculosis (MTB) RmlC epimerase (Rv3465) from the strain H37Rv. The MTB RmlC protein was crystallized by the Biosciences crystallography group at Lawrence Livermore National Laboratory, and native X-ray data (without phases) were collected at the Advanced Light Source at Lawrence Berkeley Laboratory. Although structurally related homologs were tried for MR, the technique failed because the sequences were too dissimilar. Using our AS2TS system, we built two homology models of this protein that were then successfully employed as MR targets (10).
Evaluation of the generated MTB RmlC models was performed using LGA. Detailed structural comparison analysis of 14 homologs revealed two proteins, dTDP-4-dehydrorhamnose epimerase (PDB entry: 1ep0) and RmlC from Salmonella typhimurium (PDB entry: 1dzr), which were selected as primary templates.
Figure 3 illustrates the results from LGA analysis when 14 proteins of known structure were compared with the selected target protein. This LGA capability allowed us to localize the regions that were structurally similar among all analyzed proteins, select one or more structures as templates for homology modeling, and use this information to create a consensus model. The process of structural determination for the MTB RmlC protein (PDB entry: 1upi) was described by Kantardjieff et al. (10).
This work was performed under the auspices of the US Department of Energy by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48. The design and development of described systems was supported by LLNL LDRD grants 02-LW-003 and 04-ERD-068 to A.Z. Funding to pay the Open Access publication charges for this article was provided by US Department of Energy.
Conflict of interest statement. None declared.