This paper investigates whether the solvent accessibility of a peptide in the context of a full protein structure can be used to identify potential epitopes. Lollier et al [15] recently questioned this relationship, which is widely used in B-cell epitope prediction algorithms based on linear sequence analysis [31] and protein structures [2]. This paper presents a simple approach to analyzing 3D models or structures using the Naccess and V3D algorithms to obtain values for solvent exposure and local model/structure quality, respectively. A sequence is scored as “positive” when these values are above the defined thresholds.
Several B-cell epitope prediction methods have recently been developed based on the linear protein sequence or on protein structure coordinates [2,3]. However, B-cell epitope prediction methods are often largely inaccurate for several reasons [10,13]. The characteristics that render a sequence suitable for antibody binding are still poorly defined despite extensive research in this area, making even the prediction of linear epitopes a difficult task. For prediction methods based on 3D structures, evidence is emerging that a significant number of proteins (about 40% of all human proteins) contain at least one disordered segment of 30 amino acids or more, while 25% of all human proteins are likely to be entirely disordered and might reach a defined structure only when interacting with a ligand [32,33]. Therefore, experimental crystal structures or structure models may not necessarily reflect the real conformation of proteins or protein complexes in solution. Despite these limitations, a number of methods demonstrate significant predictive power when challenged with experimental datasets.
This report presents a rather simple structure-based method that analyzes only two parameters (local solvent exposure and local structure quality). The software is called B-Pred and can be freely accessed through a web server located at http://immuno.bio.uniroma2.it/bpred. This method is aimed at predicting linear, continuous epitopes (as opposed to conformational/discontinuous epitopes). B-Pred showed a sensitivity of 0.37 at a specificity of 0.69, making the method comparable with other published methods that are based on linear protein sequences (LEPD), 3D coordinates (ElliPro), or SVM models (CBTOPE), with a slightly increased specificity, thereby reducing the number of false-positive predictions. It should be noted that the conditions used to test LEPD, ElliPro, and CBTOPE differ from those reported in their original papers, as the epitope database is different and assumptions were made in order to score 20-mers with these methods.
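For readers less familiar with these metrics, the following minimal sketch shows how sensitivity and specificity are conventionally derived from paired predicted and experimental labels. The function name and the toy labels are illustrative assumptions and are not taken from the B-Pred benchmark.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for paired boolean labels.

    predicted[i] is True when peptide i was scored "positive";
    actual[i] is True when peptide i is an experimental epitope.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives
    fp = sum(1 for p, a in pairs if p and not a)      # false positives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives

    # Sensitivity: fraction of real epitopes that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of non-epitopes that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels (hypothetical): three epitopes, three non-epitopes.
predicted = [True, False, False, True, False, False]
actual = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(predicted, actual)
```

A higher specificity at comparable sensitivity means that fewer non-epitope peptides are scored as positive, which is the property emphasized above.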
This method was implemented and made publicly available through a web server with a number of novel features that are not available in similar servers. The server is geared toward scoring potential immunological reagents (peptides) derived from protein sequences. B-Pred uses a sliding window to scan the sequence and identify potential epitopes. The parameters of the analysis can be modified during subsequent iterations in order to identify the reagents that are most suited to the specific needs of the user.
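The sliding-window scan can be illustrated with a minimal sketch. It assumes that per-residue solvent exposure and local quality values are already available as two lists, and that a window is scored positive when the mean of each value exceeds its cutoff. The averaging rule, the function name, and the cutoff values are assumptions made for illustration; they are not the server's documented defaults.

```python
from statistics import fmean


def scan_windows(exposure, quality, window=20,
                 min_exposure=30.0, min_quality=0.2):
    """Return (start, end) index pairs of windows scored "positive".

    exposure and quality hold one value per residue. A window passes
    when its mean exposure and mean quality both exceed the cutoffs
    (hypothetical values; averaging is an assumed scoring rule).
    """
    if len(exposure) != len(quality):
        raise ValueError("per-residue lists must have the same length")

    hits = []
    # Slide one residue at a time; end is exclusive.
    for start in range(len(exposure) - window + 1):
        end = start + window
        if (fmean(exposure[start:end]) > min_exposure
                and fmean(quality[start:end]) > min_quality):
            hits.append((start, end))
    return hits


# Toy input with a 3-residue window so the result is easy to follow.
toy_exposure = [10, 50, 60, 70, 5, 5]
toy_quality = [0.5, 0.5, 0.5, 0.0, 0.0, 0.0]
toy_hits = scan_windows(toy_exposure, toy_quality, window=3)
```

Re-running the scan with a different window length or different cutoffs corresponds to the iterative adjustment of parameters described above.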
A unique feature of the B-Pred server is the identification of the residues located in protein-protein interaction surfaces. This information can be relevant in designing peptides for use in the production of antibodies/antisera with specific characteristics. In this context, it could be speculated that antibodies directed at protein-protein interfaces could display neutralizing activity by preventing or competing with the formation of active protein complexes. Conversely, antibodies targeted at areas not involved in complex formation, but still located on solvent-exposed regions, could be suitable reagents for the immunoprecipitation of whole protein complexes. The B-Pred server is only a first contribution toward this ambitious goal.
Although the B-Pred server considers conformational and structural information to determine solvent accessibility, it exclusively focuses on the prediction of continuous linear epitopes, as opposed to discontinuous epitopes that can be predicted by other B-cell prediction servers. While this limits the scope of the method, it allows for an immediate translation of the results into peptidic reagents for bench research, which is one of the main purposes of this system.
According to Lollier et al, surface and solvent exposure, as assessed by different methods (Relative Surface Accessibility and Protrusion Index), cannot be reliably correlated to antigenic propensity [15]. There are a number of reasons why an experimentally determined epitope can have poor solvent exposure in the context of the 3D structure of the full protein. For instance, a protein or allergen can be denatured or otherwise processed before or after being injected into an animal for immunization. Before the availability of prediction methods based on structure, peptides were selected from protein sequences using propensity scale indexes and were successfully used to raise antisera or monoclonal antibodies. It is common knowledge that some monoclonals only work in Western blots, and therefore recognize sequences that become exposed only after protein denaturation during electrophoresis and blotting procedures. However, other monoclonals are suitable for immunoprecipitation of the target protein from undenatured lysates, and therefore recognize solvent-accessible surface sequence stretches that are either continuous or discontinuous with respect to the linear sequence. For these and other reasons, many sequences stored in databases as containing B-cell epitopes can have an overall poor surface exposure. Detailed information about the methods and experimental conditions used for their identification would be extremely useful for the development of more targeted prediction methodologies that take all of the above considerations into account.
It should be noted that, despite the report by Lollier et al [15], in the present study we do observe a correlation between surface exposure and antigenic propensity. This could be due to a number of reasons that are worth investigating. For example, our epitope dataset is, by construction, restricted to proteins for which a 3D structure is available or for which a model can be reliably computed. It is also possible that our experimental epitope dataset is biased toward peptides that were originally selected for synthesis and experimental testing using existing structural prediction methods, which most often incorporate surface exposure information in their algorithms. Since the methodology used for peptide design is not readily available in epitope databases, this kind of hypothesis is not easy to verify. Research is underway to address these issues.
Since the current B-Pred implementation is based on a single parameter (solvent exposure) that is filtered on local model quality, it is reasonable to assume that the method could be further improved by the inclusion of additional structural parameters [12,34] and/or by combining the solvent exposure, as directly determined from the structure, with classical linear propensity scales. Research is underway to investigate the possible inclusion of additional parameters to improve the prediction accuracy of the current method.
Among the possible applications of this method, the development of diagnostic reagents for serological analysis is worth mentioning. A protein encoded in the genome of a pathogen of interest can be analyzed for potential B-cell epitopes that could be targeted by the humoral host response. Peptides containing these epitopes have potential as diagnostic reagents for serological tests if the selected sequences are specific to the pathogen of interest. In order to optimize the discovery of B-cell epitopes with diagnostic potential by identifying amino acid sequences that are only present in a given pathogen strain (which is related to the concept of “conservancy” [35]), an interesting development of this work will be to automatically link the B-Pred analysis with a BLAST analysis. Work is currently in progress to achieve this goal in a future version of the B-Pred web server.
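The idea of retaining only pathogen-specific peptides can be sketched as follows. This is not BLAST and not a B-Pred feature: it is a crude stand-in that rejects a candidate peptide if it shares any exact k-residue stretch with a set of background (off-target) sequences. The function names, the value k=8, and the toy sequences are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all contiguous k-residue substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def pathogen_specific(peptides, background, k=8):
    """Keep peptides sharing no exact k-mer with any background sequence.

    Crude exact-match stand-in for a similarity search such as BLAST;
    it does not detect mismatched or gapped similarity. Peptides shorter
    than k have no k-mers and are therefore always kept.
    """
    background_kmers = set()
    for seq in background:
        background_kmers |= kmers(seq, k)
    return [p for p in peptides if not (kmers(p, k) & background_kmers)]


# Toy sequences (hypothetical): the first candidate overlaps the background.
candidates = ["ACDEFGHIKLMN", "QRSTVWYACDEF"]
off_target = ["MMMACDEFGHIKMMM"]
specific = pathogen_specific(candidates, off_target, k=8)
```

A real implementation would replace the exact k-mer test with an alignment-based search against the relevant sequence databases, as proposed above.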
In conclusion, this study provides a new, freely accessible online tool for the selection of candidate B-cell epitopes in proteins of interest, and is focused on the design of experimental reagents for a variety of biological and biotechnological applications.