Binding of peptides to major histocompatibility complex (MHC) molecules is the single most selective step in the recognition of pathogens by the cellular immune system. The human MHC genomic region (called HLA) is extremely polymorphic comprising several thousand alleles, each encoding a distinct MHC molecule. The potentially unique specificity of the majority of HLA alleles that have been identified to date remains uncharacterized. Likewise, only a limited number of chimpanzee and rhesus macaque MHC class I molecules have been characterized experimentally. Here, we present NetMHCpan-2.0, a method that generates quantitative predictions of the affinity of any peptide–MHC class I interaction. NetMHCpan-2.0 has been trained on the hitherto largest set of quantitative MHC binding data available, covering HLA-A and HLA-B, as well as chimpanzee, rhesus macaque, gorilla, and mouse MHC class I molecules. We show that the NetMHCpan-2.0 method can accurately predict binding to uncharacterized HLA molecules, including HLA-C and HLA-G. Moreover, NetMHCpan-2.0 is demonstrated to accurately predict peptide binding to chimpanzee and macaque MHC class I molecules. The power of NetMHCpan-2.0 to guide immunologists in interpreting cellular immune responses in large out-bred populations is demonstrated. Further, we used NetMHCpan-2.0 to predict potential binding peptides for the pig MHC class I molecule SLA-1*0401. Ninety-three percent of the predicted peptides were demonstrated to bind stronger than 500 nM. The high performance of NetMHCpan-2.0 for non-human primates documents the method's ability to provide broad allelic coverage also beyond human MHC molecules. The method is available at http://www.cbs.dtu.dk/services/NetMHCpan.
MHC class I; Binding specificity; Non-human primates; Artificial neural networks; CTL epitopes
Motivation: MHC:peptide binding plays a central role in activating immune surveillance. Computational approaches to determine T-cell epitopes restricted to any given major histocompatibility complex (MHC) molecule are of special practical value in the development of, for instance, vaccines with broad population coverage against emerging pathogens. Methods have recently been published that are able to predict peptide binding to any human MHC class I molecule. In contrast to conventional allele-specific methods, these methods allow for extrapolation to uncharacterized MHC molecules. These pan-specific human leukocyte antigen (HLA) predictors have not previously been compared using independent evaluation sets.
Results: A diverse set of quantitative peptide binding affinity measurements was collected from the Immune Epitope Database (IEDB), together with a large set of HLA class I ligands from the SYFPEITHI database. Based on these datasets, three different pan-specific, web-accessible HLA predictors, NetMHCpan, adaptive double threading (ADT) and kernel-based inter-allele peptide binding prediction system (KISS), were evaluated. The performance of the pan-specific predictors was also compared with a well-performing allele-specific MHC class I predictor, NetMHC, as well as a consensus approach integrating the predictions from the NetMHC and NetMHCpan methods.
Conclusions: The benchmark demonstrated that pan-specific methods do provide accurate predictions also for previously uncharacterized MHC molecules. The NetMHCpan method trained to predict actual binding affinities was consistently top ranking both on quantitative (affinity) and binary (ligand) data. However, the KISS method trained to predict binary data was one of the best performing methods when benchmarked on binary data. Finally, a consensus method integrating predictions from the two best performing methods was shown to improve the prediction accuracy.
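The consensus idea mentioned above can be sketched in a few lines. The abstract does not state how the two predictors were integrated, so the simple per-peptide averaging below is an assumption, as are the function name `consensus_ranking`, the example scores, and the convention that higher scores mean stronger predicted binding.

```python
from typing import Dict, List, Tuple


def consensus_ranking(
    scores_a: Dict[str, float],
    scores_b: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Average two predictors' scores per peptide and rank, best first.

    Assumes both predictors report scores on a comparable scale where a
    higher value means stronger predicted binding (assumption of this sketch).
    """
    # Only peptides scored by both predictors can receive a consensus score.
    shared = scores_a.keys() & scores_b.keys()
    averaged = {pep: (scores_a[pep] + scores_b[pep]) / 2.0 for pep in shared}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(averaged.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores from two predictors for three 9-mer peptides.
predictor_1 = {"SLYNTVATL": 0.80, "GILGFVFTL": 0.70, "AAAAAAAAA": 0.10}
predictor_2 = {"SLYNTVATL": 0.60, "GILGFVFTL": 0.90, "AAAAAAAAA": 0.20}
ranking = consensus_ranking(predictor_1, predictor_2)
for peptide, score in ranking:
    print(f"{peptide}\t{score:.2f}")
```

If the two methods report values on different scales (for example IC50 values versus a binary-trained score), the scores would first need to be brought onto a common scale, for instance by converting each to a rank.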
Supplementary information: Supplementary data are available at Bioinformatics online.
Major histocompatibility complex class II (MHCII) molecules play an important role in cell-mediated immunity. They present specific peptides derived from endosomal proteins for recognition by T helper cells. The identification of peptides that bind to MHCII molecules is therefore of great importance for understanding the nature of immune responses and identifying T cell epitopes for the design of new vaccines and immunotherapies. Given the large number of MHC variants, and the costly experimental procedures needed to evaluate individual peptide–MHC interactions, computational predictions have become particularly attractive as first-line methods in epitope discovery. However, only a few so-called pan-specific prediction methods capable of predicting binding to any MHC molecule with known protein sequence are currently available, and all of them are limited to HLA-DR. Here, we present the first pan-specific method capable of predicting peptide binding to any HLA class II molecule with a defined protein sequence. The method employs a strategy common for HLA-DR, HLA-DP and HLA-DQ molecules to define the peptide-binding MHC environment in terms of a pseudo sequence. This strategy allows the inclusion of new molecules even from other species. The method was evaluated in several benchmarks and demonstrates a significant improvement over molecule-specific methods as well as the ability to predict peptide binding of previously uncharacterised MHCII molecules. To the best of our knowledge, the NetMHCIIpan-3.0 method is the first pan-specific predictor covering all HLA class II molecules with known sequences including HLA-DR, HLA-DP, and HLA-DQ. The NetMHCIIpan-3.0 method is available at http://www.cbs.dtu.dk/services/NetMHCIIpan-3.0.
MHC class II; T cell epitope; MHC binding specificity; Peptide–MHC binding; Human leukocyte antigens; Artificial neural networks
MHC class I molecules (HLA-I in humans) present peptides derived from endogenous proteins to CTLs. Whereas the peptide-binding specificities of HLA-A and -B molecules have been studied extensively, little is known about HLA-C specificities. Combining a positional scanning combinatorial peptide library approach with a peptide–HLA-I dissociation assay, in this study we present a general strategy to determine the peptide-binding specificity of any MHC class I molecule. We applied this novel strategy to 17 of the most common HLA-C molecules, and for 16 of these we successfully generated matrices representing their peptide-binding motifs. The motifs prominently shared a conserved C-terminal primary anchor with hydrophobic amino acid residues, as well as one or more diverse primary and auxiliary anchors at P1, P2, P3, and/or P7. Matrices were used to generate a large panel of HLA-C–specific peptide-binding data and update our pan-specific NetMHCpan predictor, whose predictive performance was considerably improved with respect to peptide binding to HLA-C. The updated predictor was used to assess the specificities of HLA-C molecules, which were found to cover a more limited sequence space than HLA-A and -B molecules. Assessing the functional significance of these new tools, HLA-C*07:01 transgenic mice were immunized with stable HLA-C*07:01 binders; six of six tested stable peptide binders were immunogenic. Finally, we generated HLA-C tetramers and labeled human CD8+ T cells and NK cells. These new resources should support future research on the biology of HLA-C molecules. The data are deposited at the Immune Epitope Database, and the updated NetMHCpan predictor is available at the Center for Biological Sequence Analysis and the Immune Epitope Database.
In all vertebrate animals, CD8+ cytotoxic T lymphocytes (CTLs) are controlled by major histocompatibility complex class I (MHC-I) molecules. These are highly polymorphic peptide receptors selecting and presenting endogenously derived epitopes to circulating CTLs. The polymorphism of the MHC effectively individualizes the immune response of each member of the species. We have recently developed efficient methods to generate recombinant human MHC-I (also known as human leukocyte antigen class I, HLA-I) molecules, accompanying peptide-binding assays and predictors, and HLA tetramers for specific CTL staining and manipulation. This has enabled a complete mapping of all HLA-I specificities (“the Human MHC Project”). Here, we demonstrate that these approaches can be applied to other species. We systematically transferred domains of the frequently expressed swine MHC-I molecule, SLA-1*0401, onto a HLA-I molecule (HLA-A*11:01), thereby generating recombinant human/swine chimeric MHC-I molecules as well as the intact SLA-1*0401 molecule. Biochemical peptide-binding assays and positional scanning combinatorial peptide libraries were used to analyze the peptide-binding motifs of these molecules. A pan-specific predictor of peptide–MHC-I binding, NetMHCpan, which was originally developed to cover the binding specificities of all known HLA-I molecules, was successfully used to predict the specificities of the SLA-1*0401 molecule as well as the porcine/human chimeric MHC-I molecules. These data indicate that it is possible to extend the biochemical and bioinformatics tools of the Human MHC Project to other vertebrate species.
Recombinant MHC; Peptide specificity; Binding predictions
CD4 positive T helper cells control many aspects of specific immunity. These cells are specific for peptides derived from protein antigens and presented by molecules of the extremely polymorphic major histocompatibility complex (MHC) class II system. The identification of peptides that bind to MHC class II molecules is therefore of pivotal importance for rational discovery of immune epitopes. HLA-DR is a prominent example of a human MHC class II. Here, we present a method, NetMHCIIpan, that allows for pan-specific predictions of peptide binding to any HLA-DR molecule of known sequence. The method is derived from a large compilation of quantitative HLA-DR binding events covering 14 of the more than 500 known HLA-DR alleles. Taking both peptide and HLA sequence information into account, the method can generalize and predict peptide binding also for HLA-DR molecules where experimental data is absent. Validation of the method includes identification of endogenously derived HLA class II ligands, cross-validation, leave-one-molecule-out, and binding motif identification for hitherto uncharacterized HLA-DR molecules. The validation shows that the method can successfully predict binding for HLA-DR molecules—even in the absence of specific data for the particular molecule in question. Moreover, when compared to TEPITOPE, currently the only other publicly available prediction method aiming at providing broad HLA-DR allelic coverage, NetMHCIIpan performs equivalently for alleles included in the training of TEPITOPE while outperforming TEPITOPE on novel alleles. We propose that the method can be used to identify those hitherto uncharacterized alleles, which should be addressed experimentally in future updates of the method to cover the polymorphism of HLA-DR most efficiently. 
We thus conclude that the presented method meets the challenge of keeping up with the MHC polymorphism discovery rate and that it can be used to sample the MHC “space,” enabling a highly efficient iterative process for improving MHC class II binding predictions.
CD4 positive T helper cells provide essential help for stimulation of both cellular and humoral immune reactions. T helper cells recognize peptides presented by molecules of the major histocompatibility complex (MHC) class II system. HLA-DR is a prominent example of a human MHC class II locus. The HLA molecules are extremely polymorphic, and more than 500 different HLA-DR protein sequences are known today. Each HLA-DR molecule potentially binds a unique set of antigenic peptides, and experimental characterization of the binding specificity for each molecule would be an immense and highly costly task. Only a very limited set of MHC molecules has been characterized experimentally. We have demonstrated earlier that it is possible to derive accurate predictions for MHC class I proteins by interpolating information from neighboring molecules. It is not straightforward to take a similar approach to derive pan-specific HLA-DR class II predictions because the HLA class II molecules can bind peptides of very different lengths. Here, we nonetheless show that this is indeed possible. We develop an HLA-DR pan-specific method that allows for prediction of binding to any HLA-DR molecule of known sequence—even in the absence of specific data for the particular molecule in question.
Reliable predictions of immunogenic peptides are essential in rational vaccine design and can minimize the experimental effort needed to identify epitopes. In this work, we describe a pan-specific major histocompatibility complex (MHC) class I epitope predictor, NetCTLpan. The method integrates predictions of proteasomal cleavage, transporter associated with antigen processing (TAP) transport efficiency, and MHC class I binding affinity into a MHC class I pathway likelihood score and is an improved and extended version of NetCTL. The NetCTLpan method performs predictions for all MHC class I molecules with known protein sequence and allows predictions for 8-, 9-, 10-, and 11-mer peptides. In order to meet the need for a low false positive rate, the method is optimized to achieve high specificity. The method was trained and validated on large datasets of experimentally identified MHC class I ligands and cytotoxic T lymphocyte (CTL) epitopes. It has been reported that MHC molecules are differentially dependent on TAP transport and proteasomal cleavage. Here, we did not find any consistent signs of such MHC dependencies, and the NetCTLpan method is implemented with fixed weights for proteasomal cleavage and TAP transport for all MHC molecules. The predictive performance of the NetCTLpan method was shown to outperform other state-of-the-art CTL epitope prediction methods. Our results further confirm the importance of using full-type human leukocyte antigen restriction information when identifying MHC class I epitopes. Using the NetCTLpan method, the experimental effort to identify 90% of new epitopes can be reduced by 15% and 40%, respectively, when compared to the NetMHCpan and NetCTL methods. The method and benchmark datasets are available at http://www.cbs.dtu.dk/services/NetCTLpan/.
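The integration step described above can be illustrated with a minimal sketch. The abstract states only that the three predictions are combined with fixed weights for proteasomal cleavage and TAP transport; the linear form, the weight values, and the names `PathwayWeights` and `pathway_score` are assumptions made for illustration, not the published parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathwayWeights:
    """Weights shared by all MHC molecules (values here are hypothetical)."""

    cleavage: float = 0.2
    tap: float = 0.05


def pathway_score(
    mhc_binding: float,
    cleavage: float,
    tap: float,
    weights: PathwayWeights = PathwayWeights(),
) -> float:
    """Combine three per-peptide predictions into one pathway score.

    Assumed form: binding score plus weighted cleavage and TAP terms. The
    same weights are used whatever the MHC molecule, mirroring the fixed
    weights mentioned in the text.
    """
    return mhc_binding + weights.cleavage * cleavage + weights.tap * tap


# Hypothetical predictions for a single peptide.
print(pathway_score(mhc_binding=0.5, cleavage=1.0, tap=2.0))
```

Keeping the weights in one immutable object makes the "same weights for every MHC molecule" design choice explicit and easy to change in a single place.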
Electronic supplementary material
The online version of this article (doi:10.1007/s00251-010-0441-4) contains supplementary material, which is available to authorized users.
MHC class I pathway; HLA; Pan-specific prediction; CTL epitope; MHC polymorphism
Human leukocyte antigen (HLA) class I molecules are encoded by a large family of genes and are characterized by high polymorphism. The number of registered HLA-I molecules now exceeds 3000. Slight differences in the amino acid sequences of HLAs can make them bind different sets of peptides. Although many methods proposed over the past decades predict the binding between peptides and HLA-I molecules with good performance, most of the experimental data they use are limited to a small number of HLA alleles. They therefore tend to achieve high prediction accuracy only for data from similar alleles. Because the peptide and the HLA together determine binding, the contributions of both need to be considered simultaneously.
Taking into account features of the peptide sequence and the energy of contact residues, this paper proposes a method based on an artificial neural network to predict the binding of peptides to HLA-I molecules even when the HLA alleles involved have not been characterized. Two experiments, one in the allele-specific case and one in the super-type case, are performed to validate the method. In the first, we collect 14 HLA-A and 14 HLA-B molecules from the Bjoern Peters dataset and compare our method with ARB, SMM, NetMHC and 16 other online methods. Our method achieves the best average AUC (area under the ROC curve), 0.909. In the second, we use leave-one-out cross-validation on MHC-peptide binding data from different alleles that share a common super-type. Compared with gold-standard methods such as NetMHC and NetMHCpan, our method again achieves the best average AUC, 0.847.
Our method achieves satisfactory results. Whether it is tested on HLA-I molecules from a single defined gene or from a super-type gene locus, it achieves better classification accuracy than the methods it was compared with. In particular, when the training set is small, our method still performs better than the other methods in the comparison. We therefore conclude that, by combining peptide information, HLA amino acid residue interaction information and contact energy, our method can improve the prediction of peptide-HLA-I binding even when no prior experimental data are available for the HLA alleles in question.
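The AUC values quoted above summarize how well a predictor ranks binders above non-binders. As a minimal illustration (not the authors' evaluation code), the AUC can be computed directly as the fraction of binder/non-binder pairs that are ordered correctly, counting ties as half. The function name `roc_auc` and the example data are assumptions of this sketch.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a binder, 0 for a non-binder.
    scores: predicted values where higher means more likely to bind.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # binder correctly ranked above non-binder
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions: two binders followed by two non-binders.
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```

An AUC of 1.0 means every binder scores above every non-binder, while 0.5 corresponds to random ranking. The quadratic loop is fine for small benchmarks; a rank-based formulation would be preferable for large datasets.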
Donor T-cell mediated graft versus host (GVH) effects may result from the aggregate alloreactivity to minor histocompatibility antigens (mHA) presented by the human leukocyte antigen (HLA) molecules in each donor–recipient pair undergoing stem-cell transplantation (SCT). Whole exome sequencing has previously demonstrated a large number of non-synonymous single nucleotide polymorphisms (SNP) present in HLA-matched recipients of SCT donors (GVH direction). The nucleotide sequence flanking each of these SNPs was obtained and the amino acid sequence determined. All the possible nonameric peptides incorporating the variant amino acid resulting from these SNPs were interrogated in silico for their likelihood to be presented by the HLA class I molecules using the Immune Epitope Database stabilized matrix method (SMM) and NetMHCpan algorithms. The SMM algorithm predicted that a median of 18,396 peptides weakly bound HLA class I molecules in individual SCT recipients, and 2,254 peptides displayed strong binding. A similar library of presented peptides was identified when the data were interrogated using the NetMHCpan algorithm. The bioinformatic algorithm presented here demonstrates that there may be a high level of mHA variation in HLA-matched individuals, constituting a HLA-specific alloreactivity potential.
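Enumerating "all the possible nonameric peptides incorporating the variant amino acid" is a sliding-window operation. The sketch below shows one way to do it, assuming the variant protein sequence and the 0-based index of the altered residue are already known; the function name `peptides_covering_variant` and the example sequence are hypothetical.

```python
from typing import List


def peptides_covering_variant(
    protein: str, variant_index: int, length: int = 9
) -> List[str]:
    """Return every peptide of the given length that contains the variant.

    variant_index is the 0-based position of the variant amino acid in the
    (already mutated) protein sequence. Up to `length` peptides are
    returned, fewer when the variant lies near either end of the protein.
    """
    if not 0 <= variant_index < len(protein):
        raise ValueError("variant_index is outside the protein sequence")
    # Earliest window still containing the variant (variant at last position).
    first_start = max(0, variant_index - length + 1)
    # Latest window containing the variant that still fits in the sequence.
    last_start = min(variant_index, len(protein) - length)
    return [protein[s:s + length] for s in range(first_start, last_start + 1)]


# Toy 20-residue sequence; the residue at index 10 ("M") plays the variant.
for peptide in peptides_covering_variant("ACDEFGHIKLMNPQRSTVWY", 10):
    print(peptide)
```

Each resulting peptide would then be submitted to a binding predictor such as SMM or NetMHCpan for every HLA class I molecule of the recipient.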
alloreactivity potential; stem-cell transplant; whole exome sequencing; HLA; minor histocompatibility antigen
As a potent CD8+ T cell activator, the peptide vaccine has found its way into vaccine development against intracellular infections and cancer, but not against leishmaniasis. The first step toward a peptide vaccine is epitope mapping of different proteins according to the most frequent HLA types in a population.
Methods and Findings
Six Leishmania (L.) major-related candidate antigens (CPB, CPC, LmsTI-1, TSA, LeIF and LPG-3) were screened for potential CD8+ T cell activating 9-mer epitopes presented by HLA-A*0201 (the most frequent HLA-A allele). Online tools including SYFPEITHI, BIMAS, EpiJen, Rankpep, nHLApred, NetCTL and Multipred were used. Peptides were selected only if predicted by almost all programs, according to their predictive scores. Pan-A2 presentation of selected peptides was confirmed by NetMHCpan 1.1. Selected peptides were pooled in four peptide groups and their immunogenicity was evaluated by in vitro stimulation and intracellular cytokine assay of PBMCs from HLA-A2+ individuals recovered from L. major infection. HLA-A2− individuals recovered from L. major infection and HLA-A2+ healthy donors were included as control groups. The individual responses of HLA-A2+ recovered volunteers, measured as the percentage of CD8+/IFN-γ+ T cells after in vitro stimulation with peptide pools II and IV, were notably higher than those of HLA-A2− recovered individuals. Based on cutoff scores calculated from the response of HLA-A2− recovered individuals, 31.6% and 13.3% of HLA-A2+ recovered persons responded above cutoff to pools II and IV, respectively. ELISpot and ELISA results confirmed the flow cytometry analysis. The response of HLA-A2− recovered individuals to peptide pools I and III was similar to or even higher than that of HLA-A2+ recovered individuals.
Using in silico prediction, we demonstrated a specific response to LmsTI-1- (pool II) and LPG-3- (pool IV) related peptides presented specifically in the HLA-A*0201 context. This is among the very few reports mapping L. major epitopes for human HLA types. Studies like this will advance the polytope vaccine concept for leishmaniasis.
Leishmaniasis is currently a serious health and economic problem in underdeveloped and developing countries in Africa, Asia, the Near and Middle East, Central and South America and the Mediterranean region. Cutaneous leishmaniasis is highly endemic in Iran, notably in Isfahan, Fars, Khorasan, Khozestan and Kerman provinces. Since effective prevention is not available and current curative therapy is expensive, often poorly tolerated and not always effective, alternative approaches including vaccination against leishmaniasis are a priority. Although a Th1-dominant response is so far considered a prerequisite for the immune system to overcome the infection, the CD8+ T cell response could also be considered a potent arm of the immune system in fighting intracellular Leishmania. The polytope vaccine strategy may open up a new avenue in vaccine design against leishmaniasis, since polytope vaccines act as a potent tool to stimulate multiple CD8+ T cell responses. Clearly there is a substantial need to evaluate promising epitopes from different proteins of Leishmania parasite species. Some new immunoinformatic tools are now available to speed up this process, and we have shown here that in silico prediction can effectively identify HLA class I-restricted epitopes in Leishmania proteins.
Peptide-major histocompatibility complex (p-MHC) class I tetramer complexes have facilitated the early detection and functional characterisation of epitope specific CD8+ cytotoxic T lymphocytes (CTL). Here, we report on the generation of seven recombinant bovine leukocyte antigens (BoLA) and recombinant bovine β2-microglobulin from which p-MHC class I tetramers can be derived in ~48 h. We validated a set of p-MHC class I tetramers against a panel of CTL lines specific to seven epitopes on five different antigens of Theileria parva, a protozoan pathogen causing the lethal bovine disease East Coast fever. One of the p-MHC class I tetramers was tested in ex vivo assays and we detected T. parva specific CTL in peripheral blood of cattle at days 15-17 post-immunization with a live parasite vaccine. The algorithm NetMHCpan predicted alternative epitope sequences for some of the T. parva CTL epitopes. Using an ELISA assay to measure peptide-BoLA monomer formation and p-MHC class I tetramers of new specificity, we demonstrate that a predicted alternative epitope Tp2(29-37) rather than the previously reported Tp2(27-37) epitope is the correct Tp2 epitope presented by BoLA-6*04101. We also verified the prediction by NetMHCpan that the Tp5(87-95) epitope reported as BoLA-T5 restricted can also be presented by BoLA-1*02301, a molecule similar in sequence to BoLA-T5. In addition, Tp5(87-95) specific bovine CTL were simultaneously stained by Tp5-BoLA-1*02301 and Tp5-BoLA-T5 tetramers, suggesting that one T cell receptor can bind to two different BoLA MHC class I molecules presenting the Tp5(87-95) epitope and that these BoLA molecules fall into a single functional supertype.
MHC class II proteins bind oligopeptide fragments derived from proteolysis of pathogen antigens, presenting them at the cell surface for recognition by CD4+ T cells. Human MHC class II alleles are grouped into three loci: HLA-DP, HLA-DQ and HLA-DR. In contrast to HLA-DR and HLA-DQ, HLA-DP proteins have not been studied extensively, as they have been viewed as less important in immune responses than DRs and DQs. However, it is now known that HLA-DP alleles are associated with many autoimmune diseases. Quite recently, the X-ray structure of the HLA-DP2 molecule (DPA1*0103, DPB1*0201) in complex with a self-peptide derived from the HLA-DR α-chain has been determined. In the present study, we applied a validated molecular docking protocol to a library of 247 modelled peptide-DP2 complexes, seeking to assess the contribution made by each of the 20 naturally occurring amino acids at each of the nine binding core peptide positions and the four flanking residues (two on each side).
The free binding energies (FBEs) derived from the docking experiments were normalized on a position-dependent (npp) and on an overall basis (nap), and two docking score-based quantitative matrices (DS-QMs) were derived: QMnpp and QMnap. They reveal the amino acid preferences at each of the 13 positions considered in the study. Apart from the leading role of anchor positions p1 and p6, the binding to HLA-DP2 depends on the preferences at p2. No effect of the flanking residues was found on the peptide binding predictions to DP2, although all four of them show strong preferences for particular amino acids. The predictive ability of the DS-QMs was tested using a set of 457 known binders to HLA-DP2, originating from 24 proteins. The sensitivities of the predictions at five different thresholds (5%, 10%, 15%, 20% and 25%) were calculated and compared to the predictions made by the NetMHCII and IEDB servers. Analysis of the DS-QMs indicated an improvement in performance. Additionally, DS-QMs identified the binding cores of several known DP2 binders.
The molecular docking protocol, as applied to a combinatorial library of peptides, models the peptide-HLA-DP2 protein interaction effectively, generating reliable predictions in a quantitative assessment. The method is structure-based and does not require extensive experimental sequence-based data. Thus, it is universal and can be applied to model any peptide - protein interaction.
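Applying a quantitative matrix such as the DS-QMs to a peptide, and using it to locate the binding core, can be sketched as follows. This is a simplified illustration under stated assumptions: only the nine core positions are scored (the four flanking positions are omitted), higher scores are taken to mean better binding, and the matrix `TOY_QM` contains invented values that merely echo the reported importance of p1, p2 and p6. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

CORE_LENGTH = 9

# Toy quantitative matrix: one dict per core position (p1..p9) mapping an
# amino acid to its contribution. All values are invented for illustration;
# residues not listed contribute 0.0.
TOY_QM: List[Dict[str, float]] = [{} for _ in range(CORE_LENGTH)]
TOY_QM[0] = {"F": 1.0, "Y": 0.8}  # p1 anchor
TOY_QM[1] = {"A": 0.1}            # p2
TOY_QM[5] = {"L": 1.0, "V": 0.7}  # p6 anchor


def score_core(core: str, qm: List[Dict[str, float]]) -> float:
    """Sum the position-specific contributions of a 9-residue core."""
    return sum(qm[i].get(aa, 0.0) for i, aa in enumerate(core))


def best_binding_core(
    peptide: str, qm: List[Dict[str, float]] = TOY_QM
) -> Tuple[int, str, float]:
    """Slide a 9-residue window along the peptide and keep the best core.

    Returns (start offset, core sequence, score). Ties are resolved in
    favour of the earliest window.
    """
    if len(peptide) < CORE_LENGTH:
        raise ValueError("peptide is shorter than the binding core")
    candidates = [
        (score_core(peptide[s:s + CORE_LENGTH], qm), s)
        for s in range(len(peptide) - CORE_LENGTH + 1)
    ]
    best_score, best_start = max(candidates, key=lambda c: (c[0], -c[1]))
    return best_start, peptide[best_start:best_start + CORE_LENGTH], best_score


print(best_binding_core("GGFAAAALAAGG"))
```

Because class II peptides extend beyond the binding groove, the window search is what turns a fixed-length matrix into a predictor for peptides of varying length.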
The identification of peptides binding to major histocompatibility complexes (MHC) is a critical step in the understanding of T cell immune responses. The human MHC genomic region (HLA) is extremely polymorphic comprising several thousand alleles, many encoding a distinct molecule. The potentially unique specificities remain experimentally uncharacterized for the vast majority of HLA molecules. Likewise, for nonhuman species, only a minor fraction of the known MHC molecules have been characterized. Here, we describe a tool, MHCcluster, to functionally cluster MHC molecules based on their predicted binding specificity. The method has a flexible web interface that allows the user to include any MHC of interest in the analysis. The output consists of a static heat map and graphical tree-based visualizations of the functional relationship between MHC variants and a dynamic TreeViewer interface where both the functional relationship and the individual binding specificities of MHC molecules are visualized. We demonstrate that conventional sequence-based clustering will fail to identify the functional relationship between molecules when applied to the MHC system, and only through the use of the predicted binding specificity can a correct clustering be found. Clustering of prevalent HLA-A and HLA-B alleles using MHCcluster confirms the presence of 12 major specificity groups (supertypes), some, however, with highly divergent specificities. Importantly, some HLA molecules are shown not to fit any supertype classification. Also, we use MHCcluster to show that chimpanzee MHC class I molecules have a reduced functional diversity compared to that of HLA class I molecules. MHCcluster is available at www.cbs.dtu.dk/services/MHCcluster-2.0.
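One way to turn "predicted binding specificity" into a distance between two MHC molecules is to compare their predicted scores over a common peptide set. The abstract does not define the distance used by MHCcluster, so the choice below (one minus the Pearson correlation of the two score profiles) is an assumption made purely for illustration, as are the function names.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long score profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def functional_distance(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Distance in [0, 2]: 0 for identical rankings, 2 for opposite ones.

    scores_a and scores_b hold predicted binding scores of two MHC
    molecules for the same peptides in the same order.
    """
    return 1.0 - pearson(scores_a, scores_b)


# Hypothetical predicted scores of two molecules for the same four peptides.
print(functional_distance([0.9, 0.7, 0.2, 0.1], [0.8, 0.6, 0.3, 0.2]))
```

A matrix of such pairwise distances could then be passed to any standard hierarchical clustering routine to produce a tree or heat map.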
MHC; HLA; Binding motif; Functional clustering; MHC specificity; Supertypes
Motivation: Receptor–ligand interactions play an important role in controlling many biological systems. One prominent example is the binding of peptides to the major histocompatibility complex (MHC) molecules controlling the onset of cellular immune responses. Thousands of MHC allelic versions exist, making determination of the binding specificity for each variant experimentally infeasible. Here, we present a method that can extrapolate from variants with known binding specificity to those where no experimental data are available.
Results: For each position in the peptide ligand, we extracted the polymorphic pocket residues in MHC molecules that are in close proximity to the peptide residue. For MHC molecules with known specificities, we established a library of pocket-residues and corresponding binding specificities. The binding specificity for a novel MHC molecule is calculated as the average of the specificities of MHC molecules in this library weighted by the similarity of their pocket-residues to the query. This PickPocket method is demonstrated to accurately predict MHC-peptide binding for a broad range of MHC alleles, including human and non-human species. In contrast to neural network-based pan-specific methods, PickPocket was shown to be robust both when data is scarce and when the similarity to MHC molecules with characterized binding specificity is low. A consensus method combining the PickPocket and NetMHCpan methods was shown to achieve superior predictive performance. This study demonstrates how integration of diverse algorithmic approaches can lead to improved prediction. The method may also be used for making ligand-binding predictions for other types of receptors where many variants exist.
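The similarity-weighted average described above can be sketched for a single peptide position. The abstract does not specify the similarity measure, so the fraction of identical pocket residues is used here as a stand-in; the names `pocket_similarity` and `predict_specificity`, the pocket strings and the specificity values are all hypothetical.

```python
from typing import Dict, List, Tuple

Specificity = Dict[str, float]  # amino acid -> score at one peptide position


def pocket_similarity(a: str, b: str) -> float:
    """Fraction of identical residues between two aligned pocket strings.

    A stand-in measure chosen for this sketch; any similarity that is
    non-negative could be substituted.
    """
    if len(a) != len(b):
        raise ValueError("pocket strings must be aligned to equal length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def predict_specificity(
    query_pocket: str, library: List[Tuple[str, Specificity]]
) -> Specificity:
    """Average the library specificities, weighted by pocket similarity."""
    if not library:
        raise ValueError("the pocket library is empty")
    weights = [pocket_similarity(query_pocket, pocket) for pocket, _ in library]
    total = sum(weights)
    if total == 0.0:
        # No library pocket resembles the query: fall back to a plain mean.
        weights = [1.0] * len(library)
        total = float(len(library))
    amino_acids = set().union(*(spec.keys() for _, spec in library))
    return {
        aa: sum(w * spec.get(aa, 0.0) for w, (_, spec) in zip(weights, library)) / total
        for aa in amino_acids
    }


# Two characterized pockets with opposite preferences (invented values).
library = [
    ("YFA", {"L": 1.0, "K": 0.0}),
    ("YFD", {"L": 0.0, "K": 1.0}),
]
print(predict_specificity("YFA", library))
```

Repeating this for every peptide position yields a full position-specific matrix for the query molecule, which can then score peptides in the usual way.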
Supplementary information: Supplementary data are available at Bioinformatics online.
Drug-induced liver injury (DILI) is one of the most common adverse reactions leading to product withdrawal post-marketing. Recently, genome-wide association studies have identified a number of human leukocyte antigen (HLA) alleles associated with DILI; however, the cellular and chemical mechanisms are not fully understood.
To study these mechanisms, we established an HLA-typed cell archive from 400 healthy volunteers. In addition, we utilized HLA genotype data from more than four million individuals from publicly accessible repositories such as the Allele Frequency Net Database, Major Histocompatibility Complex Database and Immune Epitope Database to study the HLA alleles associated with DILI. We utilized novel in silico strategies to examine HLA haplotype relationships among the alleles associated with DILI by using bioinformatics tools such as NetMHCpan, PyPop, GraphViz, PHYLIP and TreeView.
We demonstrated that many of the alleles that have been associated with liver injury induced by structurally diverse drugs (flucloxacillin, co-amoxiclav, ximelagatran, lapatinib, lumiracoxib) reside on common HLA haplotypes, which were present in populations of diverse ethnicity.
Our bioinformatic analysis indicates that there may be a connection between the different HLA alleles associated with DILI caused by therapeutically and structurally different drugs, possibly through peptide binding of one of the HLA alleles that defines the causal haplotype. Further functional work, together with next-generation sequencing techniques, will be needed to define the causal alleles associated with DILI.
NetMHC-3.0 is trained on a large number of quantitative peptide data using both affinity data from the Immune Epitope Database and Analysis Resource (IEDB) and elution data from SYFPEITHI. The method generates high-accuracy predictions of major histocompatibility complex (MHC):peptide binding. The predictions are based on artificial neural networks trained on data from 55 MHC alleles (43 human and 12 non-human), and position-specific scoring matrices (PSSMs) for an additional 67 HLA alleles. As only the MHC class I prediction server is available, predictions are possible for peptides of length 8–11 for all 122 alleles. Artificial neural network predictions are given as actual IC50 values, whereas PSSM predictions are given as log-odds likelihood scores. The output is optionally available as download for easy post-processing. The training method underlying the server is the best available, and has been used to predict possible MHC-binding peptides in a series of pathogen viral proteomes including SARS, Influenza and HIV, resulting in an average of 75–80% confirmed MHC binders. Here, the performance is further validated and benchmarked using a large set of newly published affinity data, non-redundant to the training set. The server is free to use and available at: http://www.cbs.dtu.dk/services/NetMHC.
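To make the notion of a PSSM log-odds score concrete, the following sketch builds a matrix from a set of equal-length binders and scores a peptide with it. It is a generic textbook construction, not the procedure used by the server: the uniform background, the pseudocount of 1, the base-2 logarithm, and the function names are all assumptions of this example.

```python
import math
from collections import Counter
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Uniform background frequencies (an assumption made for simplicity).
UNIFORM_BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}


def build_log_odds_pssm(
    binders: List[str],
    background: Dict[str, float] = UNIFORM_BACKGROUND,
    pseudocount: float = 1.0,
) -> List[Dict[str, float]]:
    """Build a log-odds matrix: log2(observed frequency / background)."""
    if not binders or len({len(p) for p in binders}) != 1:
        raise ValueError("binders must be a non-empty list of equal-length peptides")
    pssm = []
    for pos in range(len(binders[0])):
        counts = Counter(p[pos] for p in binders)
        # The pseudocount keeps unseen residues from producing log(0).
        total = len(binders) + pseudocount * len(background)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background[aa])
            for aa in background
        })
    return pssm


def score_peptide(peptide: str, pssm: List[Dict[str, float]]) -> float:
    """Sum the log-odds contributions of each residue in the peptide."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length does not match the matrix")
    return sum(column[aa] for aa, column in zip(peptide, pssm))


# Three invented 3-residue "binders" that all start with leucine.
pssm = build_log_odds_pssm(["LAA", "LCA", "LDA"])
print(score_peptide("LAA", pssm), score_peptide("KAA", pssm))
```

Positive entries mark residues enriched among binders relative to the background, negative entries mark depleted ones, so a peptide's total score is higher the more it resembles the known binders.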
In vertebrates the major histocompatibility complex (MHC) presents peptides to the immune system. In humans MHCs are called human leukocyte antigens (HLAs), and some of the loci encoding them are the most polymorphic in the human genome. Different MHC molecules present different subsets of peptides, and knowledge of their binding specificities is important for understanding the differences in the immune response between individuals. Knowledge of motifs may be used to identify epitopes, understand the MHC restriction of epitopes and to compare the specificities of different MHC molecules. Several groups have developed prediction methods designed to provide broad allelic coverage of the MHC polymorphism [9-11]. In contrast to conventional allele-specific methods, these methods take both the peptide and the peptide:MHC interaction environment into account, thus allowing extrapolation to accurately predict the binding specificity of uncharacterized MHC molecules. The utility of these algorithms that predict which peptides MHC molecules bind is hampered by the lack of tools for browsing and comparing the specificity of these molecules. We have therefore developed a web-server, MHC motif viewer, that allows the display of the likely binding motif for all human class I proteins of the loci HLA-A, B, C, and E and for MHC class I molecules from chimpanzee (Pan troglodytes), rhesus monkey (Macaca mulatta) and mouse (Mus musculus). Furthermore, it covers all HLA-DR protein sequences. A special viewing feature “MHC fight” allows for display of the specificity of two different MHC molecules side by side. We show how the web-server can be used to discover and display surprising similarities as well as differences between MHC molecules within and between different species. The MHC motif viewer is available at http://www.cbs.dtu.dk/researchgroups/immunology/HLA/Home.html
MHC; HLA; Motifs; Comparison; Viewer; Class I; Class II
MULTIPRED2 is a computational system for facile prediction of peptide binding to multiple alleles belonging to human leukocyte antigen (HLA) class I and class II DR molecules. It enables prediction of peptide binding to products of individual HLA alleles, combination of alleles, or HLA supertypes. NetMHCpan and NetMHCIIpan are used as prediction engines. The 13 HLA Class I supertypes are A1, A2, A3, A24, B7, B8, B27, B44, B58, B62, C1, and C4. The 13 HLA Class II DR supertypes are DR1, DR3, DR4, DR6, DR7, DR8, DR9, DR11, DR12, DR13, DR14, DR15, and DR16. In total, MULTIPRED2 enables prediction of peptide binding to 1077 variants representing 26 HLA supertypes. MULTIPRED2 has visualization modules for mapping promiscuous T-cell epitopes as well as those regions of high target concentration – referred to as T-cell epitope hotspots. Novel graphic representations are employed to display the predicted binding peptides and immunological hotspots in an intuitive manner and also to provide a global view of results as heat maps. Another function of MULTIPRED2, which has direct relevance to vaccine design, is the calculation of population coverage. Currently it calculates population coverage in five major groups in North America. MULTIPRED2 is an important tool to complement wet-lab experimental methods for identification of T-cell epitopes. It is available at http://cvc.dfci.harvard.edu/multipred2/.
T-cell epitope hotspots; HLA; HLA supertype; Human Leukocyte Antigen; promiscuous binding peptide; vaccine design
The immune system must detect a wide variety of microbial pathogens, such as viruses, bacteria, fungi and parasitic worms, to protect the host against disease. The display of antigenic peptides by MHC II (class II Major Histocompatibility Complex) molecules is a pivotal process in activating CD4+ TH cells (Helper T cells). The activated TH cells can differentiate into effector cells which assist various cells in responding to pathogen invasion. Each MHC locus encodes a great number of allele variants. Yet this limited number of MHC molecules is required to display an enormous number of antigenic peptides. Since the peptide binding measurements of MHC molecules by biochemical experiments are expensive, only a few of the MHC molecules have sufficient measured peptides. To perform accurate binding prediction for those MHC alleles without sufficient measured peptides, a number of computational algorithms have been proposed in recent decades.
Here, we propose a new MHC II binding prediction approach, OWA-PSSM, which is a significantly extended version of a well-known method called TEPITOPE. The TEPITOPE method is able to perform prediction for only 50 MHC alleles, while OWA-PSSM is able to perform prediction for many more, up to 879 HLA-DR molecules. We evaluate the method on five benchmark datasets. The method is demonstrated to be the best one in identifying binding cores compared with several other popular state-of-the-art approaches. Meanwhile, the method performs comparably to the TEPITOPE and NetMHCIIpan2.0 approaches in identifying HLA-DR epitopes and ligands, and it performs significantly better than TEPITOPEpan in the identification of HLA-DR ligands and MultiRTA in identifying HLA-DR T cell epitopes.
The proposed approach OWA-PSSM is fast and robust in identifying ligands, epitopes and binding cores for up to 879 MHC II molecules.
Initiation and regulation of immune responses in humans involves recognition of peptides presented by human leukocyte antigen class II (HLA-II) molecules. These peptides (HLA-II T-cell epitopes) are increasingly important as research targets for the development of vaccines and immunotherapies. HLA-II peptide binding studies involve multiple overlapping peptides spanning individual antigens, as well as complete viral proteomes. Antigen variation in pathogens and tumor antigens, and extensive polymorphism of HLA molecules increase the number of targets for screening studies. Experimental screening methods are expensive and time consuming and reagents are not readily available for many of the HLA class II molecules. Computational prediction methods complement experimental studies, minimize the number of validation experiments, and significantly speed up the epitope mapping process. We collected test data from four independent studies that involved 721 peptide binding assays. Full overlapping studies of four antigens identified binding affinity of 103 peptides to seven common HLA-DR molecules (DRB1*0101, 0301, 0401, 0701, 1101, 1301, and 1501). We used these data to analyze performance of 21 HLA-II binding prediction servers accessible through the WWW.
Because not all servers have predictors for all tested HLA-II molecules, we assessed a total of 113 predictors. The length of test peptides ranged from 15 to 19 amino acids. We tried three prediction strategies – the best 9-mer within the longer peptide, the average of best three 9-mer predictions, and the average of all 9-mer predictions within the longer peptide. The best strategy was the identification of a single best 9-mer within the longer peptide. Overall, measured by the receiver operating characteristic method (AROC), 17 predictors showed good (AROC > 0.8), 41 showed marginal (AROC > 0.7), and 55 showed poor performance (AROC < 0.7). Good performance predictors included HLA-DRB1*0101 (seven), 1101 (six), 0401 (three), and 0701 (one). The best individual predictor was NETMHCIIPAN, closely followed by PROPRED, IEDB (Consensus), and MULTIPRED (SVM). None of the individual predictors was shown to be suitable for prediction of promiscuous peptides. Current predictive capabilities allow prediction of only 50% of actual T-cell epitopes using practical thresholds.
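The three strategies for reducing 9-mer predictions to one value per longer peptide can be sketched as follows. This is an illustrative outline, not the code used in the study: the function names are hypothetical, `score_fn` stands in for any 9-mer predictor, and the sketch assumes that higher scores mean stronger predicted binding.

```python
from statistics import mean
from typing import Callable, Dict, List


def nine_mer_windows(peptide: str, width: int = 9) -> List[str]:
    """Return every overlapping window of the given width."""
    return [peptide[i:i + width] for i in range(len(peptide) - width + 1)]


def aggregate_nine_mer_scores(
    peptide: str,
    score_fn: Callable[[str], float],
    width: int = 9,
) -> Dict[str, float]:
    """Summarize per-window scores with the three strategies from the text.

    Assumption: score_fn returns a value where higher means stronger
    predicted binding (hypothetical stand-in for a real predictor).
    """
    scores = sorted((score_fn(w) for w in nine_mer_windows(peptide, width)), reverse=True)
    if not scores:
        raise ValueError("peptide is shorter than the window width")
    return {
        "best": scores[0],               # single best 9-mer
        "mean_top3": mean(scores[:3]),   # average of the best three 9-mers
        "mean_all": mean(scores),        # average of all 9-mers
    }
```

A 15-residue peptide yields seven windows and a 19-residue peptide yields eleven, so the three summaries can differ substantially for longer peptides. For predictors that report IC50 values, where lower means stronger binding, the scores would need to be negated or the sort order reversed before applying the same logic.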
The available HLA-II servers do not match prediction capabilities of HLA-I predictors. Currently available HLA-II prediction servers offer only a limited prediction accuracy and the development of improved predictors is needed for large-scale studies, such as proteome-wide epitope mapping. The requirements for accuracy of HLA-II binding predictions are stringent because of the substantial effect of false positives.
In this paper, we describe the methodologies behind three different aspects of the NetMHC family for prediction of MHC class I binding, mainly to HLAs. We have updated the prediction servers, NetMHC-3.2, NetMHCpan-2.2, and a new consensus method, NetMHCcons, which, in their previous versions, have been evaluated to be among the very best performing MHC:peptide binding predictors available. Here we describe the background for these methods, and the rationale behind the different optimisation steps implemented in the methods. We go through the practical use of the methods, which are publicly available in the form of relatively fast and simple web interfaces. Furthermore, we will review results obtained in actual epitope discovery projects where previous implementations of the described methods have been used in the initial selection of potential epitopes. Selected potential epitopes were all evaluated experimentally using ex vivo assays.
Predictive models of peptide-Major Histocompatibility Complex (MHC) binding affinity are important components of modern computational immunovaccinology. Here, we describe the development and deployment of a reliable peptide-binding prediction method for a previously poorly-characterized human MHC class I allele, HLA-Cw*0102.
Using an in-house, flow cytometry-based MHC stabilization assay we generated novel peptide binding data, from which we derived a precise two-dimensional quantitative structure-activity relationship (2D-QSAR) binding model. This allowed us to explore the peptide specificity of HLA-Cw*0102 molecule in detail. We used this model to design peptides optimized for HLA-Cw*0102-binding. Experimental analysis showed these peptides to have high binding affinities for the HLA-Cw*0102 molecule. As a functional validation of our approach, we also predicted HLA-Cw*0102-binding peptides within the HIV-1 genome, identifying a set of potent binding peptides. The most affine of these binding peptides was subsequently determined to be an epitope recognized in a subset of HLA-Cw*0102-positive individuals chronically infected with HIV-1.
A functionally-validated in silico-in vitro approach to the reliable and efficient prediction of peptide binding to a previously uncharacterized human MHC allele HLA-Cw*0102 was developed. This technique is generally applicable to all T cell epitope identification problems in immunology and vaccinology.
The binding of peptide fragments of antigens to class II MHC is a crucial step in initiating a helper T cell immune response. The identification of such peptide epitopes has potential applications in vaccine design and in better understanding autoimmune diseases and allergies. However, comprehensive experimental determination of peptide-MHC binding affinities is infeasible due to MHC diversity and the large number of possible peptide sequences. Computational methods trained on the limited experimental binding data can address this challenge. We present the MultiRTA method, an extension of our previous single-type RTA prediction method, which allows the prediction of peptide binding affinities for multiple MHC allotypes not used to train the model. Thus predictions can be made for many MHC allotypes for which experimental binding data is unavailable.
We fit MultiRTA models for both HLA-DR and HLA-DP using large experimental binding data sets. The performance in predicting binding affinities for novel MHC allotypes, not in the training set, was tested in two different ways. First, we performed leave-one-allele-out cross-validation, in which predictions are made for one allotype using a model fit to binding data for the remaining MHC allotypes. Comparison of the HLA-DR results with those of two other prediction methods applied to the same data sets showed that MultiRTA achieved performance comparable to NetMHCIIpan and better than the earlier TEPITOPE method. We also directly tested model transferability by making leave-one-allele-out predictions for additional experimentally characterized sets of overlapping peptide epitopes binding to multiple MHC allotypes. In addition, we determined the applicability of prediction methods like MultiRTA to other MHC allotypes by examining the degree of MHC variation accounted for in the training set. An examination of predictions for the promiscuous binding CLIP peptide revealed variations in binding affinity among alleles as well as potentially distinct binding registers for HLA-DR and HLA-DP. Finally, we analyzed the optimal MultiRTA parameters to discover the most important peptide residues for promiscuous and allele-specific binding to HLA-DR and HLA-DP allotypes.
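The leave-one-allele-out procedure described above can be illustrated with a small splitting routine. This is a generic sketch of the data partitioning only, not the MultiRTA code: the `Record` layout and the function name are assumptions, and model fitting and scoring are left to the caller.

```python
from typing import Iterator, List, Tuple

# Hypothetical record layout: (allele name, peptide sequence, measured affinity).
Record = Tuple[str, str, float]


def leave_one_allele_out(
    records: List[Record],
) -> Iterator[Tuple[str, List[Record], List[Record]]]:
    """Yield (held_out_allele, training_records, test_records) for each allele.

    All measurements for the held-out allele go to the test set, so the
    model never sees binding data for the allotype it is evaluated on.
    """
    alleles = sorted({allele for allele, _, _ in records})
    for held_out in alleles:
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        yield held_out, train, test
```

Each fold would then be used to fit a model on `train` and compare its predictions for `test` against the measured affinities. Splitting by allele rather than by individual peptide is what makes the result an estimate of performance on novel allotypes.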
The MultiRTA method yields competitive performance but with a significantly simpler and physically interpretable model compared with previous prediction methods. A MultiRTA prediction webserver is available at http://bordnerlab.org/MultiRTA.
The crystal structures of unliganded and liganded pMHC molecules provide a structural basis for TCR recognition yet they represent ‘snapshots’ and offer limited insight into dynamics that may be important for interaction and T cell activation. MHC molecules HLA-B*3501 and HLA-B*3508 both bind a 13-mer viral peptide (LPEP) yet only HLA-B*3508-LPEP induces a CTL response characterised by the dominant TCR clonetype SB27. HLA-B*3508-LPEP forms a tight and long-lived complex with SB27, but the relatively weak interaction between HLA-B*3501-LPEP and SB27 fails to trigger an immune response. HLA-B*3501 and HLA-B*3508 differ by only one amino acid (L/R156) located on the α2-helix, but this does not alter the MHC or peptide structure nor does this polymorphic residue interact with the peptide or SB27. In the absence of a structural rationalisation for the differences in TCR engagement we performed a molecular dynamics study of both pMHC complexes and HLA-B*3508-LPEP in complex with SB27. This reveals that the high flexibility of the peptide in HLA-B*3501 compared to HLA-B*3508, which was not apparent in the crystal structure alone, may have an under-appreciated role in SB27 recognition. The TCR pivots atop peptide residues 6–9 and makes transient MHC contacts that extend those observed in the crystal structure. Thus MD offers an insight into the ‘scanning’ mechanism of SB27 that extends the role of the germline encoded CDR2α and CDR2β loops. Our data are consistent with the vast body of experimental observations for the pMHC-LPEP-SB27 interaction and provide additional insights not accessible using crystallography.
When pathogens replicate within a host cell, their proteins are degraded into peptides, which are captured by the major histocompatibility complex (MHC) and brought to the cell surface. The peptide-MHC (pMHC) is surveyed by T cell receptors (TCRs) expressed on the surface of T cells. If the peptide is foreign, the peptide-MHC-TCR interaction initiates an immune response to eliminate the pathogen. However, the combinations of pMHC and TCRs are diverse. We ask how TCRs discriminate between structurally similar pMHCs. We address this by focusing on two MHC molecules that differ by a single change; both bind the same peptide but only one instigates a dominant immune response. Intriguingly, the single difference between the two MHCs does not alter the peptide shape nor does it contact the peptide or TCR. We examined the flexibility of the pMHC-TCR interface using molecular dynamics simulations. We observed differences in the peptide and TCR flexibilities that could explain their contrasting physiologies, as well as clues to how the TCR moves atop the MHC in order to ‘scan’ it. Our analysis provides insight into a particular pMHC-TCR interaction not accessible using crystallographic methods, and indicates that dynamics may play an influential and perhaps under-appreciated role in other pMHC-TCR systems.
Arenaviruses are the causative pathogens of severe hemorrhagic fever and aseptic meningitis in humans, for which no licensed vaccines are currently available. Pathogen heterogeneity within the Arenaviridae family poses a significant challenge for vaccine development. The main hypothesis we tested in the present study was whether it is possible to design a universal vaccine strategy capable of inducing simultaneous HLA-restricted CD8+ T cell responses against 7 pathogenic arenaviruses (including the lymphocytic choriomeningitis, Lassa, Guanarito, Junin, Machupo, Sabia, and Whitewater Arroyo viruses), either through the identification of widely conserved epitopes, or by the identification of a collection of epitopes derived from multiple arenavirus species. By inoculating HLA transgenic mice with a panel of recombinant vaccinia viruses (rVACVs) expressing the different arenavirus proteins, we identified 10 HLA-A02 and 10 HLA-A03-restricted epitopes that are naturally processed in human antigen-presenting cells. For some of these epitopes we were able to demonstrate cross-reactive CD8+ T cell responses, further increasing the coverage afforded by the epitope set against each different arenavirus species. Importantly, we showed that immunization of HLA transgenic mice with an epitope cocktail generated simultaneous CD8+ T cell responses against all 7 arenaviruses, and protected mice against challenge with rVACVs expressing either Old or New World arenavirus glycoproteins. In conclusion, the set of identified epitopes allows broad, non-ethnically biased coverage of all 7 viral species targeted by our studies.
Arenaviruses cause significant morbidity and mortality worldwide and are also regarded as a potential bioterrorist threat. CD8+ T cells restricted by class I MHC molecules clearly play a protective role in murine models of arenavirus infection, yet little is known about the epitopes recognized in the context of human class I MHC (HLA). Here, we defined 20 CD8+ T cell epitopes restricted by HLA class I molecules, derived from 7 different species of arenaviruses associated with human disease. To accomplish this task, we utilized epitope predictions, in vitro HLA binding assays, and HLA transgenic mice inoculated with recombinant vaccinia viruses (rVACV) expressing arenavirus antigens. Because our analysis targeted two of the most common HLA types worldwide, we project that the CD8+ T cell epitope set provides broad coverage against diverse ethnic groups within the human population. Furthermore, we show that immunization with a cocktail of these epitopes protects HLA transgenic mice from challenge with rVACV expressing antigens from different arenavirus species. Our findings suggest that a cell-mediated vaccine strategy might be able to protect against infection mediated by multiple arenavirus species.