In order to improve the processing efficiency of T cell tumor antigen epitopes, this bioinformatic study compared proteolytic sites in the generation of 47 experimentally identified HLA-A2.1-restricted immunodominant tumor antigen epitopes to those of 52 documented HLA-A2.1-restricted immunodominant viral antigen epitopes. Our results showed that the amino acid frequencies in the C-terminal cleavage sites of the tumor antigen epitopes, as well as in several positions within the 10 amino acid (aa) flanking regions, were significantly different from those of the viral antigen epitopes. In the 9 amino acid epitope region, frequencies differed somewhat in the secondary anchor residues on E3 (the third aa of the epitope), E4, E6, E7, and E8; however, frequencies in the primary anchor positions for binding in the HLA-A2.1 groove, on E2 and E9, remained nearly identical. The most frequently occurring amino acid pairs in both the N-terminal and C-terminal cleavage sites in the generation of tumor antigen epitopes were different from those of the viral antigen epitopes. Our findings demonstrate, for the first time, that these two groups of epitopes may be cleaved by distinct sets of proteasomes and peptidases, or by similar enzymes that act with lower efficiency on tumor epitopes. In the future, targeted activation of the immunoproteasomes and peptidases that mediate the cleavage of viral epitopes could allow tumor antigen epitopes to be generated more effectively, thus enhancing our potential for antigen-specific tumor immunotherapy.
Vaccines capable of eliciting T cell immune responses have been successfully developed for the prevention of 26 viral and bacterial infectious diseases(1). In contrast, despite significant progress(2), effective vaccines for most types of tumor are still lacking(3). Since most tumor antigens reported are nonmutated self-antigens(2), the peripheral T cell repertoire may be tolerized to self-antigens via thymic negative selection of autoreactive T cells while remaining reactive to viral (foreign) antigens. This model of self-tolerance via thymic selection is often considered a mechanism underlying the difference in efficiency between vaccines against viral infections and those against tumors(4). However, self-tolerance, based on the avidity of T cells for self-MHC (major histocompatibility complex)/self-peptide complexes in the thymic selection process, is far from absolute(4). T cells with low avidity for ubiquitously expressed self-antigens or for self-antigens expressed at low levels can escape clonal deletion in the thymus and enter the periphery(4). Thus, thymic tolerance is an important factor, but not the only factor, in determining T cell immune responses to tumors and viral infection. T cell responses are also regulated by antigen processing. Improving the processing of tumor antigens has been proposed as a very important direction in the development of novel vaccination strategies against tumors(5). Recent reports demonstrated that interferon (IFN)-γ, which is secreted in large amounts during viral infections(6), alters enzymatic processing and proteolytic specificities in the generation of T cell antigen epitopes via induction of immunoproteasomes(7) and novel aminopeptidases(8). In other words, these results suggested that the proteasomes and other enzymes that generate tumor antigen epitopes could differ from those that generate viral antigen epitopes, owing to differential expression of IFN-γ during viral infections and tumor growth(7). 
Notwithstanding, the comprehensive features of proteolytic cleavage sites involved in differential generation of tumor antigen epitopes and viral antigen epitopes remain unknown.
For generation of MHC class I-restricted antigen epitopes, several requirements have been identified, including the following: 1) cleavage sites and favorable flanking sequences around the cleavage sites can be effectively recognized by the ubiquitin-proteasome complex(9,10), although nonproteasomal mechanisms also seem to be involved in antigen processing(11,12); 2) antigen epitopes have high affinity for binding to the transporter associated with antigen processing (TAP)(13,14) and for being transported to the MHC class I complex(15); and 3) antigen epitopes have high affinity for binding and stabilizing the HLA (human leukocyte antigen, human MHC) class I complex on the cell surface(16). It is well accepted that a small subset of the peptides that are generated by proteasomes and processing peptidases, transported by TAP, and loaded on MHC class I(17) are potent in eliciting T cell immune responses and become immunodominant epitopes(18,19), which are desirable for the development of novel immunotherapy.
Recently developed serological analyses of tumor antigens by recombinant expression cDNA cloning (SEREX)(20,21) have led to the identification of a large number of tumor antigens(22), which hold great promise as targets for novel antigen-specific tumor immunotherapy(23,24). Previously, we identified broadly immunogenic SEREX tumor antigens(25), CML66L(26,27) and CML28(28), with which specific high-titer IgG antibody responses were associated in the remission of chronic myelogenous leukemia (CML)(25,26,28). Recently, our findings indicated that the overexpression of CML66L in tumor cells, mediated by alternative splicing, is the mechanism of the immunogenicity of this antigen, suggesting that overexpression of SEREX-identified tumor antigens by vaccination could generate anti-tumor immune responses(29). Immunization using dominant antigenic peptides has been most effective in patients with tumors (30) and has generated surprisingly high levels of circulating T cells directed against tumor antigens with a therapeutic outcome(31). Thus, immunodominant epitopes capable of eliciting remarkable CD8+ T cell responses would contribute decisively to the improvement of peptide-based immunization protocols for patients with tumors(32). However, due to our incomplete understanding of the mechanism underlying the generation of immunodominant (measurable T cell reactive) epitopes, as well as technical limitations, the identification of immunodominant T cell antigen epitopes from SEREX antigens has been accomplished at a slow pace; nonetheless, a few SEREX antigens (i.e., MAGE-1, tyrosinase, NY-ESO-1, coactosin-like protein and CML66) are reported to have the ability to elicit both cellular and humoral immune responses to tumor cells(29,33,34). 
To facilitate the identification of T cell reactive epitopes encoded by a large number of the SEREX antigens, two important questions must be addressed: 1) whether the structures around the cleavage sites generating the T cell reactive tumor antigen epitopes are different from those of identified immunodominant viral antigen epitopes; and 2) if bioinformatic features of the cleavage sites generating the dominant tumor antigen epitopes can be extracted using a statistical approach, whether the processing efficiency of immunodominant tumor epitopes can be improved thereby in the future.
In this study, we hypothesize that proteolytic cleavage sites generating the identified immunodominant tumor antigen epitopes may be statistically different from those generating documented immunodominant viral epitopes. To test this hypothesis, we focused on the statistical analysis of HLA-A2.1-restricted tumor antigen nonapeptide epitopes and viral antigen nonapeptide epitopes that were previously identified to be T cell reactive through the experimental approaches of others(18,19) (also, see the excellent web database: http://www.cancerimmunity.org/peptidedatabase/Tcellepitopes.htm). The purpose in establishment of the public database(s) of T cell antigen epitopes is for database mining to reveal novel information. The statistical approach that we applied in this study has the advantage of revealing important information on structural features through biochemical analysis of individual antigen epitopes, as we demonstrated previously(35). Of note, in contrast to the biochemical analyses with epitope peptides digestible by proteasomes or peptidases, we focused on analyzing the experimentally identified HLA-A2.1 restricted T cell reactive epitopes, which are desirable for future development of antigen specific immunotherapy(32). In contrast to the traditional research processes of designing experiments to test a hypothesis, we used the published experimental data to examine a hypothesis, which was successfully applied in our recent findings(35). Defining the features shared by experimentally identified tumor antigen epitope cleavage sites in a statistical approach would be a very important key to understanding the generation of tumor antigen epitopes versus that of the viral antigen epitopes. We found that experimentally identified, HLA-A2.1-restricted T cell reactive tumor antigen epitopes share structural features around the cleavage sites, but that these structural features were not identical to those used in the generation of viral antigen epitopes. 
Our new findings from this panoramic analysis have, in turn, justified the bioinformatic approach. With such knowledge, we could improve our ability to make the processing of tumor antigen epitopes more efficient, and thereby improve tumor immunotherapy.
The 47 HLA-A2.1-restricted tumor antigen epitopes previously identified by the experimental approaches of others(18,19) are listed in Table I (also, see the web database at: http://www.cancerimmunity.org/peptidedatabase/Tcellepitopes.htm). The experimentally confirmed HLA-A2.1-restricted viral antigen epitopes are as follows: 24 human immunodeficiency virus (HIV) antigen epitopes, encoded by five HIV proteins and collected from the HIV Molecular Immunology Database (http://hivweb.lanl.gov/content/immunology/maps/ctl/p17.html) as well as from published data(36); 10 hepatitis B virus (HBV) epitopes encoded by two HBV proteins(37); four hepatitis C virus (HCV) epitopes encoded by four proteins(38); and 14 influenza A virus epitopes encoded by six proteins(39). To achieve parity, only nonapeptide tumor antigen epitopes and nonapeptide viral antigen epitopes, and not epitopes of other lengths, were analyzed in this study.
In accordance with the enzymatic cleavage nomenclature of Schechter and Berger(40), our analyses (Fig. 1) included the ten amino acid residues flanking the N-terminal cleavage site of the nonapeptide epitopes and the ten residues flanking the C-terminal cleavage site of the antigen epitopes(41). The protein-protein BLAST search for short exact matches was performed on the NCBI website (http://www.ncbi.nlm.nih.gov/BLAST/) to retrieve both the N-terminal and C-terminal flanking regions of HLA-A2.1-restricted T cell antigen epitopes. During the final phase, epitopes are generated by two cleavages on both the N-terminus and C-terminus; thus, in addition to enzymatic cleavage nomenclature(40), the amino acid positions in the epitope (E1 to E9), the N-terminal (N10 to N1), and C-terminal (C1 to C10) flanking regions were further nominated (Fig. 1). Substrate specificities of the enzymes were retrieved from the Comprehensive Enzyme Information System, BRENDA (http://www.brenda.unikoeln.de/index.php4).
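The position naming described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the function name `label_positions` and the synthetic sequence in the usage note are hypothetical, and the sketch assumes that the parent protein sequence is already at hand (in the study the flanking regions were retrieved by BLAST search) and that N1 and C1 are the flanking residues immediately adjacent to E1 and E9, respectively.

```python
def label_positions(protein: str, epitope: str, flank: int = 10) -> dict[str, str]:
    """Map position labels (N10..N1, E1..E9, C1..C10) to residues.

    Hypothetical helper: locates the first occurrence of the epitope in
    the parent protein and labels the epitope and both flanking regions.
    """
    start = protein.find(epitope)
    if start < 0:
        raise ValueError("epitope not found in protein")
    end = start + len(epitope)
    if start < flank or len(protein) - end < flank:
        raise ValueError("not enough flanking residues on one side")

    labels: dict[str, str] = {}
    # N-terminal flank: N10 is farthest from the epitope, N1 is adjacent to E1.
    for i, aa in enumerate(protein[start - flank:start]):
        labels[f"N{flank - i}"] = aa
    # Epitope residues E1..E9 for a nonapeptide.
    for i, aa in enumerate(epitope, start=1):
        labels[f"E{i}"] = aa
    # C-terminal flank: C1 is adjacent to the last epitope residue.
    for i, aa in enumerate(protein[end:end + flank], start=1):
        labels[f"C{i}"] = aa
    return labels
```

For a synthetic sequence such as `"ACDEFGHIKL" + "MLNPQRSTV" + "WYACDEFGHI"` with the epitope `"MLNPQRSTV"`, the function returns 29 labelled positions, which matches the 29 positions analyzed per epitope.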
Four algorithms, MAPPP (http://www.mpiib-berlin.mpg.de/MAPPP/cleavage.html)(42), MHC-Pathway (http://www.mhc-pathway.net/)(43), MHC-Pathway immunoproteasome, and the NetChop3.0 neural network predictor (http://www.cbs.dtu.dk/services/NetChop/)(44), were used to predict the proteasome cleavage sites on antigens. In addition, a prediction algorithm (TAPPred) (http://www.imtech.res.in/raghava/tappred/) for binding to the transporter associated with antigen processing (TAP) was used to predict the TAP binding potential of antigen epitopes. Furthermore, binding of the antigen peptide epitopes to the HLA-A2.1 molecule was predicted by using two different web-based algorithms, one on the BIMAS/NIH (BIMAS) website (http://bimas.dcrt.nih.gov/molbio/hla_bind/) and another on the SYFPEITHI (SYF) website (http://syfpeithi.bmi-heidelberg.com/scripts/MHCServer.dll/home.htm).
The 400 possible ordered pairs of the 20 amino acids were numbered from 1 to 400. The amino acid pairs observed in the cleavage sites of the tumor antigens and of the viral antigens were treated as two pair sets of sizes k1 and k2, respectively, allowing for repetitions. We then calculated the probability of an overlap between the sets in which r1 pairs from the first set are the same as r2 pairs in the second set (see Fig. 2A). Such an overlap was denoted as an ordered pair (r1, r2). A simple counting method, as described(45), was used to compute the event probabilities (the probability mass function) for the different overlaps with N = 400, k1 = 47 and k2 = 52.
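The counting method of reference (45) is not reproduced here. As a rough illustration of what a probability mass function over overlaps (r1, r2) means, the following sketch estimates it by Monte Carlo simulation. It rests on an assumption that is ours, not the study's: each of the N = 400 pairs is drawn uniformly and independently, with repetition. The function name `simulate_overlap_pmf` and its parameters are hypothetical.

```python
import random
from collections import Counter


def simulate_overlap_pmf(n_pairs=400, k1=47, k2=52, trials=20000, seed=0):
    """Estimate P(r1, r2) for two random pair sets by simulation.

    r1 = number of entries in set 1 whose pair also occurs in set 2;
    r2 = number of entries in set 2 whose pair also occurs in set 1.
    Assumes uniform, independent draws with repetition (an assumption
    of this sketch, not the exact counting method of the study).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        set1 = [rng.randrange(n_pairs) for _ in range(k1)]
        set2 = [rng.randrange(n_pairs) for _ in range(k2)]
        distinct1, distinct2 = set(set1), set(set2)
        r1 = sum(1 for pair in set1 if pair in distinct2)
        r2 = sum(1 for pair in set2 if pair in distinct1)
        counts[(r1, r2)] += 1
    # Convert raw counts into estimated probabilities.
    return {overlap: c / trials for overlap, c in counts.items()}
```

Overlaps that never appear in the simulated dictionary, or that appear only with very small estimated probability, correspond to the low-probability region of such a distribution. An exact computation would be needed to resolve probabilities far below 1/trials.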
Using a one-sample test for binomial proportion(46), the frequencies of amino acid residues in the N-terminal and C-terminal positions of the HLA-A2.1-restricted tumor antigen epitope cleavage sites were calculated and compared to the general occurrence frequencies of each amino acid in any position of the proteins(47). Similarly, the frequencies of amino acid residues in the N-terminal and C-terminal positions of both cleavage sites of the HLA-A2.1-restricted viral antigen epitopes were calculated and compared to the general occurrence frequencies of each amino acid in any position of the proteins(47). The Wilcoxon rank-sum test was used to compare proteasome cleavage probabilities between the tumor epitopes and the viral epitopes, as predicted with the MAPPP algorithm (http://www.mpiib-berlin.mpg.de/MAPPP/cleavage.html).
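As an illustration of the one-sample comparison against background frequencies, the sketch below implements an exact two-sided binomial test in plain Python. Whether the study used the exact or the normal-approximation form of the test in reference (46) is not stated, so the exact form is an assumption here. The function names and the example counts in the usage note are hypothetical.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n trials with success rate p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_test_two_sided(x: int, n: int, p0: float) -> float:
    """Exact two-sided p-value for observing x of n under background rate p0.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one (a small tolerance guards against rounding).
    """
    observed = binomial_pmf(x, n, p0)
    total = 0.0
    for i in range(n + 1):
        prob = binomial_pmf(i, n, p0)
        if prob <= observed * (1 + 1e-7):
            total += prob
    return min(1.0, total)
```

For example, if a given residue occupied one position in 20 of 47 epitopes while its assumed background frequency were 0.10 (both values invented for illustration), `binomial_test_two_sided(20, 47, 0.10)` would return a p-value far below 0.05, and that position would be flagged as deviating from the background.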
In order to determine whether cleavage sites in the generation of tumor antigen epitopes were different from those of viral epitopes, we analyzed all of the 47 HLA-A2.1-restricted immunodominant tumor antigen epitopes experimentally identified so far(18,19) and identified immunodominant 52 HLA-A2.1-restricted viral antigen epitopes, along with nine amino acid epitope residues, and the ten amino acids in the N-terminal and C-terminal flanking regions. These 47 epitopes were derived from 24 tumor antigens (see Table I), which included representatives from the four tumor antigen groups characterized thus far: the group of differentiation antigens, including tyrosinase, gp100; the group of amplified/oncogenic antigens, including HER-2/neu, WT1; the group of mutational antigens, including p53; and the group of cancer-testis antigens, including MAGE and NY-ESO-1. This collection allowed us to analyze the common structural features of the cleavage sites and the epitopes shared by various tumor antigens. For the purpose of comparison, we also collected 52 HLA-A2.1-restricted viral antigen epitopes encoded by HIV, HBV, HCV, and influenza A virus, as the reference epitopes. These viral antigen epitopes were suitable for comparison, due to the inclusion of both DNA and RNA viruses, which were categorized into several virus families, including the retroviridae (HIV), the hepadnaviridae (HBV), the flaviviridae (HCV), and the orthomyxoviridae (influenza virus A)(48).
In our preliminary studies, the mean and variance of the processing probabilities of tumor epitopes and viral epitopes were calculated by using the MAPPP algorithm. Based on these results, the sample size needed to compare the means of the tumor epitope group and the viral epitope group was estimated according to previously published statistical methods(49). The results indicated that 33 or more epitopes must be included in each group in order to obtain 80% power (not shown); thus, the numbers of epitopes in the tumor epitope group (47 epitopes) and in the viral epitope group (52 epitopes) exceeded the calculated requirement(49). Of note, this estimate indicates that both groups were larger than needed to reach 80% power, so the comparisons made in this study were adequately powered(49).
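A sample-size estimate of this kind can be sketched with the standard normal-approximation formula for comparing two means, n = (σ1² + σ2²)(z(1−α/2) + z(power))² / Δ² per group. Whether reference (49) uses exactly this formula is an assumption, and the preliminary means and variances are not shown in the text, so the sketch below does not attempt to reproduce the figure of 33. The function name and the inputs in the usage note are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_means(mean1, mean2, sd1, sd2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two means.

    Normal-approximation formula with equal group sizes:
    n = (sd1^2 + sd2^2) * (z_{1-alpha/2} + z_{power})^2 / (mean1 - mean2)^2
    """
    delta = abs(mean1 - mean2)
    if delta == 0:
        raise ValueError("means must differ to estimate a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = (sd1**2 + sd2**2) * (z_alpha + z_power) ** 2 / delta**2
    return ceil(n)
```

With invented inputs of a unit difference in means and unit standard deviations, `sample_size_two_means(0.0, 1.0, 1.0, 1.0)` gives 16 per group. Smaller differences or larger variances raise the requirement.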
We analyzed the statistical differences in the frequency of each amino acid in 29 positions of the tumor antigen epitopes and the viral antigen epitopes, and the flanking regions in comparison to the general occurrence frequencies(47). Since both groups of antigen epitopes were compared against the same control amino acid occurrence frequencies(47), the results from these two groups were comparable. The amino acid frequencies that are statistically higher than the background are listed in Table II. Several findings were reported: 1) at 17 out of the 29 positions, amino acid distributions in the tumor antigen and viral antigen epitopes deviated significantly from the background (Table II) (p<0.05); 2) in the flanking region positions, on N9, N6, N5, and C8, only the tumor antigen—and not the viral antigen—epitopes deviated from the background; and (3) in the flanking region positions, on N10, C1, C3, C7, and C9, only the viral antigen—and not the tumor antigen—epitopes deviated from the background. Previous studies showed some amino acid preferences in the N-termini (Pn1′) and the C-termini (Pc1′) of the proteasome cleaved epitopes. These studies also showed that the C-terminal cleaved position (Pc1′) prefers K, R, A, and S, but does not favor F, D, and E(17). In contrast, our data did not find statistically significant differences in the amino acid occurrence frequencies in position Pn1′ of either the tumor antigen or viral antigen epitopes (p>0.05). In addition, our data on viral antigen epitopes showed that this C-terminal cleavage position (Pc1′) favored T. Our analyses indicated that both the N-terminal and the C-terminal cleavage sites of the tumor antigen epitopes were different from those of the viral antigen epitopes.
Cumulatively, these results suggest the following. First, the epitope flanking regions of tumor and viral epitopes have amino acid preferences that are statistically different from the general amino acid frequency background. The following reports support our design in using general amino acid frequency as a background: (1) In contrast to bacteria, human viruses do not have their own protein translation machinery, and need to use human cell protein translation machinery for synthesis of viral proteins(48); (2) Since viruses can be efficiently replicated in human cells, human cell protein translation system must be capable of efficient translation of viral proteins(48); (3) Codon usage and amino acid frequencies of human proteins and viral proteins synthesized in human cells are generally determined by the expression levels of tRNAs with appropriate anticodons in human cells(50,51). Second, differences in amino acid preferences on positions in the flanking regions can be observed between the viral and tumor epitopes. Although the epitope positions were considered in the flanking regions for both the N-terminal and C-terminal enzyme cleavage sites, it is possible that amino acid restriction in the epitope positions for HLA(52-54) and TAP binding(13,14) may override enzymatic cleavage influences at both ends. Third, and most important, there are significant differences between these two groups of epitopes in positions N1 (Pn1), E1 (Pn1′), and C1 (Pc1′), suggesting that there are differences in the proteolytic enzymes involved in the generation of these two groups of epitopes.
As shown in Table II, in positions E2, E3, and E9, amino acid preference was conserved in the two groups of epitopes. The results of this conservation in the primary HLA-A anchor residues, on E2 and E9, and the secondary anchor residue, on E3, corresponded to prior reports emphasizing the dominant structural requirement for HLA-A2.1 binding, which corresponded to the previous findings (e.g., Leu or Met at position E2, and Val, Leu, or Ile at position E9 of the epitope regions)(52-54). It should be noted that an auxiliary anchor at E3 usually fine-tunes peptide recognition(55,56). In our study, the high restriction in both tumor antigen and viral antigen epitopes served as an appropriate positive control for the quality of our analyses. The significant differences between the two groups of epitopes in positions E4, E6, E7, and E8 suggest that HLA-A2.1 binding(52-54) and TAP binding(13,14) do not have high restriction in these positions. The results also suggest that the differences in these positions between the two groups of epitopes may reflect variances in enzyme recognition in the flanking regions, since the epitope region serves as the C-terminal flanking region for N-terminal cleavage, as does the N-terminal flanking region for C-terminal cleavage. Future work will also need to examine whether the structural features in the auxiliary anchor positions contribute to lower binding avidity between interaction of MHC/self-tumor antigen peptides and T cell antigen receptor (TCR) and higher binding avidity between MHC/viral peptides and TCR(4).
We observed that there are no sequence structures for proteasome cleavage sites inside the epitopes, as defined by two hydrophobic residues at the E2 and E9 positions. This finding suggests that the epitope candidates having both the right HLA-A2.1 anchor residues on E2 and E9 and the internal cleavage sites should have been degraded, and there is no opportunity for these epitopes to be presented. Similarly, there are no sequence structures for HLA-A2.1 anchor residues and proteasome cleavage sites within the 10 amino acid residues in the epitope N-terminal and C-terminal flanking regions, suggesting that a special feature of the flanking regions is to enable the efficient processing of epitopes.
Protein sequences encode more structural and functional information than amino acid occurrence frequencies. In order to further explore this difference, we compared occurrence frequencies of the amino acid pairs(17) in the Pn1 (N1)-Pn1′ (E1) and the Pc1 (E9)-Pc1′ (C1) of the tumor and viral epitopes. These positions were selected to compare amino acid pairs because they are primary structural features for enzyme recognition and cleavage(40). We argued that if the proteasomes and peptidases that process the immunodominant tumor epitopes are the same as or similar to that processing the immunodominant viral epitopes, the amino acid pairs that are identical in these two groups of epitopes would be in high percentages in these positions. As shown in Figs. 2B and 2C, the occurrence of amino acid pairs in N-terminal cleavage sites of the tumor epitopes was radically different from that of the viral antigen epitopes. In the N-terminal cleavage site (the Pn1-Pn1′ pair, depicted in Fig. 2B), out of 400 possible pairs, we found that 81% of the total pairs in the tumor epitopes and 79% of the total pairs in the viral epitopes were different. In Fig. 2C, the most frequently occurring 13 pairs consisted of 34.0% of tumor epitopes. The most frequently occurring 12 pairs in the viral epitopes covered 32.7% of the epitope group. In addition, the frequency of pairs with the basic amino acid at the Pn1 position of the pairs was significantly increased, from 10.6% of the most frequently occurring tumor epitopes, to 17.3% of the most frequently occurring viral epitopes, suggesting that increased trypsin-like activity mediates the processing of viral epitopes(10). 
Moreover, the frequency of pairs with hydrophobic amino acid at the Pn1 position of the pairs was significantly decreased, from 17.0% of the most frequently occurring tumor epitopes, to 7.7% of the most frequently occurring viral epitopes, suggesting that an increase in chymotrypsin-like activities is responsible for tumor epitope processing(10). Finally, the frequency of pairs with the basic amino acid at the C-position of the pairs was significantly decreased, from 21.3% of the most frequently occurring tumor epitopes, to 3.9% of the most frequently occurring viral epitopes. Again, these results suggest that the N-terminal cleavages of both the tumor and viral epitopes are mediated by two different groups of enzymes.
Similarly, the occurrence of amino acid pairs in the C-terminal cleavage sites of the tumor antigen epitopes (the Pc1-Pc1’ pair) was different from that of the viral antigen epitopes. In Fig. 2B, 53% of the total pairs in the tumor epitopes did not share with 50% of the total pairs in the viral epitopes. The most frequently occurring 12 pairs covered 51.1% of tumor epitopes, and the most frequently occurring 14 pairs covered 51.9% of viral epitopes. In Fig. 2C, among those most frequently occurring pairs, three pairs were conserved between the tumor and viral epitope groups, comprising only 11.5% of the C-terminal cleavage sites. Moreover, differences in the physical features of amino acid pairs in the tumor epitopes versus the viral epitopes were less obvious in the C-terminal cleavage sites, in comparison with the N-terminal cleavage sites. Hydrophobic residues were present in most Pc1 positions of the most frequently occurring pairs, suggesting that chymotrypsin-like activity may be dominant in the processing of C-terminal cleavages of tumor and viral epitopes(10) in addition to the HLA-A2.1 binding preference at the position Pc1/E9 (52-54) and the TAP binding preference at this position(13,14). These results suggest that pairs in the C-terminal sites are less diversified than those in the N-terminal sites. Our findings show that the proteolytic enzymes generating C-terminal cleavage sites of tumor and viral epitopes are also different.
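The pair comparisons above reduce to two simple counts: the fraction of cleavage-site pairs in one group that never occur in the other group, and the fraction of pairs carrying a given residue class at the first position. The sketch below shows both. The residue classes (K, R, H as basic; A, V, L, I, M, F, W, Y as hydrophobic) and the example pairs are assumptions made for illustration; the classification actually used in the study is not stated.

```python
# Residue classes assumed for this sketch only.
BASIC = frozenset("KRH")
HYDROPHOBIC = frozenset("AVLIMFWY")


def unshared_fraction(pairs_a, pairs_b):
    """Fraction of entries in pairs_a whose pair never occurs in pairs_b."""
    other = set(pairs_b)
    return sum(1 for pair in pairs_a if pair not in other) / len(pairs_a)


def fraction_with_class_at_first(pairs, residue_class):
    """Fraction of pairs whose first residue (e.g. Pn1) is in residue_class."""
    return sum(1 for pair in pairs if pair[0] in residue_class) / len(pairs)


# Hypothetical two-residue cleavage-site pairs, written as Pn1 + Pn1'.
tumor_pairs = ["KL", "AL", "GV", "KL"]
viral_pairs = ["KL", "SV", "TV"]

tumor_unshared = unshared_fraction(tumor_pairs, viral_pairs)
tumor_basic = fraction_with_class_at_first(tumor_pairs, BASIC)
```

Each epitope contributes one pair per cleavage site, so repeated pairs are counted once per epitope, consistent with percentages expressed over the epitope group.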
Furthermore, in order to determine whether the percentages of amino acid pairs that overlapped at the N-terminal site and at the C-terminal site between the set of 47 tumor antigen epitopes and the set of 52 viral epitopes were statistically significant, we computed the probability of the pair overlaps in two steps. First, the probability mass function (pmf) for all the random overlaps of amino acid pairs in the two sets of epitopes was analyzed with a surface plot (Fig. 2D), which gives a visual demonstration of the probability distribution of random overlaps of amino acid pairs in the two sets of epitopes. The results in Fig. 2D showed that the probability for 95% of all the random overlaps of amino acid pairs (the “mountain area” in the surface plot) in the two sets of epitopes was ≥ 0.004, and that overlaps of amino acid pairs in the two sets of epitopes with a probability < 0.004 were not random. The results suggested that higher percentages of overlapping amino acid pairs in the two sets, with a probability < 0.004, reflected, to some extent, specificities shared between the enzymes cleaving tumor epitopes and the enzymes cleaving viral epitopes. Second, the same analysis of the probability mass function was performed using a contour plot (Fig. 2E). The results showed that overlaps of amino acid pairs in the two sets of epitopes that lie along the same probability contour line have the same probability. Of all the random overlaps of amino acid pairs between tumor epitopes and viral epitopes, 95% fell within (≥) the 0.004 probability contour loop, which was designated as the 95% confidence interval (CI) loop. Once again, overlaps of amino acid pairs in the two sets of epitopes with a probability < 0.004, outside of the 95% CI loop, were not random. Interestingly, the probability for the overlapped amino acid pairs in the N-terminal cleavage sites (Fig. 2B) was 3.3 × 10⁻³, and the probability for the overlapped amino acid pairs in the C-terminal sites (Fig. 2B) was 3.4 × 10⁻¹². Therefore, both probabilities were smaller than 0.004. In biochemical terms, if no enzymatic specificity is shared in the cleavage of the N-terminal sites or C-terminal sites for the two sets (that is, under totally random conditions), the probability of the observed overlap of amino acid pairs in these cleavage sites of the two sets of epitopes should be ≥ 0.004 (Fig. 2E). As shown in Fig. 2E, since the overlaps of amino acid pairs at the N-terminal site and at the C-terminal site were not random, shared enzymatic specificities were required to account for the approximately 20% to 50% amino acid pair overlaps (Fig. 2B) in the two sets of epitopes. The enzyme(s) cleaving the amino acid pairs in the C-terminal site (50% amino acid pair overlaps) shared more specificities than those cleaving in the N-terminal site (about 20% amino acid pair overlaps) (Fig. 2B). Since the probability for overlaps of amino acid pairs in the N-terminal sites in the two sets was near the 95% CI loop for random overlaps, the shared specificities for the N-terminal sites in the two sets were minimal. If the percentages of overlapping amino acid pairs in the N-terminal sites and C-terminal sites were 100% or close to 100%, the proteolytic enzymes mediating cleavage in these two sets of epitopes should be the same. Since the percentages of overlapping amino acid pairs in the N-terminal sites and C-terminal sites were far less than 100%, the cleavages in both sites are mediated by two different groups of enzymes.
We further examined whether there are any differences between the tumor antigen and viral antigen epitopes by employing the commonly adopted algorithms for the prediction of processing probability by proteasome and immunoproteasome. Our reason for choosing the following four algorithms, the MAPPP (http://www.mpiib-berlin.mpg.de/MAPPP/cleavage.html)(42), the MHC-Pathway constitutive proteasome (http://www.mhc-pathway.net/), the MHC-Pathway immunoproteasome, and the NetChop3.0 neural network predictor (http://www.cbs.dtu.dk/services/NetChop/) (44) rather than other algorithms, such as PAProC (www.paproc.de)(57), is that the former four algorithms allow quantitative prediction. Of note, experimental methods chosen for experiments are not due to their perfection. Likewise, the chosen algorithms may not have 100% prediction efficiency. For any given tumor antigens, these four algorithms may predict some common cleavage sites as well as the different sites. However, using the same sets of algorithms to analyze tumor epitopes and viral epitopes, the results from both sets of epitopes are statistically comparable; the common sites predicted by all four algorithms may reflect the common features of proteasome cleavage revealed from different angles. As shown in Figs. 3A and B, predicted by the algorithm MAPPP, for example, the mean ± 1.96 standard error (SE) (95% confidence interval, CI) of the proteasome cleavage scores, for tumor antigen epitopes were between 0.62 and 0.75. In contrast, the 95% CI of the scores for viral antigen epitopes predicted with the same algorithm was between 0.53 and 0.61. To further consolidate this finding, we applied the similar analyses with three other algorithms for proteasome peptide cleavage, including the NetChop 3.0 and MHC-Pathway constitutive proteasome and MHC-Pathway immunoproteasome. As shown in Figs. 3A and B, the results obtained were similar to that achieved with the algorithm MAPPP. 
Of note, the scores of both sets of antigen epitopes predicted with the MHC-Pathway immunoproteasome algorithm were higher than those predicted with the MHC-Pathway constitutive proteasome algorithm. With all four algorithms, the 95% CIs of the predictive scores for these two groups of antigen epitopes did not overlap, and the Wilcoxon rank-sum test (two-sided p<0.05) suggested that tumor antigen epitopes are different from viral epitopes regarding their probability of being processed by proteasomes and immunoproteasomes. The results suggested that potential tumor antigen epitopes could not be processed efficiently. The lower predicted processing probability of the viral epitopes suggests that the “threshold” for processing tumor antigen epitopes by proteasomes and immunoproteasomes is higher than that for viral antigen epitopes.
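The interval comparison used above (mean ± 1.96 standard errors for each group, followed by a check for overlap) can be sketched as follows. The function names are hypothetical, and per-epitope scores are not listed in the text, so only the reported MAPPP interval bounds are reused in the usage note.

```python
from math import sqrt
from statistics import mean, stdev


def ci95(scores):
    """95% confidence interval of the mean: mean +/- 1.96 * standard error."""
    m = mean(scores)
    se = stdev(scores) / sqrt(len(scores))  # sample SD divided by sqrt(n)
    return (m - 1.96 * se, m + 1.96 * se)


def intervals_overlap(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]
```

Applied to the reported MAPPP intervals, `intervals_overlap((0.62, 0.75), (0.53, 0.61))` is `False`, which is the non-overlap described above. Non-overlapping 95% CIs are a conservative indicator of a difference, so the rank-sum test remains the formal comparison.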
In Fig. 3B, the Wilcoxon rank-sum test comparison of the TAP-binding potential (predicted by the algorithm TAPPred)(15) of tumor antigen epitopes and viral antigen epitopes showed that the viral epitopes had a slightly higher potential than the tumor epitopes to bind to TAP for transfer into the ER, although the difference was not statistically significant (p>0.05). Furthermore, a comparison of the HLA-A2.1-binding potential of the tumor antigen and viral epitopes, predicted by the algorithms SYF [SYFPEITHI](58) and BIMAS [BIMAS/NIH](54), demonstrated a similarity between the two groups of epitopes, suggesting that HLA-A2.1 restriction had overcome the differences in amino acid occurrences and physical characteristics. These results are consistent with the fact that there is only one type of HLA-A2.1 and limited human TAP polymorphism(13,59) for binding of all the epitopes, regardless of whether they derive from tumor antigens or viral antigens. Based on analyses of the 47 experimentally identified tumor antigen epitopes and 52 viral antigen epitopes, the generation of a 95% CI for the predictive scores of the HLA binding algorithms and the TAP algorithm demonstrated, for the first time, predictive score ranges with statistical confidence. Recent studies reported that predictions made with these algorithms could not always be verified with experimental data(16,60,61). However, our analytic methods and results on the 95% CIs for the experimentally identified epitopes have proven useful in the selection of predicted epitopes for further experimental verification.
In this study, our results have demonstrated, for the first time, that: 1) the major difference between immunodominant tumor antigen epitopes and viral antigen epitopes lies in the structural features of proteolytic sites, but not in those for TAP binding and HLA binding; and 2) the proteasomes and related peptidases, or at least the cleaving efficiencies of those enzymes, in the generation of HLA-A2.1-restricted tumor antigen epitopes may be different from those in the generation of viral antigen epitopes. Our future work may expand this study to compare these two groups of epitopes presented by other MHC alleles when more documented epitopes are reported. Previous studies showed that viral antigen epitopes are preferentially processed by immunoproteasomes, while most tumor antigen epitopes are processed by constitutive proteasomes(62). Since our results emphasize the critical role of proteasomes and related peptidases in the regulation of tumor antigen epitope processing and anti-tumor immune responses, the mechanisms by which the novel therapy of proteasome inhibition(63) modulates tumor antigen epitope processing and anti-tumor immune responses remain an interesting topic in this field of research.
Our results diverged from the previously published analysis of 274 epitope flanking regions(17), which did not identify differences in amino acid occurrence in epitope flanking regions. The discrepancy is likely due to the fact that the previous study collected only naturally processed peptides. The non-immunodominant epitopes that can be eluted from HLA may not participate in anti-tumor immune responses but may be functional in maintaining the T cell repertoire(64,65). Therefore, pooling all of the diverse antigen peptides, those with potential immunodominance together with those without, might average out potential differences.
Of note, the observed differences could be caused by viral interference with antigen processing(66). However, recent reports on proteasomes and related proteases of antigen processing also support and explain our findings(62). In viral infections, expression of IFN-γ is induced by the cytokines IL-2, IL-18, and IFN-α/β, or by stimulation through the TCR or natural killer (NK) cell receptors(6). Consequently, IFN-γ alters proteasome activity quantitatively, by incorporating three immunosubunits, LMP2 (iβ1), LMP7 (iβ5), and MECL-1 (iβ2), to replace the constitutive β1 (δ), β5 (MB1), and β2 (Z) subunits in the 20S core proteasome. Thus, two types of proteasome exist: “constitutive proteasomes,” which are present in all somatic cells, and “immunoproteasomes,” which are expressed under the influence of cytokines such as IFN-γ(7). In addition, IFN-γ also upregulates the expression of two other proteins, PA28α and PA28β, which form the heptameric proteasome activator complex PA28(67). In contrast with virally infected cells, in a large number of tumors the expression of the IFN-γ-induced proteasome subunits LMP2 and LMP7 is downregulated(68), suggesting that the processing of tumor antigen epitopes may be different from that of viral antigen epitopes(7). In conjunction with these findings, our results, obtained by analyzing the substrates of proteolytic enzymes with a unique bioinformatic approach, indicate that half of the C-terminal cleavage sites in the tumor antigen epitopes are not shared by the viral epitopes. This suggests the possibility that more constitutive proteasomes than immunoproteasomes mediate cleavage of the C-terminus of tumor antigen epitopes, whereas more immunoproteasomes are involved in production of the C-terminus of viral antigen epitopes.
In addition to proteasomes, a parallel system potentially contributing to differences in the generation of tumor antigen and viral antigen epitopes is tripeptidyl peptidase II (TRPII; EC 3.4.14.10), which is able to generate the HLA-restricted HIV Nef epitope independently of proteasomes(69). This parallel system may also contribute to differences between the two groups in the C-terminal cleavage sites, although we were unable to quantify the relative contributions of processing by proteasomes/immunoproteasomes and other peptidases.
Our results indicate that the differences between the two groups of epitopes were even larger in the N-terminal cleavage sites than in the C-terminal cleavage sites. In addition, we found that the most frequently occurring amino acid pairs in both N-terminal (Pn1-Pn1′) and C-terminal (Pc1-Pc1′) cleavage sites in the generation of tumor antigen epitopes are different from those of the viral antigen epitopes. It has been reported that many antigenic peptides are generated as amino-terminal extended precursor peptides, and that these require amino-terminal trimming by aminopeptidases located either in the cytosol or in the endoplasmic reticulum (ER)(8,70-72). It should be noted that the expression of both cytosolic leucine aminopeptidase (EC 3.4.11.1)(8) and ER aminopeptidase I (ERAP1)(73) is IFN-γ induced, suggesting that these enzymes, like immunoproteasomes(7), may be involved more in the generation of viral antigen epitopes than of tumor antigen epitopes. In conjunction with these reports, our results showed that only one fifth of the N-terminal cleavage sites of the tumor antigen epitopes overlap with those of the viral antigen epitopes. In sum, we speculate that the differences between the two groups of epitopes may result from different enzymatic activities: the N-terminal cleavage sites of viral epitopes are more likely generated by IFN-γ-induced aminopeptidases, whereas the N-terminus of the tumor antigen epitopes is most likely cleaved by IFN-γ-insensitive aminopeptidases. It is well documented that IFN-γ plays a critical role in mounting anti-tumor immune responses(74). Along with this finding, our statistically significant bioinformatic results suggest that future tumor antigen-specific immunotherapy complemented by IFN-γ may enhance the processing and presentation of MHC class I-restricted tumor antigen epitopes via upregulation of immunoproteasomes and IFN-γ-induced peptidases.
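The overlap of cleavage-site residue pairs between two groups of epitopes can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, each record is assumed to be an (N-terminal flank, epitope, C-terminal flank) triple of one-letter amino acid strings, and the pairs follow the usual P1-P1′ convention, so that Pn1 is the flank residue immediately preceding the epitope, Pn1′ is the first epitope residue, Pc1 is the last epitope residue, and Pc1′ is the first residue of the C-terminal flank. The fraction is computed over distinct residue pairs, which may not be the exact counting used in this study.

```python
def cleavage_pairs(records):
    """Collect the distinct residue pairs spanning each cleavage site.

    records: iterable of (n_flank, epitope, c_flank) one-letter sequences.
    Returns (n_pairs, c_pairs) as sets of two-letter strings.
    """
    n_pairs, c_pairs = set(), set()
    for n_flank, epitope, c_flank in records:
        if n_flank:  # Pn1-Pn1': last flank residue + first epitope residue
            n_pairs.add(n_flank[-1] + epitope[0])
        if c_flank:  # Pc1-Pc1': last epitope residue + first flank residue
            c_pairs.add(epitope[-1] + c_flank[0])
    return n_pairs, c_pairs


def shared_fraction(pairs_a, pairs_b):
    """Fraction of the distinct pairs in pairs_a that also occur in pairs_b."""
    if not pairs_a:
        return 0.0
    return len(pairs_a & pairs_b) / len(pairs_a)


# Toy sequences for illustration only; these are not epitopes from this study.
group_a = [("AAK", "LLDGTATLR", "GSA")]
group_b = [("QQK", "LMNPQRSTV", "AAA")]
n_a, c_a = cleavage_pairs(group_a)
n_b, c_b = cleavage_pairs(group_b)
n_shared = shared_fraction(n_a, n_b)  # both groups have the pair "KL"
c_shared = shared_fraction(c_a, c_b)  # "RG" versus "VA": nothing shared
```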
We appreciate the editorial assistance of Dr. B. Ashby at Temple University and of J.E. Young, G. Bennett, and K. Franks at Baylor College of Medicine. This work was supported in part by NIH grant AI054514, the Kostas family foundation, the Leukemia & Lymphoma Society, and the Myeloproliferative Disorders Foundation (to X.F. Yang).