To test MIMOX, we applied it to several cases taken from the literature and from similar studies. We compared the results from MIMOX with those from other computational tools and, where the epitope is known, with the native epitope recorded in the CED database[24]. It should be pointed out that cases using monoclonal antibodies are the most appropriate for testing[15]. However, in order to compare with previously published tools, some less appropriate cases (using polyclonal antibodies) taken from the corresponding literature are also used. More case studies [see Additional file 1] can also be found on the test dataset page of MIMOX[25].
The first case is taken from FINDMAP[13]. In 1999, Jesaitis and co-workers used an anti-actin polyclonal antibody to screen a phage-displayed random peptide library; VPHPTWMR was one of the consensus sequences they derived from the selected mimotopes. They manually mapped VPHPTWMR to the known structure of actin [PDB: 1ATN] and suggested that it might correspond to residues V129, P130, H101, P102, T358, W356, M355, and R372[11]. In 2003, Mumey et al. used FINDMAP to align VPHPTWMR to the actin sequence without using any information on the antigen structure. FINDMAP mapped VPHPTWMR to residues V129, P130, H101, P102, T103, W356, M355, and R372[13], a slightly different set from the manual mapping (T103 instead of T358). When MIMOX was run with all parameters at their defaults, it returned no results. However, after the distance threshold was raised to 12 Å (the maximum distance allowed in MIMOX), the two mappings above were returned as candidate clusters 5 and 17. As the side chains of some amino acids (such as arginine) can span a distance as great as 12 Å, MIMOX takes this value as the maximum allowable distance. This distance restriction is also used by Mapitope[16]. In this case, the higher distance threshold is needed because R372 lies some distance from the other mapped residues. MIMOX also suggested other possibilities, such as cluster 1 (V96, P102, H101, P130, T358, W356, M355, R372), which has a larger solvent accessible surface, and cluster 26 (V96, P98, H101, P102, T103, W356, M355, R372), which clearly consists of three sequential segments, i.e. VPHPT, WM, and R.
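The effect of the distance threshold in this case can be illustrated with a minimal sketch. It is not MIMOX's actual clustering code: the coordinates are invented for illustration, each residue is reduced to a single representative point, and the neighbour rule (a residue set is accepted only if it is connected through steps no longer than the threshold) is an assumption made here.

```python
import math
from itertools import combinations

def is_connected(coords, threshold):
    """Return True if every residue can be reached from the first one
    through steps no longer than `threshold` (in angstroms)."""
    names = list(coords)
    if not names:
        return False
    # Build a neighbour graph: an edge joins two residues within the threshold.
    neighbours = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if math.dist(coords[a], coords[b]) <= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)
    # Depth-first traversal starting from the first residue.
    seen, stack = {names[0]}, [names[0]]
    while stack:
        for nxt in neighbours[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(names)

# Hypothetical coordinates (NOT taken from PDB 1ATN): one residue lies
# about 11 angstroms from its nearest partner, mimicking an outlier like R372.
toy = {
    "V129": (0.0, 0.0, 0.0),
    "P130": (3.8, 0.0, 0.0),
    "H101": (6.0, 4.0, 0.0),
    "R372": (6.0, 15.0, 0.0),
}

print(is_connected(toy, 7.0))   # outlier is cut off at a tight threshold
print(is_connected(toy, 12.0))  # outlier is admitted at 12 angstroms
```

With the tighter threshold the distant residue is disconnected from the rest and the whole candidate set is rejected, whereas at 12 Å it is admitted.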
The second case is taken from work by Enshell-Seijffers[16]. They used monoclonal antibody 17b, which is directed against the HIV gp120 envelope glycoprotein, to screen a phage-displayed random peptide library and obtained a set of 11 mimotopes. Analyzing the mimotopes with Mapitope, they suggested that the 17b epitope might consist of the following residues and segments: L111, LKPCVK (116–121), P124, VITQ (200–203), CPKV (205–208), RIK (419–421), I423, I424, K432, P437, P438. The structure of the HIV gp120 envelope glycoprotein in complex with 17b has been solved [PDB: 1GC1] and the 17b epitope has been recorded in the CED database as CE0058, which is composed of CK (119, 121) + VTQAC (200, 202–205) + RKQI (419, 421–423) + KMYP (432, 434, 435, 437). Among the 24 residues predicted by Mapitope, 11 are contact residues of the 17b epitope. Using default parameters, MIMOX derived a consensus sequence, [LV] RP [LT] [KR] LRE [LP] [RT] X [-R], from the 17b mimotopes. MIMOX found no result matching the whole consensus sequence. Using LRLR, a fragment of the consensus sequence, as the input sequence and running MIMOX in conservative match mode with the other parameters at their defaults, the top result is I423, K421, I420, and R419; the 4th result is V200, K121, L122, and K432; and the 5th result is I423, K432, L122, and K121. Taking the top 5 results together, 13 residues are suggested by MIMOX and 6 of them are contact residues of the 17b epitope. In this case, Mapitope gives a more complete and elaborate result.
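The comparison between predicted residues and the recorded epitope amounts to a set intersection, which can be sketched as follows. The epitope residues are those listed above for CE0058; only the three MIMOX results spelled out in the text (results 1, 4, and 5) are included, so the pooled count is smaller than the 13 residues reported for the full top 5. The helper name `overlap` is hypothetical.

```python
# Contact residues of the 17b epitope as listed for CE0058 in the text.
epitope = {
    "C119", "K121",
    "V200", "T202", "Q203", "A204", "C205",
    "R419", "K421", "Q422", "I423",
    "K432", "M434", "Y435", "P437",
}

# MIMOX results for the probe LRLR that the text lists explicitly.
# Results 2 and 3 are not given in the text, so they are omitted here.
mimox_results = {
    1: ["I423", "K421", "I420", "R419"],
    4: ["V200", "K121", "L122", "K432"],
    5: ["I423", "K432", "L122", "K121"],
}

def overlap(predicted, reference):
    """Return the predicted residues found in the reference set (sorted by
    residue number) and the number of distinct predicted residues."""
    distinct = set(predicted)
    hits = sorted(distinct & reference, key=lambda res: int(res[1:]))
    return hits, len(distinct)

pooled = [res for result in mimox_results.values() for res in result]
hits, n_predicted = overlap(pooled, epitope)
print(f"{len(hits)} of {n_predicted} pooled residues are epitope contacts: {hits}")
```

Pooling results before intersecting, as done here, counts each residue once even when it appears in several candidate clusters.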
The third case is taken from 3DEX[14]. LLTTNKD is a mimotope selected from a phage-displayed random peptide library using IgG from HIV-positive patients. Using 3DEX to map this mimotope to the HIV gp120 envelope glycoprotein [PDB: 1G9M], Schreiber et al. reported that this mimotope might correspond to residues L452, L453, T283, T455, N280, K282, and D279. When MIMOX is run with default parameters, the top candidate residue cluster is exactly the same as the 3DEX result and has a solvent accessible surface of 265.82 Å². When candidate residues are picked in conservative mode and clustered based on Cα atoms with a distance threshold of 7 Å, the top candidate residue cluster suggested by MIMOX changes to I360, I467, S465, T358, N356, K357, and E464. This cluster has an accessible surface area of 399.28 Å². Furthermore, when neighbouring is based on all heavy atoms with a distance factor of 1.11, the top candidate residue cluster suggested by MIMOX is L86, V85, T244, S243, N229, K231, and E267, which has an even larger accessible surface area of 562.89 Å². All three clusters are shown in Figure 4. As the latter two mapping results suggested by MIMOX are more exposed, they might be able to bind to the antibody more easily.
Figure 4. Comparison of three mapping results. MIMOX was used to map LLTTNKD to HIV gp120 with three different methods. The top result using the default parameters is shown in image A. It is composed of the residues L452, L453, T283, T455, N280, K282, and D279.
The last case is taken from MIMOP[15]. BO2C11 is a human monoclonal antibody against human coagulation factor VIII. Villard et al. screened two phage-displayed random peptide libraries with BO2C11 and obtained a set of 27 mimotopes[26]. Very recently, Moreau et al. applied their newly developed tool MIMOP to analyze these mimotopes. Combining the two methods in MIMOP, MimAlign and MimCons, the BO2C11 epitope is predicted to be composed of the segment YFTNMF (2195–2200) and residues T2202, K2207, R2215, R2220, and Q2222. The structure of human coagulation factor VIII in complex with BO2C11 has been solved [PDB: 1IQD] and the BO2C11 epitope has been recorded in the CED database as CE0176, which consists of FTNMF (2196–2200), R2215, RPQV (2220–2223), SLLT (2250–2253), and HQ (2315–2316). Using default parameters, MIMOX derived a consensus sequence of [NQKR] [HST] RWSNRSS [ST] from those mimotopes. Again, the full-length consensus sequence returns no mapping results. However, when we use QH, RWSN, and RSSS, three sequential fragments that cover the whole consensus sequence, as input sequences and run MIMOX in conservative match mode with all other parameters at their defaults, the top 3 results for each of the partial sequences overlap well with the MIMOP result and the native BO2C11 epitope. For example, the third result for the input QH is Q2316, H2315; the third result for the input RWSN is R2220, F2196, T2197, N2198; and for the input RSSS the first result is R2215, S2216, T2202, T2197 and the third result is K2249, S2250, S2254, T2253.
Taken together, our initial case studies show that MIMOX can fully or partially reproduce results from manual mapping and from other existing tools, and can also provide novel suggestions. MIMOX is designed to be a tool that is more interactive than automatic. We acknowledge that tuning the probe sequences and parameters is often required to get good results. This interactive process gives hints to users step by step, greatly decreases the load on the server, and prevents the loss of some reasonable results. MIMOX lists all the matched results with no prediction threshold. This allows users to find the reasonable results by themselves, based on their background knowledge of a given antibody, a given antigen, and a given phage display experiment. Nevertheless, according to the test dataset page of MIMOX, the true epitope (or its segments) often falls in the top 5 (if there are only a few result entries) or the top 10% (if there are many result entries) of the results. Where the real epitope is unknown, we would suggest running MIMOX with a range of parameters and fragments derived from the consensus sequence to find overlapping or otherwise promising (highly surface accessible) candidates.
Related software comparison
As mentioned previously, several groups have developed algorithms and programs that may assist and automate phage display based epitope mapping. Based on their dependency on antigen structure, the existing programs can be classified into three categories. FINDMAP belongs to the first category, which is independent of any structural information. FINDMAP has been implemented as a C++ program. It aligns a probe (e.g. a consensus sequence derived from a set of mimotopes) to the sequence of the native antigen, allowing any permutation of the probe sequence. It uses a two-part scoring system to evaluate the quality of alignments and a branch-and-bound algorithm to find an alignment with the maximum score[13].
The programs in the second category include SiteLight, 3DEX, Mapitope, and MIMOX. SiteLight was implemented in C++ and has been tested on Red Hat Linux. First, the program divides the native protein surface into overlapping patches based on geodesic distances between residues; it then aligns each mimotope in the library with each patch, and scores and sorts the alignments; finally, high scoring matches are selected iteratively until 25% of the native protein is covered[12]. Another program, 3DEX, was implemented in Visual Basic and runs only on Windows. It divides a sequence into a set of overlapping subsequences of a user-defined length (from 3 up to the maximum length of the mimotope). It then searches for matching residues at each position of these subsequences against the sequence or PDB structure of the native protein and links the neighbours iteratively until the first subsequence is complete. This is repeated for the following subsequences to complete the mimotope and return the result[14]. Mapitope was also implemented in C++ and its algorithm was first described by Enshell-Seijffers in 2003. Briefly, Mapitope deconvolutes a set of mimotope sequences into a set of overlapping amino acid pairs (AAP). A set of major statistically significant pairs (SSP) is then identified based on the AAP. Next, the SSP are mapped and clustered on the antigen structure. Finally, the most elaborate and diverse clusters on the antigen surface are identified and regarded as the predicted epitope candidates[16].
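The first step described for Mapitope, deconvoluting mimotopes into overlapping amino acid pairs, can be sketched as below. This is only an illustration under stated assumptions: the peptides are invented, pairs are taken as ordered adjacent residues, and the statistical test that selects the SSP as well as the mapping onto the antigen structure are not reproduced. The function names are hypothetical and are not Mapitope's own.

```python
from collections import Counter

def amino_acid_pairs(peptide):
    """Split a peptide into its overlapping adjacent residue pairs."""
    return [peptide[i:i + 2] for i in range(len(peptide) - 1)]

def count_pairs(mimotopes):
    """Tally how often each adjacent pair occurs across a mimotope panel."""
    counts = Counter()
    for peptide in mimotopes:
        counts.update(amino_acid_pairs(peptide))
    return counts

# Hypothetical peptides, invented for illustration only.
panel = ["ACDEF", "GCDEH", "ACDKL"]
counts = count_pairs(panel)

# Pairs recurring across the panel would be the candidates passed on to a
# significance test; here they are simply listed by raw frequency.
for pair, n in counts.most_common(3):
    print(pair, n)
```

Raw frequency is used here only as a stand-in for the significance criterion, which would have to account for the background amino acid composition of the library.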
MIMOP, a tool published very recently, constitutes the third category. MIMOP includes two approaches. One, called MimAlign, can predict potential epitopic regions (PER) from mimotope and antigen sequences, and from the antigen structure if available. The other, called MimCons, can predict PER from mimotope sequences but requires the antigen structure[15]. From the published case studies, it seems that MIMOP can work with or without the antigen structure. However, in the only case that is independent of antigen structure, the sequence is just a continuous subsequence of the antigen sequence. Thus, more studies are still needed to prove that MimAlign can work without antigen structure information.
All the existing programs described above have succeeded in particular cases. However, a systematic evaluation of these tools is lacking. Moreover, as shown in the table comparing the available programs, most of the existing tools have not yet been implemented as a publicly accessible online service, making it less convenient for the community to access, use, and evaluate them.
Comparing available programs related to MIMOX