Mannose-binding proteins (MBPs) play an important role in the innate immune response by binding to carbohydrates on the surface of a wide range of pathogens and activating the complement system. Experimental techniques for identifying mannose-interacting residues (MIRs) are costly and time-consuming. There is therefore a need for in silico techniques that predict protein-mannose interactions, in order to understand the function of MBPs and their role in innate immunity. In the past, methods have been developed for predicting glucose-, galactose- and carbohydrate-interacting residues in proteins, but no method has been developed for predicting mannose-interacting residues. In this direction, we have made a systematic attempt to develop an accurate and robust method for predicting MIRs in protein sequences.
In this study, we created a clean, standard dataset from SuperSite documentation and the PDB and assigned MIRs using the program LPC. This dataset contains 125 non-redundant MBPs, where no two MBPs have more than 40% similarity. To understand the preference of residues in mannose interaction, we computed and compared the composition of MIRs and non-MIRs. We observed that certain types of residues are preferred in mannose interaction over others, and that the residues neighboring MIRs also differ from those neighboring non-MIRs. This suggests that mannose-interacting sites/pockets are highly conserved. We also observed that mannose-protein interaction differs from DNA-protein or RNA-protein interaction in terms of the residues preferred for interaction.
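The composition comparison described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names and the input format (one label string per sequence, with "1" marking a MIR and "0" a non-MIR) are assumptions made for the example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def percent_composition(residues):
    """Return the percent composition of the 20 standard amino acids."""
    counts = Counter(r for r in residues if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def mir_vs_non_mir_composition(sequences, labels):
    """Split residues by label and compute the composition of each class.

    Assumed input format (hypothetical): labels[i][j] == "1" means that
    residue j of sequence i interacts with mannose.
    """
    mir, non_mir = [], []
    for seq, lab in zip(sequences, labels):
        if len(seq) != len(lab):
            raise ValueError("sequence and label string differ in length")
        for residue, flag in zip(seq, lab):
            (mir if flag == "1" else non_mir).append(residue)
    return percent_composition(mir), percent_composition(non_mir)
```

Comparing the two returned dictionaries residue by residue shows which amino acids are over- or under-represented among MIRs relative to non-MIRs.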
An SVM model based on the binary profile of patterns (BPP) of the amino acid sequence predicted mannose-interacting residues with a low accuracy of around 59%. Previous studies have shown that the evolutionary information of a protein is more informative than its single amino acid sequence. To improve the performance of our models, we therefore used evolutionary information in the form of a PSSM profile of patterns (PPP) to develop SVM models for predicting mannose-interacting residues. The accuracy of the SVM modules increased significantly, from 59% to 66%, as expected given that a PSSM provides more information than a single sequence. During the analysis of MIRs, we observed that both the residues involved in mannose interaction and their neighboring residues are dominated by certain types of residues. Based on this observation, we used the composition profile of patterns (CPP), instead of the binary or PSSM profile, to develop modules for predicting MIRs. CPP-based SVM modules predict MIRs with a high accuracy of around 85%. The performance of SVM modules based on CPP is significantly higher than that of SVM modules based on BPP or PPP. Previously, our group used this concept for predicting conformational B-cell epitopes in proteins.
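To make the difference between the two sequence-derived representations concrete, the sketch below encodes a residue pattern both as a binary profile and as a composition profile. It is an illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical, the 21st symbol is assumed to be a dummy residue "X" used to pad windows at the sequence termini, and composition is expressed here as a fraction of the pattern length.

```python
# 20 standard amino acids plus "X", assumed here to be a padding symbol.
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"


def windows(sequence, size):
    """Return one pattern of length `size` centred on each residue."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    padded = "X" * half + sequence + "X" * half
    return [padded[i:i + size] for i in range(len(sequence))]


def _index(residue):
    """Map a residue to its column; unknown symbols fall back to 'X'."""
    pos = ALPHABET.find(residue)
    return pos if pos >= 0 else ALPHABET.index("X")


def binary_profile(pattern):
    """BPP: one-hot encode each position, giving len(pattern) * 21 values."""
    vector = []
    for residue in pattern:
        idx = _index(residue)
        vector.extend(1.0 if i == idx else 0.0 for i in range(len(ALPHABET)))
    return vector


def composition_profile(pattern):
    """CPP: fraction of each of the 21 symbols in the pattern (21 values)."""
    counts = [0] * len(ALPHABET)
    for residue in pattern:
        counts[_index(residue)] += 1
    if not pattern:
        return [0.0] * len(ALPHABET)
    return [c / len(pattern) for c in counts]
```

For a pattern of N residues, `binary_profile` returns N×21 values that preserve residue order, whereas `composition_profile` always returns 21 values and discards order. Either vector could then be passed to an SVM as the feature vector for the central residue.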
It is interesting that models based on the simple composition of patterns perform better than models based on the binary or PSSM profile of patterns. BPP provides more comprehensive information than CPP: BPP encodes both the order and the types of residues in a pattern, whereas CPP contains only the composition of residues. Ideally, BPP-based modules should therefore be more accurate than CPP-based modules, since they have more information; in practice, the results show the opposite. This may be compared with the problem of predicting subcellular localization, where simple composition-based SVM modules outperform alignment-based methods like BLAST. Biologically, it is difficult to justify why a composition-based method should perform better than BPP- or PPP-based methods. We believe this is due to limitations in how patterns are represented for use in an SVM. In BPP, a pattern of N residues is represented by an N×21 matrix that contains the value 1.0 for N elements and 0.0 for the remaining N×20 elements. In simple terms, most of the matrix elements are zero, which makes it difficult for any machine learning technique to learn from such a matrix. In CPP, a pattern is represented by only 21 values, most of which are non-zero. This is a probable reason why composition-based methods have become popular over the years. This study will be useful for researchers working in the field of immunology who seek to understand host-pathogen interaction and the response of innate immunity.