Prediction of protein-ligand binding sites is an important issue for protein function annotation and structure-based drug design. Nowadays, although many computational methods for ligand-binding prediction have been developed, there is still a demanding to improve the prediction accuracy and efficiency. In addition, most of these methods are purely geometry-based, if the prediction methods improvement could be succeeded by integrating physicochemical or sequence properties of protein-ligand binding, it may also be more helpful to address the biological question in such studies.
In our study, in order to investigate the contribution of sequence conservation in binding sites prediction and to make up the insufficiencies in purely geometry based methods, a simple yet efficient protein-binding sites prediction algorithm is presented, based on the geometry-based cavity identification integrated with sequence conservation information. Our method was compared with the other three classical tools: PocketPicker, SURFNET, and PASS, and evaluated on an existing comprehensive dataset of 210 non-redundant protein-ligand complexes. The results demonstrate that our approach correctly predicted the binding sites in 59% and 75% of cases among the TOP1 candidates and TOP3 candidates in the ranking list, respectively, which performs better than those of SURFNET and PASS, and achieves generally a slight better performance with PocketPicker.
Our work has successfully indicated the importance of the sequence conservation information in binding sites prediction as well as provided a more accurate way for binding sites identification.