Neuropeptides are bioactive peptides that affect the function of almost every central nervous system (1). Neuropeptidomic studies (2) characterize neuropeptides using mass spectrometry and provide high-quality, empirical data on actual neuropeptides. However, because the experimental discovery or confirmation of neuropeptides is time- and labor-intensive, biochemical characterization of an animal's neuropeptide complement is not available for most species. The increasing number of species that have been or are being sequenced at the genomic or transcriptomic level has motivated the development of effective and accurate bioinformatics methodologies to predict neuropeptides from sequence information.
A neuropeptide precursor mRNA sequence can be identified from sequence information (7), and the resulting translated protein sequence includes a signal peptide sequence and one or multiple neuropeptides. An extensive and complicated series of enzymatic processing steps, including cleavage by prohormone or proprotein convertases and other post-translational modifications, occurs on the translated protein sequence before the active neuropeptides are created. Prohormone convertases are calcium-dependent serine proteases, and each has specific cleavage sites associated with the basic amino acids Lys and Arg (8). Kexin, furin and other prohormone convertases, including PC1, PC2, PC4, PACE4, PC5 and PC7, have overlapping cleavage functions, and several are usually present simultaneously (8). Multiple prohormone convertases can cleave the same site and thus compensate for the functional loss of a specific prohormone convertase. Consequently, the prediction of the resulting neuropeptides from sequence information alone can prove challenging.
While the cleavage motifs for furin and kexin have been extensively studied, there is less information for other prohormone convertases. General observations (often termed rules) for cleavage recognition sites have been proposed (8), usually without knowledge of the acting prohormone convertase. However, these observations stem only from motifs that are cleaved; non-cleaved motifs are typically ignored. Thus, many of these observations are made without regard to cleavage status. Southey et al. predicted precursor cleavages in insects, mammals, birds, fish and other species using a Known Motif approach, based on reported cleavage motifs. Although this approach identified most of the known cleavages, it also had a high rate of false positive results (10).
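A motif-based prediction of this kind can be sketched in a few lines of Python. The sketch below is illustrative only: it assumes a simplified motif set (cleavage immediately C-terminal to any dibasic pair of Lys and Arg), which is not the motif list used in the cited work, and the names `DIBASIC`, `predict_cleavage_sites` and `split_at_sites` are hypothetical.

```python
import re

# Assumption: a simplified motif set in which cleavage occurs directly after
# any dibasic pair (KK, KR, RK, RR). The lookahead allows overlapping matches.
DIBASIC = re.compile(r"(?=([KR][KR]))")


def predict_cleavage_sites(seq):
    """Return cut positions i, meaning a cut between seq[i-1] and seq[i]."""
    return [m.start() + 2 for m in DIBASIC.finditer(seq)]


def split_at_sites(seq, sites):
    """Split a precursor sequence into fragments at the given cut positions."""
    bounds = [0] + sorted(set(sites)) + [len(seq)]
    fragments = [seq[a:b] for a, b in zip(bounds, bounds[1:])]
    return [f for f in fragments if f]  # drop empty fragments


sites = predict_cleavage_sites("AAKRGGSRRLL")
fragments = split_at_sites("AAKRGGSRRLL", sites)
```

Because every matching motif is reported as cleaved, a rule of this kind cannot distinguish cleaved from non-cleaved occurrences of the same motif, which is consistent with the high false positive rate noted above.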
Other approaches to predict neuropeptide cleavage sites include logistic regression [(11), B. R. Southey, A. B. Hummon, T. A. Richmond, S. L. Rodriguez-Zas and J. V. Sweedler, manuscript submitted] and the artificial neural network (13) available in the ProP application (http://www.cbs.dtu.dk/services/ProP). Hummon et al. predicted cleavage sites in mollusk (Aplysia californica) precursors using a logistic regression model on combinations of amino acids and locations, and then applied the predictive function to neuropeptide precursors from a range of organisms. This approach was extended to mammalian precursors (12) and to precursors identified from the Apis mellifera and Drosophila melanogaster genomes (B. R. Southey, A. B. Hummon, T. A. Richmond, S. L. Rodriguez-Zas and J. V. Sweedler, manuscript submitted).
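The general form of a logistic regression predictor over amino acids and their locations relative to a candidate site can be sketched as follows. This is a minimal illustration, not the published model: the intercept and weights are invented for the example and are not fitted to any data, and the names `INTERCEPT`, `WEIGHTS` and `cleavage_probability` are hypothetical.

```python
import math

# Hypothetical coefficients, invented for illustration only.
# Keys are (offset, residue); offset -1 is the residue immediately
# N-terminal to the candidate cleavage site.
INTERCEPT = -4.0
WEIGHTS = {
    (-1, "R"): 3.0,
    (-1, "K"): 2.0,
    (-2, "K"): 1.5,
    (-2, "R"): 1.5,
    (-4, "R"): 1.0,
}


def cleavage_probability(seq, site):
    """Probability that the bond between seq[site-1] and seq[site] is cleaved."""
    score = INTERCEPT
    for (offset, residue), weight in WEIGHTS.items():
        pos = site + offset
        # Add the weight only if the indicated residue is present at that location.
        if 0 <= pos < len(seq) and seq[pos] == residue:
            score += weight
    # Logistic function maps the linear score to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-score))


p_after_kr = cleavage_probability("AAKRGG", 4)  # site preceded by Lys-Arg
p_after_aa = cleavage_probability("AAKRGG", 2)  # site preceded by Ala-Ala
```

Unlike a fixed motif list, a model of this form is estimated from both cleaved and non-cleaved sites and returns a probability, so a threshold can be chosen to trade sensitivity against false positives.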
NeuroPred provides a unified interface to predict cleavage sites by employing multiple approaches, based on a wide range of precursors and species, as developed by Hummon et al., Amare et al. and Southey et al. [B. R. Southey, A. B. Hummon, T. A. Richmond, S. L. Rodriguez-Zas and J. V. Sweedler, manuscript submitted]. NeuroPred also has the capability to calculate the mass of the neuropeptides resulting from the predicted cleavages. Made widely available as a web-based application, NeuroPred is a comprehensive resource with which to explore neuropeptide precursor processing and aid in the discovery and confirmation of new neuropeptides.
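As an illustration of the mass calculation step, the monoisotopic mass of an unmodified peptide is the sum of its residue masses plus one water. The sketch below is an assumption about how such a calculation can be done in general, not a description of NeuroPred's implementation; it ignores post-translational modifications, and the names `MONOISOTOPIC_RESIDUE_MASS`, `WATER` and `peptide_mass` are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # monoisotopic mass of H2O (free N- and C-termini)


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged linear peptide."""
    return sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in peptide) + WATER


mass_gg = peptide_mass("GG")  # diglycine
```

Masses computed this way can be compared against peaks observed by mass spectrometry; modified peptides (for example, C-terminally amidated ones) would need the corresponding mass shifts added.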