ConoServer annotations associated with individual conopeptide sequences are entered semi-automatically and manually. An annotation system performs most of the repetitive tasks, but the resulting outputs in all cases are subjected to manual reviewing before being approved and published. The majority of the sequence and three-dimensional structure data are retrieved from publicly available databases, including GenBank (21), UniProt-KB (22), the Protein Data Bank (23) and the Biological Magnetic Resonance Bank (24). Manual curation of the peer-reviewed literature provides additional entries, which are therefore unique to ConoServer. Conopeptides are expressed as prepropeptides (25), and their corresponding mature peptide is predicted using ConoPrec for cases where it was not identified in the literature. As of September 2011, ConoServer provides information on 1180 mature conopeptides. However, with more than 500 species of cone snails (26) and estimates of 200–1000 unique conopeptides per species (27), the number of known peptides cataloged in ConoServer is only a small fraction of the potential pool of wild-type conopeptides. ConoServer will need to be regularly updated and improved to cope with the increasing number of sequences.
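To make the phrase 'small fraction' concrete, the short Python sketch below works through the arithmetic using only the figures quoted above. It rests on two simplifying assumptions that are ours and not claims of the database: the species count is taken at its stated floor of 500, and no conopeptide is assumed to be shared between species. The function name `coverage_bounds` is purely illustrative.

```python
# Figures quoted in the text (status of September 2011).
KNOWN_MATURE_CONOPEPTIDES = 1180
SPECIES_FLOOR = 500                  # "more than 500 species": used as a lower bound
PEPTIDES_PER_SPECIES = (200, 1000)   # estimated unique conopeptides per species


def coverage_bounds(known, species, per_species_range):
    """Return (pool_low, pool_high, pct_high, pct_low).

    Assumption: peptides are not shared between species, so the potential
    pool is simply species * peptides-per-species.
    """
    low, high = per_species_range
    pool_low = species * low
    pool_high = species * high
    pct_high = 100 * known / pool_low    # coverage if the pool is smallest
    pct_low = 100 * known / pool_high    # coverage if the pool is largest
    return pool_low, pool_high, pct_high, pct_low


pool_low, pool_high, pct_high, pct_low = coverage_bounds(
    KNOWN_MATURE_CONOPEPTIDES, SPECIES_FLOOR, PEPTIDES_PER_SPECIES
)
print(f"Potential pool: {pool_low:,} to {pool_high:,} conopeptides")
print(f"Cataloged share: {pct_low:.2f}% to {pct_high:.2f}%")
```

Under these assumptions the potential pool is at least 100 000 to 500 000 peptides, so the 1180 cataloged mature conopeptides represent roughly 0.2–1.2% of it. Because the species count is a floor, the true share is likely lower still.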
ConoServer now provides sequence/structure/activity relationship information that is of particular interest for drug design studies. Examples of bioactivity data that are now provided include measures of IC50 and percentage of inhibition of ion currents in various electrophysiological assays. Besides native conopeptide sequences, ConoServer contains information on 338 synthetic variants, which have been chemically synthesized to study the receptor specificity and stability of conopeptides with potentially interesting pharmaceutical properties. ConoServer catalogs 95 three-dimensional structures of wild-type conopeptides and 42 structures of synthetic variants. The majority of these structures have been determined by nuclear magnetic resonance (28). Finally, ConoServer describes 1288 patented protein and 737 patented nucleic acid sequences.
New types of annotations related to the discovery and evolution of conopeptides are now available in ConoServer, including a more extensive description of organisms, information on how mature peptide sequences were identified and the analysis of precursor sequences. The geographic location and the diet (mollusk, worm or fish) of specific cone snails are new features that are retrieved from the Conus Biodiversity website (http://biology.burke.washington.edu/conus/) or from the peer-reviewed literature. Mature conopeptides are typically either isolated directly from the venom or predicted from a nucleic acid precursor. Information on the method of identification, now included in ConoServer, allows users to make a rapid assessment of the confidence of conopeptide sequences and the presence of post-translational modifications. Conopeptides are classified into gene superfamilies according to the similarity of the endoplasmic reticulum (ER) signal sequence in their precursor. For cases where the ER signal sequence is not identified in the literature, ConoServer predicts it using the new tool ConoPrec (described below). The sequences of 1120 precursors are currently in ConoServer and 16 gene superfamilies are described. In addition, 13 other temporary gene superfamilies were recently introduced in ConoServer to describe newly discovered conopeptide precursors expressed by cone snails from the ‘early divergent’ clade (15).
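The idea of assigning a precursor to a gene superfamily by signal-sequence similarity can be illustrated with a minimal Python sketch. This is not the ConoPrec algorithm or the procedure used by ConoServer, neither of which is specified here; it is a toy nearest-reference classifier based on ungapped percentage identity. The function names, the 75% threshold and the reference sequences (`toy_A`, `toy_B`) are invented placeholders and do not correspond to real conopeptide signal sequences or superfamilies.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped, position-by-position identity over the shorter sequence (in %)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))  # zip stops at the shorter one
    return 100.0 * matches / n


def assign_superfamily(signal: str, references: dict, threshold: float = 75.0):
    """Return (best_superfamily_or_None, best_score).

    The threshold is an arbitrary illustrative value, not a published cutoff.
    """
    best_name, best_score = None, 0.0
    for name, ref in references.items():
        score = percent_identity(signal, ref)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score  # too dissimilar to every reference
    return best_name, best_score


# Invented placeholder sequences, NOT real signal peptides.
toy_references = {
    "toy_A": "MKLLVALLVTAC",
    "toy_B": "MRGSPVFILLIA",
}

name, score = assign_superfamily("MKLLVALLVTAS", toy_references)
print(name, round(score, 1))
```

A query that falls below the threshold for every reference is returned as unassigned (`None`), which loosely mirrors the situation in which a newly discovered precursor does not fit an existing superfamily. A realistic implementation would use a proper alignment method and a validated cutoff.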
ConoServer now computes statistics on known conopeptides. The statistical tables are kept up-to-date with the database content, and provide information on relationships between classification schemes, sequence conservation of signal sequence regions that define gene superfamilies, the number of conopeptides for each species and details on three-dimensional structures. As an example of the use of this information, these statistics were valuable in a recent discussion of the relationships between the various conopeptide classification schemes (20). The statistical tables also provide a convenient access link to the database content. For example, there are 18 conopeptides that are antagonists of sodium channels (μ pharmacological family), and some of them belong to the M gene superfamily. Clicking on the ‘M’ in the corresponding table gives access to the list of the 11 μ-conopeptides belonging to the M superfamily.
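The relationship between a statistical table and the underlying entries can be sketched as a simple cross-tabulation. The Python example below is only an illustration of the principle and makes no claim about how ConoServer is implemented: the record layout, the field names (`pharm_family`, `superfamily`) and the entries themselves are hypothetical, and the counts are unrelated to the figures quoted above.

```python
from collections import defaultdict

# Hypothetical toy entries; names and classifications are placeholders.
records = [
    {"name": "toy-1", "pharm_family": "mu", "superfamily": "M"},
    {"name": "toy-2", "pharm_family": "mu", "superfamily": "M"},
    {"name": "toy-3", "pharm_family": "mu", "superfamily": "O1"},
    {"name": "toy-4", "pharm_family": "alpha", "superfamily": "A"},
]


def cross_tabulate(entries):
    """Group entry names by (pharmacological family, gene superfamily)."""
    table = defaultdict(list)
    for rec in entries:
        table[(rec["pharm_family"], rec["superfamily"])].append(rec["name"])
    return dict(table)


table = cross_tabulate(records)

# The number shown in each table cell is the size of the group ...
counts = {cell: len(names) for cell, names in table.items()}
# ... and following the link for a cell lists the entries behind that number.
mu_in_M = table[("mu", "M")]

print(counts)
print(mu_in_M)
```

Because each count is derived from the list it links to, the table cannot drift out of step with the database content, which is the property described above.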