ArachnoServer (www.arachnoserver.org) is a manually curated database providing information on the sequence, structure and biological activity of protein toxins from spider venoms. These proteins are of interest to a wide range of biologists due to their diverse applications in medicine, neuroscience, pharmacology, drug discovery and agriculture. ArachnoServer currently manages 1078 protein sequences, 759 nucleic acid sequences and 56 protein structures. Key features of ArachnoServer include a molecular target ontology designed specifically for venom toxins, current and historic taxonomic information and a powerful advanced search interface. The following significant improvements have been implemented in version 2.0: (i) the average and monoisotopic molecular masses of both the reduced and oxidized form of each mature toxin are provided; (ii) the advanced search feature now enables searches on the basis of toxin mass, external database accession numbers and publication date in ArachnoServer; (iii) toxins can now be browsed on the basis of their phyletic specificity; (iv) rapid BLAST searches based on the mature toxin sequence can be performed directly from the toxin card; (v) private silos can be requested by research groups engaged in venoms-based research, enabling them to easily manage and securely store data during the process of toxin discovery; and (vi) a detailed user manual is now available.
The growing realisation that most venomous animals possess a complex repertoire of protein toxins with potential pharmaceutical and agrochemical applications has led to an exponential increase in the rate of toxin discovery (1). Several databases have been specifically developed to facilitate retrieval of information about these toxins, such as Tox-Prot (2) and the Animal Toxin Database (ATDB) (3). These databases are critical for comparison of toxins across different groups of venomous animals but they typically lack the rich information content of manually curated databases that deal with specific subsets of animal toxins, such as ConoServer (4), which provides information about toxins from marine cone snails.
Spiders are the most evolutionarily successful venomous animals and the most abundant terrestrial predators. Their remarkable success is due in large part to their evolution of a pharmacologically complex venom that ensures rapid subjugation of prey. Most spider venoms are dominated by disulfide-rich peptide toxins that typically have high affinity and selectivity for specific subtypes of ion channels and receptors, making them particularly valuable from a pharmacological and drug discovery perspective (5–8). Collectively, spider venoms are likely to contain more than 10 million bioactive peptides: the number of extant spider species is predicted to exceed 100,000, and some venoms have been shown to contain more than 1000 unique peptides (5). This is a larger pharmacopeia than that of all other venomous animals combined.
Prior to the introduction of ArachnoServer 1.0 (9), no publicly accessible database existed specifically for collating information about proteinaceous spider toxins. Since its establishment in 2009, ArachnoServer has been widely used and the website has hosted visitors from more than 60 countries. ArachnoServer toxin cards (see Figure 1 for an example) are now cross-referenced through links on the corresponding sequence records in UniProtKB (http://www.uniprot.org); ArachnoServer accession numbers are included in the UniProtKB mapping service; and the UniProtKB records have adopted the rational toxin nomenclature (1) that has been applied universally in ArachnoServer.
The starting point for data curation in ArachnoServer is the automated collection of all publicly available sequence and annotation information for spider toxins from UniProtKB (10), the International Nucleotide Sequence Database Collection (INSDC) and the Protein Data Bank (PDB) (11). These data sets are joined with the assistance of the Sequence Retrieval System (SRS) (12) into a single non-redundant set containing peptide sequences, nucleotide sequences and protein structures (where available) for each toxin (Figure 2). Other database identifiers (e.g. NCBI taxonomy codes, Gene Ontology classifications, PROSITE and Pfam accessions, etc.) are also imported, as well as literature references, annotations of sequence and structure features (e.g. known locations of disulfide bonds) and toxin descriptions. Spider taxonomy is derived from the World Spider Catalog (13) while other taxonomy (used for classification of a toxin’s phyletic selectivity) is from the NCBI Taxonomy database (14). Since the majority of spider toxins act on ion channels and cell-surface receptors (8,15), we developed a molecular target ontology specifically for venom toxins that is based on the channel and receptor subtype definitions and nomenclature recommended by the International Union of Basic and Clinical Pharmacology (IUPHAR) (16).
Using the curation interface within ArachnoServer, curators can add further data, including but not limited to: a detailed description of the toxin; toxin name [which conforms to the rational nomenclature proposed for venom peptides (1) and venom sphingomyelinases (17)]; source species; discovery date; toxin synonyms; biological activity; phyletic specificity; molecular targets; sequence features such as toxin pharmacophore and disulfide bonds; database cross-references; and literature references. Literature references are sourced from PubMed (where available) using the PubMed eFetch web service. The process of data retrieval and curation in ArachnoServer is summarized in Figure 2.
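As an illustration of how literature references can be pulled from PubMed, the following minimal sketch builds a request URL for the NCBI E-utilities EFetch endpoint and extracts article titles from the XML it returns. It is not the ArachnoServer implementation, which is not described here. The function names are hypothetical, and the PubMed identifiers used in any call are placeholders chosen by the caller.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(pmids):
    """Return an EFetch URL requesting PubMed records as XML."""
    params = {
        "db": "pubmed",
        "id": ",".join(str(p) for p in pmids),
        "retmode": "xml",
    }
    return EFETCH_URL + "?" + urlencode(params)


def parse_titles(xml_text):
    """Map each PMID in a PubmedArticleSet document to its article title."""
    root = ET.fromstring(xml_text)
    titles = {}
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID")
        title_elem = article.find("MedlineCitation/Article/ArticleTitle")
        if pmid is None or title_elem is None:
            continue
        # itertext() keeps text nested inside inline markup such as <i>.
        titles[pmid] = "".join(title_elem.itertext()).strip()
    return titles


def fetch_titles(pmids):
    """Perform the network request (hypothetical helper; needs internet access)."""
    with urlopen(build_efetch_url(pmids)) as response:
        return parse_titles(response.read().decode("utf-8"))
```

Keeping URL construction and XML parsing separate from the network call means the parsing logic can be checked against a saved response without contacting NCBI.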
In response to user feedback, we have implemented a major upgrade of ArachnoServer that includes the following new features:
Mass spectrometry is becoming one of the standard methods used to characterize venom components (7). Most spider toxins contain multiple disulfide bonds and therefore the mass of the oxidized form of the peptide (as opposed to the reduced form that is calculated in most databases) is typically of primary interest to venom researchers. Thus, toxin records in ArachnoServer now provide the average and monoisotopic masses for both the reduced and oxidized form of the toxin (see Figure 1 for an example). Moreover, the advanced search feature now includes an option to perform a search based on any of these mass classes (see below).
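The relationship between the four mass classes can be illustrated with a short calculation. The sketch below is not the code used by ArachnoServer; it assumes an unmodified linear peptide (no C-terminal amidation or other posttranslational modifications) and uses standard residue masses. The reduced mass is the sum of the residue masses plus one water, and each disulfide bond in the oxidized form removes two hydrogen atoms. The function name `toxin_masses` is hypothetical.

```python
# Standard residue masses in daltons for the 20 common amino acids.
MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
AVERAGE = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = {"monoisotopic": 18.01056, "average": 18.01528}
HYDROGEN = {"monoisotopic": 1.007825, "average": 1.00794}


def toxin_masses(sequence, disulfide_bonds):
    """Return reduced and oxidized masses (monoisotopic and average).

    Assumes an unmodified peptide; raises KeyError for non-standard residues.
    """
    sequence = sequence.upper()
    if disulfide_bonds < 0 or 2 * disulfide_bonds > sequence.count("C"):
        raise ValueError("not enough cysteines for that many disulfide bonds")
    tables = {"monoisotopic": MONOISOTOPIC, "average": AVERAGE}
    masses = {}
    for kind, table in tables.items():
        # Residue masses plus one water for the free N- and C-termini.
        reduced = sum(table[res] for res in sequence) + WATER[kind]
        # Each disulfide bond forms with the loss of two hydrogen atoms.
        oxidized = reduced - 2 * HYDROGEN[kind] * disulfide_bonds
        masses["reduced_" + kind] = reduced
        masses["oxidized_" + kind] = oxidized
    return masses
```

For a toxin with three disulfide bonds, the oxidized form is therefore roughly 6 Da lighter than the reduced form, a difference that is readily resolved by mass spectrometry.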
ArachnoServer provides an Advanced Search feature that enables multiple search clauses to be grouped and joined using Boolean operators. We have now added the ability to search for toxins on the basis of toxin mass, external database accession numbers and the date on which the toxin record was published in ArachnoServer. Additional columns have been added to the table of search results that display context-specific data for certain search categories. For example, a search for toxins with oxidized molecular masses within a certain mass range will yield a table in which the oxidized molecular mass for each ‘hit’ is indicated in the additional column. Context-specific data columns have also been introduced for searches based on toxin discovery date as well as the number of solved PDB structures, number of biological activities, number of molecular targets, number of posttranslational modifications and number of disulfide bonds. Search results can be exported in both PDF and XML formats.
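The idea of grouping search clauses and joining them with Boolean operators can be sketched as composable predicates over toxin records. This is only an illustration under stated assumptions: the `ToxinRecord` fields, the clause helpers and the in-memory list are all hypothetical and do not reflect the ArachnoServer schema or query engine.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ToxinRecord:
    # Hypothetical fields chosen to mirror the search categories in the text.
    name: str
    oxidized_mono_mass: float
    disulfide_bonds: int
    pdb_structures: int


Clause = Callable[[ToxinRecord], bool]


def mass_between(low: float, high: float) -> Clause:
    """Clause: oxidized monoisotopic mass lies within [low, high]."""
    return lambda r: low <= r.oxidized_mono_mass <= high


def min_disulfides(n: int) -> Clause:
    """Clause: record has at least n disulfide bonds."""
    return lambda r: r.disulfide_bonds >= n


def has_structure() -> Clause:
    """Clause: at least one solved PDB structure."""
    return lambda r: r.pdb_structures > 0


def all_of(*clauses: Clause) -> Clause:
    """Boolean AND over a group of clauses."""
    return lambda r: all(c(r) for c in clauses)


def any_of(*clauses: Clause) -> Clause:
    """Boolean OR over a group of clauses."""
    return lambda r: any(c(r) for c in clauses)


def search(records: Iterable[ToxinRecord], clause: Clause) -> List[ToxinRecord]:
    """Return the records matching the combined clause, in input order."""
    return [r for r in records if clause(r)]
```

A query such as "mass between 3000 and 4000 Da AND (at least three disulfide bonds OR a solved structure)" would then be written as `all_of(mass_between(3000, 4000), any_of(min_disulfides(3), has_structure()))`.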
Rapid BLAST searches based on the mature toxin sequence ‘only’ can now be performed directly from the toxin card. This option provides a mechanism to enrich alignment results for mature toxin sequences. The option to perform a BLAST search using the entire toxin sequence, which may include signal and propeptide regions, is still available. BLAST results are formatted in HTML and contain links to ArachnoServer toxin cards and the corresponding UniProtKB records where available.
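To make the distinction between the two query types concrete, the sketch below trims a precursor sequence down to its mature region and formats it as a FASTA query. It is a simplified illustration, not the ArachnoServer code: it assumes that the signal peptide and propeptide are both N-terminal and that their lengths are already annotated. The function names and parameters are hypothetical.

```python
def mature_sequence(precursor, signal_len, propeptide_len):
    """Return the mature toxin region of a precursor sequence.

    Assumes the signal peptide comes first, followed by the propeptide,
    followed by the mature toxin (no C-terminal processing).
    """
    if signal_len < 0 or propeptide_len < 0:
        raise ValueError("region lengths must be non-negative")
    start = signal_len + propeptide_len
    if start >= len(precursor):
        raise ValueError("annotated regions leave no mature sequence")
    return precursor[start:]


def to_fasta(name, sequence, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"
```

Submitting only the mature region avoids hits that are driven by similarity in the signal or propeptide regions rather than in the toxin itself.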
ArachnoServer includes a browse feature that initially enabled toxin records to be located on the basis of ‘Araneae Taxonomy, Molecular Targets and Posttranslational Modifications’. Each category creates a different browsing tree on the right-hand side of the screen for easy selection of toxins. Toxins can now also be located on the basis of their ‘Phyletic Specificity’, that is, the range of organisms against which they are active. Choosing to browse by phyletic specificity creates a tree of organisms that includes five taxonomic levels: domain, class, order, genus and species. Selecting, for example, the class Insecta will display all toxins with reported insecticidal activity. This new browse feature complements the ability to search for specific types of biological activity using the Advanced Search feature.
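A browsing tree of this kind can be sketched as a nested structure in which every taxon node records the toxins active against any organism beneath it. The example below is an assumption-laden illustration rather than the ArachnoServer data model: `TaxonNode`, `build_tree` and `browse` are hypothetical names, and each lineage is given as a tuple ordered from domain down to species.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass
class TaxonNode:
    toxins: Set[str] = field(default_factory=set)
    children: Dict[str, "TaxonNode"] = field(default_factory=dict)


def build_tree(specificity: Dict[str, Iterable[Tuple[str, ...]]]) -> TaxonNode:
    """Build a browse tree from toxin name -> lineages (domain to species)."""
    root = TaxonNode()
    for toxin, lineages in specificity.items():
        for lineage in lineages:
            node = root
            for taxon in lineage:
                node = node.children.setdefault(taxon, TaxonNode())
                # Record the toxin at every level so that selecting a
                # higher taxon lists all toxins active within it.
                node.toxins.add(toxin)
    return root


def browse(root: TaxonNode, path: Iterable[str]) -> List[str]:
    """Return the sorted toxin names found under the selected taxon path."""
    node = root
    for taxon in path:
        node = node.children[taxon]  # KeyError if the taxon is not in the tree
    return sorted(node.toxins)
```

With this layout, selecting the path `["Eukaryota", "Insecta"]` returns every toxin annotated against any insect order, genus or species, which mirrors the behaviour described for the Insecta example.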
We have created private silos, available upon request, for researchers actively involved in discovery and characterization of spider-venom toxins. These silos provide secure repositories for groups of researchers, enabling them to enter and manage their toxin sequences. Within a private silo, the nominated curators have access (via secure login) to the full suite of ArachnoServer curation tools and features, including the ability to securely BLAST their toxin sequences against the public ArachnoServer database. Toxin records within private silos remain strictly confidential until a researcher decides to release a toxin card to the ArachnoServer curators. We anticipate that private silos will not only help toxinologists manage their intellectual property but will also enhance the quality of ArachnoServer records by ensuring that the initial curation is done by the researchers who discovered the toxin.
A detailed user manual is now available for download from the ArachnoServer website. The manual details the kind of data stored in ArachnoServer and how it is curated. The user manual explains all of the features available within ArachnoServer and it provides examples of browsing, advanced searches and BLAST searches.
ArachnoServer 2.0 currently manages 1078 protein sequences, 759 nucleic acid sequences and 56 protein structures, an increase from 567, 334 and 51, respectively, in version 1.0. Overall, this represents the largest single collection of spider toxin records in any online database. In addition, version 2.0 contains many more high-resolution images of the spiders from which toxins in the database have been sourced. All images are downloadable and are freely available for academic use under a Creative Commons noncommercial license.
ArachnoServer was designed to be useful to scientists across a broad range of disciplines, including pharmacologists, neuroscientists, medicinal chemists, toxinologists, structural biologists and clinicians. Version 2.0 of the database increases its utility by including many new features, a large increase in the number of curated protein and nucleic acid sequences and the most up-to-date information available on proteinaceous spider toxins.
Australian Research Council (Discovery Grants DP0774245 and DP0878450 to G.F.K.). Funding for open access charge: Australian Research Council.
Conflict of interest statement. None declared.
The authors acknowledge financial support from the Australian Research Council, and thank Dr Florence Jungo at the Swiss Institute of Bioinformatics for many helpful discussions and Bastian Rast for supplying high-resolution spider photographs.