BMC Genomics. 2009; 10: 469.
Published online Oct 12, 2009. doi:  10.1186/1471-2164-10-469
PMCID: PMC2768748
Tardigrade workbench: comparing stress-related proteins, sequence-similar and functional protein clusters as well as RNA elements in tardigrades
Frank Förster,#1 Chunguang Liang,#1 Alexander Shkumatov,#2 Daniela Beisser,1 Julia C Engelmann,1 Martina Schnölzer,3 Marcus Frohme,4 Tobias Müller,1 Ralph O Schill,5 and Thomas Dandekar1 (corresponding author)
1Dept of Bioinformatics, Biocenter University of Würzburg, 97074 Würzburg, Germany
2EMBL, Hamburg Outstation, Notkestrasse 85, 22603 Hamburg, Germany
3Functional Proteome Analysis, German Cancer Research Center, Im Neuenheimer Feld 580, 69120 Heidelberg, Germany
4University of Applied Sciences, Bahnhofstraße 1, 15745 Wildau, Germany
5Dept of Zoology, Institute for Biology, Universität Stuttgart, 70569 Stuttgart, Germany
#Contributed equally.
Frank Förster: frank.foerster/at/biozentrum.uni-wuerzburg.de; Chunguang Liang: liang/at/biozentrum.uni-wuerzburg.de; Alexander Shkumatov: ashkumat/at/embl-hamburg.de; Daniela Beisser: daniela.beisser/at/biozentrum.uni-wuerzburg.de; Julia C Engelmann: julia.engelmann/at/klinik.uni-regensburg.de; Martina Schnölzer: m.schnoelzer/at/dkfz.de; Marcus Frohme: mfrohme/at/tfh-wildau.de; Tobias Müller: tobias.mueller/at/biozentrum.uni-wuerzburg.de; Ralph O Schill: ralph.schill/at/bio.uni-stuttgart.de; Thomas Dandekar: dandekar/at/biozentrum.uni-wuerzburg.de
Received April 14, 2009; Accepted October 12, 2009.
Background
Tardigrades represent an animal phylum with extraordinary resistance to environmental stress.
Results
To gain insights into their stress-specific adaptation potential, major clusters of related and similar proteins are identified, and specific functional clusters are delineated by comparing all tardigrades and individual species (Milnesium tardigradum, Hypsibius dujardini, Echiniscus testudo, Tulinus stephaniae, Richtersius coronifer); functional elements in tardigrade mRNAs are also analysed. We find that 39.3% of the total sequences cluster in 58 clusters of more than 20 proteins. Among these are ten tardigrade-specific as well as a number of stress-specific protein clusters. Tardigrade-specific functional adaptations include strong protein, DNA and redox protection, maintenance and protein recycling. Specific regulatory elements, such as lox P DICE elements, regulate tardigrade mRNA stability, whereas 14 other RNA elements of higher eukaryotes are not found. Further features of tardigrade-specific adaptation are rapidly identified by sequence and/or pattern search on the web tool tardigrade analyzer http://waterbear.bioapps.biozentrum.uni-wuerzburg.de. The workbench offers nucleotide pattern analysis for promoter and regulatory element detection (tardigrade specific; nrdb) as well as rapid COG search for function assignments, including species-specific repositories of all analysed data.
Conclusion
Different protein clusters and regulatory elements implicated in tardigrade stress adaptations are analysed including unpublished tardigrade sequences.
Tardigrades are small metazoans resembling microscopic bears ("water-bears", 0.05 mm to 1.5 mm in size) and live in marine, freshwater and terrestrial environments, especially in lichens and mosses [1-3]. They are a phylum of multi-cellular animals capable of reversibly suspending their metabolism and entering a state of cryptobiosis [4,5]. A dehydrated tardigrade, in the so-called anhydrobiotic tun stage [6,7], can survive for years without water. Moreover, the tun is resistant to extreme pressures and temperatures (low/high), as well as to radiation and vacuum [8-13].
Well-known species include Hypsibius dujardini, which is an obligatory parthenogenetic species [14]. The tardigrade H. dujardini can be cultured continuously for decades and can be cryopreserved. It has a compact genome, a little smaller than that of Caenorhabditis elegans or Drosophila melanogaster, and the rate of protein evolution in H. dujardini is similar to that of other metazoan taxa [15]. H. dujardini has a short generation time of 13-14 days at room temperature. Embryos of H. dujardini have a stereotyped cleavage pattern with asymmetric cell divisions, nuclear migrations, and cell migrations occurring in reproducible patterns [15]. Molecular data are sparse but include the purinergic receptor occurring in H. dujardini [16].
Milnesium tardigradum is an abundant and ubiquitous terrestrial tardigrade species in Europe and possibly worldwide [17]. It has unique anatomy and motion characteristics compared to other water bears. Whereas most water bears prefer vegetarian food, M. tardigradum is more carnivorous, feeding on rotifers and nematodes. The animals are robust and long-lived, which is one of the reasons why M. tardigradum is one of the best-studied species so far.
Questions of general interest are: How related are tardigrade proteins to each other? Which protein families provide tardigrade-specific adaptations? Which regulatory elements influence mRNA stability? Starting from all published tardigrade sequences as well as 607 unpublished new sequences from Milnesium tardigradum, we analyse tardigrade-specific clusters of related proteins, functional protein clusters and conserved regulatory elements in mRNA, mainly those involved in mRNA stability. The different clusters and identified motifs are analysed and discussed. All data are also available as a first anchor to study specific adaptations of tardigrades in more detail (Tardigrade workbench). Furthermore, the tardigrade analyzer, a sequence server to analyse individual tardigrade-specific sequences, is made available. It will be regularly updated to include new tardigrade sequences. It has a number of new features for tardigrade analysis not available from standard servers such as the NIH Entrez system [18]: several new species-specific searches (Echiniscus testudo, Tulinus stephaniae), additional new sequence information (M. tardigradum) and pattern searches for nucleotide sequences (including pattern search on the non-redundant protein database, NRDB). An easy search for clusters of orthologous groups (COG, [19]), different from the COGnitor tool [20] and allowing tardigrade-specific COG and eukaryotic COG (KOG) searches, is also available.
Furthermore, a batch mode allows a rapid analysis of up to 100 sequences simultaneously when uploaded in a file in FASTA format (for tardigrade species or NRDB).
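The batch limit described above can be illustrated with a short sketch. It is not the server's own code (the workbench is implemented in Perl); the function names `parse_fasta` and `check_batch` are hypothetical, and only the limit of 100 sequences is taken from the text.

```python
from typing import Dict, List

MAX_BATCH = 100  # batch limit stated in the text


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {identifier: sequence}."""
    records: Dict[str, List[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            # Use the first word of the header as identifier (an assumption).
            current = fields[0] if fields else f"seq{len(records) + 1}"
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def check_batch(text: str) -> Dict[str, str]:
    """Return the parsed records, or raise if the batch is too large."""
    records = parse_fasta(text)
    if len(records) > MAX_BATCH:
        raise ValueError(f"{len(records)} sequences exceed the limit of {MAX_BATCH}")
    return records
```

A client could run such a check locally before uploading a file, so that oversized batches are split beforehand.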
Two fifths of the tardigrade sequences cluster in larger protein families, and we hypothesise for a number of these that they are implicated in the unique stress adaptation potential of tardigrades. We also find ten tardigrade-specific clusters. The unique tardigrade adaptations are furthermore indicated by a number of functional COGs and KOGs identified here, showing a particular emphasis on the protection of proteins and DNA. RNA readout is specifically regulated by several motifs for mRNA stability that are clearly overrepresented in tardigrades.
We analysed all publicly available tardigrade sequences (status 9th of April 2009) as well as 607 unpublished M. tardigradum sequences from our ongoing transcriptome analysis.
Major tardigrade protein clusters of related sequence-similar proteins
All available tardigrade sequences were clustered by the CLANS algorithm [21]. Interestingly, 39.3 % of the predicted proteins (mainly EST-based predictions) cluster in just 58 major families, each with at least 20 sequences [see additional file 1: Table S1]. These include 4,242 EST sequences from a total of 10,787.
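As a quick check, the clustered fraction follows directly from the two counts quoted above. The variable names are illustrative only.

```python
# Counts quoted in the text above.
clustered_sequences = 4242   # sequences falling into the 58 major families
total_sequences = 10787      # all tardigrade sequences in the database

clustered_fraction = clustered_sequences / total_sequences
print(f"{clustered_fraction:.1%} of the sequences fall into major families")
```

The result rounds to the 39.3% stated in the text.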
Using these clusters, a number of tardigrade-specific adaptations become apparent (Table 1 [and additional file 1: Table S1]): the clusters include elongation factors (cluster 12), ribosomal RNAs and proteins (cluster 1, 4, 32 and 56) which are part of the transcriptional or translational machinery. Cluster 5 (chitinase binding domain [22]) could provide membrane and structural reorganization or immune protection (e.g. fungi) according to homologous protein sequences characterized in other organisms. Other clusters show protein families related to the tardigrade stress adaptation potential, e.g. ubiquitin-related proteins (cluster 14; maybe stress-induced protein degradation) and cytochrome oxidase-related proteins (cluster 2, suggested to be involved in respiratory chain).
Table 1
CLANS clusters of sequence similar proteins in published tardigrade sequences1
Moreover, proteins responsible for protein degradation (cluster 15) were found as well as proteins regulating peptidases (cluster 16). Cluster 23 consists of 53 heat shock proteins which are involved in many stress response reactions [23]. Few diapause specific proteins (cluster 24) are known from other animals. Diapause is a reversible state of developmental suspension. It is observed in diverse taxa, from plants to animals, including marsupials and some other mammals [24] as well as insects (associated molecular function varies but involves calcium channel inhibition [25]) and should here support the tun formation or regulate other (e.g. developmental) metabolic inactive states. Furthermore, proteins involved in storage or transportation of fatty acids also seem to be important (cluster 31, [26]). Late embryogenesis abundant (LEA) protein expression seems to be linked to desiccation stress and the acquisition of desiccation tolerance in organisms [27] e.g. nematodes [28,29] and rotifers [30]. Thirty-one LEA type 1 family proteins were found in cluster 38.
LEA proteins are widespread among plants and are synthesized in response to certain stresses [31,32]. The LEA type 1 family is well known in higher plants (rice, maize, carrots), where it is synthesized during late embryogenesis and in the ABA stress response. It includes the desiccation-related protein PCC3-06 of Craterostigma plantagineum. The LEA type 1 family also occurs in bacteria (e.g. Haemophilus influenzae, Deinococcus radiodurans), but is atypical for animals. Tardigrades, however, are an animal example where the LEA type 1 family is well represented and forms a full cluster.
Moreover, ten clusters (8, 18, 19, 30, 33, 35, 37, 42, 51, 55) consist of proteins which seem to be specific for tardigrades. These show no significant homology to known proteins.
Functional clusters of stress-specific adaptations present in tardigrades
To gain a systematic overview of the tardigrade functions involved, all available tardigrade sequences were classified per species according to COG functional category [19,20] as well as according to COG number and molecular function encoded. Note that in this section "protein" implies one type of protein. A COG or KOG often comprises several sequences from different tardigrades. Prokaryotic (COG) and eukaryotic (KOG) gene clusters were compared (Table 2; details on the web at http://waterbear.bioapps.biozentrum.uni-wuerzburg.de/). Again, several tardigrade-specific adaptations stand out, e.g. highly represented COGs cover translation elongation factors, sulfate adenylate transferase and a strong ubiquitin system. There are many cysteine proteases (21 proteins). For redox protection there are 14 thioredoxin-domain containing proteins and 75 heme/copper-type cytochrome/quinol-like proteins as well as ubiquinone oxidoreductase subunits (15 proteins). There are ten proteins involved in selenocysteine-specific translation [33,34]. In eukaryotes, selenoproteins show a mosaic occurrence: some organisms, such as vertebrates and algae, but notably also tardigrades, have dozens of these proteins, while other organisms, such as higher plants and fungi, have lost all selenoproteins during evolution [34]. Membrane GTPases (25 proteins) are often of the LepA (leader peptidase [35]) type in tardigrades. In general, members of the GTPase superfamily regulate membrane signaling pathways in all cells. However, LepA, as well as NodO, are prokaryotic-type GTPases very similar to protein synthesis elongation factors but apparently have membrane-related functions [35]. It is interesting to observe this prokaryotic-type GTPase in tardigrades.
We suggest that it has a function similar to that known in other organisms and thus ensures protein translation (elongation factor) coupled to membrane integrity and possibly cytoskeletal rearrangement, which would again boost the tardigrade resistance to stress.
Table 2
Highly represented protein functions in Tardigrades (COGs and KOGs).
The KOGs show similar highly represented families and adaptations. Aberrant proteins are rapidly recognized by ubiquitination-like proteins (220 proteins) and ubiquitin-ligase related enzymes (71 proteins) as well as proteasome regulatory subunits (85 proteins). For protein protection and refolding, disulfide isomerases (26 proteins) and cyclophilin type peptidyl-prolyl cis-trans isomerases (43 proteins; KOG 0879-0885) are available. Connected to redox protection are also thirty AAA+ type ATPases and three peroxisome assembly factor 2 containing proteins (KOG0736). This broad effort in protein protection is further supported by molecular chaperones (HSP70, mortalins and others; total of 50 proteins) and chaperonin complex components (32 proteins; KOG0356-0364). There are six superoxide dismutases and six copper chaperones for thioredoxins (37 proteins), glutaredoxin-like proteins (nine) and ten thiodisulfide isomerases as well as 52 glutathione-S-transferases. We found 22 hits to helicases. Tardigrade DNA protection is represented by 52 proteins of the molecular chaperone DNA J family: proteins of the DNA J family are classified into three types according to their structural domain composition. Type I J proteins are composed of the J domain, a gly-rich region connecting the J domain and a zinc finger domain, and possibly a C-terminal domain. Type II lacks the Zn-finger domain and type III contains only the J domain [36,37]. The latter two are referred to as DnaJ-like proteins. Analysis of the domains present in tardigrade proteins by SMART [38] and Pfam [39] searches reveals only the J domain and in some cases a transmembrane region, identifying them as type III DnaJ-like proteins. For further information on these COGs/KOGs see Table 3.
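The type I/II/III rule described above can be written down as a small decision function. This is only a sketch of the classification logic as stated in the text. The domain labels (`"J"`, `"gly_rich"`, `"zinc_finger"`, `"transmembrane"`) and the function name `classify_dnaj` are hypothetical and do not correspond to SMART or Pfam identifiers.

```python
from typing import Set


def classify_dnaj(domains: Set[str]) -> str:
    """Assign a J protein type from the set of domains detected in a protein.

    Hypothetical labels: "J", "gly_rich", "zinc_finger", "transmembrane".
    """
    if "J" not in domains:
        return "not a J protein"
    if "gly_rich" in domains and "zinc_finger" in domains:
        return "type I"    # J domain, gly-rich linker and zinc finger
    if "gly_rich" in domains:
        return "type II"   # like type I but without the zinc finger
    # Everything else with a J domain is treated as type III here;
    # this fallback is a simplifying assumption of the sketch.
    return "type III"


# A tardigrade-like case from the text: J domain plus a transmembrane region.
print(classify_dnaj({"J", "transmembrane"}))
```

In this sketch an additional transmembrane region does not change the assignment, which matches the description of the tardigrade proteins as type III DnaJ-like proteins.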
Table 3
Identified DnaJ-family COGs/KOGs in Tardigrades and Milnesium tardigradum1.
Moreover, undesired proteins can be rapidly degraded by cathepsin F-like proteins (31 proteins) or L-like proteins (46 proteins). There are several calcium-dependent protein kinases (25 proteins; KOG0032-0034) and actin-bundling proteins. According to this observation calcium signaling should be implicated in adaptive rearrangement of the cytoskeleton during tardigrade rehydration. The cytoskeleton is a key element in the organisation of eukaryotic cells. It has been described in the literature that the properties of actin are modulated by small heat-shock proteins including a direct actin-small heat-shock protein interaction to inhibit actin polymerization to protect the cytoskeleton [40,41] (compare with the CLANS cluster 24 (Diapause proteins) found in the above analysis).
Translation in tardigrades includes polypeptide release factors (71 proteins) and proteins for translation elongation (77 proteins). There are about 80 GTP-binding ADP-ribosylation factors. The secretion system and Rab/Ras GTPases are fully represented (183 proteins). Seventeen tubulin anchor proteins show that the cytoskeleton is well maintained. Finally, we find 14 TNF-associated factors and 34 apolipoprotein D/lipocalin proteins.
Typical motifs in tardigrade mRNAs
The regulatory motif search showed a number of known regulatory RNA elements involved in tardigrade mRNA regulation (Table 4 for H. dujardini and M. tardigradum). It cannot be formally ruled out that some of these elements work in a tardigrade-modified way. Similarly, there are probably further patterns which are tardigrade specific but are not detected by the UTRscan software [42] applied for the analysis.
Table 4
Regulatory elements in Hypsibius dujardini1 and Milnesium tardigradum2 mRNA sequences.
The RNA elements found include the lox-P DICE element [43] in H. dujardini as top hit with as many as 1,269 ESTs (23.6% of all H. dujardini EST sequences). The cytidine-rich 15-lipoxygenase differentiation control element (15-LOX DICE, [44]) binds KH domain proteins of the type hnRNP E and K (stronger in multiple copies), mediating mRNA stabilization and translational control [43].
Furthermore, a high number of mRNAs contains K-Boxes (cUGUGAUa, [45]) and brd Boxes (AGCUUUA, [46]). All these elements are involved in mRNA storage and mRNA stability. These two elements are potential targets for miRNAs as shown in Drosophila melanogaster [47].
However, in the two tardigrade species compared, only 16 of 30 well known RNA elements are found, suggesting a clear bias in tardigrade mRNA regulation. For example, the widely used AU rich elements in higher organisms [42] such as vertebrates are absent in tardigrades [see additional file 1].
Regulatory elements in tardigrade mRNA are probably important for their adaptation, in particular to support transformation to tun stage and back to active stage again. The list of RNA elements found can be compared for instance to our data on regulatory elements in human anucleate platelets [48] where mRNAs have to be stockpiled for the whole life of the platelet. Due to this comparatively long life, a long mRNA untranslated region is important in these cells. The same should apply to tardigrade mRNAs, since their average UTR is predicted to be long. A different stock-piling scenario occurs in unfertilized eggs, but due to developmental constraints, here localization signals are often in addition important for developmental gradients. We tested for these in tardigrades but did not find a high representation of localization motifs.
Web-tool tardigrade analyzer
We created a convenient platform for rapid comparison of new protein sequences, in particular from new sequencing efforts in tardigrades, against our database. It applies rapid heuristic local alignment using BLAST [49] and allows searches to be restricted to selected species.
A batch mode allows the analysis of up to 100 sequences simultaneously when uploaded in a file in FASTA format. Output data are displayed in an enhanced BLAST output format with graphical illustrations. Searches against our tardigrade-specific databases yield low E-values: a smaller, more specific database reduces the probability of false positives. As an alternative for general sequence analysis, a search against the non-redundant database of GenBank can be performed. This takes more computational power and yields higher E-values; however, it identifies functions for most sequences. An additional useful feature is to scan all available data for peptide motifs or PROSITE signatures using a "pattern" module [additional file 1: Fig. S1] or to assign potential functions by COGs [19]. The former is helpful to recognize tardigrade proteins in cases where the tardigrade sequence has diverged far and only residues critical for function are still conserved as motif signatures. It can also be applied to search for regulatory RNA motifs such as polyadenylation sites (e.g. AAUAAA or AAUUAA) or to recognize promoter modules such as the glucocorticoid receptor element (GRE; palindromic pattern: AGAACAnnnTGTTCT). For this purpose, both the tardigrade sequences and the non-redundant database can be searched (e.g. to look for stress-specific regulatory RNA elements; [additional file 1, Fig. S2]).
Interestingly, this nucleotide (RNA or DNA) specific option is not available on some common servers, e.g. the PHI-BLAST [50] server at NIH. Further options include a user-defined database [additional file 1: Fig. S3] and interactively animated stress clusters (Figure 1).
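A nucleotide pattern search of the kind described above can be sketched with regular expressions. The example below is not the server's pattern module; it assumes IUPAC ambiguity codes (so that `n` matches any base) and treats U and T as equivalent. The function names and the two test sequences are made up for illustration.

```python
import re
from typing import List

# IUPAC nucleotide codes mapped to regex character classes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def to_dna(sequence: str) -> str:
    """Upper-case the sequence and write RNA (U) as DNA (T)."""
    return sequence.upper().replace("U", "T")


def find_sites(pattern: str, sequence: str) -> List[int]:
    """Return 0-based start positions of all (also overlapping) matches."""
    regex = "".join(IUPAC[base] for base in to_dna(pattern))
    # A lookahead reports overlapping occurrences as well.
    return [m.start() for m in re.finditer(f"(?=({regex}))", to_dna(sequence))]


# Patterns quoted in the text; the sequences are invented examples.
gre_hits = find_sites("AGAACAnnnTGTTCT", "GGAGAACATCCTGTTCTAA")
polya_hits = find_sites("AAUAAA", "ccaauaaagg")
print(gre_hits, polya_hits)
```

Normalising both pattern and sequence to one alphabet lets the same function scan mRNA-style and DNA-style input.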
Figure 1
Functional clusters by CLANS of sequence related proteins in tardigrades. All available [see additional file 1: Figure S5] tardigrade protein sequences were clustered in a 3D sphere according to their sequence distance and were projected to the paper ...
The tool http://waterbear.bioapps.biozentrum.uni-wuerzburg.de/ allows rapid searches for tardigrade specific sequences, e.g. molecular adaptations against stress [see additional file 1 for screenshots and a tutorial]. For instance, a search for trehalase sequences shows no trehalase mRNA in the H. dujardini sequences. In contrast, there are several heat shock proteins in tardigrades; an example is HSP90 proteins (identified by sequence similarity as well as by a pattern hit based approach using the PROSITE entry PS00298 with the signature Y-x-[NQHD]-[KHR]-[DE]-[IVA]-F-[LM]-R-[ED]; Table 5). Specific COGs are also rapidly assigned for any desired sequence. This includes the option to map the query sequence of interest to any of the known tardigrade specific COGs. Furthermore, nucleotide patterns such as mRNA polyadenylation sites are rapidly identified e.g. in H. dujardini mRNAs [additional file 1: Fig. S4]. Similarly, other mRNA 3'UTR elements can be identified, e.g. AU rich sequences mediating mRNA instability or regulatory K-boxes (motif cUGUGAUa, [45]) in tardigrades.
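The pattern-based HSP90 search mentioned above can be illustrated by translating the PROSITE signature into a regular expression. This is a minimal sketch that handles only a subset of the PROSITE syntax (residues, `x`, `[...]`, `{...}`, repeat counts and anchors). The function name and the test peptide are illustrative assumptions, not part of the workbench.

```python
import re

ELEMENT = re.compile(
    r"(<?)(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern (subset of the syntax) into a regex string."""
    pattern = pattern.replace(" ", "").rstrip(".")
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, core, low, high, end = match.groups()
        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if start else "") + core + repeat + ("$" if end else ""))
    return "".join(parts)


# Signature of PROSITE entry PS00298 as quoted in the text.
hsp90_regex = prosite_to_regex("Y-x-[NQHD]-[KHR]-[DE]-[IVA]-F-[LM]-R-[ED]")

# Made-up peptide that contains a stretch matching the signature.
hit = re.search(hsp90_regex, "MAAYSNKEIFLRELIS")
print(hsp90_regex, hit.start() if hit else None)
```

Matching on a short signature rather than on full-length similarity is what allows strongly diverged sequences to be recognised, as noted in the description of the pattern module.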
Table 5
HSP90 proteins identified in Hypsibius dujardini using the Tardigrade analyzer1.
Implications
Tardigrades show a surprisingly large number of related sequences. Certainly, one has to correct for a few genes sequenced from many lineages for phylogenetic studies in tardigrades (cytochrome c, rRNA etc.). However, despite this, a number of tardigrade-specific clusters still remain. Furthermore, Table 1 shows that most of the annotated clusters are stress-related.
Looking at specific protein functions, both COG and KOG proteins show that tardigrades invest extraordinary effort in protein protection, turnover and recycling as well as redox protection. Some other specific adaptations also become apparent from Table 2, but the complete extent of these adaptations is unclear given the limited sampling of available tardigrade sequences. Furthermore, protection of DNA is critical, as it has been shown that tardigrade tuns accumulate DNA damage which first has to be repaired before resurrection occurs [51,52]. Taking this into consideration, DNA J proteins were investigated in more detail, since proteins of this family are well represented in tardigrades, including several COGs and KOGs. Several data underline the extremely high resistance of tardigrades to temperature, pressure and radiation as well as a high repair potential regarding DNA [11,51]. Thus, we suggest that the high repair potential is also mediated by this well represented protein family. Phylogenetic analysis (Table 3) shows that these proteins are represented by several KOGs as well as the classic COGs in tardigrades. In particular, the first three KOG families are also used in M. tardigradum, where extreme stress tolerance requires strong repair mechanisms [17]. Furthermore, all these tardigrade proteins in Table 3 are small, having neither zinc-finger domains nor low complexity regions, but instead consisting of single DNA J domains, which would always place them in type III (subfamily C) of DNA-J like proteins. This suggests that the direct interaction with DNA-J like proteins is the key molecular function.
Finally, we could show that there are 16 regulatory elements used in tardigrade mRNA, while a number of other elements known from higher eukaryotic organisms and vertebrates are not used. It is interesting to note that the elements often used in tardigrades are all involved in regulation of mRNA stability. Thus, they may be implicated in stage switching, as presumably in the initial phases of tun awakening or tun formation the supply of new mRNA is turned off and regulation of already synthesized mRNA becomes important instead.
In addition, and for further research, we supply the web tool tardigrade analyzer. There are a number of alternative tools available, e.g. from NCBI http://www.ncbi.nlm.nih.gov/. However, we offer some species-specific searches not available from these sources as well as RNA and promoter pattern search (not only for tardigrades but also for NRDB; not available from NIH). Furthermore, the server provides functional COG prediction, new, unpublished tardigrade sequences from M. tardigradum, all data reported above (including the reported sequences and detailed functional clusterings) and regular server updates. A better understanding of the survival mechanisms in these organisms will lead to the development of new methods in several areas of biotechnology. Examples include the preservation of biological materials in situ and of macromolecules and cells from non-adapted organisms [53]. This is, of course, only a first and very general overview of potential tardigrade-specific adaptations; more species-specific data will be considered as more information becomes available.
Conclusion
Tardigrade genomes invest in stress-specific adaptations. These include major sequence-related protein clusters, functional clusters for stress, and specific regulatory elements in mRNA. For further tardigrade genome analysis we offer the tardigrade workbench as a flexible tool for rapid and efficient analysis of sequence similarity, protein function and clusters, COG membership and regulatory elements.
Tardigrade-sequences
The cosmopolitan eutardigrade species M. tardigradum Doyère 1849 (Apochela, Milnesidae) was cultured. Tardigrades were kept and reared on petri dishes (diameter: 9.4 cm) filled with a small layer of agarose (3 %) (peqGOLD Universal Agarose, peqLAB, Erlangen, Germany) and covered with spring water (Volvic™ water, Danone Waters Deutschland, Wiesbaden, Germany) at 20 ± 2°C and a light/dark cycle of 12 h. Rotifers Philodina citrina and nematodes Panagrellus sp. were provided as food source; juvenile tardigrades were also fed with green algae Chlorogonium elongatum. For all experiments, adult animals in good physical condition were taken directly from the culture and starved for three days to avoid preparation of additional RNA originating from incompletely digested food in the intestinal system. For an overview of RNAs present in both the active and the tun stage, we used a mixture of the same number of animals from each stage.
Total RNA extraction was performed using the QIAGEN RNeasy® Mini kit (Qiagen, Hilden, Germany). cDNA was synthesized by reverse transcription of 1 μg total RNA using the Creator™ SMART™ cDNA Library Construction Kit (Clontech-Takara Bio Europe, France). The resulting cDNA was amplified following the manufacturer's protocol and cloned into the pDNR-Lib cloning vector. The resulting plasmids were used to transform Escherichia coli by electroporation. Sequencing of the cDNA library was done on an ABI 3730XL capillary sequencer (GATC Biotech AG, Konstanz, Germany). All obtained EST sequences were deposited with GenBank, including the dbEST databank.
Nucleotide sequences from other tardigrades were collected from GenBank. For H. dujardini, the best represented species, we compiled 5,235 ESTs. We stored H. dujardini as well as all published sequences of other tardigrade species (e.g. T. stephaniae, E. testudo, M. tardigradum, R. coronifer) in a database (10,787 sequences including translated sequences, details in [additional file 1], status: April 2009).
CLANS clustering
For a systematic overview of tardigrade-specific adaptations we first clustered all published tardigrade nucleotide sequences into functional clusters (Figure 1) using the Cluster analysis of sequences (CLANS) algorithm [21]. All sequences were clustered in 3D space using 0.001 as an E-value cut-off for TBLASTX all-against-all searches [additional file 1: Fig. S4].
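To make the role of the E-value cut-off concrete, the sketch below filters pairwise hits at 0.001 and groups connected sequences. It is a deliberately simplified stand-in: CLANS itself arranges sequences in 3D space from the pairwise similarities, whereas this example uses plain single-linkage grouping with a union-find structure. The sequence names and E-values are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

E_VALUE_CUTOFF = 1e-3  # cut-off stated in the text

Hit = Tuple[str, str, float]  # (query, subject, E-value)


def group_by_similarity(hits: Iterable[Hit],
                        cutoff: float = E_VALUE_CUTOFF) -> List[List[str]]:
    """Single-linkage grouping of sequences joined by hits with E <= cutoff."""
    parent: Dict[str, str] = {}

    def find(name: str) -> str:
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for query, subject, e_value in hits:
        root_q, root_s = find(query), find(subject)  # registers both names
        if e_value <= cutoff and root_q != root_s:
            parent[root_q] = root_s

    groups = defaultdict(set)
    for name in parent:
        groups[find(name)].add(name)
    # Largest groups first, ties broken alphabetically.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


toy_hits = [("a", "b", 1e-10), ("b", "c", 1e-5), ("c", "d", 0.5), ("e", "f", 1e-4)]
toy_groups = group_by_similarity(toy_hits)
print(toy_groups)
```

In the toy data, the hit between `c` and `d` fails the cut-off, so `d` stays on its own. Families of at least 20 members, as reported in the text, could then be selected by group size.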
Identification of regulatory elements
To identify regulatory elements, the ESTs of H. dujardini and M. tardigradum were systematically screened using the software UTRscan [42]. This software screens for 30 regulatory elements of RNA regulation, with a focus on 3' UTR elements and stability of mRNA. The default settings for batch mode were used and all reported elements were collected.
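How element counts and percentages per EST set are obtained can be illustrated with a much simpler stand-in for UTRscan. The sketch below only looks for exact occurrences of the K-box and Brd-box consensus strings quoted earlier in the article, matched case-insensitively over their full length (an assumption). The EST names and sequences are invented, and real UTRscan patterns are more elaborate.

```python
from typing import Dict, Tuple

# Consensus strings quoted in the article, written in the RNA alphabet.
MOTIFS = {"K-box": "CUGUGAUA", "Brd-box": "AGCUUUA"}


def screen_ests(ests: Dict[str, str]) -> Dict[str, Tuple[int, float]]:
    """Return {motif name: (number of ESTs with the motif, fraction of ESTs)}."""
    # Normalise to upper-case RNA so that DNA-style input also matches.
    normalised = {name: seq.upper().replace("T", "U") for name, seq in ests.items()}
    report = {}
    for motif_name, motif in MOTIFS.items():
        carriers = [name for name, seq in normalised.items() if motif in seq]
        fraction = len(carriers) / len(ests) if ests else 0.0
        report[motif_name] = (len(carriers), fraction)
    return report


toy_ests = {
    "est1": "GGCTGTGATAGG",       # contains a K-box
    "est2": "AAAGCUUUACC",        # contains a Brd-box
    "est3": "ACGTACGT",           # contains neither
    "est4": "ctgtgataAGCTTTA",    # contains both
}
toy_report = screen_ests(toy_ests)
print(toy_report)
```

Each EST is counted once per motif regardless of how many copies it carries, which corresponds to reporting the share of ESTs that contain an element.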
COG clustering and identification
In order to acquire a systematic overview of the functionalities, we used the latest version of the COG/KOG databases (ftp://ftp.ncbi.nih.gov/pub/COG), and the BLAST hits from both nucleotide search and protein search were clustered according to their COG ID. Searches were carried out in parallel on all the tardigrade species including M. tardigradum, H. dujardini, E. testudo, T. stephaniae and R. coronifer. The results are summarized in a table shown in the tardigrade analyzer; the background color from cold to warm (blue to red) indicates the cluster size, which enables an easy comparison. Moreover, users can click the COG ID and the hit number. The server then reports the corresponding sequence ID, description, conservation and the homologous entries recorded in the database. The server with its data is automatically updated bi-monthly according to the latest tardigrade databases.
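The grouping of BLAST hits by COG ID and species can be sketched as a small tally. This is an illustration under stated assumptions, not the workbench code: hits are assumed to arrive as (species, query ID, COG ID) tuples, and a query found by both the nucleotide and the protein search is counted once. The query IDs are invented; the two KOG numbers are taken from the article.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

CogHit = Tuple[str, str, str]  # (species, query ID, COG/KOG ID)


def count_cog_hits(hits: Iterable[CogHit]) -> Dict[str, Dict[str, int]]:
    """Return {COG ID: {species: number of distinct query sequences}}."""
    seen = set()
    table = defaultdict(Counter)
    for species, query_id, cog_id in hits:
        key = (species, query_id, cog_id)
        if key in seen:
            continue  # same query reported by both search types
        seen.add(key)
        table[cog_id][species] += 1
    return {cog_id: dict(counts) for cog_id, counts in table.items()}


toy_cog_hits = [
    ("M. tardigradum", "q1", "KOG0736"),  # from the nucleotide search
    ("M. tardigradum", "q1", "KOG0736"),  # same query, protein search
    ("H. dujardini", "q2", "KOG0736"),
    ("H. dujardini", "q3", "KOG0032"),
]
toy_table = count_cog_hits(toy_cog_hits)
print(toy_table)
```

The resulting per-COG, per-species counts are the kind of cluster sizes that a colour scale from blue to red could then display.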
Tardigrade workbench
The tardigrade workbench is implemented in Perl using the Bioperl modules [54]. NCBI BLAST version 2.2.17 is included in the software package. A PostgreSQL 8.1.9 database manages the tardigrade entries in order to accelerate the searches submitted by investigators. The COG cluster information is automatically updated each week and stored on the server. In addition, running the tardigrade workbench requires an Apache server; a Linux system with at least 2 GB memory is highly recommended.
FF did tardigrade protein data analysis including CLANS clustering and RNA motif analysis. CL established the current version of the tardigrade workbench including programming new routines, data management and nucleotide motif analysis. AS did the initial setup of the server, of the virtual ribosome and the CLANS clustering. DB, JE, MS and MF participated in tardigrade data analysis. TM gave expert advice and input on statistics, RS gave expert advice on tardigrade physiology and zoology. TD led and guided the study including analysis of data and program, supervision, and manuscript writing. All authors participated in the writing of the manuscript and approved the final version.
Supplementary Material
Additional file 1
Additional Tables and Figures. The file contains seven additional figures and two additional tables. One of these tables summarizes annotation and different identifiers for 607 new EST sequences from Milnesium tardigradum.
Acknowledgements
Stylistic corrections by Rosemary Wilson from EMBL Hamburg are gratefully acknowledged. Support by the state Bavaria, DFG (TR34A5) and the German Federal Ministry of Education and Research, BMBF (0313838A, 0313838B, 0313838C, 0313838D, 0313838E) is gratefully acknowledged.