A total of 47,926 sequences remained after clone sequencing and quality checking; these had a minimum transcript size of 100 bp and an average of 538 bp. Sequencing effort was concentrated on the normalized libraries, which comprised 91.2% of the dataset, with each library contributing between 21.5% and 24.9% (Table ). Only preliminary sequencing was performed on the subtractive libraries (approximately 1000 clones each), so these comprise just 8.8% of the EST sequences (Table ). The gene discovery and diversity rates of all libraries are relatively high (0.53–0.68 and 0.36–0.46, respectively), with the exception of the sbs04 subtractive library. Gene discovery is defined as the number of different "genes" each library contributed divided by library size, and gene diversity is defined as the number of singletons in each library divided by library size [22].
General statistics for the normalized libraries.
General statistics for the subtractive libraries.
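The two rates defined above can be illustrated with a minimal sketch. It assumes that each EST has already been assigned to a cluster (contig or singleton) and that a "singleton" is a cluster containing exactly one EST across the whole dataset; the function name, the tuple layout and the example identifiers are hypothetical and are not taken from the original analysis pipeline.

```python
from collections import Counter


def library_rates(assignments, library):
    """Return (gene_discovery, gene_diversity) for one library.

    assignments: list of (est_id, library_name, cluster_id) tuples.
    Assumption: a singleton is a cluster holding exactly one EST
    in the whole dataset.
    """
    # Size of every cluster across all libraries.
    cluster_sizes = Counter(cluster for _, _, cluster in assignments)

    # Clusters hit by the ESTs that belong to the requested library.
    clusters_in_library = [c for _, lib, c in assignments if lib == library]
    size = len(clusters_in_library)
    if size == 0:
        raise ValueError(f"no ESTs found for library {library!r}")

    # Gene discovery: distinct clusters ("genes") divided by library size.
    gene_discovery = len(set(clusters_in_library)) / size

    # Gene diversity: singleton ESTs divided by library size.
    singletons = sum(1 for c in clusters_in_library if cluster_sizes[c] == 1)
    gene_diversity = singletons / size
    return gene_discovery, gene_diversity


example = [
    ("est1", "RE", "c1"),
    ("est2", "RE", "c1"),
    ("est3", "RE", "c2"),
    ("est4", "FRE", "c3"),
    ("est5", "RE", "c3"),
]
discovery, diversity = library_rates(example, "RE")
```

In this toy input the RE library holds four ESTs that fall into three clusters, one of which is a singleton, so the two rates are 0.75 and 0.25.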
In spite of the subtractive process, the sbs04 library was still highly redundant, potentially indicating massive gene redundancy within mictic females and their associated resting eggs. Ten contigs comprise 27.4% of the library sequences. Three of the ten contigs (7.7% of the library) show no match to known sequences in the database and four contigs (12.9%) have a highly repeated amino acid structure, which cannot be ascribed to a particular gene or gene family (Table ). None of these highly repeated sequences closely resembles another, being only 36–40% identical at the nucleotide level. One contig (clone sbs04P0012K21) shows similarity to putative oxidoreductases. This classification is applied to all enzymes that have oxidoreductase functions; however, some act on superoxide radicals, which are produced during stressful situations (cf. resting eggs). Also of significant interest are the two other matches, to ferritin (pearl oyster) and hsp26 (Artemia urmiana). In studies on the crustacean Artemia franciscana, which forms cysts in response to adverse conditions, two proteins were shown to be present in large amounts in the cysts: hsp26 and artemin (a ferritin homologue) [23]. This situation is clearly mirrored in the sbs04 library and these transcripts represent potential markers for resting eggs.
Ten largest contigs in the sbs04 library (mictic females with resting eggs versus mixed stage population of clone 1B4) and the associated BLAST matches.
When similarity searches were run on the processed sequences from all the libraries, approximately 50% produced significant matches against the sequence databases (expect value below 1e-10) and can therefore be regarded as putative known genes. This percentage identification is much lower than that of a recently published EST library of B. plicatilis [21], in which 80% of sequences showed similarity to database entries.
However, the Suga library was relatively small (2,362 ESTs) and non-normalised, and a significant proportion of its sequences formed a single cluster encoding the small ribosomal sub-unit. In total, almost 23% of the 2,362 ESTs fell into 14 clusters, with matches ranging from cathepsin L to beta tubulin. Comparison of our ESTs with those of Suga et al. [21] using BLASTN (E value < 10^-10) showed that 93% of the ESTs in the Suga library were represented in our dataset.
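The E-value cut-off used in these comparisons can be applied to search results with a short script. The sketch below assumes the hits are available in the standard 12-column BLAST tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the function names and the example rows are hypothetical, and this is not the pipeline used to produce the figures above.

```python
import csv


def queries_with_hits(tabular_lines, max_evalue=1e-10):
    """Return the set of query IDs with at least one hit below max_evalue.

    tabular_lines: iterable of tab-separated lines in the 12-column
    BLAST tabular layout, where column 11 (index 10) is the E value.
    """
    hits = set()
    for row in csv.reader(tabular_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if float(row[10]) < max_evalue:
            hits.add(row[0])
    return hits


def fraction_represented(all_query_ids, tabular_lines, max_evalue=1e-10):
    """Fraction of all queries that have a significant hit."""
    ids = set(all_query_ids)
    if not ids:
        raise ValueError("no query IDs supplied")
    return len(queries_with_hits(tabular_lines, max_evalue) & ids) / len(ids)


example_rows = [
    "q1\ts1\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-50\t350",
    "q2\ts2\t80.0\t50\t10\t0\t1\t50\t1\t50\t0.001\t40",
]
share = fraction_represented(["q1", "q2", "q3", "q4"], example_rows)
```

Only q1 passes the threshold in this toy input, so one of the four queries (0.25) counts as represented.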
The main objective of this EST project was to develop transcriptome resources for B. plicatilis that could be used in future global expression experiments. Therefore, the strategy of normalization and subtraction was used in library production. Although this should maximize the number of different transcripts obtained, it does mean that quantitative comparisons between libraries are not possible without further verification. Given this limitation, analyses were targeted at candidate genes involved in maintaining the stability and integrity of cell compartments and macromolecules, as these are key factors for survival during dormancy. Searches were carried out using both BLAST and GO annotations and identified genes designated as, or involved in:
• Protection against reactive oxygen species (ROS) and detoxification: ROS are toxic in all life stages but they are especially problematic for dormant forms. In plant seeds desiccation causes loss of control mechanisms that maintain low ROS concentrations, thus the antioxidant activity has great importance [25].
• Maintaining the native folded conformation of proteins: changes in osmotic pressure, pH or temperature as well as desiccation all challenge protein conformation [26] and may cause the formation of cytotoxic protein aggregates.
• Late Embryogenesis Abundant (LEA) proteins: these have been shown to be involved in desiccation in a number of organisms [27].
• Trehalose biosynthesis: trehalose is well-known to be present in high concentrations in the dormant stages of various organisms [17] and small amounts have been found previously in B. plicatilis desiccated resting eggs [28].
• Aquaporins: these are transmembrane proteins that serve as channels for the transport of water and small soluble molecules [29] and have been found to be important for desiccation tolerance in seeds [30] and for freeze tolerance in yeast [31].
• Lipids and fatty acid metabolism: lipid metabolism is associated with hibernation in mammals [32] and the dauer form in nematodes [33]. Vitellogenins are lipoproteins forming the yolk proteins [34].
Protection against ROS and detoxification
A number of clones were identified as associated with the antioxidant activity GO term (GO:0016209), and the search was then narrowed to clones encoding glutathione S-transferases (GSTs) (Table ). These genes belong to a superfamily of multifunctional proteins with fundamental roles in cellular detoxification, participating in second-phase detoxification and the removal of xenobiotics after the action of P450 [36]. They are widespread among all organisms. In total, 129 putative transcripts for glutathione S-transferase (E values between 9.0e-17 and 2.0e-45) were found in all the normalized libraries and in the sbs04 library (seven contigs and five singletons). More in-depth analysis revealed that these 129 transcripts comprise 11 distinct putative genes (designated Bpa-gst-1, where Bpa stands for Brachionus plicatilis Atlit), which on sequence similarity searching appear to most closely match the alpha class of cytosolic GSTs. This is by far the most abundant of the cytosolic subfamilies, often comprising tens of members in each species [36] (cf. 44 annotated GSTs identified in C. elegans). Five of the putative rotifer GSTs show closest sequence matches to C. elegans genes, all of which are heavily documented in Wormbase with regard to expression and functional studies. Whilst those GSTs most similar to the rotifer transcripts all show expression responses to electrophilic stress [38], interestingly GST-5 occurred in an expression cluster of strongly regulated dauer genes (WBPaper00024393; [39]). Although functions of genes (even orthologues) differ between species, and this is particularly the case with multiple gene family members, the dauer is a stage of larval arrest in C. elegans, which could equate functionally to the resting-egg stage in the rotifer. Indeed, this gene was only found in the resting-egg library, making it a clear candidate for further investigation. No GSTs were found in the first three subtractive libraries, but this may not be surprising given the small sample size of the sequencing effort; alternatively, their number may have been reduced as a consequence of the subtractions.
Putative transcripts for members of the glutathione S-transferase family identified in the EST libraries.
Further searches for antioxidant enzymes identified 135 clones, which assembled into 11 putative transcripts coding for peroxiredoxins (E values between 10^-27) and thioredoxin peroxidase activity (E value of 10^-57) (data not shown). Members of these families were found in all normalized libraries and Bpa-trpx-6 additionally occurred in the sbs04 library associated with resting eggs (data not shown). Antioxidant activity is also associated with the enzyme phospholipid-hydroperoxide glutathione peroxidase, which protects membranes from oxidative stress by reducing the membrane hydroperoxides [40]. Twenty-nine clones were found to be associated with phospholipid-hydroperoxide glutathione peroxidase activity (GO:0047066) in the EST libraries; contig assembly produced two putative transcripts, which were named gpx1 and gpx2. The transcript gpx1 was only found in the MS and sbs01 libraries, whilst gpx2 was found in all the normalized libraries. BLAST results for the two transcripts were quite different: gpx1 matched mammalian glutathione peroxidase 3 (E value = 10^-42) and gpx2 matched phospholipid-hydroperoxide glutathione peroxidase of hydra and cattle tick (E value = 10^-20), and of mammals (E value = 10^-18), although both confer antioxidant protection. The presence of two genes indicates a duplication of the gpx genes in the rotifer.
Superoxide dismutases catalyze the conversion of superoxide radicals into hydrogen peroxide, preventing their conversion into the more active hydroxyl radical [25]. Five putative transcripts were found to be associated with superoxide dismutase activity (GO:0004784). Two transcripts show homology with the Mn-SOD (E value = 10^-111) previously described by [19]. Three other transcripts were found to be similar to Cu/Zn-SOD. Transcripts were found across several different libraries and so could be designated as ubiquitous. However, the previously identified Mn-SOD of B. plicatilis was found to be over-expressed in rotifers with an extended life span resulting from caloric restriction [19]. Similarly, in C. elegans the DAF pathway (insulin, dauer associated) is also linked to caloric restriction and increased lifespan. Therefore these genes clearly have other roles in addition to putative functions associated with desiccation.
Maintaining the native folded conformation of proteins
Changes in environmental conditions (e.g. osmotic pressure, pH, temperature and desiccation) challenge protein structure and may cause the formation of cytotoxic protein aggregates and induce the production of "stress" proteins [41]. Therefore, desiccation-tolerant resting eggs need mechanisms for coping with the denaturation and aggregation of proteins. The classical cellular response to this type of stress is the induction of "heat shock" or chaperone proteins [42], which facilitate the disaggregation of proteins and their refolding to native conformation, and/or the production of small heat shock proteins, which prevent initial protein aggregation [26].
BLAST searches revealed 10 putative transcripts (6 contigs and 4 singletons) with matches to the HSP70 superfamily. Further analysis narrowed this to 6 putative genes, as four of the sequences were potentially non-overlapping sections of the same genes (Table ). Of the six putative genes, three showed significant sequence similarity to the classical stress-inducible HSP70 gene (Bpa-hsp70-1, Bpa-hsp70-3 and Bpa-hsp70-6). The best database match to this gene was from Microplitis mediator, a hymenopteran parasitoid, and interestingly the publication annotation associated with this entry [Swiss-Prot:A8D4R0] indicates that this gene is associated with diapause. All other putative genes are HSP70 family members and, although the functional annotation is variable, all are potentially involved in the stress response. HSPA9 (Bpa-hsp70-4) is additionally implicated in the control of cell proliferation and cellular aging, whilst GRP170 (Bpa-hsp70-5) has a pivotal role in cytoprotection, specifically triggered in response to hypoxia [45], both factors which are almost certainly associated with resting-egg formation. None of the rotifer sequences showed any significant similarity to the rotifer HSP70 sequence previously isolated [DDBJ:AB076052], which is most similar to the constitutive form of this family (HSC70) and has been shown to be expressed during population growth [18]. Members of the HSP70 family were found in all normalized libraries and in two of the subtracted libraries.
Putative transcripts for members of the HSP70 family identified in the EST libraries.
Although members of the HSP70 family are regarded as the classical cellular stress response, the small heat shock proteins are increasingly being identified as having a pivotal role in survival under stressful conditions and metabolic arrest [23]. Encysted embryos of Artemia franciscana have been shown to contain substantial amounts of HSP26 [46] along with a ferritin homologue [24], with both molecules acting as chaperones to prevent protein aggregation.
A search for small heat-shock proteins revealed five putative transcripts (5 contigs). One primarily matched an α-crystallin protein (Ornithodoros parkeri, E value = 7·10^-13), but this is not surprising as α-crystallin domains are characteristic of small heat shock proteins [26], and indeed all the deduced amino acid sequences of the putative rotifer small HSPs described here contain a conserved α-crystallin domain. This first transcript was found exclusively in the normalized libraries containing resting eggs (RE and FRE). Four additional transcripts were identified in the subtractive sbs04 library (Table ). Overall, small HSP transcripts were highly represented in the sbs04 library, comprising 55 clones out of 1203 (~4.5%), and significantly, this was the only subtracted library to contain resting eggs. The sequence similarity of the putative rotifer small HSPs to small HSPs in the databases was low, in the region of 30% identity, but the small HSPs, contrary to the situation with HSP70, are not highly conserved between species. For example, comparing sequences from C. briggsae to C. elegans and the pink hibiscus mealy bug [Swiss-Prot:A2I3W3] produces 28.6% amino acid identity/46% amino acid similarity and 29.2% identity/42.6% similarity, respectively. Given this lack of conservation, and that BLAST matches of the rotifer sequences were exclusive to other small HSPs, it is reasonable to assume that putative genes coding for small heat shock proteins are found in rotifers, particularly in resting eggs.
Putative transcripts for members of the small heat shock family identified in the EST libraries.
Regarding additional candidates for further investigation in resting-egg stage gene expression, a number of other heat shock proteins were identified (HSP60 and HSP80-100). Induction of the HSP60 protein was previously shown in B. plicatilis in response to various environmental pollutants [20], and also in Plationus patulus in response to arsenic and heavy metal exposure [49]; these are therefore potential "stress" proteins. Eleven putative transcripts with matches to HSP60 were found, as were putative transcripts with matches to other high molecular weight heat-shock proteins (HSP80-100). These candidates were found in all of the normalized libraries.
Late Embryogenesis Abundant (LEA) proteins
LEA proteins were originally identified in plant seeds during the late stages of embryonic development and are associated with desiccation tolerance throughout the life cycle of all major plant taxa [50]. They comprise a protein family with three major groups (Groups 1–3). They have also been found in non-plant species and to date almost all non-plant LEA proteins belong to Group 3 [51]. LEAs have been found in the nematode Aphelenchus avenae, bdelloid rotifers [53] and desiccated A. franciscana. The exact function of LEA proteins is as yet unknown, but their importance in desiccation and stress tolerance has been comprehensively demonstrated. For example, silencing of the lea gene in C. elegans dauer juveniles caused a significant reduction in worm survival during induction of desiccation and under osmotic and heat stresses [56]. LEA proteins were found to prevent protein aggregation in vitro. Also, in vivo experiments using Aphelenchus avenae LEA proteins introduced into human cell lines demonstrated that these proteins played a role in anti-aggregation and protein stabilisation during desiccation procedures [58].
Three transcripts matching Group 3 LEA proteins on BLAST sequence similarity analyses were identified (E values in the range of 1E-11). These have been designated bpa-lea-1, bpa-lea-2 and bpa-lea-3 (Table ). A rooted neighbour-joining (NJ) tree was produced using translations of these transcripts with canonical plant LEA proteins from all three major groups [59] and the metazoan LEA proteins of C. elegans, A. franciscana, P. vanderplanki and A. avenae (Figure 2). The putative rotifer genes were associated with the Group 3 protein family.
Putative transcripts for Late Embryogenesis Abundant (LEA) proteins identified in the EST libraries.
Figure 2. Rooted NJ tree of lea-like deduced proteins, LEA proteins of other invertebrates and canonical plant LEA proteins from the three major groups. The out-group used was the glucose starvation inducible protein of Bacillus subtilis (Accession No. 26907; defined ...
Trehalose biosynthesis
Trehalose is thought to play an important role in enhancing desiccation and stress tolerance [60]. For example, accumulation of trehalose has been shown in diapausing cysts of Artemia and also in the stress responses of nematodes [62]. Trehalose is synthesized from glucose, catalyzed by the enzymes trehalose-6-phosphate synthase (tps) and trehalose phosphatase [64]. Trehalose can comprise ~17% of the dry mass in Artemia undergoing desiccation [65] and small amounts (0.35% of dry weight) have previously been found in B. plicatilis desiccated resting eggs [28]. Also, a transcript [DDBJ: BJ979617] with high sequence similarity to the tps gene, encoding trehalose phosphate synthase, was previously identified in an EST library of B. plicatilis.
Ten ESTs (1 contig and 7 singletons) were identified in the different libraries for trehalose-6-phosphate synthase, but there was no particular association with the libraries containing resting eggs. In-depth analysis revealed that the ten ESTs could be assigned to three groups comprising non-overlapping regions of the tps gene. In spite of this fragmentation, it was possible to identify that two paralogues (Table ) were present and that the rotifer, like C. elegans, has a duplication of the tps gene. Other model organisms, such as the insects Drosophila melanogaster, Aedes aegypti and Anopheles gambiae, and baker's yeast, S. cerevisiae, possess only a single tps gene, but this may be a reflection of lifestyle and the requirement to survive stressful conditions. In support of this, phylogenetic analysis has shown adaptive selection operating on the glucose-6-phosphate branch point enzymes and adjacent pathways (including tps), with the conclusion that this evolutionary pressure has played a significant role in metabolic adaptation [67]. The C. elegans paralogues showed only 48% identity overall, but they were of slightly different lengths (1229 amino acids [Swiss-Prot:O45380] (F19H8.1) and 1331 amino acids [Swiss-Prot:Q7YZT6] (ZK54.2)) and particularly differed at the 5' and 3' ends. The two fragments of tps from the rotifer were 88.9% identical at the amino acid level, but these fragments did include the most conserved central portion of the gene, and therefore the overall figure for amino acid conservation is likely to be much lower if the whole sequence of each gene is compared.
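Identity figures such as those quoted above depend on how the aligned positions are counted. The following sketch shows one common convention, in which columns containing a gap in either sequence are excluded from the denominator; this convention, the function name and the toy sequences are assumptions for illustration, and the source does not state which convention or software produced the reported percentages.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumption: columns with a gap in either sequence are ignored,
    so the denominator is the number of gap-free aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Toy alignment: five gap-free columns, four of them identical.
identity = percent_identity("MKT-LV", "MRTALV")
```

Because short, conserved fragments contribute only well-matched columns, a value computed this way over a central region can be considerably higher than the value for the full-length proteins, which is the caveat raised in the text.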
Putative transcripts for members of the trehalose-6-phosphate synthase (tps) family identified in the EST libraries.
Given the way the libraries were produced, it is not possible to determine the role of trehalose in resting-egg formation and survival using these data alone. In addition to the duplication of the trehalose-6-phosphate synthase gene, C. elegans also shows a duplication of the gene for trehalase, the enzyme that breaks down trehalose. In fact, there are four trehalase genes annotated in Ensembl [W05E10.4, F57B10.7, T05A12.2 and C23H3.7] [66]. BLAST searches of the rotifer data produced three singletons with matches to trehalase (data not shown). Although these were single reads and therefore sequence quality was variable, there were sufficient differences between the putative translations of these clones to indicate that they were potentially three different genes, demonstrating another situation analogous with the nematode. Although the C. elegans sequences are similar at the sequence level to other characterized trehalases (hence the annotation), they are designated as "unknown function", as RNAi studies produce no obvious phenotype. It has yet to be determined why there are four copies of this gene in C. elegans and what the exact function of each paralogue is. By extrapolation, the same can be inferred for the three putative trehalases in the rotifer.
Aquaporins
Aquaporins are transmembrane proteins that serve as channels for the transport of water and small soluble molecules [29]. These proteins have been found to play a role in desiccation tolerance in seeds [30] and freeze tolerance in yeast [31]. Three different putative aquaporin transcripts were identified in the EST libraries (Table ) with E values in the range of 6E-22. These were designated: bpa-aqp-1. Exact assignment of these putative rotifer genes to aquaporin family members was difficult because of relatively short sequence lengths and low percentage similarity to aquaporin genes already in the databases. However, on BLAST assignment, the first two transcripts matched aquaporins 3, 7, 9 or 10, which are glycerol channels, while the third matched aquaporins 4, 2, 1 or the plant protein TIP. These genes are under further investigation and full-length transcripts are being generated by RACE PCR for functional analyses.
Putative transcripts for members of the aquaporin (aqp) family identified in the EST libraries.
Lipid and fatty acid metabolism
Also of interest were genes associated with lipid metabolism, as lipids may be the only source of energy whilst embryonic development is arrested and during hatching, if similarities are assumed with other dormant or hibernating organisms. For example, lipid metabolic pathways were up-regulated in the C. elegans dauer larval stage [33]. Lipids also serve as the main energy source in hibernating mammals [32]. Resting eggs contain extremely large numbers of droplets with neutral lipids [68] and these may serve as the only source for biosynthetic processes during dormancy and hatching via the glyoxylate cycle and gluconeogenesis. There were 28 clones (4 contigs and 2 singletons) matching lipoprotein lipase (Table ) in the libraries. Lipoprotein lipases are also known to serve as yolk proteins in dipteran eggs [69], in contrast to vitellogenins, which are the main yolk proteins in almost all egg-forming organisms [34]. Surprisingly, no BLAST matches were identified for vitellogenin, suggesting that lipoprotein lipase may serve as a yolk protein of B. plicatilis. Allied to the possession of lipoprotein lipases are fatty acid-binding proteins (FABP), which are assumed to be involved in fatty acid uptake, transport and metabolism. These proteins are members of the lipocalin superfamily, which are transporters of small hydrophobic molecules such as lipids, steroid hormones, bilins and retinoids [71]. Both fatty acid and retinoid binding may be important for resting-egg formation, as fatty acids may serve as an energy source during dormancy and retinoids are associated with embryonic development [72]. Five putative transcripts were identified as lipocalins (Table ). For each transcript, the highest number of clones within the normalized libraries was found in library FRE (females with resting eggs) and one transcript was also found in library sbs04. These results may suggest a role of lipocalins in resting-egg production.
Putative transcripts for members of the lipoprotein lipase (lpl) family identified in the EST libraries.
Putative transcripts for members of the fatty acid binding proteins (fab) family identified in the EST libraries.
Since all libraries were produced using either normalized or subtractive methods, real-time PCR experiments were conducted in order to assess the expression of selected genes in resting eggs and in resting-egg producing females (see [additional file 1] Table S1). The expression patterns of the selected genes were determined in resting eggs relative to amictic eggs, and in resting-egg producing females relative to amictic females (Fig. 3). It should be noted that in all cases the 95% confidence limits in the female samples were wider than those of the egg samples. This may be attributed to the larger inherent variability between females, related to their age and size.
Figure 3. Expression pattern of selected genes in resting eggs (RE) vs. amictic eggs (AE) and resting-egg producing females (FRE) vs. amictic females (FA). Genes that were tested include: the Late Embryogenesis Abundant proteins (lea-1, lea-2, lea-3), small heat shock ...
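Relative expression from real-time PCR is often reported as a fold change of a test sample over a control sample, normalized to a reference gene. The sketch below uses the widely used 2^-ΔΔCt calculation as an illustration only: the source does not state which quantification model was applied, and the function name, the choice of a single reference gene, the assumption of 100% amplification efficiency and the Ct values are all hypothetical.

```python
def relative_expression(ct_target_test, ct_ref_test,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a test sample vs. a control sample.

    Uses the 2^-(delta delta Ct) calculation, which assumes that the
    target and reference amplicons both amplify with ~100% efficiency.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_test = ct_target_test - ct_ref_test
    delta_control = ct_target_control - ct_ref_control

    # Compare the normalized values between test and control.
    delta_delta = delta_test - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: resting eggs (test) vs. amictic eggs (control).
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
```

With these made-up values the target crosses the threshold three cycles earlier in the test sample after normalization, which corresponds to an eight-fold higher relative expression.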
Genes upregulated in resting eggs include all the lea-like transcripts, a small heat shock protein and two of the genes involved in antioxidant activities: one of the glutathione S-transferases (Bpa-gst-8) and a superoxide dismutase (Mn-sod-2). Two gst-like transcripts were chosen for analysis: Bpa-gst-8, identified in the normalized libraries associated with resting eggs (RE, FRE) and also the subtracted library containing resting eggs, and gst-2 found in all the normalized libraries. As mentioned above, gst-8 is up-regulated in resting eggs and in resting-egg producing females. No significant change in the expression of gst-2 was found in resting eggs relative to amictic eggs but it was slightly up-regulated in resting-egg producing females. Therefore, the two gene family members clearly play different roles in cellular defense mechanisms.
The relative expression of the tps-1 transcript was determined in order to evaluate the significance of trehalose synthesis in resting-egg production. The results do not show any significant change in the expression of the tps-1-like gene in resting eggs relative to amictic eggs or in resting-egg producing females relative to amictic females. Hence, the expression pattern of the tps-like transcript suggests that this gene may not be associated with resting-egg production, although it cannot be discounted that trehalose production is regulated at the level of translation or enzyme activity rather than transcription.