Overview of ESTs from the venom gland of D. acutus
After discarding poor-quality sequences, 8696 high-quality ESTs were used to analyze the gene expression profile in the venom gland of D. acutus. The mean EST read length was 398 bp (ranging from 50 bp to 772 bp, Figure ). Subsequently, the ESTs were grouped into 3416 clusters, of which 118 clusters (40.16% of ESTs) associated with toxin function have been reported elsewhere [11]. In this report, we mainly discuss 1184 clusters (39.85% of ESTs) representing cellular functional transcripts, and other novel sequences (Figure ). The ESTs were distributed as follows:
Distribution of EST read lengths. A total of 8696 ESTs were generated in the present study. The abscissa (50 bp) is the sequence length and the ordinate is the number of ESTs. A large fraction of the ESTs are between 350 and 650 bp long.
Distribution of ESTs from the D. acutus venom gland according to their cDNA products. Percentages are indicated for EST classes that exceed 1.00%.
(1) Twenty-five clusters consisted of more than 50 ESTs each; these represent the most abundant transcripts and encode known proteins. They constituted 0.73% of the total clusters (25 of 3416 clusters) and 25.95% of the total ESTs (2257 of 8696 ESTs). Interestingly, half of the most abundant transcripts were metalloproteinases previously reported from the venom gland of D. acutus
], indicating high prevalence. Of these 25 clusters, eight are known housekeeping genes and two are toxin secretion-related genes (Table ).
Identification of the high-abundance clusters (≥10 reads) of cellular functional transcripts
(2) Thirty-nine clusters consisted of 20–49 ESTs each and represented 1.14% of the total clusters (39 of 3416 clusters) and 13.26% of the total ESTs (1153 of 8696 ESTs). Of these 39 clusters, 13 encoded non-toxin functional proteins, such as myosin, NADH dehydrogenase subunit, cytochrome oxidase subunit and calmodulin. They are the second most abundant class of mRNA transcripts in the venom gland of D. acutus.
(3) Seventy-five clusters contained 10–19 ESTs each and represented 2.20% of the total clusters (75 of 3416 clusters) and 11.83% of the total ESTs (1029 of 8696 ESTs); 17 of these clusters encoded troponin, ATPase subunits, retrotransposable-like elements, elongation factor 1-alpha and other proteins. They are considered medium-sized clusters with relatively low prevalence.
(4) The 445 low-abundance clusters consisted of 2–9 ESTs each and constituted 13.03% of the total clusters (445 of 3416 clusters) and 16.39% of the total ESTs (1425 of 8696 ESTs). They included many toxin-coding genes, cellular functional transcripts and some unknown proteins, e.g., jerdonitin, proline dehydrogenase and hypothetical proteins.
(5) There were 2832 singletons (unique ESTs), representing 82.90% of the total clusters (2832 of 3416 clusters) and 32.57% of the total ESTs (2832 of 8696 ESTs). Each of these clusters occurred only once among the ESTs sequenced so far. They included cytokine-like molecules, zinc finger proteins, transport proteins and transcripts without hits in the GenBank non-redundant protein (nr) and nucleotide (nt) databases. The distribution of cluster sizes is shown in Figure .
Figure 3 Prevalence distribution of the cluster size. The initial 8696 ESTs were grouped into 3416 clusters, consisting of 25 clusters (2257 of 8696 ESTs) that comprised more than 50 ESTs each, 39 clusters that contained 20–49 ESTs each (1153 of 8696 ESTs), ...
The cDNA library constructed here is a non-normalized primary library without amplification, so clone abundance, i.e. cluster size, reflects the relative mRNA population [12]. About one-third of the total clones are singletons, and approximately one-fourth of the ESTs fall into clusters of more than 50 ESTs each, which reflects the complexity and specificity of the transcript population in the venom gland of D. acutus.
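The size classes listed above can be cross-checked with a few lines of arithmetic. The Python sketch below tabulates the five classes with the cluster and EST counts given in the text and recomputes each share from those counts. The EST count for the largest class is taken from the Figure 3 caption (2257), which is the value for which the five classes sum to 8696. All variable and function names are illustrative only.

```python
# Totals reported in the text.
TOTAL_CLUSTERS = 3416
TOTAL_ESTS = 8696

# (ESTs per cluster, number of clusters, number of ESTs) for each size class.
# Assumption: 2257 ESTs for the largest class, as in the Figure 3 caption.
SIZE_CLASSES = [
    (">50", 25, 2257),
    ("20-49", 39, 1153),
    ("10-19", 75, 1029),
    ("2-9", 445, 1425),
    ("1 (singleton)", 2832, 2832),
]


def percent(part, whole):
    """Share of `part` in `whole`, in percent, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def summarize(classes, total_clusters, total_ests):
    """Return one row per class: label, clusters, % clusters, ESTs, % ESTs."""
    return [
        (label, n_clusters, percent(n_clusters, total_clusters),
         n_ests, percent(n_ests, total_ests))
        for label, n_clusters, n_ests in classes
    ]


rows = summarize(SIZE_CLASSES, TOTAL_CLUSTERS, TOTAL_ESTS)
cluster_sum = sum(n_clusters for _, n_clusters, _ in SIZE_CLASSES)
est_sum = sum(n_ests for _, _, n_ests in SIZE_CLASSES)

for label, n_clusters, pct_clusters, n_ests, pct_ests in rows:
    print(f"{label:>14}  {n_clusters:5d} clusters ({pct_clusters:5.2f}%)  "
          f"{n_ests:5d} ESTs ({pct_ests:5.2f}%)")
print(f"{'total':>14}  {cluster_sum:5d} clusters           {est_sum:5d} ESTs")
```

Summing the classes is a quick way to catch transcription slips: the class counts should add up to the reported totals of clusters and ESTs.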
ESTs relevant to protein processing
A homologue of the Bothrops insularis calglandulin EF-hand protein family was identified at high abundance (81 ESTs) in the current library (Table ). Calglandulin has been reported as a venom gland-specific EF-hand protein gene [13]. It has several conserved Ca2+-binding motifs and is expressed exclusively in the venom glands of many snake species, but it is not secreted into the venom. This protein family functions in the export of toxins out of the cell and into the venom [14], implying that it plays a fundamental role in the toxin secretion process [15]. In this library, three EF-hand protein families were found, of which two showed high identity with the Bothrops insularis EF-hand protein family and the third showed homology only with Mus musculus calmodulin. The diversity of EF-hand proteins in the venom gland of D. acutus
may reflect the complexity of toxin secretion. The other highly expressed gene encodes protein disulfide isomerase (PDI), which was represented by 58 ESTs in the library. The PDI from D. acutus showed 77.9% identity with Gallus gallus PDI. PDI is a redox protein responsible for disulfide bond assembly in the endoplasmic reticulum. We also found a significant difference in the frequency of cysteine residues between toxin proteins and cellular functional proteins (data not shown) in the venom gland of D. acutus, suggesting that PDI plays a key role in toxin protein folding. Furthermore, heat shock proteins (HSPs), including HSP20, HSP70 and HSP90, were identified in this library (15 ESTs). HSPs are chaperones for protein refolding and degradation, which may be important for the regeneration of toxin proteins. Many ribosomal proteins were also found in this library, consistent with a high level of protein synthesis; a large number of ribosomal proteins is needed for toxin synthesis [16]. Several other identified transcripts also shed light on the physiology of venom gland secretion. For instance, various clusters involved in transporter activity were found, e.g., ion transporters (uni4929105) and nucleoside transporters (uni7320865) (see Additional File). Together, these findings suggest that the venom gland of D. acutus is a highly specialized, active organ with a powerful synthetic capacity that plays a central role in secreting toxins and polypeptides.
ESTs relevant to structural components and energy supply
Structural component transcripts are abundantly expressed in the venom gland of D. acutus, encoding actin, troponin, calsequestrin and myosin (Table ). Interestingly, these cellular structural components are similar to those of mammalian muscle tissue, which may indicate that the structure of the venom gland cavity resembles muscle tissue and contributes to the contractile activity of the gland. This could also explain why creatine kinase is highly expressed in the venom gland of D. acutus, accounting for 2.49% (217 reads) of all ESTs (Table ). Because creatine kinase is an important regulator of high-energy phosphate production and utilization in contractile tissues, its high expression is consistent with the energy demand of gland contraction. Furthermore, abundant transcripts in the current library encode cytochrome b, cytochrome oxidase and NADH dehydrogenase, which are likewise needed to meet the energy demands of toxin protein synthesis and gland contraction.
Enzymes relevant to metabolic pathways
In this library, several enzymes of metabolic pathways such as glucose metabolism and nicotinate and nicotinamide metabolism were found (Table ). In energy and material metabolism, 22 cluster sequences were identified as playing a role in glucose metabolism. We also found that unigenes uni4505467 and uni41055552 code for 5'-nucleotidase, which participates not only in purine metabolism but also in nicotinate and nicotinamide metabolism; this suggests that D. acutus may possess functional pathways for both in the venom gland. Snake envenomation employs three well-integrated strategies: prey immobilization via hypotension, prey immobilization via paralysis, and prey digestion. Purines (adenosine, guanosine and inosine) constitute the perfect multifunctional toxins and evidently play a central role in all three envenomation strategies of most advanced snakes [17].
Proteins from the venom gland of D. acutus predicted to play a role in physiological metabolic pathways
ESTs relevant to other functions
Surprisingly, 18 clusters (21 ESTs) encoding reverse transcriptase were found in the current library. They are similar to the reverse transcriptase of the teleost LINE family SW1 [18]. We also identified retrotransposable-like elements in this library (16 ESTs), most of them similar to the ORF2 protein of the Platemys spixii retrotransposon CR1 [19]. We therefore expect an intact retrotransposable structure in the D. acutus genome. This specific retrotransposable element in the D. acutus genome may be an adaptation to environmental diversity and prey requirements.
The complete functional categories of genes expressed in the snake venom gland have not yet been determined. To give an overview of the major cellular roles, the number of partial mRNA transcripts in each category is listed (Additional file), based on the molecular function ontology of Gene Ontology [20]. The largest proportion (38 clusters) represents transcripts in the binding category, corresponding to 34.86% of the genes with an assigned molecular function and 1.33% of the total unigenes. Based on the Gene Ontology functional classification, 824 clusters (3144 ESTs) were assigned to the organizing principle of molecular function and 1719 clusters (5472 ESTs) to that of biological process (Figure ). However, such an analysis only gives a hint of what the function might be and, in many cases, extensive biochemical and biological work is necessary to unambiguously identify a gene and its function [21]. We have presented here an initial analysis of the transcripts relevant to physiological cellular proteins.
Functional classification of transcripts in the venom gland of D. acutus according to molecular function and biological process, based on Gene Ontology. The vertical axis shows the number of ESTs or clusters.
ESTs with no significant matches to known genes
There are 1553 clusters (54.40%) without significant homology to any known gene in GenBank. Because sequences shorter than 50 bp were discarded, we can exclude the possibility that the absence of hits is due to overly short sequences. The abundance of novel sequences points to a large number of unidentified genes, suggesting the complexity and diversity of the genes expressed in the venom gland of D. acutus. In addition, among the clusters without significant matches to known genes, 11 have matches in the dbEST database and 344 in the hmmpfam database. Some of these clusters contain putative toxin-related motifs, e.g., unigene rfstca0_000120.y1.scf (a disintegrin motif region) and unigene rfstda0_001953.y1.scf (conotoxin I-superfamily), which indicates that there are new toxins among these sequences. The high abundance of these sequences might correspond to unknown toxin genes expressed in the venom gland of D. acutus. Further study of the novel genes expressed in this specialized organ could disclose the mechanism of toxin secretion and the evolution of the snake venom gland.
Comparative analysis with other snake venom glands
Although cDNA libraries from the venom glands of a few snakes have been reported and characterized [10], analyses of the transcripts in these libraries seldom addressed cellular functions, as attention was focused mainly on the toxin components. Many toxin components of the venom gland have been identified [11], but nerve growth factor (NGF) was not identified in this library. In contrast to previous reports [10], we postulate two possible explanations for the absence of NGF from this library: one is that NGF might be expressed only under specific physiological conditions, such as in the venom gland of D. acutus after milking; the other is that NGF is not a necessary toxin component in D. acutus. Furthermore, many clusters that may be involved in physiological processes of the venom gland remain to be deciphered. Studying gene expression in the snake venom gland with respect to cellular structure and function is important, as it will advance the study of physiological processes such as organogenesis, cell differentiation and protein synthesis. In addition, some secreted membrane proteins may represent antigens of potential importance to immune control.
Phylogenetic analysis of toxin related genes in D. acutus
Snake venom glands evolved once, at the base of the colubroid radiation 60–80 million years ago, with extensive subsequent "evolutionary tinkering" [23]. The advanced snakes (superfamily Colubroidea) make up >80% of the approximately 2900 snake species currently described and contain all of the known venomous forms [1]. In this library, toxin clusters generally matched database sequences from snakes, whereas the cellular functional transcripts were identified by their similarity to sequences from model organisms such as Gallus gallus, Homo sapiens and Rattus norvegicus. Of note, although the toxin transcripts were always matched to phylogenetically closer sources (other snakes), the average similarity did not differ obviously between toxins and cellular functional transcripts. From an evolutionary point of view, we postulate that the toxins tend to diverge under natural selective pressure to adapt to different environmental conditions (mainly distinct prey). By contrast, most of the cellular components, although matched to phylogenetically distant mammalian proteins, correspond to proteins with functions conserved among vertebrates and thus show higher homology [15].
The origin and evolution of many toxin gene families in the advanced snakes have been researched extensively [1]. Most of these toxin gene families were recruited into the advanced snakes before the split of elapids and viperids. However, phylogenetic analysis of a few other toxin gene families, such as phospholipase A2 and natriuretic toxins, provides clear evidence of independent recruitment events. Because toxin gene sequences in public databases are limited, the origin and evolution of a number of toxin gene families remain unknown. In this report, we analyzed the phylogeny of the group III snake venom metalloproteinases and the serine proteases. The group III snake venom metalloproteinase consists of a proprotein domain, a metalloproteinase domain, a disintegrin domain and a cysteine-rich domain. It inhibits the integrin receptor selectively. Figure shows the phylogenetic analysis of this group III metalloproteinase in Colubroidea. The phylogenetic pattern of this metalloproteinase is similar to that of the CRISP proteins [24], which were recruited into the advanced snakes before the split of elapids and viperids. That is, the advanced snakes acquired the group III metalloproteinase genes through a single early recruitment event. Subsequently, this metalloproteinase family evolved independently in elapids and viperids. We also analyzed the phylogeny of the serine proteases in the superfamily Colubroidea, and similar results were obtained (Figure ). However, serine protease genes have not been identified in elapids, which suggests either gene loss during evolution or insufficient sequence information for elapids.
Figure 5 Molecular phylogenetic analysis of the group III metalloprotease using the neighbor-joining method. To minimize confusion, all protein sequences from the database are referred to by their NCBI accession numbers. The sequences from the current library are indicated ...
Figure 6 Molecular phylogenetic analysis of the serine proteinase using the neighbor-joining method. To minimize confusion, all protein sequences from the database are referred to by their NCBI accession numbers. The sequences from the current library are indicated with ...
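The captions of Figures 5 and 6 name the neighbor-joining method, but this section does not state which software or distance measure was used. As a minimal sketch of the general procedure only, the Python code below builds an unrooted tree from a pairwise distance matrix. The five taxon labels and their distances are made-up toy values, not data from this study, and the function name `neighbor_joining` is hypothetical.

```python
from itertools import combinations

# Toy input (assumption): five taxa with invented pairwise distances.
TAXA = ["a", "b", "c", "d", "e"]
PAIRWISE = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7,
    ("d", "e"): 3,
}
DIST = {frozenset(pair): float(value) for pair, value in PAIRWISE.items()}


def neighbor_joining(names, dist):
    """Return the edges (child, parent, branch_length) of the unrooted tree.

    `dist` maps frozenset({x, y}) to the distance between x and y.
    """
    d = dict(dist)
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all other current nodes.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i)
             for i in nodes}
        # Pick the pair minimizing Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        # Branch lengths from i and j to the new internal node.
        len_i = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        len_j = dij - len_i
        new = f"node{next_id}"
        next_id += 1
        edges.append((i, new, len_i))
        edges.append((j, new, len_j))
        # Distances from the new node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    # Connect the last two nodes directly.
    last_a, last_b = nodes
    edges.append((last_a, last_b, d[frozenset((last_a, last_b))]))
    return edges


for child, parent, length in neighbor_joining(TAXA, DIST):
    print(f"{child} -- {parent}: {length:.2f}")
```

In this toy example the first join pairs taxa a and b. With real data the distances would come from aligned protein sequences, and ties between equally good pairs may be resolved differently by different programs.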