The venom gland of the snake Bitis gabonica (Gaboon viper) was used for the first time to construct a unidirectional cDNA phage library followed by high-throughput sequencing and bioinformatic analysis. Hundreds of cDNAs were obtained and clustered into contigs. We found mostly novel full-length cDNA coding for metalloproteases (P-II and P-III classes), Lys49-phospholipase A2, serine proteases with essential mutations in the active site, Kunitz protease inhibitors, several C-type lectins, bradykinin-potentiating peptide, vascular endothelial growth factor, nucleotidases and nucleases, nerve growth factor, and L-amino acid oxidases. Two new members of the recently described short coding region family of disintegrin, displaying RGD and MLD motifs are reported. In addition, we have identified for the first time a cytokine-like molecule and a multi-Kunitz protease inhibitor in snake venoms. The CLUSTAL alignment and the unrooted cladograms for selected families of B. gabonica venom proteins are also presented. A significant number of sequences were devoid of database matches, suggesting that their biologic function remains to be identified. This paper also reports the N-terminus of the 15 most abundant venom proteins and the sequences matching their corresponding transcripts. The electronic version of this manuscript, available on request, contains spreadsheets with hyperlinks to FASTA-formatted files for each contig and the best match to the GenBank and Conserved Domain Databases, in addition to CLUSTAL alignments of each contig. We have thus generated a comprehensive catalog of the B. gabonica venom gland, containing for each secreted protein: i) the predicted molecular weight, ii) the predicted isoelectric point, iii) the accession number, and iv) the putative function. The role of these molecules is discussed in the context of the envenomation caused by the Gaboon viper.
Snake venoms are complex mixtures of proteins, including enzymes and other biologically active components (Aird, 2002). These components are responsible for the envenomation caused by snake bites and display mostly neurotoxic (Harvey, 2001) or proteolytic (Bjarnason and Fox, 1995) activities. The Gaboon viper, Bitis gabonica, is a large viper widely distributed over West, Central, and East Africa. It produces the largest amounts of venom of all poisonous snakes, yielding in excess of 2 grams of dried venom per milking (Dorandeu, 1991). Bites from Gaboon vipers appear to be rare, however, due at least in part to the animal's extremely placid nature. In fact, the majority of reported bites have occurred from handling specimens in captivity (Marsh et al., 1997). In these cases, unstable circulation, a severe coagulation disorder, and tissue damage followed by necrosis are the most life-threatening conditions associated with the envenomation (Marsh et al., 1997).
As far as the biochemical composition of B. gabonica venom is concerned, several activities have been reported, including arginine esterases (Viljoen et al., 1979), phospholipase A2 (PLA2) (Botes and Viljoen, 1974), thrombin-like enzyme (gabonase) (Pirkle et al., 1986), anti-platelet (gabonin) (Huang et al., 1992), and metalloprotease (Marsh et al., 1997) activities. Remarkably, information about the B. gabonica snake venom gland at the molecular level is almost nonexistent. In fact, only the N-terminus of gabonase (Pirkle et al., 1986) and the amino acid sequence of a B. gabonica PLA2 (Botes and Viljoen, 1974) have been reported. Furthermore, a GenBank search with the term "Bitis gabonica" in September 2003 returned only a PLA2 and a cytochrome b sequence at the protein level and two housekeeping sequences at the nucleotide level. The striking lack of information on the molecular constituents of B. gabonica venom led us to choose this snake to construct a venom gland cDNA library followed by sequencing of the clones. Edman degradation of the most abundant proteins was performed in parallel, allowing us to generate for the first time a comprehensive catalog containing B. gabonica transcripts (cDNA) and proteins. Roles of the components of Gaboon viper venom are discussed in the context of both envenomation and in vitro activities previously described for this venom.
All water used was of 18 MΩ quality and was produced using a MilliQ apparatus (Millipore, Bedford, MA, USA). Organic compounds were obtained from Sigma Chemical Corporation (St. Louis, MO, USA) or as stated otherwise.
B. gabonica venom and venom gland were obtained from the same snake held in captivity at the Kentucky Reptile Zoo (Slade, KY). Three days after milking the head was cut and the gland immediately dissected and frozen in dry ice under the supervision of Jim Harrison and Kristen L. Wiley of Kentucky Reptile Zoo (http://www.geocities.com/Kentuckyreptilezoo).
Approximately 30 µg of venom was treated with LDS sample buffer (Invitrogen) containing SDS without reducing conditions and applied to a NU-PAGE 4–12% Bis-Tris gel (MES buffer) (Invitrogen) 1 mm thick. The supplemental data of this paper contains detailed information on B. gabonica cDNA library construction.
A fragment was rapidly obtained from the center part of the gland. Fragments were transferred to a sterile plastic Petri dish located on the top of dry ice to avoid melting. B. gabonica salivary gland mRNA was obtained using the Micro-Fast Track mRNA isolation kit (Invitrogen, San Diego, CA, USA) according to the manufacturer's instructions. The PCR-based cDNA library was made following the instructions for the SMART cDNA library construction kit (Clontech, Palo Alto, CA, USA) as described (Francischetti et al., 2002). Cycle sequencing reactions using the DTCS labeling kit from Beckman Coulter Inc. (Fullerton, CA, USA) were performed as reported (Francischetti et al., 2002). The supplemental data of this paper contains detailed information on the construction and sequencing of the B. gabonica cDNA library.
Other procedures were as in Francischetti et al (2002) except that clustering of the cDNA sequences was accomplished using the CAP program (see Supplemental data). The electronic version of the complete tables (Microsoft Excel format), with hyperlinks to web-based databases and to BLAST results is available on request (vog.hin.diain@ittehcsicnarfi). The supplemental data of this paper contains detailed information on cDNA sequence clustering, sequencing information cleaning, blast search and other bioinformatic analysis.
In an attempt to improve our understanding of the complexity of the proteins and transcripts expressed in B. gabonica venom glands, we have performed SDS/PAGE and a cDNA library using, respectively, the secreted proteins and mRNA from this same tissue.
Figure 1 shows the pattern of separation of B. gabonica venom proteins by SDS-PAGE that have been stained by Coomassie Blue. The gel shows 15 clearly visible stained bands and many other slightly stained. The protein bands were numbered from 1 to 15 according to their decreasing apparent molecular weight, starting with the letter BG that stands for B. gabonica. To identify these proteins, they were transferred to PVDF membranes and the bands cut from the membrane and submitted to Edman degradation. Amino-terminal information was successfully obtained for all bands BG-1 to BG-15. To find matches to known proteins, the sequences were blasted against the NR GenBank database and to each cDNA sequence obtained in the mass-sequencing project of the B. gabonica venom gland described in this paper (see Materials and Methods).
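As a rough illustration of how a short Edman read can be compared with a reference N-terminus, the sketch below scores ungapped, position-by-position identity and skips residues that Edman degradation left undetermined (written X). This is a simplified stand-in for the BLAST searches described above, not the procedure actually used; the function name `edman_identity` is hypothetical, and the example sequences are the BG-6 and BG-12 reads and their reference N-termini as quoted in this paper.

```python
def edman_identity(edman_read: str, reference: str) -> float:
    """Fraction of identical residues over positions determined in both sequences.

    Sequences are compared from the first residue without gaps; positions
    where either sequence has 'X' (undetermined) are ignored.
    """
    pairs = [
        (a, b)
        for a, b in zip(edman_read.upper(), reference.upper())
        if a != "X" and b != "X"
    ]
    if not pairs:
        raise ValueError("no determined positions to compare")
    return sum(a == b for a, b in pairs) / len(pairs)


# N-terminal reads and reference N-termini as quoted in the text.
bg6 = "ADDKNPLEEXFRESSYEEFL"           # BG-6 (Edman degradation)
lao_blomhoffii = "ADDRNPLEECFRETDYEEFL"  # LAO from A. h. blomhoffii
bg12 = "NSAHPXXDPVTXK"                  # BG-12 (Edman degradation)
eristocophin = "NSANPCCDPITCK"          # eristocophin

print(f"BG-6 vs LAO:           {edman_identity(bg6, lao_blomhoffii):.2f}")
print(f"BG-12 vs eristocophin: {edman_identity(bg12, eristocophin):.2f}")
```

An ungapped comparison is adequate here only because the reads start at the mature N-terminus; a database search against full-length sequences would still need a local alignment tool such as BLAST.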
A cDNA library was constructed using the venom gland of B. gabonica, and about 600 independent clones were randomly sequenced from the 5′ end. Cluster analysis of all sequences from this library organized them into 300 independent contigs. Subsequently, contigs were blasted against the NR nucleotide database, and the presence of signal peptides was predicted by submission of the sequences to the SignalP server (Nielsen et al., 1997). Our analysis shows that ~75% of all sequences have database hits; ~46% of all sequences code for proteins with a putative signal peptide, 38% code for proteins with housekeeping function, and the remaining sequences could not be assigned as housekeeping or secretory (unknown). It is thus clear that cDNAs for secretory proteins are highly represented in our library, suggesting that in vivo these molecules are preferentially expressed over housekeeping and unknown-function proteins. Also, because the cDNAs were obtained from a single animal, the sequence variations observed do not represent population diversity, as the maximal number of alleles would be 2.
Among the individual cDNA sequences containing putative signal peptides, 90% have hits in the GenBank database. Figure 2A shows the relative proportion of the number of individual cDNA over the total number. These sequences were organized into 60 contigs (70%). Figure 2B shows the relative proportion of the number of contigs for each venom toxin family. Based on the distribution described above, cDNAs coding for PLA2 are the most abundant but organized in only three contigs. This indicates that PLA2 in this venom are rather similar and highly expressed. On the other hand, the metalloproteases are organized in 18 different contigs, suggesting that these enzymes may have evolved to perform other functions. Finally, among the 30 sequences (10% of all sequences) that do not have matches to any database, 25 contigs (30%) were organized.
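The distinction drawn above between the number of individual cDNAs in a family (abundance) and the number of contigs they fall into (diversity) can be sketched as a simple tally. The code below is only an illustration under stated assumptions: the function name `tally_by_family` is hypothetical, and the clone list is made-up toy data, not the counts from this library.

```python
from collections import defaultdict


def tally_by_family(clones):
    """Return {family: (number_of_clones, number_of_distinct_contigs)}.

    `clones` is an iterable of (contig_id, family) pairs, one per
    sequenced clone.
    """
    n_clones = defaultdict(int)
    contigs = defaultdict(set)
    for contig_id, family in clones:
        n_clones[family] += 1
        contigs[family].add(contig_id)
    return {fam: (n_clones[fam], len(contigs[fam])) for fam in n_clones}


# Toy data (hypothetical): many PLA2 clones in few contigs,
# fewer metalloprotease clones spread over more contigs.
toy_clones = [
    (1, "PLA2"), (1, "PLA2"), (1, "PLA2"), (3, "PLA2"),
    (34, "metalloprotease"), (16, "metalloprotease"), (30, "metalloprotease"),
]

for family, (clones, contig_count) in tally_by_family(toy_clones).items():
    print(f"{family}: {clones} clones in {contig_count} contigs")
```

A family with many clones but few contigs corresponds to the PLA2 pattern described above, whereas a family spread over many contigs corresponds to the metalloprotease pattern.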
Among the housekeeping cDNAs, we have found sequences involved in transcription and translation (ribosomal proteins, cAMP-dependent transcription factors, elongation factors), metabolism (ATP synthase, amine oxidase, glutathione S-transferase, guanine nucleotide-binding protein, cytochrome c oxidase, NADH-ubiquinone oxidoreductase chain I), processing (versican core protein precursor), cell regulation (lithostatine), structural functions (microtubule binding protein), storage (ferritin heavy chain), and retrotransposable (L-1) elements. The complete list of the sequences coding for proteins with secretory, housekeeping, or undetermined function, with or without database hits, can be obtained on request (vog.hin.diain@ittehcsicnarfi).
Table 1 describes the contigs we have found coding for putative secreted proteins and, when available, their corresponding N-terminus obtained by Edman degradation. Matches to the NR, snake DNA, and Conserved Domains Database in addition to accession numbers are also reported. A detailed discussion on the sequences assigned by each cluster and its participation in envenomation by B. gabonica is presented below.
The metalloproteases make up the most complex group of proteins, being composed of 18 contigs or 30% of all assembled cDNAs. These findings are consistent with the functional characterization of two hemorrhagic proteins (HTa and HTb) in B. gabonica venom that were previously shown to degrade collagen and affect endothelial cell morphology (Marsh et al., 1997). Metalloproteases, the primary proteins responsible for snake venom-induced hemorrhage, belong to the reprolysin family of venom metalloproteases (Bjarnason and Fox, 1995). These enzymes are capable of hydrolyzing various components of the extracellular matrix and have also been reported to affect endothelial cells leading to apoptosis. These enzymes are organized into four classes PI-PIV, according to size and domain composition (Bjarnason and Fox, 1995; Jia et al., 1997).
In our cDNA library, we found a number of contigs containing partial sequences with homology to the reprolysin, disintegrin, or cysteine-rich domains of different venom metalloproteases. However, no matches to the C-type lectin domains of metalloproteases (class P-IV) were found. It appears that B. gabonica venom contains the P-II and P-III classes of metalloprotease, although the presence of the P-I and P-IV classes cannot be excluded. Contig 34 has the longest cDNA we have found in our library coding for a metalloprotease and, accordingly, it was extended with appropriate primers in an attempt to identify its functional domains. Although the pre- and pro-domains are not available, the regions coding for the metalloprotease and disintegrin domains were found, indicating that this enzyme belongs to the P-II class; it is named herein B. gabonica metalloprotease-4 (AY442287). The metalloprotease domain is typical and contains the zinc-binding motif, but is unusual in that 5 instead of 6 cysteines are present. Since the nucleotide sequence coding for this region was reproducible and unambiguous, and was found in a reliable region of the chromatogram, we conclude that this is a true substitution. Indeed, it has been reported that the number of cysteines in the metalloprotease domain of these enzymes may differ (Kini et al., 2002). Of interest, atrolysin A, a class III enzyme from C. atrox, also has an additional cysteine residue in the proteinase domain. The oxidation state or potential disulfide bond partner of this residue in atrolysin is unknown (Bjarnason and Fox, 1995). At present, however, it cannot be completely excluded that a mutation occurred in this cDNA during first- or second-strand synthesis or, less likely, that the mRNA used to generate the cDNA carries a mutation.
As far as the disintegrin domain is concerned, a typical RGD sequence was found, and the cysteine pattern was similar but, again, not identical to the disintegrin domain of most metalloproteases. In fact, the B. gabonica metalloprotease-4 (AY442287) lacks 8 amino acids at the N-terminus, including a cysteine that is conserved in most P-II enzymes. Likewise, these amino acids are missing in a metalloprotease from Macrovipera lebetina (gi 2118144) (Figure 3), suggesting that the amino acid changes we have observed are consistent. Of note, the N-terminus NSAHPCCDPVTXK (BG-12) obtained by Edman degradation of B. gabonica venom proteins (Figure 1) is identical to the putative N-terminus coded by the corresponding cDNA in contig 34. This finding strongly suggests that this P-II metalloprotease, when processed, may generate disintegrin peptides. We could not identify the N-terminus of the metalloprotease domain in our proteome study (Figure 1), suggesting that these proteins may have their N-terminus blocked, as previously reported (Fox et al., 2002). The CLUSTAL alignment of B. gabonica P-II metalloprotease (B. gabonica metalloprotease-4; AY442287) with other P-II class enzymes is shown in Figure 3A. The unrooted cladogram is presented in Figure 3B and shows that metalloproteases from B. gabonica and M. lebetina venoms are the most closely related enzymes.
Table 1 shows that a number of other contigs (e.g. contig 16, AY430411 and 30, AY430412) contain partial-length sequences homologous to P-III metalloproteases from other venoms (Jia et al., 1997). In addition, contig 172 (AY430412) contains the full coding region for the disintegrin-like and cysteine-rich domains of a typical P-III venom metalloprotease. In fact, the protein coded by this cDNA has a SECD motif in the disintegrin-like domain in addition to a conserved pattern of cysteines commonly found in the cysteine-rich region. This cDNA codes for a protein that can be aligned (not shown) with the disintegrin/cysteine-rich domains of other P-III metalloproteases, including berythractivase, a prothrombin activator from B. erythromelas, and other related molecules (Silva et al., 2003). These enzymes may participate together with other venom components in the pathogenesis of B. gabonica envenomation (Marsh et al., 1997).
In addition to metalloproteases, crotalid and viperid species contain large amounts of serine proteases (Markland, 1998). In most cases, these enzymes have 12 strongly conserved cysteines in addition to the catalytic triad characteristic of serine proteases, His57-Asp102-Ser195 (Castro et al., 2001). These enzymes are frequently blocked by serine protease inhibitors and preferentially hydrolyze the α chain of fibrinogen over the β chain and/or induce platelet aggregation (Markland, 1998).
Among the serine proteinases found in our cDNA library, contig 71 encodes a protein similar to platelet pro-aggregatory PA-BJ from Bothrops jararaca (Serrano et al., 1995) and thrombocytin from B. atrox (Niewiarowski et al., 1979). Consistent with these contigs, we have found the sequence VIGXAEXDINEHPSLALIY for BG-8 and VIGXAEXNINEHRFLALVYF for BG-9 to be similar to the N-terminus of PA-BJ (VVGGRPCKINVHPSLVLL). They also resemble the sequence VVGGAGECKIDGHRCLALLY described for gabonase, a pro-coagulant thrombin-like enzyme from B. gabonica (Pirkle et al., 1986). We could assign contigs with matches to enzymes that cleave fibrinogen (like gabonase) but are devoid of platelet aggregatory properties. In addition, contigs 60 and 90 have matches to β-fibrinogenases from Vipera lebetina (gi 2241722) and Agkistrodon blomhoffi (gi 6706013) snake venoms, respectively. The cDNA identified in contig 60 (Bitis gabonica serine protease-1; AY430410) was completely sequenced, and the CLUSTAL alignment with other venom serine proteases is shown in Figure 6 of the supplemental data. Interestingly, in this protein the catalytic triad residue His57 is replaced by Arg57, and Ser195 is replaced by Asp195. Identical substitutions have been found in the serine protease VLP2 from V. lebetina venom, and it has been suggested that generation of such clones occurs via trans-splicing of the primary gene transcript, by exon shuffling, or by unequal crossing-over at the genome level (Siigur et al., 2001). Since these proteins have not been expressed as recombinant proteins, it is a matter of debate whether they behave as serine proteases.
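The triad comparison described above can be written down as a small check. The sketch below is illustrative only: it assumes the residues at chymotrypsin-numbered positions 57, 102, and 195 have already been read off an alignment (the alignment step is not shown), the function name `triad_substitutions` is hypothetical, and Asp102 is assumed to be conserved in the contig 60 protein because the text reports substitutions only at positions 57 and 195.

```python
# Canonical serine protease catalytic triad (chymotrypsin numbering).
CANONICAL_TRIAD = {57: "H", 102: "D", 195: "S"}


def triad_substitutions(residues):
    """List triad substitutions in the form 'H57R' (canonical, position, observed).

    `residues` maps chymotrypsin-numbered positions 57, 102 and 195 to the
    one-letter residue observed at the aligned position.
    """
    return [
        f"{CANONICAL_TRIAD[pos]}{pos}{residues[pos]}"
        for pos in sorted(CANONICAL_TRIAD)
        if residues[pos] != CANONICAL_TRIAD[pos]
    ]


# Contig 60 protein as described in the text: Arg57 and Asp195;
# Asp102 is an assumption (no change at this position is reported).
contig60 = {57: "R", 102: "D", 195: "D"}
print(triad_substitutions(contig60))
```

An empty list would indicate an intact canonical triad; any entry flags a protein whose proteolytic activity should not be taken for granted.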
Finally, it is noteworthy that contigs 60, 90, 71, 108, and 259 match kallikrein-like enzymes in the gene ontology database, indicating that these serine proteases may act on kininogen to release bradykinin. This conclusion is consistent with reports showing that B. gabonica venom serine protease activities can be separated into kinin-releasing, clotting, and fibrinolytic activities (Viljoen et al., 1979).
Kunitz domains are about 60 residues and contain six specifically spaced cysteines that form disulfide bonds. In most cases, they are reversible inhibitors of serine proteases that bind the active site. In our library, we have found contig 203 with sequence homology to textilin, a Kunitz-type protease inhibitor that tightly inhibits plasmin and is supposed to have anti-hemorrhage or pro-thrombotic activity (Aird, 2002). Consistent with these data, we have found by Edman degradation that BG-11 and BG-15 protein bands share a similar sequence: KNRPEFXNLPADTGXXKAY and KKRPDFXYLPADTGPXMAN, respectively. These sequences match the N-terminus KDRPKFCELPADIG reported for textilin (gi 15321630). The full-length clones of two Kunitz-protease inhibitors from B. gabonica venom gland have been obtained and called Bitisilin-1 (AY430402) and Bitisilin-2 (AY430413). The CLUSTAL alignment of both sequences with other venom Kunitz inhibitors and unrooted cladogram of all sequences is presented in the supplemental data. Of interest, a third cDNA (contig 137) codes for a molecule containing at least two Kunitz domains organized in tandem and called herein Bitisilin-3 (AY442289). Although multi-Kunitz molecules from exogenous sources have been identified in the salivary gland of ticks (gi 15077001), this is the first description of a multi-headed Kunitz in snake venoms. Both heads are highly homologous, and are most likely the result of gene duplication from a common ancestor (Zupunski et al., 2003). Consistent with these data, we have found two protein bands with clearly distinct molecular weights (BG-11 and BG-15) with N-termini that match Kunitz inhibitors (Figure 1). Finally, contig 146 has sequence homology to α-bungarotoxin from Bungarus candidus (gi 24459200), a well-studied Kunitz type K+ channel blocker from Bungarus spp. (Harvey, 2001). The CLUSTAL alignment of B. gabonica and other venom Kunitz-like proteins and the unrooted cladogram of all sequences are shown in Figure 7 of the supplemental data.
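The defining features given above for a Kunitz domain (about 60 residues, six cysteines) lend themselves to a quick screening heuristic. The sketch below is a crude filter under stated assumptions, not a substitute for a Conserved Domain Database search: the function names and the 50 to 70 residue length window are hypothetical choices, and the test sequence is a made-up placeholder rather than a B. gabonica sequence.

```python
def cysteine_positions(seq: str) -> list:
    """Return the 1-based positions of cysteine residues in a protein sequence."""
    return [i + 1 for i, residue in enumerate(seq.upper()) if residue == "C"]


def looks_kunitz_like(domain: str, min_len: int = 50, max_len: int = 70) -> bool:
    """Crude screen: domain-sized sequence containing exactly six cysteines.

    The length window is an illustrative assumption; the spacing of the
    cysteines is not checked.
    """
    return min_len <= len(domain) <= max_len and len(cysteine_positions(domain)) == 6


# Placeholder 58-residue sequence (hypothetical) with six cysteines.
toy_domain = (
    "AAAAC" + "A" * 8 + "C" + "A" * 15 + "C" + "A" * 7 + "C"
    + "A" * 12 + "C" + "A" * 3 + "C" + "AAA"
)

print(len(toy_domain), cysteine_positions(toy_domain), looks_kunitz_like(toy_domain))
```

For a tandem molecule such as the two-domain protein described above, the same check would have to be applied to each domain separately after splitting the sequence, since the full-length protein exceeds the length window.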
We have also found that contig 264 assigns for cystatin-like molecules. Cystatins are tight and reversible inhibitors of the cysteine proteinases and are present in a variety of mammalian and non-mammalian tissues including snake venoms (Aird, 2002). According to our library, B. gabonica also contains the full-length clone coding for a cystatin-like protein called herein Bitiscystatin (AY430403). In addition, BG-13 sequence KVGXLYXRDVMDPEVQXAA is similar to the N-terminus of B. arietans cystatin (gi 118194). The CLUSTAL alignment of B. gabonica and other cystatins and the unrooted cladogram of all sequences are shown in the Figure 8 of the supplemental data.
C-type lectins are molecules containing a carbohydrate-recognition domain (CRD). Most C-type lectins are Ca2+ dependent; however, many of them have lost their sugar-binding properties and have evolved to interact with platelet receptors and/or blood coagulation factors (Markland, 1998). Notably, snake venoms are a rich source of C-type lectins, and not surprisingly our library also contains a large amount of cDNA coding for this family of proteins. Among the cDNA we have sequenced, contig 218 assigns for a fibrinogen-clotting inhibitor from Gloydius halys brevicaudus (gi 4337050) and contig 216 assigns for Factor IX/X-binding protein from Trimeresurus flavoviridis. Also, contig 215 assigns for a GPIb agonist from Agkistrodon acutus and may affect platelet function either by direct agglutination of platelets or through binding to von Willebrand factor (Matsui et al., 2002). Finally, contig 15 assigns for a galactose-binding lectin from the venom of Trimeresurus stejnegeri (gi 7674107). Consistent with the results described above, we have found for BG-4 the sequence DFEXPSEWSAYGXHXYRAF, in addition to BG-5 (DQGXLPDWSAYEQHXY), BG-7 (DEGXLPGWSLYE), and BG-2 (DFGXLSDWSXYEQH), which resemble the N-terminus of a C-type lectin from B. arietans, DFQCPSEWSAYGQHCYR (Harrison et al., 2003). BG-3 (DFGA) and BG-10 (DQGALPDTSYHQHHYYP) are also similar to B. arietans C-type lectin in addition to the DQDCLPDWSSHERHCY N-terminus of Echis pyramidum leakeyi C-type lectin (gi 33243102). The full-length clones of three B. gabonica C-type lectins have been obtained and named B. gabonica C-type lectin-1 (AY439477), B. gabonica C-type lectin-2 (AY429478), and B. gabonica C-type lectin-3 (AY429479). The CLUSTAL alignment of B. gabonica and other venom C-type lectins and the unrooted cladogram of all sequences are shown in Figure 9 of the supplemental data.
Snake and other venoms are rich sources of PLA2 (E.C. 3.1.1.4), a family of enzymes known to have edematogenic, antiplatelet, anticoagulant, mast cell degranulating, or neurotoxic properties (Bon et al., 1994). On the basis of primary structure and disulfide bond pairings, snake venom PLA2s were classified as class I (Elapidae) or class II PLA2s (Viperidae/Crotalidae). The catalytic site of class II PLA2s contains a highly conserved aspartic acid or lysine at position 49 (Ownby et al., 1999).
In our library, contigs 1 and 3 assign to PLA2, similar to one described in B. nasicornis venom (gi 67204), whereas contig 222 matches a PLA2 described in Echis pyramidum leakeyi (gi 27734438). The cDNAs sequenced in this library code for Lys49-PLA2; no Asp49-PLA2 has been sequenced. The presence of a PLA2 protein in this venom was confirmed by the BG-14 sequence HLEQFGNMIDHVSGRSFWLY, which is similar to the N-terminus DLTQFGNMIN previously reported for B. gabonica PLA2 (Botes and Viljoen, 1974). The full-length clone of B. gabonica PLA2 has been obtained and named B. gabonica PLA2-1 (AY430410). The CLUSTAL alignment of the B. gabonica PLA2 and other Lys49-PLA2s and the unrooted cladogram of all sequences are shown in Figure 10 of the supplemental data.
Disintegrins are cysteine-rich, low-molecular-weight platelet aggregation inhibitor polypeptides that usually contain an RGD sequence or other motifs that are recognized by integrins in different cell types (McLane et al., 1998). In most cases, venom disintegrins are encoded with a signal peptide, pre-peptide (pro-domain), metalloprotease, and disintegrin region on their common precursors (P-II class metalloproteases). It is suggested that the metalloprotease/disintegrin precursor is cleaved by protease(s), resulting in production of metalloprotease and disintegrin (Bjarnason and Fox, 1995; McLane et al., 1998). More recently, a new gene structure of the disintegrin family was identified in Agkistrodon c. contortrix and A. p. piscivorus venoms; it consists of a signal peptide, a pre-peptide (pro-domain), and a disintegrin domain, and lacks the protease domain (Okuda et al., 2002).
In our library, contigs 119 and 127 assign, respectively, for disintegrins similar to acostatin from the venom of Agkistrodon contortrix contortrix (Okuda et al., 2002) and eristocophin I from Eristicophis macmahoni (gi 6225272). Interestingly, cluster 119 contains sequences that code for two proteins, respectively called herein B. gabonica disintegrin-1 (gabonin-1, AY430904) and B. gabonica disintegrin-2 (gabonin-2, AY430505). Remarkably, these two protein sequences were identical except for nine amino acids that occur between the cysteine residues that form the putative acidic hairpin loop where the disintegrin domain is found. One of these sequences contains a typical RGD sequence known to bind to β3 integrins (McLane et al., 1998), whereas the second sequence contains the motif MLDG, known to interact with integrin α9β1 and to affect neutrophil function (McLane et al., 2000). Since a typical signal peptide and a pre-peptide region were found for both gabonin-1 and -2, it is clear that these proteins, together with the acostatin and piscivostatin α chains, are new members of the short coding region family of disintegrins (Okuda et al., 2002). The CLUSTAL alignment of gabonin-1 and -2 with acostatin and piscivostatin is shown in Figure 4A. The schematic domain structure of this family of proteins is shown in Figure 4B (Okuda et al., 2002).
Consistent with these contigs, Edman degradation of the protein band BG-12 yields the sequence NSAHPXXDPVTXK, which matches the N-terminus NSANPCCDPITCK of eristocophin (gi 265034). Since the BG-12 N-terminus also matches the N-terminus found for the disintegrin domain of B. gabonica metalloprotease-4 (Figure 3A), it is unclear whether the protein we have identified as a disintegrin is a processed form of a B. gabonica venom P-II metalloprotease, a short coding region disintegrin, or both. Finally, the finding that B. gabonica contains disintegrins reinforces the notion that B. gabonica venom targets hemostasis and may also indicate that the hemostatic disturbance found after B. gabonica envenomation is mediated, at least in part, by these molecules.
LAO are widely found in snake venoms and are thought to contribute to toxicity upon envenomation. It has been shown that these enzymes affect platelets, induce apoptosis, and have hemorrhagic effects (Aird, 2002). In our library, contig 165 (AY434453) assigns for a truncated clone coding for proteins with sequence homology to apoxin-1, a LAO and apoptosis inducer from Crotalus atrox venom (gi 5565692). Contigs 181 and 182 are similar to LAO with platelet and coagulation inhibitory properties isolated from Agkistrodon halys blomhoffii (gi 15887054). In addition, the N-terminus of BG-6, ADDKNPLEEXFRESSYEEFL is almost identical to the N-terminus ADDRNPLEECFRETDYEEFL of LAO from A. h. blomhoffii venom (gi 15887054), confirming the presence of this family of enzymes in B. gabonica venom. It remains to be demonstrated how LAO from B. gabonica venom affect hemostasis.
Snake venoms are a rich source of nucleotidases, and their participation in envenomation has been reviewed recently (Aird, 2002). In our library, contig 40 has a truncated clone whose sequence is similar to the sequence coding for an ectonucleotidase from the electric ray electric lobe (gi 112824). Nucleotidases inhibit platelet aggregation, and it appears that B. gabonica may affect platelet function by removal of ADP. We have also identified cDNA coding for endonucleases, a family of enzymes ubiquitously found in snake venoms. Venom endonucleases work together with venom and endogenous phosphodiesterases, degrading nucleic acids to free nucleotides, which serve as substrates for 5′-nucleotidases, which, in turn, liberate free nucleosides. Adenosine, in particular, is a potent vasodilator and inhibitor of platelet aggregation (Aird, 2002).
In our library, we have found cluster 227, which matches VEGF from Gallus gallus (gi 27368068). The VEGFs are the most potent vascular permeability factors known; they characteristically cause a reversible increase in permeability and have been described in venoms (Aird, 2002). The full-length clone of B. gabonica VEGF has been obtained and named B. gabonica VEGF (AY429481). This protein may be involved in the edema induced by B. gabonica bite. The CLUSTAL alignment of B. gabonica and other venom VEGFs and the unrooted cladogram of all sequences are shown in Figure 11 of the supplemental data. In addition, cluster 219 (AY430406) in our library is similar to NGF from B. jararacussu venom (gi 15407254). NGF is ubiquitous in snake venoms and exhibits non-neuronal effects such as the induction of plasma extravasation and histamine release from whole blood cells. Although we could not find the N-terminus of growth factors in any of the bands shown in Figure 1, this paper describes for the first time transcripts for this family of proteins in the B. gabonica venom gland.
We have found an abundant contig 188 containing 28 truncated cDNAs (AY434452) whose sequence has matches to the 3′ untranslated region of BPP from A. h. blomhoffi (gi 427226). BPP were first isolated in the venom of the B. jararaca snake and shown to display intense hypotensive properties (Aird, 2002). We suggest that this family of peptides is involved in the hypotension associated with B. gabonica envenomation.
We have sequenced other cDNA (contig 28) whose sequences are similar to a cytokine-like protein that inhibits insulin secretion recently described (Zhu et al., 2002). This is the first description of this family of proteins in snake venoms. By immunohistochemistry it was shown that a member of this cytokine family was expressed prominently in the vascular endothelium, particularly in capillaries (Zhu et al., 2002). The function of this cytokine-like protein in snake venom remains to be determined, but it may be that it somehow affects vascular biology. The full-length clone of B. gabonica cytokine-like protein has been obtained and named B. gabonica cytokine-like protein-1 (AY429480). The CLUSTAL alignment of B. gabonica and other cytokine-like proteins and the unrooted cladogram of all sequences are shown in Figure 5.
Finally, no matches have been found for some clusters, and these were assigned as unknowns (not shown, available on request). In some selected cases, we have named hypothetical proteins (HP) when a sequence without database hits has an open-reading frame (ORF) containing a methionine, a stop codon, and a putative signal peptide (Table 2).
To gather the maximum amount of information about the putative secreted proteins from the B. gabonica venom gland, selected sequences presented in Table 1 were re-sequenced and extended to obtain, when applicable and possible, their full-length cDNA. The full coding sequences with database hits were then searched again against the NR protein database with BLAST and submitted to the SignalP server to confirm, respectively, sequence similarity and the presence of a signal peptide. When a signal peptide was predicted, the molecular weight and the pI of the mature protein were also calculated and the putative function annotated. Most of the sequences displayed in Table 2 are full-length clones, with the exception of the metalloproteases, LAO, and BPP (see below). It may be that the cDNAs coding for these proteins contain an SfiI site that is cleaved during the cDNA library construction (see Materials and methods). Although our library is PCR-amplified, it is clear that the base changes observed in different contigs, including contigs 34 and 60, among others, are not artefactual. In fact, similar base changes have been found for all individual sequences of a given contig; this diversity can be explained by accelerated evolution, which has been well documented in snake venom glands (Deshimaru et al., 1996). It is also known that PCR-based libraries whose cDNAs have not been size-fractionated may be enriched in small cDNAs. However, our libraries were constructed using low, medium, and high molecular weight cDNAs that had been separated by gel filtration (see Materials and methods in the supplemental data). This separation minimizes the preferential amplification of small transcripts over larger ones, and the preferential ligation of small-sized cDNAs over larger ones, in the TripleX2 vector. Of note, the putative proteins coded by the most abundant clusters have been identified by SDS/PAGE (e.g., PLA2, protease inhibitors, C-type lectins, serine protease, and disintegrins), with the exception of the metalloproteases, whose N-termini are frequently found to be blocked (Fox et al., 2002), and the BPP, which are not appropriately separated by 4–12% PAGE due to their low molecular weight. Accordingly, PCR-based libraries appear to provide a reasonable qualitative estimate of the transcripts expressed in a given tissue. Alternatively, construction of a normalized B. gabonica cDNA library could be a useful strategy for identifying rare transcripts that may have been missed in our library. Likewise, separation of venom proteins by 2-D gel followed by Edman degradation may well complement the data obtained herein using one-dimensional PAGE (Fox et al., 2002).
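The molecular weight calculation for mature proteins mentioned above (precursor minus the predicted signal peptide) can be sketched as follows. This is an illustration under stated assumptions, not the software used in this study: the function name `mature_molecular_weight` is hypothetical, the signal peptide length is assumed to come from an external predictor such as SignalP, the residue masses are approximate average values, and disulfide bonds and post-translational modifications are ignored.

```python
# Approximate average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def mature_molecular_weight(precursor, signal_len):
    """Average molecular weight (Da) of the chain left after signal peptide removal.

    signal_len is the number of N-terminal residues predicted to be cleaved
    (assumed to be supplied by an external tool).
    """
    mature = precursor.upper()[signal_len:]
    if not mature:
        raise ValueError("no residues remain after removing the signal peptide")
    return sum(RESIDUE_MASS[aa] for aa in mature) + WATER
```

For example, `mature_molecular_weight("MKTGG", 2)` sums the residue masses of T, G, and G plus one water. Estimating the pI would additionally require a set of side-chain pKa values, which differ between published tables and are therefore not included in this sketch.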
The summary of our findings is presented in Table 2. To our knowledge, this table is the first attempt to create a comprehensive catalog of the cDNA from the B. gabonica venom gland. Eventually, such a catalog will contain a non-redundant set of full-coding cDNA sequences covering every B. gabonica venom gland cDNA and possibly each venom protein function. Thus, this transcript and protein catalog for B. gabonica and other snakes could form part of a large-scale and comprehensive functional analysis of snake venom genes and cDNA. Together with information derived from the venom gland genome, proteome (Fox et al., 2002), and microarrays (Gallagher et al., 2003), information provided by this catalog could be an essential tool to understand snake physiology (Perales and Domont, 2002) and the molecular basis of envenomation, as well as to find potential candidates for serum production (Theakston et al., 2003) and/or tools to study cell biology and biochemistry (Ménez, 1998).
Snake envenomation employs three well-integrated strategies: prey immobilization via hypotension, prey immobilization via paralysis, and prey digestion (Aird, 2002). Although the identification of the toxin clusters does not allow us to determine quantitatively the contribution of each protein cluster to the envenomation, it allows us to speculate about the mechanisms of envenomation by B. gabonica venom. It is remarkable that proteins such as metalloproteases, serine proteases, C-type lectins, PLA2, Kunitz inhibitors, growth factors, and LAO account for most of our sequences. As described above and reviewed elsewhere (Aird, 2002), these proteins act on the hemostatic system and/or affect vascular biology. In this respect, our B. gabonica results resemble those of an expressed sequence tag (EST) approach reported for B. insularis, where a large number of cDNAs code for metalloproteases, BPP, C-type lectins, serine protease, PLA2, and growth factors (Junqueira-de-Azevedo and Ho, 2002). We have also found an abundant cluster whose sequences match the 3′ untranslated region of the cDNA of A. h. blomhoffi BPP, a family of peptides also abundant in the B. insularis cDNA library (Junqueira-de-Azevedo and Ho, 2002) (see Table 2). Because we also found sequences coding for kallikrein-like enzymes, it is plausible that these enzymes and BPP are primarily responsible for the hypotension associated with B. gabonica and possibly B. insularis envenomation. The similarity in cDNA composition between the B. gabonica and B. insularis libraries is also consistent with the symptoms resulting from envenomation by Bitis and Bothrops spp., which are characterized by consumption coagulopathy, hypotension, and local damage (Aird, 2002).
It is worth noting that the description of the B. gabonica venom gland cDNA database matches biologic activities described before for this venom, including the molecules involved in hypotension, bleeding, digestion, and tissue damage (Marsh et al., 1997). This indicates that an approach combining cDNA library construction, massive sequencing, and bioinformatic analysis, in addition to Edman degradation of the main proteins, may be useful for studying exogenous secretion from different venom glands and for developing recombinant antigens for antibody production.
We thank Drs. Thomas E. Wellems, Robert W. Gwadz, and Thomas J. Kindt for encouragement and support, and Brenda Rae Marshall for editorial assistance.