Human envenomation by snakes is a worldwide issue that claims more than 100,000 lives per year and exacts untold costs in the form of pain, disfigurement, and loss of limbs or limb function [1-3]. Despite the significance of snakebites, their treatments have remained largely unchanged for decades. The only treatments currently available are traditional antivenoms derived from antisera of animals, usually horses [4], inoculated with whole venoms [5, 6]; such an approach is the only readily available option for largely uncharacterized, complex mixtures of proteins such as snake venoms. Although often lifesaving and generally effective against systemic effects, these antivenoms have little or no effect on local hemorrhage or necrosis [7-9], which are major aspects of the pathology of viperid bites and can result in lifelong disability [4, 5]. These traditional treatments also sometimes lead to adverse reactions in patients [6]. Advances in treatment approaches will depend on a complete knowledge of the nature of the offending toxins, but current estimates of the numbers of unique toxins present in snake venoms are in excess of 100 [10], a number not approached in even the most extensive venom-characterization efforts to date [11].
The significance of snake venoms extends well beyond the selective pressures they may directly impose upon human populations. Snake venoms have evolutionary consequences for those species that snakes prey upon [12, 13], as well as species that prey upon the snakes [14], and their study can therefore provide insights into predator-prey coevolution. Snake venom components have been leveraged as drugs and drug leads [15-17] and have been used directly as tools for studying physiological processes such as pain reception [18]. In addition to the significance of the toxins, the nature of the extreme specialization of snake venom glands for the rapid but temporary production and export of large quantities of protein could provide insights into basic mechanisms of proteostasis, the breakdown of which is thought to contribute to neurodegenerative diseases such as Parkinson’s and Alzheimer’s [19].
The eastern diamondback rattlesnake (Crotalus adamanteus) is a pit viper native to the southeastern United States and is the largest member of the genus Crotalus, reaching lengths of up to 2.44 m [20]. The diet of C. adamanteus consists primarily of small mammals (e.g., squirrels, rabbits, and mouse and rat species) and birds, particularly ground-nesting species such as quail [20]. Because of its extreme size and consequent large venom yield, C. adamanteus is arguably the most dangerous snake species in the United States and is one of the major sources of snakebite mortality throughout its range [21].
Crotalus adamanteus has recently become of interest from a conservation standpoint because of its declining range, which at one time included seven states along the southeastern Coastal Plain [22]. This species has now apparently been extirpated from Louisiana and is listed as endangered in North Carolina [23, 24]. As a consequence of recent work by Rokyta et al. [11] based on 454 pyrosequencing, the venom of C. adamanteus is among the best-characterized snake venoms; 40 toxins have been identified.
Transcriptomic characterizations of venom glands of snakes [25-28] and other animals [29-32] have relied almost exclusively on low-throughput sequencing approaches. Sanger sequencing, with its relatively long, high-quality reads, has been the only method available until recently and has provided invaluable data on the identities of venom genes. Because venomous species are primarily nonmodel organisms, high-throughput sequencing approaches have been slow to pervade the field of venomics (but see Hu et al. [33]), despite becoming commonplace in other transcriptome-based fields. Rokyta et al. [11] recently used 454 pyrosequencing to characterize venom genes for C. adamanteus. More recently, Durban et al. [34] used 454 sequencing to study venom-gland transcriptomes using a mix of RNA from eight species of Costa Rican snakes. Whittington et al. [30] used a hybrid approach with both 454 and Illumina sequencing to characterize the platypus venom-gland transcriptome, although they had a reference genome sequence, making de novo assembly unnecessary. Pyrosequencing is expensive and low-throughput relative to Illumina sequencing, and its high error rate, particularly for homopolymer errors [35], significantly increases the difficulty of identifying coding sequences without reference sequences.
We sequenced the venom-gland transcriptome of the eastern diamondback rattlesnake with Illumina technology, using a paired-end approach coupled with short insert sizes to effectively produce longer, high-quality reads of approximately 150 nt and thereby facilitate de novo assembly (an approach similar to that of Rodrigue et al. [36] for metagenomics). The shorter read length relative to 454 sequencing was compensated for by an increase of more than two orders of magnitude in the number of reads. We demonstrated de novo assembly and analysis of a venom-gland transcriptome using only Illumina sequences and provided a comprehensive characterization of both the toxin and nontoxin genes expressed in an actively producing snake venom gland.