Analysis of genome sequence data has led to the discovery of large families of serpins in multicellular organisms, including 36 in humans, nine in
Caenorhabditis elegans [
16], 29 in
Drosophila [
26] and plants [
53]. In the MEROPS database, 17 and 18 serpin sequences are listed for
Aedes aegypti and
Anopheles gambiae, respectively. In ticks, documentation of multiple serpin-encoding cDNAs has provided indirect evidence that ticks encode large serpin families. For instance, we recently described 17 serpin cDNAs that are expressed by
A. americanum during feeding [
33]. Here we describe the annotation and characterization of 45 serpin genes in the
I. scapularis genome. The observed high amino acid sequence identity between
I. scapularis and
I. ricinus serpins was not surprising, as these two ticks belong to the same genus. It is important to point out that eight of the 45 annotated serpins were not represented in the preliminary peptide build at VectorBase. This finding may raise the prospect of error in the annotations reported here. However, this possibility is ruled out, as ESTs of six of the eight serpins (S1, 13, 23, 24, 33 and 39) were present in the trace archive database, while coding regions of serpins S26 and 32 were amplified from unfed and partially fed ticks (see figure ).
The adoption of the consensus serpin secondary structures [
11,
12] and the high conservation of the core amino acid residues [
51] that underpin the structure and functionality of serpins strongly suggest that
I. scapularis serpins are functional. The majority of known serpins function as inhibitors of serine proteinases, hence the name [
54]. However, others with activity against cysteine proteinases and those with no inhibitory function have also been described [
12]. Although we are able to distinguish between inhibitory and non-inhibitory serpins on the basis of sequence analysis [
11], the data available in this study are insufficient to specify their preferred target proteinases. Putative RCLs and scissile bonds of serpins in this study were predicted based on the consensus that there are 17 amino acid residues between the start of the RCL hinge region and the scissile bond [
11,
45,
51]. As some of the characterized serpins such as α
2-antiplasmin [
11] or serpin1k from
Manduca sexta [
55] utilize shorter RCLs, we interpret our predictions of RCLs and scissile bonds with caution.
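The spacing rule described above can be illustrated with a minimal sketch. It assumes that the hinge start is the P17 residue, so that the 17 residues P17 to P1 precede the scissile bond (between P1 and P1'). The function name, the `hinge_start` argument and the toy sequence are hypothetical and are not taken from this study; locating the hinge itself is left to the caller.

```python
def predict_scissile_bond(seq, hinge_start, spacing=17):
    """Predict P1 and P1' from a known RCL hinge start.

    Assumption: `hinge_start` is the 0-based index of P17, so P1 lies
    `spacing - 1` residues downstream and the scissile bond follows P1.
    """
    p1_index = hinge_start + spacing - 1
    p1_prime_index = p1_index + 1
    if hinge_start < 0 or p1_prime_index >= len(seq):
        raise ValueError("sequence too short for the requested spacing")
    p1 = seq[p1_index]
    return {
        "p1_index": p1_index,          # 0-based position of P1
        "p1": p1,
        "p1_prime": seq[p1_prime_index],
        "basic_p1": p1 in ("R", "K"),  # Arg or Lys at P1
    }


# Toy sequence (not a real serpin): hinge starts at index 5.
toy = "AAAAA" + "E" + "G" * 15 + "R" + "S" + "AAAAA"
result = predict_scissile_bond(toy, hinge_start=5)
```

Serpins with shorter RCLs would be mispredicted by a fixed spacing, which is why `spacing` is exposed as a parameter rather than hard-coded.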
Our finding that 82% of the full-length serpins in this study have signal peptides is consistent with observations in humans where the majority of known serpins exist in the extracellular form [
14]. These findings are not unique: similar results were reported in
A. americanum, where 13 of the 17 putatively inhibitory serpins were predicted to be extracellular [
33]. From the perspective of finding target antigens for vaccine development, it is encouraging to note that the majority of
I. scapularis serpins are putatively extracellular, as they will be accessible to host immune response factors. Predictions based on sequence analysis may not be consistent with the situation
in vivo. However, it is interesting to note that the four serpin sequences (S30, 32, 35 and 38) that were predicted to be intracellular, based on the lack of a signal peptide, also possess cysteine ("C") residues in the exposed regions of their RCL, a feature that has been observed in many intracellular serpins [
14].
The use of alternatively spliced RCLs appears to be a widespread strategy in insects to diversify the range of target proteinases that can be controlled by a single serpin gene [
56-
58]. A classic observation of this phenomenon is the serpin gene-1 of the tobacco hornworm,
M. sexta, which has 12 alternatively spliced RCLs [
56]. Effectively, this gives rise to 12 serpins regulating 12 different proteolytic pathways. An interesting structural feature among the 12 variants of
M. sexta serpin gene 1 is that the first 336 amino acids are identical, with differences restricted to the RCL [
56]. The identity patterns among
I. scapularis serpin sequences, in which stretches of identical and variable domains are spread across the entire sequence, suggest that the diversity among serpins in this study may have arisen by duplication rather than alternative splicing of RCL-encoding exons.
From the perspective of understanding how the tick manipulates the mammalian host's defense against tick feeding, the finding of 18 serpins with basic residues at their P1 sites was exciting. In humans, this feature is associated with key serpins such as α
1-antichymotrypsin, α
2-antiplasmin, antithrombin III, protein C inhibitor and C1 inhibitor [
11], which regulate important pathways such as inflammation, blood coagulation and complement activation. As these pathways are thought to represent the mammalian host's defense against tick feeding [
59,
60], it is tempting to speculate that ticks could utilize some of these serpins to manipulate host defense to facilitate tick feeding and disease transmission. It is also possible that these serpins are not directly involved in the facilitation of feeding. However, like their mammalian counterparts, they may be involved in the regulation of important pathways in the tick, which, if disrupted, could affect the capacity of ticks as vectors.
Although the biological significance of gene expression data would be strengthened if correlated with protein production, our RT-PCR data provide some useful insights into the probable biological roles of the serpin genes in this study in the physiology of
I. scapularis. We speculate that
I. scapularis genes that were induced or upregulated after ticks had penetrated host skin may be involved in facilitating blood meal uptake. This is particularly true for the 11 genes that were induced in both SG and MG (S1, 2, 3, 4 and 7) or in SG alone (S5, 6, 16, 26, 35 and 36) in ticks that were fed for 6–24 h. This period corresponds to the preparatory feeding phase, when the tick attaches onto host skin and creates its feeding lesion [
2]. During this period the tick must overcome inflammation and blood coagulation to successfully start the feeding process. Similarly, the group of serpin genes (S17, 23, 25, 32, 37, 38 and 40) that were constitutively expressed but whose transcript abundance increased with tick feeding may also play a role in facilitating blood meal uptake. Genes that were constitutively expressed (S9, 12 and 27 in the MG, and S19 in both SG and MG) but progressively downregulated as ticks continued to feed could be involved in regulating physiological processes at the front end of the tick feeding process. The expression of S10, 14, 18, 21 and 22 specifically or predominantly in the MG is interesting, as it signals a potential role for these genes not only in blood meal processing but also in the crossing of the gut barrier by pathogens. From the perspective of our long-term interest in finding tick proteins that are used by ticks to evade host immunity, it was exciting to note that some serpins were specifically expressed in SG. It will be important to investigate whether the products of any of these genes are injected into the host during tick feeding. It is possible that the genes analyzed here are expressed in multiple tick organs besides the SG and MG. However, given our long-term interest in understanding the molecular mechanisms that underlie tick-host interactions, our analysis here is biased toward biological functions of serpins at the SG and MG levels. The SG is critical for feeding and disease transmission, while the MG is important for blood meal processing and the passage of pathogens from the blood meal into the tick hemolymph [
2]. Our future questions will thus address the role of the serpins in facilitation of tick feeding and blood meal processing.
Most known serpins are glycosylated [
12,
16], and thus it is not surprising that 40 of the 43 serpin sequences that were tested are predicted to possess putative N-glycosylation sites. As pool feeders, ticks accomplish feeding by lacerating small blood vessels and then sucking blood from the hematoma that forms in the feeding site [
2]. In order to complete feeding, ticks must prevent host blood from coagulating to ensure continued blood flow into the hematoma for the entire tick feeding period, which may last 10–14 days for adults and 4–7 days for immature ticks [
2]. From the perspective of solving the paradox of how the tick interferes with the coagulation cascade of its mammalian host, it was interesting to note that ~9% (4/45) of the sequences contain the RGD motif. Previous studies have shown that tick proteins containing the RGD motif such as variabilin [
61] and savignygrin [
62] were potent inhibitors of platelet aggregation. Platelet aggregation is critical to stopping bleeding of injured small blood vessels such as occurs at the tick bite [
59]. Thus, if functional, the RGD motif-containing serpins could represent important molecular targets for countering the ability of ticks to suppress the mammalian host's blood clotting system. In addition to the anti-platelet aggregation function, RGD motif-containing proteins are also involved in regulating cell-cell interactions in mammals [
63], immunity in arthropods [
64], and plants [
65]. From the foregoing, it is plausible that the RGD motif-containing serpins could also be involved in regulating multiple other functions in the tick, besides platelet function at the tick feeding site. It is interesting to note that our RT-PCR data suggested that the expression of three (S1, 7 and 16) of the four RGD motif-containing serpin genes was responsive to tick feeding activity (see figure ).
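The two sequence features discussed in this paragraph can be located with a simple pattern scan, sketched below. This is an illustration only, not the prediction method used in this study: it assumes the canonical N-glycosylation sequon N-X-S/T (X being any residue except proline) and a literal RGD tripeptide. The function names and the toy peptide are hypothetical.

```python
import re

# Literal RGD tripeptide.
RGD_PATTERN = re.compile(r"RGD")
# Canonical sequon N-X-[S/T], X != P; the lookahead lets overlapping
# sequons (e.g. "NNTS") each be reported.
SEQUON_PATTERN = re.compile(r"(?=N[^P][ST])")


def find_rgd_motifs(seq):
    """Return 1-based start positions of RGD motifs."""
    return [m.start() + 1 for m in RGD_PATTERN.finditer(seq)]


def find_n_glyc_sequons(seq):
    """Return 1-based positions of the Asn in each N-X-[S/T] sequon."""
    return [m.start() + 1 for m in SEQUON_PATTERN.finditer(seq)]


# Toy peptide (not a real serpin sequence).
toy = "MKNASRGDLNPTQNNTS"
rgd_positions = find_rgd_motifs(toy)         # R at position 6
sequon_positions = find_n_glyc_sequons(toy)  # N10-P-T is skipped (proline)
```

A sequon match only marks a potential site; whether it is glycosylated, or whether an RGD motif is exposed and functional, cannot be decided from the primary sequence alone.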
Compared with the 3100 Mbp human genome, which encodes at least 36 serpin genes [
16], the number of serpin genes (45) identified in
I. scapularis, which has a 2100 Mbp genome, is considerably high. While the biological significance of the high number of serpin genes in
I. scapularis cannot be ascertained at present, we speculate that it may signal the significance of serpins in tick physiology. Given the lack of genome sequence data for many tick species, the sizes of their serpin families remain unknown. Ticks are diverse, both in terms of their biology [
2] and their genome sizes [
66-
68]. Thus, it is likely that the sizes of serpin families vary among tick species. Nevertheless, this study provides some insight into the probable sizes of serpin families in ticks.
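The comparison of serpin gene counts against genome size made above can be expressed as a gene density. The short calculation below uses only the figures quoted in the text (36 genes in 3100 Mbp for humans, 45 genes in 2100 Mbp for I. scapularis). The helper name is hypothetical, and genes per Gbp is a descriptive ratio chosen here for illustration, not a measure used in this study.

```python
def genes_per_gbp(n_genes, genome_mbp):
    """Serpin genes per 1000 Mbp (1 Gbp) of genome."""
    return n_genes / genome_mbp * 1000.0


human_density = genes_per_gbp(36, 3100)  # about 11.6 per Gbp
tick_density = genes_per_gbp(45, 2100)   # about 21.4 per Gbp
fold_difference = tick_density / human_density  # about 1.85

print(f"human: {human_density:.1f} serpin genes per Gbp")
print(f"I. scapularis: {tick_density:.1f} serpin genes per Gbp")
print(f"fold difference: {fold_difference:.2f}")
```

On these figures the I. scapularis genome carries roughly 1.8 times as many serpin genes per unit of genome as the human genome, although gene density alone says nothing about function.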