Ticks evolved various mechanisms to modulate their host’s hemostatic and immune defenses. Differences in the anti-hemostatic repertoires suggest that hard and soft ticks evolved anti-hemostatic mechanisms independently, but raise questions on the conservation of salivary gland proteins in the ancestral tick lineage. To address this issue, the sialome (salivary gland secretory proteome) of the soft tick Argas monolakensis was determined by proteomic analysis and cDNA library construction of salivary glands from fed and unfed adult female ticks. The sialome is composed of ~130 secretory proteins, of which the most abundant protein folds are the lipocalin, BTSP, BPTI and metalloprotease families, which also comprise the most abundant proteins found in the salivary glands. Comparative analysis indicates that the major protein families are conserved in hard and soft ticks. Phylogenetic analysis shows, however, that most gene duplications are lineage specific, indicating that the protein families analyzed possibly evolved most of their functions after divergence of the two major tick families. In conclusion, the ancestral tick may have possessed a simple (few members for each family), but diverse (many different protein families) salivary gland protein domain repertoire.
Blood-feeding behavior evolved independently in several insect orders: flies (Diptera), true bugs (Hemiptera), lice (Phthiraptera), fleas (Siphonaptera) and moths (Lepidoptera) (Ribeiro, 1995). In arachnids, hematophagous behavior evolved independently in ticks (Ixodida) and mesostigmatid mites (Mans and Neitz, 2004a; Radovsky, 1969). In each case the blood-feeding arthropod had to evolve mechanisms to modulate host defenses such as the hemostatic and immune systems. The mechanisms evolved by hematophagous arthropods mostly involve the secretion of salivary gland derived proteins during feeding, and it is these proteins that target important host processes (Ribeiro, 1995). Adaptation to a blood-feeding environment by arthropods can thus be redefined as a study of the structural and functional evolution of the sialome (salivary gland proteomes of hematophagous organisms). From this perspective, the independent adaptation to blood-feeding is clear when sialomes from different species are compared (Valenzuela et al., 2002a; Ribeiro and Francischetti, 2003; Champagne, 2004). In most cases the mechanisms to counteract the host’s immune and hemostatic defenses differ between various groups of hematophagous organisms. This also holds for the two major tick families, the soft (Argasidae) and hard (Ixodidae) ticks, and suggests that soft and hard ticks adapted to a blood-feeding environment independently (Mans et al., 2002a; Mans and Neitz, 2004a). Even so, phylogenetic analysis shows that the hard and soft tick families group as a monophyletic clade, indicating that they shared a common ancestor to the exclusion of other mites (Black and Piesman, 1994; Barker and Murrell, 2004). It was, however, also indicated that the ancestral tick lineage must have had some form of host-association, such as feeding on lymphatic fluid of the living host or scavenging (Mans and Neitz, 2004a). This implies that the ancestral tick lineage must have had features that are conserved in both families.
While the anti-hemostatic factors of soft and hard ticks differ in their mechanisms of action and protein families that they belong to, the two tick families share common protein folds in their salivary glands, such as the basic pancreatic trypsin inhibitor (BPTI) and lipocalin protein families (Mans and Neitz, 2004a). The BPTI/Kunitz family is composed of proteins that exhibit the basic pancreatic trypsin inhibitor fold (Laskowski and Kato, 1980). BPTI-like proteins act as thrombin, fXa and platelet aggregation inhibitors in soft ticks (Waxman et al., 1990; Karczewski et al., 1994; Van de Locht et al., 1998; Joubert et al., 1998; Mans et al., 2002b; Mans et al., 2007). In hard ticks, BPTI-like proteins inhibit the fVIIa/TF complex (Francischetti et al., 2002; Francischetti et al., 2004). The other major protein family so far described for ticks is the lipocalin family. In soft ticks, lipocalins function as anti-complement factors (Nunn et al., 2005), inhibitors of platelet aggregation (Waxman and Connolly 1993; Keller et al., 1993) and toxins (Mans et al., 2002c) and have been shown to be abundantly expressed in salivary glands (Mans et al., 2001; Mans et al., 2003; Oleaga et al., 2007). In hard ticks lipocalins that scavenge histamine and serotonin have also been described (Paesen et al., 1999; Paesen et al., 2000; Sangamnatdej et al., 2002). Recently high throughput sequencing of salivary gland transcripts from Ixodes scapularis and I. pacificus has shown that hard ticks possess more than 25 protein families in their sialomes (Valenzuela et al., 2002b; Francischetti et al., 2005; Ribeiro et al., 2006). The question raised is how many of these protein families are also represented in soft ticks? Data on this could indicate the number of conserved protein families found in the salivary glands of the ancestral tick lineage and whether any specific orthologs existed between hard and soft ticks before their divergence.
To address this, the salivary gland transcriptome and proteome of the soft tick Argas monolakensis were characterized. A. monolakensis is limited to islands located on Mono Lake, California, where it feeds annually on the breeding California gull (Larus californicus) population (Schwan et al., 1992). It is problematic for ornithologists working on the islands and is a potential vector of Mono Lake virus (Reoviridae: Orbivirus) (Schwan and Winkler, 1984; Schwan et al., 1988). We show that the major protein families are conserved between hard and soft ticks. Even so, the numerous gene duplication events observed within individual protein families appear to be limited to lineage specific expansions, indicating that most of the sialome diversity observed for the different tick families evolved after their divergence.
Argas monolakensis ticks were collected as described (Mans et al., 2007), and salivary glands were dissected and frozen at −70°C until use. Glands from unfed and fed (fed to repletion on chickens 4 days prior to dissection) adult females were thawed in RNAlater and cDNA libraries constructed as previously described (Valenzuela et al., 2002b). One thousand and seven hundred and twenty eight plaques were picked and sequenced for the unfed and fed library, respectively. Sequences were cleaned up and clustered into contigs (synonymous with consensus sequences) using in-house bioinformatics programs as previously described (Ribeiro et al., 2006). Contigs obtained for individual clusters were analyzed using BLASTX to identify conserved proteins in the non-redundant database (Altschul et al., 1990). Potential protein coding sequences were manually identified from consensus sequences by the presence of an open reading frame, stop codon, poly-adenylation signal, poly-A tail and starting methionine. Protein domain data obtained from the BLASTX analysis were used to determine the coverage obtained for truncated ESTs that code for cytoplasmic proteins. Full-length secretory proteins were identified by the presence of a signal peptide using the SignalP3.0 server (Bendtsen et al., 2004). For truncated proteins, the presence of multiple cysteines, which would suggest disulphide bonds, was taken as evidence of secretion. The assembly of the consensus sequences and the full-length and truncated reading frames were confirmed by manual inspection.
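The manual criteria listed above (starting methionine, open reading frame, stop codon, poly-adenylation signal and poly-A tail) can be illustrated with a minimal sketch. This is not the in-house software used in the study: the function names, the forward-strand-only search, the canonical AATAAA signal and the minimum tail length of 10 are all assumptions made for illustration.

```python
# Hypothetical screen for the features used to recognise coding sequences.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG..stop ORF on the forward strand.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    Returns None when no complete ORF is found.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first methionine codon in this frame
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
                start = None  # look for a further ORF downstream
    return best


def has_polya_features(seq, orf_end, min_tail=10):
    """True if AATAAA occurs downstream of the ORF and the read ends in poly-A.

    Both the signal motif and min_tail are illustrative assumptions.
    """
    seq = seq.upper()
    return "AATAAA" in seq[orf_end:] and seq.endswith("A" * min_tail)
```

A consensus sequence that fails either check would, under these assumptions, be treated as truncated and inspected by hand, as the text describes.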
BLASTP analysis was initially used to assign translated consensus sequences to protein families (Altschul et al., 1990). Subsequently, PSI-BLAST analysis (Altschul et al., 1997) was used to confirm these assignments, as this approach allows the retrieval of all family members, including those that are distantly related. This method allowed assignment of some contigs to protein families that could not be assigned using BLASTP analysis.
Most of the tick sequences derived from salivary gland transcriptomes have been deposited as ESTs with no corresponding data in the protein databank. As such, the data exist in a form that makes it difficult to perform a comparative analysis of salivary gland sialomes that will allow the assignment of relative domain numbers to different tick species. In order to derive a more representative analysis of protein domains present in tick salivary glands, the EST datasets for different tick species were clustered as described above to obtain a non-redundant dataset of contigs for each. TBLASTN analysis using the predicted secretory proteins from A. monolakensis was then performed against these individual databases. Results were manually inspected and hits with e-values less than 10⁻⁵ were taken as positive hits to get a conservative estimate of protein domain numbers. Translated protein sequences were derived from the contigs for each species and used to perform a TBLASTN analysis against their own species specific database in order to find homologs.
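The e-value cut-off can be sketched as a simple filter. The example assumes that the TBLASTN results were saved in the standard 12-column tabular format, in which the e-value is the eleventh field. The function name is hypothetical, and the manual inspection step described above is not reproduced.

```python
def significant_hits(tabular_text, max_evalue=1e-5):
    """Return (query, subject, evalue) for hits with e-value below the cut-off.

    Assumes standard 12-column BLAST tabular output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore
    """
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue < max_evalue:
            hits.append((fields[0], fields[1], evalue))
    return hits
```

Counting the distinct subject contigs returned per protein family would then give the kind of conservative domain estimate referred to in the text.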
Salivary gland extract (SGE) was prepared as described previously (Mans et al., 2007). Briefly, salivary glands were dissected from female ticks in 20 mM Tris-HCl, 0.15 M NaCl, pH 7.4 by tearing the dorsal cuticle and removing the glands with forceps. Glands were washed in dissection buffer before being frozen at −70°C. Glands were reconstituted in 20 mM Tris-HCl, pH 7.4 and disrupted by sonication, before the cell debris was pelleted by centrifugation at 16000g for 20 minutes in a bench top centrifuge. The supernatant was removed and constituted the SGE.
SGE (10 gland equivalents) was heat treated at 80°C and centrifuged at 16000g for 10 minutes to pellet denatured proteins. The pellet (P) was dissolved in SDS-PAGE sample buffer and processed for one-dimensional electrophoresis by heating at 70°C for 10 minutes. Of this sample the equivalent of 5 glands was loaded per well. The soluble protein (S) was processed for SDS-PAGE in a similar manner and 5 gland equivalents were loaded as well. Electrophoresis was performed using standard conditions and protein blotted onto PVDF membranes before staining with Coomassie Brilliant Blue. Bands were cut out and analyzed by Edman sequencing.
Two-dimensional gel electrophoresis was performed using the ZOOM IPGRunner System (Invitrogen). Briefly, approximately 50 μg of salivary gland extract (5 salivary gland equivalents) were solubilized with 155 μl rehydration buffer (7 M urea, 2 M thiourea, 2% CHAPS, 20 mM DTT, 0.5% carrier ampholytes, pH3-10). The samples were absorbed overnight at room temperature on rehydration ZOOM strips (7 cm; pH3-10NL) before focusing under the manufacturer’s recommended conditions. The focused IPG strips were reduced, alkylated and equilibrated with reducing and then alkylation reagents dissolved in the sample buffer. The strips were then applied onto NuPAGE 4–12% Bis-Tris ZOOM gels (Invitrogen). The gels were run under MOPS buffer and stained with SeeBlue staining solution (Bio-Rad). A total of 78 spots were selected for tryptic digestion, based on their staining intensity. Spots picked were processed for tryptic digestion by standard mass spectrometry protocols that included reduction and alkylation before tryptic digestion. Tryptic digests were analyzed by coupling the Nanomate (Advion BioSciences), an automated chip-based nano-electrospray interface source, to a quadrupole time-of-flight mass spectrometer, QStarXL MS/MS System (Applied Biosystems/Sciex). Computer-controlled, data-dependent automated switching to MS/MS provided peptide sequence information. AnalystQS software (Applied Biosystems/Sciex) was used for data acquisition. Data processing and databank searching were performed with Mascot software (Matrix Science). The non-redundant protein database from the NCBI, National Library of Medicine, NIH, was used for the search analysis, as was a protein database generated during the course of this work.
Analysis of the A. monolakensis proteome by liquid chromatography followed the same approach previously used to identify the anti-clotting and anti-platelet agents from salivary gland extract (Mans et al., 2007). Briefly, SGE (100 gland equivalents) was applied to an anion exchange column and fractionated. The flow-through was applied to a cation exchange column and fractionated under the same conditions. Fractions that represented major peaks were pooled and applied to reversed phase chromatography. Individual peaks (116 fractions) were collected and analyzed by peptide mass fingerprinting as described above.
Protein families were aligned using ClustalX and manually checked and adjusted (Jeanmougin et al., 1998). Neighbor-joining analysis was performed using the Mega package (Kumar et al., 1994). The Poisson model for amino acid substitution was used and 10000 bootstraps were performed with complete deletion of gapped positions.
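For readers unfamiliar with the distance measure, the Poisson correction converts the observed proportion of differing sites, p, into an estimated number of substitutions per site, d = −ln(1 − p). The sketch below is an illustrative reimplementation rather than the Mega code: it applies complete deletion (removal of every alignment column that contains a gap in any sequence) before computing d. The function names and the toy alignment in the usage note are invented.

```python
import math


def complete_deletion(alignment):
    """Drop every column that contains a gap ('-') in any aligned sequence."""
    length = len(alignment[0])
    keep = [i for i in range(length) if all(seq[i] != "-" for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p) between two gap-free sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal length and non-empty")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 1.0:
        raise ValueError("distance undefined when every site differs")
    return -math.log(1.0 - p)
```

With the invented alignment `["MKTC-LA", "MKSCALA", "MRTCGL-"]`, complete deletion leaves five columns, and the first two sequences differ at one of them (p = 0.2), giving d = −ln(0.8) ≈ 0.223. A matrix of such pairwise distances is the input to neighbor-joining.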
A total of 3087 EST sequences were obtained and their analysis and Genbank accession numbers can be found in the supplementary materials. Clusterization of ESTs yielded 1472 contigs. Unknown sequences represented 38.1% of all sequences and 67% of all clusters, with an average of 1.2 sequences per cluster, indicating that most of them were singletons. House-keeping sequences represented 23.9% of all sequences, 21% of all clusters with an average of 2.4 sequences per cluster, indicating only a slight increase above the singleton background. In contrast, sequences coding for potential secreted proteins represented 37.8% of all sequences, 11% of all clusters and on average 7.2 sequences per cluster. The unknown sequences are mostly singletons that do not yield an open reading frame and were considered non-sense sequences with no informative value for cDNA library analysis. If these sequences are removed from the library analysis, house-keeping sequences represent 38% of all sequences and 64% of all clusters, while potential secretory sequences represent only 33% of all clusters, but 61% of all sequences. This indicates that potential secreted proteins, although being the lowest in numbers of clusterized contigs, are the most highly represented sequences in the library. A final number of 127 potential secretory components were deposited in the sequence databank (Table 1).
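The renormalization after removing the unknown sequences can be checked with a few lines of arithmetic. The sketch below uses only the rounded percentages quoted in this paragraph; the function and variable names are illustrative, and because the inputs are already rounded, the results differ from the reported values by about one to two percentage points.

```python
def renormalise(shares, drop):
    """Rescale percentage shares to sum to 100 after removing one class."""
    kept = {name: value for name, value in shares.items() if name != drop}
    total = sum(kept.values())
    return {name: 100.0 * value / total for name, value in kept.items()}


# Rounded percentages as quoted in the text.
sequence_share = {"unknown": 38.1, "house-keeping": 23.9, "secretory": 37.8}
cluster_share = {"unknown": 67.0, "house-keeping": 21.0, "secretory": 11.0}

sequences_without_unknown = renormalise(sequence_share, "unknown")
clusters_without_unknown = renormalise(cluster_share, "unknown")

for name in ("house-keeping", "secretory"):
    print(
        f"{name}: {sequences_without_unknown[name]:.1f}% of sequences, "
        f"{clusters_without_unknown[name]:.1f}% of clusters"
    )
```

The recalculated shares (roughly 39% and 61% of sequences, and 66% and 34% of clusters, for house-keeping and secretory classes respectively) agree with the reported figures within the precision of the rounded inputs.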
The salivary proteome of A. monolakensis as analyzed by one-dimensional SDS-PAGE is complex and consists of several highly abundant bands (Fig. 1). Temperature denaturation and precipitation allowed for differential fractionation of proteins for easier interpretation of Edman sequencing data. Fourteen sequences were obtained, of which thirteen corresponded to sequences found in the cDNA database (Table 1). In all cases, the N-terminal sequences obtained corresponded to the predicted N-terminal sequences using SignalP.
In most cases the Edman sequences correspond to the most abundant sequences found in the cDNA library. Even so, the lipocalin (AM-10) with the most transcripts (150) was not found with Edman sequencing. This is most probably because many cellular proteins are N-terminally blocked. To address this issue, the proteome was also investigated by peptide mass fingerprinting. For this, the SGE was fractionated using two-dimensional electrophoresis (Fig. 2). This resulted in the resolution of 78 spots that were picked and analyzed. Of these, 18 spots matched cDNA transcripts with significant values (supplementary material) and 14 proteins were identified. All identified spots were secretory and 7 corresponded with Edman products previously identified (Fig. 2). As such, 7 additional proteins were identified that corresponded to sequences found in the cDNA database. Of these, the most prominent spot corresponded with contig AM-10, the contig with the most abundant transcripts in the cDNA library.
To corroborate the results obtained from Edman sequencing and 2D-analyses, SGE was fractionated by anion and cation exchange chromatography, followed by reversed phase chromatography (Fig. 3). This results in a two-dimensional separation of proteins based on their charge and hydrophobicity properties. Using in-house programs that allowed for the relative quantification of peaks obtained at 220 nm, relative concentrations of the peaks can also be estimated, thereby providing a measure of the relative yields for different identified proteins (Table 1). Salivary gland extract (100 glands, ~1 mg total protein) was first fractionated using anion exchange chromatography (Fig. 3, AEC). This resulted in the fractionation of 10 regions that were used for subsequent fractionation. The flow-through was subjected to cation exchange chromatography and yielded 8 regions that were used for further analysis (Fig. 3, CEC). The fractions from the ion exchange columns were applied to reversed phase chromatography and single peaks were collected for analysis by peptide mass fingerprinting (Fig. 3, RPC). This resulted in the analysis of 116 peaks, of which 52 could be identified with confidence and yielded 27 proteins. Eleven of these proteins were previously identified by Edman sequencing and 4 proteins were identified by 2D-SDS-PAGE that were not identified by Edman sequencing. Each method identified unique proteins not found with the other methods: 4 unique proteins for Edman sequencing, 2 for 2D-SDS-PAGE and 14 for 2D-LC-MS. This yields a total of 35 proteins that were identified in the proteome of A. monolakensis and of these 29 were identified by liquid chromatography and quantified. Relative yields show that the 29 proteins identified correspond to ~60% of the total protein content of the salivary glands (supplementary table 6).
The identified peaks cumulatively have 710 transcripts, which correspond to 20% of the secretory clusters and 60% of all secretory transcripts. Relative yields obtained after fractionation of salivary gland extract showed that the proteins identified by peptide mass fingerprinting represent 60% of the total protein extract. As such, the most highly abundant proteins in the proteome are also the most highly abundant transcripts.
Of the 127 secretory proteins deposited and analyzed, 81 are full-length while 46 are N-terminally truncated (Table 1). The truncated proteins are useful, as they can still be assigned to protein families. This allows for the description of the sialome of A. monolakensis in terms of protein domain composition. Proteins were assigned to families using the BLAST suite of programs (Altschul et al., 1990; Altschul et al., 1997). In some cases, no homologies were found using BLASTP or PSI-BLAST analysis. However, inspection of conserved cysteine patterns allowed the assignment of small cysteine-rich proteins to various families (Table 1). Approximately forty different protein families or domains can be identified on first analysis, of which thirteen have more than one member (Table 1). The most abundant protein family is the lipocalins (27% of all domains, 34% of all secretory transcripts, 40.5% of total protein). This correlates with previous proteomic analysis that showed that lipocalins are the most abundant proteins found in the salivary glands of the soft tick O. savignyi (Mans et al., 2001; Mans et al., 2003; Oleaga et al., 2007). Other abundant domains include the basic tail secretory protein (BTSP) family (28% of all domains, 21% of all secretory transcripts, 6.4% of total protein), the Kunitz-BPTI family (12% of all domains, 15% of all secretory transcripts, 6.7% of total protein) and the metalloproteases (21% of all domains and 6% of all secretory transcripts). The lipocalin, Kunitz and metalloprotease families have been previously identified as domains found in soft and hard ticks (Mans and Neitz, 2004a).
Functions for possible tick salivary gland proteins and families have been proposed in a variety of sialome papers and will not be restated here (Valenzuela et al., 2002b; Francischetti et al., 2005; Ribeiro et al., 2006). Biochemical analysis has indicated that A. monolakensis possesses anti-thrombin (monobin) and anti-platelet (monogrins and apyrase) activities that are orthologous to those found in Ornithodoros soft ticks (Mans et al., 2007). Monobin and the monogrins belong to the Kunitz/BPTI-family, as was previously found for soft tick inhibitors (Joubert et al., 1998; Mans et al., 2002b; van de Locht et al., 1996; Waxman et al., 1990). These are, so far, the only confirmed functions for the salivary gland proteins from A. monolakensis. Other functions observed in SGE include fibrinogenase and collagenase activity (results not shown) that are most probably associated with metalloproteases (Francischetti et al., 2003). Other functions readily transferable via homology include anti-microbial activities of defensins and microplusin-like proteins previously found in ticks (Fogaça et al., 2004; Lai et al., 2004; Todd et al., 2007; Zhou et al., 2007).
The salivary transcriptomes from various hard tick species have been described. These include Amblyomma variegatum (Nene et al., 2002), Dermacentor andersoni (Alarcon-Chaidez et al., 2007), Haemaphysalis longicornis (Nakajima et al., 2005), I. pacificus (Francischetti et al., 2005), I. scapularis (Valenzuela et al., 2002b; Ribeiro et al., 2006), and Rhipicephalus appendiculatus (Nene et al., 2004). With the data obtained in the present study, the question can be asked: how many salivary gland derived protein domains are conserved between hard and soft tick species? This will serve as an indication of the protein domain repertoire that was present in the salivary glands of the ancestral tick lineage. In order to address this, secretory proteins identified in this study were used in PSI-BLAST analysis to retrieve, from the non-redundant database, hard tick proteins that have also been identified in salivary glands. This search mostly identified proteins from Ixodes and Haemaphysalis species, as protein sequence data for these ticks are available in the non-redundant protein database. In cases where EST sequences were deposited instead of protein sequences, clusterized databases were constructed for each species, after which TBLASTN analysis was performed to detect homologs. This analysis indicates that the major protein domains are conserved between the soft and hard tick families, indicating that most of the protein folds found in ticks were already present in the salivary glands of the ancestral tick lineage (Table 2; supplementary material). It is of interest that in some cases where no positive hits were found in hard tick salivary gland databases, homologs were found in the transcriptome data of B. microplus, which includes pooled transcripts from a variety of organs. This would indicate that although these specific proteins were not expressed in hard tick salivary glands, they were probably present in the genome of the ancestral tick lineage.
In these cases, expression might have been lost in hard tick salivary glands or have been co-opted in soft tick salivary glands at a later time. The overall trends observed for soft tick salivary glands are also apparent for the hard tick protein families, in that the major families seem to be the BTSP, kunitz, lipocalin and metalloprotease families.
PSI-BLAST analysis of the various protein families allows for the elucidation of relationships between protein families that share the same fold, but are themselves too divergent for detection of homologous relationships (Altschul and Koonin, 1998). Using this approach, we found that the BTSP and the 7DBF family identified in this study are homologous (Table 1). Members of the 18.7 kDa family previously identified in Ixodes ticks (Valenzuela et al., 2000b; Francischetti et al., 2005; Ribeiro et al., 2006) were also retrieved. PSI-BLAST analysis of this family also retrieved metalloproteases of the ADAM-TS family and the alignments produced during the BLAST analysis indicated similarity to a region not yet assigned to any domain structure. The cysteine arrangement is, however, similar to that of the thrombospondin type 1 repeat superfamily (Tang, 2001). The BTSPs possess a single TSP-1 repeat domain with a cysteine bond pattern similar to that found for TSP1 and TSP4 of F-spondin (Fig. 4A) (Pääkönen et al., 2006). The 7DBF family possesses two TSP-1 repeat domains, the first with a cysteine bond pattern that corresponds to that of TSP-1 repeats of thrombospondin (Tan et al., 2002), while the cysteine bond pattern of the second domain corresponds to that of F-spondin (Fig. 4B). Members also possess two extra cysteines at the N-terminus that presumably form a disulphide bond. The 18.7 kDa family previously described for Ixodes species is composed of two TSP-1 repeats, both resembling the disulphide bond pattern of the F-spondin domain (Fig. 4C). Some members also possess two extra cysteines that are located between the two thrombospondin-like domains and presumably form a disulphide bond that results in the formation of a hairpin-turn. This would bring the two thrombospondin domains into proximity, which could indicate that inter-domain interaction occurs in these proteins.
In some cases no positive hits were found between soft and hard ticks, for example, the 8kDa cysteine-rich family. However, small proteins might be divergent to an extent that even PSI-BLAST analysis might not detect homologies. In these cases, conserved features such as cysteine and disulphide bond patterns can be used to show that proteins possess the same fold. This study found at least 5 proteins that all share the same cysteine bond pattern but do not retrieve other proteins, i.e. the 8kDa cysteine-rich family. When compared to a group previously identified in Ixodes ticks (the ixodegrins), they clearly share the same cysteine pattern (Fig. 5). Some ixodegrins possess the RGD integrin recognition motif that correlates with that found for the snake derived disintegrin, dendroaspin and the hard tick platelet aggregation inhibitor variabilin (Francischetti et al., 2005). The Argas proteins, however, lack the RGD-motif. Proteins with the RGD-motif have been identified in the salivary glands of A. monolakensis and are orthologous to platelet aggregation inhibitors from the BPTI-like family found in the soft tick genus Ornithodoros (Mans et al., 2007).
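The idea of grouping highly divergent small proteins by their cysteine framework can be sketched as follows. The example compares the spacing between consecutive cysteines in the primary sequence, which is only a proxy for the disulphide bond pattern discussed above and not a substitute for structural data. The function names, the tolerance parameter and the toy sequences in the usage note are invented, and this is not the procedure used to produce Fig. 5.

```python
def cysteine_spacing(seq):
    """Return the number of residues between consecutive cysteines."""
    positions = [i for i, residue in enumerate(seq.upper()) if residue == "C"]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def same_cysteine_framework(seq_a, seq_b, tolerance=0):
    """True if both sequences have the same number of cysteines and each
    inter-cysteine spacing differs by at most `tolerance` residues."""
    spacing_a = cysteine_spacing(seq_a)
    spacing_b = cysteine_spacing(seq_b)
    if len(spacing_a) != len(spacing_b):
        return False
    return all(abs(x - y) <= tolerance for x, y in zip(spacing_a, spacing_b))
```

For two invented peptides, `ACKKGCPNDCTRGGCY` and `GCRRSCAQECSKDDCF`, the spacings are both `[3, 3, 4]`, so they would be grouped together even though few non-cysteine residues are shared. A non-zero tolerance allows for small insertions or deletions in the loops between cysteines.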
As has been indicated previously, most anti-hemostatic and immunomodulatory activities are not conserved between soft and hard ticks (Mans and Neitz, 2004a). Given that soft and hard ticks shared a common ancestor to the exclusion of other mites (Barker and Murrell, 2004), it stands to reason that they should share, to some extent, a number of orthologous proteins that were present before divergence of the major tick families. This is supported by the number of shared protein families found in soft and hard ticks (Table 2), but raises the question regarding the presence of orthologs shared between the two tick families. Phylogenetic analysis of the major protein families (lipocalins and BTSP) found in soft and hard ticks indicates that most of the gene duplications found in the protein families are lineage specific expansions (Figs. 6 and 7). This implies that most gene duplication events occurred after divergence of hard and soft ticks and that most orthologous relationships shared between the two families were probably restricted to a limited number of members in each protein family. There might have been as few as one or two orthologs present in the ancestral tick lineage for each family. This indicates that although the ancestral tick lineage possessed a protein repertoire that gave rise to the diversity observed in salivary transcriptomes of present day ticks, it was itself most probably a minimal sialome that had very few of the functional proteins found in modern day ticks. The high level of gene duplication (paralogous genes) observed within the major protein families could also obscure the true orthologous relationships between hard and soft tick proteins. Functional analysis of protein families should therefore remedy this and allow for detection of orthologs that were present in the ancestral tick lineage.
The field of vector-host interaction has gained tremendously from high-throughput analysis of salivary gland transcripts and proteomes, collectively called the sialome (Ribeiro and Francischetti, 2003). This approach allows for the description of secretory products involved in blood-feeding of hematophagous organisms. Heretofore, the identification of proteins active at the blood-feeding site was only accessible using biochemical purification techniques to isolate bio-active components from salivary glands or saliva. The present advances in high-throughput methodologies allow us to gain a glimpse of the sialome in its full complexity. Thus, while biochemistry will always be needed to confirm and validate functional predictions derived from sialomic databases, the analysis of sialomes in terms of their protein domain compositions allows for a comparative analysis of sialome complexity and diversity that gives us insights into the evolution of blood-feeding behavior in arthropods. In the present study we analyzed the sialome of a soft tick species (A. monolakensis) and compared it with hard tick sialomes previously described. We derive the general conclusion that hard and soft ticks shared a similar salivary gland protein repertoire in their last common ancestor.
Approximately 130 potential secretory transcripts were identified in the salivary gland transcriptome of A. monolakensis by cDNA library construction. This compares well with the proteomic analysis using 2D-electrophoresis and liquid chromatography that resolved 78 and 118 abundant proteins, respectively. It also indicates that this number is probably close to the real number of secretory components found in the salivary glands of this soft tick. In comparison, more than 500 proteins have been described for the salivary glands of I. scapularis (Ribeiro et al., 2006). In this latter study, the data were obtained from the analysis of ~8000 ESTs, while the current study only analyzed ~3000 ESTs. However, these estimates likely correlate with salivary gland complexity, with hard tick salivary glands being more complex than those of soft ticks, as evidenced by the higher number of secretory acini and cell types found in hard ticks (Coons and Alberti, 1999). Presumably, this is because hard ticks feed for longer periods of time, are exposed to the host’s immune system for extended periods, and would therefore need more components to control the host’s defense mechanisms.
The transcript numbers for highly abundant contigs also correlate well with the highly abundant proteins identified during the proteomic analysis. This correlation is consistent with the emerging paradigm in vector salivary gland biology that proteins secreted during feeding are generally the most abundant salivary gland proteins, with correspondingly high numbers of salivary gland mRNA transcripts. Even so, it is of interest that soft tick glands show such a good correlation between transcript and protein abundance. Soft tick salivary glands are normally filled with large secretory granules where proteins are stored until secretion during feeding (Roshdy, 1972; Roshdy and Coons, 1975; Coons and Roshdy, 1981; El Shoura, 1985; El Shoura, 1987; Mans et al., 2004). Ticks such as A. monolakensis may only feed once a year and in the periods in between, granules will be stationary. Salivary glands can presumably only accommodate a certain number of granules and as such, proteins can be stored over prolonged periods of time and will accumulate, so that a general correlation between protein concentration and transcript number would not necessarily follow. The cDNA libraries obtained for fed and unfed ticks do not differ markedly in terms of transcript numbers for various contigs (supplementary material). This would indicate that transcription occurs at the same rate for secretory transcripts regardless of the feeding status of the tick. Thus, if there is regulation of protein levels or salivary granule number, it would occur at a post-transcriptional level.
Analysis of the protein families identified in the Argas sialome indicates that soft ticks share, to a large extent, a similar salivary gland proteome with hard ticks. This would imply that most of the shared protein domains were also present in the tick lineage ancestral to the two families. Even so, few clear-cut orthologs were identified for the highly abundant protein folds found in the soft and hard tick families. This observation could be extended to the less abundant protein families (results not shown). In the case of very short sequences (BPTI-kunitz and defensin families), sequences may evolve so fast that phylogenetic information is lost (Mans et al., 2002a). Host immune pressure and divergence times extending back to 400 MYA may also account for the high sequence divergence (Barker and Murrell, 2004). Numerous biochemical studies have also shown that specific functions involved in the regulation of the host’s immune and hemostatic systems are not conserved between hard and soft ticks (Mans and Neitz, 2004a; Mans et al., 2007). The possibility thus exists that once functions are found for many of the divergent proteins in the sialomes, they will differ between the tick families. Those protein families for which orthologous proteins exist are most probably ones conserved throughout invertebrates and would include the metalloproteases and anti-microbials. In short, these proteins would have had generalized functions before adaptation to a blood-feeding environment.
The predicted presence of certain shared protein folds in the ancestral tick lineage from which salivary gland function evolved raises the question of where these folds were derived from. Most of the major salivary gland protein families clearly derived from much more ancient members of the same folds that are present in arthropods. The ancestor of the salivary BPTI proteins probably derived from a hemolymph BPTI-like protein, such as those common in the hemolymph of arthropods (Mans and Neitz, 2004a). These proteins would generally be involved in the regulation of serine proteases that take part in various processes, and their presence in the salivary glands might have assisted in the inhibition of hemolymph proteases at a stage when the ancestral tick lineage still scavenged dead arthropods. Lipocalins most probably derived from a Lazarillo-like ancestor that was involved in the development of the neural system (Mans and Neitz, 2004b). The metalloproteases are ancient enzymes that are conserved throughout the animal kingdom and are involved in all processes of extra-cellular matrix remodeling, a role they most probably also play in arachnids. In the ancestral tick lineage these proteases must have played a role in digestion and liquefaction of a scavenged meal. It would seem a simple jump to retain and re-adapt them for blood-feeding use, especially if previous targets were collagen- or fibrinogen-like. The presence of other “house-keeping, but adaptive” domains is also noted, for example the anti-microbials, which would have had a more ancient protective role but would have been co-opted for blood-feeding. Certain folds seem to be novel to tick salivary glands, most notably the BTSP-fold, given the high number of members found in both hard and soft tick salivary glands. The thrombospondin-repeat occurs in a number of mammalian proteins (Tucker, 2004).
Thus far the only proteins outside of the BTSP family with a TSP repeat that have been found in tick salivary glands are the ADAM-TS metalloproteases. This suggests that the BTSP fold and all its derivatives, i.e. 7DBF, 18kDa, 2CF, were most probably derived from a gene duplication of this domain from an existing metalloprotease. Whether this occurred more than once is difficult to ascertain, but it would not be impossible, as the disulphide patterns from the 18.7kDa and 7DBF families are different and probably derived from existing TSP-1 families. Domains that are currently orphans, i.e. proteins that cannot be assigned to any known protein family or domain, are most probably highly divergent members of known domains. It is foreseen that as more data become available, most of these orphan domains will eventually be assigned to well-known families and that their origins will be easier to trace.
Certain recurring trends appear for many proteins found in tick salivary glands. They either belong to large families of which most genes seem to be lineage-specific expansions, or their origins can be traced back to proteins that were already present in the salivary glands of the ancestral tick lineage. From this viewpoint, we propose that the ancestral tick lineage had a restricted set of salivary gland-derived protein families that were not necessarily adapted to function within a blood-feeding environment. Subsequently the main tick families diverged and adapted to a blood-feeding environment. During this period, novel proteins involved in the modulation of the host’s hemostatic and immune systems evolved by gene duplication of full-length domains, as well as sub-domains. Proteins with functions that could affect the host’s defenses were also recruited during this period.
The present study shows that the protein domain repertoires of hard and soft ticks are similar. This suggests that most protein domains were already expressed in the salivary glands of the ancestral tick. Even so, most gene duplication events occurred after divergence of the major tick families, which points to an adaptive response of the different families, most probably to a blood-feeding environment, after their divergence.
This work was supported by the Intramural Research Program of the National Institute of Allergy and Infectious Diseases. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organization imply endorsement by the government of the United States of America. We thank Paul Kelly and Kristie Nelson for logistic support in collecting the ticks on islands in Mono Lake.