Efficient transmission of pathogens by an arthropod vector is influenced by the ability of the pathogen to replicate and develop infectiousness within the arthropod host. While the basic life cycle of development within and transmission from the arthropod vector are known for many bacterial and protozoan pathogens, the determinants of transmission efficiency are largely unknown and represent a significant gap in our knowledge. The St. Maries strain of Anaplasma marginale is a high-transmission-efficiency strain that replicates to a high titer in the tick salivary gland and can be transmitted by <10 ticks. In contrast, A. marginale subsp. centrale (Israel vaccine strain) has an identical life cycle but replicates to a significantly lower level in the salivary gland, with transmission requiring >30-fold more ticks. We hypothesized that strain-specific genes expressed in the tick salivary gland at the time of transmission are linked to the differences in the transmission efficiency phenotype. Using both annotation-dependent and -independent analyses of the complete genome sequences, we identified 58 strain-specific genes. These genes most likely represent divergence from common ancestral genes in one or both strains based on analysis of synteny and lack of statistical support for acquisition as islands by lateral gene transfer. Twenty of the St. Maries strain-specific genes and 16 of the strain-specific genes in the Israel strain were transcribed in the tick salivary gland at the time of transmission. Although associated with the transmission phenotype, the expression levels of strain-specific genes were equal to or less than the expression levels in infected erythrocytes in the mammalian host, suggesting that function is not limited to salivary gland colonization.
Transmission efficiency is a primary determinant of pathogen prevalence. Strain-specific differences in transmission efficiency underlie the strain structure in the host population and the patterns of infection and disease. For example, cholera pandemics are associated with a limited diversity of related, highly transmissible Vibrio cholerae strains (4). For vector-borne pathogens, efficient transmission is also dictated by infection and replication within the arthropod host (10, 15, 25, 26). While the basic cycle of development within the arthropod vector has been identified for many important pathogens, including Anaplasma, Borrelia, Plasmodium, Rickettsia, and Trypanosoma, the critical microbial determinants of transmission efficiency in the vector remain unknown for most bacterial and protozoan pathogens.
Transmission of Anaplasma spp. is initiated when ixodid ticks ingest an infected blood meal during acquisition feeding on a bacteremic mammalian host (25, 26). The organisms then replicate within the midgut epithelium, invade the salivary gland, and, concomitant with attachment and feeding on a naïve host, undergo a second round of replication and are released into the saliva for transmission (13, 25, 26). Importantly, the efficiencies of this developmental cycle within the tick are markedly different for genetically distinct strains (25, 26, 27). The St. Maries strain of Anaplasma marginale is a prototypical high-transmission-efficiency strain; >90% of ticks feeding on a bacteremic host become infected, and replication in the salivary gland generates >10^6 bacteria (25, 26). Consequently, the St. Maries strain is consistently transmitted to naïve hosts by ≤10 ticks and has been transmitted by a single tick (8, 18, 19). In contrast, low-transmission-efficiency strains are characterized phenotypically by low levels of colonization and replication in tick tissues (25, 26). A. marginale subsp. centrale (the Israel vaccine strain) is unique among the A. marginale strains described to date in that it colonizes the tick midgut and salivary gland but replicates at a significantly lower level in the salivary gland (9, 25). The numbers of organisms in the salivary gland and saliva at the time of transmission are 10-fold and 30-fold less, respectively, than the numbers of organisms in a high-transmission-efficiency strain (25). Transmission is correspondingly less efficient and requires >30-fold more ticks (9, 25). This statistically and biologically significant transmission phenotype provides a model for linking gene content and expression with transmission efficiency.
There are two competing hypotheses to explain how genetically distinct strains differ in transmission efficiency. The first hypothesis is that the difference is due to strain-specific gene content. This content may include the presence of a gene required for efficient transmission or, alternatively, loss of efficiency due to an inhibitory gene. Most strictly, this hypothesis dictates that a gene is present in one strain and absent in another strain; more broadly, this hypothesis includes polymorphisms that alter the function of a gene product. The second, alternative hypothesis is that the difference in transmission efficiency is regulatory, resulting from either an increase or a decrease in expression of genes common in strains. To address this gap in our knowledge regarding the microbial determinants of transmission efficiency, we identified the differences in gene content between the St. Maries and Israel vaccine strain genomes and then tested whether the different genes were specifically expressed in the tick salivary gland. We hypothesized that the transmission phenotype is linked to transcription of strain-specific genes during the final round of replication and transmission. Here we report the results of a test of this hypothesis.
The complete genome sequences of the St. Maries (GenBank accession no. CP000030) and A. marginale subsp. centrale Israel vaccine (GenBank accession no. CP001759) strains were used for this study. Strain-specific genes were identified using a two-level high-stringency screen. Initially, annotated genes in each strain were reciprocally screened against the other strain using BLASTP. Annotated genes that encoded products with ≥30% identity were considered homologous, based on predictions of the minimal level of identity required for retention of function (1, 23), and were eliminated from further analysis. The remaining putative Israel strain-specific genes were then searched against the complete St. Maries strain genome using TBLASTN (all reading frames in both directions) in order to identify regions of sequence identity that were not found in previously annotated St. Maries strain genes. The reciprocal TBLASTN search was conducted for putative St. Maries strain-specific genes against the complete Israel strain genome. Coding sequences with ≥30% identity over >60% of the full-length sequence were considered to be sequences having a homolog in the other strain and were removed from further strain-specific analysis, unless stop codons or frameshifts were present in the region identified by TBLASTN. If stop codons, frameshifts, or short regions of sequence identity were identified by TBLASTN, the presence and expression of a potentially unannotated homolog in the region were examined by performing reverse transcriptase PCR (RT-PCR). The remaining Israel strain genes that did not have a significant homolog in the St. Maries strain genome and the remaining St. Maries strain genes that did not have a homolog in the Israel strain genome were considered strain-specific genes for the purposes of this study.
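The two-level screen described above can be sketched as a small decision function. This is only an illustrative sketch, not the pipeline used in the study: the `Hit` record, the `classify` function, and the label strings are hypothetical names, and treating a hit with ≥30% identity but ≤60% coverage as a "short region of sequence identity" is an assumption about how that phrase was applied.

```python
from dataclasses import dataclass
from typing import Optional

IDENTITY_MIN = 30.0   # percent identity threshold stated in the text
COVERAGE_MIN = 0.60   # fraction of the full-length sequence stated in the text


@dataclass
class Hit:
    """Best TBLASTN hit of one query gene in the other genome (hypothetical record)."""
    percent_identity: float
    aligned_length: int       # query residues covered by the hit
    query_length: int         # full length of the query protein
    disrupted: bool = False   # stop codon or frameshift in the matched region


def classify(blastp_identity: Optional[float], tblastn_hit: Optional[Hit]) -> str:
    """Return 'homolog', 'check_by_rt_pcr', or 'strain_specific'."""
    # Level 1: annotation-dependent comparison (BLASTP against annotated genes).
    if blastp_identity is not None and blastp_identity >= IDENTITY_MIN:
        return "homolog"
    # Level 2: annotation-independent comparison (TBLASTN against the genome).
    if tblastn_hit is None:
        return "strain_specific"
    coverage = tblastn_hit.aligned_length / tblastn_hit.query_length
    identical_enough = tblastn_hit.percent_identity >= IDENTITY_MIN
    if identical_enough and coverage > COVERAGE_MIN and not tblastn_hit.disrupted:
        return "homolog"
    # Assumption: disrupted hits and short high-identity hits go to RT-PCR.
    if tblastn_hit.disrupted or identical_enough:
        return "check_by_rt_pcr"
    return "strain_specific"


# Hypothetical example: 35% identity over 80% of the query, no disruption.
print(classify(12.0, Hit(35.0, 80, 100)))
```

Genes flagged `check_by_rt_pcr` are not resolved by the function itself; in the text, that decision rests on the RT-PCR result for the region.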
To test whether strain-specific genes were clustered in the chromosome, consistent with lateral gene transfer, a chi-square test of independence was used to test the null hypothesis that strain-specific genes are randomly distributed among all annotated genes in each genome (28). Each genome was divided into segments so that, using the assumption that there was random distribution of strain-specific genes, there would be at least 5 strain-specific genes per segment. In order to eliminate bias, the chi-square statistics were computed using each gene in the chromosome as a starting point for segmenting the genome, and the resulting chi-square values were then averaged to produce a single statistic for each strain. Following computation of the chi-square statistic for the observed data, 1,000 bootstrap values of the chi-square statistic were computed to generate the distribution of the chi-square statistics under the null hypothesis that there is random distribution of strain-specific genes (7). P values for the averaged chi-square statistics were computed from the distribution generated by the bootstrap samples using the percentile method.
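A minimal sketch of this clustering test follows, under several stated assumptions: genes are represented as a circular list of 0/1 flags in chromosomal order, segments contain near-equal numbers of genes, and the null distribution is generated by randomly shuffling the flags (a permutation-style stand-in for the bootstrap procedure, whose exact resampling scheme is not given in the text). The function names are hypothetical.

```python
import random


def averaged_chi_square(flags, min_expected=5):
    """Chi-square statistic averaged over every gene as segmentation start point.

    flags: list of 0/1 in chromosomal order (1 = strain-specific gene).
    """
    n = len(flags)
    total = sum(flags)
    k = total // min_expected  # segments, so about min_expected genes are expected in each
    if k < 2:
        raise ValueError("too few strain-specific genes to form two segments")
    bounds = [round(i * n / k) for i in range(k + 1)]
    # Prefix sums over the doubled list handle the circular wrap-around.
    prefix = [0]
    for f in flags + flags:
        prefix.append(prefix[-1] + f)
    stat_sum = 0.0
    for start in range(n):
        chi2 = 0.0
        for i in range(k):
            lo, hi = start + bounds[i], start + bounds[i + 1]
            observed = prefix[hi] - prefix[lo]
            expected = total * (hi - lo) / n
            chi2 += (observed - expected) ** 2 / expected
        stat_sum += chi2
    return stat_sum / n


def clustering_p_value(flags, n_resamples=1000, seed=0):
    """Return (observed statistic, fraction of null statistics >= observed)."""
    rng = random.Random(seed)
    observed = averaged_chi_square(flags)
    shuffled = list(flags)
    exceed = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)  # random placement of strain-specific genes
        if averaged_chi_square(shuffled) >= observed:
            exceed += 1
    return observed, exceed / n_resamples


# Hypothetical toy genome: 200 genes, 20 strain-specific genes in one block.
toy = [1] * 20 + [0] * 180
print(clustering_p_value(toy, n_resamples=200))
```

Averaging over all start points removes the dependence of the statistic on an arbitrary choice of origin, which is the bias the text refers to.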
Transcription was detected by RT-PCR using specific primers for each strain-specific gene (Table 1). RNA from A. marginale-infected erythrocytes during acute bacteremia was isolated using Trizol reagent (Invitrogen), while RNA from pools of 10 infected tick salivary glands at the time of transmission was stabilized in RNA Later (Applied Biosystems) and isolated using RNeasy (Qiagen). Isolated RNA was treated with DNase using 4 U of TURBO DNase (Applied Biosystems) for 1 h at 37°C. A one-step RT-PCR was performed using the Titan one-tube RT-PCR system (Roche). The thermocycling conditions used for reverse transcription and subsequent PCR amplification of cDNA were as follows: 50°C for 30 min and 94°C for 2 min, followed by 35 cycles of 94°C for 10 s, 60°C for 30 s, and 68°C for 45 s and then a final extension at 68°C for 7 min. RT-PCR products were separated and visualized by electrophoresis on a 2% agarose gel stained with ethidium bromide. Initially, transcription levels were examined semiquantitatively based on the agarose band intensities of cDNA amplicons relative to that of msp5, which has been shown to be transcribed at high levels in all mammalian and tick stages. To ensure that RNA was not contaminated with genomic DNA, reverse transcriptase-negative samples were prepared exactly as described above except that 2.5 U of Platinum Taq polymerase (Invitrogen) was added instead of the Titan enzyme mixture (Roche). The accuracy of this semiquantitative approach was then verified using real-time quantitative RT-PCR (qRT-PCR) to determine the gene copy numbers for AM378/AM380, AM997, and AM286. The copy numbers were then normalized by comparison to the msp5 copy number. qRT-PCR was performed using an iScript one-step RT-PCR kit for probes (Bio-Rad) with a Bio-Rad iCycler as described by the manufacturer. Primers and probes used for the qRT-PCR assay are described in Table 1.
Forward and reverse primers were used at a final concentration of 400 nM, and probes labeled at the 5′ end with 6-carboxyfluorescein (6-FAM) and at the 3′ end with 6-carboxytetramethylrhodamine (TAMRA) (Integrated DNA Technologies, Coralville, IA) were used at a final concentration of 100 nM. Reaction mixtures (25 μl) were prepared in triplicate, and the transcript copy number was quantified by comparison to a standard curve. Standard curves were constructed by performing qRT-PCR with samples containing 10^2, 10^3, 10^4, 10^5, 10^6, and 10^7 copies of full-length AM286, AM378/AM380, AM997, or msp5 cloned into PCR-4-TOPO. The thermocycling conditions for reverse transcription followed by PCR amplification and quantification of AM378/AM380, AM997, and AM286 were as follows: 50°C for 10 min and 95°C for 5 min, followed by 45 cycles of 95°C for 15 s and 55°C for 30 s. For msp5, the thermocycling conditions were modified to accommodate the reduced annealing temperature of the msp5 primers and increased amplicon size, as follows: 50°C for 10 min and 95°C for 5 min, followed by 45 cycles of 95°C for 15 s, 50°C for 30 s, and 60°C for 30 s.
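Quantification against a standard curve and normalization to msp5 can be illustrated with a short sketch. It assumes the usual log-linear relationship between threshold cycle (Ct) and starting copy number. The Ct values below are invented for illustration and are not data from this study, and the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


def normalized_to_msp5(target_copies, msp5_copies):
    """Express a target transcript copy number relative to msp5."""
    return target_copies / msp5_copies


# Standards of 10^2 to 10^7 copies, as in the text; Ct values are hypothetical.
standards = [10 ** k for k in range(2, 8)]
hypothetical_ct = [40.0 - 3.32 * k for k in range(2, 8)]
slope, intercept = fit_standard_curve(standards, hypothetical_ct)

# Mean Ct of a hypothetical triplicate reaction, converted to copies.
sample_ct = sum([26.8, 26.7, 26.6]) / 3
print(copies_from_ct(sample_ct, slope, intercept))
```

A separate curve would be fitted for each target (AM286, AM378/AM380, AM997, and msp5), since each was quantified against its own cloned standard.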
For analysis of strain-specific gene transcription at the time of transmission feeding and confirmation of the phenotype, seronegative (as determined by an Msp5 competitive enzyme-linked immunosorbent assay [C-ELISA]) calves were infected intravenously with either the St. Maries or Israel strain (9, 24). During acute bacteremia (>10^8 organisms per ml), adult male Dermacentor andersoni ticks (Reynolds Creek strain) were allowed to acquisition feed on a calf infected with the St. Maries strain (n = 150) or the Israel vaccine strain (n = 350). Blood for strain-specific gene transcription analysis was obtained at this time. Ticks were then kept at 26°C for 3 days to allow complete digestion of the acquisition blood meal and prevent mechanical transmission by contaminated mouthparts. Acquisition-fed ticks were subsequently allowed to attach and transmission feed on a second set of naïve, seronegative calves for 7 days. The ticks were then removed, and the salivary glands were dissected for analysis of strain-specific transcription and quantification of the pathogen load. The pathogen load was determined by using quantitative real-time PCR to determine the number of copies of a single-copy gene, msp5, in individual ticks infected with either the St. Maries or Israel vaccine strain, and the results were expressed as the mean number of bacteria (26). Transmission to the naïve calves was monitored by microscopic examination of Giemsa-stained blood smears and was confirmed by both PCR detection of msp5 and seroconversion using the Msp5 C-ELISA (25, 26).
Using the criteria described above for defining strain-specific genes, 24 St. Maries strain-specific genes and 34 Israel strain-specific genes were identified (Fig. 1). St. Maries strain-specific genes AM378 and AM380 are identical. ACIS30, ACIS49, ACIS1175, and ACIS1182 are identical genes scattered throughout the Israel strain chromosome that show partial sequence identity to the msp2 gene (3). For genes that met the strain-specific criteria, the levels of identity to the other genome varied; 8 St. Maries strain genes showed no sequence identity (defined as a TBLASTN E value of >0.5) to the Israel strain genome, and 4 Israel strain genes showed no identifiable sequence identity to the St. Maries strain genome (Table 2). Sixteen St. Maries strain genes exhibited very low levels of identity (defined as a TBLASTN E value of <0.5 but <30% identity) with the Israel strain genome, and 10 of them were associated with the sequence in a syntenic locus (Table 2). Reciprocally, 30 Israel strain genes exhibited very low levels of identity with the St. Maries strain genome, and 21 of these genes were associated with a syntenic locus. AM1216, ACIS380, and ACIS932 showed multiple regions of sequence identity to the other genome, occurring both in the syntenic locus and outside the syntenic locus (Table 2).
St. Maries strain-specific genes are dispersed throughout the chromosome, but some regions contain a disproportionately large number of strain-specific genes (Fig. 1). AM345, AM357, AM360, AM378, AM380, and AM382 occur over a 46-kb region; AM955, AM997, AM1021, AM1057, AM1058, and AM1077 occur over an 86-kb region; and AM1161, AM1165, AM1189, and AM1216 occur over a 46-kb region. Similarly, Israel strain-specific genes are broadly distributed in the genome, and several of these genes occur in clusters, most notably ACIS888, ACIS906, ACIS919, ACIS932, ACIS957, ACIS962, ACIS1002, and ACIS1028, which occur over a 155-kb region. Despite the close proximity of some strain-specific genes, a chi-square test of independence revealed that clustering of strain-specific genes was not statistically significant (P = 0.137 and P = 0.074 for St. Maries strain-specific genes and Israel strain-specific genes, respectively). All strain-specific genes identified encode hypothetical proteins unique to A. marginale, except for AM286, which encodes a conserved hypothetical protein with orthologs that are present in additional species in the family Anaplasmataceae, including the tick-transmitted pathogens Ehrlichia canis, Ehrlichia chaffeensis, and Ehrlichia ruminantium, but not in Anaplasma phagocytophilum (5, 12, 16).
Twenty of the 24 St. Maries strain-specific genes identified and 16 of the 34 Israel strain-specific genes were transcribed in tick salivary glands at the time of transmission (Table 2). Each of these genes was also transcribed in the mammalian host during acute bacteremia. Six genes, all in the Israel strain (ACIS183, ACIS185, ACIS206, ACIS380, ACIS769, and ACIS957), were expressed exclusively in the blood and not in the tick salivary glands (Table 2). The transcript levels were expressed relative to the level of msp5 (Fig. 2), which is transcribed at high levels in all stages of A. marginale examined. Eight St. Maries strain-specific genes and four Israel strain-specific genes were found to be expressed at high levels in both the tick salivary gland and the blood stage of infection (Table 2). An additional six strain-specific genes, St. Maries strain AM286 and Israel strain genes ACIS929, ACIS932, ACIS962, ACIS1093, and ACIS1096, were found to be expressed at high levels during the blood stage of infection but not in the tick salivary gland (Table 2). No strain-specific genes appeared to be transcribed at higher levels at the tick salivary gland stage than at the blood stage in either strain. The relative levels of transcription were confirmed using quantitative RT-PCR for the following genes representing three types of transcription: AM997 (expressed at high levels in the blood and in the salivary gland compared to msp5), AM378/AM380 (expressed at low levels in the blood and in the salivary gland compared to msp5), and AM286 (expressed at high levels in the blood and at low levels in the salivary gland) (Table 3).
The low-transmission-efficiency phenotype of the Israel vaccine strain was confirmed by transmission feeding of 250 ticks on an animal, which was less than the predicted threshold of 300 ticks based on pathogen levels in the salivary gland and saliva and less than the 425 ticks shown to transmit infection (25). The transmission-fed animal was not infected as determined by an initial screening by microscopic examination of blood smears and was confirmed to be negative using the Msp5 C-ELISA and msp5 PCR at 90 days after tick feeding, which was more than 3 standard deviations longer than the time required for seroconversion following feeding of ≥10 ticks infected with the St. Maries strain (26). As a positive control, the St. Maries strain was successfully transmitted and detected microscopically, which was confirmed by an msp5 PCR at 21 days after transmission feeding. Quantification of the bacterial load in the salivary glands by quantitative PCR showed that the mean infection levels were 10^8.45 ± 10^0.49 and 10^6.76 ± 10^0.62 bacteria per salivary gland pair for St. Maries and Israel vaccine strain-infected ticks, respectively.
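The size of the difference between the two reported mean loads (10^8.45 and 10^6.76 bacteria per salivary gland pair) can be worked out directly from the exponents. The sketch below also shows one way such log-scale summaries might be computed from per-tick counts. It assumes that the reported values are the mean and standard deviation of log10-transformed counts, which the notation suggests but the text does not state; the function names and the example counts are hypothetical.

```python
import math
import statistics


def log10_mean_sd(counts):
    """Mean and sample standard deviation of log10-transformed counts."""
    logs = [math.log10(c) for c in counts]
    return statistics.mean(logs), statistics.stdev(logs)


def fold_difference(log10_a, log10_b):
    """Fold difference between two loads given as log10 values."""
    return 10 ** (log10_a - log10_b)


# Reported log10 means for the St. Maries and Israel vaccine strains.
print(round(fold_difference(8.45, 6.76), 1))  # about 49-fold

# Hypothetical per-tick msp5 copy numbers, for illustration only.
print(log10_mean_sd([2.0e8, 4.5e8, 1.1e8]))
```

Working on the log scale keeps a few heavily infected ticks from dominating the summary, which is a common reason for reporting bacterial loads this way.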
The lack of transmission by 250 feeding ticks confirmed and further refined the low-transmission-efficiency phenotype of the Israel vaccine strain. In contrast to the consistent transmission of the St. Maries strain in multiple replicate trials using ≤10 ticks, the Israel strain has been successfully transmitted only using >400 ticks and has not been transmitted repeatedly using cohorts of 100 ticks (9, 25, 26). The quantitative data for the pathogen loads in the salivary gland and saliva predict that a minimum of 325 ticks is required to transmit the Israel strain (25). The data obtained in this study are consistent with this minimum threshold. Importantly, this low-transmission-efficiency phenotype of the Israel strain is conserved across at least four species of vector ticks, D. andersoni, Hyalomma excavatum, Rhipicephalus sanguineus, and Rhipicephalus annulatus (20, 22), indicating that the transmission phenotype is an intrinsic pathogen strain characteristic and consistent with a unique gene content.
The criteria for identification of a gene as a strain-specific gene included <30% sequence identity between genomes. While this conservative threshold, based on minimal estimates for retention of enzymatic function (1, 23), may have excluded genes sufficiently polymorphic in the strains to encode proteins with different functions, it avoided “false-positive” identification of genes as strain-specific genes. Importantly, the identification was not limited to a comparison of annotated genes. When a gene in one strain had a low-identity homolog in the other strain, independent of prior annotation, analysis of transcription was used to confirm the gene identification. When analyses were conducted reciprocally, this approach reduced the likelihood of missing strain-specific genes simply due to annotation differences in the two strains.
The number of unique genes identified in the two strains (58) is far greater than the number reported previously for sequenced A. marginale strains (6). This greater number of strain-specific genes was not due to annotation or screening methodology differences between this study and the previous analyses (6) but rather was due to true genomic divergence. Whether this divergence is representative of genetic heterogeneity among A. marginale strains is unknown; the Israel vaccine strain is the first A. marginale sensu lato strain sequenced, while all previous sequenced strains, which were characterized by a closed core genome, were A. marginale sensu stricto strains isolated in North America (6, 11). The strain-specific genes reported here most likely resulted from accumulated changes in a common ancestral gene, with deletions and mutations causing loss or marked divergence in one of the two strains. This hypothesis is supported by the frequent identification of a homolog with a very low level of sequence identity in a syntenic locus in the other strain and the lack of statistical support for strain-specific gene clusters that would be expected to result from lateral gene transfer. The latter finding is consistent with the findings for comparative genome analysis of the family Anaplasmataceae, in which there has been no evidence of pathogenicity islands or CG skews suggestive of lateral gene transfer (2, 6, 11, 12).
The most rigorous association between strain-specific gene content and transmission efficiency phenotype would be unique expression of the genes in the tick salivary gland, where the different replication phenotypes result in logarithmic differences in the number of bacteria, rather than in the mammalian host, where levels of bacteremia are similar for the two strains (9, 25, 26). This hypothesis was rejected; there were no St. Maries or Israel strain genes that were transcribed either exclusively or at markedly higher levels in the tick salivary gland than in the mammalian host that may have conferred enhanced and impaired transmission phenotypes, respectively. Rejection of this “exclusive expression” hypothesis does not rule out the possibility that the strain-specific genes may affect the observed phenotypic differences in replication and secretion in the tick salivary gland but does indicate this is unlikely to be their only function. An alternative view is that pathogen gene expression in the salivary gland at the time of transmission reflects the molecules required to establish infection in the mammalian host. The high levels of transcription of St. Maries strain genes AM283, AM382, AM634, AM682, AM997, AM1021, AM1057, and AM1216 in both the salivary gland and the mammalian host fit this pattern. The semiquantitative RT-PCR results were verified by performing quantitative RT-PCR with a subset of strain-specific genes with high, low, and different levels of transcription. Consequently, these eight St. Maries strain-specific genes with high levels of expression in the tick salivary gland and during acute bacteremia are new candidates for genes associated with efficient transmission.
Notably, all identified strain-specific genes encoded hypothetical (57/58) or conserved hypothetical (1/58) proteins. While A. marginale, like other understudied organisms in the family Anaplasmataceae, contains a relatively large number of annotated hypothetical proteins, the overrepresentation of hypothetical proteins encoded by strain-specific genes is striking and likely reflects adaptations to the unique mammalian and vector life cycle of this bacterium. The conclusion that most of these genes (42/58) are real genes is supported by the transcription data, consistent with previous studies of hypothetical and conserved hypothetical proteins in the St. Maries strain that have linked transcription to protein expression (14, 17). The remaining 16 genes may be pseudogenes or expressed in another stage of the transmission cycle, such as the tick midgut. The skew toward hypothetical proteins is not strictly a bias of the conservative parameters used to identify strain-specific genes. The products of the five genes that fell just above the identity threshold, and were therefore not classified as strain-specific genes, were also originally annotated as hypothetical proteins; AM359, AM366, AM368, AM470, and AM1046 all had levels of identity between the 30% threshold and 40% with the Israel strain genome. The expression of each of these genes has been confirmed by RT-PCR, and the expression of AM359, AM366, AM368, and AM470 was detected previously by mass spectrometry (14, 17; S. Ramabu and G. Palmer, unpublished data). The identification of these specific genes in strains with markedly different phenotypes emphasizes the importance of determining the functions of the large number of novel proteins in the tick vector and the mammalian host as a fundamental step in understanding pathogenesis and transmission.
This work was supported by National Institutes of Health grant AI44005, by Wellcome Trust grant GR075800M, and by U.S. Department of Agriculture Agricultural Research Service grant 5348-32000-027-00D/-01S. J. T. Agnes was supported in part by a National Institutes of Health predoctoral fellowship in protein biotechnology.
Editor: R. P. Morrison
Published ahead of print on 22 March 2010.