Recent serologic, immunoprotection, and pathogenesis studies identified the Lig proteins as key virulence determinants in interactions of leptospiral pathogens with the mammalian host. We examined the sequence variation and recombination patterns of ligA, ligB, and ligC among 10 pathogenic strains from five Leptospira species. All strains were found to have intact ligB genes, and genetic drift accounted for most of the ligB genetic diversity observed. The ligA gene was found exclusively in L. interrogans and L. kirschneri strains, and was created from ligB by a two-step partial gene duplication process. The aminoterminal domain of LigB and the LigA paralog were essentially identical (98.5 ± 0.8% mean identity) in strains with both genes. Like ligB, ligC gene variation also followed phylogenetic patterns, suggesting an early gene duplication event. However, ligC is a pseudogene in several strains, suggesting that LigC is not essential for virulence. Two ligB genes and one ligC gene had mosaic compositions, and evidence for recombination events between related Leptospira species was also found for some ligA genes. In conclusion, the results presented here indicate that Lig diversity has important ramifications for the selection of Lig polypeptides for use in diagnosis and as vaccine candidates. This sequence information will aid the identification of highly conserved regions within the Lig proteins and improve upon the performance characteristics of the Lig proteins in diagnostic assays and in subunit vaccine formulations with the potential to confer heterologous protection.
Pathogenic spirochaetes belonging to the genus Leptospira are the agents of leptospirosis, which is considered to be the most widespread zoonosis in the world (Faine et al., 1999; Levett, 2001; Bharti et al., 2003). Susceptible animals, including humans, are infected by direct contact with urine from a reservoir host, usually rats or other rodents, or indirectly through contaminated water. Transmission occurs via dermal abrasions or inoculation of the mucous or conjunctival membranes (Faine et al., 1999). In the majority of infected individuals, leptospirosis is a self-limited disease characterized by flu-like symptoms (Faine et al., 1999). However, hepatorenal manifestations, as observed in Weil’s disease, are frequent complications and are associated with significant (10–15%) mortality (Bharti et al., 2003; McBride et al., 2005). In addition, leptospirosis causes severe pulmonary haemorrhage syndrome (SPHS), for which case fatality is >50% (Segura et al., 2005; Gouveia et al., 2008). Leptospirosis is considered to be an emerging infectious disease in endemic regions of Asia (Karande et al., 2003, 2005; LaRocque et al., 2005; Yanagihara et al., 2007; Peacock and Newton, 2008) and Latin America (Ko et al., 1999; Sarkar et al., 2002; Romero et al., 2003; Johnson et al., 2004) and is a major public health concern in poverty stricken regions of the world (McBride et al., 2005; Ganoza et al., 2006; Riley et al., 2007).
The Leptospira genus is sub-classified into 18 genomospecies that includes both saprophytic and pathogenic species (Levett, 2001; Levett et al., 2006; Matthias et al., 2008). Classification based on serologic methods has identified ~300 serovars, of which more than 200 are considered to be pathogenic (Faine et al., 1999; Levett, 2001; Bharti et al., 2003). The availability of genomic sequence data from five Leptospira strains, L. interrogans serovars Lai (Ren et al., 2003) and Copenhageni (Nascimento et al., 2004), L. borgpetersenii serovar Hardjo strains L550 and JB197 (Bulach et al., 2006), and the saprophyte L. biflexa serovar Patoc I (Picardeau et al., 2008), is driving the discovery of new diagnostic tools and vaccines for leptospirosis. Considerable effort has been expended towards identifying conserved surface-exposed antigenic determinants that could improve diagnosis and provide heterologous protection via subunit or DNA vaccines.
A number of leptospiral outer membrane proteins (OMPs) have been characterized (Cullen et al., 2005), including OmpL1 (Haake et al., 1993), LipL41 (Shang et al., 1996), LipL36 (Haake et al., 1998), the major outer membrane protein, LipL32 (Haake et al., 2000), LipL21 (Cullen et al., 2003), LipL46 (Matsunaga et al., 2006), LenA (Verma et al., 2006), and the OmpA-like proteins Loa22 (Koizumi and Watanabe, 2003) and Omp52 (Hsieh et al., 2005). However, their performance in diagnostic assays for acute leptospirosis or as vaccine candidates has been problematic (Haake et al., 1999; Branger et al., 2001; Flannery et al., 2001; Guerreiro et al., 2001). LigA and LigB, belonging to a family of leptospiral immunoglobulin-like (Lig) proteins, appear to be promising antigens (Palaniappan et al., 2002; Matsunaga et al., 2003). The gene encoding a third Lig protein, ligC, was identified as a pseudogene in L. interrogans serovar Copenhageni and L. kirschneri serovar Grippotyphosa (Matsunaga et al., 2003), but was found to be intact in L. interrogans serovar Lai (Ren et al., 2003). The Lig proteins contain a series of bacterial immunoglobulin-like (Big) repeat domains that were originally identified in virulence determinants from Escherichia coli and Yersinia pseudotuberculosis (Hamburger et al., 1999; Luo et al., 2000).
The lig genes are of great interest because emerging serologic, vaccine, and pathogenesis studies indicate that Lig proteins are key virulence determinants involved in host–pathogen interactions. Lig proteins mediate interaction with multiple host extracellular matrix proteins, including fibronectin, fibrinogen, collagen, and laminin (Choy et al., 2007). Several studies have provided evidence that the Lig proteins are protective immunogens in animal models of leptospirosis (Koizumi and Watanabe, 2004; Palaniappan et al., 2006; Silva et al., 2007). In addition, we recently demonstrated that a recombinant polypeptide containing Big domains 2–6 from LigB was able to protect hamsters against homologous challenge by L. interrogans serovar Copenhageni (unpublished data). Virulent forms of L. interrogans and L. kirschneri strains express higher levels of Lig proteins than culture-attenuated forms (Matsunaga et al., 2003). Lig expression is strongly induced by shifting the osmolarity from low levels used in EMJH culture medium to osmolarity levels found in host tissues (Matsunaga et al., 2005). Up-regulation during early host infection is consistent with the strong serologic response to Lig proteins observed during acute leptospirosis (Croda et al., 2007).
Considering the large number of pathogenic Leptospira serovars and the broad distribution of leptospiral host reservoirs, the potential effect of selective pressure on the genetic diversity of the Lig proteins was unclear. Given the potential of the Lig proteins as diagnostic antigens and vaccine candidates, we examined their sequence diversity in the serovars most often associated with leptospirosis.
Virulent leptospiral strains (Table 1) were obtained from culture collections maintained by the authors. The isolation conditions of a number of the strains used in this study were previously described (Ko et al., 1999; Haake et al., 2002; Silva et al., 2008). The identity of each of the strains used in this study was confirmed by 16S rRNA gene sequencing (Hookey et al., 1993) and serogrouping based on the microscopic agglutination test (MAT) (Cole et al., 1973). Strains were cultured in liquid Ellinghausen–McCullough–Johnson–Harris modified Tween 80-bovine albumin medium (Ellinghausen and McCullough, 1965; Johnson and Harris, 1967) at 30 °C. Virulence of each Leptospira strain was evaluated in the hamster model of lethal leptospirosis as previously described (Silva et al., 2008).
Leptospiral genomic DNA was extracted from 7-day-old cultures using the GFX Genomic Blood DNA Purification Kit (GE Healthcare) according to the instructions provided for Gram-negative bacteria. The concentration was determined by absorbance at 260 nm and the quality of the genomic DNA was confirmed by agarose gel electrophoresis.
A PCR-based screening method for the detection of the lig genes was developed. Degenerate primers (Table S4) were designed to amplify ligA, ligB and ligC based on sequence information deposited in GenBank. Additional primers were designed to amplify ligB from L. weilii strains due to the heterogeneity of ligB from this species compared to the other leptospiral species. The DNA sequence of each lig gene was aligned using AlignX (Vector NTI, Invitrogen) and, based on homology between the sequences, suitable regions were identified and primers, degenerate where appropriate, were designed. PCR (30 cycles) was carried out using recombinant Taq DNA polymerase (Invitrogen) following the protocol provided. PCR products were analysed by agarose gel electrophoresis. The expected product size for ligA was estimated to be 211 bp, 536 bp for ligB (1076 bp for ligB from L. noguchii and 1625 bp for ligB from L. weilii) and 248 bp for ligC. Quality control PCR amplification of the 16S rDNA and lipL32 genes was used to verify the quality of the genomic DNA.
Southern blotting was carried out as described previously (Sambrook and Russell, 2001). Briefly, 3 μg of genomic DNA was digested with 20 units of BamHI (Invitrogen) and separated by agarose gel electrophoresis. DNA was transferred from the gel to a positively charged Hybond-N nylon membrane (GE Healthcare) with a vacuum blotter (Bio-Rad). Probes to each of the lig genes were based on pooled PCR products amplified using the primers described in Table S4 and labelled using the ECL Direct Nucleic Acid Labelling and Detection System (GE Healthcare) as described in the protocol provided. Prehybridization was carried out at 42 °C for 1 h in hybridization buffer supplemented with 0.5 M NaCl and 5% blocking agent. Hybridization was carried out overnight at 42 °C in roller bottles. Following hybridization, the membrane was washed twice for 10 min at 55 °C in wash solution (0.4% SDS, 0.5× SSC). Finally, the membrane was washed twice in 2× SSC, 5 min per wash at room temperature. After incubation with ECL detection reagents, hybridization products were detected by exposure of the membrane to Hyperfilm ECL X-ray film (GE Healthcare).
Depending on their presence in each strain, full-length ligA (~3.7 kb), ligB (~5.7 kb) and ligC (~5.9 kb) genes from L. interrogans serovars Pomona and Canicola, L. noguchii serogroup Bataviae and L. weilii serogroup Hebdomadis were amplified using Elongase Mix (Invitrogen), which contains a proof-reading polymerase, and subsequently cloned using the TOPO-TA cloning kit (Invitrogen). Each gene sequence was determined by direct sequencing of PCR products amplified using the cloned genes and internal primers (Table S4). Ambiguous bases or those differing from previously published sequences were resolved by direct sequencing of PCR products amplified from genomic DNA. Each base was sequenced on both strands a minimum of two times, resulting in each base being sequenced a minimum of four times. Previously unpublished sequences were submitted to GenBank and assigned accession numbers EU700267 to EU700275.
Raw DNA sequences were analysed for the quality of base calling using Phred (Ewing et al., 1998), the 5′ and 3′ ends were trimmed accordingly using FinchTV (Geospiza Inc.) and each contig was assembled using Contig Express (Vector NTI, Invitrogen) at the default settings. Consensus DNA sequences were exported and coding sequences (CDS) identified using Vector NTI. CDS were aligned using AlignX (Vector NTI, Invitrogen), based on the ClustalW algorithm (Thompson et al., 1994), at the default settings. Protein sequences were generated by translation of the CDS and were aligned using AlignX at the default settings. The number of observed synonymous (Sd), and nonsynonymous (Nd) substitutions in the lig genes were calculated using Syn-SCAN (Gonzales et al., 2002). The evolutionary history relating the ligA, ligB, and ligC genes from a total of 14 strains was examined using a statistical phylogenetic approach. Alignments of the genes were based on their amino acid translations to maintain reading frames and then reverse translated back to their nucleotide sequences using the Geneious Pro 3.0.1 software package (Drummond et al., 2007). Possible intragenic recombination was examined using the DualBrothers plugin for Geneious. Default settings were used with the following exceptions: chain length 220,000, subsampling frequency 100 and burn-in length 10,000. DualBrothers is a recombination detection algorithm based on a phylogenetic dual multiple change-point model (MCP) (Suchard et al., 2002, 2003; Minin et al., 2005). The MCP model allows for changes in evolutionary relationships and rates across sites in a multiple sequence alignment by assuming that the sites separate into an unknown number of contiguous segments, each with possibly different topologies or mutation processes. Differing evolutionary topologies on either side of a break-point suggests recombination (Li et al., 1988). 
The DualBrothers implementation takes a Bayesian approach that employs Markov chain Monte Carlo to simulate from the posterior distribution of model parameters. One strength of the Bayesian model is that it can measure uncertainty in break-point locations, determine the most likely parental sequences of the putative recombinant and assess the statistical significance of recombination simultaneously; this simultaneous approach avoids the pitfall inherent in sequential testing for recombination found in many recombination detection programs (Suchard et al., 2002). For each putative recombinant, break-points were inferred and parental representatives and the significance of recombination using the approach of Suchard et al. (2003) were assessed.
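The intuition behind break-point detection can be illustrated with a much simpler, non-Bayesian sketch. It slides a window along an alignment and records which candidate parental sequence a query most resembles in each window; a switch in the nearest parent hints at a possible break-point. This is not the DualBrothers algorithm (it estimates no topologies, posterior probabilities or uncertainty), and the function names, toy sequences, window size and step are all hypothetical choices made for illustration.

```python
def window_identity(a, b):
    """Fraction of identical positions between two aligned strings,
    skipping columns where either sequence has a gap ('-')."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def nearest_parent_by_window(query, parents, window, step):
    """For each window, name the parent most similar to the query.

    `parents` maps a (hypothetical) name to an aligned sequence of the
    same length as `query`. Ties go to the first parent in the mapping.
    """
    calls = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        best = max(
            parents,
            key=lambda name: window_identity(query[start:end], parents[name][start:end]),
        )
        calls.append((start, best))
    return calls


def candidate_breakpoints(calls):
    """Window start positions at which the nearest parent changes."""
    return [calls[i][0] for i in range(1, len(calls)) if calls[i][1] != calls[i - 1][1]]


# Toy alignment: the query matches parent_a in its first half and
# parent_b in its second half, mimicking a mosaic gene.
toy_parents = {"parent_a": "A" * 20 + "C" * 20, "parent_b": "G" * 20 + "T" * 20}
toy_query = "A" * 20 + "T" * 20

toy_calls = nearest_parent_by_window(toy_query, toy_parents, window=10, step=10)
toy_breaks = candidate_breakpoints(toy_calls)
```

A window-based scan of this kind tests many windows one after another, which is the sequential-testing pitfall that the simultaneous Bayesian approach described above is designed to avoid.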
Statistical analysis was carried out using GraphPad Prism 5.01 (GraphPad Software, San Diego, CA, USA). The Mann–Whitney test was used to calculate differences between populations and two-tailed P values <0.05 were considered significant.
A PCR screening assay demonstrated that ligB was present in all 10 strains studied, ligA in five strains and ligC in seven strains (Table 1). Using the lig sequences deposited in GenBank, primers were designed to screen for the presence of lig genes in the leptospiral species most commonly associated with human leptospirosis (Table 1). L. interrogans serovars Copenhageni Fiocruz L1-130, Pomona Kennewicki, Canicola Kito and L. kirschneri serovar Grippotyphosa RM52 were found to contain three lig genes (Fig. 1). L. interrogans serovar Lai 56601 and L. weilii serogroup Hebdomadis Eco-Challenge were found to contain ligB and ligC while L. noguchii serogroup Bataviae Cascata and L. borgpetersenii serovar Hardjo strains JB197 and L550 contained only ligB (Fig. 1). Of note, in place of the ligA gene in L. interrogans serovar Lai and the two L. borgpetersenii serovar Hardjo genomes there were transposases (Fig. 1). In serovar Lai, upstream of ligB and where ligA would normally be located, two transposase genes were identified, Tn8 and an integrase, and a ccrB ortholog (encodes camphor resistance in bacteria). L. borgpetersenii serovar Hardjo strain JB197 contained a transposase, IS1533, and the C-terminal of the ccrB ortholog. L. borgpetersenii serovar Hardjo L550 contained an IS1533 pseudogene and the C-terminal region of the ccrB gene. The JB197 strain has a genome layout similar to that of serovar Lai with a genome inversion (Ren et al., 2003), while the genome organization of strain L550 more closely resembles that of the serovars Copenhageni, Pomona and Canicola.
To confirm the integrity of the lig genes in pathogenic Leptospira spp. they were cloned and sequenced from L. interrogans serovar Canicola, serovar Pomona, L. noguchii serogroup Bataviae and L. weilii serogroup Hebdomadis. The presence or absence of the lig genes was also confirmed by Southern blotting (data not shown). The ligA and ligB genes in serovars Pomona and Canicola were of the expected sizes and genome locations, with ligA situated upstream of ligB in both strains. We previously reported that the ligC gene was a pseudogene in L. interrogans serovar Copenhageni and L. kirschneri serovar Grippotyphosa (Matsunaga et al., 2003). In this study we found ligC to be present in L. interrogans serovars Pomona and Canicola, and L. weilii serogroup Hebdomadis, but absent in L. noguchii serogroup Bataviae.
LigB was the only lig gene ortholog found in all 10 pathogenic strains of Leptospira spp. studied (Table 1) and was compared by estimating pair-wise sequence distances. LigB was found to be significantly more variable than ligA (P < 0.05) and ligC (P < 0.005), ranging from 67.9 to 99.9% (mean 80.2 ± 13.1%) and 62.6 to 99.9% (mean 77.3 ± 15.3%) for pair-wise DNA and amino acid sequence identity, respectively (Table S1). The base substitutions in ligB were found throughout the gene, although there was increased variability within a region of the carboxyterminal domain (amino acids 1518–1669). When phylogenetic groupings were analysed, L. interrogans serovars Copenhageni, Canicola, Lai and Pomona had a 95.4 and 94.4% mean pair-wise DNA and amino acid sequence identity, respectively. The mean number of observed synonymous substitutions (Sd) was 77.8, the mean number of observed nonsynonymous substitutions (Nd) was 65.5 and the ratio of synonymous to nonsynonymous substitutions (dS/dN) was 4.01. Inclusion of L. kirschneri serovar Grippotyphosa in this grouping reduced, but not significantly, the mean identity observed to 91.0 and 90.6% for DNA and amino acid sequences, respectively (dS/dN = 4.88). The two L. borgpetersenii serovar Hardjo strains and the L. weilii Eco-Challenge strain had a 90.9 and 90.2% mean DNA and amino acid sequence identity, respectively, and the majority of the base substitutions were synonymous (mean Sd = 182.7, mean Nd = 125.7, mean dS/dN = 4.24). LigB from the L. noguchii strain did not fall within either of these two groupings and inclusion in either group increased the levels of sequence variability by 20–30% (Table S1).
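As a minimal sketch of how pair-wise identity summaries of the kind reported above (range, mean ± SD) can be derived from a multiple alignment, the following computes percent identity for every pair of aligned sequences. It assumes that columns containing a gap in either sequence are excluded from the comparison; identity conventions differ between tools, and the convention used by AlignX is not stated here. The strain names and sequences are toy placeholders, not data from this study.

```python
from itertools import combinations
from statistics import mean, stdev


def percent_identity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns with a gap ('-') in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def pairwise_summary(aligned):
    """Return per-pair identities plus their min, max, mean and SD."""
    identities = {
        (n1, n2): percent_identity(aligned[n1], aligned[n2])
        for n1, n2 in combinations(aligned, 2)
    }
    values = list(identities.values())
    return {
        "pairs": identities,
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "sd": stdev(values) if len(values) > 1 else 0.0,
    }


# Toy alignment with hypothetical strain labels.
toy_alignment = {
    "strain_1": "ATGCATGCAT",
    "strain_2": "ATGCATGCAA",
    "strain_3": "ATGCATGGTT",
}
toy_summary = pairwise_summary(toy_alignment)
```

The same function can be applied to translated sequences to obtain amino acid identities.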
LigA from five strains (Table 1) was highly conserved, ranging from 85.4 to 99.8% (mean 91.8 ± 5.9%) and 80.5 to 99.8% (mean 88.9 ± 6.8%) DNA and amino acid sequence identity, respectively (Table S2). The carboxyterminal region of LigA, comprising Big domain repeats 11–13 (see Fig. 2a), exhibited the highest level of variability, with 78.1 and 71.2% mean pair-wise DNA and amino acid identity, respectively. Although LigA sequence conservation was high, the majority of the single-nucleotide polymorphisms resulted in nonsynonymous amino acid substitutions (mean Sd = 119.7, mean Nd = 174.9, mean dS/dN = 2.89). Of note, the serovar Pomona LigA amino acid sequence previously deposited in GenBank (Palaniappan et al., 2002; accession number AAN52495) contained a region (amino acids 740–774) of very low identity (<10%) compared to the other LigA sequences, specifically, point-insertions at nucleotide positions 2218, 2315 and 2322. We resequenced the ligA gene from the serovar Pomona Kennewicki PO-06-047 strain, accession number EU700270, and did not find the point-insertions.
The ligC gene was present in 7 of the 10 strains evaluated in this study. Furthermore, it was the most conserved Lig protein. LigC DNA and amino acid sequence identity ranged from 77.0 to 100% (mean 91.2 ± 8.9%) and 83.6 to 99.9% (mean 94.1 ± 6.2%), respectively (Table S3). Similar to ligB, the base substitutions are spread throughout the gene. The majority of the base substitutions were nonsynonymous (mean Sd = 96.1, mean Nd = 342.9, mean dS/dN = 1.03). Previously we found that the ligC genes from L. interrogans serovar Copenhageni and L. kirschneri serovar Grippotyphosa appear to be pseudogenes. Serovar Copenhageni contained a point mutation while serovar Grippotyphosa contained a frame-shift mutation, both of which resulted in stop codons (Matsunaga et al., 2003). The major difference between the various ligC sequences was a gap towards the middle of ligC (nucleotides 2839–3104) from the Eco-Challenge strain, as there was one less Big repeat domain in the L. weilii ligC.
Translation-based alignment of nucleotide sequences of the immunoglobulin-like domains of LigA, LigB, and LigC of L. interrogans serovar Copenhageni revealed clustering of LigB domains 5–10 with LigA domains 8–13 (Fig. 2b). The same relatedness pattern was found for the Big domains of LigA, B, and C of L. kirschneri serovar Grippotyphosa (data not shown). This result suggests that the ligA gene was derived from the ligB gene in a partial gene duplication event (Fig. 2a). The ligA genes were found only in L. interrogans and L. kirschneri strains, indicating that the partial gene duplication event occurred in a progenitor of the L. interrogans–L. kirschneri branch of the leptospiral evolutionary tree. LigC domains were more distantly related than the LigA and LigB domains, consistent with the more ancient origin of ligC (see below).
The relatedness of the 10 full-length ligB sequences is presented in Fig. 3a. The ligB sequences of L. borgpetersenii and L. weilii cluster together in one region of the tree while the L. interrogans and L. kirschneri sequences cluster together in a separate region of the tree, with the ligB sequence of L. noguchii occupying an intermediate position. This relatedness pattern is consistent with a phylogenetic tree based on 16S rDNA sequences (Haake et al., 2004). Analysis using the multiple change-point (MCP) model revealed that all ligB genes except two were phylogenetically clonal (no evidence of rearrangements). The ligB sequence of L. interrogans serovar Copenhageni was found to have two L. kirschneri-like regions: a 153 nucleotide region in Big domain 11, and a 500 nucleotide region in the carboxyterminal domain (Fig. 3b). The ligB sequence of L. kirschneri serovar Grippotyphosa was found to have a large 1300 nucleotide rearrangement containing an L. interrogans serovar Lai-like sequence in the region coding for the carboxyterminal domain (Fig. 3c).
There were two different phylogenetic trees for the 5′ (nucleotides 1–2820) and 3′ regions (nucleotides 2821–3675) of the ligA gene sequences. The phylogenetic tree for the 5′ region (Fig. 4a) showed low levels of sequence non-identity (ranging from 1 to 9%) for the five strains. In contrast, the tree for the 3′ region (Fig. 4b) revealed that while the ligA sequences of the Canicola and Pomona strains are 100% identical, their ligA sequences were 33% non-identical to the Grippotyphosa and Copenhageni ligA sequences, a 4–10-fold increase in sequence non-identity. One possible interpretation of this difference in sequence variation for the two different regions of the ligA gene is the recent acquisition of the 3′ ligA region, encoding the repeats 11–13 (Fig. 4c).
In contrast to the distribution of the ligA genes, the ligC genes were found not only in L. interrogans and L. kirschneri, but also in L. weilii. The sequence identity between the ligB and ligC sequences of L. interrogans and L. weilii was similar (69 and 78%, respectively), suggesting that the ligB and ligC genes coevolved with leptospiral evolution, rather than the L. weilii ligC gene representing a more recent horizontal acquisition. MCP analysis of the seven ligC genes revealed a single tree structure consistent with leptospiral evolution throughout, except for an L. interrogans-like 640 nucleotide insertion (4660–5300) into the carboxyterminal domain coding region of the ligC gene of L. weilii serovar Eco-Challenge (Fig. 5). In addition, the ligC gene of L. weilii serovar Eco-Challenge was found to lack one Big repeat domain (domain 11).
Genome sequencing studies have demonstrated that pathogenic Leptospira spp. contain ligB together with up to two lig paralogs, ligA and ligC (Ren et al., 2003; Nascimento et al., 2004; Bulach et al., 2006), while they are absent from the non-pathogenic saprophyte, L. biflexa (Matsunaga et al., 2003; Picardeau et al., 2008). However, little was known about the distribution of the lig genes among pathogenic leptospiral strains or their interrelationships. Our study demonstrates that an intact ligB gene is found in all leptospiral pathogens studied to date, suggesting an important, or perhaps essential, role in virulence. The ligA gene appears to have been derived from ligB by a two-step partial gene duplication process. LigC is structurally similar to LigB and ligC gene variation also follows phylogenetic patterns, suggesting an early gene duplication event. However, the role of LigC in virulence is less clear, as some strains have lost ligC, while in others, such as L. interrogans serovar Copenhageni and L. kirschneri serovar Grippotyphosa, ligC is a pseudogene. Of note, we were unable to demonstrate any association between the number of lig genes, their diversity, or the hosts from which they were isolated and the degree of virulence in the hamster model (data not shown). There is strong evidence that LigA is expressed during infection (Palaniappan et al., 2002; Matsunaga et al., 2003; Koizumi and Watanabe, 2004; Silva et al., 2007; Srimanote et al., 2008) and yet several virulent pathogenic Leptospira strains do not contain ligA (Table 1). A possible explanation is that ligA and ligB are involved in virulence but that both copies are not required, an example of gene redundancy. Indeed, a recent report showed that a ligB knockout in L. interrogans strain Fiocruz L1-130 did not alter the virulence of the ligB- strain (Croda et al., 2008). 
As LigA was expressed in the ligB- strain it would appear that LigA can replace LigB during infection, although the role of the Lig proteins in virulence is not yet clear.
Phylogenetic analysis based on LigB, which is conserved in all strains, sorted the strains into three distinct groups: (i) L. kirschneri and L. interrogans; (ii) L. borgpetersenii and L. weilii; and (iii) L. noguchii (Fig. 1). Amino acid sequence alignment of LigB reveals that overall the carboxyterminal domain is the most conserved region (>60% mean identity), although there are short, highly variable regions within this domain. However, when aligned by phylogenetic group the aminoterminal region (Big domains 1–12) exhibited a similar level of identity when compared with the carboxyterminal domain (>90% mean identity). LigA, although only present in L. interrogans and L. kirschneri strains, demonstrated a high level of conservation of Big domains 1–10 (89% mean identity), while the full-length LigA had a mean identity of 80%. LigC was the most conserved of the Lig proteins (90% mean identity); however, it appears to be a pseudogene in L. interrogans serovar Copenhageni and L. kirschneri serovar Grippotyphosa (Matsunaga et al., 2003) and is absent in L. borgpetersenii and L. noguchii strains (this work). The most widely used method for identifying non-functional genes is the dS/dN test, which compares the rate of synonymous to nonsynonymous mutations (Nei and Kumar, 2000). Pseudogenes have no functional restrictions as they are not expressed and are therefore expected to have a dS/dN ratio that does not differ significantly from one. Analysis of ligC revealed a mean dS/dN ratio of 1.03, suggesting that ligC may have lost or is in the process of losing its role in virulence and that strains containing this gene may be subject to a genome reduction event in the future.
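The logic of the dS/dN test can be sketched as follows. The observed counts (Sd, Nd) are first divided by the numbers of potential synonymous and nonsynonymous sites to give proportions, which are then corrected for multiple hits with the Jukes-Cantor formula, as in the Nei-Gojobori approach. Whether Syn-SCAN applies exactly this correction is an assumption here, and the site counts and substitution counts in the example are hypothetical values chosen only to show the arithmetic. Because a coding sequence contains more nonsynonymous than synonymous sites, the ratio of raw counts Sd/Nd generally differs from dS/dN.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75) for the correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def ds_dn_ratio(sd, nd, s_sites, n_sites):
    """Return (dS, dN, dS/dN) from observed substitution counts.

    sd, nd           observed synonymous / nonsynonymous substitutions
    s_sites, n_sites potential synonymous / nonsynonymous sites
                     (hypothetical inputs in this sketch)
    """
    ds = jukes_cantor(sd / s_sites)
    dn = jukes_cantor(nd / n_sites)
    if dn == 0:
        raise ValueError("dN is zero; the ratio is undefined")
    return ds, dn, ds / dn


# Hypothetical example: equal raw counts, but three times as many
# nonsynonymous sites, so the per-site synonymous rate is higher.
toy_ds, toy_dn, toy_ratio = ds_dn_ratio(sd=30, nd=30, s_sites=300, n_sites=900)

# Hypothetical neutral case: the same proportion of sites changed in
# both classes, so the ratio is one, as expected for a pseudogene.
_, _, neutral_ratio = ds_dn_ratio(sd=10, nd=30, s_sites=100, n_sites=300)
```

A ratio well above one indicates that nonsynonymous changes are being removed by purifying selection, while a ratio near one is the pattern expected when a gene is no longer under functional constraint.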
Evidence is presented for horizontal recombination events affecting all three lig genes. The results indicate that several types of evolutionary mechanisms have been acting on the lig genes, including genetic drift, gene duplication, and horizontal gene transfer. We find that genetic drift accounts for most of ligB genetic diversity, suggesting that ligB was acquired early during the evolution of leptospires from free-living saprophytes to colonizers of host tissues. Although the level of DNA and amino acid sequence diversity for the LigB was similar to what had been observed previously for the gene encoding the porin, OmpL1, it would have been difficult to anticipate what level of diversity to expect without a sequence analysis study of this type. Our previous study found surprisingly large differences in the rates of sequence variation among genes encoding surface-exposed leptospiral outer membrane proteins (Haake et al., 2004). Sequences of lipL32 genes encoding the major outer membrane protein were highly conserved (3.1% DNA and 0.9% amino acid sequence non-identity). Strong lipL32 sequence conservation was not anticipated because LipL32 is highly immunogenic; >95% of patients with leptospirosis have an antibody response to LipL32 (Flannery et al., 2001). In contrast to lipL32, genes encoding OmpL1 exhibited significantly higher rates of sequence variation (14.9% DNA and 9.6% amino acid sequence non-identity). Rates of sequence variation and recombination for the lig genes were comparable to those observed for the ompL1 genes. However, there were some notable differences compared to the earlier study. For example, DNA sequence variability was higher than amino acid sequence variability for lipL32, lipL41, and ompL1. In contrast, the rates of DNA and amino acid sequence variation for the lig genes were comparable, indicating that lig genes had a higher overall rate of nonsynonymous sequence changes.
Our analysis revealed evidence for a second mechanism of sequence diversity for all three lig genes: recombination events and horizontal DNA transfer between related bacterial species. Two ligB genes and one ligC gene were found to be mosaics. The ligB gene of L. interrogans serovar Copenhageni was found to contain two L. kirschneri-like insertions: one insertion in Big domain 11 and a second insertion in the carboxyterminal domain (relative probability of recombination event >1000:1). The ligB gene of L. kirschneri strain RM52 and the ligC gene of L. weilii strain Eco-Challenge included L. interrogans-like insertions in the regions encoding their carboxyterminal domains (relative probability of recombination event >1000:1). We previously reported mosaicism for 20% of genes encoding the outer membrane protein, OmpL1. Mosaicism does not affect all leptospiral outer membrane proteins, as no evidence of recombination events was found for the gene encoding the major outer membrane lipoprotein, LipL32, and only one recombination event affecting a second outer membrane lipoprotein, LipL41, from 38 different strains representing six pathogenic Leptospira spp. (Haake et al., 2004). In the case of ligA, three of five strains appear to have acquired the same DNA encoding their last three ligA Big domains. Phylogenetic comparison of the transferred DNA encoding the exogenous ligA Big domains 11–13 with all known lig Big domains shows that they are most closely related to their ligA orthologs in L. interrogans serovar Copenhageni and L. kirschneri serovar Grippotyphosa (data not shown). However, insufficient ligA sequence data is available to determine the phylogenetic origin of this exogenous ligA DNA.
Of note, we found that the first six Big domains from ligB and the first six Big domains from ligA were essentially identical (98.5 ± 0.8% mean identity). This is an important observation for the future development of diagnostic reagents. Recombinant LigB polypeptides containing Big domains 2–6 from L. interrogans serovar Copenhageni and L. kirschneri serovar Grippotyphosa were evaluated as antigens for the diagnosis of leptospirosis. Sensitivity and specificity were reported to be >90 and >97%, respectively, during the acute phase of leptospirosis (Croda et al., 2007). Furthermore, there was no evidence of major genetic rearrangements in this region. The lack of genetic drift within the identical regions of ligB and ligA is evidence of selective pressure, intragenic recombination or gene conversion. The remaining Big domains (7–12 in ligB and 7–13 in ligA) were considerably more variable (34.2 ± 1.6% mean identity). This is an important observation, as our previous findings indicate that these Big domains are involved in binding extracellular matrix proteins and fibrinogen (Choy et al., 2007). Together these findings support a role for the Lig proteins during the transmission of leptospirosis. The carboxyterminal Big domains may have evolved to recognise specific host extracellular matrix proteins. This region was subject to horizontal recombination between Leptospira spp., suggesting that increased variability in the carboxyterminal Big domains of ligA and ligB may have contributed to adaptation to novel hosts, potentially accounting for the extensive serovar-host specificity that typifies leptospirosis.
Our study had several potential limitations. The status of the ligB gene of L. interrogans serovar Manilae remains to be clarified since the ligB gene sequence has been reported to lack the carboxyterminal domain due to a point mutation that created a stop codon (Koizumi and Watanabe, 2004). Due to the difficulties inherent in sequencing the lig genes it is possible that the ligB sequence for serovar Manilae contains an erroneous stop codon in the carboxyterminal domain. Bacterial genes containing indels resulting in premature stop codons are defined as pseudogenes and are not functional (Ochman and Davalos, 2006). Yet, experimental data suggested that LigB was expressed in L. interrogans serovar Manilae. Koizumi and Watanabe (2004) showed that sera from leptospirosis patients specifically recognized recombinant LigB cloned from serovar Manilae. The partial ligB sequence (accession number AB098517) demonstrated >95% identity with the same region in the other ligB orthologs (data not shown). However, we were unable to obtain the serovar Manilae strain to evaluate this possibility and therefore excluded the serovar Manilae lig gene sequences from further analyses. Nevertheless, sequence data from other serovars from the same Icterohaemorrhagiae serogroup (serovars Lai and Copenhageni) as well as other strains from the same species (serovars Canicola and Pomona) provided consistent findings for ligB (see Table 1). Sequence alignment suggests that the previously sequenced ligA from L. interrogans serovar Pomona strain Kennewicki (GenBank sequence AAN52495) contains a highly variable region (<10% amino acid identity) due to three indels that alter the coding sequence of this region. These indels are either representative of strain-to-strain variation within serovar Pomona Kennewicki or of PCR artefacts. The ligA sequence data in this study was derived from three independent PCR products sequenced in both directions for the region in question. 
Each base was therefore sequenced a minimum of six times to rule out the possibility of PCR artefacts and sequence errors. The two strains are probably different: our strain was isolated from an aborted swine foetus, whereas the other was isolated from a case of equine recurrent uveitis (Palaniappan et al., 2002). More serovar Pomona strains would need to be sequenced to resolve this issue. A further limitation is that we were only able to include 10 Leptospira strains in our analysis; however, all of these strains were considered virulent.
To date, Lig research has focused mainly on the Big domains, which are highly antigenic and are the focus of most of the anti-Lig antibody response during leptospirosis infection. LigA and LigB repeats have been shown to provide protective immunity in animal models of leptospirosis (Koizumi and Watanabe, 2004; Palaniappan et al., 2006; Silva et al., 2007). In addition, Big domains 7–13 of LigA and domains 7–12 of LigB have been shown to function in binding to host extracellular matrix proteins (Choy et al., 2007). The results presented here serve to highlight the potential importance of the carboxyterminal domain of LigB and to prompt studies investigating its cellular location and function. Sequence analysis of lig genes from multiple strains is important in the ongoing efforts to develop new Lig-based serologic tests and vaccines. LigB has been found in all strains studied, making it an ideal candidate for vaccine and diagnostic applications. Furthermore, sequence conservation of LigB from L. interrogans, which contains the serovars most important to public health, was high (>96%). The sequence information provided here will improve the performance characteristics of the Lig proteins in diagnostic assays and subunit vaccine formulations. Knowledge of sequence variations among Lig serodiagnostic antigens should prompt cross-reaction studies using sera from different human and animal populations and possibly inclusion of homologous antigens from antigenically distinct strains. Future Lig vaccine studies should evaluate cross-protection using antigens and challenge strains with different lig sequences in order to assess the effect of sequence variation on immunoprotection.
This work was supported by Bio-Manguinhos, Oswaldo Cruz Foundation, Brazilian Ministry of Health (grant 09224-7), Brazilian National Research Council (grants 01.06.0298.00 3773/2005, 420067/2005, 554788/2006 and 473006/2006-5); Public Health Service grants AI-34431 (to D.A.H.) and R01 AI034431 (to A.I.K.) from the National Institute of Allergy and Infectious Diseases, the VA Medical Research Funds (to D.A.H.) and grant D43 TW00919 (to A.I.K.) from the Fogarty International Center, National Institutes of Health. M.A.S. is supported by an Alfred P. Sloan Research Fellowship and a John Simon Guggenheim Memorial Fellowship.
Appendix A. Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.meegid.2008.10.012.