Comparative genomic hybridization was used to compare genetic diversity of five strains of Leptospira (Leptospira interrogans serovars Bratislava, Canicola, and Hebdomadis and Leptospira kirschneri serovars Cynopteri and Grippotyphosa). The array was designed based on two available sequenced Leptospira reference genomes, those of L. interrogans serovar Copenhageni and L. interrogans serovar Lai. A comparison of genetic contents showed that L. interrogans serovar Bratislava was closest to the reference genomes while L. kirschneri serovar Grippotyphosa had the least similarity to the reference genomes. Cluster analysis indicated that L. interrogans serovars Bratislava and Hebdomadis clustered together first, followed by L. interrogans serovar Canicola, before the two L. kirschneri strains. Confirmed/potential virulence factors identified in previous research were also detected in the tested strains.
Leptospirosis is a zoonotic disease that is found worldwide. It has gained increasing attention because of several recent outbreaks, and it has also become a major public health concern in many developing countries (13, 14, 15, 16, 17, 24, 30). Leptospirosis is caused by a range of leptospires with a broad range of hosts, showing a variety of symptoms including high fever, severe headache, chills, muscle aches, jaundice, and vomiting (7, 10, 25). So far, more than 200 strains and serovars are considered to be pathogenic (2, 17, 18).
Complete genome sequences are currently available for six Leptospira strains: four pathogenic strains, Leptospira interrogans serovar Lai strain 56601 (23), L. interrogans serovar Copenhageni strain Fiocruz L1-130 (19), and Leptospira borgpetersenii serovar Hardjo strains L550 and JB197 (3), and two saprophytic strains, Leptospira biflexa serovar Patoc I strains Paris and Ames (20). The availability of these genomes has opened a window for scientists to observe and explore the genetic diversity of this genus. Based on genome comparisons, genetic diversity was observed between saprophytic Leptospira (L. biflexa) and pathogens of leptospirosis (20), between different species (L. borgpetersenii and L. interrogans) within the Leptospira genus (3), and between different serovars (L. interrogans serovars Copenhageni and Lai) within a species (19). These differences were related to strain evolution events, pathogenesis, and flexibility of survival in various environments. Because the leptospires are so diverse, additional genomic information would benefit studies on their pathogenesis and adaptation to various environments, as well as the development of further strategies for diagnosis and vaccines.
Comparative genomic hybridization (CGH) has proven to be a very powerful tool for high-throughput genomic comparisons between pathogenic and nonpathogenic species and for genomic screening of many bacterial species (5, 8, 9, 12, 21, 24, 30). The available genomic sequences of L. interrogans serovars Lai and Copenhageni were used to design a tiling array. In this study, genomic DNA from five strains, three from L. interrogans and two from L. kirschneri (Table 1), was applied to the array, providing a foundation for high-throughput genome screening of leptospire strains. To our knowledge, this is the first report of applying CGH to a comparative genomics study of Leptospira species.
Five Leptospira reference strains, three from L. interrogans and two from L. kirschneri, were chosen as representatives in this study. These strains shown in Table 1 were isolated from different hosts and countries and belonged to different sequence types, as determined using multilocus sequence typing (MLST) (27).
Leptospires were grown in Ellinghausen-McCullough-Johnson-Harris (EMJH) base medium (Difco, Sparks, MD) supplemented with 1% bovine serum albumin (BSA), 1.25% Tween 80, and 2% rabbit serum at 30°C under aerobic conditions. Genomic DNA extraction was performed as previously described from approximately 100-ml cultures of leptospires grown to late exponential phase (11).
In this study, L. interrogans serovars Copenhageni and Lai were selected as reference genomes. Probes for the microarray were designed based on the reference genome sequences. Each of the reference strains has two chromosomes. They are L. interrogans Lai strain 56601 chromosome (chr) I (accession number AE010300), L. interrogans Lai strain 56601 chr II (AE010301), L. interrogans Copenhageni strain Fiocruz L1-130 chr I (AE016823), and L. interrogans Copenhageni strain Fiocruz L1-130 chr II (AE016824). The whole genome sequences for these two strains were extracted from sequence files. Each genome was split into fragments 500 bp in length, and these fragments were put through the Oxford Gene Technology's probe design pipeline with up to three probes designed to each fragment. The number of probes was then automatically reduced and optimized in silico to generate the final probe set to fit onto a 4-by-44,000 Agilent array with the aim of conserving probe coverage of at least one probe per gene.
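The fragmentation step that precedes probe design can be illustrated with a minimal sketch. It assumes consecutive, non-overlapping 500-bp windows with a shorter final fragment; the text does not state how fragment boundaries or the tail were handled, and the function name `split_into_fragments` and the toy sequence are hypothetical. The probe design pipeline itself is not reproduced here.

```python
def split_into_fragments(sequence, fragment_length=500):
    """Cut a sequence into consecutive, non-overlapping fragments.

    Returns (start_offset, fragment) pairs. The last fragment may be
    shorter than fragment_length (an assumption; not stated in the text).
    """
    if fragment_length <= 0:
        raise ValueError("fragment_length must be positive")
    return [
        (start, sequence[start:start + fragment_length])
        for start in range(0, len(sequence), fragment_length)
    ]


# Toy 1,200-bp sequence standing in for a chromosome.
toy_sequence = "ACGT" * 300
fragments = split_into_fragments(toy_sequence)
fragment_lengths = [len(fragment) for _, fragment in fragments]

# With up to three probes designed per fragment, this is the upper bound
# on candidate probes before the in silico reduction step.
max_candidate_probes = 3 * len(fragments)
```

For a genome of several megabases, this upper bound far exceeds the capacity of a 44,000-feature array, which is why a reduction step is needed to arrive at the final probe set.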
DNAs for each sample were quantified before labeling, and quality was also checked. The readings of A260/A280 were more than 1.80 for all the samples. DNA (1.1 μg) was fragmented by sonication with a pulse of 1 s and pause of 1 s for 30 s at 20% amplitude. A total of 21 μl (1 μg) of the sonicated DNA solution was then labeled using Klenow-based incorporation of Cy dye-labeled dCTP. Twenty microliters of 20× random primers was added to each of the samples, and this step was followed by incubation at 95°C for 5 min. The samples were then placed on ice for 5 min when the labeling Master Mix was added to each sample. The Master Mix included 5 μl of 10× deoxynucleoside triphosphate (dNTP) mix, 3 μl of Cy3-dCTP (experiment) or Cy5-dCTP (reference), and 1 μl of Exo-Klenow fragment. The samples were incubated for 2 h at 37°C and then purified using Qiagen QIAquick cleanup columns. The yield and efficiency of dye incorporation (specific activity) of purified labeled DNA were then determined.
The labeled samples were vacuum concentrated and prepared for hybridization. The samples were incubated for 3 min at 94°C and 30 min at 37°C and then applied to the slides and hybridized for 24 h at 65°C and at 20 rpm. Slides were scanned using an Agilent scanner at 100% scan power. For slides and dyes for which the scanning appeared to be saturated at 100%, a lower scanning power was used for the intensity data (10%).
The data were extracted using Agilent feature extraction 9.5.3. Channels were selected from all slides that had a median intensity across all features of greater than 500 at 100% scan power. Foreground and background intensities for each of these arrays were taken from the feature-extracted data and paired across all combinations as if they were the two colors on a normal array. Background subtraction and dye normalization (Loess) were performed in exactly the same manner as for a normal two-color array. Fold changes were plotted by genomic location for visual identification of segmentation patterns.
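The channel selection and fold-change calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline used in the study: Loess dye normalization is omitted, background-subtracted intensities are floored at 1.0 to keep the logarithm defined (an assumption), and the function names and toy intensities are hypothetical.

```python
import math
from statistics import median


def usable_channel(foreground, min_median=500):
    """Keep a channel only if its median feature intensity exceeds the cutoff."""
    return median(foreground) > min_median


def log2_fold_changes(fg_a, bg_a, fg_b, bg_b, floor=1.0):
    """Background-subtract two paired channels and return per-probe log2 ratios.

    The floor value is an assumption to avoid taking the log of zero or
    negative numbers; Loess normalization is deliberately left out.
    """
    ratios = []
    for fa, ba, fb, bb in zip(fg_a, bg_a, fg_b, bg_b):
        signal_a = max(fa - ba, floor)
        signal_b = max(fb - bb, floor)
        ratios.append(math.log2(signal_a / signal_b))
    return ratios


# Toy data for three probes on two paired channels.
fg_a, bg_a = [1100, 2100, 600], [100, 100, 100]
fg_b, bg_b = [600, 2100, 1100], [100, 100, 100]

channel_a_ok = usable_channel(fg_a)
dim_channel_ok = usable_channel([100, 200, 300])
fold_changes = log2_fold_changes(fg_a, bg_a, fg_b, bg_b)
```

Plotting `fold_changes` against probe coordinates would give the kind of genomic-location view used for visual identification of segments.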
Principal-component analysis (PCA) was conducted using SYSTAT, version 10, software (SPSS, Inc., Chicago, IL). PCA was employed to evaluate the relationships among strains tested based on microarray hybridization signal intensities.
The raw microarray data of five Leptospira strains were processed with extraction of background, removal of outliers, and normalization using a global Loess method. Probes for the tiling array were designed to cover the entire genome. Therefore, in the tiling array, each gene was designed into 1 to 27 probes based on the length of the gene. A total of 43,235 probes for the L. interrogans serovar Copenhageni and Lai genomes were designed. Five strains (from L. interrogans serovars Bratislava, Canicola, and Hebdomadis and L. kirschneri serovars Grippotyphosa and Cynopteri) from two Leptospira species (Table 1) were applied to the tiling array. The three serovars (L. interrogans serovars Bratislava, Canicola, and Hebdomadis) from the same species as the reference genome detected much higher percentages of the probes (96.23%, 95.45%, and 94.28%, respectively) than the two L. kirschneri serovars Grippotyphosa and Cynopteri (64.54% and 67.84%, respectively). In addition, the genomes of the five strains tested had higher similarity to the reference genome of L. interrogans serovar Copenhageni than to that of L. interrogans serovar Lai (data not shown).
Genes were classified as present, absent, or partially present. If all probes representing a gene had hybridization intensities equal to or greater than 500, the gene was considered present in the tested strain; if all probes representing a gene had signal intensities less than 500, the gene was considered absent; if one or more, but not all, of the probes representing a gene had signal intensities less than 500, the gene was considered partially present (or partially absent).
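The classification rule can be written as a short sketch. It reads the third case as "some but not all probes below 500", so that the three categories do not overlap; the function name `classify_gene`, the gene labels, and the intensities are hypothetical.

```python
from collections import Counter


def classify_gene(probe_intensities, cutoff=500):
    """Classify one gene from the intensities of all probes that represent it."""
    if not probe_intensities:
        raise ValueError("a gene needs at least one probe")
    above = [value >= cutoff for value in probe_intensities]
    if all(above):
        return "present"
    if not any(above):
        return "absent"
    # Mixed signal: at least one probe above and one below the cutoff.
    return "partially present"


# Toy probe intensities for four hypothetical genes in one strain.
strain_probes = {
    "geneA": [1200, 950, 500],
    "geneB": [120, 80],
    "geneC": [1500, 300, 700],
    "geneD": [499],
}

calls = {gene: classify_gene(values) for gene, values in strain_probes.items()}
counts = Counter(calls.values())
percent_by_call = {call: 100.0 * n / len(calls) for call, n in counts.items()}
```

Note that a gene represented by a single probe can only be called present or absent under this rule, so the partially present category applies only to genes with two or more probes.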
Based on the definition above, the percentage of present, absent, and partially present genes in each functional category (based on published annotation) for each strain was determined (Fig. 1). The known functional diversity showed a broad conservation in L. interrogans serovars Bratislava, Canicola, and Hebdomadis, with 90 to 99.29% of reference genes detected in each functional category, while the highest percentages of reference genes detected in L. kirschneri serovars Grippotyphosa and Cynopteri were 66.99% and 70.83%, respectively, for protein synthesis; the lowest percentages were 21.88% and 24.22%, respectively, for the mobile and extrachromosomal element functions (MEEF). Transposases were predominant in the MEEF category. A new insertion sequence (IS) element, ISlin1, was identified in L. interrogans serovar Copenhageni; IS1500, IS1501, and IS1533 were discovered previously (19). IS1500 was detected in all tested strains. Other IS elements were present in L. interrogans serovars Bratislava, Canicola, and Hebdomadis, and they were either absent or partially present in L. kirschneri serovar Grippotyphosa; only 4 of 31 ISlin1 elements were present in L. kirschneri serovar Cynopteri. Transposases contribute to genetic diversity within species and to adaptability to changing living conditions. This suggested that the two L. kirschneri serovars Grippotyphosa and Cynopteri might have less genetic diversity than the three serovars from L. interrogans.
Compared with the reference genomes, the percentage of genes present in the tested serovars varied from 51.23% (L. kirschneri serovar Grippotyphosa) to 95% (L. interrogans serovar Bratislava), whereas the percentages of partially present genes ranged from 1.70% (L. interrogans serovar Canicola) to 27.90% (L. kirschneri serovar Grippotyphosa), and the percentages of absent genes ranged from 3.82% (L. interrogans serovar Bratislava) to 20.87% (L. kirschneri serovar Grippotyphosa) (Table 2). This result suggests that L. interrogans serovar Bratislava is the closest to the reference genomes of L. interrogans serovars Copenhageni and Lai and that L. kirschneri serovar Grippotyphosa has the least similarity to them.
A total of 3,957 genes were detected in all five tested strains. Of these genes, 54.18% were unclassified, hypothetical, of unknown function, or not assigned any function; the rest were predominantly housekeeping genes involved in transport and binding, regulatory functions, transcription, purines, pyrimidines, nucleosides and nucleotides, protein synthesis, protein fate, energy metabolism, central intermediary metabolism, DNA metabolism, cellular processes, cell envelope, amino acid biosynthesis, and biosynthesis of cofactors and prosthetic groups, most of which were consistent with the core leptospiral genes resulting from the comparison of genomes between the saprophyte L. biflexa and pathogenic species L. interrogans and L. borgpetersenii (3, 19, 20, 23). Eighty-four genes were not detectable in any of the five strains, and these genes did not have functions assigned. The number of genes unique to each tested strain (except for the L. kirschneri serovar Grippotyphosa strain) was 120, 78, 30, and 4 for L. interrogans serovars Bratislava, Canicola, and Hebdomadis and L. kirschneri serovar Cynopteri strains, respectively. Genes unique to L. interrogans and L. kirschneri were also observed. A total of 993 genes were detected only in the three strains of L. interrogans, while only five genes, none with assigned functions, were detected only in the two strains of L. kirschneri. Of the 993 genes, only 183 had known functions, and these were dominated by genes involved in MEEF (47 transposases) and cell envelope (44 lipoproteins and membrane proteins), which are involved in nutrition and signal transduction. In addition, three genes responsible for fruiting body development for long-term survival (28) were detected only in the L. interrogans species. This observation suggests that L. interrogans might better adapt to multiple environments (3, 19, 20) than the L. kirschneri species.
PCA based on signal intensities was used to group and separate strains with similar or dissimilar genetic properties. The results showed that the two L. kirschneri serovars Grippotyphosa and Cynopteri were closely grouped, separated from the three L. interrogans serovars (Fig. 2). Clustering analysis based on microarray signal intensities showed that the cluster formed into two groups, I and II (data not shown). In group I, L. kirschneri serovars Grippotyphosa and Cynopteri were closely clustered together and then clustered with group II from L. interrogans; L. interrogans serovars Bratislava and Canicola grouped first, followed by L. interrogans serovar Hebdomadis. The PCA and clustering results combined showed that strains within species were genetically closer than those from across species. In addition, there are controversies in previous publications about L. kirschneri serovar Grippotyphosa taxonomy. Yasuda et al. (32) assigned L. kirschneri serovar Grippotyphosa to L. interrogans while Ramadass et al. (22) assigned it to L. kirschneri. Our result based on tiling arrays, which covered most of the genome, supported that L. kirschneri serovar Grippotyphosa should be assigned to L. kirschneri instead of L. interrogans, consistent with the most recent result (22) based on DNA hybridization.
Genes identified to be responsible for pathogenesis were also observed in the tested strains (Table 3). ligA, ligB, and ligC previously reported to be probably involved in host-pathogen interactions (19) were present or partially present in the tested strains except those of L. kirschneri; ligB and ligC were present in three L. interrogans strains while ligB was partially present in L. kirschneri strains; ligA was present in one L. kirschneri serovar Cynopteri strain and partially present in all other tested strains. This is consistent with previous results of McBride et al., who used L. interrogans serovar Canicola strain Kito. The L. interrogans serovar Canicola used in this study had 90.3%, 96.7%, and 98.5% DNA sequence identity of ligA, ligB, and ligC genes, respectively, with those from L. interrogans serovar Copenhageni while L. kirschneri serovar Grippotyphosa had 91.4%, 93.2%, and 90.5%, respectively, sequence identity.
Three integrin alpha-like proteins (LIC12259, LIC10021, and LIC13101) from L. interrogans serovar Copenhageni and three (LA1499, LA0022, and LA3881) from L. interrogans serovar Lai were identified as candidate leptospiral adhesins (19). Except for LA0022, which was missing from the microarray, all were present or partially present in all tested strains (Table 3).
Eshghi et al. (6) compared global proteome analyses of L. interrogans serovar Copenhageni grown under conventional in vitro conditions and under conditions mimicking in vivo growth. Four novel proteins (LIC12575, LIC13050, LIC12032, and LIC13166) and related virulence factors were identified, which were present in all five tested strains (Table 3). The lipoproteins LipL21, LipL45, and LipL36 were unique to pathogenic Leptospira based on the sequenced strains (20). LipL21 and LipL45 were present in all tested strains; however, LipL36 varied among the serovars, as it was present in L. interrogans serovars Bratislava and Hebdomadis, partially present in L. kirschneri serovars Grippotyphosa and Cynopteri, but absent in L. interrogans serovar Canicola (Table 3).
Two proteins, Lsa63 (LIC10314) and Lp95 (LIC12690), were observed in all strains tested; they were confirmed to bind laminin and collagen (29) and extracellular matrix components (1), respectively, which were related to invasion of the hosts.
The genomic comparison between saprophyte L. biflexa and the pathogens of leptospirosis (L. borgpetersenii and L. interrogans) showed that 1,431 genes were unique to the pathogens. These genes may be playing a role in pathogenesis since there were no orthologous genes in L. biflexa (20). The array used in this study contained 1,083 of 1,431 genes, and only 323 genes had assigned functions. A clustering analysis for the 323 genes based on genes present, partially present, and absent in the tested strains was performed; for this, present was replaced by 1, partially present was assigned a value of 0.5, and absent was assigned a value of 0. Five clusters were formed (see Fig. S1 in the supplemental material). In cluster I (see Fig. S1a), the present genes varied among strains, probably related to the survival in the environment for different strains; in clusters II and III (see Fig. S1b and c), genes were present in all the strains tested or partially present in the L. kirschneri serovars Grippotyphosa and Cynopteri, in which sphingomyelinase, phage-related protein, leucine-rich repeat protein, and methylase/methyltransferase were reported to be related to pathogenesis; in clusters IV and V (see Fig. S1d), genes were either present only in L. interrogans serovars Bratislava, Canicola, and Hebdomadis (cluster IV) or present only in L. interrogans serovar Canicola (cluster V). The pathogenic roles of most of these genes, even with assigned functions, were not clear; however, the data based on CGH microarray provided basic genomic information that can become the references for further study on Leptospira.
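The numeric encoding used as input to the clustering can be sketched as follows. The clustering algorithm and distance measure are not specified in the text, so the Euclidean distance shown here is only one plausible choice, and the gene profiles are hypothetical toy data (five calls per gene, one per tested strain).

```python
import math

# Encoding described in the text: present = 1, partially present = 0.5, absent = 0.
SCORE = {"present": 1.0, "partially present": 0.5, "absent": 0.0}


def encode(calls):
    """Convert a gene's per-strain calls into a numeric profile."""
    return [SCORE[call] for call in calls]


def euclidean(profile_a, profile_b):
    """Distance between two gene profiles (an assumed metric, see lead-in)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


# Hypothetical genes; strain order is three L. interrogans, then two L. kirschneri.
gene_all_present = encode(["present"] * 5)
gene_partial_in_kirschneri = encode(["present"] * 3 + ["partially present"] * 2)
gene_interrogans_only = encode(["present"] * 3 + ["absent"] * 2)

d_partial = euclidean(gene_all_present, gene_partial_in_kirschneri)
d_absent = euclidean(gene_all_present, gene_interrogans_only)
```

Under this encoding, a gene that is only partially present in the L. kirschneri strains stays closer to a fully conserved gene than one that is absent from them, which matches the separation between clusters II and III and cluster IV described above.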
A tiling array designed to cover the whole genome was used in this study. The advantage of a tiling array compared to an expression array is that the tiling array includes not only open reading frames (ORFs) but also intergenic DNA fragments, potentially providing more information. In addition, on the tiling array, a gene can be represented by more probes according to the gene size, thereby allowing confident identification of genes that were present, partially present, or absent in the tested strains. We proposed the concept of "partially present" in this study so that gene variation during strain evolution could be identified. Furthermore, genes reported to be involved in pathogenesis were observed in all five strains. However, a limitation of an array based on reference genomes is that genes unique to the tested strains, that is, genes absent from the reference genomes, cannot be detected because they are not on the array. Our results also showed that the tiling CGH array could clearly distinguish species and identify the differences in genetic content for each strain. Thus, the tiling CGH array designed for this study is appropriate for high-throughput genome screens of Leptospira.
We thank Duangjai Suwanchareon at the National Institute of Animal Health, Department of Livestock and Development, Ministry of Agriculture and Cooperatives, Bangkok, Thailand, for providing Leptospira strains.
This work was supported by National Institutes of Health grant SO6GM0816-37.
Published ahead of print 17 February 2012
Supplemental material for this article may be found at http://aem.asm.org/.