Lactobacillus helveticus is a versatile dairy bacterium found to possess heterogeneous genotypes depending on the ecosystem from which it was isolated. The recently published genome sequence showed the remarkable flexibility of its structure, demonstrated by a substantial level of insertion sequence (IS) element expansion in association with massive gene decay. To assess this diversity and examine the level of genome plasticity within the L. helveticus species, an array-based comparative genome hybridization (aCGH) experiment was designed in which 10 strains were analyzed. The aCGH experiment revealed 16 clusters of open reading frames (ORFs) flanked by IS elements. Four of these ORFs are associated with restriction/modification which may have played a role in accelerated evolution of strains in a commercially intensive ecosystem undoubtedly challenged through successive phage attack. Furthermore, analysis of the IS-flanked clusters demonstrated that the most frequently encountered ISs were also those most abundant in the genome (IS1201, ISL2, ISLhe1, ISLhe2, ISLhe65, and ISLhe63). These findings contribute to the overall viewpoint of the versatile character of IS elements and the role they may play in bacterial genome plasticity.
Lactobacillus helveticus is a gram-positive, homofermentative lactic acid bacterium which is widely used in the manufacture of cheeses, such as Swiss cheese and some Cheddar-type cheeses (22, 25). It is also commonly used in the production of different types of Italian cheeses, such as Parmigiano Reggiano (18) and Grana Padano, where it contributes to the formation of specific flavor compounds (42).
Phylogenetic analysis of ribosomal protein sequences derived from lactobacilli and streptococci classified L. helveticus in the same group along with both gastrointestinal (GI) tract and dairy-specific species (14). Comparative analysis of the 16S rRNA of L. helveticus DPC4571 revealed 98.4% identity with Lactobacillus acidophilus NCFM and indicated that this probiotic strain was closely related to strain DPC4571, despite the different environments these two lactobacilli inhabit (4). The results of genomic analysis of L. helveticus suggested that two major events have occurred in the diversification process of L. helveticus from a common ancestor with L. acidophilus, selective gene loss and acquisition of a large number of insertion sequence (IS) elements (4). IS elements are DNA sequences capable of independent transposition within and between bacterial genomes (31). Their capacity for independent mobility demonstrates the parasitic nature of these elements (11); however, they can also be regarded as having a positive influence, as they assist in promoting genetic variation (1). Thus, even though the primal character of these elements remains unclear in that they may be considered simply as selfish DNA elements, their impact on the architecture of microbial genomes is undeniable. It has already been demonstrated that IS-related mutations occur in Escherichia coli (44), Lactococcus lactis (10), Mycobacterium tuberculosis (33), and Francisella tularensis (39). Their active role was also demonstrated in the evolution of Paracoccus methylutens DM12 plasmids (3). Early bioinformatic analysis of the L. helveticus DPC4571 genome sequence resulted in identification of IS-associated truncations in genes associated with cellobiose transport, acetaldehyde dehydrogenase and diacetyl reductase (6). Considering the extraordinary abundance of IS elements in the L. 
helveticus DPC4571 chromosome (213 in total), it is noteworthy that very few open reading frames (ORFs) are directly affected by their presence. Presumably, the vast majority of insertion events proved detrimental to some aspect of the strain's competitiveness and so were not selected in the ensuing population. We believe that the phenomenal abundance of IS elements in L. helveticus makes it a very suitable system in which to study the role of IS elements in the evolution of bacterial genomes, particularly in ecosystems which impose challenging selective pressures.
The level of chromosomal synteny that exists between L. helveticus DPC4571 and L. acidophilus NCFM is surprising, especially since the latter strain contains only 17 IS elements, and this observation highlighted the need for further studies of mobile genetic elements in the L. helveticus species. To address this issue, we employed DNA microarray technology to compare the overall genetic complement and specific genes associated with IS elements in different strains of L. helveticus. Whole-genome array-based comparative genome hybridization (aCGH) has already been successfully applied to the identification of genetic differences within many closely related microorganisms. For example, large genomic deletions were identified among pathogenic Mycobacterium avium subsp. avium and Mycobacterium avium subsp. paratuberculosis (49) and Tropheryma whipplei strains (27). In addition, the absence of five Streptomyces coelicolor genomic islands was reported in Streptomyces lividans (24), and differences in gene content were detected in other species, Salmonella enterica (38) and Xylella fastidiosa (26). In this work we compared the genomes of nine strains of L. helveticus which were isolated from the dairy environment.
The reference strain, whose genome was spotted on the microarray, was L. helveticus DPC4571. The other L. helveticus strains used in this study were DPC5607, DPC5389, DPC5367, DPC5365, DPC5360, DPC5394, DPC5352, DPC5364, and DPC1132 from the Moorepark Food Research Centre (MFRC) Culture Collection and CNRZ32. All lactobacilli were grown under static conditions in modified MRS (mMRS) broth (Difco, Detroit, MI) supplemented with 0.5 g/liter L-cysteine and incubated at 37°C for 12 h. Stock cultures were stored at −80°C in 80% (wt/wt) glycerol.
High-molecular-weight DNA was isolated from 1 ml of a stationary-phase culture as follows. The cells were harvested by centrifugation, washed once in 1 M NaCl and 10 mM Tris-Cl (pH 7.6), and suspended in 300 μl of the same buffer. The cell suspension was mixed with an equal volume of 2% (wt/vol) low-melting-point agarose (Bio-Rad Laboratories, CA) in 0.125 M EDTA (pH 7.6), dispensed into molds, and allowed to solidify for 15 min at 4°C. The agarose cell mixture set within each mold was referred to as a plug. Two plugs per strain were added to 1 ml of 1 M NaCl, 6 mM Tris-Cl, 100 mM EDTA, and 1% (wt/vol) Sarkosyl (Sigma Aldrich, Dublin, Ireland) (pH 7.6) containing 10 mg/ml of lysozyme and incubated overnight at 37°C. The lysozyme buffer was then replaced with 1 ml of 0.5 M EDTA and 1% (wt/vol) Sarkosyl (pH 8.0) containing 0.5 mg/ml of proteinase K (Sigma Aldrich, Dublin, Ireland) and incubated overnight at 37°C. This step was repeated with a fresh proteinase K solution. After two 1-hour washes with 1 mM phenylmethylsulfonyl fluoride (PMSF) in 10 mM Tris-Cl and 1 mM EDTA (pH 8.0) at 37°C, all plugs were stored in 10 mM Tris-Cl and 100 mM EDTA (pH 8.0) at 4°C. Prior to incubation with restriction enzymes, a 1-mm slice of the plug was removed and washed three times for 15 min in 1 ml of 10 mM Tris-Cl and 0.1 mM EDTA (pH 8.0) at room temperature. Each plug slice was then washed once with 100 μl of restriction buffer 4 (New England Biolabs) for 30 min at 4°C and digested with 20 units of SmaI enzyme in fresh buffer overnight at 37°C. Plug slices were loaded directly into the wells of a 1% (wt/vol) pulsed-field-grade agarose gel. DNA fragments were resolved using the CHEF-DR III instrument (Bio-Rad Laboratories, CA) at 6 V/cm with linearly ramped pulse times of 1 to 15 seconds for 18 h in 0.5× concentrated Tris base, borate, and EDTA running buffer. The gel was stained in distilled water containing 0.5 μg/ml of ethidium bromide for 30 min and destained for 60 min with water.
The PFGE restriction fragments were analyzed using BioNumerics v.2.0 software, using the fingerprint types and comparison and cluster analysis modules. The dendrogram was produced using the Dice unweighted-pair group method using average linkage (UPGMA) algorithms with position tolerance settings of 0.5% (optimization) and 1.0% (band position).
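The band-matching and clustering step can be illustrated with a minimal sketch. This is not the BioNumerics implementation: the function names, the way the position tolerance is applied (as a fraction of the larger band position), and the band positions in the usage note are all assumptions made for illustration. It shows only the general technique, a Dice coefficient over matched bands followed by average-linkage (UPGMA) clustering on the distance 1 − Dice.

```python
def dice_similarity(bands_a, bands_b, tolerance=0.01):
    """Dice coefficient between two lanes given as lists of band positions.

    Assumption: two bands match when their positions differ by at most
    `tolerance` times the larger of the two positions. Each band in lane B
    can be matched at most once.
    """
    unmatched_b = list(bands_b)
    shared = 0
    for a in bands_a:
        for b in unmatched_b:
            if abs(a - b) <= tolerance * max(a, b):
                shared += 1
                unmatched_b.remove(b)
                break
    total = len(bands_a) + len(bands_b)
    return 2.0 * shared / total if total else 1.0


def upgma(names, dist):
    """Average-linkage clustering on a symmetric distance matrix.

    Returns the merges in order as (cluster_a, cluster_b, height), where
    each cluster is a tuple of lane names.
    """
    clusters = {i: (name,) for i, name in enumerate(names)}
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (i, j), height = min(d.items(), key=lambda kv: kv[1])
        ni, nj = len(clusters[i]), len(clusters[j])
        merges.append((clusters[i], clusters[j], height))
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the two old distances (UPGMA update rule).
        new_d = {}
        for k in clusters:
            if k in (i, j):
                continue
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            new_d[(k, next_id)] = (ni * dik + nj * djk) / (ni + nj)
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        d.update(new_d)
        merged = clusters[i] + clusters[j]
        del clusters[i], clusters[j]
        clusters[next_id] = merged
        next_id += 1
    return merges


def cluster_lanes(lanes, tolerance=0.01):
    """Build the 1 - Dice distance matrix for a dict of lanes and cluster it."""
    names = list(lanes)
    dist = [[1.0 - dice_similarity(lanes[a], lanes[b], tolerance)
             for b in names] for a in names]
    return upgma(names, dist)
```

For example, with hypothetical band positions `{"A": [100, 200, 300], "B": [100, 200, 400], "C": [500, 600]}`, lanes A and B share two of three bands (Dice = 4/6) and are joined first, after which lane C joins at a distance of 1.0.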
API tests (BioMerieux, Marcy-l'Etoile, France) were performed on pure overnight cultures grown at 37°C in mMRS broth according to the manufacturer's instructions.
cDNA microarrays were obtained from Ocimum Biosolutions Ltd. (Hyderabad, India). The in situ-synthesized 50-mer oligonucleotides representing all 1,618 genes of L. helveticus DPC4571 were spotted on the slides in duplicate. The sequences of transcriptional activator MhpR, membrane protein OmpA, and formate acetyltransferase 2 from the E. coli chromosome were used as negative controls (one in each subarray, 32 in total).
Fifty milliliters of mMRS broth was inoculated with 200 μl (0.4%) overnight culture and incubated at 37°C until an optical density at 600 nm of 1.0 was reached. The cells were harvested by centrifugation at 2,000 × g, washed with 20 ml of TE buffer (10 mM Tris hydrochloride [pH 7.5], 1 mM EDTA), and suspended in 10 ml of lysis buffer (25 mM Tris hydrochloride [pH 8.0], 50 mM EDTA, 50 mM glucose) supplemented with 10 mg of lysozyme and 750 U of mutanolysin. The suspension was incubated at 37°C for 30 min. One milliliter of 10% sodium dodecyl sulfate (SDS) with 50 μl of proteinase K (20 mg/ml) was then added. Incubation was performed until the suspension became clear. The cell lysate was extracted three times with an equal volume of phenol-chloroform-isoamyl alcohol (25:24:1) followed by two extractions with an equal volume of chloroform-isoamyl alcohol (24:1). One-tenth the volume of sodium acetate (3 M) was added, and DNA was precipitated with 3 volumes of 99% ethanol chilled to −20°C. DNA was subsequently harvested by centrifugation, resuspended in 5 ml of distilled water supplemented with 10 μg/ml of RNase A, and incubated at 37°C for 60 min. DNA (1.5 ml) was then sheared by sonication (Soniprep 150, MSE) for 20 min at an amplitude of 2 μm (30-second pulses with 30-second intervals), a treatment determined to yield fragments of 150 to 1,000 bp. After sonication, nucleic acid was extracted once more with phenol-chloroform-isoamyl alcohol (25:24:1) and precipitated with absolute ethanol. The success of the DNA fragmentation strategy was determined electrophoretically in 1% agarose gels.
DNA was labeled by random priming with hexamer oligonucleotides. Three micrograms of random primers and 10 μg of DNA were dissolved in distilled water to a final volume of 41 μl. The mixture was then centrifuged briefly, heated at 95°C for 5 min, and cooled on ice for 3 min. Following this, 1.5 μl of deoxynucleoside triphosphates (dNTPs) (5 mM dA/dG/dT, 2 mM dC), 1 μl of Klenow fragment (exo−, i.e., lacking 3′-to-5′ exonuclease proofreading activity; 5 U/μl), and 1.5 μl (2 mM) of Cy3/Cy5-labeled dCTP (GE Healthcare, Bucks, United Kingdom) were added, and the mixture was incubated at 37°C for 2 hours followed by a 15-min incubation at 75°C to inactivate the enzyme. Labeled DNA was purified using a Qiagen MinElute purification kit (Qiagen, Crawley, United Kingdom) according to the manufacturer's instructions. Purified labeled DNA (1 μl) was quantified using the NanoDrop ND1000 spectrophotometer (Mason Technology, Dublin, Ireland); the average specific activities were 45 and 35 pmol of dye per μg of genomic DNA for Cy3- and Cy5-labeled samples, respectively.
The hybridization solution was prepared by mixing 40 μl of purified labeled DNA with 10 μl of salmon sperm DNA (1 μg/μl) and 120 μl of salt-based hybridization buffer (Ocimum Biosolutions Ltd.) preheated to 42°C. The solution was then heated to 95°C for 3 min and cooled on ice for 1 min. Genomic DNA hybridizations were carried out for 24 h in a metal chamber containing the OciChip array and 170 μl of the hybridization solution, placed in a water bath at 42°C. Following incubation, the microarray slides were washed with prewarmed washing buffer 1 (2× saline-sodium citrate [SSC] [1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate], 0.1% SDS) at 42°C on an orbital shaker for 5 min, washed with buffer 2 (1× SSC) at 30°C, and washed with buffer 3 (0.1× SSC) at 30°C. Washed slides were dried in 50-ml conical-bottom tubes by centrifugation at room temperature for 1 min at 500 rpm. Microarrays were scanned with an Affymetrix 428 array scanner (Affymetrix, Bedford, MA). Laser light at wavelengths of 550 nm and 650 nm was used to excite the Cy3 and Cy5 dyes, respectively. Images were captured with the Affymetrix array reader and analyzed using Imagene and Genesight software (BioDiscovery, El Segundo, CA).
Spots identified as poor or with bad morphology were excluded from further analysis in the process of quantification. This was achieved by discarding spots with a reference signal lower than the background level plus 2 standard deviations of the background level. Signal median intensities were corrected by subtracting the local background level, and the log2 (test/reference) ratios were globally normalized (16, 27). The distribution of M values (log2 ch1/ch2) corresponding to the majority of genes was adjusted to zero by using a normalization factor as follows: M′ = M − log2 N, where M′ is the normalized log2 ratio and N is a normalization factor calculated by summing the measured intensities in both channels (channel 1 [ch1] and channel 2 [ch2]) (40).
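The filtering and normalization steps described above can be sketched in a few lines. The sketch makes two assumptions that the text does not state explicitly: N is taken as the ratio of the summed background-corrected intensities of the two channels (total-intensity normalization), and a background-corrected test signal at or below zero is floored at 1.0 so that the logarithm is defined. The dictionary keys and the function name are hypothetical.

```python
import math


def normalize_log_ratios(spots, n_sd=2.0, floor=1.0):
    """Filter spots, subtract local background, and globally normalize.

    `spots` is a list of dicts with hypothetical keys:
      gene, test, ref          : median signal in each channel
      test_bg, ref_bg          : local background in each channel
      ref_bg_sd                : standard deviation of the reference background
    Returns {gene: M'} where M' = M - log2(N).
    """
    kept = []
    for s in spots:
        # Discard spots whose reference signal is below background + n_sd * SD.
        if s["ref"] < s["ref_bg"] + n_sd * s["ref_bg_sd"]:
            continue
        # Local background subtraction. Assumption: floor the test channel
        # so that absent genes still give a finite, strongly negative ratio.
        test = max(s["test"] - s["test_bg"], floor)
        ref = s["ref"] - s["ref_bg"]
        kept.append((s["gene"], test, ref))

    # Assumption: N is the ratio of total test to total reference intensity.
    n = sum(t for _, t, _ in kept) / sum(r for _, _, r in kept)

    # M' = M - log2(N), with M = log2(test / reference).
    return {gene: math.log2(t / r) - math.log2(n) for gene, t, r in kept}
```

Subtracting log2 N shifts every log ratio by the same constant, so the ordering of genes is unchanged; only the center of the distribution moves.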
The cutoff threshold for absent/present genes was defined on the basis of the test/reference median signal ratio. All genes characterized by values lower than 0.25, which is equivalent to a log2 (test/reference signal) of −2, were considered absent from the test strain. In contrast, genes considered present or divergent were represented by values higher than 0.25. This study was primarily focused on analysis of missing gene clusters, and it was of great importance to apply stricter criteria for confident identification of absent genes. Other studies performed on Salmonella enterica (38) or Listeria monocytogenes (12) employed more tolerant cutoff values, 0.33 and 0.3, respectively. On the basis of the chosen cutoff level, binary scores for all analyzed genes were generated, and a matrix of this conversion was visualized by using Genesis software (48). All microarray data (four measurements per strain) were statistically analyzed using the CyberT tool (30).
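The conversion of signal ratios to binary scores can be illustrated as follows. This is a minimal sketch rather than the workflow actually used with Genesis: the function name and the replicate values in the usage note are hypothetical, and a ratio of exactly 0.25, which the text does not assign to either class, is treated here as present.

```python
import math
from statistics import median

ABSENT_CUTOFF = 0.25  # test/reference ratio; log2(0.25) = -2


def call_presence(ratios_by_gene, cutoff=ABSENT_CUTOFF):
    """Return {gene: 0 or 1} from replicate test/reference ratios.

    0 = absent (median ratio below the cutoff)
    1 = present or divergent (median ratio at or above the cutoff;
        assigning the boundary value to this class is an assumption)
    """
    return {
        gene: 0 if median(ratios) < cutoff else 1
        for gene, ratios in ratios_by_gene.items()
    }


def presence_matrix(strains):
    """Stack per-strain calls into rows of a gene-by-strain binary matrix.

    `strains` maps strain name -> {gene: [replicate ratios]}.
    Returns (genes, strain_names, rows), with one row per gene.
    """
    strain_names = list(strains)
    calls = {name: call_presence(strains[name]) for name in strain_names}
    genes = sorted({g for c in calls.values() for g in c})
    # Genes with no usable measurement in a strain are reported as None.
    rows = [[calls[name].get(g) for name in strain_names] for g in genes]
    return genes, strain_names, rows
```

With four hypothetical replicate ratios of 0.10, 0.20, 0.15, and 0.30 the median is 0.175, so the gene is scored 0 (absent); replicates near 1.0 give a score of 1.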
As a quality control for the data obtained by microarray analysis, PCR amplification of regions where gene loss was indicated was undertaken. In particular, genes lhv_0953 and lhv_0954, identified as missing by microarray analysis from all strains tested except strains DPC5607 and DPC5394 (the two strains identified as most similar to the reference strain [DPC4571]), were targeted. Primers were designed to sequences flanking the two genes and were used in PCRs to confirm the presence or absence of these genes from the test strains.
The microarray data have been submitted to the Gene Expression Omnibus (GEO) at the website http://www.ncbi.nlm.nih.gov/geo/ under accession number GSE16553.
Interstrain genetic variations within a range of L. helveticus strains from the MFRC Culture Collection and strain CNRZ32 were examined at the whole-genome level by customized microarrays. A total of 2,164 oligonucleotides (50-mers) with homology to all identified ORFs, including pseudogenes (as well as hypothetical genes) were chosen to fully mirror the reference genome sequence of L. helveticus DPC4571. On the basis of the quality of the spots, detected signals, and statistical significance, overall analysis was performed on 90% of ORFs (P value of <0.05). A general comparison of genetic content did not reveal the absence of significantly large clusters of genes from any of the nine strains tested (Fig. 1). The largest fragment of DNA (~30 kb with average GC content of 40.1%) which appeared to be missing in some of the test strains was found to be located within genomic islands (based on percent GC content) in two of the test strains (DPC5352 and DPC5367). Genes located in three large regions of the L. helveticus DPC4571 genome (a 150-kb-long region situated from lhv_0266 [serine hydroxymethyltransferase] to lhv_0435 [putative RNA helicase], a 280-kb region from lhv_0677 [RNA methyltransferase protein] to lhv_0952 [transposase IS1201], and a 270-kb region spanning between lhv_1168 [endopeptidase O2] and lhv_1456 [hypothetical protein]) which constitute ~33% of the genome were highly conserved within all the genomes tested. Four strains, DPC5364, DPC5365, DPC5352, and DPC5389, appear to have maintained all the genes present in strain DPC4571 within these regions; however, as with all aCGH experiments, it is impossible to determine whether any new strain-specific ORFs are present in the test strains.
The aligned genetic content profiles revealed different levels of similarity between the nine tested strains and the reference strain, L. helveticus DPC4571 (Fig. 2A). Of all the annotated ORFs in the L. helveticus DPC4571 genome, only 14 appear to be absent in all of the seven test strains that are most divergent from DPC4571 (Table 1). The list of missing genes as demonstrated in Fig. 3 indicates that two strains, DPC5394 and DPC5607, are nearly identical to strain DPC4571, which is supported by the PFGE data (Fig. 4). aCGH data demonstrated that strain DPC5394 differed from DPC4571 by the absence of seven ORFs, while DPC5607 was missing eight ORFs all with homology to lysin and ATPase genes, which constitute 0.32% and 0.37% of the total number of ORFs, respectively. In contrast, strain DPC5367, which appeared to be the most divergent strain, lacked 85 ORFs (3.93%) annotated in DPC4571. Significant numbers of missing ORFs were also observed in two other strains, 75 in DPC5360 and 74 in CNRZ32. The majority of these represent genes involved in sugar metabolism (phosphotransferase system [PTS], ribose-5-phosphate, maltose phosphorylase or maltose ABC transporters), restriction/modification (R/M) system (type I and type II R-M subunits), and purine biosynthesis (phosphoserine aminotransferase and dehydrogenase, ribonucleoside diphosphate reductase, ADP-ribosylglycohydrolase, or purine-cytosine permease). The existence of L. helveticus variants deficient in purine biosynthesis was already demonstrated by Hebert et al. (20). Microarray analysis also revealed the absence of two surface layer proteins (lhv_0184 and lhv_1473) in all strains tested with the exception of the highly similar DPC5394 and DPC5607 strains plus DPC5364 and two other genes (lhv_0082 and lhv_0087) in DPC5367, DPC5360, and CNRZ32 strains. This group of genes has already been investigated in L. helveticus, and significant nucleotide sequence heterogeneity was reported (19). It is noteworthy that two of the test strains, DPC5360 and CNRZ32, do not possess the helveticin and the bacteriocin ABC transporter genes found in DPC4571.
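The percentages quoted above are consistent with a denominator of 2,164, the number of oligonucleotides (one per ORF) reported for the array. The short check below reproduces them; that this was the denominator actually used is an inference from the arithmetic, not a statement in the text, and the function name is hypothetical.

```python
TOTAL_ORFS = 2164  # oligonucleotides on the array, as reported in the text

# Missing ORF counts reported for three of the test strains.
MISSING = {"DPC5394": 7, "DPC5607": 8, "DPC5367": 85}


def percent_missing(n_missing, total=TOTAL_ORFS):
    """Missing ORFs as a percentage of all ORFs, rounded to two decimals."""
    return round(100.0 * n_missing / total, 2)


for strain, count in MISSING.items():
    print(strain, count, percent_missing(count))
```

The three values come out at 0.32, 0.37, and 3.93, matching the figures in the text.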
Detailed analysis of the microarray data resulted in the identification of 16 DPC4571 gene sets flanked by IS elements, which were missing from some of the test strains. The exact number of these clusters varies between strains (Fig. 5), and it involves 10 different types of IS elements, ISLhe1, ISLhe2, ISLhe12, ISLhe63, ISLhe65, ISLhe66, ISLjo1-like, IS1201, ISL2, and ISL5. Generally, they all belong to the most abundant IS types identified in L. helveticus DPC4571 (4). The only high-copy-number IS element not encountered in this analysis was ISLhe15, which had 23 copies in the reference chromosome. In contrast, ISLhe66 and ISL5 were the only low-copy-number elements (two copies in strain DPC4571) associated with these gene sets. Each strain tested was missing at least one IS-flanked cluster from its genome. The two strains most similar to strain DPC4571, DPC5394 and DPC5607, appear to be missing two short DNA fragments, and interestingly, one of these fragments was detected between two IS1201 elements (cluster X, Fig. 5) and was specific for these two strains only. Three of the missing ORFs in this cluster were annotated in strain DPC4571 as lysin pseudogenes due to the presence of a frameshift mutation in each ORF. The largest numbers of missing clusters, 11 and 10, were identified in strains CNRZ32 and DPC5367, respectively. Cluster XV, containing a cobalt transport ATP-binding transporter and ABC transporters, is believed to be unique for both strains mentioned above. Among 16 gene sets flanked by IS elements, 4 were absent from just one of the test strains, 2 in DPC5360 and 2 in DPC5367. The first, a 5.9-kb-long stretch of ORFs (cluster I) identified in strain DPC5360 contains genes with homology to phosphoglycerate mutase, phosphoserine aminotransferase, phosphoglycerate dehydrogenase, and a small multidrug efflux protein, placed between two copies of ISLhe65 and ISLhe2. 
The second, a 2.8-kb-long segment (cluster V) found in strain DPC5360 encloses a frameshift mutated phospho-beta-galactosidase I homologous gene from DPC4571. Another strain-specific fragment (3.8 kb long, cluster XII) identified in strain DPC5367 possesses two ORFs annotated in the reference strain as fully functional ribonucleoside-diphosphate reductase genes. Another fragment (11 kb, cluster VIII) missing from strain DPC5367 was localized between IS1201 and ISLhe65 elements and contains ORFs with homology to an ABC transporter ATPase component and trehalose-6-phosphate hydrolase genes (annotated as a pseudogene in DPC4571).
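One way to identify IS-flanked clusters of missing ORFs from binary absence calls is sketched below. The text does not describe how the clusters were actually located, so this is an illustrative approach only: it walks the reference gene order, collects maximal runs of absent genes, and keeps those runs whose immediate neighbors on both sides are IS elements. The function name and all gene identifiers in the usage note are hypothetical, and the chromosome is treated as linear for simplicity.

```python
def is_flanked_absent_clusters(genome_order, is_elements, absent):
    """Find runs of absent genes bounded on both sides by IS elements.

    genome_order : list of gene identifiers in reference chromosome order
    is_elements  : set of identifiers annotated as IS elements/transposases
    absent       : set of identifiers scored absent in the test strain
    Returns a list of (left_IS, [absent genes], right_IS) tuples.
    """
    clusters = []
    n = len(genome_order)
    i = 0
    while i < n:
        if genome_order[i] not in absent:
            i += 1
            continue
        # Extend the run of consecutive absent genes starting at i.
        j = i
        while j < n and genome_order[j] in absent:
            j += 1
        # Neighbors immediately outside the run (None at the array ends,
        # because the chromosome is treated as linear here).
        left = genome_order[i - 1] if i > 0 else None
        right = genome_order[j] if j < n else None
        if left in is_elements and right in is_elements:
            clusters.append((left, genome_order[i:j], right))
        i = j
    return clusters
```

For a hypothetical gene order `["g1", "IS_a", "g2", "g3", "IS_b", "g4", "g5", "IS_c"]` with `g2`, `g3`, and `g5` absent, only the run `g2`, `g3` is reported, because `g5` has a non-IS neighbor (`g4`) on its left.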
Some genes were found to be missing from the majority of the strains examined (with the exception of strains DPC5394 and DPC5607). Genes encoding a surface layer protein (lhv_0184), uridine kinase (lhv_0596), and amino acid permease (lhv_0954) have not been detected in any of the test strains in this analysis. Genes in cluster XVI were absent in five strains, strains DPC5365, DPC5352, DPC5367, CNRZ32, and DPC5389. This 7.3-kb-long fragment contains putative restriction/modification and carbon dioxide catalysis genes and is flanked by two identical copies of the ISL5 element. It is noteworthy that these two elements are the sole representatives of the IS4 family of IS elements identified in the L. helveticus DPC4571 genome and appear to be involved in the genomic rearrangements observed. To date, IS4 family members have been reported to be linked with gene mobility only in Bacillus thuringiensis (32).
It is known that the dairy niche is not the primary ecosystem of L. helveticus which has evolved from the gastrointestinal environment (4). Apart from acquisition of a large number of IS elements, direct comparison of L. helveticus DPC4571 and L. acidophilus NCFM genomes suggested a crucial role for selective gene loss in adaptation of DPC4571 to the dairy environment. The presence of cell wall and mucus binding proteins, as well as glucosidases and PTS system genes was significantly reduced in strain DPC4571 compared to L. acidophilus NCFM (4). Gene acquisition was observed only for R/M genes. Hence, we focused on this set of genes and investigated their distribution within the test strains.
As stated above, the number of R/M systems increased in L. helveticus DPC4571 since the strain diverged from its GI ancestor. Interestingly, the list of missing ORFs with recognized function (Fig. 3) reveals 17 loci associated with different R/M systems, which constitutes 14% of all nonhypothetical ORFs absent from some or all of the nine L. helveticus strains used in this study. Three types of R/M systems have been identified in bacteria (34), and the DPC4571 genome contains all three types. The aCGH analysis revealed that strains DPC5365, DPC5367, DPC5389, and CNRZ32 each lack 5 to 10 ORFs with homology to R/M type I, while strains DPC5365, DPC5352, DPC5360, CNRZ32, and DPC5389 each lack one gene from R/M type II system. Finally, all strains except strains DPC5394, DPC5607, and DPC5365 (only one out of two functional ORFs detected missing) appear not to have two genes (lhv_0027 and lhv_0028) associated with a type III R/M system.
Genome sequence analysis of L. helveticus DPC4571 has revealed a number of gene sets, including phosphotransferase (PTS) systems, cell wall proteins, and glucosidases which have decreased in abundance in L. helveticus relative to L. acidophilus. The genome of L. helveticus DPC4571 possesses nine putative functional PTS genes and six nonfunctional PTS genes. On the basis of microarray data, three of these genes were absent from other strains. The glucose PTS gene (lhv_0640) was not detected in six strains, strains DPC5364, DPC5352, DPC5367, DPC5360, CNRZ32, and DPC5389, whereas the sucrose PTS pseudogene (lhv_0476) was absent from strains DPC5352, DPC5367, CNRZ32, and DPC5389, and the arbutin-like PTS pseudogene (lhv_0634 to lhv_0636) was detected only in strains DPC5352 and DPC5360. No major variations were observed in the number of cell wall proteins and glucosidases shared by the 10 strains in this study. A gene encoding a sugar kinase with an LPxTG cell wall anchoring motif (lhv_1986) appears to be the only such gene missing from all strains, with the exception of strains DPC5394, DPC5607, and DPC5364, whereas neopullulanase (lhv_2001) and 1,6-glucosidase (lhv_2002) were not detected in strains DPC5352, DPC5367, and DPC5360. Examination of the other nine L. helveticus genomes did not reveal the loss of any of these genes among all strains tested.
An intriguing observation was made in relation to two potentially important loci, ORF lhv_1134 and ORFs lhv_1618 to lhv_1623. The first one encodes a product with homology to the MutS2 protein that is known as a homologous recombination suppressor in Helicobacter (37). Interestingly, this ORF was found to be missing from all strains showing visible chromosomal rearrangements as demonstrated by PFGE analysis and present in strains DPC5394 and DPC5607 whose genomes appear to be highly conserved relative to strain DPC4571. The second locus is associated with clustered regularly interspaced short palindromic repeats (CRISPRs), which are composed of short repeats of 25 to 50 bp separated by unique sequence spacers of similar length. The CRISPR loci are usually linked with the so-called CRISPR-associated proteins (CAS). L. helveticus DPC4571 ORFs lhv_1618 to lhv_1623 were annotated as CAS proteins (4), so while the CRISPR sequences themselves are not included on the array, the CAS protein genes suggest the presence of CRISPR-like elements in other L. helveticus strains, such as strains DPC5607, DPC5394, and DPC5360. Of the sequenced lactobacilli, 10 out of 16 contain at least one CRISPR, suggesting they are more common in this genus (62% of sequenced genomes) as opposed to bacteria in general where they are detected in ~40% of sequenced genomes (23).
Identification of gene divergence or absence has already been thoroughly investigated in many closely related organisms (21, 24, 29, 41, 43). Often, loss or alteration of genetic content is affected by the activity of IS elements (7, 28). In this work we suggest that IS elements do play an important role in genomic rearrangements among L. helveticus strains. Interestingly, even though IS elements are very abundant within the genome of L. helveticus, comparison of the 10 genomes (including the reference one) did not reveal any deletion of large gene sets that could be expected in IS-rich chromosomes. However, variations in genetic content were detected in a number of single genes or short clusters of ORFs. Interestingly, the majority of these clusters were localized between IS elements. A similar observation was made for Bordetella pertussis and Bordetella parapertussis (36). Considering the unique expansion of mobile genetic elements in both species, it is plausible to conclude that the acquisition of IS elements might have been advantageous in terms of chromosomal rearrangement. Recombinational reshuffling of fragments of the chromosome and acquisition/loss of DNA by horizontal gene transfer are two major natural sources of genetic variability in prokaryotes. Zhou et al. (52) investigated the recently active insertion sequences (raIS) in cyanobacteria and archaea and found the following: (i) the activities of transposable elements depend on the environment inhabited by the host; (ii) the number of raIS tends to increase with genome size; (iii) regions flanking raIS are enriched with genes encoding DNA-binding factors, transporters, and enzymes; and (iv) IS mobility presents no tendency to disrupt operons. The results of our analysis of IS elements in the L. helveticus genome appear to support these observations. First, the genomic similarity to L. 
acidophilus and the significant difference in the number of IS elements might suggest a possible environmental influence on their expansion. Second, a correlation between the types of IS elements believed to have recently participated in horizontal gene transfer (possible raIS of strain DPC4571) and their copy number was observed. Eight out of ten elements associated with interstrain genomic rearrangements belong to the most abundant representatives, which might indicate different levels of their activity. It seems that within L. helveticus, expansion capabilities are stronger for some IS elements (IS1201, ISL2, ISLhe1, ISLhe2, or ISLhe65) and weaker for others (e.g., ISL7, ISLhe4, or ISLhe30). However, it should be noted that ISLhe15 with 23 copies present in strain DPC4571 was not detected in the group of active elements. In contrast, the expansion of IS elements in other prokaryotic organisms was limited to one particular type, IS481 in B. pertussis (36) or IS1541 in Yersinia pestis (9). Third, many IS elements participating in gene gain/loss events are in proximity to ORFs representing enzymes or transporter genes, and finally, in L. helveticus, with the exception of the lactose operon (5), there is no clear evidence of IS elements interfering with the structure or organization of individual operons.
Another observation worthy of mention is the fact that among so-called IS-flanked clusters, the majority of ORFs missing represented different R/M system genes. This finding supports the theory of Naderer et al. who postulated that an association exists between R/M systems and mobility genes (35). Furthermore, microarray analysis has led to the observation that the CAS proteins located adjacent to the CRISPR in L. helveticus DPC4571 are missing from six of the L. helveticus genomes analyzed. CRISPR sequences are thought to be an acquired resistance mechanism against viruses and possibly other foreign DNA, as it has been shown that after bacteriophage challenge, CRISPRs contain new spacers that were derived from the bacteriophage genomic DNA. It is thought that the specificity of the system is determined by the spacer sequence and that the CAS proteins provide the actual resistance mechanism (2). The CRISPR therefore represents a history of the exposure of the strain to foreign DNA and may be used in evolution and diversity studies. Phage attacks are common among industrial dairy microorganisms and can constitute serious threats to the proper milk fermentation process (50). L. helveticus-specific bacteriophages were found in 16 out of 28 samples of natural whey starters examined by Zago et al. (51). Because of the importance of R/M systems and CRISPR loci in resisting bacteriophage attack, it is logical to assume that there would be tremendous selective pressure on L. helveticus to acquire more of these phage resistance mechanisms due to their increased exposure to phage when in commercial use as a starter.
Lack of major gene loss from the genomes of strains DPC5394 and DPC5607 (relative to strain DPC4571) might be explained by the presence of MutS2 within their genomes. This recombination inhibitor (17, 37) was not detected in the other strains, which in contrast, contain many more examples where specific genes are absent from their respective genomes. Strains DPC5394 and DPC5607 revealed only one cluster of ORFs flanked by two identical copies of IS1201 which leads to the conclusion that specific activity of IS elements might be subjugated by the antirecombination enzyme. However, confirmation of this hypothesis requires further experimentation.
L. helveticus DPC4571 contains 24 identified peptidase genes. Endopeptidases encoded by the pepE, pepE2, pepF, pepO, pepO2, and pepO3 genes and the proline-specific dipeptidase encoded by the pepR gene have already been identified and characterized in L. helveticus CNRZ32 (8, 45, 47). Release of these intracellular enzymes from the cell is considered to be highly important during cheese ripening, as they play a key role in biochemical processes leading to textural changes and flavor development (13). All of these genes along with other peptidase genes identified in the L. helveticus DPC4571 genome were present in all the strains analyzed by aCGH. Some of these genes (e.g., pepD3t, pepE2, and pepV) are situated next to IS elements, which implies how rigorously controlled the functional organization of IS elements in L. helveticus appears to be.
We thank Paul O'Toole and Emma Raftis from the Microbiology Department, University College Cork, Ireland, for helpful advice on microarray data processing and analysis.
This research was supported by the Department of Agriculture under the Food Institutional Research Measure (04/R&D/TD/311).
Published ahead of print on 30 October 2009.