Enterococci are among the first bacteria to colonize the neonatal GI tract [1]. Though originally considered harmless commensals, the enterococci now rank among the leading causes of nosocomial infections [36]. The present study was undertaken to further explore the differences in the genetic make-up of E. faecalis. A total of 31 community-derived fecal baby isolates were sequence typed by MLST and characterized with respect to antibiotic resistance and properties associated with virulence. A subset of the isolates was genomotyped using genome-wide DNA microarrays.
By MLST analysis, the 31 baby isolates were resolved into 12 STs and grouped into 11 genetic lineages, including 6 major clonal complexes (CCs) and 5 singletons (http://efaecalis.mlst.net/). Analyses with the MLST scheme employed in the present study have previously defined distinct clonal complexes of E. faecalis associated with the hospital environment, so-called high-risk enterococcal clonal complexes (HiRECC; CC2, CC9, CC40 and CC87) [20]. Of the isolates included in this study, only 158B and 226B (ST6) grouped into one of these complexes (CC2). These isolates were obtained towards the end of the sampling period, and may therefore have been introduced through habituation to solid food or from the environment through fecal-oral contamination. To our knowledge, none of the infants were admitted to the hospital during the period of sampling; however, hospital contact cannot be excluded as a source of the ST6 isolates.
According to our results, several of the putative enterococcal virulence factors were widespread among the commensal baby isolates. These findings are in line with previous reports [11], and may reflect the adaptive functions that these factors can hold in non-virulent contexts, as indicated by Semedo et al. [11]. Several of the virulence traits and antibiotic resistance phenotypes were common to all the isolates within the clonal complexes; however, the strain set is too small to deduce statistically significant associations between these features and the clonal identity of isolates.
Overall, our results highlight the importance of phenotypic assays to confirm genomic data as revealed by PCR and CGH. PCR confirmed the presence of cylL in the eight isolates that displayed hemolytic activity in our study; however, cylL was also found to be present in an additional eight Cyl- isolates. A similar discrepancy between fermentation capabilities and the presence of iolE and iolR was also observed. Moreover, three isolates carrying both gelE and fsrB came out as GelE- in the plate assay. The limited genotype-phenotype correlation reported here underscores the need for phenotypic confirmation of genotypes.
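The genotype-phenotype comparison described above amounts to a cross-tabulation of gene calls against assay results. The following Python sketch is illustrative only and is not the analysis pipeline used in this study: the function name `concordance_table` and the isolate labels are hypothetical, and the toy input merely mirrors the cylL counts quoted in the text (eight hemolytic and eight non-hemolytic isolates carrying cylL).

```python
from collections import Counter


def concordance_table(genotype, phenotype):
    """Cross-tabulate gene presence against an assay result.

    Both arguments map isolate name -> bool. Only isolates that
    appear in both mappings are counted.
    """
    counts = Counter()
    for isolate in genotype.keys() & phenotype.keys():
        counts[(genotype[isolate], phenotype[isolate])] += 1
    return {
        "gene+/pheno+": counts[(True, True)],
        "gene+/pheno-": counts[(True, False)],
        "gene-/pheno+": counts[(False, True)],
        "gene-/pheno-": counts[(False, False)],
    }


# Hypothetical labels for the 16 cylL-positive isolates mentioned in the text.
isolates = [f"iso{i:02d}" for i in range(1, 17)]
cyl_gene = {name: True for name in isolates}           # cylL detected by PCR
hemolysis = {name: i < 8 for i, name in enumerate(isolates)}  # first 8 hemolytic

table = concordance_table(cyl_gene, hemolysis)
print(table)
```

The "gene+/pheno-" cell is the discordant class of interest here: isolates in which a gene is detected but the corresponding activity is not observed in the plate assay.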
which is known to be associated with the E. faecalis pathogenicity island (PAI) [25], was detected in two thirds of the commensal isolates by PCR. According to the CGH data, none of the baby isolates contained complete PAIs. These findings were as expected, considering that the PAI has been shown to be enriched among infection-derived enterococcal isolates [25]. More surprisingly, but consistent with a previous report [39], all the isolates studied contained some PAI genes. Several of the baby isolates showed similar patterns of present and divergent PAI genes (Figure ). This suggests that the evolution of the enterococcal PAI may be driven by insertion and deletion of larger modules, as hypothesized in [39]. Shankar et al. also suggested that parts of the enterococcal PAI originate from pheromone-responsive plasmids, with subsequent indels of transposable elements driving the evolution of the PAI [39]. Indeed, conjugal transfer of a segment of the E. faecalis PAI has been demonstrated [40]. The CGH revealed a high degree of plasticity within all the MGEs represented on the microarray. These "mosaic structures" may reflect a complex evolutionary history of elements that have been frequently rearranged by horizontal gene transfer (HGT) and homologous recombination.
According to the CGH data presented here, a preliminary E. faecalis core genome consisting of 2092 of the 3093 chromosomal V583 ORFs can be delineated. Combined analysis of the data from Aakra et al. and McBride et al. [17] with the data from the present study produced a core genome estimate of 1722 genes. An additional 62 genes were represented on only one or two of the three different arrays used, but were defined as core genes in those experiments. Although the size of the core genome may fluctuate due to the stringency of the statistical methods used in the different studies, our data do add substantial information on the E. faecalis
In general, the genomic variation between isolates that are evolutionarily linked, e.g. isolates with the same ST, was expected to be lower than the variation between isolates that belong to different evolutionary lineages. Bayesian-based phylogenetic analysis confirmed these expectations (Figure ). McBride et al. previously reported genomotyping by CGH to be biased by the activity of MGEs in E. faecalis [17], and we therefore repeated the Bayesian analysis using the CV genes only. The phylogenetic analysis based on CV genes recovered the same patterns of relatedness as the analysis comprising all genes, with slight internal rearrangements of branches (Figure ). These rearrangements support the hypothesis that the distribution of mobile elements is a source of genomic diversity in E. faecalis. Moreover, our data suggest that within lineages, most of the variation detected by CGH is due to MGEs. However, the conserved clade identified by the analyses based on both the CV genes and the complete gene set indicates that other, more complex discriminatory factors also contribute to the genomic diversity in E. faecalis. Since an overall correlation between CGH and MLST was revealed, it is reasonable to believe that genes contributing to the formation of clades, i.e. lineage-specific genes, may be identified. In the 7 baby isolates that formed a clade in the phylogenetic analysis, we identified 137 genes that were divergent in the clade but present in the remaining three isolates (including the reference strain). The majority of these genes were MGE genes located in phage03 (n = 39), phage06 (n = 28) and a phage-related region identified by McBride et al. [17] (EF2240–82/EF2335–51; n = 44). Lepage et al. have previously reported phage03 to be absent from several food isolates [16]. Since ST6 is part of CC2, which has been found to be significantly enriched among nosocomial isolates, phage03 may potentially represent an element associated with increased fitness in the hospital environment. The latter report also identified eight genes as potential markers for the V583/MMH594-lineage [16]. V583 and MMH594 both display ST6 [17], and five of the eight genes (EF2250, EF2253, EF2254, EF3155 and EF3252) were also present in the ST6 isolates (158B and LMGT3303; results not shown) analyzed by CGH in our study.
Comparative genome analyses have revealed that pathogen evolution often progresses through gene acquisition via HGT [32]. The 169 genes that were characterized as divergent in all the community-derived baby isolates by CGH may be potential pathogen-specific genes in E. faecalis, or genes that are specific to V583. However, additional CGH data from both pathogenic and non-pathogenic isolates are needed to address this issue. Vancomycin-resistant E. faecalis (VREfs) isolates appear to be widespread in hospital environments, while isolation of VREfs from healthy volunteers rarely occurs [41]. Accordingly, the vanB operon was divergent in all the isolates studied by CGH in our lab (altogether 21 strains; [18] and results not shown). In addition to gene acquisition, pathoadaptive mutations via gene loss also play an important role in the evolution of bacteria [45]. A disadvantage of the comparative genomic analyses presented here is that the comparison of gene content is based on a single reference strain (V583) only. The E. faecalis OG1RF genome showed that, in addition to a shared core of 2474 ORFs [29], both the V583 and the OG1RF genome carry unique genes, suggesting that the E. faecalis pan-genome will be further extended as more strains are sequenced. The availability of additional E. faecalis genome sequences and the construction of a pan-species array would further increase the sensitivity of such approaches.