Shiga Toxin-producing Escherichia coli (STEC) are a group of foodborne pathogens associated with diarrhea, dysentery, hemorrhagic colitis (HC) and hemolytic uremic syndrome (HUS). Shiga toxins are the major virulence factor of these pathogens; however, adhesion to and colonization of the human intestine are required for STEC pathogenesis. A subset of STEC strains carry the Locus of Enterocyte Effacement (LEE) pathogenicity island (PAI), which encodes genes that mediate the colonization of the human intestine. While LEE-positive STEC strains have traditionally been associated with human disease, the burden of disease caused by STEC strains that lack LEE (LEE-negative) has increased recently in several countries; however, in the absence of LEE, the molecular pathogenic mechanisms of these STEC strains are unknown. Here we report an 86-kb mosaic PAI composed of four modules that encode 80 genes, including novel and known virulence factors associated with adherence and autoaggregation. Therefore, we named this PAI the Locus of Adhesion and Autoaggregation (LAA). Phylogenomic analysis using whole-genome sequences of STEC strains available in the NCBI database indicates that the LAA PAI is exclusively present in a subset of emerging LEE-negative STEC strains, including strains isolated from HC and HUS cases. We suggest that the acquisition of this PAI is a recent evolutionary event, which may contribute to the emergence of these STEC.
Shiga Toxin-producing Escherichia coli (STEC) are a group of foodborne pathogens associated with gastrointestinal diseases, including acute diarrhea and dysentery. Annually, STEC causes over two million cases of acute illness worldwide1. Importantly, STEC infection may progress to severe diseases such as hemorrhagic colitis (HC) and hemolytic uremic syndrome (HUS)2. In humans, STEC pathogenesis involves the initial adhesion of the bacteria to the intestinal epithelium in the ileum, later colonization of the colon and production of Shiga toxin (Stx), thereby impairing epithelial barrier function and ion transport, causing diarrhea3, 4. Stx may reach the bloodstream and disseminate to extra-intestinal tissues, producing more severe diseases such as HUS that may result in death. Thus, although Stx is considered sine qua non of virulence, adhesion to the intestinal mucosa is a required first step for STEC pathogenesis5. In particular, a subset of STEC strains use an adhesion mechanism called attaching-and-effacing (A/E) lesion, which is characterized by the alteration of the architecture and physiology of the colon epithelial cells. This pathogenic process is mediated by genes encoded in the Locus of Enterocyte Effacement (LEE) pathogenicity island (PAI)6. To date, STEC strains that are LEE-positive, such as O157:H7 and several serotypes belonging to the “big six” non-O157 STEC O serogroups (O26, O45, O103, O111, O121 and O145), have been the most frequently associated with outbreaks and/or severe illness7. Consequently, in the context of STEC infection, the presence of LEE is considered a risk factor for the development of HUS8.
However, STEC strains that do not carry LEE (LEE-negative) have also been isolated from cases of severe illness5. Indeed, there has been an increase in the number of reports of clinical LEE-negative STEC strains that belong to serogroups O91, O113 and O1749–11. For instance, the detection of O91 strains in Germany increased from ~5% of all STEC strains isolated from humans in 1999 to ~15% in 2012 and 201312. Similarly, from 2007 to 2012, the serogroups O91 and O113 were among the most common non-O157 serogroups associated with human disease in the Netherlands13. Recently, the serogroup O174 was identified as one of the four non-O157 serogroups most commonly associated with HUS in Argentina14. Nevertheless, in the absence of LEE, the molecular mechanisms by which these strains adhere to the host intestinal epithelium remain largely unknown15.
In light of these observations, our group is currently investigating the emergence of LEE-negative STEC strains of clinical relevance. In a previous report, we identified a member of the Heat-resistant agglutinin family (Hra family) produced by the LEE-negative O113:H21 STEC strain E045-00, which is seroreactive to sera from patients with HUS16. In this study, we characterized this antigen, named Hemagglutinin from Shiga toxin-producing E. coli (Hes). Much like other members of the Hra family, Hes is a virulence factor that participates in several colonization-associated phenotypes, including hemagglutination, adhesion and autoaggregation. More importantly, we show that the hes gene is localized in an 86-kb mosaic PAI composed of 80 genes organized into four modules, one of which (module III) was previously described by Shen et al.17. Here we report the complete sequence of this PAI, in which other virulence factors participating in adhesion and autoaggregation are also encoded, such as Iha18 and Ag4319, respectively. Consequently, this PAI was named the Locus of Adhesion and Autoaggregation (LAA). We also determined the distribution of this PAI among STEC strains isolated from different sources, showing its presence in a subset of LEE-negative STEC strains, some of which were isolated from cases of HC or HUS. Remarkably, our analyses suggest that the acquisition of LAA is probably a recent evolutionary event, which may contribute to the emergence of these pathogens. Thus, this study is a step toward understanding the evolution, emergence and pathogenicity of this subset of STEC strains. Additionally, the identification of this PAI will be useful in epidemiological studies that assess the public health risk of STEC.
The Hra family is composed of the integral outer membrane proteins Hra1, Hra2, Tia and Hek, which share considerable amino acid sequence similarity but participate in different colonization-associated phenotypes20. Hra1 participates in hemagglutination, autoaggregation, biofilm formation and aggregative adherence21; Hra2 participates in adhesion20; Tia in adhesion and invasion22; and Hek in hemagglutination, autoaggregation, adhesion and invasion23. In a previous study, we identified a member of the Hra family in the outer membrane protein extract of the LEE-negative O113:H21 STEC strain E045‐0016. The analysis of this protein by mass spectrometry (MALDI TOF/TOF) suggested that it was the Hek protein. During subsequent analysis aimed at determining whether Hek was present in other STEC strains, the amino acid sequence of this protein was used to perform BLASTp searches against the NCBI nr database. Nevertheless, the Hek protein was not detected in any STEC strains. In contrast, we found an allelic variant (GenBank accession: EGW68377) of the Hra family in the LEE-negative O91:H21 STEC strain B2F1, which has 90%, 65%, 65% and 86% amino acid identity with the Hra1, Hra2, Tia and Hek proteins, respectively, and that primarily exhibits amino acid substitutions in predicted loops exposed to the extracellular space (Supplementary Fig. 1). This variant has not been described previously. Therefore, following the nomenclature used for members of the Hra family, we named it Hemagglutinin from Shiga toxin-producing E. coli (Hes).
To determine the distribution of hes in our culture collection (167 STEC strains, including 48 LEE-negative STEC, and 12 fecal E. coli isolated from healthy subjects), PCR analysis was performed using primers hes_det1+hes_det2 (Supplementary Table 1), which are specific for this gene and do not amplify other allelic variants of the Hra family (Supplementary Note 1). Strains positive for hes were then analyzed by PCR using the primers hes_for+hes_rev, which amplify the complete nucleotide sequence of hes and other members of the Hra family, and the PCR products obtained were sequenced. Notably, 60% (29/48) of the LEE-negative STEC strains were positive for hes, including isolates of serotypes O91:H21, O113:H21 and O174:H21 (Supplementary Table 2). Nevertheless, hes was not detected in two strains of the serotype O113:H21. Importantly, the E045-00 strain was positive for hes, indicating that the peptide originally mis-identified as Hek is indeed Hes. In addition, sequencing of PCR products indicated that hes sequence is 99.7% conserved (data not shown). In contrast, this gene was not detected in any of the LEE-positive STEC strains (0/119) or fecal isolates (0/12). These results indicate that Hes is a novel variant of the Hra family that is widely distributed in LEE-negative STEC strains.
Given the biological role of members of the Hra family, we evaluated whether the hes gene product is capable of conferring colonization-associated phenotypes to the non-adherent E. coli HB101 strain. When expressed under the control of an inducible promoter, hes confers the capacity of agglutination of sheep erythrocytes (Fig. 1A) and autoaggregation (Fig. 1B,C) to the HB101 strain. It also promotes biofilm formation at 72h (Fig. 1D,E). In vitro adhesion and invasion assays indicate that hes significantly increases the adherence of this strain to epithelial cells (Fig. 1F), in an aggregative pattern (Fig. 1G), but does not promote invasion of epithelial cells (not shown). These results indicate that Hes is functional and confers E. coli with phenotypic characteristics typical of other members of the Hra family.
Some members of the Hra family, such as Hek and Tia, are encoded in PAIs24, 25. In fact, one of these PAIs, the Subtilase-Encoding Pathogenicity Island, has been identified in LEE-negative STEC strains isolated from humans and animals25. Given these reports and the mutually exclusive distribution of hes with LEE in STEC strains, we sought to determine whether this gene is harbored in a mobile genetic element. In silico analysis of the boundaries of the hes gene in the draft genome of STEC strain B2F1 (see methods for details) showed that an integrase-encoding gene and the pheV tRNA gene are located 3,342 and 4,805bp upstream of the hes stop codon, respectively (Fig. 2A). Also, 23-bp imperfect direct repeats (DR1) that correspond to the 3′ end of the pheV gene were identified 80,626bp downstream of the hes start codon. These DR1 sequences flank an 86,256-bp DNA region with a G+C content of 48%, which is less than the 51% G+C exhibited by the B2F1 chromosome. Additionally, other direct repeats (DR2 to DR7) and several insertion sequence (IS) elements were identified within this chromosomal region. The yqgA and yghD genes are located next to the DR1 sequences, with the first located upstream and the second downstream. Genome analysis of the laboratory strain E. coli K-12 MG1655 indicated that both genes are located in the vicinity of the pheV gene (not shown). Thus, hes is encoded in an 86,256-bp DNA region that is inserted in the pheV gene in the B2F1 strain. In this DNA region, 80 ORFs (open reading frames) were identified, including genes with known functions, some of which are associated with pathogenicity in STEC, as well as several genes and pseudogenes that code for hypothetical proteins (vide infra) (Fig. 2A and Supplementary Table 3). Among the genes coding for virulence factors identified, in addition to hes (ORF5), there are other genes that code for adhesins such as Iha (ORF20)18 and Ag43 (ORF67), which also promotes autoaggregation19.
Given that this cluster of genes is related to adhesion and autoaggregation, we have named this DNA region as Locus of Adhesion and Autoaggregation (LAA). Collectively, these results show that LAA exhibits several features typical of a PAI.
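The G+C comparison used above (48% for the inserted region versus 51% for the B2F1 chromosome) rests on a simple base count. The following is a minimal Python sketch of that calculation, not the tool used in this study; the function names and the sliding-window helper are illustrative assumptions, and ambiguous bases such as N are simply excluded from the denominator.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; this handling of
    ambiguity codes is an assumption of this sketch.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / acgt


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """G+C percentage in sliding windows, as (0-based start, percent) pairs.

    A profile like this can highlight regions whose composition departs
    from the chromosomal average, one of the hallmarks of a PAI.
    """
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Applied to a candidate island and to the rest of the chromosome, `gc_content` gives the two percentages being compared, while `windowed_gc` gives a per-region profile of the kind that can separate modules with different compositions.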
In addition to hes, LAA carries genes encoding functions other than adhesion and autoaggregation (Supplementary Table 3). The nmpC gene (ORF26) has been associated with heat resistance in E. coli 26, while the NmpC protein of STEC strain E045-00 has been shown to be seroreactive only with HUS sera16. The sisA gene (ORF2) product has the ability to attenuate the host inflammatory response induced by uropathogenic E. coli strains27. Four additional ORFs encoding potential virulence factors are: ORF29, which encodes for a novel variant of a Serine Protease Autotransporter of Enterobacteriaceae (SPATE), herein called the LAA encoded SPATE (LesP) (Supplementary Table 4); ORF42, which encodes a protein sharing 60.6% similarity with the Salmonella enterica PagC protein involved in serum resistance phenotype28; and ORF43 and ORF45 (tpsA and tpsB genes, respectively), which encode a two-partner secretion system that in Gram-negative bacteria participates in different virulence phenotypes29. However, the biological function of these hypothetical proteins remains uncharacterized.
Several transcriptional regulators are also encoded in PAIs30. These proteins may control the expression of genes located at PAIs and/or elsewhere in the genome. Two major classes of these transcriptional regulators are proteins of the AraC family and two-component signal transduction systems. ORF28 encodes a hypothetical protein of the AraC family that is also present in the genomic island called the Locus of Proteolysis Activity (LPA)31. ORF6 and ORF7 encode the AtoS-AtoC two-component system that positively regulates the expression of the atoDAEB operon (ORF8–ORF11)32. The potential involvement of these regulators in the expression of LAA-encoded functions and virulence is a matter of current investigation.
Since LAA exhibits several features of a PAI, we hypothesized that this locus should also be present in other STEC strains, but not in commensal strains. To test this, we performed alignments between representative genomes (draft or complete) of LEE-positive and LEE-negative STEC and commensal strains using progressiveMauve33 (Fig. 3). As for strain B2F1, LAA was identified downstream of the pheV gene in the LEE-negative STEC strains, with the exception of O91:H14 str. 06-3691, in which a region harboring only genes ORF2 to ORF20 (18.5-kb) was located next to the selC tRNA gene and a second region containing genes ORF43 (truncated) to ORF80 was located next to the pheV gene, while genes ORF21 to ORF40 were not present. In contrast, LAA was not identified in K-12, commensal or LEE-positive STEC strains. Thus, these results suggest that LAA is uniquely associated with LEE-negative STEC strains.
The presence of LAA fragments at locations other than the pheV locus, along with the identification of several DR sequences and IS elements, raises the question as to whether this locus has a mosaic structure. Initially, we analyzed its nucleotide sequence, searching for DNA regions flanked by IS elements, DR sequences and/or having different G+C content, which may correspond to modules. Four fragments (modules) with these features were identified (Fig. 2A): module I (13-kb) from the pheV gene to ORF11 is flanked at the 3′ end by a putative transposase of the IS66 family (ORF12); module II (23-kb) from ORF13 to ORF32 is flanked at the 3′ end by a putative transposase of the IS66 family (ORF33); module III (26-kb) from ORF34 to ORF49 is flanked at the 3′ end by genes encoding putative transposases of the IS600 family (ORF50 and ORF51); and module IV (21-kb) from ORF52 to ORF80 is located at the 3′ end of LAA. Additionally, all of these DNA regions have different G+C content (Fig. 2B), supporting the idea that they potentially have different genetic origins.
Next, we searched for these modules in the Pathogenicity Island Database v 2.034 and found a number of other PAIs with DNA regions that shared more than 80% identity with them, with the exception of module I (Fig. 4). Module III was previously described in the LEE-negative STEC O113:H21 str. CL3 and called PAI ICL3 (Fig. 2A, dotted line)17. A similar gene cluster is found in the genomic island GIpheV-CRICC168 of C. rodentium 35 and as part of the PAI-IAL862 with the deletion of the two-partner secretion system (ORF43-ORF45) (Fig. 4). On the other hand, module II shares homology with the LPA PAI. Likewise, module IV resembles a cluster of genes present in several PAIs (Fig. 4B). Collectively, these data confirm the mosaic structure of LAA and the diverse distribution of its modules.
At this point, it was interesting to investigate the distribution of LAA modules among E. coli strains. Accordingly, we developed a multiplex PCR assay for simultaneous detection of modules I, II and III. It was not possible to design specific primers for module IV, as its genes are widely distributed in E. coli. Pairs of primers LAA1_for+LAA1_rev and LAA2_for+LAA2_rev were designed to amplify modules I and II, respectively. A third pair of primers (ms3_for+ms3_rev), reported by Girardeau et al.35, was used for the amplification of module III (Supplementary Table 1 and Supplementary Fig. 2). With these primers, we examined the presence of the first three modules in our culture collection. As expected, PCR products were obtained only in LEE-negative STEC strains. Among them, 24/48 strains (50%) were positive for all three modules and 6/48 strains (12.5%) for two modules (Supplementary Table 2). These results demonstrate that LAA is associated with LEE-negative STEC strains and suggest that its modules might mobilize (be acquired or lost) independently of the complete structure.
In order to evaluate the possible mobilization of LAA modules, we performed an in silico analysis of genome sequences (draft and complete) from 115 LEE-negative STEC strains, 7 LEE-positive STEC strains, 2 strains of other E. coli pathotypes, E. coli K-12 MG1655 and 3 commensal E. coli strains, which are available in the NCBI database (Supplementary Table 5) (see methods). First, all draft genome sequences were ordered and aligned using progressiveMauve. Next, phylogenetic relationships among the strains were determined based on whole-genome single nucleotide polymorphism (SNP) analysis. Additionally, phylogroup assignment was conducted in silico based on the methods of Clermont et al.36. Finally, a local BLASTn search was performed to determine the presence of LAA, the distribution of its modules and the tRNA loci located close to these sequences. A module was scored as present when 50% or more of its nucleotide sequence was identified. Our results indicate that LEE-negative STEC strains are phylogenetically diverse. A maximum likelihood phylogenetic tree based on whole genome SNPs is shown in Fig. 5. In general, designation of phylogroup and serotype (antigen H) were consistent with the topology of the tree, with the exception of serotype O8:H19, in which strains were separated into different clades. In contrast, serogroups (O antigens) were polyphyletic. For example, several serotypes belonging to the serogroups O91, O113, O104 and O174 were found in distantly related clades, which is consistent with previous studies demonstrating this genetic diversity37, 38. Remarkably, the complete LAA structure was identified adjacent to the pheV gene in 34.8% (40/115) of the LEE-negative STEC strains from several serotypes (-: H25, O8:H19, O22:H8, O38:H21, O48:H21, O74:H42, O79:H7, O88:H25, O91:H21, O96:H19, O104:H21, O113:H21, O116:H21, O130:H38, O134:H38, O163:H19, O168:H8, O171:H2, O174:H2, O174:H21 and O185:H7), including 3 isolates from HC and 6 from HUS cases.
In two additional LEE-negative STEC strains, one isolated from a HC case (O166:H28 str. FHI92) and the other from human feces (O91:H21 str. FHI59, diagnostics not available), we were also able to identify each of the four LAA modules; however, they were located adjacent to different tRNA genes (pheV, selC and thrW). All the above isolates that carry the four LAA modules belong to the phylogroup B1, except the FHI92 strain, which belongs to phylogroup E. Additionally, these strains were associated (p<0.0001) with Shiga toxin type 2 (Stx2), a toxin that is a risk factor for HUS development8 (Supplementary Table 6). By contrast, these strains were negatively associated (p<0.01) with Shiga toxin type 1 (Stx1), a toxin not associated with an increased risk for HUS8. Interestingly, two isolates from the serotype O91:H14 lack this PAI, indicating that this locus is not present in all strains of the same serogroup. Furthermore, individual LAA modules were also identified (in some cases forming part of other mosaic PAI-like structures) in several LEE-negative STEC strains, including isolates belonging to phylogroups other than B1. In these cases, the modules were also located adjacent to several tRNA genes (pheV, selC, ileX and serX), which suggests the direct acquisition and/or loss of modules. No association between these strains and a Shiga toxin type was found (Supplementary Table 6). Thus, whatever the underlying mechanism (acquisition or deletion), the analysis presented demonstrates the wide distribution and mobilization of LAA modules among LEE-negative STEC strains.
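The presence rule used in this analysis (a module counts as present when 50% or more of its nucleotide sequence is identified) can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the pipeline used here: it presumes that BLASTn hits have already been reduced to 1-based inclusive coordinates on the module query, and all function names are hypothetical. Overlapping hits are merged so that no position is counted twice.

```python
def covered_length(hits: list[tuple[int, int]]) -> int:
    """Number of query positions covered by 1-based inclusive hit intervals.

    Overlapping or nested hits are counted once; intervals given in
    reverse orientation (start > end) are normalized first.
    """
    total = 0
    current_end = 0
    for start, end in sorted((min(s, e), max(s, e)) for s, e in hits):
        if end <= current_end:
            continue  # hit lies entirely inside an already-counted region
        total += end - max(start, current_end + 1) + 1
        current_end = end
    return total


def module_present(hits: list[tuple[int, int]], module_length: int,
                   min_fraction: float = 0.5) -> bool:
    """Apply the '50% or more of the sequence identified' rule."""
    if module_length <= 0:
        raise ValueError("module_length must be positive")
    return covered_length(hits) / module_length >= min_fraction


def classify_laa(modules: dict[str, bool]) -> str:
    """Label a strain by how many of the queried modules it carries."""
    if not modules:
        raise ValueError("no modules were queried")
    n_present = sum(modules.values())
    if n_present == len(modules):
        return "complete"
    return "incomplete" if n_present > 0 else "absent"
```

With the four modules as keys, `classify_laa` returns "complete" only when every module passes the threshold, and "incomplete" when one, two or three modules do.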
In order to investigate the evolutionary history of this PAI, we determined the genetic relationships between 42 LAA sequences (all four modules) based on SNP analysis. Additionally, we assessed the genetic variability of genes encoded within these sequences. In both analyses, the LAAB2F1 sequence was used as the reference. Notably, the phylogenetic tree of LAA sequences demonstrated that two major lineages (LAA-1 and LAA-2) are found within different strains of the same serotype, including O91:H21, O96:H19 and O174:H21 (Fig. 6A), indicating that LAA lineage is not linked to serotype. The greatest genetic variability between lineages was found in modules III and IV (Fig. 6B), indicating that these DNA regions may have evolved under different evolutionary pressures. Additionally, two strains (O166:H28 str. FHI92 and O130:H38 str. 492-1) were clustered outside of the LAA-1 and LAA-2 lineages. As shown above, in the FHI92 strain, which belongs to phylogroup E, LAA modules were found adjacent to several different tRNA genes (Fig. 5). Thus, it is possible to infer that LAA modules present in the FHI92 strain may be ancestral sequences.
Despite persistent efforts to increase food safety, the burden of disease caused by foodborne pathogens continues to be a concern in both developed and developing countries. The evolution of these pathogens and the impact of this biological process on their epidemiology constitute a growing challenge for public health authorities worldwide39. In particular, the emergence of the epidemic EAEC/STEC O104:H4 strain, which caused the 2011 outbreak in Germany, changed the epidemiology of diarrheagenic E. coli and showed that the genetic plasticity and horizontal gene transfer of these bacteria accelerate their adaptation to a variety of environments40. In this new epidemiological scenario, it is now accepted that there are LEE-negative STEC strains that represent a potential public health risk. Consequently, it is necessary to determine the molecular mechanisms by which these emerging STEC strains are causing disease in humans. Notably, our findings strongly suggest that in the absence of LEE, LAA may be involved as an alternative mechanism of adhesion to the human intestine, in which Hes, Iha, Ag43 and other virulence factors encoded in this locus could participate. Owing to the biological functions of Hes (Fig. 1) and Ag4319, it is possible to hypothesize that LAA-mediated adherence of STEC to intestinal cells would exhibit an aggregative (or “semilocalized”) pattern, which has been reported in isolates of serotypes O91:H21 and O113:H2141, 42. It must be emphasized that the Hes, NmpC and Ag43 proteins, all encoded in LAA, are reactive to sera from patients with HUS16, indicating that they are synthesized during human disease. Thus, these antigens probably play a role in the development of these pathologies. Furthermore, the complete LAA structure was identified in strains isolated from cases of HC and HUS (Fig. 5).
In addition, the first three modules of this PAI were detected through multiplex PCR assay in several LEE-negative STEC strains belonging to clinically relevant serotypes (O91:H21, O113:H21 and O174:H21) and isolated from different sources, including humans, animals and foods (Supplementary Table 2). Nevertheless, because of its modular nature, LAA may be present as a “complete” (with all four modules) or an “incomplete” (with one, two or three modules) structure in different strains. Importantly, module III, previously described by Shen et al.17, has been proposed as a molecular marker of LEE-negative STEC strains linked to disease35. However, our data revealed that the presence of all modules, not one of them, is most probably associated with disease. It is interesting to note that the complete LAA structure was identified almost exclusively in STEC strains belonging to the phylogroup B1 (Fig. 5), suggesting that a specific genetic background is required for its acquisition and/or maintenance. Furthermore, we identified two major lineages of LAA that can be present in different but closely related strains, including strains of the same serotype (Fig. 6A), indicating that this PAI has most likely been acquired multiple times via independent events. This supports the idea that the acquisition of LAA through horizontal gene transfer is most likely recent, which may account for the evolution and emergence of these strains. Further studies aimed at understanding the genetic variability between LAA sequences could provide more information about the evolution of this PAI and its possible role in the adaptation of E. coli to new niches, such as the human and bovine intestine. Accordingly, it is important to elucidate the participation (or lack thereof) of LAA in human disease, but this is beyond the scope of the present study.
To date, it has not been possible to fully define human pathogenic STEC, as there is no single marker or combination of markers, including the determination of serotype, that allows absolute prediction of the potential of a STEC strain to cause human disease43. However, the European Food Safety Authority Panel on Biological Hazards (BIOHAZ Panel) has suggested that the presence of molecular markers, such as stx2 and eae or aaiC plus aggR along with other non-characterized markers, may be associated with a higher risk of severe disease43. Several virulence factors, such as saa 42, sab and eibG 6, and PAIs25, 31 have been reported as exclusively present in LEE-negative STEC strains. Nonetheless, none of these molecular markers appears to be associated with severe disease. In contrast, our data suggest that the identification of LAA may allow the surveillance and assessment of the public health risk associated with emerging clones belonging to the group of under-diagnosed LEE-negative STEC strains. The association between LAA and Stx2 (Supplementary Table 6) is a significant result for public health considerations due to the epidemiological relevance of this toxin. This new knowledge will contribute significantly to epidemiological studies of STEC.
Moreover, the misconception that LEE-negative STEC strains make up a homogeneous subgroup has interfered with the determination of the real public health impact of these pathogens. Importantly, this study demonstrates that the current classification of STEC strains primarily based on the presence of LEE is inappropriate. Consequently, we propose a new classification scheme for STEC strains based on the presence of both the LEE and LAA pathogenicity islands, as well as the emergence of hybrid strains between STEC and other E. coli pathotypes44, 45 (Supplementary Fig. 3).
In conclusion, we demonstrate that LAA is a novel pathogenicity island that is present in a group of emerging STEC strains that cause severe diseases in humans. Our results provide evidence suggesting the involvement of LAA in the adaptation of these bacteria to the human intestine, thereby causing disease. This study contributes to an understanding of the evolution, pathogenicity and epidemiology of these human pathogens.
The strains used in this study are listed in Supplementary Table 2. Strains were grown in Luria-Bertani broth (LB) or Dulbecco’s Modified Eagle’s Medium-low glucose (DMEM) at 37°C with agitation. The culture media were supplemented as needed with ampicillin (Amp) (100µg/mL) and/or 2mM m-toluic acid. All vectors and primers used are listed in Supplementary Table 1. All primers designed in this study were obtained using the NCBI Primer-BLAST tool (http://www.ncbi.nlm.nih.gov).
The presence of the hes gene was determined using the primers hes_det1+hes_det2, which are specific for hes and do not amplify other allelic variants of the Hra family (Supplementary Note 1). Positive strains for hes were then analyzed with the primers hes_for+hes_rev and the PCR products obtained were sequenced (Macrogen, USA).
Coding sequences for hes and hra1 genes were amplified using the primers hes_clon1+hes_clon2 and hra1_clon1+hra1_clon2, respectively. These primers allowed obtaining products with recognition sites for the restriction enzymes NdeI and BamHI in the 5′ and 3′ ends of each gene, respectively. The hes and hra1 genes were amplified from the LEE-negative O113:H21 STEC strain E045-00 and from the EAEC str. 042, respectively. PCR products were ligated to the vector pTZ57R/T (Fermentas, Lithuania), following the manufacturer’s instructions, in order to obtain the vectors pTZ57R/T_hes and pTZ57R/T_hra1. These vectors were used to transform the E. coli DH5α, and clones were selected according to Amp resistance and α-complementation. The correct clone was confirmed by sequencing (Macrogen, USA). Next, corresponding vectors were extracted from the transformed E. coli DH5α strains and digested with NdeI and BamHI. The digestion products were analyzed by agarose gel electrophoresis, and the inserts (hes and hra1 genes) were purified. These fragments were ligated to the vector pVB1 (Dualsystems Biotech, Switzerland) to obtain the vectors pVB1_hes and pVB1_hra1, in which the genes hes and hra1 are regulated by the Pm/xylS expression system. These vectors were used to transform the E. coli HB101 strain. As a control, the empty vector was also transformed in the HB101 strain.
1) Hemagglutination: Agglutination of sheep erythrocytes was carried out as previously described46. 2) Bacterial Autoaggregation: Bacterial autoaggregation was measured as described in ref. 21 with the following modifications. Briefly, bacterial cultures were grown in LB supplemented with the appropriate antibiotic and m-toluic acid (2mM) overnight at 37°C with agitation. The cultures were centrifuged at 9000×g, re-suspended in phosphate-buffered saline (PBS) and normalized to an optical density at 600nm (OD600) of ~0.9. 10mL of each bacterial suspension was placed in two separate tubes. One tube remained static and the other was vortexed before each OD measurement. The tubes were left static at room temperature. To measure the bacterial settling over time, at designated time points (every hour for 8hours), 0.5mL was removed from within 1cm of the surface of each bacterial suspension and the OD600 was measured. This assay was performed two times on different days. 3) Biofilm formation: Biofilm formation was observed and quantified by fixing and staining with crystal violet as described in ref. 21 with the following modifications. Briefly, 10μL of overnight culture in LB supplemented with the appropriate antibiotic and m-toluic acid (2mM) was added to 1mL of LB medium in a 24-well plate. Plates were incubated without shaking at room temperature. At designated time points (24, 48 and 72hours), culture medium was aspirated, each well was washed three times with PBS and contents were fixed for 7min with 70% methanol. The wells were allowed to dry completely. Fixed biofilms were stained with 0.5% crystal violet for 15min and washed twice with water. Quantification of biofilm formation was carried out by the addition of 0.5mL of 33% glacial acetic acid and measurement of the OD595.
This assay was performed three times in triplicate on different days 4) Bacterial adhesion and invasion of human epithelial cells: Bacterial adhesion to Caco-2, HT-29 and Hep-2 cells was evaluated as previously described46, with slight modifications. Briefly, epithelial cells were cultivated in DMEM supplemented with 10% bovine fetal serum and 1% penicillin-streptomycin at 37°C in 5% CO2 atmosphere. Cells were seeded in 24-well plates and grown to confluence (approximately 4×105 cells/well). Bacterial pre-inoculates were grown overnight in DMEM low-glucose supplemented with the appropriate antibiotic and m-toluic acid (2mM) at 37°C with agitation. An aliquot of each pre-inoculum was diluted 50-fold in the same culture medium and incubated at 37°C for 4h with agitation. The epithelial cells were washed three times with PBS and infected with an MOI of 100 for 30min at 37°C in 5% CO2 atmosphere.
Non-adherent (planktonic) bacteria were removed by five washes with PBS, and the adherent bacteria were recovered by lysis with 0.1% Triton X-100. The number of adherent bacteria was determined by serial dilution and counts of viable bacteria in LB agar. The final result was expressed as the percentage of bacterial cells adhered to the cell layer relative to the number of bacteria added. To quantify the invasiveness of the bacterial strains, we performed a protection assay with gentamicin. One hour after infection, planktonic bacteria were removed by three washes with PBS and the cells were incubated with DMEM medium supplemented with gentamicin (50µg/mL) for 1.5h. The medium was removed, the cells were washed three times with PBS and the invasive bacteria were recovered by lysis with 0.1% Triton X-100. After serial dilution and a count of invasive bacteria in LB agar, we determined the percentage of invasive bacteria relative to number of bacteria added. All assays were performed three times in duplicate on different days. 5) Visualization of the adherence phenotype. HT-29 cells were cultivated on glass coverslips and infected for 1h, as described above. After three washes with PBS, DMEM medium was added and incubated for 1 more hours. After three more washes with PBS, 70% methanol was added for 7min and cells were stained with Giemsa 1:20 for 40min at room temperature.
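The adhesion and invasion results described above are percentages of recovered viable bacteria relative to the bacteria added. The following Python sketch illustrates that arithmetic only. The function names, the colony count, the dilution factor, the plated volume and the lysate volume are hypothetical values chosen for illustration and are not data from this study.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Viable count of the undiluted sample: colonies x dilution factor / plated volume."""
    if dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def percent_of_inoculum(recovered_cfu: float, inoculum_cfu: float) -> float:
    """Recovered bacteria expressed as a percentage of the bacteria added."""
    if inoculum_cfu <= 0:
        raise ValueError("inoculum must be positive")
    return 100.0 * recovered_cfu / inoculum_cfu


# Inoculum implied by the protocol: MOI of 100 on ~4x10^5 cells per well.
inoculum = 100 * 4e5

# Hypothetical plate reading: 52 colonies from 0.1 mL of a 10^4-fold dilution,
# with the well lysate assumed to have a total volume of 1 mL.
lysate_volume_ml = 1.0
recovered = cfu_per_ml(52, 1e4, 0.1) * lysate_volume_ml

adhesion_pct = percent_of_inoculum(recovered, inoculum)
print(f"adhesion: {adhesion_pct:.1f}% of inoculum")
```

The same two functions apply to the gentamicin protection assay, with the colony count taken after gentamicin treatment.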
All genome sequences analyzed were downloaded from GenBank at the National Center for Biotechnology Information (NCBI - http://www.ncbi.nlm.nih.gov/) on 20 September 2016. Accession numbers and the sources of the sequences are listed in Supplementary Table 5. Contigs of draft genomes were ordered and aligned against the complete genome of E. coli K-12 substr. MG1655 using progressiveMauve33. Then, the contigs of each strain were concatenated into one contiguous sequence and the genetic context of the hes gene was analyzed using several bioinformatic tools. Specifically, the DR sequences, IS elements and tRNA loci were identified using REPuter47, ISfinder48 and tRNAscan-SE49, respectively. In addition, the ORFs and the G+C content were determined by analyzing genomic sequences using Unipro UGENE50 and the Geneious software package (v10.0.9; Biomatters Ltd). DNA regions with PAI features were used to perform BLASTn searches against the Pathogenicity Island Database v2.034. A local BLASTn search was also performed in the Geneious software package to determine the distribution of LAA modules and their insertion sites in the genomes analyzed.
Simultaneous detection of modules I, II and III of the LAA pathogenicity island was performed by a PCR assay (Fig. S2). The primers LAA1_for+LAA1_rev and LAA2_for+LAA2_rev were designed to amplify modules I and II, respectively. A third pair of primers (ms3_for+ms3_rev), reported by Girardeau et al.35, was included for the amplification of module III. The amplification reactions were performed in a final volume of 25µL containing template DNA, 0.3µM each primer, 0.4µM each deoxynucleoside triphosphate (Fermentas, Lithuania), 5µL 5X GoTaq DNA polymerase buffer and 1.25U GoTaq DNA polymerase (Promega, USA). The amplification program consisted of an initial denaturation at 94°C for 5min, 30 cycles of denaturation at 94°C for 30s, annealing at 62.5°C for 40s and extension at 68°C for 2min, and a final extension at 72°C for 10min. PCR products were analyzed by electrophoresis in 1% agarose gel using Tris-acetate-EDTA buffer and stained with ethidium bromide.
(1) Whole genome SNP analysis: Genome sequences, both draft and complete, were uploaded to the CSI Phylogeny 1.4 server51, which identifies SNPs from whole genome sequencing data, filters and validates the SNP positions, and then infers a phylogeny based on concatenated SNP profiles. This analysis was performed using the default input parameters and E. coli K-12 MG1655 as the reference genome. As a result, 167,167 SNPs were identified in 3,008,649 positions found in all analyzed genomes. The output file in Newick format was downloaded and used for visualization of the phylogenetic tree in FigTree v.1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/). In silico PCR52 was performed to determine the phylogroup based on the presence/absence of the genes chuA, yjaA, arpA, trpA and the segment TspE4.C2, as proposed by Clermont et al.36. (2) Genetic relationships of the LAA pathogenicity island among LEE-negative STEC strains: A total of 42 genomic sequences of STEC strains carrying the four modules of LAA were uploaded to the CSI Phylogeny 1.4 server, and SNP identification was carried out with default input parameters and LAAB2F1 as the reference sequence. As a result, 677 SNPs were identified in 49,427 positions found in all sequences. Tree construction was performed as described above.
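To put the two SNP counts reported above on a common scale, the proportion of variable sites among the positions shared by all sequences can be computed directly. The short Python sketch below uses only the four numbers given in the text. The function and variable names are illustrative, and this calculation is not part of the CSI Phylogeny output.

```python
def variable_site_fraction(n_snps: int, n_positions: int) -> float:
    """Fraction of compared positions that carry a SNP."""
    if n_positions <= 0:
        raise ValueError("number of positions must be positive")
    return n_snps / n_positions


# Whole-genome analysis against E. coli K-12 MG1655.
whole_genome = variable_site_fraction(167_167, 3_008_649)

# LAA-only analysis against the LAAB2F1 reference sequence.
laa_island = variable_site_fraction(677, 49_427)

print(f"whole genome: {whole_genome:.2%} variable sites")
print(f"LAA island:   {laa_island:.2%} variable sites")
```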
The presence, absence and variations of LAA-encoded genes were assessed by BLASTn searches performed in the Geneious software package with LAAB2F1 as the reference sequence. By default, a gene was considered absent when its coverage and/or identity was below 60%. Comparisons between genomes and complete LAA sequences were performed and visualized using progressiveMauve33 and EasyFig v2.153, respectively. A heat map showing the presence, absence and identity of LAA-encoded genes was drawn using the package gplots54 in R55.
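The presence/absence rule stated above (a gene is absent when coverage and/or identity falls below 60%) can be sketched as follows. This is a minimal Python illustration, not the Geneious workflow used in the study. The `Hit` record, the function names, the gene labels `geneX` and `geneY`, and all hit values are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    gene: str        # reference gene name
    coverage: float  # percent of the reference gene covered by the hit
    identity: float  # percent nucleotide identity of the hit


def is_present(hit: Hit, threshold: float = 60.0) -> bool:
    """A gene counts as present only if coverage AND identity reach the threshold."""
    return hit.coverage >= threshold and hit.identity >= threshold


def heatmap_row(hits: Iterable[Hit], genes: List[str], threshold: float = 60.0) -> List[float]:
    """One heat-map row per genome: best identity for present genes, 0.0 for absent ones."""
    best = {}
    for hit in hits:
        if is_present(hit, threshold):
            best[hit.gene] = max(best.get(hit.gene, 0.0), hit.identity)
    return [best.get(gene, 0.0) for gene in genes]


# Hypothetical hits for one genome.
genes = ["hes", "geneX", "geneY"]
hits = [
    Hit("hes", 100.0, 98.5),   # present
    Hit("geneX", 55.0, 99.0),  # absent: coverage below 60%
    Hit("geneY", 80.0, 59.9),  # absent: identity below 60%
]
row = heatmap_row(hits, genes)
print(row)
```

Rows built this way for each genome form the kind of matrix a heat-map function expects, with 0.0 marking absence.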
Three independent adhesion, invasion and biofilm formation assays were performed and the data were compared using Student’s t-test (two-tailed). A P-value of less than 0.05 was considered significant.
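For readers who want to reproduce this kind of comparison, the sketch below implements a two-tailed Student's t-test in plain Python. It assumes an unpaired design with equal variances, which the text does not state explicitly. The p-value is obtained by numerically integrating the t density with Simpson's rule rather than by calling a statistics library, and the example measurements are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t: float, df: int, steps: int = 20000) -> float:
    """Two-tailed p-value: 1 - 2 * integral of the density from 0 to |t| (Simpson's rule)."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def student_t_test(x, y):
    """Unpaired, equal-variance Student's t-test; returns (t, df, p)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    se = math.sqrt(pooled * (1 / nx + 1 / ny))
    if se == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(x) - mean(y)) / se
    return t, df, two_tailed_p(t, df)


# Hypothetical adhesion percentages from three independent assays per strain.
with_gene = [12.1, 13.4, 11.8]
empty_vector = [3.2, 4.1, 2.9]
t, df, p = student_t_test(with_gene, empty_vector)
print(f"t = {t:.2f}, df = {df}, significant at 0.05: {p < 0.05}")
```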
This study was supported by FONDECYT grant 1161161, awarded to R. Vidal, and by the CONICYT-PCHA/2014-63140238 fellowship, awarded to D. Montero.
D.A.M. data acquisition, data analysis, data interpretation, study design, revising of the manuscript, prepared figures and tables. J.V. data analysis, revising of the manuscript. F.D.C. data interpretation, data analysis, revising of the manuscript. J.L.P. data analysis, editing of the manuscript. N.L.P. data interpretation, revising of the manuscript. D.A.R. contributing with original genome sequences, data acquisition, revising of the manuscript. M.F. data analysis, revising of the manuscript. J.C.S. data analysis, experimental design, revising of the manuscript. R.V. study design, data interpretation, data analysis, wrote the main manuscript text, revising of the manuscript. All authors contributed to the editing and approved the final manuscript version.
The authors declare that they have no competing interests.
Electronic supplementary material
Supplementary information accompanies this paper at doi:10.1038/s41598-017-06999-y
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.