A survey of chromosomal variation in the ST239 clonal group of methicillin-resistant Staphylococcus aureus (MRSA) revealed a novel genetic element, ICE6013. The element is 13,354 bp in length, excluding a 6,551-bp Tn552 insertion. ICE6013 is flanked by 3-bp direct repeats and is demarcated by 8-bp imperfect inverted repeats. The element was present in 6 of 15 genome-sequenced S. aureus strains, and it was detected using genetic markers in 19 of 44 diverse MRSA and methicillin-susceptible strains and in all 111 ST239 strains tested. Low integration site specificity was discerned. Multiple chromosomal copies and the presence of extrachromosomal circular forms of ICE6013 were detected in various strains. The circular forms included 3-bp coupling sequences, located between the 8-bp ends of the element, that corresponded to the 3-bp direct repeats flanking the chromosomal forms. ICE6013 is predicted to encode 15 open reading frames, including an IS30-like DDE transposase in place of a Tyr/Ser recombinase and homologs of gram-positive bacterial conjugation components. Further sequence analyses indicated that ICE6013 is more closely related to ICEBs1 from Bacillus subtilis than to the only other potential integrative conjugative element known from S. aureus, Tn5801. Evidence of recombination between ICE6013 elements is also presented. In summary, ICE6013 is the first member of a new family of active, integrative genetic elements that are widely dispersed within S. aureus strains.
ST239 is a globally distributed clonal group of methicillin-resistant Staphylococcus aureus (MRSA). Currently, ST239 is a major cause of MRSA infections in Asian hospitals (5, 18, 25, 37, 45, 64, 74). Pulsed-field gel electrophoresis has detected extensive chromosomal variation in local ST239 populations (3, 24, 52, 72). As ST239 has geographically spread and diversified, its variants have been given more than a dozen different names (20, 22, 24, 25, 49, 52, 61, 67, 68, 73), which reflects their clinical significance in various locales. The molecular basis for the ecological success of ST239 is unclear, but virulence-associated traits such as enhanced biofilm development and epidemiological characteristics such as a propensity to cause device-associated bacteremia and pulmonary infections have been highlighted (3, 19, 27, 54).
Multilocus genetic investigations of the ST239 chromosome revealed that it is a hybrid with estimated parental contributions of approximately 20% and 80% from distantly related ST30- and ST8-like parents, respectively (58). Unusual for naturally isolated bacteria was the finding that these parental contributions were large chromosomal replacements rather than a patchwork of localized recombinations. It was postulated that conjugation might be responsible for the natural transfer of hundreds of kilobases of contiguous chromosomal DNA that resulted in ST239 (58). Recent genomic investigations have presented evidence that large chromosomal replacements also occur within Streptococcus agalactiae strains and that they can be mimicked with laboratory conjugation experiments (12). Importantly, conjugative transfer frequencies in S. agalactiae were found to be highest near three genomic islands (12), two of which were identified as being integrative conjugative elements (ICEs) (13).
The terms ICE and conjugative transposon are synonymous; both refer to genetic elements that are maintained by integration into a replicon and are transmitted by self-encoded conjugation functions (56). ICEs abound in the genomes of S. agalactiae (11), but only one potential ICE has been identified in staphylococci to date: Tn5801 was discovered through the genomic sequencing of S. aureus strain Mu50 (46). Tn5801 is most similar to a truncated genetic element, CW459tet(M), from Clostridium perfringens (57). Both Tn5801 and CW459tet(M) have Tyr recombinases, regulatory genes, and tetM modules that are similar to those of the prototypical gram-positive conjugative transposon, Tn916. Moreover, both Tn5801 and CW459tet(M) integrate into the same locus, guaA, at a nearly identical 11-bp sequence. Although the conjugative transfer module of CW459tet(M) is deleted (57), the conjugative transfer module of Tn5801 is similar to that of Tn916.
We suspected that ST239 strains might carry novel accessory genes that contribute to their chromosomal variation and ecological success. To explore this possibility, we conducted a survey of chromosomal variation in ST239 using a PCR scanning approach. We report the discovery and partial characterization of a novel genetic element, ICE6013, that resulted from the survey.
Strain HDG2 represents the Portuguese variant of the multilocus sequence typing-defined clonal group ST239 (24). Strain EMRSA16 is of the ST36 clonal group, which is closely related to both the genome-sequenced strain MRSA252 (36) and the hypothetical ST30-like parent of ST239 (58). Strains HDG2 and EMRSA16 were used for studies of chromosomal variation; both strains were subsequently found to carry ICE6013. A collection of 43 additional strains (most reported in references 51 and 59) representing diverse clonal groups and a collection of 110 additional strains of ST239 from worldwide sources (our unpublished data) were used to assess the clonal distribution of ICE6013. Strain NCTC8325 (32), which does not carry ICE6013, was used as a negative control where indicated. All isolates were stored long term at −80°C, and routine growth was done overnight on tryptone soya agar plates at 37°C.
Large-scale chromosomal variation in bacteria can be detected with overlapping long-range PCRs that either amplify a product of unexpected size or fail to amplify a product (9). The detectable variation consists of (i) extension of the distance between primer pairs due to insertions, (ii) lack of a primer site due to deletions or mutations, and (iii) improper primer combinations due to rearrangements. The GenoFrag v2.0 computer program (9) was used to design 56 PCR primer pairs (O1 to O112 in Table S1 in the supplemental material) to amplify chromosomal regions, averaging 10 kb with 1-kb overlaps, that together span the ST30-like portion of the ST239 chromosome. We used the MRSA252 genome sequence (36) as a reference for primer design and strain EMRSA16 as a positive control for long-range PCRs.
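The tiling geometry described above (amplicons of about 10 kb with 1-kb overlaps) can be illustrated with a minimal sketch. This is not the GenoFrag algorithm, which also has to choose primers with suitable properties; the function name `tile_region` and the 100-kb example region are assumptions made purely for illustration.

```python
def tile_region(start, end, amplicon=10_000, overlap=1_000):
    """Return (left, right) coordinate pairs that cover [start, end]
    with fixed-length windows overlapping by `overlap` bases."""
    if overlap >= amplicon:
        raise ValueError("overlap must be smaller than the amplicon length")
    step = amplicon - overlap
    tiles = []
    left = start
    while True:
        # The last window is clipped so that it never runs past `end`.
        right = min(left + amplicon, end)
        tiles.append((left, right))
        if right >= end:
            break
        left += step
    return tiles


# Hypothetical 100-kb region, used only to show the geometry.
tiles = tile_region(0, 100_000)
print(len(tiles))   # 11
print(tiles[:2])    # [(0, 10000), (9000, 19000)]
```

Each window shares 1 kb with its neighbor, so an insertion or deletion anywhere in the region falls inside at least one amplicon.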
We used a halving strategy to pinpoint an insertion (i.e., ICE6013) in one chromosomal region of strain HDG2 that initially did not produce a long-range PCR product. Briefly, using the MRSA252 genome sequence as a guide, we subdivided the region into halves and attempted to amplify across each half. The half that failed to amplify a product was suspected to contain an insertion. This process was continued until long-range PCR could be used to amplify across the remaining region. Genomic DNA was isolated with the DNeasy kit (Qiagen), and long-range PCR was done with the TripleMaster system (Eppendorf). To produce amplicons of 9 to 15 kb, thermal cycling parameters were an initial denaturation step of 93°C for 4 min, followed by 20 cycles of 93°C for 30 s and 68°C for 12 min (increasing by 15 s each cycle) and a final elongation step of 72°C for 15 min. To produce amplicons of >15 kb, the number of cycles was increased to 30. Long-range PCR products were analyzed by electrophoresis in 0.5% agarose.
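The halving strategy is essentially a bisection search, and a minimal sketch may make the stopping condition clearer. The wet-lab PCR is replaced here by a simulated function; `make_simulated_pcr`, the insertion position of 4,000, and the 25-kb amplification limit are all hypothetical values chosen for illustration, and a single insertion is assumed.

```python
def make_simulated_pcr(ins_pos, ins_len, max_product):
    """Build a stand-in for a long-range PCR across reference coordinates
    a..b. It returns the product size, or None if the product would be
    too long to amplify."""
    def pcr(a, b):
        size = (b - a) + (ins_len if a <= ins_pos < b else 0)
        return size if size <= max_product else None
    return pcr


def locate_insertion(start, end, pcr):
    """Halve a region that fails to amplify until one half yields a
    product larger than the reference predicts."""
    while end - start >= 2:
        mid = (start + end) // 2
        for a, b in ((start, mid), (mid, end)):
            size = pcr(a, b)
            if size is None:
                # Failed half: suspected to contain the insertion.
                start, end = a, b
                break
            if size > b - a:
                # Oversized product: the insertion has been spanned.
                return a, b, size - (b - a)
        else:
            raise RuntimeError("neither half failed nor gave an oversized product")
    raise RuntimeError("region too small to subdivide further")


# Region and insertion lengths taken from the text; position and limit are assumed.
pcr = make_simulated_pcr(ins_pos=4_000, ins_len=19_905, max_product=25_000)
result = locate_insertion(0, 12_833, pcr)
print(result)  # (3208, 6416, 19905)
```

In this simulation the search stops after two rounds, when the interval holding the insertion is short enough for the whole product to be amplified.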
A long-range PCR product containing ICE6013 from strain HDG2 was generated with primers O115 and O116 (see Table S1 in the supplemental material) and the 30-cycle program. The product was gel purified using the Wizard SV gel and PCR clean-up system (Promega) and randomly fragmented using the DNase shotgun sequencing kit (Novagen). The resulting fragments were flush ended and dA tailed using the Single dA Tailing kit (Novagen). Processed fragments were ligated into pCRII and transformed into chemically competent Escherichia coli Top10 cells (Invitrogen). Clones containing plasmids with inserts were identified with blue-white screening. Inserts from approximately 200 clones were amplified and sequenced on both strands. Sequence traces were assembled with Seqman Pro v7.2.1 (DNASTAR), treating forward and reverse traces from each clone as paired ends. Gaps in the assembled sequences were closed by PCR walking.
Multiplex PCR was used to screen strains for the presence of ICE6013. This involved the simultaneous amplification of an internal portion of the aroE housekeeping gene as a positive control and an internal portion of ICE6013 using primers O131 and O132 (see Table S1 in the supplemental material). Thermal cycling parameters were an initial denaturation step of 95°C for 2 min, followed by 30 cycles of 95°C for 30 s, 50°C for 30 s, and 72°C for 2 min and a final elongation step of 72°C for 4 min.
Outward-directed PCR was used to detect circular forms of ICE6013 from both genomic and chromosome-free DNA templates. Primers O135 and O136 (see Table S1 in the supplemental material) permit amplification only if the element ends are facing each other, as is expected in circularized molecules and in tandemly repeated structures. Chromosome-free DNA was prepared by treating DNA, which was isolated with the Qiaprep Spin miniprep kit (Qiagen), with Plasmid-Safe ATP-dependent DNase (Epicentre). The absence of chromosomal DNA was consistent with our inability to amplify aroE from these DNA preparations. Thermal cycling parameters were an initial denaturation step of 95°C for 2 min, followed by 30 cycles of 95°C for 30 s, 55°C for 30 s, and 72°C for 1 min and a final elongation step of 72°C for 2 min. The resulting PCR products were sequenced on both strands.
The GenomeWalker Universal kit (Clontech) was used to identify the chromosomal integration sites of ICE6013 in five strains. Briefly, this involved the digestion of genomic DNA with DraI (New England Biolabs), followed by the ligation of the restriction fragments to DNA adaptors. The adaptor-ligated fragments were subjected to PCR using primer AP1 (Clontech), which anneals to the adaptor sequence, and primer O139 (see Table S1 in the supplemental material), which anneals near the left end of ICE6013. Thermal cycling parameters were 7 cycles of 94°C for 25 s and 70°C for 4 min, followed by 35 cycles of 94°C for 25 s and 65°C for 4 min and a final elongation step of 65°C for 4 min. The PCR mixtures were diluted 1:50, and 1 μl was used for a second round of PCR using nested primers AP2 (Clontech) and O140 (see Table S1 in the supplemental material). Thermal cycling parameters for nested PCR were 5 cycles at 94°C for 25 s and 70°C for 4 min, followed by 22 cycles of 94°C for 25 s and 65°C for 4 min and a final elongation step of 65°C for 4 min. Following nested PCR, the products were ligated into pCRII and transformed into E. coli Top10 cells (Invitrogen). For each of the five strains studied, inserts from five to six clones were identified as described above and sequenced on both strands.
Genomic DNA was digested with EcoRV (New England Biolabs), and restriction fragments were separated by electrophoresis in 0.5% agarose. The fragments were blotted onto nylon membranes (Bio-Rad) by capillary transfer. The digoxigenin (DIG)-labeling and detection system (Roche) was used for Southern hybridization and chemiluminescent detection. A DIG-labeled PCR probe, generated from strain HDG2 with primers O131 and O132 (see Table S1 in the supplemental material), was used to screen strains for the presence of ICE6013.
To test for the conjugative transfer of ICE6013, we used the Tn552-encoded ampicillin resistance marker on the ICE6013 elements of ST239 strains. Two sets of donor and recipient strains were used. The first strain set included ST239 strain DAR245 as a streptomycin-susceptible, ampicillin-resistant (Strs Ampr) donor and a passage-derived mutant of RN4220 as a streptomycin-resistant, ampicillin-susceptible (Strr Amps) recipient. The second strain set included ST239 strain DAR332 as an erythromycin-susceptible, ampicillin-resistant (Erys Ampr) donor and 10 different erythromycin-resistant, ampicillin-susceptible (Eryr Amps) S. aureus strains as recipients. Filter matings were done as described previously by Stout and Iandolo (66), with one modification: separate overnight cultures of the donor and recipient strains were diluted in fresh medium and incubated with shaking at 37°C for 6 h before being mixed and filtered through sterile membranes. The remaining unfiltered mixed culture was incubated statically for a further 6 h at 37°C. Cells from filter and mixed-culture experiments were plated onto appropriate antibiotic plates to identify transconjugants (66).
ICE6013 was identified in the genome sequences of six S. aureus strains using BLASTN and BLASTX searches of the GenBank nonredundant and Sanger Institute databases. Open reading frames (ORFs) were predicted with GLIMMER v3.02 (21) using default parameters. Molecular weight, cellular location, signal peptidase I and II sites, and transmembrane (TM) helices were predicted for each ORF with Expasy proteomics tools (4), PSORTb v2.04 (31), SignalP v3.0 (8), LipoP v1.0 (41), and HMMTOP v2.0 (69) using default parameters. Protein domains within ORFs were predicted with searches of the Conserved Domain Database and Prosite. Best non-ICE6013 BLASTP hits were identified using the SEG filter and an E value threshold of ≤1E−4 and requiring a global alignment length to span 50% of the query.
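The hit-selection criteria in the last sentence can be written as a simple filter. The sketch below assumes that "span 50% of the query" means an alignment length of at least half the query length; the record layout and the three example hits are hypothetical and do not come from the study.

```python
def passes_filter(hit, max_evalue=1e-4, min_query_cover=0.5):
    """Keep a hit if its E value is at or below the threshold and the
    alignment covers at least the given fraction of the query."""
    long_enough = hit["align_len"] >= min_query_cover * hit["query_len"]
    return hit["evalue"] <= max_evalue and long_enough


# Hypothetical hits for a 500-residue query.
hits = [
    {"subject": "hypothetical_A", "evalue": 1e-30, "align_len": 400, "query_len": 500},
    {"subject": "hypothetical_B", "evalue": 1e-3, "align_len": 450, "query_len": 500},
    {"subject": "hypothetical_C", "evalue": 1e-10, "align_len": 100, "query_len": 500},
]
kept = [h["subject"] for h in hits if passes_filter(h)]
print(kept)  # ['hypothetical_A']
```

Hit B is rejected on its E value and hit C on its short alignment, so only hit A would be reported as a best non-ICE6013 match.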
Phylogenetic analyses were performed for five ORFs of ICE6013 that were determined to be homologous to ORFs found on ICEBs1 from Bacillus subtilis strain str168 (GenBank accession number NC_000964), Tn916 from Enterococcus faecalis strain DS16 (accession number EFU09422), and Tn5801 from S. aureus strain Mu50 (accession number NC_002758). These homologies were identified using pairwise BLASTP comparisons (blast2seq) of complete elements. Alignments of amino acids were made with MUSCLE v3.7 (26) using default parameters. Ambiguous regions of the alignment were removed with Gblocks v0.91b (17) using relaxed criteria. Maximum likelihood trees were constructed from the trimmed alignment with PhyML under a WAG model of substitution (23).
To detect the presence of recombination, the pairwise homoplasy index (14) was calculated using SplitsTree v4.8 (39). To assess the phylogenetic incompatibilities between nucleotide sites that were most likely caused by recombination, we constructed a compatibility matrix (40) of parsimony-informative sites from third-codon and intergenic positions using Spectronet v1.27 (38).
The complete ICE6013 sequence from strain HDG2 has been deposited in the GenBank database under accession number FJ231270.
We used PCR scanning to detect variation in the ST30-like portion of the chromosome of ST239 strain HDG2. Three regions of variation that mapped to known genetic elements, including the ISSA1 insertion sequence, the L54a prophage, and the staphylococcal chromosomal cassette mec element, were detected. However, another region of variation that did not map to a known genetic element was detected; a primer set that amplified a 12,833-bp product spanning loci homologous to SAR2661 and SAR2676 from positive control strain EMRSA16 did not amplify a product from tester strain HDG2. By iteratively subdividing this region of the chromosome into halves and attempting to amplify across each half, we subsequently amplified a 22,165-bp product from strain HDG2 where a corresponding 2,260-bp product was obtained from strain EMRSA16. Shotgun sequencing of this amplicon revealed a 19,905-bp insertion in strain HDG2 that mapped to the intergenic region between loci homologous to SAR2664 and SAR2665 (Fig. 1).
Close inspection of the insertion revealed that it was flanked by a 3-bp sequence, ATT, which occurred as a direct repeat. The corresponding position in the MRSA252 genome sequence was a single ATT sequence. Furthermore, the first and last 8 bp of the insertion constituted an imperfect inverted repeat of sequences GGCAGTGT and ACACCACC, respectively (Fig. 1). These terminal structures suggested that the insertion was a mobile genetic element.
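To see why the two 8-bp ends qualify as an imperfect inverted repeat, the right end can be reverse complemented and compared position by position with the left end. The following is a minimal sketch; the helper names are ours and are not taken from any tool used in the study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeat_matches(left_end, right_end):
    """Count positions at which the left end equals the reverse
    complement of the right end."""
    rc = reverse_complement(right_end)
    return sum(a == b for a, b in zip(left_end, rc))


left, right = "GGCAGTGT", "ACACCACC"
print(reverse_complement(right))             # GGTGGTGT
print(inverted_repeat_matches(left, right))  # 6
```

Six of the eight positions match (the third and fourth differ), which is consistent with describing the repeat as imperfect.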
Nucleotide database searches revealed sequences with 99% average pairwise identity to the element in 6 of 15 (40%) S. aureus genome-sequenced strains, including the completed sequences of strains MRSA252, COL, TCH1516, and FPR3757 and the unfinished sequences of strains 0582 and EMRSA15. Of note, the element was located at five different chromosomal loci in these six genome-sequenced strains, and it was present in a single copy in each of these strains (Table 1). Strains HDG2 and 0582 are both of the ST239 clonal group, and they carry the element at the same locus. Likewise, strains COL and FPR3757 are of the related ST250 and ST8 clonal groups, and they carry the element at the same locus. However, strain TCH1516 is also of the ST8 clonal group, but its element is located elsewhere (Table 1). In all cases, the element was flanked by 3-bp direct repeats, and its end points were demarcated by the 8-bp imperfect inverted repeats (Table 1). These observations indicated that the element has been mobilized in the past.
In strains HDG2 and 0582, a 6,551-bp Tn552 transposon, which encodes a β-lactamase operon, has inserted into the element at the preferred 6-bp target site of Tn552, CACGAG (62). After accounting for the Tn552 sequence, the length of the element in strain HDG2 is 13,354 bp. The G+C content of the element is 30.1%, which is slightly below the range of 32.7 to 32.8% for the four complete genome sequences of the strains that contain the element. We reserved the Tn designation (56) of Tn6013 for this element. Further sequence analyses (described below) that revealed the presence of conjugation-associated genes with homology to those of the ICE ICEBs1 from B. subtilis (6) prompted us to name the element ICE6013.
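The reported lengths can be cross-checked with simple arithmetic, using the 22,165-bp and 2,260-bp products obtained from strains HDG2 and EMRSA16 and the 6,551-bp Tn552 length. The variable names below are ours.

```python
hdg2_product = 22_165     # bp, amplicon from strain HDG2
emrsa16_product = 2_260   # bp, corresponding amplicon from strain EMRSA16
tn552_length = 6_551      # bp, Tn552 insertion within the element

# Size of the insertion relative to the strain without the element.
insertion_length = hdg2_product - emrsa16_product
# Element length once the Tn552 sequence is excluded.
element_length = insertion_length - tn552_length

print(insertion_length)  # 19905
print(element_length)    # 13354
```

Both values agree with the 19,905-bp insertion and the 13,354-bp element length given in the text.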
To investigate the presence of ICE6013 among additional strains of S. aureus, we designed PCR primers and a Southern hybridization probe based on an internal portion of the element. Strain NCTC8325 was used as a negative control for these screening studies since its complete genome sequence did not contain the element, and strain HDG2 was used as a positive control. We detected ICE6013 in 19 of 44 (43%) clonally diverse strains, which included a variety of multilocus sequence typing-defined clonal groups (Table 2). Four strains were negative for ICE6013 by PCR but positive by Southern blotting, which pointed to variation in the element sequence. We note that the element was variably present in the ST30- and ST8-like backgrounds of ST239's parents. ICE6013 was detected in 111 of 111 (100%) geographically diverse ST239 strains (Table 2), indicating that the element might be fixed within this clonal group.
EcoRV-digested genomic DNA from these strains revealed a variety of patterns when probed with an internal portion of ICE6013. A representative Southern blot is shown in Fig. 2. EcoRV is expected to cut downstream of the probed region within ICE6013 and upstream of ICE6013. Strains HDG2 and EMRSA16 both produced a single EcoRV fragment that hybridized to the probe (Fig. 2). The restriction fragment from strain HDG2 was similar to the expected size of 9.6 kb based on the shotgun sequence that we obtained. In contrast, strains C437, C316, D410, and D22 all produced multiple EcoRV fragments that hybridized to the probe (Fig. 2), even though PCR products of the probed region from these four strains revealed no internal EcoRV sites. These results indicated that ICE6013 is variable in copy number among S. aureus strains.
To study the ICE6013 integration sites from some clonally diverse strains, we constructed GenomeWalker libraries for strains EMRSA16, C437, C316, D410, and D22. One or more integration sites were identified for all five strains (Table 1). The integration site for strain EMRSA16 was found to be identical to that of the MRSA252 genome-sequenced strain; we note that these two strains are both of the ST36 clonal group. The integration sites for the other four strains were all unique (Table 1). Full-length ICE6013 could be amplified by long-range PCR from strains EMRSA16, C437, and C316 using primers anchored in the element flanking regions. PCR with primers that spanned the left junction of the HDG2 integration site revealed that ICE6013 was present at the same site in all 111 strains of ST239 (data not shown). In addition, this particular copy of ICE6013 contained the Tn552 insertion in all 111 strains of ST239.
Two ICE6013 integration sites were identified for each of strains D410 and D22. Interestingly, the data for strain D410 indicated a composite ICE6013 structure; site 8 represented an integration within an ICE6013-specific sequence, and site 9 represented the right end of one element and the left end of another element separated by a 3-bp sequence (Table 1). Corresponding right ends of the elements from strains D410 and D22 were not identified among the GenomeWalker clones that we examined, and attempts to amplify full-length elements using primers anchored in the flanking regions from these two strains were unsuccessful. The GenomeWalker results were consistent with the interpretation of the multiple bands shown in Fig. 2 for strains D410 and D22 as deriving from multiple integrations of ICE6013.
Extrachromosomal circular forms of ICEs are thought to represent active intermediates between the integrated and self-transmissible forms of the elements (16). We sought to detect these molecules for ICE6013 by outward-directed PCR (Fig. 3A). Strain NCTC8325 provided a negative control for these studies, and the aroE gene was used to signify the presence of chromosomal DNA. No circular forms were detected from strain HDG2 (Fig. 3B). Strain EMRSA16 consistently produced a faint amplicon of a circular form from genomic DNA templates that was absent from the chromosome-free DNA templates. In contrast, circular forms were detected for strains C437, C316, D410, and D22 from both genomic and chromosome-free DNA templates (Fig. 3B); the circular forms in the chromosome-free DNA preparations were also detectable via Southern blotting for these four strains. These results demonstrated that ICE6013 can form extrachromosomal circular molecules.
A rescreening of all our ICE6013-positive S. aureus strains revealed that 17 of 19 (89%) diverse strains produced circular forms. In contrast, only 2 of 111 (2%) ST239 strains produced circular forms (Table 2). Importantly, Southern blotting detected the presence of a second copy of ICE6013 in the two ST239 strains that produced circular forms.
Direct sequencing of the outward-directed PCR products from both types of DNA preparations revealed the left and right ends of ICE6013 separated by a 3-bp coupling sequence (Fig. 3C). We note that the sequence traces from the right ends (O135) were consistently clean, whereas those from the left ends (O136) showed heterogeneity. Nonetheless, the 3-bp coupling sequence determined from the right ends of the circular forms (Fig. 3C) corresponded to one of the 3-bp sequences identified as a direct repeat flanking the chromosomal forms (Table 1). These results suggested that ICE6013 can be precisely excised in some cells.
We attempted to demonstrate the conjugative transfer of ICE6013 via filter mating and mixed-culture mating experiments. Using ST239 donors, which have a Tn552-encoded selectable marker on ICE6013, no transconjugants were observed.
Excluding the Tn552 insertion, our annotation of ICE6013 predicts 15 ORFs that span 92% of the sequence (Fig. 1). For eight of these ORFs, both a homolog and a conserved domain were detected through comparative sequence analysis (Table 3). Various features of these eight ORFs are described below.
Orf1 and Orf2 are most similar to putative integrases from Pseudomonas fluorescens strain Pf0-1 and Exiguobacterium sibiricum strain 255-15 (Table 3). They are predicted to be small cytoplasmic proteins, and they may be individually nonfunctional. The Tra8 domain, which is based on the IS30 family of transposases (Tpases), is partially present in both ORFs (Table 3). The signature DDE catalytic triad of the IS30 Tpase is present (Fig. 4A) only if sequences from orf1 and the intergenic region between orf1 and orf2 are both translated. Specifically, the first Asp residue of the DDE motif requires the translation of the intergenic region. We were unable to detect the dual helix-turn-helix motifs, which are found in the N-terminal region of the IS30 Tpase (53), in either ORF. These observations indicated that Orf1 and Orf2 might function differently than the IS30 Tpase.
As the upstream gene, orf2 could be an important regulator of Tpase activity. The 5′ end of orf2 has two poly(A7) tracts following the initiation codon. Single insertions and deletions in these tracts are predicted to result in a severely truncated Orf2 product. The MRSA252 annotation (36) makes note of a poly(A5) tract in the 3′ end of orf2. This tract could cause a fused peptide of Orf1 and Orf2 that expresses the entire DDE catalytic triad if transcriptional slippage by RNA polymerase and/or translational frameshifting by ribosomes were to occur. In ST239 strains, the Tn552 insertion occurs in the 3′ end of orf2 and results in a unique C-terminal Orf2 sequence of SLCLIYENYLTL instead of the IKYRLKKY sequence found in other strains; the insertion is also predicted to prevent a fused peptide of Orf1 and Orf2 from forming.
The location of Orf1 and Orf2 near the end of ICE6013 is of interest because this location is where the site-specific recombinases of ICEs are typically located (15, 16, 63). It was previously demonstrated that the IS30 Tpase from E. coli can fulfill Tyr/Ser recombinase functions in an experimental lambda phage system (43). The Tn552 disruption of orf2 and the lack of circular forms in all ST239 strains except two strains that have an additional copy of ICE6013 led us to hypothesize that the IS30-like Tpase is the recombinase for ICE6013. Moreover, our inability to demonstrate the conjugative transfer of ICE6013 with ST239 donors could conceivably be a result of the Tn552 insertion.
Orf3 is most similar to a putative lipoprotein from S. aureus strain Mu50 (Table 3). A type II signal peptidase is predicted to cleave adjacent to a signature Cys residue (65), liberating a 17-residue signal peptide with a sequence of MRRWFVLILGLVILLSA. The C-terminal end of lipoprotein signal peptides can serve as pheromones that induce conjugation by certain gram-positive plasmids; however, Orf3 is not similar to either of the two known S. aureus sex pheromones (29, 30).
Orf5 is most similar to TrsG from plasmid pV030-8 (Table 3), and it is closely related to TraG from conjugative plasmid pSK41. It is predicted to have one TM helix near its N-terminal end and a cysteine-, histidine-dependent amidase/peptidase (CHAP) domain across its C-terminal half. Since the CHAP domain occurs in proteins involved with peptidoglycan cleavage (7), we hypothesize that Orf5 has a role in localized cell wall lysis that permits the assembly of a mating apparatus.
Orf6, Orf7, and Orf8 are most similar to proteins from Bacillus and Clostridium spp. (Table 3). We noticed that with multiple ICE6013 query ORFs, among the top three hits from reciprocal BLASTP searches of GenBank were proteins encoded by linked genes from B. subtilis. Further inspection revealed that these B. subtilis genes were carried on the genetic element ICEBs1 (6). We therefore conducted pairwise BLASTP comparisons of all ICE6013 ORFs with all ICEBs1 ORFs and, for comparison, all Tn916 and Tn5801 ORFs (Fig. 5A). The gene content and organization of ICE6013 were most similar to those of ICEBs1 (Fig. 5A). Moreover, phylogenetic analysis of the five homologous ORFs common to all four genetic elements showed that ICE6013 was most closely related to ICEBs1 (Fig. 5B). These findings strengthened the case that ICE6013 represents a novel genetic element in S. aureus.
Orf6 is most similar to a hypothetical protein from Bacillus pumilus strain SAFR-032 (Table 3). It is predicted to be a large protein that may be secreted and inserted into the cell membrane. Seven TM helices are predicted within its N-terminal region, leading us to hypothesize that Orf6 serves as an important scaffolding component of a mating apparatus. The C-terminal one-third of the protein is predicted to be glutamine rich and contains similarities to portions of an RNase domain.
Orf7 is most similar to an FtsK-like protein from Clostridium acetobutylicum strain ATCC 824 (Table 3). It is predicted to be another large protein, which has two TM helices near its N-terminal end and an FtsK domain across its C-terminal half. Our annotation of Orf7 is consistent with that of a coupling protein (55), which serves to link processed donor DNA to a mating apparatus. Comparison of Orf7 with homologs from four other genetic elements (Fig. 5A) reveals its unique location in between Orf6 and Orf8 homologs.
Orf8 is most similar to the YddE protein from ICEBs1 of B. subtilis (Table 3). It is predicted to be a cytoplasmic protein and the largest protein encoded by ICE6013. This protein has a VirB4 domain across its C-terminal half. In other gram-positive conjugation systems (33), VirB4 domains are thought to use NTP hydrolysis to provide the energy necessary for DNA and/or protein transport through a mating apparatus.
Orf12 is most similar to the NicK relaxase from ICEBs1 of B. subtilis (Table 3). It is predicted to be a cytoplasmic protein with an RstA domain that spans most of the protein. The proposed active-site tyrosine of relaxases from ICEs (60) is present at position 190 in Orf12. The function of a relaxase is to make a single-stranded nick at the origin of transfer (oriT), which allows the unwinding and transfer of the donor DNA. Using the experimentally demonstrated oriT sequence from ICEBs1 as a query (47), we identified a predicted oriT sequence upstream from orf12 (Fig. 4B). ICE6013 and ICEBs1 contain this sequence very close to the relaxase gene and within the relaxase gene (47), respectively, whereas Tn916 and Tn5801 contain this sequence 100 bp or more away from their relaxase genes (Fig. 4B).
Given the role of ICEs in facilitating conjugation between bacteria, we investigated whether ICE6013 showed evidence of past recombinations. The pairwise homoplasy index test detected a very strong signal (P = 5.5E−17) of recombination in a 13,284-bp gap-free alignment of seven ICE6013 sequences. Blocks of polymorphisms revealed the mosaic character of the element (Fig. 6A). We used compatibility analysis to identify pairs of informative nucleotide sites with similar phylogenetic histories. A pair of informative sites are phylogenetically compatible only if three or fewer of four possible combinations of nucleotides are present; such sites fit a single underlying tree. Phylogenetic incompatibility can arise as a result of recombination or homoplasious mutation (i.e., convergent, parallel, and reverse mutation). To attempt to filter out blocks of potentially homoplasious mutations, we examined the 76 informative sites from third-codon positions and from intergenic regions. Blocks of compatible and incompatible sites were evident (Fig. 6B).
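The compatibility criterion stated above (three or fewer of the four possible nucleotide combinations) can be expressed in a few lines. The sketch below assumes that every site is biallelic and uses made-up toy sites; it illustrates the criterion only and is not the procedure implemented in Spectronet.

```python
from itertools import combinations


def compatible(site_i, site_j):
    """Two biallelic sites are compatible if at most three of the four
    possible allele combinations are observed across the sequences."""
    return len(set(zip(site_i, site_j))) <= 3


def compatibility_matrix(sites):
    """Return a symmetric matrix of pairwise compatibility values."""
    n = len(sites)
    matrix = [[True] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = compatible(sites[i], sites[j])
    return matrix


# Hypothetical data: each string is one informative site read across
# seven aligned sequences.
sites = ["AAAGGGG", "CCCTTTT", "AAGGAAG"]
matrix = compatibility_matrix(sites)
for row in matrix:
    print(row)
# [True, True, False]
# [True, True, False]
# [False, False, True]
```

The first two toy sites split the sequences in the same way and are compatible, whereas the third shows all four combinations with each of the others, which is the pattern that recombination or homoplasy would produce.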
Interestingly, the nucleotide sites from orf5 and orf8 showed 94.7% pairwise compatibility (Fig. 6C) despite the overall strong signal of recombination in ICE6013. In fact, the sites in orf8 were slightly more compatible with the sites in orf5 than with its own sites. The same pattern was observed for orf6 and orf7 (Fig. 6C). The Orf5 and Orf8 proteins were hypothesized to perform cell wall lysis and NTPase functions, respectively. Experimental evidence for protein-protein interactions between analogous proteins of other genetic elements was presented previously, including VirB1-VirB4 from the Ti conjugative plasmid of Agrobacterium tumefaciens (71) and Orf7-Orf5 from broad-host-range conjugative plasmid pIP501 (1). Although purifying selection could have maintained these compatibilities, the block-like boundaries between compatible and incompatible sites could also be explained by recombination hotspots.
ICE6013 was discovered in this study through PCR scanning of the ST30-like portion of the chromosome of ST239 strain HDG2. The element is related to the ICEBs1 family of genetic elements and represents only the second potential ICE known for staphylococci. ICE6013 probably has a different function than ICEBs1 because it has no detectable homologs to the Phr quorum-sensing system that regulates and is carried by ICEBs1 (6). As for the biological significance of ICE6013, one possibility is its contribution to the diversity of S. aureus strains. In addition to potential variations arising as a result of ICE6013 integrations into different loci, the element is predicted to encode several secreted and surface proteins that could contribute to phenotypic variation. ICE6013 can also carry Tn552, which encodes penicillin resistance.
It is clear that ICE6013 is active and widely dispersed within S. aureus strains. Interestingly, ICE6013 integration sites differ for the genome-sequenced strains FPR3757 and TCH1516. Both strains are of the ST8 clonal group and are representatives of the MRSA clone known as USA300, which is currently the main cause of community-associated MRSA infections in the United States (44). In a study reported previously by Highlander et al. (35), the most significant difference between strains FPR3757 and TCH1516 was identified as being a 13.4-kb insertion. Inspection of this region of the chromosome shows that it corresponds to ICE6013 (strain FPR3757 positions 1630722 to 1644075; strain TCH1516 positions 680918 to 694271). The contribution of ICE6013 to USA300 diversity becomes even more apparent when revisiting single nucleotide polymorphism (SNP) data for 10 other strains of this clone (42). In a study reported previously by Kennedy et al. (42), strain 18813 was the most divergent USA300 strain identified. Of the 408 SNPs that occur between reference strain FPR3757 and strain 18813, 68 SNPs (17%) are accounted for by ICE6013. The movement and sequence variation of ICE6013 within a virulent clonal group suggest that the element might have value as an epidemiological marker among closely related isolates.
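The coordinates and SNP counts quoted above can be checked with simple arithmetic. The sketch below assumes that the reported positions are 1-based and inclusive; the helper name is hypothetical.

```python
def span_length(start, end):
    """Length of a region given 1-based, inclusive coordinates (assumed)."""
    return end - start + 1


# Positions of ICE6013 as given in the text.
fpr3757_len = span_length(1630722, 1644075)
tch1516_len = span_length(680918, 694271)

# Share of the 408 SNPs between FPR3757 and 18813 that fall in ICE6013.
snp_share = 68 / 408

print(fpr3757_len, tch1516_len)
print(f"{snp_share:.1%}")
```

Under that assumption, both spans come to 13,354 bp, which agrees with the reported length of the element without Tn552 and with the 13.4-kb insertion noted by Highlander et al. (35); 68 of 408 SNPs is 16.7%, which rounds to the 17% given above.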
The detection of circular ICE6013 molecules and sequences consistent with precise excision points to a mechanism of integration. Site-specific recombination between the 3-bp direct repeats that flank the element could lead to a precise excision of the element. Furthermore, site-specific recombination between the 3-bp coupling sequence in the circular molecule and a target in the chromosome (possibly in a recipient cell) could lead to an integration of the element. For most ICEs, it is thought that their Tyr/Ser recombinases mediate such site-specific recombinations (15). Recently, novel DDE Tpases in S. agalactiae were found to be involved in the integration and circularization of two ICEs through an unknown mechanism (13). In addition, the IS30 Tpase of E. coli has been shown to be capable of substituting for site-specific recombination functions in an experimental lambda phage system (43). An IS30-like Tpase, encoded by orf1 and orf2, occurs at the expected location of a recombinase in ICE6013. Natural Tn552 disruption of orf2 is accompanied by a lack of ICE6013 circular forms in 98% of ST239 strains; the two ST239 strains that can produce circular forms have a second copy of ICE6013 in their chromosomes. The IS30-like Tpase of ICE6013 has unique features in comparison to the IS30 Tpase of E. coli, including the presence of a two-gene system and no detectable helix-turn-helix motifs. The two-gene system is of further interest because it suggests a transcriptional slippage and/or translational frameshifting mechanism for regulating integration.
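The proposed excision and integration steps can be pictured with a toy string model. This is only a sketch of the hypothesis stated above, not a description of the enzymatic mechanism: it assumes that excision leaves one copy of the 3-bp repeat in the chromosome and carries the other copy into the circle as the coupling sequence, and that integration occurs at a target identical to the coupling sequence. All sequences and function names are hypothetical.

```python
def excise(chromosome, element, repeat):
    """Excise one integrated copy flanked by direct repeats.

    Returns (chromosome_after, circle). The circle is written as a
    linear string, element followed by the coupling sequence, and
    should be read as circular.
    """
    integrated = repeat + element + repeat
    idx = chromosome.find(integrated)
    if idx < 0:
        raise ValueError("no copy of the element flanked by the repeat")
    after = chromosome[:idx] + repeat + chromosome[idx + len(integrated):]
    circle = element + repeat
    return after, circle


def integrate(chromosome, circle, repeat_len, position):
    """Integrate the circle at a target matching its coupling sequence."""
    element, coupling = circle[:-repeat_len], circle[-repeat_len:]
    target = chromosome[position:position + repeat_len]
    if target != coupling:
        raise ValueError("target does not match the coupling sequence")
    return (chromosome[:position] + coupling + element + coupling
            + chromosome[position + repeat_len:])


# Hypothetical toy sequences.
left, right = "GGCCGG", "CCTTCC"
repeat, element = "TAA", "ACGTTGCA"
donor = left + repeat + element + repeat + right

empty_site, circle = excise(donor, element, repeat)
restored = integrate(empty_site, circle, len(repeat), len(left))
print(empty_site, circle, restored == donor)
```

In this model, excision followed by integration at the same target regenerates the original arrangement, with the element again flanked by 3-bp direct repeats.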
ICE6013 appears to have a low integration site specificity within S. aureus, which is characteristic of DDE Tpases (13). Based on data from 12 S. aureus strains, ICE6013 has integrated into 11 unique sites that include five unique 3-bp direct repeats (Table 1). The data showed a nearly even distribution of intragenic and intergenic integration sites. Multiple copies of ICE6013 can occur in a single strain. In comparison, TnGBS2 from S. agalactiae encodes a DDE Tpase as its recombinase, but it preferentially integrates upstream of putative −35 promoter sequences; however, no significant sequence similarity was observed among the duplicated integration site sequences (13). A low integration site specificity for ICE6013 in S. aureus might enable this element to contribute significantly to diversity.
Sequence homologies indicate that ICE6013 may be conjugative or at least derived from a conjugative element. Although our experiments did not demonstrate a conjugative transfer of ICE6013, Tn552 may have disrupted essential functions. Unlike some other ICEs (15), ICE6013 does not have an obvious “cargo” region where selectable markers could be placed without disrupting predicted functions. Although DNA transfer and replication functions are encoded on elements that are merely mobilizable and on elements that are self-transmissible, mating-pair formation functions are encoded only on the latter elements (2, 63). Gram-positive conjugation systems are still poorly understood relative to their gram-negative counterparts (33), but attempts have been made to compile diagnostic sequence characteristics of these systems (33, 48, 63, 75) and to develop functional models (1). With ICE6013, Orf12 is predicted to be a relaxase, and Orf5, Orf6, Orf7, and Orf8 are predicted to contribute key mating-pair formation functions. In multiple gram-positive ICEs, the predicted coupling protein gene is situated upstream of the predicted relaxase gene (Fig. 5A), which could assist the coevolution of potentially interacting portions of these proteins (10). In ICE6013, the gene for the predicted coupling protein is between the genes for a predicted membrane scaffold and NTPase. This raises the question of whether the ICE6013 coupling protein and relaxase would be functionally compatible. One possibility is that ICE6013 serves as a transposable oriT site, which mobilizes the downstream chromosome only when a cognate coupling protein from another conjugative system is available in a cell.
Conjugation is a candidate mechanism for mediating large chromosomal replacements that have spawned clinically significant S. aureus clonal groups such as ST239 (58). Recent advances have been made in our understanding of mechanisms that prevent conjugation in staphylococci. For example, the SauI restriction-modification system was shown to be necessary and sufficient to prevent the conjugation of Tn918 from E. faecalis to S. aureus (70). Furthermore, loci known as clustered regularly interspaced short palindromic repeats were recently shown to prevent conjugation with plasmid DNA in Staphylococcus epidermidis (50). An additional line of investigation is needed to understand mechanisms that promote conjugation in staphylococci. Previous work on the conjugative transfer of antibiotic resistance genes between S. aureus strains that lacked detectable plasmids, and where transformation and transduction had been inhibited, hinted at the presence of conjugative elements in the chromosome (28). ICE6013 represents a demonstrably integrative genetic element for staphylococci that has hallmark components of a conjugative system.
We thank Shilpa Reddy for assistance with the screening studies. We thank Yves Le Loir and Dominique Lavenier for use of the GenoFrag computer program.
This work was supported by a grant from the American Heart Association and by NIH grant GM080602 (to D.A.R.).
Published ahead of print on 31 July 2009.
†Supplemental material for this article may be found at http://jb.asm.org/.