Conceived and designed the experiments: EA AR PM MB. Performed the experiments: MB PW. Analyzed the data: MB EA AR PM. Contributed reagents/materials/analysis tools: EA AR PM. Wrote the paper: MB EA AR PM PW.
Clostridium difficile is the leading cause of hospital-associated diarrhoea in the US and Europe. Recently the incidence of C. difficile-associated disease has risen dramatically and concomitantly with the emergence of ‘hypervirulent’ strains associated with more severe disease and increased mortality. C. difficile contains numerous mobile genetic elements, resulting in the potential for a highly plastic genome. In the first sequenced strain, 630, there is one proven conjugative transposon (CTn), Tn5397, and six putative CTns (CTn1, CTn2 and CTn4-7), of which, CTn4 and CTn5 were capable of excision. In the second sequenced strain, R20291, two further CTns were described.
CTn1, CTn2, CTn4, CTn5 and CTn7 were shown to excise from the genome of strain 630 and transfer to strain CD37. A putative CTn from R20291, misleadingly termed a phage island previously, was shown to excise and to contain three putative mobilisable transposons, one of which was capable of excision. In silico probing of C. difficile genome sequences with recombinase gene fragments identified new putative conjugative and mobilisable transposons related to the elements in strains 630 and R20291. CTn5-like elements were described occupying different insertion sites in different strains; CTn1-like elements that have lost the ability to excise in some ribotype 027 strains were described; and one strain was shown to contain CTn5-like and CTn7-like elements arranged in tandem. Additionally, using bioinformatics, we updated previous gene annotations and predicted novel functions for the accessory gene products on these new elements.
The genomes of the C. difficile strains examined contain highly related CTns suggesting recent horizontal gene transfer. Several elements were capable of excision and conjugative transfer. The presence of antibiotic resistance genes and genes predicted to promote adaptation to the intestinal environment suggests that CTns play a role in the interaction of C. difficile with its human host.
Clostridium difficile is an anaerobic spore-forming bacterium that can be part of the normal gut flora in healthy individuals. Antibiotic treatment disrupts the microbial community in the gut, providing an opportunity for C. difficile to compete with the other species and induce disease by toxin production. The C. difficile toxins affect gut epithelial cells and result in symptoms ranging from mild diarrhoea to the potentially fatal condition, pseudomembranous colitis. Although toxins A and B are the main virulence factors known for C. difficile, the role of other factors, such as adhesins and other toxins, and the mechanisms by which these virulence factors are regulated, remain to be determined.
Once considered relatively rare, C. difficile-associated disease (CDAD) has increased in incidence globally since the turn of the century. A number of explanations for the increase have been proposed, including the emergence of so-called ‘hypervirulent’ strains, especially those belonging to ribotype 027/North American PFGE type NAP1, which are associated with more severe disease, higher rates of mortality, higher relapse rates and increased resistance to fluoroquinolones. Whilst ribotype 027/NAP1 strains have received much attention, in other countries different ribotypes have emerged (e.g. 078) and these may also have the potential to cause severe disease.
C. difficile 630, a strain isolated in 1982 from a hospital patient with severe pseudomembranous colitis, was the first strain to be fully sequenced. It had previously been shown to contain the conjugative transposon Tn5397, also referred to as CTn3, and the mobile element Tn5398, providing the host with tetracycline and erythromycin resistance, respectively. Full genome annotation revealed that strain 630 contains additional mobile genetic elements including bacteriophages, IS elements, IStrons (a chimera of an IS element and a group I intron) and putative conjugative transposons: CTn1, CTn2, CTn4, CTn5, CTn6 and CTn7. These elements have recently been given the prefix CDCTn; however, no justification for this renaming was presented. C. difficile strain R20291 (ribotype 027), the index isolate in an outbreak at Stoke Mandeville Hospital, UK in 2006, has recently been sequenced and was also shown to contain a large number of putative mobile genetic elements, of which two putative conjugative transposons differ significantly from related elements in 630.
Conjugative transposons are mobile genetic elements capable of integration into and excision from the host genome and of conjugational transfer by means of proteins encoded by genes on the element. Additionally, they contain accessory genes that are not involved in transfer and which often encode functions that contribute to the environmental adaptability of the host cell, commonly antibiotic resistance-conferring proteins. In this work, searches of the C. difficile genome sequences available at NCBI using a library of recombination genes identified new putative conjugative and mobilisable transposons. We have examined the genome sequences of ten C. difficile strains including R20291, a recent UK ribotype 027/NAP1 isolate, four recent ribotype 027 isolates, one recent 078 ribotype isolate and one ribotype 001 isolate all from Quebec, Canada, as well as two historical ribotype 027 strains isolated in France and Canada (CD196 and QCD-76W55, respectively). The ability of the newly discovered elements to excise from the host genome was investigated, as was the mobility of the previously described elements in strains 630 and R20291. In addition, using a selection of bioinformatics programs, we predict the potential function of some of the accessory gene products carried by these transposons.
Strain 630 has six putative conjugative transposons: CTn1, CTn2, CTn4, CTn5, CTn6 and CTn7, of which CTn4 and CTn5 have been shown to excise from the genome. In order to determine whether the other putative elements are capable of excision, specific oligonucleotide pairs were used to PCR amplify the element-genome junctions, the joints of the element in a circular form (the transposition and conjugal intermediate) and the regenerated target site after excision. Figure 1 shows the experimental details. PCR products were produced with CTn1, CTn2 and CTn7, and sequencing showed the element-genome junction, the empty target site in the chromosome after excision and the joint sequence in the circular molecules (Figure 2). This analysis allowed the ends of the various elements to be defined (Figure 2 and Table 1). CTn1 was delineated by a 6-bp direct repeat, one copy of which was present in the joint of the circular form and in the empty target after excision of the element (Figure 2a). CTn1 contains a tyrosine integrase (CD0355) which, together with the excisionase (CD0356) (Figure 3), is likely to be responsible for excision.
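The three PCR targets described above (element-genome junctions, joint of the circular form, and regenerated target site) can be illustrated with a minimal sketch. It assumes the simplest case reported here for CTn1: an element flanked by perfect direct repeats, with one copy of the repeat ending up in the circular joint and one in the empty target site. The function names and the toy sequences are hypothetical and are not taken from the study; imperfect repeats such as those of CTn2 and CTn7 are deliberately not modelled.

```python
def excise(integrated, repeat_start, repeat_len, body_len):
    """Model excision of an element flanked by perfect direct repeats.

    Layout assumed for `integrated`:
        upstream | repeat | element body | repeat | downstream
    Returns (circular, empty_target), where `circular` is the circular
    molecule written linearly from the first base of the element body.
    """
    left_end = repeat_start + repeat_len
    body_end = left_end + body_len
    right_end = body_end + repeat_len
    left_repeat = integrated[repeat_start:left_end]
    right_repeat = integrated[body_end:right_end]
    if left_repeat != right_repeat:
        # Imperfect repeats are outside the scope of this sketch.
        raise ValueError("flanking repeats differ; perfect repeats only")
    body = integrated[left_end:body_end]
    # One repeat copy leaves with the element and closes the circle.
    circular = body + right_repeat
    # The other copy stays behind in the regenerated target site.
    empty_target = integrated[:repeat_start] + left_repeat + integrated[right_end:]
    return circular, empty_target


def joint_window(circular, repeat_len, flank):
    """Sequence across the circular joint: end of body, repeat, start of body."""
    body_len = len(circular) - repeat_len
    return circular[body_len - flank:] + circular[:flank]


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    upstream, repeat = "AAAACCCC", "GATTCA"
    body, downstream = "TTGGCCAATTGG", "CCCCAAAA"
    integrated = upstream + repeat + body + repeat + downstream
    circular, empty = excise(integrated, len(upstream), len(repeat), len(body))
    print("joint :", joint_window(circular, len(repeat), 4))
    print("target:", empty)
```

In this model, a primer pair reading outwards from the two ends of the element would only give a product across the joint once the circle has formed, and a pair in the flanking chromosome would only give a short product once the element has left.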
The boundary of CTn2 is defined by an imperfect direct repeat with the 8-bp sequence on the left end of the element present in the joint of the circular form, and the 9-bp sequence on the right end of the element left in the empty target site after excision (Figure 2b). There is one large serine recombinase (CD0436) in this element which is likely to be responsible for excision, although the exact mechanism requires further investigation. Our analysis also shows that CD0404 to CD0406 are not part of the transposon and remain in the chromosome after excision (Figure 4), demonstrating that the ends of CTn2 are not as previously reported.
CTn7 is flanked by 15-bp imperfect direct repeats (Figure 2c). The sequence on the right end of the element is present in the joint of the circular form and the sequence on the left end of the element is found in the empty target site after excision. The element contains one large serine recombinase (CD3370) which is likely to be responsible for the excision reaction.
CTn6 is the only putative conjugative transposon in strain 630 for which no joint of a circular form or empty target site could be detected. Results are summarised in Table 1.
Strain R20291 (ribotype 027) contains two putative conjugative transposons that are variants of elements found in strain 630. One of these has a structure comparable to CTn1 in strain 630, the main difference being the accessory module of the elements (Figure 3). Excluding the accessory module, the remainder of the two elements shows at least 82% nucleotide sequence identity. We could not detect excision of the element in R20291, possibly because of a deletion of three ORFs, including the putative xis gene, compared to CTn1 in 630 (Figure 3). Furthermore, this element was found to have integrated into a different site within the genome of R20291 when compared to 630, integrating between ORFs 3452 and 3476 in R20291, genes encoding a hypothetical protein and a putative transcriptional regulator, respectively (ORFs 3452 and 3476 in R20291 are homologues of CD3614 and CD3615 in strain 630). In 630, CTn1 is integrated into CD0354, a gene encoding a hypothetical protein; an uninterrupted homologue of this gene is present in R20291.
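Identity figures of the form "at least 82%, excluding the accessory module" depend on which alignment columns are counted. The sketch below shows one way such a masked identity could be computed from two pre-aligned sequences. It is an illustration only: the function name, the convention of skipping gapped columns, and the half-open `(start, end)` column ranges are assumptions, not the procedure used in the study.

```python
def percent_identity(aligned_a, aligned_b, exclude=()):
    """Percent identity between two pre-aligned sequences.

    `exclude` is an iterable of half-open (start, end) alignment-column
    ranges to ignore, e.g. the columns covering an accessory module.
    Columns where either sequence has a gap ('-') are skipped (assumed
    convention; other definitions count gaps as mismatches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for i, (a, b) in enumerate(zip(aligned_a, aligned_b)):
        if any(start <= i < end for start, end in exclude):
            continue  # masked region, e.g. a divergent accessory module
        if a == "-" or b == "-":
            continue  # gapped column, not scored in this sketch
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no columns left to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    a = "ACGTACGTAC"
    b = "ACGTTCGTAC"
    print(percent_identity(a, b))
    print(percent_identity(a, b, exclude=[(4, 5)]))
```

Masking the divergent region raises the reported identity, which is why the text states explicitly which module was excluded from each comparison.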
The second putative conjugative transposon in R20291 has been previously reported as CTn027, and an insertion in the element was erroneously called the Stoke Mandeville phage island, SMPI. There is no evidence of a phage within this element or that the element itself is a prophage, and we renamed it Tn6103 as it fits the criteria for a conjugative transposon according to the transposon registry guidelines. This element is similar to CTn5 in strain 630, having at least 85% nucleotide identity along most of its length; however, it contains three insertions which are probably mobilisable transposons (see below), two of which are inserted within ORF 1743 and one within ORF 1776. These elements have been named Tn6104, Tn6105, and Tn6106 (Figure 5). All three elements contain a recombinase gene; however, excision and circularisation have been demonstrated only for Tn6104 and for the composite element itself, Tn6103 (Figure 2e). Tn6104 contains 21 ORFs and is flanked by a 2-bp direct repeat which is also present in the circular form of the element and the empty target site after excision (Figure 2e). The recombinase of this element, ORF 1744, is a member of the family of large serine recombinases and is related to TnpX from the mobilisable transposons Tn4451 and Tn4453 of Clostridium perfringens and C. difficile, respectively. TnpX also has a 2-bp target site. Tn6104 has another gene product (encoded by ORF 1745) which is 48% identical at the amino acid level to TnpV of Tn4451, postulated to be involved in excision of Tn4451 based on homology with λ Xis. Another similarity between Tn6104 and Tn4451 is the mobA/mobL mobilisation gene (ORF 1758) which is present in the same orientation at the right end of the element.
In contrast to the single accessory gene, catP in Tn4451, Tn6104 contains several accessory genes with the potential to encode a putative transcriptional regulator (ORF 1747), a two component regulatory system (ORFs 1748 and 1749), an ABC transporter (ORFs 1750, 1751 and 1752), three sigma factor-like proteins (ORFs 1754, 1755 and 1756), a putative toxin-antitoxin system (ORFs 1759 and 1760) and a phage-associated protein (ORF 1762) (see Table S1 and section on predicted accessory gene function below).
Excision of Tn6105 and Tn6106 was not detected. Tn6105 consists of 11 ORFs and contains two putative large serine recombinase genes situated in the centre of the element (ORFs 1771 & 1772, Figure 5). Other putative genes are tnpV (ORF 1765), as well as a mobilisation protein (ORF 1768), a predicted sigma factor (ORF 1773) and a predicted orphan response regulator (ORF 1775) (Table S1). Tn6106 consists of 11 ORFs and contains a single large serine recombinase gene on the right side of the element (ORF 1788, Figure 5). Other genes encode a TnpV homologue (ORF 1777), a predicted mobilisation protein (ORF 1784) and a predicted transcriptional regulator (ORF 1778) (Table S1).
Tn6103 itself is flanked by perfect 5-bp direct repeats and one of these is present in the joint of the circular form and in the empty target site in the genome after excision, identical to CTn5 in 630 (Figure 2d). The element contains a large serine recombinase which is likely to be responsible for its excision.
To study the transfer of the putative conjugative transposons, the ClosTron system was retargeted to ORFs within the accessory module of each element that were predicted not to be involved in conjugation.
The transposons marked with the ClosTron could still excise from the genome, as determined by PCR. Filter mating assays were performed using strains containing a marked element as donor and C. difficile CD37 as recipient. Transconjugants were screened by PCR for the presence of the inserted ClosTron, as well as the absence of the PaLoc (to confirm them as strain CD37). All five marked elements from strain 630 transferred into the recipient strain CD37 at frequencies between 10⁻⁴ and 10⁻⁹ (Table 1). Transfer of Tn6103 from strain R20291 to CD37 was not detected, indicating either that the element cannot transfer into CD37 or that it does so at a frequency below the detection limit.
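For readers unfamiliar with how a transfer frequency and its detection limit relate, the following sketch shows the arithmetic under one common convention: transconjugants per donor cell, both estimated from plate counts. The text above does not state which denominator was used, so the per-donor convention, the function names and all numbers below are assumptions for illustration only.

```python
def cfu_per_ml(colonies, dilution_factor, volume_plated_ml):
    """Colony-forming units per ml of the original suspension."""
    if volume_plated_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / volume_plated_ml


def transfer_frequency(transconjugants_per_ml, donors_per_ml):
    """Transconjugants per donor (assumed convention, see lead-in)."""
    if donors_per_ml <= 0:
        raise ValueError("donor count must be positive")
    return transconjugants_per_ml / donors_per_ml


def detection_limit(donors_per_ml, volume_plated_ml):
    """Lowest measurable frequency: one colony on an undiluted plate."""
    return cfu_per_ml(1, 1, volume_plated_ml) / donors_per_ml


if __name__ == "__main__":
    # Hypothetical plate counts, not data from the study.
    donors = cfu_per_ml(colonies=200, dilution_factor=10**6, volume_plated_ml=0.1)
    transconjugants = cfu_per_ml(colonies=20, dilution_factor=1, volume_plated_ml=0.1)
    print("frequency      :", transfer_frequency(transconjugants, donors))
    print("detection limit:", detection_limit(donors, 0.1))
```

Under this convention, an element that yields no transconjugant colonies can only be said to transfer at a frequency below the detection limit, which is the interpretation offered for Tn6103.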
The nucleotide sequences of the genes encoding serine- or tyrosine-based recombinases associated with conjugative transposons, plus those phylogenetically-related genes present in bacterial genomes, were downloaded from Genbank at the NCBI (see materials and methods for more details). These sequences, together with the sequences of the ORFs present on the CTns in strain 630, were used in BLAST searches of the C. difficile genome sequences available at NCBI. This analysis allowed the identification of the novel putative mobile elements summarised in Table 1. The Artemis Comparison Tool (ACT) was used to compare the novel elements to the previously identified conjugative transposons in strains 630 and R20291. ORFs annotated as hypothetical proteins in genome sequencing projects were analysed using selected bioinformatics programs (see materials and methods) (full data provided in Table S1).
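A search of this kind produces many hits that then have to be reduced to candidate loci. The sketch below shows one way to filter BLAST tabular output (the standard 12-column format produced by BLAST+ with `-outfmt 6`) and group surviving hits by contig. The thresholds, the function name and the example identifiers are hypothetical; the study does not report the cut-offs it applied.

```python
from collections import defaultdict

# Column order of the default BLAST+ tabular format (-outfmt 6).
BLAST_COLUMNS = (
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
)


def candidate_hits(lines, min_identity=70.0, min_length=200, max_evalue=1e-10):
    """Group recombinase hits by subject contig.

    Returns {contig: [(start, end, query, percent_identity), ...]} with
    coordinates normalised so that start <= end on either strand.
    All thresholds are illustrative defaults, not values from the study.
    """
    loci = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        fields = line.split("\t")
        if len(fields) != len(BLAST_COLUMNS):
            continue  # not a 12-column tabular record
        row = dict(zip(BLAST_COLUMNS, fields))
        identity = float(row["pident"])
        if identity < min_identity:
            continue
        if int(row["length"]) < min_length:
            continue
        if float(row["evalue"]) > max_evalue:
            continue
        # Minus-strand hits report sstart > send; normalise the order.
        start, end = sorted((int(row["sstart"]), int(row["send"])))
        loci[row["sseqid"]].append((start, end, row["qseqid"], identity))
    return {contig: sorted(hits) for contig, hits in loci.items()}


if __name__ == "__main__":
    # Invented records for illustration only.
    example = [
        "rec_ser_1\tcontig_A\t85.0\t900\t135\t0\t1\t900\t5000\t5899\t1e-150\t800",
        "rec_tyr_1\tcontig_A\t60.0\t900\t360\t0\t1\t900\t100\t999\t1e-40\t150",
        "rec_tyr_1\tcontig_B\t92.5\t400\t30\t0\t1\t400\t2400\t2001\t1e-90\t500",
    ]
    for contig, hits in candidate_hits(example).items():
        print(contig, hits)
```

Each retained locus would still need manual inspection, for example with a genome comparison viewer, to decide whether the recombinase sits at the end of a plausible element.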
Diverse variants of CTn1 were found in all the C. difficile genomes that were searched (Table 1). All strains contain an element sharing 98% nucleotide identity with the element described in R20291 (see above) (Figure 3). Additionally, all of these elements are present in the same target site as the CTn1-like element in R20291 and, in common with this element, do not have an excisionase (xis) homologue. The elements of strains QCD-32G58 and QCD-66C26 were analysed for excision from the genome but no PCR product for the joint of the target site or the circular form could be amplified, presumably due to the lack of a functional Xis. All these strains are ribotype 027/NAP1 suggesting that a CTn1-like element transferred into the ancestor of the modern ribotype 027/NAP1 strains where it suffered a deletion of xis, fixing it within the host chromosome. The accessory modules of these elements contain predicted ABC transporters. Other accessory gene products encoded by these elements are shown in Table S1 and some are discussed in the section on accessory gene function below.
Strain QCD-23M63 (ribotype 078) contains a CTn1-like element (75–99% nucleotide sequence identity with CTn1 in 630, excluding the accessory module) which we have named Tn6073 (Figure 3). The element is flanked by 7-bp direct repeats, one of which is present in the joint of the circular form and one in the empty target site in the genome after excision (Figure 6a). The element contains a tyrosine recombinase and excisionase which together are likely to be responsible for its excision. Tn6073 is located in a different target site from CTn1 in 630: it is inserted between homologues of the 630 genes CD0651 and CD0652, which are predicted to encode a membrane protein and a transcriptional regulator, respectively. The accessory module of the element consists of genes encoding a predicted N-terminal hydrolase, a sigma factor and an ABC transporter (see Table S1 and section on accessory gene function below). Compared to CTn1 in 630, there are three insertions in the conjugation module which contain hypothetical genes (Figure 3).
Strain QCD-63Q42 (ribotype 001/NAP2) contains two CTn1-like elements, the first of which is inserted between homologues of CD1565 and CD1564 and has a minimum of 73% nucleotide sequence identity (excluding insertions) with CTn1 of strain 630. An insertion of approximately 20-kb between homologues of CD0386 and CD0383 in this CTn1-like element contains a sequence with on average 92% sequence identity with prophage 1 of strain 630. However, until the sequence gaps either side of the partial phage are filled, it is not possible to say unequivocally that this is the actual insertion site of this element (Figure 7). Another interesting feature of this element is the fact that the accessory module is 33-kb (compared to 9.5-kb in strain 630) and includes genes predicted by bioinformatics analysis to encode an ABC transporter, two sigma factors and a transcriptional regulator (Table S1). The second CTn1-like element in strain QCD-63Q42 is inserted between homologues of CD1807 and CD1806. Two insertions in the conjugation module include genes encoding a putative alpha/beta hydrolase, a lactoylglutathione lyase, a group II intron reverse transcriptase, as well as several hypothetical proteins. The accessory module contains two putative ABC-transporter genes and four transcriptional regulators of which one is predicted to be a sigma factor, as well as several hypothetical proteins (Table S1). Both elements contain an intact xis homologue and a complete tyrosine recombinase suggesting that they can excise from the genome, although this has not been investigated.
Strain ATCC-43255 (formerly VPI 10463) contains a CTn1-like element (at least 73% nucleotide sequence identity with CTn1 from 630 excluding the accessory module) inserted between homologues of the 630 genes CD1234 and CD1235, two hypothetical genes within the prophage-like skinCD element, which is itself inserted into the sigma K (σk) gene, involved in sporulation. The skinCD element was shown to excise from σk during sporulation, forming a circular molecule. However, the skinCD element itself was not characterised in that study and therefore the presence of the CTn1-like element within skinCD was not detected. We could not detect excision of the CTn1-like element from skinCD by PCR. Taken together, our results and those of Haraldsen et al. demonstrate that the presence of the CTn1-like element within skinCD does not prevent its excision and does not prevent sporulation in this strain.
Several variants of CTn5 were found in the C. difficile genomes that were examined. Strains QCD-37X79, QCD-66C26 and QCD-32G58 (all ribotype 027) contain an element 99% identical at the nucleotide level to the CTn5-like element Tn6103 in R20291. However, in all three strains only Tn6105 is present in the homologue of R20291 ORF 1743, and Tn6104 and Tn6106 are absent (Figure 8). We have demonstrated excision of the elements in strains QCD-66C26 and QCD-32G58 (Figure 6), and the elements were designated Tn6110 and Tn6111, respectively. Although the element in R20291 has inserted in the same target site as CTn5 in strain 630 (within a homologue of CD1844), in strains QCD-66C26 and QCD-32G58 it has inserted at a different site (between homologues of CD3369 and CD3393 encoding a hypothetical protein and putative RNA methyltransferase, respectively). Interestingly, this is the same target site occupied by CTn7 in 630. Tn6110 and Tn6111 are flanked by 5-bp sequences identical to those of CTn5 in 630 and Tn6103 in R20291 (Figure 6c, d). Although the genome of QCD-32G58 was assembled, there are still gaps in the sequence and the contigs on which Tn6111 is present have not been joined. However, the fact that the circular form of the element and the empty target site were detected indicates that a functional element is present.
Strain QCD-63Q42 contains an element with a similar structure to CTn5 in strain 630 (Figure 8), including the accessory module. However, the region homologous to CD1863 through to CD1870, encoding conjugation functions in CTn5 in strain 630, has been replaced with a region containing several hypothetical genes and genes encoding restriction modification proteins. Excluding the insertion, the CTn5-like element in strain QCD-63Q42 shares on average 88% identity with CTn5 in 630 and is inserted in the target site of CTn7, between homologues of CD3369 and CD3393, a hypothetical gene and a gene encoding a putative RNA methyltransferase, respectively. A CTn7-like element is present in tandem with the CTn5-like element in this strain. This element shares 97% nucleotide identity with CTn7 of strain 630, although it has a 0.8-kb insertion containing a predicted transmembrane protein in the intergenic region between the homologues of CD3389 and CD3390.
An element related to the CTn5-like element in strain QCD-63Q42, named here Tn6107, is present in strain QCD-23M63 (at least 80% sequence identity to CTn5 excluding inserted region and accessory module) (Figure 8). In common with the CTn5-like element in QCD-63Q42, the conjugation region (CD1863–CD1870) has partially been replaced with a segment containing several genes encoding either hypothetical or restriction modification proteins. In addition, the accessory module is replaced with genes encoding hypothetical proteins, putative transcriptional regulators and ABC-transporters (see Table S1). The element is present between homologues of CD3369 and CD3393, the target site of CTn7 in 630. The joint of the circular form and empty target site have 7-bp imperfect repeats, one copy of each is present on either side of the element in the integrated state (Figure 6b).
Strain QCD-23M63 (ribotype 078) contains a CTn4-like element (Figure 9) that has between 95 and 98% sequence identity with the element in strain 630 but with a deletion of ORFs CD1103–CD1105 (6 ORFs in total) and an insertion between the homologues of CD1106a and CD1107 consisting of genes encoding a putative histone acetyltransferase (ORF 4895) and a hypothetical protein. The element is inserted in the homologue of CD1036 in 630, a putative cell surface protein.
Strain ATCC 43255 contains a putative mobilisable transposon highly related to Tn5398 in strain 630 (99% nucleotide sequence identity excluding the ermB cassette (see below)) and present in an identical target site (Figure 10a). The sequence between the two direct repeats of the ermB cassette in Tn5398 is not present in strain ATCC 43255; in its place is an ORF encoding a protein that is predicted to be secreted by virtue of an N-terminal export signal.
Strain QCD-63Q42 (ribotype 001) contains a novel 15-kb putative mobilisable transposon, designated Tn6115 (Figure 10b), encoding several hypothetical gene products, a predicted ABC transporter (ORFs 7732 and 7737), a protein containing a predicted virulence-associated E domain [PFAM: PF05272] (ORF 7777) and a two component system (ORFs 7742 and 7747). A serine recombinase is predicted to be responsible for the potential excision of this element which is flanked by a GG dinucleotide direct repeat.
In an effort to predict the role of the accessory regions of the conjugative transposons in the biology of C. difficile, we carried out a bioinformatics analysis of the predicted gene products encoded in the accessory regions using PSI-BLAST. A limitation of PSI-BLAST is that the hits are listed according to their mathematical scores and not according to biological function. Therefore, we applied BYPASS, a program that uses fuzzy logic to rearrange the output from PSI-BLAST, putting in top position proteins with additional similarity in hydropathic profile, flexibility profile, amino acid composition, and length of the matched amino acid stretch, parameters which contribute to the accuracy of the functional prediction. We then searched the BYPASS output for hits with experimental evidence of function. To corroborate the function suggested by this analysis, we used P-SORT, PRO-DOM and SMART to predict the cellular location, the presence of signatures of protein families and domains, transmembrane regions and secretion signals. The analysis identified new potential functions for several genes annotated in the 630 genome sequence as hypothetical proteins (Table S1).
The analysis suggests that most of the accessory genes on the CTns in strain 630 encode ABC transporters and efflux systems which may function in resistance to antimicrobial peptides. For example, CD0363–0365 on CTn1 encodes a predicted ABC transporter consisting of two different transmembrane domains and two ATP-binding domains each present on individual polypeptides. CD1095–1097 on CTn4 encodes a predicted transporter consisting of two different transmembrane domains and a single ATP-binding domain. Experimental evidence of function is available for two hits in the BYPASS analysis, the plasmid-encoded BcrA and B proteins which comprise an ABC transporter mediating bacitracin resistance in Enterococcus faecalis. The ATP-binding component, BcrA, shares 52% identical amino acids with both CD0366 and CD1097, whereas the transmembrane protein BcrB shares 21% sequence identity with CD0365, CD1095 and CD1096. The ATP-binding domain of the ABC transporter (CD1349) mediating resistance to the cationic antimicrobial peptides nisin and gallidermin shares 32–36% amino acid identity with the predicted ATP-binding subunits encoded by CD0366 and CD1097, and the predicted transmembrane domains encoded by CD0365 and CD1095 share 15–22% identity with CD1350. The similarity with functionally characterised ABC transporters suggests that CD0363–0365 on CTn1 and CD1095–1097 on CTn4 may be involved in the export of antimicrobial peptides, a function likely to be important for intestinal colonisation.
Other accessory proteins carried by CTns in 630 that may function in interaction with the human host include a protein predicted to be surface-located by virtue of the presence of an N-terminal signal sequence and a C-terminal LPXTG membrane-anchoring domain. This protein (encoded by CD0386 on CTn1) was incorrectly annotated in the 630 genome sequence as a “collagen binding protein” because of the presence of B region domains which are repeated 7 times in the Staphylococcus aureus collagen-binding surface protein, Cna. The B regions do not bind collagen, however; this is the function of the A domain. The B domains in Cna are thought to serve as a stalk that projects the A region from the cell surface, facilitating its interaction with collagen. There are two B repeats predicted in CD0386, but no A domain, and no other ligand-binding domain is identifiable using prediction tools. Similar LPXTG-linked proteins containing Cna B repeats are present on the CTn1-like elements in R20291 (gene 3453, 100% identical) and on all the CTn1-like elements identified, as well as in CTn7 in strain 630 (CD3392, 95% identical). Interestingly, a membrane-anchored protein containing eight Cna B-type domain repeats and a predicted intimin/invasion domain, suggestive of a function in adhesion, is present on an integrated conjugative element in a strain of Streptococcus pyogenes. As well as genes encoding putative ABC transporters, the novel CTn1-like elements carry other accessory genes with the potential to influence the ability of C. difficile to adapt to the human host. The CTn1-like element in QCD-23M63 carries a gene (3082), the predicted product of which gives highly significant PSI-BLAST scores with proteins belonging to the family of N-terminal (Ntn) hydrolases that includes bile salt hydrolases, β-lactam acylases and N-acyl homoserine lactone acylases (Table S1); however, none of the hits identified by BYPASS are experimentally verified.
A gene product that is 71% identical to the product of 23M63_3082 is carried by CTn6 in strain 630 (CD3331). Alignment of the predicted proteins of 23M63_3082 and 630_CD3331 with an experimentally proven conjugated bile acid hydrolase from C. perfringens [Swiss-Prot:P54965] and with a known penicillin G amidase from Bacillus sphaeroides [Swiss-Prot:P12256] shows low level identity (14% and 12% identical amino acids, respectively).
An intriguing finding of our computational analysis is that many of the accessory genes on CTns in C. difficile are predicted to encode proteins with sequence similarity to predicted sigma factors and include a ‘helix-turn-helix’ motif involved in binding the conserved −35 region of promoters in DNA (Table S1). This is the only recognisable domain in TcdR, which has been proven experimentally to function as an alternative sigma factor in toxin gene expression. Other potential sigma factors containing this domain are present on the CTn1-like elements in strains 43255, R20291, 23M63 and 63Q42 and on the CTn5-like elements in R20291 and 23M63 (Table S1). It will be interesting to determine if these gene products are able to recruit core RNA polymerase and bind to promoters within the element and/or to promoters within the recipient genome, and whether this influences the recipient cell transcriptome. In the CTn5-like element in R20291 (Tn6103), there are three tandem genes predicted to encode sigma factors (R20291_1754, 1755, and 1756) and a fourth predicted sigma factor encoded by R20291_1773 (Figure 5).
In addition to these putative sigma factors, the CTn5-like element in R20291 contains three genes predicted to encode transcriptional regulators, including two (R20291_1747 and 1780) that contain a predicted helix-turn-helix motif of type HTH_XRE, found in a family of DNA binding proteins that includes a bacterial plasmid copy control protein and various bacteriophage transcription control proteins [PFAM:PF01381], and one of which (1747) is related (30% identity) to the predicted transcriptional repressor within the regulatory region of Tn916 (Orf 9). In addition to these transcriptional regulators, the CTn5-like element in strain R20291 contains a predicted two-component system (R20291_1748 and 1749) as well as an additional orphan response regulator (R20291_1775). Although genes encoding putative transcriptional regulators do occur in the accessory regions of other CTns in C. difficile (for example, predicted transcriptional regulators on CTn6 (CD3334), and CTn7 (CD3376)), the presence of so many putative transcriptional regulators on a single element in strain R20291 is intriguing.
In this study, we have shown that C. difficile genomes contain novel putative conjugative and mobilisable transposons related to the elements that were previously described for strains 630 and R20291. A library of conserved sequences of recombinase genes was compiled from Genbank and used to search for putative recombinases in recently sequenced C. difficile genomes. Aligning contigs from these genome sequences with the genome sequences of strain 630 and R20291 showed that 18 novel putative elements were present in the 9 different genomes. Most of these elements have a similar structure to CTn1 or CTn5 of strain 630.
A recent comparative genomic hybridisation study by Marsden et al. of 94 clinical strains of various ribotypes isolated predominantly in the UK and the Netherlands reported that CTn1 was absent or highly divergent in the majority of ribotype 027 and 001 strains and in all ribotype 078 and 015 strains. However, this conclusion was based on probes specific for the divergent accessory module (Marsden, personal communication). In contrast, we show that the core regions of CTn1-like elements, i.e. the conjugation and integration/excision modules, are present in all the strains in our collection including all five recently isolated ribotype 027 strains. This underlines the need for care when making conclusions about the presence or absence of particular integrative elements in genomes. Given the modular nature of these elements and the fact that the accessory regions are often divergent, it is important to be clear which modules have been specifically tested for.
In a comparative genome analysis, Stabler et al. previously reported two unique conjugative transposons in strain R20291 that were absent in strain 630. One of these transposons, referred to as CTn027 by Stabler et al. and renamed Tn6103 here, was reported to contain a single 20-kb phage island, which they termed SMPI. We have shown that, rather than a single large insertion, Tn6103 contains three distinct insertions that are likely to be mobilisable transposons; they have therefore been named Tn6104, Tn6105 and Tn6106. We have demonstrated excision of Tn6104 and shown that some ribotype 027 strains contain an element that is related to Tn6103 but lacks the Tn6104 and Tn6106 insertions.
Excision from the genome to a circular intermediate is a prerequisite for conjugal transfer. Circular molecules were demonstrated in this study for several of the previously described elements as well as for some of the novel elements. To determine whether the elements were capable of conjugal transfer, they were marked with an antibiotic resistance gene, and conjugative transfer of CTn1, CTn2, CTn4, CTn5 and CTn7 from strain 630ΔErm to CD37 was demonstrated. Although we detected a circular form of Tn6103 in R20291, transfer of this element to CD37 was not demonstrated. This is possibly because of the insertion of the mobilisable transposons Tn6104, Tn6105 and Tn6106 in the conjugation module.
It is interesting to note that many of the putative mobile elements have been conserved, with many exhibiting variation only in the module of accessory genes. We have attempted to gain insight into the functions of the genes in these modules using a computational approach. Our analysis suggests that the majority of accessory genes carried on CTns in C. difficile encode ABC transporters and efflux systems presumed to function in resistance to antimicrobial peptides, produced either by the host innate immune response, or by microbial competitors in the intestinal niche. In addition, we have shown that some elements carry genes with the potential to encode bile salt hydrolases which could contribute to the ability of the bacterium to adapt to the human host. A secreted protein which appears to have a stalk-like structure that projects it away from the cell surface is also worthy of further investigation since it is likely to be involved in the interaction with the human host. Perhaps the most interesting finding of our study is that several of the CTns encode sigma factor-like proteins and transcriptional regulators.
Investigating when the accessory proteins and the excision and transfer proteins are expressed will be the next step in understanding the function and regulation of these elements. We are currently using RT-PCR to investigate the conditions under which the putative surface protein, CD0386, is expressed.
Acquisition of mobile genetic elements will result in numerous heritable changes, over and above the addition of new genes. Insertion between ORFs may result in transcriptional effects in the locality of the insertion site which can fundamentally alter the phenotype of the host. Elements can insert into ORFs, resulting in gene inactivation, and we have shown here that similar elements select different target sites in different strains, e.g. the CTn1-like elements. Gene fusion events may also occur, e.g. CTn5 promotes a fusion with CD1844 in strain 630. Furthermore, the newly acquired DNA can be a substrate for recombination, promoting more general genome rearrangements. Thus, it appears there is much still to learn about the contribution of mobile genetic elements to the biology of C. difficile.
The bacterial strains used in this study are listed in Table 2 and Table S2. C. difficile strains were grown on brain heart infusion (BHI) agar plates (Oxoid Ltd, Basingstoke, UK) supplemented with 5% defibrinated horse blood (E & O laboratories, Bonnybridge, UK) or in BHI broth (Oxoid Ltd). Cultures were grown at 37°C in anaerobic conditions (80% N2, 10% H2, 10% CO2). E. coli CA-434 was grown on Luria-Bertani agar plates (Sigma-Aldrich Company Ltd., Dorset, UK) at 37°C in aerobic conditions.
DNA was isolated using the Puregene yeast/bacterial kit B (Qiagen, Crawley, UK) according to the manufacturer's instructions, except that 3 µl of both the lytic enzyme solution and the RNase A solution were added instead of 1.5 µl at the appropriate steps in the protocol. Purity assessment and quantification were performed using a Nanodrop 1000 spectrophotometer.
PCR amplifications were carried out using the NEB Taq Polymerase kit (New England Biolabs, Herts, UK) according to the manufacturer's instructions with 10 mM dNTPs (NEB). The primers that were used are listed in Table S3 (Sigma-Genosys, UK).
PCR products were run for 1 hour at 100 mV on a 1% agarose gel supplemented with GelRed at a 1:10,000 dilution (Biotium, Hayward, USA). PCR products were purified with the spin column PCR purification kit (NBS Biologicals Ltd, Cambridgeshire, UK) according to the manufacturer's instructions. When multiple PCR products were present, the desired product was purified using the spin column gel extraction kit according to the manufacturer's instructions.
PCR products were sequenced at the Department of Biochemistry, University of Cambridge.
To investigate whether a putative element can excise from the genome, PCR analysis was performed to amplify the joint region of the circular intermediate using outward-facing primers at the ends of the element. The sequences of the joint regions of novel circular intermediates were deposited in Genbank; accession numbers are provided in Table 1. Additionally, the regenerated target sites and the junctions between the element and the genome were amplified. Comparison of the sequences of the junctions between the genome and element, the empty target site in the genome and the circular joint of the excised molecule enabled identification of the ends of the transposons.
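The logic of the outward-facing primer assay can be illustrated with a minimal sketch. It assumes a toy genome string and hypothetical element coordinates; the function names are illustrative and are not part of the methods used in this study. On excision, the two ends of the element are joined to form the circular joint, and the flanking genome closes up to regenerate the empty target site. The sketch ignores any bases that a real element might carry over from the flanking genome.

```python
def excise(genome: str, start: int, end: int) -> tuple[str, str]:
    """Return (excised element, genome with the empty target site).

    start and end are 0-based, end-exclusive coordinates of the element.
    They are hypothetical values chosen for illustration.
    """
    element = genome[start:end]
    empty_site_genome = genome[:start] + genome[end:]
    return element, empty_site_genome


def circular_joint(element: str, flank: int) -> str:
    """Sequence across the joint of the circular form: right end, then left end."""
    return element[-flank:] + element[:flank]


# Toy example: lowercase letters are genome, uppercase letters are the element.
genome = "aaaacccc" + "GATTACAGGT" + "ttttgggg"
element, empty = excise(genome, 8, 18)
joint = circular_joint(element, 3)
```

In this toy case the joint sequence does not occur anywhere in the integrated genome, which is why a product from outward-facing primers indicates that a circular form is present.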
The ClosTron system was used to make insertions in strain 630 ORFs CD0364 and CD0386 (CTn1) and CD3392 (CTn7) by retargeting a group II intron as described by Heap et al. Suitable target sites were identified and primers were designed using the Targetron Gene Knockout System kit (Sigma-Aldrich) (primers listed in Table S4). Splicing by Overlapping Extension PCR was used to create the specific intron retargeting sequence, which was cloned into pMTL-007. The plasmids were transferred from E. coli CA-434 into C. difficile 630Δerm by conjugation. Selection was carried out on agar containing thiamphenicol (Sigma-Aldrich) (15 µg/ml) and C. difficile selective supplement (Oxoid Ltd). Thiamphenicol-resistant colonies were suspended in BHI broth containing 1 mM IPTG (Sigma-Aldrich) and incubated for 3 hours at 37°C. Cultures were spread onto BHI plates containing 40 µg/ml lincomycin (Sigma-Aldrich) and C. difficile selective supplement. Colonies were restreaked onto fresh selective plates and the insertions were confirmed using PCR to amplify the junction between the target site and the intron.
The revised ClosTron system was used to target 630 ORFs CD0428 (CTn2) and CD1099 (CTn4) and R20291 ORF 1803 (Tn6103). The construction of the plasmid for the revised protocol differs in that the target sites were identified using the algorithm available at www.clostron.com. The plasmid was produced by DNA2.0 (Menlo Park, USA). Colonies on plates containing thiamphenicol and C. difficile selective supplement were streaked directly onto plates containing lincomycin and C. difficile selective supplement. Insertions were confirmed using PCR as described above.
Filter matings were carried out as described previously. Putative transconjugants were screened with ErmRAM primers to confirm the presence of the marked elements. To confirm the identity of the recipient strain, PCR with primers Lok1 and Lok3 was used to verify the absence of the PaLoc. Transfer frequencies were calculated as the number of transconjugants per donor cell.
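The transfer frequency calculation can be sketched as follows. The ratio itself (transconjugants per donor cell) is taken from the text; the plate-count conversion, the function names and all numbers are assumptions made for illustration and are not data from this study.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Viable count of the undiluted suspension, estimated from one plate count."""
    return colonies * dilution_factor / plated_volume_ml


def transfer_frequency(transconjugants_per_ml: float, donors_per_ml: float) -> float:
    """Number of transconjugants per donor cell."""
    if donors_per_ml <= 0:
        raise ValueError("donor count must be positive")
    return transconjugants_per_ml / donors_per_ml


# Hypothetical plate counts, for illustration only.
donors = cfu_per_ml(colonies=150, dilution_factor=1e6, plated_volume_ml=0.1)
transconjugants = cfu_per_ml(colonies=30, dilution_factor=1, plated_volume_ml=0.1)
frequency = transfer_frequency(transconjugants, donors)
```

With these invented counts, 300 transconjugants per ml and 1.5 × 10^9 donors per ml give a frequency of 2 × 10^-7 per donor.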
The nucleotide sequences of the genes encoding serine or tyrosine-based recombinases associated with conjugative transposons, plus those of phylogenetically related genes present on bacterial genomes, were downloaded from Genbank at the NCBI. These included sequences from the following (the numbers in square brackets are the genome positions, while those in parentheses are the Genbank accession numbers): Tn916 (U09422), Tn1545 (X61025), Tn1549 (AF192329), Tn5382 (AF063010), Tn5386 (DQ321786), Streptococcus thermophilus genomic island CIME19258 [553–1749 bp] (AJ586571), C. difficile 630 [1284507–1285700 bp] (AM180355), Tn4451 (U15027), Treponema denticola ATCC 35405 [2204491–2206329 bp] (NC_002967), Campylobacter coli RM2228 [3327–5213 bp] (NZ_AAFL01000021), Enterococcus faecalis V583 [2204960–2206573 bp] (NC_004668), Streptococcus pyogenes MGAS2096 [1092958–1094889 bp] (CP000261) and Streptococcus suis [58479–58655 bp] (NZ_AAFA02000004).
Novel transposons were named according to the transposon registry guidelines. The registry stipulates that a Tn number is warranted if the functionality of a transposable element is demonstrated, e.g. by excision from the host genome, or if the entire sequence of a putative transposable element is determined and shown to be <100% identical to previously known transposable elements.
All C. difficile genome sequences available in the database at NCBI as of March 2010 were searched using the BLAST algorithm with the 33 sequences of the recombinase library as input. The data in this paper were updated to reflect all genome corrections made up to October 2010. Additionally, the sequences of the ORFs of the (putative) conjugative transposons of strain 630 (CTn1, CTn2, CTn4, CTn5 and CTn7) were used in this search. All contigs containing a putative recombinase were compared to the genome sequences of strains 630 and R20291 to look for insertions. Comparisons were made using Doubleact and visualised using the Artemis Comparison Tool.
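The step of collecting contigs that contain a putative recombinase can be sketched as a small parser. This is a sketch only: it assumes the BLAST results were saved in the standard 12-column tabular layout, which the text does not state, and the E-value cut-off, function name, query labels and example rows are all hypothetical.

```python
from collections import defaultdict

# Column order of the standard 12-column BLAST tabular output.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def contigs_with_hits(tabular_text: str, max_evalue: float = 1e-10) -> dict[str, set[str]]:
    """Map each contig (subject) to the library queries that hit it.

    max_evalue is a hypothetical cut-off; the study does not report one.
    """
    hits = defaultdict(set)
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(COLUMNS, line.split("\t")))
        if float(row["evalue"]) <= max_evalue:
            hits[row["sseqid"]].add(row["qseqid"])
    return dict(hits)


# Invented example rows; names and scores are not from this study.
example = "\n".join([
    "recombinase_A\tcontig_12\t92.5\t400\t30\t0\t1\t400\t1500\t1899\t1e-150\t520",
    "recombinase_B\tcontig_12\t78.0\t120\t26\t0\t10\t129\t9000\t9119\t2e-20\t95.1",
    "recombinase_C\tcontig_40\t70.0\t50\t15\t0\t1\t50\t10\t59\t0.5\t30.2",
])
result = contigs_with_hits(example)
```

Contigs returned by such a filter would then be the candidates for comparison against the reference genomes, as described above.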
Predicted proteins present on putative conjugative transposons that had previously been annotated as hypothetical genes were analysed using several bioinformatics tools: PSI-BLAST, BYPASS, P-SORT, PRODOM and SMART.
PSI-BLAST was performed, and the position-specific scoring matrix (PSSM) obtained after the fifth iteration, or earlier if the program converged because no further similarities were found, was used for analysis with BYPASS. In parallel, the protein sequences were analysed with the P-SORT, PRODOM and SMART programs.
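The stopping rule described above (stop at the fifth iteration, or earlier on convergence) can be sketched as a generic loop. This is not PSI-BLAST itself: `search_round` is a hypothetical stand-in for one search round, and the toy similarity network is invented for illustration.

```python
from typing import Callable, Set, Tuple


def iterate_search(search_round: Callable[[Set[str]], Set[str]],
                   seed: Set[str], max_iterations: int = 5) -> Tuple[Set[str], int]:
    """Repeat a search until no new sequences are found or the limit is reached.

    Returns the accumulated set of sequences and the number of rounds run.
    """
    included = set(seed)
    for iteration in range(1, max_iterations + 1):
        found = search_round(included)
        new = found - included
        if not new:
            return included, iteration  # converged: nothing new this round
        included |= new
    return included, max_iterations  # stopped at the iteration limit


# Toy similarity network standing in for database hits (hypothetical).
neighbours = {"query": {"a"}, "a": {"b"}, "b": set()}


def toy_round(included: Set[str]) -> Set[str]:
    """Return every sequence similar to any sequence already included."""
    found: Set[str] = set()
    for name in included:
        found |= neighbours.get(name, set())
    return found


members, rounds = iterate_search(toy_round, {"query"})
```

In the toy network the search converges in the third round, when no new sequences are found, so it stops before the five-iteration limit.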
Results of BYPASS, PSORT, SMART and PRODOM searches of hypothetical proteins.
Bacterial strains and plasmids produced in this study.
PCR primers used to amplify junctions of circular intermediates of conjugative transposons and empty target sites.
PCR primers used to produce ClosTron mutants, and to screen transconjugant cells.
The authors would like to thank Dr. John T. Heap and Prof. Nigel P. Minton (University of Nottingham) for providing us with plasmid pMTL007 and for the use of the ClosTron system, Dr. Andre Dascal (McGill University, Montreal) for providing us with C. difficile strains QCD-32G58, QCD-66C26 and QCD-23M63, Dr. Maja Rupnik (Institute of Public Health Maribor) and Dr. Ed Kuijper (Leids Universitair Medisch Centrum) for information on bacterial strains, Dr. Antonio Gomez (Universitat Autonoma de Barcelona) for running the protein sequences on their local BYPASS server and Dr Alexander Indra (Austrian Agency for Health and Food Safety) for ribotyping strain CD37.
Competing Interests: The authors have declared that no competing interests exist.
Funding: This research has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 223585, the Medical Research Council (grant no. G0601176) and the Wellcome Trust (grant no. WT078131AIA). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.