DNA from a pooled plaque sample was subjected to a TRACA reaction (see Materials and methods). A total of 33 kanamycin-resistant transformants were isolated, each containing an EZ-Tn5:: plasmid cointegrate. Sequence analysis revealed that the captured circular DNA ranged in size from 0.9 to 7.3 kb, with a G+C range of 30–52% (). Of the 33 plasmids, 29 belong to one of four distinct groups based on their homology to each other (> 92% nucleotide identity): the pTRACA41 group (pTRACA41 and pTRACA58), the pTRACA42 group (pTRACA42, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 59, 60, 62, 64, 65, 67, 68, 70 and 72), the pTRACA63 group (pTRACA63 and pTRACA43) and the pTRACA69 group (pTRACA69 and pTRACA71). The remaining four plasmids, pTRACA45, pTRACA61, pTRACA66 and pTRACA73, share no homology with other plasmids identified in this study.
Schematic of the TRACA plasmids showing the putative ORFs, the size and the percentage G+C composition.
A number of putative ORFs were identified on each of the plasmids (). The closest matches with the predicted amino acid sequence of these ORFs, identified by blastp analysis, are listed in . Some of these ORFs are predicted to encode polypeptides with homology to proteins of known function, such as replication, mobilization or plasmid stability. Others encode hypothetical proteins, many of which show no significant homology to sequences in either the NCBI protein or nucleotide databases, indicating a potential reservoir of genes encoding as yet uncharacterized functions.
Analysis of the putative ORFs present on the TRACA-isolated plasmids
A putative replication (Rep) protein was identified in all except one (pTRACA61) of the plasmids isolated in this study (). The Rep from pTRACA45 shares 71% amino acid identity with that of pJD1, a 4.2-kb cryptic plasmid from Neisseria gonorrhoeae (Korch et al., 1985). Furthermore, the two additional ORFs on pTRACA45 are also closely related to those on pJD1 (), while its G+C content (50.6%) is similar to that of pJD1 (51.5%) and of the genomes of Neisseria spp. (~51%), indicating that it is of neisserial origin.
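The amino acid identity values quoted in this section come from blastp. As a rough illustration of what such a percentage means, the following minimal sketch computes identity from two sequences that have already been aligned. The function name `percent_identity` and the example strings are hypothetical, and the convention used here (identical columns divided by all alignment columns, with gap columns counted as non-identical) is an assumption; other tools may use a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same residue.

    Both inputs must be rows of the same pairwise alignment, with '-' marking gaps.
    Gap columns are counted in the denominator and never count as matches
    (an assumed convention; not taken from the paper).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")

    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        # A column is identical only if neither position is a gap and the residues agree.
        if res_a != "-" and res_b != "-" and res_a == res_b:
            matches += 1
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy alignment, not real Rep protein sequence.
    print(round(percent_identity("MKT-LV", "MKSALV"), 1))
```

In practice the alignment itself would come from a tool such as blastp; this sketch only covers the final counting step.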
The Rep proteins of the other 31 plasmids are more distantly related (25–43% amino acid identity) to those of plasmids found in bacteria belonging to either the Firmicutes or the Proteobacteria phyla. The pTRACA42 group comprises the majority of the plasmids isolated, 23 in total. This suggests that this group of plasmids is more abundant in the oral metagenomic DNA and/or is more stable in the E. coli host. The plasmids within this group differ in length (1467–1482 bp) and share > 92% nucleotide identity. One plasmid, pTRACA42, was selected for further study. The putative Rep protein is most closely related to that of the small, cryptic plasmid pCL2.1 from Lactococcus lactis (Chang et al., 1995) (). However, the G+C content of the pTRACA42 group of plasmids (~52%) is considerably higher than that of pCL2.1 (34%) and of L. lactis genomes (~35%), suggesting that they are not of lactococcal origin. The other ORFs on these plasmids have no significant homology to any proteins in the database. Interestingly, nucleotide sequences with over 80% identity to pTRACA42 were identified in one of the two human lung viral metagenomes (project ID: 28439; Dinsdale et al., 2008). The majority of the sequences in this metagenome were from phage.
The Rep protein from pTRACA66 is also most closely related to that from an L. lactis plasmid, specifically pKL001 (). However, the G+C content of pTRACA66 (45%) is higher than that of pKL001 (32.9%) and the L. lactis genomes (~35%), suggesting that it is not of lactococcal origin. This plasmid contains an ORF with the potential to encode an integrase and three other ORFs that have no significant homology to anything in the protein or the nucleotide databases ().
The Rep proteins associated with the pTRACA63 group of plasmids are most closely related to that from pAB49, an Acinetobacter baumannii plasmid (). However, the G+C contents of pAB49 (38.8%) and of the genomes of Acinetobacter spp. (38–42%) are much lower than that of pTRACA63 (50.4%), suggesting that pTRACA63 originates from a different bacterial genus. This plasmid also contains an ORF with the potential to encode an integrase.
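The origin arguments above all rest on the same comparison: the G+C content of a plasmid against that of a candidate host genome. A minimal sketch of that comparison follows. The function names and the 5-percentage-point tolerance are assumptions made for illustration and are not taken from the paper; only the percentages in the usage lines come from the text.

```python
def gc_percent(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Ambiguous symbols (e.g. N) are ignored, so the denominator is the
    number of unambiguous A/C/G/T bases.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def compatible_with_host(plasmid_gc: float, host_low: float, host_high: float,
                         tolerance: float = 5.0) -> bool:
    """True if the plasmid G+C lies within the host range widened by `tolerance`.

    The tolerance (in percentage points) is a hypothetical cut-off chosen
    for this sketch, not a threshold used by the authors.
    """
    return host_low - tolerance <= plasmid_gc <= host_high + tolerance


if __name__ == "__main__":
    # Percentages quoted in the text.
    print(compatible_with_host(50.6, 51.0, 51.0))  # pTRACA45 vs Neisseria spp. (~51%)
    print(compatible_with_host(50.4, 38.0, 42.0))  # pTRACA63 vs Acinetobacter spp. (38-42%)
```

A check like this can only flag a mismatch; agreement in G+C content is consistent with a host but does not by itself establish it.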
The Rep proteins associated with plasmids from the pTRACA41 group and pTRACA73 are most closely related to those on pTRACA20 and pTRACA22, respectively, plasmids isolated from the gut metagenome using TRACA (Jones & Marchesi, 2007) (). Interestingly, the Rep proteins from the pTRACA41 group and pTRACA73 are also related to that of pTS1 (24% and 43% amino acid identity, respectively), a cryptic plasmid from the oral bacterium Treponema denticola (Chauhan & Kuramitsu, 2004).
In addition to the rep gene, a number of other putative ORFs were identified on each plasmid (). The identities of the top hits identified by blastp analysis are listed in . Some of the ORFs on pTRACA73 are predicted to encode polypeptides that share a function, such as mobilization or plasmid stability, with those present on pTRACA22. However, based on sequence analysis, they are only distantly related, and the G+C content of pTRACA73 (30.7%) is much lower than that of pTRACA22 (51.4%). In contrast, the genes encoding polypeptides on the pTRACA41 group of plasmids share no homology with those present on pTRACA20; however, the G+C content of pTRACA41 (49.5%) is similar to that of pTRACA20 (48.7%).
In contrast to the other 32 plasmids, pTRACA61 does not contain a rep gene homologue, but contains an integrase gene homologue sharing 34% amino acid identity with a tyrosine integrase family protein (accession number ZP_06402565). It is possible that this is a circular intermediate of a mobile element. Integrases are site-specific recombinases that frequently produce circular molecules by recombination between their target sites (for reviews, see Smith & Thorpe, 2002; Roberts & Mullany, 2009), raising the intriguing possibility that TRACA has the ability to isolate mobile genetic elements other than plasmids.
The 159 metagenomic data sets currently in the NCBI database were searched for the plasmid DNA isolated in this study; however, except for pTRACA42, no homology was found.
This study has identified several novel plasmids, most of which encode hypothetical proteins of unknown function. This shows that there is a relatively unexplored genetic reservoir in the oral metagenome. Although previous studies have reported plasmids in oral streptococci (Dunny et al., 1973; Yagi et al., 1978; Caufield et al., 1982; Vandenbergh et al., 1982), none were captured by the TRACA system. This may be because the bacterial community found at periodontal disease sites is dominated by obligate anaerobes, and streptococci are typically associated with periodontally healthy sites (Paster et al., 2001). However, the rep genes associated with the pUA140 (Zou et al., 2001) and pLM7 plasmids from S. mutans could be detected in the sample by PCR amplification (data not shown). Similarly, plasmids previously isolated from gut bacteria were not captured by Jones & Marchesi (2007), suggesting a limitation of the TRACA system. It is possible that such plasmids are unstable in E. coli, are refractory to transposon insertion or are not present at a high enough copy number to enable capture. Furthermore, all the plasmids captured were < 8 kb, mirroring the majority of the previously reported plasmids from oral bacteria, although large plasmids have been reported from the oral cavity (LeBlanc et al., 1993). Whether the isolation of only small plasmids with TRACA results from their being numerically dominant in the oral cavity, and therefore preferentially captured, or reflects a limitation of the TRACA system is unknown. Transformation frequency is known to decrease logarithmically as plasmid size increases; thus, larger plasmids will transform less readily into E. coli (Szostková & Horáková, 1998). Larger plasmids will also be present at lower copy number, making them harder to capture by TRACA. We are currently investigating whether substituting different origins of replication into Tn5 allows the capture of different plasmids. It should also be borne in mind that TRACA is unlikely to capture linear plasmids, because the origin of replication used by the modified Tn5 cannot replicate their extreme termini; these require specialized enzymes (reviewed in Ravin, 2003).
The TRACA protocol has successfully captured novel plasmids from human oral plaque, many of which carry genes encoding as yet uncharacterized functions. TRACA has an advantage over other plasmid isolation techniques as it does not require the expression of plasmid-encoded genes in a surrogate host; thus, as illustrated by this study, novel plasmids and circular molecules can be isolated.