J Bacteriol. 2012 July; 194(14): 3636–3642.
PMCID: PMC3393498

Novel Miniature Transposable Elements in Thermophilic Synechococcus Strains and Their Impact on an Environmental Population


The genomes of the two closely related freshwater thermophilic cyanobacteria Synechococcus sp. strain JA-3-3Ab and Synechococcus sp. strain JA-2-3B′a(2-13) each host several families of insertion sequences (ISSoc families) at various copy numbers, resulting in an overall high abundance of insertion sequences in the genomes. In addition to full-length copies, a large number of internal deletion variants have been identified. ISSoc2 has two variants (ISSoc2∂-1 and ISSoc2∂-2) that are observed to have multiple near-exact copies. Comparison of environmental metagenomic sequences to the Synechococcus genomes reveals novel placement of copies of ISSoc2, ISSoc2∂-1, and ISSoc2∂-2. Thus, ISSoc2∂-1 and ISSoc2∂-2 appear to be active nonautonomous mobile elements derived by internal deletion from ISSoc2. Insertion sites interrupting genes that are likely critical for cell viability were detected; however, most insertions either were intergenic or were within genes of unknown function. Most novel insertions detected in the metagenome were rare, suggesting a stringent selective environment. Evidence for mobility of internal deletion variants of other insertion sequences in these isolates suggests that this is a general mechanism for the formation of miniature insertion sequences.


Transposable elements (TEs), DNA segments that can relocate within a genome, are common in eukaryotes, bacteria, and archaea (10). Transposition events can result in deleterious mutations within the host cell (21, 24, 29, 40). Indeed, the genomes of most prokaryotes sequenced so far show a low transposon abundance, and the number of transposons present in a genome is generally positively correlated with genome size (39). TEs fall into two categories. Autonomous TEs (ATEs) encode a transposase, the enzyme that catalyzes the excision and reinsertion of the sequence elsewhere in the genome. Nonautonomous TEs (NTEs) carry the sequence signals required for transposition but have no coding regions within them, so they rely on the transposase from an ATE for activity (20).

The best-studied class of NTEs is the miniature inverted-repeat transposable elements (MITEs). Although MITEs were first identified as “transposon-like elements” in Neisseria gonorrhoeae (12), much of the early investigation was performed in plant species (7–9). There are now dozens of identified MITE families from plant and animal genomes (20). MITEs consist simply of a short central region surrounded by terminal inverted repeats (TIRs) that are the recognition site for the cognate transposase. MITEs duplicate their target site upon insertion, causing them to be flanked by terminal direct repeats (TDRs). Most MITE families are associated with known ATEs by similarity between the TIRs. While they are abundant and well studied in eukaryotes, we are only just beginning to understand their distribution and importance in bacterial populations (16, 17).

The recent availability of genome sequences has enabled the study of MITEs in bacteria. Genome comparison between closely related organisms has been used to discover small repeats carrying TIRs in Neisseria (12, 28), Rickettsia (32), Pneumococcus (33), Caulobacter (11), the enterobacteria (15, 22, 35), and the cyanobacteria (27, 40, 41). Bacterial MITEs vary greatly in sequence and structure and have been observed to affect the expression and function of genes in their hosts. MITEs in Neisseria have been shown to affect the stability of cotranscribed RNA (13, 28). Other MITEs have putative transcriptional signals in their TIRs, possibly stimulating transcription of adjacent genes (6). MITEs do not encode any genes, yet some have short open reading frames in their TIRs that can fuse potentially functional motifs to genes into which they have inserted, possibly leading to altered function (1, 15, 32). Little is known about the origin of MITE sequences in bacteria, and transposition activity is largely inferred from comparisons between genome sequences of related cultured isolates or even different species, although biologically verified transposition activity of MITEs in Enterobacter cloacae and Pseudomonas syringae has recently been reported (34, 37).

The thermophilic Synechococcus strains Synechococcus sp. strain JA-3-3Ab (Synechococcus OS-A) and Synechococcus sp. strain JA-2-3B′a(2-13) (Synechococcus OS-B′) are members of the photosynthetic microbial mat community at Octopus Spring in Yellowstone National Park (19). Their genomes have been sequenced, and they have been shown to have an unusually high abundance of a simple class of TEs known as insertion sequences (ISs) (4, 31). Synechococcus OS-A harbors 71 full-length ISs from 9 IS subfamilies (termed ISSocs), and Synechococcus OS-B′ harbors 82 full-length ISs from 12 ISSoc subfamilies (4). Six of the ISSoc subfamilies have near-identical copies in both populations, whereas most orthologs between these two isolates share only ~83% nucleotide identity. This unusual level of sequence conservation, in addition to evidence of recent lateral gene transfer between these two populations (4, 31), suggests that these ISSocs have been passed between the two species. In addition to the full-length copies, there are numerous partial copies present on the genomes. Some of these partial copies are missing one or both termini of the ISSoc sequence, while others are internal deletions that preserve the termini. A metagenomic data set generated from samples taken from the same mats from which the cultured isolates were obtained (4) allows examination of transposition activity in this system. Comparison of the genome sequences against the metagenomic data set has been used to precisely define the ends of the ISs and gauge transposition activity in the environmental population (31).

Further examination of this data set has revealed two novel, small, nonautonomous transposable elements derived by internal deletion of ISSoc2. Structural features distinguish them from other known bacterial NTEs. Further, recent activity of these novel NTEs in the natural environment is demonstrated by the variation in the insertion locations observed in the metagenome relative to the isolate genomes.


Data sources.

The genomes of Synechococcus OS-A and Synechococcus OS-B′ have previously been published (4). They are available from GenBank (accession numbers CP000239 and CP000240). The metagenomic data set consists of 202,329 paired-end sequence reads derived from 105,373 fosmid clones. Its generation has been described (4). The maximum insert size tolerated by the fosmid vector is 10 kb; the average sequence length in the data set is 829 bp. The sequences are available from GenBank (NCBI project_ids 20717, 20719, 20721, 20723, 20725, and 20727).

Identifying miniature TE sequences in the metagenome.

The metagenomic data set was searched (Megablast [2]) for reads with similarity to ISSoc2 or the internal deletion variants, ISSoc2∂-1 and ISSoc2∂-2. Reads were binned according to the ISSoc2 sequence variant to which they had best similarity (typically ≥95% nucleic acid identity [NAID]).
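The binning step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes the search hits have already been parsed into (read, variant, percent identity) tuples, it ranks hits by percent identity (the text says only "best similarity"), and it treats the "typically ≥95%" figure as a hard cutoff. The function name, the variant labels, and the cutoff parameter are all hypothetical.

```python
from collections import defaultdict


def bin_reads_by_variant(hits, min_identity=95.0):
    """Assign each read to the variant of its best-scoring hit.

    hits: iterable of (read_id, variant_name, percent_identity).
    Hits below min_identity are ignored (assumed hard cutoff).
    """
    best = {}  # read_id -> (variant_name, percent_identity)
    for read_id, variant, identity in hits:
        if identity < min_identity:
            continue
        if read_id not in best or identity > best[read_id][1]:
            best[read_id] = (variant, identity)

    bins = defaultdict(list)
    for read_id, (variant, _) in best.items():
        bins[variant].append(read_id)
    return dict(bins)


# Hypothetical hits: r1 matches two variants, r3 falls below the cutoff.
example_hits = [
    ("r1", "ISSoc2", 99.0),
    ("r1", "variant-1", 96.0),
    ("r2", "variant-2", 97.5),
    ("r3", "ISSoc2", 90.0),
]
example_bins = bin_reads_by_variant(example_hits)
```

Keeping only the single best hit per read means a read lands in exactly one bin, which is what allows the three read counts to be summed later without double counting.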

Taxonomic binning of metagenomic sequences.

The nucmer alignment program from the MUMmer package (14) was used to align all the metagenome reads against both genomes. Identifying reads as “Synechococcus OS-A-like” or “Synechococcus OS-B′-like” was accomplished in a stepwise manner. The first pass identified sequences that aligned to a reference genome with ≥92% NAID contiguously across ≥ 95% of the read length. Since the two reference genomes show massive rearrangement relative to each other and the metagenome demonstrates that rearrangements are common in the natural population (data not shown), the second pass identified sequences that had multiple (usually two) alignment regions with ≥92% NAID that were noncontiguous and nonoverlapping and whose lengths summed to ≥95% of the read length. Clone membership was used to draw in reads that did not meet the above criteria: clone mates of binned reads were screened for those having ≥92% NAID across ≥50% of their length (noncontiguous) to the same reference as their mate. Any read whose clone mate was not a member of the same species-specific bin (Synechococcus OS-A-like or Synechococcus OS-B′-like) was removed from the bin. If both clone mates met the criteria for both Synechococcus OS-A and Synechococcus OS-B′ (i.e., appeared to derive from a genome region with an unusually high sequence identity between the two genomes) or if clone mates were members of opposite bins, the sequences were put into a “Synechococcus OS-A/B′-like” bin. If only one or neither mate met the criteria for either Synechococcus OS-A or Synechococcus OS-B′, the sequence reads were classified as “other.” Reads lacking a clone mate were binned based on their own characteristics.
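The two read-level passes described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: alignment segments are assumed to be given as (start, end, percent identity) in 1-based inclusive read coordinates, overlapping segments are resolved with a simple greedy left-to-right scan, and the clone-mate rules are left out. The function name and return labels are hypothetical.

```python
def meets_bin_criteria(read_len, segments, min_identity=92.0, min_fraction=0.95):
    """Return 'contiguous', 'split', or None for one read against one reference.

    segments: list of (start, end, percent_identity) on the read,
    1-based inclusive coordinates.
    """
    # Keep only segments at or above the identity threshold, sorted by start.
    good = sorted((s, e) for s, e, ident in segments if ident >= min_identity)
    needed = min_fraction * read_len

    # First pass: one contiguous alignment spanning enough of the read.
    for s, e in good:
        if e - s + 1 >= needed:
            return "contiguous"

    # Second pass: nonoverlapping segments whose lengths sum to enough
    # of the read (greedy scan; an overlapping segment is skipped).
    covered = 0
    last_end = 0
    for s, e in good:
        if s > last_end:
            covered += e - s + 1
            last_end = e
    return "split" if covered >= needed else None
```

A read would be run through this check once per reference genome; the pair-level rules in the text then decide the final bin from the results for both clone mates.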

Determination of synteny.

Metagenome sequence reads were searched against the reference genomes using nucmer (14). Results were screened for multiple-alignment regions having >92% NAID, which overlapped less than 67% of the length of either alignment region and which were nonadjacent (with adjacency defined as being within 50 nucleotides [nt] to allow for short indels). Sequence regions containing IS segments usually resulted in regions of alignment to many areas of the genome. If any of the IS alignments met the synteny criteria, it was not considered further.
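One way to read the screening criteria above is sketched below. The text does not say which coordinate system each test applies to, so this sketch assumes that the 67% overlap test is made on read coordinates and the 50-nt adjacency test on reference coordinates, with reference intervals normalized so that start ≤ end. The `Aln` record and the function names are hypothetical.

```python
from typing import NamedTuple


class Aln(NamedTuple):
    read_start: int
    read_end: int
    ref_start: int
    ref_end: int
    identity: float  # percent nucleic acid identity


def _overlap(s1, e1, s2, e2):
    """Length of the overlap between two 1-based inclusive intervals."""
    return max(0, min(e1, e2) - max(s1, s2) + 1)


def breaks_synteny(a, b, min_identity=92.0, max_overlap=0.67, adjacency=50):
    """True if two alignment regions of one read suggest a rearrangement."""
    if a.identity <= min_identity or b.identity <= min_identity:
        return False
    # The regions must be mostly distinct parts of the read.
    ov = _overlap(a.read_start, a.read_end, b.read_start, b.read_end)
    len_a = a.read_end - a.read_start + 1
    len_b = b.read_end - b.read_start + 1
    if ov >= max_overlap * len_a or ov >= max_overlap * len_b:
        return False
    # The regions must map to nonadjacent places in the reference.
    gap = max(a.ref_start, b.ref_start) - min(a.ref_end, b.ref_end) - 1
    return gap > adjacency


left = Aln(1, 400, 1000, 1399, 98.0)
far = Aln(401, 800, 50000, 50399, 97.0)    # maps far away in the reference
near = Aln(401, 800, 1420, 1819, 97.0)     # 20-nt gap, treated as a short indel
```

Here `breaks_synteny(left, far)` flags the read, while `breaks_synteny(left, near)` does not, because the 20-nt gap falls inside the adjacency allowance.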


Internal deletion mutants of ISSoc2.

Members of the ISSoc2 subfamily of insertion sequences carry two genes, encoding a transposase and a resolvase. Sequence and structural similarity to IS607 from Helicobacter pylori places the ISSoc2s within the IS200/IS605 family in the Chandler-Mahillon nomenclature scheme (36). ISSoc2 is the most abundant IS in Synechococcus OS-A, with 19 intact copies, and there are two intact copies in Synechococcus OS-B′. In addition to these intact copies, there are many partial copies that we categorized as truncations (lacking a segment of sequence that includes one terminus of the IS), fragments (lacking segments at both termini), or internal deletions (with both ends being intact but missing internal sequence) (28).
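The three categories of partial copy can be expressed as a small decision rule. The sketch below assumes each genomic copy has been aligned to the intact element and is described by the aligned blocks in IS coordinates (1-based, inclusive). The 1,760-nt length is the intact ISSoc2 length given in the Fig. 1 legend; the two tolerance parameters and the function name are hypothetical choices for illustration.

```python
def classify_copy(blocks, is_len=1760, end_tol=10, min_gap=20):
    """Classify an IS copy from its aligned blocks on the intact element.

    blocks: nonempty list of (start, end) in IS coordinates.
    end_tol: how far from a terminus an alignment may stop and still
             count as reaching it (assumed value).
    min_gap: smallest unaligned internal stretch counted as a deletion
             (assumed value).
    """
    blocks = sorted(blocks)
    has_left = blocks[0][0] <= 1 + end_tol
    has_right = max(e for _, e in blocks) >= is_len - end_tol

    if not has_left and not has_right:
        return "fragment"        # both termini missing
    if has_left != has_right:
        return "truncation"      # exactly one terminus missing

    # Both termini present: look for a missing internal stretch.
    reach = blocks[0][1]
    for start, end in blocks[1:]:
        if start - reach - 1 >= min_gap:
            return "internal deletion"
        reach = max(reach, end)
    return "intact"
```

For example, `classify_copy([(1, 300), (1500, 1760)])` returns "internal deletion", while `classify_copy([(1, 900)])` returns "truncation".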

While the truncations and fragments show heterogeneity in their length and termini, two internal deletion variants (ISSoc2∂s) were observed to have multiple near-identical copies. ISSoc2∂-1 has 21 copies in Synechococcus OS-A and 44 in Synechococcus OS-B′, while ISSoc2∂-2 has 9 copies in Synechococcus OS-A and 10 in Synechococcus OS-B′. The ISSoc2∂s have both structural and sequence differences (Fig. 1; see Fig. S1 in the supplemental material); however, all copies of ISSoc2∂-1 share >97% NAID, as do all copies of ISSoc2∂-2. The terminal sequences of both variants are identical to those of ISSoc2, but ISSoc2∂-1 has more internal sequence similar to ISSoc2, whereas ISSoc2∂-2 has a sequence segment of unknown origin.

Fig 1
Structures of ISSoc2∂-1 and ISSoc2∂-2. (A) Schematic representation of ISSoc2∂-1 and ISSoc2∂-2. The central box represents an intact ISSoc2 (1,760 nt), with internal polygons denoting CDS. Black bars above and below represent ...

ISSoc2∂s are present in environmental populations.

We identified 494 sequences in the metagenomic data set that contained ISSoc2-specific sequence, 571 that contained ISSoc2∂-1-specific sequence, and 223 that contained ISSoc2∂-2-specific sequence (Table 1). Most of the metagenomic sequences containing an ISSoc2 or ISSoc2∂ (1,024 out of 1,288) could be confidently assigned as being derived from either Synechococcus OS-A-like or Synechococcus OS-B′-like individuals in the community, and an additional 159 derived from regions that appear to be recently laterally transferred between Synechococcus OS-A and Synechococcus OS-B′ (i.e., syntenic and sharing >95% NAID), making it impossible to determine from which of the two they derived. The remainder of the reads (109) had their ISs masked and were searched against the NCBI nucleotide database. Most (98) had either Synechococcus OS-A or Synechococcus OS-B′ as their best hit, with high identity (>90% NAID).

Table 1
Metagenome sequences containing an ISSoc2 variant

ISSoc2∂s appear to be present in fewer copies in individuals found in the natural environment than in those found in laboratory culture. The ratio of ISSoc2 to ISSoc2∂-1 to ISSoc2∂-2 in the Synechococcus OS-A genome is 2:2:1, while the ratio for reads identified as Synechococcus OS-A-like in the metagenome was 9:3:1. Similarly, the ratio in the Synechococcus OS-B′ genome is 1:10:5, whereas the Synechococcus OS-B′-like metagenomic reads had a ratio of 1:5:2.

ISSoc2∂ activity in the environment.

To search for evidence of ISSoc2∂ transposition activity in the environmental population, sequences from the metagenomic data set were compared to the Synechococcus OS-A and Synechococcus OS-B′ genomes. For presentation purposes, in this report we define “insertion” and “excision” events relative to the reference genomes; however, these observations cannot rule out that what we term an “insertion event” in the environmental sequence is actually the result of an excision event that occurred in the cultured isolate, and vice versa. The metagenomic sequences containing ISSoc2 or ISSoc2∂ sequence were further examined to identify those showing an insertion at a location where the reference genome does not have an insertion (insertion events) (Fig. 2).

Fig 2
Examples of metagenomic sequences demonstrating synteny, “insertion,” and “excision” relative to the reference genome. (A) Alignment of metagenome sequences to the region from Synechococcus OS-B′ surrounding locus ...

For ISSoc2, 339 sequences were categorized as Synechococcus OS-A-like (see Materials and Methods for details of sequence binning), of which 106 (31%) display alternate insertion locations at 56 distinct locations (Table 1). In the Synechococcus OS-B′-like bin, there are 69 reads with ISSoc2 sequence, with 29 (20%) showing an alternate insertion location at 17 distinct sites. Many alternate insertion locations were observed only once (29/55 and 10/17); others were found in up to 7 metagenomic sequences. These data serve as a standard of comparison for environmental transposition activity due to the ISSoc2 transposase.

A similar pattern of activity was observed for the ISSoc2∂s in the Synechococcus OS-A-like bin (Table 1). For ISSoc2∂-1, 36 out of 102 sequence reads (35%) show an alternate insertion location, and 8 out of 36 reads (25%) containing ISSoc2∂-2 do as well. In the Synechococcus OS-B′-like bin, 224 out of 346 (65%) of the ISSoc2∂-1-containing metagenome sequences and 71 out of 132 (54%) of ISSoc2∂-2-containing metagenome sequences show an alternate insertion location. An ~2:1 ratio of insertion reads to distinct insertion sites is observed in all cases. As seen for the intact ISSoc2 activity, approximately half of all insertion sites were observed only once in the metagenomic data set.
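The two summary statistics used here, the ratio of insertion reads to distinct insertion sites and the share of sites seen only once, are simple to compute once each read has been assigned an insertion position. The sketch below assumes one site identifier per read; the function name and the example coordinates are hypothetical and are not values from Table 1.

```python
from collections import Counter


def summarize_insertion_sites(read_sites):
    """Summarize how insertion reads are spread over distinct sites.

    read_sites: one site identifier per read showing an alternate
    insertion location (for example, a reference coordinate).
    """
    counts = Counter(read_sites)
    n_reads = sum(counts.values())
    n_sites = len(counts)
    if n_sites == 0:
        raise ValueError("no insertion reads supplied")
    singletons = sum(1 for c in counts.values() if c == 1)
    return {
        "reads": n_reads,
        "sites": n_sites,
        "reads_per_site": n_reads / n_sites,
        "singleton_fraction": singletons / n_sites,
    }


# Hypothetical coordinates: five sites, three of them seen once.
example_summary = summarize_insertion_sites(
    [101, 101, 101, 250, 250, 300, 412, 980]
)
```

A reads-per-site value near 2 together with a singleton fraction near one half corresponds to the pattern described in the text: most novel insertions are rare, and only a few sites collect several reads.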

Transposition activity was also evaluated by screening metagenomic sequences for those that lack an ISSoc2 or ISSoc2∂ insertion at locations where one exists in a reference genome (Fig. 2). We identified 106 metagenomic sequences in the Synechococcus OS-A-like bin derived from regions in Synechococcus OS-A where ISSoc2 is inserted. Of these, 36 (34%) did not contain an ISSoc2 (Table 2). At least one metagenomic read lacking ISSoc2 was identified for 11 of the 19 insertion sites in Synechococcus OS-A; however, since there is only ~4.2× coverage of the Synechococcus OS-A genome sequence in the metagenome, it is possible that our sampling missed “excision events” that occurred at the other eight sites. In the Synechococcus OS-B′-like bin, we identified only 14 sequences derived from the two regions where the ISSoc2 insertions are in Synechococcus OS-B′. Of these, 6 (43%) lack ISSoc2 sequence.

Table 2
Metagenome sequences showing excision of an ISSoc2 variant relative to the reference Synechococcus genome

ISSoc2∂-1 and ISSoc2∂-2 “excision events” are observed in both the Synechococcus OS-A-like and Synechococcus OS-B′-like bins (Table 2). Most insertion sites had at least one metagenomic sequence showing absence of the ISSoc2∂. Of the 109 Synechococcus OS-A-like metagenome sequences derived from regions containing ISSoc2∂-1 insertions in Synechococcus OS-A, 43 (39%) lacked an ISSoc2∂-1 insertion, as did 14 of 34 (41%) metagenome sequences from ISSoc2∂-2 insertion regions. Higher ratios of absence to presence were detected in the Synechococcus OS-B′-like bin, with 133 of 229 (58%) metagenome reads showing a lack of ISSoc2∂-1 and 28 of 52 (54%) metagenome reads showing a lack of ISSoc2∂-2.

Other ISSoc internal deletions in Synechococcus OS-A and Synechococcus OS-B′.

Internal deletion variants of other ISSoc subfamilies were identified in our examination of Synechococcus OS-A and Synechococcus OS-B′. ISSoc1, the most abundant IS in Synechococcus OS-B′, has 12 internal deletion variants in Synechococcus OS-B′ (and one in Synechococcus OS-A), ISSoc5 has five in Synechococcus OS-A, ISSoc6 has 19 in Synechococcus OS-A, and ISSoc10 has four in Synechococcus OS-B′. Many of these internal deletion variants share structural similarity with other copies of their class. Sequence conservation between these genomic copies, however, is lower than was observed for the ISSoc2∂s. We screened the metagenome for copies of these internal deletion mutants to look for evidence of insertion/excision activity (Table 3).

Table 3
Metagenome sequences containing other ISSoc internal deletion variants

One hundred four metagenomic sequences have regions with best similarity to internal deletion mutants of ISSoc1 (ISSoc1∂s). Most are syntenic with their cognate reference genome: only three (all in the Synechococcus OS-B′-like bin) show an alternate insertion location for the ISSoc1∂ sequence.

For ISSoc6, there were 200 metagenomic reads with regions similar to known ISSoc6 internal deletions (ISSoc6∂s). As expected (since ISSoc6 is found only in Synechococcus OS-A), all were in the Synechococcus OS-A-like bin. Ten showed alternate insertion locations, representing only 2 unique insertion events. Six sequences showed an insertion adjacent to locus CYA_IS00031 (encoding an ISSoc2 transposase). The other four sequences showed an intergenic insertion between CYA_2460 (encoding an aminotransferase) and CYA_2461 (encoding an oxidoreductase).


We have identified two classes of internal deletion variants of the Synechococcus insertion sequence ISSoc2, ISSoc2∂-1 and ISSoc2∂-2, members of which are conserved in size, sequence, and structure. Although they lack TIRs and DRs, the terminal regions are conserved. The structure of the transposition complex and mechanism of action have been determined for the IS200/IS605 family insertion sequence IS608 from H. pylori. The left and right recognition signals consist of short hairpins approximately 20 bp from the termini and an adjacent short region complementary to the genome insertion site (3). While the transposase in this system (HpTnpA) is not homologous to the transposase found in ISSoc2, this example does demonstrate that the transposition signals in insertion sequences lacking TIRs do reside in the termini, and thus it is reasonable to propose that ISSoc2∂s are competent for transposition.

A novel class of nonautonomous transposable elements.

ISSoc2∂-1 and ISSoc2∂-2 are not simply different internal deletions of ISSoc2. There are sequence differences and small regions of DNA of unknown origin in their interiors. We have observed only nearly identical copies of these variants; there are no intermediate or “transitional” variants that are more similar to the parental ISSoc2 sequence. Thus, it is likely that ISSoc2∂-1 and ISSoc2∂-2 were formed in independent events. The presence of identical copies in Synechococcus OS-A and Synechococcus OS-B′, along with evidence that ISSoc2 and other ISs have been transferred between Synechococcus OS-A and Synechococcus OS-B′ (4, 31), suggests that the ISSoc2∂s spread by lateral gene transfer.

There are several factors that distinguish ISSoc2∂s from MITEs. Based strictly on the definition of MITEs, ISSoc2∂s do not qualify, as they lack terminal inverted repeats and are not surrounded by direct repeats; however, functionally ISSoc2∂s appear to be equivalent to MITEs. The origin of most other known bacterial MITEs is murky because the only sequence shared between described MITEs and their cognate autonomous transposable elements is the TIRs; the core region of known MITEs has no similarity to known TEs (20), with the one exception being the mPing family of MITEs in rice, which has homology to the Ping TE (23, 25, 30). Thus, what truly distinguishes ISSoc2∂s from most known MITEs is the segments of the transposase gene that they contain in their core. MITEs have been categorized into two types (5). Type I MITEs have TIRs precisely identical to those of known ISs and are thought to derive through internal deletion of the intact IS. Type II MITEs have TIRs that are similar but not precisely identical to those of known ISs and thus are thought to have originated through convergent evolution. Our observations of the ISSoc2∂ elements are clear evidence that they form through internal deletion of active ISs, and our detection of similarly structured variants of other insertion sequences suggests this is a general mechanism for MITE formation in Synechococcus populations.

Activity of ISSoc2∂s and effect on variation.

The available metagenome sequence allowed us to examine both the activity and impact of ISSoc2∂s in the natural population. We observed placement of ISSoc2∂ elements in novel locations relative to the genomes of Synechococcus OS-A and Synechococcus OS-B′ and observed the absence of these elements at locations where they exist in Synechococcus OS-A and Synechococcus OS-B′. This alternate placement does not appear to be due to general recombination events, because the endpoints of the insertions are precisely the borders of the ISSoc2∂ sequences. In addition, we do not believe these elements to be products of a senescence process that degrades intact ISSoc2 insertions, because there are conserved sequence differences between ISSoc2∂-1, ISSoc2∂-2, and ISSoc2, and we observe only near-identical copies of the internal deletion variants. Thus, the most likely explanation for this distribution is that these elements are functional nonautonomous transposable elements that are active in the natural populations of these organisms.

We observed a ratio of sequences showing novel placement to unique insertion locations of 2:1. A majority of the locations showed only a single instance, while a few had up to 10 sequences. That is to say, we do not observe large subpopulations with specific insertions that may represent beneficial (or even neutral) mutations. This suggests that these novel insertions are short-lived either because they are selected against or because they are excised either to insert somewhere else or to be lost.

In Synechococcus OS-A, the rate of detection of ISSoc2∂ insertion and excision events is similar to that of the ISSoc2 insertion and excision events. This suggests that they are acting at a similar rate in the natural populations. In Synechococcus OS-B′, however, there is a higher rate of detection of ISSoc2∂ insertion and excision events. Previous studies have described a higher diversity in genomic structure in Synechococcus OS-B′ populations (4, 31). The presence of transposable elements such as insertion sequences and the internal deletion variants described here could play a role in variation in genomic structure, keying homologous recombination within the chromosome. Since we observe IS activity in both populations but greater diversity in the Synechococcus OS-B′ population, we believe that environmental factors and not transposition rate are the stronger determinant of genomic diversity. That is to say that both populations undergo similar rates of transposition, but fewer of the resulting variants persist in Synechococcus OS-A populations due to more stringent selection.

The stable growth conditions provided by laboratory culture might allow accumulation of ISSoc2∂s (and other transposable elements) in locations that would be detrimental or fatal to individuals in the natural environment. Analysis of the abundance of the ISSoc2∂s indicates that they are present in lower numbers in the natural population than in the cultured populations. Most insertion locations in the reference genomes are intergenic; however, it is possible that some of these insertions affect regulation of adjacent genes and thus could affect viability. Some insertions that interrupt genes in the reference genomes were not observed in the metagenome; examples in Synechococcus OS-A are insertions of ISSoc2∂-1 into an acyl phosphatase gene (CYA_0362/CYA_0361) and into a DNA methyltransferase gene (CYA_1314/CYA_1313). This would be expected if these functions were required for survival in the natural environment.

Genotypic impact of ISSoc2∂ activity.

While most ISSoc2∂ insertions are intergenic, interrupted coding genes were observed in both the reference genomes and the metagenome reads (Fig. 2; see Table S1 in the supplemental material). Many of the insertions interrupting coding sequences (CDS), however, are within 30 nucleotides (nt) of the 3′ end of the gene, making it unlikely that gene function is lost, although it may be altered. To gauge the selective pressure on insertions at each location observed, we compared the number of metagenome reads showing an insertion at that location and the number lacking an insertion (see Table S1 in the supplemental material).
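The per-location comparison described above amounts to a presence/absence tally for every insertion site. A minimal sketch follows, assuming each metagenome read that maps across a site has already been scored as carrying or lacking the insertion. The function name and the site labels are hypothetical, and the fraction is a descriptive summary only, not a statistical test of selection.

```python
def tally_insertion_support(observations):
    """Count reads with and without an insertion at each site.

    observations: iterable of (site_id, has_insertion) pairs, one per
    metagenome read that spans the site.
    """
    tally = {}
    for site, has_insertion in observations:
        with_ins, without_ins = tally.get(site, (0, 0))
        if has_insertion:
            with_ins += 1
        else:
            without_ins += 1
        tally[site] = (with_ins, without_ins)

    return {
        site: {
            "with": w,
            "without": wo,
            "fraction_with": w / (w + wo),
        }
        for site, (w, wo) in tally.items()
    }


# Hypothetical reads: site_A always carries the insertion, site_B rarely does.
example_reads = [("site_A", True)] * 7 + [("site_B", True)] + [("site_B", False)] * 3
example_tally = tally_insertion_support(example_reads)
```

A site whose fraction is 1.0 across all mapped reads is a candidate for a prevalent or fixed insertion, whereas a site supported by a single read among many is the rare pattern that the text interprets as evidence of selection against most new insertions.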

While many of the insertions that were novel to the metagenome (i.e., not present in the reference genomes) were observed only once, suggesting that most insertion events result in variants that are quickly removed from the population by selection, a handful appear to be prevalent. In the Synechococcus OS-A-like bin, we identified an ISSoc2∂-1 insertion interrupting dnaX (GenBank locus_id CYA_0563, encoding the DNA polymerase III gamma and tau subunits) 13 nt from the 3′ end of the gene. This insertion was observed in all 7 metagenome reads that mapped to that region. The insertion causes the terminal 3 amino acids (LPF) to be replaced by the amino acid sequence HDSSQ. This change, being short and located at the carboxy terminus of the protein, is unlikely to affect the protein's function but could alter its stability. It is unlikely that this insertion affects the transcriptional profile of this region either, since the downstream CDS is in the opposite orientation. No rho-independent terminators have been predicted downstream of dnaX (26); however, there is a putative hairpin and a putative stem-loop structure that could be termination signals. The insertion separates the end of the dnaX CDS from these features, but ISSoc2∂-1 contains several putative hairpin structures that also may be transcription control features, including one of 30 bp that is only 30 nt from the insertion sequence terminus. In the Synechococcus OS-B′-like bin, an ISSoc2∂-1 insertion into the 5′ end of the era gene (CYB_1268, encoding a GTPase that in Escherichia coli is involved in coupling growth to cell division) was observed in all 9 metagenome reads that mapped to the region. In this case, translation is still possible from an open reading frame in the terminal region of the ISSoc2∂-1 that contains an ATG start codon and fuses in frame to the era coding frame. Regulation of expression of this gene is likely altered by this mutation. That these mutations appear to have become fixed (or at least prevalent) in the environmental populations suggests either that there is some beneficial effect to their presence or that they are neutral mutations linked to an advantageous trait.

Intergenic ISSoc2∂ insertions might result in selectable phenotypes. Some intergenic insertions found in the reference genomes were not identified in any metagenomic reads. Others were found in all metagenomic reads that mapped to that location. These biases in the representation of these insertions in the population are likely due to a selective advantage. Insertions at these sites may be either disrupting or introducing regulatory elements that affect expression of adjacent genes.

Genome sequencing technology has enabled the discovery of many bacterial MITEs. The availability of complete genome sequences allows inter- and intragenomic comparisons that identify repeated sequences with TIRs and DRs. The simple structure of MITEs has also allowed the development of computational methods that can identify these elements independent of comparative analyses (27). However, our results demonstrate that reliance on the presence of TIRs might underestimate the number of NTEs present.

These additional mobile elements affect the cell in a manner similar to that of the cognate IS. Transposition activity can result in gene interruptions or other mutations that affect the survivability of the host individual. To fully understand the role of these elements, long-term in vitro evolution studies will need to be performed to track the rates and patterns of transposition of both the IS elements and associated NTEs.



This work was supported by the Frontiers in Integrative Biology Program of the National Science Foundation (grant EF-0328698). D. Bhaya acknowledges support from the Carnegie Institution for Science.


Published ahead of print 4 May 2012

Supplemental material for this article may be found at


1. Abergel C, et al. 2006. Impact of the excision of an ancient repeat insertion on Rickettsia conorii guanylate kinase activity. Mol. Biol. Evol. 23:2112–2122
2. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410
3. Barabas O, et al. 2008. Mechanism of IS200/IS605 family DNA transposases: activation and transposon-directed target site selection. Cell 132:208–220
4. Bhaya D, et al. 2007. Population level functional diversity in a microbial community revealed by comparative genomic and metagenomic analyses. ISME J. 1:703–713
5. Brugger K, et al. 2002. Mobile elements in archaeal genomes. FEMS Microbiol. Lett. 206:131–141
6. Buisine N, Tang CM, Chalmers R. 2002. Transposon-like Correia elements: structure, distribution and genetic exchange between pathogenic Neisseria sp. FEBS Lett. 522:52–58
7. Bureau TE, Wessler SR. 1994. Mobile inverted-repeat elements of the Tourist family are associated with the genes of many cereal grasses. Proc. Natl. Acad. Sci. U. S. A. 91:1411–1415
8. Bureau TE, Wessler SR. 1994. Stowaway: a new family of inverted repeat elements associated with the genes of both monocotyledonous and dicotyledonous plants. Plant Cell 6:907–916
9. Bureau TE, Wessler SR. 1992. Tourist: a large family of small inverted repeat elements frequently associated with maize genes. Plant Cell 4:1283–1294
10. Chandler M, Mahillon J. 2002. Insertion sequences revisited, p 305–366. In Craig NL, Craigie R, Gellert M, Lambowitz AM (ed), Mobile DNA II. American Society for Microbiology, Washington, DC
11. Chen SL, Shapiro L. 2003. Identification of long intergenic repeat sequences associated with DNA methylation sites in Caulobacter crescentus and other alpha-proteobacteria. J. Bacteriol. 185:4997–5002
12. Correia FF, Inouye S, Inouye M. 1988. A family of small repeated elements with some transposon-like properties in the genome of Neisseria gonorrhoeae. J. Biol. Chem. 263:12194–12198
13. De Gregorio E, Silvestro G, Petrillo M, Carlomagno MS, Di Nocera PP. 2005. Enterobacterial repetitive intergenic consensus sequence repeats in yersiniae: genomic organization and functional properties. J. Bacteriol. 187:7945–7954
14. Delcher AL, Phillippy A, Carlton J, Salzberg SL. 2002. Fast algorithms for large-scale genome alignment and comparison. Nucleic Acids Res. 30:2478–2483
15. Delihas N. 2007. Enterobacterial small mobile sequences carry open reading frames and are found intragenically—evolutionary implications for formation of new peptides. Gene Regul. Syst. Biol. 1:191–205
16. Delihas N. 2011. Impact of small repeat sequences on bacterial genome evolution. Genome Biol. Evol. 3:959–973
17. Delihas N. 2008. Small mobile sequences in bacteria display diverse structure/function motifs. Mol. Microbiol. 67:475–481
18. Edgar RC. 2004. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5:113 doi:10.1186/1471-2105-5-113
19. Ferris MJ, Muyzer G, Ward DM. 1996. Denaturing gradient gel electrophoresis profiles of 16S rRNA-defined populations inhabiting a hot spring microbial mat community. Appl. Environ. Microbiol. 62:340–346
20. Feschotte C, Zhang X, Wessler SR. 2002. Miniature inverted-repeat transposable elements and their relationship to established DNA transposons, p 1147–1158. In Craig NL, Craigie R, Gellert M, Lambowitz AM (ed), Mobile DNA II. ASM Press, Washington, DC
21. Hubner P, Iida S, Arber W. 1987. A transcriptional terminator sequence in the prokaryotic transposable element IS1. Mol. Gen. Genet. 206:485–490
22. Hulton CS, Higgins CF, Sharp PM. 1991. ERIC sequences: a novel family of repetitive elements in the genomes of Escherichia coli, Salmonella typhimurium and other enterobacteria. Mol. Microbiol. 5:825–834
23. Jiang N, et al. 2003. An active DNA transposon family in rice. Nature 421:163–167
24. Kiel JA, Boels JM, Ten Berge AM, Venema G. 1993. Two putative insertion sequences flank a truncated glycogen branching enzyme gene in the thermophile Bacillus stearothermophilus CU21. Mitochondrial DNA 4:1–9
25. Kikuchi K, Terauchi K, Wada M, Hirano HY. 2003. The plant MITE mPing is mobilized in anther culture. Nature 421:167–170
26. Kingsford CL, Ayanbule K, Salzberg SL. 2007. Rapid, accurate, computational discovery of Rho-independent transcription terminators illuminates their relationship to DNA uptake. Genome Biol. 8:R22 doi:10.1186/gb-2007-8-2-r22
27. Lin S, et al. 2011. Genome-wide comparison of cyanobacterial transposable elements, potential genetic diversity indicators. Gene 473:139–149
28. Mazzone M, et al. 2001. Whole-genome organization and functional properties of miniature DNA insertion sequences conserved in pathogenic Neisseriae. Gene 278:211–222
29. Nakamura K, Inouye M. 1981. Inactivation of the Serratia marcescens gene for the lipoprotein in Escherichia coli by insertion sequences, IS1 and IS5; sequence analysis of junction points. Mol. Gen. Genet. 183:107–114
30. Nakazaki T, et al. 2003. Mobilization of a transposon in the rice genome. Nature 421:170–172
31. Nelson WC, Wollerman L, Bhaya D, Heidelberg JF. 2011. Analysis of insertion sequences in thermophilic cyanobacteria: exploring the mechanisms of establishing, maintaining, and withstanding high insertion sequence abundance. Appl. Environ. Microbiol. 77:5458–5466
32. Ogata H, et al. 2000. Selfish DNA in protein-coding genes of Rickettsia. Science 290:347–350
33. Oggioni MR, Claverys JP. 1999. Repeated extragenic sequences in prokaryotic genomes: a proposal for the origin and dynamics of the RUP element in Streptococcus pneumoniae. Microbiology 145:2647–2653
34. Poirel L, Carrer A, Pitout JD, Nordmann P. 2009. Integron mobilization unit as a source of mobility of antibiotic resistance genes. Antimicrob. Agents Chemother. 53:2492–2498
35. Sharples GJ, Lloyd RG. 1990. A novel repeated DNA sequence located in the intergenic regions of bacterial chromosomes. Nucleic Acids Res. 18:6503–6508
36. Siguier P, Perochon J, Lestrade L, Mahillon J, Chandler M. 2006. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 34:D32–D36
37. Stavrinides J, Kirzinger MW, Beasley FC, Guttman DS. 2012. E622, a miniature, virulence-associated mobile element. J. Bacteriol. 194:509–517
38. Thompson JD, Gibson TJ, Higgins DG. 2002. Multiple sequence alignment using ClustalW and ClustalX. Curr. Protoc. Bioinformatics Chapter 2:Unit 2.3 doi:10.1002/0471250953.bi0203s00
39. Touchon M, Rocha EP. 2007. Causes of insertion sequences abundance in prokaryotic genomes. Mol. Biol. Evol. 24:969–981
40. Wolk CP, Lechno-Yossef S, Jager KM. 2010. The insertion sequences of Anabaena sp. strain PCC 7120 and their effects on its open reading frames. J. Bacteriol. 192:5289–5303
41. Zhou F, Tran T, Xu Y. 2008. Nezha, a novel active miniature inverted-repeat transposable element in cyanobacteria. Biochem. Biophys. Res. Commun. 365:790–794

Articles from Journal of Bacteriology are provided here courtesy of American Society for Microbiology (ASM)