In the genomes of many bacterial species, intergenic regions are rich in repeat elements such as MITEs [9], other small nucleotide sequence repeats [11], and small non-coding RNA genes [48]. Here we analyzed intergenic plasmid regions from three species of Borrelia and detected intergenic sequences that can fold into conserved RNA secondary structures. Compelling evidence for evolutionary conservation comes from comparisons of homologous sequences, in which numerous base-pair changes maintain the stem loop structures. These stem loops are specific to plasmid sequences; none have been detected in Borrelia chromosomes or in sequences from other bacterial species.
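The conservation argument above rests on base-pair changes that keep a stem intact. The following Python sketch illustrates one way such changes could be counted for two aligned homologs that share a dot-bracket structure. It is an illustration only, not the procedure used in this study: the function names, the toy sequences, and the decision to count G-U wobble pairs as valid are all assumptions.

```python
# Pairs allowed in an RNA stem: Watson-Crick plus the G-U wobble (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(dot_bracket):
    """Return (i, j) index pairs for every matched '(' and ')' in the structure."""
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def pair_preserving_changes(seq_a, seq_b, dot_bracket):
    """List stem positions where the homologs differ but both still base-pair."""
    hits = []
    for i, j in paired_positions(dot_bracket):
        pair_a = (seq_a[i], seq_a[j])
        pair_b = (seq_b[i], seq_b[j])
        if pair_a != pair_b and pair_a in PAIRS and pair_b in PAIRS:
            hits.append((i, j, pair_a, pair_b))
    return hits


# Toy, hypothetical homologs (not Borrelia sequences) sharing one hairpin.
structure = "((((...))))"
homolog_a = "GGCAAAAUGCC"
homolog_b = "GACAAAAUGUC"
print(pair_preserving_changes(homolog_a, homolog_b, structure))
# -> [(1, 9, ('G', 'C'), ('A', 'U'))]
```

In this toy case the G-C pair at positions 1 and 9 is replaced by A-U, so the stem is retained although both nucleotides changed.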
Two RNA motifs associated with superfamilies of protein genes (lipoprotein_1 and CRASP-1) show high conservation of secondary structure between homologs, yet the gene families themselves show extensive amino acid substitutions and deletions/insertions. Perhaps the cell maintains these RNA motifs as reservoirs and as potential functional units in the formation of new variant proteins. A major focus of future work should be to determine whether variant CRASP-1 and lipoprotein_1 loci are translated.
Sequence #2 contains inverted repeats and is located less than 35 bp downstream of putative lipoprotein_1 genes; in one case it overlaps the terminal codon sequences. This location closely resembles that of several miniature inverted-repeat transposable elements (MITEs) present in other bacterial species, which are also found downstream of genes and in some cases overlap C-terminal codons [13]. In Yersinia, genes situated upstream of MITEs appear to be regulated by these inverted repeat elements, which are transcribed into RNA [50]. Although Sequence #2 differs from bacterial MITEs in lacking a large nucleotide segment between the inverted repeats, its proximity to the C-terminal coding ends of genes is similar to that of several MITEs.
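As a rough illustration of what an inverted repeat means at the sequence level, the Python sketch below scans a DNA string for a short arm whose reverse complement occurs a few nucleotides downstream. It is a minimal sketch, not the search method used here: the function names, the arm length, the spacer limit, and the toy sequence are assumptions, and only perfect repeats are reported.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=6, max_spacer=10):
    """Return (left_start, right_start, left_arm) for perfect inverted repeats.

    arm and max_spacer are illustrative thresholds, not values from the study.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            if j + arm > len(seq):
                break  # the right arm would run past the end of the sequence
            if seq[j:j + arm] == target:
                hits.append((i, j, left))
    return hits


# Toy, hypothetical sequence: GATTCC ... GGAATC separated by a 4-nt spacer.
toy = "AAGATTCCTTTTGGAATCAA"
print(find_inverted_repeats(toy))
# -> [(2, 12, 'GATTCC')]
```

A short spacer limit reflects the point made above that Sequence #2 lacks a large segment between its repeats; a search for typical MITEs would need a much larger spacer and tolerance for mismatches.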
Borrelia contains transposase genes that are found in other bacterial species [20]. Some plasmids show a high percentage of transposase-specific nucleotide sequences that may not be evident from gene annotations; for example, the first ~1400 bp at the left end of B. afzelii PKo plasmid lp28, starting at nucleotide position 1, consist entirely of transposase-related sequences (unpublished results). Non-autonomous transposable elements that are moved and replicated by transposases may also be present in Borrelia. As many other bacteria contain these elements [1], it would not be surprising if Borrelia had its own set of small non-autonomous transposable elements, possibly with their own specific signatures. Repeat Sequence #2 described above should be further analyzed for a possible relationship to bacterial MITEs.
Stem loops proximal to protein genes have been reported before. Dunn et al. [51] described two tandem inverted repeat sequences with perfectly base-paired stems in B. burgdorferi circular plasmid cp8.3. The hairpins are adjacent to putative -35 promoter sequences of an open reading frame. In addition, an inverted repeat sequence is found in the 5' flanking region of the bba64 (P35) gene in B. burgdorferi
]. However, the above sequences, which lie upstream of genes in promoter regions, are unrelated to those reported here.
Stem loop 2 of Sequence #3 is downstream of the CRASP-1-related genes and appears to have classic Rho-independent termination signatures in terms of size and oligo-U tail. The adjacent stem loop 1 may be part of a putative 3' UTR of CRASP-1 and CRASP-1-related genes. Functions cannot presently be assigned, but it should be noted that some small RNAs in E. coli represent 3' UTR transcripts that show different expression levels from their associated mRNAs and may have independent functions [8]. Sequences #1, #4, and #5 appear to have typical RNA signatures, with long stem loops and bulged/looped positions. Without further characterization, functional roles cannot be assigned. Of particular interest, however, is the conservation of the bulged U at position 23 of the Sequence #1 stem loop. Many RNA secondary structures display conserved bulged positions, and these have functional roles in RNA/RNA interactions [53]. Sequence #1 does not appear to be linked to any protein genes and is present in nine different plasmids. This raises the questions of how it was transferred and why the sequence is duplicated. Interestingly, Sequence #4 is found in three different species, B. burgdorferi str. B31, B. afzelii PKo, and B. garinii PB, but in only a single copy in each. Because it is found in all three species, this RNA motif may provide an essential function in Borrelia. Once complete genome sequences of other Borrelia species are determined, it would be of interest to see whether Sequence #4 and/or its characteristic secondary structure model is also present in these species.
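The Rho-independent termination signature noted for stem loop 2 of Sequence #3, a hairpin followed directly by an oligo-U tail, can be expressed as a crude sequence scan. The Python sketch below is a heuristic illustration under stated assumptions, not a validated terminator predictor and not the analysis performed here: the function names, all thresholds (stem length, loop size, tail window, U count), the acceptance of G-U pairs, and the toy sequences are hypothetical, and only perfect stems are considered.

```python
# Base pairs accepted in the stem (Watson-Crick plus G-U wobble; an assumption).
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def stem_length(rna, left_end, right_start):
    """Count consecutive base pairs extending outward from the loop."""
    n, i, j = 0, left_end, right_start
    while i >= 0 and j < len(rna) and rna[i] + rna[j] in PAIRS:
        n += 1
        i -= 1
        j += 1
    return n


def find_terminator_like(rna, min_stem=5, min_loop=3, max_loop=8,
                         tail_len=8, min_u=5):
    """Return sorted (stem_start, tail_start, tail) for hairpins with a U-rich tail.

    All thresholds are illustrative defaults, not values from the study.
    """
    rna = rna.upper().replace("T", "U")
    hits = set()
    for loop_start in range(1, len(rna)):
        for loop_len in range(min_loop, max_loop + 1):
            right_start = loop_start + loop_len
            if right_start >= len(rna):
                break
            n = stem_length(rna, loop_start - 1, right_start)
            if n < min_stem:
                continue
            tail = rna[right_start + n:right_start + n + tail_len]
            if tail.count("U") >= min_u:
                hits.add((loop_start - n, right_start + n, tail))
    return sorted(hits)


# Toy, hypothetical RNA: 6-bp GC-rich stem, GAAA loop, eight U residues.
with_tail = "CCGCGGCCGAAAGGCCGCUUUUUUUU"
without_tail = "CCGCGGCCGAAAGGCCGCACACACAC"
print(find_terminator_like(with_tail))
print(find_terminator_like(without_tail))
```

A scan of this kind only flags candidates; it says nothing about whether termination actually occurs, which would require experimental analysis of the transcripts.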
Only a limited number of plasmids have been analyzed for repeat sequences that fold into RNA motifs, and a more comprehensive search is necessary to assess their abundance. Experimental RNA analyses such as Northern blots need to be done to determine whether these sequences are transcribed, but in view of the strong evidence for evolutionary conservation of secondary structure, they may function at the RNA level. In E. coli, many intergenic sequences are transcribed, which results in the presence of a large number of heterogeneous small RNAs [8]. These elements have likewise not been analyzed for function.