J Bacteriol. 2010 June; 192(12): 3231–3234.
Published online 2010 April 16. doi:  10.1128/JB.00124-10
PMCID: PMC2901701

A Unique Group of Virus-Related, Genome-Integrating Elements Found Solely in the Bacterial Family Thermaceae and the Archaeal Family Halobacteriaceae


Viruses SH1 and P23-77, infecting archaeal Haloarcula species and bacterial Thermus species, respectively, were recently designated to form a novel viral lineage. In this study, the lineage is expanded to archaeal Halomicrobium and bacterial Meiothermus species by analysis of five genome-integrated elements that share the core genes with these viruses.

Viruses appear to form lineages that span different domains of life (1-3, 5, 13-15). Recently, a novel lineage of viruses and virus-like genetic elements of halophilic archaea and thermophilic bacteria was identified (13). The members of the lineage include the Thermus thermophilus virus P23-77 (11), the Thermus aquaticus virus IN93 (17), the Haloarcula hispanica virus SH1 (4), the Halobacterium salinarium plasmid pHH205 (29), and a genome-integrated element of Haloarcula marismortui. Structural analysis of P23-77 and SH1 revealed that both viruses form icosahedral capsids with triangulation number 28, decorate capsomers with similar types of tower-like structures, and contain an inner lipid membrane beneath the protein capsid (10, 11). The genomes of the viruses and the plasmid are formed of double-stranded DNA molecules that range from ~16 kbp to ~31 kbp in length. All of the above-mentioned genetic elements share three common genes, two coding for a small and a large major coat protein (sMCP and lMCP, respectively) and one coding for a putative genome-packaging ATPase, all arranged in a similar order in the genome (13). The cryoelectron microscopy structures of P23-77 and SH1 suggest that the base of the capsomer could be formed of a hexameric single beta-barrel protein (10, 11), in contrast to the pseudohexameric double beta-barrel capsomer structure of the PRD1-adenovirus lineage (15). However, the packaging ATPases share sequence similarity with the ATPases of this previously demonstrated virus lineage (13, 15). These observations lead to the suggestion that P23-77-like viruses might form an early-diverging branch of the widespread lineage of beta-barrel capsid-containing viruses. Recently, the genome and the functions of some genes of the temperate virus IN93 were also studied (16, 17). IN93 was shown to contain four transcriptional units. Three of them are transcribed in the same direction and are active during the lytic cycle. One is transcribed in the opposite direction and is active during the lysogenic cycle. Moreover, a novel thermostable lysozyme from IN93 was discovered (17).

In this study, we analyzed five new virus-related, genome-integrated elements belonging to the lineage of P23-77-like viruses. All of these elements contain genes for the major capsid proteins and the putative packaging ATPase. Two of the elements reside in the genomes of bacterial Meiothermus species (20), two in archaeal Halomicrobium species (21), and one in Haloarcula species (9), thus widening the distribution of members having a putative single beta-barrel capsid protein in the families of Thermaceae on the bacterial tree and Halobacteriaceae on the archaeal tree. There are currently no particle-forming viruses characterized for Halomicrobium or Meiothermus; therefore, the genetic elements described in this study provide the first evidence of genomic evolution by unique virus types in these bacterial and archaeal species.

Chromosome-integrated, virus-like sequences in the genomes of Meiothermus and Halomicrobium species.

The genomes of Meiothermus ruber DSM 1279 (GenBank accession no. ABUF00000000) and Meiothermus silvanus DSM 9946 (ABUG00000000) contain genes with clear similarity to major capsid proteins of phages P23-77 and IN93. Closer analysis revealed that both Meiothermus genomes contain a P23-77-related provirus. They were designated MeioRubP1 and MeioSilP1, for the genome-integrated element in M. ruber and that in M. silvanus, respectively. However, the putative packaging ATPase gene, which is one of the core genes of the P23-77-like viruses, was missing in the genome-integrated provirus of M. silvanus, due to an incomplete genomic sequence. We sequenced the missing gap (using methods similar to those described in reference 13) in order to obtain the full provirus sequence. The gap was ~1,000 bp in length, and it contained a gene for a putative packaging ATPase. The boundaries of the integrated elements were determined by studying the genes surrounding the obvious virus-related sequence. Directly downstream of both the MeioRubP1 and the MeioSilP1 sequence are arginine tRNA genes. tRNA genes are known to be common sites for bacteriophage integration (6). Obvious host genes (OHCU [2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline] decarboxylase- and predicted HDIG domain-containing proteins) follow the tRNA genes. Upstream of the phage-related integration cassette in M. ruber is a gene for protoporphyrinogen oxidase, a common bacterial gene. In M. silvanus, the last virus-related gene (related to IN93 gene 29) encodes a protein with homology to a site-specific recombinase, which is followed by three widespread transposase genes and other host genes.

A genome-integrated, virus-like element from the genome of Haloarcula marismortui (ATCC 43049) was discovered previously and was designated IHP for integrated Haloarcula provirus (13). IHP has a set of genes in common with the Haloarcula hispanica virus SH1 and the Halobacterium salinarium plasmid pHH205. Two new elements were discovered in the genomic sequence of Halomicrobium mukohataei DSM 12286 (GenBank accession no. CP001688). These new elements were designated HaloMukP1 and HaloMukP2. All three genome-integrated elements, IHP, HaloMukP1, and HaloMukP2, have at their opposite ends genes encoding a putative zinc finger protein and a phage integrase. Beyond these genes, no homologues are shared among the genetic elements. The locations of the open reading frames (ORFs) of all of the genome-integrated elements are listed in Table S1 in the supplemental material.

Analysis of the genome-integrated elements.

A comparison of the genetic elements of MeioRubP1 and MeioSilP1 is presented in Fig. 1. MeioRubP1 and MeioSilP1 are 16 kbp and 21 kbp long, respectively. The longer sequence of MeioSilP1 is partly due to the difference in transposases (ORFs 38 and 39) and a region that appears to have been acquired from a bacterial genome (genes 25 to 27). Indeed, it is possible that MeioSilP1 is a defective virus, due to the disturbed sequence. Matsushita and Yanase have experimentally demonstrated a transfer of transposable element lStaqTZ2 from the Thermus thermophilus TZ2 genome into the genome of IN93 (18), suggesting that such events occur spontaneously. Moreover, there is a short region after ORF 39 in the MeioSilP1 element which shows similarity to MeioRubP1 ORF 32 but which is not part of any ORF of MeioSilP1. Therefore, the predicted sequences of the phage integrase gene (ORF 37, the region after ORF 39, and ORF 42) of MeioSilP1 appear to have been divided into pieces due to integration of transposases and other non-provirus-originated genes (ORFs 40 to 44). Possibly, transposases that recognize phage integrase sequences can be favorable for the host by rendering genome-integrating viruses defective. Interestingly, the genes encoding sMCPs are only 30% identical, although they were previously determined to be the most conserved genes within the lineage (13). Alignment of the genes does not indicate that a frameshift event has occurred, thus suggesting that the sMCP protein has evolved structurally to a somewhat different form or has altered its interactions with other structural proteins. In line with this, some of the genes that were determined to encode minor structural components of the P23-77 virion (gene 11 and gene 29) have no homologues in the MeioRubP1 element, and the homologues of structural proteins VP19, VP20, and VP22 were only ~20% identical at the protein level. Moreover, unlike in MeioRubP1, a homologue of gene 29 of the P23-77 genome was present in the MeioSilP1 element. These aspects indicate a structural evolution in the hypothetical virions of MeioRubP1 and MeioSilP1. The cell-wall-digesting enzymes are different in MeioRubP1 and MeioSilP1. This follows the pattern in the related viruses P23-77 and IN93 (13) and, for example, some tectiviruses (26).
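The identity percentages quoted above (for example, ~30% for the sMCP genes) come from pairwise alignments. As a minimal sketch of how such a value can be computed over the matching region of an already aligned pair, the Python function below counts identical residues over columns in which neither sequence has a gap. This gap convention, the function name percent_identity, and the toy sequences are assumptions made for illustration; the text does not state how gaps were treated, and the sequences are not real sMCP data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the matching region of a pairwise alignment.

    Columns in which either sequence has a gap ('-') are skipped, so the
    value refers only to residue pairs that are actually aligned.
    This gap convention is an assumption, not the authors' stated method.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap column: not part of the matching region
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Toy aligned fragments, made up for illustration only.
toy_a = "MKT-LLVAGG"
toy_b = "MRTALL-SGG"
print(percent_identity(toy_a, toy_b))  # 6 identical out of 8 compared -> 75.0
```

Other conventions (for instance, dividing by the full alignment length or by the shorter sequence) give different percentages, so identity values are only comparable when the same convention is used throughout.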

FIG. 1.
Comparison of the MeioRubP1 and MeioSilP1 elements. Functions of the genes, where they are mentioned, are based on BLAST results. “mr” indicates the protein identity for the matching region in the pairwise protein alignment.

A comparison of the genetic elements of IHP, HaloMukP1, HaloMukP2, and pHH205 is presented in Fig. 2. In these elements, the genes encoding sMCP, lMCP, and ATPases were generally the most conserved. Indeed, the major capsid proteins of HaloMukP1 and HaloMukP2 are almost 100% identical, but many of the other common genes were only around 40% identical (interestingly, also including the putative packaging ATPase). Most of the genes in these elements were shared by at least one other member of the lineage, but many genes showed no similarities to any known genes.

FIG. 2.
Comparison of the genetic elements found in archaeal organisms. Functions of the genes, where they are mentioned, are based on BLAST results. “mr” indicates the protein identity for the matching region in the pairwise protein alignment. ...

The elements of the lineage have adapted to use various life strategies. SH1, IN93, and P23-77 are true viruses, pHH205 has been reported to be a plasmid, and the five other elements are integrated into the host genome. Interestingly, all of the genome-integrated elements have many genes homologous to genes with regulatory and/or DNA binding functions, suggesting that their temperate life strategies are more dependent on these genes. In contrast to the integrated elements, the lytic viruses (P23-77 and SH1), with their straightforward lytic life strategy, have no homologues of such genes.

Phylogenetic order and geographical distribution.

The phylogenetic relationships and the geographical distributions of the members of the lineage are presented, respectively, in Fig. S1 and S2 in the supplemental material. The evolutionary histories were inferred as described in reference 13, using previously described methods (8, 19, 23, 24, 27, 28, 31). The separation orders of the elements appear to be rather similar regardless of which of the lineage-defining genes are used to build the tree. However, minor differences exist, suggesting either that the evolutionary rates of these genes are not uniformly constant but may occasionally take a faster pace or that individual genes may be exchanged with related elements. The elements in Archaea and Bacteria form their own branches, suggesting that they have evolved separately. The putative single beta-barrel lineage resides in two very distantly related hosts, one group being thermophilic bacteria and the other being halophilic archaea. All of the hosts are able to thrive in moderately high temperatures, as even most of the less thermophilic species of Meiothermus and Halomicrobium have optimal growth temperatures between 50 and 60°C (20, 22). It is possible that the specific pattern in which one of the hosts is a true thermophile is not a coincidence but a direct result of better chances of survival of ancient virus types in thermal environments (12). The elements are distributed equally on Earth, suggesting that they are common in any of the natural habitats of their hosts. Furthermore, they are common in the currently sequenced genomes of Meiothermus species and members of the order Halobacteriales but have not been discovered in any other cellular groups.
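Among the methods cited for the phylogenetic analysis is the neighbor-joining method (24). The sketch below is a minimal, self-contained Python illustration of the join order produced by that algorithm on a small distance matrix. It is not the pipeline used in this study (which followed reference 13 and the cited methods); the function name neighbor_joining, the labels A to D, and the toy distances are hypothetical, and branch lengths and bootstrap support are not computed.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return the joins made by the neighbor-joining algorithm.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    Each join is recorded as (node_a, node_b, new_node_label).
    Ties are broken by taking the first minimal pair encountered.
    """
    nodes = list(names)
    d = dict(dist)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other current nodes.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair minimising Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        new = f"({a},{b})"
        # Distance from the new internal node to every remaining node.
        for k in nodes:
            if k not in (a, b):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((a, k))] + d[frozenset((b, k))] - d[frozenset((a, b))]
                )
        nodes = [k for k in nodes if k not in (a, b)] + [new]
        joins.append((a, b, new))
    return joins


# Toy additive distances for four hypothetical taxa (not data from the study).
toy_dist = {
    frozenset(("A", "B")): 3, frozenset(("A", "C")): 6, frozenset(("A", "D")): 7,
    frozenset(("B", "C")): 5, frozenset(("B", "D")): 6, frozenset(("C", "D")): 3,
}
joins = neighbor_joining(["A", "B", "C", "D"], toy_dist)
print(joins)
```

Running the same procedure on distance matrices built from different lineage-defining genes and comparing the resulting join orders is one simple way to check whether the separation order depends on the gene chosen.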

Bacterial and archaeal genomes are noted to contain many ORFans (i.e., ORFs with no matches in current databases) (25). Their share of the genomic content has remained steady despite the growing number of sequenced genomes (30). Integrative elements, such as viruses, plasmids, and transposable elements, are suggested to be responsible for a large number of the currently annotated ORFans (7). We have demonstrated here that a unique group of genome-integrating, virus-related elements can be relatively common in a few distantly related groups of cellular organisms and that these elements can have a number of putative genes with no matches in the previously sequenced genomes. Therefore, it is possible that a portion of the ORFans in bacterial and archaeal genomes may be related to some unique and ancient viruses.

Nucleotide sequence accession numbers.

The M. silvanus provirus sequence has been deposited in GenBank under accession number HM140848.



This work was supported by the Finnish Centre of Excellence Program of the Academy of Finland (2006-2011), grant 1129648 (J.K.H.B.).


Published ahead of print on 16 April 2010.

Supplemental material for this article may be found at


1. Abrescia, N. G., J. J. Cockburn, J. M. Grimes, G. C. Sutton, J. M. Diprose, S. J. Butcher, S. D. Fuller, C. San Martin, R. M. Burnett, D. I. Stuart, D. H. Bamford, and J. K. Bamford. 2004. Insights into assembly from structural analysis of bacteriophage PRD1. Nature 432:68-74.
2. Akita, F., K. T. Chong, H. Tanaka, E. Yamashita, N. Miyazaki, Y. Nakaishi, M. Suzuki, K. Namba, Y. Ono, T. Tsukihara, and A. Nakagawa. 2007. The crystal structure of a virus-like particle from the hyperthermophilic archaeon Pyrococcus furiosus provides insight into the evolution of viruses. J. Mol. Biol. 368:1469-1483.
3. Bamford, D. H. 2003. Do viruses form lineages across different domains of life? Res. Microbiol. 154:231-236.
4. Bamford, D. H., J. J. Ravantti, G. Rönnholm, S. Laurinavicius, P. Kukkaro, M. Dyall-Smith, P. Somerharju, N. Kalkkinen, and J. K. Bamford. 2005. Constituents of SH1, a novel lipid-containing virus infecting the halophilic euryarchaeon Haloarcula hispanica. J. Virol. 79:9097-9107.
5. Benson, S. D., J. K. Bamford, D. H. Bamford, and R. M. Burnett. 2004. Does common architecture reveal a viral lineage spanning all three domains of life? Mol. Cell 16:673-685.
6. Canchaya, C., G. Fournous, and H. Brüssow. 2004. The impact of prophages on bacterial chromosomes. Mol. Microbiol. 53:9-18.
7. Cortez, D., P. Forterre, and S. Gribaldo. 2009. A hidden reservoir of integrative elements is the major source of recently acquired foreign genes and ORFans in archaeal and bacterial genomes. Genome Biol. 10:R65.
8. Felsenstein, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783-791.
9. Ihara, K., S. Watanabe, and T. Tamura. 1997. Haloarcula argentinensis sp. nov. and Haloarcula mukohataei sp. nov., two new extremely halophilic archaea collected in Argentina. Int. J. Syst. Bacteriol. 47:73-77.
10. Jaalinoja, H. T., E. Roine, P. Laurinmaki, H. M. Kivela, D. H. Bamford, and S. J. Butcher. 2008. Structure and host-cell interaction of SH1, a membrane-containing, halophilic euryarchaeal virus. Proc. Natl. Acad. Sci. U. S. A. 105:8008-8013.
11. Jaatinen, S. T., L. J. Happonen, P. Laurinmaki, S. J. Butcher, and D. H. Bamford. 2008. Biochemical and structural characterisation of membrane-containing icosahedral dsDNA bacteriophages infecting thermophilic Thermus thermophilus. Virology 379:10-19.
12. Jalasvuori, M., and J. K. Bamford. 2009. Did the ancient crenarchaeal viruses from the dawn of life survive exceptionally well the eons of meteorite bombardment? Astrobiology 9:131-137.
13. Jalasvuori, M., S. T. Jaatinen, S. Laurinavicius, E. Ahola-Iivarinen, N. Kalkkinen, D. H. Bamford, and J. K. Bamford. 2009. The closest relatives of icosahedral viruses of thermophilic bacteria are among viruses and plasmids of the halophilic archaea. J. Virol. 83:9388-9397.
14. Khayat, R., L. Tang, E. T. Larson, C. M. Lawrence, M. Young, and J. E. Johnson. 2005. Structure of an archaeal virus capsid protein reveals a common ancestry to eukaryotic and bacterial viruses. Proc. Natl. Acad. Sci. U. S. A. 102:18944-18949.
15. Krupovic, M., and D. H. Bamford. 2008. Virus evolution: how far does the double beta-barrel viral lineage extend? Nat. Rev. Microbiol. 6:941-948.
16. Matsushita, I., and H. Yanase. 2008. A novel thermophilic lysozyme from bacteriophage phiIN93. Biochem. Biophys. Res. Commun. 377:89-92.
17. Matsushita, I., and H. Yanase. 2009. The genomic structure of Thermus bacteriophage phiIN93. J. Biochem. 146:775-785.
18. Matsushita, I., and H. Yanase. 2009. A novel insertion sequence transposed to thermophilic bacteriophage phiIN93. J. Biochem. 145:797-803.
19. Nei, M., and S. Kumar. 2000. Molecular evolution and phylogenetics. Oxford University Press, New York, NY.
20. Nobre, M. F., H. G. Trüper, and M. S. da Costa. 1996. Transfer of Thermus ruber (Loginova et al. 1984), Thermus silvanus (Tenreiro et al. 1995), and Thermus chliarophilus (Tenreiro et al. 1995) to Meiothermus gen. nov. as Meiothermus ruber comb. nov., Meiothermus silvanus comb. nov., and Meiothermus chliarophilus comb. nov., respectively, and emendation of the genus Thermus. Int. J. Syst. Bacteriol. 46:604-606.
21. Oren, A., R. Elevi, S. Watanabe, K. Ihara, and A. Corcelli. 2002. Halomicrobium mukohataei gen. nov., comb. nov., and emended description of Halomicrobium mukohataei. Int. J. Syst. Evol. Microbiol. 52:1831-1835.
22. Robinson, J. L., B. Pyzyna, R. G. Atrasz, C. A. Henderson, K. L. Morrill, A. M. Burd, E. Desoucy, R. E. Fogleman III, J. B. Naylor, S. M. Steele, D. R. Elliott, K. J. Leyva, and R. F. Shand. 2005. Growth kinetics of extremely halophilic archaea (family Halobacteriaceae) as revealed by Arrhenius plots. J. Bacteriol. 187:923-929.
23. Rzhetsky, A., and M. Nei. 1992. A simple method for estimating and testing minimum evolution trees. Mol. Biol. Evol. 9:945-967.
24. Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.
25. Siew, N., Y. Azaria, and D. Fischer. 2004. The ORFanage: an ORFan database. Nucleic Acids Res. 32:D281-D283.
26. Sozhamannan, S., M. McKinstry, S. M. Lentz, M. Jalasvuori, F. McAfee, A. Smith, J. Dabbs, H. W. Ackermann, J. K. Bamford, A. Mateczun, and T. D. Read. 2008. Molecular characterization of a variant of Bacillus anthracis-specific phage AP50 with improved bacteriolytic activity. Appl. Environ. Microbiol. 74:6792-6796.
27. Takezaki, N., A. Rzhetsky, and M. Nei. 1995. Phylogenetic test of the molecular clock and linearized trees. Mol. Biol. Evol. 12:823-833.
28. Tamura, K., J. Dudley, M. Nei, and S. Kumar. 2007. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24:1596-1599.
29. Ye, X., J. Ou, L. Ni, W. Shi, and P. Shen. 2003. Characterization of a novel plasmid from extremely halophilic Archaea: nucleotide sequence and function analysis. FEMS Microbiol. Lett. 221:53-57.
30. Yin, Y., and D. Fischer. 2006. On the origin of microbial ORFans: quantifying the strength of the evidence for viral lateral transfer. BMC Evol. Biol. 6:63.
31. Zuckerkandl, E., and L. Pauling. 1965. Evolutionary divergence and convergence in proteins, p. 97-166. In V. Bryson and H. J. Vogel (ed.), Evolving genes and proteins. Academic Press, New York, NY.
