Conceived and designed the experiments: MD KM AL PR KK EG VM. Performed the experiments: BL KK CD ES EG VM AV MZ BR TB. Analyzed the data: EK MD MG YW KM MO AL AC ML BL KK JF CD ES EG VM BR TB AS AG SP EK KM. Contributed reagents/materials/analysis tools: EK MG. Wrote the paper: MD KM. Other: JGI Genome-sequencing project coordinator: AL. Genome annotation QC: EK. Genome annotation: ML KM. Data management/transfer, NCBI submission: SP. Genome Library construction: CD. Genome finishing: TB ES. Programmers: AS AG JF. Contributed to writing the paper: EK.
Bacteria of the genus Deinococcus are extremely resistant to ionizing radiation (IR), ultraviolet light (UV) and desiccation. The mesophile Deinococcus radiodurans was the first member of this group whose genome was completely sequenced. Analysis of the genome sequence of D. radiodurans, however, failed to identify unique DNA repair systems. To further delineate the genes underlying the resistance phenotypes, we report the whole-genome sequence of a second Deinococcus species, the thermophile Deinococcus geothermalis, which at its optimal growth temperature is as resistant to IR, UV and desiccation as D. radiodurans, and a comparative analysis of the two Deinococcus genomes. Many D. radiodurans genes previously implicated in resistance, but for which no sensitive phenotype was observed upon disruption, are absent in D. geothermalis. In contrast, most D. radiodurans genes whose mutants displayed a radiation-sensitive phenotype in D. radiodurans are conserved in D. geothermalis. Supporting the existence of a Deinococcus radiation response regulon, a common palindromic DNA motif was identified in a conserved set of genes associated with resistance, and a dedicated transcriptional regulator was predicted. We present the case that these two species evolved essentially the same diverse set of gene families, and that the extreme stress-resistance phenotypes of the Deinococcus lineage emerged progressively by amassing cell-cleaning systems from different sources, but not by acquisition of novel DNA repair systems. Our reconstruction of the genomic evolution of the Deinococcus-Thermus phylum indicates that the corresponding set of enzymes proliferated mainly in the common ancestor of Deinococcus. 
Results of the comparative analysis weaken the arguments for a role of higher-order chromosome alignment structures in resistance; more clearly define and substantially revise downward the number of uncharacterized genes that might participate in DNA repair and contribute to resistance; and strengthen the case for a role in survival of systems involved in manganese and iron homeostasis.
Deinococcus geothermalis belongs to the Deinococcus-Thermus group, which is deeply branched in bacterial phylogenetic trees and has putative relationships with cyanobacteria. The extremely radiation-resistant family Deinococcaceae comprises more than twenty distinct species that can survive acute exposures to ionizing radiation (IR) (10 kGy), ultraviolet light (UV) (1 kJ/m²), and desiccation (years), and can grow under chronic IR (60 Gy/hour). D. geothermalis was originally isolated from a hot pool at the Termi di Agnano, Naples, Italy, and subsequently identified at other locations poor in organic nutrients, including industrial paper machine water, deep ocean subsurface environments, and subterranean hot springs in Iceland.
D. geothermalis is distinct from most members of the genus Deinococcus in that it is a moderate thermophile, with an optimal growth temperature (Topt) of 50°C, is not dependent on an exogenous source of amino acids or nicotinamide for growth, is capable of forming biofilms, and possesses membranes with very low levels of unsaturated fatty acids compared to the other species. Based on the ability of wild-type and engineered D. geothermalis and D. radiodurans to reduce a variety of metals including U(VI), Cr(VI), Hg(II), Tc(VII), Fe(III) and Mn(III,IV), these two species have been proposed for bioremediation of radioactive waste sites maintained by the US Department of Energy (DOE). These characteristics were the impetus for whole-genome sequencing of D. geothermalis at DOE's Joint Genome Institute, and comparison with the mesophilic D. radiodurans (Topt, 32°C), to date the only other extremely IR-resistant bacterium for which a whole-genome sequence has been acquired.
Chromosomal and plasmid DNAs in extremely resistant bacteria are as susceptible to IR-induced DNA double strand breaks (DSBs) as in sensitive bacteria, and broad-based experimental and bioinformatic studies have converged on the conclusion that D. radiodurans uses a conventional set of DNA repair and protection functions, but with a far greater efficiency than IR-sensitive bacteria. This apparent contradiction is exemplified by work which showed that the repair protein DNA polymerase I (PolA) of D. radiodurans supports exceptionally efficient DNA replication at the earliest stages of recovery from IR, and could account for the high fidelity of RecA-mediated DNA fragment assembly. Paradoxically, however, IR-, UV-, and mitomycin-C (MMC)-sensitive D. radiodurans polA mutants are fully complemented by expression of the polA gene from the IR-sensitive Escherichia coli.
The reason why repair proteins, either native or cloned, in D. radiodurans function so much better after irradiation than in sensitive bacteria is unknown. The prevailing hypotheses of extreme IR resistance in D. radiodurans fall into three categories: (i) chromosome alignment, morphology and/or repeated sequences facilitate genome reassembly; (ii) a subset of uncharacterized genes encodes functions that enhance the efficiency of DNA repair; and (iii) non-enzymic Mn(II) complexes present in resistant bacteria protect proteins, but not DNA, from oxidation during irradiation, with the result that conventional enzyme systems involved in recovery survive and function with far greater efficiency than in sensitive bacteria. The extraordinary survival of Deinococcus bacteria following irradiation has also given rise to some rather whimsical descriptions of their derivation, including that they evolved on Mars. On the basis of whole-genome comparisons between two Deinococcus genomes and two Thermus genomes, we present a reconstruction of evolutionary events that are inferred to have occurred both before and after the divergence of the D. radiodurans and D. geothermalis lineages. We revise down substantially the number of potential genetic determinants of radiation resistance, predict a Deinococcus radiation response regulon, and consider the implications of these comparative-genomic findings for different models of recovery.
One approach to delineating a minimal set of genes involved in extreme resistance is to compare the whole-genome sequences of two phylogenetically related but distinct species that are equally resistant, whereby genes that are unique to either organism are ruled out, whereas shared genes are pooled as candidates for involvement in resistance. We show that D. geothermalis (DSM 11300) and D. radiodurans (ATCC BAA-816) are equally resistant to IR (60Co) (Figure 1A) and UV (254 nm) (Figure 1B) when pre-grown and recovered at their optimal growth temperatures, 50°C and 32°C, respectively. When recovered at 50°C, the survival of D. geothermalis exposed to 12 kGy was 1,000 times greater than at 32°C (Figure 1A). The extreme resistance to desiccation of D. geothermalis recovered at 50°C was demonstrated previously. Thus, D. geothermalis and D. radiodurans are well-suited to defining a conserved set of genes responsible for extreme resistance.
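The filtering logic described above amounts to set operations on ortholog assignments. The following is a minimal sketch of that idea, assuming each genome has already been reduced to a collection of ortholog-group identifiers; the function name and the group labels are hypothetical and serve only as an illustration.

```python
def partition_genes(groups_a, groups_b):
    """Split ortholog-group IDs into shared and species-specific sets."""
    a, b = set(groups_a), set(groups_b)
    return {
        "shared": a & b,   # pooled as candidates for a conserved resistance role
        "only_a": a - b,   # present in one species only: ruled out
        "only_b": b - a,   # present in the other species only: ruled out
    }

# Hypothetical ortholog-group labels, not taken from the paper.
d_radiodurans = ["og1", "og2", "og3", "og4"]
d_geothermalis = ["og2", "og3", "og5"]

result = partition_genes(d_radiodurans, d_geothermalis)
print(sorted(result["shared"]))
```

In practice the hard part is the ortholog assignment itself, which this sketch takes as given.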
The random shotgun method was used to acquire the complete sequence of the D. geothermalis (DSM 11300) genome, which comprises a main chromosome (2,467,205 base pairs (bp)) and two megaplasmids (574,127 bp and 205,686 bp). The general structure of the predicted D. geothermalis genome was tested by pulsed field gel electrophoresis (PFGE) of genomic DNA linearized in vivo by exposure to IR (0.2 kGy), and by restriction endonuclease (SpeI) cleavage (Figure 1C). The IR-treatment revealed the existence of a ~570 kb megaplasmid in D. geothermalis, and the SpeI-treatment yielded the expected number of chromosomal bands: 3 singlets (632 kb, 376 kb and 282 kb) and one doublet (574/579 kb); the plasmids do not contain a SpeI site. In comparison, IR-treated D. radiodurans (ATCC BAA-816) subjected to PFGE displayed the presence of the DR412 (412 kb) and DR177 (177 kb) megaplasmids, as previously observed. The approximately 206 kb D. geothermalis megaplasmid was not visualized by PFGE although its size lies between those of the two D. radiodurans megaplasmids, which were readily observed (Figure 1C). Consistent with this, the abundance of DNA clones for the 206 kb megaplasmid was significantly lower than that for the 574 kb megaplasmid during construction of the D. geothermalis genome-library used for sequencing (data not shown). Thus, the 574 kb megaplasmid of D. geothermalis exists at higher copy-number than the 206 kb megaplasmid.
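As a rough consistency check on the numbers quoted above, the SpeI fragment sizes can be summed and compared with the sequenced chromosome length. This is only a sketch: the fragment sizes are gel estimates, and treating the five listed bands as the complete chromosomal digest is an assumption based on the description in the text.

```python
# Replicon sizes in base pairs, as stated in the text.
CHROMOSOME_BP = 2_467_205
MEGAPLASMIDS_BP = [574_127, 205_686]

# SpeI bands in kb: three singlets plus the two members of the doublet.
SPEI_FRAGMENTS_KB = [632, 376, 282, 574, 579]

def fragment_total_bp(fragments_kb):
    """Sum gel-estimated fragment sizes and convert kb to bp."""
    return sum(fragments_kb) * 1000

genome_total_bp = CHROMOSOME_BP + sum(MEGAPLASMIDS_BP)
digest_total_bp = fragment_total_bp(SPEI_FRAGMENTS_KB)
relative_gap = abs(CHROMOSOME_BP - digest_total_bp) / CHROMOSOME_BP

print(genome_total_bp, digest_total_bp, round(relative_gap, 4))
```

Under these assumptions the digest accounts for the chromosome to within about 1%, which is within the resolution one would expect from PFGE size estimates.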
Comparison of the general genome features of D. geothermalis and D. radiodurans revealed major differences in genome partitioning, and in the number of small noncoding repeats (SNRs) (Table 1).
We previously demonstrated homologous relationships between the DR412 megaplasmid of D. radiodurans and the sole 233 kb megaplasmid (pTT27) of T. thermophilus. Based on the gene contents of DR412 and pTT27, we concluded that these megaplasmids evolved from a common ancestor (Figure S1), are essential to the survival of both species, and appear to serve as a sink for horizontally transferred genes. In contrast, the 574 kb megaplasmid (DG574) of D. geothermalis is distinct from pTT27, and appears to have been derived from a fusion of DR412 and DR177 (Table S1), followed by numerous rearrangements. Levels of gene order conservation for the D. geothermalis and D. radiodurans chromosomes and megaplasmids were determined by genomic dot plots (Figure S2). The dot plots of the chromosomes showed a clear pattern characteristic of chromosomes of relatively closely related bacteria that retain significant colinearity of the gene order. The X-shape pattern is thought to arise from inversions of a chromosomal segment around the origin of replication. By contrast, DR412 and DR177 did not display any discernible colinearity (Figure S2B), indicating substantially greater levels of rearrangement in the megaplasmids.
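The X-shaped dot-plot pattern mentioned above can be illustrated by classifying ortholog pairs according to whether they fall near the diagonal (conserved order) or the anti-diagonal (order inverted around the origin). The sketch below is not the method used for Figure S2; the function name, the tolerance, the toy coordinates, and the assumption that both genomes are numbered from the replication origin are all illustrative.

```python
def classify_dotplot(points, len_a, len_b, tol=0.05):
    """Count ortholog pairs lying near the diagonal or the anti-diagonal."""
    counts = {"diagonal": 0, "antidiagonal": 0, "off": 0}
    for pos_a, pos_b in points:
        x, y = pos_a / len_a, pos_b / len_b   # normalise positions to [0, 1]
        if abs(x - y) <= tol:
            counts["diagonal"] += 1           # same relative position in both genomes
        elif abs(x + y - 1.0) <= tol:
            counts["antidiagonal"] += 1       # mirrored position, as after an inversion
        else:
            counts["off"] += 1                # no colinearity signal
    return counts

# Toy ortholog coordinates (position in genome A, position in genome B).
points = [(100, 110), (500, 480), (200, 800), (900, 120), (300, 50)]
counts = classify_dotplot(points, 1000, 1000)
print(counts)
```

A chromosome pair with an X-shaped plot would populate both the diagonal and anti-diagonal bins, whereas heavily rearranged replicons would fall mostly into the "off" bin.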
Dozens of small noncoding repeats (SNRs) of an unusual, mosaic structure have been identified in the D. radiodurans genome, suggesting a possible role in resistance. In stark contrast, no mosaic-type SNRs were found in the D. geothermalis genome (Table 1), suggesting that SNRs are not involved in recovery from radiation or desiccation. Further, there are about 20 DNA repeats in D. radiodurans that contain oligoG stretches (Figure S3). Such DNA sequences might adopt an ordered helical structure (G-quadruplex), predicted to form parallel four-stranded complexes capable of promoting chromosomal alignment. However, the absence of such oligoG stretches in the G-rich sequence of D. geothermalis (G+C content, 66%) indicates that G-quartets are not essential for resistance. In contrast, the D. geothermalis genome contains CRISPR repeats, whereas D. radiodurans does not (Table 1). CRISPR repeats are part of a predicted RNA-interference-based system implicated in immunity to phages and integrative plasmids. Since no homologous prophages were identified in the two deinococci, and no CRISPR repeats are present in D. radiodurans, these sequences apparently have no role in determining levels of resistance either.
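Two of the sequence properties discussed above, G+C content and the presence of oligoG stretches, are simple to compute. The following sketch shows one way to do so; the minimum run length of four is an assumption (the text does not define how long an oligoG stretch must be), and the toy sequence is invented for illustration.

```python
import re

def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def oligo_g_runs(seq, min_len=4):
    """Return (start, length) for each run of at least min_len consecutive Gs."""
    pattern = r"G{%d,}" % min_len
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, seq.upper())]

toy = "ATGGGGCCGGGGGTAGC"   # invented sequence, not from either genome
print(round(gc_content(toy), 3))
print(oligo_g_runs(toy))
```

A full analysis would also scan the reverse strand (runs of C) and apply a G-quadruplex motif definition rather than a single run threshold.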
The 206 kb D. geothermalis megaplasmid (DG206), predicted by genome sequencing, is present at a lower copy-number than DG574 (Figure 1C). The presence of DG206 in genomic DNA preparations was confirmed in D. geothermalis (DSM 11300) DNA samples used for sequencing and from independent preparations by polymerase chain reaction (PCR) using DG206-specific primers that yielded DNA products of the predicted sizes (Figure S4). DG206 contains 205 predicted open reading frames (ORFs), of which 103 have significant similarity to genes in current databases; approximately 40 are identical to genes in either the D. geothermalis chromosome or DG574; and 28 have homologs in D. radiodurans, including 3 ORFs encoding highly diverged single-stranded DNA-binding proteins. Among other sequences of interest in DG206 are 22 transposon-related ORFs; 11 ORFs related to phage proteins; and 5 ORFs related to conjugal plasmid replication systems. In summary, DG206 is enriched for phage-, integrative plasmid- or transposon-related ORFs, but encodes no known metabolic enzymes and very few replication or repair proteins. Thus, DG206 seems to mimic the trend seen for ORFs in the smallest plasmid (46 kb) of D. radiodurans, with no predicted genes implicated in resistance.
Our previous analysis of the major events in the evolution of the Deinococcus-Thermus group was based on D. radiodurans (ATCC BAA-816) and T. thermophilus strain HB27. The current study includes additional comparisons with D. geothermalis (DSM 11300) and a second strain of T. thermophilus (HB8). Based on the standard approach of COGs (clusters of orthologous groups of proteins), COGs for Deinococcus and Thermus (tdCOGs) were constructed (Table S2). The tdCOGs were used as a framework for the whole-genome comparisons and evolutionary reconstructions (Figure 2). Using a weighted parsimony method and distantly related bacteria as outgroups, the evolutionary reconstructions revealed significant and independent expansion of the repertoire of genes in the Deinococcus and Thermus lineages following their divergence from a common ancestor. The expansion appears to have occurred through both lineage-specific duplications and gene acquisition via horizontal gene transfer (HGT). The high level of protein family expansion (paralogy), and the larger complement of species-specific genes acquired principally by HGT, could account for the existence of 600–900 more genes in Deinococcus than Thermus.
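To make the idea of a weighted parsimony reconstruction concrete, the sketch below applies a Sankoff-style dynamic program to the presence or absence of a single gene on a small tree with one outgroup. It is not the authors' implementation: the tree topology is simplified, and the gain and loss weights (1.5 and 1.0) are arbitrary values chosen for illustration, since the actual weights are not stated here.

```python
INF = float("inf")

def sankoff(tree, leaf_states, gain_cost=1.5, loss_cost=1.0):
    """Return (min cost if gene absent at this node, min cost if present)."""
    if isinstance(tree, str):                      # leaf: only the observed state is allowed
        return (INF, 0.0) if leaf_states[tree] else (0.0, INF)
    absent_cost, present_cost = 0.0, 0.0
    for child in tree:
        c_absent, c_present = sankoff(child, leaf_states, gain_cost, loss_cost)
        # Parent lacks the gene: child also lacks it, or gains it on this branch.
        absent_cost += min(c_absent, c_present + gain_cost)
        # Parent has the gene: child keeps it, or loses it on this branch.
        present_cost += min(c_present, c_absent + loss_cost)
    return (absent_cost, present_cost)

# Simplified topology: (outgroup, ((Deinococcus pair), (Thermus pair))).
tree = ("outgroup",
        (("D_radiodurans", "D_geothermalis"),
         ("T_thermophilus_HB27", "T_thermophilus_HB8")))

# A gene found only in the two Deinococcus genomes.
leaf_states = {
    "outgroup": False,
    "D_radiodurans": True,
    "D_geothermalis": True,
    "T_thermophilus_HB27": False,
    "T_thermophilus_HB8": False,
}

root_absent, root_present = sankoff(tree, leaf_states)
print(root_absent, root_present)
```

With these weights the cheapest scenario has the gene absent at the root, corresponding to a single gain on the branch leading to the Deinococcus ancestor. Changing the gain-to-loss ratio can flip the inference toward an ancestral gene lost in Thermus, which is why the choice of weights and outgroups matters.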
Our previous comparative analysis of T. thermophilus and D. radiodurans identified several evolutionary trends that correlate with the distinct phenotypes of these bacterial lineages. These trends were further refined through the analysis of the D. geothermalis sequence, and the unique features of the Deinococcus lineage were used to better define the pathways implicated in extreme radiation resistance (Table S2). One such trend in Deinococcus, in comparison to the inferred common ancestor of the Deinococcus-Thermus group, is the acquisition of a set of genes involved in transcriptional regulation and signal transduction. Examples of acquired transcriptional regulators include two proteins of the AsnC family, two proteins of the GntR family, and one protein of the IclR family. These families likely are involved in amino acid degradation and metabolism. Further, the Deinococcus lineage acquired at least six TetR and MerR family regulators dedicated to diverse stress response pathways. Among the acquired signal transduction genes, the most notable examples are two-component regulators of the NarL family (four distinct tdCOGs) involved in the regulation of a variety of oxygen and nitrate-dependent pathways of Escherichia coli; and the presence of several diguanylate cyclase (GGDEF) domain-containing proteins supports an increased role of cyclic diGMP in Deinococci. A second evolutionary trend in Deinococcus is the acquisition of genes encoding proteins involved in nucleotide metabolism, in particular, degradation and salvage. For example, this group includes genes for xanthine dehydrogenase, urate oxidase, deoxynucleoside kinases, thymidine kinase, FlaR-like kinase, and two UshA family 5′-nucleotidases.
Other gene-gains in Deinococcus relative to Thermus include genes for enzymes of amino acid catabolism and the tricarboxylic acid (TCA) cycle (Table S2). Beyond the differences reported previously, the new reconstructions indicate that several catabolic genes of Deinococcus were already present in the Deinococcus-Thermus common ancestor. Following their divergence, however, the Thermus lineage appears to have lost many of these systems, including all enzymes involved in histidine degradation. By contrast, the Deinococcus lineage not only retained a majority of the predicted ancestral catabolic functions, but acquired new pathways including ones involved in the degradation of tryptophan and lysine, and several peptidases (Table S2). A hallmark of the Deinococcus lineage is the presence of two predicted genes for malate synthase, an enzyme of the glyoxylate bypass, in which isocitrate is converted into succinate and glyoxylate, allowing carbon that enters the TCA cycle to bypass the formation of α-ketoglutarate and succinyl-CoA. It has been proposed that the strong upregulation of the glyoxylate bypass observed in D. radiodurans following irradiation facilitates recovery by limiting the production of metabolism-induced reactive oxygen species (ROS). Dgeo_2616/DRA0277 is the malate synthase ortholog present in the Thermus lineage, but the second predicted deinococcal malate synthase (Dgeo_2611/DR1155) is unique and only distantly related to homologs in other bacteria. Although the two predicted deinococcal malate synthases could have similar functions, the genomic context of Dgeo_2611/DR1155 indicates otherwise; both genes are located in a predicted operon with two cyclic amidases of unknown biochemical function.
In a broader context, the present reconstruction indicates that many expanded families of paralogous genes in D. radiodurans proliferated before the emergence of the common ancestor of the Deinococci, but the expansions were not present in the ancestor of the Deinococcus-Thermus group (Table 2). Such Deinococcus-specific expanded families include the Yfit/DinB family of proteins, acetyltransferases of the GNAT family, Nudix hydrolases, α/β superfamily hydrolases, calcineurin family phosphoesterases, and others. Many of these expansions are for predicted hydrolases, phosphatases in particular, but their substrate specificities are either unknown or their affinity for known substrates is extremely low. It has been proposed, therefore, that the majority of these predicted enzymes perform cell-cleaning functions including degradation of damaged nucleic acids, proteins and lipids, and/or other stress-induced cytotoxins. The global proliferation of these enzymes in the Deinococcus lineage (Table S3) supports the acquisition of chemical stress-resistance determinants early in its evolution; and the independent proliferation of determinants within these deinococcal species (e.g., calcineurin phosphatases, Figure S5) might represent secondary adaptations to specific stress environments. In summary, these findings indicate that the Deinococcus stress-resistance phenotypes evolved continuously, both by lineage-specific gene duplications and by HGT from various sources (Tables S3, S4 and S5).
The comparison of gene-gain and gene-loss events in the D. radiodurans and D. geothermalis lineages reveals numerous differences, many of which correlate with their distinct metabolic phenotypes (Figure 3).
The most notable, distinctive feature of D. geothermalis is a greater abundance of genes for sugar metabolism enzymes, which could have been acquired after the divergence of the two Deinococci. The largest group within this set of genes is predicted to be involved in xylose utilization, needed for growth on plant material. D-xylose, which forms xylan polymers, is a major structural component of plant cell walls, and the presence of genes for aldopentose (xylose)-degradation explains why D. geothermalis is a persistent contaminant in paper mills. Specifically, D. geothermalis contains genes encoding xylanases (Dgeo_2723; Dgeo_2722), an ABC-type xylose transport system (Dgeo_2699-Dgeo_2703), xylose isomerase (Dgeo_2375, Dgeo_2692, Dgeo_2693, Dgeo_2826), and xylose kinase (Dgeo_2691). Several of the genes that encode enzymes of xylose metabolism form paralogous families (Table S4), most of which form a cluster on the megaplasmid DG574 (Dgeo_2703-Dgeo_2687), which also contains two gene clusters predicted to be involved in carbohydrate utilization (Dgeo_2669-Dgeo_2693, Dgeo_2832-Dgeo_2812). By comparison, there are no large clusters of functionally related genes on the D. geothermalis chromosome; approximately 80 and 120 genes encoding proteins involved in sugar metabolism were identified on DG574 and the chromosome, respectively. The putative xylose metabolism functions of D. geothermalis appear to represent an expansion of a pre-existing, broad and diverse set of functions underlying the saccharolytic phenotypes of all Deinococci. In contrast, D. radiodurans has a proteolytic lifestyle, where a loss of various amino acid biosynthetic pathways (Figure 3) was accompanied by a gain of several predicted peptidases (DR0964, DR1070, DR2310, DR2503) and a urease system (DRA0311-DRA0319). Thus, the evolutionary processes underlying the emergence of extreme resistance in Deinococci appear not to be dependent on a particular set of genes for sugar- or nitrogen-metabolism.
In summary, these findings support the view that DG574 is essential to the natural growth modes of D. geothermalis, which is a proficient saccharolytic organism, and strengthen the case that the megaplasmids in the Deinococcus-Thermus group are major receptacles of horizontally acquired genes, as proposed previously.
Further supporting the notion that a distinct set of metabolic genes is not a prerequisite for high levels of radioresistance, there are clear differences in sulfate and energy metabolism between D. geothermalis and D. radiodurans. In agreement with previously published results, the prototrophic D. geothermalis has orthologs of the nadABCD genes that are required for nicotinamide adenine dinucleotide (NAD) biosynthesis, whereas the auxotrophic D. radiodurans lacks these genes and is dependent on an exogenous source of this coenzyme. Another example illustrating the relationship in D. radiodurans between gene-loss and its growth requirements is that of cobalamin (vitamin B12). Whereas D. geothermalis and T. thermophilus are not dependent on B12 in minimal medium, D. radiodurans can utilize inorganic sulfate as the sole source of sulfur only when vitamin B12 is present. Conversely, D. geothermalis has lost several genes for enzymes of protoheme biosynthesis (HemEZY), so that the pathway in D. geothermalis likely yields siroheme under the microaerophilic conditions that predominate at its Topt; the solubility of dioxygen in water at 50°C is significantly lower than at 32°C, the Topt of D. radiodurans.
There are also important differences between the systems for enzymes implicated in energy transformation in D. geothermalis and D. radiodurans. The D. geothermalis chromosome encodes two heme-copper cytochrome oxidases of types ba3 and caa3; and a cytochrome bd ubiquinol oxidase system (Dgeo_2707-Dgeo_2704), known to be expressed under oxygen-limiting conditions, is encoded by DG574. In contrast, D. radiodurans encodes only the caa3 oxidase system (DR2616-DR2620), which apparently was present in the Deinococcus-Thermus common ancestor. Furthermore, D. geothermalis carries genes for proteins that comprise an assimilatory nitrite NAD(P)H reductase and a molybdopterin-cofactor-dependent nitrate reductase system (Dgeo_2392-Dgeo_2389), which also is known to be expressed under anaerobic conditions; and D. geothermalis encodes several predicted multi-copper oxidases (Dgeo_2590, Dgeo_2559, Dgeo_2558) that are not present in D. radiodurans and are most similar to homologs from nitrogen-fixing bacteria. Since nitrogen fixation in D. geothermalis has not yet been studied, the possibility remains that these enzymes are involved in dissimilatory anaerobic reduction of nitrate or nitrite. D. geothermalis, but not D. radiodurans, also encodes a formate dehydrogenase, which is related to nitrate reductase and has a possible role in energy transfer under anaerobic conditions.
In general, the evolutionary trends in the D. radiodurans lineage appear to mimic closely those of the Deinococcus lineage, which is evident from the analysis of expanded families of paralogous genes (Table S5). In particular, proliferation of genes for the Yfit/DinB family, Nudix enzymes, acetyltransferases of the GNAT superfamily, and the α/β hydrolase superfamily was observed (Table 2). Plausible resistance-related functions readily can be proposed for these and other expanded families of deinococci. For example, hydrolases might degrade oxidized lipids; Yfit/DinB proteins might be involved in cell damage-related pathways; subtilisin-like proteases might degrade proteins oxidized during irradiation; and the Nudix-related hydrolase, diadenosine polyphosphatase (ApnA), yields adenosine, a molecule that has been implicated in cytoprotection from oxidative stress and radiation.
Several families expanded in D. radiodurans are predicted to possess functions potentially relevant to stress response, but are not conserved in D. geothermalis; most likely, non-conserved families can be disqualified as major contributors to the extreme IR and desiccation resistance phenotypes. Families that are specifically expanded in D. radiodurans include the TerZ family of proteins, which are predicted to confer resistance to various DNA damaging agents; secreted proteins of the PR1 family, whose homologs are involved in the response to pathogens in plants, and resistance to hydrophilic organic solvents in yeast; PadR-like regulators, which are implicated in the regulation of amino acid catabolism and cellular response to chemical stress agents and drugs; TetR/AcrR transcriptional regulators, which are involved in antibiotic resistance regulation; and KatE-like catalases, which would decompose hydrogen peroxide. In contrast, there are family expansions which are shared by D. radiodurans and D. geothermalis, but have no obvious role in radiation or desiccation resistance. These include SAM-dependent methyltransferases (COG0500) and an uncharacterized family of predicted P-loop kinases (COG0645). In some bacteria, homologs of these kinases are fused to phosphotransferases that mediate resistance to aminoglycosides.
Since the IR-, UV- and desiccation-resistance profiles of D. radiodurans and D. geothermalis are identical (Figure 1), the subset of stress response genes in D. radiodurans that are not unique, but exist in excess compared to D. geothermalis, are unlikely to be required for extreme resistance either (Figure 3). This subset includes two Cu-Zn superoxide dismutases (SOD), a peroxidase, two HslJ-like heat shock proteins, and many genes implicated in antibiotic resistance (Table S5). Consistent with this, SodA and KatA of D. radiodurans can be disrupted with almost no loss in radiation resistance, and antibiotics have little effect on survival following irradiation provided corresponding antibiotic resistance genes are present.
Considerable independent gene-gain was detected in both D. geothermalis and D. radiodurans lineages in several other functional categories including transcriptional regulation, signal transduction, membrane biogenesis, inorganic ion metabolism, and to a lesser extent DNA replication and repair (Figure 3). In general, regulatory functions mirror the metabolic and stress-response-related differentiation of these two species outlined above. For instance, among the 12 genes for predicted transcriptional regulators that apparently were acquired in the D. geothermalis lineage, five are similar to ones known to be involved in the regulation of sugar metabolism in other bacteria, two of the RpiR family and three of the AraC family. By contrast, D. radiodurans has at least 25 unique genes for transcriptional regulators: three of the ArcR family; 16 of the Xre family; one of the CopG/Arc/MetJ family; and five of a species-specific expanded family reported previously that likely is responsible for stress-response control. Other potentially independent gains involve genes predicted to be involved in signal transduction systems. D. radiodurans, for example, encodes photochromic histidine kinase, a protein that has been extensively studied in D. radiodurans and plays a role in the regulation of pigment biosynthesis, but is missing in D. geothermalis. Conversely, D. geothermalis encodes a putative negative regulator of sigma E, a periplasmic protein of the RseE/MucE family (Dgeo_2271). So far, members of the RseE/MucE family have been detected only in proteobacteria, where they regulate the synthesis of alginate, an extracellular polysaccharide which plays a key role in the formation of biofilms. D. geothermalis, however, likely does not produce alginate itself since it has no orthologs of the genes of the alginate pathway. On the other hand, D. geothermalis has clusters of genes implicated in exopolysaccharide biosynthesis, with the most notable cluster located on DG574 (Dgeo_2671-Dgeo_2646). It seems likely that this cluster is involved in the biosynthesis of exopolysaccharides, which might facilitate biofilm formation in D. geothermalis, and the Dgeo_2271 protein could be a regulator of this process. Overall, D. radiodurans encodes approximately 470 unique, uncharacterized proteins, for which no function could be predicted, compared to approximately 286 such proteins in D. geothermalis. Thus, an additional 756 unique, uncharacterized genes of the Deinococcus lineage can be excluded from the pool of putative determinants of the extreme IR, UV and desiccation resistance phenotype.
Over the last two decades, extensive experimental and comparative-genomic analyses have been dedicated to the identification and evolutionary origin of the genetic determinants of radiation resistance in D. radiodurans. Early on, it became evident that the survival mechanisms underlying extreme radiation resistance in D. radiodurans probably were not unique. In 1994, for example, IR-sensitive D. radiodurans polA mutants were fully complemented by expression of the polA gene from the IR-sensitive E. coli; and in 1996, UV-sensitive D. radiodurans uvrA mutants were complemented by uvrA from E. coli, suggesting that these recombination and excision repair genes are necessary but not sufficient to produce extreme DNA damage resistance. Following the whole-genome sequencing of D. radiodurans in 1999, comparative-genomic analysis revealed many distinctive genomic features that subsequently became the focus of high throughput experiments, including the analysis of transcriptome and proteome dynamics of D. radiodurans recovering from IR. Surprisingly, the cellular transcriptional response to IR in D. radiodurans appeared largely stochastic, and mutant analyses confirmed that many of the highly induced uncharacterized genes were unrelated to survival. So far, those correlative studies have failed to produce a coherent, comprehensive picture of the complex interactions between different genes and systems that have been thought to be important for the resistance phenotype.
The complete sets of orthologous genes in D. radiodurans and D. geothermalis are listed in Table S2. Within the subgroup of genes in D. radiodurans previously implicated in resistance by transcriptional induction following exposure to IR (3 hours after irradiation and displaying more than a 2-fold induction), 45% have no orthologs in D. geothermalis. This raises the possibility that many genes induced in irradiated D. radiodurans do not functionally participate in recovery, or that D. geothermalis carries a distinct set of resistance determinants. From the subgroup of putative resistance genes lacking counterparts in D. geothermalis, we constructed D. radiodurans knockouts of four representative genes: i) a ligase predicted to be involved in DNA repair (DRB0100); ii) a LEA76 desiccation resistance protein homolog (DR0105); iii) a predicted protein implicated in stress response (DR2221); and iv) a protein of unknown function (DR0140). Homozygous disruptions of each of these genes in D. radiodurans (Figure S6) had no significant effect on IR resistance (Figure 4).
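The selection step described above (genes induced more than 2-fold, then checked for an ortholog in the second genome) can be sketched as a simple filter. In the example below only the locus tags come from the text; the fold-induction values are hypothetical placeholders, the function name is illustrative, and the toy result is not meant to reproduce the 45% figure.

```python
def fraction_without_ortholog(induction, with_ortholog, threshold=2.0):
    """Among genes induced above the threshold, return the fraction lacking an ortholog."""
    induced = [gene for gene, fold in induction.items() if fold > threshold]
    if not induced:
        return 0.0
    missing = [gene for gene in induced if gene not in with_ortholog]
    return len(missing) / len(induced)

# HYPOTHETICAL fold-induction values; only the locus tags appear in the text.
induction = {
    "DRB0100": 5.1, "DR0105": 3.2, "DR2221": 2.4, "DR0140": 2.8,
    "DR0003": 6.0, "DR0070": 4.0, "DR0423": 9.0, "DR0326": 1.5,
}
# Genes described in the text as conserved in D. geothermalis.
with_ortholog = {"DR0003", "DR0070", "DR0423", "DR0326"}

frac = fraction_without_ortholog(induction, with_ortholog)
print(round(frac, 3))
```

Note that DR0326 is excluded by the threshold in this toy data set, so it does not contribute to either the numerator or the denominator.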
By contrast, most of the genes whose mutants display radiation-sensitive phenotypes in D. radiodurans are conserved in D. geothermalis. To date, 15 single-gene mutants of D. radiodurans have been reported to be moderately to highly radiation-sensitive; of these, 13 genes have orthologs in D. geothermalis (Table 3). The exceptions are DR1289 and DR0171, which encode the DNA helicase RecQ and a transcriptional regulator, respectively (Table 3). Remarkably, 10 of the 15 genes are conserved in other bacteria and are well-characterized components of DNA repair pathways. However, 5 of the 15 genes (DR0003, DR0070, DR0326, DR0423, DRA0346) are unique to the Deinococcus lineage, supporting the existence of at least a few novel resistance mechanisms.
Given that the two Deinococcus species are equally resistant to IR (Figure 1A), genes dedicated specifically to the extreme radiation/desiccation response are expected to belong to the set of tdCOGs. D. radiodurans and D. geothermalis share 231 tdCOGs that are relatively uncommon in other prokaryotes, and 63 of these are unique to the Deinococcus lineage. Using the most sensitive methods available to predict function, we reanalyzed these tdCOGs by remote sequence similarity searches and genomic context analysis. Interpretation of such analyses, however, is constrained by the complexity and ambiguities inherent in the approach, and by the knowledge base. In contrast, many cytosolic proteins (e.g., RecA, PolA, SodA and KatA) are known to be intimately involved in resistance, so we present functional predictions for 50 genes (Table S6). Among the predictions for cytosolic proteins, several are new and potentially relevant to resistance. For example, DR0644 (Figure 5A) is predicted to be a distinct Cu/Zn superoxide dismutase that could defend against metabolism-induced oxidative stress during recovery (Table S7); and DR0449 (Figure 5B) is a divergent member of the RNase H family that is fused to a novel domain, a combination that is currently unique to Deinococcus. Another functional insight concerns DR0041/Dgeo_0188, which is a paralog of DR0423 (DdrA) (Figure 5C), a member of the RAD22/Rad52 family of single-stranded annealing proteins that yields a moderately sensitive phenotype in D. radiodurans upon disruption. Interestingly, the radiation-sensitive T. thermophilus encodes a homolog of DdrA (TTC1923), indicating that this protein had an ancestral role that was not directly related to radiation resistance. Notably, we continue to find proteins in Deinococcus species that are only remotely similar to well-characterized enzymes in other organisms, and it is difficult to predict their role in the cell or in radiation resistance.
For example, we have identified a protein that is conserved in both D. geothermalis and D. radiodurans and is distantly related to enzymes of the QueF/FolE family, which are involved in queuosine/folate biosynthesis (Figure 5D), but its role in the Deinococci remains undefined. Collectively, these results support the conclusion that many genes that are significantly induced in irradiated D. radiodurans are not involved in recovery (Table 3). Thus, the genome of D. geothermalis is a resource of major importance in delineating a reliable minimal set of resistance determinants, by corroborating those that are conserved and ruling out those that are unique.
A potential radiation-desiccation response regulon and the corresponding regulator common to D. radiodurans and D. geothermalis were identified using the approach developed by Mironov et al. In the search for such a regulator, we used a training set composed of sequences flanking D. radiodurans genes that were strongly upregulated by IR, and for which the corresponding mutants were radiosensitive (Table 3). The upstream regions of several genes from the training set (DR0326, ddrD; DR0423, ddrA; DRA0346, pprA; DR0070, ddrB) revealed a strong palindromic motif, designated the radiation/desiccation response motif (RDRM). The RDRM was used to build an initial positional weight matrix profile, which was then used to scan the entire D. radiodurans genome. This genome survey picked up a similar motif in the upstream regions of other genes upregulated after irradiation. The upstream regions with the highest scores (DR0219, DR0906, DR1913 and DR0659) were then used to better define the RDRM, and the complete genomes of D. radiodurans and D. geothermalis were scanned with the updated motif. Using moderately relaxed parameters (Materials and Methods), approximately 120 genes in each of the Deinococcus genomes were selected by the screen. The final, most conservative prediction of the radiation/desiccation response (RDR) regulon consisted of two groups: (i) a set of orthologous genes present in both Deinococcus species that contain the RDRM; and (ii) a set of unique genes of D. radiodurans that contain the RDRM and are upregulated during the recovery from irradiation. Since microarray data for D. geothermalis are not available, it was not possible to predict a set of unique RDRM-dependent genes for this species. Table 4 lists the set of genes predicted to comprise the regulon together with the corresponding RDRM sites (Figure 6). Collectively, the RDR regulon is predicted to consist of a minimum of 29 genes in D. radiodurans and 25 genes in D. geothermalis, contained within 20 operons in each species.
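The motif search described above (build a positional weight matrix from aligned sites, then scan a sequence for high-scoring windows) can be sketched as follows. This is a minimal illustration only, not the procedure of Mironov et al: the aligned sites and the scanned sequence are invented toy data rather than real RDRM sites, and the uniform background, the pseudocount and the score threshold are all assumptions chosen for the example.

```python
import math

ALPHABET = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from equal-length aligned sites.

    Assumption: a uniform background (0.25 per base); a GC-rich genome
    would call for genome-specific background frequencies.
    """
    width = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm

def scan(pwm, sequence, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in ALPHABET for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Hypothetical aligned sites (toy palindrome-like data, NOT the real RDRM).
toy_sites = ["ACGTTAACGT", "ACGTTAACGT", "ACGATAACGT", "ACGTTATCGT"]
pwm = build_pwm(toy_sites)

# Hypothetical upstream region containing one copy of the toy consensus.
toy_upstream = "GGGCCC" + "ACGTTAACGT" + "GGGCCCGG"
hits = scan(pwm, toy_upstream, threshold=10.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

In an iterative search of the kind described in the text, the top-scoring windows would be added to the site list and the matrix rebuilt before rescanning; relaxing the threshold corresponds to the "moderately relaxed parameters" that enlarge the candidate set.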
The RDR regulon is dominated by DNA repair genes, including the recombinational repair proteins RecA and RecQ , ; the mismatch repair proteins MutS and MutL, that are located in one operon in D. geothermalis; and the UvrB and UvrC proteins, which are involved in nucleotide excision repair (Table 4). In addition, the predicted RDR regulon includes the transketolase gene. In bacteria, transketolase is a key enzyme of the pentose-phosphate pathway for carbohydrate metabolism and is known to be induced by a variety of stress conditions including cold shock, and mutagens that trigger the SOS response . Moreover, the pentose-phosphate pathway in D. radiodurans is reported to facilitate DNA excision repair induced by UV irradiation and hydrogen peroxide (H2O2) . The RDRM also precedes a conserved histidine catabolism operon . Several bacterial biodegradative and related operons are known to be differentially induced in response to a decline in biosynthetic and energy-generating activities under oxidative stress . For example, the TCA cycle in D. radiodurans is strongly down-regulated following irradiation , whereas the glyoxylate bypass of the TCA cycle, and the His operon are induced . Several studies have provided direct evidence that survival of D. radiodurans following exposure to IR depends on a coordinated metabolic response and a high level of respiratory control , .
The regulation of gene expression in D. radiodurans during recovery from IR has been the subject of considerable investigation. Recently, it has been shown that the induction of recA in irradiated D. radiodurans is regulated by the IrrE/PprI protein , , which consists of two domains, a Xre-like HTH domain and a Zn-dependent protease. In both D. radiodurans and D. geothermalis, the irrE gene is located upstream of the folate biosynthesis operon, but appears to be regulated independently . Since recA in D. radiodurans is strongly induced following irradiation , , it was surprising that the irrE gene of D. radiodurans was constitutively expressed, showing no post-irradiation induction , , . Furthermore, the IrrE/PprI protein has an unusual domain structure and does not appear to bind the promoter region of recA or other induced genes .
Compared to radiosensitive bacteria, the regulatory mechanisms underlying the response to radiation in D. radiodurans are still poorly characterized. For example, the LexA-regulated SOS-dependent radiation response regulon of E. coli is well-defined, but an equivalent system in D. radiodurans has not been identified. D. geothermalis has one lexA gene (DG1366) and D. radiodurans has two lexA paralogs (DRA0344, DRA0074). However, the lexA genes in D. radiodurans are not induced after irradiation, are not involved in RecA induction, and are not preceded by RDRM sites. Therefore, LexA is not a candidate for the role of the regulator of the Deinococcus RDR regulon. In the microarray experiments of Liu et al, several putative regulators were upregulated in D. radiodurans following exposure to 15,000 Gy. In contrast, at lower doses (3,000 Gy), the D. radiodurans microarray experiments of Tanaka et al detected only one upregulated putative regulator, DdrO (DR2574). An orthologous gene for DdrO is present in D. geothermalis (Dgeo_0336). DdrO is a Xre family protein and is the only Deinococcus gene for a predicted regulator that is preceded by an RDRM site (Table 4). This arrangement is common to many stress response regulators, e.g., the lexA genes of many species. Thus, we propose that DdrO is the global regulator of the RDR regulon in the Deinococcus lineage.
In 1971, Moseley and Mattingly reported the first mutant analyses for D. radiodurans that showed that its recovery from radiation is dependent on DNA repair . Subsequent research confirmed that DNA repair enzymes, which are central to recovery of irradiated bacteria in general, were key to D. radiodurans survival. Remarkably, several highly radiation-sensitive D. radiodurans DNA repair mutants were fully complemented by expression of orthologous genes from radiosensitive bacteria , , –. Thus, the extreme resistance phenotype appeared to be dependent, at least in part, on a conventional set of DNA repair functions , , . Generally, this view has been supported by the analysis of the complete genome sequence of D. radiodurans , and subsequently, by whole-transcriptome and whole-proteome analyses for D. radiodurans recovering from IR , , . Central to current models of extreme resistance are hypotheses that aim to reconcile the seemingly paradoxical findings that DNA repair proteins in D. radiodurans function extremely efficiently, yet appear structurally unremarkable, and often can be complemented by orthologs from radiosensitive bacteria. Within this conceptual framework, we examined the impact of the inferences on gene-gain and gene-loss derived from the comparative-genomic analysis of the two Deinococcus species on prevailing models of extreme radiation and desiccation resistance.
recA-dependent homologous recombination occurs at hundreds of IR-induced DSB sites in D. radiodurans during recovery from 17.5 kGy IR , –. In D. radiodurans, the alignment of its multiple identical chromosomes is often tacitly assumed as the starting point for a given repair model, yet little is known about how, or even if, such chromosomal alignment occurs. The first model that considered this possibility in the recovery of D. radiodurans was published by Minton and Daly in 1995 . The model built on the idea that alignment of identical chromosomes is a natural and early consequence of semi-conservative replication, where persistent chromosomal pairing was predicted to facilitate the ‘search for homology’ that precedes homologous recombination. The model made two major predictions: first, transmission electron microscopy (TEM) of chromosomal DNA from D. radiodurans should reveal evidence of structures linking chromosomes; and second, recA-dependent recombination between homologous DNA fragments inserted at widely separated genomic locations should show strong positional effects upon irradiation. Both predictions have been tested and refuted: no linking structures have been observed by TEM-based optical mapping , and molecular studies have shown high levels of recombination between homologous DSB fragments irrespective of their genomic origin –, . Thus, it has been concluded that IR-induced DSB fragments in D. radiodurans are mobile and that the structural form of its nucleoids does not play a key role in radioresistance. These conclusions were subsequently strengthened by cryoelectron microscopy of vitreous sections of D. radiodurans , , and by nucleoid morphology studies , , , .
The genome of D. radiodurans contains numerous, unusual, mosaic-type SNRs , ,  which potentially could contribute to genome assembly by holding together homologous DSB pairs . TEM optical mapping of D. radiodurans recovering from IR, however, showed that IR-induced DSB fragments in D. radiodurans were not linked . Consistently, the present whole-genome comparison detected none of these repeats in D. geothermalis, nor any other expanded repeat families, including G-quadruplex sequences (Table 1) (Figure S3). We did not identify any unusual features in chromosome-binding proteins that are conserved in the two Deinococcus genomes compared to the orthologous proteins from other bacteria  (Table S7 and S8). Thus, our comparative analysis does not seem to support Hypothesis I. More broadly, there is currently no convincing experimental evidence supporting the idea that structural alignment, aggregation or organization of the D. radiodurans chromosomes has a significant effect on radiation/desiccation resistance. However, we cannot rule out the possibility that the genomes of sensitive bacteria have structural characteristics that predispose them to inefficient genome reassembly.
In general, bioinformatic and experimental studies suggest that genome configuration and copy-number or the protection and repair functions of sensitive bacteria do not have unique properties that predispose them to DNA damage or inefficient DNA repair , , . More specifically, chromosomes in sensitive and resistant bacteria are equally susceptible to IR-induced DSB damage ,  and UV-induced base damage ; and DNA repair and protection genes of T. thermophilus, a radio-sensitive representative of the Deinococcus-Thermus group, and E. coli do not show obvious differences from their counterparts in D. radiodurans or D. geothermalis , ,  (Table S8). Furthermore, several E. coli DNA repair genes, including polA and uvrA, have been shown to restore the corresponding radiation-sensitive D. radiodurans mutants to wild-type levels of D. radiodurans resistance , , ; and the products of interchromosomal recombination in D. radiodurans following irradiation are consistent with the canonical version of the DSB repair model –. It has been proposed that D. radiodurans uses a core set of conventional DNA repair enzymes in novel ways, where conventional repair activities are enhanced by as yet uncharacterized proteins. For example, Zahradka et al have recently proposed a model called extended synthesis dependent strand annealing (ESDSA) that utilizes PolA in an unprecedented way .
Under the ESDSA model, DSB fragments formed in irradiated D. radiodurans are first subject to a 5′→3′ exonuclease resection mechanism that generates overhanging 3′ tails. A 3′ tail then invades a homologous DSB fragment derived from a different chromosomal copy, displacing the corresponding 5′ strand as a loop. Synthetic extension of the priming 3′ terminus might then proceed to the end of the invaded fragment, followed by annealing of the newly synthesized long 3′ extension with a complementary strand of another fragment engaged in ESDSA (Figure S7). For example, if the sequences of two priming fragments were ABCD and GHIJ, then a bridging and templating fragment could be DEFG, and the sequence of the assembled contig would be ABCDEFGHIJ. The ESDSA model accounts for the formation of large, interspersed blocks of old and new DNA observed in repaired D. radiodurans chromosomes. Some aspects of the ESDSA model, however, are difficult to reconcile with earlier experimental findings for recA-independent single-stranded annealing (SSA) mechanisms in irradiated D. radiodurans (Figure S7). Zahradka et al conceded that the SSA model is a potential alternative to ESDSA and could perhaps generate small blocks of old and new DNA, but pointed out that the E. coli PolA Klenow fragment, which lacks the 5′→3′ exonuclease, fully complements D. radiodurans polA mutants for resistance to γ-radiation. The present analysis shows that, although D. radiodurans and D. geothermalis do not encode recBCE, they both encode recJ, a putative 5′→3′ exonuclease that could potentially provide nuclease activity missing in the Klenow fragment (Table S8).
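The ABCD/DEFG/GHIJ example can be reproduced with a short string-level sketch. It only illustrates how a bridging fragment that overlaps two priming fragments determines the assembled order; it does not model strand invasion, synthesis or annealing, and the function names `overlap` and `bridge` are hypothetical helpers introduced here.

```python
def overlap(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def bridge(priming_a, template, priming_b):
    """Join two priming fragments through a bridging/templating fragment.

    The template must share its start with the end of `priming_a` and its
    end with the start of `priming_b`; only the unshared middle is added.
    """
    left = overlap(priming_a, template)
    right = overlap(template, priming_b)
    if left == 0 or right == 0:
        raise ValueError("template does not overlap both priming fragments")
    if left + right > len(template):
        raise ValueError("overlaps exceed the template length")
    return priming_a + template[left:len(template) - right] + priming_b

# Letters stand for sequence blocks, as in the example in the text.
contig = bridge("ABCD", "DEFG", "GHIJ")
print(contig)
```

Here the template shares block D with the first priming fragment and block G with the second, so only the blocks E and F are newly contributed between them.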
The possibility that extreme resistance in D. radiodurans is determined by novel genes that enhance conventional repair functions has also been examined. At least 12 genes of D. radiodurans, which were implicated in resistance by transcriptional profiling following IR, have been knocked out and the resulting mutants were characterized for IR resistance (Table 3). Remarkably, for most of the novel mutants, the IR resistances remained high, indicating that few of the uncharacterized genes, at least individually, make a substantial contribution to the recovery of irradiated D. radiodurans. For example, the DR0423 protein has been reported to bind 3′ ends of single-stranded DNA molecules, perhaps protecting 3′ termini generated by SSA or ESDSA from nuclease degradation. A DR0423 knockout mutant, however, retained approximately half of the wild-type level of IR resistance. To date, only a few of the uncharacterized genes selected for disruption analysis have contained the RDRM (Tables 3 and 4).
At least three Deinococcus proteins involved in repair show features that stand out against the overall “garden-variety” character of bacterial repair systems. First, D. radiodurans encodes a protein (DR1289) of the RecQ helicase family, which contains three Helicase and RNase D C-terminal (HRDC) domains, whereas most of the other bacterial RecQ proteins have a single HRDC domain. A D. radiodurans recQ knockout mutant is sensitive to IR, UV, H2O2, and MMC, and it has been reported that all three HRDC domains contribute to resistance. However, D. geothermalis has no ortholog of the D. radiodurans RecQ, but does encode the Dgeo_1226 protein that contains a helicase superfamily II C-terminal domain and a second HRDC domain that has high similarity to the corresponding domains of DR1289. Both DR1289 and Dgeo_1226 belong to the predicted resistance regulon (Table 4). A second exceptional case is RecA, the key repair protein that is required for homologous DNA recombinational repair following irradiation. The DNA strand-exchange reactions promoted by the RecA proteins from all other bacteria studied to date are ordered such that the single-stranded DNA is bound first, followed by the double-stranded DNA. In contrast, the D. radiodurans RecA binds the DNA duplex first and the homologous single-stranded DNA substrate second. It seems likely, however, that these unusual properties of RecA are ancestral to the Deinococcus-Thermus group. Indeed, most of the amino acid residues that are distinct in Deinococcus and could be responsible for the structural and functional differences between the RecA proteins of Deinococcus and other bacteria are also present in the RecA sequence of Thermus (Figure S8). In this context, early work by Carroll et al reported that E. coli RecA did not complement an IR-sensitive D. radiodurans recA point-mutant (rec30) and that expression of D. radiodurans RecA in E. coli was lethal. More recently, however, it has been reported that E. coli recA can provide partial complementation to a D. radiodurans recA null mutant, and that D. radiodurans recA fully complements E. coli recA mutants. This suggests that the D. radiodurans RecA protein is not as unusual as initially believed, but rather is more analogous to polA and uvrA of D. radiodurans, which can be functionally replaced by E. coli orthologs. As a third example, the Deinococcus single-stranded binding protein (Ssb) has a distinct structure, with two OB-fold domains in a monomer, but this feature was apparently already present in the common ancestor of the Deinococcus/Thermus group and therefore cannot be linked to radiation resistance directly.
It has been repeatedly proposed that nonhomologous end-joining (NHEJ) occurs in D. radiodurans. However, experiments specifically designed to test for the occurrence of NHEJ in D. radiodurans have shown that NHEJ of irradiation-induced DSB fragments is extremely rare, if not absent. More recent work also supports this conclusion. In the present and a previous study, we did not identify any orthologs of genes from other organisms that might encode NHEJ in D. geothermalis or D. radiodurans. However, it cannot be ruled out that Deinococcus encodes a unique NHEJ system. For example, DRB0100 encodes an ATP-dependent ligase that contains domains that could potentially contribute to NHEJ, namely, a predicted phosphatase of the H2Macro superfamily and an HD family phosphatase and polynucleotide kinase. Furthermore, DRB0100 belongs to a set of three genes comprising a putative operon (DRB0098-0100) that is strongly induced by IR. A mutant carrying a homozygous disruption of the DRB0100 gene, however, is fully IR-resistant (Table 3) (Figure 4), and genome comparison showed that D. geothermalis has no orthologs of DRB0100 or any functionally related operons. Despite the strong induction of DRB0100 following irradiation and the apparent relevance of the predicted function of this protein to D. radiodurans repair, DRB0100 appears not to contribute to resistance (Figure 4), and when purified, does not display DNA or RNA ligase activity in vitro. These findings, therefore, reflect a broader paradox of Deinococcus: whereas computational analyses have revealed an increasing number of new proteins potentially involved in the extreme resistance phenotype, very few of the corresponding D. radiodurans mutants tested so far have had a significant effect on its IR resistance. The present work leads to further shrinking of the set of genes implicated as major contributors to the resistance phenotype by showing that many of the original candidates are not conserved between D. geothermalis and D. radiodurans. Thus, our comparative analysis appears to be inconsistent with Hypothesis II, and reinforces inferences from a growing body of experimental work on Deinococcus species, which indicate that these organisms rely on a relatively conventional set of DNA repair functions.
Over the past decade, several observations have challenged the DNA-centered view of IR toxicity in eukaryotes and prokaryotes , , , , including (i) IR-induced bystander-effects in mammalian cells, defined as cytotoxic effects elicited in non-irradiated cells by irradiated cells, or following microbeam irradiation of cells where the cytoplasm but not the nucleus is directly traversed by radiation ; (ii) the genomes of radiation-sensitive bacteria revealed nothing obviously lacking in their repertoire of DNA repair and protection systems compared to resistant bacteria , ; and (iii) for a group of phylogenetically diverse bacteria at the opposite ends of IR resistance, the amount of protein damage, but not DNA DSB damage, was quantifiably related to radioresistance , . Thus, while the etiological radicals underlying different oxidative toxicities appear closely related , the pathway connecting the formation of IR-induced ROS with endpoint biological damage is still not definitively established . It has been proposed recently that proteins in IR-sensitive cells are major initial targets, where cytosolic proteins oxidized by IR might actively promote mutation by transmitting damage to DNA , and IR-damaged DNA repair enzymes might passively promote mutations by repair malfunction . In comparison, Mn-dependent radioprotective complexes in IR-resistant bacteria  appear to protect proteins from oxidation during irradiation, with the result that enzymatic systems involved in recovery survive and function with great efficiency . The proposed mechanism of extreme IR resistance requires a high intracellular Mn/Fe concentration ratio, where redox-cycling of Mn(II) complexes in resistant bacteria ,  scavenge a subset of IR-induced ROS that target proteins. 
Because the formation of ROS during irradiation is extremely rapid , an intracellular protection system that is ubiquitous, but not highly dependent on the induction of enzymes, stage of growth, or temperature over a range at which cells are metabolically active, could provide a selective advantage to the host in diverse settings.
Since high intracellular Mn/Fe ratios have been implicated in radiation and desiccation resistance , , , , we examined the intracellular concentrations and distributions of Mn, Fe and seven other elements in D. geothermalis compared to D. radiodurans, determined by x-ray fluorescence (XRF) microscopy (Figure 7) . The XRF analyses showed that the intracellular levels of Mn and Fe and their locations in D. geothermalis are essentially the same as D. radiodurans , but very different from the concentrations and distributions in IR-sensitive bacteria , . In this context, both D. radiodurans and D. geothermalis encode the Mn(II) transporter Nramp (DR1709) and a putative Mn-dependent transcriptional regulator TroR (DR2539) , but lack many genes for Fe homeostasis common in other bacteria, including for siderophore biosynthesis (COG3486, COG4264, COG4771) and Fe transport (COG1629, COG0810) (Table S9) . Consistently, D. radiodurans and D. geothermalis do not secrete siderophores (Figure S9), the nramp gene of D. radiodurans is essential and could not be disrupted, and the Fe uptake regulator (Fur) in D. radiodurans was dispensable (Figure S10); a system for gene disruption in D. geothermalis has not been developed. Other recent work that has strengthened the argument for a critical role of Mn(II) in the extreme resistance phenotypes of D. radiodurans includes in vitro studies of Heinz and Marx . They have shown that purified D. radiodurans PolA and E. coli PolA can bypass certain forms of IR-induced DNA damage during replication in the presence but not in the absence of 1 mM Mn(II), and suggested that Mn(II) ions might serve as important modulators of enzyme function . In summary, we conclude that our genome comparison (Table S9), gene knockout (Figure S10) and element analyses (Figure 7) appear to be consistent with Hypothesis III, whereby survival is facilitated by systems which regulate the concentration and distribution of intracellular Mn and Fe. 
Based on recent work, it appears that the presence of globally-distributed intracellular nonenzymic Mn(II) complexes in resistant bacteria facilitates recovery by preventing a form of IR-induced Fe-catalyzed protein oxidation known as carbonylation .
Based on their identical radiation resistance characteristics and close phylogenetic relationship, D. geothermalis and D. radiodurans are well-suited to defining a minimal set of conserved genes that could be responsible for extreme resistance. The two major findings of this analysis are (i) the characterization of the evolutionary trends that led to the emergence of extreme stress resistance in the Deinococcus lineage, in particular the finding that many families of paralogous genes, previously shown to be expanded in D. radiodurans, proliferated before the emergence of the common ancestor of the Deinococci, but were not present in the ancestor of the Deinococcus-Thermus group (Table 2); and (ii) delineation of a set of genes that comprise the predicted Deinococcus radiation and desiccation response regulon, which defines a new subgroup of targets for investigation in the Deinococci (Table 4). These findings have strengthened the view that Deinococci rely more heavily on the high efficiency of their detoxifying systems, including enzymic and nonenzymic ROS scavengers, than on the number and specificity of their DNA repair systems (Table 3). Our findings, however, do not rule out the possibility that the exceptional efficiency of DNA repair processes in both Deinococcus species is, at least in part, due to modifications of a set of universal repair genes. With respect to the impact of the whole-genome sequence of D. geothermalis on prevailing models of extreme IR resistance, the results of the comparative analysis weaken the arguments for a role of higher-order chromosome alignment structures (Hypothesis I); more clearly define and substantially revise downward the number of uncharacterized genes that might participate in DNA repair and contribute to resistance (Hypothesis II); and are consistent with the notion of a predominant role in resistance of systems involved in cellular protection and detoxification (cell-cleaning) (Hypothesis III).
In the hierarchy of DNA lesions caused in vivo by radiation, DSBs are the least frequent ones, but the most lethal. Since the number of genomic DSBs induced by a given dose of IR in resistant and sensitive bacteria is about the same, a legitimate question is whether resistant and sensitive bacteria are also equally susceptible to DNA base damage. Setlow and Duggan showed that D. radiodurans and E. coli are similarly susceptible to DNA thymine-dimers caused by UV. For IR and UV, the differences reported in resistance of DNA to radiation damage are not nearly sufficient to account for the relative resistance of D. radiodurans. Thus, it seems surprising that the recombination and excision repair systems of D. geothermalis and D. radiodurans did not proliferate compared to sensitive cells. The DNA repair and damage signaling systems of these radiation-resistant bacteria appear quantitatively and qualitatively even less complex and diverse than those reported for some sensitive bacteria. Instead, the stress-resistance phenotypes of the Deinococcus lineage appear to have evolved progressively by accumulation of cell-cleaning systems which eliminate organic and inorganic cell components that become toxic under radiation or desiccation. In D. geothermalis and D. radiodurans, this form of cell-cleaning appears to manifest itself as protein protection during exposure to IR or desiccation [JFK, EKG, MJD, unpublished], where proteins in Deinococci are substantially more resistant to oxidative damage than proteins in sensitive bacteria. Our finding that many genes in the predicted Deinococcus damage response regulon are the same as those found in SOS regulons of sensitive bacteria, but are regulated differently, is easily reconciled with the idea that enzymes and biochemical pathways in resistant bacteria survive and function more efficiently because they are less prone to interference from the toxic byproducts of IR and desiccation.
More generally, our findings place constraints on the degree to which functional inferences can be made from whole-genome transcriptome analyses based on a single organism. For example, two independent analyses of gene induction in D. radiodurans recovering from different IR doses revealed numerous genes that are upregulated during the post-irradiation recovery, many of which were viewed as plausible candidates for a significant role in resistance. The hierarchy of induced genes in both transcriptome analyses was very similar; however, most of the highly induced D. radiodurans genes have no orthologs in D. geothermalis, and knockout of many of the uncharacterized unique D. radiodurans genes that were strongly induced by IR had little effect on IR resistance. A similar paradigm is emerging from the analysis of other systems, where the cellular transcriptional response to stress was largely stochastic, frequently involving genes known to be unrelated to the mechanisms under investigation. Thus, it stands to reason that any comprehensive bioinformatics effort aimed at deciphering a complex, multi-gene phenotype using whole-genome, transcriptome and proteome approaches should aim to study at least two closely related species. In the present context of understanding the genomic basis of extreme resistance phenotypes and the nature of the common ancestor of the Deinococcus-Thermus group, we consider Truepera radiovictrix an appropriate next candidate for whole-genome sequencing. T. radiovictrix is a recently discovered, deeply branching representative of the Deinococcus branch that is both thermophilic and extremely IR-resistant.
The strains used were as follows: Deinococcus radiodurans (ATCC BAA-816), Deinococcus geothermalis (DSM 11300), and Escherichia coli (K-12) (MG1655).
D. radiodurans strain ATCC BAA-816 was grown at 32°C in undefined liquid nutrient-rich medium TGY (1% tryptone/0.1% glucose/0.5% yeast extract) or on TGY solid medium. In liquid culture, cell density was determined at 600 nm by a Beckman Coulter spectrophotometer. For acute IR (⁶⁰Co Gammacell irradiation unit, J. L. Shepard and Associates, Model 109) or UV (254 nm) exposures, late logarithmic-phase D. radiodurans cultures [OD600 = 0.9, 1×10⁸ colony-forming units (cfu)/ml] were irradiated to the indicated doses (Figure 1). Cell viability and cell numbers were determined by plate assay as described previously. Three independent cell cultures and irradiation treatments of the same kind were performed and served as biological replicates for determining irradiation resistance profiles. To test the predicted involvement of the indicated genes, a mutant (Figure S6) was generated using previously developed D. radiodurans disruption protocols. PCR was carried out as described previously.
The complete genome of D. geothermalis (DSM 11300) was sequenced at the Joint Genome Institute (JGI) using a combination of 3 kb-, 8 kb- and fosmid- (40 kb) libraries. Library construction, sequencing, finishing, and automated annotation steps were carried out as follows.
Approximately 3–5 µg of isolated DNA was randomly sheared to 3 kb fragments in a 100 µl volume using a HydroShear™ (Genomic Solutions, Ann Arbor, MI). The sheared DNA was immediately blunt end-repaired at room temperature for 40 min using 6 U of T4 DNA Polymerase (Roche Diagnostics, Indianapolis, IN), 30 U of DNA Polymerase I Klenow Fragment (NEB, Beverly, MA), 10 µl of 10 mM dNTP mix (GE Healthcare, Piscataway, NJ), and 13 µl of 10× Klenow Buffer in a 130 µl total volume. After incubation, the reaction was heat-inactivated for 15 min at 70°C, cooled to 4°C for 10 min, and then frozen at −20°C for storage. The end-repaired DNA was run on a 1% Tris/Borate/EDTA (TBE) agarose gel for ~60 min at 120 volts. Using ethidium bromide stain and UV illumination, 3 kb sheared fragments were extracted from the agarose gel and purified using QIAquick™ Gel Extraction Kit (QIAGEN, Valencia, CA). Approximately 300 ng of purified fragment was blunt-end-ligated overnight at 16°C into the Sma I site of 100 ng of pUC18 cloning vector (Roche) using 12 U T4 DNA Ligase, 3.2 µl 10× buffer (Roche), and 4.8 µl 30% PEG in a 32 µl total reaction volume. A very similar process was carried out to create an 8 kb library in pMCL200 with 10 µg of isolated genomic DNA.
Following standard protocols, 1 µl of each ligation product (3 kb or 8 kb) was electroporated into DH10B Electromax™ cells (Invitrogen, Carlsbad, CA) using the GENE PULSER® II electroporator (Bio-Rad, Hercules, CA). Transformed cells were transferred into 1 mL of SOC medium and incubated at 37°C in a rotating wheel for 1 h. Cells (usually 20–50 µl) were spread on 22×22 cm LB agar plates containing 100 µg/mL of ampicillin (pUC19) or 20 µg/mL of chloramphenicol (pMCL200), 120 µg/mL of IPTG, and 50 µg/mL of X-GAL. Colonies were grown for 16 h at 37°C. Individual white recombinant colonies were selected and picked into 384-well microtiter plates containing LB/glycerol (7.5% v/v) media containing 50 µg/mL of ampicillin or 20 µg/mL of chloramphenicol using the Q-Bot™ multitasking robot (Genetix, Dorset, U.K.). To test the quality of the library, 48 colonies were directly PCR-amplified with pUC m13–28 and –40 primers using standard protocols. Libraries passed PCR quality control if they had >90% 3 kb inserts or 8 kb inserts, respectively. For more details, see research protocols at www.jgi.doe.gov.
One-µl aliquots of saturated E. coli cultures (DH10B) containing either pUC19 vector with random 3 kb DNA inserts or pMCL200 vector with random 8 kb DNA inserts were added to 5 µl of a 10 mM Tris-HCl pH 8.2 and 0.1 mM EDTA denaturation buffer. The mixtures were heat-lysed at 95°C for 5 min then placed at 4°C for 5 min. To these denatured products, 4 µl of a rolling circle amplification (RCA) reaction mixture (Templiphi™ DNA Sequencing Template Amplification Kit, GE Healthcare) were added. The amplification reactions were carried out at 30°C for 12–18 h. The amplified products were heat-inactivated at 65°C for 10 min then placed at 4°C until used as template for sequencing.
Aliquots of the 10 µl amplified plasmid RCA products were sequenced with standard pUC m13–28 or –40 primers. The reactions typically contained 1 µl of the RCA product, sequenced with 4 pmoles (1 µl) of standard M13–28 or –40 primers, 0.5 µl 5×buffer, 1.75 µl H2O, and 0.75 µl BigDye sequencing kit (Applied Biosystems) at 1 min denaturation and 25 cycles of 95°C for 30 sec, 50°C for 20 sec, 60°C for 4 min, and finally held at 4°C. The reactions were then purified by a magnetic bead protocol (see research protocols, www.jgi.doe.gov) and run on an ABI PRISM 3730xl (Applied Biosystems) capillary DNA sequencer.
Approximately 15–20 µg of isolated DNA was randomly sheared to 40 kb fragments (25 cycles at speed code 17 using the large assembly, part # JHSH204007) in a 60 µL volume using a HydroShear™ (GeneMachines, San Carlos, CA). The sheared DNA was immediately blunt end-repaired at room temperature for 45 min using the End-It end-repair kit (Epicentre, Madison, WI). The end-repair reaction contained 60 µL sheared DNA, 8 µL of 10×End-It buffer, 8 µL of 2.5 mM End-It dNTP mix, 8 µL of 10 mM End-It ATP, and 4 µL of End-It Enzyme mix in an 80 µL total volume. After 45 min of incubation, the reaction was heat-inactivated for 10 min at 70°C, cooled to 4°C for 10 min and then frozen at −20°C for storage. The end-repaired DNA was run on a 1% TBE low melting point agarose gel for 13 hours using the following conditions (Temperature: 14°C, Voltage: 4.5 V/cm, Pulse initial: 1.0–final: 7.0 sec, Angle: 120°) on a BioRad Chef-DR III™ System PFGE system. Using standard procedures, the gel was stained with ethidium bromide, destained, and visualized under UV for less than 10 seconds while the 40 kb band was excised. DNA was extracted from the agarose gel and blunt-end ligated into pCC1FOS following the Copy Control Fosmid Kit (Epicentre) protocol. With minimal modifications to the Copy Control Fosmid Kit (Epicentre) protocol, the ligated DNA was packaged, infected and plated for picking and end-sequencing. For detailed JGI protocols used, please see research protocols at www.jgi.doe.gov.
Draft assemblies were based on 34,919 total reads. The Phred/Phrap/Consed software package (http://www.phrap.com) was used for sequence assembly and quality assessment. After the whole-genome shotgun stage, sequence reads were assembled with parallel Phrap (High Performance Software, LLC). All mis-assemblies were corrected by editing in Consed, and gaps between contigs were closed by custom primer walk or PCR amplification (Roche Applied Science, Indianapolis, IN). The completed genome sequence of D. geothermalis (DSM 11300) contained 36,718 reads, achieving an average of 8-fold sequence coverage per base with an error rate less than 1 in 100,000. The D. geothermalis genome sequence can be accessed at GenBank, or at the JGI Integrated Microbial Genomes website (http://img.jgi.doe.gov). Predicted coding sequences were manually analyzed and evaluated using an Integrated Microbial Genomes (IMG) annotation pipeline (http://img.jgi.doe.gov). The general structure of the predicted D. geothermalis genome was examined by PFGE as described previously for D. radiodurans. For structural analysis, D. geothermalis was exposed to 0.2 kGy, which introduces approximately 0.013 DSB/Gy per genome, and the cells were then embedded and lysed in agarose. For PFGE of genomic DNA subjected to restriction endonuclease analysis, non-irradiated D. geothermalis cells were used.
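The dose arithmetic behind the structural analysis can be made explicit. The following is a minimal Python sketch, not part of the reported method: it multiplies the stated dose (0.2 kGy) by the stated yield (0.013 DSB/Gy per genome), giving a mean of about 2.6 DSBs per genome. The function names are hypothetical, and the Poisson step (fraction of genome copies with no break) is an illustrative assumption that the text does not make.

```python
import math

# Yield quoted in the text: approximately 0.013 DSB per Gy per genome.
DSB_PER_GY_PER_GENOME = 0.013


def expected_dsbs(dose_gy, dsb_yield=DSB_PER_GY_PER_GENOME):
    """Mean number of DSBs per genome after a dose given in Gy."""
    return dose_gy * dsb_yield


def fraction_unbroken(dose_gy, dsb_yield=DSB_PER_GY_PER_GENOME):
    """ASSUMPTION (not from the text): breaks are Poisson-distributed,
    so the fraction of genome copies with zero breaks is exp(-mean)."""
    return math.exp(-expected_dsbs(dose_gy, dsb_yield))


dose_gy = 0.2 * 1000  # 0.2 kGy expressed in Gy
print(f"mean DSBs per genome: {expected_dsbs(dose_gy):.1f}")
print(f"fraction of genomes without a DSB (Poisson assumption): "
      f"{fraction_unbroken(dose_gy):.3f}")
```

Under these assumptions, most genome copies carry only a few breaks at this dose, so the fragments remain large enough to be resolved by PFGE.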
Clusters of orthologous genes for the Deinococcus and Thermus genomes (tdCOGs) were constructed using a technique based on the standard COG approach. First, a coarse-grained classification was obtained by assigning predicted genes to the NCBI Clusters of Orthologous Groups of proteins (COGs) using the COGNITOR method. Then, the genes were organized into tight clusters, based on triangles of best hits. Proteins belonging to the same cluster were aligned using the MUSCLE program; alignments were converted into PSI-BLAST PSSMs. Subsequent PSI-BLAST searches using these PSSMs against a database of Deinococcus and Thermus proteins were used to merge homologous clusters and previously unclustered proteins into tdCOGs. Cases in which proteins assigned to different COGs were automatically clustered into one tdCOG were resolved by manual curation (either the COG or the tdCOG assignment was changed to remove the contradiction).
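The "triangles of best hits" step can be illustrated with a small Python sketch. This is a simplified illustration under stated assumptions, not the pipeline used here: it treats a best hit as valid only when it is bidirectional, forms triangles from three genes in three genomes that are all pairwise bidirectional best hits, and merges triangles that share a side. The gene identifiers, the `genome|gene` naming scheme, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def genome_of(gene):
    """Hypothetical naming scheme: 'GENOME|gene_id'."""
    return gene.split("|")[0]


def bidirectional_pairs(best_hit):
    """best_hit maps (gene, other_genome) -> best-scoring gene in other_genome.
    Keep only pairs that are each other's best hit."""
    pairs = set()
    for (gene, _genome), hit in best_hit.items():
        if best_hit.get((hit, genome_of(gene))) == gene:
            pairs.add(frozenset((gene, hit)))
    return pairs


def triangle_clusters(best_hit):
    """Cluster genes by merging best-hit triangles that share a side."""
    neighbours = defaultdict(set)
    for pair in bidirectional_pairs(best_hit):
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)

    # A triangle is three genes that are all pairwise bidirectional best hits.
    triangles = set()
    for a in neighbours:
        for b, c in combinations(sorted(neighbours[a]), 2):
            if c in neighbours[b]:
                triangles.add(frozenset((a, b, c)))

    # Union-find over triangles; two triangles are joined if they share a side.
    parent = {t: t for t in triangles}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    side_owner = {}
    for t in triangles:
        for side in combinations(sorted(t), 2):
            if side in side_owner:
                parent[find(t)] = find(side_owner[side])
            else:
                side_owner[side] = t

    clusters = defaultdict(set)
    for t in triangles:
        clusters[find(t)] |= t
    return sorted(sorted(c) for c in clusters.values())


def mutual(best_hit, a, b):
    """Helper for the toy example: record a and b as reciprocal best hits."""
    best_hit[(a, genome_of(b))] = b
    best_hit[(b, genome_of(a))] = a


# Toy data: one gene family present in four genomes, one present in only two.
hits = {}
family = ["DG|g1", "DR|g1", "TT27|g1", "TT8|g1"]
for x, y in combinations(family, 2):
    mutual(hits, x, y)
mutual(hits, "DG|g2", "DR|g2")  # a lone pair forms no triangle

print(triangle_clusters(hits))
```

In this toy input the four-genome family collapses into a single cluster, whereas the gene pair found in only two genomes stays unclustered, which mirrors why previously unclustered proteins need a later merging step such as the PSSM-based searches described above.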
Evolutionary events in the history of the Deinococcus-Thermus group were reconstructed using an ad hoc parsimony approach. Presence/absence data from COG-based reconstruction of the deep ancestor of Cyanobacteria, Actinobacteria and Deinococcus-Thermus group were added to the tdCOG phyletic patterns. Simple parsimony rules were used to infer the ancestral states and the evolutionary events in the history of the Deinococcus and Thermus genomes (e.g. a gene present in both Deinococci and in the deep ancestor but absent in both Thermus species was considered to be present in the Deinococcus-Thermus group ancestor and in the Deinococcus genus ancestor, but lost by the Thermus genus ancestor). The only departure from the straightforward parsimony inference was made for homologous tdCOGs that form clade-specific expanded families, for example, several tdCOGs that are all assigned to the same ancestral COG and whose genes are present in both Deinococci but in neither of the Thermus species. In this case, contrary to the formal parsimony assumption of multiple losses in the Thermus ancestor, the scenario was interpreted as multiple gains (due to duplications) in the Deinococcus ancestor (Table S10).
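The logic of such rules can be sketched in Python. This is a minimal illustration under stated assumptions, not the rule set of Table S10: it works at the genus level (a gene counts as present in a genus only if both species have it), assigns the state of the Deinococcus-Thermus ancestor by majority among its three neighbours (the deep ancestor and the two genus ancestors), and includes a flag for the expanded-family exception described above. All names are hypothetical.

```python
from typing import List, NamedTuple


class Reconstruction(NamedTuple):
    in_dt_ancestor: bool           # Deinococcus-Thermus group ancestor
    in_deinococcus_ancestor: bool
    in_thermus_ancestor: bool
    events: List[str]


def reconstruct(in_deep, in_deinococcus, in_thermus, expanded_family=False):
    """Infer ancestral states and events for one tdCOG.

    ASSUMPTIONS (illustrative only): genus-level presence flags, and a
    majority rule for the Deinococcus-Thermus ancestor.
    """
    # Exception from the text: extra members of a clade-specific expanded
    # family are treated as duplications (gains) in the Deinococcus
    # ancestor rather than as losses in the Thermus ancestor.
    if expanded_family and in_deinococcus and not in_thermus:
        return Reconstruction(False, True, False,
                              ["gain (duplication) in Deinococcus ancestor"])

    # Majority of the three neighbouring states minimises the event count.
    in_dt = (int(in_deep) + int(in_deinococcus) + int(in_thermus)) >= 2

    events = []
    if in_dt != in_deep:
        kind = "gain" if in_dt else "loss"
        events.append(f"{kind} before Deinococcus-Thermus ancestor")
    if in_deinococcus != in_dt:
        kind = "gain" if in_deinococcus else "loss"
        events.append(f"{kind} in Deinococcus ancestor")
    if in_thermus != in_dt:
        kind = "gain" if in_thermus else "loss"
        events.append(f"{kind} in Thermus ancestor")
    return Reconstruction(in_dt, in_deinococcus, in_thermus, events)


# The example given in the text: present in both Deinococci and in the
# deep ancestor, absent in both Thermus species.
print(reconstruct(in_deep=True, in_deinococcus=True, in_thermus=False))

# The same phyletic pattern for an additional member of an expanded family.
print(reconstruct(True, True, False, expanded_family=True))
```

With the first call the sketch reports a single loss in the Thermus ancestor, matching the example in the text; with the `expanded_family` flag set, the same pattern is instead reported as a gain by duplication in the Deinococcus ancestor.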
XRF microscopy measurements were made at beamline 2ID-D at the APS as described previously. Briefly, the 2ID-D is an undulator beamline with Fresnel zone plate focusing optics that produced a focal spot with a FWHM (full width at half maximum) spatial resolution of approximately 120 nm for these experiments. For each pixel, the full XRF spectrum between approximately 2 keV and 10 keV was measured using a silicon drift detector. Thus, the distribution of elements between phosphorus and zinc on the periodic table of elements could be measured with 120-nm resolution throughout a cell and its periphery (Figure 7). XRF microprobe measurements were made on D. geothermalis cells grown in TGY to OD600 0.3 at 50°C and on D. radiodurans cells grown in TGY to OD600 0.3 at 32°C. The cells were deposited on grids as suspensions in TGY liquid medium, which served to help maintain the structure and viability of the cells as they dried.
Proposed evolutionary history of genome partitions in the Deinococcus-Thermus group.
(0.08 MB DOC)
Genome dot plots for homologous genome partitions of D. radiodurans and D. geothermalis.
(0.06 MB DOC)
Guanine quadruplet repeats in D. radiodurans.
(0.03 MB DOC)
Verification of the presence of megaplasmid DG206 in D. geothermalis (DSM11300).
(0.12 MB DOC)
Phylogenetic relationships of tdCOGs of the calcineurin-like phosphoesterase subfamily of COG0639 with proteins from other organisms represented by this COG.
(0.06 MB DOC)
Structure of D. radiodurans homozygous mutants.
(0.25 MB DOC)
The ESDSA model does not fully explain the early formation of covalently closed circular (ccc) derivatives of tandem duplications in irradiated D. radiodurans.
(0.08 MB DOC)
Multiple alignment comparisons for RecA proteins of the Thermus-Deinococcus group with selected representatives of other bacteria.
(0.05 MB DOC)
Chrome azurol S agar plate assay for siderophore production.
(0.13 MB DOC)
Whereas the nramp gene of D. radiodurans is essential, the fur gene is dispensable.
(0.13 MB DOC)
Homology between the D. radiodurans and D. geothermalis megaplasmids.
(0.04 MB DOC)
Clusters of orthologous groups of proteins for Deinococcus and Thermus (tdCOGs).
(0.24 MB TXT)
Lineage specific expansion of selected families in D. geothermalis (DG), D. radiodurans (DR), T. thermophilus HB27 (TT27), and T. thermophilus HB8 (TT8).
(0.05 MB DOC)
Protein families expanded in D. geothermalis.
(0.05 MB DOC)
Protein families expanded in D. radiodurans.
(0.07 MB DOC)
Gene context and motifs of predicted cytoplasmic proteins shared by two Deinococcus species, but for which homologs outside the lineage do not exist.
(0.17 MB DOC)
Stress response-related genes in D. radiodurans (DR), D. geothermalis (DG) and T. thermophilus (TT).
(0.23 MB DOC)
Genes coding for replication, repair and recombination functions in E. coli, D. radiodurans and T. thermophilus.
(0.15 MB DOC)
Manganese- and iron-related homeostasis genes.
(0.08 MB DOC)
Parsimony pattern rules for reconstruction of evolutionary events in the Deinococcus/Thermus lineage.
(0.11 MB DOC)
We are grateful to Deb Ghosal at Uniformed Services University of the Health Sciences (USUHS) for conducting the chrome azurol S agar plate assay for siderophore production. We are also grateful to Susan Lucas and Tijana Glavina del Rio of the DOE-Joint Genome Institute for support in genome sequence quality control, production and assembly.
Competing Interests: The authors have declared that no competing interests exist.
Funding: The work of KSM, MVO, YIW, AS, and EVK was supported by the Intramural Research Program of the National Institutes of Health, National Library of Medicine. The work at USUHS was supported by grant DE-FG02-04ER63918 to MJD from the U. S. Department of Energy (DOE), Office of Science, Office of Biological and Environmental Research (BER), Environmental Remediation Sciences Program (ERSP); and by grant FA9550-07-1-0218 to MJD from the Air Force Office of Scientific Research. The work at the DOE-Joint Genome Institute was supported by the DOE Office of Science. Work at the Advanced Photon Source was supported by the DOE Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357. The work of MSG and AVG was supported by grants from the Howard Hughes Medical Institute (55005610), INTAS (05-8028), and the Molecular and Cellular Virology program of the Russian Academy of Sciences. D. geothermalis was selected for genome sequencing by BER (http://www.science.doe.gov/ober/RFS-2.pdf) with MJD as the Principal Investigator.