Nitrosomonas europaea (ATCC 19718) is a gram-negative obligate chemolithoautotroph that can derive all its energy and reductant for growth from the oxidation of ammonia to nitrite. Nitrosomonas europaea participates in the biogeochemical N cycle in the process of nitrification. Its genome consists of a single circular chromosome of 2,812,094 bp. The GC skew analysis indicates that the genome is divided into two unequal replichores. Genes are distributed evenly around the genome, with ~47% transcribed from one strand and ~53% transcribed from the complementary strand. A total of 2,460 protein-encoding genes emerged from the modeling effort, averaging 1,011 bp in length, with intergenic regions averaging 117 bp. Genes necessary for the catabolism of ammonia, energy and reductant generation, biosynthesis, and CO2 and NH3 assimilation were identified. In contrast, genes for catabolism of organic compounds are limited. Genes encoding transporters for inorganic ions were plentiful, whereas genes encoding transporters for organic molecules were scant. Complex repetitive elements constitute ca. 5% of the genome. Among these are 85 predicted insertion sequence elements in eight different families. The strategy of N. europaea to accumulate Fe from the environment involves several classes of Fe receptors with more than 20 genes devoted to these receptors. However, genes for the synthesis of only one siderophore, citrate, were identified in the genome. This genome has provided new insights into the growth and metabolism of ammonia-oxidizing bacteria.
Nitrosomonas europaea is a bacterium that can derive all its energy and reductant for growth from the oxidation of ammonia to nitrite. The cell's demand for carbon has to be met almost entirely by the fixation of carbon dioxide. Additional mineral salts complete the cell's nutritional needs. Although this bacterium can incorporate small amounts of organic compounds into cellular biomass (19, 20, 59, 60), there is an obligate requirement for oxidation of ammonia and assimilation of inorganic nutrients to support growth. As such, this bacterium is a member of a small group of bacteria known as obligate chemolithoautotrophs.
N. europaea and other ammonia-oxidizing bacteria participate in the biogeochemical N cycle in the process of nitrification (the biological conversion of reduced nitrogen in the form of ammonia [NH3] or ammonium [NH4+] to oxidized N in the form of nitrite [NO2−], nitrate [NO3−], or gaseous forms [NO, N2O]). The gaseous products of nitrification, NO and N2O, are “greenhouse gases,” whereas the soluble forms are readily leached into ground and surface waters. In contrast to these detrimental consequences of ammonia oxidation, nitrifiers increase N availability to plants, are important to the treatment of wastewater, and have potential for the bioremediation of sites contaminated with chlorinated aliphatic hydrocarbons (44, 45, 61, 76).
Ammonia-oxidizing bacteria such as Nitrosomonas convert NH3 to NO2− by the successive action of ammonia monooxygenase (AMO) and hydroxylamine oxidoreductase (HAO): NH3 + O2 + 2H+ + 2e− → NH2OH + H2O → NO2− + 5H+ + 4e−. Two of the four electrons return to the AMO reaction, and two are either reductant for biosynthesis or pass to a terminal electron acceptor (41, 83).
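Written as separate half-reactions, the electron bookkeeping described above can be made explicit. The net line is simply the sum of the two steps and is shown here only as a worked check, not as an additional claim about mechanism:

```latex
\begin{align*}
\text{AMO:}\quad & \mathrm{NH_3 + O_2 + 2H^+ + 2e^- \rightarrow NH_2OH + H_2O} \\
\text{HAO:}\quad & \mathrm{NH_2OH + H_2O \rightarrow NO_2^- + 5H^+ + 4e^-} \\
\text{Net:}\quad & \mathrm{NH_3 + O_2 \rightarrow NO_2^- + 3H^+ + 2e^-}
\end{align*}
```

Of the four electrons released by HAO, two are consumed by AMO, which leaves the net yield of two electrons per NH3 oxidized.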
N. europaea is a member of the proteobacteria, a large group of eubacteria of presumed photosynthetic ancestry (73). All terrestrial ammonia-oxidizing bacteria currently identified belong to the β-subdivision of the proteobacteria, including the Nitrosomonas and Nitrosospira genera, whereas marine strains are found in both the γ-subdivision Nitrosococcus genus and the β-subdivision genera (34, 82).
N. europaea is the best studied of the ammonia-oxidizing bacteria at the molecular level. Previous work has focused primarily on unique aspects of this bacterium's growth, namely, its ability to utilize NH3, and the bioenergetics associated with this process. Thus, the pathway for NH3 oxidation has received considerable attention at both the biochemical and the genetic levels. AMO is a heteromultimeric enzyme encoded by the amoCAB operon that exists in two strikingly similar copies in the N. europaea genome (9, 37, 47, 54, 55). HAO is a homotrimer in which each subunit harbors eight c-type hemes, one of which binds substrate. Electrons pass from HAO through two tetraheme c cytochromes (cytochromes c554 and cm552) to ubiquinone (79). Hence, one theoretical HAO-electron acceptor complex includes 48 c-hemes. Similar to the multiple copies of the amo operon, the hao gene is present in three nearly identical copies (55). Gene sequences for three cytochromes that participate in NH3 oxidation have also been determined (7, 39), and some of these exist as multiple copies in the genome. In contrast to NH3 oxidation, other cellular processes have received only limited attention. Analysis of the complete genome sequence has allowed new insights into all aspects of the growth and environmental responses of this ammonia-oxidizing bacterium.
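The figure of 48 c-hemes can be reproduced with simple arithmetic. The sketch below assumes one cytochrome c554 and one cytochrome cm552 per HAO subunit; that stoichiometry is an inference chosen because it yields the stated total, not something spelled out in the text.

```python
# Heme bookkeeping for one theoretical HAO-electron acceptor complex.
HAO_SUBUNITS = 3            # HAO is a homotrimer
HEMES_PER_HAO_SUBUNIT = 8   # eight c-type hemes per subunit
HEMES_C554 = 4              # tetraheme cytochrome c554
HEMES_CM552 = 4             # tetraheme cytochrome cm552

# Assumption (not stated in the text): one c554 and one cm552 per HAO subunit.
ACCEPTORS_PER_SUBUNIT = 1


def complex_heme_count():
    """Total c-hemes in the hypothetical HAO/c554/cm552 complex."""
    per_subunit = HEMES_PER_HAO_SUBUNIT + ACCEPTORS_PER_SUBUNIT * (
        HEMES_C554 + HEMES_CM552
    )
    return HAO_SUBUNITS * per_subunit


print(complex_heme_count())  # 48
```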
Genomic DNA, isolated from N. europaea (ATCC 19718), was sequenced by using a conventional whole-genome shotgun strategy (30). Briefly, random 2- to 3-kb DNA fragments were isolated after mechanical shearing. These gel-extracted fragments were concentrated, end repaired, and cloned into pUC18. Double-ended plasmid sequencing reactions were carried out by using PE BigDye Terminator chemistry (Perkin-Elmer, Foster City, Calif.), and sequencing ladders were resolved on PE 377 automated DNA sequencers. Two rounds (nearly 55,000 reads each) of small-insert library sequencing were performed, generating 17-fold redundancy.
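As a rough consistency check on these numbers, the mean usable read length implied by 17-fold redundancy can be back-calculated. This is a sketch only: the read count is treated as exactly 2 × 55,000, and the genome size is the finished length reported for the chromosome; the function name is illustrative.

```python
GENOME_BP = 2_812_094      # finished chromosome length
READS_PER_ROUND = 55_000   # "nearly 55,000" in the text; treated as exact here
ROUNDS = 2
REDUNDANCY = 17            # fold coverage


def implied_mean_read_length(genome_bp, n_reads, fold):
    """Mean read length (bp) needed for n_reads to give the stated coverage."""
    return fold * genome_bp / n_reads


total_reads = READS_PER_ROUND * ROUNDS
mean_len = implied_mean_read_length(GENOME_BP, total_reads, REDUNDANCY)
print(f"{total_reads} reads imply about {mean_len:.0f} bp per read")
```

Under these assumptions the implied read length is a little over 430 bp.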
A large-insert (~30 kb) fosmid library was also constructed by Sau3AI partial digestion of genomic DNA and cloning into the pFos1 cloning vector (46). End sequencing of approximately 300 fosmid clones (0.05-fold sequence redundancy) generated roughly 3-fold genome scaffold coverage. The fosmids were fingerprinted with EcoRI to aid in assembly verification and determination of gap sizes and provided a minimal scaffold used for order and orientation across assembly gaps.
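The scaffold coverage and sequence redundancy quoted for the fosmid library follow from the clone count and insert size. The sketch below assumes exactly 300 clones with 30-kb inserts and two end reads per clone; the variable names are illustrative.

```python
GENOME_BP = 2_812_094
FOSMID_CLONES = 300        # "approximately 300" in the text
INSERT_BP = 30_000         # ~30-kb inserts
SEQ_REDUNDANCY = 0.05      # fold coverage from end sequences

# Physical (clone) coverage: total insert length divided by genome length.
scaffold_coverage = FOSMID_CLONES * INSERT_BP / GENOME_BP

# Read length implied by the stated sequence redundancy, assuming 2 reads per clone.
end_reads = 2 * FOSMID_CLONES
implied_read_bp = SEQ_REDUNDANCY * GENOME_BP / end_reads

print(f"scaffold coverage ~ {scaffold_coverage:.1f}-fold")
print(f"implied end-read length ~ {implied_read_bp:.0f} bp")
```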
Sequence traces were processed with PHRED (27, 28) for base calling and assessment of data quality prior to assembly with PHRAP (P. Green, University of Washington) and visualization with CONSED (33). Even with 17-fold redundancy, the assembly consisted of 400 contigs. The previously unknown abundance of repetitive elements within the N. europaea genome, a feature which confuses most assembly programs, was found to be responsible for this large number of contigs. Due to the repetitive nature of this genome, many assembly gaps (~70%) were the direct result of misassembling these repeat elements, while the remainder were “true” gaps. Most repeats smaller than 2 kb, along with most of the other “true” gaps, were closed by primer walking on gap-spanning library clones (identified by using linking information from forward and reverse reads). Alternatively, some of the larger gaps, including the larger repeat elements and regions covered only by fosmid clones, were closed by primer walking on PCR products. Remaining physical (uncaptured) gaps, some of which are regions suspected of being lethal in Escherichia coli when in high copy number (e.g., the amoCAB promoter region), were closed by combinatorial (multiplex) PCR.
Once a single contig assembly was achieved, the consensus genome sequence was screened for regions of single-clone coverage and low sequence quality. These regions were resequenced, usually either by primer walking on PCR products or on different templates. The finishing phase (gap closure, repeat resolution, and resequencing poor-quality regions) added an additional ~12,000 reads to the assembly. Based on the consensus quality scores generated by PHRAP, we estimate the overall error rate to be substantially less than one error in 10,000 bases.
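The stated error bound maps directly onto the Phred quality scale, where Q = −10·log10(P). A minimal sketch of the conversion, with the genome length taken from the text and the helper names chosen for illustration:

```python
import math

GENOME_BP = 2_812_094


def phred_from_error(p_error):
    """Phred-scaled quality for a per-base error probability."""
    return -10.0 * math.log10(p_error)


def error_from_phred(q):
    """Per-base error probability for a Phred-scaled quality."""
    return 10.0 ** (-q / 10.0)


bound = 1 / 10_000                       # one error in 10,000 bases
q_bound = phred_from_error(bound)        # quality corresponding to the bound
max_expected_errors = GENOME_BP * bound  # upper bound across the chromosome

print(f"Q = {q_bound:.0f}; at most about {max_expected_errors:.0f} errors genome-wide")
```

An error rate below one in 10,000 therefore corresponds to a consensus quality above Q40, or fewer than roughly 280 erroneous bases over the whole chromosome.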
Gene modeling was performed by using the Critica (3), Glimmer (23), and Generation (http://compbio.ornl.gov/generation/index.shtml) modeling packages; the results were combined, and a BLASTP search of the translations versus GenBank's nonredundant database (NR) was conducted. The alignment of the N terminus of each gene model versus the best NR match was used to pick a preferred gene model. If no BLAST match was returned, the longest model was retained. Gene models that overlapped by more than 10% of their length were flagged, giving preference to genes with a BLAST match. The revised gene-protein set was searched against the KEGG GENES, Pfam, PROSITE, PRINTS, ProDom, and COGs databases, in addition to the BLASTP versus NR databases. From these results, categorizations were developed by using the KEGG and COGs hierarchies. Initial criteria for automated functional assignment required a minimum 50% residue identity over 80% of the length of the match for BLASTP alignments, plus concurring evidence from pattern or profile methods. Putative assignments were made for identities down to 30%, over 80% of the length. The sequence, as well as the results of automatic annotations, is available online (http://genome.ornl.gov/microbial/neur/embl/).
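The functional-assignment thresholds described above can be summarized as a small decision rule. This is a sketch under stated assumptions, not the annotation pipeline itself: the function name and labels are hypothetical, and treating a hit with at least 50% identity but no profile support as "putative" is an interpretation the text does not spell out.

```python
def assignment_level(identity_pct, coverage_pct, has_profile_support):
    """Classify a BLASTP hit using the thresholds quoted in the text.

    identity_pct: percent residue identity of the alignment
    coverage_pct: percent of the match length covered by the alignment
    has_profile_support: concurring evidence from pattern or profile methods
    """
    if coverage_pct < 80:
        return "none"
    if identity_pct >= 50 and has_profile_support:
        return "automated"
    if identity_pct >= 30:
        # Assumption: strong hits lacking profile support fall back to putative.
        return "putative"
    return "none"


for hit in [(62.0, 91.0, True), (62.0, 91.0, False), (35.0, 85.0, False), (35.0, 70.0, True)]:
    print(hit, "->", assignment_level(*hit))
```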
The complete genome sequence of N. europaea strain ATCC 19718 is available under EMBL-EBI accession number AL954747.
The genome of N. europaea ATCC 19718 consists of a single circular chromosome of 2,812,094 bp (the general features of the genome are listed in Table 1, and a detailed map is shown in Fig. 1); nucleotide 1 was assigned at the predicted origin of replication. The GC skew analysis also reveals that the genome is clearly divided into two unequal replichores (roughly 1/3 to 2/3). The mechanism and biological reason for this asymmetry are unclear, but the asymmetry may be a result of the abnormally large amount of repetitive material within the genome and the presumed fluidity of the genome. One clear effect of this bias is an ~1% difference (25.84% versus 24.87%) in G versus C strand composition. Another peculiarity is that the multiple copies of genes participating in ammonia oxidation (amoCAB and hao), along with a large (7.5 kb) recently duplicated region, are all on the larger replichore (see Fig. 1). Overall, the N. europaea genome is 50.7% G+C. Although several spikes in G+C content are observed, these are not postulated to have been acquired via lateral transfer. However, some of these G+C spikes do correlate with GC skew spikes. Interestingly, some of these spikes also correspond to the regions containing repeated genes and operons such as the hao gene region (HAO and associated cytochromes), amoCAB (AMO operon), and tufB (elongation factor Tu), all of which are themselves associated with several ribosomal proteins (not shown).
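For readers unfamiliar with GC skew, the quantity is (G − C)/(G + C) computed in windows along one strand; the cumulative sum typically changes direction near the origin and terminus of replication, which is how replichore boundaries are inferred. The sketch below is a generic illustration on a toy sequence. The window size and function names are arbitrary choices, and this is not the specific software used for the analysis described here.

```python
def gc_skew_windows(seq, window):
    """(G - C) / (G + C) for consecutive non-overlapping windows."""
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window].upper()
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


def cumulative(values):
    """Running sum of a list of numbers."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


# Toy sequence: a G-rich stretch followed by a C-rich stretch.
toy = "GGGCAT" * 10 + "CCCGAT" * 10
skews = gc_skew_windows(toy, window=6)
cum = cumulative(skews)
turning_point = cum.index(max(cum))  # window where the skew changes sign
print(turning_point, max(cum))
```

On a real chromosome, two replichores of unequal length would show up as turning points in the cumulative curve that are not half a genome apart. The two strand-composition figures quoted above also sum to 50.71%, consistent with the stated overall G+C content of 50.7%.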
Genes are distributed evenly around the genome, with ~47% transcribed from the forward strand and ~53% from the complementary strand. A total of 2,460 protein-encoding genes emerged from the modeling effort, averaging 1,011 bp in length, with intergenic regions averaging 117 bp. These open reading frames (ORFs) account for 2,487,261 nucleotides of coding sequence (88.4%). An additional 113 ORFs are fragmentary, frameshifted, or interrupted by insertion sequence (IS) elements; these have been designated pseudogenes. Of the 2,460 putative proteins, 2,147 matched a sequence in the NR database with an E value of <1e-5; of these, 1,863 have similarity to a protein with a functional assignment, and 285 match a protein of unknown function. An additional 312 (13%) are unique to N. europaea. Roughly 75% of the predicted proteins have the potential to be assigned a function. Other searches give similar results: 1,737 proteins match InterPro profiles, 1,967 match a Pfam HMM profile (default threshold), and 1,960 can be assigned to a COG group (3BeTs). In addition to protein-encoding genes, we identified 41 tRNAs, representing all 20 amino acids. The only rRNA operon in this strain is of the 16S-Ala tRNA(TGC)-Ile tRNA(GAT)-23S-5S type and contains the typical I-CeuI endonuclease site in the gene for the 23S rRNA.
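The summary statistics in this paragraph can be cross-checked against one another with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names are illustrative.

```python
GENOME_BP = 2_812_094
CODING_BP = 2_487_261
N_GENES = 2_460
MEAN_GENE_BP = 1_011
N_UNIQUE = 312

coding_density_pct = 100 * CODING_BP / GENOME_BP  # share of genome that is coding
approx_coding_bp = N_GENES * MEAN_GENE_BP         # gene count x mean gene length
unique_pct = 100 * N_UNIQUE / N_GENES             # proteins with no database match

print(f"coding density: {coding_density_pct:.1f}%")
print(f"genes x mean length: {approx_coding_bp} bp")
print(f"unique proteins: {unique_pct:.0f}%")
```

The product of gene count and mean gene length (2,487,060 bp) differs from the reported coding total by only about 200 bp, which is within what rounding of the mean length would produce.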
The sequence of this third β-proteobacterium makes a fine addition to the repertoire of structural diversity and content observed in the genomes sequenced thus far. A range of organisms (>130) were represented with top BLAST hits to one or more genes. Ralstonia solanacearum, one of two β-proteobacterial genome sequences currently in the public databases, was most often the top BLAST hit (31%, 768 of the 2,460 predicted ORFs). Neisseria meningitidis, the other sequenced β-proteobacterium, was the top hit with 5% of the genes. A surprising 13% had, as the top hit, Pseudomonas aeruginosa, a primarily soil-dwelling γ-proteobacterium. Other proteobacteria, including common soil inhabitants, were often found at the top of BLAST lists. Genes from E. coli, Caulobacter crescentus, Sinorhizobium meliloti, Vibrio cholerae, Xylella fastidiosa, and Mesorhizobium loti were frequent top BLAST hits with matches to 4.0, 2.9, 2.8, 2.6, 2.2, and 1.9% of N. europaea genes, respectively. At least 50 genes were also identified with each of the following classes of microorganisms: cyanobacteria and gram-positive actinobacteria and bacilli. The broad distribution of BLAST hits may reflect the few previously completed genome sequences from members of the β-proteobacteria or from any other ammonia-oxidizing bacteria.
Hirota et al. (36) conducted pulsed-field gel electrophoresis experiments to localize both copies of amoCAB and all three copies of hao to a single 487-kb fragment of DNA in Nitrosomonas sp. strain ENI-11. Each copy of amo was found to be within 15 or 23 kb of a copy of hao, based on restriction maps followed by long-range PCR for Nitrosomonas sp. strain ENI-11. The genome sequence of N. europaea ATCC 19718 reveals nearly identical proximity (15.5 and 23.1 kb) of these two important ammonia oxidation gene clusters to a copy of hao. The genes between each amo and hao cluster identified in Nitrosomonas sp. strain ENI-11 are also the same in N. europaea. These include the genes encoding threonyl tRNA synthetase (thrS), initiation factor 3 (infC), ribosomal protein L20 (rplT), phenylalanyl tRNA synthetase α and β subunits (pheS and pheT) in the 15.5-kb intergenic spacer region and the RNA polymerase β and β′ subunits (rpoBC) in the 23.1-kb span. The complete 23.1-kb spanning region includes tRNA genes, several ribosomal genes, elongation factors G (fusA) and Tu (tufB), and the transcription anti-termination gene nusG. Both genomes also have a third hao gene copy that is located approximately 300 kb upstream of the amo/hao cluster with the 23-kb spanning region, which suggests two similar amo and hao gene arrangements in these two Nitrosomonas strains (Fig. 2). However, these clusters are separated by a strikingly different spanning region (87 kb in ENI-11 versus 1,300 kb in N. europaea). Thus, although the overall arrangement of two amo/hao gene clusters is conserved, dramatic rearrangements have nonetheless occurred since these two Nitrosomonas strains diverged.
Like most ammonia-oxidizing bacteria examined, N. europaea has multiple copies of the genes coding for AMO, HAO, and cytochrome c554 (Table 2). Whether other genes, and their associated functions, might also be duplicated has long been a matter of speculation. In contrast to several other autotrophs, the genes coding for ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCo) are not duplicated. The ribosomal genes also are not duplicated as they are in many other bacteria. One of the two copies of another duplicated gene (tufB), encoding elongation factor Tu, has been found associated with one of the two amo/hao gene clusters (with the 23-kb spanning region [see Fig. 1 and 2]).
A perfect tandemly repeated region, a possible example of a recent duplication event, may give us insight into the mechanism of gene duplication and evolution and provides another example that gene duplication in N. europaea is not limited to genes involved with ammonia catabolism. This 7.5-kb duplication was found to carry genes for phosphoenolpyruvate synthase (ppsA; NE2359 and NE2366), glutaminyl tRNA synthetase (glnS; NE2356 and NE2363), along with three conserved hypothetical proteins, as well as portions of the genes for lysyl tRNA synthetase (lysS; NE2361) and aspartate aminotransferase (NE2362). These two likely “pseudogenes” have full-length representatives at the left and right borders of this most obvious duplication event. A second, different type of complex tandem repeat is composed of 15 copies of an ~339-bp degenerate repeat (nine forms) interspersed with four identical copies of a second repeat type (317 bp). This 6.3-kb repeat region lies within (and preserves throughout its length) an ORF (NE0161) encoding a large 3,064-amino-acid conserved hypothetical protein with a putative hemolysin-type calcium-binding region.
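The size of the second repeat region follows from the copy numbers and unit lengths given above. A minimal arithmetic sketch, treating the approximate 339-bp unit as exact and ignoring the stop codon of NE0161:

```python
DEGENERATE_COPIES = 15
DEGENERATE_UNIT_BP = 339   # "~339 bp" in the text; treated as exact here
IDENTICAL_COPIES = 4
IDENTICAL_UNIT_BP = 317
ORF_AMINO_ACIDS = 3_064    # NE0161 product length

repeat_bp = (DEGENERATE_COPIES * DEGENERATE_UNIT_BP
             + IDENTICAL_COPIES * IDENTICAL_UNIT_BP)
orf_coding_bp = ORF_AMINO_ACIDS * 3            # codons only, stop codon excluded
repeat_fraction = repeat_bp / orf_coding_bp    # share of the ORF made of repeats

print(f"repeat region: {repeat_bp} bp (~{repeat_bp / 1000:.1f} kb)")
print(f"fraction of NE0161 coding sequence: {repeat_fraction:.0%}")
```

The total of 6,353 bp matches the stated 6.3-kb region, which on these figures makes up roughly two-thirds of the NE0161 coding sequence.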
Besides the abundance of duplicated (presumably) native genomic material, the genome contains 85 predicted IS elements (most are complete elements) representing eight different families not previously described (see Table 2). Although these elements appear randomly distributed around the genome (Fig. 1), closer inspection reveals that there are seven IS-free islands between 100 and 200 kb in size, whereas the genome in its entirety averages one IS per 33 kb. The three least prominent IS families (ISne6, ISne7, and ISne8) do in fact appear randomly distributed, but the IS families with 9 to 27 members appear to have preferentially inserted in the vicinity of other IS elements (or alternatively served as attractants for subsequent IS integrations). The ISne1 and ISne3 classes displayed only a relatively random integration pattern, with half of the members being in close (<2 kb) proximity to other IS elements. Although ISne4 and ISne5 encode similar transposases and share some level of identity, they are not found near one another. Rather, four of the ten ISne4 are proximal to ISne1 integrations, and another two are close to ISne2 sites, whereas five of the nine ISne5 are very close to an ISne2 integration site. In fact, four of these are directly adjacent to ISne2 elements and, surprisingly, in three different orientations, indicating completely independent integration events. ISne2 itself is <2 kb away from other IS elements in 11 of its 16 members. The reason for this striking integration bias is unknown. Once a function is lost due to IS integration, if this mutation is not deleterious to the organism, further integrations in this gene would not be lethal and would be maintained in the population.
This mechanism of acquiring integrations could be extended to subsequent IS integrations in the surrounding region if nearby genes participate in the same function as the gene with the first IS integration and are, therefore, rendered functionless by that integration event.
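The proximity statistics discussed above (the share of IS elements lying within 2 kb of another element) can be computed from a list of coordinates on the circular chromosome. The sketch below is illustrative only: it reduces each element to a single position, and the example coordinates are made up rather than taken from the annotation.

```python
GENOME_BP = 2_812_094
N_IS_ELEMENTS = 85


def circular_distance(a, b, genome_len):
    """Shortest distance between two positions on a circular chromosome."""
    d = abs(a - b) % genome_len
    return min(d, genome_len - d)


def fraction_near_neighbor(positions, genome_len, max_dist=2_000):
    """Fraction of elements with another element closer than max_dist."""
    if len(positions) < 2:
        return 0.0
    near = 0
    for i, p in enumerate(positions):
        if any(circular_distance(p, q, genome_len) < max_dist
               for j, q in enumerate(positions) if j != i):
            near += 1
    return near / len(positions)


kb_per_is = GENOME_BP / N_IS_ELEMENTS / 1000   # average spacing in kb

# Hypothetical coordinates: two neighbors, one isolated element, one near the origin.
example = [1_000, 2_500, 500_000, 2_811_500]
clustered = fraction_near_neighbor(example, GENOME_BP)
print(f"{kb_per_is:.1f} kb per IS; clustered fraction = {clustered}")
```

The last hypothetical element sits just before the end of the sequence and is therefore counted as close to the first one, which is why the distance is measured around the circle rather than along the linear coordinate.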
A different class of duplicated genes includes likely members of a recently diverged paralogous family that codes for or regulates production of Fe siderophore receptors. Several classes of siderophore receptors, along with regulatory genes, were identified in the genome. Together, all of the repeated elements within this genome constitute ~5% of the nucleic acid sequence, which ranks as one of the most densely populated bacterial genomes in terms of complex repetitive DNA.
The use of ammonia as an energy source, an example of lithotrophy, requires the ability to catabolize ammonia and to generate reductant for biosynthesis and to generate a chemiosmotic gradient to drive ATP synthesis (Fig. 3). The genes coding for AMO (amoCAB), HAO (hao), and cytochromes c554 and cm552 were previously sequenced (7, 9, 39, 47, 54, 64, 65) and confirmed with the genome sequence. No additional genes were identified that might be involved in the oxidation of ammonia to nitrite. Electrons from hydroxylamine oxidation flow through cytochrome c554 and cytochrome cm552 into the electron transport chain at the level of ubiquinone (Fig. 3) (79). Genes for a typical ubiquinone-cytochrome c oxidoreductase and a cytochrome aa3-type cytochrome c oxidase are present; the soluble monoheme cytochrome c552 is thought to mediate between the two enzymes. The gene for cytochrome c552 is not located in an energetics-related gene cluster. Contributions to the proton gradient include scalar protons that are products of the HAO reaction and proton translocation by the bc1 complex and the terminal oxidase. Genes for all subunits of a typical ATP synthase are present. Reduction of NAD+ requires proton-driven reverse electron flow, presumably through NADH-ubiquinone oxidoreductase which is encoded in a discrete gene cluster. Reduction of NADP+ may be carried out by the proton gradient-dependent NAD+/NADP+ transhydrogenase also encoded in a gene cluster. The energy cost of reduction of NADP+ is thus very high. Two gene clusters contribute an Na+-dependent NADH-ubiquinone reductase and an Na+/H+ antiporter, which may facilitate some Na+-driven secondary transporters and also may be significant in marine environments.
The genome reveals that N. europaea has a relatively limited number of optional paths to terminal electron acceptors. Only one type of terminal oxidase of the aa3 family is present. The only other potential terminal oxidase is the soluble, periplasmic copper enzyme noted below. The apparent presence of a membrane-anchored cytochrome c4 suggests that electrons from ubiquinol may, at times, bypass the bc1 complex on their way to cytochrome oxidase.
N. europaea is incapable of reducing nitrate but is capable of reducing nitrite with the formation of nitric and nitrous oxide but not dinitrogen (reviewed in reference 41). No full ORFs were identified with strong similarity to known dissimilatory nitrate reductases (EC 1.7.99.4) or nitrous oxide reductases (EC 1.7.99.6), a finding consistent with the biochemical evidence. A cluster of genes encodes periplasmic proteins seemingly related to the transfer of electrons to and reduction of nitrite and/or oxygen (Fig. 4B). The first gene encodes an aerobically expressed, soluble “blue-copper oxidase” (24). The next two genes encode a soluble, monoheme c cytochrome and a diheme c cytochrome, respectively. The fourth gene in this cluster has the best match to the aniA gene of Neisseria gonorrhoeae encoding an inducible nitrite reductase (PAN1) (verified by mutational inactivation of nitrite reduction) (56). The putative nitrite reductases from other Neisseria spp. and N. europaea all have significant similarities to conserved domains from copper-containing nitrite reductases from several denitrifying bacteria but form a separate clade in the phylogeny of nirK. A signal sequence of 24 amino acids is predicted. N. europaea appears to have a distinct form of nirK versus other ammonia oxidizers from the β subdivision (17). Disruption of the putative nirK did not lead to a loss of nitrous oxide production but did result in an increased sensitivity to nitrite (5). The NO reducing system is encoded in a nor gene cluster (norCBQD; NE2003, NE2004, NE2005, and NE2006) with an organization similar to that found in Pseudomonas sp. strain G-179 (6). Anaerobic metabolism of N. europaea was reported with pyruvate as the reductant and nitrite as the terminal electron acceptor (1). The genome encodes the enzymes necessary for this process as mediated by pyruvate dehydrogenase and possibly the citric acid cycle.
Under these conditions, electrons apparently pass to nitrite reductase by way of NADH-ubiquinone reductase. Anaerobically, Nitrosomonas eutropha and N. europaea are reported to oxidize H2 and reduce nitrite, although the rate was much higher in N. eutropha (12). A gene for hydrogenase was not identified in the genome of N. europaea.
Assimilation of carbon dioxide is initiated by a type I RubisCo (Fig. 4). The genes for this enzyme are most similar to those from Acidithiobacillus ferrooxidans, another obligate lithoautotroph. A carbonic anhydrase gene (cynT) is next to a gene for an anion transporter and only 4.6 kb from the RubisCo genes (Fig. 4A). If this transporter is for carbonate or bicarbonate, then it and carbonic anhydrase would promote accumulation of CO2, the substrate for RubisCo. Genes for three additional carbonic anhydrases were also identified. CO2-repressible carbonic anhydrase activity has been observed in Nitrosomonas (43). Ralstonia eutropha H16, upon the inactivation of the gene encoding carbonic anhydrase, was unable to grow at ambient CO2 concentration (49). N. europaea does not have genes associated with the production of carboxysomes (16).
With the exception of two enzymes, genes for all enzymes to complete the Calvin-Benson-Bassham cycle are present. A gene for sedoheptulose 1,7-bis-phosphatase (EC 3.1.3.37) is absent. However, the fructose 1,6-bis-phosphatase (EC 3.1.3.11) encoded by NE0521 may have higher activity with sedoheptulose 1,7-bis-phosphate and may function primarily in its hydrolysis and not in gluconeogenesis (see below). This was found to be the case with the highly similar enzymes from Ralstonia metallidurans (formerly Alcaligenes eutrophus) (84) and Xanthobacter flavus (67). The gene encoding NADPH-dependent glyceraldehyde-3-phosphate dehydrogenase (EC 1.2.1.13) is also absent and is apparently replaced by a gene for the NADH-dependent enzyme (EC 1.2.1.12) “borrowed” from and also used by gluconeogenesis and glycolysis, as has been observed in other chemoautotrophs (67). Since during growth of this obligate autotroph it is unlikely that there is an extended period of time when both CO2 fixation and gluconeogenesis are not occurring, there is little advantage in having two separate genes that can be independently regulated. In fact, energy is conserved by having only one enzyme. Metabolic regulation at the enzyme level can still provide the appropriate flux of gluconeogenesis and glycolysis.
A very significant energy savings is achieved by the use of NADH rather than NADPH (generated from NADH in an energy-dependent reaction) to reduce 3-phosphoglycerate. It is interesting that an obligate autotroph, lacking catabolic pathways, had no reason to adopt a system of metabolic regulation dependent on the separate use of NAD+ or NADP+ as a redox mediator and effector in catabolic or anabolic pathways, respectively, as is seen in the Eukaryota. It continues to use both NADH and NADPH for biosynthesis.
Genes for the enzymes common to gluconeogenesis and glycolysis were present. From an energetic point of view it seems likely that the flux through either pathway is limited to biosynthetic requirements or recycling of fixed carbon. As noted above, the gluconeogenic enzyme fructose 1,6-bis-phosphatase (EC 3.1.3.11) likely has a substrate specificity suitable for a role in the hydrolysis of sedoheptulose 1,7-bis-phosphate. The hydrolysis of fructose 1,6-bis-phosphate may be carried out by a pyrophosphate-dependent 6-phosphofructokinase (EC 2.7.1.90), i.e., fructose 1,6-bis-phosphate + Pi → fructose-6-phosphate + PPi. Notably, the adjacent gene encodes a pyrophosphatase whose action would “pull” gluconeogenesis. The same pyrophosphate-dependent 6-phosphofructokinase, which is reversible, may also catalyze the production of fructose 1,6-bis-phosphate in glycolysis (participating in a pyrophosphate-dependent energy economy). Genes with high sequence similarities to ATP-dependent 6-phosphofructokinases (EC 2.7.1.11) were not found.
Genes for all enzymes of the tricarboxylic acid (TCA) cycle were identified. The predominant theory for the basis of obligate autotrophy holds that the TCA cycle is incomplete in these organisms (68). Indeed, for many autotrophs, including N. europaea (40), α-ketoglutarate dehydrogenase activity was not detected or was very low, in contrast to other activities of the TCA cycle. However, genes for subunits E1, E2, and E3 (which is shared by pyruvate dehydrogenase) are present. It remains to be seen whether the genes for α-ketoglutarate dehydrogenase are expressed.
N. europaea does not synthesize glycogen or β-hydroxybutyrate as storage products but does accumulate polyphosphate when growth is limited by low values of pH (72). The gene for polyphosphate kinase (NE0323) found in the genome has a very high sequence similarity to the equivalent gene in N. meningitidis.
Biochemical evidence suggests that ammonia is assimilated via glutamate dehydrogenase (40), and this pathway is consistent with the presence of a gene coding for an NADPH-specific glutamate dehydrogenase. Also consistent with the biochemical evidence (11), a glnA homolog, which encodes glutamine synthetase, is present. Although a putative glnE, which encodes the glutamine synthetase adenylating enzyme, was found, homologs for glnD (PII uridyl transferase), glnB (PII), or glnK (alternative PII) were not identified. The absence of a gene encoding a PII protein is surprising given the broad distribution of this regulatory protein. Glutamate synthase activity was not observed previously (11). Although a gene with similarity to glutamate synthase is present, the gene is truncated and appears to encode only the domains that transfer ammonia to α-ketoglutarate and accept and transfer electrons from a donor to the reductive amination domain. The domain responsible for the hydrolysis of ammonia from glutamine was not identified in this truncated gene or elsewhere in the genome. The gene profile supports the biochemical evidence that ammonia is assimilated via glutamate dehydrogenase, whereas the role of glutamine synthetase is to produce glutamine.
An ammonium transporter is present, which at low pH may supplement the passive uptake of ammonium. Nitrosomonas is reported to assimilate nitrite-N but not nitrate-N (62). The observed assimilation of nitrite-N is presumed to involve the siroheme-containing sulfite or nitrite reductase that is encoded in the genome. It is interesting that this frugal autotroph would expend reducing power on the production of ammonium.
Genes encoding the classical urease (e.g., ureABC coding for urea amidohydrolase) are not present. A candidate for urea metabolism in N. europaea is a variation of urea amidolyase. However, the gene in N. europaea is shorter than the one found in S. cerevisiae (a well studied system; [15, 31, 32]) and appears to contain only the carboxylase function of this bifunctional enzyme, and not the hydrolase/amidase function. No good homologs for the hydrolase/amidase were identified. To our knowledge, there are no reports indicating that urea can support the growth of N. europaea.
Autotrophy requires the ability to synthesize most, and in the case of N. europaea, all required small molecules and macromolecules from inorganic constituents. The gene profile of N. europaea is consistent with this requirement. In general, most and often all of the genes needed for particular biosynthetic pathways can be identified. Fatty acid and lipid synthesis, production of cofactors and prosthetic groups, nucleic acid synthesis, and amino acid synthesis can all be accounted for based on the gene profile. For example, similarity searches revealed many genes for the synthesis of purines and pyrimidines. For purine synthesis, genes were present for de novo synthesis of adenosine and guanosine phosphates and their deoxy derivatives. Of the 23 steps required for the synthesis of ATP, dATP, GTP, and dGTP from ribose-5-phosphate, genes for all of the steps were identified. In the case of pyrimidine synthesis, complete pathways for synthesis from carbamoyl phosphate to UTP, dTTP, CTP, and dCTP were identified.
The genes for enzymes needed to synthesize fatty acids up to hexadecanoate from acetyl coenzyme A (acetyl-CoA) were identified. N. europaea has three 3-ketoacyl-ACP synthases that catalyze the initial condensation reaction between acetyl-CoA and malonyl-CoA. Both synthase I (encoded by fabB) and synthase II (encoded by fabF) can elongate saturated fatty acids; however, only synthase I can catalyze the synthesis of unsaturated fatty acids. Synthase III (encoded by fabH) is involved with branched-chain fatty acid biosynthesis. As in other bacteria, several genes involved in fatty acid synthesis in N. europaea are arranged in an operon containing fabF, acpP (acyl carrier protein), fabG (3-ketoacyl-ACP reductase), fabD (malonyl-CoA ACP transacylase), fabH, and plsX (undefined role in fatty acid biosynthesis). No match for enoyl-ACP hydratase was found in the genome. However, a 3-hydroxymyristoyl/2-hydroxydecanoyl ACP hydratase was found (encoded by fabZ) that has a broad substrate range, including both short-chain and saturated and unsaturated long-chain fatty acids (35). Genes for the synthesis of squalene from dimethylallyl-PP and isopentenyl-PP were identified. In addition, the gene for the branch point for hopane synthesis from squalene, squalene-hopene synthase, is present. However, the gene for the previous enzymatic step, squalene epoxidase, was not found.
Amino acid synthesis pathways are among the most conserved, and their genes allow for comparisons of gene organization among organisms. Genes for the synthesis of aromatic amino acids and histidine illustrate this point (Fig. 4C and D). The synthesis of Phe, Tyr, and Trp from phosphoenolpyruvate and erythrose-4-phosphate is via the shikimic acid pathway. Genes for all enzymes required for chorismate synthesis and the tryptophan branch were identified. Tryptophan is synthesized from chorismate via anthranilate. All elements of the pathway were identified and are found in three clusters in the genome. Anthranilate synthase component I (EC 4.1.3.27; encoded by trpE) and anthranilate phosphoribosyltransferase (EC 2.4.2.18; encoded by trpD) likely form a bifunctional enzyme. N-(5′-Phosphoribosyl) anthranilate isomerase (EC 5.3.1.24; encoded by trpF) and indole-3-glycerolphosphate synthase (EC 4.1.1.48; encoded by trpC) also likely form a bifunctional enzyme. The genes for the tryptophan synthase α (EC 4.2.1.20; encoded by trpA) and β (encoded by trpB) chains are contiguous.
Phenylalanine could be produced via the nonarogenate branch from chorismate to phenylpyruvate by a dual-function chorismate mutase-prephenate dehydratase (P protein) (NE0335). An aminotransferase (NE0336) would convert phenylpyruvate to phenylalanine. The P protein will apparently also produce free prephenate. An aspartate aminotransferase (EC 2.6.1.1) could presumably transform prephenate to arogenate. No aromatic-specific aminotransferase (EC 2.6.1.57) was identified, but HisC2 has high similarity to a Pseudomonas stutzeri aminotransferase with a broad specificity in the biosynthesis of histidine, phenylalanine, or tyrosine. In N. europaea, the prephenate dehydratase of the P protein is specific and does not metabolize arogenate (71). Therefore, arogenate is a precursor of tyrosine but not phenylalanine. This observation was supported by the pathway identified in the genome sequence.
Tyrosine synthesis occurs via prephenate, which is produced by the P protein. An aminotransferase then forms arogenate (see above). Arogenate dehydrogenase function may be included in another dual-function enzyme combining arogenate dehydrogenase (EC 1.3.1.43) and prephenate dehydrogenase (EC 1.3.1.12) activities (TyrAc). However, Subramaniam et al. (71) observed that this enzyme was arogenate and NADP+ specific in N. europaea. Therefore, arogenate would be an obligatory intermediate in tyrosine synthesis and 4-hydroxyphenylpyruvate would not be formed. Subramaniam et al. (71) have shown that tyrosine is required for the prephenate dehydratase activity and that phenylalanine is an inhibitor, thus assuring a balanced production of phenylalanine and tyrosine. For tyrosine to regulate phenylalanine synthesis, two separate paths must be present (after prephenate). Since the aspartate aminotransferase is probably promiscuous (for phenylpyruvate and prephenate), this requires the P protein to be specific for prephenate (versus arogenate). Subramaniam et al. (71) observed that the use of arogenate as a precursor for tyrosine and phenylpyruvate for phenylalanine is characteristic of the cyanobacteria and coryneform bacteria.
The histidine biosynthetic operon in N. europaea exhibits overall organization (Fig. 4D) similar to that seen in N. meningitidis and E. coli, with some exceptions that help to place gene fusion events along the evolutionary lines of the proteobacteria. In N. europaea, the majority of the his genes are contiguous; however, hisDG genes are separated from the rest of the operon. The hisI and hisE genes are not fused in N. europaea, but they are adjacent genes whose ORFs overlap. In the enteric bacteria (γ-proteobacteria) examined to date, the hisIE gene product is a bifunctional enzyme. Examples of monofunctional enzymes encoded by hisI and hisE are commonly found in the β subdivisions. This observation would indicate that the hisIE gene fusion occurred after the enteric proteobacteria split from these subdivisions. The hisB gene of N. europaea is predicted to encode imidazole glycerol phosphate dehydratase (EC 4.2.1.19), while the histidinol phosphatase (EC 3.1.3.15) is likely encoded by a separate gene (NE1185) outside of the operon, similar to that found in R. solanacearum (63). This observation confirms that the fusion of these two activities occurred after the evolutionary split separating the γ subdivision from the other subdivisions of the proteobacteria (29). HitA matches with a nucleotide-binding protein, similar to members of the HIT (histidine triad) family. A second putative hisC gene (NE0647) encoding an aminotransferase was identified outside of the his operon structure. However, this gene may be involved in the biosynthesis of other aromatic amino acids (see above).
The pathways for glycine, serine, and threonine synthesis are identified and are presumed to be the starting materials for the synthesis of the osmoprotectants betaine and glycine betaine. The betA gene encoding choline dehydrogenase (EC 1.1.99.1) in the osmoregulatory choline-glycine betaine pathway was identified (NE1237), although other genes found in the bet operon of proteobacteria were not found in close proximity. A possible ABC transport system for osmoprotectants may be encoded by a polyamine transport operon similar to potABCD in P. aeruginosa (NE1870 to NE1873).
Genes for aminoacyl tRNA synthetases for all amino acids are present. Glutamyl-tRNA synthetase is encoded by gltX (NE1624). The duplicate glnS genes (NE2356 and NE2363) likely encode a specific glutaminyl-tRNA synthetase rather than the indiscriminate glutaminyl/glutamyl-tRNA synthetase. The absence of a complete gatCAB operon encoding the Glu-tRNA amidotransferase further supports the glutamine specificity for glnS. The presence of both glnS and gltX may be the result of an earlier gene duplication event and subsequent functional divergence (14).
In contrast to the genes for the biosynthesis of cellular constituents, genes for the catabolism of organic compounds are scant. For example, no genes for the degradation of either purine or pyrimidine nucleosides were identified. Salvage pathways for nucleotides were present and included genes for DNA exo- and endonucleases and ribonucleases. Nucleoside salvage appeared to be limited to uracil and thymidine and was nonexistent for the nucleobases. Likewise, complete pathways for the catabolism of most amino acids were not identified. Where a few genes were present, these genes were most often also required for processes other than catabolism. For a few of the simpler amino acids (e.g., aspartate, serine, and glycine), for which transamination would result in an intermediate in a primary pathway such as the TCA cycle, pathways for catabolism could be envisioned based on the gene profile. Likewise, as discussed above, genes for complete glycolytic pathways and the TCA cycle are present, suggesting that complete oxidation of simple sugars and organic acids should be possible.
A notable exception to the dearth of genes for catabolic enzymes is seen with fatty acid oxidation, since genes for all of the enzymes required for fatty acid oxidation are present. As in E. coli and other bacteria, many of the activities of fatty acid oxidation are contained in two subunits of a multienzyme complex. The 3-hydroxyacyl-CoA dehydrogenase, enoyl-CoA hydratase, cis-Δ3-trans-Δ2-enoyl-CoA isomerase, and the 3-ketoacyl-CoA epimerase are all contained in one subunit (encoded by fadA). The 3-ketoacyl-CoA thiolase is associated with a second subunit (encoded by fadB). As in other bacteria, the fadA and fadB genes are adjacent. Although N. europaea does have phospholipase D, which cleaves the head group from phospholipids, the genes for phospholipase A1 and A2 (enzymes that remove the fatty acids) and phospholipase C were not present, suggesting that N. europaea is not able to degrade phospholipids.
N. europaea has an array of active (primary and secondary) transporters (i.e., ion-coupled, ATP hydrolysis, or redox-driven transporters). Approximately 285 ORFs in its genome (11.5%) are dedicated to the active transport of molecules across its membranes. Other gram-negative bacteria have 3 to 12% of the ORFs in their genome encoding proteins involved in transport (78). The number of predicted ABC transporters in N. europaea, 13, is similar to the numbers in other lithoautotrophic bacteria but smaller than the numbers in facultative bacteria. For example, in Methanobacterium thermoautotrophicum ΔH (a lithoautotrophic thermophilic archaebacterium), 10 clusters are predicted to code for ABC transporters (69). In Aquifex aeolicus (an obligate chemolithotrophic eubacterium), 13 clusters similar to ABC transporter systems are present (22). In contrast, E. coli (a chemoorganoheterotrophic bacterium) has 51 clusters containing sequences similar to ABC transporters (out of 746 transport and binding gene products, of which 382 are characterized; 15.8% of the total ORFs). Thermotoga maritima (representing a new genus of unique extremely thermophilic eubacteria growing up to 90°C) has 31 ABC transporters (57). Halobacterium sp. strain NRC-1 has at least 27 members of the ABC transporter superfamily.
In N. europaea ca. 14% (40 ORFs) of the active transport proteins are dedicated to Fe transport (see below). The 13 putative active transporters that include ATP-binding cassettes (ABC) represent approximately 75 ORFs or 3% of the total ORFs. These ABC transporters contain at least two of the typical three components for ABC transporters (nucleotide-binding domain, membrane-spanning domain, and solute-binding protein). ABC-type multidrug, protein, and/or lipid transporters are predominant. For instance, capsular polysaccharide export and polysaccharide or polyol phosphate-type transporters are present, as is an ABC-type sugar transferase system involved in lipopolysaccharide synthesis with a putative membrane cation efflux permease. This lipopolysaccharide transporter may confer to N. europaea the ability to adhere to surfaces. Two ABC-type transporters for anions (e.g., nitrate) could also be inferred. The nitrate transport systems have similarity to the sulfonate and bicarbonate transporters (52). A sulfate permease and an ABC-type sulfate/molybdenum transport system are present. N. europaea has two ORFs coding for a potassium uptake system and another for a sodium-translocating NADH dehydrogenase. Two loci for heme-exporting proteins adjacent to cytochrome related genes are also present in the genome.
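The proportions quoted in the two preceding paragraphs can be rechecked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis. It assumes that "total ORFs" corresponds to the 2,460 protein-encoding genes reported for the genome; the variable names and the `percent` helper are illustrative only.

```python
# ORF counts quoted in the text. TOTAL_ORFS is an assumption: the text does not
# state which denominator was used, so the 2,460 protein-encoding genes reported
# for the genome are taken as the total here.
TOTAL_ORFS = 2460
ACTIVE_TRANSPORT_ORFS = 285   # "approximately 285 ORFs" for active transport
FE_TRANSPORT_ORFS = 40        # active transport ORFs dedicated to Fe
ABC_TRANSPORT_ORFS = 75       # ORFs in the 13 putative ABC transporters


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


active_share = percent(ACTIVE_TRANSPORT_ORFS, TOTAL_ORFS)
fe_share = percent(FE_TRANSPORT_ORFS, ACTIVE_TRANSPORT_ORFS)
abc_share = percent(ABC_TRANSPORT_ORFS, TOTAL_ORFS)

print(f"Active transport ORFs: {active_share:.1f}% of all ORFs")
print(f"Fe transport ORFs:     {fe_share:.1f}% of active transport ORFs")
print(f"ABC transporter ORFs:  {abc_share:.1f}% of all ORFs")
```

Under this assumption the Fe and ABC shares come out at about 14% and 3%, matching the text, while the active transport share comes out at about 11.6% rather than the quoted 11.5%. The small gap is consistent with rounding of the approximate ORF count or with a slightly different total having been used.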
Systems for the uptake of organic molecules are few in N. europaea. ABC transporters for some single molecules are present (i.e., for glutamine and spermidine or putrescine). An amino acid permease and an amino acid transporter are present but are not linked with a nucleotide-binding domain or a membrane-spanning domain, which are characteristic of ABC transporters. Only one sugar transporter is apparent as a phosphotransferase system (PTS) with similarity to fructose or mannose transporters. Two additional PTSs are nitrogen related. The limited number of permease genes for organic molecules may contribute to the obligate nature of lithoautotrophy by N. europaea.
The genome also contains loci encoding a Hg scavenger-like transport system, which could be responsible for heavy metal tolerance. Furthermore, several candidate genes encoding divalent cation (Cd, Zn, and Co) transporters have been identified. Because of the high redundancy of encoding genes, the cases of iron and zinc transport are discussed below.
The subject of iron uptake is among the most interesting revelations of the genome sequence. Although genes for siderophore biosynthesis seem to be absent, the genome of N. europaea contains 20 likely functional fecIR gene tandems, most of which are directly preceded or succeeded by iron siderophore receptor-encoding genes (Fig. 5). Among these are ORFs with significant similarity to known receptors for the uptake of ferrichrome-, pseudobactin-, and pyoverdin-like siderophores and other ferric iron siderophores, whose regulation seems to be linked to the FecI/FecR sigma factor/membrane sensor system of ferric-dicitrate iron accumulation (26). Overall, the genome of N. europaea appears to contain 22 genes encoding FecI-sigma factor-like proteins (Fig. 5). Our phylogenetic analysis of putative FecI-FecR homologs revealed that an ancestral fecI and fecR gene tandem has coevolved into two distinct subgroups of fecI and fecR gene tandems by an early duplication event. While the fecIR gene tandems have further coevolved by numerous gene duplication events within their subgroups, these subgroups did not evolve with any preference for the proximity of a particular iron siderophore-sensing receptor gene (see indices in Fig. 5). Whereas all fecR genes were found next to a fecI gene, only two fecI genes were found not in tandem with a fecR gene. Because unnecessary redundancy is usually easily lost from prokaryotic genomes, we predict that the identified fecIR gene tandems have physiological relevance for Nitrosomonas' high need for effective iron acquisition. Additionally, a gene encoding an enterobactin-like siderophore receptor (NE1205) was found in close proximity to a substrate-binding protein (NE1206) and five ORFs that likely comprise an ABC type 2 transport uptake system (NE1207 to NE1211). The overall scenario suggests that N. europaea may utilize siderophores produced by other organisms in its environmental consortium while under iron stress, and the possible citrate mechanism may serve as a “last resort” in the event iron is not available from any other source. The advantage of this mechanism would be the ability of the organism to supply its iron requirement without the costly secretion of reduced carbon. The fecIR gene tandem NE1217-NE1218 (followed by a TonB-dependent outer membrane receptor) is adjacent to a copF-like gene (NE1216) that may encode a Cu2+ cation transport ATPase. Copper ion transport is likely facilitated by the product of NE1019, which is highly similar to the CopA copper transport ATPase from Staphylococcus aureus. To prevent copper toxicity while facilitating Cu utilization, N. europaea likely expresses the three copper resistance proteins A, B, and D (NE0279, NE0280, and NE2058); a copper-binding protein, CopC (NE1491); and an inner membrane copper tolerance protein (NE2389).
In addition to iron and copper sensors and transporters, the N. europaea genome contains three separate gene clusters putatively involved in divalent cation (cobalt, zinc, and cadmium) transport (Fig. 4E). Cluster 1 (NE0346/5/4/3) is organized similarly to the czc gene cluster in R. metallidurans CH34, and genes in the second cluster (NE0373/4/5/6/7) are arranged like the czt gene cluster in P. fluorescens 13525, whereas the third cluster seems to lack the two-component regulatory system found in the other two clusters (Fig. 4E). The third czc gene cluster is in proximity to a gene coding for an Mg2+ transporter protein, MgtE (NE1633). Based on the similarity to genes encoding heavy metal efflux systems in Ralstonia and Pseudomonas spp., the czc and czt clusters may also encode heavy metal efflux systems in N. europaea. Because of its proximity to a putative Mg2+ uptake protein, the third czc cluster may be involved in metal uptake.
N. europaea, like virtually all other aerobic organisms, is expected to contain enzymes that convert superoxide and hydroperoxides into innocuous products (21, 42, 48). Abundant aerobic bacteria such as Bacillus subtilis and P. aeruginosa have elaborate and redundant complements of superoxide dismutases (SODs), hydroperoxidases (HP), and “iron management” enzymes, and their expression is regulated through complex regulatory networks (70). The genome of N. europaea contains genes that encode a monofunctional small-subunit catalase (HPII, katA), a catalase-peroxidase (HPI, katG), a thioredoxin-dependent peroxide reductase (alkyl HP, ahpC), and an iron-containing SOD (Fe-SOD, sodB). The HPI-encoding gene is preceded by a truncated copy of its N terminus, as found in the genome of Burkholderia fungorum (cepacia) LB400, also a member of the β-proteobacteria. Most of the HPs are heme-containing enzymes and genes coding for bacterioferritin (Bfr, bfr) and bacterioferritin comigratory protein (Bcp) were identified. In contrast, genes encoding a thioredoxin reductase (NADH-peroxiredoxin reductase, AhpF), which is part of the thioredoxin redox couple in many bacteria, glutathione oxidoreductase (gorA), and other isozymes of HP or SOD were not found in the genome.
The cytochrome c peroxidase of N. europaea is a diheme cytochrome. Homologous enzymes require the reduction of one heme before reaction with hydrogen peroxide. In contrast, the diferric form of the N. europaea enzyme reacts with hydrogen peroxide, making it a relatively better scavenger at a higher cellular redox potential (2). The cytochrome P460 in N. europaea has a possible NO-scavenging role (10). The gene encoding cytochrome P460 in N. europaea was sequenced previously (8) and is corroborated in the genome sequence.
To detect and defend themselves against oxidative damage, E. coli cells sense their cytoplasmic redox state with the OxyR protein, whose expression is upregulated autogenously by hydrogen peroxide concentrations of 50 to 200 nM and which is a key regulator protein in many multigene stress defense networks in most bacteria (70). Surprisingly, the genome of N. europaea does not contain a gene similar to known oxyR genes. Thus, the katG and ahpC genes, which are “normally” regulated by OxyR, are likely independently regulated in N. europaea, as is the gene encoding the ferric uptake regulator protein (Fur, NE0616, fur). The Fur protein, for instance, regulates ferric citrate (FecIR) and ferrichrome (fhu operon) transport, exotoxin synthesis, and the expression of HPs (HPI and HPII) in several bacteria (75, 77). Genes encoding other regulators in the Fur protein family such as Zur, the zinc uptake regulator protein, were found in the genome of N. europaea (Zur, NE1722, zur). OxyR also controls the expression of the alternate sigma factor for stationary-phase-specific transcription, RpoS, in enterobacteria, whereas rpoS gene expression in P. aeruginosa is dependent on cell density (80). Surprisingly, the genome of N. europaea lacks an rpoS-like gene entirely. On the other hand, genes encoding other alternative sigma factors such as RpoN (σ54-NE0062), RpoH (σ32-NE0584), and RpoE (σ24-NE2331) were found in the genome.
N. europaea is motile and can form biofilms; hence, its genome should contain structural and regulatory genes necessary for the synthesis (and its regulation) of flagella, as well as for the correlation of flagellar synthesis and function with environmental cues and challenges. The complement of operons needed for flagellum biosynthesis is complete compared to available information from other bacteria (13, 25). However, the organization of these genes and the operon locations in the genome are remarkably different from those in other bacteria (25, 51). As far as is known, the flagellar master operon flhDC (NE2407/2406) is required for the transcriptional initiation of flagellation and chemotaxis both through direct activation and/or derepression of operons and indirectly through control of the FliA protein (NE2491), an alternative sigma factor (σ28). Five classes of methyl-accepting chemotaxis proteins (MCPs) are known in enterobacteria (13, 51), and the N. europaea genome contains genes similar to members of three classes: (i) a Tsr-like protein (with HAMP domain; NE1864) that directly senses Ser, Ala, Gly, and aminoisobutyrate (tsr); (ii) a Tar-like protein (with PAC, PAS, MA, and HAMP domains; NE1863) that senses Asp and Glu directly and maltose through a periplasmic binding protein and is responsive to Co and Ni (tar); and (iii) a Tap-like protein (with MA domain; NE1251) that senses dipeptides through a periplasmic binding protein (tap). Additionally, a gene encoding a pseudomonad PilJ-like MCP (with the MA domain; NE1251) was identified. It is not yet known if these N. europaea genes are responding to these same attractants. Genes similar to ones known to encode the ribose/glucose/galactose sensor Trg and the redox sensor Aer in enterobacteria were not found. 
In comparison to other genomes, the best-conserved cluster is the one with the genes encoding the proteins CheA, CheW, and MCP-Tsr (NE1866, NE1865, and NE1864), followed by genes encoding MCP-Tar, CheR, and CheB (NE1863, NE1861, and NE1859), a sequence of chemotaxis genes that is found in the E. coli genome. The other two MCP-encoding genes are adjacent to genes encoding additional CheW proteins (NE1250/51 and NE1396/97). In comparison to the P. aeruginosa genome, which encodes more than 20 MCP-like proteins, the chemotactic responsiveness of N. europaea mediated through only 4 MCP-like proteins is rather limited. Remarkable also are the fairly distant locations of the operons that contain fliA (NE2491), cheYZ (NE1923 and NE1924), and cheA (NE1866), which are usually in close proximity to or members of the same operon (25). The N. europaea genome also lacks loci encoding the FlgM/FliT proteins that regulate FliA availability in enterobacteria through anti-sigma activity (51). Given the slower growth rates, such a complex regulation of flagellation would likely drain more energy from the tight budget than it could save. Because N. europaea seems able to respond to the autoinducer N-acylhomoserine lactone (AHL) and to form biofilms (4), some sort of integration of these signal cascades can be expected. The flagellar master regulon flhDC is also involved in the regulation of virulence factor synthesis and cell division and seems to be a good candidate for functional compensation of the missing stationary-phase regulation via RpoS. Like its β-proteobacterial relative Neisseria, N. europaea contains only the essential suite of cell division proteins (ftsI, murEF, mraY, murD, ftsW, murGCB, ddlB, NE0985 to NE0994; ftsQAZ, NE0995 to NE0997; ftsK, NE1051; and minCDE, NE1831 to NE1829), and it lacks genes for SulA, ZipA, FtsL, and FtsN proteins that are found in γ-proteobacteria (53).
Surprisingly, it lacks the functional two-component regulatory systems of LasRI/RhlR, which play key roles in connecting quorum sensing, motility, stationary-phase response, and the synthesis of virulence and stress tolerance factors in many environmental bacteria such as pseudomonads (81). Nevertheless, since the synthesis of autoinducer branches off fatty acid biosynthesis pathways and genes encoding the FAB pathway have been identified in N. europaea, we conclude that NE1184 encodes a putative AHL synthase similar to that found in P. fluorescens (50). In P. fluorescens, this autoinducer synthase produces both short- and long-chain AHLs (three identified). Although N-(3-oxohexanoyl)-l-homoserine lactone acted as a signal molecule in cell density regulated recovery from starvation in N. europaea (4), the actual sensor molecule(s) produced have not been biochemically characterized.
The gene profile reveals the basic array of catalysts necessary for all of biosynthesis starting with ammonia and carbon dioxide. Genes necessary for catabolism of ammonia, energy and reductant generation, biosynthesis, and CO2 and NH3 assimilation were identified. In contrast, genes for catabolism of organic compounds are limited. Whereas genes are present for the enzymes to degrade macromolecules to monomers and building blocks (e.g., by proteases and nucleases), pathways for further degradation are not represented in the genome. A likely consequence of this lack of degradation pathways is that this energy-challenged bacterium is expected to secrete rather than to recycle many organic wastes. N. europaea also appears to have a limited ability to transport organic molecules from the environment into the cell.
One of the interesting features of this bacterium is that it is obligate with regard to its necessity to use ammonia as a substrate. No other sources of energy, at least as sole sources of energy, have been identified. The genome provides some new insights into this nutritionally limited status. No evidence for other lithotrophic capabilities was apparent in the genome. For example, the presence of genes for a hydrogenase might have indicated that the bacterium could also grow with H2 as the source of energy and reductant. However, for this ammonia-oxidizing bacterium, genes for the utilization of H2, CO, Fe, or other inorganic sources of energy were not identified. N. eutropha was shown to grow anaerobically with H2 as the reductant and nitrite as the electron acceptor. Such a capability in N. europaea would seem unlikely, given the absence of recognizable hydrogenase genes. As described above, genes for the catabolism of organic compounds are scant. Nonetheless, complete pathways for a few compounds can be written based on the gene profile. At least in the case of fructose and/or mannose, a complete PTS transporter also seems to be present. To date, growth has not been observed on any organic compounds. The presence of the genes for α-ketoglutarate dehydrogenase, the step in the TCA cycle that appears to be absent in many obligate autotrophs, further clouds our understanding of obligate autotrophy. Although the genome provides some insights into this nutritionally limited status, it also raises new questions about why at least some organic compounds are not utilized.
One of the more interesting and unexpected findings to come out of the genome is insight into the strategy of N. europaea to accumulate Fe from the environment. These bacteria have a large appetite for Fe given all of the cytochromes they produce. Therefore, it was not surprising to learn that receptors for Fe-siderophores are encoded. The diversity of different classes of receptors and the number of genes devoted to these receptors (>20) was something of a surprise. However, even more surprising was the almost complete absence of biosynthetic pathways for siderophores. An exception is the citrate transporter, which utilizes a product and intermediate of the TCA cycle. Apparently, N. europaea relies on other bacteria to produce siderophores and then uses an arsenal of receptors to harvest their products. This reliance on other bacteria for siderophore production is in contrast to the otherwise complete self-reliance of this bacterium, which can extract energy and produce complex cellular constituents from simple inorganic nutrients. It is likely that N. europaea seeks to manage the Fe nutrition of its environment as a survival strategy (much like some pseudomonads on plant surfaces or pathogens in hosts).
Only two other complete genomes of β-proteobacteria are currently published, namely, N. meningitidis (74) and R. solanacearum (63). Although 31% of the genes in N. europaea were most similar to genes from R. solanacearum based on current BLAST searches, 13% of the genes were most similar to P. aeruginosa genes. As more β-proteobacterial genomes are completed, it will be interesting to see how similar their genomes are to N. europaea in terms of both gene similarity and genome organization. Of particular interest will be the genome of A. ferrooxidans, another β-proteobacterium that is also an obligate lithoautotroph, albeit with a dependence on reduced sulfur or ferrous iron rather than ammonia. It also remains to be seen how representative N. europaea is of other ammonia-oxidizing bacteria. As noted above, major rearrangements of the locations of the amo/hao clusters are already evident between N. europaea and Nitrosomonas strain ENI-11. In comparing N. europaea to other ammonia oxidizers, differences in the number of copies of amo and hao and in their physiological capabilities (e.g., growth of N. eutropha on H2 and the presence of urease in Nitrosospira) suggest that more differences should be expected.
Another interesting feature of the genome is the preponderance of insertion sequences. Both the variety and the number of elements suggest that this genome is prone to loss or accession of genetic elements. For N. europaea, it seems most likely that the genome is decreasing in size and the insertion sequences may be assisting in this evolutionary march toward a more compact genome. Two lines of evidence support this notion. First, the genomes of many other proteobacteria are much larger. If N. europaea, like all of the proteobacteria, had its origin as a phototroph, then it has lost this ability. While other proteobacteria that have lost phototrophy have nonetheless gained additional capabilities (and larger genomes), N. europaea has arrived at a smaller genome. Similar dramatic gene losses have occurred in other divisions of the gram-negative bacteria such as the sphingobacteria and planctobacteria (18). The second line of evidence comes from thinking about the minimum genome set that is required for lithoautotrophy or photoautotrophy. Certain cyanobacteria (e.g., Prochlorococcus spp.) and archaebacteria (e.g., M. thermoautotrophicum) reveal that a much smaller genome is sufficient to sustain the autotrophic lifestyle. Perhaps N. europaea is continuing its downsizing toward a more compact genome like other catabolically limited bacteria.
As with other whole-genome sequencing projects, this genome has provided remarkable new insights into the growth and metabolism of this organism. Not unexpectedly, the sequence has also raised a number of questions that now await experimental investigation.
This research was funded by the Biological and Environmental Research Program and the U.S. Department of Energy's Office of Science. The Joint Genome Institute managed the overall sequencing effort, which was carried out primarily at Livermore under the auspices of the U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory, under contract W-7405-Eng-48. Computational annotation was carried out at the Oak Ridge National Laboratory, managed by UT-Battelle for the U.S. Department of Energy under contract DE-AC05-00OR22725. A consortium of investigators from four universities assisted in the analysis of the information made available from the sequencing effort. M. Klotz was supported by an incentive grant awarded by the College of Arts and Sciences at the University of Louisville. Additional support was provided by DOE grant DE-FG03-97ER20266 to D. Arp and L. Sayavedra-Soto.
†For a commentary on this article, see page 2690 in this issue.