Formation of heat-resistant endospores is a specific property of the members of the phylum Firmicutes (low-G+C Gram-positive bacteria). It is found in representatives of four different classes of Firmicutes: Bacilli, Clostridia, Erysipelotrichia, and Negativicutes, which all encode similar sets of core sporulation proteins. Each of these classes also includes non-spore-forming organisms that sometimes belong to the same genus or even species as their spore-forming relatives. This chapter reviews the diversity of the members of phylum Firmicutes, its current taxonomy, and the status of genome sequencing projects for various subgroups within the phylum. It also discusses the evolution of the Firmicutes from their apparently spore-forming common ancestor and the independent loss of sporulation genes in several different lineages (staphylococci, streptococci, listeria, lactobacilli, ruminococci) in the course of their adaptation to the saprophytic lifestyle in nutrient-rich environment. It argues that systematics of Firmicutes is a rapidly developing area of research that benefits from the evolutionary approaches to the ever-increasing amount of genomic and phenotypic data and allows arranging these data into a common framework.
Later the Bacillus filaments begin to prepare for spore formation. In their homogenous contents strongly refracting bodies appear. From each of these bodies develops an oblong or shortly cylindrical, strongly refracting, dark-rimmed spore.
Ferdinand Cohn. 1876. Untersuchungen über Bacterien. IV. Beiträge zur Biologie der Bacillen. Beiträge zur Biologie der Pflanzen, vol. 2, pp. 249–276. (Studies on the biology of the bacilli. In: Milestones in Microbiology: 1546 to 1940. Translated and edited by Thomas D. Brock. Prentice-Hall, Englewood Cliffs, NJ, 1961, pp. 49–56).
Globin-coupled sensors (GCS) are heme-binding signal transducers in Bacteria and Archaea where an N-terminal globin controls the activity of a variable C-terminal domain. Here we report that BpeGReg, a globin-coupled diguanylate cyclase (GCDC) from the whooping-cough pathogen Bordetella pertussis, synthesizes the second messenger bis-(3’–5’)-cyclic diguanosine monophosphate (c-di-GMP) upon oxygen binding. Expression of BpeGReg in Salmonella typhimurium enhances biofilm formation, while knockout of the BpeGReg gene of B. pertussis results in decreased biofilm formation. These results represent the first identification of a gaseous ligand for any diguanylate cyclase and provide definitive experimental evidence that a globin-coupled sensor regulates c-di-GMP synthesis and biofilm formation. We propose that the synthesis of c-di-GMP by globin sensors is a widespread phenomenon in bacteria.
globin; oxygen sensor; c-di-GMP; diguanylate cyclase; biofilm
This review traces the evolution of the cytochrome bc complexes from their early spread among prokaryotic lineages and up to the mitochondrial cytochrome bc1 complex (complex III) and its role in apoptosis. The results of phylogenomic analysis suggest that the bacterial cytochrome b6f-type complexes with short cytochromes b were the ancient form that preceded in evolution the cytochrome bc1-type complexes with long cytochromes b. The common ancestor of the b6f-type and the bc1-type complexes probably resembled the b6f-type complexes found in Heliobacteriaceae and in some Planctomycetes. Lateral transfers of cytochrome bc operons could account for the several instances of acquisition of different types of bacterial cytochrome bc complexes by archaea. The gradual oxygenation of the atmosphere could be the key evolutionary factor that has driven further divergence and spread of the cytochrome bc complexes. On one hand, oxygen could be used as a very efficient terminal electron acceptor. On the other hand, auto-oxidation of the components of the bc complex results in the generation of reactive oxygen species (ROS), which necessitated diverse adaptations of the b6f-type and bc1-type complexes, as well as other, functionally coupled proteins. A detailed scenario of the gradual involvement of the cardiolipin-containing mitochondrial cytochrome bc1 complex into the intrinsic apoptotic pathway is proposed, where the functioning of the complex as an apoptotic trigger is viewed as a way to accelerate the elimination of the cells with irreparably damaged, ROS-producing mitochondria.
bioenergetics; molecular evolution; ubiquinol:cytochrome c oxidoreductase; ubiquinone; plastoquinone; cytochrome c; cardiolipin; cell death; photosynthesis; apoptosome
The class Clostridia in the phylum Firmicutes (formerly low-G+C Gram-positive bacteria) includes diverse bacteria of medical, environmental, and biotechnological importance. The Selenomonas-Megasphaera-Sporomusa branch, which unifies members of the Firmicutes with Gram-negative-type cell envelopes, was recently moved from Clostridia to a separate class Negativicutes. However, draft genome sequences of the spore-forming members of the Negativicutes revealed typically clostridial sets of sporulation genes. To address this and other questions in clostridial phylogeny, we have compared a phylogenetic tree for a concatenated set of 50 widespread ribosomal proteins with the trees for beta subunits of the RNA polymerase (RpoB) and DNA gyrase (GyrB) and with the 16S rRNA-based phylogeny. The results obtained by these methods showed remarkable consistency, suggesting that they reflect the true evolutionary history of these bacteria. These data put the Selenomonas-Megasphaera-Sporomusa group back within the Clostridia. They also support placement of Clostridium difficile and its close relatives within the family Peptostreptococcaceae; we suggest resolving the long-standing naming conundrum by renaming it Peptoclostridium difficile. These data also indicate the existence of a group of cellulolytic clostridia that belong to the family Ruminococcaceae. As a tentative solution to resolve the current taxonomical problems, we propose assigning 78 validly described Clostridium species that clearly fall outside the family Clostridiaceae to six new genera: Peptoclostridium, Lachnoclostridium, Ruminiclostridium, Erysipelatoclostridium, Gottschalkia, and Tyzzerella. This work reaffirms that 16S rRNA and ribosomal protein sequences are better indicators of evolutionary proximity than phenotypic traits, even such key ones as the structure of the cell envelope and Gram-staining pattern.
Sporulation; taxonomy; Gram staining; cellulose; xylan; Clostridium difficile
We have recently reconstructed the ‘hatcheries’ of the
first cells by combining geochemical analysis with phylogenomic scrutiny of the
inorganic ion requirements of universal components of modern cells (Mulkidjanian
et al.: Origin of first cells at terrestrial, anoxic
geothermal fields. Proc Natl Acad Sci USA 2012,
109:E821–830). These ubiquitous, and by inference primordial, proteins
and functional systems show affinity to and functional requirement for
K+, Zn2+, Mn2+, and phosphate. Thus,
protocells must have evolved in habitats with a high
K+/Na+ ratio and relatively high concentrations of Zn,
Mn and phosphorous compounds. Geochemical reconstruction shows that the ionic
composition conducive to the origin of cells could not have existed in marine
settings but is compatible with emissions of vapor-dominated zones of inland
geothermal systems. Under anoxic, CO2-dominated atmosphere, the ionic
composition of pools of cool, condensed vapor at anoxic geothermal fields would
resemble the internal milieu of modern cells. Such pools would be lined with
porous silicate minerals mixed with metal sulfides and enriched in K+
ions and phosphorous compounds.
Here we address some questions that have appeared in print after the
publication of our anoxic geothermal field scenario. We argue that anoxic
geothermal fields, which were identified as likely cradles of life by using a
top-down approach and phylogenomics analysis as a tool, could provide
geochemical conditions similar to those which were suggested as most conducive
for the emergence of life by the chemists who pursuit the complementary
Any scenario of the transition from chemistry to biology should include an “energy module” because life can exist only when supported by energy flow(s). We addressed the problem of primordial energetics by combining physico-chemical considerations with phylogenomic analysis. We propose that the first replicators could use abiotically formed, exceptionally photostable activated nucleotides both as building blocks and as the main energy source. Nucleoside triphosphates could replace cyclic nucleotides as the principal energy-rich compounds at the stage of the first cells, presumably because the metal chelates of nucleoside triphosphates penetrated membranes much better than the respective metal complexes of nucleoside monophosphates. The ability to exploit natural energy flows for biogenic production of energy-rich molecules could evolve only gradually, after the emergence of sophisticated enzymes and ion-tight membranes. We argue that, in the course of evolution, sodium-dependent membrane energetics preceded the proton-based energetics which evolved independently in bacteria and archaea.
Over the last five years proteogenomics (using mass spectroscopy to identify proteins predicted from genomic sequences) has emerged as a promising approach to the high-throughput identification of protein N-termini, which remains a problem in genome annotation. Comparison of the experimentally determined N-termini with those predicted by sequence analysis tools allows identification of the signal peptides and therefore conclusions on the cytoplasmic or extracytoplasmic (periplasmic or extracellular) localization of the respective proteins. We present here the results of a proteogenomic study of the signal peptides in Escherichia coli K-12 and compare its results with the available experimental data and predictions by such software tools as SignalP and Phobius. A single proteogenomics experiment recovered more than a third of all signal peptides that had been experimentally determined during the past three decades and confirmed at least 31additional signal peptides, mostlyin the known exported proteins, which had been previously predicted but not validated. The filtering of putative signal peptides for the peptide length and the presence of an eight-residue hydrophobic patch and a typical signal peptidase cleavage site proved sufficient to eliminate the false-positive hits. Surprisingly, the results of this proteogenomics study, as well as a re-analysis of the E. coli genome with the latest version of SignalP program, show that the fraction of proteins containing signal peptides is only about 10%, or half of previous estimates.
Twenty-five years have passed since the discovery of cyclic dimeric (3′→5′) GMP (cyclic di-GMP or c-di-GMP). From the relative obscurity of an allosteric activator of a bacterial cellulose synthase, c-di-GMP has emerged as one of the most common and important bacterial second messengers. Cyclic di-GMP has been shown to regulate biofilm formation, motility, virulence, the cell cycle, differentiation, and other processes. Most c-di-GMP-dependent signaling pathways control the ability of bacteria to interact with abiotic surfaces or with other bacterial and eukaryotic cells. Cyclic di-GMP plays key roles in lifestyle changes of many bacteria, including transition from the motile to the sessile state, which aids in the establishment of multicellular biofilm communities, and from the virulent state in acute infections to the less virulent but more resilient state characteristic of chronic infectious diseases. From a practical standpoint, modulating c-di-GMP signaling pathways in bacteria could represent a new way of controlling formation and dispersal of biofilms in medical and industrial settings. Cyclic di-GMP participates in interkingdom signaling. It is recognized by mammalian immune systems as a uniquely bacterial molecule and therefore is considered a promising vaccine adjuvant. The purpose of this review is not to overview the whole body of data in the burgeoning field of c-di-GMP-dependent signaling. Instead, we provide a historic perspective on the development of the field, emphasize common trends, and illustrate them with the best available examples. We also identify unresolved questions and highlight new directions in c-di-GMP research that will give us a deeper understanding of this truly universal bacterial second messenger.
The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of the transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI’s MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other one describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, which includes updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).
Comparative analysis of the complete genome sequences from a variety of poorly studied organisms aims at predicting ecological and behavioral properties of these organisms and help in characterizing their habitats. This task requires finding appropriate descriptors that could be correlated with the core traits of each system and would allow meaningful comparisons. Using the relatively simple bacterial models, first attempts have been made to introduce suitable metrics to describe the complexity of organism’s signaling machinery, which included introducing the “bacterial IQ” score. Here, we use an updated census of prokaryotic signal transduction systems to improve this parameter and evaluate its consistency within selected bacterial phyla. We also introduce a more elaborate descriptor, a set of profiles of relative abundance of members of each family of signal transduction proteins encoded in each genome. We show that these family profiles are well conserved within each genus and are often consistent within families of bacteria. Thus, they reflect evolutionary relationships between organisms as well as individual adaptations of each organism to its specific ecological niche.
comparative genomics; evolution; protein phosphorylation; receptor; Mycobacterium; Shewanella
Planctomycetes, Verrucomicrobia and Chlamydia are prokaryotic phyla that are sometimes grouped together as the PVC superphylum of eubacteria. Some PVC species possess interesting attributes, in particular, internal membranes that superficially resemble eukaryotic endomembranes. Some biologists now claim that PVC bacteria are nucleus-bearing prokaryotes and that they are evolutionary intermediates in the transition from prokaryote to eukaryote. PVC prokaryotes do not possess a nucleus and are not intermediates in the prokaryote-to-eukaryote transition. All of the PVC traits that are currently cited as evidence for aspiring eukaryoticity are either analogous (the result of convergent evolution), not homologous, to eukaryotic traits; or else they are the result of lateral gene transfers. Here we summarize the evidence that shows why most of the purported similarities between the PVC bacteria and eukaryotes are analogous and the rest are consequence of lateral gene acquisition.
Experimental data exists for only a vanishingly small fraction of sequenced microbial genes. This community page discusses the progress made by the COMBREX project to address this important issue using both computational and experimental resources.
The availability of genome sequences from a variety of organisms presents an opportunity to apply this sequence information to solving the key problems of molecular biology. One of the principal roadblocks on this path is the lack of appropriate descriptors and metrics that could succinctly represent the new knowledge stemming from the genomic data. Several new metrics have recently been used in comparative genome analysis, yet challenges remain in finding an appropriate language for the emerging discipline of systems biology.
Sentra (), a database of signal transduction proteins encoded in completely sequenced prokaryotic genomes, has been updated to reflect recent advances in understanding signal transduction events on a whole-genome scale. Sentra consists of two principal components, a manually curated list of signal transduction proteins in 202 completely sequenced prokaryotic genomes and an automatically generated listing of predicted signaling proteins in 235 sequenced genomes that are awaiting manual curation. In addition to two-component histidine kinases and response regulators, the database now lists manually curated Ser/Thr/Tyr protein kinases and protein phosphatases, as well as adenylate and diguanylate cyclases and c-di-GMP phosphodiesterases, as defined in several recent reviews. All entries in Sentra are extensively annotated with relevant information from public databases (e.g. UniProt, KEGG, PDB and NCBI). Sentra's infrastructure was redesigned to support interactive cross-genome comparisons of signal transduction capabilities of prokaryotic organisms from a taxonomic and phenotypic perspective and in the framework of signal transduction pathways from KEGG. Sentra leverages the PUMA2 system to support interactive analysis and annotation of signal transduction proteins by the users.
The 20th annual Database Issue of Nucleic Acids Research includes 176 articles, half of which describe new online molecular biology databases and the other half provide updates on the databases previously featured in NAR and other journals. This year’s highlights include two databases of DNA repeat elements; several databases of transcriptional factors and transcriptional factor-binding sites; databases on various aspects of protein structure and protein–protein interactions; databases for metagenomic and rRNA sequence analysis; and four databases specifically dedicated to Escherichia coli. The increased emphasis on using the genome data to improve human health is reflected in the development of the databases of genomic structural variation (NCBI’s dbVar and EBI’s DGVa), the NIH Genetic Testing Registry and several other databases centered on the genetic basis of human disease, potential drugs, their targets and the mechanisms of protein–ligand binding. Two new databases present genomic and RNAseq data for monkeys, providing wealth of data on our closest relatives for comparative genomics purposes. The NAR online Molecular Biology Database Collection, available at http://www.oxfordjournals.org/nar/database/a/, has been updated and currently lists 1512 online databases. The full content of the Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).
The 19th annual Database Issue of Nucleic Acids Research features descriptions of 92 new online databases covering various areas of molecular biology and 100 papers describing recent updates to the databases previously described in NAR and other journals. The highlights of this issue include, among others, a description of neXtProt, a knowledgebase on human proteins; a detailed explanation of the principles behind the NCBI Taxonomy Database; NCBI and EBI papers on the recently launched BioSample databases that store sample information for a variety of database resources; descriptions of the recent developments in the Gene Ontology and UniProt Gene Ontology Annotation projects; updates on Pfam, SMART and InterPro domain databases; update papers on KEGG and TAIR, two universally acclaimed databases that face an uncertain future; and a separate section with 10 wiki-based databases, introduced in an accompanying editorial. The NAR online Molecular Biology Database Collection, available at http://www.oxfordjournals.org/nar/database/a/, has been updated and now lists 1380 databases. Brief machine-readable descriptions of the databases featured in this issue, according to the BioDBcore standards, will be provided at the http://biosharing.org/biodbcore web site. The full content of the Database Issue is freely available online on the Nucleic Acids Research web site (http://nar.oxfordjournals.org/).
Comparative analysis of the sequences of enzymes encoded in a variety of prokaryotic and eukaryotic genomes reveals convergence and divergence at several levels. Functional convergence can be inferred when structurally distinct and hence non-homologous enzymes show the ability to catalyze the same biochemical reaction. In contrast, as a result of functional diversification, many structurally similar enzyme molecules act on substantially distinct substrates and catalyze diverse biochemical reactions. Here, we present updates on the ATP-grasp, alkaline phosphatase, cupin, HD hydrolase, and N-terminal nucleophile (Ntn) hydrolase enzyme superfamilies and discuss the patterns of sequence and structural conservation and diversity within these superfamilies. Typically, enzymes within a superfamily possess common sequence motifs and key active site residues, as well as (predicted) reaction mechanisms. These observations suggest that the strained conformation (the entatic state) of the active site, which is responsible for the substrate binding and formation of the transition complex, tends to be conserved within enzyme superfamilies. The subsequent fate of the transition complex is not necessarily conserved and depends on the details of the structures of the enzyme and the substrate. This variability of reaction outcomes limits the ability of sequence analysis to predict the exact enzymatic activities of newly sequenced gene products. Nevertheless, sequence-based (super)family assignments and generic functional predictions, even if imprecise, provide valuable leads for experimental studies and remain the best approach to the functional annotation of uncharacterized proteins from new genomes.
Enzyme Catalysis; Enzyme Mechanisms; Enzyme Structure; Evolution; Phosphodiesterases; Convergence; Divergence
Cyclic diguanylate (c-di-GMP) is a ubiquitous second messenger regulating diverse cellular functions including motility, biofilm formation, cell cycle progression and virulence in bacteria. In the cell, degradation of c-di-GMP is catalyzed by highly specific EAL domain phosphodiesterases whose catalytic mechanism is still unclear. Here, we purified 13 EAL domain proteins from various organisms and demonstrated that their catalytic activity is associated with the presence of 10 conserved EAL domain residues. The crystal structure of the TDB1265 EAL domain was determined in a free state (1.8 Å) and in complex with c-di-GMP (2.35 Å) and unveiled the role of the conserved residues in substrate binding and catalysis. The structure revealed the presence of two metal ions directly coordinated by six conserved residues, two oxygens of the c-di-GMP phosphate, and potential catalytic water molecule. Our results support a two-metal-ion catalytic mechanism of c-di-GMP hydrolysis by EAL domain phosphodiesterases.
EAL domain; cyclic di-GMP; phosphodiesterase; X-ray crystallography; Thiobacillus denitrificans
Binding of calcium ions (Ca2+) to proteins can have profound effects on their structure and function. Common roles of calcium binding include structure stabilization and regulation of activity. It is known that diverse families – EF-hands being one of at least twelve – use a Dx[DN]xDG linear motif to bind calcium in near-identical fashion. Here, four novel structural contexts for the motif are described. Existing experimental data for one of them, a thermophilic archaeal subtilisin, demonstrate for the first time a role for Dx[DN]xDG-bound calcium in protein folding. An integrin-like embedding of the motif in the blade of a β-propeller fold – here named the calcium blade – is discovered in structures of bacterial and fungal proteins. Furthermore, sensitive database searches suggest a common origin for the calcium blade in β-propeller structures of different sizes and a pan-kingdom distribution of these proteins. Factors favouring the multiple convergent evolution of the motif appear to include its general Asp-richness, the regular spacing of the Asp residues and the fact that change of Asp into Gly and vice versa can occur though a single nucleotide change. Among the known structural contexts for the Dx[DN]xDG motif, only the calcium blade and the EF-hand are currently found intracellularly in large numbers, perhaps because the higher extracellular concentration of Ca2+ allows for easier fixing of newly evolved motifs that have acquired useful functions. The analysis presented here will inform ongoing efforts toward prediction of similar calcium-binding motifs from sequence information alone.
Response regulators (RRs) within two-component signal transduction systems control a variety of cellular processes. Most RRs contain DNA-binding output domains and serve as transcriptional regulators. Other RR types contain RNA-binding, ligand-binding, protein-binding or transporter output domains and exert regulation at the transcriptional, post-transcriptional or post-translational levels. In a significant fraction of RRs, output domains are enzymes that themselves participate in signal transduction: methylesterases, adenylate or diguanylate cyclases, c-di-GMP-specific phosphodiesterases, histidine kinases, serine/threonine protein kinases and protein phosphatases. In addition, there remain output domains whose functions are still unknown. Patterns of the distribution of various RR families are generally conserved within key microbial lineages and can be used to trace adaptations of various species to their unique ecological niches.
protein domains; transcriptional regulation; protein phosphorylation; signal transduction; genome annotation; protein structure
The rapidly accumulating genome sequence data allow researchers to address fundamental biological questions that were not even asked just a few years ago. A major problem in genomics is the widening gap between the rapid progress in genome sequencing and the comparatively slow progress in the functional characterization of sequenced genomes. Here we discuss two key questions of genome biology: whether we need more genomes, and how deep is our understanding of biology based on genomic analysis. We argue that overly specific annotations of gene functions are often less useful than the more generic, but also more robust, functional assignments based on protein family classification. We also discuss problems in understanding the functions of the remaining “conserved hypothetical” genes.
The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources; and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.
The current 18th Database Issue of Nucleic Acids Research features descriptions of 96 new and 83 updated online databases covering various areas of molecular biology. It includes two editorials, one that discusses COMBREX, a new exciting project aimed at figuring out the functions of the ‘conserved hypothetical’ proteins, and one concerning BioDBcore, a proposed description of the ‘minimal information about a biological database’. Papers from the members of the International Nucleotide Sequence Database collaboration (INSDC) describe each of the participating databases, DDBJ, ENA and GenBank, principles of data exchange within the collaboration, and the recently established Sequence Read Archive. A testament to the longevity of databases, this issue includes updates on the RNA modification database, Definition of Secondary Structure of Proteins (DSSP) and Homology-derived Secondary Structure of Proteins (HSSP) databases, which have not been featured here in >12 years. There is also a block of papers describing recent progress in protein structure databases, such as Protein DataBank (PDB), PDB in Europe (PDBe), CATH, SUPERFAMILY and others, as well as databases on protein structure modeling, protein–protein interactions and the organization of inter-protein contact sites. Other highlights include updates of the popular gene expression databases, GEO and ArrayExpress, several cancer gene databases and a detailed description of the UK PubMed Central project. The Nucleic Acids Research online Database Collection, available at: http://www.oxfordjournals.org/nar/database/a/, now lists 1330 carefully selected molecular biology databases. The full content of the Database Issue is freely available online at the Nucleic Acids Research web site (http://nar.oxfordjournals.org/).
The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.
An analysis of the distribution of the Na+-translocating ATPases/ATP synthases among microbial genomes identified an atypical form of the F1Fo-type ATPase that is present in the archaea Methanosarcina barkeri and M.acetivorans, in a number of phylogenetically diverse marine and halotolerant bacteria and in pathogens Burkholderia spp. In complete genomes, representatives of this form (referred to here as N-ATPase) are always present as second copies, in addition to the typical proton-translocating ATP synthases. The N-ATPase is encoded by a highly conserved atpDCQRBEFAG operon and its subunits cluster separately from the equivalent subunits of the typical F-type ATPases. N-ATPase c subunits carry a full set of sodium-binding residues, indicating that most of these enzymes are Na+-translocating ATPases that likely confer on their hosts the ability to extrude Na+ ions. Other distinctive properties of the N-ATPase operons include the absence of the delta subunit from its cytoplasmic sector and the presence of two additional membrane subunits, AtpQ (formerly gene 1) and AtpR (formerly gene X). We argue that N-ATPases are an early-diverging branch of membrane ATPases that, similarly to the eukaryotic V-type ATPases, do not synthesize ATP.
Contact: email@example.com; firstname.lastname@example.org
Supplementary information: Supplementary data are available at Bioinformatics online.