Proteins of the β-propeller fold are ubiquitous in nature and widely used as structural scaffolds for ligand binding and enzymatic activity. This fold comprises between four and twelve four-stranded β-meanders, the so-called blades, which are arranged circularly around a central funnel-shaped pore. Despite the large size range of β-propellers, their blades frequently show sequence similarity indicative of a common ancestry, and it has been proposed that the majority of β-propellers arose divergently by amplification and diversification of an ancestral blade. Given the structural versatility of β-propellers and the hypothesis that the first folded proteins evolved from a simpler set of peptides, we investigated whether this blade may have given rise to other folds as well. Using sequence comparisons, we identified proteins of four other folds as potential homologs of β-propellers: the luminal domain of inositol-requiring enzyme 1 (IRE1-LD), type II β-prisms, β-pinwheels, and WW domains. Because, with increasing evolutionary distance and decreasing sequence length, the statistical significance of sequence comparisons becomes progressively harder to distinguish from the background of convergent similarities, we complemented our analyses with a new method that evaluates possible homology based on the correlation between sequence and structure similarity. Our results indicate a homologous relationship of IRE1-LD and type II β-prisms with β-propellers, and an analogous one for β-pinwheels and WW domains. Whereas IRE1-LD most likely originated by fold-changing mutations from a fully formed PQQ motif β-propeller, type II β-prisms originated by amplification and differentiation of a single blade, possibly also of the PQQ type. We conclude that both β-propellers and type II β-prisms arose by independent amplification of a blade-sized fragment, which represents a remnant of an ancient peptide world.
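The idea of relating sequence similarity to structure similarity can be illustrated with a plain Pearson correlation over paired scores. This is only a minimal sketch: the abstract does not specify which similarity measures or which correlation statistic the method uses, so the function name `pearson` and the choice of Pearson's coefficient are assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores.

    xs: hypothetical pairwise sequence-similarity scores
    ys: hypothetical pairwise structure-similarity scores for the same pairs
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

A coefficient near 1 would mean that pairs with more similar sequences also have more similar structures. How such a value is turned into a statement about homology is not described in the abstract and is not attempted here.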
The role of the disaccharide trehalose, its biosynthesis pathways and their regulation in Archaea are still ambiguous. In Thermoproteus tenax a fused trehalose-6-phosphate synthase/phosphatase (TPSP), consisting of an N-terminal trehalose-6-phosphate synthase (TPS) and a C-terminal trehalose-6-phosphate phosphatase (TPP) domain, was identified. The tpsp gene is organized in an operon with a putative glycosyltransferase (GT) and a putative mechanosensitive channel (MSC). The T. tenax TPSP exhibits high phosphatase activity, but requires activation by the co-expressed GT for bifunctional synthase-phosphatase activity. The GT-mediated activation of TPS activity relies on the fusion of both the TPS and TPP domains in the TPSP enzyme. Activation is mediated by complex formation in vivo, as indicated by yeast two-hybrid and crude extract analysis. In combination with first evidence for MSC activity, the results suggest a sophisticated stress response involving TPSP, GT and MSC in T. tenax and probably in other Thermoproteales species. The monophyletic prokaryotic TPSP proteins likely originated via a single fusion event in the Bacteroidetes with subsequent horizontal gene transfers (HGT) to other Bacteria and Archaea. Furthermore, evidence for the origin of eukaryotic TPSP fusions via HGT from prokaryotes and therefore a monophyletic origin of eukaryotic and prokaryotic fused TPSPs is presented. This is the first report of a prokaryotic, archaeal trehalose synthase complex exhibiting a much simpler composition than the eukaryotic complex described in yeast. Thus, complex formation and a complex-associated regulatory potential might represent a more general feature of trehalose-synthesizing proteins.
Background: The AAA ATPases of the PAN/Rpt1–6 group regulate access of substrates to the 20S proteasome.
Results: Two groups of AAA proteins, CDC48 and AMA, function as novel proteasomal ATPases in archaea.
Conclusion: This network of regulatory ATPases increases the capacity of proteasomal protein degradation in archaea.
Significance: Diversification at the level of the regulatory ATPase provides a contrast to the fully differentiated 26S proteasome of eukaryotes.
The proteasome is the central machinery for targeted protein degradation in archaea, Actinobacteria, and eukaryotes. In its basic form, it consists of a regulatory ATPase complex and a proteolytic core particle. The interaction between the two is governed by an HbYX motif (where Hb is a hydrophobic residue, Y is tyrosine, and X is any amino acid) at the C terminus of the ATPase subunits, which stimulates gate opening of the proteasomal α-subunits. In archaea, the proteasome-interacting motif is not only found in canonical proteasome-activating nucleotidases of the PAN/ARC/Rpt group, which are absent in major archaeal lineages, but also in proteins of the CDC48/p97/VAT and AMA groups, suggesting a regulatory network of proteasomal ATPases. Indeed, Thermoplasma acidophilum, which lacks PAN, encodes one CDC48 protein that interacts with the 20S proteasome and activates the degradation of model substrates. In contrast, Methanosarcina mazei contains seven AAA proteins, five of which, both PAN proteins, two out of three CDC48 proteins, and the AMA protein, function as proteasomal gatekeepers. The prevalent presence of multiple, distinct proteasomal ATPases in archaea thus results in a network of regulatory ATPases that may widen the substrate spectrum of proteasomal protein degradation.
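The HbYX definition given above (hydrophobic residue, tyrosine, any residue, at the C terminus) translates directly into a check on the last three residues of a sequence. The following is a minimal sketch; the set of residues counted as hydrophobic and the function name `has_hbyx_motif` are assumptions made for illustration, not taken from the study.

```python
# Assumption: which residues count as "hydrophobic" is a choice made for this sketch.
HYDROPHOBIC = set("AVLIMFWC")
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def has_hbyx_motif(sequence, hydrophobic=HYDROPHOBIC):
    """Return True if the last three residues match Hb-Y-X."""
    seq = sequence.strip().upper()
    if len(seq) < 3:
        return False
    hb, tyr, last = seq[-3], seq[-2], seq[-1]
    return hb in hydrophobic and tyr == "Y" and last in AMINO_ACIDS
```

For example, `has_hbyx_motif("MKTAALYR")` checks the C-terminal tripeptide `LYR`. A positive match only flags a candidate; whether a protein actually interacts with the 20S proteasome has to be shown experimentally, as the abstract describes.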
Archaea; ATP-dependent Protease; ATPases; Proteasome; Protein Degradation
Exonuclease VII (ExoVII) is a bacterial nuclease involved in DNA repair and recombination that hydrolyses single-stranded DNA. ExoVII is composed of two subunits: large XseA and small XseB. Thus far, little was known about the molecular structure of ExoVII, the interactions between XseA and XseB, the architecture of the nuclease active site or its mechanism of action. We used bioinformatics methods to predict the structure of XseA, which revealed four domains: an N-terminal OB-fold domain, a middle putatively catalytic domain, a coiled-coil domain and a short C-terminal segment. By a series of deletion and site-directed mutagenesis experiments on XseA from Escherichia coli, we determined that the OB-fold domain is responsible for DNA binding, the coiled-coil domain is involved in binding multiple copies of the XseB subunit and residues D155, R205, H238 and D241 of the middle domain are important for the catalytic activity but not for DNA binding. Altogether, we propose a model of sequence–structure–function relationships in ExoVII.
Models of early protein evolution posit the existence of short peptides that bound metals and ions and served as transporters, membranes or catalysts. The Cys-X-X-Cys-X-X-Cys heptapeptide located within bacterial ferredoxins, enclosing an Fe4S4 metal center, is an attractive candidate for such an early peptide. Ferredoxins are ancient proteins and the simple α+β fold is found alone or as a domain in larger proteins throughout all three kingdoms of life. Previous analyses of the heptapeptide conformation in experimentally determined ferredoxin structures revealed a pervasive right-handed topology, despite the fact that the Fe4S4 cluster is achiral. Conformational enumeration of a model CGGCGGC heptapeptide bound to a cubane iron-sulfur cluster indicates both left-handed and right-handed folds could exist and have comparable stabilities. However, only the natural ferredoxin topology provides a significant network of backbone-to-cluster hydrogen bonds that would stabilize the metal-peptide complex. The optimal peptide configuration (alternating αL,αR) is that of an α-sheet, providing an additional mechanism where oligomerization could stabilize the peptide and facilitate iron-sulfur cluster binding.
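The Cys-X-X-Cys-X-X-Cys pattern can be located in a sequence with a simple regular expression. This is a minimal sketch of such a scan, not the conformational enumeration described above; the function name `find_cxxcxxc` is hypothetical. A lookahead is used so that overlapping occurrences are also reported.

```python
import re

# Lookahead so that overlapping matches are found; X is any uppercase letter here.
CXXCXXC = re.compile(r"(?=(C[A-Z]{2}C[A-Z]{2}C))")


def find_cxxcxxc(sequence):
    """Return (0-based start, matched heptapeptide) for every C-X-X-C-X-X-C site."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CXXCXXC.finditer(seq)]
```

Applied to the model peptide flanked by alanines, `find_cxxcxxc("AACGGCGGCAA")` returns `[(2, "CGGCGGC")]`.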
The ferredoxin fold is one of the oldest structures capable of catalyzing electron transfer reactions. In nature, only a right-handed topology exists in the ferredoxin fold. To understand how a specific fold-handedness was selected, we analyzed the structural motif using the tools of de novo protein design, searching in an unbiased fashion for backbone geometries that can favorably interact with the tetrahedral iron-sulfur cluster. In silico, we found that both left-handed and right-handed folds can be formed; however, the right-handed folds provide up to six hydrogen bonds that can stabilize the reduced iron-sulfur cluster, whereas left-handed folds form at most three hydrogen bonds. The difference in electrostatic conformational energy may have influenced selection of topology early in the evolution of iron-sulfur cluster-containing proteins. This observation led us to establish a fundamental protein design principle: only right-handed peptide folds can properly interact with the cluster while maintaining redox function. Our results provide guidance in the creation of artificial proteins capable of carrying out redox reactions.
The linkage of isoprenoid and aromatic moieties, catalyzed by aromatic prenyltransferases (PTases), leads to an impressive diversity of primary and secondary metabolites, including important pharmaceuticals and toxins. A few years ago, a hydroxynaphthalene PTase, NphB, featuring a novel ten-stranded β-barrel fold was identified in Streptomyces sp. strain CL190. This fold, termed the PT-barrel, is formed of five tandem ααββ structural repeats and remained exclusive to the NphB family until its recent discovery in the DMATS family of indole PTases. Members of these two families exist only in fungi and bacteria, and all of them appear to catalyze the prenylation of aromatic substrates involved in secondary metabolism. Sequence comparisons using PSI-BLAST do not yield matches between these two families, suggesting that they may have converged upon the same fold independently. However, we now provide evidence for a common ancestry for the NphB and DMATS families of PTases. We also identify sequence repeats that coincide with the structural repeats in proteins belonging to these two families. Therefore we propose that the PT-barrel arose by amplification of an ancestral ααββ module. In view of their homology and their similarities in structure and function, we propose to group the NphB and DMATS families together into a single superfamily, the PT-barrel superfamily.
Mitochondria must take up some phospholipids from the endoplasmic reticulum (ER) for the biogenesis of their membranes. They convert one of these lipids, phosphatidylserine, to phosphatidylethanolamine, which can be re-exported via the ER to all other cellular membranes. The mechanisms underlying these exchanges between ER and mitochondria are poorly understood. Recently, a complex termed ER–mitochondria encounter structure (ERMES) was shown to be necessary for phospholipid exchange in budding yeast. However, it is unclear whether this complex is merely an inter-organelle tether or also the transporter. ERMES consists of four proteins: Mdm10, Mdm34 (Mmm2), Mdm12 and Mmm1, three of which contain the uncharacterized SMP domain common to a number of eukaryotic membrane-associated proteins. Here, we show that the SMP domain belongs to the TULIP superfamily of lipid/hydrophobic ligand-binding domains comprising members of known structure. This relationship suggests that the SMP domains of the ERMES complex mediate lipid exchange between ER and mitochondria.
Supplementary information: Supplementary data are available at Bioinformatics online.
Toxin-antitoxin systems consist of a stable toxin, frequently with endonuclease activity, and a small, labile antitoxin, which sequesters the toxin into an inactive complex. Under unfavorable conditions, the antitoxin is degraded, leading to activation of the toxin and resulting in growth arrest, possibly also in bacterial programmed cell death. Correspondingly, these systems are generally viewed as agents of the stress response in prokaryotes. Here we show that prlF and yhaV encode a novel toxin-antitoxin system in Escherichia coli. YhaV, a ribonuclease of the RelE superfamily, causes reversible bacteriostasis that is counteracted by PrlF, a swapped-hairpin transcription factor homologous to MazE. The two proteins form a tight, hexameric complex, which binds with high specificity to a conserved sequence in the promoter region of the prlF-yhaV operon. As homologs of MazE and RelE, respectively, PrlF and YhaV provide an evolutionary connection between the two best-characterized toxin-antitoxin systems in E. coli, mazEF and relEB.
mRNA decay; RelE superfamily ribonuclease; stress response; swapped-hairpin barrel; toxin-antitoxin system
Outer membrane proteins (OMPs) are the transmembrane proteins found in the outer membranes of Gram-negative bacteria, mitochondria and plastids. Most prediction methods have focused on analogous features, such as alternating hydrophobicity patterns. Here, we start from the observation that almost all β-barrel OMPs are related by common ancestry. We identify proteins as OMPs by detecting their homologous relationships to known OMPs using sequence similarity. Given an input sequence, HHomp builds a profile hidden Markov model (HMM) and compares it with an OMP database by pairwise HMM comparison, integrating OMP predictions by PROFtmb. A crucial ingredient is the OMP database, which contains profile HMMs for over 20 000 putative OMP sequences. These were collected with the exhaustive, transitive homology detection method HHsenser, starting from 23 representative OMPs in the PDB database. In a benchmark on TransportDB, HHomp detects 63.5% of the true positives before including the first false positive. This is 70% more than PROFtmb, four times more than BOMP and 10 times more than TMB-Hunt. In Escherichia coli, HHomp identifies 57 out of 59 known OMPs and correctly assigns them to their functional subgroups. HHomp can be accessed at http://toolkit.tuebingen.mpg.de/hhomp.
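The benchmark figure quoted above, the share of true positives detected before the first false positive, can be computed from a ranked hit list as follows. This is a minimal sketch; the function name and the input format (a list of booleans ordered from best to worst score) are assumptions for illustration and do not reproduce the actual benchmark code.

```python
def sensitivity_before_first_fp(ranked_labels, total_positives=None):
    """Fraction of all true positives ranked above the first false positive.

    ranked_labels: booleans ordered from best to worst score,
                   True for a true positive, False for a false positive.
    total_positives: number of true positives in the benchmark; defaults to
                     the number of True entries in ranked_labels.
    """
    labels = list(ranked_labels)
    if total_positives is None:
        total_positives = sum(1 for label in labels if label)
    if total_positives <= 0:
        raise ValueError("benchmark must contain at least one true positive")
    hits = 0
    for is_true in labels:
        if not is_true:
            break  # stop at the first false positive
        hits += 1
    return hits / total_positives
```

With the ranking `[True, True, False, True]`, two of the three true positives precede the first false positive, so the function returns 2/3.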
Yersinia enterocolitica is an enteric pathogen that exploits diverse means to survive in the human host. Upon Y. enterocolitica entry into the human host, bacteria sense and respond to a variety of signals, one of which is temperature. Temperature in particular has a profound impact on Y. enterocolitica gene expression, as most of its virulence factors are expressed exclusively at 37°C. These include two outer membrane proteins, YadA and Ail, that function as adhesins and complement resistance (CR) factors. Both YadA and Ail bind the functionally active complement alternative pathway regulator factor H (FH). In this study, we characterized regions on both proteins involved in CR and the interaction with FH. Twenty-eight mutants having short (7 to 41 amino acids) internal deletions within the neck and stalk of YadA and two complement-sensitive site-directed Ail mutants were constructed to map the CR and FH binding regions of YadA and Ail. Functional analysis of the YadA mutants revealed that the stalk of YadA is required for both CR and FH binding and that FH appears to target several conformational and discontinuous sites of the YadA stalk. On the other hand, the complement-sensitive Ail mutants were not affected in FH binding. Our results also suggested that Ail- and YadA-mediated CR does not depend solely on FH binding.
Trimeric autotransporter adhesins (TAAs) are a major class of proteins by which pathogenic proteobacteria adhere to their hosts. Prominent examples include Yersinia YadA, Haemophilus Hia and Hsf, Moraxella UspA1 and A2, and Neisseria NadA. TAAs also occur in symbiotic and environmental species and presumably represent a general solution to the problem of adhesion in proteobacteria. The general structure of TAAs follows a head-stalk-anchor architecture, where the heads are the primary mediators of attachment and autoagglutination. In the major adhesin of Bartonella henselae, BadA, the head consists of three domains, the N-terminal of which shows strong sequence similarity to the head of Yersinia YadA. The two other domains were not recognizably similar to any protein of known structure. We therefore determined their crystal structure to a resolution of 1.1 Å. Both domains are β-prisms, the N-terminal one formed by interleaved, five-stranded β-meanders parallel to the trimer axis and the C-terminal one by five-stranded β-meanders orthogonal to the axis. Despite the absence of statistically significant sequence similarity, the two domains are structurally similar to domains from Haemophilus Hia, albeit in permuted order. Thus, the BadA head appears to be a chimera of domains seen in two other TAAs, YadA and Hia, highlighting the combinatorial evolutionary strategy taken by pathogens.
The ability to adhere is an important aspect of the interaction between bacteria and their environment. Adhesion allows them to aggregate into colonies, form biofilms with other species, and colonize surfaces. Where the surfaces are provided by other organisms, adhesion can lead to a wide range of outcomes, from symbiosis to pathogenicity. In Proteobacteria, colonization of the host depends on a wide range of adhesive surface molecules, among which Trimeric Autotransporter Adhesins (TAAs) represent a major class. In electron micrographs, TAAs resemble lollipops projecting from the bacterial surface, and in all investigated cases, the adhesive properties reside in their heads. We have determined the head structure of BadA, the major adhesin of Bartonella henselae. This pathogen causes cat scratch disease in humans, but can lead to much more severe disease in immunosuppressed patients, e.g., during chemotherapy or after HIV infection. Surprisingly, domains previously seen in other TAA heads are combined in a novel assembly, illustrating how pathogens rearrange available building blocks to create new adhesive surface molecules.
Motivation: Trimeric autotransporter adhesins (TAAs), such as Yersinia YadA, Neisseria NadA, Moraxella UspAs, Haemophilus Hia and Bartonella BadA, are important pathogenicity factors of proteobacteria. Their high sequence diversity and distinct mosaic-like structure lead to difficulties in the annotation of their sequences. These stem from the large number of short repeats, the presence of compositionally unusual coiled-coils, fuzzy domain boundaries and regions of seemingly low sequence complexity.
Results: We have developed a workflow, named daTAA, for the accurate domain annotation of TAAs. Its core consists of manually curated alignments and of knowledge-based rules that enhance assignments made by sequence similarity. Compared to general domain annotation servers such as PFAM, daTAA captures more domains and provides more sensitive domain detection, as well as integrated and detailed coiled-coil assignments.
Availability: The daTAA server is freely accessible at http://toolkit.tuebingen.mpg.de/dataa
Supplementary information: Supplementary data are available at Bioinformatics online.
The outer membrane protein OmpW from E. coli was overexpressed in inclusion bodies and refolded with the help of detergent. The protein has been crystallized and the crystals diffract to 3.5 Å resolution.
OmpW is an eight-stranded 21 kDa molecular-weight β-barrel protein from the outer membrane of Gram-negative bacteria. It is a major antigen in bacterial infections and has implications in antibiotic resistance and in the oxidative degradation of organic compounds. OmpW from Escherichia coli was cloned and the protein was expressed in inclusion bodies. A method for refolding and purification was developed which yields properly folded protein according to circular-dichroism measurements. The protein has been crystallized and crystals were obtained that diffracted to a resolution limit of 3.5 Å. The crystals belong to space group P422, with unit-cell parameters a = 122.5, c = 105.7 Å. A homology model of OmpW is presented based on known structures of eight-stranded β-barrels, intended for use in molecular-replacement trials.
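The reported cell parameters allow a quick estimate of the unit-cell volume and, from it, the Matthews coefficient. The sketch below assumes a tetragonal cell (volume a²c) and the eight symmetry operators of space group P422; the number of molecules per asymmetric unit is not stated in the abstract, so `n_per_asu` is a free, hypothetical parameter.

```python
def tetragonal_cell_volume(a, c):
    """Volume of a tetragonal unit cell in cubic angstroms (a = b, all angles 90 degrees)."""
    return a * a * c


def matthews_coefficient(cell_volume, mw_da, n_symmetry_ops, n_per_asu):
    """Matthews coefficient V_M in cubic angstroms per dalton.

    n_symmetry_ops: number of asymmetric units per cell (8 for P422).
    n_per_asu: molecules per asymmetric unit (an assumption, not from the abstract).
    """
    return cell_volume / (mw_da * n_symmetry_ops * n_per_asu)


volume = tetragonal_cell_volume(122.5, 105.7)  # about 1.586e6 cubic angstroms
```

Calling `matthews_coefficient(volume, 21000, 8, n)` for different values of `n` shows how the estimated packing density depends on the assumed number of molecules per asymmetric unit.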
OmpW; membrane proteins; outer membrane; homology modelling
The Yersinia adhesin A (YadA) is a trimeric autotransporter adhesin of enteric yersiniae. It consists of three major domains: a head mediating adherence to host cells, a stalk involved in serum resistance, and an anchor that forms a membrane pore and is responsible for the autotransport function. The anchor contains a glycine residue, nearly invariant throughout trimeric autotransporter adhesins, that faces the pore lumen. To address the role of this glycine, we replaced it with polar amino acids of increasing side chain size and expressed wild-type and mutant YadA in Escherichia coli. The mutations did not impair the YadA-mediated adhesion to collagen and to host cells or the host cell cytokine production, but they decreased the expression levels and stability of YadA trimers with increasing side chain size. Likewise, autoagglutination and resistance to serum were decreased in these mutants. We found that the periplasmic protease DegP is involved in the degradation of YadA and that in an E. coli degP deletion strain, mutant versions of YadA were expressed almost to wild-type levels. We conclude that the conserved glycine residue affects both the export and the stability of YadA and consequently some of its putative functions in pathogenesis.
Histones organize the genomic DNA of eukaryotes into chromatin. The four core histone subunits consist of two consecutive helix-strand-helix motifs and are interleaved into heterodimers with a unique fold. We have searched for the evolutionary origin of this fold using sequence and structure comparisons, based on the hypothesis that folded proteins evolved by combination of an ancestral set of peptides, the antecedent domain segments.
Our results suggest that an antecedent domain segment, corresponding to one helix-strand-helix motif, gave rise divergently to the N-terminal substrate recognition domain of Clp/Hsp100 proteins and to the helical part of the extended ATPase domain found in AAA+ proteins. The histone fold arose subsequently from the latter through a 3D domain-swapping event. To our knowledge, this is the first example of a genetically fixed 3D domain swap that led to the emergence of a protein family with novel properties, establishing domain swapping as a mechanism for protein evolution.
The helix-strand-helix motif common to these three folds provides support for our theory of an 'ancient peptide world' by demonstrating how an ancestral fragment can give rise to three different folds.
Solenoid repeat proteins of the Tetratrico Peptide Repeat (TPR) family are involved as scaffolds in a broad range of protein-protein interactions. Several resources are available for the prediction of TPRs; however, they often fail to detect divergent repeat units.
We have developed TPRpred, a profile-based method which uses a P-value-dependent score offset to include divergent repeat units and which exploits the tendency of repeats to occur in tandem. TPRpred detects not only TPR-like repeats, but also the related Pentatrico Peptide Repeats (PPRs) and SEL1-like repeats. The corresponding profiles were generated through iterative searches, by varying the threshold parameters for inclusion of repeat units into the profiles, and the best profiles were selected based on their performance on proteins of known structure. We benchmarked the performance of TPRpred in detecting TPR-containing proteins and in delineating the individual repeats therein, against currently available resources.
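The tendency of repeats to occur in tandem can be illustrated by grouping candidate repeat start positions into arrays whose members are spaced roughly one repeat length apart. This is only a sketch of that one idea: it does not reproduce TPRpred's profile scoring or its P-value-dependent score offset, and the function name, the 34-residue unit length used as default, and the `tolerance` parameter are assumptions for illustration.

```python
def group_tandem_repeats(starts, unit_length=34, tolerance=3):
    """Group repeat start positions into tandem arrays.

    A hit joins the current array if it starts within `tolerance` residues of
    where the next unit would be expected (previous start + unit_length).
    """
    arrays = []
    for start in sorted(starts):
        if arrays and abs(start - (arrays[-1][-1] + unit_length)) <= tolerance:
            arrays[-1].append(start)
        else:
            arrays.append([start])
    return arrays
```

For candidate starts `[10, 44, 78, 200]`, the first three form one tandem array and the last stands alone. A scoring scheme could then treat a weak hit adjacent to confident ones more leniently than an isolated weak hit.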
TPRpred performs significantly better in detecting divergent repeats in TPR-containing proteins, and finds more individual repeats than the existing methods. The web server is available at , and the C++ and Perl sources of TPRpred along with the profiles can be downloaded from .
The MPI Bioinformatics Toolkit is an interactive web service which offers access to a great variety of public and in-house bioinformatics tools. They are grouped into different sections that support sequence searches, multiple alignment, secondary and tertiary structure prediction and classification. Several public tools are offered in customized versions that extend their functionality. For example, PSI-BLAST can be run against regularly updated standard databases, customized user databases or selectable sets of genomes. Another tool, Quick2D, integrates the results of various secondary structure, transmembrane and disorder prediction programs into one view. The Toolkit provides a friendly and intuitive user interface with an online help facility. As a key feature, various tools are interconnected so that the results of one tool can be forwarded to other tools. One could run PSI-BLAST, parse out a multiple alignment of selected hits and send the results to a cluster analysis tool. The Toolkit framework and the tools developed in-house will be packaged and freely available under the GNU Lesser General Public Licence (LGPL). The Toolkit can be accessed at .
HHsenser is the first server to offer exhaustive intermediate profile searches, which it combines with pairwise comparison of hidden Markov models. Starting from a single protein sequence or a multiple alignment, it can iteratively explore whole superfamilies, producing few or no false positives. The output is a multiple alignment of all detected homologs. HHsenser's sensitivity should make it a useful tool for evolutionary studies. It may also aid applications that rely on diverse multiple sequence alignments as input, such as homology-based structure and function prediction, or the determination of functional residues by conservation scoring and functional subtyping.
HHsenser can be accessed at . It has also been integrated into our structure and function prediction server HHpred () to improve predictions for near-singleton sequences.
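The principle of an exhaustive, transitive search, in which every newly found homolog becomes a query in turn, can be sketched as a breadth-first traversal. This is a simplified illustration, not HHsenser's implementation: the callable `find_hits`, which stands in for a single profile search, and the `max_rounds` parameter are hypothetical.

```python
from collections import deque


def transitive_search(seed, find_hits, max_rounds=None):
    """Collect everything reachable from `seed` through repeated searches.

    find_hits: callable taking a sequence identifier and returning the
               identifiers accepted as homologs in one search (hypothetical).
    max_rounds: optional limit on how many search steps away from the seed
                a sequence may be.
    """
    found = {seed}
    queue = deque([(seed, 0)])
    while queue:
        query, depth = queue.popleft()
        if max_rounds is not None and depth >= max_rounds:
            continue
        for hit in find_hits(query):
            if hit not in found:
                found.add(hit)
                queue.append((hit, depth + 1))
    return found
```

In such a scheme a single false positive would be propagated by all later rounds, which is why strict acceptance criteria at each step matter for keeping the final alignment clean.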
HHpred is a fast server for remote protein homology detection and structure prediction and is the first to implement pairwise comparison of profile hidden Markov models (HMMs). It allows searching a wide choice of databases, such as the PDB, SCOP, Pfam, SMART, COGs and CDD. It accepts a single query sequence or a multiple alignment as input. Within only a few minutes it returns the search results in a user-friendly format similar to that of PSI-BLAST. Search options include local or global alignment and scoring secondary structure similarity. HHpred can produce pairwise query-template alignments, multiple alignments of the query with a set of templates selected from the search results, as well as 3D structural models that are calculated by the MODELLER software from these alignments. A detailed help facility is available. As a demonstration, we analyze the sequence of SpoVT, a transcriptional regulator from Bacillus subtilis. HHpred can be accessed at .
REPPER (REPeats and their PERiodicities) is an integrated server that detects and analyzes regions with short gapless repeats in protein sequences or alignments. It finds periodicities by Fourier Transform (FTwin) and internal similarity analysis (REPwin). FTwin assigns numerical values to amino acids that reflect certain properties, for instance hydrophobicity, and gives information on corresponding periodicities. REPwin uses self-alignments and displays repeats that reveal significant internal similarities. Both programs use a sliding window to ensure that different periodic regions within the same protein are detected independently. FTwin and REPwin are complemented by secondary structure prediction (PSIPRED) and coiled coil prediction (COILS), making the server a versatile analysis tool for sequences of fibrous proteins. REPPER is available at .
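A sliding-window Fourier analysis of a residue property, in the spirit of what is described for FTwin, might look like the sketch below. It is not the server's code: the use of the Kyte-Doolittle hydropathy scale, the mean subtraction, the normalisation, and the function names are assumptions for illustration.

```python
import cmath

# Kyte-Doolittle hydropathy values (one possible property scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def periodicity_power(values, period):
    """Fourier power of a mean-centred numeric series at the given period."""
    n = len(values)
    mean = sum(values) / n
    total = sum(
        (v - mean) * cmath.exp(-2j * cmath.pi * k / period)
        for k, v in enumerate(values)
    )
    return abs(total) ** 2 / n


def scan_periodicity(sequence, window, periods, scale=KYTE_DOOLITTLE):
    """For each window start, report the candidate period with the highest power."""
    values = [scale[aa] for aa in sequence.upper()]  # KeyError on non-standard residues
    results = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        best = max(periods, key=lambda p: periodicity_power(chunk, p))
        results.append((start, best))
    return results
```

Because each window is analysed separately, a protein with, say, a heptad-based segment followed by a segment of different periodicity would yield different best periods in different windows rather than one averaged spectrum.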
Bartonella henselae causes vasculoproliferative disorders in humans. We identified a nonfimbrial adhesin of B. henselae designated as Bartonella adhesin A (BadA). BadA is a 340-kD outer membrane protein encoded by the 9.3-kb badA gene. It has a modular structure and contains domains homologous to the Yersinia enterocolitica nonfimbrial adhesin (Yersinia adhesin A). Expression of BadA was restored in a BadA-deficient transposon mutant by complementation in trans. BadA mediates the binding of B. henselae to extracellular matrix proteins and to endothelial cells, possibly via β1 integrins, but prevents phagocytosis. Expression of BadA is crucial for activation of hypoxia-inducible factor 1 in host cells by B. henselae and secretion of proangiogenic cytokines (e.g., vascular endothelial growth factor). BadA is immunodominant in B. henselae–infected patients and rodents, indicating that it is expressed during Bartonella infections. Our results suggest that BadA, the largest characterized bacterial protein thus far, is a major pathogenicity factor of B. henselae with a potential role in the induction of vasculoproliferative disorders.
pilus; endothelial cells; HIF-1; VEGF; angiogenesis
Phylogenetic reconstruction is the method of choice to determine the homologous relationships between sequences. Difficulties in producing high-quality alignments, which are the basis of good trees, and in automating the analysis of trees have unfortunately limited the use of phylogenetic reconstruction methods to individual genes or gene families. Due to the large number of sequences involved, phylogenetic analyses of proteomes preclude manual steps and therefore require a high degree of automation in sequence selection, alignment, phylogenetic inference and analysis of the resulting set of trees. We present a set of programs that automates the steps from seed sequence to phylogeny and a utility to extract all phylogenies that match specific topological constraints from a database of trees. Two example applications that show the type of questions that can be answered by phylome analysis are provided. The generation and analysis of the Thermoplasma acidophilum phylome with regard to lateral gene transfer between Thermoplasmata and Sulfolobus showed best BLAST hits to be far less reliable indicators of lateral transfer than the corresponding protein phylogenies. The generation and analysis of the Danio rerio phylome provided more than twice as many proteins as described previously, supporting the hypothesis of an additional round of genome duplication in the actinopterygian lineage.
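Extracting trees that satisfy a topological constraint can be illustrated with the simplest such constraint: a given set of taxa forms an exclusive clade in a rooted tree. The sketch below represents trees as nested tuples of leaf names. The representation, the function names, and the taxon labels in the usage note are assumptions for illustration and do not reflect the actual utility or its tree format.

```python
def leaves(tree):
    """Set of leaf names below a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return frozenset([tree])
    result = frozenset()
    for child in tree:
        result |= leaves(child)
    return result


def has_clade(tree, taxa):
    """True if some node of the rooted tree has exactly `taxa` as its leaves."""
    target = frozenset(taxa)
    if leaves(tree) == target:
        return True
    if isinstance(tree, str):
        return False
    return any(has_clade(child, target) for child in tree)


def filter_trees(trees, taxa):
    """Names of all trees in a {name: tree} mapping that contain the clade."""
    return [name for name, tree in trees.items() if has_clade(tree, taxa)]
```

For instance, with the hypothetical labels `Tac`, `Sso`, `Mja` and `Afu`, the tree `(("Tac", "Sso"), ("Mja", "Afu"))` contains the clade `{"Tac", "Sso"}`, whereas `(("Tac", "Mja"), ("Sso", "Afu"))` does not. Screening a whole phylome then amounts to applying `filter_trees` to every tree in the database.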
A comparative genomic approach was used to identify Helicobacter pylori 26695 open reading frames (ORFs) which are conserved in H. pylori J99 but highly diverged in other eubacteria. A survey of selected pathways of central intermediary metabolism was also carried out, and genes with a potentially selective role in H. pylori were identified. Forty-five ORFs identified in these two analyses were screened using a rapid vector-free allelic replacement mutagenesis technique, and 33 were shown to be essential in vitro. Notably, 13 ORFs gave essentiality results which are unexpected in view of their known or proposed functions, and phylogenetic analysis was used to investigate the annotation of 7 such ORFs which are highly diverged. We propose that the products of a number of these H. pylori-specific essential genes may be suitable targets for novel anti-H. pylori therapies.