In this article, we describe the development of the first publicly available program that predicts, with good accuracy, the integral β-barrel OM proteins in a collection of polypeptide sequences from Gram-negative bacteria. The development of a reliable program to perform this task has previously proven to be a bottleneck in the area of TM protein prediction (3). The most common way to identify integral β-barrel proteins from predicted proteomes has so far been the use of annotation information in addition to PSORT I (34). PSORT I, with a precision of 65.3% and a recall of 54.5% in the prediction of all types of OMPs, was recently replaced by a new and improved version, PSORT B, with a reported recall of 90.3% and precision of 98.8%. PSORT B does not, however, separate the integral β-barrel proteins from the lipoproteins. When examining all the PSORT B-predicted OMPs from E.coli and six other precomputed genomes (Helicobacter pylori J99, Haemophilus influenzae, Fusobacterium nucleatum, E.coli O157:H7 Sakai and Xanthomonas campestris), we found that all the predicted OMPs were recognized by the PSORT B BLAST module. No additional sequences without known homologues were predicted by the other program modules. This indicates that PSORT B probably has little chance of identifying novel OMPs that lack already-known homologues. At least three other programs for integral β-barrel prediction have been developed over the last couple of years: Hunter (9), the β-barrel finder (11) and a simple algorithm developed by Wimley (10). Hunter is mainly based on signal sequence prediction and a predictor of topography to recognize all-β-membrane proteins, whereas the β-barrel finder is based on secondary structure predictions together with hydropathy and amphipathicity information. Wimley developed a simple algorithm to calculate the β-barrel score of sequences based on the relative abundance of amino acids in the TM β-strands of 15 different integral β-barrel proteins with known crystal structures (10). Unfortunately, none of these programs has been available for performance testing, and Hunter is the only one to report its accuracy, with a recall of 82.4% and a precision of 90.3% for the prediction of well-annotated integral β-barrel proteins in E.coli. This corresponds to a slightly poorer recall, but a higher precision, than BOMP. Unlike Hunter, BOMP is not based on signal sequence prediction. This gives BOMP an advantage when predicting integral β-barrel proteins from translated open reading frames, because such reading frames are sometimes assigned the wrong start site, which can make signal sequence prediction difficult.
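The precision and recall figures quoted above follow the standard definitions: precision is the fraction of predicted proteins that are true integral β-barrel proteins, and recall is the fraction of known integral β-barrel proteins that are recovered. The following minimal Python sketch shows how the two values can be computed from a set of predictions and a reference set. The function name `precision_recall` and the protein identifiers are hypothetical and serve only as an illustration; they are not taken from BOMP, Hunter or PSORT.

```python
def precision_recall(predicted, known):
    """Return (precision, recall) for predicted IDs against a reference set.

    precision = true positives / all predicted
    recall    = true positives / all known positives
    """
    predicted, known = set(predicted), set(known)
    true_pos = len(predicted & known)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(known) if known else 0.0
    return precision, recall


# Hypothetical identifiers, for illustration only (not real data).
known_barrels = {"P1", "P2", "P3", "P4", "P5"}    # annotated β-barrel proteins
predicted_barrels = {"P1", "P2", "P3", "Q9"}      # output of some predictor

precision, recall = precision_recall(predicted_barrels, known_barrels)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
# prints: precision = 75.0%, recall = 60.0%
```

In this toy case three of the four predictions are correct (precision 75%), while three of the five known proteins are found (recall 60%), which illustrates how one program can have the higher precision and another the higher recall.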
From the discussion above, it is clear that BOMP closes a gap in the collection of currently available prediction tools for TM proteins. The program provides fast and reliable information for the experimental analysis of β-barrel OMPs. When a predicted proteome is analysed with BOMP, the resulting overview of the predicted integral β-barrel OM subproteome provides important information on how to approach the experimental proteomic work, and will speed up the experimental analysis of integral β-barrel proteins in the laboratory. Owing to the good prediction accuracy, several polypeptide sequences previously annotated as hypothetical can now be given a likely localization, which narrows down the possible function(s) of these proteins. An overview of the predicted integral β-barrel subproteome also reduces the number of proteins that need to be selected for experimental investigation when searching for vaccine candidates in pathogenic bacteria. Finally, BOMP makes it possible to compare the predicted integral β-barrel subproteomes of two different strains of the same bacterium, in order to find differences that might explain the pathogenicity of one of the strains.