Bacillus anthracis proteins that possess antigenic properties and are able to evoke an immune response were identified by a reductive genomic-serologic screen of a set of in silico-preselected open reading frames (ORFs). The screen included in vitro expression of the selected ORFs by coupled transcription and translation of linear PCR-generated DNA fragments, followed by immunoprecipitation with antisera from B. anthracis-infected animals. Of the 197 selected ORFs, 161 were chromosomal and 36 were on plasmids pXO1 and pXO2, and 138 of the 197 ORFs had putative functional annotations (known ORFs) and 59 had no assigned functions (unknown ORFs). A total of 129 of the known ORFs (93%) could be expressed, whereas only 38 (64%) of the unknown ORFs were successfully expressed. All 167 expressed polypeptides were subjected to immunoprecipitation with the anti-B. anthracis antisera, which revealed 52 seroreactive immunogens, only 1 of which was encoded by an unknown ORF. The high percentage of seroreactive ORFs among the functionally annotated ORFs (37%; 51/129) attests to the predictive value of the bioinformatic strategy used for vaccine candidate selection. Furthermore, the experimental findings suggest that surface-anchored proteins and adhesins or transporters, such as cell wall hydrolases, proteins involved in iron acquisition, and amino acid and oligopeptide transporters, have great potential to be immunogenic. Most of the seroreactive ORFs that were tested as DNA vaccines indeed appeared to induce a humoral response in mice. We list more than 30 novel B. anthracis immunoreactive virulence-related proteins which could be useful in diagnosis, pathogenesis studies, and future anthrax vaccine development.
Anthrax is a severe and often fatal disease that is caused by the gram-positive spore-forming bacterium Bacillus anthracis. B. anthracis virulence is attributed mainly to two key elements, a tripartite toxin complex and a capsule (49). Exclusion of either one of these constituents results in significant attenuation of virulence (62). Three genes (pagA, cya, and lef) coding for the toxin complex (protective antigen [PA], edema factor [EF], and lethal factor [LF]) are located on a native plasmid, pXO1. PA is the target cell binding determinant and translocation factor of both effector moieties, EF is an adenylate cyclase which increases intracellular cAMP levels, and LF is a zinc protease which cleaves mitogen-activated protein kinase kinases (47). The genes coding for synthesis of the poly-γ-d-glutamic acid antiphagocytic capsule are located on a second native plasmid, pXO2. Although the toxin complex and the capsule are considered the major virulence factors, other proteins encoded by the virulence plasmids and the chromosome probably work in concert to enable survival and growth of B. anthracis in its host (12, 15, 63). Recently, it was demonstrated that a chromosome-encoded Mn2+-binding protein, a component of an ABC transporter, is an example of an essential B. anthracis virulence determinant (37).
Licensed anthrax human vaccines are based on purified B. anthracis-derived culture supernatant (32). PA is the major constituent of these formulations. Although the vaccines are efficient and safe for human use, there have been various attempts to formulate alternative more potent anthrax vaccines in recent years. The common rationale behind these novel approaches is complementation of the established protective value of the pXO1-encoded PA by additional anthrax-derived factors. Accordingly, newly identified B. anthracis antigens, encoded by genes located either on the chromosome or on the virulence plasmids, may be additive ingredients for PA-based vaccines that could result in efficacious products which require a less demanding vaccination regimen (13, 16, 25, 42, 43, 50, 57, 70, 77, 81).
The availability of genome sequences of human pathogens has radically changed the ability to develop improved and novel vaccines by increasing the speed of target identification. Antigen discovery by targeted computational screening of the complete repertoire of proteins potentially encoded by a pathogen is an approach termed “reverse vaccinology” (1, 27). The specific classes of proteins selected by in silico analysis include mostly surface-exposed and/or exported proteins with putative involvement in virulence. Selected genes are usually subsequently cloned and expressed in bacterial systems. The corresponding purified proteins are used to immunize mice, and their protective abilities are assessed. Some examples of this genomic technology used for identification of potential vaccine candidates are the studies performed with Neisseria meningitidis (72), Streptococcus pneumoniae (92), Porphyromonas gingivalis (79), Chlamydia pneumoniae (30, 64), and group B Streptococcus (54). Other complementary large-scale screening approaches, including DNA microarray, proteomics, and comparative genome-proteome technologies, have been successfully used for selection of candidates or for development of live attenuated vaccines for several important human pathogens (31).
The availability of the DNA sequence of the B. anthracis chromosome (75), together with the previously documented sequences of the two virulence plasmids (67, 69), allowed in silico analysis of the complete B. anthracis genome, including the chromosome (11) and plasmid pXO1 (10), in a search for putative vaccine candidates and/or virulence-related factors. This analysis resulted in identification of more than 500 potential candidate open reading frame (ORF) products.
Here we describe development and application of a rapid and efficient functional large-scale genomic screen of these vaccine candidates. Representative bioinformatically preselected ORFs were expressed in vitro from linear PCR amplicons in a cell-free system, which eliminated the need for cloning or expression in bacterial systems. The corresponding protein products were tested for immunoreactivity with a series of antisera produced against live B. anthracis strains. Finally, some of the immunoreactive ORFs were analyzed by animal immunization to determine their abilities to elicit a humoral response, using a DNA vaccine-based technique.
Most of the potential antigens discovered in this analysis are novel B. anthracis immunogens. The implications of the results of this screening strategy are discussed below both in a general context and with regard to their relevance for development of a future anthrax vaccine.
The computational analyses were described previously in detail (11). The ORFs studied here were originally selected from an in-house annotated draft version of the B. anthracis strain Ames chromosome sequence (February 2001 draft version; 460 contigs; The Institute for Genomic Research, Rockville, MD). After publication of the B. anthracis Ames ancestor 0581 complete genome sequence (accession no. NC_007530; NCBI) and because of updates of sequence and domain databases, ORF identity, the presence of anchoring signals, domain assignment (Pfam, SMART, and CDD databases), sequence complexity and repeats, and the extent of structural disorder (RADAR program EBI; SMART database DisEmble module) were reassessed for the ORFs studied here. The recent availability of the genome sequences of phenotypically distinct Bacillus cereus family members (B. cereus 10987, B. cereus E33L, B. cereus G9241, and Bacillus thuringiensis konkukian) allowed reevaluation of the uniqueness of selected B. anthracis ORFs. Based on this analysis, some of the hypothetical ORF products with unknown functions were found to be not unique to B. anthracis. In this paper, ORF products are referred to by the B. anthracis Ames ancestor locus tag numbers. The GenBank accession numbers of the pXO1 and pXO2 sequences are AF065404 and AF188935, respectively.
The list of hypothetical ORFs consists of 45 products. Following a database update, all ORF products were recently reevaluated with the complete genomes of B. anthracis strains.
The source of genomic DNA was B. anthracis strain Vollum (pXO1+ pXO2+) (3, 10). A DNA preparation which contained chromosomal and plasmid DNA was used as the template for the PCRs. Oligonucleotides were constructed using an Applied Biosystems 392 DNA-RNA synthesizer. All of the 5′ primers included a common sequence coding for T7 promoter, the Kozak sequence, and a start codon (underlined) upstream of the specific sequence for amplification of the gene selected (N4GAATTCTAATACGACTCACTATAGGTACCACCATGN18-24). The primers also contained unique restriction sites for EcoRI and KpnI (indicated by boldface type). All of the 3′ primers contained a stop codon (underlined) and restriction sites for BglII, NotI, and SmaI (N2AGATCTTGCGGCCGCCCGGGTTAN18-24). The restriction sites are compatible with cloning into a eukaryotic expression vector suitable for DNA vaccination (see below). The Expand High Fidelity system (for genes up to 2,000 bp long) or the Expand Long PCR system (for longer genes), both obtained from Roche Molecular Biochemicals USA, were used. The resulting PCR amplicon fragments of selected B. anthracis ORFs included the full-length coding sequence, excluding only the 5′ leader sequences encoding secretion signal peptides when they were present. When necessary, extensive transmembrane (TM) segments, highly hydrophobic domains, or common antigenic domains were removed or the sizes were reduced by using internal PCR amplification primers as described previously (10).
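The primer layout described above can be sketched in a few lines of Python. This is an illustration only, not the authors' design tool: the two constant tails are copied from the text, while the function names, the fixed 20-nucleotide gene-specific length (the text allows 18 to 24), the literal `N` padding, and the assumption that the input is the region to be amplified without its native start codon, leader sequence, or stop codon are all hypothetical choices made for the example.

```python
# Constant tails quoted in the text, without the N4 / N2 padding.
# 5' tail: EcoRI site, T7 promoter, KpnI site, Kozak sequence ending in the ATG start codon.
FWD_TAIL = "GAATTCTAATACGACTCACTATAGGTACCACCATG"
# 3' tail: BglII, NotI and SmaI sites, then TTA (reverse complement of a TAA stop codon).
REV_TAIL = "AGATCTTGCGGCCGCCCGGGTTA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(region: str, specific_len: int = 20,
                   fwd_pad: str = "NNNN", rev_pad: str = "NN") -> tuple[str, str]:
    """Build a (forward, reverse) primer pair for one ORF region.

    Assumption: `region` is the sense-strand sequence to amplify, already
    trimmed of the native start codon / signal peptide and the stop codon.
    """
    if not 18 <= specific_len <= 24:
        raise ValueError("gene-specific part should be 18 to 24 nt long")
    region = region.upper()
    if len(region) < specific_len:
        raise ValueError("region is shorter than the gene-specific part")
    forward = fwd_pad + FWD_TAIL + region[:specific_len]
    reverse = rev_pad + REV_TAIL + reverse_complement(region[-specific_len:])
    return forward, reverse


# Toy sequence invented for the example (not a B. anthracis ORF).
toy_region = "GCT" * 5 + "AAAGGGCCCTTT" + "GAC" * 5
fwd, rev = design_primers(toy_region)
print(fwd)
print(rev)
```

Keeping the tails as constants means every amplicon carries the same T7 promoter and cloning sites, so only the gene-specific 3′ end of each primer changes from ORF to ORF.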
The linear PCR products were translated individually in vitro by using a coupled rabbit reticulocyte lysate transcription and translation (T&T) kit (TNT T7 Quick for PCR DNA; Promega, Madison, Wis.) with T7 RNA polymerase and [35S]methionine. The transcription and translation products were analyzed by sodium dodecyl sulfate (SDS)-polyacrylamide gel electrophoresis (PAGE) (8 to 15% polyacrylamide gels), followed by autoradiography.
The following sera were used for evaluation of the antigenic potential of selected ORF products (Table 1). Hyperimmune antisera R-1, R-2, R-3, and R-4 were obtained from rabbits. R-1 was collected following multiple injections of live B. anthracis Sterne (pXO1+ pXO2−) spores (45); the dose was gradually increased from 10⁵ to 10⁹ spores per animal. R-2 was obtained by three successive injections of 10⁶, 10⁸, and 5 × 10⁸ spores of a highly attenuated mutant strain (designated BA19) of B. anthracis Vollum (pXO1+ pXO2+) generated by partial deletion of the pagA gene (unpublished data). R-3 was collected following multiple injections of 10⁹ spores of the fully attenuated strain B. anthracis Δ14185 (pXO1− pXO2−) (3, 25) and was used as a negative control for evaluation of pXO1- and pXO2-derived ORF products. R-4 was obtained by exposing animals that were preimmunized with a limiting amount of PA to a lethal dose (10⁴ spores) of the Vollum strain and rechallenging the animals with the same dose, as described previously (90). Antisera G-1 and G-2 were obtained from guinea pigs. G-1 was obtained by exposing animals to a lethal challenge (10⁴ spores) of the Vollum strain, followed by fluoroquinolone treatment, as described previously (3), and rechallenging the animals with the same dose. G-2 was collected from animals that were vaccinated with 10⁷ spores of a highly attenuated mntA mutant of the Vollum strain, as described previously (37). A general negative control serum, NRG, was a mixture of naïve rabbit and guinea pig sera at a ratio of 1:1.
The sera were evaluated for the presence of antibodies against total vegetative bacterial antigens by an enzyme-linked immunosorbent assay (ELISA), using extracts of B. anthracis Δ14185 secreted and membrane proteins as the coating antigens (essentially as described by Aloni-Grinstein and coworkers ). Only sera (from individuals belonging to the same treatment group) that elicited high anti-vegetative (secreted and membrane) bacterial antigen titers were pooled and used as probing vehicles for the screen. Anti-PA and anti-LF antibodies were detected by ELISA as described previously (25, 36, 57) and were used to evaluate unimpaired expression of the pXO1-derived toxin genes during infection.
The presence of specific antibodies against selected ORF products was determined by immunoprecipitation (IP) of the T&T reaction mixtures with the anti-B. anthracis antisera. Individual 35S-radiolabeled T&T reaction mixtures (5-μl aliquots) were diluted (to obtain a final volume of 100 μl) in RIPA buffer (20 mM Tris-HCl [pH 7.5], 150 mM NaCl, 5 mM EDTA, 0.5% sodium deoxycholate, 1% Triton X-100, 0.1% SDS) and reacted with each serum (10 μl). Following 1 h of incubation at 37°C, antibodies were precipitated using protein A-Sepharose 4B Fast Flow beads (Sigma), and the immunoprecipitated [35S]methionine-labeled T&T products were detached from the washed beads by boiling in SDS-PAGE sample buffer and were analyzed by SDS-PAGE, along with the specific T&T reaction mixture.
To allow determination of the comparative immunoreactivities of the seropositive ORF products, a quantitative method was developed. The method was based on IP titration of analytical amounts of 35S-labeled T&T products (usually 1 to 2 μl of the reaction mixture) with serial dilutions of the desired antiserum (in a final volume of 150 μl, adjusted with RIPA buffer). The initial dilutions of the sera were 1:10, 1:30, and 1:50, and the sequential dilution used throughout the study was 1:3. The background control was a 1:10 dilution of the NRG serum (or R-3 for pXO1- and pXO2-derived ORFs) that was reacted with each T&T product. Following IP reactions as described above, the immunoprecipitated proteins were analyzed by SDS-PAGE radiography along with equivalent amounts of the T&T reaction product that was used in the IP protocol. In parallel, the immunoprecipitated labeled proteins were quantitated with a β-counter (1600 TR liquid scintillation analyzer; Packard). The final dilution that still allowed detection of the immunoprecipitated T&T polypeptide by SDS-PAGE or the dilution that exhibited a measurable level of radioactivity (expressed in counts per minute) which was at least three times the background radioactivity was considered the specific IP titer of the serum. In all cases, the titers determined by the two procedures were identical.
A DNA immunization procedure was used for evaluation of the immunogenic potential of the selected ORF products. Individual ORFs were cloned in the eukaryotic expression vector pCI (Promega), which carries the eukaryotic cytomegalovirus promoter, a recombinant chimeric intron, and the simian virus 40 polyadenylation signal for efficient expression in mammalian cells, in addition to the T7 promoter for in vitro T&T expression (35, 38).
The plasmid DNA used for gene gun immunization was prepared by an alkaline lysis method, followed by CsCl gradient centrifugation. The purified DNA preparations were solubilized in pyrogen-free water and kept frozen. The immunization protocols used were essentially the protocols described previously (38). For gene gun vaccination (Helios gene gun system; Bio-Rad), plasmid DNA was precipitated onto 1-μm-diameter gold particles at a ratio of 2 μg per mg of gold and loaded onto Gold-Coat tubing (Bio-Rad). Polyvinylpyrrolidone was used as an adhesive. Gene gun shots (0.5 μg DNA) were directed onto exposed abdominal dermis, and the protocol included three or four immunizations of ICR mice (6-week-old females; Charles River Laboratories, Margate, United Kingdom) at 2-week intervals. For serum collection mice were bled from the tail vein. The methods used to determine the immune response elicited in the mice following DNA immunization included quantitative IP titration as described above. Animals used for vaccination were handled in accordance with the National Research Council 1996 Guide for the Care and Use of Laboratory Animals and with protocols approved by the Animal Use Committee of the Israel Institute for Biological Research.
Selection of B. anthracis chromosome-derived ORFs as specific vaccine candidates was performed previously by a multistep computational screen of the B. anthracis Ames strain draft (February 2001 version) chromosome sequence (11). The selection procedure combined ORF determination, preliminary annotation, prediction of cellular localization, and taxonomic comparison with closely related genomes. The selection rationale was based on the following criteria: (i) putative surface exposure (anchored or secreted) of the ORF products (proteins amenable to interaction with the host immune system), (ii) lack of similarity to proteins present in nonpathogenic bacteria (removal of housekeeping genes), and (iii) putative function or sequence motifs similar to documented virulence factors of other pathogens. Together with additional filtering criteria (size and number of paralogs) and manual curation, this reductive strategy resulted in selection of 240 candidate ORFs encoding proteins with putative assigned functions and 280 hypothetical unique proteins with unknown functions having putative secretion signals and/or TM retention segments (11).
To reduce the number of candidate ORFs, we removed ORF products that are identical in B. cereus ATCC 14579, the only related genetically similar yet phenotypically distinct pathogen whose genome sequence was available at the time of the study. When there were several paralogous ORF products, a single paralog was chosen arbitrarily.
The final list consisted of 116 chromosome-derived candidate ORFs with putative functions and/or defined domains, including ORFs encoding nine S-layer homology (SLH) domain proteins (including the S-layer proteins Sap and EA1, which were used as positive controls for known chromosome-derived immunogens ), 18 adhesins, lipoproteins, or transporters, 21 repeat-containing proteins, 53 membrane or secreted enzymes, and 15 other surface-anchored proteins harboring motifs characteristic of microbial virulence factors. For hypothetical ORFs with unknown functions, we selected only ORFs encoding products that were larger than 150 amino acids and contained either signal peptide or TM segments.
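The filter applied to the ORFs with unknown functions can be expressed as a simple predicate. The sketch below is illustrative only: the record fields, the function name, and the toy entries are hypothetical, and the real screen relied on dedicated prediction tools and manual curation that are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Orf:
    locus: str                         # hypothetical identifier
    length_aa: int                     # predicted product length in amino acids
    has_signal_peptide: bool
    has_tm_segment: bool
    identical_in_b_cereus_14579: bool  # product identical in B. cereus ATCC 14579


def keep_unknown_function_orf(orf: Orf, min_length_aa: int = 150) -> bool:
    """Apply the criteria stated in the text for ORFs with unknown functions."""
    if orf.identical_in_b_cereus_14579:
        return False                   # not specific enough to keep
    if orf.length_aa <= min_length_aa:
        return False                   # product must be larger than 150 aa
    return orf.has_signal_peptide or orf.has_tm_segment


# Toy records invented for the example.
candidates = [
    Orf("ORF_A", 320, True, False, False),
    Orf("ORF_B", 120, True, True, False),    # too short
    Orf("ORF_C", 410, False, False, False),  # no localization signal
    Orf("ORF_D", 275, False, True, True),    # identical in B. cereus
]
selected = [orf.locus for orf in candidates if keep_unknown_function_orf(orf)]
print(selected)
```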
In a previous study of pXO1, 11 candidate ORF products with putatively assigned functions were evaluated using similar criteria (10). In the current study, 14 additional pXO1 ORF products with unknown functions that were unique at the time of selection (absent from the B. cereus 14579 genome) and eight pXO2-derived ORF products with putatively assigned functions were selected for further analysis. The three toxin component genes, pagA (pXO1-110), lef (pXO1-107), and cya (pXO1-122), were included in the study as positive controls.
In summary, an in vitro analysis was performed with 197 ORFs from the B. anthracis chromosome, as well as pXO1 and pXO2. The products of 133 of these ORFs had putative assigned functions, 59 products were hypothetical products with unknown functions, and 5 products were used as positive controls.
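The totals in this summary can be cross-checked against the group sizes given in the text. The snippet below is only a bookkeeping aid; the grouping labels are ours, and the figure of 45 chromosomal ORFs with unknown functions is the one stated elsewhere in the text.

```python
# Group sizes as stated in the text.
annotated = {
    "chromosome": 116 - 2,   # 116 selected, minus Sap and EA1 (positive controls)
    "pXO1": 11,              # evaluated in the previous pXO1 study
    "pXO2": 8,
}
unknown = {
    "chromosome": 45,
    "pXO1": 14,
}
positive_controls = {
    "chromosome": 2,         # Sap and EA1
    "pXO1": 3,               # pagA, lef and cya
}

n_annotated = sum(annotated.values())
n_unknown = sum(unknown.values())
n_controls = sum(positive_controls.values())
n_total = n_annotated + n_unknown + n_controls

n_chromosomal = (annotated["chromosome"] + unknown["chromosome"]
                 + positive_controls["chromosome"])
n_plasmid = n_total - n_chromosomal

print(n_annotated, n_unknown, n_controls, n_total)
print(n_chromosomal, n_plasmid)
```

The sums reproduce the 133, 59, and 5 ORFs quoted above, for a total of 197, of which 161 are chromosomal and 36 are plasmid borne.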
The in vitro screening procedure for analysis of selected ORF candidates consisted of three main steps: (i) generation of the bioinformatically selected ORFs as linear PCR expression cassettes, (ii) synthesis of the corresponding protein products individually from the linear PCR amplicons by in vitro coupled T&T in the presence of [35S]methionine, and (iii) monitoring the seroreactivity of the protein products by IP of the radioactively labeled T&T reaction mixtures with polyclonal antisera to B. anthracis (Table 1). A positive result was identified by development of a specific IP product, which was visualized as a discrete band at the expected molecular size. Figure 1A shows examples of positive and negative responses.
All sera except R-3 (see Materials and Methods) exhibited high titers against B. anthracis toxin components, as well as against vegetative bacterial extracts representing total membrane and secreted proteins (Table 1). For most of the in vitro T&T ORF products, R-1 serum exhibited higher IP titers than other sera; one example of this is the ORF BA2805 product and the S-layer protein Sap, as shown in Fig. 1B. When R-1 results were either ambiguous or negative, other sera were used; this is shown for pXO1-130, BA3189, and BA4787 in Fig. 1C. We found that only three of the ORF products tested reacted to the same extent both with the anti-B. anthracis sera and with the sera collected from naïve animals (NRG) (Table 1); obviously, these ORFs were not considered positive ORFs.
In the absence of purified protein antigens that could be used in a quantitative immunoassay, the seroreactivities of all ORF products tested were quantified by determining IP titers of the radiolabeled T&T products. The IP titer was calculated by determining the end point of IP titration using serial serum dilutions by (i) monitoring the amount of radioactivity in the IP reaction mixture and/or (ii) visualization of the IP products by SDS-PAGE, followed by autoradiography (see Materials and Methods). We compared the IP titers to the titers determined by a standard ELISA for the PA antigen (Table 1). As shown in Fig. 2A and Table 1 for antisera R-1 and G-1, the IP titers determined by both assays described above were in good agreement with those determined by ELISA (for R-1, 1:35,000, compared with 1:32,000 determined by the ELISA; for G-1, 1:300,000, compared with 1:500,000 determined by the ELISA).
Figure 2 shows examples of determinations of IP titers for ORFs with different levels of T&T expression for ORFs BA5330 and BA4766 (Fig. 2B) or DppA and HtrA (Fig. 2C). The analysis of PA, DppA, and HtrA also demonstrated that the electrophoretic data were in good agreement with the data obtained from direct counting of radioactivity in the IP reaction (Fig. 2A and C, lower panels).
Several parameters, including the ability to generate a T&T product, the reaction with the anti-B. anthracis antisera, and the extent of immunoreactivity, were considered when we interpreted the results of the screen. These parameters were used for evaluation of the three groups of bioinformatically selected ORFs, including (i) chromosomal ORFs with putative assigned functions, (ii) chromosomal unique ORFs with no assigned functions, and (iii) selected ORFs from the pXO1 and pXO2 plasmids.
Of the 116 chromosomal ORFs selected (for the complete list, see Table S1 in the supplemental material), 115 yielded PCR products. The only ORF that did not generate a PCR product was a repeat-rich ORF, BA4978. Of the 115 PCR amplicons, 109 could be used for T&T product generation (see Table S1 in the supplemental material). For five of the six ORFs that did not generate T&T products (BA1094, BA1222, BA1290, BA3725, and BA4764), the failure could be explained by the length and/or extent of putative disordered or unstructured regions in the sequence. No attempt was made to manipulate these six ORFs for further analysis.
All 109 T&T products were tested with the different sera, and 43 were found to react positively with anti-B. anthracis antisera. The seropositive chromosome-encoded ORF products are listed in Table 2 and have the following features.
The peptidoglycan (cell wall [CW]) of gram-positive bacteria is a docking site for proteins that interact with the environment (66). In pathogens, proteins that are retained in the CW envelope are important for bacterial attachment, invasion, interaction with host proteins, and virulence (18, 19, 89). Table 3 shows the number of ORFs in the genome of B. anthracis that have the various CW localization signals, the number of ORFs selected bioinformatically for screening, and the number of ORFs found to be seropositive. The distribution of these ORF products is described below according to their various anchoring mechanisms, domains, and motifs.
Of 130 B. anthracis putative lipoproteins encoded in the genome, 12 were selected in silico for analysis, and all were immunoreactive (Table 2). Ten represented ligand-binding components of ABC transporters.
The B. anthracis genome harbors genes encoding three sortases (classes A, B, and C) and 10 substrates with diverse sorting signals. Ten candidate ORFs encoding proteins with sortase-anchoring signals were preselected, and seven were seropositive (Table 2). We noted that only 3 of the 15 preselected ORFs encoding products harboring virulence-related motifs (see Table S1 in the supplemental material) were seropositive and that all three had a sortase-anchoring signal. Thus, it appears that there is a high probability that ORFs encoding products with a sorting signal are expressed and induce an immune response following B. anthracis infection.
The SLH domain, a frequent gram-positive anchoring domain for CW attachment (58), was one of the major selection criteria for candidate ORFs. Six of the nine chromosomal SLH proteins preselected for the screen were seropositive (Table 2).
The LysM domain is recognized in a variety of enzymes involved in CW metabolism and is assumed to be necessary both for the enzymatic functions of adjacent catalytic domains and for the peptidoglycan attachment domain (39, 86). The LysM domain was recognized in six ORFs in the B. anthracis genome; four were preselected for analysis, and only one was seropositive (BA3668).
The Escherichia coli NlpC/Listeria monocytogenes P60 domain occurs at the C terminus of a number of different bacterial and viral proteins. Several related, but distinct, catalytic activities, such as murein degradation, acyl transfer, and amide hydrolysis, occur in the NlpC/P60 superfamily (4, 19). The three ORFs encoding hypothetical proteins having an NlpC/P60 domain in the B. anthracis chromosome were analyzed, and all were seropositive (Table 2). Only one of these ORFs (BA5427) encodes a protein with a catalytic domain (endopeptidase LytE), while the other two (BA1952 and BA2849) encode proteins with undefined functions. BA1952 harbors an additional domain, SH3b (see below).
SH3b is the bacterial homologue of SH3 (Src homology domain), a eukaryotic motif exhibited by several proteins involved in signal transduction. SH3b has been found in a number of different bacterial proteins, including endopeptidases, bacteriocins, and signaling proteins involved in cell attachment, and it may represent a putative CW targeting domain (19, 53). Four of the 10 B. anthracis ORFs harboring SH3b domains in the chromosome were selected for the screen, and only one, BA1952, which also harbors an NlpC/P60 domain, was seropositive.
Like SH3, the PKD domain is involved in protein-protein or protein-carbohydrate interactions, which indicates possible CW exposure of the ORF products carrying it. It was first identified in PKD1, the polycystic kidney disease protein. The PKD motif occurs in the B. anthracis genome three times, all three times in ORFs encoding putative collagenases; the only ORF selected for analysis (BA3584) was seropositive.
Not all of the preselected ORF products had secretion signal peptides; however, the absence of a secretion signal does not necessarily imply that a protein is not surface exposed (8). In pathogens, secretion of some signal-less virulence factors was suggested to be mediated by alternative secretion pathways (48, 68), and their presence in the extracellular milieu does not necessarily reflect cell lysis.
Signal peptides were detected in 34 of the 43 seropositive ORF products (Table 2). Two of the nine positive ORFs without a signal sequence, BAS5207 (LPXTG) and BA3668 (LysM), encode proteins with anchoring motifs. The other seven ORF products that do not have a signal peptide (BA3841, BA4510, BA4989, BA0309, BA0485, BA1353, and BA2805) have some characteristics that may localize them to the surface. The BA3841, BA4510, and BA4989 products have some features common to anchorless adhesins, which were first described by Chhatwal (22) as a group of proteins from gram-positive pathogens which bind to extracellular matrix components like fibronectin and collagen. BA0309 encodes δ-1-pyrroline-5-carboxylase (RocA), which has been reported to be a component of the B. cereus exosporium (21) and was recently found to be an immunodominant component of the B. anthracis membrane proteome (23). BA0485 is a phage lambda (B. anthracis-specific) endolysin gene that is located adjacent to a holin gene, and BA2805 encodes another putative endolysin-like product; therefore, based on their function assignments both products are expected to be extracellular.
The presence of a repeated sequence domain or motif was also one of the criteria used for the in silico selection process (11). Although proteins containing repeats are more abundant in eukaryotes than in prokaryotes, they are frequently found in pathogenic bacteria and are related to adhesion, invasion, molecular mimicry of host proteins, or antigenic variation (6, 55, 76). Repeat proteins harbor either tandem repeat domains, such as ankyrin repeats, TPR-like repeats, collagen-like repeats, leucine-rich repeats, and diverse internal repeats, or repeat motifs consisting of single amino acids, periodically conserved amino acids, and oligopeptide repeats. Ten seropositive ORF products can be referred to as repeat-containing proteins (Table 2). The products of six ORFs (BA0552, BA1952, BA3841, BA4510, BA4787, and BA4989) belong to the original group of preselected repeat proteins (11). The products of the other four ORFs (BAS5205, BAS5207, BA3367, and BA4789) belong to different preselected groups. Two of the latter, the BAS5205 and BAS5207 products, are putative extracellular matrix-binding collagen adhesion proteins that have internal repeats. Similar collagen repeat-containing proteins, expressed on the surfaces of other gram-positive pathogens, are involved in attachment to host connective tissues (9, 83, 93, 94). We noted that in B. anthracis, a collagen-like glycoprotein is a structural component of the exosporium (88). The BA3367 ORF, encoding a sortase-anchored internal repeat protein, precedes a putative cobalt ion ABC transporter operon. The BA3367 product was recently shown to be a γ phage receptor (26). The BA4789 product contains a near transporter (NEAT) repeat motif that is present in proteins involved in iron acquisition (see Discussion). Overall, we found that 10 of 22 (45%) expressed repeat-containing proteins reacted with the anti-B. anthracis antisera.
Forty-five chromosomal ORFs to which no function could be assigned were selected for the screen (Table 4). Only 31 of these 45 ORFs (70%) yielded T&T products. This yield is significantly lower than that for the annotated ORFs (109/115). It is reasonable to assume that most of the nontranslatable ORFs do not code for real polypeptides. Indeed, when the B. anthracis chromosome sequence was completed (75), 6 of the 14 ORFs not amenable to T&T expression were no longer considered ORFs.
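The text does not say which statistical test, if any, underlies the comparison of the two expression yields. As one way a reader might check it, the sketch below applies a one-sided Fisher exact test to the counts quoted above (31 of 45 versus 109 of 115). The choice of test and the function name are our assumptions, not the authors' analysis.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X <= a), where X is the top-left cell under the hypergeometric
    null distribution with all row and column totals held fixed.
    """
    row1 = a + b            # size of the first group
    col1 = a + c            # total number of successes
    n = a + b + c + d
    # math.comb returns 0 when k > n, so impossible tables contribute nothing.
    numerator = sum(comb(col1, k) * comb(n - col1, row1 - k)
                    for k in range(0, a + 1))
    return numerator / comb(n, row1)


# Rows: unknown-function ORFs, annotated ORFs. Columns: expressed, not expressed.
p_value = fisher_one_sided(31, 45 - 31, 109, 115 - 109)
print(f"expression yield, unknown:   {31 / 45:.1%}")
print(f"expression yield, annotated: {109 / 115:.1%}")
print(f"one-sided Fisher exact p-value: {p_value:.2e}")
```

An exact test is used here rather than a chi-square approximation because one cell of the table (6 non-expressed annotated ORFs) is small.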
The most striking observation related to the group of ORFs with no assigned functions was that only 1 of the 31 T&T products, the BA3807 product, was seropositive and exhibited a detectable IP titer (Table (Table4).4). BA3807 is localized adjacent to a prophage locus and may thus have a phage-related role.
In a previous study (10), 12 pXO1-related ORFs were screened (including 11 novel ORFs and the PA gene as a positive control) (Table (Table5).5). For 10 ORFs, a T&T product was generated, and four products were found to be seropositive: PA, pXO1-54, pXO1-90, and pXO1-130. We included the following 16 additional pXO1 ORFs in our screen (Table (Table5):5): the toxin genes encoding LF and EF (as positive controls) and 14 ORFs encoding hypothetical short polypeptides with unknown functions. Only nine ORFs, the two toxin genes (Fig. (Fig.3A)3A) and seven ORFs with unknown functions, yielded T&T products (Fig. (Fig.3B).3B). This may imply that as was observed for the chromosomal genes with unknown functions, not all the predicted pXO1 ORFs represent true genes. Finally, except for the toxin components, none of the hypothetical ORF products reacted with the sera. Thus, only 6 of 19 pXO1 ORFs tested to date which generated T&T products were seropositive, including the three toxin components (Fig. (Fig.3C3C).
In view of the results obtained with pXO1, we decided to restrict the pXO2 analysis to ORFs encoding products with putative assigned functions. Two of the eight annotated pXO2 ORFs selected for analysis (Table 5), pXO2-8 and pXO2-42, reacted with the antisera (Fig. 3D). Both of these ORFs encode anchored CW hydrolases; the pXO2-8 product contains an NlpC/P60 motif, and the pXO2-42 product contains an SLH domain. The pXO2-42 product was described previously and was found to exhibit peptidoglycan hydrolase activity (59). It should be noted that the products of two of the three novel pXO1 positive ORFs that were identified (10) (Fig. 3C), pXO1-54 and pXO1-90, are also SLH-anchored proteins. This observation further supports the notion that SLH anchoring is a reliable parameter for prediction of immunoreactivity, as demonstrated previously for analysis of the chromosomal genes.
The IP titers obtained for the various ORF products are summarized in Tables 2, 4, and 5. For the chromosome-derived positive ORF products (Table 2), the highest scores (IP titers greater than 1,000) were detected with the following polypeptides: S-layer proteins EA1 and Sap (BA0887 and BA0885), two SLH amidases (BA0898 and BA3737), glycosyl hydrolase (BA2805), HtrA (BA3660), DppA (BA0656), a NEAT-containing protein probably involved in iron acquisition (BA4787), and a protein with an unknown function (containing SH3b and NlpC/P60 domains; BA1952). These results may be a reflection of the abundance of the proteins in vivo and/or their high immunogenicity.
As expected, for the plasmid-derived ORF products (Table 5) the highest immunoreactivity was observed with PA. It is well established that PA is both a very potent immunogen and a major protein expressed and secreted by vegetative B. anthracis cells during infection. Of the other plasmid-derived positive immunogens (Table 5), the pXO1-90 product was as immunoreactive as PA and was highly reactive with all the sera tested (Fig. 3C), the pXO1-130 product was also highly reactive but only with the R-2 antisera (Fig. 1C and 3C), and the other pXO-derived ORF products reacted differently with different sera but were all “weakly” seropositive protein products.
Once it was established that the selected ORF products could be expressed and could specifically react with relevant anti-B. anthracis antisera, the seropositive candidate ORFs were screened for the ability to elicit a humoral response. Linear PCR amplicons coding for positive immunogens were directly cloned into a compatible plasmid vector, which allowed expression of the bacterial genes in a mammalian host following DNA vaccination (38). A representative set (ca. 30%) of immunoreactive ORFs was selected for the analysis, and this set included antigens with both high and low IP titers. We also included two seronegative ORFs.
The IP titers of DNA-vaccinated mice are summarized in Table 6. For the two seronegative ORFs (pXO1-66 and pXO1-67), DNA immunization did not lead to detectable seroconversion. In contrast, almost all (10 of 12) of the seropositive ORFs selected for immunization were also capable of eliciting antibodies when they were administered individually as DNA vaccines, and all of the immunizations resulted in relatively high specific titers against the corresponding labeled T&T products. For the two seropositive ORFs that did not elicit a detectable immune response following DNA immunization (pXO1-90 and BA3807), a second round of DNA immunization was performed with full-length or truncated versions of the genes (data not shown). These manipulations did not result in any detectable positive humoral response.
Since DNA immunization did not provide real added value in down-selection of the seropositive ORFs for further vaccination studies, there was no reason to extend this type of analysis, except to generate specific antibodies.
Therefore, it appears that the approach used for down-selection described here allowed us to reduce the number of ORF candidates from 197 to 52 seropositive ORFs. Each of the novel candidate ORFs still has to be evaluated in an anthrax disease animal model (guinea pig or rabbit) to determine its efficacy.
In this paper we describe a functional genomic-serologic screen aimed at identification of immunogenic proteins of B. anthracis. In this screen we focused on bioinformatically selected ORFs representing distinct groups of proteins characterized by specific structural features, as well as functional features which could make these ORFs potential candidates for vaccine development (10, 11).
Genome-wide in silico prediction typically targets microbial genome-encoded proteins that have signals for extracellular exposure (either surface anchored or secreted). Subsequent analysis of vaccine candidates usually relies on cloning of the genes, followed by expression and purification of the corresponding polypeptides, which are evaluated in animal models for antibody production and protection. In the study of Pizza and coworkers (72), the first example of the application of genomic technology for identification of vaccine candidates, 570 ORFs were selected by in silico analysis of the Neisseria meningitidis genome; 350 of these ORFs were successfully cloned and expressed, and the purified proteins were used for generation of sera. For group A Streptococcus, P. gingivalis, and S. pneumoniae, 312 of 589, 50 of 120, and 108 of 130 selected ORFs were successfully expressed, respectively (54, 79, 92). The screening system described here is based on coupled transcription and translation of linear DNA expression elements to generate selected polypeptides in vitro (10, 35). The efficient use of PCR products as vehicles for in vitro expression should eliminate complications associated with bacterial propagation, such as toxicity, lethality, or instability. Furthermore, this method allows examination of large and hydrophobic proteins that are usually incompatible with expression in cell-dependent systems (64, 72).
The screen reported here was initiated with 197 B. anthracis ORFs (including the 5 controls represented by the classical toxin components PA, LF, and EF and the S-layer proteins Sap and EA1). The functionally annotated ORFs were overrepresented in the group of successfully expressed genes (129/138 [93%], compared with 38/59 [64%] of the ORFs encoding proteins with unknown functions). This bias in expression could be consistent with the assumption that the ORFs encoding proteins with unknown functions may not represent real genes. The overall high success ratio for expression of preselected annotated ORFs clearly demonstrates the advantage of the in vitro method over other methods.
The screen established the seroreactivity of 52 ORFs (31% of the ORFs expressed), yet again the distributions of the immunogenic proteins were different for the two groups; 40% (51/129) of the functionally annotated ORFs were positive, while only 1 of the 38 unique ORF products with unknown functions was found to be immunoreactive in the IP assay. In view of the clear bias in in vitro expression ability and in seroreactivity against hypothetical gene products, it is possible that in large screens similar to that described in this report it may be more practical to include only ORFs coding for proteins with putative biological activity.
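The expression and seroreactivity percentages discussed here follow directly from the ORF counts. The following is a minimal sketch of that arithmetic, using only counts stated in the text; the dictionary layout and the helper name `rates` are illustrative assumptions.

```python
# Counts quoted in the text for the two classes of preselected ORFs
counts = {
    "annotated": {"selected": 138, "expressed": 129, "seropositive": 51},
    "unknown": {"selected": 59, "expressed": 38, "seropositive": 1},
}


def rates(c):
    """Return expression and seroreactivity fractions for one ORF class."""
    return {
        "expression_rate": c["expressed"] / c["selected"],
        "seroreactive_rate": c["seropositive"] / c["expressed"],
    }


# Totals across both classes: selected -> expressed -> seropositive
totals = {
    stage: sum(c[stage] for c in counts.values())
    for stage in ("selected", "expressed", "seropositive")
}

for name, c in counts.items():
    r = rates(c)
    print(
        f"{name}: expressed {c['expressed']}/{c['selected']} "
        f"({r['expression_rate']:.0%}), seropositive "
        f"{c['seropositive']}/{c['expressed']} ({r['seroreactive_rate']:.0%})"
    )

overall = totals["seropositive"] / totals["expressed"]
print(f"overall: {totals['seropositive']}/{totals['expressed']} ({overall:.0%})")
```

Keeping the seroreactivity rate relative to expressed ORFs, rather than selected ORFs, separates the bias in expression from the bias in seroreactivity.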
The use of DNA immunization did not prove to be useful in down-selection of ORF candidates since close to 90% of the immunoreactive ORFs appeared to be immunogenic when this screening procedure was used (Table 6). Nevertheless, this method is experimentally convenient, and in spite of the limited information obtained from DNA immunization in the screening process, it may be very useful for priming and for obtaining specific reagents for future evaluation of potential vaccine candidates (37).
The distribution of seropositive proteins among the five initially selected groups of chromosomal ORF products with assigned functions (Table 2; see Table S1 in the supplemental material) is very useful for providing criteria for future in silico selection of vaccine candidates, as follows: (i) the SLH-containing protein and adhesin-lipoprotein-transporter groups have the highest predictive values for seroreactivity, 67% (6/9) and 71% (12/17), respectively; (ii) the repeat-containing proteins (10/22) and the group of enzymes (16/52) have predictive values of 45% and 31%, respectively; and (iii) the group of ORFs with undefined functions has the lowest predictive score, with only 23% (3/13) seropositive. Of about 190 ORF products containing cell wall-anchoring signals encoded in the genome of B. anthracis, 49 were selected for analysis, and 35 of them were seropositive (Table 3). This high ratio of reactivity indicates that this criterion can be used as a very efficient predictive tool for identification of immunogenic ORF candidates.
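The ranking of the preselected groups amounts to comparing simple proportions of seropositive products among the products tested. The following is a minimal sketch of that comparison, using the counts quoted in the text; the variable names and the helper `predictive_value` are illustrative assumptions.

```python
# (seropositive, tested) counts per preselected group, as quoted in the text
groups = {
    "SLH-containing": (6, 9),
    "adhesin-lipoprotein-transporter": (12, 17),
    "repeat-containing": (10, 22),
    "enzymes": (16, 52),
    "undefined function": (3, 13),
}


def predictive_value(seropositive, tested):
    """Fraction of tested ORF products in a group that were seropositive."""
    return seropositive / tested


# Sort groups from highest to lowest predictive value
ranked = sorted(
    groups.items(), key=lambda item: predictive_value(*item[1]), reverse=True
)

for name, (pos, total) in ranked:
    print(f"{name}: {pos}/{total} = {predictive_value(pos, total):.0%}")

# Cell wall-anchoring signal as a single criterion (35 of 49 selected products)
cw_anchored = predictive_value(35, 49)
print(f"cell wall-anchored: 35/49 = {cw_anchored:.0%}")
```

Because the groups are small and partly overlapping (some repeat-containing products also belong to other groups), the proportions are best read as rough guides for prioritizing candidates.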
One can also evaluate the predictive value of bioinformatic selection on the basis of a more defined functional categorization. Accordingly, we decided to consider the following five functional categories (Table 2): (i) hydrolases associated with the cell wall, (ii) iron acquisition and transport proteins, (iii) amino acid and oligopeptide transport proteins, (iv) extracellular proteases, and (v) other enzymes with documented involvement in the virulence of human pathogens.
With the exception of Sap and EA1 (the seropositive chromosomal controls), all chromosomally derived CW-associated ORF products have hydrolytic activities (Table 2). According to their functions, CW hydrolases may play a central role in many housekeeping functions (like preservation of cell shape, osmotic stability, and cell division [51, 85]). However, in many pathogens, they may also be niche-specific virulence factors, such as inducers of adhesion and colonization and molecules that modify the innate immune response by releasing CW-derived hydrolysis products (18, 19, 39, 44, 65). The nine seropositive chromosomal CW hydrolases (Table 2) have the following putative functions: amidases (BA0898, BA1818, and BA3737), transglycosylase (BA0981), glycosyl hydrolases (endolysins BA0485, BA2805, and BA3668), polysaccharide deacetylase (BA2944), and a spore cortex lytic enzyme (BA2748). In addition, the two pXO2-encoded positive ORF products, the pXO2-8 and pXO2-42 products, are also amidases (Table 5). Altogether, the CW hydrolases described here comprise 22% of the immunoreactive ORF products and constitute a major specific functional category in the screen.
The information about iron acquisition systems in gram-positive pathogens is rather limited (17, 74, 80). Genomic data have suggested that B. anthracis might express an expanded array of genes encoding functions related to iron acquisition (75). Indeed, two studies determined that a siderophore biosynthesis operon (20) and the MntA (BA3189) gene (37) are expressed during infection, and both were shown to be essential for virulence. In this study, seven seropositive chromosomal ORF products, representing 15% of the seropositive products, have possible functions in iron acquisition (Table 2). Five of them are substrate-binding components of ABC transporters (BA0175, BA3189 [MntA], BA4597, BA4766, and BA5330 products), and all of these products except MntA are probably importers of siderophore complexes (FatB and FhuD domains). The remaining two ORFs, BA4787 and BA4789, code for sortase-anchored products harboring five copies and one copy, respectively, of the NEAT domain. In gram-positive pathogens, variable copy numbers of the NEAT domain are present in proteins encoded by genes that are usually adjacent to the genes of iron ABC transporters (5). In Staphylococcus aureus, analogous proteins, encoded by the isd (iron-regulated surface determinant) locus, are known to be involved in heme scavenging and heme iron compound transport (56, 84). Notably, Zink and Burns (95) have recently suggested that BA4789 may play a role in growth of B. anthracis in macrophages.
Amino acids and peptides are imported into the cells as nutrient sources. As determined from the sequence data, B. anthracis has an expanded capacity for amino acid and peptide utilization (75). There are 17 ABC-type peptide-binding proteins and 9 branched-chain amino acid transporters that are encoded in the B. anthracis genome (compared to only three and two, respectively, in the soil bacterium Bacillus subtilis). Four ligand-binding ORF products with putative functions in transport of amino acids or peptides are seropositive (BA0656, BA0855, BA1191, and BA2848 products), and they represented all the amino acid- and oligopeptide-binding proteins selected for the screen (Table 2). Thus, within the limitations of the small sample size, the results of the screen confirmed the in silico prediction described above and suggest that this group may also be useful as a predictive tool in selection of immunogenic ORFs.
Extracellular proteases can function in nutrient scavenging, as participants in determining the fate of bacterial cell proteins, and/or as virulence factors damaging host tissues. According to genomic analysis, B. anthracis encodes more secreted proteases and peptidases than B. subtilis encodes (75). Sixteen ORF products with hypothetical proteolytic activity representing 31% of the enzyme group (16/52 products) (see Table S1 in the supplemental material) were preselected for the screen, and six of them were seropositive (BA0672, BA1295, BA1353, BA3584, BA3660, and BA5427 products) (Table 2). Four of the products are well-documented virulence factors in related bacteria. BA0672 and BA1295 encode two distinct orthologs of immune inhibitor (InhA) metalloproteases, which are virulence factors in B. thuringiensis and B. cereus (28, 29, 73, 82). BA3584 encodes a collagenase. Collagenases can participate in host tissue destruction, thus promoting bacterial invasion and dissemination (14). Finally, HtrA (BA3660), a widely distributed serine protease that acts in the context of protein secretion both as a chaperone and as a protease (7, 78), is considered a virulence factor and a vaccine candidate in many pathogens (33, 40, 41, 52, 71). Notably, a recent proteomic study identified HtrA as one of the major proteins induced in a virulent B. anthracis strain under in vivo growth-stimulating conditions (24).
Generally, ORFs encoding enzymes for housekeeping functions were not included in the screen; the exceptions were ORFs whose orthologs have been shown to be involved in virulence mechanisms of other pathogens. Excluding the proteases and other hydrolases, only 4 of the remaining 19 ORFs coding for enzymes were found to be seropositive (BA0309, BA1041, BA3891, and BA4346) (Table 2).
In conclusion, based on the distribution of the immunoreactive proteins among the in silico-preselected groups, there are several general criteria that could be used in predicting the immunogenic potential of ORF vaccine candidates: (i) the presence of an anchoring domain (mainly SLH, sortase recognition, or lipobox); (ii) the presence of a repeat domain or motif; (iii) the presence of domains which indicate that an ORF product is an adhesin (this is a reasonable criterion only if it is accompanied by an additional virulence-related putative function); and (iv) functional categories that appear to have high predictive value, including cell wall hydrolases, proteins that participate in iron acquisition and import, and amino acid and oligopeptide transporters. The immunogenic ORF products that were identified in this study provide a valuable list of proteins for future investigations. A majority of these products are novel B. anthracis immunogens, some of which could be useful for diagnostic purposes or for future vaccine development. We are currently studying the abilities of these seropositive proteins to confer protective immunity, both as recombinant purified proteins and in the context of live vaccine strains.
We are grateful to B. Velan for critically reviewing the manuscript, and we thank R. Aloni-Grinstein for providing the R-3 hyperimmune antisera, I. Mendelson for construction of the B19 mutant strain, and G. Friedman and N. Zeliger for their excellent technical assistance.
Editor: D. L. Burns
†Supplemental material for this article may be found at http://iai.asm.org/.