Chronic infection of the human stomach by Helicobacter pylori leads to a variety of pathological sequelae, including peptic ulcer and gastric cancer, resulting in significant human morbidity and mortality. Several genes have been implicated in disease related to H. pylori infection, including the vacuolating cytotoxin and the cag pathogenicity island. Other factors important for the establishment and maintenance of infection include urease enzyme production, motility, iron uptake, and stress response. We utilized a C57BL/6 mouse infection model to query a collection of 2,400 transposon mutants in two different bacterial strain backgrounds for H. pylori genetic loci contributing to colonization of the stomach. Microarray-based tracking of transposon mutants allowed us to monitor the behavior of transposon insertions in 758 different gene loci. Of the loci measured, 223 (29%) had a predicted colonization defect. These included previously described H. pylori virulence genes, genes implicated in virulence in other pathogenic bacteria, and 81 hypothetical proteins. We have retested 10 previously uncharacterized candidate colonization gene loci by making independent null alleles and have confirmed their colonization phenotypes by using competition experiments and by determining the dose required for 50% infection. Of the genetic loci retested, 60% have strain-specific colonization defects, while 40% have phenotypes in both strain backgrounds for infection, highlighting the profound effect of H. pylori strain variation on the pathogenic potential of this organism.
Helicobacter pylori, a bacterial pathogen of the human stomach, infects an estimated 50% of the population worldwide. Infection by H. pylori causes gastritis initially and, if allowed to persist, can induce a range of pathologies. It is the causative agent of most peptic ulcers, and other serious outcomes such as atrophic gastritis, intestinal metaplasia, and gastric cancer are correlated with long-term infections. It is not yet known whether these outcomes are due to specific factors produced by the organism or whether they result from chronic inflammation due to efficient and persistent colonization of the gastric mucosa. Thus, colonization and persistence factors may themselves constitute virulence factors for this organism.
While mechanisms of virulence with respect to pathogen-host interactions are poorly understood, several important mechanisms of survival required to initiate the infection have been characterized (42). H. pylori absolutely requires a family of genes involved in the production of urease, an enzyme critical for neutralizing the pH around the organism during exposure to the acidic lumen of the stomach. H. pylori also requires a large set of motility genes related to the successful production and operation of its flagellum. Without motility, H. pylori cannot penetrate the mucous layer which protects the stomach epithelium from the acid it produces. While the bacteria can persist in the mucous layer, they also attach tightly to gastric epithelial cells via a number of adhesins. Particularly important are two outer membrane proteins, BabA and SabA, that bind oligosaccharide antigens present on cellular subpopulations of the gastric epithelium (43). Successful colonization thus requires intimate interaction with the epithelium itself.
Several virulence factors are thought to be important once contact with the host cell epithelium is established. VacA, a secreted protein, has a vast array of activities linked to it, including membrane insertion, anion-conducting channel activity, alteration of transepithelial resistance, inhibition of antigen processing, and induction of apoptosis (50). Presumably, the loss of membrane integrity in the host cell increases nutrient availability at the host cell surface. The cag pathogenicity island (PAI) is composed of 27 genes. The cag PAI contains homologues to type IV secretion system (T4SS) components in other gram-negative organisms such as Agrobacterium tumefaciens and Legionella pneumophila (plant and human pathogens, respectively). The presence of the cag PAI correlates with more-serious disease outcomes, implying that when functioning, it is important in pathogenesis. The CagA protein (also encoded in the island) is translocated to the host cell cytoplasm by the T4SS (7). Once CagA is translocated into epithelial cells, it remains associated with the host membrane and becomes tyrosine phosphorylated on C-terminal repeat motifs, Glu-Pro-Ile-Tyr-Ala (EPIYA motifs), by proteins of the Src family of tyrosine kinases. Both phosphorylated and unphosphorylated CagA proteins have been reported to induce cell signaling pathways resulting in altered spreading, migration, and adhesion of epithelial cells. The Ras/mitogen-activated kinase kinase/extracellular signal-regulated kinase (MEK/ERK) and Src homology 2 (SH2) domain containing tyrosine phosphatase (SHP-2) pathways are some of the pathways reported to be activated by cagA-positive strains, explaining observations of increased cell proliferation during H. pylori infection, a hallmark for a precancerous state.
Several screens for colonization or virulence factors of H. pylori have been described using libraries of insertional mutants made either by the transposition of H. pylori DNA clones propagated in Escherichia coli (shuttle mutagenesis) (26, 31, 44, 45) or by the integration of plasmids containing random small pieces of H. pylori DNA (5, 6). Both these techniques take advantage of H. pylori bacteria's ability to integrate homologous DNA by recombination. So far, genetic screens have focused mostly on in vitro phenotypes thought to be important for infection of the human stomach, such as motility, acid survival, urease activity, adherence to gastric epithelial cells, and the ability to take up exogenous DNA (reviewed in reference 23). Recently, the first in vivo screen using a gerbil model of infection was reported (31). That study queried 960 mutants, giving 252 candidate mutants corresponding to 47 genes (due to redundancy of the original 960-mutant pool). Both the previously characterized colonization genes (such as those that encode motility factors) and the novel components (such as collagenase) were identified. None of the screens to date has been saturating, and none has produced either novel T4SS-dependent genes or genes required for the regulation of virulence gene expression.
We recently developed a genome-saturating library of transposon mutants and a method to track pools of these transposon mutants (MATT), using a whole-genome microarray (54). We have initiated a screen for H. pylori colonization factors by using these tools in a mouse infection model (32). Here we present the results of our initial screen in two different mouse-adapted strain backgrounds. Our goal was to identify genes contributing to bacterial colonization and/or persistence in the stomach.
H. pylori was grown on solid medium on horse blood (HB) agar plates containing 4% Columbia agar base (Oxoid), 5% defibrinated horse blood (HemoStat Labs), 0.2% β-cyclodextrin (Sigma), 10 μg of vancomycin (Sigma) per ml, 5 μg of cefsulodin (Sigma) per ml, 2.5 U of polymyxin B (Sigma) per ml, 5 μg of trimethoprim (Sigma) per ml, and 8 μg of amphotericin B (Sigma) per ml, under microaerobic conditions at 37°C. A microaerobic atmosphere was generated either by using a CampyGen sachet (Oxoid) in a gas pack jar or by incubating the culture in an incubator equilibrated with 14% CO2 and 86% air. For liquid culture, H. pylori was grown in Brucella broth (Difco) containing 10% fetal bovine serum (BB10; Invitrogen) with shaking in a microaerobic atmosphere. For antibiotic resistance marker selection, bacterial media were additionally supplemented with 25 μg of chloramphenicol (CHL) per ml. In some cases, 200 μg per ml bacitracin (BAC) was added to eliminate normal mouse flora contamination.
DNA manipulation (restriction digests, PCR, agarose gel electrophoresis, and Southern blotting) was performed according to standard procedures (3). Genomic DNA was prepared from H. pylori by using a Wizard genomic DNA preparation kit (Promega). A list of primers used for PCR and sequencing is given in Table S1 in the supplemental material.
The mouse-adapted strains used in this study were NSH79 and NSH57. NSH79 is a mouse-adapted H. pylori strain obtained after multiple passages through mice of a mixture of strains, including strain SS1 described by Lee and coworkers (32). We distinguish the NSH79 strain from strain SS1 because microarray-based comparative genomic DNA hybridization revealed that the NSH79 clone has numerous genetic changes (gene acquisition and loss) relative to the SS1 strain (N. R. Salama, L. J. Thompson, A. Lee, and S. Falkow, unpublished observations) (Fig. 1). NSH57 is a mouse-adapted derivative of strain G27 (15). NSH57 was generated by inoculating approximately 5 × 10^8 G27 bacteria, as described below, into four FVB mice. At 3 weeks, mice were sacrificed, and the output from one mouse was pooled (passage 1), amplified by growth in liquid culture, and used to inoculate four new FVB mice (passage 2). For the third passage, 30 mice were infected. These mice were sacrificed at 3 weeks, and 26 of the 30 inoculated mice were found to have detectable infection, with bacterial loads ranging from 3.7 to 5.3 log CFU/g. Bacteria from three mice giving bacterial loads of >5.0 log CFU/g were separately inoculated into C57BL/6 mice. At 3 weeks, the mice were sacrificed. All mice were infected with bacterial loads ranging from 2.6 to 5.4 log CFU/g. Single-colony isolates were obtained from mice with >5.0 log CFU/g, and one isolate, designated NSH57, was used for further study.
The H. pylori transposon library was initially generated from the G27 strain. Genomic DNA (7 μg) prepared from the original 10,000-clone G27 library was transformed into two mouse-adapted strains, NSH79 and NSH57, by natural transformation (66), with selection on HB-CHL plates. Colonies from the original transformation plates were harvested, pooled, and frozen immediately. The resulting NSH79 library contained approximately 2,000 clones, and the NSH57 library contained approximately 500,000 clones. Twelve clones from each library were individually picked for genomic DNA preparation. The genomic DNA was digested with HindIII (which cuts once in the transposon sequence) to completion, separated by gel electrophoresis, and transferred to nylon membranes. The membranes were probed by Southern blotting with transposon sequences to demonstrate that individual clones contained distinct insertions (data not shown). Pools of 48 individual mutants for infection studies were generated by plating the original frozen library stock to give single colonies on HB-CHL plates. Individual colonies were patched onto fresh plates and harvested in pools of 48 clones.
Female C57BL/6 mice, 24 to 28 days old, were obtained from Charles River Laboratories and certified free of endogenous Helicobacter sp. infection. For infection of pools of mutants, frozen stocks of the pools of 48 clones were plated on solid medium to give at least 1,000 individual colonies. After 2 (NSH57) or 3 (NSH79) days of growth, the bacteria were harvested directly from the plates and resuspended in BB10 medium to give approximately 5 × 10^9 bacteria per ml. Cultures were examined by microscopy to ensure a spiral shape and motility indicative of mid- to late logarithmic growth phase. Four mice were inoculated by oral gavage with 0.5 to 1 ml. After dosing, an aliquot of the inoculum was plated on HB-CHL plates to generate the input pool. The mice were housed in sterilized microisolator cages with irradiated PMI 5053 rodent chow, autoclaved corncob bedding, and acidified, reverse-osmosis purified water provided ad libitum. All studies were done under practices and procedures of Animal Biosafety Level 2. The facility is fully accredited by the Association for Assessment and Accreditation of Laboratory Animal Care, International, and all activities were approved by the Institutional Animal Care and Use Committee. Two mice at 1 week and two mice at 1 month were euthanized by inhalation of CO2. The glandular stomach was removed and cut along the greater curvature; any remaining food was removed, and tissue samples were placed in 2 ml of BB10 and homogenized with a Powergen 125 homogenizer (Fisher Scientific). The homogenate was cleared by centrifugation at 500 rpm in a clinical centrifuge (model 5810R; Eppendorf), and 200 μl was plated to generate the output pool.
For competition experiments, each indicated null mutant strain and the parental wild-type strain were grown from frozen stock in liquid culture to mid- to late logarithmic growth phase. The wild-type and mutant bacteria were combined to give approximately 2.5 × 10^8 bacteria of each in 5-ml and 1-ml amounts and inoculated by oral gavage into each of four mice. After inoculation, a portion of the inoculum was plated on HB-BAC plates and HB-CHL plates to enumerate the actual number of wild-type and mutant bacteria present in the inocula, respectively. After 1 week, the mice were euthanized and the stomach removed as described above. The stomach was then cut in half longitudinally, and one half of the stomach was placed in 0.5 ml BB10 for homogenization with disposable pellet pestles (Fisher Scientific). Dilutions of the homogenate were plated on HB-BAC to enumerate total bacteria and on HB-CHL plates to enumerate mutant bacteria.
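Plate counts from such a competition are often summarized as a competitive index, the mutant-to-wild-type ratio in the output divided by the same ratio in the inoculum. The following is a minimal sketch under stated assumptions, not necessarily the calculation used in this study: the output wild-type count is taken as the HB-BAC (total) count minus the HB-CHL (mutant) count, the function name is hypothetical, and the example counts are invented for illustration.

```python
def competitive_index(input_wt, input_mutant, output_total, output_mutant):
    """Return (mutant / wild type) in the output divided by the same ratio in the input.

    output_total is the count on HB-BAC plates (all H. pylori) and
    output_mutant is the count on HB-CHL plates (mutant only).
    Assumption: wild type in the output = total - mutant.
    """
    output_wt = output_total - output_mutant
    if input_wt <= 0 or input_mutant <= 0 or output_wt <= 0:
        raise ValueError("input counts and output wild-type count must be positive")
    return (output_mutant / output_wt) / (input_mutant / input_wt)


# Invented counts (CFU), not data from the study.
ci = competitive_index(
    input_wt=2.5e8,
    input_mutant=2.5e8,
    output_total=1.1e5,
    output_mutant=1.0e4,
)
print(f"competitive index = {ci:.2f}")
```

A value near 1 would indicate that the mutant competes as well as the wild type, and a value well below 1 would indicate a colonization defect.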
For the 50% infective dose (ID50) experiments, the indicated strains were grown in liquid culture to mid- to late logarithmic phase and concentrated to give approximately 5 × 10^8 bacteria per ml. Serial 10-fold dilutions were prepared and inoculated into each of five mice. After 7 to 10 days, mice were euthanized and processed as described above for competition experiments with plating on HB-BAC plates.
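The text does not state how the ID50 was derived from the infected/uninfected counts at each dose. One classical option is Reed-Muench interpolation, sketched below as an illustration only and not as the authors' method. The function name and the example counts are hypothetical.

```python
import math


def reed_muench_id50(doses, infected, total):
    """Estimate the 50% infective dose by Reed-Muench interpolation.

    doses:    inoculum sizes in ascending order (e.g. a 10-fold series)
    infected: number of infected animals at each dose
    total:    number of animals inoculated at each dose
    """
    uninfected = [t - i for i, t in zip(infected, total)]
    # Reed-Muench pooling: an animal infected at a low dose is assumed to be
    # infected at every higher dose, and an animal left uninfected at a high
    # dose is assumed to stay uninfected at every lower dose.
    cum_inf = [sum(infected[: k + 1]) for k in range(len(doses))]
    cum_uninf = [sum(uninfected[k:]) for k in range(len(doses))]
    frac = [i / (i + u) for i, u in zip(cum_inf, cum_uninf)]
    for k in range(len(doses) - 1):
        if frac[k] < 0.5 <= frac[k + 1]:
            # Proportionate distance between the two bracketing doses.
            distance = (0.5 - frac[k]) / (frac[k + 1] - frac[k])
            step = math.log10(doses[k + 1]) - math.log10(doses[k])
            return 10 ** (math.log10(doses[k]) + distance * step)
    raise ValueError("the 50% point is not bracketed by the tested doses")


# Invented example: five mice per dose across a 10-fold dilution series.
id50 = reed_muench_id50(
    doses=[1e4, 1e5, 1e6, 1e7, 1e8],
    infected=[0, 1, 3, 5, 5],
    total=[5, 5, 5, 5, 5],
)
print(f"estimated ID50 = {id50:.2e} bacteria")
```

Comparing the estimate for a null mutant with that of its wild-type parent gives a fold change in the dose needed to establish infection.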
Microarray design and hybridization conditions have been described previously (28, 52). Each isolate was analyzed on at least two microarrays, generating four potential data points for each gene. Data points were excluded due to low signal or slide abnormalities, and only those genes with two measurements were analyzed. Data were normalized using the default-computed normalization of the Stanford Microarray Database (4), and the mean of the log2(red/green ratio) was computed. The cutoff for the absence of a gene was defined as a log2(red/green ratio) of −1.0 based on test hybridizations (52). Data were simplified into a binary score, analyzed with CLUSTER software (http://rana.lbl.gov/EisenSoftware.htm), and displayed with TREEVIEW (19). The complete data set is available in Table S3 in the supplemental material, and raw data are available at http://genome-www5.stanford.edu.
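The reduction of replicate ratios to a binary presence/absence score can be sketched as follows. This is a hedged illustration: it assumes that "two measurements" means at least two usable data points and that a mean log2 ratio below the −1.0 cutoff is scored as absent. The function name and gene labels are hypothetical.

```python
from statistics import mean

ABSENCE_CUTOFF = -1.0  # log2(red/green) cutoff for an absent gene, as given in the text


def call_presence(log2_ratios, min_measurements=2):
    """Return 1 (present), 0 (absent), or None (too few usable data points).

    log2_ratios holds up to four replicate values for one gene; None marks
    a data point excluded for low signal or a slide abnormality.
    """
    usable = [r for r in log2_ratios if r is not None]
    if len(usable) < min_measurements:
        return None  # assumption: such genes are left out of the analysis
    return 0 if mean(usable) < ABSENCE_CUTOFF else 1


# Invented replicate values for three hypothetical genes.
replicates = {
    "geneA": [0.1, -0.2, None, 0.3],
    "geneB": [-2.1, -1.7, -1.9, None],
    "geneC": [None, 0.4, None, None],
}
calls = {gene: call_presence(values) for gene, values in replicates.items()}
print(calls)
```

The resulting 0/1 vectors, one per strain, are the kind of input that hierarchical clustering on gene presence and absence would take.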
An adaptation of MATT (54) was used to estimate the location of all 48 mutants in each pool (outlined in Fig. 2). Briefly, genomic DNA was prepared from the input and output pools of each mouse. Sequences flanking one side of the transposons in each pool were amplified in a two-step PCR and labeled for microarray hybridization as previously described (54). The transposon primer N2 was redesigned and named N3 (5′-CTTTAATACGACGGGCAATTTGCACTTCAG-3′), based on hybridization optimization of known mutant pools (data not shown). Additionally, only primer CEKG2C was used as the random anchored primer for amplification. For all experiments (each mouse represents a single experiment, with an input pool and an output pool), the input pool DNA was labeled with Cy-3 (green [G]) and the output pool was labeled with Cy-5 (red [R]). Therefore, the red/green (R/G) ratio for each gene provides a measure of fitness for any given gene mutated in the input pool: a yellow spot (R/G, 1) present in both the input and the output represents no phenotype; a green spot (R/G, <1) present in the input but absent in the output represents a candidate mutant. For the set of four infections, the input pool DNAs were identical. Microarray scanning and analysis were performed on a GenePix 4000B scanner (Axon) using GenePix Pro 5.0 software (Axon). Spots were filtered for slide abnormalities (flag, 0) and uniformity of spot intensity (regression correlation of the spot, ≥0.6). Because each pool contained only 48 insertions, most of the 1,660 gene spots on the array were expected to give a background signal. Previous experiments using MATT to map transposon insertions had revealed that not all insertions are detected and that in some cases multiple genes adjacent to the transposon insertion get amplified (54). Thus, we expected roughly 100 spots in each experiment array to give a strong signal, but the actual number would vary from experiment to experiment.
To select spots for analysis, we first computed the average background-subtracted signal intensity in the Cy-3 channel (input pool) of all the spots except the brightest 100 spots. We then chose for further analysis spots whose intensity was greater than 4 standard deviations from this mean. The average number of spots selected was 150 (range, 89 to 164). The log2(R/G) was collected for each of these gene spots. Two data sets were collected for each experiment (mouse infection): one where the amplification of flanking DNA was performed from the left side of the transposon (S primers) and a second where the amplification of flanking DNA was performed from the right arm of the transposon (N primers). The data from all arrays were then collated using a relational database (Access software; Microsoft). To define an insertion site, we required that its gene spot be labeled from both sides of the transposon insertion, and thus there were data for the same gene spot in both data sets. These genes were assigned a probability of insertion of 1. To account for transposon insertions where the insertion lay near the end of a gene, we also defined insertions where, after the gene spots were arranged in chromosomal order, there was signal for a gene spot in the reaction from one side of the transposon and signal for the nearest neighboring gene spot in the reaction from the other side of the transposon. In this case, we gave each of the two neighboring genes a probability of insertion of 0.5. We then queried the color of the insertion marking the gene spots. If the spot was yellow [present in both the input and the output, log2(R/G) ≥ −2], we assigned a positive value; and if it was green [present in the input, absent in the output, log2(R/G) ≤ −2], we assigned a negative value.
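The spot-selection rule described at the start of this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions: the standard deviation is taken over the same background set used for the mean (all spots except the brightest ones), which the text does not spell out, and the function name, spot labels, and intensities are hypothetical. The toy example excludes the 3 brightest spots rather than 100 to keep the data small.

```python
from statistics import mean, stdev


def select_spots(cy3_intensity, n_excluded=100, n_sd=4.0):
    """Return IDs of spots whose Cy-3 signal exceeds background mean + n_sd * SD.

    cy3_intensity maps spot ID -> background-subtracted Cy-3 intensity.
    The n_excluded brightest spots are left out when estimating background.
    """
    ranked = sorted(cy3_intensity.values())
    background = ranked[:-n_excluded] if n_excluded else ranked
    if len(background) < 2:
        raise ValueError("need at least two background spots")
    threshold = mean(background) + n_sd * stdev(background)
    return {spot for spot, value in cy3_intensity.items() if value > threshold}


# Toy array: 16 background spots near 100 units and 3 strongly labeled spots.
intensities = {f"spot{i}": (90.0 if i % 2 else 110.0) for i in range(16)}
intensities.update({"bright1": 5000.0, "bright2": 5000.0, "bright3": 5000.0})

selected = select_spots(intensities, n_excluded=3)
print(sorted(selected))
```

Because the threshold is set from the background distribution rather than from a fixed count, the number of selected spots can exceed the number excluded, which is consistent with the reported average of 150 selected spots.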
Thus, insertions with no apparent phenotype (present in both the input and the output) received a value of 1 or 0.5, and insertions with a mutant phenotype (absent in the output) received a value of −1 or −0.5. The data for all the pools tested in each strain background were analyzed separately. For each strain background, the data were summed across all the pools, and genes with a value of less than −1.5 (a value of −0.5 in each of three mice) were considered candidate mutants.
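The bookkeeping described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Access queries used in the study: genes are represented by integer positions in chromosomal order, the "nearest neighboring gene spot" is taken to be the adjacent position, and the cutoff is treated as inclusive because −0.5 in each of three mice sums to exactly −1.5. All function names and the example data are hypothetical.

```python
from collections import defaultdict


def call_insertions(left_hits, right_hits):
    """Assign a probability of insertion to each gene position.

    left_hits / right_hits: sets of gene positions with signal in the
    reaction from the left (S primers) or right (N primers) side.
    """
    prob = {gene: 1.0 for gene in left_hits & right_hits}  # seen from both sides
    right_only = right_hits - left_hits
    for gene in left_hits - right_hits:
        for neighbor in (gene - 1, gene + 1):
            if neighbor in right_only:
                # Insertion near a gene end: split between the two neighbors.
                prob.setdefault(gene, 0.5)
                prob.setdefault(neighbor, 0.5)
    return prob


def score_mouse(prob, absent_in_output):
    """Make the value negative when the spot was green (lost in the output)."""
    return {g: (-p if g in absent_in_output else p) for g, p in prob.items()}


def candidate_genes(mouse_scores, cutoff=-1.5):
    """Sum the values across mice and return genes at or below the cutoff."""
    totals = defaultdict(float)
    for scores in mouse_scores:
        for gene, value in scores.items():
            totals[gene] += value
    return {g for g, t in totals.items() if t <= cutoff}, dict(totals)


# Invented data for four mice: (left hits, right hits, genes lost in output).
mouse_data = [
    ({10, 20, 30}, {10, 21, 30}, {10, 20, 21}),
    ({10, 20, 30}, {10, 21, 30}, {10, 20, 21}),
    ({10, 20, 30}, {10, 21, 30}, {10, 20, 21}),
    ({10, 30}, {10, 30}, set()),
]
scores = [score_mouse(call_insertions(left, right), lost)
          for left, right, lost in mouse_data]
hits, totals = candidate_genes(scores)
print(sorted(hits), totals)
```

In this toy data set, gene 10 is lost in three of four mice, genes 20 and 21 share a split insertion that is lost in three mice, and gene 30 is retained in every mouse, so only the first three reach the cutoff.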
Genes for which microarray analysis reproducibly indicated transposon insertions that attenuated survival in the mouse were chosen for further analysis. Primers were designed for both the 5′ and 3′ ends of each gene by aligning the sequences from the two sequenced genomes present in the TIGR comprehensive microbial resource (48) and choosing conserved sequences (see Table S1 in the supplemental material). PCR was performed using the MATT transposon primers (N3 and S) in combination with each of the gene-specific primers. PCR products were sequenced by the FHCRC genomic shared resource using BigDye sequencing reagents (ABI) and the transposon-specific primer.
Genes whose transposon locations were verified by PCR and sequencing were considered for further analysis. Null alleles were constructed using a vector-free allelic replacement strategy, as described previously, to generate alleles where a chloramphenicol antibiotic resistance cassette replaced most of the coding sequence of the gene while preserving the start and stop codons (12, 54). The resistance cassette contains its own promoter but lacks a transcriptional terminator and, in all cases, was cloned in the same direction of transcription as the native gene. Briefly, N-terminal and C-terminal fragments of each gene were amplified using the upstream and downstream primers combined with gene internal primers such that both fragments were 250 to 400 bp (see Table S1 in the supplemental material). The gene internal primers had sequences complementary to the full lengths of primers C1 and C2, used to amplify the Campylobacter coli Cat gene, appended to the 5′ end. PCR products from each of three individual PCRs (the N terminus, the C terminus, and the Cat gene) were purified using DNA Clean & Concentrator-5 (Zymo Research) or by agarose gel electrophoresis followed by QIAEX II gel extraction (QIAGEN) if there were multiple bands. A final PCR was performed with approximately 100 ng of each of the three PCR products as template and the upstream and downstream primers to generate the knockout cassette. This final PCR product was verified by agarose gel electrophoresis, and 10 μl was directly used for the natural transformation (66) of each strain background (NSH79 and NSH57), with selection on HB-CHL media. Four to eight clones were evaluated by PCR to confirm replacement of the wild-type allele with the null allele and were tested for urease and motility phenotypes. A single clone was used for infection experiments.
The human gastric adenocarcinoma cell line AGS was maintained in 10% CO2 in Dulbecco's minimal essential medium supplemented with 10% fetal bovine serum (FBS). Cells were seeded at a density of 1 × 10^5 in 24-well plates. H. pylori strains were grown at 37°C overnight in BB10 to an optical density at 600 nm (OD600) of 0.4 to 0.8, harvested, and resuspended in DB medium (81% Dulbecco's minimum essential medium, 9% Brucella broth, 10% FBS) at 1 × 10^6 to 2 × 10^6 bacteria/ml, and 1 ml was used to inoculate each well. At each time point, the supernatant was harvested, centrifuged, and frozen for interleukin 8 (IL-8) analysis (Biotrak enzyme-linked immunosorbent assay system; Amersham Biosciences). To detect phosphorylated and total CagA, the remaining cells and bacteria in each well were washed three times with 1 mM Na3VO4 in phosphate-buffered saline (Invitrogen) and lysed with 100 μl 2× sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) sample buffer (0.25 M Tris-Cl [pH 6.8], 4% glycerol, 4% SDS, 0.001% bromophenol blue, 2% 2-mercaptoethanol). Samples were resolved by SDS-PAGE and transferred to polyvinylidene difluoride membranes. To visualize phosphorylated CagA, membranes were probed with anti-phosphotyrosine-PY20 (BD Transduction Laboratories) or 4G10 (Upstate), followed by 1:10,000 goat anti-mouse horseradish peroxidase (HRP) (Amersham Biosciences). Immunoreactive proteins were visualized using ECL Plus Western blotting detection reagent (Amersham Biosciences). To measure CagA, the same membranes were probed with 1:10,000 anti-CagA polyclonal antibody pAs (2), followed by 1:10,000 goat anti-rabbit HRP (Amersham Biosciences), and visualized using ECL Plus.
Individual clones and the wild-type parent strain were inoculated into plates containing 28 g/liter Brucella broth, 5% FBS, and 0.4% agar. After 4 days of growth, the plates were evaluated for halo production.
Bacteria were resuspended from freshly growing plates (24 to 48 h) in 100 μl urease broth (Difco) in 96-well plates in duplicate. The wells were covered with tape and incubated at 37°C in 14% CO2 for 24 h. Wells where the pH-sensitive indicator dye changed from yellow to pink were scored urease positive.
Null mutant or isogenic wild-type bacteria were inoculated from 2 to 3 days of growth on plates into 2 ml BB10 medium for overnight cultures. The resulting cultures were checked for spiral morphology and motility and then diluted to an OD600 of 0.1. Aliquots were removed every 3 to 6 h to determine the OD600. The resulting growth curves were compared graphically using Excel (Microsoft).
Raw data for each array were submitted to the Stanford Microarray Database for normalization and warehousing (4). The raw data are available at http://smd.stanford.edu/cgi-bin/publication/viewPublication.pl?pub_no=564.
Our original transposon mutant library was made with strain G27 (54). This strain has several advantages. It is highly transformable with exogenous DNA by natural transformation, it contains all known H. pylori virulence factors, and it displays robust PAI function. This strain, however, is not able to efficiently colonize the commonly used C57BL/6 mouse infection model. In order to perform selections in vivo, we used genomic DNA from the original library to transfer the library mutations by natural transformation and homologous recombination into two genetically distinct mouse-adapted strains, NSH79 and NSH57 (see Materials and Methods). There have been increasing reports of strain-specific effects on colonization from null mutations of candidate virulence genes (14, 17, 35, 36). By querying two different strain backgrounds, we hoped to illuminate which genes have a universal impact on colonization and which have strain-specific effects.
As described in Materials and Methods, NSH57 and NSH79 were both derived from multiple mouse and laboratory passages of previously characterized H. pylori isolates (G27 and SS1, respectively). We reported previously that the adaptation of human clinical isolates of H. pylori to animal hosts can result in gene deletion, particularly in the cag PAI (27, 30, 60). Therefore, we performed comparative genomic DNA hybridizations to the whole-genome cDNA microarrays to detect whole-gene-level alterations relative to those of the starting strains (see Table S3 in the supplemental material). No changes in genomic content were detected with NSH57 relative to the genome of the parental G27. Strains NSH79 and SS1, however, differed in the presence and absence of 114 genes, indicating that they were quite genetically distinct and that NSH79 likely is unrelated to SS1. Hierarchical clustering of the strains based on the presence and absence of genes revealed that NSH79 was as distinct from SS1 and its pre-mouse clinical isolate (60) as it was from G27 and NSH57 (Fig. 1A). We did not find any major alterations in the cag PAI (see Table S3 in the supplemental material) among any of these strains.
Even in the absence of gene deletion, several studies have shown that many mouse-adapted strains, including SS1, display attenuation of the proinflammatory activity of the cag PAI T4SS (17, 49, 63). To check the activity of the T4SS in our two mouse-adapted strains, we assayed both IL-8 release and host cell translocation-dependent tyrosine phosphorylation of CagA after coculture of bacteria with AGS gastric epithelial tissue culture cells (Fig. 1). Both G27 and the mouse-adapted variant of this strain, NSH57, showed robust induction of IL-8 and translocation of CagA into host cells (Fig. 1B and D and see Fig. S1 in the supplemental material). In contrast, strain NSH79 showed less induction of IL-8, and we were unable to detect phosphorylated CagA in spite of robust CagA expression, even at 24 h (Fig. 1B, C, and D). Thus, NSH79 appears to have a partially attenuated cag PAI T4SS and/or a phosphorylation-defective cagA, while the NSH57 cag PAI T4SS is fully functional.
In order to screen as many mutants as possible using the lowest number of mice, we wanted to infect mice with pools of mutants and monitor the fate of these mutant clones using MATT. The number of mutants that can be screened simultaneously depends on the number of clones that can independently establish infection in the stomach mucosa at one time (41). To address this issue, we performed pilot infections with 24, 48, and 96 random clones from the NSH79 mutant library for 2 weeks. To do this, each pool of transposon mutants was used to infect three mice. Genomic DNA was prepared from input bacteria and output bacteria from each of the three mice. This DNA was labeled for MATT analysis either from one mouse individually or by combining the bacterial outputs from all three infected mice. As summarized in Table 1, MATT was able to detect an increasing number of clones as the input number of clones increased from 24 to 48. However, fewer than the expected number of clones were detected when a mixture of 96 independent clones was used for the infection. Additionally, there were six clones that disappeared from the output pool when one mouse was analyzed individually, but these clones were present when the outputs from the three mice were pooled. This indicates that somewhere between 48 and 96 clones, the criterion of independent action of clones fails to be satisfied. Thus, we have shown that MATT can be used to track clones during mouse infection and that 48 clones can be screened simultaneously in a single mouse.
There are roughly 1,500 genes in the H. pylori genome, of which approximately 350 are essential (54), leaving approximately 1,150 genes that can tolerate mutation by transposon insertion. We screened 25 pools of 48 mutants, or 1,200 mutants in each background, for a total of 2,400 transposon mutant clones. We infected four C57BL/6 mice per pool and harvested the output bacteria from the stomachs of two mice at 1 week and from the other two mice at 1 month postinoculation. Our previous studies had shown that immediately after infection, the majority of bacteria are cleared and then the bacterial load slowly increases until 7 days, after which it remains stable (53). Therefore, we expected that genes required for colonization would display a phenotype by the 1-week time point. We hypothesized that genes not required for the initial colonization, but which play a role in persistence, might display a phenotype at the later time point of 1 month. In practice, however, we did not find insertions that were consistently present at 1 week and absent at 1 month, so all four mice were considered as independent replicates.
Each mouse was analyzed separately, using a modification of MATT (54) as outlined in Fig. 2. This method uses semirandom PCR to amplify the DNA next to transposon insertions. Pooled genomic DNA from the input and output bacteria from each mouse was amplified and labeled with different fluorescent dyes separately and then cohybridized on a single microarray. Transposon insertions were assigned if detected by the same gene spot in separate hybridization experiments initiated from the left and right sides of the transposon. Genes were assigned a probability of insertion value of 1 (−1 for a mutant phenotype, see Materials and Methods) if the insertion mapped to the middle of the gene. If the insertion lay at the end or between genes, each of the two neighboring genes was assigned a value of 0.5 (or −0.5). For each strain background, data were summed across all the pools and genes with a value less than −1.5 (a value of −0.5 in three mice) were given further consideration. The summed data for each strain background are given in Table S4 in the supplemental material.
Using MATT, we were able to track the behavior of 1,378 of the 2,400 initial transposon insertions which mapped to 758 genes. By requiring observation in the microarray hybridizations from both sides of the transposon, we excluded data for many insertions. We felt that this loss of sensitivity was balanced by the elimination of false-positive results. Some genes had insertions observed in multiple pools. Of the 758 genes measured in our screen, only 250 were measured in both strains. This confirms that neither screen was saturating. As expected, most transposon mutant clones were present in both the input and the output pools, but 223 genes (29%) showed a colonization phenotype (see Table S4 in the supplemental material). Of the candidate mutant genes, 98 were queried in both strain backgrounds. For 23 genes, the data from both strain backgrounds indicated a phenotype in colonization after transposon insertion-mediated gene inactivation. However, 75 clones were eliminated during infection in one strain background but not in the other. These discrepancies could arise from the locations of transposon insertion being different for the two screens, where in one case it inactivates a gene while in another it does not. Alternatively, there may be functionally redundant genes present in one strain but not in the other.
The genes we identified as candidate colonization factors in our screen fell into a variety of function categories, some of which were expected based on previous work with H. pylori or other pathogenic organisms, including motility, lipopolysaccharide or exopolysaccharide biosynthesis and modification, urease production, iron homeostasis, and stress response (Table 2). The largest number of genes (81 genes) does not have informative homologies based on DNA sequence alignment and is annotated as hypothetical. Only 21 of these genes are conserved in other sequenced organisms or have recognizable domains (e.g., GTP binding). The second most frequent category (19) was motility and chemotaxis, including a regulator of flagellar biosynthetic gene expression (HP0703, flgR), genes involved in flagellar biosynthesis (12), three methyl-accepting chemotaxis proteins, and three genes involved in transducing signals to the flagellar motor. A number of genes (14) can be grouped together because they are involved in biosynthesis, including three genes involved in nucleotide metabolism, six genes involved in amino acid biosynthesis, three genes involved in biosynthesis of cofactors, and two genes involved in lipid or fatty acid metabolism. Similarly, we identified 11 genes annotated as transporters of various nutrients and a putative efflux pump. Identification of these last two function classes is consistent with a host environment that is limited in the variety of nutrients essential to H. pylori growth compared to the rich media used to propagate the mutant pools in vitro.
Twelve of the candidate colonization genes were annotated to be involved in the production of lipopolysaccharides or extracellular polysaccharides. We identified 11 genes annotated to be involved in energy production, including genes involved in both aerobic and anaerobic respiration, as well as a regulator induced under carbon starvation conditions (cst, HP1168). Mutation of cst attenuates the virulence of Salmonella enterica (58). Perhaps surprisingly, we identified a large number of genes annotated as involved either in DNA modification or in outer membrane proteins (11 each). These are both large gene families in H. pylori that have been postulated to be involved in virulence through the mediation of antigenic variation and genomic variation. Ten genes could be grouped by their roles in the production of ammonia, which is both the main nitrogen source of the bacteria and important in buffering the low pH encountered in the stomach. We identified five of the seven urease genes in our screen. We also included in this group nixA (HP1077) and hypA (HP0869), genes involved in the uptake of nickel and assembly of nickel into the urease enzyme, respectively. Interestingly, both amiE (HP0924) and amiF (HP1238), two amidases carried in the H. pylori genome, were identified as having a phenotype in the NSH57 strain background but not in the NSH79 strain background. A previous study also found that amidase gene function is not required for colonization in the SS1 strain background (9).
We identified six genes from the cag PAI in our screen. In general, these genes showed discordant data in the two strain backgrounds, and as described below, we confirmed this strain-specific phenotype for some of these genes. Two other genes have annotations that implicate them in virulence: a paralogue of the vacA cytotoxin (HP0609) and an orphan homologue of the Agrobacterium tumefaciens type IV secretion apparatus subunit virB4 (jhp0917). We identified eight genes that fell into the categories of protein synthesis and modification; these included three genes involved in lipoprotein biogenesis, a putative sialoglycoprotease, and three genes predicted to affect protein translation. Seven genes are annotated to be involved in DNA metabolism, including DNA repair proteins and putative recombination enzymes. Seven of the transposon insertions showing an attenuated phenotype mapped to one of several transposase genes of the IS605 insertion sequences, which are present in multiple copies in the genome. The mechanism by which these insertions affect colonization is likely indirect. Six genes are annotated to be involved in iron homeostasis, including the ferric uptake regulation protein (fur, HP1027), the nonheme iron-containing ferritin (pfr, HP0653), and four genes with homology to outer membrane iron transporters. Four genes are annotated to be involved in stress response, including a superoxide dismutase (HP0389), a chaperone (HP0210), a peroxyredoxin (HP0136), and a periplasmic acid phosphatase (HP1285). In addition to the flgR and fur genes, we identified two additional candidate transcriptional regulators in our screen: the two-component response regulator HP1043 and tenA (HP1287), a gene shown to induce expression of extracellular enzymes in Bacillus subtilis (47). Finally, we detected a single gene that was annotated to be involved in cell division (fic, HP1159).
We initially focused on a complete analysis of the candidate mutants from two mutant pools from the screen of the NSH79 strain background (NSH79-2 and NSH79-3). We detected transposon insertions in 44 different genes from these two pools, three of which were detected in both pools. Of these 44, 12 genes were suspected to be involved in colonization because they were not detected in the output pool (Table 3). Consistent with the results from the entire screen, the majority (seven genes) had no annotation, based on nucleotide sequence homology, and one is annotated as a conserved hypothetical integral membrane protein. One gene was annotated as a nonfunctional type II restriction endonuclease, presumably because it lies adjacent to a putative type II N6 adenine-specific methyl transferase (HP0369), though the sequence of HP0368 has degenerated to the point that no conserved endonuclease domain can be identified.
The remaining genes do have annotations that suggest possible roles in virulence. Two genes, HP1027 (fur) and HP1400 (fecA), are expected to play a role in iron homeostasis. The iron-dependent transcriptional repressor Fur was shown to be required for full virulence by SS1 bacteria in the mouse model of infection (10). Two insertions reside within the cag PAI. We were surprised to find insertions in PAI genes giving a colonization phenotype because previous work has indicated that the PAI is not necessary for bacterial colonization in the mouse (49, 60).
When the data from these pools were compared to the larger data set, some of these genes appeared in multiple pools. Additionally, some were detected in both screens. One gene, HP0503, showed a colonization phenotype in both strain backgrounds, while HP0022, HP0368, HP0529 (cag9), and HP1028 were not candidate mutants in the NSH57 strain background. This discordance could result from strain-specific differences in genetic redundancy or phenotypic buffering or from differences in the site of transposon insertion in the different clones. Transposon insertions at the very 5′ or the 3′ 10 to 15% portion of the gene have been suggested not to severely affect gene function (29).
To evaluate the robustness of the array data, we first identified the exact location of each transposon in the region predicted by the arrays. Gene-specific primers were designed for the 5′ and 3′ ends of each gene and were used in combination with the transposon-specific primers to amplify genomic fragments flanking the insertions. These PCR products were then sequenced. The position of the transposon insertion for each candidate mutant is given in Table 3. We made independent null alleles of each of these genes by replacing most of the coding sequence with a nonpolar antibiotic resistance cassette. Null alleles were made in both the NSH79 and NSH57 mouse-adapted strain backgrounds to determine if the phenotype was strain specific or more universal. We tested these mutants in a 1:1 competition experiment with wild-type bacteria. After 1 week of infection, we harvested the stomach and determined the competitive index (CFU mutant bacteria/CFU wild-type bacteria). This competitive index number was corrected by dividing by the actual input ratio, enumerated by plating the inocula after infection. As seen in Fig. 3, all the gene loci tested gave a phenotype in the NSH79 strain background, which was the strain background of the pools being analyzed. Many of these gene loci also gave a phenotype in the second strain background (NSH57). The one gene locus, HP0503, which was predicted to give a phenotype in both strains did so, as did HP0137, for which there were no data from the NSH57 screen. When the four genes that gave discordant data were examined, HP0022 and HP0529 showed a strain-specific phenotype as expected. HP0368 and HP1028, however, gave a phenotype in both strain backgrounds. We examined the location of the transposon for the clones detected in the NSH57 screen and found an insertion in gene HP0368 at nucleotide 40 (out of 399) of the coding sequence, and the insertion in gene HP1028 was at nucleotide 150 (out of 450).
For the NSH57 screen, the insertion in HP1028 appeared attenuated in some but not in all mice and thus was not considered a candidate mutant. In addition to HP0022 and HP0529, four additional genes showed a strain-specific phenotype.
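The corrected competitive index used in these competition experiments can be written out as a short Python sketch. The function and parameter names are hypothetical and the CFU values in the usage lines are invented for illustration; only the formula (output mutant/wild-type ratio divided by the input ratio) comes from the text.

```python
def competitive_index(mutant_out, wildtype_out, mutant_in, wildtype_in):
    """Corrected competitive index for a mixed infection.

    mutant_out, wildtype_out: CFU recovered from the stomach.
    mutant_in, wildtype_in:   CFU enumerated by plating the inoculum.
    """
    if wildtype_out <= 0 or mutant_in <= 0 or wildtype_in <= 0:
        raise ValueError("wild-type output and both input counts must be > 0")
    output_ratio = mutant_out / wildtype_out
    input_ratio = mutant_in / wildtype_in
    return output_ratio / input_ratio


# Invented example values: a mutant that is outcompeted about 100-fold.
ci_attenuated = competitive_index(
    mutant_out=100, wildtype_out=10_000, mutant_in=1.2e6, wildtype_in=1.0e6
)

# A mutant that behaves like the wild type gives an index of 1.
ci_neutral = competitive_index(
    mutant_out=500, wildtype_out=500, mutant_in=1_000, wildtype_in=1_000
)
```

Dividing by the input ratio matters because a nominal 1:1 inoculum is rarely exactly 1:1 once plated; an index well below 1 indicates that the mutant is outcompeted by the wild type.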
The strain-specific differences in colonization phenotypes were further investigated for two genes by an independent assay, an ID50 experiment (Table 4). We chose HP0503 as an example of a gene locus that gave a phenotype in both strain backgrounds and HP1081 as a gene locus that showed a strain-specific phenotype. We were not able to recover bacteria from mice infected at any concentration for the HP0503 mutant in the NSH79 strain background, indicating that the ID50 is at least 470-fold higher than that of wild-type bacteria. For the NSH57 strain background, the ID50 was 30-fold higher than that of wild-type bacteria. As expected, the HP1081 null allele has a higher ID50 than the wild type in the NSH79 background, in this case, 2,500-fold higher. By contrast, in the NSH57 background, the ID50 was essentially indistinguishable from that of wild-type bacteria (1.5-fold higher). Thus, the ID50 experiments further support a role for both of these genes in colonization in the NSH79 strain background and a strain-specific colonization defect for HP1081.
As mentioned in the introduction, there are several phenotypes known to affect colonization, including motility and urease activity. To address the possible mechanisms by which these novel candidate colonization genes might be operating, we checked both the motility and the urease activity of the null allele-containing strains used to recheck the colonization phenotype. All 10 mutant strains retained both robust motility and urease activity in the NSH57 background (data not shown). Interestingly, in the NSH79 background, three mutants had attenuated urease activity (HP0137, HP0368, and HP1081). This was the case for multiple independent transformation clones, indicating that it was not due to a second mutation that had occurred during transformation. The annotation of these genes does not give any clues to their function, and they lay distal from all the genes known to be involved in urease enzyme synthesis and activity. Interestingly, two of these genes (HP0137 and HP0368) also gave a phenotype when mutated in the NSH57 background, where they do not alter urease activity. This indicates that these genes must affect an additional process in this strain background that is required for colonization. The strain-specific phenotype of HP1081, however, may be due to its effect on urease activity. Similarly, the HP0963 mutants showed a motility defect by microscopic observation and motility in soft agar exclusively in the NSH79 strain background (not shown). Thus, the strain-specific phenotype of this gene locus may result from its effect on flagellar biosynthesis or function in the NSH79 strain background.
Two of the genes we retested are part of the cag PAI. HP0529 encodes a homologue of the Agrobacterium tumefaciens virB8 gene, which is a core component of the T4SS, and HP0538 encodes CagN, a protein that localizes to the inner membrane but has no known effects on the induction of IL-8 or CagA translocation into host cells (8, 20). Mutation of these genes displayed a phenotype only in the NSH79 strain background. While the NSH79 cag PAI T4SS appears partially attenuated (Fig. 1B), induction of host cell IL-8 secretion was further diminished by the mutation of the virB8 homologue HP0529 and not by the mutation of cagN (HP0538). This indicates that NSH79 induced IL-8 secretion by virtue of its partially functional cag PAI T4SS and not other bacterial components (Fig. 1D). We were unable to assess the contributions of HP0529 and HP0538 to CagA translocation in NSH79 because the wild-type parent showed no detectable CagA phosphorylation in our assay system, though it remains possible that some CagA does become translocated into host cells during mouse infection. These results suggest that a partially functional cag PAI T4SS provides a competitive advantage in the NSH79 strain background but not in the NSH57 strain background.
Large-scale screens for colonization and persistence by a number of bacterial pathogens in various animal models have begun to give insights into both the challenges bacteria face during their sojourn in mammalian hosts and the specific molecules used to reprogram the host and facilitate survival. Prerequisites for such studies include genome-saturating mutant libraries and reliable animal models. We recently developed a comprehensive transposon-based mutant library for H. pylori (54) and here screened a subset of this library in a now well-established mouse model of stomach colonization and persistence using C57BL/6 mice. The H. pylori strain used to generate our initial library does not readily infect this model. To overcome this obstacle, we transformed the library into two different mouse-infecting strains (NSH79 and NSH57). Characterization of NSH57 revealed that unlike many mouse-adapted human clinical isolates, including NSH79, it retains the ability to assemble a fully functional T4SS that can efficiently translocate the CagA effector and induce IL-8 production in gastric epithelial tissue culture cells. Thus, NSH57 may be particularly useful in assaying pathogenicity in the mouse model, since the cag PAI appears to retain full activity.
While screening of very-high-complexity libraries in single animals has been possible for systemic infection models such as intraperitoneal mouse inoculation with Salmonella enterica and Mycobacterium tuberculosis (13, 55), colonization of mucosal surfaces often results in survival bottlenecks such that clones no longer show independent action (37, 40). These bottlenecks presumably result from innate defenses of mucosal barriers. In our study, we empirically determined an optimal pool size of 48 clones. This allowed us to screen 1,200 transposon clones from two libraries in different strain backgrounds for a total of 2,400 clones. We used whole-genome microarrays to monitor the behavior of individual clones in the pools and to identify those that could not survive. Our method of detecting the transposon insertion site on the microarray relied on a semirandom PCR amplification step of the DNA sequences flanking the transposon, using an anchored random primer expected to anneal approximately once in every 1,000 bases in the H. pylori genome (54). In practice, not all transposon-flanking sequences are amplified efficiently. We further required a microarray signal in the amplifications from both sides of the transposon to eliminate technical artifacts. This resulted in the detection of only 1,378 clones out of 2,400, or 57%. These insertions mapped to 758 genes, of which 223 had a predicted colonization defect. A recent signature-tagged mutagenesis-based screen of H. pylori stomach colonization using a gerbil infection model queried 960 transposon mutants and identified 252 candidate mutants corresponding to 47 genes (31). These two in vivo screens yielded similar percentages of candidate mutants (29% in our screen versus 26% in the previous screen). This suggests that a high percentage of the H. pylori genome is devoted to the ability to colonize and persist in the stomach.
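The percentages quoted in this paragraph follow directly from the counts given in the text, as the short Python check below shows. The helper name percent is hypothetical; all counts are taken from the paragraph above.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Clones detected from both sides of the transposon, out of all clones screened.
detection_rate = percent(1378, 2400)

# Genes with a predicted colonization defect, out of all genes measured.
candidate_gene_rate = percent(223, 758)

# Gerbil screen (31): candidate mutants out of transposon mutants queried.
gerbil_candidate_rate = percent(252, 960)

rounded = [round(r) for r in (detection_rate, candidate_gene_rate, gerbil_candidate_rate)]
```

Note that the 29% figure is computed per gene (223 of 758), whereas the 26% figure for the gerbil screen is computed per mutant (252 of 960).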
When comparing our data to the above-mentioned screen for colonization of the gerbil stomach, we collected data for 22 of the 47 in vivo essential genes described previously (31). We detected a colonization defect for 14 of these genes, including four urease genes, five genes involved in motility or chemotaxis, alpha-ketoglutarate permease (HP1091), UDP-glucose 4-epimerase (HP0360), a gene with homology to an exopolysaccharide biosynthesis protein (HP0366), and two hypothetical protein-encoding genes (HP0486 and HP1525). None of these fourteen mutants were retested in the previous work, so our data represent the first independent analysis of many of these genes' roles in colonization. The remaining eight genes for which there were data in our screen did not make our cutoff point as candidate mutants, either because they were not measured in all the mice of a particular pool or because there were conflicting data from different pools. Since we do not know the precise site of the transposon insertion, it may be that the insertions we measured were not inactivating. Alternatively, since the previous work was done using a different bacterial strain background and a different host, these genes may have strain-specific or host-specific effects.
In addition to the above-mentioned screen, several groups have assessed the roles of individual genes in colonization, and many of these genes were identified by our screen. Examples of genes that were previously shown to compromise colonization in the mouse model when mutated, and that were identified as colonization defective in our screen, include the urease genes (16, 22, 62), the nickel transporter nixA gene, the fur gene (10), the bacterial ferritin pfr gene (64), the β-1-4 galactosyl transferase gene (18), several chemotaxis genes (21, 59), and the flagellar sheath-localized adhesin hpaA gene (11). A hallmark of H. pylori infection is infiltration of neutrophils that produce reactive oxygen and nitrogen species, exposing both the host and the bacteria to DNA and protein damage. Thus, genes involved in protection from oxidative stress or repair of DNA damage might be expected to play an essential role in vivo. We found candidate mutants linked to several stress-associated genes, including superoxide dismutase (sodB) and a peroxyredoxin-thioredoxin-like protein-encoding gene (HP0136), which previous work had shown to be essential (56, 65), and a gene not previously tested, that encoding a heat shock protein (HSP90) homologue (HP0210). Catalase (katA) (25) and γ-glutamyltranspeptidase (14, 24), an enzyme involved in glutathione metabolism, showed subtle or variable effects on colonization and persistence in our screen and in the published literature. Similarly, two genes involved in DNA repair, ruvC (34) and the endonuclease III gene (46, 51), gave variable colonization phenotypes at early time points in other studies and gave inconsistent results in our screen. However, transposon mutants mapping to two DNA glycosylases (HP1347 and HP0602) were attenuated in our screen, suggesting that the ability to repair DNA damage is likely very important for persistence in vivo.
Another group of genes that has been investigated in some detail in animal models is the cag PAI, which encodes a T4SS that translocates CagA into host cells and induces proinflammatory cytokine production. This pathogenicity island is variably present in human clinical isolates and has been associated with more-severe pathological outcomes. In humans it is thought not to be required for establishing infection since it is not present in all strains. As with human infection, some mouse-colonizing strains do not contain the PAI (60). Murine adaptation of cag+ human clinical isolates often results in attenuation of PAI activity (49), and this is the case for NSH79, one of the strain backgrounds used in our experiments. While many insertions within the PAI did not give a colonization phenotype, we confirmed by construction of independent null alleles that disruption of HP0529, an Agrobacterium tumefaciens virB8 homologue, and of HP0538 (cagN) conferred a competitive defect in the NSH79 background. Others have observed a competitive defect for PAI mutants at early, but not later, time points in the mouse model (35). The mechanism for this attenuation is not clear. There may be an energy cost to producing a nonfunctional PAI, although we could not measure any growth attenuation in vitro. Further analysis of the kinetics of infection and the precise localization of the bacteria may shed light on this puzzle.
One advantage of the microarray method for tracking transposon mutant behavior over most signature-tagged mutagenesis strategies is that it provides information for genes that have been tested and have not given a phenotype. For example, another group evaluated the importance of the Entner-Doudoroff pathway in vivo and found no defect for the mutation of edd (6-phosphogluconate dehydratase) (67). We found similar results for both edd and eda, a gene encoding another enzyme in this pathway. Similarly the amidases amiE and amiF previously were suggested to not be important for colonization when deleted in the SS1 strain background (9), similar to our data for the NSH79 background. In the NSH57 background, however, we observed a phenotype for both genes. The importance of these enzymes has been suggested by the profound role ammonia production plays both in acid protection and as a nitrogen source for H. pylori. Indeed, early testing of urease mutants in vivo showed a requirement for this enzyme even when the stomach pH was neutralized, suggesting that ammonia plays multiple roles during infection (16). AmiE is a highly expressed protein in the H. pylori cell (33, 61), and both amiE and amiF are regulated by pH (10, 38) and iron availability (39). The strain-specific requirement for the amidases may result from genetic variation in the various pathways that govern ammonia production and consumption. This hypothesis will require further testing.
While our colonization screen was not saturating, we queried more than half of the genome, allowing a global view of the H. pylori cellular processes important for establishing and maintaining infection in the stomach. As detailed above, many of the functional categories identified support results from similar analyses of mucosal surface colonization by other bacterial pathogens. Indeed, specific examples from each of these categories have been investigated on an individual gene basis in H. pylori, and our results confirm and extend these studies. Perhaps the more significant impact of this work is the identification of candidate colonization genes that may be specific to H. pylori infection of the stomach. The largest class of genes that showed attenuated colonization potential when interrupted by transposon insertion consisted of hypothetical genes that may be uniquely important to H. pylori biology. We validated the requirement for eight of these gene loci by making independent null alleles, assaying competition with wild-type bacteria, and in some cases, determining the dose for 50% infection. At this point, the mechanism by which these genes contribute to the infectious process is not clear, and it remains possible that disruption of the targeted genes affects the expression of neighboring genes that mediate the phenotype. In a few cases, we could measure strain-specific effects on urease activity or motility, which are processes known to be required for colonization. One of the genes (HP1028) that gave a strong phenotype in both strain backgrounds we tested was suggested as a putative virulence factor from a bioinformatics-based systems biology analysis of genes linked to those of the cag PAI (57). While that study did not include functional studies, the authors suggest that this putative secreted protein may have toxin-like properties based on protein folding prediction software.
Many of the genes in our overall screen as well as those we independently validated showed strain-specific colonization phenotypes. H. pylori strains differ substantially in both their gene complements and sequence polymorphisms. Thus, it is perhaps not surprising that the genetic requirements for colonization would differ in different strain backgrounds. Furthermore, recent studies have shown that strains differ in their anatomic distribution within the stomach even in the murine model (1), which could contribute to a differential requirement for some genes. It is tempting to focus on those genes required in multiple strain backgrounds as being the most important for colonization. Strain-specific genes, however, may regulate biological pathways that are equally important but show differential phenotypic buffering by the actions of unique genes or alleles in different strains. More-detailed analysis of the genes identified in this screen should yield a deeper appreciation of the pathogenic potential of various H. pylori strains and illuminate additional targets for antimicrobial strategies.
We thank Jutta Fero for excellent technical assistance, Olivier Humbert for critical reading of the manuscript, and members of the Salama laboratory for stimulating discussions. We thank the FHCRC Animal Health and Genomics Shared Resources for experimental support.
This work was supported by Public Health Service grant AI054423 from the National Institute of Allergy and Infectious Diseases and by a grant from the Pew Charitable Trusts Program in the Biomedical Sciences.
Editor: V. J. DiRita
Published ahead of print on 13 November 2006.
†Supplemental material for this article may be found at http://iai.asm.org/.