Conceived and designed the experiments: AK. Performed the experiments: KP SAM LEB AMH PS PREM. Analyzed the data: KP AK. Contributed reagents/materials/analysis tools: SA PREM DEB. Wrote the paper: AK.
Deletion of single genes from expanded gene families in bacterial genomes often does not elicit a phenotype thus implying redundancy or functional non-essentiality of paralogous genes. The molecular mechanisms that facilitate evolutionary maintenance of such paralogs despite selective pressures against redundancy remain mostly unexplored. Here, we investigate the evolutionary, genetic, and functional interaction between the Helicobacter pylori cysteine-rich paralogs hcpG and hcpC in the context of H. pylori infection of cultured mammalian cells. We find that in natural H. pylori populations both hcpG and hcpC are maintained by positive selection in a dual genetic relationship that switches from complete redundancy during early infection, whereby ΔhcpC or ΔhcpG mutants themselves show no growth defect but a significant growth defect is seen in the ΔhcpC,ΔhcpG double mutant, to quantitative redundancy during late infection wherein the growth defect of the ΔhcpC mutant is exacerbated in the ΔhcpC,ΔhcpG double mutant although the ΔhcpG mutant itself shows no defect. Moreover, during early infection both hcpG and hcpC are essential for optimal translocation of the H. pylori HspB/GroEL chaperone, but during middle-to-late infection hcpC alone is necessary and sufficient for HspB/GroEL translocation thereby revealing the lack of functional compensation among paralogs. We propose that evolution of context-dependent differences in the nature of genetic redundancy, and function, between hcpG and hcpC may facilitate their maintenance in H. pylori genomes, and confer robustness to H. pylori growth during infection of cultured mammalian cells.
Gene duplication provides the raw material for functional innovation, and is a source of genetic redundancy and phenotype robustness. Evolutionary theories on the fate of duplicate genes are based on the premise that gene duplication creates functional redundancy thereby relieving selection pressure on one or both gene copies. Thus, mutations normally deleterious to gene function escape purifying selection and, over time, the mutation-containing gene is pseudogenized and lost from the population owing to genetic drift. However, duplicate genes are retained when accumulating mutations cause complementary loss of functional attributes in each copy such that both are required for full functionality. Rarely, some duplicate genes are retained because accumulated mutations confer a new advantageous function. Also, a duplicate gene may be retained because it provides a buffer against deleterious mutations in the ancestral gene. However, because true genetic redundancy is evolutionarily unstable, at best transient, and only theoretically sustainable on an evolutionary time-scale, its contribution to maintenance of duplicate genes remains a subject of intense debate.
Sequence-related gene families produced by gene duplications are frequently observed in the genomes of bacteria that maintain long evolutionary associations with their eukaryotic hosts. In such bacteria, gene family expansions are likely associated with host-specific adaptations. Paradoxically, deletion of single genes from expanded gene families often has little or no phenotypic consequence (e.g., on fitness), implying redundancy among paralogs. Conceptually, a lack of a notable phenotype is generally equated with the ability of a system to continue functioning after genetic change, although underlying molecular mechanisms for this mutational robustness remain largely unidentified. In the context of bacterial infection and virulence, whether gene dispensability in expanded gene families reflects the ability of paralogs to functionally compensate each other or a lack of essentiality of ancestral function is not known. Moreover, bacterial genes that are redundant and not under sufficient selection should be rapidly deleted, and the evolutionary drive toward small specialist genomes in host-adapted bacteria should exacerbate loss of redundant genes. The mechanisms that facilitate evolutionary maintenance of expanded gene families in bacterial genomes remain largely unexplored.
Helicobacter pylori is an important human pathogen in that it can establish decades-long infections and is the main cause of serious gastric diseases, including ulcers and cancer. Nearly 17% of the H. pylori genome is composed of duplicate genes, which are categorized into several gene families. Prominent among these is the Sel1-like gene family, which arose from H. pylori genome-specific expansion and contains eight diverse, rapidly evolving genes. This gene family is characterized by the presence of the modular Sel1-like repeat (Slr; PFAM entry, PF08238), which is eukaryotic in origin. Many of the Slr-containing genes encode Helicobacter cysteine-rich proteins (Hcp), which are highly immunogenic secreted proteins thought to contribute to H. pylori infection and pathogenesis. However, little is known about the genetic relationships among Slr-containing paralogs or about their functional relevance in the context of H. pylori infection or pathogenesis. For example, in H. pylori strain HpG27MA, HcpG (HpG27_1469) is 53% similar to HcpC (HpG27_1039) (Fig. 1a; Table S6 in File S1) and is predicted to adopt a helical conformation similar to that of HcpC (Fig. 1b). However, whereas 1) the Slr-gene hcpG (also called hsp12) is strain-specific, highly polymorphic, and apparently expressed under stress, and 2) the divergence of hcpG from its closest paralog in the H. pylori genome, hcpC, is driven by positive selection, indicating functional divergence between the two paralogs, hcpG appears dispensable to H. pylori growth in vitro under both normal and stress conditions. Thus, does the dispensability of hcpG to H. pylori growth reflect that hcpG is genetically redundant with hcpC? And, does hcpC functionally compensate for the lack of hcpG? In the present study, we investigated the molecular evolutionary, genetic, and functional relationship between hcpG and hcpC to explore their relevance to H. pylori pathogenesis, and to gain general insight into the mechanisms that maintain duplicate genes in expanded bacterial gene families.
This study began with our observation that hcpG was present in some fully sequenced H. pylori genomes but absent from others (Table S6 in File S1). Furthermore, in all genomes analyzed in this study hcpG appears to be a single copy gene whose chromosomal context is conserved (Table S6 in File S1). To confirm and extend these observations, we first sought to determine whether hcpG was present in 166 geographically diverse H. pylori isolates (Fig. S1 and Table S1 in File S1). Seventy-nine of the isolates were positive for hcpG with PCR primers located within genes flanking hcpG (Table S2 in File S1). PCR primers within hcpG-specific conserved internal regions confirmed the absence of hcpG among isolates that initially tested negative for hcpG. Thus, when present, the chromosomal context of hcpG appears conserved. We observed striking variation in the molecular size of hcpG in different H. pylori isolates (Fig. S3A in File S1). Thus, we next examined the molecular basis and effect of size variation on the modular domain architecture of hcpG alleles. The complete nucleotide sequences of the 79 PCR products described above revealed that 15 had a non-Slr containing gene in place of hcpG. Of the remaining 64 hcpG alleles, 28 encoded pseudogenes that contained premature stop codons in their nucleotide sequence introduced mostly via frameshift mutations (Fig. S3b in File S1). In hcpG nucleotide sequences encoding functional proteins (i.e., >100 amino acids), the number of Slr modules ranged from two to seven. Accordingly, the predicted proteins ranged from 107 to 330 amino acids with strikingly different domain architectures (Fig. 1c). To better understand hcpG evolution, we analyzed 46 unique alleles of hcpG phylogenetically. ML analysis revealed no significant geographic clustering of sequences or of different domain architectures (Fig. 1c). 
Next, we estimated selection pressures on individual hcpG codons and in branches of the phylogenetic tree using a combination of codon-based models of sequence evolution, ML, and Bayesian methods. Codon models that incorporated positive selection (ωS >>1) within the estimated parameters fit the data significantly better than those that did not (Table 1), suggesting that functional hcpG alleles are subject to heterogeneous selective pressures. Moreover, Bayesian analysis confidently identified 11 sites under positive selection (Bayesian probability ≥0.99; ωS=4.46) (Table 1). Positive selection was also evident in several branches of the hcpG phylogenetic tree (Fig. 1c and Table 1; Table S7 in File S1). Thus, we conclude that hcpG is highly polymorphic and present in only 38% of H. pylori strains, that some H. pylori strains only contain pseudogenized alleles while other strains only contain functional hcpG variants, and that positive selection maintains and drives the divergence of extant, functional hcpG alleles.
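The allele counts reported for the hcpG survey can be tallied in a few lines. This is only a bookkeeping sketch: the numbers come from the text, while the variable names are illustrative and not part of the original analysis.

```python
# Counts taken from the text; variable names are illustrative only.
isolates_screened = 166
pcr_positive_flanking = 79   # amplified with primers in genes flanking hcpG
non_slr_gene_in_locus = 15   # locus occupied by a non-Slr gene instead of hcpG
pseudogenes = 28             # hcpG alleles carrying premature stop codons

hcpg_alleles = pcr_positive_flanking - non_slr_gene_in_locus
intact_alleles = hcpg_alleles - pseudogenes
hcpg_prevalence = hcpg_alleles / isolates_screened

print(f"hcpG alleles: {hcpg_alleles}")
print(f"alleles without premature stops: {intact_alleles}")
print(f"hcpG prevalence among isolates: {hcpg_prevalence:.1%}")
```

The prevalence computed this way (64 of 166 isolates) corresponds to the roughly 38% figure quoted in the conclusion of this section.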
To better understand the divergence mechanisms of duplicate genes, we next characterized the evolutionary dynamics of hcpC. Although several slr genes are encoded in the H. pylori genome, we focus here on hcpC because it is the closest paralog of hcpG. We found that in contrast with hcpG, which was present in only a subset of H. pylori strains, hcpC was present in all H. pylori isolates screened. Complete nucleotide sequence analysis of the hcpC alleles from 100 H. pylori strains revealed that, unlike hcpG, their modular domain architecture was conserved. Also, hcpC alleles exhibited less overall nucleotide diversity than did hcpG alleles (Table S8 in File S1). Population genetic and ML phylogenetic analyses of 81 unique sequences revealed that unlike hcpG alleles, hcpC alleles clustered according to their geographic origins (Fig. 1d; Table S9 in File S1). Such geographic clustering is typical of H. pylori gene sequences. Because of striking differences in overall genomic conservation and evolution of hcpG and hcpC, we next examined the selective pressures on individual hcpC codons and branches of the hcpC phylogenetic tree. We found that hcpC codons experienced heterogeneous selective pressures similar to those on hcpG codons (Table 2). Corresponding Bayesian analysis confidently (Bayesian probability >0.99) identified 23 hcpC codons under positive selection (ωS=2.73). All 23 sites mapped to the molecular surface of HcpC, some in close proximity to the experimentally identified peptide binding site (Table 2; Fig. S4 in File S1). Thus, unlike hcpG, in which large-scale domain architecture changes are positively selected (Fig. 1c) and likely drive gain or loss of protein function, we predict that positive selection may simply fine-tune hcpC functions by modulating its interaction with other bacterial or host proteins.
Most notably, although evolutionary rates vary significantly in the hcpC phylogenetic tree, few hcpC lineages preferentially accumulate non-synonymous substitutions (Fig. 1d, Table S10 in File S1). Moreover, the McDonald-Kreitman test revealed that hcpC evolution was predominantly neutral in different populations (Table S9 in File S1). Note that this test was not useful on hcpG sequences because of the lack of any fixed differences and geographic partitioning among hcpG alleles (Fig. 1c). Thus, we conclude that although divergence of both hcpG and hcpC alleles is driven by positive selection, the intensity of positive selection is stronger on hcpG (ωS=4.46 for hcpG versus ωS=2.73 for hcpC). Consequently, functional constraints on hcpC are likely stronger and manifest in its overall structural and genomic conservation.
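Statements that a codon model allowing positive selection fits significantly better than one that does not are conventionally supported by a likelihood ratio test between nested models. The sketch below assumes such a test with two degrees of freedom, as is usual for PAML site-model pairs such as M1a versus M2a or M7 versus M8; the function name and the log-likelihood values are hypothetical and are not taken from Table 1 or Table 2.

```python
import math


def lrt_df2(lnl_null: float, lnl_alt: float) -> tuple[float, float]:
    """Likelihood ratio test for nested models that differ by 2 parameters.

    For 2 degrees of freedom the chi-square survival function has the
    closed form exp(-x / 2), so no external statistics library is needed.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    stat = max(stat, 0.0)  # guard against small negative values from optimizer noise
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods for illustration only (NOT the paper's values).
stat, p = lrt_df2(lnl_null=-2510.4, lnl_alt=-2498.1)
print(f"2*deltaLnL = {stat:.2f}, p = {p:.2e}")
```

A small p-value here only indicates that the selection model fits better; identifying which codons are under selection is a separate step, handled in the text by the Bayesian site probabilities.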
To test the functional essentiality of hcpG, we determined whether it was expressed in broth cultures during H. pylori growth and in cultured AGS cells during H. pylori infection. Under both conditions, we detected robust hcpG mRNA transcripts from diverse strains (Fig. 2a and 2b). We next quantified the hcpG mRNA transcripts during infection of cultured AGS cells with the H. pylori strains HpG27MA and JHp99, both of which have been extensively characterized. This analysis revealed that 3 and 6 h after infection with HpG27MA, hcpG expression was up-regulated 2.0- and 2.5-fold, respectively. However, following infection with JHp99, hcpG expression was up-regulated 25-fold at 3 h and then declined to 16-fold at 6 h (Fig. 2c). These data suggest that hcpG is expressed in vitro and during infection and that its expression is likely to be differentially regulated in distinct H. pylori strains.
To assess the contribution of hcpG to H. pylori growth and infection, the growth of the ΔhcpG mutant strain in broth culture and during infection of cultured AGS cells was compared with that of the WT HpG27MA strain. This comparison revealed that the ΔhcpG mutant had no significant growth defect either in broth culture or during infection of cultured AGS cells (Fig. 2d and 2e).
We next determined whether the mild up-regulation of hcpG transcript expression seen in HpG27MA resulted in protein production. For this purpose we reverse engineered the ΔhcpG mutant strain by replacing the rpsL,catR cassette with a hcpG::6xHis fusion assembly (Fig. S2 in File S1). We detected the HcpG::6xHis fusion protein following infection of AGS cells with the HpG27MA::hcp-6xhis strain (Fig. 2f). Furthermore, the ΔhcpG mutant and HpG27MA::hcp-6xhis strains both triggered bacterial CagA translocation and subsequent activation of cellular MAPK (ERK2) in AGS cells, suggesting that both strains initiated normal infection-induced signaling events (Fig. 2f).
We next assayed the growth and fitness of the ΔhcpG mutant strain relative to that of the WT strain during competition in broth culture or during infection in cultured AGS cells. These competition assays revealed no significant reduction in growth or fitness of the ΔhcpG mutant strain in broth culture or during infection of cultured AGS cells (Figs. 3a and 3d; Figs. S5a and S5d in File S1). Thus, data from absolute and relative measures of fitness suggest that hcpG is dispensable to H. pylori growth and infection even though hcpG is expressed in vitro and during infection in cultured AGS cells.
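Relative fitness in a co-culture competition is often summarized as a competitive index. The sketch below shows one common definition, the mutant-to-WT ratio recovered at the end of the assay divided by the ratio in the inoculum. It is an assumption for illustration: the exact fitness measure used in this study is described in the supplementary methods, and the CFU counts here are hypothetical.

```python
def competitive_index(mut_in: float, wt_in: float,
                      mut_out: float, wt_out: float) -> float:
    """Mutant:WT ratio at output divided by mutant:WT ratio at input.

    A value near 1 indicates no fitness difference; values below 1
    indicate that the mutant was outcompeted by the WT strain.
    """
    if min(mut_in, wt_in, mut_out, wt_out) <= 0:
        raise ValueError("all CFU counts must be positive")
    return (mut_out / wt_out) / (mut_in / wt_in)


# Hypothetical CFU counts, not data from this study.
ci_neutral = competitive_index(1e6, 1e6, 5e7, 5e7)  # equal recovery
ci_defect = competitive_index(1e6, 1e6, 1e7, 5e7)   # mutant under-recovered
print(ci_neutral, ci_defect)
```

Normalizing by the input ratio matters because small deviations from a 1:1 inoculum would otherwise be misread as fitness differences.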
We next asked whether the dispensability of hcpG indicated lack of essentiality of its ancestral function or some form of genetic buffering by its paralog, hcpC. To answer this question, we engineered two mutant H. pylori strains for use in growth and fitness assays: 1) HpG27MAΔhcpC, and 2) the HpG27MAΔhcpG,ΔhcpC double mutant [See supplementary methods in File S1]. We found that compared to the WT HpG27MA strain, the ΔhcpC mutant had no growth defects in pure broth culture (Fig. 2d). However, we observed a small but significant growth defect in the ΔhcpC mutant in cultured AGS cells 24 h after infection (0.01<P<0.05; Fig. 2e). The ΔhcpG,ΔhcpC double mutant exhibited no defects for up to 48 h of growth in pure broth culture, but measurements at 56 h revealed a small but significant growth defect (0.01<P<0.05; Fig. 2d). Similar to the ΔhcpC mutant, the ΔhcpG,ΔhcpC double mutant had a small growth defect in cultured AGS cells 24 h after infection (0.01<P<0.05; Fig. 2e). These data suggest that hcpC may be required for optimal H. pylori growth late during infection of cultured AGS cells and that deletion of both hcpG and hcpC is mildly deleterious for growth in late broth culture and during late infection. Thus, collectively these results indicate a possible genetic interaction between hcpG and hcpC.
To confirm and clarify the nature of the genetic interaction between hcpG and hcpC, we co-cultured the ΔhcpC and ΔhcpG,ΔhcpC mutants, respectively, with the WT strain in broth cultures and during infection in cultured AGS cells. The ΔhcpC mutant exhibited no growth defects in competition assays in broth culture (Fig. S5b and S5e in File S1) or in cultured AGS cells 6 h after infection. However, unlike the ΔhcpG mutant, the ΔhcpC mutant had a significant growth defect and reduced fitness relative to that of the WT strain in cultured AGS cells 24 h after infection (Fig. 3b and 3e). Strikingly, the ΔhcpG,ΔhcpC double mutant, unlike the ΔhcpC and ΔhcpG single mutants, had significant fitness reduction relative to that of the WT strain in cultured AGS cells 6 h after infection (Fig. 3c and 3f). Moreover, the ΔhcpG,ΔhcpC double mutant showed significant fitness reduction, even more than that observed for the ΔhcpC mutant, in cultured AGS cells 24 h after infection (Fig. 3c and 3f). Similar to the ΔhcpG and ΔhcpC mutants, the ΔhcpG,ΔhcpC double mutant experienced no fitness reduction when co-cultured with the WT strain in broth culture (Fig. S5c and S5f in File S1). Thus, we identify two categories of genetic interactions between hcpG and hcpC depending on the temporal context of H. pylori infection (Fig. 3g). First, during early infection, hcpG and hcpC are completely redundant in that disrupting either gene alone has no effect on H. pylori growth or fitness, but disrupting both genes causes significant reduction in H. pylori growth and fitness. Second, during late infection, hcpG and hcpC are quantitatively redundant in that the fitness phenotype of the ΔhcpC mutant is exacerbated in the ΔhcpG,ΔhcpC double mutant. Thus, hcpG and hcpC are coupled in a redundant relationship that differs depending on the temporal context of the infection. We conclude that context-dependent redundant relationships between hcpG and hcpC contribute significantly to the mutational robustness of H. pylori growth during infection and likely contribute to the retention of hcpG along with hcpC in select H. pylori genomes.
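The two categories of redundancy defined above can be restated as a simple decision rule over single- and double-mutant phenotypes. The following is a minimal sketch: the function name, the numeric "defect" scale (0 means no reduction relative to WT), and the threshold standing in for statistical significance are all assumptions made for illustration, and the example values are hypothetical rather than measured.

```python
def classify_redundancy(defect_a: float, defect_b: float, defect_ab: float,
                        threshold: float = 0.0) -> str:
    """Classify a paralog pair from fitness defects relative to WT.

    defect_a, defect_b: defects of the two single mutants.
    defect_ab: defect of the double mutant.
    threshold: stand-in for a significance cutoff (an assumption; the
    study itself relies on statistical tests, not a fixed cutoff).
    """
    a_hit = defect_a > threshold
    b_hit = defect_b > threshold
    ab_hit = defect_ab > threshold

    if not ab_hit:
        return "no phenotype"
    if not a_hit and not b_hit:
        # Neither single mutant is affected, but the double mutant is.
        return "complete redundancy"
    if a_hit != b_hit and defect_ab > max(defect_a, defect_b):
        # One single mutant is affected and the double mutant is worse.
        return "quantitative redundancy"
    return "other"


# Hypothetical defect values mirroring the qualitative pattern in the text.
early = classify_redundancy(defect_a=0.0, defect_b=0.0, defect_ab=0.3)
late = classify_redundancy(defect_a=0.0, defect_b=0.2, defect_ab=0.5)
broth = classify_redundancy(defect_a=0.0, defect_b=0.0, defect_ab=0.0)
print(early, late, broth, sep=" | ")
```

Here `defect_a` plays the role of ΔhcpG and `defect_b` that of ΔhcpC; the rule is deliberately coarse and does not capture effect sizes or their uncertainty.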
We next considered molecular mechanisms underlying the genetic redundancy of hcpG and hcpC. The Helicobacter HspB/Hsp60/GroEL chaperone was identified, using a combination of co-immunoprecipitation and mass spectrometry, as a potential interacting partner of HcpC. Using ELISA we confirmed that HcpC can bind directly to HspB (Fig. 4a). HspB/Hsp60/GroEL is an essential chaperone that is cytoplasmic in most bacteria except H. pylori, in which it can also be translocated to the bacterial surface or extracellular milieu. The translocated HspB protein then associates with the UreB subunit of the H. pylori urease complex and contributes to H. pylori pathogenesis via multiple pathways. The mechanism of HspB translocation is controversial, although it is likely translocated actively. Because Slr-containing proteins facilitate protein-protein interactions, we hypothesized that the redundant partners HcpC and HcpG may mediate or modulate HspB translocation.
To determine whether HspB translocation is affected by hcpG or hcpC, we analyzed HspB expression in unpermeabilized WT and mutant H. pylori strains in broth cultures and during infection of AGS cells using fluorescence-activated cell sorter (FACS) analysis. We found that 3 h after infection, HspB fluorescence was significantly reduced in the ΔhcpG and ΔhcpC single mutants and in the ΔhcpG,ΔhcpC double mutant (Fig. 4b). Strikingly, 6 h after infection, HspB fluorescence in the ΔhcpG mutant strain recovered nearly to the level seen with the WT strain (Fig. 4c); by 12 and 24 h after infection, the ΔhcpG mutant and WT strain demonstrated similar HspB fluorescence levels (Fig. 4d and 4e). In contrast, HspB fluorescence remained significantly reduced throughout the infection in the ΔhcpC mutant and ΔhcpG,ΔhcpC double mutant (Fig. 4b–4e). Importantly, parallel experiments with pure broth cultures maintained for up to 56 h revealed similar HspB fluorescence levels in the WT and mutant strains (Fig. S6a–c in File S1). Thus, these data demonstrate apparent modulation of HspB translocation specifically in response to infection-induced signals.
To confirm that the modulation of HspB expression did not reflect generalized disruption of infection-induced signaling events in response to deletion of hcpG and/or hcpC, we monitored CagA and MAPK (ERK2) expression levels in infected permeabilized AGS cells using FACS analysis. We found that the mutant and WT strains described above all similarly triggered the release of bacterial CagA accompanied by activation of cellular MAPK (Fig. 5). Thus, hcpG and hcpC specifically modulate HspB expression whereas independent H. pylori infection-induced signaling events remain unaffected.
We next asked whether hcpG and hcpC affected the transcriptional regulation of hspB, which in turn may alter HspB expression. Real-time (RT) PCR analysis revealed no significant alterations in hspB expression levels in the mutant and WT strains following infection of AGS cells (Fig. 6a). Because HspB is known to associate with the UreB subunit of the urease complex we also measured the ureB expression levels to rule out indirect causes of altered HspB expression. We found no significant alteration in ureB expression in the mutant and WT strains at equivalent intervals during infection (Fig. 6b). Thus, these data suggest that altered HspB fluorescence does not result from modulation of hspB expression in hcpG or hcpC mutants.
We next determined whether the deletion of hcpG resulted in the upregulation of hcpC expression, which, if true, could explain the apparent normalization of GroEL/HspB fluorescence to WT levels in the ΔhcpG mutant (Fig. 4c–4e). We found that hcpC expression in the ΔhcpG mutant was not significantly affected (Fig. 6c). Similarly, deletion of hcpC had no significant effect on hcpG expression. Together these data suggest that hcpC and hcpG are transcriptionally uncoupled.
Taken together, our results reveal two categories of functional interactions between hcpG and hcpC depending on the temporal context of H. pylori infection (Fig. 4f). First, during early infection, hcpG and hcpC are both essential for optimal HspB translocation, and neither of them can functionally compensate for deletion of the other gene. Thus, hcpG and hcpC are selected independently to perform HspB translocation. Second, during middle to late infection, hcpC alone is necessary and sufficient for optimal HspB translocation, whereas hcpG is not required for it. Given the quantitatively redundant fitness phenotypes exhibited by hcpG and hcpC, our results suggest that these two genes are likely important because of their capacity to perform distinct functions. We conclude that hcpG and hcpC partially overlap in their function but lack the generic functional backup capacity expected among genetically redundant paralogs.
Genetic buffering interactions are most commonly studied by measuring the fitness of an organism under standard laboratory growth conditions, in which the spatial and temporal flux in the organism’s interactions with its environment is inherently underestimated. Moreover, in most cases, the molecular functions underlying such genetic interactions remain relatively unexplored. In our study, by taking into account the temporal context of H. pylori infection modeled in cultured mammalian cells, we uncovered the simultaneous occurrence of different types of genetic buffering interactions between the H. pylori paralogs hcpG and hcpC. A recent study reported that multiple genetic interactions in yeast paralogs conferred robustness to yeast signaling and regulatory networks. In bacteria, studies of duplicate genes have historically focused on tandem gene duplications (gene amplification) but rarely on expanded gene families. To the best of our knowledge, no previous reports have described multiple genetic buffering interactions among duplicate genes from expanded bacterial gene families in the context of pathogen-host interaction. Moreover, our data caution against the prevalent notion that absence of phenotypes upon deletion of single genes from expanded gene families reflects either compensation of function by other paralogs or lack of essential function of paralogs in mediating pathogen-host interactions.
Previous studies have proposed that maintenance of redundant paralogs can have several selective advantages, including mutational robustness, robustness against random fluctuations in gene expression, and robustness of regulatory signaling networks. In bacteria, tandem gene duplications are known to contribute to specific environmental adaptations. Our present findings show that genetic redundancy of hcpG and hcpC contributes significantly to the mutational robustness of H. pylori growth specifically during infection of cultured AGS cells (Figs. 3a–3f). Genetic redundancy among paralogs generally tends to be condition-dependent; thus, it is not surprising that the effects of hcpG and hcpC deletions on H. pylori growth and fitness were apparent during infection, a physiologically more relevant condition, and not in pure broth culture (Figs. S5a–S5f in File S1). What is surprising, however, is that the nature of this genetic redundancy switches from complete to quantitative depending on the temporal context of the infection (Fig. 3g). This suggests that hcpG and hcpC are also coupled via infection-induced regulatory links that mediate such switches. Constituting such a regulatory module can be advantageous for H. pylori because, depending on when hcpG and hcpC are active, regulation of the distinct processes mediated by them (see discussion below) can be coupled or uncoupled in response to the temporal or spatial context of the infection.
The dependence of paralogous redundancy on the context of H. pylori infection observed in the present study and in other earlier studies argues against a predominantly compensatory (backup) function of duplicate genes. During early infection, both hcpG and hcpC are essential for optimal HspB translocation, and neither of them functionally compensates for a lack of its redundant partner. This suggests that hcpG and hcpC are specialized in distinct manners to perform HspB translocation during early infection. This is intriguing, because from middle to late infection, HcpC alone appears to be necessary and sufficient for HspB translocation, whereas HcpG is dispensable and unable to functionally rescue HcpC deletion despite its apparent ability to mediate HspB translocation in early infection (Fig. 4b–4e). We ascribe the lack of generic backup functional capacity between hcpG and hcpC to two distinct factors: dosage amplification and functional or regulatory divergence.
Theoretical studies have suggested that duplicate genes whose products mediate stress responses or generally mediate organism-environment interactions can be retained in genomes of such organisms by positive selection for increased dosage. Specifically, surface-associated HspB appears to be important for gastric colonization early in H. pylori infection. Thus, efficient, rapid HspB translocation early during infection should favor successful H. pylori colonization. We also observed that during early infection, both hcpG and hcpC were expressed at relatively low levels in the WT HpG27 strain, and expression of both genes was not significantly altered when their redundant partner was deleted (Fig. 6c). Thus, given their relatively low expression levels, HcpG and HcpC appear independently selected because of their combined contribution to efficient and rapid HspB translocation in early H. pylori infection.
Temporal variation in the relative necessity of H. pylori paralogs for HspB translocation indicates the potential regulatory influence of infection-induced cellular signals on HcpG and HcpC activity and function. This regulatory influence is also suggested by our observation that hcpG could not functionally rescue hcpC deletion (Fig. 4b–4e). Thus, we speculate that whereas early infection-induced signals likely activate both HcpG and HcpC, the transition from early to middle-to-late infection predominantly elicits HcpC-activating signals (Fig. 4f). Two additional lines of evidence suggest functional divergence between hcpG and hcpC during late infection: (1) the additive effect of hcpG deletion on the ΔhcpC phenotype during late infection suggests that hcpG and hcpC likely perform distinct functions; and (2) positive (or diversifying) selection on extant hcpG and hcpC alleles also suggests potential functional divergence between HcpG and HcpC. Collectively, these data suggest that during late infection hcpG and hcpC are selected primarily for their divergent functions, which are likely regulated by infection-induced signals.
The context-variable redundancy of hcpG and hcpC described here may also have broader implications for understanding the evolution of gene duplications. Models explaining the maintenance of paralogs typically invoke functional subdivision and/or novelty in duplicate copies and are classified into multiple categories. Although our present data best fit the positive dosage model, in which paralogs are selected independently because of their cumulative contribution to the same function, additional models may be required because of different redundancy dynamics during late infection. The positive dosage model predicts two possible outcomes based on the strength of selection on cumulative hcpG and hcpC action: under strong selection, the duplicate copy (hcpG) may be quickly fixed, whereas under weak selection, a null mutation may become fixed by random genetic drift, resulting in hcpG pseudogenization and eventually loss of hcpG from H. pylori populations (Fig. 7). In highly variable environments, such as those in H. pylori’s more than three billion human hosts, the strength of selection on increased dosage may periodically wax and wane, leading to cyclical gene duplication and gene loss. Such variation in strength of selection may underlie the three distinct hcpG and hcpC genotype combinations we observed in H. pylori populations (Fig. 7). Thus, each strain-specific hcpG-hcpC genotype may reflect an individual host-specific adaptation of that H. pylori strain. Alternatively, if cyclical bursts of gene duplication and pseudogenization are common, then the presence of pseudogenized hcpG alleles in H. pylori populations could be re-interpreted to reflect inferior hcpG variants outcompeted by hcpG alleles better adapted to a specific biochemical niche. Such cyclical duplication-pseudogenization may permit selection to explore a wide mutational landscape, likely fixing only those hcpG variants that provide additional functional capability (Fig. 7).
The extreme genetic heterogeneity among functional hcpG alleles driven by positive selection further supports the possibility that hcpG alleles may encode functionally divergent proteins. Taken together, we propose that context-variable redundant behavior and coupling of paralogs via regulatory links generated by infection-induced signals may have wide-ranging implications for understanding the evolution of gene duplications, and may require additional subclassification of existing models.
Our study has identified two new potential bacterial determinants, HcpG and HcpC, which may contribute to H. pylori pathogenesis via regulation of HspB/GroEL/Hsp60 translocation or export to the bacterial surface. The H. pylori HspB/GroEL/Hsp60 appears essential for H. pylori colonization during early infection and for induction of innate immune responses, and can enhance angiogenesis and tumorigenesis. In most bacteria the HspB chaperone protein is cytoplasmic, but in H. pylori this protein is often found on the bacterial surface and in the extracellular milieu. The mechanism of HspB export is somewhat controversial. While Phadnis et al. argued that HspB is reabsorbed to intact cell membranes following its release into the extracellular milieu via autolysis, Vanet and Labigne showed that HspB/GroEL/Hsp60 more likely underwent active secretion rather than autolysis. Because of the apparent HcpG- and HcpC-dependent modulation of HspB translocation in intact unpermeabilized H. pylori cells, and the demonstration that HcpC can directly interact with HspB, we favor the idea that HspB may be actively secreted rather than exported via autolysis. The precise mechanism of how HcpC and HcpG might mediate HspB export, however, remains to be determined. It will be important to determine whether HcpG can also interact directly with HspB and whether HcpG and HcpC co-localize with HspB to the bacterial surface, and to identify the infection-induced signals that seem to temporally regulate the functions and/or activity of HcpC and HcpG in the context of infection.
The 166 H. pylori isolates included in this study were obtained from phylogenetically distinct European (Spain), African (The Gambia and South Africa), East Asian (Japan, South Korea), South Asian (India), and South American (Lima and Shimaa, Peru) populations (Figure S1 and Table S1 in File S1). All of these strains were obtained from patients who had sought medical attention and undergone endoscopic biopsies after giving their informed consent at their respective institutions. A detailed listing of strains, along with their culture, growth and maintenance conditions, is provided in supplementary methods available in File S1.
Standard procedures were used for extracting H. pylori genomic DNA, PCR and nucleotide sequencing. Briefly, specific PCRs were carried out in 25 µl reaction mixtures in a PCR buffer supplied by the manufacturer (Biolase; MidSci, St. Louis, MO) and containing 5–10 ng of genomic DNA, 1 U of Taq polymerase (Biolase; MidSci), 1.5 mM MgCl2, 0.8 pmol each of forward and reverse primers, and 100 µM dNTP mix for 30–35 cycles of denaturation (94°C), annealing (55°–64°C, as required), and elongation (68°–72°C, as required). PCR products were purified (QIAquick PCR purification kit; Qiagen, Valencia, CA) and then quantified using the Biophotometer (Eppendorf, Hauppauge, NY). Nucleotide sequencing was performed using both DNA strands at the high-throughput genomics unit at the University of Washington, Seattle, WA. The primers used in PCR and nucleotide sequencing are listed in Table S2 in File S1.
hcpC and hcpG nucleotide sequences were assembled and edited using the Seqman suite in the Lasergene software program (DNASTAR, Madison, WI). An hcpC multiple sequence alignment (MSA) was generated using MEGA version 5 (www.megasoftware.net). The hcpG MSA was generated as follows: an initial MSA was generated by aligning HcpG sequences with crystal structures of H. pylori Slr proteins HcpB and HcpC using EXPRESSO (http://igs-server.cnrs-mrs.fr/Tcoffee/tcoffee_cgi/index.cgi). Because of high polymorphism levels in hcpG sequences, this alignment was further edited manually based on the biochemical features of amino acids to correct for obvious mismatches. The resulting MSA was used to manually derive a corresponding nucleotide alignment using the MEGA alignment editor. Phylogenetic reconstruction and analyses of selection pressures acting on hcpC and hcpG codons and lineages were performed using maximum likelihood (ML) methods implemented in PAUP* version 4b10 and PAML version 4.3b, respectively. Best-fit models of DNA sequence evolution used in phylogenetic reconstruction were selected using MODELTEST (Table S3 and Table S4 in File S1). Details of these analyses are described in supplementary methods available in File S1. Phylogenetic datasets generated in this study have been submitted to GenBank® (Accession numbers: 1) hcpC dataset, KC007946–KC008026 and 2) hcpG dataset, KC008027–KC008064).
Estimates of total nucleotide diversity (π), Watterson’s θ, nucleotide diversity at synonymous and nonsynonymous sites (πS and πA), genetic differentiation among populations (FST) with accompanying permutation tests, and the McDonald-Kreitman tests and associated estimates of α, the proportion of amino acid substitutions driven by positive selection, were obtained using the DNASP software program (version 5.1).
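For readers unfamiliar with these summary statistics, the following is a minimal illustrative sketch of how π, Watterson’s θ and the McDonald-Kreitman α are defined. It is not the DNASP implementation: the function names and the toy alignment are hypothetical, the sequences are assumed to be gap-free and of equal length, and α is computed with the standard estimator α = 1 − (Ds·Pn)/(Dn·Ps), which is assumed here rather than taken from the text.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences per site (gap-free alignment assumed)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)

def watterson_theta(seqs):
    """Watterson's theta per site: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    n, length = len(seqs), len(seqs[0])
    segregating = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / (a_n * length)

def mk_alpha(dn, ds, pn, ps):
    """Assumed estimator of alpha from McDonald-Kreitman counts:
    dn/ds = nonsynonymous/synonymous fixed differences,
    pn/ps = nonsynonymous/synonymous polymorphisms."""
    return 1.0 - (ds * pn) / (dn * ps)

# Hypothetical toy alignment (not data from this study).
toy = ["ATGCATGC", "ATGCATGA", "ATGTATGC", "ATGCATGC"]
pi = nucleotide_diversity(toy)
theta = watterson_theta(toy)
alpha = mk_alpha(dn=20, ds=10, pn=5, ps=10)
```

In this toy case two of eight sites are segregating, so π and θ can be checked by hand; real analyses must additionally handle alignment gaps, missing data and the partition into synonymous and nonsynonymous sites.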
Domain architecture analysis of hcpG and hcpC sequences was performed using the Simple Modular Architecture Research Tool (SMART). Positively selected residues of HcpC were mapped to the surface of the HcpC crystal structure (PDB code 1OUV) using the PYMOL molecular visualization system (http://www.pymol.org).
hcpC and hcpG knockout derivatives (ΔhcpC and ΔhcpG single mutants and ΔhcpC,ΔhcpG double mutant) of H. pylori strain G27MA and the hcpG::6xHIS knock-in G27MA strain were generated using the streptomycin contraselection-based method described previously, with small modifications. The strategy for generating knockout and knock-in strains is shown in Fig. S2 in File S1 and is described in detail in supplementary material available in File S1.
H. pylori G27MA WT and its hcpG and/or hcpC deletion derivatives were grown on BHI plates containing appropriate selective antibiotics for 3 days. Fifty-milliliter tissue culture flasks were then inoculated with a bacterial suspension derived from plate cultures (OD600=0.05/mL for each strain). A liquid medium comprising BHI broth supplemented with 1% IsoVitaleX, 1% H. pylori-selective supplement, and 10% fetal bovine serum (FBS) (GIBCO, CA) was used to culture the H. pylori strains. Flasks were initially incubated at 37°C for 30 min in a 5% CO2 incubator and then transferred to GasPak jars and incubated at 37°C with shaking (120 rpm) for a maximum of 56 h. At specific intervals, cell aliquots from culture flasks were diluted serially and plated on selective BHI plates to enumerate the WT and mutant colony-forming units (CFUs). The log-transformed CFU/mL count was used to determine the competitive index (CI) in co-culture experiments. The CI was calculated as the ratio of mutant to WT bacteria at each time point divided by the ratio of mutant to WT bacteria in the inoculum. A CI value greater than one indicated that the mutant out-competed the WT, whereas a value less than or equal to one indicated that the WT out-competed the mutant. Growth assays in pure cultures and fitness assays in broth co-cultures were repeated three times, and the statistical significance of observed differences in the growth or fitness of hcpC and/or hcpG mutants and WT G27MA was determined using a t-test with α=0.05.
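The competitive index described above can be sketched as follows. This is an illustration only: the function names and CFU values are hypothetical, and because the text does not specify exactly how the log transformation enters the calculation, the sketch computes the CI from raw CFU/mL counts and reports its log10 value separately.

```python
import math

def competitive_index(mutant_cfu, wt_cfu, mutant_inoculum, wt_inoculum):
    """CI = (mutant/WT at the time point) / (mutant/WT in the inoculum)."""
    counts = (mutant_cfu, wt_cfu, mutant_inoculum, wt_inoculum)
    if min(counts) <= 0:
        raise ValueError("all CFU counts must be positive")
    return (mutant_cfu / wt_cfu) / (mutant_inoculum / wt_inoculum)

def log10_competitive_index(mutant_cfu, wt_cfu, mutant_inoculum, wt_inoculum):
    """log10(CI): negative when WT out-competes the mutant, positive otherwise."""
    return math.log10(
        competitive_index(mutant_cfu, wt_cfu, mutant_inoculum, wt_inoculum)
    )

# Hypothetical counts (CFU/mL): equal inoculum, mutant recovered at 1/4 of WT.
ci = competitive_index(2e6, 8e6, 5e5, 5e5)
log_ci = log10_competitive_index(2e6, 8e6, 5e5, 5e5)
```

Normalizing by the inoculum ratio means that an unequal starting mixture does not by itself produce a CI different from one; only a change in the mutant-to-WT ratio over time does.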
Before each experiment in cultured AGS cells, bacteria were passaged once on BHI horse blood agar plates under standard microaerobic conditions as recommended. AGS cells (ATCC CRL 1739) were routinely cultured and maintained in antibiotic-free high-glucose Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% heat-inactivated FBS. Cells were grown and maintained in 50 mL tissue culture flasks at 37°C in a humidified atmosphere of 5% CO2. AGS cells were seeded at a density of 1×10⁵ cells/mL into six-well plates and then allowed to grow to 80% confluence. Just prior to infection of cultured AGS cells with H. pylori, the cell-medium mixture was removed and replaced with fresh DMEM containing 20% FBS. Bacteria were harvested from pure liquid cultures in BHI and washed in phosphate-buffered saline (PBS; pH 7.4); AGS cells were infected at an MOI of ~100 H. pylori cells per AGS cell. Plates were centrifuged for 10 min at 1,000 g to ensure bacterial contact with the AGS cells and incubated at 37°C in a humidified atmosphere of 5% CO2. At specific intervals cells were gently scraped from a well, mixed, diluted serially, and plated on selective BHI plates to enumerate the WT and mutant CFUs. In co-culture experiments the CI was calculated as described above. Each experiment was repeated five more times, and the statistical significance of observed differences in WT and mutant strains was calculated using a t-test with α=0.05.
Antibodies used in this study are listed in supplementary methods available in File S1.
HspB expression was studied in unpermeabilized bacterial cells grown in pure cultures and during infection of cultured AGS cells using a FACS-Calibur™ flow cytometer (BD, Franklin Lakes, NJ). The data were analyzed using the WinMDI software program (version 2.9). HspB, CagA and activated mitogen-activated protein kinase (MAPK; ERK2) expression during infection was studied using serum-deprived AGS cells that were allowed to grow to 80% confluence. Parameters used in the FACS analyses are listed in Table S5 in File S1. Detailed methods for FACS analyses are presented in supplementary material available in File S1.
HcpC fused to N-terminal MBP and C-terminal histidine tags was expressed and purified as described previously. HspB (gene hp0010 from H. pylori strain 26695) was cloned into a pGEX-6P expression vector (GE Healthcare), and purified GST-HspB was concentrated using ultrafiltration and stored in PBS buffer supplemented with 10% (v/v) glycerol at −80°C. All ELISA experiments were performed using Nunc Maxisorp 96-well plates at volumes of 100 µL/well. Details of HspB expression, purification and subsequent use in ELISA are provided in supplementary methods available in File S1.
Standard methods were used and are detailed in supplementary material available in File S1.
Detailed description of methods and supplementary figures (Fig. S1–Fig. S6) and tables (Table S1–Table S10).
We thank Drs. Hideki Innan and Fyodor Kondrashov for discussions and comments on the manuscript. We thank Prof. Yousef Abu Kwaik and members of his laboratory for their help and advice with cell culture and RT-PCR experiments.