Correspondence to: Pilar Delgado, Associate Professor, Molecular Diagnostics and Bioinformatics Laboratory, Biological Sciences Department, Los Andes University, Carrera 1 # 18A - 10, Bogotá 111711, Colombia. mdelgado@uniandes.edu.co
Telephone: +57-1-3394949 Fax: +57-1-3394949
AIM: To investigate Helicobacter pylori (H. pylori) CagA diversity and to evaluate the association between protein polymorphisms and the occurrence of gastric pathologies.
METHODS: One hundred and twenty-two clinical isolates of H. pylori cultured from gastric biopsies obtained from Colombian patients with dyspepsia were included as study material. DNA extracted from the isolates was used to determine cagA status by amplifying the C-terminal cagA gene region by polymerase chain reaction. One hundred and six strains with a single amplicon were sequenced, and the results were used to characterize the 3’ variable region of the cagA gene. To establish the number and type of Glutamic acid-Proline-Isoleucine-Tyrosine-Alanine (EPIYA) tyrosine phosphorylation motifs, bioinformatic analysis was conducted using Amino Acid Sequence Analyzer (AASA) software. The association between the number of EPIYA motifs and gastric pathology was analyzed using the χ2 test, and the presence of EPIYA-C motifs in relation to pathology was analyzed by logistic regression odds ratios. Comparisons among the EPIYA types found and those reported in GenBank were performed using a proportion test in Statistix Analytical Software version 8.0.
RESULTS: After amplification of the 3’ region of the cagA gene, 106 of 122 isolates presented a single amplicon and 16 showed multiple amplicons. As expected, diversity in the size of the single cagA fragments among isolates was observed. The 106 strains that presented a single amplicon after 3’ cagA amplification came from patients with gastritis (19 patients), atrophic gastritis (21), intestinal metaplasia (26), duodenal ulcer (22) and gastric cancer (18). DNA sequence analysis showed that the differences in size of the 3’ cagA fragments were attributable to the number of EPIYA motifs: 1.9% had two EPIYA motifs, 62.3% had three, 33.0% had four and 2.8% had five motifs. The majority of tested clinical strains (62.3%) were found to harbor the ABC combination of EPIYA motifs, and a statistically significant difference was observed between the frequency of the ABCC tyrosine phosphorylation motif pattern in this study and that in Western strain sequences deposited in GenBank.
CONCLUSION: The present report describes a lack of association between H. pylori CagA-protein polymorphisms and pathogenesis. ABCC high frequency variations compared with Western-strains sequences deposited in GenBank require more investigation.
Helicobacter pylori (H. pylori) is a spiral Gram-negative microaerophilic bacillus. This bacterium is one of the most common human pathogens worldwide; it is present in at least 50% of the world’s population, with the highest incidence recorded in industrially underdeveloped areas, including Asia, Africa and South America. H. pylori colonizes the human stomach and persists for several decades, causing chronic gastritis and peptic ulcer diseases. Studies have suggested that chronic infection by H. pylori is an important risk factor for the development of gastric carcinoma[5,6]. For this reason, H. pylori was defined as a type I carcinogen by the International Agency for Research on Cancer (IARC).
The cytotoxin-associated antigen A, CagA, was identified in 1989. It is encoded in the cag pathogenicity island (cag PAI), a 40-kb segment that encodes the components required to assemble a type IV secretion system (TFSS). More than 90% of strains isolated in East Asia, including Korea, Japan, and China, are known to harbor cagA, while 50%-60% of strains isolated in Western countries are positive for it. This gene shows variation that is explained by adaptive evolution, in which a genetically diverse H. pylori population provides the host with a repertoire of varied phenotypes from which a subpopulation with optimal fitness may be selected. This evolution would operate through recombination between H. pylori direct DNA repeats that results in deletion (or duplication) of phosphorylation sites in the cagA gene[12,13] or of the entire genomic cag PAI.
The cagA gene product, CagA, is translocated directly from H. pylori into the gastric epithelial cells to which the bacteria are attached via the TFSS[15-18]. Upon localizing to the inner surface of the plasma membrane, CagA undergoes tyrosine phosphorylation by Abl and Src family kinases on specific tyrosine residues within a Glutamic acid-Proline-Isoleucine-Tyrosine-Alanine (EPIYA) motif[19-25]. Once CagA is phosphorylated, it interacts with Src homology phosphatase 2 (SHP-2), which stimulates downstream signaling cascades involved in the reorganization of the cytoskeleton, resulting in cellular morphological changes such as the “hummingbird” phenotype. Among the various CagA activities that disturb cellular functions, deregulation of SHP-2 by CagA is of potential importance in gastric carcinogenesis because mutations in PTPN11, the gene encoding human SHP-2, have been identified in human malignancies[26,27].
CagA protein varies in size according to the strain[28,29]. The structure of the gene reveals a 5’ highly conserved region and CagA size variation is due to the presence of different types and/or numbers of repeat sequences containing the EPIYA motifs within the C-terminal variable region. Four types of EPIYA segments have been described: A, B, C, and D, each of which contains a single EPIYA motif. Moreover, Panayotopoulou et al and Kanada et al described the pattern around the EPIYA motif to determine the type to which it corresponds as follows: EPIYA-A, EPIYAKVNKKK(A/T/V/S)GQ; EPIYA-B, E(S/P)IY(A/T)(Q/K)VAKKVNAKI; EPIYA-C, EPIYATIDDLG and EPIYA-D, EPIYATIDFDEANQAG. Earlier studies have shown that CagA protein nearly always contains EPIYA-A and EPIYA-B sites, followed by one to three EPIYA-C repeats in Western-type H. pylori isolates or by one EPIYA-D motif in East Asian-type isolates. Src kinase in gastric epithelial cells phosphorylates CagA on EPIYA-C tyrosine residue[34,35]. Consequently, among Western CagA strains, the number of EPIYA-C sites is directly associated with the level of tyrosine phosphorylation. Thus, Western CagA proteins with a greater number of EPIYA-C sites are pathophysiologically more virulent and probably more carcinogenic. The tyrosine phosphorylation status of CagA is important for the pathogenicity of H. pylori and this variable number of EPIYA could be of clinical relevance in gastroduodenal diseases. For the above stated reasons it is important to analyze the genetic variability of cagA gene from clinical strains in relation to the associated pathologies. In this study, we used a polymerase chain reaction (PCR)-sequencing-bioinformatics strategy to characterize the CagA variable region of H. pylori from Colombian isolates. Additionally, the association between CagA diversity and the severity of gastroduodenal disease was analyzed.
A total of 122 H. pylori strains obtained from the stock collection at the Instituto Nacional de Cancerología, in Bogotá, Colombia, were grown on blood agar plates, supplemented with 7% horse serum (Invitrogen, Grand Island, NY), 1% Vitox (Oxoid, Basingstoke, UK), and Campylobacter selective supplement (Oxoid, Basingstoke, UK), at 37°C for 3 d in microaerophilic conditions. Three cagA positive control strains NCTC 11637, NCTC 11638 and ATCC 43579, were also included. Isolates belonged to patients with different types of gastric pathologies including benign, mild and severe conditions associated with H. pylori infection. Histopathology diagnosis was recorded for all voluntary participants.
Genomic DNA was extracted using the AquaPure Genomic DNA isolation kit (BIO-RAD) according to the manufacturer’s instructions, and the DNA obtained was stored at -20°C until PCR amplification. To amplify the variable cagA region, a PCR assay was carried out in a volume of 50 μL containing 50 mmol/L KCl, 20 mmol/L Tris-HCl (pH 8.4), 1.75 mmol/L MgCl2, 0.2 mmol/L of each dNTP, 1 pmol/μL of each primer (CAGTF 5’-3’: ACCCTAGTCGGTAATGGG and CAGTR 5’-3’: GCTTTAGCTTCTGAYACYGC), previously reported by Yamaoka et al, 1.25 units of Taq DNA Polymerase [Invitrogen, Carlsbad (California), USA] and 4 μL of DNA. The positive controls, with their GenBank accession numbers, were H. pylori NCTC 11637 (AF202973.1), NCTC 11638 (AF282853) and ATCC 43579 (AB015414.1). The PCR conditions included an initial denaturation step at 92°C for 5 min, followed by 35 cycles of 92°C for 1 min, 61°C for 1 min and 72°C for 1 min, and a final extension at 72°C for 7 min. PCR products were run on 1% agarose gels in 0.5 × TAE buffer at 100 V in a BIO-RAD® electrophoresis system and purified using the Wizard SV Gel and PCR Clean-Up System Kit [Promega, Madison (Wisconsin), USA] according to the manufacturer’s instructions, prior to sequencing.
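The reverse primer CAGTR contains two degenerate positions (Y, which in IUPAC nomenclature stands for C or T), so it represents a small pool of oligonucleotides rather than a single sequence. The following minimal Python sketch, which is an illustration and not part of the original protocol, enumerates that pool; the function name expand_primer is hypothetical.

```python
from itertools import product

# IUPAC degenerate nucleotide codes (standard nomenclature).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Reverse primer exactly as printed in the text.
CAGTR = "GCTTTAGCTTCTGAYACYGC"
variants = expand_primer(CAGTR)
for seq in variants:
    print(seq)
```

With two Y positions the pool contains four 20-mers, each differing only at the degenerate sites.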
Sequencing was performed using an ABI PRISM® 310 Genetic Analyzer [Applied Biosystems, Foster City (California), USA] and the BigDye® Terminator v3.1 Cycle Sequencing kit (Applied Biosystems, Foster City, CA, USA). The reactions were done in a volume of 10 μL containing 0.5 × Premix, 0.5 × buffer, 0.16 μmol/L of each primer and 2 μL of purified DNA. The conditions were: one cycle at 96°C for 1 min, followed by 25 cycles of 10 s at 96°C, 5 s at 50°C and 4 min at 60°C. All 106 sequences obtained were deposited in the GenBank/EMBL/DDBJ database under the following accession numbers: FJ755476 and FJ915841 to FJ915945.
The sequences (forward and reverse) were edited and assembled using CLC DNA Workbench [CLC Bio A/S, (Aarhus C), Denmark]. For the characterization and quantification of the EPIYA motifs located in the C-terminal region of the CagA protein, a program called Amino Acid Sequence Analyzer (AASA) was designed for the study to identify the type and number of EPIYA motifs, using the six open reading frames of each sequence. The characterization of tyrosine phosphorylation motifs, which contain EPIYA sequences, was done as previously described by Higashi et al. The Clustal W program was used to generate a multiple alignment of the amino acid sequences of each strain.
To test whether the software could establish phosphorylation motifs accurately and how easy it was to use, all controls were processed in a first phase. The sequence results were as expected from the GenBank records, so the clinical strains were then evaluated on that basis.
The association between the number of EPIYA motifs and the gastric pathology described was analyzed using the χ2 test. The presence of one or more than one EPIYA-C motif in relation to the pathology was analyzed by logistic regression odds ratios (OR); a 95% confidence interval (CI) was calculated using the SPSS statistical software package version 16.0 (SPSS Inc., Chicago, IL, USA). Comparisons among the EPIYA types found and those reported in GenBank were performed using a proportion test in Statistix version 8.0 (Analytical Software, 1985-2003).
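The AASA source code is not reproduced in this report. As a rough illustration of the six-frame search described above, the following Python sketch translates a nucleotide sequence in all six reading frames with the standard genetic code and picks the frame containing the most EPIYA cores. All function names are hypothetical, and counting only the literal string EPIYA is a simplifying assumption (it would miss variants such as EPIYT).

```python
# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate from position 0; codons with ambiguous bases become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frames(dna):
    """Return the six translations keyed as '+1'..'+3' and '-1'..'-3'."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

def best_epiya_frame(dna, core="EPIYA"):
    """Pick the frame with the most occurrences of the motif core."""
    frames = six_frames(dna)
    name = max(frames, key=lambda f: frames[f].count(core))
    return name, frames[name]

# Toy input: one EPIYA core encoded on the forward strand.
toy = "GAACCTATTTATGCT"
print(best_epiya_frame(toy))
```

Searching both strands means that a read assembled in the reverse orientation is handled without a separate step.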
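For a single binary predictor (for example, one versus more than one EPIYA-C motif), the odds ratio from logistic regression equals the cross-product ratio of the corresponding 2 × 2 table, and the Wald 95% CI can be computed from the standard error of the log odds ratio. The Python sketch below illustrates that calculation only; the counts are hypothetical placeholders and are not the study's data, and the analysis reported here was run in SPSS.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI for a 2x2 table.

    a = exposed cases, b = exposed references,
    c = unexposed cases, d = unexposed references.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# HYPOTHETICAL counts for illustration only (not taken from the study).
or_value, ci_low, ci_high = odds_ratio_ci(a=10, b=20, c=5, d=40)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

A CI that includes 1 corresponds to no detectable association at the 5% level.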
After amplification of the 3’ region of the cagA gene, 106 of the 122 clinical isolates presented a single amplicon (a unique band), and 16 showed multiple amplicons (two or more fragments) (Figure 1, lane 4). As expected, diversity in the size of the cagA fragment among isolates was observed, and PCR products ranged from 343 to 811 bp. The 106 strains that presented a single amplicon after 3’-cagA amplification came from patients with superficial gastritis (19 patients), atrophic gastritis (21), intestinal metaplasia (26), duodenal ulcer (22) and gastric cancer (18).
After sequencing of the cagA 3’ region, the corresponding peptide sequences were deduced for all of the strains with single amplicons. The combination of the different EPIYA motifs was determined using the ClustalW and AASA software, based on the classification defined by Higashi et al and Panayotopoulou and collaborators. The following patterns were found: EPIYA(K/Q)VNKKK(A/T)GQ, which corresponds to EPIYA-A; E(P/S)IY(A/T)(Q/K)VAKKV(N/T)(A/Q)KI, which corresponds to EPIYA-B; and EPIYATIDDL(G/R), which corresponds to EPIYA-C (Figure 2). All the EPIYA motifs were Western type (Table 1); we did not find strains harboring EPIYA-D, the characteristic pattern of the Eastern type.
DNA sequence analysis revealed that 2 of 106 strains (1.9%) had two EPIYA motifs, 66 strains (62.3%) had three EPIYA motifs, 35 strains (33.0%) had four EPIYA motifs and 3 strains (2.8%) had five EPIYA motifs. In 66 of the 106 strains (62.3%), the pattern of the EPIYA motifs was ABC type (Table 1). Moreover, 49 of 106 strains had a modified EPIYA-B motif (EPIYT) instead of EPIYA (Figure 2), and there was no association between this tyrosine phosphorylation motif (EPIYT), which contains a threonine residue instead of an alanine residue, and the studied pathologies (P = 0.51) (data not shown).
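The segment patterns listed above map directly onto regular expressions, with each parenthesized alternative becoming a character class. The Python sketch below is an illustrative assumption about how such a classification could be coded and is not the AASA implementation; the EPIYA-D pattern is taken from the Introduction, and the function name epiya_pattern and the toy sequences are hypothetical.

```python
import re

# Segment patterns as given in the text; (X/Y) alternatives become [XY].
SEGMENT_PATTERNS = {
    "A": r"EPIYA[KQ]VNKKK[AT]GQ",
    "B": r"E[PS]IY[AT][QK]VAKKV[NT][AQ]KI",
    "C": r"EPIYATIDDL[GR]",
    "D": r"EPIYATIDFDEANQAG",  # East Asian type, from the Introduction
}

def epiya_pattern(protein):
    """Return the EPIYA segment types in N- to C-terminal order, e.g. 'ABCC'."""
    protein = protein.upper()
    hits = []
    for label, pattern in SEGMENT_PATTERNS.items():
        for match in re.finditer(pattern, protein):
            hits.append((match.start(), label))
    return "".join(label for _, label in sorted(hits))

# Toy peptides built only from the motif patterns plus arbitrary spacers.
western_toy = ("MK" + "EPIYAKVNKKKAGQ" + "GG" + "EPIYTQVAKKVNAKI"
               + "SS" + "EPIYATIDDLG" + "PP" + "EPIYATIDDLR")
print(epiya_pattern(western_toy))
```

Sorting the matches by start position is what turns a set of independent pattern hits into an ordered type string such as ABC or ABCC.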
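The reported percentages follow directly from the strain counts and the total of 106 single-amplicon strains. As a quick consistency check, a short Python sketch (variable names are illustrative):

```python
# Number of strains by EPIYA motif count, as reported in the text.
motif_counts = {2: 2, 3: 66, 4: 35, 5: 3}

total = sum(motif_counts.values())
percentages = {
    n_motifs: round(100 * count / total, 1)
    for n_motifs, count in motif_counts.items()
}

print(total)
for n_motifs, pct in percentages.items():
    print(f"{n_motifs} EPIYA motifs: {pct}%")
```

The counts sum to 106 and reproduce the stated values of 1.9%, 62.3%, 33.0% and 2.8%.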
Studies have shown that CagA proteins with more EPIYA-C motifs are expected to be more active biologically than those with a small number of EPIYA-C motifs because they interact more effectively with SHP-2 phosphatases and therefore it could perturb SHP-2-dependent signaling pathways, inducing greater morphological changes or probably contributing to generation of gastric cancer[30,36]. For this reason, in order to determine if there was any relation between the severity of the gastroduodenal disease and the number and type of the EPIYA motifs, statistical analyses were applied.
A χ2 test showed that, in this group of Colombian isolates, the variation of all EPIYA motifs in CagA was not directly associated with the outcome of the disease caused by H. pylori (P = 0.52). The risks of atrophic gastritis, intestinal metaplasia, duodenal ulcer and gastric cancer were then estimated with respect to a reference group made up of patients with gastritis, comparing patients infected by cagA-positive strains with one EPIYA-C motif and those infected by strains with more than one EPIYA-C motif. No association between the number of EPIYA-C motifs and the pathology was found using logistic regression (Table 2). Using a proportion test, the frequencies of the EPIYA genotypes obtained in this study were compared with those of CagA sequences including the C-terminal variable region reported by Argent et al and with those of sequences collected from Colombia and deposited in GenBank. A statistically significant difference was found between the frequency of the ABCC pattern in Western strains previously reported by Argent et al and that in our results, P < 0.002 (Table 1 and Figure 3).
The present work combined amplification and sequencing of the 3’ variable cagA-gene region with a final bioinformatic analysis of the corresponding C-terminal region of the CagA protein using AASA software, as a simple and rapid method for H. pylori CagA characterization. Determination of the number and type of H. pylori CagA phosphorylation motifs has been suggested by some researchers as a way to predict the clinical outcome of H. pylori-associated pathologies and by others as a prognostic tool. To examine the viability of this approach, 122 Colombian clinical isolates, all of them derived from adults, were analyzed. Of the 122 strains initially included, the 106 isolates that presented a single PCR product were considered for later analysis. As expected, diversity in the size of the cagA fragment among isolates was observed, and PCR products ranged from 343 to 811 bp. This is explained by the fact that cagA varies in size depending on the number of EPIYA motifs it has, and because each type of EPIYA adds a certain number of amino acids, since each type is surrounded by a specific amino acid sequence. EPIYA-A represents an additional 32 amino acids in the protein, EPIYA-B 40 more amino acids, and EPIYA-C 34 more amino acids for the Western CagA type[40,41]. The remaining 16 strains that showed two or more PCR products were excluded. However, they merit further analysis because: (1) they could represent subclones attributable to a clonal microevolution process, demonstrating the high recombination rate of the plastic H. pylori genome; or (2) they could represent coinfection, with multiple H. pylori strains coexisting in the same host. Microevolution of the cagA gene has been described previously in the same individual and also in different family members[12,42,43]. In addition, strains with a polymorphism in the cagA variable region have also been observed by Panayotopoulou et al, Aras et al and Reyes-Leon et al.
The EPIYA patterns found in this study match those reported by Panayotopoulou et al, with some differences: in EPIYA-A, the amino acid Lysine (K) in the 6th position (counting the E of the EPIYA motif as the 1st position) changed to K/Q, as previously reported; in EPIYA-B, the pattern in positions 12 and 13 changed from being only Asparagine and Alanine (N, A) to (N/T) and (A/Q), respectively; and in EPIYA-C, the amino acid in the 11th position changed from Glycine (G) to (G/R) (data not shown, accession number: FJ915913), which had been previously reported by Occhialini. There was no association between the phosphorylation motif EPIYT, which has a threonine residue instead of an alanine residue, and the studied pathologies (P = 0.51). However, it has been reported that isolates that harbor the ABCC genotype and have a modification in the 5th residue (EPIYT) in the B type may induce lower levels of cellular elongation and interleukin-8 secretion than isolates with the normal ABCC pattern.
EPIYA results from this study were compared with those reported by Argent et al. The analyses showed that in our samples the ABC pattern of EPIYA motifs is the most common (62.3%) (Table 1 and Figure 3), which is in agreement with Western sequences reported before (63.3%). Interestingly, we found a statistically significant difference between the frequency of ABCC patterns in Western strains reported in GenBank (19.6%) and our results (31.1%) (Table 1) (P = 0.002). A search of the GenBank database was performed to look for sequences reported from Colombia[45,46] that included the C-terminal variable region of CagA. A comparison between the frequencies of the different EPIYA patterns in those sequences and our results showed no statistically significant difference (Table 1) (P = 0.98). The present study showed that the variation of the EPIYA motifs in the CagA protein is not directly associated with the outcome of the disease caused by H. pylori and that there is no association between the number of EPIYA-C motifs and the pathology (Table 2). Similar results have been reported in Iranian and Iraqi populations, where there seems to be no explicit positive correlation between the number of EPIYA motifs and the various gastroduodenal diseases associated with H. pylori infection. In contrast, Yamaoka et al showed that 7 out of 8 H. pylori Colombian strains (87.5%) with more than 3 EPIYA-C motifs were from gastric cancer patients. More recently, Sicinschi et al evaluated the 3’ cagA region in 66 isolates from Colombian patients with gastric precancerous lesions, from areas with low (31 strains) and high (35 strains) risk of gastric cancer in the south of the country. The proportions of strains bearing one EPIYA-C (62.2%), two EPIYA-C (34.3%) and three EPIYA-C (1.5%) motifs were similar to the ones observed in our population of strains.
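The per-segment lengths quoted above allow a rough estimate of how much each EPIYA pattern contributes to the fragment size. The Python sketch below is illustrative only: it counts the EPIYA segments alone and ignores the flanking sequence between the primers, so it gives size differences between patterns rather than absolute amplicon sizes. The function name segment_nt is hypothetical.

```python
# Segment lengths in amino acids for Western-type CagA, as quoted in the text.
SEGMENT_AA = {"A": 32, "B": 40, "C": 34}

def segment_nt(pattern):
    """Nucleotides contributed by the EPIYA segments of a pattern such as 'ABCC'."""
    return 3 * sum(SEGMENT_AA[segment] for segment in pattern)

for pattern in ("ABC", "ABCC", "ABCCC"):
    print(pattern, segment_nt(pattern), "nt")

# Each extra EPIYA-C segment adds 34 codons.
print(segment_nt("ABCC") - segment_nt("ABC"), "nt per additional EPIYA-C")
```

Under these assumptions each additional EPIYA-C segment lengthens the fragment by 102 nt, which is the kind of step that separates bands on an agarose gel.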
In contrast with our results, they observed a significant association between the presence of two or three EPIYA-C motifs and more severe lesions. Particularly they found a very high prevalence of strains with more than two EPIYA-C motifs in individuals with intestinal metaplasia (16/27, 59%). Sicinschi et al included in the study only isolates from men aged between 39 and 69 years in order to have high prevalence of H. pylori infection and preneoplastic lesions; 40% of the analyzed strains were from intestinal metaplasia. Our study included isolates from the Andes Mountains in the central part of the country, a high risk gastric cancer area; we included strains from men and women aged between 19 and 80 years, furthermore we included strains from gastric cancer patients. These differences in the design of the studies and also variations in geographic areas could explain the different results obtained. A limitation of our study is the small size of the analyzed population of strains. Although we could not find differences between EPIYA-C motifs and clinical outcomes, high prevalence of strains with more than one EPIYA-C motif might explain in part the high incidence of gastric cancer in Colombia.
It must also be considered that the absence of association between the CagA polymorphisms and the pathogenesis could be due to other factors, such as genetic features of the host and cellular and extracellular environmental variables, which could influence the development of gastric cancer in combination with the type and number of EPIYA motifs in the CagA protein. Moreover, the work of Püls et al, using site-specific mutagenesis, suggests that tyrosine phosphorylation at EPIYA-C is sufficient, but not exclusive, to activate translocated CagA, which implies that other motifs besides EPIYA-C are also used for phosphorylation of CagA proteins. For this reason it is important to study not only the tyrosine phosphorylation motif, EPIYA, but also other repeats that have been discovered recently, such as the 7-AA sequence KIDQLNQ, which occurs in and near the CagA variable region. Another important amino acid sequence is FPLXRXXXVXDLSKVG (Figure 2, second square), which surrounds the EPIYA-C and EPIYA-D motifs in Western and East Asian CagA species; this was called the CagA multimerization (CM) motif. The CM motif was characterized in all of our strains and, interestingly, in strain 5-22019 (GenBank accession number: FJ915938) the motif surrounds not only the EPIYA-C motif but also the EPIYA-A motif. For this reason, it is important to study the molecular interaction between CagA proteins through the CM motif, because CagA multimerization is critically involved in the formation of the CagA-SHP2 signaling complex during cagA-positive H. pylori infection. Thus, it will be important to study the biological activity of Colombian CagA variants that harbor a variation in the C-terminal region of the CagA protein, in order to understand how the interaction between the EPIYA-C motif and the SRC-2 protein in the epithelial cells works.
At the structural level it is important to determine how the EPIYA motifs are located in the CagA protein and if it influences CagA-SRC interaction, and consequently contribute to elucidate the biological activity of the CagA protein in the development of carcinogenesis in epithelial cells through this interaction. It is also important for future studies to check the evolution of the cagA gene in patients with atrophic gastritis, because this pathology has been reported as an early step in the development of gastric cancer and this combined with genetic variability on the cagA gene generated by multiple rearrangements such as insertions, deletions, and substitution events on a 102nt region encoding the EPIYA-C motif could increase the probability of developing a gastric cancer pathology. This kind of study has already been developed through hybridization assays with whole-genome DNA microarrays studies from Colombian strains and monitoring the acquisition of additional EPIYA-C motifs throughout adulthood to determine if they contribute to H. pylori biological activity at a later age.
Our findings suggest that CagA polymorphisms EPIYA motifs, in Colombian patients are not clearly associated with the outcome of the disease. For this reason, it could be important to evaluate if the EPIYA motifs variation can cause an effect in the CagA protein structure and if so, if it can be correlated with its interaction inside the gastric epithelial cells. CagA has a striking functional similarity with Gab, Grb2 associated binder adaptor, in terms of being a protein that recruits SHP-2 to the plasma membrane and activates SHP-2 phosphatase by forming a physical complex. Accordingly, it could be important to understand the role of CagA variants inside the gastric epithelial cells at protein-protein interaction level. On the other hand, microevolution in cagA gene has important implications for H. pylori pathogenesis because changing CagA type may provide a partial explanation of why studies examining the relationship between bacterial virulence and disease do not show tighter associations. Finally, the CagA variable region characterization was successfully achieved and it gave an important register of the CagA protein polymorphisms in Colombian patients.
Helicobacter pylori (H. pylori) is one of the most common worldwide human pathogens and it is present in at least 50% of the world’s population. Studies have suggested that chronic infection by H. pylori is an important risk factor for the development of gastric carcinoma, due to the presence of an important virulence factor, CagA protein.
CagA protein is a potential factor in the development of gastric carcinogenesis in human gastric epithelial cells. However, the relation between the polymorphisms in CagA and the pathogenesis of H. pylori in gastric cells has not been addressed in terms of the number and type of Glutamic acid-Proline-Isoleucine-Tyrosine-Alanine (EPIYA) motifs located in the C-terminal region of the CagA protein.
Some studies argue that there is an association between the EPIYA motifs in CagA and the pathology presented by the patients. In this study, the authors show that this association is not present in Colombian patients. Furthermore, their results suggest that this association is not the same in different regions of the world because there are other factors, such as structural conformation and protein-protein interactions, that could affect the final outcome of the translocation of CagA into the gastric epithelial cells.
Because there is no association in Colombian samples between the motifs in the C-terminal region of CagA and the pathology, it may be important to understand how these motifs interact with their targets in gastric cells (Src kinase and Src homology phosphatase 2 proteins) in order to elucidate the function of CagA in the development of gastric cancer.
CagA is one of the major virulence factors of H. pylori. This protein is involved in the development of gastric cancer because when this protein is translocated into gastric cells it causes changes in the signal transduction pathway which are involved in the proliferation of the cells.
The paper by Acosta et al assesses EPIYA in Colombian patients with dyspepsia and finds no association with pathology.
We would like to thank colleagues who developed the bioinformatics software (AASA): Eric Gerard Rothstein Morris, Miguel Andrés Yáñez Barreto and Sonia Vivas from the Engineering Faculty, Los Andes University, Bogotá, Colombia.
Supported by Sciences Faculty, Los Andes University, Bogotá, Colombia and National Cancer Institute, Bogotá, Colombia, Grant No. 41030310-28 (to Bravo MM)
S- Editor Wang YR L- Editor O'Neill M E- Editor Lin YP