Helicobacter pylori infection is associated with several gastro-duodenal inflammatory diseases of varying severity. To determine whether certain combinations of genetic markers can be used to predict the clinical source of the infection, we analyzed well-documented and geographically homogeneous clinical isolates using a comparative genomics approach.
A set of 254 H. pylori genes was used to perform array-based comparative genomic hybridization among 120 French H. pylori strains associated with chronic gastritis (n = 33), duodenal ulcers (n = 27), intestinal metaplasia (n = 17) or gastric extra-nodal marginal zone B-cell MALT lymphoma (n = 43). Hierarchical cluster analyses of the DNA hybridization values allowed us to identify a homogeneous subpopulation of strains that clustered exclusively with cagPAI-negative MALT lymphoma isolates. The genome sequence of B38, a representative of this MALT lymphoma strain cluster, was completed, fully annotated, and compared with the six previously released H. pylori genomes (i.e. J99, 26695, HPAG1, P12, G27 and Shi470). B38 has the smallest H. pylori genome described thus far (1,576,758 base pairs containing 1,528 CDSs); it contains the vacAs2m2 allele and lacks the genes encoding the major virulence factors (cagPAI, babB, babC, sabB, and homB are all absent). Comparative genomics identified very few sequences that are unique to the B38 strain (9 intact CDSs and 7 pseudogenes). Pair-wise genomic synteny comparisons between B38 and the six sequenced H. pylori genomes revealed an almost complete co-linearity, never seen before, between the genomes of strain Shi470 (a Peruvian isolate) and B38.
These isolates lack the main H. pylori virulence factors characterized previously but are nonetheless associated with gastric neoplasia.
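
The hierarchical cluster analysis of hybridization values mentioned in the results can be illustrated with a minimal sketch. It assumes that each hybridization value has already been reduced to a present/absent call per gene, and it uses Hamming distance with average linkage; the text does not state which distance measure or linkage method was actually used, so both are assumptions. The strain names and gene calls are invented toy data, not results from the study.

```python
from itertools import combinations


def hamming(a, b):
    """Fraction of genes whose presence/absence call differs between two strains."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with average linkage (assumed method).

    profiles: dict mapping strain name -> tuple of 0/1 gene calls.
    Returns a sorted list of clusters, each a sorted list of strain names.
    """
    clusters = [[name] for name in profiles]

    def dist(pair):
        # Mean pairwise distance between the members of two clusters.
        i, j = pair
        ds = [hamming(profiles[a], profiles[b])
              for a in clusters[i] for b in clusters[j]]
        return sum(ds) / len(ds)

    while len(clusters) > n_clusters:
        # Merge the two closest clusters; i < j, so deleting j is safe.
        i, j = min(combinations(range(len(clusters)), 2), key=dist)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Toy, invented data: 1 = gene detected by hybridization, 0 = not detected.
toy_profiles = {
    "strain_A": (1, 1, 1, 1, 0, 1),
    "strain_B": (1, 1, 1, 0, 0, 1),
    "strain_C": (0, 0, 1, 0, 1, 0),
    "strain_D": (0, 0, 1, 0, 1, 1),
}

groups = average_linkage(toy_profiles, n_clusters=2)
print(groups)
```

In this toy example, strains with similar gene content end up in the same group, which is the sense in which a subpopulation of strains can be said to cluster together. For a data set of 120 strains and 254 genes, an established library routine would be a more practical choice than this hand-written loop.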