To overcome some of the deficiencies with current molecular typing schema for Campylobacter spp., we developed a prototype PCR binary typing (P-BIT) approach. We investigated the distribution of 68 gene targets in 58 Campylobacter jejuni strains, one Campylobacter lari strain, and two Campylobacter coli strains for this purpose. Gene targets were selected on the basis of distribution in multiple genomes or plasmids, and known or putative status as an epidemicity factor. Strains were examined with Penner serotyping, pulsed-field gel electrophoresis (PFGE; using SmaI and KpnI enzymes), and multilocus sequence typing (MLST) approaches for comparison. P-BIT provided 100% typeability for strains and gave a diversity index of 98.5%, compared with 97.0% for SmaI PFGE, 99.4% for KpnI PFGE, 96.1% for MLST, and 92.8% for serotyping. Numerical analysis of the P-BIT data clearly distinguished strains of the three Campylobacter species examined and correlated somewhat with MLST clonal complex assignations and with previous classifications of “high” and “low” risk. We identified 18 gene targets that conferred the same level of discrimination as the 68 initially examined. We conclude that P-BIT is a useful approach for subtyping, offering advantages of speed, cost, and potential for strain risk ranking unavailable from current molecular typing schema for Campylobacter spp.
Campylobacter species, particularly C. jejuni subsp. jejuni (hereafter C. jejuni), represent the most commonly reported bacterial cause of gastroenteritis in humans in the developed world (47), with New Zealand having one of the highest rates of infection (55). The sheer scale of infection makes concerted epidemiological studies difficult, as does the extremely wide distribution of the organism, found in all major avian and mammalian food animals, their products, and indeed environments. Moreover, many Campylobacter spp. are susceptible to spontaneous genetic change through a variety of mechanisms that can result in conflicting data for genetic typing methods aiming to establish a molecular epidemiological link between strains (reviewed by On and colleagues).
The poor discrimination of phenotypic typing methods led to intense developments in molecular epidemiological tools for more accurate data. Although a wide range of genotypic methods have been described (47), two methods are now more commonly used by laboratories worldwide. The availability of standardized protocols for macrorestriction profiling with pulsed-field gel electrophoresis (PFGE) and multilocus sequence typing (MLST) has facilitated major contributions to our understanding of the epidemiology of these bacteria. Nonetheless, issues remain, notably relating to the speed, cost, and ease of data analysis from these methods. Furthermore, although MLST has proven useful in evaluating the original host of a given strain, no current methods provide information on the relative risk to human health from individual strains. Various studies, including those identifying stable clones found in humans and various animals as well as strain types only in a particular animal host (5, 13, 38, 48, 61), and whole-genome microarray-based comparisons revealing a correlation between genome content and stress survival (46) indicate that not all strains are of equal risk to humans.
In this study, we designed a range of specific PCR assays and investigated the distribution of 68 genes associated with epidemicity factors in C. jejuni, to establish the basis of a novel PCR binary typing (P-BIT) system that is inexpensive, rapid, and highly portable. We compared our data with MLST and PFGE (using restriction enzymes SmaI and KpnI) results for the same isolates of C. jejuni (n = 58), C. coli (n = 2), and C. lari (n = 1).
Fifty-eight C. jejuni isolates, 2 C. coli isolates, and 1 C. lari isolate from four countries and a variety of sources were included in this study (Table 1). Isolates were grown on Columbia sheep blood agar plates for 48 h at 42°C under microaerobic conditions.
Between 5 and 10 colonies were transferred to a 1.5-ml tube containing phosphate-buffered saline (PBS) (BR0014G; Oxoid, Basingstoke, England) and centrifuged at 6,000 × g. The supernatant was discarded and DNA was extracted from the pellet using DNeasy blood and tissue kits (Qiagen, Hilden, Germany). The DNA quantity and purity were established using the Nanodrop 1000 (Thermo Scientific, Waltham, MA), and the DNA was stored at −20°C until required. DNA was diluted in sterile Milli-Q water as required.
Selection of gene targets for potential use in a PCR binary typing (P-BIT) system was based upon several criteria. We chose predominantly those genes implicated as markers of epidemicity that were overrepresented in genotypes associated with human illness (1, 4, 14, 16, 17, 20, 22, 26-31, 35-37, 43, 44, 46, 49, 50, 52, 53, 59, 60, 64, 65, 67, 69-71) or associated with virulence factors that were not components of the core genome (33, 34, 60), to evaluate a basis for subtyping that could also be related to the risk to human health from individual strains. These genes were generally involved in cell surface, mobility, and toxin production. Primers were designed using Primer3 (58) via Geneious Pro (10; available from http://www.geneious.com/) and the sequences of the identified genes and the identified strains as loaded in GenBank. Primers were designed to produce PCR products with sizes and melting temperatures that will allow for easy multiplexing in the future. The primers were checked for sequence homology with all known sequences using the Basic Local Alignment Search Tool (BLAST; www.ncbi.nlm.nih.gov/BLAST/) and synthesized by Invitrogen (Carlsbad, CA). In addition, one primer set (Cj0265) designed by Price and colleagues (56) was also included in the risk-based binary typing system. The target genes and primer sequences are listed in Table 2.
DNeasy-extracted DNA from each isolate was tested using each gene target in PCRs containing 2.5 mM MgCl2, 1× PCR buffer II (50 mM KCl, 10 mM Tris, pH 8.3; Applied Biosystems, Foster City, CA), 0.2 mg/ml bovine serum albumin (Sigma-Aldrich, St. Louis, MO), 250 μM each deoxynucleoside triphosphate (dNTP), 12.5 pmol each primer, 1.25 U AmpliTaq DNA polymerase (Applied Biosystems, Foster City, CA), and 0.5 ng or 5 ng of extracted DNA. PCRs were run in ABI 9700 (Applied Biosystems, Foster City, CA) thermal cyclers using the following cycling conditions: 5 min at 95°C followed by 40 cycles of 1 min at 95°C, 1 min at 59°C, and 1 min at 74°C and then a final extension at 74°C for 8 min.
PCR products were run in 2% agarose gels with 1× Tris-borate-EDTA (TBE) for 70 min at 110 V and visualized using ethidium bromide. The presence or absence of target PCR products was loaded into a BioNumerics version 5.10 (Applied Maths, Ghent, Belgium) database. Interstrain relationships were assessed by numerical analysis of the P-BIT data using the simple matching coefficient and Ward's clustering.
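As an illustration of the similarity measure named above, the simple matching coefficient between two binary profiles is the fraction of targets on which both isolates give the same result (both present or both absent). The following is a minimal Python sketch under that definition, not the BioNumerics implementation; the function name and the two example profiles are hypothetical and are not data from this study.

```python
def simple_matching(profile_a, profile_b):
    """Fraction of targets with the same result (1/1 or 0/0) in both profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same targets")
    if not profile_a:
        raise ValueError("profiles must not be empty")
    matches = sum(1 for a, b in zip(profile_a, profile_b) if a == b)
    return matches / len(profile_a)


# Hypothetical presence (1) / absence (0) results for eight targets.
isolate_x = [1, 1, 0, 1, 0, 0, 1, 1]
isolate_y = [1, 0, 0, 1, 0, 1, 1, 1]

similarity = simple_matching(isolate_x, isolate_y)
print(f"Simple matching similarity: {similarity:.2f}")
```

Because shared absences count as agreement, this coefficient treats a negative PCR result as informative, which suits data where every target is tested in every isolate.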
The Penner serotyping system was used to determine the heat-stable serotypes of the isolates using the passive hemagglutination technique and antisera produced in house according to the method of Penner and Hennessy (54).
All isolates except SVS835-770, which had become nonviable, were analyzed by PFGE using the standardized PulseNet protocol (57), and Salmonella enterica serovar Braenderup H9812 restricted with XbaI as a size standard (21). Samples were digested using SmaI and KpnI. Gels were prepared with 1% (wt/vol) SeaKem Gold agarose (Lonza, Rockland, ME) and electrophoresed for 18 h using respective initial and final switch times of 6.8 s and 38.4 s for SmaI and 5.2 s and 42.3 s for KpnI. PFGE profiles were analyzed and compared using BioNumerics version 5.10 and submitted to the PulseNet Aotearoa (New Zealand) Campylobacter database, where SmaI and KpnI pattern designations were assigned.
DNeasy-extracted DNA from each isolate, except SVS835-770, which had become nonviable, was analyzed using multilocus sequence typing (MLST) as described by Dingle and colleagues (7). Amplification was performed in a 25-μl reaction mixture using AmpliTaq Gold master mixture (Applied Biosystems, Foster City, CA) and 5 pmol of each primer. Products were sequenced with an ABI Genetic Analyser 3130XL (Applied Biosystems, Foster City, CA). The sequence information was collated and alleles were assigned using the Campylobacter PubMLST database (http://pubmlst.org/campylobacter/) (23). Novel alleles and sequence types (ST) were submitted for allele and ST and clonal complex (CC) designations when appropriate.
Sixty-eight PCRs were designed for 67 prospective risk-based binary typing genes. Two primer sets for pglB were designed using the sequence from positions 81 to 176 (accession no. AF108897) (66). Both sequences aligned only with C. jejuni, but one primer set contained mismatches with some strains (pglB-s) while the other showed 100% alignment for both (pglB-g). Five targets produced negative results for all 61 isolates tested, fhuA, aphA-3, CAT, HS41.06, and HS19.09, of which the latter two were serotype-specific targets (24) subsequently shown to produce positive results for isolates of the target serotype (data not shown). Four targets (cheW, cft, pglB-g, and p19) were positive for all C. jejuni and C. coli isolates tested and negative for C. lari. One target (csrA) was positive for all C. jejuni isolates, negative for C. lari, and variable for C. coli isolates. Eleven targets (fdxA, cmeB, Cj0659, flgS, serA, peb1A, cheB, pglB-s, pldA, Cj1357, and iamA) were positive for all C. jejuni isolates and negative for C. coli and C. lari isolates. The remaining 49 targets were detected in at least 1, and at most 57, of the 58 C. jejuni isolates studied. The risk-based binary typing system produced 47 types from 61 isolates with a diversity index (62) of 98.5%.
A subset of 18 targets was selected to yield the same discriminatory potential as the entire scheme. Their selection was based upon several criteria, including discriminatory potential among the strain set examined, position in the genome, and known or predicted function.
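The discrimination criterion alone can be illustrated with a simple greedy search. This is not the procedure used here, which also weighed genome position and gene function. The sketch below repeatedly adds whichever target most increases the number of distinct binary codes, stopping once the reduced set separates as many types as the full set. The function names and the four example profiles are hypothetical.

```python
def count_types(profiles, targets):
    """Number of distinct binary codes when only the given target columns are read."""
    return len({tuple(p[t] for t in targets) for p in profiles})


def greedy_subset(profiles):
    """Greedily pick target columns until the full set's type count is reached."""
    n_targets = len(profiles[0])
    full = count_types(profiles, range(n_targets))
    chosen = []
    while count_types(profiles, chosen) < full:
        # Add the unused target that yields the most distinct types.
        best = max(
            (t for t in range(n_targets) if t not in chosen),
            key=lambda t: count_types(profiles, chosen + [t]),
        )
        chosen.append(best)
    return chosen


# Hypothetical profiles: rows are isolates, columns are targets.
profiles = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 1, 1],
]
subset = greedy_subset(profiles)
print("Selected target columns:", subset)
```

A greedy search does not guarantee the smallest possible subset, and targets that are constant across all isolates (columns 0 and 3 in the example) are never selected because they add no discrimination.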
A dendrogram of the cluster analysis (Fig. 1) revealed nine clusters at the 67% similarity level (S-level). Cluster 1 comprised the two C. coli isolates and one C. lari isolate studied, distinguished at the species level at the 68% S-level. Cluster 2 contained 13 C. jejuni isolates, comprised of several highly related clusters assigned to CC 48, 257, 354, and 52, in addition to three sequence types yet to be assigned to a CC. Cluster 3 contained 13 isolates assigned to CC 45 and 692 (in which each of these types clustered closely with isolates unassigned to a CC) and one isolate (SVS835-770) for which MLST data were not available. Cluster 4 contained six isolates, three of which were unassigned to a CC and one each of CC 692, 1034, and 1332. Cluster 5 was represented by a single isolate of CC 42. Cluster 6 comprised three isolates unassigned to any CC. Clusters 7, 8, and 9 contained a total of 22 isolates all assigned to CC 21, but representing ST 50 (cluster 7), ST 520 (cluster 8), and ST 21, ST 43, and ST 53 (cluster 9), respectively. A cluster analysis based on the suggested 18-target P-BIT set yielded a similar topology (Fig. 2).
A minimum spanning tree was prepared for the 18-target P-BIT subset using the binary coefficient, maximum neighbor distance of 2 changes, and minimum size of 2 types. The tree is displayed in Fig. 3 with ST labels. Six clusters were revealed. Cluster 1 comprised isolates assigned to CC 48 and 21, and cluster 2 contained the ST 520 isolates from CC 21. Cluster 3 comprised isolates of CC 48, 52, 257, 354, and 828 (C. coli) and isolates that have yet to be assigned a CC. Cluster 4 contained isolates of CC 692 and one unassigned isolate. Cluster 5 contained isolates assigned to CC 45 and one isolate for which MLST was unavailable (SVS835-770). Clusters 6 and 7 contained only isolates that had not been assigned a CC.
The wide range of sources and wide variation in detection rates between target genes in this study make investigation of source bias problematic. However, chi-square analysis of the P-BIT data revealed that, at the 95% confidence level, CJE1500 was detected more frequently in poultry and wild bird isolates and less frequently in animal, human, and water samples than would be expected if isolates from all sources had an equal probability of carrying this gene (data not shown). We do not know how many of the human isolates were derived from diarrheal cases that arose from consumption of contaminated poultry but assume the proportion is not insignificant (47).
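The kind of test described above can be sketched as a Pearson chi-square calculation on a contingency table of detection by source. The counts below are invented for illustration (the study data are not shown in the text), and collapsing the sources into two groups is an assumption made only to keep the example small. With two groups the table has 1 degree of freedom, for which the 5% critical value is 3.841.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows.

    Assumes every row total and column total is nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if detection were independent of source.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts, NOT study data.
# Rows: avian sources, other sources. Columns: gene detected, not detected.
table = [
    [12, 3],
    [4, 11],
]
statistic = chi_square_statistic(table)
critical_value_df1 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05

print(f"chi-square = {statistic:.3f}")
print("Reject equal detection across sources:", statistic > critical_value_df1)
```

With small expected counts in any cell, an exact test would usually be preferred over the chi-square approximation.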
One of the isolates (SVS835-770) became nonviable before PFGE and MLST could be undertaken. In our laboratory, Penner serotyping is only available for C. jejuni and MLST is only available for C. jejuni and C. coli isolates. Three isolates were untypeable by Penner serotyping, and the remaining 55 isolates had 22 Penner serotypes and a diversity index of 92.8%. The 60 isolates analyzed by PFGE produced 38 SmaI and 52 KpnI types with diversity indices of 97.0% and 99.4%, respectively. Combining SmaI and KpnI produced 54 types and a diversity index of 99.7%. The 59 isolates analyzed by MLST produced 32 sequence types and a diversity index of 96.1%.
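The formula behind the diversity indices is not reproduced in the text. A widely used form for typing schemes is Simpson's index of diversity as applied by Hunter and Gaston, D = 1 - [1/(N(N-1))] Σ n_j(n_j-1), where N is the number of isolates and n_j the number of isolates of type j. The sketch below assumes that form; the type labels are hypothetical and do not reproduce the figures reported here.

```python
from collections import Counter


def diversity_index(type_labels):
    """Simpson's index of diversity (Hunter-Gaston form) for a list of type labels."""
    n = len(type_labels)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels)
    # Number of ordered isolate pairs that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_type_pairs / (n * (n - 1))


# Hypothetical typing result: eight isolates falling into five types.
labels = ["A", "A", "B", "C", "C", "C", "D", "E"]
index = diversity_index(labels)
print(f"Diversity index: {index:.1%}")
```

The index is the probability that two isolates drawn at random without replacement belong to different types, so it reaches 1 only when every isolate has a unique type.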
Typing systems for microorganisms may be assessed by a number of criteria to establish their fitness for purpose (68; R. J. Meinersmann, presented at Campylobacter, Helicobacter and Related Organisms, Cape Town, South Africa, 1997). Some C. jejuni strains resist digestion with certain restriction enzymes, making characterization by PFGE typing problematic (13, 18), while MLST target genes may not always provide data appropriate for the scheme (40). P-BIT relies on a code derived from positive or negative results of PCR analyses for a wide range of genes widely distributed in C. jejuni genomes and extrachromosomal elements and thus offers complete typeability. In addition, our results indicate the P-BIT approach to be more discriminatory than MLST and SmaI-based PFGE typing, another important feature. We found KpnI-based PFGE typing to be the most discriminatory of the methods used in our study, notwithstanding that not all strains prove typeable with this enzyme (18).
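To make the idea of a code derived from positive or negative PCR results concrete, the sketch below writes one isolate's results as a string of 1s and 0s in a fixed target order and reads it back. The encoding format, the function names, the target order, and the example results are all assumptions for illustration; the gene names are taken from the text, but no published P-BIT code format is implied.

```python
def encode_profile(results, target_order):
    """Return a string of 1s and 0s following a fixed, agreed target order."""
    return "".join("1" if results[target] else "0" for target in target_order)


def decode_profile(code, target_order):
    """Rebuild the presence/absence mapping from a binary code string."""
    if len(code) != len(target_order):
        raise ValueError("code length does not match the number of targets")
    return {target: char == "1" for target, char in zip(target_order, code)}


# Hypothetical target order and results for a single isolate.
target_order = ["cheW", "csrA", "tetO", "CJE1500"]
results = {"cheW": True, "csrA": True, "tetO": False, "CJE1500": True}

code = encode_profile(results, target_order)
print("Binary code:", code)
print("Decoded:", decode_profile(code, target_order))
```

Two laboratories can compare such codes only if they agree on the target list and its order, so that agreement would have to accompany any exchanged data.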
Although the more widespread use of standardized methodologies for PFGE-based typing of bacterial pathogens has proven valuable for epidemiological studies (2, 11, 18, 19, 32), determining and using normalization parameters to enable a meaningful comparison of PFGE macrorestriction profiles can be challenging. Furthermore, the PFGE apparatus is of moderate cost. The increased availability of DNA sequencing facilities has made MLST a popular option and offers excellent portability, enabling researchers to readily compare their data with results obtained worldwide. The MLST approach also offers phylogenetically meaningful data that have proven invaluable for population genetic (6, 41) and source attribution (13, 42) studies. However, while technological advances continue to reduce the cost of performing the analysis, MLST still requires substantive capital outlay for implementation and running costs can be prohibitive for small routine laboratories. Our P-BIT approach requires only the most basic molecular biology laboratory equipment for implementation (approximately one-tenth the cost of PFGE equipment, for example) and no more than e-mail for the exchange of data between laboratories. The PCRs used have been designed to enable multiplexing, to further reduce running costs. We have identified a core set of 18 gene targets that conferred the same discriminatory potential in this study as the complete range of 68 targets employed. We believe these features, together with its high discriminatory power and complete typeability, render P-BIT a worthy tool for laboratories engaged in epidemiological studies of C. jejuni. The distinctiveness of profiles obtained from the few C. coli and C. lari strains included suggests P-BIT data may also provide a basis for species identification between closely related Campylobacter species (itself a challenging task), but further work is required to substantiate this.
The P-BIT approach to typing yields a simple binary code (albeit one based on whole-genomic polymorphism) that is not suited for analysis with algorithms aimed at predicting phylogenetic relationships. The bases for determining strain relatedness with MLST and P-BIT are also very different, with the former evaluating change between housekeeping gene sequences and the latter determining, by PCR, the presence of diverse, widely distributed genes that may be subject to selective pressure. Nevertheless, some congruence between these methods was seen when comparing the interstrain relationships inferred. In particular, isolates assigned to CC 21, 45, 52, 257, and 354 formed or dominated well-defined clusters in the P-BIT analyses (Fig. 1 to 3). With the population genetic structure of C. jejuni believed to be significantly shaped through recombination events and horizontal gene transfer (7, 63), the correlation of P-BIT and MLST is perhaps not surprising, with P-BIT potentially offering additional insight into the genomic evolution of strains that may be useful for relatively uncommon MLST types or those that cannot be assigned to a clonal complex.
Subtyping using PFGE depends on the distribution of restriction sites within the organismal cellular DNA. As with P-BIT, the data are not aimed at determining strain phylogeny unless banding patterns are indistinguishable, in which case strains are assumed to be related. It has become common practice to confirm strain relationships inferred with one restriction enzyme with the use of a second (47). Despite these caveats, we compared the clustering of C. jejuni strains in a dendrogram based on SmaI and KpnI PFGE profiles (data not shown) with the P-BIT and MLST results, given the agreement between the latter two methods. The PFGE analysis agreed with some of the P-BIT and MLST results, with homogeneous clusters of ST 257, 436, and 2392 each formed at S-levels of 80%; other clusters were dominated (but not exclusive to other ST) by strains assigned to ST 137, 50, and 520 at S-levels between 70 and 92%. Five of 6 ST 53 strains also clustered together (S-level of 75%), and two highly related ST 45 strains (NZRM 4141 and SVS 4039) gave PFGE profiles determined as 90% similar. However, although all CC 21 strains formed a distinct cluster in the PFGE analysis, the infraspecific relationships between the constituent ST differed from that of the P-BIT and MLST analyses and also included the three CC 48 strains examined. Strains assigned to CC 45 and 354 were divergently distributed in the PFGE analysis. Although the discriminatory potential of PFGE typing can exceed that of both P-BIT and MLST, our results support the need for caution when using PFGE data to infer clonal relationships, for reasons described in more detail elsewhere (47).
One feature we had in mind when developing this approach was the possibility of quantifying the risk to human health from individual strains through generation of a simple binary code relating to epidemicity factors. There have been various observations of “stable clones” present in human clinical specimens and various host animals (39, 47, 48, 61) as well as whole-genomic comparative studies indicating that host-specific types are less well adapted for stress survival than those found more widely (46). The selection of initial marker genes in this study was on a metagenomic basis but focused on genes that conferred, or were implicated in, some aspect of strain virulence. Many aspects of pathogenicity in Campylobacter are poorly understood, but examination of some of the strains classified as “high risk” and “low risk” by On and colleagues (46) showed them to cluster distinctly (Fig. 1). Notably, all “high-risk” strains (NCTC11168, SVS992, SVS1099, SVS1425, SVS3141, and SVS5001) were found to belong to CC 21, a globally common type, with four of the five “low-risk” strains (SVS380-827, SVS4039, SVS72-64077, and SVS835-770) appearing in related clusters 3 and 4 (Fig. 1). We recognize that our P-BIT system uses PCR primers that are designed to amplify specific sequence orientations of the target genes and that genes may be present in particular strains with different flanking regions that would therefore not be detected. Thus, P-BIT data provide indicative but not absolute data regarding the presence or absence of a gene. Nevertheless, for the establishment of a simple, inexpensive, and effective typing method, we feel this is an acceptable compromise. For genes carried on plasmids, there is a risk in long-term studies that strains will be cured of such extrachromosomal DNA and thereby lose a specific marker.
This appears to be the case with one strain examined here, 81-176 (RM1864), in which the plasmid-borne tetO gene conferring tetracycline resistance was not detected. Our example of this strain was a kind gift from colleagues, and we have no information on the number of passages it may have experienced before it was received. However, the isolate's first description was in 1988 in another laboratory, and it is likely to have been subcultured many times before we received it, allowing ample opportunity for the plasmid to be cured. We examined a heated lysate of 81-176 to ascertain if the result may have been due to the DNA preparation method used being suited for genomic DNA extraction, but still obtained no tetO-derived amplicon (data not shown). In the context of the P-BIT system, we do not consider the inclusion of tetO inappropriate since it represents an important virulence marker (found in two of the strains examined here: see Fig. 2) and potentially valuable for short-term epidemiological studies such as outbreaks, where plasmid loss is unlikely to occur. Further studies of the relationship between P-BIT profile, stress survival, and strain type frequency in human populations are also warranted.
We conclude that P-BIT is a useful approach for subtyping, offering advantages of speed, cost, and potential for strain risk ranking unavailable elsewhere. We hope the wider scientific community will be encouraged to investigate further its usage in national and global surveillance studies and outbreak investigations.
This project was funded by the New Zealand Ministry of Research, Science and Technology through an ESR-administered Capability Fund project.
We acknowledge Aruni Premaratane and Stephanie Brandt for technical assistance and Beverley Horn for statistical assistance. We also thank the anonymous reviewers for constructive comments on the manuscript.
Published ahead of print on 18 December 2009.