C. coli and C. jejuni are closely related bacterial species that cause a large number of clinical cases of gastroenteritis worldwide. It is therefore important to understand their molecular epidemiology and evolution. Central to this goal is a reliable isolate-typing approach that yields data for all strains and allows comparison among laboratories and over time. An MLST scheme was developed for C. coli by extending an existing scheme for C. jejuni (3) and was validated with 68 C. coli isolates from different sources, locations, and years and of different serotypes (Table ). MLST of all isolates tested confirmed the conservation of the primer binding sites. In addition, SVR sequences from the flaA genes were obtained. A high level of resolution was achieved, indicating the suitability of the approach for investigation of the molecular epidemiology of C. coli.
Although a diverse range of C. coli isolates was examined, the species showed less diversity than C. jejuni at each of the MLST loci. This agrees with findings obtained by amplified fragment length polymorphism (AFLP) analysis, in which C. coli strains from poultry were less variable than C. jejuni strains (6). However, the identification of a single divergent C. coli isolate (ST-868) suggests that greater diversity may exist that has not yet been sampled. Both species contained similar levels of sequence diversity within the 321 bp of the flaA SVRs.
The results of biochemical tests used to distinguish C. coli and C. jejuni can be ambiguous. Hybridization, multilocus enzyme electrophoresis, AFLP, and fluorescent AFLP studies have confirmed that they are separate species with 22 to 49% homology (1, 6, 14, 15). As expected, the nucleotide sequences of the MLST loci segregated according to microbiological species (Fig. ), confirmed by 300 fixed nucleotide differences and a high FST value of 0.93170. The two species were closely related, sharing approximately 86.5% identity at the nucleotide sequence level, and the level of identity rose to approximately 95.0% at the amino acid sequence level.
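The two statistics cited above, fixed nucleotide differences and FST, can be illustrated with a minimal sketch. It assumes that a fixed difference is an alignment column in which the two groups share no nucleotide, and it uses one common sequence-based estimator of the form 1 - Hw/Hb (mean within-group pairwise difference over mean between-group pairwise difference). The text does not state which software or estimator produced the reported values, so the function names and toy sequences here are hypothetical and are not expected to reproduce them.

```python
from itertools import combinations


def fixed_differences(pop_a, pop_b):
    """Count aligned columns where the two groups share no nucleotide."""
    count = 0
    for i in range(len(pop_a[0])):
        bases_a = {seq[i] for seq in pop_a}
        bases_b = {seq[i] for seq in pop_b}
        if bases_a.isdisjoint(bases_b):
            count += 1
    return count


def mean_pairwise_diff(pairs):
    """Mean number of mismatched positions over a list of sequence pairs."""
    diffs = [sum(x != y for x, y in zip(s, t)) for s, t in pairs]
    return sum(diffs) / len(diffs)


def fst(pop_a, pop_b):
    """Assumed estimator: 1 - Hw/Hb (each group needs at least 2 sequences)."""
    h_w = (mean_pairwise_diff(list(combinations(pop_a, 2)))
           + mean_pairwise_diff(list(combinations(pop_b, 2)))) / 2
    h_b = mean_pairwise_diff([(s, t) for s in pop_a for t in pop_b])
    return 1 - h_w / h_b


# Toy aligned sequences (hypothetical, not data from the study).
group_a = ["AAAAAA", "AAAAAT"]
group_b = ["CCAAAA", "CCAAAT"]
print(fixed_differences(group_a, group_b))  # 2
print(round(fst(group_a, group_b), 3))      # 0.6
```

In the toy data the first two columns differ completely between groups (fixed differences), whereas the last column is polymorphic within both groups and therefore contributes to Hw rather than to the fixed-difference count.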
In contrast, no evidence of segregation by species was detected within the C. jejuni or C. coli SVR sequences (Fig. ). No fixed nucleotide differences were detected in this locus (Table ), with SVR allele 16 found in both species. The low FST value of 0.04876 also implies that these sequences represent a single population. These observations agree with those of a previous study (2) in which flaA typing, conducted by enzyme digestion of a PCR product, could not distinguish C. coli from C. jejuni. Both C. coli and C. jejuni are naturally competent to take up DNA. Both intragenomic recombination and intergenomic recombination have been demonstrated within the flagellin locus of C. jejuni (13). Thus, frequent interspecies recombination appears to explain the common gene pool for flaA shared by these species. These data indicate the unsuitability of the flaA SVR (when used alone) as a marker for the molecular epidemiology of C. coli and C. jejuni. However, the diversity of this locus can allow closely related strains with the same ST to be distinguished.
Unambiguous evidence of interspecies recombination within the housekeeping genes was confined to one C. jejuni isolate which contained a C. coli sequence in one of seven loci (ST-61) (Fig. ). This genotype has now been described in many isolates by multiple laboratories (see the database at http://pubmlst.org/campylobacter/). C. coli ST-868 may have a history of recombination with C. jejuni, since 19 of 441 polymorphic sites within this genotype were characteristic of C. jejuni (Fig. ) and the STRUCTURE program assigns C. jejuni ancestry to five short runs of its DNA. These sequences could have entered the C. coli population by recombination of larger gene fragments (similar to that observed in ST-61), followed by extensive recombination within other C. coli strains, which could have resulted in the observation of only short fragments in the extant population.
However, there is an alternative phylogenetic explanation for why ST-868 shares nucleotides with C. jejuni: the shared sequence may represent the ancestral state, with the mutations observed in the remaining C. coli isolates occurring after the divergence of ST-868. This explanation predicts that the shared nucleotides will be distributed at random among the polymorphisms that distinguish C. coli and C. jejuni. Recombination, on the other hand, would lead to a nonrandom distribution, with adjacent polymorphic nucleotides giving the same ancestral signal.
The distribution of shared nucleotides showed some evidence of being nonrandom. On average, 0.8 runs [(19/441) × (19/441) × 443] of two adjacent such polymorphisms would be expected. Five adjacent pairs of polymorphisms were observed (the single run of three counts as two pairs), an occurrence with a probability of 0.0004. However, the evidence is weakened by the fact that two of the pairs of changes in tkt cause amino acid changes. These nucleotide changes may have occurred in quick succession due to natural selection, which provides an alternative explanation to recombination for their clustering on the chromosome. Thus, while the import of nucleotides from the housekeeping genes of C. jejuni to C. coli seems likely, it is not proven by the present data, and the contributions of recombination and mutation to the divergence of ST-868 from the other C. coli strains remain unknown.
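The expected number of adjacent pairs quoted above can be checked with a few lines of arithmetic. The sketch below assumes that each polymorphic site independently carries a shared nucleotide with probability 19/441 and takes the multiplier of 443 directly from the text. The function names and the example flag vector are hypothetical. The sketch does not attempt to reproduce the reported probability of 0.0004, because the test used to obtain it is not stated.

```python
def expected_adjacent_pairs(shared, polymorphic, positions):
    """Expected count of adjacent shared pairs if sites are independent."""
    p = shared / polymorphic
    return p * p * positions


def count_adjacent_pairs(flags):
    """Count neighbouring sites that are both shared.

    A run of three shared sites contributes two pairs, matching the
    counting rule described in the text.
    """
    return sum(1 for a, b in zip(flags, flags[1:]) if a and b)


# Values taken from the text: 19 shared sites among 441 polymorphic sites.
print(round(expected_adjacent_pairs(19, 441, 443), 1))  # 0.8

# Hypothetical flag vector: one run of three and one run of two.
print(count_adjacent_pairs([1, 1, 1, 0, 1, 1]))  # 3
```

Under the random (ancestral-state) explanation, fewer than one adjacent pair is expected, so the five observed pairs are what suggests clustering.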
Two previous fluorescent AFLP studies have indicated that different C. coli strains are associated with particular animal hosts or environmental sources (15, 17). In the present study, the relationship between C. coli genotype and isolation source was examined by using a radial neighbor-joining tree (Fig. ), but no clustering of the STs by source was apparent. However, chickens and pigs located on the same farm were colonized with different STs (Table ); this may indicate a host preference by certain C. coli genotypes, and analysis of further isolates by MLST may clarify this issue.
C. coli MLST and flaA SVR sequencing provide sufficient resolution to be useful in future studies of isolates from cases of human disease and of the potential sources of human infection. As all MLST data are directly comparable, this approach could allow accurate assessment of the contributions that different infection sources make to the burden of human disease. The inability to distinguish the C. coli and C. jejuni species by use of the flaA SVR sequence calls into question the use of the flaA locus alone in any method aimed at studying the epidemiology of the diseases caused by these organisms. Further extension of this MLST scheme to additional Campylobacter species will aid in providing an understanding of their evolutionary relationships.