Analysis of CRISPR spacer sequences and patterns revealed considerably more genetic diversity in
E. amylovora than had been known previously. CRISPR genotyping enabled the differentiation of strains that were shown in previous studies to be contained within the same ribotype, pulsed-field gel electrophoresis (PFGE) group, and
groEL sequence group
[6],
[8],
[27].
E. amylovora has been considered a fairly homogeneous species with a low level of genetic diversity although there are obvious differences between genomes of strains isolated from apple and strains isolated from
Rubus sp.
[13]. This apparent homogeneity has been attributed to a hypothesized recent evolutionary bottleneck associated with the colonization of apple hosts in North America in the 1700s. The previous host(s) of the progenitor strain(s) that first infected apple and pear are unknown, and, to our knowledge, a comprehensive phylogenetic analysis of
E. amylovora including a large number of strains isolated in North America from wild Rosaceae hosts has not been done. While the CRISPR locus is not useful for phylogenetic analyses
[41], we and others
[42] have shown that CRISPR spacer array genotyping is a potential tool that could be used to identify progenitor strain(s) of
E. amylovora currently infecting apple and pear. In this study, the
E. amylovora strains isolated from loquat were most closely related, in terms of CRISPR spacer content, to apple and pear strains.
CRISPR spacer sequences are thought to provide a historical context of mobile sequences an organism encounters because individual spacers are inserted at the same position, adding on to the existing spacer assembly
[36]. For most of its life cycle,
E. amylovora is believed to be present within the interior of plants with significant epiphytic growth only occurring on the stigma surfaces of flowers
[2]. However, the recent identification of
E. amylovora pathogenicity island sequences suggestive of functioning in association with insect hosts has broadened the habitats in which this bacterium may dwell
[43]. Thus, the diversity of
E. amylovora habitats (plant and insect) could potentially broaden the ecological context of CRISPR spacer evolution through exposure to distinct microbiomes.
Fire blight disease and
E. amylovora are known to have spread from North America to New Zealand in the 1910s, to Europe in the 1950s, and subsequently to the Middle East
[44],
[45]. Our results of strain genotyping based on CRISPR spacer content lead us to hypothesize that an
E. amylovora strain(s) from the eastern U.S. is the likely source of fire blight disease spread into New Zealand and Europe. This hypothesis rests on the observation that the most prevalent eastern U.S. genotype differs from the genotypes observed in European and New Zealand strains only by a small set of deleted spacers ( and ). Spacer deletion is thought to be a common route to CRISPR genotypic differentiation
[46]. Based on previous work using PFGE analyses, Jock and Geider
[45] concluded that
E. amylovora was originally spread from North America to England, and from there likely throughout Europe. PFGE results also suggested that
E. amylovora was not repeatedly introduced from North America; rather, only a few strains, in effect a bottleneck, were associated with the spread of
E. amylovora to Europe
[6]. Our CRISPR spacer data, and those of a previous study
[42], corroborate these results, as relatively little diversity is observed among European
E. amylovora strains while North American strains contain a higher level of diversity.
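The deletion-based reasoning above can be illustrated with a minimal sketch. It assumes that each CRISPR array is represented as an ordered list of spacer identifiers, that spacers are unique within an array, and that spacer order is preserved when spacers are lost. The function names and the spacer identifiers are hypothetical and are not taken from this study.

```python
def is_deletion_derivative(ancestor, derived):
    """Return True if `derived` can be obtained from `ancestor`
    by deleting spacers only, with the remaining order preserved."""
    remaining = iter(ancestor)
    # `spacer in remaining` advances the iterator up to the match,
    # so this is an ordered-subsequence test.
    return all(spacer in remaining for spacer in derived)


def deleted_spacers(ancestor, derived):
    """List the spacers of `ancestor` that are absent from `derived`.

    Raises ValueError if `derived` is not a pure deletion derivative.
    """
    if not is_deletion_derivative(ancestor, derived):
        raise ValueError("derived array is not a deletion derivative")
    kept = set(derived)
    return [spacer for spacer in ancestor if spacer not in kept]


# Hypothetical arrays: a putative source genotype and a genotype
# differing from it by the loss of two internal spacers.
source_array = ["s1", "s2", "s3", "s4", "s5", "s6"]
derived_array = ["s1", "s2", "s5", "s6"]

print(is_deletion_derivative(source_array, derived_array))
print(deleted_spacers(source_array, derived_array))
```

A genotype that carries spacers absent from the putative source, or carries shared spacers in a different order, fails the test and would require events other than deletion to explain.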
Our current results also suggest that the sources of E. amylovora strains initially infecting apple and pear in the eastern and western U.S. could be distinct. E. amylovora WSDA 16, 87–70, and 87–73 are the only western U.S. strains with CRISPR genotypes similar to those of eastern U.S. strains (). It seems likely that these strains were transported from the eastern U.S. to the western U.S. through human activity, such as the movement of contaminated nursery stock.
We observed a clear delineation in spacer content and diversity in the CRISPR 1 and 2 arrays in the
E. amylovora strains studied, based on geographical location of isolation and plant host. This differentiation of spacer content lends credence to the hypothesis that CRISPR spacers reflect a geographic component of host interactions and an environmental niche component
[32],
[36],
[39]. Geographic differentiation within CRISPR loci has been used previously to reconstruct the routes of transmission of
Yersinia pestis strains from natural plague foci in China
[47] and to make inferences about viral biogeography, host-virus interactions, and genome dynamics in
Sulfolobus islandicus
[48]. For the most part,
E. amylovora apple strains from the western U.S. harbored a completely distinct set of CRISPR spacers compared to corresponding strains from the eastern U.S. and other continents, and many of the spacers carried by the western strains targeted pEU30, a plasmid that is exclusively found in a subset of western U.S. apple strains
[20].
Differentiation of CRISPR spacer content based on host of isolation adds to the possibility of an environmental niche component affecting the evolution of CRISPR loci. The
E. amylovora strains isolated from
Rubus are readily differentiable from apple strains by various typing methods, and this has been confirmed at the genome sequence level
[13]. These strains are also differentiated by host specificity in that
Rubus strains are not pathogenic on apple or pear
[49],
[50]. We found that these strains are also clearly distinct in terms of CRISPR genotype. Our results, along with the clear phylogenetic distinction of
E. amylovora strains isolated from apple and from
Rubus
[8],
[12], suggest that these strains have been evolving in isolation from each other for an evolutionarily long period of time. Our results are also similar to those observed in
E. coli: when phylogenetic distance is small, a high relatedness of spacer repertoire is observed
[41]. As phylogenetic distance increases, spacer relatedness decreases. However, a second aspect of the
E. coli analyses indicated that spacer repertoire relatedness among strains followed one of two paths: the spacer content was either nearly identical or completely different
[41]. This radical replacement or replenishment of spacers with unique spacers was interpreted to indicate that turnover of spacers is not gradual
[41]. Our results with
E. amylovora corroborate these previous observations with
E. coli.
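One simple way to quantify spacer repertoire relatedness between two strains is the fraction of shared spacers. The sketch below uses the Jaccard index, which is an assumption made for illustration and not necessarily the measure used in [41] or in this study. The spacer identifiers are hypothetical.

```python
def spacer_relatedness(array_a, array_b):
    """Jaccard index of two spacer repertoires:
    shared spacers divided by all distinct spacers in either array."""
    a, b = set(array_a), set(array_b)
    if not a and not b:
        return 1.0  # two empty repertoires are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical repertoires illustrating the two outcomes described above:
# nearly identical content versus completely different content.
strain_1 = ["s1", "s2", "s3", "s4"]
strain_2 = ["s1", "s2", "s3"]        # differs by one spacer
strain_3 = ["t1", "t2", "t3", "t4"]  # no spacers in common

print(spacer_relatedness(strain_1, strain_2))
print(spacer_relatedness(strain_1, strain_3))
```

Under a non-gradual turnover model, pairwise values computed this way would be expected to cluster near 1 or near 0, with few intermediate values.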
The utility of CRISPR sequences for strain tracking on a local level was demonstrated in this study, as we detected similar CRISPR genotypes in Michigan populations of SmS E. amylovora and in corresponding SmR strains that had either acquired Tn5393 or were spontaneous SmR mutants. Thus, CRISPR analysis was more sensitive than comparative groEL sequencing or ribotyping, which were used previously in an attempt to differentiate these strains
[3],
[27]. These results are important in that they suggest that SmR E. amylovora populations in Michigan evolved from indigenous populations, which in turn suggests that the resistance has arisen in locally adapted genotypic backgrounds.
The diversity and distribution of plasmid sequences inhabiting
E. amylovora has received increased attention in recent years as researchers attempt to define the pan-genome of this species
[21],
[24],
[51],
[52]. Identification of 95 spacers targeting plasmids found in
Erwinia spp. in this study provides evidence of prior interactions and attempts to avoid incursions of specific plasmids during the life history of these strains. Of particular interest are spacers targeting plasmids reported from the epiphytic organisms
E. billingiae and
E. tasmaniensis and other related pathogenic species
E. pyrifoliae and
Erwinia sp. isolated in Japan (). Our observations suggest either interactions of
E. amylovora with these other species or mobility of targeted plasmids into
E. amylovora at points during the life history of this pathogen. Another question arising from our analyses is why pEU30 is targeted by so many spacers. Analysis of the complete sequence of pEU30
[20] suggested that the plasmid is relatively innocuous; aside from a
virB-type system encoding conjugation machinery, the plasmid does not encode any known genes of ecological or pathogenic importance. The lack of genes conferring a fitness benefit might be the very reason that pEU30 is frequently targeted for elimination. In addition, we found that three strains that harbored pEU30 also contained CRISPR spacers targeting the plasmid. Since it is known that 100% nucleotide identity is required for sequence elimination by the CRISPR system
[34], this could be an example of a plasmid-bacterial host “arms race” in which the plasmid has evolved through mutation to escape CRISPR surveillance. An alternate hypothesis is that self-targeting CRISPRs are involved in gene regulation; however, a recent comprehensive analysis suggested that self-targeting is more likely a consequence of autoimmunity
[53].
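The distinction between a perfectly matching spacer and one whose target has mutated can be checked with a short sketch. It reports the fewest mismatches between a spacer and any same-length window on either strand of a target sequence; a value of 0 corresponds to 100% identity. The sketch treats the target as linear (it ignores the wrap-around of a circular plasmid), assumes sequences contain only A, C, G, and T, and does not consider flanking motifs. The function names and all sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def min_mismatches(spacer, target):
    """Fewest mismatches between `spacer` and any same-length window
    on either strand of `target` (target treated as linear)."""
    spacer, target = spacer.upper(), target.upper()
    n = len(spacer)
    if n == 0 or n > len(target):
        raise ValueError("spacer must be non-empty and no longer than target")
    best = n
    # Scanning the target with the spacer and with its reverse complement
    # covers both strands of the target.
    for query in (spacer, reverse_complement(spacer)):
        for start in range(len(target) - n + 1):
            window = target[start:start + n]
            mismatches = sum(x != y for x, y in zip(query, window))
            best = min(best, mismatches)
    return best


# Hypothetical toy sequences, far shorter than real spacers.
plasmid_fragment = "TTACGTTT"
print(min_mismatches("ACGT", plasmid_fragment))  # perfect match expected
print(min_mismatches("ACGA", plasmid_fragment))  # one substitution away
```

A strain that carries both the plasmid and a spacer with a nonzero minimum mismatch count against it would be consistent with the escape-by-mutation scenario, although a real analysis would rely on a dedicated alignment tool.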
In a recent study with
E. coli in which 926 unique spacer sequences were identified, none of these were found to match any known sequenced enterophages
[39]. This discontinuity between CRISPR sequences and bacteriophage sequences could be due to the low availability of phage sequences compared to phage environmental diversity. We identified 22 spacers targeting known phage sequences, and most of these targeted phage ΦEt88, which was previously identified in
E. tasmaniensis
[14]. Our results could also be due to the low availability of phage sequences or to the lack of encounters between the
E. amylovora examined in this study and these characterized phages. The potential of phage deployment for fire blight disease management has been assessed by several groups
[28],
[30],
[54],
[55]. Since the sensitivity or resistance to infection by specific phage can be affected by genes in addition to the CRISPR loci, much more information would be necessary to predict the sensitivity of
E. amylovora strains to phage under development for fire blight control.
In summary, we characterized CRISPR spacer diversity among 85 E. amylovora strains and found that this locus is robust for differentiating genotypes. CRISPR analysis could be particularly useful for strain tracking on a local and possibly a regional level. In addition, the almost completely distinct composition of CRISPR arrays between E. amylovora strains isolated in the eastern and western U.S. suggests that there may have been multiple introductions of this pathogen from native Rosaceae hosts to the apple and pear hosts brought to and transported across North America by European settlers.