Nearly 30 distinct bacterial blight resistance genes from different rice varieties and wild relatives have been identified, and many have been used in breeding programs for disease control [7], but in several instances resistance has broken down as new, virulent strains of Xoo have emerged [12, 40-42]. Understanding the mechanisms that account for the rapid emergence of new pathogen genotypes and identifying Xoo genes involved in pathogenic adaptation are important goals toward developing durable disease control strategies. The complete genome sequence of strain PXO99A and its comparison to two previously sequenced strains, KACC10331 and MAFF311018, presented here provide new insights that advance these goals.
Because MAFF and KACC are highly similar in genome content and organization, our comparative analysis focused largely on PXO99A and MAFF. This analysis revealed a remarkable plasticity of the Xoo genome, most strikingly evident in the large number of major rearrangements and indels between these strains. On a smaller scale, the inventories of TAL effector genes differ between PXO99A and MAFF. In addition, a number of indels represent genes shared by both strains but present in higher copy number in PXO99A, including several IS elements. All of these differences suggest that the Xoo genome evolves rapidly. This conclusion is perhaps best supported, however, by the 212 kb sequence duplication in PXO99A, which we discovered using a new and powerful application, the Hawkeye assembly diagnostics tool, and confirmed by PCR amplification of the repeat junction. The duplication represents a remarkably recent event, with only a single nucleotide difference between the two copies in PXO99A.
Gene duplication contributes to gene diversification, allowing for unconstrained evolution of otherwise indispensable sequences. The abundance of duplications in PXO99A suggests that they are an important source of genomic variation for Xoo. As made clear by analysis of the 212 kb repeat, IS elements play an important role in generating duplications. They can also generate other types of genome modification, including rearrangements and inversions, and insertions or deletions that can lead to acquisition, modification, or loss of gene content [20]. Indeed, 7 out of 10 of the major rearrangements in the PXO99A genome relative to MAFF are associated with IS elements. The presence of ISXo5 at both ends of the 38.8 kb locus containing the non-fimbrial adhesin-like genes in PXO99A, compared with its presence in single copy in place of this locus in MAFF and KACC, provides a clear example of an IS-mediated genome modification that resulted either in an excision (from the MAFF and KACC lineage) or in an integration of DNA (in the PXO99A lineage). Our analysis also highlights an important role for phage as a source of genomic variation for Xoo. The PXO99A sequence revealed numerous differences from MAFF related to phage integration, including the presence of genes that clearly originated in distantly related organisms. The TAL effector genes, a particularly interesting characteristic of the Xoo genomes, are yet another template for genome modification. Because these genes encode virulence factors and triggers of host resistance, differences in TAL effector gene content have long been associated with phenotypic diversity. Comparison of MAFF and PXO99A provided clear evidence that homologous recombination among these genes generates differences in their structure and copy number at genomic locations that are otherwise conserved, indicating that the sequences themselves play a major role in generating that diversity.
Included among the 19 TAL effector genes in PXO99A are pthXo1, a major virulence determinant not present in other strains [29], and avrXa27, a cultivar specificity determinant [16]. There is also evidence that the TAL effector gene pthXo7 is important in the virulence of PXO99A on plants containing the recessive resistance gene xa5 [14, 32]. Significantly, xa5 is prevalent among the Aus-Boro lines of rice, which originated in Nepal and Bangladesh, the geographical region that likely gave rise to PXO99 [13]. These and other observations firmly establish a role for TAL effector genes in strain-specific adaptation. The differences in TAL effector gene content and structure between the geographically distinct strains PXO99A and MAFF further underscore this role and the importance of understanding the diversity of TAL effector functions.
The non-fimbrial adhesin-like genes fhaB, fhaB1, and fhaX and the transport gene fhaC, which we discovered at the 38.8 kb locus in PXO99A that is missing in MAFF and KACC, are additional intriguing candidates for adaptations to certain host genotypes or environmental conditions. Homologs of fhaB and fhaC are present in a number of plant and animal pathogenic bacteria [43]. MAFF and KACC encode other non-fimbrial adhesins, which are also present and highly conserved in PXO99A. Thus, it seems likely that the fha genes are not essential pathogenicity factors in PXO99A. However, mutational analysis might reveal a quantitative effect on virulence, or a differential effect in certain rice varieties or under different temperatures. Other proteins encoded at the locus that are of interest from the perspective of host-pathogen interactions include a putative ice nucleation protein and a putative colicin with an associated transporter protein.
Complete genome sequences are available for a number of members of other Xanthomonas species, including X. campestris pv. campestris, the causal agent of black rot in crucifers [44, 45], X. axonopodis pv. citri, which causes citrus canker, and X. campestris pv. vesicatoria, which is responsible for bacterial spot in tomato and pepper plants [46]. Whole genome alignments revealed several inversions, indels, and rearrangements in these genomes relative to one another [46]. Thus, the genus as a whole shows a high degree of genomic variation. Even in this context, however, the differences uncovered here in the structure and content of the PXO99A genome versus the MAFF and KACC genomes are striking. Notably, Xoo strains contain the greatest number and diversity of IS elements of all the sequenced xanthomonads, and the size of the CRISPRs in the strains discussed here suggests a long history of interaction with phage.
X. oryzae strains are also unusual in their abundance of TAL effector genes. None of the other sequenced Xanthomonas strains have more than four TAL effector genes, and some have none. Though a comprehensive survey has not been done, large numbers of TAL effector genes are known to exist elsewhere only in strains of X. campestris pv. malvacearum, a pathogen of another ancient and genetically diverse domesticated crop plant, cotton [47], and, curiously, in Xanthomonas strains that infect mango [48]. It is tempting to speculate that, for X. oryzae, the diversification of its host through millennia of cultivation around the world favored an amplification of elements in the pathogen that confer genome plasticity and adaptability, including IS elements, phage, and the repeat-dominated TAL effector genes.
It is interesting that, in contrast to that of the East Asian MAFF and KACC strains, the ancestry of PXO99A is likely centered in South Asia [13], one of at least three probable sites of domestication of rice [6]. As described here, PXO99A has a larger genome and a greater number of strain-specific genes than its close relatives MAFF and KACC. This greater size and complexity may be a consequence of the strain having derived from a lineage that evolved near a center of origin for its host, a location that would be expected to have a greater diversity of host genotypes than others.