(Xf) causes economically important diseases in grapevine, citrus and many other plant species. Foremost among these are Pierce's disease (PD) of grapevine, citrus variegated chlorosis (CVC), and leaf scorch diseases in almond (ALS) and oleander (OLS) [1
]. Due to its potential threat to US agriculture, X. fastidiosa
(CVC strain) is included in the Federal government's select agent list [4
]. The CVC strain (9a5c) was the first plant pathogenic bacterium whose genome was completely sequenced [5
]. This was followed by publication of draft sequences of the genomes of Dixon (almond) and Ann1 (oleander) [6
] and the complete sequence of the genome of the PD-associated Temecula-1 strain [7
]. To date, the published research cited above has emphasized functional reconstruction and the deciphering of metabolic pathways.
Comparative genome sequencing of bacteria is a powerful means of detecting sequence diversity among closely related but distinct populations. Comparative whole-genome information about strain-specific DNA variation will have important implications for the development of new molecular markers for detection, pathovar classification and disease epidemiology, and for understanding evolutionary relationships. Using the whole-genome sequences of four X. fastidiosa strains, we analyzed genome-wide DNA-based variations that are presumably critical to strain divergence, host specificity and pathogenicity.
Currently, genetic variation is assessed with markers such as the 16S-23S rRNA spacer region [8
], simple sequence repeat markers or variable number of tandem repeats (VNTRs) that are capable of differentiating among, and within, host-associated strains [9
]. However, DNA-based variations in the coding and non-coding regions, as well as single nucleotide polymorphisms (SNPs) and insertions/deletions (INDELs) of one to several hundred base pairs, have not yet been studied. Such information is extremely valuable for understanding the epidemiology of this bacterium, which has specific host preferences and pathogenicity [11
]. In nature, pathogen populations with high genetic diversity have high evolutionary potential and thus are more likely to overcome host genetic resistance than pathogen populations with low genetic diversity. The resulting changes in population structure or virulence can lead to resistance breakdown. This is particularly true in agricultural production systems in which monoculture is a common practice. Under these conditions, the frequency of pathogen genotypes with increased virulence may rise and ultimately lead to resistance breakdown and increased disease incidence. Therefore, genomic information on coding and non-coding polymorphic loci will help in linking variability in the pathogenicity of different strains to differences in their genetic backgrounds, and in monitoring changes in their genetic diversity.
INDELs are important events in establishing genomic variations between similar strains [12
]. INDELs form through numerous mechanisms, such as DNA recombination, expansion of repetitive DNA sequences and insertion sequence (IS)-mediated events. INDELs serve as reliable signature sequences and have a definite advantage over traditional phylogenetic analyses based on gene or protein sequences: the traditional analyses derive phylogenetic relationships by assuming a constant mutation frequency, an assumption that does not hold over long periods of time and can lead to incorrect species relationships [13
]. On the other hand, conserved INDELs of defined sizes are not greatly affected by such differences in evolutionary rates [14
]. Among bacterial species, INDELs have been identified as the principal source of genome variability in Mycobacterium tuberculosis
Another important factor contributing to genomic variation is the occurrence of single nucleotide polymorphisms (SNPs). SNPs have an extremely low mutation frequency and are less prone to homoplasy than other molecular markers, making them extremely valuable for phylogenetic analyses. SNPs have been used effectively to infer the evolutionary relationships of Bacillus anthracis
, the causative organism of anthrax, which shows extremely low strain variability [16
]. That study used a total of 990 genome-wide SNP markers. Recently, SNPs proved invaluable in tracing the worldwide spread of pathogenic Mycobacterium leprae
, the causative organism of leprosy [17
]. Apart from phylogenetic analysis, SNPs have served as functional tools in linking DNA variations in the promoter of the nitrate reductase gene cluster narGHJI to the observed differences in the nitrate reductase activity of M. tuberculosis
and M. bovis
] and in showing a link between DNA variability in the gyrA
gene and the resistance of Salmonella enterica
strains to quinolones [19
There are several means by which bacteria can acquire genes: conjugal transfer, phage-mediated insertion and the uptake of native DNA from outside sources [20
]. While not all introduced genes are retained, there are numerous instances in which stable introductions have been shown to play a pivotal role in the evolution of niche-adaptive and pathogenic characteristics of bacterial species, and thus greatly influence inter-strain differences in gene complement [20
]. In certain instances, 10–20% of the genes are estimated to have been laterally transferred [24
]. Xenologues have previously been identified using criteria such as G+C content variation (the standard method), codon usage bias and differences in amino acid usage [25
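As an illustration of the G+C content criterion mentioned above, the following is a minimal sketch, not the method used in this study. It flags genes whose G+C fraction deviates from the pooled G+C fraction of all supplied genes. The function names, the toy sequences and the deviation threshold are hypothetical and chosen only for demonstration.

```python
from typing import Dict, List


def gc_content(seq: str) -> float:
    """Return the G+C fraction of a sequence, ignoring ambiguous bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def flag_atypical_genes(genes: Dict[str, str], threshold: float = 0.10) -> List[str]:
    """Return names of genes whose G+C fraction differs from the pooled
    G+C fraction of all genes by more than `threshold`.

    The default threshold is an arbitrary assumption for illustration.
    """
    pooled_gc = gc_content("".join(genes.values()))
    return [
        name
        for name, seq in genes.items()
        if abs(gc_content(seq) - pooled_gc) > threshold
    ]


if __name__ == "__main__":
    # Toy sequences (hypothetical): three genes near 50% G+C, one at 100%.
    toy_genes = {
        "gene1": "ATGC" * 10,
        "gene2": "AATTGGCC" * 5,
        "gene3": "TACG" * 10,
        "gene4": "GGGGCCCCGC" * 4,
    }
    print(flag_atypical_genes(toy_genes, threshold=0.2))
```

In practice a single compositional criterion is rarely decisive, which is why the text lists codon usage bias and amino acid usage as complementary criteria.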
The present study was undertaken to identify and characterize macro-level (presence or absence), medium-level (tandem repeat variations) and micro-level (SNPs and INDELs) differences in the coding and non-coding regions of the four published X. fastidiosa strain genomes that may contribute to disease development, and to support the development of improved pathogen diagnostic and epidemiological tools. The results of this study are available through our database.
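To make the micro-level category concrete, the sketch below shows one simple way SNPs and INDELs could be tallied from a pairwise alignment of two strains. It is an illustrative assumption, not the pipeline used in this study: the function name and the toy alignment are hypothetical, and gaps are assumed to be written as "-".

```python
from typing import List, Tuple


def call_variants(
    ref_aln: str, qry_aln: str
) -> Tuple[List[Tuple[int, str, str]], List[Tuple[int, int, str]]]:
    """Scan two gapped, equal-length aligned sequences.

    Returns (snps, indels):
      snps   -> (alignment column, reference base, query base)
      indels -> (start column, length, "insertion" or "deletion"),
                where the event type is relative to the reference.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    ref_aln, qry_aln = ref_aln.upper(), qry_aln.upper()

    snps: List[Tuple[int, str, str]] = []
    indels: List[Tuple[int, int, str]] = []
    i, n = 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i], qry_aln[i]
        if r == "-" and q == "-":
            i += 1  # column with gaps in both sequences carries no signal
        elif r == "-" or q == "-":
            gap_in_ref = r == "-"
            start = i
            # Extend over the contiguous run of gaps on the same side.
            while (
                i < n
                and (ref_aln[i] == "-") == gap_in_ref
                and (qry_aln[i] == "-") == (not gap_in_ref)
            ):
                i += 1
            kind = "insertion" if gap_in_ref else "deletion"
            indels.append((start, i - start, kind))
        else:
            if r != q:
                snps.append((i, r, q))
            i += 1
    return snps, indels


if __name__ == "__main__":
    # Toy alignment (hypothetical): one SNP, one 2-bp insertion, one 1-bp deletion.
    snps, indels = call_variants("ATGC--GTA", "ATTCAAG-A")
    print(snps)
    print(indels)
```

Counting each contiguous gap run as a single event, rather than counting gap columns, matches the description of INDELs as events of one to several hundred base pairs.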