Erwinia amylovora, a plant-associated member of the Enterobacteriaceae, causes fire blight, a devastating disease of rosaceous plants, especially pear and apple (6). The complete genome of Ea273 (ATCC 49946), a virulent strain isolated from an infected apple tree in New York State, was sequenced. Total DNA was extracted and prepared in pMAQ1 shotgun libraries. The complete shotgun sequence was obtained by using dye terminator chemistry in ABI 3730 automated sequencers and contains 88,457 reads (11.12-fold coverage), yielding a theoretical coverage of the genome of 99.99%. The sequence was assembled, finished, and annotated as described previously (1), using Artemis (4) to collate data and facilitate annotation.
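The relationship between fold coverage and theoretical genome coverage can be sketched with the Lander-Waterman (Poisson) model, in which the expected fraction of bases covered by at least one read at fold coverage c is 1 - e^(-c). The text does not state which model was used to derive the 99.99% figure, so the model choice and the function name below are assumptions made for illustration.

```python
import math


def expected_fraction_covered(fold_coverage: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumes reads land uniformly at random (Poisson model); this is an
    illustrative assumption, not a method stated in the text.
    """
    return 1.0 - math.exp(-fold_coverage)


# Fold coverage reported for the Ea273 shotgun sequence.
fraction = expected_fraction_covered(11.12)
print(f"Expected genome coverage: {fraction * 100:.4f}%")
```

Under this model, 11.12-fold coverage leaves only a very small expected fraction of bases unsampled, which is in line with the reported theoretical coverage.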
The genome of E. amylovora consists of a circular chromosome of 3,805,874 bp and two plasmids, AMYP1 (28,243 bp) and AMYP2 (71,487 bp). Coding regions in the chromosome account for 85.1% of the total sequence, with 3,483 identified coding sequences (CDSs). Two hundred fifty-four (7%) of the CDSs do not have any matches in current NCBI databases; 114 (3.3%) correspond to conserved hypothetical proteins. Forty-nine CDSs (1.4%) are similar to genes from mobile elements such as integrases, transposases, and bacteriophages, and 110 CDSs (3.2%) were classified as pseudogenes due to interruptions or truncations of the CDSs. The remaining 2,956 annotated CDSs include, among other categories, genes involved in biosynthesis of the cellular envelope and modifications of surface proteins (299 CDSs [11%]) and genes involved in signal transduction and regulation (228 CDSs [8%]). Seven rRNA operons and 78 tRNA sequences were identified in the chromosome; two new clusters were identified (AMY1550-1575 and AMY2648-2676) that resemble the T3SS-encoding SSR-1 island of Sodalis glossinidius
), and four clusters that contain genes for biosynthesis of flagella, which, based on their location, might be regulated independently.
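The CDS breakdown given for the chromosome can be checked with a short tally. The sketch below uses only the counts reported in the text; the variable names and category labels are illustrative, and the percentages are computed against the 3,483 total CDSs, which is an assumption about the denominator.

```python
# Counts taken from the text; labels are shorthand for illustration.
TOTAL_CDS = 3483
categories = {
    "no database match": 254,
    "conserved hypothetical": 114,
    "mobile-element related": 49,
    "pseudogenes": 110,
}

# CDSs left after removing the four categories above.
remaining = TOTAL_CDS - sum(categories.values())

for name, count in categories.items():
    share = 100 * count / TOTAL_CDS
    print(f"{name}: {count} ({share:.1f}%)")
print(f"remaining annotated CDSs: {remaining}")
```

The tally reproduces the 2,956 remaining annotated CDSs stated in the text.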
The smaller plasmid, AMYP1, was previously reported as pEA29 (3); the published sequence is nearly identical to the one reported here. The larger plasmid, AMYP2, renamed pEA72 for consistency in nomenclature, contains 87 predicted CDSs, including two predicted mobile-element-related CDSs and one pseudogene. Among the CDSs with annotated functions is a cluster of genes (AMYP2_49 to AMYP2_62) that encode a putative type IV fimbrial system (pil
The genome of E. amylovora is only 3.8 Mb long, whereas most free-living enterobacteria, including plant pathogens, have genomes of 4.5 Mb to 5.5 Mb. Comparison of the genome of Ea273 with the sequenced genomes of 15 closely related enterobacteria identified 21 lineage-specific regions, which might be considered genomic islands. E. amylovora has many more predicted pseudogenes than other enterobacteria with similar lifestyles. Given its size and the preponderance of pseudogenes, genome reduction may have occurred via mutational inactivation and subsequent deletion, with the following consequences: E. amylovora has fewer genes involved in anaerobic respiration and fermentation than are found in typical related enterobacteria; this likely results in a reduced capacity to live in anaerobic environments.
The genome sequence of E. amylovora has revealed clear signs of pathoadaptation to the rosaceous plant environment. For example, T3SS-related proteins are present that are more similar to proteins of other plant pathogens than to proteins of closely related enterobacteria. These include type III effectors, homologous to those of plant-pathogenic pseudomonads, which confer virulence to E. amylovora in plants. A sorbitol-metabolizing cluster is also present that may confer a competitive advantage for survival in rosaceous plants. The reduced genome size and the erosion or loss of genes involved in anaerobic respiration and nitrate assimilation are remarkable relative to other plant- and animal-pathogenic members of the Enterobacteriaceae.