Yersinia enterocolitica is a gastrointestinal pathogen with distinctive clinical manifestations and a predilection for young children (2). It is a member of the Enterobacteriaceae, genus Yersinia, which includes three pathogenic species: Y. pestis, the agent of plague, and two enteropathogens, Y. pseudotuberculosis and Y. enterocolitica. Y. enterocolitica strains are differentiated by bio- and serotyping. Biotype 1A (BT1A) strains are mostly nonpathogenic, in contrast to the highly pathogenic BT1B (predominant in the United States) and low-pathogenic BT2 to -5 (dominating worldwide). Recently, Y. enterocolitica was divided into two subspecies (3): enterocolitica, proposed for strains with the 16S rRNA type of American origin, and palearctica, proposed for strains of European origin (BT1A and BT2 to -5).
Yersiniosis is the third-most-common bacterial enteric disease in Europe, and Yersinia enterocolitica subsp. palearctica serobiotype O:3/4 is most frequently isolated from humans and slaughter pigs (4). Thus, deciphering the Yersinia enterocolitica subsp. palearctica O:3/4 genome will help to uncover the mechanisms of its successful worldwide dissemination.
Y. enterocolitica subsp. palearctica Y11 (DSMZ no. 13030), an O:3/4 human isolate, was selected for sequencing. The genome sequence was determined using MegaBACE (at Integrated Genomics, Jena, Germany), a 454 Genome Sequencer (GS) 20 (at 454, Branford, CT), and a 454 GS FLX Titanium (at Seq-IT, Kaiserslautern, Germany). MegaBACE sequencing produced 73,848 reads with 64,366,908 bp. The 454 GS 20 run yielded 927,998 reads with 94,454,598 bp, and the 454 GS FLX Titanium run added 240,813 reads with 105,539,453 bp. These data sets were assembled into a draft genome of 103 contigs with a total length of 4,464,482 bp using DNASTAR Lasergene software (version 7.2.0; SeqMan Pro). Gaps were closed manually, in cooperation with LGC Genomics (Berlin, Germany), by Sanger sequencing of PCR products. Finally, the data were assembled into a complete sequence consisting of a 4,553,420-bp chromosome and an additional 72,460-bp contig representing the pYVO3 plasmid. For quality control, all raw data were mapped to the two contigs using 454 Reference Mapper software (version 2.3; Roche). The median sequence depth for the chromosome contig was 37 (1st quartile, 30; 3rd quartile, 48), and the proportion of Q40+ bases was 99.58%. For the plasmid sequence, a median depth of 12 (1st quartile, 8; 3rd quartile, 17) was achieved, with the proportion of Q40+ bases being 98.01%. The average GC content is 47%, similar to that of Y. enterocolitica strain 8081 (NC_008800; 47.27%). The genome sequence was annotated using the RAST server (http://rast.nmpdr.org/) (1). This first deciphered Y. enterocolitica subsp. palearctica genome reflects multiple bacterial interactions and points to the structures and functions potentially involved in virulence attenuation and host adaptation.
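The read statistics reported above can be cross-checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not part of the original analysis: it uses the read and base counts given in the text to derive the mean read length per platform and a nominal (raw bases over final sequence length) coverage. The names RUNS, mean_read_length, and nominal_coverage are hypothetical helpers introduced here. Nominal coverage ignores trimming and reads that do not map, so it is not expected to match the median mapped depths reported from the Reference Mapper.

```python
# Read counts and total bases per platform, as reported in the text.
# The dictionary name and helper functions are illustrative assumptions.
RUNS = {
    "MegaBACE": (73_848, 64_366_908),
    "454 GS 20": (927_998, 94_454_598),
    "454 GS FLX Titanium": (240_813, 105_539_453),
}

# Final sequence lengths reported in the text (bp).
CHROMOSOME_BP = 4_553_420
PLASMID_BP = 72_460


def mean_read_length(reads: int, bases: int) -> float:
    """Average read length in bp for one sequencing run."""
    return bases / reads


def nominal_coverage(total_bases: int, target_bp: int) -> float:
    """Raw sequenced bases divided by target length (no trimming or mapping)."""
    return total_bases / target_bp


total_reads = sum(reads for reads, _ in RUNS.values())
total_bases = sum(bases for _, bases in RUNS.values())
final_bp = CHROMOSOME_BP + PLASMID_BP

if __name__ == "__main__":
    for platform, (reads, bases) in RUNS.items():
        print(f"{platform}: mean read length {mean_read_length(reads, bases):.1f} bp")
    print(f"Total reads: {total_reads:,}; total bases: {total_bases:,}")
    print(f"Nominal coverage: {nominal_coverage(total_bases, final_bp):.1f}x")
```

The per-platform mean read lengths are a quick sanity check that each read count and base count pair is plausible for the corresponding technology.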