Pathogenic and nonpathogenic Escherichia coli strains exhibit vast genomic diversity. We report the genome sequences of 2,244 E. coli isolates from multiple animal and environmental sources. Their phylogenetic relationships and potential risk to human health were examined.
Most Escherichia coli strains are harmless commensals that are found as part of the gut microbiota. Some strains, however, have the ability to cause disease and are considered pathogenic. Phylogenetic analyses have shown that E. coli strains can be divided into several phylogroups (1, 2), with pathogenic and nonpathogenic strains randomly distributed among them. To facilitate the study of the genomic diversity of this species, we sequenced a collection of 2,244 E. coli isolates from multiple mammalian (including human), avian, and environmental sources. This diverse collection contains nonpathogenic strains as well as several different pathogenic types (i.e., pathotypes), including attaching and effacing (AEEC), enteroaggregative (EAEC), enteroinvasive (EIEC), enterotoxigenic (ETEC), and Shiga toxin-producing (STEC) E. coli (3). Pathotypes are often defined by differing sets of virulence-associated genes. Many of these genes are carried on mobile genetic elements that can be transferred among strains, resulting in new combinations and several hybrid pathotypes such as STEC/EAEC (4), STEC/ETEC (5–7), and AEEC/ETEC (8). In this report, we characterized 2,244 E. coli isolates based on phylogenetic relationships and their potential risk to human health. The information reported here will help to better understand the evolution of these emergent foodborne pathogens and improve the accuracy of trace-back investigations during outbreaks caused by them.
Pure cultures for each strain were grown aerobically overnight in Luria-Bertani broth at 37°C. Total genomic DNA was extracted from 1 ml of overnight culture using the DNeasy blood and tissue kit (Qiagen, Hilden, Germany). DNA extractions were performed with the Qiagen QIAcube instrument using the manufacturer’s Gram-negative bacteria protocol. Sequencing libraries were prepared with 1 ng of DNA using the Nextera XT DNA sample prep kit (Illumina, San Diego, CA, USA) and sequenced on either the Illumina MiSeq or NextSeq platform. The resulting paired-end reads were quality controlled using FastQC (Q > 30) and de novo assembled using SPAdes 3.8.2 (9) or CLC Genomics Workbench 8.2.1 (CLC bio, Aarhus, Denmark).
Depth of coverage for the draft genomes ranged from 20× to 200× with the genome sizes ranging from 4,412,939 to 5,984,698 bp. The number of contigs ranged from 39 to 1,110, while the N50 values ranged from 14,741 to 699,676 bp. Each of the established E. coli phylogroups is represented in the sequenced strain collection as follows: A, 23%; B1, 47%; B2, 13%; D, 6%; E, 9%; and F, 2%. The strains were also screened for the presence of known or putative virulence factors, such as aggR, eae, ipaH, LT, ST, stx1, and stx2. Out of the 2,244 isolates, 394 can be classified as AEEC, 23 as EAEC, 9 as EIEC, 134 as ETEC, and 402 as STEC. Several strains were found to possess factors associated with hybrid pathotypes: STEC/ETEC (n = 22), AEEC/ETEC (n = 2), and STEC/EAEC (n = 1).
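For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed from a list of contig lengths: the contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name `n50` and the example lengths are illustrative assumptions; this is not the code used to produce the values reported here.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of an assembly given its contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


# Hypothetical toy assembly (not data from this study).
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

A higher N50 for a similar total length generally indicates a less fragmented assembly, which is why it is reported alongside the contig count.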
The draft genome assemblies were deposited in DDBJ/ENA/GenBank through FDA’s GenomeTrakr pipeline under BioProject PRJNA230969 with accession numbers NJIZ00000000 to NJNL00000000, NJRR00000000 to NKAI00000000, NKDC00000000 to NKEV00000000, NKLT00000000 to NKPS00000000, NKUK00000000 to NKVR00000000, NLFN00000000 to NMOH00000000, NNSX00000000 to NOIE00000000, NOMB00000000 to NOUU00000000, NOWO00000000 to NOWP00000000, NTND00000000 to NTPX00000000, NVPS00000000, NWNA00000000 to NWQF00000000, and NXMG00000000 to NXNF00000000. The versions described in this announcement are the first versions.
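As a consistency check on the accession ranges listed above, the sketch below counts how many four-letter WGS prefixes each range spans. It assumes that the prefixes within a range are contiguous in alphabetical order (read as base-26 numbers with A = 0 and Z = 25), which the text does not state. The helper names `prefix_to_int` and `count_in_range` are hypothetical.

```python
# Accession ranges as listed in the text, written as (start, end) pairs.
# The single accession NVPS00000000 is written as a range of one.
RANGES = [
    ("NJIZ00000000", "NJNL00000000"),
    ("NJRR00000000", "NKAI00000000"),
    ("NKDC00000000", "NKEV00000000"),
    ("NKLT00000000", "NKPS00000000"),
    ("NKUK00000000", "NKVR00000000"),
    ("NLFN00000000", "NMOH00000000"),
    ("NNSX00000000", "NOIE00000000"),
    ("NOMB00000000", "NOUU00000000"),
    ("NOWO00000000", "NOWP00000000"),
    ("NTND00000000", "NTPX00000000"),
    ("NVPS00000000", "NVPS00000000"),
    ("NWNA00000000", "NWQF00000000"),
    ("NXMG00000000", "NXNF00000000"),
]


def prefix_to_int(accession: str) -> int:
    """Read the four-letter prefix as a base-26 number (A=0 ... Z=25)."""
    value = 0
    for letter in accession[:4]:
        value = value * 26 + (ord(letter) - ord("A"))
    return value


def count_in_range(start: str, end: str) -> int:
    """Number of prefixes from start to end inclusive, assuming contiguity."""
    return prefix_to_int(end) - prefix_to_int(start) + 1


total_accessions = sum(count_in_range(start, end) for start, end in RANGES)
print(total_accessions)
```

Under the contiguity assumption, the ranges span 2,244 prefixes in total, which agrees with the number of isolates reported.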
The views expressed in this article are those of the authors and do not necessarily reflect the official policy of the Department of Health and Human Services, the U.S. Food and Drug Administration (FDA), or the U.S. Government. Reference to any commercial materials, equipment, or process does not in any way constitute approval, endorsement, or recommendation by the FDA.
Part of this study was supported by the ORISE Fellowship Program.
Citation: Gangiredla J, Mammel MK, Barnaba TJ, Tartera C, Gebru ST, Patel IR, Leonard SR, Kotewicz ML, Lampel KA, Elkins CA, Lacher DW. 2017. Species-wide collection of Escherichia coli isolates for examination of genomic diversity. Genome Announc 5:e01321-17. https://doi.org/10.1128/genomeA.01321-17.