Listeria monocytogenes (Lm) causes severe foodborne illness (listeriosis). Previous molecular subtyping methods, such as pulsed-field gel electrophoresis (PFGE), were critical in detecting outbreaks that led to food safety improvements and declining incidence, but PFGE provides limited genetic resolution. A multiagency collaboration began performing real-time, whole-genome sequencing (WGS) on all US Lm isolates from patients, food, and the environment in September 2013, posting sequencing data into a public repository. Compared with the year before the project began, WGS, combined with epidemiologic and product trace-back data, detected more listeriosis clusters and solved more outbreaks (2 outbreaks in pre-WGS year, 5 in WGS year 1, and 9 in year 2). Whole-genome multilocus sequence typing and single nucleotide polymorphism analyses provided equivalent phylogenetic relationships relevant to investigations; results were most useful when interpreted in context of epidemiological data. WGS has transformed listeriosis outbreak surveillance and is being implemented for other foodborne pathogens.
Invasive Listeria monocytogenes (Lm) infection (listeriosis) is a rare but serious cause of foodborne illness in the United States, causing an estimated 19% of US deaths from foodborne diseases and costing society an estimated $2.8 billion annually [1, 2]. During the 1980s and 1990s, food safety measures targeting ready-to-eat meat and poultry products helped reduce the incidence of listeriosis by >50%. In 2015, other foods, particularly dairy products and produce items, were identified as the main sources of listeriosis. Since 2001, listeriosis incidence has changed little despite intensive efforts, remaining above the Healthy People 2020 target of 0.2 cases per 100 000 [3, 5].
Although Lm was first recognized as a human pathogen in the 1920s, the first foodborne outbreak of listeriosis was not recognized until the early 1980s [6, 7]. This half-century gap between discovery of the pathogen and discovery of its transmission route underscores the difficulty in solving listeriosis outbreaks, an area once known as the “graveyard of epidemiology.” Listeriosis outbreaks have historically been difficult to solve because they are often small, incubation periods are frequently long, and patients are typically severely ill or deceased, thus limiting statistical power and completeness of food histories.
Only a small proportion of listeriosis cases are linked to recognized outbreaks (ie, 2 or more cases linked to a common food), but outbreak investigations are important because they provide critical information about food sources of listeriosis and identify food safety gaps. Before nationwide molecular subtyping, US investigators generally detected only large or localized outbreaks. From 1983 (the first recognized US foodborne outbreak of listeriosis) through 1997, 5 listeriosis outbreaks with an identified food source were reported (mean, 0.3 outbreaks/year) with a median of 54 cases per outbreak. In 1998, PulseNet, a national network of local, state, and federal public health laboratories coordinated by the Centers for Disease Control and Prevention (CDC), began subtyping Lm isolates with pulsed-field gel electrophoresis (PFGE). During the next 6 years, investigators solved >5-fold more outbreaks (mean, 2.3 per year); outbreaks were notably smaller (median, 11 cases/outbreak). Molecular subtyping has proven critical for identifying clusters that warrant further investigation, but epidemiology is needed to identify or confirm the outbreak food source. (A cluster is a group of isolates linked by molecular subtyping, a group of illnesses linked by epidemiology, or both. The cluster may constitute an outbreak, but a source, eg, food, has not yet been determined.) To enhance collection of epidemiologic information, in 2004 epidemiologists at CDC started the Listeria Initiative (LI), modeled after a similar program in France. Participating state and local health departments interview all listeriosis patients (or surrogates) as soon as possible after diagnosis using a standardized questionnaire asking about >40 different food exposures.
When molecular subtyping detects a listeriosis cluster, epidemiologists use LI data to compare food exposures among patients in the cluster with exposures of patients with non-outbreak-associated (sporadic) listeriosis to identify a possible food vehicle. During 2004–2013, the first decade of LI use, investigators solved more outbreaks (mean, 3.5 per year) with fewer cases per outbreak (median, 5.5) than in previous years.
Although PFGE has proved remarkably useful in detecting listeriosis clusters, it lacks the discriminatory capacity and phylogenetic basis of more advanced methods. Genome insertions, deletions, rearrangement, and point mutations at a restriction enzyme site can cause Lm isolates that are highly related genetically to appear different by PFGE. Similarly, isolates that are not highly related may appear indistinguishable by PFGE. Both scenarios can lead to misclassification when PFGE is used to define an outbreak-related case, which hampers cluster detection and epidemiologic investigations.
To enhance listeriosis surveillance and control by providing improved understanding of phylogenetic relationships among Lm isolates, in September 2013, CDC, the US Food and Drug Administration (FDA), the US Department of Agriculture's Food Safety and Inspection Service (USDA-FSIS), the National Center for Biotechnology Information (NCBI), and local, state, and international partners began a pilot project to prospectively perform whole-genome sequencing (WGS) on all available Lm isolates collected from patients, food, and food processing environments in the United States. Sequencing-based subtyping methods have previously been used successfully in investigation of outbreaks detected by other methods (eg, PFGE) [14–21], but never before for real-time foodborne disease surveillance. Lm was the ideal foodborne pathogen for such a pilot because of its relatively small and simple genome, the severity of the disease it causes, strong epidemiologic surveillance, and frequent testing for contamination in ready-to-eat foods. At the end of the pilot, the partner agencies incorporated WGS into standard national listeriosis surveillance.
Beginning in 2012, FDA created a network of GenomeTrakr laboratories, which now consists of 30 sites (27 domestic, 3 international) that sequence Lm and other foodborne pathogen isolates from food and environmental sources. State regulatory laboratories also sequenced food and environmental isolates and uploaded data into the GenomeTrakr database. The WGS Lm pilot began in September 2013 when CDC started sequencing all Lm isolates from patients. Over time, state and local public health laboratories developed the capacity to sequence isolates locally. As of March 2016, 20 of these laboratories were sequencing all Lm patient isolates and another 7 were in preparation. USDA-FSIS sequences all Lm isolates obtained from sampling of meat and poultry products and FSIS-regulated food processing environments. PFGE is performed in parallel with WGS.
All participating agencies submit raw sequence data that meet minimum quality thresholds and metadata (eg, date, location, submitting laboratory, sequencing parameters, source of isolates) to publicly accessible NCBI data archives (BioSample and Sequence Read Archive) under a single BioProject. Metadata for patient isolates are limited to protect patient confidentiality. NCBI continually generates and shares phylogenies constructed from all Lm with publicly available genomes.
CDC collaborated with international partners to identify the core genome and create a whole-genome multilocus sequence typing (wgMLST) scheme for Lm. The MLST analysis method is a gene-centric process whereby allelic differences are counted at the gene level. For Lm, there are 7 gene fragments in the basic housekeeping MLST scheme and 4804 in wgMLST, in which each gene is found in nearly all Lm genomes. These MLST schemes have been built into the existing software used to manage PulseNet data (BioNumerics 7.5, Applied Maths). The software determines whether the nucleotide sequence among isolates is identical or different at each locus or gene. For example, when comparing 2 isolates, any sequence differences (eg, 1 or >100 single-nucleotide polymorphisms [SNPs]) at a single locus count as a single allelic difference; if the 2 isolates have identical sequences at that locus, no allelic difference is counted. This method allows users to quickly and efficiently infer phylogenetic trees with relatively little bioinformatics expertise, allowing state and local public health laboratories to independently perform wgMLST analyses and query the national database for highly related isolates.
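The allele-counting rule described above can be illustrated with a minimal Python sketch. It is not the BioNumerics implementation; the isolate names, locus names, and sequences are hypothetical toy data, and skipping loci that are missing from either isolate is an assumption made here for simplicity.

```python
from itertools import combinations


def allelic_differences(profile_a, profile_b):
    """Count loci at which two isolates carry different sequences.

    Any sequence difference at a locus (1 SNP or many) counts as exactly
    one allelic difference. Loci absent from either profile are skipped
    (an assumption of this sketch, not a documented rule).
    """
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(1 for locus in shared_loci if profile_a[locus] != profile_b[locus])


def pairwise_allele_matrix(profiles):
    """Return allelic differences for every pair of isolates."""
    return {
        (a, b): allelic_differences(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical toy data: three isolates typed at three loci.
profiles = {
    "isolate_1": {"locus_A": "ATGGCA", "locus_B": "TTGACC", "locus_C": "GGCTAA"},
    "isolate_2": {"locus_A": "ATGGCA", "locus_B": "TTGACT", "locus_C": "GGCTAA"},
    "isolate_3": {"locus_A": "ATGACA", "locus_B": "CCAGTT", "locus_C": "GGCTAA"},
}

distances = pairwise_allele_matrix(profiles)
for pair, count in distances.items():
    print(pair, count)
```

In this toy data, locus_B of isolate_3 differs from the other isolates at every position yet still contributes only 1 allelic difference, which is the behavior the paragraph describes.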
The FDA, CDC, and USDA-FSIS also analyze WGS data using free, open-source SNP-based approaches, which create phylogenetic trees based on nucleotide differences among genomes under investigation that are mapped for comparison to a reference genome [27, 28]. For identifying high-quality SNP (hqSNP) differences among genomes, the SNP positions are filtered for sequence depth and quality. These hqSNP phylogenies provide a high level of detail, but the analysis can be computationally intensive for large numbers of isolates and typically requires a skilled bioinformatician to do the analysis.
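As a rough illustration of filtering SNP positions for sequence depth and quality, the Python sketch below counts hqSNP differences between 2 isolates whose variants were called against the same reference. It does not reproduce any agency pipeline: the `SnpCall` record, the thresholds `MIN_DEPTH` and `MIN_QUALITY`, and the example calls are all hypothetical, and treating a position with no call as matching the reference is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpCall:
    position: int    # coordinate on the reference genome
    ref_base: str
    alt_base: str
    depth: int       # number of reads covering the position
    quality: float   # variant quality score


MIN_DEPTH = 10       # hypothetical threshold
MIN_QUALITY = 30.0   # hypothetical threshold


def passes_filter(call, min_depth=MIN_DEPTH, min_quality=MIN_QUALITY):
    """True if a call meets both the depth and quality thresholds."""
    return call.depth >= min_depth and call.quality >= min_quality


def hq_snp_distance(calls_a, calls_b):
    """Count positions where two isolates differ, using only confident calls.

    A position is masked (ignored) if either isolate has a call there that
    fails the filter. A position with no call is assumed to match the
    reference (a simplification of this sketch).
    """
    by_pos_a = {c.position: c for c in calls_a}
    by_pos_b = {c.position: c for c in calls_b}
    distance = 0
    for pos in by_pos_a.keys() | by_pos_b.keys():
        call_a, call_b = by_pos_a.get(pos), by_pos_b.get(pos)
        if any(c is not None and not passes_filter(c) for c in (call_a, call_b)):
            continue  # low depth or quality in at least one isolate: mask
        allele_a = call_a.alt_base if call_a else None  # None = reference base
        allele_b = call_b.alt_base if call_b else None
        if allele_a != allele_b:
            distance += 1
    return distance


# Hypothetical calls for two isolates.
isolate_x = [
    SnpCall(1200, "A", "G", 45, 60.0),
    SnpCall(5310, "C", "T", 4, 55.0),    # fails depth filter
    SnpCall(8800, "G", "A", 30, 50.0),
]
isolate_y = [
    SnpCall(1200, "A", "G", 50, 58.0),   # same variant as isolate_x
    SnpCall(9100, "T", "C", 38, 12.0),   # fails quality filter
    SnpCall(12000, "C", "A", 41, 59.0),
]

print(hq_snp_distance(isolate_x, isolate_y))
```

Masking a position when either isolate fails the filter avoids counting an uncertain call as a true difference; this is one reasonable design choice among several.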
WGS has transformed listeriosis outbreak surveillance and response. Compared with the year before comprehensive WGS began (September 2012–August 2013), more clusters were detected in each year following introduction of WGS (14 in the pre-WGS year, 19 in year 1, and 21 in year 2) (Figure 1). Improvement in numbers of clusters detected sooner or only by WGS compared with PFGE and a marked reduction in median cluster size were also observed. Similarly, more outbreaks were solved following implementation of WGS (2 outbreaks in the pre-WGS year, 5 outbreaks in year 1, and 9 outbreaks in year 2), linking more cases to food sources.
Based on experiences to date, WGS has improved listeriosis outbreak investigations in at least 6 key ways; an example is provided for each.
In the longer term, WGS can improve our understanding of the ecology of Lm contamination in the food production environment. In 3 small outbreaks (≤6 cases each) linked to soft cheese (2 outbreaks) and sprouts (1 outbreak) [31, 35, 36], relatively few allelic differences (≤12) existed among all patient, food, and environmental isolates, possibly because all involved small production environments. Strains harbored within some facilities appear to change little over time. For example, in 3 outbreak investigations during 2013–2015 linked to soft cheese [30, 35, 36], food and environmental isolates collected from the same facility 3–5 years apart were found to be highly related by WGS (≤13 alleles). By comparison, a more diverse phylogenetic range of isolates has been observed in outbreaks linked to large companies. In the outbreak linked to large ice cream producer A, isolates from one production facility differed by up to 30 alleles (excluding 4 food isolates that were >200 alleles from others), and isolates from a second differed by up to 32 alleles (Table 1). Similarly, in the outbreak linked to stone fruit, there were up to 43 allele differences among isolates. Factors that may influence the diversity of isolates within an outbreak include the source of food contamination (ie, whether on the farm or during transport or processing), the type of food, the number and manner of Lm introductions into a food processing environment, the size of a food processing facility, and the environmental conditions and duration in which Lm survives and grows before contaminating food.
In our experience investigating listeriosis clusters with wgMLST (ie, allele) and hqSNP analyses, the 2 methods equally distinguish isolates belonging to an outbreak from sporadic cases with high epidemiological concordance [24, 39]. The numbers of allele and hqSNP differences between isolates are typically similar when there are <100 differences (alleles or hqSNPs) between isolates. Theoretically, this concordance arises because point mutations typically do not occur twice in the same gene in closely related isolates.
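The reasoning behind this concordance can be sketched in a few lines of Python. The example assumes pre-aligned, equal-length gene sequences with no insertions or deletions, and the gene names and sequences are hypothetical. When each mutated gene carries a single point mutation, the SNP and allele counts agree; the counts diverge only when a gene accumulates more than 1 mutation.

```python
def snp_and_allele_counts(genome_a, genome_b):
    """Return (snp_count, allele_count) over loci shared by two genomes.

    Assumes sequences at each locus are aligned and of equal length
    (no indels), which is a simplification made for this sketch.
    """
    snps = 0
    alleles = 0
    for locus in genome_a.keys() & genome_b.keys():
        mismatches = sum(x != y for x, y in zip(genome_a[locus], genome_b[locus]))
        snps += mismatches
        if mismatches > 0:
            alleles += 1  # any number of SNPs in a gene = 1 allelic difference
    return snps, alleles


# Hypothetical toy genomes with three genes each.
reference = {"g1": "ATGCCG", "g2": "TTAGGC", "g3": "CCGATA"}
close = {"g1": "ATGCCA", "g2": "TTAGGC", "g3": "CCGTTA"}      # 1 SNP in g1, 1 in g3
divergent = {"g1": "TTGCCA", "g2": "TTAGGC", "g3": "CCGTTA"}  # 2 SNPs in g1, 1 in g3

print(snp_and_allele_counts(reference, close))
print(snp_and_allele_counts(reference, divergent))
```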
European investigators have reported that a cutoff of ≤10 alleles by core genome MLST could distinguish outbreak-related isolates from unrelated ones in 2 outbreaks. Our experience investigating a wider variety of clusters and outbreaks suggests that a single cutoff cannot consistently predict whether isolates will be epidemiologically related. Isolates with <10 wgMLST allele or hqSNP differences were often epidemiologically linked, whereas those in the 10–30 range were frequently linked, and isolates with >30 differences were occasionally linked. WGS results should not be interpreted in isolation, and other contextual information (ie, epidemiologic or product trace-back data) is needed to confirm the source.
In the context of robust epidemiologic data collection and regulatory food testing, the increased discriminatory capacity of WGS has helped strengthen links among Lm isolates from food, the environment, and patients to detect more listeriosis clusters, halt pseudo-cluster investigations, focus epidemiologic investigations, and ultimately, solve more outbreaks. In several instances, the high resolution of WGS allowed investigators and regulatory agencies to take action at a lower level of epidemiological evidence, a key advantage for the relatively small outbreaks typical for listeriosis. By helping to solve more outbreaks, WGS has increased public awareness of listeriosis, focused increased attention on contamination that food companies can prevent, and improved understanding of Lm in food. WGS, by helping to uncover food safety gaps through outbreak investigations, complements the ongoing implementation of the Food Safety Modernization Act, which aims to improve measures to prevent foodborne illness. These interventions should lead to renewed reductions in Lm-related illnesses and deaths. Based on the success of WGS-enhanced listeriosis surveillance, CDC and partner agencies are expanding the technology to US surveillance for other pathogens transmitted commonly by food, including Shiga toxin–producing Escherichia coli, Campylobacter, and Salmonella.
The Lm wgMLST analysis tool, previously only available at CDC, will soon be available to all US PulseNet member laboratories. Because standardization is essential for public health surveillance, CDC, FDA, and USDA-FSIS continue to work with academic and international partners to set sequence quality benchmarks and nomenclature schemes. As WGS becomes more integrated into foodborne outbreak detection, public health officials at all levels will need to learn core concepts about the technology and its interpretation, particularly as many will need to describe the implications of these analyses to industry representatives, policymakers, and the media.
Acknowledgments. Kristin G. Holt, John Johnston, Frankie Beacorn, Cathy Pentz, and Todd Lauze of US Department of Agriculture's Food Safety and Inspection Service (USDA-FSIS) also contributed to the Lm whole-genome sequencing (WGS) pilot project. We thank our international collaborators, including Sylvain Brisse, Marc Lecuit, Alexander LeClercq, Alexandra Moura, and colleagues at Institut Pasteur, as well as colleagues at the Public Health Agency of Canada. We acknowledge the critical work of state and local health departments, laboratories, and regulatory agencies in foodborne disease outbreak detection and investigation and in rapidly implementing WGS in their public health work.
Disclaimer. The findings and conclusions in this paper are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention (CDC), the Food and Drug Administration (FDA), FSIS, or the National Institutes of Health (NIH).
Financial support. This work was supported by CDC, including CDC's Advanced Molecular Detection Initiative and Cooperative Agreement number U60HM000803; FDA; USDA-FSIS; National Center for Biotechnology Information; and the Intramural Research Program of the NIH, National Library of Medicine. H. P. is an employee of Applied Maths.
Potential conflicts of interest. All authors: No reported conflicts. All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.