Foodborne pathogens are a leading cause of illness in the United States. In 1999, the Centers for Disease Control and Prevention estimated that foodborne agents cause around 76 million illnesses, 325,000 hospitalizations, and 5,000 deaths in the United States each year [1]. More recent estimates suggest that the annual number of infections approaches 82 million [2]. Food contamination can occur at any step in the “farm-to-fork” continuum. Challenges in food safety stem from limited resources for surveillance of fresh and processed foods, the sheer diversity and complexity of food matrices, and an increasingly globalized food supply chain. Recently, foodborne illness outbreaks have been increasingly recognized as a growing threat to public health [3].
Food inspection and outbreak detection are two vital steps in ensuring food safety. Since severe food-related outbreaks are usually caused by bacterial pathogens, it is crucial to have technologies capable of detecting these pathogens quickly, with high sensitivity and reliability. Conventional methods frequently used in food laboratories are based on the cultural, serological, and biochemical properties of specific bacterial pathogens. These phenotypic methods are time-consuming and labor-intensive. Moreover, they have limited utility for epidemiologic analysis of pathogen transmission during outbreak investigations because of their poor discriminatory power for closely related strains. Several genotyping methods have been developed and applied to estimate genetic relatedness and make inferences about outbreak transmission [4].
Typing methods that rely on restriction fragment analysis, such as pulsed-field gel electrophoresis (PFGE), or on polymerase chain reaction (PCR) are commonly used in epidemiological investigations of bacterial foodborne pathogens [4]. However, a major drawback of these typing methods is that they offer limited insight into the genetic traits of bacterial strains, such as pathogenicity, virulence, or antimicrobial resistance. High-throughput microarray technology provides an effective way to identify, characterize, and obtain a nearly complete snapshot of the genetic repertoire of a particular isolate. Such genome-wide insight is necessary for accurate and confident identification and discrimination of pathogens that may contaminate the food supply.
Microarray technology has been widely used in drug discovery and development, toxicology, and clinical applications [5]. However, the use of this technology in detecting and characterizing foodborne pathogens is still in its infancy [10]. A properly designed genotyping microarray can not only provide strain-level discrimination within a particular pathotype, but also identify genetic elements responsible for virulence and antimicrobial resistance. Furthermore, microarrays are highly parallel assays that can detect tens of thousands of genes simultaneously in a single experiment. Combining this highly parallel assay with a semi-high-throughput workflow can provide a highly discriminatory and rapid subtyping method for use in epidemiological investigations of foodborne outbreaks. Indeed, the FDA has investigated several novel microarray-based strategies over the past several years to determine how best to identify and discriminate closely related strains [19]. In this manuscript, two separate microbial microarrays are examined, both of which were custom made by independent government agencies. The FDA-ECSG array is a custom Affymetrix microarray developed by the FDA’s Center for Food Safety and Applied Nutrition and represents all of the genes found in 32 whole genome sequences and 46 related plasmid sequences from Escherichia coli and the related species, Shigella. This array contains >23,000 independent genes and was designed to identify and discriminate between closely related strains of E. coli. The second type of microarray examined in this study is a universal microarray developed by USDA scientists to detect antimicrobial resistance genes in bacterial pathogens [18].
ArrayTrack™, a bioinformatics tool developed by the FDA, has been expanded to support microbial microarray data. ArrayTrack™ has provided a rich set of functionality to manage, analyze, and interpret gene expression data from mammalian organisms [25] and has been widely adopted by the research community and used for review of pharmacogenomics data in the FDA’s Voluntary Genomics Data Submission program [27]. The new expansion of ArrayTrack™ provides functionality to support microbial genomics research using microarrays. For example, ArrayTrack™’s libraries have been populated with public-domain bioinformatics data related to bacterial pathogen species [28]. Data processing and visualization tools have been enhanced with customized options to facilitate analysis of microarray data generated from the custom microarrays developed at the FDA and USDA. Specifically, at the time of this writing, three new functions have been developed and are particularly effective for analysis of these microarray data: flag-based hierarchical clustering analysis (HCA), a flag concordance (FC) heat map, and flag indicators in the mixed scatter plot.
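To make the idea of flag concordance concrete, the following minimal Python sketch computes pairwise concordance between samples from per-probe-set flags. It is an illustration only, not ArrayTrack™'s implementation: it assumes that each probe set carries a present ("P") or absent ("A") flag and that concordance is the fraction of probe sets on which two samples share the same flag. The sample names, flag values, and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical present ("P") / absent ("A") flags per probe set; not real data.
flags = {
    "sample_1": ["P", "P", "A", "P", "A", "P"],
    "sample_2": ["P", "P", "A", "A", "A", "P"],
    "sample_3": ["A", "P", "P", "A", "P", "P"],
}


def flag_concordance(a, b):
    """Fraction of probe sets on which two samples carry the same flag.

    Assumed definition for illustration; the tool's own metric may differ.
    """
    if len(a) != len(b):
        raise ValueError("flag vectors must have equal length")
    if not a:
        raise ValueError("flag vectors must not be empty")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def concordance_matrix(flag_table):
    """Symmetric pairwise concordance, keyed by (sample, sample) tuples."""
    names = sorted(flag_table)
    matrix = {(n, n): 1.0 for n in names}
    for m, n in combinations(names, 2):
        c = flag_concordance(flag_table[m], flag_table[n])
        matrix[(m, n)] = matrix[(n, m)] = c
    return matrix


matrix = concordance_matrix(flags)
for (m, n), c in sorted(matrix.items()):
    if m < n:
        # 1 - concordance can serve as a distance for clustering.
        print(f"{m} vs {n}: concordance={c:.2f}, distance={1 - c:.2f}")
```

Under these assumptions, the matrix values could be rendered as a heat map, and the quantity 1 − concordance could serve as a distance for a flag-based hierarchical clustering of the samples.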
For each of the microarray experiments discussed in this manuscript, total genomic DNA was extracted from purified cultures of independent bacterial isolates and used as the target material for hybridization to individual microarrays. We use the term “sample” to denote such a DNA extract; for a bacterial isolate without replicate microarray hybridizations, “sample” is synonymous with “isolate”. This manuscript illustrates the microbial genomics-specific functionality in ArrayTrack™ through case studies based on these two microarrays.