1.  Genomic Recombination Leading to Decreased Virulence of Group B Streptococcus in a Mouse Model of Adult Invasive Disease 
Pathogens  2016;5(3):54.
Adult invasive disease caused by Group B Streptococcus (GBS) is increasing worldwide. Whole-genome sequencing (WGS) now permits rapid identification of recombination events, a phenomenon that occurs frequently in GBS. Using WGS, we described that strain NGBS375, a capsular serotype V GBS isolate of sequence type (ST)297, has an ST1 genomic background but has acquired approximately 300 kbp of genetic material likely from an ST17 strain. Here, we examined the virulence of this strain in an in vivo model of GBS adult invasive infection. The mosaic ST297 strain showed intermediate virulence, causing significantly less systemic infection and reduced mortality than a more virulent, serotype V ST1 isolate. Bacteremia induced by the ST297 strain was similar to that induced by a serotype III ST17 strain, which was the least virulent under the conditions tested. Yet, under normalized bacteremia levels, the in vivo intrinsic capacity to induce the production of pro-inflammatory cytokines was similar between the ST297 strain and the virulent ST1 strain. Thus, the diminished virulence of the mosaic strain may be due to reduced capacity to disseminate or multiply in blood during a systemic infection which could be mediated by regulatory factors contained in the recombined region.
PMCID: PMC5039434  PMID: 27527222
group B Streptococcus; Streptococcus agalactiae; recombination; invasive bacterial infection; adult infection; cytokines
2.  Envelope-specific B-cell populations in African green monkeys chronically infected with simian immunodeficiency virus 
Nature Communications  2016;7:12131.
African green monkeys (AGMs) are natural primate hosts of simian immunodeficiency virus (SIV). Interestingly, features of the envelope-specific antibody responses in SIV-infected AGMs are distinct from that of HIV-infected humans and SIV-infected rhesus monkeys, including gp120-focused responses and rapid development of autologous neutralization. Yet, the lack of genetic tools to evaluate B-cell lineages hinders potential use of this unique non-human primate model for HIV vaccine development. Here we define features of the AGM Ig loci and compare the proportion of Env-specific memory B-cell populations to that of HIV-infected humans and SIV-infected rhesus monkeys. AGMs appear to have a higher proportion of Env-specific memory B cells that are mainly gp120 directed. Furthermore, AGM gp120-specific monoclonal antibodies display robust antibody-dependent cellular cytotoxicity and CD4-dependent virion capture activity. Our results support the use of AGMs to model induction of functional gp120-specific antibodies by HIV vaccine strategies.
Infection of African green monkeys with simian immunodeficiency virus is a potential model for HIV vaccine development. Here, Zhang et al. catalogue the immunoglobulin loci present in the genome of these animals, and experimentally study their B-cell response to the viral envelope protein.
PMCID: PMC4935802  PMID: 27381634
3.  High Incidence of Invasive Group A Streptococcus Disease Caused by Strains of Uncommon emm Types in Thunder Bay, Ontario, Canada 
An outbreak of type emm59 invasive group A Streptococcus (iGAS) disease was declared in 2008 in Thunder Bay District, Northwestern Ontario, 2 years after a countrywide emm59 epidemic was recognized in Canada. Despite a declining number of emm59 infections since 2010, numerous cases of iGAS disease continue to be reported in the area. We collected clinical information on all iGAS cases recorded in Thunder Bay District from 2008 to 2013. We also emm typed and sequenced the genomes of all available strains isolated from 2011 to 2013 from iGAS infections and from severe cases of soft tissue infections. We used whole-genome sequencing data to investigate the population structure of GAS strains of the most frequently isolated emm types. We report an increased incidence of iGAS in Thunder Bay compared to the metropolitan area of Toronto/Peel and the province of Ontario. Illicit drug use, alcohol abuse, homelessness, and hepatitis C infection were underlying diseases or conditions that might have predisposed patients to iGAS disease. Most cases were caused by clonal strains of skin or generalist emm types (i.e., emm82, emm87, emm101, emm4, emm83, and emm114) uncommonly seen in other areas of the province. We observed rapid waxing and waning of emm types causing disease and their replacement by other emm types associated with the same tissue tropisms. Thus, iGAS disease in Thunder Bay District predominantly affects a select population of disadvantaged persons and is caused by clonally related strains of a few skin and generalist emm types less commonly associated with iGAS in other areas of Ontario.
PMCID: PMC4702752  PMID: 26491184
4.  Bloom and bust: intestinal microbiota dynamics in response to hospital exposures and Clostridium difficile colonization or infection 
Microbiome  2016;4:12.
Clostridium difficile infection (CDI) is the leading infectious cause of nosocomial diarrhea. Hospitalized patients are at increased risk of developing CDI because they are exposed to C. difficile spores through contact with the hospital environment and often receive antibiotics and other medications that can disrupt the integrity of the indigenous intestinal microbiota and impair colonization resistance. Using whole metagenome shotgun sequencing, we examined the diversity and composition of the fecal microbiota in a prospective cohort study of 98 hospitalized patients.
Four patients had asymptomatic C. difficile colonization, and four patients developed CDI. We observed dramatic shifts in the structure of the gut microbiota during hospitalization. In contrast to CDI cases, asymptomatic patients exhibited elevated relative abundance of potentially protective bacterial taxa in their gut at the onset of C. difficile colonization. Use of laxatives was associated with significant reductions in the relative abundance of Clostridium and Eubacterium; species within these genera have previously been shown to enhance resistance to CDI via the production of secondary bile acids. Cephalosporin and fluoroquinolone exposure decreased the frequency of Clostridiales Family XI Incertae Sedis, a bacterial family that has been previously associated with decreased CDI risk.
This study underscores the detrimental impact of antibiotics as well as other medications, particularly laxatives, on the intestinal microbiota and suggests that co-colonization with key bacterial taxa may prevent C. difficile overgrowth or the transition from asymptomatic C. difficile colonization to CDI.
Electronic supplementary material
The online version of this article (doi:10.1186/s40168-016-0156-3) contains supplementary material, which is available to authorized users.
PMCID: PMC4791782  PMID: 26975510
Clostridium difficile infection; Whole metagenome shotgun sequencing; Intestinal microbiota; Antimicrobials; Medications
5.  Population Structure and Antimicrobial Resistance Profiles of Streptococcus suis Serotype 2 Sequence Type 25 Strains 
PLoS ONE  2016;11(3):e0150908.
Strains of serotype 2 Streptococcus suis are responsible for swine and human infections. Different serotype 2 genetic backgrounds have been defined using multilocus sequence typing (MLST). However, little is known about the genetic diversity within each MLST sequence type (ST). Here, we used whole-genome sequencing to test the hypothesis that S. suis serotype 2 strains of the ST25 lineage are genetically heterogeneous. We evaluated 51 serotype 2 ST25 S. suis strains isolated from diseased pigs and humans in Canada, the United States of America, and Thailand. Whole-genome sequencing revealed numerous large-scale rearrangements in the ST25 genome, compared to the genomes of ST1 and ST28 S. suis strains, which result, among other changes, in disruption of a pilus island locus. We report that recombination and lateral gene transfer contribute to ST25 genetic diversity. Phylogenetic analysis identified two main and distinct Thai and North American clades grouping most strains investigated. These clades also possessed distinct patterns of antimicrobial resistance genes, which correlated with acquisition of different integrative and conjugative elements (ICEs). Some of these ICEs were found to be integrated at a recombination hot spot, previously identified as the site of integration of the 89K pathogenicity island in serotype 2 ST7 S. suis strains. Our results highlight the limitations of MLST for phylogenetic analysis of S. suis, and the importance of lateral gene transfer and recombination as drivers of diversity in this swine pathogen and zoonotic agent.
PMCID: PMC4783015  PMID: 26954687
7.  Clonal Complex 17 Group B Streptococcus strains causing invasive disease in neonates and adults originate from the same genetic pool 
Scientific Reports  2016;6:20047.
A significant proportion of group B Streptococcus (GBS) neonatal disease, particularly late-onset disease, is associated with strains of serotype III, clonal complex (CC) 17. CC17 strains also cause invasive infections in adults. Little is known about the phylogenetic relationships of isolates recovered from neonatal and adult CC17 invasive infections. We performed whole-genome-based phylogenetic analysis of 93 temporally and geographically matched CC17 strains isolated from both neonatal and adult invasive infections in the metropolitan region of Toronto/Peel, Canada. We also mined the whole-genome data to reveal mobile genetic elements carrying antimicrobial resistance genes. We discovered that CC17 GBS strains causing neonatal and adult invasive disease are interspersed and cluster tightly in a phylogenetic tree, signifying that they are derived from the same genetic pool. We identified limited variation due to recombination in the core CC17 genome. We describe that loss of Pilus Island 1 and acquisition of different mobile genetic elements carrying determinants of antimicrobial resistance contribute to CC17 genetic diversity. Acquisition of some of these mobile genetic elements appears to correlate with clonal expansion of the strains that possess them. Our results provide a genome-wide portrait of the population structure and evolution of a major disease-causing clone of an opportunistic pathogen.
PMCID: PMC4740736  PMID: 26843175
8.  Clinical utilization of genomics data produced by the international Pseudomonas aeruginosa consortium 
The International Pseudomonas aeruginosa Consortium is sequencing over 1000 genomes and building an analysis pipeline for the study of Pseudomonas genome evolution, antibiotic resistance and virulence genes. Metadata, including genomic and phenotypic data for each isolate of the collection, are available through the International Pseudomonas Consortium Database. Here, we present our strategy and the results that emerged from the analysis of the first 389 genomes. With as yet unmatched resolution, our results confirm that P. aeruginosa strains can be divided into three major groups that are further divided into subgroups, some not previously reported in the literature. We also provide the first snapshot of P. aeruginosa strain diversity with respect to antibiotic resistance. Our approach will allow us to draw potential links between environmental strains and those implicated in human and animal infections, understand how patients become infected and how the infection evolves over time as well as identify prognostic markers for better evidence-based decisions on patient care.
PMCID: PMC4586430  PMID: 26483767
Pseudomonas aeruginosa; next-generation sequencing; bacterial genome; phylogeny; database; cystic fibrosis; antibiotic resistance; clinical microbiology
9.  Complex Population Structure and Virulence Differences among Serotype 2 Streptococcus suis Strains Belonging to Sequence Type 28 
PLoS ONE  2015;10(9):e0137760.
Streptococcus suis is a major swine pathogen and a zoonotic agent. Serotype 2 strains are the most frequently associated with disease. However, not all serotype 2 lineages are considered virulent. Indeed, sequence type (ST) 28 serotype 2 S. suis strains have been described as a homogeneous group of low virulence. However, ST28 strains are often isolated from diseased swine in some countries, and at least four human ST28 cases have been reported. Here, we used whole-genome sequencing and animal infection models to test the hypothesis that the ST28 lineage comprises strains of different genetic backgrounds and different virulence. We used 50 S. suis ST28 strains isolated in Canada, the United States and Japan from diseased pigs, and one ST28 strain from a human case isolated in Thailand. We report a complex population structure among the 51 ST28 strains. Diversity resulted from variable gene content, recombination events and numerous genome-wide polymorphisms not attributable to recombination. Phylogenetic analysis using core genome single-nucleotide polymorphisms revealed four discrete clades with strong geographic structure, and a fifth clade formed by US, Thai and Japanese strains. When tested in experimental animal models, strains from this latter clade were significantly more virulent than a Canadian ST28 reference strain, and a closely related Canadian strain. Our results highlight the limitations of MLST for both phylogenetic analysis and virulence prediction and raise concerns about the possible emergence of ST28 strains in human clinical cases.
PMCID: PMC4574206  PMID: 26375680
11.  A pipeline for the systematic identification of non-redundant full-ORF cDNAs for polymorphic and evolutionary divergent genomes: Application to the ascidian Ciona intestinalis 
Developmental Biology  2015;404(2):149-163.
Genome-wide resources, such as collections of cDNA clones encoding for complete proteins (full-ORF clones), are crucial tools for studying the evolution of gene function and genetic interactions. Non-model organisms, in particular marine organisms, provide a rich source of functional diversity. Marine organism genomes are, however, frequently highly polymorphic and encode proteins that diverge significantly from those of well-annotated model genomes. The construction of full-ORF clone collections from non-model organisms is hindered by the difficulty of predicting accurately the N-terminal ends of proteins, and distinguishing recent paralogs from highly polymorphic alleles. We report a computational strategy that overcomes these difficulties, and allows for accurate gene level clustering of transcript data followed by the automated identification of full-ORFs with correct 5′- and 3′-ends. It is robust to polymorphism, includes paralog calling and does not require evolutionary proximity to well annotated model organisms. We developed this pipeline for the ascidian Ciona intestinalis, a highly polymorphic member of the divergent sister group of the vertebrates, emerging as a powerful model organism to study chordate gene function, Gene Regulatory Networks and molecular mechanisms underlying human pathologies. Using this pipeline we have generated the first full-ORF collection for a highly polymorphic marine invertebrate. It contains 19,163 full-ORF cDNA clones covering 60% of Ciona coding genes, and full-ORF orthologs for approximately half of curated human disease-associated genes.
Graphical abstract
• Ascidians foster functional genomics by compact genomes and fixed cellular lineages.
• A resource of 19,000 GATEWAY full ORF clones was generated for Ciona intestinalis.
• Novel methods support automated finding of coding 5′ ends and paralog distinction.
• The strategy is robust to polymorphism and poorly annotated genomes.
• Half of human disease associated genes are covered by full ORF Ciona orthologs.
PMCID: PMC4528069  PMID: 26025923
Full-ORF; Functional genomics; Prediction pipeline; Ascidians; Transcriptomics; Human disease
12.  Sequencing strategies and characterization of 721 vervet monkey genomes for future genetic analyses of medically relevant traits 
BMC Biology  2015;13:41.
We report here the first genome-wide high-resolution polymorphism resource for non-human primate (NHP) association and linkage studies, constructed for the Caribbean-origin vervet monkey, or African green monkey (Chlorocebus aethiops sabaeus), one of the most widely used NHPs in biomedical research. We generated this resource by whole genome sequencing (WGS) of monkeys from the Vervet Research Colony (VRC), an NIH-supported research resource for which extensive phenotypic data are available.
We identified genome-wide single nucleotide polymorphisms (SNPs) by WGS of 721 members of an extended pedigree from the VRC. From high-depth WGS data we identified more than 4 million polymorphic unequivocal segregating sites; by pruning these SNPs based on heterozygosity, quality control filters, and the degree of linkage disequilibrium (LD) between SNPs, we constructed genome-wide panels suitable for genetic association (about 500,000 SNPs) and linkage analysis (about 150,000 SNPs). To further enhance the utility of these resources for linkage analysis, we used a further pruned subset of the linkage panel to generate multipoint identity by descent matrices.
The genetic and phenotypic resources now available for the VRC and other Caribbean-origin vervets enable their use for genetic investigation of traits relevant to human diseases.
Electronic supplementary material
The online version of this article (doi:10.1186/s12915-015-0152-2) contains supplementary material, which is available to authorized users.
PMCID: PMC4494155  PMID: 26092298
Vervet; Non-human primate; Whole genome sequencing; SNP; Linkage; Association
13.  Genomic Comparison of Non-Typhoidal Salmonella enterica Serovars Typhimurium, Enteritidis, Heidelberg, Hadar and Kentucky Isolates from Broiler Chickens 
PLoS ONE  2015;10(6):e0128773.
Non-typhoidal Salmonella enterica serovars, associated with different foods including poultry products, are important causes of bacterial gastroenteritis worldwide. The colonization of the chicken gut by S. enterica could result in the contamination of the environment and food chain. The aim of this study was to compare the genomes of 25 S. enterica serovars isolated from broiler chicken farms to assess their intra- and inter-genetic variability, with a focus on virulence and antibiotic resistance characteristics.
Methodology/Principal Finding
The genomes of 25 S. enterica isolates covering five serovars (ten Typhimurium including three monophasic 4,[5],12:i:-, four Enteritidis, three Hadar, four Heidelberg and four Kentucky) were sequenced. Most serovars were clustered in strongly supported phylogenetic clades, except for isolates of serovar Enteritidis that were scattered throughout the tree. Plasmids of varying sizes were detected in several isolates independently of serovars. Genes associated with the IncF plasmid and the IncI1 plasmid were identified in twelve and four isolates, respectively, while genes associated with the IncQ plasmid were found in one isolate. The presence of numerous genes associated with Salmonella pathogenicity islands (SPIs) was also confirmed. Components of the type III and IV secretion systems (T3SS and T4SS) varied in different isolates, which could explain, in part, differences in their pathogenicity in humans and/or persistence in broilers. Conserved clusters of genes in the T3SS were detected that could be used in designing effective strategies (diagnostic, vaccination or treatments) to combat Salmonella. Antibiotic resistance genes (CMY, aadA, ampC, florR, sul1, sulI, tetAB, and srtA) and class I integrons were detected in resistant isolates while all isolates carried multidrug efflux pump systems regardless of their antibiotic susceptibility profile.
This study showed that the predominant Salmonella serovars in broiler chickens harbor genes encoding adhesins, flagellar proteins, T3SS, iron acquisition systems, and antibiotic and metal resistance genes that may explain their pathogenicity, colonization ability and persistence in chicken. The existence of mobile genetic elements indicates that isolates from a given serovar could acquire and transfer genetic material. Conserved genes in the T3SS and T4SS that we have identified are promising candidates for identification of diagnostic, antimicrobial or vaccine targets for the control of Salmonella in broiler chickens.
PMCID: PMC4470630  PMID: 26083489
14.  Complete Genome Sequence of Streptococcus thermophilus SMQ-301, a Model Strain for Phage-Host Interactions 
Genome Announcements  2015;3(3):e00480-15.
Streptococcus thermophilus is used by the dairy industry to manufacture yogurt and several cheeses. Using PacBio and Illumina platforms, we sequenced the genome of S. thermophilus SMQ-301, the host of several virulent phages. The genome is composed of 1,861,792 bp and contains 2,037 genes, 67 tRNAs, and 18 rRNAs.
PMCID: PMC4440953  PMID: 25999573
15.  Excretion of Host DNA in Feces Is Associated with Risk of Clostridium difficile Infection 
Journal of Immunology Research  2015;2015:246203.
Clostridium difficile infection (CDI) is intricately linked to the health of the gastrointestinal tract and its indigenous microbiota. In this study, we assessed whether fecal excretion of host DNA is associated with CDI development. Assuming that shedding of epithelial cells increases in the inflamed intestine, we used human DNA excretion as a marker of intestinal insult. Whole-genome shotgun sequencing was employed to quantify host DNA excretion and evaluate bacterial content in fecal samples collected from patients with incipient CDI, hospitalized controls, and healthy subjects. Human DNA excretion was significantly increased in patients admitted to the hospital for a gastrointestinal ailment, as well as prior to an episode of CDI. In multivariable analyses, human read abundance was independently associated with CDI development. Host DNA proportions were negatively correlated with intestinal microbiota diversity. Enterococcus and Escherichia were enriched in patients excreting high quantities of human DNA, while Ruminococcus and Odoribacter were depleted. These findings suggest that intestinal inflammation can occur prior to CDI development and may influence patient susceptibility to CDI. The quantification of human DNA in feces could serve as a simple and noninvasive approach to assess bowel inflammation and identify patients at risk of CDI.
PMCID: PMC4451987  PMID: 26090486
16.  Population Structure and Antimicrobial Resistance of Invasive Serotype IV Group B Streptococcus, Toronto, Ontario, Canada 
Emerging Infectious Diseases  2015;21(4):585-591.
Conjugate vaccines should include polysaccharide or virulence proteins of this serotype to provide complete protection.
We recently showed that 37/600 (6.2%) invasive infections with group B Streptococcus (GBS) in Toronto, Ontario, Canada, were caused by serotype IV strains. We report a relatively high level of genetic diversity in 37 invasive strains of this emerging GBS serotype. Multilocus sequence typing identified 6 sequence types (STs) that belonged to 3 clonal complexes. Most isolates were ST-459 (19/37, 51%) and ST-452 (11/37, 30%), but we also identified ST-291, ST-3, ST-196, and a novel ST-682. We detected further diversity by performing whole-genome single-nucleotide polymorphism analysis and found evidence of recombination events contributing to variation in some serotype IV GBS strains. We also evaluated antimicrobial drug resistance and found that ST-459 strains were resistant to clindamycin and erythromycin, whereas strains of other STs were, for the most part, susceptible to these antimicrobial drugs.
PMCID: PMC4378482  PMID: 25811284
bacterial infection; invasive bacterial disease; group B Streptococcus; streptococci; Streptococcus agalactiae; bacteria; serotype IV; multilocus sequence typing; whole-genome sequencing; antimicrobial resistance; population structure; Toronto; Canada
17.  First Complete Genome Sequence of Staphylococcus xylosus, a Meat Starter Culture and a Host to Propagate Staphylococcus aureus Phages 
Genome Announcements  2014;2(4):e00671-14.
Staphylococcus xylosus is a bacterial species used in meat fermentation and a commensal microorganism found on animals. We present the first complete circular genome from this species. The genome is composed of 2,757,557 bp, with a G+C content of 32.9%, and contains 2,514 genes and 79 structural RNAs.
PMCID: PMC4110768  PMID: 25013142
18.  Systems Biology of the Vervet Monkey 
ILAR Journal  2013;54(2):122-143.
Nonhuman primates (NHP) provide crucial biomedical model systems intermediate between rodents and humans. The vervet monkey (also called the African green monkey) is a widely used NHP model that has unique value for genetic and genomic investigations of traits relevant to human diseases. This article describes the phylogeny and population history of the vervet monkey and summarizes the use of both captive and wild vervet monkeys in biomedical research. It also discusses the effort of an international collaboration to develop the vervet monkey as the most comprehensively phenotypically and genomically characterized NHP, a process that will enable the scientific community to employ this model for systems biology investigations.
PMCID: PMC3814400  PMID: 24174437
African green monkey; genetics; genomics; phenomics; simian immunodeficiency virus [SIV]; systems biology; transcriptomics; vervet
19.  The MAPT H1 haplotype is associated with tangle-predominant dementia 
Acta Neuropathologica  2012;124(5):693-704.
Tangle-predominant dementia (TPD) patients exhibit cognitive decline that is clinically similar to early to moderate-stage Alzheimer disease (AD), yet autopsy reveals neurofibrillary tangles in the medial temporal lobe composed of the microtubule-associated protein tau without significant amyloid-beta (Aβ)-positive plaques. We performed a series of neuropathological, biochemical and genetic studies using autopsy brain tissue drawn from a cohort of 34 TPD, 50 AD and 56 control subjects to identify molecular and genetic signatures of this entity. Biochemical analysis demonstrates a similar tau protein isoform composition in TPD and AD, which is compatible with previous histological and ultrastructural studies. Further, biochemical analysis fails to uncover elevation of soluble Aβ in TPD frontal cortex and hippocampus compared to control subjects, demonstrating that non-plaque-associated Aβ is not a contributing factor. Unexpectedly, we also observed high levels of secretory amyloid precursor protein α (sAPPα) in the frontal cortex of some TPD patients compared to AD and control subjects, suggesting differences in APP processing. Finally, we tested whether TPD is associated with changes in the tau gene (MAPT). Haplotype analysis demonstrates a strong association between TPD and the MAPT H1 haplotype, a genomic inversion associated with some tauopathies and Parkinson disease (PD), when compared to age-matched control subjects with mild degenerative changes, i.e., successful cerebral aging. Next-generation resequencing of MAPT followed by association analysis shows an association between TPD and two polymorphisms in the MAPT 3′ untranslated region (UTR). These results support the hypothesis that haplotype-specific variation in the MAPT 3′ UTR underlies an Aβ-independent mechanism for neurodegeneration in TPD.
PMCID: PMC3608475  PMID: 22802095
Dementia; Neurofibrillary tangle; Tau; Amyloid; MAPT; 3′ Untranslated region; Aging; Alzheimer’s disease; sAPPα
20.  A non-human primate system for large-scale genetic studies of complex traits 
Human Molecular Genetics  2012;21(15):3307-3316.
Non-human primates provide genetic model systems biologically intermediate between humans and other mammalian model organisms. Populations of Caribbean vervet monkeys (Chlorocebus aethiops sabaeus) are genetically homogeneous and large enough to permit well-powered genetic mapping studies of quantitative traits relevant to human health, including expression quantitative trait loci (eQTL). Previous transcriptome-wide investigation in an extended vervet pedigree identified 29 heritable transcripts for which levels of expression in peripheral blood correlate strongly with expression levels in the brain. Quantitative trait linkage analysis using 261 microsatellite markers identified significant (n = 8) and suggestive (n = 4) linkages for 12 of these transcripts, including both cis- and trans-eQTL. Seven transcripts, located on different chromosomes, showed maximum linkage to markers in a single region of vervet chromosome 9; this observation suggests the possibility of a master trans-regulator locus in this region. For one cis-eQTL (at B3GALTL, beta-1,3-glucosyltransferase), we conducted follow-up single nucleotide polymorphism genotyping and fine-scale association analysis in a sample of unrelated Caribbean vervets, localizing this eQTL to a region of <200 kb. These results suggest the value of pedigree and population samples of the Caribbean vervet for linkage and association mapping studies of quantitative traits. The imminent whole genome sequencing of many of these vervet samples will enhance the power of such investigations by providing a comprehensive catalog of genetic variation.
PMCID: PMC3392106  PMID: 22556363
21.  Reductions in intestinal Clostridiales precede the development of nosocomial Clostridium difficile infection 
Microbiome  2013;1:18.
Antimicrobial use is thought to suppress the intestinal microbiota, thereby impairing colonization resistance and allowing Clostridium difficile to infect the gut. Additional risk factors such as proton-pump inhibitors may also alter the intestinal microbiota and predispose patients to Clostridium difficile infection (CDI). This comparative metagenomic study investigates the relationship between epidemiologic exposures, intestinal bacterial populations and subsequent development of CDI in hospitalized patients. We performed a nested case–control study including 25 CDI cases and 25 matched controls. Fecal specimens collected prior to disease onset were evaluated by 16S rRNA gene amplification and pyrosequencing to determine the composition of the intestinal microbiota during the at-risk period.
The diversity of the intestinal microbiota was significantly reduced prior to an episode of CDI. Sequences corresponding to the phylum Bacteroidetes and to the families Bacteroidaceae and Clostridiales Incertae Sedis XI were depleted in CDI patients compared to controls, whereas sequences corresponding to the family Enterococcaceae were enriched. In multivariable analyses, cephalosporin and fluoroquinolone use, as well as a decrease in the abundance of Clostridiales Incertae Sedis XI were significantly and independently associated with CDI development.
This study shows that a reduction in the abundance of a specific bacterial family - Clostridiales Incertae Sedis XI - is associated with risk of nosocomial CDI and may represent a target for novel strategies to prevent this life-threatening infection.
PMCID: PMC3971611  PMID: 24450844
Intestinal microbiota; Clostridium difficile infection; 16S rRNA gene sequencing; Clostridiales Incertae Sedis XI
22.  Sequencing of the Dutch Elm Disease Fungus Genome Using the Roche/454 GS-FLX Titanium System in a Comparison of Multiple Genomics Core Facilities 
As part of the DNA Sequencing Research Group of the Association of Biomolecular Resource Facilities, we have tested the reproducibility of the Roche/454 GS-FLX Titanium System at five core facilities. Experience with the Roche/454 system ranged from <10 to >340 sequencing runs performed. All participating sites were supplied with an aliquot of a common DNA preparation and were requested to conduct sequencing at a common loading condition. The evaluation of sequencing yield and accuracy metrics was assessed at a single site. The study was conducted using a laboratory strain of the Dutch elm disease fungus Ophiostoma novo-ulmi strain H327, an ascomycete, vegetatively haploid fungus with an estimated genome size of 30–50 Mb. We show that the Titanium System is reproducible, with some variation detected in loading conditions, sequencing yield, and homopolymer length accuracy. We demonstrate that reads shorter than the theoretical minimum length are of lower overall quality and not simply truncated reads. The O. novo-ulmi H327 genome assembly is 31.8 Mb and is comprised of eight chromosome-length linear scaffolds, a circular mitochondrial contig of 66.4 kb, and a putative 4.2-kb linear plasmid. We estimate that the nuclear genome encodes 8613 protein coding genes, and the mitochondrion encodes 15 genes and 26 tRNAs.
PMCID: PMC3526337  PMID: 23542132
massively parallel DNA sequencing; fungal genomics; Ophiostoma novo-ulmi
24.  A founder mutation in the PEX6 gene is responsible for increased incidence of Zellweger syndrome in a French Canadian population 
BMC Medical Genetics  2012;13:72.
Zellweger syndrome (ZS) is a peroxisome biogenesis disorder due to mutations in any one of 13 PEX genes. Increased incidence of ZS has been suspected in French-Canadians of the Saguenay-Lac-St-Jean region (SLSJ) of Quebec, but this remains unsolved.
We identified 5 ZS patients from SLSJ diagnosed by peroxisome dysfunction between 1990 and 2010 and sequenced all coding exons of known PEX genes in one patient using Next Generation Sequencing (NGS) for diagnostic confirmation.
A homozygous mutation (c.802_815del, p.[Val207_Gln294del, Val76_Gln294del]) in PEX6 was identified and then shown in 4 other patients. Parental heterozygosity was confirmed in all. Incidence of ZS was estimated at 1 in 12,191 live births, with a carrier frequency of 1 in 55. In addition, we present data suggesting that this mutation abolishes a SF2/ASF splice enhancer binding site, resulting in the use of two alternative cryptic donor splice sites and predicted to encode an internally deleted in-frame protein.
We report increased incidence of ZS in French-Canadians of SLSJ caused by a PEX6 founder mutation. To our knowledge, this is the highest reported incidence of ZS worldwide. These findings have implications for carrier screening and support the utility of NGS for molecular confirmation of peroxisomal disorders.
PMCID: PMC3483250  PMID: 22894767
Zellweger syndrome; Founder effect; Peroxisome biogenesis disorders; Next generation sequencing
25.  Fourteen-Genome Comparison Identifies DNA Markers for Severe-Disease-Associated Strains of Clostridium difficile 
Journal of Clinical Microbiology  2011;49(6):2230-2238.
Clostridium difficile is a common cause of infectious diarrhea in hospitalized patients. A severe and increased incidence of C. difficile infection (CDI) is associated predominantly with the NAP1 strain; however, the existence of other severe-disease-associated (SDA) strains and the extensive genetic diversity across C. difficile complicate reliable detection and diagnosis. Comparative genome analysis of 14 sequenced genomes, including those of a subset of NAP1 isolates, allowed the assessment of genetic diversity within and between strain types to identify DNA markers that are associated with severe disease. Comparative genome analysis of 14 isolates, including five publicly available strains, revealed that C. difficile has a core genome of 3.4 Mb, comprising ∼3,000 genes. Analysis of the core genome identified candidate DNA markers that were subsequently evaluated using a multistrain panel of 177 isolates, representing more than 50 pulsovars and 8 toxinotypes. A subset of 117 isolates from the panel had associated patient data that allowed assessment of an association between the DNA markers and severe CDI. We identified 20 candidate DNA markers for species-wide detection and 10,683 single nucleotide polymorphisms (SNPs) associated with the predominant SDA strain (NAP1). A species-wide detection candidate marker, the sspA gene, was found to be the same across 177 sequenced isolates and lacked significant similarity to those of other species. Candidate SNPs in genes CD1269 and CD1265 were found to associate more closely with disease severity than currently used diagnostic markers, as they were also present in the toxin A-negative and B-positive (A-B+) strain types. The genetic markers identified illustrate the potential of comparative genomics for the discovery of diagnostic DNA-based targets that are species specific or associated with multiple SDA strains.
PMCID: PMC3122728  PMID: 21508155

