The genus Streptococcus comprises important pathogens that have a severe impact on human health and are responsible for substantial economic losses to agriculture. Here, we utilize 46 Streptococcus genome sequences (44 species), including eight species sequenced here, to provide the first genomic level insight into the evolutionary history and genetic basis underlying the functional diversity of all major groups of this genus. Gene gain/loss analysis revealed a dynamic pattern of genome evolution characterized by an initial period of gene gain followed by a period of loss, as the major groups within the genus diversified. This was followed by a period of genome expansion associated with the origins of the present extant species. The pattern is concordant with an emerging view that genomes evolve through a dynamic process of expansion and streamlining. A large proportion of the pan-genome has experienced lateral gene transfer (LGT) with causative factors, such as relatedness and shared environment, operating over different evolutionary scales. Multiple gene ontology terms were significantly enriched for each group, and mapping terms onto the phylogeny showed that those corresponding to genes born on branches leading to the major groups represented approximately one-fifth of those enriched. Furthermore, despite the extensive LGT, several biochemical characteristics have been retained since group formation, suggesting genomic cohesiveness through time, and that these characteristics may be fundamental to each group. For example, proteolysis: mitis group; urea metabolism: salivarius group; carbohydrate metabolism: pyogenic group; and transcription regulation: bovis group.
comparative genomics; phylogenetics; gene gain and loss; enrichment; lateral gene transfer
Rapid advances in sequencing technology have changed the experimental landscape of microbial ecology. In the last 10 years, the field has moved from sequencing hundreds of 16S rRNA gene fragments per study using clone libraries to the sequencing of millions of fragments per study using next-generation sequencing technologies from 454 and Illumina. As these technologies advance, it is critical to assess the strengths, weaknesses, and overall suitability of these platforms for the interrogation of microbial communities. Here, we present an improved method for sequencing variable regions within the 16S rRNA gene using Illumina's MiSeq platform, which is currently capable of producing paired 250-nucleotide reads. We evaluated three overlapping regions of the 16S rRNA gene that vary in length (i.e., V34, V4, and V45) by resequencing a mock community and natural samples from human feces, mouse feces, and soil. By titrating the concentration of 16S rRNA gene amplicons applied to the flow cell and using a quality score-based approach to correct discrepancies between reads used to construct contigs, we were able to reduce error rates by as much as two orders of magnitude. Finally, we reprocessed samples from a previous study to demonstrate that large numbers of samples could be multiplexed and sequenced in parallel with shotgun metagenomes. These analyses demonstrate that our approach can provide data that are at least as good as those generated by the 454 platform while providing considerably higher sequencing coverage for a fraction of the cost.
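The quality score-based correction of discrepancies between paired reads can be illustrated with a small sketch. The abstract does not give the exact rule, so the following is an assumption: the two reads are taken to be already aligned over their overlap (with the reverse read reverse-complemented), the base with the higher quality score wins when the scores differ by at least a hypothetical threshold `min_delta`, and the position is otherwise called as N. The function name and the threshold value are illustrative only.

```python
def merge_overlap(fwd, fwd_q, rev, rev_q, min_delta=6):
    """Return a consensus for an aligned overlap of two reads.

    fwd, rev     : equal-length base strings (rev already reverse-complemented)
    fwd_q, rev_q : per-base quality scores (integers), same length
    min_delta    : hypothetical minimum quality difference needed to trust
                   one base over the other when the reads disagree
    """
    if not (len(fwd) == len(rev) == len(fwd_q) == len(rev_q)):
        raise ValueError("reads and quality lists must have equal length")

    consensus = []
    for b1, q1, b2, q2 in zip(fwd, fwd_q, rev, rev_q):
        if b1 == b2:
            # Reads agree: keep the base.
            consensus.append(b1)
        elif abs(q1 - q2) >= min_delta:
            # Reads disagree: keep the base with the clearly higher quality.
            consensus.append(b1 if q1 > q2 else b2)
        else:
            # Reads disagree and neither is clearly better: ambiguous call.
            consensus.append("N")
    return "".join(consensus)


# One mismatch at position 3: the forward base has much higher quality.
resolved = merge_overlap("ACGT", [30, 30, 30, 30], "ACTT", [30, 30, 10, 30])
# Same mismatch with equal qualities: left ambiguous.
ambiguous = merge_overlap("ACGT", [30, 30, 30, 30], "ACTT", [30, 30, 30, 30])
```

Calling ambiguous positions as N rather than guessing lets downstream filtering discard contigs that still contain unresolved disagreements.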
Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled “intractable”, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the “non-contiguous finished” Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC-rich secondary structures, making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.
Clostridium difficile is the leading cause of hospital-associated diarrhoea in the US and Europe. Recently the incidence of C. difficile-associated disease has risen dramatically and concomitantly with the emergence of ‘hypervirulent’ strains associated with more severe disease and increased mortality. C. difficile contains numerous mobile genetic elements, resulting in the potential for a highly plastic genome. In the first sequenced strain, 630, there is one proven conjugative transposon (CTn), Tn5397, and six putative CTns (CTn1, CTn2 and CTn4-7), of which, CTn4 and CTn5 were capable of excision. In the second sequenced strain, R20291, two further CTns were described.
CTn1, CTn2, CTn4, CTn5 and CTn7 were shown to excise from the genome of strain 630 and transfer to strain CD37. A putative CTn from R20291, misleadingly termed a phage island previously, was shown to excise and to contain three putative mobilisable transposons, one of which was capable of excision. In silico probing of C. difficile genome sequences with recombinase gene fragments identified new putative conjugative and mobilisable transposons related to the elements in strains 630 and R20291. CTn5-like elements were described occupying different insertion sites in different strains; CTn1-like elements that have lost the ability to excise in some ribotype 027 strains were described; and one strain was shown to contain CTn5-like and CTn7-like elements arranged in tandem. Additionally, using bioinformatics, we updated previous gene annotations and predicted novel functions for the accessory gene products on these new elements.
The genomes of the C. difficile strains examined contain highly related CTns suggesting recent horizontal gene transfer. Several elements were capable of excision and conjugative transfer. The presence of antibiotic resistance genes and genes predicted to promote adaptation to the intestinal environment suggests that CTns play a role in the interaction of C. difficile with its human host.
Community acquired (CA) methicillin-resistant Staphylococcus aureus (MRSA) increasingly causes disease worldwide. USA300 has emerged as the predominant clone causing superficial and invasive infections in children and adults in the USA. Epidemiological studies suggest that USA300 is more virulent than other CA-MRSA. The genetic determinants that render virulence and dominance to USA300 remain unclear.
We sequenced the genomes of two pediatric USA300 isolates: one CA-MRSA and one CA-methicillin susceptible (MSSA), isolated at Texas Children's Hospital in Houston. DNA sequencing was performed by Sanger dideoxy whole genome shotgun (WGS) and 454 Life Sciences pyrosequencing strategies. The sequence of the USA300 MRSA strain was rigorously annotated. In USA300-MRSA, 2658 chromosomal open reading frames were predicted and 3.1 and 27 kilobase (kb) plasmids were identified. USA300-MSSA contained a 20 kb plasmid with some homology to the 27 kb plasmid found in USA300-MRSA. Two regions found in USA300-MRSA were absent in USA300-MSSA. One of these carried the arginine deiminase operon that appears to have been acquired from S. epidermidis. The USA300 sequence was aligned with other sequenced S. aureus genomes and regions unique to USA300 MRSA were identified.
USA300-MRSA is highly similar to other MRSA strains based on whole genome alignments and gene content, indicating that the differences in pathogenesis are due to subtle changes rather than to large-scale acquisition of virulence factor genes. The USA300 Houston isolate differs from another sequenced USA300 strain isolate, derived from a patient in San Francisco, in plasmid content and a number of sequence polymorphisms. Such differences will provide new insights into the evolution of pathogens.
The draft genome sequence of Mannheimia haemolytica A1, the causative agent of bovine respiratory disease complex (BRDC), is presented. Strain ATCC BAA-410, isolated from the lung of a calf with BRDC, was the DNA source. The annotated genome includes 2,839 coding sequences, 1,966 of which were assigned a function and 436 of which are unique to M. haemolytica. Through genome annotation many features of interest were identified, including bacteriophages and genes related to virulence, natural competence, and transcriptional regulation. In addition to previously described virulence factors, M. haemolytica encodes adhesins, including the filamentous hemagglutinin FhaB and two trimeric autotransporter adhesins. Two dual-function immunoglobulin-protease/adhesins are also present, as is a third immunoglobulin protease. Genes related to iron acquisition and drug resistance were identified and are likely important for survival in the host and virulence. Analysis of the genome indicates that M. haemolytica is naturally competent, as genes for natural competence and DNA uptake signal sequences (USS) are present. Comparison of competence loci and USS in other species in the family Pasteurellaceae indicates that M. haemolytica, Actinobacillus pleuropneumoniae, and Haemophilus ducreyi form a lineage distinct from other Pasteurellaceae. This observation was supported by a phylogenetic analysis using sequences of predicted housekeeping genes.
Rickettsia typhi, the causative agent of murine typhus, is an obligate intracellular bacterium with a life cycle involving both vertebrate and invertebrate hosts. Here we present the complete genome sequence of R. typhi (1,111,496 bp) and compare it to the two published rickettsial genome sequences: R. prowazekii and R. conorii. We identified 877 genes in R. typhi encoding 3 rRNAs, 33 tRNAs, 3 noncoding RNAs, and 838 proteins, 3 of which are frameshifts. In addition, we discovered more than 40 pseudogenes, including the entire cytochrome c oxidase system. The three rickettsial genomes share 775 genes: 23 are found only in R. prowazekii and R. typhi, 15 are found only in R. conorii and R. typhi, and 24 are unique to R. typhi. Although most of the genes are colinear, there is a 35-kb inversion in gene order, which is close to the replication terminus, in R. typhi, compared to R. prowazekii and R. conorii. In addition, we found a 124-kb R. typhi-specific inversion, starting 19 kb from the origin of replication, compared to R. prowazekii and R. conorii. Inversions in this region are also seen in the unpublished genome sequences of R. sibirica and R. rickettsii, indicating that this region is a hot spot for rearrangements. Genome comparisons also revealed a 12-kb insertion in the R. prowazekii genome, relative to R. typhi and R. conorii, which appears to have occurred after the typhus (R. prowazekii and R. typhi) and spotted fever (R. conorii) groups diverged. The three-way comparison allowed further in silico analysis of the SpoT split genes, leading us to propose that the stringent response system is still functional in these rickettsiae.
The leukotoxin of Mannheimia haemolytica is an important virulence factor that contributes to much of the pathology observed in the lungs of animals with bovine shipping fever pneumonia. We believe that identification of factors that regulate leukotoxin expression may provide insight into M. haemolytica pathogenicity. The DNA sequence upstream of the leukotoxin operon is divergently shared by PlapT, which transcribes an arginine permease gene. The intergenic region contains several elements that are potential sites for transcriptional modulation of the promoters. We have developed plasmid-borne chloramphenicol acetyltransferase (cat) operon fusions, as well as lktC::cat chromosomal fusions, to study transcription initiation in M. haemolytica. Using these genetic tools, we have identified cis-acting sequences and environmental conditions that modulate transcription of the leukotoxin and lapT promoters. By deletion analysis, promoters were shown to rely on sequences upstream of their −10 and −35 regions for full activity. Direct repeats of the sequence TGT-N(11)-ACA and a static bend region caused by phased adenine tracts were necessary for full activation of Plkt. A computer-generated model of the promoter's structure shows how DNA bending brings the repeat sequences within close proximity to the Plkt RNA polymerase, and we hypothesize that these repeats are a binding site for an activator of leukotoxin transcription. The lktC::cat operon fusion was also used to demonstrate that, like that of other RTX toxins, leukotoxin transcription is environmentally regulated. Roles for iron deprivation and temperature change were identified.
The leukotoxin of Pasteurella (Mannheimia) haemolytica is believed to play a significant role in pathogenesis, causing cell lysis and apoptosis that lead to the lung pathology characteristic of bovine shipping fever. Using a system for Cre-lox recombination, a nonpolar mutation within the lktC transacylase gene of the leukotoxin operon was created. The lktC locus was insertionally inactivated using a loxP-aph3-loxP cassette, and then the aph3 marker was excised from the chromosome by Cre recombinase expressed from a P. haemolytica plasmid. The resulting lktC strain (SH2099) secretes inactive leukotoxin and carries no known antibiotic resistance genes. Strain SH2099 was tested for virulence in a calf challenge model. We inoculated 3 × 10^8 or 3 × 10^9 CFU of wild-type or mutant bacteria into the lungs of healthy, colostrum-deprived calves via transthoracic injection. Animals were observed for clinical signs and for nasal colonization for 4 days, after which they were euthanized and necropsied. The lower inoculum (3 × 10^8 CFU) caused significantly fewer deaths and allowed lung pathology to be scored and compared, while the 3 × 10^9 CFU dose of either the wild-type or mutant was lethal to ≥50% of the calves. The estimated 50% lethal dose of SH2099 was four times higher than that of the wild-type strain. Lung lesion scores were reduced twofold in animals inoculated with the mutant, while clinical scores were nearly equivalent for both strains. The wild-type and mutant strains were equally capable of colonizing the upper respiratory tracts of the calves. In this study, the P. haemolytica lktC mutant was shown to be less virulent than the parent strain.
DNA samples derived from vertebrate skin, bodily cavities and body fluids contain both host and microbial DNA; the latter often present as a minor component. Consequently, DNA sequencing of a microbiome sample frequently yields reads originating from the microbe(s) of interest, but with a vast excess of host genome-derived reads. In this study, we used a methyl-CpG binding domain (MBD) to separate methylated host DNA from microbial DNA based on differences in CpG methylation density. MBD fused to the Fc region of a human antibody (MBD-Fc) binds strongly to protein A paramagnetic beads, forming an effective one-step enrichment complex that was used to remove human or fish host DNA from bacterial and protistan DNA for subsequent sequencing and analysis. We report enrichment of DNA samples from human saliva, human blood, a mock malaria-infected blood sample and a black molly fish. When reads were mapped to reference genomes, sequence reads aligning to host genomes decreased 50-fold, while bacterial and Plasmodium DNA sequence reads increased 8–11.5-fold. The Shannon-Wiener diversity index was calculated for 149 bacterial species in saliva before and after enrichment. Unenriched saliva had an index of 4.72, while the enriched sample had an index of 4.80. The similarity of these indices demonstrates that bacterial species diversity and relative phylotype abundance remain conserved in enriched samples. Enrichment using the MBD-Fc method holds promise for targeted microbiome sequence analysis across a broad range of sample types.
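For readers unfamiliar with the Shannon-Wiener index used above, a minimal sketch of the calculation follows. It computes H = -Σ p_i ln p_i from per-species read counts. The natural logarithm is an assumption (the abstract does not state the base), and the counts shown are made-up illustrative values, not data from the study.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener diversity H = -sum(p_i * ln(p_i)) over observed taxa.

    counts: iterable of non-negative read counts, one per species.
    Species with zero reads contribute nothing to the index.
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one species must have a positive count")
    return -sum((c / total) * math.log(c / total) for c in counts)


# Illustrative counts only: four equally abundant species give H = ln(4).
even_community = shannon_index([25, 25, 25, 25])
# A community dominated by one species has a much lower index.
skewed_community = shannon_index([97, 1, 1, 1])
```

Under the natural-log assumption, the maximum possible value for 149 equally abundant species is ln(149), about 5.0, so indices of 4.72 and 4.80 would both describe fairly even communities.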
Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its “pan-genome”. We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800–3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25 and 53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation.
The results of this genome content comparison were used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to link the distribution pattern of a specific phenotype to the presence/absence of specific sets of genes.
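The pan-genome, core genome and variable genome described in this abstract can be expressed as simple set operations once every gene has been assigned to an ortholog group. The sketch below assumes that assignment has already been done by some upstream tool; the strain names and group identifiers are hypothetical placeholders, not data from the study.

```python
def pan_core_variable(genomes):
    """Split ortholog groups into pan-, core and variable genome.

    genomes: dict mapping strain name -> iterable of ortholog group IDs
             present in that strain.
    Returns (pan, core, variable) as sets, where
      pan      = groups found in at least one strain,
      core     = groups found in every strain,
      variable = pan minus core.
    """
    group_sets = [set(groups) for groups in genomes.values()]
    if not group_sets:
        return set(), set(), set()
    pan = set().union(*group_sets)
    core = set.intersection(*group_sets)
    return pan, core, pan - core


# Hypothetical toy input: three strains, five ortholog groups.
toy_genomes = {
    "strain_A": {"OG1", "OG2", "OG3"},
    "strain_B": {"OG1", "OG2", "OG4"},
    "strain_C": {"OG1", "OG2", "OG5"},
}
pan, core, variable = pan_core_variable(toy_genomes)
```

The same presence/absence table is the natural starting point for gene-trait matching, where each variable group is compared against a phenotype such as growth on a given carbohydrate.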
We determined the complete genome sequence of Lactobacillus brevis KB290, a probiotic lactic acid bacterium isolated from a traditional Japanese fermented vegetable. The genome contained a 2,395,134-bp chromosome that housed 2,391 protein-coding genes and nine plasmids that together accounted for 191 protein-coding genes. KB290 contained no virulence factor genes, and several genes related to presumptive cell wall-associated polysaccharide biosynthesis and the stress response were present in L. brevis KB290 but not in the closely related L. brevis ATCC 367. Plasmid-curing experiments revealed that the presence of plasmid pKB290-1 was essential for the strain's gastrointestinal tract tolerance and tendency to aggregate. Using next-generation deep sequencing of current and 18-year-old stock strains to detect low frequency variants, we evaluated genome stability. Deep sequencing of four periodic KB290 culture stocks with more than 1,000-fold coverage revealed 3 mutation sites and 37 minority variation sites, indicating long-term stability and providing a useful method for assessing the stability of industrial bacteria at the nucleotide level.
Dental decay is one of the most prevalent chronic diseases worldwide. A variety of factors, including microbial, genetic, immunological, behavioral and environmental, interact to contribute to dental caries onset and development. Previous studies focused on the microbial basis for dental caries have identified species associated with both dental health and disease. The purpose of the current study was to improve our knowledge of the microbial species involved in dental caries and health by performing a comprehensive 16S rDNA profiling of the dental plaque microbiome of both caries-free and caries-active subjects. Analysis of over 50,000 nearly full-length 16S rDNA clones allowed the identification of 1,372 operational taxonomic units (OTUs) in the dental plaque microbiome. Approximately half of the OTUs were common to both caries-free and caries-active microbiomes and present at similar abundance. The majority of differences in OTUs reflected very low abundance phylotypes. This survey allowed us to define the population structure of the dental plaque microbiome and to identify the microbial signatures associated with dental health and disease. The deep profiling of dental plaque allowed the identification of 87 phylotypes that are over-represented in either caries-free or caries-active subjects. Among these signatures, those associated with dental health outnumbered those associated with dental caries by nearly two-fold. A comparison of these data to other published studies indicates significant heterogeneity in study outcomes and suggests that novel approaches may be required to further define the signatures of dental caries onset and progression.
Peritonitis is the major disease problem of laying hens in commercial table egg and parent stock operations. Despite its importance, the etiology and pathogenesis of this disease have not been completely clarified. Although avian pathogenic Escherichia coli (APEC) isolates have been incriminated as the causative agent of laying hen peritonitis, Gallibacterium anatis are frequently isolated from peritonitis lesions. Despite recent studies suggesting a role for G. anatis in the pathogenesis of peritonitis, little is known about the organism’s virulence mechanisms, genomic composition and population dynamics. Here, we compared the genome sequences of three G. anatis isolates in an effort to understand its virulence mechanisms and identify novel antigenic traits. A multilocus sequence typing method was also established for G. anatis and used to characterize the genotypic relatedness of 71 isolates from commercial laying hens in Iowa and 18 international reference isolates. Genomic comparisons suggest that G. anatis is a highly diverse bacterial species, with some strains possessing previously described and potential virulence factors, but with a core genome containing several antigenic candidates. Multilocus sequence typing effectively distinguished 82 sequence types and several clonal complexes of G. anatis, and some clones seemed to predominate among G. anatis populations from commercial layers in Iowa. Biofilm formation and resistance to antimicrobial agents were also observed in several clades. Overall, the genomic diversity of G. anatis suggests that multiple lineages exist with differing pathogenic potential towards birds.
Colonization of the gastrointestinal (GI) tract is initiated during birth and continually seeded from the individual’s environment. Gastrointestinal microorganisms play a central role in developing and modulating host immune responses and have been the subject of investigation over the last decades. Animal studies have demonstrated the impact of GI tract microbiota on local gastrointestinal immune responses; however, the full spectrum of action of early gastrointestinal tract stimulation and subsequent modulation of systemic immune responses is poorly understood. This study explored the utility of an oral microbial inoculum as a therapeutic tool to affect porcine systemic immune responses. For this study a litter of 12 pigs was split into two groups. One group of pigs was inoculated with a non-pathogenic oral inoculum (modulated), while another group (control) was not. DNA extracted from nasal swabs and fecal samples collected throughout the study was sequenced to determine the effects of the oral inoculation on GI and respiratory microbial communities. The effects of GI microbial modulation on systemic immune responses were evaluated by experimentally infecting with the pathogen Mycoplasma hyopneumoniae. Coughing levels, pathology, toll-like receptors 2 and 6, and cytokine production were measured throughout the study. Sequencing results show a successful modulation of the GI and respiratory microbiomes through oral inoculation. Delayed type hypersensitivity responses were stronger (p = 0.07), and the average coughing levels and respiratory TNF-α variance were significantly lower in the modulated group (p<0.0001 and p = 0.0153, respectively). The M. hyopneumoniae infection study showed beneficial effects of the oral inoculum on systemic immune responses including antibody production, severity of infection and cytokine levels. 
These results suggest that an oral microbial inoculation can be used to modulate microbial communities, as well as have a beneficial effect on systemic immune responses as demonstrated with M. hyopneumoniae infection.
A novel non-culture based 16S rRNA Terminal Restriction Fragment Length Polymorphism (T-RFLP) method using the restriction enzymes Tsp509I and Hpy166II was developed for the characterization of the nasopharyngeal microbiota and validated using recently published 454 pyrosequencing data. 16S rRNA gene T-RFLP for 153 clinical nasopharyngeal samples from infants with acute otitis media (AOM) revealed 5 Tsp509I and 6 Hpy166II terminal fragments (TFs) with a prevalence of >10%. Cloning and sequencing identified all TFs with a prevalence >6%, allowing a sufficient description of bacterial community changes for the most important bacterial taxa. The conjugated 7-valent pneumococcal polysaccharide vaccine (PCV-7) and prior antibiotic exposure had significant effects on the bacterial composition in an additive main effects and multiplicative interaction model (AMMI) in concordance with the 16S rRNA 454 pyrosequencing data. In addition, the presented T-RFLP method is able to discriminate S. pneumoniae from other members of the Mitis group of streptococci, which therefore allows the identification of one of the most important human respiratory tract pathogens. This is usually not achieved by current high throughput sequencing protocols. In conclusion, the presented 16S rRNA gene T-RFLP method is a highly robust, easy-to-handle and cheap alternative to the computationally demanding next-generation sequencing analysis. When many nasopharyngeal samples have to be characterized, it is suggested to first perform 16S rRNA T-RFLP and only use next generation sequencing if the T-RFLP nasopharyngeal patterns differ or show unknown TFs.
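The idea behind a terminal fragment (TF) can be sketched in silico: the TF length is the distance from the labelled end of the amplicon to the first cut site of the enzyme. The sketch below is illustrative and rests on assumptions not stated in the abstract: the label is taken to be at the 5' end of the given strand, the recognition sites are the commonly cited ones (Tsp509I cutting before AATT, Hpy166II cutting GTN^NAC), and the example sequences are invented.

```python
import re

# Recognition pattern and cut offset (bases from the start of the site).
# Site definitions are the commonly cited ones and are treated here as
# assumptions: Tsp509I ^AATT, Hpy166II GTN^NAC.
ENZYMES = {
    "Tsp509I": (re.compile(r"AATT"), 0),
    "Hpy166II": (re.compile(r"GT[ACGT][ACGT]AC"), 3),
}


def terminal_fragment_length(amplicon, enzyme):
    """Length of the 5'-terminal fragment after digestion with `enzyme`.

    Returns the full amplicon length if the enzyme does not cut.
    """
    pattern, cut_offset = ENZYMES[enzyme]
    match = pattern.search(amplicon.upper())
    if match is None:
        return len(amplicon)
    return match.start() + cut_offset


# Invented toy sequences, not real 16S rRNA amplicons.
tf_tsp = terminal_fragment_length("GGGAATTCCC", "Tsp509I")
tf_hpy = terminal_fragment_length("CCGTAAACGG", "Hpy166II")
```

Two taxa are distinguishable by T-RFLP only if their amplicons give different TF lengths for at least one of the enzymes, which is why two enzymes are used in combination.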
Outbreaks of antibiotic-resistant bacterial infections emphasize the importance of surveillance of potentially pathogenic bacteria. Genomic sequencing of clinical microbiological specimens expands our capacity to study cultivable, fastidious and uncultivable members of the bacterial community. Herein, we compared the primary data collected by the NIH’s Human Microbiome Project (HMP) with published epidemiological surveillance data of Staphylococcus aureus.
The HMP’s initial dataset contained microbial survey data from five body regions (skin, nares, oral cavity, gut and vagina) of 242 healthy volunteers. A significant component of the HMP dataset was deep sequencing of the 16S ribosomal RNA gene, which contains variable regions enabling taxonomic classification. Since species-level identification is essential in clinical microbiology, we built a reference database and used phylogenetic placement followed by most recent common ancestor classification to look at the species distribution for Staphylococcus, Klebsiella and Enterococcus.
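The most recent common ancestor step can be sketched as follows. After placement, a read may be compatible with several reference sequences; the read is then assigned to the deepest taxon that all of those references share. The function below is an illustrative simplification under stated assumptions: lineages are given as root-to-tip tuples of taxon names, and placement weights or likelihoods that a real placement tool would report are ignored.

```python
def common_ancestor(lineages):
    """Return the deepest taxonomic prefix shared by all lineages.

    lineages: list of tuples ordered from the root downwards,
              e.g. ("Bacteria", "Firmicutes", ..., "Staphylococcus aureus").
    An empty tuple means the candidates share no rank at all.
    """
    if not lineages:
        return ()
    shared = []
    for taxa_at_rank in zip(*lineages):
        if len(set(taxa_at_rank)) != 1:
            break  # candidates diverge at this rank
        shared.append(taxa_at_rank[0])
    return tuple(shared)


# A read placed near two different staphylococci resolves only to genus.
genus_level = common_ancestor([
    ("Bacteria", "Firmicutes", "Staphylococcus", "Staphylococcus aureus"),
    ("Bacteria", "Firmicutes", "Staphylococcus", "Staphylococcus epidermidis"),
])
# A read whose candidate placements agree resolves to species.
species_level = common_ancestor([
    ("Bacteria", "Firmicutes", "Staphylococcus", "Staphylococcus aureus"),
    ("Bacteria", "Firmicutes", "Staphylococcus", "Staphylococcus aureus"),
])
```

This makes the dependence on the sequenced region concrete: a region in which two species are identical yields candidate sets spanning both, and the classification falls back to genus.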
We show that selecting the appropriate region of the 16S rRNA gene to sequence is analogous to carefully selecting culture conditions to distinguish closely related bacterial species. Analysis of the HMP data showed that Staphylococcus aureus was present in the nares of 36% of healthy volunteers, consistent with culture-based epidemiological data. Klebsiella pneumoniae and Enterococcus faecalis were found less frequently, but across many habitats.
This work demonstrates that large 16S rRNA survey studies can be used to support epidemiological goals in the context of an increasing awareness that microbes flourish and compete within a larger bacterial community. This study demonstrates how genomic techniques and information could be critically important to trace microbial evolution and implement hospital infection control.
Malawi commenced the introduction of the 13-valent pneumococcal conjugate vaccine (PCV13) into the routine infant immunisation schedule in November 2011. Here we have tested the utility of high throughput whole genome sequencing to provide a high-resolution view of pre-vaccine pneumococcal epidemiology and population evolutionary trends to predict potential future change in population structure post introduction.
One hundred and twenty seven (127) archived pneumococcal isolates from randomly selected adults and children presenting to the Queen Elizabeth Central Hospital, Blantyre, Malawi underwent whole genome sequencing.
The pneumococcal population was dominated by serotype 1 (20.5% of invasive isolates) prior to vaccine introduction. PCV13 is likely to protect against 62.9% of all circulating invasive pneumococci (78.3% in under-5-year-olds). Several Pneumococcal Molecular Epidemiology Network (PMEN) clones are now in circulation in Malawi which were previously undetected but the pandemic multidrug resistant PMEN1 lineage was not identified. Genome analysis identified a number of novel sequence types and serotype switching.
High throughput genome sequencing is now feasible and has the capacity to simultaneously elucidate serotype and sequence type as well as detailed genetic information. It enables population level characterization, providing a detailed picture of population structure and genome evolution relevant to disease control. Post-vaccine introduction surveillance supported by genome sequencing is essential to providing a comprehensive picture of the impact of PCV13 on pneumococcal population structure and informing future public health interventions.
Microbes of the human respiratory tract are important in health and disease, but accurate sampling of the lung presents challenges. Lung microbes are commonly sampled by bronchoscopy, but to acquire samples the bronchoscope must pass through the upper respiratory tract, which is rich in microbes. Here we present methods to identify authentic lung microbiota in bronchoalveolar lavage (BAL) fluid that contains substantial oropharyngeal admixture. We studied clinical BAL samples from six selected subjects with potential heavy lung colonization. A single sample of BAL fluid was obtained from each subject along with contemporaneous oral wash (OW) to sample the oropharynx, and then DNA was extracted from three separate aliquots of each. Bacterial 16S rDNA sequences were amplified and products analyzed by 454 pyrosequencing. By comparing replicates, we were able to specify the depth of sequencing needed to reach a 95% chance of identifying a bacterial lineage of a given proportion; for example, at a depth of 5,000 tags, OTUs of proportion 0.3% or greater would be called with 95% confidence. We next constructed a single-sided outlier test that allowed lung-enriched organisms to be quantified against a background of oropharyngeal admixture, and assessed improvements available with replicate sequence analysis. This allowed identification of lineages enriched in lung in some BAL specimens. Finally, using samples from healthy volunteers collected at multiple sites in the upper respiratory tract, we show that OW provides a reasonable but not perfect surrogate for bacteria carried into the lung by a bronchoscope. These methods allow identification of microbes that can replicate in the lung despite the background due to oropharyngeal microbes derived from aspiration and bronchoscopic carry-over.
Recent losses in honey bee colonies are unusual in their severity, geographical distribution, and, in some cases, failure to present recognized characteristics of known disease. Domesticated honey bees face numerous pests and pathogens, tempting hypotheses that colony collapses arise from exposure to new or resurgent pathogens. Here we explore the incidence and abundance of currently known honey bee pathogens in colonies suffering from Colony Collapse Disorder (CCD), otherwise weak colonies, and strong colonies from across the United States. Although pathogen identities differed between the eastern and western United States, there was a greater incidence and abundance of pathogens in CCD colonies. Pathogen loads were highly covariant in CCD but not control hives, suggesting that CCD colonies rapidly become susceptible to a diverse set of pathogens, or that co-infections can act synergistically to produce the rapid depletion of workers that characterizes the disorder. We also tested workers from a CCD-free apiary to confirm that significant positive correlations among pathogen loads can develop at the level of individual bees and not merely as a secondary effect of CCD. This observation and other recent data highlight pathogen interactions as important components of bee disease. Finally, we used deep RNA sequencing to further characterize microbial diversity in CCD and non-CCD hives. We identified novel strains of the recently described Lake Sinai viruses (LSV) and found evidence of a shift in gut bacterial composition that may be a biomarker of CCD. The results are discussed with respect to host-parasite interactions and other environmental stressors of honey bees.
Few microbial functions have been compared to a comprehensive survey of the human fecal microbiome. We evaluated determinants of fecal microbial β-glucuronidase and β-glucosidase activities, focusing especially on associations with microbial alpha and beta diversity and taxonomy. We enrolled 51 healthy volunteers (26 female, mean age 39) who provided questionnaire data and multiple aliquots of a stool, from which proteins were extracted to quantify β-glucuronidase and β-glucosidase activities, and DNA was extracted to amplify and pyrosequence 16S rRNA gene sequences to classify and quantify microbiome diversity and taxonomy. Fecal β-glucuronidase was elevated with weight loss of at least 5 lb. (P = 0.03), whereas β-glucosidase was marginally reduced in the four vegetarians (P = 0.06). Both enzymes were correlated directly with microbiome richness and alpha diversity measures, directly with the abundance of four Firmicutes Clostridia genera, and inversely with the abundance of two other genera (Firmicutes Lactobacillales Streptococcus and Bacteroidetes Rikenellaceae Alistipes) (all P = 0.05–0.0001). Beta diversity reflected the taxonomic associations. These observations suggest that these enzymatic functions are performed by particular taxa and that diversity indices may serve as surrogates of bacterial functions. Independent validation and deeper understanding of these associations are needed, particularly to characterize functions and pathways that may be amenable to manipulation.
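The abstract above refers to microbiome richness and alpha diversity without naming specific indices. As a reminder of what such measures compute, the sketch below calculates observed richness and the Shannon index from a vector of per-taxon read counts. The choice of these two indices and the example counts are assumptions made for illustration, not details taken from the study.

```python
from math import log


def observed_richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one read")
    # Taxa with zero reads contribute nothing to the sum and are skipped.
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical read counts for five taxa in one stool sample.
    sample = [120, 45, 30, 0, 5]
    print(observed_richness(sample), shannon_index(sample))
```

An even community maximizes the Shannon index for a given richness, so the two measures can diverge when a few taxa dominate a sample.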
Metagenome sequencing is becoming common and there is an increasing need for easily accessible tools for data analysis. An essential step is the taxonomic classification of sequence fragments. We describe a web server for the taxonomic assignment of metagenome sequences with PhyloPythiaS. PhyloPythiaS is a fast and accurate sequence composition-based classifier that utilizes the hierarchical relationships between clades. Taxonomic assignments with the web server can be made with a generic model, or with sample-specific models that users can specify and create. Several interactive visualization modes and multiple download formats allow quick and convenient analysis and downstream processing of taxonomic assignments. Here, we demonstrate usage of our web server by taxonomic assignment of metagenome samples from an acidophilic biofilm community of an acid mine and of a microbial community from cow rumen.
The human gut harbors thousands of bacterial taxa. A profusion of metagenomic sequence data has been generated from human stool samples in the last few years, raising the question of whether more taxa remain to be identified. We assessed metagenomic data generated by the Human Microbiome Project Consortium to determine if novel taxa remain to be discovered in stool samples from healthy individuals. To do this, we established a rigorous bioinformatics pipeline that uses sequence data from multiple platforms (Illumina GAIIX and Roche 454 FLX Titanium) and approaches (whole-genome shotgun and 16S rDNA amplicons) to validate novel taxa. We applied this approach to stool samples from 11 healthy subjects collected as part of the Human Microbiome Project. We discovered several low-abundance, novel bacterial taxa, which span three major phyla in the bacterial tree of life. We determined that these taxa are present in a larger set of Human Microbiome Project subjects and are found in two sampling sites (Houston and St. Louis). We show that the number of false-positive novel sequences (primarily chimeric sequences) would have been two orders of magnitude higher than the true number of novel taxa without validation using multiple datasets, highlighting the importance of establishing rigorous standards for the identification of novel taxa in metagenomic data. The majority of novel sequences are related to the recently discovered genus Barnesiella, further encouraging efforts to characterize the members of this genus and to study their roles in the microbial communities of the gut. A better understanding of the effects of less-abundant bacteria is important as we seek to understand the complex gut microbiome in healthy individuals and link changes in the microbiome to disease.
Ventilator-associated pneumonia (VAP) is a common nosocomial infection in mechanically ventilated patients. Biofilm formation is one of the mechanisms through which the endotracheal tube (ET) facilitates bacterial contamination of the lower airways. In the present study, we analyzed the composition of the ET biofilm flora by means of culture-dependent and culture-independent (16S rRNA gene clone libraries and pyrosequencing) approaches. Overall, the microbial diversity was high and members of different phylogenetic lineages were detected (Actinobacteria, beta-Proteobacteria, Candida spp., Clostridia, epsilon-Proteobacteria, Firmicutes, Fusobacteria and gamma-Proteobacteria). Culture-dependent analysis, based on the use of selective growth media and conventional microbiological tests, resulted in the identification of typical aerobic nosocomial pathogens which are known to play a role in the development of VAP, e.g. Staphylococcus aureus and Pseudomonas aeruginosa. Other opportunistic pathogens were also identified, including Staphylococcus epidermidis and Kocuria varians. In general, there was little correlation between the results obtained by sequencing 16S rRNA gene clone libraries and by cultivation. Pyrosequencing of PCR-amplified 16S rRNA genes of four selected samples resulted in the identification of a much wider variety of bacteria. The results from the pyrosequencing analysis suggest that these four samples were dominated by members of the normal oral flora such as Prevotella spp., Peptostreptococcus spp. and lactic acid bacteria. A combination of methods is recommended to obtain a complete picture of the microbial diversity of the ET biofilm.