We sought to measure trends in Streptococcus pneumoniae (SP) carriage and antibiotic resistance in young children in Massachusetts communities after widespread adoption of heptavalent pneumococcal conjugate vaccine (PCV7) and before the introduction of the 13-valent pneumococcal conjugate vaccine (PCV13).
We conducted a cross-sectional study including collection of questionnaire data and nasopharyngeal specimens among children <7 years in primary care practices from 8 Massachusetts communities during the winter season of 2008–9 and compared the results with those of similar studies performed in 2001, 2003–4, and 2006–7. Antimicrobial susceptibility testing and serotyping were performed on pneumococcal isolates, and risk factors for colonization in recent seasons (2006–07 and 2008–09) were evaluated.
We collected nasopharyngeal specimens from 1,011 children, 290 (29%) of whom were colonized with pneumococcus. Non-PCV7 serotypes accounted for 98% of pneumococcal isolates, most commonly 19A (14%), 6C (11%), and 15B/C (11%). In 2008–09, newly-targeted PCV13 serotypes accounted for 20% of carriage isolates and 41% of penicillin non-susceptible S. pneumoniae (PNSP). In multivariate models, younger age, child care, young siblings, and upper respiratory illness remained predictors of pneumococcal carriage, despite near-complete serotype replacement. Only young age and child care were significantly associated with PNSP carriage.
Serotype replacement post-PCV7 is essentially complete and has been sustained in young children, with the relatively virulent 19A being the most common serotype. Predictors of carriage remained similar despite serotype replacement. PCV13 may reduce 19A and decrease antibiotic-resistant strains, but monitoring for new serotype replacement is warranted.
Streptococcus pneumoniae; pneumococcal conjugate vaccine; antibiotic resistance; serotype; colonization
Infections caused by multi-resistant Gram positive bacteria represent a major health burden in the community as well as in hospitalized patients. Staphylococcus aureus, Enterococcus faecalis and Enterococcus faecium are well-known pathogens of hospitalized patients, frequently linked with resistance against multiple antibiotics, compromising effective therapy. Streptococcus pneumoniae and Streptococcus pyogenes are important pathogens in the community and S. aureus has recently emerged as an important community-acquired pathogen.
Population genetic studies reveal that recombination prevails as a driving force of genetic diversity in E. faecium, E. faecalis, S. pneumoniae, and S. pyogenes and thus, these species are weakly clonal. Although recombination has a relatively modest role in driving the genetic variation of the core genome of S. aureus, the horizontal acquisition of resistance and virulence genes plays a key role in the emergence of new clinically relevant clones in this species. In this review we discuss the population genetics of E. faecium, E. faecalis, S. pneumoniae, S. pyogenes, and S. aureus. Knowledge of the population structure of these pathogens is not only highly relevant for (molecular) epidemiological research but also for identifying the genetic variation that underlies changes in clinical behaviour, to improve our understanding of the pathogenic behaviour of particular clones and to identify novel targets for vaccines or immunotherapy.
Enterococcus; Streptococcus; Staphylococcus; MLST; evolution; Molecular epidemiology
Enterococcus faecium has recently emerged as an important multiresistant nosocomial pathogen. Defining population structure in this species is required to provide insight into the existence, distribution, and dynamics of specific multiresistant or pathogenic lineages in particular environments, like the hospital. Here, we probe the population structure of E. faecium using Bayesian-based population genetic modeling implemented in Bayesian Analysis of Population Structure (BAPS) software. The analysis involved 1,720 isolates belonging to 519 sequence types (STs) (491 for E. faecium and 28 for Enterococcus faecalis). E. faecium isolates grouped into 13 BAPS (sub)groups, but the large majority (80%) of nosocomial isolates clustered in two subgroups (2-1 and 3-3). Phylogenetic and eBURST analysis of BAPS groups 2 and 3 confirmed the existence of three separate hospital lineages (17, 18, and 78), highlighting different evolutionary trajectories for BAPS 2-1 (lineage 78) and 3-3 (lineage 17 and lineage 18) isolates. Phylogenomic analysis of 29 E. faecium isolates showed agreement between BAPS assignment of STs and their relative positions in the phylogenetic tree. Odds ratio calculation confirmed the significant association between hospital isolates with BAPS 3-3 and lineages 17, 18, and 78. Admixture analysis showed a scarce number of recombination events between the different BAPS groups. For the E. faecium hospital population, we propose an evolutionary model in which strains with a high propensity to colonize and infect hospitalized patients arise through horizontal gene transfer. Once adapted to the distinct hospital niche, this subpopulation becomes isolated, and recombination with other populations declines.
Multiresistant Enterococcus faecium has become one of the most important nosocomial pathogens, causing increasing numbers of nosocomial infections worldwide. Here, we used Bayesian population genetic analysis to identify groups of related E. faecium strains and show a significant association of hospital and farm animal isolates to different genetic groups. We also found that hospital isolates could be divided into three lineages originating from sequence types (STs) 17, 18, and 78. We propose that, driven by the selective pressure in hospitals, the three hospital lineages have arisen through horizontal gene transfer, but once adapted to the distinct pathogenic niche, this population has become isolated and recombination with other populations declines. Elucidation of the population structure is a prerequisite for effective control of multiresistant E. faecium since it provides insight into the processes that have led to the progressive change of E. faecium from an innocent commensal to a multiresistant hospital-adapted pathogen.
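The odds ratio calculation mentioned above, relating isolate source to BAPS group or lineage, can be illustrated with a 2x2 table. The following is a minimal sketch, not the authors' analysis: the function name, the Woolf (log-based) 95% confidence interval, and the counts are all assumptions chosen for illustration.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (Woolf method) for a 2x2 table.

    a: hospital isolates inside the group    b: hospital isolates outside it
    c: other isolates inside the group       d: other isolates outside it
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper

# Hypothetical counts, not taken from the study.
ratio, lower, upper = odds_ratio(40, 10, 10, 40)
print(f"OR = {ratio:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

A confidence interval that excludes 1 is what is usually meant by a significant association in this setting.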
Invasive pneumococcal disease (IPD) has been reduced in the US following conjugate vaccination (PCV7) targeting seven pneumococcal serotypes in 2000. However, increases in IPD due to other serotypes have been observed, in particular 19A. How much this “serotype replacement” will erode the benefits of vaccination and over what timescale is unknown. We used a population genetic approach to test first whether the selective impact of vaccination could be detected in a longitudinal carriage sample, and secondly how long it persisted following introduction of the vaccine in 2000. To detect the selective impact of the vaccine we compared the serotype diversity of samples from pneumococcal carriage in Massachusetts children collected in 2001, 2004 and 2007 with others collected in the pre-vaccine era in Massachusetts, the UK and Finland. The 2004 sample was significantly (p < 0.0001) more diverse than pre-vaccine samples, indicating the selective pressure of vaccination. The 2007 sample showed no significant difference in diversity from the pre-vaccine period, and exhibited similar population structure, but with different serotypes. In 2007 the carriage frequency of 19A was similar to that of the most common serotype in pre-vaccine samples. We suggest that serotype replacement involving 19A may be complete in Massachusetts due to similarities in population structure to pre-vaccine samples. These results suggest that the replacement phenomenon occurs rapidly with high vaccine coverage, and may allay concerns about future increases in disease due to 19A. For other serotypes, the future course of replacement disease remains to be determined.
Streptococcus pneumoniae; Infectious disease epidemiology; Nasopharyngeal carriage; Population genetics
MLST; conjugate vaccination; Streptococcus pneumoniae; nasopharyngeal carriage
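The comparison of serotype diversity between carriage samples described above can be sketched with a standard diversity statistic. The abstract does not say which statistic was used, so Simpson's index of diversity is assumed here purely for illustration; the function name and the toy samples are hypothetical.

```python
from collections import Counter

def simpson_diversity(serotypes):
    """Simpson's index of diversity: the probability that two isolates
    drawn without replacement from the sample have different serotypes."""
    counts = Counter(serotypes)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two isolates")
    same = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same / (n * (n - 1))

# Hypothetical samples: one dominated by a single serotype, one more even.
dominated = ["19A"] * 8 + ["6A", "15B/C"]
even = ["19A", "6A", "15B/C", "35B", "11A"] * 2
print(simpson_diversity(dominated), simpson_diversity(even))
```

Judging whether two samples differ significantly would additionally need a variance estimate or a resampling test, which this sketch leaves out.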
Defining the propensity of Streptococcus pneumoniae (SP) serotypes to invade sterile body sites following nasopharyngeal (NP) acquisition has the potential to inform estimates of how much invasive pneumococcal disease (IPD) may occur in a typical population with a given distribution of carriage serotypes. Data from enhanced surveillance for IPD in Massachusetts children ≤7 years in 2003/04, 2006/07 and 2008/09 seasons and surveillance of SP NP carriage during the corresponding respiratory seasons in 16 Massachusetts communities in 2003/04 and 8 of the 16 communities in both 2006/07 and 2008/09 were used to compute a serotype specific “invasive capacity (IC)” by dividing the incidence of IPD due to serotype x by the carriage prevalence of that same serotype in children of the same age. A total of 206 IPD and 806 NP isolates of SP were collected during the study period. An approximate 50-fold variation in the point estimates between the serotypes having the highest (18C, 33F, 7F, 19A, 3 and 22F) and lowest (6C, 23A, 35F, 11A, 35B, 19F, 15A, and 15BC) IC was observed. Point estimates of IC for most of the common serotypes currently colonizing children in Massachusetts were low and likely explain the continued reduction in IPD from the pre-PCV era in the absence of specific protection against these serotypes. Invasive capacity differs among serotypes and as new pneumococcal conjugate vaccines are introduced, ongoing surveillance will be essential to monitor whether serotypes with high invasive capacity emerge (e.g. 33F, 22F) as successful colonizers resulting in increased IPD incidence due to replacement serotypes.
Streptococcus pneumoniae; serotype; invasive capacity
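The invasive capacity defined above is a simple ratio, and a short worked example makes the arithmetic explicit. This is a sketch under stated assumptions: the function name, the use of person-years as the incidence denominator, and all counts are hypothetical and are not the study's data.

```python
def invasive_capacity(ipd_cases, person_years, carriers, swabbed):
    """Serotype-specific IC: IPD incidence divided by carriage prevalence."""
    if carriers == 0:
        raise ValueError("carriage prevalence is zero; IC is undefined")
    incidence = ipd_cases / person_years   # IPD cases per person-year
    prevalence = carriers / swabbed        # fraction of swabbed children colonized
    return incidence / prevalence

# Hypothetical serotypes: A is rarely carried but often invasive,
# B is commonly carried but rarely invasive.
ic_a = invasive_capacity(ipd_cases=10, person_years=100_000, carriers=20, swabbed=1_000)
ic_b = invasive_capacity(ipd_cases=2, person_years=100_000, carriers=100, swabbed=1_000)
print(ic_a, ic_b, ic_a / ic_b)
```

Because both serotypes share the same denominators in this toy case, the ratio of the two IC values depends only on the case and carrier counts.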
Phenotypic and genetic variation in bacteria can take bewilderingly complex forms even within a single genus. One of the most intriguing examples of this is the genus Neisseria, which comprises both pathogens and commensals colonizing a variety of body sites and host species, and causing a range of disease. Complex relatedness among both named species and previously identified lineages of Neisseria makes it challenging to study their evolution. Using the largest publicly available collection of bacterial sequence data in combination with a population genetic analysis and experiment, we probe the contribution of inter-species recombination to neisserial population structure, and specifically whether it is more common in some strains than others. We identify hybrid groups of strains containing sequences typical of more than one species. These groups of strains, typical of a fuzzy species, appear to have experienced elevated rates of inter-species recombination estimated by population genetic analysis and further supported by transformation experiments. In particular, strains of the pathogen Neisseria meningitidis in the fuzzy species boundary appear to follow a different lifestyle, which may have considerable biological implications concerning distribution of novel resistance elements and meningococcal vaccine development. Despite the strong evidence for negligible geographical barriers to gene flow within the population, exchange of genetic material still shows directionality among named species in a non-uniform manner.
fuzzy species; recombination; Neisseria
Analysis of important human pathogen populations is currently under transition toward whole-genome sequencing of growing numbers of samples collected on a global scale. Since recombination in bacteria is often an important factor shaping their evolution by enabling resistance elements and virulence traits to rapidly transfer from one evolutionary lineage to another, it is highly beneficial to have access to tools that can detect recombination events. Multiple advanced statistical methods exist for such purposes; however, they are typically limited either to only a few samples or to data from relatively short regions of a total genome. By harnessing the power of recent advances in Bayesian modeling techniques, we introduce here a method for detecting homologous recombination events from whole-genome sequence data for bacterial population samples on a large scale. Our statistical approach can efficiently handle hundreds of whole genome sequenced population samples and identify separate origins of the recombinant sequence, offering an enhanced insight into the diversification of bacterial clones at the level of the whole genome. A data set of 241 whole genome sequences from an important pandemic lineage of Streptococcus pneumoniae is used together with multiple simulated data sets to demonstrate the potential of our approach.
Pneumococcal type 1 pilus proteins have been proposed as potential vaccine candidates. Following conjugate pneumococcal vaccination, the prevalence of the pneumococcal type 1 pilus declined dramatically, a decline associated with the elimination of vaccine-type (VT) strains. Here we show that between 2004 and 2007, there has been a significant increase in pilus prevalence, now exceeding rates from the pre-conjugate vaccine era. This increase is primarily due to non-VT strains. These emerging piliated non-VT strains are mostly novel clones, with some exceptions. The rise in pilus type 1 frequency across multiple distinct genetic backgrounds suggests that the pilus may confer an intrinsic advantage.
S. pneumoniae pilus; PCV7; vaccine- and non-vaccine-types
In most pathogens, multiple strains are maintained within host populations. Quantifying the mechanisms underlying strain coexistence would aid public health planning and improve understanding of disease dynamics. We argue that mathematical models of strain coexistence, when applied to indistinguishable strains, should meet criteria for both ecological neutrality and population genetic neutrality. We show that closed clonal transmission models which can be written in an “ancestor-tracing” form that meets the former criterion will also satisfy the latter. Neutral models can be a parsimonious starting point for studying mechanisms of strain coexistence; implications for past and future studies are discussed.
Mathematical models; Strain coexistence; Neutral models; Population genetics; Ecology; Infectious disease epidemiology
The large outbreak of diarrhea and hemolytic uremic syndrome (HUS) caused by Shiga toxin-producing Escherichia coli O104:H4 in Europe from May to July 2011 highlighted the potential of a rarely identified E. coli serogroup to cause severe disease. Prior to the outbreak, there were very few reports of disease caused by this pathogen and thus little known of its diversity and evolution. The identification of cases of HUS caused by E. coli O104:H4 in France and Turkey after the outbreak and with no clear epidemiological links raises questions about whether these sporadic cases are derived from the outbreak. Here, we report genome sequences of five independent isolates from these cases and results of a comparative analysis with historical and 2011 outbreak isolates. These analyses revealed that the five isolates are not derived from the outbreak strain; however, they are more closely related to the outbreak strain and each other than to isolates identified prior to the 2011 outbreak. Over the short time scale represented by these closely related organisms, the majority of genome variation is found within their mobile genetic elements: none of the nine O104:H4 isolates compared here contain the same set of plasmids, and their prophages and genomic islands also differ. Moreover, the presence of closely related HUS-associated E. coli O104:H4 isolates supports the contention that fully virulent O104:H4 isolates are widespread and emphasizes the possibility of future food-borne E. coli O104:H4 outbreaks.
In the summer of 2011, a large outbreak of bloody diarrhea with a high rate of severe complications took place in Europe, caused by a previously rarely seen Escherichia coli strain of serogroup O104:H4. Identification of subsequent infections caused by E. coli O104:H4 raised questions about whether these new cases represented ongoing transmission of the outbreak strain. In this study, we sequenced the genomes of isolates from five recent cases and compared them with historical isolates. The analyses reveal that, in the very short term, evolution of the bacterial genome takes place in parts of the genome that are exchanged among bacteria, and these regions contain genes involved in adaptation to local environments. We show that these recent isolates are not derived from the outbreak strain but are very closely related and share many of the same disease-causing genes, emphasizing the concern that these bacteria may cause future severe outbreaks.
Technological advances in high-throughput genome sequencing have led to an enhanced appreciation of the genetic diversity found within populations of pathogenic bacteria. Methods based on single nucleotide polymorphisms (SNPs) and insertions or deletions (indels) build upon the framework established by multi-locus sequence typing (MLST) and permit a detailed, targeted analysis of variation within related organisms. Robust phylogenetics, when combined with epidemiologically informative data, can be applied to study ongoing temporal and geographical fluctuations in bacterial pathogens. As genome sequencing, SNP detection and geospatial information become more accessible these methods will continue to transform the way molecular epidemiology is used to study populations of bacterial pathogens.
The goals were to assess serial changes in Streptococcus pneumoniae serotypes and antibiotic resistance in young children and to evaluate whether risk factors for carriage have been altered by heptavalent pneumococcal conjugate vaccine (PCV7).
Nasopharyngeal specimens and questionnaire/medical record data were obtained from children 3 months to <7 years of age in primary care practices in 16 Massachusetts communities during the winter seasons of 2000–2001 and 2003–2004 and in 8 communities in 2006–2007. Antimicrobial susceptibility testing and serotyping were performed with S pneumoniae isolates.
We collected 678, 988, and 972 specimens during the sampling periods in 2000–2001, 2003–2004, and 2006–2007, respectively. Carriage of non-PCV7 serotypes increased from 15% to 19% and 29% (P < .001), with vaccine serotypes decreasing to 3% of carried serotypes in 2006–2007. The relative contribution of several non-PCV7 serotypes, including 19A, 35B, and 23A, increased across sampling periods. By 2007, commonly carried serotypes included 19A (16%), 6A (12%), 15B/C (11%), 35B (9%), and 11A (8%), and high-prevalence serotypes seemed to have greater proportions of penicillin nonsusceptibility. In multivariate models, common predictors of pneumococcal carriage, such as child care attendance, upper respiratory tract infection, and the presence of young siblings, persisted.
The virtual disappearance of vaccine serotypes in S pneumoniae carriage has occurred in young children, with rapid replacement with penicillin-nonsusceptible nonvaccine serotypes, particularly 19A and 35B. Except for the age group at highest risk, previous predictors of carriage, such as child care attendance and the presence of young siblings, have not been changed by the vaccine.
Streptococcus pneumoniae; pneumococcal conjugate vaccine; antibiotic resistance; serotype; colonization
The evolution of bacterial populations has recently become considerably better understood due to large-scale sequencing of population samples. It has become clear that DNA sequences from a multitude of genes, as well as a broad sample coverage of a target population, are needed to obtain a relatively unbiased view of its genetic structure and the patterns of ancestry connected to the strains. However, the traditional statistical methods for evolutionary inference, such as phylogenetic analysis, are associated with several difficulties under such an extensive sampling scenario, in particular when a considerable amount of recombination is anticipated to have taken place. To meet the needs of large-scale analyses of population structure for bacteria, we introduce here several statistical tools for the detection and representation of recombination between populations. Also, we introduce a model-based description of the shape of a population in sequence space, in terms of its molecular variability and affinity towards other populations. Extensive real data from the genus Neisseria are utilized to demonstrate the potential of an approach where these population genetic tools are combined with a phylogenetic analysis. The statistical tools introduced here are freely available in BAPS 5.2 software, which can be downloaded from http://web.abo.fi/fak/mnf/mate/jc/software/baps.html.
The study of bacterial population biology is complicated by the fact that, although bacteria are largely asexual, they can also exchange genetic material through homologous recombination. Unlike in eukaryotes, recombination in bacteria is not an obligatory process. Furthermore, the recombination mechanisms are subject to many biological and ecological factors that can vary even within different populations of the same species. Although increasing evidence for homologous recombination has been found in many bacterial species, determining the frequency of recombination and understanding the influence that it exerts upon the evolution of bacterial populations remains challenging. In this article, we provide a dynamic picture of recombination within and between closely related bacterial species. Through an integration of several Bayesian statistical models, our method highlights the importance of a quantitative estimation of recombination. Our analyses of a challenging multi-locus sequence typing (MLST) database demonstrate that combined analyses using traditional phylogenetic methods, explorative MLST tools and Bayesian population genetic models can together yield interesting biological insights that cannot easily be reached by any of the approaches alone.
The incidence of community-associated methicillin-resistant Staphylococcus aureus (MRSA) has risen dramatically in the U.S., particularly among children. Although Streptococcus pneumoniae colonization has been inversely associated with S. aureus colonization in unvaccinated children, this and other risk factors for S. aureus carriage have not been assessed following widespread use of the heptavalent pneumococcal conjugate vaccine (PCV7). Our objectives were to (1) determine the prevalence of S. aureus and MRSA colonization in young children in the context of widespread use of PCV7; and (2) examine risk factors for S. aureus colonization in the post-PCV7 era, including the absence of vaccine-type S. pneumoniae colonization.
Swabs of the anterior nares (S. aureus) were obtained from children enrolled in an ongoing study of nasopharyngeal pneumococcal colonization of healthy children in 8 Massachusetts communities. Children 3 months to <7 years of age seen for well child or sick visits in primary care offices from 11/03–4/04 and 10/06–4/07 were enrolled. S. aureus was identified and antibiotic susceptibility testing was performed. Epidemiologic risk factors for S. aureus colonization were collected from parent surveys and chart reviews, along with data on pneumococcal colonization. Multivariate mixed model analyses were performed to identify factors associated with S. aureus colonization.
Among 1,968 children, the mean age (SD) was 2.7 (1.8) years, 32% received an antibiotic in the past 2 months, 2% were colonized with PCV7 strains and 24% were colonized with non-PCV7 strains. The prevalence of S. aureus colonization remained stable between 2003–04 and 2006–07 (14.6% vs. 14.1%), while MRSA colonization remained low (0.2% vs. 0.9%, p = 0.09). Although absence of pneumococcal colonization was not significantly associated with S. aureus colonization, age (6–11 mo vs. ≥5 yrs, OR 0.39 [95% CI 0.24–0.64]; 1–1.99 yrs vs. ≥5 yrs, OR 0.35 [0.23–0.54]; 2–2.99 yrs vs. ≥5 yrs, OR 0.45 [0.28–0.73]; 3–3.99 yrs vs. ≥5 yrs, OR 0.53 [0.33–0.86]) and recent antibiotic use were significant predictors in multivariate models.
In Massachusetts, S. aureus and MRSA colonization remained stable from 2003–04 to 2006–07 among children <7 years despite widespread use of pneumococcal conjugate vaccine. S. aureus nasal colonization varies by age and is inversely correlated with recent antibiotic use.
PspA is a structurally variable surface protein important to the virulence of pneumococci. PspAs are serologically cross-reactive and exist as two major families. In this study, we determined the distribution of PspA families 1 and 2 among pneumococcal strains isolated from the middle ear fluid (MEF) of children with acute otitis media and from nasopharyngeal specimens of children with pneumococcal carriage. We characterized the association between the two PspA families, capsular serotypes, and multilocus sequence types (STs) of the pneumococcal isolates. MEF isolates (n = 201) of 109 patients and nasopharyngeal isolates (n = 173) of 49 children were PspA family typed by whole-cell enzyme immunoassay (EIA). Genetic typing (PCR) of PspA family was done for 60 isolates to confirm EIA typing results. The prevalences of PspA families 1 and 2 were similar among pneumococci isolated from MEF (51% and 45%, respectively) and nasopharyngeal specimens (48% each). Isolates of certain capsule types as well as isolates of certain STs showed statistical associations with either family 1 or family 2 PspA. Pneumococci from seven children with multiple pneumococcal isolates appeared to express serologically different PspA families in different isolates of the same serotype; in three of the children the STs of the isolates were the same, suggesting that antigenic changes in the PspA expressed may have taken place. The majority of the isolates (97%) belonged to either PspA family 1 or family 2, suggesting that a combination including the two main PspA families would make a good vaccine candidate.
Methods for assigning strains to bacterial species are cumbersome and no longer fit for purpose. The concatenated sequences of multiple house-keeping genes have been shown to be able to define and circumscribe bacterial species as sequence clusters. The advantage of this approach (multilocus sequence analysis; MLSA) is that, for any group of related species, a strain database can be produced and combined with software that allows query strains to be assigned to species via the internet. As an exemplar of this approach, we have studied a group of species, the viridans streptococci, which are very difficult to assign to species using standard taxonomic procedures, and have developed a website that allows species assignment via the internet.
Seven house-keeping gene sequences were obtained from 420 streptococcal strains to produce a viridans group database. The reference tree produced using the concatenated sequences identified sequence clusters which, by examining the position on the tree of the type strain of each viridans group species, could be equated with species clusters. MLSA also identified clusters that may correspond to new species, and previously described species whose status needs to be re-examined. A generic website and software for electronic taxonomy was developed. This site allows the sequences of the seven gene fragments of a query strain to be entered and for the species assignment to be returned, according to its position within an assigned species cluster on the reference tree.
The MLSA approach resulted in the identification of well-resolved species clusters within this taxonomically challenging group and, using the software we have developed, allows unknown strains to be assigned to viridans species via the internet. Submission of new strains will provide a growing resource for the taxonomy of viridans group streptococci, allowing the recognition of potential new species and taxonomic anomalies. More generally, as the software at the MLSA website is generic, MLSA schemes and strain databases for other groups of related species can be hosted at this website, providing a portal for microbial electronic taxonomy.
We consider the discovery of recombinant segments jointly with their origins within multilocus DNA sequences from bacteria representing heterogeneous populations of fairly closely related species. The currently available methods for recombination detection capable of probabilistic characterization of uncertainty have a limited applicability in practice as the number of strains in a data set increases.
We introduce a Bayesian spatial structural model representing the continuum of origins over sites within the observed sequences, including a probabilistic characterization of uncertainty related to the origin of any particular site. To enable a statistically accurate and practically feasible approach to the analysis of large-scale data sets representing a single genus, we have developed a novel software tool (BRAT, Bayesian Recombination Tracker) implementing the model and the corresponding learning algorithm, which is capable of identifying the posterior optimal structure and estimating the marginal posterior probabilities of putative origins over the sites.
A multitude of challenging simulation scenarios and an analysis of real data from seven housekeeping genes of 120 strains of genus Burkholderia are used to illustrate the possibilities offered by our approach. The software is freely available for download at URL .
Genetic surveys are uncovering the diversity of bacteria, and are causing the species concepts used to categorize these to be questioned. One difficulty in defining bacterial species arises from the high rates of recombination that result in the transfer of DNA between relatively distantly related bacteria. Barriers to this process, which could be used to define species naturally, are not apparent. Here, we review conceptual models of bacterial speciation and simulate speciation in silico. Our findings suggest that the rate of recombination and its relation to genetic divergence have a strong influence on outcomes: we propose that a distinction be made between clonal divergence and sexual speciation. Hence, to make sense of bacterial diversity we need data not only from genetic surveys, but also from experimental determination of selection pressures and recombination rates, and from theoretical models.
A central problem in understanding bacterial speciation is how clusters of closely related strains emerge and persist in the face of recombination. We use a neutral Fisher–Wright model in which genotypes, defined by the alleles at 140 house-keeping loci, change in each generation by mutation or recombination, and examine conditions in which an initially uniform population gives rise to resolved clusters. Where recombination occurs at equal frequency between all members of the population, we observe a transition between clonal structure and sexual structure as the rate of recombination increases. In the clonal situation, clearly resolved clusters are regularly formed, break up or go extinct. In the sexual situation, the formation of distinct clusters is prevented by the cohesive force of recombination. Where the rate of recombination is a declining log-linear function of the genetic distance between the donor and recipient strain, distinct clusters emerge even with high rates of recombination. These clusters arise in the absence of selection, and have many of the properties of species, with high recombination rates and thus sexual cohesion within clusters and low rates between clusters. Distance-scaled recombination can thus lead to a population splitting into distinct genotypic clusters, a process that mimics sympatric speciation. However, empirical estimates of the relationship between sequence divergence and recombination rate indicate that the decline in recombination is an insufficiently steep function of genetic distance to generate species in nature under neutral drift, and thus that other mechanisms should be invoked to explain speciation in the presence of recombination.
Fisher–Wright model; simulation; species; recombination; multilocus genotypes; genetic cartography
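The neutral Fisher–Wright model with distance-scaled recombination described above can be illustrated with a minimal sketch. The following is an assumption-laden toy, not the authors' simulation: genetic distance is taken to be the number of mismatched loci, mutation follows an infinite-alleles scheme, at most one locus changes per offspring per generation, and all names and parameter values (`mu`, `rho`, `slope`, population size) are hypothetical.

```python
import random


def allelic_distance(a, b):
    """Count the loci at which two genotypes carry different alleles."""
    return sum(x != y for x, y in zip(a, b))


def acceptance_probability(distance, slope):
    """Log-linear decline: log10(probability) = -slope * distance."""
    return 10.0 ** (-slope * distance)


def next_generation(population, mu, rho, slope, rng, allele_counter):
    """Produce one neutral Fisher-Wright generation.

    Each offspring copies a random parent, then either mutates one locus to a
    brand-new allele (probability mu) or attempts to import one locus from a
    random donor (probability rho), accepted according to genetic distance.
    """
    offspring = []
    for _ in range(len(population)):
        child = list(rng.choice(population))  # sampling with replacement = drift
        locus = rng.randrange(len(child))
        u = rng.random()
        if u < mu:
            allele_counter[0] += 1
            child[locus] = allele_counter[0]  # infinite-alleles assumption
        elif u < mu + rho:
            donor = rng.choice(population)
            d = allelic_distance(child, donor)
            if rng.random() < acceptance_probability(d, slope):
                child[locus] = donor[locus]
        offspring.append(tuple(child))
    return offspring


def simulate(pop_size=200, n_loci=140, generations=50, mu=0.01, rho=0.05,
             slope=0.1, seed=1):
    """Start from a uniform population and return the final genotypes."""
    rng = random.Random(seed)
    population = [tuple([0] * n_loci)] * pop_size
    allele_counter = [0]
    for _ in range(generations):
        population = next_generation(population, mu, rho, slope, rng,
                                     allele_counter)
    return population
```

Setting `slope=0` in this sketch gives recombination at equal frequency between all members of the population, while larger values make imports from distant donors progressively rarer.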
Whatever else they should share, strains of bacteria assigned to the same species should have house-keeping genes that are similar in sequence. Single gene sequences (or rRNA gene sequences) have too few informative sites to resolve the strains of closely related species, and relationships among similar species may be confounded by interspecies recombination. A more promising approach (multilocus sequence analysis, MLSA) is to concatenate the sequences of multiple house-keeping loci and to observe the patterns of clustering among large populations of strains of closely related named bacterial species. Recent studies have shown that large populations can be resolved into non-overlapping sequence clusters that agree well with species assigned by standard microbiological methods. The use of clustering patterns to inform the division of closely related populations into species has many advantages for poorly studied bacteria (or to re-evaluate well-studied species), as it provides a way of recognizing natural discontinuities in the distribution of similar genotypes. Clustering patterns can be used by expert groups as the basis of a pragmatic approach to assigning species, taking into account whatever additional data are available (e.g. similarities in ecology, phenotype and gene content). The development of large MLSA Internet databases provides the ability to assign new strains to previously defined species clusters and an electronic taxonomy. The advantages and problems in using sequence clusters as the basis of species assignments are discussed.
multilocus sequence analysis; bacterial populations; species clusters; electronic taxonomy; bacterial systematics
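The core MLSA step, concatenating house-keeping loci and looking for sequence clusters, can be sketched as follows. This is only an illustration under stated assumptions: sequences are taken to be pre-aligned and of equal length, divergence is the simple proportion of differing sites, and single-linkage grouping at a caller-chosen threshold stands in for the tree-based clustering used in practice. The function names, locus names and threshold are hypothetical.

```python
def concatenate_loci(locus_sequences, locus_order):
    """Join one strain's locus sequences in a fixed locus order."""
    return "".join(locus_sequences[locus] for locus in locus_order)


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites that differ between two sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def single_linkage_clusters(concatenated, threshold):
    """Group strains linked by chains of pairwise distances <= threshold."""
    names = sorted(concatenated)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent pointers to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if p_distance(concatenated[a], concatenated[b]) <= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


# Toy data: two loci, three strains (entirely made up for illustration).
strains = {
    "s1": {"locA": "AAAA", "locB": "CCCC"},
    "s2": {"locA": "AAAT", "locB": "CCCC"},
    "s3": {"locA": "GGGG", "locB": "TTTT"},
}
concatenated = {
    name: concatenate_loci(loci, ["locA", "locB"])
    for name, loci in strains.items()
}
clusters = single_linkage_clusters(concatenated, threshold=0.2)
```

With the toy data, `s1` and `s2` differ at one of eight sites and fall into one cluster, while `s3` forms its own.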
The program eBURST uses multilocus sequence typing data to divide bacterial populations into groups of closely related strains (clonal complexes), predicts the founding genotype of each group, and displays the patterns of recent evolutionary descent of all other strains in the group from the founder. The reliability of eBURST was evaluated using populations simulated with different levels of recombination in which the ancestry of all strains was known.
For strictly clonal simulations, where all allelic change is due to point mutation, the groups of related strains identified by eBURST were very similar to those expected from the true ancestry and most of the true ancestor-descendant relationships (90–98%) were identified by eBURST. Populations simulated with low or moderate levels of recombination showed similarly high performance but the reliability of eBURST declined with increasing recombination to mutation ratio. Populations simulated under a high recombination to mutation ratio were dominated by a single large straggly eBURST group, which resulted from the incorrect linking of unrelated groups of strains into the same eBURST group. The reliability of the ancestor-descendant links in eBURST diagrams was related to the proportion of strains in the largest eBURST group, which provides a useful guide to when eBURST is likely to be unreliable.
Examination of eBURST groups within populations of a range of bacterial species showed that most were within the range in which eBURST is reliable, and only a small number (e.g. Burkholderia pseudomallei and Enterococcus faecium) appeared to have such high rates of recombination that eBURST is likely to be unreliable. The study also demonstrates how three simple tests in eBURST v3 can be used to detect unreliable eBURST performance and recognise populations in which there appears to be a high rate of recombination relative to mutation.
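The grouping idea behind the analysis above can be illustrated with a small sketch. It is not the eBURST program itself: it assumes that profiles sharing at least `min_shared` loci are linked, that groups are the connected components of those links, and that the profile with the most single-locus variants serves as a stand-in for the predicted founder (ties go to the lowest ST number here, which is an arbitrary choice). Each profile is treated as one strain, and all names and data are hypothetical.

```python
def shared_loci(profile_a, profile_b):
    """Number of loci at which two allelic profiles are identical."""
    return sum(a == b for a, b in zip(profile_a, profile_b))


def burst_groups(profiles, min_shared):
    """Connected components of profiles linked by >= min_shared shared loci."""
    sts = sorted(profiles)
    neighbours = {st: set() for st in sts}
    for i, a in enumerate(sts):
        for b in sts[i + 1:]:
            if shared_loci(profiles[a], profiles[b]) >= min_shared:
                neighbours[a].add(b)
                neighbours[b].add(a)

    groups, seen = [], set()
    for st in sts:
        if st in seen:
            continue
        stack, group = [st], set()
        while stack:  # depth-first walk over linked profiles
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(group)
    return groups


def predicted_founder(group, profiles):
    """Profile in the group with the most single-locus variants."""
    n_loci = len(next(iter(profiles.values())))

    def slv_count(st):
        return sum(
            1 for other in group
            if other != st
            and shared_loci(profiles[st], profiles[other]) == n_loci - 1
        )

    return max(sorted(group), key=slv_count)


def largest_group_fraction(groups):
    """Share of all profiles that fall in the largest group."""
    total = sum(len(g) for g in groups)
    return max(len(g) for g in groups) / total


# Toy seven-locus profiles (made up for illustration).
profiles = {
    1: (1, 1, 1, 1, 1, 1, 1),
    2: (1, 1, 1, 1, 1, 1, 2),
    3: (1, 1, 1, 1, 1, 3, 2),
    4: (5, 5, 5, 5, 5, 5, 5),
}
groups = burst_groups(profiles, min_shared=6)
```

A value of `largest_group_fraction` close to 1 in a diverse population would correspond to the single large straggly group described above, which is the situation in which the links are least trustworthy.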
Long-distance dispersal in microbial eukaryotes has been shown to result in the establishment of populations on continental and global scales. Such “ubiquitous dispersal” has been claimed to be a general feature of microbial eukaryotes, homogenising populations over large scales. However, the unprecedented sampling of opportunistic infectious pathogens created by the global AIDS pandemic has revealed that a number of important species exhibit geographic endemicity despite long-distance migration via aerially dispersed spores. One mechanism that might tend to drive such endemicity in the face of aerial dispersal is the evolution of niche-adapted genotypes when sexual reproduction is rare. Dispersal of such asexual physiological “species” will be restricted when natural habitats are heterogeneous, as a consequence of reduced adaptive variation. Using the HIV-associated endemic fungus Penicillium marneffei as our model, we measured the distribution of genetic variation over a variety of spatial scales in two host species, humans and bamboo rats. Our results show that, despite widespread aerial dispersal, isolates of P. marneffei show extensive spatial genetic structure in both host species at local and country-wide scales. We show that the evolution of the P. marneffei genome is overwhelmingly clonal, and that this is perhaps the most asexual fungus yet found. We show that clusters of genotypes are specific to discrete ecological zones and argue that asexuality has led to the evolution of niche-adapted genotypes, and is driving endemicity, by reducing this pathogen's potential to diversify in nature.
Scientists believe that micro-organisms are spread around the planet on currents of air, a hypothesis that is known as “ubiquitous dispersal”. While fungi release huge quantities of widely dispersed spores, it is not known why many species remain endemic to specific regions around the globe. Research by the authors suggests an answer to this conundrum, by investigating the genetic structure of a fungus, Penicillium marneffei, that causes disease in people with damaged immune systems. This research has shown that P. marneffei spores can be dispersed over a wide distance, but fail to penetrate the new environments that they find themselves in. This appears to be because the fungus has largely dispensed with sexual reproduction, which means that its ability to adapt to new challenges is limited. The authors use DNA typing to show that different “clones” of the fungus are associated with different environments, and suggest that adaptation to these environments is constraining the organism's ability to successfully disperse in nature. This may explain why P. marneffei is endemic to a relatively small area of southeast Asia, and the authors go on to suggest that the long-term consequence of this strategy may be the eventual extinction of the organism.
We investigated the genetic relationships between serotypeable pneumococci and nonserotypeable presumptive pneumococci using multilocus sequence typing (MLST) and partial sequencing of the pneumolysin gene (ply). Among 121 nonserotypeable presumptive pneumococci from Finland, we identified isolates of three classes: those with sequence types (STs) identical to those of serotypeable pneumococci, suggesting authentic pneumococci in which capsular expression had been downregulated or lost; isolates that clustered among serotypeable pneumococci on a tree based on the concatenated sequences of the MLST loci but which had STs that differed from those of serotypeable pneumococci in the MLST database; and a more diverse collection of isolates that did not cluster with serotypeable pneumococci. The latter isolates typically had sequences at all seven MLST loci that were 5 to 10% divergent from those of authentic pneumococci and also had distinct and divergent ply alleles. These isolates are proposed to be distinct from pneumococci but cannot be resolved from them by optochin susceptibility, bile solubility, or the presence of the ply gene. Complete resolution of pneumococci from the related but distinct population is problematic, as recombination between them was evident, and a few isolates of each population possessed alleles at one or occasionally more MLST loci from the other population. However, a tree based on the concatenated sequences of the MLST loci in most cases unambiguously distinguished whether a nonserotypeable isolate was or was not a pneumococcus, and the sequence of the ply gene fragment was found to be useful to resolve difficult cases.
It is a matter of ongoing debate whether a universal species concept is possible for bacteria. Indeed, it is not clear whether closely related isolates of bacteria typically form discrete genotypic clusters that can be assigned as species. The most challenging test of whether species can be clearly delineated is provided by analysis of large populations of closely related, highly recombinogenic bacteria that colonise the same body site. We have used concatenated sequences of seven house-keeping loci from 770 strains of 11 named Neisseria species, and phylogenetic trees, to investigate whether genotypic clusters can be resolved among these recombinogenic bacteria and, if so, the extent to which they correspond to named species.
Alleles at individual loci were widely distributed among the named species, but this distorting effect of recombination was largely buffered by using concatenated sequences, which resolved clusters corresponding to the three species most numerous in the sample: N. meningitidis, N. lactamica and N. gonorrhoeae. A few isolates arose from the branch that separated N. meningitidis from N. lactamica, leading us to describe these species as 'fuzzy'.
A multilocus approach using large samples of closely related isolates delineates species even in the highly recombinogenic human Neisseria where individual loci are inadequate for the task. This approach should be applied by taxonomists to large samples of other groups of closely-related bacteria, and especially to those where species delineation has historically been difficult, to determine whether genotypic clusters can be delineated, and to guide the definition of species.