For the first time, we successfully applied Bayesian phylogeographic methods to a bacterial pathogen.
Staphylococcus aureus is a common cause of infections and therefore has not generally been thought of as a pandemic disease. However, with the emergence of specific antibiotic-resistant strains, it has become clear that
S. aureus can spread in epidemic waves across the globe. In this study, we demonstrate that the TMRCA of the ST239 lineage falls shortly after the introduction of methicillin in 1959 (
Batchelor et al. 1959), in agreement with the recent findings of
Harris et al. (2010). The type III SCC
mec cassette also encodes resistance to penicillin, tetracycline, and erythromycin (
Deurenberg and Stobberingh 2008), all of which were in use before 1960 (
Khardori 2006). Multiple drug resistance could have provided this strain with a strong fitness advantage in the hospital setting. We estimate that ST239 had been spreading worldwide for almost 20 years before the first isolate carrying the type III pathogenic cassette was identified in 1985 (
Deurenberg and Stobberingh 2008). Such a discrepancy is alarming because it suggests that antibiotic-resistant bacteria can circulate globally undetected for a relatively long time. Except for a single introduction into Asia, ST239 lineages were not geographically restricted but rather moved regularly among continents, indicating that HA-MRSA diffusion has the characteristics of a pandemic rather than of regionally restricted outbreaks. These results contrast with previous reports suggesting that MRSA is characterized by limited geographic dispersal in Europe (
Grundmann et al. 2010) and worldwide (
Nübel et al. 2008), but consistent with the qualitative observations of
Harris et al. (2010) that were based on a single ML tree reconstruction. However, those previous reports used limited genetic data, which may have provided too little resolution to infer detailed geographic patterns. Furthermore, because the genome-wide SNP data set used in the present analysis included only the core genome (excluding the SCC
mec cassette;
Harris et al. 2010), the inferred evolutionary history is representative of the population history rather than of recombination events within mobile elements.
The rate of evolution of bacterial species is uncertain (
Achtman 2008). In this study, applying the ascertainment bias correction yielded evolutionary rate estimates intermediate between those obtained from tip-dated sequences without the correction, reported here and elsewhere (
Harris et al. 2010;
Nübel et al. 2010) and those inferred using species divergence dating techniques (
Ochman et al. 1999). Rates previously estimated without the bias correction were probably artificially high because the absence of invariant site patterns was not taken into account, in which case the sum of all possible site pattern probabilities is <1. Correcting this bias allowed the invariant sites to be properly incorporated into the probability summation. Evolutionary rate estimates were still 100 times higher than rate estimates based on species divergence times (external calibrations). Reasons for this include the presence within a population of transient polymorphisms that will eventually be removed by selection, which drives up the estimated evolutionary rate when inferences are drawn from intrapopulation samples. Therefore, this rate reflects a different population process from the long-term fixation process between species. However, the important result from the present analysis is that estimates of bacterial evolutionary timescales are feasible using serially sampled sequence data alone and do not always require a calibration date or external rate.
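The effect of ignoring invariant sites can be illustrated with a toy calculation. The sketch below is not the analysis performed in this study: it assumes a three-tip star tree, the JC69 substitution model, and expected rather than sampled site-pattern counts, and all names and values (`D_TRUE`, `N_SITES`, `grid_mle`) are hypothetical. It compares the branch length recovered from the full alignment, from SNP sites alone with an uncorrected likelihood, and from SNP sites alone with a likelihood conditioned on each site being variable, which is one standard form of ascertainment bias correction.

```python
import itertools
import math

STATES = "ACGT"


def pattern_probs(d):
    """Probability of each site pattern on a 3-tip star tree under JC69.

    d is the length of every branch in expected substitutions per site.
    """
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay  # tip keeps the root state
    p_diff = 0.25 - 0.25 * decay  # tip shows one specific other state
    probs = {}
    for pattern in itertools.product(STATES, repeat=3):
        total = 0.0
        for root in STATES:  # sum over the unobserved root state
            p = 0.25  # uniform root frequencies
            for tip in pattern:
                p *= p_same if tip == root else p_diff
            total += p
        probs[pattern] = total
    return probs


def is_variable(pattern):
    """True if the tips do not all share the same nucleotide."""
    return len(set(pattern)) > 1


def log_lik(counts, d, condition_on_variable):
    """Log-likelihood of pattern counts, optionally conditioned on variability."""
    probs = pattern_probs(d)
    ll = sum(n * math.log(probs[pat]) for pat, n in counts.items())
    if condition_on_variable:
        # Variable-pattern probabilities sum to less than 1; renormalise them.
        p_variable = sum(p for pat, p in probs.items() if is_variable(pat))
        ll -= sum(counts.values()) * math.log(p_variable)
    return ll


def grid_mle(counts, condition_on_variable):
    """Crude maximum-likelihood estimate of d on a log-spaced grid."""
    grid = [10 ** (-5 + 6 * i / 1200) for i in range(1201)]
    return max(grid, key=lambda d: log_lik(counts, d, condition_on_variable))


D_TRUE = 0.01      # hypothetical true branch length
N_SITES = 100_000  # hypothetical alignment length

true_probs = pattern_probs(D_TRUE)
expected = {pat: N_SITES * p for pat, p in true_probs.items()}
snps_only = {pat: n for pat, n in expected.items() if is_variable(pat)}
p_variable_true = sum(p for pat, p in true_probs.items() if is_variable(pat))

d_full = grid_mle(expected, condition_on_variable=False)
d_naive = grid_mle(snps_only, condition_on_variable=False)
d_corrected = grid_mle(snps_only, condition_on_variable=True)

print(f"total probability of variable patterns: {p_variable_true:.4f}")
print(f"full alignment:        d = {d_full:.4f}")
print(f"SNPs only, naive:      d = {d_naive:.4f}")
print(f"SNPs only, corrected:  d = {d_corrected:.4f}")
```

In this toy setting the uncorrected SNP-only estimate is strongly inflated, whereas conditioning on variability recovers a value close to the one obtained from the full alignment. Real analyses apply the same idea across a full phylogeny with estimated branch lengths and substitution parameters.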
Recent theoretical advances in Bayesian phylogeography have made it possible to incorporate the geographic location of the samples into the analysis, thus providing a formal statistical framework in which hypotheses concerning the spatial origin and dissemination of epidemics can be investigated. Bayesian methods account for the phylogenetic uncertainty inherent in any reconstruction of the evolutionary history of a group of organisms, both in the tree topology and in the assignment of geographic states to ancestral nodes, by estimating a probability distribution for parameters of interest (
Lemey et al. 2009). This approach constitutes a significant improvement (
Sanmartín et al. 2008) upon traditional parsimony-based models (
Slatkin and Maddison 1989) that consider only one reconstruction of migration on a fixed tree. Furthermore, although maximum parsimony can reliably be used in simple dissemination scenarios, minimizing the number of migrations is inappropriate in more complex situations, for example, in cases of continuous multidirectional gene flow (
Cunningham et al. 1998). The Bayesian framework, on the other hand, can account for more complex models by allowing the calculation of probabilities for the ancestral state (nucleotide and geographic location) reconstruction. This framework also allows testing of specific hypotheses about the factors driving migration (
Lemey et al. 2009). Here, we show that human migration appears to be significantly associated with the current HA-MRSA pandemic spread. Traveling and migration have been linked anecdotally with community-acquired MRSA (CA-MRSA) (
Ellington et al. 2010) and methicillin-sensitive
S. aureus (MSSA) (
Schleucher et al. 2008). Although CA-MRSA has traditionally been viewed as demographically and clinically distinct from HA-MRSA and more similar to MSSA (
Groom et al. 2001;
Naimi et al. 2003), recent evidence suggests that the epidemiological behaviors of CA- and HA-MRSA have begun to overlap (
Seybold et al. 2006;
Klevens et al. 2007). Furthermore, asymptomatic carriers of HA-MRSA could act as vectors in transmitting the pathogen (
Zanger 2010); in fact, the majority of HA-MRSA cases in the United States from 2007 occurred outside of the hospital (
Klevens et al. 2007), providing ample opportunity for transmission in the community setting. Given the rapid increase in international air travel during the past several decades, the emergence of drug-resistant pathogens in a specific locale should be considered not as an isolated event but within a larger global context. Specific migration patterns (such as those reported here) can then be incorporated into monitoring and intervention strategies.
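The probabilistic reconstruction of ancestral locations discussed above can be sketched on a very small example. The code below is a simplification and not the model fitted in this study: it fixes the tree and the migration rate, assumes a symmetric continuous-time Markov chain in which every pair of locations exchanges lineages at the same rate, and uses a uniform prior on the root location. The location names, branch lengths, and `MIGRATION_RATE` are hypothetical. A full Bayesian analysis would additionally average over trees and model parameters.

```python
import math

LOCATIONS = ["Europe", "Asia", "South America"]  # hypothetical discrete states
MIGRATION_RATE = 0.02  # hypothetical rate per lineage per year to each other location


def transition_prob(i, j, rate, t):
    """P(location j after time t | location i) under the equal-rates model."""
    k = len(LOCATIONS)
    decay = math.exp(-k * rate * t)
    if i == j:
        return 1.0 / k + (k - 1.0) / k * decay
    return (1.0 - decay) / k


def partial_likelihoods(node, rate):
    """Felsenstein pruning over locations.

    A tip is a location name; an internal node is a list of
    (child, branch_length) pairs.
    """
    k = len(LOCATIONS)
    if isinstance(node, str):
        return [1.0 if loc == node else 0.0 for loc in LOCATIONS]
    result = [1.0] * k
    for child, length in node:
        child_partials = partial_likelihoods(child, rate)
        for i in range(k):
            result[i] *= sum(
                transition_prob(i, j, rate, length) * child_partials[j]
                for j in range(k)
            )
    return result


def root_posterior(tree, rate):
    """Posterior probability of each root location under a uniform prior."""
    likelihoods = partial_likelihoods(tree, rate)
    total = sum(likelihoods)
    return {loc: lik / total for loc, lik in zip(LOCATIONS, likelihoods)}


# Two isolates sampled in Asia form a clade; a third isolate is from Europe.
asia_clade = [("Asia", 10.0), ("Asia", 10.0)]
tree = [(asia_clade, 5.0), ("Europe", 15.0)]

posterior = root_posterior(tree, MIGRATION_RATE)
for location, prob in posterior.items():
    print(f"{location}: {prob:.3f}")
```

Parsimony would report the root of this tree only as ambiguous between Asia and Europe. The probabilistic model instead assigns a graded probability to every location, including one that was never sampled, and these probabilities depend on the branch lengths.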
To strengthen the phylogeographic analysis of bacterial gene flow, future studies should include multiple strains from each sampled location to reduce uncertainty in the reconstruction of ancestral locations. The cost of genome-wide typing is justified by the potential public health utility of such data (
Harris et al. 2010). Future applications of this approach may aid the control and prediction of newly emergent drug-resistant pathogens.