Staphylococcus aureus, the leading cause of nosocomial infections worldwide, causes a wide variety of infections (
14,
18). In recent decades, the worldwide spread of methicillin-resistant
S. aureus (MRSA) has become a major challenge in health care (
5). More recently, community-acquired MRSA has intensified public health concerns (
26). To understand these changes in epidemiology, different typing methods have been applied, including phage typing, pulsed-field gel electrophoresis (PFGE), and sequence-based typing methods. Sequence-based typing offers the advantage that results are easy to compare and communicate between different laboratories. Multilocus sequence typing (MLST) has become the “gold standard” for population analysis (
15), but it has low discriminatory power and is expensive, so this method is mainly restricted to reference laboratories. As a result of the predominantly clonal evolution of
S. aureus, sequencing of the repeat region of the protein A gene (
spa) generates informative typing results and has quickly been established as a robust and highly discriminatory method (
1,
3,
10,
16,
21,
24). The
spa region consists of a variable number of 21- to 27-bp repeats with differing nucleotide compositions that result in different
spa types. This region provides information not only about short-term epidemiology but also about long-term phylogeny, and it contains a reliable signal that can be used to determine the clonal relatedness of individual strains (
12).
Here we investigated a well-characterized collection of methicillin-sensitive
S. aureus (MSSA) carriage strains by
spa typing. To evaluate the ability of
spa typing to determine the clonal relatedness of a natural population of
S. aureus strains, we used the recently described grouping algorithm that is “based upon repeat patterns” (BURP) to cluster related
spa types (
17) and compared the results with MLST clonal complexes (CCs) and PFGE groups.
(This study was presented in part at the 107th General Meeting of the American Society for Microbiology, Toronto, Ontario, Canada, 21 to 25 May 2007.)
One hundred ten MSSA carriage strains that originated from a random and representative sample of nonhospitalized elderly individuals living independently in the Nottingham Health district (
8) were analyzed. This collection was previously characterized by MLST, PFGE, phage typing, and analysis of randomly amplified polymorphic DNA (
6).
spa typing was performed as described elsewhere (
4,
10,
16). BURP clusters (
spa CCs) for these strains were determined, and clustering was portrayed with Ridom StaphType (version 1.5) software (Ridom GmbH, Würzburg, Germany). BURP offers two user-defined parameters that influence clustering: exclusion of
spa types that are shorter than a certain number (
x) of repeats and the maximum cost (
y) for clustering
spa types into the same group. Short
spa types can be excluded from further analysis, because their information content is limited and no reliable evolutionary history can be inferred. The costs account for the genetic changes between two different
spa types, and the algorithm attempts to minimize these changes (“parsimony assumption”). The default parameters (
x = 5;
y = 4) were applied (
17). For the grouping of MLST data, the available sequence types (STs) and the eBURST tool (enhanced BURST, where BURST stands for “based upon related sequence types”) were used (
2). STs that shared at least five of the seven alleles were grouped into a single CC. PFGE types and groups were assigned according to the criteria of Tenover et al. (
25), which correspond to a maximum variation of three bands for types and fewer than seven bands for groups. The UPGMA (unweighted-pair group method using average linkages) dendrogram illustrating the similarity was built based on Dice coefficients of the SmaI macrorestriction profiles using MEGA (version 4) software (
23). The index of diversity and simple concordance were calculated as previously described (
7,
11,
19).
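The two BURP parameters described above can be illustrated with a minimal Python sketch. It is not the Ridom StaphType implementation: the BURP cost model is not reproduced here, so a plain edit distance over repeat successions is used as a stand-in, and the function names, the single-linkage grouping, and the toy repeat successions are all assumptions made for illustration.

```python
from itertools import combinations


def edit_cost(a, b):
    """Levenshtein distance between two repeat successions (tuples of repeat IDs).

    Assumption: this is only a stand-in for the BURP cost model.
    """
    prev = list(range(len(b) + 1))
    for i, repeat_a in enumerate(a, start=1):
        curr = [i]
        for j, repeat_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                          # delete a repeat
                curr[j - 1] + 1,                      # insert a repeat
                prev[j - 1] + (repeat_a != repeat_b),  # substitute a repeat
            ))
        prev = curr
    return prev[-1]


def cluster_spa_types(spa_types, min_repeats=5, max_cost=4):
    """Group spa types whose pairwise cost is <= max_cost (single linkage).

    spa_types maps a type name to its tuple of repeat IDs. Types with fewer
    than min_repeats repeats are excluded before clustering (parameter x);
    max_cost plays the role of parameter y.
    """
    excluded = sorted(n for n, reps in spa_types.items() if len(reps) < min_repeats)
    kept = {n: reps for n, reps in spa_types.items() if len(reps) >= min_repeats}

    # Union-find over the retained types.
    parent = {n: n for n in kept}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(kept, 2):
        if edit_cost(kept[a], kept[b]) <= max_cost:
            parent[find(a)] = find(b)

    groups = {}
    for n in kept:
        groups.setdefault(find(n), []).append(n)
    clusters = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    singletons = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, singletons, excluded


# Hypothetical repeat successions; these are not real spa types.
toy = {
    "A": (1, 2, 3, 4, 5, 6),
    "B": (1, 2, 3, 4, 5, 7),
    "C": (1, 2, 3, 4, 5, 7, 8),
    "D": (20, 21, 22, 23, 24, 25, 26, 27, 28, 29),
    "E": (1, 2, 3),
}
clusters, singletons, excluded = cluster_spa_types(toy)
```

With the toy input, types A, B, and C fall into one cluster, D remains a singleton, and E is excluded because it has fewer than five repeats.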
All 110 S. aureus strains were spa typeable, and they exhibited 66 different spa types (number of repeats, 2 to 16). spa types t008 and t078 were most frequent (eight isolates each). The index of diversity for spa typing was 0.98 (95% confidence interval, 0.97 to 0.99). Using the default analysis parameters of BURP, the resulting spa types were clustered into 9 spa CCs and 19 singletons. Six spa types with fewer than five repeats were excluded (spa types t026, t232, t233, t287, t362, and t398). Figure shows the spa/BURP and corresponding MLST/eBURST and PFGE grouping results for each isolate. The clonal relatedness of all BURP-grouped spa types is illustrated in a population snapshot (Fig. ). Four of the nine spa CCs had designated group founders (spa CC401, spa CC382/399, spa CC084/346, and spa CC005). The group founder within BURP clusters of at least three different spa types is defined as the spa type with the highest founder score (assigned to the spa type to which the relevant spa types and strains are most closely related). In two spa CCs (spa CC382/399 and spa CC084/346), spa types t382 and t399 and spa types t084 and t346 had identical founder scores, respectively.
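As an illustration of how an index of diversity with a confidence interval can be obtained from typing results, the following Python sketch computes Simpson's index of diversity in the form commonly applied to typing schemes, with a large-sample approximate 95% confidence interval. It is an assumption that these formulas match the ones used here, and the function name and input are hypothetical; the strain-level data needed to reproduce the reported value are not included.

```python
from collections import Counter
from math import sqrt


def simpson_diversity(type_labels):
    """Return (D, (lower, upper)) for a list of type assignments, one per strain.

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number of
    strains of type j and N is the total number of strains.
    Assumed CI: D +/- 2 * sqrt((4 / N) * (sum(p_j^3) - (sum(p_j^2))^2)),
    with p_j = n_j / N.
    """
    counts = list(Counter(type_labels).values())
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two strains are required")
    d = 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))
    sum_p2 = sum((c / n) ** 2 for c in counts)
    sum_p3 = sum((c / n) ** 3 for c in counts)
    variance = max((4 / n) * (sum_p3 - sum_p2 ** 2), 0.0)  # guard against rounding
    half_width = 2 * sqrt(variance)
    return d, (max(d - half_width, 0.0), min(d + half_width, 1.0))
```

A call such as `simpson_diversity(["t1", "t1", "t2", "t3"])` (hypothetical labels) returns the index together with its interval bounds.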
All STs belonging to CC217 and CC15 were grouped into spa CC005 and spa CC084/346, respectively. PFGE subdivided each of these spa CCs into two different groups. spa CC401 was clustered by eBURST into two different groups (CC c and CC d) that shared only one of the seven MLST loci (aroE). PFGE subdivided spa CC401 into three different groups. The 21 strains of spa CC382/399, exhibiting 12 different spa types, all clustered in MLST CC30 (ST30, ST37, ST39, and ST1005) by eBURST. Six other strains of CC30 were clustered by BURP in spa CC c, in one singleton (No260), and in one excluded spa type (No335). PFGE corroborated the group diversity (14 PFGE types); however, PFGE grouping resulted in five groups. Overall, grouping by BURP was highly concordant with that by eBURST (96.5%) and PFGE (94.9%). PFGE groups were 93.8% concordant with eBURST CCs. At the level of types instead of groups, the concordance of spa types with STs and PFGE types was 96.8% and 97.1%, respectively, while PFGE types were 95.9% concordant with STs.
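To make the concordance values concrete, the following Python sketch computes a simple pairwise concordance between two groupings of the same strains. It assumes that concordance is defined as the fraction of strain pairs on which both methods agree (both place the pair in the same group, or both place it in different groups); the function name and the example labels are hypothetical.

```python
from itertools import combinations


def simple_concordance(labels_a, labels_b):
    """Fraction of strain pairs that two groupings classify consistently.

    labels_a and labels_b hold the group assigned to each strain by method A
    and method B, in the same strain order.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both groupings must cover the same strains")
    if len(labels_a) < 2:
        raise ValueError("at least two strains are required")
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b  # True counts as 1
        total += 1
    return agree / total


# Hypothetical example: four strains grouped by two methods.
method_a = ["x", "x", "y", "y"]
method_b = ["1", "1", "2", "3"]
concordance = simple_concordance(method_a, method_b)  # 5 of 6 pairs agree
```

In this toy case the two methods disagree only on the last pair of strains, which method A places together and method B separates.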
Two recent studies used strain collections that did not represent diverse natural populations of
S. aureus (Fig. ) but contained predominantly clinical isolates or mainly MRSA strains and therefore showed a strong sampling bias (
9,
22). In both studies, high concordances between BURP-grouped
spa types and MLST and PFGE clusters were found, and only a few discrepancies were detected. In this study, similar discrepancies became apparent by using MLST-based grouping as a reference, e.g., in MLST CC30. MLST data are not greatly influenced by the effects of recombination, due to the use of BURST, which deduces CCs from allelic profiles. In contrast,
spa and PFGE grouping algorithms lack any transformation of the original data and are therefore more sensitive to recombination events. Large chromosomal replacements, which affect macrorestriction patterns and
spa typing substantially, are likely within CC30. Such events have already been shown in different clonal lineages, including CC30, by a previous study (
20). These cases could be clarified by using an additional target from another genomic region (e.g., clumping factor B gene) (
13). In some other instances, the use of BURP clustering is limited because short
spa types are excluded from further analysis. However, analysis of the SpaServer content, comprising more than 42,000 isolates with more than 3,100 different
spa types (
10), shows that fewer than 7% of all
spa types were affected by exclusion from BURP clustering.
In summary, BURP clustering of spa data determines clonal relatedness within an unbiased sample of MSSA strains and yields results that are highly congruent with MLST and PFGE groupings. spa typing and BURP should therefore be considered for phylogenetic studies in addition to strain typing of MSSA.