The implementation of the new clustering algorithm Based Upon Repeat Pattern (BURP) into the Ridom StaphType software tool enables clustering based on spa typing data for Staphylococcus aureus. We compared clustering results obtained by spa typing/BURP to those obtained by currently well-established methods, i.e., SmaI macrorestriction analysis and multilocus sequence typing/eBURST. A total of 99 clinical S. aureus strains, including MRSA and representing major clonal lineages associated with important kinds of infections which have been prevalent in Germany and Central Europe during the last 10 years, were used for comparison. SmaI macrorestriction analysis revealed the highest discriminatory power, and clustering results for all three methods resulted in concordance values ranging from 96.8% between the two sequence-based methods to 93.4% between spa typing/BURP and SmaI macrorestriction/cluster analysis. The results of this study indicate that spa typing, together with BURP clustering, is a useful tool in S. aureus epidemiology, especially because of ease of use and the advantages of unambiguous sequence analysis as well as reproducibility and exchange of typing data.
Staphylococcus aureus is one of the most frequent nosocomial pathogens. The emergence and spread of epidemic strains of methicillin-resistant S. aureus in hospitals (hMRSA) and, independent from the nosocomial setting, in the community (cMRSA) require special attention in infection control. Typing is an important prerequisite for targeted control measures. For about 30 years, phage typing has been widely used for strain typing. More recently, SmaI macrorestriction analysis (pulsed-field gel electrophoresis [PFGE]) was introduced as a typing method with high discriminatory power. PFGE is still regarded as the “gold standard” of molecular typing of MRSA, despite insufficient comparability of results obtained from different laboratories (21). During the past 5 years, DNA sequence-based typing has become more popular due to progress in large-scale sequencing methodology, ease of data transfer, and excellent comparability of results (2). This first became evident by the application of multilocus sequence typing (MLST) to MRSA (4, 5). At present, however, MLST is not suitable for routine infection control due to high cost, labor intensity, and lack of broad access to high-throughput DNA sequencing.
Several S. aureus typing schemes targeting polymorphic DNA repeat regions in genes for microbial surface components recognizing adhesive matrix molecules have been described previously (7, 9, 16, 27, 30). They also include typing methods based on the length polymorphism in spa amplimers (9) or, more recently, on polymorphisms in multiple fragments amplified in a multiplex PCR approach for variable-number tandem repeats (7, 27). Among sequence-based approaches, spa typing was the most promising (8, 12, 13, 15, 31). The X region of the protein A gene (spa) consists of direct repeats exhibiting an extensive polymorphism based on point mutations, deletions, duplications, and insertions. Different repeats can be assigned an alpha-numerical code, and the order of specific repeats defines the spa type. Two systems of nomenclature are in use for spa type determination (13, 15). Ridom StaphType (13) provides a software tool enabling straightforward sequence analysis and designation of spa types via synchronization to a central server.
Previous studies have shown that there is a fairly good correlation between clonal groupings of MRSA isolates obtained by spa typing and other typing techniques (15, 22, 29, 36). The broader application of spa typing revealed a considerable degree of spa gene repeat polymorphism within particular clonal groups and clonal lineages of MRSA isolates, as defined by MLST and eBURST, indicating a higher discriminatory power for this method. However, in daily infection control, an unambiguous and quick attribution of newly arising spa types to known clonal complexes and clonal lineages is essential because of their differential dynamics of emergence and spread (33). This is exemplified by the occurrence of cMRSA isolates, most often containing the lukS-lukF determinant coding for Panton-Valentine leukocidin. They may emerge as (i) clonal lineages not previously reported (40), (ii) derivatives of clonal lineages which have already been known as nosocomial pathogens in the “pre-MRSA era” (23, 26), and (iii) clones belonging to the same clonal lineages as nosocomial MRSA strains and containing both the mecA gene and the lukS-lukF determinant (17).
In sequence databases for spa types, specific repeats and repeat successions seem to be associated with particular MLST sequence types (http://www.spaserver.ridom.de). The recent implementation of the BURP (Based Upon Repeat Pattern) algorithm into the Ridom StaphType software (13, 28) takes this into account and provides a tool for classifying related spa sequence types into different BURP groups.
Here, we report the application of spa typing and subsequent BURP clustering to a collection of S. aureus isolates, including all major clonal lineages of hMRSA and cMRSA isolates, as well as methicillin-susceptible S. aureus (MSSA) isolates of the same clonal lineages representing probable ancestors. Furthermore, MSSA isolates with particular virulence genes associated with important kinds of disease, such as tst (toxic shock syndrome), eta and etb (exfoliative dermatitis), and lukS-lukF (furunculosis and necrotizing pneumonia), were included. All isolates were collected at different times and from different geographical areas, mainly in Central Europe. The resulting groups were compared to those obtained with SmaI macrorestriction and MLST/eBURST analyses.
A total of 99 S. aureus isolates, including methicillin-susceptible as well as methicillin-resistant ones, were used in this study. Strains were selected from the strain collection of the German reference center for staphylococci situated at our laboratory and represent the majority of clonal lineages prevalent in Germany and Central Europe during the last 10 years, including recently emerging cMRSA isolates. The reference strains previously used in the “HARMONY” study on harmonization of PFGE protocols for MRSA strain typing (21) were included. Isolates were collected at different time points over a period of approximately 10 years. More-detailed information about strain characteristics, also including demonstrated virulence determinants for each isolate, can be found in Table 1.
SmaI macrorestriction analysis was conducted according to the HARMONY protocol (21). Resulting gel images were analyzed using the guidelines proposed by Tenover et al. (37). Accordingly, strains were considered identical or very closely related if they differed by at most three bands. Additionally, cluster analysis was performed with the BioNumerics software package (Applied Maths, Sint-Martens-Latem, Belgium), using the Dice coefficient, and visualized as a dendrogram by the unweighted-pair group method using arithmetic averages (UPGMA) with 1% tolerance and 1% optimization settings. A similarity cutoff of 70% was used to define a cluster.
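The two band-based comparisons named above can be illustrated with a minimal Python sketch. It assumes that bands have already been matched across lanes and reduced to shared integer labels; this representation and the function names are hypothetical, and the position-tolerance matching performed by BioNumerics is not reproduced here.

```python
def dice_similarity(bands_a, bands_b):
    """Dice coefficient: 2 * (shared bands) / (bands in lane A + bands in lane B)."""
    total = len(bands_a) + len(bands_b)
    if total == 0:
        return 1.0  # two empty lanes are treated as identical (a convention of this sketch)
    return 2 * len(bands_a & bands_b) / total


def band_differences(bands_a, bands_b):
    """Number of bands present in exactly one of the two patterns."""
    return len(bands_a ^ bands_b)


# Hypothetical patterns: each integer is a band class after matching.
lane_1 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
lane_2 = {1, 2, 3, 4, 5, 6, 7, 8, 11, 12}

similarity = dice_similarity(lane_1, lane_2)    # 16 / 20
differences = band_differences(lane_1, lane_2)  # bands 9, 10, 11, 12
closely_related = differences <= 3              # three-band criterion used in the text
same_cluster = similarity >= 0.70               # 70% cutoff used in the text
```

In this toy case the two patterns fall above the 70% similarity cutoff but differ by four bands, which illustrates that the visual band criterion and the cluster cutoff are separate decisions.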
Genomic DNA for subsequent PCRs was isolated from a 2-ml overnight culture with the DNeasy tissue kit (QIAGEN, Hilden, Germany), using lysostaphin (100 mg/liter; Sigma, Taufkirchen, Germany) to achieve bacterial lysis.
The polymorphic X region of the protein A gene (spa) was amplified using the primers spa-1113f (5′ TAA AGA CGA TCC TTC GGT GAG C 3′) and spa-1514r (5′ CAG CAG TAG TGC CGT TTG CTT 3′). All sequencing reactions were carried out using the ABI PRISM BigDye Terminator cycle sequencing ready reaction kit (Applied Biosystems, Foster City, Calif.). spa types as well as BURP spa clonal complexes (spa-CCs) were assigned using the Ridom StaphType software version 1.3 (Ridom GmbH, Würzburg, Germany) as described by Harmsen et al. (13). Applying the newly implemented algorithm BURP, spa types were clustered into different groups, with the calculated cost between members of a group less than or equal to 8. spa types shorter than five repeats were excluded from analysis because no reliable deduction about ancestries can be made from these types. The new algorithm takes repeat duplication/deletion in addition to point mutation events into account when calculating the relatedness of different spa types. Due to speed constraints, a heuristic version (secondary duplication events within primary duplications are not detected) of the EDSI alignment (excisions, duplications, substitutions, and insertions), as described by Sammeth et al. (28), was used.
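The grouping step can be sketched in Python as follows. This is not the BURP implementation: the cost function below is a plain unit-cost edit distance over whole repeats, standing in for the EDSI alignment cost (28), and the single-linkage reading of "cost between members of a group" is an assumption of this sketch. The names `edit_cost` and `cluster_spa_types` and the toy repeat successions are hypothetical.

```python
from itertools import combinations


def edit_cost(a, b):
    """Placeholder cost: unit-cost edit distance over repeats (NOT the EDSI cost model)."""
    prev = list(range(len(b) + 1))
    for i, rep_a in enumerate(a, 1):
        cur = [i]
        for j, rep_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                      # delete a repeat
                           cur[j - 1] + 1,                   # insert a repeat
                           prev[j - 1] + (rep_a != rep_b)))  # substitute a repeat
        prev = cur
    return prev[-1]


def cluster_spa_types(spa_types, cost_fn=edit_cost, max_cost=8, min_repeats=5):
    """Return (groups, excluded): types linked by cost <= max_cost share a group."""
    included = {n: r for n, r in spa_types.items() if len(r) >= min_repeats}
    excluded = sorted(set(spa_types) - set(included))  # too short to cluster
    parent = {name: name for name in included}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(included, 2):
        if cost_fn(included[a], included[b]) <= max_cost:
            parent[find(a)] = find(b)

    groups = {}
    for name in included:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values()), excluded


# Hypothetical repeat successions, not real spa types.
toy_types = {
    "toy1": ["r15", "r12", "r16", "r02", "r24", "r24"],
    "toy2": ["r15", "r12", "r16", "r02", "r24"],
    "toy3": ["r26", "r23", "r17", "r34", "r17", "r20"],
    "toy4": ["r26", "r23"],
}
# max_cost is lowered to 2 because the placeholder cost is on a different
# scale from the BURP cost to which the threshold of 8 applies.
groups, excluded = cluster_spa_types(toy_types, max_cost=2)
```

With these inputs, `toy1` and `toy2` differ by one repeat and are grouped, `toy3` remains a singleton, and `toy4` is excluded for having fewer than five repeats. Passing the cost function as an argument keeps the grouping logic independent of the alignment model.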
MLST was conducted as previously described (4). Allele types and resulting sequence types were assigned at the S. aureus MLST database via the Internet (http://www.mlst.net). Sequence types were clustered into groups using eBURST, employing the relaxed group definition with five of seven loci (i.e., members of a group differ at a single locus or two loci).
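The relaxed group definition can be illustrated with a short Python sketch, which is not the eBURST program itself. It assumes that a sequence type joins a group when it shares at least five of seven alleles with at least one existing member; the allelic profiles and the names `shared_loci` and `eburst_groups` are hypothetical.

```python
def shared_loci(profile_a, profile_b):
    """Count loci at which two allelic profiles carry the same allele."""
    return sum(a == b for a, b in zip(profile_a, profile_b))


def eburst_groups(profiles, min_shared=5):
    """Group sequence types linked by at least min_shared identical loci."""
    names = list(profiles)
    seen = set()
    groups = []
    for start in names:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:  # depth-first walk over linked sequence types
            current = stack.pop()
            members.append(current)
            for other in names:
                if other not in seen and shared_loci(
                        profiles[current], profiles[other]) >= min_shared:
                    seen.add(other)
                    stack.append(other)
        groups.append(sorted(members))
    return sorted(groups)


# Hypothetical seven-locus allelic profiles.
toy_profiles = {
    "ST-a": (1, 1, 1, 1, 1, 1, 1),
    "ST-b": (1, 1, 1, 1, 1, 1, 2),  # single-locus variant of ST-a
    "ST-c": (1, 1, 1, 1, 1, 3, 3),  # double-locus variant of ST-a and ST-b
    "ST-d": (9, 9, 9, 9, 9, 9, 9),  # unrelated
}
relaxed = eburst_groups(toy_profiles, min_shared=5)
strict = eburst_groups(toy_profiles, min_shared=6)
```

Under the relaxed definition `ST-c` joins the group of `ST-a` and `ST-b`; under a six-of-seven definition it becomes a singleton.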
An index of discrimination (DI) for each typing method was calculated, defined as the average probability that the typing system will assign a different type to two unrelated strains randomly sampled in the microbial population of a given taxon (14). The DI depends on the number of strain types and on the homogeneity of frequency distribution of strains into types. Confidence intervals (CIs) for discriminatory indices were calculated as previously described (11).
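The definition given above corresponds to Simpson's index of diversity as it is commonly applied to typing systems, DI = 1 - [1 / (N(N - 1))] * sum over types of n_j(n_j - 1), where N is the number of strains and n_j the number of strains of type j. The following Python sketch assumes that this is the formula intended by reference 14; the function name and the example input are hypothetical, and the confidence interval calculation (11) is not included.

```python
from collections import Counter


def discriminatory_index(type_assignments):
    """Probability that two strains drawn without replacement differ in type."""
    n = len(type_assignments)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(type_assignments).values()
    same_type_pairs = sum(c * (c - 1) for c in counts)  # ordered same-type pairs
    return 1 - same_type_pairs / (n * (n - 1))


# Hypothetical example: four strains, two of which share a type.
example_di = discriminatory_index(["t1", "t1", "t2", "t3"])  # 1 - 2/12
```

The index is 1.0 when every strain has its own type and 0.0 when all strains share one type, which reflects the dependence on both the number of types and the evenness of their frequencies.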
The agreement between two strain typing tests was calculated as described by Robinson et al. (25). Ideally, the DI and the typing system concordance should be calculated using a test population that includes epidemiologically unrelated strains. This condition is most likely not met in our study, and therefore, the absolute figures should be treated with caution. Nevertheless, the relative ordering of the typing schemes according to the DIs and the typing system concordance is meaningful. Calculation of both measures is implemented in the Ridom StaphType software version 1.3.
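One common way to express agreement between two typing methods is the fraction of strain pairs that both methods treat alike, that is, both place the pair in the same group or both separate it. The Python sketch below implements this pairwise measure; whether it matches the calculation of Robinson et al. (25) in every detail is an assumption, and the function name and example labels are hypothetical.

```python
from itertools import combinations


def pairwise_concordance(groups_a, groups_b):
    """Fraction of strain pairs on which two groupings agree (same vs. different)."""
    if len(groups_a) != len(groups_b):
        raise ValueError("both methods must type the same strains, in the same order")
    pairs = list(combinations(range(len(groups_a)), 2))
    if not pairs:
        raise ValueError("at least two strains are required")
    agreeing = sum(
        (groups_a[i] == groups_a[j]) == (groups_b[i] == groups_b[j])
        for i, j in pairs
    )
    return agreeing / len(pairs)


# Hypothetical group labels for four strains under two methods.
method_1 = ["x", "x", "y", "y"]
method_2 = ["g1", "g1", "g1", "g2"]
example_concordance = pairwise_concordance(method_1, method_2)  # 3 of 6 pairs agree
```

Because the measure is computed over pairs, it is sensitive to the composition of the strain collection, which is consistent with the caution expressed above about absolute values.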
All 99 isolates were typeable by SmaI macrorestriction and produced 74 different macrorestriction patterns according to the criteria defined by Tenover et al. (37). Employing a cutoff similarity value of 70% in subsequent cluster analysis, we assigned the isolates to 16 different groups, with 5 groups containing only a single isolate (Fig. 1).
All isolates were assigned to 44 different spa types, varying in length between 2 (t586) and 16 (t032) repeats (Fig. 1 and Table 1). Using the algorithm BURP, newly implemented in the Ridom StaphType software, spa types were clustered into 10 different groups, with 7 groups comprising more than one spa type and three so-called “singletons.” spa types were grouped together if the calculated cost between members of a group was less than or equal to 8. Since clustering parameters excluded spa types shorter than five repeats, two types (t026 and t586) were excluded from BURP grouping. Nevertheless, t026 and t586 could be classified into spa CC045 and spa CC065, respectively, after visual inspection of the corresponding repeat patterns. Using these parameters, the majority of isolates were grouped together as expected regarding their evolutionary origin, as reflected by MLST analysis, although in most cases a variety of spa types corresponded to a single MLST sequence type (ST5, six different spa types; ST254, three different spa types; ST45, three different spa types; ST30, eight different spa types; ST22, four different spa types; ST121, four different spa types).
Twenty-two different sequence types were identified. Using the relaxed group definition in eBURST, 10 different groups were defined, with five groups including more than one sequence type and five singletons. Groups corresponded to the most abundant clonal complexes present in Central Europe during the last 10 years, i.e., CC-5, CC-8, CC-22, CC-30, CC-45, and CC-121, and included ST80 and ST1 representing the predominant cMRSA isolates in Central Europe and North America, respectively (Fig. 1).
The ability of each method to discriminate different strain types was assessed by calculation of the DI; DIs and corresponding CIs are summarized in Table 2.
Using MLST/eBURST data as a reference method, we evaluated the concordance between typing methods applied in this study for the given strain collection (Fig. 1). Regarding the isolates of the clonal complexes CC-121, CC-22, and CC-45 and sequence types ST426, ST59, ST81, and ST1, typing results were identical for all three methods. Each of these clonal complexes and sequence types corresponded to a single group after spa typing/BURP and SmaI macrorestriction analyses, respectively (Fig. 1). Similar results were obtained for isolates of the clonal complex CC-30, which clustered in a single group after spa typing/BURP (spa CC012) as well as after macrorestriction analysis (PFGE14). However, spa typing revealed very similar spa types (t037 and t030) for ST239 (CC-8) isolates compared to spa types of spa CC012. As a consequence, those spa types were grouped together in spa CC012 by BURP. This phenomenon was described previously and could be attributed to a chromosomal replacement in the evolution of ST239 MRSA isolates descending from ST8 by integration of a large genetic element from ST30, also encompassing the spa locus (24). spa types for the remaining isolates of clonal complex CC-8 (showing sequence types ST8, ST247, and ST254) were clustered into a single group (spa CC036) by BURP. Using SmaI macrorestriction analysis, all 19 isolates of CC-8 were classified into a total of four different groups (PFGE04, PFGE05, PFGE09, and PFGE11) containing 1 to 10 isolates. Although those PFGE groups contained predominantly isolates of one or two sequence types (PFGE04, ST8 and ST247; PFGE05, ST254; PFGE09 and PFGE11, ST239), cluster analysis was not able to group respective sequence types into separate groups unambiguously.
Similar results were obtained for isolates of the clonal complex CC-5 with sequence types ST5, ST225, and ST228. While eBURST and BURP clustered all isolates into one group, SmaI macrorestriction analysis was able to group all ST228 isolates into a separate cluster; however, isolates of ST5 and ST225 were not separated by this method and macrorestriction patterns of isolates belonging to those sequence types were quite diverse. These results indicate a high concordance between results of BURP and eBURST for the data set used in this study, while concordance between both sequence-based methods and SmaI macrorestriction analysis is comparatively lower. This is also reflected by concordance values between the three typing methods for the given data set, which are summarized in Table 3.
A variety of genotyping techniques are available for classifying S. aureus strains for epidemiological investigation, including “band-based” as well as “sequence-based” methods. Sequence-based typing methods, such as spa typing and MLST, have some obvious advantages over band-based methods, such as SmaI macrorestriction analysis, including ease of use, reproducibility, transportability, and comparability of results (2). SmaI macrorestriction analysis, the current gold standard in S. aureus strain typing, is accepted for outbreak investigations, but some authors question its use for phylogenetic analyses (34).
In contrast to MLST (combined with eBURST grouping), which is widely used for evolutionary investigation in S. aureus, spa typing proved to be a tool for routine investigation. Moreover, spa typing was more discriminatory than MLST in previous studies. This was confirmed in the present study, where most MLST types encompassed several spa types. However, until now, no algorithm was available to group related spa types together for epidemiological investigations. The implementation of BURP into the Ridom StaphType software allows clustering of different spa types based on a new algorithm for the alignment of repeat sequences (28). Thus, the aim of the present study was to compare clustering results obtained by spa typing/BURP analysis to those obtained by well-established methods (SmaI macrorestriction analysis and MLST/eBURST).
Our study demonstrated a wide congruence of clustering results obtained by spa typing/BURP, SmaI macrorestriction analysis, and MLST/eBURST. Similar results were previously reported for clonal complexes containing major epidemic nosocomial MRSA isolates, such as CC5, CC8, and CC45 (15, 32), as well as for MSSA isolates of CC5, CC30, and CC121 (1, 15). Additionally, we found a good congruence for epidemic nosocomial MRSA isolates of ST22 (in CC22) and ST228 (in CC5), as well as for community-acquired MRSA isolates of ST80. We also confirmed the divergent spa types t037 and t030 in MRSA isolates of ST239 clustering together with spa types found in CC30 (15, 18). This has been explained previously by recombinative replacement of a large stretch of chromosomal DNA in MRSA isolates of CC8 by a stretch originating from CC30 and including the spa gene (24).
The different spa CCs are characterized by one or two repeats specific for a particular complex, such as r15 for spa CC012, r14 and r44 for spa CC435, r28 and r29 for spa CC022, r11 and r19 for spa CC036, r20 and r30 for spa CC045, r07 for t044, and r02, r08, and r09 for spa CC065, as well as by the repeat succession. The types within a spa CC in most instances differ by deletion, duplication, or insertion of repeats, but there are also point mutations leading to new repeats. Although we find the same or closely related spa sequence types in isolates of the same clonal lineage (as defined by MLST) collected at very different times and from different geographical locations, we should be careful with conclusions on descent and direct epidemiological relations of isolates within a spa CC based only on spa sequence types. In the following, these aspects will be discussed in more detail.
The majority of isolates belonging to spa CC012 (which corresponds to MLST CC30) have repeat r15 as the first repeat unit in common, which is followed by repeats r12, r16, and r02 in a rather conserved order. Besides spa type t019, most of the other types included in spa CC012 differ by various numbers of the final repeat r24. spa type t021 is already represented by MSSA isolates containing the lukS-lukF determinant coding for Panton-Valentine leukocidin from the 1960s, such as PS80, and by MSSA isolates from the 1980s. These isolates represent the so-called 80, 81 complex, a major nosocomial pathogen of the 1960s and 1990s (23). This spa type is, however, also seen in more-recent MSSA and MRSA isolates which contain tst but not lukS-lukF. An interpretation in the sense of a direct common ancestral origin of both pathotypes of this clonal lineage would be rather speculative, since spa type t021 could have been derived from other spa types by loss of one or more final repeats of r24 (e.g., from type t012 or t018). MSSA isolates of MLST CC30 exhibiting spa types t012, t018, and t021 from the United States and from Poland have also been described previously (15, 16, 18).
Both cMRSA isolates of MLST ST30 in our collection exhibit spa type t019; this type was also described for cMRSA from Belgium (3), Poland (18), and Japan (35). In this case, a wide geographic dissemination of a particular clone cannot be excluded, as t019 differs from other spa types in spa CC012 by two point mutations in the first repeat (r08 instead of r15 [for details, see http://www.spaserver.ridom.de]).
MSSA isolates of MLST ST121 are grouped in spa CC435 and share repeats r14, r44, and r13 as the first ones; repeats r14 and r44 have not been found in any other isolates of the collection. As already seen in isolates of CC30, there is no association of virulence-associated genes (eta and etb versus lukS-lukF) with particular spa types.
spa types of MLST ST22 MRSA isolates cluster in spa CC022. Among them, isolate 96-1678 exhibits the prototype SmaI macrorestriction pattern of ST22; the other isolates of this lineage have been selected for different fragment patterns. Four of them exhibit t032, one exhibits t022 (one deletion of r23), and one exhibits t005 (one point mutation in r29 leading to r05). spa type t310 was found for both cMRSA isolates of ST22 originating from Scotland and Germany, which suggests a more direct relation.
CC8 contains four major clonal lineages of epidemic hMRSA: ST8, ST239, ST247, and ST254 revealing spa types t008, t037/t030, t051, and t009, respectively. Besides t037/t030 (see above), they are grouped into spa CC08. The MRSA isolate of lineage ST254 isolated from a horse (isolate 03-2575) exhibits t036 (deletion of three repeats from t009). The finding of the same spa types for isolates from each clonal lineage collected in different years and from rather dispersed geographical areas suggests that spa types in spa CC08 are quite stable. spa types t008 and t037 have also been described for MRSA isolates of lineages ST8 and ST239 from the United States (15). There are, however, additional types (t388) among MRSA isolates of ST239 in Poland; Polish MRSA isolates of ST247 exhibit t052, differing from t051, which was found in Central European isolates of this lineage, by deletion of one repeat (18). cMRSA isolates belonging to ST8 cannot be discriminated from hMRSA isolates of this lineage by spa sequence typing; they also exhibit t008.
For CC5, isolates belonging to clonal lineages ST5, ST225, and ST228 have been investigated. Clustering of their spa types groups them in spa CC045. spa type t001 was found for isolates of ST228 collected from different locations in Germany, Poland, and Slovenia over 6 years. Besides t002, which had already been reported for MRSA isolates of lineage ST5 from Central Europe (13), two other spa types have been detected (t045 and t178). Type t002 was also found in MRSA isolates of ST5 from the United States (15, 16) and in the majority of MRSA isolates of these lineages from Japan and South Korea. In that study, seven “subtypes” due to deletion, insertion, and point mutations were reported (32). The majority of MRSA isolates of ST5 reported from Poland exhibited t053, which differs from t002 by three point mutations within the final repeat (18). spa type t003 was found in three isolates of ST225 from different locations in Germany.
MRSA isolates of CC45 were first reported in 1993 from Berlin hospitals (39) and afterwards from other European countries (26); they are obviously also disseminated in North America. spa types in ST45 are rather heterogeneous and are grouped in spa CC065; type t004 was found in five of eight isolates which have been collected from 1993 until now. Type t004 seems to be characteristic for MRSA isolates of ST45 originating from Central Europe, whereas t015 was reported for isolates from the United States (15) and Poland (18); it is substantially different from t004.
Among the cMRSA isolates investigated, clonal lineages ST80, which includes the most widely disseminated cMRSA isolates in Europe (38), and ST1, which is sporadic in central Germany but frequent in northern states of the United States (20), exhibit spa types t044 and t175, respectively, by which they can easily be recognized, as these types have not been reported for any other MSSA or MRSA isolates so far (http://www.spaserver.ridom.de). Type t044 was also reported for cMRSA isolates of ST80 from Belgium (3).
In conclusion, we demonstrated a high degree of concordance between the three different typing and clustering methods applied in this study. Although SmaI macrorestriction analysis proved to be superior in discriminatory power, spa typing/BURP was shown to be a feasible tool for elucidating epidemiological questions, providing results comparable to those obtained with MLST/eBURST. However, the user must be aware of certain particularities (e.g., t037/t030 grouping; see above). A specialized software tool such as Ridom StaphType enables the implementation of spa type-specific alerts, thus preventing “misclassification.” Recently, such a software tool was used to establish a DNA sequence-based early warning system for outbreak investigations in hospitals (19). Additionally, it ensures a common typing nomenclature and thus greatly facilitates the exchange of typing data for S. aureus (2) as well as the setup of supranational typing networks (10).