Several methods have been used for typing of Streptococcus agalactiae (group B streptococci [GBS]). Methods currently in use either provide inadequate resolution (e.g., typing of capsular polysaccharides and surface proteins) or are labor-intensive and expensive (e.g., multilocus sequence typing [MLST] or pulsed-field gel electrophoresis). This work describes the construction and use of a multiple-locus variant-repeat assay (MLVA) on 126 well-characterized human GBS strains, consisting mostly of invasive Norwegian strains and international reference strains. Based on in silico analysis of the genomes of strains A909, NEM316, and 2603V/R, 18 candidate loci were selected and investigated by PCR. Eleven loci showed diversity, and the five most diverse loci were used for the construction of an MLVA, consisting of a multiplex PCR followed by fragment analysis with capillary electrophoresis. The assay generated clusters which corresponded well with those observed by other methods. However, it provided a considerably higher degree of diversity, with 70 different MLVA types compared to 36 types generated by MLST. Simpson's index of diversity for the 5-locus MLVA was 0.963, compared to 0.899 for MLST in this strain collection. MLVA results will generally be available within 2 days, which is usually faster than MLST. In our hands, MLVA of GBS represents a rapid, easy, and comparatively inexpensive method for high-resolution genotyping of GBS.
Streptococcus agalactiae is the only species within the group B streptococci (GBS) of Lancefield's classification of hemolytic streptococci. It is well recognized as a human pathogen, especially for causing septicemia, meningitis, and other serious invasive disease in neonates; it is also associated with stillbirth. In addition, it is known to cause serious maternal infections and serious infections in elderly and immunocompromised patients.
Different approaches for typing of GBS have been developed through the years. Most commonly, GBS strains are assigned to one of the 10 capsular polysaccharide (CPS) types. Typing of surface proteins and analysis of the antibiotic resistance pattern can add further resolution to typing of phenotypic features. Examination of CPS or surface proteins can be done by immunological methods or, more recently, by detecting the corresponding genes by PCR (6, 8, 10). Multilocus sequence typing (MLST) was developed to study the long-term evolutionary development of a species. The total number of GBS sequence types (STs) is currently around 490 (http://pubmlst.org/sagalactiae/). Pulsed-field gel electrophoresis (PFGE) based on macrorestriction fragment analysis of genomic DNA is considered highly discriminatory. For GBS, the restriction enzyme SmaI is often used, generating only 8 to 12 restriction fragments, which may be suboptimal with respect to discriminating capacity (2, 19, 21). Since PFGE is an image-based method, it may be difficult to compare results between different laboratories. Both MLST and PFGE are labor-intensive techniques requiring experienced personnel and usually several days of work before results are available.
DNA loci consisting of repeated sequences are widespread throughout the genome of bacteria and have a tendency to vary in repeat number from strain to strain, so-called “variable number of tandem repeats” (VNTR). The combination of several VNTR loci in an assay (multiple-locus variant-repeat assay [MLVA]) has been shown in various bacterial species to generate strain-specific profiles that can be compared, exchanged, and reproduced in different laboratories. The technique has been employed successfully for the typing of an increasing number of bacterial species (23). The aim of the present study was to explore the GBS genome for variably repeated genetic regions and to investigate the feasibility of an MLVA method for typing the bacterium.
A panel of 126 GBS strains was selected to represent a broad range of CPS and surface proteins and sequence types. Most of these (n = 113) were invasive strains submitted to the Norwegian reference laboratory for GBS. They were also used in a previous study of invasive GBS in infants (3). In addition, nine international reference strains were included, among them all eight fully sequenced strains. Two Zimbabwean colonizing strains and two Danish CPS type IX strains were also included. The distribution of CPS types was as follows: Ia, 12 strains; Ib, 10 strains; II, 6 strains; III, 55 strains; IV, 9 strains; V, 30 strains; VI, 1 strain; and IX, 3 strains. Strains stored at −80°C were grown overnight on blood agar plates. For nucleic acid extraction, 4 to 5 colonies were added to 200 μl lysis solution containing 12 μl (20 mg/ml) lysozyme (Sigma-Aldrich Corp., St. Louis, MO), 4.8 μl (20 mg/ml) proteinase K (Sigma), 4.8 μl (10,000 U/ml) mutanolysin (Sigma), and 178.4 μl Tris-EDTA (TE) buffer and incubated at 37°C and 65°C for 15 min each. DNA was purified on a Qiagen BioRobot M48 instrument using the MagAttractDNA Mini 48 kit (Qiagen, Hilden, Germany) and eluted in a volume of 50 μl. The eluate was diluted 1:10 for PCR.
Genomes were analyzed using the Tandem Repeats Finder program, version 4.00 (1). The Variable Region Finder available at the Health Protection Agency website (http://www.hpa-bioinformatics.org.uk/variable_region_finder/index.html) was also used. Primers were designed using Oligo 6.71 software (Molecular Biology Insights, Inc., Cascade, CO). The loci were consecutively termed TR1 through TR18, while the five loci proposed for the subsequent MLVA were renamed SATR1 to SATR5 in the later phase of the study. Analysis of the repeats was generally done with single PCRs in 25 μl of reaction mixture containing 50 μM (each) dATP, dCTP, dGTP, and dTTP; forward and reverse primers (0.5 μM); 2.5 μl 10× PCR buffer with 15 mM MgCl2 and 0.75 U AmpliTaq Gold DNA polymerase (both from Applied Biosystems, Foster City, CA); a 1-μl aliquot of the purified, diluted DNA template; and RNase-free water. The PCR conditions were 5 min at 95°C for activation of the polymerase and then 35 cycles of 30 s at 95°C, 30 s at annealing temperature, and 60 s at 72°C. The annealing temperature used initially was that proposed by the primer design software, but the temperature was later changed to 55°C. The PCR was performed on an MJ Research PTC-200 instrument (MJ Research, Inc., Watertown, MA). Amplicon size was initially determined with an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA) using the DNA 1000 chip. Later, amplicon size was determined either by 2% agarose gel electrophoresis (for the VNTRs that showed only two or three alleles) or by fragment analysis with capillary electrophoresis. The Qiagen multiplex PCR kit was used for the subsequent multiplex PCR. The manufacturer's recommendations were followed, except for the annealing temperature, which was set to 55°C. In addition, the concentrations of the individual primers in the primer mix were adjusted to 1.5 μM for SATR1 and -5, 1 μM for SATR2 and -3, and 0.5 μM for SATR4, and the total PCR volume was reduced to 25 μl.
Fragment analysis was done on an ABI 3130xl genetic analyzer. Hi-Di formamide, GeneScan 1200 LIZ size standard, and the PCR product were mixed (9 plus 0.5 plus 0.5 μl). The standard protocol for fragment analysis with 36-cm capillaries and POP7 polymer was used. The product sizes were analyzed using the GeneMapper 4.0 software. (All products used for fragment analysis were from Applied Biosystems.)
For analysis of the stability of the alleles, three strains (06/14, 06/123, and 07/15) were passaged on blood agar plates 40 times over a period of 4 months.
Typing of CPS and surface proteins was done by PCR (6, 11, 24), except that immunofluorescence microscopy was used for the nine Norwegian strains collected in 2005 (4). MLST was performed as described elsewhere (9). Results were available for all 126 of the strains. STs of most of the Norwegian strains were available from a previous study (3), whereas the ST of the international reference strains was available from a recent publication (22).
For cluster analysis, BioNumerics 6.0 software (Applied Maths, Sint-Martens-Latem, Belgium) was used. Simpson's index of diversity was calculated either via the VNTR DIversity and Confidence Extractor (V-DICE) software at the Health Protection Agency website (Health Protection Agency, Colindale, London, United Kingdom; http://www.hpa-bioinfotools.org.uk/cgi-bin/DICI/DICI.pl) or manually (18). The number of repeats was determined by subtracting the offset (the number of nucleotides between the primers and the start/end of the repeat) and dividing the remaining number of nucleotides by the repeat length (Table 1).
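The manual calculation of Simpson's index of diversity can be sketched in Python as follows. This is an illustration only: it assumes the Hunter-Gaston form of the index, D = 1 − Σ n_j(n_j − 1) / [N(N − 1)], where N is the number of strains and n_j the number of strains of type j. The text does not state which form was applied, and the function name `simpson_diversity` is hypothetical.

```python
from collections import Counter


def simpson_diversity(type_labels):
    """Simpson's index of diversity (Hunter-Gaston form, an assumption).

    type_labels: one type assignment per strain (e.g. an MLVA type or an ST).
    Returns the probability that two strains drawn without replacement
    belong to different types.
    """
    counts = Counter(type_labels)
    n_strains = sum(counts.values())
    if n_strains < 2:
        raise ValueError("at least two strains are required")
    # Number of ordered strain pairs that share the same type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_strains * (n_strains - 1))


# Toy input (not data from this study): four strains, two types.
print(simpson_diversity(["A", "A", "B", "B"]))
```

The input can be any hashable label, so the same function could be applied to single-locus alleles, complete MLVA profiles, or sequence types.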
For DNA sequencing, loci were amplified in PCRs with unlabeled primers as described above. The PCR products were purified by using ExoSAP-IT (USB Corporation, Cleveland, OH) following the instructions of the manufacturer. BigDye 3.1 and BigDye terminator 3.1 (Applied Biosystems) reagents were used according to the manufacturer's instructions for sequencing PCR and subsequent purification. Capillary electrophoresis of sequencing PCR products was performed on an ABI 3130xl genetic analyzer (Applied Biosystems). The resulting sequences were analyzed using Sequencher 4.2 software (Gene Codes Corporation, Ann Arbor, MI).
The sequences of the three complete genomes of strains A909, NEM316, and 2603V/R were analyzed in silico using the Tandem Repeats Finder program or the Variable Region Finder website. Candidate loci were compared via the BLAST website with the eight published whole-genome sequences, namely, the three mentioned above and five shotgun sequences that were not fully assembled and annotated (strains 515, H36B, 18RS21, COH1, and CJB111) (22). Most loci were excluded from further investigation for reasons such as large size, poorly conserved repeats, or poorly conserved flanking sequences. For the 18 remaining loci (Table 1), primers were designed. Their presence and their suitability for an MLVA were tested in 11 GBS strains. Seven loci (marked with asterisks in Fig. 1) were found unsuitable because they did not show variability or were not amplified even after several primer pairs were designed and tested.
The remaining 11 loci were investigated in all 126 strains. Four of the loci had four or more alleles, and three had three alleles, while the remaining four loci had two alleles in this strain collection. Six of the 11 diverse loci did not generate a PCR product in some strains (indicated as −1 repeats in Fig. 1). Amplification of SATR4 and SATR5 generated a PCR product devoid of the repeat in 1 strain and 21 strains, respectively (indicated as 0 repeats in Fig. 1). The number of alleles for the 11 loci and their calculated diversity index are shown in Table 2. All 11 loci were stable in three strains that were passaged 40 times.
The five most diverse loci were chosen for the construction of a GBS MLVA with fluorescently labeled primers and size estimation by capillary electrophoresis. The five proposed MLVA loci (SATR1 to -5) were amplified in one multiplex PCR. Due to uneven amplification efficiency, the primer concentration had to be adjusted as described above. Fragment sizes were estimated on an ABI 3130xl Genetic Analyzer using the 1200 LIZ size standard. This standard was chosen because SATR3 and SATR5 had large PCR products in some strains: up to around 750 bp in SATR3 and above 1,200 bp in the latter.
The five loci are described in more detail as follows.
The SATR1 repeat locus consisted of well-conserved repeats of 60 bp and was present in all strains investigated, with one to three repeats. This locus was previously described by Dmitriev et al. (7, 17) as consisting of 16-bp direct repeats and 44-bp spacer regions.
The SATR2 repeats were well conserved, consisting of 18 bp, and the strains contained 3 to 15 repeats. All sequenced strains had an insert of 9 bp found between repeats 2 and 3. The only strains having 15 repeats were the three strains in this study of the recently described serotype IX (20). In 14 strains (11%), no PCR product was detected.
The SATR3 repeat locus gave comparably large products from 297 to 750 bp. The repeat unit of 12 bp was quite well conserved. The repeat count was computed to 18 to 54 repeats; however, sequencing demonstrated that the strains with the largest amplicons had 18 repeats and an inserted sequence of two 216-bp repeats. This combination was found in 33 strains (26.4%), all but one of which belonged to MLST clonal complex 17. A similar observation was made by Lamy et al. (12), as discussed below. SATR3 was not amplified in 25 strains (20%), most of which were of serotype III/R4 and MLST clonal complex 19.
The SATR4 repeat gave the smallest PCR products, ranging between 99 and 168 bp. The underlying repeat of 18 bp was quite degenerated; in some strains, some or all of the repeats were only 15 bp. For convenience of typing, all alleles are assigned as multiples of 18 bp in the calculation of repeat numbers. A PCR product was observed in all 126 strains, and five different alleles were found. The repeat is located in the pcsB gene, encoding a protein which is important for cell wall separation (15).
The SATR5 locus is the most diverse repeat locus in this study, with 24 different alleles in the 126 strains and a Simpson's index of diversity of 0.913 (confidence interval [CI], 0.900 to 0.922). The repeat of 48 bp was quite well conserved, and an amplification product was detected in all but one strain. Of 21 strains with a PCR product size equivalent to the offset (0 repeats), 13 were of ST19. A result of six repeats was found in 22 strains, 18 of which belonged to ST17. Some strains had an insert after the repeat locus, causing a variable offset. However, for typing purposes, the offset was set to 161 bp with the primers chosen. The larger products of this locus, especially those above 900 bp, did give weak signals in spite of attempts to optimize both multiplex PCR and fragment analysis. If SATR5 did not yield a product in the multiplex PCR, it was repeated in a single PCR. In five strains, the SATR5 locus was calculated to be more than 1,400 bp by the GeneMapper software, although this was above the upper limit of the 1200 LIZ size standard and therefore unreliable. When the five strains containing SATR5 amplicons above 1,200 bp were analyzed by gel electrophoresis, they showed diversity. In the analysis, however, these five strains were assigned a repeat count of 50. SATR5 is located in the gene encoding the fibrinogen binding protein FbsA (16).
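The allele assignment rules for SATR5 described above can be summarized in a short Python sketch. The offset (161 bp), the repeat length (48 bp), the 1,200-bp limit of the size standard, the fixed value of 50 for oversized amplicons, and the −1 code for a missing product are taken from the text. Rounding to the nearest whole repeat is an assumption made here to absorb sizing imprecision, and all names are hypothetical.

```python
SATR5_OFFSET_BP = 161          # offset chosen for typing purposes
SATR5_REPEAT_BP = 48           # length of one repeat unit
SIZE_STANDARD_LIMIT_BP = 1200  # upper limit of the 1200 LIZ size standard
OVERSIZE_ALLELE = 50           # repeat count assigned above the limit
NO_PRODUCT = -1                # code used when no amplicon is obtained


def call_satr5(fragment_bp):
    """Translate a measured SATR5 fragment size (bp) into a repeat count.

    fragment_bp is None when no PCR product was detected.
    """
    if fragment_bp is None:
        return NO_PRODUCT
    if fragment_bp > SIZE_STANDARD_LIMIT_BP:
        # Sizes above the standard are unreliable and are not differentiated.
        return OVERSIZE_ALLELE
    # Assumption: round to the nearest whole repeat.
    return round((fragment_bp - SATR5_OFFSET_BP) / SATR5_REPEAT_BP)


# A product equal to the offset gives 0 repeats; 161 + 6 * 48 = 449 bp gives 6.
print(call_satr5(161), call_satr5(449), call_satr5(1400), call_satr5(None))
```

Under these rules, all amplicons above 1,200 bp collapse into one allele, which is why differentiating them would require an additional sizing method.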
MLVA analysis with the five proposed loci resolved the 126 strains into 70 different types with a Simpson's index of diversity of 0.963, while MLST separated them into 36 STs with an index of 0.899. The two largest groups of identical MLVA profiles consisted of 19 strains and 13 strains, most of which belonged to ST17 and ST19, respectively. These two MLST types are common subgroups of CPS type III (Fig. 1). All 27 of the ST17 strains in this work clustered together but were resolved into 12 different MLVA types within this cluster; 16 of them had identical profiles (Fig. 2). Similarly, all 11 CPS III/ST19 strains belonged to the same MLVA cluster and were resolved into four MLVA types; 7 of them had identical profiles. CPS type V has previously been described as relatively homogeneous, possibly due to a recent introduction of this group as a human pathogen (5). Most of the type V strains belonged to ST1 (21 out of 30 type V strains); 20 of these 21 were in a cluster consisting of 12 MLVA types, mostly because of heterogeneity in SATR5.
A simpler MLVA analysis using SATR1 to -4 resulted in 43 different types for the 126 strains (Simpson's index of diversity, 0.926), also corresponding very well with the clustering by the other typing methods.
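How strains are grouped into MLVA types, and how the 4-locus scheme relates to the 5-locus scheme, can be illustrated with a small Python sketch. It assumes that each strain is represented by a tuple of repeat counts ordered SATR1 to SATR5 and that two strains share a type only if their profiles are identical at every locus considered. The strain names and profiles below are invented for illustration and are not data from this study.

```python
from collections import Counter


def mlva_types(profiles, loci=None):
    """Count strains per MLVA type.

    profiles: dict mapping strain name -> tuple of repeat counts,
              ordered SATR1..SATR5.
    loci:     indices of the loci to include (None keeps all loci).
    """
    def type_key(profile):
        if loci is None:
            return tuple(profile)
        return tuple(profile[i] for i in loci)

    return Counter(type_key(p) for p in profiles.values())


# Hypothetical profiles: s1 and s3 are identical, s2 differs only in SATR5.
example = {
    "s1": (2, 5, 18, 4, 6),
    "s2": (2, 5, 18, 4, 0),
    "s3": (2, 5, 18, 4, 6),
}
five_locus = mlva_types(example)
four_locus = mlva_types(example, loci=range(4))  # SATR1 to SATR4 only
print(len(five_locus), len(four_locus))
```

Dropping a locus can only merge types, never split them, so a 4-locus scheme always yields at most as many types as the 5-locus scheme on the same strains.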
This study identified VNTR loci in the genome of GBS that are suitable for MLVA typing. MLVA is a relatively new typing method with several advantages compared to other methods. Mainly, MLVA is easy to perform at moderate costs, and results are in most cases available within 2 days, which will usually be more rapid than MLST or PFGE. Since MLVA generates numeric results, comparison of strains between different laboratories may easily be done. In the present study, 18 candidate loci were identified in the GBS genome by computational studies. Initial laboratory analysis found seven of these not suitable. The remaining 11 loci were tested in 126 strains, and the five most diverse loci were selected for an MLVA based on multiplex PCR and capillary electrophoresis for determination of the number of repeats.
The strain collection analyzed consisted of Norwegian invasive GBS strains and international reference strains which together represented a wide selection of CPS types and STs. The Norwegian strains, mostly from neonates, had been submitted to the national reference laboratory from hospitals throughout the country in the years 2006 and 2007. It is therefore reasonable to assume that there was no epidemiological association between most patients. Based on the high discriminatory power of the MLVA shown in this study, the method may well be suited for outbreak investigation. This should be addressed in a study with epidemiologically related strains.
Typing with MLVA in this study generally corresponded with the clustering of strains observed in CPS/surface protein typing or MLST. The 5-locus MLVA provided a very high degree of resolution, discriminating the 126 GBS strains into 70 different MLVA types (Fig. 1). The discriminatory power of this MLVA is superior to those of both MLST and combined CPS/protein typing, which resolved the strains into 36 and 19 types, respectively. The strains could have been resolved further if SATR5 alleles above 1,200 bp had been differentiated, which could be necessary under certain circumstances. If required, this will imply the use of other methods for exact size determination of SATR5 and consequently a delay of results. For SATR4, the diversity due to the degenerate character of the repeat should be kept in mind. In addition to the repeat count, the fragment size may be considered, especially if the MLVA is used in epidemiological investigations. As an alternative option for providing rapid results, a 4-locus MLVA consisting of the loci SATR1 to -4 was also analyzed. Even this 4-locus MLVA was capable of providing higher resolution than MLST or CPS/protein typing. Conversely, some or all of the six loci listed in Table 2 but not included in the 5-locus MLVA analysis (e.g., TR9, TR13, and TR16) could be added to the MLVA if an even higher degree of resolution is desirable.
Some of the loci selected for the 5-locus MLVA have been investigated previously for other purposes. SATR1 has been addressed by a Russian and Slovakian group (7, 17), who found eight alleles in 112 bovine strains. Strictly speaking, this locus is a clustered regularly interspaced short palindromic repeat (CRISPR), but it worked very well as a component of this MLVA. SATR3 is located in the gene of surface adhesion protein gbs2018, previously analyzed by Lamy et al. (12). They described three alleles, one of which included the insert of 216 bp with two repeats mentioned above, and also found it to be strongly correlated with ST17. GBS belonging to ST17 account for a high proportion of newborn infections (3, 9, 14); however, whether these GBS represent a hypervirulent cluster is under discussion (13). SATR5 is the most diverse repeat locus and is responsible for a substantial part of the overall resolution obtained with this assay. Analyzed alone, it clustered some of the MLST groups such as ST17 and ST19 well, while others (e.g., CPS type V/ST1) were more dispersed. On the other hand, SATR5 resolved CPS type V/ST1 into many different types in the MLVA, while other typing techniques tend to give a much more homogeneous picture of this subgroup. In SATR2, the finding of 15 repeats was indicative of CPS type IX. This was also observed in three additional type IX strains not included in this study. Some results were indicative of certain sequence types, e.g., the finding of 54 or 55 repeats in SATR3 for ST17 (Fig. 2). Similarly, an ST19 could be expected when SATR3 had no PCR product (−1 repeats) and SATR5 had 0 repeats.
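The indicative patterns named above can be written down as a simple lookup, sketched here in Python. The three rules are taken directly from the text; the function name and the return format are hypothetical. The output should be read as a hint for further typing only, since these associations were observed in this strain collection and are not claimed to be diagnostic.

```python
NO_PRODUCT = -1  # code used for a locus without PCR product


def lineage_hints(satr2, satr3, satr5):
    """Return CPS type / ST hints suggested by an MLVA profile.

    Arguments are repeat counts for SATR2, SATR3, and SATR5.
    An empty list means that none of the patterns applies.
    """
    hints = []
    if satr2 == 15:
        hints.append("CPS type IX")
    if satr3 in (54, 55):
        hints.append("ST17")
    if satr3 == NO_PRODUCT and satr5 == 0:
        hints.append("ST19")
    return hints


print(lineage_hints(satr2=5, satr3=NO_PRODUCT, satr5=0))
```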
To summarize, the present study identified VNTR loci in the genome of Streptococcus agalactiae suitable for MLVA analysis. Five loci were selected for a multiplex PCR protocol followed by capillary electrophoresis, enabling a highly discriminatory GBS typing assay. The 5-locus MLVA resolved a strain collection of 126 GBS strains considerably better than MLST and with less workload. This method is well suited for typing of GBS, e.g., by national reference laboratories and for research purposes. Further work would include refinement of the method; comparison with PFGE using epidemiologically related strains; analysis of other strain collections, such as non-Norwegian strains and noninvasive and animal strains; and establishment of a website for international collaboration and strain comparison.
This work was supported by grants from the Clinic of Laboratory Medicine, St. Olav's University Hospital, Trondheim, and the Central Norway Regional Health Authority. A.R. was supported by a Ph.D. grant from the Norwegian University of Science and Technology, Department of Laboratory Medicine, Children's and Women's Health.
Kirsti Løseth is gratefully acknowledged for expert laboratory work.
Published ahead of print on 26 May 2010.