Phase variation can occur in different ways, including slipped-strand mispairing of SSRs during DNA replication, general and site-specific recombination, excision/insertion of mobile genetic elements such as transposons and insertion sequences, and epigenetic regulation through differential methylation (42). Over the years, SSRs have emerged as a common and important mechanism for phase variation both of particular genes/proteins and on a genome scale (41). The present study is one of the first comprehensive attempts to understand the importance of SSRs for phase variation in a Gram-positive pathogen, S. agalactiae. Through comparative genomic analysis of eight bacterial genomes representing the most frequently isolated serotypes, we present evidence of genotypic variation indicative of slipped-strand mispairing in SSR regions.
In a first stringent screening, a wide variety of SSRs were examined across the eight genomes. The results suggested that the repeat type with the highest propensity to vary is the homopolymeric tract, especially the poly-A tract. An extended screening of homopolymeric tracts identified additional polymorphisms, which reinforced the potential importance of poly-A tracts as a means to drive phase variation. In addition, the longest homopolymeric tracts were examined, and gene loci with clustered SSRs were identified. Moreover, a previously undescribed positional bias of poly-A tracts within ORFs was observed: the longest tracts were preferentially located in the 5′ ends of ORFs. Should a poly-A tract within a coding sequence change in length and cause a frameshift, it may be advantageous that the resulting product is a short peptide. A major truncation is more likely to efficiently inactivate the protein and may avoid inadvertent effects due to misfolded or functionally altered proteins. Moreover, the bioenergetic cost of translation is proportional to the length of the product, so an early truncation is also the least costly. Interestingly, while poly-A tracts seem to be the most slippage-prone among the SSRs, the longest such tracts are statistically underrepresented in the genome, implying an evolutionary selection against such sequence elements. Point mutations within SSRs (stabilization) are an alternative means of repressing potential variation, and several examples were noted during the screening procedures.
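To make the positional observation concrete, the following is a minimal sketch of how homopolymeric tracts might be located in an ORF and expressed as a fractional position from the 5′ end. It is not the screening pipeline used in this study: the function names, the minimum tract length of 7, and the toy sequence are all assumptions chosen for illustration.

```python
import re

def find_homopolymer_tracts(seq, base="A", min_len=7):
    """Return (start, length) for every run of `base` that is at least
    `min_len` long. `min_len=7` is an illustrative threshold only."""
    pattern = re.compile(f"{re.escape(base)}{{{min_len},}}")
    return [(m.start(), m.end() - m.start())
            for m in pattern.finditer(seq.upper())]

def relative_positions(orf_seq, base="A", min_len=7):
    """Return (fractional_position, length) for each tract, where 0.0 is
    the 5' end of the ORF and values near 1.0 are close to the 3' end."""
    n = len(orf_seq)
    return [(start / n, length)
            for start, length in find_homopolymer_tracts(orf_seq, base, min_len)]

# Toy ORF, NOT a real S. agalactiae sequence: an A8 tract near the 5' end
# and a shorter A5 tract further downstream.
toy_orf = "ATG" + "A" * 8 + "GCTGGCTACCGGTCTG" + "A" * 5 + "TCTGGTTAG"

print(find_homopolymer_tracts(toy_orf))             # [(3, 8)]
print(find_homopolymer_tracts(toy_orf, min_len=5))  # [(3, 8), (27, 5)]
print(relative_positions(toy_orf))
```

Collecting the fractional positions of the longest tracts over all ORFs of a genome would give a distribution in which a 5′ bias, if present, shows up as an excess of values close to 0.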
Overall, among the loci identified there are many examples of genes encoding surface proteins or secreted proteins, or of genes that indirectly affect such proteins. Examples include cell wall-attached proteins, lipoproteins, membrane proteins, and sortase, the latter of which could influence a range of cell wall-attached proteins. Such surface structures constitute rather typical examples of potential phase variation. Less typical are the observed SSR polymorphisms in transcriptional regulators and two-component systems. It would seem disadvantageous to override regulation through the genotypic inactivation of regulatory components. However, a recent study showed that homopolymeric tract polymorphisms in a C. jejuni response regulator cause ON/OFF phase variation of both flagellar biosynthesis and its regulator (10). The gene encoding the regulator is variable through a cluster of five poly-A tracts and one poly-T tract, which are comparable in length to the tracts discussed in the present study.
The inclusion of draft genomes in the analysis increases the risk of including sequencing errors. Steps were taken to exclude hits where sequence quality was in doubt, such as in the vicinity of contig ends (see Materials and Methods). Although only a subset of the variant loci was verified through manual scrutiny of trace file quality, there is little to suggest that overall genome sequence quality had a major impact on our results. No clear relationship was noted between the degree of fragmentation (number of contigs) and the number or type of hits in our screening. A limited number of loci were selected for resequencing. Nine loci, representing poly-A or poly-T tracts and showing variation in at least one draft genome compared to the reference genome, were resequenced in those genomes, and the SSR polymorphisms were confirmed (data not shown). Nonetheless, it cannot be ruled out that some of the indels represent sequencing errors.
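The exclusion of hits near contig ends can be illustrated with a simple coordinate filter. This is only a sketch of the idea: the actual criteria are those given in Materials and Methods, and the function name, the hit representation, and the 100-bp margin below are hypothetical.

```python
def far_from_contig_ends(hit_start, hit_end, contig_length, margin=100):
    """Return True if a hit lies at least `margin` bases away from both
    ends of its contig. Coordinates are 0-based, with `hit_end` exclusive.
    The default margin of 100 bp is an assumed value for illustration."""
    return hit_start >= margin and hit_end <= contig_length - margin

# Hypothetical hits as (start, end, contig_length) tuples.
hits = [(50, 58, 5000), (1000, 1008, 5000), (4950, 4958, 5000)]
kept = [h for h in hits if far_from_contig_ends(*h)]
print(kept)  # [(1000, 1008, 5000)]
```

A filter of this kind trades sensitivity for reliability: genuine polymorphisms close to contig ends are discarded together with the error-prone calls.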
The comparative genomic analysis and experiments performed here suggest that the phase variation mechanism involving SSR slippage is at play in S. agalactiae and has resulted in antigenic differentiation between strains. However, despite considerable efforts and the use of three different methodologies, it was not possible to detect an intrastrain event of SSR slippage for any of the three proteins that were experimentally investigated. Compared to other bacterial species, S. agalactiae mostly lacks the unusually long repeats responsible for high-frequency phase variation (5). Moreover, the preference for poly-A tracts is atypical. Although there are several reports of genotypic switching through slippage of poly-A tracts of comparable length, none of these suggest a high frequency. It has been suggested that poly-A tracts pose a delicate problem for bacteria, in that RNA polymerase (transcriptional) slippage seems to occur in addition to replicational slippage, and this may be a reason why some microorganisms show an underrepresentation of long poly-A/T tracts (2). Our difficulties in isolating genotypic switching in vitro may derive from the methodological difficulty of screening a sample that represents a large enough number of bacteria. Moreover, an ON-OFF switching event may confer a selective advantage in vivo that is not evident in vitro, if the relevant proteins are surface exposed, immunogenic, and expendable. For practical purposes, our screening was biased, in that OFF-ON events were targeted. In our case, such events would have involved the expansion of SSRs. Mutation frequencies may differ significantly between ON-OFF and OFF-ON events, and contractions seem to be more frequent than expansions (5).
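The sample-size issue can be made explicit with a simple calculation. Under the simplifying assumption that each screened colony independently carries a switched variant with probability f, the chance of seeing at least one variant among N colonies is 1 - (1 - f)^N. The sketch below evaluates this expression; the frequencies used are illustrative placeholders, not values measured in this study.

```python
import math

def detection_probability(switch_frequency, colonies_screened):
    """Probability of observing at least one switched colony, assuming
    independent colonies that each switch with the same probability."""
    return 1.0 - (1.0 - switch_frequency) ** colonies_screened

def colonies_needed(switch_frequency, target_probability=0.95):
    """Smallest number of colonies giving at least `target_probability`
    of observing one or more switched colonies."""
    return math.ceil(math.log(1.0 - target_probability)
                     / math.log(1.0 - switch_frequency))

# Illustrative (assumed) switching frequencies per colony.
for f in (1e-3, 1e-5):
    n = colonies_needed(f)
    print(f"f = {f:g}: screen about {n} colonies for a 95% chance of detection")
```

For rare events the required number of colonies is roughly 3/f, so a tenfold lower switching frequency demands a tenfold larger screen, which quickly exceeds what colony-based methods can handle.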
A recent publication (23) describes eight atypical clinical isolates (vaginal/rectal colonization) of S. agalactiae that were unencapsulated. Among various polymorphisms with little or no phenotypic impact, three of the eight strains contained a deletion of an adenine in the cpsG locus, resulting in a frameshift and truncation of the protein with likely loss of function, and thus compromised capsule biosynthesis. Upon closer examination of the sequence, we note that the deletion is located in a poly-A tract and represents a one-base SSR contraction from A8. This suggests that, in vivo, SSR slippage in a poly-A tract may constitute a significant way to modify capsular biosynthesis. We describe extreme repeat clustering in cpsH but did not find evidence of variation in four unencapsulated invasive clinical isolates. In another attempt to approach the in vivo situation, the three selected genes of interest were sequenced in a strain that had undergone repeated mouse passages, and all three genes remained unchanged with respect to the starting inoculum. Thus, in vivo growth in naive mice does not seem to involve a selective pressure for variation in these loci.
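The effect of a single-adenine deletion in a poly-A tract, as described for cpsG above, can be sketched with a toy reading frame. The sequence below is invented for illustration and is not the cpsG sequence; the helper `translate` is a hypothetical minimal translator that uses the standard codon assignments.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate from the first base, stopping at the first stop codon
    (or at the end of the sequence); the stop itself is not returned."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Toy ORF (NOT cpsG): an A8 tract shortly after the start codon.
prefix, suffix = "ATGGCT", "GCTAACCGGTGAATTGTAA"
wild_type = prefix + "A" * 8 + suffix   # tract length 8
contracted = prefix + "A" * 7 + suffix  # one adenine lost by slippage

print(translate(wild_type))   # MAKKKLTGEL
print(translate(contracted))  # MAKKS
```

In this toy case the contraction shifts the reading frame immediately after the tract and exposes a stop codon two codons later, so the product is cut from ten residues to five. The closer the tract lies to the 5′ end, the shorter the truncated product.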
In the strict sense, the lack of experimentally observed intrastrain variation means that we have no direct evidence of SSR-mediated phase variation. Nevertheless, we believe that S. agalactiae uses SSRs as an adaptive strategy, but in a significantly different role compared to that in Gram-negative bacteria. Bet-hedging as an evolutionary strategy comes at a cost, but it can improve fitness when environmental circumstances change frequently enough (26). Moreover, generalist bacteria with large genomes and considerable genome redundancy may sustain genotype switching better than specialists. By preferentially using homopolymeric A tracts, selecting against long repeats in general, and limiting damage through the positioning of SSRs, S. agalactiae keeps mutation frequency in check. Nevertheless, SSRs are used, either for long-term adaptation or possibly during the uncommon invasive infections. This represents a cautious bet-hedging strategy suitable for a specialized commensal and occasional opportunist. Genome plasticity, rather than phase variation, may be the more appropriate term for this process.