The agr quorum-sensing and signal transduction system was initially described in Staphylococcus aureus, where four distinct allelic variants have been sequenced. Southern blotting suggests the presence of homologous loci in many other staphylococci, and this has been confirmed for S. epidermidis and S. lugdunensis. In this study we isolated agr-like loci from a range of staphylococci by using PCR amplification from primers common to the six published agr sequences and bracketing the most variable region, associated with quorum-sensing specificity. Positive amplifications were obtained from 14 of 34 staphylococcal species or subspecies tested. Sequences of the amplicons identified 24 distinct variants which exhibited extensive sequence divergence with only 10% of the nucleotides absolutely conserved on multiple alignment. This variability affected all three open reading frames involved in quorum sensing and signal transduction. However, these variants retained several protein signatures, including the conserved cysteine residue of the autoinducing peptide, with the exception of S. intermedius of pigeon origin, which contained a serine in place of cysteine at this position. We discuss hypotheses on the mode of action and the molecular evolution of the agr locus based on comparisons between the newly determined sequences.
The agr locus was initially described in Staphylococcus aureus as an element controlling the production of exoproteins implicated in virulence (14, 23, 26). Its pattern of action has been shown to be complex, upregulating certain extracellular toxins and enzymes expressed postexponentially and repressing some exponential-phase surface components (17). The agr locus comprises two divergent operons expressed from promoters P2 and P3 (Fig. 1). The P2 operon includes four genes. Two of them encode elements of a density-sensing cassette: agrD, which encodes the precursor of the autoinducing peptide (AIP), and agrB, whose product is probably involved in processing and/or secretion of the AIP (8). The other two encode a two-component sensory transduction system consisting of AgrC, the membrane sensor, and AgrA, the response regulator (11, 15, 18, 19). In brief, the AIP derived from AgrD by the action of AgrB interacts with AgrC in the membrane to activate AgrA, which upregulates transcription both from promoter P2, amplifying the response, and from P3, initiating the production of a novel effector: RNA III (6, 20). In S. aureus, delta-hemolysin is the only translation product of RNA III and is not involved in the regulatory functions of the transcript, which is therefore the primary agent for modulating the expression of other operons controlled by agr (2, 16, 20).
The agr locus in S. aureus (agrSa) has been shown to be polymorphic and can be divided into four distinct genetic groups (7, 9). Stimulation of agr expression, and hence transcription from promoters P2 and P3, is strongly dependent on the presence in the medium of the AIP derived from AgrD of the inducing strain or a member of the same group. AIPs from other groups generally inhibit agr activity, as do AIPs from other species of staphylococci (7, 9, 21). Comparison of sequences from different strains of S. aureus indicates that AgrB, AgrC, and AgrD, which are involved in the generation of the specific signal, are strikingly variable among the four groups, whereas AgrA, the response regulator, is much more conserved. Sequence variation is particularly marked in the AIP precursor, AgrD; the C-terminal two-thirds of AgrB, which may be involved in the specific cleavage and transport of the AIP; and a portion of the N-terminal half of AgrC where the extracellular sites for interaction with the AIP may be located (9). Coevolution of these sites with a specific AIP might explain the high degree of specificity for induction of the agr locus, but the mechanism for its inhibition by a broad spectrum of AIPs with little sequence similarity is not immediately obvious. Functional agr loci have also been shown to be present in S. lugdunensis (agr-1Sl) and S. epidermidis (agr-1Se). Their gene products define additional activation or inhibition groups distinct from those of S. aureus (31, 32). Sequences related to the agr regulon have also been demonstrated by hybridization in at least 12 other staphylococcal species or subspecies (3, 30), suggesting that this regulatory mechanism is widespread in the staphylococci.
In the present study, we compare the nucleotide sequences for open reading frames (ORFs) in the agr region of a number of non-S. aureus Staphylococcus species. We confirm that recognizable agr loci are widespread in the staphylococci and are subject to considerable sequence variation. The 24 sequenced variants from 14 tested species and subspecies maintain several protein signatures that provide information concerning the mechanism by which agr diversity has arisen.
Bacterial strains and their origins are listed in Table 1. The agr-null strain of S. aureus RN6911 was constructed by replacement of the entire agr locus of the standard agr-positive strain RN6390 by tetM (20). Type strains were obtained from the American Type Culture Collection (ATCC), the Czechoslovak Collection of Microorganisms (CCM), the Deutsche Sammlung von Mikroorganismen (DSMZ), the Japanese Collection of Microorganisms (JCM), and the R. P. Novick Collection. Other strains were collected throughout France from human or animal infections and were referred to the Centre National de Référence des Toxémies à Staphylocoques (Lyon, France). Species were identified by colony and microscopic morphology, coagulase activity on rabbit plasma (bioMérieux, Marcy l'Étoile, France), production of clumping factor (Staphyslide; bioMérieux), and results on the ID32 Staph gallery (bioMérieux).
Genomic DNA was extracted from agar plate cultures as previously described (12) and used as a template for PCR amplification with the primers AGR-1 (5′-ATGCACATGGTGCACATGCA-3′) and AGR-2 (5′-CATAATCATGACGGAACTTGCTGCGCA-3′) (Eurogentec), chosen to encompass a ca. 1,200-bp fragment of agr common to agr-1Sa, agr-2Sa, agr-3Sa, agr-1Se, and agr-1Sl (GenBank accession numbers M21854, AF001782, AF001783, AF288215, and AF173933, respectively). Amplification mixtures were denatured at 95°C for 5 min and then subjected to five low-stringency cycles (denaturation at 94°C for 1 min, annealing at 45°C for 1 min, and extension at 72°C for 1 min), followed by 30 high-stringency PCR cycles (denaturation at 94°C for 1 min, annealing at 55°C for 1 min, and extension at 72°C for 1 min). Positive controls included amplification of gyrA or the 16S-23S rRNA intergenic region (12). PCR products were analyzed by electrophoresis through 0.8% agarose (Sigma), purified by using the High-Pure PCR product purification kit (Boehringer Mannheim), and sequenced with the above PCR primers (Genome Express).
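As a quick check on the two primers, the following minimal Python sketch reports their length and GC content. It uses the primer sequences exactly as printed above and assumes that any space inside a printed sequence is a typesetting artifact; the function name `gc_content` is illustrative and not part of the original methods.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    # Assumption: spaces inside a printed primer are typesetting artifacts.
    seq = seq.upper().replace(" ", "")
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "AGR-1": "ATGCACATGGTGCACATGCA",
    "AGR-2": "CATAATCATGACGGAACTTGCTGCGCA",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC fraction = {gc_content(seq):.2f}")
```

Both primers have a GC fraction close to 0.5, which is compatible with the two-stage annealing scheme (45°C, then 55°C) described above.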
Sequences were aligned by using the CLUSTAL algorithm with additional minor modifications by visual inspection and analyzed by using BLAST, GeneJockey, and CLUSTAL software (1, 35). Consensus membrane-spanning regions were determined from the aligned predicted amino acid sequences by using TMAP (http://www.mbb.ki.se/tmap/index.html) (24), and PROSITE sites and signatures were sought in AgrC by using PROSCAN (http://pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_proscan.html) (5).
Evolutionary distances were determined by the method of Kimura, and these values were used to construct dendrograms by the neighbor-joining method with the PHYLIP package (European Bioinformatics Institute) (27). The stability of the suggested relationships was tested by the construction of at least 1,000 bootstrap trees for each data set by using Seqboot (PHYLIP).
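The text does not state which Kimura model was applied. Assuming the widely used two-parameter model, the distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The sketch below illustrates that formula only; it is not the PHYLIP implementation, and the function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Columns containing a gap or any symbol other than A, C, G, T are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # A ValueError from sqrt or log here means the pair is too divergent
    # (saturated) for the model to return a finite distance.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For highly divergent pairs the logarithm's argument approaches zero, so the estimate becomes unstable; this is one reason distances computed from very variable loci should be read with caution.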
The agr quorum-sensing and signal transduction system was first described in S. aureus, where four distinct allelic loci have been identified and sequenced. Sequences have also been determined for homologous loci in S. epidermidis and S. lugdunensis (7, 9, 31, 32). The presence of similar loci in 12 other staphylococcal species or subspecies was suggested by Southern blot hybridization with probes derived from agrSa, which produced strong positive signals for nine species and weak signals for a further three (S. hominis subsp. hominis and S. schleiferi subsp. coagulans or subsp. schleiferi) (3). We developed PCR primers common to the six determined sequences that bracketed a ca. 1.2-kbp region, including much of the agr locus. Table 1 lists the 71 strains from 14 species or subspecies of staphylococci from which a product of the expected size could be obtained with these primers, including all eight agr-positive strains of S. aureus. The agr-deleted RN6911 strain of S. aureus and 20 other staphylococcal species or subspecies did not yield a product. All of the 71 agr amplicons were sequenced and yielded 24 distinct variants. Each amplicon included the three genes expected, corresponding to agrB, agrC, and agrD. The first and last genes were truncated at each end owing to the positions of the primers. Although the agr locus has been considered to be dedicated to control of the production of virulence factors, these findings confirm that agr is present in a wide range of staphylococci, including nonpathogenic species such as S. carnosus. Recent gene expression analysis by microarray (P. Dunman and S. Projan, unpublished data) has indicated that the locus is also important in the control of catabolic pathways, nutrient uptake, and energy metabolism.
The lack of amplifications from 20 staphylococcal species or subspecies might indicate absence of the locus, but in at least some cases the primer sites may have diverged too far for successful amplification, particularly since three species which hybridized weakly on Southern blotting (see above) did not produce credible amplicons.
Of the species yielding credible agr amplicons, S. aureus yielded the previously identified four variants, S. epidermidis had three, and S. auricularis, S. capitis subsp. capitis, S. caprae, S. lugdunensis, and S. simulans had two each. Unique sequences were obtained from S. arlettae, S. carnosus, S. cohnii subsp. cohnii, S. cohnii subsp. urealyticum, S. gallinarum, S. intermedius, and S. xylosus (Table 1). Despite the conservation of the three ORFs, all 24 sequences showed considerable sequence divergence, with only 10% of the nucleotides being absolutely conserved. The agrB sequence is truncated at the N-terminal end owing to the position of the 5′ primer, so the Shine-Dalgarno (SD) box and initiator codon context cannot be analyzed. Alignment of the deduced amino acid sequences of the C-terminal portions indicates much divergence (Fig. 1), with 8% identities and 23% conservative substitutions (the mean of the base identities in the sequence pairs was 0.44 ± 0.1). One exception to the rule of extreme divergence was the identity of AgrB sequences from the two different strains of S. simulans, although their AgrD and AgrC sequences were quite different.
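The proportion of absolutely conserved positions quoted above can be obtained from a multiple alignment by counting the columns in which every sequence carries the same residue. The following Python sketch shows that calculation on a toy alignment; the data are hypothetical, and treating any gapped column as not conserved is an assumption rather than a documented choice of the study.

```python
def conserved_fraction(alignment: list[str]) -> float:
    """Fraction of alignment columns in which all sequences share one residue.

    Assumption: a column containing a gap ('-') counts as not conserved.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if length == 0 or any(len(row) != length for row in alignment):
        raise ValueError("rows must be non-empty and of equal length")
    conserved = sum(
        1
        for column in zip(*alignment)
        if "-" not in column and len(set(column)) == 1
    )
    return conserved / length


# Toy alignment (hypothetical sequences, not data from this study).
toy_alignment = [
    "ATGCA-TG",
    "ATGAACTG",
    "ATTCACTG",
]
print(conserved_fraction(toy_alignment))
```

The same function applies to aligned amino acid sequences, where it gives the percentage of identities across all variants.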
The agrD sequences all possessed satisfactory SD boxes [AGGA(G/T)G-N7-10-ATG] but showed much sequence divergence throughout their lengths, including the region containing the AIP sequence. At the amino acid level, we observed only 10% identities and 26% conservative substitutions (the mean of the base identities in the sequence pairs = 0.44 ± 0.1). There is absolute conservation of two amino acid residues immediately following the AIP, and strong conservation over the next seven residues. In the upstream region, there is an absolutely conserved glycine and partial conservation at many other positions. The cysteine residue utilized in thiolactone cyclization of the AIP is absolutely conserved, with one exception: a serine residue in the AgrD from four strains of S. intermedius, all isolated from pigeons. These strains have not been shown to produce an active AIP.
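The SD box consensus given above, AGGA(G/T)G-N7-10-ATG, translates directly into a regular expression. The sketch below, in Python, locates the initiator ATG that follows such a box; the function name and the test sequences are illustrative and not taken from the study.

```python
import re

# AGGA(G/T)G, then 7 to 10 arbitrary bases, then the ATG initiator codon.
SD_AGRD = re.compile(r"AGGA[GT]G[ACGT]{7,10}ATG")


def find_sd_start(dna: str):
    """Return the 0-based index of the ATG in the first SD-box match, or None."""
    match = SD_AGRD.search(dna.upper())
    return None if match is None else match.end() - 3


# Hypothetical example: an SD box, an 8-base spacer, and a start codon.
example = "TT" + "AGGAGG" + "TTTTTTTT" + "ATG" + "AAA"
print(find_sd_start(example))
```

A spacer shorter than 7 or longer than 10 bases produces no match, which mirrors the spacing constraint in the consensus.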
AgrC sequences were also preceded by satisfactory SD boxes [AGTG(T/A)G-N3-7-ATG]. S. auricularis and S. intermedius appeared to use TTG as an initiator codon, as is commonly seen with staphylococcal proteins. Again, the alignment indicates a wide divergence of the predicted amino acid sequences for AgrC (10% identities and 22% conservative substitutions; mean of the base identities in the paired sequences = 0.35 ± 0.1), particularly in the upstream regions, with a greater degree of conservation toward the N-terminal transmembrane region, and a high degree of conservation in the C-terminal intracellular moiety, of which only the first 25 residues are present, owing to the location of the PCR primer utilized.
It might be expected that divergence of any one of these Agr peptides would require compensatory changes in the other components, because the AIP is derived from AgrD, is processed and secreted by the activity of AgrB, and then acts as a specific ligand to the extracellular portion of AgrC to induce signaling activity (9, 11). In general we observed distinct and clearly divergent sequences for all three agr ORFs for each divergent allele. The covariation of AgrB, AgrD, and AgrC was quantified by using a linear regression method for comparing the identity for each ORF between alleles. This showed a strong correlation between the variations of the different ORFs (r = 1; analysis of variance, P < 0.001).
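The exact regression procedure is not detailed in the text. One simple way to express covariation between two ORFs is the Pearson correlation coefficient computed over pairwise identities for the same allele pairs, as in the minimal Python sketch below. The identity values are hypothetical placeholders, not measurements from this study, and the function name is illustrative.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical pairwise identities (fractions) for the same five allele pairs.
agrb_identity = [0.40, 0.55, 0.62, 0.35, 0.48]
agrc_identity = [0.33, 0.47, 0.58, 0.30, 0.41]
print(round(pearson_r(agrb_identity, agrc_identity), 3))
```

A coefficient near 1 indicates that allele pairs that are similar in one ORF are also similar in the other, which is the pattern expected under concerted evolution of the locus.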
These results support the notion of a concerted coevolution of the different elements of the agr locus. Several species of staphylococci, such as S. aureus and S. epidermidis, have developed very divergent alternative alleles of the agr locus (Fig. 2) without apparent major divergence in the rest of their genomes as assessed by DNA-DNA hybridization. The agr variants are usually unable to cross-stimulate, and their bearers must communicate as “foreigners” rather than as members of the same community groups (7, 9, 21). This might represent adaptation to different microenvironments at specialized infection sites, for instance, and may indicate an incipient speciation event. Alternatively, other aspects of adaptation in the classical species might suffice to maintain these variants in the same genetic species despite an interruption in communication.
The Agr proteins are highly variable in sequence, but some general motifs may be discerned. Scattered residues throughout the AgrB peptide are conserved or maintain physicochemical similarity in all variants examined (Fig. 1), and interestingly, a conserved T/K-R-V/K motif is identified by PROSITE as a potential protein kinase C phosphorylation site. It should also be noted that AgrB of S. cohnii subsp. cohnii contains a perfect four-element leucine zipper motif, and other sequences maintain amino acids with large hydrophobic side chains (including one to three leucines) in the corresponding positions that could correspond to a transmembrane hydrophobic face. This might also suggest homo- or heterophilic association involving AgrB in the bacterial membrane.
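A four-element leucine zipper is conventionally described by the pattern L-x(6)-L-x(6)-L-x(6)-L, that is, four leucines spaced seven residues apart. Assuming that this is the pattern meant here, the following Python sketch scans a protein sequence for it, including overlapping occurrences. The test sequences are artificial and the function name is illustrative.

```python
import re

# Four leucines spaced seven residues apart: L-x(6)-L-x(6)-L-x(6)-L.
# The lookahead lets overlapping occurrences be reported as well.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def leucine_zipper_positions(protein: str) -> list[int]:
    """Return the 0-based start position of every match in the sequence."""
    return [m.start() for m in LEUCINE_ZIPPER.finditer(protein.upper())]


# Artificial example: leucines at positions 2, 9, 16, and 23.
example = "MK" + "LAAAAAA" * 3 + "L" + "GG"
print(leucine_zipper_positions(example))
```

Relaxing the character class from `L` to a set of large hydrophobic residues (for example `[LIVMF]`) would find the degenerate variants mentioned above, at the cost of many more chance matches in membrane proteins.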
As mentioned above, AgrD has a highly conserved motif, including the absolutely conserved Glu-Asp pair just downstream of the AIP, and it is assumed that this is essential for processing the C-terminal end of the AIP. In many cases the AIP of one species or variant of Staphylococcus acts as an agr inhibitor for other staphylococci (9, 21). For S. aureus, agr inhibition increases the expression of surface proteins such as adhesins. Maintenance of this blanket inhibitory activity would be reasonable among bacteria coexisting in mixed infections or during colonization if it were to contribute to adhesiveness through interspecies communication (33). Whereas specific self-activation, at least in S. aureus and S. epidermidis, requires a thiolactone ring structure in the peptide, the more general cross-inhibition can be mediated by AIP derivatives with lactone or even lactam rings (13, 22). Interestingly, we found a single exception to the rule of a conserved cysteine five residues before the C terminus of the AIP. Four different isolates of S. intermedius from pigeons all code for serine at this position. A culture supernatant of the strain listed in Fig. 1 did not contain any peptide that inhibits agr expression in group I S. aureus nor did it contain detectable delta-hemolysin (G. Lyon and R. P. Novick, data not shown), although preliminary Northern blot hybridization tests suggest that it does produce RNA III (P. Dufour and Y. Benito, unpublished data). Moreover, a mutant of S. aureus group I agrD coding for serine at this position did not generate any detectable agr-inhibiting activity (G. Ji and R. P. Novick, data not shown). It thus appears that group I AgrB cannot process the serine-containing agrD propeptide; it is not currently known whether the S. intermedius AgrB can process such a propeptide. Two of three other strains of S. intermedius (kindly provided by W. Kloos) had agr-inhibitory activity against S. aureus (not shown); determination of the agrD sequences for these strains is currently in progress. It will be interesting to know whether a thiolactone or lactam analogue is activating or inhibitory in this species.
The N-terminal regions of AgrC variants shown in Fig. 1 show no convincing conserved sequence motifs; they do, however, have very similar hydropathy profiles, as previously proposed for the S. aureus AgrC variants (not shown), presumably reflecting conserved topology. The region of AgrC sequenced corresponds to a domain with transmembrane elements and the presumptive site for interaction with the AIP. TMAP membrane protein topology prediction for the 24 alleles suggests an 8- to 10-amino-acid intracellular N terminus, followed by six transmembrane segments of 21, 21, 21, 24, 29, and 29 residues, respectively, with three extracellular loops of 11 to 12, 16, and 5 residues and two intracellular loops of 12 and 7 amino acids. This prediction agrees with PhoA fusion analysis of S. aureus AgrC, except that the N terminus was found to be extracellular (11). The exact topology of the first intramembrane helix will require further investigation. The fifth transmembrane helix of AgrC from S. xylosus, S. arlettae, and S. epidermidis 1 includes a perfect four-element leucine zipper motif, and other strains conserve amino acids with large hydrophobic side chains in the same positions, suggesting either homodimerization, as seen in many histidine kinase receptors, or heterodimerization with an unidentified partner (4).
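The loop assignment in the predicted topology follows from a simple rule: each transmembrane segment carries the chain to the opposite side of the membrane. The short Python sketch below applies that rule, using the segment count and N-terminal location stated above; the function name and the "in"/"out" labels are illustrative conventions.

```python
def loop_sides(n_terminus: str, n_segments: int) -> list[str]:
    """Sides ('in' or 'out') of the loops linking consecutive transmembrane segments."""
    if n_terminus not in ("in", "out"):
        raise ValueError("n_terminus must be 'in' or 'out'")
    if n_segments < 1:
        raise ValueError("at least one transmembrane segment is required")
    flip = {"in": "out", "out": "in"}
    side = n_terminus
    sides = []
    for _ in range(n_segments - 1):
        side = flip[side]  # crossing one segment moves the chain to the other side
        sides.append(side)
    return sides


# Predicted topology: intracellular N terminus, six transmembrane segments.
predicted = loop_sides("in", 6)
print(predicted.count("out"), "extracellular loops;", predicted.count("in"), "intracellular loops")
```

With an intracellular N terminus and six segments the rule gives three extracellular and two intracellular loops, as in the prediction. Starting from an extracellular N terminus with the same six segments would invert every loop, which illustrates why the placement of the first helix matters for reconciling the prediction with the PhoA fusion data.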
Ji et al. (9) have suggested that the divergent agr loci have evolved by accumulation of random mutations through an unknown mechanism for the generation of hypervariability at this locus, followed by rigorous selection for functional combinations between the different elements. A limited matrix analysis between variant alleles such as agr-1Sa and agr-2Sa showed so little similarity that horizontal transfer from separate donors following divergent evolution in separate populations or species appeared to be a likely contributive mechanism. To investigate this question, we constructed phylogenetic trees from the DNA sequences of our different agr alleles and compared them with trees established from 16S rRNA loci from the same species and subspecies (Fig. 2). The branch lengths in the tree of agr sequences are unreliable because of the very high level of substitution, suggesting that there may also have been back mutations that would preclude a confident assessment of synonymous-to-nonsynonymous substitution rates. Nevertheless, the overall topology of trees established by using 16S rRNA, agrB, or agrC is remarkably similar. Less similarity was observed with agrD, but bootstrap analysis revealed that most of the interspecies relations based on agrD sequences had low confidence values (bootstrap values ranging from 12 to 65%). Variant agrB and agrC sequences from a single staphylococcal species generally occupy adjacent branches of the tree, except for agr-1Se, which reliably groups with sequences from the closely related S. capitis, rather than with the other two S. epidermidis alleles. In addition, agr-2Sa cannot be reliably grouped with other agrSa alleles by using agrB and agrD. In general, the relationships between different species and subspecies, as illustrated by 16S rRNA homologies (28), are maintained in the agr trees. S. xylosus, S. gallinarum, S. arlettae, and the two subspecies of S. cohnii group together in 16S rRNA and agrB and agrC trees, as do S. aureus, S. epidermidis, and S. capitis, although S. caprae, which was determined to be closely associated with S. capitis by using 16S rRNA, forms a separate branch by agrD and agrC sequences. In trees constructed with 16S rRNA, agrB, or agrC sequences, S. carnosus and S. simulans diverge from a single branch. This distribution provides no evidence for significant transfer of agr genes between populations of staphylococci and suggests that the wide differences observed are the result of a high rate of accumulation of mutations in this locus. However, Hayakawa et al. (29) have described two isolates of S. aureus from milk from cows with mastitis wherein the ORFs agrB and agrD were closely similar to those from agr-2Sa for one strain and from agr-3Sa for the other one, whereas the agrC ORF from the two strains was clearly an agr-1Sa type. These variants would be expected to show possible agr inhibition but not activation, and their pathogenic potential is surprising because virulence is usually associated with a functional agr system (10, 25, 34).
It is difficult to envision the precise mechanisms by which a new agr variant, in a situation where the AIP, its processing agent, or its specific receptor is rendered incompatible, can overcome the disadvantage of interrupted communication long enough to accumulate compensatory mutations in the other elements. The loss of phylogenetic relationship between agrD and 16S rRNA suggests that agrD is the agr ORF least linked to the genetic background. The driving force in agr evolution could be the acquisition of a mutation in the sequence coding for the AIP. This variant AIP would presumably act as a general inhibitor on other staphylococci, including the parent species. This activity may provide sufficient protection for the development of a suitable AgrC sequence to recover positive activation by the new self AIP, but it would still be necessary to coevolve a satisfactory AgrB to process and secrete the AIP. Suitable conditions would probably imply a rich nutrient supply to support rapid replication so as to compensate for the loss of nonadapted variants and probably a competition sufficient to engender pressure for reacquisition of a positive stimulatory communication between members of the newly formed agr group. It is not known, of course, in which particular milieu this evolution may have taken place. Given that many of the staphylococci are evidently not pathogenic for higher eukaryotes, the agr divergence would presumably represent environmental adaptations, of which pathogenesis is simply a subset.
The numerous sequence variants of the agr locus reported here confirm that this locus is highly variable even between subpopulations of a single species. Once fixed, however, the variants appear to be quite stable (many separate isolates in our study had identical sequences) and are a good marker for species and subspecies identification. We find no evidence for horizontal transfer, which could potentially confound species identifications, and conclude that the agr alleles have evolved contemporaneously with their parent species. The causes for agr divergence within a species and whether these events precede a more general divergence, even speciation, remain unclear. It will be useful to design fresh PCR primers to recover agr sequences from species and strains which failed to produce satisfactory amplicons, and the sequences discussed here should aid in this process. It is particularly desirable to determine the generality of the serine-for-cysteine substitution in S. intermedius and to determine the biochemical and physiological characteristics of this unusual variant.
We thank G. Lyon, M. Gouy, and C. Mougel for helpful discussions and Y. Benito, H. Meugnier, N. Violland, C. Courtier, and C. Gardon for technical assistance.