In Streptococcus mutans, both competence and bacteriocin production are controlled by ComC and the ComED two-component signal transduction system. Recent studies of S. mutans suggested that purified ComE binds to two 11-bp direct repeats in the nlmC-comC promoter region, where ComE activates nlmC and represses comC. In this work, quantitative binding studies and DNase I footprinting analysis were performed to calculate the equilibrium dissociation constant and further characterize the binding site of ComE. We found that ComE protects sequences inclusive of both direct repeats, has an equilibrium dissociation constant in the nanomolar range, and binds to these two direct repeats cooperatively. Furthermore, similar direct repeats were found upstream of cslAB, comED, comX, ftf, vicRKX, gtfD, gtfB, gtfC, and gbpB. Quantitative binding studies were performed on each of these sequences and showed that only cslAB has a similar specificity and high affinity for ComE as that seen with the upstream region of comC. A mutational analysis of the binding sequences showed that ComE does not require both repeats to bind DNA with high affinity, suggesting that single site sequences in the genome may be targets for ComE-mediated regulation. Based on the mutational analysis and DNase I footprinting analysis, we propose a consensus ComE binding site, TCBTAAAYSGT.
Bacteria can respond to various environmental stimuli by regulating metabolic pathways through two-component signal transduction systems (TCSTS) (7, 29, 31, 39). TCSTS have been shown to control many diverse processes in bacteria, such as sporulation in Bacillus subtilis; chemotaxis, nitrogen assimilation, and outer membrane protein expression in Escherichia coli; virulence in Agrobacterium tumefaciens, Bordetella pertussis, and Salmonella enterica serovar Typhimurium; and biofilm formation, acid tolerance, competence development, and bacteriocin production in streptococci, such as Streptococcus mutans, Streptococcus pneumoniae, and Streptococcus pyogenes (7, 9, 10, 15, 18, 23, 30, 39). These adaptive systems in their simplest form consist of two components: a membrane-bound signal-transducing protein, referred to as a histidine kinase, and a cytoplasmic transcription factor, referred to as a response regulator. Upon detection of the external signal, the γ-phosphoryl group of ATP is transferred to a conserved histidine residue on the histidine kinase and subsequently to a conserved aspartic acid residue in the regulatory domain of the response regulator, which thereby undergoes a conformational change and activates a response (38).
In silico analysis has shown 14 TCSTS in S. mutans UA159 (4). Of these, the ComED system has been studied extensively and shown to be involved in biofilm formation, competence development, and bacteriocin production (10, 18, 22, 23, 41). The first study done on the ComED system was reported by Li et al., who identified a genetic locus comprised of three genes, comC, comD, and comE, that encode a precursor to the competence-stimulating peptide (CSP), a histidine kinase, and a response regulator, respectively (22). S. mutans comCDE homologs were shown to function in competence by the fact that inactivation of each individual gene resulted in decreased transformation efficiency (22). Furthermore, the addition of chemically synthesized CSP was able to restore transformability to comC-deficient cells (22). Based on homology to the S. pneumoniae comCDE locus (1, 8, 14, 15, 20, 24, 32, 34), a model for competence in S. mutans was proposed (23). When a critical cell density is reached, cslAB (comAB), comC, and comED are induced, activating competence (13, 23, 33). cslAB likely encodes a CSP-specific secretion apparatus consisting of an ATP binding cassette (ABC) transporter (CslA) and its accessory protein (CslB), which presumably is involved in the processing and export of CSP (33). By analogy with other TCSTS, it is assumed that extracellular CSP binds ComD, induces autophosphorylation, and subsequently transfers this phosphate group to its cognate response regulator, ComE. Phosphorylated ComE would then directly activate expression of early com genes, which include cslAB, comC, comED, and comX. In S. pneumoniae, ComE induces expression of comX (also known as sigX), an alternate sigma factor that initiates transcription of late competence genes required for DNA uptake and recombination (20). A homolog of ComX in S. mutans is postulated to direct transcription of several competence-related genes as a result of a CSP-induced signal cascade (2).
Ween et al. showed that S. pneumoniae ComE is a DNA binding protein that self-regulates its own expression by binding to a region in its own promoter as well as the promoter region of cslAB (42). In a study by van der Ploeg, it was demonstrated that ComED is not only responsible for competence development but is also involved in the regulation of bacteriocin production in S. mutans. When the promoter regions of these bacteriocin genes were aligned, direct repeat elements were found and proposed to be ComE binding sites (41). Recently, we reported identical sequences in the intergenic region of nlmC-comC located upstream of the coding sequence of comC. Furthermore, we demonstrated that S. mutans ComE is also a DNA binding protein that recognizes these sequences, containing two 11-bp imperfect direct repeats, DRI and DRII, which are critical for specific binding (18). In this study, we provide quantitative DNA binding and footprinting analyses results to show that ComE indeed binds to DNA containing these two imperfect direct repeats. We have further identified and biochemically characterized 13 genes with this motif in the vicinity of their promoter regions, including the bacteriocin genes identified by van der Ploeg. Furthermore, we demonstrated that ComE may have two different modes of binding among these different high-affinity sites and that ComE only needs a single 11-bp site for high-affinity binding.
E. coli strains used in the study are listed in Table 1 and were maintained in Luria broth (LB) medium at 37°C with 100 μg/ml ampicillin, 200 μg/ml spectinomycin, or 1 mM isopropyl β-d-thiogalactopyranoside (IPTG) as needed.
The comE coding region was PCR amplified with primers ComE-F and ComE-R (Table 2) from S. mutans strain UA159 and ligated in frame into the expression vector pQE30 (Qiagen) to produce a coding sequence with a C-terminal His tag. The vector was transformed into E. coli strain DH5α and selected for ampicillin resistance, and the insertion of comE was confirmed by sequencing. Purification of ComE was performed as previously described (18).
For substrates with DRII removed, SG515 containing pFW5::(nlmCpΔDRII-luc) was used as previously described (18). For the substrates with both DRI and DRII deleted, the comCΔΔ fragment from pFW5::(nlmCpΔDRI+II-luc) (18) was subcloned into pCR 2.1 TOPO to generate SG516 (Table 1). Upon confirmation by sequencing, a sequence variation was observed outside the deletion region and is reflected below in Fig. 2.
To label the DNA substrates, 1 μM oSG316 (Table 2) was labeled at the 5′ end with 0.85 μM [γ-32P]ATP (10 mCi/ml; New Life Science Products, Boston, MA) by using T4 polynucleotide kinase (Promega, Madison, WI). Labeled oSG316 was used along with the unlabeled primer, oSG317 (Table 2), in a PCR to amplify comC (204 bp), comCΔ (deleted for DRII; 192 bp), and comCΔΔ (deleted for both DRII and DRI; 161 bp) products by using purified UA159 chromosomal DNA and plasmids isolated from SG515 and SG516 (Table 1) as the DNA template, respectively. All PCR products were purified using a QIAquick PCR purification kit (Qiagen, Valencia, CA) and quantified based on the optical density at 260 nm (OD260) using a Beckman spectrophotometer DU650 (Fullerton, CA). In addition, oligonucleotide primers (oSG454 and oSG455) and templates (oSG456, -457, -458, -459, -461, -462, -463, -518, -519, -520, and -521) (Tables 3 and 4) were synthesized by Invitrogen (Carlsbad, CA).
The upstream regions of comC and cslAB and the binding site identified within the coding sequence of gtfC were used for footprinting analysis with purified ComE. Labeled oSG316 and unlabeled oSG317 were used to generate a 204-bp comC promoter-containing fragment. Labeled oSG368 and unlabeled oSG369 (Table 2) were used to generate a 173-bp cslAB promoter-containing fragment. Meanwhile, labeled oSG420 and unlabeled oSG421 were used to generate a 188-bp internal gtfC* fragment. ComE was incubated with 1 nM labeled substrate in footprinting buffer (25 mM Tris-HCl [pH 7.5], 50 mM KCl, 6.25 mM MgCl2, 10% glycerol, and 1 mM dithiothreitol) in a final volume of 50 μl at ambient temperature for 30 min. After incubation, CaCl2 and MgCl2 were added to 2.5 mM and 5 mM, respectively, and the mixtures were vortexed and incubated for 1 min before the addition of 0.25 U of DNase I (Promega, San Luis Obispo, CA). Reactions were allowed to proceed for an additional 1.5 min before quenching by the addition of DNase I stop buffer (200 mM NaCl, 30 mM EDTA, 1% sodium dodecyl sulfate [SDS], 100 μg/ml tRNA [final concentrations]). The DNA was then extracted with an equal volume of phenol-chloroform (1:1), ethanol precipitated, and resuspended in 3 μl of sequence stop buffer (10 mM NaOH, 95% formamide, 0.05% bromophenol blue, 0.05% xylene cyanol). The pellet was solubilized on ice for at least 30 min before heating at 95°C for 2 min and then stored on ice for at least 2 min prior to loading. Reactions were separated on a 6% sequencing gel run at 40 V/cm for approximately 3 h. The gels were dried and scanned with a Pharos FX Bio-Rad imaging system (Hercules, CA). Sequence ladders for sense and antisense comC, cslAB, and gtfC* were generated using the fmol DNA cycle sequencing system (Promega, San Luis Obispo, CA) according to the manufacturer's instructions.
ComE was incubated in 20 μl reaction buffer (52.5 mM HEPES [pH 6.5], 50 μM EDTA, 9.5% glycerol, and 50 μg/ml bovine serum albumin) containing 100 ng salmon sperm DNA and 1 nM isotopically labeled DNA substrate for 30 min at ambient temperature. Following incubation, EMSA reaction mixtures were analyzed on a 6% nondenaturing polyacrylamide gel, run at 10 V/cm for 3 h, and subsequently dried and visualized with the Bio-Rad PharosFX molecular imaging system. Competition experiments were performed as described previously (12). Briefly, for each competition experiment, ComE was fixed at 1 nM with 1 nM labeled comCΔ DNA fragment. For competitions with bsmC, nlmAB, immB, and bsmB, the chromosomal sequences were too similar to one another to amplify the substrates by PCR from the chromosome of S. mutans. Therefore, to ensure that the correct sequence was being used for competition experiments, complementary 96-bp oligonucleotides were synthesized and used as templates for PCR (Table 4). Unlabeled competitor DNA from the promoter regions of cslAB, vicRKX, gtfB, gtfC, gtfD, comED, comX, ftf, gbpB, bsmC, nlmAB, immB, and bsmB (Table 2) was added, and the reduction of ComE bound to comCΔ DNA fragments was measured. ImageQuant 5.0 (Molecular Dynamics, Sunnyvale, CA) was used to quantify the shifted complexes and free DNA substrate in each reaction mixture. Each reaction was then plotted using Microsoft Excel, and the 50% inhibitory concentration (IC50) was calculated. The IC50 represents the amount of competitor DNA that is required to reduce the ComE-comCΔ complex by 50%, which is a good indicator of the equilibrium dissociation constant.
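The IC50 calculation can be illustrated with a minimal sketch. The text states only that the reactions were plotted and the IC50 calculated; the log-linear interpolation used here, the function name ic50_by_interpolation, and the titration values are illustrative assumptions and are not the authors' procedure or data.

```python
import math

def ic50_by_interpolation(competitor_nM, fraction_bound):
    """Estimate an IC50 by log-linear interpolation (assumed method).

    competitor_nM: increasing competitor concentrations in nM, all > 0.
    fraction_bound: shifted-complex signal at each concentration,
        normalized to the lane without competitor (1.0 = no inhibition).
    """
    points = list(zip(competitor_nM, fraction_bound))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        # Locate the interval in which the signal crosses 50%.
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            t = (f_lo - 0.5) / (f_lo - f_hi)
            log_ic50 = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("signal never crosses 50% in the tested range")

# Hypothetical titration, not data from this study.
conc = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
bound = [0.95, 0.85, 0.65, 0.40, 0.20, 0.10]
ic50 = ic50_by_interpolation(conc, bound)
print(f"IC50 = {ic50:.2f} nM")
```

Interpolating on a logarithmic concentration axis is a common choice for two-fold dilution series; fitting a full competition curve would be an alternative when more points are available.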
Two substrates, comCΔ and gtfC*, were chosen to test for the bending of a DNA substrate by ComE. Each DNA target site had two sets of primer pairs (Table 2) designed to amplify the sequence from S. mutans UA159 chromosomal DNA to place the putative ComE binding site either in the middle or at the 3′ end of the substrate. These substrates were then incubated with ComE (10 nM), and the reactions were analyzed under the same conditions as described above for the EMSA.
We have previously shown by qualitative EMSA that ComE binds with high affinity to a specific DNA sequence that contains two direct repeats upstream of comC (18). Here we further characterized this interaction by DNase I footprinting. As shown in Fig. 1A, ComE not only binds to the two direct repeats (DRI and DRII) on the sense strand but also further protects regions 5′ to each direct repeat. However, it was observed that ComE protects more nucleotides associated with DRII than with DRI. Moreover, one hypersensitive cleavage was observed at the 5′ edge of each repeat. Such enhanced DNase I cleavage is indicative of greater access of DNase I to the phosphodiester backbone and is often the result of local DNA bending.
To test local DNA bending, we performed a DNA bend permutation experiment. We were able to show that a substrate containing only DRI (comCΔ) in a nucleoprotein complex with ComE migrated more slowly when the putative ComE binding site was placed in the middle of a DNA fragment, relative to a similar-length substrate where the binding site was placed at the end (see Fig. S1 in the supplemental material). This result is indicative of the induction of a mostly planar DNA bend. However, for another single-repeat ComE binding site (a putative site within gtfC, discussed below), the nucleoprotein complex migrated similarly whether the putative binding site was placed in the middle or at the end, indicating no net DNA bending for either substrate. The dichotomy of this effect on these different substrates is discussed below.
Footprints of the antisense strand upstream of the comC coding sequence (Fig. 1B) showed that ComE protection also extends to the 3′ end from DRII, but not all 11 bp of each direct repeat were protected. A diagram of the protections and enhancements on each strand is shown in Fig. 1C. Again, more of DRII was protected than DRI, with some protection in the spacer region. The differences in affinity of ComE to DRII versus DRI are discussed below. In addition, hypersensitive sites were found at the ends of both direct repeats. Hence, both sense and antisense strands are similarly protected.
To determine the contribution of each direct repeat to ComE binding, we compared the footprints of two types of deletion mutants of the upstream comC substrate lacking either DRII or both DRII and DRI, comCΔ and comCΔΔ, respectively (Fig. 2A). The single deletion of DRII showed the same protected regions of DRI (Fig. 2B) as in the intact comC substrate seen in Fig. 1A. No footprint was observed in the mutant where both direct repeats were deleted (Fig. 2C), suggesting that these sequences are necessary and sufficient for specific ComE binding.
Previously we had shown that ComE binds the comC promoter region with high affinity, and we revealed two shifted bands in an EMSA when both direct repeats were present (18). To determine if this binding is cooperative, EMSA analyses were carried out with a fixed amount of comC substrate (0.1 nM) and increasing concentrations of ComE (0 to 64 nM) (Fig. 3). A Hill plot was generated based on the analysis of three independent EMSAs, and the subsequently derived slope of each plot was taken as the Hill coefficient (1.7 ± 0.2 [mean ± standard error of the mean]; see Fig. S2 in the supplemental material), demonstrating positive cooperativity (Hill coefficient of >1) (35).
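How a Hill coefficient is read from the slope of a Hill plot can be sketched as follows. This is a minimal illustration that assumes the standard linearization, log10(θ/(1 − θ)) against log10[ComE], where θ is the fraction of substrate shifted. The data are synthetic, generated from a Hill model with an assumed coefficient of 1.7 (chosen to mirror the reported value) and a hypothetical half-saturation concentration of 8 nM; they are not measurements from this study.

```python
import math

def hill_coefficient(protein_nM, fraction_bound):
    """Return the least-squares slope of a Hill plot.

    Points with zero protein or with a bound fraction of exactly
    0 or 1 are skipped because their logarithms are undefined.
    """
    xs, ys = [], []
    for conc, theta in zip(protein_nM, fraction_bound):
        if conc > 0 and 0 < theta < 1:
            xs.append(math.log10(conc))
            ys.append(math.log10(theta / (1 - theta)))
    if len(xs) < 2:
        raise ValueError("need at least two usable points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic titration over the 0 to 64 nM range used in the text.
ASSUMED_N = 1.7      # assumed Hill coefficient for the synthetic data
ASSUMED_K_NM = 8.0   # hypothetical half-saturation concentration
come_nM = [0, 1, 2, 4, 8, 16, 32, 64]
theta = [c ** ASSUMED_N / (ASSUMED_K_NM ** ASSUMED_N + c ** ASSUMED_N) for c in come_nM]

slope = hill_coefficient(come_nM, theta)
print(f"Hill coefficient = {slope:.2f}")
```

A slope greater than 1 recovered this way corresponds to positive cooperativity; with real gel data the points near 0% and 100% bound are the least reliable and are often excluded from the fit.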
To estimate binding affinity, we performed quantitative EMSAs to determine the equilibrium dissociation constant, Kd. Briefly, a binding isotherm was created by keeping the concentration of purified ComE constant (15 nM) and varying the concentration of isotopically labeled comCΔ substrate (0 to 80 nM). The concentration of DNA, which produced the half-maximal amount of shifted complex (ComE bound to comCΔ), was used as an estimate of Kd. Although this method does not identify the active protomer (monomer, dimer, etc.), it does indicate the active quantity of protein, which is identical to the maximal amount of molar equivalents of shifted DNA (Bmax). comCΔ was used to examine the binding affinity, since it gave a single shifted complex over a wide range of protein concentrations (18). We calculated the Kd as (3.4 ± 0.5) × 10−9 M, indicating that ComE binds strongly to comCΔ (Fig. 4). In addition, a Bmax value of 5.9 × 10−9 M was observed. This Bmax is roughly 40% of the estimated quantity of added ComE, assuming that ComE is binding to the DNA as a monomer, i.e., this suggests that only 40% of our protein preparation is active. Alternatively, the functional protomer of ComE could be a dimer, implying that 80% of the protein is active. The actual oligomeric state of ComE in solution and bound to both single and tandem direct repeats will need to be determined.
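The arithmetic behind the 40% and 80% estimates can be made explicit with a short sketch. The Bmax (5.9 nM) and ComE concentration (15 nM) are taken from the text; the function name active_fraction and the assumption that each shifted complex contains a fixed number of protomers are illustrative.

```python
def active_fraction(bmax_nM, total_protein_nM, protomers_per_complex):
    """Fraction of added protein needed to account for Bmax.

    Assumes every shifted complex contains exactly
    `protomers_per_complex` ComE polypeptides.
    """
    return bmax_nM * protomers_per_complex / total_protein_nM

BMAX_NM = 5.9    # from the text
COME_NM = 15.0   # from the text

monomer = active_fraction(BMAX_NM, COME_NM, 1)
dimer = active_fraction(BMAX_NM, COME_NM, 2)
print(f"monomer model: {monomer:.0%} active")
print(f"dimer model:   {dimer:.0%} active")
```

The monomer model gives 5.9/15, roughly 40%, and the dimer model doubles that to roughly 80%, which matches the two interpretations offered in the text.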
After demonstrating that ComE protected the two direct repeats in the upstream promoter region of comC, we set out to determine if similar direct repeats exist in the promoter regions of cslAB, comED, and comX, as they do in S. pneumoniae. The 500-bp upstream regions from each ATG start site were used with the Multiple Em for Motif Elicitation (MEME) online program (http://meme.sdsc.edu/meme), and a putative consensus sequence, TCNTAAANGGT-10-TCNTAAANGGT, was identified. Furthermore, Senadheera et al. also showed that a two-component system, VicRK, affects the expression of gtfBCD, gbpB, and ftf (36) and suggested that VicRK may interact with ComED to regulate the expression of these genes. Using Mac Vector 7.2 software (Accelrys, Cary, NC), the aforementioned consensus was used to find potential matches in the promoter region of the following genes: cslAB (GenBank accession numbers AAN59510.1 and AAN59511.1), comED (AAN59528.1 and AAN59527.1), comX (AAN59601.1), gtfB (AAN58705.1), gtfC (AAN58706.1), gtfD (AAN58619.1), gbpB (AAN57811.1), ftf (AAN59631.1), and vicRKX (AAN59168.1, AAN59167.1, and AAN59166.1). To provide biochemical evidence that these genes contain bona fide ComE targets, we performed quantitative EMSA on all nine targets. In addition, we also tested four genes encoding putative bacteriocins (bsmA [AAN59525.1], bsmB [AAN59518.1], bsmC [AAN58177.1], and nlmAB [AAN57926.1 and AAN57927.1]) and one gene encoding bacteriocin immunity protein (immB [AAN58631.1]) that have been found to be regulated by ComED and contain putative ComE binding sites (41). The primer sets (Table 2) were used to amplify the putative ComE binding sites of the promoter region for each of these genes. Each selected region of DNA was used as competitor DNA with isotopically labeled comCΔ in an EMSA, and the IC50 for each competitor was calculated. 
The resulting IC50 for wild-type comC (2.9 × 10−9 M) was lower than that of the homologous competition with comCΔ (3.7 × 10−9 M), indicating that DNA containing both direct repeats has a higher affinity for ComE than DNA with a single site. As expected, when the two direct repeats were deleted, a higher IC50, (52 ± 40) × 10−9 M, was observed, suggesting that critical binding sequence elements were removed. As shown in Fig. 5, each sequence tested was aligned with comC to show the nucleotide differences, and the sequences were placed in an order from high affinity to low affinity, based on the IC50s. Out of the nine genes that were identified in this study, only cslAB (2.5 × 10−9 M) had a high affinity of binding to ComE similar to that of the wild-type comC. As expected, genes found by van der Ploeg (bsmB, bsmC, nlmAB, and immB) (41) had similar IC50s to wild-type comC, indicating high binding affinities to those sites.
To examine the cslAB binding site for ComE further, we used DNase I footprinting analysis. Figure 6 shows that ComE protects the upstream region of cslAB, and as expected the protected regions cover the consensus binding region. Furthermore, two hypersensitive sites were observed and showed similar spacing for both comC and cslAB, which correlates to almost two precise turns of the DNA helix. We have also examined the upstream regions of comED, comX, and vicRKX by footprinting analysis; however, none of these sequences was protected by ComE (data not shown).
As shown in Fig. 1, there are two direct repeats in the upstream region of comC that were protected in a DNase I footprinting analysis. In addition to these two protected regions, the 5′ end of the second direct repeat and the spacer region between the first and second direct repeats were also protected and contained hypersensitive sites. In order to understand how these two regions affect ComE binding affinity, we designed four templates each with 96 bases, as seen in Table 3, with random sequences in either of these two regions, and performed competition EMSAs with comCΔ. The first degenerate sequence (DGSI; see Table 3 for more details) has 9 random base pairs at the 5′ end immediately upstream of the second direct repeat, while the rest of the sequence is unaltered, whereas DGSII has 9 As or Ts in the same region. DGSIII has 10 Ns that replace the spacer region between the first and second direct repeats, whereas DGSIV has 10 As or Ts in place of the spacer. As shown in Table 5, these four templates have IC50s of 2.9 × 10−9 M, 3.9 × 10−9 M, 3.1 × 10−9 M, and 3.0 × 10−9 M, respectively, which were comparable to the 96-bp wild-type template (3.7 × 10−9 M). These results indicate that even though the 5′ upstream sequence immediately adjacent to the second direct repeat and the spacer were protected by ComE against DNase I cleavage, they do not play a measurable role in ComE binding affinity.
We also synthesized templates where we scrambled the sequences of the first, second, or both direct repeats, DNA substrates SSI, SSII, and SSIII, respectively, making sure that there were no bases that matched the original wild-type sequences. A competition experiment with these scrambled direct repeats showed that ComE bound to DRII with higher affinity than to DRI, as shown in Table 5, which was consistent with the footprinting analysis results indicating that DRII is more extensively protected by ComE (Fig. 1). When the two binding sites were scrambled, the IC50 (120 × 10−9 M) was the highest observed, indicating that the repeats represent the strongest determinants of DNA specificity.
Since ComE was able to bind to a single direct repeat and had a higher binding affinity for DRII of the comC promoter sequence (Fig. 2 and Table 5), we used DRII alone as the query sequence to identify other possible ComE binding sites in the S. mutans genome. From this search, we found two putative ComE binding sites, gtfC* and gtfC**, within the coding region of gtfC (at +3293 and +3575; the previously identified putative ComE binding site was upstream [−120] of the gtfC start codon and has both direct repeats). gtfC is regulated by another TCSTS (VicRK) in S. mutans (36). To further analyze gtfC*, DNase I footprinting and binding affinity assays were performed. As seen in Fig. 7, we observed that the putative binding site was protected by ComE and that four hypersensitive sites were created. In addition, an IC50 of around 15 × 10−9 M was observed (Fig. 5). Thus, our data indicate five sequences yielding a footprint with ComE: upstream regions of comC and cslAB along with the site within the coding sequence of gtfC (gtfC*). Assuming that ComE can bind to a single direct repeat independently and still have biological consequences, alignment of these five sequences generated a consensus single binding site, TCBTAAAYSGT (see Fig. S3A in the supplemental material). Using available online software, Regulatory Sequence Analysis Tools (http://rsat.ulb.ac.be/rsat/), we were able to find this consensus ComE binding site at 13 sites either in the upstream region or within the coding region of various genes (see Fig. S3B).
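A pattern search of this kind can be sketched in a few lines. The consensus TCBTAAAYSGT uses IUPAC degenerate codes (B = C, G, or T; Y = C or T; S = C or G), which translate directly into a regular expression. The sketch below is a simplified stand-in for the Regulatory Sequence Analysis Tools search, not a reproduction of it: it reports exact consensus matches on both strands, and the function names and demonstration sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(sequence, consensus="TCBTAAAYSGT"):
    """Return sorted (start, strand, matched_site) tuples.

    `start` is the 0-based position of the leftmost base of the site
    on the forward strand, whichever strand the match is on.
    """
    pattern = "".join(IUPAC[code] for code in consensus)
    # A lookahead lets overlapping matches be reported as well.
    regex = re.compile(f"(?=({pattern}))")
    seq = sequence.upper()
    hits = []
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for match in regex.finditer(text):
            if strand == "+":
                start = match.start()
            else:
                start = len(seq) - match.start() - len(consensus)
            hits.append((start, strand, match.group(1)))
    return sorted(hits)

# Hypothetical sequence with one site on each strand (not from S. mutans).
demo = "GGATCC" + "TCGTAAACGGT" + "AAAA" + reverse_complement("TCTTAAATCGT") + "CC"
hits = find_sites(demo)
for start, strand, site in hits:
    print(start, strand, site)
```

Because only exact matches are reported, this approach would miss lower-affinity sites that deviate from the consensus at one or more positions.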
The main objective of this study was to characterize ComE interactions with specific DNA targets. Both EMSA and DNase I footprint experiments were used to determine binding affinities and specific interactions with putative ComE binding sites of S. mutans.
A previous study by van der Ploeg demonstrated that ComED in S. mutans regulated the expression of bacteriocins in addition to genetic competence (41). When the upstream promoter sequences of four genes encoding putative bacteriocins and one gene encoding a bacteriocin immunity protein were aligned, two 9-bp direct repeats were identified as putative ComE binding sites for S. mutans (41). Further analysis using these two 9-bp direct repeats identified a putative binding sequence in the intergenic region IGS1499, which is flanked by nlmC (also known as bsmA; mutacin V) and comC (CSP) (18). The putative ComE binding site found in this intergenic region matches the sequences found by van der Ploeg (41). In addition, we were able to expand the two 9-bp direct repeats to 11-bp direct repeats, because two additional bases at the beginning of each direct repeat were consistently identical for each site identified.
Previously we showed by qualitative EMSA that the two direct repeats, DRII (the one most distal from the ATG of comC) and DRI (the one most proximal to the ATG of comC), were critical for specific binding (18). Specifically, we observed that with increasing concentrations of ComE, one shifted and one supershifted complex were observed on the wild-type comC substrate. A similar titration with a DRII-deleted substrate (comCΔ) yielded only one shifted complex, while no shift was observed in the DRI and DRII double-deleted substrate (18). This result demonstrated that ComE does not require two repeats for high-affinity binding.
The response regulator OmpR from E. coli has three binding sites in the upstream region of the ompC promoter (25). The most upstream OmpR binding site was essential for ompC activation and was able to function independently of the other two binding sites (25). In addition, it was shown that when the most upstream site was absent, the efficiency of OmpR binding to the downstream sites was reduced significantly, which would suggest cooperative binding for OmpR (25). Herein, we demonstrate cooperative binding for ComE, although whether either or both direct repeats are essential for transcriptional activation has yet to be determined.
In addition, we demonstrated the ability of ComE to bend DNA as a result of binding to comCΔ (see Fig. S1 in the supplemental material). DNA bending plays a crucial role in many biological processes, including gene expression (6, 11, 28, 40). In our DNA bend permutation experiment, we observed changes in gel electrophoresis migration consistent with ComE-induced bending only on comCΔ and not on gtfC* (see Fig. S1). A possible interpretation of the latter result is that binding is distributive and depends on the affinity of each sequence (37). For a region with a higher-affinity binding sequence, such as comCΔ, the binding of ComE resulted in a planar bending with slower mobility in the permutation assay. The functional significance of bending by ComE has yet to be determined; however, it is possible that the role of DNA bending is to allow a better interaction between ComE and RNA polymerase that would not be otherwise possible with a linear promoter (43).
We started our search for additional ComE binding sites by using an in silico approach, and we generated a consensus ComE binding site from the promoter regions of comC, cslAB, comED, and comX. With this consensus as a query sequence, we were able to find similar ComE binding sites at the promoter regions of gtfB, gtfC, gtfD, gbpB, ftf, and vicRKX. From quantitative EMSA experiments of each sequence, we found that only the upstream sequence of cslAB had comparable affinity to comC, and none was found to bind more strongly than comC (Fig. 5). Furthermore, footprinting analysis showed not only that ComE protected the region upstream of cslAB at the predicted ComE binding site but also that the hypersensitive sites were conserved compared to wild-type comC (Fig. 6). Recently, Martin et al. suggested that the S. mutans ComED are orthologs of the S. pneumoniae BlpRH (a TCSTS that controls the bacteriocin-like peptides) and, therefore, that the S. pneumoniae competence cascade is not a suitable model for S. mutans competence (26). One argument used was that there was no detected ComE binding site at the upstream region of cslAB (26). Here we have shown that ComE does indeed bind the promoter region of cslAB, which would indicate that during competence, ComE plays a role in regulating the expression of the ABC transporter, as suggested in the S. pneumoniae competence model (22).
We also analyzed a few of the other putative ComE binding sites that did not have IC50s that were as low (comED, comX, and vicRKX) in DNase I footprinting, and no protection was observed, but hypersensitive sites were found in the upstream region of vicRKX (data not shown). One possible reason for why ComE did not bind as well to these sequences compared to comC and cslAB is that there is too much disparity in the binding sequence from our proposed consensus. As shown in Fig. 5, an alignment of the putative direct repeats of cslAB to those of comC identified only two bases that differed between the two sequences, whereas in all other sequences, there was a minimum of three differences in each direct repeat that possibly prevented ComE protection of these sites. Another possibility is that ComE is not fully active unless phosphorylated and therefore does not recognize possible sequences with multiple differences. For this study, we did not focus on the phosphorylation state of ComE and the role it plays, if any, on ComE binding to the various sequences; however, this is definitely an area to investigate in future studies.
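The notion of disparity from the consensus can be quantified by counting the positions at which a candidate 11-bp repeat falls outside the bases allowed by TCBTAAAYSGT. The sketch below is only an illustration of that count: the candidate sequences are hypothetical and are not the repeats shown in Fig. 5, and the function name mismatches_to_consensus is an assumption.

```python
# Bases permitted by each IUPAC code that appears in the consensus.
IUPAC_SETS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "Y": "CT", "S": "CG", "N": "ACGT",
}

def mismatches_to_consensus(site, consensus="TCBTAAAYSGT"):
    """Count positions where `site` is not allowed by the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have equal length")
    return sum(
        base not in IUPAC_SETS[code]
        for base, code in zip(site.upper(), consensus)
    )

# Hypothetical candidate repeats, for illustration only.
candidates = ["TCGTAAACGGT", "TCATAAACGGT", "ACATTAACGGA"]
counts = {site: mismatches_to_consensus(site) for site in candidates}
for site, n in counts.items():
    print(site, n)
```

A count of this kind treats every position as equally important; the mutational data in this study do not establish whether that holds for ComE.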
For ComX there may also be another possible explanation. Recently, Mashburn-Warren et al. discovered that competence in Streptococcus is controlled by two different quorum-sensing systems, ComCDE and ComRS (27). Uniquely, S. mutans has both of these systems, explaining why mutations in comE do not completely abolish competence in S. mutans (27). Mashburn-Warren et al. proposed a model where the ComRS system is the proximal regulator of comX and ComDE is an upstream regulator that may be connected to the ComRS system, although this connection remains unknown (27). The putative ComE binding site upstream of comX (IC50, 8.6 × 10−9 M) that was tested is located from −453 to −420 upstream of the comX start codon, whereas the sigX conserved promoter structure (P1) is located from −79 to −27 (27). Interestingly, looking upstream of comR and comS in S. mutans revealed no identifiable ComE binding sites.
The interaction between these two systems remains unknown. A recent report by Lemme et al. investigated the competence state of individual cells in a population of CSP-treated S. mutans (21). They found that within the CSP-treated culture there were two subpopulations, one that became competent and another that lysed (21). A model for this bifurcation step was proposed that shows that when there are low levels of ComE, the cells are not competent, and only when there are high levels of ComE does the individual cell become competent (21). It is also possible that the phosphorylation state of ComE plays a major role in the integration of these two systems. This will need to be investigated further to determine the exact mechanism for the integration of the ComR/S and ComDE systems.
In addition to the various genes directly affected by ComE binding, we are as yet unable to determine the minimum biological sequence requirements for an active ComE site. It is possible that the sequences that are protected by ComE are more biologically significant than the unprotected sequences. van der Ploeg showed decreases in β-galactosidase reporter activity when mutations in either direct repeat were introduced at the promoter region of nlmAB (41). These mutations in DRI led to an approximately 10-fold reduction in β-galactosidase activity, whereas mutation in DRII resulted in a 40-fold reduction. Removal of both repeats, and the region in between, abolished expression nearly completely (41). So, while it is clear that the direct repeats are biologically significant, it is unclear how much relative affinity is required to constitute true biological importance.
From the footprinting analysis with comC (Fig. 1) and gel shift analysis with scrambled sequences (Table 5), we showed that neither the 5′ extended region of DRII nor the spacer between the two repeats significantly influenced ComE binding. This demonstrates that the extended region protected by ComE does not influence ComE binding in a sequence-specific manner. Furthermore, it was clear that ComE bound to DRII with higher affinity than DRI (Table 5). Interestingly, a deletion of both direct repeats (comCΔΔ) has a higher binding affinity than a scramble of both direct repeats (SSIII) (Table 5). When both sequences were aligned with comC, as shown in Fig. 8, there was an equal number of matched sequences for SSIII and comCΔΔ with the native site of comC; however, for SSIII these identities were all within the spacer region, a region that we have demonstrated does not affect the affinity of ComE binding. As for comCΔΔ, there are seven matched sequences within the two direct repeats and three in the spacer region, which may allow ComE to recognize the binding site better than SSIII, resulting in the higher affinity observed.
Since ComE was able to bind to a single direct repeat and DRII has a higher binding affinity, we set out to search for other possible ComE binding sites by using this sequence. One site of particular interest lies within the coding region of gtfC. This gene encodes a glucosyltransferase that S. mutans uses to convert sucrose into both water-soluble and water-insoluble glucan for initial attachment to the tooth surface. It has been shown that these water-insoluble glucans contribute to the virulence properties of S. mutans in a rat model. In addition, S. mutans lacking gtfC is less cariogenic in vivo than the parental strain (16, 44).
DNase I footprinting analysis was performed with this putative ComE binding site within the coding region of gtfC* (Fig. 7). ComE was able to protect this sequence with the predicted hypersensitive sites. Two additional proteins have been shown to regulate expression of gtfC in S. mutans. A global regulator, CovR (also known as GcrR), binds to a region from +125 to −132 (+1 is the transcriptional start site) and negatively regulates the expression of gtfC (5). In addition, VicR, a response regulator of the TCSTS VicRK, was shown to bind DNA containing a consensus sequence at the −26 to −10 region and activate gtfC expression (36).
In a previous study, based on the different phenotypic effects of comC, comD, and comE mutations on biofilm formation, Li et al. proposed a model in which the signal peptide (CSP) encoded by comC can simultaneously interact with multiple cognate receptors, one encoded by comD and at least one other encoded by an unknown gene (23). Since both ComED and VicRK have been shown to be involved in genetic competence development and biofilm formation in S. mutans (22, 23, 36), it is possible that interactions between these two systems regulate competence development and biofilm formation. Currently, we are trying to determine if there are conditions that allow phosphotransfer between ComD and VicR or between VicK and ComE. In general, cross talk must be kept to a minimum to ensure that an organism can detect a specific stimulus and evoke a specific response (19). However, cross talk between TCSTS is not restricted to phosphotransfer. Cross talk can also occur at the level of transcription, as described recently for the EnvZ-OmpR and CpxA-CpxR systems in E. coli (3, 17). Although there is minimal cross talk at the level of phosphotransfer in these systems, certain genes, such as ompR and csgD, are directly regulated by both OmpR and CpxR (3, 17). It is possible that ComED and VicRK cross-regulate the expression of gtfC at the transcriptional level: VicRK activates gtfC expression, whereas ComE binds to the coding region of gtfC and blocks RNA polymerase from completing transcription, thus aborting the expression of gtfC. However, further genetic experiments need to be performed to test this hypothesis.
In summary, we have biochemically defined the ComE binding site by using EMSA and DNase I footprinting. Based on the footprinting analysis of comC and cslAB, we suggest that the ComE binding consensus sequence, TCBTAAAYSGT, is sufficient for high-affinity binding. Although this sequence is necessary, it is still not clear whether a single match to the consensus is sufficient for biological activity in any endogenous S. mutans system. Further investigation is required to place these findings in the context of competence regulation in S. mutans and the kinetics of ComE phosphorylation and dephosphorylation.
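As an illustration of how the proposed consensus could be used to screen a sequence for candidate sites, the following is a minimal sketch in Python. It expands the standard IUPAC degenerate codes in TCBTAAAYSGT (B = C, G, or T; Y = C or T; S = C or G) into a regular expression and scans both strands for exact matches. The function names and the demonstration sequence are hypothetical and are not taken from this study; the sketch reports only exact consensus matches and says nothing about binding affinity or biological activity.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Translate a degenerate consensus into a regex string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq, consensus="TCBTAAAYSGT"):
    """Return sorted (start, strand, match) tuples for exact consensus matches.

    start is 0-based on the forward strand; match is the matched text
    read 5' to 3' on the strand where it was found.
    """
    seq = seq.upper()
    n = len(consensus)
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand offset to a forward-strand coordinate.
        hits.append((len(seq) - m.start() - n, "-", m.group(1)))
    return sorted(hits)


# Hypothetical demonstration sequence (not from S. mutans).
demo = "AAGG" + "TCGTAAACGGT" + "CCAA"
print(find_sites(demo))
```

Because a scan of this kind returns only sequence matches, any site it reports would still need to be tested experimentally, for example by EMSA or DNase I footprinting as described above.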
This work was supported by NIH grants 5R01DE013230 (to D.G.C. and S.D.G.), RO1-DE014757 (to F.Q.), 4R00DE018400 (to J.K.), and 1R01DE020102-01 (to W.S.).
†Supplemental material for this article may be found at http://jb.asm.org/.
Published ahead of print on 20 May 2011.