CcpA is a global regulator of carbon catabolism [3] that controls gene expression by binding to cognate operator sequences, the cre sites, which are characterized by a weakly conserved consensus sequence [32-34]. Hence, it seems possible that CcpA binds some
cre sites with higher affinity than others. So far, global studies of CcpA-dependent carbon catabolite repression have focused on identifying the members of the CcpA regulon [40, 42, 44], while analyses of cre boxes with respect to their sequence, position and CcpA-binding affinity have been limited to single examples [7, 17, 33, 34, 45]. A broader comparison of the sequences and function of 32 cre boxes was published by Miwa et al., who deduced that fewer mismatches of the cre sequence to the query sequence, in the same direction as transcription of the target genes, and a more palindromic cre sequence are desirable for better function [33]. The goal of our study was to perform a genome-wide analysis of cre boxes: to reveal cre boxes with high and low binding affinities by comparing the CcpA regulon under three distinct conditions in which different amounts of CcpA were present in the cells, and to identify the cre features that determine this affinity.
Using a tetracycline-dependent gene regulation system [35], we achieved tightly controlled ccpA expression, yielding a wide range of CcpA amounts in the cells. B. subtilis cultures with relatively low, medium or high amounts of CcpA were subjected to transcriptome analyses. The cells were grown in the presence of glucose to ensure sufficient production of the low-molecular-weight modulators of CcpA activity (NADP, glucose-6-phosphate, fructose-1,6-bisphosphate). As expected, higher levels of CcpA protein led to more genes being significantly up- or downregulated. Most of the regulated genes, however, were affected indirectly, as they lack a cre site. Genes regulated indirectly in a CcpA-dependent manner (no cre or a nonfunctional cre) have been observed before and were proposed to form class II, alongside class I, which comprises genes regulated directly by CcpA [40, 46, 47]. In our analysis, only class I genes were taken into account, because the subject of this study was the nature of discriminating cre boxes. Many repressed genes are σA-dependent and do not need another inducing protein for their expression. However, expression of some genes is controlled by more than one regulator. In those rare cases of multiple regulation, the full extent of regulation would not be observed in our transcriptome analysis, but this does not affect our analysis, since we compare the relative strength of repression at different CcpA concentrations.
The search for putative cre boxes in the B. subtilis genome, using a cre motif generated from the cre boxes known from DBTBS [41], T₁G₂A₃A₄A₅R₆C₇G₈Y₉T₁₀W₁₁W₁₂C₁₃A₁₄, resulted in 418 putative
cre boxes. The majority of the predicted cre boxes lay within ORFs far from promoters. Although functional cre boxes located within coding sequences do exist in the B. subtilis genome, many of the predicted cre sites seemed too distant from a promoter to play a role in the regulation of gene expression. Therefore, only cre boxes located between −500 and +100 nucleotides from the first nucleotide of the start codon of the first gene of an operon were extracted. In addition, cre boxes that trigger gene regulation and are known from the literature, but were not predicted by our method, were included in our analysis. Genes that were differentially expressed at least at the high CcpA production level and that possess cre box(es) known from the literature [1, 41] and/or predicted in this study were selected. Among the selected genes, 30 were downregulated and 3 were upregulated at the low CcpA induction level, while the other 37 genes were downregulated only when CcpA was produced at higher levels (medium and high CcpA induction levels). For all these genes, expression fold changes were calculated as ratios of the amounts of transcripts downstream of
cre boxes, as the microarray chip probes were synthesized downstream from them. Of the regulated first genes of operons possessing a known and/or predicted cre box, only the chip probes of kdgR and resA were upstream from the kdgR cre and the second cre of resA (located 1709 bp downstream from the TSS). Therefore, these
cre boxes were not included in the sequence and position analysis of
cre boxes. Since regulation depends on CcpA-cre binding, cre boxes that cause significant regulation of downstream operons already when only a small amount of CcpA is available are presumed to have a high affinity for CcpA and to titrate CcpA away from low-affinity cre sites, which can exert regulation of other operons only when more CcpA is present. Notably, over a dozen known cre sites fell out of our data set because the corresponding genes were not significantly regulated in any of the three microarray experiments. Although they could be considered very low-affinity sites, they were not included in the analysis, as the lack of differential expression might have been a false negative result due to, e.g., high background signal, bad spot quality on the microarray slides, mRNA degradation, growth conditions, more complex regulation or as yet unidentified factors. Moreover, it should be noted that the division of cre boxes into two affinity groups is a simplification necessary for this analysis. Very likely, a gradient of cre site affinities occurs in nature, which would be difficult to assess.
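The selection logic described in this paragraph can be sketched in a few lines of Python. This is only an illustrative sketch and not the procedure actually used in the study: the function names, the mismatch tolerance (`max_mismatches`), the coordinate convention (0-based leftmost position of the site) and the boolean inputs of `affinity_class` are all assumptions made for the example. Only the 14-nt motif, the −500/+100 window and the two-group rule are taken from the text.

```python
# IUPAC codes used in the motif: R = A or G, Y = C or T, W = A or T.
ALLOWED = {"A": "A", "C": "C", "G": "G", "T": "T",
           "R": "AG", "Y": "CT", "W": "AT"}
CRE_MOTIF = "TGAAARCGYTWWCA"  # T1 G2 A3 A4 A5 R6 C7 G8 Y9 T10 W11 W12 C13 A14
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(site, motif=CRE_MOTIF):
    """Count positions of `site` that are not allowed by the IUPAC motif."""
    return sum(base not in ALLOWED[code] for base, code in zip(site, motif))


def find_cre_candidates(genome, max_mismatches=0):
    """List (start, strand, site) for windows matching the motif on either strand.

    `max_mismatches` is a hypothetical tolerance; the text does not state one.
    """
    genome = genome.upper()
    k = len(CRE_MOTIF)
    hits = []
    for start in range(len(genome) - k + 1):
        window = genome[start:start + k]
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            if mismatches(site) <= max_mismatches:
                hits.append((start, strand, site))
    return hits


def in_promoter_window(cre_start, start_codon_pos, gene_strand="+",
                       upstream=500, downstream=100):
    """True if the site lies between -500 and +100 nt of the start codon."""
    offset = cre_start - start_codon_pos
    if gene_strand == "-":  # upstream is at higher coordinates on the minus strand
        offset = -offset
    return -upstream <= offset <= downstream


def affinity_class(sig_low, sig_medium, sig_high):
    """Two-group simplification: regulated already at low CcpA -> 'high' affinity."""
    if sig_low:
        return "high"
    if sig_medium or sig_high:
        return "low"
    return None  # not significantly regulated: excluded from the analysis


print(find_cre_candidates("CCCCTGAAAGCGTTAACAGGGG"))
print(affinity_class(False, True, True))
```

Because the consensus itself is palindromic, a perfect match is reported on both strands; a real pipeline would need to merge such duplicate hits.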
The detailed analysis of the sequences of high- and low-affinity cre boxes led to a few interesting observations. The G₂ and the central C₇ and G₈ residues (Figure ), known to be highly conserved [32-34], are conserved in both high- and low-affinity cre boxes. Interestingly, in the high-affinity cre boxes, G₆ and C₉ (flanking the central CpG), C₁₃ (palindromic to the conserved G₂) and A₁₄ (palindromic to T₁) are more conserved, and the sequences are significantly more palindromic overall. It was observed before that a more palindromic cre sequence contributes to better function [33]. The more palindromic nature of the high-affinity cre sites (in comparison with low-affinity cre sites) might create a more symmetric DNA conformation, which is preferred for CcpA binding. Although the bases at positions 4 and 11 are more often palindromic to each other in the weak cre boxes, this appears to be less important for cre strength. In a previous study [34] it was shown that CcpA binds with similar affinities to different cre boxes, which explains well the role of CcpA as a global regulator. However, the three cre boxes tested in that work differ very little around the central CpG and in their symmetry (palindromic sequence), and they do not differ at the residues corresponding to our C₁₃ and A₁₄.
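One simple way to quantify how palindromic a 14-nt cre box is would be to count the positions i whose base is complementary to the base at position 15 − i (1 with 14, 2 with 13, 4 with 11, and so on). The sketch below does this. The function names and the scoring scheme are assumptions chosen for illustration, not the measure used in this study, and the two example sequences are hypothetical sites that merely fit the consensus.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def palindromic_pairs(site):
    """Return the 1-based position pairs (i, j) whose bases are complementary.

    For a 14-nt site, position i is compared with position 15 - i.
    """
    site = site.upper()
    n = len(site)
    pairs = []
    for i in range(n // 2):
        j = n - 1 - i
        if COMPLEMENT.get(site[i]) == site[j]:
            pairs.append((i + 1, j + 1))
    return pairs


def palindromicity(site):
    """Fraction of mirrored position pairs that are complementary (0.0 to 1.0)."""
    return len(palindromic_pairs(site)) / (len(site) // 2)


# Hypothetical example sites, both matching the consensus motif.
print(palindromic_pairs("TGAAAGCGCTTTCA"))  # fully palindromic site
print(palindromic_pairs("TGAAAGCGTTAACA"))  # partly palindromic site
```

A per-pair listing, rather than a single score, makes it possible to ask which mirrored positions (for example 2/13 versus 4/11) differ between two groups of sites.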
Comparison of the locations of the high- and low-affinity cre boxes relative to the TSS also shows some trends. While the low-affinity cre sites can be located at any position relative to the TSS, the high-affinity cre sites cluster around the TSS, at 14 and 27 base pairs upstream of the TSS, and at 44 base pairs downstream of the TSS. In line with this, the strongest repression by CcpA was observed for genes with cre sites located around the TSS (amyE, rbsR, gmuB) and at positions −27 (acoR, glpF), −14 (dctP), +230 (xynP) and +372 (treP) base pairs from the TSS, which are separated by increments of approximately 10-11 nt (corresponding to a full helical turn). This observation is in agreement with previous findings that activation or repression by CcpA binding to cre boxes is helix-face-dependent [17, 45]. Also in Lactococcus lactis, the strongest repression by CcpA was shown to occur when the center of the cre box was located at −39, −26, −16, +5 and +15 relative to the TSS [48].
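Helix-face dependence can be examined by reducing each cre position to its phase within one helical turn and comparing phases between sites. The following is a minimal sketch under the assumption of about 10.5 bp per turn of B-form DNA. This value, the function names and the use of a single reference coordinate per site are assumptions for the example and are not taken from the study.

```python
HELICAL_TURN_BP = 10.5  # assumed average for B-form DNA


def helical_phase(position, turn=HELICAL_TURN_BP):
    """Position relative to the TSS reduced to a phase in [0, turn)."""
    return position % turn


def phase_difference(pos_a, pos_b, turn=HELICAL_TURN_BP):
    """Smallest distance (in bp) between two positions around the helix.

    0 means the same face of the helix; turn / 2 means opposite faces.
    """
    d = abs(pos_a - pos_b) % turn
    return min(d, turn - d)


# Two sites exactly two turns apart lie on the same helix face.
print(phase_difference(-27, -6))
```

Because real spacing varies with sequence and supercoiling, such a phase is only a rough guide, and small differences should not be over-interpreted.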
It was shown before that genes with cre boxes located further upstream of the −35 sequence of the promoter are subject to activation by the CcpA complex, as in the case of ackA [17], pta [18] and ilvB [19, 20]. In our work, however, under the tested conditions, only three genes were activated: ilvB, opuE and ycbP (the latter two with cre sites predicted in this study). We did not observe activation of ackA in this study. This is probably because the very low basal expression of CcpA from the TetR-repressed promoter might already be high enough for binding of CcpA to the ackA cre box and for full activation of the ackA promoter. In this case, a further increase in CcpA would not result in an additional increase in ackA expression. Surprisingly, pta was downregulated. However, in this study both test and control cultures were grown in medium supplemented with glucose. The mechanism of pta regulation in this case is thus different from low glucose-dependent CCA. Based on our criteria, the cre boxes of all three activated genes are of the high-affinity type. Although the ycbP cre box appears to be downstream of the TSS (+30), neither the cre box nor the TSS has been experimentally confirmed in this case.
Some genes and operons possess multiple cre boxes. Since DNA microarray technology was used in this study to assess expression fold changes of genes and operons in the presence of different amounts of CcpA, we were not always able to judge whether an effect was due to a single cre box (and which one) or to several. In our set (Table ) there were only two operons with two cre boxes (the first genes of these operons are iolA and gntR). gntR was weakly regulated (low-affinity cre box), suggesting that the regulatory effects of the two cre boxes do not add up to exert strong regulation. In the case of the iolA operon, each of the two cre boxes is located within a different gene of the operon (cre-1 within iolA and cre-2 within the second gene of the operon, iolB), so the regulatory effects of these cre boxes could be assessed independently. Based on the fold changes of iolA (cre-1) and iolB (cre-2), both cre-1 and cre-2 seem to be of high affinity. Multiple cre boxes could serve to fine-tune CcpA-regulated genes and operons.
For genes with cre boxes located close to the TSS and downstream of it, distinct repression mechanisms have been proposed. Elongation blockage (roadblock) was shown for the xyl, ara and gnt operons, as well as for sigL and acsA [49-53]. Prevention of RNAP binding to the promoter sequence was demonstrated for the acuABC and bglPH operons, which possess a cre partially overlapping the promoter region [54, 55]. Transcription inhibition by direct interaction of CcpA with the σ-subunit of RNAP already bound to the promoter was shown for the amyE gene and the xyl operon [45]. The presence of a high-affinity cre box in close vicinity to the TSS, shown in this study, suggests that repression by inhibition of RNAP binding is one of the most effective mechanisms of negative regulation by CcpA.