Considering the limited number of genes that meet our criteria for having a simple cis-regulatory promoter, and the finite number of conditions for which expression data is available, the proportion of functional BSMVs (9%) among all motif positions is remarkable.
We turned to the literature to assess the validity of a sample of the functionally variant binding positions we identified. We discuss what is predicted about each example from the VDRE approach alone, and then discuss each prediction in light of experimental evidence from the literature. Position 4 of the Mcm1 binding site, also called the middle sporulation element, is an example of a functionally variant binding site position identified in this analysis (; p
0.032). Under conditions where yeast is subjected to desiccation and rehydration, genes with an “A” at this position are induced, in comparison to genes with a “T” at this position. Under conditions where yeast is treated with methyl methanesulfonate (MMS), a DNA-damaging alkylating agent, the genes with an “A” at this position are repressed, in comparison to genes with a “T” at this position. A third category of genes has “C” at this position, and the VDRE scores of all three nucleotides (“A”, “T” and “C”) were considered when determining that the position is functional (Fig. S1). The Mcm1 protein is a member of the MADS box family and plays important roles in several diverse cellular processes; therefore, its binding site has been extensively characterized. When Mcm1 binding sites were selected from a pool of random sequence oligonucleotides, about three quarters of the selected sequences had “A” at position 4, ~15% contained a “T” at this position, and Mcm1 had a higher affinity for “A” BSMVs than for “T” BSMVs. Putative Mcm1 binding sites were cloned into a heterologous promoter in front of a reporter gene, and an Mcm1 binding site was subjected to saturation mutagenesis in front of a reporter; in both cases, Mcm1 binding sites with “A” variants at position 4 showed higher (~2×–3×) activation of the reporter than “T” (or “C”) variants.
Mcm1 acts as an activator alone, but as a repressor when co-bound with α2. The saturation mutagenesis of the Mcm1 binding site shows that BSMVs have different effects, depending on whether α2 is co-bound. An “A” nucleotide at position 4 of the binding site results in more than twice as much activation of the reporter gene as a “T”, but when α2 is present the high level of repression of the reporter gene by the two BSMVs is almost identical: 130× for the “A” BSMV and 126× for the “T” BSMV. One reason for this combinatorial effect may be that Mcm1 is known to induce sequence-specific DNA bending, which in turn regulates the formation of ternary complexes with other cofactors. Many of the single base pair changes in the binding site that alter its DNA bending and transcriptional regulation do not affect the affinity of the TF for the binding site. Our finding that the “A” and “T” variants at position 4 of the Mcm1 binding site have different effects under different conditions makes sense because cofactors that act in a BSMV-specific way may be present in only a subset of these conditions. Although we have not determined which cofactor(s) are involved in our case, it is interesting that α2 is absent from the haploid a-mating type strain used in the MMS experiments, but present in the a/α diploid strain used in the desiccation/rehydration experiments.
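The condition-specific comparison described above for position 4 of the Mcm1 site can be illustrated with a minimal sketch. This is not the VDRE procedure itself; it only shows the underlying contrast between genes grouped by the nucleotide at one motif position. The function name `variant_contrast`, the site sequences, and the log2 expression values are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def variant_contrast(sites, expression, position, condition, nt_a, nt_b):
    """Difference in mean log2 expression between genes whose binding site
    carries nt_a versus nt_b at a 1-based motif position."""
    group_a = [expression[g][condition] for g, s in sites.items()
               if s[position - 1] == nt_a]
    group_b = [expression[g][condition] for g, s in sites.items()
               if s[position - 1] == nt_b]
    return mean(group_a) - mean(group_b)


# Hypothetical binding sites (one per gene); position 4 is "A" or "T".
sites = {
    "gene1": "CCTAATTAGG",
    "gene2": "CCTAATTCGG",
    "gene3": "CCTTATTAGG",
    "gene4": "CCTTATTCGG",
}

# Hypothetical log2 expression ratios in two conditions.
expression = {
    "gene1": {"rehydration": 1.0, "MMS": -0.5},
    "gene2": {"rehydration": 0.5, "MMS": -1.0},
    "gene3": {"rehydration": -0.5, "MMS": 0.5},
    "gene4": {"rehydration": 0.0, "MMS": 0.25},
}

for condition in ("rehydration", "MMS"):
    diff = variant_contrast(sites, expression, 4, condition, "A", "T")
    print(condition, diff)
```

In this toy data the contrast is positive in one condition and negative in the other, which is the sign change that marks a position as having a condition-specific effect. A real analysis would also need a significance test and a treatment of the third (“C”) variant.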
Sum1 provides another example of how BSMVs may regulate target genes in a condition-specific manner through the participation of another factor, in this case, a competing transcription factor. During growth in rich media, we find that genes regulated by binding sites with a “T” at position 8 of the Sum1 binding site are induced relative to genes with an “A” at this position (; significance of functional BSMV p
0.003). During sporulation, the opposite relationship is observed. (Sum1 binding sites with “C” at this position are also functional; Fig. S1.) During vegetative growth, Sum1 represses expression of target genes, and the regulatory difference between genes with different variants at position 8 of the Sum1 binding site is small; indeed, while Sum1 has been shown experimentally through mutagenesis to bind sites with a “T” BSMV at position 8 at about 20% the rate of sites with an “A,” repression of reporter activity remained similar between the BSMVs in that study. However, during sporulation, the activator Ndt80 is also expressed, and competes with Sum1 for binding to the motif, dictating whether the site mediates repression or activation. The relative affinity of the BSMV for Ndt80 versus Sum1 acts as a molecular switch that induces only the genes required for the meiotic G2-to-M transition. For the “A” variant at position 8 of the binding site, Ndt80 out-competes Sum1 and causes induction of the target gene, while for the “T” BSMV, Ndt80 does not out-compete Sum1, and the repressive effect of Sum1 on the target gene remains the same as it was for the “A” BSMV in the absence of Ndt80. This type of effect may explain why “A” and “T” functional variants at position 8 of the Sum1 binding site have different regulatory associations with target genes in sporulation media versus other conditions.
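The switch-like behaviour described above can be sketched with a standard equilibrium model of two factors competing for a single site. This is an illustrative assumption, not a model fitted in this study: the function name `competitive_occupancy`, the concentrations, and the dissociation constants are hypothetical and in arbitrary units.

```python
def competitive_occupancy(conc_a, kd_a, conc_b, kd_b):
    """Equilibrium fractional occupancy of one site by two competing factors.

    Returns (occupancy of factor a, occupancy of factor b).
    """
    weight_a = conc_a / kd_a
    weight_b = conc_b / kd_b
    partition = 1.0 + weight_a + weight_b  # unbound state contributes 1
    return weight_a / partition, weight_b / partition


# Hypothetical dissociation constants (lower = tighter binding).
KD = {
    "A": {"ndt80": 0.1, "sum1": 1.0},   # Ndt80 binds the "A" variant tightly
    "T": {"ndt80": 10.0, "sum1": 1.0},  # Ndt80 binds the "T" variant weakly
}

for variant, kd in KD.items():
    for label, ndt80_conc in (("vegetative", 0.0), ("sporulation", 1.0)):
        ndt80_occ, sum1_occ = competitive_occupancy(
            ndt80_conc, kd["ndt80"], 1.0, kd["sum1"]
        )
        winner = "Ndt80" if ndt80_occ > sum1_occ else "Sum1"
        print(variant, label, round(ndt80_occ, 3), round(sum1_occ, 3), winner)
```

Under these assumed values, Sum1 occupies both variants when Ndt80 is absent, and only the “A” variant switches to Ndt80 when it is expressed, mirroring the qualitative pattern described in the text.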
The functional BSMV at position 8 of the Sum1 binding motif remains significant when also considering target genes with multiple primary inputs using VDRE (p<0.001), and its effect on target genes in different conditions remains the same, even though the number of genes considered is greater. Although the method presented here does not explicitly account for the effects of both multiple regulatory inputs and BSMVs, such an approach is currently under development.
The functional BSMVs revealed using the two different platforms (cDNA vs. Affymetrix) were largely non-overlapping. This is the expected result since the regulatory function of BSMVs we detect is condition specific, and the conditions investigated in these sets of experiments are different.
A proportion of the functional BSMVs were identified in multiple species, suggesting that the BSMVs are under evolutionary constraint to preserve their function. For example, position 9 of the Reb1 binding site was identified as having functional BSMVs in S. cerevisiae, S. paradoxus, and S. mikatae. Genes regulated by binding sites with a “G” at this position are induced relative to genes with an “A” during growth in glycerol in all three species. In a small-scale affinity selection experiment, Reb1 had lower binding strength to sites with “G” at position 9 than to sites with “A” at position 9, and “G” BSMVs promoted lower transcriptional activity than “A” BSMVs when grown on 2% glucose plates.
Position 10 of the Rap1 binding site also has functional BSMVs identified in multiple species. During glucose starvation conditions (growth in glycerol), genes with “C” BSMVs are induced with respect to genes with “T” BSMVs. Differences in affinity of Rap1 binding sites have been shown to be specifically associated with expression in low glucose conditions, in a detailed set of experiments including ChIP-chip, protein binding microarrays, deletion mutants, and gene expression analysis. High affinity sites are constitutively bound by Rap1, while low affinity binding sites are shielded from Rap1 by chromatin structure, except during low glucose conditions, when chromatin conformational changes expose them, and Rap1 binds and induces expression. Our results suggest that such BSMV-by-condition patterns for Rap1 can be learned from accurate binding site predictions and expression patterns alone.
Yeast has only around 200–300 TFs to carry out its complex regulatory functions, from budding to the cell cycle to selectively metabolizing dozens of different energy sources. A fundamental question in regulatory biology is how a relatively small number of TFs orchestrates the regulation of thousands of genes to achieve innumerable phenotypic responses. The fine-tuning of TF binding motifs at non-consensus positions may provide an important source of control in coordinating these condition-specific expression patterns.
In this study, we found that a significant proportion of variable positions in TF binding motifs may have functional consequences. Several of these predictions are in agreement with available experimental evidence, and several are corroborated by conservation across species. We considered only a single variable position at a time and did not explicitly account for promoters with complex regulatory inputs. More functional BSMVs should be found if combinations of positions and/or binding sites are formally considered.
Functional BSMVs allow the same TF to have a broad range of regulatory effects simultaneously over different target genes. Our results, consistent with the molecular biology literature, show that these differential regulatory effects between BSMVs can change with the concentration of the TF and/or the concentration of cofactors across environmental or cellular conditions.
As the complexity of organisms increases, the complexity of their regulatory responses must also increase to accommodate differential expression across tissues and numerous developmental stages. We therefore expect that the contribution of functional BSMVs to the cis-regulatory code of higher eukaryotes may be even more pronounced, an idea supported by the observation of such BSMVs in the experimental literature in diverse organisms such as nematode, fly, mouse, and human.