To assess the quantitative effect of different genomic regions, it is essential to establish experimental systems that separate these regions from their native genomic context and measure their direct effect. While it is well established that regulatory features other than the promoter can affect gene expression, to our knowledge our work provides the first systematic measurements of the independent effect of regulatory regions other than promoters in yeast. We show that native 3′ end sequences span a broad and continuous range of expression values of more than 10-fold. Our library comprises only 85 sequences, chosen without any prior knowledge of their expected effect on expression, and is drawn from two unrelated functional groups of genes in the yeast genome. These groups were chosen because they represent two different regulatory strategies: ribosomal genes are housekeeping genes expressed constitutively in all growth conditions, whereas the second group consists of condition-specific genes expressed in the growth condition in which we conducted our measurements. Notably, we did not find any major differences in 3′ end-mediated regulation between the two groups. Given the small sample size of our library, the range of 3′ end effects across the genome is likely larger than the one we observe. We quantify the independent effect and explained variance of 3′ end and promoter sequences by comparing our 3′ end library to a promoter library and correlating both to endogenous mRNA levels. The results show that constitutive expression levels are determined by a combination of both regulatory regions. Interestingly, despite the large regulatory potential (dynamic range) of isolated 3′ end constructs on YFP expression, their contribution to the explained variance of endogenous mRNA levels is relatively small.
One possible explanation is that the effects of the two regions are not independent; it would thus be interesting to test different 3′ end sequences in different promoter contexts.
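The comparison of explained variance described above can be illustrated with a small sketch. It is not the analysis pipeline used in this work: the function names, the use of log-scale values, and the five data points are all hypothetical, chosen only to show how the variance explained by each region alone relates to the variance explained by both together.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def explained_variance(promoter, three_prime, mrna):
    """R^2 of mRNA on each region alone and on both regions together.

    The combined value uses the closed form for a two-predictor linear
    regression; it is undefined if the two predictors are perfectly
    correlated (r_pt = +1 or -1).
    """
    r_p = pearson(promoter, mrna)
    r_t = pearson(three_prime, mrna)
    r_pt = pearson(promoter, three_prime)
    r2_both = (r_p**2 + r_t**2 - 2 * r_p * r_t * r_pt) / (1 - r_pt**2)
    return {"promoter": r_p**2, "three_prime": r_t**2, "both": r2_both}

# Made-up log-scale values for five hypothetical genes (illustration only).
promoter_yfp = [1.0, 2.0, 3.0, 4.0, 5.0]
three_prime_yfp = [2.0, 1.0, 4.0, 3.0, 5.0]
# Toy "endogenous mRNA": an exact linear mix of the two measurements.
mrna = [p + 0.5 * t for p, t in zip(promoter_yfp, three_prime_yfp)]

result = explained_variance(promoter_yfp, three_prime_yfp, mrna)
```

In this toy data set the two regions together explain all of the variance by construction, while each alone explains only part of it. A non-additive interaction between promoter and 3′ end, as suggested above, would not be captured by this additive model and would require measuring 3′ end sequences in several promoter contexts.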
Although we cannot say whether the A/T content itself causes higher expression or whether it is a proxy for a more specific signal, our results highlight the 3′ UTR end as a genomic region that may have a significant effect on mRNA levels. Detecting this sequence signal depends on aligning the sequences by the polyadenylation site. We thus speculate that increased A/T content may result in more efficient 3′ end formation that gives rise to elevated protein expression. It has been previously shown that A/T content is required for efficient 3′ end processing as part of the upstream efficiency element. More efficient 3′ end processing can result in efficient release of RNA polymerase after polyadenylation and recycling of the transcription initiation machinery, given that polyadenylation and transcription termination were shown to be mechanistically coupled. An additional potential means by which efficient polyadenylation could give rise to higher protein expression comes from recent work in mammalian cells, which suggested that with more efficient 3′ end processing, more transcripts escape from nuclear surveillance, resulting in more mature mRNA molecules exported into the cytoplasm. Notably, all of these mechanisms would result in changes in the size of expression bursts. Although it was shown that deliberately mutating polyadenylation signals decreases mRNA and protein levels, we suggest that the efficiency of this process varies between native genes and is partly responsible for the observed variability in protein and mRNA expression in the genome.
Our study demonstrates the strength of a synthetic approach in establishing a causal link between sequence features and their outputs. Correlations observed in the genome, e.g. between sequence features in the 3′ UTR and expression levels, could always be explained by indirect, non-causal effects. For example, one could argue that genes with certain UTR features may also have strong promoters. Observing such connections in a setup such as the current library, in which the effect of 3′ UTR sequences is measured in isolation, partly removes those potential confounders.
Finally, we showed that the observed span of YFP values in our library, mediated by the different 3′ end constructs, affects population noise in a very distinct way compared to expression changes that are mediated by differential promoter activation. Our results thus make 3′ end sequences appealing candidates for the design of specific circuits in which changes in the mean expression level of a population are needed with little effect on noise. They also demonstrate how the different layers of gene expression regulation affect protein expression with distinct dynamics, and suggest that such analysis can be used to gain insights into the different layers of regulation involved in an observed change in protein levels.
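One common way to reason about why a change in burst size and a change in burst frequency can leave different signatures in noise is the standard bursty-expression model, in which steady-state protein levels follow a gamma distribution. The sketch below assumes that model; it is not the analysis performed in this work, and the function name and parameter values are hypothetical. The shape parameter is taken as the number of bursts per protein lifetime and the scale parameter as the mean number of proteins per burst.

```python
def burst_stats(burst_frequency, burst_size):
    """Mean and noise (CV^2) of a gamma-distributed protein level.

    Assumed model: shape = burst_frequency (bursts per protein lifetime),
    scale = burst_size (mean proteins produced per burst).
    """
    mean = burst_frequency * burst_size
    variance = burst_frequency * burst_size ** 2
    noise = variance / mean ** 2  # simplifies to 1 / burst_frequency
    return mean, noise

# Hypothetical reference state.
base_mean, base_noise = burst_stats(2.0, 10.0)

# Doubling the burst size doubles the mean and leaves noise unchanged.
size_mean, size_noise = burst_stats(2.0, 20.0)

# Doubling the burst frequency doubles the mean and halves the noise.
freq_mean, freq_noise = burst_stats(4.0, 10.0)
```

Under these assumptions, two routes to the same mean expression level are distinguishable by their noise: in this model, a change in burst size moves the mean at constant noise, whereas a change in burst frequency moves noise inversely with the mean.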