Transcriptional transactivation is a process with remarkable tolerance for sequence diversity and structural geometry. In studies of the features that constitute transactivating functions, acidity has remained one of the most common characteristics observed among native activation domains and activator peptides.
We performed a deliberate search of random peptide libraries for peptides capable of conferring transcriptional transactivation on the lexA DNA binding domain. Two libraries, one composed of C-terminal fusions, the other of peptide insertions within the green fluorescent protein structure, were used. We show that (i) peptide sequences other than C-terminal fusions can confer transactivation; (ii) though acidic activator peptides are more common, charge neutral and basic peptides can function as activators; and (iii) peptides as short as 11 amino acids behave in a modular fashion.
These results support the recruitment model of transcriptional activation and, combined with other studies, suggest the possibility of using activator peptides in a variety of applications, including drug development work.
Transcriptional activation is one of the key elements of eukaryotic gene expression. A variety of genetic, biochemical, and structural studies have revealed many mechanistic details of this process [see, for review, [1-3]]. In some cases, the structural/chemical basis for protein/DNA complex formation and recruitment of ancillary factors that promote transcription have been elucidated. Mutational analysis has defined regions of transcription factors – even specific residues – that participate in the activation process [4-9]. One of the most common structural features is the acidic activation domain, a region with excess negative charge present on many native activators as well as artificial ones. These activation domains are modular and can be effective when removed from one context and transplanted onto other DNA binding molecules [10,11].
The yeast two-hybrid system exploits this modular feature of transcriptional activators [12,13]. Several variants of this general approach have been devised that employ DNA-binding domain and activation domain fusion proteins to assess and/or recover proteins that interact. The exceptional flexibility and efficiency of this system propelled it to the forefront of protein interaction studies [14-16]. The technology has also been adopted for a variety of other applications including protein-RNA interactions, small-molecule screens, and screens for random peptide aptamers [13,17-19].
In this study we used the DNA-binding domain of a lexA yeast two-hybrid construct to recover short peptides from random peptide libraries which function as transcriptional activators. A deliberate search for activator peptides revealed that, although C-terminal peptide fusions to the lexA DNA binding domain had a higher probability of activating transcription, sequences inserted within the structure of a green fluorescent protein (GFP) scaffold also functioned as activators. The selected peptide activators were strongly biased in amino acid composition compared to unselected library peptides. However, structural features such as acidity observed in previous studies were not absolute requirements for activation [11,20,21]. Charge neutral and basic peptide activators arose at significant frequencies. A peptide 11 residues in length was modular, remaining a competent activator when grafted into a new context. These results are consistent with the recruitment model for transactivation and suggest that peptides may act as bridging molecules. Such short peptidic sequences that activate transcription may be useful in the context of drug discovery.
Two random peptide libraries were used in the experiments, one that displayed peptides within the green fluorescent protein (GFP) structure, the other as C-terminal fusions to lexA. The GFP internal random peptide library was fused to the lexA DNA-binding domain and constructed according to a method used previously (Figure 1). Each clone contained a different random peptide sequence inserted within a loop, located 157 amino acids from the N-terminus, that connects two of the beta strands of GFP. Thus, the peptides were constrained at either end by the structure of GFP. A library of 8 million random peptides of length 15 was created in a yeast expression vector consisting of a 2μ circle origin, a selectable marker, and the yeast ADH1 promoter driving expression of the hybrid proteins.
Even though the library was biased at the DNA level to avoid stop codons (see Methods), roughly one third of the clones were expected to contain in-frame stop codons within the insert, thus producing a truncated GFP molecule containing only the first 157 amino acids of GFP followed by C-terminal peptides of variable length up to 15 amino acids. At least some of these truncated products were expressed at high levels in yeast when fused to lexA (Caponigro, unpublished). The characteristics of the library sequences have been described previously. No dramatic sequence biases were present.
A second library in which 15-residue-long random peptides were fused to the C-terminus of the lexA binding domain was also constructed. This library was expressed from a vector that contained a centromere, a selectable marker, and the ADH1 promoter to drive expression. Again, the library inserts were biased against stop codons as for the internal peptide library. Sequence analysis of a random set of 71 clones revealed an average C-terminal peptide sequence length of 11.7 residues, with no discernible bias for or against particular amino acids (Table 1).
To select for peptide activators, the lexA-GFP/peptide and lexA-peptide libraries were introduced separately into yeast cells and plated on selective media. Only cells that contained plasmids which activated expression of the lexA-regulated LEU2 marker grew. The reversion rate for the assay was low (< 0.001%). Based on the number of colonies per plate, frequencies of 0.02% of activating peptides in the lexA-GFP/peptide library and 0.1% in the C-terminal library were calculated.
Plasmid DNA from 131 individual lexA-GFP/peptide colonies and 81 lexA-peptide colonies was recovered and insert DNA sequences obtained. For the lexA-GFP/peptide library, these sequences condensed to 44 unique sequences with only 9 sequences present in one copy (Table 2). The remaining clones (35) were observed multiple times; one sequence was seen 16 times (not shown). This suggests that the total population of unique sequences was relatively small. Retests of individual clones revealed that ~90% were bona fide activators. For the lexA-peptide library, only 11 sequences were observed multiple times, leaving 55 single observations and a total of 66 unique sequences. Retests confirmed 100% (16/16) as true activators.
Statistical analysis of the activator peptide sequences from both libraries demonstrated biases in amino acid composition and net charge. The frequency of amino acids among the sequences was skewed significantly toward acidic residues (Table 1). For the lexA-GFP/peptide library, the ratio of acidic residues to basic residues, (D + E)/(K + R), was about two (63/32). When the net charge (assuming D = E = -1, K = R = +1) was calculated for the 27 sequences, a highly significant bias in charge distribution was detected (Figures 2, 3 and 4). Negatively charged sequences outnumbered positively charged ones by a factor of about 2 (15 vs. 8). However, one clone, recovered 2 times and confirmed as an activator in plasmid linkage tests, had a net charge of +3, and there was no significant trend toward increased acidity among the more frequent clones (not shown). For the lexA-peptide activator set, the distribution was even more skewed. Negatively charged sequences exceeded positive ones by a factor of 5 (45 vs. 9). If the C-terminal negative charge was included, the disparity was over 6-fold. The (D + E)/(K + R) ratio for this sequence set was about 4 (127/31). However, there were basic sequences among the set. For example, of the 16 sequences confirmed to have transcriptional activator activity by plasmid linkage tests, one was charge neutral and 2 were basic (+2 and +1; Table 3). Compared with amino acid frequencies among the unselected lexA-peptide library sequences, there was a highly significant increase among activators of glutamate, leucine, phenylalanine and tryptophan (p << 0.001). Arginine was significantly under-represented. A bias against short peptides, in which stop codons terminate translation before the 15 residue limit is reached, was observed among the lexA-peptide activators. The mean length of activators was 15.1 amino acids compared to a length of 11.7 in the unselected population.
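The charge bookkeeping used above (D = E = -1, K = R = +1, all other residues neutral) can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, the C-terminal carboxylate is deliberately ignored, and the peptide `KKRAG` is an invented example. The 11-residue CAS sequence is the one reported in this paper.

```python
from collections import Counter

# Charge convention stated in the text: D = E = -1, K = R = +1.
# All other residues (including histidine) are treated as neutral.
CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}


def net_charge(peptide: str) -> int:
    """Net side-chain charge of a peptide under the simple +1/-1 convention."""
    return sum(CHARGE.get(residue, 0) for residue in peptide.upper())


def acidic_to_basic_ratio(peptides) -> float:
    """(D + E)/(K + R) pooled over a set of peptide sequences."""
    counts = Counter("".join(peptides).upper())
    acidic = counts["D"] + counts["E"]
    basic = counts["K"] + counts["R"]
    return acidic / basic if basic else float("inf")


# CAS sequence from this paper: one glutamate, no basic residues.
print(net_charge("WSFWIQEWNQS"))   # -1
# Hypothetical basic peptide, for illustration only.
print(net_charge("KKRAG"))         # 3
# Pooled residue counts quoted in the text for the two activator sets.
print(round(63 / 32, 2), round(127 / 31, 2))   # 1.97 4.1
```

Under this convention the reported ratios of 63/32 and 127/31 correspond to roughly 2-fold and 4-fold excesses of acidic over basic residues, as stated.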
We also analyzed the activator sequences from the lexA-GFP/peptide library which encoded truncated GFP sequences. Based on their frequency in the non-selected library, stop codons were expected at a rate of 27%. Of the 44 unique sequences 32% (14/44) contained stop codons within the random peptide sequence. Thus, there was neither a bias toward nor against truncated GFP molecules arising from insert stop codons in this population of activators. However, 48% (21/44) of unique activator sequences resulted in truncated GFP due to +1 or -1 frameshift alterations within the random oligonucleotide. Both the +1 and -1 frames hit stop codons in GFP within 10 amino acids of the random peptide sequence. Furthermore, from the total set of 44 unique sequences, ~39% were +1 frameshifted. When two sequences with +1 frameshifted inserts were picked from the unselected random peptide library, both behaved as activators (not shown). Together, these results suggest that translation of the +1 frame of GFP, after the random peptide insert, results in a molecule with activator properties.
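The statement that 14 of 44 stop-containing sequences is neither more nor fewer than expected at a 27% background rate can be checked with an exact binomial calculation. The sketch below is only one simple way to do this; the text does not say which test the authors used, and the function names are hypothetical.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail_ge(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def binom_tail_le(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


# Values quoted in the text: 14 of 44 unique activators carry an insert
# stop codon, against an expected rate of 27%.
n_unique, n_with_stop, expected_rate = 44, 14, 0.27

upper = binom_tail_ge(n_with_stop, n_unique, expected_rate)
lower = binom_tail_le(n_with_stop, n_unique, expected_rate)
two_sided = min(1.0, 2 * min(upper, lower))

print(f"expected count: {n_unique * expected_rate:.1f}")
print(f"P(X >= {n_with_stop}) = {upper:.3f}, two-sided p = {two_sided:.3f}")
```

The expected count is about 12, so an observed count of 14 lies well within ordinary sampling variation, which is consistent with the conclusion drawn in the text.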
One of the peptide activators from the lexA-GFP/peptide library, derived from a frameshift in the insert, was tested for its ability to activate transcription when separated from the GFP structure and directly fused to the lexA binding domain. This sequence, termed Core-Activating Sequence (CAS), contained 11 amino acids (WSFWIQEWNQS), had a net charge of -1, and three tryptophan residues. Five of the amino acids were contributed by out-of-frame translation of GFP, one from the random peptide sequence, and 5 from the 3' linker used to insert the random oligonucleotides for library construction. An expression construct was tested along with controls for its ability to activate transcription of LEU2. In a selection for the LEU2+ phenotype, roughly equivalent numbers of colonies formed on plates for a lexA-Gal4 control activator protein and for the grafted peptide. Furthermore, cell patches revealed a level of growth similar to the Gal4 control (Figures 5 and 6). Thus, CAS was as potent a transcriptional activator as Gal4 in this growth assay.
In general, native transactivating domains are relatively large, dozens of amino acids in length. However, shorter active fragments of native activator proteins have been described. An 11-amino-acid-long acidic domain from the RelA subunit of NF-κB is capable of minimal activating behavior when fused to the Gal4 DNA-binding domain (Gal4BD). The sequence, when joined together in tandem arrays, is as potent as the wild type RelA transactivator. Wu et al. showed that fragments of the Gal4 activation region between 17 and 41 amino acids in length provide activation functions of a magnitude in rough proportion to their length. This result suggests that short sequences are capable of stimulating transcription when complexed with promoter elements and that their effect is amplified by increasing the total content of such short sequences.
A series of papers has explored properties of random sequences, including random peptides, fused to DNA binding domains in the context of transcriptional activation assays similar to those used in the two-hybrid system. Ma and Ptashne isolated a set of E. coli DNA fragments that, when fused to Gal4BD, promote transcription. All (15/15) active sequences examined by the authors encode predicted polypeptides that are acidic, and a peptide as small as 12 amino acids was seen. One sequence (B42), 72 residues long, was shown to function as an activator when transferred to the lexA DNA binding domain. Yang et al. serendipitously discovered among a set of retinoblastoma-binding peptides three activators of length 16 residues that function as Gal4BD C-terminal fusions. All three are acidic, with charge -3 or -4. Giniger and Ptashne found that a designed acidic, amphipathic alpha helix of 15 amino acids activates transcription when fused to the Gal4BD. Using in vitro transcription assays, Gerber et al. demonstrated that homopolymeric stretches as short as 10 residues of glutamine or proline fused to the Gal4BD function as activators. Finally, Lu et al. screened for C-terminal Gal4BD peptides, 8 amino acids long. Only one basic residue was observed among the set of activator peptides, and no peptide had a net predicted positive charge.
An NMR structure for the lexA DNA-binding domain has been presented. The structure is compact and globular, and probably interferes only minimally with fused transactivating sequences. In addition, the success of the yeast two-hybrid technique suggests that the geometric constraints on transactivation are not severe (i.e., the DNA-binding elements and the transactivating portion of the molecule (or complex) do not require precise positioning with respect to each other). Because Gal4-GFP fusions are unstable, we were unable to analyze the GFP/peptide insertions identified in the lexA hybrid proteins in the context of the Gal4BD (not shown). However, a peptide selected in the context of the amino terminal portion of GFP functioned effectively when this GFP segment was removed and the peptide was grafted directly onto lexA. Due to differences in expression levels among C-terminal and internal GFP peptide constructs, we did not consider it useful to pursue more quantitative transactivation studies [22,27].
Activator peptides were present in the two libraries at remarkably high frequencies: 10⁻³ in the case of the C-terminal library, approximately the same frequency observed by Lu et al. The 5-fold lower frequency of activators in the lexA-GFP/peptide library may indicate a role for the C-terminal negative charge, the specific geometry, or the conformational flexibility of the fused (vs. inserted) peptides. Considerably more than half of the activator sequences we observed had a large excess of negative charge, a finding that is consistent with the involvement of acidic residues in the activation process. However, we observed several examples of charge neutral and basic peptides (Table 3). As a group, the sequences did not display obvious patterns of amphipathic helicity based on the lack of 3, 4 or 7 repeat units. Glutamine and proline residues were not significantly over-represented in the sequence set. Instead, in addition to glutamate, large hydrophobic amino acids were abundant. This result accords with mutagenesis experiments that have pinpointed acidic residues and phenylalanine as special contributors to transactivation [23,28].
Two questions are raised by our and other studies of peptide activators: (i) what is the minimum size of an activator? and (ii) what is the significance of the extreme sequence diversity evidenced by transactivating peptides? We identified and tested an 11-residue-long activator that functioned in two different contexts: at the C-terminus of a truncated GFP molecule and attached directly to lexA. Because we did not direct our screen at smaller peptides, we suspect that a deliberate effort to identify smaller activators would be fruitful. Indeed, Lu et al. found 8-mer activators, though none was tested for modularity. Nonetheless, the tendency for C-terminal activators to be longer than the average library peptide suggests a preference for length in the transactivation function.
The recruitment model for transcriptional activation stipulates that eukaryotic transcription is driven mainly by recruitment of preformed RNA polymerase holoenzyme to specific sites on the DNA. According to this model, peptides may bind one of numerous proteins that guide holoenzyme to the lexA binding site. Such proteins include components of holoenzyme and perhaps other factors [see ref. ]. Short peptides can bind proteins with reasonably high affinity (see ref. , for review). Because transactivation is a process that involves protein-protein interactions, it is not surprising that peptides linked to DNA-binding domains function as transactivators by bridging interactions that promote transcription.
The wide range of primary sequences and sequence features described to date among peptide activators supports the recruitment model. Given the diversity of proteins that comprise holoenzyme, this model implies that many sequences of different types would be competent activators; their sole requirement would be to bind proteins within holoenzyme. The acidic peptides may interact in a manner akin to native activators. But the basic ones may, for example, bind other proteins that contain acidic regions, thereby forming a bridge to holoenzyme. Alternatively, these basic peptides may bind different regions of the holoenzyme component(s) recognized by acidic peptides, or entirely different subunits.
Apart from scientific interest, these findings may have value in creating tools for drug and drug target discovery using the yeast two-hybrid system. Isolation of a panel of small peptides with different structural properties could improve the performance of the system by providing an option for multiple screens using a variety of peptides. This option might decrease false negatives and false positives if applied intelligently. For example, a series of positively charged, negatively charged and neutral peptides may enhance the accuracy of the method. In addition, short activating sequences may prove useful in the context of isolating peptides that interact with other proteins. Peptides that bind specific proteins of therapeutic interest in the yeast two-hybrid system have a variety of applications in drug development. These peptides could be used as leads for the creation of peptidomimetics. Alternatively, they might provide reagents for high-throughput displacement screens. Typically, such peptides are isolated from libraries displayed on the activation domain of the two-hybrid system. By substituting small peptides for the activation function, it may be possible to define binding sites with higher resolution in the context, for instance, of the reverse two-hybrid approach (Caponigro et al., unpublished). Small peptides active in the yeast two-hybrid system may be synthesized by Merrifield procedures, labeled, and used directly in screens for small molecules. This approach might avoid laborious testing of hybrid protein/peptides to determine which peptides bind independently of their fusion partner. Finally, non-peptidic molecules that bind in a sequence-specific manner to DNA have been fused to 20- and 16-residue peptides. These hybrid molecules activate transcription [9,31]. It may be possible to produce small-molecule activators (or repressors) that bind DNA specifically, but also activate transcription of adjacent genes through a non-peptidic, more drug-like moiety.
Peptide activators are recovered from random peptide libraries at high frequency. These peptides display significant sequence biases toward negative charge. However, basic activator peptides are also present in the set. These results are consistent with the recruitment model of transcriptional activation in eukaryotes and support the idea that small molecules directed at the transcription process may prove to be a useful therapeutic strategy.
Yeast strain yVT87 (EGY48 (Clontech) MATa, ura3, his3, trp1, leu2::lexA6op-LEU2) was transformed by the method of Gietz and Schiestl (Agatep, R., R.D. Kirkpatrick, D.L. Parchaliuk, R.A. Woods, and R.D. Gietz (1998) Transformation of Saccharomyces cerevisiae by the lithium acetate/single-stranded carrier DNA/polyethylene glycol (LiAc/ss-DNA/PEG) protocol. Technical Tips Online http://tto.trends.com.). Plasmids were maintained by growth in standard selective media.
pVT560 was constructed by first filling the BamHI and XhoI sites in pEG202 (Gyuris et al. (1993), Cell 75:791-803) with Klenow fragment. The resulting vector was digested with EcoRI and dephosphorylated with calf intestinal phosphatase (CIP, NEB). A PCR product with EcoRI ends, generated from pVT27 (Abedi et al., 1998) and containing a version of the GFP gene modified to have XhoI and BamHI sites located between the codons specifying N157 and K158, was then inserted. Lastly, an ~1 kb "stuffer" fragment containing most of the STE4 gene was inserted into the XhoI and BamHI sites to aid in purification of doubly digested pVT560.
The lexA-GFP/peptide library (lVT560) was created with oligos oVT335 (TCGAGAGTGCAGGT (NN (G/C/T))₁₅ GGAGCTTCTG), oVT336 (ACCTGCACTC), and oVT337 (GATCCAGAAGCTCC) in the manner described in Abedi et al. (1998). lVT560 comprises roughly 1.2 × 10⁷ independent clones, of which ~66% contain peptide-encoding inserts as judged by restriction digest analysis.
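The effect of the NN(G/C/T) codon design on stop codon frequency can be worked out by enumeration. The sketch below assumes, purely for illustration, that every permitted base is equally likely at each position and that the 15 codons are independent and read in frame; real oligonucleotide pools deviate from this. Function names are hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def nnb_codons():
    """All codons permitted by the NN(G/C/T) design (third base never A)."""
    return ["".join(bases) for bases in product("ACGT", "ACGT", "GCT")]


def stop_fraction_per_codon() -> float:
    """Fraction of permitted codons that are stop codons."""
    codons = nnb_codons()
    return sum(codon in STOP_CODONS for codon in codons) / len(codons)


def prob_insert_has_stop(n_codons: int = 15) -> float:
    """Chance that at least one of n_codons in-frame codons is a stop."""
    return 1 - (1 - stop_fraction_per_codon()) ** n_codons


print(len(nnb_codons()))                       # 48 permitted codons
print([c for c in nnb_codons() if c in STOP_CODONS])   # ['TAG']
print(f"{prob_insert_has_stop(15):.3f}")       # 0.271
```

Excluding A at the third position removes TAA and TGA, leaving TAG as the only stop codon among 48 permitted codons (versus 3 of 64 for fully random NNN). Under the stated assumptions about 27% of 15-codon inserts would still contain a stop, which is close to the rate quoted for the unselected library.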
The lexA-peptide library was created in a similar manner to the lexA-GFP/peptide library except that the sequence downstream of the random insert contained stop codons in all three reading frames. A total of 1.5 × 10⁶ independent clones (of which over 95% contain random peptides) were selected to identify transcriptionally active peptides.
pVT725 was created by replacing the ampicillin resistance cassette on pLexA (Clontech) with a kanamycin resistance gene (obtained from pGBKT7 (Clontech) by PCR) via homologous recombination in yeast.
The CAS peptide was transferred to pLexA as follows: Two complementary oligos which encoded the 11 amino acid CAS peptide (oVT2899: AATTCTGGAGCTTCTGGATCCAAGAATGGAATCAAAGTTAAG, and oVT2900: GGCCGCTTAACTTTGATTCCATTCTTGGATCCAGAAGCTCCAG) were annealed in PCR buffer and cloned using standard methods into pVT725 via EcoRI and NotI restriction sites. This resulted in the fusion of the CAS peptide onto the C-terminus of lexA.
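As a consistency check, the two oligonucleotide sequences given above can be translated and compared in silico. The sketch below uses the standard genetic code; the reading-frame offset of 5 (skipping the leading AATTC) is an assumption chosen because it reproduces the CAS sequence, and the function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate from offset until the first stop codon or the end."""
    protein = []
    for pos in range(offset, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def reverse_complement(dna: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


OVT2899 = "AATTCTGGAGCTTCTGGATCCAAGAATGGAATCAAAGTTAAG"
OVT2900 = "GGCCGCTTAACTTTGATTCCATTCTTGGATCCAGAAGCTCCAG"

# Assumed frame: codons start after the 5-base AATTC at the 5' end.
print(translate(OVT2899, offset=5))            # WSFWIQEWNQS

# The annealed duplex should pair everywhere except the 5' overhangs.
duplex = OVT2899[4:]
print(reverse_complement(OVT2900).startswith(duplex))   # True
print(OVT2899[:4], OVT2900[:4])                # AATT GGCC
```

The top strand encodes the 11 residues of CAS followed by a TAA stop codon, and the annealed pair leaves single-stranded 5' AATT and 5' GGCC ends, which match the cohesive ends produced by EcoRI and NotI.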
Transcriptionally active peptides were selected by transforming yVT87 with the appropriate peptide library and selecting for transformants on plates lacking histidine. Transformants were then pooled and transferred onto plates lacking histidine and leucine in order to select for peptide sequences capable of activating the lexA-operator-driven LEU2 reporter. DNA encoding peptides of interest was amplified by PCR using oligos flanking the insert and sequenced using an ABI 377 automated DNA sequencer (Applied Biosystems). For retesting, plasmids were isolated from yeast using the phenol/glass bead method and introduced into bacteria by electroporation. The plasmids were then isolated from bacteria using a commercial DNA miniprep kit (Qiagen) and re-transformed into yVT87.
We are grateful to C. Wang for help with computer analysis.