Non-antibody scaffold proteins are used for a range of applications, especially the assessment of protein–protein interactions within human cells. The search for a versatile, robust and biologically neutral scaffold previously led us to design STM (stefin A triple mutant), a scaffold derived from the intracellular protease inhibitor stefin A. Here, we describe five new STM-based scaffold proteins that contain modifications designed to further improve the versatility of our scaffold. In a step-by-step approach, we introduced restriction sites in the STM open reading frame that generated new peptide insertion sites in loop 1, loop 2 and the N-terminus of the scaffold protein. A second restriction site in ‘loop 2’ allows substitution of the native loop 2 sequence with alternative oligopeptides. None of the amino acid changes interfered significantly with the folding of the STM variants as assessed by circular dichroism spectroscopy. Of the five scaffold variants tested, one (stefin A quadruple mutant, SQM) was chosen as a versatile, stable scaffold. The insertion of epitope tags at varying positions showed that inserts into loop 1, attempted here for the first time, were generally well tolerated. However, N-terminal insertions of epitope tags in SQM had a detrimental effect on protein expression.
Although antibodies have proven to be useful tools in a broad range of applications, they also possess some inherent limitations. Owing to their complex structure and extensive glycosylation, antibodies are fragile molecules that give low yields in bacterial expression systems, are highly temperature sensitive and are expensive to produce. Their correct folding relies on the formation of intramolecular disulphide bonds and, as a consequence, they frequently do not function in the reducing environment inside eukaryotic cells (Skerra, 2007). Furthermore, it is often very challenging or impossible to raise antibodies that are able to distinguish between different isoforms or mutants of proteins, a much sought-after tool in clinical diagnostics.
It is therefore not surprising that for more than 14 years a great deal of effort has been invested in the design of non-antibody scaffold proteins that can present recognition surfaces that expand the immune repertoire. This has resulted in the description of over 40 artificial proteins that have been engineered to present peptides as recognition moieties (reviewed in Binz et al., 2005). Major classes of engineered scaffolds include those based on folds from proteins as diverse as protein Z (Affibodies, Nord et al., 1995), fibronectin (Adnectins, Koide et al., 1998), ankyrin repeat proteins (DARPins, Binz et al., 2003; Kohl et al., 2003), cysteine-knot miniproteins (Knottins, Clark et al., 2006; Kolmar, 2008) or Armadillo repeat proteins (Parmeggiani et al., 2008), as well as on full-length proteins such as lipocalins (Anticalins, Schlehuber and Skerra, 2005), ColE7 immunity protein (Im7, Juraja et al., 2006), the green fluorescent protein GFP (Abedi et al., 1998) and thioredoxin A (LaVallie et al., 1993). A major rationale of this work has been that conformational constraints enforced on inserted peptides by the scaffold protein will decrease the entropic cost of binding and lead to increased binding affinities (Ladner, 1995). A second motivation has been the idea that whereas free peptides are readily degraded by cellular proteases, full-length folded proteins are expected to be more stable (Saveanu et al., 2002; Reits et al., 2003). Similarly, antibodies that have evolved to function in the extracellular environment function poorly within eukaryotic cells. These considerations led Colas et al. to establish a yeast two-hybrid screening system to identify conformationally constrained peptides that would bind to target proteins within the intracellular environment of a eukaryotic (yeast) cell, coining the term ‘peptide aptamer’ to describe such tools (Colas et al., 1996; Colas, 2008).
In addition to their ability to function inside cells, peptide aptamers possess several attributes in common with other constrained peptides that distinguish them from antibodies, including small size (typically 10–15 kDa, compared with 150 kDa for IgG), lack of post-translational modification (IgG requires extensive glycosylation of the constant region for stability) and a single polypeptide chain (IgG comprises two copies each of a heavy and a light polypeptide chain). Nonetheless, peptide aptamers possess binding affinities that are comparable to those of most antibodies, in the pM–nM range (Cohen et al., 1998). Consequences of the simple design are a high protein yield in bacterial expression systems, increased thermal and chemical stability, and the ability to functionally fuse with reporter proteins (Kolmar and Skerra, 2008).
A major concern with any approach is that the introduction of a scaffold protein itself into cells may lead to phenotypes that may complicate or mask the effects of the displayed peptide. For example, thioredoxin was chosen because the use of a bacterial protein scaffold was thought unlikely to produce interactions with proteins in human cells. However, expression of bacterial thioredoxin protein is anti-apoptotic and is able to decrease ischemia in mouse models (Tao et al., 2004).
Accordingly, we have sought to produce a new scaffold for the presentation of constrained peptides that would be ‘biologically neutral’, i.e. would not interact with human proteins. After an initial investigation of several candidates, we found that human stefin A (SteA, also called cystatin A) could be engineered to possess all of the desirable characteristics (Woodman et al., 2005). Stefin A is a small (98 amino acid) monomeric protein inhibitor of the cystatin family I (stefins) that inhibits cysteine proteases of the cathepsin family (Turk et al., 1986). It interacts with its partner proteins, cathepsins B, C, H, L and S (Brzin et al., 1984; Green et al., 1984; Abrahamson et al., 1986) using three sites, with key contacts made by glycine at position 4, valine at position 48 and lysine at position 73 (Bode et al., 1988; Stubbs et al., 1990; Martin et al., 1994, 1995; Tate et al., 1995; Pavlova and Björk, 2002; Jenko et al., 2003). Biological neutrality was achieved through a combination of two rational mutations [G4W to abolish interaction with cathepsins (Estrada et al., 1998, 1999) and V48D to abolish interaction with cathepsins and reduce dimer formation through domain swapping (Japelj et al., 2004)] and a third mutation to introduce a unique RsrII restriction site at codons 71–73, which becomes the insertion site for oligonucleotides encoding short peptides. The use of this site allowed us to introduce peptides into the longest loop found in SteA, known as loop 2 (Bode et al., 1988; Stubbs et al., 1990; Martin et al., 1994, 1995; Tate et al., 1995; Pavlova and Björk, 2002). The engineered protein was designated STM, for stefin A triple mutant (Woodman et al., 2005).
However, despite the success of this scaffold in presenting certain peptides for interaction (Woodman et al., 2005; Estrela et al., 2008; Evans et al., 2008; Johnson et al., 2008; Shu et al., 2008; Davis et al., 2009), there was clear room for further improvement. Most importantly in our view, crystal structures of stefin A in complex with its target cathepsin revealed that the parental SteA protein uses three non-contiguous peptide surfaces, comprising the amino terminus, loop 1 and loop 2, to interact with its target (Bode et al., 1988; Stubbs et al., 1990; Martin et al., 1994, 1995; Tate et al., 1995; Pavlova and Björk, 2002; Jenko et al., 2003). We wished to make use of the whole of this ‘tripartite wedge’ for interaction with targets, so as to maximize both binding affinity and the specificity of interaction.
The goal of the present study is to assess the effect on the SteA protein of introducing step-by-step amino acid changes at specific locations, alone and in combination, that would result from the generation of restriction sites in the open reading frame to be used for oligonucleotide insertion. Having identified a number of potential changes that do not adversely affect the folding of the engineered protein as assessed by circular dichroism (CD) spectroscopy, we asked whether a series of model peptides could be presented by the newly engineered scaffold protein, a stefin A quadruple mutant (SQM). We found (i) that inserts into the amino terminal region that mediates most of the protein contacts in SteA co-crystals frequently disturb the secondary structure of the protein, as analysed by CD spectroscopy and protein expression yields in E.coli; (ii) that replacement of loop 2 in SQM may adversely affect protein yields compared with the longer inserts that result from insertion into loop 2 in STM; and most surprisingly (iii) that short inserts into both loops 1 and 2 simultaneously can increase the apparent stability of the protein. Together, these data suggest new ways of working with stefin A derived scaffold proteins and may shed light on determinants of folding and stability of the stefin A protein itself.
pET30a(+), for the expression of His6-tagged proteins in bacteria, was purchased from Novagen (Nottingham, UK). The SteA variant STM in pET30a(+) was used as published (Woodman et al., 2005). DNA manipulations followed standard protocols (Sambrook and Russell, 2001), using enzymes obtained from NEB (MA, USA). Oligonucleotides were from Invitrogen (Paisley, UK) and are listed in Supplementary Table B (Supplementary data are available at PEDS online). The genes encoding the STM variants SDM (stefin A double mutant), SQM, SUM (stefin A unique middle), SUN (stefin A unique N-terminus) and SUC (stefin A unique C-terminus) were synthesized and cloned into pET30a(+) by Genscript (Piscataway, NJ, USA). Site-directed mutagenesis was performed according to Fisher and Pei (1997). All DNA manipulations were confirmed by sequencing.
Double-stranded oligonucleotide cassettes flanked by the required restriction site overhangs and encoding a peptide tag (Supplementary Table A) were made by annealing oligonucleotides (Supplementary Table B). Digested dsDNA cassettes were ligated into the appropriate restriction sites of the scaffold-encoding open reading frame in pET30a(+).
pET30a(+) STM variants were transformed into the E.coli strain BL21(DE3) (Novagen, USA), which provides increased protein stability owing to its deficiency in the Lon and OmpT proteases (Shaw and Ingraham, 1967). The cells were grown to A600 = 0.6 in Luria–Bertani broth (Sambrook and Russell, 2001) at 37°C, and protein expression was induced with 0.4 mM IPTG. After 3 more hours of growth, the cells were harvested and resuspended in protein lysis buffer [50 mM Na2HPO4/NaH2PO4, 300 mM NaCl, supplemented with a complete protease inhibitor mix lacking EDTA (Roche)]. The cells were lysed using a cell disrupter (Constant Systems, Northants, UK). The His6-tagged proteins were captured on Ni-NTA columns (Qiagen) and, after extensive washing of the columns with lysis buffer and lysis buffer supplemented with 30 mM imidazole, eluted with 250 mM imidazole into a 50 mM sodium phosphate, 300 mM NaCl buffer.
Purified proteins were transferred into 50 mM sodium phosphate (pH 7.4) using buffer exchange columns (Amicon Ultra 10 kDa, Millipore), and their concentrations determined by measuring the absorbance at 280 nm (NanoDrop ND-8000, Thermo Scientific). CD spectra were recorded on a Jasco J715 spectropolarimeter at 10°C using a 300-µl quartz cuvette with d = 0.1 cm path length. Folding spectra were collected from 190 to 260 nm. The raw output is given in ellipticity [θ (mdeg)]. The data were normalized by calculating the mean residue ellipticity θ using the following equation:
$$[\theta]_{\lambda} = \frac{\theta \times M_r}{10 \times c \times d \times n}$$
where [θ]λ is the mean residue ellipticity (deg × cm2 × dmol−1), θ the observed ellipticity (in mdeg) at wavelength λ (in nm), Mr the molecular weight of the peptide (in g/mol), c the concentration (in mg/ml), d the path length (in cm) and n the number of residues. Three to eight spectra were taken for each STM variant and averaged, and the average spectrum for the buffer alone was subtracted to produce the final curves. The data were analysed using Microsoft® Excel® (version 12.5.1 for Mac OS) and visualized using Plot (version 0.997, Michael Wesemann, http://plot.micw.eu/).
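The processing described here (averaging replicate spectra, subtracting the averaged buffer spectrum and normalizing to mean residue ellipticity) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Excel workflow: it assumes the conventional form [θ]λ = θ × Mr / (10 × c × d × n), which matches the variables and units defined above, and the function names and the example numbers are hypothetical.

```python
from statistics import mean


def mean_residue_ellipticity(theta_mdeg, mr, conc_mg_per_ml, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to mean residue ellipticity
    (deg x cm^2 x dmol^-1), assuming [theta] = theta * Mr / (10 * c * d * n)."""
    return theta_mdeg * mr / (10.0 * conc_mg_per_ml * path_cm * n_residues)


def process_spectra(sample_scans, buffer_scans, mr, conc_mg_per_ml, path_cm, n_residues):
    """Average replicate scans, subtract the averaged buffer scan and normalize.

    Each scan is a dict mapping wavelength (nm) to ellipticity (mdeg);
    all scans are assumed to share the same wavelengths.
    """
    result = {}
    for wavelength in sorted(sample_scans[0]):
        sample_avg = mean(scan[wavelength] for scan in sample_scans)
        buffer_avg = mean(scan[wavelength] for scan in buffer_scans)
        corrected = sample_avg - buffer_avg
        result[wavelength] = mean_residue_ellipticity(
            corrected, mr, conc_mg_per_ml, path_cm, n_residues
        )
    return result


if __name__ == "__main__":
    # Hypothetical values for illustration only (not data from this study).
    scans = [{220: -12.0, 230: -6.0}, {220: -10.0, 230: -4.0}]
    buffer = [{220: -1.0, 230: -0.5}]
    for wl, mre in process_spectra(scans, buffer, 11000.0, 0.2, 0.1, 100).items():
        print(wl, round(mre, 1))
```

Keeping the unit conversion in its own function makes it easy to check a single wavelength by hand before processing a full 190 to 260 nm scan.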
Myc-tagged SQM peptide aptamers were immuno-precipitated using anti-Myc tag antibody-coated agarose resin (Abcam). About 10 µl of the resin was blocked in 250 µl of 50 mM sodium phosphate (pH 7.4) with 4% bovine serum albumin (BSA) and 0.1% Nonidet P-40 (NP-40, Calbiochem) for 1 h at 4°C. The resin was then washed three times in wash buffer (WB; 50 mM sodium phosphate with 0.05% BSA and 0.1% NP-40). Subsequently, the resin was incubated with 2 µg of purified peptide aptamer dissolved in 200 µl of WB for 2 h at 4°C, followed by seven washes in 200 µl of WB. Protein sample buffer was added (Laemmli, 1970), and the sample was boiled for 5 min and analysed by SDS–polyacrylamide gel electrophoresis followed by western blotting with an anti-S-tag monoclonal antibody (Novagen) targeting the S-tag added by the pET30a(+) vector to the amino-terminus of the peptide aptamer.
Microarray assays were performed with antibodies (Ab9106 Myc-tag polyclonal, Ab16918 HA tag monoclonal and Ab24620 AU1 tag monoclonal, from Abcam) labelled 1:1 with Atto dyes (Atto 550-NHS ester and Atto 647N-NHS ester). Labelling was confirmed using a NanoDrop spectrophotometer. All samples were spotted (BioOdyssey, Bio-Rad) on nickel-NTA His-tag affinity surfaces (Xenopore Corp) using 100 μm capillary pins. Samples were printed at concentrations of 10 µM in print buffer, which comprised PBST and 10% glycerol. Sample arrays were printed in repeats of four and the entire array was repeated three times across the slide. Spotted volumes were allowed to incubate on the slide for 60 min, prior to a blocking step in 1% BSA PBST for 60 min. Labelled antibodies were incubated at concentrations of 7–28 nM in 1% BSA PBST for 40 min using 50 µl volume Lifterslips (Thermo Scientific). Slides were washed twice in PBST for 10 min and then twice in deionized water for 2 min. The slides were dried under a stream of nitrogen and scanned at a resolution of 5 µm with 543 and 633 nm lasers under the Cy3 and Cy5 detection protocol (Scan Array Express, Perkin Elmer).
Although STM can be used to present peptide aptamers that interfere with biological processes (J.T-H. Yeh, R. Woodman, and P. Ko Ferrigno, unpublished), we found that screens of random peptide libraries created by insertion into loop 2 frequently led to the identification of truncated peptide aptamers. These were produced by oligonucleotides that encoded either in-frame or out-of-frame stop codons, leading to the loss of the C-terminal 23 residues of stefin A (Woodman et al., 2005). We reasoned that the selection of these ‘unconstrained’ peptides from libraries may result from an inability of the scaffold to tolerate or correctly present a wide range of peptide sequences inserted into loop 2. To circumvent this problem and to increase the available surface area used for interaction with targets, we set out to further engineer the scaffold protein.
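The truncations described above arise when an inserted oligonucleotide brings a stop codon into the reading frame, either directly or through a frameshift. As a hedged illustration of how library inserts could be screened in silico for this problem, the following Python sketch reports stop codons in each forward reading frame of an insert. It assumes that position 0 of the insert is in frame with the scaffold; the function names and the test sequences are hypothetical and are not taken from the libraries used here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_positions(insert_dna, frame=0):
    """Return 0-based nucleotide offsets of stop codons read in the given frame."""
    seq = insert_dna.upper()
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


def scan_insert(insert_dna):
    """Summarize stop codons per forward frame and whether the insert
    length preserves the downstream reading frame."""
    return {
        "stops_by_frame": {f: stop_codon_positions(insert_dna, f) for f in range(3)},
        "preserves_frame": len(insert_dna) % 3 == 0,
    }


if __name__ == "__main__":
    # Hypothetical inserts: the first carries an in-frame TAA, the second
    # has a TGA that is only read after a shift of the reading frame.
    for insert in ("GCTTAAGCT", "GCTGAGCT"):
        print(insert, scan_insert(insert))
```

An insert with a stop codon in frame 0, or one whose length is not a multiple of three, would be expected to yield a truncated, unconstrained peptide rather than a full-length peptide aptamer.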
The STM scaffold is shown schematically in Fig. 1A. We used a step-by-step approach, introducing successive mutations into the stefin A gene to generate new insertion sites for oligonucleotides, because it was unknown whether any of the planned modifications would be tolerated by the encoded protein. In doing so, we were able to examine the effect of single amino acid changes at different positions, as well as the effects of those changes in combination with others. First, we introduced mutations at unique sites in the wild-type stefin A gene.
These engineered proteins were designated SUN (encoded by the SteA open reading frame possessing an AvrII site at codon 4) and SUM (with an NheI site at codons 48–50 inclusive). Second, a double mutant was generated, designated SUC (two RsrII sites in codons 71–73 and 82–83 inclusive). Third, a triple mutant designed to allow insertions into loops 1 and 2 was designated SDM (with NheI at codons 48–50 and two RsrII sites at codons 71–73 and 82–83). Finally, we conceived a quadruple mutant, SQM that combines all of these changes in one open reading frame. The corresponding changes in the amino acid sequences are shown in Fig. 1B. SQM represents a version of the desired final scaffold including all modifications that would allow the insertion of short random peptides into the N-terminus and loop 1 as well as the replacement of loop 2. This combination is designed to provide a high versatility and binding affinity.
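When designing or checking such variants, it can be convenient to confirm which codons each engineered restriction site spans. The sketch below is an illustration only: it uses the standard recognition sequences of AvrII (CCTAGG), NheI (GCTAGC) and RsrII (CGGWCCG, where W is A or T), and the toy open reading frame in the example is hypothetical, not the SteA sequence. The function name `find_sites` is likewise an assumption made for this example.

```python
import re

# Recognition sequences written 5'->3'; all three are palindromic, so
# searching one strand is sufficient. W (A or T) is written as [AT].
SITES = {"AvrII": "CCTAGG", "NheI": "GCTAGC", "RsrII": "CGG[AT]CCG"}


def find_sites(orf):
    """Return {enzyme: [(first_codon, last_codon), ...]} using 1-based codon
    numbers, assuming the ORF string starts at the first base of codon 1."""
    orf = orf.upper()
    hits = {}
    for enzyme, pattern in SITES.items():
        spans = []
        # The lookahead allows overlapping occurrences to be reported.
        for match in re.finditer(f"(?=({pattern}))", orf):
            start = match.start()
            end = start + len(match.group(1)) - 1
            spans.append((start // 3 + 1, end // 3 + 1))
        hits[enzyme] = spans
    return hits


if __name__ == "__main__":
    toy_orf = "ATGGCACGGTCCGAAGCTAGCTAA"  # hypothetical sequence
    print(find_sites(toy_orf))
```

A check of this kind also confirms that each site occurs the intended number of times, for example twice for RsrII in a variant designed for loop 2 replacement.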
We used two methods to determine the effect of the amino acid changes on the scaffold structure. The first method is simply to determine the relative expression levels of the engineered proteins in E.coli, with the rationale that most amino acid changes are likely to destabilize the protein and decrease yields of soluble protein. The second method is to assess the proportion of secondary structure elements in each protein isoform by CD spectroscopy.
When comparing protein yields from E.coli, we measured the amount of purified stefin A variant that could be recovered on Ni-NTA affinity columns of a given volume and standardized this number to the volume of starting culture. We found that each individual change from stefin A either had little effect on normalized yields or occasionally led to increased yield compared with STM. Thus, in five preparations of STM, the average yield was 46 ± 16 mg of purified STM from 1 l of culture, whereas for SUN the average yield was 42 ± 14 mg/l of culture (n = 3), SUM 31 ± 8 mg/l of culture (n = 3) and SUC 53 ± 9 mg/l of culture (n = 3). This was true also when we combined two mutations in one protein, as the yield for SDM was 46 ± 7 mg/l of culture. When we made changes at all three sites simultaneously in SQM, the yield was 49 ± 11 mg of purified protein per litre of culture (average of seven preparations).
CD spectroscopy allows an assessment of the secondary structure elements (α-helix and β-sheet) present in a given protein, but it can neither readily distinguish between the two nor be used to predict secondary structure ab initio. However, we believe CD spectroscopy to be a useful tool for determining whether or not amino acid alterations or peptide insertions have perturbed the secondary structure of the well-characterized stefin A protein, as multiple spectra for the parental protein are available (Turk et al., 1983; Zerovnik et al., 1991; Pol et al., 1995; Jerala and Zerovnik, 1999; Kenig et al., 2001). In using CD spectroscopy, we wish simply to ask whether the proportion of secondary structure remains constant following each engineering step. A change in the shape of the CD spectrum plot would reflect changes in the α-helical and β-sheet content of the folded protein. CD spectra for the different STM variants were obtained between 200 and 260 nm (Fig. 2). All STM variants showed similar CD spectra with an inflexion point at around 219 nm. The CD spectrum of STM appears to be slightly shifted towards the far UV compared with the other variants. However, we also observed differences in the amplitude of the CD spectra, with SQM, SUM and SUN showing a deeper curve compared with STM (Fig. 2; see Discussion).
In generating SQM from STM, an additional RsrII site was introduced into the loop 2 encoding region of the STM open reading frame. As a consequence, the leucine at position 82 and threonine at 83 were changed to arginine and serine, respectively. The two RsrII sites allow the substitution of the native ‘loop 2’ sequence with a random oligopeptide. This was expected to be less disruptive compared with STM where the use of a single insertion site inevitably extends loop 2, but may also affect the way that peptide inserts are presented.
Different oligonucleotides were inserted into each open reading frame so as to present encoded peptides at loop 2 of both the STM and SQM scaffolds. Initially, we inserted oligonucleotides that encode short peptides (A48, A52 and A58, each 10 amino acids long; and A7, 17 amino acids long; Supplementary Table A) that bind to the POZ domain of BCL-6 when displayed by the thioredoxin A scaffold (Chattopadhyay et al., 2006). In the course of cloning, concatamerization of two oligonucleotides gave rise to a longer peptide, designated A52tandem, comprising 22 amino acids. Expression levels of these peptide aptamers in SQM were comparable to, or slightly exceeded, the expression levels obtained when using STM. Interestingly, one oligopeptide (A7) caused a deviation from the reference CD spectra of ‘empty’ STM (Fig. 3A) but not of empty SQM (Fig. 3B). The data are consistent with the lower yield of STM-A7 (23 mg protein/l of culture) compared with SQM-A7 (41 mg/l of culture) when expressed in E.coli and indicate that, at least in the case of A7, SQM may present peptides in loop 2 differently from STM.
We then turned to a model set of peptide inserts (pep2, pep6, pep10m: 20 amino acids; pep9: 42 amino acids; Supplementary Table A) that were initially identified by Colas et al. (1996) as being able to bind to human CDK2 when presented by the thioredoxin A scaffold. To our surprise, the yield of protein from bacterial cultures was now lower when the peptides were in SQM compared with STM (yields in STM and SQM, respectively: pep2, 19 and 9 mg/l; pep6, 36 and 2 mg/l; pep10m, 58 and 2 mg/l; pep9, 89 and 2 mg/l). The CD spectra collected for these peptide aptamers also showed a lower tolerance for insertions in ‘loop 2’ of SQM than in loop 2 of STM (Fig. 3C and D). These data confirm that peptides in loop 2 of SQM behave differently from peptides inserted in loop 2 of STM.
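To make the size of this effect explicit, the yields quoted above can be expressed as STM-to-SQM ratios. The short Python sketch below does only this arithmetic on the reported values; the variable names are illustrative, and the ratios for pep6, pep10m and pep9 should be read as approximate because the SQM yields are reported to a single significant figure.

```python
# Reported yields in mg of purified protein per litre of culture: (STM, SQM).
yields_mg_per_l = {
    "pep2": (19, 9),
    "pep6": (36, 2),
    "pep10m": (58, 2),
    "pep9": (89, 2),
}

# Fold decrease in yield on moving each insert from STM to SQM.
fold_change = {pep: stm / sqm for pep, (stm, sqm) in yields_mg_per_l.items()}

if __name__ == "__main__":
    for pep, ratio in fold_change.items():
        print(f"{pep}: {ratio:.1f}-fold lower yield in SQM than in STM")
```

On these figures, the decrease ranges from about 2-fold for pep2 to more than 40-fold for the 42 amino acid pep9 insert.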
We wished to make a preliminary assessment of whether SQM would be able to present peptides for interaction. Accordingly, we turned to our previous strategy (Woodman et al., 2005) of asking whether simple epitope tags would be recognized by their cognate antibodies when presented in the new sites created in SQM. We chose three peptide epitopes (AU1, HA and Myc tags) that differ in both length and physico-chemical characteristics (Supplementary Table A). These peptides were inserted singly or in various combinations into the positions we have designed in the SQM scaffold (the N-terminus, loop 1 or loop 2). Initially, we inserted the HA tag into the amino terminal site, the AU1 tag (the shortest peptide) into loop 1 (the shortest loop) and the Myc tag into loop 2. To our surprise, insertions of the HA tag into the amino terminus were only poorly tolerated, with yields of protein decreased more than 2.5-fold compared with the empty scaffold [SQM-HA(N): 19 mg of purified protein/l of culture, compared with SQM: 49 ± 11 mg/l of culture]. Similarly, Myc insertion into loop 2 resulted in a >5-fold decrease in protein yield in E.coli. In contrast, insertion of the AU1 tag into loop 1 did not destabilize SQM and may in fact increase yields (Table I).
These adverse effects of peptide insertion could be due to the insertion of ‘any’ peptide at each site, or to the insertion of a peptide with a given sequence at that site. Accordingly, we asked whether insertion of the Myc epitope peptide in the amino terminal site had similar effects to the insertion of the HA peptide. Myc insertion into the amino terminus also led to a decrease in yield, especially when comparing SQM with an AU1 tag in loop 1 to SQM with the Myc tag in the N-terminal site combined with the AU1 tag in loop 1 (Table I). In addition, the combination of the N-terminal HA insertion with any other insertion in loop 1 or 2 resulted in lower yields compared with similar peptide aptamers lacking the HA insert. Similarly, the Myc insertion at loop 2 uniformly destabilized all other peptide aptamer variants. The highest yield we obtained, over 200 mg of peptide aptamer per litre of culture, representing a 4-fold increase in yield over the empty scaffold, involved SQM with insertions of the AU1 tag at loop 1 and the HA tag at loop 2, with no insertion into the amino terminus. We conclude that although some peptides may be tolerated by the amino-terminus of SQM, this will not be a generally useful site.
We next asked whether the insertion of the epitope tags into the three sites as described above (Table I) affected the proportion of secondary structure in the resulting peptide aptamers. To our surprise, even those insertions that decreased protein yields did not appreciably disrupt the secondary structure of the resulting peptide aptamers (Fig. 4A). We did note that the presence of a peptide at the N-terminal site changes the shape of the curve, pushing the inflexion point from 218 towards 209 nm (Fig. 4B SQM-HA). In order to ask whether this may reflect a general effect of insertions at this site on the structure of the scaffold, we analysed the spectra for a range of SQM-derived peptide aptamers with inserts in loop 1 and/or loop 2, all with an insert at the amino terminus. We consistently found the secondary structure of these proteins to be more disrupted, with inflexion points at 209 nm, than the corresponding proteins lacking an insert at the N-terminal site (Fig. 4B).
Having determined that it is possible, within limits, to insert model peptides into each of the three positions, we wished to ask whether the inserted peptides were presented to solvent or were being incorporated into the secondary structure of an aberrantly folded scaffold. In order to do this, we wanted to know whether antibodies specific for each inserted epitope could recognize their target in the scaffold. First, we asked whether the AU1 peptide inserted into ‘loop 1’ could be recognized by an anti-AU1 antibody. To this end, we asked whether an anti-AU1 antibody could immuno-precipitate model peptide aptamers presenting the AU1 epitope tag in each site in turn. Indeed, the interaction was sufficiently tight that each peptide aptamer could be immuno-precipitated (data not shown). Next, we asked whether other epitope tags would be equally well recognized. To this end, we used scaffold proteins in which the Myc epitope peptide had been inserted in either the amino terminus, loop 1 or loop 2. We performed immuno-precipitation experiments with an anti-Myc antibody, to ask whether the peptide was displayed on the surface of the folded protein, and used western blotting to detect the scaffold protein. As before, the epitope tag was equally well recognized regardless of the insertion site and appeared not to be affected by insertions at other sites in the same scaffold (Fig. 5).
Because of the combinatorial nature of the SQM scaffold with three possible insertion sites, we have devised a microarray assay that allows us to look at many peptide aptamers simultaneously using cognate antibodies of epitope tags incorporated at several sites independently. Peptide aptamers with various combinations of epitope inserts were spotted and allowed to chelate onto glass slides coated with nickel-NTA in order to induce oriented immobilization by the His tag. Subsequently, surfaces were blocked using 1% BSA PBST buffer and then incubated with the fluorescently labelled antibody at a concentration of 7 nM. After thorough washing, the slides were scanned using a standard DNA microarray scanner (Fig. 6). Tag-less aptamers and print buffer control spots all gave the expected lack of signal. Peptide aptamers containing the Myc and HA tags gave strong signals. Peptide aptamer samples containing the AU1 tag consistently gave much weaker signals compared with their Myc and HA counterparts, and on occasion could not be seen. In order to increase AU1 signal strength, the antibody incubation concentration was increased to 28 nM, which resulted in identifiable spots. Since all our model peptide aptamers were spotted at the same concentration of 10 µM, the possibility that variations in concentration account for the signal variation can be ruled out. Furthermore, internal controls of samples containing all three tags, where by definition equimolar amounts of all three tags are present, showed the same systematic signal variations. This issue was independent of the insert position and is thus likely due to the short size of the AU1 tag leading to a lower binding affinity by the anti-AU1 antibody (see Discussion).
We have also analysed the expression profiles of 384 random peptides inserted at each of the three sites (Fig. 7). For this experiment, we grew small scale (1 ml) cultures of E.coli expressing the random peptide aptamers in 96-well plates, purified the peptide aptamers in high throughput (i.e. without optimizing expression or purification protocols for each well) and spotted an equal volume of each peptide aptamer in duplicate onto a glass microscope slide, creating a small microarray. We then probed the microarray with either of two fluorescently labelled antibodies that recognize the scaffold protein. After washing, the slides were analysed using a standard DNA microarray scanner. The signal intensity at each spot obtained with the antibody is proportional to the amount of peptide aptamer at each feature of the array. Because we do not normalize the peptide aptamers to a standard concentration prior to spotting, the signal intensity should be proportional to the amount of protein expressed in each of the 384 bacterial cultures. We found that insertions at loops 1 and 2 were generally well expressed, whereas only 42% of the peptide aptamers with amino-terminal inserts gave a signal-to-background ratio greater than 3 (Fig. 7 and Table II).
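The summary statistic used here, the fraction of spots whose signal-to-background ratio exceeds 3, is simple to compute from scanner output. The following Python sketch shows one way to do so, assuming each spot is exported as a (signal, local background) intensity pair; this data layout, the function name and the example intensities are assumptions for illustration and are not taken from the experiment.

```python
def fraction_above_threshold(spots, threshold=3.0):
    """Return the fraction of spots whose signal-to-background ratio
    is strictly greater than the threshold.

    `spots` is a list of (signal, background) intensity pairs;
    background values are assumed to be positive.
    """
    if not spots:
        raise ValueError("no spots supplied")
    passing = sum(1 for signal, background in spots if signal / background > threshold)
    return passing / len(spots)


if __name__ == "__main__":
    # Hypothetical intensities for four spots (arbitrary units).
    example_spots = [(900, 100), (250, 100), (310, 100), (120, 100)]
    print(f"{fraction_above_threshold(example_spots):.0%} of spots exceed a ratio of 3")
```

Applying the same function to the spots from each insertion site would allow the three sites to be compared on an equal footing.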
In this study, we aimed to improve the stefin A-based STM scaffold and broaden its versatility. Our analysis of six variants of a stefin A-based scaffold indicated structural similarity (Fig. 2). When considering the minimal changes in SUN, SUM, SUC and SDM, we find that the location of the major inflexion point at 219 nm is largely unaffected by each change, indicating that the proportion of secondary structure in stefin A derivatives is largely unchanged by the amino acid alterations. This is true even for SQM, the engineered variant that possesses a combination of all modifications. The slight shift in the CD spectrum of STM towards the far UV compared with all the other variants may be influenced by the STM-specific tryptophan at position 4. We conclude from these data that SQM is likely to be a suitable and versatile scaffold protein.
The insertion of 10 amino acid long peptides into loop 2 was equally well tolerated by SQM and STM. Only a 17 amino acid insert (A7) caused a deviation from the empty scaffold spectra in STM. This indicates that A7 may drive a structural change in the scaffold of STM and that the insertion of inserts longer than 10 amino acids into loop 2 may be better tolerated by SQM than STM. However, 20 amino acid long inserts affected the secondary structure of SQM more than that of STM and decreased the protein yield. This indicates that the two scaffolds are likely to be differentially affected by different peptide inserts, and thus will present the same peptides differently. It might also indicate a length limit of just below 20 amino acids for peptides that are tolerated in loop 2 of SQM. We conclude that SQM is able to present some but not all peptides in loop 2 without adverse effects on its secondary structure.
By asking whether SQM can present peptide inserts for interaction with target proteins, we observed that neither SQM nor STM is able to present thioredoxin-derived BCL-6 POZ domain binders (Chattopadhyay et al., 2006) for positive interaction in yeast two hybrid assays (data not shown). Previous attempts to shuffle peptides between scaffold proteins as diverse as thioredoxin, staphylococcal nuclease and GFP have suggested that this can be achieved approximately one-third of the time (Klevenz et al., 2002), and we interpret this result to indicate that STM and SQM present peptides differently than thioredoxin.
Insertion of epitope tags into the N-terminus, loop 1 and/or loop 2 had varying effects on protein yield and secondary structure. Most obviously, we observed that N-terminal insertions of short epitope tags were detrimental. Consistent with this, the analysis of a small-scale microarray of 384 spotted random peptide aptamers showed a detrimental effect of insertions at the N-terminus of SQM, resulting in lower protein yields. Analysis of the CD spectra broadly confirmed these conclusions, showing that epitope peptide insertions at the N-terminal site caused drastic alterations of the secondary structure. This effect could be due to the insertion at this site, or to the nature of the inserted peptide, which was the HA or Myc tag in all cases. We return to this question below. Decreases in protein stability following peptide insertion could be the result of sequence-specific effects of the peptide, or the result of generic effects of peptide insertion on protein stability, or a combination of the two. The expression yields of the scaffold proteins presenting various combinations of epitope tag peptides, e.g. the effects of HA insertion at the amino terminus, suggest that a combination of the two is at work. Increased protein yields after insertion of the AU1 tag at loop 1 and the HA tag at loop 2 might indicate that double insertions may sometimes be beneficial. However, this effect cannot be generalized, as replacement of the HA tag with a Myc tag decreased yield almost 15-fold (Table I).
Taken together, the protein expression data show that SQM is able to present peptides from three sites: the amino terminus, loop 1 and loop 2. Of these, the new loop 1 site appears to be most broadly useful, although loop 2 may be able to present peptides of 10–17 amino acids in length. We speculate that many (but not all, e.g. A52tandem) longer inserts may affect the formation of the following β-strand and hence destabilize the scaffold. Interestingly, most insertions into the amino terminus greatly affect the secondary structure of the scaffold, rendering use of this site in first-round screens of randomized libraries impractical. We conclude that, although peptides can be presented by insertions at the amino-terminal site, these insertions are frequently detrimental to the stability of the scaffold, confirming our conclusions from CD spectra of model peptides. Accordingly, we do not propose to screen libraries of random peptides inserted in the amino terminus. We do, however, expect to be able to use this site to improve peptide aptamers that interact with the target protein through loop 1 and/or loop 2, by engaging a greater surface area and thus creating binders with greater affinity and specificity.
Epitope tags that were presented by SQM on a microarray were specifically recognized by antibodies. However, only weak signals were achieved with anti-AU1 antibody. The lower AU1 signal may be due either to the short length of the AU1 peptide sequence in the epitope loop, with steric hindrance preventing strong association with the antibody, or to the inherently low affinity of the anti-AU1 antibody compared with the anti-Myc and the anti-HA antibodies. Immunoblotting experiments, in which equal amounts of protein were analysed and the structure of the scaffold was destroyed during SDS–PAGE, always gave a stronger signal for the anti-Myc antibody than for the anti-AU1 antibody (data not shown). This is clear evidence that the affinity of the anti-AU1 antibody is lower than the affinity of the anti-Myc antibody, and this is the most likely explanation for the lower signal for the AU1 tag in the microarray experiments (Fig. 6). On the basis of these experiments, it can be concluded that all epitope tags show selective binding to their corresponding conjugate antibody, though variations in binding affinity exist. Future experiments will determine whether the three insertion surfaces can be simultaneously recognized by three proteins.
In summary, the results described above demonstrate (i) that the stefin A-derived scaffold is amenable to engineering in multiple locations, with each change alone or in combination being well tolerated, (ii) that loop 1 of SQM, utilized here for the first time, appears to be a valid site for peptide insertion, and (iii) that any detrimental effects of the mutations or of insertions are apparently magnified by insertions in the amino terminus. Therefore, the N-terminal insertion site cannot be used routinely. However, some N-terminal inserts are tolerated, which will allow us to use this site to improve the binding affinity and specificity of peptide aptamers in SQM-loop 1 and/or loop 2. SQM appears to be more tolerant than STM of short-sequence insertions into loop 2 (10 amino acids), but longer inserts (>17–42 amino acids) often affect the secondary structure. We conclude that the SQM scaffold is a valid new SteA-based scaffold protein that should allow the selection of peptide aptamers with an increased binding affinity and higher target selectivity than our previous scaffold, STM.
This work was supported by BBSRC Grant BB/F011296/1 to J.J.D. and P.K.F. T.H. is supported by a grant from the Leukaemia Research Fund (to S.D.W. and P.K.F.) and L.K.J.S. acknowledges support from AstraZeneca (to P.K.F.). A.T.B. was supported by a White Rose Health Innovation Partnership Proof of Concept award (to P.K.F.).
P.K.F. would like to thank Stephanie Carter, Sophie Laurenson, Sharon Tate, Robbie Woodman and Johannes Yeh for early work on STM in the MRC Cancer Cell Unit. Circular dichroism spectroscopy was performed at the Astbury Center for Structural Molecular Biology (Leeds University). We would also like to thank Keith Ainley (from Prof. Sheena Radford's group) for introducing us to the spectropolarimeter and giving helpful advice on using the equipment. Finally, we thank E. Gendra for her critical reading of the manuscript.
Edited by Dek Woolfson