Protein Eng Des Sel. 2011 May; 24(5): 419–428.
Published online 2011 January 8. doi:  10.1093/protein/gzq120
PMCID: PMC3077810

Conversion of scFv peptide-binding specificity for crystal chaperone development


In spite of advances in protein expression and purification over the last decade, many proteins remain recalcitrant to structure determination by X-ray crystallography. One emerging tactic to obtain high-quality protein crystals for structure determination, particularly in the case of membrane proteins, involves co-crystallization with a protein-specific antibody fragment. Here, we report the development of new recombinant single-chain antibody fragments (scFv) capable of binding a specific epitope that can be introduced into internal loops of client proteins. The previously crystallized hexa-histidine-specific 3D5 scFv antibody was modified in the complementarity determining region and by random mutagenesis, in conjunction with phage display, to yield scFvs with new biochemical characteristics and binding specificity. Selected variants include those specific for the hexa-histidine peptide with increased expression, solubility (up to 16.6 mg/ml) and sub-micromolar affinity, and those with new specificity for the EE hexa-peptide (EYMPME) and nanomolar affinity. Complexes of one such chaperone with model proteins harboring either an internal or a terminal EE tag were isolated by gel filtration. The 3.1 Å resolution structure of this chaperone reveals a binding surface complementary to the EE peptide and a ~52 Å channel in the crystal lattice. Notably, in spite of 85% sequence identity, and nearly identical crystallization conditions, the engineered scFv crystallizes in a different space group than the parent 3D5 scFv, and utilizes two new crystal contacts. These engineered scFvs represent a new class of chaperones that may eliminate the need for de novo identification of candidate chaperones from large antibody libraries.

Keywords: antibody, binding affinity, co-crystallization, protein complex, protein engineering


Even though the numbers of protein databank entries continue to increase, numerous proteins are rejected from the pipeline leading to structure determination. Specifically, there is a need for strategies to overcome the crystallization limitation, especially for membrane proteins and proteins with inherent conformational variability. A number of strategies to improve the likelihood of growing crystals of so-called ‘difficult’ proteins have emerged over the last decade. Beyond improvements in recombinant expression and protein purification that enable more expansive crystallization trials, these techniques either involve modifying the protein to be crystallized in a way that improves its properties for crystallization, or introducing a second protein, a crystallization chaperone, to provide the crystal lattice. The former category includes random mutagenesis and homolog shuffling (Pedelacq et al., 2002; Yang et al., 2003; Keenan et al., 2005), limited proteolysis to generate a compact, stable protein entity (Wernimont and Edwards, 2009), the identification of ligands to optimally stabilize a particular conformation of the protein (Vedadi et al., 2006), modification of the protein surface to reduce entropy (Derewenda, 2004; Cooper et al., 2007; Goldschmidt et al., 2007) and protein symmetrization by cross-linking (Banatao et al., 2006), among others.

The chaperone category involves the formation of a specific complex between a client protein and a soluble protein that provides hydrophilic residues to form crystal contacts and thus increases the chances of growing well-ordered, highly diffracting crystals of the complex. Since the first report of a crystallization chaperone used to determine the HIV capsid protein structure (Prongay et al., 1990), efforts have focused on generating complexes between membrane proteins, which suffer from particularly unfavorable surface properties for crystal formation. Non-covalent complexes of target membrane proteins with tailored antibody fragments (Kovari et al., 1995; Ostermeier et al., 1995; Hunte et al., 2000; Zhou et al., 2001; Stura et al., 2002; Rasmussen et al., 2007; Uysal et al., 2009), affibodies (Warke and Monmany, 2007), VHH camelid domains (Tereshko et al., 2008) and designed ankyrin repeat proteins (DARPins, Huber et al., 2007; Sennhauser and Grutter, 2008; Milovnik et al., 2009) have been reported. In general, crystallization chaperones recognize native membrane protein sequence, and require identification of a new chaperone for each protein of interest. Fusion to or insertion of a chaperone into a flexible loop has also been described (Prive et al., 1994; Byrne et al., 2000; Hunte et al., 2000; Cherezov et al., 2007; Rasmussen et al., 2007; Rosenbaum et al., 2007). The location of the fusion protein is key, as long linkers confer flexibility typically detrimental to crystallization (Byrne et al., 2000). Ideally, the chaperone should not interfere with activity or function of the client protein of interest. Nevertheless, in principle, any stable soluble protein tethered to or with high affinity for the membrane protein of interest could be used in co-crystallization experiments.

Here, we describe the first steps in development of a generalizable approach to chaperoning crystal growth: antibody fragments that can be used as a co-crystallization chaperone for any protein in which a short peptide sequence, the EYMPME epitope (EE), is inserted. We selected the hexa-histidine-specific (His6) 3D5 single-chain antibody fragment (scFv) as the framework for protein engineering because it does not employ complementarity determining region (CDR) residues in major crystal contacts and the CDRs face a wide channel that could accommodate a client protein (Lindner et al., 1997; Muller et al., 1998; Kaufmann et al., 2002). We hypothesized that these CDRs could be modified to recognize a new peptide epitope, and a peptide or client protein could bind, without compromising existing crystal contacts. Nevertheless, 3D5 possesses several shortcomings that require optimization for use as a crystallization chaperone, including low affinity (Kd ~1 μM) for only extreme C-terminal histidines (Kaufmann et al., 2002), pH sensitive binding (Muller et al., 1998), relatively poor expression in Escherichia coli, and limited solubility (Kaufmann et al., 2002). Lastly, terminal His6 tags, which are commonly used for protein purification, are not always removed before crystallization and can degrade over time due to low-level protease contamination. These features limit the broader application of the His6 tag as a receptor for a crystallization chaperone and motivate conversion of 3D5 to new peptide specificity.
The EE peptide (EYMPME) was chosen for (i) its short length, (ii) the presence of a tyrosine to form the hydrophobic interactions and hydrogen bonds that commonly dominate protein–protein binding energetics (Fellouse et al., 2007), (iii) charged residues to form electrostatic interactions, (iv) the presence of a proline to restrict conformational diversity and (v) the availability of high affinity commercial antibodies binding this peptide (Grussenmeyer et al., 1985; Prickett et al., 1989) (Covance, Sigma). Indeed, our optimized scFv exhibits enhanced crystallization propensity, including elevated solubility, stability, affinity and the ability to bind internal peptide sequences. This engineered, peptide-binding scFv represents a new class of crystallization chaperones that may eliminate the need for de novo identification of candidate chaperones from large antibody libraries.

Materials and methods

Molecular biology and expression of proteins presenting EE and His6 peptides

Antibody binding sites (peptide sequences) were incorporated into proteins of interest via site-directed mutagenesis with mutagenic oligonucleotides (Integrated DNA Technologies). To generate a ligand with a C-terminal EE tag, maltose binding protein (MBP) was amplified from the E.coli genome, appended with an EE tag and cloned into the pAK400 vector (Krebber et al., 1997), with or without a stop codon before the vector-encoded His6 tag to generate the MBP-EE and MBP-EE-His6 ligands. The variant without a His6 tag was used during phage screening and panning; both variants as well as a third in which MBP was cloned directly into pAK400 (MBP-His6) were used in Western analysis. To account for steric accessibility, we generated ligands with varying numbers of internal EE tags by introducing tandem repeats of the EE coding sequence into the flexible linker connecting the heavy and light chains of unrelated scFv proteins. A single repeat was introduced into the DO11.10 scFv gene to generate scFv-EE1, two repeats into the 14B7 scFv gene to generate scFv-EE2 (Maynard et al., 2002) and three repeats into a non-native scFv consisting of the 3D5 light chain and 14B7 heavy chain to generate scFv-EE3. These proteins also contain C-terminal, vector-encoded His6 tags to facilitate purification. The original 14B7 scFv with only a C-terminal His6 tag was used as a hexa-histidine-tagged ligand. All ligand proteins were expressed from pAK400 in E.coli BL21 in 250 ml cultures of TB media, induced with 1 mM IPTG for 3–5 h at 25°C before cell harvesting and periplasmic fractionation via osmotic shock as previously described (Maynard et al., 2005). Recombinant antibody-based ligands were purified via sequential immobilized Ni2+ affinity chromatography and size exclusion chromatography (SEC), using HEPES-buffered saline (HBS; 10 mM HEPES, 150 mM NaCl, pH 7.4). 
The MBP-based ligands were purified using an amylose affinity column and eluted with maltose-containing buffer (200 mM HEPES, 150 mM NaCl, 10 mM maltose, pH 7.4), prior to SEC.

Library generation by CDR and random mutagenesis

The 3D5 gene was generated by total gene synthesis from published amino acid sequence (Genscript), including previously identified solubilizing mutations (referred to here as 3D5) (Kaufmann et al., 2002) and cloned via SfiI–SfiI ligation into the pMoPac24 phage display vector (Hayhurst et al., 2003), which introduces a C-myc tag at the C-terminus of the displayed protein. To identify clones with desired peptide specificity, two CDR libraries were generated by modified Kunkel mutagenesis under error-prone conditions: HCDR3 only and HCDR2 + HCDR3 simultaneously. Randomized codons were selected to retain anti-peptide binding capabilities as described previously (Cobaugh et al., 2008). For generation of EE-specific antibodies, amino acid sequencing by Edman degradation (ICMB Protein Microanalysis Facility, University of Texas at Austin) of proteolytic fragments of the commercially available GluGlu antibody (Covance) identified a candidate HCDR1 sequence. The HCDR2 loop included 13 amino acid modifications (theoretical size, 9 × 10¹⁴), while HCDR3 loop length was set at seven residues to reflect observed diversity in anti-peptide antibodies (theoretical diversity, 6.4 × 10⁸). The 3′ H2 oligonucleotide sequence is CACGGTGAGTGTGGCCCTSNNCTTVNHMNYSBNGTTATASNNSNNSNNSNNAYYSNNARDMYDAATSNNTCCGATCCACTCCAGACC, while the 3′ H3 oligonucleotide sequence is CCTTGACCCCAGTAATCCATAGCSNNSNNSNNGCTSNNSNNSNNSNNABNTGCACAGTAGTATACG.

A third library was generated to introduce random mutations into a pool of scFvs selected for EE-specificity. Here, 3D5 variants with desired epitope specificity were subjected to error-prone PCR with Mutazyme II DNA Polymerase (Stratagene) using flanking primers (5′ scback and 3′ scforlong, IDT; Krebber et al., 1997) and exponential amplification according to the manufacturer's instructions. Briefly, reaction mixtures were heated at 95°C for 4 min, followed by 25 cycles of incubation at 95°C for 30 s, 52°C for 30 s and 72°C for 1 min to introduce a predicted 3–4 mutations per 1000 bp. Gel-purified PCR products (Qiagen) were used in a modified Kunkel mutagenesis step to produce two libraries. Library mutation rate and diversity were assessed by plasmid DNA sequencing at the University of Texas Core Facility using primer 5′ pAKpel (Krebber et al., 1997) (IDT).

Selection and screening by phage display

M13 phage monovalently displaying scFv–gpIII fusions were prepared as previously described (Krebber et al., 1997). After precipitation with 1/4-volume PEG-2.5 M NaCl and resuspension in PBS, the phage concentration was quantified by absorbance: virions/ml = (A269 − A320) × 6 × 10¹⁶ / (number of bases per virion).
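The absorbance formula above can be written as a short function. This is a minimal sketch rather than the authors' own script: the function name, the example absorbance readings and the 6000-base genome length are hypothetical placeholders chosen only to illustrate the arithmetic.

```python
def phage_virions_per_ml(a269, a320, bases_per_virion):
    """Estimate phage concentration (virions/ml) from absorbance readings.

    Implements virions/ml = (A269 - A320) * 6e16 / (bases per virion),
    the formula quoted in the text.
    """
    if bases_per_virion <= 0:
        raise ValueError("bases_per_virion must be positive")
    return (a269 - a320) * 6e16 / bases_per_virion


# Hypothetical readings and genome length, for illustration only.
concentration = phage_virions_per_ml(a269=0.50, a320=0.05, bases_per_virion=6000)
print(f"{concentration:.2e} virions/ml")
```

The A320 term subtracts light scattering from the A269 reading; the genome length must be the packaged phagemid or phage genome actually used, which is not stated in this section.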

For panning, 10¹² plaque forming units (pfu) were added to blocked ELISA wells (Costar) coated with either anti-C-myc antibody (9E10, Sigma), His6 or EE presenting ligand. After equilibration and washing with PBS-0.05% Tween, bound phage were eluted with 0.1 M glycine–HCl pH 2.2, transferred to a new tube and neutralized with 2 M Tris, pH 7.0. Phage were then amplified in E.coli in preparation for the next panning round or used to infect E.coli and plated to isolate single clones. Panning involved two cycles, each consisting of three selection rounds: one with immobilized anti-c-myc antibody to enrich for full-length scFv and remove variants with primer-encoded stop codons or frameshifts, followed by two rounds with peptide-tagged ligands.

Individual phage clones were analyzed by phage ELISA to confirm enrichment of peptide-specific clones and to screen candidates for biophysical characterization. Phage from single clones were produced in 200 μl in sterile 96-well plates (Costar). The plates were centrifuged and supernatant transferred to coated and blocked ELISA wells (Costar). After washing with PBS-0.05% Tween, bound phage were detected by anti-M13-HRP (1:2500, GE Healthcare) with tetramethylbenzidine (TMB, Sigma) substrate and the resulting absorbance at 450 nm recorded. Binding of each clone to immobilized anti-C-myc antibody (1 μg/ml), peptide ligand (MBP-EE or 14B7-His6 at 4 μg/ml) and blocked wells (5% milk) was monitored. Clones with a high ratio of peptide ligand to anti-C-myc signal, indicating high peptide-binding specific activity, were further characterized. To rank the relative affinities of these high-activity variants, phage were produced from 100 ml cultures and the concentration of phage particles was quantified by absorbance prior to ELISA analysis. The phage concentration resulting in 50% of the maximum ELISA signal (EC50) was compared to select final candidates for expression and characterization as soluble scFv protein.
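The EC50 ranking described above can be approximated without curve fitting. The sketch below finds the concentration giving half of the maximum signal by log-linear interpolation between the two bracketing titration points. It is an assumed, simplified stand-in for whatever analysis was actually used; the function name and the titration values are hypothetical, and the signal is assumed to be background-subtracted and to rise with concentration.

```python
import math


def ec50_by_interpolation(concentrations, signals):
    """Concentration giving 50% of the maximum signal.

    Interpolates linearly in signal against log10(concentration)
    between the two data points that bracket the half-maximal signal.
    """
    pairs = sorted(zip(concentrations, signals))
    half_max = max(signal for _, signal in pairs) / 2.0
    for (c_lo, s_lo), (c_hi, s_hi) in zip(pairs, pairs[1:]):
        if s_lo <= half_max <= s_hi and s_hi != s_lo:
            fraction = (half_max - s_lo) / (s_hi - s_lo)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("half-maximal signal is not bracketed by the data")


# Hypothetical phage titration (virions/ml versus A450), for illustration only.
phage = [1e8, 1e9, 1e10, 1e11, 1e12]
a450 = [0.05, 0.2, 1.0, 1.8, 2.0]
print(f"EC50 ~ {ec50_by_interpolation(phage, a450):.1e} virions/ml")
```

A lower EC50 indicates that fewer phage particles are needed to reach half-maximal signal, which is how clones would be ranked against one another.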

To confirm conversion of specificity, phage displaying the 3D5/EE_48 or commercial anti-His-HRP (Invitrogen) were used to probe a western blot containing host proteins presenting the EE (MBP-EE, scFv-EE), His6 (scFv-His6) or both peptides (scFv-EE2). Phage displaying 3D5 were not used as divalent display is required, due to low affinity His6 recognition (Kaufmann et al., 2002). A 15% SDS–PAGE gel was loaded with 10 µl cellular lysate from cells expressing ligand, and electrophoresed prior to transfer to PVDF membrane. After blocking with 5% non-fat milk in PBS, the blot was incubated with 8 × 10¹⁰ virions/ml fresh 3D5/EE_48 scFv displaying phage for 1 h at room temperature, washed three times with HBS-0.05% Tween and incubated with anti-M13-HRP secondary antibody (1:5000). Signal was developed with SuperSignal West Extended Duration substrate (Pierce), and the resulting image captured on Kodak film. The blot was stripped with mild stripping buffer (200 mM glycine pH 2.2 with 0.1% w/v SDS and 1% Tween-20), re-blocked and re-probed with commercial anti-His-HRP (1:5000, Invitrogen) to detect His6-containing ligands. The blot was stripped a second time, blocked and probed with anti-MBP-HRP (1:2500, Invitrogen) to confirm similar loading between wells.

Chaperone protein expression, purification and complexation

The parent 3D5 and scFv genes selected from phage display experiments were subcloned into the SfiI–SfiI site of pAK400 for scFv expression (Krebber et al., 1997), or pMoPac54, to produce an scAb (an scFv appended with a human kappa constant domain as a convenient detection handle; Hayhurst et al., 2003). Protein was secreted into the bacterial periplasm of E.coli strain BL21, isolated by osmotic shock and purified by immobilized Ni2+ affinity chromatography and SEC using a Superdex S75 column (GE Healthcare), as previously described (Maynard et al., 2005). The Superdex S75 column was calibrated using a Low Molecular Mass gel filtration calibration kit (GE Healthcare).

Protein purity and size were characterized by SDS–PAGE under reducing or non-reducing conditions (Sambrook, 2001). Protein solubility was determined by concentrating the protein to ~20 mg/ml, incubating for 4 days at 4°C, centrifuging for 10 min at high speed to pellet insoluble particles and quantifying the concentration of protein remaining soluble. Stability was assessed as the mid-point for thermal unfolding, using a fluorescence assay (Lavinder et al., 2009). Purified protein (20 μl at 280 μg/ml) or buffer blank and Sypro Orange (1 μl of a 1:1000 dilution; Molecular Probes) were heated in a Real Time PCR instrument (7900HT Fast Real-Time PCR System; Applied Biosystems) from 20°C to 85°C in increments of 0.5°C and analyzed with SDS.2 (Applied Biosystems). The scFv monomer-to-dimer ratio was determined from SEC traces by calculating the area under the curve for each peak with Unicorn software (GE Healthcare). Protein concentration was assessed by micro-BCA assay with a BSA standard curve and buffer blank (Pierce). To facilitate direct comparisons, all 3D5 and variant characterization values reported here were determined with these methods and specific values may differ slightly from those previously reported (Kaufmann et al., 2002).
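One common way to extract an unfolding mid-point from a dye-based melt curve is to take the temperature of steepest fluorescence increase. The sketch below applies that approach to a synthetic curve sampled from 20°C to 85°C in 0.5°C steps. The source does not say how the instrument software computed the mid-point, so the maximum-slope criterion, the function name and the synthetic sigmoid are all assumptions made for illustration.

```python
import math


def melting_midpoint(temps, fluorescence):
    """Temperature at the steepest rise in fluorescence.

    Uses finite-difference slopes between adjacent readings and returns
    the centre of the interval with the largest slope.
    """
    if len(temps) != len(fluorescence) or len(temps) < 2:
        raise ValueError("need two equal-length series with at least two points")
    best_slope, best_temp = float("-inf"), None
    for i in range(len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_temp = (temps[i] + temps[i + 1]) / 2.0
    return best_temp


# Synthetic two-state melt curve centred at 60.25 C (hypothetical data).
temps = [20.0 + 0.5 * i for i in range(131)]  # 20.0 to 85.0 C
signal = [1.0 / (1.0 + math.exp(-(t - 60.25) / 2.0)) for t in temps]
print(f"Apparent Tm ~ {melting_midpoint(temps, signal):.2f} C")
```

With real data, the buffer-blank trace would normally be subtracted first, and the post-transition decline in dye fluorescence should be excluded so that it does not affect the slope search.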

Complex formation between 3D5/EE_48 and two ligands, scFv-EE3 and MBP-EE (see above), was assessed by SEC. Equimolar amounts of purified 3D5/EE_48 and either scFv-EE3 or MBP-EE (~1 μM each) were combined and allowed to incubate on ice for 6 min followed by separation using an analytical Superdex S75 column (GE Healthcare) equilibrated with 10 mM HEPES, 150 mM NaCl, pH 7.4. Fractions of interest were concentrated and characterized by SDS–PAGE. Control experiments applied the same quantity of each species alone as used in complexation experiments. The Superdex S75 column was calibrated using a Low Molecular Mass gel filtration calibration kit (GE Healthcare) supplemented with cross-linked albumin (Sigma).

Determination of chaperone-peptide binding affinity

Direct ELISA with purified scFv protein was performed in two orientations: scFv as an immobilized capture molecule or scAb detection of immobilized EE-tagged protein. For the former, wells were coated overnight at 4°C with 50 μg/ml scFv variant in PBS, prior to blocking with 5% milk in PBS. MBP-EE was serially diluted (1:2) from an initial concentration of 100 μg/ml, followed by washing and detection with anti-MBP-HRP (1:2500). For the inverse configuration, plates were coated with EE-tagged proteins (4 μg/ml) followed by 1:2 serial dilutions of scAb protein from 200 μg/ml. In this case, detection was achieved with anti-human-kappa-HRP (1:2500; Sigma) and TMB substrate. To assess the pH sensitivity of the binding interaction, ELISAs were performed in which the scFv–ligand interaction proceeds at pHs ranging from 6.0 to 8.0, in 0.5 increments. To rank the relative affinities, the EC50 concentrations were compared.
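The 1:2 serial dilutions described above are simple to tabulate before plating. The following sketch lists the concentrations in a twofold series; the function name and the choice of eight wells are hypothetical, since the number of dilution steps is not stated here.

```python
def twofold_series(start_conc, n_points):
    """Concentrations of a 1:2 serial dilution, starting at start_conc."""
    if n_points < 1:
        raise ValueError("n_points must be at least 1")
    return [start_conc / 2 ** i for i in range(n_points)]


# MBP-EE titration from 100 ug/ml; eight wells is an assumed plate layout.
for conc in twofold_series(100.0, 8):
    print(f"{conc:g} ug/ml")
```

The same helper applies to the inverse configuration by starting from 200 μg/ml of scAb protein.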

Kinetic binding assays were performed with proteins bearing C-terminal His6 or EE-tags and internal EE-tags to quantify scFv peptide selectivity using a BIAcore 3000 (GE Healthcare). Peptide-binding scFv or protein ligands were coupled to CM5 chips using NHS-EDC chemistry to a level of ~500 RU. The signal from a flow cell coupled with a control scFv (14B7-His6; Maynard et al., 2002) was used to correct for non-specific binding to the matrix, while control scFv injections corrected for changes in sample refractive index. Soluble protein ligands were injected in a duplicate dilution series from 8 to 0.1875 μM at a flow rate of 50 μl/min to minimize mass transport effects. The association rate constant (kon), dissociation rate constant (koff) and equilibrium dissociation constant (Kd; Kd = koff/kon) were calculated assuming a Langmuir 1:1 binding model with BIAevaluation software. Only data sets with χ² < 0.6 were used.
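The relationships underlying the 1:1 Langmuir model can be sketched in a few lines. The code below computes Kd from the rate constants, the equilibrium fraction of occupied sites, and the association-phase response for a given analyte concentration. It illustrates the standard model equations only and is not a reimplementation of BIAevaluation fitting; the function names and the example rate constants are hypothetical and are not taken from Table III.

```python
import math


def equilibrium_kd(k_on, k_off):
    """Equilibrium dissociation constant, Kd = koff / kon (M)."""
    return k_off / k_on


def fraction_bound(analyte_conc, kd):
    """Equilibrium occupancy of immobilized sites in a 1:1 model."""
    return analyte_conc / (analyte_conc + kd)


def association_response(t, analyte_conc, k_on, k_off, r_max):
    """SPR response during association for 1:1 Langmuir binding.

    R(t) = R_eq * (1 - exp(-(kon*C + koff) * t)), with R_eq = R_max * C / (C + Kd).
    """
    r_eq = r_max * fraction_bound(analyte_conc, equilibrium_kd(k_on, k_off))
    k_obs = k_on * analyte_conc + k_off
    return r_eq * (1.0 - math.exp(-k_obs * t))


# Hypothetical rate constants: kon in 1/(M s), koff in 1/s.
kd = equilibrium_kd(k_on=1e5, k_off=2e-2)
print(f"Kd = {kd * 1e9:.0f} nM")
print(f"Occupancy at 1 uM analyte: {fraction_bound(1e-6, kd):.2f}")
```

Because Kd is a ratio, two ligands with similar on-rates but different off-rates will differ in Kd in direct proportion to koff, which is the pattern the results later attribute to re-binding on multi-repeat ligands.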

Protein crystallization

3D5/EE_48 was crystallized by the sitting drop vapor diffusion method at 4°C. Conditions were optimized based on those reported for 3D5 (Kaufmann et al., 2002). One to two microliters of protein solution in HBS buffer at 3.8 mg/ml chilled on ice were mixed with 1 μl sample of reservoir solution containing 0.1 M Mes (pH 6.4), 0.1 M magnesium acetate and 20–24% (w/v) PEG 8000. Crystals of rectangular or triangular shape appeared in 4 days and grew to a maximal size of 40–60 µm within 4 weeks. The largest crystals grew when the reservoir to protein ratio was 1:1.33–1.66.

Data collection, structure determination and refinement

Crystals were harvested at 4°C and cryocooled using a solution consisting of 85.5% (v/v) reservoir solution and 14.5% (v/v) ethylene glycol. Crystallographic data were collected using a wavelength of 1 Å at the GM/CA-CAT beamline (Darien, IL) equipped with a 5 μm mini-beam setup. Data were processed with XDS (Kabsch, 1993) and Scala (CCP4, 1994). The structure of 3D5/EE_48 was solved by molecular replacement with Molrep (CCP4, 1994) using a polyalanine search model derived from parent 3D5 asymmetric unit (PDB ID 1KTR) from which all non-protein atoms and loop residues were removed. All four 3D5/EE_48 scFv monomers present in the asymmetric unit were identified from Molrep. The atomic model was fit to the respective electron density map using Coot (Emsley and Cowtan, 2004), and then iteratively refined with Refmac (CCP4, 1994). After several initial rounds of refinement using tight non-crystallographic symmetry restraints, refinement including Translation/Libration/Screw motions and medium non-crystallographic restraints was conducted. Of the 947 residues present in the asymmetric unit, 99.2% are in most favored and additional allowed regions of the Ramachandran plot. The final model has been deposited in the PDB (PDB ID 3NN8). Figures were generated using Pymol (The PyMOL Molecular Graphics System, Version 0.99rc6, Schrödinger, LLC). Electrostatic surface potentials were calculated using APBS (Baker et al., 2001) and visualized using Python Molecular Viewer 1.5.4 (Sanner, 1999). Computational peptide docking was conducted with ClusPro (Comeau et al., 2004).


Selection of 3D5 variants

We identified an scFv scaffold to use as a starting point for engineering peptide-binding chaperones by examining a family of structurally characterized antibody fragments binding small molecules (PDB IDs including 1KTR, 1MAJ, 2CJU, 1DLF, 2UUD, 1DSF, 1WZ1, 1N4X, 2G60) that share a highly conserved variable light chain (VL from the murine Vκ1 germline, >90% identity) and, if crystallized, a major crystal contact. One member of this family, the His6-specific 3D5 scFv, had previously been displayed on M13 bacteriophage (Lindner et al., 1997). We hypothesized that we could enhance and/or convert scFv peptide specificity while retaining the favorable crystallization characteristics of 3D5 through CDR and random mutagenesis, coupled with a phage display selection strategy in which peptide binding affinity, solubility, stability and expression level are used as proxy variables for crystallizability. Similar scaffolding approaches have been effective for antibody humanization and thermodynamic stabilization (Baca et al., 1997; Jung and Pluckthun, 1997).

To increase the versatility of our crystallization chaperones, we sought to identify variants with affinity for either the His6 or EE (sequence: EYMPME) hexa-peptides. The chemical diversity of the EE peptide would be expected to enhance binding interactions while the inclusion of a proline would limit conformational entropy (Reiersen and Rees, 2001). In order to engineer scFvs with the desired peptide specificity (His6 or EE), three libraries with randomized CDRs were generated by methods previously described (Cobaugh et al., 2008). Since the heavy chain (VH) typically dominates ligand interactions (Xu and Davis, 2000), VH CDRs 2 and 3 (HCDR2 and HCDR3) were randomized to convert peptide specificity while retaining the desirable crystallization properties of 3D5. The three libraries of scFv HCDR variants (actual library size ~10⁷ each) were monovalently displayed on the surface of M13 phage via fusion to coat protein gpIII, and scFv variants were selected for ligand binding specificity using a series of panning cycles. First, full-length scFvs, which present a C-terminal C-myc epitope, were enriched from prematurely truncated variants resulting from oligonucleotide-encoded stop codons via immunoprecipitation. Next, eluted phage were amplified in E.coli, and variants with desired peptide specificity selected via phage binding to an immobilized host protein presenting either the His6 or EE peptide. One host protein, the 14B7 scFv with a terminal His6 peptide (scFv-His6), was employed for selection of hexa-histidine-specific variants. Two ligand proteins were used to select for EE peptide binders: MBP with a single C-terminal EE tag (MBP-EE), and another scFv with two internal tandem EE tags to allow for steric accessibility within the Gly–Ser linkage between VH and VL immunoglobulin domains (scFv-EE2).

The amplification and selection procedure was repeated twice, using different immobilized host proteins during each cycle to ensure selection for peptide, as opposed to host protein, specificity. Next, weakly peptide-reactive phage were pooled and subjected to random mutagenesis to yield the libraries, one based on EE-specific scFv. Sequencing of 20 individual clones from each library comprising ~10⁷ unique members confirmed library diversity and the anticipated ~0.5% mutagenic rate. An additional three rounds of phage selection yielded the pool of EE peptide-specific scFv variants.

After screening several hundred clones by monoclonal phage ELISA followed by phage titration ELISA to rank clones by binding affinity, two His6 (denoted 3D5/His_#) and six EE-specific (denoted 3D5/EE_#) scFv variants with unique sequences were identified (Table I). Of these, two clones, 3D5/His_683 and 3D5/EE_48, provided the highest specific binding activity (measured as the ratio of peptide tag/anti-c-myc ELISA signal). Western blot analysis provided a clear verification of peptide specificity: 3D5/EE_48 displayed on phage bound host proteins with either internal or C-terminal EE peptides, but not those with only C-terminal His6 peptides (Fig. 1a). These scFv variants were then expressed and purified as soluble protein (Fig. 1b and c), and characterized for binding activity by ELISA and surface plasmon resonance (SPR; Fig. 2a–f) analysis and for enhanced biophysical properties (see below).

Table I.
Comparison of 3D5 scFv CDR H3 regions
Fig. 1.
Characterization of selected scFv variants. (a) Western blot detection of peptide tagged ligands by anti-His6-HRP, 3D5/EE_48 displayed on M13 phage and anti-MBP-HRP on the same blot. Lanes are 1, MBP; 2, MBP-EE; 3, MBP-EE-His6; 4, scFv-His6. (b) Size ...
Fig. 2.
Peptide-scFv binding kinetics. Top row, analysis of 3D5 and 3D5/His_683 binding affinity and specificity His6 by SPR. Binding partners were injected in duplicate in concentrations ranging from 8000 to 125 nM. (a) 3D5 scFv recognition of immobilized scFv-His ...

Characterization of 3D5/His variants

The selected variants 3D5/His_67 and 3D5/His_683 differ from each other and 3D5 in the HCDR2 (3D5/His_67) and HCDR3 (3D5/His_67 and 683). These variants harbor longer CDR3 lengths with several amino acid differences (Table I). In our hands, 3D5/His_683 expressed nearly 3-fold better in E.coli than 3D5 (8.5 versus 3.1 mg/l culture; Table II), exhibits enhanced scFv solubility (estimated as 16.6 versus 2.3 mg/ml, respectively) and modestly improved affinity (Kd 808 versus 4700 nM). At concentrations relevant to crystallization (~4 mg/ml), 3D5/His_683 elutes from a gel filtration column as a mixture of a monomer and dimer (Fig. 1c). In contrast, 3D5/His_67 expressed at lower levels, but exhibited similar affinity for His6 (Kd 760 nM). One of the two key residues in HCDR3 that stabilize the bound His6 in the 3D5 crystal structure, Glu93 or Ser96, is retained in each variant (Table I), yet these variants and others we tested all exhibited micromolar affinity for His6 (Kd 3–4 μM). Thus, even though these variants possess rather different HCDR3s than 3D5 and more favorable biochemical properties, their affinity for His6 is not substantially improved over that of 3D5. These results suggest that the His6 binding site is well organized for peptide binding, or that HCDR3 may contribute fewer productive interactions than expected. For these reasons, plus concerns regarding the utility of His6 as a peptide ligand, these clones were not pursued further.

Table II.
Biophysical characteristics of 3D5 scFv variants

Characterization of 3D5/EE_48

The lead 3D5/EE scFv candidate, 3D5/EE_48, retains 85% amino acid identity relative to 3D5. In addition to novel HCDR sequences (Table I), two key amino acid changes in the VH framework identified during random mutagenesis (E6Q and S74T) were instrumental in improving scFv expression and affinity. The impact of Glu versus Gln at position 6 has been previously described (Kipriyanov et al., 1997; de Haard et al., 1998).

The 3D5/EE_48 scFv displayed no detectable binding affinity for C-terminal His6 tags and instead is able to bind both C-terminal and internal EE-tags with similar affinities, Kd 389 and 212 nM, respectively (Figs 1a and 2c–e; Tables II and III). A terminal EE tag followed by a His tag was recognized with higher affinity than a naked EE tag (Kd 389 versus 767 nM), perhaps due to protease protection and reduced entropy with an additional C-terminal extension. Varying the pH in 0.5 increments from 6.0 to 8.0 or increasing the number of internal EE tag repeats from two to three had no detectable effect on affinity as measured by ELISA (data not shown) and SPR (Kd 25–30 nM; Table III). An increase in affinity was observed for ligands harboring multiple EE repeats versus single repeats, likely due to re-binding effects, as the measured on-rates are similar but the off-rates are slower (Table III). The use of 3D5/EE_48 displaying phage as detection reagents in ELISA (data not shown) and western blot (Fig. 1a) demonstrated specific binding of EE but not His6-tagged ligands. Expression levels of 3D5/EE_48 (2.1 mg/l culture) are similar to those observed for 3D5 (3.1 mg/l culture), but the solubility increased from 2.3 to >12 mg/ml. In addition, 3D5/EE_48 is initially purified as a predominantly monomeric species (~80% of total eluted protein; Table II) and retains this monomeric state when concentrated up to at least 3 mg/ml (Fig. 1c). This contrasts with the lower initial ratio of monomeric to dimeric protein (Table II) and slow conversion of purified monomer to dimer observed for 3D5 and 3D5/His_683 under similar conditions (Fig. 1c). The melting temperature of 3D5/EE_48 is almost identical to 3D5, indicating similar thermal stabilities (Table II). Overall, 3D5/EE_48 exhibits similar or enhanced biophysical properties when compared with 3D5 in terms of affinity, expression level, solubility, stability and homogeneity of oligomerization state.

Table III.
Characterization of 3D5/EE_48 scFv binding kinetics by SPR

Complexation of 3D5/EE_48 with EE-tagged proteins

The ability to isolate complexes of 3D5/EE_48 and client proteins expressing the EE tag was assessed next using SEC. Equimolar concentrations (~1 μM) of 3D5/EE_48 and the client protein were combined and fractionated using an analytical gel filtration column. Fractions corresponding to the eluted peaks were analyzed by SDS–PAGE and compared with control runs with isolated binding partners. One client protein, scFv-EE3 used originally for selections (see above), elutes as a dimer, and runs slightly higher than its expected molecular mass by SDS–PAGE, likely due to an extended conformation of the individual VH and VL domains within the monomer (Fig. 3a). Complexation with 3D5/EE_48 results in a single elution peak that corresponds to a molecular mass consistent with a heterotetramer, i.e. an scFv-EE3 dimer with two bound 3D5/EE_48 monomers (Fig. 3a). The second client protein tested was MBP-EE, which harbors only a C-terminal EE tag. Although MBP-EE by itself elutes as a monomer, complexation with 3D5/EE_48 yields two higher molecular weight complexes with molecular masses consistent with a heterodimer and heterotetramer (Fig. 3b). Given the lack of dimerization precedent for MBP, the heterotetramer could arise from a domain swapped 3D5/EE_48 dimer in which two distinct binding sites for MBP-EE are presented. Domain swapping has been proposed as a mode for 3D5 dimerization (Kaufmann et al., 2002).

Fig. 3.
Isolation of 3D5/EE_48 complexes with EE-tagged client proteins by SEC. (a) 3D5/EE_48 incubated with scFv-EE3 ligand. Elution peak 1 corresponds closely to the expected retention volume of a scFv-EE3 homodimer complexed with two 3D5/EE_48 molecules as ...

Structure of 3D5/EE_48

Crystals of 3D5/EE_48, grown under conditions used to crystallize 3D5 (Kaufmann et al., 2002), appeared within 4 days and continued to grow over several weeks. The structure of 3D5/EE_48 was solved by molecular replacement using a search model derived from the 3D5 coordinates (see Materials and methods, Table IV). Although the crystals were grown under similar conditions, and the proteins share a high level of sequence identity, the two scFvs do not crystallize in the same manner (Fig. 4). First, whereas 3D5 crystals belong to a trigonal space group (P3221), crystals of 3D5/EE_48 belong to a cubic space group (F23). The asymmetric unit of 3D5/EE_48 contains four molecules whereas 3D5 contains one VH–VL pair (Fig. 4a). In addition, in spite of the fact that no amino acid changes occurred in the major 3D5 crystal contact, the contact is not preserved in the 3D5/EE_48 lattice. Whereas the crystal lattice of 3D5 is built by alternating VH/VL subunits from neighboring molecules, that of 3D5/EE_48 relies primarily on HCDR residues from adjacent molecules (see arrows, Fig. 4b). The second largest contact in the 3D5 crystal lattice (305 Å2) has become the largest crystal contact (560 Å2) in the 3D5/EE_48 lattice with several additional hydrogen bonds formed at this interface (see shaded area, Fig. 4b). Finally, as a consequence of lattice changes, 3D5/EE_48 crystals consist of 66% solvent with a channel ~52 Å wide while 3D5 crystals consist of 77% solvent and a channel ~70 Å wide (Fig. 4c).
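
The solvent contents quoted above can be related to crystal packing density through the standard Matthews relation, solvent fraction ≈ 1 − 1.23/V_M, which assumes a typical protein partial specific volume of about 0.74 cm³/g. The Python sketch below back-calculates the implied Matthews coefficient V_M from the two solvent fractions. These V_M values are derived here for illustration and are not reported in the source; the function name is an assumption.

```python
def matthews_coefficient(solvent_fraction: float) -> float:
    """Return V_M (A^3/Da) implied by a solvent fraction via 1 - 1.23 / V_M.

    Assumes a typical protein partial specific volume (~0.74 cm^3/g), which is
    what the constant 1.23 encodes.
    """
    if not 0.0 <= solvent_fraction < 1.0:
        raise ValueError("solvent_fraction must be in [0, 1)")
    return 1.23 / (1.0 - solvent_fraction)


# Solvent contents as stated in the text.
for name, solvent in [("3D5/EE_48", 0.66), ("3D5", 0.77)]:
    print(f"{name}: {solvent:.0%} solvent -> V_M ~ "
          f"{matthews_coefficient(solvent):.2f} A^3/Da")
```

Both lattices are loosely packed by this measure, in keeping with the wide solvent channels described for each crystal form.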

Table IV.
Data collection and refinement statistics

Fig. 4.
Comparison of 3D5/EE_48 (top) and 3D5 (bottom) crystal lattices. (a) Asymmetric units. (b) Crystal contacts. The preserved contact common to both lattice networks shaded grey. New crystal contact comparison depicted in arrows. (c) Lattice structure with ...

The overall structure of the 3D5/EE_48 scFv remains very similar to that of the parent 3D5 (average rmsd ~0.55 Å for main chain atoms in VL and ~1 Å for VH domains); however, changes observed in the CDR regions reconfigure the peptide binding region to accommodate an EE-tag (Fig. 5). In the VL CDR1 loop (LCDR1) of 3D5/EE_48, slight movement in residues His 27d–Asn 30 may be influenced by the presence of neighboring Leu 93 in the VL CDR3 loop (LCDR3), instead of the corresponding His residue at this position in 3D5 (Fig. 5a). Another substitution in LCDR3 of 3D5/EE_48, the introduction of Pro 96 in place of the Phe at this position in 3D5, appears to open up the peptide-binding groove to accommodate longer peptides and, in particular, may allow internal peptides to be recognized (Fig. 5a). Compared with the LCDRs, the HCDRs are more divergent both in sequence and in structure. In 3D5/EE_48, the beta-hairpin in HCDR2 as a whole shifts closer to HCDR1. HCDR3 differs primarily in its longer length, which significantly alters the shape of the peptide-binding region when compared with 3D5. The binding surface near the interface of the heavy and light chains forms a pronounced tri-lobed hydrophobic pocket (Fig. 5b and c). The electrostatic surface potential reflects a charge distribution complementary to that of the peptide in this region (Fig. 5b).

Fig. 5.
Analysis of 3D5/EE_48 structure. (a) Superimposition of 3D5/EE_48 and 3D5 with CDRs labeled. Amino acid changes discussed in the text are represented as ball-and-stick. LCDR1, LCDR2 and LCDR3 indicate VL CDR loops 1, 2 and 3, respectively. HCDR1, HCDR2 ...

We turned to computational docking to model EE-peptide binding to 3D5/EE_48 (Fig. 5c) because no crystals of adequate size for structure determination containing both 3D5/EE_48 and EE-peptide have been grown to date, and soaking with the commercial EE-peptide (Covance) has not yielded crystals with bound peptide. The EE peptide is predicted to bind in an orientation in which the central proline (Pro 4) introduces a kink, allowing peptide residues Tyr 2 and Met 3 to reach into the hydrophobic binding pocket. In this working model, VH residue His 50 appears to stabilize peptide Tyr 2 through hydrophobic interactions, while VH residue Arg 95 forms key polar interactions with multiple peptide side chains (Tyr 2, Met 3, Glu 6). Peptide residues Glu 1 and Glu 6 stabilize this binding mode through surface electrostatic interactions, and hydrogen bonding interactions between the peptide backbone amide and carbonyl stabilize the peptide conformation. In the case of a terminal EE tag, the C-terminus may compete for the Glu 6 side chain interactions, and/or greater flexibility of the tag may destabilize the peptide backbone interactions. In this docked model, peptide residue Met 5 does not appear to be directly recognized by 3D5/EE_48. Notably, VL residues predicted to form interactions with the peptide are conserved between 3D5 and 3D5/EE_48, while VH residues contributing to peptide interactions, such as Arg 95, were altered during engineering, underscoring the role of the VH in peptide recognition.

Discussion


Crystallization chaperones are proposed to aid co-crystallization by several distinct mechanisms, including immobilizing flexible regions, concealing exposed hydrophobic regions and providing polar surfaces capable of forming lattice contacts (Zhou et al., 2001; Hunte and Michel, 2002). To date, most co-crystal structures have employed antibody fragments because the molecular requirements for ligand binding are well understood and their hypervariable regions can be modified to recognize nearly any epitope of interest (Chothia and Lesk, 1987; Cobaugh et al., 2008). Typically, antibodies that recognize specific epitopes on unmodified target proteins are identified through traditional hybridoma screening or library selection techniques (Rothlisberger et al., 2004; Huber et al., 2007; Uysal et al., 2009; Veesler et al., 2009), where there is minimal control over the epitope recognized. Moreover, the identification and optimization of a chaperone tailored to each client protein of interest is an expensive and time-consuming process. An attractive alternative is the use of commercially available purified monoclonal antibodies for common epitopes, such as anti-His antibodies. Unfortunately, the hybridomas secreting these antibodies are not available and the cost to purchase purified antibody sufficient for use in crystallization trials is prohibitive. Moreover, without the gene sequence available, the biophysical properties and format of the antibody (e.g. scFv, scAb, Fab) cannot be readily altered. Finally, even in the case where sequences are known and genes for the corresponding antibody fragments can be synthesized for recombinant expression, antibody fragments often express with relatively low yield in E. coli and lack suitable solubility and stability profiles.

The engineered scFv chaperone approach complements non-antibody formats that have been developed to allow modular recognition of a specific binding partner (e.g. DARPin, Vhh) (Huber et al., 2007; Warke and Monmany, 2007; Sennhauser and Grutter, 2008; Tereshko et al., 2008). Whereas these alternative frameworks express at very high levels (up to 200 mg/l in the bacterial cytoplasm for DARPins) and possess a stable structure, a potential disadvantage is their small size (~15 kDa), which limits the hydrophilic surface area available for generating protein–protein crystal contacts. In contrast, scFvs are nearly twice as large, and can be readily converted to a ~50 kDa Fab format to accommodate larger client proteins with a larger hydrophobic surface area.

Our engineered scFv chaperones, derived from the previously crystallized 3D5 scFv framework and binding short His6 or EE peptide sequences, overcome several of the aforementioned limitations of antibody fragments and represent a potentially generalizable solution to the production of high affinity protein complexes for crystallization of difficult proteins. We overcame the affinity, pH sensitivity and solubility limitations specific to 3D5 by employing a two-step protein engineering process of randomizing the HCDR 2 and 3, followed by random mutagenesis of the selected scFvs. This selection scheme does not directly select for the ability to crystallize, as there is no clear biophysical correlate of crystallization propensity, but can select for ‘well-behaved’ proteins, as evidenced by the increased expression levels, solubility and peptide binding affinity of our characterized variants. The initial library design focused on the HCDRs because these can be sufficient to confer high affinity and specificity (Rader et al., 1998; Sidhu and Weiss, 2004) and in the 3D5 family of antibody fragments, the VL domains are highly conserved.

After limited success in improving the biochemical characteristics of His6-specific scFvs, we converted 3D5 to EE epitope specificity. The 3D5/EE_48 scFv is expressed in high yield in E. coli, is highly soluble, is predominantly monomeric, and is readily crystallized. The affinity of 3D5/EE_48 for internal EE-tags (Kd 212 nM for single, 26 nM for multiple peptide insertions) likely reflects the combined effects of restricted conformational variability due to the presence of a proline in the EE-peptide, as well as the ability of the remaining peptide residues to participate in hydrogen bonding, electrostatic or hydrophobic interactions. Combined with the generally reduced entropic costs of binding an internal peptide, this scenario represents a desirable binding configuration for crystallization chaperone and tagged client protein. Indeed, complexes of 3D5/EE_48 with host proteins are sufficiently tight to withstand separation by SEC. In the context of a co-crystallization experiment, a modest 5 mg/ml concentration of a 30 kDa scFv chaperone protein equates to 170 μM, which is nearly 1000-fold above the measured equilibrium dissociation constant and will drive complex formation within the crystallization drop.
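
The concentration estimate in the preceding paragraph can be reproduced with a short calculation. The Python sketch below converts a mass concentration to a molar one and expresses it as a fold excess over the Kd values quoted in the text for single internal, terminal and multiple internal EE tags. The function name is an illustrative assumption, and the 30 kDa mass is the round figure used in the text rather than a sequence-derived value.

```python
def molar_concentration_uM(mg_per_ml: float, mass_kDa: float) -> float:
    """Convert a protein concentration from mg/ml to micromolar.

    mg/ml equals g/l, and kDa * 1000 gives g/mol, so the ratio is mol/l.
    """
    return mg_per_ml / (mass_kDa * 1000.0) * 1e6


chaperone_uM = molar_concentration_uM(5.0, 30.0)  # 5 mg/ml of a 30 kDa scFv
print(f"chaperone concentration: {chaperone_uM:.0f} uM")

# Kd values quoted in the text, in nM.
for label, kd_nM in [("single internal EE tag", 212.0),
                     ("terminal EE tag", 389.0),
                     ("multiple internal EE tags", 26.0)]:
    fold = chaperone_uM * 1000.0 / kd_nM
    print(f"{label}: {fold:.0f}-fold above Kd")
```

The result rounds to the ~170 μM stated in the text. The fold excess depends on which Kd is taken as the reference, and is several hundred-fold or more for every tag configuration measured.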

Unexpectedly, the crystal lattice of 3D5, whose open framework and limited use of CDRs in crystal contacts was an initial design criterion, was not preserved in 3D5/EE_48. Although the use of CDR residues in crystal contacts appears to render 3D5/EE_48 less than ideal for co-crystallization, none of the residues participating in the major crystal contact of 3D5 has in fact been altered. Thus, it should be possible for 3D5/EE_48 to revert to the 3D5 lattice framework when most CDR residues are participating in a complex and CDR-based crystal contacts are no longer accessible. We are optimistic that 3D5/EE_48 can promote crystallization of ‘difficult’ proteins, either by mediating formation of crystal contacts (as observed for KcsA; Zhou et al., 2001) or by immobilizing flexible loops (as observed for GPCRs; Rasmussen et al., 2007; Milovnik et al., 2009). Our current efforts are focused on further engineering of 3D5/EE_48 to render the CDR crystal contacts less favorable than those found in 3D5. We are also co-crystallizing 3D5/EE_48 with MBP and candidate membrane proteins in which the EE peptide has been installed into an accessible but functionally silent location. In the long term, we plan to extend our approach to generate 3D5-based scFvs or Fabs that recognize other peptide sequences, leading to a toolbox of peptide binding crystallization chaperones with homotypic crystal contacts that could be used to crystallize any protein of interest.

Accession numbers

The coordinates of the 3D5/EE_48 structure are deposited in the Protein Data Bank under PDB ID 3NN8.

Funding


This work was supported by the National Institutes of Health (grant number AI066239 to J.A.M.), the Packard Foundation (grant number 29098 to J.A.M.), the National Science Foundation (grant number 0845445 to R.L.L.) and the American Federation for Aging Research (R.L.L.). GM/CA-CAT has been funded in whole or in part with Federal funds from the National Cancer Institute (Y1-CO-1020) and the National Institute of General Medical Science (Y1-GM-1104). Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Basic Energy Sciences, Office of Science, under contract no. DE-AC02–06CH11357.

Acknowledgements


We thank Benjamin Roy for the scFv-EE1 construct.


Articles from Protein Engineering, Design and Selection are provided here courtesy of Oxford University Press