Conceived and designed the experiments: AS PJL MM ELP AMS KCG MB. Performed the experiments: AS PJL MM ELP AMS. Analyzed the data: AS PJL MM ELP AMS. Contributed reagents/materials/analysis tools: AS PJL MM ELP AMS KCG. Wrote the paper: AS MB.
We introduce a new method for purifying recombinant proteins expressed in bacteria using a highly specific, inducible, self-cleaving protease tag. This tag is comprised of the Vibrio cholerae MARTX toxin cysteine protease domain (CPD), an autoprocessing enzyme that cleaves exclusively after a leucine residue within the target protein-CPD junction. Importantly, V. cholerae CPD is specifically activated by inositol hexakisphosphate (InsP6), a eukaryotic-specific small molecule that is absent from the bacterial cytosol. As a result, when His6-tagged CPD is fused to the C-terminus of target proteins and expressed in Escherichia coli, the full-length fusion protein can be purified from bacterial lysates using metal ion affinity chromatography. Subsequent addition of InsP6 to the immobilized fusion protein induces CPD-mediated cleavage at the target protein-CPD junction, releasing untagged target protein into the supernatant. This method condenses affinity chromatography and fusion tag cleavage into a single step, obviating the need for exogenous protease addition to remove the fusion tag(s) and increasing the efficiency of tag separation. Furthermore, in addition to being timesaving, versatile, and inexpensive, our results indicate that the CPD purification system can enhance the expression, integrity, and solubility of intractable proteins from diverse organisms.
The availability of simple, reliable, and cost-effective methods for recombinant protein purification is critical for the work of high-throughput structural and proteomic centers and many individual researchers alike. While the addition of affinity tags such as poly-His and glutathione transferase (GST) to target proteins has greatly simplified purification strategies, it is often difficult to obtain soluble recombinant protein. As a result, intractable affinity-tagged target proteins are often fused to small proteins such as NusA and SUMO to improve their solubility, expression, and stability.
Since these tags can alter the biological activity of target proteins and interfere with protein crystallization studies, many biological and biomedical applications require that the tag be removed from the target protein. Most commonly used methods involve the addition of exogenous site-specific proteases to cleave the affinity tag off the target protein at engineered sites. Unfortunately, high levels of endoprotease must often be applied for extended periods of time, and this can result in undesirable cleavages within the target protein. Furthermore, these endoproteases are costly, often exhibit poor solubility, and require the inclusion of additional chromatography steps to remove the exogenous protease.
To circumvent these disadvantages, we have developed an on-bead cleavage purification system in which a site-specific affinity-tagged protease is fused directly to the target protein. This approach condenses affinity purification, cleavage, and tag separation into a single step, simplifying protein purification procedures and increasing purification yields. The key element of this purification method is the Vibrio cholerae MARTX toxin cysteine protease domain (CPD). The CPD exhibits several properties that facilitate its development into an inducible, autocleaving protease tag. First, the CPD is a highly specific protease that cleaves exclusively after Leu residues. Second, the CPD is inducible, as it is specifically activated by the eukaryotic-specific small molecule inositol hexakisphosphate (InsP6). Since InsP6 is absent from bacterial cells, full-length CPD-His6 fusion proteins can be purified from bacterial lysates in a protease-inactive form using immobilized metal ion affinity chromatography (IMAC). Addition of InsP6 to an immobilized, C-terminally His6-tagged fusion protein induces autoprocessing at the P1 Leu cleavage site (P1 refers to the residue N-terminal to the scissile bond), which is located at the target protein-CPD junction (Figure 1). This processing event releases the untagged target protein into the supernatant, while the C-terminally His6-tagged CPD remains immobilized on the Ni2+-NTA resin. Third, as an autoprocessing enzyme, the CPD exhibits poor transcleavage efficiency. This property should limit fusion protein cleavage to the CPD-target protein junction and permit the high-fidelity removal of the His6-CPD tag from the target protein.
In this report, we demonstrate using a variety of target proteins that this novel purification system combines the simplicity of one-step purification systems with many of the advantages of affinity tags, in that it can increase the expression, solubility, and integrity of target proteins. Thus, this method facilitates the rapid purification of both soluble and intractable, recombinant, untagged proteins, suggesting that it will have widespread utility in individual research labs and high-throughput structural and proteomic centers.
In order to produce CPD fusion proteins, we first constructed CPD expression vectors (pET-CPD expression vectors) using the pET expression vector backbone. DNA encoding the CPD was cloned into the SalI restriction site (Figure 2) such that the fusion protein produced upon IPTG induction of E. coli harboring the pET-CPDSalI vector carries the P2-P1 residues of the native CPD (Ala-Leu, respectively) and the P4-P3 residues encoded by the SalI site (Val-Asp, respectively) (Figures 1 and 2). The P1 residue refers to the amino acid N-terminal to the scissile bond, while the residue N-terminally adjacent to the P1 residue is termed P2, and so on. When InsP6 is added to induce CPD-mediated autocleavage of the fusion protein, the untagged target protein is released from the resin and carries four additional C-terminal residues (Val-Asp-Ala-Leu); the His6-tagged CPD remains bound to the resin (Figure 1). The Val-Asp-Ala-Leu C-terminal addition can be reduced to two amino acids (Glu-Leu) by cloning into the SacI site, or to a single amino acid (Leu) by cloning into the BamHI site and adding a Leu codon to the 3′ cloning primer (Figure 2).
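The relationship between cloning site and the residues left on the released target protein can be sketched in a few lines of Python. This is an illustration only, not part of the published method: the codon table is deliberately minimal, the function names are hypothetical, and the reading frame of each restriction site is assumed to be the one that yields the residues stated in the text (GTCGAC for SalI reading as Val-Asp, GAGCTC for SacI reading as Glu-Leu).

```python
# Minimal codon table: only the codons needed for the two sites below.
CODON_TABLE = {
    "GTC": "V", "GAC": "D",  # SalI recognition site, GTCGAC
    "GAG": "E", "CTC": "L",  # SacI recognition site, GAGCTC
}


def translate(dna):
    """Translate an in-frame DNA string using the minimal table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Residues remaining on the C-terminus of the target after cleavage,
# following the description in the text.
C_TERMINAL_ADDITION = {
    # P4-P3 from the SalI site plus the native CPD P2-P1 (Ala-Leu)
    "SalI": translate("GTCGAC") + "AL",
    # Assumed frame: the SacI site itself reads as Glu-Leu
    "SacI": translate("GAGCTC"),
    # Leu codon supplied by the 3' cloning primer
    "BamHI-Leu": "L",
}


def released_protein(target_seq, site):
    """Sequence of the untagged protein released after InsP6-induced cleavage."""
    return target_seq + C_TERMINAL_ADDITION[site]


if __name__ == "__main__":
    toy_target = "MKT"  # hypothetical three-residue target, for illustration
    for site in ("SalI", "SacI", "BamHI-Leu"):
        print(site, released_protein(toy_target, site))
```

In every case the released protein ends in Leu, because cleavage occurs after the P1 Leu; the vectors differ only in how many residues precede it.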
To demonstrate the feasibility of this system, we first expressed and purified green fluorescent protein (GFP) as a fusion to CPD-His6 using IMAC. As anticipated, addition of increasing amounts of InsP6 stimulated the release of GFP from the Ni2+-NTA resin in a dose-dependent manner (Figures 3A and B), while the His6-tagged CPD remained bound to the Ni2+-NTA agarose beads (bead eluate, Figure 3A).
We have previously shown that V. cholerae CPD is positioned to undergo autocleavage at a proximal N-terminal leucine and that it exhibits significantly reduced transcleavage efficiency, which should limit its ability to cleave target proteins at heterologous sites. Indeed, mutation of the P1 Leu to an Ile residue is sufficient to prevent CPD-mediated transcleavage, a finding that is explained by the observation that the P1 Leu residue fits snugly into the S1 substrate binding pocket in the crystal structure of the P1 Leu aza-epoxide inhibitor modified V. cholerae CPD. Nevertheless, since other site-specific proteases used to remove fusion tags have been observed to cleave target proteins at secondary sites, we examined whether the CPD would spuriously cleave target proteins. Specifically, we tested whether the CPD would cleave an intrinsically disordered protein after Leu residues within the target protein. We used the intracellular domain (ICD) of the cytokine receptor gp130 as a test substrate, since it is unstructured in solution by NMR and contains multiple Leu residues that might serve as cleavage substrates. The ICD-CPD-His6 fusion protein was expressed and purified from E. coli lysates using IMAC, and CPD-mediated cleavage of the immobilized fusion protein was activated by InsP6 addition. As shown in Figure 4, autoprocessing occurred exclusively at the ICD-CPD interdomain junction, with a single protein equivalent to the size of His6-tagged ICD being released into the supernatant fraction. These results strongly suggest that the CPD will not promiscuously cleave target proteins.
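The logic of this specificity test can be illustrated with a short Python sketch that contrasts the fragments expected from junction-only cleavage with those expected if every Leu in the target were used as a P1 site. The sequences below are made-up toy strings, not the gp130 ICD or the CPD, and the function names are hypothetical.

```python
def leu_positions(seq):
    """1-based positions of Leu residues, i.e. candidate P1 sites."""
    return [i + 1 for i, aa in enumerate(seq) if aa == "L"]


def fragments_after(seq, positions):
    """Split seq after each 1-based position (cleavage C-terminal to P1)."""
    cuts = [0] + sorted(positions) + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:]) if a < b]


# Toy fusion: hypothetical target ending in Val-Asp-Ala-Leu, then a
# hypothetical Leu-free stand-in for the His6-tagged protease domain.
target = "MSLKTELGGSVDAL"
tag = "GGKHHHHHH"
fusion = target + tag

junction = len(target)  # the P1 Leu at the target-tag junction

# Junction-only cleavage: one released product plus the tag.
specific = fragments_after(fusion, [junction])

# Promiscuous cleavage after every Leu in the target: several fragments.
promiscuous = fragments_after(fusion, leu_positions(target))

print(specific)     # two fragments
print(promiscuous)  # four fragments
```

A single released band of the expected size, as reported for the ICD fusion, corresponds to the junction-only outcome; additional smaller bands would be expected if internal Leu residues were also cleaved.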
We noticed that the expression of the ICD-CPD-His6 fusion protein was at least three-fold higher than the ICD-His6 protein in E. coli lysates (Figure 4, compare + lanes). This result suggested that the CPD might generally enhance target protein expression and/or solubility levels. To test this hypothesis, we compared the expression and solubility of CPD fusions to several other target proteins carrying a His6-tag and/or a GST-fusion tag (Figures 5–8 and Table 1). In all cases, the presence of the CPD-His6 fusion tag increased the expression and solubility of target proteins. For example, fusion of the CPD-His6 tag to biotin ligase (BirA) from E. coli (BirA-CPD-His6) raised BirA expression levels by three-fold over the GST-BirA construct (Figure 5 and Table 1).
The CPD purification system also enhanced the expression and purity of a previously uncharacterized SUMO/Sentrin-specific peptidase 1 (SENP1) from the parasitic pathogen Plasmodium falciparum, the causative agent of malaria (Figure 6). Although PfSENP1 carrying an N-terminal His6-tag can be readily expressed and purified from E. coli, a number of contaminating bands are present, and the N-terminal His6-tag must be removed by the addition of thrombin followed by multiple chromatography steps (Table 2). In contrast, when PfSENP1 is expressed as a fusion to CPD-His6 and released as untagged PfSENP1 upon InsP6 addition, only one minor contaminant co-purifies with PfSENP1 (Figure 6B). This variant is easily removed using gel filtration chromatography (Figure 6C), and the untagged PfSENP1 is of sufficient purity that we have used it to obtain diffraction-quality crystals (E. Ponder, unpublished results). Notably, although the heterologous expression of P. falciparum proteins in E. coli is typically challenging, we have observed that this system can enhance the expression and purification of other parasite proteins from P. falciparum and a related apicomplexan parasite, Toxoplasma gondii.
In addition to augmenting the expression of target proteins, CPD-His6 fusions protected target proteins from proteolytic degradation. This effect was observed when the CRAC-activation domain (CAD) of the ER calcium sensor STIM1 was fused to the CPD (Figure 7). CAD is a small 107 aa polypeptide that activates Ca2+ release-activated Ca2+ (CRAC) channels by binding to the CRAC channel protein Orai1. Until now, large-scale expression and purification of this important regulatory domain has proven difficult due to its apparent instability even when fused to GST (Figure 7). However, using the CPD system, we were able to obtain significant quantities of a CAD-containing polypeptide (CAD128), which has subsequently been used in high-throughput screens for Orai1-CAD binding partners (A.M. Sadaghiani).
Finally, the CPD purification system also increased the solubility of difficult-to-express proteins. Fusion of the mouse macrophage metalloelastase (MMP12) to CPD-His6 facilitated its purification from the soluble fraction of E. coli lysates, whereas His6-tagged MMP12 remained largely insoluble (Figure 8A). The currently used method for purification of His6-tagged MMP12 is a laborious procedure that requires solubilization of MMP12 inclusion bodies and refolding over multiple days, followed by anion and cation exchange chromatography. The CPD purification system dramatically simplifies this purification procedure, allowing soluble, active MMP12 to be isolated in approximately 7 hours (Figures 8B and C, Table 3). We have used this improved purification protocol to rapidly express, purify, and analyze MMP12 mutant proteins.
We have developed a novel one-step purification system that accelerates untagged recombinant protein purification from bacterial systems. By directly fusing an affinity-tagged, site-specific protease to a target protein, the CPD system ensures rapid and efficient removal of the fusion tag in a cost-effective manner. As a result, the CPD system overcomes many of the disadvantages associated with the exogenous addition of site-specific proteases, like thrombin and TEV protease, to remove fusion tags. These disadvantages can include their expense, generally low activity, sensitivity to buffer conditions, and cleavage of target proteins at spurious sites. In contrast, the CPD rapidly completes tag removal within two hours of InsP6 addition (Figures 3–8), since the CPD is present at a 1:1 ratio to the target protein and poised to undergo the autocleavage reaction. Furthermore, the responsiveness of the protease specifically to InsP6 provides the user with complete control over the timing and conditions of fusion tag removal, while the autoprocessing nature of the CPD confers a high degree of specificity to fusion tag removal. Specifically, the protease is poised to undergo autocleavage upon InsP6 addition and exhibits poor transcleavage efficiency, as evidenced by the lack of CPD-mediated cleavage within any of the target proteins tested (Figures 3–8), including an intrinsically unstructured protein (Figure 4).
While purification systems based on fusing a protease to target proteins have previously been developed, our demonstration that the CPD can enhance the expression, solubility, and stability of target proteins (Figures 4–8) suggests that the CPD system likely represents an improvement over existing methods like the intein-chitin-binding-domain (CBD) and sortase-His6 one-step purification systems. Although these self-cleaving systems simplify the purification of well-expressed proteins, the large size of the intein-CBD fusion tag can decrease target protein solubility, while sortase-His6 fusion tags do not increase target protein solubility. Furthermore, unlike self-cleaving elastin-like polypeptide (ELP) tags, CPD fusion proteins do not need to be subjected to temperature cycles, pH shifts, or high salt concentrations, a feature that is critical for the purification of intractable proteins. Based on the properties reported here, the CPD could replace the intein-tag in the self-cleaving-ELP system and potentially improve the solubility of ELP-tagged proteins while retaining their self-cleavability.
Indeed, a considerable strength of this method is that the CPD remains active over a wide range of conditions. CPD-mediated cleavage is complete within 1–2 hrs at temperatures between 4°C and 37°C, requires only micromolar concentrations of the small molecule InsP6 (an abundant and inexpensive reagent), and occurs efficiently both in the presence of standard protease inhibitor cocktails and in the absence of salt. This latter property carries the additional advantage of allowing the user to determine the buffer system in which to elute the target protein, eliminating the need for desalting or buffer exchange steps that can reduce protein yields. In addition, we have created a number of vector backbones that can be used to vary the residues that are appended to the target protein following CPD-mediated cleavage, which can range from a single amino acid residue to an HA epitope tag (Figure 2). Thus, the CPD system allows for considerable flexibility in optimizing purification procedures, as is often necessary for uncharacterized target proteins.
This versatility, combined with our observation that the CPD system can improve the solubility and integrity of difficult-to-express proteins (Figures 5 to 8), suggests that it will have widespread utility in biological research. The simplicity of this system will also make it amenable to large-scale proteomic, structural genomic, and commercial applications by eliminating the cost and complexity associated with exogenous site-specific proteases, potentially permitting its use in robotic systems for constructing protein arrays for screening purposes.
Bacterial strains were grown overnight at 37°C in Luria-Bertani (LB) broth. Carbenicillin was used at 100 µg/mL for pET22b vectors expressed in E. coli.
Primers used are listed in Table S1; strains constructed are listed in Table S2 in the Supporting Information. For construction of pET-CPDSalI vectors, DNA encoding Vibrio cholerae MARTX toxin amino acids 3440-3650 from Vibrio cholerae N16961 was PCR amplified from genomic DNA using primers #1 and #2. The resulting PCR fragment was cloned into the SalI and XhoI sites of the pET22b and pET28a expression vectors, respectively (Novagen). For construction of the pET-CPDSacI vector, DNA encoding Vibrio cholerae MARTX toxin amino acids 3442-3650 from Vibrio cholerae N16961 was PCR amplified from genomic DNA using primers #3 and #2, and the resulting PCR fragment was cloned into the SacI and XhoI sites of pET22b. To construct the pET-HA-CPDSalI vectors, DNA encoding the HA epitope tag was added to the 5′ end of primer #4, and PCR amplification using primers #4 and #2 was used to fuse the HA tag directly to amino acid 3440 of V. cholerae MARTX CPD. The resulting PCR fragment was cloned into the SalI and XhoI sites of the pET22b and pET28a expression vectors, respectively. For construction of the pET-CPDBamHI-Leu vector, DNA encoding Vibrio cholerae MARTX toxin amino acids 3444-3650 from Vibrio cholerae N16961 was PCR amplified from genomic DNA using primers #5 and #2, and the resulting PCR fragment was cloned into the BamHI and XhoI sites of pET22b. For construction of the pET-CPDBamHI vector, DNA encoding Vibrio cholerae MARTX toxin amino acids 3440-3650 from Vibrio cholerae N16961 was PCR amplified from genomic DNA using primers #6 and #2, and the resulting PCR fragment was cloned into the BamHI and XhoI sites of pET22b.
The pET22b-GFP-CPD construct was generated by PCR amplifying GFP from pEGFPN3 (Clontech) using primers #7 and #8. To construct the pET22b-gp130(ICD)-CPD vector, amino acids 642-918 of gp130 corresponding to the intracellular domain were PCR amplified using primers #9 and #10 and pET21a-gp130(ICD) as a template. The pET22b-BirA-CPD vector was constructed by PCR amplifying the birA gene from a pGEX4T1-BirA template using primers #9 and #10. The pET22b-STIM1(CAD)-CPD plasmid was constructed by PCR amplifying DNA encoding amino acids 342–469 of STIM1 using pGEX6-CAD128 as a template and primers #13 and #14. The pET22b-mMMP12-CPD construct was generated by PCR amplifying the catalytic domain of mouse MMP12 (amino acids 29–267) using pET41a-mMMP12 as a template and primers #15 and #16. In all cases, the resulting PCR products were cloned into the NdeI and SalI sites of pET22b-CPDSalI.
For purification of His6-tagged CPD fusion proteins, overnight cultures of the appropriate strain were diluted 1:500 into 1 L 2YT media and grown shaking at 37°C. When an OD600 of 0.6 was reached, IPTG was added to 250 µM, and cultures were grown for 3–4 hrs at 30°C. Cultures were pelleted, resuspended in 25 mL lysis buffer [500 mM NaCl, 50 mM Tris-HCl, pH 7.5, 15 mM imidazole, 10% glycerol] and flash frozen in liquid nitrogen. Cells were thawed and lysed by sonication, and lysates were cleared by centrifugation at 15,000×g for 30 minutes. His6-tagged CPD fusion proteins were affinity purified by incubating the lysates in batch with 0.5–1.0 mL Ni-NTA Agarose beads (Qiagen) with shaking for 2–4 hrs at 4°C. The binding reaction was pelleted at 1,500×g, the supernatant was set aside, and the pelleted Ni2+-NTA agarose beads were washed three times with lysis buffer. In some cases, 10% of the Ni2+-NTA beads containing immobilized CPD-His6 fusion proteins were removed and pelleted, and the His6-tagged fusion protein was then eluted using high imidazole buffer [500 mM NaCl, 50 mM Tris-HCl, pH 7.5, 175 mM imidazole, 10% glycerol].
To liberate untagged target proteins into the supernatant fractions, 300–500 µL lysis buffer was added to the Ni2+-NTA beads containing CPD-His6 fusion proteins and the indicated amount of inositol hexakisphosphate (InsP6, Calbiochem) was added. In general, on-bead cleavage was allowed to proceed by nutating the beads in the presence of 50–100 µM InsP6 for 1–2 hr at either room temperature or 4°C. The beads were pelleted at 1,500×g, and the supernatant fraction was removed. The beads were then washed 3–4 times with 300–500 µL lysis buffer, and supernatant fractions retained. His6-tagged proteins remaining on the beads (i.e. cleaved CPD-His6) were eluted using high imidazole buffer [500 mM NaCl, 50 mM Tris-HCl, pH 7.5, 175 mM imidazole, 10% glycerol] in 300–500 µL volumes. The elution was repeated 3–4 times, and eluate fractions were collected. Purification of His6-tagged proteins lacking the CPD was performed in parallel.
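As a worked example of the on-bead cleavage setup, the volume of InsP6 stock needed to reach the 50–100 µM working range can be computed from C1·V1 = C2·V2. The sketch below is illustrative: the 10 mM stock concentration is an assumption (the text does not state one), the function name is hypothetical, and the final volume is taken to include the added stock.

```python
def stock_volume_ul(final_conc_um, final_volume_ul, stock_conc_mm):
    """Volume of stock (in µL) giving final_conc_um (µM) in final_volume_ul (µL).

    Uses C1 * V1 = C2 * V2, with the stock concentration converted mM -> µM.
    """
    stock_conc_um = stock_conc_mm * 1000.0
    return final_conc_um * final_volume_ul / stock_conc_um


# ASSUMPTION: a 10 mM InsP6 stock; this value is not given in the text.
STOCK_MM = 10.0

low = stock_volume_ul(50, 300, STOCK_MM)    # low end: 50 µM in 300 µL
high = stock_volume_ul(100, 500, STOCK_MM)  # high end: 100 µM in 500 µL

print(f"{low:.1f} to {high:.1f} µL of stock")
```

Under this assumed stock concentration, the added volume is small relative to the 300–500 µL of lysis buffer, so the buffer composition during cleavage is essentially unchanged.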
This general procedure was followed with the following exceptions: for purification of MMP12 constructs, the cultures were grown at 16°C for 8 hr after IPTG induction, and 1 mM tris(2-carboxyethyl)phosphine (TCEP) was added to the lysis buffer to prevent misfolding of the protein. PfSENP1 and BirA protein purifications were performed exclusively at room temperature, since protein aggregation was observed at 4°C. For removal of the His6-tag from His6-PfSENP1, thrombin beads (Calbiochem) that had been washed in PBS were added to the eluted His6-PfSENP1, which had been buffer exchanged into PBS according to the manufacturer's instructions. Thrombin cleavage was allowed to proceed with shaking for 12 hr (overnight) at room temperature. Aliquots were taken before and after thrombin addition to monitor cleavage efficiency. Thrombin-cleaved, untagged PfSENP1 was enriched by performing a subtractive Ni2+-NTA pull-down. Untagged PfSENP1 from both methods was then buffer-exchanged into gel filtration buffer (50 mM NaCl, 20 mM Tris pH 8.0). Protein purifications were analyzed by SDS-PAGE and Coomassie staining using GelCode Blue (Pierce). Concentrations of purified proteins were determined by Bradford assay (Pierce).
MMP12-His6 was purified as previously described with the following modifications. The cell pellet was resuspended in 100 mM NaCl, 100 mM Tris pH 8.0, 5.0 mM EDTA, 0.5 mM DTT, 100 µg/mL lysozyme and stirred for 2 hr. The cells were sonicated and then centrifuged at 10,000 rpm for 10 min. The resulting inclusion bodies were washed two times and then resuspended in 50 mL 6 M guanidine hydrochloride, 10 mM Tris pH 8.0 by stirring at 4°C overnight. The mixture was centrifuged at 15,000 rpm for 30 min, and 2 mL aliquots of supernatant were prepared. The supernatant was diluted 1:100 into denaturing buffer [6 M urea, 50 mM Tris pH 8.0, 10 mM CaCl2, 30 mM NaCl, 5 mM DTT] to a final concentration of 0.1–0.2 mg/mL. The protein was then dialyzed for 24 hr in 2 L refolding buffer 1 [3 M urea, 50 mM Tris pH 8.0, 10 mM CaCl2, 30 mM NaCl, 5 mM DTT]. The partially refolded protein was then dialyzed in 4 L of refolding buffer 2 [1 M urea, 50 mM HEPES pH 7.4, 10 mM CaCl2, 5 mM DTT]. The buffer-exchanged protein was then purified using tandem 5 mL MonoQ and SP Sepharose columns (GE Healthcare) at 4°C. After loading the protein on the column, the column was washed with 50 mL of refolding buffer 2 without DTT at 1 M, 0.5 M, and 0 M urea, respectively. The protein was eluted from the SP column in 500 mM NaCl, 50 mM HEPES pH 7.4, 10 mM CaCl2.
Untagged PfSENP1 obtained from either thrombin or InsP6-mediated cleavage was concentrated using a 10 kDa Centricon concentrator (Millipore) and buffer exchanged into 50 mM NaCl, 20 mM Tris pH 8.0 and purified on a Superdex 200 10/30 column (GE Healthcare) equilibrated in the same buffer. For MMP12, the gel filtration buffer contained 150 mM NaCl, 50 mM Tris pH 7.4, 10 mM TCEP. Gel filtrations were performed at 4°C.
Fluorescence of purified GFP at 511 nm was verified using a Molecular Devices fmax plate reader in black 96-well plates with 488 nm excitation. The activity of MMP12 was determined using the fluorogenic substrate Mca-PLGLDL(Dpa)AR (Mca, (7-methoxycoumarin-4-yl)acetyl; Dpa, N-3-(2,4-dinitrophenyl)-L-2,3-diaminopropionyl; Anaspec). Reactions were performed in the assay buffer (50 mM Tris pH 7.4, 150 mM NaCl, 10 mM CaCl2, 0.02% NaN3, 5 mM TCEP) at 37°C. The substrate was used at 10 µM and the protein at 0.2 µM. Substrate hydrolysis was monitored continuously in a fluorescent plate reader (Molecular Devices) using an excitation wavelength of 325 nm and an emission wavelength of 395 nm.
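One common way to summarize a continuously monitored fluorogenic trace is the initial rate, taken as the slope of the early, approximately linear portion of the signal. The sketch below shows that calculation with a plain least-squares fit. It is not taken from the study: the analysis actually applied to the MMP12 data is not described in the text, the function name is hypothetical, and the readings are synthetic values chosen for illustration.

```python
def initial_rate(times_s, signal_rfu, n_points=5):
    """Least-squares slope (RFU per second) over the first n_points readings."""
    t = times_s[:n_points]
    y = signal_rfu[:n_points]
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    return sxy / sxx


# SYNTHETIC readings for illustration only (not data from this study).
times = [0, 30, 60, 90, 120]                  # seconds
signal = [100.0, 160.0, 220.0, 280.0, 340.0]  # relative fluorescence units

rate = initial_rate(times, signal)
print(f"initial rate: {rate:.2f} RFU/s")

# Substrate excess under the stated assay conditions (10 µM vs 0.2 µM).
substrate_um, enzyme_um = 10.0, 0.2
print(f"substrate:enzyme = {substrate_um / enzyme_um:.0f}:1")
```

Comparing such slopes between preparations assayed under identical conditions gives a simple relative measure of activity; converting RFU to product concentration would additionally require a calibration curve, which is outside the scope of this sketch.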
Primers used in Study. a Restriction enzyme sequences are underlined, and the HA tag is shown in italics. b RE - Restriction site
(0.06 MB DOC)
Strains used in study. 1. Skiniotis G and Lupardus, PJ, et al. (2008) Mol Cell 31: 737–748. 2. Ponder EL, et al. (2009) under review Nat Chem Biol. 3. Park CY, et al. (2009) Cell 136: 876–890.
(0.07 MB DOC)
We thank Chris Overall for kindly providing the pET41a-mMMP12 expression construct, and Chan Young Park for providing the GST-CAD-His6 construct used to clone the CAD domain of Stim1.
Competing Interests: A.S., P.J.L., K.C.G., and M.B. are listed as inventors on a provisional patent application describing the CPD purification system technology. This patent will not alter the authors' adherence to PLoS ONE policies on sharing data and materials. Materials and information associated with the authors' publication will be freely available to those as requested for the purpose of academic, non-commercial research.
Funding: This work was supported by a Burroughs Wellcome Foundation grant and NIH grants R01 AI078947 and R01 EB005011 to M.B., the Damon Runyon Cancer Research Fellowship to P.J.L., a Keck Foundation and Howard Hughes Medical Institute grant to K.C.G., a Stanford Dean's Fellowship to A.S., and a Beatriu de Pinós of AGAUR Fellowship to M.M. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.