Figure 8 shows a schematic workflow for purification-based screening of expressed proteins. We have applied this approach to eukaryotic proteins, including those from human embryonic stem cells. The advantages of the approach include the small quantity of cell culture required, the speed in going from cells to purified protein, the relatively low cost of the procedure, the ability to scale up to automated, multiple-lane purifications, and the protein yields, which are sufficient to support characterization of the protein product.
Fig. 8 Schematic of a purification screening protocol. Steps from obtaining a sequence-verified target in auto-cleavage vector pVP62K to identification of purified proteins. The transformed expression host is grown in auto-induction medium. Cells from production ...
Protein production pipelines have numerous points of attrition that limit the number of proteins available for structural analysis. This attrition adds significant expense to the overall process, particularly when multiple handling steps and larger volumes are required before a decision can be reached. Thus, the use of small- or micro-scale protein screening methods has considerable appeal [22
The focus of this work has been on eukaryotic proteins, which are generally found to be more difficult to express and purify than prokaryotic proteins. Small-scale production screening efforts have predicted the outcome of downstream large-scale protein production with up to 80% efficiency [26]. However, these previous efforts did not effectively address either the variability in proteolysis of fusion proteins that are often used to express eukaryotic proteins in E. coli or the behavior of the liberated targets after proteolysis. By adopting this screening approach, the decision to scale up protein production can be based on the ability to express, proteolyze, and purify the protein, and as indicated here, this decision can be extended to include other information such as acceptable 1H–15N HSQC spectra or evidence of crystallization.
Table contains information on another troublesome set of eukaryotic proteins, those with pI ~8 or greater (A2, A3, A4, A10, B1, B2). Purification screening would provide important insight into the behavior of these proteins, which often perform well as fusion proteins in total production and solubility, but which often fail in proteolysis or in stability after proteolysis (~70%, unpublished results). Thus, although the original pipeline screening suggested that A2, A6, A10, and B1 should have been advanced to purification, each of these targets failed to reach the desired threshold for purified fusion protein in the Maxwell purification and thus deserved a work-stopped assignment. In contrast, the high-pI protein A3 was purified in high yield as a fusion protein from the Maxwell and was subsequently released by TEV protease treatment (as in the original pipeline scoring), supporting the decision that this protein should be continued along the scale-up process. Further consideration of the results for these targets will be included in a broader study of the effect of the N-terminal AIA tag on protein purification and structure determination statistics, which will be reported elsewhere.
Scalability requires similar protein production behavior in small-scale screening, large-scale protein production, and, ultimately, protein purification. For proteins A8 and A9, the original small-scale screening reported that these proteins were unsuitable because of a failure in TEV proteolysis, and the same result was obtained after Maxwell purification. Among the four human embryonic stem cell proteins investigated, Tcl-1 was highly expressed by auto-induction, underwent efficient in vivo proteolysis from MBP, and was successfully purified with an estimated volumetric productivity of 7.5 μg/ml. Two other stem cell proteins (C10orf96 and NPM2) were also purified, but their yields were not sufficient to indicate feasibility of scale-up as a structural target. Nevertheless, the method yielded enough purified protein that some functional studies or other analyses could be undertaken. By coupling in vivo cleavage with automated purification, failure to proteolyze the His8-target from the fusion protein and cryptic insolubility of the His8-target after proteolysis are both signaled by failure in automated purification. Since both of these results are diagnostic of likely failures in large-scale purification, the purification screening approach gave valuable insight into the behavior of the human stem cell protein CCNF and the others before any significant scale-up efforts were undertaken.
We demonstrated how the amount of His8-target successfully purified from a single Maxwell 16 lane can be used to determine the scale-up factor required to prepare samples for screening either by 1H–15N NMR for folding status (~700 μg of a 25 kDa or less protein set as the deliverable) or by microfluidic screening for crystallization (~10 μg of protein set as the deliverable). This scaling approach was demonstrated for both Tcl-1 and AIA–GFP. Decreasing the amount of protein required for initial structural screening through the use of small NMR tubes, cryoprobes, and nL liquid handling effectively complements the ability to produce moderate amounts of protein in the cost-effective manner described here. Automated methods for removal of the His8-tag during the Maxwell 16 run would also be desirable, and these investigations are in progress.
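The scale-up arithmetic described above can be sketched as follows. This is an illustrative calculation only, not the authors' procedure: the function names are hypothetical, and it assumes that purified yield scales linearly with the number of lanes or with culture volume. The deliverables (700 μg and 10 μg) and the Tcl-1 volumetric productivity (7.5 μg/ml) are taken from the text; the 50 μg per-lane yield in the usage example is a placeholder.

```python
import math

# Deliverables stated in the text, in micrograms of purified protein.
NMR_DELIVERABLE_UG = 700.0      # 1H-15N NMR folding screen (25 kDa or less)
CRYSTAL_DELIVERABLE_UG = 10.0   # microfluidic crystallization screen


def scale_up_factor(deliverable_ug, yield_per_lane_ug):
    """Whole number of single-lane equivalents needed to reach the deliverable.

    Assumes (hypothetically) that yield scales linearly with lane count.
    """
    if yield_per_lane_ug <= 0:
        raise ValueError("yield per lane must be positive")
    return math.ceil(deliverable_ug / yield_per_lane_ug)


def culture_volume_ml(deliverable_ug, productivity_ug_per_ml):
    """Culture volume needed, assuming yield scales linearly with volume."""
    if productivity_ug_per_ml <= 0:
        raise ValueError("volumetric productivity must be positive")
    return deliverable_ug / productivity_ug_per_ml


if __name__ == "__main__":
    tcl1_productivity = 7.5  # ug/ml, value reported for Tcl-1 in the text
    for name, target in (("NMR", NMR_DELIVERABLE_UG),
                         ("crystallization", CRYSTAL_DELIVERABLE_UG)):
        volume = culture_volume_ml(target, tcl1_productivity)
        print(f"{name}: about {volume:.1f} ml of culture")
    # Placeholder per-lane yield of 50 ug, for illustration only.
    print("lanes for NMR at 50 ug/lane:", scale_up_factor(NMR_DELIVERABLE_UG, 50.0))
```

Under these assumptions, the NMR deliverable needs roughly 70 times more culture than the crystallization deliverable, which is why reducing the NMR sample requirement has such a large effect on feasibility.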
CESG starts all expression work on eukaryotic proteins with sequence-verified clones [32]. Uncertainties in gene models and errors from primer synthesis are addressed by this effort, while immediate sequence verification of a cloned gene also supports reliable transfer into other expression vectors. Expression plasmids transformed into E. coli B834 can be available for purification screening studies after 24 h, and growth from single colony transformants can be completed in 48 h using our auto-induction approach (24 h of growth in non-inducing medium followed by 24 h of growth in inducing medium). Auto-induced cultures can be immediately loaded onto the Maxwell 16 apparatus, with parallel processing of 16 samples in 45 min. Thus a complete 96-well plate of different targets (or variants of the same target) could be purified and analyzed for protein expression by automated capillary electrophoresis in less than 7 h. In the workflow of Fig. 8, the best performing targets, provisionally defined as those obtained from in vivo cleavage and automated purification in a yield of 50 μg/ml or greater, can be identified in about 4 days, with most of the elapsed time allotted to overnight culture growths or automated protein purification.
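The throughput figures above can be checked with a short calculation. This is a sketch under stated assumptions: runs are performed back to back on a single Maxwell 16 instrument, and time for loading and unloading is ignored. The function names are hypothetical; the 16 lanes per run, 45 min per run, and 7 h total are taken from the text.

```python
import math

LANES_PER_RUN = 16      # samples processed in parallel per run (from the text)
MINUTES_PER_RUN = 45    # duration of one purification run (from the text)


def purification_runs(n_samples, lanes_per_run=LANES_PER_RUN):
    """Number of instrument runs needed to purify n_samples."""
    return math.ceil(n_samples / lanes_per_run)


def purification_minutes(n_samples, lanes_per_run=LANES_PER_RUN,
                         minutes_per_run=MINUTES_PER_RUN):
    """Total purification time, assuming serial runs on one instrument."""
    return purification_runs(n_samples, lanes_per_run) * minutes_per_run


if __name__ == "__main__":
    plate = 96
    minutes = purification_minutes(plate)
    print(f"{plate} samples: {purification_runs(plate)} runs, {minutes / 60:.1f} h")
    # Time left within the 7 h window for capillary electrophoresis analysis.
    print(f"remaining for analysis: {7 - minutes / 60:.1f} h")
```

With these assumptions a 96-well plate takes six runs, or 4.5 h of purification, which leaves about 2.5 h of the stated 7 h for capillary electrophoresis.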
Auto-induction media are chemically defined and assembled from inexpensive components. Furthermore, the cost of labeled amino acids (15N or Se-Met) is minimal for the initial screening due to the small cell culture volume required. For the example shown in Fig. , the cost for all reagents for the auto-induction and automated purification of the 15N-labeled sample was less than $50. The simple instrumentation required for the auto-induction and the Maxwell 16 purification may allow wide access to this approach, and the minimal hands-on effort required to complete the analysis through to purified protein is another considerable operational advantage.
Capillary electrophoresis has several advantages relative to slab gel electrophoresis. Although the instrument is more expensive than a standard power supply, electrophoresis equipment, and gel documentation system, the average price per sample analysis using the LC90 chip (~$0.67 per lane of analysis) is less than that of pre-cast polyacrylamide gels (~$1.17 per lane of analysis). Other advantages of capillary electrophoresis include automated operation, rapid processing time, digital information capture, and quantitative analysis of electropherograms. This work shows that the quantitative analysis of protein yield from a small-scale expression can be used as a predictive tool for scale-up feasibility.
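To illustrate the per-lane cost comparison, the following sketch totals the consumable cost for a batch of samples. It uses only the two approximate per-lane prices quoted above and deliberately excludes instrument purchase cost, which the text notes is higher for capillary electrophoresis. The function and constant names are hypothetical.

```python
from decimal import Decimal

# Approximate per-lane consumable costs quoted in the text (US dollars).
CE_COST_PER_LANE = Decimal("0.67")    # capillary electrophoresis, LC90 chip
GEL_COST_PER_LANE = Decimal("1.17")   # pre-cast polyacrylamide gel


def consumable_cost(n_lanes, cost_per_lane):
    """Total consumable cost for n_lanes analyses (instrument cost excluded)."""
    if n_lanes < 0:
        raise ValueError("number of lanes cannot be negative")
    return cost_per_lane * n_lanes


def consumable_savings(n_lanes):
    """Consumable savings of capillary electrophoresis relative to gels."""
    return (consumable_cost(n_lanes, GEL_COST_PER_LANE)
            - consumable_cost(n_lanes, CE_COST_PER_LANE))


if __name__ == "__main__":
    lanes = 96  # one 96-well plate of samples
    print("capillary electrophoresis:", consumable_cost(lanes, CE_COST_PER_LANE))
    print("pre-cast gels:", consumable_cost(lanes, GEL_COST_PER_LANE))
    print("savings:", consumable_savings(lanes))
```

Because the difference is about $0.50 per lane, the consumable savings only become significant at screening-scale sample numbers.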
Other applications of this approach
The automated protein purification process described here has other potential uses. This process can facilitate evaluation of different vector designs and arrays of different expression hosts. For functional studies, banks of site-directed or randomly mutated proteins can be prepared and purified in amounts sufficient for catalytic screening. This may facilitate protein engineering for new traits that can be assayed such as changes in catalytic activity, thermal stability, or other desirable properties. In many cases, the amounts of protein recovered by the automated purification (Tables and ) should be adequate to initiate these functional studies. Surface entropy reduction analysis could also be facilitated through an effective sorting of protein variants that maintain sufficient stability to be purified. The delivery of small quantities of purified proteins for examination by micro-crystallization techniques or NMR analysis before significant effort is placed into purifying large quantities also has demonstrable advantages.
For eukaryotic proteins, domain engineering is an important experimental focus. It is clear that multiple changes at the N- and C-termini may be required to identify the best performing variant. Through the use of purification screening, it is efficient to express, purify, and examine engineered domains for improved solubility as part of the experimental process.