Natural proteins often rely on the disulfide bond to covalently link side chains. Here we genetically introduce a new type of covalent bond into proteins by enabling an unnatural amino acid to react with a proximal cysteine. We demonstrate the utility of this bond for enabling irreversible binding between an affibody and its protein substrate, capturing peptide-protein interactions in mammalian cells, and improving the photon output of fluorescent proteins.
Covalent peptide bonds in the backbone and noncovalent interactions among amino acid side chains provide the essential framework for protein structure and function. Covalent disulfide bonds formed between two cysteine residues contribute another fundamental type of linkage, providing more stability and selectivity than noncovalent interactions. Indeed, the disulfide bond is crucial for the folding, stability and activity of a variety of proteins such as antibodies, cytokines and membrane-associated receptors1,2. However, its reversibility and redox sensitivity impose limitations on protein expression, engineering and applications in biotechnology and therapeutics3. Isopeptide bonds between lysine and asparagine or aspartate have also been discovered, but they form only in the hydrophobic protein interior and require an essential glutamate or aspartate residue to catalyze the reaction4. Additional types of covalent linkages formed autocatalytically would expand avenues for generating novel protein properties and functions. Here we report the design and application of a new covalent bond that is formed based on proximity-enhanced reactivity between an unnatural amino acid (Uaa) and a nearby cysteine residue.
To generate a new covalent bond between a Uaa and a natural amino acid, the side chain of the Uaa must be reactive toward that of the target natural amino acid. However, because the natural amino acid is ubiquitously present in proteins and inside cells, uncontrolled reactivity will result in nonspecific linkages, causing cytotoxicity. In addition, a Uaa that is bioreactive toward endogenous amino acids or proteins might not be able to go through the protein translational machinery for genetic incorporation. For these reasons, Uaas that have been incorporated into proteins through orthogonal tRNA-synthetase pairs hitherto have contained either chemically inert or bioorthogonal side chains5. We hypothesized that this challenge could be overcome by proximity-enhanced bioreactivity (Fig. 1a). Specifically, the Uaa should not react with any free natural amino acids under physiological conditions; but after the Uaa is incorporated into proteins and placed in proximity to its target natural amino acid residue, the increased effective concentration and appropriate orientation of their side chains should then facilitate the reaction. Reactivity between small molecules is enhanced when they are brought close by DNA templates6. In addition, proximity effects for enhancing reactivity between proteins and small molecules have been found in natural products7 and exploited in the development of irreversible ligand binding8, small-molecule inhibitors and activity-based protein profiling9. Because proximity of amino acid side chains can be readily attained either within proteins or at the interface of interacting proteins, we reasoned that it would be feasible to harness the proximity of a designed Uaa and a target natural amino acid for selective formation of a new covalent bond.
We designed p-2′-fluoroacetylphenylalanine 1 (Ffact, Fig. 1b) as the Uaa to react with cysteine. The sulfhydryl group of cysteine has the highest nucleophilicity among chemical groups occurring in natural amino acid side chains, so we expected that a weak electrophilic group, when close to a cysteine residue, would selectively react with it. The C-F bond is strong, and F is not a very good leaving group in SN2 reactions, but the carbonyl group of Ffact increases its reactivity, making the fluorine-bearing carbon a weak electrophile. In fact, small-molecule or peptidyl fluoromethyl ketones have been shown to react irreversibly with cysteine in the active site of a cysteine proteinase10 and the S6 kinase11.
Ffact was synthesized via a four-step approach (Fig. 1c, Supplementary Note, Supplementary Fig. 1). To test the reactivity of Ffact toward cysteine, we incubated them at different concentrations and analyzed the reaction using HPLC–mass spectrometry (HPLC-MS). In reactions containing 1 mM Ffact and 50 μM cysteine, no Ffact reacted with cysteine (data not shown). A small percentage (<5%) of Ffact reacted with cysteine when cysteine concentration was increased to 1 mM. In contrast, in reactions containing 1 mM Ffact and 10 mM cysteine, Ffact was completely converted to the product (Fig. 1d). The intracellular concentration of free cysteine is ~19 μM in Escherichia coli12, and proximity increases the effective concentration of reactants. Therefore, these concentration-dependent results suggest that Ffact would not react with free cysteine inside E. coli cells, but its reactivity with cysteine could be enhanced by proximity.
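The concentration argument above can be made concrete with a little arithmetic. The following is a minimal sketch, using only the concentrations quoted in the text, that expresses each tested cysteine concentration as a multiple of the ~19 μM intracellular level; the function and variable names are illustrative and not part of the original work.

```python
# Free cysteine in E. coli, in micromolar (value quoted in the text, ref. 12).
INTRACELLULAR_CYS_UM = 19.0

# Cysteine concentrations (micromolar) tested against 1 mM Ffact,
# paired with the outcome reported in the text.
TESTED = [
    (50.0, "no reaction detected"),
    (1_000.0, "<5% of Ffact reacted"),
    (10_000.0, "complete conversion"),
]


def fold_over_intracellular(cys_um: float) -> float:
    """Return a cysteine concentration as a multiple of the intracellular level."""
    return cys_um / INTRACELLULAR_CYS_UM


for cys_um, outcome in TESTED:
    fold = fold_over_intracellular(cys_um)
    print(f"{cys_um:8.0f} uM cysteine = {fold:6.1f}x intracellular: {outcome}")
```

On these numbers, even the 50 μM condition that gave no detectable reaction is more than twice the intracellular free cysteine concentration, whereas complete conversion required a concentration several hundred times higher.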
To determine if a covalent bond could be formed at the interface of two interacting proteins, we genetically incorporated Ffact into the ZSPA affibody13 and tested whether the resultant affibody could covalently capture its substrate Z protein modified with a cysteine residue. We chose to incorporate Ffact at Asp36 in the affibody and to mutate Asn6 to cysteine in the Z protein (Fig. 2a) because these two sites are in close proximity at the interface and have been used previously for cross-linking through chemical modification14. Because Ffact is structurally similar to p-acetyl-L-phenylalanine (Fact, Fig. 2b), differing only in having a hydrogen atom replaced by fluorine, we reasoned that the orthogonal tRNA-synthetase pair evolved for incorporating Fact (ref. 15) should be able to incorporate Ffact into proteins. We mutated Asp36 into an amber codon TAG in the affibody gene to encode Uaa and coexpressed the mutant gene with the orthogonal tRNA-synthetase genes in E. coli. Gel separation of the purified affibody showed that production of the full-length affibody was markedly increased (43-fold) when Ffact was included in the growth medium (Fig. 2c). We then analyzed the purified mutant affibody by electrospray ionization MS (ESI-MS) (Fig. 2d), which indicated that Ffact was site-specifically incorporated into the affibody at position 36 through protein translation in E. coli.
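As an illustration of the codon replacement described above, the sketch below swaps one codon of an open reading frame for the amber codon TAG. The toy sequence and the helper name `introduce_amber` are hypothetical; the real affibody gene and the primers used are given in the Supplementary Information, and the codon index must account for any residues added ahead of the original numbering.

```python
AMBER = "TAG"


def introduce_amber(orf: str, codon_index: int) -> str:
    """Return the ORF with the codon at 1-based codon_index replaced by TAG.

    codon_index counts codons in the construct itself, starting at the
    start codon, so residues inserted upstream shift the index.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    n_codons = len(orf) // 3
    if not 1 <= codon_index <= n_codons:
        raise ValueError("codon_index out of range")
    start = 3 * (codon_index - 1)
    return orf[:start] + AMBER + orf[start + 3:]


# Hypothetical toy ORF: Met-Ala-Asp-Lys-stop (not the affibody sequence).
toy_orf = "ATGGCTGATAAATAA"
mutant = introduce_amber(toy_orf, 3)  # replace the Asp codon (GAT) with TAG
print(mutant)
```

Keeping the index in construct coordinates rather than published residue numbering is a deliberate choice here, since the two can differ when extra residues are appended at the N terminus.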
We next tested the intermolecular reaction of the mutant affibody (D36Ffact) with the mutant Z protein (N6C). We incubated the two proteins at a 4:1 ratio in Tris buffer (pH 8.0) or PBS (pH 7.4) at 37 °C for 1 h and analyzed the reaction mixture by SDS-PAGE under denaturing conditions. We detected a band with molecular weight corresponding to the affibody–Z protein complex (Fig. 2e), indicating that the two were covalently linked. The yield of covalent complex formation was 63 ± 3% (n = 3). ESI-MS analysis of the reaction product confirmed that the complex was covalently linked by Ffact reacting with cysteine (Fig. 2f). This complex band did not form when Ffact36 was replaced by the non-fluorinated Fact in the affibody or when Cys6 was replaced by asparagine in the Z protein (Fig. 2e). In addition, when we placed cysteine at Z protein Asn3 instead of Asn6, no covalent affibody-Z complex was formed (Fig. 2e). Taken together, these results indicate that the covalent bond formed by Ffact and cysteine was dependent on the presence of both amino acids in close proximity.
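A simple way to see what the ESI-MS confirmation looks for: if the cysteine thiol displaces fluoride from Ffact, one HF is lost, so the covalent complex should weigh the sum of the two proteins minus about 20 Da. The sketch below encodes that bookkeeping under this assumption; the protein masses in the example are placeholders, not measured values from the paper.

```python
# Average atomic masses (Da) for hydrogen and fluorine.
H_MASS = 1.008
F_MASS = 18.998
HF_MASS = H_MASS + F_MASS  # mass lost when thiolate displaces fluoride


def expected_complex_mass(uaa_protein_da: float, cys_protein_da: float) -> float:
    """Expected average mass of the thioether-linked complex.

    Assumes one Ffact-cysteine linkage with loss of a single HF.
    """
    return uaa_protein_da + cys_protein_da - HF_MASS


# Placeholder masses for illustration only (not values from the paper).
example = expected_complex_mass(7_000.0, 8_000.0)
print(f"expected complex mass: {example:.3f} Da")
```

A noncovalent or non-reacted pair would instead appear as two separate species at their individual masses after denaturation.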
We further demonstrated the compatibility of this intermolecular bond formation in mammalian cells. We incorporated Ffact into the corticotropin-releasing factor receptor type 1 (CRF-R1)16, a class B G protein–coupled receptor (GPCR), at a site near the tip of transmembrane helix 6 (Fig. 3a). Available three-dimensional structures indicate that the corresponding helix of class A GPCRs is involved in ligand interaction. Cysteine was introduced into different positions of urocortin-1 (Ucn-1), a native 40-amino-acid peptide agonist of this receptor. We incubated the cysteine–Ucn-1 (Cys-Ucn-1) ligand analogs individually with HEK293T cells expressing the Ffact-CRF-R1 mutant, and then resolved cell lysates by reducing SDS-PAGE and immunoblotted them with an antibody specific for Ucn-1 (anti–Ucn-1). Only when cysteine was introduced at position 12 of Ucn-1 was a covalent ligand-receptor complex formed (Fig. 3b, Supplementary Fig. 2). These results demonstrate that the Ffact-cysteine reaction can be used in mammalian cells to site-specifically capture peptide-protein interactions under native conditions.
To explore whether the covalent bond between Ffact and cysteine could be formed intramolecularly within a protein, we introduced these amino acids into fluorescent proteins to covalently link the fluorophore to the β-barrel. We replaced Tyr67 of the fluorophore with Ffact and the proximal Ser146 in the β-barrel with cysteine in the red fluorescent protein mPlum17 (Supplementary Results, Supplementary Methods and Supplementary Fig. 3). ESI-MS analysis of the expressed S146C-Y67Ffact mutant indicated that the Ffact67-Cys146 bond was formed (Supplementary Fig. 3c). In comparison to the isosteric mutant S146C-Y67Fact lacking the fluorine atom and thus the covalent bond, the S146C-Y67Ffact mutant showed a 4.6-fold greater quantum yield and a 0.86-fold greater photon output (Supplementary Fig. 3e,g). Covalent bond formation and similar effects on fluorescence properties were also observed when the Ffact-cysteine pair was introduced into another red fluorescent protein, mKate2 (ref. 18) (Supplementary Results and Supplementary Fig. 4).
In conclusion, we have shown that it is possible to add a new covalent linkage to proteins by designing proximity-enhanced bioreactivity between side chains of a Uaa and a natural proteinogenic amino acid. We expect that this approach can be used to generate other new covalent bonds for proteins by targeting other natural amino acids. The ability to introduce new covalent linkages into proteins will afford new avenues toward protein properties and functions that were previously inaccessible because of the limitations of disulfide bonding, which will find broad applications in biological studies, protein therapeutics and synthetic biology.
Please see Supplementary Note.
All plasmids were assembled by standard cloning methods and confirmed by DNA sequencing. The Z protein and mutants were expressed using the pBAD-Z plasmids, in which target genes were cloned into pBAD (Invitrogen) using the SpeI and HindIII sites. A His6 tag was appended to the C terminus for purification. The affibody and mutants were expressed using pLei-tRNAopt-Aff and pBK-LW1RS15. pLei-tRNAopt-Aff was constructed from pLei-tRNAopt-STAT3 (ref. 21) by replacing the STAT3 gene with the affibody gene using the SpeI and BlpI sites. A His6 tag was appended at the C terminus of the affibody for purification. Primer sequences can be found in Supplementary Table 1. mPlum and mutants were expressed with pBAD-mPlum22 and pLei-tRNAopt-factRS. pLei-tRNAopt-factRS was constructed from pLei-tRNAopt-STAT3 by replacing the STAT3 gene with the LW1RS gene using the SpeI and BglII sites, and no His6 tag was appended at the C terminus of the factRS. mKate2 and mutants were expressed with pLei-tRNAopt-mKate2 and pBK-LW1RS. pLei-tRNAopt-mKate2 was constructed from pLei-tRNAopt-STAT3 by replacing the STAT3 gene with the mKate2 gene using the SpeI and BglII sites. The TAG codon for encoding the Uaa was introduced with the QuikChange site-directed mutagenesis method (Stratagene). For the affibody and Z proteins, two amino acids (threonine and serine) were introduced right after the N-terminal methionine by the SpeI site. Numbering of the amino acid residues for affibody and Z followed the original numbering13. The rat CRF-R1 gene with a TAG stop codon at the L329 position was PCR amplified from pAIO-Azi-CRFR1 (ref. 23) and cloned into pcDNA3.1(+) using the EcoRI and NotI restriction sites to afford plasmid 329tag-CRF-R1-Flag. A Flag tag was appended at the C terminus for western blot detection. Plasmid pIre-Keto3 was constructed to express the tRNACUATyr derived from the Bacillus stearothermophilus tRNATyr and the Fact-specific synthetase EKetoRS, which was derived from the E. coli TyrRS with the following mutations: Y37I, D182G, F183M, L186A and D265R (ref. 24). The tRNACUATyr without the last CCA trinucleotide was driven by the U6 promoter and flanked by the 3′-flanking sequence derived from a human initiator methionine tRNA gene25. The tRNA cassette was repeated 3 times in tandem. The EKetoRS gene was driven by the PGK promoter.
E. coli BL21 cells harboring expression plasmids were grown at 37 °C in glycerol minimal medium supplemented with 1 mM of Fact or 2 mM of racemic Ffact (effective concentration of L-Ffact was 1 mM) when appropriate. Protein expression was induced with 0.5 mM IPTG (for pLei plasmids) or 0.2% arabinose (for pBAD plasmids) when the OD600 of cells reached 0.5. After 4–6 h, cells were lysed by sonication, and proteins were purified using Ni-NTA resin (Qiagen) by following the published conditions and procedures26. Yields for Ffact-containing proteins were: affibody 3.1 mg/L, mPlum 2.5 mg/L, mKate2 3.5 mg/L.
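The supplement amounts above can be checked with a short calculation. This sketch assumes that only the L-enantiomer of the racemic Ffact is usable, as the text implies, and it assumes the molecular formula C11H12FNO3 for Ffact, which is inferred here from the compound name rather than stated in the source; the helper names are illustrative.

```python
# Average atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "F": 18.998, "N": 14.007, "O": 15.999}

# ASSUMPTION: formula inferred from the name p-2'-fluoroacetylphenylalanine.
FFACT_FORMULA = {"C": 11, "H": 12, "F": 1, "N": 1, "O": 3}


def molar_mass(formula: dict) -> float:
    """Sum average atomic masses over a composition dictionary."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def effective_l_conc_mm(racemic_mm: float) -> float:
    """A racemate is a 1:1 mixture, so half of it is the L-enantiomer."""
    return racemic_mm / 2.0


def mg_per_litre(conc_mm: float, mw: float) -> float:
    """Mass of compound (mg) needed per litre for a given mM concentration."""
    return conc_mm * mw  # mmol/L x g/mol = mg/L


mw_ffact = molar_mass(FFACT_FORMULA)
print(f"Ffact molar mass (assumed formula): {mw_ffact:.3f} g/mol")
print(f"L-Ffact from 2 mM racemate: {effective_l_conc_mm(2.0):.1f} mM")
print(f"Racemic Ffact per litre of medium: {mg_per_litre(2.0, mw_ffact):.1f} mg")
```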
Intact proteins were analyzed by ESI-TOF using an Agilent 6210 mass spectrometer coupled to an Agilent 1100 HPLC system. Two micrograms of protein samples were injected by an auto-sampler and separated on an Agilent Zorbax SB-C8 column (2.1 mm ID × 10 cm length) by a reverse-phase gradient of 0–80% acetonitrile for 15 min. Mass calibration was performed right before the analysis. Protein spectra were averaged and the charge states were deconvoluted using Agilent MassHunter software.
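For readers who want to relate a retention time to solvent composition, the gradient above can be sketched as a simple function. This assumes the 0–80% acetonitrile gradient is linear over the 15 min, which the text does not state explicitly; the function name and defaults are illustrative.

```python
def percent_acetonitrile(t_min: float,
                         start_pct: float = 0.0,
                         end_pct: float = 80.0,
                         duration_min: float = 15.0) -> float:
    """Acetonitrile percentage at time t_min for a linear gradient.

    ASSUMPTION: the gradient is linear from start_pct to end_pct.
    """
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time is outside the gradient window")
    return start_pct + (end_pct - start_pct) * t_min / duration_min


for t in (0.0, 7.5, 15.0):
    print(f"t = {t:4.1f} min: {percent_acetonitrile(t):4.1f}% acetonitrile")
```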
The purified ZSPA affibody (75 μM) and Z protein (19 μM) were incubated at a molar ratio of 4:1 in 1× phosphate-buffered saline (PBS) (pH 7.4) or Tris buffer (50 mM Tris, 500 mM NaCl, pH 8.0) at 37 °C for 1 h. SDS loading buffer containing 100 mM of dithiothreitol was added to the reaction mixture, which was then boiled at 100 °C for 8 min to disrupt noncovalent interactions. The boiled samples were then separated on 16.5% Tris-tricine gels and stained with Coomassie blue. Band intensities were quantified by using ImageJ. The cross-linking yield was calculated by (I_Z + I_affibody − I_cross-linked)/2I_Z, in which I represents band intensity.
The sequence of Ucn-1 is DDPPLSIDLTFHLLRTLLELARTQSQRERAEQNRIIFDSV. HEK293T cells were transfected with 8 μg pIre-Keto3 plasmid and 4 μg 329tag-CRF-R1-Flag plasmid using Lipofectamine 2000 in a 10-cm dish, and cultured with 0.25 mM Ffact for 48 h. These cells were then divided into 6 aliquots, 5 of which were incubated with the 5 different Cys-Ucn-1 ligand analogs (100 nM) individually in HEPES buffer (25 mM HEPES, pH 7.5, 5 mM KCl, 5 mM MgCl2, 140 mM NaCl, 0.1% BSA, 0.01% Triton X-100) containing 1 mM TCEP (tris(2-carboxyethyl)phosphine) for 90 min. The last aliquot was incubated with 100 nM [Bpa12]-Ucn under the same conditions, after which it was photo-cross-linked for 20 min in a Spectrolinker XL-1500A (365 nm) at 4,400–5,000 μW/cm². Cell pellets for all samples were separately collected, lysed, denatured with 100 mM DTT for 30 min at 37 °C, and then separated by SDS-PAGE on 10% acrylamide Tris-Gly minigels. Proteins were transferred to a PVDF membrane and immunoblotted with a polyclonal rabbit anti-urocortin antibody. For detection of the Flag tag, the monoclonal mouse anti-Flag M2–peroxidase conjugate (cat. no. A8592) purchased from Sigma was used.
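To make the ligand design concrete, the sketch below builds a single-cysteine analog of the Ucn-1 sequence given above by substituting one residue. Only position 12 is named in the text as the productive site, so that is the one used in the example; the helper name `cys_analog` is illustrative, and the actual analogs were made by peptide synthesis rather than in silico.

```python
# Ucn-1 sequence as given in the text (40 residues).
UCN1 = "DDPPLSIDLTFHLLRTLLELARTQSQRERAEQNRIIFDSV"


def cys_analog(sequence: str, position: int) -> str:
    """Return the sequence with the residue at 1-based position replaced by Cys."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position out of range")
    return sequence[:position - 1] + "C" + sequence[position:]


analog_12 = cys_analog(UCN1, 12)
print(f"length: {len(UCN1)} residues")
print(f"native residue at position 12: {UCN1[11]}")
print(f"Cys12 analog: {analog_12}")
```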
We thank J. Xu for help with the NMR measurements, M. Beyermann (Leibniz Institute of Molecular Pharmacology, Germany) for synthesizing the Cys-Ucn-1 analogs, and the Vale laboratory (Salk Institute) for the polyclonal rabbit anti-urocortin. H.R. was partially funded by the Nomis Postdoctoral Fellowship. I.C. was supported by a Marie Curie fellowship from the European Commission within the 7th framework program. L.W. acknowledges support from the California Institute for Regenerative Medicine (RN1-00577-1) and US National Institutes of Health (1DP2OD004744-01, P30CA014195).
Note: Any Supplementary Information and Source Data files are available in the online version of the paper.
AUTHOR CONTRIBUTIONS
Z.X. designed and synthesized the Uaa, tested the reaction, analyzed the data and wrote the manuscript; H.R. performed affibody-Z expression and complex formation, expressed and purified fluorescent proteins, measured quantum yields, analyzed the data and wrote the method section; Y.S.H. and H.C. performed single-molecule imaging, analyzed the data and wrote the single-molecule section; I.C. performed the CRF-R1 experiments and analyzed the data; J.W. characterized Uaa incorporation by MS, analyzed the data and wrote the MS section; L.W. conceived and directed the project, analyzed the data and wrote the manuscript.
COMPETING FINANCIAL INTERESTS
The authors declare competing financial interests: details are available in the online version of the paper.
Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.