A critical source of insight into biological function is derived from the chemist’s ability to create new covalent bonds between molecules, whether they are endogenous or exogenous to a biological system. A daunting impediment to selective bond formation, however, is the myriad of reactive functionalities present in biological milieu. The high reactivity of the most abundant molecule in biology, water, makes the challenges all the more difficult.
We have met these challenges by exploiting the reactivity of sulfur and selenium in acyl transfer reactions. The reactivity of both sulfur and selenium is high compared with that of their chalcogen congener, oxygen. In this Account, we highlight recent developments in this arena, emphasizing contributions from our laboratory.
One focus of our research is furthering the chemistry of native chemical ligation (NCL) and expressed protein ligation (EPL), two related processes that enable the synthesis and semisynthesis of proteins. These techniques exploit the lower pKa of thiols and selenols relative to alcohols. Although a deprotonated hydroxyl group in the side chain of a serine residue is exceedingly rare in a biological context, the pKa values of the thiol in cysteine (8.5) and of the selenol in selenocysteine (5.7) often render these side chains anionic under physiological conditions. NCL and EPL take advantage of the high nucleophilicity of the thiolate as well as its utility as a leaving group, and we have expanded the scope of these methods to include selenocysteine. Although the genetic code limits the components of natural proteins to 20 or so α-amino acids, NCL and EPL enable the semisynthetic incorporation of a limitless variety of nonnatural modules into proteins. These modules are enabling chemical biologists to interrogate protein structure and function with unprecedented precision.
We are also pursuing the further development of the traceless Staudinger ligation, through which a phosphinothioester and azide form an amide. We first reported this chemical ligation method, which leaves no residual atoms in the product, in 2000. Our progress in effecting the reaction in water, without an organic cosolvent, was an important step in the expansion of its utility. Moreover, we have developed the traceless Staudinger reaction as a means for immobilizing proteins on a solid support, providing a general method of fabricating microarrays that display proteins in a uniform orientation.
Along with NCL and EPL, the traceless Staudinger ligation has made proteins more readily accessible targets for chemical synthesis and semisynthesis. The underlying acyl transfer reactions with sulfur and selenium provide an efficient means to synthesize, remodel, and immobilize proteins, and they have enabled us to interrogate biological systems.
As articulated by Trost in 1973,1−3 “chemoselectivity” refers to the preferential reaction of a chemical reagent with one of two or more different functional groups.3 In modern chemical biology, the desire to form covalent bonds with molecules endogenous or exogenous to a biological system has made the search for chemoselective reactions into a sine qua non. The challenge of developing such chemoselective reactions is amplified by the high reactivity of the most abundant molecule in biological systems, water.
Nature provides some inspiration. For example, the thiol group of coenzyme A4 is a means to convey acetyl groups, via a thioester, into the citric acid cycle. Similar thiols,5 as well as selenols,6,7 play important roles in numerous biological pathways and in the biosynthesis of polyketides and the nonribosomal biosynthesis of peptides.8 This utility is due to the distinct properties of sulfur and selenium compared with their chalcogen congener, oxygen.9 A key difference is the acidity of an alcohol, thiol, and selenol. Although a deprotonated hydroxyl group in the side chain of a serine residue is exceedingly rare in a biological context, the pKa10 of the thiol in cysteine (8.5) and of the selenol in selenocysteine (5.7) decree that both of these side chains are often anionic under physiological conditions. These low pKa values enhance the reactivity of cysteine and selenocysteine at physiological pH, as well as the reactivity of their thio- and selenoester counterparts. For example, a thioester is 10²-fold more reactive toward amine and thiolate nucleophiles than is an isologous oxoester but has a comparable resistance to hydrolysis,11−13 and thermodynamic stability increases in the order thioester < oxoester < amide.14,15 This versatile and chemoselective reactivity, coupled with the low abundance16 of cysteine and selenocysteine relative to other proteinogenic amino acids, enables their utility in the synthesis and semisynthesis of proteins. Here, we review work from our laboratory that has exploited acyl transfer reactions with the chalcogens sulfur and selenium.
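The link between these pKa values and the ionization state of each side chain can be made concrete with the Henderson–Hasselbalch relation. The following is a minimal sketch, not part of the original work: it uses the pKa values quoted above, assumes a pH of 7.4 as a stand-in for "physiological conditions", and treats each side chain as an isolated monoprotic acid (the function name is illustrative).

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of an acid present as its conjugate base."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values quoted in the text; pH 7.4 is an assumption, not a value from the text.
SIDE_CHAIN_PKA = {"cysteine thiol": 8.5, "selenocysteine selenol": 5.7}
ASSUMED_PH = 7.4

for name, pka in SIDE_CHAIN_PKA.items():
    anionic = fraction_deprotonated(pka, ASSUMED_PH)
    print(f"{name}: {anionic:.1%} anionic at pH {ASSUMED_PH}")
```

Under these assumptions, roughly 7% of the thiol and 98% of the selenol are anionic. The local protein environment can shift either pKa, so these figures are only a guide.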
A popular method in chemical biology that exploits the unique reactivity of the chalcogens is native chemical ligation (NCL). Precedented by reactivity discovered by Wieland in 195317 and developed by Kent and co-workers starting in 1994,18,19 NCL is a two-step process that uses the high nucleophilicity of the thiolate anion, as well as its ability to act as a leaving group,20 to join two peptides. Specifically, the thiolate of an N-terminal cysteine residue of one peptide reacts with a C-terminal thioester installed in a second peptide, forming an amide bond after rapid S- to N-acyl group transfer (Figure 1). An extension of NCL, expressed protein ligation (EPL),21−23 employs an engineered intein to access a polypeptide containing a C-terminal thioester, which can react subsequently with the thiolate of an N-terminal cysteine residue.
We utilized EPL to construct the paradigmatic enzyme bovine pancreatic ribonuclease (RNase A24,25).26 RNase A consists of 124 residues, including eight cysteines that form four disulfide bonds in the native enzyme.24 We achieved its semisynthesis with cysteine 95 as the point of disconnection. After expression of the fusion protein and thiol-induced cleavage, the RNase A(1–94) fragment with a C-terminal thioester was reacted with the N-terminal cysteine of a peptide corresponding to residues 95–124, thereby reconstituting wild-type RNase A. Altogether, the semisynthetic route (Figure 2) required four distinct acyl transfer reactions involving sulfur.
The genetic code limits the components of natural proteins, like RNase A, to 20 or so α-amino acids. In contrast, EPL enables the semisynthetic incorporation of a limitless variety of nonnatural modules into proteins. These modules are enabling chemical biologists to interrogate protein structure and function with unprecedented precision.
Our work with EPL has focused on the reverse turn.28,29 Compared with α-helices and β-sheets, which are buttressed by numerous hydrogen bonds, turns are unconstrained and unstable. Moreover, turns are often a preferred site for degradation by proteolytic enzymes.30,31 Hence, it is important to identify reverse turn mimics that can endow stability and withstand proteolysis.32
Toward this end, we replaced two residues that form a reverse turn in RNase A (asparagine 113–proline 114) with a variety of synthetic mimics (Figure 3). The first was a module consisting of two cyclic β-amino acid residues, R-nipecotic acid–S-nipecotic acid (R-Nip–S-Nip).27 This dipeptide unit was known to form an internal hydrogen bond33 that promotes β-hairpin formation.34,35 We found the catalytic activity of the ensuing variant of RNase A to be indistinguishable from that of the wild-type enzyme. Moreover, the variant had greater thermostability (ΔTm = 1.2 ± 0.3 °C). Installing a diastereomeric analog that cannot form a turn (R-Nip–R-Nip) caused the enzyme to lose nearly all catalytic activity.
Likewise, we used EPL to replace proline 114 in RNase A with the non-natural amino acid 5,5-dimethyl-L-proline (dmP).36 The dmP module is unusual in being an α-amino acid that forms almost exclusively cis (that is, E) peptide bonds,38−41 which are a signature of type VI reverse turns.42 The catalytic activity of this analog was found to be virtually identical to that of the wild-type enzyme, and it again was endowed with increased thermostability (ΔTm = 2.8 ± 0.3 °C), along with faster folding. To probe further the effect of cis peptide bonds on the rate of peptide folding, EPL was used to replace asparagine 113–proline 114 with a 1,5-triazole surrogate made by a Ru(II)-catalyzed Huisgen 1,3-dipolar cycloaddition reaction.37 The 1,5-triazole, which mimics a cis peptide bond, enables the enzyme to retain high catalytic activity and thermostability. In contrast, the regioisomeric 1,4-triazole, made by Cu(I)-catalyzed cycloaddition of the same components, has compromised thermostability.
Proteins contain several nucleophiles but no electrophiles (other than disulfide bonds). We used intein-mediated protein splicing to develop a general strategy for intercepting a transiently formed electrophile as a means of appending a useful functional group to the C-terminus of a protein (Figure 4). Upon examining the capture of a model thioester by various nitrogen-based nucleophiles, we found that hydrazines give the highest rates for S- to N-acyl transfer.43 The intein-thioester of RNase A, when treated with hydrazino azide 1, led to RNase A labeled at the C-terminus with a versatile azido group.
L-Selenocysteine (Sec or U), often referred to as the “21st amino acid”,44−47 is not produced by posttranslational modification but rather shares many features with the 20 common amino acids. Selenocysteine has its own codon and its own unique tRNA molecule, and is incorporated into proteins by ribosomes.48 The incorporation of selenocysteine rather than cysteine enables proteins to avoid irreversible oxidation, because a seleninic acid (unlike a sulfinic acid) can be reduced readily.49 Many natural proteins that contain selenocysteine are known,50 yet direct introduction of selenium into an existing protein remains a challenge. We reasoned that selenocysteine, like cysteine, could effect both NCL and EPL and thereby provide a means to incorporate selenocysteine into proteins. A selenolate (RSe⁻) is more nucleophilic than its analogous thiolate (RS⁻).51−54 Moreover, the pKa of a selenol (RSeH) is lower than that of its analogous thiol (RSH). These properties suggested to us that native chemical ligation with selenocysteine should be more rapid than that with cysteine, especially at low pH. We tested this hypothesis by comparing the rates at which cysteine and selenocysteine react with a model thioester. We found that selenocysteine reacts 10³-fold faster than does cysteine at pH 5.0,55 providing high chemoselectivity (Figure 5).
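The expectation that the advantage of selenocysteine should grow "especially at low pH" follows in part from ionization alone. The sketch below is a hedged illustration rather than a model of the measured kinetics: it compares the anionic fractions of the two side chains across a few arbitrarily chosen pH values, using the side-chain pKa values of 8.5 (cysteine) and 5.7 (selenocysteine) given earlier in this Account. It ignores the intrinsic difference in nucleophilicity between a selenolate and a thiolate, so it should not be read as deriving the observed rate enhancement.

```python
PKA_CYS = 8.5  # cysteine thiol, as quoted earlier in the text
PKA_SEC = 5.7  # selenocysteine selenol, as quoted earlier in the text


def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid present as its anion at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def anion_ratio(ph: float) -> float:
    """Selenolate fraction divided by thiolate fraction at a given pH."""
    return fraction_deprotonated(PKA_SEC, ph) / fraction_deprotonated(PKA_CYS, ph)


# The pH values below are illustrative choices, not conditions from the study
# (apart from pH 5.0, which is the pH of the reported rate comparison).
for ph in (5.0, 6.0, 7.0, 8.0):
    print(f"pH {ph:.1f}: selenolate/thiolate fraction ratio = {anion_ratio(ph):.1f}")
```

The ratio falls steadily as the pH rises toward the pKa of cysteine, which is consistent with the qualitative argument that selectivity for selenocysteine is greatest under acidic conditions.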
Next, we explored the utility of selenium in EPL. We made residues 1–109 of RNase A bearing a C-terminal thioester using rDNA technology and solid-phase peptide synthesis to access residues 110–124 with either cysteine or selenocysteine56 as residue 110. After ligation, the two synthetic proteins, RNase A and C110U RNase A, had indistinguishable catalytic activity. A disulfide bond between cysteine 58 and cysteine 110 makes a significant contribution to catalytic activity.57 Accordingly, we concluded that the C110U variant formed a selenosulfide bond. Because the reduction potential of selenosulfide and diselenide bonds is less than that of the corresponding disulfide,58 this strategy could be used to endow a protein with high conformational stability in a reducing environment, such as the cytosol. Finally, we exploited the mechanism of intein-mediated protein splicing to access a protein with a pendant C-terminal selenocysteine residue that is poised for a chemoselective reaction, even in the presence of cysteine residues.59
NCL and EPL enable the synthesis and semisynthesis of proteins. These methods are limited, however, by the requirement for a cysteine residue at the ligation juncture. Several methods (including an inspirational modification of the venerable Staudinger reaction by Bertozzi60) overcome this limitation but are limited otherwise in adding exogenous atoms to the product. Another approach is to desulfurize the ligation product, thereby accomplishing, in effect, an alanine ligation.61−63 For this method to be effective, all other sulfur moieties must be resistant to the desulfurization conditions.64 This approach has been extended to accomplish ligations at valine,65,66 lysine,67 threonine,68 and leucine residues.69,70
In 2000, we reported on a chemical ligation method that leaves no residual atoms in the product and that exploits the chemoselectivity of sulfur in acyl transfer reactions. First, the nucleophilicity of sulfur is used to create a phosphinothioester at the C-terminus of a peptide by S- to S-acyl transfer from an extant thioester71,72 to a phosphinothiol (Figure 6). Treating the incipient C-terminal phosphino group with a second peptide containing an N-terminal azido group73 initiates the Staudinger reaction, generating an iminophosphorane intermediate. Intramolecular S- to N-acyl transfer from the sulfur of the thioester to the nitrogen of the iminophosphorane generates an amidophosphonium salt. Hydrolysis of the P–N bond provides a nascent peptide bond in a traceless manner. By generating N₂ and a phosphine oxide, the traceless Staudinger ligation adds the thermodynamic driving force of the Staudinger reaction to that of native chemical ligation.74
The attributes of the phosphinothiol reagent are the key to achieving a Staudinger ligation in high yield. During the course of our work, we have synthesized and evaluated numerous P,P-diaryl phosphinothiols that effect the transformation (Figure 7). (P,P,P-Trialkyl phosphinothiols are also effective,77 but are highly prone toward oxidation.) The first reagent used to perform a traceless Staudinger ligation was o-phosphinobenzenethiol (2).75 Although capable of carrying out the desired coupling between a phenylalanyl thioester and glycyl azide, it gave only a 35% yield of the desired peptide product (Figure 8). We learned that this low yield was due to competition with the reduction to the amine. The Staudinger ligation with phosphinothiol 2 (and 3(78,79)) occurs through a transition state with a six-membered ring. We reasoned that by accessing instead a transition state with a five-membered ring,80 we could favor the ligation over the reduction pathway. To assess our reasoning, we synthesized diphenylphosphinomethanethiol (4) and found that it facilitated the same coupling in a much more impressive 85% yield over two steps (Figure 8).76
To expand further the scope of this transformation, it was necessary to determine the reactivity of chiral azides. All natural α-amino acids except glycine have a stereogenic center at their α-carbon. To be useful as a means to couple peptides, a ligation reaction must proceed without any measurable epimerization. To address this concern, the azido benzamides of both enantiomers of phenylalanine, serine, and aspartic acid were synthesized and subjected to Staudinger ligation mediated by phosphinothiol 4.84 Phenylalanine, serine, and aspartic acid were chosen as representatives of three distinct side chains with moderate to high propensity for epimerization during standard peptide couplings.85 In all cases, the ligation proceeded in excellent yield (>90%) to give the expected amides without any loss of enantiomeric excess as determined by chiral HPLC (Figure 9).
The Staudinger ligation has proven to be a versatile alternative to NCL, EPL, and resin-based methods for the synthesis of peptides.86 To expand this versatility to protein production, we tested the ability of the Staudinger ligation to work in concert with NCL.87 Specifically, we sought to assemble RNase A from three peptide fragments composed of residues 1–109, 110–111, and 112–124 (Figure 10). The 109–110 peptide bond would be formed by NCL, and the 111–112 peptide bond by Staudinger ligation. To accomplish this feat, the 112–124 fragment was synthesized with an N-terminal azido group and with its C-terminus attached to PEGA resin. This fragment was treated with the C-terminal phosphinothioester of the 110–111 fragment88 and then cleaved from the resin to give RNase A(110–124) in 61% yield. This process was also carried out with [¹³C′,¹³Cα,¹⁵N]proline 114 to give the analogous labeled peptide. RNase A(1–109) was prepared with a C-terminal thioester by using an intein and coupled with cysteine 110 of both labeled and unlabeled RNase A(110–124) to give enzymes with full catalytic activity. One-dimensional HSQC NMR experiments with the labeled semisynthetic protein confirmed that the 113–114 peptide bond had the expected cis conformation (Figure 3). Along with NCL and EPL, the traceless Staudinger ligation has made proteins accessible targets for chemical synthesis and semisynthesis.74
Simple aryl or alkyl phosphinothiol reagents are highly effective at mediating the Staudinger ligation when a glycine residue is at one of the two coupling sites. When neither residue at the coupling site is glycine, however, the ligation yield drops sharply.89 To address the inefficiency of hindered (that is, non-glycyl) coupling reactions, we tuned the electron density on the phosphorus by adding substituents to the aryl rings of the phosphinothiol.81 We found that the electron-donating p-methoxy groups in phosphinothiol 7 enabled efficient ligation (>80%) of both alanine and phenylalanine thioesters with an alanine azide (Figure 11).81 In a related kinetic study with phosphinothiols 4–9, we found that the rate of ligation of sterically hindered amino acids increases but the yield of product decreases with electron donation.82 We suspect that too much electron density renders the iminophosphorane nitrogen highly susceptible to protonation by water, which leads to hydrolysis of the P–N bond prior to the desired S- to N-acyl transfer.
As with any new chemical transformation, gaining insight into the kinetics and mechanism of the reaction is vital. Experiments with ¹⁸O-labeled water confirmed79 that the reaction proceeds by S- to N-acyl transfer of the iminophosphorane intermediate to form an amidophosphonium salt, which hydrolyzes to give exclusive ¹⁸O incorporation in the phosphine oxide byproduct (Figure 12). In addition, a continuous assay based on ¹³C NMR spectroscopy revealed that the rate-determining step in the Staudinger ligation was the formation of the initial phosphazide intermediate.90 A second-order rate constant of 7.7 × 10⁻³ M⁻¹ s⁻¹ was determined for the reaction, which is consistent with that for other Staudinger ligations. The NMR experiment showed rapid conversion of starting materials to products without the accumulation of intermediates. Polar solvents were shown to increase the rate of the reaction, providing further support for a charged phosphazide intermediate in the rate-determining transition state.
The ability to carry out the Staudinger ligation in water without an organic cosolvent was an important hurdle in the expansion of its utility.91 The key to this endeavor was the careful design of a new phosphinothiol reagent that retained high reactivity while attaining water solubility. We synthesized the phosphinothiols 10–14, which all contain a thiomethyl group and exhibit water solubility. Bis(p-dimethylaminoethylphenyl)phosphinomethanethiol (11) was shown to mediate the rapid ligation of equimolar substrates in water. Moreover, this reagent also performed an S- to S-acyl transfer reaction with the thioester intermediate formed during intein-mediated protein splicing of RNase A without the need for a catalytic small-molecule thiol.92,93 In a related study, we investigated the proximity of the amino groups to the reaction center.83 With its cationic dimethylammonium groups close to its phosphorus, phosphinothiol 13 proved to be a superior reagent for mediating a traceless Staudinger ligation in water, enabling yields of 70% near pH 8.0 (Figure 13).
The traceless Staudinger ligation is a versatile new tool for protein chemistry. The chemoselective reaction takes advantage of the unique properties of sulfur as both a good nucleophile and, ultimately, a good leaving group. We reasoned that, in addition to its synthetic utility, the reaction also provides a means to immobilize proteins on a solid support.
To test the applicability of the Staudinger ligation for protein immobilization, we chose RNase S as a target.94 RNase S, the archetypal protein-fragment complementation system,95 consists of S15 (residues 1–15 of RNase A) and S-protein (residues 21–124). We synthesized S15 in two forms, one with the azido group in place of the side-chain amino group of lysine 1 and another with an azido group attached to the N-terminus via a PEG linker. We then immobilized phosphinothiol 4 as a thioester on the surface of a glass slide and treated it with azido-S15 analog 15 or 16, followed by S-protein (Figure 14). S15 was immobilized on the surface rapidly (t1/2 < 1 min) in 67% yield, and the S-protein·S15 complexes formed with immobilized 15 and 16 retained 85% and 92%, respectively, of the catalytic activity of soluble RNase S.
Next, we applied the traceless Staudinger ligation to an intact protein. We did so by displaying phosphinothiol 4 as a thioester on a self-assembled monolayer on a gold chip (Figure 15). An azido group was installed at the C-terminus of RNase A as before by intercepting its intein thioester with hydrazino azide 1 (Figure 4).96 Immobilization proceeded rapidly and selectively, and the immobilized protein retained its catalytic activity and was able to bind to a natural inhibitor protein. This strategy provides a general means to fabricate microarrays displaying proteins in a uniform orientation.
In the decade since its introduction, the traceless Staudinger ligation has provided synthetic chemists and chemical biologists with a chemoselective means to create an amide bond. By tuning the electronics and solubility of the phosphinothiols, we have identified optimized reagents for effecting the traceless Staudinger ligation in different contexts (Figure 16). Still, an important caveat exists. The rate constant for the fastest known Staudinger ligation at room temperature is k = 7.7 × 10⁻³ M⁻¹ s⁻¹ (Figure 12). In general, a second-order reaction between two equimolar reactants provides a 50% yield of product at time t = 1/(k[reactant]₀), where [reactant]₀ is the concentration of each reactant at t = 0.97 With reactant concentrations of 1 μM, this most rapid Staudinger ligation will require 4.1 years to form 0.5 μM of an amide product! Accordingly, the Staudinger ligation is useful for synthetic reactions (Figure 17) but requires extraordinary detection methods for application in a biological context.
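The 4.1-year estimate can be checked with a few lines of arithmetic. The sketch below assumes simple second-order kinetics for two equimolar reactants, for which the integrated rate law 1/[A] = 1/[A]₀ + kt gives a half-time of 1/(k[A]₀). The rate constant and the 1 μM concentration come from the text; the function names, the 365.25-day year, and the other concentrations in the loop are illustrative assumptions.

```python
K_LIGATION = 7.7e-3  # M^-1 s^-1, the rate constant quoted in the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length (365.25 days)


def half_time(k: float, c0: float) -> float:
    """Time (s) for 50% conversion of two equimolar reactants, each at c0 (M)."""
    return 1.0 / (k * c0)


def product_concentration(k: float, c0: float, t: float) -> float:
    """Product concentration (M) at time t (s), from 1/[A] = 1/[A]0 + k*t."""
    remaining = c0 / (1.0 + k * c0 * t)
    return c0 - remaining


# 1 uM is the case discussed in the text; 1 mM and 100 mM are illustrative extras.
for c0 in (1e-6, 1e-3, 1e-1):
    t_half = half_time(K_LIGATION, c0)
    print(f"[reactant]0 = {c0:.0e} M: half-time = {t_half:.3g} s "
          f"({t_half / SECONDS_PER_YEAR:.3g} years)")
```

Because the half-time scales inversely with the initial concentration, the same reaction that takes years at micromolar concentrations reaches 50% conversion far sooner at the higher concentrations typical of preparative synthesis.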
Designing new chemical reactions that can attain high chemoselectivity in the presence of a plethora of reactive functionalities found in native biological settings is and will continue to be an important goal for chemists and biologists alike. The unique acyl transfer capabilities of sulfur and selenium make them important tools for chemical biologists in this ongoing challenge.
We are grateful to L. L. Kiessling, R. J. Hondal, B. L. Nilsson, M. B. Soellner, A. Tam, J. Kalia, and our other co-workers for their contributions to this work. Our research on protein chemistry has been supported by Grant R01 GM044783 (NIH), the Materials Research Science and Engineering Center at the University of Wisconsin–Madison (NSF DMR-0520527), and the Guggenheim Foundation.
Nicholas A. McGrath received his Ph.D. in chemistry and chemical biology from Cornell University in 2010 under the direction of Professor Jon T. Njardarson. There, he developed synthetic routes to hypoestoxide, platensimycin, and the guttiferone family of natural products. He is now a Ruth L. Kirschstein–NRSA postdoctoral fellow in the group of R. T. Raines.
Ronald T. Raines is the Henry Lardy Professor of Biochemistry and a Professor of Chemistry at the University of Wisconsin–Madison. His research group has discovered an RNA-cleaving enzyme that is in a human clinical trial as an anticancer agent, provided fundamental insight on the stability of collagen and other proteins, and developed processes to synthesize proteins and convert crude biomass into useful fuels and chemicals.
National Institutes of Health, United States