Given the central role of erythrocytes in enabling life, it is not surprising that there has evolved a highly sophisticated system for their production and orchestration. Broadly speaking, the multifaceted erythrocyte management process is encompassed in the term “erythropoiesis.” In the early mammalian fetus, erythropoiesis begins in mesodermal cells in the yolk sac. With further development, the spleen and liver become the venues of erythrocyte production. Ultimately, the bone marrow becomes the “plant site” of erythropoiesis.
Upon sensing decreased oxygen in circulation, the kidneys secrete a hormone called erythropoietin (EPO). Contact of EPO with its receptor initiates signaling routines, which trigger erythropoiesis. Thus, EPO is clearly a major participant in erythropoiesis, which is central to life itself.
The history of erythropoietin and the intellectual milestones leading to its recognition, demonstration, purification, sequencing, expression, production, and multifaceted medical ramifications (including applications to anemia induced by dialysis, and cancer chemotherapy) are continually being updated in various review forums.
Notwithstanding its celebrated status in biology and medicine, the term “erythropoietin” – insofar as it implies a chemically discrete entity – is a misnomer. Erythropoietin (EPO), as encountered by researchers and employed by physicians, is actually a large family of entities. The primary protein structure is highly conserved, as are its sites of glycosylation. Indeed, its sole O-linkage (to glycophorin) at Ser126 is substantially conserved. By contrast, the remaining three oligosaccharide domains (i.e. the N-linkages at asparagines 24, 38 and 83) are not under tight genetic supervision. This uncharacteristically permissive biomanagement leads to a highly complex medley of EPO glycoforms, which has defied diligent efforts at separation. To our knowledge, erythropoietin as a homogeneous chemical entity, containing a defined unitary array of N-linked carbohydrate domains, was unknown prior to our study.
Curiously, our laboratory first became interested in erythropoietin from its presumably less functionally critical carbohydrate sectors. These complicated domains posed seemingly daunting problems from the perspective of organic synthesis. As we developed strategies and methods to deal with assembling suitably complex carbohydrate domains, we began, in ca. 2002, to fantasize about the possibility of generating homogeneous erythropoietin itself, solely through the resources of organic chemistry. This paper describes a major advance in the realization of this goal.
A prime reason that the EPO-directed venture gained increasing fascination in our chemistry-centered laboratory was the perception that, powerful as it was, the “state of the art” of protein synthesis was not then up to the task of solving the problem. It seemed that new methods and conceptual advances would be necessary to synthesize EPO by chemical means.
The two vital field resources then available toward the synthesis of an EPO-sized protein were step-by-step solid phase peptide synthesis (SPPS) and possible ligations for merging polypeptides. While there were, and still are, no reliable rules limiting the size of a polypeptide which is accessible by linear reiterative SPPS, the size of the EPO protein we were after (166-mer), in the context of its seriously hydrophobic stretches, not to speak of its four carbohydrate domains, seemed to place it out of the range of SPPS per se. Rather, SPPS, properly employed, could hopefully provide for the synthesis of useful (i.e. combinable) fragments of EPO. It would then be necessary to ligate judiciously selected subunits, bearing N- and O-oligosaccharide domains, to reach the target EPO itself.
Of the ligation methods then available, the seminal native chemical ligation (NCL) protocols of Kent and associates were certainly the most powerful. However, the NCL method requires an N-terminal cysteine at the N→C ligation site (see Figure 2). Examination of the primary structure of EPO protein reveals that the positioning of its four cysteine residues is such that NCL per se would be of likely value only for the Cys29 (or Cys33) site.
In Figure 2, we briefly summarize a menu of new methods that were developed for EPO and related protein-based synthesis projects. Of particular interest to this program early on was the development of the ortho-mercaptoaryl ester rearrangement (OMER) methodology. Thus, an incipient C-terminal thioester is generated from a phenyl ester following TCEP-mediated cleavage of an ortho-positioned disulfide bond (Figure 2b). The thioester, thus generated, participates as the C-terminus in an NCL ligation, or in HOBt-mediated ligations with other N-terminal protected but non-cysteine containing peptides. Moreover, it was shown that NCL could be applied to the coupling of glycopolypeptide fragments.[10a] This complements an earlier report of an NCL between two O-linked glycopeptides.[10b]
Another advance, which was stimulated by EPO and related projects, involved hindered isonitrile mediated N-terminal elongation of a polypeptide chain with an N-protected thioacid (Figure 2c). Finally, and most critically for the purpose at hand, we were able to accomplish major extensions of NCL, occasioned by the discovery of metal-free dethiylation (MFD) (Figure 2d). This finding has had a huge enhancing effect on the reach of NCL. Our first application of MFD was to enable N-terminal alanine ligations. NCL logic has been extended, from our laboratory and others, to embrace ligations at N-terminal valine, leucine, lysine, threonine, proline, and phenylalanine.
Cumulatively, these capabilities served to change the landscape of retrosynthetic analysis in the polypeptide field, by building into the planning exercise options for using non-cysteine containing N-terminal fragments, which would reenter the world of proteogenic amino acids through MFD.[12a] The first demonstration of the consequences of this capability in the context of building therapeutic-sized proteins arose from our recently reported synthesis of the human parathyroid hormone (hPTH),[19a] as well as truncated versions thereof. Following this demonstration, the combined NCL/MFD logic for the synthesis of glycoproteins has been reported by other laboratories.[19b,c]
As it turned out, our earlier engagements in trying to synthesize erythropoietin preceded some of these enabling discoveries. Our most advanced point previous to this disclosure, using largely OMER technology, as well as applications to glycopeptide synthesis, brought us to three fragments – EPO(1-28), EPO(29-77), and EPO(78-166) – which formally correspond to erythropoietin in need of two ligations (Figure 3). Unfortunately, the weak acyl donor and acyl acceptor reactivity of the various fragments, in the face of solubility problems with the partially protected substrates, and serious aggregation tendencies, served to frustrate all attempts to join EPO(78–166) with EPO(29–77) in a meaningful yield.
Fortunately, at the time that these major limitations were surfacing, the MFD method, which vastly extended the logic of NCL, came into use. At that difficult juncture, we could undertake a revised plan for the total synthesis of erythropoietin, the success of which we are pleased to present below.
A central question that we were addressing in the first instance was the ability to implement the new technologies described above to reach a homogeneous, “wild-type” erythropoietin. Unlike concurrent and illuminating programs in other laboratories, which were also exploiting the possibilities of EPO retrosyntheses based on MFD technology, we set as a non-compromisable condition that our target would be strictly of the wild-type, rather than contain artificial mutations to simplify handling and isolation. Furthermore, we adopted as a sine qua non that all three wild-type asparagine sites, and the one serine site, be glycosylated.
In so doing, we were asking whether we could obtain indications for erythropoietic activity from a homogeneous, but more simply glycosidated, EPO lacking the “high mannose” and sialic acid containing sectors (see Figure 1). We would hope to determine whether a structure of that sort would be foldable and would manifest both activity and stability. Finally, we hoped to compare the properties of such a homogeneous synthetically derived construct with those of wild-type “aglycone protein”. In this fashion, we would be providing the scientific basis needed to address the fascinating question as to why nature glycosidates many of its most precious proteins.
In light of these considerations, we set our sights on a homogeneous EPO glycoform incorporating three N-linked chitobiose disaccharides – at Asn24, Asn38, and Asn83 – and the O-linked glycophorin, at Ser126 (Figure 4). We envisioned that the unfolded EPO primary structure might be assembled in a convergent fashion from four glycopeptide fragments (I–IV) via iterative alanine and cysteine-based ligations. The plan now called for installation of cysteine residues in place of Ala79 and Ala125, located at the N-termini of fragments III and IV, respectively. Following sequential cysteine-facilitated ligations, the EPO II–IV fragment would be in hand. This glycopolypeptide would then be subjected to global MFD, to convert the now-extraneous Cys functionalities to the requisite Ala residues at the ligation sites. As shown, the synthetic design required that Cys161 and Cys33 be protected (with acetamidomethyl [Acm] groups) during the MFD step. Finally, we envisioned that EPO fragment II–IV, encompassing Cys29–Arg166, would undergo cysteine-based NCL with fragment I to deliver, following removal of the Acm protecting groups, the glycosylated full EPO primary structure. We were hopeful, but not confident, that appropriate folding conditions might be identified for synthetically derived full length EPO (1).
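The fragment plan above can be checked with a short bookkeeping sketch. This is only an illustration, not part of the synthetic work: the class and function names are hypothetical, and the fragment boundaries (I = 1–28, II = 29–78, III = 79–124, IV = 125–166) are inferred from the residues named in the text (Cys29, Gln78, Ala79, Ala125, Arg166) rather than quoted from a scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str
    start: int  # first residue, 1-based, inclusive
    end: int    # last residue, inclusive


# Assumed boundaries, inferred from residues named in the text.
FRAGMENTS = [
    Fragment("I", 1, 28),
    Fragment("II", 29, 78),
    Fragment("III", 79, 124),
    Fragment("IV", 125, 166),
]

# Glycosylation sites stated in the text.
GLYCOSYLATION_SITES = {24: "N-linked", 38: "N-linked", 83: "N-linked", 126: "O-linked"}


def tiles_sequence(fragments, length):
    """Return True if the fragments cover residues 1..length with no gap or overlap."""
    expected = 1
    for frag in sorted(fragments, key=lambda f: f.start):
        if frag.start != expected or frag.end < frag.start:
            return False
        expected = frag.end + 1
    return expected == length + 1


def junctions(fragments):
    """Return the N-terminal residue number of every fragment after the first."""
    ordered = sorted(fragments, key=lambda f: f.start)
    return [frag.start for frag in ordered[1:]]


def fragment_of(site, fragments):
    """Return the name of the fragment that contains the given residue number."""
    for frag in fragments:
        if frag.start <= site <= frag.end:
            return frag.name
    raise ValueError(f"residue {site} is not covered by any fragment")


if __name__ == "__main__":
    print("covers 1-166:", tiles_sequence(FRAGMENTS, 166))
    print("ligation junctions:", junctions(FRAGMENTS))
    for site, kind in sorted(GLYCOSYLATION_SITES.items()):
        print(f"residue {site} ({kind}) lies in fragment {fragment_of(site, FRAGMENTS)}")
```

Under these assumed boundaries each fragment carries exactly one glycosylation site, and the junctions fall at the native Cys29 and at the two alanine positions (79 and 125) where temporary cysteines are installed.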
Our synthesis commenced with the preparation of EPO fragment IV, which contains the O-linked glycophorin tetrasaccharide. We had previously demonstrated that fully protected O-linked Ser glycosylamino acid cassettes, of the type 2, can be readily prepared and subsequently employed in the synthesis of α-O-linked glycopeptides. As outlined in Scheme 1, NaOH–mediated global deprotection of 2, followed by reaction with Fmoc-thiazolidine succinimide ester (3) under basic conditions, afforded glyco-dipeptide 4. This intermediate was then coupled with alanine (2-ethyldithiolphenyl)ester, 5, to generate tripeptide 6, bearing a stable masked thioester equivalent.[9b]
Glycopeptide 6 was next subjected to NCL with peptide 7, itself prepared through SPPS using an Fmoc-based strategy (Scheme 2). Upon removal of the Fmoc group, glycopeptide 8 was in hand. Finally, thiazolidine ring opening of the N-terminal residue delivered EPO fragment IV (9) in good overall yield.
The next challenge was to assemble the requisite N-linked glycopeptide fragments (I–III). Toward this end, we took recourse to a recently disclosed one-flask aspartylation/deprotection protocol, invented for just such an application. As shown, the method allows for the highly convergent synthesis of complex glycopeptides from fully elaborated peptide and carbohydrate precursors. Notably, temporary placement of a pseudo-proline dipeptide at the n+2 position, relative to the Asp residue, serves (for reasons not yet clear) to suppress otherwise competitive aspartimide-based peptide decomposition pathways. Thus, as outlined in Schemes 3–5, the protected peptide fragments, bearing pseudo-proline motifs, were prepared through SPPS. Under our one-flask aspartylation/deprotection conditions, peptides 10, 12, and 14 were smoothly merged with chitobiose to deliver the target EPO fragments III, II, and I, respectively.
Having accomplished the syntheses of the four component glycopeptides, I–IV, we next explored strategies by which to merge these fragments en route to EPO. As shown in Scheme 6, glycopeptides 9 and 11 were coupled, under standard NCL conditions, to cleanly afford intermediate 16. This compound was subjected to cysteine-based ligation with glycopeptide 13, to provide fragment 17, corresponding to the EPO(29-166) domain. At this stage, glycopeptide 17 was subjected to our previously developed MFD protocol, to deliver 18, possessing the requisite Ala residues at the original ligation sites (79, 125, and 128). Following removal of the Acm groups,[24a] ligation between 19 and 15 delivered the EPO(1-166) primary structure, possessing all four sites of glycosylation. We note that, due to the poor solubility of the EPO (29-166) domain (cf. 19), the use of trifluoroethanol (TFE) as co-solvent in the final step was critical for the success of the final transformation.
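The selectivity of the MFD step (17 to 18) can be pictured with a toy model, offered as a sketch under stated assumptions rather than a description of the chemistry. The function name is hypothetical, and only residues named in the text are modeled: the temporary cysteines at positions 79, 125, and 128, and the Acm-protected Cys33 and Cys161. How Cys29 is masked during this step is not stated in the text, so it is left out.

```python
def metal_free_dethiylation(residues, acm_protected):
    """Toy model of MFD: every free Cys becomes Ala; Acm-protected Cys is untouched.

    residues: dict mapping residue number to three-letter amino acid code.
    acm_protected: set of residue numbers whose Cys carries an Acm group.
    """
    return {
        pos: ("Ala" if aa == "Cys" and pos not in acm_protected else aa)
        for pos, aa in residues.items()
    }


# Cysteines in the ligated intermediate that are named in the text.
before = {33: "Cys", 79: "Cys", 125: "Cys", 128: "Cys", 161: "Cys"}
after = metal_free_dethiylation(before, acm_protected={33, 161})

if __name__ == "__main__":
    for pos in sorted(after):
        print(pos, before[pos], "->", after[pos])
```

In this model the three ligation-site cysteines revert to the native alanines, while the two protected cysteines survive for later Acm removal.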
We concurrently explored the feasibility of an alternative, kinetic chemical ligation (KCL)–based route to 1. In order to gain optimal convergence, we hoped to achieve the one-flask merger of three fully elaborated EPO fragments (Fragment I′, II′, and III–IV) via NCL, facilitated by temporarily installed N-terminal cysteine residues. Following ligation, the cysteines would be converted to the native alanine residues through a global MFD step. As shown in Scheme 7, this strategy required the preparation of slightly modified versions of EPO Fragments I and II (20 and 21), such that the envisioned formal alanine ligation would be achievable. In the event, KCL of glycopeptides 20 and 21, followed by in situ activation of Gln78 alkylthioester using mercaptophenylacetic acid (MPAA) in the presence of glycopeptide 16, delivered the target glycopeptide 22. Following dialysis by centrifugal ultrafiltration, the crude mixture was directly subjected to standard desulfurization conditions, to afford the desired protected glycopeptide 23 in good yield. Finally, treatment of 23 with AgOAc in acetic acid solution served to remove all four Acm protecting groups, leading to the generation of the EPO(1-166) primary structure. The HPLC retention time and mass spectral data (Figure 5B) obtained from both routes were identical.
Folding of the protein sequence was conducted following the literature reported protocol, using CuSO4 as oxidant and N-lauroylsarcosine as an additive. Following folding, top-down mass spectrometry clearly indicated the formation of our desired folded protein (Figure 5C). The possibility of slight inhomogeneity arising from misfolded or unfolded protein, as exhibited in the HPLC trace (Figure 5A), cannot be ruled out.
In order to examine the consequences of protein glycosylation on the physical properties and biological activity of EPO, we sought to synthesize the non-glycosylated variant of EPO protein (EPO aglycone). We had previously disclosed the synthesis of the partially protected protein chain of non-glycosylated EPO (containing Acm groups at Cys7, Cys29, Cys33, and Cys161). In that work, the fully deprotected protein could not be reached by this route, due to the marginal stability of its partially protected precursor. In that first effort, we had observed that, upon assembly of the entire protein chain, the solubility of the non-glycosylated peptide decreased severely. For our purposes, this finding was particularly problematic because our synthetic strategy at the time called for a number of post-ligational operations on the full-length peptide sequence. Happily, non-glycosylated EPO protein could be prepared through adaptation of the first synthetic route employed above en route to glycosylated EPO, 1 (see Scheme 6). The main advantage offered by this strategy is the fact that the full peptide chain is assembled through NCL as the final step of the sequence, prior to protein folding.
Thus, as shown in Scheme 8, peptides 24 and 25 were combined via NCL, followed by conversion of the N-terminal thiazolidine to a cysteine residue, to yield 26. This compound was combined with 27 to form peptide 28. The non-native cysteine residues of peptide 28 were converted to wild-type alanine residues via MFD, and the Acm protecting groups on the remaining cysteine residues were removed by treatment with AgOAc to yield peptide 29. Synthesis of unfolded non-glycosylated EPO (31) was completed by ligation of peptides 29 and 30 under NCL conditions. As shown in the Supporting Information, the acquisition of high quality mass spectrometry data for protein 31 was limited, presumably due to the aggregation and solubility issues of the fully assembled EPO protein. This result is not entirely unexpected. We had noted that several studies claiming to have produced EPO aglycone by genetic engineering in bacterial cells[34–36] or via deglycosylation of EPO[37–39] never characterized the protein by the MS criterion. The fully assembled protein (31) was subjected to EPO folding buffer[24c] to form non-glycosylated EPO aglycone (32).
Not surprisingly, unfolded EPO(1-166) failed to exhibit measurable erythropoietic activity. We then turned to determination of the possible activity of samples of synthetic EPO, as well as synthetic EPO aglycone (32), that had emerged from the folding exercises. This question was explored by in vitro cell proliferation assays. Cord blood CD34+ cells were cultured in the presence of synthetic EPO. Erythroid colony formation was observed at various concentrations after 14 days for both the glycosylated and non-glycosylated forms of EPO. Glycosylated EPO demonstrated greater activity compared to the non-glycosylated protein, perhaps due in part to the poor shelf-life of non-glycosylated EPO. These findings demonstrate that a simplified form of EPO can promote the development of erythroid colonies from their corresponding progenitor cells. Additionally, the observed activity of synthetic non-glycosylated EPO complements prior studies which have found modest levels of in vitro activity for both semi-synthetic[37–39] and expressed non-glycosylated EPO.
In summary, the inaugural synthesis of EPO “wild type” polypeptide (1-166), glycosidated at the three “wild-type” N-linked sites and the one O-linked site, has been accomplished. The material, thus produced, has been characterized, both in non-folded and folded form. Clear erythropoietic activity has been manifested by the fully synthetic EPO. By contrast, the EPO aglycone, while sustainable at the level of bioassay, did not give rise to a supportive mass spectrum. These results tend to confirm the important role of glycosidation in maintenance of the glycoprotein stability, thereby presumably helping to enable biological activity. From here, we would hope to go on to the synthesis of the three N-linked domains containing the consensus sequence indicated in Figure 1. In so doing, it should be possible to evaluate, in greater detail, the role of the carbohydrate domains.
Perhaps the broader lesson to be learned is that chemical synthesis has now reached the maturity level to enable the synthesis of even a complex glycoprotein in foldable, biologically functional form. With this capability could come new understandings as to why nature glycosidates many proteins. Such understandings might well carry with them insights for improved therapeutic applications.
Dr. Ping Wang, Laboratory for Bioorganic Chemistry, Sloan-Kettering Institute for Cancer Research, 1275 York Avenue, New York, NY 10065 (USA)
Dr. Suwei Dong, Laboratory for Bioorganic Chemistry, Sloan-Kettering Institute for Cancer Research, 1275 York Avenue, New York, NY 10065 (USA)
Dr. John A. Brailsford, Laboratory for Bioorganic Chemistry, Sloan-Kettering Institute for Cancer Research, 1275 York Avenue, New York, NY 10065 (USA)
Dr. Karthik Iyer, Laboratory for Bioorganic Chemistry, Sloan-Kettering Institute for Cancer Research, 1275 York Avenue, New York, NY 10065 (USA)
Dr. Steven D. Townsend, Laboratory for Bioorganic Chemistry, Sloan-Kettering Institute for Cancer Research, 1275 York Avenue, New York, NY 10065 (USA)
Dr. Qiang Zhang, Laboratory for Bioorganic Chemistry, Sloan-Kettering Institute for Cancer Research, 1275 York Avenue, New York, NY 10065 (USA)
Dr. Ronald C. Hendrickson, Department of Pharmacology and Chemistry, Sloan-Kettering Institute for Cancer Research, 1275 York Avenue, New York, NY 10065 (USA)
Dr. JaeHung Shieh, Cell Biology Program, Sloan-Kettering Institute for Cancer Research, 1275 York Avenue, New York, NY 10065 (USA)
Dr. Malcolm A. S. Moore, Cell Biology Program, Sloan-Kettering Institute for Cancer Research, 1275 York Avenue, New York, NY 10065 (USA)
Prof. Samuel J. Danishefsky, Laboratory for Bioorganic Chemistry, Sloan-Kettering Institute for Cancer Research, 1275 York Avenue, New York, NY 10065 (USA). Department of Chemistry, Columbia University, 3000 Broadway, New York, NY 10027.