Chemokine receptors are critical regulators of cell migration in the context of immune surveillance, inflammation and development. The G protein-coupled chemokine receptor, CXCR4, is specifically implicated in cancer metastasis and HIV-1 infection. Here we report five independent crystal structures of CXCR4 bound to an antagonist small molecule IT1t and a cyclic peptide CVX15 at 2.5–3.2 Å resolution. All structures reveal a consistent homodimer with an interface involving helices V and VI that may be involved in regulating signaling. The location and shape of the ligand binding sites differ from other G protein-coupled receptors and are closer to the extracellular surface. These structures provide new clues about the interactions between CXCR4 and its natural ligand CXCL12 and with the HIV-1 glycoprotein gp120.
Chemokine receptors are G protein-coupled receptors (GPCRs) that, together with their small protein ligands, regulate the migration of many different cell types, most notably leukocytes (1–3). CXCR4, one of 19 known human chemokine receptors, is activated exclusively by the chemokine CXCL12 (also known as Stromal Cell-Derived Factor-1, SDF-1) and couples primarily through Gi proteins. Targeted deletion of CXCR4 or CXCL12 in mice causes embryonic lethality, with defects in vascular and CNS development, hematopoiesis, and cardiogenesis (4–5). CXCR4 has been associated with more than 23 types of cancers where it promotes metastasis, angiogenesis and growth/survival (6–10). Furthermore, T-tropic HIV-1 uses CXCR4 as a co-receptor for viral entry into host cells (11). The discovery that endogenous CXCL12 inhibits HIV-1 entry suggested the therapeutic potential of targeting CXCR4 to block viral infection (12–13). Despite a wealth of data related to CXCR4 and GPCRs in general, many aspects of ligand binding and signaling are poorly understood at the molecular level. For instance, CXCR4 has a propensity to form hetero- and homo-oligomers (14–15), and such oligomerization could play a role in the allosteric regulation of CXCR4 signaling (16). While structural understanding of GPCRs has benefited from a number of recent breakthroughs (17–20), coverage of the superfamily’s phylogenetic tree is incomplete, and a structure of a GPCR that is activated by a protein ligand has not been reported.
Here we report the crystal structures of human CXCR4 in complex with a small molecule antagonist at 2.5 Å resolution and with a cyclic peptide inhibitor at 2.9 Å resolution. Three stabilized constructs (CXCR4-1, CXCR4-2 and CXCR4-3; Table S1) expressed in baculovirus-infected Spodoptera frugiperda (Sf9) insect cells were selected for structural studies based on thermal stability, monodispersity, and lipid matrix diffusion. Similar to the previously determined high-resolution structures of the β2-adrenergic receptor (β2AR) (17, 21) and A2A adenosine receptor (A2AAR) (18), the CXCR4 constructs contain a T4 lysozyme (T4L) fusion inserted between transmembrane (TM) helices V and VI at the cytoplasmic side of the receptor. In addition, all three constructs contain a thermostabilizing L125^3.41W mutation (22–23). The constructs differ in the precise T4L junction site, the position of the C-terminal truncation, and the presence of a T240^6.36P mutation in CXCR4-3; all required further stabilization with ligands to facilitate purification and crystallization. Two antagonists were selected for crystallization trials based on ligand solubility, binding affinity, and induced protein thermostability (Tables S2 and S3): a small, drug-like isothiourea derivative (IT1t) (24) and CVX15, a 16-residue cyclic peptide analog of the horseshoe crab peptide polyphemusin that was previously characterized as an HIV-inhibiting and anti-metastatic agent (25–27).
Prior to crystallization trials, the effects of the protein engineering on CXCR4 function were evaluated using radioligand binding and calcium flux assays. CXCR4-WT expressed in Sf9 cells binds a [3H]bis(imidazolylmethyl) amine analog (BIMA) with an affinity similar to that of the same construct expressed in HEK293 cells (Kd 3.5 ± 1.5 and 3.7 ± 1.4 nM, respectively). All other constructs expressed in Sf9 cells also show similar binding affinity to BIMA and IT1t (Table S3). However, CXCR4-1 and CXCR4-2 display lower binding affinity for the CVX15 peptide compared to CXCR4-WT and CXCR4-3. Calcium flux assays demonstrated, as expected, that these constructs do not activate G proteins (Fig. S1), owing to the T4L insertion in the third intracellular loop, which is critical for G protein interactions. Assays with the same constructs lacking T4L confirmed that the stabilizing L125^3.41W mutation, as well as the various C-terminal truncations, did not adversely affect calcium release, while the T240^6.36P mutation, which is present only in the CXCR4-3 construct, abolished signaling.
After extensive optimization of crystallization conditions in lipidic mesophase, five distinct crystal forms were obtained (Table S4). CXCR4-1, CXCR4-2 and CXCR4-3 were co-crystallized with IT1t (two crystal forms for CXCR4-2), while crystals of CXCR4-3 were also obtained with CVX15. Data collection and refinement statistics for all five crystal forms are shown in Table S1 (28).
The overall structure of CXCR4 bound to the small molecule antagonist IT1t is conserved in all crystal forms with a Cα RMSD of 0.6 Å. Binding of the CVX15 cyclic antagonist peptide induced conformational differences relative to IT1t in the CXCR4-3/CVX15 structure (Cα RMSD 0.9 Å). For clarity, we focus on the highest resolution crystal form of CXCR4-2/IT1t (2.5 Å, monomer A) for discussion of the CXCR4 structural features and comparison with other GPCR structures. The final model includes 326 of the 352 residues of CXCR4 and residues 2–161 of T4L. The remaining N-terminal 26 residues did not have interpretable density and are presumed to be disordered. The main fold of CXCR4 consists of the canonical bundle of 7 TM α-helices (Fig. 1A), which shows about the same level of structural divergence from 7TM helical bundles of previously solved GPCR structures (Cα RMSDs ~2.0–2.2 Å) (Fig. 1B). The most striking differences in the disposition of the TM helices of CXCR4 are the following: i) The extracellular end of helix I is shifted towards the central axis of the receptor by 9 Å compared to β2AR and by more than 3 Å compared to A2AAR; ii) helix II makes a tighter helical turn at Pro92^2.58 resulting in ~120° rotation of its extracellular end compared to other GPCR structures (this rotation essentially introduces a one-residue gap in the sequence alignment that would result in wrong residues facing the ligand-binding pocket in a homology model that did not account for the rotation); iii) both intracellular and extracellular tips of helix IV in CXCR4 substantially deviate (~5 and ~3 Å, respectively) from their consensus positions in other GPCRs; iv) the extracellular end of helix V in CXCR4 is about one turn longer; v) helix VI has a similar shape in all structures and is characterized by a sharp kink at the highly conserved residue, Pro254^6.50; however, its extracellular end is shifted by ~3 Å in CXCR4 relative to β2AR and A2AAR; and finally vi) the extracellular end of helix VII in CXCR4 is two helical turns longer than in other GPCR structures. These two extra turns place Cys274^7.25 at the tip of helix VII in a strategic position to form a disulfide bond with Cys28 in the N-terminal region. Taken together, these multiple differences suggest that accurate homology modeling of even the CXCR4 TM bundle, let alone the entire structure, would be challenging.
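The Cα RMSD values quoted in this comparison are root-mean-square deviations between matched Cα atoms after superposition. As a minimal sketch of that calculation, assuming two matched (N, 3) coordinate arrays have already been extracted from a pair of structures, the Kabsch algorithm can be written with NumPy as follows. The function name `kabsch_rmsd` and its inputs are illustrative assumptions; this is not the software used to produce the values reported here.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two matched (N, 3) coordinate sets after optimal
    rigid-body superposition of P onto Q (Kabsch algorithm)."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove the translational component by centering both sets.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # 3x3 covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the residual deviation.
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

In practice the coordinate arrays would hold only the Cα atoms of structurally equivalent residues (for example, the TM helices), so the result depends on which residues are chosen as matched pairs.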
The extracellular interface of CXCR4 consists of 34 N-terminal residues, extracellular loop 1 (ECL1, residues 100–104) linking helices II and III, ECL2 (residues 174–192) linking helices IV and V, and ECL3 (residues 267–273) linking helices VI and VII (Fig. 1A). Clear density starts at Pro27, adjacent to Cys28, which pins the base of the N-terminal segment to Cys274^7.25 at the tip of helix VII via a disulfide bond; these two cysteines are conserved in all chemokine receptors except CXCR5 and CXCR6 (Fig. S2). Another disulfide links Cys109^3.25 with Cys186 of ECL2, which is the largest extracellular loop in CXCR4. While ECL2 length, sequence and secondary structure vary dramatically in GPCRs, the disulfide connecting ECL2 with the extracellular end of helix III is highly conserved in chemokine receptors and most other Class A GPCRs. Both disulfide bonds at the extracellular side of CXCR4 are critical for ligand binding (29), and the crystal structure shows that they function by constraining ECL2 and the ordered part of the N-terminal segment (residues 27–34), thereby shaping the entrance to the ligand binding pocket.
The intracellular side of CXCR4 contains intracellular loop 1 (ICL1, residues 65–71) linking helices I and II, ICL2 (residues 140–149) linking helices III and IV, and ICL3 (residues 225–230) linking helices V and VI, and the C-terminus. ICL3 also contains T4L inserted between Ser229 and Lys230 and flanked by short linkers (GS-T4L-GS). Structural alignment of CXCR4 with high resolution GPCR structures indicates that the intracellular half of the TM bundle is structurally more conserved (Cα RMSDs with β2AR, A2AAR and rhodopsin are 1.8, 1.9 and 1.4 Å, respectively) than the extracellular half (2.6, 2.2 and 2.2 Å, respectively). Therefore, it comes as a surprise that in all five CXCR4 structures, helix VII is about one turn shorter at the intracellular side, ending right after the GPCR-conserved NPxxY motif, and that all structures lack the short α-helix VIII (Fig. 1B). The C-terminal part of CXCR4 beyond Ala303^7.54 adopts an extended conformation and participates in a number of crystal contacts with the extracellular side of a symmetry-related molecule in the highest resolution crystal form, CXCR4-2/IT1t (Fig. S4A), and is not traceable in the other four CXCR4 structures. Due to its structural persistence and common α-helical sequence motif (F[RK]xx[FL]xxx[LF]), helix VIII was thought to be a regular structural element of all Class A GPCRs. However, CXCR4 contains only a partially conserved motif, FKxxAxxxL, and while it may be capable of forming an α-helix under certain conditions, this helix would be less stable due to replacement of Phe/Leu with Ala. In addition, CXCR4 lacks a putative palmitoylation site at the end of helix VIII, which anchors to the lipid membrane in many GPCRs.
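The helix VIII motif quoted above can be read as a simple sequence pattern in which x stands for any residue. As a minimal sketch, assuming that lowercase x is the only wildcard in the notation, the motif can be converted to a regular expression and tested against short peptides. The helper names and the two test peptides are hypothetical placeholders constructed to satisfy or violate the pattern; they are not CXCR4 or other receptor sequences.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate motif notation such as 'F[RK]xx[FL]xxx[LF]' into a
    compiled regular expression; lowercase 'x' matches any residue."""
    return re.compile(motif.replace("x", "."))


# Consensus helix VIII motif as written in the text.
HELIX8_MOTIF = motif_to_regex("F[RK]xx[FL]xxx[LF]")


def has_helix8_motif(segment: str) -> bool:
    """True if the consensus motif occurs anywhere in the segment."""
    return HELIX8_MOTIF.search(segment) is not None


# Hypothetical peptides (glycine fills every 'x' position).
consensus_like = "FRGGFGGGL"   # Phe at motif position 5
ala_variant = "FKGGAGGGL"      # Ala at motif position 5, as in FKxxAxxxL

print(has_helix8_motif(consensus_like))  # True
print(has_helix8_motif(ala_variant))     # False
```

The second peptide fails only because position 5 is Ala rather than Phe or Leu, which is the single deviation that separates the partially conserved FKxxAxxxL pattern from the consensus.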
Construct CXCR4-3 contains a T240^6.36P mutation near the intracellular side of helix VI, which results in retention of ligand binding affinity, but abolishes signaling (Table S3 and Fig. S1). Comparison of the CXCR4-3 structure with CXCR4-1 and CXCR4-2 reveals that the only effect of the T240^6.36P mutation is the disruption of a short section of helix VI between Lys234^6.30 and Pro240^6.36. Since helix VI is thought to be one of the major players in the signaling mechanism (30–31), disruption of its structure would likely impact G protein binding and activation. Thus, T240^6.36P represents a novel structure-based uncoupling mutation.
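The two-part labels that accompany residue numbers in this text (6.36 for the T240P mutation, 6.30 for Lys234) follow the Ballesteros-Weinstein convention: the first number is the helix, and the second counts positions relative to the most conserved residue of that helix, which is assigned 50. As a minimal sketch, using Pro254 = 6.50 in helix VI as the reference stated earlier, the labels can be reproduced with simple arithmetic. The function name `bw_label` is illustrative, and the calculation assumes the residue lies in the same helix as the reference with no insertions or deletions between them.

```python
def bw_label(resnum: int, helix: int, ref_resnum: int) -> str:
    """Ballesteros-Weinstein-style label for residue `resnum`.

    `ref_resnum` is the residue number assigned position 50 in `helix`.
    Only meaningful when both residues sit in the same helix.
    """
    position = 50 + (resnum - ref_resnum)
    return f"{helix}.{position}"


HELIX_VI_REF = 254  # Pro254 is position 6.50 in CXCR4

print(bw_label(240, 6, HELIX_VI_REF))  # 6.36 (site of the T240P mutation)
print(bw_label(234, 6, HELIX_VI_REF))  # 6.30 (Lys234)
print(bw_label(262, 6, HELIX_VI_REF))  # 6.58 (Asp262)
```

Because the label is anchored to a conserved position rather than to the sequence number, equivalent positions can be compared directly between CXCR4 and other GPCRs.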
Strong electron density was observed for IT1t in the binding cavity of both subunits of the CXCR4 homodimer (Fig. S3A). Compared to previous GPCR structures, the cavity is larger, more open and is located closer to the extracellular surface (Fig. 2A, C, Fig. 3B & Table S5). The IT1t ligand occupies part of the pocket defined by side chains from helices I, II, III and VII, but makes no contact with helices IV, V and VI, in stark contrast to ligands in previous GPCR structures. The nitrogens of the symmetrical isothiourea group are both protonated with a net positive resonance charge, one of them (N4) forming a salt bridge (2.7 Å) with the Asp97^2.63 side chain. Note that the electron density does not preclude the existence of a very similar ligand conformation with a flipped thiourea group, in which the N3 nitrogen forms a salt bridge to Asp97^2.63, and the N4 nitrogen makes a polar interaction with the main-chain carbonyl of Cys186 in ECL2. The importance of both nitrogens is supported by a reduction in binding affinity of ~100-fold upon methylation of one of them (24). Both cyclohexane rings fit into small subpockets, making hydrophobic contacts with CXCR4. Connected by a short flexible linker, the imidazo-thiazole ring system is the only part of the ligand that contacts helix VII, in particular by making a salt bridge (2.8 Å) between the protonated imidazo-thiazole N1 and Glu288^7.39 (32).
In the CXCR4-3/CVX15 complex, the bulky 16-residue ligand fills most of the binding pocket volume (Fig. 2B, D, Fig. S3B & Table S5). The peptide forms a disulfide-stabilized (Cys4-Cys13) β-hairpin, with dPro8-Pro9 at the tip of the turn exposed to the extracellular milieu. The N-terminal part of the peptide backbone from Arg1 to Cys4 forms hydrogen bonds to CXCR4 backbone residues Asp187-Tyr190, adding a partial third strand to the ECL2 β-hairpin. The core specific interactions are formed by two arginines at the peptide N-terminus: Arg1 makes polar interactions with Asp187 (3.1 Å), while Arg2 interacts with Thr117^3.33 (2.9 Å) and Asp171^4.60 (3.0 Å) and may form an additional hydrogen bond with His113^3.29 (2.9 Å) depending on its protonation state. The bulky naphthalene ring of Nal3 is anchored in a hydrophobic region bordered by helix V. Arg14 makes a salt bridge with Asp262^6.58 (3.2 Å), and an intramolecular hydrogen bond to the Tyr5 side chain, which in turn makes hydrophobic contacts with helix V side chains. Finally, the C-terminal d-proline is buried in the pocket next to the N-terminus of the peptide, making a water-mediated interaction with the Glu288^7.39 side chain of CXCR4. The importance of the above interactions is supported by structure-activity relationship analyses of a series of CVX15 analogues (25).
The small molecule and peptide ligand binding sites significantly overlap (Fig. 3A). As CVX15 fills the entire pocket, some conformational variations between the two complexes are not surprising. CVX15 binding induces significant deviations in the base of the receptor N-terminus (residues 29–33), as well as a minor adjustment of the extracellular tips of helices VI (~1 Å inward), VII (~1 Å tangential) and V (~0.3 Å outward). The major differences between the binding modes of IT1t and CVX15 in CXCR4 and the ligand binding modes in β2AR, A2AAR and rhodopsin (Fig. 3B) highlight the structural plasticity of GPCR binding sites.
CXCR4 has been previously shown to homo- and hetero-dimerize, constitutively and upon ligand binding, by many different experimental methods (14–15, 33–39). While the functional importance of dimerization remains incompletely characterized, a significant body of data suggests that it has important in vivo pharmacological effects. For example, WHIM syndrome has been linked to mutations in the C-terminus of CXCR4 that result in truncated variants, which exhibit enhanced signaling and fail to desensitize and internalize upon CXCL12 stimulation. As a primarily heterozygous disease in which truncated CXCR4 is co-expressed with WT receptor, dimerization has been proposed as the most likely mechanism to explain the dominance of mutant CXCR4 over the WT receptor (40–41). The structures presented here corroborate the concept of CXCR4 dimerization and define the dimer interface for a human GPCR with substantial buried surface area (850 Å²). A similar parallel, symmetric dimer of CXCR4 is observed in all five crystal forms (Fig. 4 & Fig. S4), suggesting that these contacts represent a biologically relevant homodimer interface.
In dimers of CXCR4 bound to IT1t, the monomers interact only at the extracellular side of helices V and VI, leaving at least a 4 Å gap between the intracellular regions, which is presumably filled by lipids (Fig. 4A, B & Table S6). Dimer association is driven mostly by hydrophobic interactions involving Leu194^5.33/Val197^5.36/Val198^5.37, as well as Phe201^5.40-Phe201^5.40, Met205^5.44-Met205^5.44, and Leu210^5.49-Leu210^5.49 contacts. A substantial role is also played by a Trp195^5.34-Leu267^6.63 contact, which includes both side-chain stacking and a hydrogen bond from Trp195^5.34 (NE1) to the main-chain carbonyl oxygen of Leu267^6.63. Another specific polar interaction includes a hydrogen-bonding network between the side chains of Asn192 and Glu268 in opposing receptors, which also involves the main-chain carbonyl oxygens of Leu266^6.62 and Trp195^5.34. Pro191 in ECL2 likely plays a role in this network by stabilizing the Trp195^5.34 side-chain conformation. As these contacts persist throughout all five crystal forms, they are likely genuine, rather than artifacts of crystallization (Fig. 4E).
In addition, dimers of CXCR4 bound to CVX15 are stabilized by interactions at the intracellular ends of helices III and IV, and ICL2, controlled largely by hydrophobic interactions of Tyr135^3.51, Leu136^3.52, His140 and Pro147 side chains (~400 Å² buried) (Fig. 4C, D, F & Table S6). It appears that binding of the bulky CVX15 peptide induces a small tilt in the extracellular part of helix V, which brings the intracellular parts of opposing receptors into close contact. This type of ligand-induced conformational change could explain the cooperative binding observed with certain CXCR4 ligands, as well as the effects of allosteric modulators. Specifically, binding of a ligand to one receptor could induce a structural change in helix V of the second receptor, thereby modifying the ligand binding affinity to the second receptor, resulting in either negative or positive cooperativity. Extending this concept to chemokine receptor heterodimers, CXCR4 has been reported to dimerize with CCR2 and CCR5 and both complexes show negative binding cooperativity with their ligands not only in vitro but also in vivo (36, 38), an observation which may have significant implications for drug efficacy.
The CXCR4 dimer is strikingly different from previous models of GPCR dimerization, which suggested contacts through either helix I or helices IV/V (42–46) and implied contacts throughout the length of the TM bundle. It is also notable that with the exception of Trp195^5.34 (conservation ~70%), little sequence conservation is found among chemokine receptors for the residues that constitute the dimerization site, even though many receptors have been shown to oligomerize (39). The specific nature of the interactions may facilitate the ability of CXCR4 to heterodimerize with other chemokine receptors (36, 38, 47) as well as GPCRs outside of the chemokine family (48), although one cannot discount the possibility that many modes of oligomerization may exist.
The known structures of chemokines, including CXCL12, feature a disordered N-terminal domain that largely controls receptor signaling and is hypothesized to penetrate the receptor helical bundle (49–50). The chemokine N-terminus is followed by a core globular domain, which is thought to bind to the receptor N-terminus and ECLs, forming an interaction site that confers affinity and specificity (51). The separation of the binding and signaling functions has led to the so-called “two-site” model of receptor binding with the chemokine core domain being the “site one” docking domain and the chemokine N-terminus being the “site two” signaling trigger (49, 52–53). The NMR structure of CXCL12 complexed to a 38-residue, sulfotyrosine-containing peptide derived from the CXCR4 N-terminus has been determined (PDB ID: 2K05) (54). This structure is thought to represent at least part of the “site one” complex and reveals important interactions between CXCL12 and residues that are absent from the CXCR4 receptor structure, including three sulfated tyrosines.
The peptide and small molecule complexes of CXCR4 identify the likely “site two” of the chemokine signaling trigger. The IT1t compound and CVX15 peptide have both been characterized as competitive inhibitors of CXCL12, and many of the receptor-ligand contacts in the co-crystal structures presented are important for CXCL12 binding, including the acidic Asp187, Glu288^7.39 and Asp97^2.63 (Fig. 2) (55–56). The CVX15 peptide, rich in basic residues, may trace to some extent the path of the N-terminal signaling peptide of CXCL12 (KPVSLSYR), and the binding site of IT1t may point to the major anchor region for this domain. Furthermore, our preliminary modeling studies suggest that Lys1, the most critical residue in CXCL12 for receptor activation, could reach into the CXCR4 pocket and interact with one of these acidic residues (Fig. 5). The extensive binding site mapped out by the CVX15 peptide also clarifies how progressive shortening of the CXCL12 N-terminus leads to a gradual loss of binding affinity (49). Taken together, these data suggest that the small molecule and cyclic peptide block ligand binding by acting as orthosteric competitors of the CXCL12 N-terminal signaling trigger, providing strong support for the two-site model of binding. Along these lines, a recent NMR study showed that the CXCR4 antagonist AMD3100 could displace the CXCL12 N-terminus from the receptor without displacing the chemokine core domain (57).
Chemokines are able to bind their receptors as monomers in order to activate cell migration (58). However, chemokine oligomers, including those of CXCL12, appear to be functional and induce alternative signaling responses such as cellular activation or signals to halt migration (54, 59–60), giving rise to the concept that these complexes dynamically change their stoichiometries and structures as part of their functional regulation. Given the oligomeric nature of CXCR4 and the complementary electrostatic surfaces of the ligand and receptor, one can envision CXCL12 binding the receptor as a 1:1, 1:2, or 2:2 ligand:receptor complex (Fig. 5). Additional experiments will be necessary to fully define the relevance and functional implications of different chemokine:receptor stoichiometries and structures. Nevertheless, the current CXCR4 structures are compatible with emerging concepts of signaling diversity induced by alternative binding modes of the ligands.
CXCR4 and the related CCR5 serve as co-receptors for HIV-1 viral particles, facilitating their entry into cells. Structures have been reported for the other key components of the entry complex, HIV-1 glycoproteins gp120 and gp41, and the host leukocyte glycoprotein receptor CD4 (61–63). The N-termini of CXCR4 and CCR5, including sulfated tyrosine residues, have been implicated in gp120 binding, analogous to CXCL12 recognition (64). Other structural features critical to the interaction involve the gp120 V3 loop, which becomes exposed on CD4 binding (65), and then interacts with CXCR4 ECL2 and ECL3. The basic character of the protruding V3 loop along with acidic residues in the CXCR4 binding pocket have been reported to be important for HIV-1 infectivity (Fig. 2C & D) (56, 66), suggesting that the loop could also penetrate the pocket (Fig. 6). Thus, the CXCR4 structures suggest testable hypotheses regarding interaction of CXCR4 with its natural ligand and with HIV-1 gp120. The real challenge will be in understanding the dynamic changes in these complexes that result in signal transduction and viral fusion. As further details of these interactions are resolved, new opportunities for drug discovery efforts targeting specific functional states of the receptor will emerge.
This work was supported in part by the Protein Structure Initiative grant U54 GM074961 for structure production, NIH Roadmap Initiative grant P50 GM073197 for technology development, NIH grants R21 RR02536 and R21 AI087189 to VC, and Pfizer. Additionally, TMH acknowledges support from National Institutes of Health R01 AI037113 and R01 GM081763 and DJH from F32 GM083463. The authors thank Jeffrey Velasquez for help on molecular biology, Tam Trinh and Kirk Allin for help on baculovirus expression, Ian Wilson and Dennis Burton for careful review and scientific feedback on the manuscript, Bill Schief for electron microscopy models of the gp120/gp41/CD4 complex, Gye Won Han for evaluating the structure quality and preparation for PDB submission, and Angela Walker for assistance with manuscript preparation. The authors acknowledge Erik La Chapelle on chemistry tool compound synthesis, Anil Rane on radiolabelling of [3H]BIMA, and Mei Cui for help developing the [3H]BIMA binding assay; Yuan Zheng, The Ohio State University, and Martin Caffrey, Trinity College (Dublin, Ireland), for the generous loan of the in meso robot (built with support from the National Institutes of Health [GM075915], the National Science Foundation [IIS0308078], and Science Foundation Ireland [02-IN1-B266]); and Janet Smith, Robert Fischetti and Nukri Sanishvili at the GM/CA-CAT beamline at the Advanced Photon Source, for assistance in development and use of the minibeam and beamtime. The GM/CA-CAT beamline (23-ID) is supported by the National Cancer Institute (Y1-CO-1020) and the National Institute of General Medical Sciences (Y1-GM-1104). Atomic coordinates and structure factors have been deposited in the Protein Data Bank with identification codes 3ODU (CXCR4-2/IT1t, P21), 3OE0 (CXCR4-3/CVX15, C2), 3OE8 (CXCR4-2/IT1t, P1), 3OE9 (CXCR4-3/IT1t, P1), 3OE6 (CXCR4-1/IT1t, I222).
#This manuscript has been accepted for publication in Science. This version has not undergone final editing. Please refer to the complete version of record at http://www.sciencemag.org/. The manuscript may not be reproduced or used in any manner that does not fall within the fair use provisions of the Copyright Act without the prior, written permission of AAAS.
Materials and Methods