G protein-coupled receptors (GPCRs) comprise the most “prolific” family of cell membrane proteins, which share a common mechanism of signal transduction, but greatly vary in ligand recognition and function. Crystal structures are now available for rhodopsin, adrenergic, and adenosine receptors in both inactive and activated forms, as well as for chemokine, dopamine, and histamine receptors in inactive conformations. Here we review common structural features, outline the scope of structural diversity of GPCRs at different levels of homology and briefly discuss the impact of the structures on drug discovery. Given the current set of GPCR crystal structures, a distinct modularity is now being observed between the extracellular (ligand-binding) and intracellular (signaling) regions. The rapidly expanding repertoire of GPCR structures provides a solid framework for experimental and molecular modeling studies, and helps to chart a roadmap for comprehensive structural coverage of the whole superfamily and an understanding of GPCR biological and therapeutic mechanisms.
The G protein-coupled receptor (GPCR) superfamily comprises more than 800 distinct human proteins that share a common seven transmembrane α-helical (7TM) fold. Tracing their origins to the first eukaryotes, GPCRs have diverged in vertebrates into five major classes and numerous subfamilies (Figure 1) [2, 3], and continue to play a key dynamic role in mammalian evolution. As highly versatile membrane sensors, GPCRs respond to a great variety of extracellular signals, ranging from photons, ions and sensory stimuli, to lipids, neurotransmitters and hormones, converting them into cellular responses via G-proteins, β-arrestins and other downstream effectors. Signaling through GPCRs controls major biological and pathological processes in neural, cardiovascular, immune and endocrine systems, as well as in cancer, making GPCRs the most prominent therapeutic target family that mediates the action of more than 40% of clinically approved drugs.
Due to challenges in expression and crystallization, the GPCR family has remained the largest “terra incognita” of structural biology for decades, which hampered molecular interpretation of biophysical and biochemical findings, and rational drug discovery applications. The bovine rhodopsin structure was first solved in 2000, 15 years after the first membrane protein crystal structure, and another seven years of extensive research and technology developments were needed to obtain the high-resolution structure of the human β2-adrenergic receptor (β2AR), the first example of a GPCR with a diffusible ligand [11, 12]. That structure was followed by other Class A (rhodopsin-like) GPCRs, including β1AR, A2A adenosine (A2AAR), chemokine CXCR4 (CXCR4), dopamine D3 (D3R), and most recently histamine H1 (H1R) receptors (see Table 1). While corroborating the common 7TM GPCR fold, the structures provide the first insights into the scope of structural diversity in GPCRs at various levels of homology. Thus, the structures represent closely related GPCR subtypes (β1AR and β2AR with 65% sequence identity), different subfamilies within the aminergic family (β2AR, D3R and H1R with ~35% identity), different sub-branches within the same major α-branch (β2AR and A2AAR with ~30% identity), as well as different major α- and γ-branches of Class A GPCRs (β2AR and CXCR4 with 25% identity) (Figure 1). In this review, we highlight both common and diverse structural features of GPCRs and discuss how these crystallographic insights and initial comparative modeling efforts help to outline a path to comprehensive coverage of the structure and biology of the GPCR family. We will focus on inactive conformations of GPCRs in complex with antagonists and inverse agonists, while the impact of structure-function efforts on understanding GPCR activation mechanisms is discussed in several recent and upcoming reviews.
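The sequence identity values quoted above depend on how identity is counted over an alignment. The sketch below shows one common convention, assuming a pairwise alignment is already available and that only columns without gaps are compared; other denominators (for example, the length of the shorter sequence) give somewhat different numbers. The function name and the toy fragments are illustrative only and are not real receptor sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not taken from any receptor).
toy_a = "MKTLLV-AGWFC"
toy_b = "MRTLIVSAG-FC"
identity = percent_identity(toy_a, toy_b)  # 8 identical out of 10 ungapped columns
print(f"{identity:.1f}% identity")
```

Because the result depends on both the alignment and the chosen denominator, identity values from different sources are comparable only when the same convention is used.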
GPCRs are comprised of a bundle of seven transmembrane (7TM) α-helices, connected by three extracellular loops (ECL1-3) and three intracellular loops (ICL1-3) (Figure 2). The extracellular (EC) part, responsible for ligand binding, also includes an N-terminus, which can range from relatively short, often unstructured sequences in some of the rhodopsin-like and bitter taste receptors to large globular EC domains in other GPCR classes. The intracellular (IC) part interacts with G proteins, arrestins and other downstream effectors, and, in addition to ICLs, usually includes helix VIII and a C-terminus sequence that may carry palmitoylation and other signal sites [20, 21].
The 7TM helical bundle has been long recognized as the most conserved component of GPCRs. It shows characteristic hydrophobic patterns and harbors several functionally important signature motifs, such as the D[E]RY motif in helix III (part of the so-called “ionic lock”), the WxP motif in helix VI and the NPxxY motif in helix VII. Although the available crystal structures of rhodopsin-like GPCRs (see Table 1) confirm the overall structural conservation of the 7TM fold, they also reveal a remarkable structural diversity, not only in the loop regions, but also in the helical bundle itself. These variations are especially pronounced on the EC side of receptors, reflecting a distinct evolutionary and functional modularity between EC (ligand-binding) and IC (downstream signaling) modules of GPCRs (Figure 2).
One of the most striking features in the EC region revealed by the high-resolution GPCR structures involves the highly ordered conformations of their ECLs, stabilized by various secondary structure elements, disulfide bonds and interactions with the 7TM bundle. Thus, several structures of β2AR [11, 23, 24] reveal virtually the same conformation in each of their ECLs, despite being crystallized by different approaches, in different crystal packing orientations, and with different antagonists and inverse agonists bound; even in the activated agonist-bound structures [25, 26], the changes are rather limited. The ECL conformations are also very similar between two molecules within the asymmetric unit of the D3R crystal structure, between different complexes and different crystal forms of the CXCR4 structures, as well as between the ECL backbone structures of very closely related β2AR and β1AR subtypes that share 65% sequence identity. Importantly, the high conformational stability of ECLs in β2AR was recently validated by hydrogen-deuterium exchange (HDX) data, where the measured deuterium exchange ratios were consistent with the crystal structures [27, 28]. The only regions of potential high instability in ECLs that have been identified so far are in small distal portions of ECL2 in A2AAR and H1R, which were not resolved in the corresponding crystal structures [15, 18]. Although a more recent structure of a highly thermostabilized A2AAR observed about a 2-turn α-helix in this region, its stability and physiological relevance need further evaluation, as this short α-helix in ECL2 is sandwiched between two α-helices from C-termini of other crystallographic units in the particular crystal form.
One common structural feature of ECL2 is a conserved disulfide bridge connecting the loop with the tip of helix III. This bridge divides ECL2 into two regions, ECL2a and ECL2b, the latter serving as a covalent “linker” between helices III and V. The distinct structural elements, especially in the ECL2a region, show remarkable diversity between distinct GPCR subfamilies (Figure 3). Unlike in rhodopsin structures, in which the β-hairpin of ECL2 “seals” the retinal binding pocket, the ECL2s in other GPCR structures keep the pocket open and readily accessible for ligands. The structural diversity in ECL2 is evident even between related GPCRs within the aminergic subfamily; for example, the ECL2a in β2AR contains 2.5 turns of an α-helix, whereas the shorter ECL2a in D3R and H1R lacks any secondary structure. In contrast, A2AAR contains both a 1-turn α-helix in ECL2b and a short β-strand in ECL2a interacting with a β-strand in ECL1. Finally, in the more distant (by sequence homology) CXCR4, ECL2 forms a β-hairpin, which is positioned very differently than the β-hairpin in rhodopsin, and is critical for CXCR4 binding of either the small molecule IT1t, or the peptide antagonist CVX15 that mimics the V3 loop of the HIV spike protein gp120.
The ECL2b linker is tightly integrated with the 7TM bundle and, in many GPCRs, participates in binding ligands. Crystal structures show important consequences of the ECL2b length variations even in closely related GPCRs. Thus, although in β2AR, ECL2b has five residues and an almost fully extended backbone conformation, in A2AAR, it is seven residues long and comprises a 1-turn α-helix that accommodates the extra residues. By contrast, in D3R and other D2-like dopamine receptors, ECL2b has only four residues, which apparently pull the EC tips of helices III and V about 3.5 Å closer together than in β2AR. Intriguingly, the D1 and D5 dopamine receptors are more homologous to β2AR than to D2-like dopamine receptors, and, like β2AR, also have five residues in their ECL2b, suggesting that variations in the ECL2b between dopamine receptor subtypes can play a role in their different response profile to dopamine.
Although the structural diversity is much less pronounced in ECL1s and ECL3s, which are 5–6 and 6–8 amino acids long, respectively, in most solved GPCR structures, they still show several distinct features. For example, in A2AAR, ECL1 is dramatically shifted inward as compared to other known GPCR structures and forms a unique short β-strand with ECL2, shaping the entrance to the ligand binding pocket. In A2AAR, D3R and H1R, ECL3 is further stabilized by a disulfide bond, and in A2AAR and β2AR, the side chains of ECL3 also form salt bridges to ECL2, which apparently impact ligand binding. In CXCR4, the distinct shape of ECL3 is defined by helix VII, which is elongated by an additional 2.5 turns, and a disulfide bond that bridges ECL3 with the N-terminus; this disulfide crosslink is important for shaping the site of interaction with peptide antagonists, and potentially with its native SDF-1 ligand.
Apparently, this is only the tip of the GPCR family-wide repertoire of structural arrangements, secondary structure elements, and disulfide bonding patterns in the ECL region. Note, for example, that even the ECL2-helix III disulfide bond, the only common feature observed in the ECL2 of all GPCR crystal structures published to date, is not conserved in many Class A receptors that lack a cysteine in position 3.25 (e.g. the S1P receptor family). Additional structural features can also be expected for the N-terminus (Figure 3C), which so far has only been resolved in rhodopsin and partially in CXCR4. In CXCR4, the extended conformation of residues 28 to 38 of the N-terminus is stabilized by a disulfide bond to ECL3; the rest of this N-terminus is probably unstructured in the absence of its chemokine binding partner.
Precise 3D structural knowledge of the EC regions is of great value in GPCR drug discovery, as the N-terminus and/or ECLs play major roles in subtype selectivity for both peptide and small molecule ligands. This 3D knowledge is even more critical for understanding the binding and mechanisms of action for allosteric modulators and “bitopic” ligands of GPCRs [6, 31–33], as these promising new classes of therapeutic compounds often specifically target the loop regions of receptors.
As expected from the characteristic 7TM sequence pattern and early 3D modeling, crystal structures confirm the overall structural conservation of the 7TM helical bundle, with Cα root mean squared deviations (RMSD) of <3 Å in TM-helices between any pairs of GPCRs. At the same time, the structures reveal dramatic local variations between different GPCRs even within the 7TM helices; in particular, at the sites involving bulges, kinks and other deviations from canonical α-helices that are usually associated with prolines (Figure 4).
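The Cα RMSD used above as a similarity measure can be sketched as follows. This is a minimal illustration that assumes the two coordinate sets have already been superimposed (for example, by a least-squares fit performed elsewhere) and that atoms are paired one-to-one by a structural alignment; it does not perform the superposition itself. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation (in the input units, e.g. Å) between
    two equally long lists of paired (x, y, z) coordinates.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy Cα positions (hypothetical): the second set is the first shifted by 1 Å along z.
helix_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
helix_b = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(f"RMSD = {rmsd(helix_a, helix_b):.2f} Å")
```

Because RMSD averages over all paired atoms, a low global value can coexist with large local deviations such as the bulges and kinks described here.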
Some less obvious variations, such as those including helix IV of H1R (Figure 4C) and helix II of CXCR4, can be represented as an insertion or deletion of one residue in the backbone structure, accompanied by some local adjustments of a few neighboring residues. The corresponding sequence alignment should, therefore, incorporate such one-residue insertions/deletions. Comparative analysis of GPCR structures, however, has traditionally relied on gapless alignments of TM helices and specialized residue numbering, e.g. Ballesteros-Weinstein, which, in many cases, may not reflect a real structural alignment of the residues. Special care should be used to identify and predict such local deviations, which often dramatically affect functional sites in the ligand-binding pocket. Moreover, some other deviations, like the π-helical (i+5) structure of helix V in A2AAR (Figure 5D), or sharp bends of the EC tips of helices II and III in A2AAR, cannot be adequately described by sequence alignment. All these non-canonical features represent a significant challenge for molecular modeling [35, 36], emphasizing the role of high-resolution crystallography as an indispensable source of accurate 3D knowledge for GPCRs.
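The effect of a one-residue insertion on gapless numbering can be illustrated with a small sketch. In the Ballesteros-Weinstein scheme, the most conserved residue of each helix is labeled X.50 and the remaining residues are numbered by their sequence offset from it. The first function below applies that rule directly; the second is a hypothetical gap-aware variant, written only for this illustration and not a published convention, that skips residues flagged as structural insertions. All residue numbers are invented.

```python
def gapless_labels(helix, residues, reference):
    """Ballesteros-Weinstein style labels: the reference residue is helix.50,
    every other residue is offset by its sequence distance from the reference."""
    return {r: f"{helix}.{50 + r - reference}" for r in residues}


def gap_aware_labels(helix, residues, reference, inserted):
    """Hypothetical variant: residues listed in `inserted` get no label, and
    residues beyond them are numbered as if the insertion were absent."""
    labels = {}
    for r in residues:
        if r in inserted:
            labels[r] = None  # structural insertion, no equivalent position
            continue
        lo, hi = sorted((r, reference))
        skipped = sum(1 for i in inserted if lo < i < hi)
        offset = r - reference
        # Move the label one step back toward the reference per skipped residue.
        offset += skipped if offset < 0 else -skipped
        labels[r] = f"{helix}.{50 + offset}"
    return labels


# Toy helix II (hypothetical numbering): residue 84 is the X.50 reference,
# residue 87 is treated as a one-residue insertion (bulge).
residues = list(range(80, 91))
plain = gapless_labels(2, residues, 84)
aware = gap_aware_labels(2, residues, 84, inserted={87})
print(plain[88], aware[88])
```

In this toy case residue 88 is labeled 2.54 by the gapless rule but 2.53 once the insertion is taken into account, which is the kind of one-position mismatch that can misassign residues lining the binding pocket.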
It is important to note, in the structures solved to date, that structural variations are especially pronounced in the EC-TM half of the helical bundle-module, as reflected in up to 7 Å shifts in the EC helical tips. On average, the protein backbone RMSD in the EC-TM region between different GPCR pairs is almost two times larger than the RMSD value for the IC-TM module (Figure 4A,B). Such a contrast between ligand binding and downstream signaling modules in their structural variability is also evident from a comparison on the individual residue level. For example, only 6% of residues are exactly conserved in the EC-TM region across all published GPCR structures, as compared to 26% of residues in the IC-TM region.
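A residue-level comparison of this kind can be sketched as the percentage of alignment columns in which every receptor carries the same amino acid, evaluated separately for the columns assigned to each module. The sketch assumes a multiple sequence alignment and a column-to-module assignment are already available; the function name, the toy alignment and the column split are hypothetical and do not reproduce the percentages reported above.

```python
def percent_conserved(alignment, columns):
    """Percentage of the given alignment columns in which all sequences
    share the same residue (columns containing a gap are not counted as conserved)."""
    columns = list(columns)
    if not columns:
        raise ValueError("no columns given")
    conserved = 0
    for col in columns:
        residues = {seq[col] for seq in alignment}
        if len(residues) == 1 and "-" not in residues:
            conserved += 1
    return 100.0 * conserved / len(columns)


# Toy alignment of three hypothetical sequences, eight columns each.
toy_alignment = ["ALVMDRYL", "GLTFDRYI", "SLAWDRYL"]
ec_tm_columns = range(0, 4)  # assumed to belong to the EC-TM half
ic_tm_columns = range(4, 8)  # assumed to belong to the IC-TM half

ec_conserved = percent_conserved(toy_alignment, ec_tm_columns)
ic_conserved = percent_conserved(toy_alignment, ic_tm_columns)
print(f"EC-TM: {ec_conserved:.0f}%  IC-TM: {ic_conserved:.0f}%")
```

The outcome is sensitive to how the boundary between the two halves is drawn and to which receptors are included, so both choices need to be stated alongside any reported value.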
A highly diverse repertoire of structural features in the ligand-binding pockets of different GPCR subfamilies apparently reflects evolutionary pressure to selectively recognize ligands that vary greatly in shapes, sizes and electrostatic properties. Even within the preserved overall 7TM bundle architecture, there is a remarkable diversity of the binding pocket shapes and features between GPCR subfamilies (Figure 5), manifested in both side-chain diversity and variations in backbone conformations of TM helices and ECLs. Thus, unlike rhodopsin that contains a tightly enclosed hydrophobic pocket, GPCRs that bind diffusible ligands have more open pockets, accessible from the EC side. In the β2AR and other aminergic receptors, the core binding site sits rather deep in the 7TM bundle, and many high affinity ligands do not make significant contacts with the ECL residues. In contrast, the A2AAR binding site is located much closer to the loops and the high affinity binding requires ligand interaction with at least one ECL2 side chain, Phe168. In the chemokine-binding CXCR4, the ligand pocket is much larger and shallower than in other solved GPCR structures, and binding of its peptide antagonists involves extensive interactions with ECL2, which probably mimic interactions with a native SDF-1 ligand and the V3 loop of the HIV gp120 protein.
In contrast, GPCR subtypes that bind the same endogenous ligand are expected to have rather high levels of binding pocket conservation. Among them, the crystal structures of β2AR and β1AR represent extreme cases, with 100% conserved contact residues and RMSD <1.0 Å in these side chains (compared to heavy-atom RMSD ~0.9 Å in the same side chains between β2AR crystal structures with different ligands). More typically, about 50–60% of residues are identical between pockets in GPCR subtypes binding the same ligand. Conformational modeling studies supported by ligand binding data, for example for adenosine receptor subtypes [37, 38], suggest a high level of structural conservation between such subtypes, allowing accurate prediction of the pockets from one or more representative structures in the subfamily. As crystallization of each GPCR subtype may be impractical, such “close-range” modeling can be useful to fill the remaining gaps in structural knowledge to better understand ligand subtype selectivity, which is essential for rational design of tool compounds and safer drug candidates for GPCR targets.
Another important feature of GPCRs revealed by analysis of multiple crystal structures is the relative rigidity of their binding pockets. Despite the overall conformational flexibility of GPCRs, crystal structures show that conformational rearrangements (ligand induced fit) in the binding pockets are usually rather limited within their inactive states. For example, in several crystal structures of β2AR with different antagonists and inverse agonists [11, 23, 24], RMSD of the common binding pocket’s side chains does not exceed 1.0 Å. Of course, binding of some bulky conformationally selective ligands, e.g. the 16-residue cyclic peptide antagonist CVX15, can induce some substantial deviations in the side chains and helices (compare Figure 5D and E); however, such deviations are usually minor for small drug-like compounds. This result has a major implication for the applicability of GPCR crystal structures to rational drug discovery [39, 40], suggesting that one (or very few) conformations of the receptor can effectively explain binding of the majority of drug-like compounds. Applicability of the high-resolution GPCR structures to drug discovery has been further corroborated by retrospective Virtual Ligand Screening (VLS) (e.g. [41, 42]) and exceptionally high hit rates obtained in prospective VLS-based studies that have already identified a number of novel high-affinity antagonists for β2AR [43, 44] and A2AAR [45, 46].
Structural characterization of agonist-bound active-state GPCRs is much more challenging due to their reduced stability, though both crystallography [26, 29, 47–51] and conformational modeling [52, 53] are starting to provide the first insights into binding with this class of GPCR ligands, as reviewed in [19, 40, 54].
In contrast to the high structural diversity observed in the EC module, crystal structures of inactive GPCRs reveal high structural conservation (RMSD ~1.5 Å) in the IC module, which is involved in binding to G proteins and arrestins and downstream signal transduction through mechanisms that are believed to be similar across GPCRs [55, 56]. The IC-TM region also harbors several functionally important features conserved in the Class A subfamily, including the D[E]RY motif in helix III and Glu6.30, which form the so-called “ionic lock” in some inactive GPCR structures, as well as the NPxxY motif in helix VII. GPCR structures reveal that a high level of structural conservation also extends to the short ICL1 and helix VIII. The 6-residue ICL1 has a similar backbone conformation and a conserved leucine side chain inserted back into the TM bundle in all known GPCR structures. Helix VIII is a 3–4 turn α-helix that runs parallel to the membrane and is characterized by a common [F(RK)xx(FL)xxx(LF)] amphiphilic motif. In many GPCRs, helix VIII is also anchored to the membrane by palmitoylation and is essential for receptor expression and function. Among the known GPCR structures, CXCR4 is the only exception in which helix VIII shows a disordered behavior, probably because one of the phenylalanines in the amphiphilic motif is missing, although this does not preclude the possibility of its formation in CXCR4 in cell membranes under certain conditions.
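The sequence motifs named above can be located in a receptor sequence with simple pattern matching. The sketch below assumes one particular reading of the notation: D[E]RY as Asp or Glu followed by Arg-Tyr, x as any residue, and a parenthesized pair such as (RK) as either of the two residues. The regular expressions, the function name and the toy sequence are illustrative assumptions, and a sequence match alone says nothing about whether the motif sits in the expected helix.

```python
import re

# Regular expressions under the assumed reading of the motif notation.
MOTIFS = {
    "DRY": re.compile(r"[DE]RY"),                     # D[E]RY in helix III
    "NPxxY": re.compile(r"NP..Y"),                    # NPxxY in helix VII
    "helix_VIII": re.compile(r"F[RK]..[FL]...[LF]"),  # F(RK)xx(FL)xxx(LF)
}


def find_motifs(sequence):
    """Return, for each motif, the 0-based start positions of its
    non-overlapping matches in the sequence."""
    return {
        name: [match.start() for match in pattern.finditer(sequence)]
        for name, pattern in MOTIFS.items()
    }


# Toy sequence (hypothetical, built only to contain one copy of each motif).
toy_sequence = "AAADRYLAAANPLIYAAAFRKNFAAAL"
hits = find_motifs(toy_sequence)
print(hits)
```

A receptor in which one of the aromatic positions of the helix VIII pattern is substituted would simply return an empty list for that motif, which is one quick way to flag candidates for the kind of disorder noted for CXCR4.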
At the same time, the IC region undergoes large conformational changes as a part of the activation mechanism in each GPCR [26, 29, 47–51]; therefore, one can expect much higher conformational flexibility and/or instability of some of the ICLs. Indeed, dramatic variations in ICL2 structure (9–12 residues long) have been observed within the same receptor, and even within the same crystal form. Thus, in the D3R structure, crystallized with two receptors per asymmetric unit, molecule A has a well-resolved 2.5-turn α-helix in ICL2 that runs parallel to the membrane, while in molecule B this loop is disordered and no electron density is observed. Similarly, both extended and α-helical conformations have been observed for ICL2s of β2AR [11, 12] and β1AR [14, 60], which have almost the same sequence. These crystallographic, as well as HDX observations, suggest that ICL2 can be structurally ambivalent and its conformational state may depend on the functional state of the receptor and its interactions with G proteins.
Unlike other IC regions, ICL3 has a highly variable length, ranging from as short as five residues in CXCR4, up to hundreds of residues in some other receptors. Variations in length and sequence motifs in ICL3 are high even in closely related subtypes, as ICL3 is believed to control receptor selectivity to different G proteins. ICL3 regions in crystal structures are usually disordered or replaced by a stabilizing fusion protein. At the same time, there are indications that in some GPCRs the ICL3 has a propensity for forming α-helical structures, which elongate helices V and VI by at least 2–3 turns [50, 61]. In β1AR, two alternative conformations of this region have been recently resolved in a crystal structure of the inactive receptor: one of them includes a bent helix VI allowing formation of the ionic lock, whereas another with straight helix VI lacks the ionic lock interaction. By contrast, the high proteolytic susceptibility of ICL3, also supported by recent HDX data for β2AR [27, 28], suggests that such secondary structure elements in ICL3 are rather unstable and/or flexible, at least in the absence of a binding partner.
Further structural and biophysical studies, especially in GPCR complexes with G-proteins and arrestins, will be needed to understand the structure and dynamics of the IC region and the nature of its multifaceted pattern of selectivity to downstream effector molecules. While it is not directly involved in binding of orthosteric ligands or drugs, the IC region is critical for the functional selectivity of drug candidates and is also considered a potential alternative target for allosteric drugs.
GPCR function involves specific interactions with numerous binding partners, which include not only signaling ligands and downstream effectors, such as G proteins and arrestins, but also other regulatory proteins [63, 64], lipids and sterols [59, 65] and ions. Moreover, allosteric sites for binding small molecule modulators have been identified in many GPCRs. Such sites can be located in direct proximity of the ligand binding pocket, in the 7TM helical bundle for Class B and C GPCRs, or even in the IC part of the receptor [62, 71]. Molecules targeting regulatory and other allosteric sites are important for understanding GPCR biological pathways, and also may be highly beneficial in therapeutic applications as they can modulate specific parameters of the native ligand signaling without completely hijacking or blocking receptor activity.
Although the number of studies characterizing these types of interactions biochemically and functionally has been growing, understanding the molecular basis of allosteric binding in GPCRs remained limited until recently owing to the lack of corresponding structural information. Moreover, allosteric sites are usually located in less-conserved regions of the protein, as opposed to orthosteric sites, and, therefore, are less amenable for homology modeling. The recent availability of GPCR crystal structures has provided a detailed atomic structural framework that can greatly assist biochemical methods traditionally used for identification and analysis of allosteric sites. Crystal structures can also directly suggest novel allosteric sites with unusual properties and selectivity (Figure 6). For example, a specific cholesterol binding site has been repeatedly observed in β2AR [11, 23], with cholesterol modulating receptor thermostability and affinity to inverse agonists [23, 73]. Moreover, the allosteric effect of cholesterol and some of its close analogs, found to be critical for full activation of the oxytocin receptor, is likely mediated by the same and possibly other allosteric sites. Binding of a phosphate ion in the EC subpocket of the H1R structure was also found to allosterically and specifically modulate binding of different zwitterion antihistamines. Some other “druggable” non-orthosteric pockets or subpockets have been described in crystal structures, for example a selectivity subpocket in D3R, which can be further explored as a putative target for allosteric and/or bitopic ligands.
Another important type of interaction is homo- and hetero-dimerization of GPCRs. These interactions can modulate the signaling properties of the receptors and mediate crosstalk between GPCR pathways. Even for the most studied dimers, however, unambiguous identification of the functionally relevant interfaces has proved to be very difficult, partially due to a transient mode of the interactions and technical challenges of separating specific and non-specific binding in a crowded membrane environment. In crystallization studies, non-specific dimerization of a GPCR can be detrimental for obtaining crystals, as it introduces heterogeneity in samples, and for that reason it was largely avoided. Therefore, in most crystal structures so far, the GPCR molecules have been found to pack in non-functional (e.g. antiparallel) orientations. An important exception is represented by the recent crystal structures of CXCR4, which have revealed a consistent parallel dimer arrangement. The dimer interface is virtually identical in five different crystal packing forms of CXCR4 complexes with both peptide and small molecule antagonists, suggesting its functional relevance. It is likely that future crystallization efforts, also specifically optimizing dimerization conditions, will bring about more structures of GPCR functional dimers, including heterodimers.
Due to the rapid progress in GPCR crystallography, in a mere four years we have come from reliance on rhodopsin as the only representative structure for the GPCR superfamily to high resolution structural characterization of seven distinct GPCRs. Three of them [16–18] have been resolved just within the last year. The determined crystal structures represent two (α and γ) out of four major branches in Class A GPCRs (Figure 1) and exemplify very different levels of sequence and structural homology between them. The structures allow major insights into the rich structural complexity of GPCRs, making it possible to let go of the natural simplifications employed for analysis in the long period of the “structure drought”.
At the same time, the new structures can be considered as the first steps on the way to comprehensive coverage of the superfamily, which defines the future strategy towards this major goal. Selection of targets for the first phase of the NIH NIGMS PSI:Biology center GPCR Network was based on the assumption that an optimum use of limited resources can be achieved by crystallization of one to three representative receptors in each GPCR subfamily, also taking into account the level of homology and the number of GPCRs in the subfamily. Although still limited, the initial sampling provided by the solved GPCR crystal structures supports this strategy. The structures show remarkable structural variations between GPCR subfamilies, especially in the ligand binding region, which cannot be reliably and accurately predicted by current modeling techniques. On the other hand, genetically and functionally related subtypes within GPCR subfamilies are also close structurally. The crystal structures, therefore, may provide a solid structural framework for comparative analysis of some other subtypes with close structural homology, making it possible to fill (at least temporarily) the gaps in the structural knowledge and understanding of the 3D basis of ligand selectivity.
Another dimension in deciphering GPCR biology is provided by the structural characterization of representative receptors bound to different ligands. Multiple inactive structures with antagonists suggest a generally low level of induced fit in GPCRs, making possible accurate prediction of binding conformations for most ligands. However, binding of certain conformationally selective compounds, including both antagonists and agonists, can reveal a number of new interesting features including characterization of new functionally selective states. These states are of great interest to GPCR biology and drug discovery, and should be actively pursued by structural studies. Finally, although crystallography has recently achieved a first milestone in characterizing activated states of adrenergic and adenosine receptors, we are still at the very beginning of the quest for understanding the nature of GPCR function. Further structural, biophysical and computational studies will be required for deciphering ligand-dependent activation triggers in other receptors, as well as the structural basis of GPCR selective interactions with G-proteins, arrestins and other modulators.
This work was supported in part by the NIH Roadmap grant P50 GM073197 and NIH PSI:Biology grant U54 GM094618. We are very grateful to the accumulation of more than 20 years of structure determination efforts on GPCR structural biology and technology by a number of different groups worldwide, including our own lab. We are particularly thankful to our biochemistry/chemistry collaborators in the different receptor systems including Ad IJzerman and Ken Jacobsen (A2A adenosine), Brian Kobilka (β2 adrenergic), Tracy Handel and Alexei Brooun (chemokine CXCR4), Jonathan Javitch and Amy Newman (dopamine D3), and So Iwata and Tatsuro Shimamura (histamine H1). We thank K. Kadyshevskaya for assistance with figure preparation and A. Walker for assistance with manuscript preparation.