Reactome is an open-source, freely available database of human biological pathways and processes. A major goal of our work is to provide an integrated view of cellular signalling processes that spans from ligand–receptor interactions to molecular readouts at the level of metabolic and transcriptional events. To this end, we have built the first catalogue of all human G protein-coupled receptors (GPCRs) known to bind endogenous or natural ligands. The UniProt database has records for 797 proteins classified as GPCRs and sorted into families A/1, B/2 and C/3 on the basis of amino acid sequence. To these records we have added details from the IUPHAR database and our own manual curation of relevant literature to create reactions in which 563 GPCRs bind ligands and also interact with specific G-proteins to initiate signalling cascades. We believe the remaining 234 GPCRs are true orphans. The Reactome GPCR pathway can be viewed as a detailed interactive diagram and can be exported in many forms. It provides a template for the orthology-based inference of GPCR reactions for diverse model organism species, and can be overlaid with protein–protein interaction and gene expression datasets to facilitate overrepresentation studies and other forms of pathway analysis.
Database URL: http://www.reactome.org
G protein-coupled receptors (GPCRs), also known as 7-transmembrane (7TM) domain receptors, comprise the largest and most diverse gene super-family in humans, accounting for >1% of the total protein-coding human genome. Estimates of the exact number of GPCR genes vary, but a recent phylogenetic analysis identified over 800 (1). Of these, 701 were classified within the rhodopsin family (type A), including 241 non-olfactory receptors. Many protein-coding genes are alternatively spliced, giving rise to isoforms, so the true number of functionally unique receptors may be much higher than estimates based on gene numbers.
These GPCRs sense extracellular molecules and, through their interaction with G proteins, activate downstream signal transduction pathways. GPCRs respond to a huge range of stimuli, including light, odours, hormones, neurotransmitters and peptides (2). GPCRs represent around half of cell surface drug targets (3) and are a very successful therapeutic target family for the pharmaceutical industry, accounting for the majority of best-selling drugs and ~30% of all prescription pharmaceuticals on the market (4). The potential for further exploitation remains high, as only 10% of GPCRs are targeted by these marketed drugs (5).
Reactome is a free, open-source pathways database. Information in Reactome is captured by expert curators and peer-reviewed by experts in their fields of biology. The data is extensively cross-referenced to databases such as Ensembl [http://www.ensembl.org/index.html (6)], GO [http://www.ebi.ac.uk/QuickGO/ (7)], PubMed (http://www.ncbi.nlm.nih.gov/pubmed), ChEBI [http://www.ebi.ac.uk/chebi/index.jsp (8)], UniProt [http://www.uniprot.org/ (9)] and OMIM [http://www.ncbi.nlm.nih.gov/omim (10)]. Reactions for other species are inferred by orthology from curated human ones. Reactions can be viewed in the context of their pathways and interaction data can be overlaid to further expand the data richness. Tools are available in Reactome to help users with analyses such as pathway over-representation (enrichment) and pathway differential expression, and data including tables of pairwise protein–protein interactions computed from manually curated reactions and complexes can be downloaded in a range of formats.
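As an illustration of what a pathway over-representation (enrichment) analysis computes, the following is a minimal sketch based on the hypergeometric test, a common choice for this kind of analysis. The text does not state which statistical test Reactome's tools use, so this should be read as an assumption; the function name and all numbers in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, sample_size, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population   -- total number of genes considered (assumed background)
    pathway_size -- number of those genes annotated to the pathway
    sample_size  -- number of genes in the user's submitted list
    hits         -- number of submitted genes that fall in the pathway
    """
    upper = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical example: 15 of 200 submitted genes fall in a 563-member
# pathway, against an assumed background of 20000 genes.
p_value = hypergeom_enrichment_p(
    population=20000, pathway_size=563, sample_size=200, hits=15
)
print(f"p = {p_value:.3g}")
```

A small p-value suggests that the pathway is represented in the submitted list more often than expected by chance; in practice a multiple-testing correction would normally be applied across all pathways tested.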
Several resources hold rich data for GPCRs. UniProt is a comprehensive knowledgebase of protein sequence and functional information. IUPHAR-db (International Union of basic and clinical PHARmacology, http://www.iuphar-db.org/) is a database of receptor nomenclature and drug classification. Its GPCR section is arranged according to the sequence homology and functional similarity of these receptors. It also contains lists of orphan GPCRs. These resources were used as the starting point for cataloguing GPCRs in Reactome.
In UniProt, a query was constructed to search for all manually annotated and reviewed human GPCRs.
Information in Reactome is annotated by database curators. These in-house experts systematically reviewed the literature for the three GPCR families. GPCRs whose ligands were identified from published experimental data were captured via the Curator Tool, an interface which allows the curator to annotate and structure data in accord with Reactome’s frame-based data model, and commit the results to a central repository (11). Data was organized into the three main GPCR families, A/1, B/2 and C/3. Within each family, details were further structured based on the type of ligand. Attributes of a reaction captured by Reactome are:
Input and output entities can be composed of proteins, simple chemicals or combinations of these entities (complexes).
Useful information captured from IUPHAR-db by the curation team included:
As of October 2009, the IUPHAR database contained 356 GPCRs. The database also contains lists of orphans: proteins classified as members of the GPCR family on the basis of sequence similarity, but whose endogenous ligands are unknown. Reactome curators investigated these orphans to determine whether recent advances had assigned ligands to any of them.
From UniProt, we retrieved records for the three main families of human GPCRs with the query:
family: ‘G-protein coupled receptor’ and organism:human and reviewed: yes
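The query above is a conjunction of field:value terms. A minimal sketch of assembling such a string programmatically is shown here. The field names are copied from the query in the text; the helper `build_query` is hypothetical, and the query syntax currently accepted by UniProt may differ from the one used at the time of this work, so no endpoint is assumed.

```python
from urllib.parse import quote


def build_query(terms):
    """Join (field, value) pairs into a 'field:value AND ...' query string."""
    parts = []
    for field, value in terms:
        # Quote values that contain spaces so they are treated as one phrase.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{field}:{value}")
    return " AND ".join(parts)


# Terms taken from the query in the text.
terms = [
    ("family", "G-protein coupled receptor"),
    ("organism", "human"),
    ("reviewed", "yes"),
]

query = build_query(terms)
encoded = quote(query)  # percent-encoded form, suitable for a URL parameter

print(query)
print(encoded)
```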
The query resulted in 836 protein matches. Of these, 797 proteins matched the three main families (A/1, B/2 and C/3). We then queried IUPHAR-db and searched published literature to identify ligands for these proteins, with the following results:
Class A/1: 726 UniProt records; ligands found for 519
Class B/2: 49 UniProt records; ligands found for 29
Class C/3: 22 UniProt records; ligands found for 15
Of 797 GPCRs in families A/1–C/3 screened from UniProt, we were able to catalogue 563 GPCRs that have ligands (70%), supported by information from IUPHAR and appropriate literature references.
We believe the remaining 234 to be true orphans, i.e. receptors for which no credible endogenous ligand has been determined.
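The totals follow directly from the per-class counts. The short sketch below, using only the figures reported in the text, recomputes the overall number of ligand-binding receptors, the number of orphans and the ligand-binding fraction (about 70.6%, reported in the text as 70%).

```python
# Counts as reported in the text: (UniProt records, receptors with ligands).
families = {
    "A/1": (726, 519),
    "B/2": (49, 29),
    "C/3": (22, 15),
}

total_records = sum(records for records, _ in families.values())
total_with_ligand = sum(with_ligand for _, with_ligand in families.values())
orphans = total_records - total_with_ligand
fraction = total_with_ligand / total_records

for name, (records, with_ligand) in families.items():
    print(
        f"Class {name}: {with_ligand}/{records} with ligands, "
        f"{records - with_ligand} orphans"
    )
print(
    f"Total: {total_with_ligand}/{total_records} ({fraction:.1%}), "
    f"{orphans} orphans"
)
```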
The project can be viewed here:
An on-line description of the features of these web pages and of additional pathway analysis tools that can be applied to the data is available here:
For each class, there are further subdivisions of the hierarchy, organized into ligand types that bind particular GPCRs.
First and foremost, this addition to Reactome provides a computationally accessible resource for information about ligand-binding GPCRs. The three main families in human are annotated, together with downstream signalling events mediated by coupling to the appropriate G protein. Each receptor protein record has multiple link-outs to key databases related to sequence, genetic disorders, ontology and literature, further enriching the information a user can view. These annotations of GPCRs by protein family complement the extensive annotation by ligand specificity previously compiled by Alexander and colleagues (12).
A total of 563 ligand-binding GPCRs were identified and included in Reactome; an additional 234 with no identifiable ligand were not. Notably, we included a set of GPCRs thought to function as olfactory receptors. In many cases, these GPCRs have been identified and classified based on their interaction when expressed in a model cultured cell with members of a small set of standard test odorant molecules. These studies are generally accepted as establishing the olfactory receptor function of these GPCRs, albeit without identifying the odorant molecule(s) with which they interact under physiological conditions (13).
Although the absence of any identified ligand presents problems for the pharmaceutical industry and for researchers wishing to study a receptor using a tool agonist, orphan receptors can be of interest when linked to a particular subcellular location and/or physiological process. For instance, the MRGX family of receptors, which is expressed predominantly in dorsal root ganglia, has been extensively studied because of its narrow and therapeutically interesting expression profile (14). The pathway contexts provided by Reactome annotation offer an additional functional grouping that may be useful in generating testable hypotheses about the roles of orphan GPCRs.
Orphan GPCRs have been the subject of intensive research including ligand screening by pharmaceutical companies for many years (15,16), so why do so many GPCRs have no identified ligand? There are several reasons why the endogenous ligands are still undetermined for some orphan GPCRs:
Overlaying protein–protein interaction data (e.g. from IntAct) on the curated Reactome GPCR dataset may provide a powerful approach for identifying candidate heterodimer partners and their potential functions, and thus a novel tool for the study of orphan receptors (27). Overlaying protein–small molecule data from resources such as PubChem, ChEMBL or proprietary sources may enable the identification of cofactors or modulators and could identify novel lead compounds.
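The overlay idea can be sketched as a simple set operation: given a list of pairwise interactions and the set of catalogued GPCRs, collect the GPCRs reported to interact with each orphan. This is a minimal illustration only; the identifiers are hypothetical placeholders, the input is a plain list of pairs rather than any actual IntAct or Reactome export format, and `candidate_partners` is not a Reactome function.

```python
def candidate_partners(interactions, gpcrs, orphans):
    """Map each orphan GPCR to the set of GPCRs it is reported to interact with."""
    partners = {}
    for a, b in interactions:
        # Interactions are undirected, so check both orientations of the pair.
        for orphan, other in ((a, b), (b, a)):
            if orphan in orphans and other in gpcrs and other != orphan:
                partners.setdefault(orphan, set()).add(other)
    return partners


# Hypothetical identifiers, for illustration only.
gpcrs = {"GPCR_A", "GPCR_B", "ORPHAN_1", "ORPHAN_2"}
orphans = {"ORPHAN_1", "ORPHAN_2"}
interactions = [
    ("ORPHAN_1", "GPCR_A"),
    ("GPCR_B", "ORPHAN_1"),
    ("ORPHAN_2", "KINASE_X"),  # partner is not a GPCR, so it is ignored
    ("GPCR_A", "GPCR_B"),      # no orphan involved, so it is ignored
]

result = candidate_partners(interactions, gpcrs, orphans)
for orphan, found in sorted(result.items()):
    print(orphan, "->", sorted(found))
```

Each reported pair is only a candidate: an interaction record does not by itself establish heterodimerization, so the output would serve as a list of hypotheses to test experimentally.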
Reactome contains several tools for the analysis of large-scale data sets that the user can submit to the resource. Results of such analyses are exportable in many formats from PNG images to systems biology data standards such as BioPAX and SBML. Some key features of the data in Reactome are:
Features accessed from the pathway diagrams page (Entity Level Views or ELVs)
National Human Genome Research Institute at the National Institutes of Health (P41 HG003751); European Union 6th Framework Programme ‘ENFIN’ (LSHG-CT-2005-518254). Funding for open access charge: National Institutes of Health (P41 HG003751).
Conflict of interest. None declared.
Development of the Reactome data model and database is a collaborative project and this work benefited greatly from our interactions with David Croft, Phani Garapati, Marc Gillespie, Gopal Gopinath, Robin Haw, Lisa Matthews, Bruce May, Gavin O’Kelly, Esther Schmidt, and Guanming Wu, and with our colleagues at GO, ChEBI, UniProt and IntAct. The authors thank Joël Bockaert and Leslie Vosshall for their expert reviews of our GPCR annotations. We also thank three anonymous reviewers for their useful comments on an earlier version of this article.