We performed a comprehensive search for non-olfactory GPCR genes in the dog genome. An initial dataset was produced from BLASTN searches of the GenBank non-redundant database; it contained 325 full-length GPCRs and 5 pseudogenes. Around 13% of these required manual curation because their exon composition was incorrect. TBLASTN and BLAT searches of the dog genome assembly completed the analysis. In total, 353 full-length sequences, 18 incomplete sequences and 13 pseudogenes were retrieved. A full-length dog GPCR gene is defined here as one that contains an intact transmembrane domain. The incomplete GPCR gene sequences are missing exons, or parts thereof, because they reside in genomic regions that have not been sequenced. Whole GPCR genes may also be missing from the dog genome assembly, and such genes are very difficult to distinguish from genes that do not exist in this species unless the specific genomic region is carefully analysed. The gene sequences of MAS1, NPY2R, GPR52 and GPR37L1 contained frameshifts and/or stop codons in the Broad Institute genome assembly (from the boxer). However, in a second BLAST search of these sequences against the TIGR poodle assembly [28], all 4 genes were found to be intact and full-length. This may reflect either sequencing issues or real differences between breeds.
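The distinction drawn above between full-length, incomplete and pseudogene sequences can be illustrated with a minimal sketch. This is not the curation procedure used in this study: the function name `classify_cds` and its rules are hypothetical simplifications. It treats unsequenced bases (`N`) as a sign of an incomplete sequence, and a broken reading frame or premature stop codon as a sign of a pseudogene. It does not check for an intact transmembrane domain, which is the criterion actually used here for full-length genes.

```python
# Hypothetical helper, not the authors' pipeline: a simplified reading-frame
# check on a predicted coding sequence (start codon through terminal stop).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_cds(cds: str) -> str:
    """Return a coarse label for a coding sequence given as a DNA string."""
    cds = cds.upper()
    if "N" in cds:
        # Unsequenced bases: exons or parts of exons are unavailable.
        return "incomplete"
    if len(cds) % 3 != 0:
        # A length that is not a multiple of three implies a frameshift.
        return "pseudogene candidate (frameshift)"
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # A stop codon anywhere before the final codon truncates the protein.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        return "pseudogene candidate (premature stop)"
    return "full-length"
```

A frameshift or stop codon flagged this way in one assembly may still be a sequencing artefact, which is why comparison against a second assembly, as done for MAS1, NPY2R, GPR52 and GPR37L1, is informative.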
The dog GPCR gene sequences were divided into families in line with the GRAFS classification (Glutamate, Rhodopsin, Adhesion, Frizzled, Secretin, Taste2). Eighteen genes have no sequence similarity to any GPCR family and were treated as a separate group called Other GPCRs, according to our previous classification of the rat GPCRs [30], the only difference being that GPR149 was here moved to the Rhodopsin family. The numbers of genes in each GPCR family, including previously published sensory GPCRs for human, dog, mouse and rat, are presented in Table . A complete table of all GRAFS GPCR genes in dog, human, rat and mouse is presented in Additional file 1. The amino acid sequences of all dog GPCRs obtained in this study are included in Additional file 2.
The number of GPCR genes in human, dog, mouse and rat.
We performed phylogenetic analyses of all dog and human GRAFS GPCR protein sequences and identified orthologs and species-specific genes. The latter represent paralogous genes that have arisen or been lost specifically in either human or dog, or in the lineages leading to them. Consensus trees of 100 Maximum Parsimony phylogenetic trees and the average amino acid sequence identities of receptor orthologs are presented in Figure 1 (Rhodopsin family) and Figure 2 (Glutamate, Adhesion, Frizzled and Secretin families). Dog genes missing in human are listed in Table , whereas human GPCR genes not found in the dog and/or rodent genomes are listed in Table .
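To illustrate how a consensus can be summarised from a set of trees, the sketch below counts how often each clade occurs across replicate trees and keeps those found in more than half of them. It assumes a majority-rule consensus on rooted trees; the consensus method used for the 100 Maximum Parsimony trees is not specified in this section, and the tree encoding (nested tuples) and gene labels are hypothetical.

```python
from collections import Counter


def clades(tree):
    """Return (leaf set, set of clades) for a rooted tree of nested tuples."""
    if isinstance(tree, str):
        # A leaf contributes its own name but defines no clade.
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    # Every internal node defines one clade: the set of leaves below it.
    found.add(leaves)
    return leaves, found


def majority_clades(trees, threshold=0.5):
    """Map each clade seen in more than `threshold` of the trees to its frequency."""
    counts = Counter()
    for tree in trees:
        _, found = clades(tree)
        counts.update(found)
    total = len(trees)
    return {c: n / total for c, n in counts.items() if n / total > threshold}


# Toy input: three replicate trees over hypothetical human (hs) and dog (cf) genes.
toy_trees = [
    (("hs_A", "cf_A"), ("hs_B", "cf_B")),
    (("hs_A", "cf_A"), ("hs_B", "cf_B")),
    (("hs_A", "hs_B"), ("cf_A", "cf_B")),
]
support = majority_clades(toy_trees)
```

In this toy case the two ortholog pairs are each grouped together in two of three trees and are retained, whereas the species-wise grouping seen in one tree is dropped.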
Figure 1. Consensus tree of the human (hs) and dog (cf) Rhodopsin family based on 100 Maximum Parsimony phylogenetic trees. The sequence alignment used for the phylogenetic calculation was based on the transmembrane segments. A pie-chart displays the average pairwise ...
Figure 2. Consensus trees of the human (hs) and dog (cf) Adhesion, Frizzled, Glutamate and Secretin GPCR families. Each tree is based on 100 Maximum Parsimony trees. The sequence alignments used for phylogenetic calculations were based on the transmembrane segments.
Table showing the dog GPCR genes that are missing or pseudogenes in human.
Human GPCR genes that are missing (not found in genome assemblies) or are pseudogenes in dog and/or rodents.
We identified 267 Rhodopsin GPCR genes in dog, compared with 284 in human (Table and Additional file 1). The average protein sequence identity between dog and human one-to-one orthologs is 86%, which is higher than the identity observed between either of these species and its mouse orthologs. For ease of discussion we present the Rhodopsin family of GPCRs according to their broad phylogenetic grouping [16] (see Additional file 1).
The Rhodopsin α subfamily in dog is missing the receptors GPR148, Red opsin (OPN1LW), TAAR6, TAAR8 and TAAR9 (Table ). In the rodent genomes, three of these receptors (GPR148, OPN1LW and TAAR8) are absent, whereas two (TAAR6 and TAAR9) are present. In dog, GPR78 and TAAR1 are pseudogenes, while TAAR4 is a full-length, intact gene, in contrast to its human ortholog, which is a pseudogene. The gene sequences of dog ADRA1B, ADRA1D, ADRA2A, DRD4 and MTNR1B are incomplete.
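The presence/absence comparison for the α subfamily can be written out with simple set operations. The gene names are taken from the paragraph above; the variable names are illustrative, and the rodent set is restricted to the five human receptors discussed here rather than being a full list of genes absent from rodents.

```python
# Human Rhodopsin alpha-subfamily receptors not found in the dog assembly.
missing_in_dog = {"GPR148", "OPN1LW", "TAAR6", "TAAR8", "TAAR9"}

# Of those five, the receptors that are also absent from the rodent genomes.
missing_in_rodents = {"GPR148", "OPN1LW", "TAAR8"}

# Absent from both dog and rodents.
missing_in_both = missing_in_dog & missing_in_rodents

# Absent from dog but present in rodents.
missing_in_dog_only = missing_in_dog - missing_in_rodents

print(sorted(missing_in_both))      # ['GPR148', 'OPN1LW', 'TAAR8']
print(sorted(missing_in_dog_only))  # ['TAAR6', 'TAAR9']
```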
In the Rhodopsin β subfamily one new dog gene, TRHR3, was identified. TRHR3 is not present in humans or rodents and the receptor with the highest amino acid identity, 59%, is the Xenopus laevis thyrotropin-releasing hormone receptor 3 (TRHR3, GenBank accession: CAD12656). Two Rhodopsin β subfamily receptors, GPR75 and GPR150, are missing in dog, but present in human and rodents. The dog NPFFR1 and NPFFR2 gene sequences are incomplete.
In the Rhodopsin γ subfamily the dog lacks the genes for FPR1, FPRL2, GPR32, NPBWR2 and SSTR4, which are all present in human (Table ). GPR33, which is a pseudogene in human, is a full-length gene in both dog and rodents. In contrast, another gene, RXFP4, is a pseudogene in dog, but full-length in both human and rodents. The dog KISSR1 was found to have an incomplete sequence in the genome assembly.
In the Rhodopsin δ subfamily we identified one new dog member of the Mas-Related GPCR (MRG) cluster, MRGPR-like1. The dog assembly is missing GPR109B, GPR42 (FFAR1L), MAS1L, MRGPRE, MRGPRX1, MRGPRX3 and MRGPRX4, which are all present in the human genome. GPR79 is a full-length gene in both dog and rodents, but is a pseudogene in human. In contrast, P2RY4 is a full-length gene in human and rodents, but a pseudogene in dog. The dog MRGPRX2 gene sequence is incomplete.
One additional new dog Rhodopsin GPCR was identified, GPR141b. The most similar receptor, human GPR141, is an orphan GPCR. GPR135, which is also an orphan Rhodopsin GPCR, was not found in dog. GPR166P, which is a pseudogene in human, is a full-length gene in dog and appears to be functional. Two additional dog Rhodopsin GPCRs, DARC and GPR88, have only incomplete gene sequences.
Figure 2 displays consensus trees of 100 Maximum Parsimony phylogenetic trees of the Adhesion, Frizzled, Glutamate and Secretin families of GPCRs. All families are relatively well conserved in terms of sequence identity (in the order Frizzled > Glutamate > Secretin > Adhesion).
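As an illustration of what an average ortholog identity represents, the following sketch computes percent identity for aligned sequence pairs and averages it over a family. It assumes that identity is counted over alignment columns where neither sequence has a gap; the exact definition used for the values reported here is not stated in this section, and the function names and toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of identical residues over ungapped columns of an aligned pair."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def mean_identity(pairs) -> float:
    """Average percent identity over a list of aligned (dog, human) pairs."""
    values = [percent_identity(a, b) for a, b in pairs]
    return sum(values) / len(values)


# Toy aligned ortholog pairs (hypothetical sequences, not real receptors).
toy_pairs = [("MKTA", "MKSA"), ("MK-A", "MKTA")]
family_average = mean_identity(toy_pairs)  # (75.0 + 100.0) / 2 = 87.5
```

Ranking families by such an average gives an ordering of the kind quoted above, although the result depends on how gaps and alignment regions are handled.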
The Glutamate family is well conserved, with 22 orthologous receptor pairs and no species-specific genes in dog or human (Figure 2 and Additional file 1). The average protein sequence identity between dog and human orthologs is 89%, and is lower between either of these species and its mouse orthologs. The sequence of dog GRM3 is incomplete.
The Adhesion family displays unconventional orthology relationships between dog and human. All 33 human Adhesion GPCRs are present in the dog genome. Interestingly, the dog also has 5 additional full-length genes (EMR2b, EMR2c, EMR2d, EMR4b and EMR4c) and 1 pseudogene (GPR133b). These Adhesion GPCR genes seem to be specific to the dog lineage, as they have not been found in other mammals studied [8]. We performed a phylogenetic analysis of the 5 dog-specific EMR receptor sequences together with the dog, human, cow and opossum EMR1-EMR4 and CD97. The analysis was based on the transmembrane regions, and the resulting consensus tree is presented in Figure . The dog and human one-to-one Adhesion receptor orthologs have an average protein sequence identity of 83%, which is higher than the identity of either species to its mouse counterparts (Figure 2). GPR144, EMR2 and EMR3, which are full-length in human but pseudogenes in rodents, are full-length in dog and therefore appear to be functional. The gene sequences of BAI1, EMR2d, EMR4c, GPR123 and GPR124 are incomplete.
Consensus tree of the EGF-TM7 Adhesion family GPCRs derived from 100 Maximum Parsimony phylogenetic trees. The sequence alignment used for the phylogenetic calculation was based on the transmembrane segments.
The Frizzled family is well conserved between dog, mouse and human, with 11 orthologous receptor pairs and no species-specific genes in any of these species (Figure 2 and Additional file 1). A slight difference is observed for the rat Frizzled repertoire, in which FZD10 appears as a pseudogene. The average amino acid identity between dog and human Frizzled orthologs is 96.9%. The gene sequence of dog FZD8 is incomplete.
The Secretin family has the same 15 members in human, dog, mouse and rat, i.e. their repertoires are identical. The average protein sequence identity between dog and human Secretin family GPCR orthologs is 88.5% (Figure 2).
The group defined as Other GPCRs includes 18 dog genes. One gene in this group, GPR172A, which is present in human but missing in rodents, was found to be missing in the dog genome. Another gene, TMEM185B, which is a pseudogene in human but full-length in rodents, is a full-length gene in dog and appears to be functional.