Mammalian carboxylesterase (CES or Ces) genes encode enzymes that participate in xenobiotic, drug, and lipid metabolism in the body and are members of at least five gene families. Tandem duplications have added more genes for some families, particularly for mouse and rat genomes, which has caused confusion in naming rodent Ces genes. This article describes a new nomenclature system for human, mouse, and rat carboxylesterase genes that identifies homolog gene families and allocates a unique name for each gene. The guidelines of human, mouse, and rat gene nomenclature committees were followed and “CES” (human) and “Ces” (mouse and rat) root symbols were used followed by the family number (e.g., human CES1). Where multiple genes were identified for a family or where a clash occurred with an existing gene name, a letter was added (e.g., human CES4A; mouse and rat Ces1a) that reflected gene relatedness among rodent species (e.g., mouse and rat Ces1a). Pseudogenes were named by adding “P” and a number to the human gene name (e.g., human CES1P1) or by using a new letter followed by ps for mouse and rat Ces pseudogenes (e.g., Ces2d-ps). Gene transcript isoforms were named by adding the GenBank accession ID to the gene symbol (e.g., human CES1_AB119995 or mouse Ces1e_BC019208). This nomenclature improves our understanding of human, mouse, and rat CES/Ces gene families and facilitates research into the structure, function, and evolution of these gene families. It also serves as a model for naming CES genes from other mammalian species.
Five families of mammalian carboxylesterases (CES; E.C. 3.1.1.1) have been described, including CES1, the major liver enzyme (Ghosh 2000; Holmes et al. 2009a; Munger et al. 1991; Shibata et al. 1993); CES2, the major intestinal enzyme (Holmes et al. 2009a; Langmann et al. 1997; Schewer et al. 1997); CES3, expressed in brain, liver, and colon (Holmes et al. 2010; Sanghani et al. 2004); CES5 (also called CES7 or cauxin), a major urinary protein of the domestic cat also present in human tissues (Holmes et al. 2008a; Miyazaki et al. 2003, 2006; Zhang et al. 2009); and CES6, a predicted CES-like enzyme in brain (Clark et al. 2003; Holmes et al. 2009a; reviewed by Williams et al. 2010). These enzymes catalyze hydrolytic and transesterification reactions with xenobiotics, anticancer prodrugs, and narcotics (Ohtsuka et al. 2003; Redinbo and Potter 2005; Satoh and Hosokawa 1998, 2006; Satoh et al. 2002), the conversion of lung alveolar surfactant (Ruppert et al. 2006), and several lipid metabolic reactions (Becker et al. 1994; Diczfalusy et al. 2001; Ghosh 2000; Hosokawa et al. 2007; Tsujita and Okuda 1993); they may also assist with the assembly of low-density lipoprotein particles in liver (Wang et al. 2007).
Structures for human and animal CES genes have been reported, including rodent CES1- and CES2-“like” genes (Dolinsky et al. 2001; Ghosh et al. 1995; Hosokawa et al. 2007) and human CES1 and CES2 genes (Becker et al. 1994; Ghosh 2000; Langmann et al. 1997; Marsh et al. 2004). Predicted gene structures have also been described for the human CES3, CES5, and CES6 genes, which are localized with CES1 and CES2 in two contiguous CES gene clusters on human chromosome 16 (Holmes et al. 2008a, 2009a, b, 2010). In addition, a CES1-like pseudogene (currently designated CES4) is located with the CES1–CES5 gene cluster (Yan et al. 1999). Mammalian CES genes usually contain 12–14 exons of DNA encoding CES enzyme sequences, which may be shuffled during mRNA synthesis, generating several CES transcripts and enzymes encoded by each of the CES genes (see Thierry-Mieg and Thierry-Mieg 2006). There are significant sequence similarities for the five CES families, especially for key regions previously identified for human liver CES1 (Bencharit et al. 2003, 2006; Fleming et al. 2005). Three-dimensional structural analyses of human CES1 have identified three major ligand binding sites, including the broad-specificity active site, the “side door,” and the “Z-site,” where substrates, fatty acids, and cholesterol analogs, respectively, are bound; and an active site “gate,” which may facilitate product release following catalysis (Bencharit et al. 2003, 2006; Fleming et al. 2005).
Because of the confusion associated with the current nomenclature for mammalian CES genes, particularly for mouse and rat Ces genes where significant gene duplication events have generated a large number of Ces1-like and Ces2-like genes (Berning et al. 1985; Dolinsky et al. 2001; Ghosh et al. 1995; Hosokawa et al. 2007; Satoh and Hosokawa 1995), this article proposes a new nomenclature system that enables easy identification of CES family members. The nomenclature follows the guidelines of the human, mouse, and rat gene nomenclature committees and allocates a new name for each human (CES) or mouse and rat (Ces) gene. It also names and identifies the gene family origin for identified CES pseudogenes and provides a system for naming transcript isoforms derived from each of the CES genes. The nomenclature has the flexibility to accommodate new human, mouse, and rat CES genes and will assist further research into the structure, function, and evolution of these gene families as well as serve as a model for naming CES genes from other mammalian species.
The new nomenclature system for human, mouse, and rat CES genes and enzymes is based on the identification of homolog gene families and a subsequent allocation of a unique gene name for each of the genes observed from genome databases or reported from previous studies. It follows the guidelines of the human, mouse, and rat gene nomenclature committees and recommends the naming of homolog CES or Ces genes among species. The italicized root symbols “CES” for human and “Ces” for mouse and rat genes were used, followed by a number describing the gene family (examples include CES1 for human CES family 1 or Ces1 for mouse and rat Ces family 1 genes) (Tables 1, 2, 3). For mammalian genomes in which multiple genes were identified or a gene required a name that clashed with an existing name, a capital letter (for human genes) (e.g., CES4A) or a lower-case letter (for mouse and rat genes) (e.g., Ces1a, Ces1b for multiple mouse Ces1-like genes) was added after the number. The letter used for multiple genes reflected the relatedness of the genes across species (e.g., reflecting higher degrees of identity for mouse and rat Ces1a genes). When a human CES pseudogene was identified, a capital “P” and a number were added to the gene name (e.g., CES1P1), whereas for mouse and rat Ces pseudogenes, a unique lower-case letter was used followed by “-ps” (e.g., Ces2d-ps). Transcript isoforms of human (CES) and mouse and rat (Ces) genes were designated by following the gene name with the GenBank transcript ID, such as human CES1_AB119997 and CES1_AB187225, which differs from the current nomenclature used for human CES1 isoforms (CES1A1 and CES1A2, respectively) (see Table 1).
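The naming rules above can be restated as a small function. The following Python sketch is only an illustration of the rules as described in the text; the function name `ces_symbol` and its parameters are hypothetical and are not part of any nomenclature committee tool.

```python
def ces_symbol(species, family, letter="", pseudogene=False,
               pseudo_number=1, accession=None):
    """Build a CES/Ces symbol from the rules described in the text.

    All parameter names are illustrative assumptions:
      species       -- "human", "mouse", or "rat"
      family        -- gene family number (1-5)
      letter        -- optional letter for multiple genes or name clashes
      pseudogene    -- True for a pseudogene
      pseudo_number -- number following "P" for human pseudogenes
      accession     -- GenBank transcript ID for a transcript isoform
    """
    species = species.lower()
    if species == "human":
        # Human: upper-case root, optional capital letter, "P" + number.
        symbol = f"CES{family}{letter.upper()}"
        if pseudogene:
            symbol += f"P{pseudo_number}"
    elif species in ("mouse", "rat"):
        # Mouse and rat: "Ces" root, lower-case letter, "-ps" suffix.
        symbol = f"Ces{family}{letter.lower()}"
        if pseudogene:
            if not letter:
                raise ValueError("rodent pseudogenes need a unique letter")
            symbol += "-ps"
    else:
        raise ValueError(f"unsupported species: {species}")
    if accession:
        # Transcript isoform: gene symbol, underscore, GenBank ID.
        symbol += f"_{accession}"
    return symbol


print(ces_symbol("human", 4, "a"))                    # CES4A
print(ces_symbol("human", 1, pseudogene=True))        # CES1P1
print(ces_symbol("mouse", 2, "d", pseudogene=True))   # Ces2d-ps
print(ces_symbol("mouse", 1, "e", accession="BC019208"))  # Ces1e_BC019208
```

The sketch only formats symbols; choosing the letter so that it reflects relatedness across species still requires the sequence comparisons described in the text.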
Table 1 summarizes the locations and exonic structures for human CES genes based upon previous reports for human CES1 and CES2 (Becker et al. 1994; Ghosh 2000; Langmann et al. 1997; Marsh et al. 2004) and predictions for human CES3 (Holmes et al. 2010), CES4A (Holmes et al. 2009a), and CES5A (Holmes et al. 2008a) [the February 2009 human reference sequence (GRCh37) was used in this study (Rhead et al. 2010)]. Human CES1P1 (a CES1-like pseudogene), CES1, and CES5A were located in a cluster (cluster 1) on chromosome 16, while CES2, CES3, and CES4A were in a separate cluster (cluster 2) on the same chromosome. Cluster 1 CES genes (CES1 and CES5A) were transcribed on the negative strand, whereas cluster 2 genes (CES2, CES3, and CES4A) were transcribed on the positive strand. Figure 1 summarizes the predicted exonic start sites for human CES genes, with CES1 and CES4A containing 14 exons, CES3 and CES5A 13 exons, and CES2 12 exons. These exon start sites were in identical or similar positions to those reported for CES1 (Ghosh 2000; Marsh et al. 2004). Figure 2 shows the comparative structures for human CES reference sequences and transcripts described on the AceView website (http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/) (Thierry-Mieg and Thierry-Mieg 2006). The CES gene and transcript sequences varied in size from 11 kb for CES2 to 79 kb for CES5A and exhibited distinct structures in each case. Moreover, several isoforms were generated in vivo for each of the human CES genes and have different structures as a result of transcriptional events, including truncation of the 5' ends, differential presence or absence of exons, alternative splicing or retention of introns, or overlapping exons with different boundaries. In addition, the isoforms are differentially expressed in tissues of the body and may perform distinctive metabolic roles. CES isoforms were named by using the gene name followed by the GenBank ID for the specific transcript.
Recent studies of human CES1 have described at least two major isoform transcripts, designated as CES1A1 (AB119997) and CES1A2 (AB119996) (Tanimoto et al. 2007). These isoforms have been redesignated as CES1_AB119997 and CES1_AB119996, respectively (see Table 1) and encode sequences that differ by only four amino acid residues within the N-terminal region (exon 1) (Tanimoto et al. 2007). Distinct 5'-untranslated consensus sequences for binding transcription factors were reported, suggesting differences in transcriptional regulation and in the functional roles of these isoforms in contributing to CPT-11 chemosensitivity (Hosokawa et al. 2008; Tanimoto et al. 2007; Yoshimura et al. 2008). Fukami et al. (2008) have also examined human CES isoform structure and proposed that CES1P1, a CES1-like pseudogene on chromosome 16 (designated as CES1A3), was derived from the CES1_AB119997 isoform.
An alignment of the amino acid sequences for human CES-like protein subunits is shown in Fig. 1, together with a description of several features for these enzymes. The sequences have been derived from previously reported sequences for CES1 (Munger et al. 1991; Shibata et al. 1993), CES2 (Langmann et al. 1997; Schewer et al. 1997), CES3 (Sanghani et al. 2004), CES4A (previously CES6 or CES8) (Holmes et al. 2009a), and CES5A (previously CES7) (Holmes et al. 2008a) (Table 1). Alignments of the human CES subunits showed between 39 and 46% sequence identities, which suggests that these are products of separate but related gene families, whereas sequence alignments of human CES1 and CES2 with mouse CES1-like and CES2-like subunits exhibited higher levels of sequence identities with the CES family homolog in each case [66–78% identities for human and mouse CES1-like subunits and 64–72% for human and mouse CES2-like subunits, respectively (data not shown)], suggesting that these are members of the same mammalian CES families. Similar results were observed for comparisons of human CES3, CES4A (previously CES6 or CES8), and CES5A (previously CES7) with the corresponding mouse CES homolog sequences, with 65, 72, and 69% identities being observed, respectively. This supports the designation of these CES genes as members of the same family in each case.
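The percentage identities quoted above come from pairwise comparisons of aligned sequences. The text does not state how the identities were calculated, so the following Python sketch makes an explicit assumption: identity is the number of matching residues divided by the number of aligned columns in which neither sequence has a gap. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumption (not from the article): columns containing a gap in
    either sequence are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example using the C-terminal tetrapeptides mentioned in the text
# for human CES1 (HIEL) and CES2 (HTEL): three of four residues match.
print(percent_identity("HIEL", "HTEL"))  # 75.0
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give somewhat different values, so figures computed this way are not guaranteed to reproduce the percentages reported here.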
The amino acid sequences for the human CES subunits examined contained 567 (CES1), 559 (CES2), 571 (CES3), 561 (CES4A), and 575 (CES5A) residues (Fig. 1). Previous studies on human CES1 have identified key residues that contribute to the catalytic, oligomeric, subcellular localization and regulatory functions for this enzyme (sequence numbers refer to human CES1). These included the catalytic triad for the active site (Ser221; Glu354; His468) (Cygler et al. 1993); disulfide bond-forming residues (Cys87/Cys116 and Cys274/Cys285) (Lockridge et al. 1987); microsomal targeting sequences, including the hydrophobic N-terminus signal peptide (Potter et al. 1998; von Heijne 1983; Zhen et al. 1995) and the C-terminal endoplasmic reticulum (ER) retention sequence (His-Ile-Glu-Leu) (Robbi and Beaufay 1983); and ligand-binding sites, including the “Z-site” (Gly356), the “side door” (Val424-Met425-Phe426), and the “gate” (Phe550) residues (Bencharit et al. 2003, 2006; Fleming et al. 2005). Identical residues were observed for each of the human CES subunit families for the active site triad and disulfide bond-forming residues, although changes were observed for some key residues for CES1 subunits, including the “side door” and “gate” of the active site, with family-specific sequences or residues in each case. The “Z-site” (Gly356 for human CES1) has been retained for human CES2 and CES5A sequences, but substituted for CES3 (Ser) and CES4A (Asn). The hydrophobic N-terminal sequence for human CES sequences has undergone major changes, although this region retains a predicted signal peptide property. The human CES C-terminal tetrapeptide sequences have also changed, although CES2 (HTEL) and CES3 (QEDL) are similar in sequence with human CES1 (HIEL), which plays a role in the localization of human CES1 within endoplasmic reticulum membranes (Robbi and Beaufay 1983).
Other key human CES1 sequences included two charge clamps that are responsible for subunit-subunit interaction, namely, residues Lys78/Glu183 and Glu72/Arg186, which contribute to the trimeric and hexameric structures for this enzyme (Bencharit et al. 2003, 2006; Fleming et al. 2005). Other human CES subunit sequences for these charge clamp sites included substitutions with neutral amino acids for the human CES2 and CES5A sequences, while the CES3 and CES4A sequences retained one potential clamp site (Fig. 1). Pindel et al. (1997) and Holmes et al. (2009b) have reported monomeric subunit structures for human and baboon CES2, which is consistent with the absence of charge clamps for this enzyme. This could have a major influence on the kinetics and biochemical roles for human CES isozymes since three-dimensional studies have indicated that ligand binding to the human CES1 “Z-site” shifts the trimer-hexamer equilibrium toward the trimer that facilitates substrate binding and enzyme catalysis (Redinbo and Potter 2005). The N-glycosylation site for human CES1 at Asn79-Ala80-Thr81 (Bencharit et al. 2003, 2006; Fleming et al. 2005; Kroetz et al. 1993) was not retained for any of the other human CES sequences, although potential N-glycosylation sites were observed at other positions, including CES2 (site 3), CES3 (site 2), CES4A (sites 4, 5, and 7), and CES5A (sites 6, 8, and 9) (Table 4). Given the reported role of the N-glycosylated carbohydrate group contributing to CES1 stability and maintaining catalytic efficiency (Kroetz et al. 1993), the N-glycosylation sites predicted for other human CES subunits may perform similar functions or indeed may serve new functions specific to a particular CES family.
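Potential N-glycosylation sites such as Asn79-Ala80-Thr81 can be located by scanning a sequence for a short motif. The Python sketch below assumes the commonly used Asn-X-Ser/Thr sequon (X being any residue except Pro); the article does not say which prediction rule or tool was used, so this is an illustration rather than the authors' method. The function name `find_sequons` and the test sequence are hypothetical.

```python
import re

# Lookahead so that overlapping sequons are all reported.
# Assumed motif: N, then any residue except P, then S or T.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein_sequence):
    """Return (1-based position of Asn, tripeptide) for each candidate site."""
    sequence = protein_sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


# Made-up peptide, not a real CES sequence: "NAT" matches at position 3,
# whereas "NPT" is skipped because proline follows the asparagine.
print(find_sequons("AANATGGNPTK"))  # [(3, 'NAT')]
```

A motif match only flags a candidate site; whether the site is actually glycosylated depends on factors such as accessibility and cellular context that a sequence scan cannot capture.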
Predicted secondary structures for human CES2 (Holmes et al. 2009b), CES3 (Holmes et al. 2010), CES4A (Holmes et al. 2009a), and CES5A (Holmes et al. 2008a) sequences were compared with those reported for human CES1, and similar α-helix and β-sheet structures were observed for all of the CES subunits examined (Bencharit et al. 2003, 2006) (Fig. 1). This was especially apparent near key residues or functional domains such as the α-helix within the N-terminal signal peptide, the β-sheet and α-helix structures near the active site Ser221 (human CES1) and “Z-site” (Glu354/Gly356, respectively), the α-helices bordering the “side door” site, and the α-helix containing the “gate” residue (Phe550 for human CES1). The human CES5A sequence, however, contained a predicted helix at the hydrophobic C-terminus not observed for other CES subunits, which may perform a family-specific function. Predicted 3D structures have been previously described for each of the human CES subunits (Holmes et al. 2008a, 2009a, b, 2010); they were similar to the human CES1 structure (Bencharit et al. 2003, 2006).
Table 2 summarizes the proposed names, locations, and overall structures for the Ces genes observed for the mouse genome [the July 2007 mouse (Mus musculus) genome data obtained from the Build 37 assembly by NCBI and the Mouse Genome Sequencing Consortium (http://www.ncbi.nlm.nih.gov) were used in this study]. The italicized gene name Ces is consistent with other mouse gene nomenclature and is preferred to the CES stem used for human genes. At least 20 mouse Ces genes are recognized on the Mouse Genome Database (MGI) (http://www.informatics.jax.org/) and further described in terms of their locations on mouse chromosome 8, the number of predicted exons for each gene, predicted strand for transcription, number of amino acid residues and subunit molecular weights (MWs) for the encoded CES subunits, and identification symbols from MGI (e.g., MGI3648919 for Ces1a), NCBI (Reference Sequences were identified from the National Center for Biotechnology Information database) (http://www.ncbi.nlm.nih.gov/), Vega (the VErtebrate Genome Annotation database) (http://vega.sanger.ac.uk/index.html), UNIPROT (Universal Protein Resource) (http://www.ebi.ac.uk/uniprot/), and Ensembl (Genome Database) (http://www.ensembl.org/) database sources.
Eight Ces1-like genes are located in tandem within a 360-kb segment of mouse chromosome 8, with an average gene size of 28 kb. The names for these genes (Ces1a, Ces1b,…, Ces1h) are allocated in the same order as their locations on the mouse genome (Table 2). The Ces1-like gene cluster is also located near the mouse Ces5a gene, which is comparable to the CES1P1-CES1-CES5A cluster observed for human chromosome 16. Each of these genes contained 13 or 14 exons predicted for transcription on the negative strand and with encoded CES subunits exhibiting distinct but similar amino acid sequences (554–567 residues). The subunits were 63–85% identical with each other and with the human CES1 sequence, which is consistent with these being members of the mouse Ces1 gene family. Mouse Ces1-like genes included several that have been previously investigated, including Ces1c (previously called Es1), encoding a major mouse plasma esterase with 554 amino acid residues and also exhibiting lung surfactant convertase activity (Genetta et al. 1988; Krishnasamy et al. 1998); Ces1d (previously Ces3), encoding a mouse liver enzyme with 565 residues and exhibiting triacylglycerol hydrolase activity (Dolinsky et al. 2001); Ces1e (previously called Es22 or egasyn), encoding a liver CES with 562 residues and exhibiting β-glucuronidase-binding properties (Ovnic et al. 1991); and Ces1g (previously Ces1), encoding a liver CES with 565 amino acid residues and exhibiting lipid metabolizing activity (Table 4) (Ellingham et al. 1998).
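Allocating letters in order of genomic location can be sketched in a few lines of Python. The identifiers and coordinates below are invented for illustration and are not taken from Table 2, and the choice of ascending start coordinate as "order of location" is an assumption, since the text does not specify the direction along the chromosome. The function name `allocate_letters` is hypothetical.

```python
from string import ascii_lowercase


def allocate_letters(genes, root="Ces", family=1):
    """Map provisional gene IDs to symbols lettered by genomic order.

    genes -- iterable of (provisional_id, start_coordinate) pairs
    Assumption: letters follow ascending start coordinate.
    """
    ordered = sorted(genes, key=lambda gene: gene[1])
    if len(ordered) > len(ascii_lowercase):
        raise ValueError("more genes than available letters")
    return {
        gene_id: f"{root}{family}{ascii_lowercase[index]}"
        for index, (gene_id, _start) in enumerate(ordered)
    }


# Hypothetical loci with made-up coordinates on one chromosome.
loci = [("locus_B", 300_000), ("locus_A", 100_000), ("locus_C", 200_000)]
print(allocate_letters(loci))
# {'locus_A': 'Ces1a', 'locus_C': 'Ces1b', 'locus_B': 'Ces1c'}
```

This positional scheme applies to the mouse clusters as described; the rat genes were instead lettered by their degree of identity with the corresponding mouse genes, which requires sequence comparison rather than sorting by position.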
Eight Ces2-like genes were also observed in a second 286-kb gene cluster on mouse chromosome 8, with an average gene size of approximately 8 kb (Table 2). These genes were named according to their order of position on the mouse genome (Ces2a, Ces2b,…, Ces2h) and included a pseudogene designated Ces2d-ps. Three of these mouse Ces2-like genes have been previously described, including Ces2c (previously Ces2), encoding an inducible liver acyl-carnitine hydrolase enzyme with 561 residues (Furihata et al. 2003); Ces2e (previously Ces5), encoding a liver and intestinal enzyme with 560 amino acid residues (The MGC Project Team 2004); and Ces2a (previously Ces6), encoding a liver and colon enzyme with 558 residues (The MGC Project Team 2004). The Ces2-like cluster was located alongside two Ces3-like mouse genes (Ces3a and Ces3b) and a Ces4a gene (Table 2); this is comparable to the CES2-CES3-CES4A gene cluster on human chromosome 16 (Table 1). The Ces3a gene (previously mouse esterase 31 or Est31) is expressed strongly in male mouse livers and encodes a 554-residue CES3-like subunit (Aida et al. 1993), whereas the Ces3b gene (previously Es31L or EG13909) is also expressed in liver and encodes a 568-residue subunit (The MGC Project Team 2004). The Ces4a gene (previously called EST8 or Ces8) encodes an enzyme predicted for secretion in epidermal cells with 563 amino acid residues and showing 72% identity with human CES4A (The MGC Project Team 2004).
Table 3 summarizes the proposed names, locations, and structures for Ces genes observed for the rat genome [the November 2004 rat (Rattus norvegicus) genome assembly based on version 3.4 produced by the Baylor Human Genome Sequencing Center (Gibbs et al. 2004) was used in this study]. Fifteen rat Ces genes were identified on the Rat Genome Database (RGD) (http://rgd.mcw.edu/) and further characterized by their locations on rat chromosomes 1 and 19, the number of predicted exons for each gene, the predicted strand for transcription, current gene symbols, the number of amino acid residues and subunit MWs for the encoded CES subunits, and the identification symbols from RGD (e.g., RGD1583671 for Ces1a), NCBI Reference Sequences (http://www.ncbi.nlm.nih.gov/), Vega (http://vega.sanger.ac.uk/index.html), UNIPROT (http://www.ebi.ac.uk/uniprot/), and Ensembl (http://www.ensembl.org/) database sources.
Five Ces1-like genes were located in tandem within a 201-kb segment of rat chromosome 19, with an average gene size of 33 kb (Table 3). The names for these genes (Ces1a, Ces1c,…, Ces1f) were allocated according to their degree of identity with the corresponding mouse Ces1-like genes (Table 3). The genes were located in tandem in the same order as the mouse Ces1-like genes and were near the rat Ces5a gene. This is comparable to the CES1P1–CES1–CES5A gene cluster observed for human chromosome 16. The rat Ces1-like genes contained 14 exons and were predicted for transcription on the positive strand, with encoded CES subunits exhibiting similar amino acid sequences (550–565 residues). The subunits were 65–73% identical with each other and with the human CES1 sequence, which is consistent with membership of the rat Ces1 gene family. The encoded rat Ces1-like subunit sequences showed higher levels of identity with the corresponding mouse Ces1-like sequences (81–92% for rat and mouse CES1a, CES1c, CES1d, CES1e, and CES1f amino acid sequences). At least three rat Ces1-like genes have been previously described, including Ces1c (previously called Es1), encoding a rat plasma esterase (Sanghani et al. 2002; Vanlith et al. 1993); Ces1d (previously Ces3), encoding a rat liver enzyme with 565 residues and exhibiting cholesteryl ester hydrolase activity (Ghosh et al. 1995; Robbi et al. 1990); and Ces1e (previously called ES-3 or egasyn), encoding a rat liver Ces with 561 residues and having β-glucuronidase-binding properties (Robbi and Beaufay 1994).
Seven rat Ces2-like genes were observed on the rat genome and were localized on two chromosomes: chromosome 1 (Ces2c and Ces2i) and chromosome 19 in three locations: Ces2a and Ces2e; Ces2j; and Ces2g and Ces2h (Table 3). The genes were named according to the degree of sequence identity with the corresponding mouse Ces2-like genes. Rat Ces2-like genes have been previously investigated, including Ces2c (previously Ces2), encoding an inducible liver acyl-carnitine hydrolase enzyme with 561 residues (Furihata et al. 2003); Ces2e (previously Ces5), encoding a liver and intestinal enzyme with 560 amino acid residues (The MGC Project Team 2004); and Ces2a (previously Ces6), encoding a liver and colon enzyme with 558 residues (The MGC Project Team 2004). The rat Ces2-like cluster was located alongside a Ces3-like gene (Ces3a and Ces3b) and a Ces4a gene (Table 3), which is comparable to the CES2-CES3-CES4A gene cluster on human chromosome 16 (Table 1).
Mammalian CES families exhibit broad substrate specificities, and specific roles for these enzymes have been difficult to establish because of the promiscuity of the CES active site toward a wide range of substrates and the existence of multiple forms with overlapping specificities (Fleming et al. 2005; Imai 2006; Leinweber 1987; Redinbo and Potter 2005; Satoh and Hosokawa 1998, 2006). Table 4 summarizes current knowledge concerning substrates and functions reported for human, mouse, and rat CES gene family members.
Studies on human CES1 have examined its role in the metabolism of various drugs, including narcotics such as heroin and cocaine (Bencharit et al. 2003; Pindel et al. 1997), warfare nerve agents (Hemmert et al. 2010), psychostimulants (Sun et al. 2004), analgesics (Takai et al. 1997), and chemotherapy drugs (Sanghani et al. 2004). Mammalian liver is predominantly responsible for drug clearance from the body, with CES1 and CES2 (with CES1 > CES2) playing major roles, following absorption of drugs into the circulation (Imai 2006; Pindel et al. 1997). Mammalian intestine (with CES2 > CES1) plays a major role in first-pass clearance of several drugs, predominantly via CES2 in the ileum and jejunum (Imai et al. 2003). CES1 and CES2 also have different roles in prodrug activation, as shown for the anticancer drug irinotecan (CPT-11), which is converted to its active form SN-38 predominantly by CES2 (Humerickhouse et al. 2000). Recent modeling studies have shown that the human CES2 active site cavity is lined with negatively charged residues; this may explain the preference of this enzyme for neutral substrates (Vistoli et al. 2010). The role for human CES3 has not been studied extensively, although the enzyme is capable of activating prodrugs such as irinotecan (Sanghani et al. 2004). There are no reports concerning the metabolic role(s) for human CES4A, and functional studies on mammalian CES5 are limited to feline species, where the enzyme is secreted into cat urine and apparently regulates the production of a cat-specific amino acid “felinine,” a putative pheromone precursor (Miyazaki et al. 2006).
Recent comparative and evolutionary studies (Holmes et al. 2008b; Williams et al. 2010) have concluded that there are at least five major mammalian CES gene families. In addition, the gene duplication events that generated the ancestral mammalian CES1, CES2, CES3, CES4, and CES5 genes have apparently predated the common ancestor for marsupial and eutherian mammals (Holmes et al. 2008b), which has been estimated at approximately 173–193 million years ago (Woodburne et al. 2003), and may coincide with the early diversification of tetrapods approximately 350–360 million years ago (Donoghue and Benton 2007). The mammalian CES gene families are ancient in their genetic origins and were established prior to the appearance of mammals during evolution. Further CES/Ces gene duplication events have subsequently occurred during mammalian evolution, however, especially for rodent species, for which the mouse and rat Ces1-like and Ces2-like genes have apparently undergone successive duplication events. At least three of these are likely to have occurred in the common ancestor for rat and mouse during rodent evolution since several homolog genes and proteins were recognized, including Ces1c (previously Es1), Ces1d (Ces3), Ces1e (Es22), Ces2a (Ces6), Ces2c (Ces2), and Ces2e (Ces5) (Tables 3, 4). With the exception of the rat Ces2-like genes, which were located in multiple clusters on chromosomes 1 and 19, human, mouse, and rat CES genes were localized within two clusters of genes on the same chromosome, namely, Ces1–Ces5A (with multiple Ces1-like genes) and Ces2–Ces3–Ces4A (with multiple Ces2-like genes in mouse and rat). The presence of two Ces3-like genes in the mouse suggests that a further duplication event also took place in this species.
This article has examined human, mouse, and rat carboxylesterase genes and encoded subunits and has proposed a new nomenclature system, identifying each of five gene families (designated as CES1, CES2,…, CES5 for human genes and Ces1, Ces2,…, Ces5 for mouse and rat genes) and allocating a unique gene name for each of the genes. The italicized root symbols “CES” for human and “Ces” for mouse and rat genes, followed by a number for the family, were used, which is consistent with current practice. When multiple genes were identified for a gene family or where a gene required a name that clashed with an existing name, a capital letter (for human genes) (e.g., CES4A) or a lower-case letter (for mouse and rat genes) (e.g., Ces1a, Ces1b) was added after the number. A human CES pseudogene was named using a capital “P” and a number (e.g., CES1P1), whereas mouse and rat Ces pseudogenes were named with a unique lower-case letter followed by “-ps” (e.g., Ces2d-ps). This new nomenclature will also assist in naming multiple CES genes and proteins from other mammalian species. As an example, Holmes et al. (2009c) and Williams et al. (2010) have reported multiple CES1-like genes on the horse genome that may be designated in accordance with the recommended nomenclature as CES1A, CES1B, CES1C, and so on, in order of the tandem locations of these genes on chromosome 3. Transcript isoforms of CES genes were named by following the gene name with the GenBank ID for the specific transcript. This nomenclature will assist our understanding of the genetic relatedness and the CES family origins for individual human, mouse, and rat CES genes and proteins and facilitate future research into the structure, function, and evolution of these genes. It will also serve as a model for naming CES genes from other mammalian species.
This research was supported by NIH Grants P01 HL028972 and P51 RR013986 (to LAC); R01 ES07965 (to BY); and CA108775, and a Cancer Center Core Grant CA21765, the American Lebanese and Syrian Associated Charities (ALSAC) and St. Jude Children's Research Hospital (SJCRH) (to PMP); and a program project grant HG000330 entitled `Mouse Genome Informatics' from the National Human Genome Research Institute of the NIH (to LJM). Acknowledgement is also given to members of the Redinbo laboratory and NIH grants CA98468 and NS58089 (to MRR).