UGT2B7 plays a central role in the liver-mediated biotransformation of endogenous and exogenous compounds. The genetic basis of interindividual variability in UGT2B7 function is unknown. This study aimed to discover novel gene variants of functional significance.
Caucasian human livers (n=54) were used. UGT2B7 was resequenced in 12 samples (6 highest and 6 lowest for the formation of morphine-3-glucuronide, M3G). Haplotype-tagging single nucleotide polymorphisms (tSNPs) were genotyped in the entire sample set. Samples were phenotyped for mRNA expression.
10 tSNPs were identified and their haplotypes were inferred. Haplotype 4 (-45597G;-6682_-6683A;372A;IVS1+9_IVS1+10A;IVS1+829T;IVS1+985G;IVS1+999C;IVS1+1250G;80 1T;IVS4+185C) (frequency of 0.12) was associated with an increase in enzyme activity and gene expression. The 1/4 and 4/6 diplotypes had higher M3G formation compared to 1/1 (p<0.05) and 2/3 (p<0.01) diplotypes. Diplotypes containing haplotype 4 resulted in a significant 45% average increase in the formation of M3G compared to diplotypes without haplotype 4 (p=0.002). There was also an association between haplotype 4 and increased mRNA expression. IVS1+985A>G, 735A>G and 1062C>T are the putative functional variants of haplotype 4. We also identified two mRNA splicing variants (UGT2B7_v2 and UGT2B7_v3) splicing out exon 1, 4, 5 and 6 but sharing exons 2 and 3 with the involvement of additional 5' exons. UGT2B7_v2 was detected in all livers tested, but UGT2B7_v3 was present at much lower levels compared to UGT2B7_v2. The UGT2B7 reference sequence mRNA is now named UGT2B7_v1.
UGT2B7 haplotype 4 is functional and its effects on the biotransformation of UGT2B7 substrates should be tested in controlled clinical trials. Biochemical studies should investigate the functional role of the newly discovered mRNA splicing variants.
UDP-glucuronosyltransferases (UGTs) are major phase II conjugative metabolic enzymes. The liver is the main organ of drug biotransformation, and the reaction with UDPGA as the cosubstrate results in the production of glucuronidated metabolites that are more water-soluble than the parent compound and readily eliminated through the biliary and renal systems . Similar to the 18 members of the human UGT superfamily of enzymes, the C-terminal domain of UGT2B7 is the site of binding for UDPGA, and the N-terminal domain is the site of binding for the various substrates. The elucidation of the biochemical properties of this enzyme is an area of active research, and the crystal structure of the UDPGA-binding domain has been recently obtained .
UGT2B7 is highly expressed in the human liver, with lower levels of expression in other extrahepatic organs . UGT2B7 is able to glucuronidate various steroid hormones (androsterone, epitestosterone) and fatty acids. UGT2B7 therefore plays a key role in the detoxification of cholestatic bile acids and may prevent the formation of proximal carcinogens such as quinone estrogens . In addition, UGT2B7 is also able to conjugate major classes of drugs such as analgesics, carboxylic nonsteroidal anti-inflammatory drugs (ketoprofen), anticarcinogens (all-trans retinoic acid) and anticancer drugs (epirubicin) .
The prototypical UGT2B7 substrate is morphine. In humans, morphine is primarily metabolized through glucuronidation via UGT2B7, and the generated metabolites have different pharmacological properties. The formation of the inactive morphine-3-glucuronide (M3G) is an important elimination pathway of morphine. The analgesic potency of morphine-6-glucuronide (M6G) is higher than that of morphine . Interpatient variability in the formation of M3G and M6G exists, but its genetic component remains to be established .
The variant in the UGT2B7 gene that has been studied the most is 802T>C in exon 2 (H268Y) (UGT2B7*2) . In established HEK293 cell lines expressing the two isoforms, many substrates were tested, with the result that the substrate specificity was not changed for the vast majority of the substrates [6-8]. In another study using a small series of human liver microsomes, morphine glucuronidation rates were not significantly affected by the 802T>C variant . More recently, we conducted a resequencing study in acute pain patients treated with morphine, and a -161T>C variant (relative to the ATG) in the UGT2B7 promoter [in complete linkage disequilibrium (LD) with 802T>C] was found to be associated with increased M6G/morphine plasma ratios . However, recent studies of morphine in cancer patients did not detect any significant association between these two UGT2B7 variants and either morphine glucuronidation  or analgesic response . Additional clinical studies investigating other UGT2B7 variants did not detect an effect on morphine disposition and response [11,13,14]. For example, in vitro luciferase assays of promoter variants show significant effects on gene transcription but no effect on morphine metabolism in patients [11,13]. A -840G>A variant has been recently shown to affect hepatic clearance of morphine in 20 sickle cell disease patients  but no effect was detected in a larger study of morphine treated patients .
From the studies conducted so far, there is no compelling evidence for the existence of functional UGT2B7 variants. The genetic variability that influences the activity of the UGT2B7 enzyme remains to be established . In addition, mRNA splicing variation can add another level of complexity to the biological functions of UGT genes in the liver, as previously demonstrated [16,17]. Due to the central role of UGT2B7 in the liver-mediated biotransformation of endogenous and exogenous compounds, we performed extensive resequencing of the UGT2B7 gene in human livers aiming to 1) discover novel gene variants, 2) characterize their haplotypic structure, 3) define the effect of genotypes and haplotypes on gene expression and enzyme activity, and 4) identify mRNA splicing variants. Our ultimate goal is to provide the knowledge base for testing the clinical impact of UGT2B7 variation in humans.
Caucasian human livers (n=54) were processed through Dr. Mary Relling's laboratory at St. Jude Children's Research Hospital and were provided by the Liver Tissue Procurement and Distribution System and by the Cooperative Human Tissue Network, funded through NIH contract #N01-DK-9-2310. Samples were collected with approval of institutional review boards. Morphine and epirubicin glucuronidation were measured in liver microsomes following the methods previously published . Briefly, incubations with morphine lasted 20 min and contained 1.4 mM morphine, 5 mM MgCl2, 2 mg/ml microsomal protein, 0.1 M Tris-HCl buffer (pH 7.4) and 5 mM UDPGA. The reactions were stopped with cold acetonitrile and spiked with internal standard (42 nmol 10,11-dihydrocarbamazepine). Formation of M3G and M6G was measured by HPLC . The concentration of morphine used (1.4 mM) resembles the Km of M3G (2 mM) and M6G (1.9 mM) formation . Incubations with epirubicin contained 600 μM epirubicin (similar to the Km for the formation of epirubicin-G=568 μM ), 10 mM MgCl2, 3 mg/ml microsomal protein, 0.1 M Tris-HCl buffer (pH 7.4) and 5 mM UDPGA. Reactions were stopped after 4 h with cold methanol followed by addition of internal standard (1 nmol daunorubicin), and analyzed by HPLC .
Activities determined at a single substrate concentration are not ideal for the phenotypic assessment of enzyme activities. The Km and Vmax for the formation of the glucuronides in each liver were not assessed because of the small amount of microsomes available from each liver, and this should be regarded as a limitation of this study. However, the phenotypic assessment of the UGT2B7 enzyme served as a rapid screen to identify the extremes of the phenotypic distribution and to select those samples for resequencing, increasing the likelihood of identifying functional variants.
Phenotypic glucuronidating activity in each liver was expressed as the ratio of the peak height of each glucuronide to that of the internal standard. Although the amount of metabolite formed was not quantified with a standard curve, the chromatographic peak height ratios of M3G, M6G, and epirubicin-G to their internal standards were in the same range as those reported in our previous publication (data not shown).
Twelve samples in the phenotypic extremes of the frequency distribution of the M3G formation (6 with the highest and 6 the lowest formation) were chosen for resequencing. All six coding exons with flanking intron sequence, 3'UTR, promoter/5'UTR and the 5'-upstream region containing alternatively spliced exons were resequenced (Fig. 1). Sequence coverage includes approximately 2.4 kb of the 5' flanking region which contains 0.4 kb of exon 1A, 0.25 kb of exon 1B and flanking regions, 2.2 kb of exon 1 and flanking regions (including 1.2 kb upstream and 0.25 kb downstream flanking sequence), 0.78 kb of exon 2 and flanking intron sequence (including 0.48 kb upstream and 0.15 kb downstream flanking intron sequence), 0.4 kb of exon 3 and flanking intron sequence, 1.4 kb of exons 4 and 5 which includes all of intron 4 (0.84 kb), and approximately 0.9 kb of exon 6 including 0.1-0.2 kb of flanking intron and 3' flanking regions. Primers used for PCR amplification are listed in Table 1.
PCRs were set up in a 25 μl volume containing 2.5 mM MgCl2, 200 μM each dNTP, 500 nM forward and reverse primers, 0.5 units AmpliTaq Gold (Applied Biosystems, Foster City, CA, USA) and 100 ng of DNA. Reactions were cycled at 94°C for 45 s, and annealing was performed at appropriate temperatures for 30 s and 72°C for 1 min for 35 cycles. This was followed by double-stranded sequencing. In preparation for sequencing, PCR products were purified using the QIAquick® PCR Purification Kit (Qiagen, Inc., Valencia, CA, USA). Purified products were eluted in 30 μl of elution buffer. DNA cycle sequencing reactions were carried out in 10 μl reactions using 4 μl of purified PCR product, 400 nM forward or reverse primer, BigDye® Terminator Version 3.0 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA) and HalfBD Dye Terminator Sequencing Reagent (Sigma-Aldrich Co., St. Louis, MO, USA) per manufacturer's instructions. The sequencing primers are shown in Table 1. In many instances the same primers used for PCR were also used for sequencing. Cycle sequencing was performed using standard conditions and reactions were run on a 3100 DNA Sequencer (Applied Biosystems, Foster City, CA, USA). Sequence analysis was performed using Polyphred, Version 4.05.
LD analysis of SNPs (single nucleotide polymorphisms) among 12 Caucasians was performed by using the VG2 program (http://pga.gs.washington.edu/VG2.html). For selection of the tSNPs, SNPs were clustered using LDSelect  integrated in the VG2 program. The parameters for the selection of tSNPs were r2≥0.9 and a minor allele frequency of ≥0.05. Haplotypes and diplotypes composed of 10 tSNPs among 54 samples were estimated by using the Phase 2.0 program. A phase probability cut off of ≥0.9 was used for diplotype assignment. Diplotypes assigned from 4 out of the 54 individuals were ambiguous and not used in the association with phenotypic data.
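To illustrate the LD statistic behind the tSNP selection, the following Python sketch computes r2 between two biallelic sites from phased haplotypes and applies the r2≥0.9 and minor allele frequency ≥0.05 thresholds stated above. The haplotype data and function names are hypothetical, and this is not the LDSelect implementation, which groups SNPs into bins based on pairwise r2 values.

```python
def allele_freq(haplotypes, site):
    """Frequency of the allele coded 1 at a site across phased haplotypes."""
    return sum(h[site] for h in haplotypes) / len(haplotypes)


def r_squared(haplotypes, i, j):
    """Pairwise LD r^2 between sites i and j (alleles coded 0/1)."""
    n = len(haplotypes)
    p_i = allele_freq(haplotypes, i)
    p_j = allele_freq(haplotypes, j)
    # Frequency of the haplotype carrying allele 1 at both sites
    p_ij = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ij - p_i * p_j
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    return d * d / denom if denom > 0 else 0.0


def could_share_bin(haplotypes, i, j, r2_min=0.9, maf_min=0.05):
    """True if both sites pass the MAF filter and their r^2 meets the threshold."""
    for site in (i, j):
        p = allele_freq(haplotypes, site)
        if min(p, 1 - p) < maf_min:
            return False
    return r_squared(haplotypes, i, j) >= r2_min


# Hypothetical data: 24 chromosomes (12 individuals), three biallelic sites
HAPLOTYPES = [(1, 1, 0)] * 4 + [(0, 0, 1)] * 8 + [(0, 0, 0)] * 12

if __name__ == "__main__":
    print(r_squared(HAPLOTYPES, 0, 1), could_share_bin(HAPLOTYPES, 0, 1))
    print(r_squared(HAPLOTYPES, 0, 2), could_share_bin(HAPLOTYPES, 0, 2))
```

In this toy data set, sites 0 and 1 always occur together and would be tagged by a single tSNP, whereas site 2 would need its own tag.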
Ten tSNPs (eight SNPs and two indels) were genotyped in 54 Caucasian liver DNA samples (Table 2, in bold). All SNPs were genotyped by single base extension (SBE) except for 644T>A which was genotyped by primer extension. PCR and extension primers are listed in Table 1. PCRs were set up in a 15 μl volume containing 1.5 mM MgCl2, 100 μM each dNTP, 125 nM forward and reverse primers, 0.375 units AmpliTaq Gold (Applied Biosystems, Foster City, CA, USA) and 25 ng of DNA. Reactions were denatured initially at 95°C for 15 min then cycled at 95°C for 15 s, annealing was performed at appropriate temperatures for 15 s and 72°C for 45 s for 40 cycles.
For the SBE reactions, PCR amplified products were treated with shrimp alkaline phosphatase and exonuclease I to remove excess dNTPs and primers. Six μl of the purified PCR product was combined with 1 μM of extension primer, 250 μM each ddNTP and 1.25 U Thermo Sequenase™ (Amersham Pharmacia Biotech, Piscataway, NJ, USA) in a 10 μl volume, and denatured initially at 96°C for two min then cycled at 96°C for 30 s, 55°C for 30 s, and 60°C for 30 s for 60 cycles. Samples were denatured at 96°C for 4 min prior to separation of the extension products. Extension products were separated by denaturing high performance liquid chromatography using the WAVE 3500HT Nucleic Acid Fragment Analysis System (Transgenomic, Inc., Omaha, NE, USA). Separation on the column was performed at 70°C with 76% 0.1 M triethylammonium acetate and 24% acetonitrile/0.1 M triethylammonium acetate for 2.7 min.
SNPs 137T>C and IVS1+985A>G were genotyped in a duplex reaction for both the PCR and SBE reactions. In the PCR, the primers were adjusted to 200 nM and 65 nM respectively to obtain approximately equal amounts of both amplicons. One μM of each extension primer was added to the duplex SBE reaction.
SNP 644T>A was genotyped by primer extension as opposed to SBE as resolution of the T and A alleles was not optimal by SBE. For the primer extension reaction, 10 μl of purified PCR product was combined with 1 μM of extension primer, 250 μM ddATP, 125 μM dCTP, dGTP, dTTP and 2.5 U Thermo Sequenase™ (Amersham Pharmacia Biotech, Piscataway, NJ, USA) in a 20 μl volume, and denatured initially at 96°C for two min then cycled at 96°C for 30 s, 52°C for 30 s, and 60°C for 30 s for 60 cycles.
The indel variants -6682_-6683insGCAAAT and IVS1+9_IVS1+10insT were genotyped by PCR and sizing by running of amplified products on an ABI 3730 DNA analyzer (Applied Biosystems, Foster City, CA, USA) at the University of Chicago DNA Sequencing and Genotyping Facility and scored using GeneMapper software (Applied Biosystems, Foster City, CA, USA).
We also genotyped all the samples for the -161T>C and 802T>C variants using methods established in one of our previous studies .
The presence of mRNA for the UGT2B7 splicing variants in the 54 human livers (with the exception of 2 samples due to mRNA degradation) was assessed by PCR using primers specific for each splicing variant. Reference sequence UGT2B7 mRNA (now classified as UGT2B7_v1) and β-actin mRNA were also amplified. Sequences for forward primers used to amplify splicing variants were 5'-GGCCCAGATTCCACAGAAGGAA-3' (junction of exon 1B and exon 2 for UGT2B7_v2) and 5'-GTCCAGACATCTAAAATTTGGAAGA-3' (junction of exon 1A and exon 2 for UGT2B7_v3) (Fig. 1A). The reverse primer for UGT2B7_v2 and UGT2B7_v3 was the same as that for UGT2B7_v1 (see below).
PCRs were performed using HotStar Taq DNA Polymerase (Qiagen, Valencia, CA, USA), with the following thermal profile for both splicing variants: initial activation at 95°C for 10 min, followed by 40 cycles of denaturation at 95°C for 10 s, annealing at 62°C for 10 s and extension at 72°C for 30 s, and a final extension at 72°C for 7 min. The PCR conditions for UGT2B7_v1 and β-actin were the same as those used for the splicing variants except for annealing at 58°C and the number of cycles (29 and 28, respectively). PCR products were separated on a 1.5% agarose gel and visualized by ethidium bromide staining using a ChemiDoc (Bio-Rad Laboratories, Hercules, CA, USA). PCR products of the two splicing variants were also cut out from the gel, purified and confirmed by direct sequencing.
To obtain the full sequences of UGT2B7_v2 and UGT2B7_v3, 5'- and 3'-RACE (Rapid Amplification of cDNA Ends) were conducted. A liver cDNA library was constructed using the SMART™ RACE cDNA Amplification Kit (Clontech, Mountain View, CA, USA) according to the manufacturer's instructions. Nested PCR was used to generate the 5'- and 3'-ends of the cDNA. Briefly, the gene-specific primers 5'-GGAATCTGGGCCAAGTCTGAAGC-3' (5'-RACE) and 5'-GAAGCAGGTAGCAGGCCCTCGCAG-3' (3'-RACE) were each used with the universal primer AP1 (Clontech) in the first RACE PCR. Nested PCR was then performed using the gene-specific primers 5'-CTCAGATAATGTAGTGGGTCTTCCAAAT-3' (5'-RACE) and 5'-GTCCAGATCTCTGGCAAAGATGG-3' (3'-RACE), each with the universal nested primer AP2 (Clontech). After amplification, PCR products were separated on a 2% agarose gel; PCR bands were cut out from the gel, purified and directly sequenced from both ends.
The mRNA levels of UGT2B7_v1 and UGT2B7_v2 were measured in the same set of 54 livers (with the exception of 2 samples due to mRNA degradation) by two-step real time PCR using IQ™ SYBR Green Supermix (Bio-Rad Laboratories, Hercules, CA, USA) on a Mx3000P system (Stratagene, Cedar Creek, TX, USA). β-actin mRNA expression was also measured. We were not able to quantify the expression level of UGT2B7_v3 with accuracy.
cDNA was synthesized in a 20 μl reaction volume using the iScript™ cDNA Synthesis Kit (Bio-Rad Laboratories, Hercules, CA, USA) and 2 μg of total RNA. Real time PCRs for UGT2B7_v1 and β-actin were then performed as follows: preheating at 95°C for 10 min, followed by 40 cycles of denaturing at 95°C for 30 s, annealing at 55°C for 1 min and extension at 72°C for 30 s. Real time PCR for UGT2B7_v2 was performed as follows: preheating at 95°C for 10 min, followed by 45 cycles of denaturing at 95°C for 15 s and annealing and extension at 68°C for 1 min. The oligonucleotide sequences of the primers were: 5'-CAGCTTCTCTCCTGGCTACACTT-3' (forward primer) for UGT2B7_v1, 5'-GGCCCAGATTCCACAGAAGGAA-3' (forward primer) for UGT2B7_v2 and the same reverse primer 5'-CAGGAGTTTCGAATAAGCCATAC-3' for both. The β-actin gene was amplified using 5'-ACGTGGACATCCGCAAAGAC-3' (forward primer) and 5'-CAAGAAAGGGTGTAACGCAACTA-3' (reverse primer). Reactions were performed in triplicate in 96-well plates including standard curves. Gene expression was normalized to the expression level of β-actin.
The frequency distributions of M3G, M6G and epirubicin glucuronide (epirubicin-G) formation were normal according to the Kolmogorov-Smirnov test (p>0.05). The mRNA levels of UGT2B7_v1 (ratio to β-actin) were log transformed, as they did not pass the normality test (p<0.05); after log transformation, they were normally distributed (p>0.05). ANOVA with post-hoc pair-wise comparisons was applied to the mRNA levels and to M3G, M6G and epirubicin-G formation in each diplotype. Data are expressed as mean±SD, unless specified otherwise. The results of this study should be regarded as hypothesis generating, as this study was not prospectively powered to detect statistically significant associations between SNPs, haplotypes, diplotypes and phenotypes. Moreover, the p values have not been adjusted for multiple comparisons.
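The normalization step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: it assumes a linear standard curve of the form Ct = slope × log10(quantity) + intercept fitted by least squares, averages the triplicate quantities, and uses hypothetical dilution and Ct values throughout.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to get a relative quantity from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_cts, reference_cts, target_curve, reference_curve):
    """Mean replicate quantity of the target divided by that of the reference gene."""
    target = sum(quantity_from_ct(ct, *target_curve) for ct in target_cts) / len(target_cts)
    reference = sum(quantity_from_ct(ct, *reference_curve) for ct in reference_cts) / len(reference_cts)
    return target / reference


# Hypothetical standard curve: ten-fold dilutions, 3 cycles per log10 unit
CURVE = fit_standard_curve([0, 1, 2, 3], [30.0, 27.0, 24.0, 21.0])

if __name__ == "__main__":
    # Hypothetical triplicate Ct values for a target gene and for beta-actin
    ratio = normalized_expression([24.0, 24.0, 24.0], [21.0, 21.0, 21.0], CURVE, CURVE)
    print(ratio)
```

Averaging quantities rather than Ct values is a design choice made for this sketch; the source does not specify how triplicates were combined.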
Linear regression analysis of each tSNP genotype vs. the phenotypic data (M3G and M6G formation and expression of reference sequence mRNA) was also performed. For the purpose of this analysis, three univariate models (additive, dominant and recessive) were tested, as the modes of inheritance were either unknown or not clearly identifiable from the plotted data. For example, for the hypothetical variant A>G, we have genotypes A/A, A/G and G/G. The additive model would be: A/A=0, A/G=1, G/G=2. The dominant model would be: A/A=0, A/G=1, G/G=1. The recessive model would be: A/A=0, A/G=0, G/G=1. Hardy-Weinberg equilibrium for each variant (both after sequencing and genotyping) was also tested. Statistical significance of the data was set at p<0.05. We used ESEfinder 2.0 [http://rulai.cshl.edu/tools/ESE, 18] for predicting the effect of the gene variants comprised in haplotype 4 on exonic splicing enhancer binding sites for specific serine/arginine-rich (SR) proteins (SF2/ASF, SC35, SRp40, SRp55).
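The three genotype codings described above can be written out as a short Python sketch. The function names and the phenotype values are hypothetical and serve only to show how each model maps a genotype to a numeric predictor before a univariate linear regression is fitted.

```python
def encode_genotype(genotype, variant_allele, model):
    """Numeric code for a genotype such as 'A/G' under a genetic model."""
    count = genotype.split("/").count(variant_allele)  # copies of the variant allele
    if model == "additive":
        return count                      # 0, 1 or 2
    if model == "dominant":
        return 1 if count >= 1 else 0     # one copy is enough
    if model == "recessive":
        return 1 if count == 2 else 0     # two copies are required
    raise ValueError(f"unknown model: {model}")


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Hypothetical genotypes and phenotype values for the variant A>G
    genotypes = ["A/A", "A/G", "G/G", "A/G", "A/A"]
    phenotype = [1.0, 1.4, 2.1, 1.6, 0.9]
    for model in ("additive", "dominant", "recessive"):
        x = [encode_genotype(g, "G", model) for g in genotypes]
        print(model, x, fit_line(x, phenotype))
```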
The frequency distribution of the M3G formation in 49 samples appeared normal (p>0.1) (Fig. 2). Resequencing of approximately 7.8 kb in 12 phenotypic extremes revealed the presence of 50 variants in the UGT2B7 gene, comprising 6 indels and 44 single nucleotide substitutions (Table 2). The former also included a 6 bp indel (-6682_-6683insGCAAAT) and a 324bp Alu repeat at position -2065. With regard to the location of the variants, 14 SNPs were found in exons (including 3 variants in the newly discovered exon 1A) and 2 nonsynonymous variants were identified in exon 1 (644T>A, I215N) and exon 2 (802T>C, Y268H) (Table 2).
Using LDSelect (r2≥0.9 and a minor allele frequency cut off of ≥0.05), 10 bins were identified among 45 out of the 50 variants (5 variants had a frequency <0.05). Ten tSNPs were genotyped in the 54 samples (tSNPs indicated by arrows, Fig. 3). No deviation from Hardy-Weinberg equilibrium was observed (p>0.01). Haplotypes comprising the 10 tSNPs are summarized in Table 3. In the 54 samples, haplotypes 1-6 account for approximately 90% of all haplotypes. Haplotypes 1-6 can be also detected by using only the data from 12 resequenced samples (Table 3). The following common diplotypes were identified: 1/4 (frequency of 0.13), 1/1 (0.11), 1/2 (0.09), 1/3 (0.07), 2/3 (0.07), 3/5 (0.06), and 4/6 (0.06).
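A Hardy-Weinberg check of the kind reported above can be sketched as follows. The source does not state which test was used, so this example assumes a chi-square goodness-of-fit test with one degree of freedom; the genotype counts are hypothetical. With small expected counts an exact test would usually be preferred.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic and p-value (1 df) for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts in 54 samples
    chi2, p_value = hwe_chi_square(36, 16, 2)
    print(f"chi2={chi2:.3f}, p={p_value:.3f}")
```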
We selected M3G formation as the primary phenotype used to select the samples for resequencing. M3G formation data are available from 49 samples out of 54 due to insufficient availability of microsomes for some samples. The 1/4 diplotype had higher M3G formation compared to 1/1 (p<0.05) and 2/3 (p<0.01), similar to the 4/6 diplotype (p<0.05 and p<0.01, respectively) (Fig. 4A). Diplotypes containing haplotype 4 resulted in a significant 45% average increase in the formation of M3G compared to diplotypes without haplotype 4 (1.74±0.21 vs. 1.08±0.09, respectively, p=0.002). This comparison retained statistical significance when the phenotypic extremes were excluded (1.64±0.39 vs. 1.11±0.41, p=0.002).
In addition to M3G, we sought to investigate the role of UGT2B7 haplotypes on the formation of M6G, the active metabolite of morphine. M6G formation data are available from 49 samples out of 54 due to insufficient availability of microsomes for some samples. The 1/4 and 4/6 diplotypes had higher M6G formation compared to 2/3 (p<0.01 and p<0.05, respectively) (Fig. 4B). Diplotypes containing haplotype 4 resulted in a significant 56% average increase in the formation of M6G compared to diplotypes without haplotype 4 (0.13±0.02 vs. 0.09±0.01, respectively, p=0.002). This comparison retained statistical significance when the phenotypic extremes were excluded (0.12±0.03 vs. 0.09±0.03, p=0.002).
Data for the formation of epirubicin-glucuronide (epirubicin-G), a reaction mediated by UGT2B7, are available from 53 samples out of 54 due to insufficient availability of microsomes for one sample. The 1/4 and 4/6 diplotypes had higher epirubicin-G formation than the 2/3 diplotype (p<0.05, Fig. 4C). The diplotypes containing haplotype 4 resulted in a significant 27% average increase in the formation of epirubicin-G compared to the diplotypes without haplotype 4 (1.08±0.11 vs. 0.85±0.05, p=0.04). This comparison retained statistical significance when the phenotypic extremes were excluded (1.10±0.27 vs. 0.87±0.27, p=0.012).
Due to the observed associations with glucuronide formation (see above), we sought to investigate if they were possibly due to an effect of haplotype 4 on UGT2B7_v1 expression. The diplotype 1/4 had increased UGT2B7_v1 mRNA expression compared to diplotypes 1/1 (p<0.05), 2/3 (p<0.01) and 3/5 (p<0.05) (Fig. 4D). This was reflected by more than a two-fold increase in mRNA expression in diplotypes containing haplotype 4 [UGT2B7_v1/β-actin, median (range): 0.44 (-1.20, 1.16)] compared to diplotypes without haplotype 4 [0.088 (-1.01, 0.91)] (p=0.028, t-test). This comparison retained statistical significance when the phenotypic extremes were excluded [median (range): 0.47 (-0.19, 1.16) vs. 0.21 (-1.01, 0.91) (p=0.017, t-test)].
In this analysis, we aimed to investigate which of the tSNPs in haplotype 4 was likely to be responsible for the observed effect of haplotype 4. Linear regression analysis using three different genetic models indicated that IVS1+985A>G was significantly associated with mRNA levels (for UGT2B7_v1), M3G and M6G formation (additive and dominant models, see Tables 4 and 5 and Fig. 5). The IVS1+985A>G variant is included in haplotype 4 and was associated with an increase in M3G and M6G formation and UGT2B7_v1 mRNA. The IVS1+985A>G variant (ID number 56721) is located in intron 1 and has a frequency of 0.17. The other two variants of haplotype 4 that differ from haplotype 1 were IVS1+1250A>G and IVS1+829T>C, but no effect on phenotype was detected (Tables 4 and 5).
Moreover, in this analysis, we intended to see whether variants not included in haplotype 4 might show a phenotypic effect that was not detected by the haplotype analysis. The IVS1+9_IVS1+10insT variant was significantly associated with M3G and M6G formation in both models. We also evaluated whether the -161T>C and 802T>C variants were correlated with M3G and M6G formation and with mRNA levels: no significant correlation was found in any of the models tested (p>0.05, linear regression, data not shown).
The IVS1+985A>G (ID number 56721) is in high LD with 735A>G (ID number 57049) and 1062C>T (ID number 65704) (Fig. 3). Hence, due to the association between IVS1+985A>G and the phenotypes mentioned above, we investigated the effect of IVS1+985A>G, 735A>G, and 1062C>T on the binding sites for SR proteins. Concerning IVS1+985A>G, the presence of a G allele creates a SRp40 binding site (motif TTTAAG, score 3.40, cut-off 2.68). Similarly, the presence of the G allele in 735A>G creates a SRp55 binding site (motif TAACGTT, score 2.97, cut-off 2.68). Concerning 1062C>T, SRp40 (motif GTACAAG, score 2.94, cut-off 2.68) and SRp55 (motif TGTATA, score 3.61, cut-off 2.68) binding sites are present for each allele (C and T, respectively).
By searching the AceView database (http://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/), we found two putative alternative splicing variants supported by more than one clone. With PCR amplification, we confirmed the existence of the two variants in human liver mRNA. We designated the newly discovered splicing variants as UGT2B7_v2 and UGT2B7_v3; the original UGT2B7 sequence is therefore designated as UGT2B7_v1 according to the Human Gene Nomenclature Committee (http://www.genenames.org/). Further validation studies were conducted to amplify the 5'- and 3'-ends of the two variants by RACE PCR. As a result, we discovered that both variants splice out exons 1, 4, 5 and 6 but share exons 2 and 3 of UGT2B7_v1. The 3'-end extends from exon 3 into intron 3 and stops 244 bp from the end of exon 3. For the 5'-ends, UGT2B7_v2 comprises exons 1A and 1B, while UGT2B7_v3 comprises exon 1A alone. Exons 1A and 1B are 400 bp and 132 bp long and are located 45 kb and 6.9 kb 5' to exon 1, respectively (Fig. 1A). The splicing out of introns 1A and 1B follows the GT-AG rule at the exon-intron boundaries (Fig. 1B).
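The GT-AG rule mentioned above states that an intron begins with the dinucleotide GT and ends with AG on the coding strand. A minimal Python check, using hypothetical sequences rather than the actual UGT2B7 intron sequences, could look like this:

```python
def follows_gt_ag_rule(intron_sequence):
    """True if an intron (coding strand, 5'->3') starts with GT and ends with AG."""
    seq = intron_sequence.strip().upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


if __name__ == "__main__":
    # Hypothetical intron sequences, for illustration only
    print(follows_gt_ag_rule("GTAAGTCTTTCCTTTTCAG"))
    print(follows_gt_ag_rule("GCAAGTCTTTCCTTTTCAG"))
```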
Expression of UGT2B7_v2 and UGT2B7_v3 mRNA was observed in human livers (Fig. 6). Their expression was also found in other human tissues (placenta and kidney) but not brain (data not shown). Sequencing of PCR products confirmed the DNA sequences of the two variants (data not shown). UGT2B7_v2 was detectable in all samples (Fig. 6). However, under the present PCR conditions, UGT2B7_v3 was undetectable in some samples (Fig. 6). Real time PCR analysis indicated that the mRNA levels of UGT2B7_v2 were lower than those of UGT2B7_v1 (Ct value, mean±SD, 33.30±1.84 for UGT2B7_v2 vs. 22.19±1.84 for UGT2B7_v1, n=52, p<0.0001). In addition, a modest correlation was found between the expression of the reference sequence mRNA and UGT2B7_v2 mRNA (Pearson r=0.53, p<0.0001).
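To give a rough sense of scale for the Ct difference reported above, the sketch below converts the difference in mean Ct values into an approximate fold difference. It assumes perfect doubling per cycle and comparable amplification efficiencies and thresholds for the two assays, which the source does not establish (the two transcripts were amplified under different cycling conditions), so the figure is an order-of-magnitude illustration only.

```python
def fold_difference_from_ct(ct_low_abundance, ct_high_abundance, efficiency=2.0):
    """Delta-Ct and the approximate fold difference it implies.

    efficiency is the assumed amplification factor per cycle (2.0 = perfect doubling).
    """
    delta_ct = ct_low_abundance - ct_high_abundance
    return delta_ct, efficiency ** delta_ct


# Mean Ct values reported in the text: UGT2B7_v2 = 33.30, UGT2B7_v1 = 22.19
DELTA_CT, FOLD = fold_difference_from_ct(33.30, 22.19)

if __name__ == "__main__":
    print(f"delta Ct = {DELTA_CT:.2f}, approximate fold difference = {FOLD:.0f}")
```

Under these assumptions, a difference of about 11 cycles corresponds to UGT2B7_v2 being on the order of 2,000-fold less abundant than UGT2B7_v1.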
UGT2B7 is an important enzyme for phase II metabolism. This study identified probable functional variants and haplotypes in the UGT2B7 gene using the phenotype-to-genotype approach of resequencing the tails of the distribution of M3G formation in human livers. In addition to discovering several novel UGT2B7 variants that have not been deposited in the dbSNP database, our study identified haplotype 4 as a common and functional haplotype increasing enzyme activity by regulating mRNA expression of UGT2B7. Haplotype 4 comprises 10 tSNPs, and the investigation of which SNP is most likely to drive the effect of haplotype 4 indicated that IVS1+985A>G (intron 1) is the only tSNP consistently associated with an increase in glucuronide formation and mRNA expression. This is consistent with the fact that IVS1+985A>G is the only variant found exclusively in haplotype 4 and absent from all the other common haplotypes (i.e., haplotypes 1-6). Hence, the effect of haplotype 4 is likely to reside in one of the 3 SNPs of the bin tagged by IVS1+985A>G (Fig. 3).
In agreement with the observed effect of increased mRNA expression, ESEfinder scan of these three SNPs tagged by IVS1+985A>G is suggestive of their role on constitutive UGT2B7 splicing. For IVS1+985A>G (intronic) and 735A>G (exonic), the variant allele creates splicing enhancers for SR proteins (SRp40 and SRp55, respectively) that are regulators of both alternative and constitutive gene splicing . Concerning the third variant in the bin (i.e., 1062C>T, exon 4), the T allele determines a stronger score for the SRp40 binding site (3.61) compared to that of the C allele (2.94, cut-off 2.68). These observations require biological validation in experimental models. However, the results of ESEfinder are consistent with the hypothesis that the effect of haplotype 4 is probably mediated by a more efficient splicing of the reference sequence mRNA.
While IVS1+985A>G has not been identified before, 735A>G and 1062C>T (the other two variants in high LD with IVS1+985A>G) were previously reported in a population of Norwegian cancer patients. In agreement with our study, these two variants were in complete LD, with an allele frequency of 0.11. In an earlier report by Holthe et al., no association was observed between any of the investigated gene variants (including 735A>G and 1062C>T, which are in high LD with IVS1+985A>G) and morphine pharmacokinetics and glucuronidation in cancer patients. This lack of correlation highlights the importance of clinical validation of our findings in prospective trials. Such trials should control for all the possible confounders, including inadequate phenotyping, concomitant medications, variables known a priori to affect the outcome measures and patient ethnicity. In the study of Holthe et al., differences among patients in morphine doses and blood sampling time (during a 1-2 h interval post-dosing), and concomitant medications (potentially altering morphine metabolism and excretion) might have hampered the detection of significant associations with UGT2B7 genotypes. Concerning the effect of patient ethnicity in prospective trials, our study focused on Caucasians only, and the functional role of common haplotypes should be tested in patients of African and Asian background so that ethnic differences can be accounted for in ethnically mixed study populations. A detailed haplotype analysis has been recently described in Japanese subjects. Ethnic variability in UGT2B7 allele frequencies has been demonstrated among the three main ethnic groups [9,22,23].
SNP discovery and biochemical characterization of SNP function are necessary for understanding the role of genetic variation in the population. Approximately 30% of the variants found in this study have not yet been reported in the dbSNP database (Table 2). Our study led to the discovery of a significant proportion of novel variants using a relatively small sample size (n=12) for resequencing. Using a small sample size for SNP discovery might hamper a detailed characterization of the LD structure of a gene. However, when we compared the LD pattern of the UGT2B7 variants reported in the HapMap database (Caucasian CEPH samples), we observed a close similarity in the pattern of LD between our data and the HapMap data (Fig. 7). Moreover, the most common haplotypes (i.e., haplotypes 1-6) found in the 54 samples using 10 tSNPs could also be identified by using the resequencing data of 12 phenotypic extremes (Table 3). Taken together, these observations indicate that the inferred haplotype analysis is likely to be accurate for the most common haplotypes, despite the limited sample size of the resequenced samples.
Using an in vitro model for SNP discovery has both advantages and limitations. The main advantage is that putative variants of functional significance are easier to detect in an in vitro study than in a clinical trial. However, the liver system is not devoid of sources of environmental variability, and differences may exist in microsomal yield and storage of fractions. Another source of variation might derive from the induction state of the livers from the previous exposure of the donors to inducers. We tried to account for the induction state of livers by normalizing the phenotypic data (specifically the M3G formation) by the formation of SN-38G (the glucuronide of SN-38, a UGT1A1 probe); this approach did not lead to an improvement of the observed correlations (data not shown). The UGT2B7 gene seems to be refractory to upregulation by classical inducers [24-26]. Despite its limitations, the in vitro system has the advantage of providing evidence for the functional role of gene variants that should be confirmed in further studies.
In addition to discovering new functional genetic variants of UGT2B7, this study demonstrates the existence of alternative splicing variants of the gene. The two newly found variants, UGT2B7_v2 and UGT2B7_v3, comprise two additional exons and splice out exons 1, 4, 5 and 6 (Fig. 1). Both variants were confirmed by PCR amplification, and the 5'- and 3'-ends were validated using RACE PCR. In addition to the liver, their existence has been verified in placenta and kidney but not in brain (data not shown). Splicing variants of UGT2B7 have not been demonstrated before, and they appear to exist in all samples tested, although UGT2B7_v3 seems to be expressed at a much lower level than both UGT2B7_v2 and the reference sequence mRNA. In addition, the expression of UGT2B7_v2 mRNA seems to share, at least in part, common regulatory mechanisms with the reference sequence UGT2B7, as shown by its significant correlation with the expression of the reference sequence mRNA.
The present data on the two newly identified mRNA splicing variants do not elucidate the molecular mechanism behind the alteration of the splicing machinery. Regarding their potential function, our data suggest that exon 1 skipping in UGT2B7_v2 and UGT2B7_v3 might alter substrate binding to the enzyme. Although expression of the truncated proteins has not been examined, translation analysis showed that the two additional exons 1A and 1B are unlikely to be translated because they contain stop codons. Therefore, they may serve as putative 5'-UTRs. Open reading frames in both variants, if any, may start from the ATG codon at position 477 of UGT2B7_v2 or position 369 of UGT2B7_v3. If this is the case, both variants would produce the same putative protein, equivalent to a truncated short form of UGT2B7 spanning 250Met to 336Val. The 3'-end extends into intron 3, but only one putative amino acid (Arg) after 336Val may be translated. This putative protein, lacking both the N- and C-terminal regions, seems to share only 87 amino acids with the endoplasmic reticulum transmembrane domain and the UDPGA binding domain, and may lack the substrate binding domain and the catalytic domain [27-29].
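The residue count quoted above follows from simple arithmetic, and the open reading frame logic can be sketched in a few lines. In the minimal example below, the helper names (`residues_spanned`, `orf_codon_count`) and the toy sequence are hypothetical and serve only as an illustration; the actual UGT2B7_v2 and UGT2B7_v3 sequences are not reproduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def residues_spanned(first, last):
    """Number of residues from position `first` to `last`, both inclusive."""
    return last - first + 1


def orf_codon_count(mrna, atg_pos):
    """Count sense codons from a 1-based ATG position up to the first in-frame stop.

    The stop codon itself is not counted. If no in-frame stop is found,
    the number of complete codons up to the end of the sequence is returned.
    """
    start = atg_pos - 1
    if mrna[start:start + 3] != "ATG":
        raise ValueError("no ATG at the given position")
    count = 0
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return count


# 250Met through 336Val, as stated in the text.
shared = residues_spanned(250, 336)
# Plus the single putative Arg encoded by the intron 3 extension.
putative_length = shared + 1

# Toy sequence (hypothetical): ATG at position 3, then GCT, CGT, then TAA.
toy_codons = orf_codon_count("CCATGGCTCGTTAAGG", 3)
print(shared, putative_length, toy_codons)
```

The first function reproduces the 87 shared residues; adding the single putative Arg would give a product of 88 residues, assuming translation does start at the ATG positions named above.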
An alternative splicing pattern similar to that of UGT2B7 can be found in other UGT2B genes. UGT2B genes and pseudogenes are clustered on chromosome 4q13-q21.1, probably as a result of gene duplication during human evolution. UGT2B28 mRNA variants exist, with type III mRNA altering the putative substrate binding domain of the enzyme while retaining its homodimerization capacity. It has been hypothesized that shorter UGT isoforms might be involved in cofactor transport from the cytosol to the cisternal lumen, or within membranes, that occurs during homo- and heterodimerization [30]. As this interpretation of the possible functions of the two variants is not based upon any experimental evidence, further biochemical studies are needed to evaluate the presence of cryptic splicing sites and the effect of the protein products of UGT2B7_v2 and UGT2B7_v3 on UGT2B7 dimerization.
To conclude, our study supports the hypothesis that a common haplotype of the UGT2B7 gene might explain part of the phenotypic variability in the pharmacokinetics and pharmacodynamics of UGT2B7 substrates. As our statistical analyses have not been adjusted for multiple testing, they serve to generate hypotheses for further molecular, biochemical and clinical studies. Studies are planned in postoperative patients treated with morphine to test the effect of haplotype 4 on morphine clearance and acute pain control. Other studies should evaluate the importance of haplotype 4 for UGT2B7-mediated detoxification of drugs and inactivation of carcinogens.
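For readers who wish to see what an adjustment for multiple testing would involve, the sketch below implements the Bonferroni and Holm step-down corrections. The p-values in `example` are hypothetical and chosen for illustration only; they are not results from this study, and no adjusted analysis was performed here.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        adj = min(1.0, (m - rank) * pvals[i])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, adj)
        adjusted[i] = running_max
    return adjusted


# Hypothetical p-values from four comparisons (illustration only).
example = [0.002, 0.04, 0.03, 0.20]
bonf = bonferroni(example)
holm_adj = holm(example)
print([round(p, 3) for p in bonf])
print([round(p, 3) for p in holm_adj])
```

Holm's procedure controls the same family-wise error rate as Bonferroni but is uniformly less conservative, which matters when the number of diplotype comparisons is small.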
We thank Kathy Hennessy for providing expert technical assistance.
Sponsorship: This work was supported by the Pharmacogenetics of Anticancer Agents Research (PAAR) Group (http://pharmacogenetics.org) (NIH/NIGMS grant U01GM61393). Data will be deposited into PharmGKB (supported by NIH/NIGMS U01GM61374, http://pharmgkb.org/).