Low bilirubin levels are significantly associated with cardiovascular diseases (CVD). In previous genome-wide linkage studies we identified a major locus on chromosome 2q harboring the candidate gene UDP-glucuronosyltransferase (UGT1A1). The activity of this enzyme is significantly influenced by a TA-repeat polymorphism in the promoter of the gene. In a prospective study individuals with genotype (TA)7/(TA)7 had significantly higher bilirubin levels and approximately one third the risk of CVD as carriers of the wild type (TA)6 allele. In the present study we performed a conditional linkage study to investigate whether this polymorphism explains the observed linkage peak and extended our analysis by a genome-wide association study on bilirubin levels in 1345 individuals.
After adjustment for the bilirubin variance explained by this polymorphism, the LOD score on chromosome 2q dropped from 3.8 to 0.4, demonstrating that this polymorphism explains the previous linkage result. In the genome-wide association study, the marker closest to UGT1A1 was among the top-ranking SNPs. The association became even stronger when we considered the TA-repeat polymorphism itself in the analysis (p=2.68×10⁻⁵³). Five other SNPs in other regions reached genome-wide significance without an obvious connection to bilirubin metabolism.
Our studies suggest that UGT1A1 may be the major gene with strong effects on bilirubin levels and the TA-repeat polymorphism might be the key polymorphism within the gene controlling bilirubin levels. Since this polymorphism has a high frequency and a substantial impact on the development of CVD, the gene might be an important drug target.
Many studies have shown a significant association between low serum bilirubin levels and cardiovascular diseases (CVD) (1–8). Lipid oxidation and formation of oxygen radicals are important factors in arterial plaque formation. Bilirubin may serve as a physiological antioxidant providing protection against atherosclerosis and CVD. Segregation analyses in two studies proposed a major gene controlling a substantial proportion of the variation in serum bilirubin levels (5,9). Two recent independent genome-wide linkage scans identified a major locus on the chromosome 2q telomere controlling serum bilirubin concentrations, with maximum multipoint LOD scores of 3.8 (10) and 3.2 (11), respectively. The identified chromosomal region harbors the gene encoding UDP-glucuronosyltransferase (UGT1A1), the major enzyme of bilirubin glucuronidation, which mainly determines bilirubin elimination in humans. The activity of UGT1A1 is significantly influenced by a TA-repeat polymorphism in the promoter region of this gene (12). Individuals homozygous for 7 TA repeats (7/7) were found to have lower promoter activity and consequently higher levels of bilirubin than heterozygous (6/7) or wild-type homozygous (6/6) individuals (12–14). The (TA)7 allele, also known as UGT1A1*28, is responsible for Gilbert syndrome in Caucasians, a benign, non-hemolytic, mild unconjugated hyperbilirubinemia.
We recently conducted an association study of this TA-repeat polymorphism with bilirubin levels and CVD in the Framingham Heart Study among subjects followed for 24 years. The total variance of bilirubin explained by this polymorphism after adjustment for covariates was 18.6%. Individuals with genotype 7/7 compared to genotypes 6/7 and 6/6 had significantly higher serum bilirubin levels and lower CVD and coronary heart disease (CHD) risk with a hazard ratio of 0.36 and 0.30, respectively (8). Since the frequency of the genotype 7/7 ranges from 12–16% in the Caucasian population, this polymorphism may have a large population impact.
In the present study, we performed 1) a linkage scan conditional on the UGT1A1*28 association to evaluate whether the TA-repeat polymorphism in the promoter region of UGT1A1 can fully explain the linkage peak identified in our previous linkage studies (10,11) and 2) a genome-wide association (GWA) study using the Affymetrix 100K SNP set to search for further genes influencing serum bilirubin levels.
Details on the Framingham Heart Study are described in the Online Supplementary Material. The conditional linkage genome scan was carried out in the 330 largest extended Framingham families used in the previous genome-wide linkage scan (10). The same set of family samples was used for genotyping with the Affymetrix 100K SNP GeneChip for a genome-wide association study with a total number of 1345 genotyped individuals (15). The investigation was in line with the principles outlined in the Declaration of Helsinki.
Total serum bilirubin was measured during the first and second examinations of the offspring after a 12-hour fasting period. Details on the genotyping of the 399 microsatellites for the genome-wide linkage scan were provided recently (10). Genotyping of the UGT1A1 promoter TA-repeat polymorphism in the TATA box at position −53 was performed using the ABI 3130xl sequencing system as recently described in detail (8). The Affymetrix 100K SNP GeneChip provided 112,990 autosomal and 2344 X-chromosomal SNPs in 1345 individuals (16). SNPs with a minor allele frequency <10%, a call rate <80%, or a Hardy-Weinberg equilibrium p-value <0.001 were excluded, leaving 70,591 autosomal and 1346 X-chromosomal SNPs available for analysis.
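The three exclusion criteria can be illustrated with a minimal sketch. The function names, the use of a 1-df chi-square test for Hardy-Weinberg proportions, and the genotype counts in the usage note are assumptions made for illustration only; the text does not specify which Hardy-Weinberg test was applied, and a test on unrelated individuals is only an approximation in family data.

```python
import math

def hwe_p_value(n_aa, n_ab, n_bb):
    """Approximate 1-df chi-square p-value for Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(n_aa, n_ab, n_bb, n_samples,
              min_maf=0.10, min_call_rate=0.80, min_hwe_p=0.001):
    """Return True if a SNP survives the three filters described in the text."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / n_samples
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1 - freq_a)
    return (maf >= min_maf
            and call_rate >= min_call_rate
            and hwe_p_value(n_aa, n_ab, n_bb) >= min_hwe_p)
```

With hypothetical counts, `passes_qc(470, 440, 90, 1000)` keeps the SNP, whereas `passes_qc(900, 95, 5, 1000)` drops it because the minor allele frequency falls below 10%.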
To evaluate linkage conditional on the −53 TA repeat polymorphism, we used mixed effects variance component models in the SOLAR package to simultaneously incorporate the genotype-specific means of the measured genotype test while performing a variance component-based linkage test (17). As in our previous analysis on the same set of families, we adjusted bilirubin levels for age, sex, height, weight, total cholesterol, hematocrit, albumin, serum glutamic oxaloacetic transaminase, smoking and alcohol (10) and added the −53 TA repeat genotypes as an additional covariate.
For each SNP, we modeled the log-transformed trait value, adjusted for the same covariates as used in the linkage analysis, using a linear mixed effects model (LME) in SAS under a recessive, an additive, and a dominant model in turn. For each genetic model, a p-value was obtained. The smallest of the three p-values was used to rank all SNPs. This was shown to be more robust than using a single p-value based on the association test under an additive model (18,19). Since LME does not take the specific kinship coefficient of each relative pair into account as SOLAR does, we used SOLAR to analyze the 50 top-ranked SNPs from LME and compared the results of the two methods. Further details and the handling of X-chromosomal SNPs are described in the Online Supplementary Material.
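The minimum p-value ranking can be sketched as follows. This is an illustration only: the model fitting itself (LME in SAS) is not reproduced, the function names and SNP identifiers are hypothetical, and coding the dominant and recessive models relative to the minor allele is an assumption.

```python
def code_genotype(minor_allele_count, model):
    """Code a genotype (0, 1, or 2 minor alleles) under a genetic model."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1, or 2")
    if model == "additive":
        return minor_allele_count            # 0, 1, 2
    if model == "dominant":
        return int(minor_allele_count >= 1)  # carriers vs. non-carriers
    if model == "recessive":
        return int(minor_allele_count == 2)  # homozygotes vs. all others
    raise ValueError(f"unknown model: {model}")

def rank_by_min_p(p_values_by_snp):
    """Rank SNPs by the smallest p-value across the genetic models.

    p_values_by_snp maps a SNP id to a dict {model name: p-value}.
    Returns a list of (snp, best_model, min_p), smallest p-value first.
    """
    ranked = []
    for snp, p_by_model in p_values_by_snp.items():
        best_model = min(p_by_model, key=p_by_model.get)
        ranked.append((snp, best_model, p_by_model[best_model]))
    return sorted(ranked, key=lambda item: item[2])

# Hypothetical p-values for two made-up SNP identifiers
example = {
    "snp_a": {"additive": 1e-3, "dominant": 1e-5, "recessive": 0.2},
    "snp_b": {"additive": 1e-4, "dominant": 0.01, "recessive": 0.5},
}
ranking = rank_by_min_p(example)
```

Because the minimum over three tests is taken, the resulting value is no longer a calibrated p-value on its own; it serves here only as a ranking statistic.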
Table 1 presents the characteristics of the study population stratified by −53 TA-repeat genotypes from offspring examination 1. Genotype frequencies were 47%, 44% and 9% for genotypes 6/6, 6/7, and 7/7, respectively. No significant differences across genotype groups were found in any of the characteristics listed in Table 1 except for bilirubin. Mean serum bilirubin was highest in 7/7 carriers (13.2±5.9 mg/l) and lower in 6/7 (8.2±2.9 mg/l) and 6/6 (6.8±2.2 mg/l) carriers (P<0.01). The characteristics of the study population at the second examination were similar to those at the first examination (data not shown).
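As a worked check of the reported genotype frequencies, the (TA)7 allele frequency and the genotype proportions expected under Hardy-Weinberg equilibrium can be computed directly. The function name is illustrative, and the calculation treats the rounded percentages as exact, so the result is approximate.

```python
def allele_freq_and_hwe(f_66, f_67, f_77):
    """Return the (TA)7 allele frequency and expected HWE genotype frequencies."""
    q = f_77 + f_67 / 2          # each heterozygote carries one (TA)7 allele
    p = 1 - q                    # frequency of the (TA)6 allele
    return q, (p * p, 2 * p * q, q * q)

q, expected = allele_freq_and_hwe(0.47, 0.44, 0.09)
print(f"(TA)7 allele frequency: {q:.2f}")
print("expected 6/6, 6/7, 7/7:", ", ".join(f"{x:.4f}" for x in expected))
```

The allele frequency comes out at about 0.31, and the expected proportions (about 48%, 43% and 10%) are close to the observed 47%, 44% and 9%.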
Conditional on the TA-repeat association, the original linkage peak on the 2q telomere dropped from a LOD score of 3.8 to 0.4 (Figure 1). The second linkage peak on chromosome 4 at 96 cM originally described (10) with a LOD score of 1.3 remained unchanged. In two chromosomal regions the LOD scores increased from originally less than 1 to greater than 1: chromosome 7 at 110 cM (increase from 0.51 to 1.65); chromosome 15 at 0 cM (increase from 0.58 to 1.03).
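To give a rough sense of scale for the drop in the LOD score, a LOD score can be converted into a likelihood-ratio statistic and an approximate pointwise p-value. The sketch below assumes the usual asymptotic result for variance component linkage tests, in which the statistic 2·ln(10)·LOD follows a 50:50 mixture of a point mass at zero and a chi-square distribution with 1 df; this is an approximation and not a calculation reported by the study.

```python
import math

def lod_to_chi2(lod):
    """Convert a LOD score into a likelihood-ratio chi-square statistic."""
    return 2 * math.log(10) * lod

def lod_to_pointwise_p(lod):
    """Approximate pointwise p-value under a 50:50 mixture of chi2(0) and chi2(1)."""
    if lod <= 0:
        return 1.0
    chi2 = lod_to_chi2(lod)
    # erfc(sqrt(x / 2)) is the upper tail of a chi-square with 1 df
    return 0.5 * math.erfc(math.sqrt(chi2 / 2))

for lod in (3.8, 0.4):
    print(lod, round(lod_to_chi2(lod), 2), lod_to_pointwise_p(lod))
```

Under these assumptions, a LOD score of 3.8 corresponds to a pointwise p-value on the order of 10⁻⁵, whereas a LOD score of 0.4 is not nominally significant at the 5% level.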
The results of the top 10 SNPs by LME minimum p-value ranking and then subsequently by SOLAR analyses for the first and second examinations are listed in Table 2. An extended list of the top ranking 50 SNPs is provided in Online Supplementary Tables 1 and 2.
For analyses of bilirubin from both examinations, six SNPs reached genome-wide significance in the LME analysis: two were associated with bilirubin levels measured at the first examination (rs9323037 and rs1903937) and four with levels measured at the second examination (rs1113193, rs2269337, rs618171 and rs434310). The top-ranked SNP from the second examination, rs1113193, lies about 100 kb upstream of the UGT1A1 gene. It is the closest genotyped SNP to the gene and is in significant LD with the UGT1A1*28 (TA-repeat) polymorphism, with D′=0.86 (Figure 2). This SNP was ranked number 2 in the subsequent SOLAR analysis. For the first examination, this SNP was ranked number 8 in LME and number 3 in SOLAR by the minimum p-value method. Except for this SNP, the remaining 49 top-ranked SNPs in each examination did not overlap (Table 2 and Supplementary Tables 1 and 2).
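For readers unfamiliar with the D′ measure of linkage disequilibrium, a minimal sketch of its definition from allele and haplotype frequencies follows. The function name and the frequencies in the usage note are hypothetical; the haplotype frequencies underlying the reported D′=0.86 are not given in the text, and in practice they would have to be estimated from genotype data.

```python
def d_prime(p_a, p_b, p_ab):
    """Absolute normalized LD coefficient |D'| between two biallelic loci.

    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2
    p_ab:     frequency of the haplotype carrying both A and B
    """
    d = p_ab - p_a * p_b                     # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0                           # a monomorphic locus carries no LD information
    return abs(d) / d_max
```

For example, with made-up frequencies `d_prime(0.31, 0.30, 0.27)` gives 0.177/0.207, roughly 0.86, while two independent loci give a value of zero.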
There were no genes located within 100–200kb of four other top ranked SNPs (rs9323037, rs1903937, rs618171 and rs434310). rs2269337 was located within exosome component 2 (EXOSC2) which is involved in various mRNA decay pathways (20). The wider chromosomal area hosts a few genes involved in the development of cancer and leukemia and no obvious genes involved in bilirubin metabolism.
The results of the top 10 SNPs from the X chromosome using LME analysis are listed in Supplementary Table 3. No SNP reached significance after correction for multiple comparisons and none of the top 10 SNPs from the first and second examinations overlapped.
Using SOLAR to analyze the association between the −53 TA-repeat polymorphism and bilirubin levels from the first examination, the p-values were 3.06×10⁻³⁹, 2.68×10⁻⁵³, and 2.81×10⁻³¹ for the dominant, additive, and recessive models, respectively. Figure 2 shows the p-values of 14 SNPs genotyped in the GWA study and of the UGT1A1*28 TA-repeat polymorphism in the chromosomal region of UGT1A1, including 350 kb upstream and 100 kb downstream.
Most knowledge regarding the relationship of the UGT1A1 gene and serum bilirubin has been derived from association studies using polymorphisms within this gene in relatively small samples of unrelated individuals. Although these studies provided important evidence for the genetic effect of this gene, they are limited to this known candidate gene and are susceptible to type I error due to possible hidden population stratification. Our study is the first study using a large familial sample to measure the effect of the −53 TA-repeat (UGT1A1*28) polymorphism and to explore the possibility of other loci controlling serum bilirubin levels through genome-wide linkage, genome-wide linkage/disequilibrium (conditional linkage), and genome-wide association studies. Our findings support an important continued role for linkage analysis and the collection of family data in the age of GWA studies (21,22).
Our study extends the previous observation that individuals with the UGT1A1*28 allele have significantly higher serum bilirubin levels. In the conditional linkage analysis, when the association between UGT1A1*28 and bilirubin levels was modeled simultaneously, the evidence for linkage on the 2q telomere was eliminated. This indicates that the TA-repeat polymorphism accounts for the genetic linkage signal we observed earlier (10,11). The conclusion is supported by the results of the 100K SNP GWA study. Of the SNPs screened in our 100K scan, rs1113193 is the SNP that is physically close to and in significant linkage disequilibrium with UGT1A1*28. It is the only SNP consistently ranked among the top 10 SNPs across different analytical approaches, different ranking methods, and two different bilirubin measurements at the two Framingham examinations. This SNP reaches genome-wide significance and is the top hit in the second examination by LME analysis.
Although several other mutations within this gene have been identified, their allelic frequencies are low in Caucasians (23), and UGT1A1*28 was thought to be the main cause of Gilbert syndrome in Caucasians due to its functional influence on promoter activity (12,24). This is supported by a recent association study of 13 polymorphisms within the UGT1A1 gene and serum bilirubin levels, in which, conditional on the TA repeat, no other variant was significantly associated with bilirubin in either genotype or haplotype analyses (25). Recently, another polymorphism in the phenobarbital-responsive enhancer module of this gene was found to reduce transcriptional activity (26). However, this polymorphism was in complete LD with the TA repeat (27,28). Therefore, association studies cannot distinguish their effects.
The second aim of our study was to explore whether there are other genes with large effects contributing to the variation of bilirubin levels. However, in our previous genome scans (10,11) and the conditional linkage genome scan in this study, we observed no additional linkage peaks with a LOD score above 2 in the rest of the genome. These results are in line with our GWA study, which showed that except for rs1113193, no other SNP was consistently ranked in the top list. This implies that UGT1A1 might be one of the most important genes with a large effect controlling bilirubin levels in a Caucasian population, as proposed by earlier segregation analyses (5,9). However, one has to bear in mind that neither our linkage scan nor our 100K GWA study covered the genome at a high enough resolution, which leaves some room for the identification of further genes in the future. Furthermore, the total variance of bilirubin explained by the UGT1A1 TA-repeat polymorphism was 18.6% (8), which is very high compared to other phenotypes. Recent GWA studies on lipid levels, for example, found that several genes combined explained at most 4.8% of the variance in lipids (29), and at best about 1.4% for a single gene (30). The effect of the UGT1A1 TA-repeat polymorphism on bilirubin levels can therefore be considered exceptionally high, which might be explained by the fact that the TA-repeat polymorphism was already shown to have a strong functional influence (12–14). Therefore, future large GWA studies, including meta-analyses using microarrays with higher coverage and HapMap-imputed genotypes, might be able to find further genes with small- and medium-sized effects on bilirubin levels, which our present study does not have the power to detect.
One may question why this locus was not identified by recent GWA studies on CVD. Table 3 lists all previous studies that investigated the association of the UGT1A1 TA-repeat polymorphism with both bilirubin levels and atherosclerosis outcomes (6–8,31,32). Although these studies found clear associations of this polymorphism with bilirubin levels and most studies found an association of bilirubin levels with atherosclerosis outcomes, only the prospective Framingham Heart Study (8) found an association of the polymorphism with CVD endpoints. The major difference between our previous Framingham Study (8) and these retrospective studies and GWA studies on CVD is that the Framingham Study was a prospective population-based cohort study followed for 24 years with an average age of 36 years at baseline, whereas the other studies were cross-sectional case-control studies with much higher mean ages at recruitment, mostly 60 years or older. First, a survival bias due to older ages at recruitment is possible, because up to half of the incident cases do not survive the first half year after a major CHD event and thus are not available for inclusion in the studies. CVD in those individuals could be due to stronger genetic effects and thus be more severe. Furthermore, survival after the first event is itself influenced by genetic factors (33). Second, genetic heterogeneity at this locus in those GWA studies cannot be excluded. All those GWA studies included a large study population, and some included several independent samples. This increased the likelihood of including people with different genetic backgrounds.
One lesson we have learned from recent GWA studies is that, due to multiple-comparison issues, power may be low and it is not easy to reach genome-wide significance even for associated SNPs with moderate to large effects unless the sample size is large. Therefore, it may be appealing to use different analytical methods, ranking methods and repeated measurements to identify those SNPs that consistently rank in the top positions. This was shown to be more robust than using a single p-value based on the association test under an additive model (18,19). The probability that those SNPs are truly associated might be high. This might be the reason why rs1113193 did not reach genome-wide significance or rank in the top position in the earlier 100K GWA scan on bilirubin measured at the Framingham Offspring second examination, which used a less extended analysis method and different covariates (15). Our study demonstrates that GWA study results can still be robust when a SNP is in linkage disequilibrium with a functional variant whose effect size is not too small, even if the marker density is low, as with the 100K gene chip, and SNPs related to the phenotype are 100 kb or more apart. However, due to type I error, some SNPs that are not truly associated could have small p-values in one analytical method, with one ranking method, or at one examination, but have larger p-values otherwise.
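The multiple-comparison burden can be made concrete with a simple Bonferroni calculation. This is only an illustrative sketch: the genome-wide significance threshold actually used in the study is not stated in this section, and the choice of a family-wise error rate of 0.05 and the function name are assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate."""
    return alpha / n_tests

n_autosomal_snps = 70591   # SNPs remaining after quality control, as reported
n_models = 3               # recessive, additive, dominant

per_snp = bonferroni_threshold(0.05, n_autosomal_snps)
per_snp_and_model = bonferroni_threshold(0.05, n_autosomal_snps * n_models)

print(f"one test per SNP:      {per_snp:.2e}")
print(f"three models per SNP:  {per_snp_and_model:.2e}")
```

Under these assumptions the per-test threshold is on the order of 10⁻⁷. Correcting for all three genetic models as if they were independent is conservative, since the three tests on the same SNP are strongly correlated.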
The other SNPs that reached genome-wide significance but gave less consistent results across the various analysis methods clearly need further investigation in other cohorts. Four of these five SNPs were located far outside of genes. Since long-range effects of regulatory regions cannot be excluded (30), these regions might be of interest for future studies.
Our studies suggest that UGT1A1 may be the only major gene with strong effects on bilirubin levels and the TA-repeat polymorphism might be the key polymorphism within the gene controlling serum bilirubin levels in the Caucasian population. With an allele frequency of 30%–40% of the (TA)7 repeats and a frequency of homozygous individuals of 10%–16% (6–8,12,24), the gene might have an important effect on preventing the development of CHD and CVD. Therefore, future research to examine possible therapeutic prevention targeting of this genetic variant and its influence on bilirubin levels may have a large population impact (34).
We are very grateful to Robert C. Elston and Nancy L. Geller for their helpful comments on the data analysis and critical review of the manuscript.
This work was supported by the National Heart, Lung, and Blood Institute’s Framingham Heart Study (Contract No. N01-HC-25195). This work was also supported by grants to Florian Kronenberg from the “Tiroler Wissenschaftsfonds” (Project 404/505), the “Austrian Nationalbank” (Project 9331) and the “Genomics of Lipid-associated Disorders – GOLD” of the “Austrian Genome Research Programme GEN-AU”, Schoepfstr. 41, A-6020 Innsbruck.
Conflict of interest: None declared