Seven genomic loci, implicated by single nucleotide polymorphisms (SNPs), have recently been associated with progression to advanced fibrosis (fibrosis risk) in patients with chronic hepatitis C virus. Other variants in these loci have not been examined but may be associated with fibrosis risk independently of or due to linkage disequilibrium with the original polymorphisms.
We carried out dense genotyping and association testing of additional SNPs in each of the 7 regions in Caucasian case control samples.
We identified several SNPs in the toll-like receptor 4 (TLR4) and syntaxin binding protein 5-like (STXBP5L) loci that were associated with fibrosis risk independently of the original significant SNPs. Haplotypes consisting of these SNPs in TLR4 and STXBP5L were strongly associated with fibrosis risk (global P = 3.04 × 10⁻⁵ and 4.49 × 10⁻⁶, respectively).
Multiple variants in TLR4 and STXBP5L genes modulate risk of liver fibrosis. These findings are of relevance for understanding the pathogenesis of HCV-induced liver disease in Caucasians and may be extended to other ethnicities as well.
The development of hepatic fibrosis, which leads to cirrhosis in patients with chronic hepatitis C virus (HCV) infection, results from the inflammatory response. This process is associated with marked inter-patient variation, which is difficult to predict [1,2]. The wide spectrum in the rate of fibrosis progression is thought to be modulated by a combination of host genetic factors and other host variables including age, gender, and alcohol intake [3,4]. Recently, a seven gene variant prognostic signature for cirrhosis (CRS7) in patients with chronic hepatitis C has been developed for Caucasian patients and validated in an independent patient cohort with area under the curve (AUC) = 0.73 (95% CI: 0.56–0.89; P-value <0.001).
There is a strong a priori rationale, together with functional follow-up work, supporting the genes of the prognostic signature, which were initially identified from a large-scale case control genetic association study. Among these genes, the toll-like receptor 4 (TLR4), a lipopolysaccharide receptor, plays a critical role in pathogen recognition and activation of innate and adaptive immunity. Interaction of HCV and TLR4 signaling is robust although complex: HCV infection can directly induce TLR4 expression and interfere with TLR4 signaling in immune cells, and TLR4 signaling itself may regulate HCV replication. In hepatic stellate cells, activation of TLR4 results in the down-regulation of the transforming growth factor (TGF)-β pseudoreceptor Bambi, thereby sensitizing cells to TGF-β-induced signaling leading to hepatic inflammation and fibrosis. Furthermore, a mechanistic study initiated because of these results has already demonstrated that the two fibrosis-associated TLR4 missense variants, T399I (rs4986791) and D299G (rs4986790), have a significant impact on the activity of TLR4 in inflammatory and fibrogenic signaling; more specifically, disease-protective variants lower the apoptotic threshold of hepatic stellate cells. Of the other genes, the antizyme inhibitor AZIN1, considered a tumor suppressor, plays a role in cell proliferation and death. Syntaxin binding protein STXBP5L is likely to be involved in vesicle trafficking and exocytosis and may therefore be involved in the replication of HCV in the liver and indirectly in liver fibrosis via promoting an environment conducive to HCV replication.
In this study, we have carried out additional, dense association testing of the above gene regions, for the following reasons. First, we aim to identify likely causal genes or regulatory elements which may be in linkage disequilibrium (LD: the non-random association of alleles at two or more loci) with the original markers. Second, genetic studies have demonstrated high probability for the existence of other, independent risk variants that impair expression and/or function of genes associated with disease risk (see, for example, ). Third, if there are independent risk variants at the same locus , then this allelic heterogeneity will likely make an important contribution to the phenotype, i.e. disease risk. In particular, the original TLR4 and STXBP5L variants associated with liver fibrosis are rare or absent in Asians and/or Africans [16,17]; and hence identification of additional risk variants at these loci may explain a portion of the disease risk in those populations. It is conceivable that variants other than those reported may modulate disease risk in these as well as Caucasian populations.
The individual SNPs of the CRS7 signature were initially identified from a gene-centric, genome-wide association study of ~25,000 SNPs. Additional SNPs were tested in this follow-up study to provide better coverage of each region implicated by the signature SNPs so that other potentially causal or independently significant markers could be identified. The extent of fine-mapping regions was determined by examining the LD pattern in the HapMap CEPH (Centre d’Etude du Polymorphisme Humain) dataset (www.hapmap.org); we primarily targeted markers that are present in the same LD region (“main region”) as the individual CRS7 markers, although some markers in the adjacent regions were also tested. Markers tested included tagging SNPs (representative SNPs in a region of the genome with high LD), putative functional SNPs, and others such as those in high LD with the individual CRS7 markers. Markers capable of tagging SNP diversity in the main block were selected with the Tagger program (http://www.broad.mit.edu/mpg/tagger/server.html) under the following criteria: minor allele frequency ≥0.05 and r² > 0.8; our sample set had 80% power to detect a variant of 0.05 frequency that has an effect size of 2.2 at the allelic level. The putative functional markers, such as non-synonymous SNPs and those in putative transcription factor binding sites, were selected based on both public and Celera annotation. Additional information for the selected SNPs can be found in Table 1.
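The idea behind tag SNP selection can be illustrated with a minimal greedy sketch in Python. This is not the Tagger implementation; the function name, the input format and the toy r² values are hypothetical, and the sketch assumes that pairwise r² values have already been computed and that pairs missing from the table have r² = 0.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Greedily pick tag SNPs so that every SNP has r2 >= threshold with a tag.

    snps: list of SNP identifiers.
    r2:   dict mapping (snp_a, snp_b) tuples to pairwise r2 (hypothetical format).
    """
    def ld(a, b):
        if a == b:
            return 1.0  # a SNP always tags itself
        # Look the pair up in either order; missing pairs are treated as r2 = 0
        return r2.get((a, b), r2.get((b, a), 0.0))

    untagged = set(snps)
    tags = []
    while untagged:
        # Pick the SNP that captures the most still-untagged SNPs;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(untagged),
                   key=lambda s: sum(ld(s, t) >= threshold for t in untagged))
        tags.append(best)
        untagged -= {t for t in untagged if ld(best, t) >= threshold}
    return tags


# Toy example with made-up r2 values (not from the study)
toy_snps = ["snpA", "snpB", "snpC", "snpD"]
toy_r2 = {
    ("snpA", "snpB"): 0.95,
    ("snpB", "snpC"): 0.85,
    ("snpA", "snpC"): 0.60,
    ("snpC", "snpD"): 0.10,
}
toy_tags = greedy_tag_snps(toy_snps, toy_r2)
print(toy_tags)
```

In this toy example snpB captures snpA and snpC, while snpD shares little LD with the others and must be genotyped itself.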
The 420 Caucasian samples used in this study were collected from the University of California at San Francisco (UCSF) (N = 187) and the Virginia Commonwealth University (VCU) (N = 233). They consisted of 263 cases and 157 controls, where patients with fibrosis stages 3 or 4 were defined as cases and those with fibrosis stage 0 were used as controls; samples with fibrosis stages 1 or 2 were excluded from the study to more effectively delineate genetic factors involved in progression. Fibrosis stages were determined by biopsies read by liver pathologists; the Batts–Ludwig scoring system was used at UCSF and the Knodell system at VCU. Cases were sampled at the age of 27–71 years (mean ± SD = 49.4 ± 7.4), consisted of 75.3% males, and had daily alcohol intake of 46.2 ± 67.2 g. Controls were sampled at the age of 19–80 years (mean ± SD = 47.3 ± 9.1), consisted of 61.1% males, and had daily alcohol intake of 50.5 ± 76.2 g. Sample-specific information, including estimated age of infection and duration of infection, is presented in Supplementary Table 1. All patients provided written informed consent, and the study was approved by the institutional review boards of UCSF and VCU.
Cases and controls were individually genotyped by allele-specific, kinetic PCR. For each allele-specific PCR reaction, 0.3 ng of DNA was amplified. Genotypes were automatically called by an in-house software program followed by manual curation without knowledge of case/control status. Our genotyping accuracy was approximately 99%.
Allelic association of the SNPs with fibrosis risk was determined by the χ² test. Logistic regression was carried out to correct the genetic association for age, gender, alcohol intake and sample source. Logistic regression models for each possible pair of SNPs assumed an additive effect of each additional risk allele on the log odds of fibrosis risk. Linkage disequilibrium (r²) was calculated from the unphased genotype data using LDMax in the GOLD package. Haplotypes were estimated and tested for association with disease status using a score test with haplotypes coded in an additive fashion. Global tests of association, which test the null hypothesis that the frequency distribution of haplotypes is equal in cases and controls, as well as haplotype-specific tests of association, were performed.
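As a rough illustration of the allelic test, the following Python sketch builds the 2 × 2 table of allele counts (cases vs. controls) from genotype counts and computes a Pearson χ² statistic with 1 degree of freedom, without continuity correction. The function name and the genotype counts are hypothetical and chosen only to match the case and control totals; whether the original analysis applied a continuity correction is not stated, so the uncorrected form is an assumption.

```python
import math


def allelic_chi2(case_genotypes, control_genotypes):
    """Allelic chi-square test (1 df, no continuity correction).

    Each argument is a tuple (n_AA, n_Aa, n_aa) of genotype counts.
    Returns (chi2, p_value, odds_ratio) for allele A vs. allele a.
    """
    def allele_counts(genotypes):
        n_AA, n_Aa, n_aa = genotypes
        # Each individual carries two alleles
        return 2 * n_AA + n_Aa, 2 * n_aa + n_Aa

    a, b = allele_counts(case_genotypes)     # cases: A count, a count
    c, d = allele_counts(control_genotypes)  # controls: A count, a count
    n = a + b + c + d

    # Pearson chi-square for a 2 x 2 table
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # Undefined (ZeroDivisionError) if b or c is zero; a real analysis
    # would need to handle sparse tables explicitly.
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Made-up genotype counts for 263 cases and 157 controls (not study data)
chi2, p_value, odds_ratio = allelic_chi2((10, 60, 193), (2, 20, 135))
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}, OR = {odds_ratio:.2f}")
```

Correcting for age, gender, alcohol intake and sample source, as described above, requires logistic regression and is not covered by this sketch.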
The individual CRS7 predictor SNPs reside in 7 distinct chromosomal regions (Table 1), where LD extends from ~20 to ~662 kbp according to the HapMap CEPH dataset (www.hapmap.org). To thoroughly examine whether other SNPs in these regions associate with cirrhosis risk more strongly than and/or independently of the original markers, we carried out dense SNP genotyping in the Caucasian samples used to build the CRS signature. Common HapMap SNPs (of ≥5% allele frequency) in these regions could be efficiently tested with a minimum of 14–34 SNPs that tagged other untested markers at r² ≥ 0.8; for the CRS7 predictor 6 region on chromosome 3, marker-marker LD was extensive (~662 kbp), but we only targeted a 163 kbp region that contained the STXBP5L and POLQ genes, the only two within this entire LD region, as we were primarily interested in determining which of these genes was more likely to be involved in fibrosis risk. We genotyped a total of 23–71 SNPs for each region; these included tagging, putative functional and other SNPs such as those in high LD with the original marker (Table 1). Coverage of the tagging SNPs by the HapMap markers we tested ranged from 64% to 92% but was likely to be higher since additional non-HapMap markers were genotyped as well.
We first present the detailed analysis for the TLR4 locus implicated by the original CRS7 predictor rs4986791. Extensive marker-marker LD was discernable across a region of ~76 kbp encompassing rs4986791 (Fig. 1A). No other genes were located in this region. For fine mapping, we tested an additional 61 SNPs and identified 15 that were associated with cirrhosis risk at allelic P < 0.05 (Fig. 1B); the original marker had the strongest effect (Table 2).
Pair-wise SNP regression analysis revealed that the significance of some markers could be adjusted away by other significant markers (data not shown), suggesting that not all were independently associated with disease risk, as expected from marker-marker LD (data not shown). Attempting to derive a most parsimonious set of independently significant markers, we identified three groups of SNPs in the TLR4 region that were associated with fibrosis risk (Table 2). Group 1 contained 9 SNPs including the original TLR4 marker rs4986791 and 8 other fine mapping SNPs that were in relatively high LD with rs4986791. None of the 8 markers remained significant after adjustment for rs4986791, nor did rs4986791 after adjustment for any of the 8 markers. Of the other significant markers in the TLR4 region, 5 survived adjustment for rs4986791 (regression P < 0.05), four of which were in relatively moderate to high LD (Group 2). An intergenic SNP, rs960312, had the strongest effect and was not independent from the other 3 SNPs in Group 2, as their significance could be adjusted away by each other. The third group contained only one marker, rs11536889, which shared little LD with Group 1 or 2 markers. This marker trended to significance after adjustment for Group 2 marker rs960312 (P = 0.086), while rs960312 remained significant after adjustment for the Group 3 marker. Because there was almost no LD between markers in these two groups and they were present in distinct haplotypes (see below), we considered Groups 2 and 3 markers to be independently associated with disease risk.
In multivariate analysis that controls for sample source and other known risk factors including age, gender and alcohol intake, both rs4986791 in Group 1 and rs960312 in Group 2 remained significant at P < 0.05 (adjusted P = 0.0033 and 0.033, respectively) while rs11536889 in Group 3 trended to significance (adjusted P = 0.081). Haplotype analyses with these 3 SNPs resulted in the identification of three significant common haplotypes, each distinguishable by one of the three independent markers (Table 3). The global test of haplotype association was highly significant (global P = 3.04 × 10⁻⁵).
Of the 58 SNPs tested in the STXBP5L locus, 27 were associated with fibrosis risk at allelic P < 0.05 (Supplementary Fig. 1). In pair-wise marker regression analysis, two SNPs, rs17740066 and rs2169302, remained significant after adjustment for any of the other markers (Table 4 and data not shown), indicating that no other marker could account for association of these two markers. Conversely, when additional markers were adjusted for rs17740066, only three remained significant. One of them was rs2169302, and the other two were rs13086038 and rs35827958; the latter two were nearly perfectly concordant (r² = 0.97). Similarly, a few other markers remained significant when adjusted for rs2169302 but they could be accounted for by rs17740066. Thus the most parsimonious set of independently significant markers at this locus included rs17740066, rs2169302 and rs13086038/rs35827958, all of which remained significant at the level of 0.05 when adjusted for sample source, age, gender and alcohol intake (adjusted P = 0.00049, 0.0016 and 0.011/0.011, respectively). LD between these independent markers was low (Table 4; r² < 0.02 between any pairs). These markers were present in distinct haplotypes (Table 5), and overall haplotype-disease association was strong (global P = 4.49 × 10⁻⁶).
The remaining five chromosomal regions were similarly analyzed as above. Although a number of fine mapping markers at each locus were associated with fibrosis risk (Supplementary Fig. 1), none was independent of the original CRS7 predictors (Supplementary Table 2). In the case of SNP predictor 5, association of the original marker rs4290029 could not be accounted for by any other fine mapping marker and all fine mapping markers could be accounted for by rs4290029 (data not shown), suggesting that rs4290029 was likely to be the most informative marker (Supplementary Table 2); SNP5, rs4290029, was located in the intergenic region between DEGS1 encoding a lipid desaturase and NVL encoding a nuclear VCP-like protein. For other loci, relationship between the initial CRS7 markers and other similarly significant and high LD markers we tested could not be teased apart with our regression analysis (Supplementary Table 2). However, these associated markers indicated that AZIN1 (antizyme inhibitor 1), TRPM5 (transient receptor potential cation channel, subfamily M, member 5), AP3S2 (adaptor-related protein complex 3, sigma 2 subunit), and AQP2 (aquaporin 2) were likely candidate genes modulating liver fibrosis (Supplementary Fig. 1).
In addition to the above analyses, we also tested markers in the regions adjacent to the main LD region that contained the individual CRS7 markers but did not find any other markers that were associated with fibrosis risk independently of the original CRS7 markers (data not shown). This finding was consistent with the fact that markers outside the main LD region shared little to low LD with the original CRS7 markers.
Our study shows that multiple SNPs in each of the seven chromosomal loci we investigated are significantly associated with an increased risk for developing advanced liver fibrosis and cirrhosis and that this increased risk is specific to certain allele profiles and loci within genes. For the TLR4 and STXBP5L loci, each has three independently significant sets of SNPs, which together give rise to highly significant haplotypes modulating the risk of developing progressive liver fibrosis. For the other 5 loci, each appears to have only one set of independent markers, one of which, rs4290029, an intergenic variant on chromosome 1, may be most informative. Other original and fine mapping markers cannot be distinguished because their association with fibrosis risk can be accounted for by each other. These observations do not exclude the possibility that other less common, independently associated variants may also exist.
The most notable finding of this study is perhaps the TLR4 locus, which is of considerable interest for its role in immunity as well as genetics since it appears to have undergone selective pressure exerted by pathogens throughout primate evolution [23–25]. To our knowledge, the fine mapping performed for this study represents the most comprehensive analysis of this region, with evaluation of at least 88% of the known SNP diversity covering the entire TLR4 gene (~13 kbp) and its flanking regulatory sequences. In contrast, previous studies have often exclusively examined the two co-segregating missense variants T399I (rs4986791) and D299G (rs4986790) in Group 1 (Table 2). Both variants are known to attenuate receptor signaling, NFκB activation and pro-inflammatory cytokine production and to impact cell growth and survival [11,26], but they are absent in Asian populations. However, both Group 2 SNP rs960312 and Group 3 SNP rs11536889 are common in Asians (allelic frequency of ~25% in the HapMap) and may thus play a role in modulating disease risk in the Asian population. Previously, a putative 3′UTR SNP 11381G/C in TLR4 has been reported to marginally affect prostate cancer risk (P = 0.02). These non-coding variants may affect gene expression, which may account for a large fraction of disease risk caused by genetic factors.
The TLR4 missense variants appear to be associated with a risk of numerous other diseases and indications as well [29,30], including endotoxin hypo-responsiveness, bacterial and fungal infections [31,32], septic shock, malaria, inflammatory bowel disease, atherosclerosis, and gastric cancer. Thus, to better assess the overall contribution of TLR4 polymorphisms to the etiology of these other diseases, the independent variants in Groups 2 and 3 as well as their haplotypes should be further evaluated. In addition, the large number of antagonists and agonists of TLR4 currently being evaluated in drug development strongly suggests that TLR4 genotypes should be utilized as probable biomarkers in future clinical trials [38,39].
The statistical evidence we presented here is strong, although not at the level reported in some other genome-wide association studies. This is, however, not unexpected, as the number of samples having this type of fibrosis data is limited in contrast to other genome-wide association studies involving tens of thousands of samples. Furthermore, multiple, independent polymorphisms associated in the same gene region support an allelic heterogeneity model of liver fibrosis. While some fine mapping markers (e.g. those in AZIN1) and both the TLR4 and STXBP5L haplotypes will remain significant if corrected by the number of tests done in each region, replication in other sample sets of similar characteristics would be fruitful, as for other reported genetic variants [40–47]. The contribution of these additional SNPs determined by regional fine mapping to the CRS merits further investigation, which may be particularly relevant to studies in the Asian and African populations; the previously reported gene variants in TLR4 and STXBP5L are rare in Asians and/or Africans, whereas the gene variants reported here are found at higher frequencies in these populations.
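The per-region correction mentioned above can be checked with simple arithmetic. The sketch below applies a Bonferroni correction to the two haplotype global P values reported in this study. The correction method and the number of tests are assumptions made for illustration: the text does not name the procedure, and the test counts are taken here as the number of SNPs examined per region (62 for TLR4, i.e. 61 additional SNPs plus the original marker, and 58 for STXBP5L).

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# (global haplotype P, assumed number of tests in the region)
regions = {
    "TLR4": (3.04e-5, 62),     # 61 fine mapping SNPs + original marker
    "STXBP5L": (4.49e-6, 58),  # 58 SNPs tested at the locus
}

adjusted = {name: bonferroni(p, m) for name, (p, m) in regions.items()}
for name, p_adj in adjusted.items():
    print(f"{name}: adjusted P = {p_adj:.2e}, below 0.05: {p_adj < 0.05}")
```

Under these assumptions both adjusted values stay well below 0.05, in line with the statement that the haplotype associations survive correction for the number of tests done in each region.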
In conclusion, we have expanded the list of SNPs and localized putative candidate genes/variants that are independently associated with the risk of liver fibrosis progression and the development of cirrhosis. Further examination of these gene variants in mechanistic studies is warranted.
We are grateful to all patients for their participation in this study. We thank clinical staff at the participating university hospitals and Hongjin Huang and colleagues at Celera for excellent technical assistance, Thomas J. White and Andrew Grupe for stimulating discussions, and Steve Schrodi for helpful comments on the manuscript.
Y.L., M.C., V.G., C.R., J.C., D.R., S.B. and J.S. are employees of Celera Corporation and declared their financial interest in the company; O.A. was an employee of Celera Corporation at the time the study was carried out; T.W. declared that she received funding from the drug companies involved in order to carry out her research in this manuscript; M.S., R.C. and S.L.F. declared that they do not have anything to disclose regarding funding or conflict of interest with respect to this manuscript.
Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.jhep.2009.04.027.