Fibrinogen is both central to blood coagulation and an acute phase reactant. We aimed to identify common variants influencing circulating fibrinogen levels.
We conducted a genome-wide association analysis on six population-based studies, the Rotterdam Study, the Framingham Heart Study, the Cardiovascular Health Study, the Atherosclerosis Risk in Communities Study, the MONICA/KORA Augsburg Study, and the British 1958 Birth Cohort Study, including 22,096 participants of European ancestry. Four loci were marked by one or more single nucleotide polymorphisms (SNPs) that demonstrated genome-wide significance (p<5.0×10−8). These included a SNP located in the fibrinogen β chain (FGB) gene and three SNPs representing newly identified loci. The high-signal SNPs were rs1800789 in exon 7 of FGB (p=1.8×10−30), rs2522056 downstream from the interferon regulatory factor 1 (IRF1) gene (p= 1.3×10−15), rs511154 within intron 1 of the propionyl coenzyme A carboxylase (PCCB) gene (p= 5.9×10−10), and rs1539019 on the NLR family, pyrin domain containing 3 isoforms (NLRP3) gene (p = 1.04×10−8).
Our findings highlight biological pathways that may be important in regulation of inflammation underlying cardiovascular disease.
Elevated levels of fibrinogen within or above the normal range are consistently associated with an increased risk of cardiovascular disease.1 Fibrinogen has a key role in blood coagulation but is also known as a marker of inflammation. Studies in persons of European ancestry have estimated the heritability of multivariable-adjusted fibrinogen levels from 24% in multiplex families2 to more than 50% in twins.3 The three genes encoding the three fibrinogen protein chains explain only a small part of the total estimated genetic variance of circulating levels of fibrinogen.4
The objective of this study was to identify novel genetic loci related to plasma fibrinogen levels. A meta-analysis of genome-wide association (GWA) findings was conducted on six population-based studies. We analyzed GWA data of 2,661,766 SNPs from one or more studies from a total of 22,096 participants of European descent.
The setting for this meta-analysis is primarily the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium.5 CHARGE includes the Rotterdam Study (RS), the Framingham Heart Study (FHS), the Cardiovascular Health Study (CHS), and the Atherosclerosis Risk in Communities (ARIC) Study. In addition, data from the British 1958 Birth Cohort (B58C) and the MONICA/KORA Augsburg Study (KORA) have been included.
The RS is a prospective, population-based cohort study of determinants of several chronic diseases in older adults.6 In brief, the study comprised 7,983 inhabitants of Ommoord, a district of Rotterdam in the Netherlands, who were 55 years or over. The baseline examination took place between 1990 and 1993.
Genotyping was conducted using the Illumina 550K array. SNPs were excluded for minor allele frequency ≤1%, Hardy-Weinberg equilibrium (HWE) p<10−5, or SNP call rate ≤90% resulting in data on 530,683 SNPs. Imputation was done with reference to HapMap release 22 CEU using the maximum likelihood method implemented in MACH. The final population for this fibrinogen analysis comprised 2,068 individuals.
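The SNP-level exclusion rules described above for the RS can be sketched as a simple filter. This is an illustrative sketch only, not the pipeline used by the study: the `SnpCounts` type and all function names are hypothetical, and the HWE p-value is computed here with a 1-degree-of-freedom chi-square approximation as an assumption (the source does not state which HWE test was used).

```python
import math
from dataclasses import dataclass


@dataclass
class SnpCounts:
    """Genotype counts for one SNP (hypothetical container for illustration)."""
    n_aa: int       # homozygous for allele A
    n_ab: int       # heterozygous
    n_bb: int       # homozygous for allele B
    n_missing: int  # samples with no genotype call


def minor_allele_frequency(c: SnpCounts) -> float:
    n = c.n_aa + c.n_ab + c.n_bb
    freq_b = (c.n_ab + 2 * c.n_bb) / (2 * n)
    return min(freq_b, 1.0 - freq_b)


def call_rate(c: SnpCounts) -> float:
    called = c.n_aa + c.n_ab + c.n_bb
    return called / (called + c.n_missing)


def hwe_p_value(c: SnpCounts) -> float:
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium (assumed test)."""
    n = c.n_aa + c.n_ab + c.n_bb
    p = (2 * c.n_aa + c.n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no deviation can be tested
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [c.n_aa, c.n_ab, c.n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_rs_qc(c: SnpCounts) -> bool:
    """Keep a SNP unless MAF <= 1%, HWE p < 1e-5, or call rate <= 90%."""
    return (
        minor_allele_frequency(c) > 0.01
        and hwe_p_value(c) >= 1e-5
        and call_rate(c) > 0.90
    )
```

The other cohorts applied the same kind of filter with their own thresholds, so in practice the cut-offs would be parameters rather than constants.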
The FHS started in 1948 with 5,209 randomly ascertained participants from Framingham, Massachusetts, US, who have undergone biennial examinations to investigate cardiovascular disease and its risk factors.7 The Offspring cohort8,9 (comprising 5,124 children of the original cohort and the children's spouses) was recruited in 1971, and the Third Generation (consisting of 4,095 children of the Offspring cohort) in 2002.10 FHS participants in this study are of European ancestry.
Genotyping was carried out as part of the SHARe project using the Affymetrix 500K mapping array (250K Nsp and 250K Sty arrays) and the Affymetrix 50K supplemental gene-focused array on 9,274 individuals. Genotyping resulted in 503,551 SNPs with successful call rate >95% and HWE p>10−6 on 8,481 individuals with call rate >97%. Imputation of ~2.5 million autosomal HapMap SNPs with reference to the release 22 CEU sample was conducted using the algorithm implemented in MACH. The final population for fibrinogen analysis included 7,022 individuals (Original Cohort n=383, Offspring n=2,806, Third Generation n=3,833).
The CHS is a population-based, observational study of risk factors for clinical and subclinical cardiovascular diseases.11 The study recruited participants 65 years of age and older from four US communities in two phases: 5,201 participants in 1989-1990, and 687 (primarily African American participants) in 1992-1993. A GWA study was conducted in a subset of CHS participants (n=3,980), all of whom were without clinical cardiovascular disease at their baseline clinical visit and provided consent to use their DNA for research. The study sample used in the fibrinogen analysis represented the first two of three rounds of genotyping, which was a stratified probability sample. Weights were assigned to each observation to reflect the likelihood of sampling from the 3,980 participants. The analysis was restricted to participants of European descent.
Genotyping was performed using the Illumina 370 CNV BeadChip system. Samples were excluded for sex mismatch, discordance with prior genotyping, or call rate <95%. SNPs were excluded from analysis when monomorphic, when HWE p<10−5, and when call rates were <95%. Imputation was performed using BIMBAM v0.95 with reference to HapMap CEU using release 21A build. The population available for the fibrinogen analysis included 1,993 individuals.
The ARIC study is a longitudinal cohort study of atherosclerosis and its clinical sequelae. It recruited a population-based sample of 15,792 men and women aged 45-64 years from four US communities in 1987-89.12 The analysis was restricted to subjects of European descent.
Genotyping was performed using the Affymetrix Genome-Wide Human SNP Array 6.0. SNPs were excluded for not being autosomal SNPs, not passing laboratory QC, no chromosome location, minor allele frequency ≤1%, SNP call rate <90%, or HWE p<10−6. This resulted in data on 716,442 SNPs. Imputation to HapMap SNPs was performed using MACH. After excluding subjects who disallowed DNA use, subjects with a mismatch between called and phenotypic sex, with a mismatch on >10 of 47 previously analyzed SNPs in ARIC, all but one in sets of first degree relatives, and other individuals who were genetic outliers, the final population for fibrinogen analysis comprised 8,051 individuals.
The presented data were derived from the third population-based Monitoring of Trends and Determinants in Cardiovascular Disease (MONICA)/Cooperative Health Research in the Region of Augsburg (KORA) survey S3.13 This cross-sectional survey covering the city of Augsburg (Germany) and two adjacent counties was conducted in 1994/95 to estimate the prevalence and distribution of cardiovascular risk factors among individuals aged 25 to 74 years as part of the WHO MONICA study. The MONICA/KORA S3 study comprises 4,856 subjects. Among them, 3,006 subjects participated in a follow-up examination of S3 in 2004/05 (MONICA/KORA F3). All participants underwent standardized examinations including blood withdrawals for plasma and DNA. For the KORA genome-wide association study, 1,644 subjects, aged 45 to 69 years were selected from the KORA S3/F3 samples.
Genotyping was performed using the Affymetrix 500K Array Set. Samples were excluded for sex mismatch, discordance with prior genotyping, or call rate <95%. SNPs were excluded from analysis when monomorphic (MAF<0.01), when call rates per SNP were <0.1 and per individual were <0.1. Imputation was done using the maximum likelihood method implemented in MACH 1.0. The final population available for the fibrinogen analysis included 1,523 individuals.
The B58C is a national population sample followed periodically from birth. At age 44-45 years, 9,377 cohort members were examined by a research nurse in the home as described previously.14 For this study we used a total of 1,480 cell-line-derived DNA samples from unrelated subjects of European ancestry, with nationwide geographic coverage, which were used as controls by the Wellcome Trust Case Control Consortium (WTCCC).15
Genotyping was performed using the Affymetrix 500K Mapping Array Set using the call algorithm CHIAMO as implemented by the WTCCC.15 Genotypes at other loci were imputed by the program IMPUTE version 0.1.2, using 490,032 autosomal SNPs with CHIAMO calls and the linkage disequilibrium patterns in the HapMap CEU panel. Analysis of imputed genotypes used Marchini's SNPTEST version 1.1.3 and supplementary regression modeling used STATA version 10.0. A final sample size of 1,459 individuals was included in the fibrinogen analysis.
In the KORA study, fibrinogen was determined by an immunonephelometric method (Dade Behring Marburg GmbH, Marburg, Germany) on a Behring Nephelometer II analyzer. The FHS used the Clauss method16 in the Offspring and Third Generation subjects, and a modified method of Ratnoff and Menzie17 in the Original Cohort subjects. In the RS, fibrinogen levels were derived from the clotting curve of the prothrombin time assay using Thromborel S as a reagent on an automated coagulation laboratory 300 (ACL 300, Instrumentation Laboratory, Zaventem, Belgium). The other studies used the Clauss method for measuring plasma fibrinogen.16
Each study independently analyzed their genotype-phenotype data. Except for FHS, which has a family structure, all studies conducted analyses of all directly genotyped and imputed SNPs using linear regression on untransformed fibrinogen measures using an additive genetic model adjusted for age, sex, and site of recruitment (if necessary). In FHS, a linear mixed effects model was employed with a fixed additive effect for the SNP genotype, fixed covariate effects, random family specific additive residual polygenic effects to account for within family correlations18, and a random environment effect. In addition, FHS adjusted for population stratification using principal components of the directly measured SNPs which were computed using the Eigenstrat software.
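The additive genetic model used by the unrelated-sample cohorts can be illustrated with a small ordinary least squares fit. This is a minimal sketch under stated assumptions, not the software the studies ran: the function names are hypothetical, the covariates are limited to age and sex (site of recruitment is omitted), and the genotype is coded as the number of minor alleles (0 to 2, or a fractional imputed dosage). It returns coefficients only; standard errors, which the meta-analysis needs, are left out for brevity, and the FHS mixed model is not covered.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_additive_model(dosage, age, sex, fibrinogen):
    """Fit fibrinogen = b0 + b_snp*dosage + b_age*age + b_sex*sex by OLS.

    Returns [b0, b_snp, b_age, b_sex]; b_snp is the estimated change in
    fibrinogen (g/L) per copy of the minor allele.
    """
    rows = [[1.0, g, a, s] for g, a, s in zip(dosage, age, sex)]
    k = 4
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, fibrinogen)) for i in range(k)]
    return solve_linear_system(xtx, xty)
```

Solving the normal equations directly is adequate for four predictors; a production analysis would more likely rely on an established regression library.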
To account for residual stratification, p-values were adjusted for genomic inflation. The inflation of the association test statistic, stated as inflation factor lambda (λgc), was small for all studies: 0.995 for RS, 1.016 for FHS, 1.031 for CHS, 1.024 for ARIC, 1.012 for KORA, and 1.008 for B58C. Using the study-specific results, we conducted a fixed effect model meta-analysis based on inverse-variance weighting. MetABEL, a package running under R, was used to perform the meta-analysis. We used Bonferroni correction to deal with the problem of multiple testing. Simulation studies show that the effective number of independent tests in a GWA analysis is nearly one million.19 Based on one million tests, we selected a p-value threshold of 5×10−8 as the level of genome-wide significance.
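The three steps in this paragraph (genomic-control adjustment, inverse-variance fixed-effect pooling, and the Bonferroni threshold) can be sketched as follows. This is an illustrative sketch, not the MetABEL implementation: the function names are hypothetical, and the genomic-control step assumes the common convention of inflating each standard error by the square root of λgc only when λgc exceeds 1, which the source does not spell out.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level controlling family-wise error at alpha."""
    return alpha / n_tests


def genomic_control_se(se: float, lambda_gc: float) -> float:
    """Inflate a standard error by sqrt(lambda_gc) (assumed convention).

    Values of lambda_gc below 1 are treated as 1, i.e. no deflation.
    """
    return se * math.sqrt(max(lambda_gc, 1.0))


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    Returns (pooled beta, pooled standard error, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided normal p-value; erfc stays accurate for very small p.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


# 0.05 spread over one million effective tests gives the 5e-8 threshold.
GENOME_WIDE_ALPHA = bonferroni_threshold(0.05, 1_000_000)
```

A pooled result would be declared genome-wide significant when its p-value falls below `GENOME_WIDE_ALPHA`.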
In addition, we estimated the effect of the top SNPs in strata of sex and smoking status. Gene-by-sex and gene-by-smoking interactions were tested in each study by introducing an interaction term into the linear model. We used a sample size weighted meta-analysis to combine the reported interaction p-values across studies for each of the top SNPs.
We used the Women's Genome Health Study (WGHS) to replicate our genome-wide significant findings and other loci for which our meta-analysis generated more modest evidence of an association (p-value of 5×10−7). Participants in WGHS are derived from the genetic arm of the Women's Health Study and include American women with no prior history of cardiovascular disease, cancer, or other major chronic illness who provided a baseline blood sample during the enrollment phase of the Women's Health Study between 1992 and 1995.20 Fibrinogen levels were measured using an immunoturbidimetric assay (Kamiya Biomedical, Seattle, Wash), which was standardized to a calibrator from the World Health Organization. Genotyping was done using the Illumina Infinium II assay to query a genome-wide set of 315,176 haplotype-tagging SNP markers (Human HAP300 panel) as well as a focused panel of 45,882 missense and haplotype tagging SNPs. For this analysis, the evaluation was performed on 17,686 non-diabetic individuals who were of Caucasian ancestry and were not taking lipid-lowering agents. The GWA results of the WGHS are reported in a companion manuscript.
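A sample size weighted combination of p-values is commonly done with a weighted Z-score (Stouffer) method, sketched below. This is an assumption about the general technique rather than a description of the exact procedure used here: the function name is hypothetical, each study is assumed to report a two-sided p-value together with the sign of its interaction estimate, and weights are taken as the square root of the study sample size.

```python
import math
from statistics import NormalDist


def sample_size_weighted_p(p_values, directions, sample_sizes):
    """Combine two-sided p-values with a sample size weighted Z-score.

    p_values     : two-sided p-value from each study, each in (0, 1]
    directions   : +1 or -1, the sign of each study's effect estimate
    sample_sizes : number of participants in each study
    Returns the combined two-sided p-value.
    """
    std_normal = NormalDist()
    numerator = 0.0
    sum_sq_weights = 0.0
    for p, sign, n in zip(p_values, directions, sample_sizes):
        # Convert the two-sided p-value to a signed Z-score.
        z = -std_normal.inv_cdf(p / 2.0) * sign
        w = math.sqrt(n)
        numerator += w * z
        sum_sq_weights += w * w
    z_combined = numerator / math.sqrt(sum_sq_weights)
    return math.erfc(abs(z_combined) / math.sqrt(2.0))
```

Unlike inverse-variance pooling, this approach needs only p-values, effect directions, and sample sizes, which makes it convenient when studies report interaction tests on differently scaled models.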
The sample size and participant characteristics from each study are shown in Table 1. A quantile-quantile plot (Q-Q plot) of the observed against expected p-value distribution is shown in Figure 1. Figure 2 illustrates the primary findings from the meta-analysis and presents p-values for each of the interrogated SNPs across the 22 autosomal chromosomes. A total of 73 SNPs (supplemental Table 1) exceeded the threshold of genome-wide significance and clustered around four loci on chromosomes 1 (2 SNPs), 3 (12 SNPs), 4 (23 SNPs), and 5 (36 SNPs) (Figure 3).
The strongest statistical evidence for an association was for rs1800789, which is located at 4q31.3 in exon 7 of the fibrinogen β (FGB) gene (minor allele frequency [MAF]: 0.20-0.24, meta-analysis p-value = 1.75×10−30, fibrinogen level change per minor allele [Δ]: 0.087 g/L). The other significant loci were marked by rs2522056, which is located at 5q23.3, 25 kb downstream of the interferon regulatory factor 1 (IRF1) gene (MAF: 0.17-0.21, p = 1.3×10−15, Δ: −0.063 g/L), rs511154, which is located at 3q22.3, in intron 1 of the propionyl coenzyme A carboxylase, beta polypeptide (PCCB) gene (MAF: 0.21-0.24, p = 5.94×10−10, Δ: 0.045 g/L) and rs1539019, which is located at 1q44, on the NLR family, pyrin domain containing 3 isoforms (NLRP3) gene (MAF: 0.37-0.42, p = 1.04×10−8, Δ: −0.038 g/L). Cohort-specific findings are presented for the top SNP within each locus in Table 2. Results did not change materially when we adjusted the model for other covariates (smoking, alcohol consumption, body mass index, systolic blood pressure, triglyceride, total- and HDL-cholesterol, diabetes, and cardiovascular disease) (data not shown). Table 3 shows the mean and standard deviations for fibrinogen levels by genotype for each of the four SNPs.
We estimated the association of the four SNPs by sex and smoking status separately but none of the SNPs showed a significantly different association between subgroups (Supplementary Table 2 and 3).
A combined risk-allele score summarizing the number of risk alleles was associated with a 15% increase in overall mean fibrinogen level, comparing subjects with no risk allele (mean fibrinogen level 2.81 g/L) to subjects with six or more risk alleles (mean fibrinogen level 3.24 g/L). The genetic variants identified in our study explained less than 2% of the overall variance in plasma fibrinogen in all studies except one.
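The score and the 15% figure can be reproduced with a short sketch. The construction of the score is an assumption made for illustration: the risk allele at each SNP is taken to be the fibrinogen-raising allele, so for SNPs whose minor allele lowers fibrinogen the major allele is counted instead. Function names are hypothetical; the per-allele effects are the meta-analysis estimates reported above.

```python
# Fibrinogen change (g/L) per copy of the minor allele, from the meta-analysis.
MINOR_ALLELE_EFFECT = {
    "rs1800789": 0.087,
    "rs2522056": -0.063,
    "rs511154": 0.045,
    "rs1539019": -0.038,
}


def risk_allele_score(minor_allele_counts):
    """Count fibrinogen-raising alleles (0 to 8) across the four SNPs.

    minor_allele_counts maps each SNP id to 0, 1, or 2 minor alleles.
    Assumption: when the minor allele lowers fibrinogen, the major
    allele is the risk allele, so 2 - count is added instead.
    """
    score = 0
    for snp, effect in MINOR_ALLELE_EFFECT.items():
        n_minor = minor_allele_counts[snp]
        score += n_minor if effect > 0 else 2 - n_minor
    return score


def percent_increase(reference: float, comparison: float) -> float:
    """Relative increase of comparison over reference, in percent."""
    return 100.0 * (comparison - reference) / reference


# Mean fibrinogen with no risk alleles vs six or more risk alleles (g/L).
increase = percent_increase(2.81, 3.24)  # about 15.3
```

The absolute difference between the two groups is 0.43 g/L, which relative to 2.81 g/L gives the reported increase of roughly 15%.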
To investigate the validity of our findings, we sought replication of the four loci using WGHS data. Since WGHS did not genotype the identical SNPs as our six cohorts, the best proxy SNP was used for replication. For rs1800789, rs2522056, rs511154, and rs1539019, we used WGHS SNPs rs6056 (r2=0.95; p=8.04×10−39), rs1016988 (r2=0.80; p=1.24×10−12), rs684773 (r2=1.0; p=1.92×10−5), and rs1539019 (p=2.89×10−4), respectively, as the proxy SNPs. The direction of each association in WGHS was consistent with our findings.
In addition to our four genome-wide significant loci, two other loci demonstrated multiple-SNP hits with p-values <5×10−7: one on chromosome 2 (rs4251961, p=3.5×10−7) and one on chromosome 14 (rs8017049, p= 5.6×10−7). When we examined the results for these two loci in the WGHS data, we found evidence for replication on chromosome 2 (rs4251961 in WGHS, p=8.5×10−3).
We identified four loci associated with circulating fibrinogen level through a meta-analysis of GWA data from six cohort studies comprising 22,096 subjects. We provide strong confirmation of the previously reported associations with the FGB locus. Three of our findings are newly identified associations.
The most significant SNP in our study was rs1800789, which is located in the FGB gene. The FGB gene encodes the fibrinogen β chain. A well-characterized SNP at this locus is rs1800787 (−148C/T), which resides 965 base pairs away from our top SNP (rs1800789) and is in high LD with it (D′~1.0, r2=0.91). It is known that rs1800787 directly affects gene transcription in basal and IL6-stimulated conditions in luciferase expression studies.21 Another well-characterized SNP in this region, rs1800790 (−455G/A), is also in strong LD with rs1800787, is known to be related to plasma fibrinogen,22 and showed a strong association with fibrinogen levels in our study as well (p=5.04×10−27, Supplementary Table 1).
The second locus is located 25 kb downstream from the IRF1 gene on chromosome 5. IRF1 is a member of the interferon regulatory transcription factor family and activates transcription of interferon α and β. IRF1 also functions as a transcription activator of genes induced by interferon α, β and γ. Direct effects of interferons on fibrinogen have not previously been described, but it is known that they play a role in the regulation of acute phase proteins. Notably, the SNP is only 31 kb from a SNP strongly associated with Crohn's disease in a recent meta-analysis (rs2188962, p<2.32×10−18).23 Individuals with inflammatory bowel disease (IBD), including Crohn's disease, are at a threefold higher risk of venous thrombosis,24 accounting for substantial morbidity and mortality in this group.25 Furthermore, multiple studies have indicated significantly elevated levels of fibrinogen in IBD patients.26 This suggests that IRF1 or nearby genes may contribute to Crohn's disease via a mechanism mediated through an increase in acute phase responsiveness and fibrinogen levels.
The third locus on chromosome 3 is located in intron 1 of the PCCB gene. The PCCB gene is responsible for a particular step in the breakdown of the amino acids isoleucine, methionine, threonine, and valine. However, the available information about PCCB does not provide a strong hypothesis about the putative function of the gene in regulation of fibrinogen levels.
The fourth locus on chromosome 1 is located on the NLRP3 gene. The NLRP3 gene encodes a pyrin-like protein, which interacts with the apoptosis-associated speck-like protein PYCARD/ASC and is a member of the NALP3 inflammasome complex.27 Activated NALP3 inflammasome drives processing of the pro-inflammatory cytokine pro-IL1β to IL1β. Recent data indicate that the NALP3 inflammasome can be activated by endogenous ‘danger signals’ as well as compounds associated with pathogens and triggers an innate immune response.28
The finding on chromosome 2 is located in the promoter region (1 kb upstream from the transcription start site) of the interleukin-1 receptor antagonist (IL1RN) gene. Fibrinogen is an acute phase protein that is regulated by cytokines, mainly IL1 and IL6, while the IL6-mediated transcription of the fibrinogen gene is inhibited by IL1β.29 This region has previously been reported to be associated with fibrinogen levels; rs2232354, which is in high LD with our top SNP, rs4251961, was associated with fibrinogen levels in an asymptomatic population.30
Our findings were replicated in WGHS. Two of our four SNPs are reported by WGHS as genome-wide significant findings (rs6056 and rs1016988) and the other two have p-values which suggest non-chance findings in a replication (rs684773 and rs1539019). These results provide further credibility that our newly identified loci are valid.
We examined evidence for the top four fibrinogen loci among gene expression QTLs from recent GWA studies in human liver tissues31 and lymphoblastoid cell lines.32 In liver tissues, SNPs at the FGB locus were strongly associated with the expression of FGB (e.g., rs4508864, p<1.20×10−8) as well as with other trans-located mRNAs. Likewise, we observed that several SNPs in the region of the IRF1 locus were strongly associated with the expression of nearby genes (including IRF1, LOC441108, and SLC22A5) in both liver tissues and lymphoblastoid cell lines (e.g., rs2070729, p=4.9×10−10 for expression of the IRF1 gene). These results from independent genome-wide association studies strongly suggest a functional basis for the observed associations in the FGB and IRF1 loci.
Although heritability estimates for circulating fibrinogen are substantial, the genetic variants identified in our study explain only a small part of the overall variance. Therefore, our SNPs probably have limited value in prediction of cardiovascular disease. Rare variants, common variants with smaller effects, or variants which interact with other genetic and environmental factors may explain the remaining variation in plasma fibrinogen levels.
Fibrinogen was measured independently in the six cohorts. Though methods for measuring fibrinogen concentration were not standardized, they were all based on the Clauss method or another clotting assay, except for KORA which used nephelometry. Nonetheless, the effect estimates for the top SNPs were comparable between KORA and other studies.
Contributing studies used different genotyping platforms with different groups of SNPs. To enable the meta-analysis, each study imputed ~2.5 million SNPs in HapMap CEU samples. Imputation has previously been shown to be accurate and to increase the power. The power, of course, would have been higher if all SNPs were genotyped in all studies.
In conclusion, we have identified four loci associated with fibrinogen levels through meta-analysis of GWA data from six cohort studies comprising 22,096 subjects. All four loci replicated in a seventh study. In addition, we replicated one of the two other loci which showed a close to significant association in our meta-analysis and is biologically plausible. Three of our findings (IRF1, PCCB, and NLRP3) represent newly identified associations. Among the genes in the novel loci implicated in our study are those that encode proteins playing a role in inflammation representing interesting targets for further research into biological pathways involved in cardiovascular disease and other chronic inflammatory conditions.
Fibrinogen is a major player in the coagulation system, is a determinant of platelet aggregation, and affects blood viscosity. Circulating fibrinogen levels have been consistently associated with risk of coronary heart disease. Although blood fibrinogen levels are influenced by many environmental factors, genes either independently or in combination with environmental factors play an important role in determining circulating fibrinogen levels. The advent of genome-wide association studies provides an opportunity to identify previously unsuspected genetic loci that influence complex traits. In this study, we combined genome-wide association data from six large prospective cohort studies, and identified four genetic loci that are associated with circulating fibrinogen levels. These genetic loci provide valuable insights into the pathways that determine circulating fibrinogen levels. Although additional investigations are needed to understand the exact mechanisms, our findings do highlight the key contribution of inflammatory genes in influencing inter-individual variation in fibrinogen levels. A better understanding of the molecular mechanisms that control circulating fibrinogen levels may spur the development of novel therapeutic strategies that might reduce fibrinogen levels. Such pharmacological agents may be potentially useful for reducing the risk of coronary heart disease.
The Rotterdam Study would like to thank Pascal Arp, Mila Jhamai, Dr Michael Moorhouse, Marijn Verkerk and Sander Bervoets for their help in creating the database and Maxim Struchalin for his contributions to the imputations of the data.
The analyses in Framingham Heart Study are based on the efforts and resource development from the Framingham Heart Study investigators participating in the SNP Health Association Resource (SHARe) project.
The Atherosclerosis Risk in Communities Study wishes to thank the University of Minnesota Supercomputing Institute for use of the blade supercomputers.
A full list of principal CHS investigators and institutions can be found at http://www.chs-nhlbi.org/pi.htm.
The authors acknowledge the essential role of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium in development and support of this manuscript. CHARGE members include the Netherlands' Rotterdam Study (RS), Framingham Heart Study (FHS), Cardiovascular Health Study (CHS), the NHLBI's Atherosclerosis Risk in Communities (ARIC) Study, and the NIA's Iceland Age, Gene/Environment Susceptibility (AGES) Study.
The Rotterdam Study is supported by the Erasmus Medical Center and Erasmus University Rotterdam; the Netherlands Organization for Scientific Research; the Netherlands Organization for Health Research and Development (ZonMw); the Research Institute for Diseases in the Elderly; The Netherlands Heart Foundation; the Ministry of Education, Culture and Science; the Ministry of Health Welfare and Sports; the European Commission; and the Municipality of Rotterdam. Support for genotyping was provided by the Netherlands Organization for Scientific Research (NWO) (175.010.2005.011, 911.03.012) and Research Institute for Diseases in the Elderly (RIDE). This study was further supported by the Netherlands Genomics Initiative (NGI)/Netherlands Organisation for Scientific Research (NWO) project nr. 050-060-810. AD is supported by NWO, RIDE (94800022).
The Framingham Heart Study was supported by the National Heart, Lung and Blood Institute's Framingham Heart Study (Contract No. N01-HC-25195) and its contract with Affymetrix, Inc for genotyping services (Contract No. N02-HL-6-4278).
The Atherosclerosis Risk in Communities Study is supported by National Heart, Lung, and Blood Institute contracts N01-HC-55015, N01-HC-55016, N01-HC-55018, N01-HC-55019, N01-HC-55020, N01-HC-55021, and N01-HC-55022 and grants R01-HL-087641, R01-HL-59367 and R01-HL-086694; National Human Genome Research Institute contract U01-HG-004402; and National Institutes of Health contract HHSN268200625226C. The infrastructure was partly supported by Grant Number UL1-RR-025005, a component of the National Institutes of Health and NIH Roadmap for Medical Research.
The Cardiovascular Health Study is supported by contract numbers N01-HC-85079 through N01-HC-85086, N01-HC-35129, N01 HC-15103, N01 HC-55222, N01-HC-75150, N01-HC-45133, grant numbers U01 HL080295 and R01 HL 087652 from the National Heart, Lung, and Blood Institute, with additional contribution from the National Institute of Neurological Disorders and Stroke.
The MONICA/KORA Augsburg studies were financed by the Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany and supported by grants from the German Federal Ministry of Education and Research (BMBF). Part of this work was financed by the German National Genome Research Network (NGFN) and through additional funds from the University of Ulm.
We acknowledge use of phenotype and genotype data from the British 1958 birth cohort DNA collection, funded by the Medical Research Council grant G0000934 and the Wellcome Trust grant 068545/Z/02. (http://www.b58cgene.sgul.ac.uk/).
Conflict of Interest Disclosure: None