Plasma levels of coagulation factors VII (FVII), VIII (FVIII), and von Willebrand factor (vWF) influence risk of hemorrhage and thrombosis. We conducted genome-wide association studies to identify new loci associated with plasma levels.
Setting includes 5 community-based studies for discovery comprising 23,608 European-ancestry participants: ARIC, CHS, B58C, FHS, and RS. All had genome-wide single nucleotide polymorphism (SNP) scans and at least 1 phenotype measured: FVII activity/antigen, FVIII activity, and vWF antigen. Each study used its genotype data to impute to HapMap SNPs and independently conducted association analyses of hemostasis measures using an additive genetic model. Study findings were combined by meta-analysis. Replication was conducted in 7,604 participants not in the discovery cohort. For FVII, 305 SNPs exceeded the genome-wide significance threshold of 5.0×10−8 and comprised 5 loci on 5 chromosomes: 2p23 (smallest p-value 6.2×10−24), 4q25 (3.6×10−12), 11q12 (2.0×10−10), 13q34 (9.0×10−259), and 20q11.2 (5.7×10−37). Loci were within or near genes, including 4 new candidate genes and F7 (13q34). For vWF, 400 SNPs exceeded the threshold and marked 8 loci on 6 chromosomes: 6q24 (1.2×10−22), 8p21 (1.3×10−16), 9q34 (<5.0×10−324), 12p13 (1.7×10−32), 12q23 (7.3×10−10), 12q24.3 (3.8×10−11), 14q32 (2.3×10−10) and 19p13.2 (1.3×10−9). All loci were within genes, including 6 new candidate genes, as well as ABO (9q34) and VWF (12p13). For FVIII, 5 loci were identified and overlapped vWF findings. Nine of the 10 new findings replicated.
New genetic associations were discovered outside previously known biologic pathways and may point to novel prevention and treatment targets of hemostasis disorders.
A complex cascade of coagulation factors underlies hemostasis and prevents life-threatening blood loss from damaged blood vessels. The hemostatic factors VII and VIII, both produced in the liver, play central roles in the initiation and propagation, respectively, of fibrin formation. In the tissue-factor pathway, blood coagulation factor VII (FVII), once activated, serves as a catalyst for factor X (FX) activation, which converts prothrombin to thrombin. During propagation, activated factor VIII (FVIII) activates FX in the presence of activated factor IX. Von Willebrand factor (vWF), produced by endothelial cells and megakaryocytes, has multiple roles in hemostasis. Its primary role is to serve as an adhesion molecule that anchors platelets to exposed collagen after endothelial cell damage. The factor also acts as a carrier protein of FVIII, thereby prolonging the half-life of FVIII.
Elevated circulating levels of FVIII and vWF are risk factors for venous thrombosis but the data supporting an association of FVII levels with arterial thrombosis are less consistent.1-5 Hemorrhagic complications are associated with deficiency in FVII and vWF (von Willebrand disease), as well as X-linked deficiency in FVIII (Hemophilia A).6-9 Plasma levels of these proteins are affected by environmental factors but they also are genetically influenced.10-13 Heritability estimates range from 0.53-0.63 for FVII, 0.40-0.61 for FVIII, and 0.31-0.75 for vWF.12, 13 To date, our understanding of genetic variation influencing plasma levels has been focused primarily on cis-acting variation in the genes encoding each protein product (F7, F8, and VWF, respectively). A large-scale genome-wide investigation of the genomic correlates of plasma levels has not been previously published. Using data from 23,608 adults, we investigated genome-wide associations between common genetic variation and plasma levels of FVII, FVIII, and vWF.
The discovery meta-analysis was conducted in the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium, which includes data from 5 prospective, population-based cohorts of adults in the US and Europe.14 Phenotype data were available from 4 of the cohorts: the Atherosclerosis Risk in Communities (ARIC) Study, the Cardiovascular Health Study (CHS), the Framingham Heart Study (FHS), and the Rotterdam Study (RS). In addition, participants from the British 1958 Birth Cohort (B58C) who had genome-wide data and were used as controls for the Wellcome Trust Case-Control Consortium were included in these analyses.15 The designs of these studies have been described elsewhere.15-22
Eligible participants for these analyses had high-quality data from the genome-wide scans (see below), had at least 1 of the 3 phenotypes measured, and were not using a coumarin-based anticoagulant at the time of the phenotype measurement. Participants were of European (n=23,608) ancestry by self report. Each study received IRB approval and all participants provided written informed consent for the use of their DNA in research.
Plasma measures of FVII, FVIII, and vWF were obtained at the time of cohort entry for ARIC, CHS, and RS (except vWF, which was measured at visit 3), and at a follow-up visit for B58C (examination 2002 and 2003) and FHS (examination cycle 5, 1991-1995). Only the FVII phenotype in CHS was measured more than once and this included 2 measures, which were averaged for 2653 participants (81% of 3266 participants). The FVII phenotype was measured in 4 cohorts and included both antigen (FHS23) and activity (ARIC,24, 25 CHS,26 RS27). Factor VIII activity was measured in 3 cohorts (ARIC,24, 25 CHS,26 and RS) and vWF antigen was measured in 4 cohorts (ARIC,24, 25 B58C,28 FHS,29 and RS). Supplemental Table S1 provides details about assays in each of the 5 cohorts.
Baseline measures of clinical and demographic characteristics were obtained at the time of cohort entry for ARIC, CHS, and RS, and at the time of phenotype measurement for B58C and FHS. Measures, taken using standardized methods as specified by each study, included in-person measures of height and weight; and self-reported treatment of diabetes and hypertension, current alcohol consumption, and prevalent cardiovascular diseases (history of myocardial infarction, angina, coronary revascularization, stroke, or transient ischemic attack).
Genotyping used DNA collected from phlebotomy (ARIC, CHS, FHS, RS) or cell lines (B58C). Genome-wide assays of SNPs were conducted independently in each cohort using various technologies: Affymetrix 6.0 for ARIC, Affymetrix 500K for B58C, Illumina 370CNV for CHS, Affymetrix 500K and gene-centric 50K for FHS, and Illumina 550K v3 for RS. Genotype quality control and data cleaning, which included assessing Hardy-Weinberg equilibrium and variant call rates, were conducted independently by each study; details have been published elsewhere and are also provided in Supplemental Table S2.14
We investigated genetic variation in the 22 autosomal chromosomes and the X chromosome. Genotypes were coded as 0, 1, and 2 representing the number of copies of the coded alleles for all chromosomes except the X chromosome in males, where genotypes were coded as 0 and 2.30 Each study independently imputed its genotype data to the ~2.6 million SNPs identified in the HapMap Caucasian (CEU) sample from the Centre d’Etude du Polymorphisme Humain.31-33 Imputation software (MACH, BIMBAM, or IMPUTE) used SNPs that passed quality-control criteria (Supplemental Table S2) to impute unmeasured genotypes based on phased haplotypes observed in HapMap. Imputed data for the X chromosome were not available in ARIC or RS. Imputation results were summarized as an “allele dosage” defined as the expected number of copies of the minor allele at that SNP (a continuous value between 0 and 2) for each genotype. Each cohort calculated a ratio of observed to expected variance (OEV) of the dosage statistic for each SNP. This value, which generally ranges from 0 to 1 (poor to excellent), reflects imputation quality.
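The dosage and OEV quantities described above can be illustrated with a minimal Python sketch. The source does not give the exact OEV formula used by each cohort, so the expected variance here is an assumption: the binomial variance 2p(1−p) under Hardy-Weinberg proportions, with p estimated from the mean dosage. The function names are hypothetical.

```python
def allele_dosage(p_het, p_hom):
    """Expected copies of the counted allele: 1*P(het) + 2*P(hom)."""
    return p_het + 2.0 * p_hom


def observed_to_expected_variance(dosages):
    """Ratio of the observed dosage variance to the variance expected
    under Hardy-Weinberg proportions (assumed form, not from the source)."""
    n = len(dosages)
    mean = sum(dosages) / n
    observed = sum((d - mean) ** 2 for d in dosages) / n
    p = mean / 2.0                    # estimated allele frequency
    expected = 2.0 * p * (1.0 - p)    # binomial variance of a genotype
    if expected == 0.0:
        return float("nan")           # monomorphic SNP: ratio undefined
    return observed / expected
```

Under this definition, genotypes called with certainty in Hardy-Weinberg proportions give a ratio near 1, whereas dosages that all collapse toward the population mean (uninformative imputation) give a ratio near 0.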
Investigators from all cohorts developed the pre-specified analytic plan described below. Each study independently analyzed its genotype-phenotype data. All studies used linear regression to conduct association analyses between measured and imputed SNPs and untransformed phenotype measures except for FHS, which used a linear mixed effects model to account for family relationships.34 An additive genetic model with 1 degree of freedom was adjusted for age and sex. In addition, CHS and ARIC adjusted for field site and FHS adjusted for generation and ancestry using principal components.35 For each analysis, a genomic control coefficient, which estimated the extent of underlying population structure based on test-statistic inflation, was used to adjust standard errors.36
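As an illustration of the genomic control step, the following Python sketch estimates an inflation coefficient from per-SNP effect estimates and rescales the standard errors. The source does not specify the estimator; the median-based form shown here, and the convention of leaving standard errors unchanged when the coefficient is below 1, are assumptions. Function names are hypothetical.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(betas, standard_errors):
    """Median observed 1-df chi-square statistic divided by its null median."""
    chi2 = [(b / se) ** 2 for b, se in zip(betas, standard_errors)]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN


def adjust_standard_errors(standard_errors, lam):
    """Inflate standard errors by sqrt(lambda); no deflation when lambda < 1
    (assumed convention)."""
    scale = math.sqrt(max(lam, 1.0))
    return [se * scale for se in standard_errors]
```

A coefficient close to 1, such as the values below 1.04 reported for these cohorts, leaves the standard errors essentially unchanged.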
For each phenotype, within-study findings were combined across studies to produce summary results using standard meta-analytic approaches. For vWF antigen and FVIII activity levels, fixed-effects, inverse-variance weighted meta-analysis was performed, and summary p-values and β-coefficients were calculated. Parameter coefficient represents change (% vWF antigen or % FVIII activity) associated with 1-unit change in allele dosage. This approach was not appropriate for the FVII phenotype which combined analysis of activity and antigen measures. Instead effective-sample-size weighted meta-analysis was performed to estimate summary p-values only. All meta-analyses were conducted using METAL software (http://www.sph.umich.edu/csg/abecasis/METAL/index.html). For loci containing genes already known to be associated with the phenotype, we conducted secondary analyses, adjusting for 1 or more SNPs within the known gene. This allowed us to assess possible novel associations in neighboring genes independently of the strong signal in the known SNPs.
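The two weighting schemes described above can be sketched in Python as follows. This is a simplified illustration, not the METAL implementation: the sample-size scheme is assumed to weight each study's z-score by the square root of its sample size, and p-values are taken from a two-sided normal test. Function names are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effects summary: each study weighted by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, two_sided_p(beta / se)


def sample_size_meta(z_scores, sample_sizes):
    """Summary z-score with weights sqrt(N); yields a p-value only,
    so it can combine studies whose phenotype units differ."""
    weights = [math.sqrt(n) for n in sample_sizes]
    z = sum(w * zi for w, zi in zip(weights, z_scores)) / math.sqrt(sum(sample_sizes))
    return z, two_sided_p(z)
```

The second function returns no pooled β-coefficient, which is why it suits the FVII analysis: activity and antigen measures are on different scales, so only the direction and strength of evidence can be combined.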
The a priori threshold of genome-wide significance was set at a p-value of 5.0×10−8. When more than 1 SNP clustered at a locus, we picked the SNP with the smallest p-value to represent the locus and to be used for replication. For each phenotype, the amount of variation explained by the top SNPs was the difference in the R2 value when comparing a model with adjustment variables only to a model that also included top genome-wide significant SNPs. Each study calculated the amount of variation explained and these estimates were combined across cohorts using weighted averages. Linkage disequilibrium (LD) between SNPs was estimated using a weighted average across cohorts.
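The variance-explained calculation can be sketched as a difference in R2 per cohort followed by a weighted average across cohorts. The source does not state which weights were used; weighting by cohort sample size is an assumption here, and the numbers in the example are invented for illustration only.

```python
def incremental_r2(r2_covariates_only, r2_with_snps):
    """Variance explained by the top SNPs beyond the adjustment variables."""
    return r2_with_snps - r2_covariates_only


def weighted_average(values, weights):
    """Weighted mean of per-cohort estimates."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical per-cohort inputs: (R2 base model, R2 full model, sample size).
cohorts = [(0.05, 0.15, 1000), (0.10, 0.30, 3000)]
explained = [incremental_r2(base, full) for base, full, _ in cohorts]
sizes = [n for _, _, n in cohorts]
combined = weighted_average(explained, sizes)
```

The same weighted-average helper could be applied to per-cohort LD estimates, again assuming sample-size weights.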
Novel genome-wide significant loci were subject to replication in 5 new populations. These populations included 1,375 ARIC participants not in the discovery cohort, 2,484 B58C cohort participants not in the discovery cohort, 1,006 participants from the Twins UK Study (587 from TUK-1 and 419 from TUK-2),37 765 participants from VIS (Vis Croatia Study),38 677 participants from ORCADES (Orkney Complex Disease Study),39 and 1,297 healthy, population-based Swedish subjects recruited to serve as controls in the PROCARDIS program.40 The FVII replications were conducted using participants from ARIC, Twins UK, and PROCARDIS; for vWF, participants from ARIC, B58C, Twins UK, VIS, and ORCADES were included; and for FVIII, ARIC and Twins UK studies were included. Details about hemostasis assays can be found in Supplemental Table S1. Replication β-coefficients and p-values were meta-analyzed using sample-size (FVII) and inverse-variance (FVIII, vWF) weighting. The threshold of significance was 5.0×10−2.
A total of 23,608 participants of European ancestry were eligible for 1 or more analyses. Counts of participants and their characteristics are provided in Table 1. The average age ranged from 44.9 to 72.3 years for the 5 cohorts, and 54% of the participants were women. The median and interquartile range for each phenotype are also listed in Table 1.
Within the 4 cohorts that conducted FVII analyses (ARIC, CHS, FHS, and RS), the genomic control coefficients for analyses were small (<1.04), suggesting negligible test statistic inflation. Figure 1a presents all 2,734,954 meta-analysis p-values organized by chromosome and genomic position. Among these SNPs, 305 exceeded the genome-wide significance threshold and marked 5 regions on 5 chromosomes: 2p23, 4q25, 11q12, 13q34 (includes F7), and 20q11.2. A sixth region on chromosome 15 was marked by 6 SNPs all with very low MAFs (<0.007) so this locus was not investigated further.
Table 2 lists the top SNP for each chromosomal region. Cohort-specific p-values are provided in Table 3. The amount of variance in the FVII phenotype explained by the 5 loci was 7.7%. Genome-wide significant signals at chromosomal position 2p23 (Supplemental Figure S1) were within or close to 1 gene: GCKR (glucokinase [hexokinase 4] regulator). Rs1260326 was associated with the smallest p-value (6.2×10−24; overall minor allele frequency [MAF] = 0.422) in this region, and this SNP codes a non-synonymous substitution (proline to leucine) at amino acid position 446 in GCKR. Genome-wide significant signals at chromosomal position 4q25 (Supplemental Figure S2) were within the region of 2 genes: ADH4 (alcohol dehydrogenase 4 [class II], pi polypeptide) and ADH5 (alcohol dehydrogenase 5 [class III], chi polypeptide). Rs1126670, intronic to ADH4, had the smallest p-value (3.6×10−12; MAF = 0.316). This SNP was in high LD (r2=0.77) with the top SNP in ADH5 (rs896992; p-value = 3.2×10−11). At chromosomal position 11q12 (Supplemental Figure S3), genome-wide significant signals were within 2 genes: MS4A2 and MS4A6A (membrane-spanning 4-domains, subfamily A, member 2 and member 6A, respectively). Rs11230180 had the smallest p-value (2.0×10−10) and was close to the MS4A6A gene and in high LD with top SNPs in MS4A6A (r2=0.97 for rs17602572 [7.3×10−10]) and in MS4A2 (r2=0.84 for rs2847666 [3.8×10−8]).
The 63 genome-wide significant signals at chromosomal position 13q34 (Supplemental Figure S4) were within the factor VII structural gene (F7) and several other genes. Rs488703 had the smallest p-value (9.0×10−259) and is located in an F7 intron. After adjusting for rs6046 (4.4×10−259), a missense SNP which leads to the arginine-glutamine (R353Q) functional substitution, 22 SNPs in the 13q34 region retained their genome-wide significance. These SNPs were within 2 genes: rs3211727 (5.3×10−23) in F10 (coagulation factor X) and rs1755693 (3.1×10−19) in MCF2L (MCF.2 cell line derived transforming sequence-like).
The final chromosomal region containing genome-wide signals for FVII levels was position 20q11.2 (Supplemental Figure S5). Rs867186 in PROCR had the smallest p-value (5.7×10−37), and this variant leads to the serine-glycine (S219G) substitution in exon 4. The region also covered another 7 genes that contained SNPs with p-values that exceeded genome-wide significance: ITCH (itchy E3 ubiquitin protein ligase homolog), PIGU (phosphatidylinositol glycan anchor biosynthesis, class U), ACSS2 (acyl-CoA synthetase short-chain family member 2), MYH7B (myosin, heavy chain 7B, cardiac muscle, beta), TRPC4AP (transient receptor potential cation channel, subfamily C, member 4 associated protein), EDEM2 (ER degradation enhancer, mannosidase alpha-like 2), and MMP24 (matrix metallopeptidase 24). As depicted in Supplemental Figure S5, all top SNPs were in strong LD with rs867186.
Across all studies, significant heterogeneity of effect (p-value < 0.01) was detected for rs1126670 and rs488703. Effect estimates were in the same direction but magnitudes differed between studies (Table 3a). When we restricted the discovery cohort to include only those studies that measured FVII activity (ARIC, CHS, and RS), p-values weakened for rs1260326 (1.7×10−18) and rs867186 (2.8×10−34), were virtually unchanged for rs11230180 and rs488703, and strengthened for rs1126670 (4.4×10−14). A sixth locus emerged with 4 genome-wide significant SNPs at 11q13. Among these, rs1149606 had the smallest p-value (1.9×10−9; MAF = 0.176). This SNP was 4.0 kb from TSKU (tsukushin) and had a p-value of 2.2×10−6 in the full discovery cohort. When we restricted the discovery cohort to the single study that measured FVII antigen, FHS, no SNPs reached genome-wide significance.
The 4 new findings were tested in the ARIC, Twins UK (1 and 2), and Swedish PROCARDIS cohorts and all replicated. Replication p-values were 6.5×10−4 for rs1260326, 3.8×10−2 for rs1126670, 1.4×10−3 for rs11230180, and 1.6×10−7 for rs867186 (see Supplemental Table S3a). When separating the FVII activity (ARIC, TUK-1) and FVII antigen (TUK-2 and PROCARDIS) cohorts, replication results were similar except rs1260326 did not replicate in the activity cohorts (p-value 9.0×10−2).
Within each cohort that conducted genome-wide association analyses for vWF (ARIC, B58C, FHS, and RS) and FVIII (ARIC, CHS, and RS), the genomic control coefficients for analyses were small (<1.03), suggesting negligible test statistic inflation. Figures 1b and 1c present all meta-analysis p-values (2,742,821 for vWF and 2,729,294 for FVIII) organized by chromosome and genomic position.
For vWF, 400 SNPs exceeded the genome-wide significance threshold and marked 8 regions on 6 chromosomes: 6q24, 8p21, 9q34, 12p13 (including VWF), 12q23, 12q24.3, 14q32 and 19p13.2. For FVIII, 191 SNPs exceeded the genome-wide significance threshold and marked 5 regions on 4 chromosomes: 6q24, 8p21, 9q34, 12p13, and 12q23. Table 2 lists the top SNP markers for each of the regions for both vWF and FVIII. Cohort-specific parameters are provided in Table 3. The 8 loci explained 12.8% of the vWF antigen variation and the 5 loci explained 10.0% of the FVIII activity variation. Because all the FVIII regions were a subset of those identified for vWF, we focus on the vWF findings.
Genome-wide significant signals at chromosomal position 6q24 (Supplemental Figure S6) were within or close to 1 gene: STXBP5 (syntaxin binding protein 5). Rs9390459, with the smallest p-value (1.2×10−22; MAF = 0.442) at this locus, encodes a synonymous substitution at amino acid position 779 in STXBP5. At chromosomal position 8p21 (Supplemental Figure S7), genome-wide significant signals were located within 1 gene: SCARA5 (scavenger receptor class A, member 5). Rs2726953 had the smallest p-value (1.3×10−16; MAF = 0.309) and was intronic to SCARA5. Genome-wide significant signals at chromosomal position 9q34 (Supplemental Figure S8) were within 11 genes, 1 of which was ABO. All the SNPs with the very smallest p-values (<5.0×10−324) were within ABO. After adjusting for 4 SNPs (rs514659, rs8176749, rs8176704, and rs512770) that collectively serve as tag-SNPs for the O, Ov/2, A1−1/2, A2, and B haplotypes of ABO,41 none of the 161 remaining SNPs at 9q34 was of genome-wide significance.
Three regions containing high-signal SNPs were identified on chromosome 12: 12p13, 12q23, and 12q24.3. The vWF structural gene resides within 12p13 (Supplemental Figure S9) and rs1063857, which leads to a synonymous amino acid substitution at position 795 of VWF, had the smallest p-value (1.7×10−32; MAF = 0.360). Two genome-wide significant SNPs were located at 12q23 (Supplemental Figure S10) and were within 2 genes: STAB2 (stabilin 2) and NT5DC3 (5′-nucleotidase domain containing 3). Rs4981022 in STAB2 had the smaller p-value (7.3×10−10; MAF = 0.315) and was intronic. The LD between this SNP and rs10778286 (4.4×10−8; MAF = 0.296) in NT5DC3 was not complete (r2=0.49). At chromosomal region 12q24.3 (Supplemental Figure S11), genome-wide significant SNPs were found in 1 gene: STX2 (syntaxin 2). Rs7978987 had the smallest p-value (3.8×10−11; MAF = 0.349) and was intronic to STX2. At chromosomal region 14q32 (Supplemental Figure S12), genome-wide significant SNPs were found within or close to 1 gene: TC2N (tandem C2 domains, nuclear). Rs10133762 had the smallest p-value (2.3×10−10; MAF = 0.448) and was intronic to TC2N. Lastly, genome-wide significant SNPs at chromosomal position 19p13.2 (Supplemental Figure S13) were located within 1 gene: CLEC4M (C-type lectin domain family 4, member M). Rs868875 had the smallest p-value (1.3×10−9; MAF = 0.262) and was intronic to CLEC4M.
The change in vWF associated with each additional copy of the minor allele was large for the ABO locus (increase of 24.1 in vWF antigen %) and was smaller for the other SNPs of genome-wide significance, ranging from 3.1 to 6.0 (Table 2). These change values were generally smaller for the FVIII phenotype compared with vWF. Only the ABO locus (rs687621) demonstrated heterogeneity of effect size across studies (Table 3b).
The 6 new findings were tested in the ARIC, B58C, Twins UK (1 and 2), VIS, and ORCADES cohorts and 5 replicated. Replication p-values were 1.7×10−6 for rs9390459, 3.1×10−5 for rs2726953, 8.7×10−8 for rs4981022, 1.9×10−1 for rs7978987, 3.3×10−5 for rs10133762, and 1.4×10−2 for rs868875 (see Supplemental Table S3b).
In addition to confirming previously known associations of FVII levels with F7 variation, and of vWF and FVIII levels with ABO and VWF variation, this meta-analysis of data from 23,608 subjects of European ancestry identified novel, genome-wide significant associations of 4 loci with circulating FVII levels, 6 loci with vWF levels, and 3 loci with FVIII levels. The FVIII loci were a subset of the vWF loci. Several of these discoveries were associated with p-values many orders of magnitude smaller than the threshold of genome-wide significance and included new candidate genes for levels of FVII (GCKR, ADH4, MS4A6A and PROCR), of vWF (STXBP5, SCARA5, STAB2, STX2, TC2N, and CLEC4M), and of FVIII (STXBP5, SCARA5, STAB2). There was independent evidence for replication for 9 of 10 new findings in new populations.
The strongest genetic associations with FVII levels resided in F7 and confirm previously reported associations of SNPs in this locus.42, 43 After adjustment for rs6046 (R353Q), there remained residual signal in this region with SNPs in MCF2L, F7, and F10 reaching genome-wide significance. Upon further exploration, the residual signal was only detected in 1 of the 4 cohorts (ARIC), where the rs6046 SNP was imputed with only modest precision (OEV = 0.607). Although the residual signal is likely attributable to the incomplete adjustments, we cannot rule out the possibility that other sequence variations in the 13q34 region contribute to FVII levels.44
The association of variants in PROCR with FVII levels has not been reported previously. Recent reports have described activated FVII (FVIIa) binding to the endothelial protein C receptor via the Gla domain, which has a homologous counterpart on the protein C molecule.45, 46 This binding appears to inhibit procoagulant activity of the FVIIa-tissue factor complex and may impact the clearance of FVIIa and receptor signaling through competition with the binding site. Variation in GCKR, whose gene product inhibits glucokinase in liver and pancreatic islet cells, has not been associated previously with FVII levels but has been associated with C-reactive protein and triglyceride levels.47, 48 Factor VII protein concentrations are associated with triglyceride levels in the fasting state, but there is a direct effect of plasma triglycerides on FVII activity.49 Postprandial increase in plasma triglycerides is directly correlated with increase in FVII activity, but the protein concentration remains constant. Genetic variation in alcohol dehydrogenase 3 has been associated with myocardial infarction risk and coronary heart disease risk.50, 51 Plasma levels of FVII are negatively correlated with alcohol consumption, and ADH4, a gene that regulates alcohol metabolism, may have an effect on FVII levels. There are no previous reports of associations of FVII levels with genetic variation in ADH4, ADH5, or other alcohol dehydrogenase genes. Little has been published on the MS4A6A gene.52
When comparing findings by the different measurements of FVII, either antigen or activity level, we found little evidence that the genetic predictors we identified differed by sub-phenotype. Nonetheless, a more extensive comparison of the 2 sub-phenotypes is merited and would require a larger representation of FVII antigen measures in the discovery cohort.
Because the vWF molecule transports FVIII and the 2 factors are highly correlated, we expected overlap in some of the vWF and FVIII findings.11 No unique genetic predictors of FVIII levels on the 22 autosomal chromosomes or the X chromosome, where F8 is located, were identified. This may indicate that in healthy individuals genetic determinants of FVIII levels are primarily dependent upon vWF levels.
The strong associations between SNPs located in ABO and VWF and FVIII levels were expected since it is known that individuals with blood group O have 25-30% lower vWF levels than individuals with non-O blood groups.53 The 2 SNPs in VWF with the smallest p-values for FVIII (rs1063856) and vWF (rs1063857) levels are in strong LD and mark non-synonymous and synonymous substitutions at amino acids 789 and 795, respectively. These variations are located in exon 18, which encodes the D’ domain, and are involved in binding of FVIII. Rare genetic variation in VWF has been associated with low levels of vWF, which characterize von Willebrand disease, either through lowering protein expression or through increased clearance. Rare missense mutations in the D’ part of the vWF gene have been associated with normal or reduced levels of vWF and low levels of FVIII, which is characteristic of von Willebrand disease type 2N.54, 55
We detected a novel and relatively strong association for the STXBP5 gene and also for STX2, the binding substrate for STXBP5, with vWF and FVIII levels. We were not able to replicate the STX2 finding. Syntaxin 4, one of the soluble NSF attachment protein receptor (SNARE) proteins, has been linked to exocytosis of Weibel-Palade bodies (WPB) in endothelial cells by targeting WPB to the plasma membrane.56 Knockdown of syntaxin 4 resulted in inhibition of WPB exocytosis.57 Von Willebrand factor is the main constituent of WPB; upon stimulation (for instance, due to stress) or after endothelial damage, WPB are released by endothelial cells and give rise to increased plasma vWF levels. Syntaxin 2 and vertebrate STXBP5 have not been shown to be involved in vWF secretion; however, our observations may indicate a functional role of these proteins in determining circulating vWF levels.
Scavenger receptor class A member 5 (SCARA5) belongs to the group of scavenger molecules, whose primary function is the initiation of immune responses. SCARA5 has recently been characterized in more detail and is solely expressed in epithelial cells. It has not yet been linked to vWF, and the nature of the genetic associations found in our study is as yet unknown. Stabilin-2 (STAB2) is a transmembrane receptor protein primarily expressed in liver and spleen sinusoidal endothelial cells. The receptor binds and endocytoses various ligands including heparin, low density lipoproteins, bacteria and advanced glycosylation products.58 Taken together, these results generate the hypothesis that several loci may be involved in vWF and FVIII clearance or uptake.
The relevance of the reported associations of genome-wide variation with levels of hemostasis proteins to cardiovascular endpoints will require expanded research efforts. Elevated circulating levels of FVIII and vWF are independent risk factors for venous thrombosis and the genetic variants identified here explained roughly 10% of the variance in these measures.4 Variation in ABO is a known risk factor for venous thrombosis and the risk associated with the newly identified candidate genes presented here can only be hypothesized.59, 60 The association between FVII levels and thrombotic risk is less clear and epidemiologic findings have not been consistent.1-3, 5 Similarly, published evidence supporting the role of F7 variation in atherothrombotic risk has been heterogeneous and an exploration of newly identified candidate genes in this report has not been undertaken.61, 62
This is the first genome-wide association study to attempt to discover novel genetic associations with the 3 hemostasis phenotypes: FVII, FVIII, and vWF. It included 23,608 individuals of European ancestry and ~2.6 million markers spread throughout the genome. Data came from 5 large, population-based cohort studies where cardiovascular outcomes were primary study endpoints. Phenotypes were measured in a standardized fashion within each cohort but measures between cohorts differed and likely introduced between-group variability. This variability may decrease statistical power to find associations of smaller magnitudes.
Supplemental Table S4 lists the minimum percent change in the hemostasis measures for variants with a range of MAFs, assuming power of 80% or greater and an α of 5.0×10−8. For a SNP with an MAF of 0.05, we had sufficient power to detect a 5% change in FVII, a 6% change in FVIII, and a 7% change in vWF. Not all SNPs tested were directly genotyped and the imputation quality varied across SNPs. For poorly imputed SNPs, there was reduced statistical power to detect an association. For each identified locus, we chose the SNP with the smallest p-value, but the causal variant, if one exists, need not be the one with the smallest p-value, and effect sizes associated with the SNP may overestimate any true effect. In situations where genes clustered near a gene containing 1 or more variants that were strongly associated with the phenotype, we used statistical adjustment to determine if neighboring genes had independent associations with the phenotype. This approach has limitations and cannot identify novel causal variants that are in LD with known causal variants.
Using data from 5 community-based cohorts that included 23,608 participants, we identified 4 novel genetic associations with FVII activity and antigen levels and 6 novel genetic associations with vWF antigen levels: 9 of the 10 findings were replicated. New candidate genes of interest were discovered outside known biologic pathways influencing these essential hemostasis factors, and our findings may point to new pathways to target for the prevention and treatment of hemorrhagic and thrombotic disorders.
Elevated circulating levels of coagulation factor VIII (FVIII) and von Willebrand factor (vWF) are risk factors for venous thrombosis but the data supporting an association of coagulation factor VII (FVII) levels with arterial thrombosis are less consistent. Hemorrhagic complications are associated with deficiency in FVII and vWF (von Willebrand disease), as well as X-linked deficiency in FVIII (Hemophilia A). To date, our understanding of genetic variation influencing plasma levels has been focused primarily on variation in the genes encoding each protein product (F7, F8, and VWF, respectively). Using data from 23,608 adults drawn from community populations, we investigated genome-wide associations between common genetic variation and plasma levels of FVII, FVIII, and vWF. For FVII, we identified 5 loci on 5 chromosomes that exceeded genome-wide significance. All loci were within or near genes, including 4 new candidate genes and F7. For vWF, we identified 8 loci on 6 chromosomes. All loci were within genes, including 6 new candidate genes, as well as ABO and VWF. For FVIII, 5 loci were identified and overlapped vWF findings. We replicated 9 of the 10 new findings in independent samples. In summary, new genetic associations were discovered in biologic pathways not previously associated with circulating levels of these factors, including proteins implicated in uptake and intracellular transport of the factors. These findings may point to novel prevention and treatment targets of hemostasis disorders.
The authors acknowledge the essential role of the CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) Consortium in development and support of this manuscript. CHARGE members include National Heart, Lung, and Blood Institute’s (NHLBI) Atherosclerosis Risk in Communities (ARIC) Study, NIA’s Iceland Age, Gene/Environment Susceptibility (AGES) Study, NHLBI’s Cardiovascular Health Study and Framingham Heart Study, and the Netherlands’ Rotterdam Study. The authors also acknowledge the thousands of study participants who volunteered their time to help advance science and the scores of research staff and scientists who have made this research possible. Further, ARIC would like to thank the University of Minnesota Supercomputing Institute for use of the blade supercomputers; Twins UK would like to acknowledge the essential contribution of Peter Grant and Angela Carter from the Leeds Institute of Genetics, Health and Therapeutics, University of Leeds, UK for measurements of clotting factor phenotypes; and VIS would like to acknowledge the invaluable contributions of the recruitment team (including those from the Institute of Anthropological Research in Zagreb) in Vis, the administrative teams in Croatia and Edinburgh and the people of Vis. Genotyping was performed at the Wellcome Trust Clinical Research Facility in Edinburgh.
The Atherosclerosis Risk in Communities Study is supported by National Heart, Lung, and Blood Institute (NHLBI) contracts N01-HC-55015, N01-HC-55016, N01-HC-55018, N01-HC-55019, N01-HC-55020, N01-HC-55021, and N01-HC-55022 and grants R01-HL-087641, R01-HL-59367 and R01-HL-086694; National Human Genome Research Institute contract U01-HG-004402; and NIH contract HHSN268200625226C. The infrastructure was partly supported by grant number UL1-RR-025005, a component of the NIH and NIH Roadmap for Medical Research.
We acknowledge use of phenotype and genotype data from the British 1958 Birth Cohort DNA collection, funded by the Medical Research Council grant G0000934 and the Wellcome Trust grant 068545/Z/02 (http://www.b58cgene.sgul.ac.uk/).
The Cardiovascular Health Study (CHS) is supported by contract numbers N01-HC-85079 through N01-HC-85086, N01-HC-35129, N01 HC-15103, N01 HC-55222, N01-HC-75150, N01-HC-45133, grant numbers U01 HL080295 and R01 HL 087652 from NHLBI, with additional contribution from the National Institute of Neurological Disorders and Stroke. A full list of principal CHS investigators and institutions can be found at http://www.chs-nhlbi.org/pi.htm. Support was also provided by NHLBI grants HL073410 and HL095080 and the Leducq Foundation, Paris, France for the development of Transatlantic Networks of Excellence in Cardiovascular Research. DNA handling and genotyping was supported in part by National Center for Research Resources grant M01-RR00425 to the Cedars-Sinai General Clinical Research Center Genotyping core and National Institute of Diabetes and Digestive and Kidney Diseases grant DK063491 to the Southern California Diabetes Endocrinology Research Center.
This research was conducted in part using data and resources from the Framingham Heart Study of the NHLBI of the NIH and Boston University School of Medicine. The analyses reflect intellectual input and resource development from the Framingham Heart Study investigators participating in the SNP Health Association Resource (SHARe) project. Partial investigator support was provided by National Institute of Diabetes and Digestive and Kidney Diseases K24 DK080140 (JB Meigs). This work was partially supported by NHLBI’s Framingham Heart Study (Contract No. N01-HC-25195) and its contract with Affymetrix, Inc., for genotyping services (Contract No. N02-HL-6-4278). A portion of this research utilized the Linux Cluster for Genetic Analysis (LinGA-II) funded by the Robert Dawson Evans Endowment of the Department of Medicine at Boston University School of Medicine and Boston Medical Center.
The Rotterdam Study is supported by the Erasmus Medical Center and Erasmus University Rotterdam; the Netherlands Organization for Scientific Research; the Netherlands Organization for Health Research and Development (ZonMw); the Research Institute for Diseases in the Elderly; The Netherlands Heart Foundation; the Ministry of Education, Culture and Science; the Ministry of Health, Welfare and Sports; the European Commission; and the Municipality of Rotterdam. Support for genotyping was provided by the Netherlands Organization for Scientific Research (NWO) (175.010.2005.011, 911.03.012) and Research Institute for Diseases in the Elderly (RIDE). This study was further supported by the Netherlands Genomics Initiative (NGI)/Netherlands Organisation for Scientific Research (NWO) project nr. 050-060-810. Dr. Dehghan is supported by NWO, RIDE (94800022).
The Twins UK study was funded by the Wellcome Trust; the European Community’s Sixth and Seventh Framework Programmes: (FP-6/2005-2008) LIFE SCIENCES & HEALTH (Ref 005268, Genetic regulation of the end stage clotting process that leads to thrombotic stroke: The EuroClot Consortium) and (FP7/2007-2013) ENGAGE project HEALTH-F4-2007-201413; and the FP-5 GenomEUtwin Project (QLG2-CT-2002-01254). The study also receives support from the Dept of Health via the National Institute for Health Research (NIHR) comprehensive Biomedical Research Centre award to Guy’s & St Thomas’ NHS Foundation Trust in partnership with King’s College London. TDS is an NIHR Senior Investigator. The project also received support from a Biotechnology and Biological Sciences Research Council (BBSRC) project grant (G20234). The authors acknowledge the funding and support of the National Eye Institute via an NIH/CIDR genotyping project (PI: Terri Young).
The VIS study on the Croatian island of Vis was supported through grants from the Medical Research Council UK and the Ministry of Science, Education and Sport of the Republic of Croatia (number 108-1080315-0302) and the European Union framework program 6 EUROSPAN project (contract no. LSHG-CT-2006-018947).
ORCADES was supported by the Chief Scientist Office of the Scottish Government, the Royal Society and the European Union framework program 6 EUROSPAN project (contract no. LSHG-CT-2006-018947). DNA extractions were performed at the Wellcome Trust Clinical Research Facility in Edinburgh.
The PROCARDIS program was funded by the EC Sixth Framework Programme (LSHM-CT-2007-037273), the Swedish Research Council (8691), the Knut and Alice Wallenberg Foundation, the Swedish Heart-Lung Foundation, the Leducq Foundation, Paris, the Stockholm County Council (560283) and AstraZeneca. Genotyping of Swedish PROCARDIS control samples was performed at the SNP Technology Platform (head: Prof. Ann-Christine Syvänen), Department of Medical Sciences, Uppsala University, Sweden.
No author has a real or perceived conflict of interest to report for this manuscript except James B. Meigs, who reports receiving grant support from Sanofi-Aventis and GlaxoSmithKline and serving as a consultant/advisor to Interleukin Genetics and Eli Lilly.