Genes encoding protein C anticoagulant pathways are candidates for athero-thrombotic and other aging-related disorders.
Using a tagSNP approach, and data from the Cardiovascular Health Study (CHS), we assessed associations of common polymorphisms of PROC, PROS1, and PROCR with (1) plasma protein C, soluble protein C receptor (sEPCR), and protein S levels measured in a sub-sample of 336 participants at study entry; (2) risk of incident clinical outcomes (coronary heart disease or CHD, stroke, and mortality) in 4,547 participants during follow-up. Secondarily, we explored associations between plasma protein C, S, and sEPCR levels and other candidate genes involved in thrombosis, inflammation, and aging.
The PROCR Ser219Gly polymorphism (rs867186) was strongly associated with higher sEPCR levels, explaining 75% of the phenotypic variation. The Ser219Gly variant was also associated with higher levels of circulating protein C antigen. An IL10 polymorphism was associated with higher free protein S levels. The minor alleles of PROC rs2069901 and PROS1 rs4857343 were weakly associated with lower protein C and free protein S levels, respectively. There was no association between PROCR Ser219Gly and risk of CHD, stroke, or mortality. The minor allele of another common PROCR tagSNP, rs2069948, was associated with lymphoid PROCR mRNA expression and with increased risk of incident stroke and all-cause mortality, and decreased healthy survival during follow-up.
A common PROCR variant may be associated with decreased healthy survival in older adults. Additional studies are warranted to establish the role of PROCR variants in ischemic and aging-related disorders.
Protein C is activated on endothelium by the thrombin-thrombomodulin-endothelial protein C receptor (EPCR) complex. In the presence of cofactor protein S, activated protein C (APC) proteolytically inactivates coagulation factors VIIIa and Va, thereby inhibiting clot formation. By binding to EPCR and protease-activated receptor-1, APC also exerts anti-inflammatory and cytoprotective effects on a variety of cell types.
The Ser219Gly variant (rs867186) tags a common PROCR haplotype and is associated with higher soluble EPCR (sEPCR) levels, explaining ~85% of the phenotypic variance [3,4]. The heritabilities of circulating protein C, free protein S, and total protein S range from 30% to 50% [5,6], but the specific genetic factors responsible are not as well characterized [7,8]. Promoter polymorphisms of the protein C gene (PROC) have been reported to account for ~5% of the phenotypic variability [9,10].
Mutations and polymorphisms of PROC, PROCR, and PROS1 (coding for protein S) are associated with risk of familial venous thrombotic disease. Recently, the PROCR Ser219Gly variant was associated with increased thrombin generation and increased risk of coronary heart disease (CHD) in men from the Northwick Park Heart Study. The role of Ser219Gly, or other common polymorphisms of the PROC, PROCR, and PROS1 genes, in risk of arterial thrombotic disease or mortality has not been carefully examined.
Here, using a tagSNP approach, and data from the population-based Cardiovascular Health Study (CHS), we assessed associations between common polymorphisms and haplotypes of the PROC, PROS1, and PROCR genes and (a) plasma protein C, sEPCR and protein S levels measured in a cross-sectional sub-sample of 336 participants at study entry, and (b) risk of incident clinical outcomes (MI, stroke, and mortality) in 4,547 participants during follow-up. Secondarily, we explored associations of other candidate genes involved in thrombosis, inflammation, and aging with plasma protein C, sEPCR, and protein S levels.
The CHS population, inclusion criteria for the current study, and follow-up for clinical events, including years of healthy life, are described in detail under Supplemental Methods. A total of 336 men and women who had baseline protein C, protein S, and sEPCR measurements and were not taking warfarin were included in the baseline cross-sectional analysis. The number of participants eligible for analysis of clinical events during follow-up was 4,547. All study participants provided written informed consent for use of their DNA for genetic testing.
Baseline blood was collected in a fasting state using a special tube designed to prevent in vitro clotting activation (SCAT-1, Haematologic Technologies, Inc., Essex Junction, VT). Blood samples were analyzed at the Central CHS Laboratory at the University of Vermont. Protein C antigen, free protein S (unbound to C4b-binding protein), total protein S (free + C4b-binding protein-bound), and sEPCR were measured by enzyme-linked immunosorbent assays (ELISAs) as previously described [14,15]. The assay coefficients of variation (CVs) were 2.5%, 9.9%, 6.7%, and 3%, respectively.
Single nucleotide polymorphisms (SNPs), pair-wise linkage disequilibrium (LD) patterns, and haplotypes for our candidate genes were identified from the SeattleSNPs candidate gene SNP discovery resource and database (http://pga.mbt.washington.edu/). Polymorphic sites were identified by direct re-sequencing of genomic sequence from 23 European-Americans, encompassing all exons, introns, untranslated regions, and ~2 kb of additional flanking sequence on either the 5′ or 3′ end. For PROC, PROCR, and PROS1, a total of 39, 13, and 44 polymorphic sites, respectively, were identified by SeattleSNPs, of which 28, 8, and 17 had minor allele frequency (MAF) ≥10%. TagSNPs were identified using the pair-wise LD binning procedure implemented in the LDSelect algorithm of Carlson et al., at a ≥10% MAF threshold and an LD threshold of r² ≥ 0.64 to create bins. Using this procedure, we identified 4 tagSNP bins for PROC, 2 for PROCR, and 4 for PROS1 (see Table 1). One PROS1 tagSNP bin could not be genotyped due to assay failure. Therefore, LD coverage in PROC, PROS1, and PROCR for untyped SNPs with MAF of 10% or greater was 100%, 75%, and 100%, respectively, at an r² threshold of ≥0.64. TagSNPs were similarly selected for the remaining 127 candidate genes using SeattleSNPs or HapMap databases at an LD threshold of r² ≥ 0.64 and MAF ≥10%.
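The binning step described above can be illustrated with a minimal sketch. It is a simplified greedy procedure in the spirit of pair-wise LD binning, not the LDSelect implementation itself; the function name `ld_bins`, the SNP labels, and the r² values are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Set


def ld_bins(maf: Dict[str, float],
            r2: Dict[FrozenSet[str], float],
            maf_min: float = 0.10,
            r2_min: float = 0.64) -> List[Set[str]]:
    """Greedy pair-wise LD binning (simplified sketch, not LDSelect itself).

    maf: minor allele frequency per SNP.
    r2:  pair-wise r^2, keyed by frozenset({snp_a, snp_b}); missing pairs count as 0.
    """
    # Only SNPs at or above the MAF threshold are binned.
    remaining = {s for s, f in maf.items() if f >= maf_min}
    bins: List[Set[str]] = []
    while remaining:
        def partners(s: str) -> Set[str]:
            return {t for t in remaining
                    if t != s and r2.get(frozenset((s, t)), 0.0) >= r2_min}
        # Pick the SNP in high LD with the most remaining SNPs
        # (sorted so that ties are broken deterministically).
        best = max(sorted(remaining), key=lambda s: len(partners(s)))
        new_bin = {best} | partners(best)
        bins.append(new_bin)
        remaining -= new_bin
    return bins


# Hypothetical inputs: snpD falls below the MAF threshold and is never binned.
example_maf = {"snpA": 0.30, "snpB": 0.28, "snpC": 0.15, "snpD": 0.05}
example_r2 = {
    frozenset(("snpA", "snpB")): 0.90,
    frozenset(("snpA", "snpC")): 0.20,
    frozenset(("snpB", "snpC")): 0.10,
}
example_bins = ld_bins(example_maf, example_r2)

# Coverage is the fraction of bins with a successfully typed tagSNP,
# e.g. 3 of 4 PROS1 bins typed gives 0.75.
pros1_coverage = 3 / 4
```

In this sketch each bin would be represented by one genotyped tagSNP, so a failed assay for one bin removes that bin from coverage.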
PROC, PROS1, and PROCR genotyping was performed in all consenting CHS participants at the Laboratory for Clinical Biochemistry Research (University of Vermont) with the ABI TaqMan platform using Assays By Design on an ABI 7900 real time thermal cycler under standard conditions (Applied Biosystems, Foster City, CA). Overall genotype missing rate was <0.1%, and blind duplicate concordance rates were >99%. Genotype distributions were in Hardy-Weinberg equilibrium for all PROC, PROCR, and PROS1 tagSNPs, with the exception of PROS1 rs4857343 (uncorrected p=0.002). Additional genotyping was performed using the Illumina GoldenGate platform by the Center for Inherited Disease Research (CIDR; Johns Hopkins University, Baltimore, MD) for 876 tagSNPs covering common linkage disequilibrium patterns across 127 thrombosis, inflammation, and aging-related genes, as previously described in detail.
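A Hardy-Weinberg check of the kind mentioned above can be sketched as a 1-degree-of-freedom chi-square goodness-of-fit test. The text does not state which test the authors used, so this is an assumption; the function name `hwe_chi2` and the genotype counts are hypothetical.

```python
import math
from typing import Tuple


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square test (1 df) of Hardy-Weinberg proportions.

    Returns (chi-square statistic, p-value). Assumes a biallelic SNP.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts showing a deficit of heterozygotes.
chi2_stat, p_val = hwe_chi2(40, 20, 40)
```

For SNPs with small expected counts an exact test is generally preferred over the chi-square approximation.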
Our primary hypotheses were to test whether common polymorphisms of the PROC, PROCR, and PROS1 genes were associated with (a) protein C, sEPCR, and protein S levels; (b) risk of incident clinical CVD and aging-related outcomes. Secondarily, we also explored associations of other candidate genes involved in thrombosis, inflammation, and aging with plasma protein C, sEPCR, and protein S levels. Cross-sectional relationships between baseline protein C, protein S, and sEPCR measurements were assessed using Spearman’s rank correlation coefficient (ρ). Associations between individual SNP genotypes and baseline quantitative phenotypes were assessed using multiple linear regression models. All regression models were minimally adjusted for age and sex. To reduce the influence of environmental variation, protein C and protein S regression models were additionally adjusted for cholesterol and triglycerides (which were strong cross-sectional correlates of protein C and S levels). Covariate-adjusted mean plasma anticoagulant levels (and 95% confidence intervals) were estimated for each genotype group from the regression coefficients (β) and standard errors. Associations between individual tagSNP genotypes and risk of incident CVD events or mortality during follow-up were assessed using Cox regression, adjusted for major clinical risk factors (age, sex, diabetes, hypertension, clinical CVD status, smoking, and serum creatinine). Covariate-adjusted hazard ratios were estimated from the regression coefficient, assuming a constant effect size for each additional copy of the minor allele. In analyses of incident MI and stroke during follow-up, we excluded participants who had experienced an MI (n=447) or stroke (n=165) prior to baseline. Assessment and adjustment of potential within-Europe population structure during our primary candidate gene analyses was performed as described under Supplemental Methods.
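Under the additive (per-allele) coding described above, the hazard ratio for k copies of the minor allele is exp(kβ), where β is the Cox regression coefficient. The following minimal sketch shows that conversion with a Wald confidence interval; the function name `per_allele_hr` and the numeric inputs are hypothetical and are not taken from the study results.

```python
import math
from typing import Tuple


def per_allele_hr(beta: float, se: float,
                  copies: int = 1, z: float = 1.96) -> Tuple[float, float, float]:
    """Hazard ratio and Wald confidence interval for `copies` minor alleles.

    beta, se: Cox log-hazard coefficient per allele copy and its standard error.
    Under the additive model the log hazard scales linearly with allele count.
    """
    log_hr = copies * beta
    log_se = copies * se
    return (math.exp(log_hr),
            math.exp(log_hr - z * log_se),
            math.exp(log_hr + z * log_se))


# Hypothetical coefficient: one copy versus none, then two copies versus none.
hr_one = per_allele_hr(0.12, 0.05)
hr_two = per_allele_hr(0.12, 0.05, copies=2)
```

Because the effect is constant on the log scale, the two-copy hazard ratio is the square of the one-copy hazard ratio.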
We used various procedures to control for multiple hypothesis testing during our primary and secondary analyses. For each of our primary candidate genes, we conducted a “gene-wide” test of significance by performing 100,000 permutations of the data sampled under the null hypothesis. Empirical p-values were determined as the proportion of permutations in which the maximum permuted null test statistic across all SNPs in each gene equaled or exceeded the observed test statistic, thereby accounting for the number of SNPs tested and the variable correlation structure, while controlling the experiment-wise type 1 error rate at 5% for each gene. We also used statistical methods to infer haplotypes from the genotype data, using the probability-weighted haplotypes as the unit of analysis. To further correct our primary analyses of PROC, PROCR, and PROS1 genotypes for testing of multiple, correlated traits, we conducted an “experiment-wise” test of significance using a procedure that computes empirical p-values while retaining the original correlation structure among both genotypes and traits under the null hypothesis by simulation from a multivariate normal distribution. To correct for multiple comparisons in our secondary analysis of 876 tagSNPs from 130 candidate genes with protein C, S, and sEPCR levels, we assessed overall statistical significance using the false discovery rate (FDR) q value, which is an estimate of the proportion of features called significant that are truly null.
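The gene-wide permutation test can be sketched as a max-statistic procedure: phenotypes are shuffled across individuals while genotypes are left intact, so the LD structure among SNPs is preserved under the null. This is a minimal illustration, assuming the absolute genotype-phenotype correlation as a stand-in test statistic (the study used covariate-adjusted regression); the names `abs_corr` and `gene_wide_p` and all data are hypothetical.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence


def abs_corr(x: Sequence[float], y: Sequence[float]) -> float:
    """Absolute Pearson correlation (0.0 if either variable is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def gene_wide_p(genotypes: Dict[str, List[int]],
                phenotype: Sequence[float],
                n_perm: int = 1000,
                seed: int = 0) -> Dict[str, float]:
    """Empirical p-value per SNP, adjusted for all SNPs tested in the gene."""
    rng = random.Random(seed)
    observed = {s: abs_corr(g, phenotype) for s, g in genotypes.items()}
    exceed = {s: 0 for s in genotypes}
    y = list(phenotype)
    for _ in range(n_perm):
        rng.shuffle(y)  # break the genotype-phenotype link, keep LD intact
        max_null = max(abs_corr(g, y) for g in genotypes.values())
        for s in genotypes:
            if max_null >= observed[s]:
                exceed[s] += 1
    # Add-one correction keeps the estimate strictly above zero.
    return {s: (exceed[s] + 1) / (n_perm + 1) for s in genotypes}


# Hypothetical data: snp1 tracks the phenotype, snp2 does not.
demo_genotypes = {
    "snp1": [0] * 10 + [1] * 10 + [2] * 10,
    "snp2": [0, 1, 2] * 10,
}
demo_phenotype = [float(i) for i in range(30)]
demo_p = gene_wide_p(demo_genotypes, demo_phenotype, n_perm=200)
```

Comparing each observed statistic with the permutation maximum across SNPs is what makes the resulting p-value gene-wide rather than per-SNP.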
To assess associations between PROCR genotype and gene expression, we utilized the Sanger GENe Expression VARiation project (GENEVAR), a web-accessible collection of genome-wide microarray-based gene expression measurements obtained from Epstein-Barr virus-transformed lymphoblastoid cell lines from 60 unrelated European-American HapMap (CEPH) samples. Differences in mRNA expression levels for genes of interest were tested between genotype groups using ANOVA.
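The ANOVA comparison of expression across genotype groups reduces to a one-way F statistic. The sketch below computes it directly; the function name `one_way_anova_f` and the expression values are hypothetical, and converting F to a p-value would additionally require the F distribution (available, for example, in a statistics library).

```python
from statistics import mean
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: List[Sequence[float]]) -> Tuple[float, int, int]:
    """One-way ANOVA: returns (F, df_between, df_within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([v for g in groups for v in g])
    group_means = [mean(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between, df_within = k - 1, n_total - k
    ms_within = ss_within / df_within
    if ms_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    return (ss_between / df_between) / ms_within, df_between, df_within


# Hypothetical expression values for the three genotype groups at one SNP.
expression_by_genotype = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(expression_by_genotype)
```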
Participant characteristics among those eligible for each analysis (cross-sectional plasma phenotype and longitudinal time to clinical event) are summarized in Table 1. The distributions of protein C, free protein S, and total protein S levels were approximately normal. As previously reported in other studies [3,4], sEPCR had a multi-modal distribution. Mean levels of protein C, sEPCR, free protein S, and total protein S were 3.40 ± 5.95 mg/L, 222.55 ± 90.10 μg/L, 5.01 ± 0.94 mg/L, and 22.38 ± 3.14 mg/L, respectively. Total protein S levels were strongly correlated with free protein S (ρ=0.36; p<0.0001) and protein C (ρ=0.31; p<0.0001). Protein C levels were correlated with free protein S (ρ=0.18; p=0.002) and sEPCR (ρ=0.21; p=0.0001). Protein C, free protein S, and total protein S were strongly correlated with baseline plasma cholesterol and lipid levels, as previously reported. sEPCR levels initially showed no significant correlation with any traditional cardiovascular risk factors. However, when adjusted for the strong effect of the Ser219Gly polymorphism, male sex (p=0.02) and age (p<0.0001) were associated with higher sEPCR levels.
Among the entire CHS sample, the r² measure of linkage disequilibrium between pairs of SNPs in each gene was below 0.65 in all cases. The associations between PROC, PROCR, and PROS1 tagSNPs and haplotypes and their respective plasma phenotypes are shown in Table 2. Under an additive genetic model, each additional copy of the PROCR Ser219Gly polymorphism was associated with 182 ± 6 μg/L higher sEPCR levels, explaining 75% of the phenotypic variation. In unadjusted models, the minor allele of PROCR rs2069948 appeared to be associated with lower sEPCR levels (β=−34 ± 7 μg/L); however, adjustment for Ser219Gly greatly weakened the association (β=−3.7 ± 3.8 μg/L; p=0.33). Analysis of the 3 common PROCR haplotypes confirmed that the haplotype tagged by Ser219Gly (PROCR-3) was strongly associated with higher sEPCR levels (p<0.0001) relative to PROCR-1 containing both major tagSNP alleles, while there was no difference in sEPCR concentration between the haplotype tagged by rs2069948 (PROCR-2) and PROCR-1 (supplemental Table 1). Each additional copy of the PROC rs2069901 variant allele was associated with 0.13 ± 0.04 mg/L lower protein C levels (gene-wide p=0.01). PROC haplotypes containing the rs2069901 variant allele were also associated with lower protein C (supplemental Table 1). Each additional copy of the PROS1 rs4857343 minor allele (β=−0.28 ± 0.10 mg/L; p=0.005), or alternatively, the two haplotypes containing the rs4857343 minor allele (supplemental Table 3), was associated with lower free protein S levels. Upon experiment-wise correction for testing multiple genes and phenotypes, the association between Ser219Gly and sEPCR levels was still strongly significant (p<0.0001); however, neither the protein C nor the free protein S genotype-phenotype associations were significant (p=0.11 and 0.14, respectively).
We screened 876 additional tagSNPs across 130 candidate genes for association with plasma protein C, S, and sEPCR levels. The tagSNP identifiers, location, regression beta coefficients, standard errors, p-values, and false discovery rate q-values are shown in Supplemental Table 4. At a false discovery rate threshold of q<0.05, the PROCR Ser219Gly polymorphism was associated with higher protein C levels (β=0.63 ± 0.09 mg/L; p-value 9×10⁻¹¹; q-value <1×10⁻⁸), explaining 13% of the variation in protein C. The minor allele of an interleukin-10 gene (IL10) intronic polymorphism on chromosome 1 (rs1878672) was associated with higher free protein S levels (β=0.34 ± 0.08 mg/L; p-value 3×10⁻⁵; q-value=0.03), explaining 5% of the variation in free protein S. The same IL10 rs1878672 variant was associated with 0.16 ± 0.06 mg/L higher protein C levels (p-value 0.006), though the protein C association did not meet the same significance threshold (q-value=0.50) as the free protein S association.
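To illustrate how a false discovery rate threshold is applied to a large SNP screen, the sketch below computes Benjamini-Hochberg adjusted p-values. This is a simpler stand-in and an assumption on our part: the q-value method used in the study additionally estimates the proportion of true null hypotheses, so its values would typically be somewhat smaller. The function name `bh_qvalues` and the p-values are hypothetical.

```python
from typing import List, Sequence


def bh_qvalues(pvals: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four SNP-trait tests.
demo_q = bh_qvalues([0.01, 0.04, 0.03, 0.5])
significant = [q < 0.05 for q in demo_q]
```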
There was no evidence for association between PROC or PROS1 genotype and risk of incident CVD or death. On the other hand, each additional copy of the minor allele of PROCR rs2069948 was associated with 13% ± 6% increased risk of stroke (p=0.02) and 7% ± 3% increased risk of mortality (p=0.01), even after adjusting for major risk factors (Table 3). Moreover, each additional copy of the PROCR rs2069948 variant was associated with 0.32 ± 0.09 fewer years of healthy life (p=0.0003). There was no association between Ser219Gly and risk of incident CVD or mortality. The PROCR genotype-clinical phenotype results did not differ among subgroups defined according to age, gender, diabetes, or baseline subclinical CVD status. Upon experiment-wise correction for multiple testing, only the PROCR rs2069948 association with years of healthy life remained statistically significant (p=0.01).
Examination of linkage disequilibrium patterns surrounding the PROCR gene on chromosome 20 revealed that PROCR rs2069948 was strongly correlated with several other SNPs located in PROCR as well as with additional SNPs located outside PROCR. Supplemental Table 5 shows the location, function, and pair-wise correlation coefficient for all other SNPs showing strong linkage disequilibrium (r²>0.5) in a 200 kb window surrounding PROCR rs2069948. We assessed the relationship between PROCR genotype and lymphoid mRNA expression levels of PROCR and neighboring chromosome 20 genes in 60 European-American HapMap individuals (Table 4). rs2069948 was associated with lower PROCR mRNA expression and also with higher and lower mRNA expression levels, respectively, of EDEM2 (ER degradation-enhancing-mannosidase-like protein 2) and GSS (glutathione or GSH synthase). EDEM2 and GSS are both involved in responses to cellular damage that accumulates during aging. The PROCR Ser219Gly polymorphism was associated neither with PROCR mRNA expression nor with mRNA expression of any other genes in the region (Table 4).
In older European-American men and women from CHS, the PROCR Ser219Gly variant was strongly associated with higher sEPCR levels. There were much weaker associations between common variants of the PROC and PROS1 genes and protein C and free protein S levels, respectively. In an exploratory analysis that included a larger set of thrombosis- and inflammation-related genes, an IL10 polymorphism was associated with higher free protein S and protein C levels. In addition, the PROCR Ser219Gly variant was associated with higher levels of protein C. There was no evidence of association between Ser219Gly and risk of clinical CVD. On the other hand, another common PROCR variant rs2069948 was associated with increased risk of stroke and mortality and decreased healthy survival during follow-up and with reduced PROCR mRNA expression in lymphoid cells.
Our results confirm that the PROCR Ser219Gly polymorphism is the major determinant of phenotypic variation in sEPCR [3,4]. In contrast, the majority of inter-individual variation in protein C and protein S levels does not appear to be explained by common polymorphisms of PROC or PROS1, respectively. The PROC rs2069901 variant allele weakly associated with lower protein C levels in CHS tags the same common “CG” haplotype as the PROC -1654 C/A and -1641 G/T promoter polymorphisms previously reported to be associated with lower plasma levels [9,10] and lower transcriptional efficiency of protein C. The PROS1 variant associated with lower free protein S levels in CHS is located in intron 2 and is in strong linkage disequilibrium with 2 other polymorphisms, rs8178583 and rs7650230, located in the 5′ flanking region but outside of the minimal PROS1 promoter [24,25]. While our tagSNP approach efficiently captures LD patterns among common SNPs across our candidate genes, we were not able to capture one common PROS1 LD bin (containing 4 SNPs with MAF ≥10%) due to assay failure. Moreover, the common tagSNP design does not address the possibility that rare variants within these same genes, such as the previously reported protein S Heerlen Ser460Pro variant (estimated population prevalence ~0.5%), may contribute significantly to phenotypic variance.
Since protein C and sEPCR levels were moderately correlated, it is interesting to note that the PROCR Ser219Gly variant was a fairly strong determinant of protein C, explaining a greater amount of the phenotypic variation than the PROC gene itself. The mechanism of association between PROCR Ser219Gly and higher sEPCR levels appears to involve increased EPCR shedding from the endothelium, which may result in reduced protein C activation. Therefore, it is possible that higher sEPCR levels relative to membrane-bound EPCR associated with Ser219Gly might effectively increase the amount of circulating protein C by stabilizing it.
We also observed a moderately strong correlation between protein C, protein S, and plasma lipid levels. Protein C and protein S are functionally and structurally related vitamin K-dependent proteins. It is possible that common polymorphisms of shared regulatory genes account for at least some of the correlation in their plasma concentration. In this regard, it is interesting to note that IL10 genotype was associated with both free protein S and protein C levels. IL10 is located in a region on chromosome 1q32 previously identified as a QTL for protein S in a genome-wide linkage analysis. The chromosome 1q32 region contains two genes encoding subunits of C4 binding protein (C4BPA and C4BPB), an inflammation-sensitive protein. The beta subunit of C4 binding protein binds strongly to protein S and effectively determines free protein S levels. C4BPA and C4BPB are located ~300 kb from the IL10 gene. Examination of HapMap data showed only weak linkage disequilibrium between the associated IL10 variants and the C4BP gene region, with all pair-wise SNP r² with IL10 rs1878672 < 0.12. Therefore, additional fine-mapping in this extended region on chromosome 1q32, which also contains several other complement and interleukin-related genes, will be required to further characterize the variant(s) responsible for the protein S and protein C phenotypic associations.
A recent report suggested that PROCR Ser219Gly was associated with increased thrombin generation and increased risk of CHD in men and in diabetics or those with metabolic syndrome. Our results do not support this finding, either overall or in subgroups defined according to age, gender, or diabetes. Failure to replicate the Ser219Gly CVD association in CHS could be related to differences in sample characteristics between studies. Another potential issue is that the risk associated with PROCR Ser219Gly in the report by Ireland et al. appeared to be limited to homozygotes for the minor Gly219 allele. Gly219 homozygotes comprised only 0.8% of the CHS cohort; therefore even larger sample sizes may be required to detect a recessive effect.
Based on data from the HapMap, the PROCR rs2069948 variant associated with healthy aging in CHS is part of an extended haplotype on chromosome 20 that contains 45 other polymorphisms, including 9 PROCR SNPs in complete linkage disequilibrium, as well as several variants located in neighboring genes (Supplemental Table 5). One of the SNPs comprising the risk haplotype, rs6088747, is located within an enhancer region 5.5 kb upstream of the PROCR translation start site that is essential for PROCR mRNA expression in endothelium and hematopoietic cells. Another SNP, rs9574, is located within the proximal 3′ untranslated region of PROCR. It is possible that one or more of these polymorphisms influences PROCR mRNA stability or processing, as suggested by the observed association between rs2069948 and decreased lymphoid PROCR mRNA expression. The precise molecular mechanism of this association requires further study, including assessment of the relative effects of PROCR genotype on membrane versus soluble forms as well as cell type-specific mRNA expression. It should also be noted that the same PROCR haplotype tagged by rs2069948 has been associated with higher circulating activated protein C levels and decreased risk of venous thrombosis, though other groups have reported that the Ser219Gly haplotype, but not the rs2069948 haplotype, is associated with increased risk of venous thrombosis [3,4].
Because of the anticoagulant, anti-inflammatory, cytoprotective, and immuno-modulatory properties of the protein C system [1,2], genes encoding protein C pathway components are potential candidates not only for athero-thrombotic and ischemic conditions, but also for inflammatory and other aging-related disorders. Endothelial protein C receptor is expressed by various blood cells, and binding of protein C arrests cellular migration [32,33]. The effects of the protein C/EPCR pathway on endothelial or immune cell function may be related not only to the apparent influence of the PROCR rs2069948 haplotype on healthy aging, but also to the beneficial effects of activated protein C in the treatment of severe sepsis [34,35]. Finally, there was some evidence of association between rs2069948 and expression of other genes in the region, EDEM2 and GSS, which are involved in the unfolded protein response and oxidative damage responses, respectively. These data additionally suggest the possibility that other cellular processes may play a role in the apparent effect of the PROCR risk haplotype on thrombosis, mortality, or healthy aging.
This study was supported by contracts N01-HC-85079 through N01-HC-85086, N01-HC-35129, N01 HC-55222, N01 HC-15103, grants R01 HL-071862 and U01 HL080295 from the NHLBI and U19 AG023122 from the National Institute on Aging Longevity Consortium, with additional contribution from the National Institute of Neurological Disorders and Stroke. A full list of participating CHS investigators and institutions can be found at http://www.chs-nhlbi.org. Genotyping services for CHS were provided by the Center for Inherited Disease Research (CIDR). CIDR is fully funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, Contract Number N01-HG-65403. Genotyping services for CHS were also provided by the Johns Hopkins University under federal contract number N01-HV-48195 from the National Heart, Lung, and Blood Institute. We utilized public re-sequencing data from the SeattleSNPs program, supported by NIH U01 HL66682 (http://www.pga.gs.washington.edu/).