Heart failure results from abnormalities in multiple biological processes that contribute to cardiac dysfunction. We tested the hypothesis that inherited variation in genes of known importance to cardiovascular biology would thus contribute to heart failure risk.
We utilized the ITMAT/Broad/CARe (IBC) cardiovascular SNP-array to screen referral populations of advanced heart failure patients for variants in ~2,000 genes of predicted importance to cardiovascular biology. Our design was a two-stage case-control study. In Stage 1, genotypes in Caucasian heart failure patients (n=1,590; ejection fraction 32±16%) were compared to those in unaffected controls (n=577; ejection fraction 67±8%) recruited from the same referral centers. Associations were tested for independent replication in Stage 2 (n=308 cases, 2,314 controls). Two intronic SNPs showed replicated associations with all-cause heart failure: rs1739843 in HSPB7 (combined P=3.09×10−6) and rs6787362 in FRMD4B (P=6.09×10−6). For both SNPs the minor allele was protective. In subgroup analyses, rs1739843 associated with both ischemic and nonischemic heart failure, whereas rs6787362 associated principally with ischemic heart failure. Linkage disequilibrium surrounding rs1739843 suggested that the causal variant resides in a region containing HSPB7 and a neighboring gene, CLCNKA, whereas the causal variant near rs6787362 is probably within FRMD4B. Allele frequencies for these SNPs were substantially different in African Americans (n=635 cases, 714 controls) and showed no association with heart failure in this population.
Our findings identify regions containing HSPB7 and FRMD4B as novel susceptibility loci for advanced heart failure. More broadly, in an era of genome-wide association studies, we demonstrate how knowledge of candidate genes can be leveraged as a complementary strategy to discern the genetics of complex disorders.
Heart failure in the United States will strike one of every five Americans and accounts annually for approximately one million hospitalizations, over 50,000 deaths, and almost $35 billion in health care costs1. Although heart failure risk is heritable, it is estimated that at most 40% of idiopathic cardiomyopathy is monogenetic2. The genetic architecture of sporadic heart failure is poorly characterized, but is believed to be driven by multiple common genetic variants that combine with environmental factors to determine individual susceptibility. Consistent with this idea, well-validated susceptibility genes or loci have been identified for clinical precursors of heart failure, but not for heart failure itself3. Identification of common variants associated with heart failure would inform our understanding of the disease and provide genetic risk markers to better direct disease surveillance and prevention in asymptomatic individuals.
Most studies reporting genetic associations in heart failure have focused on small numbers of candidate genes3. Because the resulting associations have frequently failed to be replicated4, genome-wide association studies (GWAS) have supplanted candidate gene studies in large-scale comparative genomics research5. The unbiased GWAS approach has identified genetic risks for clinical antecedents of heart failure, such as hypertension, atherosclerosis, and diabetes mellitus6, but genome-wide studies in heart failure are few7, and have yet to be replicated. Furthermore, these studies typically implicate relatively large genetic loci, and so assignment of causality has not often been achieved. Consequently, the full import of GWAS findings on disease pathology can be difficult to determine.
Here, we combined the analytical advantages of a large-scale multi-gene approach with the mechanistic advantages of examining biologically relevant genes in a large-scale candidate gene case-control study of advanced heart failure. We utilized the ITMAT/Broad/CARe (IBC) cardiovascular SNP-array8 to profile common single nucleotide polymorphisms in ~2,000 genes of predicted importance to cardiovascular disease in referral populations of patients with advanced heart failure. Our analysis identifies common variants in HSPB7 and FRMD4B that show replicated associations in Caucasians with advanced heart failure.
Our design was a two-stage case-control study in referral populations of patients with advanced heart failure. In Stage 1, genotypes in Caucasian heart failure patients were compared with those in unaffected Caucasian controls derived from the same referral centers. We performed analyses comparing all-cause heart failure with controls, as well as stratified analyses examining ischemic and nonischemic subgroups separately. SNPs surpassing a pre-specified threshold for statistical significance were then tested for replication in Stage 2 using an independent population of Caucasian cases and controls. We also tested whether SNPs associated with heart failure in Caucasians were associated with heart failure in African Americans.
Caucasian heart failure cases (n=1,590) were recruited from the Penn Heart Failure Study and the Cincinnati Heart Study, two ongoing NIH-sponsored (HL077101 and HL088577) prospective observational studies of patients with advanced heart failure referred for subspecialty care at the University of Pennsylvania Health System (Philadelphia, Pennsylvania) and at the University of Cincinnati Medical Center (Cincinnati, Ohio). For both studies, the primary inclusion criterion is a clinical diagnosis of heart failure with abnormal left ventricular function. Extensive clinical data were collected at enrollment as previously described9, 10. Caucasian controls (n=577) were recruited from among the population of patients referred for cardiovascular evaluation at Cincinnati or at Penn. Criteria for being a control were no history of heart failure, normal ventricular function on cardiac imaging, and no evidence of coronary artery disease as determined either by a negative treadmill exercise test or by maximal coronary stenoses of 20% or less on coronary angiography. Our recruitment strategy aimed to frequency-match cases and controls based on gender and age.
To test for independent replication of associations identified in Stage 1, we recruited additional Caucasian heart failure cases (n=308) from the University of Cincinnati using the same inclusion criteria. These were compared to additional Caucasian controls who were referred to Penn or to Cincinnati for medical evaluation and who were free of clinical heart disease by history and physical examination (n=2,314). Unlike controls used in Stage 1, phenotyping of Stage 2 controls did not include assessment of ventricular function, stress testing, or coronary angiography.
To explore whether SNPs associated in Caucasians were also associated in African Americans, we recruited African American heart failure cases from the University of Cincinnati using the same inclusion criteria. These subjects were compared to African American controls referred to Cincinnati or to Penn and who were free of heart disease by history and physical examination. Again, phenotyping of controls did not include assessment of ventricular function, stress testing, or coronary angiography.
Study protocols from all centers were approved by their respective institutional review boards. All patients provided written informed consent for genetic testing.
We used the IBC cardiovascular SNP-array to compare genotype frequencies in cases and controls in the Caucasian discovery population. The IBC array was designed by a consortium of experts who developed a prioritized list of candidate genes likely to be involved in cardiovascular disorders8. Ancestry-informative markers were included. Tagging SNPs for loci of interest were determined using a “cosmopolitan” approach appropriate for populations of different ancestry. For the Caucasian replication population and for African Americans, a minority of subjects was genotyped using the IBC array and the majority was genotyped at selected SNPs using Pyrosequencing. To ensure that both technologies provided similar results, we genotyped our top two associated SNPs using both approaches in 1000 individuals and found a >98% concordance of genotypes (see Supplemental Methods).
Unlike GWAS, there are no broadly accepted standards regarding the level of statistical significance for large-scale candidate gene association studies where SNPs are tightly clustered in genes of high expected relevance to disease, and where association tests are not independent due to a high degree of linkage disequilibrium (LD). Hence, we adapted the two-stage approach utilized in most GWAS studies, and we regarded independent replication as the most reliable measure of true association. For Stage 1, SNPs were selected based on a pre-specified P-value threshold and then tested for independent replication in Stage 2. GWAS studies using current platforms perform approximately 1 million association tests, and a P-value threshold of 5×10−7 utilized by the Wellcome Trust Case-Control Consortium11 and by a recent GWAS of echocardiographic phenotypes12 has become an accepted threshold for selection of Stage 1 SNPs. After quality control and after accounting for the high degree of LD among SNPs on the IBC array, we estimate that our Stage 1 analysis conducts ~10,000 association tests, which is 100-fold fewer than a typical GWAS. Hence we a priori selected P<5×10−5 as the threshold for significance in Stage 1. All subjects in the Stage 1 population had genetically inferred Caucasian ancestry based on multi-dimensional scaling (MDS) of all IBC genotypes (Supplemental Figure 1). Our primary analyses compared genotype frequencies between cases and controls in PLINK under an additive genetic model, adjusting for age, gender, and study site. Though all subjects were genetically inferred Caucasians, we also adjusted for 2 principal components of race derived from MDS to account for any residual differences in ancestry. Secondary analyses included additional adjustments for hypertension and diabetes. We first used these analyses to compare all cases with controls, and repeated them after stratifying the cases into ischemic and nonischemic subgroups.
Ischemic heart failure was defined by at least one 50% narrowing on coronary angiography, a positive stress test, history of an acute coronary syndrome, or prior coronary revascularization. Patients free of these criteria were classified as nonischemic. Patients who could not be clearly classified were labeled as “other” and were not included in subgroup analyses.
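The Stage 1 threshold described above can be read as a simple rescaling: the GWAS threshold is relaxed in proportion to the smaller number of effective tests, so that the expected number of false positives under the null stays the same. The short Python sketch below reproduces that arithmetic using only the figures quoted in the text; the function and variable names are illustrative and are not taken from the study's analysis code.

```python
# Rescale a genome-wide threshold to a smaller number of effective tests,
# holding (threshold x number of tests) constant. Names are illustrative.

GWAS_TESTS = 1_000_000        # approximate tests in a typical GWAS (from the text)
GWAS_THRESHOLD = 5e-7         # Stage 1 threshold cited for GWAS (from the text)
IBC_EFFECTIVE_TESTS = 10_000  # estimated effective tests on the IBC array (from the text)


def scaled_threshold(ref_threshold: float, ref_tests: int, n_tests: int) -> float:
    """Return the threshold that keeps threshold * tests equal to the reference."""
    return ref_threshold * ref_tests / n_tests


stage1_threshold = scaled_threshold(GWAS_THRESHOLD, GWAS_TESTS, IBC_EFFECTIVE_TESTS)

# Expected number of SNPs passing the threshold by chance if every test were null.
expected_false_positives = stage1_threshold * IBC_EFFECTIVE_TESTS
```

With 100-fold fewer tests the threshold is 100-fold less stringent (5×10−5), and the expected number of chance findings in Stage 1 remains about 0.5, the same as under the GWAS convention.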
SNPs that met criteria for association in Stage 1 were tested for replication in Stage 2, adjusting for age, sex, and study site. To be considered replication, we required that the direction of association for allele frequencies be the same as in Stage 1. Because the direction of effect is thus specified in advance, a 1-sided test is necessary for the P-value distribution to be correct under the null hypothesis, and we required a 1-sided P-value <0.05 as the criterion for replication12. To obtain better risk estimates, we conducted combined analyses of all available Caucasian cases and controls for the replicated SNPs, adjusting for age, sex, site, and analysis stage.
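As an illustration of the replication criterion, the Python sketch below computes a 1-sided P-value from a Stage 2 test statistic, given the direction of effect observed in Stage 1. It assumes the Stage 2 statistic is an approximately standard normal z-score whose sign follows the effect of the minor allele; the function names and the z-score inputs are hypothetical and are not drawn from the study.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def one_sided_p(z_stage2: float, stage1_sign: int) -> float:
    """Probability, under the null, of a Stage 2 z-score at least as extreme
    as observed in the direction seen in Stage 1 (+1 = risk, -1 = protective)."""
    if stage1_sign not in (-1, 1):
        raise ValueError("stage1_sign must be +1 or -1")
    return normal_cdf(-stage1_sign * z_stage2)


def replicates(z_stage2: float, stage1_sign: int, alpha: float = 0.05) -> bool:
    """Apply the replication rule: same direction and 1-sided P < alpha."""
    return one_sided_p(z_stage2, stage1_sign) < alpha
```

An effect in the opposite direction yields a 1-sided P-value above 0.5, so it can never count as replication regardless of its magnitude.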
To explore loci tagged by our most strongly associated SNPs, we returned to the IBC data used in Stage 1 and imputed IBC genotypes with HapMap to increase genotype coverage. We then used these data to explore LD at the loci of interest using SNAP13. For a variant with minor allele frequency of 0.4, we had 80% power to detect an odds ratio of 1.20 in the combined Stage 1 and Stage 2 populations. Further details regarding our analysis, including additional power calculations and tests for Hardy-Weinberg equilibrium, are provided in the Supplemental Material.
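For readers who wish to explore how power depends on sample size, allele frequency, and effect size, the following Python sketch uses a normal approximation for an unadjusted allelic (two-proportion) test. This is an assumption made for illustration: the study's own power calculations are described in the Supplemental Material and used a covariate-adjusted additive model, so this sketch is not expected to reproduce the 80% figure. The significance level passed in the usage line is likewise an illustrative choice.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def case_allele_freq(control_freq: float, odds_ratio: float) -> float:
    """Allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_test_power(n_cases: int, n_controls: int, control_freq: float,
                       odds_ratio: float, alpha: float) -> float:
    """Approximate power of a 2-sided two-proportion z-test on allele counts."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    # Each subject contributes two alleles.
    se = (p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls)) ** 0.5
    z_effect = abs(p1 - p0) / se
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return _STD_NORMAL.cdf(z_effect - z_crit) + _STD_NORMAL.cdf(-z_effect - z_crit)


# Combined Caucasian sample sizes from the text; alpha here is illustrative only.
power = allelic_test_power(n_cases=1590 + 308, n_controls=577 + 2314,
                           control_freq=0.4, odds_ratio=1.20, alpha=5e-5)
```

Varying `alpha` and `odds_ratio` shows how strongly power in a study of this size depends on the significance level chosen.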
In the Caucasian discovery population (Table 1), cases and controls were of similar age and most were male. Based on selection criteria, controls had no heart failure, no significant coronary artery disease on coronary angiography or stress testing, and a normal ejection fraction of 67±8%. By contrast, heart failure cases had an advanced systolic heart failure phenotype with an average ejection fraction of 32±16%, and were treated accordingly with neurohormonal antagonists, cardiac resynchronization, and implanted cardioverter-defibrillators. The prevalence of ischemic and nonischemic etiologies was roughly equivalent. Hypertension and diabetes were more common among cases than controls, as expected14, 15.
In the Caucasian replication population, cases again had severe systolic heart failure with an average ejection fraction of 32±12%. Controls had no heart failure based on selection criteria. There was a higher prevalence of diabetes and hypertension among controls in the replication population than in the discovery population. African American cases had severe systolic heart failure with average ejection fraction of 32±14%. The prevalence of hypertension and diabetes was higher among both cases and controls compared to the Caucasian populations, as expected based on general population studies16.
After eliminating SNPs with minor allele frequency <1%, call rates <95%, or not in Hardy-Weinberg equilibrium, 31,682 autosomal SNPs (of 44,720 on the IBC array) were called in the Caucasian discovery population. The Q-Q plot (Supplemental Figure 2) showed a relatively small inflation factor (λ=1.065) that is similar to or lower than those found in several successful GWAS studies11 and revealed a number of SNPs with lower P-values than expected by chance. Twenty-two genes were associated with all-cause heart failure at P<0.001 (Supplemental Table 1). Two of these SNPs surpassed our threshold for significance in primary analyses (rs1739843[HSPB7], P=2.8×10−5 and rs6787362[FRMD4B], P=4.5×10−5), and two SNPs approached but did not surpass this threshold (rs7174882[PCSK6], P=5.7×10−5 and rs16877169[PDE4D], P=5.9×10−5). Subgroup analyses yielded only one association that met criteria for statistical significance (rs4581654[LIPC], P=4.9×10−5) associated with nonischemic heart failure (Supplemental Table 2). LIPC is a well characterized gene involved in hepatic lipid metabolism17. As such, we reasoned this association was likely attributable to differences in unmeasured lipid phenotypes between cases and controls, and we chose not to pursue this association any further. Thus, four SNPs were selected for Stage 2: rs1739843, rs6787362, rs7174882, and rs16877169.
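The three quality-control filters named above (minor allele frequency, call rate, and Hardy-Weinberg equilibrium) can be sketched in Python as follows. The frequency and call-rate cut-offs come from the text; the Hardy-Weinberg P-value cut-off is not stated in this section, so the default used here (`hwe_alpha=1e-4`) is an assumption, as are the function names and the use of a 1-degree-of-freedom chi-square test rather than an exact test.

```python
import math


def minor_allele_freq(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency from genotype counts."""
    freq_a = (2 * n_aa + n_ab) / (2 * (n_aa + n_ab + n_bb))
    return min(freq_a, 1.0 - freq_a)


def hwe_p_value(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) test of Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa: int, n_ab: int, n_bb: int, n_missing: int,
              min_maf: float = 0.01, min_call_rate: float = 0.95,
              hwe_alpha: float = 1e-4) -> bool:
    """Keep a SNP only if it passes all three filters (hwe_alpha is assumed)."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    return (call_rate >= min_call_rate
            and minor_allele_freq(n_aa, n_ab, n_bb) >= min_maf
            and hwe_p_value(n_aa, n_ab, n_bb) >= hwe_alpha)
```

In practice these filters are usually applied by the genotype analysis software itself; the sketch only makes the logic of each criterion explicit.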
Table 2 summarizes association results for the four SNPs selected for Stage 2. Minor allele frequencies in controls were similar to those in the HapMap CEU population. Two of the four SNPs showed significant associations in the Caucasian replication population: rs1739843(HSPB7) and rs6787362(FRMD4B). As expected18, the associations in Stage 2 were less marked than in Stage 1 with minor allele frequencies that were less divergent between cases and controls. For both SNPs, the minor allele frequency was smaller in cases, consistent with a protective effect of the minor allele. In a combined analysis of all available genotypes, P-values were 3.09×10−6 for rs1739843 and 6.09×10−6 for rs6787362.
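The link between allele frequencies and a protective effect can be made concrete with an unadjusted allelic odds ratio, sketched below in Python. The frequencies in the usage line are hypothetical placeholders, not values from Table 2, and the study's reported estimates come from covariate-adjusted models rather than this simple ratio.

```python
def allelic_odds_ratio(maf_cases: float, maf_controls: float) -> float:
    """Unadjusted odds ratio for carrying the minor allele, cases vs controls."""
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls


def is_protective(maf_cases: float, maf_controls: float) -> bool:
    """A minor allele that is rarer in cases gives an odds ratio below 1."""
    return allelic_odds_ratio(maf_cases, maf_controls) < 1.0


# Hypothetical frequencies for illustration only.
example_or = allelic_odds_ratio(maf_cases=0.40, maf_controls=0.45)
```

An odds ratio below 1 for the minor allele corresponds to the pattern described above, in which the allele is less frequent in cases than in controls.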
For each of the 4 SNPs identified in our Caucasian discovery population, we also explored associations in African Americans with heart failure (Supplemental Tables 3 and 4). For three of the SNPs, minor allele frequencies in African Americans were lower than those observed in Caucasians and these SNPs were not associated with African American heart failure. One SNP, rs16877169, showed a higher minor allele frequency in African Americans and showed modest association with African American heart failure (P=0.039).
We utilized all available data to explore whether the strength of association for our top findings varied with the underlying cause of heart failure (Table 3). The association with HSPB7 was noteworthy in both subtypes, but appeared slightly stronger in nonischemic heart failure, as indicated by a slightly lower P-value and odds ratio. The association with FRMD4B appeared stronger in ischemic heart failure. We also considered whether the risk alleles might lead to heart failure by increasing risk for intermediate phenotypes that contribute to heart failure risk over time. Diabetes and hypertension are well-established risk factors for heart failure and also have a genetic basis. We therefore included history of diabetes and history of hypertension as additional covariates in our regression models to test whether this adjustment attenuated the SNP associations. As shown in Table 3, the associations remained strong, indicating that these intermediate phenotypes are unlikely to play a major role in the causal pathway from risk allele to development of heart failure.
Next we explored the loci tagged by the two replicated SNP-associations in Caucasians. We imputed our IBC genotypes with HapMap CEU to increase coverage in these regions and constructed association plots (Figure 1, top panels). For comparison, we also explored the LD structure at these two loci using HapMap data alone (Figure 1, bottom panels). As shown, rs1739843 is located within a region of strong LD that spans two genes: HSPB7 and CLCNKA. As such, the causal genetic variant responsible for our association could be anywhere in this LD block and could thus reside within either of these two genes or within intervening regulatory regions. By contrast, rs6787362 is located in the 3’ end of FRMD4B in an area of low LD. This was evident both in our own data and in HapMap, which showed no nearby SNPs in strong correlation with rs6787362. Thus the causal variant is likely to be close to rs6787362 and may lie within FRMD4B itself.
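The notion of strong versus weak LD used above is commonly quantified by the pairwise correlation r² between two SNPs. The Python sketch below computes r² from phased haplotype counts; the counts in the test cases are hypothetical, and the function is a textbook calculation rather than the procedure implemented in SNAP or used to draw Figure 1.

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """Pairwise r^2 between two biallelic SNPs from phased haplotype counts.

    n_AB counts haplotypes carrying allele A at SNP 1 and allele B at SNP 2,
    and so on for the other three combinations.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at SNP 2
    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    if denom == 0.0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    d = n_AB / total - p_A * p_B  # disequilibrium coefficient D
    return d * d / denom
```

Values of r² near 1 mean that an associated SNP is nearly interchangeable with its neighbors as a marker, as at the HSPB7/CLCNKA locus, whereas values near 0 mean that nearby SNPs carry little information about it, as described for rs6787362.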
We report the first large-scale candidate gene case-control study of advanced heart failure using the high density IBC cardiovascular SNP-array. Approximately 30,000 SNPs were studied in 2,000 candidate genes with a high a priori probability of cardiovascular involvement, thereby leveraging knowledge of cardiovascular biology and the results of prior genome-wide association and expression quantitative trait studies8. By so doing, we identified and replicated two common genetic variants that are significantly associated with advanced heart failure in Caucasians.
These findings provide specific genetic data supporting the observation that clinical heart failure, though a complex and heterogeneous phenotype, results in part from an inherited predisposition19. For both SNPs identified (rs1739843 and rs6787362) the minor allele is less frequent in cases than in controls, indicating a protective effect of the minor allele. However, the magnitude of these protective effects is too small to be of any real clinical value by itself, which is similar to risk or protective effects associated with many common genetic variants11. Friedrichs et al recently published a candidate-gene association study of inflammatory genes that identified variants associated with dilated cardiomyopathy20. As more such variants are identified, it may be possible to develop a composite measure or score of multiple risk and protective alleles with more predictive ability and thus more clinical use, as has been attempted for lipid disorders21.
The most important implication of our findings is the identification of specific genes that may contribute to heart failure pathogenesis in humans. Our top association was rs1739843 (combined P=3.09×10−6), which is located in a 5′ intronic region of the gene encoding the heat shock protein B7 (HSPB7). HSPB7, also known as the cardiovascular heat shock protein (cvHSP), is a member of the small heat shock protein family and is expressed almost exclusively in cardiac and skeletal muscle. One of the most abundant myocardial transcripts22, HSPB7 preserves contractile integrity by binding to and stabilizing sarcomeric proteins23, 24. Mutations in another small heat shock protein, CRYAB (also known as HSPB5), cause a rare form of familial cardiomyopathy23, and common variation in HSPB7 may similarly alter susceptibility to adverse cardiac remodeling. Further scrutiny of the locus containing rs1739843 shows that although this SNP is located within HSPB7, it resides in a block of high LD that spans HSPB7 and a nearby gene, CLCNKA, which encodes a voltage-sensitive chloride channel expressed mainly in the kidney25. Variation in CLCNKA has been associated with alterations in renal sodium re-absorption and salt-sensitive hypertension25, 26. Thus, an alternative explanation for our finding is that rs1739843 may serve as a marker for pathogenic variants in CLCNKA that lead to excess sodium retention, which in turn contributes to heart failure risk. Discerning the underlying causal variants will require resequencing of the entire LD block spanning both HSPB7 and CLCNKA.
We also found a replicated association between rs6787362 and all-cause heart failure (combined P=6.09×10−6). This SNP is located in a 3′ intronic region of the FRMD4B gene. FRMD4B27 (FERM-domain containing protein 4B) was identified in a screen for proteins that physically interact with CYTH3, a downstream effector of PI3-kinase signaling. PI3-kinase is a mediator of many different signaling pathways, and it is difficult to speculate on a specific mechanism based on our data alone. Unlike the HSPB7 locus, the locus containing rs6787362 is in an area of weak LD. The causal variant may thus be within the FRMD4B gene, and re-sequencing of a narrower region may be sufficient for it to be identified. We note that this association was strongest in patients with ischemic heart failure (Table 3). Thus it is possible that FRMD4B contributes to heart failure risk by conferring risk for coronary artery disease. This is in contrast to the HSPB7 association, which was noteworthy in both ischemic and nonischemic subtypes.
SNP frequencies were substantially different in African Americans, and the associations identified in Caucasians did not replicate in African Americans with heart failure, with the possible exception of the rs16877169 variant in PDE4D (Supplemental Table 3). These observations are consistent with previous studies that demonstrate differences in heart failure risk factors16, therapeutic response28, 29, and clinical outcomes30 in African Americans and Caucasians. Further work in larger cohorts is required to identify the extent to which ancestry-specific genetic variants explain these observations.
The use of a large-scale candidate gene approach stands in contrast to numerous GWAS studies that have rapidly emerged over the past few years and that have identified common variants associated with diabetes mellitus, hypertension, dyslipidemia, coronary artery disease, and echocardiographic traits31. The strength of the “hypothesis-free” GWAS approach is an unbiased look at the entire genome, with the consequence that all genetic variation is regarded equally. However, for any given disease process, each gene has a different likelihood of involvement in pathogenesis, which is not considered in most whole-genome approaches. Thus, the chance of false positive associations is enhanced, and the true associations identified are dominated by variants with modest effect sizes, such as the 9p21 locus associated with myocardial infarction11. By contrast, our approach sacrifices an unbiased view for one that leverages years of accumulated knowledge regarding the biology of cardiovascular disorders. This may reduce the likelihood of false-positive associations and identifies smaller genic regions that suggest specific pathogenetic mechanisms, but at a cost of ignoring regions of the genome that may contain important but unanticipated variants that predispose to heart failure. We thus regard GWAS and large-scale candidate gene studies as complementary approaches.
We studied patients with advanced heart failure and substantial ventricular dysfunction recruited from referral populations. By focusing on an extreme phenotype we are more likely to identify genetic variants that relate to myocardial dysfunction per se, rather than genetic variants that influence its clinical antecedents. Indeed, adjusting for hypertension or diabetes had little influence on our findings. We note that our findings may not be generalizable to early-stage heart failure in population-based studies, in which nearly half have preserved systolic function32. However, a comparison of the Framingham 100K heart failure GWAS findings with our own data suggests that there will be some overlap. Two heart failure genes (C9orf39 and RYR2) identified in Framingham had adequate proxy SNPs on the IBC array (Supplemental Table 6). Although C9orf39 showed no association with heart failure in our data, several variants in RYR2 showed modest associations with all-cause heart failure (P=0.016) and nonischemic heart failure (P =8×10−4). We thus provide further evidence to support RYR2 as a heart failure susceptibility gene, which is of particular interest given its well established role in excitation-contraction coupling. As large-scale heart failure genetic studies proceed, it will be instructive to compare findings from referral-based cohorts with those from population-based studies such as the CHARGE33 and CARe consortia8.
We acknowledge several limitations. As in all case-control studies, misclassification of cases and controls and limitations in sample size may reduce power to detect associations. Survival bias may also influence the composition of the cases, since some heart failure patients will die at an earlier stage and not have the opportunity to be recruited. Here we are reassured by independent replication of our principal findings. Like many common phenotypes, heart failure arises via derangements in a myriad of biological pathways and is rightly termed “complex.” Beyond this biological complexity, there is also substantial heterogeneity in the definition and diagnosis of heart failure, and this remains a particular challenge for studies of heart failure genetics. In this regard, detailed phenotyping in our discovery population, including clinical evaluation, cardiac imaging, and assessment for coronary artery disease in both cases and in controls, is a major strength. Finally, although we replicated our top SNPs identified in primary analyses, other promising variants identified in secondary analyses, such as ARHGAP26 and LIF (Supplemental Table 1), could benefit from further study in larger populations.
In conclusion, we have utilized a large-scale candidate gene approach to identify and replicate two common genetic variants associated with risk of advanced heart failure in Caucasians. These findings implicate regions containing HSPB7 and FRMD4B as novel susceptibility loci for advanced heart failure. In an era of genome wide association studies, our study thus demonstrates how knowledge of candidate genes can be leveraged as a complementary strategy to discern the genetics of complex disorders.
Funding sources: This work was funded by NIH R01HL088577, R21HL092379, R01HL087871, P50 HL077101, P50HL077113, RC2 HL102222, and the Penn Cardiovascular Institute.