A 58kb region on chromosome 9p21.3 has consistently shown strong association with coronary artery disease (CAD) in multiple genome-wide association studies in populations of European and East Asian ancestry. In this study we sought to further characterize the role of genetic variants in 9p21.3 in African American individuals.
Apparently healthy African American siblings (n=548) of patients with documented CAD <60 years of age were genotyped and followed for incident CAD for up to 17 years. Tests of association for 86 SNPs across the 9p21.3 region in a GEE logistic framework under an additive model adjusting for traditional risk factors, family, follow-up time, and population stratification were performed. A single SNP within the CDKN2B gene met stringent criteria for statistical significance, including permutation-based evaluations. This variant, rs3217989, was common (minor allele [G] frequency 0.242), conveyed protection against CAD (OR=0.19, 95% CI: 0.07 to 0.50, p=0.0008) and was replicated in a combined analysis of two additional case/control studies of prevalent CAD/MI in African Americans (n=990, p=0.024, OR= 0.779, 95% CI: 0.626-0.968).
This is the first report of a CAD association signal in a population of African ancestry with a common variant within the CDKN2B gene, independent from previous findings in European and East Asian ancestry populations. The findings demonstrate a significant protective effect against incident CAD in African American siblings of persons with premature CAD, with replication in a combination of two additional African American cohorts.
Although coronary artery disease (CAD) is the leading cause of death in African American men and women1, 2, there is a paucity of data regarding genetic variants related to CAD in persons of African compared to European ancestry. Consistently implicated in several large genome-wide association studies (GWAS) in primarily white, North American or Northern European populations3-7, the 9p21.3 chromosomal region appears to harbor a locus associated with CAD risk. Significant associations of specific single nucleotide polymorphisms (SNPs) in this region have since been replicated in Chinese8, 9, Italian10, 11, South Korean12, 13, Japanese13, 14, and American Hispanic15 populations but not in African Americans4, 15. The strongest associations are all within a 58 kb region (chromosome 9: 22,062,301 - 22,120,389)4 containing multiple SNPs in tight linkage disequilibrium (LD) in European derived populations, but no known protein-coding genes. Nonetheless, the 9p21 locus appears to consistently convey risk independent of conventional CAD risk factors in general population studies4, 11. The strength of association of the 9p21 locus with CAD was recently demonstrated in a systematic review of 47 distinct data sets, including 8 Asian studies, with 35,872 cases and 95,837 controls, but none with African Americans16.
Two cyclin-dependent kinase inhibitor genes, CDKN2A and CDKN2B, located ~115kb away from the peak association signal in the 9p21.3 locus, have been suggested as the likely susceptibility genes in this region, and their translated proteins are thought to play important roles in cell cycle regulation17. More recently, a large noncoding antisense transcribed RNA named ANRIL (CDKN2BAS) has been implicated as a possible regulatory element 11, 18. The physical location of ANRIL is notable because it includes the SNPs previously strongly associated with CAD; it also overlaps (14763bp) with the physical gene locus CDKN2B.
Early-onset CAD is more heritable than CAD occurring at older ages19. CAD at particularly young ages aggregates strongly in families, accounts for 60% of all CAD occurring prior to age 6520, and independently conveys an excess risk of CAD in first degree relatives 2-5 times that of the general population19, 21, 22. We have previously shown a high prevalence of CAD risk factors23 and subclinical CAD24 as well as a higher than expected incidence of CAD events25 in African Americans with a family history of premature CAD. In the current study, we characterize the association of 9p21.3 genetic variants with incident CAD in initially healthy African American siblings of persons with documented early-onset CAD (<60 years of age). We test for replication of findings in two additional African American populations.
Between 1990 and 2002, GeneSTAR (Genetic Study of Atherosclerosis Risk), a prospective study of families with premature CAD, enrolled 548 asymptomatic, apparently healthy young African American siblings (<60 years of age) of 278 patients with premature CAD. Mean sibship size was 1.97 ± 1.2. All siblings had DNA isolated and stored at the time of enrollment.
Siblings were identified from probands hospitalized with documented CAD including acute myocardial infarction (MI) (n=40), coronary artery bypass surgery (CABG) (n=81), percutaneous coronary intervention (PCI) (n=109), angina with angiographic evidence of flow-limiting coronary stenosis (n=40), or following sudden cardiac death (n=8). Their siblings were eligible if they were <60 years of age and had no known history of CAD. Siblings were excluded if they had autoimmune disease or life-threatening co-morbidity (e.g. AIDS, cancer), or were receiving chronic glucocorticosteroid therapy, as previously described26. The study was approved by the Johns Hopkins Medicine Institutional Review Board and all study participants gave informed consent.
All eligible siblings underwent a baseline comprehensive risk factor screening following a 12-hour overnight fast. A physical examination was performed, blood was taken for lipid and glucose levels, and a complete medical history was elicited. Cardiac risk factors were defined using standard thresholds and methods as previously described25. Participants were followed at five-year intervals for up to 17 years after baseline screening for incident CAD events. A trained telephone interviewer elicited a history of any cardiac related procedures or symptoms, the use of any coronary disease or risk factor related medications, and any history of CAD. Data was also recorded on all CAD diagnostic tests and results. In the event of a death, the closest family member was interviewed as a proxy. All reported CAD events and potential CAD events based on diagnostic procedures and therapies were reviewed from physician records, hospital records, death records, and autopsy records using standardized methods as defined in the Framingham Heart Study, using the same classifications and definitions27. The first CAD event during the follow-up period was recorded and included sudden cardiac death, MI, unstable angina with CABG, unstable angina with PCI, stable angina with CABG or PCI, and medically treated angina with no intervention. Each person could only enter CAD event modeling once. All records were independently reviewed by three investigators, and any single discordant classification was referred to external cardiology reviewers as previously described using a standardized coding schema25.
The Emory Genebank Study was designed to investigate the relationship between biochemical and genetic factors with CAD in subjects undergoing cardiac catheterization. Study participants were enrolled at the Emory University Hospital, Emory University Hospital Midtown, Emory Clinic and Grady Memorial Hospital in Atlanta, Georgia. Subjects with ≥ 50% stenosis in one or more coronary arteries or with prior history of MI, CABG or percutaneous transluminal coronary angioplasty (PTCA) were defined as cases (n=321). Subjects with no evidence of coronary artery disease on cardiac catheterization and no prior history of MI or CAD were defined as controls (n=146). The appropriate Institutional Review Board approved the study and all subjects provided written informed consent. Information on ethnicity was self-reported.
The study participants were enrolled at Duke University Medical Center (Durham, North Carolina) through the CATHGEN biorepository, consisting of subjects greater than 18 years of age, recruited sequentially through the cardiac catheterization laboratories from 2001-2005. Biological samples and extensive clinical, angiographic, and longitudinal follow-up data were collected on all subjects consenting to participation. Cases of MI (n=280) were defined as those having a history of MI (by self-report and corroborated by review of medical records using standardized criteria), or having suffered an MI during the study follow-up period using the same standardized methods to classify events. Controls (n=243) were defined as those with no previous history or evidence of significant CAD prior to or subsequent to cardiac catheterization, including MI, coronary revascularization, cardiomyopathy with an ejection fraction on left ventriculography <40%, or significant CAD on coronary angiography defined as a CAD index28 ≤ 23 and no coronary vessel with a stenosis > 50%. This study was approved by the Duke University Medical Center Institutional Review Board on Human Subjects and all subjects gave written informed consent.
In GeneSTAR, SNP genotyping was performed at deCODE Genetics, Inc. using the Human 1Mv1_C array from Illumina, Inc. where 1,044,094 markers were released with an average call rate per sample of 99.65% and an overall missing data rate of 0.35%. The Illumina 1M array did not include rs10757278, shown previously to be associated with CAD5, and this SNP was genotyped separately using a Taqman assay on an ABI Prism 7900HT Sequence Detection System at deCODE Genetics, Inc., so as to avoid reliance on in-silico genotyping methodologies such as imputation. A total of 100 SNPs that mapped to the 9p21.3 region (chromosome 9: 21920505 – 22128762) were available for these analyses, of which 86 were non-monomorphic and informative for tests of association. We used PLINK29 to detect and remove Mendelian errors. Hardy-Weinberg equilibrium (HWE) and minor allele frequency (MAF) for each SNP were assessed in a defined set of independent subjects (n=326) representing the founders of the pedigrees. We detected no deviation from HWE for any of the SNPs at our threshold of 0.0005 (i.e. p= 0.05/number of SNPs tested). Admixture estimates were obtained using a subset of 18,982 SNPs from the GWAS array that were selected on the basis of low SNP correlation and high genetic distance between populations (FST), optimal for the differentiation of CEPH, Yoruban, and Chinese+Japanese HapMap (www.hapmap.org) samples selected as ancestral reference populations. Using STRUCTURE (v2.2; http://pritch.bsd.uchicago.edu/software) the mean estimated Yoruban ancestry in these 548 African American siblings was 79.29% (range 41.41% - 99.98%).
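The HWE screen and its Bonferroni-style threshold can be illustrated with a minimal sketch. It assumes a simple 1-degree-of-freedom chi-square goodness-of-fit test on genotype counts; the text does not say which HWE test was actually run, and the function names and genotype counts below are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (minor allele frequency, p-value) for a 1-df chi-square HWE test.

    n_aa, n_ab, n_bb are genotype counts among unrelated subjects.
    This is an illustrative test choice, not necessarily the one used in the study.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return min(p, q), p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical genotype counts for 326 founders (not study data).
    maf, p_hwe = hwe_chi_square(190, 115, 21)
    threshold = bonferroni_threshold(0.05, 100)  # 0.05 / 100 SNPs = 0.0005
    print(f"MAF={maf:.3f}, HWE p={p_hwe:.3f}, flagged={p_hwe < threshold}")
```

A SNP would be flagged for deviation from HWE only when its p-value falls below the per-SNP threshold.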
The SNP rs3217989 was genotyped at deCODE Genetics, Inc. using a Taqman assay on an ABI Prism 7900HT Sequence Detection System.
In the GeneSTAR population, logistic regression models were used to test for association between each individual SNP and incident CAD under the log-additive model (i.e. a linear recoding of the SNP as 0/1/2 for 0, 1 and 2 copies of the minor allele). Regression models were implemented in the generalized estimating equation (GEE) framework with an exchangeable covariance matrix to correct for familial correlation30. Data were analyzed adjusting for traditional CAD covariates including age, gender, LDL cholesterol, HDL cholesterol, triglycerides, systolic blood pressure, body mass index, fasting-plasma glucose, smoking status, and study follow-up time. Analyses were carried out using SAS v 9.1.3 [SAS Institute Inc., Cary, NC] and SUDAAN v 10.0 [Research Triangle Institute, Research Triangle Park, NC]. Given the range of Yoruban ancestry observed in the GeneSTAR African American siblings noted above, principal components-based estimates of admixture were obtained using the smartpca program in EIGENSOFT31. Analyses were repeated with and without the inclusion of the first two eigenvectors in the logistic regression models. Because the correction for admixture did not affect the results, the models we present do not include admixture. Estimates of LD and LD block definition were calculated using Haploview 4.232.
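To make the log-additive coding concrete, the following sketch recodes a genotype as the number of minor-allele copies and shows what a per-allele odds ratio implies for each genotype. The function names and the genotype-string format are assumptions for illustration; the GEE fitting itself (done in SAS and SUDAAN in the study) is not reproduced here.

```python
def minor_allele_dosage(genotype, minor_allele):
    """Count copies (0, 1 or 2) of the minor allele in a two-letter genotype string."""
    if len(genotype) != 2:
        raise ValueError("expected a two-allele genotype such as 'AG'")
    return sum(1 for allele in genotype if allele == minor_allele)


def implied_odds_ratio(per_allele_or, dosage):
    """Under a log-additive model each extra copy multiplies the odds by the same factor."""
    return per_allele_or ** dosage


if __name__ == "__main__":
    # Per-allele OR of 0.19 is the value reported for rs3217989 (minor allele G).
    for genotype in ("AA", "AG", "GG"):
        dosage = minor_allele_dosage(genotype, "G")
        print(genotype, dosage, round(implied_odds_ratio(0.19, dosage), 4))
```

Because the model is linear on the log-odds scale, the odds ratio for two copies relative to none is the square of the per-allele odds ratio.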
We performed permutation tests for all SNPs with a MAF > 4% to assess the genotype-specific null distributions in R (v. 2.9.0), which allowed for valid inferences in comparison with the observed test statistics. Event outcomes were shuffled 10,000 times across families of the same size, and then within families (thereby, keeping the number of affected individuals in families the same, and preserving the LD structure between SNPs). For each SNP, a p-value was recorded in each of the shuffles. To evaluate the significance of the association, we obtained an overall p-value for each genotype by counting how many of the 10,000 p-values obtained by shuffling the data were below the p-value observed in the original data. We considered the association of SNPs with CAD significant if this permutation p-value withstood a Bonferroni correction, controlling the family-wise error rate at 5%.
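The shuffling scheme and the counting step can be sketched as follows. This is a simplified Python illustration (the study used R), assuming outcomes are stored as one 0/1 list per family; the function names and data layout are hypothetical, and the per-shuffle association test that yields each permuted p-value is left out. Permuted p-values at or below the observed one are counted here, which is one reasonable reading of the description.

```python
import random
from collections import defaultdict


def shuffle_outcomes(families, rng):
    """Permute event outcomes across families of equal size, then within each family.

    families maps a family id to a list of 0/1 outcomes, one per sibling.
    The number of events per family-size stratum is preserved.
    """
    by_size = defaultdict(list)
    for fam_id, outcomes in families.items():
        by_size[len(outcomes)].append(fam_id)
    shuffled = {}
    for fam_ids in by_size.values():
        vectors = [list(families[f]) for f in fam_ids]  # copies, input is untouched
        rng.shuffle(vectors)                            # swap among same-size families
        for fam_id, vec in zip(fam_ids, vectors):
            rng.shuffle(vec)                            # reorder within the family
            shuffled[fam_id] = vec
    return shuffled


def empirical_p_value(observed_p, permuted_ps):
    """Fraction of permuted p-values at or below the observed p-value."""
    hits = sum(1 for p in permuted_ps if p <= observed_p)
    return hits / len(permuted_ps)


def passes_bonferroni(permutation_p, n_tests, alpha=0.05):
    """True if the permutation p-value survives a Bonferroni correction."""
    return permutation_p <= alpha / n_tests


if __name__ == "__main__":
    rng = random.Random(2024)
    toy = {"fam1": [1, 0], "fam2": [0, 0], "fam3": [0, 1, 0]}  # hypothetical data
    print(shuffle_outcomes(toy, rng))
```

Genotypes stay attached to individuals while only outcomes move, which is why the LD structure between SNPs is preserved across shuffles.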
SNPs that met our stringent permutation-based criteria for association with incident CAD in the discovery GeneSTAR population were genotyped for replication in a total of 990 individuals (601 cases and 389 controls) using data from the Emory Genebank Study and CATHGEN cohorts. Each study first separately performed an age- and sex-adjusted log-additive logistic regression analysis for each SNP to test for association with CAD outcomes. Using the age and sex adjusted calculated ORs, 95% CIs, and the p-values from these two populations, we performed a SNP-based meta-analysis using a Mantel-Haenszel fixed effect model to calculate the overall combined OR, 95% CI, and p-value, taking the direction of allelic effect into account.
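Combining per-study odds ratios and confidence intervals can be sketched with generic inverse-variance fixed-effect pooling on the log-odds scale. Note that this swaps in inverse-variance weighting as an illustration; it is not claimed to reproduce the Mantel-Haenszel calculation named above, and the study-level inputs in the example are hypothetical.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% normal quantile


def se_from_ci(lower, upper):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def fixed_effect_meta(studies):
    """Pool (odds_ratio, ci_lower, ci_upper) tuples with inverse-variance weights.

    Returns (pooled_or, ci_lower, ci_upper, two_sided_p).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for odds_ratio, lower, upper in studies:
        weight = 1.0 / se_from_ci(lower, upper) ** 2
        weighted_sum += weight * math.log(odds_ratio)
        total_weight += weight
    beta = weighted_sum / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return (math.exp(beta),
            math.exp(beta - Z95 * se),
            math.exp(beta + Z95 * se),
            p_value)


if __name__ == "__main__":
    # Hypothetical per-study estimates, not the Genebank or CATHGEN values.
    hypothetical = [(0.80, 0.59, 1.08), (0.76, 0.56, 1.03)]
    print(fixed_effect_meta(hypothetical))
```

Working on the log scale keeps the direction of the allelic effect explicit: a pooled log-odds below zero corresponds to a protective minor allele.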
The GeneSTAR population had a baseline mean age of 46.9±7.0 years (range 26 to 60 years of age). Baseline population characteristics are shown in Table 1. Traditional CAD risk factors are prevalent, as we have previously reported23. During a mean follow-up time of 8 ± 3 years (range 5 to 17 years), there were 35 CAD events (77% acute coronary syndromes, which included acute MI and unstable angina with revascularization, and 23% stable symptomatic CAD with angiographic evidence of >50% stenosis in at least one epicardial coronary vessel, with and without revascularization).
Logistic regression analyses of 86 individual informative SNPs with incident CAD, adjusting for age, sex, LDL cholesterol, HDL cholesterol, triglycerides, fasting blood glucose, systolic blood pressure, current smoking status, body mass index, and years of follow-up time were performed using 35 incident CAD cases and 513 controls. Considering the LD structure between these SNPs, we considered a simple Bonferroni correction for multiple testing of all SNPs to be overly conservative. Thus, we used 10,000 permutation tests to assess the significance of the individual SNPs (Figure 1). One SNP, rs3217989, met the threshold for significance and was strongly associated with incident CAD, independent of traditional CAD risk factors. This SNP, rs3217989, is located in the 3′UTR of the CDKN2B gene and its minor allele (MAF=0.242) is protective against incident CAD, with an OR=0.19 (95% CI: 0.07 to 0.50, p=0.0008).
Figure 2 shows the chromosomal region of 9p21.3 (chromosome 9: 21,920,505 – 22,128,762) illustrating the physical location and level of significance of SNPs associated with CAD and the LD structure between the 86 SNPs in the GeneSTAR African American siblings.
Of the nine SNPs that passed a nominal threshold at the asymptotic level, a single SNP passed the permutation-based approach (rs3217989, Table 2) and is independent of the previously published region on 9p21 (rs10757278, Table 3). Table 3 demonstrates very low correlations in the nine GeneSTAR SNPs with the two most significant GeneSTAR SNPs (rs3217989 and rs17761446), and with the previously published lead SNP (rs10757278).
Because it was not represented on the GWAS array used, we separately genotyped rs10757278, one of the most frequently reported SNPs in the 9p21 region associated with MI and CAD in populations of non-African ancestry5. We found no significant association between this SNP and CAD (p=0.709, MAF=21%). Furthermore, rs10757278 appears to be uncorrelated with the peak SNP, rs3217989, in our data (Table 3). Of note, we observed a p-value of 0.002 at rs17761446 (MAF=5.4%), a SNP which is 6.4kb away from rs10757278. However, this SNP did not pass our permutation-based criteria for statistical significance. Additionally, there was no correlation between rs17761446 and rs10757278 (r2 = 0.017 in our data), suggesting that rs17761446 is not likely to be representative of the peak signal previously published at rs10757278.
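For readers unfamiliar with the r2 statistic, a rough sketch is the squared Pearson correlation between minor-allele counts at two SNPs. This genotype-based version is an approximation offered for illustration; it is not the haplotype-based estimate produced by Haploview, and the dosage vectors below are hypothetical.

```python
def genotype_r2(dosage_x, dosage_y):
    """Squared Pearson correlation between two vectors of 0/1/2 allele counts."""
    if len(dosage_x) != len(dosage_y) or not dosage_x:
        raise ValueError("dosage vectors must be non-empty and of equal length")
    n = len(dosage_x)
    mean_x = sum(dosage_x) / n
    mean_y = sum(dosage_y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(dosage_x, dosage_y))
    sxx = sum((a - mean_x) ** 2 for a in dosage_x)
    syy = sum((b - mean_y) ** 2 for b in dosage_y)
    if sxx == 0 or syy == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical dosages for six individuals at two SNPs.
    snp_a = [0, 1, 2, 0, 1, 0]
    snp_b = [0, 0, 1, 2, 0, 1]
    print(round(genotype_r2(snp_a, snp_b), 3))
```

Values near zero, such as the 0.017 reported above, indicate that one SNP carries almost no information about the other.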
The MAFs of rs3217989 in the Emory Genebank Study and CATHGEN African Americans were similar to that found in the GeneSTAR population (0.264 and 0.280, respectively). Genebank and CATHGEN each separately showed a trend for a protective effect of the minor allele of rs3217989 (p=0.15 and 0.08, respectively), with an overall significant combined meta-analysis result (meta-analysis OR=0.78, 95% CI: 0.63-0.97, p=0.02). The population characteristics of Genebank and CATHGEN can be found in Supplementary Table 1.
This is the first and only study in African Americans to date to show a significant association of CAD with any gene variant within the chromosome 9p21.3 locus. This SNP, rs3217989, is located within the 3′UTR of CDKN2B and is independent from the LD block previously associated with CAD in non-African ancestry populations. In our GeneSTAR cohort the protective effect of the minor allele is potent, with roughly a five-fold decrease in the odds of incident CAD (OR=0.19). The direction of effect was replicated and also found to be significant in a meta-analysis of two populations of Americans with similar African ancestry in North Carolina and Georgia.
We postulate that the heterogeneity in the magnitude of the odds ratio estimates in the discovery and replication samples is a function of study design. The cumulative CAD incidence in GeneSTAR was 6% and by design, GeneSTAR CAD events occurred in persons who were likely close in age-range, genetic susceptibility, and shared environment to the proband. The GeneSTAR sibling CAD events were thus likely causally more homogeneous as compared to all possible causes of CAD in the general population. In contrast, the population-based case-control studies, which are 50% cases and 50% controls by design, likely have greater variety of genetic and environmental factors among those subjects with CAD events. Given this greater probable causal heterogeneity, these studies would be expected to demonstrate an odds ratio closer to the null than that seen in the discovery cohort. The observation of a consistent protective effect between the discovery and replication samples supports our finding in the discovery population. Although it would have been desirable to have more replication populations of greater size, very few African American cohorts are fully phenotyped for CAD or have GWAS data. The fact that our finding remains robust in small samples supports the likelihood that it is real.
No other study has published any associations with any gene variants, identified by GWAS, and CAD in African American populations. In the only GWAS reporting CAD results in African Americans, the 9p21 polymorphisms studied were not significantly associated with CAD4. However, in studies published to date, the statistical power has been limited4. One study in African Americans was limited to three significant SNPs identified from a GWAS in populations of European ancestry and there was again no detectable significance15. Our results in African American families show that rs3217989 is in a different LD block than the lead SNPs previously found in the non-protein coding region in 9p21 in other populations. However, we also acknowledge that our study had limited statistical power to detect the previously reported association of these lead SNPs, including rs10757278, with CAD. Nonetheless, we were able to show that rs3217989 was independent of rs10757278 and therefore the lack of association of rs10757278 does not diminish the significance of our primary findings.
SNP rs3217989 appears to be monomorphic in populations of European ancestry in the Human Genome Diversity Panel data (http://hgdp.uchicago.edu/cgi-bin/gbrowse/HGDP/). Given the more fragmented LD structure in African Americans, the finding of a protective allele at rs3217989 suggests a functional variant located closer to rs3217989 in persons of African ancestry than the traditional previously published 9p21 intergenic locus.
The 9p21 locus previously identified in most studies has very few gene candidates. Previous GWAS studies and subsequent replications have reported that most associations at genome-wide significance occur in a large haplotype block upstream and independent from two major genes, CDKN2A and CDKN2B 4. These genes encode protein inhibitors of cyclin-dependent kinases, p16INK4a and p15INK4b, respectively, expressed at high levels in endothelial and inflammatory cells, and are also thought to help regulate cell proliferation, cell aging, and apoptosis33-35.
ANRIL (CDKN2BAS), which encodes a large antisense non-coding RNA, is another candidate locus in this region. ANRIL spans 126.3 kb and consists of 20 exons subject to alternative splicing, including the first two exons, which appear to overlap two exons of CDKN2B36. ANRIL expression has been documented in atheromatous vessels, vascular endothelial cells, monocyte-derived macrophages, and coronary smooth muscle cells11. In a subset of individuals in the Ottawa Heart Study18 the previously identified 9p21 risk alleles were associated with ANRIL mRNA of differential lengths. Furthermore, expression of CDKN2B was correlated with the long variant of ANRIL, suggesting that the 9p21 risk alleles may be biologically tied to atherosclerosis risk through CDKN2B expression. Given these important findings, it is possible that the association of the CDKN2B gene variant with CAD in GeneSTAR African Americans is related to an alteration in ANRIL. However, the GeneSTAR CDKN2B SNP is located at the 5′ end of ANRIL whereas the aforementioned SNPs are located more at the 3′ end of ANRIL, from exon 13 to 20, where ANRIL splicing variation has been shown18. Alternatively, the variant we report in the 3′UTR of CDKN2B may contribute to an alteration in CDKN2B expression and/or function independently of ANRIL. For example, the 3′UTR could contain a regulatory binding site for transcription or stability of the coded protein.
We have found a novel variant in CDKN2B located in an LD block distinct from that of the prior European-derived signals in 9p21. This variant appears to be protective against CAD in African Americans. Further investigation of the CDKN2B gene and protein as well as the non-coding RNA ANRIL in persons of African ancestry will be necessary to provide additional clues as to the biological mechanisms of association between chromosome 9p21.3 and CAD.
This work was supported by grants HL072518 from the National Heart, Lung, and Blood Institute, and M01-RR00052 from the National Center for Research Resources, National Institutes of Health, Bethesda, MD. Genotyping of the Emory and Duke samples at deCODE genetics was supported by NIH grant R01HL089650-02 from the National Heart, Lung and Blood Institute.