D.-F.G. conceived of and designed the study, and supervised all the sample selection, genotyping, data analysis and interpretation. B.-Q.Q., D.-P.L., and X.-Z.P. participated in the study design and interpretation. Genotyping experiments were performed by S.-F.C., Y.-Y.S., L.Z., and H.-F.L. under the supervision of D.-F.G., L.H., J.C. and L.-Y.C. DNA sample preparation was carried out by L.-Y.W., H.-F.L., T.-C.W., Y.-T.M., Q.Z., Y.L., D.-H.Y., Q.-Q.W., Y.Y., F.-C.L., Q.-X.M., X.-H.L., J.-F.J., X.-B.M., D.-J.L., X.-H.L., and C.-L.D. Phenotype collection and data management were conducted by J.-F.H., S.-F.C., J.-X.L., J.C., J.-C.C., D.-H.L., J.-P.C., X.-F.D., T.-C.W., L.-G.W., Y.-T.M., Z.-J.F., Y.L., L.-C.Z., X.-Y.Z., F.-H.L., Z.-D.L., C.-L.Y., C.S., X.-D.P., L.Y., X.-H.F., L.-H.X., J.-J.M., X.-P.W., R.-P.Z., N.-Q.W., X.-L.L., M.-Q.W., D.-S.H., X.J., D.-S.G., D.-L.S., P.-P.C., G.-P.C., X.-G.W., L.-Y.C., Y.-J.Y., Y.-D.T., X.-D.L., Z.-H.D., Z.-L.Y., Q.-J.M., D.W., R.-P.W., and J.Y. H.S., N.J.S., S.K., M.P.R., and J.E. provided the CARDIoGRAM Consortium data. Statistical analysis was performed by X.-F.L., X.-L.Y., Y.-C.H., D.-L.G., C.-C.G., and R.-S.C. The manuscript was written by D.-F.G., X.-F.L., and L.-Y.W. All authors reviewed the manuscript.
We performed a meta-analysis of two genome-wide association studies of coronary artery disease comprising 1,515 cases with coronary artery disease and 5,019 controls, followed by de novo replication studies in 15,460 cases and 11,472 controls, all of Chinese Han descent. We identified four new loci for coronary artery disease reaching genome-wide significance (P < 5 × 10−8), which mapped in or near TTC32-WDR35, GUCY1A3, C6orf10-BTNL2 and ATP2B1. We also replicated four loci previously identified in European populations (PHACTR1, TCF21, CDKN2A/B and C12orf51). These findings provide new insights into biological pathways underlying susceptibility to coronary artery disease in the Chinese Han population.
Coronary artery disease (CAD), and its most severe complication myocardial infarction (MI), are leading causes of death and disability worldwide1,2. Recent genome-wide association studies (GWAS) of CAD have discovered multiple chromosomal regions3. However, most of the GWAS have focused on samples of European origin, and the identified loci altogether explain only a small fraction of the risk for CAD. Moreover, the variants identified in populations of European descent might not be applicable in other ethnic groups because of underlying genetic heterogeneity. Therefore, larger scale studies in Chinese and other non-European populations are needed to reveal new susceptibility loci and improve our understanding of causal pathways to CAD.
Herein, we carried out a two-stage GWAS of CAD in a sample of ~33,000 Han Chinese (Supplementary Fig. 1). In the discovery stage (stage 1), we performed GWAS of CAD in two independent Chinese samples, the Beijing Atherosclerosis Study (BAS) and the China Atherosclerosis Study (CAS). BAS consisted of 509 cases of MI and 1,034 controls genotyped using the Affymetrix Human Mapping 500K Array Set including 500,568 SNPs. CAS consisted of 1,034 cases of CAD and 4,245 controls genotyped using the Affymetrix Axiom™ Genome-Wide CHB 1 Array Plate including 657,124 SNPs. After a series of quality-control procedures (Online Methods), we retained 367,129 autosomal SNPs in 505 cases and 1,021 controls in BAS, and 613,724 autosomal SNPs in 1,010 cases and 3,998 controls in CAS (Supplementary Table 1). To facilitate combining results of genome-wide association scans based on the two genotyping platforms, we imputed missing genotypes based on reference haplotypes from the phased CHB+JPT HapMap data release 22 reference dataset4. We performed a meta-analysis of the two GWAS with association data for approximately 2.2 million genotyped or imputed autosomal SNPs.
Principal component analysis showed minimal evidence for population stratification in these study populations (Supplementary Fig. 2). The quantile-quantile plot showed that the distribution of observed P values deviated from expected P values only in the extreme tail (Supplementary Fig. 3). The genomic inflation factors were 1.04 and 1.04 for BAS and CAS, respectively, and 1.05 for the two studies combined, indicating that population stratification effects were negligible in our study samples.
In this stage-1 discovery analysis, we used a set of criteria to select 96 promising SNPs (Online Methods) and genotyped them in a further replication analysis in a case-control sample including 8,803 CAD cases and 5,183 controls (replication 1) (Supplementary Table 1). We found that two association signals reached genome-wide significance (defined as P < 5 × 10−8) in replication 1 alone. These included one locus (4 SNPs on 9p21.3) previously identified in populations of European descent and one newly discovered locus (rs9268402 at C6orf10-BTNL2). After meta-analysis of the discovery and replication 1 studies, eight additional variants were associated with CAD at a significance level of P < 1 × 10−5. These eight SNPs were further examined in an independent sample (2,408 cases and 2,103 controls) (replication 2). The results for the selected SNPs in the discovery, replication 1, and replication 2 samples are summarized in Supplementary Table 2.
When combining the discovery data with those from replications 1 and 2, we found ten SNPs at seven regions associated with CAD at a pre-specified threshold for genome-wide significance of P < 5 × 10−8. Among these SNPs, the four (rs9632884, rs10757274, rs1333042, rs1333049) on 9p21.3, rs9349379 on 6p24.1 in PHACTR1, and rs11066280 on 12q24.13 near C12orf51 confirmed associations previously reported in Europeans5-11. In addition, rs12524865 on 6q23.2 near TCF21 showed consistent evidence of association in the discovery and replication stages, with the same direction of association previously reported in Europeans12, and nearly reached the genome-wide significance threshold in the combined analyses (P = 1.87 × 10−7) (Supplementary Tables 2 and 3). Moreover, we identified four new CAD loci in Chinese (P values ranging from 4.48 × 10−8 to 9.92 × 10−14): (i) rs2123536 on 2p24.1 near TTC32-WDR35; (ii) rs1842896 on 4q32.1 near GUCY1A3; (iii) rs9268402 on 6p21.32 near C6orf10-BTNL2; (iv) rs7136259 on 12q21.33 near ATP2B1.
To minimize the chance of false-positive signals at the four new loci, we performed an additional validation in an independent sample consisting of 4,249 cases and 4,186 controls (replication 3). All four SNPs showed significant association with CAD in this additional sample after adjustment for multiple testing (P < 0.05/4 = 0.0125), and integration of all results for the discovery, replication 1, replication 2, and replication 3 showed unequivocal associations of these loci with CAD (rs2123536, P = 6.83 × 10−11, OR = 1.12; rs1842896, P = 1.26 × 10−11, OR = 1.14; rs9268402, P = 2.77 × 10−15, OR = 1.16; rs7136259, P = 5.68 × 10−10, OR = 1.11) (Table 1). Figure 1 displays these loci in their genomic coordinate context.
Of the 4 new CAD loci, 4q32.1 (rs13139571) and 12q21.33 (rs2681492) were recently identified as risk loci for blood pressure in European studies13,14. We observed that rs13139571 and rs1842896 on 4q32.1 show very weak linkage disequilibrium (LD) (r2 = 0.002, D´ = 0.062 in HapMap CHB; r2 = 0.004, D´ = 0.123 in HapMap CEU), while rs2681492 and rs7136259 on 12q21.33 are in strong LD (r2 = 0.90, D´ = 0.95 in HapMap CHB; r2 = 0.11, D´ = 0.88 in HapMap CEU). To shed some light on the seemingly entangled relationships of the two loci with CAD and hypertension, we examined the associations of the two CAD SNPs (rs1842896 and rs7136259) with blood pressure in all control samples from the discovery and replication samples. Suggestive associations with blood pressure and hypertension were observed (Supplementary Table 4). We further examined whether hypertension could mediate the effect on CAD. After adjustment for hypertension, the associations with CAD remained genome-wide significant (P = 1.31 × 10−9, OR = 1.13 for rs1842896; P = 6.63 × 10−12, OR = 1.13 for rs7136259), indicating that the two CAD SNPs might be susceptibility markers independent of hypertension. rs1842896 is located 76.4 kb upstream of the GUCY1A3 locus. The GUCY1A3 gene encodes the α subunit of soluble guanylate cyclase (sGC), a key enzyme of the nitric oxide signaling pathway implicated in the pathogenesis of CAD and atherosclerosis. Preclinical studies have explored the therapeutic potential of sGC stimulators15. rs7136259 is near ATP2B1, which encodes PMCA1, a plasma membrane calcium ATPase that pumps calcium (Ca2+) ions out of the cytosol into the extracellular milieu16.
The C6orf10-BTNL2 region on 6p21.32 is a hotspot associated with immune-related diseases17-19. BTNL2 is a member of the immunoglobulin superfamily that probably functions as a T cell costimulatory molecule. It is noteworthy that rs2076530, a truncating splice site mutation in the BTNL2 gene17, is in strong LD with rs9268402 (r2 = 0.59). BTNL2 gene polymorphisms have been found to be associated with susceptibility to Kawasaki disease (KD)20, a vasculitis of young childhood that particularly affects the coronary arteries and carries an increased risk of developing ischemic heart disease in the future21. rs2123536 on 2p24.1 is located ~150 kb downstream of TTC32 and WDR35. TTC32 encodes a protein containing the tetratricopeptide repeat motif, which binds other peptides22. WDR35 encodes a member of the WD repeat protein family23, which is involved in cell cycle progression, signal transduction, apoptosis, and gene regulation.
The chromosome 12q24 region is of particular interest. The association signal on 12q24 spans ~0.7 Mb and rs11066280 is in almost perfect LD (r2 = 0.95~0.97) with rs3782886, rs4646776, rs671, rs2074356, and rs77768175 (Supplementary Fig. 4). Previous studies showed significant evidence supporting signatures of natural selection11,24 and pleiotropic effects (CAD10,11,25, plasma lipids24,26,27 and blood pressure14,24,28). All variants on 12q24 associated with CAD10,11 in Europeans are not polymorphic in Chinese, whereas all CAD-associated variants on 12q24 in Chinese are monomorphic in Europeans (Supplementary Table 5). In the present study, rs11066280 also showed significant or suggestive evidence of association with high-density lipoprotein cholesterol (HDL-C), triglycerides (TG), total cholesterol (TC), and blood pressure (Supplementary Table 4). Because there is no substantial evidence for functional variants in this locus, further in-depth analysis is needed to explain the long-range LD and uncover causal mechanisms for CAD in this region.
While the association between the CDKN2A/B locus on 9p21 and CAD was replicated in our study, we note that the LD structures of the 9p21 region differ between populations of European and Asian descent (Supplementary Fig. 5). The 4 SNPs showing significant CAD association in the 9p21.3 region were in almost perfect LD in populations of European descent (pairwise r2 from 0.84 to 0.90). In Chinese, however, two of the SNPs (rs10757274 and rs1333049) are in strong LD with each other (r2 = 0.78), but only in moderate LD with the other two SNPs (rs9632884, rs1333042) (pairwise r2 from 0.27 to 0.43, Supplementary Table 6). Conditional logistic analyses showed that the association evidence for rs1333042 or rs10757274 remained significant after controlling for the genetic effect at any of the other three SNPs (P values ranging from 3.71 × 10−8 to 7.96 × 10−9, and from 1.44 × 10−6 to 1.18 × 10−18, respectively) (Supplementary Table 7). Therefore, the two SNPs appear to represent independent signals.
We evaluated whether the CAD variants identified in Chinese were associated with CAD in Europeans using the results of the CARDIoGRAM consortium12, a meta-analysis of 22,233 cases and 64,762 controls (Supplementary Table 8). Of the four SNPs, rs2123536 (P = 0.0038, OR = 1.10) and rs7136259 (P = 0.035, OR = 1.03) showed nominal association with CAD in the population of European ancestry, and the direction of the effect was consistent with our findings. Associations for the other two SNPs were not detected in the CARDIoGRAM consortium data.
Conversely, we also investigated whether the 35 CAD-associated SNPs (in 29 loci) identified by previous GWAS in European populations were associated with CAD in our sample (Supplementary Table 9). In addition to the 4 loci that were confirmed by our discovery and replication studies (Supplementary Table 3), 7 loci within 1p32.2, 1q41, 10q23.31, 10q24.32, 11q22.3, 15q25.1 and 17p13.3 showed directionally consistent and nominally significant associations in the discovery study (P < 0.05) (Supplementary Table 9, Group 1). We observed 11 SNPs in 10 loci were monomorphic or had low minor allele frequencies (MAF ≤ 0.1) in Chinese Han population, which were quite different in European populations (Supplementary Table 9, Group 2). We examined the associations of other correlated SNPs in strong LD with these 11 SNPs in HapMap CEU data with CAD in Chinese. Of interest, the associations of the proxy SNPs in 3 loci, 3q22.3, 6q26 and 17p11.2, were also supported by our discovery analysis albeit with only suggestive evidence (P < 0.05). These data suggest that the difference in the LD structure may partially explain the discrepancy of the association between the European and Chinese populations. No associations were observed for the remaining 8 SNPs with common minor allele frequencies (MAF > 0.1) in the Chinese Han population (Supplementary Table 9, Group 3). The disparities between Chinese and European populations might be due to differences in genetic architecture and environmental factors, or insufficient power in the present study. In addition, a recent GWAS on CAD in the Chinese Han population29 identified rs6903956 on 6p24.1 (C6orf105) as a susceptibility locus. We could not replicate the association of rs6903956 in our data, though our present discovery analyses have >90% power to detect a SNP with an OR = 1.51 even using a P value threshold of 1.0 × 10−5.
To examine the effect of nine SNPs in aggregate on the risk of CAD, a CAD risk score was calculated as a weighted sum across the SNPs, combining the effect size and the dose of risk alleles at each SNP. The CAD risk score explains ~1.92% of the variance in risk for CAD. The mean CAD risk score of cases was significantly higher than that of controls (P < 1 × 10−74). A logistic regression model was applied to test the association of risk score categories with CAD. Compared with the bottom quintile, individuals in the top quintile of CAD risk score had a greater than twofold increased risk for CAD (OR = 2.34, 95% CI: 2.11-2.59). Risk of CAD across quintiles of CAD risk score is shown in Figure 2.
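The weighted sum described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that each weight is the natural log of the SNP's odds ratio and that quintiles are assigned by rank, neither of which the text specifies. The function names are hypothetical, and the four odds ratios are those reported above for the new loci, whereas the actual score used nine SNPs.

```python
import math


def risk_score(dosages, odds_ratios):
    """Weighted sum of risk-allele dosages (0 to 2 per SNP).

    Assumption: the weight for each SNP is ln(OR); the paper only says
    the score combines effect size and risk-allele dose.
    """
    if len(dosages) != len(odds_ratios):
        raise ValueError("need one dosage per SNP")
    return sum(math.log(o) * d for d, o in zip(dosages, odds_ratios))


def quintile_labels(scores):
    """Assign each score a quintile from 1 (lowest) to 5 (highest) by rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // n + 1
    return labels


# Illustrative ORs for rs2123536, rs1842896, rs9268402 and rs7136259.
ORS = [1.12, 1.14, 1.16, 1.11]
# Hypothetical individuals: risk-allele dosage at each of the four SNPs.
people = [[2, 1, 0, 1], [0, 0, 0, 0], [2, 2, 2, 2], [1, 0, 1, 0], [1, 2, 1, 2]]
scores = [risk_score(p, ORS) for p in people]
print(quintile_labels(scores))
```

Quintile membership computed this way could then be entered as a categorical predictor in a logistic regression, with the bottom quintile as the reference group.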
In conclusion, we identified four new loci for CAD (TTC32-WDR35, GUCY1A3, C6orf10-BTNL2 and ATP2B1) in Chinese and confirmed four previously reported loci (PHACTR1, TCF21, CDKN2A/B and C12orf51). These results suggest that both shared and unique genetic backgrounds of CAD are present between different ethnic groups and highlight the importance of fine-mapping efforts to pinpoint causal variants and mechanisms. Further study and integration of multiethnic GWAS findings will surely promote a fuller and better understanding of the global genetic architecture of CAD.
We performed a two-stage case-control analysis in participants of Chinese Han ancestry. The general characteristics of the study participants are summarized in Supplementary Table 1. In the discovery stage, we performed a meta-analysis of two independent GWAS, BAS and CAS. BAS30 consisted of 505 cases of MI and 1,021 controls. All participants were from Beijing, China. All cases had a validated history of MI and were verified by hospital records and by cardiologists according to a standard protocol31. Controls were randomly selected from subjects participating in a community-based survey of cardiovascular risk factors in Beijing. The control subjects were judged to be free of CAD by history, clinical examination, electrocardiography, and the Rose questionnaire32. Detailed data were collected through in-person interviews with each case and control. Subjects with congenital heart disease, cardiomyopathy, valvular disease, and renal or hepatic disease were excluded. CAS consisted of 1,010 cases of CAD and 3,998 controls. The 1,010 cases, from the northern provinces of China, were enrolled at Fuwai Hospital, National Center of Cardiovascular Diseases; 83.8% of cases had a family history of CAD. Diagnoses of MI followed strict diagnostic rules based on signs, symptoms, electrocardiograms and cardiac enzymes31. CAD subjects without a known history of MI had >70% stenosis in at least one major epicardial vessel, with the exception of the left main coronary artery, where >50% stenosis was sufficient to meet the diagnosis of CAD. Controls for CAS were recruited from the International Collaborative Study of Cardiovascular Disease in Asia (InterASIA in China)33. InterASIA used a four-stage stratified sampling method to select a nationally representative sample of the general population aged 35 to 74 years in China. A total of 15,838 persons completed the survey and examination in 2001, and a follow-up of this survey was conducted in 2008. The 3,998 controls were individuals from four northern field centers of InterASIA who did not develop incident CAD and had no family history of CAD during the 8-year follow-up period of the study.
In stage 2, replication analyses were conducted in three independent samples with a total of 15,460 cases and 11,472 controls (8,803 cases and 5,183 controls for replication 1; 2,408 cases and 2,103 controls for replication 2; 4,249 cases and 4,186 controls for replication 3). All the cases in the replication stage were recruited using uniform criteria, and the clinical information was collected using the same questionnaire as that used in CAS. For replication 1, CAD cases were recruited using a standardized protocol through collaboration among multiple hospitals in China, and controls were selected from samples of the China Collaborative Study of Cardiovascular Epidemiology. For replication 2, cases were enrolled from Fuwai Hospital, Beijing, and controls were selected from urban and suburban communities in Beijing. For replication 3, cases from northern China were enrolled from Fuwai Hospital, Beijing and other medical centers, and controls were selected from northern field centers of the China Cardiovascular Health Study (CCHS) project. CCHS has been a population-based investigation of risk factors for cardiovascular diseases in China since 2006. The controls for the stage 2 validation were selected using the same criteria as for the discovery populations.
Each study obtained approval from the institutional review boards of Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, and other medical institutions, and all participants gave written informed consent.
For BAS in the discovery stage, a total of 509 cases of MI and 1,034 controls were genotyped with the Affymetrix GeneChip® Human Mapping 500K Array Set, including 500,568 SNPs. Principal component analysis using EIGENSOFT34,35 was used to compare all samples with reference samples from the HapMap YRI, CHB, JPT and CEU panels. We excluded SNPs with minor allele frequency (MAF) < 0.01 in cases or controls (n = 100,865, including 46,048 monomorphic SNPs), genotype call rates below 95% in cases or controls (n = 20,030), or deviations from Hardy-Weinberg equilibrium (P value < 10−4, n = 12,544). We also excluded 17 samples because of gender discordance, high genotype missing rate (>3.0%), cryptic relatedness (IBD > 0.1875) or status as population outliers. After quality control, 1,526 samples and 367,129 autosomal SNPs remained for the subsequent analyses. For CAS in the discovery stage, a total of 1,034 cases of CAD and 4,245 controls were genotyped with the Axiom™ Genome-Wide CHB 1 Array Plate, which was designed for the Chinese population and includes 657,124 SNPs. After these quality control procedures, 5,008 samples and 613,724 autosomal SNPs were retained for subsequent analyses.
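The per-SNP exclusion criteria above (MAF < 0.01, call rate below 95%, Hardy-Weinberg P < 10−4) can be sketched from genotype counts as follows. This is an illustrative sketch only. The study does not state which Hardy-Weinberg test was used, so the 1-degree-of-freedom chi-square test here is an assumption, and the function names are hypothetical. In the study the MAF and call-rate filters were applied to cases and controls separately, which would correspond to calling `passes_qc` once per group.

```python
import math


def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from the three genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: chi-square test; the paper does not name the test used.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:  # monomorphic SNP: the test is undefined
        return 1.0
    chi2 = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_aa, n_ab, n_bb, n_missing,
              min_maf=0.01, min_call_rate=0.95, min_hwe_p=1e-4):
    """True if the SNP survives the MAF, call-rate and HWE filters."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    return (maf(n_aa, n_ab, n_bb) >= min_maf
            and call_rate >= min_call_rate
            and hwe_chi2_p(n_aa, n_ab, n_bb) >= min_hwe_p)


print(passes_qc(25, 50, 25, 0))   # genotype counts in exact HWE proportions
print(passes_qc(50, 0, 50, 0))    # no heterozygotes: strong HWE deviation
```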
In stage 2, 96 SNPs were selected and genotyped using TaqMan SNP Genotyping Assays on the Fluidigm EP1 platform in the replication 1 sample. Of the 96 SNPs genotyped, 5 SNPs with Hardy-Weinberg equilibrium P value < 0.001 were removed from further association analysis. To assess genotyping reproducibility, 48 duplicate samples were genotyped, and the concordance rate was over 99.4%. Nine non-9p21 SNPs with P < 1 × 10−5 in the combined discovery and replication 1 analysis were selected for genotyping on the iPLEX MassARRAY platform (Sequenom) in the replication 2 sample; rs9268402 was not taken forward into replication 2 because of difficulty in the design of the replication array, leaving 8 SNPs for replication. The concordance rate for 96 replicate samples was 99.7%. To evaluate the consistency of the genotype data between genotyping platforms, 48 random replication samples genotyped on the Fluidigm EP1 platform were re-genotyped for the 8 SNPs on the iPLEX Sequenom MassARRAY platform, and the concordance rate between the genotypes from the two platforms was 99.5%. The replication 3 sample was genotyped for the 4 new SNPs using a TaqMan genotyping platform (ABI 7900HT Real Time PCR system, Applied Biosystems). The cluster patterns of the genotyping data from the Fluidigm EP1, Sequenom and TaqMan analyses were checked to confirm their good quality.
In the discovery stage, imputation of allele dosage of ungenotyped SNPs using the HapMap Phase 2 (JPT+CHB) data was carried out using MACH36,37. After excluding imputed SNPs with imputation quality scores below a threshold (R2 < 0.30), call rate < 0.90 in either cases or controls, MAF < 0.01 in either cases or controls, Hardy-Weinberg equilibrium P < 1 × 10−5 in controls, and significantly different missing genotype rates between cases and controls (P < 1 × 10−5), a total of 1,532,051 genotyped and imputed autosomal SNPs from the BAS, 2,042,781 from the CAS, and 2,228,999 SNPs from the combined two GWAS samples were retained for subsequent association analysis.
After genome-wide association analyses for each of the two discovery studies and meta-analysis in the combined sample, SNPs fulfilling the following criteria were taken forward to replication: (i) SNPs showed potential associations in the meta-analysis of the two studies (P < 1.0 × 10−4); (ii) SNPs had a consistent association at P ≤ 1.0 × 10−2 in both discovery populations; (iii) SNPs with P < 1.0 × 10−3 in the discovery meta-analysis lay within ±500 kb of a locus previously reported at genome-wide significance or suggestive significance in a prior publication; (iv) the SNP had nearby correlated SNPs (within 25 kb) that also showed a signal (P < 0.01). With the exception of four SNPs at the 9p21.3 locus, SNPs in strong LD (r2 > 0.5) with the most significant SNP at each locus were removed. When a SNP could not be genotyped, alternative tagging SNPs (r2 > 0.8) were considered.
The association of imputed and genotyped SNPs with CAD was tested with multiple logistic regression analysis in an additive genetic model (with 1 degree of freedom) after adjusting for age (onset of the first event for cases or time of recruitment for controls) and sex. We used allele dosages from the imputation to account for uncertainty of imputed genotypes. Association analyses were performed using PLINK38 (see URLs). A fixed-effects inverse variance-weighted meta-analysis as implemented by METAL39 (see URLs) was used to combine the two discovery studies and the results for each SNP across all replication studies. A quantile-quantile plot generated using R was used to evaluate the overall significance of the GWAS results and the potential impact of population stratification. The genomic inflation factor (λ)40 was estimated from the median of the χ2 statistic divided by 0.456.
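The two calculations named above, a fixed-effects inverse variance-weighted meta-analysis and the genomic inflation factor, can be sketched in a few lines. This is a minimal illustration of the standard formulas, not a reproduction of METAL or of the authors' scripts. The function names and the example effect sizes are hypothetical, and the two-sided P value assumes a normally distributed test statistic.

```python
import math
from statistics import median


def fixed_effects_meta(betas, ses):
    """Combine per-study log odds ratios with inverse variance weights.

    Returns (pooled beta, pooled standard error, Z score, two-sided P).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return beta, se, z, p


def genomic_inflation(chi2_stats, expected_median=0.456):
    """Lambda: median observed chi-square (1 df) over its expected median.

    The divisor 0.456 is the value quoted in the text.
    """
    return median(chi2_stats) / expected_median


# Hypothetical per-study estimates for one SNP (log OR and standard error).
beta, se, z, p = fixed_effects_meta([0.10, 0.20], [0.05, 0.10])
print(f"pooled OR = {math.exp(beta):.3f}, SE = {se:.4f}, P = {p:.4f}")
```

Because the weights are the reciprocals of the squared standard errors, the study with the smaller standard error dominates the pooled estimate, which is why the pooled log odds ratio in the example lies closer to 0.10 than to 0.20.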
The association of the loci with established cardiovascular risk factors was examined in all control samples from the discovery and replication samples. For quantitative traits (high-density lipoprotein, low-density lipoprotein, total cholesterol, triglycerides, blood pressure, fasting plasma glucose, body mass index), linear regression was used, whereas for the binary trait (hypertension) a logistic regression model was applied. We then combined the respective regression estimates from each stage in a meta-analysis using inverse variance weighting.
Conditional analyses were performed to test the independence of significant SNPs in each region conditioning on the genotype of the SNP chosen for replication. These analyses were carried out using PLINK with the --logistic and --condition options.
This study was funded by the National Basic Research Program of China (973 Plan) (2011CB503901, 2006CB503805) from the Ministry of Science and Technology of China, the National Science Foundation of China (30930047), the High-Tech Research and Development Program of China (863 Plan) (2009AA022703, 2012AA02A516, 2006AA02A406), and a grant (2006BAI01A01) from the Ministry of Science and Technology of China. This study was also supported by the Biomedical Project from the Council of Science and Technology, Beijing (H020220030130).
PLINK v1.07, http://pngu.mgh.harvard.edu/~purcell/plink/;
The International HapMap Project, http://hapmap.ncbi.nlm.nih.gov/;
COMPETING INTERESTS STATEMENT
The authors declare no competing financial interests.