Cohort ascertainment, characterization and DNA preparation: M.N., E.G., A.L., A.P., J.Ö. and J.H. (Helsinki); M.v.u.z.F., A.R., T.K., J.R., A.P. and J.E.J. (Kuopio); Y.M.R., L.H.v.d.B., C.W. and G.J.E.R. (Utrecht); C.M.v.D. and M.M.B.B. (Rotterdam) and A.T., A.H., H.K. and I.I. (Japan). Genotyping: K.B., Y.B., L.K., Z.A., S.R., R.P.L., M.G. and S.M. Study design and analysis plan: R.P.L. and M.G. Data management and informatics: C.E.M., K.B., M.W.S. and M.G. Statistical analysis: K.Y., K.B., I.I., M.C., R.P.L. and M.G. Writing team: K.B., K.Y., M.W.S., M.G. and R.P.L.
Stroke is the world’s third leading cause of death. One cause of stroke, intracranial aneurysm, affects ~2% of the population and accounts for 500,000 hemorrhagic strokes annually in midlife (median age 50), most often resulting in death or severe neurological impairment1. The pathogenesis of intracranial aneurysm is unknown, and because catastrophic hemorrhage is commonly the first sign of disease, early identification is essential. We carried out a multistage genome-wide association study (GWAS) of Finnish, Dutch and Japanese cohorts including over 2,100 intracranial aneurysm cases and 8,000 controls. Genome-wide genotyping of the European cohorts and replication studies in the Japanese cohort identified common SNPs on chromosomes 2q, 8q and 9p that show significant association with intracranial aneurysm with odds ratios 1.24-1.36. The loci on 2q and 8q are new, whereas the 9p locus was previously found to be associated with arterial diseases, including intracranial aneurysm2-5. Associated SNPs on 8q likely act via SOX17, which is required for formation and maintenance of endothelial cells6-8, suggesting a role in development and repair of the vasculature; CDKN2A at 9p may have a similar role9. These findings have implications for the pathophysiology, diagnosis and therapy of intracranial aneurysm.
Siblings of intracranial aneurysm probands are at ~fourfold increased risk of hemorrhage from intracranial aneurysm, suggesting a genetic component to risk10. Genome-wide linkage studies of familial cases11 and rare apparently mendelian kindreds have not thus far identified robustly replicable loci, and no underlying mutations have been identified12-14. Similarly, examination of candidate genes in small case-control studies has failed to produce replicable results12.
These considerations motivate the use of GWAS to identify common variants that contribute to intracranial aneurysm. We carried out a multistage intracranial aneurysm GWAS in three cohorts: a Finnish cohort of 920 cases and 985 controls, a Dutch cohort of 781 cases and 6,424 controls and a Japanese cohort of 495 cases and 676 controls (see Supplementary Methods online).
The study design consisted of a first stage of genome-wide genotyping of the European cohorts on the Illumina platform, careful matching of cases and controls, and identification of intervals harboring SNPs that surpassed a significance threshold of 5 × 10⁻⁷ for association with intracranial aneurysm2. This discovery phase had 80% power to detect common alleles that confer a genotype relative risk (GRR) of 1.31 and 50% power to detect a GRR of 1.25 (assuming an additive model in log-odds scale). Replication of association of SNPs in these intervals was tested in the Japanese cohort, setting P < 0.05 for significant replication. The replication study had 80% and 66% power to replicate SNPs with GRRs of 1.31 and 1.25, respectively. Recent studies15 have demonstrated the utility of using a genetically diverse population for replication, which extends association results to a broad segment of the world’s population.
Discovery phase genotypes were processed using rigorous quality controls; because Dutch controls and some Finnish controls were genotyped separately on Illumina chips of varying SNP density, particular attention was paid to ensuring consistent genotyping performance and excluding nonrandom genotyping error within and across cohorts (Supplementary Methods and Supplementary Tables 1 and 2 online). To control for population stratification, we genetically matched cases and controls from each cohort16, resulting in a dataset in which cases and controls are similarly distributed along axes of significant principal components (Supplementary Table 1).
We tested for association between each SNP and intracranial aneurysm by using the Cochran-Armitage trend test in each cohort and combined the results using the Mantel extension test. The distribution of test statistics for association of SNPs with intracranial aneurysm in the combined cohort is shown in Figure 1a. The genomic inflation factor (λ) was 1.043 and 1.136 for the Finnish and Dutch cohorts, respectively, and 1.114 combined, indicating well-matched populations17 (Fig. 1a). Further logistic regression including principal components as covariates2 did not significantly change λ (Supplementary Fig. 1 online), nor did exclusion of SNPs with call rates <99% in any case or control cohort (Supplementary Methods). In contrast, because the genomic inflation factor increases with sample size18, the large Dutch control sample was a major contributor to λ (Supplementary Methods). The association results reveal a number of SNPs whose P values exceed those expected under the null hypothesis; these persist after correction for λ (Fig. 1b). The P values across each chromosome are shown in Figure 1c. Four intervals (on 1q, 2q, 8q and 9p) harbored SNPs that surpassed the threshold for genome-wide significance; these include multiple SNPs with correlated P values and comprise 15 of the 16 SNPs with P < 10⁻⁶ (Fig. 1c). Associated SNPs in each interval have very high call rates in every cohort and none violates Hardy-Weinberg equilibrium (HWE) in any cohort (Supplementary Table 2). The first three loci have not previously shown association with intracranial aneurysm or other diseases, whereas SNPs on 9p are in the block of linkage disequilibrium (LD) that has previously been shown to be associated with myocardial infarction2-4, abdominal aortic aneurysm and intracranial aneurysm5. Both Finnish and Dutch cohorts contributed to the significance of each locus, the risk alleles were identical and their odds ratios were not significantly different between cohorts (Table 1).
To attempt to replicate these four loci, we genotyped 15 SNPs from these intervals in the Japanese cohort (Supplementary Tables 2 and 3 online). Eight of the 15 SNPs showed significant association with intracranial aneurysm; these included SNPs on 2q, 8q and 9p (Table 1). At each locus, SNPs in strong LD in the Japanese sample showed highly correlated P values (Fig. 2). For associated SNPs, risk alleles in Japan were identical to those found in Europe and showed similar odds ratios (Table 1 and Supplementary Table 3). Using the Mantel extension test to combine data from all three cohorts, we found the following P values and odds ratios for the SNPs showing the strongest evidence for association at each locus: 2q, P = 4.4 × 10⁻⁸ (odds ratio (OR) = 1.24); 8q, P = 1.4 × 10⁻¹⁰ (OR = 1.36); 9p, P = 1.4 × 10⁻¹⁰ (OR = 1.29) (Table 1). No locus showed significant deviation from an additive model (log-odds scale) (Supplementary Table 3).
We examined the distributions of P values in each significant interval. At 2q, association in Europe lies within a large block of LD (197.8-198.6 Mb; Fig. 2 and Supplementary Table 4 online). In Asian subjects, this segment is divided into two smaller blocks of LD and the association seen in Japan is confined to SNPs in the more telomeric block (198.2-198.5 Mb). This interval contains four known genes; the two most strongly associated SNPs, rs700651 and rs700675, lie in introns of adjacent genes, BOLL and PLCL1. PLCL1 is of interest because it has significant homology to phospholipase C, which lies downstream of VEGFR2 signaling19. VEGFR2 is a marker of endothelial progenitor cells and has a role in central nervous system angiogenesis20.
The LD structure at 8q is also of interest (Fig. 2 and Supplementary Table 4). SNP rs10958409 shows the most significant association; SNPs in high LD with rs10958409 show correlated P values. In addition, however, rs9298506, which lies 110 kb distally and shows virtually no LD with rs10958409 (r² = 0.004 in European HapMap subjects21, 0.004 in Finnish cases and 0.0005 in Dutch cases), nonetheless also revealed significant association in Europeans; adjacent SNPs in LD showed correlated P values. This observation suggests the presence of two independent risk alleles. A conditional test of association demonstrated that after accounting for the association with rs9298506, rs10958409 still showed significant association with intracranial aneurysm (and vice versa), consistent with two risk loci (Supplementary Table 4). The Japanese cohort replicated association at rs10958409, but not rs9298506, despite having had 88% power to detect association of this latter SNP (Supplementary Table 3). Further work will be required to determine whether the European association with rs9298506 is a true positive result. This 8q interval contains a single gene, SOX17, which lies between these two association peaks, 43 kb from rs10958409 and 64 kb from rs9298506. The next closest genes lie 201 kb distal and 266 kb proximal to rs10958409. Sox17 has an important role in formation and maintenance of the endothelium (see below).
Finally, SNPs on 9p that showed association with intracranial aneurysm (22.07-22.10 Mb) (Fig. 2 and Supplementary Table 4) were in LD with SNPs that have previously shown association with multiple arterial diseases2-4. Adjacent SNPs that are associated with type 2 diabetes mellitus22-24 showed no significant association with intracranial aneurysm. The strongest association was with rs1333040, which lies 74 kb from the 5′ end of CDKN2B and 88 kb from CDKN2A. These genes encode the cyclin-dependent kinase inhibitors p15INK4b and p16INK4a, as well as ARF, a regulator of p53 activity. In addition, a non-protein-coding transcript (ANRIL) lies within this interval25. Among these, p16INK4a is of particular interest (see below).
To determine whether the effects of these loci are influenced by known risk factors, we examined the odds ratios of the most significant allele at each locus after partitioning cases by gender, family history of intracranial aneurysm, age (older half versus younger) and ruptured versus unruptured aneurysm. The results showed no significant difference in odds ratios after any of these partitions, suggesting independent contributions to risk (Supplementary Table 5 online).
Finally, to assess the combined effects of the three loci, we defined each subject’s risk score by summing the logarithm of the odds ratio for each risk allele they harbor as determined in each cohort. The observed intracranial aneurysm risk showed a significant linear relationship with risk score in each cohort, with a more than threefold increase from lowest to highest strata (Table 2 and Supplementary Table 6 online).
This study provides the first results of a large GWAS of intracranial aneurysm or stroke. Three significant loci have been identified. These results cannot be explained by nonrandom genotyping error or population stratification and are robust to alternative analyses (Supplementary Fig. 1). We calculate that these loci collectively account for 38-46% of the population-attributable fraction of intracranial aneurysm and 2.3-3.8% of the sibling recurrence risk (Table 1). Additional common variants are likely to have a role in intracranial aneurysm, as the study was not well powered to find loci with GRR <1.25. In addition, population-specific effects were not considered in this study design. Given the risk allele frequencies and odds ratios of the identified loci, future replication cohorts will require ~900 to 1,600 cases and controls to have 80% power for replication (α = 0.05).
After genomic control correction and exclusion of SNPs at the four top loci, 37 SNPs remained with P values less than 10⁻⁴ (28 are expected by chance). Some of these may prove to be true risk alleles as additional cohorts are evaluated, as has occurred with type 2 diabetes26. In addition, rare variants with larger effects at these same loci may also contribute to the occurrence of intracranial aneurysm27,28.
Intracranial aneurysms predominate at arterial branch points and sites of shear stress, locations that incur endothelial damage. Vascular injury mobilizes bone marrow-derived cells that localize to these sites and contribute to repair29,30. SOX17, a member of the Sry-related HMG box transcription factor family, is of particular interest because it is required for both endothelial formation and maintenance6-8. Sox17 plays a key role in the generation and maintenance of fetal and neonatal stem cells of both hematopoietic and endothelial lineages8 and is expressed in adult endothelium6. Sox17-/- mice show multiple vascular abnormalities7; moreover, whereas Sox18-/- mice are normal, Sox18-/-;Sox17+/- mice show defective endothelial sprouting and vascular remodeling6. Similarly, p16INK4a has a role in regulation of stem (progenitor) cell populations, including bone marrow-derived cells of the vasculature9. These considerations suggest that intracranial aneurysm may result from defective stem (progenitor) cell-mediated vascular development and/or repair.
Finally, these findings have implications for identification of individuals with intracranial aneurysm before morbid events. The odds ratio of intracranial aneurysm increases greater than threefold in subjects with the highest versus the lowest risk (Table 2 and Supplementary Table 6). Although we caution that further work is required, these findings advance the potential for preclinical diagnosis by combined assessment of inherited susceptibility with previously established risk factors.
The study protocol was approved by the Yale Human Investigation Committee (HIC protocol 7680). In all cases, the diagnosis of intracranial aneurysm was made with computerized tomography angiogram, magnetic resonance angiogram or cerebral digital subtraction angiogram and confirmed at surgery, when applicable. Rupture of aneurysm was defined by identification of acute subarachnoid hemorrhage (via computerized tomography or magnetic resonance imaging) from a proven aneurysm. Cases with a first-degree relative with intracranial aneurysm were considered familial, and other cases were considered sporadic.
Three cohorts from independent studies in Finland, The Netherlands and Japan were collected and all participants provided informed consent. There were 960 Finnish cases and 1,017 controls; 786 Dutch cases and 6,424 controls; and 495 Japanese cases and 676 controls. Japanese controls were screened for not harboring intracranial aneurysm.
Genome-wide genotyping in European cohorts was done on the Illumina platform according to the manufacturer’s protocol (Illumina). We genotyped subjects on either the CNV370-Duo, HumanHap300 or HumanHap550 chips. SNPs shared across all platforms (n = 314,125) were extracted. We applied prespecified criteria to exclude samples and SNPs that performed poorly as well as samples that could not be genetically well matched (Supplementary Table 1 and Supplementary Methods). The overall median genotype call rate was 99.7% and the mean heterozygosity of all SNPs was 35%. Seventy-two duplicate pairs of samples were genotyped and showed 99.91% genotype identity. We carried out detailed analysis of the performance of SNPs across cohorts and platforms to ensure that significant associations observed were not due to differences in SNP performance (Supplementary Table 2).
We determined the identity by state (IBS) similarity and estimated the degree of relatedness for each pair of samples in the GWAS (Supplementary Methods) and excluded inferred first- and second-degree relatives (Supplementary Table 1).
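As an illustration of the pairwise identity-by-state calculation, the following minimal Python sketch computes the mean proportion of alleles that two samples share IBS. The 0/1/2 genotype coding, the use of None for missing calls and the function name are assumptions made for this example; the relatedness estimation actually used is described in the Supplementary Methods and is not reproduced here.

```python
def ibs_similarity(geno_a, geno_b):
    """Mean proportion of alleles shared identical by state (0.0 to 1.0).

    Genotypes are counts of one allele (0, 1 or 2); None marks a missing
    call, and SNPs missing in either sample are skipped.
    """
    shared = 0
    compared = 0
    for a, b in zip(geno_a, geno_b):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles shared at this SNP
        compared += 1
    if compared == 0:
        raise ValueError("no SNPs with calls in both samples")
    return shared / (2 * compared)


# Hypothetical genotype vectors for two samples (not study data).
sample_1 = [0, 1, 2, 2, None, 1]
sample_2 = [0, 1, 1, 2, 0, 0]
print(ibs_similarity(sample_1, sample_2))
```

Pairs with unusually high similarity across all SNPs would then be candidates for exclusion as duplicates or close relatives; any cutoff is a separate choice that this sketch does not make.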
In order to identify population outliers and cases whose genetic ancestry cannot be properly matched to controls (and vice versa), we used the Genetic Matching (GEM) method described previously16 based on principal component analysis (PCA). After this matching process, three significant principal components remained in the Finnish cohort and none in the Dutch cohort, as previously observed (Supplementary Methods).
After quality control and analysis of population structure, there remained 874 cases and 944 controls in the Finnish cohort and 706 cases and 5,332 controls in the Dutch cohort. Among the Finnish cases, 57% were female; 73% had suffered ruptured aneurysm and 43% had positive family history; the median age at diagnosis was 50 years (those with rupture 49 years versus those without rupture 52 years). In the Dutch cohort, 69% were female, 92% had ruptured aneurysm, 15% had a positive family history and the median age was 49 years.
To test for association of each SNP with intracranial aneurysm, we assumed an additive (in log-odds scale) model. We used the Cochran-Armitage trend test for each cohort. For the combined sample of European descent or of European and Japanese cohorts, we used the Mantel extension test (Supplementary Methods).
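The two tests named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study: genotype counts are ordered by number of risk alleles (0, 1, 2), the trend test uses the usual large-sample variance, and the Mantel extension sums per-cohort scores and hypergeometric variances across strata. The function names and the example counts are hypothetical.

```python
import math

WEIGHTS = (0, 1, 2)  # additive coding: number of risk alleles per genotype


def score_and_variance(cases, controls, hypergeometric=False):
    """Return (U, V): observed-minus-expected score and its null variance.

    cases and controls are genotype counts for 0, 1 and 2 risk alleles.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    n_cases = sum(cases)
    sum_w = sum(w * t for w, t in zip(WEIGHTS, totals))
    sum_w2 = sum(w * w * t for w, t in zip(WEIGHTS, totals))
    u = sum(w * r for w, r in zip(WEIGHTS, cases)) - n_cases * sum_w / n
    denom = n * n * (n - 1) if hypergeometric else n ** 3
    v = n_cases * (n - n_cases) * (n * sum_w2 - sum_w ** 2) / denom
    return u, v


def chi2_1df_pvalue(chi2):
    """Upper-tail P value of a chi-square statistic with 1 d.f."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def trend_test(cases, controls):
    """Cochran-Armitage trend test for a single cohort."""
    u, v = score_and_variance(cases, controls)
    chi2 = u * u / v
    return chi2, chi2_1df_pvalue(chi2)


def mantel_extension(strata):
    """Combine cohorts: strata is a list of (cases, controls) count triples."""
    parts = [score_and_variance(c, k, hypergeometric=True) for c, k in strata]
    u = sum(p[0] for p in parts)
    v = sum(p[1] for p in parts)
    chi2 = u * u / v
    return chi2, chi2_1df_pvalue(chi2)


# Hypothetical genotype counts for two cohorts (not study data).
cohort_a = ((400, 380, 94), (480, 380, 84))
cohort_b = ((330, 300, 76), (2700, 2200, 432))
print(trend_test(*cohort_a))
print(mantel_extension([cohort_a, cohort_b]))
```

Summing scores within strata before squaring means that a cohort contributes to the combined statistic only to the extent that its effect points in the same direction as the others.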
We calculated the per-allele and genotype-specific ORs and their 95% confidence intervals by fitting 1-d.f. and 2-d.f. logistic models, respectively. We assessed heterogeneity of ORs among populations by considering the likelihood ratios of a logistic model with population by genotype interaction term(s) versus a linear model without the interaction term(s) and used a P value <0.05 as evidence of significant heterogeneity (Supplementary Table 3). To evaluate the degree of overdispersion of test statistics, we calculated the genomic inflation factor (λ) for each statistical test by the ratio of the mean of the lower 90% of observed test statistics to that of the expected χ² values17. We applied the genomic control method to correct for λ (Fig. 1b) and then compared a pairwise plot of P values for each SNP in the trend and corrected tests to determine the potential effect of any residual population stratification (Supplementary Fig. 1a,b and Supplementary Methods).
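The genomic inflation factor as defined above, and the genomic control correction, can be sketched as follows. This is an illustrative sketch only: the plotting positions (i − 0.5)/m used for the expected 1-d.f. chi-square quantiles, the function names and the simulated statistics are assumptions of the example rather than details taken from the study.

```python
import random
from statistics import NormalDist, mean


def expected_chi2_quantiles(m):
    """Expected order statistics of m chi-square (1 d.f.) values.

    Uses the identity that a 1-d.f. chi-square is a squared standard normal.
    """
    nd = NormalDist()
    return [nd.inv_cdf((1.0 + (i - 0.5) / m) / 2.0) ** 2 for i in range(1, m + 1)]


def genomic_inflation(stats, fraction=0.9):
    """Mean of the lowest `fraction` of observed statistics over the
    mean of the same fraction of expected values."""
    observed = sorted(stats)
    expected = expected_chi2_quantiles(len(observed))
    k = max(1, int(fraction * len(observed)))
    return mean(observed[:k]) / mean(expected[:k])


def genomic_control(stats, lam):
    """Deflate every test statistic by the inflation factor."""
    return [s / lam for s in stats]


# Simulated null statistics, inflated by a hypothetical factor of 1.1.
random.seed(0)
simulated = [1.1 * random.gauss(0.0, 1.0) ** 2 for _ in range(10000)]
lam = genomic_inflation(simulated)
print(lam, genomic_inflation(genomic_control(simulated, lam)))
```

Restricting the calculation to the lower 90% of statistics keeps genuinely associated SNPs in the upper tail from inflating λ.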
We also examined the validity of the assumption of additivity (in log-odds scale) in the association tests by comparing likelihood ratios assuming alternative models of dominance and rejected additivity for P < 0.05 (ref. 2 and Supplementary Table 3).
For each chromosome segment showing significant association with intracranial aneurysm, we investigated whether more than one SNP had an independent marginal effect on intracranial aneurysm by the Mantel extension test conditioned on genotypes for SNPs within each interval (Supplementary Table 4).
To assess the robustness of our GWAS results, we also performed a weighted Z-score test and found that the results of this alternative analysis were highly correlated with the results of the Mantel extension test (Supplementary Fig. 1c).
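A weighted Z-score combination can be sketched as below. The weighting scheme is not stated in the text, so the square-root-of-sample-size weights used here are an assumption (a common choice), as are the function name and the example Z values. The sample sizes are the post-quality-control cohort totals reported above.

```python
import math


def weighted_z(z_scores, sample_sizes):
    """Combine signed per-cohort Z scores with sqrt(N) weights.

    Returns the combined Z and its two-sided P value.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z = numerator / denominator
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Finnish: 874 cases + 944 controls; Dutch: 706 cases + 5,332 controls.
# The Z scores are hypothetical.
print(weighted_z([3.1, 4.0], [874 + 944, 706 + 5332]))
```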
For the Japanese replication study, allelic discrimination assays were done with 15 SNPs on the Sequenom iPLEX genotyping platform according to the manufacturer’s protocol. For SNPs that showed significant P values, genotypes were repeated and P values confirmed on the TaqMan platform (Applied Biosystems). Association tests were done as described above, using P = 0.05 (in the Cochran-Armitage trend test with the same allele found associated in Europe) as the threshold for significance (Supplementary Table 3).
For SNPs with the most significant P values we investigated whether the association results were affected by potential confounding variables such as rupture status, family history or gender. We compared genotype distributions of cases stratified by these variables using the trend test (Supplementary Table 5).
We investigated two risk measures based on replicated SNPs: the population attributable fraction (PAF) and the proportion of the sibling recurrence risk attributable to a SNP (‘recurrence risk fraction’) as previously described (Table 1 and Supplementary Methods). For these calculations we assumed intracranial aneurysm population prevalence of 2% and λsib of 4 (ref. 10). The combined contribution of SNPs was obtained by assuming the multiplicative model (Supplementary Methods).
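As a rough illustration of the population attributable fraction and of combining loci under a multiplicative model, consider the sketch below. It is not the calculation from the Supplementary Methods: it assumes Hardy-Weinberg genotype frequencies, a per-allele multiplicative effect, and that the odds ratio approximates the relative risk at 2% prevalence. The risk allele frequencies are hypothetical placeholders; only the odds ratios come from the text.

```python
def paf_single(risk_allele_freq, per_allele_rr):
    """PAF for one SNP: 1 - 1 / (population mean relative risk).

    With genotype risks 1, RR and RR^2 and Hardy-Weinberg frequencies,
    the mean relative risk is (1 - p + p * RR) ** 2.
    """
    p = risk_allele_freq
    mean_rr = (1.0 - p + p * per_allele_rr) ** 2
    return 1.0 - 1.0 / mean_rr


def paf_combined(pafs):
    """Combine per-locus PAFs assuming independent, multiplicative effects."""
    remaining = 1.0
    for paf in pafs:
        remaining *= 1.0 - paf
    return 1.0 - remaining


# Odds ratios from the text (2q, 8q, 9p); allele frequencies are hypothetical.
odds_ratios = [1.24, 1.36, 1.29]
hypothetical_freqs = [0.3, 0.2, 0.5]
per_locus = [paf_single(f, r) for f, r in zip(hypothetical_freqs, odds_ratios)]
print(per_locus, paf_combined(per_locus))
```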
We analyzed the cumulative effects of the risk alleles at the most significant SNP at 2q, 8q and 9p (rs700651, rs10958409 and rs1333040) by calculating the risk score for each individual by the weighted sum of the number of risk alleles as defined by
$$\text{risk score} = \sum_{i=1}^{3} \psi_i n_i$$
where $\psi_i$ is the logarithm of the calculated per-allele odds ratio at locus $i$ and $n_i$ is the number of risk alleles at the same locus. We then assessed the risk score for each of the 27 possible three-locus genotypes in each cohort (Supplementary Table 6). We fitted a simple linear logistic model with an additive effect (on log-odds scale) for each cohort and performed a likelihood-ratio test. For display purposes, the 27 strata of Supplementary Table 6 are compressed into 5 strata shown in Table 2 according to the absolute number of risk alleles, which closely parallels the risk score.
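The weighted sum described above can be enumerated over all 27 three-locus genotypes with a short Python sketch. The study computed the weights from per-cohort odds ratios; for illustration only, this example assumes the combined-cohort odds ratios quoted in the text (1.24, 1.36 and 1.29). The function and variable names are hypothetical.

```python
import math
from itertools import product

# Combined-cohort per-allele odds ratios quoted in the text (an assumption
# here; the study used odds ratios determined in each cohort).
ODDS_RATIOS = {"rs700651": 1.24, "rs10958409": 1.36, "rs1333040": 1.29}
SNPS = tuple(ODDS_RATIOS)


def risk_score(allele_counts):
    """Sum over loci of log(odds ratio) times number of risk alleles."""
    return sum(math.log(ODDS_RATIOS[snp]) * n for snp, n in zip(SNPS, allele_counts))


def all_genotype_scores():
    """Risk score for each of the 3^3 = 27 three-locus genotypes."""
    return {g: risk_score(g) for g in product((0, 1, 2), repeat=len(SNPS))}


scores = all_genotype_scores()
# List genotypes from lowest to highest score with their risk allele totals.
for genotype, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(genotype, sum(genotype), round(score, 3))
```

Because the three log odds ratios are of similar size, sorting by score gives nearly the same order as sorting by the total number of risk alleles.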
We are grateful to the participants who made this study possible. We thank A. Chamberlain, O. Törnwall, M. Alalahti, K. Helin, S. Malin and J. Budzinack for their technical help. This study was supported by the Yale Center for Human Genetics and Genomics and the Yale Program on Neurogenetics, the US National Institutes of Health grants R01NS057756 (M.G.) and U24 NS051869 (S.M.) and the Howard Hughes Medical Institute (R.P.L.). C.E.M. and M.W.S. are supported by a gift from the Lawrence Family and Y.M.R. by the Dr E. Dekker program of The Netherlands Heart Foundation (2005T014). K.Y. and I.I. were supported by the Core Research for Evolutional Science and Technology, Japan Science and Technology Corporation.