There is increasing evidence that the genetic architecture underlying susceptibility to common complex diseases reflects the influence of both common and rare DNA variants (24, 25), but the relative contribution of common versus rare variants remains to be determined. Multiple, generally rare, polymorphisms in the EPHX2 gene have been identified, which result in amino acid substitutions in the EPHX2 protein (17, 18). Significant variation in enzyme activity and stability has been shown to be associated with these variants in experimental assays in vitro (17, 19). In this study, we have genotyped 5 non-synonymous polymorphisms of the EPHX2 gene, with established or potential relevance to enzyme structure and function, as well as 7 additional common variants spanning the EPHX2 gene sequence. We have used a combination of single polymorphism- and haplotype-based methodologies to investigate the association of variation in the EPHX2 gene with incident ischemic stroke in African-Americans and Whites from the ARIC cohort.
An association of the E470G variant with incident ischemic stroke in African-Americans is suggested. After adjustment for known stroke risk factors, individuals carrying the 470G allele had a 2- to 3-fold higher hazard rate of incident ischemic stroke than those who did not, although this effect did not reach statistical significance. Because this variant is rare in African-Americans and not present in our sample of Whites, our power to detect a statistically significant effect was limited, and very large sample sizes are required to obtain statistically meaningful numbers of individuals carrying the 470G allele. Future studies in the entire ARIC cohort should help clarify the role of this variant in ischemic stroke. Interestingly, in vitro enzymatic assays showed that the 470G EPHX2 isoform had significantly greater hydrolase activity than, but similar phosphatase activity to, the wild-type isoform (17). Such an altered function, if present in vivo, may have important consequences for the regulation of EET levels in the brain and vasculature, and may contribute to cerebrovascular injury (12). None of the other non-synonymous polymorphisms with known relevance to enzyme structure or function was independently associated with incident ischemic stroke.
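The power limitation noted above can be illustrated with Schoenfeld's approximation for the number of events a proportional hazards analysis needs in order to detect a given hazard ratio for a binary exposure. This is a minimal sketch and not the calculation performed in this study: the carrier fractions, hazard ratio, significance level, and power used below are hypothetical placeholders, and the function name `required_events` is introduced here for illustration only.

```python
from math import log
from statistics import NormalDist


def required_events(hazard_ratio, carrier_fraction, alpha=0.05, power=0.80):
    """Approximate number of events needed to detect `hazard_ratio`
    for a binary exposure (Schoenfeld's formula, two-sided test).

    carrier_fraction is the proportion of the sample carrying the allele.
    All inputs here are illustrative assumptions, not study values.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    p = carrier_fraction
    return (z_alpha + z_beta) ** 2 / (p * (1 - p) * log(hazard_ratio) ** 2)


if __name__ == "__main__":
    # Hypothetical carrier fractions: the rarer the allele, the more events needed.
    for frac in (0.02, 0.10, 0.50):
        events = required_events(hazard_ratio=2.0, carrier_fraction=frac)
        print(f"carrier fraction {frac:.2f}: about {events:.0f} events required")
```

Because the required number of events scales with 1 / (p(1 - p)), a variant carried by only a few percent of participants needs many times more events than a common variant with the same hazard ratio.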
The relationship between EPHX2 sequence variation and ischemic stroke may be exerted through higher-order combinations of polymorphisms, rather than through their individual effects. Indeed, it has been suggested that haplotypes, rather than individual variants, define functional units of genes (26, 27). Thus, we next investigated the association of common EPHX2 haplotypes with stroke risk. In African-Americans, two common haplotypes with opposing effects on ischemic stroke risk were identified. Haplotype ATACGGT was associated with a significantly decreased hazard rate for incident ischemic stroke, both unadjusted and after adjustment for other known stroke risk factors. The variant(s) responsible for this effect is (are) unknown. Additional markers that uniquely tag this haplotype may be helpful in identifying the underlying causative variant(s). Haplotype ACACAGT was associated with a significantly increased hazard rate of incident ischemic stroke, and this effect was largely independent of the effect of other stroke risk factors. This haplotype is uniquely tagged by SNP9 (Ala425Ala), which was significantly associated with incidence of ischemic stroke in single polymorphism analyses, suggesting that this variant itself, or a variant(s) in LD with it, contributes to ischemic stroke risk in African-Americans. On average, synonymous SNPs are more frequent than non-synonymous SNPs, and it is generally thought that most synonymous mutations are functionally neutral. However, studies in E. coli, yeast, and Drosophila provide evidence for translational selection for major codons (28, 29), and it has been shown that there is a strong correlation between bias in synonymous codon usage and gene expression levels (30).
Moreover, given its close proximity to the junction of exon and intron 14, it is also possible that SNP9 affects RNA splicing. Predictions from computational analysis (31, 32) indicated that the G to A substitution at SNP9 may lead to the activation of an exonic splicing enhancer (ESE), with the A allele having a 6.8-fold greater predicted SF2/ASF binding affinity than the G allele (see Supplementary Figure 1). To our knowledge, alternate splice variants of this gene have not been characterized in humans. Further studies are needed to determine whether the Ala425Ala polymorphism (SNP9) produces such variants.
The strong LD between E470G and SNP9 raises the possibility that the association of SNP9 with ischemic stroke may represent, at least in part, the association of E470G with the disease. Several lines of evidence suggest that this is not so. First, the haplotype associated with greater stroke incidence and uniquely tagged by SNP9 carries the E470 allele, not the G470 allele. Second, the association of each SNP with incident ischemic stroke is not affected when controlling for the effects of the other (not shown), suggesting independent associations of the two SNPs with the disease.
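The argument above depends on how pairwise LD is measured. As a sketch only, the standard coefficients D, |D'|, and r² can be computed from haplotype and allele frequencies as shown below. The frequencies used are hypothetical and are not estimates from the ARIC data; the function name `ld_stats` is introduced here for illustration. The example shows that two minor alleles that never occur on the same haplotype can still be in complete LD (|D'| = 1) while being only weakly correlated (low r²).

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying minor alleles a and b together
    p_a, p_b: minor allele frequencies at the two loci
    """
    d = p_ab - p_a * p_b
    # The maximum attainable |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies: the two minor alleles never share a haplotype.
    d, d_prime, r2 = ld_stats(p_ab=0.0, p_a=0.05, p_b=0.10)
    print(f"D = {d:.4f}, |D'| = {d_prime:.2f}, r^2 = {r2:.4f}")
```

In such a configuration, a haplotype tagged by one minor allele necessarily carries the major allele at the other locus, which is the pattern invoked in the first argument above.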
Although we could not find conclusive evidence for an association between variation in the EPHX2 gene and ischemic stroke in Whites, there was a suggestive indication that a common EPHX2 haplotype, AAGTA, is associated with higher ischemic stroke risk. This haplotype was associated with a modestly increased hazard rate ratio for incident ischemic stroke, and was uniquely tagged by a SNP in intron 5, which itself showed some suggestion of an association with increased ischemic stroke incidence. It is not clear whether this SNP itself, or one or more SNPs in LD with it, is responsible for the observed effects. As in African-Americans, in addition to contributing to increased ischemic stroke risk, the EPHX2 gene may also encompass sequence variation that contributes to lower ischemic stroke risk in Whites, as suggested by the trend toward an association of the AGGTA haplotype with lower ischemic stroke risk. We have recently shown that this haplotype is also significantly associated with an increased risk for calcified plaque in the CARDIA cohort (Fornage et al, in preparation). Plaques vulnerable to rupture and, thus, prone to cause stroke have rarely been found to contain calcium, and there is accumulating evidence that plaque calcification may confer plaque stability (33). Moreover, patients with calcified plaque are less likely to have symptomatic disease than those with non-calcified plaque (33–36). Our finding of a common EPHX2 haplotype associated with both increased susceptibility to calcified plaque and decreased risk for stroke is consistent with such data and justifies further investigation into the role of this gene in the clinical manifestations of atherosclerosis.
Taken together, these data provide evidence that both common and rare variants are likely to play a role in the relationship between the EPHX2 gene and ischemic stroke. Moreover, they provide evidence that multiple variants exist within or near the EPHX2 gene with greatly contrasting effects on ischemic stroke incidence: some are associated with a higher incidence, others with a lower incidence.
Several hypotheses may be put forth to explain these results. First, multiple functional variants of the enzyme have been identified with contrasting enzymatic activity and/or stability (17, 19). Hence, haplotypes carrying different variants of this kind may influence disease risk in contrasting ways. Second, it has recently been shown that EPHX2 is a bifunctional enzyme composed of an epoxide hydrolase domain at its C-terminus and a phosphatase domain at its N-terminus (37, 38). Thus, the presence of two functionally distinct domains encoded by the EPHX2 gene provides the opportunity for functional variants of these domains to influence the phenotype in distinct ways. Third, it is also conceivable that the observed associations may not stem solely from effects of polymorphisms in the EPHX2 gene, but also from those of polymorphisms in a nearby gene, whose relationship to ischemic stroke may differ from and oppose that of EPHX2. However, recent data on whole-genome patterns of common DNA variation (39, 40) do not support the possibility that LD extends to the characterized genes near EPHX2.
This is the first study to describe the common patterns of EPHX2 gene variation associated with incident ischemic stroke in a well-characterized epidemiological setting, such as the ARIC cohort. However, the large number of statistical tests performed warrants cautious interpretation of the findings, and underscores the need for additional studies to independently confirm the data reported here. Dual associations of EPHX2 gene variation with risk of coronary heart disease have been observed in a separate study (DC Zeldin, personal communication), lending further credibility to the hypotheses presented here. Additional studies are also needed to further investigate the functional relationships of EPHX2 sequence variation with ischemic cerebrovascular disease. Identification of the variants influencing susceptibility to ischemic stroke is a necessary first step toward understanding the biological basis of the associations detected in this study.
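The caution regarding the number of statistical tests can be made concrete with a family-wise error rate adjustment. The sketch below implements the Bonferroni and Holm step-down corrections in plain Python. It is an illustration only: no such correction is reported in this study, the p-values shown are hypothetical rather than study results, and the function names are introduced here for the example.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply by the number of hypotheses not yet processed, then
        # enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical nominal p-values from four single-polymorphism tests.
    nominal = [0.01, 0.04, 0.03, 0.005]
    print("Bonferroni:", bonferroni(nominal))
    print("Holm:      ", holm(nominal))
```

Holm's procedure is never more conservative than Bonferroni's, but both assume nothing about the correlation between tests; with SNPs in LD, as here, permutation-based approaches are a common alternative.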