Stroke is the leading cause of severe disability and the third leading cause of death, accounting for 1 of every 15 deaths in the United States. We investigated the association of polymorphisms in the soluble epoxide hydrolase gene (EPHX2) with incident ischemic stroke in African-Americans and Whites. Twelve single nucleotide polymorphisms (SNPs) spanning EPHX2 were genotyped in a case-cohort sample of 1,336 participants from the Atherosclerosis Risk in Communities (ARIC) study. In each racial group, Cox proportional hazard models were constructed to assess the relationship between incident ischemic stroke and EPHX2 polymorphisms. A score test method was used to investigate the association of common haplotypes of the gene with risk of ischemic stroke. In African-Americans, two common EPHX2 haplotypes with significant and opposing relationships to ischemic stroke risk were identified. In Whites, two common haplotypes showed a suggestive indication of an association with ischemic stroke risk and, as in African-Americans, these relationships were in opposite directions. These findings suggest that multiple variants exist within or near the EPHX2 gene, with greatly contrasting relationships to ischemic stroke incidence: some are associated with a higher incidence, others with a lower incidence.
Stroke is the leading cause of severe disability and the third leading cause of death, accounting for 1 of every 15 deaths in the United States. Stroke mortality and morbidity are greater among certain sub-groups, such as the elderly and African-Americans. Ischemic stroke accounts for about 80% of all strokes (1). While the genetic basis of some Mendelian forms of stroke has been elucidated (2–5), the mutations implicated in these rare disorders likely contribute little to the overall prevalence of stroke in the population-at-large. Thus, identification and characterization of genes implicated in the more common forms of stroke is critical to improve understanding of the pathophysiologic processes leading to its development.
Epoxyeicosatrienoic acids (EETs) are lipid metabolites of arachidonic acid that are synthesized in vascular endothelial cells by the cytochrome P450 system (6, 7). They function as potent vasodilators and participate in the regulation of vascular tone (8). They have anti-inflammatory properties (9), modulate platelet function during hemostasis (10), and promote cell proliferation (11). In the brain, both vascular endothelial cells and astrocytes provide a carefully regulated supply of EETs to the cerebral microvasculature (12, 13). EETs have been shown to regulate cerebral blood flow (12) and, through their mitogenic properties, may contribute to angiogenesis in the brain (14). Hence, they may play a role in predisposition to and/or recovery from cerebrovascular injury.
Hydrolysis of the EETs to their corresponding diols by epoxide hydrolases regulates EET levels and represents a major mechanism by which the biological effects of EETs are attenuated (7). Inactivation of EETs by EPHX2 has been proposed to contribute to ischemic brain injury in rodent models, as both EPHX2 gene deletion and pharmacological inhibition of the enzyme reduced infarct size after focal cerebral ischemia (15). We have reported that sequence variation and altered levels of expression of the EPHX2 gene may contribute to kidney and brain injury in a rat model of genetic stroke (16). Multiple polymorphisms in the human EPHX2 gene have been identified, which result in amino-acid substitutions in the protein (17, 18). Significant variations in enzyme activity and stability have been shown to be associated with these variants in in vitro assays (17, 19). We have recently described an association of a functional polymorphism of the human EPHX2 gene with coronary artery calcification, a marker of atherosclerosis and independent predictor of stroke risk (20). In light of the mounting evidence of the role of endogenous EETs in processes intimately connected to cerebrovascular function, we have examined the association of polymorphisms in the EPHX2 gene with incidence of ischemic stroke in the large prospective Atherosclerosis Risk in Communities (ARIC) study.
Means, standard errors, and proportions of established stroke risk factors are presented in Table 1 for incident ischemic stroke cases and the cohort random sample (CRS). Mean values for all variables except triglycerides, HDL cholesterol, and LDL cholesterol were significantly different between the 2 groups. Moreover, stroke cases had a significantly greater frequency of African-Americans, men, and participants with hypertension and diabetes.
Figure 1 shows the gene location and race-specific allele frequencies (calculated in the CRS) for each of the polymorphisms genotyped in this study. Genotype frequency distributions were in accordance with Hardy-Weinberg equilibrium expectations in each race for all 12 polymorphisms (not shown). Allele and genotype frequency distributions for most of the polymorphisms differed significantly between the 2 racial groups, with the exception of SNP6 (R287Q) (Figure 1). SNP2 (R103C), SNP3, and SNP4 were moderately frequent in African-Americans but rare in Whites. SNP10 (E470G) was not polymorphic in Whites. Allele frequencies of each of the 12 polymorphisms were similar to those observed in other African-American and white populations (20 and unpublished data).
A summary of the estimates of pairwise linkage disequilibrium (LD), as measured by D′ and r², between each of the polymorphisms is presented by race in Figure 2. SNPs 2, 3, 4, and 10 were not tested in Whites because of their low frequencies. LD levels were generally lower in African-Americans as compared to Whites, consistent with previous reports (21, 22).
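For readers less familiar with the two LD measures, the following is a minimal sketch of how D′ and r² are defined for a pair of biallelic loci. It assumes that phased haplotype frequencies are available, which is a simplification: the study estimated LD from unphased genotype data. The function name `pairwise_ld` and the example frequencies are hypothetical and are not taken from the study.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A at locus 1
           and allele B at locus 2 (assumed known, i.e. phased data)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    # Raw disequilibrium coefficient
    d = p_ab - p_a * p_b

    # Largest |D| attainable given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # Squared correlation between the alleles at the two loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Hypothetical example: a rare allele found only on haplotypes that
# carry a common allele gives complete D' but a low r^2.
d_prime, r2 = pairwise_ld(p_ab=0.10, p_a=0.10, p_b=0.50)
print(round(d_prime, 3), round(r2, 3))
```

The example illustrates why both measures are reported: D′ can equal 1 when one allele occurs only with the other, while r² remains low if the allele frequencies differ.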
Table 2 presents the results of analyses of association between individual non-synonymous SNPs of the EPHX2 gene and risk for incident ischemic stroke. There was no statistically significant association between any of the 5 non-synonymous polymorphisms in the EPHX2 gene and incident ischemic stroke in this sample of African-American and White middle-aged individuals. A rare non-synonymous variant, E470G, showed a trend toward a positive association of the A allele (470G) with a higher risk for ischemic stroke (HRR=2.9; P=0.13) in African-Americans. This trend was observed in all 3 analytical models. The E470G variant was not observed in Whites. Given the low frequency of this variant, very large sample sizes are required to obtain statistically meaningful numbers of individuals carrying the 470G allele. The number of individuals carrying the 470G allele in this sample was 11, including 5 stroke cases.
We next investigated whether the combined effects of common SNPs in the EPHX2 gene significantly influence incident clinical ischemic stroke risk in African-Americans and Whites. To this end, haplotypes were constructed from a selected subset of SNPs. Consistent with approaches to reduce risk of false positive findings related to low frequency haplotypes (23), only haplotypes with a frequency greater than or equal to 3% were considered in the association analyses. Nine common haplotypes, accounting for 93% of all African-American chromosomes, and 5 common haplotypes, accounting for nearly 98% of all white chromosomes, met this criterion. Table 3 presents the haplotype frequencies and results of the analyses of association between common haplotypes of the EPHX2 gene and ischemic stroke case status by race. In African-Americans, two common haplotypes showed significant associations with risk for ischemic stroke: haplotype ATACGGT (frequency=3.2%) was associated with a statistically significantly lower risk of ischemic stroke (P=0.03), while haplotype ACACAGT (frequency=11.5%) was associated with a statistically significantly higher risk of ischemic stroke (P=0.02). Adjustment for covariates did not significantly affect these results. Cox proportional hazard analyses modeling the ratios of hazard rates of incident clinical ischemic stroke between individuals carrying 0, 1, or 2 copies of these haplotypes, and appropriately weighting for the sampling design, corroborated these findings. The ratio of hazards of incident clinical ischemic stroke per copy of haplotype ATACGGT was 0.34 (95%CI: 0.11; 1.00), while that of haplotype ACACAGT was 1.74 (95%CI: 1.11; 2.72) (Table 4). Haplotype ATACGGT was not uniquely tagged. Haplotype ACACAGT was uniquely tagged by SNP9, which encodes a synonymous substitution at Alanine 425. We then directly evaluated the association of SNP9 with stroke incidence in African-Americans.
Consistent with the results of the haplotype analyses, individuals carrying at least one copy of the A allele had a 1.70-fold increased hazard rate of ischemic stroke (95% CI=1.06, 2.75; P=0.03). This association was statistically significant in unadjusted and minimally adjusted models, but was no longer statistically significant after adjusting for known stroke risk factors (HRR=1.60; 95% CI=0.90, 2.90; P=0.10). In particular, adjustment for hypertension status significantly diminished the association of SNP9 with ischemic stroke (HRR=1.40; 95% CI=0.84, 2.85; P=0.19), although no association of this SNP with hypertension status, systolic blood pressure, or diastolic blood pressure was detected in either the CRS or the stroke cases (not shown).
In Whites, there was no statistically significant association between common patterns of variation in the EPHX2 gene and incident ischemic stroke (Table 3.B). However, haplotype AAGTA, the most frequent haplotype in this race group, showed a trend toward an association with a higher risk for ischemic stroke (P=0.05–0.08). This haplotype is uniquely tagged by SNP5. In single variant analyses, this variant also showed a trend toward a positive association with incident ischemic stroke in Whites (HRR for AA vs GG=1.6; 95%CI: 0.98; 2.6; P=0.06). Similar to the dual associations of EPHX2 with ischemic stroke risk observed in African-Americans, a common haplotype (AGGTA) showed a trend toward an association with lower risk for ischemic stroke in Whites (P=0.06). This haplotype was not uniquely tagged.
There is increasing evidence that the genetic architecture underlying susceptibility to common complex diseases reflects the influence of both common and rare DNA variants (24, 25), but the relative contribution of common versus rare variants remains to be determined. Multiple, generally rare, polymorphisms in the EPHX2 gene have been identified, which result in amino acid substitutions in the EPHX2 protein (17, 18). Significant variation in enzyme activity and stability has been shown to be associated with these variants in experimental assays in vitro (17, 19). In this study, we have genotyped 5 non-synonymous polymorphisms of the EPHX2 gene, with established or potential relevance to enzyme structure and function, as well as 7 additional common variants spanning the EPHX2 gene sequence. We have used a combination of single polymorphism- and haplotype-based methodologies to investigate the association of variation in the EPHX2 gene with incident ischemic stroke in African-Americans and Whites from the ARIC cohort.
An association of the E470G variant with incident ischemic stroke in African-Americans is suggested. Adjusting for effects of known stroke risk factors, individuals carrying the 470G allele had hazard rates of incident ischemic stroke 2 to 3 times those of individuals who did not, although statistical significance for this effect was not achieved. Because this variant is rare in African-Americans and not present in our sample of Whites, our power to detect a statistically significant effect was limited, and very large sample sizes are required to obtain statistically meaningful numbers of individuals carrying the 470G allele. Future studies in the entire ARIC cohort will clarify the role of this variant in ischemic stroke. Interestingly, in vitro enzymatic assays showed that the 470G EPHX2 isoform had a significantly greater hydrolase activity but similar phosphatase activity as compared to the wild type isoform (17). Such an altered function, if present in vivo, may have important consequences for the regulation of EET levels in the brain and vasculature, and may contribute to cerebrovascular injury (12). None of the other non-synonymous polymorphisms with known relevance to enzyme structure or function was independently associated with incident ischemic stroke.
The relationship between EPHX2 sequence variation and ischemic stroke may be exerted through higher-order combinations of polymorphisms, rather than through their individual effects. Indeed, it has been suggested that haplotypes, rather than individual variants, define functional units of genes (26, 27). Thus, we next investigated the association of common EPHX2 haplotypes with stroke risk. In African-Americans, two common haplotypes with opposing effects on ischemic stroke risk were identified. Haplotype ATACGGT was associated with a significantly decreased hazard rate for incident ischemic stroke, either alone or after adjusting for other known stroke risk factors. The variant(s) responsible for this effect is (are) unknown. Additional markers that uniquely tag this haplotype may be helpful in identifying the underlying causative variant(s). Haplotype ACACAGT was associated with a significantly increased hazard rate of incident ischemic stroke, and this effect was largely independent of the effect of other stroke risk factors. This haplotype is uniquely tagged by SNP9 (Ala425Ala), which was significantly associated with incidence of ischemic stroke in single polymorphism analyses, suggesting that this variant itself, or a variant(s) in LD with it, contributes to ischemic stroke risk in African-Americans. On average, synonymous SNPs are more frequent than non-synonymous SNPs and it is generally thought that most synonymous mutations may be functionally neutral. However, studies in E. coli, yeast, and Drosophila provide evidence of translational selection for major codons (28, 29), and it has been shown that there is a strong correlation between bias in synonymous codon usage and gene expression levels (30). Moreover, given its close proximity to the junction of exon and intron 14, it is also possible that SNP9 may affect RNA splicing.
Predictions from computational analysis (31, 32) revealed that the G to A substitution at SNP9 may lead to the activation of an exonic splicing enhancer (ESE), with the A allele having a 6.8-fold greater predicted SF2/ASF binding affinity than the G allele (see Supplementary Figure 1). To our knowledge, alternate splice variants of this gene have not been characterized in humans. Further studies are needed to determine whether the Ala425Ala polymorphism (SNP9) may produce such variants.
The strong LD between E470G and SNP9 raises the possibility that the association of SNP9 with ischemic stroke may represent, at least in part, the association of E470G with the disease. Several lines of evidence suggest that this is not so. First, the haplotype associated with greater stroke incidence and uniquely tagged by SNP9 carries the E470 allele, not the G470 allele. Second, the association of one SNP with incident ischemic stroke is not affected when controlling for the effects of the other (not shown), suggesting independent associations of the two SNPs with the disease.
Although we could not find conclusive evidence for an association between variation in the EPHX2 gene and ischemic stroke in Whites, there was a suggestive indication for an effect of a common EPHX2 haplotype, AAGTA, on higher ischemic stroke risk. This haplotype was associated with a modestly increased ratio of hazard rates of incident ischemic stroke, and was uniquely tagged by a SNP in intron 5, which itself showed some suggestion of an association with increased ischemic stroke incidence. It is not clear whether this SNP itself, or one or more SNPs in LD with it, is responsible for the observed effects. As in African-Americans, in addition to contributing to increased ischemic stroke risk, the EPHX2 gene may also encompass sequence variation that contributes to lower ischemic stroke risk in Whites, as suggested by the trend toward an association of the AGGTA haplotype with lower ischemic stroke risk. We have recently shown that this haplotype is also significantly associated with an increased risk for calcified plaque in the CARDIA cohort (Fornage et al, in preparation). Plaques vulnerable to rupture and, thus, prone to cause stroke, have rarely been found to contain calcium, and there is accumulating evidence that plaque calcification may confer plaque stability (33). Moreover, patients with calcified plaque are less likely to have symptomatic disease than those with non-calcified plaque (33–36). Our findings of a common EPHX2 haplotype associated with both increased susceptibility to calcified plaque and decreased risk for stroke are consistent with such data and justify further investigation of the role of this gene in the clinical manifestations of atherosclerosis.
Taken together, these data provide evidence that both common and rare variants are likely to play a role in the relationship between the EPHX2 gene and ischemic stroke. Moreover, they provide evidence that multiple variants exist within or near the EPHX2 gene, with greatly contrasting effects on ischemic stroke incidence: some are associated with a higher incidence, others with a lower incidence. Several hypotheses may be put forth to explain these results. First, multiple functional variants of the enzyme have been identified with contrasting enzymatic activity and/or stability (17, 19). Hence, different haplotypes carrying different such variants may influence disease risk in a contrasting manner. Second, it has recently been shown that EPHX2 is a bifunctional enzyme composed of an epoxide hydrolase domain at its C-terminus and a phosphatase domain at its N-terminus (37, 38). Thus, the presence of two functionally distinct domains encoded by the EPHX2 gene provides the opportunity for functional variants of these domains to influence the phenotype in a distinct fashion. Third, it is also conceivable that the observed associations may not stem solely from effects of polymorphisms in the EPHX2 gene, but also from those of polymorphisms in a nearby gene, whose relationship to ischemic stroke may differ from and oppose that of EPHX2. However, recent data on whole-genome patterns of common DNA variation (39, 40) do not support the possibility that LD extends to the characterized genes near EPHX2.
This is the first study to describe the common patterns of EPHX2 gene variation associated with incident ischemic stroke in a well-characterized epidemiological setting, such as the ARIC cohort. However, the large number of statistical tests performed warrants cautious interpretation of the findings, and underscores the need for additional studies to independently confirm the data reported here. Dual associations of EPHX2 gene variation with risk of coronary heart disease have been observed in a separate study (DC Zeldin, personal communication), lending further credibility to the hypotheses presented here. Additional studies are also needed to further investigate the functional relationships of the EPHX2 sequence variation with ischemic cerebrovascular disease. Identification of the variants influencing susceptibility to ischemic stroke is a necessary first step toward understanding the biological basis of the associations detected in this study.
Participants were selected from the Atherosclerosis Risk in Communities (ARIC) study, an ongoing prospective study of atherosclerosis and its sequelae. The cohort consists of a probability sample of 15,792 men and women aged 45 to 64 years at the time of initial examination. Subjects were recruited from four US communities (suburban Minneapolis, Minnesota; Washington County, Maryland; Forsyth County, North Carolina; and Jackson, Mississippi). In the first three communities, the sample reflects the demographic composition of the community. In Jackson, only black residents were enrolled. The baseline examination, conducted from 1987 to 1989, included a home interview to ascertain demographic, socioeconomic, and cardiovascular risk factors, and medical history; a clinical examination; and a blood draw for laboratory determination. The cohort was followed by annual phone interviews, a clinic examination every 3 years, and hospital and death certificate surveillance. A detailed description of the ARIC study design and methods has been previously published (41). Subjects with prevalent coronary heart disease, stroke, or transient ischemic attacks at the baseline visit were excluded from these analyses (N=1,434).
Incidence of clinical ischemic stroke was determined by review of hospital records and survey of discharge lists from local hospitals and death certificates from state vital statistics offices for potential cerebrovascular events (41, 42). Validation of stroke hospitalizations in ARIC is described elsewhere (42). Briefly, a hospitalization was eligible for validation if there was: 1) a discharge diagnosis indicative of stroke; 2) a discharge summary indicative of a possible stroke such as: transient ischemic attack (TIA), cerebral vascular disease, cerebral hemorrhage, cerebral infarction (list not exhaustive); and/or 3) a CT or MRI with cerebral vascular findings or admission to the neurological intensive care unit. Records for eligible hospitalizations were abstracted at a central center by a trained nurse using a standardized computerized algorithm. Medical records were also reviewed by a trained physician. Disagreements between computer diagnosis and physician review were adjudicated by a second physician. A total of 315 validated incident clinical ischemic strokes were identified through 2001. A stratified random sample of the ARIC cohort (Cohort Random Sample (CRS), n=1,021) was used as the comparison group for the clinical cases. Selection of the CRS was stratified on the basis of ultrasound examination of carotid arteries, age, and sex.
Fasting levels of total triglycerides and of total and HDL cholesterol were measured by enzymatic methods. LDL cholesterol was calculated. Seated blood pressure was measured three times with a random-zero sphygmomanometer and the last 2 measurements were averaged. Hypertension status was defined as a systolic blood pressure level ≥ 140 mm Hg, a diastolic blood pressure level ≥ 90 mm Hg, and/or treatment with antihypertensive medication. Diabetes was defined as a fasting blood glucose level ≥ 126 mg/dl, a non-fasting blood glucose level ≥ 200 mg/dl, and/or a history of or treatment for diabetes. The ratio of waist (umbilical level) to hip (maximum buttocks) circumference was calculated as a measure of adiposity. Cigarette smoking status was analyzed by comparing current smokers to individuals who had formerly or never smoked.
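The covariate definitions above can be written as simple classification rules. The sketch below is illustrative only: the thresholds come from the text, but the function and parameter names are hypothetical and do not correspond to ARIC variable names.

```python
def mean_of_last_two(readings):
    """Average the last 2 of the 3 seated blood pressure readings."""
    return (readings[-2] + readings[-1]) / 2.0


def is_hypertensive(sbp, dbp, on_antihypertensive_medication):
    """SBP >= 140 mm Hg, DBP >= 90 mm Hg, and/or antihypertensive treatment."""
    return sbp >= 140 or dbp >= 90 or on_antihypertensive_medication


def is_diabetic(glucose_mg_dl, fasting, history_or_treatment):
    """Fasting glucose >= 126 mg/dl, non-fasting glucose >= 200 mg/dl,
    and/or a history of or treatment for diabetes."""
    threshold = 126 if fasting else 200
    return glucose_mg_dl >= threshold or history_or_treatment


def waist_hip_ratio(waist_cm, hip_cm):
    """Waist (umbilical level) divided by hip (maximum buttocks) circumference."""
    return waist_cm / hip_cm


# Hypothetical participant: three systolic and three diastolic readings
sbp = mean_of_last_two([150, 142, 138])
dbp = mean_of_last_two([88, 86, 84])
print(sbp, dbp, is_hypertensive(sbp, dbp, on_antihypertensive_medication=False))
```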
Twelve polymorphisms spanning the EPHX2 gene sequence were selected from previously published work, public databases, as well as from our own resequencing efforts. Polymorphisms were selected for genotyping based on their known or likely functional significance (non-synonymous SNPs), frequency (>5%), and/or their ability to tag haplotype blocks based on r² values (43). Genotyping was performed using the TaqMan assay (Applied Biosystems) or the MassARRAY assay (Sequenom) as previously described (44). Genotype determination was done blinded to case-cohort status. For each polymorphism, race-specific genotype frequencies were estimated by gene counting in the CRS. Agreement of the genotype frequencies with Hardy-Weinberg equilibrium expectations was tested using a χ² goodness-of-fit test.
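As an illustration of the gene-counting and goodness-of-fit steps, the following minimal sketch computes the allele frequency and the χ² statistic for Hardy-Weinberg equilibrium at one biallelic SNP. The function name and the genotype counts are hypothetical; the statistic has 1 degree of freedom, so values above about 3.84 correspond to P<0.05.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (allele A frequency, chi-square statistic) for one SNP.

    n_aa, n_ab, n_bb : observed counts of the three genotypes
    """
    n = n_aa + n_ab + n_bb

    # Gene counting: each AA homozygote carries 2 copies of A,
    # each heterozygote carries 1 copy.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p

    # Genotype counts expected under Hardy-Weinberg equilibrium
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2


# Hypothetical counts with an apparent deficit of heterozygotes
p, chi2 = hwe_chi_square(400, 400, 200)
print(round(p, 2), round(chi2, 2))
```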
Weighted analyses were used to draw inferences to the entire ARIC cohort from whom the CRS was drawn. The weight for a given stratum of the CRS equaled the total number of eligible participants in the study population stratum divided by the total number of participants sampled from that stratum. The proportions, means, and standard errors of established stroke risk factors were reported as weighted results for incident clinical ischemic stroke cases and the CRS.
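The weighting scheme can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the stratum labels, counts, and measurement values are hypothetical, and the helper names `stratum_weights` and `weighted_mean` are introduced here only for the example.

```python
def stratum_weights(eligible, sampled):
    """Weight for each stratum = eligible participants in the study
    population stratum / participants sampled from that stratum."""
    return {s: eligible[s] / sampled[s] for s in sampled}


def weighted_mean(records, weights):
    """Weighted mean of a risk factor.

    records : iterable of (stratum, value) pairs for sampled participants
    weights : mapping from stratum to sampling weight
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for stratum, value in records:
        w = weights[stratum]
        total_weight += w
        weighted_sum += w * value
    return weighted_sum / total_weight


# Hypothetical strata: one sampled at 1 in 10, the other at 1 in 2
weights = stratum_weights(
    eligible={"stratum_1": 1000, "stratum_2": 200},
    sampled={"stratum_1": 100, "stratum_2": 100},
)
records = [("stratum_1", 120.0), ("stratum_2", 150.0)]
print(weights, weighted_mean(records, weights))
```

Participants from under-sampled strata receive larger weights, so that weighted summaries reflect the full cohort rather than the sampled subset.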
For each polymorphism, Cox proportional hazards models were used to estimate the ratios of hazard rates of incident clinical ischemic stroke between individuals in the different genotype categories. The method of Barlow was used to adjust for the sampling strategy in the Cox proportional hazards models (45). Three general models were estimated: Model 1 was an unadjusted model containing only the genotype category; Model 2 included age, sex, and field center as covariates; Model 3 included the covariates in Model 2 plus hypertension status, diabetes status, smoking status, and waist-hip ratio. For low frequency polymorphisms, heterozygotes and homozygotes for the minor allele were combined into a single category. All analyses were performed separately by race.
Within each race, only polymorphisms with a minor allele frequency ≥ 10% were further considered for the haplotype analyses. SNPs with pairwise r² > 0.8 were considered redundant and only one of them was included in the haplotype analyses. Hence, SNPs included in the haplotype analyses constituted a minimal set of highly informative markers (tagSNPs) while minimizing redundant data. This set was composed of SNPs 1, 2, 4, 7, 9, 11, and 12 in African-Americans, and of SNPs 1, 5, 6, 7, and 11 in Whites.
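One way to implement the two filters described above is a greedy pass over the SNPs, sketched below. The text does not state which member of a redundant pair was retained, so keeping the first SNP encountered is an assumption of this sketch. The SNP labels, allele frequencies, and r² values are hypothetical.

```python
def select_tag_snps(maf, r2, maf_min=0.10, r2_max=0.8):
    """Return a non-redundant list of SNPs for haplotype analysis.

    maf : dict mapping SNP name to minor allele frequency
          (iterated in insertion order, e.g. gene order)
    r2  : dict mapping frozenset({snp1, snp2}) to pairwise r^2;
          missing pairs are treated as r^2 = 0
    """
    kept = []
    for snp in maf:
        # Filter 1: drop low-frequency polymorphisms
        if maf[snp] < maf_min:
            continue
        # Filter 2: drop SNPs redundant with one already kept
        # (assumption: the first SNP of a redundant pair is retained)
        if any(r2.get(frozenset((snp, k)), 0.0) > r2_max for k in kept):
            continue
        kept.append(snp)
    return kept


# Hypothetical input: snpB is too rare, snpC is redundant with snpA
maf = {"snpA": 0.25, "snpB": 0.04, "snpC": 0.24, "snpD": 0.30}
r2 = {
    frozenset(("snpA", "snpC")): 0.95,
    frozenset(("snpA", "snpD")): 0.20,
    frozenset(("snpC", "snpD")): 0.30,
}
print(select_tag_snps(maf, r2))
```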
Significance of haplotype effects on each of the traits was evaluated using a regression method for unphased haplotypes based on score statistics (46, 47). This method uses the Generalized Linear Model (GLM) framework but accounts for haplotype ambiguity by modeling the probabilities of the possible haplotype pairs per subject (46). Score statistics were constructed to test the null hypothesis of no haplotype effects on the probability of being a stroke case. Statistical significance was evaluated by a permutation test that takes into account the number of haplotypes and, therefore, adjusts for multiple comparisons. Adjustment for covariates followed the same scheme as described above. Only individuals whose genotype was non-missing in 80% of the selected tagSNPs were included in haplotype-based analyses (N=439 and 813, for African-Americans and Whites, respectively). All analyses were performed separately by race.
While the score method has the advantage that permutation P values for significance tests can be easily computed, it does not estimate the regression parameters and, hence, the magnitude of the effects of haplotypes on traits. Each individual was therefore assigned the most probable pair of haplotypes, as estimated by the EM algorithm implemented in the score method above. For those haplotypes showing a significant association with ischemic stroke by the score test, haplotype effect size was estimated by the Barlow method described above, modeling the ratios of hazard rates of incident clinical ischemic stroke between individuals carrying 0, 1, or 2 copies of the given haplotype.
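The coding step that turns haplotype-pair estimates into a 0/1/2 covariate can be sketched as follows. This is a minimal illustration that assumes posterior probabilities for each candidate haplotype pair are already available from an EM procedure; the probabilities shown are hypothetical, and the function names are introduced only for this example. The haplotype labels are those reported for African-Americans in this study.

```python
def most_probable_pair(posteriors):
    """Pick the haplotype pair with the highest posterior probability.

    posteriors : dict mapping (haplotype_1, haplotype_2) tuples to the
                 posterior probability of that pair for one individual
    """
    return max(posteriors, key=posteriors.get)


def haplotype_copies(pair, target):
    """Number of copies (0, 1, or 2) of the target haplotype in a pair."""
    return sum(1 for haplotype in pair if haplotype == target)


# Hypothetical individual whose genotypes are compatible with two pairs
posteriors = {
    ("ACACAGT", "ATACGGT"): 0.85,
    ("ACACGGT", "ATACAGT"): 0.15,
}
pair = most_probable_pair(posteriors)
print(pair, haplotype_copies(pair, "ACACAGT"))
```

Assigning the single most probable pair ignores residual phase uncertainty, which is one reason this step was used only to estimate effect sizes after the score test had established significance.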
This research was supported by grants NS41466 and HL69126 to MF, by grant ES012856 to CL, and by funds from the NIEHS Division of Intramural Research to DZ. The Atherosclerosis Risk in Communities Study is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts N01-HC-55015, N01-HC-55016, N01-HC-55018, N01-HC-55019, N01-HC-55020, N01-HC-55021, and N01-HC-55022. The authors thank the staff and participants of the ARIC study for their important contributions.