The aim of this study was to explore whether genetic variation at the OPRM1 locus is a risk factor for AD and/or DD. Although numerous previous studies have examined specific variants (generally relatively uninformative ones, mapping either 5′ to the locus or in exon 1), this is, to our knowledge, the first attempt to evaluate a series of markers spanning the coding sequence of the locus, and the first to consider informative markers that map 3′ to exon 1. We previously tested the association of the two most commonly studied OPRM1 variants, Ala6Val and Asn40Asp (both located in exon 1), with substance dependence [(27); that sample was expanded in the present study]. However, as in most previous studies, the allelic association analyses were not significant. We therefore focused our effort on informative markers 3′ to this region, which are located in OPRM1 introns.
First, HWE analyses suggested that OPRM1 intronic SNPs affect risk for AD and/or DD. This approach tests whether genotype frequencies at a locus in a test group reflect an equilibrium state, which is expected (in the absence of an effect on phenotype) under the assumptions of non-overlapping generations, a large sample with random mating, no mutation, no migration and no selection. A genotype distribution that is not in HWE (i.e. is in HWD) in a case group may suggest an association between the locus and the disease studied (52). HWD tests have been applied in fine-scale mapping of mutations responsible for diseases (52–54). Here, we found that the genotypic distributions of four intronic SNPs in Block I (SNPs 2–5) and three intronic SNPs in Block II (SNPs 10–12) were in HWD in substance-dependent EA subjects (P < 0.05). When the EA case subjects were divided by the substance of dependence, SNPs 2–5 in Block I and SNPs 11 and 12 in Block II remained in HWD in EA subjects with AD, and SNPs 4 and 5 in Block I and SNP10 in Block II remained in HWD in EA subjects with DD. These HWDs suggest that these markers may be associated with the disorders.
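To make the logic of an HWE test concrete, the following is a minimal sketch of a chi-square goodness-of-fit test at a single biallelic SNP. It is an illustration only: the text does not specify which HWE test was applied, the function name and arguments are hypothetical, and the genotype counts are invented rather than taken from this study.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for HWE at one biallelic SNP.

    Arguments are observed genotype counts (hypothetical names):
    major homozygotes, heterozygotes, minor homozygotes.
    Returns (chi-square statistic, P-value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the major allele
    q = 1.0 - p                      # frequency of the minor allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Illustrative counts only, not data from this study.
chi2_eq, p_eq = hwe_chi_square(25, 50, 25)    # matches HWE proportions
chi2_dev, p_dev = hwe_chi_square(40, 20, 40)  # heterozygote deficit
print(chi2_eq, p_eq)
print(chi2_dev, p_dev)
```

In this sketch, a small P-value in a case group corresponds to HWD. For small samples or rare alleles an exact test is usually preferred over the chi-square approximation.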
Although HWD can indicate a valid disease-locus association, two primary alternative explanations, genotyping error and population admixture, must be considered (55, 56). In this study, genotyping quality was controlled by genotyping DNA samples in duplicate, which indicated a low genotyping error rate (see ‘Genotyping’ below). In addition, the influence of other systematic flaws can be excluded because all selected SNPs were in HWE in the EA control samples. Moreover, Leal (57) reported that if multiple markers that are in LD are not in HWE, genetic factors rather than genotyping errors would be the expected origin of the violation of HWE. Therefore, HWD in our cases was unlikely to be due to genotyping errors. As discussed below, it is also unlikely that population admixture resulted in HWD in our EA cases.
Second, results from linkage disequilibrium (LD) analyses suggested that SNPs within intron 1 might confer vulnerability to AD and/or DD. SNP Asn40Asp, located in exon 1, has been the most commonly studied variant in the OPRM1 gene because of its potential functional effects. However, the conflicting results regarding this marker prompted us to hypothesize that another locus influencing the risk for AD and DD might exist elsewhere in OPRM1 and be in substantial LD with this marker. We used high-density SNP LD mapping (average distance between SNPs ~6000 bp) to examine this hypothesis. LD analyses of the OPRM1 SNP markers in our substance-dependent EA cases showed that SNP Asn40Asp was in the same haplotype block as six SNPs within intron 1. In particular, pair-wise LD analyses showed that SNP Asn40Asp was in complete LD (D′ = 1.000) with four of the six intronic SNP markers in Block I (SNPs 3, 4, 6 and 7) in EA subjects with AD. In EA subjects with DD, SNP Asn40Asp was in complete LD (D′ = 1.000) with five intronic SNPs in Block I (SNPs 3–7). Thus, prior positive results with Asn40Asp may be attributable in part to LD with other functional variants. This could help explain the observed variability in association findings with that marker, as LD relationships would be expected to vary more between populations than functional effects do.
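The pair-wise LD measure D′ referred to above can be sketched as follows, alongside r², for two biallelic loci. This sketch assumes that phased haplotype counts are available (in practice they are estimated from genotype data); the function name and the counts are hypothetical and are not data from this study.

```python
def pairwise_ld(n_ab, n_aB, n_Ab, n_AB):
    """Return (D', r^2) from haplotype counts at two biallelic loci.

    Hypothetical argument names: lower case marks the major allele and
    upper case the minor allele at the first (a/A) and second (b/B) locus.
    Both loci are assumed to be polymorphic.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    p_A = (n_Ab + n_AB) / n   # minor allele frequency, locus 1
    p_B = (n_aB + n_AB) / n   # minor allele frequency, locus 2
    d = n_AB / n - p_A * p_B  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d_prime, r2


# Illustrative counts only: the A-b haplotype is never observed.
d_prime, r2 = pairwise_ld(n_ab=50, n_aB=40, n_Ab=0, n_AB=10)
print(d_prime, r2)
```

With one of the four haplotypes absent, D′ equals 1 even though r² is low (about 0.11 here), so complete LD by D′ does not by itself imply that two markers carry the same association information.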
Third, allele-wise and genotype-wise analyses confirmed the association between OPRM1 intronic markers and AD and/or DD. We observed significant differences between substance-dependent EA cases and controls in allele and/or genotype frequency distributions of six intronic SNPs in Block I and two intronic SNPs in Block II. As 13 OPRM1 SNP markers were included in the association analysis, adjustment for multiple testing was necessary. Because the 13 OPRM1 SNP markers were located in two separate haplotype blocks and strong LD between SNPs was observed within each block, we used the program SNPSpD (48) as an alternative to the standard Bonferroni method to correct for multiple comparisons. This program takes marker LD into consideration and generates an experiment-wide significance threshold that keeps the Type I error rate (i.e. the rate of false-positive disease associations) below 0.05. After correction by this method (the significance threshold P-value was set at 0.005), association results for SNPs 4 and 5 in Block I and SNPs 11 and 12 in Block II remained significant. In addition, analysis of four OPRM1 SNP markers (SNPs 4, 5, 11 and 12) in a Russian AD replication sample generally supported the above findings. The modest significance levels obtained in the Russian population may be attributable to the comparatively small number of Russian control subjects.
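The idea of an LD-aware correction can be sketched as follows. This is not the SNPSpD program itself; it assumes the spectral-decomposition formula Meff = 1 + (M − 1)(1 − Var(λ)/M), where λ are the eigenvalues of the M × M pairwise LD correlation matrix, and it assumes that the experiment-wide threshold is obtained by dividing α by Meff. The function names and matrices are hypothetical. The variance of the eigenvalues is computed without an eigen-decomposition, using the facts that they sum to M and that their squares sum to the sum of squared matrix entries.

```python
def effective_tests(corr):
    """Effective number of independent tests from an LD correlation matrix.

    corr is a symmetric list-of-lists with ones on the diagonal.
    """
    m = len(corr)
    if m == 1:
        return 1.0
    # Sum of squared eigenvalues equals the sum of squared entries.
    sum_sq = sum(v * v for row in corr for v in row)
    # Sample variance of the eigenvalues (their mean is 1).
    var_lambda = (sum_sq - m) / (m - 1)
    return 1 + (m - 1) * (1 - var_lambda / m)


def experiment_wide_threshold(corr, alpha=0.05):
    """Bonferroni-style threshold using the effective number of tests."""
    return alpha / effective_tests(corr)


# Illustrative matrices only, not LD estimates from this study.
independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
redundant = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
partial = [[1.0, 0.8], [0.8, 1.0]]
print(effective_tests(independent), effective_tests(redundant))
print(effective_tests(partial), experiment_wide_threshold(partial))
```

Under these assumptions, fully independent markers give Meff = M (the Bonferroni case) and fully redundant markers give Meff = 1. A threshold of 0.005 would correspond to roughly ten effective tests at α = 0.05.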
Fourth, haplotype-based analyses also supported the association between OPRM1 variants and substance dependence. As strong LD was observed for the SNPs in the two haplotype blocks, including all 13 SNPs in the haplotype analysis could substantially dilute the haplotype association results because of redundancy in the genotype information for some of the markers. A number of studies have demonstrated that regions of high LD display low haplotype diversity (58–60). This implies that common haplotypes can be efficiently tagged using tag SNPs derived from a subset of the common variants. Therefore, to avoid including redundant genotype information in the haplotype analysis, we used the program Haploview Tagger (with the r² threshold set at 0.8, as SNPs with shared variance of r² ≥ 0.8 can be considered somewhat redundant) and identified three tag SNPs in Block I (SNPs 1, 4 and 5) and two tag SNPs in Block II (SNPs 11 and 12).
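Tag SNP selection with an r² threshold can be illustrated with a simple greedy procedure. This is a sketch of the general idea and not the algorithm implemented in Haploview Tagger; the function name, the SNP labels and the r² values are hypothetical.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Pick tag SNPs so that every SNP has r^2 >= threshold with some tag.

    snps: ordered list of SNP labels.
    r2: dict mapping frozenset({snp_i, snp_j}) to pairwise r^2;
        missing pairs are treated as r^2 = 0.
    """
    def captured_by(s):
        return {t for t in snps
                if t == s or r2.get(frozenset((s, t)), 0.0) >= threshold}

    untagged = set(snps)
    tags = []
    while untagged:
        # Choose the SNP that captures the most still-untagged SNPs;
        # ties are broken by the order of the input list.
        best = max(snps, key=lambda s: len(captured_by(s) & untagged))
        tags.append(best)
        untagged -= captured_by(best)
    return tags


# Illustrative labels and r^2 values only, not estimates from this study.
snps = ["s1", "s2", "s3", "s4", "s5"]
r2 = {
    frozenset(("s1", "s2")): 0.95,
    frozenset(("s2", "s3")): 0.85,
    frozenset(("s1", "s3")): 0.60,
    frozenset(("s4", "s5")): 0.90,
}
tags = greedy_tag_snps(snps, r2)
print(tags)
```

In this toy example two tags suffice for five SNPs, because s2 captures s1 and s3 and s4 captures s5.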
As more than half of the tested markers were not in HWE in the EA cases, the program PHASE was applied to estimate haplotype frequencies for both EA cases and controls. This program is based on a Bayesian statistical method and the Partition Ligation algorithm. Even if HWE is violated, PHASE can still be used to reconstruct haplotypes from the genotype data and to estimate haplotype frequencies accurately (49). In this study, we found significant associations between the five tag SNP haplotypes and AD and DD in EA subjects. Moreover, the haplotype results are consistent with those from the allelic association analysis: haplotype AGTTC, containing the major alleles of SNPs 4 and 5 in Block I and the minor alleles of SNPs 11 and 12 in Block II, was significantly more common in EA controls than in EA cases, whereas haplotype AACCT, containing the minor alleles of SNPs 4 and 5 in Block I and the major alleles of SNPs 11 and 12 in Block II, was significantly more common in EA cases than in EA controls. In addition, haplotype association analysis results from the Russian samples supported this conclusion.
Fifth, population admixture (in this sample of European ancestry) was excluded as an explanation for the above findings through the application of a set of AIMs. Based on the genotype data of the 37 AIMs analyzed using the program STRUCTURE, all self-reported EA subjects could be assigned to a ‘genetic’ EA population group. This is in agreement with our own previous observations (45, 61) and with the findings by Tang et al. (62) that genetic clustering for several groups, including EAs, corresponds very closely to self-identified race. In our EA cases and controls, although no individual had an EA ancestry proportion as high as 1.0, the degree of admixture was very low (the average admixture degree being only 1.3%). In addition, the ancestry proportions of our EA cases were not significantly different from those of our EA controls when analyzed using the program STRAT. Therefore, there was minimal potential for Type I errors to result from population stratification.
Finally, logistic regression analyses confirmed the association between OPRM1 gene variants and AD and/or DD. To reduce the heterogeneity that may confound case-control studies, the sex and age of cases and controls should ideally be matched. However, in this study, the EA cases and controls differed significantly on these variables. Hence, we used backward stepwise logistic regression analysis to control for these differences statistically. This method uses the χ² statistic to identify variables to be removed from the model; the covariates remaining in the final regression model contributed most to the difference between cases and controls. By this method, we found that the minor allele of SNP5 (rs495491) in Block I was a risk allele for AD and DD via a recessive mode of action, whereas the minor allele of SNP11 (rs609148) in Block II was a protective allele with respect to AD and DD, exerting a dominant effect. These results were in agreement with the findings of the allelic association analyses that the minor alleles of several SNPs (including SNP5) in Block I were significantly more frequent, whereas the minor alleles of two SNPs (including SNP11) in Block II were significantly less frequent, in cases than in controls. In logistic regression analysis, as covariates were included in a single model with allele, genotype, haplotype or diplotype data, concern over multiple testing is reduced. In the Russian samples, as both case and control subjects were male and matched on age, regression analysis was not performed.
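The recessive and dominant modes of action described above correspond to different codings of the genotype as a regression predictor. The sketch below shows these codings and the resulting crude odds ratio, which for a single binary predictor equals the exponentiated coefficient of an unadjusted logistic regression. It does not reproduce the backward stepwise procedure or the adjustment for sex and age; the function names and genotype counts are hypothetical.

```python
def encode(minor_allele_count, model):
    """Code a genotype (0, 1 or 2 minor alleles) as a binary predictor."""
    if model == "recessive":
        return int(minor_allele_count == 2)  # two copies required
    if model == "dominant":
        return int(minor_allele_count >= 1)  # one copy is sufficient
    raise ValueError("model must be 'recessive' or 'dominant'")


def crude_odds_ratio(cases, controls, model):
    """Unadjusted odds ratio for the coded genotype, cases vs controls.

    Assumes no cell of the 2 x 2 table is zero.
    """
    a = sum(encode(g, model) for g in cases)     # coded 1, cases
    b = len(cases) - a                           # coded 0, cases
    c = sum(encode(g, model) for g in controls)  # coded 1, controls
    d = len(controls) - c                        # coded 0, controls
    return (a * d) / (b * c)


# Illustrative minor-allele counts only, not data from this study.
cases = [2] * 30 + [1] * 40 + [0] * 30
controls = [2] * 10 + [1] * 45 + [0] * 45
or_recessive = crude_odds_ratio(cases, controls, "recessive")
or_dominant = crude_odds_ratio(cases, controls, "dominant")
print(or_recessive, or_dominant)
```

An odds ratio above 1 under a given coding indicates a risk effect of the minor allele and a value below 1 a protective effect; comparing fits across codings is one way to judge which mode of action is more consistent with the data.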
Haplotype regression analysis also supported the findings of the haplotype association analysis, i.e. a specific haplotype, AGTTC, which harbored the major alleles of SNPs 4 and 5 and the minor alleles of SNPs 11 and 12, protected against AD and DD. In addition, we performed DTR analysis to verify the findings from the individual marker and haplotype analyses. Luo et al. (52, 63) have reported that DTR might be a more powerful approach than haplotype analysis because it provides more information about the gene and its association with a disease. By DTR, we found that two diplotypes, both harboring the specific haplotype AGTTC, contributed to protection against AD and DD. This suggests that interaction of a number of OPRM1 variants affecting expression (including Asn40Asp) could confer variable liability to substance dependence. However, we did not observe a greater protective role for the diplotype homozygous for haplotype AGTTC, despite a frequency of approximately 6%. In addition, as diplotypes with a frequency of less than 0.01 were omitted (i.e. 259/1207 = 21.4% of haplotypes or diplotypes were discarded), we cannot exclude the possibility that certain rare diplotypes or haplotypes may confer vulnerability to substance dependence.
Although our AD sample was substantially larger than our DD sample, in most cases, we observed statistically stronger evidence of association for DD than for AD for SNPs from haplotype Block I (). This suggests that OPRM1 Block I genetic variation is a more important predisposing factor for DD than for AD. In contrast, the opposite pattern existed for SNPs in Block II (). We infer from these patterns that there may be two distinct coding region risk loci mapped to OPRM1, one of which is more important for DD risk (mapped to the 5′ haplotype block, Block I), and one that is more important for AD risk (mapped to the 3′ haplotype block, Block II). In addition, the data hint at a possible difference in mode of action for these distinct loci, with the Block I locus data more consistent with an allele-wise (recessive) effect and the Block II data more consistent with a genotype-wise (dominant) effect. One possible explanation for this observation is that the Block I and Block II variants could act by different functional mechanisms, one of which is more relevant to DD, and the other, to AD, but it is also possible that this apparent difference is a chance finding. We predicted that the association would be strongest for opioid-dependent subjects, because of the direct interaction of opioids with the MOR. Further association analysis was consistent with this prediction (), although the sample size for OD was small. This apparent difference is easy to understand in the context of OD and CD pathophysiology; direct opioidergic mechanisms might reasonably be more important for OD than for CD.
In conclusion, the present study showed that multiple intronic SNPs in OPRM1 may increase risk for substance dependence in the EA and Russian populations. These data are consistent with the interpretation that there are at least two bi-allelic risk variants at the OPRM1 locus, one mapping to the haplotype Block I, and the other, to the haplotype Block II. In particular, SNPs located in intron 1, which are in close LD with the most frequently studied SNP Asn40Asp, and SNPs in intron 3, which may be involved in alternative gene splicing or transcription regulation, are worthy of further investigation. It would be of great interest to determine if a significant association of substance dependence with OPRM1 gene variants can be detected in other population groups (e.g. African-Americans). Furthermore, it will be important to examine the potential functional effects of these variants, to establish a mechanism by which they increase or decrease an individual's susceptibility to substance dependence.