Genet Epidemiol. Author manuscript; available in PMC 2010 January 1.

Published in final edited form as:

Genet Epidemiol. 2009 January; 33(1): 63–78.

doi: 10.1002/gepi.20358 PMCID: PMC2700858

NIHMSID: NIHMS103623

In genetic mapping of complex traits, scored haplotypes are likely to represent only a subset of all causal polymorphisms. At the extreme of this scenario, observed polymorphisms are not themselves functional, and only linked to causal ones via linkage disequilibrium (LD). We will demonstrate that due to such incomplete knowledge regarding the underlying genetic mechanism, the variance of a trait may become different between the scored haplotypes. Thus, unequal variances between haplotypes may be indicative of additional functional polymorphisms affecting the trait. Methods accounting for such haplotype-specific variance may also provide an increased power to detect complex associations. We suggest ways to estimate and test these haplotypic variance contrasts, and incorporate them into the haplotypic tests for association. We further extend this approach to data with unknown gametic phase via likelihood-based simultaneous estimation of haplotypic effects and their frequencies. We find our approach to provide additional power, especially under the following types of models: (a) where scored and unobserved variants are epistatically interacting with each other; and (b) under heterogeneity models, where multiple unobserved mutations are linked to nonfunctional observed polymorphisms via LD. An illustrative example of usefulness of the method is discussed, utilizing analysis of multilocus effects within the catechol-O-methyl transferase (COMT) gene.

There is a substantial body of work on characterizing haplotype-trait associations using unrelated individuals. Simple methods of dealing with the unknown haplotype phase in association tests have been proposed [Schaid et al., 2002; Zaykin et al., 2002; Luo et al., 2006]. Xie and Stram [2005] showed asymptotic equivalence of two common types of these approaches. These methods have been found to provide adequate inference concerning both hypothesis testing, and association parameter estimation, and have been recommended for usage [Stram et al., 2003; Kraft et al., 2005; Xie and Stram, 2005; Kraft and Stram, 2007]. Tzeng et al. [2006] extended these approaches to incorporate evolutionary clustering of haplotypes. Unbiased estimation of association parameters may require a simultaneous estimation of haplotype frequencies and association parameters. The maximum-likelihood methods are theoretically preferable [Lin and Huang, 2007; Allen and Satten, 2008]. A variety of such methods have been proposed [Tregouet et al., 2002; Epstein and Satten, 2003; Stram et al., 2003; Zhao et al., 2003; Shibata et al., 2004; Lin et al., 2005; Lin and Zeng, 2006]. In this article, we incorporate haplotype and diplotype-specific variances into the likelihood for unphased data. The motivation for this extension comes from a scenario where haplotypes under investigation are either linked via LD to causal variants, or represent only a part of all causal variation. In both cases, the effect of an observed variant (i.e. haplotype) on the trait (*Y*) is a weighted average of the effects that correspond to all relevant polymorphisms considered jointly. The weights are given by the frequencies of these unobserved joint polymorphisms. To simplify the exposition, we first assume that the observed variant is an SNP “*A*” with alleles *A*_{1}, *A*_{2}. We denote haplotypes that include *A*_{1} by *h _{A}*

$${\mu}_{{A}_{1}}=\frac{{\sum}_{j\in {h}_{{A}_{1}}}{\mu}_{j}\phantom{\rule{0.16667em}{0ex}}{p}_{j}}{{\sum}_{j\in {h}_{{A}_{1}}}{p}_{j}}$$

(1)

where *j* is indexing over all haplotypes *h _{A}*

Using a similar notation, the variance of a trait among individuals carrying *A*_{1} is

$${V}_{{A}_{1}}=\frac{{\sum}_{j\in {h}_{{A}_{1}}}{p}_{j}{V}_{j}}{{\sum}_{j\in {h}_{{A}_{1}}}{p}_{j}}+\frac{{\sum}_{j\in {h}_{{A}_{1}}}{p}_{j}{({\mu}_{j}-{\mu}_{{A}_{1}})}^{2}}{{\sum}_{j\in {h}_{{A}_{1}}}{p}_{j}}$$

(2)

where *V _{A}*

$${V}_{{A}_{1}}={\sigma}^{2}+\frac{{\sum}_{j\in {h}_{{A}_{1}}}{p}_{j}{({\mu}_{j}-{\mu}_{{A}_{1}})}^{2}}{{\sum}_{j\in {h}_{{A}_{1}}}{p}_{j}}$$

(3)
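As a small worked illustration of Equations (1) and (3), suppose the allele $A_1$ resides on just two haplotypes, $A_1B_1$ and $A_1B_2$, each with frequency $0.3034/2$. The effect values $\mu_{A_1B_1}=1$ and $\mu_{A_1B_2}=0$ and the residual variance $\sigma^2=1$ are assumed here purely for illustration and are not taken from the study.

$$\begin{array}{l}{\mu}_{{A}_{1}}=\frac{0.1517\cdot 1+0.1517\cdot 0}{0.3034}=0.5\\ {V}_{{A}_{1}}=1+\frac{0.1517\,{(1-0.5)}^{2}+0.1517\,{(0-0.5)}^{2}}{0.3034}=1+0.25=1.25\end{array}$$

If instead $A_1$ occurred only on $A_1B_1$ (that is, $p_{A_1B_2}=0$), the second term would vanish and $V_{A_1}$ would equal $\sigma^2=1$. Under these assumed values, the variance at $A_1$ is inflated only when $A_1$ is carried on backgrounds that differ in their effects.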

The contrast between the variances, *V _{A}*

The main idea of the approach that we develop here is to estimate and test the variance contrast, *V _{A}*

Our approach is based on random population samples, and we assume that there is no statistical confounding, such as confounding due to population stratification. In the absence of confounding, both mean and the variance contrasts are interpreted in a similar manner: *μ _{A}*

We are concerned with haplotype and diplotype association methods that are capable of dealing with unphased data. Nevertheless, we start with a single di-allelic (*A*_{1}, *A*_{2}) SNP, because this allows us to succinctly describe the essence of the methods. We define the following three null hypotheses (*H*_{0}) of interest:

$$\begin{array}{l}{H}_{0}^{\mu ,V}:({\mu}_{{A}_{1}}={\mu}_{{A}_{2}};{V}_{{A}_{1}}={V}_{{A}_{2}})\\ {H}_{0}^{\mu}:({\mu}_{{A}_{1}}={\mu}_{{A}_{2}})\\ {H}_{0}^{V}:({V}_{{A}_{1}}={V}_{{A}_{2}})\end{array}$$

(4)

with the corresponding likelihood ratio test statistics. A rejection of the more general hypothesis, ${H}_{0}^{\mu ,V}$, indicates that there are differences in either means or the variances of the trait between the alleles *A*_{1} and *A*_{2}. The other two hypotheses are formed specifically regarding the allelic means or the variances. Genotypic rather than allelic-based hypotheses and tests are similarly defined. For example, the *H*_{0} regarding the genotypic means is stated as ${H}_{0}^{\mu}$: (*μ _{A}*

The model that corresponds to allelic tests and estimates is also referred to as the additive model. To construct allelic likelihood ratio test statistics (LRTs) that correspond to our hypotheses, ${H}_{0}^{\mu ,V},{H}_{0}^{\mu},{H}_{0}^{V}$, we define the following log-likelihoods:

$$\begin{array}{l}{L}_{0}=2\sum _{i=1}^{n}ln\phi ({Y}_{i};\widehat{\mu},\widehat{V})\\ {L}_{1}={L}_{1}^{{A}_{1}{A}_{1}}+{L}_{1}^{{A}_{1}{A}_{2}}+{L}_{1}^{{A}_{2}{A}_{2}}\\ {L}_{2}={L}_{2}^{{A}_{1}{A}_{1}}+{L}_{2}^{{A}_{1}{A}_{2}}+{L}_{2}^{{A}_{2}{A}_{2}}\\ {L}_{3}={L}_{3}^{{A}_{1}{A}_{1}}+{L}_{3}^{{A}_{1}{A}_{2}}+{L}_{3}^{{A}_{2}{A}_{2}}\end{array}$$

(5)

where

$$\begin{array}{l}{L}_{1}^{{A}_{1}{A}_{1}}=2\sum _{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{1}},{\widehat{V}}_{{A}_{1}});\phantom{\rule{0.38889em}{0ex}}{L}_{1}^{{A}_{2}{A}_{2}}=2\sum _{i\in {A}_{2}{A}_{2}}^{{n}_{{A}_{2}{A}_{2}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{2}},{\widehat{V}}_{{A}_{2}})\\ {L}_{1}^{{A}_{1}{A}_{2}}=\sum _{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{1}},{\widehat{V}}_{{A}_{1}})+\sum _{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{2}},{\widehat{V}}_{{A}_{2}})\\ {L}_{2}^{{A}_{1}{A}_{1}}=2\sum _{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{1}},\widehat{V});\phantom{\rule{0.38889em}{0ex}}{L}_{2}^{{A}_{2}{A}_{2}}=2\sum _{i\in {A}_{2}{A}_{2}}^{{n}_{{A}_{2}{A}_{2}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{2}},\widehat{V})\\ {L}_{2}^{{A}_{1}{A}_{2}}=\sum _{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{1}},\widehat{V})+\sum _{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{2}},\widehat{V})\\ {L}_{3}^{{A}_{1}{A}_{1}}=2\sum _{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}ln\phi ({Y}_{i};\widehat{\mu},{\widehat{V}}_{{A}_{1}});\phantom{\rule{0.38889em}{0ex}}{L}_{3}^{{A}_{2}{A}_{2}}=2\sum _{i\in {A}_{2}{A}_{2}}^{{n}_{{A}_{2}{A}_{2}}}ln\phi ({Y}_{i};\widehat{\mu},{\widehat{V}}_{{A}_{2}})\\ {L}_{3}^{{A}_{1}{A}_{2}}=\sum _{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}ln\phi ({Y}_{i};\widehat{\mu},{\widehat{V}}_{{A}_{1}})+\sum _{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}ln\phi ({Y}_{i};\widehat{\mu},{\widehat{V}}_{{A}_{2}})\\ {\widehat{\mu}}_{{A}_{1}}=\frac{2{\sum}_{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}{Y}_{i}+{\sum}_{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}{Y}_{i}}{2{n}_{{A}_{1}{A}_{1}}+{n}_{{A}_{1}{A}_{2}}};\phantom{\rule{0.38889em}{0ex}}{\widehat{\mu}}_{{A}_{2}}=\frac{2{\sum}_{i\in {A}_{2}{A}_{2}}^{{n}_{{A}_{2}{A}_{2}}}{Y}_{i}+{\sum}_{i\in 
{A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}{Y}_{i}}{2{n}_{{A}_{2}{A}_{2}}+{n}_{{A}_{1}{A}_{2}}}\\ {\widehat{V}}_{{A}_{1}}=\frac{2{\sum}_{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}{({Y}_{i}-{\widehat{\mu}}_{{A}_{1}})}^{2}+{\sum}_{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}{({Y}_{i}-{\widehat{\mu}}_{{A}_{1}})}^{2}}{2{n}_{{A}_{1}{A}_{1}}+{n}_{{A}_{1}{A}_{2}}}\\ {\widehat{V}}_{{A}_{2}}=\frac{2{\sum}_{i\in {A}_{2}{A}_{2}}^{{n}_{{A}_{2}{A}_{2}}}{({Y}_{i}-{\widehat{\mu}}_{{A}_{2}})}^{2}+{\sum}_{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}{({Y}_{i}-{\widehat{\mu}}_{{A}_{2}})}^{2}}{2{n}_{{A}_{2}{A}_{2}}+{n}_{{A}_{1}{A}_{2}}}\\ \widehat{V}=\frac{2{\sum}_{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}{({Y}_{i}-{\widehat{\mu}}_{{A}_{1}})}^{2}+{\sum}_{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}{({Y}_{i}-{\widehat{\mu}}_{{A}_{1}})}^{2}}{2n}\\ +\frac{2{\sum}_{i\in {A}_{2}{A}_{2}}^{{n}_{{A}_{2}{A}_{2}}}{({Y}_{i}-{\widehat{\mu}}_{{A}_{2}})}^{2}+{\sum}_{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}{({Y}_{i}-{\widehat{\mu}}_{{A}_{2}})}^{2}}{2n}\\ \widehat{\mu}=\frac{{\sum}_{i=1}^{n}{Y}_{i}}{n};\phantom{\rule{0.38889em}{0ex}}{\widehat{V}}_{0}=\frac{{\sum}_{i=1}^{n}{({Y}_{i}-\widehat{\mu})}^{2}}{n}\end{array}$$

(6)

where *φ*(*y*; *μ*, *V*) is the normal density, *N*(*μ*, *V*), evaluated at *y*, and *n _{A}*

Tests for the means as well as tests for the variances can be constructed while allowing for the other parameter to be different between alleles. There is a concern that the test for the mean with unequal variances (which we denote by *T ^{μ}*

$$\begin{array}{l}{T}^{\mu ,V}=2({L}_{1}-{L}_{0})\sim {\chi}_{(2)}^{2}\\ {T}^{{\mu}^{\ast},V}=2(2{L}_{1}-{L}_{2}-{L}_{3})\sim {\chi}_{(2)}^{2}\\ {T}^{\mu}=2({L}_{2}-{L}_{0})\sim {\chi}_{(1)}^{2}\\ {T}^{{\mu}^{\ast}}=2({L}_{1}-{L}_{3})\sim {\chi}_{(1)}^{2}\\ {T}^{V}=2({L}_{1}-{L}_{2})\sim {\chi}_{(1)}^{2}\end{array}$$

(7)

The notation indicates that, for example, *T ^{μ}*
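The allelic statistics in Equations (5)–(7) can be sketched in a few lines of standard-library Python. This is a minimal illustration for a single di-allelic SNP with known genotypes, not the authors' software. The function names and the toy data are hypothetical, and the sketch assumes that the null log-likelihood $L_0$ is evaluated at the overall variance ${\widehat{V}}_{0}$ defined in Equation (6).

```python
import math

def log_phi(y, mu, v):
    """Log of the normal density N(mu, v) evaluated at y."""
    return -0.5 * (math.log(2.0 * math.pi * v) + (y - mu) ** 2 / v)

def chi2_sf(t, df):
    """Upper-tail chi-square probability (closed forms for df = 1 and 2 only)."""
    t = max(t, 0.0)
    if df == 1:
        return math.erfc(math.sqrt(t / 2.0))
    if df == 2:
        return math.exp(-t / 2.0)
    raise ValueError("only df = 1 or 2 are supported in this sketch")

def allelic_lrt(y, a1_count):
    """Allelic LRT statistics; a1_count[i] is the number of A1 alleles (0, 1, 2)."""
    n = len(y)
    # Each individual contributes once per allele copy it carries.
    w1 = list(a1_count)
    w2 = [2 - c for c in a1_count]

    def wmean(w):
        return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

    def wss(w, m):
        return sum(wi * (yi - m) ** 2 for wi, yi in zip(w, y))

    mu1, mu2 = wmean(w1), wmean(w2)
    v1 = wss(w1, mu1) / sum(w1)
    v2 = wss(w2, mu2) / sum(w2)
    v_pooled = (wss(w1, mu1) + wss(w2, mu2)) / (2 * n)
    mu0 = sum(y) / n
    v0 = sum((yi - mu0) ** 2 for yi in y) / n  # assumed variance for L0

    def loglik(m1, va, m2, vb):
        return sum(wa * log_phi(yi, m1, va) + wb * log_phi(yi, m2, vb)
                   for wa, wb, yi in zip(w1, w2, y))

    L0 = loglik(mu0, v0, mu0, v0)
    L1 = loglik(mu1, v1, mu2, v2)
    L2 = loglik(mu1, v_pooled, mu2, v_pooled)
    L3 = loglik(mu0, v1, mu0, v2)
    return {
        "T_mu_V": 2 * (L1 - L0),
        "T_mustar_V": 2 * (2 * L1 - L2 - L3),
        "T_mu": 2 * (L2 - L0),
        "T_mustar": 2 * (L1 - L3),
        "T_V": 2 * (L1 - L2),
    }

# Toy data (made up): ten individuals with trait values and A1 allele counts.
y = [1.2, 0.8, 1.9, 2.4, 0.1, -0.3, 0.5, 1.1, 2.0, -0.6]
a1 = [2, 2, 1, 1, 1, 0, 0, 0, 1, 0]
stats = allelic_lrt(y, a1)
df = {"T_mu_V": 2, "T_mustar_V": 2, "T_mu": 1, "T_mustar": 1, "T_V": 1}
pvals = {name: chi2_sf(t, df[name]) for name, t in stats.items()}
```

By construction, the combined statistic $T^{\mu ,V}$ in this sketch equals the sum of $T^{\mu}$ and $T^{V}$, and $T^{{\mu}^{\ast},V}$ equals the sum of $T^{{\mu}^{\ast}}$ and $T^{V}$, which mirrors the factorization implied by Equation (7).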

Genotypic rather than allelic-based tests are constructed similarly. The mean and the variance estimators for the homozygote *A*_{1}*A*_{1} are obtained as

$${\widehat{\mu}}_{{A}_{1}{A}_{1}}=\frac{{\sum}_{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}{Y}_{i}}{{n}_{{A}_{1}{A}_{1}}};\phantom{\rule{0.38889em}{0ex}}{\widehat{V}}_{{A}_{1}{A}_{1}}=\frac{{\sum}_{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}{({Y}_{i}-{\widehat{\mu}}_{{A}_{1}{A}_{1}})}^{2}}{{n}_{{A}_{1}{A}_{1}}}$$

(8)

These are similarly defined for the genotypes *A*_{1}*A*_{2} and *A*_{2}*A*_{2}. The log-likelihoods are

$$\begin{array}{l}{L}_{0}=\sum _{i=1}^{n}ln\phi ({Y}_{i};\widehat{\mu},\widehat{V})\\ {L}_{1}=\sum _{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{1}{A}_{1}},{\widehat{V}}_{{A}_{1}{A}_{1}})+\sum _{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{1}{A}_{2}},{\widehat{V}}_{{A}_{1}{A}_{2}})\\ +\sum _{i\in {A}_{2}{A}_{2}}^{{n}_{{A}_{2}{A}_{2}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{2}{A}_{2}},{\widehat{V}}_{{A}_{2}{A}_{2}})\\ {L}_{2}=\sum _{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{1}{A}_{1}},\widehat{V})+\sum _{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{1}{A}_{2}},\widehat{V})\\ +\sum _{i\in {A}_{2}{A}_{2}}^{{n}_{{A}_{2}{A}_{2}}}ln\phi ({Y}_{i};{\widehat{\mu}}_{{A}_{2}{A}_{2}},\widehat{V})\\ {L}_{3}=\sum _{i\in {A}_{1}{A}_{1}}^{{n}_{{A}_{1}{A}_{1}}}ln\phi ({Y}_{i};\widehat{\mu},{\widehat{V}}_{{A}_{1}{A}_{1}})+\sum _{i\in {A}_{1}{A}_{2}}^{{n}_{{A}_{1}{A}_{2}}}ln\phi ({Y}_{i};\widehat{\mu},{\widehat{V}}_{{A}_{1}{A}_{2}})\\ +\sum _{i\in {A}_{2}{A}_{2}}^{{n}_{{A}_{2}{A}_{2}}}ln\phi ({Y}_{i};\widehat{\mu},{\widehat{V}}_{{A}_{2}{A}_{2}})\end{array}$$

(9)

and the test statistics are

$$\begin{array}{l}{T}^{\mu ,V}=2({L}_{1}-{L}_{0})\sim {\chi}_{(4)}^{2}\\ {T}^{{\mu}^{\ast},V}=2(2{L}_{1}-{L}_{2}-{L}_{3})\sim {\chi}_{(4)}^{2}\\ {T}^{\mu}=2({L}_{2}-{L}_{0})\sim {\chi}_{(2)}^{2}\\ {T}^{{\mu}^{\ast}}=2({L}_{1}-{L}_{3})\sim {\chi}_{(2)}^{2}\\ {T}^{V}=2({L}_{1}-{L}_{2})\sim {\chi}_{(2)}^{2}\end{array}$$

(10)

The dominant and the recessive models are defined in the same way as the full genotypic model, but without making distinctions between certain genotype classes. For example, in the dominant model we estimate ${\widehat{\mu}}_{{A}_{1}{A}_{1}+{A}_{1}{A}_{2}}$, ${\widehat{V}}_{{A}_{1}{A}_{1}+{A}_{1}{A}_{2}}$, ${\widehat{\mu}}_{{A}_{2}{A}_{2}}$, ${\widehat{V}}_{{A}_{2}{A}_{2}}$, and form the likelihoods according to these four, rather than six, parameters.

The LRT statistics for the situation with multiple SNPs and the unknown haplotype phase have a form similar to the statistics just described. However, because of the phase uncertainty, the form of a likelihood function is now

$$L\propto \prod _{i}^{n}\sum _{\text{haplotype}\phantom{\rule{0.16667em}{0ex}}\text{pairs}}Pr({h}_{s},{h}_{r}\mid \mathbf{H})\phantom{\rule{0.16667em}{0ex}}\phi ({Y}_{i};{\widehat{\mu}}_{sr},{\widehat{V}}_{sr}\mid {h}_{r},{h}_{s})$$

(11)

where **H** are sample haplotype frequencies and Pr(*h _{s}h_{r}*|

$${T}^{\mu}=\frac{{\sum}_{i=1}^{k}{n}_{i}{({\widehat{\mu}}_{i}-\widehat{\mu})}^{2}}{\widehat{V}}$$

(12)

$${T}^{{\mu}^{\ast}}=\sum _{i=1}^{k}\frac{{n}_{i}{({\widehat{\mu}}_{i}-\widehat{\mu})}^{2}}{{\widehat{V}}_{i}},$$

(13)

where *n _{i}* are haplotype counts. Under the normal model, the haplotypic overall tests for the means are analogous to the “2

$${T}^{V}=ln(\widehat{V})\sum _{i=1}^{k}{n}_{i}-\sum _{i=1}^{k}{n}_{i}ln({\widehat{V}}_{i}),$$

(14)

This statistic is equivalent to the Bartlett test [Bartlett, 1937]. Under ${H}_{0}^{\mu ,V}$ and normality, it is independent of the statistic that is based on the ratio of the mean difference and the pooled variance. Thus, asymptotically equivalent forms of the LRTs for testing the means and the LRTs for the variances are asymptotically independent. Frequencies of rare classes become critical with regard to the properties of an overall test. In our comparisons, we used an alternative overall test, the Simes test, based on *p*-values obtained from tests on individual haplotypes. Simes [1986] described this combination test that allows for dependencies among *p*-values. The Simes test rejects the overall *H*_{0} at the level *α* if $p_{(i)} \le i\alpha /k$ for at least one *i*, where *k* is the number of tested haplotypes, and $p_{(i)}$ are ordered *p*-values. Equivalently, the overall *p*-value is given by $\min_i \{k p_{(i)}/i\}$. Thus, this approach is to take one haplotype (*h _{i}*) at a time, contrasted against the group of all other haplotypes,
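The Simes combination rule described above is short enough to state as code. The following is a minimal sketch; the function name `simes_p` and the example *p*-values are hypothetical.

```python
def simes_p(pvalues):
    """Overall Simes p-value: the minimum over i of k * p_(i) / i."""
    k = len(pvalues)
    ordered = sorted(pvalues)
    return min(k * p / i for i, p in enumerate(ordered, start=1))

# Three made-up per-haplotype p-values.
overall = simes_p([0.01, 0.04, 0.03])
```

For these three values the candidates are 3(0.01)/1 = 0.03, 3(0.03)/2 = 0.045 and 3(0.04)/3 = 0.04, so the overall *p*-value is 0.03.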

A note should be made regarding the interpretation of results for the variance-specific tests. When a haplotype *h _{i}* is compared against a combined group that consists of all remaining haplotypes,

The normality assumption can be relaxed in our implementation by utilizing a resampling distribution of the test statistics for computing *p*-values of the tests. Asymptotic tests appear to offer a reasonable performance, unless the expected count of a tested variant is low. Nevertheless, it is a good strategy to verify low *p*-values with a permutation test. Trait transformations to normality, such as the common Box-Cox transformation [Box and Cox, 1964], can also be considered. The common-variance test for the mean, $T^{\mu}$, appears to be robust to deviations from normality (Results section). Similar tests contrasting means for the dominant, recessive, and diplotype models were described in Shibata et al. [2004].
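A resampling *p*-value of the kind mentioned above can be sketched as follows for the variance statistic of Equation (14). This is an illustration only, under several assumptions: group membership is treated as known (no phase uncertainty), $\widehat{V}$ is taken to be the pooled within-group variance, the function names are hypothetical, and the simulated data and the number of permutations are arbitrary.

```python
import math
import random

def variance_stat(y, labels):
    """T^V of Eq. (14): N * ln(pooled V) - sum_i n_i * ln(V_i)."""
    groups = {}
    for yi, g in zip(y, labels):
        groups.setdefault(g, []).append(yi)
    pooled_ss = 0.0
    sum_n_log_v = 0.0
    for vals in groups.values():
        m = sum(vals) / len(vals)
        ss = sum((v - m) ** 2 for v in vals)
        pooled_ss += ss
        sum_n_log_v += len(vals) * math.log(ss / len(vals))
    return len(y) * math.log(pooled_ss / len(y)) - sum_n_log_v

def permutation_pvalue(y, labels, n_perm=2000, seed=0):
    """Permute trait values across group labels and compare with the observed T^V."""
    rng = random.Random(seed)
    observed = variance_stat(y, labels)
    shuffled = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if variance_stat(shuffled, labels) >= observed:
            exceed += 1
    # Adding one to numerator and denominator keeps the p-value above zero.
    return observed, (exceed + 1) / (n_perm + 1)

# Simulated example: two groups with the same mean but different spread.
rng = random.Random(1)
y = [rng.gauss(0.0, 1.0) for _ in range(40)] + [rng.gauss(0.0, 3.0) for _ in range(40)]
labels = [0] * 40 + [1] * 40
stat, p_perm = permutation_pvalue(y, labels)
```

The same loop can wrap any of the statistics discussed in this section; only `variance_stat` would need to be replaced.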

The two combined statistics considered here can be factored as sums, e.g. *T ^{μ}*

To illustrate an application of the proposed association tests, we performed an analysis of the data collected within a larger prospective study, concerned with the pain sensitivity as a risk factor for the development of facial pain [Diatchenko et al., 2005].

One hundred ninety six healthy European American pain-free females with an age range of 18 to 34 years were genotyped and phenotyped. Phenotypic procedures and demographic characteristics of the cohort at the time of recruitment were described previously [Diatchenko et al., 2005].

Subjects were phenotyped with respect to their sensitivity to pressure pain, thermal pain, and ischemic pain. A summary measure of thermal threshold was used in this study as it has been shown to be the most sensitive measure with respect to association with catechol-O-methyltransferase (COMT) genotypes, and has the lowest measurement error [Diatchenko et al., 2005, 2006]. The summary measure was calculated by summing centered and scaled individual measures of thermal pain threshold, collected as subjective ratings of thermal stimuli that were applied to the skin overlying the right masseter muscle, right forearm, and dorsal surface of the right foot. Positive values imply higher pain thresholds, and negative values imply lower pain thresholds.

Genomic DNA was purified using the QIAamp™ 96 DNA Blood Kit (Qiagen, Valencia, CA, USA) and used for the 5′ exonuclease assay [Shi et al., 1999]. The primers and probes were ordered from ABI, Foster City, CA. The genotyping error rate was directly determined and was <0.005. Genotype completion rate was 95%.

We will first present results of theoretical models, followed by the analysis of association of pain characteristics with the COMT polymorphisms. In the theoretical part of the section, we will refer to a model where the observed and the unobserved loci are denoted by *A* and *B*, correspondingly. We will study situations where *A* and *B* represent loci with multiple haplotypes. When both *A* and *B* are either di-allelic, or when *A*_{2}, for example, can be considered as the “not-*A*_{1}” group of haplotypes, we will denote the vector of four effects, *μ _{A}*

| Haplotype | P | M |
|---|---|---|
| $A_1B_1$ | $p_{A_1B_1}$ | $\mu_{A_1B_1}$ |
| $A_1B_2$ | $p_{A_1B_2}$ | $\mu_{A_1B_2}$ |
| $A_2B_1$ | $p_{A_2B_1}$ | $\mu_{A_2B_1}$ |
| $A_2B_2$ | $p_{A_2B_2}$ | $\mu_{A_2B_2}$ |

First, we consider models where both observed (*A*) and unobserved (*B*) loci are functional, contributing additively, with no interaction between the loci. In the di-allelic case, the array of effects takes the form **M** = {*κ _{A}*

Figure 1. Values for $\ln(\mu_{A_1}/\mu_{A_2})$ (*) and $\ln(V_{A_1}/V_{A_2})$ (●) for the non-interactive model; $\sigma^2 = 1$. Left graph, (a): $p_{A_1B_1} = 0.3034$; $p_{A_1B_2} = 0$. Right graph, (b): $p_{A_1B_1} = p_{A_1B_2} = 0.3034/2$.

The left three columns of Table 2 show resulting power values for a single SNP. Because the goal was to evaluate the *relative* performance of the tests, the *σ*^{2} values, as shown in the table, were taken to be different for each setting, so that medium to high power values could be obtained. The corresponding results for unphased data, with population haplotype frequencies, modeled after the COMT, are given in the right columns of the table. In the case of multiple unphased haplotypes, the haplotype AGCG corresponds to the allele *A*_{1}, while the group *A*_{2} is composed of the seven remaining haplotypes, with their respective frequencies.

The results show a good correspondence of power values for the di-allelic case (Table 2, left three columns), with the case of unphased haplotypes (right columns of Table 2), with some loss of power due to phase ambiguity. There is also a cost of considering diplotype, rather than haplotype associations in this case, because the underlying model consists of phenotypic contributions of haplotypes, and lacks dominance deviations due to entire diplotypes. These findings will be confirmed in the subsequent simulations as well. In these simulations, the unequal variance test for the mean is slightly, but consistently more powerful than the test based on the common variance. This power advantage appears to be confined to cases when the minor allele (or a tested haplotype with frequency less than 0.5) is associated with a decreased variance. There is a power advantage associated with the inclusion of the variance parameter in the range of *p _{A}*

The previous model was modified to include an epistatic component *ε*, as **M** = {*ε* + *κ _{A}*

Figure 2. Values for $\ln(\mu_{A_1}/\mu_{A_2})$ (*) and $\ln(V_{A_1}/V_{A_2})$ (●) for the interactive model; $\sigma^2 = 1$. Left graph, (a): $p_{A_1B_1} = p_{A_1B_2} = 0.3034/2$; “D” denotes the standardized LD (*D*′). Right graph, (b): Linkage equilibrium. **...**

Models of association via LD (proxy models) assume an unobserved functional locus *B*, while the observed locus *A*, associated with *B* via LD, has no functional involvement. In the di-allelic case, the array of effects is **M** = {*x*, *y*, *x*, *y*}. Graphs for this model (Figure 3) bear similarity to the graphs for the non-interactive model (Figure 1). The distinction is that the behavior in this case is completely governed by LD: both the mean and the variance contrast curves cross zero at the same point. Figure 3(a) shows the LD correlation instead of *D*′, because *D*′ is equal to one throughout this graph. The power gained from the inclusion of the haplotype-specific variance extends through only a limited part of the graph (Table 5). Nevertheless, in the remainder of the table, the association is so pronounced that it is easily detected with high power by either of the tests (${H}_{0}^{\mu ,V}$ or ${H}_{0}^{\mu}$), even though very large values of *σ*^{2} were assumed. For Figure 3(b), where the LD is not complete, there is no advantage in including the haplotype-specific variance in the test (results not shown). Unlike with interactive models, high LD is required for proxy models to induce a substantial variance contrast.
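The dependence of the proxy model on LD can be checked numerically with a short sketch that applies Equations (1) and (3) to the effect array **M** = {*x*, *y*, *x*, *y*}. The function names, the frequency of *B*_{1} (0.4), the effect values and the LD coefficient used below are assumptions chosen for illustration; only the frequency 0.3034 for *A*_{1} is borrowed from the text.

```python
import math

def allele_moments(p, mu, sigma2):
    """Mean and variance of the trait at an observed allele, per Eqs. (1) and (3)."""
    total = sum(p)
    mean = sum(pj * mj for pj, mj in zip(p, mu)) / total
    var = sigma2 + sum(pj * (mj - mean) ** 2 for pj, mj in zip(p, mu)) / total
    return mean, var

def proxy_contrasts(p_a1, p_b1, d, x, y, sigma2=1.0):
    """Log mean ratio and log variance ratio (A1 vs. A2) under M = {x, y, x, y}."""
    # Two-locus haplotype frequencies from allele frequencies and the LD coefficient d.
    p11 = p_a1 * p_b1 + d
    p12 = p_a1 * (1 - p_b1) - d
    p21 = (1 - p_a1) * p_b1 - d
    p22 = (1 - p_a1) * (1 - p_b1) + d
    m1, v1 = allele_moments([p11, p12], [x, y], sigma2)
    m2, v2 = allele_moments([p21, p22], [x, y], sigma2)
    return math.log(m1 / m2), math.log(v1 / v2)

# Linkage equilibrium (d = 0): both contrasts vanish.
mean_le, var_le = proxy_contrasts(0.3034, 0.4, 0.0, x=2.0, y=1.0)
# Positive LD between A1 and B1: both contrasts move away from zero.
mean_ld, var_ld = proxy_contrasts(0.3034, 0.4, 0.05, x=2.0, y=1.0)
```

With *d* = 0 the conditional distribution of *B* is the same on both alleles of *A*, so both log ratios are zero, in line with the observation that the mean and variance contrast curves cross zero at the same point.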

Figure 3. Values for $\ln(\mu_{A_1}/\mu_{A_2})$ (*) and $\ln(V_{A_1}/V_{A_2})$ (●) for the LD proxy model; $\sigma^2 = 1$. Left graph, (a): $p_{A_1B_1} = 0.3034$; $p_{A_1B_2} = 0$; “R” denotes the LD correlation. Right graph, (b): $p_{A_1B_1} = p_{A_1B_2} = 0.3034/2$; “D” **...**

The LD proxy model can be extended to multiple mutations, *B*_{1}, …, *B _{K}*. In this case, the resulting mixture distribution at the observed haplotypes linked via LD with the

Table 7 shows an application of the LRTs to data on five polymorphisms in the COMT gene. COMT is an example of a gene where multiple interacting variants have been suggested to influence variation in several complex phenotypes. COMT codes for an enzyme that metabolizes catecholamines, such as dopamine (DA), norepinephrine (NE) and epinephrine (Epi), and thus affects an array of cognitive-affective traits, including pain perception [Diatchenko et al., 2005; Nackley et al., 2007; Egan et al., 2001; Enoch et al., 2003; Oroszi and Goldman, 2004; Meyer-Lindenberg et al., 2006]. The common nonsynonymous variant val158met (rs4680) has been shown to influence thermostability of the enzyme [Lotta et al., 1995]; however, the associations between the low-activity Met158 allele and numerous complex phenotypes [Egan et al., 2001; Enoch et al., 2003; Karayiorgou et al., 1999; Zubieta et al., 2003; Oroszi and Goldman, 2004] have often been modest and occasionally inconsistent. Repeated retesting of val158met has been partially driven by numerous molecular and biochemical studies confirming that Met at position 158 does lead to a 3–4 times lower enzymatic activity of COMT at both cell culture and organismal levels [Lotta et al., 1995; Mannisto and Kaakkola, 1999]. There is evidence that additional functional SNPs in the COMT gene locus can modulate COMT activity, as supported by a number of recent positive association studies and molecular work using cell-based assays [Oroszi and Goldman, 2004; Meyer-Lindenberg et al., 2006; Mannisto and Kaakkola, 1999]. The list of potential functional sites includes the following SNPs: rs2097603 in the promoter region of the brain-expressed, membrane-bound (MB-) COMT form [Palmatier et al., 2004; Zhu et al., 2004; Chen et al., 2004], rs737865 upstream in intron 1, and rs165599 in the 3′ untranslated region [Shifman et al., 2002; Bray et al., 2003].
Furthermore, in studying the association between COMT genotypes and variability in human pain perception, it has been found that the val158met polymorphism alone was not significantly associated with a derived measure of global pain sensitivity [Diatchenko et al., 2005]. Instead, three common haplotypes of the COMT gene, consisting of two synonymous SNPs (rs4633 and rs4818) and the nonsynonymous val158met SNP, code for different levels of enzymatic activity. Corresponding differences in pain sensitivity are associated with regulation of the translation efficiency through haplotype-dependent secondary mRNA structure [Nackley et al., 2006]. Thus, COMT contains at least five functional polymorphisms that potentially impact the index of pain sensitivity. Interactions of functional alleles at COMT imply that the genetic effects may not be easily inferred from the information on one SNP at a time, and that the SNP-specific effects may in fact be misleading. In our application, we tested four SNPs which have been previously independently associated with COMT-dependent phenotypes: rs2097603, rs737865, rs4680 and rs165599. An additional SNP, rs4818, was chosen as a major contributor to functional COMT haplotypes; together with SNP rs4680, it defines three functional haplotypes and influences pain sensitivity [Diatchenko et al., 2005]. All SNPs were found to conform to Hardy-Weinberg expectations. As a functional measure of COMT activity we used sensitivity to noxious stimuli. A summary measure of thermal threshold was chosen for this study as it has been shown to be the most sensitive measure with respect to association with COMT genotypes, and has a small associated measurement error [Diatchenko et al., 2005, 2006, 2007].

Two SNPs, rs737865 and rs4818 with the corresponding LD correlation of *r* = 0.5, show a significant mean effect as well as an indication that variability might be different between individuals carrying different alleles. There is a good correspondence between the asymptotic and the resampled *p*-values for the mean and the variance tests (based on 50,000 trait value permutations). There is less correspondence for the combined multilocus tests, *T ^{μ}*

When testing one haplotype at a time, a given haplotype *h _{i}* is contrasted against the rest of the haplotypes, collapsed into a combined group,

Frequencies of the first five COMT haplotypes, as given in Table 1, cover the range from 0.064 to 0.3. These haplotypes were used to evaluate the type-I error rates as a function of frequency. Results of the simulations, for different models of analysis, are shown in Table 9. Tests for the haplotypic and the dominant models of analysis maintain the nominal 5% error rate even for the rare haplotype (the last row in the table), whereas the error rate is worst under the recessive model for the tests that incorporate the variance. The *T ^{μ}*

The results can be summarized as follows. Predictably, the Simes-HTR gives power values that are close to the values for the overall (Simes) $T^{\mu}$. The Simes test used as an overall test is more powerful than a multi-parameter test when there are one or two haplotypes with effects that are distinct from a baseline value. When all haplotypes have distinct mean effects, the multi-parameter overall test is more powerful than the Simes test; however, power values for the two methods are more similar when frequencies of the haplotypes included in the analysis are similar to each other. The type-I error for the Simes test is well maintained. In the presence of multiple haplotypic effects with the common variance (Settings 6, 7, 8), some increase in the type-I error for the

To investigate the effect of normality violation on the performance of the tests, we performed additional simulations under the Gamma model, using tests for a single haplotype (*h*_{1} with the frequency 0.3034). The power simulations (Table 11) indicate that only the tests $T^{\mu}$ and HTR retain a proper type-I error. The other tests are sensitive to non-normality; however, the Box-Cox transformation remedies this problem. When the values

Caution is needed regarding the interpretation of results obtained for the transformed data. Even under normality, the mean and the variance-specific tests are independent only under the complete null. There might be some inflation of the type-I error for these tests when the second parameter is heterogeneous between the haplotypes, and the problem is likely to be exaggerated by both non-normality and the transformation. Summarizing results across different simulations, we conclude that when variances are unequal, size of the test *T ^{μ}*

The association models that we considered here assume that some relevant variation is unknown to investigators. Although this incomplete knowledge impairs power to detect haplotypic effects, it also induces differences in trait variances between the scored haplotypes. We exploit this phenomenon in the proposed approach. Inclusion of the variance parameters into haplotype association methods for unphased data is useful for two reasons. First, it adds power to detect associations under certain models, especially under models where known and unknown variants interact with each other. Second, a variance increase at a haplotype serves as an indication that the variant under study is either correlated with, or epistatically interacts with, additional unobserved polymorphisms. We conclude that tests that account for the haplotype- or diplotype-specific variance are useful for discovering complex associations with quantitative traits. Power can be gained while examining only a subset of SNPs that are either directly involved in joint multilocus genetic effects, or linked with functional variants via LD. Our approach is not a replacement for the conventional way of comparing haplotypic means. The number of tests is an issue; therefore, at a hypothesis-generating stage, for example in a whole-genome association analysis, routine use of the conventional approach is more suitable. Testing additional hypotheses that involve haplotypic and diplotypic variances is more relevant in a smaller study, or a follow-up study, concerned with a number of preselected loci.

The unknown factor influencing the trait can be non-genetic. Both the usual type of association test for comparing haplotypic effects and a test that incorporates haplotype-specific variances are sensitive to confounding. However, in the absence of confounding, differences between either the haplotypic means or the variances in this scenario would indicate a partial functional involvement of the scored polymorphisms in the association with the trait, as well as an interaction with the unknown factor.

The approach outlined here is applicable to random samples from homogeneous populations, and thus, it is subject to the usual concerns about possibilities of confounding and stratification. Recently, Epstein et al. [2007] described a simple and effective method to correct for population stratification that can be used in conjunction with our approach. Further extensions are possible that will incorporate family-based samples.

A priori, untyped polymorphisms that reside within the same gene might be considered to be the first candidates that may account for a variance contrast between haplotypes. This is because models without LD are “less general”, in that they require a particular form of epistasis (or, equivalently, a particular form of interaction with an environmental factor) in order to induce a haplotypic variance contrast [Zaykin and Shibata, 2008]. The gain in power for the proposed method is higher under models that involve some degree of interaction, with or without LD. Considering that compensatory changes that follow functional mutations are likely to be a ubiquitous force in genome evolution [Kern and Kondrashov, 2004; Kirby et al., 1995; Kondrashov et al., 2002], multiple interacting functional SNPs within a gene locus could be relatively common in the haplotypic organization of the human genome. How easily these interactions and the corresponding genetic variants might be identified remains an open question, given statistical difficulties related to detection and characterization of complex multilocus effects.

Software implementing the statistical approaches described here is available from the authors upon request. Simulation scripts used in this study are available upon request from DVZ.

This research was supported in part by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences. L.D. was supported by NIH grants PO1NS045685, DE16558 and DE017018. We thank Shyamal Peddada, Norman Kaplan, Rongheng Lin, Clare Weinberg, and David Umbach for stimulating discussion. We thank two reviewers for insightful suggestions.

Following Nielsen and Weir [1999], the effect of a scored allele, *A*_{1}, can be written as

$$\begin{array}{l}{\mu}_{{A}_{1}}=\sum _{i}E(Y\mid {B}_{i}{A}_{1})Pr({B}_{i}\mid {A}_{1})\\ =\sum _{i}E(Y\mid {B}_{i})Pr({B}_{i}\mid {A}_{1})\\ =\sum _{i}{\mu}_{{B}_{i}}Pr({B}_{i}\mid {A}_{1})\end{array}$$

(15)

The second equality holds because *A* is now only a marker for nearby functional variants. The probability Pr(*B _{i}* |

$$\begin{array}{l}{\mu}_{{A}_{1}}-{\mu}_{{A}_{2}}=\sum _{i}{\mu}_{{B}_{i}}[Pr({B}_{i}\mid {A}_{1})-Pr({B}_{i}\mid {A}_{2})]\\ =\sum _{i}{\mu}_{{B}_{i}}\left[\frac{{p}_{{B}_{i}{A}_{1}}}{{p}_{{A}_{1}}}-\frac{{p}_{{B}_{i}{A}_{2}}}{1-{p}_{{A}_{1}}}\right]\end{array}$$

The difference is zero under linkage equilibrium. Otherwise, the sign of the effect difference depends on the haplotype frequencies, *p _{B}*
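As a numerical check of the difference between marker-allele effects given above, the following Python sketch evaluates it for two hypothetical functional alleles, once under linkage equilibrium and once with an LD coefficient `D`. The function name and all numeric values are illustrative assumptions and are not taken from the study.

```python
def marker_effect_difference(mu_B, p_BA1, p_BA2):
    """Return mu_A1 - mu_A2 for a nonfunctional marker A linked to alleles B_i.

    mu_B  : effects mu_{B_i} of the unobserved functional alleles
    p_BA1 : joint haplotype frequencies p_{B_i A_1}
    p_BA2 : joint haplotype frequencies p_{B_i A_2}
    """
    p_A1 = sum(p_BA1)
    p_A2 = sum(p_BA2)  # equals 1 - p_A1 when all joint frequencies sum to one
    return sum(m * (x / p_A1 - y / p_A2) for m, x, y in zip(mu_B, p_BA1, p_BA2))


# Hypothetical values: two functional alleles with effects 0 and 1.
mu_B = [0.0, 1.0]
p_B = [0.7, 0.3]
p_A1 = 0.4

# Linkage equilibrium: every joint frequency is a product of allele frequencies.
le_A1 = [pb * p_A1 for pb in p_B]
le_A2 = [pb * (1 - p_A1) for pb in p_B]
diff_le = marker_effect_difference(mu_B, le_A1, le_A2)

# LD: shift the joint frequencies by D while keeping the allele frequencies fixed.
D = 0.05
ld_A1 = [le_A1[0] - D, le_A1[1] + D]
ld_A2 = [le_A2[0] + D, le_A2[1] - D]
diff_ld = marker_effect_difference(mu_B, ld_A1, ld_A2)

print(diff_le, diff_ld)
```

In this two-allele setting the difference reduces to (μ_{B2} − μ_{B1})·D / (p_{A1} p_{A2}), so it vanishes when D = 0 and changes sign with D.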

In the simulation experiments, the joint frequencies *p _{B}*

Here we consider theoretical parameter values for the mean and the standard deviation under the additive, diplotype, dominant and recessive models of analysis. We assume that the population is in HWE. For the purpose of evaluating the estimation techniques, we assume that the two haplotypic contributions, represented by random variables *X _{i}*, *X _{j}*, combine in a diplotype as

$$\begin{array}{l}\overline{\mu}=\sum _{i}^{k}\sum _{j}^{k}[{p}_{ij}({\theta}_{i}+{\theta}_{j})]\\ =\sum _{i}^{k}\sum _{j}^{k}[{p}_{i}{p}_{j}({\theta}_{i}+{\theta}_{j})]\\ =2\sum _{i}^{k}{p}_{i}{\theta}_{i}\\ \overline{V}=\sum _{i}^{k}\sum _{j}^{k}{p}_{i}{p}_{j}[{\sigma}_{i}^{2}+{\sigma}_{j}^{2}+{({\theta}_{i}+{\theta}_{j}-\overline{\mu})}^{2}]\end{array}$$

where *p _{i}* is the frequency of the haplotype
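The population mean and variance above can be evaluated directly. The Python sketch below does so under the assumption, suggested by the form of the equations, that the two haplotypic contributions are independent and add within a diplotype. The function name and the three-haplotype parameter values are hypothetical.

```python
def population_mean_variance(p, theta, sigma2):
    """Trait mean and variance under HWE.

    p      : haplotype frequencies p_i (summing to one)
    theta  : haplotypic means theta_i
    sigma2 : haplotypic variances sigma_i^2
    """
    k = len(p)
    mean = 2 * sum(p[i] * theta[i] for i in range(k))
    var = sum(
        p[i] * p[j] * (sigma2[i] + sigma2[j] + (theta[i] + theta[j] - mean) ** 2)
        for i in range(k)
        for j in range(k)
    )
    return mean, var


# Hypothetical three-haplotype example (illustrative values only).
p = [0.5, 0.3, 0.2]
theta = [0.0, 1.0, 2.0]
sigma2 = [1.0, 1.0, 4.0]

mean, var = population_mean_variance(p, theta, sigma2)
print(mean, var)
```

Because the two contributions are treated as independent draws, the variance equals twice the variance of a single randomly drawn haplotypic contribution, which gives a quick sanity check.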

Additive effect of the haplotype *h _{i}* is defined as the effect of a diplotype that contains the

$$\begin{array}{l}{\mu}_{i}={\theta}_{i}+\overline{\mu}/2\\ {V}_{i}={\sigma}_{i}^{2}+\sum _{j}^{k}{p}_{j}\left[{\sigma}_{j}^{2}+{({\theta}_{j}-\overline{\mu}/2)}^{2}\right]\end{array}$$

For the rest of the haplotypes, collapsed into a composite group, *h _{ī}*, the values are

$$\begin{array}{l}{\mu}_{\overline{i}}={\mu}_{a}+\overline{\mu}/2\\ {V}_{\overline{i}}=\sum _{j\ne i}^{k}{q}_{j}\left[{\sigma}_{j}^{2}+{({\theta}_{j}-{\mu}_{a})}^{2}\right]+\sum _{s}^{k}{p}_{s}\left[{\sigma}_{s}^{2}+{({\theta}_{s}-\overline{\mu}/2)}^{2}\right]\end{array}$$

where

$$\begin{array}{l}{q}_{j}=\frac{{p}_{j}}{{\sum}_{s\ne i}^{k}{p}_{s}},\quad j\ne i\\ {\mu}_{a}=\sum _{j\ne i}^{k}{q}_{j}{\theta}_{j}\end{array}$$
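The additive-model parameters can be computed as in the following Python sketch. It reads each sum as applying to the whole term σ² + (θ − mean)², and takes the rescaled frequencies q over the haplotypes other than *h _{i}*; both readings are assumptions about the intended notation. The function name and input values are hypothetical.

```python
def additive_parameters(i, p, theta, sigma2):
    """Mean and variance for haplotype h_i and for the composite of the rest."""
    k = len(p)
    half_mean = sum(p[j] * theta[j] for j in range(k))  # mu_bar / 2
    # Variance contributed by a haplotype drawn at random from the population.
    v_random = sum(p[j] * (sigma2[j] + (theta[j] - half_mean) ** 2) for j in range(k))

    others = [j for j in range(k) if j != i]
    p_rest = sum(p[j] for j in others)
    q = {j: p[j] / p_rest for j in others}  # frequencies within the composite group
    mu_a = sum(q[j] * theta[j] for j in others)
    v_group = sum(q[j] * (sigma2[j] + (theta[j] - mu_a) ** 2) for j in others)

    return {
        "mu_i": theta[i] + half_mean,
        "V_i": sigma2[i] + v_random,
        "mu_ibar": mu_a + half_mean,
        "V_ibar": v_group + v_random,
    }


# Hypothetical three-haplotype example (illustrative values only).
p = [0.5, 0.3, 0.2]
theta = [0.0, 1.0, 2.0]
sigma2 = [1.0, 1.0, 4.0]

additive = additive_parameters(0, p, theta, sigma2)
print(additive)
```

A useful check is that the two group means, weighted by p_i and 1 − p_i, recover the population mean.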

The diplotype model distinguishes between the three diplotype classes, *h _{i}*/

$$\begin{array}{l}{\mu}_{ii}=2{\theta}_{i}\\ {\mu}_{i\overline{i}}={\theta}_{i}+{\mu}_{a}\\ {\mu}_{\overline{ii}}=2{\mu}_{a}\\ {V}_{ii}=2{\sigma}_{i}^{2}\\ {V}_{i\overline{i}}={\sigma}_{i}^{2}+\sum _{j\ne i}^{k}{q}_{j}\left[{\sigma}_{j}^{2}+{({\theta}_{j}-{\mu}_{a})}^{2}\right]\\ {V}_{\overline{ii}}=\sum _{s\ne i}^{k}\sum _{t\ne i}^{k}{q}_{s}{q}_{t}[{\sigma}_{s}^{2}+{\sigma}_{t}^{2}+{({\theta}_{s}+{\theta}_{t}-2{\mu}_{a})}^{2}]\end{array}$$
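A minimal Python sketch of the diplotype-model parameters follows, again reading the sum in the heterozygote variance as covering the whole term σ² + (θ − μ_a)². The function name, dictionary keys and input values are hypothetical choices made for illustration.

```python
def diplotype_parameters(i, p, theta, sigma2):
    """Means and variances for the three diplotype classes defined by h_i."""
    others = [j for j in range(len(p)) if j != i]
    p_rest = sum(p[j] for j in others)
    q = {j: p[j] / p_rest for j in others}
    mu_a = sum(q[j] * theta[j] for j in others)

    # One copy drawn from the composite group of non-h_i haplotypes.
    v_rest = sum(q[j] * (sigma2[j] + (theta[j] - mu_a) ** 2) for j in others)
    # Two copies drawn from the composite group.
    v_rest_rest = sum(
        q[s] * q[t] * (sigma2[s] + sigma2[t] + (theta[s] + theta[t] - 2 * mu_a) ** 2)
        for s in others
        for t in others
    )

    means = {"ii": 2 * theta[i], "i_ibar": theta[i] + mu_a, "ibar_ibar": 2 * mu_a}
    variances = {"ii": 2 * sigma2[i], "i_ibar": sigma2[i] + v_rest, "ibar_ibar": v_rest_rest}
    return means, variances


# Hypothetical three-haplotype example (illustrative values only).
p = [0.5, 0.3, 0.2]
theta = [0.0, 1.0, 2.0]
sigma2 = [1.0, 1.0, 4.0]

dip_means, dip_vars = diplotype_parameters(0, p, theta, sigma2)
print(dip_means, dip_vars)
```

Mixing the three classes with the HWE weights p_i², 2p_i(1 − p_i) and (1 − p_i)² should reproduce the population mean and, through the law of total variance, the population variance.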

The recessive model distinguishes between the two diplotype classes, *h _{i}*/

$$\begin{array}{l}{\mu}_{ii}=2{\theta}_{i}\\ {\mu}_{i\overline{i}/\overline{ii}}=\frac{{\sum}_{s}^{k}{\sum}_{t}^{k}{p}_{s}{p}_{t}({\theta}_{s}+{\theta}_{t})\{1-I(s=t=i)\}}{1-{p}_{i}^{2}}\\ {V}_{ii}=2{\sigma}_{i}^{2}\\ {V}_{i\overline{i}/\overline{ii}}=\frac{{\sum}_{s}^{k}{\sum}_{t}^{k}{p}_{s}{p}_{t}\left[{\sigma}_{s}^{2}+{\sigma}_{t}^{2}+{({\theta}_{s}+{\theta}_{t}-{\mu}_{i\overline{i}/\overline{ii}})}^{2}\right]\{1-I(s=t=i)\}}{1-{p}_{i}^{2}}\end{array}$$

where *I*(·) is the indicator function.
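The recessive-model expressions can be sketched in Python by summing over all ordered haplotype pairs except the homozygote (i, i), which plays the role of the factor 1 − I(s = t = i). The function name, result keys and input values are hypothetical.

```python
def recessive_parameters(i, p, theta, sigma2):
    """Parameters for the h_i/h_i class and for all other diplotypes pooled."""
    k = len(p)
    # All ordered pairs (s, t) except the h_i/h_i homozygote.
    pairs = [(s, t) for s in range(k) for t in range(k) if not (s == i and t == i)]
    weight = 1 - p[i] ** 2

    mu_pooled = sum(p[s] * p[t] * (theta[s] + theta[t]) for s, t in pairs) / weight
    v_pooled = sum(
        p[s] * p[t] * (sigma2[s] + sigma2[t] + (theta[s] + theta[t] - mu_pooled) ** 2)
        for s, t in pairs
    ) / weight

    return {
        "mu_ii": 2 * theta[i],
        "V_ii": 2 * sigma2[i],
        "mu_pooled": mu_pooled,
        "V_pooled": v_pooled,
    }


# Hypothetical three-haplotype example (illustrative values only).
p = [0.5, 0.3, 0.2]
theta = [0.0, 1.0, 2.0]
sigma2 = [1.0, 1.0, 4.0]

recessive = recessive_parameters(0, p, theta, sigma2)
print(recessive)
```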

The dominant model distinguishes between the two diplotype classes, (*h _{i}*/

$$\begin{array}{l}{\mu}_{ii/i\overline{i}}=\frac{\left[2{\sum}_{j}^{k}{p}_{i}{p}_{j}({\theta}_{i}+{\theta}_{j})\right]-2{\theta}_{i}{p}_{i}^{2}}{1-{\sum}_{s\ne i}^{k}{\sum}_{t\ne i}^{k}{p}_{s}{p}_{t}}\\ {V}_{ii/i\overline{i}}=\frac{\left[2{\sum}_{j}^{k}{p}_{i}{p}_{j}\{{\sigma}_{i}^{2}+{\sigma}_{j}^{2}+{({\theta}_{i}+{\theta}_{j}-{\mu}_{ii/i\overline{i}})}^{2}\}\right]-{p}_{i}^{2}\{2{\sigma}_{i}^{2}+{(2{\theta}_{i}-{\mu}_{ii/i\overline{i}})}^{2}\}}{1-{\sum}_{s\ne i}^{k}{\sum}_{t\ne i}^{k}{p}_{s}{p}_{t}}\\ {\mu}_{\overline{ii}}=2{\mu}_{a}\\ {V}_{\overline{ii}}=\sum _{s\ne i}^{k}\sum _{t\ne i}^{k}{q}_{s}{q}_{t}[{\sigma}_{s}^{2}+{\sigma}_{t}^{2}+{({\theta}_{s}+{\theta}_{t}-2{\mu}_{a})}^{2}]\end{array}$$
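The dominant-model expressions can be sketched in Python in the same way. The carrier class pools every diplotype with at least one copy of *h _{i}*; the homozygote is subtracted once because doubling the sum over j counts it twice. The non-carrier sums are taken over the haplotypes other than *h _{i}*, which is an assumption about the intended index range. The function name, result keys and input values are hypothetical.

```python
def dominant_parameters(i, p, theta, sigma2):
    """Parameters for carriers of h_i (pooled) and for non-carriers."""
    k = len(p)
    others = [j for j in range(k) if j != i]
    # Probability of carrying at least one copy of h_i.
    weight = 1 - sum(p[s] * p[t] for s in others for t in others)

    mu_carrier = (
        2 * sum(p[i] * p[j] * (theta[i] + theta[j]) for j in range(k))
        - 2 * theta[i] * p[i] ** 2
    ) / weight
    v_carrier = (
        2 * sum(
            p[i] * p[j] * (sigma2[i] + sigma2[j] + (theta[i] + theta[j] - mu_carrier) ** 2)
            for j in range(k)
        )
        - p[i] ** 2 * (2 * sigma2[i] + (2 * theta[i] - mu_carrier) ** 2)
    ) / weight

    p_rest = sum(p[j] for j in others)
    q = {j: p[j] / p_rest for j in others}
    mu_a = sum(q[j] * theta[j] for j in others)
    v_non = sum(
        q[s] * q[t] * (sigma2[s] + sigma2[t] + (theta[s] + theta[t] - 2 * mu_a) ** 2)
        for s in others
        for t in others
    )

    return {
        "weight": weight,
        "mu_carrier": mu_carrier,
        "V_carrier": v_carrier,
        "mu_non": 2 * mu_a,
        "V_non": v_non,
    }


# Hypothetical three-haplotype example (illustrative values only).
p = [0.5, 0.3, 0.2]
theta = [0.0, 1.0, 2.0]
sigma2 = [1.0, 1.0, 4.0]

dominant = dominant_parameters(0, p, theta, sigma2)
print(dominant)
```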

- Allen AS, Satten GA. Robust estimation and testing of haplotype effects in case-control studies. Genet Epidemiol. 2008;32:29–40. [PubMed]
- Balding DJ. Likelihood-based inference for genetic correlation coefficients. Theor Popul Biol. 2003;63:221–230. [PubMed]
- Bartlett M. Properties of sufficiency and statistical tests. Proceedings of the Royal Society of London Series A, Mathematical and Physical Sciences. 1937;160:268–282.
- Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Royal Stat Soc B. 1995;57:289–300.
- Box G, Cox D. An Analysis of Transformations. J Royal Stat Soc B. 1964;26:211–252.
- Bray NJ, Buckland PR, Williams NM, Williams HJ, Norton N, Owen MJ, O’Donovan MC. A haplotype implicated in schizophrenia susceptibility is associated with reduced COMT expression in human brain. Am J Hum Genet. 2003;73:152–161. [PubMed]
- Chen J, Lipska BK, Halim N, Ma QD, Matsumoto M, Melhem S, Kolachana BS, Hyde TM, Herman MM, Apud J, Egan MF, Kleinman JE, Weinberger DR. Functional analysis of genetic variation in catechol-O-methyltransferase (COMT): effects on mRNA, protein, and enzyme activity in postmortem human brain. Am J Hum Genet. 2004;75:807–821. [PubMed]
- Diatchenko L, Nackley AG, Slade GD, Belfer I, Max MB, Goldman D, Maixner W. Responses to Drs. Kim and Dionne regarding comments on Diatchenko, et al. Catechol-O-methyltransferase gene polymorphisms are associated with multiple pain-evoking stimuli. Pain 2006;125:216–24. Pain. 2007;129:366–370. [PMC free article] [PubMed]
- Diatchenko L, Nackley AG, Slade GD, Bhalang K, Belfer I, Max MB, Goldman D, Maixner W. Catechol-O-methyltransferase gene polymorphisms are associated with multiple pain-evoking stimuli. Pain. 2006;125:216–224. [PubMed]
- Diatchenko L, Slade GD, Nackley AG, Bhalang K, Sigurdsson A, Belfer I, Goldman D, Xu K, Shabalina SA, Shagin D, Max MB, Makarov SS, Maixner W. Genetic basis for individual variations in pain perception and the development of a chronic pain condition. Hum Mol Genet. 2005;14:135–143. [PubMed]
- Egan MF, Goldberg TE, Kolachana BS, Callicott JH, Mazzanti CM, Straub RE, Goldman D, Weinberger DR. Effect of COMT Val108/158 Met genotype on frontal lobe function and risk for schizophrenia. Proc Natl Acad Sci USA. 2001;98:6917–6922. [PubMed]
- Elston R. On Fisher’s method of combining p-values. Biometrical journal. 1991;33:339–345.
- Enoch MA, Xu K, Ferro E, Harris CR, Goldman D. Genetic origins of anxiety in women: a role for a functional catechol-O-methyltransferase polymorphism. Psychiatr Genet. 2003;13:33–41. [PubMed]
- Epstein M, Satten G. Inference on haplotype effects in case-control studies using unphased genotype data. Am J Hum Genet. 2003;73:1316–1329. [PubMed]
- Epstein MP, Allen AS, Satten GA. A simple and improved correction for population stratification in case-control studies. Am J Hum Genet. 2007;80:921–930. [PubMed]
- Karayiorgou M, Sobin C, Blundell ML, Galke BL, Malinova L, Goldberg P, Ott J, Gogos JA. Family-based association studies support a sexually dimorphic effect of COMT and MAOA on genetic susceptibility to obsessive-compulsive disorder. Biol Psychiatry. 1999;45:1178–1189. [PubMed]
- Kern AD, Kondrashov FA. Disease-related versus polymorphic mutations in human mitochondrial tRNAs. Nat Genet. 2004;36:1207–1212. [PubMed]
- Kirby DA, Muse SV, Stephan W. Maintenance of pre-mRNA secondary structure by epistatic selection. Proc Natl Acad Sci USA. 1995;92:9047–9051. [PubMed]
- Kondrashov AS, Sunyaev S, Kondrashov FA. Dobzhansky-Muller incompatibilities in protein evolution. Proc Natl Acad Sci USA. 2002;99:14878–14883. [PubMed]
- Kraft P, Cox D, Paynter R, Hunter D, De Vivo I. Accounting for haplotype uncertainty in matched association studies: A comparison of simple and flexible techniques. Genet Epidemiol. 2005;28:261–272. [PubMed]
- Kraft P, Stram D. Re: The use of inferred haplotypes in downstream analysis. Am J Hum Genet. 2007;81:863–865. [PubMed]
- Lin D. Haplotype-based association analysis in cohort studies of unrelated individuals. Genet Epidemiol. 2004;26:255–264. [PubMed]
- Lin D, Huang B. Reply to Peter Kraft and Daniel O. Stram. Am J Hum Genet. 2007;81:865–866. [PubMed]
- Lin D, Zeng D. Likelihood-based inference on haplotype effects in genetic association studies. J Am Stat Assoc. 2006;101:89–104.
- Lin D, Zeng D, Millikan R. Maximum likelihood estimation of haplotype effects and haplotype-environment interactions in association studies. Genet Epidemiol. 2005;29:299–312. [PubMed]
- Lin PI, Vance JM, Pericak-Vance MA, Martin ER. No gene is an island: the flip-flop phenomenon. Am J Hum Genet. 2007;80:531–538. [PubMed]
- Lotta T, Vidgren J, Tilgmann C, Ulmanen I, Meln K, Julkunen I, Taskinen J. Kinetics of human soluble and membrane-bound catechol O-methyltransferase: a revised mechanism and description of the thermolabile variant of the enzyme. Biochemistry. 1995;34:4202–4210. [PubMed]
- Luo X, Kranzler H, Zuo L, Wang S, Schork N, Gelernter J. Diplotype trend regression analysis of the *ADH* gene cluster and the *ALDH2* gene: Multiple significant associations with alcohol dependence. Am J Hum Genet. 2006;78:973–987. [PubMed]
- Mannisto PT, Kaakkola S. Catechol-O-methyltransferase (COMT): biochemistry, molecular biology, pharmacology, and clinical efficacy of the new selective COMT inhibitors. Pharmacol Rev. 1999;51:593–628. [PubMed]
- Meyer-Lindenberg A, Nichols T, Callicott JH, Ding J, Kolachana B, Buckholtz J, Mattay VS, Egan M, Weinberger DR. Impact of complex genetic variation in COMT on human brain function. Mol Psychiatry. 2006;11:867–877. [PubMed]
- Nackley AG, Shabalina SA, Tchivileva IE, Satterfield K, Korchynskyi O, Makarov SS, Maixner W, Diatchenko L. Human catechol-O-methyltransferase haplotypes modulate protein expression by altering mRNA secondary structure. Science. 2006;314:1930–1933. [PubMed]
- Nackley AG, Tan KS, Fecho K, Flood P, Diatchenko L, Maixner W. Catechol-O-methyltransferase inhibition increases pain sensitivity through activation of both beta2- and beta3-adrenergic receptors. Pain. 2007;128:199–208. [PMC free article] [PubMed]
- Nielsen DM, Weir BS. A classical setting for associations between markers and loci affecting quantitative traits. Genet Res. 1999;74:271–277. [PubMed]
- Oroszi G, Goldman D. Alcoholism: genes and mechanisms. Pharmacogenomics. 2004;5:1037–1048. [PubMed]
- Palmatier MA, Pakstis AJ, Speed W, Paschou P, Goldman D, Odunsi A, Okonofua F, Kajuna S, Karoma N, Kungulilo S, Grigorenko E, Zhukova OV, Bonne-Tamir B, Lu RB, Parnas J, Kidd JR, DeMille MMC, Kidd KK. COMT haplotypes suggest P2 promoter region relevance for schizophrenia. Mol Psychiatry. 2004;9:859–870. [PubMed]
- Schaid D, Rowland C, Tines D, Jacobson R, Poland G. Score tests for association between traits and haplotypes when linkage phase is ambiguous. Am J Hum Genet. 2002;70:425–434. [PubMed]
- Shi MM, Bleavins MR, de la Iglesia FA. Technologies for detecting genetic polymorphisms in pharmacogenomics. Mol Diagn. 1999;4:343–351. [PubMed]
- Shibata K, Ito T, Kitamura Y, Iwasaki N, Tanaka H, Kamatani N. Simultaneous estimation of haplotype frequencies and quantitative trait parameters: applications to the test of association between phenotype and diplotype configuration. Genetics. 2004;168:525–539. [PubMed]
- Shifman S, Bronstein M, Sternfeld M, Pisant-Shalom A, Lev-Lehman E, Weizman A, Reznik I, Spivak B, Grisaru N, Karp L, Schiffer R, Kotler M, Strous RD, Swartz-Vanetik M, Knobler HY, Shinar E, Beckmann JS, Yakir B, Risch N, Zak NB, Darvasi A. A highly significant association between a COMT haplotype and schizophrenia. Am J Hum Genet. 2002;71:1296–1302. [PubMed]
- Simes R. An improved Bonferroni procedure for multiple tests of significance. Biometrika. 1986;73:751–754.
- Stram D, Leigh Pearce C, Bretsky P, Freedman M, Hirschhorn J, Altshuler D, Kolonel L, Henderson B, Thomas D. Modeling and E-M estimation of haplotype-specific relative risks from genotype data for a case-control study of unrelated individuals. Hum Hered. 2003;55:179–190. [PubMed]
- Tregouet D, Barbaux S, Escolano S, Tahri N, Golmard J, Tiret L, Cambien F. Specific haplotypes of the P-selectin gene are associated with myocardial infarction. Hum Mol Genet. 2002;11:2015–2023. [PubMed]
- Tzeng J, Wang C, Kao J, Hsiao C. Regression-based association analysis with clustered haplotypes through use of genotypes. Am J Hum Genet. 2006;78:231–242. [PubMed]
- Valle T, Tuomilehto J, Bergman RN, Ghosh S, Hauser ER, Eriksson J, Nylund SJ, Kohtamaki K, Toivanen L, Vidgren G, Tuomilehto-Wolf E, Ehnholm C, Blaschak J, Langefeld CD, Watanabe RM, Magnuson V, Ally DS, Hagopian WA, Ross E, Buchanan TA, Collins F, Boehnke M. Mapping genes for NIDDM. Design of the Finland-United States investigation of NIDDM genetics (FUSION) study. Diabetes Care. 1998;21:949–958. [PubMed]
- Weir BS. Genetic data analysis II. Sinauer Associates Sunderland; Mass: 1996.
- Wright S. The genetical structure of populations. Ann Eugen. 1951;15:323–354. [PubMed]
- Xie R, Stram D. Asymptotic equivalence between two score tests for haplotype-specific risk in general linear models. Genet Epidemiol. 2005;29:166–170. [PubMed]
- Zaykin DV, Shibata K. Genetic flip-flop without an accompanying change in linkage disequilibrium. Am J Hum Genet. 2008;82:794–796. [PubMed]
- Zaykin DV, Westfall PH, Young SS, Karnoub MA, Wagner MJ, Ehm MG. Testing association of statistically inferred haplotypes with discrete and continuous traits in samples of unrelated individuals. Hum Hered. 2002;53:79–91. [PubMed]
- Zaykin DV, Zhivotovsky LA, Czika W, Shao S, Wolfinger RD. Combining p-values in large-scale genomics experiments. Pharm Stat. 2007;6:217–226. [PMC free article] [PubMed]
- Zhao L, Li S, Khalid N. A method for the assessment of disease associations with single-nucleotide polymorphism haplotypes and environmental variables in case-control studies. Am J Hum Genet. 2003;72:1231–1250. [PubMed]
- Zhu G, Lipsky RH, Xu K, Ali S, Hyde T, Kleinman J, Akhtar LA, Mash DC, Goldman D. Differential expression of human COMT alleles in brain and lymphoblasts detected by RT-coupled 5′ nuclease assay. Psychopharmacology. 2004;177:178–184. [PubMed]
- Zubieta JK, Heitzeg MM, Smith YR, Bueller JA, Xu K, Xu Y, Koeppe RA, Stohler CS, Goldman D. COMT val158met genotype affects mu-opioid neurotransmitter responses to a pain stressor. Science. 2003;299:1240–1243. [PubMed]
