Genet Epidemiol. Author manuscript; available in PMC 2010 September 1.

Published in final edited form as:

Genet Epidemiol. 2009 September; 33(6): 463–478.

doi: 10.1002/gepi.20399

PMCID: PMC2838926

NIHMSID: NIHMS87870

Department of Epidemiology and Biostatistics, Case Western Reserve University

Corresponding Author: Dr. Robert C. Elston Department of Epidemiology and Biostatistics Case Western Reserve University School of Medicine Wolstein Research Building 10900 Euclid Avenue Cleveland, Ohio, 44106-7281


The possible evidence for association comprises three types of information: differences between cases and controls in allele frequencies, in parameters for Hardy Weinberg disequilibrium (HWD), and in parameters for linkage disequilibrium (LD). LD between marker and disease alleles results in a difference in at least one of the three types of parameters [Won and Elston, 2008]. However, the parameters for LD require knowledge about phase, which is usually unknown, making the LD contrast test without modification infeasible in practice. Methods for handling phase uncertainty are: (1) the most probable haplotype pair for each individual can be considered as the true phase; (2) a weighted average of haplotypes can be used; (3) we can consider the composite LD, which does not require any information about phase. We compare these methods to handle phase uncertainty in terms of validity and efficiency, and the effect on them of HWD in the population, at the same time confirming results for the three types of information. When the LD between markers is high, the LD contrast test that uses a weighted average of haplotypes or the most probable haplotypes to calculate the LD is recommended, but otherwise the LD contrast test that uses the composite LD is recommended. We conclude that, even though the difference in allele frequencies is usually the most informative test except in the case of a recessive disease, the LD contrast test can be more powerful if the markers are dense enough.

The goal of population association studies is to identify genetic polymorphisms that distinguish among individuals with different disease status, and it has been shown that an association between marker and disease alleles can be detected as a result of gametic phase disequilibrium for linked loci, which is what we shall mean here by linkage disequilibrium (LD). Thus, today LD can play a key role in genetic epidemiology and the advent of SNPs has enabled association studies at the genome-wide level. However, notwithstanding their efficiency, the huge number of SNPs has generated the problem of multiple testing and statistical methods to combat this issue have been investigated. SNP reduction, sometimes called self-replication [Van Steen et al., 2005; Zheng et al., 2007] is a possible solution to this problem when we have two independent statistics that can be used to test the same biological hypothesis. For example, suppose we have 500,000 equally-spaced SNPs in a genome-wide association (GWA) study, i.e. SNPs about 6Kb apart, then correction for multiple tests would require a Bonferroni-corrected p-value of 0.05/500,000 for genome-wide significance at the 0.05 level, which is difficult to meet in practice. However, suppose we have two independent statistics (*t*_{1} and *t*_{2}) and their p-values are *u*_{1} and *u*_{2}. Then, if 1% of the SNPs are selected by screening based on either of the two independent statistics, and only the selected SNPs are used for a testing phase with the other statistic, we now only need a Bonferroni-corrected p-value of 0.05/5,000 at this testing phase to have genome-wide significance at the 0.05 level. This self-replication can be reformulated in terms of combining p-values. The rejection region for *t*_{1} (or *t*_{2}), in this example of self-replication is asymptotically *u*_{1}<0.05/5,000 with *u*_{2}<0.01 (or *u*_{2}<0.05/5,000 with *u*_{1}<0.01). 
We can also use Fisher's method or Liptak's method [Liptak, 1958] for combining p-values (see Figure 1 for their rejection regions). As a result, better approaches than self-replication can be attained as long as we use the optimal method for combining *u*_{1} and *u*_{2}; and we have shown that the most powerful method can be obtained under certain circumstances even though no uniformly most powerful test is possible [Won et al., 2008].
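As a rough illustration of the thresholds and combination rules described above, the following Python sketch computes the Bonferroni thresholds for the self-replication example and combines two independent p-values by Fisher's method and by Liptak's method. The function names and the equal weights used in Liptak's method are assumptions made for illustration; they are not taken from the cited papers.

```python
import math
from statistics import NormalDist


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold giving family-wise level alpha over n_tests tests."""
    return alpha / n_tests


def fisher_combined_p(u1, u2):
    """Fisher's method for two independent p-values.

    Under the null, -2*(ln u1 + ln u2) is chi-square with 4 df, whose
    survival function has the closed form exp(-x/2) * (1 + x/2).
    """
    x = -2.0 * (math.log(u1) + math.log(u2))
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def liptak_combined_p(u1, u2, w1=1.0, w2=1.0):
    """Liptak's weighted inverse-normal method (equal weights are an assumption)."""
    nd = NormalDist()
    z = (w1 * nd.inv_cdf(1.0 - u1) + w2 * nd.inv_cdf(1.0 - u2)) / math.sqrt(w1**2 + w2**2)
    return 1.0 - nd.cdf(z)


n_snps = 500_000
n_selected = n_snps // 100  # 1% of the SNPs pass the screening stage
single_stage = bonferroni_threshold(0.05, n_snps)       # 0.05/500,000
testing_stage = bonferroni_threshold(0.05, n_selected)  # 0.05/5,000
```

The two combination rules define different rejection regions in the (*u*_{1}, *u*_{2}) plane, which is why their power can differ from that of self-replication even at the same overall significance level.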

It has been shown that LD between marker and disease alleles can result in differences of the following parameters between cases and controls: allele frequencies, the parameters for Hardy-Weinberg disequilibrium (HWD) and the parameters for LD between markers [Won and Elston, 2008]. The fact that the phase-known genotype frequencies can be parameterized in terms of these three parameters shows they can each be used for association studies. In principle, there is evidence for association if any one of the parameters is different between cases and controls provided certain conditions - such as no genotyping error – hold, and the three types of information can have different levels of power according to the disease mode of inheritance and LD structure. However, even though these results provide intuition about an optimal strategy for analysis, their application to a practical situation is restricted because the usual LD contrast test requires information about phase, which is often unknown. Three ways of overcoming the phase uncertainty have been considered. First, the most probable haplotype pair for each individual can be considered as the true phase. Second, a weighted average of haplotypes can be used. Third, we can use the composite LD [Wang et al., 2007; Zaykin, 2004; Zaykin et al., 2006], which does not require any information about phase. The validity and efficiency of these three approaches have not yet been investigated. We therefore compare these approaches to handle phase uncertainty in terms of their type I error and power according to disease mode of inheritance and SNP density. At the same time, we investigate the effect of HWD in the population on the analysis. Our results show that, whereas the composite LD always preserves the type I error with reduced power, the methods that use the most probable haplotype or weighted haplotypes for each individual can have better power with type I error preserved provided dense markers are used. 
We further show that, when the markers are dense, the LD contrast test can be more powerful than a test based on allele frequency differences.

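Let *X*_{ij} and *Y*_{ij} be indicator variables for the alleles at marker loci *A* and *B*, respectively, defined as follows: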

$$\begin{array}{cc}\hfill & {X}_{\mathit{ij}}=\left\{\begin{array}{cc}1\hfill & \text{if an allele at locus}\phantom{\rule{thinmathspace}{0ex}}A\phantom{\rule{thinmathspace}{0ex}}\text{in haplotype}\phantom{\rule{thinmathspace}{0ex}}j\phantom{\rule{thinmathspace}{0ex}}\text{of individual}\phantom{\rule{thinmathspace}{0ex}}i\phantom{\rule{thinmathspace}{0ex}}\text{is}\phantom{\rule{thinmathspace}{0ex}}{A}_{1}\hfill \\ 0\hfill & \text{if an allele at locus}\phantom{\rule{thinmathspace}{0ex}}A\phantom{\rule{thinmathspace}{0ex}}\text{in haplotype}\phantom{\rule{thinmathspace}{0ex}}j\phantom{\rule{thinmathspace}{0ex}}\text{of individual}\phantom{\rule{thinmathspace}{0ex}}i\phantom{\rule{thinmathspace}{0ex}}\text{is}\phantom{\rule{thinmathspace}{0ex}}{A}_{2}\hfill \end{array}\right\}\hfill \\ \hfill & {Y}_{\mathit{ij}}=\left\{\begin{array}{cc}1\hfill & \text{if an allele at locus}\phantom{\rule{thinmathspace}{0ex}}B\phantom{\rule{thinmathspace}{0ex}}\text{in haplotype}\phantom{\rule{thinmathspace}{0ex}}j\phantom{\rule{thinmathspace}{0ex}}\text{of individual}\phantom{\rule{thinmathspace}{0ex}}i\phantom{\rule{thinmathspace}{0ex}}\text{is}\phantom{\rule{thinmathspace}{0ex}}{B}_{1}\hfill \\ 0\hfill & \text{if an allele at locus}\phantom{\rule{thinmathspace}{0ex}}B\phantom{\rule{thinmathspace}{0ex}}\text{in haplotype}\phantom{\rule{thinmathspace}{0ex}}j\phantom{\rule{thinmathspace}{0ex}}\text{of individual}\phantom{\rule{thinmathspace}{0ex}}i\phantom{\rule{thinmathspace}{0ex}}\text{is}\phantom{\rule{thinmathspace}{0ex}}{B}_{2}\hfill \end{array}\right\}\hfill \end{array}$$

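where *j* = 1 (2) indicates a maternal (paternal) haplotype. For given marker loci *A* and *B*, let ${p}_{{A}_{k}}$, ${p}_{{A}_{k}{B}_{l}}$ and ${p}_{{A}_{k'}{B}_{l'}}^{{A}_{k}{B}_{l}}$ denote the frequencies of allele ${A}_{k}$, of haplotype ${A}_{k}{B}_{l}$ and of the phase-known genotype consisting of haplotypes ${A}_{k}{B}_{l}$ and ${A}_{k'}{B}_{l'}$, respectively, and let ${d}_{A}\equiv {p}_{{A}_{1}{A}_{1}}-{p}_{{A}_{1}}^{2}$ be the HWD parameter at locus *A*. We define the following disequilibrium coefficients: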

$$\begin{array}{cc}\hfill & {d}_{12}^{\mathit{AB}}\equiv {p}_{{A}_{1}{B}_{1}}{p}_{{A}_{1}{B}_{2}}-\frac{1}{2}{p}_{{A}_{1}{B}_{2}}^{{A}_{1}{B}_{1}},\phantom{\rule{1em}{0ex}}\phantom{\rule{1em}{0ex}}\phantom{\rule{1em}{0ex}}{d}_{13}^{\mathit{AB}}\equiv {p}_{{A}_{1}{B}_{1}}{p}_{{A}_{2}{B}_{1}}-\frac{1}{2}{p}_{{A}_{2}{B}_{1}}^{{A}_{1}{B}_{1}}\hfill \\ \hfill & {d}_{14}^{\mathit{AB}}\equiv {p}_{{A}_{1}{B}_{1}}{p}_{{A}_{2}{B}_{2}}-\frac{1}{2}{p}_{{A}_{2}{B}_{2}}^{{A}_{1}{B}_{1}},\phantom{\rule{1em}{0ex}}\phantom{\rule{1em}{0ex}}\phantom{\rule{1em}{0ex}}{d}_{23}^{\mathit{AB}}\equiv {p}_{{A}_{1}{B}_{2}}{p}_{{A}_{2}{B}_{1}}-\frac{1}{2}{p}_{{A}_{2}{B}_{1}}^{{A}_{1}{B}_{2}}\hfill \\ \hfill & {d}_{24}^{\mathit{AB}}\equiv {p}_{{A}_{1}{B}_{2}}{p}_{{A}_{2}{B}_{2}}-\frac{1}{2}{p}_{{A}_{2}{B}_{2}}^{{A}_{1}{B}_{2}},\phantom{\rule{1em}{0ex}}\phantom{\rule{1em}{0ex}}\phantom{\rule{1em}{0ex}}{d}_{34}^{\mathit{AB}}\equiv {p}_{{A}_{2}{B}_{1}}{p}_{{A}_{2}{B}_{2}}-\frac{1}{2}{p}_{{A}_{2}{B}_{2}}^{{A}_{2}{B}_{1}}.\hfill \end{array}$$

Then ${d}_{A}=-({d}_{13}^{\mathit{AB}}+{d}_{14}^{\mathit{AB}}+{d}_{23}^{\mathit{AB}}+{d}_{24}^{\mathit{AB}})$ and ${p}_{{A}_{1}{B}_{1}}{p}_{{A}_{1}{B}_{1}}-{p}_{{A}_{1}{B}_{1}}^{{A}_{1}{B}_{1}}=-{d}_{12}^{\mathit{AB}}-{d}_{13}^{\mathit{AB}}-{d}_{14}^{\mathit{AB}}$.

To parameterize LD, let ${\Delta}_{{A}_{1}{B}_{1}}\equiv {p}_{{A}_{1}{B}_{1}}-{p}_{{A}_{1}}{p}_{{B}_{1}}$, ${\Delta}_{{A}_{1}{B}_{1}\mid \text{case}}\equiv {p}_{{A}_{1}{B}_{1}\mid \text{case}}-{p}_{{A}_{1}\mid \text{case}}{p}_{{B}_{1}\mid \text{case}}$ and ${\Delta}_{{A}_{1}{B}_{1}\mid \text{control}}\equiv {p}_{{A}_{1}{B}_{1}\mid \text{control}}-{p}_{{A}_{1}\mid \text{control}}{p}_{{B}_{1}\mid \text{control}}$ be the LD parameters between the two loci in the population, cases and controls, respectively. The composite LD parameters, which do not require any information about phase, are defined as ${\Delta}_{{A}_{1}{B}_{1}}^{C}\equiv {p}_{{A}_{1}{B}_{1}}+{p}_{{A}_{1}/{B}_{1}}-2\,{p}_{{A}_{1}}{p}_{{B}_{1}}$, ${\Delta}_{{A}_{1}{B}_{1}\mid \text{case}}^{C}\equiv {p}_{{A}_{1}{B}_{1}\mid \text{case}}+{p}_{{A}_{1}/{B}_{1}\mid \text{case}}-2\,{p}_{{A}_{1}\mid \text{case}}{p}_{{B}_{1}\mid \text{case}}$ and ${\Delta}_{{A}_{1}{B}_{1}\mid \text{control}}^{C}\equiv {p}_{{A}_{1}{B}_{1}\mid \text{control}}+{p}_{{A}_{1}/{B}_{1}\mid \text{control}}-2\,{p}_{{A}_{1}\mid \text{control}}{p}_{{B}_{1}\mid \text{control}}$ in the population, cases and controls, respectively, where ${p}_{{A}_{1}/{B}_{1}}$ denotes the joint frequency of alleles ${A}_{1}$ and ${B}_{1}$ on two different gametes. Finally, it should be kept in mind that this notation is also applied to a disease locus (*D*) by correspondingly replacing the notation for the locus and alleles.

When there are markers around a disease locus *D*, where *D*_{1} (*D*_{2}) denotes a disease (normal) allele, association can be detected from the LD between the marker and the disease alleles, and we have shown that the information for association in case-control studies consists of three different parts [Won and Elston, 2008]:

- ${I}_{a}$: trend (or difference) in allele frequencies
- ${I}_{\mathit{HWD}}$: trend (or difference) in HWD
- ${I}_{\mathit{LD}}$: trend (or difference) in LD.

For each type of information, we define the three principal statistics as follows [Nielsen et al., 1998; Nielsen et al., 2004; Sasieni, 1997; Song and Elston, 2006]:

$$\begin{array}{c}{S}_{a}=\dfrac{{\widehat{p}}_{{A}_{1}\mid \text{case}}-{\widehat{p}}_{{A}_{1}\mid \text{control}}}{\sqrt{\mathrm{var}({\widehat{p}}_{{A}_{1}\mid \text{case}}-{\widehat{p}}_{{A}_{1}\mid \text{control}})}},\qquad {S}_{\mathit{HWD}}=\dfrac{{\widehat{d}}_{A\mid \text{case}}-{\widehat{d}}_{A\mid \text{control}}}{\sqrt{\mathrm{var}({\widehat{d}}_{A\mid \text{case}}-{\widehat{d}}_{A\mid \text{control}})}},\\[2ex] {S}_{\mathit{LD}}=\dfrac{{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{case}}-{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{control}}}{\sqrt{\mathrm{var}({\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{case}}-{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{control}})}}.\end{array}$$

However, haplotypes are usually unknown and only the genotype at each locus is observed. Three possible ways that handle phase uncertainty have been used. First, the most probable haplotype pair for each individual can be estimated and considered as the true phase. Second, a weighted average of haplotypes can be used. Third, we can use the composite LD [Wang et al., 2007; Zaykin, 2004; Zaykin et al., 2006], which does not require any information about phase. We denote these methods as follows:

- ${I}_{\mathit{LD}}^{m}$: trend (or difference) in LD using the most probable haplotypes
- ${I}_{\mathit{LD}}^{w}$: trend (or difference) in LD using weighted haplotypes
- ${I}_{\mathit{LD}}^{c}$: trend (or difference) in composite LD.

Also, as has been known for some time, if the variance for ${I}_{\mathit{LD}}$ is used in place of the true variances of the three corresponding statistics, the type I error can be either inflated or deflated; this will also be seen in our results. Thus, the variances of the statistics corresponding to ${I}_{\mathit{LD}}^{m}$, ${I}_{\mathit{LD}}^{w}$ and ${I}_{\mathit{LD}}^{c}$ need to be derived.

First, we consider the most probable haplotypes and weighted averages of haplotypes obtained via haplotype frequency estimation. If the haplotype frequencies are known, then, letting *w* = *p*_{A1B1}*p*_{A2B2}/(*p*_{A1B1}*p*_{A2B2} + *p*_{A1B2}*p*_{A2B1}) be the probability that a double heterozygote carries the haplotype pair *A*_{1}*B*_{1}/*A*_{2}*B*_{2}, the most probable haplotypes and the weighted haplotypes for loci *A* and *B* result in the following for the phase-unknown double heterozygote genotype *A*_{1}*A*_{2}*B*_{1}*B*_{2}:

$$\begin{array}{ll}\text{most probable haplotypes:} & {A}_{1}{A}_{2}{B}_{1}{B}_{2}\to \begin{cases}{A}_{1}{B}_{1}/{A}_{2}{B}_{2} & \text{if } w \text{ is larger than } 0.5\\ {A}_{1}{B}_{2}/{A}_{2}{B}_{1} & \text{otherwise}\end{cases}\\[3ex] \text{weighted haplotypes:} & {A}_{1}{A}_{2}{B}_{1}{B}_{2}\to \begin{cases}{A}_{1}{B}_{1}/{A}_{2}{B}_{2} & \text{with probability } w\\ {A}_{1}{B}_{2}/{A}_{2}{B}_{1} & \text{with probability } 1-w.\end{cases}\end{array}$$

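Thus, if we let *N* and ${N}_{{A}_{k}{A}_{k'}{B}_{l}{B}_{l'}}$ be the sample size and the number of individuals with genotype ${A}_{k}{A}_{k'}{B}_{l}{B}_{l'}$, respectively, the LD estimates based on the most probable haplotypes and on the weighted haplotypes are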

$$\begin{array}{cc}\hfill {\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{m}& \equiv \frac{1}{N}\left[{N}_{{A}_{1}{A}_{1}{B}_{1}{B}_{1}}+\frac{1}{2}{N}_{{A}_{1}{A}_{1}{B}_{1}{B}_{2}}+\frac{1}{2}{N}_{{A}_{1}{A}_{2}{B}_{1}{B}_{1}}+\frac{1}{2}{N}_{{A}_{1}{A}_{2}{B}_{1}{B}_{2}}I(w>\frac{1}{2})\right]-{\widehat{p}}_{{A}_{1}}{\widehat{p}}_{{B}_{1}}\hfill \\ \hfill \text{and}\phantom{\rule{thinmathspace}{0ex}}{\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{w}& \equiv \frac{1}{N}\left[{N}_{{A}_{1}{A}_{1}{B}_{1}{B}_{1}}+\frac{1}{2}{N}_{{A}_{1}{A}_{1}{B}_{1}{B}_{2}}+\frac{1}{2}{N}_{{A}_{1}{A}_{2}{B}_{1}{B}_{1}}+\frac{w}{2}{N}_{{A}_{1}{A}_{2}{B}_{1}{B}_{2}}\right]-{\widehat{p}}_{{A}_{1}}{\widehat{p}}_{{B}_{1}}.\hfill \end{array}$$

By defining ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W}={\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{m}$ if *W* = *I*(*w* > 1/2) for the indicator function *I,* and ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W}={\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{w}$ if *W* = *w*, we can subsume both estimates in a general way, using *W,* as follows:

$${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W}\equiv \frac{1}{N}\left[{N}_{{A}_{1}{A}_{1}{B}_{1}{B}_{1}}+\frac{1}{2}{N}_{{A}_{1}{A}_{1}{B}_{1}{B}_{2}}+\frac{1}{2}{N}_{{A}_{1}{A}_{2}{B}_{1}{B}_{1}}+\frac{W}{2}{N}_{{A}_{1}{A}_{2}{B}_{1}{B}_{2}}\right]-{\widehat{p}}_{{A}_{1}}{\widehat{p}}_{{B}_{1}}.$$
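The two estimators can be computed directly from the nine two-locus genotype counts. The sketch below is a minimal illustration, assuming the counts are stored in a dictionary keyed by the numbers of *A*_{1} and *B*_{1} alleles carried by an individual and that the weight *w* (the probability that a double heterozygote is *A*_{1}*B*_{1}/*A*_{2}*B*_{2}) has already been obtained from estimated haplotype frequencies. The function names, data layout and example counts are hypothetical.

```python
def cis_weight(p11, p12, p21, p22):
    """Probability that a double heterozygote carries A1B1/A2B2,
    given the frequencies of haplotypes A1B1, A1B2, A2B1 and A2B2."""
    return p11 * p22 / (p11 * p22 + p12 * p21)


def ld_estimates(counts, w):
    """Return the most-probable-haplotype ("m") and weighted ("w") LD estimates.

    counts maps (a, b) -> number of individuals carrying a copies of A1
    and b copies of B1 (a, b in {0, 1, 2}).
    """
    n = sum(counts.values())
    p_a = sum(a * c for (a, _), c in counts.items()) / (2 * n)
    p_b = sum(b * c for (_, b), c in counts.items()) / (2 * n)

    def delta(big_w):
        # Genotypes whose number of A1B1 haplotypes is known without phase.
        known = (counts.get((2, 2), 0)
                 + 0.5 * counts.get((2, 1), 0)
                 + 0.5 * counts.get((1, 2), 0))
        # Double heterozygotes contribute W/2 each.
        ambiguous = 0.5 * big_w * counts.get((1, 1), 0)
        return (known + ambiguous) / n - p_a * p_b

    return {"m": delta(1.0 if w > 0.5 else 0.0), "w": delta(w)}


# Hypothetical counts for 100 individuals.
example_counts = {(2, 2): 10, (2, 1): 4, (1, 2): 6, (1, 1): 20, (0, 0): 60}
example = ld_estimates(example_counts, w=0.8)
```

Only the double heterozygotes are affected by the choice of *W*, so the two estimates differ by (1 − *w*)/2 or *w*/2 times the proportion of double heterozygotes in the sample.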

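To derive the variance of ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W}$ and its covariances with the estimators for ${I}_{a}$ and ${I}_{\mathit{HWD}}$, we parameterize it in terms of the random variables defined above as follows: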

$$\begin{array}{rl}{\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W}= & \dfrac{1}{2N}\sum\limits_{i=1}^{N}\Big[2{X}_{i1}{X}_{i2}{Y}_{i1}{Y}_{i2}+(1-{X}_{i1}){X}_{i2}{Y}_{i1}{Y}_{i2}+{X}_{i1}(1-{X}_{i2}){Y}_{i1}{Y}_{i2}\\ & \quad +{X}_{i1}{X}_{i2}(1-{Y}_{i1}){Y}_{i2}+{X}_{i1}{X}_{i2}{Y}_{i1}(1-{Y}_{i2})+W(1-{X}_{i1}){X}_{i2}(1-{Y}_{i1}){Y}_{i2}\\ & \quad +W{X}_{i1}(1-{X}_{i2})(1-{Y}_{i1}){Y}_{i2}+W(1-{X}_{i1}){X}_{i2}{Y}_{i1}(1-{Y}_{i2})+W{X}_{i1}(1-{X}_{i2}){Y}_{i1}(1-{Y}_{i2})\Big]-{\widehat{p}}_{{A}_{1}}{\widehat{p}}_{{B}_{1}}\\ = & {\widehat{\Delta}}_{{A}_{1}{B}_{1}}+\dfrac{1}{2N}\sum\limits_{i=1}^{N}\Big[W\left\{{X}_{i1}(1-{X}_{i2})(1-{Y}_{i1}){Y}_{i2}+(1-{X}_{i1}){X}_{i2}{Y}_{i1}(1-{Y}_{i2})\right\}\\ & \quad -(1-W)\left\{(1-{X}_{i1}){X}_{i2}(1-{Y}_{i1}){Y}_{i2}+{X}_{i1}(1-{X}_{i2}){Y}_{i1}(1-{Y}_{i2})\right\}\Big],\end{array}$$

because ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}=\sum_{i=1}^{N}\left[{X}_{i1}{Y}_{i1}+{X}_{i2}{Y}_{i2}\right]/2N-{\widehat{p}}_{{A}_{1}}{\widehat{p}}_{{B}_{1}}$, and we obtain the following results under Hardy-Weinberg equilibrium (HWE) using the delta method in a way analogous to that of Won and Elston [2008]:

$$\begin{array}{rl}E\left({\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W}\right)= & {\Delta}_{{A}_{1}{B}_{1}}-({p}_{{A}_{1}}{p}_{{B}_{1}}+{\Delta}_{{A}_{1}{B}_{1}})\left\{(1-{p}_{{A}_{1}})(1-{p}_{{B}_{1}})+{\Delta}_{{A}_{1}{B}_{1}}\right\}\\ & +W\left[2{p}_{{A}_{1}}{p}_{{B}_{1}}(1-{p}_{{A}_{1}})(1-{p}_{{B}_{1}})+{\Delta}_{{A}_{1}{B}_{1}}\left\{(1-2{p}_{{A}_{1}})(1-2{p}_{{B}_{1}})+2{\Delta}_{{A}_{1}{B}_{1}}\right\}\right]-\dfrac{{\Delta}_{{A}_{1}{B}_{1}}}{2N}\end{array}$$

$$\begin{array}{rl}\mathrm{var}\left({\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W}\right)= & \dfrac{1}{N}\Big[(1-2W)\Big\{-\frac{1}{4}{p}_{{A}_{1}}{q}_{{A}_{1}}{p}_{{B}_{1}}{q}_{{B}_{1}}(4{p}_{{A}_{1}}{q}_{{A}_{1}}{p}_{{B}_{1}}{q}_{{B}_{1}}+2{b}_{{A}_{1}}{b}_{{B}_{1}}-1)\\ & \quad +{\Delta}_{{A}_{1}{B}_{1}}\Big(-2{p}_{{A}_{1}}^{2}{p}_{{B}_{1}}^{2}(2{q}_{{A}_{1}}{q}_{{B}_{1}}+{q}_{{A}_{1}}+{q}_{{B}_{1}})+2{p}_{{A}_{1}}{p}_{{B}_{1}}({p}_{{A}_{1}}{q}_{{A}_{1}}+{p}_{{B}_{1}}{q}_{{B}_{1}}-2{q}_{{A}_{1}}{q}_{{B}_{1}})\\ & \quad\quad +\frac{1}{8}\left(-2{({p}_{{A}_{1}}-{p}_{{B}_{1}})}^{2}+6({p}_{{A}_{1}}{q}_{{A}_{1}}+{p}_{{B}_{1}}{q}_{{B}_{1}})-1\right)\Big)\\ & \quad +{\Delta}_{{A}_{1}{B}_{1}}^{2}({p}_{{A}_{1}}{q}_{{A}_{1}}+{p}_{{B}_{1}}{q}_{{B}_{1}}-6{p}_{{A}_{1}}{q}_{{A}_{1}}{p}_{{B}_{1}}{q}_{{B}_{1}})+{\Delta}_{{A}_{1}{B}_{1}}^{3}(1-{b}_{{A}_{1}}{b}_{{B}_{1}})-{\Delta}_{{A}_{1}{B}_{1}}^{4}\\ & \quad +W\Big(\frac{1}{2}{p}_{{A}_{1}}{q}_{{A}_{1}}{p}_{{B}_{1}}{q}_{{B}_{1}}(4{p}_{{A}_{1}}{q}_{{A}_{1}}{p}_{{B}_{1}}{q}_{{B}_{1}}-1)+\frac{1}{4}{\Delta}_{{A}_{1}{B}_{1}}{b}_{{A}_{1}}{b}_{{B}_{1}}(8{p}_{{A}_{1}}{q}_{{A}_{1}}{p}_{{B}_{1}}{q}_{{B}_{1}}-1)\\ & \quad\quad +2{\Delta}_{{A}_{1}{B}_{1}}^{2}(6{p}_{{A}_{1}}{q}_{{A}_{1}}{p}_{{B}_{1}}{q}_{{B}_{1}}-{p}_{{A}_{1}}{q}_{{A}_{1}}-{p}_{{B}_{1}}{q}_{{B}_{1}})+2{\Delta}_{{A}_{1}{B}_{1}}^{3}{b}_{{A}_{1}}{b}_{{B}_{1}}+2{\Delta}_{{A}_{1}{B}_{1}}^{4}\Big)\Big\}\\ & \quad +\frac{1}{8}(2{p}_{{A}_{1}}{q}_{{A}_{1}}{p}_{{B}_{1}}{q}_{{B}_{1}}+{\Delta}_{{A}_{1}{B}_{1}}{b}_{{A}_{1}}{b}_{{B}_{1}})\Big]\end{array}$$

$$\begin{array}{cc}\hfill & \mathrm{cov}({\widehat{p}}_{{A}_{1}},{\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W})=\frac{1}{2N}(1-2{p}_{{A}_{1}})\left[{\Delta}_{{A}_{1}{B}_{1}}+2{\mathit{Wp}}_{{A}_{1}{B}_{2}}{p}_{{A}_{2}{B}_{1}}-2(1-W){p}_{{A}_{1}{B}_{1}}{p}_{{A}_{2}{B}_{2}}\right]+O\left(\frac{1}{{N}^{2}}\right)\hfill \\ \hfill & \mathrm{cov}({\widehat{d}}_{A},{\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W})=\frac{1}{2N}{p}_{{A}_{1}}\left[{\mathit{Wp}}_{{A}_{1}{B}_{2}}{p}_{{A}_{2}{B}_{1}}-(1-W){p}_{{A}_{1}{B}_{1}}{p}_{{A}_{2}{B}_{2}}\right]+O\left(\frac{1}{{N}^{2}}\right).\hfill \end{array}$$
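As a sanity check on the expectation, the leading-order difference between $E({\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W})$ and ${\Delta}_{{A}_{1}{B}_{1}}$ under HWE (ignoring the term of order 1/*N*) can be written either in terms of haplotype frequencies, as *W* *p*_{A1B2}*p*_{A2B1} − (1 − *W*) *p*_{A1B1}*p*_{A2B2}, or in terms of allele frequencies and ${\Delta}_{{A}_{1}{B}_{1}}$. The following Python sketch compares the two forms numerically; the function names and parameter values are illustrative assumptions.

```python
def haplotype_freqs(p_a, p_b, delta):
    """Frequencies of haplotypes A1B1, A1B2, A2B1, A2B2 from allele
    frequencies p_a = P(A1), p_b = P(B1) and the LD parameter delta."""
    q_a, q_b = 1.0 - p_a, 1.0 - p_b
    return (p_a * p_b + delta, p_a * q_b - delta,
            q_a * p_b - delta, q_a * q_b + delta)


def shift_from_haplotypes(p_a, p_b, delta, big_w):
    """Leading-order E(Delta^W) - Delta: double heterozygotes in repulsion
    phase add W, those in coupling phase lose (1 - W)."""
    p11, p12, p21, p22 = haplotype_freqs(p_a, p_b, delta)
    return big_w * p12 * p21 - (1.0 - big_w) * p11 * p22


def shift_closed_form(p_a, p_b, delta, big_w):
    """The same quantity written with allele frequencies and delta."""
    q_a, q_b = 1.0 - p_a, 1.0 - p_b
    return (big_w * (2.0 * p_a * p_b * q_a * q_b
                     + delta * ((1.0 - 2.0 * p_a) * (1.0 - 2.0 * p_b) + 2.0 * delta))
            - (p_a * p_b + delta) * (q_a * q_b + delta))


# Largest discrepancy between the two forms over a small grid of values.
max_gap = max(
    abs(shift_from_haplotypes(pa, pb, d, w) - shift_closed_form(pa, pb, d, w))
    for pa in (0.1, 0.3, 0.5)
    for pb in (0.2, 0.4)
    for d in (-0.01, 0.0, 0.02)
    for w in (0.0, 0.5, 0.9, 1.0)
)
```

The shift is generally non-zero, which is one way to see that ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{W}$ does not estimate ${\Delta}_{{A}_{1}{B}_{1}}$ itself and therefore needs its own variance.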

Next, the composite LD can be estimated as follows [Weir, 1996]:

$${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{C}\equiv \frac{1}{N}\left[2{N}_{{A}_{1}{A}_{1}{B}_{1}{B}_{1}}+{N}_{{A}_{1}{A}_{1}{B}_{1}{B}_{2}}+{N}_{{A}_{1}{A}_{2}{B}_{1}{B}_{1}}+\frac{1}{2}{N}_{{A}_{1}{A}_{2}{B}_{1}{B}_{2}}\right]-2{\widehat{p}}_{{A}_{1}}{\widehat{p}}_{{B}_{1}}.$$

To derive its variance and relevant covariances, we parameterize it in terms of the random variables defined above as follows:

$$\begin{array}{rl}{\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{C}= & \dfrac{1}{2N}\sum\limits_{i=1}^{N}\Big[4{X}_{i1}{X}_{i2}{Y}_{i1}{Y}_{i2}+2(1-{X}_{i1}){X}_{i2}{Y}_{i1}{Y}_{i2}+2{X}_{i1}(1-{X}_{i2}){Y}_{i1}{Y}_{i2}\\ & \quad +2{X}_{i1}{X}_{i2}(1-{Y}_{i1}){Y}_{i2}+2{X}_{i1}{X}_{i2}{Y}_{i1}(1-{Y}_{i2})+(1-{X}_{i1}){X}_{i2}(1-{Y}_{i1}){Y}_{i2}\\ & \quad +{X}_{i1}(1-{X}_{i2})(1-{Y}_{i1}){Y}_{i2}+(1-{X}_{i1}){X}_{i2}{Y}_{i1}(1-{Y}_{i2})+{X}_{i1}(1-{X}_{i2}){Y}_{i1}(1-{Y}_{i2})\Big]-2{\widehat{p}}_{{A}_{1}}{\widehat{p}}_{{B}_{1}}\\ = & {\widehat{\Delta}}_{{A}_{1}{B}_{1}}+\dfrac{1}{2N}\sum\limits_{i=1}^{N}\left[{X}_{i1}{Y}_{i2}+{X}_{i2}{Y}_{i1}\right]-{\widehat{p}}_{{A}_{1}}{\widehat{p}}_{{B}_{1}}.\end{array}$$
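Because the composite LD estimate depends only on the per-locus genotypes, it can be computed either from the genotype counts or from per-individual allele counts (dosages), where the bracketed sum reduces to (*X*_{i1} + *X*_{i2})(*Y*_{i1} + *Y*_{i2}). The Python sketch below illustrates both routes and checks that they agree; the function names, the data layout and the example counts are assumptions made for illustration.

```python
def composite_ld_from_counts(counts):
    """Composite LD from two-locus genotype counts.

    counts maps (a, b) -> number of individuals carrying a copies of A1
    and b copies of B1 (a, b in {0, 1, 2}).
    """
    n = sum(counts.values())
    p_a = sum(a * c for (a, _), c in counts.items()) / (2 * n)
    p_b = sum(b * c for (_, b), c in counts.items()) / (2 * n)
    n_ab = (2 * counts.get((2, 2), 0) + counts.get((2, 1), 0)
            + counts.get((1, 2), 0) + 0.5 * counts.get((1, 1), 0))
    return n_ab / n - 2 * p_a * p_b


def composite_ld_from_dosages(g_a, g_b):
    """Composite LD from per-individual dosages g_a[i] = X_i1 + X_i2 and
    g_b[i] = Y_i1 + Y_i2; no phase information is used."""
    n = len(g_a)
    p_a = sum(g_a) / (2 * n)
    p_b = sum(g_b) / (2 * n)
    cross = sum(x * y for x, y in zip(g_a, g_b)) / (2 * n)
    return cross - 2 * p_a * p_b


# Hypothetical counts for 100 individuals, expanded to dosage vectors.
example_counts = {(2, 2): 10, (2, 1): 4, (1, 2): 6, (1, 1): 20, (0, 0): 60}
dos_a = [a for (a, _), c in example_counts.items() for _ in range(c)]
dos_b = [b for (_, b), c in example_counts.items() for _ in range(c)]
from_counts = composite_ld_from_counts(example_counts)
from_dosages = composite_ld_from_dosages(dos_a, dos_b)
```

The dosage form makes it explicit that the composite estimate is half the sample covariance (with divisor *N*) between the allele counts at the two loci.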

With these results, we find that the expectations, variances [Weir, 1996] and covariances under HWE in either cases or controls are as follows:

$$\begin{array}{l}E\left({\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{C}\right)={\Delta}_{{A}_{1}{B}_{1}}-\dfrac{{\Delta}_{{A}_{1}{B}_{1}}}{N},\\[2ex] \mathrm{var}\left({\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{C}\right)=\dfrac{1}{N}\left[{p}_{{A}_{1}}(1-{p}_{{A}_{1}}){p}_{{B}_{1}}(1-{p}_{{B}_{1}})+\dfrac{1}{2}(1-2{p}_{{A}_{1}})(1-2{p}_{{B}_{1}}){\Delta}_{{A}_{1}{B}_{1}}\right].\end{array}$$

$$\begin{array}{cc}\hfill & \mathrm{cov}({\widehat{p}}_{{A}_{1}},{\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{C})=\frac{{\Delta}_{{A}_{1}{B}_{1}}(1-2{p}_{{A}_{1}})}{2N}+O\left(\frac{1}{{N}^{2}}\right),\phantom{\rule{thinmathspace}{0ex}}\text{and}\hfill \\ \hfill & \mathrm{cov}({\widehat{d}}_{A},{\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{C})=\frac{1}{N}{\Delta}_{{A}_{1}{B}_{1}}\phantom{\rule{thinmathspace}{0ex}}{p}_{{A}_{1}}(1-{p}_{{A}_{1}})+O\left(\frac{1}{{N}^{2}}\right).\hfill \end{array}$$

(see the Appendix for HWD). These show that the expectation and covariances of ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{C}$, but not its variance, are approximately equal to those of ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}$ under HWE. Also, the equivalence of ${\Delta}_{{A}_{1}{B}_{1}}^{C}$ to Δ_{A1B1} under HWE guarantees that

$$\begin{array}{cc}\hfill & \mathrm{var}\left({\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{C}\right)=\frac{1}{N}\left[{p}_{{A}_{1}}(1-{p}_{{A}_{1}}){p}_{{B}_{1}}(1-{p}_{{B}_{1}})+\frac{1}{2}(1-2{p}_{{A}_{1}})(1-2{p}_{{B}_{1}}){\Delta}_{{A}_{1}{B}_{1}}^{C}\right]\text{and}\hfill \\ \hfill & \mathrm{cov}({\widehat{p}}_{{A}_{1}},{\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{C})=\frac{{\Delta}_{{A}_{1}{B}_{1}}^{C}(1-2{p}_{{A}_{1}})}{2N}+O\left(\frac{1}{{N}^{2}}\right).\hfill \end{array}$$

With these results, we can consider the following three LD contrast test statistics for association analysis in case-control studies:

$$\begin{array}{cc}\hfill & {S}_{\mathit{LD}}^{m}=\frac{{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{case}}^{m}-{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{control}}^{m}}{\sqrt{\mathrm{var}({\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{case}}^{m}-{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{control}}^{m})}}\phantom{\rule{1em}{0ex}}{S}_{\mathit{LD}}^{w}=\frac{{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{case}}^{w}-{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{control}}^{w}}{\sqrt{\mathrm{var}({\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{case}}^{w}-{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{control}}^{w})}}\hfill \\ \hfill & {S}_{\mathit{LD}}^{c}=\frac{{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{case}}^{c}-{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{control}}^{c}}{\sqrt{\mathrm{var}({\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{case}}^{c}-{\widehat{\Delta}}_{{A}_{1}{B}_{1}\mid \text{control}}^{c})}}.\hfill \end{array}$$

To elucidate the relative efficiency of the parameters for LD, the information in ${I}_{\mathit{LD}}^{m}$, ${I}_{\mathit{LD}}^{w}$ and ${I}_{\mathit{LD}}^{c}$ needs to be quantified and compared. If we assume HWE in the population and let *ϕ* and *ϕ*_{l} be the disease prevalence and the penetrance of disease genotype *l*, respectively, then the parameters in cases and controls are:

$$\begin{array}{l}{p}_{{A}_{1}\mid \text{case}}={p}_{{A}_{1}}+\dfrac{{\Delta}_{{A}_{1}{D}_{1}}}{\varphi}\left[{p}_{{D}_{1}}({\varphi}_{{D}_{1}{D}_{1}}-{\varphi}_{{D}_{1}{D}_{2}})+{p}_{{D}_{2}}({\varphi}_{{D}_{1}{D}_{2}}-{\varphi}_{{D}_{2}{D}_{2}})\right]\\[2ex] {p}_{{A}_{1}\mid \text{control}}={p}_{{A}_{1}}-\dfrac{{\Delta}_{{A}_{1}{D}_{1}}}{1-\varphi}\left[{p}_{{D}_{1}}({\varphi}_{{D}_{1}{D}_{1}}-{\varphi}_{{D}_{1}{D}_{2}})+{p}_{{D}_{2}}({\varphi}_{{D}_{1}{D}_{2}}-{\varphi}_{{D}_{2}{D}_{2}})\right]\\[2ex] {d}_{A\mid \text{case}}={p}_{{A}_{1}{A}_{1}\mid \text{case}}-{\left({p}_{{A}_{1}\mid \text{case}}\right)}^{2}=\dfrac{{\Delta}_{{A}_{1}{D}_{1}}^{2}}{{\varphi}^{2}}\left({\varphi}_{{D}_{1}{D}_{1}}{\varphi}_{{D}_{2}{D}_{2}}-{\varphi}_{{D}_{1}{D}_{2}}^{2}\right)\\[2ex] {d}_{A\mid \text{control}}={p}_{{A}_{1}{A}_{1}\mid \text{control}}-{\left({p}_{{A}_{1}\mid \text{control}}\right)}^{2}=\dfrac{{\Delta}_{{A}_{1}{D}_{1}}^{2}}{{(1-\varphi )}^{2}}\left((1-{\varphi}_{{D}_{1}{D}_{1}})(1-{\varphi}_{{D}_{2}{D}_{2}})-{(1-{\varphi}_{{D}_{1}{D}_{2}})}^{2}\right)\end{array}$$

$$\begin{array}{rl}{\Delta}_{{A}_{1}{B}_{1}\mid \text{case}}= & {\Delta}_{{A}_{1}{B}_{1}}+\dfrac{1}{\varphi}\left[{p}_{D}({\varphi}_{\mathit{DD}}-{\varphi}_{\mathit{Dd}})+{p}_{d}({\varphi}_{\mathit{Dd}}-{\varphi}_{\mathit{dd}})\right]\\ & \times \left[{\Delta}_{{A}_{1}{\mathit{DB}}_{1}}-\dfrac{1}{\varphi}\left({p}_{D}({\varphi}_{\mathit{DD}}-{\varphi}_{\mathit{Dd}})+{p}_{d}({\varphi}_{\mathit{Dd}}-{\varphi}_{\mathit{dd}})\right){\Delta}_{{A}_{1}D}{\Delta}_{{\mathit{DB}}_{1}}\right]\\[2ex] {\Delta}_{{A}_{1}{B}_{1}\mid \text{control}}= & {\Delta}_{{A}_{1}{B}_{1}}-\dfrac{1}{1-\varphi}\left[{p}_{D}({\varphi}_{\mathit{DD}}-{\varphi}_{\mathit{Dd}})+{p}_{d}({\varphi}_{\mathit{Dd}}-{\varphi}_{\mathit{dd}})\right]\\ & \times \left[{\Delta}_{{A}_{1}{\mathit{DB}}_{1}}+\dfrac{1}{1-\varphi}\left({p}_{D}({\varphi}_{\mathit{DD}}-{\varphi}_{\mathit{Dd}})+{p}_{d}({\varphi}_{\mathit{Dd}}-{\varphi}_{\mathit{dd}})\right){\Delta}_{{A}_{1}D}{\Delta}_{{\mathit{DB}}_{1}}\right].\end{array}$$
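The allele-frequency expressions can be checked numerically by enumerating all pairs of marker-disease haplotypes under HWE and weighting each genotype by its penetrance. The Python sketch below does this for a single marker *A* and compares the result with the closed form for cases; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
def case_control_allele_freqs(p_a, p_d, delta_ad, pen):
    """Enumerate phased genotypes under HWE and return
    (prevalence, P(A1 | case), P(A1 | control)).

    pen = (phi_D1D1, phi_D1D2, phi_D2D2); delta_ad is the LD between A1 and D1.
    """
    q_a, q_d = 1.0 - p_a, 1.0 - p_d
    # Haplotype frequencies keyed by (a, d): a = 1 for A1, d = 1 for D1.
    hap = {(1, 1): p_a * p_d + delta_ad, (1, 0): p_a * q_d - delta_ad,
           (0, 1): q_a * p_d - delta_ad, (0, 0): q_a * q_d + delta_ad}
    phi_by_d1_count = {2: pen[0], 1: pen[1], 0: pen[2]}
    prev = case = control = 0.0
    for (a1, d1), f1 in hap.items():
        for (a2, d2), f2 in hap.items():
            geno = f1 * f2                      # HWE: independent haplotypes
            phi = phi_by_d1_count[d1 + d2]
            frac_a1 = (a1 + a2) / 2.0           # share of A1 alleles carried
            prev += geno * phi
            case += geno * phi * frac_a1
            control += geno * (1.0 - phi) * frac_a1
    return prev, case / prev, control / (1.0 - prev)


def case_freq_closed_form(p_a, p_d, delta_ad, pen):
    """Closed form for P(A1 | case) in terms of delta_ad and the penetrances."""
    phi11, phi12, phi22 = pen
    q_d = 1.0 - p_d
    prev = p_d**2 * phi11 + 2.0 * p_d * q_d * phi12 + q_d**2 * phi22
    k = p_d * (phi11 - phi12) + q_d * (phi12 - phi22)
    return p_a + delta_ad * k / prev


params = dict(p_a=0.3, p_d=0.2, delta_ad=0.05, pen=(0.14, 0.12, 0.10))
prevalence, p_case, p_control = case_control_allele_freqs(**params)
closed_case = case_freq_closed_form(**params)
```

Since every individual is either a case or a control, the two conditional frequencies must average back to the population frequency, i.e. *ϕ* *p*_{A1|case} + (1 − *ϕ*) *p*_{A1|control} = *p*_{A1}, which gives a quick consistency check on any such expression.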

The quantification of ${\Delta}_{{A}_{1}{B}_{1}}^{m}$ and ${\Delta}_{{A}_{1}{B}_{1}}^{w}$ can be generalized using ${\Delta}_{{A}_{1}{B}_{1}}^{W}$:

$$\begin{array}{rl}{\Delta}_{{A}_{1}{B}_{1}\mid \text{case}}^{W}= & (2W-1){p}_{{A}_{1}\mid \text{case}}{p}_{{A}_{2}\mid \text{case}}{p}_{{B}_{1}\mid \text{case}}{p}_{{B}_{2}\mid \text{case}}\\ & +{\Delta}_{{A}_{1}{B}_{1}\mid \text{case}}\left\{1+(W-1)({p}_{{A}_{1}\mid \text{case}}{p}_{{B}_{1}\mid \text{case}}+{p}_{{A}_{2}\mid \text{case}}{p}_{{B}_{2}\mid \text{case}})-W({p}_{{A}_{1}\mid \text{case}}{p}_{{B}_{2}\mid \text{case}}+{p}_{{A}_{2}\mid \text{case}}{p}_{{B}_{1}\mid \text{case}})\right\}\\ & +(2W-1){\Delta}_{{A}_{1}{B}_{1}\mid \text{case}}^{2},\end{array}$$

and ${\Delta}_{{A}_{1}{B}_{1}\mid \text{control}}^{W}$ can also be quantified analogously.

The quantification of ${\Delta}_{{A}_{1}{B}_{1}}^{C}$ requires the joint frequencies of alleles *A*_{1} and *B*_{1} in two different gametes because ${\Delta}_{{A}_{1}{B}_{1}}^{C}$ is defined as ${\Delta}_{{A}_{1}{B}_{1}}^{C}\equiv {p}_{{A}_{1}{B}_{1}}+{p}_{{A}_{1}/{B}_{1}}-2\,{p}_{{A}_{1}}{p}_{{B}_{1}}$. Under HWE in the population we have:

$$\begin{array}{rl}{p}_{{A}_{1}/{B}_{1}\mid \text{case}}= & \dfrac{1}{\varphi}\sum\limits_{j\in \{\mathit{DD},\mathit{Dd},\mathit{dd}\}}\left[P\left(\begin{array}{c}{A}_{1}\\ {B}_{1}\end{array}\mid \begin{array}{c}{A}_{1}\\ {B}_{1}\end{array},j,\text{affected}\right)+\dfrac{1}{2}\left\{P\left(\begin{array}{c}{A}_{1}\\ {B}_{2}\end{array}\mid \begin{array}{c}{A}_{1}\\ {B}_{1}\end{array},j,\text{affected}\right)\right.\right.\\ & \left.\left.+P\left(\begin{array}{c}{A}_{1}\\ {B}_{1}\end{array}\mid \begin{array}{c}{A}_{2}\\ {B}_{1}\end{array},j,\text{affected}\right)+P\left(\begin{array}{c}{A}_{1}\\ {B}_{2}\end{array}\mid \begin{array}{c}{A}_{2}\\ {B}_{1}\end{array},j,\text{affected}\right)\right\}\right]\\ = & \dfrac{1}{\varphi}\left[P\left({A}_{1}{D}_{1}\right)P\left({D}_{1}{B}_{1}\right){\varphi}_{{D}_{1}{D}_{1}}+\left\{P\left({A}_{1}{D}_{1}\right)P\left({D}_{2}{B}_{1}\right)+P\left({A}_{1}{D}_{2}\right)P\left({D}_{1}{B}_{1}\right)\right\}{\varphi}_{{D}_{1}{D}_{2}}\right.\\ & \left.+P\left({A}_{1}{D}_{2}\right)P\left({D}_{2}{B}_{1}\right){\varphi}_{{D}_{2}{D}_{2}}\right]\\ = & {p}_{{A}_{1}}{p}_{{B}_{1}}+\dfrac{1}{\varphi}\left[({p}_{{A}_{1}}{\Delta}_{{D}_{1}{B}_{1}}+{p}_{{B}_{1}}{\Delta}_{{A}_{1}{D}_{1}})\left\{{p}_{D}({\varphi}_{\mathit{DD}}-{\varphi}_{\mathit{Dd}})+{p}_{d}({\varphi}_{\mathit{Dd}}-{\varphi}_{\mathit{dd}})\right\}\right.\\ & \left.+{\Delta}_{{A}_{1}{D}_{1}}{\Delta}_{{D}_{1}{B}_{1}}({\varphi}_{{D}_{1}{D}_{1}}-2{\varphi}_{{D}_{1}{D}_{2}}+{\varphi}_{{D}_{2}{D}_{2}})\right],\end{array}$$

where the derivation is based on the fact that ${p}_{{A}_{1}/{B}_{1}}={p}_{{A}_{1}{B}_{1}}^{{A}_{1}{B}_{1}}+\frac{1}{2}\left({p}_{{A}_{1}{B}_{1}}^{{A}_{1}{B}_{2}}+{p}_{{A}_{2}{B}_{1}}^{{A}_{1}{B}_{1}}+{p}_{{A}_{2}{B}_{1}}^{{A}_{1}{B}_{2}}\right)$. Thus, under HWE the composite LD between markers in cases is

$$\begin{array}{rl}{\Delta}_{{A}_{1}{B}_{1}\mid \text{case}}^{C}= & {\Delta}_{{A}_{1}{B}_{1}\mid \text{case}}+{p}_{{A}_{1}/{B}_{1}\mid \text{case}}-{p}_{{A}_{1}\mid \text{case}}{p}_{{B}_{1}\mid \text{case}}\\ = & {\Delta}_{{A}_{1}{B}_{1}}+\dfrac{{\Delta}_{{A}_{1}{D}_{1}}{\Delta}_{{D}_{1}{B}_{1}}}{\varphi}\left[{\varphi}_{\mathit{DD}}-2{\varphi}_{\mathit{Dd}}+{\varphi}_{\mathit{dd}}\right]+\dfrac{1}{\varphi}\left[{p}_{D}({\varphi}_{\mathit{DD}}-{\varphi}_{\mathit{Dd}})+{p}_{d}({\varphi}_{\mathit{Dd}}-{\varphi}_{\mathit{dd}})\right]\\ & \times \left[{\Delta}_{{A}_{1}{\mathit{DB}}_{1}}-\dfrac{2}{\varphi}\left({p}_{D}({\varphi}_{\mathit{DD}}-{\varphi}_{\mathit{Dd}})+{p}_{d}({\varphi}_{\mathit{Dd}}-{\varphi}_{\mathit{dd}})\right){\Delta}_{{A}_{1}D}{\Delta}_{{\mathit{DB}}_{1}}\right].\end{array}$$

Analogously, for controls, we have:

$$\begin{aligned}
\Delta_{A_1B_1\mid \text{control}}^{C} &= \Delta_{A_1B_1\mid \text{control}}+p_{A_1/B_1\mid \text{control}}-p_{A_1\mid \text{control}}\,p_{B_1\mid \text{control}}\\
&= \Delta_{A_1B_1}+\frac{\Delta_{A_1D_1}\Delta_{D_1B_1}}{1-\varphi}\left[\varphi_{\mathit{DD}}-2\varphi_{\mathit{Dd}}+\varphi_{\mathit{dd}}\right]+\frac{1}{1-\varphi}\left[p_D(\varphi_{\mathit{DD}}-\varphi_{\mathit{Dd}})+p_d(\varphi_{\mathit{Dd}}-\varphi_{\mathit{dd}})\right]\\
&\quad \times \left[\Delta_{A_1DB_1}-\frac{2}{1-\varphi}\left(p_D(\varphi_{\mathit{DD}}-\varphi_{\mathit{Dd}})+p_d(\varphi_{\mathit{Dd}}-\varphi_{\mathit{dd}})\right)\Delta_{A_1D}\Delta_{DB_1}\right].
\end{aligned}$$

The cystic fibrosis transmembrane conductance regulator (CFTR) gene spans 200 Kb in region q31.2 on the long arm of chromosome 7. From the HapMap database, we downloaded the marker data of the CFTR haplotype in the CEU (CEPH-Utah resident) sample. To generate separate genotypes for each of 4,000 persons (2,000 representing cases and 2,000 representing controls), we randomly selected 11 SNPs whose minor allele frequencies are larger than 0.1 and used the SNP located in the middle (with allele frequency ranging between 0.1 and 0.5) as a causal SNP for the 2,000 cases, the others being used as markers. To study empirical type I error, we assumed homozygous and heterozygous disease genotype relative risks *λ*_{2} and *λ*_{1} equal to 1, and randomly generated a binary trait. To study empirical power, we assumed *λ*_{2} = 1.4, *λ*_{1} being determined by the disease mode of inheritance: *λ*_{1} = 1 for a recessive disease, *λ*_{1} = *λ*_{2} for a dominant disease, ${\lambda}_{1}=\frac{{\lambda}_{2}+1}{2}$ for an additive disease and ${\lambda}_{1}=\sqrt{{\lambda}_{2}}$ for a multiplicative disease. We considered two marker densities: (i) sparse markers, the consecutive markers being separated by about 30 Kb, and (ii) dense markers, separated by about 5 Kb. For each situation, the empirical power and size were calculated from 1,000 replicates of 2,000 cases and 2,000 controls. Thus, in all, 4,000 × 1,000 = 4,000,000 separate genotypes were generated.
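The mapping from mode of inheritance to *λ*_{1} described above can be sketched in a few lines of Python. The function names are hypothetical, and the conversion from relative risks to penetrances assumes HWE at the disease locus with the prevalence equal to the genotype-frequency-weighted sum of penetrances; the text does not spell out this step for the CFTR simulation, so it is an illustration rather than the procedure actually used.

```python
import math


def heterozygote_relative_risk(lambda2, mode):
    """Return lambda_1 given the homozygote relative risk lambda_2."""
    if mode == "recessive":
        return 1.0
    if mode == "dominant":
        return lambda2
    if mode == "additive":
        return (lambda2 + 1.0) / 2.0
    if mode == "multiplicative":
        return math.sqrt(lambda2)
    raise ValueError(f"unknown mode of inheritance: {mode}")


def penetrances(prevalence, p_disease, lambda2, mode):
    """Return (phi_dd, phi_Dd, phi_DD) for a given prevalence.

    Assumption (not stated in the text): HWE at the disease locus, so that
    prevalence = phi_dd * (q^2 + 2*p*q*lambda_1 + p^2*lambda_2).
    """
    lambda1 = heterozygote_relative_risk(lambda2, mode)
    p, q = p_disease, 1.0 - p_disease
    baseline = prevalence / (q * q + 2.0 * p * q * lambda1 + p * p * lambda2)
    return baseline, baseline * lambda1, baseline * lambda2
```

With *λ*_{2} = 1.4, the additive model gives *λ*_{1} = 1.2 and the multiplicative model gives *λ*_{1} ≈ 1.183, so the four modes of inheritance differ only modestly in heterozygote risk at this effect size.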

In these simulations, the statistics, *S _{a}* and

From the previous simulations, we confirmed that for sparse markers the type I error of *S _{LD}* can be inflated if the most probable haplotypes are assumed to be true haplotypes. We now examine with simulations both the validity and efficiency of the three LD contrast tests modified for phase uncertainty: ${S}_{\mathit{LD}}^{m},{S}_{\mathit{LD}}^{w}$ and ${S}_{\mathit{LD}}^{c}$. In particular the two-marker haplotype frequencies for ${S}_{\mathit{LD}}^{m}$ and ${S}_{\mathit{LD}}^{w}$ are estimated by the EM algorithm [Excoffier and Slatkin, 1995]. The disease prevalence and the disease allele frequency are assumed to be 0.15 and 0.1, respectively. We used

Figures 2 and 3 show the empirical type I error from 20,000 replicates of 10,000 cases and 10,000 controls as a function of LD between markers when *p*_{A1} and *p*_{B1} are 0.2 and 0.3, respectively. For *w*, we considered two different situations: (i) the haplotype frequencies in cases and controls are the same, so that the weights, *w*, are equal for cases and controls, and (ii) the haplotype frequencies in cases and controls are different. The results show that estimating the weight, *w*, preserves the type I error for ${S}_{\mathit{LD}}^{m}$ and ${S}_{\mathit{LD}}^{w}$ under case (i) but inflates it under case (ii), while ${S}_{\mathit{LD}}^{c}$ always preserves the type I error well. However, when the absolute values of Lewontin's D′ [Lewontin, 1964] between markers are high, in both situations the inflation of type I error is negligible. Thus we conclude that type I error is preserved well as long as the same weights are used for cases and controls.
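The EM estimation of two-marker haplotype frequencies referred to above can be sketched as follows. This is a minimal textbook version for two biallelic markers and unphased genotypes, not the authors' implementation; the function name, the genotype coding (number of *A*_{1} and *B*_{1} alleles) and the starting values are assumptions made for the example. Only double heterozygotes are ambiguous in phase, and the quantity `p_cis` is their posterior probability of carrying the haplotype pair *A*_{1}*B*_{1}/*A*_{2}*B*_{2}.

```python
def em_two_marker_haplotypes(genotypes, max_iter=1000, tol=1e-12):
    """Estimate frequencies of haplotypes (A1B1, A1B2, A2B1, A2B2) by EM.

    genotypes: list of (g_a, g_b), the counts (0, 1 or 2) of alleles A1 and B1.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    known = [0.0, 0.0, 0.0, 0.0]  # haplotype counts from phase-known genotypes
    n_double_het = 0
    for g_a, g_b in genotypes:
        if g_a == 1 and g_b == 1:
            n_double_het += 1  # phase ambiguous: handled in the E-step
            continue
        # At least one marker is homozygous, so any pairing of alleles
        # yields the same two haplotypes.
        a_is_1 = (g_a >= 1, g_a == 2)
        b_is_1 = (g_b >= 1, g_b == 2)
        for a1, b1 in zip(a_is_1, b_is_1):
            known[(0 if a1 else 2) + (0 if b1 else 1)] += 1.0

    total = 2.0 * len(genotypes)
    freqs = [0.25, 0.25, 0.25, 0.25]  # arbitrary starting values
    for _ in range(max_iter):
        # E-step: posterior probability that a double heterozygote is A1B1/A2B2.
        cis = freqs[0] * freqs[3]
        trans = freqs[1] * freqs[2]
        p_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts divided by the number of haplotypes.
        counts = [
            known[0] + n_double_het * p_cis,
            known[1] + n_double_het * (1.0 - p_cis),
            known[2] + n_double_het * (1.0 - p_cis),
            known[3] + n_double_het * p_cis,
        ]
        new_freqs = [c / total for c in counts]
        converged = max(abs(n - f) for n, f in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs
```

Because `p_cis` depends on the estimated haplotype frequencies, running the estimation separately in cases and in controls yields different posterior probabilities in the two groups whenever their haplotype frequencies differ, which is the distinction between situations (i) and (ii) above.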

Figures 4, 5 and 6 show the average empirical power from 20,000 replicates for three different LD structures as a function of sample size; for *w* in ${S}_{\mathit{LD}}^{m}$ and ${S}_{\mathit{LD}}^{w}$, we assume that the haplotype frequencies are the same in cases and controls. The sample sizes for cases and controls are equal. From the results, we can confirm that the LD contrast test (${S}_{\mathit{LD}}^{m},{S}_{\mathit{LD}}^{w}$ and ${S}_{\mathit{LD}}^{c}$) can be more powerful than *S*_{a} with LD structure (3), because there is high three-locus LD between marker and disease alleles. Also, as was shown before, ${S}_{\mathit{LD}}^{c}$ is better than ${S}_{\mathit{LD}}^{m}$ and ${S}_{\mathit{LD}}^{w}$ for a recessive disease. Otherwise, ${S}_{\mathit{LD}}^{c}$ loses power because its variance is approximately twice as large as the variance of ${S}_{\mathit{LD}}^{m}$ or ${S}_{\mathit{LD}}^{w}$.

Figure 7 shows, for *p*_{A1} and *p*_{B1} = 0.2, the analytical correlations of ${\widehat{p}}_{{A}_{1}}$ and ${\widehat{d}}_{A}$ with ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{m}$, ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{w}$ and ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{c}$ when the haplotype frequencies are known (panels

The above simulations confirmed that the methods for handling phase uncertainty preserve the type I error in certain situations. We now quantify the standardized expected differences between cases and controls in order to compare the efficiencies of these methods analytically. If we let *SED* be the standardized expected difference for a single case and a single control, defined as

$$\mathit{SED}=\frac{\text{expectation of the difference statistic for one case and one control}}{\text{s.d. of the difference statistic for one case and one control}},$$

and if we let Φ(·) and *Z*_{1-α/2} be respectively the standard normal cumulative distribution function and the 1-α/2 quantile of the standard normal distribution, the power for *N* cases and *N* controls is equal to $\Phi \left(\mathit{SED}\sqrt{N}-{Z}_{1-\alpha /2}\right)+\Phi \left(-\mathit{SED}\sqrt{N}-{Z}_{1-\alpha /2}\right)$. The required sample size for given power and significance level can be plotted as a function of the *SED* [Won and Elston, 2008]. In Figures 8 and 9, we plot *SED*s as a function of LD for various modes of inheritance. Again, the disease allele frequency was assumed to be 0.1 and we used *λ*_{2} = 1.4, *λ*_{1} being determined by the disease mode of inheritance. We assumed the disease prevalence is 0.15 and the minor allele frequencies for the markers *A* and *B* are 0.2. Figure 8 shows the *SED* as a function of Δ_{A1D1}, when Δ_{A1D1B1} = 0.25Δ_{A1D1} and Δ_{B1D1} = 0.2Δ_{A1D1}. These results show that ${S}_{\mathit{LD}}^{c}$ is less powerful than *S*_{LD} except in the case of a recessive disease. The expectation of the statistic for ${I}_{\mathit{LD}}^{c}$ is slightly larger than that of
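The power expression given above, Φ(*SED*√*N* − *Z*_{1-α/2}) + Φ(−*SED*√*N* − *Z*_{1-α/2}), is straightforward to evaluate numerically. The following Python sketch uses only the standard library; the function names and the search for the smallest adequate *N* are illustrative choices, not part of the original analysis.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, s.d. 1


def power_from_sed(sed, n, alpha=0.05):
    """Two-sided power for n cases and n controls, given the SED."""
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = sed * n ** 0.5
    return _STD_NORMAL.cdf(shift - z) + _STD_NORMAL.cdf(-shift - z)


def required_sample_size(sed, power=0.8, alpha=0.05):
    """Smallest n (cases = controls = n) reaching the requested power."""
    if sed == 0 or not 0.0 < alpha < power < 1.0:
        raise ValueError("need sed != 0 and 0 < alpha < power < 1")
    # Power increases with n for any nonzero SED, so bracket the answer
    # by doubling and then bisect.
    hi = 1
    while power_from_sed(sed, hi, alpha) < power:
        hi *= 2
    lo = hi // 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if power_from_sed(sed, mid, alpha) >= power:
            hi = mid
        else:
            lo = mid
    return hi
```

When the *SED* is 0 the expression reduces to the significance level α, and because the power depends on the *SED* only through *SED*√*N*, halving the *SED* roughly quadruples the required sample size.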

Table 3 shows the empirical type I error according to whether there is HWE or HWD in the population. *p*_{A1}, *p*_{B1} and *p*_{D1} are assumed to be 0.2, 0.2 and 0.1, respectively, *λ*_{2} is 1.4 and *λ*_{1} is calculated according to the disease mode of inheritance. The disease prevalence is 0.15 and LE is assumed between marker and disease alleles. In all cases we assumed the parameter for haplotype-level HWD for the three loci (*A*, *B*, and *D*) is 6.4×10^{−5}, which is the maximal amount possible in this situation. Two types of variances for *S _{a}*,

Haplotype-level HWD under LE between marker and disease alleles can result in differences between cases and controls for three types of parameters. For example, the allele frequencies for cases and controls are

$$\begin{aligned}
p_{A_1\mid \text{case}} &= \frac{1}{\varphi}\Big[(P_{A_1D_1}^{2}+P_{A_1D_1}P_{A_2D_1}+d_{12}^{\mathit{AD}}+d_{14}^{\mathit{AD}})\varphi_{D_1D_1}+(2P_{A_1D_1}P_{A_1D_2}+P_{A_1D_1}P_{A_2D_2}+P_{A_1D_2}P_{A_2D_1}\\
&\quad -2d_{12}^{\mathit{AD}}-d_{14}^{\mathit{AD}}-d_{23}^{\mathit{AD}})\varphi_{D_1D_2}+(P_{A_1D_2}^{2}+P_{A_1D_2}P_{A_2D_2}+d_{12}^{\mathit{AD}}+d_{23}^{\mathit{AD}})\varphi_{D_2D_2}\Big]\\
&= p_{A_1}+\frac{1}{\varphi}\left[(\Delta_{A_1D_1}p_{D_1}+d_{12}^{\mathit{AD}}+d_{14}^{\mathit{AD}})(\varphi_{D_1D_1}-\varphi_{D_1D_2})+(\Delta_{A_1D_1}p_{D_2}-d_{12}^{\mathit{AD}}-d_{23}^{\mathit{AD}})(\varphi_{D_1D_2}-\varphi_{D_2D_2})\right],\\
p_{A_1\mid \text{control}} &= p_{A_1}-\frac{1}{1-\varphi}\left[(\Delta_{A_1D_1}p_{D_1}+d_{12}^{\mathit{AD}}+d_{14}^{\mathit{AD}})(\varphi_{D_1D_1}-\varphi_{D_1D_2})+(\Delta_{A_1D_1}p_{D_2}-d_{12}^{\mathit{AD}}-d_{23}^{\mathit{AD}})(\varphi_{D_1D_2}-\varphi_{D_2D_2})\right].
\end{aligned}$$

These results indicate that the allele frequencies can differ between cases and controls under the null hypothesis Δ_{A1D1} = 0 if there is haplotype-level HWD in the population. However, Table 3 shows that the effect of this HWD is negligible and the type I error is well preserved in most situations, as long as we use the variances that allow for HWD.

In addition to confirming previous results [Won and Elston, 2008], we have shown that, if markers are dense enough, the LD contrast test can be more informative than *S _{a}*. Also, we extended

The decomposition of the parameters available for a case-control study enables self-replication, and the increase in power resulting from self-replication can be understood in terms of methods for combining p-values. The most powerful method has been derived for when the *SED*s are known and this method can be extended to statistics that are correlated [Won et al., 2008]. Also, it should be remembered that among the possible choices for phase uncertainty ${S}_{\mathit{LD}}^{m}$ is the least appropriate when the markers are not dense because it is then the most correlated with the statistics that use the other types of information (see Figure 7). If the disease mode of inheritance is known, in the case of a recessive disease we suggest combining the p-values from *S*_{a},

However, in spite of the efficiency and validity of the proposed method, there are still problems that need attention. First, the suggested strategy that determines the disease mode of inheritance with *S _{HWD}* does not use

In summary, even though there have now been many investigations using GWA, except for genetic mapping based on linkage disequilibrium units [Collins and Lau, 2008], they have so far considered only the statistic based on *I _{a}* for an initial scan, which can be powerless in some cases. Also, application to real marker data indicates that the LD contrast test can be very informative in some situations, so that combining p-values from

This work was supported in part by a U.S. Public Health Service Resource grant (RR03655) from the National Center for Research Resources, Research grant (GM28356) from the National Institute of General Medical Sciences and Cancer Center Support Grant (P30CAD43703) from the National Cancer Institute.

Before we derive the expectations, variances and covariances, we define the following notation that is used in Weir [Weir, 1996]:

$$\begin{aligned}
\Delta_{A_1/B_1} &\equiv p_{A_1/B_1}-p_{A_1}p_{B_1}=p_{A_1}p_{B_1}+d_{14}^{\mathit{AB}}-d_{23}^{\mathit{AB}}-p_{A_1}p_{B_1}=d_{14}^{\mathit{AB}}-d_{23}^{\mathit{AB}}\\
d_{A_1A_1B_1} &\equiv p_{A_1B_1}^{A_1B_1}+\frac{1}{2}p_{A_1B_2}^{A_1B_1}-p_{A_1}\Delta_{A_1B_1}^{C}-p_{B_1}d_A-p_{A_1}^{2}p_{B_1}\\
d_{A_1B_1B_1} &\equiv p_{A_1B_1}^{A_1B_1}+\frac{1}{2}p_{A_2B_1}^{A_1B_1}-p_{B_1}\Delta_{A_1B_1}^{C}-p_{A_1}d_B-p_{A_1}p_{B_1}^{2}\\
&= d_{12}^{\mathit{AB}}(1-p_{A_1})+d_{14}^{\mathit{AB}}(1-p_{A_1}-p_{B_1})+d_{23}^{\mathit{AB}}(p_{B_1}-p_{A_1})-d_{34}^{\mathit{AB}}p_{A_1}\\
d_{A_1B_1}^{A_1B_1} &\equiv p_{A_1B_1}^{A_1B_1}-2p_{A_1}d_{A_1B_1B_1}-2p_{B_1}d_{A_1A_1B_1}-2p_{A_1}p_{B_1}\Delta_{A_1B_1}^{C}-p_{A_1}^{2}d_B-p_{B_1}^{2}d_A\\
&\quad -\Delta_{A_1B_1}^{2}-\Delta_{A_1/B_1}^{2}-d_Ad_B-p_{A_1}^{2}p_{B_1}^{2}\\
&= d_{12}^{\mathit{AB}}+d_{13}^{\mathit{AB}}+d_{14}^{\mathit{AB}}+2p_{A_1}d_{A_1B_1B_1}-2p_{B_1}d_{A_1A_1B_1}-2p_{A_1}p_{B_1}\Delta_{A_1B_1}^{C}\\
&\quad -p_{A_1}^{2}d_B-p_{B_1}^{2}d_A-\Delta_{A_1/B_1}^{2}-d_Ad_B,
\end{aligned}$$

where it should be noted that ${\Delta}_{{A}_{1}{B}_{1}}^{C}={\Delta}_{{A}_{1}{B}_{1}}+{\Delta}_{{A}_{1}/{B}_{1}}$. Under HWE, ${\Delta}_{{A}_{1}/{B}_{1}}={d}_{{A}_{1}{A}_{1}{B}_{1}}={d}_{{A}_{1}{B}_{1}{B}_{1}}=0$ because ${d}_{{A}_{1}{A}_{1}{B}_{1}}={d}_{13}^{\mathit{AB}}(1-{p}_{{B}_{1}})+{d}_{14}^{\mathit{AB}}(1-{p}_{{A}_{1}}-{p}_{{B}_{1}})+{d}_{23}^{\mathit{AB}}({p}_{{A}_{1}}-{p}_{{B}_{1}})-{d}_{24}^{\mathit{AB}}{p}_{{B}_{1}}$ and ${d}_{{A}_{1}{B}_{1}{B}_{1}}={d}_{12}^{\mathit{AB}}(1-{p}_{{A}_{1}})+{d}_{14}^{\mathit{AB}}(1-{p}_{{A}_{1}}-{p}_{{B}_{1}})+{d}_{23}^{\mathit{AB}}({p}_{{B}_{1}}-{p}_{{A}_{1}})-{d}_{34}^{\mathit{AB}}{p}_{{A}_{1}}$.
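Because the composite LD is the sum of the gametic and non-gametic disequilibria, it equals $p_{A_1B_1}+p_{A_1/B_1}-2p_{A_1}p_{B_1}$ and can be estimated from unphased genotypes without resolving phase. The following Python sketch illustrates this; the function name and the genotype coding (number of *A*_{1} and *B*_{1} alleles per individual) are assumptions made for the example, and no small-sample bias correction is applied.

```python
def composite_ld(genotypes):
    """Estimate the composite LD coefficient from unphased genotypes.

    genotypes: list of (g_a, g_b), the counts (0, 1 or 2) of alleles A1 and B1
    carried by each individual.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")
    p_a = sum(g_a for g_a, _ in genotypes) / (2.0 * n)
    p_b = sum(g_b for _, g_b in genotypes) / (2.0 * n)
    # g_a * g_b counts every (A1 copy, B1 copy) pair within an individual,
    # whether on the same haplotype or on different haplotypes, so this
    # average estimates p_{A1B1} + p_{A1/B1}.
    joint = sum(g_a * g_b for g_a, g_b in genotypes) / (2.0 * n)
    return joint - 2.0 * p_a * p_b
```

For example, a sample made up of equal numbers of *A*_{1}*A*_{1}*B*_{1}*B*_{1} and *A*_{2}*A*_{2}*B*_{2}*B*_{2} individuals gives 0.5, which is the sum of a gametic disequilibrium of 0.25 and a non-gametic disequilibrium of 0.25. The estimate is half the sample covariance of the two allele counts.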

The variances of ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}$ and ${\widehat{\Delta}}_{{A}_{1}{B}_{1}}^{C}$ were derived as follows [Weir, 1996]:

$$\begin{aligned}
\mathrm{var}\left(\widehat{\Delta}_{A_1B_1}\right) &= \frac{1}{2N}\Big[p_{A_1}(1-p_{A_1})p_{B_1}(1-p_{B_1})+(1-2p_{A_1})(1-2p_{B_1})\Delta_{A_1B_1}-\Delta_{A_1B_1}^{2}\\
&\quad +d_Ad_B+\Delta_{A_1/B_1}^{2}+d_{A_1B_1}^{A_1B_1}\Big]\\
\mathrm{var}\left(\widehat{\Delta}_{A_1B_1}^{C}\right) &= \frac{1}{N}\Big[p_{A_1B_1}^{A_1B_1}+p_{A_1}p_{B_1}(1-p_{A_1}-p_{B_1})+p_{A_1}(1-2p_{A_1})d_B+p_{B_1}(1-2p_{B_1})d_A+(1-4p_{A_1})d_{A_1B_1B_1}\\
&\quad +(1-4p_{B_1})d_{A_1A_1B_1}+\frac{1}{2}(1-2p_{A_1}-2p_{B_1})\Delta_{A_1B_1}^{C}-\left(\Delta_{A_1B_1}^{C}\right)^{2}\Big]
\end{aligned}$$

With results from Won and Elston [Won and Elston, 2008], we have

$$\begin{aligned}
E\left(\widehat{\Delta}_{A_1B_1}\right) &= \Delta_{A_1B_1}+p_{A_1B_1}-\frac{1}{2N^{2}}E\left(\sum_{i=1}^{N}X_{i1}\right)\left(\sum_{i=1}^{N}Y_{i1}\right)-\frac{1}{2N^{2}}E\left(\sum_{i=1}^{N}X_{i1}\right)\left(\sum_{i=1}^{N}Y_{i2}\right)\\
&= \Delta_{A_1B_1}-\frac{\Delta_{A_1B_1}+d_{14}^{\mathit{AB}}-d_{23}^{\mathit{AB}}}{2N},\\
E\left(\widehat{\Delta}_{A_1B_1}^{C}\right) &= E\left(\widehat{\Delta}_{A_1B_1}+\frac{1}{2N}\sum_{i=1}^{N}[X_{i1}Y_{i2}+X_{i2}Y_{i1}]-\widehat{p}_{A_1}\widehat{p}_{B_1}\right)\\
&= \Delta_{A_1B_1}+d_{14}^{\mathit{AB}}-d_{23}^{\mathit{AB}}-\frac{\Delta_{A_1B_1}+d_{14}^{\mathit{AB}}-d_{23}^{\mathit{AB}}}{N},\\
\mathrm{cov}(\widehat{p}_{A_1},\widehat{\Delta}_{A_1B_1}^{C}) &= \frac{1}{2N}\Big[-2p_{B_1}d_A+(d_{14}^{\mathit{AB}}-d_{23}^{\mathit{AB}})+2d_{13}^{\mathit{AB}}+2d_{14}^{\mathit{AB}}-4p_{A_1}(d_{14}^{\mathit{AB}}-d_{23}^{\mathit{AB}})\\
&\quad +\Delta_{A_1B_1}(1-2p_{A_1})\Big]+O\left(\frac{1}{N^{2}}\right),\\
\mathrm{cov}(\widehat{d}_A,\widehat{\Delta}_{A_1B_1}^{C}) &= \frac{1}{N}\Big[-2p_{B_1}d_A(1-2p_{A_1})+2(d_{13}^{\mathit{AB}}+d_{14}^{\mathit{AB}})(1-2p_{A_1})+\left(d_{14}^{\mathit{AB}}-d_{23}^{\mathit{AB}}\right)(-p_{A_1}+3p_{A_1}^{2}-d_A)\\
&\quad +\Delta_{A_1B_1}(p_{A_1}-p_{A_1}^{2}-d_A)\Big]+O\left(\frac{1}{N^{2}}\right).
\end{aligned}$$


- Collins A, Lau W. CHROMOSCAN: genome-wide association using a linkage disequilibrium map. J Hum Genet. 2008;53:121–126.
- Excoffier L, Slatkin M. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol Biol Evol. 1995;12:921–927.
- Fisher RA. Statistical Methods for Research Workers. 11th ed. Oliver & Boyd; Edinburgh: 1950.
- Lewontin RC. The Interaction of Selection and Linkage. I. General Considerations; Heterotic Models. Genetics. 1964;49:49–67.
- Liptak T. On the combination of independent tests. Magyar Tud. Akad. Mat. Kutató Int. Közl. 1958;3:171–197.
- Nielsen DM, Ehm MG, Weir BS. Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. Am J Hum Genet. 1998;63:1531–1540.
- Nielsen DM, Ehm MG, Zaykin DV, Weir BS. Effect of two- and three-locus linkage disequilibrium on the power to detect marker/phenotype associations. Genetics. 2004;168:1029–1040.
- S.A.G.E. 5.2 Statistical Analysis for Genetic Epidemiology. 2007. http://darwin.cwru.edu/sage/
- Sasieni PD. From genotypes to genes: doubling the sample size. Biometrics. 1997;53:1253–1261.
- Song K, Elston RC. A powerful method of combining measures of association and Hardy-Weinberg disequilibrium for fine-mapping in case-control studies. Stat Med. 2006;25:105–126.
- Van Steen K, McQueen MB, Herbert A, Raby B, Lyon H, DeMeo DL, Murphy A, Su J, Datta S, Rosenow C, et al. Genomic screening and replication using the same data set in family-based association testing. Nat Genet. 2005;37:683–691.
- Wang T, Zhu W, Elston RC. Improving power in contrasting linkage-disequilibrium patterns between cases and controls. Am J Hum Genet. 2007;80:911–920.
- Weir BS. Genetic Data Analysis II. Sinauer Associates; Sunderland, MA: 1996.
- Wittke-Thompson JK, Pluzhnikov A, Cox NJ. Rational inferences about departures from Hardy-Weinberg equilibrium. Am J Hum Genet. 2005;76:967–986.
- Won S, Elston RC. The power of independent types of genetic information to detect association in a case-control study design. Genet Epidemiol. 2008 In press.
- Won S, Morris N, Lu Q, Elston RC. Choosing optimal method for combining p-values. Stat Med. 2008 Submitted.
- Zaykin DV. Bounds and normalization of the composite linkage disequilibrium coefficient. Genet Epidemiol. 2004;27:252–257.
- Zaykin DV, Meng Z, Ehm MG. Contrasting linkage-disequilibrium patterns between cases and controls as a novel association-mapping method. Am J Hum Genet. 2006;78:737–746.
- Zheng G, Song K, Elston RC. Adaptive two-stage analysis of genetic association in case-control designs. Hum Hered. 2007;63:175–186.
- Zheng G, Ng HK. Genetic model selection in two-phase analysis for case control association studies. Biostatistics. 2008;9:391–399.
