For a rare disease and a dichotomous secondary phenotype and genetic marker, we investigated whether and how to use data from diseased subjects to study the association between the genetic marker and the secondary phenotype. We considered both estimating the association and testing the null hypothesis of no association for a pre-specified SNP, and discovering an association in a GWAS using either hypothesis testing or ranking of the chi-square statistics. In the absence of an interaction δ12 in model (2), each of the five methods we considered leads to valid inference, and the W and MLE methods are particularly efficient. In the presence of interaction (δ12 ≠ 0), the CO method maintains the nominal type I error, and the AW method has proper size for rare interactions (1%) and only modestly supra-nominal type I error for common interactions (20%). The CA, W and MLE methods do not control type I error and cannot be recommended if it is plausible that δ12 ≠ 0. The AW method has lower MSE for a pre-selected SNP and greater power than the CO method, at the cost of a slight increase in type I error. We showed that the MLE method reduces to the CO method if the model allows for non-null δ12.
Under the assumption of a rare disease and a dichotomous genetic marker and phenotype, the CO method is robust in that it remains unbiased and maintains the nominal type I error whatever the interaction effect. The W and MLE methods use both controls and cases fully and are the most efficient for estimation. When there is no interaction effect, both the W and MLE methods are almost twice as efficient as the CO method in estimation and have around 70% more power. We prefer the W method because it is nearly fully efficient and its estimator is non-iterative, so it avoids the convergence problems that can arise with the MLE method. However, even a small interaction effect causes large bias and highly inflated type I error for the CA, W, and MLE methods. The AW method strikes a balance between the robust CO method and the W method. It remains unbiased and has near-nominal type I error across most values of δ12, although its type I error is moderately inflated when δ12 is not far from zero. If δ12 is near zero, estimates based on AW have smaller MSE than those from CO for a prespecified SNP. Under a mixture distribution for δ12 chosen so that most δ12 values are zero, the AW method achieved an important gain in power compared with CO. When SNPs are ranked, the detection probabilities of the W and MLE methods degrade as the variability of δ12 increases. The CO and AW methods, however, maintain their detection probabilities and are robust to increasingly variable δ12.
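The efficiency comparison above can be illustrated with a minimal sketch. It assumes, which this section does not spell out, that the CO and CA estimates are log odds ratios from the 2×2 marker-by-phenotype tables in controls and in cases, and that the W estimate is their inverse-variance-weighted average. The function names and all counts are hypothetical.

```python
import math


def log_or(a, b, c, d):
    """Log odds ratio and Woolf variance for the 2x2 table [[a, b], [c, d]].

    Rows: marker present / absent. Columns: phenotype present / absent.
    """
    est = math.log((a * d) / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d
    return est, var


def weighted(controls, cases):
    """Inverse-variance-weighted average of the control-only and
    case-only log odds ratios (assumed form of the W estimator)."""
    b_co, v_co = log_or(*controls)
    b_ca, v_ca = log_or(*cases)
    w_co, w_ca = 1 / v_co, 1 / v_ca
    est = (w_co * b_co + w_ca * b_ca) / (w_co + w_ca)
    var = 1 / (w_co + w_ca)
    return est, var


def wald_chi2(est, var):
    """One-degree-of-freedom Wald chi-square statistic."""
    return est ** 2 / var


# Hypothetical counts, identical in controls and cases, so that the
# marker-phenotype odds ratio is the same in both groups (no interaction).
controls = (90, 110, 300, 500)
cases = (90, 110, 300, 500)

b_co, v_co = log_or(*controls)
b_w, v_w = weighted(controls, cases)
print("CO:", b_co, v_co, wald_chi2(b_co, v_co))
print("W: ", b_w, v_w, wald_chi2(b_w, v_w))
```

With equal numbers of cases and controls and identical tables, the weighted variance in this sketch is half the control-only variance, which is in line with the near doubling of efficiency noted above. If the case table had a different odds ratio, the weighted estimate would be pulled away from the control-only estimate, which is the source of the bias under interaction.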
Jiang, Scott and Wild [2006] discussed methods for analyzing secondary phenotypes in case-control studies. Their fully non-parametric approach (SPML1) corresponds to MLE under our model (2) with δ12 included, which is equivalent to the CO method for inference on β1. Assuming δ12 = 0 corresponds to possibly misspecified parametric modeling (SPML2) in their notation. MLE under SPML2 was described by Lin and Zeng [2009], who also covered non-rare diseases and both dichotomous and continuous secondary phenotypes. If δ12 ≠ 0, the MLE method of Lin and Zeng does not control the type I error, as indicated by our results in Sections 3.2 and 3.3 and by the discussion of model misspecification for SPML2 in Jiang et al. [2006].
In an unreported analysis, we evaluated the performance of the various methods using prostate cancer data from the Cancer Genetic Markers of Susceptibility (CGEMS) study. We conducted a genome-wide scan of the association between the secondary phenotype, body mass index (BMI; coded 1 if BMI ≤ 25 and 0 otherwise), and the 516,564 SNPs on the 22 autosomal chromosomes. We estimated the distribution of δ12 and found no evidence that the variance of the estimated δ12 across SNPs exceeded that expected from multinomial sampling error alone. Thus, we did not find evidence that δ12 ≠ 0 for some SNPs. In this situation, W and MLE are the two most efficient methods, and both identified SNP rs7575639 with a genome-wide significant p < 10⁻⁷. The 20 SNPs with the smallest p-values selected by the W and MLE methods were identical, with only slight differences in ranking. For 11 SNPs, convergence problems with MLE produced spurious results, and only careful scrutiny of the extreme values for these SNPs revealed the problem. For this reason, we recommend the numerically stable W method instead of MLE.
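The check on the variability of δ12 can be sketched as follows. This is one simple way to do it, not necessarily the procedure used in the unreported analysis. It assumes that model (2) is a logistic model for disease with main effects of the marker and phenotype and an interaction δ12, in which case δ12 equals the difference between the marker-phenotype log odds ratios in cases and in controls. The function names and counts are hypothetical.

```python
import math


def log_or(a, b, c, d):
    """Log odds ratio and Woolf variance for the 2x2 table [[a, b], [c, d]]."""
    est = math.log((a * d) / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d
    return est, var


def interaction(controls, cases):
    """Estimate of delta12 as the case log OR minus the control log OR,
    with the sum of the two sampling variances (independent samples)."""
    b_co, v_co = log_or(*controls)
    b_ca, v_ca = log_or(*cases)
    return b_ca - b_co, v_ca + v_co


def excess_variance(estimates, variances):
    """Moment estimate of between-SNP variance: observed variance of the
    estimates minus their average sampling variance. Values near or below
    zero give no sign of variability beyond sampling error."""
    n = len(estimates)
    mean = sum(estimates) / n
    observed = sum((e - mean) ** 2 for e in estimates) / (n - 1)
    expected = sum(variances) / n
    return observed - expected


# Hypothetical (controls, cases) tables for three SNPs.
snps = [
    ((90, 110, 300, 500), (90, 110, 300, 500)),
    ((80, 120, 310, 490), (85, 115, 305, 495)),
    ((95, 105, 290, 510), (88, 112, 298, 502)),
]
results = [interaction(co, ca) for co, ca in snps]
d_hat = [r[0] for r in results]
d_var = [r[1] for r in results]
print(excess_variance(d_hat, d_var))
```

In a genome-wide scan the same calculation would run over all SNPs, and a formal comparison would also need the sampling distribution of the observed variance, which this sketch omits.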
Kraft [2007] argued that it is unlikely for both the secondary phenotype and the genetic marker to affect the original case-control disease risk, let alone for there to be an interaction. In terms of equation (2), he suggested that either δ1 or δ2 would usually be zero and, implicitly, that δ12 would be zero. If so, one could use the W method and thereby gain precision and power. More experience with GWASs is needed to see whether the W or MLE methods yield many false positive results as a consequence of δ12 ≠ 0, or whether their detection probabilities for ranking promising SNPs are degraded by interaction effects. Our work makes it clear that spurious positive findings may result from such an interaction, and that one can protect against them by using the control-only (CO) or adaptively weighted (AW) approaches.