Improving analyses by ignoring covariates seems counterintuitive, as they should
provide some information. To extract value from covariates, Zaitlen et al. [6]
developed a new
method that uses existing evidence of covariate associations with the trait of
interest, and trait prevalence, to increase power. This approach first builds a
liability model using estimates of a covariate's independent effect in the form
of trait prevalences at various levels of the covariate (e.g., type 2 diabetes
prevalences by age). Then it evaluates the association between the genetic variant
of interest and the liability model residuals. In effect, the external information
about covariate effects is used to distinguish high- and low-risk cases and
controls. Tests of genetic variant associations with these quantitative residuals
have more power than tests of genetic associations with the original binary
trait.
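
The two steps above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the implementation of Zaitlen et al.: it assumes a standard-normal residual liability, a single categorical covariate, and a threshold set by the covariate-specific prevalence. The function names, the age bands, and the prevalence values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed standard-normal residual liability


def residual_liability(is_case, prevalence):
    """Expected residual liability given case status.

    A person is a case when the residual exceeds the threshold
    t = Phi^{-1}(1 - K), where K is the trait prevalence at that
    person's covariate level. The result is the mean of the normal
    distribution truncated above (cases) or below (controls) t.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    density = STD_NORMAL.pdf(threshold)
    if is_case:
        return density / prevalence
    return -density / (1.0 - prevalence)


def score_statistic(genotypes, residuals):
    """N times the squared correlation of genotype and residual.

    Used here as a large-sample approximation to a 1-df chi-square
    test; it is a stand-in, not the statistic from the paper.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_r = sum(residuals) / n
    cov = sum((g - mean_g) * (r - mean_r) for g, r in zip(genotypes, residuals))
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    var_r = sum((r - mean_r) ** 2 for r in residuals)
    return n * cov * cov / (var_g * var_r)


# Hypothetical prevalences by age band (illustrative values only).
PREVALENCE_BY_AGE = {"under_40": 0.02, "40_to_59": 0.10, "60_plus": 0.25}

young_case = residual_liability(True, PREVALENCE_BY_AGE["under_40"])
old_case = residual_liability(True, PREVALENCE_BY_AGE["60_plus"])
```

In this sketch a case from the low-prevalence band receives a larger residual than a case from the high-prevalence band, which is how the external prevalence information separates cases that are unexpected given their covariates from cases that are not.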

The value of Zaitlen et al.'s approach is demonstrated in several data sets with
case-control and case-control-covariate ascertainment, where the selection
probability for an individual to join the study depends on covariate levels, such as
in matched studies or those with overrepresentation of low-risk cases. While
covariate-based ascertainment of cases and controls can induce selection bias that
must be addressed by including the covariate in a conventional regression model [8],
the new method provides a potentially powerful alternative.

The authors show by application and simulation that the liability model approach
increases association test statistics by 18% and 16% in comparison
with logistic regression with or without covariates, respectively. Of course, this
improvement hinges on having accurate external covariate information; one could
envision scenarios where the external covariate data is so poor that using this
approach would actually decrease power. One could also use covariate information
discerned from a given dataset, but external information may be even better. A
framework to propagate uncertainties through the multistage analysis of Zaitlen et
al. would be useful to assess sensitivity to the quality of published or assumed
trait prevalences and covariate effects, and to the estimation errors in the
formation of the liability model and in the calculation of residuals. A starting
point might be to repeat the analyses for a range of covariate-specific trait
prevalences that bracket the actual published or assumed values.
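
One way to make that starting point concrete is sketched below. It is an assumption-laden illustration, not a procedure from Zaitlen et al.: the covariate-specific prevalences are multiplied by a few bracketing factors, the residuals are recomputed under a standard-normal liability threshold model, and a simple association statistic (N times the squared genotype-residual correlation) is recorded for each factor. All function names, the scaling factors, and the sample layout are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed standard-normal residual liability


def residual_liability(is_case, prevalence):
    """Truncated-normal mean residual for a case or a control."""
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    density = STD_NORMAL.pdf(threshold)
    return density / prevalence if is_case else -density / (1.0 - prevalence)


def score_statistic(genotypes, residuals):
    """N times the squared genotype-residual correlation."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_r = sum(residuals) / n
    cov = sum((g - mean_g) * (r - mean_r) for g, r in zip(genotypes, residuals))
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    var_r = sum((r - mean_r) ** 2 for r in residuals)
    return n * cov * cov / (var_g * var_r)


def scale_prevalences(prevalence_by_level, factor):
    """Scale every prevalence, clamped to stay strictly inside (0, 1)."""
    return {
        level: min(max(p * factor, 1e-6), 1.0 - 1e-6)
        for level, p in prevalence_by_level.items()
    }


def sensitivity(samples, prevalence_by_level, factors=(0.5, 1.0, 2.0)):
    """Map each bracketing factor to the resulting test statistic.

    samples is a list of (genotype, is_case, covariate_level) tuples.
    """
    genotypes = [g for g, _, _ in samples]
    results = {}
    for factor in factors:
        scaled = scale_prevalences(prevalence_by_level, factor)
        residuals = [
            residual_liability(is_case, scaled[level])
            for _, is_case, level in samples
        ]
        results[factor] = score_statistic(genotypes, residuals)
    return results
```

If the statistic changes little across the bracketing factors, the conclusion is not very sensitive to the assumed prevalences; large swings would signal that the quality of the external information matters for that analysis.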

Zaitlen and colleagues have also developed a version of the liability model approach
for when the covariates are genetic markers with known trait associations [9].
Future work
might compare these novel liability methods to alternative approaches for inclusion
of external information, such as Bayesian models with informative priors for the
covariate effects. Moreover, schemes for weighted analyses [10] suggest other ways to
potentially increase association study power.

In summary, if one undertakes a case-control association study and has information on
covariates that are independent risk factors for a trait—and are not
confounders—simply including them in a logistic regression model is not always
the optimal approach for discovering genetic variants. Instead, more power may be
gained by excluding them, by using the liability model approach of Zaitlen et al.
[6], [9], or by applying
other novel techniques to leverage information from such covariates.