Structured association tests (SAT), like any statistical model, assume that all variables are measured without error. Measurement error can bias parameter estimates and inflate the residual variance in linear models. It has been shown that admixture estimates can be contaminated with measurement error, causing SAT models to suffer from the same afflictions. Multiple imputation (MI) is presented as a viable tool for correcting measurement error problems in SAT linear models, with emphasis on correcting measurement-error-contaminated admixture estimates.
Several MI methods are presented and compared, via simulation, in terms of controlling Type I error rates for both non-additive and additive genotype coding.
Results indicate that MI using the Rubin or Cole method can be used to correct for measurement error in admixture estimates in SAT linear models.
Although MI can be used to correct for admixture measurement error in SAT linear models, the data should be of reasonable quality in terms of marker informativeness, because the method borrows information from the existing data to make the measurement error corrections. If the data are of poor quality, there is little information to borrow.
In statistical modeling, ignoring confounding variables can lead to increased false positive or false negative rates and a bias in parameter estimates either away from or toward a null value. A confounder is a variable that is correlated with the predictor(s) and the outcome variable(s) in the model, and can cause biased estimation of the causal association between these variables if not properly taken into account. To control for a confounder's effects, it is often included in the model as a covariate, which partials out its relationship with the predictor(s) and outcome variable(s) to obtain more accurate estimates of the relationship between predictor(s) and outcome(s). In genetic association studies there is overwhelming evidence that population stratification, assortative mating, and admixture among populations can result in intrapopulation variation in ancestry, correlations of allelic variation among unlinked loci, and ultimately confound association studies [2,3,4,5,6].
When discussing individual ancestry and individual admixture, it is important to distinguish what is meant by these two concepts. By individual ancestry (proportion) we mean the proportion of an individual's ancestors that come from a specified population. In contrast, individual admixture (proportion) is defined as the proportion of an individual's genome that is inherited from a specific parental population.
Several approaches to correct for population stratification and admixture have been proposed. Genomic control (GC) [4, 8, 9] and structured association testing (SAT) [10,11,12,13] are two such statistical approaches. Although GC can be useful in correcting for population stratification, we focus here on precisely estimating ancestry and using it as a covariate in SAT. The SAT model can flexibly accommodate time-to-event, dichotomous, ordinal, or continuous responses for the outcome measure and the model parameters can be estimated through standard statistical software. However, the model is subject to the same assumptions associated with standard linear models, including an implicit assumption that all variables are measured without error. In linear models, measurement error in predictors can introduce bias in the parameter estimates and increase the residual variance, which translates into inaccurate conclusions about hypotheses being tested.
Admixture may mask the true relationship between the phenotype (outcome variable) and genotypes (predictors) and produce false positives [14,15,16,17] and/or false negatives. Individual admixture estimates are typically used as proxies for individual ancestry because individual ancestry is rarely known. Redden et al. and Divers et al. have shown that individual admixture estimates, as proxies for individual ancestry, are contaminated with measurement error for several reasons. First, only a subset of genetic markers with imperfectly known ancestral population allele frequencies is used to estimate admixture (i.e., the markers are not fully ancestry informative). Second, imperfect historical knowledge about the admixed population can lead to inaccurate estimates of individual admixture. Third, individual ancestry is the expected value of individual admixture, but the process of meiosis introduces random variation between the two constructs. Finally, genotyping errors will also contribute to individual admixture being estimated with error. Any one of these conditions will cause a discrepancy between individual ancestry and estimates of individual admixture, which translates into error-contaminated ancestry estimates.
This paper addresses accounting for admixture measurement error in SAT and explores a specific alternative, multiple imputation (MI), to the methods previously described by Divers et al. We use simulation to evaluate the performance of the proposed methods and conclude with a discussion of results and how the methods can be extended.
Redden et al. formulated SAT in the form of a general linear model as follows:

f(Yi) = β0 + β1Ai + β2(P1i × P2i) + Σjkm βjkm Gijkm   (1.1)

In the model, f(Yi) is the link function linking the Yi variable (phenotype) to the parameters of the model, Ai is the ancestry of the i-th individual, P1i and P2i are the ancestry values of the two parents, and Gijkm is an indicator variable for the i-th individual having k and only k alleles of type m (a specific allele state) at the j-th locus. Redden et al. propose inclusion of the product term for parental ancestry to better control for spurious association and achieve the desired Type I error rate. This general model can accommodate covariates such as gender, age, and treatment group, and phenotypes such as time-to-event, dichotomous, ordinal, or continuous responses. The Ai ancestry component is included to control for the potential confounding effect of admixture and must either be assumed to be measured without error or be given a measurement error correction.
The classical true-score model (CTM) expresses each observed admixture estimate as

xij = τi + uij,

where xij is the j-th observed score (estimated admixture) for the i-th individual, τi is the true score (ancestry) for the i-th individual, and the uij are the random components for the j-th admixture estimate (j = 1, 2, …, p). In the CTM it is typically assumed that E(uij) = 0 and var(uij) = σ2u, with the uij mutually independent of each other and of τi [20, 21]. It can then be shown that E(xij) = τi (that is, μxi = τi) and σ2x = σ2τ + σ2u. Note that τi and uij are latent variables that are never observed, but both influence xij, which is observed. Nevertheless, an estimate of σ2u can be obtained using only the data from the xij's. This can be done through a reliability coefficient, generically defined as

ρ2xτ = σ2τ / σ2x = σ2τ / (σ2τ + σ2u),
and ranges from 0 to 1. It should be noted that ρ2xτ is sometimes referred to as the intra-class correlation. Of specific interest here is Cronbach's alpha (αc), a measure of the reliability of the sum of the p equally weighted xij's, computed as

αc = [p / (p – 1)] × [1 – (Σj σ2xj) / σ2T],

where σ2xj is the variance of the j-th estimate and σ2T is the variance of the sum of the p estimates.
The computation of αc only requires that the xij's measure the same construct or latent variable (i.e., tau-equivalence). The estimated reliability coefficient in turn provides an estimate of σ2u as σ2u = σ2x (1 – ρ2xτ) = σ2x (1 – αc), a weighted estimate of the observed score variance. Note that αc is being used in place of ρ2xτ. In genetic association/mapping studies of population data, ancestry informative markers (AIMs) on each of the autosomal chromosomes can be used to obtain chromosome-specific admixture estimates for each person which, conditional on true individual ancestry, are independent. From here on we denote the j-th chromosome-specific admixture estimate for the i-th individual by xij. The chromosome-specific admixture estimates can be used to estimate αc. For a discussion of how Cronbach's alpha affects association tests, see Divers et al.
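As an illustration, the relation σ2u = σ2x(1 – αc) can be checked on simulated chromosome-specific admixture estimates generated under the CTM. This is a minimal sketch with hypothetical values (p = 22 chromosome blocks, normal true scores and errors), not the paper's simulation design:

```python
import random
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha for p 'item' columns (here, chromosome-specific
    admixture estimates), each a list of per-individual scores."""
    p = len(items)
    total = [sum(vals) for vals in zip(*items)]          # sum score per individual
    item_vars = [statistics.variance(col) for col in items]
    return (p / (p - 1)) * (1 - sum(item_vars) / statistics.variance(total))

random.seed(1)
n, p = 2000, 22                                          # individuals, chromosome blocks
tau = [random.gauss(0.15, 0.05) for _ in range(n)]       # latent ancestry (hypothetical)
# xij = tau_i + u_ij under the classical true-score model
items = [[t + random.gauss(0, 0.05) for t in tau] for _ in range(p)]

alpha = cronbach_alpha(items)
# sigma2_x(1 - alpha), with x the mean of the p estimates, recovers the
# measurement error variance of that mean (true value 0.05**2 / 22):
x_bar = [sum(vals) / p for vals in zip(*items)]
sigma2_u = statistics.variance(x_bar) * (1 - alpha)
```

Here alpha lands near p·σ2τ / (p·σ2τ + σ2u), the reliability of the 22-block composite.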
Consider the linear model

Yi = β0 + βτi + εi,

with εi ~ NID(0, σ2). If the predictor is measured with error, i.e., we observe Xi = τi + ui rather than τi, it can be shown that the Ordinary Least Squares (OLS) regression of Y on X yields a consistent estimator of

β* = β σ2τ / (σ2τ + σ2u) = β ρ2xτ,

which is attenuated toward zero. In addition, measurement error inflates the residual variance, as seen in the expression

σ2* = σ2 + β2 ρ2xτ σ2u.

From the above two expressions, the smaller the measurement error variance (σ2u), the closer β* will be to β and the less the residual variance will be inflated. Of course, neither problem exists when there is no measurement error (σ2u = 0).
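The attenuation result is easy to verify numerically. The sketch below uses assumed values β = 1.5, σ2τ = 1, and σ2u = 0.25, so the OLS slope of Y on the error-contaminated X should land near β × 1/1.25 = 1.2 rather than 1.5:

```python
import random

random.seed(2)
n, beta = 20000, 1.5
tau = [random.gauss(0, 1) for _ in range(n)]            # true scores, sigma2_tau = 1
y = [beta * t + random.gauss(0, 1) for t in tau]        # outcome
x = [t + random.gauss(0, 0.5) for t in tau]             # observed with sigma2_u = 0.25

def ols_slope(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

b_star = ols_slope(x, y)
attenuation = 1.0 / (1.0 + 0.25)     # sigma2_tau / (sigma2_tau + sigma2_u) = 0.8
```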
Divers et al. demonstrated the use of quadratic measurement error correction (QMEC) [23, 24], regression calibration, expanded regression calibration [26, 27], and the simulation extrapolation (SimEx) algorithm [28, 29] to address the admixture measurement error challenge in SAT models. They found that the QMEC method performed best in terms of controlling the Type I error rate and the expanded regression calibration method performed the worst. However, the QMEC method is limited to linear models, making a more flexible approach desirable. Multiple imputation (MI) can in principle correct for measurement error in the general SAT model of Redden et al. and flexibly accommodate a variety of special cases such as logistic and Cox regression.
Measurement error problems may be conceptualized as missing data problems in which we observe imperfect measurements but the true scores are never seen (missing). Using MI to impute the missing true values as a means of correcting for measurement error, in conjunction with alpha as the estimate of the measurement error variance, has the advantage of using only the observed data, as opposed to requiring (a) validation data in which the true values of the variable are actually observed, (b) replication data in which multiple measurements of the variable are made, or (c) instrumental data in which two or more alternative methods are used to measure the variable.
In MI one treats imputed true values as probable values rather than as the one 'true' value. Imputing a single value would fail to take into account the uncertainty about the actual value and can lead to underestimated standard errors, confidence intervals that cover less than their nominal level, and inflated Type I error rates. MI accounts for this uncertainty by imputing multiple values for each missing value, and will yield valid estimates and tests under certain assumptions about the missing data mechanism [for details, see [32, 33]].
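The pooling step that propagates this imputation uncertainty into the standard errors can be sketched with Rubin's rules: the pooled variance is the mean within-imputation variance plus an inflated between-imputation component. The m = 5 estimates and variances below are hypothetical placeholders, not values from the paper's simulation:

```python
import math

def rubin_pool(estimates, variances):
    """Combine m imputation-specific estimates and their variances
    using Rubin's rules."""
    m = len(estimates)
    q_bar = sum(estimates) / m                                # pooled estimate
    w = sum(variances) / m                                    # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)    # between-imputation variance
    t = w + (1 + 1 / m) * b                                   # total variance
    return q_bar, t, math.sqrt(t)

# Five hypothetical imputation-specific slope estimates and variances:
est = [1.18, 1.25, 1.21, 1.30, 1.22]
var = [0.010, 0.011, 0.010, 0.012, 0.011]
q_bar, t, se = rubin_pool(est, var)
```

Note that t always exceeds the average within-imputation variance whenever the imputed datasets disagree, which is exactly the standard-error inflation that single imputation misses.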
To use MI for measurement error correction, one can proceed by obtaining an estimate of the true score (ancestry) for the i-th individual based on the observed data, formulating the prediction equation from regression theory as follows:

(Ŷi – μY) / σY = ρXY (Xi – μX) / σX,   (1.8)

where Ŷi is the predicted score, ρXY is the correlation between X and Y, and μY, μX, σY, and σX are the means and standard deviations of Y and X, respectively. Equation 1.8 can be rewritten as

Ŷi = μY + ρXY (σY / σX)(Xi – μX).   (1.9)

Substituting τ̂i for Ŷi, ρXτ for ρXY, στ/σX = ρXτ, and μτ = μX yields

τ̂i = μX + ρ2Xτ (xi – μX).   (1.10)

Note that αc is used instead of ρ2Xτ. The variance associated with this estimated true score is σ2X ρ2Xτ (1 – ρ2Xτ). The reliability index is defined as ρXτ = στ/σX. Equation 1.10 is a Bayesian or 'shrunken' estimator. Thus, probable true scores can be generated using estimated coefficients and variances. This idea will be revisited in the imputation process.
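To sketch how probable true scores could be generated, the snippet below shrinks each centered admixture estimate by αc and adds a random residual whose variance, σ2x αc(1 – αc), is the classical-test-theory error-of-estimation variance (an assumption here). This is a generic illustration of the shrink-then-draw idea, not the specific Rubin, Bootstrap, or Cole implementation:

```python
import random
import statistics

def impute_true_scores(x, alpha, rng):
    """Draw one set of probable true (ancestry) scores from
    error-contaminated admixture estimates x, shrinking by
    Cronbach's alpha and adding a random residual."""
    mu = sum(x) / len(x)
    var_x = statistics.variance(x)
    # assumed error-of-estimation variance: var_x * alpha * (1 - alpha)
    sd_resid = (var_x * alpha * (1 - alpha)) ** 0.5
    return [mu + alpha * (xi - mu) + rng.gauss(0, sd_resid) for xi in x]

rng = random.Random(3)
x = [rng.gauss(0.15, 0.06) for _ in range(1000)]    # hypothetical admixture estimates
m_draws = [impute_true_scores(x, alpha=0.8, rng=rng) for _ in range(5)]
```

Each draw has variance near αc·σ2x, smaller than the observed-score variance, reflecting the shrinkage toward the mean.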
Redden et al. indicated that the product of parental ancestries is required to achieve the desired Type I error rate when genotypic (as opposed to simply allelic) effects at the marker locus are tested. Divers et al. found that squaring the individual admixture estimate 'adequately approximates the product of ancestral ancestries'. Hence, in the present context, quadratic terms of the probable true scores are also required. Here, we justify centering the admixture estimate before implementation of MI. Assume that X ~ N(μ, σ2); then cov(X, X2) = 2μσ2.
By centering X, we have (X – μ) ~ N(0, σ2), and it then follows that cov((X – μ), (X – μ)2) = 0. Thus, centering the admixture estimate allows one to ignore the covariance between X and X2 in the imputation process, and subsequently only the squaring of the probable true score is required.
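The centering argument is easy to check numerically: for X ~ N(μ, σ2) the raw covariance cov(X, X2) should be near 2μσ2, while the centered version should be near zero. The values μ = 0.15 and σ = 0.05 below are illustrative:

```python
import random

random.seed(4)
mu, sigma = 0.15, 0.05
x = [random.gauss(mu, sigma) for _ in range(100000)]

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)

c_raw = cov(x, [v * v for v in x])            # near 2 * mu * sigma**2 = 0.00075
xc = [v - mu for v in x]                      # centered values
c_centered = cov(xc, [v * v for v in xc])     # near 0
```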
Using the SAT model proposed by Redden et al. , and given in equation (1.1), the following steps were implemented for the MI process.
In the above steps, measurement correction is essentially variance correction in the form of .
It is important to recall that MI assumes the missing values are missing at random (MAR). In short, MAR means the probability that values are missing on a certain variable Y depends on other variables in the model, but not on Y itself. Although MI is not being used here to impute conventionally missing values, the MAR assumption still applies: what is being treated as missing are the true values, which are never observed. Even so, it is assumed that the true values have a relationship with the other variables in the model, which is the MAR assumption.
For comparative purposes, the data were analyzed through a naïve model, a model that treats the variables as if they had no measurement error.
The simulation investigated the effect of error-contaminated individual ancestry proportions on the Type I error rate in SAT models. The underlying individual ancestry distribution (X) was simulated by drawing from a mixture of uniform and normal distributions that mimics the ancestral distribution observed in African American populations, following the simulation procedures of Tang et al. One thousand datasets, each containing 500 markers and 1000 individuals, were generated. The delta value of each marker was allowed to vary between 0 and 0.9; however, only ancestry informative markers were retained for individual ancestry proportion estimation. Markers were sampled more heavily toward the upper bound of this interval for high Cronbach's alpha values and more toward the lower bound for low Cronbach's alpha values. These markers were evenly divided into 22 blocks, which were used to provide a set of 22 estimates of individual ancestry; these estimates were used to estimate Cronbach's alpha. From these sets, 20 sets of 500 markers were randomly selected for each target mean Cronbach's alpha value. The allele frequency of each marker in the admixed sample was computed as a mixture of the two parental allele frequencies as follows:
Padxij = Xi P1j + (1 – Xi) P2j,

where P1j and P2j are the frequencies of allele 1 at the j-th marker for the 1st and 2nd parental populations, Xi is the simulated ancestry of the i-th admixed individual, and Padxij is the allele 1 frequency for the i-th admixed individual at the j-th marker. In this simulation, given a specific delta value, P1j ~ U(0, 1), P2j = P1j + δ where δ ~ Bin(100, delta) × 0.01, and Xi = 0.2 × U(0.1, 0.9) + 0.8 × N(0.15, 0.05²) [19, 34]. The trait or phenotypic variable was generated as
Yi = β0 + β1Xi + εi  and  Yi = β0 + β1Xi + β2Xi² + εi

for the linear and quadratic model, respectively, where εi ~ N(0, 4). The linear model was generated for comparative purposes. In the simulation, Xi is the simulated true ancestry proportion from the above mixture distribution, and Wi = Xi + ei, where ei ~ N(0, σ2i), is the observed, error-contaminated ancestry proportion. Note that this is ancestry estimated in the form of the classical true-score model (CTM). The σ2i values were selected so that the observed correlations between Wi and Xi varied between 0.85 and 0.95, to demonstrate that highly yet still imperfectly correlated true and estimated (or measured) ancestry proportions can still lead to Type I error inflation. We note that a correlation between 0.85 and 0.95 ensures that Cronbach's alpha is bounded between 0.7 and 0.9. Under this scheme, 20 datasets of 500 markers containing 1000 individuals were simulated, for a total of 10,000 markers. Each marker was tested for association with the simulated phenotype.
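The generative scheme above can be sketched in code. The snippet interprets Xi as a two-component mixture (uniform with probability 0.2, normal otherwise) and clips P2j at 1; both are assumptions about details the text leaves implicit, and the marker count is reduced for brevity:

```python
import random

rng = random.Random(5)

def simulate_admixed(n_ind=1000, n_markers=10, delta=0.5):
    """Sketch of the simulation design: parental allele frequencies,
    mixture ancestry, and genotypes drawn from the admixed frequency."""
    p1 = [rng.random() for _ in range(n_markers)]                 # P1j ~ U(0, 1)
    # delta_j ~ Bin(100, delta) * 0.01
    d = [0.01 * sum(rng.random() < delta for _ in range(100)) for _ in range(n_markers)]
    p2 = [min(p1j + dj, 1.0) for p1j, dj in zip(p1, d)]           # P2j = P1j + delta_j (clipped)
    geno = []
    for _ in range(n_ind):
        # Xi: mixture interpretation of 0.2*U(0.1,0.9) + 0.8*N(0.15, 0.05^2)
        xi = rng.uniform(0.1, 0.9) if rng.random() < 0.2 else rng.gauss(0.15, 0.05)
        padx = [xi * a + (1 - xi) * b for a, b in zip(p1, p2)]    # admixed frequency
        # genotype = number of copies of allele 1 from two independent draws
        geno.append([(rng.random() < f) + (rng.random() < f) for f in padx])
    return geno

g = simulate_admixed()
```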
Each dataset contained a sample of 1000 individuals with 500 markers. SAT models both with and without the squared ancestry term were fitted to the data; we refer to the model without the squared term as the linear SAT model and the model with it as the quadratic SAT model. Assume there are two alleles (A, a) at a locus, forming three genotypes (aa, aA, AA), with A the allele of interest. The genotypes can be coded to allow for testing of only additive or both additive and non-additive effects; table 1 gives the respective coding schemes.
Table 2 contains the Type I error rates of the linear and quadratic SAT models with additive and non-additive genotypic coding for different reliability coefficients, corresponding to the naïve model (i.e., without measurement error correction). The Type I error rates are liberal irrespective of genotype coding, of linear versus quadratic SAT model, and of reliability coefficient, implying that the association test will have a higher false positive rate if there is confounding by admixture and the model is not corrected for measurement error.
Tables 3, 4, and 5 provide the Type I error rates with measurement error correction corresponding to the Rubin, Bootstrap, and Cole methods. Table 3 contains Type I error rates for both the linear and quadratic SAT models with additive and non-additive genotypic coding for a reliability coefficient of 0.90. The Type I error rates for all three methods of imputation were slightly conservative for the linear SAT model. A similar trend occurred for the quadratic SAT model, with the exception of the Bootstrap method, for which the Type I error rates for β3 were slightly liberal.
The Type I error rates of the linear and quadratic SAT models with additive and non-additive genotypic coding for a reliability coefficient of 0.80 are presented in table 4, with measurement error correction using the Rubin, Bootstrap, and Cole methods. For the linear SAT model, the Bootstrap imputation method controlled the Type I error rate best, followed closely by the Cole and Rubin methods. Additionally, the Cole and Rubin methods were not as conservative as before. However, the Type I error rates were liberal for the quadratic SAT model using the Bootstrap method, irrespective of genotype coding system. Both the Rubin and Cole methods provided Type I error rates closer to the nominal significance level of 0.05 and slightly less conservative compared to the situation with a reliability coefficient of 0.90.
Lastly, table 5 displays the Type I error rates of the linear and quadratic SAT models with additive and non-additive genotypic coding for a reliability of 0.70. The Type I error rates for the Bootstrap method were very liberal compared to either the Rubin or the Cole method. All methods performed more poorly for the quadratic SAT model, although the Rubin and Cole methods kept the Type I error rate closer to the nominal significance level of 0.05; the one exception is that both the Rubin and Cole methods were slightly conservative for the β4 parameter estimate.
Measurement error in linear model variables is an important consideration, and through simulation we demonstrated the importance of correcting measurement error in linear models. Of particular interest was using multiple imputation (MI) for measurement error correction in the Redden et al. SAT model. Although the Redden SAT model requires individual ancestry estimates to control for admixture confounding, individual admixture estimates were used as surrogates because individual ancestry is rarely known. We then described how to use MI for measurement error correction. Like Divers et al., we used Cronbach's alpha as a component of our measurement error correction procedure. We also described three different methods for imputing probable true scores for admixture: Rubin, Bootstrap, and Cole.
In the linear SAT model, of the three methods for imputing probable admixture scores, the Rubin and Cole methods appear to work best. Although at first it looks as if the Bootstrap method controls the Type I error rate correctly while the Rubin and Cole methods are slightly conservative, as marker informativeness decreases it is the Rubin and Cole methods that control the Type I error rate while the Bootstrap method becomes liberal. Consistently, the Rubin and Cole methods provided better control of the Type I error rate than the Bootstrap method. A similar pattern was observed by Divers et al., in that measurement error correction only appears to be required when the informativeness of the markers is of intermediate value: when markers are highly informative, the measurement error correction provides little improvement, and when marker informativeness is low, there is poor information to borrow for the correction. MI for measurement error correction as presented uses the existing data to accomplish this goal and requires no external information.
In the quadratic SAT model, of the three different methods for imputing probable admixture scores, the Rubin and Cole methods again appear to work best. The Bootstrap method did not consistently provide reasonable control of the Type I error rate. One interesting point is that the type I error rates of the Bootstrap method, in all models, are very similar to the type I error rates of the model without measurement error correction, suggesting that the Bootstrap method is not providing much measurement error correction. Notably, none of the methods works particularly well for a quadratic SAT model with admixture reliability of 0.70. Because of this result the linear SAT model corrected for measurement error may be considered, yet it too can have problems if the genetic effects are markedly non-additive (e.g., overdominance).
There is now broad agreement that population admixture and/or population stratification can confound association studies when not taken into account. However, the accuracy with which admixture is measured will also influence the Type I error rate. When admixture, or any other continuous variable, is contaminated with error, MI for measurement error correction can help maintain the specified Type I error rate. This method is only useful, however, if the data are of reasonably good quality with respect to marker information, which means that much care should still be taken when designing association studies, and in particular when measuring variables that will be used in a statistical model.
This work was supported in part by National Institutes of Health grants: 5R01AR052658-02, ES009912, DK056336, CA100949-03, HL072757, AR007450, AR049084, R21LM008791, R01GM077490. The opinions expressed are solely those of the authors and do not necessarily represent those of the NIH or any other organization with which the authors are affiliated.