Analysis of covariance (ANCOVA) was devised for classical experimental designs with random group assignment to minimize preexisting group differences. In such designs, group differences in characteristics like IQ or socioeconomic status (SES) occur only by chance, because the theoretical populations to which the experimenter wishes to generalize are equated on the distribution of the covariate. Even with random assignment, sample differences may occur on the covariate by chance, so ANCOVA is a possible means of adjusting for these differences and providing an unbiased estimate of the population difference in means on the dependent variable (because the hypothetical populations to which the treatments have been assigned have been equated by design).
When the covariate is an attribute of the disorder or of its treatment, or is intrinsic to the condition, it becomes meaningless to “adjust” the treatment effects for differences in the covariate, and ANCOVA cannot be used to control treatment assignment independent of the covariate (Adams et al., 1985; Evans & Anastasio, 1968; Lord, 1967; Miller & Chapman, 2001; Tupper & Rosenblood, 1984). In his classic demonstration of an agronomist comparing rates of growth in corn plants that differ inherently in stalk height, Lord (1969) showed that any attempt to compare the yields of the two classes of plants by adjusting for plant height must give a meaningless result, one that could only come about through fundamental alterations of the two plants. The causal network relating plant species to plant height and plant yield cannot be manipulated to isolate the causal impact of species on yield in the absence of species effects on height and height effects on yield; neither ANCOVA nor matching can correct these effects of species and height.
The best case scenario for the use of a covariate (Huitema, 1980) exists when: (a) the assignment to the independent variable (e.g., neurodevelopmental disorder) is done randomly; (b) the covariate is related to the outcome measure, but this relation is of no theoretical interest in terms of the investigative question [i.e., the covariate is a source of irrelevant variation in the dependent variable, which, if controlled, allows for a more powerful test of the effects of the independent variable of interest]; (c) the covariate is unrelated to the independent variable, which is assured probabilistically if (a) is true; and (d) the covariate is not differentially related to the dependent variable at different levels of the independent variable [also assured if (a) is true]. Ideally, the covariate should also be stable and measured without error.
When assignment to the independent variable is not through randomization, or the covariate otherwise does not meet all the requirements of the ideal scenario, then the proper use of covariates requires consideration of precisely how the independent variable, the dependent variable, and the covariate come together to form a causal network. For instance, covariates can meaningfully be incorporated into the analysis when the dependent and independent variables are spuriously related to the covariate, or when the covariate mediates (partially or fully) the relation between the independent and the dependent variable, and the investigator is interested in estimating the direct effect of the independent variable on the outcome. In these instances, the use of a covariate can clarify the relation between the independent and the dependent variables.
We next argue that the typical use of IQ as a covariate does not fulfill the requirements of the ideal scenario. Furthermore, it rarely meets the requirements for the meaningful use of covariates in less than ideal circumstances.
The Ideal Scenario for a Covariate
At the heart of an ideal scenario and all meaningful uses of covariates is the tripartite relation of the covariate, the independent variable, and the outcome. In appropriate uses of covariates, the covariate is a cause of the outcome, such as age causing achievement, or at least serving as a proxy for exposure, education, and instruction. The covariate should not be an outcome of the dependent variable or of the independent variable. In this three-dimensional space, what complicates matters is the relation between the covariate and the independent variable, and by implication, the joint relations among the independent variable, the dependent variable, and the covariate. When assignment to values on the independent variable is through a random process (the ideal scenario), the independent variable and the covariate are unrelated (i.e., the extent to which the groups differ on the covariate is probabilistically zero), and the inclusion of covariates in the statistical analysis increases power for finding a true relation between the independent and the dependent variables by keeping the numerator of the F value the same while reducing the denominator.
This situation is depicted graphically in Fig. 1. Although the situation depicted in Fig. 1 is hypothetical, we have labeled the horizontal axis as IQ and the vertical axis as Memory to make the situation less abstract. In Fig. 1, the difference in the heights of the two ellipses at the mean of IQ is equivalent to the difference between the groups’ means on the Memory measure. The difference in this adjusted comparison is not in the estimate of the mean difference between groups, but in terms of the variance in Memory. In the comparison of Memory controlling for IQ, the variance of Memory is replaced by the variance in Memory conditional on IQ. Given the correlation of .6 in the population, the conditional variance in Memory will be about 64% of the unconditional variance in Memory (i.e., 1 − .6² = .64), thereby leading to a more powerful test of the difference between groups on Memory.
Fig. 1 The ellipses in the figure represent the 99% quantiles in a bivariate normal distribution for two groups where the correlation between IQ (graphed on the horizontal axis) and Memory (graphed on the vertical axis) is .6 for each group. In the margins of ...
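The variance reduction described for the randomized case can be checked with a small simulation. The sketch below is illustrative only: the group sizes, the z-score scaling, the 0.5 SD group difference in Memory, the random seed, and all function names are assumptions introduced here, not values from the study. It draws two groups with the same IQ distribution and a within-group IQ-Memory correlation of .6, then compares the pooled within-group variance of Memory before and after removing the pooled within-group regression on IQ.

```python
import random
import statistics


def simulate_group(n, mean_iq, mean_memory, rho, rng):
    """Draw n (iq, memory) pairs in z-score units with within-group correlation rho."""
    pairs = []
    for _ in range(n):
        z_iq = rng.gauss(0.0, 1.0)
        z_mem = rho * z_iq + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        pairs.append((mean_iq + z_iq, mean_memory + z_mem))
    return pairs


def pooled_within_slope(groups):
    """Pooled within-group regression slope of memory on IQ."""
    sxy = sxx = 0.0
    for g in groups:
        mx = statistics.fmean([x for x, _ in g])
        my = statistics.fmean([y for _, y in g])
        sxy += sum((x - mx) * (y - my) for x, y in g)
        sxx += sum((x - mx) ** 2 for x, _ in g)
    return sxy / sxx


def pooled_within_variance(groups, slope=None):
    """Pooled within-group variance of memory, optionally after removing slope * IQ."""
    b = 0.0 if slope is None else slope
    ss, df = 0.0, 0
    for g in groups:
        resid = [y - b * x for x, y in g]
        m = statistics.fmean(resid)
        ss += sum((r - m) ** 2 for r in resid)
        df += len(g) - 1
    if slope is not None:
        df -= 1  # one degree of freedom is spent on the estimated slope
    return ss / df


rng = random.Random(0)   # hypothetical seed, for reproducibility only
rho = 0.6                # within-group correlation used in the text
group_a = simulate_group(20000, 0.0, 0.0, rho, rng)
group_b = simulate_group(20000, 0.0, 0.5, rho, rng)  # same IQ mean (randomized case)
groups = [group_a, group_b]

slope = pooled_within_slope(groups)
variance_ratio = pooled_within_variance(groups, slope) / pooled_within_variance(groups)

# Unadjusted and covariate-adjusted group differences on memory
iq_diff = (statistics.fmean([x for x, _ in group_b])
           - statistics.fmean([x for x, _ in group_a]))
raw_diff = (statistics.fmean([y for _, y in group_b])
            - statistics.fmean([y for _, y in group_a]))
adj_diff = raw_diff - slope * iq_diff

print("conditional / unconditional variance:", round(variance_ratio, 3))
print("unadjusted difference:", round(raw_diff, 3))
print("adjusted difference:", round(adj_diff, 3))
```

Under these assumptions the variance ratio should fall close to 1 − .6² = .64, while the adjusted and unadjusted mean differences should nearly coincide, because the groups do not differ systematically on IQ. That is the sense in which the numerator of the F ratio is preserved while the denominator shrinks.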
The Less-Than-Ideal Scenario for a Covariate
When preexisting groups are compared in a nonexperimental study, participants are recruited nonrandomly, as they exist in nature. If we knew how children come to be “assigned” to the population of children with SBM or LD, it might be possible to incorporate the assignment process into the comparison; even for genetic disorders, however, modeling the selection process is not currently possible, so groups may differ on variables potentially related to the assignment mechanism. It is a false inference that any measure on which groups differ and which is not itself the comparison of interest must be controlled because it is related to the assignment mechanism.
Many differences between naturally occurring groups are themselves consequences of the unknown assignment mechanism, being neither artifacts of how the relevant sample was ascertained nor part of the assignment mechanism, but rather differences between the populations from which the researcher wishes to sample. Investigators understandably wish to adjust for selection effects that arise due to nonrepresentative sampling from the populations, in order to derive a better estimate of population differences by adjusting for sampling biases, such as differences in age or gender. But when the populations differ on the attribute, even random sampling from the populations will result in attribute differences between samples that represent not biased sampling but true population differences.
For groups with neurodevelopmental disorders, mean IQ scores will be generally below the population normative mean. Consequently, groups will differ when appropriately selected from the populations of these disorders. Differences in IQ between children with SB and age-matched controls represent, not poor sampling, but preexisting, nonrandom differences beyond experimenter control.
This situation is depicted in Fig. 2, which is developed in a fashion similar to Fig. 1, but allows for differences between the two groups on the variable IQ. The distance between the two solid horizontal lines depicts the difference in Memory controlling for the differences between groups on IQ. Fig. 2 depicts two population distributions, with the two distributions being closer together at the grand mean of IQ than at the respective group means on IQ. The two distributions are almost nonoverlapping, such that much less than 50% of the lower performing group lies at or above the grand mean on IQ, while substantially more than 50% of the higher performing group lies above the grand mean on IQ. In the hypothetical situation depicted in Fig. 2, a comparison at the grand mean is roughly equivalent to comparing the 25th percentile in the higher performing group and the 75th percentile in the lower performing group. That this statistical adjustment can be performed mathematically says nothing about the scientific validity of the resulting comparison, which requires a model of the neurocognitive function.
Fig. 2 This figure differs systematically from Fig. 1 because the two groups differ on the mean of IQ. As in Fig. 1, the correlation between IQ and Memory is .6 for each group. In addition, Fig. 2 includes two heavy lines that depict the regression ...
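The percentile interpretation of a comparison at the grand mean can be made concrete with a short calculation. The sketch below is not taken from the figure: the group means (IQ of 90 and 110, Memory of 85 and 105), the common standard deviation of 15, the equal group sizes, and the helper name adjusted_difference are all hypothetical choices made so that the grand mean of IQ falls near the 75th percentile of the lower group and the 25th percentile of the higher group, as in the scenario described above.

```python
from statistics import NormalDist

# Hypothetical population values, chosen only for illustration
sd_iq = 15.0
sd_memory = 15.0
rho = 0.6  # within-group correlation between IQ and Memory, as in the text

low_iq = NormalDist(mu=90.0, sigma=sd_iq)    # lower performing group
high_iq = NormalDist(mu=110.0, sigma=sd_iq)  # higher performing group
grand_mean_iq = (low_iq.mean + high_iq.mean) / 2.0  # assumes equal group sizes

# Where does the grand mean fall within each group's own IQ distribution?
pct_low = low_iq.cdf(grand_mean_iq)    # proportion of lower group at or below it
pct_high = high_iq.cdf(grand_mean_iq)  # proportion of higher group at or below it


def adjusted_difference(mean_y_a, mean_y_b, mean_x_a, mean_x_b, slope):
    """ANCOVA-style adjusted mean difference under a common within-group slope."""
    return (mean_y_a - mean_y_b) - slope * (mean_x_a - mean_x_b)


slope = rho * sd_memory / sd_iq  # common within-group slope of Memory on IQ
unadjusted = 105.0 - 85.0        # hypothetical Memory means for the two groups
adjusted = adjusted_difference(105.0, 85.0, high_iq.mean, low_iq.mean, slope)

print("grand mean percentile, lower group:", round(100 * pct_low, 1))
print("grand mean percentile, higher group:", round(100 * pct_high, 1))
print("unadjusted difference:", unadjusted)
print("adjusted difference:", round(adjusted, 2))
```

With these assumed values the adjusted difference is smaller than the unadjusted one, but it describes a comparison between atypical members of each population: relatively high-IQ members of the lower group against relatively low-IQ members of the higher group.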
The inability to control group assignment renders the foregoing discussion somewhat academic, insofar as it relates to controlling for preexisting differences on covariates. It does highlight the fact that the key to appropriate use of covariates is understanding their role in the assignment mechanism and the selection process and articulating a causal network about how different cognitive and neurodevelopmental processes are related.
Assumptions of ANCOVA
The use of IQ as a covariate in neurodevelopmental studies rarely meets standard assumptions for ANCOVA. In addition to the assumptions of analysis of variance, ANCOVA adds the assumption of homogeneity of regression, which practically means that the within-group regressions of the dependent variable on IQ do not differ across groups. ANCOVA assumes further that the residuals are normally distributed and have equal variance in all groups.
Although these assumptions can be relaxed with appropriate alternative estimation methods, consider what happens when the covariate seemingly has no effect on the outcome or, conversely, when the covariate relates to the dependent variable in a different manner for each group, such that group differences in the outcome vary as a function of the value of the covariate. In the former situation, the lack of direct impact of the covariate on the dependent variable when the ANCOVA assumptions are met implies that the covariate does not mediate or moderate the relationship between the group measure and the dependent variable; such an inference is not necessarily justified if the assumptions of the ANCOVA model do not hold. The presence of a relation between the covariate and the dependent variable does not imply that the covariate mediates or moderates the relationship between the group measure and the dependent variable; such an inference requires a line of causal argument that is not simply statistical in nature, and so must be supported through both theory and empirical findings.
The alteration of group differences by inclusion of a covariate occurs when groups differ on the covariate or when the covariate operates differently in predicting group outcome; of itself, the alteration does not license the inference that the covariate mediates or moderates the relationship between the group and the dependent variable. In the absence of heterogeneity of regression, an adjustment in the mean difference occurs because the groups differ on average on the covariate, as shown in Fig. 2. Comparing groups at the mean value of the covariate leads to a different estimate of group differences on the outcome than simply comparing groups on their unadjusted means on the dependent variable.
Controlling for the covariate usually reduces the magnitude of group differences, as shown in Fig. 2, although this adjustment need not shrink group differences. In fact, when the covariate is positively related to the outcome within groups, but the lower scoring group is higher on the covariate, the adjusted mean difference will exceed the unadjusted mean difference in magnitude. This scenario is depicted in Fig. 3, where there is homogeneity of regression but where group differences are larger when the covariate, IQ, is controlled than when the groups are compared on Memory ignoring IQ. This effect can be seen in Fig. 3 by comparing the separation between the two solid horizontal lines, which show the difference in the adjusted means, to the separation between the two dashed horizontal lines, which show the difference in the unadjusted means, and which are also referenced by the centers of the marginal distributions for Memory. Such findings are possible when the covariate is causally implicated in the dependent variable but other factors operate to bias group selection.
Fig. 3 This figure is similar to Fig. 2, but in this case, the displacement of groups on IQ is opposite what would be expected given the overall positive correlation between IQ and Memory in both groups. The two solid horizontal lines show the difference in ...
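The case in which adjustment enlarges rather than shrinks the group difference follows directly from the arithmetic of adjusted means. The sketch below uses hypothetical group means (they are not read from the figure) and a hypothetical helper, adjusted_means, that evaluates each group's regression line at the grand mean of the covariate under a common within-group slope of .6.

```python
def adjusted_means(group_stats, slope):
    """Evaluate each group's regression line at the grand mean of the covariate.

    group_stats maps a group label to (mean_covariate, mean_outcome).
    Equal group sizes are assumed, so the grand mean is the average of group means.
    """
    grand_mean_x = sum(mx for mx, _ in group_stats.values()) / len(group_stats)
    return {
        label: my - slope * (mx - grand_mean_x)
        for label, (mx, my) in group_stats.items()
    }


# Hypothetical values: the group that scores lower on Memory is HIGHER on IQ
stats = {
    "lower_memory_group": (105.0, 95.0),   # (mean IQ, mean Memory)
    "higher_memory_group": (95.0, 105.0),
}
slope = 0.6  # common within-group slope (homogeneity of regression assumed)

adjusted = adjusted_means(stats, slope)
unadjusted_diff = stats["higher_memory_group"][1] - stats["lower_memory_group"][1]
adjusted_diff = adjusted["higher_memory_group"] - adjusted["lower_memory_group"]

print("unadjusted difference:", unadjusted_diff)
print("adjusted means:", {k: round(v, 2) for k, v in adjusted.items()})
print("adjusted difference:", round(adjusted_diff, 2))
```

Because the covariate difference runs opposite to the outcome difference, subtracting the slope times the covariate difference adds to the gap between groups instead of removing part of it.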
When the relation between the covariate and the outcome is different for each of the two groups, differences on the outcome vary with the value of the covariate. In Fig. 4, the ellipses are of different sizes, reflecting the overall weaker relation between IQ and Memory in the lower performing group (r = .4 vs. r = .8 in the higher performing group). Differences between groups on the outcome measure Memory depend on where along the IQ distribution the comparison between groups is made. The standard ANCOVA comparison is made at the grand mean on IQ, which in this case represents approximately the 25th percentile for the higher performing group and the 75th percentile for the lower performing group. If a common regression line were applied to the two groups, the adjustment would be too little for the higher performing group (where dependence of Memory on IQ is greater) and too great for the lower performing group (where IQ and Memory are less strongly related).
Fig. 4 In this figure, the ellipses are of different sizes reflecting the overall weaker relation between IQ and Memory in the lower performing group. In this case, the correlation between the two measures is only .4, whereas in the higher performing group, ...
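The dependence of the group difference on the point of comparison can be illustrated with two within-group regression lines. In the sketch below, the correlations of .4 and .8 come from the text; everything else (group means of 90 and 110 on both measures, a common standard deviation of 15 so that each slope equals its correlation, and the function names) is a hypothetical choice for illustration.

```python
def make_group_line(mean_iq, mean_memory, r, sd_iq=15.0, sd_memory=15.0):
    """Return a function giving expected Memory at a given IQ within one group."""
    slope = r * sd_memory / sd_iq

    def predict(iq):
        return mean_memory + slope * (iq - mean_iq)

    return predict


# Correlations from the text; means and SDs are hypothetical
lower_group = make_group_line(mean_iq=90.0, mean_memory=90.0, r=0.4)
higher_group = make_group_line(mean_iq=110.0, mean_memory=110.0, r=0.8)


def difference_at(iq):
    """Expected Memory difference (higher minus lower group) at a given IQ."""
    return higher_group(iq) - lower_group(iq)


comparison_points = [90.0, 100.0, 110.0]  # lower-group mean, grand mean, higher-group mean
differences = {iq: difference_at(iq) for iq in comparison_points}

for iq, diff in differences.items():
    print(f"IQ = {iq:5.1f}: expected Memory difference = {diff:.2f}")
```

Under these assumptions the expected difference grows steadily as the comparison point moves up the IQ scale, so a single adjusted difference reported at the grand mean summarizes only one of many possible comparisons.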