In contrast, observational studies of ROA progression can produce biased estimates of the effect of an exposure on the outcome precisely because the study sample is restricted to knees that have preexisting mild or moderate knee ROA. To illustrate this, we return to a causal diagram, this time depicting the relationship of a risk factor (e.g., obesity) to knee ROA progression in a hypothetical observational study. To simplify the discussion throughout this paper, we consider only one potential confounder (although the argument extends readily to more complex scenarios with more potential confounders), in this case a genetic factor that increases the risk of both incident and progressive ROA. Note that the genetic factor is not associated with obesity before knees develop ROA; thus, it would not be a confounder in a study of obesity and incident ROA.
As shown in the causal diagram, knees that have preexisting ROA at baseline developed ROA because of either obesity or the genetic factor; thus preexisting ROA at baseline is a “common effect” of obesity and the genetic factor (i.e., obesity → preexisting ROA ← genetic factor). The absence of an association between obesity and the genetic factor before knees develop mild or moderate ROA is indicated by the lack of an arrow between them. The only paths between these two variables are blocked by colliding arrowheads at their common-effect nodes, i.e., preexisting ROA (obesity → preexisting ROA ← genetic factor) or ROA progression (obesity → ROA progression ← genetic factor).
However, as a result of conditioning on a common effect, in this case preexisting ROA at baseline, obesity and the genetic factor are no longer independent. Such conditioning opens an alternative non-causal path between obesity and ROA progression through the genetic factor (i.e., obesity --- genetic factor → ROA progression). This alternative path biases the estimated effect of obesity on ROA progression unless the effect of the genetic factor is appropriately adjusted for. In many instances, however, not all confounders are known or measured, which leads to confounded effect estimates.
Here we provide an intuitive example to illustrate the logic behind potential spurious associations created by conditioning on a common effect [16]. Suppose one has two fair coins and one bell. The bell rings whenever either coin comes up heads on a toss of the two coins. Thus, the bell ringing is a “common effect” of heads appearing on the toss of either coin A or coin B (i.e., coin A → Bell ringing ← coin B). Obviously, heads appearing as a result of one coin toss is independent of heads appearing as a result of the other coin toss; thus the correlation coefficient between heads appearing from coin A and from coin B equals 0. However, suppose that we only examined the relationship between heads appearing with the two coin tosses in those instances where the bell did ring (analogous to only examining the effect of obesity on ROA progression in people with preexisting ROA). By conditioning on the status of the bell ringing (i.e., conditioning on a common effect), heads appearing from coin A and heads appearing from coin B are no longer independent. For example, if the bell rings and coin A came up tails, then that must mean that coin B came up heads (and vice versa if coin B came up tails). As a consequence, conditioning on the status of the bell ringing induces a negative correlation between heads appearing from coin A and from coin B.
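The coin-and-bell example can be checked by enumerating the four equally likely outcomes of the two tosses. The following is a minimal sketch; the function and variable names are ours and purely illustrative. It computes the correlation between the two coins over all tosses and then only over the tosses in which the bell rang.

```python
from itertools import product
from math import sqrt


def correlation(outcomes):
    """Pearson correlation of (a, b) pairs, each pair being equally likely."""
    n = len(outcomes)
    mean_a = sum(a for a, _ in outcomes) / n
    mean_b = sum(b for _, b in outcomes) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in outcomes) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in outcomes) / n
    var_b = sum((b - mean_b) ** 2 for _, b in outcomes) / n
    return cov / sqrt(var_a * var_b)


# 1 = heads, 0 = tails; two fair coins give four equally likely outcomes.
all_tosses = list(product([0, 1], repeat=2))

# The bell rings when at least one coin shows heads (the common effect).
bell_rang = [(a, b) for a, b in all_tosses if a == 1 or b == 1]

corr_all = correlation(all_tosses)
corr_given_bell = correlation(bell_rang)

print(f"correlation over all tosses:       {corr_all:.2f}")
print(f"correlation given the bell ringing: {corr_given_bell:.2f}")
```

Over all tosses the correlation is 0. Restricted to the three outcomes in which the bell rang, it is -0.5, because the only outcome removed is the one in which both coins came up tails.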
Because we assumed that there are only two causal factors (obesity and the genetic factor) for the development of mild or moderate ROA in our scenario, some cases of preexisting ROA at baseline were caused by obesity and others by the genetic factor. While the genetic factor is not associated with obesity before knees develop ROA, among knees that have ROA at baseline obesity and the genetic factor are no longer independent. This is because if the cause of preexisting ROA is not obesity, it must be the genetic factor, and vice versa. Consequently, when evaluating the relation of obesity to the risk of ROA progression, this negative correlation between obesity and the genetic factor would bias the effect of obesity on ROA progression downward unless the genetic factor is appropriately controlled for.
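The two-cause scenario can also be sketched as a small simulation. Every number below is a hypothetical assumption chosen only for illustration, not an estimate from any study: the prevalences of obesity and the genetic factor, the additive risk models, and an assumed true effect of obesity on progression of 0.10 in absolute risk. The helper names are likewise ours.

```python
import random


def simulate_knees(n=200_000, seed=0):
    """Return (obese, gene, baseline_roa, progressed) tuples.

    All probabilities are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    knees = []
    for _ in range(n):
        obese = rng.random() < 0.30  # drawn independently of the genetic factor
        gene = rng.random() < 0.30
        # Either cause raises the risk of preexisting ROA (the common effect).
        baseline_roa = rng.random() < 0.05 + 0.30 * obese + 0.30 * gene
        # Assumed true effect of obesity on progression: +0.10 absolute risk.
        progressed = rng.random() < 0.10 + 0.10 * obese + 0.30 * gene
        knees.append((obese, gene, baseline_roa, progressed))
    return knees


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def obese_minus_nonobese(knees, column):
    """Mean of a column in obese knees minus its mean in non-obese knees."""
    return (mean(k[column] for k in knees if k[0])
            - mean(k[column] for k in knees if not k[0]))


knees = simulate_knees()
with_roa = [k for k in knees if k[2]]  # condition on preexisting ROA

# Difference in prevalence of the genetic factor, obese vs. non-obese.
gene_gap_all = obese_minus_nonobese(knees, 1)
gene_gap_roa = obese_minus_nonobese(with_roa, 1)

# Crude risk difference for progression among knees with baseline ROA.
crude_rd = obese_minus_nonobese(with_roa, 3)

# Risk difference after stratifying on the genetic factor.
strata = [[k for k in with_roa if k[1] == g] for g in (False, True)]
adjusted_rd = sum(len(s) * obese_minus_nonobese(s, 3) for s in strata) / len(with_roa)

print(f"gene prevalence gap, all knees:        {gene_gap_all:+.3f}")
print(f"gene prevalence gap, baseline ROA only: {gene_gap_roa:+.3f}")
print(f"crude risk difference:                  {crude_rd:+.3f}")
print(f"gene-adjusted risk difference:          {adjusted_rd:+.3f}")
```

Under these assumptions the genetic factor is about equally common in obese and non-obese knees in the full cohort, but is markedly less common in obese knees once the sample is restricted to knees with baseline ROA. The crude risk difference then falls well below the assumed true value of 0.10, and stratifying on the genetic factor recovers an estimate close to it.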
Ideally, to assess the effect of obesity on the risk of ROA progression, one should compare the risk of ROA progression among persons with obesity with that among persons without obesity, with all else being equal. However, in an observational study, persons who have baseline knee ROA but are not obese must have been exposed to other risk factors for ROA. Thus, the two groups (i.e., obese vs. non-obese) are not comparable in terms of the distribution of potential confounders. Such a study can almost be considered a study that compares the risk of ROA progression among persons who are exposed to obesity with that among persons who are exposed to the genetic factor and other risk factors for ROA. Of course, one should acknowledge that some of the knees with preexisting ROA at baseline may have been exposed to obesity, the genetic factor, and other risk factors. Nonetheless, conditioning on preexisting ROA at baseline would tend to bias the effect of obesity on progression towards the null unless the analysis adjusts for genetic and other risk factors. Unfortunately, not all factors are known or measured; those that are not cannot be adjusted for, which leads to bias.
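The consequence of an unknown or unmeasured risk factor can be illustrated with an exact calculation rather than a simulation. The sketch below adds a third, hypothetical cause labelled `unknown` to obesity and the genetic factor. The shared prevalence of 0.30, the additive risk models, and the assumed true obesity effect of 0.10 are all illustrative assumptions, and the function names are ours.

```python
from itertools import product

PREVALENCE = 0.30  # hypothetical prevalence of each independent binary factor


def p_baseline_roa(obese, gene, unknown):
    """Assumed risk of preexisting ROA at baseline (hypothetical values)."""
    return 0.05 + 0.25 * (obese + gene + unknown)


def p_progression(obese, gene, unknown):
    """Assumed risk of progression; the true obesity effect is +0.10."""
    return 0.10 + 0.10 * obese + 0.20 * gene + 0.20 * unknown


def cells():
    """Yield (obese, gene, unknown, weight), weight = P(factors and baseline ROA)."""
    for o, g, u in product([0, 1], repeat=3):
        p_factors = 1.0
        for x in (o, g, u):
            p_factors *= PREVALENCE if x else 1 - PREVALENCE
        yield o, g, u, p_factors * p_baseline_roa(o, g, u)


def risk_difference(adjust_for):
    """Obesity risk difference among knees with baseline ROA, averaged over
    strata of the factors named in adjust_for ('gene' and/or 'unknown')."""
    strata = {}
    for o, g, u, w in cells():
        key = tuple(v for name, v in (("gene", g), ("unknown", u))
                    if name in adjust_for)
        # Per stratum and obesity level: [total weight, weight * progression risk]
        s = strata.setdefault(key, {0: [0.0, 0.0], 1: [0.0, 0.0]})
        s[o][0] += w
        s[o][1] += w * p_progression(o, g, u)
    total = sum(s[0][0] + s[1][0] for s in strata.values())
    rd = 0.0
    for s in strata.values():
        stratum_rd = s[1][1] / s[1][0] - s[0][1] / s[0][0]
        rd += (s[0][0] + s[1][0]) / total * stratum_rd
    return rd


crude = risk_difference(set())
gene_only = risk_difference({"gene"})
both = risk_difference({"gene", "unknown"})

print(f"crude:                     {crude:.3f}")
print(f"adjusted for gene only:    {gene_only:.3f}")
print(f"adjusted for both factors: {both:.3f}")
```

Under these assumptions, adjusting for the measured genetic factor alone moves the estimate toward the assumed true value of 0.10 but does not reach it. Only adjustment for both factors removes the bias, and that is not possible when one of them is unknown or unmeasured.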
Using data from the Multicenter Osteoarthritis (MOST) Study, we explored this possibility by evaluating the effect of obesity, as an example of a chronic risk factor, on progression of knee ROA. Among knees eligible for progressive knee ROA, all known risk factors for ROA (e.g., female gender, knee injury, high BMD, and knee malalignment) were more prevalent among persons who were obese than among those who were not obese. Since obesity is a strong risk factor for knee ROA at baseline, as are many of the other factors examined, one would speculate that there must be other risk factors, not yet identified, that contribute to ROA development in non-obese people at baseline. If these unknown risk factors are also associated with ROA progression, they would bias the effect of obesity towards the null.