The stage of disease and general fitness of patients usually influence the treatments they receive. Because of this, tests of treatments and treatment comparisons need to ensure that the comparison groups are made up of patients who are as similar as possible. Without this assurance, any differences in the progress of patients receiving different treatments cannot confidently be ascribed to differential effects of treatments: they may simply reflect differences in the characteristics of patients receiving the different treatments. This is the rationale underlying random allocation (randomization) to treatment groups in randomized trials: it ensures that whether treatment is given or withheld is unrelated to factors influencing the prognoses of patients.
Sometimes, for one or more of a variety of reasons, the organization of randomized controlled trials (RCTs) presents challenges that are judged likely to be very difficult to overcome. One such example relates to the need to assess the effects of bone marrow transplantation in the treatment of acute myeloid leukaemia in children. Randomized comparisons of treatment with and without bone marrow transplantation in this form of leukaemia have been judged unlikely to be achieved successfully, yet it is clearly important to obtain reliable, unbiased estimates of the effects of transplantation—wanted and unwanted—because the disadvantages may outweigh any advantages of this invasive treatment.
In 1991, Richard Gray and Keith Wheatley reported an ingenious method for obtaining unbiased estimates of the effects of bone marrow transplantation without conducting a traditional randomized trial.1 They pointed out that unbiased comparisons could be made between child patients who had a genetically compatible sibling, and so in principle could receive a matched sibling bone marrow transplant, and other child patients who had no genetically compatible sibling, and so could not receive one. Because having or not having a genetically compatible sibling is a matter of chance (it is determined by random assortment of genes at the time of gamete formation and conception), this situation produces what is effectively a randomized comparison.
Whether a child with leukaemia belongs to the group with genetically compatible siblings or the group without such siblings will not be related to potential confounding factors such as disease stage and general fitness at the time of diagnosis. In a form of ‘intention-to-treat analysis’, Gray and Wheatley noted that an unbiased comparison meant comparing all the patients with a genetically compatible sibling with all the patients without such a sibling, regardless of whether or not every patient with a potential donor actually received a transplant (Figure 1).1
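The contrast between the two kinds of comparison can be illustrated with a small simulation. This is only a sketch under invented assumptions, not Gray and Wheatley's analysis: the function name `simulate_donor_comparison`, the 30% donor availability, the rule that only fitter patients with a donor are transplanted, and the survival probabilities are all hypothetical, and transplantation is deliberately given no effect at all on survival.

```python
import random


def simulate_donor_comparison(n_patients=100_000, seed=0):
    """Compare survival by donor availability and by treatment received.

    All numbers are illustrative assumptions. Transplantation has NO true
    effect here, so any apparent benefit is an artefact of the comparison.
    """
    rng = random.Random(seed)
    # Each cell holds [survivors, total patients].
    by_donor = {True: [0, 0], False: [0, 0]}
    by_treatment = {True: [0, 0], False: [0, 0]}

    for _ in range(n_patients):
        fitness = rng.random()             # prognostic factor in [0, 1)
        has_donor = rng.random() < 0.3     # chance event, unrelated to fitness
        # Confounding by indication: only fitter patients with a donor
        # actually receive a transplant.
        transplanted = has_donor and fitness > 0.5
        # Survival depends on fitness only, not on transplantation.
        survived = rng.random() < 0.3 + 0.4 * fitness

        for table, key in ((by_donor, has_donor), (by_treatment, transplanted)):
            table[key][0] += survived
            table[key][1] += 1

    def rate(cell):
        return cell[0] / cell[1]

    return {
        "donor_vs_no_donor": rate(by_donor[True]) - rate(by_donor[False]),
        "as_treated": rate(by_treatment[True]) - rate(by_treatment[False]),
    }


if __name__ == "__main__":
    for name, diff in simulate_donor_comparison().items():
        print(f"{name}: survival difference = {diff:+.3f}")
```

In this toy setting the comparison by treatment received suggests a benefit that does not exist, because transplanted patients were fitter to begin with, whereas the donor versus no-donor difference stays close to zero, as it should when the treatment has no effect.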
Gray and Wheatley appear to have been the first authors to refer to this particular way of capitalizing on chance events in nature to create unbiased comparison groups to assess the effects of a treatment. They called it ‘Mendelian randomization’. Several further studies have now been carried out using their design,2-4 including studies to assess the effects of treatment for acute lymphoblastic leukaemia.5-7 Some of these studies have confirmed that like is being compared with like in comparison groups defined in this way.3-6 They have also shown that there are differences in prognostic factors between groups defined by the treatment they received; differences that would confound a conventional observational analysis comparing different treatments.4-6
The basic design outlined in Figure 1 might be improved by taking into account the number of siblings each patient has.8 Patients with more siblings have a greater chance of having a genetically compatible donor, and therefore groups defined by having a compatible donor will differ according to average number of siblings and thus will differ by factors that may be related to prognosis. Indeed, a later study found that the number of siblings could itself be related to survival.9 The study by Gray and Wheatley (Gray, personal communication) and another more recent study4 applied an analysis restricted to patients with at least one sibling, but exact stratification or matching on number of siblings might be a more robust approach.
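The dependence of donor availability on family size can be made concrete with a short calculation. The sketch below assumes that each full sibling independently has a one-in-four chance of being genetically compatible; that figure is the usual Mendelian expectation for inheriting the same two parental haplotypes and is not stated in the text above. The names `P_MATCH_PER_SIBLING` and `prob_any_matched_sibling` are hypothetical.

```python
from fractions import Fraction

# Assumption (not from the article): each full sibling independently has a
# 1-in-4 chance of being a genetically compatible donor.
P_MATCH_PER_SIBLING = Fraction(1, 4)


def prob_any_matched_sibling(n_siblings: int) -> Fraction:
    """Probability that at least one of n siblings is a compatible donor."""
    if n_siblings < 0:
        raise ValueError("n_siblings must be non-negative")
    # Complement of "no sibling matches".
    return 1 - (1 - P_MATCH_PER_SIBLING) ** n_siblings


if __name__ == "__main__":
    for n in range(5):
        p = prob_any_matched_sibling(n)
        print(f"{n} sibling(s): {p} (about {float(p):.2f})")
```

Under this assumption the probability rises from 1/4 with one sibling to 7/16 with two and 37/64 with three, which is why a 'donor' group will tend to contain patients from larger families unless the analysis stratifies or matches on number of siblings.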
Mendelian randomization as a term has also recently been used to describe methods for obtaining unbiased estimates of causal associations in observational studies in aetiological epidemiology. In part, these developments reflect a response to well-publicized cases of conventional observational epidemiology having produced misleading information about supposed causal or protective factors, for example, for β carotene and cancer, vitamins C and E and coronary heart disease, and hormone replacement therapy and cardiovascular disease. Observational epidemiological studies suggested that these factors had important protective effects, but randomized controlled trials failed to confirm this. The probable reason for these discrepancies between analyses of observational data and randomized trials is that there is considerable confounding between, for example, dietary vitamin C or E intake or taking β carotene supplements and various behavioural and socio-economic factors related to increased risks of disease.
Mendelian randomization—the random assortment of genes from parents to offspring which occurs during gamete formation and conception—provides a method of assessing whether certain environmental exposures are causally related to a disease. The association between risk of a disease and a genetic variant that influences the exposure or mimics the biological link between a proposed exposure and disease is not generally susceptible to the ‘reverse causation’ or confounding that may distort interpretations of conventional observational studies.
A Dutch researcher, Martijn Katan, was an early exponent of what has since been termed Mendelian randomization. He was concerned that observational studies were suggesting that low serum cholesterol levels were associated with an increased risk of cancer, and thus that treatment to lower cholesterol could have detrimental effects.10 This association might be explained by early cancer lowering cholesterol levels (reverse causality) or by confounding factors, such as cigarette smoking, that are related both to future cancer risk and to lower circulating cholesterol levels.11
Katan pointed out12 that polymorphic forms of the apolipoprotein E (APOE) gene were related to different average levels of serum cholesterol. If low circulating cholesterol levels were indeed a causal factor for cancer, individuals with the genotype associated with lower average cholesterol levels should be expected to have higher cancer risk. If, however, reverse causation or confounding generated the association between low cholesterol levels and cancer, then no association would be expected between the APOE genotype and cancer. Individuals with lower cholesterol because of their genotype, rather than because existing clinically unrecognized cancers had lowered their cholesterol, would not have a higher risk of cancer; nor would there be substantial confounding between genotype-associated differences in cholesterol levels and lifestyle or socio-economic factors. Although Katan did not have any data to investigate these possibilities, he advocated a study design taking these considerations into account.12
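Katan's reasoning can be sketched as a simulation of the second scenario, in which the cholesterol and cancer association is produced entirely by confounding. Everything numerical here is an invented assumption for illustration: the function name `simulate_katan`, the allele and smoking frequencies, the size of their effects on cholesterol (in arbitrary mmol/L-like units), the 5.5 cut-off for 'low' cholesterol, and the cancer risks. Cholesterol is given no causal effect on cancer.

```python
import random


def simulate_katan(n_people=200_000, seed=1):
    """Cancer risk by cholesterol level and by cholesterol-lowering genotype.

    Illustrative assumptions only: smoking lowers cholesterol and raises
    cancer risk; cholesterol itself has NO causal effect on cancer.
    """
    rng = random.Random(seed)
    # Each cell holds [cancer cases, total people].
    by_genotype = {True: [0, 0], False: [0, 0]}
    by_low_chol = {True: [0, 0], False: [0, 0]}

    for _ in range(n_people):
        carrier = rng.random() < 0.3   # cholesterol-lowering variant
        smoker = rng.random() < 0.3    # confounder, independent of genotype
        cholesterol = 6.0 - 0.5 * carrier - 0.5 * smoker + rng.gauss(0, 0.7)
        cancer = rng.random() < 0.05 + 0.10 * smoker  # driven by smoking only

        for table, key in ((by_genotype, carrier),
                           (by_low_chol, cholesterol < 5.5)):
            table[key][0] += cancer
            table[key][1] += 1

    def risk(cell):
        return cell[0] / cell[1]

    return {
        "low_vs_high_cholesterol": risk(by_low_chol[True]) - risk(by_low_chol[False]),
        "carrier_vs_noncarrier": risk(by_genotype[True]) - risk(by_genotype[False]),
    }


if __name__ == "__main__":
    for name, diff in simulate_katan().items():
        print(f"{name}: cancer risk difference = {diff:+.4f}")
```

In this toy population, people with low measured cholesterol have more cancer because more of them smoke, while carriers of the cholesterol-lowering variant do not, which is the pattern Katan argued would point to confounding rather than causation.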
As far as I am aware, Katan's intriguing suggestion has not yet been explicitly applied to address the important question he posed, although sporadic reports relating APOE to risk of specific cancers continue to appear.13-17 However, several examples where the phenotypic effects of polymorphisms are well-documented provide encouraging evidence of the explanatory power of Mendelian randomization.8,18,19
Gregor Mendel (1822-1884) concluded from his hybridization studies with pea plants that ‘the behaviour of each pair of differentiating characteristics [such as the shape and colour of seeds] in hybrid union is independent of the other differences between the two original plants’.20 This formulation was actually the only regularity that Mendel referred to as a ‘law’. In Carl Correns' 1900 paper (one of a trio appearing that year which are considered to represent the rediscovery of Mendel) he refers to this as Mendel's Law.21,22
Morgan23 discusses independent assortment and refers to this process as being realized ‘whenever two pairs of characters freely Mendelize’. Morgan's use of Mendel's surname as a verb did not catch on, but Morgan later christened this Mendel's Second Law,24 and it has been known by this name, or as ‘The Law of Independent Assortment’, ever since. The law suggests that inheritance of one trait is independent of—that is, randomized with respect to—the inheritance of other traits.
The analogy with a randomized controlled trial will clearly be most applicable to parent-offspring designs investigating the frequency with which one of two alleles from a heterozygous parent is transmitted to offspring with a particular disease. However, at a population level, traits influenced by genetic variants are generally not associated with the social, behavioural and environmental factors that confound relationships observed in conventional epidemiological studies. Thus while the ‘randomization’ is approximate and not absolute in genetic association studies, empirical observations suggest that it applies in most circumstances.25,26
As discussed above, the term ‘Mendelian randomization’ itself was introduced by Gray and Wheatley in a somewhat different context, in which advantage is taken of the random assortment of genetic variants at conception to provide an unconfounded study design for estimating treatment effects for childhood malignancies.1,27 However, the term has recently become widely used with the meaning ascribed to it here.
The notion that genetic variants can serve as an indicator of the action of environmentally modifiable exposures has been expressed in many contexts. For example, since the mid-1960s various investigators have pointed out that the autosomal dominant condition of lactase persistence is associated with milk drinking. Associations of lactase persistence with osteoporosis, bone mineral density or fracture risk thus provide evidence that milk drinking protects against these conditions.28,29 In a related vein, it was proposed in 1979 that as N-acetyltransferase pathways are involved in the detoxification of arylamine, a potential bladder carcinogen, the observation of increased bladder cancer risk among people with genetically determined slow acetylator phenotype provided evidence that arylamines are involved in the aetiology of the disease.30
Since that time various commentators have pointed out that associations of genetic variants of known function with disease outcomes provide evidence about aetiological factors.31-35 However, these commentators have not emphasized the key strengths of Mendelian randomization: the avoidance of confounding, of bias due to reverse causation or reporting tendency, and of the underestimation of risk associations due to variability in behaviours and phenotypes.18
These key concepts were present in Martijn Katan's 1986 Lancet letter, in which he suggested that genetic variants related to cholesterol level could be used to investigate whether the observed association between low cholesterol and increased cancer risk was real,12 and in the understanding of Honkanen and colleagues of how lactase persistence could better characterize the difficult-to-measure environmental influence of calcium intake than could direct dietary reports.36 Since 2000 there have been several reports using the term ‘Mendelian randomization’ in the way it is used here,8,37-40 and its use is becoming widespread. The fact that Mendelian randomization is one of a family of techniques referred to as ‘instrumental variable’ approaches for obtaining robust causal inferences from observational data has also been recognized,41 and statistical approaches to instrumental variables analysis developed within econometrics have been applied to data from Mendelian randomization studies.25,42
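For readers unfamiliar with instrumental variables, the simplest estimator of this kind is the ratio (Wald) estimator. It is shown here only as an illustration and is not necessarily the method used in the studies cited above. The notation is my own: G is a binary indicator of genotype, X the exposure it influences, and Y the outcome.

```latex
\[
\hat{\beta}_{\mathrm{IV}}
  \;=\; \frac{\hat{\beta}_{GY}}{\hat{\beta}_{GX}}
  \;=\; \frac{E[Y \mid G = 1] - E[Y \mid G = 0]}
             {E[X \mid G = 1] - E[X \mid G = 0]}
\]
```

The numerator is the difference in outcome between genotype groups and the denominator is the difference in exposure between the same groups. The ratio can be read as the effect of the exposure on the outcome only under the usual instrumental variable assumptions: the genotype is related to the exposure, it is unrelated to confounders of the exposure and outcome relationship, and it affects the outcome only through the exposure.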
Competing interests: None declared
Contributorship: GDS is the sole contributor
Acknowledgments: I am grateful to Richard Gray, Keith Wheatley and Jan Vandenbroucke for comments on earlier drafts of this commentary.
Additional material for this article is available from the James Lind Library website [http://www.jameslindlibrary.org], where this paper was previously published.