The epidemiology of obesity suggests that, for the majority of individuals, the disorder arises from an interaction between genetic predisposition and lifestyle behaviors such as dietary intake and physical activity. Unravelling the molecular basis of such interactions is complex but is becoming a realistic proposition as evidence emerges from whole genome association studies of genetic variants that are definitively associated with obesity. A range of possible study designs is available for investigating gene–lifestyle interaction, and the strengths and weaknesses of each approach are discussed in this article. Given the likely small main effect of common genetic variants and the difficulties in demonstrating associations of lifestyle factors with future risk of obesity, we would favor an analytical approach based on the clear specification of prior probabilities to reduce the likelihood of false discovery. Mixed approaches combining data from large-scale observational studies with smaller intervention trials may be ideal. In designing new studies to investigate these issues, a key choice is how precisely to quantify the important but difficult-to-measure lifestyle behaviors. It is clear from power calculations that an approach based on enhancing precision of measurement of diet and physical activity is critical.
The high heritability of obesity coupled with the rapid increase in prevalence suggests that a combination of genetic and behavioral factors is critical to the etiology of obesity (1). It is easy to propose such a model for the development of obesity, but it is altogether much harder to identify the molecular mechanisms that underlie such a model. In this article, we review overall strategies and possible epidemiological study designs for investigating how genetic and behavioral risk factors combine to lead to excess weight gain.
Case-only studies have been proposed as an efficient design for studying gene–environment interaction. The efficiency comes from only requiring cases which are, in general, easier to collect in epidemiological studies than appropriate controls. In the analysis of a case-only study, the parameter of interest is the interaction term between a genetic marker and a behavioral risk factor (2). One can determine whether such an interaction is or is not present, but the design is unable to identify the individual and combined effects of the genetic and behavioral risk factors (3). This is an important weakness since most of our interest is not in whether or not there is an interaction per se but rather in how genetic and behavioral factors combine to determine risk. The mathematics of the case-only design depends on the critical assumption that the genetic and behavioral factors are entirely independent, an assumption that may not hold for behaviors such as dietary intake or habitual physical activity which themselves may be partially determined by genetic factors.
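The case-only estimate described above can be sketched in a few lines. This is a minimal illustration rather than an analysis taken from the cited work: the function name and the example counts are hypothetical, and the result can be read as a multiplicative interaction only if genotype and behavior are independent in the source population.

```python
import math

def case_only_interaction_or(g1e1, g1e0, g0e1, g0e0):
    """Cross-product odds ratio from a 2x2 table of CASES ONLY.

    g1e1: carriers who are exposed      g1e0: carriers who are unexposed
    g0e1: noncarriers who are exposed   g0e0: noncarriers who are unexposed

    Assumption (not verifiable from the cases alone): genotype and
    exposure are independent in the population that produced the cases.
    """
    odds_ratio = (g1e1 * g0e0) / (g1e0 * g0e1)
    # Woolf (log-based) standard error for a 2x2 odds ratio
    se_log = math.sqrt(1 / g1e1 + 1 / g1e0 + 1 / g0e1 + 1 / g0e0)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, (lower, upper)

# Hypothetical counts among 1,000 obese cases
estimate, interval = case_only_interaction_or(120, 180, 200, 500)
print(f"interaction OR = {estimate:.2f}, 95% CI = {interval[0]:.2f} to {interval[1]:.2f}")
```

The function returns a single number for the interaction and nothing for the separate effects of genotype or behavior, which is the limitation of the design noted above.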
The traditional cross-sectional case–control design has proved to be effective for analysis of the genetic basis of disease, particularly when coupled with advances in genotyping technology such as the modern generation of genome-wide chips. Indeed, the past year has seen a rapid acceleration in the number of genetic variants that have been shown to be associated with disease, largely as a result of the combination of new genotyping technology with a coordinated effort to increase initial and replication study sample sizes (4). While the cross-sectional nature of the case–control design is not a limitation for genetic studies (except for those focusing on epigenetic phenomena), it is a major limitation for studies of behavioral determinants of conditions such as obesity, since the lifestyle exposure and outcome are assessed simultaneously, leading to major problems of recall bias. Thus, for studies that aim to examine the combined effects of genes and lifestyle behaviors, prospective designs are required.
The classical advantage of the cohort design is linked to its longitudinal nature, so that measurement of exposure predates development of disease, thus reducing the likelihood that assessment of the lifestyle exposure is biased differentially by disease outcome. For diseases such as type 2 diabetes, such problems are considerable, as the initial clinical efforts to manage the condition usually involve education and advice about activity and dietary intake, which would therefore bias the recall of these behaviors if people with diabetes were recruited to a cross-sectional study. The situation with obesity as the outcome of interest is similar and may even be more complicated, since there is potential for biases other than that simply due to recall. For example, reported dietary intake of total energy may be systematically different according to the degree of obesity (5). The use of a longitudinal study design avoids the biases due to differential recall of lifestyles but is inefficient for rarer diseases, since many individuals have to be recruited and studied prospectively for relatively few to be informative as incident cases. It is also inefficient if the costs of assessing exposure on a full cohort are prohibitive, as might be the case, for example, for some nutritional biomarkers. The classical solution to this issue is a hybrid design that combines the efficiency of a case–control study with the lower likelihood of bias in a cohort. In a nested case–control study, exposure data are collected on an entire cohort but are not analyzed at baseline; they are processed and analyzed at follow-up only for those individuals who have become cases plus a selection of controls.
This prospective approach is the basis for many current initiatives, such as the UK BioBank, aimed at building cohort studies that can investigate how genes and lifestyles combine to impact on disease risk (6). Within such studies there are multiple design questions related to issues such as how lifestyle exposures are measured and how large studies should be. In the following section, we describe some critical issues that impact on the study design for investigating gene–lifestyle determination of weight gain.
Given the virtually infinite combination of genetic and lifestyle factors that could be studied in any study, the probability that seemingly positive results are truly positive is low unless we reduce the number of hypotheses tested (7). A logical approach would be to initially limit our attention to genetic and lifestyle factors that have previously been demonstrated to be associated with the outcome. In the case of the potential lifestyle factors one would additionally require that the level of causal inference is high, being supported by other forms of evidence to avoid alternative explanations for the observed association, e.g., confounding. In this section of the article, we focus on the difficulties of demonstrating the causal relationship between behavioral risk factors and weight gain. We will use the example of the association between physical activity and weight gain, but the same principles apply to other risk factors such as dietary intake.
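The dependence of a finding's credibility on the prior probability can be made concrete with Bayes' rule. The sketch below is a standard calculation offered as an illustration; it is not necessarily the method of reference 7, and the function name and the numerical inputs are assumptions chosen for the example.

```python
def prob_true_positive(prior, power, alpha):
    """Probability that a 'significant' result reflects a real effect.

    prior: assumed probability, before the study, that the effect is real
    power: probability of a significant result when the effect is real
    alpha: probability of a significant result when the effect is absent
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)

# Hypothetical priors: an unselected gene-lifestyle pair versus a pair
# in which both factors have established main effects.
for label, prior in [("uninformed search", 0.001), ("informed hypothesis", 0.1)]:
    ppv = prob_true_positive(prior, power=0.8, alpha=0.05)
    print(f"{label}: P(true | significant) = {ppv:.3f}")
```

With the same power and significance threshold, raising the prior from 0.001 to 0.1 moves the probability that a positive result is real from under 2% to 64%, which is the argument for restricting attention to factors with prior evidence.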
A systematic review in 2000 by Fogelholm and Kukkonen-Harjula described data from observational cohort studies on physical activity and weight gain in adults and concluded that there was inconsistent evidence that baseline physical activity predicted subsequent weight gain (8). They noted, however, that the association between weight gain and change in activity or activity at follow-up was stronger, although still modest. In an updated review (9), we identified an additional 14 observational studies on physical activity and weight gain in adults, with more recent studies reporting associations in the expected direction. Overall, the magnitude of the effect was small, even in those studies using objective measurement of physical activity (10,11). Similar observations have been made in studies involving children. A review in 2000 suggested that four out of a total of seven studies showed that physical activity was associated with less weight gain in children (12). Our review identified a further 16 papers (9). Overall, as in the adult studies, the results were mixed. Even when an association was demonstrable, the measures of effect tended to be small.
There are several possible explanations for these results. First, physical activity may be an important factor in determining the level of future weight gain, but the true association is difficult to detect because of measurement error in the assessment of physical activity. Second, the true association could be the other way around, i.e., higher weight at baseline leads to decline in physical activity levels over time (13). This issue of reverse causality is traditionally described as being less of an issue in a cohort study, but in reality this is not the case if exposure and outcome change simultaneously, making it difficult to determine cause and effect. Finally, self-reported physical activity could be a marker for a general healthy lifestyle, and thus the association between activity and development of obesity could be affected by confounding. The latter two of these issues can only be addressed by alternative experimental designs involving randomization. It is not within the scope of this review to describe trial evidence demonstrating how changes in physical activity impact on weight gain, but suffice it to say that the evidence is weaker than we might suppose (9). Our focus is instead on the issue of measurement error, which can be addressed in cohort studies by choices about how we elect to measure lifestyle behaviors.
Physical activity and dietary behavior are not simple to assess in epidemiological studies, as they are complex, multidimensional behaviors. Many physical activity instruments used in epidemiological studies, for example, are rather simple and reduce this complex behavior to a global self-report index (14). Although this may suffice to demonstrate an overall association between activity and an outcome, it leaves issues of dose–response and type of activity that is most closely associated with the outcome unresolved (15). Even when questionnaires are more extensive and assess the various domains—transportation, domestic life, occupation, and recreation—they are still relatively imprecise as a measure of total energy expenditure (16,17). Such imprecision can be adjusted for in epidemiological studies, provided the degree of measurement error is known and it is not differential with respect to outcome (18). This measurement error correction or adjustment for regression dilution is of considerable utility in providing more accurate estimation of the true underlying association between an exposure and an outcome in a large epidemiological study where it is not feasible for practical reasons to assess exposure precisely. In this situation one relies on undertaking a calibration study within the cohort to determine the degree of regression dilution bias (19). The critical term here is the correlation between the observed measure of exposure and the true level which can be estimated by a calibration study employing a gold standard technique with repeated measurement. The degree of regression dilution of the risk ratio is a function of the square of this term. 
Most physical activity questionnaires have a correlation with a gold standard assessment of energy expenditure of around 0.3 (16,17) and thus may introduce a considerable degree of attenuation of the true association when used in an epidemiological study: the observed risk ratio is approximately the true risk ratio raised to the power of 0.3 squared, or ~0.1. This measurement error can be adjusted for, and this approach is highly appropriate for studies of the link between exposure and outcome alone. However, the ability to adjust does little to resolve the problems of power to detect gene–lifestyle interaction, an issue to which we return later in this article.
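The attenuation described above, in which the observed association is the true association raised to the power of the squared correlation, can be worked through numerically. This is a minimal sketch under the classical nondifferential error model; the function names and the true risk ratio of 2.0 are hypothetical and chosen only for illustration.

```python
def attenuated_risk_ratio(true_rr, rho):
    """Risk ratio expected when exposure is measured with error.

    rho is the correlation between the observed and the true exposure;
    under the assumed classical error model the log risk ratio is
    multiplied by rho squared.
    """
    return true_rr ** (rho ** 2)

def corrected_risk_ratio(observed_rr, rho):
    """Invert the attenuation once rho is known from a calibration study."""
    return observed_rr ** (1.0 / rho ** 2)

true_rr = 2.0  # hypothetical true risk ratio
for label, rho in [("questionnaire", 0.3), ("objective monitor", 0.8)]:
    observed = attenuated_risk_ratio(true_rr, rho)
    print(f"{label} (rho = {rho}): observed RR = {observed:.2f}")
```

The correction recovers the point estimate but not the information lost to noise, which is why adjustment does not restore power.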
Given the level of measurement imprecision in assessing physical activity using self-report instruments, it may be appropriate to consider alternative methods such as objective assessment. A number of alternative methods are available, but in essence they involve movement sensing by accelerometry, heart rate monitoring coupled with individual calibration by energy expenditure, or a combination of the two methods (20,21,22). All of these approaches have different merits and are more or less appropriate to different settings but in general the level of precision of physical activity energy expenditure is an order of magnitude greater than for a self-report questionnaire-based method, with a correlation with the true underlying measure of physical activity energy expenditure of between 0.7 and 0.85 rather than 0.3. One could make similar arguments for the assessment of dietary factors with relation to classical food frequency methods and nutritional biomarkers (23). Given the general observation that objective measurement is much more precise than self-report but less feasible on a mass scale, the question of the appropriate scale for studies to examine the combined genetic and lifestyle determinants of disorders such as obesity becomes critical.
Many studies that are currently being designed to investigate the combined effects of genes and lifestyles are large, long-term, and consequently very expensive. The number of incident cases needs to be high and even for relatively common disorders this necessitates establishing a cohort of around 500,000 people who are followed up for at least a decade. In such a situation, the scale of the study means that compromises have to be made as the design needs to be capable of addressing multiple exposure-disease hypotheses, lest the study become uneconomic to funding agencies. With a nested case–control design, for example, a sample size yielding at least 7,500 incident cases is required to achieve 80% power for detecting a moderate interaction effect (θ = OR_{E|G=1}/OR_{E|G=0} = 1.5, the ratio of the exposure odds ratio in carriers of the variant to that in noncarriers) with a common gene variant (allele frequency = 0.25) and significance at 5%. A stronger interaction effect (θ = 3) would be detectable in a smaller study, but such strong interactions are rare (24). Even for relatively common disorders such as type 2 diabetes, a cohort of 500,000 people would need to be studied for 5 years to generate sufficient incident cases. By their very size these types of studies tend to have to compromise on the measurement precision for factors such as diet and physical activity. When the outcome of interest is not truly an incident condition, but rather change in a continuously distributed variable, as would be the case for weight, body mass index (BMI), or fat mass, then one also has issues with the precision of the phenotype. In an analysis of the power to detect gene–lifestyle interaction in studies of continuously distributed outcomes, we demonstrated that a study of under 10,000 individuals which employs reasonably accurate measurement of exposure and outcome would be equally powered to detect an interaction as one of more than 150,000 individuals that employed less precise measures (25).
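The trade-off between study size and measurement precision can be illustrated with a rough rule of thumb. The sketch below is not the power calculation of reference 25 and does not reproduce its figures. It assumes a continuous outcome, small effects, nondifferential error, and genotype and exposure that are independent and centered, in which case the required sample size scales approximately with the inverse of the squared correlations between measured and true values. The function name and the baseline sample size are hypothetical.

```python
def inflated_sample_size(n_perfect, rho_exposure, rho_outcome=1.0):
    """Approximate sample size needed once measurement error is allowed for.

    n_perfect:    sample size that would suffice with error-free measures
    rho_exposure: correlation of measured with true lifestyle exposure
    rho_outcome:  correlation of measured with true outcome
    Rough approximation (an assumption of this sketch):
        n is proportional to 1 / (rho_exposure^2 * rho_outcome^2)
    """
    if not (0 < rho_exposure <= 1 and 0 < rho_outcome <= 1):
        raise ValueError("correlations must lie in the interval (0, 1]")
    return n_perfect / (rho_exposure ** 2 * rho_outcome ** 2)

n_perfect = 5_000  # hypothetical requirement with perfect measurement
for label, rho in [("self-report questionnaire", 0.3), ("objective monitoring", 0.8)]:
    needed = inflated_sample_size(n_perfect, rho)
    print(f"{label}: about {round(needed):,} participants")
```

Under these assumptions, moving from a correlation of 0.3 to 0.8 for the exposure alone reduces the required sample size roughly sevenfold, before any gain from more precise outcome measurement is counted.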
Given the cost of genotyping on such a scale, it would seem logical to consider complementary approaches to the large-scale cohorts employing detailed assessment of exposure and more intensive phenotyping.
In addition to focusing on lifestyle factors that have demonstrably been associated with outcome, it would be logical to limit the number of genetic hypotheses that are tested to reduce the likelihood of false discovery. This is not to suggest that one would not employ some form of mass genotyping platform for reasons of typing efficiency, but rather that the analysis should take account of the different prior likelihoods of association within a Bayesian framework. One could determine such a prior probability based on biological evidence of plausibility of interaction or on previous evidence of genetic main effect. Until recently the latter would have been a relatively limited strategy since the number of genes proven to be associated with obesity was small, with only 22 genes being consistently associated with obesity-related phenotypes in at least five studies, and only 12 having more than 10 replications (26). Even when associations of variants in strong candidate genes such as the melanocortin 4 receptor (MC4R) have been examined in large-scale meta-analyses, the level of association is small and the level of significance modest (27). However, the advent of large-scale collaborative efforts involving genome-wide association methods has given new impetus to the discovery of the genetic basis of common obesity. The first proven association from this approach was that of variants in the FTO gene (28). Although this association is statistically strong, with multiple studies replicating the initial observation, the magnitude of the association itself is small, with each copy of the variant allele being associated with approximately a 0.1 increase in log BMI z-score. Power calculations to inform subsequent studies have suggested that only studies of more than 10,000 individuals would be able to detect the association of minor alleles with a frequency of more than 20% with BMI when the effect size was between 0.5 and 0.1 log BMI z-score units.
Although the strategy of increasing study size through collaboration and meta-analysis is likely to be effective in finding confirmed new associations, it appears highly probable that the effect sizes detected will be small. At the level of biological understanding, the magnitude of effect is of lesser importance, since the confirmed observation of an association of a previously unthought-of gene such as FTO with obesity opens up new avenues for exploring mechanism, whatever the size of association (29). However, at the population level, the small size of association not only limits the clinical utility of the finding but also makes the task of detecting interaction with lifestyle behaviors that much harder.
The demonstration that the FTO gene is convincingly associated with obesity opens up several forms of follow-up study. First, the gene variant may be used to examine the causal nature of the relationship between obesity and clinical outcomes where this is in doubt, such as some cancers. This Mendelian Randomization or instrumental variable approach is limited by the effect size of the gene variant on obesity and the strength of the association between obesity and the clinical outcome of interest (30). The FTO association was originally discovered in a study of type 2 diabetes, and as the association between the gene variant and obesity fully explains the association between the variant and type 2 diabetes in an adjusted model, this can be used as an argument that the relationship between obesity and type 2 diabetes is causal. Of course, there was little doubt about this even before this study, not only because of the availability of randomized controlled trial data (31), but also because of the very high magnitude of association between obesity and diabetes (32). It is this very strong level of association that makes it possible to use the instrumental variable approach because as Frayling et al. (28) demonstrated, the magnitude of the gene effect is itself quite small. Thus although this Mendelian Randomization approach might seem appealing, it is unlikely to be successful in resolving uncertainties about causality for outcomes of obesity where the likely level of association is much weaker than for type 2 diabetes.
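The reason a small gene effect limits the instrumental variable approach can be seen in the simplest estimator of this kind, the ratio (Wald) estimator. The sketch below is illustrative only: the function name and all numerical inputs are hypothetical, and the standard error uses a first-order approximation that ignores uncertainty in the gene–obesity association.

```python
def wald_ratio(beta_gene_outcome, se_gene_outcome, beta_gene_exposure):
    """Ratio (Wald) instrumental variable estimate.

    beta_gene_outcome:  per-allele effect of the variant on the outcome
    se_gene_outcome:    standard error of that effect
    beta_gene_exposure: per-allele effect of the variant on the exposure
                        (for example, BMI z-score units)
    Returns the implied causal effect of exposure on outcome and an
    approximate standard error (first-order; treats beta_gene_exposure
    as known without error).
    """
    estimate = beta_gene_outcome / beta_gene_exposure
    standard_error = se_gene_outcome / abs(beta_gene_exposure)
    return estimate, standard_error

# Hypothetical inputs: a variant that shifts the exposure by 0.1 units
estimate, se = wald_ratio(beta_gene_outcome=0.02, se_gene_outcome=0.01,
                          beta_gene_exposure=0.1)
print(f"causal estimate = {estimate:.2f}, approximate SE = {se:.2f}")
```

Because the standard error is divided by the gene effect on the exposure, a per-allele effect of 0.1 inflates the uncertainty in the gene–outcome association tenfold, so only outcomes that are strongly related to obesity yield a usable estimate.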
The observation of the association of FTO with obesity also opens questions about whether the impact of the variant on obesity is manifest through an effect on dietary intake or energy expenditure, or both. Energy intake is difficult to assess in free-living individuals even with techniques involving weighed measurement of food (33). In the context of monogenic causes of obesity in children, studies have been possible on ad libitum energy intake when children are offered unlimited food availability (34). These have demonstrated clear evidence that individuals with mutations in leptin (LEP), neurotrophic tyrosine kinase receptor type 2 (TRKB), brain-derived neurotrophic factor (BDNF), and MC4R genes have elevated energy intake compared to controls. In the case of FTO, the magnitude of effect on energy intake is likely to be much smaller, if it is present at all, and it is highly unlikely that one could observe such a small effect in a study that used imprecise measures. It is therefore probable that one would need to conduct smaller, more intensive physiological studies in which special measurement of behaviors such as ad libitum energy intake were made. For common variants such as FTO, it would be possible to conduct such studies and then undertake the genotyping on all people recruited. For rarer variants, it may be more efficient, because of the expense of the phenotyping, to create a framework for selecting people for more detailed physiological studies on the basis of genotype. Various examples now exist of biobanks that have the requisite structure and ethical approvals to allow studies to be conducted but, to date, few examples of the scientific output from this approach have been published (35).
The possibility of selecting individuals by genotype not only makes it possible to undertake physiological studies to generate information about the biological pathway to the outcome of interest, but also allows the design of experimental studies that can investigate whether there is a differential response to intervention by genotype for the major outcome of interest. This is a question that is highly relevant clinically, as the demonstration that people with varying genetic background responded differently to a lifestyle intervention would be a stepping stone to the introduction of targeted intervention. However, the prospect of selecting people by genotype for such a study is limited, as one would need a very strong prior likelihood of success before embarking on an expensive long-term trial that followed people up to a relevant clinical outcome. Much more likely is the retrospective examination of differential response by genotype in existing lifestyle intervention studies. The problem here is one of power. In the case of type 2 diabetes, the strongest genetic variant is transcription factor 7-like 2 (TCF7L2) (36). The impact of this variant on risk of progression to diabetes has been examined in the large randomized controlled trial, the Diabetes Prevention Program, which randomized high risk individuals with impaired glucose tolerance to metformin, lifestyle, or placebo arms (37). Even though it was possible to show in this sizeable study with 3,234 people randomized that there was a materially different impact of the genetic variant on risk of progression to diabetes by randomized group, the interaction term itself was not statistically significant. This suggests that this very strong study design should be restricted to the examination of a few key hypotheses. A combined approach using large-scale observational epidemiological studies to inform which gene–lifestyle interactions to examine in intervention studies would be appropriate.
The task of designing and conducting appropriate studies for investigating how genes and lifestyle factors combine to lead to common disease holds considerable difficulties. Given the high probability of false positivity for results that emerge from an uninformed search for interaction, it would seem logical to focus our attention on lifestyle factors that have previously been demonstrated to be associated with the outcome and that are highly likely to be causally associated. Even though one might employ genotyping methods that efficiently capture genetic variation, it would be preferable to concentrate analytical attention on genetic variants that have previously been shown to be associated or for which there is a high level of biological plausibility for interaction with a specific lifestyle factor. In the case of development of obesity, this task is harder than might first seem apparent, since lifestyle factors such as physical activity are only weakly associated with future risk of weight gain and, until recently, relatively few genetic variants had been confirmed to be associated with obesity. The approach of establishing large cohorts with follow-up for disease end points holds considerable potential as a framework for establishing future nested case–control studies, but the ability to detect gene–lifestyle interactions will be critically dependent upon the investment that is made in the precision of measurement of the important but difficult-to-measure lifestyle behaviors. This is particularly important at the outset of such a study since poor decisions in the design stage cannot be remedied later when it comes to the analysis. There are, however, limits to what methods can be employed in very large studies, and there is a strong case for smaller, more focused studies where both the behavioral exposures and phenotypic outcome are measured with greater precision.
This is especially true for conditions such as development of obesity, which can be expressed not just as risk of progression to a discrete end point but also as a change in a continuously distributed parameter. These observational study designs need to be complemented by smaller physiological studies to investigate mechanism, which may require the establishment of biobanks to allow recruitment by genotype for reasons of efficiency. Finally, the analysis of existing intervention studies to determine whether response to lifestyle change is differential by genotype is an exciting complement to the observational studies, but for reasons of power it needs to be employed only for a few highly plausible gene–lifestyle interactions so that the probability of false discovery is reduced.
This publication was sponsored by the National Cancer Institute (NCI) to present the talks from the “Gene–Nutrition and Gene–Physical Activity Interactions in the Etiology of Obesity” workshop held on 24–25 September 2007. The opinions or assertions contained herein are the views of the authors and are not to be considered as official or reflecting the views of the National Institutes of Health.
Disclosure: The authors declared no conflict of interest.