The study provides an application of genetic instruments for studying the effects of behaviors on health. The findings generally support the utility of employing genetic instruments for obtaining consistent estimates of maternal risk behavior effects on child health. Accurate estimates of the “causal” behavioral effects are needed for designing effective prevention programs for adverse child health outcomes and devising public policies to improve child health. To our knowledge, this application is among the first to use genetic instruments for assessing maternal behavior effects on infant health. The paper also illustrates an application of instrumental variables using data from case-cohort or case-control designs and ways of interpreting the behavior effects in such situations.
The employed instruments are predictive of cigarette smoking and do not appear to be related to health outcomes (OFC) through unobserved pathways based on the literature or statistical tests (including the inability to reject the over-identification restrictions at high critical values and the observation of similar smoking effects when excluding other relevant behavioral factors from the model). Of course, it is still possible that the instruments may somehow be related to OFC through unobserved pathways, as is the case for any application using genetic instruments. As further knowledge is obtained on the genes’ functions and pathways, it is important to reevaluate the evidence for the exogeneity of the instruments and its implications for the results. One limitation of the instruments employed in this study that may be relevant to other applications is that instruments may have “weak” (although statistically significant) effects on behaviors. This is not surprising given the complex etiologies of risk behaviors, which may involve several genetic, economic and psychosocial factors, and given that large datasets with genetic instruments are unavailable for many applications, especially for studying maternal behavior effects. In this case, it is important to employ weak-instrument robust inference methods such as those presented above.
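As an illustration of what weak-instrument robust inference can look like, the following is a minimal sketch of an Anderson-Rubin confidence set for a single instrument in a linear model. It is not necessarily the method used in this study, and the function names, the grid of candidate effects and the simulated data are all hypothetical. For each candidate effect beta0, the sketch regresses y - beta0*x on the instrument and keeps beta0 if the instrument's slope is not significantly different from zero; 3.84 is the approximate 5% critical value of a chi-square distribution with one degree of freedom.

```python
import random


def ar_statistic(z, x, y, beta0):
    """Squared t-statistic for the slope of z in a regression of (y - beta0*x) on z."""
    n = len(z)
    r = [yi - beta0 * xi for xi, yi in zip(x, y)]
    mz, mr = sum(z) / n, sum(r) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    szr = sum((zi - mz) * (ri - mr) for zi, ri in zip(z, r))
    slope = szr / szz
    rss = sum((ri - mr - slope * (zi - mz)) ** 2 for zi, ri in zip(z, r))
    if rss == 0.0:
        # Perfect fit: the slope is either exactly zero or "infinitely" significant.
        return 0.0 if slope == 0.0 else float("inf")
    return slope ** 2 / (rss / (n - 2) / szz)


def ar_confidence_set(z, x, y, grid, crit=3.84):
    """Candidate effects in `grid` that the Anderson-Rubin test does not reject."""
    return [b for b in grid if ar_statistic(z, x, y, b) <= crit]


if __name__ == "__main__":
    # Hypothetical data: a weak binary instrument and an unobserved confounder u.
    rng = random.Random(1)
    z, x, y = [], [], []
    for _ in range(5000):
        zi = 1.0 if rng.random() < 0.3 else 0.0
        ui = rng.gauss(0.0, 1.0)
        xi = 0.15 * zi + ui + rng.gauss(0.0, 1.0)  # weak first stage
        z.append(zi)
        x.append(xi)
        y.append(1.0 * xi - ui + rng.gauss(0.0, 1.0))
    grid = [b / 10 for b in range(-50, 51)]
    kept = ar_confidence_set(z, x, y, grid)
    print("candidate effects not rejected:", kept[:1], "...", kept[-1:])
```

Because the test never divides by the first-stage coefficient, its size does not depend on instrument strength; the price is that the resulting set can be very wide, or even unbounded, when the instrument is weak.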
Genetic instruments provide a unique source of behavior variation to identify “causal” behavioral effects. In applications where the genetic instruments may be “weak”, the benefits of the genetic instrumental variables approach may outweigh the inference challenges due to “weak instruments” given that weak-instrument robust inference methods are available for most IV estimators. This is particularly relevant when non-genetic instruments are not available or suffer from theoretical limitations. While prices/taxes and other area-level variables are commonly used as instruments for smoking and other risk behaviors, there are significant theoretical and empirical challenges in employing these instruments as described above. We are limited to a small set of already genotyped variants as instruments for smoking in this study. These instruments are considered “weak” as discussed above. Future studies are expected to have better access to stronger instruments as the genetic etiologies of behaviors are further revealed and genotypic data become more widely available. Recent research studies have identified new variants that have strong effects on smoking. Of these, variants in CHRNA3
are the most promising and are strong candidates to be used as instruments (Liu et al. 2010). Future studies are needed to formally evaluate the utility of these variants as instruments for smoking.
The study results suggest that maternal cigarette smoking may substantially increase the child’s OFC risk, and that this effect may be significantly underestimated in analyses that ignore unobserved confounders. If our estimates are correct, the prevention of all cigarette smoking at the population level may reduce OFC incidence by more than 50%. The study provides further evidence that direct adjustment for observed confounders may be insufficient for consistent estimation of maternal behavioral effects due to self-selection into these behaviors based on unobserved factors that also affect child health.
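The link between a relative effect and a population-level reduction of this kind can be sketched with Levin's population attributable fraction. This is a standard formula and not necessarily the calculation behind the figure above; the symbols $p_e$ (prevalence of maternal smoking in the population) and $RR$ (relative risk of OFC for exposed versus unexposed children) are notation introduced here for illustration.

```latex
\[
\mathrm{PAF} \;=\; \frac{p_e\,(RR - 1)}{1 + p_e\,(RR - 1)},
\qquad
\mathrm{PAF} > 0.5 \;\Longleftrightarrow\; p_e\,(RR - 1) > 1 .
\]
```

Under this formula, a reduction of more than 50% requires the product of smoking prevalence and excess relative risk to exceed one, which shows how strongly such a projection depends on the size of the estimated smoking effect.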
The results are consistent with favorable self-selection into smoking based on unobserved characteristics that may reduce the OFC risk. In other words, women who smoke before and during pregnancy may, on average, have higher rates of certain unobserved “baseline” characteristics that reduce OFC risk, and ignoring these characteristics in classical single-equation models therefore results in underestimating the harmful smoking effects. These characteristics may include favorable family and child health history and lower maternal baseline health risks. Such factors may increase maternal propensity to smoke, but may relate to unobserved “health endowments” that reduce OFC risk such as favorable genetic, economic or psychosocial factors. This may seem counterintuitive given that most observable characteristics suggest adverse self-selection into smoking, with less education, unemployment, alcohol drinking, underweight, not using vitamins and not planning the pregnancy being positively correlated with cigarette smoking (see Table S3). The only exception is the positive correlation between income and cigarette smoking. We cannot evaluate further the hypothesis of adverse self-selection based on unobservables. However, the change in smoking effects on OFC with the IV estimation is consistent with several previous IV studies of smoking impacts on birth weight using cigarette tax rates and other non-genetic instruments (Evans and Ringel 1999; Grossman and Joyce 1990; Lien 2005; Permutt and Hebel 1989; Rosenzweig 1983), which find larger adverse smoking effects using the IV models, and also with IV studies of other behavioral effects on child health such as prenatal care effects on birth weight, which also appear to be underestimated in classical models (Wehby et al. 2009a).
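The direction of this bias can be illustrated with a small simulation. The sketch below is a simplified linear analogue, not the probit and IV models estimated in the study: the outcome is a continuous risk score, the instrument is a single binary variant, and every coefficient is an assumption chosen for illustration. An unobserved endowment `u` raises the propensity to smoke but lowers risk, so the naive regression slope understates the true effect, while the simple IV (Wald) ratio recovers it.

```python
import random


def simulate(n=200_000, true_effect=1.0, seed=0):
    """Draw (z, x, y) where an unobserved endowment u raises exposure but lowers risk."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = 1.0 if rng.random() < 0.5 else 0.0  # hypothetical binary genetic instrument
        ui = rng.gauss(0.0, 1.0)                 # unobserved "health endowment"
        # Exposure (smoking) is more likely with the variant and with higher u.
        xi = 1.0 if 0.8 * zi + 0.8 * ui + rng.gauss(0.0, 1.0) > 0.4 else 0.0
        # Risk score: exposure raises it, the endowment lowers it.
        yi = true_effect * xi - ui + rng.gauss(0.0, 1.0)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / n


def naive_slope(x, y):
    """Slope from regressing y on x alone, ignoring the unobserved endowment."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Simple IV (Wald) estimator: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


if __name__ == "__main__":
    z, x, y = simulate()
    print("naive:", round(naive_slope(x, y), 3), "IV:", round(iv_slope(z, x, y), 3))
```

In this setup the instrument is independent of `u` by construction, which is exactly the exogeneity assumption that cannot be verified directly in real data.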
One potential bias source that may contribute to underestimating adverse smoking effects in classical models is biased reporting of maternal smoking status depending on observed OFC status. Mothers of children with OFC may be less likely to report, post delivery, their participation in and intensity of smoking before and during pregnancy compared to mothers of unaffected children due to feelings of guilt and a desire to avoid blame. In this study, mothers reported smoking after the child’s birth (by about 4 months). However, previous analyses of these data showed that differences between smoking status reported prospectively by the study women during their first prenatal visit (available through birth registry data), which was around 10.3 gestational weeks and for most cases likely before obtaining information about OFC status, and first trimester smoking status as reported post delivery, are similar between mothers of affected and unaffected children (Lie et al. 2008). Therefore, it is unlikely that biased smoking reports based on observed OFC status are resulting in underestimating smoking effects in the classical probit model. However, it is possible that the smoking effects are attenuated toward zero in the classical models due to random errors in smoking self-report or biases that are not related to OFC, which may partially explain the observed increase in the effects of smoking when treated as endogenous. Intuitively, the estimates of the IV model are expected to apply mainly to those whose behaviors change with the instruments, which may also contribute to the increase in smoking effects if these instruments affect smoking behaviors in specific ways (e.g., spacing or timing of cigarette use or smoking certain brands), or if those who smoke because of these instruments have different characteristics that intensify the smoking effects, although we cannot evaluate this in this study. It also remains theoretically possible that part of the increase is due to noise introduced by the instruments or unforeseen endogeneity issues with the instruments that may be aggravated by the instrument weakness. Therefore, it is important to replicate this study in the future with the recently identified and potentially stronger instruments for smoking.
One limitation of designs that condition sampling on the study outcome, such as case-control or case-cohort designs, is that they are at higher risk for sample selection bias. This bias occurs when the sampling frame or study participation is related to “unobserved” factors that also affect the outcome (Heckman 1979). Additional sample selection bias that is particularly relevant for IV applications using such data may occur if the sampling frame or study participation is related to unobserved factors that are related to both the outcome and the employed instruments. While such limitations are theoretically possible, we do not expect that either the sampling frame or maternal decision to participate in this study is related to unobserved factors that affect OFC or are related to the employed genetic instruments. The study case-cohort sample included the majority of the eligible OFC population in Norway during the study period and a random sample of unaffected children. When suspected, sample selection should be modeled and accounted for using approaches that explicitly account for the role of unobserved relevant factors that result in this bias such as Heckman’s approach.
The IV model using genetic instruments may be applied in several other frameworks and research areas that so far have either relied solely on adjusting for observed confounders or utilized instruments that may not be effective in accounting for omitted variable bias. Studies have identified specific genetic risk factors for major behavioral risk factors such as alcohol use (Edenberg and Foroud 2006; Luo et al. 2006; Tolstrup et al. 2008) and obesity (Dina et al. 2007; Frayling et al. 2007; Loos et al. 2008; Qi et al. 2008; Willer et al. 2009). In the US, 34% of women of childbearing age are obese and 8% are extremely obese; another 25% are overweight (Flegal et al. 2010). The alarmingly high obesity rates increase the importance of obtaining consistent estimates of the “causal” maternal obesity effects on child health in order to forecast the impacts of changes in obesity rates on disease incidence and assess the returns of prevention programs.
In conclusion, the study provides a novel application using genetic instruments to assess the “causal” effects of maternal smoking before and during pregnancy on child health in the form of OFC status. The study finds that genetic instruments may be useful for such applications and highlights some potential limitations of this approach and ways for addressing them. Employing genetic instruments provides a valuable approach for accounting for unobserved confounders and obtaining consistent estimates of the “causal” behavior effects. The model may be used to study maternal risk behavior effects on various infant and child health outcomes such as birth weight, fetal growth, preterm birth, birth defects and child development and also to study long-term behavior effects on health, economic and psychosocial status.