Health Serv Outcomes Res Methodol. Author manuscript; available in PMC 2012 July 1.
Published in final edited form as:
Health Serv Outcomes Res Methodol. 2011 July 1; 11(1-2): 54–78.
doi:  10.1007/s10742-011-0071-9
PMCID: PMC3216039
NIHMSID: NIHMS316203

GENES AS INSTRUMENTS FOR STUDYING RISK BEHAVIOR EFFECTS: AN APPLICATION TO MATERNAL SMOKING AND OROFACIAL CLEFTS

George Wehby, PhD (corresponding author), Astanand Jugessur, PhD, Jeffrey C. Murray, MD, Lina Moreno, PhD, DDS, Allen Wilcox, MD, PhD, and Rolv T. Lie, PhD

Abstract

This study uses instrumental variable (IV) models with genetic instruments to assess the effects of maternal smoking on the child’s risk of orofacial clefts (OFC), a common birth defect. The study uses genotypic variants in neurotransmitter and detoxification genes related to smoking as instruments for cigarette smoking before and during pregnancy. Conditional maximum likelihood and two-stage IV probit models are used to estimate the IV model. The data are from a population-level sample of affected and unaffected children in Norway. The selected genetic instruments generally fit the IV assumptions but may be considered “weak” in predicting cigarette smoking. We find that smoking before and during pregnancy increases OFC risk substantially under the IV model (by about 4–5 times at the sample average smoking rate). This effect is greater than that found with classical analytic models. This may be because the usual models cannot account for self-selection into smoking based on unobserved confounders, or it may to some degree reflect limitations of the instruments. Inference based on weak-instrument robust confidence bounds is consistent with standard inference. Genetic instruments may provide a valuable approach to estimate the “causal” effects of risk behaviors with genetic-predisposing factors (such as smoking) on health and socioeconomic outcomes.

1. INTRODUCTION

Risk behaviors such as smoking, alcohol use, drug use, poor nutrition/obesity, and others have a major impact on health and on psychosocial and economic status. Studies of these effects are common and span a wide range of outcomes and populations. One research area with major public health and policy implications is the study of maternal risk behavior and its effects on child health. Numerous studies have evaluated the impact of maternal smoking, alcohol use, obesity and other risk behaviors on infant and child health outcomes such as birth weight, infant mortality, birth defects and child development.

One major challenge in studying behavioral effects on health and other outcomes is non-random self-selection into behaviors based on unobserved or inadequately measured characteristics. Behaviors are a function of individual preferences for health and risk taking, perceptions of health risks, expectations of health outcomes and biological predispositions to behaviors (driven environmentally or genetically). In the case of maternal health behaviors, the mother’s choice of prenatal risk behaviors is also a function of her preferences for child health, her perceptions of extent of fetal health risk factors and expectations of infant and child health outcomes. Preferences and risk perceptions are generally unobserved in data sources used for studying behavioral effects. Furthermore, direct adjustment using proxy measures is typically inadequate to account for such unobservable relevant factors.

When ignored, self-selection into behaviors based on unobserved factors can result in biased estimation of behavioral effects. The unobserved factors that affect behaviors may also affect the study outcomes through other pathways, resulting in classical omitted variable (confounding) bias. Preferences and risk perceptions likely affect the choice of several behaviors, some of which may not be adequately observed in available data. Furthermore, risk perceptions may be correlated with actual risk factors that may not be adequately observed and that affect the study outcomes. Therefore, standard regression models that involve direct adjustment for observable confounders only may not adequately account for the unobservable confounders.

Instrumental variable (IV) designs may be used to account for unobservable confounders when studying behavioral effects on health or other outcomes. Several risk behaviors have identified genetic risk factors, and these may serve as instruments that provide a useful source of behavior variation for identifying the “causal” behavioral effects on health and other outcomes. The paper proceeds as follows: Section 2 provides a background on instrumental variables, genetic instruments and study objectives; section 3 describes the data sample and methods; section 4 provides the results; and section 5 includes discussion and conclusions.

2. BACKGROUND

Instrumental variables (IV) have been used extensively to study the effects of prenatal care use, maternal smoking, alcohol use and other behaviors on infant health (Evans and Ringel 1999; Grossman and Joyce 1990; Lien 2005; Rosenzweig 1983; Wehby et al. 2009a). The IV design uses instruments that are strongly correlated with endogenous variables (behaviors), but are otherwise unrelated to the study outcomes either directly or indirectly through unobserved confounders (Angrist, Imbens, and Rubin 1996). Under these assumptions, the IV model uses variation in behaviors that is unrelated to the unobservable confounders (and is, therefore, free from self-selection and confounding) to estimate the behavioral effects on the outcomes.
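The logic of the IV design described above can be illustrated with a small simulation. The sketch below is not the analysis used in this paper: it assumes a linear outcome, a single hypothetical instrument `z` (a minor-allele count with allele frequency 0.3), one unobserved confounder `u`, and made-up effect sizes. It compares the naive regression slope with the simple IV (ratio of covariances) estimate.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def simulate_iv(n=20000, true_effect=0.5, seed=1):
    """Return (naive_slope, iv_slope) for simulated data with a hidden confounder.

    All data-generating values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        # Hypothetical instrument: number of minor alleles (0, 1 or 2).
        zi = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        ui = rng.gauss(0, 1)                      # unobserved confounder
        xi = 1.0 * zi + ui + rng.gauss(0, 1)      # behavior (e.g. smoking intensity)
        yi = true_effect * xi - 1.0 * ui + rng.gauss(0, 1)  # outcome
        z.append(zi)
        x.append(xi)
        y.append(yi)
    naive = covariance(x, y) / covariance(x, x)   # ignores the confounder
    iv = covariance(z, y) / covariance(z, x)      # uses only instrument-driven variation
    return naive, iv


if __name__ == "__main__":
    naive, iv = simulate_iv()
    print(f"naive slope: {naive:.3f}, IV slope: {iv:.3f}, true effect: 0.5")
```

In this constructed example the confounder raises the behavior and lowers the outcome, so the naive slope understates the true effect, while the IV estimate recovers it up to sampling error. With a different sign on the confounder the naive bias would go the other way.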

Previous IV studies of risk behaviors such as smoking or alcohol use have typically employed prices/taxes or related policies such as state-level policies preventing public-area smoking. However, there are several disadvantages with these instruments. Such aggregate-level instruments use area-level variation in risk behaviors and ignore behavioral variations among individuals living in the same area. Furthermore, the individual’s self-selection into area of residence may introduce correlations between these instruments and other potentially relevant factors such as individual preferences and health risks. Finally, state-level policies and prices/taxes may themselves be affected by the population’s preferences and health risks (reverse causation).

In this study, we evaluate the utility of employing genetic variants as instruments in IV designs to assess risk behavior effects. This approach exploits variation in risk behaviors that is generated by exogenously inherited genetic factors for estimating the behavior effects on outcomes. The genetic variants are single nucleotide polymorphisms (SNPs), which are DNA base-pair variants, each of which has two alleles (variants). The strength of genetic variants as instruments is the “random” transmission of alleles from parents to their children. Each of the two SNP alleles has an equal chance of being transmitted during generation of germ cells (meiosis). This is the reason for describing the use of genetic instruments as “Mendelian Randomization” in epidemiology (Lawlor et al. 2008). However, the approach is a standard application of instrumental variables (Wehby, Ohsfeldt, and Murray 2008b). The “random” assignment/inheritance of alleles into an individual’s genome seems to support the IV condition of exogenous instruments and suggests that unobserved confounders are unlikely to be correlated with genetic instruments when these do not affect the unobserved confounders and are not correlated with genetic variants that affect these confounders. Smith et al. (2007) find that genetic variants are generally uncorrelated with behavioral and social variables when such associations are not a-priori expected or hypothesized.

“Random” inheritance does not mean that all individuals have the same chance of inheriting a certain allele. Furthermore, “genetic instruments” may predict variation in an individual’s risk behavior not only through direct genetic effects, but also through other pathways that may or may not be correlated with the unobserved confounders. For example, consider a genetic variant x that increases smoking risk. Parents who have (more copies of) variant x are more likely to smoke and more likely to pass this variant to their children than parents without this variant. Suppose that parental smoking causes family-level economic and psychosocial effects that are also passed on to the children, either directly or indirectly through the children’s preferences, health risks, and other factors. If these effects are related to the unobserved confounders of interest when the children’s genes are used as instruments for their own behaviors (such as for studying the effects of smoking on health outcomes), then the genetic instruments may no longer be exogenous. While it may be unlikely that smoking has such effects, such bias is theoretically possible. Therefore, it is important to evaluate the “assumption” of instrument exogeneity with the available tests described below. However, one clear-cut advantage in using genetic instruments is that they cannot be affected by risk behaviors, unlike other individual-level instruments such as income or distance to providers/supplies, or area-level instruments such as prices/taxes and policies. Health economic studies have begun to utilize genetic factors as instruments to identify health risk and behavior effects (Ding et al. 2009; Fletcher 2009a; Fletcher 2009b; von Hinke Kessler Scholder 2010).

2.1 Study Objectives and Significance

We evaluate the utility of genetic instruments for studying maternal smoking effects on the child’s risk of being born with orofacial clefts (OFC). OFC are common birth defects occurring in about 1 per 700 births and are among the most prevalent birth defects in the US (about 20,400 births in 1999–2001) (Centers for Disease Control and Prevention 2006). OFC, including cleft lip with/without palate and cleft palate only, have a complex etiology of genetic, environmental and behavioral risk factors (Mossey et al. 2009). OFC impose significant burdens on individual and family health and economic wellbeing due to the need for surgical, speech and behavioral treatments. These may significantly reduce individual and family quality of life (Wehby and Cassell 2010; Wehby, Ohsfeldt, and Murray 2006). OFC increase child hospitalization costs by more than five times (Boulet et al. 2009; Cassell, Meyer, and Daniels 2008), and may increase the risk of long term health problems including certain cancers (Bille et al. 2005), mental health problems (Christensen and Mortensen 2002), and mortality and suicide risks (Christensen et al. 2004).

Several studies suggest that maternal smoking moderately increases OFC risk, with some variation across studies in the magnitude of the effects and the affected cleft type (Bille et al. 2007; Chung et al. 2000; Honein et al. 2007; Khoury, Gomez-Farias, and Mulinare 1989; Lie et al. 2008; Lieff et al. 1999; Little et al. 2004a; MacLehose et al. 2009; Shaw et al. 2009; Werler et al. 1990). Two meta-analyses of earlier studies estimated 1.3 times increased odds for OFC (Little, Cardy, and Munger 2004b; Wyszynski, Duffy, and Beaty 1997).

Some variation in the smoking effect estimates may be due to self-selection bias. All previous studies of maternal smoking effects on OFC have assumed random maternal self-selection into risk behaviors after adjusting for observed/measured confounders. These studies use designs that cannot account for self-selection bias due to unobserved factors and therefore, may suffer from confounding bias.

Behavioral theories and empirical evidence for infant health outcomes strongly indicate that maternal behaviors during pregnancy are in part determined by unobserved factors that are also related to infant health (Mullahy 1997; Rosenzweig and Schultz 1983; Wehby et al. 2009a). Available data sources for studying maternal behavioral effects on infant health lack valid measures of the relevant unobserved factors such as preferences and risk perceptions described above. Also, good proxies for these unobservable confounders are usually unavailable. Therefore, it is virtually impossible to fully account for these unobservable factors using only direct adjustment multivariate methods.

Maternal risk behaviors prior to and during pregnancy are “endogenous” to infant health outcomes, including OFC status. The assumption of ignorable self-selection bias in previous studies of smoking effects on OFC is strong and unsupported. Further, the direction of bias (i.e. over- or underestimation) in estimates that ignore self-selection is theoretically ambiguous due to the potential role of several unobserved factors. Obtaining accurate estimates of smoking effects on OFC is important for developing effective counseling and prevention programs for OFC. If smoking effects are underestimated in classical models, identifying the real smoking effects may improve counseling of women of childbearing age and make it more effective for preventing OFC. Similarly, if smoking effects are overestimated in classical models, smoking-focused approaches may not be effective in preventing OFC. Therefore, identifying the causal effects of smoking on OFC has important implications for improving child health outcomes and reducing birth defects.

This paper reports the results of a novel application of the genetic IV model to estimate maternal smoking effects on OFC. The IV analysis simultaneously models 1) smoking as a function of relevant observed characteristics and genetic variants that influence smoking, and 2) OFC risk as a function of smoking and relevant observed factors. The goal is to exploit variation in smoking that is generated only by the genetic instruments and to account for non-random self-selection and unobserved confounders that have been a common limitation of previous studies of the effects of maternal smoking on OFC. By doing this, we will evaluate the utility of employing genetic variants as instruments to study maternal behavior effects on child health. This is one of the first studies to apply the genetic IV model to evaluate the effects of maternal risk behaviors on child health. This paper is also one of the first to apply genetic instrumental variables to data collected with case-cohort or case-control designs, and illustrates approaches for estimating and interpreting the behavior effects in such applications, which do not provide directly interpretable effect estimates, unlike classical models (such as standard logistic regression) that assume exogenous behaviors.

2.2 Genetics of Smoking

Smoking behaviors are well known to have a complex etiology involving genetic and environmental factors (Li 2006; Tyndale 2003), and there is convergent evidence of strong links between specific genes and cigarette smoking. Twin and adoption studies find that genetic heritability is at least 50% for both smoking initiation and persistence (Carmelli et al. 1992; Heath and Martin 1993; Lessov et al. 2004; Maes et al. 2004).

Several nicotine, detoxification, and neurotransmitter genes have been implicated with smoking behaviors. Some of the main implicated genes are dopamine 2 receptor (DRD2) (Comings DE 1996; Noble et al. 1994; Spitz et al. 1998), dopamine beta hydroxylase (DBH) (McKinney et al. 2000), DOPA decarboxylase (DDC) (Ma et al. 2005; Yu et al. 2006), Cholecystokinin (CCK) (Comings et al. 2001), Tryptophan Hydroxylase (TPH) (Lerman et al. 2001; Sullivan 2001), gamma-aminobutyric acid type B receptor subunit 2 gene (GABAB2) (Beuten 2005; Li et al. 2009), nicotinic acetylcholine receptor a4 subunit (CHRNA4) (Li et al. 2005) and a3 subunit (rs1051730 in CHRNA3, (Thorgeirsson et al. 2008)) and Cytochrome P450 2A6 (CYP2A6; (Pianezza, Sellers, and Tyndale 1998; Sellers, Kaplan, and Tyndale 2000)).1 Recent studies have found significant associations between smoking and SNPs in GABRA4, GABRA2 and GABRE (Agrawal et al. 2008). Two recent Genome Wide Association Studies (GWAS) have identified correlations between various nicotine dependence measures and SNPs in CHRNA3 and CHRNA5 (Berrettini et al. 2008; Caporaso et al. 2009) and in MAOA and ACTN1 (Caporaso et al. 2009). Recently, a large meta-analysis confirmed the previously reported CHRNA3 and CHRNA5 effects and identified two new SNPs within these genes (rs55853698 and rs6495308) that explain most of the locus 15q2 effects on smoking quantity (Liu et al. 2010).

3. METHODS

3.1 Data Source and Study Measures

This paper employs a sample from a study of oral clefts in Norway. The study is a population-level survey of infants born with OFC in Norway in 1996 through 2001 and their parents as well as a randomly selected control sample of infants born without birth defects in the same period (NIEHS 2009). The main study sample includes 574 children with OFC who represent 88% of the total eligible OFC population and 763 unaffected children who represent 76% of the randomly selected 1,006 eligible live births. DNA samples were collected from parents and children, and extensive data on maternal behaviors and household factors and socioeconomics were provided by the mothers. Data on maternal behaviors and characteristics were collected through self-administered surveys, and most variables have complete data for more than 97% of the survey sample. The sample in this paper includes 856 mother-child pairs with available maternal genotypic data on the employed instruments and all model variables: 363 mothers of children with isolated oral clefts and 493 mothers of unaffected children. There are no significant differences in smoking and other relevant model variables between the analytical sample and those excluded due to incomplete data on genetic instruments or other variables. All study models include these 856 mother-child pairs except for the additional models that exclude cases with cleft palate only, as described below.

The study outcome is isolated OFC without other malformations. Some of the previous studies summarized above report differences in smoking effects between cleft lip with/without palate and cleft palate only. In order to evaluate the potential heterogeneity in smoking effects by cleft type, we estimate an additional specification that excludes cleft palate only from the sample (106 affected children), leaving 257 cases and 493 controls. 2

We measure maternal smoking by the average number of cigarettes smoked per day during the 12 months before pregnancy and alternatively during the first trimester, for which we identify instruments for the IV analysis as described below. Women were asked about whether they smoked at all during the last 12 months before pregnancy and during the first pregnancy trimester, and if so how much they smoked “on average” per day or month.3 Smoking during the first trimester is the more biologically relevant and preferred measure of exposure for fetal development of OFC. However, we also evaluate smoking before pregnancy as a complementary measure given that some women may not become aware of their pregnancy and modify their smoking status until a few weeks after conception. OFC occur by the 9th gestational week (Sperber 2002) and in many cases before the mother becomes aware of the pregnancy, as several pregnancies are unplanned.4 This raises the possibility that the report of quitting smoking right before pregnancy (i.e. reporting smoking before pregnancy but not during the first trimester) may not always be accurate, causing the embryo to be exposed to cigarette smoking. For these cases, smoking before pregnancy (as measured in this study) may also represent smoking during the early stages of pregnancy. About 47% of the sample mothers smoked during the last 12 months before pregnancy and about 37% reported smoking during the first pregnancy trimester (i.e. about 23% of the women who smoked before pregnancy quit smoking in the first trimester).

There is less variation in smoking participation and intensity in the first pregnancy trimester than before pregnancy. In addition to decreasing participation, the average number of cigarettes smoked per day among smokers decreases from about 9 to 6. This decreases the power for observing significant instrument effects on cigarette smoking. Furthermore, women who choose to continue to smoke during pregnancy may be different from those who quit smoking during pregnancy based on their genetic risk factors for smoking. Some genetic variants that predict smoking during a more ordinary time (e.g., before pregnancy) may not have significant effects on smoking during pregnancy. We find most of the genetic instruments for smoking before pregnancy to have significant yet “weaker” effects on smoking during the first trimester, as described below.

3.2 Genetic Instruments and Instrument Validation

In this study, we use SNPs that had been genotyped as part of a large-scale genotyping project through the Center for Inherited Disease Research (CIDR) for studies of OFC. SNPs were selected to study the effects of smoking on OFC. We employ SNPs rs1435252, rs1930139 and rs1547272 in the gamma-aminobutyric acid type B receptor subunit 2 gene (GABAB2) and SNP rs2743467 in cytochrome P450 gene CYP2D6 as instruments for average cigarettes per day during the 12 months before pregnancy5. None of these genes is considered to be a candidate gene for OFC or is known to affect OFC through unobserved behaviors. There are numerous neurotransmitter and detoxification pathways with multiple genes that have different and specialized functions. The fact that a gene plays a role in one pathway and affects a certain risk behavior does not necessarily mean that it plays a role in all risky behaviors as many of the genetic effects are specific to certain behaviors and pathways. We use the same GABAB2 SNPs (rs1435252, rs1930139 and rs1547272) as instruments for cigarette smoking during the first trimester and before pregnancy. The CYP2D6 SNP has an insignificant effect on first trimester cigarette smoking, and is therefore not used as an instrument for cigarettes in the first trimester of pregnancy. Previous studies have reported variants in GABAB2 and CYP2D6 to be associated with smoking behaviors (Beuten 2005; Li et al. 2009; Saarikoski et al. 2000).

Table S1 in the Supplementary Material summarizes the various instrument specifications that we employ. The main instrument specification for cigarettes before pregnancy includes, for each SNP, two indicators for the minor allele homozygote genotype and for the heterozygote genotype – the reference category is the major allele homozygote genotype. This does not impose the restriction that these two genotypes have the same effect. As sensitivity analysis, we evaluate alternative instrument specifications to gauge the sensitivity of the IV cigarette effect estimates to various instruments and to assumptions about gene-dosage instrument effects. These alternative specifications include one indicator per SNP that combines the minor allele homozygote and heterozygote genotypes.6 Further, for smoking before pregnancy, we evaluate additional instrument specifications excluding the CYP2D6 SNP as an instrument given that there is generally less evidence from previous studies for CYP2D6 involvement in smoking dependence than GABAB2.7
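The two ways of coding a SNP as an instrument described above can be sketched as follows. This is an illustration only: the function names and the convention of storing a genotype as a minor-allele count (0, 1 or 2) are assumptions, not taken from the study's data files.

```python
def genotype_indicators(minor_allele_count):
    """Main specification: separate indicators for the heterozygote and the
    minor-allele homozygote; the major-allele homozygote (count 0) is the
    reference category, so both indicators are 0 for it."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    return {
        "heterozygote": int(minor_allele_count == 1),
        "minor_homozygote": int(minor_allele_count == 2),
    }


def carrier_indicator(minor_allele_count):
    """Alternative specification: one indicator per SNP that combines the
    heterozygote and minor-allele homozygote genotypes."""
    flags = genotype_indicators(minor_allele_count)
    return int(flags["heterozygote"] or flags["minor_homozygote"])


if __name__ == "__main__":
    for count in (0, 1, 2):
        print(count, genotype_indicators(count), carrier_indicator(count))
```

The two-indicator coding lets one and two copies of the minor allele have different effects on smoking, whereas the combined indicator restricts them to a single common effect.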

We evaluate the extent to which the instruments fit the first IV assumption by testing the joint significance of their effects on the number of cigarettes adjusting for the other model variables. There is no statistical test to fully evaluate the second IV assumption – that instruments influence OFC only through smoking and observed variables – due to the role of unobservable factors (Wooldridge 2002). Therefore, it is important to select the instruments based on the theoretical (and in the case of genetic instruments biological) justification for their excludability from the outcome function (Wehby, Ohsfeldt, and Murray 2008a). To our knowledge, there is no evidence that the specific employed instruments or their genes are involved in the biological/genetic pathways for OFC other than through smoking (Jugessur, Farlie, and Kilpatrick 2009; Lidral, Moreno, and Bullard 2008). We evaluate the second IV assumption using the standard over-identification test (Baum 2007; Lee 1992), which is a partial test of the second IV assumption and evaluates whether the additional instruments used to identify the IV model (i.e. the over-identification restrictions) have no effects on OFC independent of smoking and are, therefore, excludable from the OFC function (a minimum of one instrument is needed to identify the effect of one endogenous variable).8
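As a simplified illustration of the joint significance test for the first IV assumption, the sketch below computes the F statistic for one SNP coded with two genotype indicators and no other covariates. In that special case the joint test of the two indicators equals the one-way analysis-of-variance F statistic across the three genotype groups. The study's test also adjusts for the other model variables and pools several SNPs, which this sketch does not do; the function name and the toy data are hypothetical.

```python
def first_stage_f(cigarettes_by_genotype):
    """F statistic for the joint effect of genotype indicators on cigarettes.

    cigarettes_by_genotype maps a genotype label to a list of cigarette
    counts. Without other covariates, this equals the one-way ANOVA F.
    """
    groups = [g for g in cigarettes_by_genotype.values() if g]
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: variation explained by genotype.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: residual variation.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


if __name__ == "__main__":
    toy = {
        "major_homozygote": [0, 2, 4],
        "heterozygote": [3, 5, 7],
        "minor_homozygote": [8, 10, 12],
    }
    print(first_stage_f(toy))
```

A larger F indicates that the instruments explain more of the variation in smoking; small values are the usual signal of the weak-instrument problem.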

3.3 Model Estimation

We estimate the following function for OFC status:

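$$OFC_i = \alpha_0 + \beta\,CIGARETTES_i + \alpha_1\,SOCIOECONOMIC_i + \alpha_2\,OTHER\_BEHAVIOR_i + v_i \qquad (1)$$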

For child i, OFC status is a function of maternal cigarettes smoked before pregnancy or alternatively in the first trimester (CIGARETTES), maternal socioeconomic and demographic factors that are relevant for OFC including age, education and income (SOCIOECONOMIC), and other risk and health behaviors (OTHER_BEHAVIOR) that may affect OFC. We include the following in OTHER_BEHAVIOR: participation in and intensity of alcohol drinking during the first pregnancy trimester, multivitamin intake in the first trimester, body weight indicators at pregnancy, average number of daily calories during the first trimester, and pregnancy intendedness. 9 Table 1 includes a description of the model variables.

Table 1
Description of the Study Variables

As described above, we estimate this function using an IV model in order to account for the correlation between CIGARETTES and v (i.e. the correlation between CIGARETTES and the unobserved confounders that also affect OFC), which if ignored, may lead to a biased estimation of the effect (β) of CIGARETTES on OFC. The IV model involves an additional function for CIGARETTES which we specify as follows:

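$$CIGARETTES_i = \delta_0 + \delta_1\,GENES_i + \delta_2\,SOCIOECONOMIC_i + \delta_3\,OTHER\_BEHAVIOR_i + e_i \qquad (2)$$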

where GENES include the genetic instruments for cigarettes as described above.

We estimate the IV model using conditional maximum likelihood (CML) which simultaneously estimates equations (1) and (2) with equation (1) estimated as a probit functional form. The model has several advantages over other estimation approaches (Wooldridge 2002). First, CML provides the most efficient estimator if the distributional assumptions are correctly specified. Second, the average partial effects of the model variables can be directly estimated without retransformation. Third, the hypothesis of exogenous cigarette selection (i.e. no confounding bias when ignoring unobserved factors) can be directly evaluated by testing the correlations between the error terms v and e. A significant correlation indicates that cigarettes are endogenous and that standard probit estimates may be biased.

In order to evaluate the sensitivity of the cigarette effect estimates to the CML assumptions, we also estimate the IV model using a two-stage IV probit model. This is the two-stage residual substitution method of Rivers and Vuong (1988), which includes the OLS residual term (e) of equation (2) as a regressor in equation (1), which is estimated by probit as follows:

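$$OFC_i = \alpha_0 + \beta\,CIGARETTES_i + \alpha_1\,SOCIOECONOMIC_i + \alpha_2\,OTHER\_BEHAVIOR_i + \gamma\,e_i + u_i \qquad (3)$$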

Testing the coefficient γ of e in equation (3) evaluates whether cigarettes are endogenous. One advantage of this method is that it does not impose the CML joint distributional assumptions. However, a disadvantage is that equation (1) coefficients are estimated only up to a scale (Wooldridge 2002). Obtaining the marginal effects requires rescaling the coefficients. We employ Wooldridge’s (2002) approach for rescaling the coefficients.10
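The two-stage residual inclusion idea can be sketched on simulated data. The code below is a minimal illustration under stated assumptions, not the estimation used in this paper: it has one hypothetical instrument `z`, no covariates, made-up parameter values, and a hand-written probit fitted by Fisher scoring. It reports coefficients on the probit scale only (no rescaling to marginal effects and no standard errors).

```python
import math
import random


def norm_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def norm_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_probit(rows, y, iterations=25):
    """Probit coefficients by Fisher scoring; rows already include a constant."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(rows, y):
            eta = sum(bj * xj for bj, xj in zip(beta, xi))
            p = min(max(norm_cdf(eta), 1e-10), 1.0 - 1e-10)
            d = norm_pdf(eta)
            g = d * (yi - p) / (p * (1.0 - p))
            w = d * d / (p * (1.0 - p))
            for a in range(k):
                score[a] += g * xi[a]
                for c in range(k):
                    info[a][c] += w * xi[a] * xi[c]
        beta = [bj + sj for bj, sj in zip(beta, solve(info, score))]
    return beta


def two_stage_probit_demo(n=5000, seed=7):
    """Return (naive_coef, two_stage_coef, residual_coef) on simulated data."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = int(rng.random() < 0.3) + int(rng.random() < 0.3)  # allele count
        ui = rng.gauss(0, 1)                                    # hidden confounder
        xi = 1.0 + 1.0 * zi + ui + rng.gauss(0, 1)              # behavior
        latent = -0.5 + 0.4 * xi - 0.8 * ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(1 if latent > 0 else 0)
    # Stage 1: OLS of the behavior on the instrument; keep the residuals.
    mz, mx = sum(z) / n, sum(x) / n
    slope = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x)) / sum((zi - mz) ** 2 for zi in z)
    resid = [xi - (mx - slope * mz) - slope * zi for zi, xi in zip(z, x)]
    # Stage 2: probit of the outcome on the behavior plus the stage-1 residual.
    naive = fit_probit([[1.0, xi] for xi in x], y)
    staged = fit_probit([[1.0, xi, ei] for xi, ei in zip(x, resid)], y)
    return naive[1], staged[1], staged[2]


if __name__ == "__main__":
    print(two_stage_probit_demo())
```

In this constructed example the coefficient on the residual is clearly negative, which is the signal that the behavior is endogenous, and the two-stage coefficient on the behavior is much larger than the naive probit coefficient. The two-stage coefficient is scaled relative to the structural value of 0.4, which is why rescaling is needed before computing marginal effects.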

Estimating the marginal effects of cigarettes on the sample probability (proportions) of OFC alone is less intuitive and insufficient in case-cohort and case-control designs. Given that the outcome is a binary measure and the data are collected with a case-cohort design, we also estimate the “odds ratio” (OR) for smoking effects as follows:

$$OR = \frac{p_s / (1 - p_s)}{p_n / (1 - p_n)} \qquad (4)$$

where p_s is the average predicted OFC probability at the mean cigarette number among smokers and p_n is the average predicted OFC probability with no smoking. The OFC probabilities are predicted using the estimated regression coefficients for all observations based on their values for the model variables (except for cigarette number as explained above) and then averaged across the sample in order to estimate p_s and p_n. The OR is the standard effect estimate for case-control and case-cohort designs, and is a consistent estimate of the relative risk for rare outcomes such as OFC. The OR as defined in equation (4) based on a probit function is very similar to that obtained from a logistic regression11. However, logistic regressions are significantly harder to estimate in the case of endogenous regressors, especially using conditional maximum likelihood. The confidence intervals for the ORs are estimated using bootstrap with 2,000 replications.12 As a reference model, we also estimate equation (1) using standard probit regression ignoring the problem of unobserved confounders.
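The odds ratio calculation described above can be sketched as follows. This is an illustration under assumptions: `other_index` is a hypothetical list holding, for each observation, the fitted probit index from all model variables except cigarettes, and `beta_cig` is a hypothetical cigarette coefficient; neither value comes from the study's estimates.

```python
import math


def normal_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def average_probability(other_index, beta_cig, cigarettes):
    """Average predicted probability when every observation is assigned the
    same cigarette count and keeps its own values for the other variables."""
    total = sum(normal_cdf(idx + beta_cig * cigarettes) for idx in other_index)
    return total / len(other_index)


def odds_ratio(other_index, beta_cig, mean_cigs_smokers):
    """Odds of the outcome at the smokers' mean cigarette count relative to
    the odds with no smoking, using averaged predicted probabilities."""
    p_s = average_probability(other_index, beta_cig, mean_cigs_smokers)
    p_n = average_probability(other_index, beta_cig, 0.0)
    return (p_s / (1.0 - p_s)) / (p_n / (1.0 - p_n))


if __name__ == "__main__":
    # Hypothetical index values and coefficient, for illustration only.
    print(odds_ratio([-0.4, -0.1, 0.2], beta_cig=0.05, mean_cigs_smokers=9))
```

Averaging the predicted probabilities first and then forming the odds, as here, gives a sample-averaged OR; a bootstrap confidence interval would repeat the whole estimation and this calculation on each resampled data set.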

Another way of interpreting the regression coefficients involves weighting the regression models by the sample probability weights, which provides an estimate of the effects of smoking on oral cleft incidence in the population. As mentioned above, the study sample includes most of the infants with OFC born in Norway during the study years and randomly selected unaffected births. This allows constructing probability weights for participation in the study. However, these probability weights are rather simplistic as they do not account for potential differences in study participation by factors that may affect oral clefts. Also, these weights make the CML model unstable and cause convergence problems. Therefore, as an alternative approach, we use these sample probability weights with the standard probit and the two-stage probit model to evaluate the smoking effects on OFC incidence.

The variables in OTHER_BEHAVIOR may be correlated with unobserved confounders, which may bias the model estimates. We do not have access to instruments to account for the endogenous maternal selection into all of these factors. In order to gauge their effects on the cigarette effect estimates, we estimate an additional specification of equation (1) by CML that excludes OTHER_BEHAVIOR. The expectation is that the IV cigarette effect estimates should be insensitive to any potential biases in these variables if they are insensitive to their exclusion.

3.4 Weak Instruments

As described below, the genetic instruments employed in this study have statistically significant effects on cigarette smoking (F statistic of 3.33 in the main instrument specification; p=0.0009). However, the magnitude of these effects suggests that they are “weak”, or not strongly correlated with cigarette smoking13. If instruments are “exogenous”, which is the required IV assumption, weak instruments are expected to bias the IV effect estimates towards the classical estimates and the IV variance estimates downward (Hahn and Hausman 2003). Therefore, standard inference approaches may be biased with weak instruments. In order to assess the weak-instrument effects on our inference, we re-estimate the IV probit function using Newey’s minimum chi-square distance estimator (Newey 1987) with weak-instrument robust confidence bounds for the cigarette coefficients and compare these confidence bounds to the usual bounds that do not account for weak instruments (Finlay and Magnusson 2009). We show below that the cigarette marginal effects are comparable between Newey’s IV probit estimator and the CML estimator, providing support for using Newey’s estimator to evaluate the weak-instrument effects on inference.14

Assuming exogenous instruments, weak instruments are expected to result in over-rejection of the over-identification restrictions (Hahn and Hausman 2003). Therefore, given that our evaluations of the second IV condition suggest that the instruments are exogenous as described below, we do not expect that the weak-instrument problem is resulting in an inability to reject the over-identification restrictions.

4. RESULTS

4.1. Main Estimates

Table 2 reports the effects of smoking before pregnancy and during the first pregnancy trimester on OFC risk using the main instrument specifications.15 Also listed are the results of testing the joint instrument effects on cigarettes (first IV assumption), the excludability of the instruments from the OFC function using the over-identification test (second IV assumption), and the exogenous selection of cigarettes.

Table 2
Cigarette Smoking Effects on OFC

Cigarette smoking before pregnancy or during the first trimester has a small and marginally significant effect on OFC using standard probit. Smoking 9 cigarettes per day before pregnancy (the average smoking rate among smokers) increases OFC risk by 1.2 times relative to non-smoking. A similar effect is observed for smoking 6 cigarettes per day during the first trimester, which is the average smoking rate among smokers.16

Using the main instrument specification and the CML-IV probit model, the cigarette smoking coefficients increase and become more strongly statistically significant. Average cigarette smoking (9 cigarettes per day) before pregnancy relative to non-smoking significantly increases OFC risk by about 4.2 times. Smoking during the first pregnancy trimester has a larger effect, increasing OFC risk by 5.4 times at the first trimester average smoking rate of 6 cigarettes. The hypothesis of the exogenous selection of cigarette smoking is rejected for both smoking measures (at p<0.05). As expected, the bootstrap-based OR confidence intervals generally indicate weaker statistical significance than the cigarette regression coefficients, given that multiple variables (and variances) are averaged in the OR estimate and its variance.

The main instrument specification has statistically significant effects on cigarette smoking both before pregnancy and during the first pregnancy trimester (p <0.01 and <0.05, respectively). However, as mentioned above, the instruments may be considered “weak” based on the magnitude of their effects in the reduced-form model (F statistics of 3.33 and 2.63 for cigarettes before pregnancy and during the first trimester, respectively). The over-identification restrictions (instrument excludability from the OFC function) are not rejected at p=0.86 for cigarettes before pregnancy and p=0.77 for smoking in the first pregnancy trimester.

4.1.a Excluding Cleft Palate Only

Table 3 reports the smoking effects on cleft lip with/without palate (CL/P) excluding cleft palate only from the sample.17 The same pattern of differences in effects between the probit and CML-IV probit is observed as in the total sample. However, the IV smoking effects are larger than in the combined sample. In the probit model, average smoking before pregnancy (9 cigarettes per day) or during the first pregnancy trimester (6 cigarettes per day) significantly increases OFC risk by about 1.3 times. Under CML-IV probit, average smoking before pregnancy increases OFC risk by 4.9 times. A larger effect is observed for first trimester smoking, with a 6.1 times increase in OFC risk. The cigarette coefficients are statistically significant for both periods under the CML-IV probit model, but only the OR for smoking before pregnancy is marginally significant. Similar to the total sample, the exogenous selection of smoking is also rejected for the CL/P sample (at p<0.05). The instruments have similar effects in the CL/P subsample as in the total sample, and the over-identification restrictions are not rejected (p=0.79 for smoking before pregnancy and 0.64 for first trimester smoking).

Table 3
Cigarette Smoking Effects on Cleft Lip with/without Cleft Palate

4.2 Sensitivity Analysis

4.2.1 Smoking before Pregnancy

Table 4 reports the effects of smoking before pregnancy on OFC under alternative model/instrument specifications and estimation approaches. Excluding the other potential endogenous variables from the model (OTHER_BEHAVIOR) has virtually no effect on the cigarette smoking effect or other test results. Furthermore, the smoking effect under the two-stage IV probit model (with residual substitution) is similar to that under the CML-IV probit model, again implying a 4.2 times higher OFC risk with average smoking.
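To make the two-stage idea concrete, the sketch below simulates data and compares a naive probit with a two-stage probit that adds the first-stage residual as a regressor. It assumes that “residual substitution” refers to this control-function approach; the data-generating values, the single continuous instrument, and all function names are hypothetical, and the code is not the CML estimator used for the main results.

```python
import math
import random


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[pivot] = M[pivot], M[i]
        for r in range(i + 1, n):
            factor = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= factor * M[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_probit(X, y, max_iter=50, tol=1e-8):
    """Probit maximum likelihood by Fisher scoring."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = min(max(norm_cdf(eta), 1e-10), 1.0 - 1e-10)
            d = norm_pdf(eta)
            g = d * (yi - p) / (p * (1.0 - p))   # score contribution
            w = d * d / (p * (1.0 - p))          # expected information weight
            for a in range(k):
                grad[a] += g * xi[a]
                for c in range(k):
                    info[a][c] += w * xi[a] * xi[c]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


def demo(seed=0, n=4000, beta_cig=0.5, theta=-0.8):
    """Return (naive, two-stage) estimates of the smoking coefficient."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]      # instrument (toy, continuous)
    v = [rng.gauss(0, 1) for _ in range(n)]      # unobserved confounder
    cig = [zi + vi for zi, vi in zip(z, v)]      # toy "smoking" measure
    # theta < 0 mimics favorable self-selection: v raises smoking, lowers risk.
    y = [1 if -0.5 + beta_cig * c + theta * u + rng.gauss(0, 1) > 0 else 0
         for c, u in zip(cig, v)]
    # Stage 1: OLS of smoking on the instrument; keep the residual.
    zbar, cbar = sum(z) / n, sum(cig) / n
    slope = (sum((a - zbar) * (c - cbar) for a, c in zip(z, cig))
             / sum((a - zbar) ** 2 for a in z))
    resid = [c - (cbar + slope * (a - zbar)) for a, c in zip(z, cig)]
    # Stage 2: probit of the outcome on smoking plus the first-stage residual.
    naive = fit_probit([[1.0, c] for c in cig], y)
    two_stage = fit_probit([[1.0, c, r] for c, r in zip(cig, resid)], y)
    return naive[1], two_stage[1]


b_naive, b_two_stage = demo()
```

In this toy setup the naive probit understates the smoking coefficient, while the two-stage estimate is expected to lie much closer to the simulated value. Standard errors from the second stage would need adjustment, for example by bootstrapping both stages together.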

Table 4
Effects of Cigarette Smoking before pregnancy on OFC under Alternative Model and Instrument Specifications

The pattern of higher smoking effects under the IV model compared with the probit model is consistent across the alternative instrument specifications. The OR estimate is slightly larger than under the main instrument specification and ranges from 4.6 to 4.8 times across these specifications. The exogenous selection of cigarettes is rejected under these instrument specifications. Under all specifications, instruments have statistically significant effects on smoking (at p<0.01) and the over-identification restrictions cannot be rejected.

4.2.2 First Trimester Smoking

Table 5 reports the first trimester smoking effects on OFC under alternative model and instrument specifications. Similar to smoking before pregnancy, the smoking effects are generally insensitive to excluding the OTHER_BEHAVIOR vector from the model (the effect increases slightly in the probit model). The effect in the two-stage IV probit model is virtually identical to that of the CML-IV probit model. The exogenous cigarette selection is rejected in these specifications.

Table 5
First Trimester Cigarette Smoking Effects on OFC under Alternative Model and Instrument Specifications

Similar to the first instrument specification, the first trimester smoking effect under instrument specification 2 is larger than the probit effect with an OR of 5.9 (marginally significant). The exogenous selection of cigarettes is rejected. The instruments have significant effects on smoking (p<0.05), and the over-identification restrictions cannot be rejected.

4.2.3 Weak-Instrument Effects

Table 6 reports the cigarette coefficients in the IV probit regression estimated using the minimum chi-square distance estimator (Newey 1987). Also included in Table 6 are the standard 95% confidence bounds for the coefficients, which do not assume weak instruments, and confidence bounds that are robust for weak instruments. Evaluating the sensitivity of the inference to the instrument weakness involves comparing the standard and the weak-instrument robust confidence bounds for this estimator. Alternative weak-instrument robust confidence bounds are presented using the Conditional Likelihood Ratio (CLR), Anderson-Rubin (AR), Lagrange Multiplier (LM) and the LM-J statistics (Andrews, Moreira, and Stock 2006; Finlay and Magnusson 2009).18 The CLR test is more powerful under weak-instrument identification than the AR or the LM test (Andrews et al. 2006). However, we report the AR and the LM bounds for comparison purposes.

Table 6
Weak-Instrument Robust Confidence Bounds for the IV probit Cigarette Smoking Coefficients

Inference for the smoking regression coefficient using the classical asymptotic standard errors is overall comparable between Newey’s IV probit estimator and the CML-IV probit estimator for both smoking measures. This provides support for evaluating the weak-instrument issue using Newey’s estimator. The CLR, LM and LM-J weak-instrument robust confidence bounds result in similar inference for the smoking regression coefficient as the standard confidence bounds under all employed instrument specifications. However, the AR confidence bounds suggest insignificant smoking coefficients under all specifications. Given that the AR test suffers from lower power than the other statistics, which all result in similar inference, these results suggest that the weak-instrument problem is unlikely to have substantially biased the IV smoking effects presented above. Of course, this conclusion is conditional on the instruments being truly exogenous. Both the over-identification tests described above and the current knowledge of the biologic functions of the instrument genes and of the OFC genetic etiology support the assumption that the instruments are exogenous.

4.2.4 Weighted Models and Cigarette Effects on Incidence

We estimate the population incidence of OFC based on the study sample and the employed probability weights to be about one affected birth per 500 births, which is similar to what has been previously estimated for Scandinavian populations (Christensen 1999; Harville et al. 2007; Harville et al. 2005). Table 7 reports the smoking and quitting effects on OFC incidence, which are estimated by weighting the probit and two-stage probit models with the sampling probability weights described above. When weighted, the standard probit model suggests that a population-level smoking rate before pregnancy of about 9 cigarettes per day (the average number of cigarettes among smokers) increases the population OFC incidence rate by one additional OFC birth per 2,500 births or by about 20% relative to no smoking (marginally significant). This model predicts that OFC incidence would decrease by about 8% if all current smokers quit (marginally significant). However, the weighted two-stage probit model predicts that a population-level smoking rate before pregnancy of about 9 cigarettes per day relative to no smoking increases OFC incidence by one additional affected birth per 200 births, or by about 245%, and that incidence would decrease by about 70% if all smokers quit (marginally significant). The weighted models for smoking during the first trimester show generally comparable effects on OFC incidence but the effects are overall statistically insignificant based on the bootstrap confidence bounds.
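The relative changes quoted above follow from dividing the additional affected births by the baseline incidence. The short check below uses only the rounded figures given in the text (one per 500, one per 2,500, one per 200); the function name is a hypothetical helper.

```python
from fractions import Fraction

baseline = Fraction(1, 500)            # about one affected birth per 500 births


def relative_increase(extra_incidence, baseline_incidence):
    """Additional incidence expressed as a share of the baseline incidence."""
    return extra_incidence / baseline_incidence


probit_increase = relative_increase(Fraction(1, 2500), baseline)     # 1/5
two_stage_increase = relative_increase(Fraction(1, 200), baseline)   # 5/2
```

With these rounded inputs the two-stage figure comes to 250%, in line with the reported value of about 245%, which presumably reflects unrounded incidence estimates.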

Table 7
Cigarettes Effects on OFC Incidence Based on Weighted Models

5. DISCUSSION

The study provides an application of genetic instruments for studying the effects of behaviors on health. The findings generally support the utility of employing genetic instruments for obtaining consistent estimates of maternal risk behavior effects on child health. Accurate estimates of the “causal” behavioral effects are needed for designing effective prevention programs for adverse child health outcomes and devising public policies to improve child health. To our knowledge, this application is among the first to use genetic instruments for assessing maternal behavior effects on infant health. The paper also illustrates an application of instrumental variables using data from case-cohort or case-control designs and ways of interpreting the behavior effects in such situations.

The employed instruments are predictive of cigarette smoking and do not appear to be related to health outcomes (OFC) through unobserved pathways based on the literature or statistical tests (including the inability to reject the over-identification restrictions at high critical values and the observation of similar smoking effects when excluding other relevant behavioral factors from the model). Of course, it is still possible that the instruments may somehow be related to OFC through unobserved pathways, as is the case for any application using genetic instruments. As further knowledge is obtained on the genes’ functions and pathways, it is important to reevaluate the evidence for the exogeneity of the instruments and its implications for the results. One limitation of the instruments employed in this study that may be relevant to other applications is that instruments may have “weak” (although statistically significant) effects on behaviors. This is not surprising given the complex etiologies of risk behaviors, which may involve several genetic, economic and psychosocial factors, and given that large datasets with genetic instruments are unavailable for many applications, especially for studying maternal behavior effects. In this case, it is important to employ weak-instrument robust inference methods such as those presented above.

Genetic instruments provide a unique source of behavior variation to identify “causal” behavioral effects. In applications where the genetic instruments may be “weak”, the benefits of the genetic instrumental variables approach may outweigh the inference challenges due to “weak instruments” given that weak-instrument robust inference methods are available for most IV estimators. This is particularly relevant when non-genetic instruments are not available or suffer from theoretical limitations. While prices/taxes and other area-level variables are commonly used as instruments for smoking and other risk behaviors, there are significant theoretical and empirical challenges in employing these instruments as described above. We are limited to a small set of already genotyped variants as instruments for smoking in this study. These instruments are considered “weak” as discussed above. Future studies are expected to have better access to stronger instruments as the genetic etiologies of behaviors are further revealed and genotypic data become more widely available. Recent research studies have identified new variants that have strong effects on smoking. Of these, variants in CHRNA3 and CHRNA5 are the most promising and are strong candidates to be used as instruments (Liu et al. 2010).19 Future studies are needed to formally evaluate the utility of these variants as instruments for smoking.

The study results suggest that maternal cigarette smoking may substantially increase the child’s OFC risk, and that this effect may be significantly underestimated in analyses that ignore unobserved confounders. If our estimates are correct, the prevention of all cigarette smoking at the population level may reduce OFC incidence by more than 50%. The study provides further evidence that direct adjustment for observed confounders may be insufficient for consistent estimation of maternal behavioral effects due to self-selection into these behaviors based on unobserved factors that also affect child health.

The results are consistent with favorable self-selection into smoking based on unobserved characteristics that may reduce the OFC risk. In other words, women who smoke before and during pregnancy may, on average, have higher rates of certain unobserved “baseline” characteristics that reduce OFC risk, and therefore result in underestimating the harmful smoking effects when ignored in classical single-equation models. These characteristics may include favorable family and child health history and lower maternal baseline health risks. Such factors may increase maternal propensity to smoke, but may relate to unobserved “health endowments” that reduce OFC risk such as favorable genetic, economic or psychosocial factors. This may seem counterintuitive given that most observable characteristics suggest adverse self-selection into smoking, with less education, unemployment, alcohol drinking, underweight, not using vitamins and not planning the pregnancy being positively correlated with cigarette smoking (see Table S3). The only exception is the positive correlation between income and cigarette smoking. We cannot evaluate further the hypothesis of adverse self-selection based on unobservables. However, the change in smoking effects on OFC with the IV estimation is consistent with several previous IV studies of smoking impacts on birth weight using cigarette tax rates and other non-genetic instruments (Evans and Ringel 1999; Grossman and Joyce 1990; Lien 2005; Permutt and Hebel 1989; Rosenzweig 1983), which find larger adverse smoking effects using the IV models, and also with IV studies of other behavioral effects on child health such as prenatal care effects on birth weight, which also appear to be underestimated in classical models (Wehby et al. 2009a, 2009b).

One potential bias source that may contribute to underestimating adverse smoking effects in classical models is biased reporting of maternal smoking status depending on observed OFC status. Mothers of children with OFC may be less likely to report, post delivery, their participation in and intensity of smoking before and during pregnancy compared to mothers of unaffected children due to feelings of guilt and a desire to avoid blame. In this study, mothers reported smoking after the child’s birth (by about 4 months). However, previous analyses of these data showed that differences between smoking status reported prospectively by the study women during their first prenatal visit (available through birth registry data), which was around 10.3 gestational weeks and for most cases likely before obtaining information about OFC status, and first trimester smoking status as reported post delivery, are similar between mothers of affected and unaffected children (Lie et al. 2008). Therefore, it is unlikely that biased smoking reports based on observed OFC are resulting in underestimating smoking effects in the classical probit model. However, it is possible that the smoking effects are attenuated toward zero in the classical models due to random errors in smoking self-report or biases that are not related to OFC, which may partially explain the observed increase in the effects of smoking when treated as endogenous. Intuitively, the estimates of the IV model are expected to apply mainly to those whose behaviors change with the instruments20, which may also contribute to the increase in smoking effects if these instruments affect smoking behaviors in specific ways (e.g., spacing or timing of cigarette use or smoking certain brands), or if those who smoke because of these instruments have different characteristics that intensify the smoking effects, although we cannot evaluate this in this study. It also remains theoretically possible that part of the increase is due to noise introduced by the instruments or unforeseen endogeneity issues with the instruments that may be aggravated by the instrument weakness. Therefore, it is important to replicate this study in the future with the recently identified and potentially stronger instruments for smoking.

One limitation of designs that condition sampling on the study outcome such as case-control or case-cohort designs is that they are at higher risk for sample selection bias. This bias occurs when the sampling frame or study participation is related to “unobserved” factors that also affect the outcome (Heckman 1979). Additional sample selection bias that is particularly relevant for IV applications using such data may occur if the sampling frame or study participation is related to unobserved factors that are related to both the outcome and the employed instruments. While such limitations are theoretically possible, we do not expect that either the sampling frame or maternal decision to participate in this study is related to unobserved factors that affect OFC or are related to the employed genetic instruments. The study case-cohort sample included the majority of the eligible OFC population in Norway during the study period and a random sample of unaffected children. When suspected, sample selection should be modeled and accounted for using approaches that explicitly account for the role of unobserved relevant factors that result in this bias such as Heckman’s approach.

The IV model using genetic instruments may be applied in several other frameworks and research areas that so far have either relied solely on adjusting for observed confounders or utilized instruments that may not be effective in accounting for omitted variable bias. Studies have identified specific genetic risk factors for major behavioral risk factors such as alcohol use (Edenberg and Foroud 2006; Luo et al. 2006; Tolstrup et al. 2008) and obesity (Dina et al. 2007; Frayling et al. 2007; Loos et al. 2008; Qi et al. 2008; Willer et al. 2009). In the US, 34% of women of childbearing age are obese and 8% are extremely obese – another 25% are overweight (Flegal et al. 2010). The alarmingly high obesity rates increase the importance of obtaining consistent estimates of the “causal” maternal obesity effects on child health in order to forecast the impacts of changes in obesity rates on disease incidence and assess the returns of prevention programs.

In conclusion, the study provides a novel application using genetic instruments to assess the “causal” effects of maternal smoking before and during pregnancy on child health in the form of OFC status. The study finds that genetic instruments may be useful for such applications and highlights some potential limitations of this approach and ways for addressing them. Employing genetic instruments provides a valuable approach for accounting for unobserved confounders and obtaining consistent estimates of the “causal” behavior effects. The model may be used to study maternal risk behavior effects on various infant and child health outcomes such as birth weight, fetal growth, preterm birth, birth defects and child development and also to study long-term behavior effects on health, economic and psychosocial status.

Supplementary Material

Footnotes

1The variant rs1051730 is also related to quitting and intensity of smoking during pregnancy (Freathy et al. 2009).

2Cleft lip only and cleft lip with palate are commonly grouped in the literature given their similarities in development and recurrence risks.

3The specific questions on smoking and other survey questions can be found at http://www.niehs.nih.gov/research/atniehs/labs/epi/studies/ncl/question.cfm. We do not have direct data on changes in smoking amounts within these periods, but these may be partly reflected in the responses to average smoking intensity as described above. We focus on estimating the overall average effects of cigarette quantity per day for the days when the women smoked in these periods.

4About 27% of the mothers in the study sample reported that the pregnancy was unplanned.

5These SNPs pass the Hardy-Weinberg Equilibrium (HWE) test at p=0.34–0.9. The HWE test evaluates whether the genotypic proportions at each SNP express deviations from an equilibrium/constant distribution due to non-random mating and other factors that may result in such deviations. Higher p values suggest more confidence in not rejecting the random mating hypothesis and in lack of deviation from the HWE equilibrium.

6We include one indicator because we find that the effects of the two indicators for each SNP are in the same direction and generally not significantly different from each other, except for SNP rs1435252, for which these two effects are significantly different at p<0.05. Therefore, we keep two indicators for SNP rs1435252. Another advantage of reducing the number of instruments is that in the presence of “weak instruments”, which is the case for these instruments as described below, the bias in the IV estimate increases with the number of instruments (Hahn and Hausman 2003).

7The Norway sample had also been genotyped independently of the CIDR project for several other variants in GABAB2 as well as DDC and CHRNA4, which are genes that have also been implicated in smoking behaviors (Ma et al. 2005; Li et al. 2005). However, these variants had insignificant effects on smoking during the first pregnancy trimester. Supplementary Table S2 reports the results for the relationship between these additional variants and first trimester smoking.

8We evaluate the over-identification restrictions based on Newey’s minimum chi-square distance estimator of the IV probit model (Newey 1987).

9Some studies report that maternal alcohol consumption, mainly excessive consumption, increases the risk of OFC (e.g., Grewal et al. 2008; Romitti et al. 1999). However, other studies report insignificant effects (e.g., Bille et al. 2007; Meyer et al. 2003). Some studies also report that maternal obesity increases OFC risk (e.g., Stothard et al. 2009; Villamor, Sparen, and Cnattingius 2008), but others do not find significant effects (Shaw 2000). Food insecurity may also increase cleft palate risk (Carmichael et al. 2007). Low socioeconomic resources may increase OFC risk (Clark et al. 2003; Durning P 2007; Yang et al. 2008). We include pregnancy intendedness as it reflects maternal preferences for risk taking and may proxy for some unobserved confounders.

10This approach averages the predicted probabilities from equation (3) across all observations i (so that the predicted effects are averaged across ei).

11In models that include only smoking and no covariates, the OR as defined in equation (4) that is obtained from a probit function is identical to that from a logistic function, which is the standard regression model used in case-control or case-cohort designs. For example, both logistic regression and probit regression of OFC on smoking participation (yes/no) in the first trimester provide an OR estimate of 1.54. When the model includes additional covariates, the odds ratio from the probit model is expected to differ slightly from the logistic regression OR, because unlike the logistic regression OR, the probit OR is dependent on the covariate values. However, the ORs from the two models are still expected to be very close to each other. For example, when adding the covariates described above into the model, the OR for first trimester smoking participation is 1.379 and 1.370 in the probit and logistic regressions, respectively.

12For the bootstraps, we set the maximum number of iterations for each IV-probit CML regression to 10. Using the full study sample, the CML model converged at 5 iterations with cigarettes before pregnancy and 6 iterations for first trimester cigarettes.

13The “threshold” for weak instruments varies with several parameters, including sample size and number of instruments. However, F-statistics below 10 are generally considered to suggest weak instruments in linear models (Staiger and Stock 1997). There are no rules of thumb for instrument strengths in the CML model, but the instruments may also be weak in that model.

14The weak-instrument robust confidence bounds are only available for Newey’s IV probit estimator and are not available for the other IV probit models (CML IV-Probit or the two-stage IV probit with residual substitution).

15Table S3 in the Supplementary Material reports the full CML regression results for the OFC functions in (Equation 1). Table S4 reports the full regression results for the cigarette functions in the CML model (Equation 2).

16We interpret the smoking effects using odds ratios based on the estimated cigarette number regression coefficients by simulating the effects of smoking the average number of cigarettes among smokers relative to not smoking. The average cigarette numbers (9 cigarettes per day before pregnancy and 6 per day during the first trimester) used in this simulation are conditional on being a smoker. Note that these effects may not be directly compared to odds ratios for smoking participation (yes/no) effects. One reason is that cigarette number is likely to be measured with some error, which attenuates the effect towards zero. We do not use binary smoking status or categorical indicators for cigarette number because of the limited instrument effects on these measures. Therefore, these effects should not be contrasted with odds ratio effects reported in previous studies using this sample (Lie et al. 2008). For instance, adjusting for the model covariates described above, we estimate an odds ratio of any smoking during the first trimester of 1.38 (95% CI: 1.02; 1.88) for OFC and 1.55 (95% CI: 1.1; 2.17) for cleft lip without palate.

17Table S5 in the Supplementary Material reports the full regression results for the OFC function excluding cleft palate only. Table S6 reports the full regression results for the cigarette function.

18The regression coefficients cannot be directly compared between these models as they require transformation to obtain the variable effects.

19Recent studies have provided several additional candidates for smoking, with confirmed associations for the gene encoding cholinergic receptor, nicotinic, alpha 5 (CHRNA5 on chr 15q24) in multiple, independent studies (Hung et al. 2008; Liu et al. 2010; Saccone et al. 2010; Thorgeirsson et al. 2008; Thorgeirsson et al. 2010). Additional susceptibility loci have been identified through recent genome-wide association studies, including acetylcholine receptor genes CHRNB3 and CHRNA6 on chr 8p11, egl nine homolog 2 (EGLN2) on chr 9q13, brain-derived neurotrophic factor (BDNF) on chr 11p13, CHRNA3 on chr 15q24, and cytochrome P450 genes CYP2A6 and CYP2B6 on chr 19q13 (Tobacco and Genetics Consortium 2010; McKay et al. 2008; Thorgeirsson et al. 2010).

20Unlike for the local average treatment effect of two stage least squares (2SLS), this is not a strict but rather an intuitive interpretation of the effects in the CML and other IV probit models.

Contributor Information

George Wehby, Assistant Professor, Dept. of Health Management and Policy, College of Public Health, University of Iowa, 200 Hawkins Drive, E205 GH, Iowa City, IA 52242 USA, Phone: 319-384-5133; Fax: 319-384-5125.

Astanand Jugessur, Norwegian Institute of Public Health, Oslo Norway.

Jeffrey C. Murray, University of Iowa, Iowa City, Iowa, USA.

Lina Moreno, University of Iowa, Iowa City, Iowa, USA.

Allen Wilcox, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA.

Rolv T. Lie, University of Bergen, Bergen, Norway.

References

  • Tobacco and Genetics Consortium. Genome-wide meta-analyses identify multiple loci associated with smoking behavior. Nat Genet. 2010;42(5):441–7.
  • Agrawal A, Pergadia ML, Saccone SF, et al. Gamma-aminobutyric acid receptor genes and nicotine dependence: evidence for association from a case-control study. Addiction. 2008;103(6):1027–38.
  • Andrews DWK, Moreira MJ, Stock JH. Optimal two-sided invariant similar tests for instrumental variables regression. Econometrica. 2006;74(3):715–52.
  • Angrist JD, Imbens GW, Rubin DB. Identification of Causal Effects Using Instrumental Variables. Journal of the American Statistical Association. 1996;91(434):444–55.
  • Baum CF, Wiggins V, Stillman S, Schaffer ME. OVERID: Stata module to calculate tests of overidentifying restrictions after ivreg, ivreg2, ivprobit, ivtobit, reg3, Statistical Software Components S396902. Boston College Department of Economics; 2007.
  • Berrettini W, Yuan X, Tozzi F, et al. Alpha-5/alpha-3 nicotinic receptor subunit alleles increase risk for heavy smoking. Mol Psychiatry. 2008;13(4):368–73.
  • Beuten J, Ma JZ, Payne TJ, Dupont RT, Crews KM, Somes G, Williams NJ, Elston RC, Li MD. Single- and multilocus allelic variants within the GABA(B) receptor subunit 2 (GABAB2) gene are significantly associated with nicotine dependence. Am J Hum Genet. 2005;76(5):6.
  • Bille C, Olsen J, Vach W, et al. Oral clefts and life style factors--a case-cohort study based on prospective Danish data. Eur J Epidemiol. 2007;22(3):173–81.
  • Bille C, Winther JF, Bautz A, et al. Cancer risk in persons with oral cleft--a population-based study of 8,093 cases. Am J Epidemiol. 2005;161(11):1047–55.
  • Boulet SL, Grosse SD, Honein MA, et al. Children with orofacial clefts: health-care use and costs among a privately insured population. Public Health Rep. 2009;124(3):447–53.
  • Caporaso N, Gu F, Chatterjee N, et al. Genome-wide and candidate gene association study of cigarette smoking behaviors. PLoS One. 2009;4(2):e4653.
  • Carmelli D, Swan GE, Robinette D, et al. Genetic influence on smoking--a study of male twins. N Engl J Med. 1992;327(12):829–33.
  • Carmichael SL, Shaw GM, Ma C, et al. Maternal corticosteroid use and orofacial clefts. American Journal of Obstetrics and Gynecology. 2007;197(6):585.e1–85.e7.
  • Cassell CH, Meyer R, Daniels J. Health care expenditures among Medicaid enrolled children with and without orofacial clefts in North Carolina, 1995–2002. Birth Defects Res A Clin Mol Teratol. 2008;82(11):785–94.
  • Centers for Disease Control and Prevention (CDC). Improved national prevalence estimates for 18 selected major birth defects--United States, 1999–2001. MMWR Morb Mortal Wkly Rep. 2006;54(51):1301–5.
  • Christensen K. The 20th century Danish facial cleft population--epidemiological and genetic-epidemiological studies. Cleft Palate Craniofac J. 1999;36(2):96–104.
  • Christensen K, Juel K, Herskind AM, et al. Long term follow up study of survival associated with cleft lip and palate at birth. BMJ. 2004;328(7453):1405.
  • Christensen K, Mortensen PB. Facial clefting and psychiatric diseases: a follow-up of the Danish 1936–1987 Facial Cleft cohort. Cleft Palate Craniofac J. 2002;39(4):392–6.
  • Chung KC, Kowalski CP, Kim HM, et al. Maternal cigarette smoking during pregnancy and the risk of having a child with cleft lip/palate. Plastic and Reconstructive Surgery. 2000;105(2):485–91.
  • Clark JD, Mossey PA, Sharp L, et al. Socioeconomic status and orofacial clefts in Scotland, 1989 to 1998. Cleft Palate Craniofac J. 2003;40(5):481–5.
  • Comings DE, Bradshaw-Robinson FLS, Burchette R, Chiu C, Muhleman D. The dopamine D2 receptor (DRD2) gene: a genetic risk factor in smoking. Pharmacogenetics. 1996;6(1):73–79.
  • Comings DE, Wu S, Gonzalez N, et al. Cholecystokinin (CCK) gene as a possible risk factor for smoking: a replication in two independent samples. Mol Genet Metab. 2001;73(4):349–53.
  • Dina C, Meyre D, Gallina S, et al. Variation in FTO contributes to childhood obesity and severe adult obesity. Nat Genet. 2007;39(6):724–6.
  • Ding W, Lehrer SF, Rosenquist JN, et al. The impact of poor health on academic performance: New evidence using genetic markers. Journal of Health Economics. 2009;28(3):578–97. [PubMed]
  • Durning PCI, Morgan MZ, Lester NJ. The relationship between orofacial clefts and material deprivation in wales. Cleft Palate Craniofac J. 2007;44(2):5. [PubMed]
  • Edenberg HJ, Foroud T. The genetics of alcoholism: identifying specific genes through family studies. Addict Biol. 2006;11(3–4):386–96. [PubMed]
  • Evans WN, Ringel JS. Can higher cigarette taxes improve birth outcomes? Journal of Public Economics. 1999;72(1):135–54.
  • Finaly K, Magnusson LM. Implementing weak-instrument robust tests for a general class of instrumental-variables models. Stata Journal. 2009;9(3):398–421.
  • Flegal KM, Carroll MD, Ogden CL, et al. Prevalence and trends in obesity among US adults, 1999–2008. JAMA. 2010;303(3):235–41. [PubMed]
  • Fletcher JM, Lehrer SF. The effects of adolescent health on educational outcomes: Causal evidence using genetic lotteries between siblings. Forum on Health Economics and Policy. 2009a;12(2)
  • Fletcher JM, Lehrer SF. Using Genetic Lotteries within Families to Examine the Causal Impact of Poor Health on Academic Achievement. National Bureau of Economic Research Working Paper Series No. 15148 2009b
  • Frayling TM, Timpson NJ, Weedon MN, et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science. 2007;316(5826):889–94. [PMC free article] [PubMed]
  • Freathy RM, Ring SM, Shields B, et al. A common genetic variant in the 15q24 nicotinic acetylcholine receptor gene cluster (CHRNA5-CHRNA3-CHRNB4) is associated with a reduced ability of women to quit smoking in pregnancy. Hum Mol Genet. 2009:ddp216. [PMC free article] [PubMed]
  • Grewal J, Carmichael SL, Ma C, et al. Maternal periconceptional smoking and alcohol consumption and risk for select congenital anomalies. Birth Defects Res A Clin Mol Teratol. 2008;82(7):519–26. [PMC free article] [PubMed]
  • Grossman M, Joyce TJ. Unobservables, Pregnancy Resolutions, and Birth Weight Production Functions in New York City. Journal of Political Economy. 1990;98(5):983.
  • Hahn J, Hausman J. Weak instruments: Diagnosis and cures in empirical econometrics. American Economic Review. 2003;93(2):118–25.
  • Harville EW, Wilcox AJ, Lie RT, et al. Epidemiology of cleft palate alone and cleft palate with accompanying defects. Eur J Epidemiol. 2007;22(6):389–95. [PubMed]
  • Harville EW, Wilcox AJ, Lie RT, et al. Cleft lip and palate versus cleft lip only: are they distinct defects? Am J Epidemiol. 2005;162(5):448–53. [PubMed]
  • Heath AC, Martin NG. Genetic models for the natural history of smoking: evidence for a genetic influence on smoking persistence. Addict Behav. 1993;18(1):19–34. [PubMed]
  • Heckman JJ. Sample Selection Bias as a Specification Error. Econometrica. 1979;47(1):153–61.
  • Honein MA, Rasmussen SA, Reefhuis J, et al. Maternal smoking and environmental tobacco smoke exposure and the risk of orofacial clefts. Epidemiology. 2007;18(2):226–33. [PubMed]
  • Hung RJ, McKay JD, Gaborieau V, et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature. 2008;452(7187):633–7. [PubMed]
  • Jugessur A, Farlie PG, Kilpatrick N. The genetics of isolated orofacial clefts: from genotypes to subphenotypes. Oral Dis. 2009;15(7):437–53. [PubMed]
  • Khoury MJ, Gomez-Farias M, Mulinare J. Does maternal cigarette smoking during pregnancy cause cleft lip and palate in offspring? American Journal of Diseases of Children. 1989;143(3):333–7. [PubMed]
  • Lawlor DA, Harbord RM, Sterne JA, et al. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Statistics in Medicine. 2008;27(8):1133–63. [PubMed]
  • Lee LF. Amemiya’s Generalized Least Squares and Tests of Overidentification in Simultaneous Equation Models with Qualitative or Limited Dependent Variables. Econometric Reviews. 1992;11(3):319–28.
  • Lerman C, Caporaso NE, Bush A, et al. Tryptophan hydroxylase gene variant and smoking behavior. Am J Med Genet. 2001;105(6):518–20. [PubMed]
  • Lessov CN, Martin NG, Statham DJ, et al. Defining nicotine dependence for genetic research: evidence from Australian twins. Psychol Med. 2004;34(5):865–79. [PubMed]
  • Li MD. The genetics of nicotine dependence. Curr Psychiatry Rep. 2006;8(2):158–64. [PubMed]
  • Li MD, Beuten J, Ma JZ, et al. Ethnic- and gender-specific association of the nicotinic acetylcholine receptor alpha4 subunit gene (CHRNA4) with nicotine dependence. Hum Mol Genet. 2005;14(9):1211–9. [PubMed]
  • Li MD, Mangold JE, Seneviratne C, et al. Association and interaction analyses of GABBR1 and GABBR2 with nicotine dependence in European- and African-American populations. PLoS One. 2009;4(9):e7055. [PMC free article] [PubMed]
  • Lidral AC, Moreno LM, Bullard SA. Genetic Factors and Orofacial Clefting. Semin Orthod. 2008;14(2):103–14. [PMC free article] [PubMed]
  • Lie RT, Wilcox AJ, Taylor J, et al. Maternal smoking and oral clefts: the role of detoxification pathway genes. Epidemiology. 2008;19(4):606–15. [PubMed]
  • Lieff S, Olshan AF, Werler M, et al. Maternal cigarette smoking during pregnancy and risk of oral clefts in newborns. Am J Epidemiol. 1999;150(7):683–94. [PubMed]
  • Lien DS, Evans WN. Estimating the impact of large cigarette tax hikes: The case of maternal smoking and infant birth weight. Journal of Human Resources. 2005;40(2):373–92.
  • Little J, Cardy A, Arslan MT, et al. Smoking and orofacial clefts: a United Kingdom -based case-control study. Cleft Palate-Craniofacial Journal. 2004a;41(4):381–6. [PubMed]
  • Little J, Cardy A, Munger RG. Tobacco smoking and oral clefts: a meta-analysis. Bulletin of the World Health Organization. 2004b;82(3):213–8. [PubMed]
  • Liu J, Tozzi ZF, Waterworth DM, et al. Meta-analysis and imputation refines the association of 15q25 with smoking quantity. Nat Genet. 2010;42(5):436–40. [PMC free article] [PubMed]
  • Loos R, Lindgren JCM, Li S, et al. Common variants near MC4R are associated with fat mass, weight and risk of obesity. Nat Genet. 2008;40(6):768–75. [PMC free article] [PubMed]
  • Luo X, Kranzler HR, Zuo L, et al. Diplotype trend regression analysis of the ADH gene cluster and the ALDH2 gene: multiple significant associations with alcohol dependence. Am J Hum Genet. 2006;78(6):973–87. [PubMed]
  • Ma JZ, Beuten J, Payne TJ, et al. Haplotype analysis indicates an association between the DOPA decarboxylase (DDC) gene and nicotine dependence. Hum Mol Genet. 2005;14(12):1691–98. [PubMed]
  • MacLehose RF, Olshan AF, Herring AH, et al. Bayesian methods for correcting misclassification: an example from birth defects epidemiology. Epidemiology. 2009;20(1):27–35. [PubMed]
  • Maes HH, Sullivan PF, Bulik CM, et al. A twin study of genetic and environmental influences on tobacco initiation, regular tobacco use and nicotine dependence. Psychol Med. 2004;34(7):1251–61. [PubMed]
  • McKay JD, Hung RJ, Gaborieau V, et al. Lung cancer susceptibility locus at 5p15.33. Nat Genet. 2008;40(12):1404–6. [PMC free article] [PubMed]
  • McKinney EF, Walton RT, Yudkin P, et al. Association between polymorphisms in dopamine metabolic enzymes and tobacco consumption in smokers. Pharmacogenetics. 2000;10(6):483–91. [PubMed]
  • Meyer KA, Werler MM, Hayes C, et al. Low maternal alcohol consumption during pregnancy and oral clefts in offspring: the Slone Birth Defects Study. Birth Defects Res A Clin Mol Teratol. 2003;67(7):509–14. [PubMed]
  • Mossey PA, Little J, Munger RG, et al. Cleft lip and palate. Lancet. 2009;374(9703):1773–85. [PubMed]
  • Mullahy J. Instrumental-Variable Estimation of Count Data Models: Applications to Models of Cigarette Smoking Behavior. Review of Economics and Statistics. 1997;79(4):586–93.
  • Newey WK. Efficient Estimation of Limited Dependent Variable Models with Endogenous Explanatory Variables. Journal of Econometrics. 1987;36(3):231–50.
  • NIEHS. Norway Facial Cleft Study (NCL) 2009.
  • Noble EP, St Jeor ST, Ritchie T, et al. D2 dopamine receptor gene and cigarette smoking: a reward gene? Med Hypotheses. 1994;42(4):257–60. [PubMed]
  • Permutt T, Hebel JR. Simultaneous-equation estimation in a clinical trial of the effect of smoking on birth weight. Biometrics. 1989;45(2):619–22. [PubMed]
  • Pianezza ML, Sellers EM, Tyndale RF. Nicotine metabolism defect reduces smoking. Nature. 1998;393(6687):750. [PubMed]
  • Qi L, Kraft P, Hunter DJ, et al. The common obesity variant near MC4R gene is associated with higher intakes of total energy and dietary fat, weight change and diabetes risk in women. Hum Mol Genet. 2008;17(22):3502–08. [PubMed]
  • Rivers D, Vuong QH. Limited Information Estimators and Exogeneity Tests for Simultaneous Probit Models. Journal of Econometrics. 1988;39(3):347–66.
  • Romitti PA, Lidral AC, Munger RG, et al. Candidate genes for nonsyndromic cleft lip and palate and maternal cigarette smoking and alcohol consumption: evaluation of genotype-environment interactions from a population-based case-control study of orofacial clefts. Teratology. 1999;59(1):39–50. [PubMed]
  • Rosenzweig M, Schultz TP. Estimating a household production function: Heterogeneity, the demand for health inputs, and their effects on birth weight. The Journal of Political Economy. 1983;91(5):723–46.
  • Rosenzweig MR, Schultz TP. Estimating a Household Production Function: Heterogeneity, the Demand for Health Inputs, and Their Effects on Birth Weight. The Journal of Political Economy. 1983;91(5):723–46.
  • Saarikoski ST, Sata F, Husgafvel-Pursiainen K, et al. CYP2D6 ultrarapid metabolizer genotype as a potential modifier of smoking behaviour. Pharmacogenetics. 2000;10(1):5–10. [PubMed]
  • Saccone NL, Culverhouse RC, Schwantes-An TH, et al. Multiple independent loci at chromosome 15q25.1 affect smoking quantity: a meta-analysis and comparison with lung cancer and COPD. PLoS Genet. 2010;6(8) [PMC free article] [PubMed]
  • Sellers EM, Kaplan HL, Tyndale RF. Inhibition of cytochrome P450 2A6 increases nicotine’s oral bioavailability and decreases smoking. Clin Pharmacol Ther. 2000;68(1):35–43. [PubMed]
  • Shaw G, Todoroff K, Schaffer DM, Selvin S. Maternal height and prepregnancy body mass index as risk factors for selected congenital anomalies. Paediatric & Perinatal Epidemiology. 2000;14(3):234–39. [PubMed]
  • Shaw GM, Carmichael SL, Vollset SE, et al. Mid-pregnancy cotinine and risks of orofacial clefts and neural tube defects. Journal of Pediatrics. 2009;154(1):17–9. [PubMed]
  • Smith GD, Lawlor DA, Harbord R, et al. Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology. PLoS Med. 2007;4(12):e352. [PMC free article] [PubMed]
  • Sperber G. Formation of the primary palate. In: WDF, editor. Cleft Lip and Palate: From Origin to Treatment. Oxford University Press; 2002. pp. 5–13.
  • Spitz MR, Shi H, Yang F, et al. Case-control study of the D2 dopamine receptor gene and smoking status in lung cancer patients. J Natl Cancer Inst. 1998;90(5):358–63. [PubMed]
  • Staiger D, Stock JH. Instrumental Variables Regression with Weak Instruments. Econometrica. 1997;65(3):557–86.
  • Stothard KJ, Tennant PW, Bell R, et al. Maternal overweight and obesity and the risk of congenital anomalies: a systematic review and meta-analysis. JAMA. 2009;301(6):636–50. [PubMed]
  • Sullivan PF, Jiang Yuxin, Neale Michael C, Kendler Kenneth S, Straub Richard E. Association of the tryptophan hydroxylase gene with smoking initiation but not progression to nicotine dependence. American Journal of Medical Genetics. 2001;105(5):479–84. [PubMed]
  • Thorgeirsson TE, Geller F, Sulem P, et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature. 2008;452(7187):638–42. [PubMed]
  • Thorgeirsson TE, Gudbjartsson DF, Surakka I, et al. Sequence variants at CHRNB3-CHRNA6 and CYP2A6 affect smoking behavior. Nat Genet. 2010;42(5):448–53. [PMC free article] [PubMed]
  • Tolstrup JS, Nordestgaard BG, Rasmussen S, et al. Alcoholism and alcohol drinking habits predicted from alcohol dehydrogenase genes. Pharmacogenomics J. 2008;8(3):220–7. [PubMed]
  • Tyndale R. Genetics of alcohol and tobacco use in humans. Annals of Medicine. 2003;35(2):94–122. [PubMed]
  • Villamor E, Sparen P, Cnattingius S. Risk of oral clefts in relation to prepregnancy weight change and interpregnancy interval. Am J Epidemiol. 2008;167(11):1305–11. [PubMed]
  • von Hinke Kessler Scholder S, Smith GD, Lawlor DA, Propper C, Windmeijer F. Genetic markers as instrumental variables: An application to child fat mass and academic achievement. The Centre for Market and Public Organisation; 2010. Jun,
  • Wehby GL, Cassell CH. The impact of orofacial clefts on quality of life and healthcare use and costs. Oral Dis. 2010;16(1):3–10. [PMC free article] [PubMed]
  • Wehby GL, Murray JC, Castilla EE, et al. Prenatal care demand and its effects on birth outcomes by birth defect status in Argentina. Econ Hum Biol. 2009a;7(1):84–95. [PMC free article] [PubMed]
  • Wehby GL, Murray JC, Castilla EE, et al. Quantile effects of prenatal care utilization on birth weight in Argentina. Health Economics. 2009b;18(11):1307–21. [PMC free article] [PubMed]
  • Wehby GL, Ohsfeldt RL, Murray JC. Health professionals’ assessment of health-related quality of life values for oral clefting by age using a visual analogue scale method. Cleft Palate-Craniofacial Journal. 2006;43(4):383–91. [PMC free article] [PubMed]
  • Wehby GL, Ohsfeldt RL, Murray JC. ‘Mendelian randomization’ equals instrumental variable analysis with genetic instruments. Statistics in Medicine. 2008b;27(15):2745–9. [PMC free article] [PubMed]
  • Werler MM, Lammer EJ, Rosenberg L, et al. Maternal cigarette smoking during pregnancy in relation to oral clefts. Am J Epidemiol. 1990;132(5):926–32. [PubMed]
  • Willer C, Speliotes JEK, Loos RJ, et al. Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat Genet. 2009;41(1):25–34. [PMC free article] [PubMed]
  • Wooldridge JM. Econometric analysis of cross section and panel data. Cambridge and London: MIT Press; 2002.
  • Wyszynski DF, Duffy DL, Beaty TH. Maternal cigarette smoking and oral clefts: a meta-analysis. Cleft Palate-Craniofacial Journal. 1997;34(3):206–10. [PubMed]
  • Yang J, Carmichael SL, Canfield M, et al. Socioeconomic status in relation to selected birth defects in a large multicentered US case-control study. Am J Epidemiol. 2008;167(2):145–54. [PubMed]
  • Yu Y, Panhuysen C, Kranzler HR, et al. Intronic variants in the dopa decarboxylase (DDC) gene are associated with smoking behavior in European-Americans and African-Americans. Hum Mol Genet. 2006;15(14):2192–9. [PubMed]