

HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Biodemography Soc Biol. Author manuscript; available in PMC 2012 January 12.
Published in final edited form as:
Biodemography Soc Biol. 2011; 57(1): 3–32.
PMCID: PMC3256988

A Genetic Instrumental Variables Analysis of the Effects of Prenatal Smoking on Birth Weight: Evidence from Two Samples

George Wehby, Ph.D. (corresponding author), Jason M. Fletcher, Ph.D. (corresponding author), Steven F. Lehrer, Ph.D., Lina M. Moreno, Ph.D., D.D.S., Jeffrey C. Murray, M.D., Allen Wilcox, M.D., Ph.D., and Rolv T. Lie, Ph.D.


There is a large literature showing the detrimental effects of prenatal smoking on birth and childhood health outcomes. It is somewhat unclear, though, whether these effects are causal or instead reflect biased reporting of smoking or other characteristics and choices of mothers who smoke that may also affect child health outcomes. In this paper, we use genetic markers that predict smoking behaviors as instruments in order to address the endogeneity of smoking choices in the production of birth and childhood health outcomes. Our results indicate that prenatal smoking produces more dramatic declines in birth weight than suggested by estimates that ignore the endogeneity of prenatal smoking, which is consistent with previous studies with non-genetic instruments. We use data from two distinct samples from Norway and the US with different measured instruments and find nearly identical results. The study provides a novel application that can be extended to study several behavioral impacts on health, social and economic outcomes.

Keywords: Smoking, Birth Weight, Infant Health, Instrumental Variables, Genetic Instruments, Mendelian Randomization

I. Introduction

Maternal health behaviors at conception and during pregnancy are important determinants of fetal growth and child development. Maternal smoking is one of the most commonly studied behavioral risk factors that affect fetal/child development and is often considered the single most important, modifiable factor affecting birth outcomes (Kramer 1987). Prenatal and postnatal exposures to cigarette smoking are leading causes of child mortality and morbidity (DiFranza and Lew 1996; Ebrahim, Floyd, Merritt, et al. 2000). Prenatal smoking has also been linked to low fetal growth, low birth weight, premature births, and sudden infant death syndrome (Schoendorf and Kiely 1992), and has been shown to increase the risk of admission to neonatal intensive care, increasing healthcare costs of the birth by $700 (Adams et al. 2002).

Several observational studies have found that prenatal maternal smoking decreases birth weight by about 250 grams (Evans and Ringel 1999; Rosenzweig 1983). Low birth weight is an important predictor of child neurodevelopment and future health and socioeconomic status (Anderson and Doyle 2003; Boardman et al. 2002; Mervis et al. 1995; Saigal et al. 2001; Victora et al. 2008; Wolf, Smit, and de Groot 2001), suggesting that fetal exposure to smoking may reduce long term health and human capital through the impact of smoking on birth outcomes. Indeed, researchers have found that maternal smoking during pregnancy is associated with greater behavioral risks for the child, including developing behavioral problems later in childhood (Weitzman, Gortmaker, and Sobol 1992), participating in criminal behavior, and lifetime nicotine dependence (Buka, Shenassa, and Niaura 2003). Maternal smoking during pregnancy has also been associated with increased risks of language problems, hyperactivity, fearfulness, and not getting along with peers (Faden and Graubard 2000).

The prevalence of prenatal smoking has fallen over time but is still high, with a substantial number of children exposed to tobacco pre- and postnatally worldwide. For example, 13.8% of women smoked during pregnancy in 2005 in the US (Tong et al. 2009). In Norway, the rate of smoking during pregnancy was 11% in 2004, compared to about 21% in 1994–1995 (Eriksson et al. 1998; Kvalvik, Skjaerven, and Haug 2008).

Discovering which behavioral factors have the greatest negative effect on fetal and child development will aid policymakers in the development of interventions to reduce these negative effects (Heckman 2000; Heckman 2008). Given that maternal health behaviors during the prenatal period are likely to influence multiple child physical and neurological outcomes (such as birth weight and neurological development), developing interventions that address these health behaviors is likely to have large returns in child and future health and to be more cost-effective in enhancing child health than specific interventions that target child developmental problems post occurrence.

The commonly reported harmful effects of prenatal smoking on fetal growth and child health may occur via various biological pathways, including cell damage, changes to the placenta, and a reduction in oxygen availability (hypoxia) to the fetus (Walsh 1994). However, endogenous maternal selection into smoking and biased reporting of smoking behaviors complicate the estimation of the causal effects of smoking on birth outcomes (Brachet 2005). Specifically, mothers who smoke during pregnancy are likely to self-select into smoking based on their preferences for health and risk taking and their perceptions of fetal health endowments. These factors, typically unobserved in available data samples, are related to fetal health through other pathways besides smoking. For example, women who smoke during pregnancy may adopt other unhealthy behaviors that may also have adverse effects on the fetus (e.g. poorer nutrition or reduced prenatal care), but may also be less likely to have a family history of poor birth outcomes. Therefore, the actual contribution of smoking to child health, independent of the confounding pathways that correlate with both smoking and child health, is an open question, and the direction of the potential net bias from estimating the effects of smoking on birth outcomes without accounting for non-random self-selection into smoking is theoretically ambiguous.

Several papers have previously evaluated the effects of smoking on birth weight using a myriad of statistical and econometric methods. Most commonly, researchers employ statistical models that attempt to adjust directly for a variety of observable characteristics that may proxy for the relevant unobservable factors. These factors include other maternal behaviors besides smoking, measures of pregnancy wantedness and maternal health (Reichman et al. 2006). Others have used propensity score matching strategies that are also limited to observable characteristics and found similar results as more traditional specifications (Almond, Chay, and Lee 2005).

Several authors have used experimental or quasi-experimental designs to attempt to estimate causal effects of prenatal smoking on birth outcomes. Permutt and Hebel (1989) use a smoking cessation intervention to introduce random variation in smoking status and find large effects of smoking cessation on birth weight (15 grams per cigarette vs. 2 grams using OLS, and an overall effect of 400 grams). Evans and Ringel (1999) use state-level cigarette taxes in an instrumental variable (IV) strategy and find no statistically discernible difference between 2SLS and baseline estimates. However, the 2SLS estimate is a 350–600 gram decrease in birth weight across several 2SLS models versus 230–250 grams across OLS models. Interestingly, the 2SLS estimates in both Permutt and Hebel and Evans and Ringel are larger than their OLS estimates. This is consistent with other studies that use IV estimation of smoking effects on birth weight, which also find generally larger adverse smoking effects using IV than those found with classical single-equation models (see also Grossman and Joyce 1990; Lien 2005; Rosenzweig 1983).

In this paper, we employ an IV model with genetic risk factors for smoking as instruments to shed more light on the causal link between maternal smoking during pregnancy and infant birth weight.1 The goal is to identify the effects of smoking using a previously unexplored source of variation in smoking that is due to individual-level differences in genetic risk factors that predispose for smoking, in order to account for unobserved factors that are related to the choice of smoking and to birth outcomes. One strength of utilizing genetic variants as instruments is that these variants are inherited at conception and, therefore, cannot be reversely affected by smoking or other behaviors. Another advantage is that confounding factors for smoking and birth weight, such as state-level health measures, are unlikely to be correlated with genetic variants compared to other instruments, such as tax rates or smoking policies (Lawlor et al. 2008; Smith et al. 2007). Similar to other instruments, there are also challenges in using genetic instruments, which we describe in detail in the instrument validity section below.

Previous IV estimations of the effects of smoking have mainly utilized taxes or changes in smoking policies between states as instruments. One limitation in these analyses is that these are aggregate level measures that only utilize area-level variation in smoking and ignore within area variation due to individual-level factors. Our primary contribution is that variation in genetic markers occurs within the individual at the molecular level. Thus, the study uses a different source of variation to identify impacts. It is of interest to use different instrument sets in order to potentially estimate different local average treatment effects (LATE). We describe below how we select the genetic instruments for this analysis.

We focus our attention on birth weight as the measure of infant health, similar to most of the previous studies. Low birth weight is generally considered to be an important predictor of other health and human capital outcomes later in life. From a policy perspective, the study findings are important for developing health policies that target pharmacotherapy or raise awareness for smoking cessation and public taxation policies that aim at reducing the negative externalities of smoking, given that accurate estimation of the effects of smoking on child health is needed in order to assess the cost-effectiveness of such policies.

There is a broad range of scientific evidence that supports the use of genetic markers as instruments for smoking. Many twin and adoption studies have demonstrated that genetic heritability is at least 50% for both smoking initiation and smoking persistence (Carmelli et al. 1992; Heath and Martin 1993; Lessov et al. 2004; Maes et al. 2004; Sullivan and Kendler 1999). Researchers have also identified several variants in nicotine, detoxification, and neurotransmitter genes that are significantly correlated with smoking behaviors, using candidate gene studies, genome-wide association studies (GWAS), and meta-analyses (Berrettini et al. 2008; Beuten 2005; Caporaso et al. 2009; Freathy et al. 2009; Li et al. 2009; Liu et al. 2010; Tyndale 2003; Vink, Staphorsius, and Boomsma 2009). Specific genetic risk factors have also been related to quitting and intensity of smoking during pregnancy (Freathy et al. 2009). Further, several studies have shown that the success of a treatment for smoking cessation may vary by neurotransmitter and detoxification pathway genes (Kortmann et al. 2009). The genetic variants that are employed in this study and described below in detail are in candidate genes for smoking.

The remaining sections are organized as follows: Section II describes the data sources, the study measures, and the model used to identify the impact of prenatal smoking on birth weight. Section III presents our empirical results. Section IV discusses the study findings and implications for public policy and future research studies.

II. Data and Methodology

A. Study Samples

The study uses two independent samples from Norway and the US and conducts all analyses separately for each sample. The goal is to compare the results between samples that differ in their populations, smoking rates, and the genetic instruments used to identify the effects of smoking in order to gauge the result sensitivity to these factors. The idea is that observing similar findings under these different factors suggests that the effect is generally insensitive to differences in these factors. We describe below each data source in detail.

A.1 Norwegian Sample

The Norwegian sample includes children born without birth defects who were enrolled as a control group in a study of oral clefts in Norway. The study is a collaboration between the National Institute of Environmental Health Sciences (NIEHS) and researchers at the University of Bergen. It involved a population survey of infants born with oral clefts in Norway in 1996 through 2001 and their parents, as well as a randomly selected control sample of infants born without oral clefts in the same period (NIEHS 2009), with a total sample of 574 cleft cases and 763 non-cleft cases recruited. The study goal is to identify behavioral, environmental and genetic risk factors for oral clefts, which are common and burdensome birth defects. DNA samples from parents and children and extensive data on maternal behaviors, household factors, and socioeconomics were collected. Behavioral data were self-reported by the mothers through self-administered questionnaires, which were completed 3 to 4 months after delivery on average. The DNA samples have been genotyped for a large list of single nucleotide polymorphisms (SNPs), which are DNA base-pair variants, including those in genes that are related to nicotine dependence and detoxification. The data have been used in several studies (Jugessur et al. 2009; Lie et al. 2008).2

Our study uses the sample of babies without oral clefts (control sample) to identify the effects of smoking on birth weight, using a set of SNPs that are involved in neurotransmitter or detoxification pathways and that are predictive of smoking in this sample. We limit the study to the control sample given that it represents a random sample of the population of births without oral clefts and in order to provide a more comparable sample to the Add Health sample, which is a general birth sample. We describe the SNP selection process below. The analytical sample includes 507 children with complete data on all the model variables, including genetic variants. Out of the 763 unaffected children, 592 samples had been genotyped (Jugessur et al. 2009).3

For the analyses using the Norway sample, we measure smoking by smoking participation and the number of cigarettes smoked per day during the first trimester.4 We evaluate the effects of any smoking participation during pregnancy given that less than daily smoking exposure may still have adverse effects on fetal development.5 We also study the number of cigarettes in order to evaluate the smoking intensity effects.

The Norway dataset provides data on several inputs and risk factors that are relevant for infant health production, including behavioral inputs (alcohol drinking, multivitamin use, maternal calorie intake and BMI during pregnancy, and pregnancy wantedness) and maternal socioeconomic and demographic characteristics. Table 1 presents the distribution of the study variables for the Norway sample with complete data on all model variables. Table A1 in the Appendix compares the distributions of study variables between the analytical sample and the sample excluded due to incomplete data. As can be seen, there are no systematic differences between the analytical and excluded samples, suggesting random data loss.

Table 1
Distribution of Study Variables in the Norway sample (N=507)

The average birth weight for the analytical Norway sample is 3647 grams. The mean population birth weight in Norway between 1999 and 2004 was around 3566 grams (Kvalvik, Skjaerven, and Haug 2008). The average rate of any first trimester smoking participation in the analytical sample is 31.7%. About 21% of the sample smoked 1 cigarette or more per day during the first trimester. Eriksson et al. (1998) reported that about 21% of pregnant women in Norway reported daily smoking around 18 weeks of pregnancy in 1994–1995 based on a multisite sample. Using data on the whole birth population, Kvalvik et al. (2008) reported that about 17.3% of pregnant women reported daily smoking at the end of the pregnancy in 1999–2001. The Norway sample includes births in 1995–2002. Therefore, the sample smoking participation rates are comparable to other study estimates for Norway.6

A.2 US Sample

The US data sample is from the restricted version of the National Longitudinal Study of Adolescent Health (Add Health). Add Health was initially designed as a school-based study of the health-related behaviors of a nationally representative sample of adolescents who were in grades 7 to 12 in 1994/5. A large number of these adolescents have subsequently been followed and interviewed two additional times, in 1995–6 and 2001–2. There are approximately 6,700 records of completed pregnancies by the time of the wave 3 survey, when the 15,000 respondents were on average 22 years old. Approximately 4,500 completed pregnancies are reported by females. Nearly 2,900 pregnancies resulted in a live birth, and nearly 2,100 pregnancies are a first birth.7 Nearly 1,850 pregnancies were reported as the first pregnancy of the relationship. We have information on birth weight and child gender for approximately 1,700 of these births. Because the DNA sample currently includes only 2,500 of the original 15,000 respondents in Add Health, our sample with genetic marker information available includes approximately 300 mothers.

For the analyses using the Add Health sample, we measure smoking by smoking participation during pregnancy.8 Table 2 presents the distribution of the study variables used in the analytical sample of the Add Health sample. Table A2 in the Appendix compares the distributions of the model variables between the analytical and the overall sample. As can be seen, the analytical sample is representative of the overall sample and data loss is thought to be random.

Table 2
Distribution of Study Variables in the Add Health sample (N=307)

The average birth weight in the Add Health sample is over 3200 grams (approximately 7.2 pounds). Seventeen percent of the sample reports smoking during pregnancy (2.5 cigarettes a day on average). Ebrahim et al. (2000) show using the BRFSS survey that the prevalence of smoking among pregnant women fell from 16.3% to 11.8% between 1987 and 1996. Ventura et al. (2003) show using recent birth certificate data that the proportion of mothers who smoke prenatally is 15% for 15–17 year olds, 19% for 18–19 year olds and 16.8% for 20–24 year olds. The CDC (2004) finds that approximately 11.4% of all US women report smoking during pregnancy in the early 1990s and 2000s. Fewer than 40% of the births in the Add Health sample were reported to be wanted at the time of the pregnancy.

B. Empirical model

A well-established empirical literature in economics has analyzed the effects of prenatal investments on birth outcomes. These studies generally estimate birth weight production functions and examine the impacts of various inputs, including pregnancy-specific behavioral investments such as parental smoking and prenatal care (Grossman and Joyce 1990; Rosenzweig 1983; Wehby et al. 2009). Following this analytical framework, the birth weight production function is specified as:


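$$ BW = \beta_0 + \beta_1 \, Smoke + \beta_2 X + \varepsilon \qquad (1) $$

where BW is birth weight, Smoke is an indicator of maternal smoking during pregnancy, X is a vector of other inputs to the production process, and ε is the error term.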

It is well known that OLS estimation of equation (1) would yield biased estimates of β1 since the mother’s decision whether or not to smoke reflects behavioral choices which are likely based on her perceptions of fetal health risks and anticipations of child health outcomes, including birth weight.9 Therefore, self-selection of risk behaviors is likely correlated with unobserved characteristics that also affect child health in equation (1).
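The direction of this bias can be sketched in the simplest case. Assume, purely for illustration, that the other inputs X are dropped, so that birth weight is $BW = \beta_0 + \beta_1 Smoke + \varepsilon$, where ε collects the unobserved characteristics. The probability limit of the OLS slope is then:

$$ \operatorname{plim} \hat{\beta}_1^{OLS} = \beta_1 + \frac{\operatorname{Cov}(Smoke, \varepsilon)}{\operatorname{Var}(Smoke)} $$

With a harmful effect (β1 < 0), a positive covariance between smoking and favorable unobserved endowments pulls the OLS estimate toward zero and understates the harm. A negative covariance has the opposite effect.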

To overcome the endogeneity problem we consider an instrumental variable (IV) analysis which exploits exogenous variation in smoking generated from a set of instruments to identify its impact on birth weight. Specifically, we use sets of genetic variants in candidate genes for smoking as the instruments.10 These variants are statistically valid instruments if they are significantly correlated with smoking, and are otherwise unrelated to birth weight through unobserved characteristics.
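These two requirements can be written compactly for a stylized case with a single instrument Z and no other inputs (both simplifications are assumptions made here for illustration, with ε denoting the unobserved determinants of birth weight BW):

$$ \operatorname{Cov}(Z, Smoke) \neq 0 \quad \text{(relevance)}, \qquad \operatorname{Cov}(Z, \varepsilon) = 0 \quad \text{(exclusion)} $$

$$ \beta_1^{IV} = \frac{\operatorname{Cov}(Z, BW)}{\operatorname{Cov}(Z, Smoke)} $$

The ratio shows why a small denominator, that is, a weakly relevant instrument, makes the estimate imprecise and sensitive to even slight violations of the exclusion condition.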

We estimate the IV model using 2SLS, which estimates the following equation by OLS prior to equation (1), with Z representing a vector of the genetic instruments:


$$ Smoke = \alpha_0 + \alpha_1 Z + \alpha_2 X + u \qquad (2) $$

The 2SLS model estimates the local average treatment effect of smoking on birth weight under well-known conditions (Angrist, Imbens, and Rubin 1996; Imbens and Angrist 1994). We conduct a Hausman test to determine whether maternal smoking should be treated as endogenous (Hausman 1978). We describe the genetic instruments below.
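The two-stage logic can be illustrated with a minimal sketch. It is not the estimation code used in the study: the data are a hypothetical, noise-free toy sample with one binary instrument and no covariates, the effect size of -400 grams is assumed, and the names `ols`, `z`, `u`, `smoke`, and `bw` are introduced here only for illustration. The unobserved endowment `u` raises both the propensity to smoke and birth weight, while being balanced across instrument values.

```python
# Toy illustration of two-stage least squares with one binary instrument.
# All numbers are hypothetical and noise-free; they are not study data.

def mean(values):
    return sum(values) / len(values)

def ols(x, y):
    """Return (intercept, slope) from a simple regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    slope = cov_xy / var_x
    return y_bar - slope * x_bar, slope

# z: instrument (e.g. carrying a risk genotype), u: unobserved endowment,
# smoke: smoking indicator. u is balanced across z but correlated with smoke.
z     = [0, 0, 0, 0, 1, 1, 1, 1]
u     = [0, 0, 1, 1, 0, 0, 1, 1]
smoke = [0, 0, 0, 1, 0, 1, 1, 1]

TRUE_EFFECT = -400  # assumed causal effect of smoking on birth weight (grams)
bw = [3500 + TRUE_EFFECT * s + 300 * ui for s, ui in zip(smoke, u)]

# Naive OLS of birth weight on smoking (confounded by u).
_, beta_ols = ols(smoke, bw)

# First stage: regress smoking on the instrument and form fitted values.
a0, a1 = ols(z, smoke)
smoke_hat = [a0 + a1 * zi for zi in z]

# Second stage: regress birth weight on the fitted smoking values.
_, beta_2sls = ols(smoke_hat, bw)

print("OLS:", beta_ols, "2SLS:", beta_2sls)
```

In this constructed sample OLS gives -250 grams while the two-stage estimate recovers the assumed -400 grams, mirroring the favorable-selection pattern discussed in the text. Standard errors from a manually run second stage are not valid, so applied work would rely on a dedicated IV routine.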

B.1 Instrument selection in Norwegian Sample

In order to identify candidate genetic instruments, we evaluate 65 SNPs in neurotransmitter and detoxification genes that are considered to be candidate genes for nicotine dependence by the NICSNP Nicotine Project11 and that have been genotyped in this sample12. We assess the SNP correlations with smoking using chi-square tests for the smoking participation indicator and ANOVA tests for the number of cigarettes.
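A screening step of this kind can be sketched as a Pearson chi-square test on a genotype-by-smoking table. The counts below are hypothetical, the function name `chi_square_independence` is introduced only for illustration, and the p-value uses the closed form that holds for two degrees of freedom (three genotypes by two smoking categories); it is not the authors' code.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic, degrees of freedom, and p-value.

    The p-value is returned only for 2 degrees of freedom, where the
    chi-square survival function has the closed form exp(-x / 2).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    p_value = math.exp(-stat / 2) if dof == 2 else None
    return stat, dof, p_value

# Hypothetical counts: rows are the three genotypes of one SNP,
# columns are (non-smokers, smokers).
counts = [[30, 10], [20, 20], [10, 30]]
stat, dof, p_value = chi_square_independence(counts)
print(stat, dof, p_value)
```

Screening many SNPs this way and keeping those with small p-values raises multiple-testing concerns, which is one reason first-stage strength is examined separately.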

SNPs in NAT2, CYP2D6, GABBR2, GABRB3, and ACTH1 have a significant or marginally significant association with the smoking measures and are used as instruments.13 Due to potential interactive effects between smoking and detoxification pathway genes on birth outcomes (Shi, Wehby, and Murray 2008), we estimate sensitivity analysis models that exclude the NAT2 and CYP2D6 variants as instruments. The instruments include two binary indicators that represent the three genotypes of each relevant SNP in order to avoid any restrictive assumptions about the recessive or dominant effects of these genes and to allow for gene dosage effects.
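The genotype coding can be sketched as follows. The function name `genotype_indicators`, the use of minor-allele counts as input, and the choice of the zero-copy genotype as the reference category are all assumptions made for illustration; the text specifies only that two binary indicators represent the three genotypes.

```python
def genotype_indicators(minor_allele_count):
    """Map a genotype (0, 1, or 2 copies of the minor allele) to two dummies.

    Returns (heterozygous, homozygous_minor); the genotype with zero copies
    is the omitted reference category (an assumption for this sketch).
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1, or 2 minor-allele copies")
    heterozygous = int(minor_allele_count == 1)
    homozygous_minor = int(minor_allele_count == 2)
    return heterozygous, homozygous_minor

# Hypothetical genotypes for five mothers at one SNP.
genotypes = [0, 1, 2, 1, 0]
instruments = [genotype_indicators(g) for g in genotypes]
print(instruments)
```

Compared with a single additive 0/1/2 regressor, which forces the second copy of an allele to have the same effect as the first, the two indicators let each genotype have its own effect on smoking at the cost of one more first-stage parameter per SNP.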

B.2 Instrument selection in Add Health Sample

While the Norwegian data include several candidate SNPs, the Add Health sample contains only six genetic markers that are in candidate genes for smoking: the dopamine transporter (DAT), the dopamine D4 receptor (DRD4), the serotonin transporter (5HTT), monoamine oxidase A (MAOA), the dopamine D2 receptor (DRD2) and the cytochrome P4502A6 (CYP2A6) gene.14 Gene-gene interactions are likely important and a growing body of evidence is consistent with this notion. For example, Skowronek et al. (2006) suggest an interactive effect between DRD4 and 5-HTTLPR in predicting tobacco use. After examining the first stage properties of the genetic markers and consulting the medical literature, we use combinations between the DRD2 and MAOA, the DRD4 and 5HTT, and the MAOA and 5HTT genes as instruments for the analysis, which are described in Table 2.15,16

B.3 Instrument Validity and Alternative Estimations

Similar to other instruments, genetic instruments may suffer from certain limitations that should be acknowledged. One challenge in using genetic instruments is that they may correlate with other “physically near” genetic variants on the same chromosome that in turn are correlated with the unobserved confounders that impact the outcome (Lawlor et al. 2008).17 However, we are aware of no such effects for the specific instruments employed in this study. Indeed, work by Fletcher and Lehrer (2009) finds no evidence of linkage disequilibrium for the genetic variants used in our study. Another potential limitation is that genetic variants may also influence the study outcome through other pathways besides the endogenous variable due to the multi-functionality of certain genes and alleles (Lawlor et al. 2008). For example, it is possible that one or more of the employed instruments may affect birth weight through unobserved behaviors or risk factors that are also related to birth weight. However, there is no current evidence that the specific instruments employed in this study are related to birth weight through such unobserved pathways.

There is no way to fully test that instruments, of any type, are truly exogenous due to the role of unobservables (Wooldridge 2002). Therefore, it is important to appeal to the theoretical strengths and genetics literature that motivate our use of genetic variants as instrumental variables. However, in order to further validate the extent to which the set of genetic instruments fit the IV assumptions, we employ the standard over-identification statistical test, which is a partial test of the extent to which instruments are truly excludable from the birth weight function. We also describe the sensitivity of our results to alternative instrument specifications as described above.18

One limitation that may be common with genetic instruments and that is relevant for our study is that instruments may have statistically “weak” effects on behaviors due to the complex etiology of behaviors, which may involve multiple genetic and non-genetic risk factors. Instruments are generally considered to be weak if they have a joint F-statistic in the first-stage equation (2) less than 10 (Staiger and Stock 1997). In our analysis the F-statistics range from 3.3 to 4.4. In this case, weak-instrument robust confidence bounds are needed for accurate inference. Therefore, in addition to standard inference using the usual asymptotic standard errors, we estimate 95% confidence bounds that are robust for weak instruments using the conditional likelihood ratio (CLR) statistic, which has more statistical power than other tests (Finlay and Magnusson 2009; Andrews, Moreira, and Stock 2006). Furthermore, we also re-estimate the IV model using limited information maximum likelihood (LIML), which has been suggested to provide less biased estimates with weak instruments compared to 2SLS (Stock, Wright, and Yogo 2002). Assuming that the instruments are exogenous (unrelated to unobserved confounders), weak instruments tend to over-reject the over-identification restrictions (Hahn and Hausman 2003). Therefore, failing to reject the over-identification restrictions if instruments are exogenous is unlikely to be a result of weak instruments.
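The rule of thumb can be illustrated with a stripped-down first stage. Assume, for illustration only, that the instruments are the two genotype indicators of a single SNP and that there are no other covariates; the joint F-statistic of the first stage then coincides with the one-way ANOVA F-statistic of smoking across the three genotype groups. The data and the name `first_stage_f` are hypothetical.

```python
def first_stage_f(groups):
    """One-way ANOVA F-statistic of the outcome across genotype groups.

    With genotype dummies as the only regressors, this equals the joint
    first-stage F-statistic for the instruments.
    """
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical cigarettes per day, grouped by genotype (0, 1, 2 minor alleles).
cigs_by_genotype = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat = first_stage_f(cigs_by_genotype)
is_weak = f_stat < 10  # Staiger and Stock (1997) rule of thumb
print(f_stat, is_weak)
```

With covariates or several SNPs the F-statistic has to come from the full first-stage regression, but the comparison against the threshold of 10 is the same.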

III. Results

A. Norway Sample

Table 3 below reports the OLS and the 2SLS coefficients of the birth weight production function in the Norway Sample.19 Under OLS, smoking participation in the first trimester reduces birth weight by about 162 grams. The number of cigarettes smoked reduces birth weight by about 12.7 grams per cigarette (marginally significant).

Table 3
Effects of Smoking on Birth Weight

Under 2SLS, smoking participation has a marginally significant and larger effect (in absolute value) on birth weight than under the OLS model. Smoking participation reduces birth weight by about 523 grams (p=0.075). However, the effect is not significant based on the 95% weak-instrument robust confidence bounds. A larger smoking effect (in absolute value) is observed under LIML, with a 614 gram decrease in birth weight (p=0.098), but the effect is insignificant based on the weak-instrument robust confidence bounds.20

Similar to smoking participation, the number of cigarettes has a larger and significant effect under 2SLS, reducing birth weight by about 46 grams (p=0.04).21 The effect is significant based on the weak-instrument robust confidence bounds. The LIML cigarette effect is slightly larger than the 2SLS estimate, with a 52 gram birth weight decrease per cigarette (p=0.053), and the effect is significant using the weak-instrument robust confidence bounds.

The instruments have significant F statistics of 3.3 for smoking participation and 3.99 for the number of cigarettes. Over-identification tests fail to reject the validity of the instruments, with p values ranging from 0.5 to 0.8.

Table A5 in the Appendix reports the sensitivity analysis models in the Norway sample described above. The 2SLS estimates of smoking participation and cigarette effects on birth weight are larger (in absolute value) when the detoxification gene instruments are excluded. The same pattern of differences between OLS and 2SLS estimates is observed when daily smoking participation (smoking at least one cigarette per day) is used as the smoking measure, with a decrease of 612 grams in birth weight under the 2SLS model (p=0.056). The over-identification restrictions cannot be rejected under these sensitivity analyses.

B. Add Health Sample

Table 3 presents the OLS and 2SLS estimates of the effects of smoking in the Add Health sample.22 Under OLS, mother’s report of smoking during pregnancy reduces birth weight by over 150 grams. The 2SLS results suggest larger effects of smoking on birth weight than those indicated in our baseline specifications. Maternal prenatal smoking reduces birth weight by 587 grams (p-value<0.15).23 The instruments have a first stage F-statistic of 4.4, and the over-identification test does not reject the excludability of the instruments from the birth weight function (p-value < 0.63).24 A larger smoking effect (in absolute value) is observed under LIML, with a 617 gram decrease in birth weight (p=0.162) and an over-identification test of p=0.632.

IV. Discussion and Conclusion

The study provides the first empirical estimation of the effects of prenatal smoking on birth weight using a source of variation in smoking that is due to individual-level genetic instruments, unlike previous studies which mostly utilize area-level instruments. The findings suggest that standard OLS estimates may significantly underestimate the harmful effects of maternal smoking during pregnancy on birth weight. Specifically, smoking may reduce birth weight by as much as three times more than what is estimated using OLS.
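The size of this gap can be checked against the point estimates reported in Section III. The short sketch below only restates those numbers; the dictionary labels are introduced here for illustration, and the Add Health OLS estimate is reported as "over 150 grams", so 150 is used as a lower bound and the resulting ratio is an upper bound.

```python
# (OLS estimate, 2SLS estimate) in grams of birth weight reduction,
# taken from the results reported in the text.
estimates = {
    "Norway, any smoking": (162, 523),
    "Norway, per cigarette": (12.7, 46),
    "Add Health, any smoking": (150, 587),  # OLS reported as "over 150"
}

ratios = {label: iv / ols for label, (ols, iv) in estimates.items()}
for label, ratio in ratios.items():
    print(f"{label}: 2SLS/OLS = {ratio:.1f}")
```

All three ratios fall between three and four, which is the basis for the statement about the magnitude of the OLS underestimate.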

It is important to acknowledge that the instruments are considered “weak,” which complicates IV inference. However, the significant weak-instrument robust confidence bounds for cigarettes in the Norway sample and the similarity of the IV smoking participation effects between the two study samples, even with different populations, smoking rates, and genetic instruments, provide support for the study results. Also, it is important to acknowledge that even if the instruments satisfy the over-identification tests at relatively high p-values, which provide statistical support for excluding the instruments from the birth weight function, it is still possible that the instruments may be related to birth weight through other unobserved factors. If so, the IV estimates would be biased as they would be reflecting the effects of other factors besides smoking. However, there are some safeguards that provide assurance against a major potential bias. First, there is no consistent a priori evidence in the literature that the instruments we employ in the Norwegian data are not exogenous or that they affect birth weight through unobserved pathways. Second, we control for several behavioral and human capital factors that are expected to serve as good proxies for potential unobserved behavioral effects on birth weight that are related to the instrument. Third, we re-estimate the Add Health model adjusting for alcohol use during pregnancy and ADHD status, which have been associated with the instruments we employ in Add Health, and find no qualitative difference in our findings (results available upon request). Fourth, we also re-estimate the Add Health model adjusting for twin-birth status, given that the Add Health sample includes a higher-than-average proportion of twins. Both the OLS and 2SLS smoking effects are virtually unaffected by this adjustment25. Finally, the insensitivity of the results to the alternative smoking measures and instrument specifications described above also provides support for the stability of the 2SLS estimates. Therefore, while it is still possible that the study findings are biased due to weak instruments and the possibility that the instruments are related to birth weight through factors other than smoking, the set of results as a whole is quite suggestive. As further knowledge becomes available on the functions of the genes that play a role in smoking, future studies will be able to further evaluate these issues in selecting the instruments for smoking.
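To make the "relatively high p-values" of the over-identification tests concrete, the sketch below converts the chi-square statistics and degrees of freedom reported in Appendix Tables A3 and A6 into p-values. It assumes the statistics are referred to a standard chi-square distribution with the listed degrees of freedom; the helper `chi2_sf` is an illustrative function written for this example using only the Python standard library, and is not taken from the software used for the estimates.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df.

    Uses the recurrence Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with t = x / 2,
    starting from Q(1/2, t) = erfc(sqrt(t)) or Q(1, t) = exp(-t).
    """
    t = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q

# (chi-square statistic, degrees of freedom) as reported in Tables A3 and A6.
overid_tests = {
    "Norway, smoking": (6.0, 7),
    "Norway, cigarettes": (8.66, 11),
    "Add Health, smoking": (0.926, 3),
}

p_values = {name: chi2_sf(stat, df) for name, (stat, df) in overid_tests.items()}

for name, p in p_values.items():
    print(f"{name}: over-identification p-value = {p:.2f}")
```

Under this assumption, none of the three tests comes close to rejecting the over-identifying restrictions, although, as noted above, a failure to reject does not by itself establish instrument validity.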

There is currently limited knowledge of what constitutes a causal genetic variant for smoking, and several efforts are underway to identify causal variants. This study uses existing genotypic data, which provides a cost-effective approach to studying the effects of smoking on birth weight using genetic instruments but imposes the limitation of being restricted to a certain set of candidate instruments. As discussed above, we select the instruments from variants in several genes that are commonly considered to be candidate genes for smoking, some of which have been reported in more than one study to be associated with smoking (e.g. GABBR2). However, we are unable to evaluate other genes and variants as candidate instruments. Identifying causal variants for smoking behaviors will enable future studies to evaluate their utility as instruments for studying the effects of smoking on health outcomes. Further, this emphasizes the importance of replicating our study in future studies that use different samples and/or different variants.

The underestimation of OLS smoking effects is consistent with most previous IV studies that also find larger harmful smoking effects on birth weight under IV estimation (Evans and Ringel 1999; Grossman and Joyce 1990; Lien 2005; Permutt and Hebel 1989; Rosenzweig 1983). This may result from “favorable” self-selection into smoking based on unobservable factors that also affect birth weight. For example, a favorable history of pregnancy outcomes and infant health in the immediate and extended family of the mother may increase the propensity of the mother to smoke during pregnancy with the study infant, ceteris paribus, but may also improve the infant’s birth weight through correlated unobserved genetic or social health endowments. Moreover, mothers whose parents smoked and had favorable health and pregnancy outcomes may be more likely to smoke themselves during pregnancy than mothers whose parents smoked and had unfavorable health and pregnancy outcomes. It is expected that the unobserved genetic or social endowments that contribute to better health and pregnancy outcomes in the first group of families may also contribute to improving the birth weight of infants of mothers in this group compared to the second group. Unobserved maternal health characteristics may also contribute to favorable selection into smoking, conditional on the other determinants of smoking. Specifically, healthier mothers may be more likely to smoke during pregnancy compared to less healthy mothers due to their perceptions of low health risks. The underestimation of smoking effects on birth weight in OLS is similar to the underestimation of prenatal care effectiveness due to adverse maternal self-selection into prenatal care (Rosenzweig 1983; Wehby et al. 2009).
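The direction of the bias under "favorable" self-selection can be illustrated with a small simulation. The sketch below is not based on the study data: it generates artificial observations in which an unobserved endowment raises both smoking intensity and birth weight, and a hypothetical binary genotype indicator shifts smoking only. All parameter values, variable names, and the single-instrument setup are assumptions made for illustration, and the instrument is deliberately much stronger than those available in the study.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

def simulate(n=20000, true_effect=-50.0, seed=0):
    """Return (OLS slope, IV slope) for artificial data with favorable selection."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = 1.0 if rng.random() < 0.5 else 0.0   # hypothetical genotype indicator
        u = rng.gauss(0.0, 1.0)                   # unobserved health endowment
        xi = zi + u + rng.gauss(0.0, 1.0)         # stylized smoking intensity
        # The endowment also raises birth weight, so smokers look healthier.
        yi = 3500.0 + true_effect * xi + 100.0 * u + rng.gauss(0.0, 100.0)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    ols = cov(x, y) / cov(x, x)   # slope from regressing y on x
    iv = cov(z, y) / cov(z, x)    # single-instrument IV (Wald) estimator
    return ols, iv

ols_slope, iv_slope = simulate()
print(f"OLS slope: {ols_slope:.1f}, IV slope: {iv_slope:.1f}, true effect: -50.0")
```

In this artificial setting the OLS slope is pulled toward zero by the omitted endowment, while the IV slope stays close to the true effect, which mirrors the pattern of a positive bias in the OLS estimates.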

Biased maternal reporting of prenatal smoking status after delivery based on observed infant health and birth weight may also contribute to underestimation of the adverse smoking effects on birth weight. Specifically, mothers of infants with poorer birth outcomes may underreport their prenatal smoking status due to feelings of guilt, which would result in a positive bias in the OLS smoking effects. In both the Norway and Add Health samples, mothers reported their prenatal smoking status after delivery. However, in the Norway sample, all mothers who report smoking at the time of their first prenatal visit also report smoking during the first trimester in the post-delivery survey (Lie et al. 2008). This provides some assurance against significant underreporting of smoking in the Norway sample based on observed birth weight.26

Other unobserved health factors such as maternal preferences for child health and risk taking which result in correlations between several maternal health and risk behaviors may result in a positive bias in OLS smoking effects. For instance, smoking mothers may also adopt less healthy behaviors during pregnancy (drug use, poor nutrition, less exercise, more stress). The net OLS bias is a function of the contribution of all unobserved factors, which may contribute differently to this bias, as described above. The study results suggest that conditional on the observed maternal behaviors and other production inputs that may influence the infant’s birth weight, the net effect of unobservable factors results in a positive bias in OLS estimates.

We must also be aware of the issues of population stratification in studies that use genetic variants as instruments. While the Norwegian population is quite homogeneous, the Add Health data comes from a representative sample of the US. In order to attempt to control for population stratification, we adjusted the analyses for racial/ethnic group membership. Our Add Health sample is far too small to report separate estimates for each racial group (65 blacks and 57 Hispanics), so we conducted auxiliary analyses for the Caucasian mothers in our sample. The estimates were somewhat larger (−660 grams), statistically significant at the 10% level, and had a first stage F-statistic of 5.2. Therefore, we place more weight on the results for Caucasian mothers in both samples and exercise caution in extrapolating the results to other populations until further evidence is available.

The instruments used represent variants in genes that are candidates for contributing to the etiology of smoking. Other genetic variants have been identified that also contribute to smoking behaviors. Unfortunately, many of these have not yet been measured in the data samples available for this study. In the future, we hope to replicate this study with other genetic instruments for smoking when they are measured. Studies using genetic instruments in other samples are also needed in order to gauge the sensitivity of the results to alternative samples and genetic specifications. The study supports the utility of employing genetic instruments to identify the effects of maternal smoking on infant health. Other maternal risk behaviors and health characteristics, including alcohol use and obesity, are also influenced by genetic factors, and previous studies have identified specific genetic variants that may contribute to their etiology. Therefore, genotyping these genetic variants in relevant data samples with maternal DNA is a major research priority in order to evaluate the utility of these variants as instruments for maternal alcohol use and weight and in order to obtain accurate estimates of the effects of these risk behaviors on infant health outcomes.

The study results have major implications for developing public health policies to reduce the negative externalities of maternal smoking on child health. The study highlights the need for further public health efforts to increase the awareness of women of childbearing age about the harmful effects of smoking, which the current study suggests exceed previous findings. Further, the study suggests that taxation rates and smoking ban policies need to take into account a larger negative effect of smoking on child health that may also be carried into larger health risks and poorer socioeconomic outcomes during adulthood. Finally, the study supports the development of additional policy interventions that enhance early life investments in health beginning at conception and reduce maternal risk behaviors as these may have large individual and social health and economic returns.


This research uses data from the Norway Facial Cleft Study (NCL) and Add Health. The research using the NCL sample was supported by NIDCR grants R03 DE018394 and R01 DE020895. Add Health is a program project designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris, and funded by a grant P01-HD31921 from the National Institute of Child Health and Human Development, with cooperative funding from 17 other agencies. Special acknowledgment is due Ronald R. Rindfuss and Barbara Entwisle for assistance in the original design. Persons interested in obtaining data files from Add Health should contact Add Health, Carolina Population Center, 123 W. Franklin Street, Chapel Hill, NC 27516-2524 (addhealth@unc.edu). We thank Jeremy Green for helpful comments and research assistance.


Table A1

Distribution of study variables in the NCL analytical and excluded samples due to incomplete data

| Variable | Analytical sample, mean (SD) | Excluded sample^e, mean (SD) |
|---|---|---|
| Birth weight | 3646.64** (589.76) | 3540.04 (585.64) |
| Smoking | 0.32 (0.47) | 0.32 (0.47) |
| Daily smoking | 0.27 (0.44) | 0.29 (0.45) |
| Cigarettes | 1.83 (3.79) | 2.36 (4.61) |
| Binge drinking in the first trimester^a | 0.05 (0.21) | 0.06 (0.23) |
| Moderate drinking in the first trimester^a | 0.09 (0.29) | 0.09 (0.28) |
| Low drinking in the first trimester^a | 0.18 (0.39) | 0.14 (0.34) |
| Binge drinking before pregnancy^a | 0.38 (0.49) | 0.4 (0.49) |
| Moderate drinking before pregnancy^a | 0.4 (0.49) | 0.41 (0.49) |
| Low drinking before pregnancy^a | 0.13* (0.34) | 0.09 (0.29) |
| Multivitamin use | 0.38 (0.49) | 0.34 (0.47) |
| Underweight^b | 0.03 (0.17) | 0.05 (0.22) |
| Overweight^b | 0.19 (0.39) | 0.18 (0.38) |
| Obese^b | 0.07 (0.25) | 0.08 (0.26) |
| Calories | 2281.19 (751.15) | 2300.01 (928.38) |
| Maternal age | 29.43 (4.7) | 28.84 (4.91) |
| Married | 0.53 (0.5) | 0.53 (0.5) |
| Less than high school^c | 0.09*** (0.28) | 0.17 (0.37) |
| High school^c | 0.27 (0.44) | 0.3 (0.46) |
| Technical college^c | 0.2 (0.4) | 0.21 (0.41) |
| University^c | 0.07 (0.26) | 0.04 (0.2) |
| Maternal employment | 0.86 (0.35) | 0.83 (0.37) |
| Very low maternal income^d | 0.34* (0.47) | 0.41 (0.49) |
| Moderate maternal income^d | 0.25 (0.43) | 0.2 (0.4) |
| High maternal income^d | 0.15 (0.35) | 0.13 (0.34) |
| Pregnancy planning | 0.74 (0.44) | 0.71 (0.45) |
| rs1041983_C/T | 0.43 (0.5) | 0.45 (0.5) |
| rs1041983_C/C | 0.49 (0.5) | 0.36 (0.49) |
| rs1930139_A/G | 0.32 (0.47) | 0.26 (0.44) |
| rs1930139_G/G | 0.05 (0.21) | 0.11 (0.31) |
| rs721398_C/T | 0.39 (0.49) | 0.49 (0.51) |
| rs721398_C/C | 0.07 (0.26) | 0.07 (0.26) |
| rs5758589_A/G | 0.49 (0.5) | 0.5 (0.51) |
| rs5758589_G/G | 0.25 (0.44) | 0.24 (0.43) |
| rs1432007_A/G | 0.52 (0.5) | 0.56 (0.5) |
| rs1432007_G/G | 0.26 (0.44) | 0.24 (0.43) |
| rs2059574_A/T | 0.47 (0.5) | 0.52 (0.51) |
| rs2059574_T/T | 0.34 (0.47) | 0.36 (0.49) |
| rs2268973_A/G | 0.4 (0.49) | 0.36 (0.48) |
| rs2268973_G/G | 0.52 (0.5) | 0.48 (0.51) |

^a The reference category is no drinking.
^b The reference category is normal weight (18.5≤BMI<25).
^c The reference category is 2–4 years of college.
^d The reference category is 151,000–200,000 kr.
^e The distribution in the excluded sample for a certain variable is based on the observations that had data for that variable (these observations did not have complete data on all study variables). In the Norway sample, about 250 observations had data on non-genetic variables except for calories (about 200 observations). About 40–45 observations had data on the genetic instruments.
*, **, and *** indicate significant differences between the analytical and excluded samples at p<0.1, 0.05 and 0.01 respectively.

Table A2

Distribution of study variables in the Add Health analytical and excluded samples due to incomplete data

| Variable | Analytical sample, mean (SD) | Excluded sample^a, mean (SD) |
|---|---|---|
| Birth weight (grams) | 3246 (556) | 3260 (568) |
| Smoke During Pregnancy | 0.17 (0.38) | 0.17 (0.38) |
| Age (wave 1) | 17.43 (1.64) | 17.43 (1.6) |
| Male baby | 0.51 (0.5) | 0.52 (0.5) |
| Married at Birth | 0.28 (0.45) | 0.26 (0.44) |
| Black | 0.21 (0.41) | 0.3 (0.46) |
| Hispanic | 0.19 (0.39) | 0.16 (0.37) |
| Family Income (during adolescence) | 36.0 (30.78) | 35.58 (33.62) |
| Grandmother Education | 12.45 (2.04) | 12.56 (2.01) |
| No Health Insurance (wave 3) | 0.25 (0.44) | 0.22 (0.41) |
| Medicaid (wave 3) | 0.26 (0.44) | 0.26 (0.44) |
| Want Child Before Pregnancy | 0.38 (0.49) | 0.37 (0.48) |

^a N = 1691.

Table A3

Birth Weight Function Coefficients in Norway Sample

| Variable | OLS Smoking | OLS Cigarettes | 2SLS Smoking | 2SLS Cigarettes |
|---|---|---|---|---|
| Smoking | −161.7** (63.74) | | −522.5* (293.4) | |
| Cigarettes | | −12.66 (8.256) | | −46.21** (22.85) |
| Binge drinking in the first trimester | 59.55 (137.2) | 47.33 (137.8) | 117.5 (147.7) | 83.72 (148.8) |
| Moderate drinking in the first trimester | 126.8 (91.89) | 111.4 (91.35) | 221.1** (112.8) | 182.7* (103.2) |
| Low drinking in the first trimester | 96.64 (73.96) | 84.51 (75.52) | 156.3* (89.16) | 123.3 (80.79) |
| Binge drinking before pregnancy | −176.4 (112.3) | −210.0* (111.8) | −71.88 (128.0) | −174.7 (112.6) |
| Moderate drinking before pregnancy | −69.72 (108.4) | −92.46 (108.9) | −34.73 (108.2) | −111.2 (111.1) |
| Low drinking before pregnancy | 99.63 (119.3) | 79.95 (119.6) | 117.1 (117.4) | 48.52 (122.3) |
| Multivitamin use | −90.13* (53.56) | −91.43* (53.74) | −101.5* (55.86) | −108.4** (54.57) |
| Underweight | −257.6** (101.6) | −284.3*** (100.6) | −197.5 (123.4) | −283.9*** (103.8) |
| Overweight | 107.9* (61.65) | 107.2* (62.10) | 116.5* (61.90) | 115.7* (60.94) |
| Obese | 77.52 (131.2) | 67.65 (129.4) | 108.3 (139.5) | 77.98 (129.2) |
| Calories | 0.0115 (0.0358) | 0.00693 (0.0352) | 0.0273 (0.0394) | 0.0136 (0.0352) |
| Maternal age | 10.65 (6.572) | 10.35 (6.549) | 12.28* (6.787) | 11.47* (6.557) |
| Married | −68.63 (57.55) | −68.58 (57.38) | −93.70 (62.87) | −98.21 (62.44) |
| Less than high school | −2.996 (101.7) | −12.19 (103.7) | 87.71 (129.6) | 71.16 (122.6) |
| High school | 62.46 (68.28) | 50.23 (68.10) | 105.6 (72.63) | 69.02 (66.82) |
| Technical college | −92.98 (78.27) | −104.0 (78.33) | −38.16 (92.93) | −68.29 (81.38) |
| University | 0.125 (105.9) | 14.41 (107.9) | −9.110 (107.3) | 41.32 (116.4) |
| Maternal employment | 4.716 (81.44) | 16.67 (81.98) | −49.57 (93.47) | −16.11 (85.36) |
| Very low maternal income | 98.28 (69.89) | 96.38 (69.72) | 110.1 (71.17) | 105.4 (69.56) |
| Moderate maternal income | −48.59 (79.60) | −49.23 (79.42) | −30.41 (78.32) | −29.36 (77.50) |
| High maternal income | −86.23 (101.5) | −83.38 (103.0) | −94.69 (102.9) | −85.87 (105.2) |
| Pregnancy planning | 56.70 (61.52) | 58.24 (62.15) | 40.61 (61.70) | 43.21 (61.58) |
| Constant | 3408.6*** (263.5) | 3424.8*** (263.9) | 3394.8*** (265.9) | 3451.1*** (261.2) |
| Instrument F statistic [df] | | | 3.33*** [8, 476] | 3.99*** [12, 472] |
| Over-identification Chi-square [df] | | | 6.0 [7] | 8.66 [11] |
| Wu-Hausman F statistic [df] | | | 1.66 [1, 482] | 2.53 [1, 482] |

Heteroscedasticity-robust asymptotic standard errors are in parentheses.

*p < 0.1, **p < 0.05, ***p < 0.01

Table A4

First Stage Regression Estimates in the Norway sample

| Variable | Smoking | Cigarettes |
|---|---|---|
| Binge drinking in the first trimester | 0.172* (0.0960) | 0.983 (0.776) |
| Moderate drinking in the first trimester | 0.258*** (0.0691) | 2.320*** (0.562) |
| Low drinking in the first trimester | 0.171*** (0.0529) | 1.233*** (0.426) |
| Binge drinking before pregnancy | 0.300*** (0.0767) | 1.122* (0.620) |
| Moderate drinking before pregnancy | 0.101 (0.0757) | −0.773 (0.614) |
| Low drinking before pregnancy | 0.0192 (0.0860) | −1.204* (0.694) |
| Multivitamin use | −0.0424 (0.0402) | −0.605* (0.324) |
| Underweight | 0.164 (0.114) | 0.138 (0.922) |
| Overweight | 0.00861 (0.0496) | 0.214 (0.401) |
| Obese | 0.0749 (0.0782) | 0.271 (0.632) |
| Calories | 0.00004 (0.00003) | 0.0001 (0.0002) |
| Maternal age | 0.00504 (0.00459) | 0.0516 (0.0373) |
| Married | −0.0661 (0.0413) | −0.891*** (0.334) |
| Less than high school | 0.216*** (0.0786) | 2.172*** (0.636) |
| High school | 0.109** (0.0521) | 0.507 (0.418) |
| Technical college | 0.133** (0.0589) | 0.953** (0.473) |
| University | −0.0235 (0.0842) | 0.862 (0.681) |
| Maternal employment | −0.142** (0.0624) | −0.915* (0.506) |
| Very low maternal income | 0.0410 (0.0524) | 0.466 (0.425) |
| Moderate maternal income | 0.0326 (0.0570) | 0.342 (0.462) |
| High maternal income | −0.0245 (0.0709) | −0.166 (0.570) |
| Pregnancy planning | −0.0517 (0.0458) | −0.388 (0.371) |
| rs1041983_C/T | −0.109 (0.0717) | −1.029 (1.355) |
| rs1041983_C/C | −0.00760 (0.0713) | 0.312 (1.499) |
| rs1930139_A/G | 0.0782* (0.0415) | |
| rs1930139_G/G | 0.0839 (0.0908) | |
| rs1432007_A/G | −0.0973* (0.0497) | −0.955** (0.404) |
| rs1432007_G/G | −0.0738 (0.0565) | −0.469 (0.463) |
| rs2268973_A/G | −0.157** (0.0746) | −0.563 (0.607) |
| rs2268973_G/G | −0.204*** (0.0725) | −1.507** (0.591) |
| rs721398_C/T | | 1.077 (0.737) |
| rs721398_C/C | | 1.136 (1.593) |
| rs5758589_A/G | | −0.116 (0.383) |
| rs5758589_G/G | | −0.962** (0.436) |
| rs2059574_A/T | | 0.703* (0.414) |
| rs2059574_T/T | | −0.618 (0.433) |
| Constant | 0.226 (0.207) | 2.164 (2.112) |

Heteroscedasticity-robust asymptotic standard errors are in parentheses.

*p < 0.1, **p < 0.05, ***p < 0.01

Table A5

Sensitivity Analysis Models in the Norway sample

| Variable | Smoking 2SLS^a | Cigarettes 2SLS^a | Daily smoking OLS^b | Daily smoking 2SLS^b |
|---|---|---|---|---|
| Smoking | −676.3* (358.0) | | | |
| Cigarettes | | −57.45** (28.85) | | |
| Daily smoking | | | −107.1 (68.07) | −612.2* (320.5) |
| Binge drinking in the first trimester | 142.2 (159.9) | 95.91 (156.2) | 45.53 (136.1) | 101.8 (150.0) |
| Moderate drinking in the first trimester | 261.4** (126.0) | 206.5* (110.0) | 109.0 (92.90) | 224.6** (112.5) |
| Low drinking in the first trimester | 181.8* (96.85) | 136.2 (84.30) | 86.44 (76.24) | 164.5* (91.70) |
| Binge drinking before pregnancy | −27.30 (142.9) | −162.9 (116.0) | −199.2* (112.1) | −85.61 (121.6) |
| Moderate drinking before pregnancy | −19.81 (112.2) | −117.5 (113.5) | −82.87 (108.1) | −70.96 (107.3) |
| Low drinking before pregnancy | 124.5 (119.8) | 37.98 (125.5) | 91.85 (119.2) | 92.06 (118.3) |
| Multivitamin use | −106.4* (57.86) | −114.1** (56.65) | −86.88 (54.01) | −95.63* (57.28) |
| Underweight | −171.9 (135.6) | −283.7*** (106.4) | −275.1*** (101.3) | −231.1* (124.3) |
| Overweight | 120.2* (64.43) | 118.5* (61.75) | 104.4* (61.94) | 106.2* (62.30) |
| Obese | 121.4 (145.7) | 81.44 (130.6) | 72.68 (130.2) | 114.8 (143.5) |
| Calories | 0.0341 (0.0416) | 0.0159 (0.0355) | 0.00818 (0.0354) | 0.0260 (0.0394) |
| Maternal age | 12.97* (7.008) | 11.85* (6.621) | 9.970 (6.604) | 10.18 (6.861) |
| Married | −104.4 (66.40) | −108.1* (65.53) | −66.77 (57.59) | −111.0 (68.41) |
| Less than high school | 126.4 (144.2) | 99.09 (132.7) | −17.47 (102.5) | 106.0 (136.9) |
| High school | 124.0 (77.86) | 75.31 (67.78) | 51.10 (68.48) | 88.63 (73.30) |
| Technical college | −14.79 (101.2) | −56.31 (85.20) | −97.79 (79.61) | −4.656 (103.6) |
| University | −13.05 (111.2) | 50.33 (122.0) | 1.775 (106.2) | −9.955 (108.7) |
| Maternal employment | −72.71 (101.2) | −27.09 (88.11) | 15.68 (81.90) | −47.34 (93.51) |
| Very low maternal income | 115.2 (75.38) | 108.4 (70.80) | 92.13 (69.99) | 88.13 (74.23) |
| Moderate maternal income | −22.66 (81.51) | −22.69 (78.59) | −53.50 (79.90) | −38.26 (80.82) |
| High maternal income | −98.30 (105.1) | −86.70 (107.3) | −88.63 (102.4) | −117.8 (112.3) |
| Pregnancy planning | 33.75 (63.02) | 38.17 (62.87) | 59.28 (61.87) | 37.43 (63.83) |
| Constant | 3388.9*** (274.0) | 3459.9*** (265.3) | 3429.7*** (265.0) | 3500.0*** (278.6) |
| F statistic [df] | 2.88*** [6, 478] | 4.52*** [6, 478] | | 3.18*** [8, 478] |
| Over-identification Chi-square [df] | 4.59 [5] | 7.1 [5] | | 4.74 [7] |
| Wu-Hausman F statistic [df] | 2.3 [1, 482] | 2.98 [1, 482] | | 2.76* [1, 482] |

Heteroscedasticity-robust asymptotic standard errors are in parentheses.

*p < 0.1, **p < 0.05, ***p < 0.01
^a The detoxification gene (NAT2 and CYP2D6) instruments are excluded.
^b Women who smoked less than one cigarette per day on average are considered non-smokers (daily smoking defined as smoking 1 cigarette per day or more).

Table A6

Birth Weight Function Coefficients in Add Health Sample

| Variable | OLS | 2SLS |
|---|---|---|
| Smoke During Pregnancy | −154.907* (88.511) | −586.857 (409.118) |
| Age (wave 1) | 0.035 (22.339) | 3.286 (22.230) |
| Male Baby | 120.112* (65.475) | 114.881* (65.743) |
| Married at Time of Birth | 67.252 (76.857) | 7.854 (97.790) |
| Black | −81.584 (80.032) | −206.224 (126.365) |
| Hispanic | 96.933 (103.675) | −8.166 (136.918) |
| Family Income | 2.221*** (0.730) | 2.176*** (0.669) |
| Maternal Education | 3.820 (22.537) | −1.340 (23.379) |
| No Health Insurance (wave 3) | 105.727 (74.230) | 148.051* (83.208) |
| Medicaid (wave 3) | 31.645 (90.822) | 58.944 (92.977) |
| Want Child | −34.866 (70.966) | −48.549 (71.466) |
| Constant | 3,040.608*** (572.485) | 3,184.209*** (636.250) |
| F-statistic [df] | | 4.353*** [3, 301] |
| Over-identification Chi-square [df] | | 0.926 [3] |
| Wu-Hausman F statistic [df] | | 0.82 [1, 289] |

Heteroscedasticity-robust asymptotic standard errors are in parentheses.


Table A7

First Stage Regression Estimations in the Add Health sample

| Variable | Coefficient (SE) |
|---|---|
| Age | −0.012 (0.013) |
| Male Baby | −0.005 (0.042) |
| Married at Time of Birth | −0.119** (0.049) |
| Black | −0.233*** (0.046) |
| Hispanic | −0.238*** (0.048) |
| Family Income ($1000s) | 0.002 (0.007) |
| Maternal Education | −0.017* (0.010) |
| No Health Insurance (wave 3) | 0.097* (0.056) |
| Medicaid (wave 3) | 0.078 (0.052) |
| Want Child | −0.016 (0.044) |
| A1A2 × MAOA4R1 | −0.101** (0.048) |
| SL × DRD4R1 | 0.138* (0.070) |
| SL × MAOAR2 | −0.150*** (0.053) |
| Constant | 0.713** (0.280) |

Heteroscedasticity-robust asymptotic standard errors are in parentheses.

*p < 0.1, **p < 0.05, ***p < 0.01


The authors do not have any conflicts of interest in this work.

1The use of genetic instruments is sometimes referred to as “Mendelian Randomization” in epidemiology, but the approach is a standard instrumental variable application with genetic instruments (Wehby, Ohsfeldt, and Murray 2008).

2See study website

3The genotyped sample did not include samples of 171 control children due to inadequate and/or low quality DNA samples. The analytical sample excludes 85 cases from the genotyping sample due to missing data on the study variables, including genotypic data. The data loss is not correlated with any characteristics that are related to smoking and birth weight as described below, and is therefore thought to be random and not systematic.

4The first trimester is a critical period for fetal development and maternal exposures during this period are likely to have large effects on fetal growth and birth weight. The majority (about 74%) of mothers who smoked at pregnancy continued to do so during the first trimester.

5In a sensitivity analysis, we estimated the impact of daily smoking (i.e. a minimum of 1 cigarette per day) during the first trimester on birth weight.

6The smoking rates in the Norway sample reported at the first prenatal visit at about 10.3 weeks of gestation on average (available through the Medical Birth Registry) were lower than first trimester smoking rates reported post delivery in the NCL survey, suggesting that smoking status later in pregnancy may not accurately reflect first trimester smoking due to quitting during pregnancy (Lie et al. 2008).

7We identify first births by examining whether the reported age of the mother at the time the pregnancy ended was the lowest age of all observations for that mother.

8The Add Health sample had no data on number of cigarettes smoked—only number of packs smoked.

9These expectations are based on maternal perceptions of the biologic and environmental risk factors that contribute to maternal and child’s health (health endowments). However, maternal risk perceptions and health endowments are inadequately measured in typically available datasets for birth outcomes studies, including the samples used in this study. The net bias in estimating the effects of smoking on birth weight by estimating equation (1) via OLS is a function of the average positive and negative biases in an available sample and cannot be signed a priori. For example, a potential mother may choose to smoke prior to pregnancy in part due to her perceptions of her health risks and how these risks might be affected by smoking. During pregnancy, the mother will decide to continue to smoke or not in part due to her perceptions of her health risks during pregnancy, of fetal health risks, and of the effect of smoking on these risks. Thus, a woman may decide to smoke during pregnancy in part because she considers herself to be healthy and considers that continuing to smoke will have no adverse effects on her and her child’s health. In this case, unobservable indicators of health endowments that are positively correlated with both smoking and child health (such as no history of low birth weight in the family or in previous pregnancies, history of smoking without health problems in the family, and others) may result in an underestimation (positive bias) of the negative effects of smoking on birth weight. On the other hand, mothers who smoke are likely to have, on average, stronger preferences for current versus future consumption and are therefore more likely to engage in other unhealthy behaviors besides smoking, such as poor nutrition, lack of exercise, drug use, overall risk taking, and others. These factors likely have negative effects on birth weight. If some of the health behaviors correlated with smoking and birth weight are unobserved, as is typically the case in available datasets, the effects of smoking on birth weight may be overestimated (negative bias). The net bias in estimating the effects of smoking on birth weight is a function of the average positive and negative biases in an available sample and cannot be signed a priori.

10A few studies (Evans and Ringel 1999; Rosenzweig 1983) among others have used an IV strategy to account for self-selection into prenatal smoking. As we discuss above, the main contribution is to exploit a new source of variation generated by genetic instruments that vary at the individual level as opposed to group or state level instruments of tax rates.


12These were GSTM1, UGT1A7, NAT1, NAT2, CYP2E1, CHRNA4, GSTT1, CYP2D6, GABBR2, GABRB3, DDC, GAD1, and KCNJ2. We also consider SNPs in the gene ACTN1 which has been identified in a recent GWA study of smoking and that have been genotyped in this sample (Caporaso et al. 2009).

13rs1041983 (NAT2), rs1930139 (GABBR2), rs1432007 (GABRB3), and rs4906908 (GABRB3) are correlated with smoking participation and are used as instruments. rs1041983 (NAT2), rs721398 (NAT2), rs1432007 (GABRB3), rs2059574 (GABRB3), rs5758589 (CYP2D6), and rs2268973 (ACTN1) are correlated with the number of cigarettes and are used as instruments. NAT2 and CYP2D6 are genes of detoxification pathways and have been implicated in several types of cancer, including breast, prostate, bladder cancer, and others (Abdel-Rahman et al. 2000; Sanderson, Salanti, and Higgins 2007), especially when combined with smoking and alcohol, though results are generally inconsistent across studies. Given that the study is limited to women of childbearing age and is focused on birth outcomes, it is unlikely that these variants affected the studied outcomes through their effect on cancer risks. GABBR2 and GABRB3 are genes that code receptors for the neurotransmitter GABA, which are involved in neurological inhibition. GABBR2 has been implicated in smoking behaviors. Previous studies of smoking genetics that included GABRB3 did not report significant results. ACTN1 is involved in coding for cytoskeletal proteins and has overall no well documented disease associations and functions, but a SNP in ACTN1 has recently been found in a GWA study to be significantly related to a threshold indicator of number of cigarettes per day (Caporaso et al. 2009).

14All of these genes except for DRD4 are considered to be candidate genes for smoking by the NICSNP Nicotine Project. However, DRD4 has been linked in previous studies to smoking behaviors (Laucht et al. 2008; Hutchison et al. 2002). Comings et al. (1996) find an association between DRD2 and smoking behavior, while Jin et al. (2006) find an association between MAOA and smoking behaviors, and Gerra et al. (2005) find an association between 5-HTT and smoking behaviors (Comings 1996; Gerra et al. 2005).

15There is no consistent evidence in the literature for interactive effects between the genetic variants used as instruments in the analysis of the Norway sample. Therefore, we do not include interaction terms between these variants in the Norway sample analysis. Using binary indicators as instruments for the main effects of these genetic variants as done in the Norway data model is suboptimal as these indicators have insignificant effects, which weakens the first stage.

16Four new SNPs have recently been added to the Add Health data, but they were weaker predictors of smoking behaviors of pregnant women in our sample than those we use in this paper.

17This may occur due to the correlations between alleles that are tightly linked within a certain genomic area on a certain chromosome (referred to as linkage-disequilibrium).

18See Fletcher and Lehrer (2009), Ding et al. (2006, 2009) and Norton and Han (2008) for other applications that use genetic markers as instruments. These studies use a similar approach to evaluate the instrument validity (Ding 2006; Ding et al. 2009; Fletcher and Lehrer 2009; Norton and Han 2008).

19Table A3 in the Appendix reports the full regression results along with the tests of the IV assumption and Table A4 reports the coefficients of the first stages of the 2SLS models.

20The exogeneity of smoking participation is not rejected based on a Hausman test (p=0.189).

21The exogeneity of cigarettes is not rejected based on a Hausman test (p=0.107).

22Table A6 reports the full OLS and 2SLS regression results.

23The Add Health Wave IV data have recently been released. At a reviewer’s request, we attempted to increase our sample by including the births that occurred between the two waves of data. While we were able to double our sample, the instruments were somewhat weaker in the larger sample (F-stat < 2). The point estimates were nearly identical to those in the tables (−570 in the large sample vs. −586 in this paper). The reasons for the weaker instruments in the large sample are unknown but could be related to the composition of the new births, which are from older, more advantaged mothers. It appears that these genetic variants play a smaller role in the smoking decisions of these mothers.

24First-stage results are available in Appendix Table A7.

25In the Add Health models that adjust for twin-birth status, the smoking coefficient is −144 (marginally significant), −551 and −571 under OLS, 2SLS and LIML, respectively. The instrument effects on smoking are also virtually unaffected (F-statistic = 4.22). Further detailed results are available upon request.

26The rates of smoking reported post delivery in the Norway sample were higher than those reported during the first prenatal visit. This may be due to some mothers stopping smoking before their first prenatal visit (around 10 weeks of pregnancy on average in this sample) or due to underreporting of smoking during prenatal visits (Lie et al. 2008). Therefore, biased reporting of smoking cannot be completely ruled out as a potential contributor to underestimation of smoking effects by OLS in the Norway sample.
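The first-birth rule described in footnote 7 (a birth is treated as a first birth when the mother's reported age at the end of that pregnancy is the lowest across all of her records) can be sketched as follows. The record layout and the field names `mother_id` and `age_at_pregnancy_end` are hypothetical and are not the Add Health variable names; how ties were handled in the study is not stated, so this sketch simply flags every record that shares the minimum age.

```python
def flag_first_births(records):
    """Return one boolean per record: True if it is the mother's earliest pregnancy.

    Each record is a dict with the hypothetical keys 'mother_id' and
    'age_at_pregnancy_end'. Records tied at the minimum age are all flagged.
    """
    # First pass: lowest reported age at pregnancy end for each mother.
    min_age = {}
    for rec in records:
        mother, age = rec["mother_id"], rec["age_at_pregnancy_end"]
        if mother not in min_age or age < min_age[mother]:
            min_age[mother] = age
    # Second pass: flag the records that match that minimum.
    return [rec["age_at_pregnancy_end"] == min_age[rec["mother_id"]] for rec in records]

# Illustrative records only; these are not study data.
example_records = [
    {"mother_id": "A", "age_at_pregnancy_end": 19},
    {"mother_id": "A", "age_at_pregnancy_end": 22},
    {"mother_id": "B", "age_at_pregnancy_end": 24},
]
first_birth_flags = flag_first_births(example_records)
print(first_birth_flags)
```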

Contributor Information

George Wehby, Assistant Professor of Health Economics, Dept. of Health Management and Policy, College of Public Health, University of Iowa, 200 Hawkins Drive, E205 GH, Iowa City, IA 52242 USA, Phone: 319-384-5133; Fax: 319-384-5125.

Jason M. Fletcher, Assistant Professor of Public Health, Division of Health Policy and Administration, Department of Epidemiology and Public Health, Yale University, 60 College St, #303, New Haven, CT 06520.

Steven F. Lehrer, Queen’s University, School of Policy Studies, Kingston, Ontario, Canada, K7L 3N6.

Lina M. Moreno, Assistant Professor, University of Iowa, N401 DSB, Iowa City, IA, 52242, USA.

Jeffrey C. Murray, University of Iowa, Dept of Pediatrics, Iowa City, IA 52242, USA, Phone 1 319 335 6897.

Allen Wilcox, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA.

Rolv T. Lie, University of Bergen, Bergen, Norway.


  • Abdel-Rahman SZ, Salama SA, Au WW, Hamada FA. Role of polymorphic CYP2E1 and CYP2D6 genes in NNK-induced chromosome aberrations in cultured human lymphocytes. Pharmacogenetics. 2000;10(3):239–49. [PubMed]
  • Adams E, Kathleen P, Vincent Miller, Ernst Carla K, Brenda Nishimura, Cathy Melvin, Robert Merritt. Neonatal health care costs related to smoking during pregnancy. Health Economics. 2002;11(3):193–206. [PubMed]
  • Almond Douglas, Chay Kenneth Y, Lee David S. The Costs of Low Birth Weight. Quarterly Journal of Economics. 2005;120(3):1031–1083.
  • Anderson P, Doyle LW. Neurobehavioral outcomes of school-age children born extremely low birth weight or very preterm in the 1990s. Jama. 2003;289(24):3264–72. [PubMed]
  • Andrews DWK, Moreira MJ, Stock JH. Optimal two-sided invariant similar tests for instrumental variables regression. Econometrica. 2006;74(3):715–752.
  • Angrist Joshua D, Imbens Guido W, Rubin Donald B. Identification of Causal Effects Using Instrumental Variables. Journal of the American Statistical Association. 1996;91(434):444–455.
  • Berrettini W, Yuan X, Tozzi F, Song K, Francks C, Chilcoat H, Waterworth D, Muglia P, Mooser V. Alpha-5/alpha-3 nicotinic receptor subunit alleles increase risk for heavy smoking. Mol Psychiatry. 2008;13(4):368–73. [PMC free article] [PubMed]
  • Beuten J, Ma JZ, Payne TJ, Dupont RT, Crews KM, Somes G, Williams NJ, Elston RC, Li MD. Single- and multilocus allelic variants within the GABA(B) receptor subunit 2 (GABAB2) gene are significantly associated with nicotine dependence. Am J Hum Genet. 2005;76(5):6. [PubMed]
  • Boardman JD, Powers DA, Padilla YC, Hummer RA. Low birth weight, social factors, and developmental outcomes among children in the United States. Demography. 2002;39(2):353–68. [PubMed]
  • Brachet T. Maternal smoking, misclassification, and infant health. Working paper. 2005.
  • Buka SL, Shenassa ED, Niaura R. Elevated risk of tobacco dependence among offspring of mothers who smoked during pregnancy: a 30-year prospective study. Am J Psychiatry. 2003;160(11):1978–84. [PubMed]
  • Caporaso N, Gu F, Chatterjee N, Sheng-Chih J, Yu K, Yeager M, Chen C, Jacobs K, Wheeler W, Landi MT, Ziegler RG, Hunter DJ, Chanock S, Hankinson S, Kraft P, Bergen AW. Genome-wide and candidate gene association study of cigarette smoking behaviors. PLoS ONE. 2009;4(2):e4653. [PMC free article] [PubMed]
  • Carmelli D, Swan GE, Robinette D, Fabsitz R. Genetic influence on smoking--a study of male twins. N Engl J Med. 1992;327(12):829–33. [PubMed]
  • CDC. Smoking During Pregnancy - United States. MMWR Surveill Summ. 2004;53:911–915. [PubMed]
  • Comings DE, Ferry L, Bradshaw-Robinson S, Burchette R, Chiu C, Muhleman D. The dopamine D2 receptor (DRD2) gene: a genetic risk factor in smoking. Pharmacogenetics. 1996;6(1):73–79. [PubMed]
  • DiFranza JR, Lew RA. Morbidity and mortality in children associated with the use of tobacco products by other people. Pediatrics. 1996;97(4):560–8. [PubMed]
  • Ding W, Lehrer SF, Rosenquist JN, Audrain-McGovern J. The impact of poor health on academic performance: New evidence using genetic markers. J Health Econ. 2009;28(3):578–97. [PubMed]
  • Ding W, Lehrer SF, Rosenquist JN, Audrain-McGovern J. The Impact of Poor Health on Education: New Evidence Using Genetic Markers. National Bureau of Economic Research Working Paper Series No. 12304. 2006.
  • Ebrahim SH, Floyd RL, Merritt RK, 2nd, Decoufle P, Holtzman D. Trends in pregnancy-related smoking rates in the United States, 1987–1996. JAMA. 2000;283(3):361–6. [PubMed]
  • Eriksson KM, Haug K, Salvesen KA, Nesheim BI, Nylander G, Rasmussen S, Andersen K, Nakling JO, Eik-Nes SH. Smoking habits among pregnant women in Norway 1994–95. Acta Obstet Gynecol Scand. 1998;77(2):159–64. [PubMed]
  • Evans William N, Ringel Jeanne S. Can higher cigarette taxes improve birth outcomes? Journal of Public Economics. 1999;72(1):135–154.
  • Faden VB, Graubard BI. Maternal substance use during pregnancy and developmental outcome at age three. J Subst Abuse. 2000;12(4):329–40. [PubMed]
  • Finlay K, Magnusson LM. Implementing weak-instrument robust tests for a general class of instrumental-variables models. Stata Journal. 2009;9(3):398–421.
  • Fletcher Jason M, Lehrer Steven F. Using Genetic Lotteries within Families to Examine the Causal Impact of Poor Health on Academic Achievement. National Bureau of Economic Research Working Paper Series No. 15148. 2009.
  • Freathy Rachel M, Ring Susan M, Shields Beverley, Galobardes Bruna, Knight Beatrice, Weedon Michael N, Smith George Davey, Frayling Timothy M, Hattersley Andrew T. A common genetic variant in the 15q24 nicotinic acetylcholine receptor gene cluster (CHRNA5-CHRNA3-CHRNB4) is associated with a reduced ability of women to quit smoking in pregnancy. Hum Mol Genet. 2009;18(15):2922–2927. [PMC free article] [PubMed]
  • Gerra G, Garofano L, Zaimovic A, Moi G, Branchi B, Bussandri M, Brambilla F, Donnini C. Association of the serotonin transporter promoter polymorphism with smoking behavior among adolescents. Am J Med Genet B Neuropsychiatr Genet. 2005;135B(1):73–8. [PubMed]
  • Grossman Michael, Joyce Theodore J. Unobservables, Pregnancy Resolutions, and Birth Weight Production Functions in New York City. Journal of Political Economy. 1990;98(5):983.
  • Hahn J, Hausman J. Weak instruments: Diagnosis and cures in empirical econometrics. American Economic Review. 2003;93(2):118–125.
  • Hausman JA. Specification Tests in Econometrics. Econometrica. 1978;46(6):1251–1271.
  • Heath AC, Martin NG. Genetic models for the natural history of smoking: evidence for a genetic influence on smoking persistence. Addict Behav. 1993;18(1):19–34. [PubMed]
  • Heckman JJ. Schools, Skills, and Synapses. Economic Inquiry. 2008;46(3):289–324. [PMC free article] [PubMed]
  • Heckman James J. Policies to foster human capital. Research in Economics. 2000;54(1):3–56.
  • Hutchison KE, LaChance H, Niaura R, Bryan A, Smolen A. The DRD4 VNTR polymorphism influences reactivity to smoking cues. J Abnorm Psychol. 2002;111(1):134–43. [PubMed]
  • Imbens Guido W, Angrist Joshua D. Identification and Estimation of Local Average Treatment Effects. Econometrica. 1994;62(2):467–475.
  • Jin Ying, Chen Dafang, Hu Yonghua, Guo Song, Sun Hongqiang, Lu Aili, Zhang Xiaoyan, Li Lingsong. Association between monoamine oxidase gene polymorphisms and smoking behaviour in Chinese males. The International Journal of Neuropsychopharmacology. 2006;9:557–564. doi: 10.1017/S1461145705006218. [PubMed] [Cross Ref]
  • Jugessur A, Shi M, Gjessing HK, Lie RT, Wilcox AJ, Weinberg CR, Christensen K, Boyles AL, Daack-Hirsch S, Trung TN, Bille C, Lidral AC, Murray JC. Genetic determinants of facial clefting: analysis of 357 candidate genes using two national cleft studies from Scandinavia. PLoS One. 2009;4(4):e5385. [PMC free article] [PubMed]
  • Kortmann GL, Dobler CJ, Bizarro L, Bau CH. Pharmacogenetics of smoking cessation therapy. Am J Med Genet B Neuropsychiatr Genet. 2009;153B(1):17–28. [PubMed]
  • Kramer MS. Intrauterine growth and gestational duration determinants. Pediatrics. 1987;80(4):502–11. [PubMed]
  • Kvalvik LG, Skjaerven R, Haug K. Smoking during pregnancy from 1999 to 2004: a study from the Medical Birth Registry of Norway. Acta Obstet Gynecol Scand. 2008;87(3):280–5. [PubMed]
  • Laucht M, Becker K, Frank J, Schmidt MH, Esser G, Treutlein J, Skowronek MH, Schumann G. Genetic variation in dopamine pathways differentially associated with smoking progression in adolescence. J Am Acad Child Adolesc Psychiatry. 2008;47(6):673–81. [PubMed]
  • Lawlor DA, Harbord RM, Sterne JA, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Statistics in Medicine. 2008;27(8):1133–63. [PubMed]
  • Lessov CN, Martin NG, Statham DJ, Todorov AA, Slutske WS, Bucholz KK, Heath AC, Madden PA. Defining nicotine dependence for genetic research: evidence from Australian twins. Psychol Med. 2004;34(5):865–79. [PubMed]
  • Li MD, Mangold JE, Seneviratne C, Chen GB, Ma JZ, Lou XY, Payne TJ. Association and interaction analyses of GABBR1 and GABBR2 with nicotine dependence in European- and African-American populations. PLoS One. 2009;4(9):e7055. [PMC free article] [PubMed]
  • Lie RT, Wilcox AJ, Taylor J, Gjessing HK, Saugstad OD, Aabyholm F, Vindenes H. Maternal smoking and oral clefts: the role of detoxification pathway genes. Epidemiology. 2008;19(4):606–15. [PubMed]
  • Lien DS, Evans WN. Estimating the impact of large cigarette tax hikes: The case of maternal smoking and infant birth weight. Journal of Human Resources. 2005;40(2):373–392.
  • Liu JZ, Tozzi F, Waterworth DM, Pillai SG, Muglia P, Middleton L, Berrettini W, Knouff CW, Yuan X, Waeber G, Vollenweider P, Preisig M, Wareham NJ, Zhao JH, Loos RJ, Barroso I, Khaw KT, Grundy S, Barter P, Mahley R, Kesaniemi A, McPherson R, Vincent JB, Strauss J, Kennedy JL, Farmer A, McGuffin P, Day R, Matthews K, Bakke P, Gulsvik A, Lucae S, Ising M, Brueckl T, Horstmann S, Wichmann HE, Rawal R, Dahmen N, Lamina C, Polasek O, Zgaga L, Huffman J, Campbell S, Kooner J, Chambers JC, Burnett MS, Devaney JM, Pichard AD, Kent KM, Satler L, Lindsay JM, Waksman R, Epstein S, Wilson JF, Wild SH, Campbell H, Vitart V, Reilly MP, Li M, Qu L, Wilensky R, Matthai W, Hakonarson HH, Rader DJ, Franke A, Wittig M, Schafer A, Uda M, Terracciano A, Xiao X, Busonero F, Scheet P, Schlessinger D, St Clair D, Rujescu D, Abecasis GR, Grabe HJ, Teumer A, Volzke H, Petersmann A, John U, Rudan I, Hayward C, Wright AF, Kolcic I, Wright BJ, Thompson JR, Balmforth AJ, Hall AS, Samani NJ, Anderson CA, Ahmad T, Mathew CG, Parkes M, Satsangi J, Caulfield M, Munroe PB, Farrall M, Dominiczak A, Worthington J, Thomson W, Eyre S, Barton A, Mooser V, Francks C, Marchini J. Meta-analysis and imputation refines the association of 15q25 with smoking quantity. Nat Genet. 2010;42(5):436–40. [PMC free article] [PubMed]
  • Maes HH, Sullivan PF, Bulik CM, Neale MC, Prescott CA, Eaves LJ, Kendler KS. A twin study of genetic and environmental influences on tobacco initiation, regular tobacco use and nicotine dependence. Psychol Med. 2004;34(7):1251–61. [PubMed]
  • Mervis CA, Decoufle P, Murphy CC, Yeargin-Allsopp M. Low birthweight and the risk for mental retardation later in childhood. Paediatr Perinat Epidemiol. 1995;9(4):455–68. [PubMed]
  • NIEHS. Norway Facial Cleft Study (NCL). 2009.
  • Norton Edward C, Han Euna. Genetic Information, Obesity, and Labor Market Outcomes. Health Economics. 2008;17(9):1089–1104. [PubMed]
  • Permutt T, Hebel JR. Simultaneous-equation estimation in a clinical trial of the effect of smoking on birth weight. Biometrics. 1989;45(2):619–22. [PubMed]
  • Reichman Nancy E, Corman Hope, Noonan Kelly, Dave Dhaval. Typically Unobserved Variables (TUVs) and Selection into Prenatal Inputs: Implications for Estimating Infant Health Production Functions. National Bureau of Economic Research Working Paper Series No. 12004. 2006.
  • Rosenzweig M, Schultz TP. Estimating a household production function: Heterogeneity, the demand for health inputs, and their effects on birth weight. The Journal of Political Economy. 1983;91(5):723–746.
  • Saigal S, Stoskopf BL, Streiner DL, Burrows E. Physical growth and current health status of infants who were of extremely low birth weight and controls at adolescence. Pediatrics. 2001;108(2):407–15. [PubMed]
  • Sanderson S, Salanti G, Higgins J. Joint effects of the N-acetyltransferase 1 and 2 (NAT1 and NAT2) genes and smoking on bladder carcinogenesis: a literature-based systematic HuGE review and evidence synthesis. Am J Epidemiol. 2007;166(7):741–51. [PubMed]
  • Schoendorf KC, Kiely JL. Relationship of sudden infant death syndrome to maternal smoking during and after pregnancy. Pediatrics. 1992;90(6):905–8. [PubMed]
  • Shi M, Wehby GL, Murray JC. Review on genetic variants and maternal smoking in the etiology of oral clefts and other birth defects. Birth Defects Res C Embryo Today. 2008;84(1):16–29. [PMC free article] [PubMed]
  • Skowronek M, Laucht M, Hohm E, Becker K, Schmidt M. Interaction between the dopamine D4 receptor and the serotonin transporter promoter polymorphisms in alcohol and tobacco use among 15-year-olds. Neurogenetics. 2006;7(4):239–246. [PubMed]
  • Smith GD, Lawlor DA, Harbord R, Timpson N, Day I, Ebrahim S. Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology. PLoS Med. 2007;4(12):e352. [PMC free article] [PubMed]
  • Staiger Douglas, Stock James H. Instrumental Variables Regression with Weak Instruments. Econometrica. 1997;65(3):557–586.
  • Stock JH, Wright J, Yogo M. A survey of weak instruments and weak identification in generalized method of moments. Journal of Business and Economic Statistics. 2002;20(4):518–529.
  • Sullivan PF, Kendler KS. The genetic epidemiology of smoking. Nicotine Tob Res. 1999;1(Suppl 2):S51–7. discussion S69–70. [PubMed]
  • Tong VT, Jones JR, Dietz PM, D’Angelo D, Bombard JM. Trends in smoking before, during, and after pregnancy - Pregnancy Risk Assessment Monitoring System (PRAMS), United States, 31 sites, 2000–2005. MMWR Surveill Summ. 2009;58(4):1–29. [PubMed]
  • Tyndale RF. Genetics of alcohol and tobacco use in humans. Annals of Medicine. 2003;35(2):94–122. [PubMed]
  • Ventura SJ, Hamilton BE, Mathews TJ, Chandra A. Trends and variations in smoking during pregnancy and low birth weight: evidence from the birth certificate, 1990–2000. Pediatrics. 2003;111(5 Part 2):1176–80. [PubMed]
  • Victora CG, Adair L, Fall C, Hallal PC, Martorell R, Richter L, Sachdev HS. Maternal and child undernutrition: consequences for adult health and human capital. Lancet. 2008;371(9609):340–57. [PMC free article] [PubMed]
  • Vink JM, Staphorsius AS, Boomsma DI. A genetic analysis of coffee consumption in a sample of Dutch twins. Twin Res Hum Genet. 2009;12(2):127–31. [PubMed]
  • Walsh RA. Effects of maternal smoking on adverse pregnancy outcomes: examination of the criteria of causation. Hum Biol. 1994;66(6):1059–92. [PubMed]
  • Wehby GL, Murray JC, Castilla EE, Lopez-Camelo JS, Ohsfeldt RL. Quantile effects of prenatal care utilization on birth weight in Argentina. Health Econ. 2009;18(11):1307–21. [PMC free article] [PubMed]
  • Wehby GL, Ohsfeldt RL, Murray JC. ‘Mendelian randomization’ equals instrumental variable analysis with genetic instruments. Statistics in Medicine. 2008;27(15):2745–9. [PMC free article] [PubMed]
  • Weitzman M, Gortmaker S, Sobol A. Maternal smoking and behavior problems of children. Pediatrics. 1992;90(3):342–9. [PubMed]
  • Wolf MJ, Smit B, de Groot I. Behavioural problems in children with low birthweight. Lancet. 2001;358(9284):843. [PubMed]
  • Wooldridge Jeffrey M. Econometric analysis of cross section and panel data. Cambridge and London: MIT Press; 2002.