To demonstrate the potential differences that might arise in actual practice between the 2SPS and 2SRI estimates, we re-estimated
Mullahy’s (1997) model of the effect of prenatal cigarette smoking on birthweight using data supplied by the author.
Mullahy (1997) suspects that maternal smoking during pregnancy may be correlated with the unobservable determinants of birthweight, so he specifies a nonlinear conditional mean regression model, which can be viewed as a special case of (1). In Mullahy’s model, birthweight ($y$) is the following function of prenatal smoking ($x_e$), other observable determinants ($x_o$), and a scalar representing the unobservable birthweight determinants that are correlated with prenatal smoking ($x_u$):$^{18}$

$$y = \exp(x_e\beta_e + x_o\beta_o + x_u) + e \tag{15}$$

where $x$ and $\beta$ are defined canonically, and $e$ is the random error term, which is tautologically defined as $e = y - \exp(x_e\beta_e + x_o\beta_o + x_u)$, so that $E[e \mid x] = 0$.
Mullahy (1997) demonstrates that, given a vector of instrumental variables $w = [x_o \;\; w^{+}]$, if the following conditions hold

then $\beta_e$ and $\beta_o$ can be consistently estimated via a generalized method of moments (GMM) estimator that does not require explicit specification of an auxiliary regression of $x_e$ on $w$.$^{19}$ We implemented a flexible functional form for which GMM is not feasible (see Terza, 2006a) and 2SPS is not consistent. Specifically, we replaced (15) with the following variant of the inverse of the Box-Cox (1964) model, originally suggested by Wooldridge (1992) for nonlinear models that do not involve endogeneity

and $0 \le \tau \le 2$. This version of the inverse Box-Cox (IBC) model maintains the desired positivity of the regression function (regardless of the values of $\tau$ and $x_e\beta_e + x_o\beta_o + x_u\beta_u$), and possesses all of the essential properties of Wooldridge’s (1992) IBC formulation. In particular, $k(a, \tau)$ subsumes the linear model when $\tau = 2$, and $k(a, \tau) \to \exp(a)$ as $\tau \to 0$. We estimated the parameters of (17) using both the 2SRI and 2SPS estimators. Following Mullahy (1997), who states “… a linear reduced form for CIGARETTES may not be unreasonable,” we specify the auxiliary regression for prenatal cigarette consumption ($x_e$) as in (8).
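The limiting properties claimed for $k(a, \tau)$ can be checked numerically. The sketch below uses a hypothetical functional form, `ibc_mean`, chosen only because it shares those properties; the paper’s own display for $k(a, \tau)$ is not reproduced in this text, so this form is an assumption and may differ from the authors’ specification.

```python
import math


def ibc_mean(a: float, tau: float) -> float:
    """Hypothetical IBC-style mean function (an assumption, not the paper's equation).

    k(a, tau) = ((1 + tau * a / 2) ** 2) ** (1 / tau), with k(a, 0) = exp(a).
    """
    if not 0.0 <= tau <= 2.0:
        raise ValueError("tau must lie in [0, 2]")
    if tau == 0.0:
        return math.exp(a)  # limiting case as tau -> 0
    # Squaring the base keeps the result nonnegative for any value of the index a.
    return ((1.0 + 0.5 * tau * a) ** 2) ** (1.0 / tau)


for a in (-3.0, -0.5, 0.0, 1.2):
    at_two = ibc_mean(a, 2.0)      # reduces to |1 + a|, i.e. linear in a up to sign
    near_zero = ibc_mean(a, 1e-6)  # should be close to exp(a)
    print(a, at_two, near_zero, math.exp(a))
```

Under this assumed form, $\tau = 2$ gives $|1 + a|$ and small $\tau$ approaches $\exp(a)$, matching the two properties stated above.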
We used the same variables as did Mullahy: $y$ = the newborn’s weight measured in pounds; $x_e$ = number of cigarettes smoked per day during pregnancy; $x_o$ = [1 PARITY WHITE MALE]; $w^{+}$ = [EDFATHER EDMOTHER FAMINCOM CIGTAX88]; PARITY = birth order; WHITE = 1 if white, 0 otherwise; MALE = 1 if male, 0 otherwise; EDFATHER = paternal schooling (yrs.); EDMOTHER = maternal schooling (yrs.); FAMINCOM = family income ($\times 10^{-3}$); CIGTAX88 = per-pack state excise tax on cigarettes. The descriptive statistics of the sample are given in Table 3.
| Table 3. Descriptive Statistics of Sample for Re-Analysis of Mullahy’s Birthweight Model |
For 2SRI estimation, we applied OLS to the linear auxiliary equation and used NLS to estimate $\beta$ and $\tau$ in the following version of (9)

where $\hat{x}_u = x_e - w\hat{\alpha}$ and $\hat{\alpha}$ denotes the first-stage OLS estimator of $\alpha$.$^{20}$ For 2SPS estimation, we implemented the same first-stage estimator of $\alpha$ but in the second stage applied NLS to

where $\hat{x}_e$ denotes the first-stage OLS predictor of $x_e$. The first-stage OLS estimates are given in Table 4. The 2SRI and 2SPS results are shown in Table 5. As in the simulation analyses of the previous section, the ultimate estimation objective is the causal effect of an exogenous change in prenatal smoking frequency on birthweight. For instance, consider
| Table 4. First-Stage OLS Estimates of Auxiliary Regression in the Re-Analysis of Mullahy’s Birthweight Model |
where $y_{x_e^*}$ is the random variable representing birthweight as it would be under the exogenously imposed prenatal smoking level $x_e^*$.
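To make the difference between the two second-stage specifications concrete, the following minimal sketch runs a first-stage OLS and builds the regressors each method would carry forward: the residual for 2SRI and the predictor for 2SPS. It is deliberately simplified: it assumes a single instrument plus an intercept (the paper’s $w$ has more columns), the function name `first_stage` is hypothetical, and the numbers are made up for illustration rather than taken from Mullahy’s data.

```python
from statistics import fmean


def first_stage(x_e, w_plus):
    """OLS of x_e on an intercept and one instrument; returns (alpha0, alpha1)."""
    w_bar, x_bar = fmean(w_plus), fmean(x_e)
    s_ww = sum((w - w_bar) ** 2 for w in w_plus)
    s_wx = sum((w - w_bar) * (x - x_bar) for w, x in zip(w_plus, x_e))
    alpha1 = s_wx / s_ww
    return x_bar - alpha1 * w_bar, alpha1


# Made-up illustrative values (NOT Mullahy's data).
cigs = [0.0, 20.0, 5.0, 10.0, 0.0, 15.0]   # plays the role of x_e
tax = [30.0, 10.0, 25.0, 20.0, 35.0, 12.0]  # plays the role of one element of w+

a0, a1 = first_stage(cigs, tax)
x_e_hat = [a0 + a1 * w for w in tax]                # first-stage predictor (2SPS)
x_u_hat = [x - xh for x, xh in zip(cigs, x_e_hat)]  # first-stage residual (2SRI)

# 2SRI keeps the observed x_e and adds the residual as an extra regressor;
# 2SPS replaces x_e with its first-stage prediction.
regressors_2sri = list(zip(cigs, x_u_hat))
regressors_2sps = [(xh,) for xh in x_e_hat]
```

The second-stage NLS fit itself is omitted here; only the construction of $\hat{x}_u$ and $\hat{x}_e$ is shown.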
Terza (2006b) shows that under general conditions we can rewrite (20) as $E\big[E[y \mid x_e = 0, x_o, x_u] - E[y \mid x_e = 20, x_o, x_u]\big]$ which, when combined with (16), yields

We estimated (21), using the 2SRI results, as

where $\hat{x}_u$ denotes the first-stage OLS residual, and the “^”s indicate the 2SRI estimates. Alternatively, we estimated (21) with the 2SPS results using

where the “~”s indicate the 2SPS estimates. The values of (22) and (23) are given in the last row of Table 5. As is shown therein, the predicted effects of an exogenous reduction in smoking from a pack per day to abstinence differ substantially between the two methods. Given the consistency of the 2SRI estimates, the results imply that 2SPS overstates (in absolute terms) the effect on birthweight by approximately 7 oz., which is about 6% of the sample mean birthweight.
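As a rough illustration of how a sample-average effect of this kind could be computed, the sketch below averages, over observations, the difference in the fitted mean function between $x_e = 0$ and $x_e = 20$. The helper `average_effect` is hypothetical, the mean function defaults to `math.exp` (the $\tau \to 0$ limit mentioned earlier) rather than the paper’s IBC form, and all numbers are invented for illustration; none are estimates from the paper.

```python
import math
from statistics import fmean


def average_effect(rest_index, beta_e, mean_fn=math.exp, low=0.0, high=20.0):
    """Sample average of mean_fn(low*beta_e + r) - mean_fn(high*beta_e + r).

    rest_index holds, for each observation, the part of the index that does not
    involve x_e (for 2SRI: x_o*beta_o + x_u_hat*beta_u; for a 2SPS-style fit it
    is assumed here to be x_o*beta_o alone).
    """
    return fmean([mean_fn(low * beta_e + r) - mean_fn(high * beta_e + r)
                  for r in rest_index])


# Made-up values purely for illustration.
beta_e_hat = -0.01
rest = [1.9, 2.0, 2.1]
effect = average_effect(rest, beta_e_hat)
print(effect)  # positive when beta_e_hat < 0: abstinence raises predicted birthweight
```

Running the same function once with 2SRI estimates and once with 2SPS estimates, and differencing the results, would give the kind of gap between methods discussed above.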
| Table 5. Estimation Results: IBC Version of Mullahy’s Birthweight Model |