
Biometrika. 2010 December; 97(4): 997–1001.

Published online 2010 July 31. doi: 10.1093/biomet/asq049

PMCID: PMC3371719

Lingling Li

Department of Population Medicine, Harvard Medical School, Harvard Pilgrim Health Care Institute, Boston, Massachusetts 02115, U.S.A. Email: lingling_li@post.harvard.edu

Division of Biostatistics, Indiana University School of Medicine, Regenstrief Institute, Indianapolis, Indiana 46202, U.S.A. Email: xiaochun@iupui.edu

Received 2009 November; Revised 2010 April

Copyright © 2010 Biometrika Trust


Standardized means, commonly used in observational studies in epidemiology to adjust for potential confounders, are equal to inverse probability weighted means with inverse weights equal to the empirical propensity scores. More refined standardization corresponds with empirical propensity scores computed under more flexible models. Unnecessary standardization induces efficiency loss. However, according to the theory of inverse probability weighted estimation, propensity scores estimated under more flexible models induce improvement in the precision of inverse probability weighted means. This apparent contradiction is clarified by explicitly stating the assumptions under which the improvement in precision is attained.

Often, epidemiological studies aim to evaluate the causal effect of a discrete exposure on an outcome. In observational studies, systematic bias due to confounding is a serious concern. For this reason, investigators routinely collect and adjust for a large number of confounding factors in data analyses. A common analytic strategy is to categorize the confounders and then to compare the exposure group-specific standardized means. These are exposure group-specific weighted means of the outcome across levels of the categorized confounders, with weights equal to the empirical probabilities of the categorized confounders in the entire sample. It is well known that overcategorization, i.e. unnecessary categorization, may induce efficiency losses. This issue is essentially the same as the well-understood increase in variance induced by adding to a linear regression model covariates that have no partial correlation with the outcome (Cochran, 1968). It has been studied in a number of nonlinear regression settings, e.g. Mantel & Haenszel (1959), Breslow (1982), Gail (1988), Robinson & Jewell (1991), Neuhäuser & Becher (1997) and De Stavola & Cox (2008), and has been empirically analyzed for standardized means in Brookhart et al. (2006).

The issue, however, appears to contradict well-known facts in the theory of inverse probability weighted estimation. Specifically, a standardized mean is equal to a so-called inverse probability of treatment weighted mean. More precisely, it is equal to a group-specific mean of the outcome weighted by the inverse of the empirical propensity score. An empirical propensity score is the maximum likelihood estimate of the true propensity score, i.e. of the probability of being in the exposure group given the confounders, under a saturated model for the probability of exposure given the categorized confounder. The apparent contradiction is that more refined categorization corresponds to more flexible models for the propensity score, and according to the theory of inverse probability estimation, the use of more flexible propensity score models induces an improvement in the precision of inverse probability means, and not a decrease in precision as regression theory indicates.

The purpose of this note is to clarify this apparent contradiction by showing that efficiency losses induced by unnecessarily refined categorizations do not contradict, and indeed are a consequence of, the theory of inverse probability estimation.

Consider a cohort study in which a discrete exposure variable *A*, an outcome *Y* and a vector of pre-exposure covariates *X* are measured for each of *n* subjects drawn at random from a study population. Although the typical goal of such a study is the evaluation of the exposure effect on the outcome, i.e. a comparison across exposure levels, the issues in this note are best understood by considering inference about the outcome mean at one specific exposure level. Thus, we will assume that *A* is binary and that the goal is to estimate the outcome mean at exposure level *A* = 1. Consider a categorization of *X* into *J* strata and let *L* denote the polytomous variable that records the stratum to which a subject with covariates *X* belongs. The standardized mean at exposure level *A* = 1 and with categorized variable *L* is

$$\widehat{\mu}\equiv E_{n}\left\{E_{n}(Y\mid A=1,\,L)\right\}, \tag{1}$$

where throughout for any *U* and *V*,

$$E_{n}(U)\equiv n^{-1}\sum_{i=1}^{n}U_{i},\qquad E_{n}(U\mid A=1,\,V)\equiv \left(\sum_{i:A_{i}=1,\,V_{i}=V}U_{i}\right)\Big/\left(\sum_{i:A_{i}=1,\,V_{i}=V}1\right).$$
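As a minimal sketch of definition (1), the following Python function computes the standardized mean at *A* = 1 from three parallel lists. The function name `standardized_mean` and the toy data are hypothetical and serve only as illustration. The code assumes that every observed stratum contains at least one exposed subject, and raises an error otherwise.

```python
from collections import defaultdict


def standardized_mean(a, y, l):
    """Standardized mean at A = 1: the average over all n subjects of the
    exposed-group stratum mean E_n(Y | A = 1, L = L_i)."""
    sums = defaultdict(float)  # sum of Y among the exposed, per stratum
    counts = defaultdict(int)  # number of exposed subjects, per stratum
    for ai, yi, li in zip(a, y, l):
        if ai == 1:
            sums[li] += yi
            counts[li] += 1
    # The inner mean is undefined for a stratum with no exposed subjects.
    missing = set(l) - set(counts)
    if missing:
        raise ValueError(f"no exposed subjects in strata {sorted(missing)}")
    stratum_mean = {s: sums[s] / counts[s] for s in counts}
    # Outer average: each subject contributes the mean of its own stratum.
    return sum(stratum_mean[li] for li in l) / len(l)


# Hypothetical toy data: six subjects in two strata.
a = [1, 1, 0, 1, 0, 0]
y = [2.0, 4.0, 1.0, 10.0, 7.0, 8.0]
l = [0, 0, 0, 1, 1, 1]
mu_hat = standardized_mean(a, y, l)
```

In this toy data the exposed-group stratum means are 3 and 10 and each stratum holds half the sample, so `mu_hat` is 6.5, whereas the crude mean among the exposed is 16/3.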

For standardized means to be informative about causal effects, certain assumptions need to hold. The issue is best articulated within the potential outcomes framework. Let *Y*_{*a*} be the subject’s potential outcome if, perhaps contrary to fact, he is exposed to *A* = *a*. The standardized mean *μ̂* is consistent for *μ* ≡ *E*(*Y*_{1}) under the following three assumptions.

*Assumption* 1. Consistency: *Y* = *Y*_{*A*}.

*Assumption* 2. Positivity: pr(*A* = 1 | *L*) > 0.

*Assumption* 3. No unmeasured confounders: *Y*_{1} and *A* are conditionally independent given *L*.

This is because in such a case

$$\mu =E\{E(Y\mid A=1,\,L)\}. \tag{2}$$

The apparent contradiction discussed in this note refers to the asymptotic behaviour of *μ̂* under two categorizations, one more refined than the other. The essence of the matter is best understood by considering the extreme case contrasting the asymptotic behaviour of the adjusted average *μ̂* with that of the crude unadjusted average,

$$\tilde{\mu}\equiv {E}_{n}(Y|A=1).$$

Our discussion focusses on this comparison. The well-known risk of bias induced by underadjustment, i.e. by failure to adjust for an important confounder, is vividly unmasked in this extreme case: *μ̃* does not generally converge in probability to *E*(*Y*_{1}). Formally, *μ̃* converges to *E*(*Y*_{1} | *A* = 1), which is not generally equal to *E*(*Y*_{1}) because *Y*_{1} and *A* may share the common determinant *L*. Consistency of *μ̃* requires that, in addition to Assumptions 1–3, at least one of the following two independencies hold.

*Assumption* 4. The variables *Y* and *L* are conditionally independent given *A* = 1.

*Assumption* 5. The variables *A* and *L* are independent.
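The underadjustment bias described above can be illustrated with a small simulation. The sketch below is not taken from the paper. The data-generating design is an assumption chosen for illustration: a binary *L*, exposure probabilities 0.2 and 0.8, and an outcome that depends on *L* but not on exposure. Under this design neither Assumption 4 nor Assumption 5 holds, *E*(*Y*_{1}) = 1 and *E*(*Y*_{1} | *A* = 1) = 1.6.

```python
import random


def crude_mean(a, y):
    """E_n(Y | A = 1): the plain mean of Y among exposed subjects."""
    exposed = [yi for ai, yi in zip(a, y) if ai == 1]
    return sum(exposed) / len(exposed)


def standardized_mean(a, y, l):
    """E_n{E_n(Y | A = 1, L)}: stratum means weighted by stratum frequency."""
    total = 0.0
    for s in sorted(set(l)):
        exposed = [yi for ai, yi, li in zip(a, y, l) if ai == 1 and li == s]
        weight = sum(1 for li in l if li == s) / len(l)
        total += weight * sum(exposed) / len(exposed)
    return total


def simulate(n, seed):
    """Hypothetical confounded design: L drives both A and Y."""
    rng = random.Random(seed)
    a, y, l = [], [], []
    for _ in range(n):
        li = 1 if rng.random() < 0.5 else 0      # pr(L = 1) = 0.5
        p = 0.8 if li == 1 else 0.2              # pr(A = 1 | L)
        ai = 1 if rng.random() < p else 0
        # Exposure has no effect here, so Y = Y_1 for every subject.
        yi = 2.0 * li + rng.gauss(0.0, 1.0)
        a.append(ai)
        y.append(yi)
        l.append(li)
    return a, y, l


a, y, l = simulate(20000, seed=1)
mu_tilde = crude_mean(a, y)           # should settle near E(Y_1 | A = 1) = 1.6
mu_hat = standardized_mean(a, y, l)   # should settle near E(Y_1) = 1.0
```

With a large sample the crude average stays near 1.6 while the standardized mean stays near 1, because the exposed group over-represents the stratum *L* = 1.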

In the Appendix we show that *μ̂* solves the inverse probability weighted estimating equation

$$E_{n}\left\{\frac{A}{E_{n}(A\mid L)}(Y-\mu )\right\}=0, \tag{3}$$

whereas *μ̃* solves the inverse probability weighted estimating equation

$$E_{n}\left\{\frac{A}{E_{n}(A)}(Y-\mu )\right\}=0 \tag{4}$$

whence the apparent contradiction emerges. Specifically, both *E _{n}*(

The apparent contradiction arises because of the vagueness of the statement about the efficiency gains induced by including *L* in the propensity score estimators, which does not explicitly mention the assumptions required for its validity. To explain the contradiction, let 𝒜 denote the model defined by Assumptions 1–3, let ℬ denote the model defined by Assumptions 1–4 and let 𝒞 denote the model defined by Assumptions 1–3 and 5.

Both *μ̂* and *μ̃* are consistent for *E*(*Y*_{1}) under model ℬ or 𝒞, but only *μ̂* is consistent for *E*(*Y*_{1}) under model 𝒜.

The estimator *μ̂* is asymptotically efficient under model 𝒜 and under model 𝒞, but *μ̃* is asymptotically efficient under model ℬ. These efficiency results are best understood by examining the likelihood

$$\mathcal{L}_{n}\left(f_{A,Y,L}\right)=\mathcal{L}_{1,n}\left(f_{L},\,f_{Y\mid A,L}\right)\mathcal{L}_{2,n}\left(f_{A\mid L}\right), \tag{5}$$

where

$$\mathcal{L}_{1,n}\left(f_{L},\,f_{Y\mid A,L}\right)=\prod_{i=1}^{n}f_{L}\left(L_{i}\right)f_{Y\mid A,L}\left(Y_{i}\mid A_{i},L_{i}\right),\qquad \mathcal{L}_{2,n}\left(f_{A\mid L}\right)=\prod_{i=1}^{n}f_{A\mid L}\left(A_{i}\mid L_{i}\right).$$

Model 𝒜 (Assumptions 1–3) imposes restrictions on the law of (*Y*_{1}*, L, A*) but not on the distribution *f*_{*A,Y,L*} of the observed data (

Model 𝒞 (Assumptions 1–3 and 5) restricts the law *f _{A}*

Model ℬ (Assumptions 1–4) imposes the restriction *f _{Y}*

Given an arbitrary function *d*(*l*) and any *π*(*l*), let *μ̂*_{*d*}(*π*) denote the solution to the estimating equation

$$E_{n}\left\{\frac{A}{\pi (L)}d(L)(Y-\mu )\right\}=0. \tag{6}$$
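Because (6) is linear in *μ*, its root has a closed form: a ratio of two weighted sums over the exposed subjects. The sketch below computes it. The function names `ipw_mean` and `empirical_propensity` and the toy data are hypothetical and not part of the paper. The last call is an illustration under the assumption that *d* is set to the empirical quantity *E*_{*n*}(*A* | *L*) rather than to a population quantity.

```python
def ipw_mean(a, y, l, pi, d=lambda stratum: 1.0):
    """Closed-form root in mu of E_n{ A / pi(L) * d(L) * (Y - mu) } = 0."""
    num = den = 0.0
    for ai, yi, li in zip(a, y, l):
        if ai == 1:  # unexposed subjects contribute zero to both sums
            w = d(li) / pi(li)
            num += w * yi
            den += w
    return num / den


def empirical_propensity(a, l):
    """Return the function s -> E_n(A | L = s) (saturated-model estimate)."""
    n_l, n_1l = {}, {}
    for ai, li in zip(a, l):
        n_l[li] = n_l.get(li, 0) + 1
        n_1l[li] = n_1l.get(li, 0) + ai
    return lambda s: n_1l[s] / n_l[s]


# Hypothetical toy data: six subjects in two strata.
a = [1, 1, 0, 1, 0, 0]
y = [2.0, 4.0, 1.0, 10.0, 7.0, 8.0]
l = [0, 0, 0, 1, 1, 1]

pi_hat = empirical_propensity(a, l)   # E_n(A | L)
p_marginal = sum(a) / len(a)          # E_n(A)
pi_tilde = lambda s: p_marginal       # ignores the stratum

adjusted = ipw_mean(a, y, l, pi_hat)               # root of (3)
crude = ipw_mean(a, y, l, pi_tilde)                # root of (4)
crude_again = ipw_mean(a, y, l, pi_hat, d=pi_hat)  # weights cancel to 1
```

On this data `adjusted` is 6.5 and both `crude` and `crude_again` equal 16/3. Choosing *d* equal to the estimated propensity score cancels the inverse weight, so the root reduces to the unweighted mean among the exposed.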

The following lemma, a corollary of the theory laid out in Robins et al. (1994), states the precise result of the theory of inverse probability weighted estimation that the gain in efficiency of *μ̃* over *μ̂* appears to contradict.

Lemma 1. *Given one of the models* 𝒜, ℬ *or* 𝒞 *for the observables, let π̂*(*l*) *and π̃*(*l*) *be the maximum likelihood estimators of f _{A}*

$$\text{avar}\left\{\widehat{\mu}_{d}(\widehat{\pi})\right\}\leqslant \text{avar}\left\{\widehat{\mu}_{d}(\tilde{\pi})\right\}.$$

Observe that because *μ̂* solves (3) and *μ̃* solves (4) we can write *μ̂* = *μ̂*_{d1}(*π̂*) and *μ̃* = *μ̂*_{d1}(*π̃*) with *d*_{1}(*l*) = 1, *π̂*(*l*) = *E _{n}*(

The efficiency gains conferred by *μ̃* over *μ̂* under model ℬ (Assumptions 1–4) can be deduced from the general theory of efficient inverse probability estimation in semiparametric models for missing data (Robins et al., 1994, Proposition 8.1). In the Supplementary Material we apply this theory to show that: (a) *μ̃* is asymptotically equivalent to *μ̂*_{d2}(*π̂*) with *d*_{2}(*l*) = *E*(*A* | *L* = *l*) and (b) *μ̂*_{d2}(*π̂*), and therefore *μ̃*, is semiparametric efficient under ℬ.
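The efficiency comparisons above can be explored by Monte Carlo. The sketch below is not from the paper. Its two data-generating designs are assumptions chosen for illustration: one in which Assumption 4 holds (the outcome does not depend on *L*, exposure does) and one in which Assumption 5 holds (exposure does not depend on *L*, the outcome does). In both designs the exposure has no effect, so *Y* = *Y*_{1}, and *L* is binary.

```python
import random
from statistics import pvariance


def both_estimators(a, y, l):
    """Return (standardized mean, crude mean) at A = 1 for a binary L."""
    n = len(a)
    n_l = [0, 0]        # subjects per stratum
    n_1l = [0, 0]       # exposed subjects per stratum
    s_1l = [0.0, 0.0]   # sum of Y among the exposed, per stratum
    for ai, yi, li in zip(a, y, l):
        n_l[li] += 1
        if ai == 1:
            n_1l[li] += 1
            s_1l[li] += yi
    mu_hat = sum((n_l[s] / n) * (s_1l[s] / n_1l[s]) for s in (0, 1))
    mu_tilde = sum(s_1l) / sum(n_1l)
    return mu_hat, mu_tilde


def draw(rng, n, p_a, beta):
    """p_a[l] = pr(A = 1 | L = l); Y = beta * L + standard normal noise."""
    l = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    a = [1 if rng.random() < p_a[li] else 0 for li in l]
    y = [beta * li + rng.gauss(0.0, 1.0) for li in l]
    return a, y, l


def mc_variances(p_a, beta, n=400, reps=500, seed=0):
    """Monte Carlo variances of (standardized, crude) over `reps` samples."""
    rng = random.Random(seed)
    hats, tildes = [], []
    for _ in range(reps):
        mu_hat, mu_tilde = both_estimators(*draw(rng, n, p_a, beta))
        hats.append(mu_hat)
        tildes.append(mu_tilde)
    return pvariance(hats), pvariance(tildes)


# Assumption 4 holds: Y is independent of L, A depends on L.
var_hat_B, var_tilde_B = mc_variances(p_a=(0.2, 0.8), beta=0.0)
# Assumption 5 holds: A is independent of L, Y depends on L.
var_hat_C, var_tilde_C = mc_variances(p_a=(0.5, 0.5), beta=3.0)
```

In the first design the crude mean should show the smaller Monte Carlo variance, which is the overadjustment loss. In the second design the ordering should reverse, in line with the efficiency gain that the theory of inverse probability weighted estimation predicts when both propensity score models are correct.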

In conclusion, the fallacy arises because the claim about efficiency gains assumes an explicit model for the law of (*A, L, Y*) and it requires that both propensity score models be correct under the given model. However, *E _{n}*(

Our analysis extends to inference in marginal structural mean models for the effect of a possibly polytomous exposure *A* given a possibly strict subset *Z* of the confounders *L*. These models assume that *E*(*Y _{a}* |

Andrea Rotnitzky was funded by a grant from the National Institutes of Health, U.S.A. The authors wish to thank two referees and the associate editors for helpful comments.

For any given law *f*(*l, a, y*), define the new law *f*^{*}(*l, a, y*) = *f*(*l*)*I*_{1}(*a*)*f*(*y* | *a, l*). Then *E*{*E*(*Y* | *A* = 1, *L*)} = *E*^{*}(*Y*), where *E*(·) and *E*^{*}(·) denote expectations under *f* and *f*^{*} respectively. But *f*^{*}(*l, a, y*)/*f*(*l, a, y*) = *I*_{1}(*a*)/*f*(*a* | *l*), so *E*^{*}(*Y*) = *E*{*I*_{1}(*A*)*Y*/*f*(*A* | *L*)}, thus proving that *E*{*E*(*Y* | *A* = 1, *L*)} = *E*{*AY*/*f*(1 | *L*)} for any *f* and binary *A*. That *μ̂*, which solves (3), also admits the representation (1) follows by applying this result when *f* is the empirical law.
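The last step can also be checked by direct algebra at the empirical law. The counts *n*_{*l*} (number of subjects with *L*_{*i*} = *l*) and *n*_{1*l*} (number of those who are exposed) are notation introduced here for illustration only, and the calculation assumes *n*_{1*l*} > 0 in every observed stratum, so that *E*_{*n*}(*A* | *L* = *l*) = *n*_{1*l*}/*n*_{*l*}:

$$\begin{aligned}
E_{n}\left\{\frac{A}{E_{n}(A\mid L)}\right\} &= \frac{1}{n}\sum_{l}\frac{n_{l}}{n_{1l}}\sum_{i:L_{i}=l}A_{i} = \frac{1}{n}\sum_{l}n_{l} = 1,\\
E_{n}\left\{\frac{AY}{E_{n}(A\mid L)}\right\} &= \frac{1}{n}\sum_{l}\frac{n_{l}}{n_{1l}}\sum_{i:A_{i}=1,\,L_{i}=l}Y_{i} = \sum_{l}\frac{n_{l}}{n}\,E_{n}(Y\mid A=1,\,L=l) = E_{n}\left\{E_{n}(Y\mid A=1,\,L)\right\}.
\end{aligned}$$

The root of (3) is the ratio of the second quantity to the first, which is therefore the standardized mean in (1).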

- Brookhart MA, Schneeweiss AL, Rothman KJ, Glynn RJ, Avorn J, Sturmer T. Variable selection for propensity score models. Am J Epidemiol. 2006;163:1149–56.
- Breslow N. Design and analysis of case control studies. Annual Rev. Public Health. 1982;3:29–54.
- Cochran WG. The effectiveness of adjusting by subclassification in removing bias in observational studies. Biometrics. 1968;24:295–313.
- De Stavola HL, Cox DR. On the consequences of overstratification. Biometrika. 2008;95:992–6.
- Gail M. The effect of pooling across strata in perfectly balanced studies. Biometrics. 1988;44:151–62.
- Gill R, van der Laan M, Robins JM. Coarsening at random: characterizations, conjectures and counterexamples. In: Lin D, Fleming T, editors. Proc 1st Seattle Symp Biostatist. New York: Springer; 1997. pp. 255–94.
- Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Nat Cancer Inst. 1959;22:719–48.
- Neuhäuser M, Becher H. Improved odds ratio estimation by post-hoc stratification of case-control data. Statist Med. 1997;16:993–1004.
- Robins JM, Rotnitzky A, Zhao LP. Estimation of regression coefficients when some regressors are not always observed. J Am Statist Assoc. 1994;89:846–66.
- Robinson L, Jewell NP. Some surprising results about covariate adjustment in logistic regression models. Int Statist Rev. 1991;59:227–40.

Articles from Biometrika are provided here courtesy of **Oxford University Press**
