
Biometrics. Author manuscript; available in PMC 2012 March 1.

Published in final edited form as:

PMCID: PMC3030650

NIHMSID: NIHMS205246

Stuart G. Baker, Biometry Research Group, Division of Cancer Prevention, National Cancer Institute, EPN 3131, 6130 Executive Blvd MSC 7354, Bethesda, MD 20892-7354, USA.

The publisher's final edited version of this article is available at Biometrics


Recently Cheng (Biometrics, 2009) proposed a model for the causal effect of receiving treatment when there is all-or-none compliance in one randomization group, with maximum likelihood estimation based on convex programming. We discuss an alternative approach that involves a model for all-or-none compliance in two randomization groups and estimation via a perfect fit or an EM algorithm for count data. We believe this approach is easier to implement, which would facilitate the reproduction of calculations.

Cheng (2009) proposed a causal analysis of a randomized trial to investigate the effect of seeing a specialist on the development of depression symptoms among elderly patients. The role of the specialist was to increase adherence to the depression treatments via education and assessment. Twenty primary care practices were randomized to either seeing a specialist or usual care which involved no access to the specialist. For simplicity, group randomization was ignored, and randomization was considered at the level of the individual. An important outcome was depression symptoms (none, minor, and major) at four months. Some subjects randomized to seeing a specialist did not see the specialist and so essentially received usual care. To analyze these data, Cheng (2009) specified a causal model with latent compliance classes for multinomial outcomes with all-or-none compliance in one randomization group. Cheng (2009) computed maximum likelihood (ML) estimates using a grid search and a convex programming algorithm. For a more general causal model with all-or-none compliance in two groups, we propose ML estimation via a perfect fit estimate or a simple EM algorithm for count data. Our estimation approach is simple, which is helpful if others wish to reproduce the results. Reproducibility of research is an important desideratum in statistical analyses (Peng, 2009).

All-or-none compliance, sometimes called all-or-none switching of interventions, has a precise meaning with important implications for the formulation of a causal model (Baker, 1997; Baker and Kramer, 2005). Suppose subjects are randomly assigned to either T0 or T1, which denote different programs of interventions starting at randomization. All-or-none compliance in one randomization group means that all subjects randomized to T0 receive T0, while some subjects randomized to T1 receive T0 and some receive T1. All-or-none compliance in two randomization groups means that in each randomization group some subjects receive T0 and some receive T1. Under all-or-none compliance, one can use latent compliance classes, randomization, and reasonable assumptions to estimate the causal effect of receiving intervention (e.g. Baker and Kramer, 2005). In a more general setting, latent compliance classes are called principal strata (Frangakis and Rubin, 2002).

Models for latent compliance classes with all-or-none compliance in one group were formulated by Bloom (1984) for continuous outcomes and by Sommer and Zeger (1991) and Connor et al (1991) for binary outcomes. Similar models for latent compliance classes with all-or-none compliance in two groups were formulated by Baker and Lindeman (1994) for binary outcomes (with before-and-after time periods treated like randomization groups) and by Angrist, Imbens, and Rubin (1996) for continuous outcomes.

We generalize the model of Cheng et al (2009) to multinomial outcomes with all-or-none compliance in two groups instead of one. Let *r* index randomization group with *r* = 0 for assignment to intervention T0 and *r* = 1 for assignment to intervention T1. Let *a* index intervention received, where *a* = 0 for receipt of T0 and *a* = 1 for receipt of T1. Let *j* = 1, 2, …, *J* index multinomial outcomes. The counts are denoted

*n*_{raj} = number of persons in randomization group *r* who receive intervention *a* and have outcome *j*.

In the terminology of Angrist et al (1996), there are four latent compliance classes:

- *A*, always-takers, who would receive intervention T1 if randomized to either group;
- *C*, compliers, who would receive the intervention to which they were randomized;
- *N*, never-takers, who would receive intervention T0 if randomized to either group;
- *D*, defiers, who would receive the opposite of the intervention to which they were randomized.

Following Cheng (2009), we make the same assumptions listed in Angrist et al (1996) and described below without mathematical notation.

- ASSUMPTION 1: *Stable unit treatment value assumption. The outcome for a subject is unaffected by the particular assignments of treatments to the other subjects (Rubin, 1980).*
- ASSUMPTION 2: *Random assignment to randomization groups.*
- ASSUMPTION 3: *Exclusion restriction. For always-takers and never-takers, the distribution of outcomes does not depend on randomization group.*
- ASSUMPTION 4: *The fraction of subjects who receive each intervention varies by randomization group.*
- ASSUMPTION 5: *Monotonicity. There are no defiers.*

Extending the notation in Cheng (2009), we define the following parameters under the latent compliance model:

- π_{C}, the probability of being a complier,
- π_{A}, the probability of being an always-taker,
- π_{N}, the probability of being a never-taker,
- ν_{j}, the probability a complier in randomization group 0 has outcome *j*,
- *t*_{j}, the probability a complier in randomization group 1 has outcome *j*,
- *s*_{j}, the probability a never-taker has outcome *j*,
- *b*_{j}, the probability an always-taker has outcome *j*,

where π_{C} + π_{A} + π_{N} = 1 and Σ_{j} *t*_{j} = Σ_{j} ν_{j} = Σ_{j} *s*_{j} = Σ_{j} *b*_{j} = 1.

The quantity of interest is the change in a weighted sum of outcomes due to receiving T1 instead of T0 among compliers,

$$\theta_{CACE} = \sum_{j=1}^{J} w_j \,(t_j - \nu_j), \tag{1}$$

where *w*_{j} denotes a weight for outcome *j*.

Estimation is via maximum likelihood with the count data. The kernel of the likelihood is

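$$\prod_{j=1}^{J} (\pi_N s_j + \pi_C \nu_j)^{n_{00j}} \, (\pi_A b_j)^{n_{01j}} \, (\pi_N s_j)^{n_{10j}} \, (\pi_A b_j + \pi_C t_j)^{n_{11j}}, \tag{2}$$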

which generalizes the likelihood in Baker and Lindeman (1994) from binomial to multinomial outcomes and the likelihood in Cheng (2009) from all-or-none compliance in one group to all-or-none compliance in two groups. Let θ̂_{CACE} denote the ML estimate of θ_{CACE}. There are two cases for θ̂_{CACE}: θ̂_{CACE(PerfectFit)} if ML estimates of all parameters lie in the interior of the parameter space, and θ̂_{CACE(Boundary)} if ML estimates of some parameters lie on the boundary of the parameter space.
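As a minimal sketch of the likelihood kernel described above, the following Python code builds the cell probabilities implied by the latent compliance classes under Assumptions 1 to 5 and evaluates the log of the kernel. The function names and the nested-list layout `counts[r][a][j]` are assumptions made for illustration; they are not taken from the author's Mathematica code.

```python
import math


def cell_probabilities(pi_c, pi_a, pi_n, nu, t, s, b):
    """Return p[r][a][j]: probability of receiving a with outcome j, given group r."""
    J = len(nu)
    p = [[[0.0] * J for _ in range(2)] for _ in range(2)]
    for j in range(J):
        p[0][0][j] = pi_n * s[j] + pi_c * nu[j]  # never-takers and compliers receive T0
        p[0][1][j] = pi_a * b[j]                 # only always-takers receive T1
        p[1][0][j] = pi_n * s[j]                 # only never-takers receive T0
        p[1][1][j] = pi_a * b[j] + pi_c * t[j]   # always-takers and compliers receive T1
    return p


def log_likelihood_kernel(counts, p):
    """Sum of n_raj * log(p_raj); empty cells contribute nothing."""
    total = 0.0
    for r in range(2):
        for a in range(2):
            for n, q in zip(counts[r][a], p[r][a]):
                if n > 0:
                    if q <= 0.0:
                        return float("-inf")  # observed cell has zero probability
                    total += n * math.log(q)
    return total
```

Within each randomization group the cell probabilities sum to one, which is a quick check that a set of parameter values is coherent.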

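A perfect fit estimate is possible with this model because the number of independent parameters equals the number of independent cell counts. There are 4*J* − 2 independent parameters, corresponding to *J* − 1 values for each of *s*_{j}, *b*_{j}, *t*_{j}, and ν_{j}, plus two of the three class probabilities π_{C}, π_{A}, and π_{N}. There are also 4*J* − 2 independent cell counts, corresponding to 2*J* − 1 independent counts in each of the two randomization groups. Setting observed counts equal to their expected values (Appendix A) gives the perfect fit estimates,

$$\begin{aligned}
\hat\pi_A &= n_{01+}/n_{0++}, \quad \hat\pi_N = n_{10+}/n_{1++}, \quad \hat\pi_C = 1 - \hat\pi_A - \hat\pi_N,\\
\hat b_j &= n_{01j}/n_{01+}, \quad \hat s_j = n_{10j}/n_{10+},\\
\hat\nu_j &= (n_{00j}/n_{0++} - n_{10j}/n_{1++})/\hat\pi_C,\\
\hat t_j &= (n_{11j}/n_{1++} - n_{01j}/n_{0++})/\hat\pi_C,
\end{aligned} \tag{3}$$

where a "+" in a subscript denotes summation over that index. When π̂_{C}, ν̂_{j}, and *t̂*_{j} are each between zero and one, the perfect fit estimates are the ML estimates, so θ̂_{CACE} = θ̂_{CACE(PerfectFit)}.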

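The perfect fit estimate of causal effect equals the intent-to-treat estimate, θ̂_{ITT}, divided by the difference in the fraction receiving T1 in each group,

$$\hat\theta_{CACE(PerfectFit)} = \frac{\hat\theta_{ITT}}{n_{11+}/n_{1++} - n_{01+}/n_{0++}}, \qquad \hat\theta_{ITT} = \sum_{j=1}^{J} w_j \left( \frac{n_{1+j}}{n_{1++}} - \frac{n_{0+j}}{n_{0++}} \right), \tag{4}$$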

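which parallels the perfect fit estimate for a causal effect with binary outcomes (Baker and Lindeman, 1994) and the instrumental variables estimate for a causal effect with continuous outcomes (Angrist, Imbens, and Rubin, 1996). The estimated asymptotic variance of θ̂_{CACE(PerfectFit)} can be computed using the Multinomial-Poisson transformation (Baker, 1994),

$$\widehat{\mathrm{var}}\left(\hat\theta_{CACE(PerfectFit)}\right) = \sum_{r}\sum_{a}\sum_{j} \left( \frac{\partial \hat\theta_{CACE(PerfectFit)}}{\partial n_{raj}} \right)^{2} n_{raj}, \tag{5}$$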

where the derivatives are readily computed using software for symbolic algebra.
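The perfect fit calculation and its variance can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's implementation: the counts are hypothetical, the layout `counts[r][a][j]` and all function names are invented here, the variance treats the cell counts as independent Poisson variables (the usual reading of the Multinomial-Poisson transformation, giving a delta-method sum of squared derivatives times counts), and the derivatives are approximated by central differences rather than symbolic algebra.

```python
def perfect_fit(counts):
    """Perfect fit estimates (pi_c, nu, t) from counts[r][a][j]."""
    n0 = sum(sum(row) for row in counts[0])  # n_{0++}
    n1 = sum(sum(row) for row in counts[1])  # n_{1++}
    J = len(counts[0][0])
    pi_a = sum(counts[0][1]) / n0
    pi_n = sum(counts[1][0]) / n1
    pi_c = 1.0 - pi_a - pi_n                 # nonzero under Assumption 4
    nu = [(counts[0][0][j] / n0 - counts[1][0][j] / n1) / pi_c for j in range(J)]
    t = [(counts[1][1][j] / n1 - counts[0][1][j] / n0) / pi_c for j in range(J)]
    return pi_c, nu, t


def cace_perfect_fit(counts, w):
    """Weighted sum of (t_j - nu_j) evaluated at the perfect fit estimates."""
    _, nu, t = perfect_fit(counts)
    return sum(wj * (tj - vj) for wj, tj, vj in zip(w, t, nu))


def is_admissible(counts):
    """True when pi_c, nu_j, and t_j all lie between zero and one."""
    pi_c, nu, t = perfect_fit(counts)
    return all(0.0 <= x <= 1.0 for x in [pi_c, *nu, *t])


def delta_method_variance(counts, w, h=1e-4):
    """Sum over cells of (d theta / d n_raj)^2 * n_raj, with numerical derivatives."""
    var = 0.0
    for r in range(2):
        for a in range(2):
            for j in range(len(w)):
                up = [[list(row) for row in grp] for grp in counts]
                down = [[list(row) for row in grp] for grp in counts]
                up[r][a][j] += h
                down[r][a][j] -= h
                d = (cace_perfect_fit(up, w) - cace_perfect_fit(down, w)) / (2 * h)
                var += d * d * counts[r][a][j]
    return var


# Hypothetical counts, NOT the depression data: counts[r][a][j] with J = 3.
counts = [[[60, 30, 10], [8, 8, 4]], [[15, 9, 6], [50, 30, 10]]]
w = (1, 2, 3)
estimate = cace_perfect_fit(counts, w)
std_error = delta_method_variance(counts, w) ** 0.5
```

Because the estimate depends on the counts only through proportions, doubling every count leaves the estimate unchanged and halves the variance, which gives a simple sanity check on the numerical derivatives.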

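If either π̂_{C}, ν̂_{j}, or *t̂*_{j} in (3) is less than zero or greater than one, the perfect fit estimate is not admissible, and we require the ML estimate of θ_{CACE} on the boundary of the parameter space, θ̂_{CACE(Boundary)}, which we compute using an EM algorithm for count data. Let *c*_{rj} denote the unobserved number of compliers in randomization group *r* with outcome *j*. Iteration (*i*) of the E-step is

$$c_{0j}^{(i)} = n_{00j}\, \frac{\pi_C^{(i-1)} \nu_j^{(i-1)}}{\pi_N^{(i-1)} s_j^{(i-1)} + \pi_C^{(i-1)} \nu_j^{(i-1)}}, \qquad c_{1j}^{(i)} = n_{11j}\, \frac{\pi_C^{(i-1)} t_j^{(i-1)}}{\pi_A^{(i-1)} b_j^{(i-1)} + \pi_C^{(i-1)} t_j^{(i-1)}}, \tag{6}$$

and iteration (*i*) of the M-step is

$$\begin{aligned}
\pi_C^{(i)} &= \frac{c_{0+}^{(i)} + c_{1+}^{(i)}}{n_{+++}}, \quad \pi_A^{(i)} = \frac{n_{01+} + n_{11+} - c_{1+}^{(i)}}{n_{+++}}, \quad \pi_N^{(i)} = \frac{n_{10+} + n_{00+} - c_{0+}^{(i)}}{n_{+++}},\\
\nu_j^{(i)} &= \frac{c_{0j}^{(i)}}{c_{0+}^{(i)}}, \quad t_j^{(i)} = \frac{c_{1j}^{(i)}}{c_{1+}^{(i)}}, \quad s_j^{(i)} = \frac{n_{10j} + n_{00j} - c_{0j}^{(i)}}{n_{10+} + n_{00+} - c_{0+}^{(i)}}, \quad b_j^{(i)} = \frac{n_{01j} + n_{11j} - c_{1j}^{(i)}}{n_{01+} + n_{11+} - c_{1+}^{(i)}}.
\end{aligned} \tag{7}$$

The parameter estimates in (7) are used for iteration (*i* + 1) of the E-step. The ML estimate of θ_{CACE} is obtained by substituting the final M-step estimates at convergence into (1). Starting values for parameter estimates equal the perfect fit parameter estimates in (3), with any perfect fit estimate of π_{C}, ν_{j}, or *t*_{j} that is less than zero set equal to zero, and any perfect fit estimate of π_{C}, ν_{j}, or *t*_{j} that is greater than one set equal to one.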
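The EM iteration described above can be sketched as follows. This is one plausible implementation under assumptions that are not in the original: the complete data are taken to be the cell counts split by latent class, the E-step allocates the mixed cells (group 0 receiving T0, group 1 receiving T1) in proportion to the current class contributions, and the M-step uses the resulting proportions. The function names, the `counts[r][a][j]` layout, and the stopping rule are illustrative, and the sketch assumes that each randomization group contains subjects receiving each intervention.

```python
def clip01(x):
    """Clip a starting value to the interval [0, 1]."""
    return min(1.0, max(0.0, x))


def share(num, den):
    """Fraction num/den, taken as zero for an empty mixture cell."""
    return num / den if den > 0.0 else 0.0


def em_cace(counts, w, tol=1e-12, max_iter=100000):
    """Return (theta, pi_c, nu, t) at convergence of the EM iterations."""
    J = len(w)
    (n00, n01), (n10, n11) = counts
    n0, n1 = sum(n00) + sum(n01), sum(n10) + sum(n11)
    n = n0 + n1
    # Starting values: perfect fit estimates, clipped to [0, 1].
    pi_a, pi_n = sum(n01) / n0, sum(n10) / n1
    raw_c = 1.0 - pi_a - pi_n
    pi_c = clip01(raw_c)
    nu = [clip01((n00[j] / n0 - n10[j] / n1) / raw_c) for j in range(J)]
    t = [clip01((n11[j] / n1 - n01[j] / n0) / raw_c) for j in range(J)]
    s = [x / sum(n10) for x in n10]
    b = [x / sum(n01) for x in n01]
    theta = None
    for _ in range(max_iter):
        # E-step: expected complier counts within the two mixed cells.
        c0 = [n00[j] * share(pi_c * nu[j], pi_n * s[j] + pi_c * nu[j]) for j in range(J)]
        c1 = [n11[j] * share(pi_c * t[j], pi_a * b[j] + pi_c * t[j]) for j in range(J)]
        # M-step: proportions from the completed counts.
        pi_c = (sum(c0) + sum(c1)) / n
        pi_a = (sum(n01) + sum(n11) - sum(c1)) / n
        pi_n = (sum(n10) + sum(n00) - sum(c0)) / n
        nu = [x / sum(c0) for x in c0]
        t = [x / sum(c1) for x in c1]
        s = [(n10[j] + n00[j] - c0[j]) / (n * pi_n) for j in range(J)]
        b = [(n01[j] + n11[j] - c1[j]) / (n * pi_a) for j in range(J)]
        new_theta = sum(wj * (tj - vj) for wj, tj, vj in zip(w, t, nu))
        if theta is not None and abs(new_theta - theta) < tol:
            theta = new_theta
            break
        theta = new_theta
    return theta, pi_c, nu, t
```

A parameter that starts at zero stays at zero in this iteration, which is how clipped starting values lead to an estimate on the boundary. When the perfect fit estimates are admissible they are a fixed point of the iteration, so the function simply returns the perfect fit value.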

Consider the important special case of a binomial outcome with all-or-none compliance in one randomization group. The counts in group *r* = 0 are {*n*_{00j}} and the counts in group *r* = 1 are {*n*_{1aj}}, for *a* = 0, 1 and *j* = 0, 1. In this case, both ML perfect fit and ML boundary estimates can be written in closed-form,

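$$\hat\theta_{CACE} = \begin{cases}
\dfrac{n_{1+1}/n_{1++} - n_{001}/n_{00+}}{n_{11+}/n_{1++}} & \text{if } 0 \le \hat\nu_1 \le 1 \ \text{(perfect fit)},\\[2ex]
n_{111}/n_{11+} & \text{if } \hat\nu_1 < 0 \ \text{(boundary)},\\[1ex]
n_{111}/n_{11+} - 1 & \text{if } \hat\nu_1 > 1 \ \text{(boundary)},
\end{cases} \qquad \hat\nu_1 = \frac{n_{001}/n_{00+} - n_{101}/n_{1++}}{n_{11+}/n_{1++}}. \tag{8}$$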

To our knowledge, the above ML boundary estimate, which is derived in Appendix B, is a new result. A previous ML estimate for this particular boundary scenario involved an iterative algorithm (Cheng et al, 2009).
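For readers who want to check the binary special case numerically, here is a hedged sketch. It assumes the causal effect is *t*_{1} − ν_{1} (weights 0 and 1), that only the perfect fit estimate of ν_{1} can fall outside [0, 1] when compliance is all-or-none in one group, and that the boundary estimates follow from maximizing the log-likelihood with ν_{1} fixed at 0 or 1. The closed forms below were derived under those assumptions for this illustration; the function names and argument layout are hypothetical.

```python
import math


def loglik(pi_c, s1, t1, nu1, n0, n10, n11):
    """Log-likelihood kernel; n0, n10, n11 are (count with j=0, count with j=1)."""
    pi_n = 1.0 - pi_c
    cells = [
        (pi_n * (1 - s1) + pi_c * (1 - nu1), n0[0]),
        (pi_n * s1 + pi_c * nu1, n0[1]),
        (pi_n * (1 - s1), n10[0]),
        (pi_n * s1, n10[1]),
        (pi_c * (1 - t1), n11[0]),
        (pi_c * t1, n11[1]),
    ]
    total = 0.0
    for p, n in cells:
        if n > 0:
            if p <= 0.0:
                return float("-inf")
            total += n * math.log(p)
    return total


def ml_binary_one_group(n0, n10, n11):
    """Closed-form ML estimates; assumes some never-takers and some compliers."""
    n0_tot, n10_tot, n11_tot = sum(n0), sum(n10), sum(n11)
    n1_tot = n10_tot + n11_tot
    n = n0_tot + n1_tot
    t1 = n11[1] / n11_tot
    pi_c = n11_tot / n1_tot
    nu1 = (n0[1] / n0_tot - n10[1] / n1_tot) / pi_c
    if 0.0 <= nu1 <= 1.0:
        kind, s1 = "perfect fit", n10[1] / n10_tot
    elif nu1 < 0.0:
        f = n10[0] + n11_tot
        pi_c = n11_tot * (n0[0] + f) / (n * f)
        s1 = ((n0[1] + n10[1]) / n) / (1.0 - pi_c)
        kind, nu1 = "boundary", 0.0
    else:
        k = n10[1] + n11_tot
        pi_c = n11_tot * (n0[1] + k) / (n * k)
        s1 = 1.0 - ((n0[0] + n10[0]) / n) / (1.0 - pi_c)
        kind, nu1 = "boundary", 1.0
    return {"kind": kind, "theta": t1 - nu1, "pi_c": pi_c, "s1": s1, "t1": t1, "nu1": nu1}
```

One way to gain confidence in the boundary formulas is to compare the log-likelihood at the closed-form values with a grid search over π_{C} and *s*_{1}; the closed form should never be beaten.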

For the analysis of the depression data from Cheng (2009) in Table 2, we modified the formulas for the special case of all-or-none compliance in one randomization group, so there are no always-takers. *Assumption 1* holds because a patient's outcome did not depend on the treatment of another patient. *Assumption 2* holds because of random assignment. *Assumption 3* holds because a person who would receive usual care in either group (a never-taker) receives the same usual care if randomized to either group. *Assumptions 4* and *5* hold because seeing a specialist was not possible in the control group.

Data from Cheng (2009)

We computed θ̂_{CACE} for the weights in Cheng (2009) of (*w*_{1}, *w*_{2}, *w*_{3}) = (1, 2, 3) and (1, 3, 4). We also investigated weights (0, −0.5, −1.0) under the premise that minor depression is half the severity of major depression, with a negative sign indicating a decrease in these detrimental outcomes. With these latter weights, θ̂_{CACE} can be interpreted as the expected reduction in major depression "equivalents" per person due to seeing a specialist instead of receiving usual care. Confidence intervals computed asymptotically and by bootstrapping were similar (Table 3), with all but one of ten thousand bootstrap iterations yielding a perfect fit estimate in the interior of the parameter space. For the proposed weighting, θ̂_{CACE(PerfectFit)} = 0.009 with a 95% confidence interval of (−0.14, 0.16). Thus, there is little evidence that seeing a specialist reduces the probability of developing depression symptoms, but the wide confidence intervals may preclude a definitive conclusion.
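A bootstrap interval of the kind mentioned above might be computed along the following lines. This sketch makes several assumptions of its own: it resamples each randomization group as a multinomial over its (intervention received, outcome) cells, uses a percentile interval, and applies the ratio form of the perfect fit estimate to every replicate without checking admissibility (a fuller version would switch to the EM algorithm for inadmissible replicates). The function names and the `counts[r][a][j]` layout are illustrative, and no data from the paper are used.

```python
import random


def cace_ratio(counts, w):
    """Intent-to-treat estimate divided by the difference in fraction receiving T1."""
    n0 = sum(sum(row) for row in counts[0])
    n1 = sum(sum(row) for row in counts[1])
    itt = sum(
        wj * ((counts[1][0][j] + counts[1][1][j]) / n1
              - (counts[0][0][j] + counts[0][1][j]) / n0)
        for j, wj in enumerate(w)
    )
    return itt / (sum(counts[1][1]) / n1 - sum(counts[0][1]) / n0)


def resample_group(group, rng):
    """Multinomial resample of one randomization group, keeping its total fixed."""
    J = len(group[0])
    cells = [(a, j) for a in range(2) for j in range(J)]
    weights = [group[a][j] for a, j in cells]
    new = [[0] * J for _ in range(2)]
    for a, j in rng.choices(cells, weights=weights, k=sum(weights)):
        new[a][j] += 1
    return new


def bootstrap_ci(counts, w, reps=2000, seed=0, level=0.95):
    """Percentile bootstrap interval for the perfect fit estimate."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        boot = [resample_group(group, rng) for group in counts]
        try:
            draws.append(cace_ratio(boot, w))
        except ZeroDivisionError:
            continue  # replicate with equal fractions receiving T1 in both groups
    draws.sort()
    lower = draws[int((1 - level) / 2 * len(draws))]
    upper = draws[int((1 + level) / 2 * len(draws)) - 1]
    return lower, upper
```

Fixing the seed makes the interval reproducible, which fits the emphasis on reproducible calculations.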

With the growing emphasis on reproducible research (Peng, 2009), this simpler approach to ML estimation in causal models for multinomial data should be attractive to many researchers.

The author thanks Jing Cheng for helpful comments and kindly providing the data.

Using the following procedure we derived the perfect fit estimates in (3) for all-or-none compliance in two randomization groups with multinomial outcomes. Setting observed counts equal to their expected values yields

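$$n_{00j} = n_{0++}\,(\pi_N s_j + \pi_C \nu_j), \tag{A.1}$$

$$n_{01j} = n_{0++}\,\pi_A b_j, \tag{A.2}$$

$$n_{10j} = n_{1++}\,\pi_N s_j, \tag{A.3}$$

$$n_{11j} = n_{1++}\,(\pi_A b_j + \pi_C t_j). \tag{A.4}$$

Summing (A.2) and (A.3) over *j* yields *n*_{01+} = *n*_{0++} π_{A} and *n*_{10+} = *n*_{1++} π_{N}. Therefore π̂_{A} = *n*_{01+}/*n*_{0++}, π̂_{N} = *n*_{10+}/*n*_{1++}, and π̂_{C} = 1 − π̂_{A} − π̂_{N}. Substituting π̂_{A} and π̂_{N} into (A.2) and (A.3) gives *b̂*_{j} = *n*_{01j}/*n*_{01+} and *ŝ*_{j} = *n*_{10j}/*n*_{10+}. Substituting these estimates into (A.1) and (A.4) and solving gives ν̂_{j} = (*n*_{00j}/*n*_{0++} − *n*_{10j}/*n*_{1++})/π̂_{C} and *t̂*_{j} = (*n*_{11j}/*n*_{1++} − *n*_{01j}/*n*_{0++})/π̂_{C}.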

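Using the following procedure we derived the closed-form ML boundary estimates in (8) for all-or-none compliance in only one randomization group and a binomial outcome. The counts in group *r* = 0 are {*n*_{0j}}, and the counts in group *r* = 1 are {*n*_{1aj}} for *j* = 0, 1. With ν_{1} fixed at its boundary value of 0 or 1, the kernel of the log-likelihood is

$$\begin{aligned}
\ell ={}& n_{00}\log\{(1-\pi_C)(1-s_1) + \pi_C(1-\nu_1)\} + n_{01}\log\{(1-\pi_C)s_1 + \pi_C\nu_1\}\\
&+ n_{100}\log\{(1-\pi_C)(1-s_1)\} + n_{101}\log\{(1-\pi_C)s_1\} + n_{110}\log\{\pi_C(1-t_1)\} + n_{111}\log(\pi_C t_1).
\end{aligned}$$

Let *n* = *n*_{0+} + *n*_{1++}. Using software for symbolic algebra (Wolfram Research, 2008) to solve ∂ℓ/∂π_{C} = ∂ℓ/∂*s*_{1} = ∂ℓ/∂*t*_{1} = 0 yields *t̂*_{1} = *n*_{111}/*n*_{11+} and

$$\hat\pi_C = \begin{cases} \dfrac{n_{11+}(n_{00} + f)}{n f} & \text{if } \nu_1 = 0,\\[2ex] \dfrac{n_{11+}(n_{01} + k)}{n k} & \text{if } \nu_1 = 1, \end{cases} \qquad \hat s_1 = \begin{cases} \dfrac{n_{01} + n_{101}}{n(1 - \hat\pi_C)} & \text{if } \nu_1 = 0,\\[2ex] 1 - \dfrac{n_{00} + n_{100}}{n(1 - \hat\pi_C)} & \text{if } \nu_1 = 1, \end{cases}$$

where *f* = *n*_{100} + *n*_{11+} and *k* = *n*_{101} + *n*_{11+}.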

**Supplementary materials**

Computer code in Mathematica 7.0 (Wolfram Research, Inc., 2008) to reproduce these results is available under the Paper Information link at the *Biometrics* website http://www.biometrics.tibs.org.

- Angrist JD, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables. Journal of the American Statistical Association. 1996;91:444–455.
- Baker SG. The Multinomial-Poisson transformation. The Statistician. 1994;43:495–504.
- Baker SG. Compliance, all-or-none. In: Kotz S, Read CR, Banks DL, editors. The Encyclopedia of Statistical Science, Update Volume. New York: John Wiley and Sons, Inc.; 1997. pp. 134–138.
- Baker SG, Kramer BS. Simple maximum likelihood estimates of efficacy in randomized trials and before-and-after studies, with implications for meta-analysis. Statistical Methods in Medical Research. 2005;14:1–19. correction 14, 349. [PubMed]
- Baker SG, Lindeman KS. The paired availability design: a proposal for evaluating epidural analgesia during labor. Statistics in Medicine. 1994;13:2269–2278. correction 14, 1841. [PubMed]
- Bloom HS. Accounting for no-shows in experimental evaluation designs. Evaluation Review. 1984;8:225–246.
- Cheng J. Estimation and inference for the causal effect of receiving treatment on a multinomial outcome. Biometrics. 2009;65:96–103. [PubMed]
- Cheng J, Small DS, Tan Z, Ten Have TR. Efficient nonparametric estimation of causal effects in randomized trials with noncompliance. Biometrika. 2009;96:19–36.
- Connor R, Prorok PC, Weed DL. The case-control design and the assessment of the efficacy of cancer screening. Journal of Clinical Epidemiology. 1991;44:1215–1221. [PubMed]
- Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29. [PubMed]
- Peng RD. Reproducible research and *Biostatistics*. Biostatistics. 2009;10:405–408. [PubMed]
- Imbens GW, Rubin DB. Estimating outcome distributions for compliers in instrumental variables models. Review of Economic Studies. 1997;64:555–574.
- Rohatgi VK. An Introduction to Probability Theory and Mathematical Statistics. New York: John Wiley and Sons, Inc.; 1976. p. 383.
- Rubin DB. Comment on 'Randomization analysis of experimental data: The Fisher randomization test' by D. Basu. Journal of the American Statistical Association. 1980;75:591–593.
- Sommer A, Zeger SL. On estimating efficacy from clinical trials. Statistics in Medicine. 1991;10:45–52. [PubMed]
- Wolfram Research, Inc. Mathematica, Version 7.0. Champaign, IL: Wolfram Research, Inc.; 2008.
