Article sections

- SUMMARY
- 1 Introduction
- 2 Motivating Example: JHU PIRC Family-School Partnership (FSP) Intervention Study
- 3 Common Setting: CRT with Noncompliance
- 4 ITT Analysis Considering Clustering
- 5 Comparison of Analysis Options: Monte Carlo Simulations
- 6 Conclusions
- References


Stat Med. Author manuscript; available in PMC 2010 July 21.

Published in final edited form as:

Stat Med. 2008 November 29; 27(27): 5565–5577.

doi: 10.1002/sim.3370

PMCID: PMC2907896

NIHMSID: NIHMS66711

Booil Jo, Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, CA 94305-5795;

Booil Jo: booil@stanford.edu


In cluster randomized trials (CRT), individuals belonging to the same cluster are very likely to resemble one another, not only in terms of outcomes, but also in terms of treatment compliance behavior. Whereas the impact of resemblance in outcomes is well acknowledged, little attention has been given to the possible impact of resemblance in compliance behavior. This study defines compliance intraclass correlation as the level of resemblance in compliance behavior among individuals within clusters. On the basis of Monte Carlo simulations, it is demonstrated how compliance intraclass correlation affects power to detect intention-to-treat (ITT) effect in CRT. As a way of improving power to detect ITT effect in CRT accompanied by noncompliance, this study employs an estimation method, where ITT effect estimates are obtained based on compliance-type-specific treatment effect estimates. A multilevel mixture analysis using an ML-EM estimation method is used for this estimation.

In conducting randomized field experiments, individual-level randomization is not always possible for practical and ethical reasons. Two examples are situations in which a number of patients belong to each doctor in primary care settings (e.g., [1]), and in school settings, a number of students belong to each teacher (e.g., [2]). In these situations, it is problematic (e.g., administrative burden, teacher/parent complaints, ethical reasons) to assign individuals to different treatment conditions ignoring their cluster membership (i.e., physician, teacher). Therefore, cluster randomized trials (CRT) have been widely used in practice, treating a cluster of individuals as the unit of randomization. Although practical/ethical reasons are the main motivation, there is also a statistical advantage to employing CRT. That is, by assigning individuals that are very likely to interact to the same condition, each treatment condition is less likely to be contaminated by other conditions, therefore making the comparison between different treatment conditions more valid [3]. As a result of cluster-level randomization, individuals in the same cluster are very likely to resemble one another, not only in terms of pretreatment characteristics, but also in terms of treatment receipt behavior and posttreatment outcomes.

If resemblance among individuals is ignored (i.e., data are treated as if they were from individual-level randomized trials), small variations within the same cluster may result in underestimated standard errors, exaggerating the statistical significance (i.e., results in incorrect confidence intervals and significance levels) of the effect of treatment assignment, which is a cluster-level variable. An honest (valid) way of analysis in this situation is to take into account increased variance across clusters (due to reduced variance within clusters). For proper analyses accounting for clustered data structures, multilevel analysis techniques developed in various statistical frameworks can be employed (e.g., [4–9]). In designing CRT, it is critical to adjust expected power and required sample sizes assuming that the data will be properly analyzed taking into account within-cluster resemblance among individuals (e.g., [10–11]).

Whereas a good amount of attention has been paid to handling resemblance among individuals in terms of posttreatment outcomes in CRT, little attention has been given to handling resemblance among individuals in terms of treatment compliance behavior. Individuals with the same cluster membership share the environment of the cluster they belong to, resulting in resemblance among individuals in terms of compliance behavior. For example, some doctors or teachers, which represent cluster units, may more eagerly encourage their patients or students to comply with the given treatment. A recent study [12] called attention to this problem, demonstrating the necessity and possibility of estimating compliance-specific treatment assignment effects considering both CRT and noncompliance. Whereas their study focused on compliance-specific treatment assignment effects (e.g., [13]), the main interest of the current study is in investigating how resemblance among individuals in compliance behavior influences the intention-to-treat (ITT) effect and whether the situation can be improved by considering both CRT and treatment noncompliance in the analysis.

Standard ITT analysis is commonly used in analyzing data from randomized trials to estimate an overall effect of treatment assignment (i.e., effectiveness) by comparing groups as randomized. In analyzing data from CRT, the same analysis may be used with an adjustment for the design effect, or multilevel analysis techniques can be employed accounting for resemblance among individuals with the same cluster membership. Given that we are not interested in compliance-type-specific treatment effects (such as for compliers) and that the effect of cluster-level randomization can be taken into account in the analysis, it is unclear whether we need to worry about the effect of treatment noncompliance in estimating ITT effect in CRT. This study shows how resemblance in compliance behavior within clusters can affect the evaluation of ITT effect in CRT and suggests the use of analyses that consider both clustering and noncompliance.

The Johns Hopkins University Preventive Intervention Research Center’s (JHU PIRC) Family-School Partnership (FSP) intervention trial [2], which was used as a prototype for the Monte Carlo simulations reported in this study, was designed to improve academic achievement and to reduce early behavioral problems of school children. First-grade children were randomly assigned to the intervention or to the control condition, and the unit of randomization was a classroom (9 classrooms were assigned to the intervention condition, and another 9 classrooms were assigned to the control condition). Focusing on the shy behavior outcome, the intraclass correlation was about 0.125 at the 6-month follow-up assessment. It is well known that, unless properly handled in the analysis, intraclass correlation in posttreatment outcomes may lead to misestimation of variances, exaggerating statistical significance of treatment effects in CRT.

In addition to the fact that the unit of randomization was a classroom, another main complication in the JHU PIRC trial was poor compliance of parents. In the FSP intervention condition, parents were asked to implement 66 take-home activities related to literacy and mathematics. It was expected that the intervention would not show any desirable effects unless parents report a quite high level of completion (over-reporting of completion level was very likely given that parents self-reported). Compliance behavior was observed in the FSP intervention condition, but not in the control condition, since parents assigned to the control condition were not invited to implement intervention activities. When the receipt of intervention is defined as completing at least two thirds of activities, about 46% of children in the intervention condition properly received the intervention. Further, parents’ compliance with the intervention activities substantially varied depending on the classroom their children belonged to. Table 1 shows proportions of students whose parents completed at least two thirds of intervention activities.

Table 1. JHU PIRC FSP Intervention Condition: Proportion of students whose parents completed at least two thirds of intervention activities.

Varying compliance rates across clusters indicate that parents belonging to the same classroom tend to be similar in terms of compliance behavior (intraclass correlation of compliance is about 0.377). One possible explanation for this variation would be that, in some classrooms, teachers (or parents) are more motivated than in other classrooms (e.g., in Table 1, 100% of parents in one classroom properly implemented the intervention treatments, whereas in another classroom, only 5% did). The question here is how resemblance in compliance will affect the estimation of ITT effect.

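Assume a CRT setting in line with the JHU PIRC trial, where some study participants do not comply with the given treatment. Individual *i* (*i* = 1, 2, 3,…, $m_j$) belongs to cluster *j*. Let $Z_j$ denote the treatment assignment of cluster *j* ($Z_j = 1$ for the treatment condition and $Z_j = 0$ for the control condition), and let $D_{ij}(z)$ denote the potential treatment receipt of individual *i* in cluster *j* under assignment *z* ($D_{ij}(z) = 1$ if the treatment is received and 0 otherwise).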

In line with the JHU PIRC trial, it is assumed that study participants were prohibited from receiving a different treatment than the one that they were assigned to. Therefore, only two compliance types are possible based on *Z* and *D*. The latent compliance type $C_{ij}$ of individual *i* in cluster *j* is defined as

$$C_{ij}=\left\{\begin{array}{ll}1\;(\text{complier}) & \text{if}\;D_{ij}(1)=1\;\text{and}\;D_{ij}(0)=0,\\ 0\;(\text{noncomplier}) & \text{if}\;D_{ij}(1)=0\;\text{and}\;D_{ij}(0)=0.\end{array}\right.$$

Assuming these two compliance types, a continuous outcome *Y* for individual *i* in cluster *j* can be expressed as

$${Y}_{ij}={\alpha}_{n}+({\alpha}_{c}-{\alpha}_{n}){C}_{ij}+{\gamma}_{c}{C}_{ij}{Z}_{j}+{\epsilon}_{bj}+{\epsilon}_{\mathit{wij}},$$

(1)

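where $\alpha_n$ is the mean potential outcome for noncompliers when *Z* = 0, $\alpha_c$ is the mean potential outcome for compliers when *Z* = 0, and $\gamma_c$ is the effect of treatment assignment for compliers (the complier average causal effect, CACE). No assignment effect appears for noncompliers, in line with the exclusion restriction. The macro-unit residual $\epsilon_{bj}$ is assumed to be normally distributed with zero mean and between-cluster variance $\sigma_b^2$, and the micro-unit residual $\epsilon_{wij}$ is assumed to be normally distributed with zero mean and within-cluster variance $\sigma_w^2$.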

In the absence of covariates that predict compliance, the proportions of compliers and noncompliers can be expressed in the empty logistic regression as

$$\begin{array}{r}P({C}_{ij}=1)={\pi}_{ij},\\ P({C}_{ij}=0)=1-{\pi}_{ij},\\ \mathit{logit}({\pi}_{ij})={\beta}_{0}+{\xi}_{j}.\end{array}$$

(2)

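where $\pi_{ij}$ is the probability of being a complier for individual *i* in cluster *j*, $\beta_0$ is the logit intercept, and $\xi_j$ is a cluster-level random effect assumed to be normally distributed with zero mean and between-cluster variance $\psi_b^2$.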
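
As a minimal illustration of equation (2), the following Python sketch converts the intercept and a cluster effect into a compliance probability and draws compliance types for the members of one cluster. The function names and the example values of $\beta_0$ and $\psi_b$ are hypothetical and are not taken from the study.

```python
import math
import random


def compliance_probability(beta0, xi_j):
    """Equation (2): inverse logit of beta0 + xi_j."""
    return 1.0 / (1.0 + math.exp(-(beta0 + xi_j)))


def draw_cluster_compliance(beta0, psi_b, cluster_size, rng):
    """Draw xi_j ~ N(0, psi_b^2) once, then a compliance type C_ij per member."""
    xi_j = rng.gauss(0.0, psi_b)
    pi_j = compliance_probability(beta0, xi_j)
    return [1 if rng.random() < pi_j else 0 for _ in range(cluster_size)]


# Hypothetical values: beta0 = 0 centers compliance at 50%, psi_b = 1.4 spreads clusters.
rng = random.Random(0)
for cluster in range(3):
    types = draw_cluster_compliance(beta0=0.0, psi_b=1.4, cluster_size=20, rng=rng)
    print(f"cluster {cluster}: {sum(types)} of {len(types)} compliers")
```

Because all members of a cluster share the same $\xi_j$, a large $\psi_b$ pushes whole clusters toward very high or very low compliance, which is the pattern described for Table 1.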

Intraclass correlation (ICC) has been widely used to represent the level of resemblance among individuals belonging to the same cluster in terms of outcomes. As ICC increases, variance within clusters will decrease, resulting in inflation of variance across clusters. The direct consequence of this variance inflation is reduced power (compared to power in individual-level randomized trials) to detect the effect of treatment assignment, which is a cluster-level variable in CRT. However, if this variance inflation is ignored in the analysis, the resulting type I error rate will be incorrectly inflated.

From equation (1), the ICC coefficient in outcome *Y* given *Z* is defined as

$${\text{ICC}}_{Y}=\frac{{\sigma}_{b}^{2}}{{\sigma}_{b}^{2}+{\sigma}_{w}^{2}},$$

(3)

where $\sigma_b^2$ denotes the between-cluster variance and $\sigma_w^2$ the within-cluster variance of outcome *Y* given *Z*. The total variance is the sum of the between- and within-cluster variances ($\sigma^2 = \sigma_b^2 + \sigma_w^2$).
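
To make the variance inflation concrete, the sketch below computes the outcome ICC of equation (3) together with a design effect. The design effect formula 1 + (*m* − 1)·ICC is a commonly used approximation for equal cluster sizes; it is an assumption of this example and is not stated in the text. Function names are hypothetical.

```python
def icc_outcome(sigma_b2, sigma_w2):
    """Equation (3): share of total outcome variance lying between clusters."""
    return sigma_b2 / (sigma_b2 + sigma_w2)


def design_effect(icc, cluster_size):
    """Variance inflation 1 + (m - 1) * ICC for equal cluster sizes (assumed formula)."""
    return 1.0 + (cluster_size - 1) * icc


# Between/within variance pairs that sum to a total variance of 1.0.
for sigma_b2, sigma_w2 in [(0.00, 1.00), (0.05, 0.95), (0.10, 0.90)]:
    icc = icc_outcome(sigma_b2, sigma_w2)
    print(f"ICC_Y = {icc:.2f}, design effect for m = 20: {design_effect(icc, 20):.2f}")
```

Under this approximation, an outcome ICC of 0.10 with 20 individuals per cluster gives a design effect of 1 + 19 × 0.10 = 2.9, so the variance of an arm mean is nearly three times what it would be under individual-level randomization.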

In addition to the conventional outcome ICC, another ICC is defined in this study to represent resemblance among individuals belonging to the same cluster in terms of compliance behavior. In CRT, individuals belonging to the same cluster are likely to show resemblance not only in terms of outcomes, but also in terms of compliance behavior. The compliance ICC represents a unique complication in CRT accompanied by treatment noncompliance.

There are several ways to present heterogeneity across clusters in proportions [14–18]. In line with McKelvey and Zavoina [19], the intraclass correlation coefficient in compliance can be defined from equation (2) as

$${\text{ICC}}_{C}=\frac{{\psi}_{b}^{2}}{{\psi}_{b}^{2}+{\pi}^{2}/3},$$

(4)

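where $\psi_b^2$ is the between-cluster variance (i.e., variance of $\xi_j$) and $\pi^2/3$ is the variance of the standard logistic distribution, which plays the role of the within-cluster variance. Here $\pi$ is the mathematical constant, not the compliance probability $\pi_{ij}$ of equation (2).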
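
Equation (4) can be inverted to find the between-cluster variance implied by a given compliance ICC. The following sketch shows both directions; the function names are hypothetical, and the value 0.377 is the compliance ICC reported above for the JHU PIRC trial.

```python
import math

# Variance of the standard logistic distribution (the pi^2 / 3 term in equation (4)).
LOGISTIC_VARIANCE = math.pi ** 2 / 3.0


def icc_compliance(psi_b2):
    """Equation (4): compliance ICC from the variance of the cluster effect xi_j."""
    return psi_b2 / (psi_b2 + LOGISTIC_VARIANCE)


def psi_b2_from_icc(icc_c):
    """Invert equation (4); requires 0 <= icc_c < 1."""
    return icc_c * LOGISTIC_VARIANCE / (1.0 - icc_c)


psi_b2 = psi_b2_from_icc(0.377)
print(f"between-cluster variance implied by ICC_C = 0.377: {psi_b2:.3f}")
print(f"round trip back to ICC_C: {icc_compliance(psi_b2):.3f}")
```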

Under the stable unit treatment value assumption (SUTVA; [20–22]), each individual’s potential outcomes are unaffected by other individuals’ treatment assignment status. SUTVA is a critical assumption that makes identification of causal treatment effects possible. When dealing with individuals nested within clusters in randomized trials, plausibility of SUTVA is highly suspect. Cluster-level randomization plays a critical role in making this obvious violation of SUTVA a more manageable problem by concentrating individuals who are most likely to interact with one another in the same treatment condition. For example, in the FSP intervention trial, the unit of randomization was a classroom. By employing cluster randomization, the interaction rate among individuals across different treatment conditions remains about the same as that observed without systematic nesting structures (i.e., classrooms). However, interaction in the same cluster is highly likely, which can be handled statistically by considering resemblance among individuals with the same cluster membership in the analysis.

Standard ITT analysis is commonly used in analyzing data from randomized trials to estimate an overall effect of treatment assignment. In analyzing data from CRT, the same analysis may be used in conjunction with multilevel analysis techniques. Since noncompliance is not considered in this method, individual-level and cluster-level variations in compliance behavior are not taken into account. Accordingly, the model described in equation (1) is simplified as follows:

$${Y}_{ij}=\alpha +\gamma {Z}_{j}+{\epsilon}_{bj}+{\epsilon}_{\mathit{wij}},$$

(5)

where *α* is the overall mean potential outcome when *Z* = 0, and the average effect of treatment assignment (i.e., ITT effect) is *γ*. The macro-unit residual $\epsilon_{bj}$ is assumed to be normally distributed with zero mean and between-cluster variance $\sigma_b^2$. The micro-unit residual $\epsilon_{wij}$ is assumed to be normally distributed with zero mean and within-cluster variance $\sigma_w^2$.

The analysis model described in equation (5) is a standard hierarchical linear model and can be estimated with the ML estimator. A number of different algorithms are available for obtaining the ML estimates [9]. In this paper, we used the EM algorithm [23–25] implemented in Mplus version 5 [26].

We define a two-level ML estimate of ITT effect as

$${\widehat{\gamma}}^{2ML}={{\widehat{\mu}}_{1}}^{2ML}-{{\widehat{\mu}}_{0}}^{2ML},$$

(6)

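where $\widehat{\mu}_1^{2ML}$ and $\widehat{\mu}_0^{2ML}$ are the two-level ML estimates of $\mu_1$ and $\mu_0$, the mean outcomes under the treatment and control conditions.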
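
With equal cluster sizes, the fixed-effect estimate of *γ* in a random-intercept model such as equation (5) coincides with the difference between the arm averages of the cluster means. The sketch below uses that cluster-level summary as a simple stand-in for the two-level ML analysis. It is not the EM implementation in Mplus: its standard error is the usual two-sample formula applied to cluster means, and the function name and toy data are hypothetical.

```python
import math
import statistics


def itt_from_cluster_means(treat_clusters, control_clusters):
    """ITT estimate and standard error from per-cluster outcome lists.

    Each argument is a list of clusters, and each cluster is a list of outcomes.
    Assumes equal cluster sizes and at least two clusters per arm.
    """
    treat_means = [statistics.mean(cluster) for cluster in treat_clusters]
    control_means = [statistics.mean(cluster) for cluster in control_clusters]
    estimate = statistics.mean(treat_means) - statistics.mean(control_means)
    # The spread of cluster means already reflects both variance components.
    variance = (statistics.variance(treat_means) / len(treat_means)
                + statistics.variance(control_means) / len(control_means))
    return estimate, math.sqrt(variance)


# Toy data: two clusters of two individuals in each arm.
estimate, se = itt_from_cluster_means([[1.0, 2.0], [3.0, 4.0]],
                                      [[0.0, 1.0], [1.0, 2.0]])
print(f"ITT estimate = {estimate:.3f}, SE = {se:.3f}")
```

Treating the cluster as the unit of analysis is what keeps the standard error honest here: the number of independent observations is the number of clusters, not the number of individuals.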

Another way to look at the ITT effect is as a combination of the treatment assignment effect for compliers and the treatment assignment effect for noncompliers. In this approach, the existence of noncompliance can be taken into account. Considering noncompliance may have some impact on ITT effect estimation when information on the mixture distribution of compliers and noncompliers is utilized in CACE estimation. The ML mixture approach is known to be often more efficient than the IV approach in the estimation of CACE [27–28]. Estimation of ITT effect may also benefit from this improved efficiency if the additional assumptions necessary to identify CACE, such as the exclusion restriction and monotonicity [13], hold.

In the current setting we consider (i.e., individuals assigned to the control condition have no access to the actual treatment as in the JHU PIRC trial), monotonicity is a plausible assumption (i.e., no individuals do the opposite of what they are assigned to do). The exclusion restriction may not hold if the treatment is not truly all-or-none, especially in non-blinded studies. When this assumption is violated, CACE estimation is likely to benefit from the ML mixture analysis, which mitigates the impact of violation by utilizing auxiliary information such as from distributional heterogeneity, parametric assumptions, and covariates [29]. However, ITT effect estimation based on these adjusted CACE estimates in conjunction with the exclusion restriction may result in biased results. In principle, it is possible to relax the exclusion restriction relying on auxiliary information such as from proper priors and covariates [30–32], although it is not well known how these methods work in the context of CRT. In the current paper, we focus on situations where the exclusion restriction is a plausible assumption.

To simultaneously handle data clustering, noncompliance, and interaction between these two, the two-level ML mixture approach considers the same model described in equations (1) and (2) in estimating the ITT effect. On the basis of the model described in equations (1) and (2), a formal multilevel mixture analysis [33–34] using the ML estimator can be conducted. The observed data likelihoods for the treatment and the control groups differ because the compliance variable $C_{ij}$ is observed when $Z_j = 1$ but not when $Z_j = 0$.

In the treatment group, the observed data likelihood for cluster *j* is described as

$${L}_{j}\propto \int \left(\prod _{i}{f}_{1}({Y}_{ij}\mid {C}_{ij},{\epsilon}_{bj})\right){\phi}_{bj}({\epsilon}_{bj})d{\epsilon}_{bj}\cdot \int \left(\prod _{i}{\pi}_{ij}^{{C}_{ij}}{(1-{\pi}_{ij})}^{1-{C}_{ij}}\right){\phi}_{j}({\xi}_{j})d{\xi}_{j},$$

(7)

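where $f_1(Y_{ij} \mid C_{ij}, \epsilon_{bj})$ is the normal density of $Y_{ij}$ in the treatment group given $C_{ij}$ and $\epsilon_{bj}$,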

$${f}_{1}({Y}_{ij}\mid {C}_{ij},{\epsilon}_{bj})=\mathit{Exp}\left(-\frac{{({Y}_{ij}-{\alpha}_{n}-({\alpha}_{c}-{\alpha}_{n}){C}_{ij}-{\gamma}_{c}{C}_{ij}-{\epsilon}_{bj})}^{2}}{2{\sigma}_{w}^{2}}\right)/(\sqrt{2\pi}{\sigma}_{w}),$$

(8)

$\phi_{bj}(\epsilon_{bj})$ is the normal density of the cluster-level outcome residual $\epsilon_{bj}$,

$${\phi}_{bj}({\epsilon}_{bj})=\mathit{Exp}(-{\epsilon}_{bj}^{2}/(2{\sigma}_{b}^{2}))/(\sqrt{2\pi}{\sigma}_{b}),$$

(9)

$\phi_j(\xi_j)$ is the normal density of the cluster-level compliance random effect $\xi_j$,

$${\phi}_{j}({\xi}_{j})=\mathit{Exp}(-{\xi}_{j}^{2}/(2{\psi}_{b}^{2}))/(\sqrt{2\pi}{\psi}_{b}),$$

(10)

and the probability of compliance

$${\pi}_{ij}=\frac{\mathit{Exp}({\beta}_{0}+{\xi}_{j})}{1+\mathit{Exp}({\beta}_{0}+{\xi}_{j})}.$$

(11)

In the control group, *C _{ij}* is unobserved and thus the observed data likelihood is

$${L}_{j}\propto \int \left(\prod _{i}({f}_{0}({Y}_{ij}\mid {\epsilon}_{bj},{C}_{ij}=1){\pi}_{ij}+{f}_{0}({Y}_{ij}\mid {\epsilon}_{bj},{C}_{ij}=0)(1-{\pi}_{ij}))\right){\phi}_{bj}({\epsilon}_{bj}){\phi}_{j}({\xi}_{j})d{\epsilon}_{bj}d{\xi}_{j},$$

(12)

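where $f_0(Y_{ij} \mid C_{ij}, \epsilon_{bj})$ is the normal density of $Y_{ij}$ in the control group given $C_{ij}$ and $\epsilon_{bj}$,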

$${f}_{0}({Y}_{ij}\mid {C}_{ij},{\epsilon}_{bj})=\mathit{Exp}\left(-\frac{{({Y}_{ij}-{\alpha}_{n}-({\alpha}_{c}-{\alpha}_{n}){C}_{ij}-{\epsilon}_{bj})}^{2}}{2{\sigma}_{w}^{2}}\right)/(\sqrt{2\pi}{\sigma}_{w}).$$

(13)

The total likelihood function

$$L=\prod _{j}{L}_{j}$$

(14)

does not have a closed-form expression, and to compute it we use 2-dimensional numerical integration. By maximizing *L* with respect to the parameters in the model we obtain the ML estimates. The likelihood can be maximized directly by using a general maximization algorithm. Numerical methods can be used to compute the derivatives of *L* with respect to the parameters. A more efficient method for maximizing the likelihood, however, is the EM algorithm, which is implemented in Mplus version 5 [26]. This algorithm treats the unknown compliance status in the control group as well as the between-level random effects as missing data. Details on the implementation of this algorithm are available in Muthén and Asparouhov [35]. Parametric standard errors are computed from the information matrix using the second-order derivatives of *L*.
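
The two-dimensional integral in equation (12) can be approximated on a grid. The sketch below is only an illustration under stated assumptions: it uses a simple trapezoid rule on standardized nodes, which is not necessarily the integration scheme applied in Mplus, and all function names and example values are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sd)


def standard_normal_grid(n_points=61, half_width=6.0):
    """Trapezoid nodes and weights for integrating against a N(0, 1) density."""
    step = 2.0 * half_width / (n_points - 1)
    nodes = [-half_width + k * step for k in range(n_points)]
    weights = [normal_pdf(z, 0.0, 1.0) * step for z in nodes]
    weights[0] *= 0.5   # trapezoid rule halves the two end points
    weights[-1] *= 0.5
    return nodes, weights


def control_cluster_likelihood(y, alpha_n, alpha_c, sigma_w, sigma_b, beta0, psi_b):
    """Approximate equation (12) for one control cluster with outcomes y."""
    nodes, weights = standard_normal_grid()
    total = 0.0
    for z_eps, w_eps in zip(nodes, weights):
        eps_b = sigma_b * z_eps            # cluster-level outcome residual
        for z_xi, w_xi in zip(nodes, weights):
            xi = psi_b * z_xi              # cluster-level compliance effect
            pi_j = 1.0 / (1.0 + math.exp(-(beta0 + xi)))  # equation (11)
            product = 1.0
            for y_ij in y:
                f_complier = normal_pdf(y_ij, alpha_c + eps_b, sigma_w)
                f_noncomplier = normal_pdf(y_ij, alpha_n + eps_b, sigma_w)
                # Compliance is unobserved, so each individual contributes a mixture.
                product *= f_complier * pi_j + f_noncomplier * (1.0 - pi_j)
            total += w_eps * w_xi * product
    return total


# Hypothetical outcomes and parameter values for a single small control cluster.
y_cluster = [0.2, -0.5, 1.1, 0.4]
value = control_cluster_likelihood(y_cluster, alpha_n=0.0, alpha_c=1.0,
                                   sigma_w=math.sqrt(0.95), sigma_b=math.sqrt(0.05),
                                   beta0=0.0, psi_b=1.5)
print(f"approximate cluster likelihood: {value:.6g}")
```

Writing the random effects as a standard deviation times a standardized node lets the same grid serve both integrals. For realistic cluster sizes the product should be accumulated on the log scale to avoid underflow, which this sketch omits for brevity.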

We assume random assignment of treatment conditions, SUTVA, and the exclusion restriction in this analysis. In CRT, interaction among individuals in the same cluster is highly likely. As in two-level ML analysis, we statistically deal with resemblance among individuals with the same cluster membership in two-level ML mixture analysis. In that sense, SUTVA is not a necessary assumption. However, in CRT, the interaction rate among individuals across different treatment conditions remains about the same as that observed without systematic nesting structures. Therefore, although it may not be serious, some deviation from SUTVA is possible as in any randomized trial.

A two-level ML mixture estimate of CACE is described as

$${\widehat{\gamma}}_{c}^{2ML.\mathit{mix}}={\widehat{\mu}}_{1c}^{2ML.\mathit{mix}}-{\widehat{\mu}}_{0c}^{2ML.\mathit{mix}},$$

(15)

where $\widehat{\mu}_{1c}^{2ML.\mathit{mix}}$ and $\widehat{\mu}_{0c}^{2ML.\mathit{mix}}$ are the two-level ML mixture estimates of $\mu_{1c}$ and $\mu_{0c}$, the mean outcomes of compliers under the treatment and control conditions.

Then, a two-level ML mixture estimate of ITT effect is

$${\widehat{\gamma}}^{2ML.\mathit{mix}}={\widehat{\gamma}}_{c}^{2ML.\mathit{mix}}{\widehat{\pi}}_{c}^{2ML.\mathit{mix}},$$

(16)

where $\widehat{\pi}_c^{2ML.\mathit{mix}}$ is the two-level ML mixture estimate of $\pi_c$, the proportion of compliers.

Standard errors of the ITT estimates are obtained using the delta method as

$$\begin{array}{l}\mathit{Var}({\widehat{\gamma}}_{c}^{2ML.\mathit{mix}}{\widehat{\pi}}_{c}^{2ML.\mathit{mix}})\approx {({\widehat{\gamma}}_{c}^{2ML.\mathit{mix}})}^{2}\mathit{Var}({\widehat{\pi}}_{c}^{2ML.\mathit{mix}})+{({\widehat{\pi}}_{c}^{2ML.\mathit{mix}})}^{2}\mathit{Var}({\widehat{\gamma}}_{c}^{2ML.\mathit{mix}})\\ +2{\widehat{\gamma}}_{c}^{2ML.\mathit{mix}}{\widehat{\pi}}_{c}^{2ML.\mathit{mix}}\mathit{Cov}({\widehat{\gamma}}_{c}^{2ML.\mathit{mix}},{\widehat{\pi}}_{c}^{2ML.\mathit{mix}}).\end{array}$$

(17)
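
Equations (16) and (17) translate directly into a few lines of code. In the sketch below, the function name and the numerical inputs are hypothetical; in practice the variances and the covariance would come from the estimated information matrix.

```python
import math


def itt_from_cace(gamma_c, pi_c, var_gamma, var_pi, cov_gamma_pi=0.0):
    """ITT estimate (equation 16) and delta-method standard error (equation 17)."""
    estimate = gamma_c * pi_c
    variance = (gamma_c ** 2 * var_pi
                + pi_c ** 2 * var_gamma
                + 2.0 * gamma_c * pi_c * cov_gamma_pi)
    return estimate, math.sqrt(variance)


# Hypothetical CACE of 0.5 with a 50% compliance rate and illustrative variances.
estimate, se = itt_from_cace(gamma_c=0.5, pi_c=0.5, var_gamma=0.01, var_pi=0.0004)
print(f"ITT estimate = {estimate:.3f}, delta-method SE = {se:.4f}")
```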

To examine the impact of $\text{ICC}_C$ and $\text{ICC}_Y$ on the estimation of ITT effect, Monte Carlo simulations were conducted.

The Monte Carlo simulation results presented in this study are based on 500 replications. The size of each cluster (*m*) is 20, and the total number of clusters (*G*) is 100 (50 in the control and 50 in the treatment condition). A large number of clusters (100 in this study compared to 18 in the JHU Study) is employed to avoid another source of variance misestimation and to focus on variance misestimation only due to intraclass correlations. The true ratio of the treatment and control groups is 50%:50%. The size of ITT effect increases or decreases proportionally as a function of the compliance rate, and therefore noncompliance has a direct impact on power to detect ITT effect [36]. In this paper, beyond this direct impact through compliance rates, we are more interested in studying the impact of noncompliance on power through within-cluster resemblance in compliance. Therefore, we used the same true compliance rate (50%) across all simulation settings.
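
One replication of this design can be generated from equations (1) and (2) as sketched below. This is a hedged illustration rather than the code used for the reported simulations: the function name is hypothetical, and the outcome means and complier effect in the defaults are placeholder values. Setting $\beta_0 = 0$ yields a 50% marginal compliance rate because the cluster effect is symmetric around zero.

```python
import math
import random


def simulate_crt(n_clusters=100, cluster_size=20, icc_c=0.4, icc_y=0.05,
                 alpha_n=0.0, alpha_c=0.0, gamma_c=0.5, beta0=0.0, seed=None):
    """Generate one data set as tuples (cluster, individual, Z, C, Y).

    alpha_n, alpha_c, and gamma_c are placeholder values, not the study's.
    The total outcome variance given compliance type is fixed at 1.0.
    """
    rng = random.Random(seed)
    psi_b = math.sqrt(icc_c / (1.0 - icc_c) * math.pi ** 2 / 3.0)  # from equation (4)
    sigma_b = math.sqrt(icc_y)          # between-cluster SD, from equation (3)
    sigma_w = math.sqrt(1.0 - icc_y)    # within-cluster SD
    rows = []
    for j in range(n_clusters):
        z = 1 if j < n_clusters // 2 else 0      # first half of clusters treated
        xi = rng.gauss(0.0, psi_b)               # cluster effect on compliance
        eps_b = rng.gauss(0.0, sigma_b)          # cluster effect on the outcome
        pi_j = 1.0 / (1.0 + math.exp(-(beta0 + xi)))
        for i in range(cluster_size):
            c = 1 if rng.random() < pi_j else 0  # latent compliance type
            y = (alpha_n + (alpha_c - alpha_n) * c + gamma_c * c * z
                 + eps_b + rng.gauss(0.0, sigma_w))
            rows.append((j, i, z, c, y))
    return rows


data = simulate_crt(icc_c=0.4, icc_y=0.05, seed=2008)
treated = [row for row in data if row[2] == 1]
print(f"{len(data)} individuals, compliance rate in treated clusters: "
      f"{sum(row[3] for row in treated) / len(treated):.2f}")
```

The generator returns the latent compliance type for every individual. In an analysis that mimics the trial, the value of *C* would be hidden for individuals in control clusters.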

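The true $\text{ICC}_C$ value ranges from 0.0 to 0.8. A zero $\text{ICC}_C$ corresponds to $\psi_b^2 = 0$ in equation (4), that is, to no between-cluster variation in compliance.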

Data were generated according to equations (1) and (2). Complier and noncomplier outcome means (i.e., *α _{n}* and

The true within-cluster variance $\sigma_w^2$ takes values of 1.00, 0.95, and 0.90. The true between-cluster variance $\sigma_b^2$ takes values of 0.00, 0.05, and 0.10 to reflect $\text{ICC}_Y$ of 0.00, 0.05, and 0.10 given the total variance of 1.0. The true treatment assignment effect for compliers

In summarizing analysis results with the simulated data, coverage is defined as the proportion of replications out of 500 replications where the true parameter values are covered by the nominal 95% confidence interval of the parameter estimates. Power is defined as the proportion of replications out of 500 replications where the ITT effect estimates are significantly different from zero (*α* = .05).
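
The two summary measures can be computed from the per-replication estimates and standard errors as sketched below. The function name is hypothetical, and the use of the normal critical value 1.96 for both the confidence interval and the significance test is an assumption of this example.

```python
def coverage_and_power(results, true_value, z_crit=1.96):
    """Return (coverage, power) from a list of (estimate, standard_error) pairs."""
    covered = 0
    significant = 0
    for estimate, se in results:
        lower = estimate - z_crit * se
        upper = estimate + z_crit * se
        if lower <= true_value <= upper:
            covered += 1          # nominal 95% interval contains the true value
        if abs(estimate) > z_crit * se:
            significant += 1      # estimate differs significantly from zero
    n = len(results)
    return covered / n, significant / n


# Four made-up replications with a true ITT effect of 0.25.
replications = [(0.25, 0.05), (0.10, 0.10), (0.50, 0.05), (0.30, 0.20)]
coverage, power = coverage_and_power(replications, true_value=0.25)
print(f"coverage = {coverage:.2f}, power = {power:.2f}")
```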

Figure 1 shows simulation results based on two-level ML analysis and two-level ML mixture analysis. Since both approaches consider data clustering in the analyses, coverage rates in these analyses stay close to the nominal level regardless of $\text{ICC}_Y$ and $\text{ICC}_C$.

Figure 1. Two-level ML analysis and two-level ML mixture analysis: Statistical power in detecting ITT effect as a function of $\text{ICC}_C$ and $\text{ICC}_Y$ (100 clusters, 20 individuals per cluster). The dotted lines represent power when two-level ML analysis is employed. The solid lines represent power when two-level ML mixture analysis is employed.

Figure 1 shows power to detect ITT effect when two-level ML analysis and two-level ML mixture analysis are employed. In two-level ML analysis (see the dotted lines), possible sources of variance misestimation can be seen by comparing the simplified model in equation (5) and the full model in equation (1). The cluster-level outcome residual *ε _{bj}* in equation (1) is properly modeled in two-level ML analysis, and therefore is not a source of variance misestimation. However, the variance associated with (

In two-level ML mixture analysis (see the solid lines), data generated on the basis of the model described in equations (1) and (2) are analyzed using the same model considering the fact that randomization was done at the cluster level and that some individuals did not comply with the given treatment. Since within- and between-cluster variances ($\sigma_w^2$ and $\sigma_b^2$) are correctly estimated by simultaneously considering $\text{ICC}_Y$, ICC

Frangakis and Rubin [37] previously pointed out that estimation of intention-to-treat effect can be biased in the analysis that ignores treatment noncompliance due to interaction between noncompliance and nonresponse (i.e., availability of outcome data at posttreatment assessments). The current study calls attention to a similar phenomenon (i.e., how we deal with compliance information in the analysis affects the evaluation of treatment effects even if we are not interested in estimating compliance-type-specific treatment assignment effects) in a different context, where noncompliance may interact with clustering of individuals. It was demonstrated in this study that ignoring compliance information in analyzing data from CRT may result in substantially decreased power to detect ITT effect.

To simultaneously handle data clustering and noncompliance, this study employed a formal multilevel analysis combined with the mixture analysis. The joint analysis of both complications is computationally demanding, but it provides a general framework that can accommodate various forms of clustered data structures considering mixture distributions of compliers and noncompliers. The ML-EM estimation of the multilevel mixture models has been implemented in the Mplus program [26], providing an accessible tool for complex statistical modeling. Although not covered in this study, other complications in randomized trials such as missing outcomes can also be incorporated in this estimation framework in addition to noncompliance and data clustering. Further study is needed for better understanding of how ITT effect estimation may benefit from the joint modeling of multiple complications in various contexts of randomized trials.

As a way of improving power to detect ITT effect in CRT accompanied by noncompliance, this study employed an estimation method, where ITT effect estimates are obtained on the basis of compliance-type-specific treatment effect estimates. The same approach was used by Frangakis and Rubin [37] to avoid bias in the estimation of ITT effect. The limitation of this approach is that ITT effect estimates can be biased if underlying assumptions employed to identify compliance-type-specific treatment effects are violated. Given that, although they may seem irrelevant, methods to better handle identification problems in estimating compliance-type-specific treatment effects are likely to improve estimation of ITT effect when faced with various complications in randomized trials. Extensive treatment of this topic is left for future study.

Along with possible violation of assumptions employed to identify compliance-type-specific treatment effects, another complication that poses a major problem in applying the multilevel mixture analysis method is having small numbers of clusters. In simulation results reported in this paper, a large number of clusters has been employed (i.e., 100) to focus on variance misestimation only due to intraclass correlations. In practice, however, much smaller numbers of clusters are often employed in CRT as in the JHU PIRC trial (i.e., 18). When applying multilevel mixture analysis, having small numbers of clusters poses serious consequences. That is, with small numbers of clusters, not only standard errors, but also compliance-specific treatment effects are likely to be poorly estimated at the cluster level (e.g., CACE will be basically estimated based on 9 classroom observations in the JHU trial).

In situations where multilevel ML mixture analysis is not recommended, such as in the JHU PIRC trial, multilevel ML analysis considering only clustering seems to be a reasonable solution. A few strategies such as the use of the bootstrap method, the use of the Bayesian method with strong priors, and the use of an approximate F-test have been used to improve estimation in one-class multilevel analyses with small numbers of clusters. When we apply one-class multilevel ML analysis (adjusted for small numbers of clusters), a combination of simpler analyses and the simulation results reported in this paper can be used together to guide interpretation of the results. For example, on the basis of a two-level logistic regression analysis using only the intervention group data, ICC* _{C}* can be estimated. Complier and noncomplier means (

Booil Jo, Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, CA 94305-5795.

Tihomir Asparouhov, Muthén & Muthén.

Bengt O. Muthén, Graduate School of Education & Information Studies, University of California, Los Angeles.

1. Dexter P, Wolinsky F, Gramelspacher G, Zhou XH, Eckert G, Waisburd M, Tierney W. Effectiveness of computer-generated reminders for increasing discussions about Advance Directives and completion of Advance Directives. Annals of Internal Medicine. 1998;128:102–110.

2. Ialongo NS, Werthamer L, Kellam SG, Brown CH, Wang S, Lin Y. Proximal impact of two first-grade preventive interventions on the early risk behaviors for later substance abuse, depression and antisocial behavior. American Journal of Community Psychology. 1999;27:599–642.

3. Sobel ME. What do randomized studies of housing mobility demonstrate: Causal inference in the face of interference. Journal of the American Statistical Association. 2006;101:1398–1407.

4. Aitkin M, Longford N. Statistical modeling issues in school effectiveness studies (with discussion). Journal of the Royal Statistical Society, Series A. 1986;149:1–43.

5. Goldstein H. Multilevel mixed linear model analysis using iterative generalized least squares. Biometrika. 1986;73:43–56.

6. Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73:13–22.

7. McCulloch CE. Maximum likelihood algorithms for generalized linear mixed models. Journal of the American Statistical Association. 1997;92:162–170.

8. Muthén BO, Satorra A. Complex sample data in structural equation modeling. In: Marsden PV, editor. Sociological Methodology. Blackwell; Cambridge, MA: 1995. pp. 267–316.

9. Raudenbush SW, Bryk AS. Hierarchical Linear Models: Applications and Data Analysis Methods. Sage; Thousand Oaks, CA: 2002.

10. Donner A, Klar N. Statistical considerations in the design and analysis of community intervention trials. Journal of Clinical Epidemiology. 1996;49:435–439.

11. Murray DM. Design and Analysis of Group-Randomized Trials. Oxford University Press; New York: 1998.

12. Frangakis CE, Rubin DB, Zhou XH. Clustered encouragement design with individual noncompliance: Bayesian inference and application to advance directive forms. Biostatistics. 2002;3:147–164.

13. Angrist JD, Imbens GW, Rubin DB. Identification of causal effects using instrumental variables. Journal of the American Statistical Association. 1996;91:444–455.

14. Agresti A. Categorical Data Analysis. Wiley; New York: 1990.

15. Commenges D, Jacqmin H. The intraclass correlation coefficient: distribution-free definition and test. Biometrics. 1994;50:517–526.

16. Haldane JBS. The mean and variance of $\chi^2$, when used as a test of homogeneity, when expectations are small. Biometrika. 1940;31:346–355.

17. McCullagh P, Nelder JA. Generalized Linear Models. Chapman & Hall; London: 1989.

18. Snijders TAB, Bosker RJ. Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. Sage; Thousand Oaks, CA: 1999.

19. McKelvey RD, Zavoina W. A statistical model for the analysis of ordinal level dependent variables. Journal of Mathematical Sociology. 1975;4:103–120.

20. Rubin DB. Bayesian inference for causal effects: The role of randomization. Annals of Statistics. 1978;6:34–58.

21. Rubin DB. Discussion of “Randomization analysis of experimental data in the Fisher randomization test” by D. Basu. Journal of the American Statistical Association. 1980;75:591–593.

22. Rubin DB. Comment on “Neyman (1923) and causal inference in experiments and observational studies.” Statistical Science. 1990;5:472–480.

23. Dempster A, Laird N, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B. 1977;39:1–38.

24. Little RJA, Rubin DB. Statistical Analysis with Missing Data. Wiley; New York: 2002.

25. McLachlan GJ, Krishnan T. The EM Algorithm and Extensions. Wiley; New York: 1997.

26. Muthén LK, Muthén BO. Mplus User’s Guide. Muthén & Muthén; Los Angeles: 1998–2008.

27. Imbens GW, Rubin DB. Bayesian inference for causal effects in randomized experiments with non-compliance. The Annals of Statistics. 1997;25:305–327.

28. Little RJA, Yau LHY. Statistical techniques for analyzing data from prevention trials: Treatment of no-shows using Rubin’s causal model. Psychological Methods. 1998;3:147–159.

29. Jo B. Model misspecification sensitivity analysis in estimating causal effects of interventions with noncompliance. Statistics in Medicine. 2002;21:3161–3181.

30. Hirano K, Imbens GW, Rubin DB, Zhou XH. Assessing the effect of an influenza vaccine in an encouragement design. Biostatistics. 2000;1:69–88.

31. Jo B. Estimation of intervention effects with noncompliance: Alternative model specifications. Journal of Educational and Behavioral Statistics. 2002;27:385–409.

32. Jo B, Asparouhov T, Muthén BO, Ialongo NS, Brown CH. Cluster randomized trials with treatment noncompliance. Psychological Methods. 2008;13:1–18.

33. Asparouhov T, Muthén BO. Multilevel mixture models. In: Hancock GR, Samuelsen KM, editors. Advances in latent variable mixture models. Information Age Publishing; Greenwich, CT: 2007.

34. Muthén BO. Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In: Kaplan D, editor. Handbook of quantitative methodology for the social sciences. Sage; Newbury Park, CA: 2004. pp. 345–368.

35. Muthén BO, Asparouhov T. Growth mixture analysis: Models with non-Gaussian random effects. In: Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G, editors. Advances in Longitudinal Data Analysis. Chapman & Hall; London: 2007.

36. Jo B. Statistical power in randomized intervention studies with noncompliance. Psychological Methods. 2002;7:178–193.

37. Frangakis CE, Rubin DB. Addressing complications of intent-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes. Biometrika. 1999;86:365–379.
