Cluster randomized trials have become increasingly common in studies that evaluate non-pharmaceutical, behavioral, and quality-improvement interventions [1,2]. Examples include community-improvement projects, lifestyle interventions such as smoking cessation and increased physical activity, education intervention trials, and office-based child-care intervention programs. The cluster design is often used to minimize contamination between intervention and control groups.
Under a cluster design, individuals within a cluster may be more alike than individuals across clusters. A commonly used parameter for measuring the extent to which individuals within a cluster resemble one another is the intraclass correlation (ICC). Suppose the ICC for patients within the same clinic is 0.04, which is not an uncommon level of correlation, and that there are 100 patients in each clinic. A study that uses the clinic as the unit of randomization then requires about five times as many patients as a study that uses the patient as the unit of randomization. This example shows how strongly the ICC can change the sample size required to test a hypothesis; its estimation therefore has important implications for study design, study cost, and implementation [2,7].
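The factor of five in the clinic example is consistent with the standard design effect for equal-sized clusters. The following is a short worked sketch under that assumption; the symbols DE (design effect), m (patients per clinic), ρ (ICC), and the two sample-size symbols are notation introduced here for illustration and do not appear in the original text.

```latex
\[
\mathrm{DE} = 1 + (m - 1)\,\rho = 1 + (100 - 1)(0.04) = 4.96 \approx 5,
\qquad
n_{\text{cluster}} = \mathrm{DE} \times n_{\text{individual}} .
\]
```

With unequal cluster sizes the inflation factor would differ somewhat, so the value 4.96 should be read as an approximation for this idealized case.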
Consequently, a power analysis for a cluster randomized trial usually requires a reliable estimate of the magnitude of the ICC. At least two challenges stand in the way. First, existing ICC estimates usually come from small pilot studies that use cluster sampling or from other intervention studies of a similar kind. Unfortunately, ICC values from pilot studies can be rather unstable, and published ICC values from large studies are rarely available. Murray et al. reviewed a large number of cluster randomized trials in terms of study design and data analysis and concluded that the lack of published ICC estimates may lead to inaccurate power calculations and compromise the ability to test the stated hypothesis. This is not a new problem [5,9,10]. Second, even when ICC values are published, their standard errors are often too large for practical use. For example, an ICC estimate of 0.02 with a standard error of 0.01 implies 95 percent confidence bounds of approximately 0.00 and 0.04. For the clinic example above, the sample sizes corresponding to these bounds would range from about one third to about 1.7 times the sample size implied by the estimate of 0.02.
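The arithmetic behind the range of one third to 1.7 can be reproduced with a minimal sketch. It assumes equal clusters of 100 patients, the standard design effect 1 + (m - 1) × ICC, and a normal-approximation interval rounded to two decimals as in the text; the function names `design_effect` and `wald_bounds` are hypothetical helpers introduced only for this illustration.

```python
def design_effect(icc: float, cluster_size: int) -> float:
    """Variance inflation 1 + (m - 1) * ICC, assuming equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def wald_bounds(estimate: float, std_error: float, z: float = 1.96):
    """Normal-approximation 95% bounds, truncated at 0 and rounded to 2 decimals."""
    lower = max(0.0, estimate - z * std_error)
    upper = estimate + z * std_error
    return round(lower, 2), round(upper, 2)


# Values taken from the example in the text (assumed cluster size m = 100).
icc_hat, se, m = 0.02, 0.01, 100

lower, upper = wald_bounds(icc_hat, se)      # bounds reported as 0.00 and 0.04
baseline = design_effect(icc_hat, m)         # inflation at the point estimate

# Required sample size relative to the one implied by the point estimate.
ratio_low = design_effect(lower, m) / baseline
ratio_high = design_effect(upper, m) / baseline

print(f"ICC bounds: {lower:.2f} to {upper:.2f}")
print(f"Relative sample size: {ratio_low:.2f} to {ratio_high:.2f}")
```

The two ratios correspond to the "one third" and "1.7 times" figures quoted above. A different interval method, or unequal cluster sizes, would shift these numbers somewhat.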
This paper has two objectives. First, we report ICC estimates from a national study in the United States, the Randomized Controlled Trial to Prevent Child Violence (Safety Check, NIH-R01HD042260), a multi-level, cluster randomized trial that used the clinical office to deliver the intervention. Second, we show how data collected from repeated measures in cluster randomized trials can be pooled to improve the precision of ICC estimates. Because studies with repeated measures on participants are becoming more common, the use of repeated measurements to estimate the ICC has important implications for the design of future randomized trials and for the analysis of clustered data.