The overall design of the CYDS includes multiple assessments of student outcomes, risk and protective factors, and prevention service system functioning. In addition to the cohort design described in this paper, other design elements of the CYDS include: (a) a cross-sectional design assessing community levels and trends in adolescent substance use, delinquency, risk, and protection using repeated anonymous biennial population-based surveys of 6th-, 8th-, 10th-, and 12th-grade students in both intervention and control communities (Murray et al. 2006); (b) repeated pre-post measures of community-level indicators of prevention service planning and delivery (e.g., prevention collaboration; Brown et al. 2008) obtained from representative samples of community leaders (Arthur, Glaser, and Hawkins 2005); (c) repeated pre-post documentation of community prevention resources and program exposure via surveys with prevention service providers (e.g., teachers, principals, and prevention program directors); (d) annual assessments of CTC functioning via surveys of CTC board members; and (e) ongoing assessment of prevention program implementation fidelity (Fagan et al. 2008).
LONGITUDINAL COHORT DESIGN
The CYDS cohort design provides an assessment, separate from the cross-sectional design, of the efficacy of the CTC prevention system in reducing adolescent substance use and delinquency. The cohort design allows for examination of within-individual change in a panel of students whom the prevention programs implemented in CTC communities were likely to reach, given the CYDS focus on students in Grades 5 through 9. This design component reduces the susceptibility of the overall CYDS design to random heterogeneity due to secular changes in student populations. Secular changes have been identified as potentially limiting the ability to identify intervention effects in community trials (Bauman, Suchindran, and Murray 1999).
The cohort design used in the CYDS calls for annual repeated measurements of a panel of students who were in the fifth grade during the 2003–2004 school year. Wave 1 data collection took place in the spring of 2004 and represents a pre-intervention baseline assessment of students’ substance use, delinquency, levels of risk and protective factors, and demographic characteristics. Following the CTC training and implementation schedule, prevention programs and strategies began to be implemented in intervention communities in the summer of 2004. Wave 2 of data collection was conducted in the spring of 2005 and included an effort to recruit additional students in the cohort who were not recruited in the Wave 1 administration.
Communities in the CYDS were selected from a larger pool of 19 matched pairs, and one matched triad, of communities in seven states (i.e., Colorado, Illinois, Kansas, Maine, Oregon, Utah, and Washington) that participated in a naturalistic study of the diffusion of science-based prevention strategies (Arthur, Glaser, and Hawkins 2005). Communities were matched within state by total population, poverty, racial/ethnic diversity, and unemployment and crime indices. Eligibility criteria for inclusion in the CYDS consisted of: (a) having not selected tested, effective prevention programs to address prioritized community risk factors according to community leaders interviewed in 2001 and (b) securing letters of support from the superintendent of schools, the mayor or city manager, and the lead law enforcement officer in each community agreeing to random assignment of communities and to all ensuing CYDS data collection activities. Consequently, one community from within each of 12 matched pairs of communities that met eligibility requirements was assigned by coin toss to either the CTC intervention or the control (i.e., prevention services as usual) condition.
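The within-pair coin toss can be illustrated with a short sketch. The community labels, the function name `assign_within_pairs`, and the seed are hypothetical and serve only to make the example reproducible; this is not the procedure's actual implementation.

```python
import random


def assign_within_pairs(pairs, seed=None):
    """Assign one community in each matched pair to CTC, the other to control."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        # One fair coin toss per matched pair decides which member receives CTC.
        if rng.random() < 0.5:
            ctc, control = first, second
        else:
            ctc, control = second, first
        assignment[ctc] = "CTC"
        assignment[control] = "control"
    return assignment


# Hypothetical labels for 12 matched pairs (24 communities in total).
pairs = [(f"pair{k:02d}_A", f"pair{k:02d}_B") for k in range(1, 13)]
assignment = assign_within_pairs(pairs, seed=2003)
```

Because randomization is carried out within pairs, each pair contributes exactly one community to each condition, which is what allows the matched pair to be treated as a level of nesting in the analyses.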
The 24 CYDS communities are small and medium-sized incorporated towns with an average population of 14,646 (range = 1,578 to 40,787; U.S. Census Bureau 2001). These towns are geographically distinct communities (i.e., at least 60 miles apart) with clear community boundaries (i.e., they are not suburbs of larger cities) and separate governmental, educational, and law enforcement structures. On average, 89% of the population members are European American (range = 64% to 98%), 3% are African American (range = 0% to 21%), 10% are of Hispanic origin (range = 1% to 65%), 12% are between the ages of 10 and 17 (range = 9% to 16%), and 37% of students are eligible for free or reduced price school lunch (range = 21% to 66%).
All data collection procedures were designed and implemented consistently across all communities. Prevention resources and prevention program exposure in both CTC and control communities were documented by CYDS researchers to monitor for potential experimental contamination. The CYDS includes analyses of community prevention resources and program exposure to assess the degree to which elements, if any, of the CTC system have been implemented in control communities. Although exposure of the control group to elements of the CTC prevention system could decrease the likelihood of observing effects of CTC in this trial, we conduct the CYDS trial under the assumption of noninterference among communities (i.e., the stable unit-treatment value assumption [SUTVA]; Rosenbaum 2007).
Measures of substance use, delinquency, risk and protective factors, and demographic characteristics are obtained from the CYDS Youth Development Survey (YDS) (Social Development Research Group 2005). Patterned after the Communities That Care Youth Survey (Arthur et al. 2007; Glaser et al. 2005; Arthur et al. 2002), the YDS is a self-administered, paper-and-pencil questionnaire designed to be completed in a 50-minute classroom period. The survey includes questions on student demographic characteristics (i.e., age, gender, race/ethnicity, family composition, and parental education); lifetime and 30-day measures of alcohol, marijuana, cigarette, and other drug use; heavy episodic drinking (i.e., five or more drinks in a row); past-year delinquency; and risk and protective factors in community, school, family, and peer/individual domains. Additional items (e.g., more severe forms of delinquency, sexual behavior) were added to the YDS as deemed developmentally appropriate. The accompanying table lists the 28 risk and protective factor scales measured in the YDS, the number of items, and reliability coefficient alphas for each scale based on data from the Wave 1 administration. Items used to measure risk and protective factors were standardized and then averaged within each scale.
Risk and Protective Factors (by Domain), Number of Items, and Reliability Coefficient Alphas
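As an illustration of the scoring rule (items standardized, then averaged within each scale) and of the reliability coefficient reported for each scale, the following minimal sketch uses made-up responses to a hypothetical three-item scale. The function names, the use of the population standard deviation for standardizing, and the absence of missing-data handling are all assumptions of the sketch, not details taken from the YDS scoring procedures.

```python
from statistics import mean, pstdev, pvariance


def standardize(values):
    """Convert one item's responses to z-scores (population SD; an assumption)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def scale_scores(items):
    """Average the standardized items within a scale, one score per student.

    `items` holds one list of responses per item, with students in the same order.
    """
    z_items = [standardize(col) for col in items]
    return [mean(row) for row in zip(*z_items)]


def cronbach_alpha(items):
    """Coefficient alpha computed from raw item responses."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))


# Made-up responses from five students to a hypothetical three-item scale.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 4],
    [1, 3, 3, 4, 4],
]
scores = scale_scores(items)
alpha = cronbach_alpha(items)
```

Standardizing before averaging gives each item equal weight in the scale score regardless of its response range.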
Recruitment for the cohort sample began in the fall of 2003 by mailing information packets and making in-person calls to each school district superintendent and school principal within the 24 CYDS communities, asking for their continued commitment to participate in the study and outlining the requirements of involvement in the coming year. As a result, 28 of 29 school districts, comprising 88 schools, agreed to participate. In participating school districts, a school-based, teacher-coordinated approach was used to secure informed parental consent and student assent for participation in the study. School principals identified a lead teacher to coordinate the distribution of consent materials to teachers of fifth-grade classes. Lead teachers worked closely with classroom teachers to help them distribute the consent forms to students for parental consent, provide instructions to their students, and collect the consent forms in two weeks, indicating whether or not each eligible student’s parents gave informed consent for their child to participate in the study. Lead teachers served as the point of contact for CYDS staff and received a $20 cash incentive for their assistance. To encourage high return rates, teachers of fifth-grade classes were offered $100 for classroom supplies if at least 90% of the eligible students returned their consent forms and $150 if at least 95% of students returned their forms. Each school received an additional $150 for its overall participation in the study. In the one community whose schools declined to participate, a community-based recruitment method was used whereby families with fifth graders were solicited via newspaper advertisements and flyers distributed at child-centered locations.
Recruitment efforts for Wave 1 data collection yielded a return of 92.5% of the consent forms to the schools. Parents of 63.1% (n = 3,682) of eligible students consented to their participation in the study at Wave 1 (community range = 24.7% to 72.9%). To increase the overall participation rate of students across the 24 CYDS communities, a second recruitment effort was initiated in Wave 2. Beginning in the fall of 2004, trained CYDS recruiters were sent into study schools to conduct 5-minute classroom presentations to interest students in the study, answer questions, and directly distribute new consent brochures to eligible students whose families had not previously consented or otherwise had not been recruited in Wave 1. In addition to working with lead teachers, CYDS recruiters directly contacted sixth-grade teachers in Wave 2 to explain the study. A new incentive plan was implemented for Wave 2, setting recruitment goals at the school level instead of at the classroom level. In addition to the $150 given to each participating school, an incentive of $150 was distributed to every participating classroom if at least 85% of eligible sixth-grade students from the entire school returned their consent forms within a 2-week period, and $75 if 85% of eligible students returned their consent forms by the targeted survey date, regardless of whether the form granted or denied permission for the student to participate in the survey. This recruitment effort yielded parental consent for an additional 1,146 sixth-grade students.
Eleven percent (n = 404) of the students consented in Wave 1 were ineligible for participation in Wave 2 because they moved out of the school district (n = 388), did not remain in their grade cohort (i.e., skipped or were held back a grade; n = 4), were in foster care and did not have consent from state authorities to participate (n = 7), or were unable to complete the survey on their own due to severe learning disabilities (n = 5). Excluding ineligible students and including the newly recruited students resulted in a total of 4,420 students whose parents consented to their participation in the study (76.4% of the eligible population). Final consent rates did not differ significantly by intervention condition (i.e., rates were 76.1% for students in intervention communities and 76.7% for students in control communities). Overall, 3,585 students completed a Wave 1 survey, 4,390 students completed a Wave 2 survey, and 4,407 students completed either a Wave 1 or Wave 2 survey.
DATA COLLECTION PROCEDURES
In Wave 1, trained CYDS interviewers read the survey aloud to classrooms of fifth graders who followed along and marked their answers in the survey booklet. Two interviewers were present in each classroom; one to read the survey aloud to the class and the other to assist students individually, as needed. In Wave 2, CYDS interviewers introduced the YDS to classrooms of sixth-grade students who then read and completed the survey on their own. Again, interviewers were available to assist students with questions or special requests as needed. In both waves, make-up sessions with students who needed extra time or required special attention were conducted by CYDS staff. For students who were not surveyed by the time CYDS staff left the community (e.g., continued absence or suspension from school), teacher-proctor survey packets were left with lead teachers with explicit instructions on how to administer the survey.
To ensure confidentiality, no names or other identifying information were included on any of the surveys. Identification numbers were printed on the survey booklets to allow tracking across data collection waves. Students read and signed assent statements indicating that they were fully informed of their rights as research participants and agreed to participate in the study. At the end of the classroom period, CYDS interviewers collected the survey booklets and assent statements from students, separated them, and sealed them in separate secured envelopes for return to the University of Washington. Upon completion of the survey, students received small incentive gifts worth approximately $5 to $8. Wave 1 participants recruited through community-based methods were surveyed in the local library and received $50 cash for their participation.
Strategies for collecting future waves of data from the longitudinal cohort include maintaining continuous locating information on all students in the cohort through contacts with CYDS schools and tracking all cohort students, even if they move to locations other than the original 24 CYDS communities. Students who are not present at the time of school data collection have separate data collection dates scheduled with CYDS interviewers. Student attrition is monitored closely, and ongoing analyses will assess whether rates of attrition differ significantly by intervention condition or by other student and community characteristics and outcomes. Especially valuable in this context will be the use of longitudinal diagnostics suggested by Graham (2009), following Hedeker and Gibbons (1997). Variables found to be related to student attrition will be incorporated in outcome analyses as covariates or in multiple imputation analyses as auxiliary variables (Collins, Schafer, and Kam 2001; Rubin 1987; Schafer 1997).
PLANNED STATISTICAL ANALYSES
In the longitudinal cohort design, the repeatedly measured outcomes (Time, t) are nested within students (i) who, in turn, are nested within communities (j), with communities being nested within matched pairs of communities (k). To address the statistical dependencies that can occur in such nesting, we will rely on the General Linear Mixed Model (McCulloch and Searle 2001; Raudenbush and Bryk 2002) for Gaussian distributed outcomes and the Generalized Linear Mixed Model (Breslow and Clayton 1993; Liang and Zeger 1986) with logit link transformation for Bernoulli distributed outcomes. Three sets of statistical analyses are planned for the cohort sample. First, beginning with the Grade 7 wave of student data collection, we will examine differences in the prevalence rates of substance use and delinquent behaviors using a mixed-model ANCOVA (Murray 1998), controlling for baseline levels of substance use or delinquent behaviors. Second, we will use multilevel discrete-time survival models to assess the efficacy of CTC to prevent the onset of substance use and delinquency during successive waves of data collection. The third set of analyses will employ latent growth models (Laird and Ware 1982; Raudenbush 2001) to examine intervention effects in long-term trajectories of substance use, delinquency, and risk/protective factors. All analytic strategies assess the effects of the intervention at the appropriate unit of randomization (i.e., communities) with appropriate estimates of standard errors and degrees of freedom and allow for regression adjustment of potential covariate effects. Student- and community-level covariates will be added as linear predictors of targeted outcomes to improve the precision of estimated intervention effects (Murray 1998; Schafer and Kang 2008). Although communities in the CYDS were matched into pairs with one community from each matched pair assigned randomly to experimental condition, random assignment does not guarantee that student populations within each community pair will be equivalent with regard to their respective distributions of demographic and individual characteristics; nor does it guarantee that community pairs will remain similar over time with regard to population and economic growth.
Using the Bernoulli-distributed 30-day alcohol use outcome at Grade 7 as an example (i.e., 1 = alcohol use during the previous 30 days, 0 = no alcohol use during the previous 30 days), the Generalized Linear Mixed Model for the mixed-model ANCOVA can be expressed (in multilevel equation format; see Raudenbush and Bryk 2002) as:
- Level 1 (Student i): log[P(G7ALC30ijk = 1)/(1 - P(G7ALC30ijk = 1))]
- Level 2 (Community j)
- Level 3 (Community-matched pair k)
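One way to write out a three-level specification consistent with the description in this section is sketched below. Several details are assumptions rather than the study's stated equations: the coefficient symbols π, β, and γ and their subscripts (chosen so that the coefficient on CTC carries the label γ001 used for the intervention effect), the subscripted random effects u and v, and the treatment of all student-level covariate effects as fixed across communities.

```latex
\begin{aligned}
\text{Level 1 (Student } i\text{):}\quad
& \log\!\left[\frac{P(\mathrm{G7ALC30}_{ijk}=1)}{1-P(\mathrm{G7ALC30}_{ijk}=1)}\right]
  = \pi_{0jk} + \pi_{1jk}\,\mathrm{AGE}_{ijk} + \pi_{2jk}\,\mathrm{SEX}_{ijk}
  + \pi_{3jk}\,\mathrm{WHITE}_{ijk} + \pi_{4jk}\,\mathrm{HISP}_{ijk} \\
& \qquad + \pi_{5jk}\,\mathrm{PARED}_{ijk} + \pi_{6jk}\,\mathrm{RELIG}_{ijk}
  + \pi_{7jk}\,\mathrm{REBEL}_{ijk} + \pi_{8jk}\,\mathrm{G5ALC30}_{ijk} \\[4pt]
\text{Level 2 (Community } j\text{):}\quad
& \pi_{0jk} = \beta_{00k} + \gamma_{001}\,\mathrm{CTC}_{jk} + \gamma_{002}\,\mathrm{POP}_{jk}
  + \gamma_{003}\,\mathrm{PCTFRL}_{jk} + u_{0jk} \\
& \pi_{qjk} = \gamma_{0q0}, \qquad q = 1,\dots,8 \\[4pt]
\text{Level 3 (Matched pair } k\text{):}\quad
& \beta_{00k} = \gamma_{000} + v_{00k} \\[4pt]
& u_{0jk} \sim N(0, \tau_{u}), \qquad v_{00k} \sim N(0, \tau_{v})
\end{aligned}
```

Under this sketch, the community-level intercept is the adjusted log odds of 30-day alcohol use at Grade 7, and the test of γ001 compares intervention and control communities against between-community variation within matched pairs.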
This model statistically controls for student baseline characteristics: age (AGE), gender (SEX), race/ethnicity (White vs. Nonwhite [WHITE] and Hispanic vs. Non-Hispanic [HISP]), parental education (PARED), religious attendance (RELIG), and rebelliousness (REBEL); and includes a baseline measure of the dependent variable (G5ALC30) as an additional adjustment for any potential baseline differences. These covariates were selected on the basis of having putative zero-order linear relationships with targeted outcomes, as suggested by previous research (e.g., Hawkins, Catalano, and Arthur 2002; Johnston et al. 2008). Two characteristics of students’ communities, population size (POP) and percentage of students receiving free or reduced price school lunch (PCTFRL), are included as community-level covariates. In the absence of a priori theory regarding the functional form of these covariates, we will include them as linear predictors of community-level intercepts. We make the assumption of linear additive covariate effects as a matter of convenience, but will consider modeling interactions among covariates should they be warranted.
The intervention effect (γ001) for the community-level dichotomous indicator of intervention status (CTC; 0 = control community, 1 = CTC community) is estimated as the mean difference in adjusted community-level prevalence rates between intervention and control communities as tested against the average variation among the intervention condition-specific adjusted community-level prevalence rates, with degrees of freedom equal to the number of community-matched pairs (12) minus the number of community-level covariates and intervention effect (3), minus one (i.e., df = 12 - 3 - 1 = 8; Murray 1998). We note that the variance for the level 1 random effect is a function of the proportion of students responding affirmatively to the outcome in question and, therefore, is not an estimated parameter; however, random effects for variability in intercepts (the mean log odds of 30-day alcohol use at Grade 7) across communities and community-matched pairs (u and v, respectively) are included.
Multilevel Discrete-time Survival Analyses
Multilevel discrete-time survival analysis (ML-DTSA; Barber et al. 2000; Hedeker, Siddiqui, and Hu 2000; Yau 2001) will be used to assess the effects of the CTC intervention in delaying the onset of alcohol, marijuana, and cigarette use and of delinquency. This model also can be specified as a Generalized Linear Mixed Model (Reardon, Brennan, and Buka 2002); however, as a longitudinal model, an additional level of nesting is incorporated to model the time-specific hazard of initiation as a function of a student’s grade level at the time of the first self-reported occasion of substance use or delinquency (coded 1 = initiated during the time interval, 0 = did not initiate during the time interval). In line with specifications for survival models (Singer and Willett 2003), observations for individuals following the first reported event will be coded as missing data since they are no longer at risk for the indexed event; similarly, students who do not experience the target event before the conclusion of the study period will be treated as right-censored observations. Using first annual use of alcohol (FIRSTALC) as an example, the statistical model for the ML-DTSA is expressed (in multilevel equation format) as:
- Level 1 (Time t): log[P(FIRSTALCtijk = 1)/(1 - P(FIRSTALCtijk = 1))]
- Level 2 (Student i)
- Level 3 (Community j)
- Level 4 (Community-matched pair k)
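A four-level specification consistent with this description might be written as follows. This is a sketch under stated assumptions: Time is entered as a single linear fixed effect (a set of wave indicators would be an equally plausible reading), the student-level covariates are represented generically as X with Q terms, the community-level predictors are taken to be the same as in the mixed-model ANCOVA, and all coefficient and random-effect subscripts (as well as the symbol δ for pair-level intercepts) are illustrative.

```latex
\begin{aligned}
\text{Level 1 (Time } t\text{):}\quad
& \log\!\left[\frac{P(\mathrm{FIRSTALC}_{tijk}=1)}{1-P(\mathrm{FIRSTALC}_{tijk}=1)}\right]
  = \pi_{0ijk} + \pi_{1ijk}\,\mathrm{TIME}_{tijk} \\[4pt]
\text{Level 2 (Student } i\text{):}\quad
& \pi_{0ijk} = \beta_{00jk} + \sum_{q=1}^{Q} \gamma_{0q0}\,X_{qijk} + r_{0ijk},
  \qquad \pi_{1ijk} = \gamma_{100} \\[4pt]
\text{Level 3 (Community } j\text{):}\quad
& \beta_{00jk} = \delta_{00k} + \gamma_{001}\,\mathrm{CTC}_{jk} + \gamma_{002}\,\mathrm{POP}_{jk}
  + \gamma_{003}\,\mathrm{PCTFRL}_{jk} + u_{00jk} \\[4pt]
\text{Level 4 (Matched pair } k\text{):}\quad
& \delta_{00k} = \gamma_{000} + v_{00k}
\end{aligned}
```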
This model is similar to that of the mixed-model ANCOVA except that an additional level of nesting (Time) is introduced with Time (coded “0” for Grade 5, “1” for Grade 6, and so on) modeled as a fixed effect. An additional random effect (r) is included to model the variability in the log odds of alcohol use initiation across students. Random effects u and v are retained to model variation in the log odds of alcohol use initiation across communities and community-matched pairs, respectively. The intervention effect (γ001) is assessed as the mean difference in adjusted community-level hazard rates between intervention and control communities and is tested against the average variation among the intervention condition-specific adjusted community-level hazard rates, with the same number of degrees of freedom as the mixed-model ANCOVA.
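The event coding that underlies the discrete-time survival analysis can be illustrated with a short sketch that builds person-period records and computes the raw (unadjusted) hazard at each wave. The function names and the four example students are hypothetical; waves are indexed from 0 (Grade 5), matching the Time coding described above.

```python
from collections import defaultdict


def person_period(onset_wave, last_wave):
    """Build (time, event) rows for one student.

    onset_wave: wave index of first reported use (0 = Grade 5), or None if the
                student never reported use during the study (right-censored).
    last_wave:  index of the final wave of data collection.
    """
    rows = []
    for t in range(last_wave + 1):
        if onset_wave is not None and t == onset_wave:
            rows.append((t, 1))
            break  # later waves are dropped: the student is no longer at risk
        rows.append((t, 0))
    return rows


def discrete_hazard(rows):
    """Proportion of at-risk students who initiate at each wave."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for t, event in rows:
        at_risk[t] += 1
        events[t] += event
    return {t: events[t] / at_risk[t] for t in sorted(at_risk)}


# Hypothetical students: (onset wave or None, last wave of data collection).
students = {"A": (2, 4), "B": (None, 4), "C": (1, 4), "D": (None, 4)}
all_rows = [row for onset, last in students.values()
            for row in person_period(onset, last)]
hazard = discrete_hazard(all_rows)
```

In the planned analyses, these binary person-period records would serve as the Level 1 outcome of the mixed model rather than being summarized directly as in this sketch.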
Latent Growth (Hierarchical Linear) Models
We will use latent growth models, also known as hierarchical linear models, to examine intervention effects on the change in the frequency of substance use and delinquency, and levels of risk and protective factors, over time. Similar to the ML-DTSA shown above, the latent growth/hierarchical linear model consists of four levels of nesting and explicitly models the outcome as a function of data collection wave. However, unlike the ML-DTSA, no conditionality is imposed on the values of the outcome and, as the dependent variables are considered to be distributed as Gaussian, no logit link transformation is required. Thus, the latent growth/hierarchical linear model, using the frequency of 30-day alcohol use (ALC30) as an example, can be depicted as:
- Level 1 (Time t):
- Level 2 (Student i):
- Level 3 (Community j):
- Level 4 (Community-Matched Pair k):
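A four-level growth specification consistent with the description in this section might be written as follows. This is a sketch under stated assumptions: growth is linear in Time (coded 0 at Grade 5), student-level covariates X (Q terms) and the community-level covariates POP and PCTFRL predict intercepts only, and the subscripts on the random effects e, r, u, and v (and the symbol δ for pair-level terms) are illustrative. The coefficient on CTC in the slope equation is labeled γ101 to match the label used for the intervention effect.

```latex
\begin{aligned}
\text{Level 1 (Time } t\text{):}\quad
& \mathrm{ALC30}_{tijk} = \pi_{0ijk} + \pi_{1ijk}\,\mathrm{TIME}_{tijk} + e_{tijk} \\[4pt]
\text{Level 2 (Student } i\text{):}\quad
& \pi_{0ijk} = \beta_{00jk} + \sum_{q=1}^{Q} \gamma_{0q0}\,X_{qijk} + r_{0ijk} \\
& \pi_{1ijk} = \beta_{10jk} + r_{1ijk} \\[4pt]
\text{Level 3 (Community } j\text{):}\quad
& \beta_{00jk} = \delta_{00k} + \gamma_{001}\,\mathrm{CTC}_{jk} + \gamma_{002}\,\mathrm{POP}_{jk}
  + \gamma_{003}\,\mathrm{PCTFRL}_{jk} + u_{00jk} \\
& \beta_{10jk} = \delta_{10k} + \gamma_{101}\,\mathrm{CTC}_{jk} + u_{10jk} \\[4pt]
\text{Level 4 (Matched pair } k\text{):}\quad
& \delta_{00k} = \gamma_{000} + v_{00k}, \qquad \delta_{10k} = \gamma_{100}
\end{aligned}
```

In this form, γ101 is the difference in average growth rates between CTC and control communities.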
This model includes random effects to capture deviations in intercepts (i.e., predicted levels of 30-day alcohol use at Grade 5) between (a) a student’s observed and predicted levels at each wave of data collection, conditional upon being in a specific community and community-matched pair (etijk); (b) a student’s predicted level and his/her respective community’s predicted level of use (r); (c) each community’s predicted level and the predicted level for that community’s matched pair (u); and (d) observed and predicted levels for each matched pair of communities (v). The model additionally includes random effects to capture variation in growth rates (slopes) among (a) each student’s predicted rate of growth in the frequency of 30-day alcohol use relative to his/her respective community’s average rate of growth (r), and (b) a community’s predicted rate of growth in the frequency of 30-day alcohol use at Grade 5 and the predicted growth rate for that community’s matched pair (u). Random effects are assumed to be Gaussian with a mean of 0 and variance σ², and uncorrelated with model covariates, which are reasonable assumptions given analysis of existing baseline data from the longitudinal cohort and cross-sectional designs (Murray et al. 2006). Variation in random effects is assumed to be homogeneous over time, among students, and between CTC and control communities; however, these a priori assumptions will be assessed during the course of the analyses and violations of model assumptions will be addressed (e.g., by explicitly modeling heteroskedastic random effects). In all latent growth/hierarchical linear models, the intervention effect (γ101) will be estimated as the difference between the intervention condition-specific growth rates and will be tested against the average variation among the intervention condition-specific growth rates across communities, with appropriate degrees of freedom (Murray 1998). To ensure that intervention effects are not unduly influenced by model covariates, we will conduct the analyses by first assessing fully unconditional models (i.e., no predictor variables), then examining models of unadjusted intent-to-treat intervention effects, and finally adding model covariates in fully conditional models with regression adjustment for covariates.