Based on the initial review of early intervention research, internal and external expert input, and the development of the Legacy model, five related study aims were identified:
Aim 1: Document the implementation of the Legacy intervention and evaluate intervention fidelity.
Aim 2: Determine the relationship between self-efficacy and sense of community and positive maternal-child interaction among the Legacy intervention mothers versus the comparison group mothers.
Aim 3: Evaluate the long-term goals of the intervention by examining the developmental outcomes of the children of the Legacy intervention mothers versus the comparison group mothers, in the domains of cognitive, language, socio-emotional, and behavioral development.
Aim 4: Understand how mothers responded to the intervention and which factors affected the quality of intervention each mother received.
Aim 5: Determine the costs associated with the group-based intervention in order to calculate overall costs and, if the intervention yielded significant effects, cost-benefit and cost-effectiveness indices.
These aims were executed in concert with the Legacy Logic Model (Figure ).
Legacy for Children™ logic model.
As stated earlier, two implementation sites were selected based on a competitive award process: University of Miami and University of California at Los Angeles. In order to increase the likelihood of an objective program evaluation, the randomization process and the evaluation data were coordinated by the independent PCC, including collection and processing of all assessment data as well as those process data that were not based on direct observation by the intervention staff.
Randomized controlled clinical trial
The evaluation design selected to test the impact of the Legacy model was a pair of randomized controlled trials, registered as such with ClinicalTrials.gov (Identifier: NCT00164697). Participants randomly assigned to the intervention group received the intervention and contact with intervention staff on a regular basis, as described in the Site implementation section, above. The children in the comparison groups did not receive any core components of the Legacy intervention model but participated in the same schedule of developmental assessments. Families in this “usual care” comparison group were not prevented from utilizing any service that would otherwise be available to them, even if the service was similar to the services received in the intervention arm of the study. It was anticipated that families in the comparison group would receive some community services for which they were eligible, e.g., primary health care services, immunizations, and other early intervention programs. Families in both the comparison and intervention arms of the study were referred for services if the child scored in the risk range on standardized assessments (e.g., 1.5 standard deviations below the mean on IQ scores). Therefore, the comparison group represented a mild intervention group, and tests of intervention effectiveness are conservative.
The intervention and evaluation procedures were pilot tested at each implementation site using the full-length pilot phase with 60 dyads at each site to test recruitment and retention procedures, obtain feedback on intervention content and delivery, and refine and test the multi-phasic assessment battery for process, outcome, and cost data. Although such pilot efforts have been shown to be extremely helpful in launching complicated social science investigations and are recommended as part of intervention development [54], they have not typically been utilized in early intervention research due to limited resources [55]. The pilot study data were not used to assess the effectiveness of the intervention but to refine and adapt the procedures and methods to the participant population.
The cross-site inclusion criteria had several components -- one that defined the target population and two that were logistically necessary. Pilot data were used to examine recruitment procedures and feasibility of recruitment criteria. The first criterion was environmental risk for poor developmental child outcomes, which was operationalized differently by the two sites. Families were selected based on environmental risk rather than medical or biological risk. Therefore, both intervention sites recruited from well-care settings and excluded mothers reporting mental health or substance abuse problems.
The second two criteria were logistical requirements for intervention provision. The first of these was comfort speaking and understanding conversational English, given that the intervention was conducted in English, the assessment instruments were largely available only in English, and it was not feasible at either site to create groups that could be conducted in other languages. The second related to custody and age; because the intervention included attendance by mothers and their children, only mothers who had custody of their children and who were old enough to consent to their own participation (at least 18 years) were included in the study. Mothers who lost custody during the course of the intervention were no longer eligible to participate.
In addition to the aforementioned cross-site Legacy inclusion criteria, each site elected to impose additional site-specific criteria. The original eligibility criteria for the UCLA project further restricted eligibility to women who lived within 10 miles of UCLA, planned to stay in the LA area for 3 years, and received their prenatal and well-baby medical care from the UCLA Medi-Cal Health Maintenance Organization (a public health insurance program that provides needed health care services for low-income individuals). For feasibility reasons related to intervention and assessment provision, mothers who had more than 2 children (including the target child), or who were expecting a multiple birth, if known at the time of recruitment (26 weeks), were not eligible.
The original eligibility criteria for the UM project further restricted eligibility to mothers who lived within a 50-minute drive of at least one of the two community intervention sites, who gave birth at Jackson Memorial Hospital (the main teaching hospital in Miami-Dade County), and who planned to stay in the area for 3 years. Similar to UCLA, and for feasibility reasons, mothers who had multiple births, had more than two other children, or had children older than 4 years were excluded. UM operationalized poverty by including mothers who lived in specific zip codes selected because of low-performing schools and schools with a high percentage of free-lunch eligibility, as well as including only mothers who reported less than 12 years of maternal education.
Modification of main study inclusion criteria as a function of pilot findings
Several important lessons were learned from the pilot study recruitment process. First, the formal screening procedures used to identify maternal mental health and substance use problems were difficult and unreliable to implement in the pilot, so formal screening for these issues was dropped. However, clinic and hospital staff assisted Legacy staff by referring only mothers with no known medical risk factors, such as maternal mental health problems, substance use problems, and high-risk pregnancies. Second, the pilot study recruitment proceeded very slowly, particularly in Miami. The majority of mothers approached at that site could not be included because they were ineligible, rather than not being interested. Specifically, 249 of 316 mothers (79%) approached in Miami for the pilot were ineligible. Only 7 eligible mothers (10%) chose not to participate. In comparison, of 106 LA mothers screened for the pilot, 31 (29%) were ineligible, and 12 eligible mothers (16%) chose not to participate. The low eligibility rate of the Miami mothers, if it continued into the main study, would limit the generalizability of the results. Preliminary analyses also revealed that the mothers included in the Miami pilot were significantly less resourced and were at greater demographic risk than the LA mothers (61% vs. 40% household income < $20,000; 13% vs. 22% married; 71% vs. 43% living in neighborhoods with high unemployment), probably due to differences in how poverty was operationalized at the sites. Therefore, inclusion criteria were broadened and aligned across the two sites for the main study.
Specific changes to the project inclusion criteria included dropping the restriction on number or age of siblings at both sites. The remaining changes were implemented in Miami alone. Specifically, the criterion regarding maternal education and recruiting from zip codes with low performing and low-income schools was dropped for Miami, in favor of recruiting mothers who were eligible for state programs such as Medicaid, food stamps, or TANF. Ultimately, the final UM main study eligibility criteria included mothers who currently lived within 50 minutes of one of the community intervention sites, gave birth at Jackson Memorial Hospital or Jackson North Maternity Center, expected to stay within the catchment area for three years, and had a household income below 200% of the poverty level as indicated by Medicaid, food stamps, or TANF eligibility. In addition, to better align the Miami post-natal participant group with that of LA, the UM main study participants were required to report having had at least one prenatal care visit.
Sample size and power
A key consideration for this intervention study was to determine the sample size required to detect meaningful effects of the intervention. Sample size calculations were conducted before study initiation to guide recruitment, informed by pilot study attrition rates and by the literature and knowledge of the time. During the study planning period, staff did not know how many sites would be funded to implement the Legacy model. Therefore, sample size calculations were conducted to allow for sufficient power to detect meaningful effects at each funded site.
Although the focus of the intervention model is on overall child development, the sample size estimation was based on an age-appropriate cognitive measure at the end of the intervention. The rationale for this follows. First, child cognition is positively correlated with other child development outcomes in the early years of life. Second, it has been shown to be a good predictor of later child outcomes. Third, the majority of early intervention studies have used child cognition as an outcome measure [56]. Fourth, as mentioned earlier, most policy makers place an emphasis on cognitive outcomes. As such, there is more literature available on the expected potential effect size for cognition than for other child development domains.
The key element in the statistical estimation of sample size was specification of the magnitude of the effect size (i.e., the absolute difference in group means divided by the assumed common standard deviation) that we want to be able to detect. Findings from previous parenting education programs have found small effect sizes (< 0.25) with respect to child development outcomes [56]. However, the proposed model varies from traditional parenting education programs in its focus. The focus of this intervention is on the role of parents in the development of their children, and it presumes that parents can successfully address this role independently of their own personal circumstances or external stressors in their lives. Findings from early intervention efficacy randomized controlled trials (RCTs) of children whose parents were from low socioeconomic strata have shown up to a full standard deviation difference in mean cognitive level between the intervention and comparison groups [16].
We chose an effect size of 0.50 for the age-appropriate cognitive measure in this study. Several considerations led to this choice. First, Legacy might be considered less intensive, with reference to the group leader working directly with the child, than some previous early intervention studies. A second consideration is that although previous parenting interventions found small effect sizes, this intervention would differ in intensity from those programs; it would most likely be more intensive with reference to parent–child interaction than previous parenting interventions. A third consideration was the unknown level of noncompliance that would be encountered in the intervention group. In the Infant Health and Development Program, the level of program participation was found to be strongly related to the estimated effect size, with the observed difference in mean Stanford-Binet scores between the intervention subgroup with the lowest participation rate and the comparison group being about 0.25 standard deviations [62]. The level of intensity, with reference to previous early intervention and parenting programs, and low participation in the intervention by intervention families had the potential to affect the observed effect size.
Another important element to take into consideration when calculating sample size is the participant loss rate. The literature indicated a wide range, 7% to 70%, of subject loss rates for various early intervention programs [56]. The 7% loss rate was for a highly structured study where extraordinary measures were taken to minimize subject dropout [63]. It was highly unlikely that similar measures could be employed in this study. Most studies have shown a 30% to 40% loss rate (e.g., 37% [64]). Even with a conservative, anticipated loss rate of 50% by age five, the study would still have a power of 0.86 for detecting an effect size of 0.50 at each site. It should be noted that this sample size (120) refers to the total number of children (intervention and comparison) for whom data on an age-appropriate cognitive measure would be available at the end of the assessment period (age five). The computations assume a one-sided test at the alpha = 0.05 level. A one-sided test was chosen since the available literature on early intervention gives no indication of a detrimental effect on a child’s development.
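The stated power of 0.86 can be reproduced with a standard noncentral-t computation. The sketch below is illustrative rather than the original calculation; it assumes the 120 analyzable children per site are split evenly into two groups of 60:

```python
import math

from scipy import stats


def power_two_sample_t(effect_size, n1, n2, alpha=0.05, one_sided=True):
    """Power of a two-sample t-test, via the noncentral t distribution."""
    df = n1 + n2 - 2
    # Noncentrality parameter for a standardized mean difference (Cohen's d)
    nc = effect_size * math.sqrt(n1 * n2 / (n1 + n2))
    crit = stats.t.ppf(1 - (alpha if one_sided else alpha / 2), df)
    return 1 - stats.nct.cdf(crit, df, nc)


# d = 0.50, 60 children per group, one-sided alpha = 0.05 -> power of roughly 0.86
power = power_two_sample_t(0.50, 60, 60)
```

A two-sided test at the same sample size would yield noticeably lower power, which is part of why the one-sided test mattered for the planned recruitment targets.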
Based on data from the pilot study, we had 50–70% attendance in the mothers’ groups. In order to ensure a practical group size (approximately 7–10 per group), we decided to recruit 15 intervention mothers per group for the main study, resulting in a recruitment ratio of 15 intervention mothers to 10 comparison mothers. Because we experienced approximately an 80% assessment compliance rate, we did not increase the number of comparison mothers. The pilot study also suggested that the rate of fetal loss that should be expected among the prenatally recruited LA participant group was 5%. Therefore, the final main study sample included 300 Miami and 315 LA participants who were randomized in a ratio of 3 intervention : 2 comparison, resulting in 180 intervention and 120 comparison participants in Miami, and 190 intervention and 125 comparison participants in LA. Post-randomization, 9 participants became ineligible due to fetal loss (n = 3) and administrative recruiting errors (n = 6).
Recruitment and randomization
After the main study inclusion criteria were established and sample size calculations were complete, recruitment began. Mothers were recruited at prenatal WIC clinics in LA and at birth hospitals in Miami. Due to differences in the point of recruitment, the enrollment process varied slightly between sites.
In LA, the recruitment process began at a set of prenatal clinics within the catchment zip codes. Mothers were approached either by clinic staff or by recruiters directly. Interested mothers were told about the study using a scripted format and completed an eligibility screener; if eligible, the consent process was initiated at the clinic. The recruiters also secured consent at a later date if interested mothers felt that they could not immediately grant consent for their participation. In Miami, the recruitment process began at the well-baby unit of the recruitment hospitals soon after the mothers delivered their babies. Recruitment was conducted using a two-step process. The recruiters first approached mothers in the well-baby unit, with information about the study delivered using a scripted format. An eligibility screener was completed by all interested mothers. Within two weeks, eligible mothers received a follow-up visit from the recruiter at their homes to complete informed consent and enrollment in the study.
Blinded randomization of the consenting participants was performed within each site (neither the site staff nor the participants knew the group assignment at enrollment) via a centralized, computer-generated process at the PCC. Assignments to either the intervention or comparison group were made on a weekly basis for enrolled participants. After randomization, the intervention site teams received assignments from the PCC and communicated them to the mothers.
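A centralized weekly assignment at the stated 3:2 ratio is typically implemented as blocked randomization. The function below is a hypothetical sketch (the PCC's actual algorithm is not described in the text): each block of five enrollees receives exactly three intervention and two comparison slots in random order.

```python
import random


def weekly_assignments(participant_ids, n_intervention=3, n_comparison=2, seed=None):
    """Assign enrollees to study arms in blocks at a fixed 3:2 ratio.

    Illustrative sketch only; the interface and blocking scheme are assumptions,
    not the published PCC procedure.
    """
    rng = random.Random(seed)
    block = ["intervention"] * n_intervention + ["comparison"] * n_comparison
    assignments = {}
    for i, pid in enumerate(participant_ids):
        if i % len(block) == 0:  # starting a new block: reshuffle the slot order
            labels = block[:]
            rng.shuffle(labels)
        assignments[pid] = labels[i % len(block)]
    return assignments


# 15 enrollees in one week -> exactly 9 intervention and 6 comparison assignments
groups = weekly_assignments(range(15), seed=42)
```

Blocking keeps the realized allocation close to 3:2 even when weekly enrollment counts vary, which matters when group sizes of 7–10 mothers must be assembled from a 15-per-group recruitment target.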
To protect the rights of the research participants, mothers were asked for consent not only at initial enrollment but again annually before each assessment visit. Separate consents were obtained for additional qualitative data collection on subgroups, including focus groups and case study interviews. Human subjects reviews were conducted by the Institutional Review Boards at CDC, RTI, UCLA, and UM, and by Western IRB between 2005 and 2008, when UM contracted with them to conduct human subjects protection reviews.
The final participant characteristics for the 574 mothers who completed the baseline assessment appear in Table . Statistical comparisons of each sociodemographic characteristic across the two sites were conducted to identify site differences. T-tests were used to contrast the continuous sociodemographic items (e.g., maternal age), and chi-square statistics were used to compare distributions across categorical demographic variables. The sites differed significantly on a number of maternal characteristics, including age, education, marital status, cohabitation, race/ethnicity, and employment. The site samples were similar on factors that reflected the recruitment criteria, including household income and the proportion of mothers speaking English in the home. The two sites were also similar in the proportion of male children, the proportion of mothers living in public housing, and indicators of the mother’s household composition.
Baseline socio-demographics for mothers in the Miami and LA samples and the combined sample, by randomization group
Statistical comparisons of intervention and comparison groups were conducted within sites to identify any group differences that may have remained after group randomization. As depicted in Table , randomization of eligible participants to each of the two randomization groups within each site resulted in equivalence of groups across each measured sociodemographic characteristic.
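The site and randomization-group comparisons described above (t-tests for continuous characteristics, chi-square tests for categorical ones) can be sketched with SciPy. The data below are fabricated stand-ins for the real baseline file; only the test machinery is meant literally.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical baseline data: maternal age (continuous) at the two sites.
age_miami = rng.normal(22.5, 4.5, 280)
age_la = rng.normal(24.0, 5.0, 294)

# Two-sample t-test on a continuous sociodemographic item
t_stat, p_age = stats.ttest_ind(age_miami, age_la)

# Chi-square test on a site-by-marital-status contingency table of counts
table = np.array([[36, 244],   # Miami: married, not married (hypothetical)
                  [65, 229]])  # LA
chi2, p_marital, dof, expected = stats.chi2_contingency(table)
```

The same calls, run within each site with randomization group as the grouping variable, produce the post-randomization equivalence checks reported in the second table.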
Retention of subjects
Activities intended to maximize retention were implemented in both the intervention and assessment settings. Once a mother was randomized to the intervention group, she received varying amounts of the intervention, depending on her own compliance. Mothers who ceased intervention participation were encouraged to rejoin the intervention at any time. Mothers who could not complete an assessment were invited for all subsequent assessments unless the mother refused participation. Whenever possible, an exit interview was completed for families electing to permanently disengage from the study before its completion. There were four reasons for permanently terminating the affiliation of a mother with the Legacy study. Enrolled participant mothers were dropped from the study if they 1) permanently refused participation, 2) permanently moved out of the catchment area, 3) were abusive within the group setting (e.g., threatened another participant), or 4) in the case of the mother’s or child’s death.
A number of specific study-related activities focused on maximizing retention in both the intervention and assessment settings. Our retention efforts began with an investment in continuous participant tracking. Each site employed staff members whose primary responsibility was to track and maintain contact with the participants. Tracking information included residential addresses, telephone numbers, alternate contact names and contact information, and preferred method of contact. These data were maintained electronically so that they could be viewed real-time by staff at both the intervention and assessment offices.
Incentives were also used to compensate participants for their time and efforts as well as to encourage study participation (Table ). Although families in the intervention group were in more frequent contact with the Legacy staff than those in the comparison group (weekly vs. three times per year), efforts were made to minimize differential loss. Local staff kept in touch with the comparison families between assessment points through phone calls and mailings. Shortly after an assessment took place, written feedback was mailed to families. If the results of an assessment indicated a need for further services, mothers were provided with information on available services and referrals to outside early intervention services were facilitated.
Incentives used in the Legacy for Children™ study
Another key factor to facilitate mothers’ participation in the assessment and intervention component was overcoming common barriers to attendance. Both sites provided van transportation to assessment and intervention activities to any mother who wanted it. To maximize mothers’ ability to attend group meetings, van scheduling accommodated parents to the extent that was possible. In Miami, the decision was made to conduct meetings during the week. Initially, pilot Miami groups were conducted during the day but, as mothers returned to full-time employment, evening groups were added. In LA, all group meetings were initially conducted on the weekends, but experience with the pilot study showed that some weekday groups were necessary.
When a family moved out of the catchment area, the sites attempted to maintain communication with the mother so that future assessments could still be conducted. Mothers who returned to the area were invited back to participate in intervention and/or assessments during the course of the study. Reasons for non-retention (i.e., nonparticipation in any of the assessments) were documented on an individual basis.
Analytic sample size and statistical power
Retention and sample size
Table reports the sample sizes for Legacy in terms of the number of completed assessments through age 5, with a full description of retention and participant loss by randomization group depicted in Figure . Although the Legacy impact assessment has been extended beyond age 5 and continues into school-age, the following discussion is constrained to the original impact assessment, conducted through age 5.
Sample size for Legacy for Children™, by site, year, and intervention status
CONSORT flowchart for the Legacy for Children™ project.
The study began with a total of 338 intervention and 236 comparison baseline assessments, which represents 94% of the 361 mothers randomized to intervention and 96% of 245 mothers randomized to comparison. Baseline sample sizes were similar for the two sites. The mean assessment window for each assessment wave was within four to six weeks of the target date, from baseline to five years. Within each site, annual retention rates for each assessment wave were comparable across the two randomization groups. The sample sizes for each annual assessment are presented in Table and culminate in Year 5 assessment rates of 64% across the two sites.
Attendance at group intervention sessions varied by site. In Miami, 90% of mothers attended at least one group session during the course of intervention provision. In LA, uptake was lower, with 78% of mothers attending at least one group session over the three years that intervention was delivered at the site.
In order to address the research questions of interest, three main categories of data were collected: a) process data on the intervention implementation and the mothers’ responses to the intervention; b) assessment data on mediating, moderating, and outcome variables; and c) cost data collected separately for intervention and research components. To develop and test the measurement protocol, complete data were collected for all participants in the pilot study, including process, outcome, and cost data.
The process measures served several purposes. The first purpose was to determine the quality and fidelity of the intervention as implemented in each site for each of the key components included in the intervention design (e.g., parent groups, one-on-one visits, sense of community building efforts). In addition, the process measures provided feedback for formative purposes and ongoing assessment of program performance. This feedback was used to assist the intervention team in targeting training and technical assistance activities over the period of the study. Further, these data were used by the sites to monitor ongoing operations, in order to reduce their data burden and maintain investment in the study. Finally, the process measures contained qualitative information as a measure of intermediate outcomes (e.g., persistence of families in the program, engagement of participants) and key program components to assist in interpreting quantitative outcomes.
In short, the process measures were collected to evaluate whether the intervention conforms to the hypothesized model, whether the intervention was working and how it could be improved, the extent to which the intervention group actively participated in the program, and the causes and consequences of participant attrition. Thus, the process measures were used for both formative and summative evaluation.
The process data consisted of time-based measures dependent on the mother’s random assignment to the intervention or comparison group. The process data were compiled from four data sources: (1) direct observation, (2) program record data, (3) data from program providers and (4) data from program participants.
Direct observation measures
A portion of all intervention sessions was directly observed by ethnographers. They followed selected groups at each of the sites throughout the Legacy project, but also observed other groups intermittently to document the responses of different groups to similar intervention content. The ethnographers described the Legacy meetings in detailed notes, which were later coded thematically by trained, reliable coders.
Program record data
Program record data consisted of information about attendance and assessment rates, one-on-one visits, and information about contacts between Legacy staff and mothers. Forms were used to record information regarding one-on-one visits as well as individual contacts.
Data from program providers
Data were collected using interviews and questionnaires about how the group leaders perceived their success with the intervention as well as participants’ responses both on a group and individual level. At the completion of each group session, a parent group summary form was completed, documenting whether the session was completed as planned, as well as how mothers responded to the intervention and each other. A parent engagement form was completed every 10 weeks by the group leader for each mother. In addition to information regarding how many sessions each participant attended, the group leaders were asked for information regarding the perceived interest and participation of each mother in the main topics covered, the Legacy group, and their child. Also, the group leader rated the quality of the mother’s observed interaction with her child. These data are being used to calculate the dosage of the intervention received and to provide cross-validation of information provided by the mothers on the quantitative assessments.
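The dosage calculation mentioned above reduces, in its simplest form, to the share of offered sessions each mother actually attended. A minimal sketch, with hypothetical record fields standing in for the program record data:

```python
# Hypothetical attendance records: (mother_id, sessions_attended, sessions_offered)
records = [
    ("M001", 38, 52),
    ("M002", 11, 52),
    ("M003", 0, 52),
]


def dosage(attended, offered):
    """Fraction of offered group sessions actually attended by a mother."""
    return attended / offered if offered else 0.0


dosages = {mother_id: round(dosage(a, o), 2) for mother_id, a, o in records}
```

In practice the study's dosage measure could weight one-on-one visits and other contacts as well; this sketch covers only group-session attendance.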
Data from program participants
These data were collected through interviews, questionnaires, and focus groups. Focus groups of 10–12 randomly chosen participants were conducted annually by an evaluator from the PCC to explore the mothers’ perceptions and responses to the intervention curriculum, group leaders, and other participants. For each site, five mothers were selected for in-depth case studies and participated in annual interviews by an evaluator. The focus groups and case studies served to generate narratives about overall project experiences and gave respondents a chance to tell their story.
In addition, parent satisfaction surveys were conducted systematically for a subsample of participants, selecting those who actively participated (i.e., attended at least one session in the 6 months prior to the annual assessment time point). This interview solicited information about the mothers’ satisfaction with Legacy, the group leader, and the activities they may have taken part in. In addition, the interview solicited information regarding parental self-efficacy and relationships with other parents in the group.
The data gathered from the pilot study on process measures were used to refine the instruments in an iterative process. For example, when mothers in focus groups, as well as in some of the group sessions, said that they felt being in Legacy was a unique experience, unlike any other group setting, a question was added to the satisfaction survey to learn whether this perception was shared by the majority of mothers.
Demographic, mediating, and outcome measures
The quantitative assessments were selected to measure the three constructs of interest: child development, parent/family characteristics, and parental sense of community. These constructs will be used to determine the degree to which the Legacy intervention influences child development and to analyze the paths of those impacts. The outcome domains for child development include cognitive development, language development, socio-emotional functioning, behavioral functioning, and health. In addition, data were collected on other parent and family domains that were expected to be important as either control variables or as mediators. These consisted of demographics and family background, self-efficacy, parental responsibility, parental investment, devotion of time and energy, parenting behavior (including guidance of behavioral and emotional regulation and facilitation of cognitive development), and quality of the mother-child relationship. The parental sense of community domains included support/resources/memberships, community/peer relations, satisfaction of needs (life and self), and psychological sense of community.
Assessments were administered at baseline, when the child reached 6 and 12 months, and then annually until the child reached 5 years of age. The baseline assessment was designed to measure maternal characteristics, attitudes, and knowledge before the beginning of the Legacy intervention. The intervention began prenatally in LA and postnatally in Miami; thus, the baseline was administered prenatally (before intervention attendance was permitted) in LA and when the infants were 4–8 weeks old in Miami.
These assessments were conducted by assessment staff from the PCC who were blind to the intervention assignment. As an additional precaution, assessments took place in a lab that was in a separate building from the intervention, and contact between assessment and intervention staff was limited.
The assessment batteries consisted of questionnaires, direct assessments, and some videotaped observations of mother and child. Most of the measures were standardized and had been previously used in similar research projects, but some questionnaires were developed or adapted for Legacy. The majority of the measures were administered in person using a computer-assisted personal interviewing procedure; however, some sensitive questions, e.g., regarding alcohol and drug use, were completed by the mothers alone, using an audio-enhanced computer-assisted self-completed interviewing process. For a complete list of measures by time point, see Table .
Domains, constructs, and measures by assessment time point in the Legacy for Children™ Study
In addition to the baseline and follow-up assessments, a family update interview (FUI) was administered by the tracking staff every 6 months from child age 9 through 57 months. These interviews, conducted by telephone or in person, included questions concerning contact information, household composition, child care, employment, and services received.
Development of the assessment plan
The approach to selecting child and family measures was based on relevance to the intervention goals and the specific research questions. The assessments covered all child outcome domains and maternal mediating domains, as well as demographics and family background variables. Whenever possible, measures were selected that had standardization samples and norms matching the current sample, as well as documented reliability and validity with an internal consistency reliability (alpha coefficient) of at least .70. Whenever feasible, measures were selected that were appropriate to the mothers’ expected reading and comprehension levels and their cultural backgrounds.
Developmental change is rapid from birth to 5 years, the age range covered in the Legacy study. Therefore, many measures of child outcome focused on a relatively narrow age range, and to measure a particular outcome at different ages, different measures had to be selected for some domains (e.g., the Brief Infant-Toddler Social & Emotional Assessment [65] for younger ages and the Devereux Early Childhood Assessment [66] for older ages to assess social/emotional development). Additionally, to accommodate the possibility that some children might have developmental lags, particular attention was paid to the floors (and ceilings) of the selected measures. A major concern was the burden of assessment on the participants; where possible, we chose shorter assessments and selected the most feasible measures in terms of time and materials.
Even with these selection criteria, several measures that were tested with the pilot participant group were dropped or replaced for the main study. One major cause for changes in methods was that, although mothers were asked whether they planned to teach English to their child, many bilingual mothers in fact had children who were either bilingual or monolingual non-English speaking (primarily Spanish), particularly at the earlier ages. Therefore, measures were selected that were suitable for bilingual children and, when possible, allowed for administration in Spanish. Specifically, the Kaufman Assessment Battery for Children [67] was selected to measure cognitive development because it can be administered bilingually. The Spanish version of the Preschool Language Scales [68] was also included for those children deemed dominant in Spanish (as determined by maternal interview on the child’s language exposure and preference).
Other measures turned out to be too challenging for the participant group. For example, the child version of the Violence Exposure measure [69] was too complicated for many children and was dropped. The direct child assessments of cognition and language were alternated, with language assessed at 2 and 4 years of age and cognition at 3 and 5 years of age, and the Bracken Basic Concept Scale [70] was dropped because the testing time was too long for the children.
A major issue in evaluating the effectiveness of any early intervention model is understanding the cost of the intervention. Detailed cost measures regarding each component of the study, including intervention, assessment, and operational costs, were collected to document costs and to support cost-benefit and/or cost-effectiveness analyses.
Detailed records of expenditures were kept for each resource category (labor, materials, equipment, buildings and facilities, and miscellaneous), as well as records of donated items. These records were supplemented by staff cost and activity diaries: Legacy staff members who routinely performed three or more Legacy activities were required to complete time and activity diaries, each detailing specific activities and the amount of time spent on each. These diaries were collected semi-annually.
As described in the measurement section, Legacy gathered a rich dataset that included quantitative assessments of mothers and their children, qualitative ethnographic and quantitative process data from the intervention sessions, and cost data related to the administration and evaluation of the intervention as a whole. Throughout the course of the intervention, these data were described and compared across groups by intervention site through the use of classic intent-to-treat (ITT) analytic methods. The general analytic plan is described below.
Descriptive statistics will be used with Legacy data for at least three purposes: 1) to compare the mothers who consented to participate in Legacy to the entire screened sample; 2) to examine the initial comparability of the intervention and comparison groups; and 3) to provide a rich description of the Legacy mothers and their children at each time point. For group comparisons, parametric statistics will be used where appropriate, and non-parametric statistics will be used where the assumptions needed for parametric tests are not met. For continuous variables, t-tests will be used to examine group differences; for categorical data, chi-square statistics will be calculated. For preliminary descriptive group comparisons, simulations, the Bonferroni correction method, or a false discovery rate approach will be used to account for multiple comparisons within an assessment domain.
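The two multiplicity adjustments mentioned above work quite differently, which a minimal sketch can make concrete. The p-values below are hypothetical, chosen only to illustrate the mechanics of the Bonferroni and Benjamini-Hochberg (false discovery rate) rules; they are not Legacy results.

```python
def bonferroni(pvals, alpha=0.05):
    """Bonferroni: reject hypothesis i only if p_i <= alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg FDR: find the largest rank k such that
    p_(k) <= (k / m) * alpha, then reject every p at or below p_(k)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0.0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            cutoff = pvals[i]
    return [p <= cutoff for p in pvals]

# Hypothetical p-values from four tests within one assessment domain
pvals = [0.001, 0.02, 0.03, 0.20]
print(bonferroni(pvals))          # conservative: only the smallest survives
print(benjamini_hochberg(pvals))  # FDR rejects more at the same alpha
```

At the same nominal alpha, the FDR approach typically retains more rejections than Bonferroni, which is why it is attractive when several related outcomes are tested within a domain.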
Analyses of outcome and mediating factors
In general, all intervention effects will be investigated first using site- and time-point-stratified models within an ITT approach. Because of important implementation differences between sites, site-pooled models, with site interactions included, will be considered carefully and used sparingly. The primary outcomes will first be examined using t-tests and chi-square analyses to investigate individual outcomes as a function of Legacy group assignment. Multivariate statistical modeling will be used to address changes over time as well as the effects of mediating and moderating variables. Given the longitudinal nature of the data, trend analyses will be employed to test for linear and curvilinear relationships in the repeated measures over time. If significant group differences are detected in maternal and child mediators and outcomes, multivariate linear and logistic regression techniques will be employed to identify effect modification and to control for significant confounders. All regression modeling procedures will follow the general conceptual path of influence described in the Legacy Logic Model (Figure ). Multiple ANOVA or regression models that account for correlation of outcomes within subjects will be employed to model the collective influence of intervention randomization on multiple measures of a single outcome domain (e.g., cognitive/language outcomes). Because the amount of missing data is anticipated to increase over time, multiple imputation and nonresponse weighting will be used as alternative means of conducting ITT analyses within a general PROC MIXED regression framework.
A strength of the Legacy evaluation is a design in which many mediators and outcome variables are measured at multiple time points, allowing a mixed model approach. In the first stage, growth curves are fit for individuals; depending on the number of measurements, these curves may be linear, quadratic, etc. In the second stage, predictors such as intervention or comparison group assignment, demographic characteristics, and other variables will be added and evaluated for their ability to predict the slope and intercept parameters.
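The two-stage logic above can be sketched in miniature. This is an illustrative simplification, not the planned PROC MIXED analysis: it fits a linear growth curve per child by ordinary least squares (stage 1), then compares average growth rates between groups (stage 2). All ages and scores are invented for the example.

```python
def fit_line(times, values):
    """Stage 1: ordinary least squares fit of value = intercept + slope * time."""
    n = len(times)
    mt, mv = sum(times) / n, sum(values) / n
    sxx = sum((t - mt) ** 2 for t in times)
    sxy = sum((t - mt) * (v - mv) for t, v in zip(times, values))
    slope = sxy / sxx
    return mv - slope * mt, slope

def group_mean_slope(children):
    """Stage 2: average the individual growth rates within a group.
    `children` is a list of (times, scores) pairs, one per child."""
    slopes = [fit_line(t, y)[1] for t, y in children]
    return sum(slopes) / len(slopes)

# Hypothetical assessment ages (years) and scores for two children per group
intervention = [([1, 2, 3, 4], [80, 86, 91, 97]), ([1, 2, 3, 4], [78, 83, 89, 95])]
comparison = [([1, 2, 3, 4], [80, 83, 86, 88]), ([1, 2, 3, 4], [79, 81, 84, 87])]
print(group_mean_slope(intervention) - group_mean_slope(comparison))
```

A full mixed model estimates both stages jointly, borrowing strength across children and handling unbalanced measurement occasions, but the estimand, group differences in slope and intercept, is the same.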
Further, structural equation modeling will be used, as appropriate, to estimate multiple and interrelated dependence relationships, to represent unobserved concepts in those relationships, and to account for measurement error in the estimation process. This type of analysis will be used to fit measurement and structural models based on the Legacy Logic Model. Models will incorporate multiple paths and mediating relations; in short, the models will represent paths through which the Legacy intervention might have an effect. In addition, multi-group models will allow us to compare sites, demographic groups, and the randomization groups for equivalency. A variety of indices will be used to evaluate these models.
Analysis of process data
Participation frequency in the intervention groups serves as a measure of intervention dosage and is expected to be related to the impact of the intervention. Participation will be examined both as a continuous and as a categorical measure, with intervention mothers categorized into low-, medium-, high-, and non-participant groups. However, because there are likely to be confounding factors, such as demographics and maternal characteristics, that affect the likelihood of participation, propensity scores corresponding to predicted attendance will be calculated for all mothers in the Legacy program. These scores will be used in dose–response analyses of primary outcomes to match intervention mothers with comparison mothers predisposed to the same level of participation (had they been randomized to intervention), balancing factors related to mothers’ ability to participate. In addition, qualitative process data related to retention and attrition will be examined.
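The matching step of such a propensity-score analysis can be illustrated with a minimal sketch. The scores below are hypothetical; in practice they would come from a regression of predicted attendance on the demographic and maternal covariates described above, and the matching algorithm here (greedy 1:1 nearest neighbor, without replacement) is only one of several possibilities.

```python
def greedy_match(intervention, comparison):
    """Greedy 1:1 nearest-neighbor matching on propensity score, without
    replacement. Each entry is an (id, propensity_score) pair."""
    available = list(comparison)
    pairs = {}
    # Match the hardest cases (lowest predicted attendance) first
    for iid, score in sorted(intervention, key=lambda x: x[1]):
        best = min(available, key=lambda c: abs(c[1] - score))
        pairs[iid] = best[0]
        available.remove(best)
    return pairs

# Hypothetical mothers: two intervention vs. a pool of three comparison
matches = greedy_match([("I1", 0.20), ("I2", 0.80)],
                       [("C1", 0.75), ("C2", 0.25), ("C3", 0.50)])
print(matches)  # I1 pairs with C2, I2 with C1
```

Comparing outcomes within matched pairs approximates a dose-response contrast in which both groups share the same predisposition to attend.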
Ethnographic field notes made up a significant portion of Legacy’s qualitative data. A first purpose of the process data throughout the intervention was to provide feedback to the intervention sites on maternal responses to the intervention. For this analysis, the ethnographers reviewed their notes for major themes and observations, which were then compiled into monthly reports and shared with the intervention sites. Organized around Legacy’s major themes, the reports provided a narrative around the development of a sense of community among participants, parental self-efficacy, program response, and group dynamics.
For summative purposes, all ethnographic field notes were then themed and coded by trained coders from the PCC. During the pilot phase, the coding system was revised to accommodate new categories and themes suggested by additional analysis of the data and field notes. Other forms of secondary analysis include case studies of individual participants (following the progress of selected participants over time), group case studies (following specific groups over time), and analyses of participant engagement.
Analysis of cost data
Cost analyses will include both simple descriptive analyses and cost-effectiveness analyses assessing the cost per unit of outcome achieved. To date, cost estimates have been generated by site, by type of intervention research activity (e.g., parent group meetings, home visits, and transportation), and at the family level. If participation in Legacy is associated with improved outcomes among intervention participants, key outcomes will be identified to assess the intervention costs for unit increases in those outcomes. Cost-effectiveness analysis is limited, however, in that multiple outcomes cannot be combined into a single measure of effectiveness. To overcome this limitation, estimates of cost savings drawn from the literature may be summed across the outcome changes observed during and after the Legacy intervention. These estimated cost savings (benefits) could then be compared to the costs of intervening to help assess whether the benefits of Legacy justify the costs.
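The two summary quantities described above, the cost per unit of outcome gained and the benefit-cost comparison built from literature-derived savings, reduce to simple arithmetic. The figures below are hypothetical placeholders, not Legacy cost estimates.

```python
def icer(cost_intervention, cost_comparison, effect_intervention, effect_comparison):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of outcome."""
    return (cost_intervention - cost_comparison) / (effect_intervention - effect_comparison)

def net_benefit(total_cost, estimated_savings):
    """Benefit-cost summary: literature-derived savings minus intervention cost."""
    return sum(estimated_savings) - total_cost

# Hypothetical per-family figures (illustrative only):
# $5000 vs. $2000 per family, yielding a 10- vs. 4-point outcome gain
print(icer(5000.0, 2000.0, 10.0, 4.0))  # dollars per one-unit outcome gain
# $5000 cost against three literature-derived savings estimates
print(net_benefit(5000.0, [1500.0, 2200.0, 1800.0]))
```

A positive net benefit, or an incremental ratio below a chosen willingness-to-pay threshold, would suggest that the benefits justify the costs.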