The extremes of birth weight and preterm birth are known to result in a host of adverse outcomes, yet studies to date largely have used cross-sectional designs and variable-centered methods to understand long-term sequelae. Growth mixture modeling (GMM) that utilizes an integrated person- and variable-centered approach was applied to identify latent classes of achievement from a cohort of school-age children born at varying birth weights. GMM analyses revealed two latent achievement classes for calculation, problem-solving, and decoding abilities. The classes differed substantively and persistently in proficiency and in growth trajectories. Birth weight was a robust predictor of class membership for the two mathematics achievement outcomes and a marginal predictor of class membership for decoding. Neither visuospatial-motor skills nor environmental risk at study entry added to class prediction for any of the achievement skills. Among children born preterm, neonatal medical variables predicted class membership uniquely beyond birth weight. More generally, GMM is useful in revealing coherence in the developmental patterns of academic achievement in children of varying weight at birth, and is well suited to investigations of sources of heterogeneity.
Modern advances in perinatal care (e.g., assisted ventilation techniques for neonates and maternal antenatal steroid medications) have led to the survival of increasing numbers of children who are born very preterm (< 32 weeks gestational age) and/or with very low birth weight (VLBW, <1500 g, 3 lbs. 5 oz). The increased survival of prematurely born infants has been most dramatic in children born at the lower extreme of birth weight (Hack & Fanaroff, 1999). Children with VLBW have higher rates of cognitive and academic deficits, behavior problems, and neurosensory and other health disorders than do term-born children of normal birth weights (Taylor, Klein & Hack, 2000). These adverse sequelae are more common and more severe in children at the lower extremes of birth weight or gestational age (Klebanov, Brooks-Gunn, & McCormick, 1994; Hack, Klein, & Taylor, 1996). Although modern neonatal care has contributed to a lower rate of intraventricular hemorrhage (Wilson-Costello et al., 2007), increases in survival have not been accompanied by decreases in the rates of other major neonatal and postnatal medical complications. In fact, any benefit of improved neonatal care has been offset by the survival of higher risk, lower birth weight infants (Anderson & Doyle, 2003; Taylor, Klein, Drotar, Schluchter, & Hack, 2006).
Neuropsychological methods detect more subtle cognitive weaknesses or “hidden” handicaps that accompany these early complications. Even early in life, the adverse developmental effects of VLBW are evident, including lower overall mental and motor skills, reduced visual recognition memory, and poorer language, executive, and attentional skills (Espy et al., 2002; Goyen, Lui, & Woods, 1998; Landry, Smith, Miller-Loncar, & Swank, 1997; Sullivan & McGrath, 2003). A substantial body of literature documents neuropsychological deficits during the preschool years and early primary school that persist throughout the school-age years (Anderson et al., 2004; Botting, Powls, Cooke, & Marlow, 1998; Friske & White, 1994; Luoma, Herrgard, Martinkainen, & Ahonen, 1998; Msall, Buck, Rogers & Catanzaro, 1992; Saigal, 2000; Taylor, Minich, Bangert, Filipek, & Hack, 2004a; Volke & Meyer, 1999). Deficits in non-verbal skills, perceptual-motor abilities, executive control, and attention cannot be attributed entirely to overall mental deficiency or neurosensory handicaps (Anderson et al., 2004; Hack et al., 1992; Taylor, Klein, Minich, & Hack, 2000). Furthermore, these children have more learning and social-behavioral problems than term-born children (Bhutta, Cleves, Casey, Cradock & Anand, 2002; Hille et al., 2001; Klein, Hack & Breslau, 1989), including lower adaptive behavior skills and social competence (Saigal, Pinelli, Hoult, Kim & Boyle, 2003); more internalizing and externalizing symptomatology (Botting et al., 1997; Breslau & Chilcoat, 2000; Szatmari, Saigal, Rosenbaum, Campbell & King, 1990; Whitaker et al., 1997); and higher rates of learning disabilities. Mathematics disabilities are particularly prominent (Anderson & Doyle, 2003; Espy et al., 2004; Litt, Taylor, Klein, & Hack, 2005; Taylor, Hack, Klein, & Schatschneider, 1995).
Children with VLBW also have lower levels of academic achievement and higher rates of grade repetition and special education than term-born controls (Klebanov, Brooks-Gunn & McCormick, 1994; Saigal, 2000; Taylor et al., 2000).
Neurobiological risks, such as the degree of low birth weight, abnormalities on neonatal cranial ultrasounds, chronic lung disease, septicemia, and composite biological risk indices, account for substantial variability in outcome (Hack et al., 1992; Hack, Wilson-Costello, Friedman, Taylor, Schluchter, & Fanaroff, 2000; Koller, Lawson, Rose, Wallace & McCarton, 1997; Landry, Fletcher, Denson & Chapieski, 1993; Liaw & Brooks-Gunn, 1993; McGrath & Sullivan, 2002; Taylor, Klein, Schatschneider, & Hack, 1998; Taylor et al., 2006). Adverse outcomes are observed even in children without major neonatal complications, making it challenging to identify those at risk (Espy et al., 2002; Taylor et al., 2000; Taylor et al., 2006). Environmental risks, including sociodemographic characteristics and financial disadvantage, as well as more "proximal" family influences such as family functioning, negative life events, and maternal psychological distress, also are related to outcome (Bendersky & Lewis, 1995; Breslau, 1995; Breslau & Chilcoat, 2000; Taylor et al., 1998; 2006).
Adding to these complexities are the differences in outcome that unfold across development. Because cross-sectional designs are used most often in outcome studies, it is difficult to determine whether the pattern of observed weaknesses is stable over time, resolves with age, or worsens with development as the brain areas most compromised by early insult become more engaged for skill acquisition and maintenance. In some studies, a relatively stable pattern of weaknesses has been observed across school age into adolescence (Breslau, Chilcoat, Susser, Matte, Liang & Peterson, 2001; Powls, Botting, Cooke, & Marlow, 1995; Rickards, Ryan, & Kitchen, 1988). Other results suggest an exacerbation of impairment into adolescence (Botting, Powls, Cooke & Marlow, 1998; Cohen, Beckwith, Parmelee, Sigman, Asarnow & Espinosa, 1996; O’Callaghan et al., 1996; Saigal, Hoult, Streiner, Stoskopf, & Rosenbaum, 2000; Taylor, Klein, Minich, & Hack, 2000; Zelkowitz, Papageorgiou, Zelazo, & Weiss, 1995). Finally, in one study, initial reductions on a test of vocabulary in young children with VLBW relative to term-born peers diminished across development, with the two groups obtaining similar scores by age 8 years (Ment et al., 2003).
Although weaknesses in visuo-spatial-motor abilities, attention, executive control, memory, and academic achievement are commonly identified in studies of children born very early and at VLBW, these investigations have used “variable-centered” approaches (Muthen & Muthen, 2000), where the goal is to relate pre-established risk factors to outcomes of interest. Studies exemplifying this approach are ones that examine neuropsychological proficiencies as a function of birth weight, neonatal complications, and environmental disadvantage. In contrast, “person-centered” approaches such as growth mixture modeling (GMM), although also incorporating the variable-centered approach, use cluster or latent class analyses. These approaches address questions on relations among individuals, where the interest is to subgroup persons with similar outcomes and understand how subgroups differ from one another. Person-centered approaches are particularly well suited to study of outcomes of VLBW because individuals classified according to birth weight differ substantially in other risk factors, such as perinatal complications. This approach may also be ideal for teasing apart developmental trajectories associated with low base rate phenomena, such as the specific medical conditions that can accompany VLBW.
Fortunately, recent statistical advances have resulted in the application of both variable- and person-centered techniques to modeling of growth for longitudinally collected data. Conventional growth modeling (CGM) is a variable-centered approach that can be conducted using structural equation models (SEM), mixed linear models, or hierarchical linear models (HLM), which are known generally as multi-level models (Hedeker & Gibbons, 1994; McCulloch & Searle, 2001; Muthen, 2004; Raudenbush, 2001; Raudenbush & Bryk, 2002; Singer & Willett, 2003; Skrondal & Rabe-Hesketh, 2004). Although CGM is accomplished somewhat differently in SEM and HLM methods, the results are identical when the same growth parameters are modeled. These approaches now are applied routinely to address developmental questions, including changes in brain responses to auditory stimuli across early development (Espy, Molfese, Molfese, & Modglin, 2004) and variation in longitudinal neuropsychological outcomes in children with VLBW (Taylor, Klein, Minich & Hack, 2004).
The major objective of the present study was to apply person-centered growth modeling techniques to better understand individual variation in the development of academic skills in children with VLBW. Learning difficulties at school age increase risks for long-term problems in behavior adjustment and limited educational and vocational attainments (Ewing-Cobbs et al., 2004; Klebanov, Brooks-Gunn, & McCormick, 1994). Identifying different patterns of growth in academic achievement during the school-age years and risk factors associated with these patterns is thus critical to determine which children may need the most extensive early interventions to promote skill development. Previous studies have documented persistent deficits in academic skills (Hack, 2005), but it is unclear if these deficits are stable over time or if the gap in skills between children with VLBW and their term-born peers widens with age. Several longitudinal studies suggest relatively constant deficits in academic achievement among VLBW cohorts across the school-age years (Breslau, Paneth, & Lucia, 2004; Schneider, Wolke, Schlagmuller, & Meyer, 2004), while others suggest that these deficits may become more pronounced with age (Saigal et al., 2000; Taylor, Klein, Minich, & Hack, 2000). It is also important to investigate factors other than VLBW that may contribute to risks for poor achievement and to determine whether growth in achievement is affected more adversely in some children than in others.
In previous reports on outcomes for the sample of children followed in the present study, Taylor and colleagues (Taylor et al., 1995; Taylor et al., 2000) assessed achievement in two groups of children varying in the degree of VLBW (<750 g and 750–1499 g) and term-born controls at the mean ages of 7 and 11 years. Results revealed lower scores for the <750 g group compared with term controls on reading and mathematics at both follow-up assessments. Taylor et al. (2000) also observed that the <750 g group made less positive gains in reading across the two assessments than the term group. However, changes in achievement across subsequent follow-ups at later ages were not examined in these earlier reports. Multiple assessments of achievement as part of this larger study provided a unique opportunity to examine longer-term achievement outcomes into adolescence. Application of both variable- and person-centered GMM methods (Muthen & Muthen, 2000) also enabled better characterization of differences in growth of academic skills with age and the correlates of these individual differences.
Specific aims were to 1) empirically characterize the developmental trajectories of differing academic skills across childhood into adolescence, and 2) determine whether birth weight, non-verbal neuropsychological abilities and environmental risk measured at early school age predicted different patterns of academic proficiencies. Based on past literature, we anticipated that birth weight would be a robust predictor of class membership, and that non-verbal neuropsychological skills and environmental risk in early childhood would contribute substantively to the prediction of class membership (Taylor et al., 1995). Finally, we hypothesized that among children with VLBW, neonatal medical conditions would predict class membership beyond birth weight alone (Taylor et al., 1998).
The total sample consisted of 196 children, 67 children born at term (>36 weeks) of normal birth weight (> 2500 g) and 129 children born preterm and VLBW at < 1499 g. Most of the children were recruited into the study at early school age (Hack et al., 1996), with a few additional children recruited at the second follow-up assessment to maximize sample size for study of developmental change (Taylor et al., 2004). Because there are fewer children born at the lowest end of the birth weight spectrum, children <750 g (n = 64) were the sampled “target” participants, representing 93% of the survivors in this range of birth weight born from July 1, 1982 through December 31, 1986 in the 6-county region surrounding Cleveland, Ohio. Children in the 750–1499 g (n = 65) and term-born groups were individually matched with a <750 g child based on birth date (within 3 months), race, and either the same hospital of birth (750–1499 g children) or same school (term children).
Table 1 presents the sample background and perinatal characteristics. The proportions of females and males were approximately equal across the term group and the higher (750–1499 g) and lower (< 750 g) birth weight VLBW groups, χ2 (2, N=196) =0.066, p > .97. The number of children of minority race also did not differ by birth weight group, χ2 (2, N=196) = 0.092, p > .95. Finally, birth weight groups did not differ in socioeconomic status (SES), F (2, 175) = 0.25, p > .78, calculated using the Hollingshead index (Hollingshead, 1957) composite of parent education and occupation (reverse scored so that a higher score reflected higher SES) and then normalized within the sample. Not surprisingly, the two VLBW groups differed in length of hospitalization, F (1,126) = 43.28, p < .001, days on the ventilator, F (1,126) = 32.10, p < .001, and number of children with chronic lung disease, χ2 (1, N = 128) =19.14, p < .001, septicemia χ2 (1, N =128) = 5.16, p = .024, and apnea χ2 (1, N = 129) = 5.44, p = .023. Further sample information is provided in Taylor, Minich, Klein, and Hack (2004b).
A sequential panel design (Mehta & West, 2000; Muthén, Khoo, Francis, & Boscardin, 2003; Nesselroade & Baltes, 1979; Tonry, Ohlin & Farrington, 1991) was used, where participants were enrolled in early elementary school, around age 7 years (M = 6.87 years, SD = 0.93; Range 5.31 – 9.34 years). Children were assessed approximately 4 years after the initial visit and annually for 4 subsequent assessments. Given the variability in age at study entry, the respective follow-up intervals spanned from ages 10 through 16 years, shown in Table 2. Age-related change in children’s academic achievement scores was modeled because the interest was in growth across development, not change between visits. Because variability in the age at entry was large relative to the other ages, the difference in the actual age at enrollment from age 7 was used as a covariate for this assessment only. Because of this sampling, subsequent assessment schedule, and some attrition (sample retention = 92%), there were a different number of assessments at each age period, resulting in an unbalanced design. At each age, however, the mean ages and sample sizes were approximately equal across the three birth weight groups (see Table 2).
Three subtests from the Woodcock-Johnson Psycho-Educational Battery-Revised (WJ-R) Tests of Achievement (Woodcock & Johnson, 1989) were administered to measure academic achievement outcome. Calculation requires the examinee to perform mathematical operations that vary in difficulty. In Applied Problems, examinees analyze and solve practical mathematics problems. Letter-Word Identification assesses reading decoding by requiring examinees to orally read a list of single words of increasing difficulty. These subtests were chosen for their demonstrated high reliability and validity and because the resultant W scores are Rasch model-derived values that represent equal-interval measurement both within and across individuals, where any given difference along the scale has the same implication for performance at any level or age, a desirable property for growth modeling. For simplicity, we refer to scores on Calculation, Applied Problems, and Letter-Word Identification as Calculation, Problem-solving, and Decoding, respectively.
The central predictor was birth weight (BWT), a continuous variable measured in grams. At the initial evaluation at study entry, children were administered a neuropsychological battery that included the tetrad short-form of the Kaufman Assessment Battery for Children (K-ABC, Kaufman & Applegate, 1988), as well as tests of picture naming, verbal short-term memory and verbal comprehension, perceptual-motor skills, and attention and executive function (Taylor, Hack, Klein, & Schatschneider, 1995). A principal axis factor analysis with varimax rotation was conducted on age-standardized scores, as described in Taylor, Burant, Holding, Klein, and Hack (2002), to reduce the number of predictors and to identify distinct cognitive constructs. Tests with low primary loadings or high cross-loadings were excluded after the initial analysis. The final factor analysis yielded two factors accounting for 63% of the variance in scores. Factor 1 had an eigenvalue of 4.46 and explained 50% of the variance in scores, and Factor 2 had an eigenvalue of 1.21 and explained 13% of the variance in scores. Tests loading on Factor 1, referred to as the visuospatial/perceptual-motor factor (VSPM), included the Developmental Test of Visual-Motor Integration (Beery, 1989), the short-form of the Test of Motor Proficiency (Bruininks & Bruininks-Oseretsky, 1978), the Purdue Pegboard (Gardner, 1979), the Computerized Test of Attention (Murphy-Berman & Wright, 1987) and Triangles and Matrix Analogies subtests of the K-ABC. Tests loading on Factor 2, referred to as the verbal memory factor, included the Pseudoword Repetition Test (Taylor, Lean, & Schwartz, 1989), Recalling Sentences subtest of the Clinical Evaluation of Language Fundamentals-Revised (Semel, Wiig, Secord, & Sabers, 1987), and Word Order subtest of the K-ABC. Factor composites were computed by averaging the age-adjusted standard scores of the constituent tests.
Because VSPM and the verbal memory factor were correlated significantly (ρ = 0.32, p < .001) and VSPM accounted for the greatest variance, only VSPM was retained as a predictor of class membership in the subsequent analyses.
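As a quick arithmetic check on the variance figures reported above: when factors are extracted from a correlation matrix, the share of total variance attributed to a factor is conventionally its eigenvalue divided by the number of variables analyzed. The sketch below assumes nine retained tests (the six VSPM tests and three verbal memory tests listed above); that count is inferred from the test lists rather than stated explicitly, and the helper name is hypothetical.

```python
# Illustrative check only: the number of retained tests (9) is inferred
# from the test lists in the text, and the helper name is hypothetical.

def variance_share(eigenvalue: float, n_variables: int) -> float:
    """Proportion of total standardized variance tied to one factor."""
    return eigenvalue / n_variables

N_TESTS = 9  # 6 VSPM tests + 3 verbal memory tests (assumed count)
share_f1 = variance_share(4.46, N_TESTS)
share_f2 = variance_share(1.21, N_TESTS)

print(f"Factor 1: {share_f1:.0%}, Factor 2: {share_f2:.0%}, "
      f"total: {share_f1 + share_f2:.0%}")
```

Under that assumption the shares round to 50% and 13%, for a total of 63%, which agrees with the reported values.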
To assess the contribution of the child’s social environment to class membership, the Life Stressors and Social Resources Inventory-Adult Form (LISRES-A; Moos & Moos, 1994) administered at study entry was used as an index of proximal life stressors and social resources (Taylor et al., 2004b). A summary score of environmental risk (ER) was created from the mean of the T-scores for six stressor scales (health, work, spouse, extended family, friends, and negative life events). Because the daily care for a preterm child can contribute to perceived family stress, items pertinent to the child were removed in computing the summary score for all participants. The ER score was used as the predictor of class membership, where higher scores reflected more stressful environments. There was no difference in the ER score among the birth weight groups, F (2, 193) = 0.68, p = .51.
Several neonatal medical variables were selected as predictors of class membership: two continuously distributed variables, Length of Hospitalization (in days) and Days of Ventilation, and four categorical variables, Apnea, Chronic Lung Disease, Jaundice, and Necrotizing Enterocolitis (coded 1 for children who experienced the medical condition and 0 for those who did not).
Like the conventional HLM and SEM approaches to growth modeling, GMM can be used to examine a mean growth trajectory and individual variation within a population, considering both the person-centered and variable-centered approaches (Muthen & Muthen, 2000). Unlike conventional growth models, however, GMM utilizes a “mix” of latent continuous and categorical variables, and is used to identify meaningful subpopulations within the larger population to examine the mean trajectories and individual variation across and within the subpopulations. Importantly, these subpopulations are not known a priori, but rather are determined empirically, termed “latent classes.” Individuals are assigned to subpopulations or latent classes based on their posterior probabilities using multinomial logistic regression (Muthén & Shedden, 1999). Latent class analysis (Nagin, 1999; Nagin & Tremblay, 2001) is similar to GMM in terms of identifying the latent classes and modeling the growth trajectory across classes, but GMM has the added advantage of allowing for within-class variation of variable-centered methods (Nagin & Tremblay, 2005; Muthén, 2006). GMM is the “second generation” of SEM-parameterized growth models, which fully incorporates the multilevel approach to understand nested, individual variation (Muthén, 2001a, 2001b, 2002). GMM has the same advantages as CGM in accommodating missing data, and thus can be applied to unbalanced designs.
GMM starts from conventional growth models to identify the growth functions (e.g., linear or quadratic). In GMM, the CGM assumption that all participants are drawn from a single population with common population parameters (e.g., intercepts, slopes, or acceleration) then is relaxed. GMM uses latent categorical variables to allow for the parameter variation within and across unobserved latent classes. Figure 1 depicts a general diagram of the GMM as applied to our data. Double-arrowed curved lines represent growth factor covariance and single-arrowed lines represent estimated path values. Each growth factor in the respective circles has indicators y1 to y8 representing reading or math scores at the 8 ages (7, 10, 11, 12, 13, 14, 15, and 16 years), respectively. Residuals are represented by ε1–8 in squares, where Age7 in the square represents the age covariate for the initial assessment y1. The three predictors of interest BWT, VSPM, and ER, are shown in the lower left box. The categorical latent growth trajectory variable c is below the black line, representing the unobservable latent “class” of children, who are determined empirically to represent a coherent subgroup based on the pattern of variation in their growth trajectory.
Using CGM, unconditional models with only the growth coefficients (linear and quadratic), no predictors or covariates, and no latent classes were run first for Calculation, Problem-solving, and Decoding across age. In these and all subsequent models, age 13 was chosen as the centering point for greater measurement precision, as the majority (72%) of participants had completed assessments at this age. By setting the intercept at the age of 13, the intercept is the estimated subtest performance at age 13 and the variance estimate also reflects the value at age 13. Because a quadratic model best fit the data, the slope parameter is the estimated increase in achievement per unit of age at age 13. Finally, the quadratic parameter is the estimated change in slope across the observation period, and identifies whether growth is “convex (troughed)” or “concave (peaked)”. The positive sign of the quadratic parameter indicates accelerating growth and the negative sign indicates deceleration. For achievement outcomes, a “concave” pattern with a peak is expected as children’s skills grow towards a maximum value. Using calculus, the age at which the quadratic function reaches its peak can be calculated by [13- ½(αs/αq)] where αs and αq represent the estimates for slope and quadratic parameters, although as with any polynomial, the timing of the peak may not fall within the range of the data. In general, these CGM models examine the latent growth factors and growth trajectory shape drawn from a single population, where individual trajectories were allowed to vary around a single population mean trajectory.
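The peak-age expression above follows from setting the derivative of the age-13-centered quadratic, αs + 2αq(age − 13), to zero. A minimal sketch of that calculation is given here; the parameter values are hypothetical and are not estimates from this study.

```python
# Sketch of the peak-age calculation for a quadratic growth curve centered
# at age 13. Parameter values below are hypothetical, not study estimates.

CENTER_AGE = 13.0

def predicted_score(age, intercept, slope, quad):
    """Model-implied score: a0 + as*(age - 13) + aq*(age - 13)**2."""
    t = age - CENTER_AGE
    return intercept + slope * t + quad * t ** 2

def peak_age(slope, quad):
    """Age at which the curve's derivative is zero: 13 - 0.5 * (as / aq)."""
    if quad == 0:
        raise ValueError("A linear trajectory has no turning point.")
    return CENTER_AGE - 0.5 * (slope / quad)

# Hypothetical concave trajectory: positive slope at 13, negative curvature.
a0, a_s, a_q = 500.0, 6.0, -1.2
age_star = peak_age(a_s, a_q)
print(age_star, predicted_score(age_star, a0, a_s, a_q))
```

With a negative quadratic term the turning point is a maximum; with a positive term the same formula returns the age of the trough. Either may fall outside the observed age range of 7 to 16 years.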
In GMM, the heterogeneity of population growth trajectories is captured by the latent categorical variable c with K classes, where the continuous latent growth variables for individuals in the kth class are related to c and the observed predictors x. Here, the mixture models contain three growth factor means in the kth class (intercept, slope, acceleration), and the 3 fixed-effect coefficients (BWT, VSPM, and ER) of xi on the 3 latent growth factors. A multinomial logistic regression model is applied in GMM to describe the relation between predictors and latent trajectory classes, where the probability of being in the kth class for child i is conditional on the predictors.
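To make the class-membership model concrete, the sketch below computes multinomial-logit class probabilities for a single child from a set of predictors. It is only an illustration of the general form: the function name, the standardized predictor values, and all coefficients are hypothetical, and the last class is treated as the reference class with coefficients fixed at zero (a common identification convention, assumed here).

```python
import math

def class_probabilities(x, intercepts, coefs):
    """Multinomial-logit class probabilities for one child.

    x          : predictor values, e.g. [BWT, VSPM, ER] (standardized here)
    intercepts : one intercept per class; the reference class uses 0
    coefs      : one coefficient list per class; reference class is all 0
    """
    logits = [b0 + sum(b * xi for b, xi in zip(bs, x))
              for b0, bs in zip(intercepts, coefs)]
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 2-class example; coefficients are made up for illustration.
x_child = [-1.0, 0.2, 0.5]                # standardized BWT, VSPM, ER
intercepts = [-0.5, 0.0]                  # first class, reference class
coefs = [[-1.2, -0.3, 0.1], [0.0, 0.0, 0.0]]
probs = class_probabilities(x_child, intercepts, coefs)
print([round(p, 3) for p in probs])
```

In this made-up example a negative BWT coefficient for the first class means that lower birth weight raises the probability of membership in that class relative to the reference class.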
In this study, all models were estimated by maximum likelihood using the expectation maximization algorithm (Dempster, Laird & Rubin, 1977; Muthén & Shedden, 1999; Muthén & Muthén, 2001, 2006). This method is appropriate for analysis of data that are collected under conditions of “planned missingness” and are considered missing at random (Graham, Taylor, & Cumsille, 2001; Little & Rubin, 2002; Schafer & Graham, 2002), as was the case for this study. The maximum likelihood estimator with robust standard errors and χ2 likelihood ratio test accommodates missing data consistent with the missing at random assumption, as well as non-normal and non-independent outcomes (Yuan & Bentler, 2000). Local maxima are encountered often in GMM, especially with an increasing number of latent classes (Muthén, 2004; Muthén & Muthén, 2006). For K ≥ 2, this study used 100 – 10,000 random sets of starting values at the initial stage and 5–20 optimizations at the final stage to avoid local maxima.
To determine the CGM trajectory shape, the χ2 test, based on maximum log-likelihood ratio (MLR) and scaling correction factors (Satorra, 2000), was used to compare relative fit among models that included linear and quadratic growth functions, respectively. The selected latent class model then was adopted in subsequent models examining the effects of predictors of class membership. The a priori models were tested by holding parameters invariant across classes, fixing or freeing parameters within and across classes, and then adding predictors. The χ2 likelihood ratio test is not appropriate for comparing models with different numbers of classes. An integrated approach was adopted here, where the number of latent classes was determined by the overall evaluation of four criteria: (a) Bayesian Information Criterion (BIC; a smaller value indicates better fit), (b) entropy, (c) bootstrap likelihood ratio test (BLRT) and (d) graphs of estimated class mean trajectories with and without covariates. BIC (Schwarz, 1978) was selected because, among the common information criteria, it best identifies the correct number of classes in GMM (Nylund, Asparouhov & Muthén, 2006). Entropy (Ek) was used to measure the classification quality based on participants’ posterior class membership probabilities (Nagin, 1999; Ramaswamy et al., 1993), where entropy values closer to 1 indicate clear classification. BLRT uses bootstrap samples to estimate the distribution of the log likelihood difference test statistic and was regarded as a powerful indication of the correct number of classes (Nylund, Asparouhov & Muthén, 2006). A low p-value of BLRT (e.g., p < .05) indicates the rejection of k -1 classes in favor of k classes. Because the prediction of class membership is a key feature of GMM that permits testing empirically derived hypotheses, Muthén (2004) has recommended that predictors be included in models to help determine the number of classes.
Therefore, graphs of class mean trajectories were plotted with and without predictors to examine whether the latent classes were substantive. For example, the kth class mean trajectory may be so close to the (k -1)th class as to render the distinction between the two latent classes a trivial one.
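Two of the criteria described above, BIC and entropy, have simple closed forms. The sketch below uses the standard definitions, BIC = −2 log L + p ln(n) and the normalized entropy 1 − Σi Σk (−pik ln pik) / (n ln K); the function names and all numeric inputs are hypothetical and are not values from this study.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; smaller values indicate better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def relative_entropy(posteriors):
    """Normalized entropy in [0, 1]; values near 1 mean clear classification.

    posteriors: one row per participant, one posterior probability per class.
    """
    n, k = len(posteriors), len(posteriors[0])
    if k < 2:
        raise ValueError("Entropy is defined for two or more classes.")
    uncertainty = -sum(p * math.log(p)
                       for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n * math.log(k))

# Hypothetical inputs (not study values).
sharp = [[0.99, 0.01], [0.02, 0.98], [0.97, 0.03]]
fuzzy = [[0.60, 0.40], [0.45, 0.55], [0.50, 0.50]]
print(relative_entropy(sharp), relative_entropy(fuzzy))
print(bic(-3480.0, 12, 196), bic(-3470.0, 17, 196))
```

Posterior probabilities close to 0 or 1 push the entropy toward 1, whereas probabilities near 1/K push it toward 0. The BIC penalty grows with the number of free parameters, so a model with more classes must improve the log likelihood enough to offset it.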
The longitudinal achievement data were analyzed to address our objectives through the following steps: (1) establish the trajectory shape, (2) identify the latent classes, and (3) examine the prediction of class membership from the predictors of particular interest. To establish the trajectory shape, unconditional linear and quadratic functions were each fit to the data using conventional growth models. The χ2 difference test based on MLR and scaling factors indicated a quadratic growth curve model best fit the data for Calculation, χ2 (4) = 55.54, p < .001, Problem-solving, χ2 (4) =322.30, p < .001, and Decoding, χ2 (4) = 337.46, p < .001. The quadratic term was significant and negative in sign, indicating that with advancing age, the rate of linear growth was progressively smaller for the three achievement scores: Calculation αq = −1.18, p < .001, Problem-solving αq = −0.76, p < .001, Decoding αq = −1.28, p < .001. Figure 2 displays these quadratic growth curves for the entire sample, and Table 3 contains the growth statistics.
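For readers who wish to reproduce this kind of comparison, the sketch below shows one commonly used log-likelihood form of the scaled χ2 difference test, together with a closed-form upper-tail probability for even degrees of freedom. The function names are hypothetical, the scaled-difference formula is assumed to match the procedure cited above rather than confirmed from it, and only the three reported test statistics are taken from the text.

```python
import math

def scaled_lr_difference(ll_nested, ll_full, c_nested, c_full,
                         p_nested, p_full):
    """Scaled difference statistic from log likelihoods and scaling factors.

    c_* are scaling correction factors and p_* are numbers of free
    parameters for the nested (e.g., linear) and full (e.g., quadratic) models.
    """
    cd = (p_nested * c_nested - p_full * c_full) / (p_nested - p_full)
    return -2.0 * (ll_nested - ll_full) / cd

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df."""
    if df <= 0 or df % 2:
        raise ValueError("This closed form requires a positive even df.")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

# p-values for the reported quadratic-vs-linear tests, each with df = 4.
for name, stat in [("Calculation", 55.54), ("Problem-solving", 322.30),
                   ("Decoding", 337.46)]:
    print(name, chi2_sf_even_df(stat, 4))
```

All three tail probabilities fall far below .001, consistent with the reported results.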
For GMM, the unconditional mixture models were fit to the achievement data by assuming that the three continuous latent growth coefficients were invariant within classes (Muthen & Muthen, 2007; Nagin, 1999). Because the variances of intercepts and residuals differed across classes based on the χ2 likelihood ratio tests for the model of the same classes, these terms were allowed subsequently to vary within classes. As initial exploratory modeling showed that the GMM with more than 3 classes (k > 3) contributed trivially to class identification, the models with k > 3 were not investigated further and only models with 2- and 3-classes were studied in comparison to the 1-class conventional growth model.
To compare the models with different numbers of classes, the integrated criteria were applied initially for each of the achievement models without the inclusion of any predictors, shown in Table 3. For the comparison of 1-class CGM and 2-class models, the smaller BIC and significant p-value of the BLRT test indicated that the 2-class mixture model better fit the data (see Table 3). The contrast of 2- and 3-class unconditional models showed some support for selection of the 3-class model for Calculation. The BIC was smaller for the 3-class model than for the 2-class model (BIC = 6997.87 vs. 7018.31). The LMR likelihood ratio test for the 3-class model also indicated better fit, and the entropy value showed just slightly better classification quality (Ek = .94 vs. Ek = .93). However, examining the plots of estimated mean trajectories raised concern about the utility of the 3-class model for Calculation, as the two higher mean class trajectories in the 3-class model overlapped nearly entirely, indicating that there were not substantive differences in the Calculation trajectories between classes. The 2-class model was thus adopted for Calculation based on the overall evaluation of the application of the four criteria. For the Problem-solving models, all criteria indicated the 2-class solution was preferred. For Decoding, entropy and BLRT suggested a 3-class solution, whereas BIC and the plots of the estimated mean trajectories pointed to a 2-class model as best-fitting.
To confirm the selection of the 2-class model, the predictors then were included in the respective achievement GMMs. With the inclusion of BWT, VSPM, and ER, all fit criteria indicated that the 2-class solution better described the observed Calculation, Problem-solving, and Decoding achievement data, also depicted in Table 3. The BIC was smaller for the conditional 2-class model (BIC = 6291.26) than for the 3-class model (BIC = 6350.53). The BLRT also favored the 2-class model, and the entropy value of the conditional 2-class model (Ek = 0.85) was substantially larger than that for the conditional 3-class model (Ek = 0.76). Table 4 shows the average probabilities of being in a class given a 2-class solution. In support of this model, values on the diagonal were close to 1 and those on the off-diagonal were low and close to 0 (Muthen & Muthen, 2007). As expected based on the unconditional results, the conditional 2-class Calculation model fit better than the conditional 1-class CGM. A final consideration in selecting the 2-class model for Calculation was that the plots for the two higher achieving classes were closely adjacent to each other in the 3-class conditional model. Because these same criteria indicated a better fit of the 2-class model for Problem-solving and Decoding, the 2-class model also was retained for these achievement outcomes.
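The average class probability table described above can be built directly from participants' posterior probabilities: each participant is assigned to the most likely (modal) class, and the posterior probabilities are then averaged within each assigned class. The sketch below illustrates this with made-up posterior probabilities; the function name and data are hypothetical.

```python
def average_posterior_table(posteriors):
    """Rows: most likely (modal) class; columns: mean posterior per class."""
    k = len(posteriors[0])
    sums = [[0.0] * k for _ in range(k)]
    counts = [0] * k
    for row in posteriors:
        modal = max(range(k), key=lambda c: row[c])  # modal class assignment
        counts[modal] += 1
        for c in range(k):
            sums[modal][c] += row[c]
    # Empty classes yield NaN rather than raising a division error.
    return [[s / counts[r] if counts[r] else float("nan") for s in sums[r]]
            for r in range(k)]

# Hypothetical posterior probabilities for three participants, two classes.
example = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
table = average_posterior_table(example)
for row in table:
    print([round(v, 2) for v in row])
```

Diagonal entries near 1 and off-diagonal entries near 0 indicate that participants are assigned to their classes with little ambiguity.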
Figure 2 displays the expected mean trajectory for Calculation, Problem-solving, and Decoding for the 2-class model. The mean trajectories of the two identified latent classes (termed “Average” and “Low”) are shown separately. The growth parameter estimates for these two classes for Calculation, Problem-solving, and Decoding are shown in Table 5. For Calculation, the Average class scored more than 60 W score points higher (intercept α0) than the Low class at age 13 years, and its linear change rate (slope αs) also was faster at this age. Acceleration (αq) for both classes was negative in sign, meaning that the rate of linear change was progressively slowing across age, with a larger magnitude of deceleration for the Average class than for the Low class. The estimated peak of the developmental trajectory for the Average class was at age 15.6 years compared with an estimated peak at 17.4 years for the Low class, although the gap in achievement between Low and Average classes persisted across age, as evident in Figure 2. Furthermore, there was significant variation in the Calculation intercepts at age 13 for both the Average and Low classes, as well as significant variation in slopes at age 13 for the Average class. Because variation in acceleration was non-significant for both classes, the quadratic parameters were fixed to 0. Participants in the Average class performed better on Calculation than the Low class at enrollment, and this performance difference between classes was evident at all ages. The Average class, though, showed faster skill growth early in the developmental period and greater deceleration across age, which resulted in obtaining the maximal level of Calculation achievement at a younger age relative to those in the Low class.
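The relation between the slope, the acceleration, and the estimated peak age can be made concrete with a short sketch. It assumes a quadratic trajectory centered at age 13 of the form W(age) = α0 + αs(age − 13) + αq(age − 13)²; whether the fitted models scale the quadratic term this way is not stated here, and the parameter values below are hypothetical rather than the Table 5 estimates.

```python
def growth_rate(age, slope, accel, center=13.0):
    """Instantaneous rate of change of the assumed quadratic trajectory."""
    return slope + 2.0 * accel * (age - center)


def peak_age(slope, accel, center=13.0):
    """Age at which the trajectory reaches its maximum (growth rate is zero)."""
    if accel >= 0:
        raise ValueError("a peak exists only when acceleration is negative")
    return center - slope / (2.0 * accel)


# Hypothetical parameters: faster slope and stronger deceleration in one class.
avg_peak = peak_age(slope=6.0, accel=-1.2)
low_peak = peak_age(slope=5.5, accel=-0.625)
early_rate = growth_rate(7.0, slope=6.0, accel=-1.2)
print(avg_peak, low_peak, early_rate)
```

Under this parameterization, a class with a faster slope at age 13 can still peak earlier if its deceleration is sufficiently larger in magnitude, which is the pattern reported for the Average class.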
The plots and growth parameter estimates for Problem-solving were similar to those for Calculation, with a higher intercept and faster linear rate of change at age 13 for the Average class compared to the Low class. Because growth deceleration also was faster for the Average class, the estimated maximal Problem-solving score was at age 16.1 years for the Average class and at 17.0 years for the Low class. The pattern of individual differences in variances of the growth parameters for the two latent classes was identical to that for Calculation. Overall, for Problem-solving, participants in the Average class performed better than those in the Low class across age. Similar to Calculation, children identified in the Average class showed faster skill growth early in the developmental period, although the magnitude of the difference in early growth between Average and Low classes was smaller than for Calculation (estimated Calculation growth at age 7 was 22.11 for the Average class and 14.37 for the Low class; for Problem-solving, the respective values for these classes were 14.55 and 11.02). With greater growth deceleration, the Average class obtained the maximal Problem-solving achievement at a younger age compared to the Low class.
For Decoding, the two latent classes differed in all three growth parameters. The Average class was estimated to score more than 50 W points higher than the Low class at age 13 years. In contrast to the two mathematics scores, the linear rate of change for Decoding at age 13 was higher for the Low relative to the Average class. Similar to the two mathematics scores, both latent classes showed deceleration in growth, with a faster rate of deceleration for the Average class. The Average class was estimated to reach its maximal Decoding score at 15.4 years, whereas the Low class was estimated to reach its peak at 17.8 years, again with a persistent gap in achievement scores noted across the observation period. The variance of the intercept for the Low class was relatively larger than that for the Average class, whereas the variances for the slope and quadratic parameters did not differ from 0 and thus were fixed. Similar to the two mathematics outcomes, participants in the Average class performed better on Decoding than those in the Low class across age, and showed faster skill growth early in the developmental period and greater growth deceleration, again resulting in a higher maximal level of Decoding achievement at a younger age compared to the Low class. The early, age-related differences between the Low and Average classes in linear growth in Decoding were similar in magnitude to those for Calculation (at age 7, estimated growth in Decoding was 23.55 for the Average class and 17.24 for the Low class).
To characterize the relation between class membership and the predictors of interest, the categorical latent class variable, c, was regressed on BWT, VSPM, and ER simultaneously, which revealed the effects of each predictor controlling for the influences of the others. For Calculation and Problem-solving, BWT was the only significant predictor of class membership (γBWT_calculation = 0.001, p = 0.014; γBWT_problem-solving = 0.001, p = 0.031). For Decoding, the effect of BWT on class membership was marginal (γBWT_decoding = 0.001, p = 0.086). As BWT is continuously distributed, these coefficients indicate that the log odds of being in the Average relative to the Low class increase by 0.001 for each unit (1 g) increase in BWT. The probability plots of BWT by the latent classes for each achievement outcome are displayed in Figure 3. For Calculation, the probability of being classified into the Average achievement group increased from 0.46 to 0.54 as BWT increased from 439 to 750 grams; from 0.54 to 0.71 as BWT increased from 750 to 1499 grams; and from 0.71 to 0.99 as BWT increased beyond 1500 grams. The probability plot indicates that children born at less than 600 grams had a greater probability of being assigned to the Low class relative to the Average class for Calculation (i.e., below the point where the Low and Average class probability plots cross, the probability of Low class assignment is > 0.50). For Problem-solving, the probability of being classified into the Average group increased from 0.54 to 0.62 as BWT increased from 439 to 750 grams; from 0.62 to 0.77 as BWT increased from 750 to 1499 grams; and from 0.77 to 0.99 as BWT increased beyond 1500 grams. In contrast to the pattern evident for Calculation, children across the BWT spectrum were more likely to be identified as average-achieving for Problem-solving.
For Decoding, the coefficient was not significant; therefore, the probability of identification in the Average class relative to the Low was a constant (0.63) for children irrespective of BWT. Neither VSPM nor ER predicted class membership beyond BWT for Calculation (γvspm_calculation = 0.279, p = 0.258; γER_calculation = 0.05, p = 0.325), Problem-solving (γvspm_problem-solving = 0.279, p = 0.258; γER_problem-solving = 0.05, p = 0.325), or Decoding (γvspm_decoding = 0.236, p = 0.492; γER_decoding = 0.067, p = 0.195).
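The link between the BWT coefficient and the class probabilities reported for Calculation can be sketched with a simple logistic function. The slope (0.001 per gram) and the 600-gram crossover point come from the results above; the intercept is not reported and is back-derived here from the crossover point, and the contributions of VSPM and ER are assumed to be absorbed into that intercept. Both are assumptions made for illustration, and the function name is hypothetical.

```python
import math


def prob_average(bwt_grams, intercept, slope):
    """Probability of Average-class membership under a logistic model."""
    logit = intercept + slope * bwt_grams
    return 1.0 / (1.0 + math.exp(-logit))


SLOPE = 0.001          # reported log-odds change per gram of birth weight
CROSSOVER_G = 600.0    # reported weight at which both classes are equally likely
# Assumption: at the crossover the logit is zero, so intercept = -slope * crossover.
INTERCEPT = -SLOPE * CROSSOVER_G

probs = {g: prob_average(g, INTERCEPT, SLOPE) for g in (439, 750, 1499)}
for grams, p in probs.items():
    print(grams, round(p, 2))
```

With these assumed values, the sketch approximately reproduces the reported Calculation probabilities at 439, 750, and 1499 grams, which suggests the reported figures are consistent with a single logistic curve in BWT.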
The final step in the analysis was to examine the impact of the neonatal medical variables on group membership after controlling for BWT. Because these conditions were coded only for children with VLBW, the analyses were conducted without inclusion of term children, although the class membership derived from the full sample GMM analyses was utilized. Days on Ventilation significantly predicted class membership for all three achievement outcomes (γvent_calculation = 0.015, p = 0.048; γvent_problem-solving = 0.015, p = 0.034; and γvent_decoding = 0.017, p = 0.028). Children with VLBW who required more ventilation days were more likely to be classified in the Low class relative to the Average class on Calculation, Problem-solving, and Decoding, as the log odds increase by 0.015–0.017 for a unit (i.e., one day) increase in Days on Ventilation. The probability plots of the significant neonatal variables by latent classes for the three achievement outcomes are shown in Figure 4. The plot for Calculation demonstrates that children receiving ventilation for 34 days or more (again, where the Low and Average class probabilities cross and the probability of Low class assignment is > 0.50) were more likely to be classified into the Low relative to the Average class. Similarly, children who received 43 or more days of ventilation were more likely to be in the Low relative to the Average class on Problem-solving, and those who received 70 or more days of ventilation were more likely to be in the Low compared to the Average class on Decoding.
Controlling for BWT, Length of Hospitalization predicted class membership for the Calculation and Decoding variables (γhosp_calculation = 0.012, p = 0.040; γhosp_decoding = 0.013, p = 0.023). Specifically, the respective odds ratios of being classified in the Low versus Average class for Calculation and Decoding were 1.012 and 1.013 for a unit (i.e., one day) increase in hospitalization. For Calculation, children who were hospitalized for 101 days or more were more likely to be classified in the Low class than in the Average class. For Decoding, the probability of a child being classified in the Low compared to the Average class exceeded .50 when the length of hospitalization was 136 days or more.
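The odds ratios quoted for Length of Hospitalization follow from exponentiating the logistic coefficients. The brief sketch below shows that conversion using the reported coefficients; the 30-day comparison is a hypothetical illustration of how a per-day effect accumulates and is not a result from the study.

```python
import math


def odds_ratio(gamma, units=1.0):
    """Odds ratio implied by a logistic coefficient over a given number of units."""
    return math.exp(gamma * units)


or_calc = odds_ratio(0.012)        # Calculation, per additional hospital day
or_decoding = odds_ratio(0.013)    # Decoding, per additional hospital day
# Hypothetical comparison: two children whose stays differ by 30 days.
or_calc_30_days = odds_ratio(0.012, units=30)

print(round(or_calc, 3), round(or_decoding, 3), round(or_calc_30_days, 2))
```

Because the effect is multiplicative on the odds scale, a coefficient that looks negligible per day implies a noticeably larger difference in odds across stays that differ by several weeks.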
Again controlling for BWT, Chronic Lung Disease was associated with class membership for Problem-solving (γlung_problem-solving = 1.072, p = 0.032). Children with VLBW with chronic lung disease had a probability of being assigned to the Low compared to the Average class for Problem-solving of .58, whereas children with VLBW without chronic lung disease had a probability of being assigned to the Average compared to Low class of .80.
GMM, an integrated person- and variable-centered approach, was applied to longitudinal academic achievement data from a large cohort of children born early and at VLBW in order to better understand the heterogeneity of outcome across development. Two latent classes were identified empirically, defining subgroups of average- and low-achieving participants whose pattern of longitudinal growth was similar enough to successfully result in cohesive classification. These results demonstrate how GMM extends CGM approaches by incorporating the concept of latent classes into growth variation, therefore explaining more variance of the individual growth trajectories. The findings confirmed our hypotheses that the degree of VLBW is related to suboptimal growth in Calculation, Problem-solving, and Decoding skills across childhood and into adolescence.
The GMMs revealed important information regarding the nature of academic achievement skill growth in children of varying BWT. With two latent classes, a coherent, low-performing group was identified empirically that differed from average-achieving children in calculation, problem-solving, and decoding skills. The consistency of the two-class solution across differing achievement skills, as well as whether or not term-born children are included in the models (results available from the author), suggests a coherent person-level group structure from school age into adolescence. A key feature of GMM is the assumption that individual differences also exist within each of the empirically identified groups. For example, some of the lightest children with VLBW performed as well as those born at term at certain ages, and concomitantly, some term-born children scored as poorly as the VLBW children across ages. Less than half of the lightest VLBW children (< 750 g) were identified as members of the Low class for Calculation (28 of 60, 47%) and for Problem-solving (24 of 60, 40%), and about one-third (21 of 60, 35%) were classified in the Low class for Decoding (see Table 6). These findings confirm the degree of heterogeneity in academic outcomes, as even among the lightest of children with VLBW, the majority does not show sub-optimal patterns of academic achievement skill growth. The GMM approach that includes CGM and latent classes can be utilized to effectively capture the variation within and across groups of children in developmental periods of interest.
Importantly, the difference between the low- and average-achieving classes was apparent across the developmental period from age 7 to 16 years for Calculation, Problem-solving, and Decoding, with the Average class outscoring the Low class by a substantial margin. However, the GMMs revealed a more complicated developmental picture. Given the faster linear rates of change early in the observation period, as well as the greater growth deceleration, children in the average-achieving class were estimated to obtain maximal achievement performance between 15 and 16 years (depending on outcome type), whereas low-achieving children were estimated to achieve their maximal scores at later ages, beyond age 17, although a persistent gap in achievement for the Low class was evident across age. Furthermore, the early, age-related growth and the differences between the Low and Average classes in linear growth both were greatest for Decoding and Calculation compared to Problem-solving, which is consistent with the early, rapid acquisition and application of sound-symbol relations and mathematic operations for typically developing children.
For low-achieving children, this developmental pattern suggests that achievement skill acquisition is fundamentally disrupted early in life, that this disruption persists, and that skill growth is further impaired across development. These findings extend the earlier achievement findings of Taylor et al. (2000) by revealing continued and even increasing deficits in achievement over time in children with more extreme VLBW compared with term-born controls. The results are also consistent with studies that have either found stable deficits across development in children with VLBW (Breslau et al., 2001; Powls et al., 1995; Richards et al., 1988; Stevenson et al., 1999) or that have raised the possibility of slower age-related acquisition in some skills (Botting et al., 1998; Cohen et al., 1996; O’Callaghan et al., 1996; Zelkowitz et al., 1995). The observed pattern of deficits between low- and average-achieving classes identified here with a GMM approach that includes meaningful subgroups further refines our knowledge of individual variation in academic skill developmental patterns. The different trajectory patterns suggest that there are distinct biologically determined upper limits of achievement skill acquisition, and that children identified in the Low class can continue to benefit from learning inputs and academic instruction beyond the age at which academic learning typically levels off in average-achieving children in order to achieve maximal proficiency. Of course, a different developmental pattern might have been observed during the transition from preschool to school age, or into adulthood.
Consistent with previous research, poorer achievement outcomes were related to several biological risk factors (Taylor et al., 2004a, 2006). What is newly demonstrated here using the GMM approach is the impact of these risks at the person level. More than half of the respective low-achieving classes for Calculation (28 of 38; 74%), Problem-solving (24 of 43; 56%), and Decoding (21 of 36; 58%) were the lightest VLBW children (<750 g). These findings were reinforced by GMM analyses that identified the individuals within the sample who were at highest risk for sub-optimal achievement growth. For Calculation, children under 600 g were more likely than not to be classified as low-achieving. For the other two academic skills, Problem-solving and Decoding, the probability of identification in the average-achieving group relative to the low-achieving group was greater across the entire BWT spectrum. For these measures though, the specificity of more extreme VLBW was low, as less than half of the lightest VLBW children were identified as members of the Low classes.
Contrary to expectations, nonverbal skills as assessed by the VSPM Factor score were not related to latent class membership for Calculation, Problem-solving, or Decoding when controlling for BWT. The failure of the VSPM score to predict poorer outcomes is surprising given the well-documented associations of these skills with academic achievement (Grunau, Whitfield, & Davis, 2002; Taylor et al., 2002). A likely explanation is that the influence of nonverbal skills on academic achievement was mediated in large part by BWT. The GMM results enrich this interpretation by including person-level effects, suggesting that the impact of BWT and that of nonverbal skills overlap substantially at the person level and therefore leave little systematic variation to predict achievement growth patterns. Environmental risk also failed to contribute to group membership beyond the influence of BWT for any of the achievement outcomes, contrary to extant findings (Breslau & Chilcoat, 2000; Bendersky & Lewis, 1995; Taylor et al., 1998; 2006). Including the person level suggests that the neurobiological consequences of very preterm birth at weights at the extreme of the continuum are the more significant contributors to sub-optimal patterns of academic achievement in school age and into adolescence (Taylor et al., 1998, 2006).
Although BWT was the central predictor, it is only a proxy for the impact of prematurity and related neonatal medical complications on the developing brain. Among the children with VLBW, risk factors that marked a more complicated neonatal course, including more days on ventilation, a longer period of neonatal hospitalization, and chronic lung disease, predicted class membership for one or more of the achievement outcomes. While consistent with previous findings (D’Angio et al., 2002; Short et al., 2007; Taylor et al., 1998, 2006), these results extend this literature by indicating that these associations are independent of BWT. Calculation was the more sensitive to disruption of the two mathematics domains, as the probability of classification as low- compared to average-achieving exceeded .5 at fewer days of ventilation for Calculation than for Problem-solving (34 days compared to 43 days with BWT controlled). Decoding skills were more “resilient,” as the probability of identification into the Low relative to the Average class for this outcome exceeded .5 only after 70 days of ventilation. Chronic lung disease and days on ventilation also predicted class membership for Problem-solving, potentially pointing to the impacts of prolonged oxygen imbalances on mathematics applications. Although these neonatal medical variables are more specific indicators of neurobiologic risk than BWT, they do not directly or specifically quantify the impact on the central nervous system. Because complications vary across children and likely have variable impact on the brain, neuroimaging methods may provide one means for further distinguishing the level of individual risk. More refined methods of assessing brain status thus may improve our understanding of the sources of variability in achievement skill growth across development. Larger multi-site samples also may be required in investigations of the impact on achievement outcome of low base rate neonatal complications.
Subgroups of average- and low-achieving children were identified empirically and were related to predictors of interest, demonstrating the flexibility and capacity of the GMM approach to understand heterogeneity of outcome. This two-group classification likely does not reflect what would be determined in a representative sample of all children between ages 7 and 16 years, as such a sample would include on average over 90% of children born at term at weights > 2500 g and very few children at weights < 750 g. Indeed, the sampling used here was ideal to determine empirically whether differing classes of children with distinctive developmental trajectories could be identified to understand heterogeneity in academic outcomes among children at risk due to VLBW. If a different sampling strategy had been used, it is likely that other coherent subgroups would have emerged. GMM extends conventional models that are predicated on a variable-centered orientation by incorporating latent class growth analysis to identify the number of coherent classes of individuals using a person-centered approach. Other analytic approaches also can be accommodated by GMM. For example, using Markov or piecewise GMM analysis and assuming transition stages or critical time points, the shifting of children of varying BWT across latent classes can be modeled to address whether some children are average-achieving early in life and then shift to a low-achieving subgroup during a critical period. The flexibility of GMM makes it an ideal framework with which to better model the complexities and heterogeneities of preterm and other at-risk children as they develop across time.
This research was supported in part by HD026554 and HD050399 from the National Institute of Child Health and Development to the senior author and his colleague Dr. Maureen Hack, and by R01 MH065668 from the National Institute of Mental Health, R01 DA014661 from the National Institute on Drug Abuse, and P01 HD038051 from the National Institute of Child Health and Development to the first author. The authors gratefully acknowledge Dr. Hack for her vital contributions to the design, conduct, and oversight of the longitudinal project. We also wish to thank the participating families, project staff, and graduate students who assisted in various tasks associated with this study.
Publisher's Disclaimer: The following manuscript is the final accepted manuscript. It has not been subjected to the final copyediting, fact-checking, and proofreading required for formal publication. It is not the definitive, publisher-authenticated version. The American Psychological Association and its Council of Editors disclaim any responsibility or liabilities for errors or omissions of this manuscript version, any version derived from this manuscript by NIH, or other third parties. The published version is available at www.apa.org/journals/neu.
Kimberly Andrews Espy, Office of Research and the Department of Psychology, University of Nebraska-Lincoln.
Hua Fang, Office of Research and the Department of Psychology, University of Nebraska-Lincoln.
David Charak, Learning Point Associates.
Nori Minich, Department of Pediatrics, Case Western Reserve University and Rainbow Babies & Children’s Hospital, University Hospitals Case Medical Center.
H. Gerry Taylor, Department of Pediatrics, Case Western Reserve University and Rainbow Babies & Children’s Hospital, University Hospitals Case Medical Center.