Before describing changes in personality traits, we evaluate the proportions of between- and within-subject variance and examine the extent to which the within-subject variance is explained by age. Between- and within-individual variance estimates (u_{0} and σ^{2}) from the basic one-way ANOVA with random effects model (Raudenbush & Bryk, 2002, p. 24) were used to estimate the proportion of stable variance in personality traits, as the ratio of between-subjects variance (u_{0}, the intercept variance) to the total variance (u_{0} + σ^{2}, where σ^{2} is the within-subject variance). This ratio, which is an intraclass correlation, indicates that the proportion of variance that was stable over the course of this study ranged from 67% for Thoughtfulness to 86% for Masculinity (*Mdn* = 71%).
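The intraclass correlation described above can be sketched in a few lines of Python. The function name and the variance components used here are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
def intraclass_correlation(between_var: float, within_var: float) -> float:
    """Return the share of total variance that lies between individuals.

    between_var plays the role of u_0 (intercept variance) and
    within_var the role of sigma^2 (within-subject variance).
    """
    total = between_var + within_var
    if total <= 0:
        raise ValueError("Total variance must be positive.")
    return between_var / total


# Hypothetical variance components (not values from the study).
icc = intraclass_correlation(between_var=71.0, within_var=29.0)
print(f"Stable share of variance: {icc:.2f}")
```

A value near 1 would mean that almost all variability reflects stable differences between people, and a value near 0 would mean that most variability occurs within people across assessments.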

Given that the between-subjects variance accounts for about 70% of the total variance, the remaining 30% is within-subject variance. By comparing the within-subject variance from the above baseline model with the residual within-subject variance from a model that includes age and age squared, it is possible to estimate the proportion of within-subject variance explained by age (Raudenbush & Bryk, 2002, p. 24). The within-subject variance accounted for by age ranged from 4% for Restraint to 26% for General Activity (*Mdn* = 14%). Thus, age accounts on average for about 14% × 30% ≈ 4% of the total variance, a figure consistent with cross-sectional estimates (Terracciano et al., 2005).
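The arithmetic behind this comparison can be written out as a short Python sketch. The function names and the two residual variances are hypothetical; only the approximate 14% and 30% figures come from the text.

```python
def within_variance_explained(sigma2_baseline: float, sigma2_with_age: float) -> float:
    """Proportional reduction in within-subject variance after adding age terms."""
    if sigma2_baseline <= 0:
        raise ValueError("Baseline within-subject variance must be positive.")
    return (sigma2_baseline - sigma2_with_age) / sigma2_baseline


def share_of_total_variance(prop_within_explained: float, icc: float) -> float:
    """Rescale a within-subject proportion to a proportion of total variance."""
    return prop_within_explained * (1.0 - icc)


# Hypothetical residual variances chosen so that age explains 14% of sigma^2.
explained = within_variance_explained(sigma2_baseline=30.0, sigma2_with_age=25.8)
total_share = share_of_total_variance(explained, icc=0.70)
print(f"Within-subject variance explained by age: {explained:.2f}")
print(f"Share of total variance explained by age: {total_share:.3f}")
```

Multiplying by (1 − ICC) is what converts the within-subject proportion into a proportion of total variance, which is why a median of 14% within subjects corresponds to only about 4% overall.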

HLM results for the final models of the ten GZTS scales are reported in . We first describe the average trajectories with the fixed effects of the intercept, linear, and quadratic terms respectively (γ_{00}, γ_{01}, γ_{02}), then the associated random effects (u_{0}, u_{1}, u_{2}), and finally the effects of predictors (i.e., gender, cohort, attrition, and death) of intercept and slope variability.
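As a reading aid, the quadratic growth model implied by these terms can be written as follows. This is a sketch under stated assumptions: the symbols π and A (age, possibly centered) are introduced here for illustration, the subscripting of the γ terms follows the text above, and the level-2 predictors are omitted.

```latex
\begin{aligned}
\text{Level 1:}\quad & Y_{ti} = \pi_{0i} + \pi_{1i} A_{ti} + \pi_{2i} A_{ti}^{2} + e_{ti},
  \qquad e_{ti} \sim N(0, \sigma^{2}) \\
\text{Level 2:}\quad & \pi_{0i} = \gamma_{00} + u_{0i} \\
                     & \pi_{1i} = \gamma_{01} + u_{1i} \\
                     & \pi_{2i} = \gamma_{02} + u_{2i}
\end{aligned}
```

Here Y_{ti} is the score of person i at assessment t. The fixed effects describe the average trajectory, and the random effects describe how individual intercepts and slopes depart from it.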

| **Table 3.** HLM Coefficients and Variance Estimates of Intercept, Linear, and Quadratic Terms for GZTS Scales |

Fixed Effects

Of most interest are the linear and quadratic fixed effects, which determine the shape of the developmental trajectory. Estimated age trajectories for the ten scales are depicted in , setting cohort as equal to the sample mean date of birth (1929), and separately for men and women for the scales that showed significant gender differences.

As hypothesized, the four GZTS scales related to E have distinct patterns: General Activity is stable or slightly increases in young adulthood and then declines at an accelerating rate in the very old; Restraint (which is negatively related to E, and positively related to C6: Deliberation) increases linearly; Ascendance has a curvilinear trajectory which peaks around age 60; and Sociability decreases linearly at a very modest rate. Emotional Stability and Objectivity, the two scales most strongly and inversely related to N, have a similar pattern: Both increase at a decelerating rate up to age 70, and then decrease slightly in old age. Personal Relations, which is related to A1: Trust, increases up to age 50 and then declines. Contrary to our hypotheses, Friendliness, the GZTS scale related to A, is stable, and Thoughtfulness, the GZTS scale related to O5: Ideas, follows a concave curve, with some increase in old age. Masculinity declines with age, as hypothesized, but the effect is linear. The changes in adulthood on the ten scales amount, at most, to about one *T*-score point (.1 *SD*) per decade.
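How linear and quadratic fixed effects jointly determine where a trajectory peaks can be illustrated with a minimal Python sketch. All coefficients and the centering age below are hypothetical and are not estimates from Table 3; they are chosen so that the curve peaks at age 60.

```python
from typing import Optional


def predicted_score(age: float, g00: float, g01: float, g02: float,
                    center: float = 0.0) -> float:
    """Average trajectory: intercept + linear + quadratic term in centered age."""
    a = age - center
    return g00 + g01 * a + g02 * a * a


def turning_point_age(g01: float, g02: float, center: float = 0.0) -> Optional[float]:
    """Age at which the quadratic curve peaks (g02 < 0) or bottoms out (g02 > 0).

    Returns None for a purely linear trajectory, which has no turning point.
    """
    if g02 == 0:
        return None
    return center - g01 / (2.0 * g02)


# Hypothetical coefficients with age centered at 50 (an assumption).
g00, g01, g02, center = 50.0, 0.10, -0.005, 50.0
peak = turning_point_age(g01, g02, center)
print(f"Turning point at age {peak:.1f}")
print(f"Predicted T-score at age 30: {predicted_score(30, g00, g01, g02, center):.2f}")
```

A negative quadratic coefficient yields a curve that rises and then falls, and a positive one yields a curve that falls and then rises. A quadratic coefficient of zero reduces the model to a straight line.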

Random Effects

The random effects terms (u_{0}: variance) associated with the intercepts, which reflect between-individual differences in personality traits, were all statistically significant and substantially higher than the within-individual variance (σ^{2}). General Activity, Ascendance, and Sociability showed less within-individual variability than the other scales. In other words, on these scales, individuals are relatively more consistent across assessments. Masculinity shows markedly smaller within- and between-individual variances (σ^{2} and u_{0}), setting it apart from the other GZTS scales, perhaps because Masculinity is a marker of sex role attributes more than a personality trait.

None of the variances of the quadratic slopes in was significant, but for seven scales, variances associated with the linear slopes were significant, indicating that for those scales, there are individual slopes departing from the overall trends. Next, we attempt to explain such variability with individual difference variables.

Level-2 Predictors of Intercept and Slope Variability

Gender was a significant predictor of the intercept for six scales. In the terminology of the NEO-PI-R, women were higher on A (Friendliness and low Masculinity), lower in Assertiveness (Ascendance), and higher on N (i.e., lower on Emotional Stability, Objectivity, Personal Relations, and Masculinity), which is consistent with cross-cultural patterns of sex differences (Costa, Terracciano, & McCrae, 2001). Gender explains 55% of the variance in the intercept of Masculinity, 3.3% for Ascendance, and about 1% for the other scales.

Cohort (year of birth centered on 1929) was a significant predictor of the intercept for seven of the ten GZTS scales. Later-born cohorts have lower intercepts (about 1 *T*-score point per decade) on Restraint, Friendliness, and Personal Relations, and higher intercepts on General Activity, Ascendance, Emotional Stability, and Thoughtfulness. These effects are net of the estimated effects of age (the level-1 predictor), and explain 7.6% of the variance in the intercept of Personal Relations and 2.6% for Ascendance, but less than 1% for the other scales.

As indicated by the table note in , gender was a significant predictor of the slope variability for Thoughtfulness, with women declining in young adulthood more, and increasing in old age less, than men (see ). This gender by age effect explained less than 1% of the slope variance. Birth cohort was a significant predictor of the linear slope of Sociability and Masculinity, explaining 1% and 3.7% of the slopes' variance, respectively. In both cases, later-born cohorts decline less.

Attrition and Death

Although there is little evidence that attrition influences the results of longitudinal studies of personality traits (Roberts et al., in press), we examined whether the trajectories of individuals who dropped out of the study differed from the trajectories of those who remained in this study. We also examined whether individuals who died during the course of this study had different levels (intercept) or slopes compared to individuals still alive. We created a dummy variable, Attrition, that contrasted those who dropped out with those who remained in the study, and another dummy variable, Death, that contrasted those who died with those still alive. We entered the dummy variables as level-2 predictors of intercept, linear, and quadratic terms in the models presented in .
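The way a dummy-coded level-2 predictor shifts each growth term can be sketched as follows. This is a minimal illustration that assumes the two dummies enter additively. The class, the function, and all numeric values are hypothetical and are not the estimates reported in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GrowthCoefs:
    """Intercept, linear, and quadratic coefficients of one trajectory."""
    intercept: float
    linear: float
    quadratic: float


def adjust_for_dummies(base: GrowthCoefs, attrition_effect: GrowthCoefs,
                       death_effect: GrowthCoefs,
                       dropped_out: bool, died: bool) -> GrowthCoefs:
    """Add each dummy's effect to the baseline terms when the dummy equals 1."""
    att = 1 if dropped_out else 0   # Attrition dummy: 1 = dropped out
    dth = 1 if died else 0          # Death dummy: 1 = died
    return GrowthCoefs(
        intercept=base.intercept + att * attrition_effect.intercept + dth * death_effect.intercept,
        linear=base.linear + att * attrition_effect.linear + dth * death_effect.linear,
        quadratic=base.quadratic + att * attrition_effect.quadratic + dth * death_effect.quadratic,
    )


# Hypothetical values: both dummies lower the intercept and leave the slopes unchanged.
base = GrowthCoefs(intercept=50.0, linear=0.0, quadratic=0.0)
attrition = GrowthCoefs(intercept=-1.0, linear=0.0, quadratic=0.0)
death = GrowthCoefs(intercept=-2.0, linear=0.0, quadratic=0.0)

print(adjust_for_dummies(base, attrition, death, dropped_out=True, died=False))
```

Because each dummy is 0 for the reference group, the baseline coefficients describe participants who remained in the study and were still alive, and each effect is read as a difference from that group.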

Overall, HLM analyses indicate that attrition and death had limited effects on the trajectories of the GZTS scales. The only significant attrition effect was found for Emotional Stability: Individuals who dropped out of the study had an intercept 1.3 *T*-score points lower than those who remained (*p* < .05; see also Terracciano et al., 2005). The trajectories of participants who died differed significantly from those of participants who were still alive on two of the ten GZTS scales. Those who died had an intercept 2.4 *T*-score points lower on Emotional Stability (*p* < .001). For Friendliness, those who later died showed a decline of less than 2.5 *T*-score points from age 20 to 90 (*p* < .05), whereas those still alive were essentially stable or increased slightly.

Supplementary Analyses

The data structure for this study was unusual because data were collected for the first 20 years of the study only from men. To examine whether this feature affected results, we attempted to replicate the results reported in on the assessments collected from 1978 to 2002 only, when both men and women were tested. This analysis removes possible Gender × Time-of-Measurement interaction effects. These analyses produced essentially the same results as those reported in , with only 4 of the 54 significant coefficients reported in becoming non-significant.

Next, we repeated the analyses in separate samples of men and women. These analyses cannot be compared to the full model, where gender was a level-2 variable, but the major concern is with the shape of the developmental curves. Of the 14 significant linear and quadratic fixed effects, all except the linear effects for General Activity in men and Sociability in women remained significant, and in the same direction, in the new analyses. In addition, Personal Relations showed a significant linear effect in men. Overall, however, these analyses suggest similar results for independent analyses of men and women.

Finally, we examined whether we could replicate the findings in Terracciano et al. (2005) using the subset of 916 individuals who had not completed the NEO-PI-R. In this subset we can test hypotheses about developmental curves that were derived from a different instrument administered to a non-overlapping sample (see ). We used Overlapping vs. Non-Overlapping with the NEO-PI-R sample as a level-2 variable in analyses of the full GZTS sample. Those analyses showed that Personal Relations declined more steeply in the Non-Overlapping group, but there were no other significant differences between slopes. Thus, the consistency of results for the NEO-PI-R and the GZTS is not due simply to overlapping samples.