A new model describing normal values of bone mineral density in children has been evaluated, which includes not only the traditional parameters of age, gender, and race, but also weight, height, percent body fat, and sexual maturity. This model may constitute a better comparative norm for a specific child with given anthropometric values.
Previous descriptions of children's bone mineral density (BMD) by age have focused on segmenting diverse populations by race and gender without adjusting for anthropometric variables or have included the effects of anthropometric variables over a relatively homogeneous population.
Multivariate semi-metric smoothing (MS2) provides a way to describe a diverse population using a model that includes multiple effects and their interactions while producing a result that can be smoothed with respect to age in order to provide connected percentiles. We applied MS2 to spine BMD data from the Bone Mineral Density in Childhood Study to evaluate which of gender, race, age, height, weight, percent body fat, and sexual maturity explain variations in the population's BMD values. By balancing high adjusted R2 values and low mean square errors with clinical needs, a model using age, gender, race, weight, and percent body fat is proposed and examined.
This model provides narrower distributions and slight shifts of BMD values compared to the traditional model, which includes only age, gender, and race. Thus, the proposed model might constitute a better comparative standard for a specific child with given anthropometric values and should be less dependent on the anthropometric characteristics of the cohort used to devise the model.
The inclusion of multiple explanatory variables in the model, while creating smooth output curves, makes the MS2 method attractive in modeling practically sized data sets. The clinical use of this model by the bone research community has yet to be fully established.
Osteoporosis is a major public health concern. It is a condition of bone fragility that can lead to pain, disability, and reduced quality of life.
The normal accumulation of bone during childhood may play an important role in avoiding or delaying osteoporosis later in life. Inadequate bone accretion during childhood can be related to lifestyle factors, such as diet and physical activity, chronic medical conditions with primary or secondary effects on bone, and concomitant medications. Thus, identifying low bone mineral density (BMD) in children may allow clinicians to make more informed decisions about treatments for children with poor bone mineral accretion.
Dual-energy X-ray absorptiometry (DXA) is the most commonly used method of measuring BMD because of its reproducibility, safety, and widespread availability, and it is recommended for clinical assessment in children. However, clinical assessment of BMD in children requires special consideration of the expected patterns of change in BMD associated with growth and development.
The Bone Mineral Density in Childhood Study (BMDCS) is a prospective, longitudinal study with the goal of providing the necessary reference data for the clinical assessment of bone density in children. The publication of the first results of the BMDCS used the LMS statistical method to construct sex- and race-specific reference percentiles of BMD and bone mineral content (BMC) relative to age.
The LMS method smoothes the estimated distributions across age and adjusts the data for skewness, a common problem in biological data. The primary disadvantage of the LMS method is the inability to easily include multiple effects within the same model.
Several major classifying variables, including age, gender, and race, predict BMD. Additionally, the anthropometric variables of weight, height, and body composition (amount of lean and fat tissue in the body) correlate well with BMD [8, 9]. Several of these factors are highly collinear. As children age, their height, weight, lean body mass, and percent body fat tend to change concurrently.
Others have used general linear regression to include these anthropometric variables in their models [9–12]. Unlike the LMS method that groups subjects by age, these methods use age as a continuous variable, but like the LMS method they do not include interaction terms that would allow the model to consider the non-linear changing effects of anthropometric variables as subjects age. For instance, the effect of height on BMD may change as subjects grow older. Including interaction terms is particularly important when the observed population spans several developmental milestones. Furthermore, these methods often transform BMD by using the inverse or log functions, making interpretation of the parameter effects and the model results difficult.
Our analysis presents a strategy of modeling normal childhood BMD data by taking into account potentially relevant anthropometric variables and including the effects of the interaction of these variables with age while preserving the smoothness of the estimates as is provided by the LMS method.
Building a more complete model that includes these anthropometric factors should produce narrower distributions and provide more accurate, anthropometrically adjusted Z-scores for children than by using only age, gender, and race.
Lumbar spine BMD values, measured by DXA using Hologic QDR4500A, QDR 4500W and Delphi/A bone densitometers (Hologic Inc., Bedford, MA, USA), were employed from the BMDCS. The BMDCS study population, data collection, and calibration methods have been described previously. The original enrollment population contained 1,554 healthy children and adolescents aged 5-16 who underwent measurements at baseline and yearly thereafter for up to 5 years. Additional study participants were enrolled after year 3 to supplement some of the smaller age groups. The study was approved by the ethical review board of each study center. Before enrollment into the study, participants 18 years old and older provided written informed consent. Younger participants gave their assent combined with their parent's or guardian's written consent.
A BMDCS data record was excluded if the visit's record was incomplete or if the individual had been under long-term drug treatment, usually a steroid or a form of birth control, after enrollment in the study. Records were also excluded if the participant had aged to 21 years. The final data set contained 7,655 observations from 1,889 participants, 987 boys and 902 girls. Participants ranged in age from 5 to 20 years.
Although race was recorded for several groups including American Indian, Asian, black, Hispanic, white, other, and mixed, Tukey's confidence intervals showed that, once age and gender had been applied, the only significant racial distinction was black vs. non-black (24% vs. 76% of the sample, respectively).
In addition to age, gender, race, and BMD of the spine (L1-L4), we examined percent body fat (by whole body DXA), height (by stadiometer), weight (by electronic scale), and sexual maturity (Tanner stage). Age was truncated to the last birthday for the purpose of this analysis.
Figure 1 shows the number of subjects in the different age groups. All groups have fewer members at the oldest and youngest ages, particularly older black female participants. Consequently, for any fully predictive model, it would be prudent to follow the initial BMDCS analysis team's recommendation of excluding ages 5-6 and 18-20 years due to inadequate age-specific sample sizes. However, in the present study, we did not exclude them in order to evaluate the stability of our modeling approach with smaller numbers of data points.
Inspection of the data revealed several important characteristics. Many of the explanatory variables were collinear; the number of observations in each subgroup was variable, suggesting that a weighting scheme might be necessary to account for the unbalanced experimental design; most individuals were measured multiple times, which can give undue weight to some individuals, because the collected data are no longer fully independent. Furthermore, it is desirable that each variable's regression coefficients at adjacent ages be similar, requiring some type of smoothing.
These are exactly the conditions that led Wainer and Thissen to develop multivariate semi-metric smoothing (MS2). Variants of MS2 have been used to predict stature. Those variants were evaluated by Khamis and Guo, who determined that a version with spline smoothing (MCS2) was optimal.
We applied this method in the following manner:
Separating the fits by age/gender/race groups naturally weights all the coefficients by the sample size in that group while maintaining balance between the different groups. Transformation into the orthonormal space acts as a component factorization and breaks up the parameters' collinearity. The smoothing in the orthonormal space acts to protect the fit from large parameter jumps from age to age.
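The decorrelating effect of an orthonormal transformation can be illustrated with a minimal sketch. It assumes a Gram-Schmidt orthonormalization of centered predictor columns, which is one common construction and not necessarily the one used in MS2; the sample values and all function names are hypothetical.

```python
import math


def center(column):
    """Subtract the mean so each column varies around zero."""
    mean = sum(column) / len(column)
    return [value - mean for value in column]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(columns):
    """Orthonormalize columns in the given order (modified Gram-Schmidt)."""
    basis = []
    for column in columns:
        v = list(column)
        for q in basis:
            projection = dot(v, q)
            v = [a - projection * b for a, b in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-12:
            raise ValueError("column is linearly dependent on earlier columns")
        basis.append([a / norm for a in v])
    return basis


def correlation(x, y):
    """Pearson correlation between two columns."""
    xc, yc = center(x), center(y)
    return dot(xc, yc) / math.sqrt(dot(xc, xc) * dot(yc, yc))


# Hypothetical measurements for six children in one age/gender/race group.
weight_kg = [20.0, 24.0, 27.0, 33.0, 38.0, 45.0]
height_cm = [112.0, 120.0, 125.0, 134.0, 141.0, 150.0]
fat_pct = [18.0, 22.0, 17.0, 25.0, 20.0, 23.0]

# Weight and height are strongly collinear in the raw data.
raw_correlation = correlation(weight_kg, height_cm)

# After orthonormalization, no column overlaps with any other.
orthonormal = gram_schmidt([center(weight_kg), center(height_cm), center(fat_pct)])
max_overlap = max(
    abs(dot(orthonormal[i], orthonormal[j]))
    for i in range(3)
    for j in range(i + 1, 3)
)
print(raw_correlation, max_overlap)
```

Regressing on the orthonormal columns gives coefficients that can be estimated, and later smoothed, independently of one another, which is the property the paragraph above relies on.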
Overfitting the data can lead to artificial changes of the effects based on randomness in the data. Likewise, under-fitting the data would lead to a loss of accuracy in the model. The choice of using a fixed λ of 1.0 in the smoothing spline represents a slight variation of the MCS2 method, which expresses the variation of the spline as the number of knots instead of balancing the curvature and the error by using λ. In practice, curves with a λ of 1.0 are lightly smoothed but still retain enough elasticity to not overfit the data. Using a λ larger than 1.0, particularly for the intercepts, underfits some of the data, leading to massive increases in the error.
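The trade-off governed by λ can be sketched with a simpler stand-in for the smoothing spline: a discrete second-difference penalty (a Whittaker-type smoother) applied to one coefficient across consecutive ages. This is a swapped-in technique, so its λ is not on the same scale as the λ of 1.0 used here, and the coefficient values are hypothetical.

```python
def second_difference_penalty(n):
    """Return D^T D, where D is the (n-2) x n second-difference matrix."""
    penalty = [[0.0] * n for _ in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for row in range(n - 2):
        for i, ci in enumerate(stencil):
            for j, cj in enumerate(stencil):
                penalty[row + i][row + j] += ci * cj
    return penalty


def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def smooth(values, lam=1.0):
    """Minimize sum (y - z)^2 + lam * sum (second differences of z)^2."""
    n = len(values)
    penalty = second_difference_penalty(n)
    system = [
        [(1.0 if i == j else 0.0) + lam * penalty[i][j] for j in range(n)]
        for i in range(n)
    ]
    return solve(system, list(values))


def roughness(values):
    """Sum of squared second differences (the curvature term)."""
    return sum(
        (values[i] - 2.0 * values[i + 1] + values[i + 2]) ** 2
        for i in range(len(values) - 2)
    )


def squared_error(values, fitted):
    """Sum of squared deviations between raw and smoothed values."""
    return sum((a - b) ** 2 for a, b in zip(values, fitted))


# Hypothetical weight coefficients for eight consecutive ages in one group.
raw_coefficients = [0.0120, 0.0105, 0.0112, 0.0090, 0.0097, 0.0078, 0.0081, 0.0065]
smoothed = smooth(raw_coefficients, lam=1.0)
print(roughness(raw_coefficients), roughness(smoothed))
```

Raising `lam` lowers the curvature term at the cost of a larger squared error against the raw coefficients, which mirrors the underfitting described above for λ larger than 1.0.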
Selecting which variables to include in the final model is a multistep exercise that requires balancing the practicality of a clinical examination with the demands of statistical methodology. For the most complicated model, the following regression equation was used:
The crossing of weight, height, fat, and maturity coefficients with the age/gender/race groups indicates that these coefficients are different for each group. This represents the inclusion of the interaction of the group effects on these coefficients.
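A model with this structure can be written, as a sketch, in the following form. The symbols are assumptions chosen to match the error term εijkl used in the text and are not taken from the authors' original notation; the indices i, j, and k are assumed to denote the age, gender, and race groups, and l the observation within a group.

```latex
\mathrm{BMD}_{ijkl} = \mu_{ijk}
  + \beta^{W}_{ijk}\,\mathrm{Weight}_{ijkl}
  + \beta^{H}_{ijk}\,\mathrm{Height}_{ijkl}
  + \beta^{F}_{ijk}\,\mathrm{Fat}_{ijkl}
  + \beta^{M}_{ijk}\,\mathrm{Maturity}_{ijkl}
  + \varepsilon_{ijkl}
```

Here the intercept and every slope carry the group subscript ijk, which is what allows each coefficient to differ between age/gender/race groups. Because maturity is a nominal variable, its term would in practice stand for a set of indicator effects rather than a single slope.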
Whereas the original BMD values may be skewed, if the errors εijkl from the model can be shown to be normal, then we can infer that the skewness in the original data is accounted for, or induced by, the effects included in the model.
Adding variables to a model may or may not improve the model's prediction significantly. The mathematical considerations in Appendix 1 were used to judge a model's validity and to compare models containing different parameters.
The explanatory variables are listed in the order of importance in Table 1 and show a major additional influence of weight and percent body fat beyond age, gender and race. On the other hand, sexual maturity and height contribute little to the model once the other factors have been included.
Different combinations of explanatory variables resulted in different models (Table 2). The traditional model, model A, approximates the previous analysis, but instead of using medians and a power transformation as the LMS method does, model A uses the more standard parametric technique employing means, no power transformation and a symmetric error distribution. We find that the previous results via the LMS model and model A are almost identical. We will use model A as an analog to the output of the LMS method for comparison purposes.
The full model, model B, is the “kitchen sink” model, in which everything is included, even parameters that may not add much information to the model.
Accurately evaluating sexual maturity in practice requires careful cross-training of the assessors and may be uncomfortable for both the subject and the assessor. Measuring the percent body fat means another DXA measurement, slightly increasing radiation dose to the subject. The other parameters can be gathered quickly and agreeably, resulting in the practical model, model C.
Because of computational difficulty with orthogonalizing the nominal variable of maturity, fully smoothed models for some parameter sets were not calculated. The results for the unsmoothed model provide us with a reasonable estimate of the effectiveness of the final model. As seen in Tables 1 and 2, dropping sexual maturity as a predictor does not significantly reduce the predictive power of the model.
Statistically, model F may be optimal in terms of creating the most powerful model while adding the fewest predicting parameters, but Tukey's confidence intervals indicate that, at the black/non-black level, race still matters for some age/gender combinations. Including both weight and percent body fat will intrinsically account for lean mass in the model. Model D may be optimal in terms of balancing the needs of the subjects with the most powerful model, and this model D will be investigated further.
When segmented by age, gender, and race, there are 64 simultaneous fits for each model. For the sparse model, 13 of these groups show a lack of normality in the underlying BMD values at the 0.05 level using the Shapiro-Wilk test for normality. This lack of normality was the reason previous investigators chose to use the LMS model. For the smoothed model D, the residuals of only eight of these fits do not show normality, due to outliers that make the distributions look leptokurtic. Since the outliers tend to appear in an uneven manner, the leptokurtosis leads to perceived skewness, and all eight of these groups have a moment of skewness more than twice the error of skewness, meaning the residuals show more than mild skewness; however, selectively removing one outlier from the extreme tail of the sample allows all but one of these groups to pass a test of normality. In samples of this size, this result reinforces the notion that the underlying distribution is actually less skewed and more normal.
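The skewness criterion can be sketched as follows. The sketch assumes that "error of skewness" means the standard error of skewness and uses the common large-sample approximation sqrt(6/n); the residual values and function names are hypothetical.

```python
import math


def sample_skewness(values):
    """Moment coefficient of skewness: m3 / m2^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def skewness_standard_error(n):
    """Large-sample approximation of the standard error of skewness (assumed)."""
    return math.sqrt(6.0 / n)


def more_than_mildly_skewed(values):
    """True if |skewness| exceeds twice its approximate standard error."""
    limit = 2.0 * skewness_standard_error(len(values))
    return abs(sample_skewness(values)) > limit


# Hypothetical residuals (g/cm^2) for one group: symmetric apart from one outlier.
symmetric_residuals = [-0.02, -0.01, 0.0, 0.01, 0.02] * 8
with_outlier = symmetric_residuals + [0.15]

print(more_than_mildly_skewed(with_outlier))
print(more_than_mildly_skewed(symmetric_residuals))
```

In this constructed case a single extreme value is enough to push an otherwise symmetric sample over the threshold, and dropping it brings the sample back under, which is the pattern described for the eight groups above.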
By examining the smoothed coefficients of each parameter, we may glean some information about the parameter's effect and how this effect changes with age. Weight is positively related to BMD (Fig. 2a), but the influence declines with increasing age. For most groups, greater percent body fat is associated with lower bone density (Fig. 2b). Since the parameters are adjusted for weight, this negative association between percent body fat and BMD indicates that for two children of the same sex, race, age, height, and weight, the child with a higher percent body fat has a lower lean body mass and lower BMD, reflecting the known strong association of lean body mass and BMD. Also note that for late teen black females, the parameter reverses sign, but this is the area where data are most sparse.
Whereas weight usually has a strong positive contribution to BMD, height produces a much smaller negative correction after weight, age, and percent body fat are accounted for (Fig. 2c). This can be interpreted as follows: for a child of given sex, race, age, and percent body fat, bone mass is defined largely by body weight; in a taller child, that bone mass is distributed over a larger projected area, resulting in a lower measured BMD. For children at older ages, both black males and females show a reversal of this pattern, and height contributes positively to BMD.
The intercepts of each group represent a baseline before the other effects are layered in (Fig. 2d). Again, a difference between the black and non-black groups is striking.
Building a more complete model, including anthropometric factors, should allow for narrower distributions than would be allowed by simply using age, gender, and race. We can indeed confirm that the expected distribution narrows as parameters are added; however, after age, gender, race, weight, height, and percent body fat are included in the model, sexual maturity does not contribute in narrowing the distribution further.
In addition to narrowing the estimated distributions, the inclusion of anthropometric variables may produce shifts in the expected BMD. Subject weight creates a major shift in the mean, whereas subject height has little influence (Fig. 3a). The sparse model (model A) shows a wider distribution as it includes all variations in weight and height. Gender and race produce additional shifts in the model curves (Fig. 3b).
When comparing the total root mean squared error (RMSE) of the sparse model, model A, to the total RMSE produced by a more complex model, model D, a lower total RMSE is observed for the more complex model (Table 2). Furthermore, plotting the race/gender subgroups' individual RMSEs by age (Fig. 4) shows that model D also has a consistently lower RMSE at the group level. Thus, a model including relevant anthropometric data does produce narrower distributions. Also note that, since the RMSEs are expressed in the same scale as the dependent variable, they tend to increase as the subjects grow older just as the subjects' BMD levels increase with age.
Comparing model D (the fuller model) with model A (the reduced model), the F statistic returns a value above 14, where the degrees of freedom in the denominator and the numerator are both above 7000. The p value below 0.001 indicates that adding the anthropometric parameters is statistically justified and supports the conclusion that model D is superior to model A.
Not factoring in the effect of weight as a determinant of BMD leads to the possibility that children of normal weight, based on the 50th percentile of the CDC growth chart, will be judged as having low BMD when that is not the case (Fig. 5a). The sparse model agrees better with individuals who are at the 75th percentile of weight and height by CDC standards (Fig. 5b) than it does with those at the 50th percentile of the CDC standards.
Two cases of independent longitudinal data from a study that follows human development in a normal population are depicted in Fig. 6. Differences in Z-scores are apparent between the sparse model (model A) and the model without sexual maturity (model D). Although neither of these selected cases approaches a critical value suggesting abnormality, they serve as examples clearly showing that inclusion of explanatory variables in the model can easily produce a difference of more than one Z-score when compared to the sparse model.
This study has investigated an alternative approach to modeling bone density data by taking into account weight, height, and percent body fat beyond the traditionally used independent variables of age, gender, and race. The resulting model's curves are characterized by narrower distributions and small shifts, allowing more sensitive assessment of the bone density status of children with given anthropometric properties.
Manufacturers have released average BMD values starting at the age of 20 years. The method described here might be improved by including such values at the end point of the models as fixed or target values; however, without the relevant anthropometric data available, it is not clear how to account for these values.
The fact that Hologic fan-beam instruments have been used for this study may have an influence on the resulting data. Change in body weight may change the distance of the spine above the scan table, and this, in turn, would change the magnification of the bones in the image. Such magnification errors would clearly influence a geometric parameter like bone area as well as bone mineral content, which is derived by multiplying BMD with bone area. BMD is least influenced by magnification and has been shown to be relatively independent from the distance of the bone above the table.
The small sample size of late teen black females and the difference between the behavior of the effect coefficients that group produces, compared to the rest of the data set, call those results into question. Combining the data and eliminating race seems attractive, but Tukey's confidence intervals for the residuals of the age/gender/weight/height/fat fits without race still indicate that race is a significant determinant for BMD.
The models presented differ from other models that take anthropometrics into account for the interpretation of DXA data. A number of authors have suggested a multi-step approach [9, 12, 21], which narrows the appropriate comparisons down by gradually including additional anthropometric variables. Our models include these variables up front, simplifying the interpretation step of the DXA data. These models, however, critically rely on the implemented smoothing approach, as the subgroups become very small even in a data set containing several thousand observations.
Pubertal stage did not have a major influence on our models. Horlick et al. made similar observations for their whole body BMC models, and they argued that the use of a normal population, the cross-sectional design of the study and the consideration of anthropometric parameters diminished the importance of pubertal stage. This would, however, not mean that abnormal pubertal development would not have an influence on bone density.
Whereas there are alternative approaches to correcting BMD measurements that account for skeletal size, most notably bone mineral apparent density, there does not appear to be clear agreement about the best way to correct for such an effect or agreement about how meaningful such a correction would be. When and if such a correction is agreed upon, the methods presented in this paper would also apply to this skeletal size-corrected parameter. However, because several anthropometric parameters were considered in the models, it is quite possible that the introduction of a new skeletal size-corrected parameter is unnecessary.
A purely statistical approach to the problem of classifying juvenile subjects with respect to BMD would be to discard age entirely as an explanatory variable. The collinearity of both weight and height with age would make such an approach seem viable. Indeed, fitting BMD by gender and race against weight, height, and percent body fat produces an impressive adjusted R2 of 0.81 and an RMSE of 0.094, although not as good as model D's values of 0.83 and 0.088, respectively. By its nature, such a model would not include any interaction effects. Unfortunately, removing the percent body fat from this equation drives the R2 to 0.77 and increases the RMSE to 0.103, values that are comparable to what is achieved by using just age, gender and race, the sparse model A.
In conclusion, the proposed model adjusts for height, weight, and percent body fat. A similar model using the same type of smoothing can be implemented without percent body fat if that measurement is not available. These models adjust for the differences in the distribution of body weight between this sample and the CDC growth charts.
The models presented here can be used to determine a child's Z-score adjusted not just for age, gender, and race but also for body weight, size, and composition. These Z-scores are reasonably smooth at age breaks because of the MCS2 smoothing. Individuals are compared to more appropriate means, considering their anthropometric status, and placed within a narrower distribution, allowing for more sensitive clinical classification.
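How such an adjusted Z-score might be computed is sketched below. The sketch assumes that the Z-score is the residual from the group-specific predicted mean divided by that group's RMSE, which is consistent with a model based on means and a symmetric error distribution but is not spelled out in the text. The coefficient values, the intercept, and the function name are hypothetical and are not fitted values from the BMDCS data.

```python
def adjusted_z_score(observed_bmd, intercept, coefficients, covariates, group_rmse):
    """Return (observed - predicted) / RMSE for one child.

    intercept, coefficients and group_rmse are assumed to be the smoothed
    values for the child's own age/gender/race group.
    """
    predicted = intercept + sum(
        coefficients[name] * covariates[name] for name in coefficients
    )
    return (observed_bmd - predicted) / group_rmse


# Hypothetical smoothed parameters for one age/gender/race group.
group_intercept = 0.30
group_coefficients = {"weight_kg": 0.006, "height_cm": -0.0005, "fat_pct": -0.002}
group_rmse = 0.088

# Hypothetical child in that group.
child = {"weight_kg": 40.0, "height_cm": 150.0, "fat_pct": 20.0}
z = adjusted_z_score(0.513, group_intercept, group_coefficients, child, group_rmse)
print(round(z, 2))
```

Because the predicted mean moves with the child's weight, height, and percent body fat, two children with the same age, gender, race, and measured BMD can receive different Z-scores under this scheme.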
Full development of how to clinically use anthropometrically adjusted Z-scores for children is beyond the scope of this methods paper. Application of the proposed models to some well-characterized disease cohorts might be a first step.
This work was funded by the National Institute of Child Health and Human Development (NICHD), contract number N01-HD-1-3328.
For any data set, the total variation can be expressed as $SS_{tot} = \sum (y - \bar{y})^2$, where $\bar{y}$ is the average of the set, and the degrees of freedom $DoF_{tot}$ as the total number of observations in the data set minus 1. For any model of that data set, the variation explained by the model can be expressed as $SS_{reg} = \sum (\hat{y} - \bar{y})^2$, where $\hat{y}$ is the predicted value from the model. The degrees of freedom of the model, $DoF_{reg}$, is the number of parameters fit by the model. Finally, the variation in a data set unexplained by the model can be expressed as $SS_{err} = SS_{tot} - SS_{reg} = \sum (y - \hat{y})^2$, which is equivalent to the sum of the squares of the ε's in Eq. 1, and the degrees of freedom for the error as $DoF_{err} = DoF_{tot} - DoF_{reg}$.
The coefficient of determination, $R^2 = 1 - SS_{err}/SS_{tot}$, is a measure of how well the model fits the data. When adding terms to a model, $R^2$ always increases, even if the extra terms are not significantly adding explanatory value to the model. The adjusted coefficient of determination, adjusted $R^2 = 1 - (SS_{err}/SS_{tot}) / (DoF_{err}/DoF_{tot})$, penalizes the statistic by accounting for the number of explanatory variables used in the model. It approximately measures the percentage of variation in the data accounted for by the model in a way that can be compared between competing models.
The root mean squared error, $RMSE = \sqrt{SS_{err}/DoF_{err}}$, provides an estimate of the error for the model. The overall RMSE for the entire model is formed by using the total $SS_{err}$ and its degrees of freedom. In addition, each of the 64 age/gender/race subgroups also has an individual RMSE, formed by using each subgroup's $SS_{err}$ and its respective degrees of freedom.
When more significant parameters are added to the model, the adjusted $R^2$ increases and the total RMSE usually decreases. When comparing two models, where model x is a more complex model and model z is a simpler model, the ratio $((SS_{err,z} - SS_{err,x})/SS_{err,x}) / ((DoF_{err,z} - DoF_{err,x})/DoF_{err,x})$ becomes an F statistic that can be used to judge if adding the extra parameters makes model x a better fit for the data than model z.
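The quantities in this appendix can be computed with a short sketch. It assumes that the RMSE is the square root of SSerr divided by its degrees of freedom and that the number of fitted parameters is supplied by the caller; the function names and the toy data are hypothetical.

```python
import math


def fit_statistics(observed, predicted, n_params):
    """Return SSerr, DoFerr, R2, adjusted R2 and RMSE for one fitted model."""
    n = len(observed)
    mean = sum(observed) / n
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_err = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    dof_tot = n - 1
    dof_err = dof_tot - n_params  # DoFerr = DoFtot - DoFreg
    return {
        "ss_err": ss_err,
        "dof_err": dof_err,
        "r2": 1.0 - ss_err / ss_tot,
        "adj_r2": 1.0 - (ss_err / ss_tot) / (dof_err / dof_tot),
        "rmse": math.sqrt(ss_err / dof_err),
    }


def f_statistic(simple, complex_model):
    """F ratio for a simpler model z against a more complex model x."""
    numerator = (simple["ss_err"] - complex_model["ss_err"]) / complex_model["ss_err"]
    denominator = (simple["dof_err"] - complex_model["dof_err"]) / complex_model["dof_err"]
    return numerator / denominator


# Toy data: six observations and the predictions of two competing models.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
simple_fit = fit_statistics(observed, [2.0, 2.0, 2.0, 5.0, 5.0, 5.0], n_params=1)
complex_fit = fit_statistics(observed, [1.5, 1.5, 3.5, 3.5, 5.5, 5.5], n_params=2)

f_value = f_statistic(simple_fit, complex_fit)
print(simple_fit["adj_r2"], complex_fit["adj_r2"], f_value)
```

The resulting F value would then be compared against an F distribution with (DoFerr,z − DoFerr,x) and DoFerr,x degrees of freedom to obtain a p value.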
Conflict of interest: None
D. F. Short, Wright State University, Dayton, OH, USA.
B. S. Zemel, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
V. Gilsanz, Children's Hospital Los Angeles, Los Angeles, CA, USA.
H. J. Kalkwarf, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.
J. M. Lappe, Creighton University, Omaha, NE, USA.
S. Mahboubi, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
S. E. Oberfield, Columbia University Medical Center, New York, NY, USA.
J. A. Shepherd, University of California at San Francisco, San Francisco, CA, USA.
K. K. Winer, National Institute of Child Health and Human Development, Bethesda, MD, USA.
T. N. Hangartner, Wright State University, Dayton, OH, USA.