The main goal was to determine the accuracy and precision of new fetal weight estimation models, based on fractional limb volume and conventional 2D sonographic measurements during the second and third trimesters of pregnancy.
A prospective cross-sectional study of 271 fetuses was performed using 3D ultrasonography to acquire standard biometry (BPD, AC, FDL), fractional arm volume (AVol) and fractional thigh volume (TVol) within 4 days of delivery. Weighted multiple linear regression was used to develop ‘modified Hadlock’ models and new models using transformed predictors that included soft tissue parameters for estimating birth weight (BW). Estimated and observed BWs were compared using the mean percent difference (systematic weight estimation error) and the standard deviation of the percent differences (random weight estimation error). The proportions of newborns with estimated BW within 5% or 10% of actual BW were compared using McNemar’s test.
Birth weights ranged from 235 - 5,790 grams with equal proportions of male and female infants. Six new fetal weight estimation models were compared to results for modified Hadlock models with sample specific coefficients. All new models were very accurate with mean percent differences that were not significantly different from zero. Model 3 (Ln BPD, Ln AC, Ln AVol) and Model 6 (Ln BPD, Ln AC, Ln TVol) provided the most precise weight estimations (random error = 6.6% of actual birth weight) as compared to 8.5% for the best Hadlock model. Model 5 (Ln AC, Ln TVol) classified an additional 9.1% and 8.3% of the fetuses within 5% and 10% of BW. Model 6 classified an additional 7.3% and 4.1% of infants within 5% and 10% of BW.
The precision of fetal weight estimation can be improved by adding fractional limb volume measurements to conventional two-dimensional biometry. New models that consider fractional limb volume may offer new insight into the contribution of soft tissue development to weight estimation.
The nutritional status of newborn infants is routinely assessed using birth weight (BW) and population-based standards. In 2004, the World Health Organization estimated that more than 20 million infants were born with low birth weight of less than 2,500 grams, particularly in Asia and Africa (1). In the United States alone, the National Center for Health Statistics reported that 8.2 percent of over 4.1 million newborns were delivered with low birth weight (2). On the other end of the spectrum, 9.1 percent of American newborns were delivered with birth weights of at least 4,000 grams. Both of these extreme conditions represent public health issues of major importance. Low birth weight is an important determinant of infant mortality and is associated with an increased risk of hypertension, diabetes, and obesity in adult life (3). Macrosomic infants have an increased likelihood of operative delivery, shoulder dystocia, brachial plexus injury, anal lacerations, and postpartum infection (4). Hence, the routine practice of estimating fetal weight is supported by a clinical need to detect and monitor abnormal growth.
Estimated fetal weight (EFW) has been used to identify growth abnormalities for over three decades. A sonographic measurement of abdominal circumference (AC) is usually combined with other growth predictors, such as head circumference (HC) and/or femoral diaphysis length (FDL), for EFW prior to delivery. In a systematic review of eleven prediction models, Dudley (5) concluded that there was no preferred method for the ultrasound estimation of fetal weight because the magnitude of the random errors resulting from these predictions was a major obstacle to confident use in clinical practice. This review concluded that 95% confidence intervals exceeded 14% of birth weight in all studies. On the basis of these observations, the precision of EFW calculations clearly needs to be improved.
Despite the widespread application of fetal weight estimation models in obstetrical practice, relatively few have included soft tissue assessment for this purpose. Several investigators have proposed fetal soft tissue evaluation of the fetal thigh thickness or circumference (6,7), cheek-to-cheek diameter (8,9), abdominal subcutaneous tissue thickness (10-14), and appearance of the fetal buttocks (15). Practical implementation of these soft tissue predictors is limited by a paucity of validation data regarding the reproducibility of these measurements between examiners throughout pregnancy.
Three-dimensional ultrasonography (3DUS) provides a method for limb volume measurements and subsequent calculation of EFW. Chang and associates (16,17) initially described the use of arm or thigh volumes for estimating weight during the third trimester. Their measuring procedure took approximately 10-15 minutes to complete for each limb. Schild and colleagues (18) have described a combination of 2D and 3D sonographic parameters for calculating EFW. Volume predictors included fetal thigh, upper arm, and abdomen. Superior fetal weight estimation was achieved by including these soft tissue predictors although they concluded that the extra time spent on measuring volumes was justified in cases where accurate weight estimation was important. Improved results were also obtained using multiple parameters in a subsequent study of fetuses weighing ≤ 1,600 grams (19). Unfortunately, these volume measurements alone took an average of 10 to 15 minutes for each fetus.
The clinical application of volume measurements for weight estimation is limited by the extra time that is required to manually trace soft tissue borders along the entire limb. Acoustic shadowing also makes it difficult to confidently trace soft tissue borders near the limb joints. The concept of fractional limb volume was introduced in order to address these technical limitations (20). This soft tissue parameter is derived from a central portion of the limb diaphysis because transverse slices of the mid-limb are more likely to display the sharpest soft tissue borders. Measuring times are substantially reduced because only five equidistant slices are traced within the partial limb volume and areas of acoustic shadowing are less likely to occur. Fractional limb volume measurements are also reproducible between blinded examiners and technical factors affecting their acquisition have already been described (21).
We now examine the accuracy and precision of new fetal weight estimation models that combine fractional limb volume with conventional 2D sonographic measurements (BPD, AC, FDL) during the second and third trimesters of pregnancy.
This was a prospective, cross-sectional study of pregnant women who were invited to participate under informed consent that was approved by Institutional Review Boards at William Beaumont Hospital, Wayne State University, and the Eunice Kennedy Shriver National Institute of Child Health and Human Development. Inclusion criteria consisted of women in their second and third trimesters of pregnancy. The protocol excluded multiple gestations, fetuses with structural anomalies, and fetuses with poorly visualized limbs due to technical factors. Subjects were primarily from uncomplicated pregnancies although some had known risk factors that included suspected fetal growth restriction (n = 42), hypertension (n = 17), and diabetes (n = 13). Maternal age, gravidity, menstrual age at time of scan, gender, ethnicity, and presence of obstetrical complications were documented. Some of the research subjects were previously reported in our original publication that described the relationship of birth weight to fractional limb volume in late third trimester fetuses (n = 87) (20). Additional subjects (n = 88) were reported in two other articles that were unrelated to fetal weight estimation (21,22).
Fetal age was based on the first day of the last normal menstrual period and confirmed by either first or early second trimester ultrasound scans. A normal last menstrual period was defined as regular cyclic menses without antecedent oral contraceptive use. Age estimates in the first trimester were based on crown-rump length measurements (23). Age estimates in the second trimester were determined by using biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC), and femoral diaphysis length (FDL) measurements (24-27). Sonographic age was used to adjust menstrual age if there was more than a one-week discrepancy between menstrual dating and sonographic assessment.
3D volume data sets were acquired from the head, trunk, and thigh as previously described (21). Each pregnancy was scanned only once within 4 days of delivery. The data was acquired using hybrid mechanical and curved array abdominal ultrasonic transducers (RAB 4-8P, RAB 2-5P; Voluson 730 and Voluson Expert, GE Healthcare, Milwaukee, WI). Image depth and magnification were adjusted for a volume of interest to fill at least two thirds of the video display screen. For fractional limb volume, the acoustic focal zone was adjusted near the long bone diaphysis and the system gain was optimized. Each volume acquisition, lasting approximately 10 seconds, was taken from a sagittal sweep of the limb diaphysis. Image data was archived on digital media for subsequent off-line analysis.
Sonographic measurements of the AC and FDL were extracted from the acquired volume data sets; routine head volume acquisitions were added in 2001 to allow inclusion of BPD. Two commonly used fetal weight estimation models from Hadlock et al. (AC and FDL; BPD, AC, and FDL) were used as a basis of comparison (28,29). Fractional limb volumes were calculated using commercially available software (4D View, GE Healthcare, Milwaukee, WI). Volume measurements were based on either 50% of humeral diaphysis length (AVol) or 50% of femoral diaphysis length (TVol). Each partial volume was sub-divided into five equidistant slices that were centered along the mid-arm or mid-thigh (21,22). Images were again magnified to fill at least two-thirds of the display. Soft tissue borders were enhanced by use of a color filter (sepia) with additional gamma curve adjustments for brightness and contrast. The fractional limb volume was automatically calculated after each of the five slices was manually traced from a transverse view of the extremity.
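As a rough illustration of how a partial limb volume could be assembled from five traced slices, the sketch below sums the slice areas and multiplies by a uniform slice thickness over the central 50% of the diaphysis. The summation rule, the function name `fractional_limb_volume`, the units (cm and cm²), and the example areas are all assumptions made for illustration; the algorithm used by the commercial software is not described in this report.

```python
def fractional_limb_volume(slice_areas_cm2, diaphysis_length_cm, fraction=0.5):
    """Approximate a partial limb volume (cm^3) from equidistant traced slices.

    Assumption: each traced cross-sectional area represents a slab of equal
    thickness, so volume = sum(areas) * slab thickness.
    """
    if not slice_areas_cm2:
        raise ValueError("at least one traced slice area is required")
    partial_length = fraction * diaphysis_length_cm      # central 50% by default
    slice_thickness = partial_length / len(slice_areas_cm2)
    return sum(slice_areas_cm2) * slice_thickness


# Hypothetical traced areas for five mid-thigh slices and a 6 cm femoral diaphysis.
example_volume = fractional_limb_volume([10.0, 11.0, 12.0, 11.0, 10.0], 6.0)
```

A trapezoidal or spline-based integration between slices would be an equally plausible choice; the slab sum is used here only because it is the simplest to follow.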
All continuous variables were first assessed using numerical and graphical techniques, including scatter plots, to determine if they met the distributional assumptions of statistical tests being used to analyze them. All scatter plots (anatomic parameters versus BW) revealed curvilinear relationships and presence of heteroscedasticity. Natural logarithmic transformations from the Box-Cox family of transformations were applied to all growth parameters and BW.
Weighted regression analysis was performed with transformed data on all of the models to address heteroscedasticity. These weights were computed as the reciprocal of model specific variance because they represent the best linear unbiased estimates (BLUE) of true birth weights. Professor Douglas Altman has previously reviewed key statistical concepts for this procedure (30-32). The weights are multiplied by √(π/2), or 1.253, using a half standard normal distribution (30). Since all of the anatomic parameters change as a function of pregnancy age, the residuals should have a normal distribution at each value of the parameter and the absolute values of the residuals should have a half normal distribution. It follows that the mean of the absolute residuals multiplied by √(π/2) is an estimate of the standard deviation (SD) of the residuals. If the SD is not fairly constant for each parameter, their predicted values from regressing absolute residuals against the predictors multiplied by √(π/2) will provide parameter-specific estimates of the SD of the signed residuals, and hence of birth weight. For each model, studentized residuals from the weighted regression analysis were assessed for normality using a normal probability plot.
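The half-normal argument above can be checked numerically. The following is a minimal sketch, assuming simulated normally distributed residuals with a known SD of 2.0; the function names and the simulated data are hypothetical and are not part of the study's SAS analysis.

```python
import math
import random


def sd_from_abs_residuals(residuals):
    """Estimate the SD of signed residuals as mean(|residual|) * sqrt(pi / 2)."""
    mean_abs = sum(abs(r) for r in residuals) / len(residuals)
    return mean_abs * math.sqrt(math.pi / 2)


def inverse_variance_weights(sd_estimates):
    """Regression weights taken as the reciprocal of each estimated variance."""
    return [1.0 / (sd ** 2) for sd in sd_estimates]


# Simulated residuals (assumption: normal, mean 0, SD 2.0) with a fixed seed.
rng = random.Random(0)
simulated = [rng.gauss(0.0, 2.0) for _ in range(20000)]
estimated_sd = sd_from_abs_residuals(simulated)
```

In the heteroscedastic case described in the text, the absolute residuals would first be regressed on the predictors, and the fitted values (rather than a single mean) would be scaled by √(π/2) before being converted to weights.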
Main effects polynomial regression models were fitted to the data to capture the curvature in the data. A quadratic or cubic term for each predictor in the model typically provided a good fit to the data as evidenced by random patterns in the residual analyses and excellent coefficients of multiple determination. Hence, interaction terms were not required for any of the new weight estimation models.
For each model, variance inflation factors (VIF) were examined to assess the magnitude and severity of multicollinearity (33). Severe multicollinearity, typically indicated by a VIF > 10, does not usually influence the ability of a fitted model for making inferences about mean responses or for making predictions, provided that the predicted values for which inferences are to be made follow the same multicollinearity pattern as the data from which the regression model is based (33). Since our main goals were to develop robust weight estimation models and to make inferences using predicted values that follow the same pattern of multicollinearity as fetal weight, there was no need to incorporate corrective measures such as Ridge Regression to address this matter (33-36).
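For intuition about the VIF threshold, the sketch below computes the variance inflation factor for the simplest case of two predictors, where VIF = 1 / (1 - r²) and r is their Pearson correlation. This is an illustrative simplification with hypothetical data: with more than two predictors, each predictor must be regressed on all of the others to obtain its own R².

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF shared by both predictors in a two-predictor model.

    Raises ZeroDivisionError when the predictors are perfectly correlated.
    """
    r_squared = pearson_r(x, y) ** 2
    return 1.0 / (1.0 - r_squared)


# Hypothetical, strongly correlated predictors (e.g. two log-transformed sizes).
vif_example = vif_two_predictors([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1])
```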
For each regression model, the Mean Square Error (MSE) was computed as an unbiased estimator of the population variance, a required mathematical property of estimates that is statistically referred to as random error. The positive square root of MSE was also used to estimate the population standard deviation. Systematic errors were also calculated as the mean percent differences in birth weight using the following formula: Percent difference = [(estimated birth weight - actual birth weight)/(actual birth weight)] × 100. Random errors were expressed by the standard deviation of these percent differences. Systematic errors of the proposed best models were compared with those of Hadlock’s original (OH1 and OH2) and modified (MH1 and MH2) models using either the Sign test or Student’s t test. Random errors were compared using the Pitman test for correlated variances (37,38).
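The error definitions above translate directly into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the sample standard deviation (n - 1 denominator) is an assumption, and any weights passed in would be made-up values rather than study data.

```python
import statistics


def percent_difference(estimated_bw, actual_bw):
    """Percent difference = (estimated - actual) / actual * 100."""
    return (estimated_bw - actual_bw) / actual_bw * 100.0


def weight_estimation_errors(estimated, actual):
    """Return (systematic error, random error) in percent of actual BW.

    Systematic error: mean of the percent differences.
    Random error: standard deviation of the percent differences.
    """
    diffs = [percent_difference(e, a) for e, a in zip(estimated, actual)]
    return statistics.mean(diffs), statistics.stdev(diffs)


def proportion_within(estimated, actual, tolerance_percent):
    """Fraction of estimates whose absolute percent difference is within tolerance."""
    diffs = [percent_difference(e, a) for e, a in zip(estimated, actual)]
    hits = sum(1 for d in diffs if abs(d) <= tolerance_percent)
    return hits / len(diffs)
```

For example, `proportion_within(estimated, actual, 5)` and `proportion_within(estimated, actual, 10)` give the kind of classification rates that are compared between models.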
New fetal weight estimation models with the best performances were compared to comparable models using the Partial F tests that were based on the Sequential and Extra Sums of Squares. This technique is routinely used for the comparisons of fitted regression models (33).
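The partial F statistic for nested models depends only on the error sums of squares and their degrees of freedom. The following is a minimal sketch with a hypothetical function name and invented numbers in the usage note; it returns the statistic only, since a p-value would require the F distribution from a statistics package.

```python
def partial_f_statistic(sse_reduced, sse_full, df_reduced, df_full):
    """Partial F statistic for comparing a reduced model with a full model.

    F = [(SSE_reduced - SSE_full) / (df_reduced - df_full)] / (SSE_full / df_full)
    where df_* are the error (residual) degrees of freedom of each model.
    """
    if df_reduced <= df_full:
        raise ValueError("the reduced model must have more error degrees of freedom")
    extra_ss = sse_reduced - sse_full          # extra sum of squares
    numerator = extra_ss / (df_reduced - df_full)
    denominator = sse_full / df_full           # MSE of the full model
    return numerator / denominator
```

With invented values, `partial_f_statistic(116.0, 96.0, 98, 96)` evaluates the gain from adding two terms to a model fitted on 99 observations.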
The proportions of newborns with estimated BW within 5% or 10% of actual BW were compared between models by analyzing 2 × 2 contingency tables using McNemar’s test for paired observations, with the Hochberg adjustment for multiple comparisons. This test depends on the number of disagreements between two models, namely the number of false positives and false negatives. When the number of false negatives is much lower than the number of false positives, it is strongly indicative of a drop in agreement with corresponding significant p-values.
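Because McNemar's test uses only the discordant pairs, it can be sketched in a few lines. The example below uses the exact binomial form of the test and a Hochberg step-up adjustment; whether the exact or the chi-square form was used in the study is not stated, so the choice here is an assumption, and the function names and counts are hypothetical.

```python
import math


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: cases classified within tolerance by model A only
    c: cases classified within tolerance by model B only
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def hochberg_adjust(p_values):
    """Hochberg step-up adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank, idx in enumerate(order):         # rank 0 is the largest p-value
        running_min = min(running_min, (rank + 1) * p_values[idx])
        adjusted[idx] = running_min
    return adjusted
```

A comparison is then declared significant when its adjusted p-value falls below the chosen alpha.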
Calculations were performed using PASS 2005 (38). P-values less than an alpha of 0.05 (Probability of Type I error) were considered statistically significant. Statistical analysis was performed using The SAS System for Windows version 9.2.
Two hundred and seventy one pregnancies were prospectively scanned within 4 days of delivery during the study period from June, 1998 to September, 2008 (76.4% between 1998 - 2002). Examinations were performed between 18.4 and 42.1 weeks gestation, menstrual age. The mean maternal age was 30.6 ± 5.9 years with an average gravidity of 2.9 pregnancies. Ethnicities included 63.5% White, 27.3% Black, 4.4% Asian, 1.5% Hispanic, and 1.5% Native American Indian subjects. Newborn infants consisted of 50.2% females and 49.8% males. Birth weights were non-normally distributed and ranged from 235 - 5,790 grams (Shapiro-Wilk’s test, 0.965, p < 0.0001) (Figure 1).
Curvilinear relationships of sonographic parameters to birth weight are summarized in Figure 2. Transformed parameters and their relationships to the natural logarithm of birth weight demonstrated increased linearity and reduced heteroscedasticity (Figure 3).
The original Hadlock weight estimation functions (OH1 and OH2) were based on a Houston population sample (23,24) (Table 1). When these published models were applied to our Michigan sample, their mean systematic weight estimation errors indicated an overestimation that ranged between 7.7 and 8.8 percent of actual BW. The best precision (random weight estimation error = 8.5% of BW) was obtained using a 3-predictor model that included BPD², AC, and FDL. Modified Hadlock weight estimation models (MH1 and MH2), using the same anatomic parameters as OH1 and OH2, were also developed for our study sample. The modified Hadlock models (MH1 and MH2) were very accurate with mean systematic errors that were not statistically different from zero (p-value of Student’s t-test = 0.5069 and 0.654, respectively) and random errors that were similar to the performance of the original Hadlock models ranging from 7.6 to 8.3 percent.
Table 2 summarizes six optimized multiple regression models with their coefficients and y-intercepts. Soft tissue parameters were combined with conventional 2D measurements for estimating the natural logarithm of birth weight as the outcome variable. Models 2 and 5 consisted of two parameters that included the fetal trunk and limb. Models 3 and 6 consisted of three parameters that included the fetal head, trunk, and limb. The systematic and random errors of these new models, using fractional limb volume, are also summarized. All of them had high adjusted r-square values. Mean squared errors were lowest for Models 2, 3, 5, and 6. Models 3 and 6 had the lowest mean percent differences or systematic errors, ranging from 0.12 to 0.18 percent, and were not significantly different from zero (p < 0.0001, Student’s t-test). Models 3 and 6 demonstrated the most precise weight predictions (random error = 6.6% of BW). The standard deviations of the percent differences for two-parameter models (Model 2 or Model 5) or three-parameter models (Model 3 or Model 6) were significantly lower when compared to their corresponding modified Hadlock models (p < 0.05).
Table 3 compares the proportion of newborn infants with EFW results that were correctly classified as being within 5 or 10 percent of BW.
First, two-parameter models of the trunk and limb were compared. The original Hadlock model OH1 correctly classified 30.5% and 53.1% of newborns within 5% and 10% of BW. The corresponding new two-parameter models correctly classified a significantly greater proportion of infants within 5% (Model 2 = 50.8%; Model 5 = 56.4%) or 10% (Model 2 = 84.8%; Model 5 = 84.9%) (p < 0.0001). Next, the three-parameter models of the head, trunk, and limb were also compared. The original Hadlock model OH2 correctly classified 35.7% and 63.6% of newborns within 5% and 10% of BW. The corresponding new three-parameter models classified a significantly greater proportion of infants within 5% (Model 3 = 50.4%; Model 6 = 57.3%) or 10% of BW (Model 3 = 89.8%; Model 6 = 84.1%) (p < 0.0001).
The two-parameter modified Hadlock model (MH1) classified 47.3% and 76.6% of newborns within 5% and 10% of BW. By comparison, the three-parameter modified Hadlock model (MH2) classified 50.0% and 80.0% of newborns within 5% and 10% of BW. Based on EFW comparisons within 5% shown in Table 3, no significant differences were found between Models 2 and 3 when compared to their corresponding modified Hadlock models. Model 5 classified an additional 9.1% and 8.3% of the fetuses within 5% and 10% of BW. Similarly, Model 6 classified an additional 7.3% and 4.1% of infants within 5% and 10% of BW.
Figure 4 summarizes the relationship between systematic and random errors for all fetal weight estimation models. Models 6 and 3 demonstrated the best overall accuracy and precision when compared to the modified Hadlock models (MH1 and MH2).
In order to determine the benefit of adding a soft tissue parameter to AC alone, we compared the following weight estimation models that differed by only one term - the presence or absence of fractional limb volume.
|Ln BW = -6.7673 + 5.9976 (Ln AC) - 0.5133 (Ln AC)²||Adjusted R² = 0.958|
|Ln BW = -3.6138 + 4.6761 (Ln AC) - 0.4959 (Ln AC)² + 0.3795 (Ln AVol)||Adjusted R² = 0.978|
The addition of (Ln AVol) to a model already containing (Ln AC) and (Ln AC)² improved EFW by explaining an additional 2.0% of the total variance in Ln BW (p < 0.0001).
|Ln BW = 4.7806 + 0.7596 (Ln TVol)||Adjusted R² = 0.961|
|Ln BW = 2.1264 + 1.1461 (Ln AC) + 0.4314 (Ln TVol)||Adjusted R² = 0.980|
A model that was based on Ln TVol alone explained 96.1% of the total variation in Ln BW. The addition of Ln AC accounted for an additional 1.9% of the total variance in Ln BW (p < 0.0001).
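To make the thigh-volume comparison equations concrete, the sketch below evaluates them and back-transforms Ln BW to grams by simple exponentiation. The coefficients are copied from the two equations above; the units (AC in cm, TVol in mL, BW in grams), the function names, and the example inputs are assumptions, since units are not stated alongside the equations. These are the comparison models from this section, not the final Models 1 to 6 of Table 2.

```python
import math


def efw_tvol_only(tvol):
    """Ln BW = 4.7806 + 0.7596 Ln TVol, back-transformed with exp()."""
    ln_bw = 4.7806 + 0.7596 * math.log(tvol)
    return math.exp(ln_bw)


def efw_ac_tvol(ac, tvol):
    """Ln BW = 2.1264 + 1.1461 Ln AC + 0.4314 Ln TVol, back-transformed with exp()."""
    ln_bw = 2.1264 + 1.1461 * math.log(ac) + 0.4314 * math.log(tvol)
    return math.exp(ln_bw)


# Hypothetical late third-trimester inputs (assumed units: cm and mL).
estimate_two_term = efw_ac_tvol(35.0, 100.0)
estimate_one_term = efw_tvol_only(100.0)
```

Exponentiating a log-scale prediction estimates the median rather than the mean of BW; whether any retransformation correction was applied in the study is not stated, so none is applied here.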
Sample size calculations were based on the two optimal models that included (Ln AVol) or (Ln TVol) parameters. A sample of 138 patients achieved nearly 100% power to detect their respective R-square values of 0.9897 and 0.9873 attributed to four independent variables. This is based on an F-Test with a significance level (alpha) of 0.05.
Intrauterine malnutrition is a commonly suspected cause of poor fetal growth from protein and/or micronutrient deficiencies (40). Although this condition cannot be precisely established during fetal life, it is biologically plausible that a malnourished fetus would manifest insufficient or excessive soft tissue development. Indeed, there is mounting epidemiological and clinical evidence for an association between fetal programming of body composition and musculoskeletal development (41). For example, BW and poor prenatal nutrition are associated with altered fat distribution (42,43), reduced muscle mass (44), and low bone mineral density (45,46), all of which have the potential for affecting cell numbers, altering stem cell function, and resetting of regulatory hormones during later adult life. Ay and colleagues (47) recently described an association between fetal weight changes during late pregnancy and postnatal “catch up” growth within 6 weeks after birth. Their investigation used dual energy x-ray absorptiometry (DXA) scans in the same infants at 6 months to demonstrate that these fetal and postnatal growth patterns were significantly correlated with body composition into early childhood. A related longitudinal investigation of 1,012 children from the “Generation R” project also found that subcutaneous fat mass tends to track in the first 2 years after birth (48). The aforementioned studies underscore the importance of fetal nutritional assessment and its potential impact on the continuum of health and disease during adult life.
Fractional limb volume has been proposed for the detection and monitoring of malnourished fetuses (21). This concept is supported by an anthropometric study of neonatal body composition that estimated lean body and fat mass in 188 newborn infants within 24 hours of birth. Although neonatal fat mass constituted only 14% of total birth weight, it explained 46% of its variance (22). However, widely accepted weight estimation models do not usually consider the clinical significance of fetal soft tissue in routine obstetrical practice. This practical limitation is partially explained by technical challenges related to the reproducibility of fetal soft tissue measurements.
We have previously described the relationship between two-dimensional sonographic parameters, fractional limb volume, EFW, and BW to neonatal percent body fat using air displacement plethysmography. Fractional thigh volume had the greatest correlation to percent body fat in third trimester newborn infants (50). Similar to actual birth weight, the TVol predictor explained 46.1% of the variability in percent body fat. Abdominal circumference and EFW accounted for only 24.8% and 30.4% of the variance in percent body fat, respectively. Khoury and colleagues (51) also correlated 2D sonographic parameters, fractional thigh volume, and birth weight with neonatal skin fold measurements. They concluded that fractional thigh volume reflects neonatal fat mass and is better correlated with BW than conventional 2D measurements.
The widely accepted Hadlock weight estimation models (Houston, Texas) were used as an initial basis for comparison (27,28). In the present study, one of the original Hadlock models (BPD, AC, FDL) was associated with an 8 percent systematic error for our study sample. This overestimation may be related to multicollinearity from the interaction between two or more highly correlated predictor variables. Multicollinearity can cause relatively large standard errors of model coefficients and increased variability in weight estimates when these formulae are applied to different populations. To minimize this effect, “modified Hadlock” functions and model coefficients were developed using the same sample from which the new prediction formulae were derived. A more objective comparison of model performance was achieved by substituting a 2D limb parameter, such as (Ln FDL), with a corresponding limb volume parameter such as (Ln TVol). Relatively few sonographic studies have correlated anatomic parameters with BW using sample specific model coefficients (20, 51).
Lindell and Marsal (52) recently reported fetal weight estimation using fractional thigh volume for a Swedish population. They studied 176 pregnant women ≥ 287 days of gestation within 4 days of delivery. The formula of Persson and Weldner (BPD, abdominal diameter, and FDL) (53) was compared to preliminary weight estimation models (BPD, AC, TVol) that were previously reported by our research group (54,55). A new formula (head circumference, abdominal diameter, abdominal volume, and TVol) was developed for their population sample. Both the Persson model (53) and our preliminary fetal weight estimation model using TVol (55) yielded the smallest random weight estimation errors of 6.3%, although the latter led to underestimated mean percent differences of 6.0%. For 63 subjects, their new volume-based model (head circumference, abdominal diameter, abdominal volume, TVol) resulted in a mean percent difference of 0.3 ± 5.6%.
A more appropriate and objective comparison of fetal weight estimation models can be made if the model coefficients (for both new models and those from the literature) are derived from the same local study population. Our data underscore the importance of using sample-specific model coefficients in comparing the performance of new weight estimation models with published methods (Table 1). Studies that do not compare models with sample-specific coefficients are subject to weight estimation errors due to differences in sample characteristics, which appear to affect systematic errors more than random errors. In this context, Siemer and colleagues (56) retrospectively compared 3,975 pregnancies with commonly used weight estimation models derived from regression analysis (57-60). The Hadlock model (BPD, AC, FDL) had the lowest systematic weight estimation error of -0.28 percent (22,23). Seven other models had systematic errors ranging from -8.84% to +5.28%. The best random weight estimation error of 9.49% (SD of percent differences) resulted from using the model of Dudley (EFW = 0.32 × AC² × FDL + 0.053 × HC² × FDL) (60). By comparison, the Hadlock model had a random weight estimation error of 10.0 percent in their study. Many of these discrepancies may have resulted from comparing models that were derived from different populations. In our investigation, the original Hadlock model (OH2) had a mean error of 7.7 ± 8.5%; the systematic weight estimation error was improved by applying a modified Hadlock MH2 model (0.29 ± 7.6%), using specific coefficients from our patient sample.
Our results indicate that fractional limb volume can be combined with 2D sonographic predictors of the head and trunk to improve the precision of EFW. Inclusion of a limb soft tissue component may also provide a novel index of fetal soft tissue development as part of the weight estimation procedure. Several statistical modeling techniques were used to develop optimal fetal weight estimation models that included soft tissue parameters. The substitution of fractional limb volume for FDL, use of natural logarithmic transformations with weighted regression analysis, and selective application of squared transformed parameter terms essentially reduced the random error to 6.6%. A validation study from an independent sample is currently underway to examine the performance of these new fetal weight estimation models in all weight groups, including macrosomic fetuses, before they can be confidently adopted for routine obstetrical care.
The authors wish to acknowledge the technical assistance of Melissa Powell, RDMS and Beverley McNie, BS, CCRP. This research was supported (in part) by the Perinatology Research Branch, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, NIH, DHHS.