Studies of the association between prenatal methylmercury exposure from maternal fish consumption during pregnancy and neurodevelopmental test scores in the Seychelles Child Development Study have found no consistent pattern of associations through age nine years. The analyses for the most recent nine-year data examined the population effects of prenatal exposure, but did not address the possibility of non-homogeneous susceptibility. This paper presents a regression tree approach: covariate effects are treated nonlinearly and non-additively, and non-homogeneous effects of prenatal methylmercury exposure are permitted among the covariate clusters identified by the regression tree. The approach allows us to address whether children in the lower or higher ends of the developmental spectrum differ in susceptibility to subtle exposure effects. Of twenty-one endpoints available at age nine years, we chose the Wechsler Full Scale IQ and its associated covariates to construct the regression tree. The prenatal mercury effect in each of the nine resulting clusters was assessed linearly and non-homogeneously. In addition, we reanalyzed five other nine-year endpoints that in the linear analysis had a two-tailed p-value <0.2 for the effect of prenatal exposure. In this analysis, motor proficiency and activity level improved significantly with increasing MeHg for 53% of the children who had an average home environment. Motor proficiency significantly decreased with increasing prenatal MeHg exposure in 7% of the children whose home environment was below average. The regression tree results support previous analyses of outcomes in this cohort. However, this analysis raises the intriguing possibility that an effect may be non-homogeneous among children with different backgrounds and IQ levels.
Mercury is naturally present in the environment and human activities bring it in close contact with people. In aquatic environments, some bacteria can methylate inorganic mercury to form methylmercury (MeHg) and it enters the food chain. It is known that MeHg can damage the developing fetal central nervous system at high doses. Pregnant women who consume fish expose their fetus to low dosages of MeHg. It has been proposed that prenatal exposures to MeHg, measured as total mercury (THg) concentrations above 10 ppm in maternal hair, may be associated with declines in neurodevelopment (Cox et al., 1989). Frequent fish consumers are known to reach such levels of exposure (Airey, 1983). Epidemiologic studies have examined this possibility of an association between prenatal MeHg exposure from fish consumption and child development and reported both its presence (Myers et al., 1995a, b; Grandjean et al., 1997; Grandjean et al., 1999; Grandjean et al., 2003) and absence (Marsh et al., 1995a; Myers et al., 1995c, Davidson et al., 1995; Davidson et al., 1998; Myers et al., 2003). Recent scientific reviews have reexamined the data from human studies of the prenatal MeHg health risks from fish consumption (National Institute of Environmental Health Sciences 1998; National Research Council 2000), but many questions remain unresolved. Given that fish consumption is reported to have significant nutritional benefits, it continues to be controversial whether public health authorities should encourage fish consumption for health benefits or discourage it for possible risk from MeHg. Huang et al. (2005) suggested that the effect of MeHg exposure through fish consumption might be nonlinear. They found that dosages between background and about 10-12 ppm were not associated with any changes in neurodevelopmental outcomes while higher exposures were associated with a decline in endpoint scores.
We have examined the association between prenatal exposure to MeHg from maternal fish consumption and child development in the Republic of Seychelles. The study population is a cohort of 779 children. Prenatal exposure to mercury was assessed in a segment of maternal hair corresponding to growth during pregnancy. The children's neurological and developmental status was evaluated at ages 6.5, 19, 29, 66, and 107 months (approximately 9 years) (Myers et al., 1995b; Davidson et al., 1995; Davidson et al., 1998; Myers et al., 2003) using standard multiple linear regression analysis. Through age 9 years, no consistent association between prenatal MeHg exposure and child development has been identified in this cohort. The most recent analysis, of the data at 9 years of age using conventional linear regression (Myers et al., 2003), found, out of 21 endpoints, only one association where a developmental score decreased with increasing prenatal exposure and one association where the endpoint score improved with increasing exposure. Myers et al. (2003) concluded that the data provided little evidence for a declining association between prenatal MeHg exposure at the levels studied (mean prenatal MeHg 6.9 ppm) and child development. A more flexible analysis using semiparametric additive models (Huang et al., 2005) confirmed these findings, but raised the possibility that there may be associations where outcomes declined in the prenatal exposure range above 12 ppm, supporting the Iraq study (Cox et al., 1989).
The analyses for the Seychelles nine-year data assume that the relationship between MeHg and child development is homogeneous across all children in the population, i.e., a population model. A population model is useful for examining the overall exposure/outcome relationship. However, if the relationship is non-homogeneous, i.e., different between subgroups of the population, a population model may fail to detect subgroups that are possibly at risk. For example, although there was no evidence for an association between IQ and prenatal MeHg exposure in the 9-year analysis (Myers et al., 2003), there could be subgroups of children that are more susceptible. It may be hypothesized that children with lower home environmental stimulation might be more likely to show subtle effects of prenatal MeHg exposure on specific developmental domains than children from more stimulating home environments. This does occur with some toxins such as lead (Bellinger et al., 1988). If this were the case for MeHg, such a non-homogeneous relationship would not have been detected by either conventional linear regression analysis or semiparametric additive models unless the model contained interactions between MeHg and subgroup indicators. However the subgroups are not likely to be known in advance and hence conventional models rarely include such interactions. To test the hypothesis of nonhomogeneity, we reanalyzed the nine-year data using regression tree methods (Breiman et al., 1984).
The SCDS main cohort consists of 779 mother-child pairs who were enrolled in 1989-1990, representing about 50% of the live births during that time period. There were 717 children from the cohort still eligible for the study at age 9 years, and 643 returned for testing at an average age of 107 months. Detailed descriptions of the cohort and exclusion criteria have been published elsewhere (Myers et al., 1995c; Marsh et al., 1995b; Shamlaye et al., 1995; Myers et al., 1997; Davidson et al., 1998; Myers et al., 2003). The study was approved by the research subjects boards of the University of Rochester and the Ministry of Health in the Republic of Seychelles. Informed consent was obtained from each child's parent or guardian before the child participated in the study.
Prenatal exposure was measured using the mean of the total mercury (THg) concentration in the longest available segment of maternal hair representing growth during pregnancy, as discussed previously (Cernichiari et al., 1995). This is the prenatal exposure index used in the SCDS analyses and in nearly all earlier studies. Total mercury was used as the measure of exposure because 80% of THg in hair samples collected from fish eating populations is MeHg (Cernichiari et al., 1995; Phelps et al., 1980). Total mercury was measured by cold vapor atomic absorption and correlates well with MeHg levels in maternal blood and infant brain (WHO 1990; Cernichiari et al., 1995).
At 9 years of age, each child was given a battery of developmental tests that resulted in 21 endpoints assessing cognition, language, memory, motor, perceptual-motor and behavioral functions (Myers et al., 2003). We selected for this analysis the WISC-III Full-scale IQ (FSIQ; Wechsler, 1991), a well-known measure of global cognition, and those endpoints from the linear regression analysis for which the coefficient describing an effect of prenatal exposure was different from 0 with a p-value of less than 0.2 (Myers et al., 2003). The other endpoints analyzed were the California Verbal Learning Test (CVLT) Short Delay (Delis et al., 1994), which also measures global intelligence; the Grooved Pegboard (Knights and Moule, 1968) (both the dominant and non-dominant hands) and the Bruininks-Oseretsky (B-O) Test of Motor Proficiency (Bruininks, 1978) for motor function; and the Connors Teacher Rating Scale (CTRS) (Connors, 1985), measuring the child's activity level. For the Grooved Pegboard and CTRS a higher score indicates poorer performance, while for the remaining tests an increase in the score is associated with improved performance.
In addition to maternal hair THg levels as an independent variable, the linear analysis included the following covariates, which were chosen a priori: sex, maternal age, test examiner, caregiver's intelligence (Kaufman Brief Intelligence Test, KBIT; Kaufman and Kaufman, 1990), child's medical history, the Family Resource Scale (FRS; Dunst et al., 1994), a family status code (2, 1, or no biological parents at home), the Hollingshead measure of socioeconomic status (SES), the Henderson Environmental Learning Profile Scale (Henderson et al., 1972), the child's age at testing, the child's home environment during toddlerhood (the Caldwell-Bradley Home Observation for Measurement of the Environment, HOME; Caldwell and Bradley, 1984), the child's hearing score, and a measure of recent postnatal MeHg exposure. The HOME, SES and hearing score were categorized in the linear analyses (Myers et al., 2003), but we used their original scale for the regression tree since the covariate effects in the regression tree are non-additive and non-linear.
In the 643 cohort children available at age 9 years, 500 had a 1-cm segment of hair closest to the scalp analyzed, representing approximately one month of recent MeHg exposure. The remaining 143 (all males) had shaved their heads for stylistic purposes. When 9-year postnatal hair samples were not available, we used a hair value obtained at an earlier time point. The postnatal hair value for 129 subjects was taken at 66 months and for 14 subjects at 48 months. The correlation between 66- and 107-month hair values was 0.44.
Statistical regression tree methods (Breiman et al., 1984) provide an alternative approach to multiple linear regression for data analysis, in that the covariate effects are non-linear and non-additive. The CART algorithm (Breiman et al., 1984) and the tree functions in the S-PLUS (Mathsoft, 2005) software package have made tree methods a popular tool in statistical applications. When there are a large number of explanatory variables and a complex relationship is expected between the response variable and independent variables, regression tree methods may be more adept at capturing non-additive and non-homogeneous behavior and are sometimes easier to describe and interpret than linear models. In this reanalysis, we first identified the pattern of covariate effects at different IQ levels using a regression tree without regard to MeHg exposure and then formed clusters that represent subgroups of the population with different developmental conditions.
For this analysis, observations with missing information were ignored. The term data space means those observations with complete covariates and MeHg exposure corresponding to a response variable. The approach for forming subgroups (clusters) by the regression tree algorithm (Breiman et al., 1984) may be explained as follows. The data space is recursively partitioned into two groups by choosing a cut point for one of the predictor variables (growing a tree), such that the outcome variable differs as much as possible, statistically, between the two groups. The right size of the tree is then found (pruning the tree) by a cross-validation approach, as in assessing the bias and variance trade-off for common regression problems. Statistically, the tree is pruned by snipping off the least important splits based upon the amount of variance that each explains, so that the analysis focuses increasingly on those branches that are most important (the terminal nodes). For the FSIQ response variable, we first “grew the tree” with all the covariates except pre- and postnatal exposure, so that the covariate effects were treated non-linearly and non-additively. Then, applying the “cross-validation” approach, the optimal tree was determined by balancing the statistical deviance and the tree size.
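The single partitioning step described above can be illustrated with a minimal sketch. It assumes a squared-error (deviance) criterion and one numeric predictor, and it is not the S-PLUS tree routine used in the analysis; the function names and the toy data are hypothetical, and pruning by cross-validation is not shown.

```python
from statistics import fmean

def sum_sq(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(x, y, min_leaf=1):
    """Return (cut_point, deviance_after_split) for one predictor, or None.

    Every midpoint between adjacent distinct x values is tried; the cut that
    minimizes the summed within-group squared deviation of y is kept.
    """
    pairs = sorted(zip(x, y))
    best = None
    for k in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no cut point exists between tied predictor values
        left = [yy for _, yy in pairs[:k]]
        right = [yy for _, yy in pairs[k:]]
        dev = sum_sq(left) + sum_sq(right)
        if best is None or dev < best[1]:
            cut = (pairs[k - 1][0] + pairs[k][0]) / 2
            best = (cut, dev)
    return best

# Hypothetical toy data: a caregiver score (x) and a child outcome (y).
x_toy = [80, 85, 90, 95, 100, 105]
y_toy = [70, 72, 71, 85, 86, 84]
cut_toy, dev_toy = best_split(x_toy, y_toy)
```

Growing a full tree would repeat this search over every covariate and then recurse within each resulting group; pruning then removes the splits that reduce the deviance least.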
After partitioning the data into clusters, we then considered statistical models that included the MeHg exposure. Let Yij denote the j-th observation in cluster i, and Xij and Zij denote corresponding pre- and post-natal exposure respectively. The models under consideration are
Model (1): E(Yij) = αi + β Xij + γ Zij,
Model (2): E(Yij) = αi + βi Xij + γ Zij.
Model (1) assumes different means αi, representing the homogeneous covariate effect within each cluster and common linear trends of pre- and postnatal MeHg exposure. Model (2) allows separate (heterogeneous) linear effects of prenatal exposure within each cluster. For model (1), we also explored gender by prenatal mercury exposure interactions, and when significant, the interaction terms were retained in the model. The S-PLUS software package, version 7.0, was used to construct the regression tree and fit the two models.
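The difference between the two models can be sketched numerically. The example below is a simplified illustration, not the fitting procedure used in the study: it omits the postnatal term γ Zij, uses ordinary least squares, and the function names and toy data are hypothetical. With cluster-specific intercepts, the common slope of model (1) is the pooled within-cluster slope, while model (2) fits one slope per cluster.

```python
from statistics import fmean

def within_cluster_sums(xs, ys):
    """Return (Sxy, Sxx), both centered on the cluster's own means."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy, sxx

def common_slope(clusters):
    """Model (1) without Z: one slope beta, separate intercepts alpha_i."""
    sums = [within_cluster_sums(xs, ys) for xs, ys in clusters.values()]
    return sum(s[0] for s in sums) / sum(s[1] for s in sums)

def cluster_slopes(clusters):
    """Model (2) without Z: a separate slope beta_i in every cluster."""
    slopes = {}
    for name, (xs, ys) in clusters.items():
        sxy, sxx = within_cluster_sums(xs, ys)
        slopes[name] = sxy / sxx
    return slopes

# Hypothetical data: exposure X (ppm) and outcome Y in two clusters.
toy = {
    "A": ([2.0, 4.0, 6.0, 8.0], [80.0, 82.0, 84.0, 86.0]),  # rises with X
    "B": ([2.0, 4.0, 6.0, 8.0], [75.0, 73.0, 71.0, 69.0]),  # falls with X
}
beta_common = common_slope(toy)       # opposite trends cancel in model (1)
beta_by_cluster = cluster_slopes(toy)  # model (2) recovers each trend
```

In this constructed case the common slope is zero even though each cluster has a clear trend, which is the kind of non-homogeneous effect that model (2) is meant to reveal.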
The tree based on the FSIQ and corresponding covariates is shown in Figure 1 and contains nine clusters. The cluster index is arbitrarily numbered 1 through 9 from left to right. The cluster rules, proportions of observations in each cluster, and the mean (SD) of their pre- and postnatal exposure are presented in Table 1. Cluster 9 has a high average IQ of 107 (range 90-139) and contains only 1.38% of the observations, with high SES and a maternal IQ ≥ 95.5. Cluster 8 has the second highest average IQ of 100 (range 82-107) and contains only 0.98% of the children. At the other extreme, cluster 1 has the lowest average IQ of 62.9 (range 48-82) with only 1.77% of the observations. Cluster 5 has the largest proportion of observations, 53.35%, with an average IQ of 80.8 (range 52-108). The prenatal exposure appears similar between the clusters. Cluster 8 has the highest average prenatal exposure, 8.8 ppm, influenced by two high values above 15 ppm. The tree model (E(Yij) = αi) for the FSIQ has a very high R-squared of 0.987. The cluster mean values (the αi's, the numbers at the final nodes in Figure 1) were all significant, supporting the nine-cluster partition.
The covariate effects seen in this model are important since they confirm the sensitivity of the endpoints to factors already known to influence child development. Overall the associations between the covariates and the FSIQ in the regression tree were consistent with the linear analysis (Myers et al., 2003) and with developmental theory. Maternal IQ was selected as the first partition variable; the two groups have an average child's FSIQ of 79.3 (range 48-108) and 85.4 (range 65-139). Home environment (HOME), SES, and family resources (FRS) were also entered as partition criteria in the regression tree. Not surprisingly, the regression tree suggests that the child's home environment (HOME and FRS) can improve a child's IQ beyond biological inheritance from the parents. Between clusters 7 and 8, a high FRS is associated with an improved child's IQ, and similarly between clusters 5 and 6, a relatively higher HOME score is associated with an improved child's IQ.
As in the linear analysis (Myers et al., 2003), the postnatal hair concentration of MeHg was used as a covariate to adjust for prenatal effects. The coefficient estimates for postnatal MeHg exposure for models (1) and (2) are given in Table 2. For one of the five endpoints, the CTRS, the results suggest an adverse association with postnatal exposure.
We now consider models (1) and (2) with MeHg exposure. For five endpoints, the model fit had R-squared values higher than those in the linear analysis (see Table 2). In Myers et al. (2003), the R-squared values range from 0.105 for CTRS to 0.199 for the FSIQ. For the CVLT-short delay, the R-squared (0.047 and 0.059 for Models (1) and (2) respectively) was lower than the 0.117 found in the linear analysis, and consequently this endpoint was dropped from further consideration. The coefficient estimates for MeHg exposure for the remaining five endpoints are presented in Table 2. The estimates for the αi's for models (1) and (2) were similar to those in the FSIQ regression tree, and hence are omitted for brevity. Figure 2 shows the fitted linear trends for each cluster for the FSIQ for models (1) and (2). Both fitted models (1) and (2) have a very high goodness-of-fit (R-squared) of at least 94%. For model (1), assuming a homogeneous linear trend of pre- and postnatal exposure, performance on the B-O and CTRS tests improved significantly with increasing prenatal exposure; the latter is consistent with results reported earlier (Myers et al., 2003). The sex by prenatal MeHg exposure interactions were significant for the Grooved Pegboard (non-dominant hand), as in the linear analysis. For model (2), allowing separate slope parameters for prenatal exposure, B-O test scores improved significantly for cluster 5 with increasing MeHg exposure; see Figure 3(a). Other small but non-significant p-values were observed for cluster 4 for the Grooved Pegboard (non-dominant hand), and for clusters 5 and 8 for the CTRS.
In addition, the fitted models (1) and (2) were rerun without outliers. Most of the results were unchanged, with the following exceptions. Under model (1), Grooved Pegboard (non-dominant hand) performance declined only in males, but the effect was marginally significant (slope=0.51, p=0.055), consistent with Myers et al. (2003). In addition, CTRS scores in clusters 5 and 8 significantly improved with increasing MeHg (slopes -0.37 and -1.98, p=0.03 and 0.042 respectively). Given that cluster 8 contains only 0.98% of children, the significance is sensitive to influential points and therefore may not be useful. Increasing MeHg in cluster 2 was associated with a significant decline in Grooved Pegboard (dominant hand) performance (slope=1.34, p=0.018), as shown in Figure 3(c).
Regression trees form a potentially useful alternative to continuous linear regression models. The goodness of fit for five out of six endpoints, as measured by the R-squared, is much higher in this reanalysis than in the linear regression analysis in previous studies of the same data. In addition, the covariate effects are simpler to describe as cluster effects. The main advantage of the regression tree approach is that it allows us to explore non-homogeneous exposure effects in different clusters. As expected, there were some clusters with small sample sizes. However, the design of the Seychelles study focuses on the overall population effect and may not be powered to identify subtle effects at either end of the developmental spectrum.
The regression tree based on the FSIQ was consistent with child development theory and further confirms that a child's development can be improved by providing a more stimulating home environment and better family support. Although our main study was designed to examine prenatal exposure, we included a measure of the postnatal exposure, since fish consumption and consequently MeHg exposure is continuous in this society. We found one of the five endpoints we examined, the CTRS, had an adverse association with postnatal exposure. This association may be fortuitous, but its interpretation is not clear and more extensive postnatal analyses are currently in progress.
For prenatal mercury exposure, this tree-structured reanalysis with a common linear effect of exposure (Model (1)) confirms the results reported earlier (Myers et al., 2003). The models for both the CTRS and the Grooved Pegboard (non-dominant hand) have higher R-squared values and the same association with prenatal exposure. In addition, B-O test scores improved significantly with increasing MeHg exposure in the reanalysis. For every 10 ppm increase in exposure, the B-O mean score improved by 1.3 points. The MeHg coefficient for the B-O test score was also positive in the linear analysis (coefficient estimate 0.093), but did not reach significance there (p=0.10). When allowing separate prenatal exposure effects for different subgroups (Model (2)), B-O test scores improved significantly with increasing MeHg exposure for cluster 5, a group of 53.35% of the children who had average developmental stimulation. When outliers were removed, CTRS test scores improved significantly with increasing MeHg for cluster 5 and Grooved Pegboard (dominant hand) performance declined significantly with increasing MeHg for cluster 2. The Grooved Pegboard (dominant hand) was not significant in the linear analysis. Cluster 2 had 7.28% of the children and had a mean IQ of 71.9 (range 53-87). This association indicated that with every 10 ppm increase of exposure, the child needed on average an additional 13.4 seconds to complete the task (95% confidence interval [2.3, 24.4]).
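The arithmetic behind the last figure can be checked with a short sketch. It rescales the reported cluster 2 slope (1.34 per ppm) to a 10 ppm increase and, assuming a symmetric normal-approximation confidence interval, back-calculates a standard error and a two-sided p-value. The normal approximation is an assumption made here for illustration; the original analysis may have used a t-based interval, and the variable names are hypothetical.

```python
import math

slope_per_ppm = 1.34          # reported slope for cluster 2 (seconds per ppm)
ci_low, ci_high = 2.3, 24.4   # reported 95% CI for a 10 ppm increase

# Effect of a 10 ppm increase in prenatal exposure, in seconds.
effect_per_10ppm = 10 * slope_per_ppm

# Assumption: the interval is estimate +/- 1.96 standard errors.
z_975 = 1.959964
se_per_10ppm = (ci_high - ci_low) / (2 * z_975)

# Two-sided p-value under the normal approximation.
z_stat = effect_per_10ppm / se_per_10ppm
p_two_sided = math.erfc(z_stat / math.sqrt(2))
```

Under these assumptions the back-calculated p-value is close to the reported p=0.018, so the slope, interval, and p-value are mutually consistent.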
In summary, this reanalysis using the regression tree approach supports the findings from the primary linear regression analysis (Myers et al., 2003). We continue to find no consistent evidence for effects from prenatal MeHg through fish consumption at the exposure levels present in the Seychelles Child Development cohort. However, the current results do raise an interesting point that goes beyond previous analyses: the exposure/outcome relationship may not be homogeneous across all individuals. Using the regression tree methods, some associations appeared that were not present using other analysis methods. The B-O and CTRS showed improvement as MeHg exposure increased for children with average stimulating environments. In addition, Grooved Pegboard (dominant hand) performance declined with increasing MeHg in children with lower scores on the HOME, indicating they had a less stimulating home environment. These findings suggest that the effects of prenatal MeHg exposure from maternal fish consumption during pregnancy are complex and may not be homogeneous between children with different backgrounds and developmental environments. Further study is needed to see if these associations are consistent and related to mercury exposure.
This research was supported by Grants R01-ES10219; R01-08442; ES-01247, and T32 ES-007271 from the National Institutes of Health; the Food and Drug Administration, U.S.D.H.H.S., and by the Ministry of Health, Republic of Seychelles.