Six of the 34 studies identified were excluded because of missing information on the number of subjects or the mean and variance of the outcome [see the study selection flow chart and Supplemental Material, Table S1 (http://dx.doi.org/10.1289/ehp.1104912), for additional information on studies that were excluded from the analysis]. Another study (
Trivedi et al. 2007) was excluded because SDs reported for the outcome parameter were questionably small (1.13 for the high-fluoride group, and 1.23 for the low-fluoride group) and the SMD (–10.8; 95% CI: –11.9, –9.6) was > 10 times lower than the second smallest SMD (–0.95; 95% CI: –1.16, –0.75) and 150 times lower than the largest SMD (0.07; 95% CI: –0.083, 0.22) reported for the other studies, which had relatively consistent SMD estimates. Inclusion of this study in the meta-analysis resulted in a much smaller pooled random-effects SMD estimate and a much larger I² (–0.63; 95% CI: –0.83, –0.44; I² = 94.1%) compared with the estimates that excluded this study (–0.45; 95% CI: –0.56, –0.34; I² = 80%) (see Supplemental Material, Figure S1). Characteristics of the 27 studies included are shown in Table 1 (
An et al. 1992; Chen et al. 1991; Fan et al. 2007; Guo et al. 1991; Hong et al. 2001; Li FH et al. 2009; Li XH et al. 2010; Li XS 1995; Li Y et al. 1994; Li Y et al. 2003; Lin et al. 1991; Lu et al. 2000; Poureslami et al. 2011; Ren et al. 1989; Seraj et al. 2006; Sun et al. 1991; Wang G et al. 1996; Wang SH et al. 2001; Wang SX et al. 2007; Wang ZH et al. 2006; Xiang et al. 2003; Xu et al. 1994; Yang et al. 1994; Yao et al. 1996, 1997; Zhang JW et al. 1998; Zhao et al. 1996). Two of the studies included in the analysis were conducted in Iran (Poureslami et al. 2011; Seraj et al. 2006); the other study cohorts were populations from China. Two cohorts were exposed to fluoride from coal burning (Guo et al. 1991; Li XH et al. 2010); otherwise populations were exposed to fluoride through drinking water. The CRT-RC was used to measure the children’s intelligence in 16 studies. Other intelligence measures included the Wechsler Intelligence tests (3 studies; An et al. 1992; Ren et al. 1989; Wang ZH et al. 1996), Binet IQ test (2 studies; Guo et al. 1991; Xu et al. 1994), Raven’s test (2 studies; Poureslami et al. 2011; Seraj et al. 2006), Japan IQ test (2 studies; Sun et al. 1991; Zhang JW et al. 1998), Chinese comparative intelligence test (1 study; Yang et al. 1994), and the mental work capacity index (1 study; Li Y et al. 1994). Because each of the intelligence tests used is designed to measure general intelligence, we used data from all eligible studies to estimate the possible effects of fluoride exposure on general intelligence.
| Table 1. Characteristics of epidemiological studies of fluoride exposure and children’s cognitive outcomes. |
In addition, we conducted a sensitivity analysis restricted to studies that used similar tests to measure the outcome (specifically, the CRT-RC, Wechsler Intelligence test, Binet IQ test, or Raven’s test), and an analysis restricted to studies that used the CRT-RC. We also performed an analysis that excluded studies with co-exposures including iodine and arsenic, or with non-drinking-water fluoride exposure from coal burning.
Pooled SMD estimates. Among the 27 studies, all but one showed random-effects SMD estimates that indicated an inverse association, ranging from –0.95 (95% CI: –1.16, –0.75) to –0.10 (95% CI: –0.25, 0.04). The study with a positive association reported an SMD estimate of 0.07 (95% CI: –0.083, 0.22). Similar results were found with the fixed-effects SMD estimates. The fixed-effects pooled SMD estimate was –0.40 (95% CI: –0.44, –0.35), with a p-value < 0.001 for the test for homogeneity. The random-effects SMD estimate was –0.45 (95% CI: –0.56, –0.34) with an I² of 80% and homogeneity test p-value < 0.001. Because of heterogeneity (excess variability) between study results, we used primarily the random-effects model for subsequent sensitivity analyses, which is generally considered to be the more conservative method (Egger et al. 2001). Among the restricted sets of intelligence tests, the SMD for the model with only CRT-RC tests and drinking-water exposure (and to a lesser extent the model with only CRT-RC tests) was lower than that for all studies combined, although the difference did not appear to be significant. Heterogeneity, however, remained at a similar magnitude when the analyses were restricted (Table 2).
| Table 2. Sensitivity analyses of pooled random-effects standardized weighted mean difference (SMD) estimates of child’s intelligence score with high exposure of fluoride. |
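The fixed-effects and random-effects pooling reported above can be illustrated with a minimal sketch. It assumes inverse-variance weighting with the DerSimonian-Laird estimator of between-study variance, which is a common choice but is not confirmed here as the exact routine used in the analysis. The names `se_from_ci` and `pool_smd` are hypothetical, each SE is back-calculated from a reported 95% CI as (upper − lower)/(2 × 1.96), and the three inputs are only the most negative, least negative, and positive SMDs quoted in the text, not the full set of 27 studies.

```python
import math

Z95 = 1.96  # normal quantile for a two-sided 95% CI


def se_from_ci(lower, upper):
    """Back-calculate a standard error from a symmetric 95% CI (assumption)."""
    return (upper - lower) / (2 * Z95)


def pool_smd(smds, ses):
    """Pool SMDs by inverse-variance weighting (hypothetical helper)."""
    # Fixed-effects estimate: weights are 1 / SE^2.
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * y for wi, y in zip(w, smds)) / sum(w)

    # Cochran's Q statistic for homogeneity and its degrees of freedom.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, smds))
    df = len(smds) - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects estimate: weights are 1 / (SE^2 + tau^2).
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    random = sum(wi * y for wi, y in zip(w_re, smds)) / sum(w_re)
    se_random = math.sqrt(1.0 / sum(w_re))

    # I^2: percentage of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "fixed": fixed,
        "random": random,
        "ci_random": (random - Z95 * se_random, random + Z95 * se_random),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


# Illustrative input only: three SMDs and 95% CIs quoted in the text.
smds = [-0.95, -0.10, 0.07]
ses = [
    se_from_ci(-1.16, -0.75),
    se_from_ci(-0.25, 0.04),
    se_from_ci(-0.083, 0.22),
]
result = pool_smd(smds, ses)
print(result)
```

Because the between-study variance is added to every study's variance, the random-effects weights are more nearly equal than the fixed-effects weights, and the resulting confidence interval is wider whenever the estimated between-study variance is above zero.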
Sources of heterogeneity. We performed meta-regression models to assess study characteristics as potential predictors of effect. Information on the child’s sex and parental education was not reported in > 80% of the studies, and only 7% of the studies reported household income. These variables were therefore not included in the models. Of the two covariates included in the models, year of publication (0.02; 95% CI: 0.006, 0.03), but not mean age of the study children (–0.02; 95% CI: –0.094, 0.04), was a significant predictor in the model with all 27 studies included. The residual I² of 68.7% represented the proportion of residual between-study variation due to heterogeneity. From the adjusted R², 39.8% of between-study variance was explained by the two covariates. The overall test of the covariates was significant (p = 0.004).
When the model was restricted to the 16 studies that used the CRT-RC, the child’s age (but not year of publication) was a significant predictor of the SMD. According to the R², 65.6% of between-study variance was explained by the two covariates, and only 47.3% of the residual variation was attributable to heterogeneity. The overall test of both covariates in the model remained significant (p = 0.0053). On further restriction of the model to exclude the 7 studies with arsenic and iodine as co-exposures or with fluoride originating from coal burning (thus including only the 9 with fluoride exposure from drinking water), neither age nor year of publication was a significant predictor, and the overall test of covariates was no longer significant (p = 0.062), in accordance with the similarity of intelligence test outcomes and the source of exposure in the studies included. Although official reports of lead concentrations in the study villages in China were not available, some studies reported a high percentage (95–100%) of low lead exposure (less than the standard of 0.01 mg/L) in drinking-water samples in villages from several study provinces (Bi et al. 2010; Peng et al. 2008; Sun 2010).
Publication bias. A Begg’s funnel plot with the SE of SMD from each study plotted against its corresponding SMD did not show clear evidence of asymmetry, although two studies with a large SE also reported relatively large effect estimates, which may be consistent with publication bias or heterogeneity. The plot appeared symmetrical for studies with larger SE, but with substantial variation in SMD among the more precise studies, consistent with the heterogeneity observed among the studies included in the analysis. Begg (p = 0.22) and Egger (p = 0.11) tests did not indicate significant (p < 0.05) departures from symmetry.
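As a rough illustration of the Egger test mentioned above, the sketch below regresses each study's standardized effect (SMD/SE) on its precision (1/SE) by ordinary least squares; an intercept far from zero suggests funnel-plot asymmetry. The function name `egger_intercept` and the input values are hypothetical and do not come from the included studies. The p-value step (comparing the t statistic with a t distribution on n − 2 degrees of freedom) is left out to keep the example free of third-party dependencies.

```python
import math


def egger_intercept(smds, ses):
    """Return (intercept, se_intercept, t) for Egger's regression (sketch)."""
    n = len(smds)
    if n < 3:
        raise ValueError("need at least 3 studies")

    x = [1.0 / se for se in ses]               # precision
    y = [d / se for d, se in zip(smds, ses)]   # standardized effect

    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same SE")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

    # Ordinary least squares fit of y = intercept + slope * x.
    slope = sxy / sxx
    intercept = my - slope * mx

    # Standard error of the intercept from the residual variance.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_int = math.sqrt(s2 * (1.0 / n + mx ** 2 / sxx))
    t = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t


# Hypothetical data: less precise studies report larger negative SMDs.
example_ses = [0.05, 0.10, 0.20, 0.25]
example_smds = [-0.40, -0.45, -0.60, -0.70]
print(egger_intercept(example_smds, example_ses))
```

When every study estimates the same effect regardless of its precision, the standardized effect is proportional to precision and the fitted intercept is close to zero, which is the symmetry the test looks for.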
Pooled risk ratios. The relative risk (RR) of a low/marginal score on the CRT-RC test (< 80) among children with high fluoride exposure compared with those with low exposure (16 studies total) was 1.93 (95% CI: 1.46, 2.55; I² = 58.5%). When the model was restricted to 9 studies that used the CRT-RC and included only drinking-water fluoride exposure (Chen et al. 1991; Fan et al. 2007; Li XH et al. 2010; Li XS et al. 1995; Li Y et al. 2003; Lu et al. 2000; Wang ZH et al. 2006; Yao et al. 1996, 1997), the estimate was similar (RR = 1.75; 95% CI: 1.16, 2.65; I² = 70.6%). Although fluoride exposure showed inverse associations with test scores, the available exposure information did not allow a formal dose–response analysis. However, dose-related differences in test scores occurred at a wide range of water-fluoride concentrations.
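The study-level quantity behind these pooled estimates is a risk ratio with a confidence interval computed on the log scale. A minimal sketch follows, assuming the standard large-sample SE of log(RR); the function name `risk_ratio` and the counts are hypothetical and are not taken from any of the included studies.

```python
import math


def risk_ratio(events_high, n_high, events_low, n_low, z=1.96):
    """RR of a low score (high- vs. low-exposure group) with a log-scale CI."""
    rr = (events_high / n_high) / (events_low / n_low)

    # Large-sample standard error of log(RR).
    se_log = math.sqrt(
        1.0 / events_high - 1.0 / n_high + 1.0 / events_low - 1.0 / n_low
    )

    # The CI is symmetric on the log scale, then transformed back.
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper, se_log


# Hypothetical counts: 20 of 100 children with a score < 80 in the
# high-exposure group versus 10 of 100 in the low-exposure group.
rr, lower, upper, se_log = risk_ratio(20, 100, 10, 100)
print(rr, lower, upper)
```

The resulting log(RR) and its SE could then be pooled across studies by inverse-variance weighting, with the pooled value exponentiated to return to the RR scale.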