|Home | About | Journals | Submit | Contact Us | Français|
Bayesian kriging is a useful tool for estimating spatial distributions of metals; however, estimates are generally only verified statistically. In this study surface soil samples were collected on a uniform grid and analyzed for As, Cr, Pb, and Hg. The data were interpolated at individual locations by Bayesian kriging. Estimates were validated using a leave-one-out cross validation (LOOCV) statistical method which compared the measured and LOOCV predicted values. Validation also was carried out using additional field sampling of soil metal concentrations at points between original sampling locations, which were compared to kriging prediction distributions. LOOCV results suggest that Bayesian kriging was a good predictor of metal concentrations. When measured internode metal concentrations and estimated kriged values were compared, the measured values were located within the 5th – 95th percentile prediction distributions in over half of the internode locations. Estimated and measured internode concentrations were most similar for As and Pb. Kriged estimates did not compare as well to measured values for concentrations below the analytical minimum detection limit, or for internode samples that were very close to the original sampling node. Despite inherent variability in metal concentrations in soils, the kriged estimates were validated statistically and by in situ measurement.
Chronic exposure to high concentrations of metals in the environment has been show to be associated with negative neurological effects (1–4), but the potential impact of environmental exposure to low concentrations of metals is difficult to ascertain. Soil is a relevant route of human exposure for a variety of metals (5–6), and in an on-going study we are examining the potential relation between soil metal concentrations and an elevated prevalence of mental retardation and developmental delay (MR/DD)(7–9) using Bayesian local likelihood kriging. Kriging is the most common method of geostatistical interpolation used for environmental measurements. Originally applied to predict ore concentrations in mining exploration (10), this method has been expanded to estimation of soil metal concentrations (11–14), temperature (15), and elevation (16). Based on the known geographic location of some measured values, data are smoothed and weighted based on both the mean and covariance of the measured values (17–18), and concentrations are estimated over the area of interest.
A variety of classical kriging methods exists, including simple kriging which assumes a known constant trend, ordinary kriging which assumes an unknown constant trend, and universal kriging which assumes a general linear trend model (19). Bayesian kriging allows uncertainty in the model parameters to be reflected in the widths of the prediction intervals, thus providing a more reliable prediction of the parameter of interest (18). In geostatistics, Bayesian inference provides a way to incorporate parameter uncertainty in the prediction by treating the parameters as random variables and integrating over the parameter space to obtain the predictive distribution of any quantity of interest.
The objectives of this study were to estimate arsenic (As), chromium (Cr), lead (Pb), and mercury (Hg) soil concentrations using Bayesian local likelihood kriging based on metals measured on a regular grid within an area that had a significantly higher prevalence of MR/DD in children as compared to the state-wide MR/DD average (9). Kriged estimates were validated using a statistical approach, a leave-one-out cross validation (LOOCV) method, and the measured and LOOCV predicted values were compared. The second validation method involved comparing soil metal concentrations measured between original sampling locations within the grid (internode points) to the kriging prediction distributions at these locations. Comparing these measured internode values and kriged values to confirm the validity of the estimated soil metal concentrations has not been completed to our knowledge
Initial soil sampling was carried out in an area that was identified as an MR/DD cluster based on Bayesian local likelihood cluster modeling of maternal residence during pregnancy and the MR/DD outcome of the child born to each mother (9). The Strip area (130 km2) latitude and longitude coordinates were mapped using ArcGIS® Version 9.2 software (20). A regular 120 node grid was laid out over the strip area and surface soil samples (top 5 cm, n=119) were collected as close as possible to grid nodes (Figure 1) with sterile spatulas and placed in sterile Whirl-Pak™ bags. Samples were refrigerated until analysis, which occurred within two weeks of sampling. Samples were acid digested and analyzed for As, Cr, Pb, and Hg (mg kg−1 dry weight (dw)) by an independent analytical laboratory (7–8, 21). Metals that were below the minimum detection limit (MDL) were replaced with the MDL concentration as described in the EPA Guidance for Data Quality Assessment (22).
For the internode sampling validation, three internode surface soil samples were collected approximately one year after the initial 120 node grid sampling. Samples could not be collected at even distances between two original nodes because roads, commercial, and residential structures prohibited a precise sampling plan. Therefore, sampling occurred at variable internode locations, with varying distances to original nodes. Of these three samples, two internode samples were individual sample measurements, and the third internode value was the median of six closely-spaced samples (~ 1–3 m apart, to identify environmental variability) (Figure 1). Five areas were selected for the internode sampling (total n=15), and the closely-spaced sampling at these five areas corresponded to locations L2, L5, L8, L11 and L14 (Figure 1). The internode sampling locations were chosen without taking into account metal concentrations at the nearby original sampling locations.
For the Bayesian kriging, we used a Gaussian spatial linear mixed model as described by Diggle and Ribeiro (18). Let (ui, Y(ui)) for i = 1,, n denote the measurement of the random variable Y(ui) at the sampling location ui. A simplified version of the model with one latent spatial process and nugget effect (zero) was considered, which can be expressed in a hierarchical scheme:
In level 1, X represented a vector of spatially referenced non-random variables at location ui= (x1i, x2i) and β is a parameter vector. We assumed a spatial linear trend for coordinates ui = (x1i, x2i), and, hence, the ith row of X is Xi = (1,x1i, x2i). In level 2, S(u) is a stationary Gaussian process (mean = 0, variance = σ2 and correlation function R(h;)), where is the correlation parameter and h is the vector distance between two locations. We set an exponential correlation in the analysis:
The third level specified the prior for the model parameters (β, σ2, ), with a flat prior set for β and and a reciprocal prior for σ2. Based on these parameters, the estimated chemical concentration at individual locations was calculated from the predictive distribution using the krige.bayes function of geoR library in R, which was used for all statistical analyses (23–24).
Metal concentrations were always positive and often skewed to lower concentrations; therefore, the Gaussian assumption from the above model did not hold. Therefore, a transformation was required to meet the normality assumptions of the model (25). A Box-Cox family transformation was applied to all of our metal concentrations (26).
where λ is the transformation parameter (Table 1). The former transformation (eq. 3) was used for As and the latter (eq. 4) for Cr, Pb, and Hg concentrations. A Q-Q normal plot was also completed to test for normality of the transformed concentrations. Bayesian kriging was performed after all data were transformed to ensure Gaussianity. We examined a large range of chemical data from various sites and found that the stationarity assumption is valid for our data and, therefore, we have not adjusted for stationarity. Further, the Box-Cox transformations used to ensure Gaussianity will generally increase stationarity if there is an initial lack of stationarity.
We employed Trans-Gaussian Bayesian kriging (i.e. transformation before kriging) which is a standard approach when transformation is used prior to analysis. Bayesian kriging is superior to ordinary non-Bayesian kriging as it allows for the parameters to have prior distributions specified, which has two main effects. First it allows the estimation of parameters directly in the model.; secondly, it allows the sensitivity analysis to prior specification, which is not available in non-Bayesian forms.
The optimal method for cross-validation would be to have two sets of independent data: one for kriging and the other for validation. However, a leave-one-out cross validation (LOOCV), which removes each transformed datum in turn and then kriges at the location of the removed point using the remaining data can also be used to statistically validate the Bayesian kriging model, and to ensure the quality of the prediction We also calculated the following summary statistics for the LOOCV: mean error (ME)
mean square error (MSE),
and the mean squared deviation ratio (MSDR)
where Ŷ (ui) and Y(ui) are the predicted and observed values, respectively, and σ2(ui) is the kriging variance at point ui. The ME should be close to zero and the MSE should be as small as possible. The MSDR is the mean squared deviation ratio of residuals with kriging variance, whose value should be close to 1 if the model is correctly specified.
For a second cross-validation method we used another advantage of Bayesian kriging, that the predictive distribution of a value at a given location can be obtained and the probability of the observed value at that location estimated. The measured metal concentrations from the internode sampling were independent data used as true values to validate the Bayesian kriging model. We use the word true to indicate the measured value although we recognize that there is analytical variability in the measured values. We first predicted the metal concentration at each internode location (L1-L15) with the Bayesian kriging model described above using only the original transformed sampling data. Then a random sample was drawn (5000) from the predictive distribution of each metal at each of the 15 internode locations. We then observed where the measured value (also transformed by the corresponding Box Cox λ value) fell in the predictive distribution at each location.
A sensitivity analysis using only As data was completed to assess how sensitive the analysis is to specification of the correlation, which will affect the local variation in As concentrations. In the original analysis, the posterior median value of was 0.0064 and we examined a range of fixed around this value. In the sensitivity analysis, instead of assuming a prior distribution, the value of the correlation parameter () was fixed, and ME, MSE, and MSDR were calculated for different values of (0.001 to 0.2).
Metal concentrations were always positive and skewed to lower concentrations (Figure 2) with skewness coefficients ranging from 2.29 to 5.26; thus, the Box-Cox transformation was used. The Q-Q normal plot comparing transformed metal concentrations with a normal distribution showed that the Box-Cox transformation achieved a normal distribution and hence met the Gaussian assumption. Q-Q plots for Cr and Pb are shown in Supporting Information, Figure S1. For the LOOCV, the measured and predicted values, and the regression and reference (equation y = x, with slope = 1, and y-intercept = 0) equation lines are plotted in Figure 3. For all transformed metals, the measured and predicted values (points) were distributed evenly around the reference line (dotted line). Also, the regression and reference lines overlapped for all metals, which suggested a good prediction using the Bayesian kriging method. The cross-validation statistics also indicated a good prediction; the ME values were close to zero and MSDR values were close to 1 for all metals (Table 1). The MSE value was lowest for the As model.
For the sensitivity analysis using the As data, was fixed to a range of values. Results show that MSE is smallest, and the MSDR is closest to 1 when = 0.006, which is very close to the posterior median (0.0064) used in the original model, although the ME supported larger values (see Supporting Information, Table S1). In general when different values of are used the local behavior of the As surface will change.
Measured metal concentrations varied on a small spatial scale (1–3 m), although not consistently (Table 2). The coefficient of variation (CV), or ratio of the mean and standard deviation (SD) was small at some locations (e.g.,L14 for As, L8, L11 and L14 for Cr, and L11 and L14 for Pb, etc.) and larger (>60%) at other locations (e.g., L11 for As, L5 for Cr, L2 for Pb, and L8 for Hg). Overall, the CVs for Cr were smaller than those of the other metals, and all metals had small SDs and CVs at L14.
Box plots (edges represent the 25th and 75th percentiles; whiskers represent the 10th and 90th percentiles; dash marks represent the 5th and 95th percentiles; and the median line divides the box) of the kriging prediction distributions and transformed measured metal concentrations are shown in Figure 4 and Figure 5. Measured As values did not fall within the prediction distribution box plot (outside the 5th or 95th percentiles) at three locations (L7, L11, and L12) (Figure 4A); measured Cr value were outside of the box plot prediction distribution at three locations (L3, L4, and L7) (Figure 4B); measured Pb values were not located within the prediction distribution at three locations (L2, L3, and L7) (Figure 5A); and measured Hg values were outside of the prediction distributions at five locations (L7, L8, L10, L12, and L13) (Figure 5B). For all metals, the measured values were not located within the 5th to 95th percentile prediction distributions at L7. Also, measured values of both As and Hg were outside the 5th/95th percentiles for L12 and for both Cr and Pb at L3.
The CVs of the 6 closely-spaced soil samples (locations L2, L5, L8, L11, and L14) also were compared to the predictive distributions at those locations. For all metals except Cr, the location of the highest CVs (at least > 65 %) was also a location where the measured value was outside the 5th to 95th percentiles of the prediction distribution. These locations were L11 for As, L2 for Pb, and L8 for Hg. For all locations with the lowest CVs (< 31 %; L14 for As, L8 for Cr, L11 for Pb, and L14 for Hg) for each metal, the measured values fell within the 25th to 75th percentiles.
There may be local lack of fit, with some groups of neighbors correlated and behaving as outliers, as occurs at L7, L11, and L12 for As (Figure 4A). At L7, the distance to its nearest original sampling node neighbor is 3.39 10−04 compared to 1.08 10−02 (relative longitude latitude distance) of its next nearest neighbor (Figure 1). The measured value for As at L7 is at least two times greater than the concentrations measured at the three nearest original sampling locations. According to the exponential correlation model we used for Bayesian kriging, the smaller the distance between two points, the larger the correlation. The median of the kriged value for the transformed As concentration at L7 is 0.756, which is close to the transformed measured value of its nearest sample point neighbor (0.739).
The Bayesian kriging provided a reliable prediction of As, Cr, Pb, and Hg soil concentrations, and based on the first validation method (LOOCV) and sensitivity analysis of As data, the model chosen was valid. Reference and regression lines overlapped, the cross-validation statistics were good and the ME and MSDR values were similar for each metal. The MSE value indicated that the model best explained the variability in the observations for As and Cr.
When kriged prediction distributions were compared with metal internode concentrations in our second validation method, estimated median concentrations agreed closely with measured concentrations in over half the samples, which we consider good estimates, given the inherent variability in environmental samples. The prediction distributions were most similar to measured concentrations for As and Pb. Disregarding Hg, which is present at concentrations orders of magnitude smaller than the other metals, As had a smaller measured range than Cr and Pb, and the lowest CVs in concentrations of the six closely-spaced samples.
Kriged estimates did not compare as well to the measured values for samples that fell below the analytical MDL. Thompson and Howarth (27) suggest that analytical precision will obtain a steady state only at concentrations at least two orders of magnitude greater than the detection limit. The L12 As measured value was below the MDL, and had the greatest difference between the estimated prediction distribution and the measured value. Similarly, the Hg measured value at L10 was below the MDL and the kriged prediction distribution was greater than the measured value. The US EPA provides guidance on statistically addressing values below the MDL (22), but not on implications for kriged predictions.
Analytical error due to low concentrations of metals in soils has been reported (28). In the current study, Hg concentrations were two-to-three orders of magnitude lower than concentrations of other metals, and the corresponding 25th to 75th percentile prediction ranges were small, ranging from only 0.003 to 0.021mg kg−1 dw. Mercury detection levels were much lower than those for the other metals due to the use of the mercury vapor generation (EPA Method 7471) (29) analytical method, which increased the analytical precision of the Hg analyses and offset the potential deleterious effects of low concentration ranges on the validity of the kriged estimates. Mercury concentrations were not within the 5th to 95th prediction distributions for 33% of internode samples, which was on par with that of the other metals.
Analytical uncertainties exist in sample collection, extraction and analysis. Analytical matrix effects also exist, and are greater in soils than in homogeneous environments such as water. US EPA (30) suggests that the effect of the matrix be documented by spiking with the analyte of interest, and that acceptance limits should be ±25% recovery of the spiked value for accuracy and ±20% for precision. Matrix effects also may affect the assumption of a Gaussian distribution. Non-Gaussian conditions are more apparent if the sample is heterogeneous and the analyte is distributed in a small portion of the sample, causing large sub-sampling variance and a skewed concentration distribution (27). Clays may have more matrix effects than sand because sand is generally considered to be more homogeneous than clays. In natural systems, analytical matrix effects may be compounded by kriging estimates. Bogaert and D’Or (31) were able to better predict sand distribution than silt or clay distributions using Bayesian maximum entropy kriging in natural soils. Thus the true or measured value has variance in its measurement, as does the predicted value.
In addition to matrix effects, soil metals may vary spatially in the environment on the order of cms, and concentrations used in kriging are often measured on a spatial scale of kms. The CVs of our six closely-spaced samples (on the order of 1–3 m) ranged from lows of 11% for Pb and 15% for As, to highs of 80% for Cr and 87% for Hg. In general, when a metal had the greatest difference between the prediction distribution and the median of the six measured values, the CV at that location was also greatest. The converse also held true; the site with the smallest CV for each metal also had the smallest difference between the prediction distribution and the median of the six measured values. Kriging better predicts metal concentrations in environments that have less small-scale variability, however natural variability may be unknown and vary by metal in different environmental areas.
Due to the heterogeneous nature of soil, the actual or “true” soil map for a chemical is not as smooth as the kriging model, and the existence of outliers under these conditions is commonly accepted. Juang et al. (11) addressed large concentration variations, high skewness and some extreme values using rank-order geostatistics on a small spatial scale (~0.07 km2) in a rice field irrigated with Cd-contaminated discharge from a chemical plant. To verify results, Cd concentrations were back transformed from estimated standardized ranks and were more reliable than ordinary kriging. However, areas close to hot spots had large uncertainty in the Cd concentrations. The Cd concentration CV was 140% which was greater than the CVs for our metals.
Since kriging fits a smooth surface over the area, the true value of each independent location is correlated to its surrounding sample-point neighbors, and hence the estimated value is influenced by its neighbors. L7 was close to an original sample point, and the predicted range at L7 was influenced largely by that sample point. This influence will cause the predicted value to be close to that original value and have a small variance, resulting in the relatively narrow range of predictive distributions for the metals. However, the measured value for each metal at L7 was of the same order of magnitude as concentrations measured in the two other internode samples in that area (L8 and L9), which suggests that the measured values may be a better indicator of the metal concentration than the kriged estimates for L7, and other closely spaced samples.
Sources of metals, i.e., anthropogenic or natural, may affect their distribution in soils. Previous research at this site suggested that for As, Pb and Hg, the anthropogenic source component was more significant than the natural source component based on principal component analysis (32), and for all of these metals except Cr, industrial releases are minimal in the sampling area (8). Additional information that elucidates soil characteristics and the geologic and anthropogenic distribution of soil metals may facilitate interpretation of the kriged values.
Our Bayesian local likelihood kriging model provided good estimates and has the advantages of taking the uncertainty of the parameters into account and providing a predictive distribution. These results were validated using two methods: a statistical LOOCV method and by internode sampling. We showed the inherent small scale, natural variability in closely-spaced soil samples and the influence of high CVs of these samples on the prediction distributions. Prediction distributions for internode sampling locations that were close to original locations were highly influenced by these original metal concentrations, and samples with values below the MDL for analyses were not well predicted by the Bayesian kriging. In both of these cases the measured values were more accurate than the predicted distributions. Regardless, this novel approach to validation of statistically derived interpolated sample concentrations provided good estimates in environmental samples and bridges the boundary between spatially interpolated and measured values.
Funding for this research was provided by the National Institutes of Health, National Institute of Environmental Health Sciences, R01 Grant No. ES012895-01A1. We thank M. Engle and S. Jayasinghe for help with sampling, and three anonymous reviewers for their helpful suggestions.
Supporting Information Available
The results of Q-Q plots for Cr and Pb, and As sensitivity analysis results are available in Supporting Information. This information is available free of charge via the Internet at http://pubs.acs.org.