# Related Articles

Objective

To propose a more realistic model for disease cluster detection, through a modification of the spatial scan statistic to account simultaneously for inflated zeros and overdispersion.

Introduction

Spatial Scan Statistics [1] usually assume Poisson or Binomial distributed data, which is not adequate in many disease surveillance scenarios. For example, small areas distant from hospitals may exhibit a smaller number of cases than expected in those simple models. Also, underreporting may occur in underdeveloped regions, due to inefficient data collection or the difficulty of accessing remote sites. Those factors generate excess zero case counts or overdispersion, inducing a violation of the statistical model and also increasing the type I error (false alarms). Overdispersion occurs when the data variance is greater than that predicted by the model in use. To accommodate it, an extra parameter must be included, because the Poisson model constrains the variance to equal the mean.

Methods

Tools like the Generalized Poisson (GP) and the Double Poisson [2] may be a better option for this kind of problem, modeling separately the mean and variance, which could be easily adjusted by covariates. When excess zeros occur, the Zero Inflated Poisson (ZIP) model is used, although ZIP’s estimated parameters may be severely biased if nonzero counts are too dispersed, compared to the Poisson distribution. In this case the Zero Inflated models for the Generalized Poisson (ZIGP), Double Poisson (ZIDP) and Negative Binomial (ZINB) could be good alternatives for the joint modeling of excess zeros and overdispersion. On the one hand, Zero Inflated Poisson (ZIP) models were proposed using the spatial scan statistic to deal with the excess zeros [3]. On the other hand, another spatial scan statistic was based on a Poisson-Gamma mixture model for overdispersion [4]. In this work we present a model which includes inflated zeros and overdispersion simultaneously, based on the ZIDP model. Let the parameter p indicate the zero inflation. As the remaining parameters of the observed cases map and the parameter p are not independent, the likelihood maximization process is not straightforward; it becomes even more complicated when we include covariates in the analysis. To solve this problem we introduce a vector of latent variables in order to factorize the likelihood, which facilitates the maximization process using the E-M (Expectation-Maximization) algorithm. We derive the formulas to maximize the likelihood iteratively, and implement a computer program using the E-M algorithm to estimate the parameters under the null and alternative hypotheses. The p-value is obtained via the Fast Double Bootstrap Test [5].
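The latent-variable idea can be illustrated on a much simpler case than the one treated in this work. The sketch below is an assumption-laden toy: it fits a plain Zero Inflated Poisson (not the ZIDP) with no covariates and no scan over candidate clusters, and the names `zip_em` and `sample_poisson` are hypothetical. Each zero count is given a latent indicator saying whether it is a structural zero; the E-step computes the posterior probability of that indicator, and the M-step then has closed-form updates for p and the Poisson mean.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def zip_em(counts, n_iter=500, tol=1e-10):
    """Fit an intercept-only Zero Inflated Poisson model by E-M.

    Model (illustrative, not the ZIDP of the abstract):
        P(Y = 0) = p + (1 - p) * exp(-lam)
        P(Y = k) = (1 - p) * exp(-lam) * lam**k / k!,  k >= 1
    Returns the estimates (p, lam).
    """
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    # Crude starting values: half of the observed zeros are structural.
    p = 0.5 * n_zero / n
    lam = max(total / n, 1e-8)
    for _ in range(n_iter):
        # E-step: posterior probability that an observed zero is structural.
        # Nonzero counts can never be structural, so their weight is 0.
        z0 = p / (p + (1.0 - p) * math.exp(-lam))
        expected_structural = n_zero * z0
        # M-step: closed-form updates given the expected latent indicators.
        new_p = expected_structural / n
        new_lam = total / (n - expected_structural)
        converged = abs(new_p - p) + abs(new_lam - lam) < tol
        p, lam = new_p, new_lam
        if converged:
            break
    return p, lam


if __name__ == "__main__":
    rng = random.Random(42)
    # Simulated counts: 30% structural zeros, Poisson mean 2.5 otherwise.
    data = [0 if rng.random() < 0.3 else sample_poisson(2.5, rng)
            for _ in range(5000)]
    print(zip_em(data))
```

In a scan-statistic setting the same fit would be repeated under the null hypothesis and for each candidate zone under the alternative, which is why a cheap iterative update matters.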

Results

Numerical simulations are conducted to assess the effectiveness of the method. We present results for Hanseniasis surveillance in the Brazilian Amazon in 2010 using this technique. We obtain the most likely spatial clusters for the Poisson, ZIP, Poisson-Gamma mixture and ZIDP models and compare the results.

Conclusions

The Zero Inflated Double Poisson Spatial Scan Statistic for disease cluster detection incorporates the flexibility of previous models, accounting for inflated zeros and overdispersion simultaneously.

The Hanseniasis case study map, with its excess of zero case counts in many municipalities of the Brazilian Amazon and the presence of overdispersion, was a good benchmark to test the ZIDP model. The results obtained are easier to understand than those of each of the previous spatial scan statistic models, the Zero Inflated Poisson (ZIP) model and the Poisson-Gamma mixture model for overdispersion, taken separately. The E-M algorithm and the Fast Double Bootstrap test are computationally efficient for this type of problem.

PMCID: PMC3692937

Scan statistics; Zero inflated; Overdispersion; Expectation-Maximization algorithm

Background

Predictive microbiology develops mathematical models that can predict the growth rate of a microorganism population under a set of environmental conditions. Many primary growth models have been proposed. However, when primary models are applied to bacterial growth curves, the biological variability is reduced to a single curve defined by some kinetic parameters (lag time and growth rate), and sometimes the models give poor fits in some regions of the curve. The development of a prediction band (from a set of bacterial growth curves) using non-parametric and bootstrap methods makes it possible to overcome that problem and to include the biological variability of the microorganism in the modelling process.

Results

Absorbance data from Listeria monocytogenes cultured at 22, 26, 38, and 42°C were selected under different environmental conditions of pH (4.5, 5.5, 6.5, and 7.4) and percentage of NaCl (2.5, 3.5, 4.5, and 5.5). Transformation of absorbance data to viable count data was carried out. A random effect multiplicative heteroscedastic model was considered to explain the dynamics of bacterial growth. The concept of a prediction band for microbial growth is proposed. The bootstrap method was used to obtain resamples from this model. An iterative procedure is proposed to overcome the computer intensive task of calculating simultaneous prediction intervals, along time, for bacterial growth. The bands were narrower below the inflection point (0-8 h at 22°C, and 0-5.5 h at 42°C), and wider to the right of it (from 9 h onwards at 22°C, and from 7 h onwards at 42°C). A wider band was observed at 42°C than at 22°C when the curves reach their upper asymptote. Similar bands have been obtained for 26 and 38°C.

Conclusions

The combination of nonparametric models and bootstrap techniques results in a good procedure to obtain reliable prediction bands in this context. Moreover, the new iterative algorithm proposed in this paper allows one to achieve exactly the prefixed coverage probability for the prediction band. The microbial growth bands reflect the influence of the different environmental conditions on the microorganism behaviour, helping in the interpretation of the biological meaning of the growth curves obtained experimentally.

doi:10.1186/1471-2105-11-77

PMCID: PMC2829529
PMID: 20141635

Summary

Public health researchers often estimate health effects of exposures (e.g., pollution, diet, lifestyle) that cannot be directly measured for study subjects. A common strategy in environmental epidemiology is to use a first-stage (exposure) model to estimate the exposure based on covariates and/or spatio-temporal proximity and to use predictions from the exposure model as the covariate of interest in the second-stage (health) model. This induces a complex form of measurement error. We propose an analytical framework and methodology that is robust to misspecification of the first-stage model and provides valid inference for the second-stage model parameter of interest.

We decompose the measurement error into components analogous to classical and Berkson error and characterize properties of the estimator in the second-stage model if the first-stage model predictions are plugged in without correction. Specifically, we derive conditions for compatibility between the first- and second-stage models that guarantee consistency (and have direct and important real-world design implications), and we derive an asymptotic estimate of finite-sample bias when the compatibility conditions are satisfied. We propose a methodology that (1) corrects for finite-sample bias and (2) correctly estimates standard errors. We demonstrate the utility of our methodology in simulations and an example from air pollution epidemiology.
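For readers less familiar with the two error types named above, the standard textbook definitions can be written as follows. The symbols are assumptions introduced here for illustration (X for the true exposure, W for the value used in the health model, U for a mean-zero error term); the abstract itself does not give notation, and its decomposition is only described as analogous to these forms.

```latex
\begin{align*}
\text{Classical error:}\quad & W = X + U, \qquad U \perp X,\\
\text{Berkson error:}\quad   & X = W + U, \qquad U \perp W.
\end{align*}
```

In a simple linear health model, classical error typically attenuates the estimated slope, whereas pure Berkson error leaves the slope unbiased but inflates its variance, which is why the two components are handled differently.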

doi:10.1002/env.2233

PMCID: PMC3994141
PMID: 24764691

measurement error; spatial statistics; two-stage estimation; air pollution; environmental epidemiology

Propensity-score matching is frequently used to estimate the effect of treatments, exposures, and interventions when using observational data. An important issue when using propensity-score matching is how to estimate the standard error of the estimated treatment effect. Accurate variance estimation permits construction of confidence intervals that have the advertised coverage rates and tests of statistical significance that have the correct type I error rates. There is disagreement in the literature as to how standard errors should be estimated. The bootstrap is a commonly used resampling method that permits estimation of the sampling variability of estimated parameters. Bootstrap methods are rarely used in conjunction with propensity-score matching. We proposed two different bootstrap methods for use with propensity-score matching without replacement and examined their performance with a series of Monte Carlo simulations. The first method involved drawing bootstrap samples from the matched pairs in the propensity-score-matched sample. The second method involved drawing bootstrap samples from the original sample, estimating the propensity score separately in each bootstrap sample, and creating a matched sample within each of these bootstrap samples. The former approach was found to result in estimates of the standard error that were closer to the empirical standard deviation of the sampling distribution of estimated effects.
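The first of the two methods, resampling matched pairs, can be sketched in a few lines. This is a minimal illustration under assumptions not stated in the abstract: the outcome is continuous, the treatment effect is the mean within-pair difference, and matching has already been done so that the input is a list of (treated outcome, control outcome) pairs. The function name `bootstrap_se_matched_pairs` is hypothetical.

```python
import random
import statistics


def bootstrap_se_matched_pairs(pairs, n_boot=2000, seed=0):
    """Bootstrap standard error of the mean within-pair difference.

    `pairs` is a list of (treated_outcome, control_outcome) tuples from an
    already matched sample. Pairs, not individuals, are resampled with
    replacement so that the matched structure is preserved.
    """
    rng = random.Random(seed)
    n = len(pairs)
    estimates = []
    for _ in range(n_boot):
        resample = [pairs[rng.randrange(n)] for _ in range(n)]
        effect = sum(t - c for t, c in resample) / n
        estimates.append(effect)
    # The spread of the bootstrap estimates approximates the standard error.
    return statistics.stdev(estimates)


if __name__ == "__main__":
    rng = random.Random(1)
    demo_pairs = [(rng.gauss(1.0, 2.0), rng.gauss(0.0, 2.0)) for _ in range(200)]
    print(bootstrap_se_matched_pairs(demo_pairs))
```

The second method would instead resample the original unmatched subjects and redo the propensity-score estimation and matching inside every bootstrap replicate, which is considerably more expensive.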

doi:10.1002/sim.6276

PMCID: PMC4260115
PMID: 25087884

propensity score; propensity-score matching; bootstrap; variance estimation; Monte Carlo simulations; matching

Health-Related Quality of Life (HRQoL) measures are becoming increasingly used in clinical trials as primary outcome measures. Investigators are now asking statisticians for advice on how to analyse studies that have used HRQoL outcomes.

HRQoL outcomes, like the SF-36, are usually measured on an ordinal scale. However, most investigators assume that there exists an underlying continuous latent variable that measures HRQoL, and that the actual measured outcomes (the ordered categories), reflect contiguous intervals along this continuum.

The ordinal scaling of HRQoL measures means they tend to generate data that have discrete, bounded and skewed distributions. Thus, standard methods of analysis such as the t-test and linear regression that assume Normality and constant variance may not be appropriate. For this reason, conventional statistical advice would suggest that non-parametric methods be used to analyse HRQoL data. The bootstrap is one such computer intensive non-parametric method for analysing data.

We used the bootstrap for hypothesis testing and the estimation of standard errors and confidence intervals for parameters, in four datasets (which illustrate the different aspects of study design). We then compared and contrasted the bootstrap with standard methods of analysing HRQoL outcomes. The standard methods included t-tests, linear regression, summary measures and General Linear Models.

Overall, in the datasets we studied, using the SF-36 outcome, bootstrap methods produce results similar to conventional statistical methods. This is likely because the t-test and linear regression are robust to the violations of assumptions that HRQoL data are likely to cause (i.e. non-Normality). While particular to our datasets, these findings are likely to generalise to other HRQoL outcomes, which have discrete, bounded and skewed distributions. Future research with other HRQoL outcome measures, interventions and populations, is required to confirm this conclusion.

doi:10.1186/1477-7525-2-70

PMCID: PMC543443
PMID: 15588308

Health Related Quality of Life; SF-36; Bootstrap Simulation; Statistical Analysis.

In many environmental epidemiology studies, the locations and/or times of exposure measurements and health assessments do not match. In such settings, health effects analyses often use the predictions from an exposure model as a covariate in a regression model. Such exposure predictions contain some measurement error as the predicted values do not equal the true exposures. We provide a framework for spatial measurement error modeling, showing that smoothing induces a Berkson-type measurement error with nondiagonal error structure. From this viewpoint, we review the existing approaches to estimation in a linear regression health model, including direct use of the spatial predictions and exposure simulation, and explore some modified approaches, including Bayesian models and out-of-sample regression calibration, motivated by measurement error principles. We then extend this work to the generalized linear model framework for health outcomes. Based on analytical considerations and simulation results, we compare the performance of all these approaches under several spatial models for exposure. Our comparisons underscore several important points. First, exposure simulation can perform very poorly under certain realistic scenarios. Second, the relative performance of the different methods depends on the nature of the underlying exposure surface. Third, traditional measurement error concepts can help to explain the relative practical performance of the different methods. We apply the methods to data on the association between levels of particulate matter and birth weight in the greater Boston area.

doi:10.1093/biostatistics/kxn033

PMCID: PMC2733173
PMID: 18927119

Air pollution; Measurement error; Predictions; Spatial misalignment

Often, in environmental data collection, data arise from two sources: numerical models and monitoring networks. The first source provides predictions at the level of grid cells, while the second source gives measurements at points. The first is characterized by full spatial coverage of the region of interest, high temporal resolution, no missing data but consequential calibration concerns. The second tends to be sparsely collected in space with coarser temporal resolution, often with missing data but, where recorded, provides, essentially, the true value. Accommodating the spatial misalignment between the two types of data is of fundamental importance for both improved predictions of exposure as well as for evaluation and calibration of the numerical model. In this article we propose a simple, fully model-based strategy to downscale the output from numerical models to point level. The static spatial model, specified within a Bayesian framework, regresses the observed data on the numerical model output using spatially-varying coefficients which are specified through a correlated spatial Gaussian process.

As an example, we apply our method to ozone concentration data for the eastern U.S. and compare it to Bayesian melding (Fuentes and Raftery 2005) and ordinary kriging (Cressie 1993; Chilès and Delfiner 1999). Our results show that our method outperforms Bayesian melding in terms of computing speed and it is superior to both Bayesian melding and ordinary kriging in terms of predictive performance; predictions obtained with our method are better calibrated and predictive intervals have empirical coverage closer to the nominal values. Moreover, our model can be easily extended to accommodate for the temporal dimension. In this regard, we consider several spatio-temporal versions of the static model. We compare them using out-of-sample predictions of ozone concentration for the eastern U.S. for the period May 1–October 15, 2001. For the best choice, we present a summary of the analysis. Supplemental material, including color versions of Figures 4, 5, 6, 7, and 8, and MCMC diagnostic plots, are available online.

doi:10.1007/s13253-009-0004-z

PMCID: PMC2990198
PMID: 21113385

Bayesian melding; Calibration; Markov chain Monte Carlo; Ordinary kriging; Spatial misalignment; Spatially varying coefficient model

Background

Model rejections lie at the heart of systems biology, since they provide conclusive statements: that the corresponding mechanistic assumptions do not serve as valid explanations for the experimental data. Rejections are usually done using e.g. the chi-square test (χ²) or the Durbin-Watson test (DW). Analytical formulas for the corresponding distributions rely on assumptions that typically are not fulfilled. This problem is partly alleviated by the usage of bootstrapping, a computationally heavy approach to calculate an empirical distribution. Bootstrapping also allows for a natural extension to estimation of joint distributions, but this feature has so far been little exploited.
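To make the idea of a bootstrapped rejection test concrete, the sketch below computes an empirical distribution for a chi-square statistic by parametric bootstrapping. Everything specific in it is an assumption chosen for brevity: the tested model is a straight line through the origin, the measurement noise is Gaussian with a known standard deviation `sigma`, and the function names are hypothetical. It shows only a single one-dimensional test, not a combination of tests.

```python
import random


def fit_slope(xs, ys):
    """Least-squares slope of the model y = slope * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def chi_square(xs, ys, slope, sigma):
    """Sum of squared residuals scaled by the noise standard deviation."""
    return sum(((y - slope * x) / sigma) ** 2 for x, y in zip(xs, ys))


def bootstrap_chi2_p_value(xs, ys, sigma, n_boot=1000, seed=0):
    """Parametric-bootstrap p-value for rejecting the model y = slope * x.

    Data sets are simulated from the fitted model, the model is refitted to
    each one, and the resulting chi-square values form the empirical null
    distribution against which the observed statistic is compared.
    """
    rng = random.Random(seed)
    slope = fit_slope(xs, ys)
    observed = chi_square(xs, ys, slope, sigma)
    exceed = 0
    for _ in range(n_boot):
        simulated = [slope * x + rng.gauss(0.0, sigma) for x in xs]
        refit = fit_slope(xs, simulated)
        if chi_square(xs, simulated, refit, sigma) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    xs = list(range(10))
    print(bootstrap_chi2_p_value(xs, [x * x for x in xs], sigma=1.0))
```

Because each replicate involves a refit, the cost grows with both the number of bootstraps and the cost of fitting, which is the sense in which the approach is computationally heavy.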

Results

We herein show that simplistic combinations of bootstrapped tests, like the max or min of the individual p-values, give inconsistent, i.e. overly conservative or liberal, results. A new two-dimensional (2D) approach based on parametric bootstrapping, on the other hand, is found both consistent and with a higher power than the individual tests, when tested on static and dynamic examples where the truth is known. In the same examples, the most superior test is a 2D χ² vs χ², where the second χ²-value comes from an additional help model, and its ability to describe bootstraps from the tested model. This superiority is lost if the help model is too simple, or too flexible. If a useful help model is found, the most powerful approach is the bootstrapped log-likelihood ratio (LHR). We show that this is because the LHR is one-dimensional, because the second dimension comes at a cost, and because LHR has retained most of the crucial information in the 2D distribution. These approaches statistically resolve a previously published rejection example for the first time.

Conclusions

We have shown how to, and how not to, combine tests in a bootstrap setting, when the combination is advantageous, and when it is advantageous to include a second model. These results also provide a deeper insight into the original motivation for formulating the LHR, for the more general setting of nonlinear and non-nested models. These insights are valuable in cases when accuracy and power, rather than computational speed, are prioritized.

doi:10.1186/1752-0509-8-46

PMCID: PMC4022267
PMID: 24742065

Model rejection; Bootstrapping; Combining information; 2D; Insulin signaling; Model Mimicry; Likelihood ratio

Wu, Xia | Chen, Kewei | Yao, Li | Ayutyanont, Napatkamon | Langbaum, Jessica B.S. | Fleisher, Adam. | Reschke, Cole | Lee, Wendy | Liu, Xiaofen | Alexander, Gene E | Bandy, Dan | Foster, Norman L | Thompson, Paul M. | Harvey, Danielle J. | Weiner, Michael W | Koeppe, Robert A | Jagust, William J | Reiman, Eric M.
Fluorodeoxyglucose positron emission tomography (FDG-PET) studies report characteristic patterns of cerebral hypometabolism in probable Alzheimer's disease (pAD) and amnestic mild cognitive impairment (aMCI). This study aims to characterize the consistency of regional hypometabolism in pAD and aMCI patients enrolled in the AD Neuroimaging Initiative (ADNI) using statistical parametric mapping (SPM) and bootstrap resampling, and to compare the bootstrap based reliability index to the commonly used type-I error approach with or without correction for multiple comparisons. Batched SPM5 was run for each of 1,000 bootstrap iterations to compare FDG-PET images from 74 pAD and 142 aMCI patients, respectively, to 82 normal controls. Maps of the hypometabolic voxels detected for at least a specific percentage of times over the 1,000 runs were examined and compared to an overlap of the hypometabolic maps obtained from 3 randomly partitioned independent sub-datasets. The results from the bootstrap derived reliability of regional hypometabolism in the overall data set were similar to those observed in each of the three non-overlapping sub-sets using family-wise error. A strong but non-linear association was found between the bootstrap based reliability index and the type-I error. For threshold p=0.0005, pAD was associated with extensive hypometabolic voxels in the posterior cingulate/precuneus and parietotemporal regions with reliability between 90% and 100%. Bootstrap analysis provides an alternative to the parametric family-wise error approach used to examine consistency of hypometabolic brain voxels in pAD and aMCI patients. These results provide a foundation for the use of bootstrap analysis to characterize statistical ROIs or search regions in both cross-sectional and longitudinal FDG PET studies. This approach offers promise in the early detection and tracking of AD, the evaluation of AD-modifying treatments, and other biologically or clinically important measurements using brain images and voxel-based data analysis techniques.

doi:10.1016/j.jneumeth.2010.07.030

PMCID: PMC2952503
PMID: 20678521

Alzheimer's Disease; MCI; FDG PET; Reproducibility of Results; Reliability; Bootstrap Resampling; Familywise Error; SPM

Background: Studies estimating health effects of long-term air pollution exposure often use a two-stage approach: building exposure models to assign individual-level exposures, which are then used in regression analyses. This requires accurate exposure modeling and careful treatment of exposure measurement error.

Objective: To illustrate the importance of accounting for exposure model characteristics in two-stage air pollution studies, we considered a case study based on data from the Multi-Ethnic Study of Atherosclerosis (MESA).

Methods: We built national spatial exposure models that used partial least squares and universal kriging to estimate annual average concentrations of four PM2.5 components: elemental carbon (EC), organic carbon (OC), silicon (Si), and sulfur (S). We predicted PM2.5 component exposures for the MESA cohort and estimated cross-sectional associations with carotid intima-media thickness (CIMT), adjusting for subject-specific covariates. We corrected for measurement error using recently developed methods that account for the spatial structure of predicted exposures.

Results: Our models performed well, with cross-validated R² values ranging from 0.62 to 0.95. Naïve analyses that did not account for measurement error indicated statistically significant associations between CIMT and exposure to OC, Si, and S. EC and OC exhibited little spatial correlation, and the corrected inference was unchanged from the naïve analysis. The Si and S exposure surfaces displayed notable spatial correlation, resulting in corrected confidence intervals (CIs) that were 50% wider than the naïve CIs, but that were still statistically significant.

Conclusion: The impact of correcting for measurement error on health effect inference is concordant with the degree of spatial correlation in the exposure surfaces. Exposure model characteristics must be considered when performing two-stage air pollution epidemiologic analyses because naïve health effect inference may be inappropriate.

Citation: Bergen S, Sheppard L, Sampson PD, Kim SY, Richards M, Vedal S, Kaufman JD, Szpiro AA. 2013. A national prediction model for PM2.5 component exposures and measurement error–corrected health effect inference. Environ Health Perspect 121:1017–1025; http://dx.doi.org/10.1289/ehp.1206010

doi:10.1289/ehp.1206010

PMCID: PMC3764074
PMID: 23757600

It is a common practice to use resampling methods such as the bootstrap for calculating the p-value for each test when performing large scale multiple testing. The precision of the bootstrap p-values and that of the false discovery rate (FDR) relies on the number of bootstraps used for testing each hypothesis. Clearly, the larger the number of bootstraps the better the precision. However, the required number of bootstraps can be computationally burdensome, and it multiplies the number of tests to be performed. Further adding to the computational challenge is that in some applications the calculation of the test statistic itself may require considerable computation time. As technology improves one can expect the dimension of the problem to increase as well. For instance, during the early days of microarray technology, the number of probes on a cDNA chip was less than 10,000. Now the Affymetrix chips come with over 50,000 probes per chip. Motivated by this important need, we developed a simple adaptive bootstrap methodology for large scale multiple testing, which reduces the total number of bootstrap calculations while ensuring the control of the FDR. The proposed algorithm results in a substantial reduction in the number of bootstrap samples. Based on a simulation study we found that, relative to the number of bootstraps required for the Benjamini-Hochberg (BH) procedure, which is the standard FDR methodology, the proposed methodology achieved a very substantial reduction in the number of bootstraps. In some cases the new algorithm required as few as 1/6th of the bootstraps required by the conventional BH procedure. Thus, if the conventional BH procedure used 1,000 bootstraps, then the proposed method required only 160 bootstraps. This methodology has been implemented for time-course/dose-response data in our software, ORIOGEN, which is available from the authors upon request.
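As a point of reference, the conventional baseline mentioned above can be sketched as follows: a bootstrap p-value for each hypothesis, followed by the Benjamini-Hochberg step-up rule. This is not the adaptive algorithm of the abstract, whose details are not given here; the function names are hypothetical, and the add-one form of the p-value is one common convention assumed for the illustration.

```python
def bootstrap_p_value(observed, boot_stats):
    """Bootstrap p-value with the add-one correction.

    With B bootstrap statistics the smallest attainable value is
    1 / (B + 1), which is why precision depends on the number of bootstraps.
    """
    exceed = sum(1 for s in boot_stats if s >= observed)
    return (exceed + 1) / (len(boot_stats) + 1)


def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure at FDR level q.

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= q * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    p_vals = [0.001, 0.008, 0.039, 0.041, 0.042,
              0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p_vals, q=0.05))
```

With m tests and B bootstraps per test the baseline needs m × B evaluations of the test statistic, which is the cost an adaptive scheme aims to cut.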

doi:10.2202/1544-6115.1360

PMCID: PMC2752392
PMID: 18384266

Prescriptions for radiation therapy are given in terms of dose-volume constraints (DVCs). Solving the fluence map optimization (FMO) problem while satisfying DVCs often requires a tedious trial-and-error process for selecting appropriate dose control parameters on various organs. In this paper, we propose an iterative approach to satisfy DVCs using a multi-objective linear programming (LP) model for solving beamlet intensities. This algorithm, starting from arbitrary initial parameter values, gradually updates the values through an iterative solution process toward an optimal solution. This method finds appropriate parameter values through the trade-off between OAR sparing and target coverage to improve the solution. We compared the plan quality and the satisfaction of the DVCs by the proposed algorithm with two nonlinear approaches: a nonlinear FMO model solved by using the L-BFGS algorithm and another approach solved by a commercial treatment planning system (Eclipse 8.9). We retrospectively selected from our institutional database five patients with lung cancer and one patient with prostate cancer for this study. Numerical results show that our approach successfully improved target coverage to meet the DVCs, while trying to keep corresponding OAR DVCs satisfied. The L-BFGS algorithm for solving the nonlinear FMO model successfully satisfied the DVCs in three out of five test cases. However, there is no recourse in the nonlinear FMO model for correcting unsatisfied DVCs other than manually changing some parameter values through trial and error to derive a solution that more closely meets the DVC requirements. The LP-based heuristic algorithm outperformed the current treatment planning system in terms of DVC satisfaction. A major strength of the LP-based heuristic approach is that it is not sensitive to the starting condition.

doi:10.4236/jct.2014.52025

PMCID: PMC4261934
PMID: 25506501

Fluence Map Optimization (FMO); Linear Programming (LP); Nonlinear Programming (NLP); Dose-Volume Constraint (DVC); Intensity-Modulated Proton Therapy (IMPT)

This paper presents a Bayesian hierarchical spatiotemporal method of interpolation, termed Markov Cube Kriging (MCK). The classical Kriging methods become computationally prohibitive, especially for large datasets, due to the O(n³) matrix decomposition. MCK offers novel and computationally efficient solutions to address spatiotemporal misalignment, mismatch in the spatiotemporal scales and missing values across space and time in large spatiotemporal datasets. MCK is flexible in that it allows for non-separable spatiotemporal structure and nonstationary covariance at the hierarchical spatiotemporal scales. Employing MCK we developed estimates of daily concentration of fine particulate matter ≤2.5 μm in aerodynamic diameter (PM2.5) at 2.5 km spatial grid for the Cleveland Metropolitan Statistical Area, 2000 to 2009. Our validation and cross-validation suggest that MCK achieved robust prediction of spatiotemporal random effects and underlying hierarchical and nonstationary spatiotemporal structure in air pollution data. MCK has important implications for environmental epidemiology and environmental sciences for exposure quantification and collocation of data from different sources, available at different spatiotemporal scales.

doi:10.1016/j.atmosenv.2013.02.034

PMCID: PMC3768020
PMID: 24039539

Time-space Kriging; Spatiotemporal hierarchical model; Gaussian Markov Random Fields; Nonstationarity; Bayesian computation; Fine particulate matter PM2.5

The relationship between exposure to environmental chemicals during pregnancy and early childhood development is an important issue which has a spatial risk component. In this context, we have examined mental retardation and developmental delay (MRDD) outcome measures for children in a Medicaid population in South Carolina and sampled measures of soil chemistry (e.g. As, Hg, etc.) on a network of sites which are misaligned to the outcome residential addresses during pregnancy. The true chemical concentration at the residential addresses is not observed directly and must be interpolated from soil samples. In this study, we have developed a Bayesian joint model which interpolates soil chemical fields and estimates the associated MRDD risk simultaneously. Having multiple spatial fields to interpolate, we have considered a low-rank Kriging method for the interpolation which requires less computation than Bayesian Kriging. We performed a sensitivity analysis for a bivariate smoothing, changing the number of knots and the smoothing parameter. These analyses show that a low-rank Kriging method can be used as an alternative to a full-rank Kriging, reducing computational burden. However, the number of knots for the low-rank Kriging model needs to be selected with caution, as a bivariate surface estimation can be sensitive to the choice of the number of knots.

doi:10.1002/sim.3777

PMCID: PMC3004226
PMID: 19904772

environmental exposure; logistic; spatial; low-rank Kriging; Bayesian

We computationally investigate two approaches for uncertainty quantification in inverse problems for nonlinear parameter dependent dynamical systems. We compare the bootstrapping and asymptotic theory approaches for problems involving data with several noise forms and levels. We consider both constant variance absolute error data and relative error which produces non-constant variance data in our parameter estimation formulations. We compare and contrast parameter estimates, standard errors, confidence intervals, and computational times for both bootstrapping and asymptotic theory methods.

doi:10.1016/j.mcm.2010.06.026

PMCID: PMC2935305
PMID: 20835347

Uncertainty quantification; parameter estimation; nonlinear dynamic models; bootstrapping; asymptotic theory standard errors; ordinary least squares vs. generalized least squares; computational examples

Background

Very frequently the same biological system is described by several, sometimes competing mathematical models. This usually creates confusion around their validity, i.e., which one is correct. However, this is unnecessary since validity of a model cannot be established; model validation is actually a misnomer. In principle the only statement that one can make about a system model is that it is incorrect, i.e., invalid, a fact which can be established given appropriate experimental data. Nonlinear models of high dimension and with many parameters are impossible to invalidate through simulation and as such the invalidation process is often overlooked or ignored.

Results

We develop different approaches for showing how competing ordinary differential equation (ODE) based models of the same biological phenomenon containing nonlinearities and parametric uncertainty can be invalidated using experimental data. We first emphasize the strong interplay between system identification and model invalidation and we describe a method for obtaining a lower bound on the error between candidate model predictions and data. We then turn to model invalidation and formulate a methodology for discrete-time and continuous-time model invalidation. The methodology is algorithmic and uses Semidefinite Programming as the computational tool. It is emphasized that trying to invalidate complex nonlinear models through exhaustive simulation is not only computationally intractable but also inconclusive.

Conclusion

Biological models derived from experimental data can never be validated. In fact, in order to understand biological function one should try to invalidate models that are incompatible with available data. This work describes a framework for invalidating both continuous and discrete-time ODE models based on convex optimization techniques. The methodology does not require any simulation of the candidate models; the algorithms presented in this paper have a worst case polynomial time complexity and can provide an exact answer to the invalidation problem.

doi:10.1186/1471-2105-10-132

PMCID: PMC2704209
PMID: 19422679

It has been suggested that children with larger brains tend to perform better on IQ tests or cognitive function tests. The prenatal period and infancy are two crucial periods of head growth for subsequent intelligence. Studies have shown that environmental exposure to air pollutants during pregnancy is associated with fetal growth reduction, developmental delay, and reduced IQ. Meanwhile, genetic polymorphisms may modify the effect of environment on head growth. However, studies on gene–environment or gene–gene interactions on growth trajectories have been quite limited, partly because of the difficulty of quantitatively measuring interactions on growth trajectories. Moreover, it is known that assessing the significance of gene–environment or gene–gene interactions on cross-sectional outcomes empirically using permutation procedures may introduce substantial errors in the tests. We proposed a score that quantitatively measures interactions on growth trajectories and developed an algorithm with a parametric bootstrap procedure to empirically assess the significance of the interactions on growth trajectories under the likelihood framework. We also derived a Wald statistic to test for interactions on growth trajectories and compared it to the proposed parametric bootstrap procedure. Through extensive simulation studies, we demonstrated the feasibility and power of the proposed testing procedures. We applied our method to a real dataset with head circumference measures from birth to age 7 from a cohort study currently being conducted by the Columbia Center for Children's Environmental Health (CCCEH) in Krakow, Poland, and identified several significant gene–environment interactions on head circumference growth trajectories.

doi:10.1002/gepi.21613

PMCID: PMC3380164
PMID: 22311237

gene–environment interactions; growth curves; Wald test; parametric bootstrap

Case-control studies are widely used to detect gene-environment interactions in the etiology of complex diseases. Many variables that are of interest to biomedical researchers are difficult to measure on an individual level, e.g. nutrient intake, cigarette smoking exposure, long-term toxic exposure. Measurement error causes bias in parameter estimates, thus masking key features of data and leading to loss of power and spurious/masked associations. We develop a Bayesian methodology for analysis of case-control studies for the case when measurement error is present in an environmental covariate and the genetic variable has missing data. This approach offers several advantages. It allows prior information to enter the model to make estimation and inference more precise. The environmental covariates measured exactly are modeled completely nonparametrically. Further, information about the probability of disease can be incorporated in the estimation procedure to improve the quality of parameter estimates, which cannot be done in conventional case-control studies. A unique feature of the procedure under investigation is that the analysis is based on a pseudo-likelihood function; therefore, conventional Bayesian techniques may not be technically correct. We propose an approach using Markov Chain Monte Carlo sampling as well as a computationally simple method based on an asymptotic posterior distribution. Simulation experiments demonstrated that our method produced parameter estimates that are nearly unbiased even for small sample sizes. An application of our method is illustrated using a population-based case-control study of the association of calcium intake with the risk of colorectal adenoma development.

PMCID: PMC3178196
PMID: 21949562

Bayesian inference; Errors in variables; Gene-environment interactions; Markov Chain Monte Carlo sampling; Missing data; Pseudo-likelihood; Semiparametric methods

A hybrid approach is proposed to estimate exposure to fine particulate matter (PM2.5) at a given location and time. This approach builds on satellite-based aerosol optical depth (AOD), air pollution data from sparsely distributed Environmental Protection Agency (EPA) sites and local time–space Kriging, an optimal interpolation technique. Given the daily global coverage of AOD data, we can develop daily estimates of air quality at any given location and time. This can assure unprecedented spatial coverage, needed for air quality surveillance and management and epidemiological studies. In this paper, we developed an empirical relationship between the 2 km AOD and PM2.5 data from EPA sites. Extrapolating this relationship to the study domain resulted in 2.3 million predictions of PM2.5 between 2000 and 2009 in the Cleveland Metropolitan Statistical Area (MSA). We have developed local time–space Kriging to compute exposure at a given location and time using the predicted PM2.5. Daily estimates of PM2.5 were developed for the Cleveland MSA between 2000 and 2009 at 2.5 km spatial resolution; 1.7 million (~79.8%) of 2.13 million predictions required for the multiyear and geographic domain were robust. In the epidemiological application of the hybrid approach, admissions for an acute exacerbation of chronic obstructive pulmonary disease (AECOPD) were examined with respect to time–space lagged PM2.5 exposure. Our analysis suggests that the risk of AECOPD increases 2.3% with a unit increase in PM2.5 exposure within 9 days and 0.05° (~5 km) distance lags. In the aggregated analysis, the exposed groups (who experienced exposure to PM2.5 >15.4 μg/m3) were 54% more likely to be admitted for AECOPD than the reference group. The hybrid approach offers greater spatiotemporal coverage and more reliable characterization of ambient concentration than conventional in situ monitoring-based approaches. Thus, this approach can potentially reduce exposure misclassification errors in conventional air pollution epidemiology studies.

doi:10.1038/jes.2013.52

PMCID: PMC3980441
PMID: 24045428

PM2.5 exposure; local time–space Kriging; aerosol optical depth; time–space lagged exposure; COPD

Background

Providers use risk-adjustment systems to help manage healthcare costs. Typically, ordinary least squares (OLS) models on either untransformed or log-transformed cost are used. We examine the predictive ability of several statistical models, demonstrate how model choice depends on the goal for the predictive model, and examine whether building models on samples of the data affects model choice.

Methods

Our sample consisted of 525,620 Veterans Health Administration patients with mental health (MH) or substance abuse (SA) diagnoses who incurred costs during fiscal year 1999. We tested two models on a transformation of cost: a Log Normal model and a Square-root Normal model, and three generalized linear models on untransformed cost, defined by distributional assumption and link function: Normal with identity link (OLS); Gamma with log link; and Gamma with square-root link. Risk-adjusters included age, sex, and 12 MH/SA categories. To determine the best model for the entire dataset, predictive ability was evaluated using root mean square error (RMSE), mean absolute prediction error (MAPE), and predictive ratios of predicted to observed cost (PR) among deciles of predicted cost, by comparing point estimates and 95% bias-corrected bootstrap confidence intervals. To study the effect of analyzing a random sample of the population on model choice, we re-computed these statistics using random samples beginning with 5,000 patients and ending with the entire sample.
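The three predictive-ability measures named above can be sketched as follows. This is a minimal illustration on synthetic costs, not the study's code: the function names, the simulated data, and the choice to form the predictive ratio as total predicted cost over total observed cost within each decile of predicted cost are all assumptions. The bootstrap confidence intervals are not shown.

```python
import math
import random

def rmse(observed, predicted):
    # Root mean square error of the predictions.
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mape(observed, predicted):
    # Mean absolute prediction error, in the same units as cost.
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def predictive_ratios(observed, predicted, groups=10):
    # Sort patients by predicted cost, split into equal-sized groups (deciles
    # by default), and return total predicted / total observed cost per group.
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    ratios = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        ratios.append(sum(p for p, _ in chunk) / sum(o for _, o in chunk))
    return ratios

# Synthetic example: predictions with multiplicative, right-skewed error.
rng = random.Random(1)
predicted = [rng.uniform(1000.0, 20000.0) for _ in range(200)]
observed = [p * math.exp(rng.gauss(0.0, 0.5)) for p in predicted]

ratios = predictive_ratios(observed, predicted)
print(f"RMSE={rmse(observed, predicted):.1f}  MAPE={mape(observed, predicted):.1f}")
print("Predictive ratios by decile:", [round(r, 2) for r in ratios])
```

A predictive ratio near 1 in every decile indicates that a model neither over- nor under-predicts systematically across the range of predicted cost, which is a different goal from minimizing RMSE or MAPE.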

Results

The Square-root Normal model had the lowest estimates of the RMSE and MAPE, with bootstrap confidence intervals that were always lower than those for the other models. The Gamma with square-root link was best as measured by the PRs. The choice of best model could vary if smaller samples were used and the Gamma with square-root link model had convergence problems with small samples.

Conclusion

Models with square-root transformation or link fit the data best. This function (whether used as transformation or as a link) seems to help deal with the high comorbidity of this population by introducing a form of interaction. The Gamma distribution helps with the long tail of the distribution. However, the Normal distribution is suitable if the correct transformation of the outcome is used.

doi:10.1186/1471-2288-6-53

PMCID: PMC1636059
PMID: 17067394

Tropospheric ozone (O3) pollution is a major problem worldwide, including in the United States of America (USA), particularly during the summer months. Ozone oxidative capacity and its impact on human health have attracted the attention of the scientific community. In the USA, sparse spatial observations for O3 may not provide a reliable source of data over a geo-environmental region. Geostatistical Analyst in ArcGIS has the capability to interpolate values in unmonitored geo-spaces of interest. In this study of eastern Texas O3 pollution, hourly episodes for spring and summer 2012 were selectively identified. To visualize the O3 distribution, geostatistical techniques were employed in ArcMap. Using ordinary Kriging, geostatistical layers of O3 for all the studied hours were predicted and mapped at a spatial resolution of 1 kilometer. A decent level of prediction accuracy was achieved and was confirmed from cross-validation results. The mean prediction error was close to 0, the root mean-standardized-prediction error was close to 1, and the root mean square and average standard errors were small. O3 pollution map data can be further used in analysis and modeling studies. Kriging results and O3 decadal trends indicate that the populace in Houston-Sugar Land-Baytown, Dallas-Fort Worth-Arlington, Beaumont-Port Arthur, San Antonio, and Longview are repeatedly exposed to high levels of O3-related pollution, and are prone to the corresponding respiratory and cardiovascular health effects. Optimization of the monitoring network proves to be an added advantage for the accurate prediction of exposure levels.
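The cross-validation diagnostics mentioned in this abstract can be computed from leave-one-out predictions and their kriging standard errors. The sketch below assumes the definitions commonly reported by geostatistical software; the function name and the five ozone values are hypothetical and are not data from the study.

```python
import math

def cross_validation_summary(observed, predicted, std_errors):
    # Summarize leave-one-out kriging predictions against observed values.
    n = len(observed)
    errors = [p - o for p, o in zip(predicted, observed)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    return {
        # Should be close to 0 for unbiased predictions.
        "mean_error": sum(errors) / n,
        # Should be small relative to the scale of the data.
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        # Should be close to the RMSE if prediction variability is assessed well.
        "average_standard_error": math.sqrt(sum(s * s for s in std_errors) / n),
        # Should be close to 1 if the standard errors are well calibrated.
        "rms_standardized_error": math.sqrt(sum(z * z for z in standardized) / n),
    }

# Hypothetical hourly O3 values (ppb) at five monitoring sites.
observed = [41.0, 38.5, 45.2, 50.1, 36.8]
predicted = [40.2, 39.9, 44.0, 48.7, 38.1]
std_errors = [1.3, 1.2, 1.4, 1.5, 1.1]

summary = cross_validation_summary(observed, predicted, std_errors)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```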

doi:10.3390/ijerph110100983

PMCID: PMC3924486
PMID: 24434594

tropospheric ozone (O3); geostatistical analysis; prediction; interpolation; spatial resolution; visualization; Geographical Information Systems (GIS)

Background

The dynamics of gene regulation play a crucial role in cellular control, allowing the cell to express the right proteins to meet changing needs. Some needs, such as correctly anticipating the day-night cycle, require complicated oscillatory features. In the analysis of gene regulatory networks, mathematical models are frequently used to understand how a network’s structure enables it to respond appropriately to external inputs. These models typically consist of a set of ordinary differential equations, describing a network of biochemical reactions, and unknown kinetic parameters, chosen such that the model best captures experimental data. However, since a model’s parameter values are uncertain, and since dynamic responses to inputs are highly parameter-dependent, it is difficult to assess the confidence associated with these in silico predictions. In particular, models with complex dynamics - such as oscillations - must be fit with computationally expensive global optimization routines, and cannot take advantage of existing measures of identifiability. Despite being difficult to model mathematically, limit cycle oscillations play a key role in many biological processes, including cell cycling, metabolism, neuron firing, and circadian rhythms.

Results

In this study, we employ an efficient parameter estimation technique to enable a bootstrap uncertainty analysis for limit cycle models. Since the primary role of systems biology models is the insight they provide on responses to rate perturbations, we extend our uncertainty analysis to include first order sensitivity coefficients. Using a literature model of circadian rhythms, we show how predictive precision is degraded with decreasing sample points and increasing relative error. Additionally, we show how this method can be used for model discrimination by comparing the output identifiability of two candidate model structures to published literature data.

Conclusions

Our method permits modellers of oscillatory systems to confidently show that a model’s dynamic characteristics follow directly from experimental data and model structure, relaxing assumptions on the particular parameters chosen. Ultimately, this work highlights the importance of continued collection of high-resolution data on gene and protein activity levels, as they allow the development of predictive mathematical models.

doi:10.1186/1752-0509-7-71

PMCID: PMC3733791
PMID: 23895261

Bootstrap; Identifiability; Oscillatory models; Circadian rhythms; Sensitivity analysis; Parameter estimation

Background

Scoring systems are a very attractive family of clinical predictive models, because the patient score can be calculated without using any data processing system. Their weakness lies in the difficulty of associating a reliable prognostic probability with each score. In this study a bootstrap approach for estimating confidence intervals of outcome probabilities is described and applied to design and optimize the performance of a scoring system for morbidity in intensive care units after heart surgery.

Methods

The bias-corrected and accelerated bootstrap method was used to estimate the 95% confidence intervals of outcome probabilities associated with a scoring system. These confidence intervals were calculated for each score and each step of the scoring-system design by means of one thousand bootstrapped samples. 1090 consecutive adult patients who underwent coronary artery bypass graft were assigned at random to two groups of equal size, so as to define random training and testing sets with equal percentage morbidities. A collection of 78 preoperative, intraoperative and postoperative variables were considered as likely morbidity predictors.
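A minimal sketch of a bias-corrected and accelerated (BCa) bootstrap interval for the outcome probability of one score class is given below. The function name, the half-weighting of ties (used because bootstrap proportions are discrete), and the example counts are assumptions for illustration and are not taken from the study.

```python
import random
from statistics import NormalDist

def bca_interval(outcomes, n_boot=1000, alpha=0.05, seed=0):
    # BCa bootstrap interval for the mean of 0/1 outcomes (an outcome probability).
    rng = random.Random(seed)
    n = len(outcomes)
    total = sum(outcomes)
    theta = total / n
    boots = sorted(sum(rng.choices(outcomes, k=n)) / n for _ in range(n_boot))
    nd = NormalDist()

    # Bias correction z0: share of bootstrap estimates below the point estimate,
    # counting ties as one half and clipping away from 0 and 1.
    below = sum(b < theta for b in boots) + 0.5 * sum(b == theta for b in boots)
    frac = min(max(below / n_boot, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = nd.inv_cdf(frac)

    # Acceleration a from jackknife (leave-one-out) estimates.
    jack = [(total - y) / (n - 1) for y in outcomes]
    jack_mean = sum(jack) / n
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den > 0 else 0.0

    def adjusted_level(z):
        # BCa-adjusted percentile level for the normal quantile z.
        return nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    lo_p = adjusted_level(nd.inv_cdf(alpha / 2))
    hi_p = adjusted_level(nd.inv_cdf(1 - alpha / 2))
    lo = boots[min(n_boot - 1, int(lo_p * n_boot))]
    hi = boots[min(n_boot - 1, int(hi_p * n_boot))]
    return theta, lo, hi

# Hypothetical score class: 30 patients with morbidity out of 120.
outcomes = [1] * 30 + [0] * 90
theta, lo, hi = bca_interval(outcomes)
print(f"estimated probability={theta:.3f}, 95% BCa interval=({lo:.3f}, {hi:.3f})")
```

Computing such an interval for each score class makes it possible to see which adjacent classes have overlapping intervals and are therefore candidates for merging.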

Results

Several competing scoring systems were compared on the basis of discrimination, generalization and uncertainty associated with the prognostic probabilities. The results showed that confidence intervals corresponding to different scores often overlapped, making it convenient to unite and thus reduce the score classes. After uniting two adjacent classes, a model with six score groups not only gave a satisfactory trade-off between discrimination and generalization, but also enabled patients to be allocated to classes, most of which were characterized by well separated confidence intervals of prognostic probabilities.

Conclusions

Scoring systems are often designed solely on the basis of discrimination and generalization characteristics, to the detriment of prediction of a trustworthy outcome probability. The present example demonstrates that using a bootstrap method for the estimation of outcome-probability confidence intervals provides useful additional information about score-class statistics, guiding physicians towards the most convenient model for predicting morbidity outcomes in their clinical context.

doi:10.1186/1472-6947-10-45

PMCID: PMC2940863
PMID: 20796275

Background

Non-parametric bootstrapping is a widely-used statistical procedure for assessing confidence of model parameters based on the empirical distribution of the observed data [1] and, as such, it has become a common method for assessing tree confidence in phylogenetics [2]. Traditional non-parametric bootstrapping does not weigh each tree inferred from resampled (i.e., pseudo-replicated) sequences. Hence, the quality of these trees is not taken into account when computing bootstrap scores associated with the clades of the original phylogeny. As a consequence, traditionally, the trees with different bootstrap support or those providing a different fit to the corresponding pseudo-replicated sequences (the fit quality can be expressed through the LS, ML or parsimony score) contribute in the same way to the computation of the bootstrap support of the original phylogeny.

Results

In this article, we discuss the idea of applying weighted bootstrapping to phylogenetic reconstruction by weighting each phylogeny inferred from resampled sequences. Tree weights can be based either on the least-squares (LS) tree estimate or on the average secondary bootstrap score (SBS) associated with each resampled tree. Secondary bootstrapping consists of the estimation of bootstrap scores of the trees inferred from resampled data. The LS and SBS-based bootstrapping procedures were designed to take into account the quality of each "pseudo-replicated" phylogeny in the final tree estimation. A simulation study was carried out to evaluate the performances of the five weighting strategies which are as follows: LS and SBS-based bootstrapping, LS and SBS-based bootstrapping with data normalization and the traditional unweighted bootstrapping.
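The idea of weighting each resampled tree can be sketched as a weighted clade count. In the sketch below a tree is represented simply as the set of its non-trivial clades, and the weights are hypothetical quality scores; how the weights would actually be derived from LS or SBS values is not specified here, and the function name is an assumption.

```python
def clade_support(reference_clades, replicate_trees, weights=None):
    # Support of each reference clade = weighted share of replicate trees
    # containing it. With weights=None this is the traditional unweighted bootstrap.
    if weights is None:
        weights = [1.0] * len(replicate_trees)
    total = sum(weights)
    return {
        clade: sum(w for tree, w in zip(replicate_trees, weights) if clade in tree) / total
        for clade in reference_clades
    }

# Clades over taxa A, B, C, D, written as frozensets so they can be set members.
AB = frozenset({"A", "B"})
AC = frozenset({"A", "C"})
CD = frozenset({"C", "D"})
ABC = frozenset({"A", "B", "C"})

reference = [AB, ABC]          # clades of the original phylogeny
replicates = [                 # trees inferred from four resampled alignments
    {AB, ABC},
    {AB, CD},
    {AC, ABC},
    {AB, ABC},
]
weights = [0.9, 0.4, 0.7, 1.0]  # hypothetical per-tree quality scores

unweighted = clade_support(reference, replicates)
weighted = clade_support(reference, replicates, weights)
print("unweighted:", {tuple(sorted(c)): round(s, 3) for c, s in unweighted.items()})
print("weighted:  ", {tuple(sorted(c)): round(s, 3) for c, s in weighted.items()})
```

In this toy example the low-weight second replicate contributes less, so the weighted support of clade ABC, which that replicate lacks, rises relative to the unweighted value.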

Conclusions

The simulations conducted with two real data sets and the five weighting strategies suggest that the SBS-based bootstrapping with data normalization usually exhibits larger bootstrap scores and a higher robustness compared to the four other competing strategies, including the traditional bootstrapping. The high robustness of the normalized SBS could be particularly useful in situations where observed sequences have been affected by noise or have undergone massive insertion or deletion events. The results provided by the four other strategies were very similar regardless of the noise level, thus also demonstrating the stability of the traditional bootstrapping method.

doi:10.1186/1471-2148-10-250

PMCID: PMC2939571
PMID: 20716358

This paper details the design, evaluation, and implementation of a framework for detecting and modeling nonlinearity between a binary outcome and a continuous predictor variable adjusted for covariates in complex samples. The framework provides familiar-looking parameterizations of output in terms of linear slope coefficients and odds ratios. Estimation methods focus on maximum likelihood optimization of piecewise linear free-knot splines formulated as B-splines. Correctly specifying the optimal number and positions of the knots improves the model, but is marked by computational intensity and numerical instability. Our inference methods utilize both parametric and nonparametric bootstrapping. Unlike other nonlinear modeling packages, this framework is designed to incorporate multistage survey sample designs common to nationally representative datasets. We illustrate the approach and evaluate its performance in specifying the correct number of knots under various conditions with an example using body mass index (BMI; kg/m2) and the complex multi-stage sampling design from the Third National Health and Nutrition Examination Survey to simulate binary mortality outcomes data having realistic nonlinear sample-weighted risk associations with BMI. BMI and mortality data provide a particularly apt example and area of application since BMI is commonly recorded in large health surveys with complex designs, often categorized for modeling, and nonlinearly related to mortality. When complex sample design considerations were ignored, our method was generally similar to or more accurate than two common model selection procedures, Schwarz’s Bayesian Information Criterion (BIC) and Akaike’s Information Criterion (AIC), in terms of selecting the correct number of knots. Our approach provided accurate knot selections when complex sampling weights were incorporated, while AIC and BIC were not effective under these conditions.

doi:10.3389/fnut.2014.00016

PMCID: PMC4297674
PMID: 25610831

Free-knot splines; nonlinear modeling; logistic regression; bootstrap; complex samples; body mass index