
Article sections

- Abstract
- 1. Introduction
- 2. Method and parameter inference
- 3. Using the Intake_epis_food() function in R
- 4. Application of the Intake_epis_food() function
- 5. Discussion
- References


J Stat Softw. Author manuscript; available in PMC 2012 July 24.


PMCID: PMC3403723

NIHMSID: NIHMS355296

Adriana Pérez, Division of Biostatistics and Michael & Susan Dell Center for Healthy Living, The University of Texas Health Science Center at Houston, School of Public Health, 1616 Guadalupe Street, Suite 6.340, Austin, TX 78705, Telephone: +1-(512)-391-2524, Fax: +1-(512)-482-6185, Email: ude.cmt.htu@zerep.anairda;

The publisher's final edited version of this article is available at J Stat Softw

We consider a Bayesian analysis using `WinBUGS` to estimate the distribution of usual intake for episodically consumed foods and energy (calories). The model uses measures of nutrition and energy intakes via a food frequency questionnaire (FFQ) along with repeated 24 hour recalls and adjusting covariates. In order to estimate the usual intake of the food, we phrase usual intake in terms of person-specific random effects, along with day-to-day variability in food and energy consumption. Three levels are incorporated in the model. The first level incorporates information about whether an individual in fact reported consumption of a particular food item. The second level incorporates the amount of intake from those individuals who reported consumption of the food, and the third level incorporates the energy intake. Estimates of posterior means of parameters and distributions of usual intakes are obtained by using Markov chain Monte Carlo calculations. This `R` function reports to users point estimates and credible intervals for parameters in the model, samples from their posterior distribution, samples from the distribution of usual intake and usual energy intake, trace plots of parameters and summary statistics of usual intake, usual energy intake and energy adjusted usual intake.

There are many statistical challenges when modeling food intakes reported on two or more 24-hour recalls. One challenge is measurement error: estimating the distribution of usual intake of nutrients and foods in the population involves monitoring and measuring such intakes over time, and these measurements are subject to recall bias. In addition, many consumers report episodically consumed foods, that is, foods that are not typically consumed every day. For example, fish may be reported with a positive intake on one day, while on a different day fish intake may equal zero. Consequently, it is difficult to estimate nutrition intake with recall surveys when the recalls contain excess zero measurements. Data of this nature are often modeled with measurement error models for zero-inflated data.

Recently, two-part methods have been developed for analyzing episodically consumed foods in nutritional surveillance (Tooze *et al.* 2006) and nutritional epidemiology (Kipnis *et al.* 2009). In the first part of the model, the probability of consuming an episodically consumed food is estimated using logistic regression with a person-specific random effect. In the second part, the amount of that food consumed per day is modeled using linear regression on a transformed scale, also with a person-specific effect. The two parts are linked by allowing the two person-specific effects to be correlated and by allowing common covariates in both parts of the model. This method is known as the “NCI method” (http://riskfactor.cancer.gov/diet/usualintakes/).

An extension into a three-part method that also incorporates the estimation of the amount of energy intake consumed per day is described in detail by Kipnis *et al.* (2010). These authors estimated this three-part method using nonlinear mixed effects models with likelihoods computed by adaptive Gaussian quadrature in `SAS` software. However, computationally it was found to have serious convergence issues within the context of nutritional epidemiology and nutritional surveillance as indicated by Kipnis *et al.* (2010) and Zhang *et al.* (2010).

The goal of this article is to present a function for the implementation of this nonlinear mixed effects model. The function `Intake_epis_food()` allows readers to input their data in `R` (R Development Core Team 2009), to generate and run the script of the three-part model in `WinBUGS` (Spiegelhalter *et al.* 1999), and to return the simulation output to `R` using the package `R2WinBUGS` (Sturtz *et al.* 2005). While the most important functions of the package `R2WinBUGS` are illustrated, we do not provide comprehensive documentation here; instead the reader is referred to the manual and online documentation of that package, available from the Comprehensive `R` Archive Network at http://CRAN.R-project.org/package=R2WinBUGS. In the next sections, the function and corresponding algorithms are explained and an example is provided.

The computational details of the Bayesian approach to fit the three-part model for an episodically consumed food and energy (Kipnis *et al.* 2010) through Markov Chain Monte Carlo techniques are provided by Zhang *et al.* (2010), with a brief summary given here. Our model takes into consideration measures of nutrition and energy intakes via a food frequency questionnaire (FFQ) along with two repeated 24-hour recalls (24hr) and adjustment covariates. These two repeated 24hr provide information on the amounts of food and energy consumed by each individual. Consequently, an indicator variable of whether the food was consumed can be generated from the reported amount of food consumed. In addition, because two or more 24hr require distinguishing between-person and within-person random error (Eckert *et al.* 1997), a type of classical measurement error model is needed. Nutritional data are generally skewed, which may require transformations to reach normality, followed by standardization. For person *i* = 1, …, *n*, and for the *k* = 1, 2 repeats of the 24hr, the data are *Ỹ*_{ik} = (*Y*_{i1k}, *Y*_{i2k}, *Y*_{i3k}), where

- *Y*_{i1k} = Indicator of whether the food is consumed.
- *Y*_{i2k} = Amount of the food consumed as reported by the 24hr, which equals zero if the food is not consumed.
- *Y*_{i3k} = Amount of energy consumed as reported by the 24hr.

We will use a Box-Cox transformation to account for the skewness in the data. The Box-Cox transformation with transformation parameter λ, is *h*(*y*, λ) = (*y*^{λ} −1)/λ if λ ≠ 0, and *h*(*y*, λ) = log(*y*) if λ = 0. We will allow for the user to specify different transformation parameters λ_{F} and λ_{E} for food and energy intake, respectively.
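The Box-Cox transformation defined above is short enough to write out directly. The following is a minimal sketch in `R`; the function name `box_cox` is ours for illustration and is not taken from the code distributed with the paper.

```r
# Box-Cox transformation h(y, lambda) as defined in the text.
# `box_cox` is an illustrative name, not a function from the paper's code.
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))                     # defined for positive intakes only
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

y <- c(0.5, 1, 2, 4)
box_cox(y, 0.32)   # lambda_F = 0.32, the value used for food intake in the example
box_cox(y, 0)      # lambda = 0 gives the log transformation
box_cox(y, 1)      # lambda = 1 only shifts the data (y - 1), leaving the shape unchanged
```

Note that zero food amounts cannot be passed to this sketch; in the model they are handled through the consumption indicator rather than through the transformation.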

After Box-Cox transformation, we further standardize and center these transformed variables to have a mean of zero and variance of one. This is useful for making the prior distribution specifications given below to be sensible and allows rapid convergence of the posterior samples. Specifically, let μ_{λF} and σ_{λF} be the mean and standard deviation of the transformed non-zero food data *h*(*Y*_{i2k}, λ_{F}), and let μ_{λE} and σ_{λE} be the mean and standard deviation of the transformed energy data *h*(*Y*_{i3k}, λ_{E}). Then, our analysis is performed using the following transformations:

$${Q}_{i1k}={Y}_{i1k};$$

(1)

$${Q}_{i2k}=\sqrt{2}{Y}_{i1k}\{h({Y}_{i2k},{\mathrm{\lambda}}_{F})-{\mu}_{{\mathrm{\lambda}}_{F}}\}/{\sigma}_{{\mathrm{\lambda}}_{F}};$$

(2)

$${Q}_{i3k}=\sqrt{2}\{h({Y}_{i3k},{\mathrm{\lambda}}_{E})-{\mu}_{{\mathrm{\lambda}}_{E}}\}/{\sigma}_{{\mathrm{\lambda}}_{E}}={h}_{\text{tr}}({Y}_{i3k},{\mathrm{\lambda}}_{E},{\mu}_{{\mathrm{\lambda}}_{E}},{\sigma}_{{\mathrm{\lambda}}_{E}});$$

(3)
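As an illustration of equations (1) to (3), the sketch below builds *Q*_{i1k}, *Q*_{i2k} and *Q*_{i3k} from a small set of made-up recalls. All data values and object names are hypothetical, and the code is not the implementation used inside `Intake_epis_food()`.

```r
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Toy recalls for 4 people (rows) x 2 days (columns); values are hypothetical.
food   <- matrix(c(0, 1.2, 0.4, 0,  0.8, 0, 0.5, 2.0), nrow = 4)
energy <- matrix(c(2100, 1800, 2500, 1950, 2300, 1700, 2600, 2050), nrow = 4)
lambda_f <- 0.32
lambda_e <- 0.23

# Equation (1): consumption indicator.
Q1 <- (food > 0) * 1

# Equation (2): mean and sd are taken over the non-zero food amounts only,
# and Q2 stays at zero on days when the food was not consumed.
hf   <- box_cox(food[food > 0], lambda_f)
mu_f <- mean(hf)
sd_f <- sd(hf)
Q2 <- matrix(0, nrow(food), ncol(food))
Q2[food > 0] <- sqrt(2) * (hf - mu_f) / sd_f

# Equation (3): energy is positive on every day, so all values are used.
he   <- box_cox(energy, lambda_e)
mu_e <- mean(he)
sd_e <- sd(he)
Q3 <- sqrt(2) * (he - mu_e) / sd_e
```

The values `mu_f`, `sd_f`, `mu_e` and `sd_e` need to be kept, because the same quantities are required later to back-transform to the original scale.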

There are also covariates such as age category, ethnic status and, in many cases, the reported intakes from a food frequency questionnaire. In principle, the covariates can differ across the three types of data, so we denote them as *V*_{i1}, *V*_{i2}, *V*_{i3}, which are vector-valued. To improve linearity and homoscedasticity of the model, we follow the recommendations of (i) implementing Box-Cox transformations on all or some covariates, such as intakes from a food frequency questionnaire (Kipnis *et al.* 2009), (ii) centering and (iii) scaling all covariates. Without loss of generality, let λ_{c} represent a vector containing the Box-Cox transformation parameters used for the covariates. Let μ_{λc} and σ_{λc} be the vectors of means and standard deviations of the transformed covariates. Then, these transformed, centered and scaled covariates are denoted by:

$${X}_{i1}=\{h({V}_{i1},{\mathrm{\lambda}}_{c})-{\mu}_{{\mathrm{\lambda}}_{c}}\}/{\sigma}_{{\mathrm{\lambda}}_{c}}$$

(4)

$${X}_{i2}=\{h({V}_{i2},{\mathrm{\lambda}}_{c})-{\mu}_{{\mathrm{\lambda}}_{c}}\}/{\sigma}_{{\mathrm{\lambda}}_{c}}$$

(5)

$${X}_{i3}=\{h({V}_{i3},{\mathrm{\lambda}}_{c})-{\mu}_{{\mathrm{\lambda}}_{c}}\}/{\sigma}_{{\mathrm{\lambda}}_{c}}$$

(6)

Let (*U*_{i1}, *U*_{i2}, *U*_{i3}) ∼ Normal(0, Σ_{u}) be random effects associated with consumption, amount (if the food is consumed), and energy. Similarly, for *k* = 1, 2, (ε_{i1k}, ε_{i2k}, ε_{i3k}) ∼ Normal(0, Σ_{ε}) accounts for day-to-day variation. The model for whether there is consumption can be stated as:

$$\mathtt{P}({Q}_{i1k}=1|{X}_{i1},{U}_{i1},{U}_{i2},{U}_{i3})=\mathrm{\Phi}({\alpha}_{1}+{X}_{i1}^{\mathrm{T}}{\beta}_{1}+{U}_{i1}),$$

(7)

where Φ(·) is the standard normal distribution function and (*W*_{i1k}, *W*_{i2k}, *W*_{i3k}) represent their corresponding latent variables as follows. A food being consumed at visit *k* is equivalent to

$${Q}_{i1k}=1\iff {W}_{i1k}={\alpha}_{1}+{X}_{i1}^{\mathrm{T}}{\beta}_{1}+{U}_{i1}+{\epsilon}_{i1k}>0.$$

(8)

The food when consumed and energy intake are modeled as

$$[{Q}_{i2k}|{Q}_{i1k}>0]\text{}=\text{}{W}_{i2k}={\alpha}_{2}+{X}_{i2}^{\mathrm{T}}{\beta}_{2}+{U}_{i2}+{\epsilon}_{i2k}$$

(9)

$$[{Q}_{i3k}]\text{}=\text{}{W}_{i3k}={\alpha}_{3}+{X}_{i3}^{\mathrm{T}}{\beta}_{3}+{U}_{i3}+{\epsilon}_{i3k}.$$

(10)
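To make the structure of equations (7) to (10) concrete, the following sketch simulates data from the three-part model on the transformed scale. Every parameter value is hypothetical and chosen only for illustration, a single covariate is shared by the three parts for brevity, and the helper `rmvn` is our own.

```r
set.seed(1)
n <- 500   # persons
K <- 2     # repeated 24hr recalls

# Hypothetical parameter values, for illustration only.
alpha <- c(0.9, 0, 0)
beta  <- c(0.5, 0.4, 0.3)
Sigma_u <- matrix(c(0.6, 0.1, 0.1,
                    0.1, 0.4, 0.1,
                    0.1, 0.1, 0.7), 3, 3)
Sigma_e <- matrix(c(1,   0,   0.1,      # (1,1) entry fixed at 1, (1,2) entry at 0
                    0,   1.2, 0.1,
                    0.1, 0.1, 1.1), 3, 3)

# Draw n rows from Normal(0, Sigma) using the Cholesky factor of Sigma.
rmvn <- function(n, Sigma) matrix(rnorm(n * ncol(Sigma)), n) %*% chol(Sigma)

x <- rnorm(n)            # one standardized covariate
U <- rmvn(n, Sigma_u)    # person-specific random effects (U_i1, U_i2, U_i3)

sim_day <- function() {
  eps <- rmvn(n, Sigma_e)                                # day-to-day variation
  W   <- sweep(U + eps + outer(x, beta), 2, alpha, "+")  # latent W_i1k, W_i2k, W_i3k
  Q1  <- as.numeric(W[, 1] > 0)                          # equation (8)
  cbind(Q1 = Q1,
        Q2 = Q1 * W[, 2],                                # equation (9): amount only if consumed
        Q3 = W[, 3])                                     # equation (10)
}
days <- replicate(K, sim_day(), simplify = FALSE)
colMeans(days[[1]])
```

Because the (1,1) entry of Σ_{ε} equals one, the probability that `W[, 1] > 0` given the covariate and *U*_{i1} is the probit expression in equation (7).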

The prior distribution of parameters β_{1}, β_{2}, β_{3} is assumed to be multivariate Normal with vector mean 0 and variance covariance Σ_{u}. An inverse Wishart prior, denoted as IW(Ω_{u}, *m*_{u}), was specified for Σ_{u}. The variance-covariance matrix of the random errors, Σ_{ε}, is constrained and is parameterized as

$${\mathrm{\Sigma}}_{\epsilon}=\left[\begin{array}{ccc}\hfill 1\hfill & \hfill 0\hfill & \hfill {p}_{1}{s}_{33}^{1/2}\hfill \\ \hfill 0\hfill & \hfill {s}_{22}\hfill & \hfill {p}_{2}{({s}_{22}{s}_{33})}^{1/2}\hfill \\ \hfill {p}_{1}{s}_{33}^{1/2}\hfill & \hfill {p}_{2}{({s}_{22}{s}_{33})}^{1/2}\hfill & \hfill {s}_{33}\hfill \end{array}\right],$$

(11)

where *p*_{1} = γ cos(θ) and *p*_{2} = γ sin(θ). The recommended prior distributions are Uniform(0,3) for *s*_{22} and *s*_{33}, Uniform(−1,1) for γ and Uniform(−π, π) for θ (Zhang *et al.* 2010). The inverse of Σ_{u} follows a Wishart distribution, denoted as ${\mathrm{\Sigma}}_{u}^{-1}\sim \mathrm{W}({\mathrm{\Omega}}_{u}^{-1},{m}_{u})$, and the inverse of Σ_{ε} can be written as:

$${\mathrm{\Sigma}}_{\epsilon}^{-1}=\left[\begin{array}{ccc} 1+w{p}_{1}^{2}{s}_{33} & w{p}_{1}{p}_{2}{s}_{33}{s}_{22}^{-1/2} & -w{p}_{1}{s}_{33}^{1/2} \\ w{p}_{1}{p}_{2}{s}_{33}{s}_{22}^{-1/2} & {s}_{22}^{-1}+w{p}_{2}^{2}{s}_{33}{s}_{22}^{-1} & -w{p}_{2}{s}_{33}^{1/2}{s}_{22}^{-1/2} \\ -w{p}_{1}{s}_{33}^{1/2} & -w{p}_{2}{s}_{33}^{1/2}{s}_{22}^{-1/2} & w \end{array}\right],$$

(12)

where $w={\{{s}_{33}(1-{p}_{1}^{2}-{p}_{2}^{2})\}}^{-1}$.
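The parameterization in equation (11) and the closed-form inverse in equation (12) can be checked numerically. The sketch below builds both matrices from *s*_{22}, *s*_{33}, γ and θ and verifies that their product is the identity; the function names are ours, and the input values are simply the posterior means reported in the example of Section 4, used here as plausible numbers.

```r
# Sigma_epsilon from equation (11).
make_sigma_e <- function(s22, s33, gamma, theta) {
  p1 <- gamma * cos(theta)
  p2 <- gamma * sin(theta)
  matrix(c(1,              0,                    p1 * sqrt(s33),
           0,              s22,                  p2 * sqrt(s22 * s33),
           p1 * sqrt(s33), p2 * sqrt(s22 * s33), s33),
         nrow = 3, byrow = TRUE)
}

# Closed-form inverse from equation (12).
make_sigma_e_inv <- function(s22, s33, gamma, theta) {
  p1 <- gamma * cos(theta)
  p2 <- gamma * sin(theta)
  w  <- 1 / (s33 * (1 - p1^2 - p2^2))
  a  <- w * p1 * p2 * s33 / sqrt(s22)    # entries (1,2) and (2,1)
  b  <- -w * p1 * sqrt(s33)              # entries (1,3) and (3,1)
  cc <- -w * p2 * sqrt(s33 / s22)        # entries (2,3) and (3,2)
  matrix(c(1 + w * p1^2 * s33, a,                          b,
           a,                  (1 + w * p2^2 * s33) / s22, cc,
           b,                  cc,                         w),
         nrow = 3, byrow = TRUE)
}

S    <- make_sigma_e(s22 = 1.26, s33 = 1.16, gamma = 0.17, theta = 0.81)
Sinv <- make_sigma_e_inv(s22 = 1.26, s33 = 1.16, gamma = 0.17, theta = 0.81)
max(abs(S %*% Sinv - diag(3)))   # numerically zero if (12) inverts (11)
```

Since *p*_{1}^{2} + *p*_{2}^{2} = γ^{2}, restricting γ to (−1, 1) keeps *w* positive and the matrix positive definite, which is one way to read the choice of this polar parameterization.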

Besides the parameters and the random effects (*U*_{i1}, *U*_{i2}, *U*_{i3}), in this context it is vital to also have the posterior samples of usual intake *T*_{Fi} and usual energy intake *T*_{Ei}. A person’s usual energy intake on the original scale is defined as

$${T}_{\mathit{\text{Ei}}}=E\{{h}_{\text{tr}}^{-1}({\alpha}_{3}+{X}_{i3}^{\mathrm{T}}{\beta}_{3}+{U}_{i3}+{\epsilon}_{i3},{\mathrm{\lambda}}_{E},{\mu}_{{\mathrm{\lambda}}_{E}},{\sigma}_{{\mathrm{\lambda}}_{E}})|{X}_{i3},{U}_{i3}\}$$

(13)

$$\approx {h}_{\text{tr}}^{*}\{{\alpha}_{3}+{X}_{i3}^{\mathrm{T}}{\beta}_{3}+{U}_{i3},{\mathrm{\lambda}}_{E},{\mu}_{{\mathrm{\lambda}}_{E}},{\sigma}_{{\mathrm{\lambda}}_{E}},{\mathrm{\Sigma}}_{\epsilon}\phantom{\rule{thinmathspace}{0ex}}(3,3)\}.$$

(14)

where

$${h}_{\text{tr}}^{*}\{\upsilon ,\mathrm{\lambda},\mu ,\sigma ,\mathrm{\Sigma}\}={h}_{\text{tr}}^{-1}(\upsilon ,\mathrm{\lambda},\mu ,\sigma )+\frac{\mathrm{\Sigma}}{2}\frac{{\partial}^{2}{h}_{\text{tr}}^{-1}(\upsilon ,\mathrm{\lambda},\mu ,\sigma )}{\partial {\upsilon}^{2}}.$$

Similarly, a person’s usual intake of the dietary component on the original scale is defined as

$${T}_{\mathit{\text{Fi}}}=\mathrm{\Phi}({\alpha}_{1}+{X}_{i1}^{\mathrm{T}}{\beta}_{1}+{U}_{i1}){h}_{\text{tr}}^{*}\{{\alpha}_{2}+{X}_{i2}^{\mathrm{T}}{\beta}_{2}+{U}_{i2},{\mathrm{\lambda}}_{F},{\mu}_{{\mathrm{\lambda}}_{F}},{\sigma}_{{\mathrm{\lambda}}_{F}},{\mathrm{\Sigma}}_{\epsilon}(2,2)\}.$$

(15)

When λ = 0, the back-transformation and second derivative are:

$$\begin{array}{cc}\hfill {h}_{\text{tr}}^{-1}(\upsilon ,0,\mu ,\sigma )\text{}=& \text{exp}(\mu +\sigma \upsilon /\sqrt{2});\hfill \\ \hfill {\partial}^{2}{h}_{\text{tr}}^{-1}(\upsilon ,0,\mu ,\sigma )/\partial {\upsilon}^{2}\text{}=& {\sigma}^{2}{h}_{\text{tr}}^{-1}(\upsilon ,0,\mu ,\sigma )/2.\hfill \end{array}$$

Similarly, when λ ≠ 0, the back-transformation and second derivative are:

$$\begin{array}{cc}\hfill {h}_{\text{tr}}^{-1}(\upsilon ,\mathrm{\lambda},\mu ,\sigma )\text{}=& {(1+\mathrm{\lambda}\mu +\mathrm{\lambda}\sigma \upsilon /\sqrt{2})}^{1/\mathrm{\lambda}};\hfill \\ \hfill {\partial}^{2}{h}_{\text{tr}}^{-1}(\upsilon ,\mathrm{\lambda},\mu ,\sigma )/\partial {\upsilon}^{2}\text{}=& \hfill \frac{{\sigma}^{2}}{2}(1-\mathrm{\lambda}){(1+\mathrm{\lambda}\mu +\mathrm{\lambda}\sigma \upsilon /\sqrt{2})}^{-2+1/\mathrm{\lambda}}.\hfill \end{array}$$
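The back-transformation formulas above translate directly into code. The following sketch implements the inverse transformation, its second derivative and the approximation *h*^{*}_{tr}, and then evaluates equations (14) and (15) for one person. The function names, the linear predictors in `eta` and the values of μ and σ are hypothetical; they are not taken from the fitted model.

```r
# Inverse of the standardized Box-Cox transformation h_tr.
h_tr_inv <- function(v, lambda, mu, sigma) {
  z <- mu + sigma * v / sqrt(2)
  if (lambda == 0) exp(z) else (1 + lambda * z)^(1 / lambda)
}

# Second derivative of h_tr_inv with respect to v.
h_tr_inv_d2 <- function(v, lambda, mu, sigma) {
  if (lambda == 0) {
    sigma^2 * h_tr_inv(v, 0, mu, sigma) / 2
  } else {
    g <- 1 + lambda * mu + lambda * sigma * v / sqrt(2)
    sigma^2 / 2 * (1 - lambda) * g^(-2 + 1 / lambda)
  }
}

# h*_tr: second-order Taylor approximation of E{h_tr_inv(v + eps)} with Var(eps) = s.
h_tr_star <- function(v, lambda, mu, sigma, s) {
  h_tr_inv(v, lambda, mu, sigma) + s / 2 * h_tr_inv_d2(v, lambda, mu, sigma)
}

# Hypothetical linear predictors alpha + X'beta + U for the three parts.
eta <- c(0.9, -0.1, 0.2)
T_E <- h_tr_star(eta[3], lambda = 0.23, mu = 20, sigma = 2, s = 1.16)     # equation (14)
T_F <- pnorm(eta[1]) *
  h_tr_star(eta[2], lambda = 0.32, mu = -0.2, sigma = 1, s = 1.26)        # equation (15)
1000 * T_F / T_E   # energy-adjusted usual intake
```

In an MCMC setting these expressions would be evaluated at every saved draw of the parameters and random effects, which yields posterior samples of *T*_{Fi} and *T*_{Ei}.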

Users must install the `R2WinBUGS` package (Sturtz *et al.* 2005) in `R`, as well as `WinBUGS` (Spiegelhalter *et al.* 1999), before running the `Intake_epis_food()` function. This function assumes that there are no missing observations. `Intake_epis_food()` loads the libraries `R2WinBUGS`, `stats` and `fSeries` automatically.

The `Intake_epis_food()` function incorporates this three-part model, with truncated normal random variables for generating the latent *W*_{i1k}, and Metropolis-Hastings computations for α_{1}, α_{2}, α_{3}, β_{1}, β_{2}, β_{3}, the elements of Σ_{u}, and Σ_{ε}. This function generates and runs the script of this model in `WinBUGS` and returns the Markov Chain Monte Carlo (MCMC) computation. We emphasize that MCMC computation can be thought of either as a strictly Bayesian computation with ordinary Bayesian inference, or as a means of developing *frequentist* estimators of the crucial parameters. The MCMC computation produces estimates that are known to be asymptotically equivalent to maximum likelihood estimators, which are difficult to obtain (Zhang *et al.* 2010). In Bayesian terms, we used the posterior means of the model. Users are mainly interested in estimates of equations (14) and (15), as well as the energy-adjusted usual intake 1000 *T*_{Fi}/*T*_{Ei}.

The `Intake_epis_food()` function produces seven output files. The first output file contains parameter estimates (posterior means) of α_{1}, α_{2}, α_{3}, β_{1}, β_{2}, β_{3} from equations (8), (9) and (10) as well as Σ_{u}, ${\mathrm{\Sigma}}_{u}^{-1}$, Σ_{ε}, and ${\mathrm{\Sigma}}_{\epsilon}^{-1}$. The second output file contains summary statistics of Bayesian estimates of equations (14), (15) and the energy-adjusted usual intake 1000 *T*_{Fi}/*T*_{Ei}. The third output file contains the iterations from the MCMC computations. The fourth output file contains summary statistics of Bayesian estimates for *U*_{i1}, *U*_{i2}, *U*_{i3} for each person *i* = 1, …, *n*. The remaining three output files contain trace plots of the parameters, trace plots of the variance-covariance matrices, and density plots of usual food intake and usual food intake per 1000 calories.

The default input for the `Intake_epis_food()` function in `R` is:

    Intake_epis_food(data.file.name="data.file.name", numcov=4, df=5,
      lambdaf=0.32, lambdae=0.23, lvar1=999, lvar2=999, lvar3=0.33, lvar4=0,
      n.iter=11000, n.burnin=1000, n.thin=10, bugs.seed=123456,
      working.directory="C:/", file.estimates="estimates.csv",
      file.istats="estimates_intake.csv", file.iterations="iterations.csv",
      file.uis="uis.csv", file.tracep="trace_plots.pdf",
      file.tpvarcov="trace_plot_var_cov_matrices.pdf",
      file.densityp="density_plots.pdf",
      bugs.directory="c:/Program Files/WinBUGS14/")

with the following arguments:

- `data.file.name`: name of the file containing the data (default=data.file.name), with the variables in the following order:
  - the first variable identifies individuals, e.g., an ID number
  - the next two variables are the 24hr variables of food intake, in order
  - the next two variables are the 24hr variables of energy intake, in order
  - the next variables are the covariates, which cannot include a column of ones. We limited the number of covariates to four, given that each covariate may have a different Box-Cox transformation parameter.
- `numcov`: number of covariates in the user’s dataset (default=4)
- `df`: number of degrees of freedom *m*_{u} (default=5)
- `lambdaf`: value of λ_{F} to be used for transformation of food intake (default=1 indicating no transformation done)
- `lambdae`: value of λ_{E} to be used for transformation of energy intake (default=1 indicating no transformation done)
- `lvar1`: value of λ_{c} to be used for Box-Cox transformation of the first covariate (default=999 indicating no transformation done)
- `lvar2`: value of λ_{c} to be used for Box-Cox transformation of the second covariate (default=999 indicating no transformation done)
- `lvar3`: value of λ_{c} to be used for Box-Cox transformation of the third covariate (default=999 indicating no transformation done)
- `lvar4`: value of λ_{c} to be used for Box-Cox transformation of the fourth covariate (default=999 indicating no transformation done)
- `n.iter`: number of iterations of the MCMC, including burn-in iterations (default=11,000)
- `n.burnin`: number of burn-in iterations of the MCMC (default=1,000)
- `n.thin`: thinning rate of the MCMC (default=10)
- `bugs.seed`: random seed to be used for the MCMC (default=123456)
- `working.directory`: drive and folder location for output files (default="C:")
- `file.estimates`: name of the file to save posterior summary statistics, including mean, standard deviation and the 2.5^{th}, 25^{th}, 50^{th}, 75^{th} and 97.5^{th} percentiles, of α_{1}, α_{2}, α_{3}, β_{1}, β_{2}, β_{3} from equations (8), (9) and (10) as well as Σ_{u}, ${\mathrm{\Sigma}}_{u}^{-1}$, Σ_{ε}, and ${\mathrm{\Sigma}}_{\epsilon}^{-1}$ (default = estimates.csv)
- `file.iterations`: name of the file to save iterations (default=iterations.csv). The number of iterations saved corresponds to (*n.iter* − *n.burnin*)/(*n.thin*)
- `file.uis`: name of the file to save summary statistics of Bayesian estimates for *U*_{i1}, *U*_{i2}, *U*_{i3}, including mean, standard deviation and the 2.5^{th}, 25^{th}, 50^{th}, 75^{th} and 97.5^{th} percentiles, for person *i* = 1, …, *n*. The first variable given in the dataset, which identifies individuals, is included as the first variable in this file so that users can link the results to their data.
- `file.tracep`: name of the file to save trace plots of parameters (default = trace_plots.pdf)
- `file.tpvarcov`: name of the file to save trace plots of the variance-covariance matrices Σ_{u}, ${\mathrm{\Sigma}}_{u}^{-1}$, Σ_{ε} and ${\mathrm{\Sigma}}_{\epsilon}^{-1}$ (default = trace_plot_var_cov_matrices.pdf)
- `file.densityp`: name of the file to save density plots of usual food intake and usual food intake per 1000 calories. Each one is generated individually and an overlaid plot is generated as well (default = density_plots.pdf)
- `bugs.directory`: drive and folder location of `WinBUGS` (default="c:/Program Files/WinBUGS14/")
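The column order required for the data file can be illustrated with a small example. The sketch below assembles a data frame in that order, checks the stated assumptions, and writes it to a CSV file. The column names, the data values and the file name are all hypothetical, and whether `Intake_epis_food()` expects a header row is an assumption not confirmed by the text.

```r
n <- 5

# Column names are hypothetical; only the column ORDER is prescribed by the text.
dat <- data.frame(
  id       = seq_len(n),
  food_r1  = c(0, 1.5, 0.7, 0, 2.1),          # first 24hr recall, food amount
  food_r2  = c(0.9, 0, 0.4, 0, 1.8),          # second 24hr recall, food amount
  kcal_r1  = c(2100, 1850, 2400, 1990, 2600), # first 24hr recall, energy
  kcal_r2  = c(2250, 1700, 2300, 2050, 2500), # second 24hr recall, energy
  age      = c(55, 62, 58, 70, 66),
  bmi      = c(24.1, 27.3, 30.2, 22.8, 26.5),
  ffq_wg   = c(0.5, 1.2, 0.8, 0.1, 2.0),
  ffq_kcal = c(1900, 1750, 2200, 1800, 2400)
)

numcov <- 4
covs <- dat[, 6:(5 + numcov)]
stopifnot(
  !anyNA(dat),                                     # no missing observations allowed
  ncol(dat) == 1 + 2 + 2 + numcov,                 # ID, food x 2, energy x 2, covariates
  !any(sapply(covs, function(v) all(v == 1)))      # no column of ones among covariates
)

out_file <- file.path(tempdir(), "my_intake_data.csv")
write.csv(dat, out_file, row.names = FALSE)

# Number of posterior draws kept with the default MCMC settings.
n.iter <- 11000; n.burnin <- 1000; n.thin <- 10
(n.iter - n.burnin) / n.thin   # 1000
```

With the default settings, 1000 draws are saved, which is the number of rows to expect in the file named by `file.iterations`.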

Once the data are input, `Intake_epis_food()` invokes `WinBUGS` from `R`. `WinBUGS` requires a file describing the model in equations (1)–(5) in Section 2. This model is included as the file `intake_program.txt` and is loaded in `R` through the `bugs()` function. Although the default priors of the constrained variance-covariance matrix of random errors are geared to the pre-standardization of the data as described in Section 2, the user may change those prior entries within the file `intake_program.txt`, possibly changing:

Constrained variance-covariance matrix of random errors

| Matrix notation | Default distribution and range | Code in `intake_program.txt` |
|---|---|---|
| s_{22} | Uniform(0,3) | `tau.e1 ~ dunif(0,3)` |
| s_{33} | Uniform(0,3) | `tau.e2 ~ dunif(0,3)` |
| γ | Uniform(−1,1) | `a.e1 ~ dunif(-0.99,0.99)` |
| θ | Uniform(−π, π) | `a.e2 ~ dunif(-3.11,3.11)` |

Users may change any of the defaults of the `Intake_epis_food()` function when the function is called, by replacing those arguments in the last statement of the file accompanying this paper. MCMC posterior means of α_{1}, α_{2}, α_{3}, β_{1}, β_{2}, β_{3}, the elements of Σ_{u}, Σ_{ε}, ${\mathrm{\Sigma}}_{u}^{-1}$ and ${\mathrm{\Sigma}}_{\epsilon}^{-1}$ are displayed on the `R` console by `Intake_epis_food()` and saved automatically as numeric matrices in the files previously listed. See Section 4 for an example.

We simulated data using parameter values identified from the calibration sub-study of the National Institutes of Health (NIH) and the American Association of Retired Persons (AARP) Diet and Health Study (Schatzkin *et al*. 2001). The NIH-AARP study is a cohort composed of people who resided in one of six states: California, Florida, Pennsylvania, New Jersey, North Carolina, and Louisiana or in two metropolitan areas: Atlanta, Georgia and Detroit, Michigan. From 1995 through 1996, 3.5 million questionnaires were mailed to members of the AARP, aged 50–71 years. The questionnaire included a dietary section as well as some lifestyle questions. Over 500,000 people returned the questionnaire after three mailing waves. This, then, is the largest study of diet and health ever conducted in the USA. In 1996–1997, these participants received a Risk-Factor Questionnaire which asked additional questions about lifestyle and behavior, and in 2004–2006 these participants received another follow-up questionnaire (http://dietandhealth.cancer.gov/history.html).

Participants for the calibration study were randomly selected from the 46,970 subjects who had responded to the first wave as of January 1996. The targeted sample size for attempting recruitment calls for a baseline and 24-hour follow-up was two thousand participants. The sample available to us included 920 men. Because the data are not publicly available, we simulated 920 food intakes that are similar in distribution to those of the men in the NIH-AARP calibration study. We used as ${X}_{\mathit{\text{ik}}}^{T}$ four covariates: age, body mass index (BMI), consumption of servings of whole grains from the FFQ, and energy intake from whole grains from the FFQ. Therefore, the degrees of freedom of the inverse Wishart were set to *m*_{u} = 5. The latter two covariates were transformed by the cube root and the logarithm, respectively.

We present results from `Intake_epis_food()` for this simulated data example with a sample size of 920 men, modeling whole grains consumption. We used a burn-in of 1,000 steps followed by 10,000 MCMC iterations; our `Intake_epis_food()` function took approximately 6 hours and 16 minutes (computer with a Pentium(R) D CPU at 3.5GHz and 1.99GB of RAM). We used only every 10^{th} value of the chain. The first output file contains, over the 10,000 MCMC iterations, the posterior mean, the posterior standard deviation, and the posterior 2.5^{th}, 25^{th}, 50^{th}, 75^{th} and 97.5^{th} percentiles. The intercepts of the model are α_{1}, α_{2}, α_{3}, corresponding to each level of the three-part model. Estimates of the regression parameters for each covariate also appear for each level of the model. This is the notation used for the output:

- `alpha_1`: represents the intercept in equation (8).
- `alpha_2`: represents the intercept in equation (9).
- `alpha_3`: represents the intercept in equation (10).
- `beta_1_j`: represents the estimated coefficient of the *j*^{th} covariate in equation (8), j=1,…,numcov.
- `beta_2_j`: represents the estimated coefficient of the *j*^{th} covariate in equation (9), j=1,…,numcov.
- `beta_3_j`: represents the estimated coefficient of the *j*^{th} covariate in equation (10), j=1,…,numcov.
- `tau_j_p`: represents the (j,p) entry estimate of ${\mathrm{\Sigma}}_{u}^{-1}$, j, p = 1, 2, 3.
- `sigma2_j_p`: represents the (j,p) entry estimate of Σ_{u}, j, p = 1, 2, 3.
- `a.e1`: represents the estimate of γ.
- `a.e2`: represents the estimate of θ.
- `sigmaemat_j_p`: represents the (j,p) entry estimate of Σ_{ε}, j, p = 1, 2, 3, with Σ_{ε11}=1 and Σ_{ε12}=Σ_{ε21}=0.
- `tauemat_j_p`: represents the (j,p) entry estimate of ${\mathrm{\Sigma}}_{\epsilon}^{-1}$, j, p = 1, 2, 3.

We used λ_{F} = 0.32 as the Box-Cox transformation parameter of food intake and λ_{E} = 0.23 as the Box-Cox transformation parameter of energy intake. We set up our working directory in `R` with the command `setwd('C:')` and then called the `Intake_epis_food()` function:

    Intake_epis_food(data.file.name="Sim_AARP_wg_Men.csv", numcov=4, df=5,
      lambdaf=0.32, lambdae=0.23, lvar1=999, lvar2=999, lvar3=0.33, lvar4=0,
      n.iter=11000, n.burnin=1000, n.thin=10, bugs.seed=123456,
      working.directory="C:/", file.estimates="estimates.csv",
      file.istats="estimates_intake.csv", file.iterations="iterations.csv",
      file.uis="uis.csv", file.tracep="trace_plots.pdf",
      file.tpvarcov="trace_plot_var_cov_matrices.pdf",
      file.densityp="density_plots.pdf",
      bugs.directory="c:/Program Files/WinBUGS14/")

After the calculations, estimates of α_{1}, α_{2}, α_{3}, β_{1}, β_{2}, β_{3} from equations (8), (9) and (10), Σ_{u}, ${\mathrm{\Sigma}}_{u}^{-1}$, Σ_{ε} and ${\mathrm{\Sigma}}_{\epsilon}^{-1}$ are shown immediately in the console, together with summary statistics, obtained using a Bayesian approach, for three variables: (a) usual food intake, (b) usual food intake per 1000 calories, and (c) usual energy intake. The following shows the results for whole grains.

    Inference for Bugs model at ' Intake_program.txt ', fit using WinBUGS,
     1 chains, each with 11000 iterations (first 1000 discarded)
     n.thin= 10
     n.sims= 1000 iterations saved
                    mean    sd  2.5%   25%   50%   75% 97.5%
    alpha_1         0.89  0.06  0.79  0.85  0.89  0.92  1.00
    alpha_2        -0.04  0.05 -0.13 -0.07 -0.04 -0.01  0.05
    alpha_3         0.00  0.04 -0.07 -0.03  0.00  0.02  0.07
    beta_1_1        0.16  0.05  0.07  0.13  0.16  0.19  0.25
    beta_1_2       -0.08  0.05 -0.18 -0.12 -0.08 -0.05  0.00
    beta_1_3        0.57  0.05  0.47  0.54  0.57  0.61  0.67
    beta_1_4       -0.29  0.05 -0.38 -0.32 -0.29 -0.26 -0.19
    beta_2_1       -0.07  0.04 -0.15 -0.09 -0.06 -0.04  0.01
    beta_2_2       -0.08  0.04 -0.16 -0.11 -0.08 -0.06 -0.01
    beta_2_3        0.42  0.04  0.34  0.40  0.42  0.45  0.50
    beta_2_4       -0.11  0.04 -0.18 -0.13 -0.11 -0.08 -0.03
    beta_3_1       -0.07  0.04 -0.15 -0.10 -0.07 -0.05  0.00
    beta_3_2       -0.06  0.04 -0.14 -0.09 -0.06 -0.04  0.00
    beta_3_3        0.00  0.04 -0.07 -0.03  0.00  0.03  0.08
    beta_3_4        0.34  0.04  0.27  0.32  0.34  0.37  0.42
    tau_1_1         2.05  0.57  1.26  1.66  1.94  2.31  3.50
    tau_1_2        -0.33  0.60 -1.64 -0.66 -0.31  0.04  0.82
    tau_1_3        -0.11  0.19 -0.47 -0.23 -0.11  0.01  0.29
    tau_2_1        -0.33  0.60 -1.64 -0.66 -0.31  0.04  0.82
    tau_2_2         3.29  0.79  2.21  2.77  3.13  3.62  5.31
    tau_2_3        -0.57  0.28 -1.21 -0.73 -0.54 -0.38 -0.09
    tau_3_1        -0.11  0.19 -0.47 -0.23 -0.11  0.01  0.29
    tau_3_2        -0.57  0.28 -1.21 -0.73 -0.54 -0.38 -0.09
    tau_3_3         1.53  0.16  1.26  1.42  1.51  1.63  1.89
    sigma2_1_1      0.56  0.13  0.33  0.47  0.55  0.64  0.83
    sigma2_1_2      0.06  0.09 -0.12  0.00  0.06  0.12  0.22
    sigma2_1_3      0.06  0.06 -0.06  0.02  0.06  0.10  0.18
    sigma2_2_1      0.06  0.09 -0.12  0.00  0.06  0.12  0.22
    sigma2_2_2      0.37  0.07  0.23  0.32  0.36  0.41  0.51
    sigma2_2_3      0.13  0.05  0.03  0.10  0.13  0.16  0.23
    sigma2_3_1      0.06  0.06 -0.06  0.02  0.06  0.10  0.18
    sigma2_3_2      0.13  0.05  0.03  0.10  0.13  0.16  0.23
    sigma2_3_3      0.72  0.07  0.60  0.68  0.72  0.77  0.86
    a.e1            0.17  0.05  0.09  0.14  0.17  0.20  0.26
    a.e2            0.81  0.29  0.28  0.62  0.80  0.99  1.37
    sigmaemat_1_3   0.12  0.05  0.03  0.09  0.12  0.16  0.23
    sigmaemat_2_2   1.26  0.08  1.11  1.21  1.26  1.31  1.42
    sigmaemat_2_3   0.14  0.05  0.04  0.11  0.14  0.18  0.24
    sigmaemat_3_1   0.12  0.05  0.03  0.09  0.12  0.16  0.23
    sigmaemat_3_2   0.14  0.05  0.04  0.11  0.14  0.18  0.24
    sigmaemat_3_3   1.16  0.06  1.06  1.12  1.16  1.19  1.27
    tauemat_1_1     1.02  0.01  1.00  1.01  1.01  1.02  1.05
    tauemat_1_2     0.01  0.01  0.00  0.01  0.01  0.02  0.03
    tauemat_1_3    -0.11  0.05 -0.21 -0.14 -0.11 -0.08 -0.03
    tauemat_2_1     0.01  0.01  0.00  0.01  0.01  0.02  0.03
    tauemat_2_2     0.81  0.05  0.72  0.78  0.81  0.84  0.91
    tauemat_2_3    -0.10  0.04 -0.17 -0.13 -0.10 -0.08 -0.03
    tauemat_3_1    -0.11  0.05 -0.21 -0.14 -0.11 -0.08 -0.03
    tauemat_3_2    -0.10  0.04 -0.17 -0.13 -0.10 -0.08 -0.03
    tauemat_3_3     0.89  0.04  0.81  0.87  0.89  0.92  0.98

          Usual food intake  Usual food intake per 1000 calories  Energy usual intake
    Mean          0.9359701                            0.4072328            2298.3852
    S.d.          0.6947905                            0.2849965             446.3253
    5th           0.1887330                            0.0823138            1636.7102
    10th          0.2632069                            0.1156591            1755.9317
    25th          0.4359173                            0.1992969            1982.9681
    50th          0.7421419                            0.3408097            2267.2364
    75th          1.2117088                            0.5350990            2568.3280
    90th          1.9042402                            0.7920862            2891.3676
    95th          2.3828280                            0.9484864            3088.9744

This is the only output presented in the console screen. These results are saved in `file.estimates` and `file.istats`, respectively. Five additional output files are stored as mentioned before, but examples of these files are not presented here. Instead, we present the posterior density of the mean of usual whole grains intake and the posterior density of the mean of usual whole grains intake per 1000 calories in figure 1 (`file.densityp`). We present trace plots of the intercepts and coefficients in equations (8), (9) and (10) in figure 2 (`file.tracep`). We present trace plots of the entry estimates of the variance-covariance matrices, saved in `file.tpvarcov`, (i) for Σ_{u} in figure 3, (ii) for ${\mathrm{\Sigma}}_{u}^{-1}$ in figure 4, (iii) for Σ_{ε} in figure 5 and (iv) for ${\mathrm{\Sigma}}_{\epsilon}^{-1}$ in figure 6.
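The layout of the usual intake summary (mean, standard deviation and the 5^{th} to 95^{th} percentiles) can be reproduced from any vector of usual intake values. The sketch below uses randomly generated stand-in values, not output of `Intake_epis_food()`, and the helper `summarize` is ours; the exact quantile definition used by the function is not stated in the text, so the default of `quantile()` is an assumption.

```r
set.seed(7)

# Stand-in values for 920 persons; these are NOT results from the paper.
T_F   <- rgamma(920, shape = 2, rate = 2)     # usual food intake
T_E   <- rnorm(920, mean = 2300, sd = 450)    # usual energy intake
T_adj <- 1000 * T_F / T_E                     # usual food intake per 1000 calories

# Mean, standard deviation and selected percentiles of a numeric vector.
summarize <- function(x) {
  c(Mean = mean(x), S.d. = sd(x),
    quantile(x, probs = c(0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95)))
}

tab <- cbind("Usual food intake"                   = summarize(T_F),
             "Usual food intake per 1000 calories" = summarize(T_adj),
             "Energy usual intake"                 = summarize(T_E))
round(tab, 4)
```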

Posterior density of the mean for whole grains. The solid line is the density estimate for usual intake from 1000 MCMC. The dashed line is the density estimate for usual intake per 1000 calories.

Trace plot from 1000 MCMC of intercepts and estimated coefficients. First row shows trace plots of parameters for the intercept, age, body mass index, consumption of whole grains and energy intake from whole grains in equation (8). Second row shows trace **...**

Trace plot from 1000 MCMC for each entry estimate of variance-covariance matrix ${\mathrm{\Sigma}}_{u}^{-1}$, j, p = 1, 2, 3.

Trace plot from 1000 MCMC for each entry estimate of variance-covariance matrix Σ_{ε} j=p=1,2,3. The following parameters are neither estimated nor plotted because they are fixed: Σ_{ε11}=1 and Σ_{ε12}=Σ **...**

This paper was motivated by the AARP calibration sub-study in nutritional epidemiology. Our main aim when implementing this function was to help users fit, in a Bayesian manner, this recent nonlinear mixed three-part measurement error model for an episodically consumed food.

This research was supported by a grant from the National Cancer Institute (R37-CA-057030).

Adriana Pérez, Division of Biostatistics and Michael & Susan Dell Center for Healthy Living, The University of Texas Health Science Center at Houston, School of Public Health, 1616 Guadalupe Street, Suite 6.340, Austin, TX 78705, Telephone: +1-(512)-391-2524, Fax: +1-(512)-482-6185, Email: ude.cmt.htu@zerep.anairda.

Saijuan Zhang, Department of Statistics, Blocker Building, Room, Texas A&M University, 3143 TAMU, College Station TX 77843-3143, Telephone: +1-979-845-3141, Fax: +1-979-845-3144, Email: ude.umat.tats@gnahzjs..

Victor Kipnis, Division of Cancer Prevention, National Cancer Institute, 6130 Executive Boulevard, Bethesda, Maryland 20892-7354, Telephone: +1-301-496-7464, Fax: +1-301-496-7463, Email: vog.hin@b3kv.

Douglas Midthune, Division of Cancer Prevention, National Cancer Institute, 6130 Executive Boulevard, Bethesda, Maryland 20892-7354, Telephone: +1-301-496-7464, Fax: +1-301-496-7463, Email: vog.hin@q67md.

Laurence S. Freedman, Gertner Institute for Epidemiology and Health Policy Research, Biostatistics Unit, Sheba Medical Center, Tel Hashomer 52161, Israel, Telephone: +972-3-530-5390, Fax: +972-3-534-9607, Email: li.oc.moctca@fsl.

Raymond J. Carroll, Department of Statistics, Blocker Building, Room, Texas A&M University, 3143 TAMU, College Station TX 77843-3143, Telephone: +1-979-845-3141, Fax: +1-979-845-3144, Email: ude.umat.tats@llorrac..

- Dodd KW, Guenther PM, Freedman LS, Subar AF, Kipnis V, Midthune D, Tooze JA, Krebs-Smith SM. Statistical Methods for Estimating Usual Intake of Nutrients and Foods: a Review of the Theory. Journal of the American Dietetic Association. 2006;106(10):1640–1650.
- Eckert RS, Carroll RJ, Wang N. Transformations to Additivity in Measurement Error Models. Biometrics. 1997;53:262–272.
- Kipnis V, Freedman LS, Carroll RJ, Midthune D. A Measurement Error Model for Episodically Consumed Foods and Energy. 2010. Preprint.
- Kipnis V, Midthune D, Buckman DW, Dodd KW, Guenther PM, Krebs-Smith SM, Subar AF, Tooze JA, Carroll RJ, Freedman LS. Modeling Data with Excess Zeros and Measurement Error: Application to Evaluating Relationships between Episodically Consumed Foods and Health Outcomes. Biometrics. 2009;65(4):1003–1010.
- R Development Core Team. R: A Language and Environment for Statistical Computing. 2009. ISBN 3-900051-07-0, URL http://www.R-project.org.
- Schatzkin A, Subar AF, Thompson FE, Harlan LC, Tangrea J, Hollenbeck AR, Hurwitz PE, Coyle L, Schussler N, Michaud DS, Freedman LS, Brown CC, Midthune D, Kipnis V. Design and Serendipity in Establishing a Large Cohort with Wide Dietary Intake Distributions: the National Institutes of Health-AARP Diet and Health Study. American Journal of Epidemiology. 2001;154:1119–1125.
- Spiegelhalter DJ, Thomas A, Best NG. WinBUGS Version 1.2 User Manual. Cambridge, UK: MRC Biostatistics Unit; 1999. URL http://www.mrc-bsu.cam.ac.uk/bugs/.
- Sturtz S, Ligges U, Gelman A. R2WinBUGS: A Package for Running WinBUGS from R. Journal of Statistical Software. 2005;12(3):1–16. URL http://www.jstatsoft.org.
- Tooze JA, Midthune D, Dodd KW, Freedman LS, Krebs-Smith SM, Subar AF, Guenther PM, Carroll RJ, Kipnis V. A New Statistical Method for Estimating the Usual Intake of Episodically Consumed Foods With Application to Their Distribution. Journal of the American Dietetic Association. 2006;106(10):1575–1587.
- Zhang S, Midthune D, Pérez A, Buckman DW, Kipnis V, Freedman LS, Dodd KW, Krebs-Smith SM, Carroll RJ. A Bivariate Nonlinear Measurement Error Model for Episodically Consumed Dietary Components. 2010. Preprint.
