
Comput Stat Data Anal. Author manuscript; available in PMC 2011 January 5.

Published in final edited form as:

Comput Stat Data Anal. 2009 June 1; 53(8): 2831–2842.

doi: 10.1016/j.csda.2008.05.017 | PMCID: PMC3016053

NIHMSID: NIHMS237214

Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, 800 Sumter Street, Columbia, SC 29208, United States


A range of point process models commonly used in spatial epidemiology to assess the increased incidence of disease is compared. The models considered range from approximate methods to an exact method. The approximate methods include the Poisson process model and methods based on discretization of the study window. The exact method is a marked point process model, i.e., the conditional logistic model. In addition to analyzing a real dataset (the Lancashire larynx cancer data), a small simulation study is carried out to examine the ability of these methods to recover known parameter values. The main results are as follows. In estimating the effect of distance from the incinerator on larynx cancer incidence, the conditional logistic model and the binomial model for the discretized window perform relatively well. In explaining the spatial heterogeneity, the Poisson model (or the log Gaussian Cox process model) for the discretized window produces the best estimate.

Analysis of case event data, i.e., residential locations of disease cases (also known as spatial point patterns), is gaining importance in spatial epidemiology. The advantage of this approach is that the analysis incorporates the geographical locations of the events of interest, which helps to reduce model variance and leads to correct inferential procedures. This paper reviews some of the point process models commonly used to assess the effects of putative sources of hazard on the increased incidence of disease, and compares their relative performance when the true parameter values are known. We also discuss the differences in interpreting the distance effect under each of these models. The approximate methods include the Poisson process model and methods based on discretization of the study window. The exact method is based on a marked point process model, i.e., a conditional logistic model. The paper also addresses the issue of flexible modeling by demonstrating the use of approximate likelihood and Bayesian models, and their posterior sampling.

In the analysis of case event data, theoretical advances have outpaced ready-to-use software. While routines have now appeared, for example in R (R Development Core Team, 2004), that allow testing or simple modeling (e.g. DCluster (Gómez-Rubio et al., 2004), Spatstat (Baddeley and Turner, 2005), Splancs (Rowlingson and Diggle, 1993)), there is limited availability of algorithms that can be implemented easily for more sophisticated analyses with covariates. This is partly because the basic likelihoods, e.g. that of the non-homogeneous Poisson process, involve normalizing constants that must be evaluated. There is a demand for flexible approaches to the modeling of case event data that allow for the inclusion of covariates (spatially referenced or otherwise). A further advantage would be the extra flexibility of a Bayesian hierarchical modeling approach to case event data. These needs mirror the development of Bayesian methods and software for count data (see e.g. Congdon (2003) and Lawson et al. (2003)). All the methods described in this paper, approximate and exact, are implemented in WinBUGS (Spiegelhalter et al., 2003), and hence the full range of Bayesian modeling machinery within that package is potentially available.

The organization of this paper is as follows. In the next section, we give a brief illustration of all the methods that are considered in this paper to analyze a point process dataset. In this illustration, we describe the point process models with the Berman and Turner (1992) proposal of weighted sum approximation to an integral in the likelihood (see also Lawson (1992)), the conditional logistic models (Diggle and Rowlingson, 1994), and very briefly, the grid mesh construction. The details of the Poisson mesh model and binomial mesh model approaches, based on the discretization of the study window, are introduced in Section 5. Although the discretization approach to case event data (or point process data) in epidemiological studies is quite common, we illustrate the theoretical justification and the assumptions involved in that approach. The Lancashire larynx cancer data are used in relative comparison and the data are introduced in Section 3. In Section 4, we introduce two random components in the specification of intensity function and define their distributions in a Bayesian setting. Section 6 illustrates the larynx cancer data analysis and results. The simulation technique and results are given in Section 7 and the concluding remarks are in Section 8.

A brief illustration of likelihood models and the conditional logistic model is given in this section, followed by a brief introduction of alternative methodology based on grid meshes. The adopted notations are as follows.

Suppose that a geographical study region, *A*, is given, within which we intend to describe the spatial distribution of disease in a population. The data are represented by a set of locations **s**_{i} ∈ *A*, *i* = 1, …, *n* for cases and **c**_{j} ∈ *A*, *j* = 1, …, *m* for controls. For the models that we consider, cases and controls form a pair of independent, inhomogeneous Poisson point processes with respective intensities λ(**s**) and λ_{0}(**c**) at the data locations, where the vectors **s** = (**s**_{1}, …, **s**_{n}) and **c** = (**c**_{1}, …, **c**_{m}) collect the case and control locations.

The number of cases follows a Poisson distribution with expectation
${\mu}_{A}={\int}_{A}\lambda (\mathbf{s})\text{d}\mathbf{s}$, and the likelihood of the observed case locations is

$$L=\prod _{i=1}^{n}\lambda ({\mathbf{s}}_{i})exp\left(-{\int}_{A}\lambda (\mathbf{s})\text{d}\mathbf{s}\right).$$

(2.1)

The intensity function, λ(**s**), can be defined to include a modulating function which can represent the background population, and also covariate information. We assume that the cases are governed by a first-order intensity of the general form

$$\lambda (\mathbf{s})=\rho {\lambda}_{0}(\mathbf{s})\theta (\mathbf{s}),$$

(2.2)

where *θ*(**s**) is the relative risk at location **s**, including all the covariates, and has a parametric form; λ_{0}(**s**) represents the background population effect at **s**; and *ρ* is a scaling parameter reflecting the overall incidence of cases relative to controls. When *θ*(**s**) = 1, *ρ* is simply the scaling factor between the two intensities.

Berman and Turner (1992) proposed a quadrature rule to approximate the integral in (2.1) which makes feasible the estimation of model parameters in a GLM framework. We call this method the BT method. Following the rule, the integral in (2.1) is approximated as

$${\int}_{A}\lambda (\mathbf{s})\text{d}\mathbf{s}\doteq \sum _{k=1}^{N}{\omega}_{k}\lambda ({\mathbf{s}}_{k}),$$

where **s**_{k}, *k* = 1, …, *N*, are quadrature points consisting of the *n* data points augmented with dummy points, and *ω*_{k} are the corresponding quadrature weights. Substituting this approximation into (2.1) gives the approximate log-likelihood

$$l=\sum _{k=1}^{N}{\omega}_{k}\{\frac{{I}_{k}}{{\omega}_{k}}log\lambda ({\mathbf{s}}_{k})-\lambda ({\mathbf{s}}_{k})\}$$

where *I_{k}* is an indicator, taking the value 1 if **s**_{k} is a data point and 0 if it is a dummy point. This has the form of a weighted Poisson log-likelihood, which makes estimation feasible in a GLM framework.
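The weighted log-likelihood above is straightforward to evaluate once the quadrature points and weights are chosen. As a minimal illustration (in Python rather than the R/WinBUGS used in the paper; the function name and interface are our own), the BT approximation can be computed as:

```python
import numpy as np

def bt_loglik(lam, weights, is_data):
    """Berman-Turner approximation to the Poisson process log-likelihood,
    l = sum_k w_k * {(I_k / w_k) * log(lambda_k) - lambda_k},
    where I_k = 1 at data points and 0 at dummy points."""
    lam = np.asarray(lam, dtype=float)      # lambda(s_k) at all quadrature points
    w = np.asarray(weights, dtype=float)    # quadrature weights, sum ~ |A|
    I = np.asarray(is_data, dtype=float)    # data-point indicator I_k
    return float(np.sum(w * (I / w * np.log(lam) - lam)))
```

For a constant intensity λ over a unit-area window, the approximation reduces to *n* log λ − λ, which provides a quick sanity check of an implementation.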

Instead of considering the population to be described as a pair of Poisson processes, we can think of it as a single marked point process with a common intensity (Diggle and Rowlingson, 1994). A random variable *Y_{i}* (*i* = 1, …, *n* + *m*) is defined to take the value 1 if the *i*-th location is a case and 0 if it is a control. Conditional on the locations, the *Y_{i}* are mutually independent Bernoulli variables with success probability

$${p}_{i}=\frac{\lambda ({\mathbf{s}}_{i})}{{\lambda}_{0}({\mathbf{s}}_{i})+\lambda ({\mathbf{s}}_{i})}=\frac{\rho \theta ({\mathbf{s}}_{i})}{1+\rho \theta ({\mathbf{s}}_{i})}.$$

In likelihood construction, the data are now treated as a binary marked observation rather than a point process. An immediate advantage of this approach is that the integral of (2.1), and hence the approximation, are no longer required. The log-likelihood takes the form

$$\sum _{i=1}^{n+m}\{{y}_{i}log{\phi}_{i}-log(1+{\phi}_{i})\},$$

where *ϕ_{i}* = *ρθ*(**s**_{i}) and *y_{i}* is the observed binary mark at the *i*-th location. We refer to this as the CL (conditional logistic) method.
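Because no integral appears, this log-likelihood can be evaluated directly from the marks and the risk values. A small Python sketch (the function name is illustrative; the paper fits this model in WinBUGS), taking the relative risks θ(**s**_{i}) as given:

```python
import numpy as np

def cl_loglik(y, theta, rho):
    """Conditional logistic log-likelihood
    sum_i {y_i*log(phi_i) - log(1 + phi_i)}, with phi_i = rho*theta_i."""
    phi = rho * np.asarray(theta, dtype=float)  # phi_i = rho * theta(s_i)
    y = np.asarray(y, dtype=float)              # binary marks (1 = case)
    return float(np.sum(y * np.log(phi) - np.log1p(phi)))
```

With *ρθ* = 1 each point is a case with probability 1/2, so the contribution per point is −log 2 regardless of its mark.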

While it is possible to utilize these models directly, alternative approximate approaches may also be useful. Here we use an alternative method that divides the study area into a number of smaller mesh cells. Based on the number of points in each cell, a conditional Poisson or binomial model can be fitted as a Bayesian hierarchical model in which correlation is admitted at a higher level of the hierarchy. Standard errors and credible intervals of estimates, and also model goodness-of-fit statistics, e.g. deviance, DIC (deviance information criterion) or PPL (posterior predictive loss), are readily available. A major advantage of this approach is that all the models can be implemented within available software (e.g. WinBUGS, which makes available a rich variety of MCMC sampling machinery). The incorporation of certain types of spatial association into models in WinBUGS is also straightforward.

We consider a particular data example, the Lancashire larynx cancer data (Diggle, 1990), to illustrate all the methods considered in this paper. The data comprise the residential locations of larynx cancer deaths (58 cases) recorded in the Chorley and South Ribble Health Authority of Lancashire during 1974–83. The objective in that work was to assess the evidence of increased larynx cancer incidence near an industrial incinerator, which was suspected as a possible source of health hazard. As a control variable, the author considered lung cancer deaths (978 controls) for the same period in the same study area. Fig. 1 displays the locations of cases, controls and the incinerator. Co-ordinates are given in units of 10 m.

Larynx cancer cases are denoted by “*”, lung cancer controls by “.”, and the incinerator location is also marked; the grid mesh is overlaid.

In all our model development discussed later, we consider distance from the source as the only spatially derived covariate. It is straightforward to include other important covariates, e.g. direction, in these models. Also, in more generality, a range of covariates could be included. However, for simplicity of presentation we confine the discussion only to the distance effect, which is of main interest in the assessment of the incinerator effect on increased number of larynx cancer cases.

Whenever a likelihood function is available, it is possible to use a Bayesian paradigm to provide a posterior distribution for all unknown parameters. It is reasonably straightforward to assign prior distributions for all the unknowns in the above intensity function (2.2). In what follows we will examine Bayesian extensions to this and other likelihoods and assign suitable prior distributions to relevant parameters. We will give the detailed specification of prior distributions for each model below.

The relative risk at point **s**_{i} is modeled as

$$\theta ({\mathbf{s}}_{i})=g({d}_{i},\mathit{\beta})exp({\mathbf{z}}_{i}^{\text{T}}\mathit{\gamma})$$

where *d_{i}* is the distance of the *i*-th location from the putative source, **z**_{i} is a vector of covariates with coefficient vector *γ*, and *g*(*d_{i}*, *β*) is a function describing the distance effect. A common choice is

$$g({d}_{i},\mathit{\beta})=exp\{{\beta}_{0}+{\beta}_{1}{d}_{i}\},$$

where *β*_{0} is the intercept and *β*_{1} measures the distance effect. In the case of multiple putative hazard sources, the above expression can be extended by adding, for each source, a distance term with its own coefficient to measure its effect on disease incidence.

In the intensity specification, in the model for *θ*(**s**), it is possible to include random effect terms that allow for extra variation in the rate of the process. These effects are usually specified with a log link to the risk to ensure positivity. For example,
${z}_{i}^{\text{T}}\gamma ={v}_{i}+{u}_{i}$ could be specified, where *v_{i}* and *u_{i}* are spatially correlated and uncorrelated random effects, respectively, so that the log relative risk becomes

$$log{\theta}_{i}={\beta}_{0}+{\beta}_{1}{d}_{i}+{v}_{i}+{u}_{i}.$$

The two random effects, *v_{i}* and *u_{i}*, represent spatially structured and unstructured heterogeneity, respectively.

In setting the prior distribution for the correlated random effects, we consider the conditional distribution approach and use the CAR (conditional autoregressive) model, which is defined through a series of univariate conditional distributions [*v_{i}* | {*v_{j}*}, *j* ∈ {*δ_{i}*}]:

$${v}_{i}\mid \{{v}_{j}\},{m}_{i},{\sigma}_{v}^{2}\sim \text{N}\left({\overline{v}}_{i},\frac{{\sigma}_{v}^{2}}{{m}_{i}}\right),$$

where
${m}_{i}={\sum}_{j=1}^{N}I(j\in \{{\delta}_{i}\}),{\overline{v}}_{i}={\sum}_{j\in \{{\delta}_{i}\}}{\scriptstyle \frac{{v}_{j}}{{m}_{i}}}$, {*δ_{i}*} is the set of first-order neighbors (the regions which share common geographical boundaries with the *i*-th region), and ${\sigma}_{v}^{2}$ is a variance parameter controlling the amount of spatially structured variation.

For the other random effect, *u _{i}*, which models heterogeneity arising from uncorrelated noise, we assume an exchangeable normal distribution with zero mean and variance
${\sigma}_{u}^{2}$.
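As a concrete check of the CAR specification, the full conditional for a single *v_{i}* can be computed from a neighbor list. A Python sketch with hypothetical names (in WinBUGS the `car.normal` distribution handles this internally):

```python
def car_full_conditional(v, neighbors, i, sigma2_v):
    """Mean and variance of the CAR full conditional for v_i:
    v_i | {v_j} ~ N(vbar_i, sigma2_v / m_i), where m_i is the number of
    first-order neighbors of i and vbar_i is their average."""
    nb = neighbors[i]                       # indices in {delta_i}
    m_i = len(nb)                           # neighbor count m_i
    vbar_i = sum(v[j] for j in nb) / m_i    # neighbor average vbar_i
    return vbar_i, sigma2_v / m_i
```

The conditional variance shrinks as the number of neighbors grows, which is the usual behavior of this intrinsic CAR prior.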

In this section we illustrate a novel approach to the analysis of point pattern data, based on discretization of the study window into a grid mesh. A map of a study area can be of any shape, and it is possible to split it into a grid system of regular cells, e.g., into a number of equal rectangular meshes if the study region is rectangular. That is, for a rectangular study area we consider disjoint cells {*A_{k}*}, *k* = 1, …, *K*, whose union is *A*.

The points of the observed processes (case and control) are binned into the cells and a mesh of cell counts results. If any point is on the grid line of two meshes, it is counted in either of the meshes at random. Given the counts observed within meshes, a likelihood can be constructed which describes the probability of counts observed within the cells. Possible choices of models could be Poisson or binomial depending on whether the disease is rare or non-rare.
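The binning step can be sketched as follows (a Python sketch with an illustrative function name; note that `np.histogram2d` assigns a point lying exactly on an interior grid line to the higher cell deterministically, rather than at random as described in the text):

```python
import numpy as np

def bin_points(x, y, K, window):
    """Bin point locations into a K x K grid over the rectangular window
    (xmin, xmax, ymin, ymax); returns the matrix of cell counts."""
    xmin, xmax, ymin, ymax = window
    counts, _, _ = np.histogram2d(x, y, bins=K,
                                  range=[[xmin, xmax], [ymin, ymax]])
    return counts.astype(int)   # integer counts n_k per cell
```

Applying this separately to case and control coordinates yields the two count meshes used by the Poisson and binomial models below.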

Let the map be split into *K* cells *A_{k}* (*k* = 1, …, *K*). The expected number of cases in cell *A_{k}* is

$${\mu}_{k}^{\text{case}}={\int}_{{A}_{k}}\lambda (\mathbf{u})\phantom{\rule{0.16667em}{0ex}}\text{d}\mathbf{u}={\int}_{{A}_{k}}\rho \theta \phantom{\rule{0.16667em}{0ex}}(\mathbf{u}){\lambda}_{0}(\mathbf{u})\phantom{\rule{0.16667em}{0ex}}\text{d}\mathbf{u},\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\forall k.$$

If we assume that all of the variables in *θ*(**u**) are constant within each cell *A_{k}*, then the means become

$${\mu}_{k}^{\text{case}}=\rho {\theta}_{k}{\int}_{{A}_{k}}{\lambda}_{0}(\mathbf{u})\phantom{\rule{0.16667em}{0ex}}\text{d}\mathbf{u},\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\forall k.$$

For a reasonably large number of mesh cells, the effect of this assumption is negligible. The remaining integral of λ_{0} can be replaced by background (or control) population estimates. Commonly, it is replaced by a case-mix adjusted expected count for the *k*-th cell. Diggle (2000) proposed replacing the above integral by its unbiased estimator, the population size within cell *A_{k}*. For this specific dataset, we propose replacing the above integral by *n_{k}*, the total count of cases and controls within cell *A_{k}*. At the first level of the hierarchy, the case count is then modeled as

$${n}_{k}^{\text{case}}\sim \text{Poisson}({\mu}_{k}^{\text{case}})$$

and, at the second level, a log-linear model gives

$$log({\mu}_{k}^{\text{case}})=log({n}_{k})+log\rho +{\beta}_{0}+{\beta}_{1}{d}_{k}+{v}_{k}^{\text{case}}+{u}_{k}^{\text{case}},$$

where *d_{k}* is the distance of the centroid of *A_{k}* from the putative source, and ${v}_{k}^{\text{case}}$ and ${u}_{k}^{\text{case}}$ are, respectively, spatially correlated and uncorrelated random effects for the case counts.
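The first level of the PM model is thus a Poisson count with a log-linear mean. A minimal sketch (illustrative names; the actual fitting is done in WinBUGS) of the mean on the natural scale:

```python
import numpy as np

def pm_cell_mean(n_k, d_k, rho, beta0, beta1, v_k=0.0, u_k=0.0):
    """First-level mean of the Poisson mesh model:
    log(mu_k) = log(n_k) + log(rho) + beta0 + beta1*d_k + v_k + u_k."""
    return n_k * rho * np.exp(beta0 + beta1 * d_k + v_k + u_k)
```

With *ρ* = 1, *β*_{0} = 0 and no random effects, the expected case count equals *n_{k}* scaled by the distance-decay factor exp(*β*_{1}*d_{k}*).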

By a similar argument, we can conclude that the number of controls in the *k*-th cell,
${n}_{k}^{\text{control}}$, follows a Poisson distribution with mean

$${\mu}_{k}^{\text{control}}={\int}_{{A}_{k}}{\lambda}_{0}(\mathbf{u})\phantom{\rule{0.16667em}{0ex}}\text{d}\mathbf{u},\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\forall k.$$

As before, the above integral can be replaced by *n_{k}*. In addition, random effects can be included for the control counts, so that the model is specified at the first level as

$${n}_{k}^{\text{control}}\sim \text{Poisson}({\mu}_{k}^{\text{control}})$$

and, at the second level as

$$log({\mu}_{k}^{\text{control}})=log({n}_{k})+{v}_{k}^{\text{control}}+{u}_{k}^{\text{control}}.$$

Two random effects, *u*^{control} and *v*^{control}, are included and these represent unstructured and structured heterogeneity, respectively.

A number of models can be formulated based on common structured heterogeneity (SH) and/or common unstructured heterogeneity (UH) of case and control counts. Common SH leads to a model with correlation between cases and controls. We can choose which random effects to include by assessing model goodness-of-fit. The DIC (a model selection criterion) (Spiegelhalter et al., 2002) can be used, which selects the most parsimonious model among those fitted after penalizing for model complexity.

For a non-rare disease, a binomial distribution can be fitted at the mesh level instead of the Poisson. Let *n_{k}* be the total number of points in cell *A_{k}*; the number of cases in the cell is then modeled as

$${n}_{k}^{\text{case}}\sim \text{Binomial}({p}_{k}^{\text{case}},{n}_{k}),$$

where
${p}_{k}^{\text{case}}$ is the probability that the *k*-th cell has
${n}_{k}^{\text{case}}$ cases. A logit link function will lead to

$$\text{logit}({p}_{k}^{\text{case}})=log\rho +{\beta}_{0}+{\beta}_{1}{d}_{k}+{v}_{k}^{\text{case}}+{u}_{k}^{\text{case}}$$

and the hierarchical model can be specified at higher levels as in Section 5.1.

In short, we call these models the PM model for the Poisson mesh model and the BM model for the binomial mesh model.

Moller et al. (1998) gave a detailed description of the log Gaussian Cox process (LGCP) and its properties. A Cox process *X* is a point process with random intensity λ such that *X*|λ is a Poisson process with intensity function λ. For an LGCP, the random intensity function is given by λ = exp(*Z*), where *Z* is a Gaussian field on *A* for which the random intensity measure ${\mu}_{A}={\int}_{A}\lambda (\mathbf{u})\text{d}\mathbf{u}$ is finite almost surely.

After discretizing the observation window *A* into *K* disjoint cells {*A _{k}*} similarly as for grid meshes, the count of realized cases
${n}_{k}^{\text{case}}$ will follow a Poisson distribution with mean
$exp({\chi}_{k}^{\text{case}})$, where
${\chi}_{k}^{\text{case}}$ is a realization of the Gaussian field associated with the *k*-th cell. With the linear predictor used above, this is

$${\chi}_{k}^{\text{case}}=log({n}_{k})+log\rho +{\beta}_{0}+{\beta}_{1}{d}_{k}+{v}_{k}^{\text{case}}+{u}_{k}^{\text{case}}.$$

Rue et al. (submitted for publication) adopted a similar discretization approach to an LGCP (see also, Waagepetersen (2003)).
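The discretized LGCP can also be simulated directly: draw the cell-level Gaussian field, then Poisson counts with the exponentiated field as mean. A Python sketch under the simplifying assumption that the cell-level mean vector and covariance matrix are given:

```python
import numpy as np

def lgcp_counts(chi, cov, rng=None):
    """Discretized LGCP: draw Z ~ N(chi, cov) for the cell-level Gaussian
    field, then cell counts n_k ~ Poisson(exp(Z_k))."""
    rng = rng or np.random.default_rng(0)
    z = rng.multivariate_normal(np.asarray(chi, dtype=float), cov)
    return rng.poisson(np.exp(z))   # one realization of the cell counts
```

Setting the covariance to zero reduces this to the fixed-effects Poisson mesh model, which is the equivalence exploited in the text.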

For the specific parameterization of the intensity function that we have illustrated in Section 4, the regression coefficient (*β*_{1}) for the distance effect on disease incidence should be interpreted cautiously for each model. This is mainly because of the distinctness of the assumed probability distribution and the formulation at the first level of hierarchy.

For the CL and BM models, *β*_{1} measures the average effect of distance from the incinerator on the probability of disease occurrence, on a logit scale. In the case of the BT model, *β*_{1} measures the average effect of distance on the ratio of case and control intensities on a log scale; in general, it measures changes in the SMR (standardized mortality ratio). The interpretation of *β*_{1} for the PM model depends on the approximation used for the integral of λ_{0}. If this integral is replaced by a case-mix adjusted expected count, *β*_{1} in the PM model has a similar interpretation to that in the BT model. For the Lancashire larynx cancer data, we replace this integral by the total count of cases and controls for each cell, and *β*_{1} in the PM model then has a similar interpretation to that in the CL and BM models.

In the implementation of the BT method, we have considered 462 points, of which 58 are case points and 404 are dummy points (400 centroids of a 20 × 20 grid mesh and 4 corner points). The role of the dummy points is to increase the precision of the integration scheme. To estimate the unknown control intensity λ_{0}(**x**_{j}), we use the “sm.density” function in R to obtain a nonparametric density estimate for the controls, and then rescale it by multiplying the density estimate by the total number of controls.

As in the other methods, 1036 binary points are considered for the CL method, and 900 cells are constructed for the grid methods, although a comparison has been made with a smaller number of cells (400) within the grid methods to observe how the estimates change with mesh size. In the implementation of all the methods, we use the CAR prior distribution for the correlated random effect *v_{i}*. The adjacencies for the BT and CL methods are obtained by Dirichlet tessellation (Okabe et al., 2000). Dirichlet tessellation divides an area into smaller, contiguous, non-overlapping tiles, one per data point, with no gaps between them. The neighbors of a point are then taken to be those points whose tiles share a common boundary.

In the implementation of the BM model, a difficulty arises because many *n_{k}*’s are zero, i.e., some cells have zero total count. To circumvent this problem we adopt an ad hoc measure proposed at http://www.mrc-bsu.cam.ac.uk/bugs/faqs/lavine.shtml. The idea is developed as follows.

A vector *ξ_{k}* is defined for each cell *k*, with elements

$${\xi}_{k1}=I({n}_{k}=0)+I({n}_{k}=1)$$

and,

$${\xi}_{kj}=I({n}_{k}=j),$$

where *j* = 2, …, *C* and *C* is set to the maximum value of *n*. The elements of the vector *ξ_{k}* thus consist of a single 1 and (*C* − 1) zeros. A modified count is then sampled as

$$\mathit{new}.{n}_{k}\mid {\mathit{\xi}}_{k}\sim \text{Categorical}({\mathit{\xi}}_{k}).$$

In order to ensure that
${p}_{k}^{\text{case}}$ is zero whenever *n _{k}* is zero, a new
${p}_{k}^{\text{case}}$ is introduced as

$$\mathit{new}.{p}_{k}^{\text{case}}={p}_{k}^{\text{case}}(1-I({n}_{k}=0)).$$

Now, the binomial model appears as

$${n}_{k}^{\text{case}}\sim \text{Binomial}(\mathit{new}.{p}_{k}^{\text{case}},\mathit{new}.{n}_{k}).$$

Sampling from the above binomial (sometimes called a phony binomial) distribution is feasible since the *new.n_{k}*’s are non-zero; at the same time, this approach also forces
$\mathit{new}.{p}_{k}^{\text{case}}$ to equal zero whenever *n_{k}* is zero.
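The construction can be sketched in Python (a translation of the WinBUGS trick; the function names are ours):

```python
import numpy as np

def xi_vector(n_k, C):
    """Categorical weight vector xi_k: cells with n_k = 0 or 1 share the
    first category, so the sampled new.n_k is never zero."""
    xi = np.zeros(C)
    xi[max(n_k, 1) - 1] = 1.0   # xi_k1 = I(n_k=0)+I(n_k=1); xi_kj = I(n_k=j)
    return xi

def new_p(p_case, n_k):
    """new.p_k = p_k * (1 - I(n_k = 0)): forces the success probability
    to zero in empty cells."""
    return p_case * (1 - int(n_k == 0))
```

Because each *ξ_{k}* is degenerate, the Categorical draw of *new.n_{k}* is deterministic; the trick exists purely to keep the binomial index valid inside the sampler.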

Table 1 gives the results obtained by each method for each parameter: *β*_{0}, *β*_{1}, log(*ρ*), *σ_{u}*, and *σ_{v}*.

Posterior mean estimates (95% credible intervals are given in parentheses) from the BT, CL and grid mesh methods for larynx cancer data

Two mesh sizes, 20 × 20 and 30 × 30, were considered for the PM and BM models to check the effect of mesh size on the estimates. The estimates of *β*_{0}, *β*_{1} and log(*ρ*) vary only a little with mesh size. For both mesh models, increasing the mesh size increases the variability of the uncorrelated random effects. Another noticeable change is that the variability of the spatially correlated random effects increases with mesh size for the PM model, whereas for the BM model it decreases.

Table 2 gives results for the PM model for a 20 × 20 mesh size. We have considered four models depending on common and/or separate random effects for case and control models. The first model in column 2 considers no spatially correlated random effect for case and control, but has a separate uncorrelated random effect. The second model in column 3 considers separate spatially correlated random effects for case and control, and also has a separate uncorrelated random effect. The third model in column 4 considers a common spatially correlated random effect for case and control, but has a separate uncorrelated random effect. Note that this model considers a spatial correlation between case and control. The fourth model in column 5 considers a separate spatially correlated random effect but has a common uncorrelated random effect.

All the estimates are given with 95% credible intervals in parentheses. The DIC values are given with the effective number of parameters in parentheses. DIC.total is obtained by summing DIC.case and DIC.control. The smallest DIC.total is obtained for the first model in column 2, in which neither the cases nor the controls have a spatially correlated random effect, but each has a separate uncorrelated random effect. The DICs for the third and fourth models are very close, although each assumes quite different random effects for cases and controls. The smallest deviance (after subtracting the effective number of parameters from the DIC), i.e., without penalizing for extra parameters, is obtained for the second model in column 3, which considers separate spatially correlated random effects for cases and controls as well as separate uncorrelated random effects.

We have also considered the performance of all these models in a simulation experiment whose purpose is to examine the ability to recover true parameter values. We generated a number of cases and controls for given values of all unknown parameters. These given values in the simulation experiment are known as *true values*.

A number of control data points and case data points are simulated separately within the Lancashire study area. Control points were generated from ${\hat{\lambda}}_{0}$, a nonparametric estimate of the control intensity function, in order to maintain a dependency on the original lung cancer data. The steps of the simulation of control points are given in Appendix A. At step 1, estimates of the optimal smoothing parameters are obtained from the “sm.density” function in R. At step 2, a random sample is drawn from the list of controls by using the “sample” function in R.
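Steps 1–2 amount to a smoothed bootstrap: resample control locations with replacement and jitter each by the kernel bandwidth. A Python sketch (the paper uses “sm.density” and “sample” in R; the bandwidth `h` here is assumed given rather than estimated):

```python
import numpy as np

def simulate_controls(ctrl_xy, m, h, seed=0):
    """Smoothed-bootstrap draw from a kernel density estimate of the
    control intensity: resample m locations with replacement (the
    "sample" step) and add Gaussian noise with bandwidth h (the kernel
    of the "sm.density" step)."""
    rng = np.random.default_rng(seed)
    ctrl_xy = np.asarray(ctrl_xy, dtype=float)
    idx = rng.integers(0, len(ctrl_xy), size=m)      # resample indices
    return ctrl_xy[idx] + rng.normal(scale=h, size=(m, 2))
```

With `h = 0` this reduces to an ordinary bootstrap of the observed control locations.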

The parameterization that we have considered for the case intensity has the form λ_{i} = *ρ*e^{β0+β1di}e^{Si}, where *S_{i}* represents unexplained spatial variation. This corresponds to a log Gaussian Cox process when *S_{i}* is a realization of a Gaussian random field (GRF).

Cases were simulated from an LGCP. The values for the parameters *β*_{0} and *β*_{1} are taken as −1.0 and −0.005 to maintain similarity with the estimates from the real data. The value of *ρ* is set to 1.0. The steps of the simulation of case points are given in Appendix B. At step 3, *r* realizations of a GRF, *S_{i}*, were generated with the variance set to its true value of 0.5.

We generate 200 controls and *r* realizations of 200 cases within the Lancashire study area. The numbers are chosen arbitrarily but purposely, to ensure equal number of cases and controls so that setting *ρ* **=** 1 is justified. In the case simulation, we set *m _{c}*

Twenty simulated datasets of 200 cases (all panels except the bottom-right) and one dataset of 200 controls (bottom-right panel).

Thus, for each dataset the total number of points is 400, among them 200 cases and 200 controls, which is the total number of points for the CL model. In the implementation of the BT model for simulated data, 200 case points and 229 dummy points (225 centroids from a 15 × 15 grid mesh plus 4 corner points) are created, giving a total of 429 points. A 20 **×** 20 grid mesh is considered for both the PM model and the BM model which leads to 400 cells for the mesh models. The number of dummy points in the BT model and the number of cells in the grid mesh models were chosen in such a way as to keep the total number of points close to 400.

The model for the log relative risk that we have fitted by each model for simulated data is given as

$$log{\theta}_{k}={\beta}_{0}+{\beta}_{1}{d}_{k}+{v}_{k}.$$

The above model incorporates only the spatially correlated random effect, *v_{k}*, to explain all random variation in the GRF, unlike the model for the real data, which also includes an uncorrelated random effect, *u_{k}*.

In order to keep consistency in choosing the prior distribution for the spatially correlated random effects, *v*, we adopt the conditional distribution approach for all the models and assume a CAR distribution. The prior distributions for the *β*’s were set to normal distributions with mean 0 and large variance. In this section, we summarize the results for three parameters, the intercept *β*_{0}, the coefficient *β*_{1}, and the spatially correlated heterogeneity
${\sigma}_{v}^{2}$, which are of main interest. The parameter *ρ*, which is the proportion of cases to controls, was set to 1.

The boxplots in Fig. 3 display the distribution of 20 *β*_{0}’s, *β*_{1}’s and
${\sigma}_{v}^{2}$’s. The bar within each box marks the median, the ends of the box the first and third quartiles, and the two whiskers the minimum and maximum values. The means of *β*_{0} from the BT, CL, PM and BM models are, respectively, 2.672, 2.722, 0.337, and 2.738. For all the models the estimates of *β*_{0} are positively biased, and no model even includes the true value within its range. We suspect that the R function “GaussRF” is responsible for these biased estimates of *β*_{0}: its current algorithm does not support generating a nonstationary GRF, in the sense that a nonstationary GRF obtained by adding a trend to a generated zero-mean GRF is not the same as one generated directly with a non-zero mean (i.e., trend). In the former case (the way a nonstationary GRF was introduced in this paper), the connection of the *β*’s to the spatial covariance used to generate the GRF is not well defined. We strongly believe that if the GRF were generated with nonstationary mean log(*ρ*) + *β*_{0} + *β*_{1}*d_{i}*, we would have obtained better estimates of *β*_{0}.

Boxplots of *β*_{0}, *β*_{1} and
${\sigma}_{v}^{2}$, from 20 simulated datasets where the true parameters are
${\beta}_{0}^{\text{true}}=-1.0,{\beta}_{1}^{\text{true}}=-0.005$ and
${\sigma}_{v}^{2,\text{true}}=0.5$. The box limits are the quartiles and the whiskers are the minimum and maximum values.

The means of *β*_{1} from each model are, respectively, −0.0044, −0.0047, −0.0024, and −0.0047. The means from the CL and BM models, and the median from the BT model, are much closer to the true value than those of the other models. The *β*_{1} estimates from the PM model are positively biased and do not include the true value within their range, although the estimates remain fairly close to it. Similarly, the mean of
${\sigma}_{v}^{2}$ from each model is respectively, for BT, CL, PM, BM: 0.1009, 2.0509, 0.5668, and 1.7463. The mean and median values from the PM model for
${\sigma}_{v}^{2}$ are much closer to the true value compared to other models. The
${\sigma}_{v}^{2}$ estimates from the BT method and the BM model are, respectively, negatively and positively biased, and neither includes the true value within its range.

Table 3 gives the summary measures of error values for each model for *β*_{0}, *β*_{1} and
${\sigma}_{v}^{2}$. For intercept *β*_{0}, the average error and the average absolute error are equal as all the models produce a positively biased estimate (Fig. 3). It is interesting to note that, among these biased estimates, the PM model has the least errors. For coefficient *β*_{1}, the CL and the BM models have the least average error and the least average absolute error. Similarly, for spatially correlated heterogeneity
${\sigma}_{v}^{2}$, the PM model has the least average error and average absolute error, and it outperforms the estimates from other models.
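The summary measures in Table 3 are the average (signed) error and the average absolute error of the estimates across the simulated datasets. As a small sketch (illustrative function name):

```python
import numpy as np

def error_summary(estimates, true_value):
    """Average error (signed bias) and average absolute error of a set of
    estimates relative to the true parameter value."""
    e = np.asarray(estimates, dtype=float) - true_value
    return float(e.mean()), float(np.abs(e).mean())
```

Note that when all estimates are biased in the same direction (as for *β*_{0} here), the two measures coincide; otherwise the absolute error is the stricter criterion.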

We compared four point process models in the assessment of the effect of a putative source of hazard (an incinerator) on the incidence of larynx cancer deaths in Lancashire. The BT method is a useful approximate method for analyzing point pattern health data when the likelihood contains a normalizing integral. An important disadvantage of this method is that it is subject to quadrature error; however, this error becomes negligible for a large number of dummy points. The alternative CL model uses the exact form of the likelihood rather than an approximation. The PM and BM models, based on discretization of the study window, are easy to implement in standard statistical packages once the binned counts are available. The “PIP” (point-in-polygon) function in R can be used to obtain these counts. This paper has given a theoretical justification of these two models and illustrated the assumptions involved. We also showed the equivalence between the PM model and the LGCP when discretization is preferred.

The model for the log relative risks that we have considered is multiplicative, and it includes spatially correlated and uncorrelated random effects to explain the heterogeneity in the observed data. The analysis of the real data (the Lancashire larynx cancer data) using the CL, PM, and BM models produces similar results. Among the discretization methods, we observed only small changes in the estimates as the mesh size changed. ZIP spatial regression (Agarwal et al., 2002) can be useful for large mesh sizes, since many cells will then have zero counts. We have also demonstrated with the PM model that a rich variety of models can be formulated based on common structured and/or common unstructured heterogeneity of cases and controls. This is certainly an advantage of the discretization methods.

From the simulated data, in estimating the intercept *β*_{0}, the average error and the average absolute error indicate that the PM model performs better than the other models, although all the models produce positively biased estimates. We suspect that this overall positive bias in *β*_{0} arises from the way the GRF was generated; this requires further study. In estimating the regression coefficient *β*_{1}, which is often the main interest in epidemiological studies, the CL and BM models perform equally well. In estimating the variance parameter for the spatially correlated random effects
${\sigma}_{v}^{2}$, the PM model can be singled out as the best performing model based on the least average error and the least average absolute error.

Finally, the results of the simulation experiment indicate that no single model performs best for every parameter of the above multiplicative model. The outcome is mixed, and the choice of a specific model will depend on the purpose of the study. If the objective of the analysis is to estimate the coefficients, then either the CL model or the BM model may be a good choice. If the interest lies in explaining the heterogeneity, the PM model may be a better choice. In terms of computational time, the models based on grid meshes (the BT, PM, and BM models) depend on the number of dummy points or cells. The reported results for the largest number of cells (30 × 30) took approximately 30 min to run for 10,000 iterations in WinBUGS on a 2.33 GHz laptop. The CL model took approximately the same time. Because of space limitations, we have not included all the WinBUGS and R code developed for this paper, but it is available from the first author on request.

We would also like to point out two limitations of this study. Throughout the paper we have used a CAR model for the spatially correlated random effects. Instead of this conditional specification, it is possible to adopt a multivariate normal distribution as a joint specification. We have observed that this joint-distribution approach requires excessive computational time as the data size increases, because at each iteration an *N* × *N* covariance matrix must be inverted, where *N* is the number of data and dummy points in the BT method, the number of case-control events in the CL model, and the number of cells in the mesh models. To circumvent this problem, one could resort to reduced-rank kriging (see Banerjee et al. (in press), Fuentes (2007), and Kammann and Wand (2003)). The other limitation is that only a limited number of datasets were generated for the simulation study, owing to the computational burden of the approach.

The authors would like to gratefully acknowledge the support of NIH grant # 1 R03CA125828-01. Our sincere thanks go to Rolf Turner for providing the R code to calculate the adjacency matrix of a point process by the Dirichlet tessellation method.

Control points are generated from ${\widehat{\lambda}}_{0}$ (the nonparametric estimate of the intensity function for controls) as follows:

1. Set ${\widehat{\lambda}}_{0}={h}_{{\lambda}_{0}}^{-1}{\sum}_{i=1}^{m}k\left({\scriptstyle \frac{\mid \mathbf{s}-{\mathbf{s}}_{i}\mid}{{h}_{{\lambda}_{0}}}}\right)$, where $\mid \mathbf{s}-{\mathbf{s}}_{i}\mid$ is the distance between $\mathbf{s}$ and ${\mathbf{s}}_{i}$, *k*(·) is an independent-component bivariate kernel, and ${h}_{{\lambda}_{0}}={({h}_{{\lambda}_{0},x},{h}_{{\lambda}_{0},y})}^{T}$ are the optimal smoothing parameters.
2. Randomly draw a sample from the set of control locations $\mathbf{s}$, defined as ${\mathbf{s}}_{\text{sampled}}={({x}_{\text{sampled}},{y}_{\text{sampled}})}^{T}$.
3. Generate ${\epsilon}_{x} \sim N(0,1)$ and ${\epsilon}_{y} \sim N(0,1)$.
4. Compute ${x}_{\text{simulated}}={x}_{\text{sampled}}+{h}_{{\lambda}_{0},x}{\epsilon}_{x}$ and ${y}_{\text{simulated}}={y}_{\text{sampled}}+{h}_{{\lambda}_{0},y}{\epsilon}_{y}$.
5. Continue steps 2 to 4 until the desired number of control points has been generated.
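Steps 2 to 4 above amount to a smoothed bootstrap: resample an observed control location and jitter it with a Gaussian kernel scaled by the bandwidths. A minimal sketch in pure Python (`generate_controls` and its argument names are our own; the bandwidth selection of step 1 is assumed done beforehand):

```python
import random

def generate_controls(control_locs, h_x, h_y, n_new, rng=random):
    """Smoothed-bootstrap draws from the kernel estimate of the control
    intensity: pick an observed control at random (step 2), draw standard
    normal noise (step 3), and jitter by the bandwidths (h_x, h_y) (step 4)."""
    new_points = []
    for _ in range(n_new):
        x_s, y_s = rng.choice(control_locs)   # step 2: resample a control
        eps_x = rng.gauss(0.0, 1.0)           # step 3
        eps_y = rng.gauss(0.0, 1.0)
        new_points.append((x_s + h_x * eps_x,  # step 4: kernel jitter
                           y_s + h_y * eps_y))
    return new_points
```

With zero bandwidths this reduces to an ordinary bootstrap of the observed control locations, which makes the resampling step easy to check.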

A log Gaussian Cox process has been generated for the intensity of cases, λ, as follows.

1. Generate a large number of controls, $i = 1, \ldots, {m}_{c}$, following the steps in Appendix A2.
2. Calculate ${\theta}_{i} = 1 + \text{exp}({\beta}_{0}+{\beta}_{1}{d}_{i})$, where ${d}_{i}$ is the distance of the *i*-th generated control point from the source.
3. Generate *r* realizations of a Gaussian random field $S = ({S}_{1}, \ldots, {S}_{{m}_{c}})$ with mean 0 and elements of the variance-covariance matrix $${k}_{ij}(\sigma ,\alpha ,\tau )=\sigma \phantom{\rule{0.16667em}{0ex}}exp(-{d}_{ij}/\alpha )+\tau .$$
4. Calculate ${\lambda}_{i} = \rho {\theta}_{i}\text{exp}({S}_{i})$.
5. Calculate ${\lambda}_{max} = \text{max}({\lambda}_{i})$.
6. Randomly draw a sample from the generated controls in step 1 and define the intensity of the drawn sample as ${\lambda}_{\text{sampled}}$.
7. Generate a random number $u \sim \text{Uniform}(0, 1)$.
8. If ${\scriptstyle \frac{{\lambda}_{\text{sampled}}}{{\lambda}_{max}}}>u$, accept the point; otherwise, reject it. An accepted point is stored as a case location.
9. Continue steps 6 to 8 until the desired number of case points has been generated.
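Steps 2 and 4 to 9 above are a rejection (thinning) sampler: each candidate control is accepted as a case with probability proportional to its intensity. A pure-Python sketch under our own naming conventions (`sample_cases` and its arguments are hypothetical; the GRF values in `S` are assumed already generated as in step 3):

```python
import math
import random

def sample_cases(controls, beta0, beta1, S, rho, n_cases, rng=random):
    """Thinning sampler for case locations. `controls` is a list of
    (x, y, d) tuples with d the distance to the source; `S` holds the
    matching Gaussian-random-field values; `rho` is a scaling constant."""
    # Steps 2 and 4: intensity at each candidate control point
    lam = [rho * (1.0 + math.exp(beta0 + beta1 * d)) * math.exp(s)
           for (_, _, d), s in zip(controls, S)]
    lam_max = max(lam)                       # step 5
    cases = []
    while len(cases) < n_cases:              # step 9
        i = rng.randrange(len(controls))     # step 6: draw a candidate
        u = rng.random()                     # step 7
        if lam[i] / lam_max > u:             # step 8: accept or reject
            cases.append(controls[i][:2])
    return cases
```

When the intensity surface is flat (e.g. *β*_{1} = 0 and *S* ≡ 0) every draw is accepted, so the sampler reduces to uniform resampling of the candidate controls.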

- Agarwal DK, Gelfand AE, Citron-Pousty S. Zero-inflated models with application to spatial count data. Environmental and Ecological Statistics. 2002;9:341–355.
- Baddeley A, Turner R. Spatstat: An R package for analyzing spatial point patterns. Journal of Statistical Software. 2005;12:1–42.
- Banerjee S, Gelfand AE, Finley AO, Sang H. Gaussian predictive process models for large spatial datasets. Journal of the Royal Statistical Society, Series B. 2008 (in press)
- Berman M, Turner R. Approximating point process likelihoods with GLIM. Applied Statistics. 1992;41:31–38.
- Besag J, York J, Mollie A. Bayesian image restoration with two applications in spatial statistics (with discussion). Annals of the Institute of Statistical Mathematics. 1991;43:1–59.
- Congdon P. Applied Bayesian Models. John Wiley; 2003.
- Diggle P. Overview of statistical methods for disease mapping and its relationship to cluster detection. In: Elliott P, Wakefield J, Best N, Briggs DJ, editors. Spatial Epidemiology: Methods and Applications. Oxford University Press; 2000.
- Diggle PJ. A point process modelling approach to raised incidence of a rare phenomenon in the vicinity of a prespecified point. Journal of the Royal Statistical Society, Series A. 1990;153:349–362.
- Diggle PJ, Morris SN, Elliott P, Shaddick G. Regression modelling of disease risk in relation to point sources. Journal of the Royal Statistical Society, Series A. 1997;160:491–505.
- Diggle PJ, Rowlingson B. A conditional approach to point process modelling of elevated risk. Journal of the Royal Statistical Society, Series A. 1994;157:433–440.
- Fuentes M. Approximate likelihood for large irregularly spaced spatial data. Journal of the American Statistical Association. 2007;102:321–331.
- Gómez-Rubio V, Ferrándiz J, López A. Functions for the detection of spatial clusters of diseases. 2004. http://matheron.uv.es/~virgil/Rpackages/DCluster.
- Kammann EE, Wand MP. Geoadditive models. Applied Statistics. 2003;52:1–18.
- Lawson AB. GLIM and normalising constant models in spatial and directional data analysis. Computational Statistics & Data Analysis. 1992;13:331–348.
- Lawson AB, Browne WJ, Vidal-Rodiero CL. Disease Mapping with WinBUGS and MLwiN. Wiley; New York: 2003.
- Ma B, Lawson AB, Liu Y. Evaluation of Bayesian models for focused clustering in health data. Environmetrics. 2007;18:1–16.
- Moller J, Syversveen AN, Waagepetersen RP. Log Gaussian Cox processes. Scandinavian Journal of Statistics. 1998;25:451–482.
- Okabe A, Boots B, Sugihara K, Chiu SN. Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. Wiley; 2000.
- R Development Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing; Vienna, Austria: 2004.
- Rowlingson B, Diggle P. Splancs: Spatial point pattern analysis code in S-Plus. Computers and Geosciences. 1993;19:627–655.
- Rue H, Martino S, Chopin N. Approximate Bayesian inference for latent Gaussian models using integrated nested Laplace approximations. 2007 (submitted for publication)
- Spiegelhalter D, Best N, Carlin B, Linde A. Bayesian measures of model complexity and fit (with discussion). Journal of the Royal Statistical Society, Series B. 2002;64:583–639.
- Spiegelhalter D, Thomas A, Best N, Lunn D. WinBUGS User Manual. MRC Biostatistics Unit, Institute of Public Health; Cambridge, UK: 2003.
- Waagepetersen R. Convergence of posteriors for discretized log Gaussian Cox processes. Statistics & Probability Letters. 2003;66:229–235.
