Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2907441

Formats

Article sections

- Summary
- 1. Introduction
- 2. Method
- 3. Simulation Study
- 4. Application
- 5. Discussion
- Supplementary Material
- References

Authors

Related links

Biometrics. Author manuscript; available in PMC 2010 September 21.

Published in final edited form as:

PMCID: PMC2907441

NIHMSID: NIHMS186545

Email: gro.chg@ja.kooc

The publisher's final edited version of this article is available at Biometrics

See other articles in PMC that cite the published article.

Spatial cluster detection is an important methodology for identifying regions with excessive numbers of adverse health events without making strong model assumptions on the underlying spatial dependence structure. Previous work has focused on point or individual-level outcome data and few advances have been made when the outcome data are reported at an aggregated level, e.g. at the county- or census tract-level. This paper proposes a new class of spatial cluster detecion methods for point or aggregate data, comprising of continuous, binary, and count data. Compared with the existing spatial cluster detection methods it has the following advantages. First, it readily incorporates region-specific weights, for example, based on a region’s population or a region’s outcome variance, which is key for aggregate data. Second, the established general framework allows for area-level and individual-level covariate adjustment. A simulation study is conducted to evaluate the performance of the method. The proposed method is then applied to assess spatial clustering of high Body Mass Index in a HMO population in the Seattle, Washington USA area.

The increasing prevalence of obese/overweight individuals is a growing public health concern, causing a tremendous strain on the health care system. Identifying regions or neighborhoods that have elevated body mass index (BMI) compared to the rest of the area could help direct the distribution of resources for obesity prevention and treatment programs. Furthermore, if the elevated BMI regions can be explained by factors, such as area-level socioeconomic status (SES), walkability, and density of fast food establishments, these may shape an appropriate intervention program tailored for the residents in a particular geographic region. Typically these types of data are not available at the individual’s residence location level, but are aggregated to census tracts, counties, or states. To be able to conduct such analyses, a spatial cluster detection method is needed that handles weighted continuous outcomes, while being able to adjust for area-level covariates.

Currently, numerous spatial cluster detection methods are available for the analysis of individual level data using a wide variety of outcomes. For example, there are methods for binary outcomes to identify areas with elevated prevalence of disease and for count outcomes to identify excess rates of incidence or mortality (Kulldorff et al., 2006; Tango and Takahashi, 2005; Duczmal and Assunção, 2004; Patil and Taillie, 2004; Tango, 2000; Kulldorff, 1997; Turnball et al., 1990). There are also several methods for censored continuous outcomes, which explore potential spatial clusters for detection of time to early event (Cook et al., 2007; Huang et al., 2007). However, the only available method for aggregate continuous data is a recently accepted spatial scan statistic for weighted normal outcomes (Huang et al., 2009). This latter method is not able to incorporate covariate adjustment (area or individual-level) and is not a general approach for any weighted non-continuous outcome, which is key for our application of interest.

The application of interest is a prospective cohort study evaluating areas of elevated BMI of adult females from a mixed-model health plan and delivery system located in the Seattle area in Western Washington USA. The purpose of this analysis is to assess whether there are areas of elevated BMI and if these elevated areas can be explained by area-level SES predictors. BMI is measured at the individual-level, but spatial location is measured only at the aggregate, census-tract, level due to data availability. Furthermore, even though BMI is a continuous outcome, it is highly skewed with heavier tails at the upward end of the distribution than would be expected if it was normal distributed, i.e. methods assuming a normal assumption are not valid. Our proposed spatial cluster detection approach is able to handle aggregate level, skewed data, while being able to adjust for both individual and area-level covariates, which is critical for this analysis.

We present in this paper a general statistical approach to quantifying spatial cluster detection for weighted outcomes that can be continuous, but is applicable for most outcome types. The new method is presented in Section 2 and a simulation study evaluating its properties can be found in Section 3. In Section 4, we apply the proposed weighted cumulative geographic residual method to an analysis assessing elevated areas of BMI for adult females in Western Washington USA. We conclude with a general discussion in Section 5.

We exemplify the development of our test statistic in the framework of a continuous outcome, though the formulation may be easily generalized to any binary/discrete data with proper link functions (e.g. Poisson data with a log link function). Suppose the outcome for region *i*(*i* = 1, …, *n*), *Y _{i}*, is continuous, with a 1 ×

$${Y}_{i}={\mathbf{X}}_{i}\mathit{\beta}+{e}_{i},\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}{e}_{i}\stackrel{\mathit{ind}}{\sim}(0,{\sigma}^{2}/{w}_{i}),$$

(1)

where ** β** is a

We consider a two-dimensional moving block process over location (*x*_{1}, *x*_{2}), *Z _{loc}*(

$${Z}_{\mathit{loc}}({x}_{1},{x}_{2}\mid b)=\frac{1}{\sqrt{n}}\sum _{i=1}^{n}{W}_{i}({x}_{1},{x}_{2}\mid b){\widehat{e}}_{i},$$

(2)

where *W _{i}*(

The exact distribution of *Z _{loc}*(

$${\widehat{Z}}_{\mathit{loc}}({x}_{1},{x}_{2}\mid b)=\frac{1}{\sqrt{n}}\sum _{i=1}^{n}\left[{W}_{i}({x}_{1},{x}_{2}\mid b){\widehat{e}}_{i}+\nu ({x}_{1},{x}_{2}\mid b){\mathbf{I}}^{-1}(\widehat{\mathit{\beta}}){U}_{i\beta}(\widehat{\mathit{\beta}})\right]{G}_{i},$$

(3)

where

$$\nu ({x}_{1},{x}_{2}\mid b)=-\sum _{i=1}^{n}{W}_{i}({x}_{1},{x}_{2}\mid b)\partial \mu /\partial \mathit{\beta}=-\sum _{i=1}^{n}{W}_{i}({x}_{1},{x}_{2}\mid b){\mathbf{X}}_{i}.$$

**I**(** β**) = −

This asymptotic result immediately allows us to approximate the null distribution of *Z _{loc}*(

$${S}_{\mathit{loc}}=sup\left[\underset{{x}_{1},{x}_{2}}{sup}{Z}_{\mathit{loc}}({x}_{1},{x}_{2}\mid {b}_{1}),\dots ,\underset{{x}_{1},{x}_{2}}{sup}{Z}_{\mathit{loc}}({x}_{1},{x}_{2}\mid {b}_{M})\right].$$

Continuous mapping theorem will show that *S _{loc}* has the same limiting distribution as the following stochastic process, conditional on the observed data,

$${\widehat{S}}_{\mathit{loc}}=sup\left[\underset{{x}_{1},{x}_{2}}{sup}{\widehat{Z}}_{\mathit{loc}}({x}_{1},{x}_{2}\mid {b}_{1}),\dots ,\underset{{x}_{1},{x}_{2}}{sup}{\widehat{Z}}_{\mathit{loc}}({x}_{1},{x}_{2}\mid {b}_{M})\right].$$

Hence, the empirical p-values can be computed as
$\text{p}-\text{value}={\scriptstyle \frac{{\displaystyle \sum _{j=1}^{N}}I\left[{S}_{\mathit{loc}}\le {\widehat{S}}_{j,\mathit{loc}}\right]}{N}}$, where *Ŝ _{j,l}*

This hypothesis test can be inverted to form confidence bands around *Z _{loc}*(

This overcomes a potential limitation of this method in which potential clusters assume a square cluster grid instead of a potentially more intuitive circular cluster grid. The square grid indicator function, *W _{i}*(

The weighted cumulative geographic residual method can be applied to aggregate data in which the outcome is continuous, such as mean Body Mass Index (BMI) in a region or 5 year mortality of breast cancer in a region. We are able to use the population-size or inverse of variance weights for parameter estimation through the weighted linear regression model and directly in our weighting function for spatial cluster detection using the residuals. The method is able to adjust for area-level covariates, but adjustment for both individual and area-level covariates has not been incorporated. The next section will propose a simple two-stage approach to extend the method to general covariate adjustment.

Often datasets evaluating spatial clustering only have data available at an aggregate level including spatial locations, outcomes, and covariates. However, there are situations, such as the application of this paper for the health care population, in which spatial locations may be available at the aggregate-level only, but outcomes and covariates are available at both the individual and aggregate-level. Therefore it is important to extend our method to be able to address the question of whether spatial clustering exists after adjusting for individual-level covariates and if it still persists after adjusting for area-level covariates, such as SES.

We propose a simple two-stage approach to account for individual level covariates. The first stage is the model which regresses individual-level covariates,
${\mathbf{X}}_{ij}^{I}$, on the outcome *Y _{ij}*, while still allowing for differences across regions,

$${Y}_{ij}={\mathit{\beta}}_{I}{\mathbf{X}}_{ij}^{I}+{\mathbf{U}}_{i}+{e}_{ij}$$

(4)

where subscript *ij* denotes individual *j*(*j* = 1, …, *n _{i}*) in region

Stage two applies the weighted cumulative residual method discussed in Section 2.1 using derived outcomes and weights from stage one. The derived outcome from stage one can be thought of as an individual-level covariate adjusted outcome and we can derive this outcome using two different approaches; (1) **U*** _{i}* is fixed effect or (2)

Estimation of *V _{i}* and its variance may have issues for regions with small sample sizes. Therefore, when the fixed effect approach is not estimable we propose an alternate approach using a random effect framework to obtain individual-level adjusted outcomes. Now assume that
${\mathbf{U}}_{i}\stackrel{\mathit{ind}}{\sim}({\mathit{\beta}}_{R},{\sigma}_{R}^{2})$ and
${e}_{ij}\stackrel{\mathit{ind}}{\sim}(0,{\sigma}^{2})$. For estimation one can choose any distribution, but it is most common to assume a normal distribution. Inference can be sensitive to distribution chose so it may be important to assess several distributions. Given the distribution assumption on

$${B}_{i}=\sum _{j=1}^{{n}_{i}}[{Y}_{ij}-{\widehat{\mathit{\beta}}}_{I}{\mathbf{X}}_{ij}^{I}-{\widehat{\mathbf{U}}}_{i}]/{n}_{i}.$$

Assuming no relationship between aggregate spatial location and outcome *Y _{ij}* given
${\mathbf{X}}_{ij}^{I}$ the

We have proposed two different approaches for adjusting for individual-level covariates. The fixed effect approach is preferable since it limits modeling assumptions. Specifically, when using the random effect approach estimation is sensitive to the random effects distribution assumed as have been noted by others (Litiere et al., 2008; Heagerty and Kurland, 2001). For similar reasons we have proposed a two-stage approach instead of a single stage approach, i.e. using residuals directly from a generalized linear mixed model that adjusts for both area and individual-level covariates as have been proposed by others for goodness-of-fit tests (Pan and Lin, 2005). Particularly, since we are in the situation of spatial clustering it may actually be inappropriate to apply a generalized linear mixed model, which does not take into account spatial correlation, but only regional correlation, since it may lead to biased parameter estimates due to residual confounding (Wakefield, 2003). Therefore, the general modeling approach presented in this paper tries to present methods that limit model assumptions. The next section evaluates the performance of the proposed cumulative geographic residual method in a simulation study.

We conducted a simulation study calculating the Type I error and power for the proposed cumulative geographic residual test for weighted outcomes. For computational efficiency, we allowed a finite range for half edge length, *b*, of 0.5 to 3 sequenced by 0.1 and simulated 1000 * _{loc}* per dataset. We ran 1000 simulated datasets per calculation.

We first ran Type I error calculations to evaluate the validity of our proposed cluster detection method. The cumulative residual method is based on how well * _{loc}* approximates

Power calculations of the Weighted Cumulative Geographic Residual Test and Weighted Spatial Scan Statistic for different weights and effect size.

We then ran a simulation evaluating the effect on the Type I error when we assume the weights are known/fixed, when they actually are variable. For simplicity we assume a 10×10 unit-less grid with 100 equally sized square regions. We simulate the data assuming *Y _{i}* ~

The next set of power calculations follow the lines of the simulations performed for the spatial scan statistic for weighted outcomes presented by Huang et al. (2009). The spatial scan statistic for weighted outcomes applies a likelihood ratio statistic assuming a weighted normal linear regression model assessing if the outcome mean inside the potential cluster is greater than the outcome mean outside the potential cluster, *μ _{I}* >

For the power calculation we begin by assuming a 10 × 10 unitless study area representing 100 geographic units. The true cluster area *Z*^{*} is defined as the circular region centered at grid (6,3) with a radius of 2. All 13 geographic regions that have the center of their region included within the circle are included in *Z*^{*}. We defined power as the proportion of simulations that detect a significant (*σ* = 0.05) cluster. We define sensitivity as, of those simulations that detect a significant cluster, the proportion that includes at least one region in *Z*^{*}. We define accuracy as, of those simulations that find a significant cluster, the number of detected regions that are part of *Z*^{*} out of the total number of regions included in the detected cluster. For both sensitivity and accuracy we only included the highest significant cluster and not all clusters significant at 0.05 level.

For the first power calculation we generated the dataset assuming that the outcome distribution outside the cluster area is, *Y _{i}*|

Table 1 displays the results of the power calculation. As expected, the power increases with increased effect size and becomes above 90% power after *c >* 1.3. The proposed cumulative geographic residual method has lower power compared to the weighted spatial scan statistic, but this is mainly due to the design of the power calculation. If the true cluster is a circle the weighted spatial scan statistic will have higher power, sensitivity, and accuracy than the proposed cumulative residual method since the cumulative geographic residual method detects square clusters and not circles like the spatial scan statistic. However, the advantage of the cumulative geographic residual test is that it can find multiple clusters which the spatial scan statistic cannot. Therefore the clusters detected are not necessarily squares, but can be a combination of overlapping square areas.

When the weight of the observations within *Z*^{*} increases while the weight of observations outside *Z*^{*} stays constant the power substantially decreases. This is expected since the residual of the regions within *Z*^{*} will go to zero since the estimates of the model will give more weight to the values within *Z*^{*}. Similarly when the weight of the observations outside *Z*^{*} increases while the weight of the observations inside *Z*^{*} stays constant the power decreases. Now the estimates of the model give more weight to the values outside *Z*^{*} so the residuals within *Z*^{*} may be larger, but the weighted indicator variable *W _{i}*(

The next power calculation evaluates the robustness of the cumulative geographic residual test when the outcome is not normal and follows the power calculation presented by Huang et al. (2009). We apply the method assuming a double exponential(DoubleE), logistic, uniform, lognormal, and Poisson distributions. For the double exponential, logistic, and uniform distributions we assume a mean 0 and variance 1 outside *Z*^{*} and mean
$c\sqrt{2}$ and variance 1 inside *Z*^{*} and equal weights of 1 for all geographic areas. For the lognormal distribution we assumed the mean outside *Z*^{*} to be 2 instead of 0 since *Y _{i} >* 0 and the mean inside

Table 2 displays the results of the power calculation evaluating the robustness to distribution assumptions. The power is lower for the double exponential, lognormal, and Poisson distributions and equivalent for the uniform and logistic distributions compared to the normal distribution. Overall the power does not seem largely affected by the distribution assumptions, which is similar to the results of the weighted spatial scan statistic.

We then conducted a simulation to evaluate the effect of area-level covariate adjustment on cluster detection. We assumed that
$E({Y}_{i}\mid {Z}_{i}^{\ast},{X}_{i})=c\sqrt{2}{Z}_{i}^{\ast}+\beta {X}_{i}$ and *Var*(*Y _{i}*) = 1 where

The first five columns of Table 3 display the results for when *c* = 1 and therefore when the outcome, *Y _{i}*, and area-level covariate,

Power calculations of the Cumulative Geographic Residual Test for adjusted and unadjusted area-level and individual-level covariate analyses when spatial clustering exists independent of covariate and outcome relationship (c=1).

The next set of simulations evaluates the concept of the spatial cluster being explained by the area-level covariate. Results are shown in Web-based Table 3, but are briefly summarized here. Our first simulation assessed step 1, the unadjusted analyses, when there is spatial clustering indirectly induced by
${X}_{i}(E({Y}_{i}\mid {Z}_{i}^{\ast})=\beta \gamma {Z}_{i}^{\ast})$. We varied *β* (−2 to 2 sequenced by 1) and *γ* = 0, .5, 1. When *β* ≤ 0 and *γ* = 0.5 or 1.0 there is no power to detect spatial clustering and the power is approximately the Type I error, 0.05. When we allowed *β* > 0 the power increased as *β* increased with a maximum power of 0.373 when *β* = 4 and *γ* = 1. Therefore the power to detect a significant spatial cluster is relatively low, but it does increase as expected with more positive dependence between
${Z}_{i}^{\ast}$ and *X _{i}* (

The next set of simulations evaluate the performance of the fixed effect individual-level covariate adjustment approach and compares this method to adjusting for composite area-level covariates. The simulation of each dataset is a two step process as follows; Step 1: Simulate
${X}_{ij}^{I}\sim N(\gamma {Z}_{i}^{\ast},1)$ and Step 2: Simulate
${Y}_{ij}\sim N(\sqrt{2}c{Z}_{i}^{\ast}+\beta {X}_{ij}^{I},1)$ independently for *i* = 1, …, 100 and *j* = 1, …, *n _{i}*. We vary the effect of
${Z}_{i}^{\ast}$ on
${X}_{ij}^{I}(\gamma )$, the effect of
${X}_{ij}^{I}$ on

The first simulation assessed the scenario when there is no direct relationship between cluster location, *Z*^{*}, and individual-level outcome, *Y _{ij}* (c=0) but a relationship between individual-level covariate,
${X}_{ij}^{I}$, and

We then assessed the power for the proposed fixed effect individual-level and the composite area-level covariate adjustment approaches. Results are displayed in the last four columns of Table 3. The individual-level approach is more powerful then the area-level adjusted approach and the loss of power increased as the dependence between ${X}_{ij}^{I}$ and ${Z}_{i}^{\ast}$ increased.

The next section will assess the performance of the proposed methods evaluating spatial clustering of BMI after adjusting for individual and area-level socioeconomic covariates.

We applied the proposed weighted cumulative geographic residual method to an observational study conducted at Group Health Cooperative (GHC), a mixed model health plan and delivery system serving approximately 300,000 members in western Washington State, USA. Females 18 years of age and older who had been continuously enrolled in GHC for at least six months as of July 31, 2006 (with less than a 60 day lapse in membership) and were residents of King County, WA were eligible for inclusion. Nursing staff and/or medical assistants obtained weight measurements during routine clinical care and entered these into the electronic medical record (EMR). We extracted any adult height and the most recent measurement of weight from the EMR during the preceding two-year period. Among the 57,499 females in our cohort, 19,451(33.8%) did not have a weight and height measurement in the EMR, 2,627(4.6%) did not have valid address information to identify their US census tract location; these females were excluded. We also excluded a census tract with only 3 women residing in it due to model estimation issues giving a total of 35,418 (61.6%) women from 372 census tracts retained for all subsequent analyses.

Location was only available at the census tract level so using a weighted spatial cluster detection method was deemed most appropriate for this analysis. The outcome of interest was body mass index (BMI) which ranged from 15.0 *kg*/*m*^{2} to 99.30 *kg*/*m*^{2} with a median of 26.0 *kg*/*m*^{2} and a mean of 27.5 *kg*/*m*^{2}. Figure 1 Part A shows the variability of the mean BMI within a census tract across the study region. First, it was of interest to assess if there were any elevated regions of BMI in the area and then to assess if the spatial clustering persisted after adjusting for important individual and area-level demographic and socioeconomic status (SES) variables. For our area-level SES variables we used median household income, percent white race, percent of adult males with a Bachelors degree or higher, and percent below of household poverty level all obtained through the 2000 US census. The only available individual-level covariates were females age and type of insurance (corporate, medicare, or medicaid). Maps of the spatial distribution of the area-level covariate are shown in the Web-based Figure 1.

Assessing spatial clustering of female BMI cluster in King County WA. Part A displays the raw mean census tract BMI and Part B and C display displays the areas with statistically significant spatial clustering from the unadjusted analyses and then adjusting **...**

The number of observations per census tract varied across the study area and ranged from 12 to 272 with a median of 85 people. For all analyses we used the fixed effect approach for individual-level covariate adjustment and ranged the half edge length, *b*, from 1 to 5 miles sequenced by 0.25 miles and ran 1000 simulations of the
${G}_{i}^{\prime}$*s*.

The first analysis assessed if there existed any spatial clusters when not adjusting for covariates. We found significant spatial clusters in the southern western region of the area as displayed in Figure 1, Part B. We then ran a series of analyses adjusting for individual-level covariates and then all area-level SES variables. When only adjusting for individual-level covariates the spatial cluster remained statistically significant in a similar location as without adjusting for covariates. When further adjusting for area-level covariates there still existed significant spatial clustering but in a smaller region as shown in Figure 1, Part C. The location moved to the most southern part of the study area and displays spatial clustering that is not explained away by our current area-level SES variables.

The final region with persistent clustering of elevated BMI, after adjusting for all individual and area-level covariates, is a region that is less urban compared to the other areas initially detected, including Federal Way that is a large suburb to Auburn which is a mixture of suburban and rural. The region has also a higher mixture of races including more recent immigrants from South Asia. Therefore there are many possible unmeasured reasons that this area shows persistent clustering such as walkability of the neighborhood (i.e. sidewalks and safety), fast food density, and cultural differences. The link to BMI/obesity and SES in the United States has been well researched in the literature (Dubowitz et al., 2008; King et al., 2005; Wang et al., 2007), particularly for the Caucasian and African-American populations, but how obesity relates to neighborhood infrastructure such as walkability and differences between ethnic groups have been less established (McGinn et al., 2007; Mujahid et al., 2008).

The results from this analysis may not be directly generalizable to Washington state, or even King county, since it only includes an insured population that would have had at least one primary care visit. This is still an important analysis for the healthcare system to try to understand were BMI is elevated and what are potential explanations for the observed elevated BMI that could potentially lead to targeted weight reduction programs.

We also applied the unweighted normal spatial scan statistic using the SaTScan* ^{™}* software (Kulldorff and Information Management Services, 2006). Software for the weighted normal spatial scan is not yet available. Results are displayed in Web-based Figure 2. The spatial scan found a much larger region (≈ 50%) which included the entire souther half of the study area. It also found several small statistically significant secondary clusters in the rest of the study area. Some of these regions are based on weights from < 20 females. This example shows the issues with not incorporating weights and the results are not directly comparable with the results presented for the weighted cumulative residual method.

In this paper we have proposed a robust to distributional assumptions spatial cluster detection method for weighted data that can be applied for a wide variety of outcomes. It was shown to have good power to detect single spatial clusters even when the outcome is not continuous. However, the simulation study did show that if interest is in detecting circular spatial clusters without adjusting for covariates then the weighted spatial scan statistic is more powerful. The new method is able to handle both individual and aggregate level covariate adjustment easily and to assess whether covariates are able to explain the detected spatial clusters. There are no previous methods available for weighted outcomes that can adjust for covariates or detect multiple clusters.

The proposed method improved upon the method published by Cook et al. (2007) using cumulative martingale residuals for censored outcome data, as it incorporates weighted outcomes of any type that can be applied to both point and aggregate data. This new method is more general than what was previously proposed since only the first two moments of the outcome are assumed. Further, by flexibly handling differential weighting of observations this method can be applied to a larger array of applications. The method proposed by Cook et al. (2007) may be thought of as a special case of the general methodology proposed in this paper.

Throughout most of this paper, including the BMI application, we have assumed the weights are known and not estimated. We ran a small simulation study assessing the Type I error when weights were variable and did not find a large effect on the Type I error. However, there may be more of an effect for certain scenarios including an effect on power and extending the method to incorporate extra variability from estimating the weights following the framework presented by Houseman et al. (2006) may be a potential for future work.

Another potential for the development of new statistical methods would be the extension of this method to handle longitudinal data and to be able to assess if a spatial cluster persists in the same location over time. This would answer questions such as if a region consistently has higher mean BMI compared to other regions. This type of method could be used simultaneously with the proposed method in this paper to assess cross-sectional and longitudinal location of spatial clusters over time.

The authors would like to thank the associate editor and reviewers for their helpful feedback. Support for this work was provided by NIH Roadmap 5P20RR020774 (AJC) and NIH RO1 CA95747(YL).

The views expressed in this article do not necessarily represent those of the Food and Drug Administration

7. Supplementary Material

Web Appendix, Tables, and Figures referenced in Sections 2, 3, and 4 are available under the Paper Information link at the Biometrics website http://www.biometrics.tibs.org. R code (R Development Core Team, 2009) implementing the new weighted cumulative geographic residual method are available at the author’s website, http://faculty.washington.edu/acook/software.html.

- Cook A, Gold D, Li Y. Spatial cluster detection for censored outcome data. Biometrics. 2007;63:540–549. [PubMed]
- Dubowitz T, Heron M, Bird C, Lurie N, Finch B, Basurto-Dãvila R, Hale L, Escarce J. Neighborhood socioeconomic status and fruit and vegetables intake among whites, blacks, and mexican americans in the united states. Am J of Clin Nut. 2008;87:1883–1891. [PubMed]
- Duczmal L, Assunção R. A simulated annealing strategy for the detection of arbitrarily shaped spatial clusters. Comp Stat and Data Analysis. 2004;45:269–286.
- Heagerty P, Kurland B. Misspecified maximum likelihood estimates and generalised linear mixed models. Biometrika. 2001;88:973–985.
- Houseman E, Coull B, Ryan L. A functional-based distribution diagnostic for a linear model with correlated outcomes. Biometrika. 2006;93:911–926.
- Huang L, Kulldorff M, Gregorio D. A spatial scan statistic for survival data. Biometrics. 2007;63:109–118. [PubMed]
- Huang L, Tiwari R, Kulldorff M, Zou J, Feuer E. Weighted normal spatial scan statistic for heterogenous population data. JASA. 2009 in press.
- King W, Belle S, Brach J, Simkin-Silverman L, Soska T, Kriska A. Objective measures of neighborhood environment and physical activity in older women. Am J of Prev Med. 2005;28:461–469. [PubMed]
- Kulldorff M. A spatial scan statistic. Comm in Stat. 1997;26:1481–1496.
- Kulldorff M, Huang L, Pickle L, Duczmal L. An elliptic spatial scan statistic. Stat in Med. 2006;25:3929–3943. [PubMed]
- Kulldorff M. Information Management Services, I. SaTScan
^{™}v7.0: Software for the spatial and space-time scan statistics. 2006. http://www.satscan.org/ - Litiere S, Alonso A, Molenberghs G. The impact of a misspecified random-effects distribution on the estimation and the performance of inferential procedures in generalized linear mixed models. Stat in Med. 2008;27:3125–3144. [PubMed]
- McGinn A, Evenson K, Herring A, Huston S, Rodriguez D. Exploring associations between physical activity and perceived and objective measures of the built environment. J of Urban Health. 2007;84:162–184. [PMC free article] [PubMed]
- Mujahid M, Roux A, Shen M, Gowda D, Sanchez B, Shea S, Jacobs DJ, Jackson A. Relation between neighborhood environments and obesity in the multi-ethnic study of atherosclerosis. Am J of Epi. 2008;167:1349–1357. [PubMed]
- Pan Z, Lin D. Goodness-of-fit methods for generalized linear mixed models. Biometrics. 2005;61:1000–1009. [PubMed]
- Patil G, Taillie C. Upper level set scan statistic for detecting arbitrarily shaped hotspots. Environ and Eco Stat. 2004;11:183–197.
- R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2009.
- Su J, Wei L. A lack-of-fit test for the mean function in a generalized linear model. JASA. 1991;86:420–426.
- Tango T. A test for spatial disease clustering adjusted for multiple testing. Stat in Med. 2000;19:191–204. [PubMed]
- Tango T, Takahashi K. A flexibly shaped spatial scan statistic for detecting clusters. Int J Health Geo. 2005;4:11. [PMC free article] [PubMed]
- Turnball G, Iwano E, Burnett W, Howe H, Clark L. Monitoring for clusters of disease: Application to leukemia incidence in upstate new york. Am J of Epi. 1990;132:136–143. [PubMed]
- Wakefield J. Sensitivity analyses for ecological regression. Biometrics. 2003;59:9–17. [PubMed]
- Wang M, Kim S, Gonzalez A, MacLeod K, Winkleby M. Socioeconomic and food-related physical characteristics of the neighborhood environment are associated with body mass index. J Epi Comm Health. 2007;61:491–498. [PMC free article] [PubMed]

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |