Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Ann Appl Stat. Author manuscript; available in PMC 2010 July 12.
Published in final edited form as:
Ann Appl Stat. 2010 March 1; 4(1): 439–459.
doi:  10.1214/09-AOAS278
PMCID: PMC2901904



A large amount of data is typically collected during a periodontal exam. Analyzing these data poses several challenges. Several types of measurements are taken at many locations throughout the mouth. These spatially-referenced data are a mix of binary and continuous responses, making joint modeling difficult. Also, most patients have missing teeth. Periodontal disease is a leading cause of tooth loss, so it is likely that the number and location of missing teeth informs about the patient’s periodontal health. In this paper we develop a multivariate spatial framework for these data which jointly models the binary and continuous responses as a function of a single latent spatial process representing general periodontal health. We also use the latent spatial process to model the location of missing teeth. We show using simulated and real data that exploiting spatial associations and jointly modeling the responses and locations of missing teeth mitigates the problems presented by these data.

Key words and phrases: Binary spatial data, informative cluster size, multivariate data, periodontal data, probit regression, shared parameter model

1. Introduction

Periodontal disease or periodontitis is an inflammatory disease affecting periodontium, the tissues that support and maintain teeth. Periodontitis causes progressive bone loss around the tooth which can lead to tooth loosening and eventually tooth loss. It has been estimated that about 50% of US adults over the age of 35 experience early stages of periodontal disease [Oliver, Brown and Loe (1998)], making periodontitis the primary cause of adult tooth loss. To measure periodontal status, dental hygienists often use a periodontal probe to measure several disease markers throughout the mouth. Three of the most popular markers are (a) clinical attachment loss (CAL), (b) periodontal pocket depth (PPD) and (c) bleeding on probing (BOP). PPD and CAL are continuous variables, usually rounded to the nearest millimeter. CAL is the distance down a tooth’s root that is no longer attached to the surrounding bone by the periodontal ligament, and PPD is the distance from the gingival margin to the base of the pocket. BOP is a binary response and is indicative of whether a particular site bled with the application of a dental probe. During a full periodontal exam, all three markers are usually measured at six pre-specified sites [Darby and Walsh (1995)] for each tooth (excluding the third molars, i.e., the wisdom teeth). So for a patient with no missing teeth, there are S = 168 measurements for each marker (Figure 1).

FIG. 1
Observed CAL for a typical patient. The shaded boxes represent teeth, the circles represent measurement sites, and the gray lines represent neighbor pairs connecting adjacent sites on the same tooth and sites that share a gap between teeth. “Maxillary” ...

The motivating example is a clinical study conducted at the Medical University of South Carolina (MUSC) to determine the periodontal disease status for Type-2 diabetic Gullah-speaking African-Americans, originally presented in Fernandez et al. (2009). The objective of this analysis is to quantify the disease status of this population, and to study the associations between disease status and patient-level covariates such as age, BMI, gender, HbA1C and smoking status.

Quantifying a patient’s disease status from the extensive data collected during a periodontal exam is difficult. For example, it is common to summarize disease status using the whole-mouth average CAL or the number of teeth with CAL above a certain threshold. Using the whole-mouth average CAL as the response in a regression with patient-level covariates is reasonable when the patients’ residual distributions are identical. However, this assumption is often violated in practice, as different patients have different error variances, spatial covariances and missing data patterns. In this paper we present a multivariate spatial model to jointly analyze periodontal data from multiple markers and multiple measurement locations to improve estimation of disease status, and hence develop a more powerful method for studying the association between patient-level covariates and periodontal disease.

We use spatial factor analysis [Wang and Wall (2003), Hogan and Tchernis (2004), Lopes, Salazar and Gamerman (2008)] to model these multivariate spatial data. We postulate that the three markers are all related to a single latent spatial process (factor) measuring periodontal health. The latent periodontal health factor varies from site to site and is smoothed spatially using a conditionally autoregressive prior [Besag, York and Mollié (1991), Banerjee, Carlin and Gelfand (2004)]. The data collected for this study provide interesting challenges that require extensions of the spatial factor model. First, the data are a mix of continuous and binary responses. To jointly model these data types, we develop a spatial probit model for binary responses, which has the advantage of being fully-conjugate and leads to rapid MCMC sampling and convergence. Also, we have data from multiple patients, and exploratory analysis suggests that the covariance of the latent spatial factor varies by patient. Therefore, we develop a hierarchial model which allows the covariance to vary between patients, but pools information across patients to estimate the covariance parameters. We show in a simple case that in terms of estimating the effect of patient-level covariates, this model is equivalent to a weighted multiple regression, where the patient’s scalar response is a linear combination of all data across location and marker, and the patient’s weight decreases with the spatial correlation and variability.

Another challenging aspect of analyzing periodontal data is the considerable number of missing teeth (around 20% for these data). The assumption that teeth are missing completely at random is not valid because periodontal disease is the leading cause of adult tooth loss, so patients with several missing teeth likely have poor periodontal health. For nonspatial data a common method to handle so-called “informative cluster size” is to include the number of observations as a covariate, or in the weights of a weighted regression [Hoffman, Sen and Weinberg (2001), Williamson, Datta and Satten (2003), Follman, Proschan and Leifer (2003), Lu (2005), Benhin, Rao and Scott (2005), Panageas et al. (2007), Cong, Yin and Shen (2007), Williamson et al. (2008)]. Dunson, Chen and Harry (2003) take a different approach. They propose a joint model for clustered mixed (continuous and binary) data and the number of responses in each cluster, using a continuation ratio probit model for cluster size. Another approach is the shared parameter model [e.g., Wu and Carroll (1988), Follman and Wu (1995)]. The shared parameter model accounts for informative missingness by introducing random effects that are shared between the missing data process and the measurement process. Conditioned on the random effects, the missing data and measurement processes are assumed to be independent.

We propose a shared variable model to jointly model missing teeth with the other markers of periodontal disease. However, in our spatial setting both the number and location of missing teeth are informative. For example, a missing tooth in the front of the mouth surrounded by teeth with low CAL may not be informative; in contrast, a missing tooth in the back of the mouth (where periodontal disease is often the most advanced) surrounded by teeth with high CAL is indicative of poor periodontal health in that region of the mouth. Therefore, we model the number and spatial distribution of missing teeth using our latent spatial factor model. In this model, CAL, PPD, BOP and the location of missing teeth are all modeled simultaneously in terms of a latent periodontal health factor; this approach uses all available information to estimate periodontal disease status.

The paper proceeds as follows. Section 2 presents our unified approach to modeling multivariate spatially-referenced periodontal data, as well as our model for informatively missing teeth. Section 3 offers some influence diagnostics to determine which patients and response types are the most informative about the patient-level covariates. Computing details are given in Section 4. Section 5’s simulation study shows that accounting for spatial association and informative observation location can lead to a substantial improvement in estimating the patient-level covariate effects. We analyze the periodontal data in Section 6. Section 7 concludes.

2. Latent spatial factor model for periodontal data

In this section we describe our approach for spatially-referenced mixed periodontal data with informative missingness. We begin in Section 2.1 by specifying a latent spatial factor model assuming no missing teeth. Section 2.2 introduces the spatial probit model for missing teeth and Section 2.3 specifies priors and discusses identifiability of the latent variable model.

2.1. Complete data model

We assume our multivariate spatial data has J types of responses (for the periodontal data the J = 3 responses are CAL, PPD and BOP) at each spatial location for each patient. If the jth type of response is continuous (CAL and PPD), let yij(s) be the response at spatial location s for patient i, s = 1,…,S and i = 1,…,N. Our data also has binary responses (BOP). If the jth type of response is binary, let yij*(s) be the response at spatial location s for patient i. We model binary responses using probit regression, that is, yij*(s)=I(yij(s)>0), where I(·) is the binary indicator function and yij(s) is a Gaussian latent variable.

All J responses are modeled as functions of the latent spatial disease status, μi(s), which represents the overall periodontal health of patient i at location s. Let


where aj is the intercept for response j, bj relates the latent factor to response type j, and εij(s)~N(0,σij2) is error. As is customary for probit regression, we assume σij2=1 for binary responses for identification. Since all J responses depend on the common latent factor, they are correlated with

Cor(yij(s),yil(s))=bjbl Var[μi(s)]bj2 Var[μi(s)]+σij2bl2 Var[μi(s)]+σil2.

The slopes bj and bl determine the sign and magnitude of the correlation; if either bj or bl is zero, then yij(s) and yil(s) are uncorrelated, and if bj and bl share (do not share) the same sign, then yij(s) and yil(s) are positively (negatively) correlated.

The latent vector μi = (μi(1),…,μi(S))′ has a multivariate normal prior with conditionally autoregressive covariance [“CAR,” Besag, York and Mollié (1991)]. The mean of μi is


where W is an S × q matrix of spatial covariates (e.g., tooth number) that do not vary across patient, Xi is the p-vector of patient-level covariates (e.g., age) that do not vary across space within patient, Ωi=Xi1S, 1S is the S-vector of ones, and α and β are the corresponding regression parameters. The covariance of μi is τi2Q(ρi)1, where Qi) = M − ρiD, Dss is one if locations s and s′ are adjacent and zero otherwise, M is diagonal with diagonal elements Mss = Σs Dss. In this spatial model, ρi [set membership] [0, 1] controls the degree of spatial association and τi2>0 controls the magnitude of variation. Let ri(s) = μi(s) − Ei(s)). A convenient interpretation of the CAR prior is that the conditional distribution of ri(s) given ri(s′) for all s′ ≠ s is normal with mean ρi [r with macron]i(s) and variance τi2/m(s), where [r with macron]i(s) is the average of ri(s) at location s’s m(s) neighbors.

The degree of spatial variation is allowed to differ between patients by means of σij2,τi2 and ρi. To pool information across patients, we use models

σij2|cj,dj~Gamma(cj,dj), τi2|e,f~Gamma(e,f),ρi|g,h~Beta(g,h),

where {cj}, {dj}, e, f, g and h are hyperparameters.

2.2. Model for the location of missing teeth

For our data described in Section 1, roughly 20% of the teeth are missing. The locations of the missing teeth are not random, but rather related to the periodontal health in that region of the mouth. Therefore, we propose a model for the location of missing teeth as a function of the underlying latent factor μi(s).

For our data either the six observations on a tooth for all J responses are all observed or all unobserved. That is, if a tooth is missing, we have no data for the tooth, and if a tooth is not missing, we have complete data. Let yi0*(t) be an indicator of whether tooth t = 1,…,T is missing for patient i. As with the binary data in Section 2, we model yi0*(t) using probit regression. Let yi0*(t)=I(yi0(t)>0), where yi0(t) is a latent continuous variable. Then


where Zt is such that Ztμi is the mean of μi at the six observations on tooth t and εi0(t)~i.i.d.N(0,1). a0 and b0 relate the latent process to the missing tooth indicator. Note that since μi(s) is included in both the model for presence of and value of the responses, both presence and value of the data contribute to the posterior of μi(s), and thus the posterior of β. Also note that bi0 = 0 corresponds to independence between the latent factor and the location of missing teeth, in which case the location of missing teeth does not contribute to estimating β.

2.3. Identifiability and prior choice

Identifiability is a key issue in latent variable modeling. To see this, we inspect the first two moments of the multivariate response at location s for patient i after integrating over the latent factor μi,


where W(s) is the row of W corresponding to location s and q(s) is the (s, s) diagonal element of Qi)−1. Identifiability concerns arise in both moment expressions, as multiplying all of the slopes bj by scalar c and dividing α, β and τi2 by c gives identical moments. Although there are other ways to address this issue, we fix b1 [equivalent] 1. This identifies both the regression coefficients α and β via the mean of the first response and the CAR variance τi2 via the variance of the first response. In our analysis of periodontal data of Section 6 we take the first response with fixed slope to be clinical attachment loss, the most commonly used measure of periodontal disease. We also compare these results with other baseline assignments and discuss sensitivity to this assumption.

The regression coefficients {aj}, {bj} (j ≠ 1), α and β have independent N(0, w2) priors. The hyperparameters {cj}, {dj}, e, f, g and h have independent Gamma(u, v) priors. In the simulation study (Section 5) and data analysis (Section 6) we take u = v = 0.1 and w = 10 to give vague yet proper priors. We conduct a sensitivity analysis in Section 6 which shows that the results are not sensitive to these priors for this large periodontal data set.

3. Influence diagnostics

Our primary interest is in the patient-level parameters β. In this complicated hierarchical Bayesian model, we would like to identify the sources of data that are most informative about β. In this section we develop diagnostics to determine which patients, spatial locations and response types are the most influential. We assume no missing teeth, that all responses are Gaussian, and that no covariates depend on space (W is null). In this case the regression coefficients only affect the overall average response for each patient, and a tempting simplification is to collapse data over space and use each patient’s overall average as a scalar response. We show that even in this case different areas of the mouth are more than less informative, and that patients are weighted differently depending on their spatial covariance parameters. This motivates the hierarchical spatial model even in this simple case.

Integrating over latent effect μi, but conditioning on σij2,τi2, ρi, aj and bj, the posterior for β is Gaussian with




δi=τi2j=1Jbj2/σij2, and


The posterior in (6) is equivalent to a weighted linear regression where each patient contributes the scalar response zi and is weighted according to wi. Analyzing zi and wi shows which sites, patients and outcomes contribute the most to β’s posterior.

First we consider zi:


where the vector ki = 1′ [Qi)(δiIs + Qi))−1]/wi. Therefore, zi is a linear combination of all the observations for patient i, with k(s) controlling the relative weight of observations at location s and bj/σij2 controlling the relative weight of response type j. Figure 2(a) plots k(s) (scaled to sum to S) for four combinations of ρi and δi. Observations in the gaps between teeth have the highest weight; these sites have the most neighbors and thus the smallest prior variance. Observations in the back of the mouth and on the sides of teeth get less weight.

FIG. 2
Panel (a) plots the spatial weights k(s) for various δ and ρ. “Maxillary” and “Mandibular” refer to upper and lower jaws respectively, while “buccal” and “lingual” refer to ...

The patient weights wi are plotted as a function of ρi, δi and τi in Figure 2(b). The weight decreases with ρi and τi2, and increases with δi (inversely related to error variances σij2). That is, patients with little spatial association and small variances τi2 and σij2 (and thus large δi) have the most influence on β’s posterior. To search for overly-influential patients, we compute the weights by evaluating (7) using posterior means [rho with circumflex]i, τ^i2 and σ^i12. However, the marginal posterior for β is not available in closed-form for Section 6’s data with informative missing teeth and binary responses. Therefore, we use only the CAL error variance σ^i12, that is, δi=τ^i2/σ^i12 (b1 = 1, Section 2.3), as an approximation. This approximation is not meant to be definitive, but rather a useful heuristic device.

4. MCMC sampling algorithm

MCMC sampling is carried out using the free software R (, although it would also be straightforward to implement the model using WinBUGS ( Sample code to analyze a single continuous response is available in the supplemental article [Reich and Bandyopadhyay (2009)]. We draw 20,000 MCMC samples and discard the first 5000 as burn-in. Convergence is monitored using trace plots of the deviance as well as several representative parameters.

The patient-specific parameters are conditionally-conjugate except for the CAR spatial association parameters ρi, which are updated using Metropolis–Hastings sampling with a Beta(50ρi*,50(1ρi*)) candidate distribution, where ρi* is the value at the previous iteration. The remaining parameters are updated using Gibbs sampling with full conditionals given below. The latent continuous variables corresponding to the probit model for binary responses, yij(s), are updated from their truncated full conditionals yij(s) ~ N(aj + bjμi(s), 1), restricted to (,0) if yij*(s)=0 and (0,) if yij*(s)=1. The vector of latent effects for patient i, μi, is multivariate normal with

V(μi| rest)1=ZZb02+Q(ρi)/τi2+(j=1Jbj2/σij2)In,E(μi| rest)=V(μi| rest)(b0Z(y0ia0)+Q(ρi)(Wα+Ωiβ)/τi2+j=1Jbj(yijaj)/σij2),

where Z = (Z1,…,ZT), yij = (yij(1),…,yij(S)) and yi0 = (yi0(1),…,yi0(T)). The measurement error variances for the continuous responses have full conditional

σij2| rest ~ InvGamma(S/2+cj,s=1S(yij(s)ajbjμi(s))2/2+dj),τj2| rest ~ InvGamma(S/2+e,riQ(ρi)ri/2+f),

where ri = μiWα − Ωiβ.

The intercept/slope pairs (aj, bj) have bivariate normal full conditionals with mean

V((aj,bj)| rest)1=w2I2+i=1NΔiΔi/σij2,E((aj,bj)| rest)=V((aj,bj)| rest)(i=1NΔiyij/σij2),

where Δi = (1,μi). The regression coefficients α and β have multivariate normal full conditionals with

V(α| rest)1=w2Ips+i=1NWQ(ρi)W/τi2,E(α| rest)=V(βs| rest)Xsi=1NQ(ρi)(μiΩiβ)/τi2


V(β| rest)1=w2Ip+i=1NΩiQ(ρi)Ωi/τi2,E(β| rest)=V(β| rest)i=1NΩiQ(ρi)(μiWα)/τi2.

The remaining parameters {cj}, {dj}, e, f and g are updated using Metropolis sampling with Gaussian candidate distributions tuned to give acceptance ratios near 0.40.

5. Simulation study

In this section we conduct a simulation study to demonstrate the effects of spatial correlation and informative missingness on the analysis of patient-level fixed effects. For computational purposes we assume only one quadrant (i.e., half jaw) for each patient leaving S = 42, that there are no spatial covariates W, and the same CAR spatial association parameter for each patient, that is, ρi = ρ. We also assume there is only a single continuous response. Data are generated from the model

P(yi(s)=observed)=1Φ(a0+b0μi(s)),yi(s)|yi(s) observed ~N(a1+b1μi(s),σi2),

where μi~N([xiβ]1S,τi2Q1(ρ)). Each simulated data set contains data generated from this model for N = 50 patients. The p = 6 patient-level covariates xi are generated independently from the standard normal distribution and the regression coefficients are β = (0, 0, 0, 1, 2, 3)/20. Finally, a1 = b1 = 1 and a0 = −1.

M = 100 data sets are generated from each of six designs specified by varying the true value of the covariance parameters σi2,τi2 and ρ and the missing data mechanism b0:

  • Design 1: ρ = 0.0, b0 = 0 and σi2=τi2=1,
  • Design 2: ρ = 0.9, b0 = 0 and σi2=τi2=1,
  • Design 3: ρ = 0.9, b0 = 0 and σi2=τi2=1.5*I(i is odd)+0.5,
  • Design 4: ρ = 0.9, b0 = 1 and σi2=τi2=1,
  • Design 5: ρ = 0.9, b0 = 1 and σi2=τi2=1.5*I(i is odd)+0.5,
  • Design 6: ρ = 0.5, b0 = 1 and σi2=τi2=1.5*I(i is odd)+0.5.

Observations within patients are independent under the first design and spatially correlated under all other designs. The variances are the same for all patients under Design 2 and vary across patients for Design 3. Designs 4 and 5 are similar to Designs 2 and 3, except that the locations of missing observations are informative with b0 = 1. Design 6 is the same as Design 5, except with moderate spatial association ρ = 0.5.

We analyze each simulated data set using five models:

  • Model 1: Linear regression, y¯i=sSiyi(s)/|Si|~N(xiβ,σ2),
  • Model 2: Section 2’s spatial model without informative missingness or patient-specific variances, that is, b0 = 0, σi2=σ2 and τi2=τ2,
  • Model 3: Section 2’s spatial model with patient-specific variances but without informative missingness, that is, b0 = 0,
  • Model 4: Section 2’s spatial model with informative missingness but without patient-specific variances, that is, σi2=σ2 and τi2=τ2,
  • Model 5: Section 2’s full spatial model,

where Si in Model 1 is the set of locations of observed data for patient i. Model 1 ignores spatial associations and missing teeth, and simply uses each patient’s average observed response in a multiple regression. Models 2–5 explicitly model all observations individually and account for spatial associations between nearby observations.

The results are presented in Table 1. For each model and each design, we calculate the proportion of the 95% posterior intervals for b0 and the regression coefficients that exclude zero. We also compute the mean squared error and relative bias, MSE=1pMm=1Mj=1p(β^j(m)βj)2 and RelBiasj=1Mm=1M(β^j(m)βj)/βj, where β^j(m) is the posterior mean of βj for the mth simulated data set and βj is the true value. Relative bias is only presented for the largest coefficient, β6.

Simulation study results. Column labels “b0”−“β6” give the proportion of 95 % intervals that exclude zero. The Monte Carlo standard errors (not shown) are between 0.007 and 0.045 for 100*MSE and between ...

Data for the first design are generated without spatial association or informative missingness. In this case all five models give nearly identical results, demonstrating that the spatial models are able to approximate the simple regression model if appropriate. The five models are also nearly identical for Design 2 where the data are generated with spatial correlation and the same variances for each patient. In this case the patient means yi are Gaussian with mean a1+xiβ and the same variances, satisfying the usual regression assumptions.

The linear regression model does not perform well for Design 3’s spatial model with patient-dependent variances. In this case the patient means yi have different variances, violating the usual regression assumptions. The spatial models that allow for patient-dependent variances (Models 3 and 5) give dramatic improvements in both power and mean squared error compared to the homoskedastic models.

The locations of missing observations are informative for Designs 4, 5 and 6. For these two designs the models (Models 1–3) that do not account for informative missingness are biased for β6. The models that allow for informative location consistently identify b0 as nonzero (power 1.0 in all three designs), which alleviates the bias for the nonnull predictors and improves power. Design 5 has both informative missingness and patient-dependent variances, common traits of periodontal data. In this case our full model is more than three times more powerful for β6 (0.26 to 0.94) and has roughly one fourth the mean squared error (0.193 to 0.780) of the usual nonspatial regression approach.

6. Analysis of periodontal data

The motivating data were collected from a clinical study [Fernandes et al. (2009)] conducted by the Center for Oral Health Research (COHR) at the Medical University of South Carolina (MUSC). The relationship between periodontal disease and diabetes level has been previously studied in the dental literature [Faria-Almeida, Navarro and Bascones (2006), Taylor and Borgnakke (2008)]. The objective of this study was to explore the relationship between periodontal disease and diabetes level (determined by the popular marker HbA1c, or “glycosylated hemoglobin”) in the Type-2 diabetic adult (13 years or older) Gullah-speaking African-American population residing in the coastal seaislands of South Carolina. Since this is part of an ongoing study, we selected 199 patients with complete covariate information and with at least 50% responses available.

For each patient CAL, PPD and BOP are measured at six locations on each nonmissing tooth, as shown in Figure 1. Additionally, several patient-level covariates were obtained, including age (in years), gender (1 = Female, 0 = Male), body mass index or BMI (in kg/m2), smoking status (1 = a smoker, 0 = never) and HbA1c (1 = High, 0 = controlled). We also include spatial covariates for the site in the gap between teeth (1 = in the gap, 0 = on the side of a tooth), jaw (1 = maxilla, 0 = mandible) and six tooth number indictors with the first tooth (front of the mouth, Figure 1) serving as the reference tooth. All covariates are standardized to have mean zero and variance one. The spatial adjacency structure is shown in Figure 1; we consider neighboring sites on the same tooth as well as neighboring sites on the consecutive teeth to be adjacent.

We begin by fitting several models with the same variances for all patients, that is, σij2=σj2,τi2=τ2 and ρi2=ρ2. We fit four models by assuming spatial association (ρ ~ Unif[0, 1]) and independence (ρ = 0), and assuming missing teeth are informative (b0 ≠ 0) and not informative (b0 = 0). Table 2 gives posterior 95% intervals for several parameters. The slopes bj for pocket depth and bleeding on probing (as described in Section 2.3, slope for attachment loss is fixed at one) are significantly positive for all models, suggesting strong positive associations between the three responses. Several covariates are significant for all models, including patient effects gender, smoking status and HbA1c status, indicators of a site in the gap between teeth and a site on the upper jaw, and several tooth number indictors with sites in the back of the mouth having higher mean responses.

Posterior 95% intervals for models assuming variances σij2 and τi2 are constant across patients. “Spatial” models take ρ ≠ 0 and models with informative missing teeth (“Info missing”) ...

The slope relating the latent spatial process with the probability of a missing tooth, b0, is also significantly positive. This matches the intuition that patients with poor periodontal health generally have more missing teeth. Figure 3 plots the data and fitted values for a typical patient to illustrate the effects of accounting for informative missing teeth. This plot compares the spatial models with b0 set to zero (solid lines) and b0 not set to zero (dashed lines). The posterior means [Figure 3(a)–(c)] and credible sets [3(d)] are nearly identical for observations on non-missing teeth. However, for missing teeth the fitted values for all three responses are larger (worse periodontal health) when accounting for informative observation location.

FIG. 3
Panels (a)–(c) plot the data (dots) and posterior mean of the expected response (lines) for a typical patient. Panel (d) plots the posterior mean (bold) and 95% interval (thin) for the latent spatial process μ(s). All plots include results ...

Accounting for informative observation location also affects the patient effect for age. The 95% interval ignoring spatial association and informative observations location is (−0.002, 0.022), compared to (0.036, 0.115) for the full model. The measures of periodontal disease are cumulative, so it seems reasonable that age should be an important predictor. Our data show a relationship between age and the number of missing teeth; patients that are younger than 54 (the mean age) have an average of 135.8 (sd = 20.1) observations and patients that are older than 54 have an average of 124.7 (sd = 22.2) observations. By accounting for this relationship, we identify age as a significant predictor of periodontal health.

Section 5’s simulation study shows that the fixed effects can also be affected if patients have different spatial covariances. To explore this possibility for our periodontal data, we apply Section 2’s model with variances σij2 and τi2 varying across patients. Figure 4(a) and (b) summarize the posteriors of the variance parameters. Here we see considerable variation across patients; Figure 4(b) shows that the posterior 95% intervals for τi are nonoverlapping for patients with small and large τi.

FIG. 4
Panel (a) gives the posterior medians of the patient-specific standard deviations (τiij) for attachment loss, and panel (b) plots the posterior of the CAR standard deviations τi (the horizontal lines in the ith column are the ...

Section 3’s wi diagnostic in (7) indicates which patients are the most influential on the regression coefficients. The wi (computed using only the CAL error variance) have median 28.0 and vary greatly across patients with 95% interval (5.8, 61.7). Figure 4(c) and (d) plot CAL for the patients with the smallest and largest wi. The responses for the patient with smallest wi vary considerably from site-to-site within the mouth, with attachment loss ranging from 0 to 11 mm. Information about the patient-level covariates accumulate via the mean of the latent parameters μi; due to spatial variability, the mean is quite uncertain for this patient and, thus, this patient provides little information about β. In contrast, the patient with largest wi is stable from site-to-site, providing reliable information the mean of μi and thus about β.

Table 3 gives the 95% posterior intervals for several parameters from the model with patient-dependent variances. Comparing the spatial models with informative missingess, the results for the patient-level covariates are fairly similar for the models with and without patient-dependent variances (i.e., the final columns of Tables 2 and and3).3). However, we note that the width of the credible intervals are smaller for the model with patient-dependent variances.

Posterior 95% intervals for models assuming variances σij2 and τi2 vary across patients.“Spatial” models take ρ ≠ 0 and models with informative missing teeth (“Info missing”) ...

Finally, we conducted a sensitivity analysis to determine the effect of modeling assumptions for the full spatial model with patient-dependent variances and informative observation location. We modified the analysis by changing the reference group with slope bj fixed to one from CAL to PPD and BOP, changing the hyperparameters u = v = 0.1 to u = v = 0.0001, and changing the hyperparameter w = 10 to w = 1000. The posterior 95% intervals are given in Table 4 for the patient effects, scaled by b1 for comparison across reference group. The modification with the largest effect is changing the reference group from CAL to BOP. The patient level effects are generally closer to zero using BOP as the reference group. Despite this change in scale, the signs of the coefficients and the subset of coefficients with intervals that exclude zero remains the same as the original analysis.

Posterior 95% intervals for the patient effects for various modeling/prior choices for the full model with spatial correlation, patient dependent variance and informative observation location. “Ref group” refers to the response that has ...

Also, we consider modifying the adjacency structure shown by the gray lines in Figure 1 (“spatial grid 1”) in two ways: first by not considering sites on the opposite side of a tooth to be neighbors (“spatial grid 2”) to give independent AR(1) models to the sites on the buccal and lingual sides of each jaw, and second by considering all pairs of observations on the same tooth to be neighbors (“spatial grid 3”). To determine how well each of these spatial grids fit our data, we use the deviance information criteria (DIC) of Spiegelhalter et al. (2002). To compare spatial structures using DIC, we analyze only a single continuous response, CAL, and do not consider informative missing teeth. DIC prefers grid 1 (DIC = 66,128) over grids 2 (DIC = 68,168) and 3 (DIC = 68,562). Table 4 gives the posterior of the subject-level effects for the full data analysis using the three spatial grids; the results are not sensitive to the choice of spatial structure.

7. Discussion

In this paper we develop a latent factor model for multivariate spatial periodontal data with a mix of binary and continuous responses. Our model allows for a different spatial covariance for each patient and for informative missing teeth. We show using simulated and real data that accounting for these factors leads to a substantial improvement for estimating covariate effects compared to standard regression techniques.

We have assumed throughout that the patient’s periodontal health can be captured by a single latent factor. It would be straightforward, conceptually if not computationally, to include more latent factors. However, this leads to the problem of selecting the appropriate number of latent factors, interpreting the roles of the different latent factors, and understanding the effects of the covariates on the different latent factors. For these data with three strongly-correlated responses we prefer the single factor model for computational simplicity and interpretability. If multiple factors are allowed, the number of factors could be chosen using the deviance information criteria. Another approach would be to allow the number of factors to be unknown. Lopes, Salazar and Gamerman (2008) and Salazar, Gamerman and Lopes (2009) use reversible jump MCMC to account for uncertainty in the number of latent factors. Extending this approach to our setting may be complicated by the large number of subjects, since the proposal density would have to propose spatial models that simultaneously fit well for all 199 subjects. Another possibility would be to extend the parameter expansion method of Ghosh and Dunson (2008) to the spatial setting.

We have also assumed that the latent spatial process is Gaussian. For nonspatial data several authors have proposed methods that avoid assuming the shared random effects are Gaussian [Lin et al. (2000), Song, Davidian and Tsiatis (2002), Beunckens et al. (2008), Tsonaka, Verbeke and Lesaffre (2009)]. These approaches could be extended to the periodontal setting by replacing the Gaussian spatial model with a non-Gaussian spatial model [e.g., Gelfand, Kottas and MacEachern (2005), Griffin and Steel (2006), Reich and Fuentes (2007)].

An area of future work is to apply this method to longitudinal periodontal data. Periodontal data is often collected repeatedly for a single patient over time to monitor disease progression. Reich and Hodges (2008) propose a spatiotemporal model for attachment loss. It should be possible to extend this model to accommodate mixed multivariate responses and informative missing teeth.

Supplementary Material



The authors thank the Center for Oral Health Research (COHR) at the Medical University of South Carolina for providing the data and context for this work, in particular, Drs. S. London, J. Fernandes, C. Salinas, W. Zhao, Ms. L. Summerlin and Ms. P. Hudson. We also wish to acknowledge several helpful discussions with Dr. James Hodges of the University of Minnesota.


1Supported in part by NIH/NCRR Grant P20 RR017696-06 and NIH Grant 1R01LM009153.


Computer code (spatial factor.R) (DOI: 10.1214/09-AOAS278SUPP; .R). In the supplemental file, we include R code to analyze a single continuous response with informative missingness. Use of the code is described in the file and is illustrated with an analysis of a simulated data set.


  • Banerjee S, Carlin BP, Gelfand AE. Hierarchical Modeling and Analysis for Spatial Data. 1st ed. Chapman & Hall/CRC: Boca Raton, FL; 2004.
  • Benhin E, Rao JNK, Scott AJ. Mean estimating equation approach to analysing cluster-correlated data with nonignorable cluster sizes. Biometrika. 2005;92:435–450. MR2201369.
  • Besag J, York JC, Mollié A. Bayesian image restoration, with two applications in spatial statistics (with discussion) Ann. Inst. Statist. Math. 1991;43:1–59. MR1105822.
  • Beunckens C, Molenberghs G, Verkeke G, Mallinchrodt C. A latent-class mixture model for incomplete longitudinal Gaussian data. Biometrics. 2008;64:96–105. MR2422823. [PubMed]
  • Cong XJ, Yin G, Shen Y. Marginal analysis of correlated failure time data with informative cluster sizes. Biometrics. 2007;63:663–672. MR2395702. [PubMed]
  • Darby ML, Walsh MM. Dental Hygiene: Theory and Practice. 1st ed. Philadelphia, PA: W. B. Saunders; 1995.
  • Dunson DB, Chen Z, Harry J. A Bayesian approach for joint modeling of cluster size and subunit-specific outcomes. Biometrics. 2003;59:521–530. MR2004257. [PubMed]
  • Faria-Almeida R, Navarro A, Bascones A. Clinical and metabolic changes after conventional treatment of Type-2 diabetic patients with chronic periodontitis. Journal of Periodontology. 2006;77:591–598. [PubMed]
  • Fernandez JK, Wiegand RE, Salinas CF, Grossi SG, Sanders JJ, Lopes-Virella MF, Slate EH. Periodontal disease status in Gullah African Americans with Type 2 diabetics living in South Carolina. Journal of Periodontology. 2009;80:1062–1068. [PMC free article] [PubMed]
  • Follman D, Wu MC. An approximate generalized linear model with random effects and informative missing data. Biometrics. 1995;51:151–168. MR1341233. [PubMed]
  • Follman D, Proschan M, Leifer E. Multiple outputation: Inference for complex clustered data by averaging analyses from independent data. Biometrics. 2003;59:420–429. MR1987409. [PubMed]
  • Gelfand AE, Kottas A, MacEachern SN. Bayesian nonparametric spatial modeling with Dirichlet process mixing. J. Amer. Statist. Assoc. 2005;100:1021–1035. MR2201028.
  • Ghosh J, Dunson DB. Bayesian model selection in factor analytic models. In: Dunson DB, editor. Random Effect and Latent Variable Model Selection. New York: Wiley; 2008. pp. 151–164.
  • Griffin JE, Steel MFJ. Order-based dependent Dirichlet processes. J. Amer. Statist. Assoc. 2006;101:179–194. MR2268037.
  • Hoffman EB, Sen PK, Weinberg CR. Within-cluster resampling. Biometrika. 2001;88:1121–1134. MR1872223.
  • Hogan JW, Tchernis R. Bayesian factor analysis for spatially correlated data, with application to summarizing area-level material deprivation from census data. J. Amer. Statist. Assoc. 2004;99:314–324. MR2109313.
  • Lin H, McCulloch CE, Turnbull BW, Slate E, Clark L. A latent class mixed model for analysing biomarker trajectories with irregularly scheduled observations. Stat. Med. 2000;19:1303–1318. [PubMed]
  • Lopes HF, Salazar E, Gamerman D. Spatial dynamic factor analysis. Bayesian Anal. 2008;3:759–792.
  • Lu W. Marginal regression of multivariate event times based on linear transformation models. Lifetime Data Anal. 2005;11:389–404. MR2186652. [PubMed]
  • Oliver RC, Brown LJ, Loe H. Periodontal diseases in the United States population. Journal of Periodontology. 1998;69:269–278. [PubMed]
  • Panageas KS, Schrag D, Localio AR, Venkatraman ES, Begg CB. Properties of analysis methods that account for clustering in volume-outcome studies when the primary predictor is cluster size. Stat. Med. 2007;26:2017–2035. MR2364289. [PubMed]
  • Reich BJ, Bandyopadhyay D. Supplement to “A latent factor model for spatial data with informative missingness.” 2009 DOI: 10.1214/09-AOAS278SUPP. [PMC free article] [PubMed]
  • Reich BJ, Fuentes M. A multivariate semiparametric Bayesian spatial modeling framework for hurricane surface wind fields. Ann. Appl. Statist. 2007;1:249–264. MR2393850.
  • Reich BJ, Hodges JS. Modeling longitudinal spatial periodontal data: A spatially-adaptive model with tools for specifying priors and checking fit. Biometrics. 2008;64:790–799. [PubMed]
  • Salazar E, Gamerman D, Lopes HF. Technical report. Univ. Chicago: 2009. Generalized spatial dynamic factor analysis.
  • Song X, Davidian M, Tsiatis AA. A semiparametric likelihood approach to joint modeling of longitudinal and time to event data. Biometrics. 2002;58:742–753. MR1945011. [PubMed]
  • Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A. Bayesian measures of model complexity and fit (with discussion) J. Roy. Statist. Soc. Ser. B. 2002;64:583–639. MR1979380.
  • Taylor GW, Borgnakke WS. Periodontal disease: Associations with diabetes, glycemic control and complications. Oral Diseases. 2008;14:191–203. [PubMed]
  • Tsonaka R, Verbeke G, Lesaffre E. A semi-parametric shared parameter model to handle nonmonotone nonignorable missingness. Biometrics. 2009;65:81–87. [PubMed]
  • Wang F, Wall M. Generalized common spatial factor model. Biostatistics. 2003;4:569–582. [PubMed]
  • Williamson JM, Datta S, Satten GA. Marginal analysis of clustered data when the cluster size is informative. Biometrics. 2003;59:36–42. MR1978471. [PubMed]
  • Williamson JM, Kim HY, Manatunga A, Addiss DG. Modeling survival data with informative cluster size. Stat. Med. 2008;27:543–555. MR2418464. [PubMed]
  • Wu MC, Carroll R. Estimation and comparison of changes in the presence of informative right censoring by modeling the censoring process. Biometrics. 1988;44:175–188. MR0931633.