4.1 Details on the VHA National Mental Health Performance Monitoring System data
The continuous inpatient and outpatient responses were standardized to have mean 0 and variance 1. Patient-level covariates in the within-unit model [see (2.1) and (2.4)] were age (categorized as ≤39, 40–59, or ≥60), race (categorized as white, black, or other), and primary mental health diagnosis (schizophrenia, other psychoses, posttraumatic stress disorder, alcohol abuse, or other). Over the 6-year period, the majority (95%) of patients were male; one-quarter were black; 60% were aged 40–59 years at the time of their index discharge; and 27% had a primary mental health diagnosis of schizophrenia (data not shown). Across the 22 networks, 6-month readmission rates showed a 3.4% absolute decline between 1993 and 1998, with a corresponding 5.3% increase in the outpatient visit rate. These changes are consistent with expected patterns following health care reorganization. In what follows, networks are labeled using roman letters (A–W).
4.2 Sampling algorithm
We ran five parallel chains, each of length 1100. Trace plots of selected parameters within each chain suggested convergence within 100 iterations; we therefore discarded the first 100 iterations of each chain as burn-in. To reduce autocorrelation within the chains, we retained every 10th of the remaining iterations, giving a total posterior sample size of 500.
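The retained sample size follows directly from the chain settings above. A minimal sketch of the arithmetic, assuming that thinning starts at the first post-burn-in iteration (the helper name `retained_draws` is illustrative and not part of the original analysis):

```python
def retained_draws(chain_length, burn_in, thin, n_chains):
    """Count draws kept after discarding burn-in and keeping every `thin`-th iteration."""
    # Indices (0-based) of the iterations kept within a single chain.
    kept = range(burn_in, chain_length, thin)
    return len(kept) * n_chains


# Settings reported in the text: 5 chains of length 1100, burn-in 100, thinning 10.
total = retained_draws(chain_length=1100, burn_in=100, thin=10, n_chains=5)
print(total)  # 500
```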
4.3 Prior for discrimination matrix, Λ
We fit a two-latent variable model (L = 2) where, a priori, the latent variables were thought to represent inpatient and outpatient quality. The prior specification given in Section 3.1 required specification of two hyperparameters, π and c^{2}. We wanted to specify priors flexible enough to verify the a priori belief of an inpatient and outpatient latent variable, but informative enough to avoid identifiability problems. We first set π = 0.25 and c^{2} = 0.01 (c = 0.1) and then examined sensitivity to fixing π/c and varying π; inferences were essentially insensitive to this variation. We then fixed π and varied the ratio. Values of π/c less than two came close to causing identifiability problems, with components of the latent variables switching and signs changing on the discrimination parameters. Values of the ratio over three seemed to be too informative. Thus, for inferences in Sections 4.4–4.6, we set π = 0.25 and π/c = 2.5.
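To make the role of the ratio π/c concrete, the sketch below treats π as the prior mean and c as the prior standard deviation of a normally distributed discrimination parameter. The normal form is an assumption made here purely for illustration; the actual prior is the one specified in Section 3.1. Under this assumption, Φ(−π/c) is the prior probability that a loading takes the opposite sign, which decreases quickly as the ratio grows.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prior_sd_from_ratio(pi, ratio):
    """Prior standard deviation c implied by a prior mean pi and a ratio pi / c."""
    return pi / ratio


def prior_prob_wrong_sign(ratio):
    """P(lambda < 0) if lambda ~ N(pi, c^2) with pi > 0 and pi / c = ratio (assumed form)."""
    return normal_cdf(-ratio)


# Ratios spanning the range explored in the sensitivity analysis, with pi = 0.25.
for ratio in (2.0, 2.5, 3.0):
    c = prior_sd_from_ratio(0.25, ratio)
    print(f"ratio={ratio}: c={c:.4f}, c^2={c * c:.4f}, "
          f"P(wrong sign)={prior_prob_wrong_sign(ratio):.4f}")
```

With π = 0.25, a ratio of 2.5 corresponds to c = 0.1, the value used for the inferences reported here.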
The posterior means of the discrimination parameters were consistent with the prior, which differentiated inpatient and outpatient measures (Table 2), although they were considerably smaller than their prior means. The discriminating abilities of the components of the inpatient and outpatient latent variables were not consistent with equality among the four inpatient and outpatient variables. For example, the discrimination parameters for the number of visits and periods of continuity were larger than those for visits within 180 days and days to first visit. In addition, the posterior variances of the discrimination parameters were much smaller than the prior variances, indicating that the data largely determined the posterior on λ. Finally, there was no evidence that the discrimination parameters varied over time, implying no need to index λ_{k} by t.
| Table 2. Posterior means and 95% credible intervals for the discrimination parameters, λ_{k}, for the inpatient and outpatient latent variables for the final model given in Section 4.3 |
4.4 Serial dependence
To assess the importance of modeling the dependence of the networks over time, we first fit unstructured correlation matrices for both latent variables, R_{l}, l = 1, 2. The posterior means of the correlation matrices for the inpatient and outpatient latent variables in the unstructured model are given in Table 3.
| Table 3. Posterior mean of unstructured correlation matrices for inpatient (R_{1}) and outpatient (R_{2}) latent variables |
In general, the posterior means of the lag 1 correlations were around 0.5–0.7. The correlation generally decreased as the lag increased. The posterior distributions of the correlations were quite skewed, with the posterior medians and modes for the large positive correlations closer to one than the posterior means. A first-order Markov model seemed most reasonable for R_{1}. Because the lag k correlations did not vary much over time (moving down each off-diagonal of the correlation matrix), we fit a first-order Markov model with ρ_{l,t−1} = ρ_{l}. In this first-order model, the lag 1 correlation was quite high, with a posterior mean (95% credible interval) of 0.71 (0.63, 0.79).
Ultimately, in addition to the unstructured and Markov specifications of the correlation matrix, we also considered independence over time, R_{l} = I. The R = I model was fit to assess the potential efficiency gains over modeling the data separately by year, since this model did not seem plausible given the estimated correlation matrices in Table 3.
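For concreteness, the two structured alternatives can be written out explicitly. The sketch below assumes that a first-order Markov model with a constant lag 1 correlation ρ implies a lag k correlation of ρ^k (the usual stationary first-order form); the exact parameterization is the one given in Section 2.4, and the function name is illustrative. Setting ρ = 0 recovers the independence model R = I.

```python
def markov1_correlation(rho, n_times):
    """Correlation matrix with entry rho ** |s - t| for time points s and t."""
    return [[rho ** abs(s - t) for t in range(n_times)] for s in range(n_times)]


# Posterior mean of the lag 1 correlation reported in the text, over 6 years.
R_markov = markov1_correlation(0.71, 6)
# rho = 0 gives the identity matrix, i.e. independence over time.
R_indep = markov1_correlation(0.0, 6)

for row in R_markov:
    print(" ".join(f"{value:.2f}" for value in row))
```

Under this form a single parameter per latent variable governs all 15 off-diagonal correlations, compared with 15 free correlations in the unstructured 6 × 6 matrix.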
Using the DIC, models that employed serial dependence for the inpatient latent variable fit best (Table 4). The smaller effective number of parameters in the dependence models, which have more parameters in the R_{l} matrices, results from greater shrinkage of the η_{i} in these models. We used the best-fitting model, R_{1} = R_{m}, R_{2} = R, as the basis of inference. An alternative to picking the ‘best’ model would involve model averaging over the various specifications of the correlation matrix. We discuss this in Section 5.
| Table 4. DIC. Markov(1) corresponds to the first-order Markov model discussed in Section 2.4 |
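As a reminder of how the quantities in a DIC comparison are typically obtained from MCMC output, the sketch below uses the standard definitions: the posterior mean deviance D̄, the effective number of parameters p_D = D̄ − D(θ̄), and DIC = D̄ + p_D. The deviance values are made-up toy numbers, not results from this analysis, and the function name is illustrative.

```python
def dic_summary(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, p_D) from sampled deviances and the deviance at the posterior mean."""
    d_bar = sum(deviance_draws) / len(deviance_draws)  # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean           # effective number of parameters
    return d_bar + p_d, p_d


# Toy deviance draws for a hypothetical model (not from the paper).
dic, p_d = dic_summary([100.0, 102.0, 104.0], deviance_at_posterior_mean=98.0)
print(dic, p_d)  # 106.0 4.0
```

Because p_D measures the gap between the average deviance and the deviance at the posterior mean, stronger shrinkage of the random effects can reduce p_D even when the nominal parameter count increases.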
4.5 Longitudinal patterns of quality
displays scatterplots of the posterior means of the inpatient and outpatient latent variables for each network. These variables summarize the inpatient and outpatient quality of the networks over time with higher values corresponding to better care. Network W had consistently high inpatient quality, network N had consistently high outpatient quality, and network U had consistently poor inpatient and outpatient quality (lower left quadrant) across all years.
displays the posterior means of selected inpatient and outpatient latent variables for a subset of the 22 networks. Some networks appear to be ‘significantly’ improving or worsening over the 6-year period. For example, network V appears to provide improving outpatient quality while network P appears to have worsening outpatient quality.
As a comparison, the probabilities in parentheses below are derived from assuming independence over time, the R = I model. There was no strong evidence that any of the networks provided consistently poor care over time using (2.12). Setting q = 0.20 (20th percentile), the highest probabilities of being in the lower 20% for all 6 years were for network U on inpatient quality, with probability 0.49 (0.20 under R = I), and for networks U and L on outpatient quality, with probabilities 0.37 (0.04 under R = I) and 0.18 (0.04 under R = I), respectively. In terms of excellent inpatient quality (2.13), network W was in the upper 20% for all 6 years for the inpatient latent variable with probability 0.89 (0.54 under R = I), while network D had probability 0.51 (0.15 under R = I). For excellent outpatient quality, network N had probability 0.52 (0.14 under R = I). From these results, it is clear that inferences change considerably under the R = I model and that stronger inferences about the behavior of the networks over time are possible by accounting for the temporal correlation.
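Probabilities of this kind can be estimated from the posterior sample as the fraction of draws in which a network falls in the tail in every year. The following is a minimal sketch, assuming that "lower q" means ranking among the lowest ⌈q × (number of networks)⌉ networks within a year; the precise definition is the one in (2.12), and the data layout and function name are hypothetical.

```python
import math


def prob_always_in_lower_tail(draws, network, q=0.20):
    """Fraction of draws in which `network` is in the lowest q share in every year.

    draws[s][i][t] is the latent quality for draw s, network i, year t.
    """
    hits = 0
    for draw in draws:
        n_networks = len(draw)
        n_years = len(draw[0])
        # Number of networks counted as being in the lower tail (assumed convention).
        n_tail = math.ceil(q * n_networks)
        always_low = True
        for t in range(n_years):
            year_values = sorted(row[t] for row in draw)
            threshold = year_values[n_tail - 1]
            if draw[network][t] > threshold:
                always_low = False
                break
        hits += always_low
    return hits / len(draws)


# Toy example: 2 draws, 5 networks, 2 years (hypothetical values).
toy_draws = [
    [[-2, -2], [0, 0], [1, 1], [2, 2], [3, 3]],
    [[-2, 5], [0, 0], [1, 1], [2, 2], [3, 3]],
]
print(prob_always_in_lower_tail(toy_draws, network=0))  # 0.5
```

The upper-tail analogue in (2.13) would follow by reversing the comparison. Because the indicator requires the event to hold in all years jointly, the estimate depends on the temporal correlation, which is why the values under R = I differ so markedly.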
As a less stringent criterion for longitudinal performance, we computed posterior probabilities of a positive (negative) slope for each latent variable for each network as given by (2.14). Networks C, J, and V had posterior probabilities of 0.85 or greater for improving inpatient quality over the 6-year period, while network M had a similar probability associated with worsening inpatient quality. Using the same criterion, networks F, K, L, and V had large probabilities for outpatient quality improvement. Degrading outpatient quality trends were associated with networks C, O, and P.
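The slope criterion can be illustrated in the same way. The sketch below assumes the slope in each posterior draw is the least-squares slope of the latent variable regressed on year; the definition actually used is the one in (2.14). The toy trajectories and function names are hypothetical.

```python
def ols_slope(values):
    """Least-squares slope of values regressed on time points 0, 1, ..., T - 1."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    y_mean = sum(values) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def prob_positive_slope(trajectories):
    """Fraction of posterior draws whose fitted slope over time is positive."""
    return sum(ols_slope(draw) > 0 for draw in trajectories) / len(trajectories)


# Toy draws for one network and one latent variable over 6 years (hypothetical values).
toy = [
    [-0.5, -0.2, 0.0, 0.1, 0.4, 0.6],
    [-0.1, 0.0, -0.2, 0.3, 0.2, 0.5],
    [0.3, 0.2, 0.1, 0.0, -0.1, -0.3],
    [0.0, 0.1, 0.1, 0.2, 0.4, 0.3],
]
print(prob_positive_slope(toy))  # 0.75
```

The probability of a negative slope is obtained by reversing the inequality.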
4.6 Posterior predictive checks
We first examined the correlation among the η_{i} corresponding to each of the eight responses, for each time, t, corr(η_{ikt}, η_{ik′t}), in order to ensure the model characterized the correlation among the network effects. This check resulted in 168 p-values, with only 3 outside the interval (0.05, 0.95) and none outside the interval (0.01, 0.99). We then examined the correlation among the η_{i} over time for each of the eight responses, corr(η_{ikt}, η_{ikt′}), to assess the appropriateness of the temporal correlations. This check resulted in 120 p-values. The fit was not as good as in the previous check, but not unreasonable: 24 of the 120 p-values were outside the interval (0.05, 0.95), but only 5 were outside (0.01, 0.99). There was no consistent pattern in terms of particular correlations not being captured well. In addition, for the extreme p-values, the observed correlation was typically quite close to the observed extreme of the posterior predictive distribution. Finally, the average coverage for the confidence regions described in Section 3.2 was 0.86 for the entire η_{i} vector and 0.92 for the lower dimensional η_{i·k} vectors, which is quite good given the complexity of the data and the ‘parsimonious’ hierarchical model.
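The numbers of p-values reported above are consistent with one check per pair of responses within each year and one check per pair of years within each response. The sketch below verifies that arithmetic under this assumption and shows a generic one-sided posterior predictive p-value; the helper function is illustrative and not the authors' implementation.

```python
from math import comb

n_responses = 8  # eight responses, as stated in the text
n_years = 6      # the 6-year study period

# Assumption: one p-value per pair of responses within each year.
n_cross_response = comb(n_responses, 2) * n_years
# Assumption: one p-value per pair of years within each response.
n_over_time = n_responses * comb(n_years, 2)

print(n_cross_response, n_over_time)  # 168 120


def posterior_predictive_p_value(replicated, observed):
    """Fraction of replicated statistics at least as large as the observed statistic."""
    return sum(rep >= observed for rep in replicated) / len(replicated)


# Share of the temporal-correlation p-values outside (0.05, 0.95), from the text.
print(24 / n_over_time)  # 0.2
```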