We fit the joint latent class models to data from the epidemiological study that examines the effect of environmental PCB concentrations on the risk of endometriosis. Details of the study have been previously reported in Buck Louis *and others* (2005) and have been summarized in Section 1. The primary interest of the data analysis is in examining the association between complex patterns of PCB concentrations and the risk of endometriosis. Using the methodology developed in the previous sections, we fit a series of models with different numbers of latent classes. As discussed in Section 2, there is evidence to suggest using $h(\mu_{ij}(L_i, \mathbf{b}_j, \mathbf{Z}_{ij}), \mathbf{T}_{ij}, \zeta) = \mu_{ij}(L_i, \mathbf{b}_j, \mathbf{Z}_{ij})$ and $g^2(\mu_{ij}(L_i, \mathbf{b}_j, \mathbf{Z}_{ij}), \xi) = \xi^2 \mu_{ij}^2(L_i, \mathbf{b}_j, \mathbf{Z}_{ij})$ when fitting the models. Two dummy variables $I_{j,e}$ and $I_{j,a}$ are created to represent estrogenic PCBs ($I_{j,e} = 1$ if a PCB is estrogenic; $I_{j,e} = 0$ otherwise) and antiestrogenic PCBs ($I_{j,a} = 1$ if a PCB is antiestrogenic; $I_{j,a} = 0$ otherwise), respectively. Thus, the PCB mean structure is given by (4.1).

The MCEM algorithm was implemented as described in Section 3, and standard errors (SEs) were estimated using the nonparametric bootstrap method with 500 bootstrap samples (Efron and Tibshirani, 1993).
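The bootstrap step can be illustrated with a minimal sketch. It assumes participants are the resampling unit and uses a generic `estimator` callable as a stand-in for refitting the joint latent class model by MCEM; the function name `bootstrap_se`, the toy data, and the use of the sample mean are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np

def bootstrap_se(data, estimator, n_boot=500, seed=0):
    """Nonparametric bootstrap SE: resample rows with replacement, refit, take the SD."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # row indices drawn with replacement
        estimates.append(estimator(data[idx]))
    # Standard deviation across bootstrap replicates, one value per parameter
    return np.std(np.asarray(estimates), axis=0, ddof=1)

# Toy stand-in for the model fit: the sample mean of simulated data.
rng = np.random.default_rng(1)
toy = rng.normal(loc=0.0, scale=2.0, size=400)
se_hat = bootstrap_se(toy, np.mean)
```

In a real analysis each call to `estimator` would rerun the full model fit on the resampled participants, so 500 replicates means 500 refits.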

Table 1 shows parameter estimates and SEs of 4 random effects latent class models, with the number of latent classes being 2, 3, 4, and 5, respectively. The table also reports Bayesian information criterion (BIC) type $\mathrm{IC}_Q$ values (see Ibrahim *and others*, 2008 and the supplementary material available at *Biostatistics* online) for the 4 candidate models. The model selection criterion $\mathrm{IC}_Q$ chose the 3-class model as the best among the competing ones. This model divides participants into 3 disease classes according to their risks with respect to endometriosis: 71.40% of the participants belonged to the low-risk group, 25.27% to the middle-risk group, and 3.33% to the high-risk group. The endometriosis prevalences in the 3 groups were estimated as 30.42%, 53.16%, and 74.66%, respectively. Since $\beta_1$ was restricted to be positive, we performed a one-sided Wald test of $H_0\colon \beta_1 = 0$ versus $H_1\colon \beta_1 > 0$ and obtained a $p$ value of 0.0186. Tests of whether $\alpha_1 \neq 0$, $\alpha_1 + \alpha_{1,e} \neq 0$, or $\alpha_1 + \alpha_{1,a} \neq 0$ examine whether the mean biomarker values vary with latent risk class in the other, estrogenic, and antiestrogenic PCBs, respectively; these 3 tests were all highly significant (all $p$ values less than 0.0001). These tests indicate that the PCB concentration pattern is associated with the risk of endometriosis, and that the overall means of PCBs are significantly different between classes. The parameter estimates for the zero-concentration component, including that of $\eta_1$, provide information on the probabilities of observing zero PCB concentration levels. The positive $\eta_1$ estimate indicates that, for a given PCB and participant, the probability of observing a positive PCB level increases with the mean. We found a statistically significant relationship between the pattern of PCB concentrations and the risk of endometriosis using the proposed model. We now estimate features of the pattern in each of the risk groups.
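The one-sided Wald test described above can be sketched as follows, assuming the Wald statistic (estimate divided by its SE) is referred to a standard normal distribution. The function name and the numeric inputs are hypothetical; they are not the estimates reported in Table 1.

```python
import math

def one_sided_wald_p(estimate, se):
    """P value for H0: parameter = 0 versus H1: parameter > 0 (normal reference)."""
    z = estimate / se
    # Upper-tail standard normal probability: 1 - Phi(z) = 0.5 * erfc(z / sqrt(2))
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical inputs for illustration only; here z = 2.
p = one_sided_wald_p(estimate=1.0, se=0.5)
```

Under this normal reference, a one-sided $p$ value of 0.0186 would correspond to a Wald statistic of roughly 2.08.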

| **Table 1.** Analysis of the endometriosis and PCB concentrations data: parameter estimates (SEs) and values of the model selection criterion $\mathrm{IC}_Q$ from joint latent class models with different numbers of latent classes^{†} |
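The selection among candidate models by a BIC-type criterion can be illustrated with a minimal sketch. The exact definition of $\mathrm{IC}_Q$ is given in Ibrahim *and others* (2008); the form used here, $-2Q + d \log n$ with $Q$ the value of the EM Q-function at the estimate, $d$ the number of parameters, and $n$ the sample size, is an assumption made for illustration, as are the function names. The numbers are placeholders, not the values in Table 1.

```python
import math

def bic_type_criterion(q_value, n_params, n_obs):
    """BIC-type criterion: -2 * Q + (number of parameters) * log(sample size)."""
    return -2.0 * q_value + n_params * math.log(n_obs)

def select_model(candidates, n_obs):
    """Return the label with the smallest criterion value, plus all values."""
    scores = {
        label: bic_type_criterion(q, d, n_obs)
        for label, (q, d) in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores

# Placeholder (Q value, parameter count) pairs keyed by number of classes.
# These are NOT the values reported in Table 1.
candidates = {2: (-1050.0, 12), 3: (-1000.0, 16), 4: (-995.0, 20), 5: (-992.0, 24)}
best, scores = select_model(candidates, n_obs=100)
```

With these placeholder inputs the fit improves only slightly beyond 3 classes, so the penalty term dominates and the 3-class candidate has the smallest criterion value.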

To investigate the features of the individual PCB congener concentrations, we computed empirical Bayes estimators of the random effects for each congener, $j = 1, 2, \ldots, 62$. The nonzero concentration levels for the $j$th PCB congener in the $k$th latent class can be estimated by replacing random effects with their empirical Bayes estimators and replacing unknown parameters with their estimates in (4.1). The estimated rate of observing zero concentration levels for the $j$th PCB can be similarly derived. In the selected 3-class model, the estimated nonzero concentration levels of each PCB congener show a positive relationship between mean PCB concentration and risk class for all PCBs (i.e. the mean concentration of every PCB is highest for high-risk participants and lowest for low-risk participants). This suggests that there is a combined contribution of all 62 PCB concentrations in assessing the risk of endometriosis. The differences in the PCB patterns from the low risk to the high risk are largest for estrogenic PCBs among the 3 PCB groups. Moreover, we see a reverse pattern for the probability of obtaining a zero PCB concentration. The rate of zeros varied across PCB congeners and across classes (range 0–0.7).

The inferences for $\beta_1$, $\alpha_1$, $\alpha_{1,e}$, and $\alpha_{1,a}$ were similar across the different models (2- through 5-class models), suggesting that even though the data are best described by the 3-class model, important inferences are robust to misspecification of the number of latent classes. Further, empirical Bayes estimators of concentration patterns in each of the risk groups for the 2-, 4-, and 5-class models showed results similar to those presented for the 3-class model. Namely, concentration levels increased with increasing risk group for all PCBs, and the increases were largest for estrogenic markers as compared to other markers (data not shown).

The joint latent class model provided strong evidence for an association between high-dimensional PCB exposure and the risk of endometriosis. The primary goal of the proposed latent class model is to provide a way to associate high-dimensional exposure data with the risk of endometriosis. The fact that the nature of the association ($\beta_1$ and the parameters that characterize PCB biomarker exposures) is similar across the models with different numbers of latent classes provides reassurance about the robustness of important inferences to the number of latent classes. That said, the evidence for 3 latent classes of risk does suggest that there may be 3 types of exposure patterns that are associated with the risk of endometriosis. The fact that overall patterns are higher in the high-risk group and lower in the other groups suggests that what distinguishes the different groups is small differences in a majority of PCBs.