Weighted estimation of the parameters of marginal structural models requires fitting several models: 1) the structural (i.e., weighted) model, 2) the exposure model, and 3) the censoring model. For simplicity and because this paper focuses on constructing weights to estimate the parameters of any marginal structural model through weighted regression, we will assume throughout that the structural model is correctly specified. In practice, investigators will want to explore the sensitivity of their estimates to different structural model specifications (e.g., linear vs. threshold dose-response, long- vs. short-term effects, and so on).

To construct appropriate weights, investigators need to correctly specify the models for exposure and censoring. Here, we will discuss only modeling of the exposure distribution, but our comments apply equally to modeling the censoring distribution. As stated above, a necessary condition for correct model specification is that the stabilized weights have a mean of one (2). In , we provide a step-by-step example of building weights for the marginal structural model detailed previously (19) and described above. Although the step-by-step process is a simplified representation of the actual process, we hope that sharing the general approach may guide future implementations of marginal structural models.
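The weight diagnostics reported for each specification below (mean, standard deviation, 1/minimum, and maximum) can be computed with a few lines of code. The following is a minimal sketch, not the code used for the analysis; the function name `summarize_weights` and the example weights are hypothetical.

```python
from statistics import mean, stdev


def summarize_weights(weights):
    """Return the mean, standard deviation, 1/minimum, and maximum of the weights."""
    weights = list(weights)
    if len(weights) < 2 or min(weights) <= 0:
        raise ValueError("need at least two strictly positive weights")
    return {
        "mean": mean(weights),        # should be close to one for stabilized weights
        "sd": stdev(weights),         # sample standard deviation
        "inv_min": 1.0 / min(weights),
        "max": max(weights),
    }


# Hypothetical stabilized weights, one per person-visit.
example_weights = [0.5, 0.8, 1.0, 1.2, 1.5]
summary = summarize_weights(example_weights)
```

A mean far from one suggests misspecification of the weight models, whereas a mean near one is necessary but not sufficient for correct specification.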

In specification 1, the model to estimate the denominator of the weights was a pooled logistic model for the probability of exposure initiation at each visit. Specifically, each person-visit was treated as an observation, and the model was fit on those person-visits for which no exposure had occurred through the prior visit. The covariates were linear terms for follow-up time, baseline CD4 cell count and viral load, and time-varying CD4 cell count and viral load measured at the prior visit. This model, which is a parametric discrete-time approximation of the Cox proportional hazards model for exposure initiation (35, 36), assumes that the relation between the baseline covariates (and follow-up time) and the probability of exposure initiation is linear on the logit scale. The model to estimate the numerator of the weights was also a pooled logistic model for the probability of exposure initiation, except that time-varying CD4 cell count and viral load were not included as covariates. The mean of the estimated weights was 1.07 (standard deviation, 1.47), the 1/minimum and maximum estimated weights were 33.3 and 26.4, and the effect estimate was −1.94 (standard error, 0.17).
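The way the numerator and denominator models combine into a weight for each person-visit can be sketched as follows. The sketch assumes that both pooled logistic models have already been fit and that their predicted probabilities of initiation are available for every person-visit. It also assumes that visits after initiation contribute a factor of one, because the models are fit only on person-visits without prior exposure. The function name `stabilized_weights`, the record layout, and the example probabilities are hypothetical.

```python
def stabilized_weights(records):
    """Cumulative stabilized weights by person-visit.

    records: iterable of (person_id, visit, exposed, p_num, p_den), sorted by
    person and then visit. `exposed` is 1 from the visit of initiation onward.
    p_num and p_den are the predicted probabilities of initiation from the
    numerator and denominator models (assumed to be supplied by the caller).
    """
    running = {}      # current cumulative weight per person
    started = set()   # persons who have already initiated exposure
    weights = {}
    for pid, visit, exposed, p_num, p_den in records:
        w = running.get(pid, 1.0)
        if pid not in started:
            # Probability of the exposure actually observed at this visit.
            num = p_num if exposed == 1 else 1.0 - p_num
            den = p_den if exposed == 1 else 1.0 - p_den
            w *= num / den
            if exposed == 1:
                started.add(pid)
        # After initiation the contribution is one, so w is carried forward.
        running[pid] = w
        weights[(pid, visit)] = w
    return weights


# Hypothetical person who initiates exposure at visit 2.
example_records = [
    (1, 1, 0, 0.2, 0.1),
    (1, 2, 1, 0.2, 0.4),
    (1, 3, 1, 0.2, 0.3),
]
example_sw = stabilized_weights(example_records)
```

Keeping the product cumulative over visits means that a single poorly predicted visit affects all later weights for that person, which is one reason the maximum weight is worth monitoring.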

In specification 2, we replaced the linear terms for baseline and time-varying CD4 cell count and viral load with categories (i.e., CD4: <200, 200–500, >500 cells/mm^{3}; and viral load detectable (at 400 copies/ml) or not) to illustrate the impact of potential residual confounding within categories of the confounders. The estimated weights appear better behaved than in specification 1 (e.g., the mean moves from 1.07 to 1.05, and the 1/minimum and maximum are notably smaller), and the standard error for the difference in log_{10} viral load is a striking 39 percent (1 − 0.104/0.170 = 0.388) smaller. However, the effect estimate of −1.66 moved closer to the unadjusted value of −1.56 (i.e., one category, ).
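The categorical coding in specification 2 might be implemented along the following lines. This is an illustrative sketch only: the handling of values exactly at 200 and 500 cells/mm^{3}, the choice of >500 as the reference category, and the treatment of a viral load of exactly 400 copies/ml as detectable are assumptions not stated in the text, and the function names are hypothetical.

```python
def cd4_indicators(cd4):
    """Indicator variables for CD4 category; >500 cells/mm^3 is the reference."""
    return {
        "cd4_lt200": 1 if cd4 < 200 else 0,
        "cd4_200_500": 1 if 200 <= cd4 <= 500 else 0,  # boundary handling assumed
    }


def viral_load_detectable(copies_per_ml, limit=400):
    """1 if viral load is at or above the detection limit (assumed), else 0."""
    return 1 if copies_per_ml >= limit else 0


# Two hypothetical values in the same category receive identical coding,
# which is the source of residual confounding within categories.
low_in_band = cd4_indicators(210)
high_in_band = cd4_indicators(490)
```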

In specification 3, the numerator and denominator are as in specification 1, but we add three-knot restricted cubic splines to all linear terms. Other smoothing techniques could be used (37). This flexible parameterization of the time-varying confounders is generally preferred, because it liberates one from much of the residual confounding or finite-sample bias inherent in categorical variables (e.g., specification 2) and reduces the potential bias due to model misspecification from strong linearity assumptions (e.g., specification 1). Compared with specification 1, the estimated weights and effect estimate are similar, but the standard error is reduced by 18 percent (1 − 0.139/0.170 = 0.182).
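A three-knot restricted cubic spline adds one nonlinear term to each linear term. The sketch below uses the standard truncated-power construction, scaled by the squared distance between the outer knots. The knot locations shown are hypothetical, since the text does not state where the knots were placed, and the function name `rcs_three_knots` is likewise an assumption.

```python
def rcs_three_knots(x, knots):
    """Return (linear term, nonlinear term) of a three-knot restricted cubic spline."""
    t1, t2, t3 = knots
    if not t1 < t2 < t3:
        raise ValueError("knots must be strictly increasing")

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise zero.
        return u ** 3 if u > 0 else 0.0

    nonlinear = (
        cube_plus(x - t1)
        - cube_plus(x - t2) * (t3 - t1) / (t3 - t2)
        + cube_plus(x - t3) * (t2 - t1) / (t3 - t2)
    )
    # Scaling keeps the nonlinear term on roughly the same scale as x.
    return x, nonlinear / (t3 - t1) ** 2


# Hypothetical knots for CD4 cell count (cells/mm^3).
example_knots = (200.0, 350.0, 500.0)
tail_values = [rcs_three_knots(x, example_knots)[1] for x in (600.0, 700.0, 800.0)]
```

The restriction forces the fitted curve to be linear below the first knot and above the last knot, which limits erratic behavior where data are sparse.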

In specification 4, we added a product term between time-varying CD4 count and follow-up time suggested by clinical colleagues, which had *p* = 0.03. Compared with specification 3, there is little change in the estimated weights (although the maximum weight increases), and the effect estimate remains unaltered, but its standard error is reduced by 5 percent. This is essentially the model specification used previously (19); however, the (conservative) robust standard error reported (19) was 0.135, while the bootstrap standard error reported here is 0.132.
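The percentage reductions in standard error quoted for specifications 2 through 4 all follow the same arithmetic, which the short sketch below reproduces from the standard errors given in the text (0.170, 0.104, 0.139, and 0.132). The function name `relative_se_reduction` is hypothetical.

```python
def relative_se_reduction(se_reference, se_new):
    """Proportional reduction in standard error relative to a reference specification."""
    if se_reference <= 0:
        raise ValueError("reference standard error must be positive")
    return 1.0 - se_new / se_reference


spec2_vs_spec1 = relative_se_reduction(0.170, 0.104)
spec3_vs_spec1 = relative_se_reduction(0.170, 0.139)
spec4_vs_spec3 = relative_se_reduction(0.139, 0.132)
```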

In specification 5, we explored more fully detailed covariate histories, using time-varying CD4 count and viral load measured two visits prior to the visit at-risk for HAART initiation in addition to values measured one visit prior. Beyond an increase in the maximum weight, no notable changes are apparent.
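Covariate values from one and two visits prior can be derived by shifting each person's visit-ordered measurements. The following is a minimal sketch with a hypothetical function name and hypothetical CD4 values. It leaves early visits without a prior measurement as `None`; how such visits were handled in the analysis is not stated in the text.

```python
def lagged(values, lag):
    """Shift a visit-ordered list by `lag` visits, padding early visits with None."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = len(values)
    return [None] * min(lag, n) + list(values[: max(n - lag, 0)])


# Hypothetical CD4 cell counts for one person at four consecutive visits.
cd4_by_visit = [520, 480, 410, 390]
cd4_lag1 = lagged(cd4_by_visit, 1)
cd4_lag2 = lagged(cd4_by_visit, 2)
```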

In specification 6, we explored the addition of two more possible time-varying confounders, namely, clinical AIDS status and HIV-related symptoms (i.e., reports of persistent fever, diarrhea, night sweats, or weight loss) at the visit prior to the visit at-risk for HAART initiation. Again, no notable changes are apparent.