Suppose that n patients per group are randomized to placebo or vaccine. Prior to randomization, all patients receive a rabies vaccine, and the immune response to it (W0) is measured. Patients are then randomized to either a placebo or an HIV vaccine injection and, shortly thereafter, the immune response to the HIV vaccine (X0) is measured in the vaccine group. At closeout, the end of the trial, all uninfected placebo recipients receive the HIV vaccine and, shortly thereafter, their immune response to this vaccine is measured (XC). Let Y be the infection indicator and Z be the vaccine indicator. A schematic representation of a vaccine trial augmented with BIV and CPV is given in Figure 2.
Figure 2. Schematic representation of augmented designs. Circles and lowercase letters denote inoculations; immune response is denoted by capital letters. Under a traditional design, patients are vaccinated either with HIV vaccine (h) or placebo (p) and immune response ...
Our approach to using these data is perhaps best described using counterfactual reasoning (Rubin, 1974; Halloran and Struchiner, 1995) and principal stratification (Frangakis and Rubin, 2002). First, let W0i be the baseline rabies-specific adaptive immune response for patient i. This is seen in everyone. The response to HIV vaccination is different. One can write X0i(z) as the (post) baseline HIV-specific immune response to HIV vaccination under treatment z. We call X0i(0), X0i(1) potential covariates; X0i(1) is measured in vaccine recipients while X0i(0) would be 0 in nearly everyone. We say that X0i(1) is realized in the vaccine group and unrealized in the placebo group. Using the terminology of Frangakis and Rubin (2002), X0i(1) = x, X0i(0) = 0 defines a principal stratum indexed by x. Principal strata are a classification of subjects defined by the potential values of a post-treatment variable under each of the treatments being considered. They also call X0(1) a principal surrogate and distinguish it from a “statistical” surrogate, which for our setup would be Xobs = X0(1)Z + X0(0)(1 − Z). We next define Yi(z) as the outcome for person i following treatment z. We call the pair Yi(0), Yi(1) potential outcomes. We also define XCi(z, y) as the closeout HIV-specific adaptive immune response for person i when given treatment z and following outcome y. Only XCi(0, 0), the closeout response of a placebo recipient who remains uninfected, is measured and meaningful.
We make the following simplifying assumptions:
- All patients receive the assigned injections so there is no noncompliance.
- There are no missing data; W0 and Y are measured on everyone, X0 is measured on all vaccinees, and XC is measured on all placebo uninfecteds.
- No infections occur between the time of randomization and when X0 is measured, say the interval [0, m].
The first two assumptions are made for simplicity and can be relaxed. For example, if there is some noncompliance but it is governed by an independent random mechanism, our methods could be applied to just the compliers. With data missing completely at random, the methods can be applied directly to the observed data. If the data are missing at random, methods that incorporate covariates associated with missingness can be used. The last assumption is more likely to be met if m is small. If a few infections occur in [0, m], an analysis that excludes them may be acceptable. We discuss how to modify our approach to incorporate infections during [0, m] in Section 6.
We next specify probit models for the effect of the “baseline covariate” X0(1) on the probability of infection in both groups; we refer to this model as (1). Here Φ( ) is the standard normal c.d.f. (cumulative distribution function). The model is a standard covariate-by-treatment interaction for a clinical trial. The probit is convenient because it is easy to integrate over x, which we will need to do later. Note that (1) assumes that W0 has no effect on Y(z) once X0(1) and Z are in the model. This can also be relaxed, as we discuss in Section 5.
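For orientation, a probit model with a covariate-by-treatment interaction and four coefficients can be sketched as follows. The assignment of β1 to the treatment term and β2 to the covariate term is an assumption made here for illustration; only the role of β3 as the interaction coefficient is fixed by the surrounding text. The shorthand p_z(x) for the infection probability under treatment z is likewise used for convenience.

```latex
\[
p_z(x) \;=\; P\{Y(z) = 1 \mid X_0(1) = x\}
       \;=\; \Phi\!\left(\beta_0 + \beta_1 z + \beta_2 x + \beta_3 x z\right),
\qquad z \in \{0, 1\}.
\]
```

Under this parameterization, the probit-scale risk is β0 + β2 x for placebo recipients (z = 0) and (β0 + β1) + (β2 + β3) x for vaccinees (z = 1).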
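Different causal estimands can be used to quantify the effect of the vaccine as a function of X0(1). For example, following Hudgens and Halloran (2004), we define vaccine efficacy VE(x) as one minus the ratio of the probability of infection under vaccine to that under placebo, among subjects with X0(1) = x. With our probit model, a natural estimand is ΔP(x), the difference between the two treatment groups in the probit-transformed probabilities of infection at X0(1) = x.
Note that when β3 = 0, ΔP(x) is free of x; this is not true for VE(x).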
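The contrast between the two estimands can be checked numerically. The sketch below assumes the probit form Φ(β0 + β1 z + β2 x + β3 x z); that coefficient ordering, the sign convention for ΔP (vaccine minus placebo), the function names, and the coefficient values are all illustrative assumptions, not taken from the paper.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, provides cdf and inv_cdf


def infection_prob(x, z, beta):
    """Probit infection probability; the coefficient ordering is an assumption."""
    b0, b1, b2, b3 = beta
    return _STD_NORMAL.cdf(b0 + b1 * z + b2 * x + b3 * x * z)


def vaccine_efficacy(x, beta):
    """VE(x) = 1 - (risk under vaccine) / (risk under placebo) at X0(1) = x."""
    return 1.0 - infection_prob(x, 1, beta) / infection_prob(x, 0, beta)


def delta_probit(x, beta):
    """Difference of probit-transformed risks (vaccine minus placebo, assumed sign)."""
    p1 = infection_prob(x, 1, beta)
    p0 = infection_prob(x, 0, beta)
    return _STD_NORMAL.inv_cdf(p1) - _STD_NORMAL.inv_cdf(p0)


# Hypothetical coefficients with no interaction (beta3 = 0).
beta_no_interaction = (-1.0, -0.5, -0.3, 0.0)
for x in (0.0, 1.0, 2.0):
    print(x, delta_probit(x, beta_no_interaction), vaccine_efficacy(x, beta_no_interaction))
```

With β3 = 0 the probit-scale difference is the same at every x, while VE(x) still changes with x because the ratio of two normal c.d.f. values depends on where on the curve they are evaluated.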
If X0i(1) were observed in everyone, estimation would be straightforward. As X0i(1) is not observed in the placebo group, we require at least one of the following two assumptions to proceed:
- X0i(1) can be viewed as a baseline covariate or
- For placebo uninfecteds, X0i(1) = xi + U1 and XCi(0, 0) = xi + U2 where U1 and U2 are i.i.d. (independent and identically distributed) mean 0. We call this time constancy of immune response.
The first assumption is true by design in randomized trials and allows us to impute X0i(1) based on W0i in the placebo group. While technically measured post-randomization, this “post-baseline” covariate can be used as a baseline covariate. The second assumption allows us to replace X0i(1) with XCi(0, 0) as a covariate in the probit model for placebo uninfecteds. Under the model X = x + U, one can think of x as the true time-constant immune response, which is observed subject to measurement error, and our interest focuses on the regression of Y on X. This assumption cannot be accepted uncritically: if the trial is long enough, immune response can diminish with age, as it does for herpes zoster. Additionally, volunteers might get subinfectious exposures to a virus that modify immune response. This is thought possible for HIV, where commercial sex workers showed immune responses to HIV but remained uninfected. However, even here the assumption might hold if the immune response is effectively primed by subinfectious exposure pre-baseline and this response is maintained during the course of the trial. Additionally, this assumption can be examined, as we will discuss in Section 5.
Our final assumption allows us to easily integrate over the distribution of X0(1)|W0:
- The distribution of X0(1), W0 is bivariate normal with means μx, μw, variances σx², σw², and correlation ρ.
This assumption can also be relaxed but the integration would be more complicated.
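Why bivariate normality makes the integration easy can be seen from two standard results, sketched here. The symbols σ_x, σ_w, and ρ (standard deviations and correlation), μ*(w), σ*², and the generic constants a and b are notation assumed for this illustration.

```latex
\[
\begin{aligned}
X_0(1) \mid W_0 = w \;&\sim\; N\!\left(\mu^*(w),\, \sigma^{*2}\right),
\qquad
\mu^*(w) = \mu_x + \rho\,\frac{\sigma_x}{\sigma_w}\,(w - \mu_w),
\qquad
\sigma^{*2} = \sigma_x^2\,(1 - \rho^2), \\[4pt]
E\{\Phi(a + bX)\} \;&=\; \Phi\!\left(\frac{a + b\mu}{\sqrt{1 + b^2\sigma^2}}\right)
\qquad \text{for } X \sim N(\mu, \sigma^2).
\end{aligned}
\]
```

Applying the second identity with μ = μ*(w) and σ² = σ*² turns the integral of a probit over the distribution of X0(1) given W0 = w into a single probit in w, so no numerical integration is needed.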
To estimate β = (β0, β1, β2, β3), we use maximum likelihood. We begin by constructing a likelihood, LBC, incorporating both BIV and CPV. The likelihood contribution for vaccinees is simple: a product over the set of vaccinees of Bernoulli terms with infection probability p1(X0i(1)). For uninfected placebo volunteers we use XCi(0, 0) in lieu of X0i(1), and their contribution is a product over the set of uninfected placebo recipients of the probabilities of remaining uninfected, 1 − p0(XCi(0, 0)). In the placebo infecteds, X0(1) is missing and we need to integrate p0(X0(1)) over the distribution of X0(1) given W0 to obtain their likelihood contribution. Under our last assumption, it follows that X0(1) given W0 = w is normal, with mean and variance given by the usual bivariate normal conditional moments. The (integrated) probability of infection for a person with W0 = w is therefore again a single Φ( ) term, which follows from the result that the expectation of Φ(a + bX), for X normal, is itself a normal c.d.f. evaluated at a rescaled argument. The overall likelihood is thus the product of these three contributions. LBC depends on the moments of X0(1), W0, which are unknown. We advocate estimating these moments using vaccine group data and regarding them as fixed in LBC. Because of this, the standard error estimates obtained from the Fisher information matrix are incorrect, and we suggest using the nonparametric bootstrap to obtain standard errors.
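The three contributions to the combined BIV and CPV likelihood can be sketched in code. This is a minimal illustration, not the authors' implementation: it assumes the probit form Φ(β0 + β1 z + β2 x + β3 x z) (the coefficient ordering is an assumption), a hypothetical data layout, and moments (μx, μw, σx, σw, ρ) supplied as fixed plug-in values.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal c.d.f.


def p_infect(x, z, beta):
    """Probit infection probability; the coefficient ordering is an assumption."""
    b0, b1, b2, b3 = beta
    return PHI(b0 + b1 * z + b2 * x + b3 * x * z)


def p_placebo_given_w(w, beta, moments):
    """Placebo infection probability integrated over X0(1) given W0 = w."""
    mu_x, mu_w, sd_x, sd_w, rho = moments
    b0, _, b2, _ = beta
    cond_mean = mu_x + rho * sd_x / sd_w * (w - mu_w)
    cond_var = sd_x ** 2 * (1.0 - rho ** 2)
    # Closed form for E{Phi(b0 + b2 * X)} when X is normal.
    return PHI((b0 + b2 * cond_mean) / math.sqrt(1.0 + b2 ** 2 * cond_var))


def neg_log_lik_bc(beta, vaccinees, placebo_uninfected, placebo_infected, moments):
    """Negative log-likelihood with both BIV and CPV (hypothetical data layout).

    vaccinees: list of (x0, y) pairs; placebo_uninfected: list of closeout
    responses xc; placebo_infected: list of baseline responses w0.
    """
    log_lik = 0.0
    for x0, y in vaccinees:  # X0(1) observed, Bernoulli term
        p = p_infect(x0, 1, beta)
        log_lik += math.log(p) if y == 1 else math.log1p(-p)
    for xc in placebo_uninfected:  # XC used in lieu of X0(1)
        log_lik += math.log1p(-p_infect(xc, 0, beta))
    for w0 in placebo_infected:  # X0(1) missing, integrate given W0
        log_lik += math.log(p_placebo_given_w(w0, beta, moments))
    return -log_lik


# Toy call with made-up numbers, only to show the interface.
toy_beta = (-1.0, -0.5, -0.3, 0.1)
toy_moments = (1.0, 0.0, 1.0, 1.0, 0.5)
print(neg_log_lik_bc(toy_beta, [(0.8, 0), (1.5, 1)], [0.9, 1.2], [0.3], toy_moments))
```

In practice this function would be passed to a numerical optimizer over β, with the moments held fixed at their vaccine-group estimates.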
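We can also construct likelihoods based on augmenting the usual design with BIV alone or CPV alone, denoted LB and LC, respectively. With BIV alone, the placebo contribution is a product over the set of all placebo recipients, infected or not, of Bernoulli terms built from the infection probability integrated over the distribution of X0(1) given W0. With CPV alone, uninfected placebo recipients contribute as before through XCi(0, 0), and the placebo infecteds contribute a single Φ( ) term raised to the number of placebo infecteds. This last Φ( ) in LC is just the probability that a generic placebo patient is infected and equals E{p0(X0(1))}, where X0(1) is normal (μx, σx²). Given the estimated β’s, it is a simple matter to plug them into a causal estimand. Standard errors and confidence intervals for causal estimands can be computed from the bootstrap.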
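A nonparametric bootstrap for a standard error can be sketched as follows. Resampling patients with replacement within each arm is one reasonable choice and is an assumption here, as are the function names and the record layout. To keep the example runnable without an optimizer, a crude vaccine efficacy estimate stands in for the maximum likelihood estimator; it is not the estimator described in the text.

```python
import random
import statistics


def bootstrap_se(vaccine_arm, placebo_arm, estimator, n_boot=500, seed=0):
    """Bootstrap standard error of estimator(vaccine_arm, placebo_arm).

    Patients are resampled with replacement within each arm (an assumption).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        v_star = rng.choices(vaccine_arm, k=len(vaccine_arm))
        p_star = rng.choices(placebo_arm, k=len(placebo_arm))
        estimates.append(estimator(v_star, p_star))
    return statistics.stdev(estimates)


def crude_ve(vaccine_arm, placebo_arm):
    """Stand-in estimator (not the MLE): 1 - ratio of observed infection rates."""
    rate_v = sum(rec["y"] for rec in vaccine_arm) / len(vaccine_arm)
    rate_p = sum(rec["y"] for rec in placebo_arm) / len(placebo_arm)
    return 1.0 - rate_v / rate_p


# Made-up data: 200 patients per arm, 20 vaccine and 40 placebo infections.
vaccine_arm = [{"y": 1}] * 20 + [{"y": 0}] * 180
placebo_arm = [{"y": 1}] * 40 + [{"y": 0}] * 160
se = bootstrap_se(vaccine_arm, placebo_arm, crude_ve)
print(crude_ve(vaccine_arm, placebo_arm), se)
```

When the estimator is the full procedure, including estimation of the moments of X0(1), W0 from the resampled vaccine group, each replicate repeats the plug-in step, so the bootstrap spread reflects the uncertainty that the Fisher information ignores.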