


Biometrics. Author manuscript; available in PMC 2009 December 17.

Published in final edited form as:

PMCID: PMC2794923

NIHMSID: NIHMS93953

The publisher's final edited version of this article is available at Biometrics


Evaluation of HIV vaccine candidates in non-human primates (NHPs) is a critical step toward developing a successful vaccine to control the HIV pandemic. Historically, HIV vaccine regimens have been tested in NHPs by administering a single high dose of the challenge virus. More recently, evaluation of candidate HIV vaccines has entailed repeated low-dose challenges which more closely mimic typical exposure in natural transmission settings. In this paper, we consider evaluation of the type and magnitude of vaccine efficacy from such experiments. Based on the principal stratification framework, we also address evaluation of potential immunological surrogate endpoints for infection.

As of 2007, approximately 33.2 million people were living infected with HIV, with over 2.1 million people dying of AIDS in that year (UNAIDS 2007). While great strides have been made in developing effective antiretroviral therapy for treatment of HIV infected individuals, a preventive vaccine remains the greatest hope in curbing the HIV pandemic. HIV vaccine research begins with in vitro and animal studies. A critical component of this pre-clinical development entails evaluation of candidate vaccines in non-human primates (NHPs) such as macaques. Historically, HIV vaccine regimens have been tested in NHPs by administering a single high dose of the challenge virus. More recently, evaluation of candidate HIV vaccines has entailed repeated low-dose (RLD) challenges which more closely mimic typical exposure to HIV in natural transmission settings (Regoes et al. 2005; Subbarao et al. 2006; Ellenberger et al. 2006). A primary objective of these studies is to assess vaccine efficacy for prevention of infection. A secondary objective is to determine immune biomarkers which are surrogate endpoints for infection, which we refer to as “surrogates of protection.”

Since the RLD challenge study design has only recently been implemented in evaluation of candidate HIV vaccines, the corresponding statistical literature is rather limited. One exception is Regoes et al. (2005), who show that for clinically feasible sample sizes, RLD challenge studies in NHPs can be adequately powered to test for vaccine efficacy to prevent infection. This rather surprising result is due to the exquisitely precise nature of the exposure and infection history in challenge studies. In contrast, studies of HIV in humans typically provide only vague information on the number of exposures prior to infection. Consequently, RLD challenge studies with small sample sizes can have the same power as large phase III clinical trials to test for vaccine efficacy.

Beyond testing for a vaccine effect, it is not clear what additional information can be inferred from RLD challenge studies. For example, can these studies inform about the type or magnitude of a vaccine’s protective effect? Accurately characterizing the mechanism of protection would provide important information for the design and analysis of future efficacy trials and for population models on the impact of a licensed vaccine. Additionally, is it possible to evaluate potential immune surrogates of protection in RLD challenge studies? This paper seeks to answer these questions. In Section 2 we consider evaluating the type and magnitude of vaccine efficacy from RLD challenge experiments; an illustrative example is given using recently published results from a challenge study of a candidate HIV vaccine. In Section 3 we describe a causal inference approach to assessing potential immunological surrogates of protection in this setting. We conclude with a discussion in Section 4. Similar to Regoes et al. (2005), we find that, despite relatively small sample sizes, RLD challenge studies can provide accurate and precise information about vaccine efficacy and immune surrogate markers. While this work is motivated by the development of an HIV vaccine, the proposed methods can easily be applied to RLD challenge studies of other vaccines and other preventive interventions.

In this section we consider evaluating the type and magnitude of vaccine efficacy for susceptibility (VE* _{S}*), i.e., a vaccine’s ability to protect against infection. Our approach entails applying maximum likelihood methods to a discrete time survival model which allows for possible heterogeneous vaccine effects (Halloran et al. 1992; Longini and Halloran 1996).

Let *p* denote the probability of transmission to a susceptible, unvaccinated individual given a single exposure (i.e., challenge). Assuming the probability of infection is independent of the number of prior challenges, the probability of escaping infection from *t* challenges for an unvaccinated individual is (1 − *p*)* ^{t}*. On the other hand, the probability an unvaccinated individual becomes infected on the

Since the probability a vaccinated individual becomes infected from a single challenge is (1 − *θ*)*φp*, define the vaccine efficacy as

$${\text{VE}}_{S}\equiv 1-\frac{(1-\theta )\phi p}{p}=1-(1-\theta )\phi ,$$

(1)

i.e., the relative reduction in the per contact transmission probability if vaccinated compared to if not vaccinated. This measure of vaccine efficacy has been referred to as the *per contact* or *biological* efficacy (Halloran et al. 1999). If *θ* > 0 and *φ* = 1, then each vaccinated individual is either completely protected or not at all protected. In this case the vaccine is said to have an “all-or-none” effect with VE* _{S}* =

Maximum likelihood methods (see Web Appendix A) can be employed for inference regarding (*p*, *θ*, *φ*) based on data from an NHP challenge study such as described below in Section 2.2. For such a study, we consider point and interval estimation of VE* _{S}*, as well as model selection from among the four mechanism of protection models described above. While it is difficult to discern a vaccine’s protective mechanism from a large human vaccine efficacy trial (Farrington 1998; Gilbert 2001), it is more feasible in an RLD challenge study, due to the far greater information on exposure and transmission.

Ellenberger et al. (2006) employed an RLD challenge study in macaques to assess the efficacy of a candidate HIV vaccine. The vaccine was given to 16 macaques and 14 additional macaques served as controls, i.e., did not receive the vaccine. All animals were then repeatedly exposed weekly to a hybrid simian-human immunodeficiency virus (SHIV) with a different HIV sequence than the HIV sequence represented in the vaccine. Evidence of systemic infection was assessed after each exposure. Infection was defined as having detectable cell-free virus and provirus in peripheral blood mononuclear cells. It is assumed that the cell-free and cell-associated diagnostic tests used for infection diagnosis were sufficiently accurate and the weekly time intervals between challenges were far enough apart such that determination of the infecting exposure was made without error. Four monkeys in the vaccine arm and one in the control arm were administratively right censored after escaping infection from multiple exposures.

Data from this experiment are given in Web Table 1 and the corresponding maximum likelihood results are given in Table 1. Based on a likelihood ratio test (LRT) comparing the leaky and null models, there is evidence of a significant leaky vaccine effect (p-value=0.003). A one-sided Fisher’s exact test (Regoes et al. 2005) gives a similar result (p-value=0.006). Comparison of the log likelihood values for the leaky and all-or-none models suggests the leaky model demonstrates superior fit. Similarly, the LRT comparing the mixed and leaky models is not significant (p-value=0.2). The Akaike Information Criterion (AIC) also suggests the leaky model provides the most parsimonious model that adequately fits the given data.
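The comparison of the leaky and null models can be sketched numerically. The following is a minimal illustration, not the authors' Web Appendix A code: it assumes a constant per-challenge infection probability in each arm, so that an animal receiving T challenges with infection indicator δ contributes (1 − q)^(T − δ) q^δ to the likelihood, and it uses the unconstrained closed-form MLEs (φ is not restricted to be at most 1). The function names and the toy data are hypothetical and are not the Ellenberger et al. (2006) data.

```python
import math

def geometric_loglik(q, n_infected, n_challenges):
    """Log-likelihood for a constant per-challenge infection probability q,
    given total infections and total challenges administered in a group."""
    n_escaped = n_challenges - n_infected
    ll = 0.0
    if n_infected > 0:
        ll += n_infected * math.log(q)
    if n_escaped > 0:
        ll += n_escaped * math.log(1.0 - q)
    return ll

def fit_leaky_vs_null(control, vaccine):
    """Each arm is a list of (T, delta): T challenges received, delta = 1 if
    infected on challenge T and 0 if censored uninfected.
    Assumes at least one infection in the control arm."""
    def totals(arm):
        return sum(d for _, d in arm), sum(t for t, _ in arm)

    d_c, t_c = totals(control)
    d_v, t_v = totals(vaccine)
    p_hat = d_c / t_c                   # MLE of p (control arm)
    q_hat = d_v / t_v                   # MLE of phi * p (vaccine arm)
    p_null = (d_c + d_v) / (t_c + t_v)  # common probability under the null
    ll_leaky = geometric_loglik(p_hat, d_c, t_c) + geometric_loglik(q_hat, d_v, t_v)
    ll_null = geometric_loglik(p_null, d_c + d_v, t_c + t_v)
    return {
        "p_hat": p_hat,
        "phi_hat": q_hat / p_hat,
        "ve_s_hat": 1.0 - q_hat / p_hat,
        "lrt": 2.0 * (ll_leaky - ll_null),
        "aic_leaky": 2 * 2 - 2.0 * ll_leaky,  # two parameters: p, phi
        "aic_null": 2 * 1 - 2.0 * ll_null,    # one parameter: p
    }

# Hypothetical toy data for illustration only.
toy_control = [(1, 1), (2, 1), (3, 1), (4, 1)]
toy_vaccine = [(5, 1), (10, 1), (10, 0), (15, 0)]
result = fit_leaky_vs_null(toy_control, toy_vaccine)
```

Because the leaky model nests the null model, the likelihood ratio statistic is never negative, and the model with the smaller AIC is preferred.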

Table 1. Maximum likelihood results using data from Ellenberger et al. (2006).

The Kaplan-Meier estimates of the survival function for each arm of the study are given in the left panel of Figure 1. The agreement between these nonparametric estimates and the corresponding estimates from the leaky vaccine model evaluated at the MLEs *φ̂* = 0.36 and *p̂* = 0.20 suggests good fit of this model to the data. The nonparametric estimates of the complementary log-log survival curves in the right panel of Figure 1 are roughly parallel, also supporting a leaky vaccine effect (Halloran et al. 1999). To evaluate the mechanism of protection further, simulated data sets were generated from the four different models under consideration (evaluated at the MLEs from Table 1) to provide a basis of comparison. Figure 2 shows the difference in the nonparametric estimates of the complementary log-log survival curves from the observed data and 25 simulated data sets from the four models. These plots also suggest the leaky model provides a better fit than the all-or-none or null models.
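Why parallel complementary log-log curves point to a leaky effect can be checked with a short calculation. The sketch below assumes the escape probability after t challenges is (1 − p)^t for controls and θ + (1 − θ)(1 − φp)^t for vaccinees, and evaluates the gap between the two complementary log-log curves using p = 0.20 and φ = 0.36, the leaky-model values reported in this section. The function names and the all-or-none comparison value θ = 0.64 are illustrative assumptions.

```python
import math

def cloglog(s):
    """Complementary log-log transform of a survival probability."""
    return math.log(-math.log(s))

def cloglog_gap(p, theta, phi, t):
    """cloglog of vaccine-arm survival minus cloglog of control-arm survival
    after t challenges."""
    s_control = (1.0 - p) ** t
    s_vaccine = theta + (1.0 - theta) * (1.0 - phi * p) ** t
    return cloglog(s_vaccine) - cloglog(s_control)

challenges = range(1, 11)
# Leaky effect (theta = 0): the gap does not depend on t.
leaky_gaps = [cloglog_gap(0.20, 0.0, 0.36, t) for t in challenges]
# All-or-none effect (phi = 1): the gap changes with t.
all_or_none_gaps = [cloglog_gap(0.20, 0.64, 1.0, t) for t in challenges]

for t, g_leaky, g_aon in zip(challenges, leaky_gaps, all_or_none_gaps):
    print(f"t={t:2d}  leaky gap={g_leaky:.4f}  all-or-none gap={g_aon:.4f}")
```

Under the leaky model the gap equals log{log(1 − φp)/log(1 − p)} at every t, so the curves are parallel, whereas under an all-or-none effect the gap drifts as the number of challenges grows.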

Figure 1. Left panel: Nonparametric (solid line) and parametric (dotted line) estimates of the survival functions based on data from Ellenberger et al. (2006) and fitted leaky vaccine model from Table 1. Right panel: Nonparametric estimates of the complementary **...**

Figure 2. Difference in nonparametric estimates of the complementary log-log survival functions between the vaccine and control arms for the observed data (bold lines) and 25 simulated data sets (gray lines) for each of the four different mechanism of protection **...**

To formally test for goodness-of-fit, the following Kolmogorov-Smirnov type test statistic was computed:

$${T}_{KS}=\max \left\{\sup_{t}\left|{S}_{0}^{c}(t)-{\widehat{S}}^{c}(t)\right|,\ \sup_{t}\left|{S}_{0}^{v}(t)-{\widehat{S}}^{v}(t)\right|\right\}$$

where ${S}_{0}^{c}(t)$ and ${S}_{0}^{v}(t)$ denote the survival curves for controls and vaccinees under the leaky model when *p* = 0.20 and *φ* = 0.36, and *Ŝ ^{c}*(

In total, these results suggest a significant leaky vaccine effect. The MLE of VE* _{S}* is 0.64, which can be interpreted as a 64% reduction in the probability of infection per exposure. The Jewell (1986) (see also Chick et al. 2001) bias corrected estimate of VE

Simulation studies were conducted to investigate the operating characteristics of several of the statistical methods employed in the example above. Unless stated otherwise, data were simulated assuming a transmission probability under control of *p* = 0.2, an equal number of NHPs in the vaccine and control arms, and a maximum of 20 exposures, after which challenges ceased for any NHP still SHIV negative.
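A data-generating step of this kind might look as follows. This is a minimal sketch under stated assumptions, not the simulation code used for the reported results: each vaccinated animal is completely protected with probability θ and otherwise faces infection probability φp at every challenge, controls correspond to θ = 0 and φ = 1, and animals still uninfected after the maximum number of challenges are censored. The function name `simulate_arm` and its arguments are hypothetical.

```python
import random

def simulate_arm(m, p, theta=0.0, phi=1.0, max_challenges=20, rng=None):
    """Simulate m animals; return a list of (T, delta) pairs where T is the
    number of challenges received and delta = 1 if infected on challenge T."""
    rng = rng or random.Random()
    data = []
    for _ in range(m):
        protected = rng.random() < theta     # all-or-none component
        q = 0.0 if protected else phi * p    # leaky component
        t, delta = max_challenges, 0         # default: censored uninfected
        for k in range(1, max_challenges + 1):
            if rng.random() < q:
                t, delta = k, 1
                break
        data.append((t, delta))
    return data

rng = random.Random(2008)
control_arm = simulate_arm(14, p=0.2, rng=rng)            # no vaccine effect
leaky_arm = simulate_arm(16, p=0.2, phi=0.4, rng=rng)     # leaky effect
aon_arm = simulate_arm(16, p=0.2, theta=0.6, rng=rng)     # all-or-none effect
```

Passing an explicit `random.Random` instance keeps the simulated data sets reproducible across runs.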

The first set of simulations assumed a leaky mechanism of protection, i.e., *θ* = 0. Based on 10,000 simulations, the power of the LRT comparing the leaky and null models to detect a departure from the null hypothesis *H*_{0}: *φ* = 1 at the *α* = 0.1 level is given in the top half of Table 2. For purposes of comparison with the power to detect an all-or-none effect (discussed below), here we assumed *φ* ≤ 1, i.e., the vaccine does not increase the probability of infection per exposure in individuals who are not completely protected. Consequently, since *φ* is on the boundary of the parameter space under *H*_{0}, the distribution of the LRT statistic was assumed to be $0.5{\chi}_{0}^{2}+0.5{\chi}_{1}^{2}$, i.e., a 50:50 mixture of chi-squared distributions with 0 and 1 degrees of freedom (Self and Liang 1987). The results of Table 2 suggest the LRT preserves the type I error for sample sizes typical of challenge studies and that the Ellenberger et al. (2006) experiment had over 80% power to detect a leaky VE* _{S}* of 60%. Similar results were found using a Wald test (results not shown). These findings are in agreement with Regoes et al. (2005), who also showed low-dose challenge studies can be adequately powered to detect leaky vaccine effects.
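The p-value implied by this 50:50 mixture is easy to compute. The sketch below relies on two standard facts: a chi-squared distribution with 0 degrees of freedom is a point mass at zero, and the upper tail of a chi-squared distribution with 1 degree of freedom equals erfc(√(x/2)). The function names are illustrative.

```python
import math

def chi2_1_sf(x):
    """Upper tail probability P(X > x) for X ~ chi-squared with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def boundary_lrt_pvalue(lrt_stat):
    """P-value for an LRT statistic whose null distribution is the mixture
    0.5 * chi2_0 + 0.5 * chi2_1 (parameter on the boundary under H0)."""
    if lrt_stat <= 0.0:
        return 1.0  # the point mass at zero contributes the full 0.5 here
    return 0.5 * chi2_1_sf(lrt_stat)

# The mixture halves the usual chi-squared(1) p-value for a positive statistic.
print(boundary_lrt_pvalue(2.0), chi2_1_sf(2.0))
```

One consequence is that a test at level α = 0.1 rejects when the statistic exceeds the 0.80 quantile of the chi-squared distribution with 1 degree of freedom (about 1.64) rather than the 0.90 quantile.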

Table 2. Simulated power × 100% to reject *H*_{0}:VE_{S} = 0. Each table entry is based on applying the likelihood ratio test (LRT) to 10,000 simulated data sets generated assuming *p* = 0.2 transmission probability in the control arm, sample size *m* per arm, and **...**

The second set of simulations assumed an all-or-none mechanism of protection, i.e., *φ* = 1. The simulation results in the lower portion of Table 2 give the power for comparing the all-or-none and null models to detect a departure from the null *H*_{0}:*θ* = 0. Again the distribution of the LRT statistic was assumed to be $0.5{\chi}_{0}^{2}+0.5{\chi}_{1}^{2}$. Comparison of the upper and lower portions of Table 2 indicates there is greater power to detect an all-or-none effect than a leaky effect. Additional simulations (results not shown) indicate the LRT also preserves the type I error rate when comparing the mixture and leaky models.

A simulation study was also employed to assess the ability of the AIC to select the correct mechanism of protection model (i.e., null, all-or-none, leaky, or mixed). The results in Web Table 2 give the probability of selecting the correct model for different values of *p*, *θ*, *φ*, and the maximum number of allowable challenges *C ^{max}*. If the true model is all-or-none or null, the probability of selection is typically adequate, i.e., approximately 0.9. Correct selection can be substantially less likely for leaky or mixture models. For example, if

Finally, we also examined the bias of the MLE of VE* _{S}* under the leaky model. Simulation results in Web Table 3 demonstrate an appreciable negative bias of ${\widehat{\text{VE}}}_{S}$ for smaller sample sizes. The Jewell bias corrected estimator (Jewell 1986; Chick et al. 2001) appears to be preferable in terms of bias; Chick et al. (2001) reached a similar conclusion in the context of vaccine efficacy evaluation in small or intermediate size (e.g., Phase IIb) clinical trials. For each simulated data set, we also computed a profile likelihood 95% CI for VE

To this point we have assumed a homogeneous transmission probability *p*, i.e., that every individual has the same natural susceptibility to infection at each time point. This assumption may be violated due to among-individual variability in host genetics (e.g., HLA type), immunity, and other characteristics. Failing to account for susceptibility heterogeneity may lead to biased estimation of VE* _{S}* and to undercoverage of confidence intervals (Halloran et al. 1992). To relax this homogeneity assumption, we suppose the transmission probabilities vary between NHPs according to a beta distribution. For the control group, the beta distribution of the transmission probability

For each of the mixture, leaky, all-or-none, and null models, the MLE of (*θ*, *φ*, *μ*, *η*), and thus of VE* _{S}*, can be computed using the likelihood given in Web Appendix A. Returning to the RLD challenge study in Section 2.2, maximum likelihood results allowing for heterogeneous transmission probabilities are given in Web Table 4. Comparing these results with Table 1, the AIC still selects a leaky vaccine model with homogeneous transmission probabilities.
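To give a feel for how beta-distributed transmission probabilities enter such a likelihood, the sketch below computes the two building blocks for a control animal. It assumes the standard shape parameterization Beta(a, b), which need not match the (*μ*, *η*) parameterization used in Web Appendix A, and it relies on the identities E[(1 − P)^t] = B(a, b + t)/B(a, b) and E[P(1 − P)^(t − 1)] = B(a + 1, b + t − 1)/B(a, b). The function names are hypothetical.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def escape_prob_beta(a, b, t):
    """Probability of escaping infection from t challenges when the
    per-challenge transmission probability P is Beta(a, b) distributed."""
    return math.exp(log_beta(a, b + t) - log_beta(a, b))

def infection_prob_at(a, b, t):
    """Probability that the first infection occurs exactly on challenge t."""
    return math.exp(log_beta(a + 1.0, b + t - 1.0) - log_beta(a, b))

# Mean transmission probability a / (a + b) = 0.2 in both settings below;
# heterogeneity raises the chance of escaping many challenges.
heterogeneous = escape_prob_beta(2.0, 8.0, 10)
homogeneous = (1.0 - 0.2) ** 10
print(heterogeneous, homogeneous)
```

A censored animal would contribute the escape probability and an infected animal the first-infection probability, in analogy with the homogeneous model.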

The definition of vaccine efficacy given in (1) is based on a single contact. More generally, VE* _{S}* could be defined in terms of

$${\text{VE}}_{S}(t)=1-\frac{(1-\theta )\{1-{(1-\phi p)}^{t}\}}{1-{(1-p)}^{t}},$$

the relative reduction in the probability of infection from *t* exposures under vaccine compared to control. If the vaccine is tested or to be used in a low-risk population, then the estimand VE* _{S}*(1)=VE
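The behaviour of the generalized estimand VE_S(t) can be explored directly from its formula. The function below is a minimal sketch that transcribes the displayed expression; the function name `ve_s_t` and the parameter values are illustrative, with p = 0.20 and φ = 0.36 taken from the leaky-model fit reported earlier.

```python
def ve_s_t(p, theta, phi, t):
    """Relative reduction in the probability of infection from t exposures
    under vaccine compared to control."""
    infected_vaccine = (1.0 - theta) * (1.0 - (1.0 - phi * p) ** t)
    infected_control = 1.0 - (1.0 - p) ** t
    return 1.0 - infected_vaccine / infected_control

# Leaky effect: efficacy wanes as the number of exposures grows.
leaky_curve = [ve_s_t(0.20, 0.0, 0.36, t) for t in (1, 5, 10, 20)]
# All-or-none effect: efficacy equals theta for every t.
aon_curve = [ve_s_t(0.20, 0.64, 1.0, t) for t in (1, 5, 10, 20)]
print(leaky_curve)
print(aon_curve)
```

At t = 1 the expression reduces to 1 − (1 − θ)φ, the per contact definition in (1). Under a leaky effect the value then declines with t, while under an all-or-none effect it stays at θ.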

The generalized definition VE* _{S}*(

In this section we present an approach for assessing potential immunological surrogates of protection (SoP) in an RLD challenge study. In general, a SoP is defined to be an immunological variable *S* such that a vaccine effect on *S* is predictive of a vaccine effect on the risk of infection or disease (i.e., is predictive of VE* _{S}*). The utility of such a surrogate marker includes guiding vaccine development, improving immunogens iteratively between basic and clinical research, providing guidance for regulatory decisions, bridging efficacy of a vaccine observed in a trial to a new setting, and guiding public immunization policy. For RLD challenge studies, knowledge of an immunological surrogate may allow insightful and cost-effective comparisons of vaccine candidates in animals, support predictions of vaccine efficacy in humans, and inform prioritizing the most promising candidates for testing in humans.

Despite the importance of finding SoPs, the literature on methods for their quantitative assessment is quite limited. Most existing approaches simply assess correlates of risk (CoRs), i.e., immunological biomarkers that are associated with risk of infection or disease. For example, in the first phase III trial of an HIV vaccine (VAX004), a significant negative association was found between risk of HIV infection and antibody (Ab) response to the vaccine (Gilbert et al. 2005). However, this purely correlational analysis provides no information to distinguish between the possible explanations that (i) a greater vaccine effect on the immune response predicted a greater vaccine effect on infection risk, or (ii) the immune response simply marked an innate ability to escape infection but did not predict vaccine efficacy. In other words, it was not possible to conclude whether Ab response to the vaccine was a SoP or just a CoR.

Qin et al. (2007) defined a hierarchy of two levels of SoPs: a *specific SoP* is predictive of VE* _{S}* for the same setting (population, environmental factors) as present in the particular study, and a

Recently, novel experimental designs and corresponding statistical methodology have been proposed for evaluating potential specific SoPs in the context of human efficacy trials (Follmann 2006; Gilbert and Hudgens 2008). Here we consider one of two designs proposed by Follmann (2006) wherein for each individual we measure a baseline covariate(s) *W* that is correlated with the immune response that individual would have to the HIV vaccine being evaluated. For example, *W* might be an immune response to a rabies vaccine. The missing HIV vaccine immune response for individuals in the control arm can then be predicted from their *W* and a prediction model based on observed data from the vaccine group. In turn, we can assess how well causal treatment effects on the HIV immune response predict the causal effect of the vaccine to prevent infection. Simulation studies of large (e.g., Phase III) randomized studies have demonstrated that the additional information provided by *W* can enable assessing the extent to which a CoR is a SoP (Follmann 2006; Gilbert and Hudgens 2008). Below we consider whether measuring a baseline predictor *W* in RLD challenge studies might also afford sufficient information for inference regarding possible SoPs. We know of no RLD challenge studies to date which have implemented Follmann’s baseline predictor study design.

Motivated by Ellenberger et al. (2006), we consider assessment of possible SoPs assuming a leaky vaccine effect. We begin by introducing the potential outcomes notation to be used for the SoP model. For subject *i*, let *T _{i}*(

In order to identify the causal estimands of interest defined below, we invoke the stable unit treatment value assumption (SUTVA) and assume ignorable treatment assignment. The lack of interference between NHPs implied by SUTVA should hold in this setting since investigators can prevent interaction between NHPs. Use of randomization in assigning NHPs to receive vaccine or serve as a control will ensure ignorable treatment assignment. In the context of human vaccine trials, Gilbert and Hudgens (2008) make the additional assumption that the risk of infection prior to measurement of *S _{i}* is the same for

The average causal effect of the vaccine on survival is defined as *h*(*E*{*T _{i}*(0)},

Next consider the principal stratification (Frangakis and Rubin 2002) of individuals according to the pair of potential immune responses (*S _{i}*(0),

$$h(E\{{T}_{i}(0)\mid {S}_{i}(1)=0\},E\{{T}_{i}(1)\mid {S}_{i}(1)=0\})=0$$

and

$$h(E\{{T}_{i}(0)\mid {S}_{i}(1)=s\},E\{{T}_{i}(1)\mid {S}_{i}(1)=s\})\ne 0\quad \text{for all}\ s>C,$$

for some constant *C* ≥ 0. In words, *S* is an SoP if the vaccine has no average effect on survival in groups of individuals who would have no immune response under vaccine and has some average effect in groups of individuals who would have an immune response greater than *C* under vaccine. Let *p*(*z*, *s*) denote the transmission probability conditional on *S _{i}*(1) =

$$p(0,0)=p(1,0)\quad \text{and}\quad p(0,s)\ne p(1,s)\quad \text{for all}\ s>C.$$

(2)

Therefore, whether *S* is an SoP can be evaluated through inference about the transmission probability curves *p*(0, *s*) and *p*(1, *s*).

In practice, a biomarker may have value as a surrogate even if (2) is not strictly satisfied. For example, if *p*(0, 0) is approximately equal to *p*(1, 0) while *p*(0, *s*) is substantially greater than *p*(1, *s*) for *s* > *C*, then *S* is predictive of the vaccine’s effect on risk of infection. Therefore, to summarize the predictiveness or “surrogate value” of a biomarker, Gilbert and Hudgens (2008) proposed the proportion associative effect statistic *PAE* ≡ |*EAE*|/(|*EAE*| + |*EDE*|) where

$$\begin{array}{l}\mathit{EAE}\equiv E[h(E\{{T}_{i}(0)\mid {S}_{i}(1)\},E\{{T}_{i}(1)\mid {S}_{i}(1)\})\mid {S}_{i}(1)>{S}_{i}(0)]\\ ={\int}_{s>0}h(E\{{T}_{i}(0)\mid {S}_{i}(1)=s\},E\{{T}_{i}(1)\mid {S}_{i}(1)=s\})d{F}_{S}(s)/\Pr[{S}_{i}(1)>{S}_{i}(0)]\end{array}$$

and

$$\begin{array}{l}\mathit{EDE}\equiv E[h(E\{{T}_{i}(0)\mid {S}_{i}(1)\},E\{{T}_{i}(1)\mid {S}_{i}(1)\})\mid {S}_{i}(1)={S}_{i}(0)]\\ =h(E\{{T}_{i}(0)\mid {S}_{i}(1)=0\},E\{{T}_{i}(1)\mid {S}_{i}(1)=0\})\end{array}$$

are the expected associative and dissociative effects, and *F _{S}* is the CDF of

Following Follmann (2006), we model the transmission probability by

$$p(Z,{S}_{i}(1),{W}_{i};\beta )=\mathrm{\Phi}\{{\beta}_{1}+{\beta}_{2}Z+{\beta}_{3}{S}_{i}(1)+{\beta}_{4}Z{S}_{i}(1)+{\beta}_{5}{W}_{i}\},$$

(3)

where Φ is the standard normal CDF, *W _{i}* is some baseline covariate that is correlated with

$$\mathit{PAE}=\mid {\beta}_{2}+\kappa {\beta}_{4}\mid /\{\mid {\beta}_{2}\mid +\mid {\beta}_{2}+\kappa {\beta}_{4}\mid \},$$

(4)

where *κ* ≡ *E*{*S _{i}*(1)|
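The probit model (3) and the closed-form expression (4) can be evaluated with a few lines of code. This is a minimal sketch under the assumption that *κ* is supplied as a known number; the function names and the coefficient values in the usage lines are hypothetical and do not come from the simulation study.

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def transmission_prob(z, s, w, beta):
    """Per-challenge transmission probability under the probit model (3).
    z: treatment (1 = vaccine, 0 = control); s: immune response S_i(1);
    w: baseline covariate; beta: (beta1, ..., beta5)."""
    b1, b2, b3, b4, b5 = beta
    return std_normal_cdf(b1 + b2 * z + b3 * s + b4 * z * s + b5 * w)

def pae(beta2, beta4, kappa):
    """Proportion associative effect from expression (4).
    Undefined (division by zero) when both effects are exactly zero."""
    associative = abs(beta2 + kappa * beta4)
    return associative / (abs(beta2) + associative)

# Hypothetical coefficients: no vaccine effect at s = 0 (beta2 = 0) and a
# protective interaction (beta4 < 0), giving the maximal surrogate value.
beta = (-0.8, 0.0, 0.0, -0.5, 0.0)
print(transmission_prob(0, 2.0, 0.0, beta), transmission_prob(1, 2.0, 0.0, beta))
print(pae(beta2=0.0, beta4=-0.5, kappa=2.0))
```

When *β*₂ = 0 the vaccine has no effect in animals with no immune response and *PAE* = 1, whereas *β*₄ = 0 gives *PAE* = 0.5, meaning the biomarker carries no information about the vaccine effect.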

For subject *i*, let *T _{i}* ≡ min{

$$f(O;\beta ,G)\equiv \varphi {(1,S,W,T,\delta ;\beta )}^{Z}{\left\{\int \varphi (0,s,W,T,\delta ;\beta )dG(s\mid W)\right\}}^{1-Z},$$

and *ϕ*(*Z*, *S*, *W*, *T*, *δ*; *β*) ≡ {1 − *p*(*Z*, *S*, *W*; *β*)}^{T−δ}*p*(

Maximum “estimated likelihood” (Pepe and Fleming 1991) or “pseudolikelihood” (Liang and Self 1996) can be used for inference regarding *β* and *PAE*. As in Follmann (2006) and Gilbert and Hudgens (2008), we assume (*S*(1), *W*) arise from a bivariate normal distribution with means (*μ _{S}*,

A simulation study was conducted to assess whether sample sizes typical of RLD challenge studies provide adequate power to detect immune responses with high surrogate value. Data were generated assuming: *m* NHPs per arm; a maximum number of exposures per NHP of *C _{i}* = 30 for all

Simulation results are given in Table 3 and Figure 3. The MELE $\widehat{\mathit{PAE}}$ is positively biased when *PAE* = 0.5 or 0.7 and negatively biased when *PAE* = 0.9, although the magnitude of the bias is negligible for *m* > 10. Likewise, the estimated transmission probability curves *p*(0, *s*; *β̂*) and *p*(1, *s*; *β̂*) exhibit minimal bias. The PBT has approximately the nominal size overall and adequate power for the high surrogate value scenario (i.e., *PAE* = 0.9) when there are 20 NHPs per arm. For each combination of *PAE*, *ρ*, and *m* in Table 3, we also estimated the power to detect a CoR, i.e., an association between *S _{i}*(1) and

Despite limited sample size, RLD challenge studies can inform about a vaccine’s mechanism of protection and magnitude of effect. In particular, using discrete time survival models, we show that maximum likelihood methods can afford accurate and precise estimates of vaccine efficacy. While determining a vaccine’s mechanism of protection is difficult in human studies, our results demonstrate that careful experimental design of challenge studies can lead to correct determination of the type of mechanism. We also consider a generalization of these models that allows for heterogeneous transmission probabilities. Similar extensions could easily be made to incorporate baseline covariates or allow for the possibility that a subset of NHPs are naturally immune to infection.

Our results also indicate it is possible to reliably evaluate potential immune SoPs in this setting. Properly designed RLD challenge studies can be adequately powered to detect CoRs, and, to a lesser extent, immunological biomarkers with high surrogate value. These studies can also yield accurate estimates of surrogate value, based on the estimated transmission probability curves or functionals thereof such as *PAE*. However, results from our simulation study should be interpreted with caution for several reasons. First, 20 NHPs per arm are needed to have sufficient power to detect an immune marker with high surrogate value. Although such sample sizes are not the norm in this setting, several RLD challenge studies at least this large are being planned or conducted presently (John Mascola, personal communication). Second, the model assumed by the SoP analysis was correct, which in practice will rarely if ever be the case (discussed further below). Third, the assumed correlation of at least 0.5 between the HIV vaccine immune response *S _{i}*(1) and the baseline predictor

The proposed approach to evaluating possible SoPs entails assuming SUTVA, ignorable treatment assignment, and a parametric regression model for the transmission probabilities. As discussed in Section 3.1, these first two assumptions should hold in RLD challenge studies. The appropriateness of the probit model (3) will be more difficult to assess without additional information on *S _{i}*(1). Some elaborations in the design of RLD challenge studies might be helpful in this regard. For example, in addition to the baseline predictor design studied in Section 3, Follmann (2006) also proposed a second design, “closeout placebo vaccination” or CPV. Using this approach, controls who remain uninfected by the end of study would receive the HIV vaccine and their subsequent immune response would be measured. In the RLD challenge study setting, CPV may not be feasible since most, if not all, control animals are often infected after repeated challenges, e.g., see Subbarao et al. (2006) and Ellenberger et al. (2006). If this scenario is anticipated, a possible variation on the CPV design would be to vaccinate those NHPs randomized to the control arm that remain uninfected after a specific number of exposures. Further research is needed on applying CPV and variations thereof to the RLD challenge setting.

We caution that the scope of inference drawn regarding the surrogate value of candidate specific SoPs should be limited to settings similar to the study at hand. A single RLD challenge study typically will not provide sufficient information for extrapolation to different vaccine formulations or human populations. Such inferences generally require conduct of additional studies. We refer the reader to Gilbert and Hudgens (2008) and Qin et al. (2007) for further related discussion of SoP assessment in vaccine studies.

In contrast to SoPs, smaller RLD challenge studies can provide adequate power to detect a CoR. For example, in our simulation study with 10 NHPs per arm, there was at most 50% power to detect a SoP with high surrogate value, whereas the power to detect a CoR was greater than 95%. In addition to requiring fewer NHPs, evaluation of potential CoRs does not require obtaining a baseline covariate *W* correlated with *S*(1). These results are in concert with the well-known principle in the surrogate endpoint literature that establishing a valid surrogate requires substantially more evidence than merely determining a correlate.

This work was supported by NIH grant R01 AI054165-01. The authors thank Chih-Da Wu for his helpful comments and fitting the heterogeneous transmission probability model.

Supplementary Materials

The Web Appendix, Tables, and Figure referenced in Sections 2 and 3 are available under the Paper Information link at the Biometrics website http://www.biometrics.tibs.org.

- Chick SE, Barth-Jones DC, Koopman JS. Bias reduction for risk ratio and vaccine effect estimators. Statistics in Medicine. 2001;20:1609–1624.
- Czeschinski P, Binding N, Witting U. Hepatitis A and hepatitis B vaccinations: immunogenicity of combined vaccine and of simultaneously or separately applied single vaccines. Vaccine. 2000;18:1074–1080.
- Davison AC, Hinkley DV. Bootstrap Methods and Their Application. Cambridge University Press; 1997.
- Ellenberger D, Otten RA, Li B, Rodriguez V, Sariol CA, Martinez M, Monsour M, Wyatt L, Hudgens MG, Kraiselburd E, Moss B, Robinson H, Folks T, Butera S. HIV-1 DNA/MVA vaccination reduces the per exposure probability of infection during repeated mucosal SHIV challenges. Virology. 2006;352:216–225.
- Farrington CP. Communicable diseases. In: Armitage P, Colton T, editors. Encyclopedia of Biostatistics. New York: Wiley; 1998. pp. 795–815.
- Follmann D. Augmented designs to assess immune response in vaccine trials. Biometrics. 2006;62:1161–1169.
- Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58:21–29.
- Gilbert PB. Interpretability and robustness of sieve analysis models for assessing HIV strain variations in vaccine efficacy. Statistics in Medicine. 2001;20(2):263–279.
- Gilbert PB, Hudgens MG. Evaluating candidate principal surrogate endpoints. Biometrics. 2008 In press.
- Gilbert PB, Peterson ML, Follmann D, Hudgens MG, Francis DP, Gurwith M, Heyward WL, Jobes DV, Popovic V, Self SG, Sinangil F, Burke D, Berman PW. Correlation between immunologic responses to a recombinant glycoprotein 120 vaccine and incidence of HIV-1 infection in a phase 3 HIV-1 preventive vaccine trial. Journal of Infectious Diseases. 2005;191:666–77.
- Halloran ME, Haber M, Longini IM. Interpretation and estimation of vaccine efficacy under heterogeneity. American Journal of Epidemiology. 1992;136:328–343.
- Halloran ME, Longini IM, Struchiner CJ. Design and interpretation of vaccine field studies. Epidemiologic Reviews. 1999;21:73–88.
- Jewell NP. On the bias of commonly used measures of association for 2 × 2 tables (C/R: V45 p1030–1032). Biometrics. 1986;42:351–358.
- Liang KY, Self SG. On the asymptotic behaviour of the pseudolikelihood ratio test statistic. Journal of the Royal Statistical Society, Series B: Methodological. 1996;58:785–796.
- Longini IM, Halloran ME. A frailty mixture model for estimating vaccine efficacy. Applied Statistics. 1996;45:165–173.
- Pepe MS, Fleming TR. A nonparametric method for dealing with mismeasured covariate data. Journal of the American Statistical Association. 1991;86:108–113.
- Qin L, Gilbert P, Corey L, McElrath M, Self S. A framework for assessing immunological correlates of protection in vaccine trials. Journal of Infectious Diseases. 2007;196:1304–1312.
- Regoes RR, Longini IM, Feinberg MB, Staprans SI. Preclinical assessment of HIV vaccines and microbicides by repeated low-dose virus challenges. PLoS Medicine. 2005;2(8):e249.
- Self SG, Liang K. Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association. 1987;82:605–610.
- Smith PG, Rodrigues LC, Fine PEM. Assessment of the protective efficacy of vaccines against common diseases using case-control and cohort studies. International Journal of Epidemiology. 1984;13:87–93.
- Subbarao S, Otten R, Ramos A, Jackson E, Monsour M, Bashirian S, Kim C, Johnson J, Soriano V, Hudgens MG, Butera S, Janssen R, Paxton L, Greenberg A, Folks T. Chemoprophylaxis with Tenofovir Disoproxil Fumarate provided partial protection against Simian Human Immunodeficiency Virus infection in macaques given multiple virus challenges. Journal of Infectious Diseases. 2006;194:904–11.
- UNAIDS. Geneva: Dec, 2007. AIDS epidemic update. www.unaids.org.
- Weinberg CR, Gladen BC. The beta-geometric distribution applied to comparative fecundability studies. Biometrics. 1986;42:547–560.
