Simulation studies were conducted to estimate the statistical power of repeated low-dose challenge experiments in non-human primates to detect a candidate HIV vaccine’s effect. The effect of various design parameters on power was explored. Simulation results indicate repeated low-dose challenge studies with total sample size 50 (25 per arm) typically provide adequate power to detect a 50% reduction in the per-exposure probability of infection due to vaccination. Power generally increases with the maximum number of allowable challenges per animal, the per-exposure risk of infection in controls, and the proportion susceptible to infection.
A preventive vaccine against HIV-1 would be a valuable tool in curbing the pandemic. Evaluation of vaccine candidates in the SIV non-human primate model is an important step in the assessment of the potential efficacy of analogous HIV-1 vaccines [1, 2]. Historically, vaccine regimens have been tested in non-human primates by administering a single high-dose intravenous or mucosal inoculation of the challenge virus, typically resulting in infection of all animals under study after one exposure. Recently, evaluation of candidate HIV vaccines (and other preventive interventions) has entailed repeated low-dose mucosal challenge studies [3–6] that may more closely mimic typical exposure in natural human transmission settings. A primary objective of these studies is to assess vaccine efficacy for prevention of infection.
Since the repeated low-dose challenge study design has only recently been implemented in evaluation of candidate HIV vaccines, the literature on design considerations of these studies is limited. Recent investigations [7, 8] demonstrated such challenge studies can be adequately powered to test for vaccine efficacy to prevent infection with feasible sample sizes of non-human primates. However, the effects of design parameters such as the challenge dose, the percent of animals susceptible to infection, and unequal allocation of animals to vaccine and placebo arms have not been systematically investigated. Given results from a recent HIV vaccine efficacy trial, the possibility that vaccination may increase the probability of infection must also be considered. Direct comparisons of different statistical tests employed in the analysis of repeated low-dose challenge studies are also needed.
In this paper, we describe simulation studies which were conducted to better understand how the design of challenge experiments can affect the statistical power to detect a candidate vaccine’s effect. Simulated scenarios were varied according to the total sample size, the per-exposure risk of infection in controls, the magnitude and direction of the vaccine effect, the fraction allocated to vaccine or control, the proportion of susceptible animals, and the maximum number of challenges per animal. While this work is motivated by the development of an HIV vaccine, our results can be used to inform the design of challenge studies for evaluation of other vaccines and other preventive interventions.
Simulation studies were conducted to assess the statistical power of challenge experiments to detect a candidate vaccine’s effect on the per-exposure risk of infection. Power was estimated by simulating multiple challenge studies and calculating the proportion of simulated data sets where various statistical tests (described below) rejected the null hypothesis of no vaccine effect. Simulations were also conducted under the null hypothesis to evaluate type I error, i.e., the probability a statistical test incorrectly rejects the null. All statistical tests were two-sided and conducted at the α=0.05 significance level. For each scenario described below, 10,000 challenge studies were simulated.
Simulations were conducted under the following four key assumptions. First, challenge studies have two arms, with a specified fraction of non-human primates randomly assigned to receive the candidate vaccine and the remaining animals serving as controls. Second, the vaccine has a leaky mechanism of protection in animals that are susceptible to infection, i.e., the vaccine decreases (or increases) the per-exposure probability of infection by the same multiplicative amount for all susceptible animals. The power is expected to be at least as high in trials where the vaccine has an all-or-none mechanism of protection. Third, challenges cease once an animal has remained uninfected after cmax exposures. Fourth, given an animal is susceptible, the probability of infection at each exposure is independent of the number of prior exposures.
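The data-generating process implied by these four assumptions can be sketched in a few lines. The following is a minimal illustration, not the authors' simulation code; the function names `simulate_animal` and `simulate_arm` and the argument `frac_susceptible` are hypothetical, and the treatment of non-susceptible animals (never infected, followed to cmax) is an assumption consistent with the text.

```python
import random


def simulate_animal(p, cmax, frac_susceptible=1.0, rng=random):
    """Simulate one animal under the stated assumptions.

    p: per-exposure probability of infection if the animal is susceptible.
    cmax: maximum number of challenges.
    frac_susceptible: probability the animal is susceptible (assumed
        independent of arm).
    Returns (t, infected): exposures received and infection status.
    """
    if rng.random() >= frac_susceptible:
        # Non-susceptible animals are challenged cmax times and never infected.
        return cmax, False
    for t in range(1, cmax + 1):
        # Constant per-exposure risk, independent of prior exposures.
        if rng.random() < p:
            return t, True
    return cmax, False


def simulate_arm(n, p, cmax, frac_susceptible=1.0, rng=random):
    """Simulate n animals sharing the same per-exposure risk p."""
    return [simulate_animal(p, cmax, frac_susceptible, rng) for _ in range(n)]


# Leaky vaccine: every susceptible vaccinee has risk RR * p0 per exposure.
rng = random.Random(1)
p0, rr, cmax = 0.5, 0.5, 10
controls = simulate_arm(25, p0, cmax, rng=rng)
vaccinees = simulate_arm(25, rr * p0, cmax, rng=rng)
print(sum(infected for _, infected in controls), "controls infected")
print(sum(infected for _, infected in vaccinees), "vaccinees infected")
```

Under this sketch the number of exposures until infection is a geometric random variable truncated at cmax, which is what makes a discrete-time survival analysis natural for these designs.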
Let p1 (p0) denote the probability of infection from a single exposure in a vaccinated (unvaccinated) animal. We are interested in testing the null hypothesis H0 : p0 = p1 or equivalently H0 : RR = p1/p0 = 1, where RR denotes the per-exposure relative risk of infection. Three different tests of H0 were considered in the simulation studies: the logrank test, Fisher’s exact test, and a likelihood ratio test. The likelihood ratio test is described in the Appendix. For all tests, we evaluated power as a function of RR. Values of RR less than one indicate vaccination has a protective effect. For example, RR = 0.4 corresponds to 60% vaccine efficacy, where vaccine efficacy equals (1 − RR) × 100%. Conversely, values of RR greater than one indicate a harmful effect of vaccination. For instance, RR = 1.5 corresponds to the vaccine increasing the per-exposure probability of infection by 50%.
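The relationship between RR, vaccine efficacy, and the per-exposure risk in vaccinees is simple arithmetic, sketched below. The helper names `vaccine_efficacy_percent` and `vaccinated_risk` are hypothetical and introduced only for illustration.

```python
def vaccine_efficacy_percent(rr):
    """Vaccine efficacy in percent, defined in the text as (1 - RR) x 100%."""
    return (1.0 - rr) * 100.0


def vaccinated_risk(p0, rr):
    """Per-exposure risk in vaccinees, p1 = RR * p0."""
    p1 = rr * p0
    if not 0.0 <= p1 <= 1.0:
        raise ValueError("RR * p0 must be a probability")
    return p1


# The two worked examples from the text: RR = 0.4 and RR = 1.5.
for rr in (0.4, 1.5):
    print(rr, round(vaccine_efficacy_percent(rr), 1), vaccinated_risk(0.5, rr))
```

A negative efficacy, as for RR = 1.5, corresponds to enhancement of infection. The range check also makes explicit that RR is bounded above by 1/p0.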
Unless stated otherwise, all results are based on simulations where p0 = 0.5, equal numbers of animals are allocated to the vaccine and control arms, the maximum number of challenges per animal is cmax = 10, and all animals are susceptible to infection. Simulated power is presented only for the logrank test unless the results for the likelihood ratio or Fisher’s exact tests differ markedly.
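The power calculation under these default settings can be illustrated with a small Monte Carlo sketch. This is not the authors' code: the function names are hypothetical, the logrank test is implemented with its usual large-sample chi-square approximation (one degree of freedom), and the number of simulated studies is reduced from 10,000 to keep the example fast, so estimates will be noisier than those reported.

```python
import math
import random


def simulate_arm(n, p, cmax, rng):
    """Each animal: (exposures until infection or cmax, infected flag)."""
    arm = []
    for _ in range(n):
        t, infected = cmax, False
        for k in range(1, cmax + 1):
            if rng.random() < p:
                t, infected = k, True
                break
        arm.append((t, infected))
    return arm


def logrank_pvalue(arm0, arm1):
    """Two-sided logrank p-value (chi-square approximation, 1 df)."""
    event_times = sorted({t for t, d in arm0 + arm1 if d})
    o_minus_e, var = 0.0, 0.0
    for t in event_times:
        n0 = sum(1 for s, _ in arm0 if s >= t)   # at risk, arm 0
        n1 = sum(1 for s, _ in arm1 if s >= t)   # at risk, arm 1
        d0 = sum(1 for s, d in arm0 if d and s == t)
        d1 = sum(1 for s, d in arm1 if d and s == t)
        n, d = n0 + n1, d0 + d1
        o_minus_e += d1 - d * n1 / n             # observed minus expected
        if n > 1:
            var += d * (n1 / n) * (n0 / n) * (n - d) / (n - 1)
    if var == 0.0:
        return 1.0
    chi2 = o_minus_e ** 2 / var
    return math.erfc(math.sqrt(chi2 / 2.0))      # P(chi-square_1 > chi2)


def estimate_power(n_per_arm, p0, rr, cmax, n_sims, alpha=0.05, seed=0):
    """Proportion of simulated studies in which H0: RR = 1 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        controls = simulate_arm(n_per_arm, p0, cmax, rng)
        vaccinees = simulate_arm(n_per_arm, rr * p0, cmax, rng)
        if logrank_pvalue(controls, vaccinees) < alpha:
            rejections += 1
    return rejections / n_sims


# Defaults from the text: p0 = 0.5, 1:1 allocation, cmax = 10, all susceptible.
print("power at RR = 0.5:", estimate_power(25, 0.5, 0.5, 10, n_sims=1000))
print("type I error (RR = 1):", estimate_power(25, 0.5, 1.0, 10, n_sims=1000))
```

Setting `rr=1.0` simulates under the null hypothesis, so the same function yields an estimate of the type I error rate.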
First we considered the effect of the total sample size N on power. The results depicted in Figure 1A indicate a repeated low-dose challenge study with N = 50 animals has at least 80% power to detect RR = 0.5. Smaller studies of size N = 40 and N = 30 have only 74% and 61% power, respectively, to detect RR = 0.5. If the vaccine enhances the probability of infection, N = 50 also provides at least 80% power to detect RR = 1.7.
Figure 1B shows power for different values of p0. These results demonstrate that power increases with p0, suggesting challenge doses should be chosen such that the per-exposure probability of infection is approximately 0.5. Values of p0 greater than 0.5 were not considered as such high doses can preclude the ability to determine whether vaccination enhances the probability of infection. For instance, suppose p0 = 1 corresponding to a high-dose challenge study and the vaccine has no protective effect. Then all animals will become infected after a single challenge, making it impossible to detect any increase in the probability of infection due to vaccination that might occur at lower doses. Additional disadvantages of high-dose challenge studies have been described elsewhere [4, 7].
The effect of varying the relative number of animals allocated to the vaccine (nv) and control (nc) arms is depicted in Figure 1C. In general, there is a modest diminution of power with an unbalanced design. Given a secondary objective of challenge studies can entail identifying immunological correlates of risk, allocating more than half of the non-human primates to vaccine may be preferred.
Figure 1D shows power increases with the maximum number of challenges per animal (cmax). There is an appreciable increase in power moving from cmax = 1 to cmax = 3, indicating the importance of allowing for repeated exposures when using a low-dose challenge. For example, when RR = 0.5 there is 47% power for cmax = 1 compared to 75% power for cmax = 3. Gains in power are less pronounced when increasing cmax from 3 to 10. For example, the power to detect RR = 0.5 is 84% when cmax = 10. Increasing cmax to 20 yields no appreciable increase in power over cmax = 10 (results not shown) since so few animals are expected to escape infection after 10 challenges when p0 = 0.5 and RR = 0.5.
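The diminishing return from additional challenges follows directly from the constant per-exposure risk: a susceptible animal with risk p escapes infection through c challenges with probability (1 − p)^c. The short sketch below (the function name `prob_uninfected` is hypothetical) tabulates this for p0 = 0.5 and RR = 0.5, assuming all animals are susceptible.

```python
def prob_uninfected(p, c):
    """Probability a susceptible animal is still uninfected after c challenges."""
    return (1.0 - p) ** c


p0, rr = 0.5, 0.5
p1 = rr * p0
for c in (1, 3, 10, 20):
    print(f"cmax={c:2d}  control={prob_uninfected(p0, c):.4f}  "
          f"vaccine={prob_uninfected(p1, c):.4f}")
```

At cmax = 10 roughly 0.1% of controls and 5.6% of vaccinees are expected to remain uninfected, so further challenges can add very few endpoints.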
Finally, we consider the possibility that some fraction of animals is not susceptible to infection. For example, using repeated low-dose rectal challenges of SHIV, Garcia-Lerma et al. reported 18 untreated macaques became infected after a median of two challenges, yet one macaque remained uninfected after 14 exposures. We assume the probability of an animal not being susceptible is independent of randomization assignment. Figure 2 shows the power of the three tests assuming different fractions of susceptible animals (100%, 90%, 80%). Unlike previous results (Figure 1), power and type I error of the logrank, Fisher’s exact and likelihood ratio tests differ for this set of simulations.
There are three important results to be gleaned from Figure 2. First, Fisher’s exact test has a type I error rate substantially above the nominal α = 0.05 significance level. Second, power of the logrank test decreases as the fraction of susceptible animals decreases. This decrease in power is consistent with the results of Regoes et al., who considered the effect of heterogeneity in the per-exposure probability of infection on power. Third, the likelihood ratio test maintains the appropriate type I error rate and has only minimal loss of power as the fraction of susceptible animals decreases.
These results have three implications. First, if some animals are not susceptible to infection, the standard logrank test is not recommended due to diminished power. Rather, tests designed to provide greater power in the presence of an immune or “cured” fraction should be considered. If the modeling assumptions are justified, the likelihood ratio test will be optimal. However, if the model is incorrect, the likelihood ratio test may not have the correct type I error rate or may be less powerful than nonparametric tests designed for this setting (e.g., a weighted logrank test). Second, Fisher’s exact test is not recommended if some animals are not susceptible due to inflated type I error. Similar results (not shown) were obtained from simulations allowing for heterogeneity in susceptibility between animals by randomly sampling individual transmission probabilities from beta distributions. Third, provided the appropriate statistical test is employed, repeated low-dose challenge studies can still achieve adequate power even if 10–20% of animals are not susceptible to infection. For example, a challenge study with N = 50 animals has over 80% power (using the likelihood ratio test) to detect RR = 0.45 when 90% of the animals are susceptible.
Despite limited sample size, repeated low-dose challenge studies can reliably detect the protective effect of a vaccine candidate. For example, 50 non-human primates (25 per arm) will generally provide sufficient power to detect a 50% reduction in the per-exposure probability of infection due to vaccination. In other words, repeated low-dose challenge studies can achieve the same power to detect vaccine efficacy as much larger phase IIB or III clinical trials and can produce roughly the same number of endpoints (i.e., infections) as a screening test-of-concept (TOC) trial [11, 12].
Power of repeated low-dose challenge studies generally increases with the per-exposure risk of infection in controls, suggesting titration studies should aim to identify challenge doses where the per-challenge probability of infection is approximately 0.5. Randomizing animals to vaccine in 2:1 or 3:1 ratios results in a modest diminution of power compared to 1:1 allocation. Power tends to decrease as the proportion of animals not susceptible to infection increases, although adequate power can still be achieved provided this proportion does not exceed 20% and the appropriate statistical test is employed. Finally, power increases with the maximum number of challenges per animal, with repeated low-dose challenge designs having substantially more power than single challenge studies.
The results presented here can aid individual investigators in determining an efficient study design for their particular setting. Additionally, we have developed a web-based calculator (accessible from http://www.bios.unc.edu/~mhudgens) to estimate the power of repeated low-dose challenge studies for various study designs.
Financial support: National Institute of Allergy and Infectious Diseases, National Institutes of Health (grant R01 AI054165-04)
The likelihood ratio test is based on the following discrete time survival model. Let θ denote the probability an animal is not susceptible to infection. For animal i, let zi denote treatment assignment (1 vaccine, 0 control), ti denote the number of exposures until infection (up to cmax), and δi denote whether the animal is infected by the end of the study. Assume we observe N independent, identically distributed copies of (zi, ti, δi) and that, conditional on whether an animal is susceptible, the probability of infection is independent of the number of prior exposures. Then the likelihood is the product over animals of the probability of each animal's observed outcome (ti, δi) given zi.
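Under the model just described, each animal's contribution can be written out explicitly. The expression below is a sketch derived from the stated definitions and is not necessarily in the authors' notation. It assumes ti = cmax for animals that remain uninfected, and writes p_{z_i} for p1 when zi = 1 and p0 when zi = 0. An infected animal must be susceptible and escape ti − 1 exposures before infection. An uninfected animal is either not susceptible or susceptible and uninfected after all ti exposures.

```latex
L(p_0, p_1, \theta)
  = \prod_{i=1}^{N}
    \left\{ (1-\theta)\, p_{z_i} (1 - p_{z_i})^{t_i - 1} \right\}^{\delta_i}
    \left\{ \theta + (1-\theta)(1 - p_{z_i})^{t_i} \right\}^{1 - \delta_i}
```

When θ = 0 this reduces to the likelihood of a geometric time-to-infection model with censoring at cmax.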
The likelihood ratio test statistic equals twice the difference between the maximized log-likelihood computed without the constraint p0 = p1 and the maximized log-likelihood computed with it.
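A minimal sketch of this test follows. It is deliberately simplified and is not the authors' implementation. The log-likelihood function allows a non-susceptible fraction θ, with infected animals contributing (1 − θ) p (1 − p)^(t − 1) and uninfected animals contributing θ + (1 − θ)(1 − p)^t. The test statistic itself is computed only for the special case θ = 0, where the maximum likelihood estimate has a closed form (infections divided by total exposures). Fitting θ would require numerical maximization, which is omitted here. The function names and the comparison to a chi-square distribution with one degree of freedom are assumptions of this sketch.

```python
import math


def log_likelihood(data, p0, p1, theta=0.0):
    """Log-likelihood for records (z, t, delta): arm, exposures, infected."""
    ll = 0.0
    for z, t, delta in data:
        p = p1 if z == 1 else p0
        if delta:
            contrib = (1.0 - theta) * p * (1.0 - p) ** (t - 1)
        else:
            contrib = theta + (1.0 - theta) * (1.0 - p) ** t
        if contrib <= 0.0:
            return float("-inf")
        ll += math.log(contrib)
    return ll


def mle_risk(data):
    """Closed-form MLE of the per-exposure risk when theta = 0."""
    infections = sum(1 for _, _, delta in data if delta)
    exposures = sum(t for _, t, _ in data)
    return infections / exposures


def lr_test_all_susceptible(data):
    """Likelihood ratio statistic and p-value, assuming theta = 0.

    Both arms must be non-empty.
    """
    arm0 = [r for r in data if r[0] == 0]
    arm1 = [r for r in data if r[0] == 1]
    p0_hat, p1_hat, p_hat = mle_risk(arm0), mle_risk(arm1), mle_risk(data)
    unconstrained = log_likelihood(data, p0_hat, p1_hat)
    constrained = log_likelihood(data, p_hat, p_hat)
    stat = max(2.0 * (unconstrained - constrained), 0.0)
    pvalue = math.erfc(math.sqrt(stat / 2.0))  # chi-square, 1 df (assumed)
    return stat, pvalue


# Hypothetical toy data, for illustration only: (arm, exposures, infected).
toy = [(0, 1, True), (0, 2, True), (0, 1, True), (0, 3, True),
       (1, 10, False), (1, 7, True), (1, 10, False), (1, 4, True)]
print(lr_test_all_susceptible(toy))
```

Extending the sketch to estimate θ would mean maximizing `log_likelihood` over (p0, p1, θ) and over (p, p, θ) with a numerical optimizer, then taking twice the difference as above.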
Potential conflicts of interest: none
Presented at: AIDS Vaccine 2008 Meeting, Cape Town, South Africa, October 2008