The HIV Prevention Trials Network (HPTN) 052 Study is a Phase III, two-arm, controlled, open-label, randomized clinical trial designed to determine whether early antiretroviral therapy (ART) can prevent the sexual transmission of human immunodeficiency virus type 1 (HIV-1). A total of 1,763 couples in which one partner was HIV-1-positive and the other was HIV-1-negative were enrolled across four continents, nine countries and thirteen study sites. The HIV-1-positive partner was randomly assigned to one of two arms: “immediate” (early) therapy, with ART initiated upon enrollment plus HIV primary care, or “delayed” therapy, with HIV primary care but ART initiated only when the index case had two consecutive measurements of a CD4+ cell count within or below the range of 200–250 cells/mm3, or developed an AIDS-defining illness. In this paper, we describe several key statistical considerations for the design of this landmark study. Although the observed event rates were lower than expected, which might have compromised the study power, an early release of the trial results in May 2011 showed an overwhelming 96% risk reduction for the immediate therapy in the prevention of genetically linked incident HIV-1 transmissions. Nevertheless, the durability of its long-term effectiveness is yet to be assessed. The HPTN 052 Study is still ongoing and is not expected to be completed until 2015.
In the absence of antiretroviral therapy (ART), human immunodeficiency virus type 1 (HIV-1) leads to inexorable destruction of critical immune cells (CD4+), opportunistic infections, and death. Since its introduction in the late 1990s, highly active ART has dramatically reduced the morbidity and mortality of HIV-1 infection through sustained reduction in HIV-1 viral replication. Nevertheless, such therapy did not cure HIV infection, and viral resistance was expected to develop in most patients on regimens that were not completely suppressive. Although it was widely recognized that initiation of ART should not be delayed beyond when the CD4+ cell counts fell below 200 cells/mm3, there had been great debate for many years regarding when to start ART to optimize the benefit-to-risk profile of treatment and prevention for both HIV-infected individuals and their sexual partners, respectively. The obvious benefits of ART also were weighed against a global shortage of antiviral agents and treatment infrastructure, cost, short- and long-term side effects and severe consequences of non-adherence.
The HIV Prevention Trials Network (HPTN) 052 Study is a randomized clinical trial in serodiscordant couples to determine whether earlier initiation of ART for HIV-infected (index) participants can reduce the short- and long-term risk of sexual transmission of HIV-1 to their HIV-negative partners and also yield better clinical outcomes in the HIV-infected index participants. The primary objective of the study was to compare the rates of genetically linked HIV infection among HIV-negative partners of the index cases in the two study arms: immediate ART (initiated immediately upon enrollment, when the CD4+ counts in the index participant are between 350 and 550 cells/mm3) and delayed ART (delayed until the participant has two consecutive measurements of a CD4+ cell count below 250 cells/mm3, or develops an AIDS-defining illness); participants in both study arms would receive HIV-1 primary care throughout the study. A key secondary endpoint was the clinical outcome of the HIV-infected index participants, including death, World Health Organization (WHO) Stage 4 events, severe bacterial infections and pulmonary tuberculosis.
On April 28, 2011, an independent data and safety monitoring board (DSMB) of the U.S. National Institutes of Health/National Institute of Allergy and Infectious Diseases (NIH/NIAID) Division of AIDS (DAIDS) reviewed the results of an interim analysis of the data collected as of February 21, 2011. Per the DSMB’s recommendation, the interim analysis results were released to the trial participants and the general public on May 12, 2011. They showed an overwhelming 96% risk reduction for the immediate therapy in the prevention of genetically linked incident HIV-1 transmissions []. Since then, all the HIV-positive partners have been offered ART regardless of their CD4+ counts.
Despite its early favorable results for the immediate therapy, the HPTN 052 Study is still ongoing to further assess whether or not the early efficacy is durable. It is expected to complete the follow-up of study participants in 2015. In this article, we present several key statistical considerations during the protocol development, which we believe can be helpful in the design of future similar studies.
The HPTN 052 Study is a Phase III, two-arm, randomized, controlled, open-label, multi-country clinical trial comparing early versus delayed ART strategies for the prevention of HIV transmission in HIV serodiscordant couples and for the reduction of morbidity and mortality in the HIV-infected index participants. The couples to be enrolled were sexual partners, of the same or opposite sex, who were married, had been living together, or considered each other a primary partner. They were required to be sexually active: specifically, each couple had to report having had sex (vaginal or anal) with each other at least 3 times in the last 3 months.
The study was conducted at 13 sites in 9 countries: Gaborone, Botswana; Kisumu, Kenya; Lilongwe and Blantyre, Malawi; Johannesburg and Soweto, South Africa; Harare, Zimbabwe; Rio de Janeiro and Porto Alegre, Brazil; Pune and Chennai, India; Chiang Mai, Thailand; and Boston, MA, United States. Although differences might exist among the countries in health delivery practice, all the clinics selected for this study were HPTN Clinical Trial Units that had been prepared to receive the same site training.
Between April 2005 and May 2010, 1763 serodiscordant couples were enrolled into the study under its protocol versions 2.0 and 3.0. In order to provide data on long-term ART effectiveness and public health utility, all enrolled couples were to be followed up for at least 5 years, and the HPTN 052 Study is expected to be complete in 2015. All versions of the HPTN 052 Study protocol and its amendments can be found at the HPTN Web site (http://www.hptn.org/research_studies/HPTN052StudyDocuments.asp#Protocol).
Genetically linked HIV-1 incident infection occurring in the HIV-negative partners of randomized HIV-infected index participants was the primary prevention endpoint for the study. A complementary analysis would also consider all acquisitions, regardless of their linkage. The effectiveness estimate obtained via this latter analysis would provide a measure of the overall public health effect of ART in the prevention of HIV transmission. Details regarding determination of HIV transmission linkage analyses were provided elsewhere [].
To compare the effectiveness of early versus delayed ART on clinical outcomes in HIV-infected index participants, a primary clinical endpoint was chosen as the earliest event of death, a WHO Stage 4 diagnosis, or a severe bacterial infection or pulmonary tuberculosis. This endpoint reflected the most serious clinical events associated with HIV-1 infection.
The primary prevention endpoint and the primary clinical endpoint are both time-to-event outcomes, which would be analyzed according to a pre-specified time-to-event analysis plan.
Corresponding to the HPTN 052 Study’s secondary objectives, several secondary endpoints were specified. Details of these secondary endpoints and how they were measured are provided in Table 1.
The primary objective of obtaining a long-term comparison between treatment arms presented a major challenge to the sample size and power calculation for the HPTN 052 Study. First, the reduction of HIV rates due to ART initiation might not be constant over time, as HIV-infected patients might not adhere to therapy, fail therapy, or develop resistant HIV variants. Second, for index participants in the delayed arm of the trial, the time post-randomization when ART was initiated would vary since it depends on when the participant had either an AIDS-defining illness or consecutively measured CD4+ cell counts below 250 cells/mm3. Further, the relative reduction of morbidity and mortality in the HIV-infected index partners was also expected to change over time. As a result, any naïve use of conventional methods, for example, the usual Cox proportional hazards model assuming a constant regression parameter, to calculate sample size and power would be problematic.
To account for potentially time-varying ART treatment effectiveness and different ART initiation times in the delayed arm, three main assumptions were made: (1) the “baseline” risk of HIV transmission within a couple whose positive partner was not receiving any ART was expected to decline over time; (2) the ART effectiveness for a positive partner receiving ART might decrease over time during the study follow-up; and (3) the delay time before ART initiation in Arm 2 would have an impact on HIV transmission. To calculate the expected cumulative event rates in the two arms and the associated power, we adopted a two-step procedure:
Specifically, we used the following assumptions to facilitate the sample size and power calculation:
Based on these assumptions, power was calculated (also shown in Table 4) assuming 6.5 years of trial duration, 1.5 years of accrual, and 5% annual loss-to-follow-up per arm. Specifically, for the two scenarios of Assumption 2, in scenario (1), the power was greater than 87% to detect effectiveness > 39%, which amounted to a > 4.9% absolute rate reduction in the cumulative rates (13.2% versus 8.3%). This power was achieved with an upper limit of a 50% risk reduction of acquisition for the partners of index cases who had initiated ART in the delayed arm during follow-up. In scenario (2), the power was 61% to detect a 3.8% absolute rate reduction (14.9% versus 11.1%). If the risk reduction for the partners of those having ART initiated in the delayed arm was more than 25%, the trial would be greatly underpowered. In this case, however, the absolute rate reduction of cumulative HIV rates would be less than 3.1%, which might not be of clinical importance.
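As a rough cross-check of the magnitudes quoted above, the following minimal sketch applies a naive normal-approximation test for the difference of two cumulative rates. It is not the two-step procedure used for the protocol: it ignores staggered accrual, loss to follow-up, and time-varying effectiveness, and it assumes a two-sided 0.05 significance level and an even split of 1750 couples (875 per arm). The function names are hypothetical, and the results are not expected to reproduce Table 4 exactly.

```python
from math import erf, sqrt


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def two_proportion_power(p_delayed: float, p_immediate: float,
                         n_per_arm: int, z_alpha: float = 1.959964) -> float:
    """Approximate power of a z-test comparing two cumulative rates.

    z_alpha defaults to the two-sided 0.05 critical value (an assumption).
    """
    se = sqrt(p_delayed * (1 - p_delayed) / n_per_arm
              + p_immediate * (1 - p_immediate) / n_per_arm)
    return normal_cdf(abs(p_delayed - p_immediate) / se - z_alpha)


N_PER_ARM = 875  # assumption: 1750 couples split evenly between arms

# Cumulative rates quoted in the text for the two scenarios of Assumption 2.
power_scenario_1 = two_proportion_power(0.132, 0.083, N_PER_ARM)
power_scenario_2 = two_proportion_power(0.149, 0.111, N_PER_ARM)

print(f"Scenario 1 (13.2% vs 8.3%):  naive power = {power_scenario_1:.2f}")
print(f"Scenario 2 (14.9% vs 11.1%): naive power = {power_scenario_2:.2f}")
```

Because this sketch omits attrition, its values sit somewhat above the protocol figures; the ordering of the two scenarios is the point of the comparison.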
In addition to assessing the prevention benefit of reducing new HIV-1 infection, for a compelling result that might change international policy and guidelines, the study should also be powered to determine whether or not earlier initiation of ART provided sufficient reduction in serious clinical events among HIV-infected partners when compared to the delayed ART. Specifically, the study should have a high probability that the upper bound of the 95% confidence interval for the true hazard ratio was <0.8 when the true hazard ratio was in the range of a 40% to 50% reduction. It was determined that the 1750 HIV+ individuals enrolled in the HPTN 052 Study would provide at least 80% power to show that early initiation of ART would provide at least a 20% reduction in hazard of serious clinical events when the true hazard ratio was a 40% reduction, assuming an underlying 5-year event rate of 18% with delayed ART compared to 9% with immediate ART – an average relative hazard of 0.5.
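The link between the 5-year event rates and the quoted average relative hazard can be illustrated with a short calculation. The sketch below assumes constant (exponential) hazards in each arm, which is a simplification introduced here and not stated in the protocol; under that assumption the follow-up horizon cancels and the hazard ratio depends only on the two cumulative rates. The function name is hypothetical.

```python
from math import log


def exponential_hazard_ratio(rate_immediate: float, rate_delayed: float) -> float:
    """Hazard ratio implied by two cumulative event rates over the same
    follow-up period, assuming a constant hazard in each arm."""
    # Cumulative rate F = 1 - exp(-hazard * T), so hazard * T = -log(1 - F);
    # the common horizon T cancels in the ratio.
    return log(1.0 - rate_immediate) / log(1.0 - rate_delayed)


# 5-year event rates quoted in the text: 9% (immediate) versus 18% (delayed).
hr = exponential_hazard_ratio(0.09, 0.18)
print(f"Implied hazard ratio: {hr:.3f}")
```

The implied ratio is close to, but slightly below, the average relative hazard of 0.5 quoted in the text.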
Both treatment strategies were expected to differentially affect the immunologic and virologic responses throughout follow-up. It was expected that initially the HIV-1 RNA levels would be lower in the immediate arm compared with the delayed arm. Early in the trial, this would lead to a reduced rate of HIV acquisition in the immediate arm. Later in follow-up, these differences might increase, diminish or even be reversed. Therefore, we were expecting short-term differences in effectiveness with a possible reversal in effectiveness in the longer term. Thus, the study data monitoring plan should balance the need to protect trial participants, while enabling the trial to address its primary objective regarding the evaluation of the relative long-term effectiveness of the two intervention strategies.
The HPTN Study Monitoring Committee (SMC) and the NIH/NIAID DSMB have been monitoring the trial by reviewing the study data at least once per year. The SMC reviews focused mainly on the operational characteristics of the trial and on the overall trial conduct and performance, using data pooled over treatment arms. Since SMC reviews took place prior to DSMB reviews and the SMC review minutes were included in DSMB reports as supplementary materials, this review process allowed the DSMB to focus mostly on the review of by-arm efficacy and safety endpoints.
At study initiation, guidelines were established for monitoring efficacy and safety endpoints at a minimum of three interim analyses and one final analysis, to satisfy the ethical need for early study termination if initial results were extreme while not increasing the chance of false conclusions. Specifically, the guidelines should: (1) address the importance that the trial provided persuasive evidence when considering both treatment and prevention issues, (2) adjust for the nature of interim monitoring that involved repeated testing over time, (3) reflect particular caution given that the benefit-to-risk profile of an immediate ART strategy relative to a delayed ART strategy could change substantially over time, and (4) be driven by the morbidity and mortality events that had the greatest clinical impact. In addressing these requirements, a composite endpoint for each couple was chosen to be the earliest occurrence of death, a WHO Stage 4 event in the index, such as death or extrapulmonary tuberculosis, or transmission of HIV to the partner. A time-to-event analysis would be performed for this Mortality/Morbidity (M/M) composite endpoint. In this trial, it was expected that approximately 340 of the 1750 couples would experience an “event” relative to this M/M composite endpoint.
To guide recommendations about trial termination when interim results on the M/M composite endpoint were favorable for the immediate ART strategy, the “upper boundary” to establish superiority for the immediate ART strategy relative to the delayed ART strategy would be based on an application of the O’Brien-Fleming boundary to preserve the (one-sided) 0.025 false positive error rate relative to the hypothesis:
To guide recommendations about trial termination when interim M/M composite endpoint results were unfavorable for the immediate ART strategy, the “lower boundary” to establish lack of superiority would be based on an application of the O’Brien-Fleming boundary to preserve the (one-sided) 0.025 false negative error rate relative to the hypothesis:
For illustration, Table 5 presents the O’Brien-Fleming boundaries for the relative risk (RR) estimates that would lead to rejection of H0 or H1 at analyses performed when one would have observed 25%, 50%, 75% or 100% of the trial’s expected total of 340 couples experiencing the M/M composite endpoint.
Observe that, to reach the O’Brien-Fleming boundary when interim results on the M/M composite endpoint were favorable for the immediate ART strategy, the delayed ART group would need to have at least 43 excess M/M composite endpoint events (21 in the immediate ART arm versus 64 in the delayed ART arm) at the 25% information fraction, at least 54 excess events (58 in the immediate ART arm versus 112 in the delayed ART arm) at the 50% information fraction, at least 65 excess events (95 in the immediate ART arm versus 160 in the delayed ART arm) at the 75% information fraction, and at least 74 excess events (133 in the immediate ART arm versus 207 in the delayed ART arm) at the 100% information fraction.
Observe that, to reach the O’Brien-Fleming boundary when interim results on the M/M composite endpoint were unfavorable for the immediate ART strategy, the immediate ART arm would need to have at least 15 excess M/M composite endpoint events (50 in the immediate ART arm versus 35 in the delayed ART arm) at the 25% information fraction, at most 6 fewer events (82 in the immediate ART arm versus 88 in the delayed ART arm) at the 50% information fraction, at most 28 fewer events (113 in the immediate ART arm versus 141 in the delayed ART arm) at the 75% information fraction, and at most 50 fewer events (145 in the immediate ART arm versus 195 in the delayed ART arm) at the 100% information fraction.
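The shape of an O’Brien-Fleming boundary across the four planned looks can be sketched as follows. This is an illustration on the standardized (Z) scale only, not a reconstruction of the relative-risk boundaries in Table 5. It assumes four equally spaced analyses and uses the commonly tabulated O’Brien-Fleming final critical value of about 2.024 for a one-sided 0.025 (two-sided 0.05) error rate; the variable and function names are hypothetical.

```python
from math import sqrt

# Assumption: tabulated O'Brien-Fleming final critical value for 4 equally
# spaced looks at one-sided 0.025 (two-sided 0.05).
OBF_FINAL_Z = 2.024
TOTAL_EVENTS = 340                    # expected M/M composite events (from the text)
FRACTIONS = [0.25, 0.50, 0.75, 1.00]  # information fractions (from the text)


def obf_z_boundaries(fractions, final_z):
    """O'Brien-Fleming critical values: final value divided by sqrt(fraction)."""
    return [final_z / sqrt(t) for t in fractions]


events_at_look = [round(TOTAL_EVENTS * t) for t in FRACTIONS]
z_bounds = obf_z_boundaries(FRACTIONS, OBF_FINAL_Z)

for t, d, z in zip(FRACTIONS, events_at_look, z_bounds):
    print(f"information {t:.0%}: {d} events, |Z| boundary = {z:.3f}")
```

The boundary is very stringent at the first look and relaxes toward the conventional critical value at the final analysis, which is why large imbalances in event counts are required to stop early.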
This trial might also be terminated or modified for poor recruitment, adherence, retention, and/or low HIV acquisition rate. The following measures were intended to serve as guidelines for possibly stopping or modifying the study early, pending a final DSMB recommendation.
The study sites were expected to complete recruitment in 18 months. Following completion of the run-in period, the remaining 1668 couples were to be recruited in the full study at a rate of 60 couples per month for the first 6 months and 110 couples per month for the following 12 months. The full study was implemented in a staggered fashion across the study sites due to varying regulatory processes in host countries, which would result in different enrollment start dates for the sites. Such differences would be taken into account in the recruitment rate calculation. Stopping or modifying the study might be considered if the study team failed to recruit more than 75% of the above targeted rates.
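The recruitment guideline above can be expressed as a simple check. The sketch below is a minimal illustration that treats all sites as starting together, so it does not include the adjustment for staggered site start dates mentioned in the text; the function names and the example enrollment counts are hypothetical.

```python
def target_cumulative_enrollment(month: int) -> int:
    """Cumulative recruitment target: 60 couples/month for months 1-6,
    then 110 couples/month for months 7-18."""
    month = max(month, 0)
    first_phase = min(month, 6) * 60
    second_phase = max(min(month, 18) - 6, 0) * 110
    return first_phase + second_phase


def recruitment_flag(month: int, enrolled: int, threshold: float = 0.75) -> bool:
    """True if enrollment has not exceeded 75% of the cumulative target."""
    return enrolled <= threshold * target_cumulative_enrollment(month)


# The 18-month target slightly exceeds the 1668 couples still to be recruited.
print(target_cumulative_enrollment(18))

# Hypothetical example: 700 couples enrolled by month 12 (target 1020).
print(recruitment_flag(12, 700))
```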
Based on the expected incidence of HIV transmission, the target retention rate for the study was 98% per year, i.e., 2% of loss-to-follow-up per year. Stopping or modifying the study might be considered if the study team failed to retain more than 96% of couples per year, i.e., 4% of loss-to-follow-up per year. Differential loss-to-follow-up by study arms and sites should be reviewed carefully, since participants might choose to leave the study if treatment appeared to fail and/or if other treatments became available. When the loss-to-follow-up was treated as censored, this type of informative censoring could seriously bias the study’s primary analysis if not properly adjusted. This retention guideline might be modified if the baseline incidence was determined to be much lower/higher than expected.
The primary objective of this study might not be addressed if the time to ART initiation in the delayed arm was too short. It was expected that ART would be initiated 2 to 3 years (median 2.8 years) after enrollment of participants in the delayed ART arm. Stopping or modifying the study might be considered if the median delay time was less than 1 year. This delay time guideline should be evaluated in light of the safety and efficacy endpoint data collected, since the expected differences in HIV transmission relied on assumptions about ART effectiveness over time.
The overall ART benefits depended on adherence to the regimens prescribed to suppress viral loads. Direct and indirect measures of adherence would be reviewed:
Pooled (across arms) rate of HIV acquisition would be monitored. Stopping or modifying the study might be considered if the upper boundary of an 80% confidence interval for the pooled rate of HIV acquisition was smaller than the expected rate.
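One way such a low-incidence check could be carried out is sketched below. The text does not say how the 80% confidence interval was to be computed; this illustration assumes a normal approximation on the log scale for a Poisson rate (standard error of the log rate taken as one over the square root of the event count). The function names and all numbers in the example are hypothetical, and the approximation requires at least one observed event.

```python
from math import exp, sqrt
from statistics import NormalDist


def rate_upper_bound(events: int, person_years: float, level: float = 0.80) -> float:
    """Upper limit of a two-sided confidence interval for an incidence rate,
    using a log-scale normal approximation (assumption; requires events > 0)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    rate = events / person_years
    return rate * exp(z / sqrt(events))


def low_incidence_flag(events: int, person_years: float, expected_rate: float) -> bool:
    """True if even the upper 80% limit falls below the expected pooled rate."""
    return rate_upper_bound(events, person_years) < expected_rate


# Hypothetical example: 25 acquisitions over 1000 couple-years of follow-up,
# compared with a hypothetical expected pooled rate of 5 per 100 couple-years.
upper = rate_upper_bound(25, 1000.0)
print(f"Upper 80% limit: {upper:.4f} per couple-year")
print(low_incidence_flag(25, 1000.0, expected_rate=0.05))
```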
As of the writing of this paper, all of the HIV-infected partners in the HPTN 052 Study have been offered ART. Monitored by the HPTN SMC and the NIH/NIAID DSMB, the HPTN 052 Study is actively following up all study participants as planned, with completion expected in 2015.
The primary objective of the HPTN 052 Study was to compare, for the prevention of HIV in serodiscordant couples, two ART management strategies: the immediate strategy, where ART is initiated immediately following the enrollment of index partners with CD4+ counts between 350 and 550 cells/mm3, and the delayed strategy, where ART is initiated once index partners’ CD4+ counts drop to between 200 and 250 cells/mm3 or an AIDS-defining illness occurs during follow-up. To address this objective, the study was designed as a prospective Phase III two-arm randomized trial with a relatively long period of follow-up. By this design, it was expected that in the short term the immediate strategy would be better than the delayed strategy in reducing the risk of HIV transmission in a serodiscordant couple, given that the virus might be significantly suppressed immediately following ART initiation in the index partner. However, the long-term benefit-to-risk ratio of this strategy has yet to be determined. In fact, it was expected that the delayed strategy might lead to improved viral suppression in the longer term. That is, the hazard functions of these two strategies may very well cross over a relatively long period of follow-up.
Crossing of hazard functions poses a challenge for sample size determination for time-to-event outcomes when conventional methods based on the log-rank test or the Cox proportional hazards model are applied naïvely. For the HPTN 052 Study, we addressed this issue by: (1) comparing the cumulative HIV incidence rates instead of using the usual hazard ratios to estimate treatment effectiveness; (2) assuming non-constant effectiveness of ART in reducing the risk of HIV acquisition and allowing it to vary over time with reduced magnitude to reflect the expectation that ART effectiveness might diminish gradually over time; and (3) including the delay time as the so-called “change point” [] in our calculations for the expected cumulative HIV incidence rates. Since we did not know individual CD4+ counts a priori, we calculated a distribution for the ART initiation rate, which was similar to the latent treatment effectiveness lag time discussed in Chen et al. (2002) []. Overall, the above strategies allowed us to compare the risk of HIV acquisition during follow-up while taking into account possible crossover in hazard functions. The strategies were based on a limited number of assumptions resulting from substantial consultation with field experts and the literature; however, they were not intended to cover all the possible types of crossover in hazard functions. For example, the timing of ART initiation was assumed to be independent of the duration of partnership. More sophisticated modeling of ART initiation time based on additional characteristics might improve the calculation of the delay time distribution.
For the sample size calculations, we assumed a homogeneous risk of HIV acquisition, that is, the risk of acquisition for partners of those with a CD4+ cell count dropping below 250 cells/mm3 and/or developing AIDS-defining illnesses was similar to that for partners of those with a CD4+ cell count above 250 cells/mm3 and no AIDS-defining illnesses. This assumption could be restrictive. However, if in the absence of ART, the number of infected partners with a CD4+ cell count dropping below 250 cells/mm3 and/or experiencing AIDS-defining illnesses was increasing rapidly over time, the overall rates would greatly increase over time. This increase might contradict Assumption 1, which was based on observed data from the literature.
The annual HIV incidence rates of acquisition in Assumption 1 did not necessarily reflect the “true” incidence rates of the HPTN 052 Study population. Higher HIV rates would lead to an increase in power. For example, if the annual rates in Table 2 were 7%, 7%, 5%, 5%, and 2% for Years 1 to 5, respectively, the power values under Scenario 1 in Table 7 would all be above 90%. For Scenario 2, moderate power between 50% and 75% would be achieved if the decrease in risk of acquisition for partners of those in the delayed arm who initiated ART was 25% or 35%.
As of May 12, 2011, when the study results were first disclosed, the observed annual HIV incidence rates for Years 1 to 5 were 0.2%, 0.2%, 0.5%, 0% and 0% for the immediate arm, and 2.2%, 2.1%, 2.0%, 7.7% and 0% for the delayed arm. These incidence rates were somewhat below the expected ones, except for the last two years, for which follow-up data were still too limited for the estimates to be reliable. Although the lower rates might have compromised the calculated power, the actual treatment difference was so overwhelming that the interim analysis still led to the early disclosure of results.
In addition to the prevention outcomes for the partners, the HPTN 052 Study also gave significant weight to treatment outcomes of the index participants under the immediate and delayed strategies. Finally, extra caution was needed for interim monitoring of the trial since early results would not provide reliable insights into the long-term benefit-to-risk ratio associated with both treatment and prevention outcomes. Hence, our proposed monitoring guidelines, based on a composite of treatment and prevention endpoints, were intended to ensure that the trial would provide persuasive evidence regarding both treatment and prevention issues.
Although the main objective of the HPTN 052 Study was to estimate and compare the effectiveness of the two treatment strategies, treatment effect on behaviors, for instance, was part of the intervention that needed to be included in the assessment of effectiveness. The unblinded nature of its design might allow for the proper and timely clinical management of the index cases. However, statistical bias in response measurements, conscious or unconscious, could occur in particular for the self-reported behavioral and safety data (i.e., symptoms), which would be carefully monitored and investigated.
Nevertheless, the HPTN 052 Study was intended to measure the long-term effectiveness of immediate versus delayed strategies. Although the early release of interim analysis results in favor of the immediate therapy led the remaining ART-naïve index partners in the delayed arm to initiate ART sooner than expected, we observed that there was a sufficient contrast in delay time between the two arms. As a result, the HPTN 052 Study will continue its course to assess whether or not the early treatment efficacy is durable over the longer term.
The HPTN 052 Study is supported by the HIV Prevention Trials Network (HPTN) and by grants (UM1-AI068619 and U01-AI068619; and UM1-AI068617 and U01-AI068617, to the HPTN Statistical and Data Management Center) from the NIH/NIAID.