Few HIV prevention interventions have been evaluated in randomized controlled trials (RCTs). We examined design, implementation, and contextual considerations that may limit detection of a positive or adverse effect in HIV prevention trials.
We conducted a systematic review of late-phase RCTs for prevention of sexual transmission of HIV that 1) randomly allocated intervention and comparison groups; 2) evaluated interventions to prevent sexual transmission in non-pregnant populations; and 3) reported HIV incidence as the primary or secondary outcome.
PubMed/MEDLINE, other electronic databases, and electronic conference proceedings of recent HIV/AIDS-related conferences were searched to identify published or unpublished trials meeting the inclusion criteria. Descriptive, methodological, and contextual factors were abstracted from each trial.
The review included 37 HIV prevention RCTs reporting on 39 unique interventions. Only six RCTs, all evaluating biomedical interventions, demonstrated definitive effects on HIV incidence. Five of the six RCTs significantly reduced HIV infection: all three male circumcision trials, one trial of STI treatment and care, and one vaccine trial. One microbicide trial of nonoxynol-9 gel produced adverse results. Lack of statistical power, poor adherence, and diluted versions of the intervention in comparison groups may have been important issues for the other trials that demonstrated “flat” results.
Almost 90% of HIV prevention trials had “flat” results, which may be attributable to trial design and/or implementation. The HIV prevention community must not only examine evidence from significant RCTs, but must also examine flat trials, and address design and implementation issues that limit detection of an effect.
The global need for effective HIV prevention programs has never been more urgent. Although the number of people receiving antiretroviral drugs in low- and middle-income countries increased 10-fold in the last six years, new infections in 2007 outpaced antiretroviral therapy uptake by a margin of five to two.[4, 5] Randomized controlled trials (RCTs) are generally considered the gold standard to define the evidence base for HIV prevention programs and policies. However, only one in seven RCTs of interventions to prevent sexual transmission of HIV has shown efficacy.[6-12] In fact, the overwhelming majority of completed RCTs are “flat” – unable to demonstrate either a positive or adverse effect.
Flat results may occur for three reasons. The underlying concept may be flawed; the concept may be sound, but the specific intervention approach may be ineffective; or, especially for interventions postulated to have only modest impact, aspects of the study design, implementation, or context may limit detection of a true effect. RCT results due to the first two situations have been critically important to advancing HIV prevention science.[6, 7, 10, 11] The third squanders resources and costs lives. Our review addresses this last phenomenon.
We conducted a systematic literature review of phase IIb or III RCTs for prevention of sexual transmission of HIV. We searched PubMed/MEDLINE, EMBASE, the Cochrane Library including the Cochrane Central Register of Controlled Trials (CENTRAL), and Web of Science for articles meeting our inclusion criteria as of December 6, 2009. There were no language restrictions to the search. We developed a customized search strategy for each database relying on the database's controlled vocabulary or index (e.g., medical subject headings (MeSH)) or free text terms. In most cases, search strategies combined terms for (1) HIV infection, (2) incidence or hazard, (3) prevention, and (4) study design restrictions (randomized controlled designs). Appropriate MeSH or free text terms were also included to exclude trials that focused on the prevention of mother-to-child transmission of HIV. In PubMed/MEDLINE, we searched for clinical trials using Cochrane's “Highly Sensitive Search Strategy” for identifying randomized controlled trials. Search strategies for each database are available from the authors.
We conducted a cited reference search with key articles, scanned the reference lists of eligible articles and of recent commentaries and reviews for potentially relevant trials, and searched the electronic conference proceedings of recent HIV/AIDS-related conferences (Conference on Retroviruses and Opportunistic Infections, International Society for Sexually Transmitted Disease (STD) Research annual meetings, and International AIDS Society annual meetings). ClinicalTrials.gov was searched to identify ongoing or recently completed trials. We contacted study authors and reviewed secondary articles to obtain more information about the trials, as needed, and contacted colleagues for additional information not revealed in our search. The broad inclusion criteria and the resulting heterogeneity of interventions precluded the use of meta-analytic methods.
Eligible trials were those that 1) used randomized controlled designs (individual or community); 2) evaluated interventions focusing on sexual transmission of HIV in non-pregnant populations; and 3) reported HIV incidence as the primary or secondary outcome. Trials that assessed HIV infection as part of an aggregate STD outcome (i.e. any incident STD at follow-up) were excluded.[14-18] We first examined the citations from the literature search to eliminate obviously ineligible studies (e.g., non-randomized studies, studies without biological outcomes, inappropriate article types such as reviews or commentaries). Abstracts were specifically searched for mention of an intervention tested against a control intervention with biological outcomes. Report of any STD outcome in the abstract such as incident gonorrhea or chlamydia infections automatically warranted a full length review of the article to determine if HIV testing was performed. We then conducted a detailed manual review of full length articles to determine eligibility.
All authors independently reviewed and abstracted data from each eligible article or abstract. Discrepancies among the authors were identified and resolved. In two cases, multiple interventions were described in a single article (i.e., multiple treatment arms with different interventions and a single control).[19, 20] In these instances, each intervention was counted separately as an appropriate comparison could be made to the control. We abstracted descriptive information about each trial and the most adjusted measure of effect on HIV infection and 95% confidence interval (CI), giving preference to intention-to-treat analyses. We computed the measure of effect using standard methods when only the risk or rate was presented stratified by study arm (e.g., incidence in the treatment group divided by the incidence in the comparison group). We also abstracted the measure of effect in pre-planned subgroup analyses among those studies for which we were aware that such analyses had been conducted.
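The review does not specify which "standard methods" were used to compute a measure of effect from arm-level data. As a minimal sketch, assuming event counts and person-years are available for each arm, an incidence rate ratio with a Wald confidence interval on the log scale could be computed as follows. The function name and the counts are hypothetical and are not taken from any trial in this review.

```python
import math


def rate_ratio(events_tx, py_tx, events_ctl, py_ctl, z=1.96):
    """Incidence rate ratio (treatment / comparison) with a Wald CI on the log scale.

    Assumes event counts and person-years are known for each arm; this is one
    common approach, not necessarily the one used by the review authors.
    """
    if events_tx <= 0 or events_ctl <= 0:
        raise ValueError("the Wald interval needs at least one event in each arm")
    irr = (events_tx / py_tx) / (events_ctl / py_ctl)
    # Standard error of log(IRR) under a Poisson assumption for the event counts.
    se_log = math.sqrt(1 / events_tx + 1 / events_ctl)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical counts for illustration only.
irr, lower, upper = rate_ratio(events_tx=20, py_tx=2000, events_ctl=40, py_ctl=2000)
print(f"IRR={irr:.2f}, 95% CI: {lower:.2f}, {upper:.2f}")
```

An unadjusted ratio of this kind ignores clustering, so for community randomized trials it would understate the width of the interval.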
We categorized trials by type of prevention intervention (e.g. behavioral, microfinance, diaphragm, microbicides, pre-exposure prophylaxis (PrEP), male circumcision, sexually transmitted infection (STI) treatment, and vaccines) and their effect on HIV incidence. “Positive” intervention trials significantly reduced the risk of HIV infection in the intervention arm compared to the control arm, whereas “adverse” trials significantly increased HIV risk. Interventions with “no effect” were those in which no statistically significant effect on HIV acquisition or transmission was reported (positive or adverse), thus the null hypothesis could not be rejected. Although we examined effect size and precision of the measures of effect (usually an incidence rate ratio (IRR) or hazard ratio (HR)), we considered statistical significance of the effect measure (α=0.05) as the definitive measure of whether an intervention was positive, adverse, or had no effect.
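The classification rule above can be expressed as a short sketch. It assumes that a 95% confidence interval for a ratio measure (IRR or HR) excluding 1 is equivalent to significance at α=0.05, and that ratios below 1 favor the intervention. The function name is hypothetical; the intervals in the usage lines are those quoted in this review's text and serve only as example inputs.

```python
def classify_effect(lower, upper, null_value=1.0):
    """Label a ratio measure as 'positive', 'adverse', or 'no effect'.

    Assumes (lower, upper) is a 95% CI for an IRR or HR where values below
    the null favor the intervention. This mirrors, but is not, the authors'
    own procedure.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if upper < null_value:
        return "positive"   # whole interval below 1: significantly reduced risk
    if lower > null_value:
        return "adverse"    # whole interval above 1: significantly increased risk
    return "no effect"      # interval includes 1: null hypothesis not rejected


# Intervals as quoted in the text; the second and third come from subgroup
# analyses and are shown only to exercise each branch.
print(classify_effect(0.5, 1.1))    # PRO 2000/5 gel, intent-to-treat
print(classify_effect(0.19, 0.89))  # Masaka subgroup, sexually active women
print(classify_effect(1.5, 9.3))    # Step Study subgroup, uncircumcised men
```

Note that the review classified each trial on its overall result, so a significant subgroup interval did not change a trial's "flat" designation.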
We also considered a broad set of methodological and contextual factors to identify specific issues in the design, conduct, or environment in which the trials were conducted that might have contributed to their results. We compared the projected and observed HIV incidence in the comparison arm (per 100 person years or annual cumulative incidence) either given by the authors or computed using standard methods. The level of adherence achieved in both arms of the study was also abstracted. In general, adherence in the intervention arms was defined as attendance at intervention activities for behavioral trials, the proportion of sex acts where the study product was used for diaphragm and microbicide trials, the proportion of prescribed doses completed for PrEP and for STI treatment or suppression trials, or the proportion of participants receiving all doses in vaccine trials. Adherence in the control arm was defined similarly for the placebo product or prevention intervention offered to the control group.
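Comparing projected and observed incidence requires putting both on the same scale. The following is a minimal sketch, assuming a constant hazard over the year when converting annual cumulative incidence to a rate per 100 person-years. The function names are hypothetical, and the example values are the nonoxynol-9 gel figures reported in this review (5% annually predicted vs. 10.3 per 100 person-years observed).

```python
import math


def rate_per_100_py(events, person_years):
    """Incidence rate per 100 person-years from raw counts."""
    return 100.0 * events / person_years


def annual_risk_to_rate_per_100_py(annual_risk):
    """Convert annual cumulative incidence (a proportion) to a rate per 100 PY.

    Assumes a constant hazard over the year, so risk = 1 - exp(-rate).
    """
    if not 0.0 <= annual_risk < 1.0:
        raise ValueError("annual_risk must be in [0, 1)")
    return -100.0 * math.log(1.0 - annual_risk)


def observed_to_projected(observed_rate, projected_rate):
    """Ratio above 1 means more infections than planned; below 1 means fewer."""
    return observed_rate / projected_rate


projected = annual_risk_to_rate_per_100_py(0.05)
ratio = observed_to_projected(10.3, projected)
print(f"projected={projected:.2f} per 100 PY, observed/projected={ratio:.2f}")
```

At low incidence the conversion changes little, which is why annual cumulative incidence and rates per 100 person-years are often compared directly.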
In all instances, the intervention arm examined the marginal benefit of the intervention over and above other prevention services in both the intervention and control arm. We assessed whether these services, often offered to both the intervention and control arms, exceeded the local community standard of care in terms of basic components of HIV prevention such as risk-reduction counseling or HIV education, provision of condoms and related counseling, and availability of STI treatment services or case management. Comparison arms in which all of these components were augmented beyond the local, existing standard of HIV prevention were classified as “exceptional” prevention packages. Comparison arms in which at least one of these components was added or strengthened were classified as “enhanced” prevention, whereas those with nothing above the local standard were classified as “standard of care.” In addition, regardless of the intensity of the prevention services offered, in some cases the type of preventive services provided to the control arm overlapped with the primary intervention being tested, albeit in a diluted form (e.g., for behavioral interventions, improved VCT in the control group compared to enhanced VCT combined with other behavioral interventions in the intervention group). Trials with control groups that overlapped with the main intervention in this way were highlighted as “overlapping”. Finally, to assess the impact of these improved prevention services, we evaluated whether risk behavior decreased in both arms of the study compared to baseline measurements.
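The comparison-arm categories described above amount to a simple decision rule, sketched below. The class and field names are hypothetical; the three components and the "exceptional", "enhanced", "standard of care", and "overlapping" labels follow the definitions in the preceding paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparisonArm:
    """Whether each basic prevention component exceeded the local standard."""
    counseling_augmented: bool        # risk-reduction counseling or HIV education
    condoms_augmented: bool           # condom provision and related counseling
    sti_services_augmented: bool      # STI treatment services or case management
    overlaps_with_intervention: bool = False  # diluted form of the tested intervention


def classify_package(arm):
    """Return the intensity label for a comparison arm."""
    components = (
        arm.counseling_augmented,
        arm.condoms_augmented,
        arm.sti_services_augmented,
    )
    if all(components):
        return "exceptional"
    if any(components):
        return "enhanced"
    return "standard of care"


def describe(arm):
    """Intensity label, with 'overlapping' appended when the flag is set."""
    label = classify_package(arm)
    return f"{label}, overlapping" if arm.overlaps_with_intervention else label


# Hypothetical comparison arm: condoms and counseling strengthened, STI care unchanged.
print(describe(ComparisonArm(True, True, False)))
```

Overlap is kept as a separate flag because, as defined above, it is assessed regardless of the intensity of the package.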
The results of the literature search are presented in Figure 1. We identified 3,616 unique citations from electronic databases of which 3,555 were excluded based on title examination and 27 were excluded based on abstract-level review. Thirty-four full-length articles were reviewed in detail. Twenty-nine articles from the literature search met our inclusion criteria. Eight additional studies were identified through reference list examinations, reviews of conference abstracts, or other sources.[7, 22-25] Thus, 37 HIV prevention RCTs were included in this review reporting on 39 unique interventions to prevent sexual transmission of HIV (Table 1). Five studies included in the review were unpublished and obtained from conference presentations or other sources.[20, 25-29] Here we summarize characteristics of the prevention trials and then discuss the most salient factors associated with RCTs demonstrating significant results and those with flat results, by type of intervention.
The 37 trials were conducted over approximately 20 years from 1987 to 2009 at a rapidly accelerating pace with more than 80% being initiated since 1995 (Table 1). The majority were implemented partially or entirely in southern Africa (n=32) and employed individually randomized designs (n=26). Six RCTs included sites in the Americas [20, 30-34], three enrolled participants in Asia [35-37], and one trial each involved populations in the Netherlands and Australia.[33, 34] Nine community randomized controlled trials (C-RCTs) evaluated behavioral, microfinance, and STI treatment interventions and constituted more than half of the 16 RCTs in these categories.[19, 27, 38-44] The two remaining RCTs examined intervention impact on HIV transmission in discordant couples.[29, 45]
The largest group of RCTs (n=17) was conducted exclusively in women: 13 in female sex workers (FSWs) or women at high risk of HIV infection [31, 35, 36, 46-55] and four in sexually active women in the general population.[20, 25, 56, 57] Sixteen studies evaluated outcomes in both males and females, including three RCTs in adolescents.[26, 27, 38, 40] Together the trials enrolled nearly 160,000 participants.
Six RCTs, all examining biomedical interventions, delivered definitive results on HIV infection (Table 2).[36, 42, 58-61] Five of the six trials were positive: three RCTs of male circumcision (all of which assessed female-to-male HIV transmission) [58-60], one RCT of enhanced STI management, and one vaccine trial. In all three male circumcision trials, a substantial effect size was observed, ranging from a 51% to a 61% reduction in HIV incidence.[58-60] The trial of improved STI management in the Mwanza region of Tanzania demonstrated a 42% reduction in HIV incidence. The prime boost combination of ALVAC-HIV, a recombinant canarypox vectored vaccine, and AIDSVAX B/E, a recombinant gp120 vaccine, lowered the rate of HIV acquisition by 31% among Thai volunteers, although there was no reduction in post-infection HIV plasma viral load. One of the six significant RCTs, a microbicide trial of nonoxynol-9 gel, produced adverse results with a substantial effect size: a 50% increase in HIV incidence.
Statistical power was not a major issue in any of these six trials (Table 3).[36, 42, 58-60] However, despite the large sample size (16,402) in the ALVAC/AIDSVAX vaccine trial, the low HIV incidence among this sample of the general population limited study power. Both the strict intent-to-treat and per-protocol analyses yielded a non-significant 26% reduction in HIV incidence, and even the significant efficacy documented in the modified intent-to-treat analysis had such a wide confidence interval that questions have been raised about the interpretability of the results. Five of the six studies were powered based on fairly close approximations of the observed HIV incidence in the comparison arm [42, 58-60], and in the nonoxynol-9 gel trial, the observed incidence in the comparison arm was approximately double what had been anticipated (5% annually predicted vs. 10.3 per 100 person-years observed).
Similarly, poor adherence was not a key factor in most of the significant trials. Male circumcision is a one-time, directly administered procedure for which adherence issues are limited to refraining from intercourse during healing [7, 8, 11] and likewise, adherence was relatively high in the microbicide (70-72%) and vaccine (75-78%) trials.[36, 62] In addition, the male circumcision and vaccine trials share the benefit of having directly observed interventions, substantially eliminating the problem of reporting bias. Although adherence was not reported in the STI treatment trial, it is likely to have been substantial because trial participants were symptomatic patients seeking STI care and health educators visited villages to encourage prompt treatment for symptomatic STIs.
In none of these trials did the control group receive a dilute version of the intervention being tested (i.e. no overlap). Male circumcision, nonoxynol-9 gel and vaccine were not provided in any form to controls, and no additional STI training, drugs, or supervisory or educational visits were provided to the control communities in the Mwanza trial. However, in all of these trials except the STI treatment trial, the control group received an enhanced or exceptional prevention package beyond the local standard of care.
All seven behavioral interventions yielded flat results. However, in the Masaka, Uganda community randomized trial, a subgroup analysis at the individual level found significant protective effects among sexually active women (IRR=0.41, 95% CI: 0.19, 0.89).[19, 63] Of the four studies that reported both projected and observed HIV incidence in the control arm, all but one observed incidence rates that were substantially lower than original projections. Only in Project EXPLORE did the projected HIV incidence in the control arm accurately reflect that observed in the trial. Thus, overall, low statistical power may have limited the ability to detect an effect.
A central issue common to six of these seven trials is that some form of the intervention overlapped with that provided to the control group, albeit in more dilute form. With one exception (Regai Dzive Shiri) [26, 27], each of these studies offered some combination of risk reduction counseling, condom promotion, and referral and treatment for STIs that exceeded the local standard of care. The effects of doing so are apparent in the four trials that measured changes in risk behaviors, with all four noting declines in both arms.[19, 30, 40, 64]
The Regai Dzive Shiri youth intervention was a community RCT (see footnote); no prevention services were offered to the comparison communities and uptake of the intervention was low (41%) among survey participants. Adherence was also poor in the Stepping Stones trial, with 44% of participants reporting attendance at 75% or more of the Stepping Stones sessions.[27, 40] Adherence ranged from 70% to 75% in the other two trials for which adherence data were reported, Project EXPLORE and the Zimbabwean workplace intervention.
To date, there has only been one microfinance RCT that has examined HIV endpoints and there was no effect on HIV incidence in study communities. This may be due, in part, to the indirect nature of the intervention, which was directly offered to women micro-financers whereas the effect on HIV was expected to diffuse to younger women in the populations, which might take years to manifest. The observed HIV incidence was greater than that projected, adherence was moderate (65%), and there was no change in behavior in either study arm.
Eleven of the 12 trials of microbicides and the latex diaphragm trial demonstrated flat results. However, of note, 0.5% PRO 2000/5 gel demonstrated a 30% reduction in incidence compared to placebo gel (HR=0.7, 95% CI: 0.5, 1.1, p=0.10) in the intent-to-treat analysis and a 36% reduction compared to no gel (p = 0.04) in the per protocol analysis. In a larger study, PRO 2000 gel had no effect on HIV incidence. Although the overall estimates for the cellulose sulfate and SAVVY (C31G) gel trials were non-significant, subgroup analyses revealed significant adverse results in the interim and per protocol analyses of the cellulose sulfate trial and among women in the SAVVY trial with greater than median coital frequency and greater than median frequency of gel use.
Reduced power to detect an effect resulting from a lower than expected incidence of new infections was apparent in six of the seven flat microbicide trials that reported this information. Adherence ranged between 73-96% among those studies for which adherence was reported (the lower bound reported in the diaphragm trial), with only two achieving rates of at least 85% in the intervention arm.[35, 57] In addition, results from the diaphragm trial indicated differential condom use over the course of the trial with an average of 53% in the intervention arm and 85% in the control. Although none of these studies offered a dilute, overlapping version of the intervention to controls, all of the studies offered exceptional prevention interventions to both arms. With one exception, risk taking behavior decreased in both study arms in all of the trials that reported these data.
Only one trial of pre-exposure prophylaxis (PrEP) using ARVs for HIV prevention has been completed. The observed annual HIV incidence was roughly half of what was anticipated, and premature closure of the trial markedly reduced study power. Adherence rates shy of 70% were observed in both arms. Enhanced prevention services were provided to controls and risk behavior was reduced in both study arms throughout the trial.
A fourth circumcision trial examined HIV transmission to female partners of HIV-infected men who were enrolled in the circumcision RCT in Rakai, Uganda. Overall, no significant reduction in HIV incidence was observed, but HIV acquisition was increased in the subgroup of female partners of men who resumed sexual activity early before complete wound healing compared to those who delayed resumption of sexual activity (RR=2.92, 95% CI: 1.02, 8.46, p=0.06).
Eight of the nine trials of STI treatment for HIV prevention delivered flat results, although one study found a significant effect on HIV incidence in a subgroup of males who attended program meetings (adjusted IRR=0.48, 95% CI: 0.24, 0.98, p=0.04). Five RCTs evaluated various approaches to improved management of curable STIs, including two of syndromic STI management in the general population (both community RCTs) [19, 44], two of periodic presumptive therapy (one a C-RCT in the general population and the other an individual RCT in female sex workers (FSWs)), and one individual RCT of intensive, microscopy-assisted STI screening in FSWs. The remaining three RCTs tested acyclovir suppressive therapy in both high and low risk populations.[29, 32, 55]
Several factors probably contributed to the striking contrast between the positive results of the Mwanza STI treatment trial and the five flat RCTs that targeted curable STIs. Although low power, poor adherence, overlapping interventions, or other enhanced prevention services in the comparison communities did not appear to be problems in the Mwanza trial, at least one of these issues arose in each of the five flat trials. In addition, observed HIV incidence fell short of that projected in all four of the RCTs that reported this information [19, 43, 44, 54] and adherence was a moderate 81% in the fifth trial. Perhaps more importantly, like the behavioral interventions, all of these trials offered enhanced or exceptional prevention services to controls, including improved STI services (constituting overlap with the primary intervention being tested) in three of the five RCTs.[43, 53, 54] Indeed, these control arm interventions were reflected in decreases in risk behaviors in both arms of all five RCTs. Finally, the Mwanza trial was implemented in an earlier phase HIV epidemic than was the case for the five flat trials of treatment of curable STIs, all of which were conducted in late-phase, generalized epidemics when genital herpes had largely replaced curable etiologies of genital ulcers while rates of other curable STIs had fallen substantially in the general population.
Although epidemic phase was not a concern in the three acyclovir suppression trials, adherence was a challenge, with the proportion of participants reporting taking >90% of pills ranging from 51% to 73% in two of the three RCTs. In addition, exceptional HIV prevention services were available to controls with attendant reductions in risk behaviors in all trial arms.[29, 32, 55] However, growing data suggest that in these trials it is likely that the 400 mg BID acyclovir regimen tested was not capable of extinguishing persistent immune activation or other biological mechanisms triggered by HSV infection, which could increase susceptibility to HIV.[65, 66]
Three vaccine trials yielded flat results [33, 34, 37, 67]; however, subgroup analyses in the Step Study of the Merck Ad5 gag/pol/nef vaccine suggested significant adverse effects among uncircumcised men (HR=3.8, 95% CI 1.5, 9.3) and those with high pre-existing Ad5 antibody titers (HR=2.3, 95% CI 1.2, 4.3). However, the increased risk of HIV seroconversion associated with preexisting immunity to the Ad5 vector did not persist in an interim analysis of Step Study participants after unblinding, and it was not confirmed in analyses of the Phambili trial of the same candidate in South Africa.[67-69] Of note, vaccinations in the Phambili trial were stopped prematurely when the results of the Step trial became available and, as a result, the study accrued very limited numbers of HIV seroconverters prior to unblinding, limiting its power to assess this association.
Large sample size and an endpoint-driven design provided adequate power in the one vaccine RCT that was not terminated early. Adherence was also not an issue in these RCTs, as vaccine was directly administered and completion of the vaccine series was high (82-90%) in the studies reporting these data. However, enhanced or exceptional HIV prevention services were offered to controls in all three trials, with decreases in risk behaviors in both study arms in the two trials reporting this information.[33, 34]
HIV prevention can work, as demonstrated by successes in Thailand and Uganda and now the promise of male circumcision.[58-60, 70, 71] Yet in the face of what continues to be one of the most devastating pandemics we have ever known, new, evidence-based approaches are urgently needed. In this context, the fact that almost 90% of RCTs of interventions for prevention of sexual transmission of HIV have delivered flat results demands careful analysis. This review revealed that the majority of flat RCTs are attributable, at least in part, to issues related to trial design and/or implementation. These issues must be addressed in future intervention research.
A key issue in evaluating RCTs is the nature of the control group, and in particular, the intensity of prevention services offered to both study arms.[72-75] Enhanced or exceptional HIV prevention programs introduced in the control arm that exceed the community standard of care (and may even include a diluted version of the primary intervention) are rarely sustainable after trial completion, and their intensity may dramatically reduce the ability to detect the effect of a new and effective intervention. Furthermore, in these cases, the results have little external validity because the comparison does not represent the effect of the intervention compared to the condition actually experienced by individuals in the community who could not avail themselves of the services offered to the control group.
Trials in which investigators or institutional review boards often feel obligated to provide controls with enhancements that are dilute versions of the intervention being tested (such as RCTs of behavioral or STI treatment interventions) are particularly vulnerable to flat results. In contrast, RCTs of new biomedical interventions such as vaccines, microbicides or male circumcision rarely offer dilute forms of the intervention in the comparison group. This makes distinguishing an ineffective intervention from design issues more straightforward. Indeed, in all the trials that had a significant effect (positive or adverse), the control group did not receive a prevention package that resembled the main intervention.
The ethical issues of offering enhanced HIV prevention services in the comparison arm must be weighed against the ethical issues of lengthy and expensive prevention trials that provide the control group with an unsustainable level of prevention services that does not reflect community standards. Further, such trials may jeopardize our ability to identify and offer participants and at-risk individuals around the world additional effective HIV prevention options. Stepped-wedge designs are one approach that, when appropriate, may help address this ethical dilemma. Similarly, a reliable incidence assay might obviate the need for prospectively following controls, in which case the need to offer additional services over the course of the trial might be moot.
We also examined changes in risk behavior over the course of the trial in the intervention and control arms and found that in most cases, risk-taking behavior was reduced in both. Some of this change may be attributable to enhanced prevention services offered in the trial. A “Hawthorne effect”, in which simply participating in the study and being followed produces a positive result, might also have contributed. For example, in a prospective cohort study of female sex workers in Kenya, HIV-1 incidence declined 10-fold during 3 years of follow-up. This phenomenon further attenuates the ability to detect the marginal benefit of the new intervention, especially if it is postulated to have only a modest effect.
Much has been written about unexpectedly low incidence in trials [24, 78, 79], which results in low statistical power unless additional endpoints are obtained by extending trial duration and incurring increased costs. We observed that this was the case in 14 (64%) of the 22 flat trials reporting this information. It was also recently announced that the PrEP study in Botswana would be unable to evaluate efficacy given a lower than expected rate of new infections (note this study is in progress and so was not included in this review). Lower than expected incidence could result from erroneous assumptions about incidence prior to beginning the study. Development of improved assays to detect incident HIV infection might improve estimates of incidence before the onset of a trial and reduce trial costs by providing a more accurate basis for sample size calculations and assisting in the more reliable selection of populations with sufficiently high HIV incidence to permit trials of shorter duration. Alternatively, unexpectedly low incidence could result from changes in risk behaviors during the trial (potentially due to prevention services offered or to a Hawthorne effect). In addition, unexpectedly low incidence could reflect changes in the nature of the local epidemic, driven either by unanticipated implementation of new interventions or by changes in the epidemic phase in which the RCT was conducted, due in part to the protracted time for RCT design, implementation and completion, which can take as long as a decade. In such trials, it is important for investigators to document and iteratively reassess their original assumptions.
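The link between incidence and power can be illustrated with a minimal sketch. It assumes a two-arm trial with equal person-time per arm and uses Schoenfeld's approximation for the number of events needed to detect a given hazard ratio; none of the reviewed trials is known to have used exactly this calculation, and the function names and input values are hypothetical.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Total events needed (both arms, 1:1 allocation), Schoenfeld approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2


def required_person_years(hazard_ratio, control_rate_per_100_py, alpha=0.05, power=0.80):
    """Total person-years needed to accrue the required events.

    Assumes equal person-time in each arm, so the average event rate is the
    control rate times (1 + hazard_ratio) / 2.
    """
    events = required_events(hazard_ratio, alpha, power)
    mean_rate = (control_rate_per_100_py / 100.0) * (1 + hazard_ratio) / 2
    return events / mean_rate


# Hypothetical planning values: a 50% reduction, with control-arm incidence
# projected at 4 per 100 person-years but observed at 2 per 100 person-years.
events = required_events(0.5)
py_projected = required_person_years(0.5, control_rate_per_100_py=4.0)
py_observed = required_person_years(0.5, control_rate_per_100_py=2.0)
print(f"events needed: {events:.0f}")
print(f"person-years at projected incidence: {py_projected:.0f}")
print(f"person-years at observed incidence: {py_observed:.0f}")
```

Because the required number of events depends only on the effect size, halving the control-arm incidence doubles the follow-up needed to reach the same power.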
Another key issue is the importance of adherence in trials.[8, 56] If trials assess effectiveness as opposed to efficacy, one might argue that adherence is an inherent part of the intervention and that if it is suboptimal during the trial, it does not bode well for general uptake outside the course of the trial where adherence is likely to be even worse than in the context of a study. Nevertheless, if the methods require high adherence, as is likely to be the case with microbicides, PrEP and treatment regimens for genital herpes, critical priorities are to better define, measure, analyze and routinely report adherence, as well as to identify concomitant behavioral interventions to increase adherence (as we currently do for male condoms) before products are incorporated into HIV prevention packages.
It is clear from results of recent trials that, in the near-term, there will be no single “magic bullet” for HIV prevention. Instead, the emphasis in prevention research is shifting to evaluation of combination prevention packages in which synergies among interventions with modest levels of effect might lead to substantial efficacy overall.[81, 82] What types of evidence for potential efficacy should be used to select interventions for inclusion in these packages? In addition to approaches such as male circumcision with robust RCT evidence for significant effects, we might consider interventions that demonstrate efficacy among specific, relevant subgroups (e.g., males who attended program meetings in the STI intervention in Zimbabwe). Interventions with significant secondary outcomes in RCTs (e.g. other STDs [3, 83]) might also be candidates for evaluation in combination packages if we assume that these outcomes lie on the causal pathway to HIV infection.
Should we also consider interventions for which evidence from RCTs is mixed or absent, but for which observational data or modeling strongly suggest that both the underlying concept and the specific intervention are likely to deliver substantial protection? Most HIV prevention programs and policies are not currently based on RCT evidence. Public health agencies such as the Centers for Disease Control and Prevention (CDC), the Agency for Healthcare Research and Quality, the World Health Organization and UNAIDS have long recognized the dilemma posed by relying solely on RCT data. Therefore, many develop guidelines (as established, for example, by the U.S. Preventive Services Task Force (USPSTF)) that are based not only on RCTs, but also on observational data from well-designed non-randomized trials; cohort, case-control, or multiple time series studies; mathematical modeling; dramatic results from uncontrolled experiments; or even expert opinion. These guidelines explicitly use a systematic approach to rate the evidence from these different methods. Similarly, epidemiologists and public health practitioners rarely rely only on the evidence from controlled trials to infer causality between an exposure and an outcome. For example, when reviewing evidence from multiple sources to establish whether smoking was associated with lung cancer, Sir Bradford Hill suggested that experimental evidence was only one of several considerations for causal inference. Other criteria characterized the nature of the association between the independent and dependent variables, including the magnitude of the effect, consistency, temporality, specificity, and the biological plausibility of the association.
That said, a critical issue is that most RCTs reviewed here are highly managed, labor-intensive and expensive, meeting the standards of regulatory bodies such as the US Food and Drug Administration (FDA). As alluded to above, before abandoning randomization, it is important to consider the entire universe of RCTs, including stepped wedge designs, large simple trials, and trials in which intervention communities are compared either to what currently exists or to non-participants, with incidence determined by reliable assays (should they be developed). The need for a counterfactual to infer causality cannot be denied and may be especially important when such combination programs are rolled out and evaluated at scale, particularly if they are designed based on uncertain efficacy.
Randomized controlled trials will undoubtedly remain our gold standard in defining the evidence-base for prevention programs and policies. However, to assess the purity of this gold standard, the HIV prevention science community must not only examine evidence from randomized controlled trials with significant outcomes (including from subgroups and secondary outcomes), but must also examine flat trials and address the design and implementation issues discussed above. In addition, we must acknowledge and explicitly define the role of other types of evidence in the development of HIV prevention recommendations.
Pure gold is a thing of great beauty and value, but lacks the strength and affordability that make alloys like steel so useful and durable. Similarly, well designed and executed RCTs are magnificent and invaluable cornerstones of HIV prevention policies and programs. However, before abandoning entire classes of potentially beneficial interventions, we must forge “alloys” of data from RCTs, observational studies and other lines of evidence, cautiously and explicitly titrating the use of less rigorous sources. We must also recognize that these “alloys” are likely to offer the best guide for deciding what to include in prevention packages, what to scale up, and where further research is warranted.
Dr. Wasserheit conceptualized the idea of a “score card” and delivered the initial results in 2007 at the Conference on Retroviruses and Opportunistic Infections reviewing the results of several STI treatment trials to reduce incidence of HIV. Dr. Padian elaborated on this concept and reviewed the larger array of biomedical trials using this paradigm and introduced the importance of “flat trials” in 2007 in a plenary for the International AIDS Society. Together, Drs. Padian and Wasserheit refined the concept and both were involved in all aspects of the review and drafting the manuscript. Ms. Balkus conducted the initial literature review and participated in writing the first draft of the manuscript. Dr. McCoy updated the review, reviewed the abstracted data for accuracy with Ms. Balkus, and participated in writing and revising the manuscript. Dr. Wasserheit presented a preliminary version of this paper in 2009 (co-authored by Drs. Padian and McCoy and Ms. Balkus) at the International Society for STD Research.
Dr. Padian was supported by a grant from the Bill and Melinda Gates Foundation (No. 21082) and Dr. Wasserheit was supported by a grant from the National Institute of Allergy and Infectious Diseases (R01 AI083034).
1. Although the Regai Dzive Shiri trial was originally implemented as a community randomized controlled trial with evaluation in a longitudinal cohort, the analysis was re-conceptualized as a serial cross-sectional evaluation due to significant out-migration from trial communities.[26,27] We included this trial in the review even though it did not report HIV incidence (HIV prevalence was measured in study communities after the intervention was implemented) because of the size of the study and its important implications for understanding the challenges in scaling up prevention packages.