Markers that predict treatment effect have the potential to improve patient outcomes. For example, the Oncotype DX® Recurrence Score® has some ability to predict the benefit of adjuvant chemotherapy over and above hormone therapy for the treatment of estrogen-receptor-positive breast cancer, facilitating the provision of chemotherapy to women most likely to benefit from it. Given that the score was originally developed for predicting outcome under hormone therapy alone, it is of interest to develop alternative combinations of the genes comprising the score that are optimized for treatment selection. However, most methodology for combining markers is geared toward predicting outcome under a single treatment. We propose a method for combining markers for treatment selection, which requires modeling the treatment effect as a function of markers. Multiple models of treatment effect are fit iteratively by upweighting or “boosting” subjects potentially misclassified according to treatment benefit at the previous stage. The boosting approach is compared to existing methods in a simulation study based on the change in expected outcome under marker-based treatment. The approach improves upon existing methods in some settings and has comparable performance in others. Our simulation study also provides insights into the relative merits of the existing methods. Application of the boosting approach to the breast cancer data, using scaled versions of the original markers, produces marker combinations that may have improved performance for treatment selection.
Biomarker; Boosting; Model mis-specification; Treatment selection
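To make the iterative reweighting concrete, here is a minimal Python sketch of the general idea, not the authors' exact algorithm: a weighted logistic working model with marker-by-treatment interactions is refit repeatedly, and subjects whose observed outcomes contradict the current benefit-based recommendation are upweighted. The misclassification proxy, the step size, and the function name boosted_benefit_rule are all illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

def boosted_benefit_rule(x, trt, y, n_rounds=20, step=0.3):
    """Loose illustration of boosting for treatment selection.

    x: (n, p) marker matrix; trt: 0/1 assigned arm; y: 0/1 outcome (1 = bad).
    Returns the per-round models and an averaged benefit score.
    """
    n = len(y)
    w = np.full(n, 1.0 / n)
    avg_delta = np.zeros(n)
    models = []
    for _ in range(n_rounds):
        # Weighted working model with marker-by-treatment interactions.
        design = np.column_stack([x, trt, x * trt[:, None]])
        fit = LogisticRegression(max_iter=1000).fit(design, y, sample_weight=w)
        risk1 = fit.predict_proba(np.column_stack([x, np.ones(n), x]))[:, 1]
        risk0 = fit.predict_proba(np.column_stack([x, np.zeros(n), np.zeros_like(x)]))[:, 1]
        delta = risk0 - risk1           # estimated benefit of treatment
        rec = (delta > 0).astype(int)   # recommend treatment when it lowers risk
        # Crude proxy for "misclassified by benefit": the observed outcome
        # contradicts the recommendation on the arm the subject actually got.
        wrong = ((rec == trt) & (y == 1)) | ((rec != trt) & (y == 0))
        w *= np.exp(step * wrong)
        w /= w.sum()
        avg_delta += delta / n_rounds
        models.append(fit)
    return models, avg_delta

The final combination can be taken as the sign of the averaged benefit score, mirroring the aggregation step that boosting methods typically use.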
The Net Reclassification Index (NRI) and its P value are used to make conclusions about improvements in prediction performance gained by adding a set of biomarkers to an existing risk prediction model. Although proposed only 5 years ago, the NRI has gained enormous traction in the risk prediction literature. Concerns have recently been raised about the statistical validity of the NRI.
Using a population dataset of 10,000 individuals with an event rate of 10.2%, in which four biomarkers have no predictive ability, we repeatedly simulated studies and calculated the chance that the NRI statistic yields a positive, statistically significant result. Subjects for training data (n = 420) and test data (n = 420 or 840) were randomly selected from the population, and the corresponding NRI statistics and P values were calculated. For comparison, the change in the area under the receiver operating characteristic curve and likelihood ratio statistics were calculated.
We found that rates of false-positive conclusions based on the NRI statistic were unacceptably high, being 63.0% in the training datasets and 18.8% to 34.4% in the test datasets. False-positive conclusions were rare when using the change in the area under the curve and occurred at the expected rate of approximately 5.0% with the likelihood ratio statistic.
Conclusions about biomarker performance that are based primarily on a statistically significant NRI statistic should be treated with skepticism. Use of NRI P values in scientific reporting should be halted.
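A simulation in the spirit of the design described above can reproduce the qualitative phenomenon: under null markers, the NRI test rejects far more often than 5%. This is a sketch, not the authors' code; the intercept (chosen to give an event rate near 10%), the sample sizes, and the naive difference-of-proportions variance formula are assumptions.

import numpy as np
from scipy import stats
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def cfnri_and_pvalue(p_base, p_new, y):
    """Category-free NRI with the naive difference-of-proportions z-test
    proposed in early NRI papers (the kind of test this paper cautions against)."""
    up, down = p_new > p_base, p_new < p_base
    e, ne = y == 1, y == 0
    nri = (up[e].mean() - down[e].mean()) + (down[ne].mean() - up[ne].mean())
    se = np.sqrt((up[e].mean() + down[e].mean()) / e.sum()
                 + (up[ne].mean() + down[ne].mean()) / ne.sum())
    return nri, 2 * stats.norm.sf(abs(nri) / se)

n_sims, false_pos = 500, 0
for _ in range(n_sims):
    # Null setting: outcome depends on x0 only; x1..x4 are pure noise markers.
    X = rng.normal(size=(840, 5))
    y = rng.binomial(1, 1 / (1 + np.exp(-(-2.2 + X[:, 0]))))
    train, test = slice(0, 420), slice(420, 840)
    base = LogisticRegression().fit(X[train, :1], y[train])
    full = LogisticRegression().fit(X[train], y[train])
    nri, p = cfnri_and_pvalue(base.predict_proba(X[test, :1])[:, 1],
                              full.predict_proba(X[test])[:, 1], y[test])
    false_pos += (nri > 0) and (p < 0.05)
print(f"false-positive rate: {false_pos / n_sims:.1%}")  # typically far above 5%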
Despite the heightened interest in developing biomarkers predicting treatment response that are used to optimize patient treatment decisions, there has been relatively little development of statistical methodology to evaluate these markers. There is currently no unified statistical framework for marker evaluation. This paper proposes a suite of descriptive and inferential methods designed to evaluate individual markers and to compare candidate markers. An R software package has been developed which implements these methods. Their utility is illustrated in the breast cancer treatment context, where candidate markers are evaluated for their ability to identify a subset of women who do not benefit from adjuvant chemotherapy and can therefore avoid its toxicity.
Net reclassification indices have recently become popular statistics for measuring the prediction increment of new biomarkers. We review the various types of net reclassification indices and their correct interpretations. We evaluate the advantages and disadvantages of quantifying the prediction increment with these indices. For pre-defined risk categories, we relate net reclassification indices to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for net reclassification indices and evaluate the merits of hypothesis testing based on such indices. We recommend that investigators using net reclassification indices should report them separately for events (cases) and nonevents (controls). When there are two risk categories, the components of net reclassification indices are the same as the changes in the true-positive and false-positive rates. We advocate use of true- and false-positive rates and suggest it is more useful for investigators to retain the existing, descriptive terms. When there are three or more risk categories, we recommend against net reclassification indices because they do not adequately account for clinically important differences in shifts among risk categories. The category-free net reclassification index is a new descriptive device designed to avoid pre-defined risk categories. However, it suffers from many of the same problems as other measures such as the area under the receiver operating characteristic curve. In addition, the category-free index can mislead investigators by overstating the incremental value of a biomarker, even in independent validation data. When investigators want to test a null hypothesis of no prediction increment, the well-established tests for coefficients in the regression model are superior to the net reclassification index. If investigators want to use net reclassification indices, confidence intervals should be calculated using bootstrap methods rather than published variance formulas. The preferred single-number summary of the prediction increment is the improvement in net benefit.
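For reference, the two-category identity invoked above can be written out. With a single risk threshold and "up" meaning reclassification into the high-risk category, and D = 1 indicating an event,

$$\mathrm{NRI}_{\mathrm{events}} = P(\mathrm{up} \mid D{=}1) - P(\mathrm{down} \mid D{=}1) = \mathrm{TPR}_{\mathrm{new}} - \mathrm{TPR}_{\mathrm{old}},$$
$$\mathrm{NRI}_{\mathrm{nonevents}} = P(\mathrm{down} \mid D{=}0) - P(\mathrm{up} \mid D{=}0) = \mathrm{FPR}_{\mathrm{old}} - \mathrm{FPR}_{\mathrm{new}},$$

so reporting the two components separately is exactly reporting the changes in the true- and false-positive rates.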
The HIV prevention landscape is evolving rapidly, and future efficacy trials of candidate vaccines, which remain the best long-term option for stemming the HIV epidemic, will be conducted in the context of partially effective nonvaccine prevention modalities. It is essential that these trials provide for valid and efficient evaluation of vaccine efficacy and immune correlates. The availability of partially effective prevention modalities presents opportunities to study their interactions with vaccines to maximally reduce HIV incidence. This article proposes an approach for conducting future vaccine efficacy trials in the context of background use of partially effective nonvaccine prevention modalities, and for conducting future vaccine efficacy trials that provide nonvaccine prevention modalities in one or more of the randomized study groups. Strategies are discussed for responding to emerging evidence on nonvaccine prevention modalities during ongoing vaccine trials. Next-generation HIV vaccine efficacy trials will almost certainly be more complex in their design and implementation but may become more relevant to at-risk populations and better suited to the ultimate goal of reducing HIV incidence at the population level.
The contribution of host T-cell immunity and HLA class I alleles to the control of human immunodeficiency virus type 1 (HIV-1) replication in natural infection is widely recognized. We assessed whether vaccine-induced T-cell immunity, or expression of certain HLA alleles, impacted HIV-1 control after infection in the Step MRKAd5/HIV-1 gag/pol/nef study. Vaccine-induced T cells were associated with reduced plasma viremia, with subjects targeting ≥3 Gag peptides presenting with half-log lower mean viral loads than subjects without Gag responses. This effect was stronger in participants infected proximal to vaccination and was independent of our observed association of the HLA-B*27, -B*57 and -B*58:01 alleles with lower HIV-1 viremia. These findings support the ability of vaccine-induced T-cell responses to influence postinfection outcome and provide a rationale for the generation of T-cell responses by vaccination to reduce viremia if protection from acquisition is not achieved. Clinical trials identifier: NCT00095576.
HIV-1 vaccine; Step study; Gag-specific T cells; HLA class I alleles
Purpose of review
With multiple HIV vaccine candidates suitable for efficacy evaluation in a rapidly changing HIV prevention landscape, innovative HIV vaccine trial design research is much needed to optimally utilize resources by building on lessons learned from past HIV vaccine efficacy trials.
Recent findings
Several recent articles propose new vaccine efficacy trial design strategies tailored to the emerging needs in HIV vaccine evaluation. These include a focus on efficacy evaluation proximal to the vaccination series; more intensive interim monitoring for potential harm, non-efficacy and high efficacy of the vaccine; simultaneous evaluation of multiple vaccine regimens with a shared placebo group; designs that include pilot immunogenicity studies of putative immune correlates to expedite their evaluation; as well as designs tailored to evaluate vaccine efficacy in the context of partially effective non-vaccine prevention modalities.
Summary
A more rapid evaluation of multiple vaccine candidates is possible. Weaker vaccines can be weeded out quickly. Pilot studies can be done during the trial to prepare for a timely immune correlates assessment. Evidence that emerges regarding the efficacy of non-vaccine prevention modalities will have important implications for future trial designs.
HIV prevention; multi-arm trial; vaccine efficacy; immune correlates
The phase III RV144 HIV-1 vaccine trial estimated vaccine efficacy (VE) to be 31.2%. This trial demonstrated that the presence of HIV-1–specific IgG-binding Abs to envelope (Env) V1V2 inversely correlated with infection risk, while the presence of Env-specific plasma IgA Abs directly correlated with risk of HIV-1 infection. Moreover, Ab-dependent cellular cytotoxicity responses inversely correlated with risk of infection in vaccine recipients with low IgA; therefore, we hypothesized that vaccine-induced Fc receptor–mediated (FcR-mediated) Ab function is indicative of vaccine protection. We sequenced exons and surrounding areas of FcR-encoding genes and found one FCGR2C tag SNP (rs114945036) that was associated with VE against HIV-1 subtype CRF01_AE, with lysine at position 169 (169K) in the V2 loop (CRF01_AE 169K). Individuals carrying CC at this SNP had an estimated VE of 15%, while individuals carrying CT or TT exhibited a VE of 91%. Furthermore, the rs114945036 SNP was highly associated with 3 other FCGR2C SNPs (rs138747765, rs78603008, and rs373013207). Env-specific IgG and IgG3 Abs, IgG avidity, and neutralizing Abs inversely correlated with CRF01_AE 169K HIV-1 infection risk in the CT- or TT-carrying vaccine recipients only. These data suggest a potent role of Fc-γ receptors and Fc-mediated Ab function in conferring protection from transmission risk in the RV144 VE trial.
The RV144 HIV-1 vaccine trial demonstrated partial efficacy of 31% against HIV-1 infection. Studies into possible correlates of protection found that antibodies specific to the V1 and V2 (V1/V2) region of envelope correlated inversely with infection risk and that viruses isolated from trial participants contained genetic signatures of vaccine-induced pressure in the V1/V2 region. We explored the hypothesis that the genetic signatures in V1 and V2 could be partly attributed to selection by vaccine-primed T cells. We performed a T-cell-based sieve analysis of breakthrough viruses in the RV144 trial and found evidence of predicted HLA binding escape that was greater in vaccine versus placebo recipients. The predicted escape depended on class I HLA A*02- and A*11-restricted epitopes in the MN strain rgp120 vaccine immunogen. Though we hypothesized that this was indicative of postacquisition selection pressure, we also found that vaccine efficacy (VE) was greater in A*02-positive (A*02+) participants than in A*02− participants (VE = 54% versus 3%, P = 0.05). Vaccine efficacy against viruses with a lysine residue at site 169, important to antibody binding and implicated in vaccine-induced immune pressure, was also greater in A*02+ participants (VE = 74% versus 15%, P = 0.02). Additionally, a reanalysis of vaccine-induced immune responses that focused on those that were shown to correlate with infection risk suggested that the humoral responses may have differed in A*02+ participants. These exploratory and hypothesis-generating analyses indicate there may be an association between a class I HLA allele and vaccine efficacy, highlighting the importance of considering HLA alleles and host immune genetics in HIV vaccine trials.
IMPORTANCE The RV144 trial was the first to show efficacy against HIV-1 infection. Subsequently, much effort has been directed toward understanding the mechanisms of protection. Here, we conducted a T-cell-based sieve analysis, which compared the genetic sequences of viruses isolated from infected vaccine and placebo recipients. Though we hypothesized that the observed sieve effect indicated postacquisition T-cell selection, we also found that vaccine efficacy was greater for participants who expressed HLA A*02, an allele implicated in the sieve analysis. Though HLA alleles have been associated with disease progression and viral load in HIV-1 infection, these data are the first to suggest an association between a class I HLA allele and vaccine efficacy. While these statistical analyses do not provide mechanistic evidence of protection in RV144, they generate testable hypotheses for the HIV vaccine community and they highlight the importance of assessing the impact of host immune genetics in vaccine-induced immunity and protection. (This study has been registered at ClinicalTrials.gov under registration no. NCT00223080.)
A safe and effective vaccine for the prevention of human immunodeficiency virus type 1 (HIV-1) infection is a global priority. We tested the efficacy of a DNA prime–recombinant adenovirus type 5 boost (DNA/rAd5) vaccine regimen in persons at increased risk for HIV-1 infection in the United States.
At 21 sites, we randomly assigned 2504 men or transgender women who have sex with men to receive the DNA/rAd5 vaccine (1253 participants) or placebo (1251 participants). We assessed HIV-1 acquisition from week 28 through month 24 (termed week 28+ infection), viral-load set point (mean plasma HIV-1 RNA level 10 to 20 weeks after diagnosis), and safety. The 6-plasmid DNA vaccine (expressing clade B Gag, Pol, and Nef and Env proteins from clades A, B, and C) was administered at weeks 0, 4, and 8. The rAd5 vector boost (expressing clade B Gag-Pol fusion protein and Env glycoproteins from clades A, B, and C) was administered at week 24.
In April 2013, the data and safety monitoring board recommended halting vaccinations for lack of efficacy. The primary analysis showed that week 28+ infection had been diagnosed in 27 participants in the vaccine group and 21 in the placebo group (vaccine efficacy, −25.0%; 95% confidence interval, −121.2 to 29.3; P = 0.44), with mean viral-load set points of 4.46 and 4.47 HIV-1 RNA log10 copies per milliliter, respectively. Analysis of all infections during the study period (41 in the vaccine group and 31 in the placebo group) also showed lack of vaccine efficacy (P = 0.28). The vaccine regimen had an acceptable side-effect profile.
The DNA/rAd5 vaccine regimen did not reduce either the rate of HIV-1 acquisition or the viral-load set point in the population studied. (Funded by the National Institute of Allergy and Infectious Diseases; ClinicalTrials.gov number, NCT00865566.)
The HIV epidemic has carved contrasting trajectories around the world with sub-Saharan Africa (SSA) being most affected. We hypothesized that mean HIV-1 plasma RNA viral loads (VL) are higher in SSA than other areas, and that these elevated levels may contribute to the scale of epidemics in this region.
Design and Methods
To evaluate this hypothesis, we constructed a database of means of 71,668 VL measurements from 44 cohorts in seven regions of the world. We used linear regression statistical models to estimate differences in VL between regions. We also constructed and analyzed a mathematical model to describe the impact of the regional VL differences on HIV epidemic trajectory.
We found substantial regional VL heterogeneity. The mean VL in SSA was 0.58 log10 copies/mL higher than in North America (95% CI: 0.45 to 0.71); this represents about a 4-fold increase. The highest mean VLs were found in Southern and East Africa, while in Asia, Europe, North America, and South America, mean VLs were comparable. Mathematical modeling indicated that conservatively 14% of HIV infections in a representative population in Kenya could be attributed to the enhanced infectiousness of subjects with heightened VL.
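The fold-change conversion in the preceding paragraph is direct arithmetic on the log10 scale:

$$10^{0.58} \approx 3.8 \quad \left(10^{0.45} \approx 2.8 \ \text{to} \ 10^{0.71} \approx 5.1 \ \text{across the CI}\right),$$

i.e., roughly a 4-fold higher mean viral load in SSA than in North America.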
We conclude that community VL appears to be higher in SSA than in other regions and this may be a central driver of the massive HIV epidemics in this region. The elevated VLs in SSA may reflect, among other factors, the high burden of co-infections or the preponderance of HIV-1 subtype C infection.
HIV; viral load; co-infection; epidemic; sub-Saharan Africa; mathematical model
In the RV144 trial, the estimated efficacy of a vaccine regimen against human immunodeficiency virus type 1 (HIV-1) was 31.2%. We performed a case–control analysis to identify antibody and cellular immune correlates of infection risk.
In pilot studies conducted with RV144 blood samples, 17 antibody or cellular assays met prespecified criteria, of which 6 were chosen for primary analysis to determine the roles of T-cell, IgG antibody, and IgA antibody responses in the modulation of infection risk. Assays were performed on samples from 41 vaccinees who became infected and 205 uninfected vaccinees, obtained 2 weeks after final immunization, to evaluate whether immune-response variables predicted HIV-1 infection through 42 months of follow-up.
Of six primary variables, two correlated significantly with infection risk: the binding of IgG antibodies to variable regions 1 and 2 (V1V2) of HIV-1 envelope proteins (Env) correlated inversely with the rate of HIV-1 infection (estimated odds ratio, 0.57 per 1-SD increase; P = 0.02; q = 0.08), and the binding of plasma IgA antibodies to Env correlated directly with the rate of infection (estimated odds ratio, 1.54 per 1-SD increase; P = 0.03; q = 0.08). Neither low levels of V1V2 antibodies nor high levels of Env-specific IgA antibodies were associated with higher rates of infection than were found in the placebo group. Secondary analyses suggested that Env-specific IgA antibodies may mitigate the effects of potentially protective antibodies.
This immune-correlates study generated the hypotheses that V1V2 antibodies may have contributed to protection against HIV-1 infection, whereas high levels of Env-specific IgA antibodies may have mitigated the effects of protective antibodies. Vaccines that are designed to induce higher levels of V1V2 antibodies and lower levels of Env-specific IgA antibodies than are induced by the RV144 vaccine may have improved efficacy against HIV-1 infection.
Treatment-selection markers are biological molecules or patient characteristics associated with one’s response to treatment. They can be used to predict treatment effects for individual subjects and subsequently help deliver treatment to those most likely to benefit from it. Statistical tools are needed to evaluate a marker’s capacity to help with treatment selection. The commonly adopted criterion for a good treatment-selection marker has been the interaction between marker and treatment. While a strong interaction is important, it is, however, not sufficient for good marker performance. In this paper, we develop novel measures for assessing a continuous treatment-selection marker, based on a potential outcomes framework. Under a set of assumptions, we derive the optimal decision rule based on the marker to classify individuals according to treatment benefit, and characterize the marker’s performance using the corresponding classification accuracy as well as the overall distribution of the classifier. We develop a constrained maximum-likelihood method for estimation and testing in a randomized trial setting. Simulation studies are conducted to demonstrate the performance of our methods. Finally, we illustrate the methods using an HIV vaccine trial where we explore the value of the level of pre-existing immunity to Adenovirus serotype 5 for predicting a vaccine-induced increase in the risk of HIV acquisition.
Classification accuracy; Constrained maximum likelihood; Monotone treatment effect; Potential outcomes; Sensitivity analysis; Treatment-selection marker
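In generic potential-outcomes notation (a standard formulation consistent with, though not copied from, the abstract above), with Y an adverse outcome, Y(1) and Y(0) the potential outcomes under treatment and control, and V the marker, the benefit-based decision rule takes the form

$$\Delta(v) = E[Y(0) \mid V = v] - E[Y(1) \mid V = v], \qquad \text{recommend treatment} \iff \Delta(v) > 0,$$

and classification-accuracy measures then quantify how well assignments under this rule agree with true treatment benefit.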
Extensive observational data suggest that HSV-2 infection may facilitate HIV acquisition, increase HIV viral load, and accelerate HIV progression and onward transmission. To explore these relationships, we examined the impact of pre-existing HSV-2 infection in an international HIV vaccine trial.
We analyzed the associations between prevalent HSV-2 infection and HIV-1 acquisition and progression among 1836 men who have sex with men (MSM). We used Cox proportional hazards regression models to estimate the association between HSV-2 infection and both HIV acquisition and ART initiation, and linear regression to explore the effect of HSV-2 on pre-ART viral load.
HSV-2 infection increased the risk of HIV-1 acquisition among all volunteers (adjusted hazard ratio 2.2; 95% CI, 1.4 to 3.5). Adjusting for demographic variables, circumcision, Ad5 titer and significant risk behaviors, the risk of HIV acquisition among HSV-2-infected placebo recipients was threefold higher than among HSV-2-seronegative placebo recipients (hazard ratio 3.3; 95% CI, 1.6 to 6.9). Past HSV-2 infection was associated with a 0.2 log10 copies/mL higher adjusted mean set-point viral load (95% CI, 0.3 lower to 0.6 higher). HSV-2 infection was not associated with time to ART initiation.
Among MSM in an HIV-1 vaccine trial, pre-existing HSV-2 infection was a major risk factor for HIV acquisition. Past HSV-2 infection did not significantly increase HIV viral load or early disease progression. HSV-2-seropositive persons will likely prove more difficult than HSV-2-seronegative persons to protect against HIV infection using vaccines or other prevention strategies.
Herpes Simplex Virus Type II; HIV incidence
The sieve analysis for the Step trial found evidence that breakthrough HIV-1 sequences for MRKAd5/HIV-1 Gag/Pol/Nef vaccine recipients were more divergent from the vaccine insert than placebo sequences in regions with predicted epitopes. We linked the viral sequence data with immune response and acute viral load data to explore mechanisms for and consequences of the observed sieve effect.
Ninety-one male participants (37 placebo and 54 vaccine recipients) were included; viral sequences were obtained at the time of HIV-1 diagnosis. T-cell responses were measured 4 weeks post-second vaccination and at the first or second week post-diagnosis. Acute viral load was obtained at RNA-positive and antibody-negative visits.
Vaccine recipients had a greater magnitude of post-infection CD8+ T-cell response than placebo recipients (median 1.68% vs 1.18%; p = 0.04) and greater breadth of post-infection response (median 4.5 vs 2; p = 0.06). Viral sequences for vaccine recipients were marginally more divergent from the insert than placebo sequences in regions of Nef targeted by pre-infection immune responses (p = 0.04; Pol p = 0.13; Gag p = 0.89). Magnitude and breadth of pre-infection responses did not correlate with distance of the viral sequence to the insert (p > 0.50). Acute log viral load trended lower in vaccine versus placebo recipients (estimated mean 4.7 vs 5.1) but the difference was not significant (p = 0.27). Neither was acute viral load associated with distance of the viral sequence to the insert (p > 0.30).
Despite evidence of anamnestic responses, the sieve effect was not well explained by available measures of T-cell immunogenicity. Sequence divergence from the vaccine was not significantly associated with acute viral load. While point estimates suggested weak vaccine suppression of viral load, the result was not significant and more viral load data would be needed to detect suppression.
Markers for treatment selection are being developed in many areas of medicine. Technological advances are rapidly producing an abundance of candidates for study. Clinicians hope to use these markers to identify which individuals will benefit from a given treatment, with the goal of maximizing good outcomes and minimizing side effects, treatment burden, and medical costs.
It is essential that we have appropriate methods for evaluating treatment selection markers, in order to make informed decisions regarding marker advancement and, ultimately, clinical application. However, existing statistical methods for evaluating treatment selection markers are largely inadequate. This paper proposes several novel statistical measures of marker performance aimed at addressing key questions in marker evaluation: 1) Does the marker help patients choose amongst treatment options?; 2) How should treatment decisions be made based on a continuous marker measurement?; 3) What is the impact on the population of using the marker to select treatment?; and 4) What proportion of patients will have different treatment recommendations following marker measurement? The proposed approach is contrasted with existing methods for marker evaluation, including assessing a marker’s prognostic value, evaluating treatment effects in a subset of the population who are marker-positive, and testing for a statistical interaction between marker value and treatment. The approach is illustrated in the context of choosing adjuvant chemotherapy treatment for women with estrogen-receptor positive and node-positive breast cancer. The results have important implications for the design of marker evaluation studies, and can serve as the basis for further development of standards for assessing treatment selection markers.
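One of the population-impact measures described above has a simple empirical form in randomized-trial data. The sketch below is illustrative; the estimand, the variable names, and the treat-all default are assumptions in the spirit of the abstract, not its exact estimators. It estimates the decrease in the adverse-outcome rate when marker-negative patients forgo treatment, relative to treating everyone; under a treat-all default, the proportion marker-negative is also the proportion whose recommendation changes.

import numpy as np

def impact_vs_treat_all(marker_neg, trt, y):
    """Decrease in the adverse-outcome rate under marker-based treatment
    versus treating everyone, estimable from a randomized trial.
    marker_neg: boolean, rule recommends no treatment; trt, y: 0/1 arrays."""
    m = np.asarray(marker_neg, dtype=bool)
    # Among marker-negatives, compare event rates across randomized arms.
    risk_if_treated = y[m & (trt == 1)].mean()
    risk_if_untreated = y[m & (trt == 0)].mean()
    # Weight by the fraction of patients the rule switches to no treatment.
    return m.mean() * (risk_if_treated - risk_if_untreated)

# A positive value means marker-based treatment lowers the population event rate.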
When estimating the association between an exposure and outcome, a simple approach to quantifying the amount of confounding by a factor, Z, is to compare estimates of the exposure–outcome association with and without adjustment for Z. This approach is widely believed to be problematic due to the nonlinearity of some exposure-effect measures. When the expected value of the outcome is modeled as a nonlinear function of the exposure, the adjusted and unadjusted exposure effects can differ even in the absence of confounding (Greenland, Robins, and Pearl, 1999); we call this the nonlinearity effect. In this paper, we propose a corrected measure of confounding that does not include the nonlinearity effect. The performances of the simple and corrected estimates of confounding are assessed in simulations and illustrated using a study of risk factors for low birth-weight infants. We conclude that the simple estimate of confounding is adequate or even preferred in settings where the nonlinearity effect is very small. In settings with a sizable nonlinearity effect, the corrected estimate of confounding has improved performance.
Collapsibility; Confounding; Odds ratio
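The nonlinearity (noncollapsibility) effect discussed above is easy to exhibit numerically. In this sketch (simulated data; the coefficient values are arbitrary choices, not the paper's), a risk factor z is generated independently of the exposure x, so there is no confounding, yet the unadjusted and adjusted log odds ratios for x still differ:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200_000
x = rng.binomial(1, 0.5, n)   # exposure
z = rng.binomial(1, 0.5, n)   # risk factor, independent of x: no confounding
p = 1 / (1 + np.exp(-(-2 + 1.0 * x + 2.0 * z)))
y = rng.binomial(1, p)

unadj = sm.Logit(y, sm.add_constant(x)).fit(disp=0).params[1]
adj = sm.Logit(y, sm.add_constant(np.column_stack([x, z]))).fit(disp=0).params[1]
print(f"log-OR unadjusted: {unadj:.2f}, adjusted: {adj:.2f}")
# Adjustment recovers the conditional log-OR of 1.0; the unadjusted estimate
# is attenuated (about 0.8) even though z is independent of x -- this gap is
# the "nonlinearity effect" the paper's corrected measure excludes.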
In many clinical settings, statistical models are being developed for predicting risk of disease or other adverse event. These models are intended to help patients and physicians make informed decisions. A new approach to assessing the value of adding a new marker to a risk prediction model, called the risk stratification approach, was recently proposed by Cook and colleagues (1,2). This involves cross-tabulating risk predictions on the basis of models with and without the new marker, and has been widely adopted in the literature. We argue that important information with regard to three important model validation criteria can be extracted from risk stratification tables: 1) model fit or calibration; 2) capacity for risk stratification; and 3) accuracy of classifications based on risk. However, we describe how the information contained in the tables must be interpreted carefully, and caution against common misuses of the method. The concepts are illustrated using data from a recently published study of a breast cancer risk prediction model by Tice et al. (3).
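For readers who want to construct such a table, the following sketch (hypothetical thresholds and function name, not code from the cited studies) cross-tabulates risk categories from the models with and without the new marker and attaches observed event rates per cell, supporting the calibration check described above:

import pandas as pd

def risk_stratification_table(p_old, p_new, y, cuts=(0.05, 0.20)):
    """Cook-style reclassification table; 'cuts' are illustrative
    risk thresholds, not ones endorsed by the cited papers."""
    bins = [0, *cuts, 1]
    labels = [f"{lo:.0%}-{hi:.0%}" for lo, hi in zip(bins[:-1], bins[1:])]
    old = pd.cut(p_old, bins, labels=labels)
    new = pd.cut(p_new, bins, labels=labels)
    counts = pd.crosstab(old, new, rownames=["model without marker"],
                         colnames=["model with marker"])
    # Observed event rate per cell: compare with the nominal category range
    # to assess calibration of each model's predictions.
    event_rates = pd.crosstab(old, new, values=y, aggfunc="mean")
    return counts, event_rates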
The rapid and continuing progress in gene discovery for complex diseases is fueling interest in the potential application of genetic risk models for clinical and public health practice. The number of studies assessing the predictive ability is steadily increasing, but they vary widely in completeness of reporting and apparent quality. Transparent reporting of the strengths and weaknesses of these studies is important to facilitate the accumulation of evidence on genetic risk prediction. A multidisciplinary workshop sponsored by the Human Genome Epidemiology Network developed a checklist of 25 items recommended for strengthening the reporting of Genetic RIsk Prediction Studies (GRIPS), building on the principles established by previous reporting guidelines. These recommendations aim to enhance the transparency, quality and completeness of study reporting, and thereby to improve the synthesis and application of information from multiple studies that might differ in design, conduct or analysis.