Assessing immune responses to study vaccines as surrogates of protection plays a central role in vaccine clinical trials. Motivated by three ongoing or pending HIV vaccine efficacy trials, we consider such surrogate endpoint assessment in a randomized placebo-controlled trial with case-cohort sampling of immune responses and a time to event endpoint. Based on the principal surrogate definition under the principal stratification framework proposed by Frangakis and Rubin [Biometrics 58 (2002) 21–29] and adapted by Gilbert and Hudgens (2006), we introduce estimands that measure the value of an immune response as a surrogate of protection in the context of the Cox proportional hazards model. The estimands are not identified because the immune response to vaccine is not measured in placebo recipients. We formulate the problem as a Cox model with missing covariates, and employ novel trial designs for predicting the missing immune responses and thereby identifying the estimands. The first design utilizes information from baseline predictors of the immune response, and bridges their relationship in the vaccine recipients to the placebo recipients. The second design provides a validation set for the unmeasured immune responses of uninfected placebo recipients by immunizing them with the study vaccine after trial closeout. A maximum estimated likelihood approach is proposed for estimation of the parameters. Simulated data examples are given to evaluate the proposed designs and study their properties.
Clinical trial; discrete failure time model; missing data; potential outcomes; principal stratification; surrogate marker
Given a randomized treatment Z, a clinical outcome Y, and a biomarker S measured some fixed time after Z is administered, we may be interested in addressing the surrogate endpoint problem by evaluating whether S can be used to reliably predict the effect of Z on Y. Several recent proposals for the statistical evaluation of surrogate value have been based on the framework of principal stratification. In this paper, we consider two principal stratification estimands: joint risks and marginal risks. Joint risks measure causal associations of treatment effects on S and Y, providing insight into the surrogate value of the biomarker, but are not statistically identifiable from vaccine trial data. While marginal risks do not measure causal associations of treatment effects, they nevertheless provide guidance for future research, and we describe a data collection scheme and assumptions under which the marginal risks are statistically identifiable. We show how different sets of assumptions affect the identifiability of these estimands; in particular, we depart from previous work by considering the consequences of relaxing the assumption of no individual treatment effects on Y before S is measured. Based on algebraic relationships between joint and marginal risks, we propose a sensitivity analysis approach for assessment of surrogate value, and show that in many cases the surrogate value of a biomarker may be hard to establish, even when the sample size is large.
Estimated likelihood; Identifiability; Principal stratification; Sensitivity analysis; Surrogate endpoint; Vaccine trials
The effects of vaccine on postinfection outcomes, such as disease, death, and secondary transmission to others, are important scientific and public health aspects of prophylactic vaccination. As a result, evaluation of many vaccine effects conditions on being infected. Conditioning on an event that occurs posttreatment (in our case, infection subsequent to assignment to vaccine or control) can result in selection bias. Moreover, because the set of individuals who would become infected if vaccinated is likely not identical to the set of those who would become infected if given control, comparisons that condition on infection do not have a causal interpretation. In this article we consider identifiability and estimation of causal vaccine effects on binary postinfection outcomes. Using the principal stratification framework, we define a postinfection causal vaccine efficacy estimand in individuals who would be infected regardless of treatment assignment. The estimand is shown to be not identifiable under the standard assumptions of the stable unit treatment value, monotonicity, and independence of treatment assignment. Thus selection models are proposed that identify the causal estimand. Closed-form maximum likelihood estimators (MLEs) are then derived under these models, including those assuming maximum possible levels of positive and negative selection bias. These results show the relations between the MLE of the causal estimand and two commonly used estimators for vaccine effects on postinfection outcomes. For example, the usual intent-to-treat estimator is shown to be an upper bound on the postinfection causal vaccine effect provided that the magnitude of protection against infection is not too large. The methods are used to evaluate postinfection vaccine effects in a clinical trial of a rotavirus vaccine candidate and in a field study of a pertussis vaccine. Our results show that pertussis vaccination has a significant causal effect in reducing disease severity.
Causal inference; Infectious disease; Maximum likelihood; Principal stratification; Sensitivity analysis
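The extreme selection-bias scenarios mentioned in the abstract above can be illustrated with a small numerical sketch. It is not the authors' estimator: it computes simple large-sample bounds on the postinfection vaccine efficacy in the always-infected stratum, assuming monotonicity (vaccination never causes infection) and a risk-ratio efficacy scale. The function name and the inputs `p_v`, `p_c`, `q_v`, `q_c` are hypothetical and do not come from the source.

```python
import math


def always_infected_bounds(p_v, p_c, q_v, q_c):
    """Bounds on VE = 1 - risk_v / risk_c among the always-infected.

    p_v, p_c: infection risks in the vaccine and control arms (assumed inputs).
    q_v, q_c: proportions with the binary postinfection outcome among the
              infected in each arm (assumed inputs).
    Assumes monotonicity, so every infected vaccinee is always-infected.
    """
    if not (0.0 < p_v <= p_c <= 1.0):
        raise ValueError("monotonicity sketch requires 0 < p_v <= p_c <= 1")
    # Share of infected controls who would also be infected if vaccinated.
    pi = p_v / p_c
    # Extreme allocations of the outcome among infected controls:
    # the always-infected carry as few, or as many, outcomes as possible.
    risk_c_low = max(0.0, (q_c - (1.0 - pi)) / pi)
    risk_c_high = min(1.0, q_c / pi)

    def ve(risk_c):
        if risk_c > 0.0:
            return 1.0 - q_v / risk_c
        # Zero control risk: efficacy is unbounded below (or undefined if q_v == 0).
        return -math.inf if q_v > 0.0 else math.nan

    return ve(risk_c_low), ve(risk_c_high)
```

When the vaccine has no effect on infection (`p_v == p_c`), the two bounds coincide with the naive comparison among the infected, `1 - q_v / q_c`; as protection against infection grows, the interval widens.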
Frangakis and Rubin (2002, Biometrics 58, 21–29) proposed a new definition of a surrogate endpoint (a “principal” surrogate) based on causal effects. We introduce an estimand for evaluating a principal surrogate, the causal effect predictiveness (CEP) surface, which quantifies how well causal treatment effects on the biomarker predict causal treatment effects on the clinical endpoint. Although the CEP surface is not identifiable due to missing potential outcomes, it can be identified by incorporating a baseline covariate(s) that predicts the biomarker. Given case–cohort sampling of such a baseline predictor and the biomarker in a large blinded randomized clinical trial, we develop an estimated likelihood method for estimating the CEP surface. This estimation assesses the “surrogate value” of the biomarker for reliably predicting clinical treatment effects for the same or similar setting as the trial. A CEP surface plot provides a way to compare the surrogate value of multiple biomarkers. The approach is illustrated by the problem of assessing an immune response to a vaccine as a surrogate endpoint for infection.
Case cohort; Causal inference; Clinical trial; HIV vaccine; Postrandomization selection bias; Structural model; Prentice criteria; Principal stratification
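For readers who want the estimand in symbols, one common way to write the CEP surface is sketched below. The notation is an assumption, since the abstract gives none: $Y(z)$ and $S(z)$ denote the potential clinical endpoint and biomarker under assignment $z$, and $h$ is any contrast with $h(x, x) = 0$.

```latex
\begin{align*}
\mathrm{risk}_{(z)}(s_1, s_0) &= \Pr\{\, Y(z) = 1 \mid S(1) = s_1,\ S(0) = s_0 \,\}, \qquad z \in \{0, 1\}, \\
\mathrm{CEP}(s_1, s_0) &= h\bigl(\mathrm{risk}_{(1)}(s_1, s_0),\ \mathrm{risk}_{(0)}(s_1, s_0)\bigr),
\qquad \text{e.g. } h(x, y) = 1 - x / y .
\end{align*}
```

Under this notation, a biomarker has high surrogate value when $\mathrm{CEP}(s_1, s_0)$ is near zero for $s_1 = s_0$ and moves away from zero as $s_1$ and $s_0$ diverge.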
If a vaccine does not protect individuals completely against infection, it could still reduce infectiousness of infected vaccinated individuals to others. Typically, vaccine efficacy for infectiousness is estimated based on contrasts between the transmission risk to susceptible individuals from infected vaccinated individuals compared with that from infected unvaccinated individuals. Such estimates are problematic, however, because they are subject to selection bias and do not have a causal interpretation. Here, we develop causal estimands for vaccine efficacy for infectiousness for four different scenarios of populations of transmission units of size two. These causal estimands incorporate both principal stratification, based on the joint potential infection outcomes under vaccine and control, and interference between individuals within transmission units. In the most general scenario, both individuals can be exposed to infection outside the transmission unit and both can be assigned either vaccine or control. The three other scenarios are special cases of the general scenario where only one individual is exposed outside the transmission unit or can be assigned vaccine. The causal estimands for vaccine efficacy for infectiousness are well defined only within certain principal strata and, in general, are identifiable only with strong unverifiable assumptions. Nonetheless, the observed data do provide some information, and we derive large sample bounds on the causal vaccine efficacy for infectiousness estimands. An example of the type of data observed in a study to estimate vaccine efficacy for infectiousness is analyzed in the causal inference framework we developed.
causal inference; principal stratification; interference; infectious disease; vaccine
Pearl’s article provides a useful springboard for discussing further the benefits and drawbacks of principal stratification and the associated discomfort with attributing effects to post-treatment variables. The basic insights of the approach are important: pay close attention to modification of treatment effects by variables not observable before treatment decisions are made, and be careful in attributing effects to variables when counterfactuals are ill-defined. These insights have often been taken too far in many areas of application of the approach, including instrumental variables, censoring by death, and surrogate outcomes. A novel finding is that the usual principal stratification estimand in the setting of censoring by death is by itself of little practical value in estimating intervention effects.
principal stratification; causal inference
It is frequently of interest to estimate the intervention effect that adjusts for post-randomization variables in clinical trials. In the recently completed HPTN 035 trial, there is differential condom use between the three microbicide gel arms and the No Gel control arm, so that intention to treat (ITT) analyses only assess the net treatment effect that includes the indirect treatment effect mediated through differential condom use. Various statistical methods in causal inference have been developed to adjust for post-randomization variables. We extend the principal stratification framework to time-varying behavioral variables in HIV prevention trials with a time-to-event endpoint, using a partially hidden Markov model (pHMM). We formulate the causal estimand of interest, establish assumptions that enable identifiability of the causal parameters, and develop maximum likelihood methods for estimation. Application of our model to the HPTN 035 trial reveals an interesting pattern of prevention effectiveness among different condom-use principal strata.
microbicide; causal inference; posttreatment variables; direct effect
Data analysis for randomized trials including multi-treatment arms is often complicated by subjects who do not comply with their treatment assignment. We discuss here methods of estimating treatment efficacy for randomized trials involving multi-treatment arms subject to non-compliance. One treatment effect of interest in the presence of non-compliance is the complier average causal effect (CACE) (Angrist et al. 1996), which is defined as the treatment effect for subjects who would comply regardless of the assigned treatment. Following the idea of principal stratification (Frangakis & Rubin 2002), we define principal compliance (Little et al. 2009) in trials with three treatment arms, extend CACE and define causal estimands of interest in this setting. In addition, we discuss structural assumptions needed for estimation of causal effects and the identifiability problem inherent in this setting from both a Bayesian and a classical statistical perspective. We propose a likelihood-based framework that models potential outcomes in this setting and a Bayes procedure for statistical inference. We compare our method with a method of moments approach proposed by Cheng & Small (2006) using a hypothetical data set, and further illustrate our approach with an application to a behavioral intervention study (Janevic et al. 2003).
Causal Inference; Complier Average Causal Effect; Multi-arm Trials; Non-compliance; Principal Compliance; Principal Stratification
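As background for the CACE discussed in the abstract above, the following minimal sketch shows the familiar two-arm instrumental-variable (Wald) estimator. It is not the three-arm likelihood or Bayes procedure the abstract proposes, and it assumes the usual exclusion restriction and monotonicity conditions. The function name and the arrays `z` (assignment), `d` (treatment received), and `y` (outcome) are hypothetical.

```python
import numpy as np


def cace_wald(z, d, y):
    """Two-arm Wald estimate of the complier average causal effect.

    z: 0/1 randomized assignment, d: 0/1 treatment actually received,
    y: outcome. Returns (ITT effect on y) / (ITT effect on d).
    """
    z, d, y = (np.asarray(a, dtype=float) for a in (z, d, y))
    itt_y = y[z == 1].mean() - y[z == 0].mean()  # effect of assignment on outcome
    itt_d = d[z == 1].mean() - d[z == 0].mean()  # estimated share of compliers
    if itt_d == 0:
        raise ValueError("assignment does not change treatment receipt")
    return itt_y / itt_d
```

For example, `cace_wald([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 0], [3, 3, 1, 1, 1, 1, 1, 1])` divides an ITT effect of 1 by a complier share of 0.5.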
Pearl (2011) asked for the causal inference community to clarify the role of the principal stratification framework in the analysis of causal effects. Here, I argue that the notion of principal stratification has shed light on problems of non-compliance, censoring-by-death, and the analysis of post-infection outcomes; that it may be of use in considering problems of surrogacy but further development is needed; that it is of some use in assessing “direct effects”; but that it is not the appropriate tool for assessing “mediation.” There is nothing within the principal stratification framework that corresponds to a measure of an “indirect” or “mediated” effect.
causal inference; mediation; non-compliance; potential outcomes; principal stratification; surrogates
When the true end points (T) are difficult or costly to measure, surrogate markers (S) are often collected in clinical trials to help predict the effect of the treatment (Z). There is great interest in understanding the relationship among S, T, and Z. A principal stratification (PS) framework has been proposed by Frangakis and Rubin (2002) to study their causal associations. In this paper, we extend the framework to a multiple trial setting and propose a Bayesian hierarchical PS model to assess surrogacy. We apply the method to data from a large collection of colon cancer trials in which S and T are binary. We obtain the trial-specific causal measures among S, T, and Z, as well as their overall population-level counterparts that are invariant across trials. The method allows for information sharing across trials and reduces the nonidentifiability problem. We examine the frequentist properties of our model estimates and the impact of the monotonicity assumption using simulations. We also illustrate the challenges in evaluating surrogacy in the counterfactual framework that result from nonidentifiability.
Bayesian estimation; Counterfactual model; Identifiability; Multiple trials; Principal stratification; Surrogate marker
In vaccine research, immune biomarkers that can reliably predict a vaccine’s effect on the clinical endpoint (i.e., surrogate markers) are important tools for guiding vaccine development. This paper addresses issues on optimizing two-phase sampling study design for evaluating surrogate markers in a principal surrogate framework, motivated by the design of a future HIV vaccine trial. To address the problem of missing potential outcomes in a standard trial design, novel trial designs have been proposed that utilize baseline predictors of the immune response biomarker(s) and/or augment the trial by vaccinating uninfected placebo recipients at the end of the trial and measuring their immune biomarkers. However, inefficient use of the augmented information can lead to counterintuitive results on the precision of estimation. To remedy this problem, we propose a pseudo-score type estimator suitable for the augmented design and characterize its asymptotic properties. This estimator has superior performance compared with existing estimators and allows calculation of analytical variances useful for guiding study design. Based on the new estimator we investigate in detail the problem of optimizing the sampling scheme of a biomarker in a vaccine efficacy trial for efficiently estimating its surrogate effect, as characterized by the vaccine efficacy curve (a causal effect predictiveness curve) and by the predicted overall vaccine efficacy using the biomarker.
Closeout placebo vaccination; Estimated likelihood; Immune correlate; Principal surrogate; Pseudo-score; Two-phase sampling design
Evaluation of HIV vaccine candidates in non-human primates (NHPs) is a critical step toward developing a successful vaccine to control the HIV pandemic. Historically, HIV vaccine regimens have been tested in NHPs by administering a single high dose of the challenge virus. More recently, evaluation of candidate HIV vaccines has entailed repeated low-dose challenges which more closely mimic typical exposure in natural transmission settings. In this paper, we consider evaluation of the type and magnitude of vaccine efficacy from such experiments. Based on the principal stratification framework, we also address evaluation of potential immunological surrogate endpoints for infection.
Causal inference; Correlates of protection; HIV; Potential outcomes; Surrogate marker; Vaccine trial
There has been a recent emphasis on the identification of biomarkers and other biologic measures that may be potentially used as surrogate endpoints in clinical trials. We focus on the setting of data from a single clinical trial. In this paper, we consider a framework in which the surrogate must occur before the true endpoint. This suggests viewing the surrogate and true endpoints as semi-competing risks data; this approach is new to the literature on surrogate endpoints and leads to an asymmetrical treatment of the surrogate and true endpoints. However, such a data structure also conceptually complicates many of the previously considered measures of surrogacy in the literature. We propose novel estimation and inferential procedures for the relative effect and adjusted association quantities proposed by Buyse and Molenberghs (1998, Biometrics, 1014 – 1029). The proposed methodology is illustrated with application to simulated data, as well as to data from a leukemia study.
Bivariate survival data; Copula model; Dependent Censoring; Multivariate failure time data; Prentice criterion
The federal and provincial governments have undertaken a universal immunization program to protect school-aged girls against cervical cancer using the new human papillomavirus vaccine Gardasil®. While the vaccine appears to be effective and safe, there are a number of important unanswered questions regarding it and the effects of the immunization program. Here we briefly review key literature about the vaccine and then use the Erickson criteria, which offer an evidence basis for decision-making regarding national immunization strategies, to evaluate whether the program is congruent with sound public health policy. Our analysis of the national decision to recommend and fund a vaccination program using Gardasil® raises significant questions about the basis for this program.
Identification of an immune response to vaccination that reliably predicts protection from clinically significant infection, i.e. an immunological surrogate endpoint, is a primary goal of vaccine research. Using this problem of evaluating an immunological surrogate as an illustration, we describe a hierarchy of three criteria for a valid surrogate endpoint and statistical analysis frameworks for evaluating them. Based on a placebo-controlled vaccine efficacy trial, the first level entails assessing the correlation of an immune response with a study endpoint in the study groups, and the second level entails evaluating an immune response as a surrogate for the study endpoint that can be used for predicting vaccine efficacy for a setting similar to that of the vaccine trial. We show that baseline covariates, innovative study design, and a potential outcomes formulation can be helpful for this assessment. The third level entails validation of a surrogate endpoint via meta-analysis, where the goal is to evaluate how well the immune response can be used to predict vaccine efficacy for new settings (building bridges). A simulated vaccine trial and two example vaccine trials are presented, one supporting that certain anti-influenza antibody levels are an excellent surrogate for influenza illness and another supporting that certain anti-HIV antibody levels are not useful as a surrogate for HIV infection.
clinical trial; counterfactual; immune correlate; meta-analysis; potential outcomes; principal surrogate; statistical surrogate
In this commentary, structural equation models (SEMs) are discussed as a tool for epidemiologic analysis. Such models are related to and compared with other analytic approaches often used in epidemiology, including regression analysis, causal diagrams, causal mediation analysis, and marginal structural models. Several of these other approaches in fact developed out of the SEM literature. However, SEMs themselves tend to make much stronger assumptions than these other techniques. SEMs estimate more types of effects than do these other techniques, but this comes at the price of additional assumptions. Many of these assumptions have often been ignored and not carefully evaluated when SEMs have been used in practice. In light of the strong assumptions employed by SEMs, the author argues that they should be used principally for the purposes of exploratory analysis and hypothesis generation when a broad range of effects are potentially of interest.
causal inference; causality; causal modeling; confounding factors (epidemiology); epidemiologic methods; regression analysis; structural equation model
The RENAAL (Reduction of Endpoints in NIDDM with the Angiotensin II Antagonist Losartan) study is a recently published multinational, double-blind, randomized, placebo-controlled trial. It aimed to evaluate the effect of the angiotensin receptor blocker losartan in patients with diabetic nephropathy. The primary efficacy measure was the time to the first event of the composite end point of a doubling of serum creatinine, end-stage renal disease, or death. The conclusion was that losartan led to a significant improvement in renal outcomes beyond that attributable to blood pressure control in patients with type 2 diabetes and nephropathy.
A close reading of the report raises concerns regarding both the patient population and the outcome measures. At randomization, the placebo group included more patients with angina, myocardial infarction and lipid disorders than the losartan group. Information on glucose metabolism was disregarded, and data on antihyperglycemic therapy, which may have undesirable influences on cardiac performance, were not included in a multivariate analysis. In addition, only data on first hospitalization were reported, whilst information on total specific-cause hospitalizations was disregarded, thus potentially masking further unfavorable events. Furthermore, creatinine does not seem to be a reliable surrogate end point. Based on its mechanism of action, losartan may possess favorable renoprotective properties. However, due to the methodological flaws and the incomplete data in the RENAAL study, the question of the effectiveness and safety of this drug in diabetic nephropathy remains unanswered.
Angiotensin receptor blockers; Clinical trials; Diabetes mellitus; Losartan; Nephropathy; RENAAL study
There has been substantive interest in the assessment of surrogate endpoints in medical research. These are measures which could potentially replace “true” endpoints in clinical trials and lead to studies that require less follow-up. Recent research in the area has focused on assessments using causal inference frameworks. Beginning with a simple model for associating the surrogate and true endpoints in the population, we approach the problem as one of endogenous covariates. An instrumental variables estimator and general two-stage algorithm is proposed. Existing surrogacy frameworks are then evaluated in the context of the model. In addition, we define an extended relative effect estimator as well as a sensitivity analysis for assessing what we term the treatment instrumentality assumption. A numerical example is used to illustrate the methodology.
Clinical Trial; Counterfactual; Nonlinear response; Prentice Criterion; Structural equations model
Using multiple historical trials with surrogate and true endpoints, we consider various models to predict the effect of treatment on a true endpoint in a target trial in which only a surrogate endpoint is observed. This predicted result is computed using (1) a prediction model (mixture, linear, or principal stratification) estimated from historical trials and the surrogate endpoint of the target trial and (2) a random extrapolation error estimated from successively leaving out each trial among the historical trials. The method applies to either binary outcomes or survival to a particular time that is computed from censored survival data. We compute a 95% confidence interval for the predicted result and validate its coverage using simulation. To summarize the additional uncertainty from using a predicted instead of true result for the estimated treatment effect, we compute its multiplier of standard error. Software is available for download.
Randomized trials; Reproducibility; Principal stratification
We investigated the putative surrogate endpoints (PSEs) of best response (BR), complete response (CR), confirmed response (CoR), and progression-free survival (PFS) for associations with Overall Survival (OS), and as possible surrogate endpoints for OS.
Individual patient (pt) data from 870 untreated ES-SCLC pts participating in 6 single-arm (274 pts) and 3 randomized trials (596 pts) were pooled. Patient-level associations between PSEs and OS were assessed by Cox models using landmark analyses. Trial-level surrogacy of PSEs was assessed by the association of treatment effects on OS and individual PSEs. Trial-level surrogacy measures included: R2 from weighted least squares regression model (WLS R2), Spearman's correlation coefficient, and R2 from bivariate survival model (Copula R2).
Median OS and PFS were 9.6 (95% CI: 9.1-10.0) and 5.5 (95% CI: 5.2-5.9) months, respectively; BR, CR, and CoR rates were 44%, 22%, and 34%, respectively. Patient-level associations showed that PFS status at 4 months was a strong predictor of subsequent survival (HR=0.42 (95% CI: 0.35-0.51); concordance index=0.63; p<0.01), with 6-month PFS being the strongest (HR=0.41 (95% CI: 0.35-0.49); concordance index=0.66; p<0.01). At the trial-level, PFS showed the highest level of surrogacy for OS (WLS R2=0.79; Copula R2=0.80), explaining 79% of the variance in OS. Tumor response endpoints showed lower surrogacy levels (WLS R2≤0.48).
PFS was strongly associated with OS at both the patient and trial-level. PFS also shows promise as a potential surrogate for OS, but further validation is needed using data from a larger number of randomized phase III trials.
extensive-stage small cell lung cancer; surrogate endpoints; pooled analysis; progression-free survival; tumor response
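The WLS R2 measure named in the methods above can be sketched as follows. This is a minimal illustration, not the study's analysis code: it assumes one estimated treatment effect per trial on the surrogate and on OS (for instance log hazard ratios) and a positive weight per trial (for instance trial size). The function name and argument names are hypothetical.

```python
import numpy as np


def wls_r2(effect_surrogate, effect_true, weights):
    """Weighted R^2 from regressing true-endpoint effects on surrogate effects.

    effect_surrogate, effect_true: per-trial treatment effect estimates.
    weights: positive per-trial weights (assumed here to be trial sizes).
    """
    x = np.asarray(effect_surrogate, dtype=float)
    y = np.asarray(effect_true, dtype=float)
    w = np.asarray(weights, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # intercept plus slope
    sw = np.sqrt(w)
    # Weighted least squares via ordinary least squares on sqrt-weighted data.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    resid = y - X @ beta
    y_bar = np.average(y, weights=w)
    ss_res = np.sum(w * resid**2)
    ss_tot = np.sum(w * (y - y_bar) ** 2)
    return 1.0 - ss_res / ss_tot
```

A value near 1 means that trial-level effects on the surrogate explain most of the weighted variance in trial-level effects on OS. With only a handful of trials the estimate is very imprecise, which is consistent with the abstract's call for validation in more randomized trials.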
In clinical trials, a biomarker (S) that is measured after randomization and is strongly associated with the true endpoint (T) can often provide information about T and hence the effect of a treatment (Z) on T. A useful biomarker can be measured earlier than T and cost less than T. In this paper we consider the use of S as an auxiliary variable and examine the information recovery from using S for estimating the treatment effect on T, when S is completely observed and T is partially observed. In an ideal but often unrealistic setting, when S satisfies Prentice’s definition for perfect surrogacy, there is the potential for substantial gain in precision by using data from S to estimate the treatment effect on T. When S is not close to a perfect surrogate, it can provide substantial information only under particular circumstances. We propose to use a targeted shrinkage regression approach that data-adaptively takes advantage of the potential efficiency gain yet avoids the need to make a strong surrogacy assumption. Simulations show that this approach strikes a balance between bias and efficiency gain. Compared with competing methods, it has better mean squared error properties and can achieve substantial efficiency gain, particularly in a common practical setting when S captures much but not all of the treatment effect and the sample size is relatively small. We apply the proposed method to a glaucoma data example.
Auxiliary Variable; Biomarker; Randomized Trials; Ridge Regression; Missing Data
The paired availability design for historical controls postulated four classes corresponding to the treatment (old or new) a participant would receive if arrival occurred during either of two time periods associated with different availabilities of treatment. These classes were later extended to other settings and called principal strata. Judea Pearl asks if principal stratification is a goal or a tool and lists four interpretations of principal stratification. In the case of the paired availability design, principal stratification is a tool that falls squarely into Pearl's interpretation of principal stratification as “an approximation to research questions concerning population averages.” We describe the paired availability design and the important role played by principal stratification in estimating the effect of receipt of treatment in a population using data on changes in availability of treatment. We discuss the assumptions and their plausibility. We also introduce the extrapolated estimate to make the generalizability assumption more plausible. By showing why the assumptions are plausible we show why the paired availability design, which includes principal stratification as a key component, is useful for estimating the effect of receipt of treatment in a population. Thus, for our application, we answer Pearl's challenge to clearly demonstrate the value of principal stratification.
principal stratification; causal inference; paired availability design
The literature on potential outcomes has shown that traditional methods for characterizing surrogate endpoints in clinical trials based only on observed quantities can fail to capture causal relationships between treatments, surrogates, and outcomes. Building on the potential-outcomes formulation of a principal surrogate, we introduce a Bayesian method to estimate the Causal Effect Predictiveness (CEP) surface and quantify a candidate surrogate’s utility for reliably predicting clinical outcomes. In considering the full joint distribution of all potentially-observable quantities, our Bayesian approach has the following features. First, our approach illuminates implicit assumptions embedded in previously-used estimation strategies that have been shown to result in poor performance. Second, our approach provides tools for making explicit and scientifically-interpretable assumptions regarding associations about which observed data are not informative. Through simulations based on an HIV vaccine trial, we found that the Bayesian approach can produce estimates of the CEP surface with improved performance compared to previous methods. Third, our approach can extend principal-surrogate estimation beyond the previously-considered setting of a vaccine trial where the candidate surrogate is constant in one arm of the study. We illustrate this extension through an application to an AIDS therapy trial where the candidate surrogate varies in both treatment arms.
Biomarker; Causal effect predictiveness; principal stratification; surrogate endpoint
Seven randomized trials published in the last six years have shown that warfarin reduces the risk of ischaemic strokes and death in patients with atrial fibrillation. The annual rates of major bleeding episodes in all these trials were low and, as a result, doctors in primary and secondary care are being encouraged to consider using warfarin for patients with atrial fibrillation unless there are obvious contraindications. However, the populations used in these studies were highly selected and rigorously monitored throughout the trial period to minimize the risk of bleeding in a way which probably could not be expected in routine primary care. Although the rates of major bleeding episodes were uniformly low, the rates of minor bleeding episodes were much higher and these could impact substantially on patients' views of the treatment and on the workload of the primary care team. Evidence is now at hand which allows the stratification of risk in patients with atrial fibrillation which should enable those who are at greatest risk to be considered for this form of treatment. Patients may develop risk factors over time which could render them unsuitable for continuation of warfarin therapy. The general practitioner is centrally placed to make the decision about initiating or continuing treatment or indeed stopping it. Several models for decision making in warfarin treatment from primary and secondary care are proposed.
Overall survival (OS) is the gold standard for the demonstration of a clinical benefit in cancer trials. Replacing OS with a surrogate endpoint makes it possible to reduce trial duration. To date, few surrogate endpoints have been validated in digestive oncology. The aim of this study was to draw up an ordered list of potential surrogate endpoints for OS in digestive cancer trials, by way of a survey among clinicians and methodologists. A secondary objective was to obtain their opinion on surrogacy and quality of life (QoL).
In 2007 and 2008, self-administered sequential questionnaires were sent to a panel of French clinicians and methodologists involved in the conduct of cancer clinical trials. In the first questionnaire, panellists were asked to choose the most important characteristics defining a surrogate among six proposals, to give advantages and drawbacks of the surrogates, and to answer questions about their validation and use. They were then asked to suggest potential surrogate endpoints for OS in each of the following tumour sites: oesophagus, stomach, liver, pancreas, biliary tract, lymphoma, colon, rectum, and anus. Finally, they gave their opinion on QoL as a surrogate endpoint. In the second questionnaire, they were asked to rank the previously proposed candidate surrogates from the most relevant (position #1) to the least relevant in their opinion.
The frequency with which each endpoint was chosen as the first, second or third most relevant surrogate was calculated and served as the final ranking.
The response rate was 30% (24/80) in the first round and 20% (16/80) in the second. Participants highlighted key points concerning surrogacy. In particular, they noted that a surrogate endpoint is expected to predict clinical benefit in a well-defined therapeutic situation. Half of them thought it was not relevant to study QoL as a surrogate for OS.
DFS, in the neoadjuvant settings or early stages, and PFS, in the non-operable or metastatic settings, were ranked first, with a frequency of more than 69% in 20 out of 22 settings. PFS was proposed in association with QoL in metastatic primary liver and stomach cancers (both 81%). This composite endpoint was ranked second in metastatic oesophageal (69%), colorectal (56%) and anal (56%) cancers, whereas QoL alone was also suggested in most metastatic situations.
Other endpoints frequently suggested were R0 resection in the neoadjuvant settings (oesophagus (69%), stomach (56%), pancreas (75%) and biliary tract (63%)) and response. An unexpected endpoint was metastatic PFS in non-operable oesophageal (31%) and pancreatic (44%) cancers. The quality and results of surgical procedures, such as sphincter preservation, were also cited as eligible surrogate endpoints in rectal (19%) and anal (50% in case of localized disease) cancers. Except for alpha-FP kinetics in hepatocellular carcinoma (13%) and CA19-9 decline (6%) in pancreatic cancer, few endpoints based on biological or tumour markers were proposed.
The overall results should help prioritise the endpoints to be statistically evaluated as surrogates for OS, so that trialists and clinicians can rely on endpoints that ensure relevant clinical benefit to the patient.