The general approach developed below is to construct a hypothetical trial that embodies the known characteristics of the treatment and pharmacogenomic marker – the overall treatment effect unstratified by the marker, the marker effect in each study arm, and the distribution of the marker. The comparative treatment effect for the marker subgroups is estimated by demonstrating that only specific values of the treatment effect for the subgroups will be consistent with the set of treatment and marker characteristics specified.
If an appropriately designed RCT, comparing treatments α and β, were available in which the pharmacogenomic marker status of participants is known, a subgroup analysis could be undertaken on the basis of the marker. For simplicity it is assumed here that the marker has only two values (A and A′; e.g. corresponding to positive/negative, high/low, mutated/wildtype, carriage of allele/no carriage of allele) and that the outcome of interest is a binary event (e) that has a probability (P) of occurring over a specified time period. For each marker subgroup the risk ratio ($RR_{A}^{\alpha|\beta}$ and $RR_{A'}^{\alpha|\beta}$) for the comparative treatment effect may be directly estimated from such an RCT (equation 1):

$$RR_{A}^{\alpha|\beta} = \frac{P[e \mid \alpha, A]}{P[e \mid \beta, A]}, \qquad RR_{A'}^{\alpha|\beta} = \frac{P[e \mid \alpha, A']}{P[e \mid \beta, A']} \tag{1}$$

As indicated by equation 1, the information derived from such a trial would be sufficient to determine the choice of therapy (α or β) for each subgroup that will minimize the risk of the event. However, such trials are not always available. Therefore, the specific goal of the analysis presented in this paper is to indirectly estimate $RR_{A}^{\alpha|\beta}$ and $RR_{A'}^{\alpha|\beta}$.
A common form of evidence for a pharmacogenomic marker is an association study. Data from an association study (or meta-analysis of association studies) provide an estimate of the risk ratio of an outcome between individuals with different values of the marker for individuals using treatment α ($RR_{\alpha}^{A|A'}$: equation 2):

$$RR_{\alpha}^{A|A'} = \frac{P[e \mid \alpha, A]}{P[e \mid \alpha, A']} \tag{2}$$

A similar estimate may be available for individuals using an alternative treatment β ($RR_{\beta}^{A|A'}$).
With this information, a prescriber can advise a patient of his or her prognosis given the use of either drug. However, this information is insufficient to advise the patient as to the optimum choice of therapy, that is, the therapy that minimizes P[e]. Specifically, if $RR_{\alpha}^{A|A'} < 1$ (that is, patients with marker value A′ have a higher risk of the event on α than patients with marker value A), it does not follow that patients with the marker value A′ should not be treated with therapy α, which could still be more effective than the alternative treatment options (e.g. β).
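The distinction between a prognostic marker effect and the choice of therapy can be made concrete with a small numerical sketch. All risks below are hypothetical values chosen only to illustrate the argument; they are not taken from any study.

```python
# Hypothetical event risks P[e | treatment, marker]; illustrative values only.
risk = {
    ("alpha", "A"): 0.10,
    ("alpha", "A'"): 0.20,
    ("beta", "A"): 0.15,
    ("beta", "A'"): 0.40,
}

# Marker effect within the alpha arm (A relative to A'): prognostic information.
rr_marker_alpha = risk[("alpha", "A")] / risk[("alpha", "A'")]

# Comparative treatment effect (alpha relative to beta) in the A' subgroup.
rr_treatment_a_prime = risk[("alpha", "A'")] / risk[("beta", "A'")]

# Therapy that minimizes the event risk for patients with marker value A'.
best_for_a_prime = min(("alpha", "beta"), key=lambda t: risk[(t, "A'")])
```

In this example patients with A′ have twice the risk of patients with A when using α, yet α still halves their risk relative to β, so the marker effect within one arm does not by itself identify the better therapy.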
In addition to estimates of $RR_{\alpha}^{A|A'}$ and $RR_{\beta}^{A|A'}$ (the marker risk ratios in each treatment arm) from association studies, it is assumed that an estimate of the treatment effect is available from a conventional RCT (or meta-analysis of RCTs) in which the cohort is not stratified for the marker of interest ($RR^{\alpha|\beta} = P[e \mid \alpha] / P[e \mid \beta]$). Alternatively, $RR^{\alpha|\beta}$ may be based on an indirect treatment comparison of RCTs with a common comparator, although this may lead to an increased risk of bias. Finally, it is assumed that data are available on the prevalence of the marker ($P[A]$) in patients who have the condition that will be treated with α or β. This information is generally available from the association studies but may also be sourced elsewhere. It is assumed that the prevalence of the marker is balanced between arms of the hypothetical trial.
The probability of the clinical outcome in the unstratified cohort is estimated to be the weighted average of the probability of the clinical outcome in the pharmacogenomic subgroups, using the law of total probability, which relates marginal probability and conditional probability (equation 3):

$$P[e \mid \alpha] = P[A]\,P[e \mid \alpha, A] + P[A']\,P[e \mid \alpha, A'], \qquad P[A'] = 1 - P[A] \tag{3}$$
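Combining equations 2 and 3 leads to the following formulas for indirectly estimating the risk of the event in the pharmacogenomic subgroups (A and A′) for treatment α, where $RR_{\alpha}^{A|A'} = P[e \mid \alpha, A] / P[e \mid \alpha, A']$ is the marker risk ratio for treatment α:

$$P[e \mid \alpha, A'] = \frac{P[e \mid \alpha]}{P[A]\,RR_{\alpha}^{A|A'} + 1 - P[A]}, \qquad P[e \mid \alpha, A] = RR_{\alpha}^{A|A'}\,P[e \mid \alpha, A']$$

Calculation of the risk of the event in the pharmacogenomic subgroups for treatment β may be similarly undertaken.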
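The calculation for one treatment arm can be sketched in a few lines of Python. The function name, argument names and input values are assumptions made for this illustration; the arithmetic follows from combining the marker risk ratio with the law of total probability.

```python
def subgroup_risks(p_event, rr_marker, prev_a):
    """Split an unstratified event risk into marker-subgroup risks.

    p_event   : P[e | treatment] in the unstratified cohort
    rr_marker : P[e | treatment, A] / P[e | treatment, A'] from association data
    prev_a    : P[A], the prevalence of marker value A
    Returns (P[e | treatment, A], P[e | treatment, A']).
    """
    if not 0.0 <= prev_a <= 1.0:
        raise ValueError("prev_a must be a probability")
    # Law of total probability, solved for the risk in the A' subgroup.
    risk_a_prime = p_event / (prev_a * rr_marker + (1.0 - prev_a))
    # The marker risk ratio then gives the risk in the A subgroup.
    risk_a = rr_marker * risk_a_prime
    return risk_a, risk_a_prime


# Hypothetical inputs: 20% overall risk, marker risk ratio of 2, 30% prevalence.
risk_a, risk_a_prime = subgroup_risks(p_event=0.20, rr_marker=2.0, prev_a=0.30)

# Sanity check: the prevalence-weighted average recovers the overall risk.
recombined = 0.30 * risk_a + 0.70 * risk_a_prime
```

Applying the same function to the β arm, with its own marker risk ratio and overall risk, gives the four subgroup risks needed for the comparative treatment effect.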
Subsequently, using the relationship described in equation 1, the comparative treatment effect for the subgroups defined by the pharmacogenomic marker may be indirectly estimated.
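As a worked step, the subgroup risks for both arms can be substituted into the subgroup risk ratio to give a closed form. The notation here is an assumption made for this sketch: $p = P[A]$ is the marker prevalence, $RR^{\alpha|\beta} = P[e \mid \alpha] / P[e \mid \beta]$ is the unstratified treatment effect, and $RR_{t}^{A|A'} = P[e \mid t, A] / P[e \mid t, A']$ is the marker risk ratio in arm $t$.

```latex
\begin{align*}
RR_{A'}^{\alpha|\beta}
  &= \frac{P[e \mid \alpha, A']}{P[e \mid \beta, A']}
   = RR^{\alpha|\beta} \cdot
     \frac{p\,RR_{\beta}^{A|A'} + (1 - p)}{p\,RR_{\alpha}^{A|A'} + (1 - p)}, \\
RR_{A}^{\alpha|\beta}
  &= \frac{P[e \mid \alpha, A]}{P[e \mid \beta, A]}
   = RR_{A'}^{\alpha|\beta} \cdot \frac{RR_{\alpha}^{A|A'}}{RR_{\beta}^{A|A'}}.
\end{align*}
```

Written this way, the absolute risks in the two arms cancel, so under the stated assumptions the subgroup treatment effects depend only on the unstratified treatment effect, the two marker risk ratios and the marker prevalence.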
Credible intervals (analogous to confidence intervals) for pharmacogenomic subgroup treatment effects, and statistical inference on the difference between subgroup treatment effects, may be estimated using Monte Carlo simulation. This approach essentially estimates the uncertainty of the output (the subgroup treatment effects $RR_{A}^{\alpha|\beta}$ and $RR_{A'}^{\alpha|\beta}$) based on the collective uncertainty of the inputs (the marker effects $RR_{\alpha}^{A|A'}$ and $RR_{\beta}^{A|A'}$, the unstratified treatment effect $RR^{\alpha|\beta}$, and the marker prevalence $P[A]$). Thus, information on the distribution of these parameters (e.g. based on the 95% confidence interval) needs to be available. Typically, risk ratio estimates are represented by a lognormal distribution and probabilities by a beta distribution. Monte Carlo simulation involves randomly drawing values from the distributions of the input variables and calculating the output variable. This process is repeated a large number of times (e.g. 10,000), producing the distribution of the output variable. Whether the difference between subgroups is statistically significant (statistical test of interaction) may also be assessed. However, care must be taken in interpreting the statistical significance due to the risk of bias inherent in the indirect estimation.
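The simulation can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not a reference implementation: the function names, the use of marker counts as beta parameters, the derivation of the lognormal standard deviation from a 95% confidence interval, and the tail-area summary of the interaction are all choices made for this example, and the input values are hypothetical.

```python
import math
import random
import statistics


def lognormal_params(point, lower, upper):
    """Log-scale mean and SD implied by a risk ratio and its 95% CI."""
    mu = math.log(point)
    sigma = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    return mu, sigma


def subgroup_rrs(rr_trt, rr_marker_alpha, rr_marker_beta, prev_a):
    """Subgroup risk ratios (alpha vs beta) for marker values A and A'."""
    scale = (prev_a * rr_marker_beta + 1 - prev_a) / (
        prev_a * rr_marker_alpha + 1 - prev_a
    )
    rr_a_prime = rr_trt * scale
    rr_a = rr_a_prime * rr_marker_alpha / rr_marker_beta
    return rr_a, rr_a_prime


def interval(draws):
    """Median and 2.5th/97.5th percentiles of the simulated draws."""
    cuts = statistics.quantiles(draws, n=40)  # 39 cut points, 2.5% apart
    return statistics.median(draws), cuts[0], cuts[-1]


def simulate(rr_inputs, marker_counts, n_draws=10_000, seed=1):
    """Propagate input uncertainty to the subgroup treatment effects.

    rr_inputs     : dict with keys "treatment", "marker_alpha", "marker_beta",
                    each mapping to (point estimate, lower 95%, upper 95%)
    marker_counts : (n with A, n with A'), used as beta parameters for P[A]
    """
    rng = random.Random(seed)
    params = {k: lognormal_params(*v) for k, v in rr_inputs.items()}
    rr_a_draws, rr_a_prime_draws, ratio_draws = [], [], []
    for _ in range(n_draws):
        draw = {k: rng.lognormvariate(mu, sd) for k, (mu, sd) in params.items()}
        prev_a = rng.betavariate(*marker_counts)
        rr_a, rr_a_prime = subgroup_rrs(
            draw["treatment"], draw["marker_alpha"], draw["marker_beta"], prev_a
        )
        rr_a_draws.append(rr_a)
        rr_a_prime_draws.append(rr_a_prime)
        ratio_draws.append(rr_a / rr_a_prime)
    # Simulation-based tail area for "no difference between subgroups".
    share_above_one = sum(r > 1 for r in ratio_draws) / n_draws
    return {
        "rr_a": interval(rr_a_draws),
        "rr_a_prime": interval(rr_a_prime_draws),
        "ratio": interval(ratio_draws),
        "two_sided_p": 2 * min(share_above_one, 1 - share_above_one),
    }


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    example = simulate(
        {
            "treatment": (0.80, 0.70, 0.91),
            "marker_alpha": (2.0, 1.5, 2.7),
            "marker_beta": (1.0, 0.8, 1.25),
        },
        marker_counts=(300, 700),
    )
    for name, value in example.items():
        print(name, value)
```

Under these formulas the ratio of the two subgroup treatment effects equals the ratio of the two marker risk ratios, so the interaction summary in this sketch is driven by the association-study inputs rather than by the unstratified treatment effect or the prevalence.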
The key assumption of the method is exchangeability of the studies (association studies, RCT). Specifically, the study populations should not differ on any modifiers of the prognostic effect of the marker or for any modifiers of the predictive effect of the marker. We introduce the label “marker-modifiers” to encompass both prognostic and predictive modifiers. Candidate marker-modifiers include patient factors (age, sex, severity of index condition, co-existing disease, ethnicity), study factors (length of follow-up, intensity of surveillance) and treatment factors (concomitant medications, surgery, or dose and duration of the index treatment).
Note that these factors could have different distributions in the included studies without invalidating the assumption of exchangeability. It is only when differences in these factors affect outcome in groups defined by the marker (i.e., only when a factor is a marker-modifier) that the assumption of exchangeability does not hold. In general, the greater the degree to which the assumption of exchangeability does not hold, the greater the expected risk of bias for comparative treatment effect estimates of the pharmacogenomic subgroups. The assumption of exchangeability in this context is analogous to the assumption of exchangeability (sometimes called “similarity”) of RCTs in an indirect treatment comparison; or more broadly of exchangeability for RCTs, non-randomised studies and direct head-to-head studies in a network meta-analysis. The variables (if any) that can modify the pharmacogenomic association study effect size, and the direction of the modification, will tend to be specific to the marker and drug in question; hence it is not possible to make a generic statement of how factors will affect exchangeability. The marker prevalence is unlikely to be an issue with respect to exchangeability unless there are substantial differences in marker prevalence between studies and marker prevalence is believed to modify the marker effect.
It is also assumed that the contributing studies are methodologically sound and their results are not subject to bias. In general, the greater the risk of bias in the contributing studies, the greater the expected risk of bias for comparative treatment effect estimates of the pharmacogenomic subgroups. The inputs and assumptions of the approach are summarized in the table that follows.
Required inputs and assumptions of the indirect estimation approach.