Cost-effectiveness analysis has gained status over the last 15 years as an important tool for assisting resource allocation decisions in a budget-limited environment such as healthcare. Randomised (multicentre) multinational controlled trials are often the main vehicle to collect primary patient-level information on resource use, cost and clinical effectiveness associated with alternative treatment strategies. However, trial-wide cost-effectiveness results may not be directly applicable to any one of the countries that participate in a multinational trial, requiring some form of additional modelling to customise the results to the country of interest.
The aim is to produce recommendations regarding methods that can be (currently) considered ‘good practice’ when exploring the geographical generalisability of cost-effectiveness data. The manuscript proposes an algorithm to assist with the choice of the appropriate analytical strategy when facing the task of adapting the study results from one country to another. The algorithm considers different scenarios characterised by whether or not (a) the country of interest participated in the trial, and (b) individual patient-level data (IPD) from the trial are available.
Structured review with description and discussion of case studies.
Methods to reflect between-country variability in cost-effectiveness data are available. It is important to be transparent regarding the assumptions made in the analysis and (where possible) assess their impact on the study results.
Cost-effectiveness analysis (CEA) has gained status over the last 15 years as an important tool for assisting resource allocation decisions in a budget-limited environment such as healthcare. Many national and provincial/state governments nowadays require cost-effectiveness evidence when assessing new healthcare technologies.[1-6] In the case of emerging technologies, where available clinical and economic evidence is still scarce, multinational and multicentre trials are often the main vehicle for collecting primary (patient-level) information on resource use, cost and clinical effectiveness associated with alternative treatment strategies. The multilocation design of these trials offers the benefit of speedy patient recruitment and large sample size, while facilitating reimbursement submissions in several jurisdictions. In fact, by recruiting participants from different countries (and settings), international clinical studies are believed[7, 8] to offer the advantage of generating evidence that is more likely to be ‘generalisable’ across locations than that produced by single-centre trials.
In spite of this, it has been argued that, strictly speaking, multinational trial-wide cost-effectiveness results may not be directly applicable to any one of the countries that participate in the clinical study, requiring some form of additional modelling to customise the results to the country of interest. There are various reasons for this. Firstly, decision-makers are inherently country-specific and are more interested in results which are directly relevant to their own jurisdiction. Secondly, it is possible that the country of interest did not participate in the clinical trial. Thirdly, even when the country of interest is part of the original study, the presence of country-specific factors potentially affecting the geographical variability of the study results (effectiveness, cost, and quality of life) means that trial-wide results may not be informative for reimbursement decisions at country level.
Bernie O'Brien was among the first to raise concerns regarding the generalisability of CEA data collected in one country to inform reimbursement decisions in another. His seminal paper suggested that between-country differences in (i) demography and epidemiology of disease, (ii) clinical practice and conventions, (iii) incentives and regulations for healthcare providers, (iv) relative price levels, (v) consumer preferences, and (vi) opportunity costs of resources, could all be potential threats. Health economists have attempted to address these concerns in many different ways since then and, as the methodology becomes more refined, new and alternative approaches are proposed. Three extensive reviews of conceptual and applied research in the area of generalisability of CEA studies have been recently published by Reed et al, Sculpher et al, and Goeree et al, and the interested reader is invited to refer to these reports for further details.
The present manuscript concerns itself with recent methodological developments in the analysis of cost-effectiveness data collected alongside multicentre and multinational RCTs. The aim is to produce recommendations regarding methods that can be (currently) considered ‘good practice’ when exploring the geographical generalisability of cost-effectiveness data. The manuscript proposes an algorithm to assist with the choice of the appropriate analytical strategy when facing the task of adapting the study results from one country to another. The algorithm considers different scenarios characterised by whether or not (a) the country of interest participated in the trial, and (b) individual patient-level data (IPD) from the trial are available. Given the programme of this conference, the manuscript focuses - specifically - on the methods, and provides only a brief discussion of the policy context. For a more detailed discussion of the latter the reader is invited to refer to the Sculpher and Drummond paper at this meeting.
The manuscript is structured as follows. Section 2 reviews the rationale for assessing the generalisability of cost-effectiveness results from country to country, and summarises the current perception as to which data can be directly ‘applied’ from one country to another and which ones need to be country-specific. Section 3 presents the methodology used to produce country-specific estimates of cost-effectiveness distinguishing four main scenarios, characterised by whether or not (i) the country of interest participated in the study, and (ii) IPD from the trial are available. The implications for the design, data collection, analysis and presentation of the study results are considered next. The final section discusses future lines of applied and policy research in this area.
The globalisation of clinical research for pharmaceuticals and medical devices in many disease areas (e.g. cardiovascular disease, oncology, respiratory disease, etc), paired with the need of the industry to seek regulatory approval in different jurisdictions, means that multinational trials are often the preferred vehicle for primary (resource use, clinical and quality of life) data collection. Geographical locations in North America, Western Europe and Asia, traditionally chosen as a base from which to recruit study participants, have now been joined by countries in Latin America and Eastern Europe. Possible reasons behind this trend relate to the need to recruit even larger study samples, the expanding market for pharmaceuticals (and devices) in these geographical areas, and the less stringent regulatory regimens operating in some countries.
While increasing the potential for the conduct of large studies, the internationalisation of clinical research poses several challenges in terms of design, management, statistical analysis and interpretation of the study results. In recognition of these challenges the International Conference on Harmonisation (ICH) has developed a series of guidelines[15-19] with the objective of facilitating the conduct of international clinical studies, with the expectation that the evidence derived from these studies would be used to meet regulatory requirements in different jurisdictions.
One of the consequences of the globalisation of clinical research, though, is that researchers are now more aware of the existence of country and regional variations in multinational trials with respect to the resource use and clinical outcome results.[20, 21] Between-country variability in clinical effectiveness, resource use and quality of life results, even after adjustment for patient baseline characteristics, is in fact a well known phenomenon in multinational randomised controlled trials (RCTs) investigating the management of patients following myocardial infarction (MI), for instance. Similar findings have been observed in studies considering patients' management after stroke,[24, 25] and the management of acute coronary syndrome (ACS). In the latter clinical area, multinational RCTs found important between[26, 27] and within country differences in resource use, therapeutic strategies and short-term mortality. The same trend has been observed in many other clinical areas[22, 23, 29-32] specifically with respect to variations in average length of stay when an identical treatment was implemented in similar populations in several countries simultaneously.
There are various factors that could affect the generalisability of the results of CEA studies. Sculpher et al reviewed the literature to identify these factors and found 36 papers discussing potential sources of variability between locations.
Patient-level variation feeds through to centre or country variations in cost-effectiveness if patients' characteristics (clinical and socio-demographic) are not evenly distributed between locations. It can be partly explained in terms of differences in demography.[33-39] Variation between locations (e.g. centres and countries) in the epidemiology of disease can also translate into different case-mixes between locations, with an obvious impact on the cost-effectiveness of a given treatment in a specific location.[36, 38, 40]
Clinicians can influence the effectiveness, cost and cost-effectiveness of interventions. This ‘clinician effect’ is particularly important in non-drug interventions (e.g. surgical), but pharmaceutical trials can also display between-clinician differences – for instance - in background treatments given to patients over and above the study treatment, or in the management of adverse events. The clinician effect is typically not easy to quantify within clinical[41-43] or economic evaluations.[44, 45] In part, variation in how clinical staff perform can be due to the fact that healthcare systems differ also in terms of the incentives that they offer to staff.[11, 36, 38, 46-48]
There may be numerous differences between countries and centres [36, 38, 40, 47, 49-56] (other than patients' and clinicians' characteristics) in terms of the process of healthcare delivery. Between-country differences in relative cost may be influenced by the technology involved in the production of healthcare, the level of substitution between labour and capital, and the types and cost of resource inputs used in the production of healthcare. Within-country variation is likely to be particularly pronounced in large and economically heterogeneous countries. Clinical practice and conventions are also known to differ widely between (and within) countries.[33, 36, 37, 47, 50, 53, 55, 58, 59]
Other factors which may have an impact on the generalisability of CEA results from country to country relate to variation between locations in terms of more general socio-economic factors. The willingness (and ability) of a region/country to devote resources to healthcare is one of these factors. Another broader factor discussed by Bernie O'Brien relates to the health-related preferences of the population such as those reflected in health state utilities used to calculate quality-adjusted life years (QALYs).
In view of the arguments developed above and in consideration of the time and effort required to complete a multinational trial-based CEA, it seems reasonable - from the viewpoint of both the industry and national/state governments - to support the use of methods which facilitate the ‘translation’ of cost-effectiveness data obtained from one country to make them applicable to another. The need to customise the economic study results to a specific jurisdiction is not purely academic, but stems from the decision-makers' need for context-specific information. This raises two overarching methodological questions: ‘what methods are there to make cost-effectiveness estimates more country-specific?’, and ‘how can we account for factors that may affect the between-country generalisability of cost-effectiveness results?’
In broad terms, the analytical options available range from the use of regression-based techniques to the application of decision-analytic models, and are already part of the toolkit of the health economist working in CEA. Decision models are typically used when the evidence base from the trial(s) of interest is available exclusively in summary format. However, there are examples where IPD from a single multinational trial have also been used - in combination with non-trial IPD - to populate a decision model with the objective to generate cost-effectiveness estimates for a country different from the one where the trial had been carried out. Regression-based methods, on the other hand, are used mainly when the country of interest actively recruited patients into the trial and the analyst has access to the study IPD (or at least country-specific summary data).
It emerges that the answers to the questions posed above depend, therefore, upon whether or not (a) the country of interest participated in the trial, and (b) IPD from the trial are available. To structure the discussion in a logical way, this manuscript proposes an algorithm (presented in Figure 1) developed to assist the decision as to which analytical strategy to adopt when faced with the task of generating country-specific cost-effectiveness estimates.
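The two questions that drive the choice of strategy can be sketched as a small lookup. This is only a rough reading of the text in this manuscript, not a reproduction of Figure 1; the function name and the wording of the returned labels are illustrative assumptions.

```python
def suggest_strategy(country_in_trial: bool, ipd_available: bool) -> str:
    """Map the two questions (a) and (b) to a broad analytical strategy.

    The labels are an illustrative summary of the text, not the
    content of Figure 1.
    """
    if country_in_trial and ipd_available:
        return "regression on trial IPD (e.g. hierarchical model)"
    if country_in_trial and not ipd_available:
        return "regression on country-level summary data, if published"
    if not country_in_trial and ipd_available:
        return "decision model: trial IPD plus local baseline risk and cost data"
    return "decision model: published trial results plus local data"


for in_trial in (True, False):
    for ipd in (True, False):
        print(in_trial, ipd, "->", suggest_strategy(in_trial, ipd))
```

In practice the boundaries are less sharp than this suggests: a decision model may still be needed when the country took part in the trial but contributed no resource use data.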
Let us start from the situation where, despite the IPD from the multinational (multicentre) trial being available, the country of interest did not participate in the trial. The decision-maker in the country of interest will be interested in the extent to which the results from this trial apply to her own setting. In this case, some form of decision modelling to extrapolate the study results from one country to another will be required. This is a very common situation in health technology assessment (HTA) and there are various examples in the literature (see relevant chapters in the reviews by Sculpher et al and Goeree et al).
As discussed in section 2, it is possible not only for resource use and cost data to vary by location, but the same can apply to clinical data. In the applied work, analysts have addressed this issue by assuming that the baseline risks for particular clinical events are location-specific, whilst the relative treatment effect is more generalisable across locations. In this case it is considered good practice to develop an ‘events based model’ built around ‘generalisable’ features of the disease or patient's prognosis, and use the IPD from the trial to estimate the likelihood of occurrence of the clinical events of interest which are expected to have an impact on resource use and / or health-related quality of life. The trial-wide relative treatment effect such as, for instance, the relative risk reduction (RRR) in the event(s) of interest observed in the trial (e.g. relative reduction in risk of deaths, MI, side effects), is then applied to the reference (baseline) risk (R0), i.e. the event rate without the treatment, for the country of interest. The latter information can be ascertained either from long-term follow-up cohort studies, or (more practically) using existing risk equations assuming that the risk factors (e.g. age, tobacco consumption, etc) between the trial population and that in the country of interest are the same, regardless of the country. Different distributions of these risk factors in different countries will translate into differences in country-specific baseline risks. Because cost-effectiveness is essentially concerned with absolute differences (in costs and effects), the absolute reduction in risk in the new country (and hence the number of events averted, should this be the measure of clinical outcome) is obtained simply by multiplying the trial-wide relative treatment effect by the baseline event rate in the country of interest, i.e. RRR × R0.
A well known example of this methodology is the application of the West of Scotland Coronary Prevention Study (WOSCOPS) cost-effectiveness results to Belgium, Canada, Sweden and South Africa. WOSCOPS concluded that treatment with pravastatin reduced the risk of first-time heart attack and death in middle-aged hypercholesterolaemic men. Because of the increased risk of cardiovascular disease (CVD) in the Scottish male population, though, the authors were keen to address possible concerns regarding the generalisability of the study findings to other countries. Using a system of competing risk equations, Caro et al combined trial (i.e. relative risk reduction of cardiovascular events) and non-trial IPD (i.e. baseline event rates based on risk equations and risk factors distribution for Belgium, local costs, and life expectancy) within a decision-analytic model to estimate the cost-effectiveness of pravastatin in Belgium. A simplified structure of their model is represented in Figure 2.
At any point in time men with hypercholesterolaemia were assumed to be in one of four states: (i) alive without having experienced a cardiovascular event (and still at risk in the following period), (ii) dead following a non-cardiovascular event, (iii) alive following a non-fatal cardiovascular event (in which case a given life expectancy was estimated), or (iv) dead following a cardiovascular event. Parameter estimates from the risk equations governing the above transitions in a population not receiving the active treatment were obtained by applying an exponential regression to the WOSCOPS IPD, considering a set of risk factors (e.g. age, high diastolic blood pressure, smoking, etc). Parameter estimates from the exponential regression model were then applied to IPD on the same set of risk factors obtained from a Belgian epidemiological study, to predict individual-patient (and average) probabilities of cardiovascular events for the cohort in the model. For each cardiovascular event considered in the model, direct costs were estimated using a registry which includes 35% of the hospitalisations in Belgium. Similarly, local data were used to estimate drug costs, while mortality tables for Belgium were used to extrapolate long term survival.
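The arithmetic behind this transfer of results can be illustrated with a minimal sketch. It assumes, as the text does, that the relative risk reduction is constant across countries; the country labels and all numbers are hypothetical and chosen only to show how the same RRR yields different absolute benefits.

```python
def absolute_risk_reduction(baseline_risk: float, rrr: float) -> float:
    """Absolute risk reduction = country baseline risk (R0) x trial-wide RRR."""
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be a probability")
    if not 0.0 <= rrr <= 1.0:
        raise ValueError("rrr must lie between 0 and 1")
    return baseline_risk * rrr


def events_averted(baseline_risk: float, rrr: float, n_patients: int) -> float:
    """Expected number of events averted when n_patients are treated."""
    return absolute_risk_reduction(baseline_risk, rrr) * n_patients


# Hypothetical inputs: one trial-wide RRR, two country-specific baseline risks.
trial_rrr = 0.30
baseline_risks = {"Country A": 0.10, "Country B": 0.05}

for country, r0 in baseline_risks.items():
    averted = events_averted(r0, trial_rrr, n_patients=1000)
    print(f"{country}: {averted:.1f} events averted per 1000 treated")
```

With an identical relative effect, the country with the lower baseline risk averts fewer events per patient treated, which is why the incremental cost per event averted can differ between countries even before local costs are considered.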
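The step of applying trial-based risk equations to local risk-factor data can be sketched as follows. The sketch assumes a constant-hazard (exponential) model in which the hazard is exp(intercept + sum of coefficient × covariate), so that the probability of an event within t years is 1 − exp(−hazard × t). The coefficients, covariate names and cohort records are hypothetical and are not taken from WOSCOPS or from the Belgian data.

```python
import math


def event_probability(intercept, coefs, covariates, years):
    """Probability of an event within `years` under a constant-hazard model."""
    linear_predictor = intercept + sum(
        coefs[name] * covariates[name] for name in coefs
    )
    hazard = math.exp(linear_predictor)  # events per person-year
    return 1.0 - math.exp(-hazard * years)


def mean_event_probability(intercept, coefs, cohort, years):
    """Average predicted probability over individual records of a local cohort."""
    probabilities = [
        event_probability(intercept, coefs, person, years) for person in cohort
    ]
    return sum(probabilities) / len(probabilities)


# Hypothetical coefficients, standing in for those estimated on trial IPD.
intercept = math.log(0.01)
coefs = {"age_over_55": 0.5, "smoker": 0.7}

# Hypothetical individual records from a local epidemiological study.
local_cohort = [
    {"age_over_55": 0, "smoker": 0},
    {"age_over_55": 1, "smoker": 0},
    {"age_over_55": 1, "smoker": 1},
]

print(mean_event_probability(intercept, coefs, local_cohort, years=5))
```

Because the prediction is made record by record, a different distribution of risk factors in the local cohort changes the average baseline risk without any change to the coefficients.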
It should be pointed out that this approach can also be used in case the country of interest participated in the trial, but IPD are only available for the clinical outcome (i.e. no country-specific resource use data collection for the country of interest had taken place). Furthermore, when the researcher is interested in the cost-effectiveness of the intervention in particular sub-groups of patients the analysis above would need to be run separately for different risk groups (based on the regression results).
The class of models discussed above typically relies on the (often untested) assumption that the relative clinical efficacy is independent of the underlying baseline risk of the disease, and that while the latter captures a range of country-specific factors (e.g. epidemiology, medical attitude, etc) the relative clinical effectiveness of the intervention does not differ greatly across countries. The use of clinical data collected from different countries (centres) to estimate a single relative treatment effect on clinical outcomes is therefore an accepted practice. However, as recognised by Caro et al[61, 63] it is possible that between-country variability in the risk factors could lead to different cost-effectiveness results in different locations. Between-country differences in the distribution of individual-patient level risk factors (e.g. age, smoking status, blood pressure) are always likely to exist, which when paired with possible differences in country-specific factors (e.g. type of healthcare system, percentage of national GDP spent on healthcare, etc.) could limit the generalisability of the relative treatment effect observed in the trial.
Despite these considerations, the assumption regarding the generalisability of the relative treatment effect on the clinical outcome from one country (or a set of countries) is only rarely scrutinised, even when access to IPD from the multinational trial is not an issue. When it is assessed, this is done using a test of heterogeneity, despite the fact that this test is typically underpowered.[14, 67] In the ‘absence of evidence’ (which does not imply ‘evidence of absence’) about between-country heterogeneity in the data, the results of this test are used as a basis to justify the analysis of the pooled (clinical and resource use) data, regardless of the country of origin. The implication is that non-statistically significant between-country differences in relative treatment effect may not be a concern for a given country decision-maker, even when these are qualitative in nature (i.e. the treatment effect in different countries not only differs in magnitude but also in direction).
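The text does not name a specific heterogeneity test. As an assumption for illustration, the sketch below uses Cochran's Q, a commonly used statistic, computed from country-specific treatment effect estimates and their standard errors, together with the derived I² measure. The input values are hypothetical.

```python
def cochran_q(estimates, standard_errors):
    """Return (Q, degrees of freedom, I-squared) for a set of estimates.

    Q is the inverse-variance weighted sum of squared deviations of the
    country-specific estimates from their weighted mean.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * est for w, est in zip(weights, estimates)) / sum(weights)
    q = sum(w * (est - pooled) ** 2 for w, est in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: share of total variation attributed to heterogeneity.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i_squared


# Hypothetical country-specific log relative risks and standard errors.
log_rr = [-0.35, -0.10, 0.15, -0.25]
se = [0.30, 0.25, 0.40, 0.35]

q, df, i2 = cochran_q(log_rr, se)
print(f"Q = {q:.2f} on {df} df, I-squared = {i2:.0%}")
```

Under homogeneity Q is approximately chi-squared distributed with k − 1 degrees of freedom for k countries. With large standard errors the statistic can remain small even when the estimates point in opposite directions, which is the sense in which a non-significant result is weak evidence of homogeneity.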
It can be argued, though, that the presence of between-country differences in the magnitude and sometimes in the direction of the relative treatment effect is a fundamental consideration which, when paired with international differences in factors affecting resource use and costs, makes the estimate of cost-effectiveness for a particular country based on the trial-wide relative treatment effect unreliable, even when individual risk factors (and their distributions) are similar between countries.
One of the earliest attempts to address the statistical analysis of multinational clinical trials for a cost-effectiveness analysis was presented by Willke et al. Using a system of related regression equations, the authors developed a novel approach to explore the between-country variability in the CEA results by looking at the treatment-by-country interactions in both effectiveness and costs. Willke et al presented the cost-effectiveness results for five separate countries under different assumptions regarding the generalisability of the data from the trial. Their results emphasized the differing spread of estimates that could be obtained under different approaches. The authors compared (i) a fully pooled analysis with multinational costing, which produced a single cost-effectiveness estimate for the whole trial, assuming trial-wide effectiveness; (ii) a pooled analysis (for each country) with price weights from the individual countries, again assuming trial-wide effectiveness, which produced very little variability in the results; (iii) a strategy relating trial-wide effectiveness and country-specific costs (with the countries' own price weights), which provided a much greater spread; and (iv) the fully split analysis, which resulted in the widest between-country variation.
The latter approach is equivalent to splitting the data and running a series of regression analyses for each country independently from the others. The potential problem with this approach is that it requires a choice to be made between ‘pooling’ vs ‘splitting’. The limitations of a pooled analysis have been discussed already. Splitting the data, on the other hand, is impractical when the country of interest has recruited a limited number of patients compared with the rest of the countries in the trial.
Furthermore, it can be argued that data collected from different countries (and patients) may share some degree of similarity and that there may be advantages in trying to capture such similarities. One way to do this is to reflect the hierarchical structure in the multinational data, inherent in the natural clustering arising from patients being recruited in specific countries and receiving treatment in centres with different characteristics.
Various authors have explored the use of hierarchical regression models for the analysis of multinational (and multicentre) trial-based cost-effectiveness data.[70-75] A simple hierarchical model for either cost or health outcome data can be described as follows. Let Yij be the observed cost (health outcome) of individual i in country j, and tij be an indicator variable taking the value 0 if the patient was treated in the control group and 1 if the patient was treated in the intervention group. The regression model for multinational cost-effectiveness data can then be written as
$$Y_{ij} = \alpha_j + \beta_j t_{ij} + \varepsilon_{ij} \qquad (1)$$
where the coefficients αj and βj are respectively the country j mean cost in the control arm and the differential mean cost between the two arms of the trial, so that the mean cost in the intervention group in country j is given by (αj + βj). Writing αj = α + vj and βj = β + uj, the model can be re-written as
$$Y_{ij} = \alpha + \beta t_{ij} + v_j + u_j t_{ij} + \varepsilon_{ij} \qquad (2)$$
where the last three terms represent the random components at the country (uj and vj) and individual (εij) level, usually assumed to follow a normal distribution with mean zero.
The terms α and β in (2) are overall (fixed) effects representing the trial-wide estimates, while vj and uj are the ‘random effects’ representing, respectively, the jth country-specific departure from the overall mean cost in the control group (vj) and from the overall differential mean cost (uj). Equation (2) allows the overall variability observed in the data to be partitioned into two components, one associated with variation at patient level (εij), and the other(s) associated with variation at country level (uj and vj). Finally, the parameter of interest, the mean difference in cost (or health outcome) specific to country j, is given by (β + uj).
In contrast to the splitting approach, which requires a treatment-by-country interaction term, (2) assumes that the random effects are ‘latent’ variables with a specific distribution, representing the potential departures that the country-specific effects could have from the overall mean. By assuming country effects as random, these models allow the country-specific estimates to be obtained using not only the data from the country of interest, but also the data from the other countries that participate in the trial. In this sense, the estimates are said to be borrowing strength from each other. Willan et al showed how this class of models can be used to analyse multinational trial-based aggregate country-specific cost-effectiveness data. Others[71-75] illustrated the application of hierarchical regression models for costs and CEA in the presence of trial-based IPD.
That the analysis based on IPD from multinational trials offers more flexibility is undisputed. While allowing for the inclusion of a set of country-specific covariates, the aggregate level data analysis does not facilitate inclusion of individual-level covariates, which may lead in some cases to the problem of ‘ecological fallacy’.[76, 77] The hierarchical model with IPD, on the other hand, offers the potential to analyse the full dataset and hence the possibility to accommodate both patient- and country-specific covariates. In particular, by using both patient and country level covariates, hierarchical modelling can be particularly useful when attempting to explain the observed between-country variability in the results of multinational trial-based CEAs. Given the need to produce robust cost-effectiveness evidence for jurisdiction-specific decision-making, it is only by incorporating both patient and country level covariates that potential between-country heterogeneity in the cost-effectiveness of the intervention can be fully explored and accounted for.
Manca et al used hierarchical multivariate models to reanalyse the economic data from a large multinational clinical study, the Assessment of Treatment with Lisinopril and Survival (ATLAS). This multinational trial enrolled 3164 patients in 19 countries, and compared low dose and high dose of the ACE inhibitor lisinopril in patients with chronic heart failure. Details of the main economic and clinical analyses have been reported elsewhere.[79, 80] This case study uses a total of 3061 observations (low dose, n=1545; high dose, n=1516) from 17 countries. The analysis reported here refers to the first three years of follow up. Because of these assumptions, it must be stressed that the specific results presented here are not to be considered as an alternative to the main study report. The authors compared the splitting and pooling approaches in the analysis of international cost-effectiveness data against the recently proposed use of hierarchical models.
Figures 3 and 4 show the results of this analysis respectively for the mean difference in costs and survival gain, and compare the splitting and pooling approaches (on the left hand graph of each figure) against the hierarchical modelling strategy. It can be seen how the country-specific estimates (empty square markers) obtained using the splitting approach display a large dispersion around the overall mean (black circle marker) obtained using the pooling approach. The hierarchical regression model approach, on the other hand, produces country-specific estimates which are closer to the population mean (the black circle at the bottom of the graph on the right hand side) compared to their counterparts in the analysis obtained by splitting the data. That is, the country-specific estimates in the hierarchical model borrow strength from each other. Some countries are shrunk towards the overall mean more than others. The degree of shrinkage depends on the between- and within-country variances, as well as on the country-specific sample size.
One of the criticisms often levelled at the use of hierarchical models in multinational CEA is that this method may be inappropriate when we expect systematic differences between countries but are not able to ‘explain’ these adequately using country-level covariates. In other words, the assumption that country-specific random effects are drawn from a common distribution may be erroneous. This criticism can be addressed using the arguments proposed by Gelman et al, who state that
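The way shrinkage responds to sample size and to the variance components can be illustrated with the standard normal-normal result, in which the country-specific estimate is a weighted average of the country's own mean and the overall mean. This is a simplified sketch: it treats the overall mean and both variances as known, whereas a full hierarchical model estimates them jointly from the data. All numbers are hypothetical.

```python
def shrinkage_weight(between_var, within_var, n_country):
    """Weight given to a country's own mean: tau^2 / (tau^2 + sigma^2 / n_j)."""
    if between_var < 0 or within_var <= 0 or n_country <= 0:
        raise ValueError("variances and sample size must be positive")
    return between_var / (between_var + within_var / n_country)


def shrunken_estimate(country_mean, overall_mean, between_var, within_var, n_country):
    """Country-specific estimate pulled towards the overall mean."""
    w = shrinkage_weight(between_var, within_var, n_country)
    return w * country_mean + (1.0 - w) * overall_mean


# Hypothetical differential costs: same raw country mean, different sample sizes.
overall = 100.0
for n in (10, 100, 1000):
    est = shrunken_estimate(
        country_mean=400.0, overall_mean=overall,
        between_var=2_500.0, within_var=250_000.0, n_country=n,
    )
    print(f"n = {n:4d}: shrunken estimate = {est:.1f}")
```

A country that recruited few patients gets a small weight on its own data and is pulled strongly towards the overall mean, while a large country keeps an estimate close to the one obtained by splitting the data.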
“In virtually any statistical application, it is natural to object to exchangeability on the grounds that the units actually differ. […] The fact that the experiments differ implies that the θj's differ, but it might be perfectly acceptable to consider them as drawn from a common distribution. […] Objecting to exchangeability for modelling ignorance is no more reasonable than objecting to an iid model for samples from a common population, objecting to regression models in general, or, for that matter, objecting to displaying points in a scatter plot without individual labels. As with regression, the valid concern is not about exchangeability, but encoding relevant knowledge as explanatory variables where possible.” (page 124).
In essence this means that,
“…the usual way to model exchangeability with covariates is through conditional independence with x = (x1,x2,….,xJ). In this way exchangeable models become almost universally applicable, because any information to distinguish different units should be encoded in the x and y [outcome] variables” (page 123).
As discussed in section 2, there are various factors that could explain the between-country variation in costs and effects differences observed in Figures 3 and 4. The re-analysis implemented here developed a Bayesian bivariate hierarchical regression model for cost and survival data in the trial, while controlling for a set of patient and country specific covariates. Figures 5 and 6 plot the country-specific differential cost and survival gain, respectively, against life expectancy at birth and the public expenditure in healthcare as a percentage of the national GDP. Both graphs indicate a positive relationship between the treatment effects and the country-specific covariate, suggesting that these factors may need to be accounted for when assessing the generalisability of the cost-effectiveness results between countries.
A more challenging situation arises when IPD are unavailable and the country of interest did not participate in the trial. Here the analyst has to rely on data published in the literature, and to assume that the relative treatment effect estimated from other countries is indeed generalisable to the country of interest. Methods similar to those explained in section 3.1 can then be used, again supplementing the evidence base with additional IPD specific to the country of interest.
The example here is a real-life decision model developed to inform the National Institute for Health and Clinical Excellence (NICE) on the cost-effectiveness of using Glycoprotein IIb/IIIa antagonists (GPAs) in the management of non-ST-elevation ACS. In this section, we focus on the methods used in this example[7, 82] and the steps followed in building the model. The paper by Sculpher and Drummond at this conference discusses the policy rationale behind this approach. In building a model that was relevant for the decision-maker in the UK (i.e. the NHS), the authors had to deal with several challenges relating to the generalisability of the data. These challenges, together with the solutions adopted to address them, are reviewed in turn.
As mentioned in section 2.3, differences in clinical practice may be an important factor which needs to be accounted for when translating study results from one country to another. In developing the model for NICE, an important consideration was therefore how GPAs would be used in routine UK clinical practice. The evidence base contained two types of GPA trial: those comparing the drugs with standard practice (i.e. management without GPAs) in all patients with non-ST elevation ACS regardless of whether a percutaneous coronary intervention (PCI) was subsequently undertaken (medical management); and those which looked at GPAs as an adjunct to PCI. Four treatment strategies were considered to be relevant for the UK, and the model was structured to compare all of them against each other. The lack of trial evidence comparing these strategies head-to-head was overcome by using ‘evidence synthesis’ methodology such as indirect comparison. Hence, it was necessary to re-structure the effectiveness data to reflect the nature of the indirect clinical comparison which was needed to populate the decision model. This was achieved by separating out the baseline event rates measured in the standard therapy control groups in the trials from the treatment effect observed in the GPA arms relative to the control group. The relative treatment effects for each treatment strategy were pooled across the various groups of trials.
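As an illustration of the kind of indirect comparison referred to above, the sketch below contrasts two strategies that have each been compared only with a common control, working on the log relative risk scale. This is the standard adjusted indirect comparison and is shown for orientation only; it is not claimed to be the exact synthesis model used for the NICE analysis. The function names and trial counts are hypothetical.

```python
import math


def log_rr_and_se(events_trt, n_trt, events_ctl, n_ctl):
    """Log relative risk and its standard error from trial arm counts."""
    log_rr = math.log((events_trt / n_trt) / (events_ctl / n_ctl))
    se = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    return log_rr, se


def indirect_comparison(log_rr_a, se_a, log_rr_b, se_b):
    """Strategy A versus strategy B via their common control.

    The point estimate is the difference of the two log relative risks;
    the variances add because the two sets of trials are independent.
    """
    return log_rr_a - log_rr_b, math.sqrt(se_a ** 2 + se_b ** 2)


# Hypothetical counts: strategy A versus control, strategy B versus control.
log_rr_a, se_a = log_rr_and_se(90, 1000, 120, 1000)
log_rr_b, se_b = log_rr_and_se(105, 1000, 118, 1000)
log_rr_ab, se_ab = indirect_comparison(log_rr_a, se_a, log_rr_b, se_b)
rr_ab = math.exp(log_rr_ab)
ci_95 = (math.exp(log_rr_ab - 1.96 * se_ab), math.exp(log_rr_ab + 1.96 * se_ab))
```

The indirect estimate is always less precise than either of the direct estimates it is built from, which is one reason the relative effects were pooled across groups of trials before being used in the model.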
Given that the trials were undertaken largely outside the UK, the baseline event rates in patients not having GPAs in the UK were considered to be potentially quite different from those of patients randomised to the control groups in the trials. As mentioned in Section 2, this could reflect differences in the epidemiology of the disease or, more probably, differences in overall management of patients with ischaemic heart disease (IHD) in the UK. After consulting with UK clinical experts, Palmer et al considered that the principal difference in the management of IHD in the UK, compared to that in other developed countries, was that fewer patients were considered for PCI at the time of the analysis. First, it was felt that the lower rates of PCI in the UK could have the effect of generating higher baseline event rates than those observed in the literature. Secondly, the limited availability of ‘acute’ PCI (i.e. percutaneous procedures undertaken in non-ST elevation patients shortly after presentation) in the NHS could cause clinicians to select ACS patients for acute PCI in a different way than clinicians in the GPA trials. Therefore, baseline event rate data specific to UK practice were sought, and information was used from the Prospective Registry of Acute Ischaemic Syndromes in the UK (PRAIS-UK), an observational cohort registry of 1046 patients admitted to 56 UK hospitals with ACS in 1999.
In the absence of IPD from the trials of interest, one way of adapting the clinical results from international trials to the UK setting is to separate out the baseline event rates associated with standard management (without GPAs), estimate those parameters from UK-specific data, and then apply the pooled relative treatment effects from the trials for each of the alternative treatment strategies relative to the control (i.e. no use of GPAs). This amounts to assuming that baseline risks are not transferable internationally, but relative risk reductions are. It may be, however, that the relative treatment effect is itself related to baseline risk – for example, the higher the baseline risk, the lower the treatment effect – in which case the assumed independence between the two components of clinical effectiveness is not sustainable. It was important therefore to ascertain whether any relationship existed between the relative treatment effect (i.e. relative risk reduction) and the baseline risk observed in the literature. If that was the case, then this had to be built into the decision model in order to relate the UK baseline risk (taken from local sources) and the relative risk reduction estimated from the literature.
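The separation of baseline risk and relative treatment effect can be sketched as follows. The example assumes, as in the text, that the pooled relative risk transfers between countries while the baseline risk does not; the function name and all numerical values are hypothetical and are not taken from the trials or from PRAIS-UK.

```python
def adapted_event_risks(baseline_risk, relative_risk):
    """Apply a pooled relative risk to a jurisdiction-specific baseline risk."""
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be a probability")
    if relative_risk < 0.0:
        raise ValueError("relative_risk must be non-negative")
    treated_risk = min(baseline_risk * relative_risk, 1.0)
    return {
        "control": baseline_risk,
        "treated": treated_risk,
        "absolute_risk_reduction": baseline_risk - treated_risk,
    }


pooled_rr = 0.85          # hypothetical pooled relative risk from the trials
trial_baseline = 0.10     # hypothetical control-group risk in the trials
local_baseline = 0.15     # hypothetical risk from a local registry

in_trials = adapted_event_risks(trial_baseline, pooled_rr)
in_target = adapted_event_risks(local_baseline, pooled_rr)
```

With the same relative risk, the higher local baseline yields a larger absolute risk reduction (0.0225 against 0.015 in this illustration), which is why the choice of baseline data matters for the cost-effectiveness estimate even when the relative effect is held constant.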
In order to investigate whether the log relative risk in the individual trials varied with log baseline risk (i.e. the log event rate in the control group), a random effects meta-regression model[84, 85] was used (see footnote c). This form of meta-regression works by fitting a regression line between the event rates measured in the control groups of the trials and those of the experimental (i.e. GPA) groups, with the number of points from which the estimate is made being the number of available trials. This function characterises the relationship between the baseline risks and the relative risks. Once the function has been estimated in the meta-regression, the pooled relative risk estimates from the trials can be adjusted according to the point on the regression line which accords with the UK baseline risk. If the results suggest that a relationship exists, the decision model can be used to adjust the relative risk estimates according to the baseline risk employed in the model.
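The mechanics of adjusting the relative risk to a given baseline risk can be sketched with an inverse-variance weighted regression of log relative risk on log baseline risk. This is a simplification made for illustration: it is a fixed-effect fit, whereas the analysis described here used a random effects model, and it ignores the correlation that arises because the control-group risk also enters the relative risk. Function names and trial data are hypothetical.

```python
import math


def fit_log_rr_on_log_baseline(control_risks, relative_risks, se_log_rr):
    """Weighted least squares fit of log(RR) on log(control-group risk).

    Each trial contributes one point, weighted by 1 / se^2 of its log RR.
    Returns the intercept, the slope and the standard error of the slope.
    """
    x = [math.log(r) for r in control_risks]
    y = [math.log(rr) for rr in relative_risks]
    w = [1.0 / se ** 2 for se in se_log_rr]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope, math.sqrt(1.0 / sxx)


def adjusted_relative_risk(intercept, slope, local_baseline_risk):
    """Relative risk read off the fitted line at a given baseline risk."""
    return math.exp(intercept + slope * math.log(local_baseline_risk))


# Hypothetical trial-level data: control-group risk, RR and SE of log RR.
control_risks = [0.06, 0.09, 0.12, 0.16]
relative_risks = [0.95, 0.90, 0.88, 0.84]
se_log_rr = [0.10, 0.08, 0.12, 0.15]

intercept, slope, se_slope = fit_log_rr_on_log_baseline(
    control_risks, relative_risks, se_log_rr)
rr_at_local_baseline = adjusted_relative_risk(intercept, slope, 0.15)
slope_ci_95 = (slope - 1.96 * se_slope, slope + 1.96 * se_slope)
```

If the interval for the slope comfortably includes zero, there is little support for adjusting the relative risk, and a conventional pooled estimate may be preferred.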
Figure 7 shows the results of this meta-regression, plotting the relationship between baseline risk (in the control group) and relative risk, both on the log scale, for the group of trials relating to one of the strategies considered in the model (i.e. GPA as part of initial medical management). The results of the analysis showed that there was a negative relationship between log baseline risk and log relative risk. This relationship was found to be not statistically significant, and the authors decided to estimate the relative risk reduction using a meta-analysis. This choice was justified not only by the lack of statistical significance, but probably more realistically by (a) the lack of evidence (in the studies included in the meta-regression) directly referring to the baseline risk observed in the UK, and (b) the inherent threats involved in using meta-regression for clinical decision-making.[86-88]
An important function of many decision models is to extrapolate the effectiveness data available in trials (see footnote d). Palmer et al faced the problem of taking the short-term effectiveness data in the available trials, which typically had follow-up of no more than six months, and extrapolating them over the lifetime of the typical patient. Therefore, a long-term (extrapolation) model was developed to estimate the future prognosis for patients who finish the short-term (six-month) model in one of two disease states: those having experienced a non-fatal MI and those who have not but remain alive. The structure of the model is illustrated in Figure 8.
The long-term model took the form of a 4-state Markov process. The model assumed that at any point in time, patients could be in one of the following states: ischaemic heart disease (that is, patients who had not experienced a non-fatal MI), non-fatal MI (not shown in Figure 8) where patients spent a single cycle of one year, post-MI (where surviving patients entered after one year following an MI) and death. It was necessary to be aware of the variation between countries in long-term survival following cardiac events. Transition probabilities were, therefore, taken from a UK-specific observational study, the Nottingham Heart Attack Register (NHAR). Two cohorts of patients (total n = 1,279) from the NHAR were used, each with a diagnosis indicative of ACS and with follow-up data for up to 5 years. In the context of the GPA model, the relevant ‘events’ estimated in the short-term model included revascularisation rates and days in hospital. For the longer-term model, these events included MIs, revascularisations and days in hospital. In both cases, these estimates of event rates were based on data from the UK observational studies – PRAIS-UK for the short-term model and NHAR for the long-term model. All estimates of the costs of these events were then taken from UK sources. For further details about this model the reader is invited to refer to the original publications.[82, 89]
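A cohort simulation of a four-state Markov process of this kind can be sketched as below. The state names follow the description above, but the transition probabilities are hypothetical placeholders rather than NHAR estimates, and two simplifying assumptions are made: patients in the post-MI state do not experience a further MI, and life years are counted without discounting or half-cycle correction.

```python
STATES = ("IHD", "MI", "PostMI", "Dead")


def transition_matrix(p_mi, p_die_ihd, p_die_mi, p_die_post):
    """Annual transition probabilities; each row sums to one."""
    return {
        "IHD": {"IHD": 1.0 - p_mi - p_die_ihd, "MI": p_mi,
                "PostMI": 0.0, "Dead": p_die_ihd},
        # Non-fatal MI is occupied for a single one-year cycle only.
        "MI": {"IHD": 0.0, "MI": 0.0,
               "PostMI": 1.0 - p_die_mi, "Dead": p_die_mi},
        "PostMI": {"IHD": 0.0, "MI": 0.0,
                   "PostMI": 1.0 - p_die_post, "Dead": p_die_post},
        "Dead": {"IHD": 0.0, "MI": 0.0, "PostMI": 0.0, "Dead": 1.0},
    }


def run_cohort(start, matrix, n_cycles):
    """Return the state distribution at the start of each cycle."""
    trace = [dict(start)]
    for _ in range(n_cycles):
        current = trace[-1]
        trace.append({
            s: sum(current[r] * matrix[r][s] for r in STATES) for s in STATES
        })
    return trace


def life_years(trace):
    """Undiscounted life years: proportion alive at the start of each cycle."""
    return sum(1.0 - dist["Dead"] for dist in trace[:-1])


# Hypothetical split of survivors leaving the short-term (six-month) model.
start = {"IHD": 0.92, "MI": 0.08, "PostMI": 0.0, "Dead": 0.0}
matrix = transition_matrix(p_mi=0.03, p_die_ihd=0.02,
                           p_die_mi=0.10, p_die_post=0.04)
trace = run_cohort(start, matrix, n_cycles=40)
expected_life_years = life_years(trace)
```

Replacing the placeholder probabilities with estimates from a registry in the jurisdiction of interest is what makes the extrapolation specific to that jurisdiction; costs and quality-of-life weights would be attached to each state in the same way.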
The methods presented in section 3.3 may be a sensible approach even when the trial produces evidence directly relevant to the country of interest (e.g. the trial was carried out in, or included, the country of interest and a country-specific estimate is available) but neither aggregate (country-level) data nor IPD are available. There are several reasons for this, which mainly relate to the perceived limitations of using evidence from a single trial to inform decision-making in a specific jurisdiction (country).
It has been argued that most trials will not randomise individuals to all the relevant management options available in a specific healthcare system. Furthermore, the period of follow-up in a trial will often be shorter than the relevant time horizon for cost-effectiveness decisions. Finally, reliance on a single trial will often ignore other relevant evidence,[90-93] possibly more location-specific, which could be used to adapt the trial results to the country of interest. While the use of (clinical and cost-effectiveness) evidence from a single multinational (or multicentre) RCT to inform reimbursement decisions in a given country is still debated,[90, 92, 93] clinical trials will always play a crucial role in providing unbiased country-specific estimates of (clinical) treatment effects. Key data on the likelihood of particular events (e.g. side effects, complications, etc.) and their relationship with quality of life and resource use implications in particular treatment settings (e.g. hospital, country) are also fundamental information for healthcare decision makers. The evidence base concerning the treatment strategy being examined could include several trials comparing various alternative treatment strategies, giving rise to a ‘network of evidence’. In this case it is ‘good practice’ to synthesise all the available evidence within a comprehensive decision-analytic model,[82, 90, 92-97] the parameters of which (e.g. treatment effect, transition probabilities, etc.) can be estimated through an evidence synthesis model (e.g. indirect[98-100] or mixed treatment comparison[83, 101, 102]).
The difference with respect to the methodology described in Section 3.3 is subtle. It is argued here that when the country of interest participated in one or more of the trials that form the evidence base used to estimate the model parameters, the synthesis model should include not only explanatory variables at study level, but also a country (or at the very least a geographical area) indicator identifying where the study was carried out. The model could be supplemented with country-specific IPD on baseline risk, resource use, natural history of the disease, and mortality tables as explained in section 3.1.
Methods which enable policy makers to assess the extent to which the trial (economic and clinical) results are valid in different geographical settings should have a key role in Health Technology Assessment (HTA). There are several reasons for this. First and foremost, reimbursement decisions are made at local (i.e. jurisdictional and country) levels. Second, despite their interest in a set of specific interventions, many healthcare systems (especially in mid-income and developing countries) are unable to fund (or participate in) relevant trials. Third, the efficient use of R&D funding for HTA at country level requires the avoidance of duplication of funding efforts to address research questions already being addressed in other jurisdictions. Given such a complex scenario, decision-makers need to be able to discern to what extent the observed variability is country-specific and to what extent it is patient-related. There is, therefore, great value in developing methods that enable decision-makers to use international data to generate jurisdiction-specific cost-effectiveness estimates. Several authors have attempted to address the issue of how multinational trial-based cost-effectiveness data should be analysed, with different degrees of sophistication. Others have developed decision-modelling methods in combination with evidence synthesis techniques to overcome the lack of IPD and produce cost-effectiveness estimates under specific assumptions (e.g. that the relative treatment effect is generalisable between countries).
This paper reviews the methods that could be used to reflect between-location differences in cost-effectiveness, and provides guidance as to which approaches can be currently considered ‘good practice’. An algorithm is proposed to help decide which strategy to adopt under the different scenarios faced by the analyst. These scenarios are characterised in terms of trial-based IPD availability and whether the country of interest participated in the study.
Although these methods have been used mostly to inform decisions at national and state level, there are reasons to believe that they could be useful to inform policy decisions in jurisdictions within a given country/state (e.g. HMOs, regions in federalist countries, GP fund holding).
It must be emphasised that the methodology in this research area is currently under rapid development. However, various considerations can be made in terms of recommendations for future research. The paper by Sculpher and Drummond at this conference provides a list of the main issues to consider in trial- and model-based cost-effectiveness analysis in relation to issues of generalisability. These include aspects relevant to the design, data collection, analysis, and reporting of the study results.
This paper has shown that the availability of IPD evidence external to the trial referring to the country of interest is often paramount, especially (but not exclusively) when the country of interest did not participate in the clinical study. It was argued that, in multinational trial-based CEA, patient- and country-specific covariates are essential for an accurate assessment of the between-location variability of the cost-effectiveness results. While baseline patient-level data are routinely collected as part of the trial, this is not the case for country (centre) level characteristics considered to have the potential to affect the generalisability of the study results. Questions that remain to be addressed include which country (centre) covariates should be collected, whether these can be routinely collected and, if not, what alternative sources can be used. This concern affects another element relevant during the design phase of the study: the selection of countries (and centres) participating in the trial. Ideally this selection should be at random. Current methods of selection are unclear, but they probably reflect the level of funding for the cost-effectiveness study.
In terms of modelling methods available it has been shown that hierarchical regression is a useful tool to analyse IPD collected alongside multinational trials. On the other hand, there are valid arguments for accepting that within-trial analysis is often not sufficient to appropriately inform reimbursement decisions at country level and that some form of additional modelling is probably required (e.g. long term extrapolation, synthesis of additional evidence, additional comparators). In this sense, it is possible that Bayesian methods integrating individual- and aggregate-level data could provide additional flexibility in the analysis of cost-effectiveness data for policy decisions. Further methodological developments are expected.
Finally, in terms of reporting of the study results, Drummond et al indicated that more transparency would help ascertain the generalisability of the study results to a specific context. A recent review by Urdahl et al assessed the extent to which applied modelling studies in osteoporosis incorporated data inputs that were appropriate for the target jurisdiction or decision-maker as stated or inferred from each study. It was found that studies tended to be more assiduous in selecting cost inputs that were specific to their target decision-maker than they were in identifying appropriate clinical inputs. This is likely to reflect an implicit assumption that parameters relating to clinical effectiveness, whilst needing to be specific to the relevant patient group defined in the decision problem, are inherently more transportable geographically. Whilst this assumption may be justified within healthcare systems and countries, the factors discussed in section 2 of this paper suggest that this will not necessarily be so between systems and countries.
Given the decision-makers' need for jurisdiction-specific cost-effectiveness information, and the presence of country-specific factors that contribute to the international variation in cost-effectiveness results, trial-wide (i.e. pooled) results may not always be useful for jurisdiction-specific resource allocation decisions. Methods to reflect (and address) between-country differences in cost-effectiveness data are available and are likely to develop further.
A. Manca is recipient of a Wellcome Trust funded post-doctoral Training Fellowship in Health Services Research (grant number GR071304MA). A. R. Willan is funded through the Discovery Grant Program of the Natural Sciences and Engineering Research Council of Canada (grant number 44868-03). The authors are grateful to Mark Sculpher for his permission to use the GPAs example from the NHS R&D funded project on generalisability, and to Stefano Conti for his help with some of the graphs in this paper. The views and opinions expressed therein are the authors' and do not necessarily reflect those of the funding institutions.
§Paper presented at the conference: Better Analysis for Better Decisions – A Conference in Honour of Bernie O'Brien, McMaster University, Hamilton, (Ontario, Canada), 19-20 June 2006.
a. The issue is the mismatch between the source of the data and the location of the decision-maker. Thus, it could be argued that we should be referring to the ‘jurisdiction of interest’, where the term ‘jurisdiction’ encompasses both within-country (e.g. regions, provinces) and country-level decision makers.
b. Note that vj and uj are assumed to follow a bivariate normal distribution, to reflect the fact that each country's mean cost in the control arm is correlated with the differential mean cost. In the analysis of the clinical data this assumption would reflect the fact that the baseline events are correlated with the relative treatment effects.
c. This analysis is similar to what would be carried out if one were to explore the generalisability of the absolute treatment effect identified in trials across a range of clinically-defined patient sub-groups, where the same separation of baseline risks and relative treatment effect can be employed.
d. There are various dimensions in the extrapolation problem. This can relate to beyond-trial extrapolation (that is, from short- to long-term outcomes), from intermediate endpoints to final outcomes, and from intermediate endpoints or final clinical outcomes to health-related quality of life.