There is an urgent need to evaluate HIV prevention interventions, thereby improving our understanding of what works, under what circumstances and what is cost effective.
To describe an integrated mathematical evaluation framework designed to assess the population‐level impact of large‐scale HIV interventions and applied in the context of Avahan, the Indian AIDS Initiative, in southern India. The Avahan Initiative is a large‐scale HIV prevention intervention, funded by the Bill & Melinda Gates Foundation, which targets high‐risk groups in selected districts of the six states most affected by the HIV/AIDS epidemic (Maharashtra, Karnataka, Tamil Nadu, Andhra Pradesh, Nagaland and Manipur) and along the national highways.
One important component of the monitoring and evaluation of Avahan relies on an integrated mathematical framework that combines empirical biological and behavioural data from different subpopulations in the intervention areas, with the use of tailor‐made transmission dynamics models embedded within a Bayesian framework.
An overview of the Avahan Initiative and the objectives of the monitoring and evaluation of the intervention is given. The rationale for choosing this evaluation design compared with other possible designs is presented, and the different components of the evaluation framework are described and its advantages and challenges are discussed, with illustrated examples.
This is the first time such an approach has been applied on such a large scale. Lessons learnt from the CHARME project could help in the design of future evaluations of large‐scale interventions in other settings, whereas the results of the evaluation will be of programmatic and public health relevance.
International agencies have committed significant resources to implement large‐scale HIV/AIDS interventions. Only a small fraction of the resources is, however, allocated to the evaluation of these interventions.1,2 Understandably, one dilemma is whether resources are best invested to evaluate or implement interventions that “should” work. Nevertheless, in order not to jeopardise resources on a large scale, there is an urgent need to evaluate HIV prevention interventions objectively to obtain a better understanding of what works, when and how.
This paper describes an integrated mathematical evaluation framework, designed to assess the population‐level impact of large‐scale HIV interventions, applied in the context of Avahan, the Indian AIDS Initiative, in southern India. We first give an overview of Avahan, the objectives of the evaluation and the rationale for choosing this design. We describe the different components of the evaluation framework and discuss its advantages and challenges.
The HIV epidemic in India is highly heterogeneous. HIV prevalence is highest in two northeastern states (Nagaland and Manipur), where transmission is mainly via injection drug use, and in four southern states (Maharashtra, Karnataka, Tamil Nadu and Andhra Pradesh), where transmission is mainly sexual and concentrated among high‐risk groups.3,4,5
The Avahan Initiative is based on the theory of core group transmission and on the assumption that India's HIV epidemic depends on transmission from and within high‐risk groups.6,7,8,9 It aims to reduce HIV incidence and prevalence in high‐risk groups in order to limit HIV transmission to and within the general population. Avahan is a large‐scale, US$258 million, HIV prevention intervention funded by the Bill & Melinda Gates Foundation. The intervention targets high‐risk groups in 83 selected districts of the six states most affected by the HIV/AIDS epidemic (above), and along the national highways.7 In approximately 53 of these districts, Avahan and its partners are the sole or major implementers of prevention services for high‐risk populations. In the southern states, the intervention focuses on male and female commercial sex workers (MFSW), men who have sex with men (MSM), and clients of MFSW. In the northeast states, the focus is mainly on injection drug users. The different components of the Avahan intervention in the southern states focus on improving the availability and quality of services for sexually transmitted infections (STI), unlimited free distribution and promotion of condoms to MFSW and MSM, expansion of retail outlets for condom social marketing, encouraging behaviour change and reducing vulnerability.7 The Avahan programme, which started in 2004, attempts to take core group interventions to scale by providing services to more than 80% of known high‐risk groups in the selected districts, within a relatively short time‐frame.7
The overarching evaluation objective is to assess the population‐level impact of a large‐scale targeted intervention in a concentrated epidemic. The specific objectives are: (1) to obtain a better understanding of the local HIV transmission dynamics and the factors that determine the intervention's impact; (2) to assess the effectiveness of the intervention in high‐risk groups and the general population; and (3) to estimate the cost and cost effectiveness of the intervention and its different components (namely STI and condom components) in different districts with heterogeneous sociodemographic and epidemiological contexts.
One important component of the overall monitoring and evaluation (M&E) of Avahan relies on an integrated mathematical framework, which combines serial cross‐sectional biological and behavioural data from different subpopulations in the intervention areas, with the use of tailor‐made transmission dynamics models embedded within a Bayesian framework. This component of the evaluation, described here, covers only the four southern states where Avahan is implemented, and represents a small fraction of the resources allocated to the intervention.7
The rationale for incorporating modelling as part of the M&E is based on the following considerations.
Despite strong theoretical evidence6,8,9 and the existence of many intervention projects targeted at high‐risk groups in different countries, there is limited empirical evidence of the effectiveness and cost effectiveness of core group interventions.10,11,12 There is thus a broad interest in understanding the extent to which the Avahan core group intervention will impact on HIV transmission in high‐risk groups and the general population. The evaluation needs to be large scale and to assess the intervention impact at the population level (ie both direct and indirect effects). Impact depends on intervention efficacy, as well as on the interaction between coverage and intensity. The magnitude of impact and how fast a given change in the primary outcomes of interest (eg prevalence, incidence) can be achieved depend on the natural history of HIV/AIDS and on the epidemiological context. As a result of the long incubation period of HIV/AIDS, the full potential of an intervention on prevalence may take decades before it is achieved, especially in established/mature epidemics (fig 1).
Given the goals of the evaluation, the strength of evidence needed for decision making (adequacy, plausibility or probability as defined in table 1)13 and the relevant indicators (provision, utilisation, coverage, or health impact) must be specified. Given the scale and importance of Avahan, its evaluation should aim to demonstrate strong plausibility.
The external validity of the evaluation results for a wide range of contexts is particularly important in a setting such as India, with heterogeneous sociodemographic and HIV epidemiological contexts. An adequate design needs to assess the impact of the intervention on HIV trends objectively and to minimise subjective interpretation and evaluator biases. Impact assessment should not delay the scaling up of the intervention and should be logistically, programmatically and ethically feasible and affordable.1,2
Theoretically, community randomised controlled trials (C‐RCT) are the gold standard for evaluating the population‐level effectiveness of an intervention and making probability statements about impact (table 1).13,14,15,16 In practice, they are very expensive and logistically difficult to conduct. This is especially true in the Avahan context, because of the heterogeneity of the Indian epidemic and the broad geographical scope of the Avahan programme. It would also be difficult to establish control districts in this context, as pre‐existing programming and migration of high‐risk groups would probably be significant sources of contamination between intervention and non‐intervention communities. C‐RCT, even stepped‐wedge design,15,16 may be deemed unethical (table 1). The external validity of C‐RCT is often limited to the location and population where it is conducted. To date, C‐RCT of HIV interventions have been of short duration, included few communities (approximately 10) and produced ambiguous results.17,18,19,20 Mathematical modelling has been used to help interpret inconclusive or contradictory trial results.19,20
The evaluation of large‐scale interventions is often based on data from surveillance21 of relevant process and/or health indicators in target populations over time, to assess the performance and coverage of the programme, as well as levels and spread of the health indicator.1,2,13 Surveillance data alone, however, do not permit researchers to assess the extent of an intervention's effectiveness objectively. Changes in the desired direction (eg decline in prevalence) may also be caused by the natural transmission dynamics of HIV/STI epidemics, or other external influences (eg other programmes; fig 1).6,19
To demonstrate plausibility, a valid control group must be defined to assess what might have happened in the absence of the intervention (table 1).13,22,23 Comparisons of HIV/STI trends using internal control groups from population subgroups, completely or partly unexposed, can only assess intervention effects at the individual level. External control groups from areas where the programme was not implemented, or baseline pre‐intervention data from internal control groups, can yield population‐level estimates. These comparisons are, however, prone to biases and cannot account for changes caused by natural infection dynamics. Simulated control groups based on transmission dynamics models may also suffer from the same internal validity threats as quasi‐experimental designs, but have the added advantage of providing a framework that takes into account changes caused by natural infection dynamics and that can be used to estimate potential sources of biases as a result of changes unrelated to the intervention (details are given in supplementary table S1, available on the STI website: http://sti.bmj.com/).17,19
Mathematical models are often used a posteriori to help interpret epidemiological trends and assess the likely impact of interventions in an explanatory fashion, or for cost‐effectiveness analyses.6,9,19,20,24,25 In contrast, our integrated framework has been rigorously designed a priori as an integral part of the Avahan M&E, and is based on a set of well‐defined procedures to maximise objectivity and to take into account uncertainties in estimates. The framework was designed within the context of the intervention planned by Avahan.7
The framework is based on empirical biological and behavioural data collection and HIV/STI transmission dynamics modelling embedded within a Bayesian framework (fig 2).26,27,28 The evaluation is taking place at the district level over seven years, which is longer than most HIV trials, to maximise the chance of observing changes in HIV prevalence.
The primary data for defining the model structure, informing model parameters, validating the model and defining attribution are collected within the context of the overall Avahan M&E programme by numerous agencies responsible for implementing interventions and collecting evaluation data.7 Serial cross‐sectional biobehavioural surveys (integrated behavioural and biological assessments; IBBA) are carried out in target high‐risk groups in selected districts of the four southern states covered by Avahan. The surveys are repeated at two or three different time points (table 2). In addition, special behavioural surveys (SBS) among high‐risk groups, as well as biobehavioural general population surveys (GPS), are being carried out in selected IBBA districts to validate and complement IBBA data and collect general population data. The questionnaires have been specifically designed to obtain detailed information required for epidemiological and modelling analyses. Methods to minimise social desirability biases are also used to elicit more accurate reporting of high‐risk behaviour (table 1).7,29,30
Within Avahan, a management information system (MIS) consisting of a series of routine monthly indicators (eg number of condoms distributed, estimated size of core groups, number of syndromes treated) obtained from implementing partners is used to estimate intervention coverage and intensity. Costing studies of Avahan districts over time, involving a mixture of detailed bottom‐up costing methods and the use of routine financial and project data,31,32 are also taking place (table 2).
At the core of the framework is a tailor‐made deterministic transmission dynamics model of HIV/AIDS and STI. Ideally, the model being developed will be as parsimonious as possible, while including the complexity that matters.33 The model structure and level of complexity will depend on the characteristics, STI levels, and risk factors observed in the population of interest and on the nature of the intervention in each district. Analyses of the baseline data will identify important sources of heterogeneity that should be included in the model. We focus on four southern states of India where HIV transmission is mostly sexual. The model will take into account age, and district‐specific male and female commercial sex work, mixing patterns, STI prevalence, as well as other relevant characteristics such as migration and patterns of intravenous drug use. A series of preliminary modelling studies, such as studies on HIV and herpes simplex virus type 2 interaction and the impact of seasonal migration,34,35 will help to determine which aspects of the natural history of STI/HIV and population characteristics will least influence our impact projections, and can therefore be ignored. Modelling studies assessing the simplest, yet adequate, way to model the general population are ongoing.
For each of the four main parameter categories (demographic, natural history of HIV and STI, behavioural and intervention exposure), prior parameter distributions (defined in the next section) will be specified for each round of data collection for the different risk groups. The distribution for the natural history parameters will be based on published literature. The IBBA and SBS will be used to inform behavioural and intervention exposure parameters. Demographic parameters will be based on official sources (eg census) and our different surveys. MIS, GPS, IBBA and other complementary data sources will provide estimates of the size of high‐risk groups (MFSW, MSM, clients) and of the coverage of the intervention (table 2).
The transmission dynamics model will be used within a Bayesian framework (fig 2).26,27,28,29 Initially, available data will be used to specify what is known about each parameter by defining a plausible range of values (the prior distribution) for each one. Then, the prior distribution will be sampled many times in order to test a large number of parameter combinations (many more than 10 000) against the empirical validation data used to fit the model (eg HIV and STI prevalence). The goodness of fit of each parameter set will be assessed by comparing prevalence estimates predicted by the model at specific time points with corresponding empirical estimates. The model will be fitted to two or three rounds of age‐specific HIV and STI prevalence data by risk group in each district. The model will be simultaneously fitted to data from different districts and states to constrain biological parameters (fig 3A). Only the subset of parameter sets that fits the empirical data well (posterior distribution) will be kept for further analyses (fig 3B). The advantage of this approach is that it produces point estimates and credibility intervals (CrI) that reflect the effect of uncertainty in parameter assumptions on model predictions (fig 3C).26,27,28,29 This is necessary because our model will be relatively complex with many uncertain parameters, which means that more than one set of parameters could produce an equally good fit to epidemiological trends. Using only one parameter set could lead to biased estimates of intervention impact. Together, the point and CrI estimates will provide information to judge if the intervention is sufficiently effective and/or cost effective to be declared of public health use. Ideally, the public health criteria should be determined by public health authorities and stakeholders, before evaluation.
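The sample, simulate and keep steps described above can be illustrated with a deliberately small sketch. It is not the Avahan model or the fitting method finally chosen by the authors: the one‐group SI model, the uniform prior ranges, the prevalence ranges in `observed` and every function name are assumptions made for illustration, and the goodness‐of‐fit rule is a simple accept/reject check against prevalence ranges.

```python
import random
import statistics


def simulate_prevalence(beta, mu, times, p0=0.01, dt=0.05):
    """Toy SI model, dp/dt = beta*p*(1-p) - mu*p, integrated with Euler steps.

    Returns the predicted prevalence at each requested time (in years).
    """
    predictions = []
    p, t = p0, 0.0
    for target in sorted(times):
        while t < target - 1e-9:
            p += dt * (beta * p * (1.0 - p) - mu * p)
            t += dt
        predictions.append(p)
    return predictions


def fits(predicted, observed_ranges):
    """Accept a parameter set only if every prediction lies in its range."""
    return all(low <= p <= high
               for p, (low, high) in zip(predicted, observed_ranges))


def sample_posterior(observed, priors, n_samples, seed=1):
    """Draw parameter sets from uniform priors and keep those that fit."""
    rng = random.Random(seed)
    times = sorted(observed)
    ranges = [observed[t] for t in times]
    kept = []
    for _ in range(n_samples):
        beta = rng.uniform(*priors["beta"])
        mu = rng.uniform(*priors["mu"])
        if fits(simulate_prevalence(beta, mu, times), ranges):
            kept.append((beta, mu))
    return kept


def credible_interval(values, level=0.95):
    """Return (lower bound, median, upper bound) of the retained values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    lower = ordered[int(round((1.0 - level) / 2.0 * last))]
    upper = ordered[int(round((1.0 + level) / 2.0 * last))]
    return lower, statistics.median(ordered), upper


# Hypothetical prevalence ranges at two survey rounds (year: (low, high)).
observed = {10: (0.08, 0.20), 15: (0.15, 0.35)}
# Hypothetical prior ranges for transmission and exit rates.
priors = {"beta": (0.1, 1.0), "mu": (0.05, 0.3)}

posterior = sample_posterior(observed, priors, n_samples=2000)
projected = [simulate_prevalence(b, m, [20])[0] for b, m in posterior]
print(len(posterior), credible_interval(projected))
```

Because several parameter sets pass the check, the projection at year 20 is reported as a median with an interval rather than as a single value, which is the point made in the paragraph above about relying on only one parameter set.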
As a result of the large quantity of data (>60 datasets per study round) and the extensive fitting procedure, the process needs to be automated and rigorous. As existing fitting methods have not been tested for very complex HIV dynamic models, our choice will be based on the results of ongoing independent validation studies (conducted before impact assessment) comparing the precision, validity of impact estimates, and computing time needed by the different “Bayesian style” methods. The procedures will be evaluated by using them to fit a range of models to “fitting” data generated by a model with the same or a more complex structure.28 The latter analyses will give pointers as to whether the complexity of the model can be reduced while conserving its ability to produce adequate impact estimates. The procedures explored will include different search algorithms (Markov chain Monte Carlo)27,28 or Latin hypercube sampling combined with likelihood methods or a more heuristic target fitting method.9,26
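Latin hypercube sampling, one of the candidate procedures named above, can be sketched in a few lines. This is a generic illustration under assumed parameter names and bounds, not the implementation used in the evaluation: each parameter range is cut into as many equal strata as there are samples, one value is drawn per stratum, and the strata are then shuffled independently for each parameter.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples parameter sets as a list of dicts.

    bounds maps a (hypothetical) parameter name to a (low, high) range.
    Every stratum of every parameter is sampled exactly once.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        width = (high - low) / n_samples
        # One uniform draw inside each of the n_samples strata.
        points = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffle so that strata are paired at random across parameters.
        rng.shuffle(points)
        columns[name] = points
    return [{name: columns[name][i] for name in bounds}
            for i in range(n_samples)]


# Assumed prior ranges, for illustration only.
design = latin_hypercube(5, {"beta": (0.1, 1.0), "mu": (0.05, 0.3)})
for parameter_set in design:
    print(parameter_set)
```

Compared with plain random sampling, this design guarantees that the whole range of each parameter is covered even when the number of samples is modest.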
The main model outcomes will include age‐specific HIV/STI incidence and prevalence, and numbers of new STI/HIV infections averted over a fixed time period and by district. To produce estimates and CrI of intervention impact, parameter sets from the posterior parameter distribution will first be used to simulate the different health outcomes of interest in the intervention group over time. Then, the same health outcomes will be simulated in a matched control group using the same parameter sets but with the intervention parameters (eg coverage, condom use, STI treatment) reset to pre‐intervention levels, thus providing population‐level impact estimates that take into account the transmission dynamics of infection. For each district, estimates of the main model outcomes in different subpopulations will be obtained by comparing predicted STI/HIV infections in the presence and absence of the intervention. Then, the district‐specific model predictions will help to understand the influence of the different epidemiological contexts and local transmission dynamics of HIV/STI infections on the population‐level effectiveness of the intervention, and to improve prevention strategies. The primary estimation of intervention impact with the “full” model will occur at the end of the seven years.
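The comparison between the intervention scenario and the simulated control group can be sketched as follows. The toy model, the condom efficacy of 0.8, the condom‐use levels and the small `posterior` list are all hypothetical values chosen for illustration. What the sketch takes from the text is the structure of the comparison: the same retained parameter sets are run twice, once with the intervention parameter at its observed level and once reset to its pre‐intervention level.

```python
def new_infections(beta, mu, condom_use, p0=0.15, population=10_000,
                   years=7.0, dt=0.05, efficacy=0.8):
    """Cumulative new infections over the evaluation period in a toy SI model.

    condom_use is the fraction of acts protected; efficacy is the assumed
    per-act protection. Both are illustrative assumptions.
    """
    force = beta * (1.0 - efficacy * condom_use)
    p = p0
    total = 0.0
    for _ in range(int(round(years / dt))):
        incidence = force * p * (1.0 - p)
        total += incidence * dt * population
        p += dt * (incidence - mu * p)
    return total


def infections_averted(posterior, use_before, use_after):
    """For each parameter set, compare intervention and simulated control.

    Returns a list of (infections averted, fraction averted) pairs.
    """
    results = []
    for beta, mu in posterior:
        control = new_infections(beta, mu, use_before)
        intervention = new_infections(beta, mu, use_after)
        averted = control - intervention
        results.append((averted, averted / control))
    return results


# Hypothetical retained parameter sets (beta, mu).
posterior = [(0.50, 0.20), (0.45, 0.15), (0.55, 0.25)]
for averted, fraction in infections_averted(posterior, 0.3, 0.7):
    print(round(averted), round(fraction, 3))
```

Running every retained parameter set through both scenarios gives a distribution of infections averted, from which a point estimate and a CrI can be read.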
To estimate the fraction of new infections prevented by the Avahan intervention and its different components, extra steps are needed to reflect the uncertainty surrounding the simulated control groups (see supplementary fig S2, available on the STI website: http://sti.bmj.com/). First, simulated control groups using baseline data will allow the estimation of the overall impact of any changes in high‐risk or treatment‐seeking behaviours after the intervention (as a result of the intervention or any other causes) on HIV, independently of the transmission dynamics. Second, IBBA, SBS, GPS and MIS data will be used to estimate plausible ranges for the fraction of individuals exposed to any intervention, or to the Avahan intervention specifically, and the incremental level of behavioural modification observed among those exposed to any intervention, or Avahan intervention only (when they are the only provider), compared with those not exposed (simulated control group). MIS process indicators of programme adequacy will be used for validation of coverage and intensity, and for estimating the improvement in STI services. Together, this will permit an estimation of the fraction of new HIV infections prevented by any intervention to which Avahan contributed (contribution) or by Avahan only (attribution), while taking into account uncertainties about risk behaviours of the control group.
By merging modelled effectiveness estimates with empirical costing studies, the cost effectiveness of the programme and its different components at the district level will be derived.9,31,32 Modelled projections of HIV/STI cases averted will be combined to obtain estimates of the overall disability‐adjusted life‐years saved, and then compared with the empirical cost data to obtain district, state and programme‐wide cost‐effectiveness estimates (measured as cost per HIV and STI cases averted and cost per disability‐adjusted life‐year saved). The models will be used to explore how the projected cost effectiveness changes if the intervention (and the resulting patterns of behaviour change) is sustained for different lengths of time, using district‐based cost data. To estimate the uncertainty in the cost‐effectiveness ratio, the uncertainty of costs, based on data from 75 districts, will be combined with the uncertainty in the impact estimates at the sampling stage.
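One way to combine the two sources of uncertainty at the sampling stage is to pair a draw from the cost distribution with a draw from the impact distribution and to summarise the resulting ratios. The sketch below does this with made‐up numbers. It assumes that cost and impact draws are independent, which the text does not state, and all names and values are hypothetical.

```python
import random
import statistics


def cost_effectiveness_draws(averted_draws, cost_draws, n_draws=5000, seed=0):
    """Pair random cost and impact draws and return cost per infection averted.

    Assumes costs and impact are sampled independently of each other.
    """
    rng = random.Random(seed)
    return [rng.choice(cost_draws) / rng.choice(averted_draws)
            for _ in range(n_draws)]


def summarise(ratios):
    """Return (2.5th percentile, median, 97.5th percentile) of the ratios."""
    ordered = sorted(ratios)
    last = len(ordered) - 1
    lower = ordered[int(round(0.025 * last))]
    upper = ordered[int(round(0.975 * last))]
    return lower, statistics.median(ordered), upper


# Hypothetical district-level inputs: infections averted per retained
# parameter set, and plausible total programme costs in US dollars.
averted_draws = [820.0, 910.0, 1005.0, 1130.0, 1240.0]
cost_draws = [410_000.0, 450_000.0, 495_000.0]

ratios = cost_effectiveness_draws(averted_draws, cost_draws)
print(summarise(ratios))
```

The interval around the cost per infection averted then reflects both the spread of the cost estimates and the spread of the modelled impact.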
The integrated mathematical framework presented has been designed for the rigorous evaluation of a large‐scale HIV intervention, and to minimise limitations and validity threats associated with uncertainty in parameter assumptions, model specification, and the non‐experimental nature of the design (table 1). Serial cross‐sectional studies have been designed to collect detailed modelling data, using techniques to minimise reporting biases. The uncertainty in model structure will be studied before impact assessment to validate models of reduced complexity. The extensive Bayesian fitting procedure will take into account the uncertainty in model parameters on impact estimates, and permit the testing of different hypotheses on previous beliefs in a scientific and objective manner. Despite residual uncertainty about the simulated control group, mathematical modelling remains the only way to provide quantitative impact estimates that take into account changes caused by the transmission dynamics of the infection; this cannot be done on the basis of observed epidemiological trends alone. The combination of impact estimates with costing data will provide estimation of the cost effectiveness of the intervention, an issue of ever‐increasing importance in a context of limited funding and competing public health priorities. The added value of defining the plan of analysis before the evaluation is to minimise observer biases. This is important given that the evaluation occurs outside the context of a blinded experimental design. As the intervention will be evaluated in a larger number of sites than would be possible with C‐RCT, it will be possible to make an overall assessment of the intervention impact in different epidemiological contexts, hopefully improving external validity.
This is the first time such an approach has been applied on such a large scale. The framework needs to deal with an unprecedented quantity of data for an HIV/AIDS modelling study, and a substantial amount of programming needs to be carried out before impact assessment results can be produced. The integrated mathematical framework, combined with the high‐quality second‐generation surveillance data being collected through the overall Avahan M&E, will help achieve a higher level of certainty in conclusions than analysis of epidemiological trends alone. Lessons learnt from the CHARME project could help the design of future evaluations of large‐scale interventions in other settings, whereas the results of the evaluation will be of programmatic and public health relevance.
The authors would like to thank Gina Dallabetta and Padma Chandrasekaran for very useful discussion and suggestions.
C‐RCT - community randomised controlled trial
CrI - credibility interval
FSW - female sex worker
GPS - general population survey
IBBA - integrated behavioural and biological assessment
M&E - monitoring and evaluation
MFSW - male and female commercial sex worker
MIS - management information system
MSM - men who have sex with men
SBS - special behavioural survey
STI - sexually transmitted infection
Funding: Support for this research was provided by the Bill & Melinda Gates Foundation through Avahan, its India AIDS Initiative.
Competing interests: None.
Contributions: MCB, CML, PV, LK, JB, SM, BMR, CW, RW, ACL, RMA, SRP and MA contributed to the different aspects of the study design. KD and MP designed the mathematical models used to produce simulation results and figures. MP helped with the design of the Bayesian framework. MCB wrote the first draft of the manuscript with the help of CML. KD, MP, MA, PV, LK, JB, SM, and MA contributed to the different drafts of the manuscript.
The views expressed herein are those of the authors and do not necessarily reflect the official policy or position of the Bill & Melinda Gates Foundation and Avahan.
CHARME‐India is the CHA—HIV/AIDS Research, Monitoring and Evaluation Project, India. The CHARME‐India team includes the investigators RMA, MA, JB, MCB, LK, ACL, CML, SM, BMR, SRP, PV, CW, RW and co‐workers S Chandrashekar, KN Deering, A Foss, K Gurav, AA Jayachandran, S Joseph, B Mahapatra, A Phillips and M Pickles. The main collaborating institutions include CHA, University of Manitoba, KHPT, St John's Medical College, Imperial College, LSHTM; collaborating institutions for field work include the Tata Institute of Social Sciences, Mumbai, and the Centre for Media Studies, Hyderabad.