Conceived and designed the experiments: SL XBC SO EAA GEV JL DG JAR PS MG PT FEJ AH. Wrote the first draft of the manuscript: SL. Contributed to the writing of the manuscript: XBC SO EAA GEV JL DG JAR PS MG PT FEJ AH. ICMJE criteria for authorship read and met: SL XBC SO EAA GEV JL DG JAR PS MG PT FEJ AH. Agree with manuscript results and conclusions: SL XBC SO EAA GEV JL DG JAR PS MG PT FEJ AH.
This is one paper in a three-part series that sets out how evidence should be translated into guidance to inform policies on health systems and improve the delivery of clinical and public health interventions.
Health systems interventions establish or modify governance (e.g., licensing of professionals), financial (e.g., health insurance mechanisms) and delivery (e.g., by whom care is provided) arrangements, and implementation strategies (e.g., strategies to change health provider behaviours) within health systems (which consist of “all organisations, people and actions whose primary intent is to promote, restore or maintain health”; see Box S1 for definitions of the terms used in this article). The focus of these interventions is to strengthen health systems in their own right or to get cost-effective programmes and technologies (e.g., drugs and vaccines) to those who need them. Decisions regarding health systems strengthening, including the development of recommendations by policy makers, require evidence on the effectiveness of these interventions, as well as many other forms of evidence. For example, in assessing potential policy options, reviews of economic evaluations and of qualitative studies of stakeholders' views regarding these options might be important (Table S1). Such evidence helps to address questions such as the cost-effectiveness of these options and which options are seen as appropriate by stakeholders.
Assessing how much confidence to place in the types of evidence available on health systems interventions is a key component in informing judgements regarding the use of such interventions for health systems strengthening (Box 1). This paper, which is the third of a three-part series on health systems guidance, aims to:
The first paper in this series makes a case for developing guidance to inform decisions on health systems questions and explores challenges in producing such guidance and how these might be addressed. The second paper explores the links between guidance development and policy development at global and national levels, and examines the range of factors that can influence policy development.
In this paper, which like the other two papers is based on discussions of the Task Force on Developing Health Systems Guidance (Box 2), we focus particularly on the GRADE approach, which provides a transparent and systematic approach to rating the quality of evidence and grading the strength of recommendations.
Well-conducted systematic reviews can be used to identify the best available evidence to inform judgements about the effects of policy options and to inform other key steps within the policy-making process (Table S1). As discussed in the second paper of this series, users need to be able to assess the quality of evidence presented in such reviews in relation to each step of the policy-making process. For example, when defining the problem and the need for intervention, tools are required to assess the confidence we can place in evidence from reviews of studies highlighting different ways of conceptualising the problem (e.g., reviews of studies of people's experiences of the problem). When assessing potential policy options, tools are needed to assess the confidence that can be placed in, for example, studies assessing impact (e.g., reviews of effectiveness studies). Similarly, when identifying implementation considerations, tools are required to assess the confidence that can be placed in reviews of factors affecting implementation.
Many tools are available to assess the risk of bias in individual studies of the effects of interventions and to appraise individual qualitative studies. Tools are also available to assess the quality of evidence synthesised in systematic reviews. Such tools need to be appropriate to the types of studies included in the review and generic enough to be applicable across a range of questions, and must allow meaningful conclusions to be drawn regarding the quality of the included evidence. Judgements on how much confidence can be placed in the evidence from a review need to be distinguished from judgements about how well the review was conducted (i.e., its reliability). Tools have been developed to assess the reporting of systematic reviews and meta-analyses (e.g., the PRISMA checklist) and to assess their methodological quality or reliability (e.g., the SUPPORT tools and AMSTAR) (Table 1). However, we focus here on tools to assess how much confidence can be placed in the evidence identified and presented in those reviews.
Tools to assess how much confidence to place in review findings are most developed for systematic reviews of evidence on effectiveness. The GRADE approach is one such tool, but many others are available. Within GRADE, the quality of evidence derived from a systematic review is related to the quality of the included studies and to a range of other factors (Table S2). This approach has many strengths (Table S3) and is now used increasingly by international organizations, including the World Health Organization (WHO), the Cochrane Collaboration, and several agencies developing guidelines (also see http://www.gradeworkinggroup.org). We will discuss the application of the GRADE approach to assess the quality of evidence and the strength of recommendations for health systems interventions later.
Tools to assist judgements on how much confidence to place in the findings of reviews of questions other than effects that are relevant to the policy-making process are at an early stage of development. Such questions include stakeholders' values and preferences and the feasibility of interventions (Table S1). For some of these issues, judgements might be informed by systematic reviews of qualitative studies, together with local evidence. Where reviews seek a qualitative answer to understand the nature of a problem, quality appraisal aims to assess the coherence of the resulting explanation, possibly across different contexts. Although quality criteria for individual qualitative research studies commonly consider the methods of each study and the credibility and richness of their findings, thus far tools for assessing the quality of systematic reviews of qualitative research have not considered the credibility and richness of findings. A potential tool for doing this is proposed in Table S4.
Reviews exploring factors affecting the implementation of options might employ mixed methods syntheses (i.e., syntheses of both qualitative and quantitative evidence) such as realist synthesis, which explores the explanatory theories implicit in existing programmes or policies, or framework synthesis, which provides a highly structured, deductive approach to data analysis drawing on an existing model or framework. Quality appraisal then focuses on the confidence that can be placed in each conclusion drawn from individual studies. Grading the evidence as a whole can take into account the number and context of the studies contributing to each conclusion, and the appropriateness of their methods for drawing that conclusion. A single study might refute or qualify a theory, but multiple studies together contribute to strengthening a theory. As yet, there are no tools for appraising how well mixed methods reviews have synthesised studies to draw conclusions about the advantages or disadvantages of policy options.
Resource use is another key issue, and tools are available to assess the reliability of reviews of economic studies. In addition, GRADE provides guidance on how to incorporate considerations of resource use into recommendations.
The GRADE approach clearly separates two issues: the quality of the evidence and the strength of recommendations. Quality of evidence is only one of several factors considered when assessing the strength of recommendations.
Within the context of a systematic review, GRADE defines the quality of evidence as the extent to which one can be confident in the estimate of effect. Within the context of guidelines or guidance, GRADE defines the quality of evidence as the confidence that the effect estimate supports a particular recommendation. The degree of confidence is a continuum but, for practical purposes, it is categorised into high, moderate, low, and very low quality (Table S5).
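The categorisation described above can be illustrated with a short sketch. It is not part of the GRADE guidance itself and it simplifies what is, in practice, a structured judgement rather than arithmetic. The sketch assumes the commonly described GRADE convention that evidence from randomised trials starts as high quality and evidence from observational studies starts as low quality, and that the rating is then moved down or up by the panel's judgements. The names `OutcomeAssessment` and `rate_quality` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# GRADE's four quality categories, ordered from lowest to highest confidence.
LEVELS = ("very low", "low", "moderate", "high")


@dataclass
class OutcomeAssessment:
    """Hypothetical record of a panel's judgements for one outcome."""
    randomised: bool      # True if the evidence comes from randomised trials
    downgrades: int = 0   # levels removed (e.g., for risk of bias or imprecision)
    upgrades: int = 0     # levels added (e.g., for a large effect)


def rate_quality(assessment: OutcomeAssessment) -> str:
    """Return a GRADE-style category from a simplified tally of judgements."""
    # Assumption: randomised evidence starts at "high", observational at "low".
    start = 3 if assessment.randomised else 1
    index = start - assessment.downgrades + assessment.upgrades
    # Clamp the result to the four available categories.
    index = max(0, min(len(LEVELS) - 1, index))
    return LEVELS[index]


if __name__ == "__main__":
    # A trial with one serious concern, and an unadjusted observational evaluation.
    cluster_trial = OutcomeAssessment(randomised=True, downgrades=1)
    national_programme = OutcomeAssessment(randomised=False)
    print(rate_quality(cluster_trial))
    print(rate_quality(national_programme))
```

A tally of this kind records only the outcome of the panel's judgements. The reasons for each downgrade or upgrade still need to be documented if the assessment is to be transparent.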
Evidence on the effectiveness of health systems interventions raises a number of challenges that may, in turn, influence assessments of the quality of this evidence and the development of recommendations using the GRADE approach. Firstly, while experimental studies (including pragmatic randomised trials) are feasible for some health systems interventions, for others (particularly those related to governance and financial arrangements), evidence may come mainly from observational studies, including evaluations of national or state-wide programmes. Secondly, evaluations of health systems interventions often use clustered designs and these are frequently poorly conducted, analysed, and reported. Thirdly, health systems interventions tend to measure proxy outcomes, such as the use of services or the uptake of an incentive. Evidence users need to decide whether there is sufficiently strong evidence of a relationship between the proxy outcome and the desired health outcome. The development of an outcomes framework to assist in assessing interventions may help those developing guidance decide whether proxy outcomes are sufficient. Finally, poorly described health systems and political systems factors and implementation considerations may make it difficult to develop contextualised recommendations on policy options. GRADE attempts to make the judgements regarding these issues systematic and transparent.
Moving from evidence to recommendations on options for consideration often necessitates the interpretation of factors other than evidence. In most cases, these interpretations require judgments, making it important to be transparent, particularly given that recommendations will sometimes need to consider multiple complex health systems interventions, each with its own assessment of quality of evidence. Another challenge is the additional complexity of assessing the wide range of health system and political system factors that will influence the choice and implementation of options for addressing a health system problem in different settings (see the other papers in this series). For example, a health systems problem may involve a wide range of stakeholders, each with views regarding the available options. In addition, health systems interventions may have system-wide effects that vary across settings. Consequently, rather than making a single recommendation as in clinical guidelines, it may be more useful for health systems guidance to set out the evidence and outline a range of options, appropriate to different settings, to address a given health systems problem. These options may, in turn, feed into deliberative or decision-making processes at national or sub-national levels, as discussed later and elsewhere in this series.
Tools such as GRADE assist in grading the strength of a recommendation regarding options, and can be applied to health systems interventions, but may benefit from explicitly including some additional factors (Box 3). Further research is needed to explore the usefulness of these additional factors but, in general, any tool used to guide the development of recommendations should aim to improve transparency by explicitly describing the factors, and their interpretation, that contributed to the development of recommendations.
GRADE factors:
Additional factors that it may be useful to consider for health systems interventions:
Within the GRADE approach, recommendations reflect the degree of confidence that the desirable effects of applying a recommendation outweigh the undesirable effects. Specifically, a strong recommendation implies confidence that the desirable effects of applying a recommendation outweigh the undesirable effects, whereas a conditional/qualified/weak recommendation suggests that the desirable effects of applying a recommendation probably outweigh the undesirable effects, but there is uncertainty.
GRADE attempts to make all judgments regarding the factors that are considered in developing recommendations transparent (by documenting these judgments) and systematic (by using the same approach across all the questions being considered by the guideline). Tables 2, S6, and S7 provide illustrations of the application of the GRADE approach to health systems interventions involving delivery and financial arrangements and implementation strategies, respectively, and show how guidance on health systems interventions might outline a range of options appropriate to different settings—an approach on which further research is needed. In common with other grading systems, GRADE does not yet provide guidance on how to assess the level of confidence that can be placed in evidence on “acceptability” or “feasibility”. Conventionally, these judgements have been made by consensus among the guideline panel, which needs to include individuals with expertise and experience relevant to the guideline questions. Further work is needed to develop a formal way of assessing the quality of such evidence.
The move from an assessment of the quality of evidence to making recommendations on policy options involves a number of challenges. Firstly, assessments of the strength of a recommendation may require a detailed understanding of the evidence creation and evaluation process that is not always available. Secondly, categorising recommendations as “strong” and “weak” can raise difficulties. Panels developing global guidance may, for example, be reluctant to make “weak” recommendations in case policy makers fail to respond to such recommendations because they assume they are equivalent to “no recommendation”. Thirdly, the quality of evidence is typically assessed for two alternative policy options in tools such as GRADE, while many health system (and, indeed, many clinical) decisions involve multiple interventions, which adds to the complexity of interpretation and decision-making and makes it even more important to be transparent.
How might some of these challenges be addressed? Methodological expertise is needed to conduct and interpret systematic reviews and perform assessments using tools such as GRADE. Health systems guidance panels therefore need to be supported by methodology experts. Moreover, the outputs of these tools need to be “translated” into appropriate language and formats to ensure that they can be interpreted and used correctly by the panel. Research is under way within initiatives such as the DECIDE (Developing and Evaluating Communication strategies to support Informed Decisions and practice based on Evidence; http://www.decide-collaboration.eu) collaboration on ways of presenting information on GRADE assessments and policy options to policy makers.
There are other wider challenges involved in making recommendations on policy options regarding health systems interventions. Firstly, the tools used to assess the quality of evidence and develop recommendations need to be able to accommodate the wide range of study designs that is used to assess the effectiveness of health system interventions. This is possible within GRADE. Secondly, tools need to be developed to inform judgements on how much confidence to place in the other forms of evidence (e.g., evidence on acceptability) that are needed to develop recommendations regarding health systems interventions.
Finally, international standard-setting organisations, such as WHO, have to formulate recommendations that are applicable at a global level. However, as noted earlier, creating global recommendations on health systems questions can be difficult because of important variations in context-specific factors that influence the applicability of interventions at national and sub-national levels. An approach that should help to link guidance development at the global level with policy development at the national level is outlined in the second paper of this series.
Where it is useful to make recommendations at the global level, those developing guidance may choose to outline policy options rather than a single recommendation. Such options may encompass one or more questions and may be based on the range of interventions considered in relation to these questions. Health and political systems factors could be taken into account by linking specific options to these factors. For example, the options may describe variations in the intervention content and method of implementation, based on the evaluations that have been conducted in different settings. Further work is needed to explore how policy makers might interpret and select policy options outlined in global guidance. However, one useful approach might be to provide national decision makers with tools to assist them in making recommendations appropriate to their setting. Several such tools are available or in development (see http://www.decide-collaboration.eu/work-packages-strategies). Importantly, global guidance should always indicate the factors that should be considered to assess the implications of variations in intervention, context, and other conditions. Decision models may be useful in exploring the effects of these variations. Given the often low quality evidence available regarding policy options for health systems problems, it is also likely that in many cases the recommended option(s) will need to be evaluated.
The best way to communicate evidence on contextual and implementation issues related to health systems and political systems to guidance panels and policy makers to inform their judgements about the strength of recommended options is currently unclear. Related work on summary of findings tables for systematic reviews of effects and evidence summaries for policy makers has illustrated the importance of paying attention to both format and content in developing useful and understandable presentation approaches. The Handbook for Developing Health Systems Guidance sets out an approach for presenting this type of evidence to stakeholders in a user-friendly evidence profile. Similarly, the second paper in this series describes the wider features of health and political systems that may need to be assessed to inform decision-making.
If we want to ensure that guidance panels and policy makers use evidence to inform judgements about the strength of recommended options, more research is needed to develop and test approaches (including visual formats) for presenting the available evidence to such groups. In addition, efforts are needed to build the capacity of policy makers to use evidence to inform their decisions.
Useful tools are available for grading quality of evidence and strength of recommendations on policy options regarding health systems interventions, but several challenges need to be addressed. Firstly, these tools involve judgements, and these need to be made systematically and transparently. Secondly, for many health systems questions, evidence is still likely to be of low quality. Better quality research in these areas is needed and would allow guidance panels to have more confidence in the evidence and to issue stronger recommendations. Thirdly, research is needed on ways to develop, structure, and present policy options for consideration within global health systems guidance. These options need to include evidence on health and political system factors and implementation considerations, and tools to assess such evidence need to be refined. Finally, greater attention needs to be given to how guidance on health systems interventions may be implemented at the local level.
Translation of the Summary Points into Spanish by Xavier Bosch-Capblanch
Translation of the Summary Points into French by Bruno Clary, William Lenoir, and Lise Beck
Translation of the Summary Points into Portuguese by Bruno Viana
Translation of the Summary Points into Arabic by Fadi El-Jardali
Box S1. Definitions of terms used in this paper
Table S1. Types of systematic reviews needed for different steps in the policy-making process and the availability of tools to assess how much confidence can be placed in the evidence presented in these reviews
Table S2. GRADE criteria for assessing the quality of evidence for each important outcome assessed in a systematic review of effects
Table S3. Strengths of the GRADE approach
Table S4. Assessing how much confidence can be placed in the findings of systematic reviews of qualitative studies
Table S5. Definitions of the quality of evidence categories within GRADE
Table S6. Example of factors affecting decisions about strength of recommendations: changes in user fees in low- and middle-income countries
Table S7. Example of factors affecting decisions about strength of recommendations: continuing education programmes for rural health workers to support their retention
We acknowledge other members of the Task Force on Developing Health Systems Guidance, who include (with their affiliations at the time when the Task Force was initiated): Edgardo Abalos, Centro Rosarino de Estudios Perinatales (Argentina); Abdul Ghaffar, Alliance for Health Policy and Systems Research, World Health Organization (Switzerland); Timothy Evans, Assistant Director-General, Information, Evidence and Research, World Health Organization (Switzerland); Regina Kulier, Department of Research Policy and Cooperation, Information, Evidence and Research, World Health Organization (Switzerland); Pierre Ongolo-Zogo, Centre for the Development of Best Practices in Health, Yaoundé Central Hospital, and University of Yaoundé, Yaoundé (Cameroon); Tikki Pang, Department of Research Policy and Cooperation, Information, Evidence and Research, World Health Organization (Switzerland); Ulysses Panisset, Department of Research Policy and Cooperation, Information, Evidence and Research, World Health Organization (Switzerland).
Thanks also to the following for their thoughtful input: Claire Glenton and Andy Oxman, Norwegian Knowledge Centre for the Health Services (Norway).
The Swiss Tropical and Public Health Institute and the Norwegian Knowledge Centre for the Health Services received funds from the WHO for the contributions of DD, PS, LB, SL, and XBC to developing a Handbook to produce health systems guidance, and some of this work is reported in this article. DG is a member of the PLoS Medicine Editorial Board. EAA and GEV are members of the GRADE Working Group. All other authors declare no competing interests.
The development of the Handbook was supported by a grant to the WHO by the Rockefeller Foundation, which had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The paper represents the views of the authors and not those of WHO or the Rockefeller Foundation.
Provenance: Not commissioned; externally peer reviewed.