In clinical practice guidelines, the quality of the available evidence is graded according to its reliability and quality. This study aimed to evaluate the quality of the available research evidence, using the levels of evidence, in the evidence summaries of 64 Finnish national evidence‐based Current Care guidelines.
Electronic web‐based guidelines in Finland.
The proportions of evidence summaries with different levels of evidence (A–D).
The 64 guidelines had a total of 2419 evidence summaries. Of these, 532 (22.0%) were evidence level A, 891 (36.8%) were evidence level B, 808 (33.4%) were evidence level C, and 188 (7.8%) were evidence level D. Most of the evidence summaries pertained to treatment (58.2%) and diagnosis (22.4%). The sections on diagnosis and treatment represented 80% of all the level A and level B evidence, and 81% of all the level C and level D evidence.
There is adequate high‐quality evidence (level A) to support only a fifth of the main statements of the 64 guidelines. This is most likely an optimistic estimate, since level D evidence often does not have an evidence summary. The guideline development groups find it easier to agree on recommendations based on level A and level B evidence.
Quality is of pivotal importance in every healthcare system. Quality of care is a complex concept and consists of both subjective and more objective components.1,2 Knowledge is at the heart of good quality care, and according to Muir Gray3 it consists of three components: knowledge from research (evidence), knowledge from measurement of healthcare performance (statistics) and knowledge from experience (mistakes). We use published evidence to complement the silent knowledge passed on from previous generations of researchers. Without good research evidence, clinical decision making, be it diagnosis or treatment, is on shaky grounds.
Clinical practice guidelines have been widely adopted as a tool to improve quality of care.4 Such guidelines have been increasingly produced since the 1980s. Currently, most are produced in guideline programmes and aim to be explicitly evidence based. International collaboration has been established in this area,4 and efforts are underway to harmonise grading of the available evidence. However, several grading systems are currently in use.5 The basis of the grading systems is the evaluation of the validity of the research. It has been estimated that between 10% and 20% of healthcare decisions are based on high grades of evidence, and that only 50% of medical treatments, or possibly as few as 15%, have been validated in clinical trials.6 However, a high grade of evidence does not necessarily mean that a finding is clinically important.
In Finland, a national evidence‐based guideline programme, Current Care (Käypä hoito; http://www.kaypahoito.fi), was established in 1994 under the auspices of the Finnish Medical Society Duodecim.7 To ensure good methodology, a guideline developer's handbook has been available since the beginning of the programme. In January 2006, the collection included 64 Current Care guidelines covering a wide variety of clinical topics (table 1). These were supported by 2419 evidence summaries and graded recommendations. The present study aimed to provide an overview of the evidence for clinical decision making, using this collection of guidelines and the evidence summaries as the study material.
The development process of a Current Care guideline is outlined in box 1 and follows the quality standards of the Appraisal of Guidelines, Research and Evaluation in Europe (AGREE) instrument.8 The Current Care board selects topics from suggestions made mostly by the specialist societies, lately with the help of a prioritising tool.9 The development group consists of clinical experts.
The process begins with a literature search. Critical appraisal of the literature is based on criteria published by the Evidence Based Medicine Working Group.10 Depending on the quality and size of the original studies, the evidence base of the main statements is graded from A to D (table 2). The key statements are supported by evidence summaries. The Current Care consensus process is an informal one in which the guideline development group discusses the evidence in the context of the Finnish healthcare system. When there is a lack of grade A or B evidence, and especially in the case of grade D evidence, this process can be tedious. The discussion is an iterative process at the end of which the actual recommendations are carefully worded. For more recent guidelines, the groups have projected the text on a screen so that it can be edited jointly.
Electronic publication allows easy linking to the evidence base and wide dissemination. The important characteristic is the accessibility of the evidence summaries if more information on the topic is needed. Our most read guidelines in 2005 were the hypertension guideline (28445 hits, 4.4% of all hits), lower back conditions guideline (24615 hits, 3.8% of all hits) and schizophrenia guideline (21455 hits, 3.4% of all hits). In all, the Current Care guidelines were read 640434 times in 2005.
We examined all available 64 Current Care guidelines. The material for analysis was retrieved directly from the updated Current Care guideline XML database (accessed 7 February 2006). The guidelines were listed on the basis of the topics and then further by the sections in each guideline (epidemiology, prevention, diagnosis, treatment, rehabilitation, screening, recommended organisational level of care). Then all the evidence summaries were listed and classified by their topic and level of evidence (A–D). We also retrieved the evidence summaries linked to every Current Care guideline and classified these according to the sections of the guideline to which they were linked. Here we report the actual numbers of evidence summaries within these classifications and the percentages. We correlated the number of evidence summaries with the length (in pages) of the Current Care guideline.
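The classification step described above can be sketched as a simple tally. This is a minimal illustration only: the record layout (guideline, section, level) and the sample values are hypothetical and are not taken from the Current Care XML database.

```python
from collections import Counter

# Hypothetical records, one per evidence summary: (guideline, section, level).
# The values are illustrative and do not come from the Current Care database.
records = [
    ("guideline 1", "treatment", "A"),
    ("guideline 1", "treatment", "B"),
    ("guideline 1", "diagnosis", "C"),
    ("guideline 2", "prevention", "D"),
    ("guideline 2", "treatment", "A"),
]

# Tally the evidence summaries by level, by section, and by both together.
by_level = Counter(level for _, _, level in records)
by_section = Counter(section for _, section, _ in records)
by_section_level = Counter((section, level) for _, section, level in records)

for (section, level), n in sorted(by_section_level.items()):
    print(f"{section:<12} level {level}: {n}")
```

Counting on the (section, level) pair gives the cross-classification used for the section-wise breakdowns, while the two single-key tallies give the marginal totals.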
There were 2419 evidence summaries for the 64 guidelines. There were 532 level A, 891 level B, 808 level C and 188 level D evidence summaries. The distribution of the evidence summaries (A–D) of all the guidelines is shown in fig 1. The sections on treatment and diagnosis had the most evidence (all levels from A to D). The section on treatment contained 58.2% of all evidence and diagnosis had 22.4% (fig 2). Level A and B evidence represented 58.8% of all evidence, with the sections on diagnosis and treatment containing 80% of all level A and level B evidence. Figure 3 provides a breakdown of all the level A and level B evidence according to the sections. The dyslipidaemias guideline had the greatest percentage of level A and B evidence (n=23; 95.7%). Figure 4 depicts the distribution of the evidence summaries graded C or D. The sections on diagnosis and treatment represented 81% of all level C and D evidence.
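The reported proportions follow directly from the counts per evidence level. The short sketch below reproduces that arithmetic from the four counts given in the text; the variable and function names are our own and serve only as illustration.

```python
# Counts of evidence summaries per level, as reported in the text.
counts = {"A": 532, "B": 891, "C": 808, "D": 188}

total = sum(counts.values())


def share(n, total):
    """Return n as a percentage of total, rounded to one decimal place."""
    return round(100 * n / total, 1)


# Percentage of all evidence summaries at each level.
shares = {level: share(n, total) for level, n in counts.items()}

# Combined percentage of level A and level B summaries.
high_grade = share(counts["A"] + counts["B"], total)

print(total, shares, high_grade)
```

The four counts sum to the stated total of 2419, and the combined share of levels A and B matches the 58.8% quoted above.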
The number of printed pages (PDF layout, without references) of the guideline varied from four pages (corticosteroid treatment in patients at risk of premature labour) to 25 pages (atrial fibrillation), with a mean of 12.1 pages. The number of evidence summaries in a guideline ranged from eight in six pages (acute bronchitis) to 96 in 19 pages (rheumatoid arthritis). Evidence summaries are published only in the electronic format. The spinal cord injury guideline had a total of 67 evidence summaries. This guideline had the greatest percentage of level C and D evidence (n=55). The guideline on schizophrenia had the greatest proportion of level D evidence (n=11; 23%).
The main result of the present study is that, strictly speaking, only 22% of the key statements of the 64 guidelines were supported by high‐grade evidence (level A). This is probably an overestimate, since there seems to be a relative lack of level D evidence summaries. The sections on diagnosis (30%) and treatment (51%) had the greatest proportion of level C and D evidence. Current Care guidelines are meant to provide recommendations on the diagnosis and treatment of a condition; however, areas such as pathophysiology, rehabilitation and prevention are also included. The end result is a comprehensive guideline on a clinical condition with recommendations supported by various levels of evidence. That the level of evidence is “only” C or D does not indicate that the recommendation is clinically less important. On the contrary, important clinical decisions have to be made regardless of the level of evidence.
The use of only one set of guidelines is both a strength and a limitation of the present study. The methodology and especially grading of the evidence are based on the same handbook. Some updating has been done, but the basic rules remain the same as at the beginning of the programme. The contents of the guidelines are similar, with a basic set of sections that form the core. On the other hand, although English translations of the most recent guideline abstracts have been available since 2004, the guidelines are only available in Finnish. It is therefore difficult for guideline developers in other countries to evaluate them.
It seems that the Finnish guideline groups have a preference for level A and B evidence summaries. They probably find it easiest to make recommendations that are based on evidence levels A and B. Burgers11 recently stated that high‐quality guidelines are based on evidence as well as a broad consensus of opinions, which facilitates the acceptance and effective use of the guideline by the target group. There is, in particular, probably a relative lack of level D summaries, since these should consist of an outline of the consensus reached by the development group on a topic with clearly little evidence. In the earlier guidelines, level D key statements were just indicated by the letter D without an accompanying evidence summary, but this practice has since changed. Now the level of evidence has to be supported by an evidence summary. There are probably some key statements or recommendations that will need to be supported by evidence level D summaries, and therefore the proportion of level D evidence is underestimated in the present overview. The Current Care handbook states that only the most central recommendations in the guideline should be supported by evidence summaries. Therefore level D evidence might easily be supported by just giving the reference—for example, an overview.
Another source of bias may be that, because the Current Care guidelines are ideologically evidence based, the groups may tend to draw up summaries that are graded at least C. For this study, we did not analyse the key recommendations that should be backed by an evidence summary. An interesting question is whether the evidence summaries in clinical practice guidelines, especially those graded D (and C), could be used as an important source of research questions. Since the guidelines are developed by clinicians, these questions may have direct clinical relevance, and we will shortly explore this possibility in our guideline material.
We did not systematically analyse how often an evidence summary was referred to in the guidelines. However, on the basis of preliminary scanning, this seems exceptional. One of the basic rules is that the evidence summary should only answer one question. According to our experience, the groups abide by this rather well.
Use of guidelines may measure one component of the organisation's maturity (Maturity Matrix12). The other components of the Maturity Matrix are clinical records, audit of clinical performance, clinician access to clinical information, prescribing, practice‐based organisational meetings, sharing information with patients and patient feedback systems. The grading of maturity increases as the level of competence increases. The Current Care guidelines are well distributed and disseminated to all professionals and practically all healthcare organisations via a professional health portal (Terveysportti) and they are also freely available on the internet (http://www.kaypahoito.fi). Thus the guidelines are incorporated into clinical information systems, which underlines the importance of the quality of the guidelines. Care pathways and the national criteria for referral to elective care (http://www.stm.fi), implemented in 2005, are based on guidelines whenever possible, so the evidence is integrated into core healthcare.
The quality of evidence is relevant to guideline implementation. According to Dutch researchers, compliance with guidelines is better if the evidence base is solid.13 The implementation of a guideline is also facilitated by the quality of the guideline itself, that is, its readability and directness.14 One cornerstone of good‐quality guidelines is supporting the most central recommendations with evidence summaries. These also serve as a message from the guideline development group to the audience, highlighting the specific areas of the guideline that the group considers most important. The aim of using the evidence and the guidelines is to improve the quality of healthcare. The existing evidence is not always applicable or even directly relevant to clinical practice. With regard to guidelines, a tool has been developed to aid the Current Care board in making decisions about new guideline topics, to better ensure their clinical relevance and importance to the national healthcare system.9 Another project has been launched with the aim of developing an interactive electronic decision support (EDS) based on the guidelines, which, along with the electronic patient record (EPR), will support clinical decision making.15 There are still many obstacles, such as motivating clinicians to use the EPR actively in a structured way and resolving data protection issues. However, in the long run, when the electronic guidelines, the EDS and the EPR are in place, it will become normal practice to collect and analyse clinical outcomes data online. This system may also enable us to improve the relevance of our research questions.
We have used the available Current Care guideline material to describe the available proportions of the evidence levels A–D for guideline development. The present results confirm the previous estimates that about 10–20% of clinical decisions are based on good evidence. The next step will be to analyse more carefully the grade C and D evidence summaries and whether these can be used to highlight the gaps in our evidence base, and thereby guide the questions for researchers to answer next.
All authors contributed to the study design and interpretation of the data, carried out the analyses and prepared the draft manuscript. All authors participated in revising the manuscript for important intellectual content and approved the version to be published.
No ethics approval was required for this study.
Competing interests: None.