Guidelines translate best evidence into best practice. A well-crafted guideline promotes quality by reducing healthcare variations, improving diagnostic accuracy, promoting effective therapy, and discouraging ineffective – or potentially harmful – interventions. Despite a plethora of published guidelines, methodology is often poorly defined and varies greatly within and among organizations.
This manual describes the principles and practices used successfully by the American Academy of Otolaryngology – Head and Neck Surgery to produce quality-driven, evidence-based guidelines using efficient and transparent methodology for action-ready recommendations with multi-disciplinary applicability. The development process, which allows moving from conception to completion in twelve months, emphasizes a logical sequence of key action statements supported by amplifying text, evidence profiles, and recommendation grades that link action to evidence.
As clinical practice guidelines become more prominent as a key metric of quality healthcare, organizations must develop efficient production strategies that balance rigor and pragmatism. Equally important, clinicians must become savvy in understanding what guidelines are – and are not – and how they are best utilized to improve care. The information in this manual should help clinicians and organizations achieve these goals.
If you use or develop clinical practice guidelines this manual will likely be of interest. “There are many paths to the top of the mountain,” suggests an old Chinese proverb, “but the view is always the same.”1 Although many paths lead to guidelines, we offer proven strategies for crafting a valid and action-ready product within twelve months. The driving force is quality improvement with a continuous effort to balance pragmatism with developmental rigor. The end product is a starting point for performance improvement.
This manual builds on an earlier publication2 by the American Academy of Otolaryngology – Head and Neck Surgery (AAO-HNS) to systematize internal guideline development. By following these principles the AAO-HNS published five multidisciplinary guidelines in five years, all within 12 months from conception to completion.3–7 Each guideline presented a fresh opportunity to test and refine prior efforts, necessitating a revised and greatly expanded manual only three years after initial publication. Our new manual not only summarizes this experience, but also allows other organizations to assess and adapt the processes.
Our goals in publishing a revised manual are several. First, we sought to provide clinicians with a straightforward explanation of guidelines, considering the increasing prominence of guidelines as a quality metric. Second, we wanted a pragmatic resource, which accurately reflects current practices, to sustain consistent guideline development at the AAO-HNS. Last, we wanted to share our successful development process with the guideline community at-large to encourage an exchange of ideas and to promote best practices.
Guidelines are particularly important when wide regional variations exist in managing a condition. Similarly, the wide variability in guideline methodology, both within and between organizations, is precisely what mandates a systematic approach to guideline development. Despite a plethora of techniques reflected in published guidelines, we could not find a single, comprehensive “how-to” manual with a valid and pragmatic approach that could be readily implemented. This work is offered to address this void.
We thank the AAO-HNS for their trust, support, and flexibility throughout this fruitful collaboration, and sincerely hope that you may also benefit from the experience. We humbly acknowledge that ours is one of many paths to the mountain top, and look forward to further refinement based on reader feedback and ongoing experience.
Throughout the manual we emphasize principles and practices, recognizing that both are needed to translate concepts into action. Principles underlying practices are always stated, to promote conceptual focus and clarity before getting sidetracked with implementation details. Practices are illustrated with examples from prior AAO-HNS guidelines to clarify how we chose to implement a principle, with the understanding that other development groups will need to modify the particulars to fit their organizational structure and resources.
Content that is offset from the remainder of the text is intended to emphasize a concept, insight, or principle, of special note or importance. This also serves to create visual breaks that improve readability, along with tables, bulleted lists, numbered lists, and frequent subheadings.
The following list describes some of the fundamental principles underlying guideline development that are discussed sequentially in this manual:
This manual is also intended as a practical resource for use during guideline development. Major sections of the text correspond to the activities involved (Table 1), allowing the user to move from one section to the next as development proceeds. When significant principles apply to an activity they are discussed within, or just before, the relevant section.
We have tried to make this manual as reader-friendly as possible, but how to best approach the material will depend upon one’s background and perspective. Individuals and organizations involved in guideline development comprise a diverse audience that may benefit from the material in the manual in several ways:
Guideline users are perhaps even more diverse than guideline developers. Whereas most guideline users do not need, or want, detailed information on methodology, some understanding is necessary to interpret and utilize existing products:
The Institute of Medicine has identified three crucial tasks for a national system to identify highly effective health care services: priority setting, evidence review, and developing recommendations (guidelines).8 The last task – creating clinical practice guidelines – is perhaps the most challenging, because methodology continues to evolve, the quality and relevance of available evidence are highly variable, and evidence gaps mandate valid processes for incorporating expert consensus.
Guidelines help clinicians translate best evidence into best practice. A well-crafted guideline promotes quality by reducing healthcare variations, improving diagnostic accuracy, promoting effective therapy, and discouraging ineffective – or potentially harmful – interventions.
This manual offers one approach to efficient guideline development based on experience of the American Academy of Otolaryngology – Head and Neck Surgery (AAO-HNS) and the Yale Center for Medical Informatics. The AAO-HNS and associated Foundation sponsor continuing medical education, professional meetings, scientific research, and practice management guidance for more than 13,000 ear, nose, and throat specialists in the United States and abroad. Since 2004 the AAO-HNS has devoted substantial resources to creating and publishing guidelines that improve quality of care in diverse clinical practice environments.
Despite an immediate need for valid, action-ready guidelines, many barriers exist. Published guidelines, although numerous, are often poorly suited to assess performance or influence care, because recommendations do not translate into measurable actions or activities. Moreover, the development process is generally inefficient and highly complex, requiring, on average, about 2 to 3 years per guideline. Gaps in the evidence base for many important issues typically preclude guideline recommendations based on evidence, even when quality concerns or practice variations mandate urgent action.
One solution is to produce quality-driven, evidence-based guidelines using efficient and transparent methodology for action-ready recommendations with multi-disciplinary applicability:
The utility of a guideline depends highly on its transparency, which makes clear the purpose and basis of recommendations to end users. Transparency mandates disclosure of competing interests by authors, explicit statements about the reasons for developing a policy, and explanation of how contributing factors are weighed.9
AAO-HNS guidelines prescribe recommendations in key action statements followed by amplifying text. All guideline action statements should ideally be supported by evidence profiles that summarize clearly the decision-making process in terms of aggregate evidence quality, harm-benefit assessment, development group values, and the role of patient preference. Evidence profiles are discussed fully later in this manual.
As defined by the Institute of Medicine, clinical practice guidelines are “systematically developed statements to assist practitioner and patient decisions about appropriate health care for specific clinical circumstances.”10 Despite increasing acceptance of an evidence-based approach to clinical decision-making, much clinical practice is still not based on the best available evidence. Guidelines are one way of implementing evidence into practice.11 They can serve as a guide to best practices, a framework for clinical decision-making, and a benchmark for evaluating performance.
Guidelines benefit patients through better outcomes, fewer ineffective interventions, greater consistency of care, and by creating secondary implementation materials (pamphlets, videos, etc.). Clinicians can use guidelines to make better decisions, initiate quality improvement efforts, prioritize new research initiatives, and support coverage or reimbursement for appropriate services. Conversely a flawed guideline could significantly harm both patients and clinicians, thereby mandating sound methodology as a basis for guideline development.12
Simply inserting the word “guideline” in the title of a document does not make it so. Many review articles, consensus statements, practice parameters, and policy recommendations are mistakenly labeled as “guidelines,” even though they do not possess the methodologic rigor to warrant such a designation. A real guideline is one that fulfills all or most of the specific criteria defined below.
The Appraisal of Guidelines Research & Evaluation (AGREE) instrument is a widely used generic measure of guideline quality.13 Quality guidelines are characterized by the following attributes:
The Conference on Guideline Standardization (COGS) Checklist is another tool that specifies characteristics of a valid and usable clinical practice guideline.14 In contrast to the AGREE instrument, which assesses guidelines after completion, the COGS checklist can be used during development to improve quality. The 18 characteristics in the COGS checklist are shown in Table 2.
Guidelines meeting certain quality standards are included in the National Guideline Clearinghouse (NGC) database, an initiative of the Agency for Healthcare Research and Quality. NGC inclusion criteria are:15
Equally important to understanding what guidelines are is a clear appreciation of what guidelines are not. Without this perspective clinicians may become apprehensive about the impact of guidelines on their lives, and organizations may apply guidelines to situations for which they were never intended.
Guidelines are never intended to supersede professional judgment; rather they may be viewed as a relative constraint on individual clinician discretion in a particular clinical circumstance.16 Clinicians should always act and decide in a way that they believe will best serve their patients’ interests and needs, regardless of guideline recommendations. Guidelines simply represent the best judgment of a team of experienced clinicians and methodologists addressing the scientific evidence for a particular topic.
Guidelines differ from systematic reviews and evidence reports that identify and combine studies using explicit methods to reduce bias, but do not typically define appropriate actions or incorporate values. In contrast, a guideline uses information from evidence reviews and other sources to make specific recommendations by considering values and linking the strength of recommendation to the quality of evidence.
Last, evidence-based clinical practice guidelines are not intended for cost control or healthcare rationing. Guidelines seek to produce optimal health outcomes for patients, minimize harm, and reduce inappropriate variations in clinical care. Whereas some of these outcomes may also reduce costs, financial benefits alone are generally not the main focus of an evidence-based clinical practice guideline.
Without substantial advance planning, guideline development is likely to be biased and inefficient. Moreover, an a priori protocol is mandatory to ensure attention to the COGS and AGREE quality standards. Based on literature review and direct experience in North America and the United Kingdom, Shekelle and colleagues17 concluded that five steps are involved in the initial development of an evidence-based guideline:
Turner and co-workers11 compared approaches to guideline development in six handbooks from the Council of Europe, World Health Organization, and from national organizations in Australia, Scotland, New Zealand, and the United Kingdom. All handbooks agreed that key aspects of development included a multidisciplinary panel, consumer involvement, identifying clinical questions or problems, systematically reviewing and appraising the literature, a process for drafting recommendations, external consultation and review, and planned updating.
Guyatt and colleagues18 have focused on grading evidence quality and recommendation strength in guidelines, emphasizing that both are separate and distinct processes essential to validity. An optimal grading system is characterized by simplicity and transparency for the clinician consumer, sufficient (but not too many) categories, explicitness of methodology for guideline developers, simplicity for guideline developers, consistency with general trends in grading systems, and an explicit approach to different levels of evidence for different outcomes.
The guideline development process described in this manual addresses the above issues, yet strives for a balance between rigor and pragmatism that maintains efficiency.
Efficiency is critical in guideline development, because moving from planning to completion in about 12 months helps avoid a situation in which new evidence continues to appear. With an efficient protocol in place an organization can stagger guidelines under simultaneous development to result in a finished product every 6 months (depending on resources). The timeline in Table 1 has been developed to ensure rigor in development while promoting efficiency. The remainder of this manual describes the steps listed in Table 1 in terms of general concepts and specific suggestions based on prior experience.
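The staggering arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming a 12-month development cycle and a new guideline started every 6 months; the function names and parameters are hypothetical and are not part of the AAO-HNS protocol.

```python
def completion_months(n_guidelines, cycle_months=12, stagger_months=6):
    """Month (counted from the start of the first project) in which
    each staggered guideline is expected to be completed."""
    return [i * stagger_months + cycle_months for i in range(n_guidelines)]


def max_concurrent(cycle_months=12, stagger_months=6):
    """Guidelines under development at the same time once the pipeline
    is full: the ceiling of cycle_months / stagger_months."""
    return -(-cycle_months // stagger_months)


print(completion_months(4))  # [12, 18, 24, 30]
print(max_concurrent())      # 2
```

Under these assumptions, a finished product every 6 months requires the organization to support two working groups at any given time, which is why the achievable interval depends on resources.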
Guidelines can be developed for a wide range of topics, including conditions (sinusitis, ear infections), procedures (tonsillectomy, tympanostomy tubes), and signs or symptoms (cough, hoarseness). Topics selected for guideline development should be high-priority and feasible.
High-priority topics have the potential for evidence-based practice to improve health outcomes, minimize undesirable variations in care, and reduce the burden of disease and health disparities. The Institute of Medicine has identified the following priority setting criteria as common to most international guideline development groups:8
Feasible topics have a sufficient base of high quality published evidence (ideally randomized, controlled trials) to drive guideline development, have one or more existing systematic reviews or meta-analyses already published on relevant issues, and have relatively clear definitions of the condition or procedure under consideration.
A steering committee that includes organizational leadership and broad stakeholder representation can help identify, prioritize, and refine guideline topics. Diversity of expertise and perspective helps minimize bias caused by conflicts of interests.
The AAO-HNS convened the Guideline Development Task Force as a steering committee for developing evidence-based guidelines and related knowledge products.19 The task force includes representatives of all sub-specialty groups within otolaryngology and of all relevant internal Academy groups, including research, patient safety, quality improvement, board of governors, and evidence-based medicine. Topics are solicited with a standardized form, based on principles outlined above, then presented to the task force for ranking and prioritization.
Perhaps the most important decision in creating a successful guideline relates to composition of the working group. A group size of 15–20 members encourages diversity and efficiency yet is small enough to avoid delays and redundancy.
The group should consist of the (1) chair and two assistant chairs, (2) staff lead and assistant, (3) technical consultant, (4) content experts, (5) stakeholders from all relevant disciplines, including nursing, primary care, and allied health, and (6) a consumer representative. The roles and responsibilities of group members are outlined in the sections that follow.
A staff lead is assigned as the primary liaison for the group, with one or more assistants who have the dual responsibility of supporting the lead and learning the process so they may serve as a future lead. Qualifications for staff lead include service as an assistant staff lead on a prior guideline panel, experience conducting literature searches and using a citation database, and a basic understanding of study design, medical terminology, and levels of evidence.
Specific responsibilities of the staff lead and assistants include:
A technical consultant is assigned to ensure that the working group adheres to methodologic standards and protocols endorsed by the organization, and to serve as a facilitator who supports the chair during conference calls and meetings. The technical consultant should be fluent with guideline methodology, understand the process of systematic review, and have direct experience with prior guidelines developed by the organization.
Developing valid guidelines is not intuitive, but is an acquired skill that is independent from clinical expertise and accomplishment. Whereas an explicit and comprehensive manual aids the process, it cannot substitute for hands-on experience.
A chair should be identified to lead the group in developing the guideline and to work with the technical consultant and staff lead to ensure adherence to methodologic standards. The chair also facilitates the interpersonal aspects of the group processes, so the members work in a spirit of collaboration with balanced contribution from all members.
The chair is appointed by a selection panel that includes organizational leadership, steering committee representation, the guideline staff lead, and the technical consultant. An ideal chair should be efficient and motivated, have demonstrated leadership ability, have prior experience with evidence-based guideline development, have demonstrated skills in scientific writing, and be fluent with using the internet, e-mail, and e-mail attachments. Candidates for chair will be asked to submit a curriculum vitae and declaration of competing interests, and to confirm that they understand and accept the substantial time commitment involved.
The chair should ideally not be a content expert for the guideline topic, but should be familiar with the scientific literature and management of the clinical condition. Content experts are usually abundant in an organization and can be readily added to the working group to fill in knowledge gaps. Conversely, the chair should be an impartial leader who stimulates discussion, not an advocate who injects their own opinions.17
One or two assistant chairs should be identified who will be asked to chair the next guideline development effort. To maintain a pipeline of guideline projects, a continuing source of leadership for upcoming projects is needed. The best way to groom new chairs is to have them serve on one or two prior guideline groups to learn methodology and expectations early on. An ideal assistant chair should have experience with evidence-based medicine, but does not necessarily need prior guideline development experience.
The chair is ultimately responsible for moving along the guideline process and keeping the group focused and task oriented. Having more than one chair is inadvisable, because responsibilities can be easily shifted and diffused. Instead, the structure should include one chair and one or more assistant-chairs, as noted above.
Guideline development panels should include individuals from a range of relevant stakeholder groups to minimize bias. Multidisciplinary participation helps identify and evaluate all relevant evidence, builds support among the intended guideline users, and increases the chances of addressing practical problems related to implementation.10
Many guidelines warrant input from nursing, consumers, and primary care clinicians. Based on the target population and setting, the working group may include internists, pediatricians, geriatricians, family practitioners, and emergency medicine physicians. Additional specialty clinicians are recruited as dictated by the specific topic or condition under study. Allied health professions are similarly recruited, and may include audiologists, physical therapists, speech-language pathologists, and others.
An excellent source of consumer participants for guideline development is Consumers United for Evidence-based Healthcare (CUE), a national coalition of health and consumer advocacy organizations, which empowers consumers through critical appraisal of articles, guidelines, and systematic reviews.22 CUE is a project of the U.S. Cochrane Center and works closely with the Cochrane Consumer Network.
If another discipline is to be a full partner in developing the guideline, they are approached early to secure interest and cooperation. Alternatively, working group members can be selected to represent their “discipline,” not their “organization.” In this model a pediatrician member of the working group would provide essential input for pediatrics as a discipline, but would not necessarily represent the American Academy of Pediatrics or imply their specific endorsement of the resulting guideline.
In deciding what disciplines other than otolaryngology to include in guideline development, a useful approach is to ensure that every discipline or organization that would be involved with implementation, including consumers, has a voice at the table. This will nearly always include one or more primary care clinicians, since invariably they will be involved in counseling the patient and coordinating care with the specialist. Representatives of all relevant medical specialties other than otolaryngology must also be considered.
A single specialty group will reach different conclusions than a multidisciplinary group when presented with the same evidence.17 Individuals from a single discipline are often biased towards procedures in which they have a vested interest. Involving multiple disciplines tends to balance bias and produce more valid guidelines.8
Potential members of the working group can be identified by organizational leadership, partner organizations, the working group chair, and the staff liaisons. An understanding of evidence-based medicine is desirable. Individuals are invited as representatives of their field or discipline, but need not be content experts for the guideline topic. Content experts should be a minority voice on the working group to limit bias.
Specific responsibilities of the working group members include:
The importance of choosing an appropriate working group cannot be overemphasized. This is called a “working” group for a reason: producing a guideline requires substantial time and effort. All members have a responsibility to other participants to behave with integrity, commitment, and a fully professional demeanor.
Despite the upfront commitment of all working group members to participate fully in guideline development, conflicts or unexpected circumstances may arise that threaten validity if an important discipline is not represented. Therefore, certain disciplines, which include primary care and selected others depending on the topic, should be represented by two group members to ensure representation.
The staff lead should compile a grid of contact information for all working group members and organizational representatives. Included in the grid should be (1) name and degrees, (2) working group role, (3) organizational affiliation, (4) clinical and academic titles, (5) mailing address, (6) disclosed conflicts of interests, and (7) contact information.
A conflict of interest exists when a participant or the participant’s institution has financial or personal relationships with other people or organizations that may inappropriately influence (bias) his or her actions.
Despite good intentions, it is not appropriate for individuals to decide if a particular relationship causes conflict; their role is to declare, not interpret. The group as a whole must ultimately determine if a conflict may result in bias, and whether or not the degree of conflict excludes the individual from participating in the entire guideline or selected sections.
Financial relationships are easily identifiable, but conflicts can also occur because of personal relationships, academic competition, or intellectual passion. Examples of financial conflicts include employment, consultancies, stock ownership, honoraria, paid expert testimony, patents or patent applications, and travel grants. Full disclosure is advised regardless of whether the participant considers the relationship relevant to the guideline content.
The contact and disclosure list should be distributed to all members for verification and should be updated, as needed, during guideline development and prior to publication.
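For groups that keep the contact and disclosure list electronically, one possible representation is sketched below. It is an assumption-laden example, not a format prescribed by the AAO-HNS: the class, field, and function names are hypothetical, the `disclosure_received` flag is an added convenience, and the two entries are fictitious.

```python
from dataclasses import dataclass, field


@dataclass
class GroupMember:
    """One row of the contact and disclosure grid (hypothetical layout)."""
    name_and_degrees: str
    group_role: str
    affiliation: str
    titles: str
    mailing_address: str
    contact_info: str
    # Members declare every relationship; relevance is judged by the group.
    disclosed_conflicts: list = field(default_factory=list)
    # Tracks whether a disclosure form has been returned at all,
    # since an empty list could otherwise mean "none" or "not yet asked".
    disclosure_received: bool = False


def missing_disclosures(grid):
    """Names of members who have not yet returned a disclosure."""
    return [m.name_and_degrees for m in grid if not m.disclosure_received]


# Fictitious entries, for illustration only.
grid = [
    GroupMember("A. Example, MD", "chair", "Example University",
                "Professor", "1 Example Street", "a.example@example.org",
                disclosed_conflicts=["honoraria"], disclosure_received=True),
    GroupMember("B. Sample, RN", "nursing representative", "Sample Hospital",
                "Nurse Manager", "2 Sample Road", "b.sample@example.org"),
]

print(missing_disclosures(grid))  # ['B. Sample, RN']
```

Keeping the declared relationships separate from any judgment about them mirrors the principle that individuals declare and the group interprets.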
Adhering to a predetermined, specific timeline allows publication of the guideline within 12 months. Arranging dates for conference calls and meetings is particularly difficult when dealing with individuals representing multiple organizations and disciplines. Therefore, it is critical to plan early in the process. Events are planned using the timetable in Table 1.
Conference calls are often most feasible if planned to start at 8:00 p.m. Eastern Time. Calls should be generally scheduled for 2 hours. In-person meetings can begin at noon with a light working lunch to allow attendees to fly in the same morning. Similarly, they can end by noon the next day to allow a return flight the same day. A group dinner should be planned the first day. A convenient schedule is to begin on either Friday or Sunday, and end the next day.
The staff lead prepares a grid of potential dates for the calls and meetings. The grid is circulated by electronic mail to the chair, assistant chair, and technical consultant to determine available dates for the first two conference calls. For the in-person meetings and future conference calls, the grid may be circulated to the entire working group to assess availability. There will clearly be a need for compromise by some group members, since the odds of finding dates agreeable to all are extremely low. Group members must commit to attending these meetings at the start.
The importance of having all working group members participate in all conference calls and attend all meetings cannot be overemphasized. Advance planning is the best guarantee of success, since maximal time is available for group members to adjust their schedules as needed and block out event dates in their calendars. If a group member cannot make this commitment, an alternate should be found as soon as possible.
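Once responses to the date grid are returned, the candidate dates can be ranked by attendance. The sketch below is one simple way to do so, assuming the responses are collected as a mapping from each candidate date to the set of members available; the function name, date labels, and member labels are hypothetical.

```python
def rank_dates(availability):
    """Sort candidate dates by number of available members, best first.

    `availability` maps a date label to the set of members who can attend.
    Python's sort is stable, so tied dates keep their original order.
    """
    return sorted(availability, key=lambda d: len(availability[d]), reverse=True)


# Illustrative responses only.
responses = {
    "Fri 7 Mar": {"chair", "technical consultant", "staff lead"},
    "Sun 9 Mar": {"chair", "staff lead"},
    "Fri 14 Mar": {"chair", "assistant chair", "technical consultant", "staff lead"},
}

print(rank_dates(responses))  # ['Fri 14 Mar', 'Fri 7 Mar', 'Sun 9 Mar']
```

A ranking of this kind only narrows the options; the final choice still requires compromise by members who cannot attend the top-ranked date.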
The validity of an evidence-based guideline depends in large part on an unbiased and comprehensive literature search. The goal is to locate the best evidence from all relevant sources, producing a comprehensive body of evidence that will allow clinical questions to be answered and highlight gaps in the evidence base where formal consensus methods may be needed.20
Although identifying evidence is essential for guideline development, we suggest the proper role is as supporting cast, not protagonist:
Although it is tempting to exclude topics with limited evidence from guideline development, it is precisely such topics that benefit most from inclusion because of uncertainty and conflicting opinions. Even if evidence is limited, recommendations are still possible if well-documented benefit or harm is identified.
Using expert opinion or consensus to fill evidence gaps is entirely appropriate, provided this basis is explicit and transparent to the critical reader.9 Discussing topics with limited evidence allows guideline developers to highlight future research needs, with critical suggestions on how to best fill existing gaps. The guideline as a whole, however, should focus on topics with high quality evidence and avoid overreliance on expert opinion as a primary decision-making strategy.23
Similar to the National Institute of Clinical Excellence,20 we have found that searching is an iterative process that is best implemented in three stages. The stages correspond to different phases of guideline development, and are discussed in detail at the appropriate point in the manual. The three stages of searching (Table 1) can be briefly summarized as:
All search stages must be documented for transparency and reproducibility. Specific considerations include databases, time periods, key words, subject headings, language restrictions, use of gray literature (e.g., symposium proceedings), and selection criteria, such as filters, algorithms, or inclusion and exclusion criteria. A balance of pragmatism and rigor is required to avoid delays in the development process.
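The documentation items listed above lend themselves to a structured search log. The following sketch shows one hypothetical record format; the class name, field names, the choice of which items count as required, and the sample values are all assumptions made for illustration, not part of the published protocol.

```python
from dataclasses import dataclass, field


@dataclass
class SearchRecord:
    """Documentation for one search stage (hypothetical layout)."""
    stage: int
    databases: list
    date_range: str
    key_words: list
    subject_headings: list = field(default_factory=list)
    language_restrictions: str = "none"
    gray_literature: bool = False
    selection_criteria: str = ""

    def undocumented(self):
        """Names of items treated here as required that are still empty."""
        required = {
            "databases": self.databases,
            "date_range": self.date_range,
            "key_words": self.key_words,
            "selection_criteria": self.selection_criteria,
        }
        return [name for name, value in required.items() if not value]


# Illustrative values only.
record = SearchRecord(stage=1, databases=["MEDLINE"],
                      date_range="1998-2008", key_words=["tonsillectomy"])

print(record.undocumented())  # ['selection_criteria']
```

A check of this kind can flag incomplete documentation before a search stage is closed, without adding delay to the development process.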
Simply identifying reviews, guidelines, and randomized trials does not ensure quality, and basing decisions on research with weak design or flawed methodology may yield biased or invalid conclusions. Therefore, to filter out potentially biased or poorly conducted studies, quality assessment must be performed as part of identifying evidence. Suggestions for assessing reviews, guidelines, and randomized trials are presented later in the manual when the related search stage is discussed.
An organization may find it necessary to perform a systematic review as part of guideline development if there are no published reviews, or if existing reviews are outdated or of poor quality. Systematic review is a rigorous and complex undertaking, which often requires additional expertise, resources, and staff support.
All systematic reviews should be conducted using a priori protocols that adhere to standards for the conduct and reporting of meta-analyses, as suggested in the QUOROM statement for randomized trials24 and the MOOSE statement for observational studies.25 Systematic reviews can be used to define natural history using placebo group outcomes and the absolute or comparative efficacy of interventions.26–27
The stage 1 search establishes a foundation for the first working group conference call by identifying existing systematic reviews and practice guidelines related to the current topic. This provides important perspective on what has already been accomplished, what areas of controversy exist, how robust the evidence base is to support guideline development, and where the greatest opportunities lie for improving upon the existing knowledge base.
The stage 1 search is coordinated by the staff lead before the first working group conference call using parameters defined by the chair and technical consultant. Search results are reviewed by the chair to eliminate irrelevant items. A summary grid is compiled and full text files are obtained for distribution to the working group.
Systematic reviews will greatly facilitate guideline development because they identify and synthesize evidence in a format that is readily usable by the working group. Systematic reviews and meta-analyses are found by:
Clinical practice guidelines may already exist for the topic under consideration, but do not preclude further guideline development. Existing guidelines may be outdated or may not have been developed with the methodologic rigor or relevancy that is currently sought. These documents, however, are a useful starting point for group discussions. Clinical practice guidelines can be identified by:
Systematic reviews published by the Cochrane Collaboration or government agencies (AHRQ) are typically of high methodologic quality and may not require further assessment. Conversely, reviews authored by individuals or other organizations are highly variable in rigor and quality. Minimum quality criteria for systematic reviews might include (a) an a priori, hypothesis driven protocol, (b) explicit and systematic literature search, (c) validated data extraction from source articles, (d) data pooling with standard statistical techniques, and (e) tabular presentation of results with graphical summaries.
Clinical practice guidelines are highly variable in quality regardless of origin. Minimum quality criteria might include (a) explicit scope and purpose, (b) multi-disciplinary stakeholder involvement, (c) systematic literature review, (d) explicit system for ranking evidence, and (e) explicit system for linking evidence to recommendations.
The first conference call (Table 2) sets the stage for guideline development by introducing working group members, defining the guideline timeline and scope, discussing conflicts of interest, and planning for the stage 2 literature search. The call is planned to last 120 minutes and may be recorded for future reference.
The staff lead records minutes of the call for dissemination and review by the group after the call concludes. The main purpose is to document process, workflow, and decisions made, thereby avoiding repeated discussion of settled controversies. The group chair takes additional notes during the call to record ideas, concepts, definitions, and key phrases that may later prove difficult to reproduce or remember.
Documents should be distributed by e-mail before the conference call so that participants can review them in advance. Materials specific to the guideline that should be distributed include:
General materials that should be distributed include:
Group members should review contact information and titles for accuracy. Group members should briefly introduce themselves, including their areas of expertise and experience in developing prior guidelines and their role in the workgroup. The need for any additional group members should be discussed, taking care to be sure that all relevant disciplines are adequately represented.
The purpose of sponsoring organization(s) in developing the guideline should be specified, and can be revised and updated as development proceeds. The purpose can often be divided into two distinct, but related components:
Here is an example of how purpose was stated in the AAO-HNS guideline on acute otitis externa: “The primary purpose of this guideline is to promote appropriate use of oral and topical antimicrobials for diffuse acute otitis externa and to highlight the need for adequate pain relief. Additional goals are to make possible an acute otitis externa performance measure and to make clinicians aware of modifying factors that can or may alter management (e.g., diabetes, immune compromised state, prior radiotherapy, tympanostomy tube, non-intact tympanic membrane).”3
As another example, consider this statement of purpose from the AAO-HNS guideline on adult sinusitis: “The primary purpose of this guideline is to improve diagnostic accuracy for adult rhinosinusitis, reduce inappropriate antibiotic use, reduce inappropriate use of radiographic imaging, and promote appropriate use of ancillary tests that include nasal endoscopy, computed tomography, and testing for allergy and immune function.”4
After the guideline purpose has been discussed, the consultant provides a very brief overview of development methodology. Points worthy of emphasis include:
The detailed methodology for classifying recommendations should not be discussed at this time to avoid an unnecessary tangent. Instead, this is optimally discussed before classifications are made at the first or second in-person meeting. Members who want additional details can be referred to the American Academy of Pediatrics Policy Statement on Classifying Recommendations.16
Group members must disclose all industry relationships and potential conflicts of interest during the first conference call. The group will then decide if any particular relationships are significant enough to preclude participation of any individual(s). Relationships should be thoroughly documented and included in the guideline manuscript.
Since more than 80% of guideline authors, in general, have potential conflicts of interest, the existence of a relationship alone is not sufficient to preclude participation.30 Members are excluded only if the group considers the nature of the relationship to interfere with objective participation (e.g., equity relationship, patent holder, royalty arrangements). Based on the nature of the disclosed relationship, a member may be asked not to participate in a specific section of the guideline where a conflict may produce bias.
A well-crafted guideline has a clearly defined scope. Defining scope will occupy most of the first conference call, and may require a second for completion. Inexperienced guideline developers attempt to cover all aspects of a condition, resulting in a broad scope that will stall development efforts. The key to progress is a razor sharp focus from the start, recognizing that some issues important to some stakeholders will inevitably be left out.
The group should identify the conditions, procedures, or signs or symptoms for which the guideline is intended. This may be a single condition or a list of potential target conditions, which could be later condensed into those that can be realistically examined by the group within its allotted time. A guideline can be procedure-based instead of disease-oriented. For example, the emphasis can be on “tonsillectomy” as a procedure instead of tonsillitis as an acute or chronic condition.
Any diseases or procedures should be explicitly defined by the group. Definitions derived from publications on the topic can be used, if available, but a multidisciplinary group can often improve upon definitions advanced by an individual or single discipline. This is a particularly valuable contribution when existing definitions are controversial or unclear.
The definition of the target or procedure should be clear and concise (Table 3). The definition, however, should be distinguished from diagnostic criteria, which are typically specified later in the guideline and have more precise and detailed information to guide clinicians.
The authoring group should specify the type of patient for whom the guideline is intended as precisely as possible. The target patient can be specified in terms of demographics, presenting signs and symptoms, past health history, results of previous diagnostic tests, or similar criteria.
Equally important as defining the target patient is defining clearly the types of patients or clinical presentations that are beyond the scope of the group’s analysis. One or more exclusion criteria should generally accompany the definition. For example, consider this definition from the AAO-HNS guideline on acute otitis externa:
“The target patient is aged 2 years or older with diffuse acute otitis externa, defined as generalized inflammation of the external ear canal, with or without involvement of the pinna or tympanic membrane. This guideline does not apply to children under age 2 years or to patients of any age with chronic or malignant (progressive necrotizing) otitis externa. Acute otitis externa is uncommon before age 2 years, and very limited evidence exists regarding treatment or outcomes in this age group. Although the differential diagnosis of the “draining ear” will be discussed, recommendations for management will be limited to diffuse acute otitis externa, which is almost exclusively a bacterial infection. The following conditions will be briefly discussed but not considered in detail: furunculosis (localized acute otitis externa), otomycosis, herpes zoster oticus (Ramsay Hunt syndrome), and contact dermatitis.”3
Here is another example of target patient definition from the AAO-HNS guideline on cerumen impaction:
“The target patient for this guideline is over 6 months of age with a clinical diagnosis of cerumen impaction. The guideline does not apply to patients with cerumen impaction associated with the following conditions: dermatologic diseases of the ear canal; recurrent otitis externa; keratosis obturans; prior radiation therapy affecting the ear; previous tympanoplasty/myringoplasty or canal wall down mastoidectomy. However, the guideline will discuss the relevance of these conditions in cerumen management. The following modifying factors are not the primary focus of the guideline, but will be discussed relative to their impact on management: non-intact tympanic membrane (perforation or tympanostomy tube); ear canal stenosis; exostoses; diabetes mellitus; immunocompromised state; or anticoagulant therapy.”5
The decision about the intended users of the guideline needs to be made early in the process, since it influences decisions about the interventions that will be considered and the audiences to which the language in the final product and specific implementation suggestions will be directed. Ideally, a representative of each target audience group or organization should be included on the guideline working group. Stakeholder representatives should also be involved in reviewing and pre-testing the document.
Practice settings should also be defined, since a guideline may be applicable only in selected settings (e.g., rural, primary care, hospital emergency room, operating room, managed care, specific geographic regions). The working group should identify those settings in which using the guideline would be appropriate as well as settings where it should not be applied.
Here is an example of how practice setting was defined in the AAO-HNS otitis media with effusion guideline: “The guideline is intended for use by providers of health care to children, including primary care and specialist physicians, nurses and nurse practitioners, physician assistants, audiologists, speech-language pathologists, and child development specialists. The guideline is applicable to any setting in which children with otitis media with effusion would be identified, monitored, or managed.”29
As another example, consider the definition used in the guideline on benign, paroxysmal positional vertigo: “The guideline is intended for all clinicians who are likely to diagnose and manage patients with benign, paroxysmal positional vertigo and applies to any setting in which benign, paroxysmal positional vertigo would be identified, monitored, or managed.”6
The group should generate a list of the clinical interventions (diagnostic tests, treatments, preventive measures) that will be considered in developing the guideline. This list should include all interventions relevant to the topic. A sample list developed for use in a sinusitis guideline is shown in Table 4.
The purpose of the topic list is to document transparency and to stimulate discussion as development proceeds, reminding the group of all interventions available for consideration. In contrast, the list is not intended as an outline or template for writing the guideline, since many items will be outside the document focus.
A similar list of exclusions should be generated. For example, some groups may be reluctant to evaluate drugs, procedures, or other interventions that have only recently been introduced into practice and have limited experience regarding long-term benefits and harms. Other groups may find these relevant. Any exclusions should be specifically noted in the list of interventions considered (Table 4).
Outcomes should be selected prospectively to limit scope and to provide measures against which the effectiveness of the recommendations can be evaluated.
Other measures to consider include cost, quality, and utilization. Often the outcome of interest is related only tenuously to the proposed interventions. In such cases, proxy indicators of outcome or process may be selected.
Here is an example of outcome definition from the AAO-HNS acute otitis externa guideline: “The primary outcome considered in this guideline is clinical resolution of acute otitis externa. Additional outcomes considered include minimizing the use of ineffective treatments; eradicating pathogens; minimizing recurrence, cost, complications and adverse events; maximizing the health-related quality of life of individuals afflicted with acute otitis externa; increasing patient satisfaction; and permitting the continued use of necessary hearing aids.”3
As another example, consider the definition from the cerumen impaction guideline: “The primary outcome considered in this guideline is resolution or change in the signs and symptoms associated with cerumen impaction. Secondary outcomes include complications or adverse events. Cost, adherence to therapy, quality of life, return to work or activity, return physician visits, and effect on co-morbid conditions (e.g., sensorineural hearing loss, conductive hearing loss) were also considered.”5
The stage 2 literature search, described in the next section, identifies randomized controlled trials. During the first conference call, the parameters of the literature search conducted by the staff are discussed and defined.
At the end of the call the group reviews specific assignments or requests for additional information made during the call. Deadlines are assigned for completing the assignment, emphasizing the importance of responding within the time frame specified.
After the call the staff lead forwards notes and minutes to the chair for review and clarification. The revised minutes are distributed to the group for review and feedback. The definitions, scope, and purpose are further refined by e-mail exchange before the next conference call.
The second step in identifying evidence is to assess the quantity and scope of randomized controlled trials (RCTs) available to support guideline development. Recommendations are strongest when supported by RCTs or systematic reviews of RCTs, and a paucity – or surplus – of quality studies may impact group decisions.
The stage 2 search should be coordinated by the staff lead before the second working group conference call using parameters defined by the working group during the first conference call. Search results are reviewed by the chair and assistant chairs to eliminate irrelevant items. Remaining RCTs are organized by broad subject headings to facilitate group discussion and reference. A summary grid is compiled and distributed to the working group. The grid is most useful if some brief, descriptive information is included for each trial, such as sample size, blinding (open, single, or double), and industry funding (no or yes).
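As an illustration only, a summary grid of this kind could be kept as a small set of records grouped by subject heading. The Python sketch below is a minimal example under stated assumptions: the `Trial` class, its field names, the sort order, and the sample entries are all hypothetical placeholders, not part of the AAO-HNS process or data from any actual search.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trial:
    # All fields are illustrative; the manual only suggests the kinds of
    # descriptive details worth recording for each trial.
    citation: str
    heading: str           # broad subject heading assigned during review
    sample_size: int
    blinding: str          # "open", "single", or "double"
    industry_funded: bool


def build_grid(trials):
    """Group trials by subject heading, largest trials first within each heading."""
    grid = defaultdict(list)
    for trial in trials:
        grid[trial.heading].append(trial)
    for rows in grid.values():
        rows.sort(key=lambda t: t.sample_size, reverse=True)
    return dict(grid)


def format_grid(grid):
    """Render the grid as plain-text lines, with headings in alphabetical order."""
    lines = []
    for heading in sorted(grid):
        lines.append(heading)
        for t in grid[heading]:
            funding = "yes" if t.industry_funded else "no"
            lines.append(
                f"  {t.citation} | n={t.sample_size} | {t.blinding} | "
                f"industry funding: {funding}"
            )
    return lines


# Hypothetical placeholder entries, not real trials.
trials = [
    Trial("Trial A", "Topical therapy", 120, "double", True),
    Trial("Trial B", "Analgesia", 60, "open", False),
    Trial("Trial C", "Topical therapy", 300, "single", False),
]
grid = build_grid(trials)
for line in format_grid(grid):
    print(line)
```

Sorting by sample size within each heading is a choice made here for readability; a group could equally sort by publication year or leave the trials in search order.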
Randomized trials are most valuable for evaluating therapeutic interventions. Different search strategies are required for questions related to prognosis, natural history, diagnostic tests, etc. Relevant clinical trials can be identified by:
Randomized controlled trials are highly variable in methodology and validity. A simple and efficient scale can be used to rank quality from 1 (poor) to 5 (excellent) based on (a) method and adequacy of randomization, (b) method and adequacy of masking, and (c) reporting of withdrawals and dropouts.32 A similar quality scale is available for randomized trials included in systematic reviews.33
The primary purpose of the second conference call (Table 2) is to refine and polish the concepts developed in the first call, particularly the scope and definition(s). The interval between the first and second conference calls should be kept short, about 4 to 6 weeks, to facilitate recall and sustain momentum.
The stage 2 literature search is now available and will help identify errors, omissions, and exclusions in the earlier discussion. The call ends with a discussion of quality improvement opportunities that are used to form a preliminary topic list, which will be further refined and prioritized by electronic mail exchange after the call. The call is planned to last 120 minutes and may be recorded.
Documents should be distributed by e-mail before the conference call so that participants can review them in advance. Materials for predistribution include:
Minutes from the first conference call are reviewed, with emphasis on the guideline purpose, scope, and definitions. Feedback is also solicited on the stage 2 literature search, especially from the group content experts, regarding content, organization, and possible omissions.
The heart of a guideline is a series of key action statements that reflect issues deemed most important by the group. Although there is no rigid guide as to the number of key statements, they are usually limited to about 10 to 18 based on the guideline scope. Since each statement will require supporting text and an evidence profile, the number is limited for feasibility and timeliness. The goal is to achieve maximal quality improvement with a manageable set of actions.
Quality health care is ideally patient-centered yet also accounts for the needs of the population. A useful definition of quality for guideline development is how well physicians and health care institutions fulfill their care obligations to individual patients, and how well patients, physicians, and health care institutions enable these obligations to be fulfilled justly across the population. The goal is to improve desired health outcomes that are consistent with current professional knowledge.34
The process for developing key action statements begins by asking the working group to suggest topics that represent opportunities for quality improvement within the guideline scope. A given topic may become the basis for a key action statement, or, if deemed of lesser importance, may be incorporated into the supporting text of a related statement. Opportunities for quality improvement may be broadly summarized as:8
After reviewing the above list with the group, the chair solicits feedback for quality improvement topics. The points below should help in moving the discussion along:
Recall that the quality-driven approach allows all important topics to be included, even if the supporting evidence is weak or limited. Although recommendations are facilitated by strong evidence, important topics with weak evidence may still become key action statements if there is a clear preponderance of benefit or harm. The group should focus on potential quality impact in selecting topics, not primarily on level of evidence.
Topic suggestions should be short and simple, emphasizing content, not structure. No attempt should be made at this time to create polished action statements. The topic list is often much longer than the eventual list of key action statements. All group members must contribute to ensure multidisciplinary involvement. At least 20 to 30 topics are desired, roughly twice the number of anticipated key action statements.
As topics are suggested by working group members, the chair composes a simple list, assisted by the staff lead. The goal is to have a starting point for electronic exchange after the call that will refine and complete the entries.
A sample topic list developed from conference call #2 for a sinusitis guideline is shown below.
Many of the topic suggestions for sinusitis were the basis for key action statements, but the final wording had little relationship to how the topic first appeared. Topics are best viewed as the raw material for deriving key action statements, which is the subject of the first in-person meeting. The purpose of conference call #2, and the subsequent electronic exchange, is to create a robust platform of raw material to assist the group at the meeting.
At the end of the call the group reviews specific assignments or requests for additional information made during the call. Three types of assignments will follow the call:
Deadlines are assigned for completing any assignments, emphasizing the importance of responding within the time frame specified. Group members are reminded of the dates for the upcoming in-person meetings.
The room should be large enough to comfortably accommodate the working group and facilitate discussion. One suggestion is to use a rectangular or U-shaped seating arrangement, with the chair and assistant chairs at the head, and participants along the sides.
A digital projector and screen are used for presentations (see below) and for real-time projection of meeting notes taken by the chair. The screen should be large enough to be readily seen in all parts of the room. Internet access should be available for literature searches and to address questions that arise during the meeting.
The first in-person meeting will ideally have several focused PowerPoint presentations to set the stage for the ensuing discussions. The following topics are suggested, allowing 30 minutes for each:
Progress at the first meeting is facilitated if the chair or staff lead creates a guideline template that will be updated in real time during the meeting. The template is based on the format used in previously published guidelines by the organization. An outline of the major sections in AAO-HNS guidelines is shown in Table 5.
The final topic list, based on electronic exchange after the second conference call, is made into a two column table. The first column, left blank, has the heading “Rank,” and the second column, containing the topics, has the heading “Topic.” The order of the topics is not important, and can simply correspond to the sequence in which they were suggested by the group.
The staff lead distributes the topic list to working group members with the following instructions, replacing the number “31” in this example with the total number of topics:
“Please rank the 31 topics below in order of importance for inclusion in this guideline by placing a number from 1 to 31 under the ‘Rank’ column. Please use each number only once. Assign the number ‘1’ to the most important topic, ‘2’ to the next most important, ‘3’ to the 3rd most important topic, so on and so forth. Number ‘31’ will be the least important topic. The table should not be sorted, but should be left in the original order. In addition, please send any comments to [staff lead e-mail address]. Thank you for your time.”
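Returned rank lists can be checked against these instructions before they are collated. The Python sketch below is a minimal illustration; the function name `ballot_problems` and the message wording are hypothetical. It assumes each member's ranks are listed in the original topic order and confirms that every number from 1 to the total is used exactly once.

```python
def ballot_problems(ranks, n_topics):
    """Return a list of problems with one member's ranks (empty if valid).

    `ranks` holds the rank given to each topic, in the table's original order.
    """
    problems = []
    if len(ranks) != n_topics:
        problems.append(f"expected {n_topics} ranks, got {len(ranks)}")

    expected = set(range(1, n_topics + 1))
    given = set(ranks)

    missing = sorted(expected - given)
    if missing:
        problems.append(f"unused ranks: {missing}")

    out_of_range = sorted(given - expected)
    if out_of_range:
        problems.append(f"ranks outside 1 to {n_topics}: {out_of_range}")

    duplicates = sorted({r for r in ranks if ranks.count(r) > 1})
    if duplicates:
        problems.append(f"ranks used more than once: {duplicates}")

    return problems


# A valid three-topic ballot, then one that uses the rank 1 twice.
print(ballot_problems([2, 1, 3], 3))
print(ballot_problems([1, 1, 3], 3))
```

A ballot with problems can be returned to the member for correction rather than being adjusted by staff, which keeps the ranking transparent.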
The rank lists are collated by the staff lead to determine the mean rank score for each topic, with lower scores indicating higher priority. A table is created with the items sorted by rank score with additional columns for the topic, number of systematic reviews relevant to the topic (based on the stage 1 search), and number of randomized trials relevant to the topic (based on the stage 2 search). An example of a completed topic rank list is shown in Table 6. The literature search in compiling this table does not need to be exhaustive at this stage; it is simply intended as a guide to the evidence landscape for the upcoming meeting.
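The collation step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the tool actually used: the function name `collate_rankings`, the dictionary keys, and the sample topics and counts are hypothetical. It assumes every member ranked every topic and that each list of ranks follows the original topic order.

```python
from statistics import mean


def collate_rankings(topics, ballots, reviews=None, trials=None):
    """Return one row per topic, sorted by mean rank (lower = higher priority).

    topics  : topic names in the original table order
    ballots : one list of ranks per member, aligned with `topics`
    reviews : optional dict of topic -> number of systematic reviews (stage 1)
    trials  : optional dict of topic -> number of randomized trials (stage 2)
    """
    reviews = reviews or {}
    trials = trials or {}
    rows = []
    for i, topic in enumerate(topics):
        rows.append({
            "topic": topic,
            "mean_rank": mean(ballot[i] for ballot in ballots),
            "systematic_reviews": reviews.get(topic, 0),
            "randomized_trials": trials.get(topic, 0),
        })
    # Python's sort is stable, so tied topics keep their original order.
    rows.sort(key=lambda row: row["mean_rank"])
    return rows


# Hypothetical topics, ballots, and evidence counts for illustration only.
topics = ["Topic A", "Topic B", "Topic C"]
ballots = [[2, 1, 3], [3, 1, 2], [1, 2, 3]]
rows = collate_rankings(
    topics, ballots,
    reviews={"Topic B": 2},
    trials={"Topic A": 5, "Topic B": 1},
)
for row in rows:
    print(f"{row['mean_rank']:.2f}  {row['topic']}  "
          f"reviews={row['systematic_reviews']}  trials={row['randomized_trials']}")
```

How to break ties or handle a member who skips topics is not specified in the text, so the sketch leaves ties in their original order and expects complete ballots.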
A guideline can only be implemented if the recommendations are clear and identifiable. This goal is best achieved by structuring the guideline around a series of key action statements, which are supported by amplifying text, evidence profiles, and recommendation grade. Unfortunately recommendation statements are often not readily identifiable in published guidelines and many statements are not executable as written. This section introduces the concept of key action statements, laying the foundation for developing the statements from the topic list at the first in-person meeting.
The group must understand the purpose and structure of key action statements before they can begin creating them from the prioritized topic list. Key statements are action-oriented prescriptions of specific behavior from a clinician. As such, they should suggest measurable activities that can form the basis of performance measures or other quality initiatives.
Key statements are quality-driven and propose actions that will improve quality of care. These actions include, but are not limited to, reducing variations in care, improving diagnosis/recognition, promoting appropriate care, avoiding unnecessary tests or interventions, improving coordination of care, and improving patient safety.
An ideal key action statement describes:
Key action statements should be brief, yet precise. The accompanying text amplifies why the recommendation is important and how it is to be carried out.
Examples of key action statements from prior guidelines are listed below with the specific action requested in italics. These examples are polished statements developed after extensive discussion, and not necessarily what was initially proposed based on the topic list.
All of the statements above apply to “clinicians” as the “who,” followed by the word “should” then an action statement. The word “should” qualifies the strength of the statement (the subject of in-person meeting #2) and is replaced by “may” if the level of evidence is not strong or the harm-benefit relationship is unclear. Recommendations supported by consistent, high-quality evidence and a strong preponderance of benefit over risk, harm, and cost may occasionally be associated with the term “must.” Developers should understand the possible legal and reimbursement ramifications when using that term.
The final wording of key action statements will be determined based on evidence profiles at the next in-person meeting, so the group should not waste time debating whether “must,” “should,” or “may” is appropriate at this point. At this stage the emphasis is on concepts and clarity, not precise wording.
The goal of this meeting is to develop a “straw man” draft of the guideline’s key action statements based on the prioritized topic list. A consensus is reached regarding key statements, the messages to be delivered, and the order in which they are to be presented. The template resulting from this meeting facilitates working group assignments to sketch in the supporting text and other details prior to the next meeting. The task of classifying the statements into recommendations can be briefly discussed, but the process is best deferred until the next meeting.
Because this is one of only two in-person meetings during guideline development, the time must be organized efficiently. One suggestion is to begin at noon and end at 1:00 p.m. the next day, with two working lunches and a group dinner. The noon start allows some members to arrive the same morning, and the early afternoon finish allows all members to return the same day. In between, there is about 10 hours of working time, excluding the dinner. This time allotment, combined with effective leadership from the chair and consultant, should be adequate for completing the meeting agenda.
Documents should be distributed by e-mail before the meeting so that participants can review them in advance. Materials for predistribution include:
The prearranged PowerPoint presentations on methodology and content are given at the start of the meeting. Presentations should be informative, not didactic, focusing on ideas and concepts to stimulate the group rather than attempting to make definitive statements. Current methodology and common pitfalls in guideline development should be described.
The introductory sections of the guideline, composed by the chair and assistant chairs after the second conference call, are reviewed and discussed by the group for broad concepts, clarity, and consistency. The first two sections are based largely on discussions at the first two conference calls and the content should accurately reflect group decisions made regarding scope, purpose, and definitions. Members should focus on identifying sections of the text in need of revision, clarification, or documentation, but should not be concerned about the nuances of wording. There will be ample opportunity in future e-mail exchanges to wordsmith the final document.
An efficient method of group review of guideline text is to project the relevant section on a screen using a projector and laptop computer. The group moves sequentially through paragraphs of text while the chair, or their designee, makes corrections in real time. If a revision is complex or requires additional research, the need is clearly indicated and can be addressed after the meeting by electronic exchange.
The group now begins the important task of creating a draft list of key action statements for the guideline based on the prioritized topic list. This is not an exercise in linguistic perfection, but instead emphasizes a logical sequence of proposed actions that capture opportunities for quality improvement.
Keeping in mind the quality-driven nature of the process, the chair leads the group in deciding which topics on the prioritized list will be used to create action statements. Invariably there will be some obvious choices that stand out, since they are likely the reason the guideline was undertaken.
Don’t be surprised if the task of creating key statements seems awkward or chaotic at first; developing simple, but insightful, action-oriented statements is difficult, even for seasoned guideline writers.
Some considerations in selecting topics include:
As each topic is discussed the group, assisted by the consultant, should draft a rough version of a related key action statement. This is facilitated with a brief discussion of why the topic was initially proposed and what quality improvement opportunity is offered. The statement should include, as discussed above, the details of “who, what, when, to whom, why and how.”
The most important word in a key action statement is the verb describing the action to be taken. Most guideline-prescribed activities can be described with a limited vocabulary of actions, shown in Table 7.36 This list is not intended to be restrictive, but rather to help the group in getting started with creating the statements. Other verbs can be used provided they offer clear guidance.
Guideline authors should remember the intended audience and assure that their guidance is applicable. For example, a recommendation that “patients should not be exposed to passive cigarette smoke” is only pertinent if the intended audience includes smokers. A preferable statement directed to an intended audience of (non-smoking) clinicians would be to counsel patients about the importance of avoiding cigarette smoke.
Key action statements should be clear and precise to avoid inconsistent interpretation and prevent inappropriate practice variation. Having drafted a list of key statements, the group should review the list for ambiguous or vague actions.
Ambiguity is present when a term can reasonably be interpreted in more than one discrete way.37 True ambiguity is almost always unintentional and readily correctable when identified. Examples might include interpretation of an acronym in more than one way (e.g., LAD might be interpreted as left anterior descending, left axis deviation, or lymphadenopathy; MS might be interpreted as morphine sulfate, magnesium sulfate, or multiple sclerosis).
Although key guideline statements should generally be precise and unambiguous, there may occasionally be a need for deliberate vagueness or underspecification. Reasons for intentionally creating vague recommendations include:9
An explicit statement of the reasons for writing deliberately vague recommendations can help users interpret and apply them.
The group should now review and discuss the proposed outline of key action statements for the draft guideline. Is the sequence logical? Have all major quality concerns been addressed? Are there any obvious omissions or inconsistencies? Do the statements reflect the concerns of all disciplines in the working group?
A logical sequence of key action statements is important for clarity and will also facilitate writing if later statements build upon concepts developed in earlier text. The order of statements can always be adjusted later, but effort at this time to ensure smooth, conceptual flow is time well spent.
As mentioned before the goal is not to have a perfect list of statements, but rather to have an acceptable list that serves as a platform for moving forward with guideline development. The list can be revised as the process proceeds, but ideally the agreed upon structure should be maintained. A draft list of guideline key action statements is shown in Table 8, based upon the topic list for the Hoarseness Guideline shown in the preceding section. Note that the wording is rough and the strength of action (should vs. may vs. must) will be determined subsequently based on the associated evidence profile at in-person meeting #2.
Following each key action statement authors should write several paragraphs of text that develop the rationale for the statement, present the underlying evidence (answering “Why?”), and provide sufficient detail or references that members of the intended audience will be able to carry it out (answering “How?”). It is important at this stage to remain focused on the recommendation statement and not to strive for encyclopedic coverage of the topic. Concisely stated guidance is more likely to be read and followed.
Although the text will not be written at this time, the group should discuss each key statement sequentially and outline concepts for the supporting text (Table 9). Ideas are recorded under each statement with the results displayed for immediate group feedback. The goal at this stage is to define broad concepts; no attempt is made to create definitive language or statements.
Any topics or issues that were considered important by the group but were not chosen to be key recommendations can nonetheless be discussed in the text under a related key statement heading. By demoting a topic to the amplifying text, information can still be incorporated without the rigor needed to support a key recommendation.
Guideline authors need to be careful not to make substantive recommendations in the amplifying text. Instead, they should be explicit about the evidence base, or lack thereof, for the actions proposed.
Before moving on to the next key action statement, the group must discuss the need for any Stage 3 literature searches to support the statement. For example, a search may be needed to fill an evidence “gap” for a key action statement. A statement in the Otitis Media with Effusion guideline reads “Clinicians should document the laterality, duration of effusion, and presence and severity of associated symptoms at each assessment of the child with otitis media with effusion.”29 Since no reviews or randomized trials were identified to support this, additional searches were done on the value of documentation, in general, in ambulatory care settings. Two references were identified to support the importance and value of appropriate documentation.
Considerations in planning the stage 3 literature searches include:
When a key statement is supported by multiple randomized controlled trials the group should identify and assess existing systematic reviews or meta-analyses. If none exist, an internal systematic review may be planned provided that time, resources, and expertise are available. If an existing, but outdated, systematic review is found, it may be possible to modify or update the data without conducting an entirely new review.
At the conclusion of the first in-person meeting the chair distributes writing assignments to group members. Group members set and agree to deadlines for preparation of their assignments that will allow the project to remain on track. A democratic process helps to assure adherence.
Each key statement and supporting text is assigned a primary author, who composes a draft that is reviewed by a secondary author with complementary expertise. For example, a statement about surgery might have a surgeon and primary care clinician as primary and secondary authors, respectively. If the group includes representatives of consumer groups and advanced practice nursing they may serve as reviewers, authors, or both. A sample writing assignment grid is shown in Table 10.
The goal of each writing assignment is to ensure that the rationale for the related key action statement is explained fully, the logic behind the statement is apparent, all medical terms and actions are clear and unambiguous, and the best evidence supporting the statement is explained and referenced. Since most working group members will likely never have encountered this type of assignment, the chair must provide clear and specific written instructions to guide the process (Table 11). The chair should review these instructions with the group to clarify any uncertainty and to emphasize the importance of following them explicitly.
The final step in identifying evidence is driven by the specific key action statements developed by the working group, which form the core of the guideline. These statements reflect opportunities for education and quality improvement, which may not necessarily be supported by existing systematic reviews or randomized controlled trials. Therefore, the stage 3 searches focus on identifying best published evidence to facilitate writing assignments for specific action statements, and to subsequently assist the group in determining the corresponding evidence profiles and strengths of recommendation.
As discussed in the preceding section, the group should identify the need for stage 3 searches when discussing the guideline topic list and composing the list of concepts to be covered in the corresponding supporting text.
The stage 3 searches should be coordinated by the staff lead after the first working group meeting. Since multiple searches may be needed, additional staff may be required. Search results should be grouped by major subheadings (e.g., etiology, diagnosis, therapy, prognosis) available in standard electronic databases. Search results are sent to the group member assigned as primary writer for the specific action statement. The writer eliminates irrelevant items, leaving a core of evidence that is distributed to the group for reference along with the statement that prompted the literature search.
Supplementary evidence can be identified by using PICO-type questions, which pose a well-focused question in terms of the patient, intervention, comparison, and outcome.8,38 As an example, consider the question “In adults with uncomplicated acute bacterial rhinosinusitis, would therapy with amoxicillin, compared with placebo or no medication, improve clinical symptoms in 7 to 10 days?” PICO-type questions facilitate literature searches and can be formulated as follows:
If a specific action statement is already supported by high quality evidence such as systematic reviews or randomized controlled trials, additional stage 3 searches may be unnecessary. When high quality evidence is sparse, however, a PICO-type question may be formulated for each action statement to facilitate the search.
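As a minimal sketch of how a PICO-type question might be recorded so that each element can drive a search, the Python fragment below stores the four elements and reassembles the example question given above. The class name `PicoQuestion` and its methods are hypothetical conveniences introduced for illustration; they are not part of the AAO-HNS process or of any database tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PicoQuestion:
    """One well-focused clinical question (hypothetical helper, for illustration only)."""
    patient: str
    intervention: str
    comparison: str
    outcome: str

    def as_sentence(self) -> str:
        # Reassemble the four elements into a single answerable question.
        return (
            f"In {self.patient}, would {self.intervention}, "
            f"compared with {self.comparison}, {self.outcome}?"
        )

    def search_concepts(self) -> List[str]:
        # One concept per PICO element; a real search would still need to map
        # each concept to the vocabulary of the database being queried.
        return [self.patient, self.intervention, self.comparison, self.outcome]


# The rhinosinusitis example from the text, split into its PICO elements.
question = PicoQuestion(
    patient="adults with uncomplicated acute bacterial rhinosinusitis",
    intervention="therapy with amoxicillin",
    comparison="placebo or no medication",
    outcome="improve clinical symptoms in 7 to 10 days",
)

print(question.as_sentence())
for concept in question.search_concepts():
    print("-", concept)
```

Keeping the elements separate, rather than storing only the finished sentence, makes it easier to document exactly which concepts were searched for each action statement.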
Timely completion of working group assignments is mandatory to keep guideline development on schedule. The chair, staff lead, or both should send reminders in advance of deadlines and monitor when assignments are completed. Delinquent group members should be contacted to ascertain the reason for delay and identify remedial action. In some cases assignments may need to be modified.
The need for a search may only become apparent once the author assigned to write the text begins the task and becomes more familiar with the material. In this circumstance the author may conduct their own search, with or without assistance from the staff lead.
Authors must take care with supplementary literature searches to avoid introducing bias. Search and filter criteria should be documented, regardless of whether the search is conducted by the author, staff lead, or both.
If any systematic reviews or meta-analyses are required, they are planned with the group consultant or methodologist. This will entail substantial added effort because of the methodological rigor needed to produce a valid, unbiased, publication-quality systematic review. Whenever possible, guideline topics should be selected based on a foundation of existing systematic reviews, rather than attempting to create reviews from scratch. Sometimes, however, the need to create a review is unavoidable, and the project should be approached with realistic expectations regarding the time and effort involved.
Systematic reviews or meta-analyses conducted to support the guideline are conducted with an a priori protocol to justify publication as an independent manuscript. Attempting to stuff the review into the guideline text is ill advised, because the reporting detail needed to demonstrate validity of the review will overwhelm the guideline with unnecessary technical details. Only a brief, readable summary is included, with external reference to the systematic review manuscript, which is submitted separately for publication.
As the chair receives the completed writing assignments (by electronic mail) they are collated into a first draft of the guideline, using the previously described template (Table 5). The goal is to create the draft before the next in-person meeting while still allowing sufficient time for preliminary feedback by the group.
In producing the first draft the chair should strive to maintain the ideas and concepts provided by the writing assignments, although editing may be necessary to standardize the writing style. Writing assignments are best viewed as raw material for composing the draft to ensure a balanced, multi-disciplinary product. Authors of the original writing assignment will have ample opportunity to comment on any changes made by the chair in style or format for consistency with the remainder of the guideline.
The effort required by the chair in collating the writing assignments should not be underestimated. The assignments are likely to vary greatly in quality, completeness, and punctuality. Consequently, the chair may need to rewrite substantial portions of submissions and fill in conceptual gaps not covered.
When complete, the first draft of the guideline is distributed by e-mail to all working group members for review and comment. Using the “line numbers” feature of the word processor facilitates commenting by line number. The document is distributed as a read-only file (e.g., pdf format) to prevent direct modification and force the use of comments based on stable line numbers.
Comments from the group are collected and collated by the staff lead, who then distributes the final list to the chair for incorporation into the draft. Strict accounting of responses to substantive comments is critical to avoid a situation in which the same topic is revisited on multiple occasions. A recommended approach is for the staff lead to create a 4-column table (Table 12) in which all comments received from group members are listed by line number. The “disposition” states how the chair handled the comment, and will assure members that their concerns have been addressed.
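The bookkeeping described above can be sketched in a few lines of Python. The exact columns of Table 12 are not reproduced here, so the four fields below (line number, reviewer, comment, disposition) are an assumption, and the sample comments are invented purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Comment:
    line: int              # line number in the read-only draft
    reviewer: str          # assumed column: who submitted the comment
    text: str              # the comment itself
    disposition: str = ""  # how the chair handled it; empty until resolved


def collate(comments: List[Comment]) -> List[Comment]:
    """Order comments by line number; sorted() is stable, so comments on
    the same line keep the order in which they were received."""
    return sorted(comments, key=lambda c: c.line)


def unresolved(comments: List[Comment]) -> List[Comment]:
    """Comments that still lack a recorded disposition."""
    return [c for c in comments if not c.disposition.strip()]


# Invented sample data, not taken from any actual guideline review.
comments = [
    Comment(212, "Reviewer B", "Define 'routine' follow-up."),
    Comment(45, "Reviewer A", "Cite a source for the prevalence figure.",
            disposition="Reference added."),
    Comment(212, "Reviewer C", "Agree; the interval is unclear."),
]

table = collate(comments)
for c in table:
    print(f"{c.line:>5} | {c.reviewer} | {c.text} | {c.disposition or 'PENDING'}")
print(f"{len(unresolved(table))} comment(s) still need a disposition")
```

Listing pending items separately gives the chair a quick check that every substantive comment has received a response before the revised draft is circulated.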
The chair revises the guideline draft based on the collated comments and dispositions. Any changes made to the draft are performed using the “track changes” feature of the word processor, so they are easily identifiable. The revised guideline and the summary table of comments are distributed to working group members before the next meeting.
An essential element of guideline development is transparency in how policy statements are developed and classified as recommendations. An elegant way of accomplishing this is to add an “evidence profile” after the supporting text for each key action statement that lists all decisions made by the group.
The best time to complete the profile is immediately after the group discusses a specific action statement and the associated supporting text.
Evidence profiles assist guideline writers and users by:
Evidence profiles appear immediately after the supporting text for a specific key action statement as a bulleted list with the following headings, defined in Table 13:
The evidence profile makes explicit and transparent the process by which evidence and opinion are transformed into recommendations about appropriate care. The group sequentially describes each aspect of the profile.
To illustrate the structure of evidence profiles, examples from the AAO-HNS guideline on benign paroxysmal positional vertigo are provided below.6 Each profile lists the associated key action statement followed by the supporting rationale. Details on determining the aggregate evidence quality are provided in the next section.
The first sample evidence profile accompanies a key action statement deemed a “recommendation” in the final guideline based on evidence quality and harm-benefit assessment. Note the detailed information provided under aggregate evidence quality and exclusions.
Sample key action statement #1: Clinicians should treat patients with posterior semicircular canal benign paroxysmal positional vertigo with a particle repositioning maneuver. Evidence profile:
The second sample evidence profile accompanies a key action statement deemed an “option” in the final guideline. Note the relative balance of harm vs. benefits and the explicit statements about value judgments and intentional vagueness.
Sample key action statement #2: Clinicians may offer observation as initial management for patients with benign paroxysmal positional vertigo and assurance of follow-up. Evidence profile:
The third sample evidence profile accompanies a key action statement deemed a “recommendation against” in the final guideline. Note the preponderance of benefit over harm and the exclusions.
Sample key action statement #3: Clinicians should not routinely treat benign paroxysmal positional vertigo with vestibular suppressant medications such as antihistamines or benzodiazepines. Evidence profile:
Guideline groups often have difficulty reaching consensus on the aggregate level of evidence supporting a key action statement. The aggregate is not based simply on the highest or lowest quality single study identified, but rather a composite rating of the quality, consistency, and relevance of the overall group of studies. Assigning a rating always incorporates some component of judgment, which is permissible provided that the group is consistent and clearly states the reasoning involved under the “aggregate evidence quality” section of the evidence profile (Table 14).
The goal of the evidence review is to determine our confidence in the factors of benefit (e.g., magnitude of each beneficial effect) offset by our confidence in our understanding of the risks, harms, and costs. Evidence for harms should be assessed with the same diligence applied to evidence for benefit. The purpose of the evidence review is to help us understand the benefit-risk equation.
Many rating scales have been developed, both for individual studies and for aggregate assessments. The scale used by the American Academy of Pediatrics16 (Table 14) is recommended for clarity and simplicity. A useful scale has also been developed by the United States Preventive Services Task Force, rating the overall evidence for a service as good, fair, or poor based on study number, quality, consistency, and generalizability.39
In nearly all situations the aggregate evidence level can be designated as A, B, C, or D using the criteria in Table 14. Recall, however, that these designations apply only to the aggregate evidence level and not to the individual, contributing studies. At times the group may decide to extrapolate evidence from similar, but not directly comparable patients (e.g., using findings from a study of adults in making a recommendation for the pediatric population), especially if high-quality evidence is found.20 The basis for extrapolating data and the assumptions made should be stated concisely and explicitly as part of the aggregate evidence level description.
A final category of evidence is “Grade X,” used for exceptional situations where validating studies cannot be performed and there appears to be a clear preponderance of benefit or harm. This special category is appropriate in rare circumstances where the group has defined a need for a key action statement to improve quality, but the nature of the situation is unlikely to ever result in high quality evidence. For example, randomized trials would be unethical to study antimicrobial prophylaxis for anthrax,16 ototoxic vs. non-ototoxic ear drops for acute otitis externa with a non-intact tympanic membrane, or prolonged observation of otitis media with effusion in children with developmental delays or disorders.29
Although many different methods have been proposed for grading recommendation strength, most developers agree that determining the strength of action is distinct from rating the aggregate quality of evidence. High quality evidence (e.g., grade A) does not always justify strong recommendations, and recommendations – or even strong recommendations – may be possible despite lower quality evidence (e.g., grade B, C, or X).40 The primary modifying factor in this regard is the benefit-harm assessment, as defined in the preceding section on evidence profiles.
The method for determining strength of recommendation (Figure 1 and Table 15) developed by the American Academy of Pediatrics is simple, transparent, and clinically relevant.16 Similar to the GRADE approach,41 the aggregate evidence level and benefit-harm assessment are the primary rating determinants. GRADE is more complex, however, and offers only 2 levels of action strength (“strong recommendation” and “(weak) recommendation”) in contrast to the 3 levels from the AAP (“strong recommendation,” “recommendation,” and “option”). The authors’ empiric experience in developing guidelines suggests that 3 levels support more flexible decision making and are better accepted by clinicians.
Using three levels of action strength is supported by research into the obligation level conveyed by terms commonly found in clinical practice guidelines.42 Despite a large number of descriptive terms, the obligation levels cluster into three distinct levels: “must” conveys the highest obligation level, “may” the lowest, and “should” an intermediate level. These terms can be used to strengthen a connection between recommendation language and expected adherence to recommendations. For example, a “strong recommendation” carries an obligation of “must” or “should,” a “recommendation” an obligation of “should,” and an “option” an obligation of “may.” “Should” is the most commonly used term in published guidelines.
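The link between action strength and obligation terms described above can be written down as a small lookup. The Python sketch below encodes only the pairings stated in this section; the function names are hypothetical, and the word-matching check is deliberately crude (it looks only for the presence of a term, not for how it is used).

```python
from typing import Tuple

# Pairings taken from the text: "must" is the highest obligation level,
# "should" is intermediate, and "may" is the lowest.
OBLIGATION_TERMS = {
    "strong recommendation": ("must", "should"),
    "recommendation": ("should",),
    "option": ("may",),
}


def allowed_terms(strength: str) -> Tuple[str, ...]:
    """Obligation terms that fit a given strength of action."""
    return OBLIGATION_TERMS[strength.strip().lower()]


def wording_matches(statement: str, strength: str) -> bool:
    """True if the statement contains an obligation term that fits its strength.
    This is a rough consistency check only; it does not parse the sentence."""
    words = statement.lower().replace(",", " ").replace(".", " ").split()
    return any(term in words for term in allowed_terms(strength))


# Sample statements from the vertigo guideline quoted earlier in this manual.
option_text = ("Clinicians may offer observation as initial management for "
               "patients with benign paroxysmal positional vertigo and "
               "assurance of follow-up.")
recommendation_text = ("Clinicians should treat patients with posterior "
                       "semicircular canal benign paroxysmal positional "
                       "vertigo with a particle repositioning maneuver.")

print(wording_matches(option_text, "option"))
print(wording_matches(recommendation_text, "recommendation"))
print(wording_matches(option_text, "recommendation"))
```

A check of this kind could flag drafts in which the verb no longer agrees with the strength assigned at the second meeting, but the final judgment about wording remains with the group.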
The strength of action is best viewed as a relative constraint on clinician behavior. In general, less frequent variation in practice is expected for a strong recommendation than might be expected for a recommendation. The desire of many authors to make uniformly strong recommendations must be tempered by the reality of the evidence quality and benefit-risk assessment.
Assigning a strength of action to a key statement should be very straightforward if performed after the evidence profile is constructed. As shown in Figure 1, however, when there is a preponderance of benefit over harm the group may choose between “recommendation” and “strong recommendation” with level B or X evidence quality. Once a choice is made the reasons should be stated in the evidence profile, usually under the “values” section.
To understand better how the strength of action is determined by the aggregate quality and benefit-harm assessment, consider the statements below from various AAO-HNS guidelines. In each case Figure 1 can be used to cross-check the link between evidence, harm-benefit, and action strength.
The following key action statements are “strong recommendations,” meaning clinicians should follow this guidance unless a clear and compelling rationale for acting in a contrary manner is present.
The following key action statements are “recommendations,” meaning clinicians should generally follow this guidance but also should be alert to new information and sensitive to patient preferences.
The following key action statements are “options,” which offer clinicians flexibility in their decision-making but may set boundaries on alternatives. Patient preference should have a substantial role in influencing clinical decision-making.
Key action statements that lead to recommendations or strong recommendations are most desirable in guidelines, but options and no recommendations may also serve an important educational role. Options are helpful in addressing controversial aspects of management, especially when wide practice variation exists but evidence is sparse. A clear, systematic review and interpretation of the evidence, using expert consensus to fill gaps, may be very helpful to clinicians, consumers, and policy makers in facilitating decisions.
The purpose of this meeting is to polish the key action statements, review supporting text, and assign evidence profiles to each action statement. Creating evidence profiles should occupy most of the available time, because it is a critical part of guideline development that is best accomplished with an in-person interchange among group members. The evidence profiles are the main determinant of strength for the associated key action statement, based on the aggregate level of supporting evidence and the benefit-harm profile associated with following the action (Table 13).
The main product of the meeting is a second draft of the guideline that accurately reflects the logic and goals of the working group. In addition, the group incorporates suggestions for implementation and future research into the guideline.
Documents should be distributed by e-mail in advance so that participants can review them before the meeting. Materials for predistribution include:
The main goal of reviewing the draft guideline is to achieve a logical, consistent document that accurately reflects the group intentions, minimizing vagueness and underspecification. The goal is not to quibble over semantics, grammar, or sentence structure, all of which waste valuable time and can be done through electronic mail exchange.
The chair, assisted by the consultant, must effectively manage time and the group dynamics during the meeting to facilitate steady progress and to ensure that a minority of voices do not dominate the session.
Begin by reviewing the guideline front matter, which includes information about purpose and disease burden. Since this was already reviewed once at the prior meeting the process should not require substantial time. The document is revised in real time by the chair, or their designee, and requests for additional information or fact checking are assigned for completion after the meeting.
Reviewing the draft guideline presents an ideal opportunity for identifying research and implementation needs, beyond any already specified in the writing assignment. Throughout the meeting thoughts related to opportunities for effective implementation and future research are recorded:
Next, each key action statement is reviewed by the group along with the supporting text. The chair and consultant lead the discussion, striving for balanced input from the group and efficient use of time. The chair or their designee records changes to the guideline in real time, projecting the guideline for all to see and using “track changes” on the word processor to clearly identify what has been altered.
The following sequence is advised when reviewing each action statement and supporting text:
Repeat the steps above until all key action statements and supporting text are discussed, and consensus has been reached on composition and structure. Some sections will invariably require more discussion than others, but time must be allowed for adequate discussion of all statements and supporting text. If a section requires significant rewriting, reorganization, or additional citations, the specific needs are recorded as an action item that will be addressed immediately after the meeting by the chair or their designee.
One of the most important goals of sequentially reviewing the key action statements and accompanying text is to reach consensus on the evidence profiles (Table 13). Although the profile statements should be concise, at this stage it is better to be more verbose and err on the side of overexplanation than to risk not being clear on why decisions were made. The profiles can subsequently be edited for brevity and consistency.
Evidence profiles are a primary means of promoting transparency in guideline development, and must be developed with care and consistency. Additional time spent ensuring full consensus on the profiles will facilitate grading recommendation strength.
Once the profile has been agreed upon for a specific key action statement the group determines strength of recommendation using Figure 1 and Table 15. This should be very straightforward if the evidence profile was fully discussed. At times, however, the strength of recommendation does not “make sense” to one or more group members. When this occurs the evidence profile is reviewed, especially the aggregate evidence quality, to ensure accuracy of the information and that no important evidence was overlooked.
An accurate, explicit evidence profile offers the most compelling argument for the group to accept a recommendation grade that challenges existing biases and preconceptions.
After the key action statements have been developed and the sequence agreed upon by the group, a table is added to the guideline summarizing the statement topics and strengths. An example is the table included in the AAO-HNS sinusitis guideline (Table 16). The purpose is to orient readers to the structure and content of the guideline, and to highlight stronger statements for easy identification.
Many guidelines benefit from having one or more clinical algorithms that graphically display decision logic and sequences of activities (Figure 2). Including an algorithm in a guideline can (1) rapidly convey the scope and organization of the guideline; (2) result in faster learning, higher retention, and better compliance by the practice community; and (3) specify appropriate indications for particular management strategies.
Algorithms are most useful when the decision logic of a guideline is complex and the temporal sequence of activities is unclear.
A well-crafted guideline includes a plan for how the recommendations will be implemented, and anticipates obstacles to implementation. As suggested earlier, a running list of implementation issues and needs should be maintained by one of the assistant chairs and updated as discussion proceeds during the second in-person meeting. Issues to consider in this section include (a) plans for distribution and dissemination of the guideline, (b) anticipated obstacles to implementation and proposed solutions, and (c) evaluation plans to assess the impact of the guideline on clinical care processes and patient outcomes. The lifespan of the guideline should also be specified, with a statement about when review or revision is planned.
The chair reviews specific assignments made during the meeting and assigns deadlines for completion. The chair will also compile a revised guideline based on decisions made at the meeting plus information received from assignments. The final revision should include a section about future research needs.
Guideline appraisal is valuable at this stage to ensure that the guideline is clear, adheres to current methodological standards, and (importantly) that recommendations can be implemented in a manner that is likely to influence clinician behavior. This can be accomplished by staff at the sponsoring organization, some of whom have not participated in the guideline development process.
The Yale Center for Medical Informatics (YCMI) has developed a Guideline Appraisal Report to aid developers in identifying and remedying potential problems with validity or implementation before publication. Activities involved in the YCMI guideline appraisal include:
The following examples illustrate how feedback from the GuideLine Implementability Appraisal (GLIA) assessment identifies areas for improvement in guideline statements. In each case the specific key action statement under analysis is stated, followed by the general and specific areas of concern:
GLIA analysis of draft statement on drug delivery from the acute otitis externa guideline: Clinicians should inform patients how to administer topical drops. When the ear canal is obstructed, delivery of topical antimicrobials should be enhanced by aural toilet, placing a wick, or both.
GLIA analysis of draft statement on watchful waiting from the acute sinusitis guideline: Observation without use of antibiotics is an option for selected adults with uncomplicated acute bacterial rhinosinusitis based on illness severity and assurance of follow-up.
The purpose of this call is to review the Guideline Appraisal Report, address any deficiencies identified, and plan for external peer review. In advance of the call the Guideline Appraisal Report should be distributed to group members.
Now is not the time for major changes in the structure or order of key action statements unless the GLIA report identifies a serious deficiency that requires corrective action. The group should focus on remedying barriers to implementation identified in the report, without revisiting peripheral issues. If the conversation does digress, often the evidence profiles can be used as “organizational memory” as to why earlier decisions were made.
Suggestions are next solicited for external peer reviewers to review the draft guideline. Peer reviewers should represent the intended target audience and practice settings, and are selected with input from the chair, working group members, Academy committee chairs, specialty leaders, and others.
External, multidisciplinary peer review is essential to ensure guideline clarity and anticipate concerns or objections before the document is published. Several reviewers are solicited from each discipline involved in the guideline to increase the chance of comprehensive feedback. Typically as many as 25 to 35 reviewers are solicited.
Independent, external peer review of a guideline is a critical aspect of development. Existing guideline processes have been criticized as biased, self-promoting, and exempt from accepted procedures for scientific publication and editorial peer review. The implementability assessment described in the preceding section promotes clarity of action, but does not address whether the recommended actions are appropriate and meaningful in the first place. This latter concern is the subject of multi-disciplinary peer review as described in this section.
A final draft of the guideline is prepared using the “line numbers” feature of the word processor to insert continuous line numbers along the left margin. The draft guideline is distributed electronically to external peer reviewers with instructions to submit comments by line number. A strict deadline is specified by which time comments should be submitted to the chair and staff lead.
Distributing the draft guideline as a pdf file has the advantage of making it impossible for a reviewer to simply type changes into the guideline text (which will greatly complicate the chair’s ability to identify them). The pdf file also ensures that all reviewers use the same line numbers, since numbers may not match when different word processing programs or operating systems are used.
Peer reviewers should be asked to focus on three main guideline attributes: validity, reliability, and feasibility:
Comments from the external reviewers are collected and collated by the chair. A recommended approach is for the chair to create a 4-column table (Table 18) in which all comments received are listed by line number. The “dispositions” state how the chair handled each comment, and will assure external reviewers that their concerns have been addressed.
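As an illustration only, the collation step could be sketched in a few lines of Python. The column names used here (line, reviewer, comment, disposition) are an assumption made for the example and are not taken from Table 18; the class and function names are likewise hypothetical. The sketch sorts comments by line number and flags any comment that still lacks a disposition.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ReviewComment:
    """One external reviewer comment (field names are hypothetical)."""
    line: int              # line number in the numbered draft
    reviewer: str          # reviewer identifier
    comment: str           # the reviewer's remark
    disposition: str = ""  # how the chair handled it; empty until decided


def collate(comments: List[ReviewComment]) -> List[ReviewComment]:
    """Order comments by draft line number, then by reviewer."""
    return sorted(comments, key=lambda c: (c.line, c.reviewer))


def to_rows(comments: List[ReviewComment]) -> List[Tuple[str, str, str, str]]:
    """Build a 4-column table: a header row followed by one row per comment."""
    rows = [("Line", "Reviewer", "Comment", "Disposition")]
    for c in collate(comments):
        rows.append((str(c.line), c.reviewer, c.comment,
                     c.disposition or "PENDING"))
    return rows


def unresolved(comments: List[ReviewComment]) -> List[ReviewComment]:
    """Comments that have no recorded disposition yet."""
    return [c for c in comments if not c.disposition.strip()]


if __name__ == "__main__":
    received = [
        ReviewComment(120, "R2", "Define 'acute'", "Definition added"),
        ReviewComment(15, "R1", "Clarify target population"),
    ]
    for row in to_rows(received):
        print(" | ".join(row))
```

Checking that `unresolved` returns an empty list before the table is circulated is one simple way to confirm that every comment has a recorded disposition.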
The chair revises the guideline draft based on the collated comments and dispositions. Any changes made to the draft are performed using the “track changes” feature of the word processor, so they are easily identifiable. The revised guideline and the summary table of comments are distributed electronically to external peer reviewers.
The final guideline draft and the summary table of external reviewer comments are distributed electronically to the working group for review and approval.
Prior to publication, the guideline should be distributed for approval to the Board of Directors of the sponsoring organization(s). The procedure will vary based on organizational policy, but the process used at the AAO-HNS is:
Because the document is based on evidence, any substantive changes requested by an oversight body (e.g., the organizational Board of Directors) must be supported and accompanied by evidence. The oversight body should be informed, however, that the purpose of review is not to rewrite the guideline, but rather to ensure that the recommended actions are consistent with the organization’s mission and values.
Any comments or concerns expressed by the board are responded to by the chair, with input from the working group solicited as needed.
The final guideline is converted by the staff liaison and chair into a document that meets the publication requirements of the sponsoring organization’s official journal.
At the AAO-HNS, the managing editor of the journal is notified that the guideline should not be submitted for external peer review, because it has already been extensively reviewed and further changes are not possible. A copy of the chair’s summary table of external reviewer comments and their disposition should be submitted to the managing editor to document the external review and remain on file in lieu of traditional editorial peer review. Depending on the length of the document the guideline may be published as a supplement or within the main journal.
If the guideline is lengthy and published as a supplement, an Executive Summary may be prepared for publication in the main journal to promote awareness. The Summary should contain condensed versions of the introductory segments, a tabular listing of all key action statements, full evidence profiles for the action statements, and condensed versions of supporting text.
The guideline page proofs should be checked expeditiously. The guideline chair should prepare a summary of newsworthy points from the guideline, which the organization’s public relations department can use in preparing a press release. Embargo and publication dates are coordinated among all involved organizations.
Guidelines should be published with an accompanying disclaimer, created by the organization’s legal counsel, to set clear bounds on the intended use of the document. The AAO-HNS adds a brief disclaimer to the abstract and a longer disclaimer to the end of the manuscript. Having a disclaimer in the abstract is advised because individuals without access to the full-text may cite only the abstract text.
Here is the AAO-HNS disclaimer text added to the end of the abstract:6
“This clinical practice guideline is not intended as a sole source of guidance in managing [topic specified here]. Rather, it is designed to assist clinicians by providing an evidence-based framework for decision-making strategies. The guideline is not intended to replace clinical judgment or establish a protocol for all individuals with this condition, and may not provide the only appropriate approach to diagnosing and managing this problem.”
Here is the AAO-HNS disclaimer text added to the end of the manuscript:6
“As medical knowledge expands and technology advances, clinical indicators and guidelines are promoted as conditional and provisional proposals of what is recommended under specific conditions, but they are not absolute. Guidelines are not mandates and do not and should not purport to be a legal standard of care. The responsible physician, in light of all the circumstances presented by the individual patient, must determine the appropriate treatment. Adherence to these guidelines will not ensure successful patient outcomes in every situation. The American Academy of Otolaryngology—Head and Neck Surgery (AAO-HNS), Inc. emphasizes that these clinical guidelines should not be deemed to include all proper treatment decisions or methods of care, or to exclude other treatment decisions or methods of care reasonably directed to obtaining the same results.”
Publishing the guideline is the first step towards implementation, but publication alone is unlikely to change clinician behavior. The implementation plan outlined in the guideline is begun, with necessary resources committed by the sponsoring organization or external funding agencies. Relevant brochures, educational materials, and continuing medical education (CME) produced by the organization should be updated for consistency with the new guideline recommendations.
Awareness of the guideline is increased by ensuring that the National Guideline Clearinghouse (NGC) receives a copy of the final publication with appropriate copyright release to produce a summary document. Efforts to create performance measures based on the guideline should be planned when appropriate. Such efforts are facilitated by clear action-oriented recommendations that are supported by high-quality evidence.
In general, performance measures are likely to be most valid when built around strong recommendations and recommendations, where the benefit-risk deliberation shows a preponderance of one or the other and good quality evidence supports the policy.
Performance measures are unlikely to be valuable if built around (1) statements that are vague or underspecified, (2) recommendations where anticipated benefit is balanced by anticipated risk, harm, and cost, or (3) recommendations based on evidence that may change.
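The criteria in the two preceding paragraphs can be read as a simple screening checklist. The Python sketch below is illustrative only and is not part of the AAO-HNS process: the field names and category labels are assumptions, and each input is a judgment the working group would still have to make. The function returns the list of concerns that argue against building a performance measure on a given action statement.

```python
from dataclasses import dataclass
from typing import List

# Grades named in the text as the most valid basis for performance measures.
SUITABLE_GRADES = {"strong recommendation", "recommendation"}


@dataclass
class ActionStatement:
    """A key action statement with judgments recorded by the group.

    All field names are hypothetical and exist only for this example.
    """
    text: str
    grade: str             # e.g. "strong recommendation", "recommendation", "option"
    is_specific: bool      # False if vague or underspecified
    benefit_harm: str      # "preponderance" or "balanced"
    evidence_stable: bool  # False if the evidence base may change


def measure_concerns(s: ActionStatement) -> List[str]:
    """List the reasons a statement may be a poor basis for a measure."""
    concerns = []
    if s.grade not in SUITABLE_GRADES:
        concerns.append("not a strong recommendation or recommendation")
    if not s.is_specific:
        concerns.append("statement is vague or underspecified")
    if s.benefit_harm == "balanced":
        concerns.append("benefit is balanced by risk, harm, and cost")
    if not s.evidence_stable:
        concerns.append("supporting evidence may change")
    return concerns


if __name__ == "__main__":
    example = ActionStatement(
        text="Hypothetical action statement",
        grade="option",
        is_specific=True,
        benefit_harm="balanced",
        evidence_stable=True,
    )
    for concern in measure_concerns(example):
        print("-", concern)
```

An empty list means only that none of the listed concerns apply; it does not by itself establish that a measure would be valid.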
Guidelines should describe how and when the need for an update will be assessed. Situations that might require clinical guidelines to be updated include:48
There are several possible methods for deciding when to update:48
When an update is required, a decision must be made whether to retain and revise the guideline, or to replace it entirely. This decision is necessarily subjective, taking into consideration the breadth and quality of new evidence plus the number of policy statements that are outdated.
We thank Jenissa Haidari, MPH, and Milesh Patel, MS for their critical review of the manuscript and helpful suggestions, and David R. Nielsen, MD for his support of our collaborative efforts to create superior guidelines and knowledge products with the AAO-HNS/F.
No sponsorships or competing interests have been disclosed for this article.
Richard M. Rosenfeld, Department of Otolaryngology, State University of New York Downstate and The Long Island College Hospital, Brooklyn, NY (RMR)
Richard N. Shiffman, Department of Pediatrics and the Yale Center for Medical Informatics, Yale University School of Medicine, New Haven, CT (RNS)