Purpose: To test the feasibility of a framework for prioritizing new comparative effectiveness research questions related to management of primary open-angle glaucoma (POAG) using practice guidelines and a survey of clinicians.
Participants: Members of the American Glaucoma Society.
Methods: We restated as an answerable clinical question each recommendation in the 2005 American Academy of Ophthalmology Preferred Practice Patterns (PPPs) regarding the management of POAG. We asked members of the American Glaucoma Society to rank the importance of each clinical question, on a scale of 0 (not important at all) to 10 (very important), using a two-round Delphi survey conducted online between April and September 2008. Respondents had the option of selecting “no judgment” or “research has already answered this question” for each question in lieu of the 0 to 10 rating. We used the ratings assigned by the Delphi respondents to determine the importance of each clinical question.
Main Outcome Measures: Ranking of the importance of each clinical question.
Results: We derived 45 clinical questions from the PPPs. Of the 620 American Glaucoma Society members invited to participate in the survey, 169 completed the Round One survey; 105 of the 169 also completed Round Two. We observed four response patterns to the individual questions. Nine clinical questions were ranked as the most important: four on medical intervention, four on filtering surgery, and one on adjustment of therapy.
Conclusions: Our theoretical model for priority setting for comparative effectiveness research questions is a feasible and pragmatic approach that merits testing in other medical settings.
Comparative effectiveness research (CER), “the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat and monitor a clinical condition, or to improve the delivery of care,”1 serves as the foundation for evidence-based practice and decision making, and is one of the top priorities of the nation’s health care agenda.2 Findings of CER inform patients, providers, payers, policy makers and others about the treatments that work best for a defined group of patients under specified conditions, and whether more extensive or expensive management leads to better health outcomes.3,4 Substantial investment of resources is required to conduct new primary studies or to synthesize existing evidence regarding the clinical effectiveness and appropriateness of interventions that are used to manage health conditions. For clinical questions about intervention effectiveness, randomized controlled trials (RCTs) and systematic reviews of RCTs are generally considered to be the highest level of evidence.5
To conserve resources and to obtain the answers that decision makers need, increased attention has focused on ways to prioritize CER. The Institute of Medicine (IOM), for example, was asked by the United States Congress to recommend research priorities that present the most compelling need for evidence of comparative effectiveness, considering input from all stakeholders.2,6
While the necessity to establish CER priorities is well recognized, there is little empirical evidence to guide the criteria and processes for priority setting. Many organizations report using a nomination-based approach and use similar broad criteria for setting priorities, including cost, burden of disease, public interest, topic controversy, new evidence, impact on quality of life, measurable patient outcomes, and the potential to reduce variations in clinical practice.7–9
The objective of this study was to test a framework for prioritizing clinical questions for new CER starting with clinical practice guidelines. We believe that clinical practice guidelines reflect the state of knowledge in the clinical community regarding prevention, screening and therapy, as well as the community’s interpretation of evidence,10, 11 and provide a reasonable starting point for priority-setting. We used the American Academy of Ophthalmology (AAO)’s practice guidelines, the Preferred Practice Patterns (PPPs),16 for primary open-angle glaucoma (POAG), last updated in 2005 (http://one.aao.org/CE/PracticeGuidelines/PPP.aspx; accessed March 21, 2010), as a topic area for testing the framework. We selected POAG because of its substantial burden on patients and health care resources.12–15 AAO PPPs for POAG have been used in the United States, and have been referenced in international glaucoma guidelines.16
In 2006, we developed a list of clinical questions about POAG that could be addressed by clinical trials and systematic reviews of trials (e.g., questions about intervention effectiveness) that we proposed to send to clinicians with a request to rank their relative research priorities. To do this, two individuals independently reviewed the 2005 AAO PPPs on the management of POAG and extracted every statement that could be considered as a recommendation. We restated each recommendation as an answerable clinical question based on characteristics of the patient population, test and comparison interventions, and outcomes.17 We did not translate into clinical questions guideline statements on disease definition, etiology, screening, and diagnosis; statements that we classified as ethical or legal statements; or statements that were not recommendations.
In the PPP, the AAO panel “rated each recommendation according to its importance to the care process” (http://one.aao.org/CE/PracticeGuidelines/PPP.aspx; accessed March 21, 2010). We extracted the importance rating assigned to each AAO recommendation, defined as: “level A,” most important; “level B,” moderately important; and “level C,” relevant but not critical. Not all recommendations were rated.
We consulted with three glaucoma specialists who have expertise both in the management of glaucoma and in forming answerable clinical questions to confirm that our restatement was accurate. We also consulted the Cochrane Eyes and Vision Group US Satellite (CEVG@US) Advisory Board and methodologists on the accuracy of the restatements and on our proposal to ask ophthalmologists to rank the relative importance of each question. This revealed that many ophthalmologists are uncomfortable ranking clinical questions outside their sub-specialty. Furthermore, when we pilot tested the survey, some clinicians responded to our request as if we were asking about their knowledge rather than prioritization. The clinician advisors also suggested that we add a response option “research has already answered this question” to our survey. We modified the survey questionnaire and methods accordingly.
Between April and September 2008, we surveyed the membership of the American Glaucoma Society (AGS), asking them to rank the relative importance of the clinical questions we derived from the AAO PPPs for POAG using two rounds of an online Delphi survey. The format of the questionnaire and the data collection plan were adapted from the Defining Standard Protocol Items for Randomized Trials (SPIRIT) Initiative Delphi survey.18 The Johns Hopkins Bloomberg School of Public Health Institutional Review Board (IRB) approved the Delphi survey on March 13, 2008 (IRB number: 00001124). The AGS Research Committee approved the Delphi survey and their role on March 26, 2008.
Round One of the survey, which included the initial invitation, consent to participate, and access to the survey website, was sent by the AGS office via email to their membership list of 620 people. The invitation was signed by the Chair of the AGS Research Committee (Dr. Henry Jampel). The survey instructions said “Please rate the importance of having the answer to each of the following questions for providing effective patient care” on a scale of 0 (not important at all) to 10 (very important). Participants also had the option of assigning a rating of “no judgment” or “research has already answered this question” to each question in lieu of the 0 to 10 rating. We provided space for comments, questions, and nomination of items not included in the list of research questions derived from PPPs. Additionally, Round One requested demographic and other information such as occupation/field, sub-specialty, and place of employment (e.g., government, industry, academia, other), and experience in clinical trials and systematic reviews. AGS members were given 3 weeks to respond to Round One. After the initial request, an email reminder was sent at the end of week 1, week 2, and 1 day prior to the end of week 3, from the AGS office.
Round Two of the survey was sent electronically to Round One respondents approximately 4 weeks after the completion of Round One. We modified the wording of seven survey questions, for clarity only, using feedback from Round One. In Round Two, we provided each respondent with his/her previous individual rating for each question, as well as a summary, in the form of a histogram, of all responses for each question. Respondents were asked to re-rate each clinical question in light of ratings and comments from the previous round. In Round Two we also asked respondents “Please rate the amount you have relied on the AAO PPPs for POAG in deciding how best to provide effective patient care” on a scale of 0 (none) to 10 (heavy). Demographic questions asked in Round One were not asked again in Round Two. Round One respondents were given 3 weeks to respond to Round Two. After the initial request, and at the end of week 1, the study moderator (TL) sent an email reminder. To increase response, the Chair of the AGS Research Committee sent an email reminder at the end of week 2, and 1 day prior to the end of week 3, with a personalized message encouraging participation.
To analyze the survey data, we fitted a 3-level hierarchical model to allow for dependence among the responses observed for units belonging to the same cluster (responses are clustered within question and within respondent). We assumed a normal likelihood for the ratings assigned by respondents and chose a non-informative prior distribution for each parameter. We calculated the mean rating and the 95% credible interval for each question using Markov chain Monte Carlo (MCMC) methods in WinBUGS.19 In Bayesian statistics, a credible interval is a posterior probability interval that is used similarly to a confidence interval in frequentist statistics. We categorized ratings into three levels resembling the system used by the AAO PPPs: “most important,” “moderately important,” and “relevant but not critical.” “Most important” questions were those with a lower credible limit (2.5th percentile) above the overall mean; “moderately important” questions were those with 95% credible intervals that included the overall mean; and “relevant but not critical” questions were those with an upper credible limit (97.5th percentile) below the overall mean. We compared the ratings assigned by the Delphi respondents with the importance ratings assigned in the AAO PPPs.
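The interval-based classification rule described above can be sketched as follows. This is a minimal illustration only: the posterior draws, the number of questions, and the mean ratings are all hypothetical, and the sketch stands in for the full hierarchical MCMC fit.

```python
# Minimal sketch of classifying questions by 95% credible intervals
# relative to the overall mean rating. All data here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws of the mean rating for five questions.
posterior_draws = {q: rng.normal(loc=m, scale=0.3, size=4000)
                   for q, m in zip(range(1, 6), [8.2, 7.1, 6.4, 5.0, 4.1])}

# Overall mean rating across questions (illustrative point estimate).
overall_mean = np.mean([d.mean() for d in posterior_draws.values()])

def classify(draws, overall):
    """Apply the three-level rule: compare the 95% credible interval
    of a question's mean rating with the overall mean."""
    lo, hi = np.percentile(draws, [2.5, 97.5])
    if lo > overall:
        return "most important"
    if hi < overall:
        return "relevant but not critical"
    return "moderately important"

for q, draws in posterior_draws.items():
    print(q, classify(draws, overall_mean))
```

A question whose entire interval sits above the overall mean is flagged “most important,” mirroring the rule applied to the 45 survey questions.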
We initially coded “research has already answered this question” and “no judgment” as “missing” in the analysis. We conducted sensitivity analyses, assigning a 10 for “research has already answered this question”, and compared ranks under this assumption. A “sensitivity analysis” repeats the primary analysis, substituting alternative ranges of values for decisions that are subjective, and is used to examine how robust the findings are.17 In assigning a 10 for “research has already answered this question” we assumed that sufficient research to answer a question indicates that the question is important and a systematic review is warranted.
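The two coding strategies for the special response options can be illustrated with a short sketch. The response list and function name below are hypothetical, for illustration only; the study's actual analysis used the hierarchical model described above rather than simple means.

```python
# Sketch of the two coding strategies for special response options.
# The response list is hypothetical.
ANSWERED = "research has already answered this question"
NO_JUDGMENT = "no judgment"

responses = [7, 9, ANSWERED, 5, NO_JUDGMENT, 8, ANSWERED]

def mean_rating(responses, answered_as=None):
    """Mean rating under one coding strategy.

    answered_as=None -> code ANSWERED as missing (primary analysis);
    answered_as=10   -> code ANSWERED as a rating of 10 (sensitivity analysis).
    NO_JUDGMENT is treated as missing under both strategies."""
    coded = []
    for r in responses:
        if r == NO_JUDGMENT:
            continue  # always missing
        if r == ANSWERED:
            if answered_as is not None:
                coded.append(answered_as)
            continue
        coded.append(r)
    return sum(coded) / len(coded)

print(mean_rating(responses))                  # ANSWERED as missing -> 7.25
print(mean_rating(responses, answered_as=10))  # ANSWERED as 10
```

Comparing the two outputs shows how the “research has already answered this question” assumption shifts a question's rating, which is the comparison the sensitivity analysis performs across all 45 questions.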
We derived 45 clinical questions from the AAO PPPs related to management of POAG (Table 1, available at http://aaojournal.org): 13 on medical interventions, eight on laser trabeculoplasty, 20 on filtering surgery, one on cyclodestructive surgery, and three on adjustment of therapy.
Of the 620 AGS members invited to participate in the survey, 169 completed the Round One survey; 105 of the 169 also completed Round Two (Table 2). The number of responses per day increased when reminder emails were sent. In Round Two, the number of responses per day peaked when the reminder email was sent by the Chair of the AGS Research Committee. We compared the characteristics of those who completed Round One, those who completed Round Two, and those who completed Round One but not Round Two (Table 3). Respondents who classified themselves in Round One as “at least moderately experienced in clinical trials as an investigator,” as “ever (co)-authoring a systematic review,” or as “self-employed/private practice” were less likely to complete Round Two. In Round Two, the mean score for respondents’ level of reliance on the AAO PPPs for providing effective patient care was 5.2 (standard deviation = 2.7; range: 0 to 10).
We observed four response patterns to the individual questions (Figure 1): 1) the majority (>50%) of respondents said that the clinical question had been answered by research; 2) respondents varied in their opinions of the importance of the research question and a few (≤10%) said that the clinical question had been answered by research; 3) respondents varied in their opinions of the importance of the research question and a substantial proportion (>10% and ≤50%) said that the clinical question had been answered by research; and 4) some respondents may not have understood the clinical question, as indicated by a high frequency of “no judgment” responses and by the written comments. For example, 8/169 respondents in Round One commented that they did not understand question 13, “Are interventions (e.g., repeated instructions, patient education) effective for improving adherence to medical therapy and efficacy of medical therapy in patients with POAG?” In addition, respondents sometimes answered the clinical question or nominated new clinical questions and appeared less interested in, or unaware of, our methodologic goal of setting research priorities. For example, for question 45, “What is the optimal interval for conducting follow-up visits to assess the response and side effects from washout of the old medication and onset of maximum effect of the new medication?”, 12/169 respondents in Round One provided an optimal follow-up interval, ranging from 4 to 6 weeks. An example of a nominated clinical question is whether beta-blockers stop progression of POAG.
We calculated the mean and 95% credible intervals for the ratings of importance for 45 clinical questions (Figure 2). When “no judgment” and “research has already answered this question” were coded as missing, questions 1, 2, 3, 7, 13, 23, 25, 33, 39, 40, 41, 42, and 43 were ranked as the most important clinical questions to which answers are needed for providing effective patient care, with their lower 2.5% credible limits above the overall means.
We performed a sensitivity analysis, assigning a rating of 10 to the response option “research has already answered this question” (“no judgment” was coded as missing), and again compared ranks of the 45 clinical questions. Nine questions (1, 2, 3, 7, 25, 33, 39, 40, and 43) were classified as the most important clinical questions under both coding strategies (Figure 3 and Table 4): four on medical intervention, four on filtering surgery, and one on adjustment of therapy. For these nine questions, the proportion of respondents who selected “research has already answered this question” ranged from 7% to 68% in Round One and from 6% to 89% in Round Two. Respondents who changed from assigning a score in Round One to selecting “research has already answered this question” in Round Two were less likely to report expertise in clinical trials and systematic reviews. Questions 13, 23, 41, and 42 were classified as the most important clinical questions only when “research has already answered this question” was coded as missing, and questions 4, 5, 14, 21, 22, 26, and 28 were classified as the most important only when “research has already answered this question” was coded as 10 (Figure 3). When the highest-ranking questions under both coding assumptions were compared with the AAO importance ratings, 4/9 questions that the AGS members ranked highly in our Delphi survey received a “level A” rating in the AAO PPPs, while 5/9 questions were unrated by the AAO (Table 4).
Using questions derived from clinical practice guidelines, and a survey of glaucoma specialists, we generated a ranked list of clinical questions for comparative effectiveness research related to the management of POAG. The questions to which clinicians assigned the highest importance ranking related to the effectiveness of medical interventions, filtering surgery, and adjustment of therapy. Clinical questions on laser trabeculoplasty and cyclodestructive surgery were ranked as less important. The clinical questions that respondents ranked as the most important involved either common clinical scenarios that practitioners face several times daily, such as the decision to use eye drops to lower the intraocular pressure, or less common scenarios, such as interventions after filtration surgery for which practitioners may lack knowledge about their utility.
The top five questions receiving a high importance ranking under both coding schemas were also ranked as “research has already answered this question” by more than 50% of respondents. This may indicate the existence of evidence, for example, from one or two trials that has convinced many clinicians, but not others. It is particularly interesting that the proportion responding that “research has already answered the question” increased after respondents had the opportunity to view the responses by others. We do not know whether those changing their responses were prompted to check the evidence or whether they were simply influenced by their peers.
The clinician survey rankings agreed with the AAO importance rankings in cases where importance rankings had been assigned. We frequently derived more than one clinical question from each AAO PPP recommendation, however. For example, for the PPP statement that medical interventions are generally effective for POAG, we derived seven questions, specifying each type of medical intervention (e.g., beta-blockers) as a unique question. This may explain why over half of the survey questions did not have an AAO importance ranking.
Two main steps are involved in any priority setting effort: identifying important answerable clinical questions, and prioritizing the list of questions using a specific methodology. Our approach for identifying important questions used direct clinician input. Because practice guidelines typically are developed by professional societies aiming to assist healthcare practitioners with decision making,11 the clinical questions derived from them reflect key issues and dilemmas facing clinicians at the time of guideline development.
Our study also used Delphi survey methods, a formal consensus technique that incorporates individual value judgment into group decision making. This method contrasts with nomination-based methods, in which topics are first suggested by curious investigators, by payers concerned about cost, or by members of the public concerned about contradictory claims of a treatment’s efficacy, and then explicit, predetermined criteria are applied to develop rankings.7, 8, 20–22 In certain cases, these other methods have not demonstrated satisfactory validity.21
Our method has several unique strengths. First, by surveying AGS members, we queried a large group of stakeholders with highly specialized knowledge. This group of experts allowed us to examine a broad range of interventions for the same condition, thus meeting the urgent needs of practitioners to answer myriad questions within the subspecialty. In contrast, most priority setting methods in use focus on just a few questions in a specialty area. This is especially true for vision research. Since their start in 1997, the Agency for Healthcare Research and Quality (AHRQ) Evidence-based Practice Centers (EPCs), the key U.S. producers of systematic reviews, have released only three eye- and vision-related evidence reports out of a total of 185 evidence reports completed.23 The recent Institute of Medicine report “Initial National Priorities for Comparative Effectiveness Research,”6 recommending research priorities for $400 million allocated by the American Recovery and Reinvestment Act fund to the Department of Health and Human Services, includes only two topics related to vision health. While dealing with important topics, these reviews and nominated priority topics do not begin to address the important clinical questions of vision subspecialists.
Second, our method promises to lessen the gap between evidence generation and the translation of evidence to care, when the method is used in partnership with guideline developers, research funders, evidence producers, and consumers. We propose the following approach to filling the evidence gaps in a subspecialty area: 1) choose a topic area and work with guidelines producers to derive answerable clinical questions from existing guidelines; 2) survey members of one or more professional associations to assess individual and consensus rankings of the clinical questions; 3) determine evidence needs and research priorities by matching the ranked questions with existing evidence; 4) partner with funders, evidence producers, and evidence synthesizers (e.g., groups within The Cochrane Collaboration such as the Cochrane Eyes and Vision Group) to fill the information gaps.
Our approach can also be used to re-assess research priorities when novel medicine and technology emerge, new evidence gaps develop, and healthcare resources need to be re-allocated to meet these immediate needs.
Given our goal of developing a framework for prioritizing comparative effectiveness research, we faced several challenges. First, when translating the AAO guidelines into answerable clinical questions, we relied on interventions and outcome measures that were stated explicitly in the guidelines. One consequence was that intraocular pressure was over-emphasized as an outcome in the restated clinical questions. In addition, by their nature, practice guidelines may not cover all important clinical questions; the fact that respondents nominated new questions provides evidence regarding this issue.
A second challenge was that some clinicians failed to grasp that our purpose was research prioritization. At both the pilot testing and survey stages of the study, clinicians sometimes responded to the questions as if we were asking about their knowledge of the subject. We found that questions related to the delivery of care (e.g., behavioral interventions, pre- and post- operative care) appeared to be the most difficult in this regard.
We faced a particular challenge in analyzing clinical questions where a meaningful proportion of respondents selected “research has already answered the question,” a response option suggested by a clinician co-investigator. Although we performed sensitivity analyses to test how priority ratings would change under two different assumptions, neither approach provides information about whether research indeed has answered the questions and how existing evidence influenced a respondent’s interpretation of the question. To address these issues would require matching existing systematic reviews and RCTs to the 45 clinical questions. In future applications of our method, it may be useful to separate response options into two parts: 1) “Please rate the importance of having the answer to each of the following questions for providing effective patient care,” and 2) “Do you believe research has already answered each question?” so that responses can be analyzed separately. One could also ask respondents to cite the evidence if they say “research has already answered this question.”
High response variability may be used as an alternative criterion to prioritize clinical questions for additional research. Questions with high response variability (which would result in a wide credible interval) might reflect greater clinical uncertainty, and may be more suitable for additional research. In our study, however, response variability was small for all questions. Regardless, it is critical to search for and synthesize existing evidence for clinical questions identified as priorities by preparing, maintaining and disseminating systematic reviews, a necessary step prior to investment of research funds.24
Although our Delphi survey response rate was comparable to that of other web-based surveys of medical specialists,25 the opinions and rankings of our respondents may not be comparable to those of other American glaucoma specialists, since fewer than one third of AGS members responded to the survey. In addition, those who responded to both Round One and Round Two were less likely to report expertise in clinical trials and systematic reviews, and were less likely to identify themselves as self-employed/private practice, than those responding to Round One only, which may have influenced the final rankings.
If providers and patients are to make well-informed decisions about health care, comparative effectiveness research must be prioritized and conducted with all due speed. Prioritization of the generation and synthesis of research evidence to prevent, diagnose, treat, and monitor clinical conditions should reflect the most urgent needs. Those setting priorities report experience but little empirical evidence as to effective methods of prioritization.8, 22 Our study provides evidence supporting the practicality of a systematic method to identify priority questions utilizing stakeholder input.
In conclusion, we tested the feasibility of a framework for prioritizing answerable clinical questions for new comparative effectiveness research by using practice guidelines and a survey of clinicians. Our approach is systematic, transparent, and participatory, and it produces a ranked list of questions in a subspecialty area. We have demonstrated that our theoretical model for priority setting for comparative effectiveness research questions is a pragmatic approach that merits testing in other medical settings.
We acknowledge the valuable clinical and methodological perspectives provided by Drs. Richard Wormald, Donald Minckler, and David S. Friedman of the Cochrane Eyes and Vision Group US Satellite (CEVG@US) Advisory Board, who helped to verify that our restatement of the guidelines was accurate and our research plan was feasible. We thank Dave Shade, who developed the survey webpage and database.
This study was supported by the Cochrane Collaboration Opportunities Fund, and Contract N01-EY2-1003, National Eye Institute, National Institutes of Health, USA.
The sponsor or funding organization had no role in the design or conduct of this research.
Evidence-based Priority-setting for New Systematic Reviews and Randomized Controlled Trials: a Case Study for Primary Open-angle Glaucoma. Presented at the 30th Society for Clinical Trials Annual Meeting, Atlanta, GA. May 2009
Evidence-based Priority-setting for New Systematic Reviews: a Case Study for Primary Open-angle Glaucoma. Presented at the XVI Cochrane Colloquium, Freiburg, Germany. October, 2008
Conflict of Interest:
Dr. Jampel is a consultant for Allergan, Glaukos, Ivantis, Endo Optics, and Sinexux, and owns equity in Allergan. The other authors have no financial or conflicting interests to disclose.
Online only material:
This article contains online-only material. The following should appear online-only: Table 1. Statements in the American Academy of Ophthalmology primary open angle glaucoma Preferred Practice Pattern translated into clinical questions.