(1) To systematically collect and organize into clinical categories all outcomes reported in trials for abnormal uterine bleeding (AUB); (2) to rank the importance of outcomes for patient decision making; and (3) to improve future comparisons of effects in trials of AUB interventions.
Systematic review of English-language randomized controlled trials of AUB treatments in MEDLINE from 1950 to June 2008. All outcomes and definitions were extracted and organized into major outcome categories by an expert group. Each outcome was ranked “critically important,” “important,” or “not important” for informing patients’ choices.
One hundred thirteen articles from 79 trials met the criteria. One hundred fourteen different outcomes were identified, only 15 (13%) of which were ranked as critically important and 29 (25%) as important. Outcomes were grouped into eight categories: (1) bleeding; (2) quality of life; (3) pain; (4) sexual health; (5) patient satisfaction; (6) bulk-related complaints; (7) need for subsequent surgical treatment; and (8) adverse events.
To improve the quality, consistency, and utility of future AUB trials, we recommend assessing a limited number of clinical outcomes for bleeding, disease-specific quality of life, pain, sexual health, and bulk-related symptoms both before and after treatment and reporting satisfaction and adverse events. Further development of validated patient-based outcome measures and the standardization of outcome reporting are needed.
Abnormal uterine bleeding (AUB), an alteration in the volume, pattern, or duration of menstrual blood flow, is the single most common reason for gynecologic referral. The large societal and personal burden of AUB lies in its major impact on quality of life, productivity, health care use, and costs [3,4]. Although hysterectomy cures AUB, it carries the risks associated with major surgery. Many less-invasive alternatives have gained popularity, including endometrial destruction techniques, myomectomy, polypectomy, and uterine artery embolization. Medical management of AUB includes modification of endometrial coagulation factors or prostaglandins, hormonal manipulation, and insertion of levonorgestrel-releasing intrauterine systems. This range of therapeutic options makes direct comparisons of the effectiveness of different AUB interventions necessary but difficult.
Further complicating meaningful comparisons of interventions for AUB is the subjective and variable nature of the presenting symptoms. Nearly half the women who seek treatment for AUB lose less blood monthly than the 80 mL that strictly defines “heavy menstrual bleeding” [5,6]. Rather, many patients may be more bothered by the unpredictability of bleeding or pain associated with passage of large clots. As such, outcomes that capture the symptoms most relevant to patients (so-called patient-reported outcome measures) are increasingly being used in research on treatments for AUB. In this study, our objectives were (1) to create a complete inventory of outcomes reported in published randomized controlled trials (RCTs) for treatment of AUB and organize them into overarching categories, and (2) to rank the importance of these measures by a physician expert-review process that considers the clinical relevance of the outcome to the patient and the quality of the measuring instrument. The ultimate goal of this review was to facilitate and improve future comparisons between different interventions for AUB.
The Systematic Review Group (SRG) of the Society of Gynecologic Surgeons (SGS) is a group of 25 physicians with expertise in benign gynecology (including AUB), urogynecology, and obstetrics, together with consultants with expertise in systematic review methods and guideline development. Members and relevant disclosures are described elsewhere (Appendix A, available on the journal’s Web site at www.elsevier.com).
The SRG undertook a systematic search to identify RCTs of treatments for AUB in women with either dysfunctional uterine bleeding (DUB) or fibroids. We performed a MEDLINE search of articles published between 1950 and June 11, 2008. Search terms included menorrhagia, menometrorrhagia, uterine hemorrhage, dysfunctional uterine bleeding, abnormal uterine bleeding, heavy vaginal bleeding, fibroid, leiomyoma, anovulatory bleeding, hysterectomy, myomectomy, uterine artery embolization, endometrial ablation, intrauterine device or system, several classes of medical therapy, and a search module for RCTs. The search was limited to human and English-language studies. A manual review of bibliographies of retrieved articles was completed to identify additional studies. Inclusion criteria required RCTs to have at least 10 individuals per arm and at least 1 month of follow-up. We excluded RCTs of “pretreatments” before a subsequent intervention and analyses assessing only cost or resource use.
Data from studies meeting the criteria were extracted independently and in duplicate by members of the review group, most of whom had experience with data extraction from a prior systematic review project. The data fields included study size; duration of follow-up; population (DUB, fibroids, or mixed); interventions compared; and outcomes (including patient-reported outcome measures and adverse events). A complete inventory of all reported outcomes was created. Each individual outcome was detailed in terms of its specific definition, the time point(s) when it was assessed, which instrument or test was used to measure it, whether the instrument or test had been validated, the units or metric used for presenting the results, and whether it was the “primary outcome” of the study (either explicitly described as such or used in a power calculation to determine study size). Composite outcomes were designated as such. All outcomes from the same trial were recorded on a single data extraction form. The list of all unique outcomes comprised the “outcome inventory.”
From this outcome inventory, the outcomes were organized and grouped into eight proposed overarching outcome domains. Categories were determined based on their applicability to all potential interventions for AUB and the physician expert group’s consensus of their relevance for informing patient choices. Outcomes related to cost, resource use, or those determined by the review group to have limited relevance for assessing clinical effectiveness were excluded from categorization and further analyses.
In preparation for ranking the importance of outcomes, the group completed a review of AUB outcome measures. The review included reports on the subjective experience of patients with AUB [8-13] and evaluations of the quality of patient-based outcome measures used for AUB, including findings from focus groups conducted by one of the coauthors (K.A.M.) on what women with AUB find important or bothersome [1,14]. The importance of each outcome was graded on a scale suggested by the GRADE working group (Grading of Recommendations Assessment, Development, and Evaluation). This working group, which strives to standardize and improve the transparency of grading evidence and recommendations in guideline development, proposes a 9-point scale with outcomes scored as “critical” for decision making (score: 7–9), “important but not critical” (score: 4–6), and “not important” for decision making (score: 1–3). The GRADE group does not provide specific details for what to consider in the ranking. At in-person meetings and on conference calls, our group discussed the considerations that could affect the importance of an outcome. There was general agreement to consider the magnitude of the impact of the outcome on a patient’s well-being and the quality of the measurement tool used to assess the outcome in terms of what was known about its validity, reliability, and feasibility. The overall impact of the outcome on a patient’s health and well-being superseded the assessment of the instrument used to measure the outcome. Consideration was given to whether the instruments were multi- or unidimensional, generic or condition specific, or psychometrically tested. The rankings for outcomes were increased if they were measured using multidimensional, condition-specific, or psychometrically tested instruments.
Psychometrically robust instruments were thought by the group to have the capacity to more broadly capture symptoms important from the patients’ perspective (multidimensional or condition specific) or the ability to provide higher-quality measurement of outcomes (psychometrically tested). The group followed an iterative process of consensus building.
Each member used a ballot to grade the importance of each outcome based on his or her qualitative synthesis of the aforementioned considerations. The mean of the members’ scores was calculated and rounded to the nearest whole number. Members were asked to report any potential conflicts of interest for this vote related to professional practice, research activities, or financial investments, and, if present, to abstain from voting. No conflicts were reported. For outcomes grouped under adverse events, no score was tallied as the group adapted the Consolidated Standards of Reporting Trials (CONSORT) statement on reporting of harms.
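The scoring step described above can be illustrated with a minimal Python sketch. It assumes the 9-point GRADE bands given earlier (7–9, 4–6, 1–3); the function names and the example ballots are hypothetical, and rounding ties upward is an assumption, because the text says only that the mean was rounded to the nearest whole number.

```python
import math
from statistics import mean


def consensus_score(ballots):
    """Average the members' 1-9 ballots and round to the nearest whole number.

    Ties (x.5) are rounded up here; the source does not say how ties were handled.
    """
    if not ballots:
        raise ValueError("at least one ballot is required")
    if any(b < 1 or b > 9 for b in ballots):
        raise ValueError("ballots must be on the 1-9 scale")
    return math.floor(mean(ballots) + 0.5)


def grade_category(score):
    """Map a whole-number score to the GRADE importance band."""
    if not 1 <= score <= 9:
        raise ValueError("score must be between 1 and 9")
    if score >= 7:
        return "critical"
    if score >= 4:
        return "important but not critical"
    return "not important"


# Hypothetical ballots for one outcome from five voting members.
example_ballots = [7, 8, 6, 9, 7]
example_score = consensus_score(example_ballots)
print(example_score, grade_category(example_score))
```

Python's built-in `round` rounds ties to the nearest even number, so the sketch uses an explicit floor-based rule to keep the tie behavior visible.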
Finally, the group developed recommendations for outcome measures to assess in future trials for AUB based on the outcome inventory, the overarching outcome categories, and the “importance” rating of individual outcomes.
The literature search identified 5,025 citations. Of these, 4,892 were excluded after review of title or abstract for not being treatment trials for AUB. After the full text of the remaining 133 articles was examined, 20 were excluded because they were not RCTs or were studies of resource use and cost estimation (Fig. 1). Studies ranged in size from 20 to 372 patients. Length of follow-up ranged from only perioperative events to 5 years of follow-up. Appendix B (available on the journal’s Web site at www.elsevier.com) lists detailed summary data for the final 79 trials reported in 113 articles, of which 51 trials involved patients with DUB (72 articles), 20 trials with uterine fibroids contributing to abnormal bleeding (27 articles), and eight trials with mixed or unknown/unclassified cause of AUB (14 articles). Of the DUB trials, seven were surgical trials that included hysterectomy as an intervention. Less-invasive surgical interventions involving endometrial ablation or resection were reported in 31 trials. The remainder of DUB trials (19 articles) examined the role of medications, largely nonsteroidal anti-inflammatory and hormonal therapies. Most trials reporting on treatments for fibroid-associated AUB were surgical, with six involving hysterectomy. Myomectomy and uterine artery embolization were used in four and eight fibroid trials, respectively.
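As a quick consistency check on the study-selection counts reported above, the arithmetic can be sketched in Python. All figures are taken from the text; the variable names are illustrative only.

```python
# Counts reported for the study-selection process.
citations_identified = 5025
excluded_on_title_or_abstract = 4892
excluded_on_full_text = 20

full_text_reviewed = citations_identified - excluded_on_title_or_abstract
articles_included = full_text_reviewed - excluded_on_full_text

# (trials, articles) reported for each population.
by_population = {
    "DUB": (51, 72),
    "fibroids": (20, 27),
    "mixed or unknown": (8, 14),
}
total_trials = sum(trials for trials, _ in by_population.values())
total_articles = sum(articles for _, articles in by_population.values())

print(f"full text reviewed: {full_text_reviewed}")
print(f"articles included:  {articles_included}")
print(f"trials / articles:  {total_trials} / {total_articles}")
```

The population subtotals sum to the 79 trials and 113 articles stated in the text, and the screening counts leave the same 113 articles.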
The 79 trials contained 114 outcomes, including 27 unique adverse events. After review of the inventory of outcomes, the SRG experts deemed 43 outcomes (38%) as not important for long-term clinical decision making (Appendix C [available on the journal’s Web site at www.elsevier.com]), for example, estradiol and ferritin levels, grading of uterine artery spasm, fibroid volume, and volume of embolic material. There was no further voting on adverse event outcomes, as it was agreed that these should be reported consistent with the recommendations of the CONSORT statement; examples of these events include death, pulmonary embolism, wound hematoma, and fluid overload. Of the remaining 44 outcomes, the most commonly reported outcomes were patient satisfaction (35 of 79 trials, 44%), rate of amenorrhea (32 of 79 trials, 41%), and frequency of subsequently needed interventions—either hysterectomy (25 of 79 trials, 32%) or other nonhysterectomy surgical treatment (26 of 79 trials, 33%).
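The outcome tallies and percentages in this paragraph can be reproduced with a short Python sketch. The counts come from the text; the `percent` helper and the assumption that percentages were rounded to the nearest whole number are illustrative.

```python
def percent(part, whole):
    """Percentage rounded to the nearest whole number (assumed rounding rule)."""
    return round(100 * part / whole)


TOTAL_OUTCOMES = 114
TOTAL_TRIALS = 79

adverse_event_outcomes = 27
not_important_outcomes = 43
rated_outcomes = TOTAL_OUTCOMES - adverse_event_outcomes - not_important_outcomes

critical_outcomes = 15
important_outcomes = 29

# Number of trials reporting each of the most common outcomes.
trials_reporting = {
    "patient satisfaction": 35,
    "amenorrhea": 32,
    "subsequent hysterectomy": 25,
    "other subsequent surgery": 26,
}

print(f"rated outcomes: {rated_outcomes}")
print(f"not important: {percent(not_important_outcomes, TOTAL_OUTCOMES)}%")
for name, n in trials_reporting.items():
    print(f"{name}: {n} of {TOTAL_TRIALS} trials ({percent(n, TOTAL_TRIALS)}%)")
```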
The overarching categories agreed upon by the SRG were (1) bleeding; (2) quality of life; (3) pain; (4) sexual health; (5) bulk-related symptoms (for patients with anatomic/fibroid-related AUB); (6) patient satisfaction; (7) need for additional treatments; and (8) adverse events. All outcomes deemed either “important” or “critical” for clinical decision making for treating AUB were grouped into these eight categories. Table 1 shows a listing of these outcomes with their corresponding scores.
When rating an outcome, we decided that it was appropriate to consider the concepts that an individual instrument measured. The group agreed on some principles. A comprehensive and multifaceted instrument rated higher than a focused unidimensional instrument, because the former has the capacity to more broadly capture symptoms that would be important from the patients’ perspectives. Condition-specific quality-of-life instruments rated higher than general instruments, because they had the capacity to include more meaningful questions related to the intermittent symptoms of AUB. In the bleeding category, for example, instruments that can assess both heaviness and irregularity of bleeding were given the highest scores and were most likely to be considered “critical” by our group. The Pictorial Bleeding Assessment Chart is one such measure, and it was commonly used (n = 22 trials). Other tools that examined just heaviness of bleeding were considered important but not critical, for example, sanitary pad counting (n = 7 trials).
Given the prevalence of AUB, high-quality research on treatment modalities for this burdensome condition is imperative for the advancement of women’s health-related quality of life [17-19]. In our systematic review, we identified 79 trials for AUB that reported 114 different outcomes, many of which attempted to evaluate outcomes from the patient perspective, a top priority in health research. However, the sheer number and variety of outcomes reported in the literature hinder comparison of different interventions across studies. This limits the utility of this literature for informing clinicians, patients, and policy makers about the respective benefits and harms of different interventions. In response to this dilemma, we generated an outcomes inventory and ranked the importance of outcomes in an explicit expert decision-making process to guide evidence synthesis and future research.
The outcome inventory included all clinically relevant patient-reported outcomes used in studies for AUB over the past 58 years. Categorizing the individual outcomes into overarching outcome domains and then rating each individual outcome in terms of “importance” for clinical decision making provides a point of reference for recommendations of outcomes to be used in future studies on AUB. Our ranking emphasizes which patient-reported outcomes are important rather than how to measure them, because there is currently no definitive answer on how best to measure all of these outcomes. The variety of outcomes assessed and instruments used in our review highlights that there is no universally accepted, high-quality, standardized set of instruments that comprehensively evaluates outcomes for women with AUB.
Although reliable, validated instruments exist for the specific determination of the amount of blood loss [20,21], there is no single instrument that is considered the standard for evaluating menstrual symptoms, AUB-related quality of life, sexual health, or satisfaction with treatment in women with AUB. Further research is needed in the development of AUB-related quality-of-life measures that are proven responsive to capture changes after treatment. Generic health-related quality-of-life questionnaires can be used to augment validated disease-specific measures of blood loss; however, some women with AUB may experience difficulty in answering generic questions about health perceptions because of the intermittent nature of AUB [14,22-24]. As AUB symptoms can be chronic and intermittent, it is important to assess more than a single aspect of a woman’s bleeding experience, and that is why we have included all outcomes categories rated as “important” by our review group in the following recommendations.
Specification of important outcomes is a key step when developing a systematic review protocol to ensure that useful information is synthesized. The recently updated Cochrane Handbook for Systematic Reviews of Interventions recommends choosing a maximum of seven outcomes and presenting the main findings on those in the “Summary of findings” table. This is because evidence synthesis across an excessive number of outcomes becomes confusing. The eight outcome categories proposed for AUB trials are not necessarily independent but capture different aspects of a disease that manifests with varying symptoms that can have variable impact on a woman’s health. Hopefully, going forward, single instruments will collapse the measurement of symptoms and manifestations across different categories, but at this point, the eight outcome categories are proposed as a first step to help with cleaning up a field plagued by a plethora of outcomes.
The problems encountered in this systematic review of AUB trials are not unique to gynecology. Similar strides for developing meaningful and parsimonious outcome measures have been undertaken in other medical disciplines. For example, OMERACT (Outcome Measures in Rheumatoid Arthritis Clinical Trials) has endeavored to improve national and international consensus on issues, such as minimum number of outcome measures to be included in treatment trials, and to decide on the magnitude of differences judged to be clinically relevant.
The following recommendations provide guidance for a more standardized assessment and reporting of outcomes specifically for trials for AUB. Our recommendations should be considered supplemental to the CONSORT guidelines and the extensions of the CONSORT guidelines for trials assessing nonpharmacological treatments and for assessing the harms.
To fully assess the effects of an intervention, we recommend assessing outcomes in all eight outcome categories (Table 2). Assessments of bleeding, quality of life, pain related to heavy bleeding, sexual health, and bulk-related complaints (in patients with leiomyomas) should be conducted both before and after treatment, and the change for each patient should be used as an analytic metric. Patient satisfaction, need for additional treatment, and adverse events should also be reported.
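The recommended analytic metric, the change for each patient between the pretreatment and posttreatment assessments, can be sketched in Python as follows. The patient identifiers and scores are hypothetical and stand in for any instrument measured at both time points; the function name is illustrative.

```python
from statistics import mean


def per_patient_change(baseline, follow_up):
    """Return follow-up minus baseline for each patient measured at both times."""
    if baseline.keys() != follow_up.keys():
        raise ValueError("every patient needs both a baseline and a follow-up value")
    return {pid: follow_up[pid] - baseline[pid] for pid in baseline}


# Hypothetical bleeding scores before and after treatment (lower is better).
baseline_scores = {"P01": 180, "P02": 240, "P03": 150}
follow_up_scores = {"P01": 60, "P02": 90, "P03": 120}

changes = per_patient_change(baseline_scores, follow_up_scores)
mean_change = mean(changes.values())
print(changes)
print(f"mean change: {mean_change}")
```

Computing the change within each patient first, and summarizing afterward, keeps the paired structure of the before-and-after measurements.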
The goal of any treatment for AUB is to reduce or eliminate bleeding, improve other associated symptoms, and improve quality of life. We suggest that bleeding be assessed based on patient-reported amount, frequency of menses, duration of bleeding, and regularity of menses. Many studies include pain, an important symptom in terms of potential impact on patients’ health and well-being, as an outcome, and this should also be measured. Women with leiomyomata may also have symptoms specific to uterine size. Therefore, trials in women with leiomyomata should assess bulk-related symptoms as an outcome before and after treatment.
Condition-specific quality of life is a key patient-based outcome measure. The large personal burden of AUB lies in its adverse impact on quality of life [3,4], and this is an important outcome for AUB research. Sexual health, another important part of a woman’s quality of life, may be adversely affected by AUB and may be affected differentially by different treatments for AUB. Thus, sexual health should also be assessed before and after treatment.
Many studies we reviewed included “satisfaction with treatment” as an outcome for AUB treatments. We rated satisfaction with treatment as an important outcome, given its face validity, despite the dependency of this outcome on a patient’s pretreatment expectations and goals. Measured as the sole outcome for a study, “satisfaction with treatment” is limited, as it provides no information in terms of which symptoms the treatment alleviated and why the patient is satisfied. Collected as one of several outcomes, satisfaction with treatment may still be helpful to inform patients in their selection of different therapies.
These recommendations are not meant to provide an exhaustive or definitive list of the exact data and outcome measures to be collected. There is no one standard measurement tool for the suggested outcomes included in these recommendations. In the absence of a standardized instrument for measuring outcomes in trials for AUB, researchers should consider combining available validated instruments for measuring outcomes or clearly explain how new instruments for measuring outcomes were developed specifically for their study. Researchers should clearly describe which instruments or methods were used to assess bleeding, related symptoms, quality of life, sexual health, and satisfaction. It would be ideal if investigators developed a validated measure or set of instruments that captured all AUB-related symptoms so that a standardized tool can be used in all future trials.
One strength of this study is that the review group used a systematic process to identify RCTs for AUB and to list all outcomes reported in these trials. Additionally, a data-driven iterative expert consensus process was used to grade the importance of each of these outcomes. We used a voting process that left it to each group member to weigh the relative impact of the various considerations for what makes an outcome important for decision making and to arrive at a composite grade. Because this was, to some degree, subjective, we would have excluded individuals with any self-reported conflicts of interest, had there been any. As the GRADE working group states, knowledge about optimal strategies for making decisions about relative importance of outcomes remains limited. Regardless, it is desirable to make the process transparent and explicit. A limitation of the study is that there was no direct patient participation in deciding on the importance of outcomes for clinical decision making. For this reason, we referred to the available literature on patient experience with AUB and patient-based outcomes measures for AUB. For the same reason, we invited individuals with particular expertise in patient-reported outcome measures for AUB research to participate in the review group discussions.
An explicit, transparent process for grading the importance of outcomes is a key step in evidence synthesis and assessment of benefits and limitations of alternative treatments. Our process of ranking importance of AUB outcomes combined systematic review steps to generate an outcome inventory followed by a stepwise expert-based categorization and rating of outcomes. We have provided recommendations for future AUB trials to improve the consistency and quality of outcome data. This process for ranking of importance of outcomes may be used in other fields where heterogeneity in outcomes hampers comparison of results across different treatments.
To fundamentally address and ameliorate the problem of nonstandardized outcome measurement for AUB will require further work involving clinicians, patients, and researchers to narrow the range of outcomes to a core set of valid, discriminatory, and feasible tools and to use them consistently in future trials.
The following SGS Systematic Review Group members significantly contributed to the acquisition of data: LTC Jeffrey L. Clemons, MD; Rajiv B. Gala, MD; Oz Harmanli, MD; Lior Lowenstein, MD, MS; James Lukban, DO; Abraham Morse, MD; Miles Murphy, MD, MSPH; Nicole Calloway Rankins, MD, MPH; Scott Smilen, MD; Karl Tamussino, MD; and James Theofrastous, MD.
Funding/support: Funding of assistance by methodologic experts in systematic review and logistic support was provided by the Society of Gynecologic Surgeons.