Correspondence to: I Michael Leitman, MD, FACS, Department of Surgery, Beth Israel Medical Center, 10 Union Square East, 2M, New York, NY 10003, United States. mleitman@chpnet.org
Telephone: +1-212-8448570 Fax: +1-212-8448440
Despite the prevailing emphasis in the medical literature on establishing evidence, many changes in the practice of surgery have not been achieved using proper evidence-based assessment. This paper examines the adoption of laparoscopic cholecystectomy (LC) into regular use for the treatment of cholecystitis and the process of its acceptance, focusing on the limited role of technology assessment in its appraisal. A review of the published medical literature concerning LC was performed. Approximately 3000 studies of LC have been conducted since 1985, and there have been nearly 8500 publications to date. As LC was adopted enthusiastically into practice, the results of outcome studies generally showed that it compared favorably with the traditional, open cholecystectomy with regard to mortality, complications, and length of hospital stay. However, despite the rapid general agreement on surgical technique, efficacy, and appropriateness, there remained lingering doubts about the safety, outcomes, and cost of the procedure, which suggests that essential research questions were ignored even as the procedure became standard. The case of LC offers important lessons about the need for guidelines governing surgical innovation and the adoption of minimally invasive surgical techniques into clinical practice. We highlight one recent example, natural orifice transluminal endoscopic surgery, and the need to evaluate this new technology properly before it is accepted as a safe and effective surgical option.
With the introduction of laparoscopic cholecystectomy (LC) in the late 1980s, gastrointestinal surgery was forever changed. Arguably more than any other laparoscopic procedure, LC drove the endoscopic revolution. However, nothing has been more astounding than the speed with which the procedure was unanimously hailed as the new gold standard for surgical gallbladder removal.
Recently, the Archives of Surgery published the results of a randomized clinical trial comparing LC with open cholecystectomy (OC), which concluded that a small-incision variant of the open procedure is just as efficacious as LC with regard to primary and secondary clinical outcome measures, thus questioning the status of LC as the gold standard. This article is notable not just for its unexpected conclusion but because it was published in 2008 - over 20 years since the first LC in France and 15 years since its global dissemination. Other studies have appeared in recent years, with similar conclusions[2-8]. These studies give rise to several questions: Why conduct randomized, controlled trials so many years after LC’s incorporation into clinical practice? How did questions regarding LC’s safety and efficacy come to linger unanswered, warranting review now? On what level of evidence was LC so widely practiced?
This article examines the evidence available to surgeons during the period of LC’s rapid diffusion. We conclude that there was minimal evidence to support the use of LC over OC and that the early enthusiasm for LC was largely unfounded. In particular, the quality, methodologic design, scope, and timeliness of published studies were often poor, and they failed to adequately establish LC’s safety, effectiveness, or cost savings over alternative therapies.
By critically appraising the development and diffusion of LC and identifying essential research questions that were ignored, we hope to illustrate the need for assessment that arises in the development of new surgical technologies. We conclude by considering the emergence of a more recent surgical innovation - natural orifice transluminal endoscopic surgery (NOTES) - and comparing its current evaluation with the historical course of LC and the standards of technology assessment.
Originally described in 1882 by Langenbuch, open cholecystectomy remained the standard treatment for cholelithiasis and other diseases of the gallbladder for over 100 years. The laparoscopic alternative to the open procedure was originally developed in Germany by Mühe[10-13] in 1985 in an attempt to reduce post-operative morbidity and related cost. It was popularized in France by Mouret[14-17] and Dubois[18-22] in 1987. LC was quickly adopted as the procedure of choice in many countries, and in the United States LC accounted for 90% of all cholecystectomies a mere four years following its introduction[23,24].
LC has proven its value over time, but from an evidence-based perspective, the enthusiastic preference for LC in the 1990s was largely unfounded. The problem was not with the procedure itself, but rather the lack of data supporting it: the literature that argued for the superiority of LC over OC did not adequately document LC’s safety, effectiveness, or cost savings over alternative therapies.
In reviewing the LC literature, it is helpful to consider the publication timeline in light of McKinlay’s heuristic stages of medical innovation: LC went through an initial stage of the “promising report” (1985-1992), a middle stage of professional and organizational adoption (1993-1995), and a late stage of observational reports and standardization (1996-1999). A contemporaneous search of Medline via Ovid using keyword “cholecystectomy” revealed that in these 15 years (1985-1999), 2822 articles on LC were indexed in Medline, predominantly in English-language journals.
Remarkably, there were no publications on LC before 1990 - a good 5 years after its introduction and 2 to 3 years after its world-wide popularization. Of the 228 articles published between 1990 and 1992, the most common topics were technique (58 articles, 25%), safety (39 articles, 17%), and instrumentation (14 articles, 6%). Despite the fact that LC was developed partly in response to economic concerns, only 4 articles (1.8%) evaluated its economic aspects or made any economic comparison with OC. Only 2 articles (0.9%) described trends in practice, and only 8 (3.5%) discussed standardization of technique.
By the mid-1990s, LC had become widely adopted as the standard of care for the surgical treatment of gallbladder disease. Between 1993 and 1995, 1370 articles on LC were published. Three hundred and thirty-two (24%) of these covered safety issues, for which data were already available, although many of these articles considered special populations (e.g. the pregnant, pediatric, geriatric, or immunosuppressed patient). Another 228 (17%) discussed variations in methodology or instrumentation. Bridging these two categories were outcome-oriented studies, which consisted mainly of case reports. Despite the 10 years that had passed since its introduction, only 46 (3%) articles discussed trends in utilization or the development of standard practice. Forty-three articles (3%) focused on its economic aspects. Importantly, during this period, a number of commentaries questioned whether the overwhelming support for LC was warranted. Nonetheless, most objections were voiced only in editorials, and no studies sought to elucidate such issues systematically.
In the late phase of adoption (between 1996 and 1999), 1224 articles were published on LC. While the numbers of procedural, methodologic, and economic articles were comparable to those published in previous periods, a large percentage (294, or 24%) focused on observational accounts of safety, adverse reactions, and contraindications. This attention to safety is noticeably out of place for a technology that, after 15 years, should already have been properly evaluated for safety issues. Moreover, this ongoing concern is evidence of the rushed adoption of LC in its earliest stage (Figure 1A). Another 29 articles (2%) in this late phase were dedicated to discussing trends in practice, standardization of methods, and utilization issues, representing increasing sentiment among surgeons and health-services researchers that there was a need for better assessment of LC.
Today, Medline has indexed over 8300 articles concerning various aspects of LC. The continued appearance of articles dealing with safety and effectiveness suggests lingering concerns.
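The article counts quoted above for the three phases of adoption can be checked with a few lines of arithmetic. The following is a minimal sketch only: the phase labels and function names are illustrative assumptions, and the numbers are simply those reported in the text (no economics count is given for the late phase, so it is omitted).

```python
# Article counts per phase, copied from the figures quoted in the text.
# The phase labels ("early", "middle", "late") are shorthand for this sketch.
PHASES = {
    "early": {"total": 228, "safety": 39, "economics": 4},
    "middle": {"total": 1370, "safety": 332, "economics": 43},
    "late": {"total": 1224, "safety": 294},  # late-phase economics count not reported
}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


def total_articles(phases):
    """Sum the per-phase article totals."""
    return sum(p["total"] for p in phases.values())


def topic_shares(phases, topic):
    """Percentage of each phase's articles on a topic; phases without it are skipped."""
    return {
        name: share(p[topic], p["total"])
        for name, p in phases.items()
        if topic in p
    }


if __name__ == "__main__":
    print(total_articles(PHASES))            # overall count across the three phases
    print(topic_shares(PHASES, "safety"))    # safety share per phase
    print(topic_shares(PHASES, "economics")) # economics share per phase
```

Under these figures the three phase totals add up to the 2822 articles cited for 1985-1999, and the share of safety-related articles stays near one quarter in the middle and late phases while economic analyses remain at a few percent.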
In accordance with the principles of technology assessment[27,28], the first studies of LC evaluated the safety of the new procedure - with regard to complication rate, post-operative morbidity, and mortality - and the methods involved in the procedure itself. These studies were generally conclusive: LC was found to be a safe, viable procedure with no serious complications or post-operative morbidity, and it appeared to be as safe as the standard, open surgery. These studies also attempted to quantify such outcome variables as efficacy and effectiveness. Most outcome studies reported the main advantages of LC to be “decreased pain and disability and improved cosmesis without increased mortality or morbidity rates”. Quantitative results supporting these conclusions differed from one study to another but were generally consistent: mean hospital stay for LC was reported to be one to two days, with many papers suggesting that LC could be done as an outpatient procedure; mean time to return to work after surgery was approximately 15 d; post-operative pain was subjectively decreased; and “quality of life” was improved. However, despite the favorable results, the outcome data had serious limitations.
Most of the research published between 1985 and 1999 consisted of anecdotal evidence or retrospective case series. As the fervor for endoscopic surgery grew, OC was no longer being performed in adequate numbers to permit comparison studies. To complicate matters, as LC gained favor among surgeons and started to become the new standard, it became increasingly unlikely that prospective comparisons could be performed.
Consequently, one of the initial challenges in defining the appropriate role of LC in contemporary practice was to evaluate the outcomes and risks of OC, by which to compare outcomes from LC. With great foresight, in 1992 Clavien et al conducted a longitudinal evaluation of OC with the express purpose of serving as an historical control for future evaluations of LC. Remarkably, however, this study was rarely cited; by 1999, it had been referenced only 41 times, and fewer than half the citations were in work that addressed LC. While many LC studies alluded to vague OC statistics from historical controls, many studies failed to cite any specific sources of validated data. Without a real comparison group, the claims of LC’s superiority were never truly validated.
Irrespective of methodologic inadequacies, the outcome literature also failed to account for the purported differences between LC and OC. Specifically, some critics argued that the reported advantages of LC were not exclusive to LC, but could also have been achieved with OC, had the latter procedure and post-operative management been improved. For example, while LC was credited with shorter hospital stays, long-term post-operative observation even for standard elective OC was already viewed as unnecessary and was in the process of changing; the routine use of drains, which delayed discharge but were found to provide little benefit, was already less frequent; and routine post-operative administration of antibiotics was increasingly discouraged. Therefore, some argue that the rapid adoption of LC, itself, did not deserve the credit for the dramatic changes in the care of patients, but rather served to catalyze rejection of unnecessary practices that increased costs or delayed discharge. Whether or not such modifications would have yielded the same results in the absence of LC was a worthwhile question that was never answered.
Between 1985 and 1999, available studies in the LC literature assessed only a narrow scope of outcome variables (i.e. morbidity, mortality, and length of hospital stay) but ignored many other outcomes of interest (e.g. cost and important quality-of-life issues). Despite the fact that, from the very beginning, LC was purported to be more cost-effective than OC, few investigators performed economic analyses. The majority of economics-related investigations were rudimentary studies, conducted by clinicians, that discussed only the hospital costs of LC vs OC; others considered costs indirectly via a cost-to-charge-ratio conversion of patient charges for both procedures[35,36]. Few of the early papers and none of the later studies gave detailed estimates of costs and no studies incorporated indirect costs or social impacts. Only two robust economic analyses were performed during the period of LC’s adoption: a cost-effectiveness analysis by Kesteloot et al, and a cost-utility analysis by Cook et al. While both of these were favorable to LC, they were published in the economics literature, where they remained unknown to, or underappreciated by, those in clinical practice.
Many studies advocated LC based upon “patient preference”. One investigator found that patients became convinced of the benefits of LC by the surgeons’ enthusiasm rather than their own understanding of the procedure, and that a thorough informed consent process eliminated this preference. Most studies on LC limited patient-reported outcomes to consideration of such simple Yes/No questions as, “Are you glad you got this procedure?” or, “Would you recommend this procedure to others who suffer from gallbladder disease?”. Detailed investigations of the impact of LC on patient satisfaction, quality of life, and general well-being were never performed. Lacking such analysis, these statistics cannot be used to make convincing claims regarding quality of life.
The first attempt to perform the necessary critical review of LC was conducted by the National Institutes of Health (NIH) in a Consensus Conference held in 1992. The NIH Consensus Panel’s assessment reviewed the numerous studies to date and concluded that LC was a safe and effective procedure that “leads to increased quality of life over other methods of treating gallbladder disease” (extracorporeal shock wave therapy, bile acid therapy, and open cholecystectomy). It acknowledged that the frequency of common bile duct injury was higher with LC than with OC, but suggested that this was in large part due to the experience, skill, and judgment of the surgeon[30,40]. The Panel also concluded that all patients with gallbladder disease should undergo LC in preference to OC except for the following contraindications: cardiopulmonary contraindications to a general anesthetic, hepatic cirrhosis with portal hypertension or coagulopathy, acute pancreatitis, acute gangrenous cholecystitis, septic shock, the third trimester of pregnancy, and previous upper abdominal surgery.
While the NIH Consensus Conference attempted to come to terms with the flood of new information regarding LC, it did not critically appraise the clinical studies that it consulted. The poor quality of evidence for LC at that time was noted by one Australian surgeon: “It is of particular note that the (NIH) review has 99 references and yet few are of real scientific substance. The published papers in this rapidly developing area still consist essentially of anecdotes. We have no adequately constructed clinical trials. We have very little good comparative study. We have essentially no long-term follow up and we have little in the way of objective measures of outcomes prepared by independent observers”.
Because it was based on sparse and incomplete data, the NIH consensus could be no more than “a rapid agreement for the most appropriate procedure” and amounted to little more than informed opinion. Critics later concluded that, despite the “consensus,” the issue was far from decided, especially with regard to such concerns as management of acute cholecystitis and choledocholithiasis and treatment of the pregnant patient.
Whereas the NIH consensus was widely publicized, other better-constructed studies appear to have been less accessible to surgeons, who may not have looked at work conducted outside their discipline. As noted earlier, the Kesteloot et al and Cook et al studies - both far superior to any other economic study published in the surgical literature, and both favorable toward LC - were not recognized by the surgical community. By the late phase of LC’s adoption in 1999, Kesteloot et al had been cited a total of only ten times, only five of which were in surgical journals. Cook et al was cited 15 times, but only twice in surgical journals. Although publication in peer-reviewed economics journals attested to the rigor of their research methods, an unfortunate consequence of their location was that the studies’ audience did not include the surgeons who performed LC, the hospital administrators who invested in equipment, or the third-party payors who were interested in how well LC compared to OC.
The accessibility problem is best exemplified by the excellent multi-stage prospective assessment - arguably the best study ever done on LC - published in a journal dedicated to technology assessment. In 1994 the International Journal of Technology Assessment in Health Care published a cautionary article on the increased use of LC following a multi-stage study commissioned to evaluate LC’s impact on patients, the hospital, and staff after its introduction to the Greater Victoria Hospital Society (GVHS) in Canada in 1991. In prospective, case-control fashion, the study demonstrated that cost of the LC was approximately 47% of that of OC, with the difference being attributed to reduction in length of stay. As had been observed elsewhere, operative time was found to increase by more than 20 min for LC. Post-operative hospital stay was significantly less for LC than for OC (3.2 d vs 13.1 d, respectively). The GVHS assessment even included qualitative patient-reported outcomes regarding need for pain medication and time to resumed normal daily activities as part of the evaluation process. Data collection was repeated annually for three years, and the integration of these multiple assessments was used incrementally to form a maturing consensus.
The study resulted in a favorable view of LC but concluded that the cost savings promised by LC “could only be realized by capitalizing on the reduced length of stay by removing the surgical beds from service” (i.e. earlier discharge), not from reduced costs of the procedure itself. In the 4 years covered by the assessment, the inpatient bed complement at the GVHS decreased by 155 beds, while the number of total surgical procedures was unaffected.
However, despite the strengths of the GVHS study, it, too, appears to have gone largely unnoticed by surgeons. Similar to the two economic evaluations published in the social science literature, by the end of the late phase of adoption in 1999, the GVHS study had never been cited in any medical or surgical literature. These important and fundamental studies might have influenced the diffusion of LC, yet their very existence seems to have been unknown to the medical profession.
The conversion from open to laparoscopic cholecystectomy has been called the most “precipitous and rapid” of all changes to modern surgical medicine. With its first demonstration in 1985, LC spearheaded a revolution in general surgery toward minimally invasive procedures that forever changed the nature of surgery[41,43]. A rigorous positive and normative evaluation of LC relative to OC would have indicated whether the tremendous physician preference for the laparoscopic procedure was warranted; but in the absence of meaningful outcome-based and economic research, the evaluation of LC fell short of standards of evidence-based medicine, and many important questions remained unanswered.
Why there was so little critical evaluation of LC is cause for debate. Indeed, concern for evidence-based practice and systematic evaluation of innovation was already widespread during the mid- to late-1980s when LC was popularized. First developed in the early 1970s, technology assessment had already matured to become a robust discipline essential to both public and private policy-making in many contexts[28,44,45]. Proponents of technology assessment, including the United Kingdom’s National Health Service and the United States’ Institute of Medicine, recognized its ability not only to examine safety and efficacy, but also to describe, both qualitatively and quantitatively, the indirect and delayed social, environmental, economic, legal, and ethical implications of a given medical technology - concepts that also bear strong relation to such clinically important concepts as quality of care, quality of life, and patient well-being in general[27,46,47].
Certainly, the case of LC was unique in many aspects. LC was conceived and popularized not at academic centers but by private clinics. Its explosive rate of adoption was led by market forces[31,43]. Experimental studies were not published, and the first articles appeared several years after LC’s introduction to clinical practice - quite late according to the standards of academic research, even considering the necessary time lag for publication.
Some academic surgeons did voice early concerns about LC and called for better evidence. For example, only a few years after the introduction of LC, O’Brien published his critical observations on laparoscopic surgery and the surgeons who embraced it: “There has been little shyness and little reticence by surgeons in testing the limits of (the) possibilities (of laparoscopic surgery). Any and every part of the gut from the oesophagus to the rectum can be removed endosurgically. … Trivial procedures … can be made difficult, and difficult procedures … have been made even more difficult. Common sense at times is overridden by the surgeons’ tenacity and blind commitment”.
In a survey of British surgeons regarding the necessity of a randomized trial comparing LC and OC, McMahon noted that only 58% of responders considered such a trial to be necessary. Few researchers questioned surgeons’ preference for LC, despite the paucity of scientific evidence. It is likely that the necessary conceptual questions were never seriously considered because they appeared to be self-explanatory. Clearly, the incentive for surgeons was to improve care for their patients, and minimally invasive surgery appeared intuitively superior to alternative treatments: it just “made sense” that a less invasive procedure would necessarily decrease morbidity and would be associated with faster recovery.
It seemed to make sense to patients also: LC quickly became, in popular opinion, the ideal for surgical treatment of gallbladder disease. Because neither surgeons nor patients could claim equipoise any longer, a prospective, randomized controlled trial comparing LC to OC quickly became ethically impossible. Despite long-standing criticism over the details of its assumed superiority, the claimed advantages of LC over OC were never elucidated, and the laparoscopic revolution continued unabated.
With the emergence of further advances in minimally invasive technology, approaches such as single-port surgery and NOTES were inevitable. In the case of NOTES, there is already excitement over its potential applications, and much experimental work has already been started[52-62]. The first report of NOTES in an animal model appeared in 2002, and the first report of NOTES in clinical patient care was published in 2004. Ninety-nine articles were published in the next five years. A search was conducted on March 20, 2009, in MEDLINE via Ovid using the intersection of medical subject headings (laparoscopy, endoscopy, or minimally invasive) with keywords (endoluminal, transluminal, translumenal, natural orifice, peroral, transgastric, transanal, transrectal, transvaginal, transcolonic, or transvesical) with results through January 2008. Standard endoscopic procedures such as percutaneous endoscopic gastrostomy, endoscopic polypectomy, biopsy, needle aspiration, and bilio-pancreatic procedures such as endoscopic retrograde cholangiopancreatography were then manually excluded from search results. The vast majority were descriptions of animal models (37%), non-systematic reviews (17%), or editorials/commentaries (25%). Only 7 case reports and 2 case series dealt with human patients, and only two systematic reviews were indexed (Figure 1B).
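The structure of the search described above, an intersection of subject headings with keywords, can be written out as a Boolean expression. The sketch below is an illustration only: it produces a generic Boolean string, not Ovid's actual query syntax, and the function names are assumptions introduced here. The manual exclusion step is not modeled.

```python
# Terms as listed in the text; the grouping into two OR-sets joined by AND
# mirrors the "intersection" described there.
SUBJECT_HEADINGS = ["laparoscopy", "endoscopy", "minimally invasive"]
KEYWORDS = [
    "endoluminal", "transluminal", "translumenal", "natural orifice",
    "peroral", "transgastric", "transanal", "transrectal",
    "transvaginal", "transcolonic", "transvesical",
]


def or_group(terms):
    """Join terms into a parenthesized OR group, quoting each term."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(headings, keywords):
    """Intersect the heading group with the keyword group (generic Boolean form)."""
    return or_group(headings) + " AND " + or_group(keywords)


QUERY = build_query(SUBJECT_HEADINGS, KEYWORDS)

if __name__ == "__main__":
    print(QUERY)
```

A record matches only if it carries at least one of the subject headings and at least one of the keywords, which is why standard endoscopic procedures still had to be screened out by hand afterward.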
These articles discussed novel, experimental procedures or instrumentation, followed by theoretical discussion such as advantages/disadvantages and indications/contraindications. However, of the 37 experimental studies, only 8 (21%) attempted to evaluate safety, and only 3 (8%) evaluated outcome. There are no definitive safety or outcome data yet available for human patients (Figure 1C). Nevertheless, the first case report of successful transvaginal cholecystectomy in a human being was published in 2007.
Recognizing the paucity of safety and outcome data, and perhaps learning from the experience with LC in the 1990s, clinicians have sought to set early responsible guidelines for research and development of NOTES. To their credit, in 2005, surgeons and gastroenterologists formed the Natural Orifice Surgery Consortium for Assessment and Research (NOSCAR)[65,66]. Their SAGES/ASGE white paper outlined the state of NOTES procedures and specified research that must still be done prior to the stepwise introduction of NOTES into clinical practice.
But is NOTES really ready for prime time? Private and academic medical centers are already hosting NOTES training seminars for surgeons, gastroenterologists, and even residents in these specialties to train them for this next frontier in minimally invasive surgery. There is tremendous appeal in “surgery without scars”, but proceeding directly from promising reports and the short-term evaluation of NOTES’ basic safety to its incorporation into clinical practice once again skips the crucial aspects of an evidence-based technology assessment that would justify its adoption.
McKinlay has argued that “there is a double standard in the acceptance of reports of surgical vs medical treatments, and that this arises from professional and lay attitudes”. However, the tenets of evidence-based medicine and the process of technology assessment are just as pertinent to the field of surgery as to other medical disciplines - and their warning resounds just as clearly. There should be a strict evidence-based progression from early safety studies to subsequent comparative outcome studies and economic analyses. Systematic critical appraisal of the evidence should be conducted periodically and favor large, contemporaneous, prospective, blinded, randomized and controlled studies over studies using other methodological approaches. Assessments may be performed at various stages of maturity and diffusion over the lifetime of a given medical technology. Even when a technology is assessed in its earliest stages of development, it is possible and important to articulate the goals for basic outcome assessment set during these conceptual and experimental stages[44,49,68,69].
As a testament to the importance of formal technology assessment in health policy and planning, many countries have established centers for comparative effectiveness research - such as the National Institute for Health and Clinical Excellence in the United Kingdom, the Canadian Coordinating Office for Health Technology Assessment in Canada, and the Institute for Quality and Efficiency in Health Care in Germany[70-72]. In the United States, the U.S. Congress Office of Technology Assessment evaluated multiple technologies before being disbanded in 1995. The Agency for Healthcare Research and Quality, now the primary federal agency charged with improving the quality, safety, efficiency, and effectiveness of health care, was recently allocated $50 million in FY2009 to conduct comparative effectiveness research, and the Comparative Effectiveness Research Act recently introduced in Congress proposed the establishment of an institute dedicated to organizing and conducting such research.
With such a framework in place, surgeons worldwide are well-poised - and ethically bound - to ensure that scientific evidence for a surgical procedure supports its advantages over other surgical options. Proponents of developing surgical technologies must carefully follow the established principles of clinical research, technology assessment, and evidence-based medicine to safeguard the integrity of surgical practice and meet the professional responsibilities that monumental technological changes create.
Peer reviewers: Dr. Peter Draganov, Division of Gastroenterology, Hepatology and Nutrition, University of Florida, Gainesville, 1600 SW Archer Road, PO Box 100214, FL 32610, United States; Javier San Martín, Chief, Gastroenterology and Endoscopy, Sanatorio Cantegril, Av. Roosevelt y P 13, Punta del Este 20100, Uruguay
S-Editor Wang YR; L-Editor Logan S; E-Editor Ma WH