|Home | About | Journals | Submit | Contact Us | Français|
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Temporomandibular joint disorders (TMJD) are multifactor, complex clinical problems affecting approximately 60–70% of the general population, with considerable controversy about the most effective treatment. For example, reports claim success rates of 70% and 83% for non-surgical and surgical treatment, whereas other reports claim success rates of 40% to 70% for self-improvement without treatment. Therefore, the purpose of this study was to (1) identify systematic reviews comparing temporomandibular joint disorder surgical and non-surgical treatment, (2) evaluate their methodological quality, and (3) evaluate the evidence grade within the systematic reviews.
A search strategy was developed and implemented for MEDLINE, Cochrane Library, LILACS, and Brazilian Dentistry Bibliography databases. Inclusion criteria were: systematic reviews (± meta-analysis) comparing surgical and non-surgical TMJD treatment, published in English, Spanish, Portuguese, Italian, or German between the years 1966 and 2007(up to July). Exclusion criteria were: in vitro or animal studies; narrative reviews or editorials or editorial letters; and articles published in other languages. Two investigators independently selected and evaluated systematic reviews. Three different instruments (AMSTAR, OQAQ and CASP) were used to evaluate methodological quality, and the results averaged. The GRADE instrument was used to evaluate the evidence grade within the reviews.
The search strategy identified 211 reports; of which 2 were systematic reviews meeting inclusion criteria. The first review met 23.5 ± 6.0% and the second met 77.5 ± 12.8% of the methodological quality criteria (mean ± sd). In these systematic reviews between 9 and 15% of the trials were graded as high quality, and 2 and 8% of the total number of patients were involved in these studies.
The results indicate that in spite of the widespread impact of TMJD, and the multitude of potential interventions, clinicians have expended sparse attention to systematically implementing clinical trial methodology that would improve validity and reliability of outcome measures. With some 20 years of knowledge of evidence-based healthcare, the meager attention to these issues begins to raise ethical issues about TMJD trial conduct and clinical care.
Temporomandibular Joint Disorders (TMJD) is a collective term used to describe a number signs and symptoms involving the temporomandibular joints, masticatory muscles, and associated structures. Approximately 60–70% of the general population has at least one sign of a temporomandibular disorder which include limited mouth opening, clicking, and locking (e.g.: [1,2]). Further, TMJD is frequently associated with pain in regions outside of the immediate joint area, such as recurrent headaches and neck pain[1,3]. Patients afflicted with a severe TMJD can experience significant reductions in quality of life, affecting both personal life and work, and everyday activities such as eating, talking, yawning, and laughing can become painful [1-3].
The treatment of TMJD can be divided into two main groups. The first is the non-surgical therapy and it includes treatments such as counselling, physiotherapy, pharmacotherapy, and occlusal splint therapy (e.g.:[1,4]). The other is the surgical therapy and it ranges from temporomandibular joint arthrocentesis and arthroscopy to the more complex open joint surgical procedures, referred to as arthrotomy (e.g.:). Narrative reviews indicate that the success rate of nonsurgical treatment is approximately 70% (e.g.:) and the surgical treatment success is approximately 83% (e.g.:), whereas other studies report approximately 40% to 70% self-improvement without any treatments (e.g.:[6,7]). Systematic reviews, however, paint a different picture. The systematic reviews indicate that there is not enough high quality evidence to make informed clinical decisions (e.g.: [8-17]). Yet, some systematic reviews do offer treatment guidance. Clearly, there is some controversy about which treatments are the most effective.
To provide a baseline of high quality information, Bader and Ismail  performed a literature survey identifying published systematic reviews of on clinical dentistry, of which 7 examined TMJD. Two addressed the cause and effect of TMJD, one study addressed diagnosis and imaging techniques, and the remaining 4 reviews were about the treatment of TMJD. None addressed TMJD surgical treatment.
The purpose of the current study, therefore, was to (1) identify systematic reviews comparing temporomandibular joint disorder surgical and non-surgical treatment, (2) evaluate their methodological quality, and (3) evaluate the evidence grade within the systematic reviews.
A search strategy (Table (Table1)1) was developed to identify systematic reviews (SR) that address TMJD treatment. The search comprised articles indexed in MEDLINE, Cochrane Library, LILACS, and Brazilian Dentistry Bibliography databases that were published in English, Spanish, Portuguese, Italian, and German between the years 1966 and 2007 (up to July, 31).
The following inclusion criteria were used to identify potentially relevant papers: the article was really a systematic review or meta-analysis, as stated and described in methods section of the publication, and the primary focus was a comparison among different TMJD surgical treatments and non-surgical treatment. Exclusion criteria were: systematic reviews addressing non-surgical TMJD only; clinical trials, narrative reviews or editorials or letters to editor; in vitro or animal studies; articles published in languages other than English, Spanish, Portuguese, Italian or German. Initially, two investigators (RVBN and RN) independently read all the titles and abstracts from the multiple search results to identify potentially eligible articles for inclusion. All potentially eligible studies were then retrieved and full-text article was reviewed to determine if it met the inclusion criteria. Disagreements were resolved by discussion among the investigators.
In the next phase, the investigators independently evaluated the identified systematic reviews using three instruments (AMSTAR, OQAQ and CASP) to appraise the methodological quality of systematic review [19-21].
AMSTAR or 'Assessment of Multiple Systematic Reviews' consists of 11 items tool and was developed pragmatically using previously published tools and expert consensus. The original 37 items were reduced to an 11 item instrument addressing key domains (eg. study question, search strategy, inclusion/exclusion criteria, data extraction, study quality). The resulting instrument was judged by the expert panel to have face and content validity and The Canadian Agency for Drugs and Technologies in Health (CADTH) was selected AMSTAR as the best instrument for appraising systematic reviews[22,23].
OQAQ or 'Overview Quality Assessment Questionnaire' has nine individual items and it specified purpose is to evaluate the scientific quality (i.e., adherence to scientific principles) of systematic reviews published in medical literature. This scale is not designed to measure literary quality or importance. The validity of the scale has been tested and validated.
The Critical Appraisal Skills Programme or CASP was developed by the Public Health team in Oxford and the programme of North Thames Research Appraisal Group (NTRAG). It aims to enable clinicians to develop the skills to find and make sense of research evidence, helping them to put knowledge into practice. CASP uses an instrument to appraise systematic reviews based on 10 questions. These questions address key domains (eg. comprehensive search, validity assessment, results combination) of methodological quality.
To have a standard way to compare, the answers of each instrument were scored 'yes' or 'no' when the criterion was explicitly met or not met, and 'can't tell' if methods were reported incompletely or not reported at all. For the calculation of the overall methodological quality the mean of the number of 'yes', 'can't tell', 'no' answers recorded for each instrument were determined.
To appraise the quality of the underlying evidence, we used the GRADE system [25,26]. Essentially GRADE classifies quality in four categories: randomized clinical trials (RCT) that show consistent results (high quality); randomized clinical trials with methodological limitations or QUASI-RCT (moderate quality); and observational studies with control groups (e.g., cohort and case-control studies) without exceptional strengths (low quality); unsystematic clinical observations (e.g., case reports and case series) as evidence of very low quality evidence (very low quality).
The number of reports that met each of the criteria was determined and tabulated. The estimated mean overall quality score was calculated using Excel for windows (Microsoft, USA) spreadsheets.
Figure Figure11 summarizes the results of the Medline, Cochrane Library, LiLACS and Brazilian Dentistry Bibliography databases searches. 211 references that appeared to meet the definition of systematic review or meta-analysis were identified (Additional file 1). Inspection of the titles and abstracts by two investigators excluded 97 as not meeting inclusion criteria, leaving 114 that appeared to be relevant or which abstracts were missing. Full text of the 114 references was obtained and their reference lists examined for additional relevant articles. Of the 114, 112 were eliminated leaving 2 articles that met all inclusion criteria.
The included systematic reviews are summarized in Table Table2.2. The first systematic review, Kropmans et al  addressed the question: what are the positive effects of arthroscopic surgery, arthrocentesis and physical therapy with regard to TMJD. The second, Reston and Turkelson , addressed the question: can any surgical procedures, compared to non-surgical therapy effectively treat TMJD?
The evaluation of the methodological quality of the included reviews for each instrument is shown in Tables Tables3.3. .4,4, ,5,5, ,6,6, ,7,7, ,88 and the overall quality results are shown in Table Table9.9. The Kropmans et al review met 23.5 ± 6.0% (ranging 18.2% to 30% 'yes' answers) and the Reston and Turkelson review met 77.5 ± 12.8% (ranging 63.6 to 88.9% yes answers) of the methodological quality criteria (mean ± sd).
Table Table1010 summarizes the quality of evidence in each review. Data from 20 articles (4585 patients) were analyzed by Kropmans et al. Data from 23 articles (1463 patients) were analyzed by Reston and Turkelson. Only three (15%) articles in Kropmans et al review were randomized clinical trials (high quality evidence), which represents 1.6% of the number of patients and six (30%) were Quasi-RCT (moderate evidence), representing 6% of the patients. In this case, methodological quality varied between RCTs in terms of how the sequence generation was done and the adequacy of allocation concealment. Conversely, eleven (55%) of the articles were non-systematic clinical observations (ie, case reports and case series), which were classified as very-low-quality evidence and represents 92.3% of the number of patients. The analysis of the included articles in Reston and Turkelson showed that two (8.7%) were randomized clinical trials, which represents 7.7% of the number of patients, five (21.7%) were observational studies (low evidence), which represents 29% of the patients, and 16 (69.6%) were case series, which represents 63.3% of the number of patients.
In this report, we sought to identify and evaluate the quality of systematic reviews comparing surgical and non-surgical care of TMJD. Our search strategy identified two reviews (one with and one without a meta-analysis). Kropmans et al found no difference in clinical outcomes (e.g.: maximal mouth opening, pain, or function improvement) when comparing arthroscopic surgery, arthrocentesis, and physical therapy. Similarly, Reston and Turkelson examining the efficacy of the surgical techniques arthrocentesis, arthroscopy, and disc repair/reposition, found no difference in clinical outcomes. Conceptually, both studies provide estimates of surgical vs. surgical and non surgical care, and both indicate that there are no significant differences between the interventions examined (Table (Table2).2). However, the methodological quality (from 0 to 100, with 100 being the best) of these two systematic reviews (Kropmans et al = 23.5; Reston and Turkelson = 77.5), and the underlying high quality evidence upon which they are based (from 0 to 100, with 100 being the best) (Kropmans et al = 15; Reston and Turkelson = 8.7), suggests that the interpretations of these authors may be over- (or under-) statements. This leads one to an interesting dilemma: How can one best care for patients if there are few high quality studies upon which to base this care?
The authors of the systematic reviews recognized some of these shortcomings. Reston and Turkelson pointed out that their study is an explicit attempt to provide a conclusion when reliable primary data are not available and recognized that clinical and policy decisions must often be made in the absence of well-designed trials. They also discussed two main flaws. The first is that there are few studies that evaluated a specific treatment on specific TMJD sub-groups of patients; therefore the relative efficacy of each treatment could not be defined. The other is that the studies contained a wide range of definitions of success or improvement after treatment. Kropmans et al reported that none of the reviewed scientific papers reported the measurement error of the procedures used (e.g.: standard error of measurement or 95% confidence interval).
Another set of issues is the classification of trials design. There are 8 articles common to both reviews, from which 3 match the same study design classification and 5 don't match. This could be explained by differences in the way that the included systematic reviews classified the study design of each article. A possible solution to this problem is the use of standard approaches to classify study designs, thus providing more consistency (e.g.:). These approaches has been used successfully by other areas of dental care (e.g.: preventive dentistry[30,31]).
While the overall quality of the reviews is of some interest, the results of each measurement tool (Tables (Tables3,3, ,4,4, ,5,5, ,6,6, ,7,7, ,8)8) may offer additional insight into areas that could be improved. For example, the common missing elements in systematic reviews, in general, are: explicit search strategy; explicit articulation of exclusion criteria; absence of quality assessment of the underlying studies; and inappropriate aggregation of study results. Thus, there are crucial elements in the conduct of a systematic review, without which the results may be questionable. From the analysis of Tables Tables3,3, ,4,4, ,5,5, ,6,6, ,7,7, ,88 one can identify that each included systematic review showed different results for each instrument. This could be explained by the fact that each instrument has a question pattern to appraise methodological quality and address different key areas. For example, AMSTAR addresses to potential bias in systematic reviews (such as: funding source and conflict of interest) that OQAQ and CASP doesn't.
Ideally, systematic reviews should be the starting point for any search for information. The results of the current study raise questions about the quality of the systematic reviews and clinical trials upon which TMJD care is based. Unfortunately, this is a long standing systematic problem. A systematic review of TMJD diagnosis, which was carried out more than 10 years ago, reported major to extensive methodologic flaws in 40% of the analyzed studies. Similarly, in examining the clinical literature, Bader and Ismail identified 7 systematic reviews on diagnosis and non-surgical treatment of TMJD. For 6 of the reviews (85.7%) they identified a need for well designed controlled studies, the use of standardized diagnostic criteria and outcome measures. Finally a 2007 report from the United States Government Accountability Office examined the Food and Drug Administration agency approval for the use of temporomandibular joint implants. They identified some of the same methodological problems: inadequate or inaccurate measurements, sample size and patient follow-up.
Thus, many of the previous results and this systematic review are congruent. That is, clinical scientists need to begin designing, implementing, and reporting clinical trials and systematic reviews that meet international standards. In terms of international standards and chronic pain, there is the IMMPACT recommendations, which suggest that chronic pain clinical trials should assess outcomes representing six core domains: (1) pain, (2) physical functioning, (3) emotional functioning, (4) participant ratings of improvement and satisfaction with treatment, (5) symptoms and adverse events, (6) participant disposition (e.g. adherence to the treatment regimen and reasons for premature withdrawal from the trial).
More specifically, for TMJD, a coalescence of symptom and outcome measurements could be profound. For example, one might consider measurements and reporting of pain as pioneered by the Oxford Pain Research Center. This Pain Center, in generating or evaluating systematic reviews of acute pain, prefers validated visual analog scales (VAS) to report pain, specifically includes randomised, double-blind, single-dose studies in patients with moderate to severe pain, and looks for outcomes of 50% pain relief at 4–6 hours. The analog for TMJD could be VAS of symptoms pre-therapy and 3-months post-therapy. Post-therapy reporting would include both the number and percentage of patients with 50% pain relief for both the experimental and control group.
This would, however, necessitate having clinical scientists register their trials, select and implement interventions and comparisons in a standard, randomized, blinded fashion, and completely report these results using the CONSORT guideline (e.g.:). Were this to occur it would substantially improve the evidence-base, reduce the variation in care and improve knowledge of patient outcomes. Moreover, there are also guidelines (e.g.: TREND) to help researchers improve the transparency or clarity of non-randomized designs reports.
There are at least four limitations to our study. First, several instruments exist to assess the methodological quality of systematic reviews, but not all of them have been developed systematically or empirically validated and have achieved general acceptance. Furthermore, since their development, considerable empirical research has accumulated about potential sources of bias in systematic reviews. For example, recent methodological research has highlighted the potential importance of publication language and publication bias in systematic reviews. As an attempt to address the slightly varying perspectives of these instruments, we averaged the outcomes of each instruments. Moreover, the average of the group of does not differ radically from the assessment of any individual instrument. This suggests that the average is a reasonable estimate of the quality.
Second, our searches identified 211 studies, of which 22 (10.4%) were in excluded languages. Thus, while we attempted to capture the majority of the published literature, we clearly missed a significant portion of the literature. Third, we did not examine the reference lists of the identified articles, further increasing the probability of missed information. Forth, to extrapolate our results to the care of individual patients would be difficult. All patients vary with regard to their pathology and clinical characteristics and need a personalized approach to care. Although the overall outcomes of TMJ surgery in clinical trials may not be different than non-surgical care, TMJ surgery may still be the best option for specific patients and should be a treatment that should not be forgotten.
The most troubling aspect of our findings involves the ethics of trials that do not meet international standards of conduct, and the care of patients that is not based on high levels of evidence. The potential implications of this failing is clearest to understand in terms of the U.S. Supreme Court ruling in Daubert v. Merrell Dow. In this case the Supreme Court applied the Federal Rules of Evidence for causality of harm, based on the highest level of evidence. This ruling supplanted the common-law test of Frye v. United States, which based rulings on local practice customs. Thus one might imagine that legal suits could arise from the application of trial methodology or clinical practice that does not meet international standards.
We identified two systematic reviews that compared surgical and non-surgical treatment of TMJD. Both studies indicated that there are no statistically significant differences between the effects of the identified treatments. However, both studies, while of different methodological quality, could only identify, and thus based their outcomes on studies of mixed quality. This raises 2 concerns. First, we are basing clinical care on studies of low quality and second there is a clear need for considerably more attention to clinical trial design, implementation and reporting, if one is to provide quality patient care.
There are no competing interests. This work was part of a thesis, support by the Brazilian Council for Science and Technology (CNPq), submitted to the faculty of graduate studies in partial fulfilment of the requirement for the degree of PhD, Department of Oral and Maxillofacial Surgery, University of Pernambuco, Brazil.
RN and BCEV conceived the study. RVBN and RN were responsible for the design of the study, searching, selection and data acquisition and analysis. All authors were involved in the drafting of the manuscript and gave approval of the final version.
The pre-publication history for this paper can be accessed here:
References that appeared to meet the definition of systematic review or meta-analysis.
The authors would like to thank Janaina Salmos for her help during the conclusion of this manuscript.