1.  Comparative Efficacy of Seven Psychotherapeutic Interventions for Patients with Depression: A Network Meta-Analysis 
PLoS Medicine  2013;10(5):e1001454.
Jürgen Barth and colleagues use network meta-analysis - a novel methodological approach - to reexamine the comparative efficacy of seven psychotherapeutic interventions for adults with depression.
Previous meta-analyses comparing the efficacy of psychotherapeutic interventions for depression were clouded by a limited number of within-study treatment comparisons. This study used network meta-analysis, a novel methodological approach that integrates direct and indirect evidence from randomised controlled studies, to re-examine the comparative efficacy of seven psychotherapeutic interventions for adult depression.
Methods and Findings
We conducted systematic literature searches in PubMed, PsycINFO, and Embase up to November 2012, and identified additional studies through earlier meta-analyses and the references of included studies. We identified 198 studies, including 15,118 adult patients with depression, and coded moderator variables. Each of the seven psychotherapeutic interventions was superior to a waitlist control condition with moderate to large effects (range d = −0.62 to d = −0.92). Relative effects of different psychotherapeutic interventions on depressive symptoms were absent to small (range d = 0.01 to d = −0.30). Interpersonal therapy was significantly more effective than supportive therapy (d = −0.30, 95% credibility interval [CrI] [−0.54 to −0.05]). Moderator analysis showed that patient characteristics had no influence on treatment effects, but identified aspects of study quality and sample size as effect modifiers. Smaller effects were found in studies of at least moderate (Δd = 0.29 [−0.01 to 0.58]; p = 0.063) and large size (Δd = 0.33 [0.08 to 0.61]; p = 0.012) and those that had adequate outcome assessment (Δd = 0.38 [−0.06 to 0.87]; p = 0.100). Stepwise restriction of analyses by sample size showed robust effects for cognitive-behavioural therapy, interpersonal therapy, and problem-solving therapy (all d>0.46) compared to waitlist. Empirical evidence from large studies was unavailable or limited for other psychotherapeutic interventions.
Overall our results are consistent with the notion that different psychotherapeutic interventions for depression have comparable benefits. However, the robustness of the evidence varies considerably between different psychotherapeutic treatments.
Editors' Summary
Depression is a very common condition. One in six people will experience depression at some time during their life. People who are depressed have recurrent feelings of sadness and hopelessness and might feel that life is no longer worth living. The condition can last for months and often includes physical symptoms such as headaches, sleeping problems, and weight gain or loss. Treatment of depression can include non-drug treatments (psychotherapy), antidepressant drugs, or a combination of the two. Especially for people with mild or intermediate depression, psychotherapy is often considered the preferred first option. Psychotherapy covers a range of different approaches, and a number of established types have all been shown to work for at least some patients.
Why Was This Study Done?
While it is broadly accepted that psychotherapy can help people with depression, the question of which type of psychotherapy works best for most patients remains controversial. While many scientific studies have compared one psychotherapy with control conditions, there have been few studies that directly compared multiple treatments. Without such direct comparisons, it has been difficult to establish the respective merits of the different types of psychotherapy. Taking advantage of a recently developed method called “network meta-analysis,” the authors re-examine the evidence on seven different types of psychotherapy to see how well they have been shown to work and whether some work better than others.
What Did the Researchers Do and Find?
The researchers looked at seven different types of psychotherapy, which they defined as follows. “Interpersonal psychotherapy” is short and highly structured, using a manual to focus on interpersonal issues in depression. “Behavioral activation” raises the awareness of pleasant activities and seeks to increase positive interactions between the patient and his or her environment. “Cognitive behavioral therapy” focuses on a patient's current negative beliefs, evaluates how they affect current and future behavior, and attempts to restructure the beliefs and change the outlook. “Problem solving therapy” aims to define a patient's problems, propose multiple solutions for each problem, and then select, implement, and evaluate the best solution. “Psychodynamic therapy” focuses on past unresolved conflicts and relationships and the impact they have on a patient's current situation. In “social skills therapy,” patients are taught skills that help to build and maintain healthy relationships based on honesty and respect. “Supportive counseling” is a more general therapy that aims to get patients to talk about their experiences and emotions and to offer empathy without suggesting solutions or teaching new skills.
The researchers started with a systematic search of the medical literature for relevant studies. The search identified 198 articles that reported on such clinical trials. The trials included a total of 15,118 patients and compared one of the seven psychotherapies either with another one or with a common “control intervention”. In most cases, the control (no psychotherapy) was deferral of treatment by “wait-listing” patients or continuing “usual care.” With network meta-analysis they were able to summarize the results of all these trials in a meaningful way. They did this by integrating direct comparisons of several psychotherapies within the same trial (where those were available) with indirect comparisons across all trials (using no psychotherapy as a control intervention).
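The combination of direct and indirect comparisons described above can be sketched in a few lines. This is a deliberately simplified illustration (a Bucher-style indirect comparison followed by fixed-effect inverse-variance pooling), not the Bayesian network model used in the study; the function names and all numbers are hypothetical.

```python
import math

def indirect_comparison(d_ac, se_ac, d_bc, se_bc, z=1.96):
    """Indirect estimate of A vs B through a common control C.

    d_ac and d_bc are standardised mean differences of A and B against C.
    The variances add because the two estimates come from separate trials.
    """
    d_ab = d_ac - d_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return d_ab, se_ab, (d_ab - z * se_ab, d_ab + z * se_ab)

def pool_inverse_variance(estimates):
    """Fixed-effect pooling of (estimate, standard_error) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * d for w, (d, _) in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

# Hypothetical inputs: therapy A vs waitlist and therapy B vs waitlist.
d_ab, se_ab, ci_ab = indirect_comparison(-0.90, 0.10, -0.60, 0.12)

# Hypothetical direct A-vs-B trial result, combined with the indirect estimate.
mixed, mixed_se = pool_inverse_variance([(-0.25, 0.20), (d_ab, se_ab)])
```

The indirect estimate is always less precise than either of the two comparisons it is built from, which is one reason direct head-to-head trials remain valuable.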
Based on the combined trial results, all seven psychotherapies tested were better than wait-listing or usual care, and the differences were moderate to large, meaning that the average person in the group that received therapy was better off than about half of the patients in the control group. When comparing the therapies with each other, the researchers saw small or no differences, meaning that none of them really stood out as much better or much worse than the others. They also found that the treatments worked equally well for different patient groups with depression (younger or older patients, or mothers who had depression after having given birth). Similarly, they saw no big differences when comparing individual with group therapy, or person-to-person with internet-based interactions between therapist and patient.
However, they did find that smaller and less rigorous studies generally found larger benefits of psychotherapies, and most of the studies included in the analysis were small. Only 36 of the studies had at least 50 patients who received the same treatment. When they restricted their analysis to those studies, the researchers still saw clear benefits of cognitive-behavioral therapy, interpersonal therapy, and problem-solving therapy, but not for the other four therapies.
What Do these Findings Mean?
Similar to earlier attempts to summarize and make sense of the many study results, this one finds benefits for all of the seven psychotherapies examined, and none of them stood out as being much better than some or all of the others. The scientific support for being beneficial was stronger for some therapies, mostly because they had been tested more often and in larger studies.
Treatments with proven benefits still do not necessarily work for all patients, and which type of psychotherapy might work best for a particular patient likely depends on that individual. So overall this analysis suggests that patients with depression and their doctors should consider psychotherapies and explore which of the different types might be best suited for a particular patient.
The study also points to the need for further research. Whereas depression affects large numbers of people around the world, all of the trials identified were conducted in rich countries and Western societies. Trials in different settings are essential to inform treatment of patients worldwide. In addition, large high-quality studies should further explore the potential benefits of some of the therapies for which less support currently exists. Where possible, future studies should compare psychotherapies with one another, because all of them have benefits, and it would not be ethical to withhold such beneficial treatment from patients.
PMCID: PMC3665892  PMID: 23723742
2.  MCPerm: A Monte Carlo Permutation Method for Accurately Correcting the Multiple Testing in a Meta-Analysis of Genetic Association Studies 
PLoS ONE  2014;9(2):e89212.
Traditional permutation (TradPerm) tests are usually considered the gold standard for multiple testing corrections. However, they can be difficult to complete for the meta-analyses of genetic association studies based on multiple single nucleotide polymorphism loci as they depend on individual-level genotype and phenotype data to perform random shuffles, which are not easy to obtain. Most meta-analyses have therefore been performed using summary statistics from previously published studies. To carry out a permutation using only genotype counts without changing the size of the TradPerm P-value, we developed a Monte Carlo permutation (MCPerm) method. First, for each study included in the meta-analysis, we used a two-step hypergeometric distribution to generate a random number of genotypes in cases and controls. We then carried out a meta-analysis using these random genotype data. Finally, we obtained the corrected permutation P-value of the meta-analysis by repeating the entire process N times. We used five real datasets and five simulation datasets to evaluate the MCPerm method and our results showed the following: (1) MCPerm requires only the summary statistics of the genotype, without the need for individual-level data; (2) Genotype counts generated by our two-step hypergeometric distributions had the same distributions as genotype counts generated by shuffling; (3) MCPerm had almost exactly the same permutation P-values as TradPerm (r = 0.999; P<2.2e-16); (4) The calculation speed of MCPerm is much faster than that of TradPerm. In summary, MCPerm appears to be a viable alternative to TradPerm, and we have developed it as a freely available R package at CRAN:
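The three steps above can be sketched as follows. This is one plausible reading of the two-step hypergeometric draw, written for a single locus with only the standard library; it is not the MCPerm package's code. The helper names, the toy statistic, and the genotype counts are all hypothetical, and a real analysis would substitute a proper meta-analysis statistic and handle many loci.

```python
import random

def hypergeom_draw(rng, n_good, n_bad, n_draws):
    """Count 'good' items among n_draws taken without replacement."""
    if n_draws > n_good + n_bad:
        raise ValueError("cannot draw more items than the urn holds")
    hits = 0
    for _ in range(n_draws):
        if rng.random() * (n_good + n_bad) < n_good:
            n_good -= 1
            hits += 1
        else:
            n_bad -= 1
    return hits

def random_case_genotypes(rng, genotype_totals, n_cases):
    """Two-step draw of case genotype counts with all margins held fixed."""
    g0, g1, g2 = genotype_totals
    case_g0 = hypergeom_draw(rng, g0, g1 + g2, n_cases)       # step 1
    case_g1 = hypergeom_draw(rng, g1, g2, n_cases - case_g0)  # step 2
    cases = (case_g0, case_g1, n_cases - case_g0 - case_g1)
    controls = tuple(t - c for t, c in zip(genotype_totals, cases))
    return cases, controls

def _allele_freq(counts):
    """Frequency of the first allele from (hom, het, other-hom) counts."""
    return (2 * counts[0] + counts[1]) / (2 * sum(counts))

def allele_gap(studies):
    """Toy statistic (assumption): summed case-control allele-frequency gap."""
    return sum(abs(_allele_freq(ca) - _allele_freq(co)) for ca, co in studies)

def mc_permutation_p(rng, studies, statistic, n_perm=1000):
    """Monte Carlo permutation P-value from genotype counts alone."""
    observed = statistic(studies)
    exceed = 0
    for _ in range(n_perm):
        shuffled = []
        for cases, controls in studies:
            totals = tuple(a + b for a, b in zip(cases, controls))
            shuffled.append(random_case_genotypes(rng, totals, sum(cases)))
        if statistic(shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Hypothetical (cases, controls) genotype counts for two studies.
studies = [((30, 50, 20), (20, 50, 30)), ((15, 25, 10), (12, 26, 12))]
p_value = mc_permutation_p(random.Random(42), studies, allele_gap, n_perm=500)
```

Because each draw keeps the genotype totals and the number of cases fixed, it mimics shuffling case and control labels without needing individual-level data.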
PMCID: PMC3931718  PMID: 24586601
3.  Meta-analyses of Adverse Effects Data Derived from Randomised Controlled Trials as Compared to Observational Studies: Methodological Overview 
PLoS Medicine  2011;8(5):e1001026.
Su Golder and colleagues carry out an overview of meta-analyses to assess whether estimates of the risk of harm outcomes differ between randomized trials and observational studies. They find that, on average, there is no difference in the estimates of risk between overviews of observational studies and overviews of randomized trials.
There is considerable debate as to the relative merits of using randomised controlled trial (RCT) data as opposed to observational data in systematic reviews of adverse effects. This meta-analysis of meta-analyses aimed to assess the level of agreement or disagreement in the estimates of harm derived from meta-analysis of RCTs as compared to meta-analysis of observational studies.
Methods and Findings
Searches were carried out in ten databases in addition to reference checking, contacting experts, citation searches, and hand-searching key journals, conference proceedings, and Web sites. Studies were included where a pooled relative measure of an adverse effect (odds ratio or risk ratio) from RCTs could be directly compared, using the ratio of odds ratios, with the pooled estimate for the same adverse effect arising from observational studies. Nineteen studies, yielding 58 meta-analyses, were identified for inclusion. The pooled ratio of odds ratios of RCTs compared to observational studies was estimated to be 1.03 (95% confidence interval 0.93–1.15). There was less discrepancy with larger studies. The symmetric funnel plot suggests that there is no consistent difference between risk estimates from meta-analysis of RCT data and those from meta-analysis of observational studies. In almost all instances, the estimates of harm from meta-analyses of the different study designs had 95% confidence intervals that overlapped (54/58, 93%). In terms of statistical significance, in nearly two-thirds (37/58, 64%), the results agreed (both studies showing a significant increase or significant decrease or both showing no significant difference). In only one meta-analysis about one adverse effect was there opposing statistical significance.
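For readers unfamiliar with the ratio of odds ratios, the comparison for a single adverse effect can be sketched as below. The standard errors are recovered from the reported 95% confidence intervals on the log scale, which assumes those intervals are symmetric in log odds; the function names and input values are hypothetical and are not taken from the overview.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log odds ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def ratio_of_odds_ratios(or_rct, ci_rct, or_obs, ci_obs, z=1.96):
    """ROR = OR(RCTs) / OR(observational), with a confidence interval."""
    log_ror = math.log(or_rct) - math.log(or_obs)
    se = math.sqrt(se_from_ci(*ci_rct) ** 2 + se_from_ci(*ci_obs) ** 2)
    ci = (math.exp(log_ror - z * se), math.exp(log_ror + z * se))
    return math.exp(log_ror), ci

def intervals_overlap(ci_a, ci_b):
    """True when two confidence intervals share at least one value."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Hypothetical pooled estimates for one adverse effect.
ror, ror_ci = ratio_of_odds_ratios(1.40, (1.05, 1.87), 1.30, (1.10, 1.54))
overlap = intervals_overlap((1.05, 1.87), (1.10, 1.54))
```

An ROR near 1 with an interval that includes 1 indicates that the two study designs give compatible estimates of harm for that adverse effect.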
Empirical evidence from this overview indicates that there is no difference on average in the risk estimate of adverse effects of an intervention derived from meta-analyses of RCTs and meta-analyses of observational studies. This suggests that systematic reviews of adverse effects should not be restricted to specific study types.
Editors' Summary
Whenever patients consult a doctor, they expect the treatments they receive to be effective and to have minimal adverse effects (side effects). To ensure that this is the case, all treatments now undergo exhaustive clinical research—carefully designed investigations that test new treatments and therapies in people. Clinical investigations fall into two main groups—randomized controlled trials (RCTs) and observational, or non-randomized, studies. In RCTs, groups of patients with a specific disease or condition are randomly assigned to receive the new treatment or a control treatment, and the outcomes (for example, improvements in health and the occurrence of specific adverse effects) of the two groups of patients are compared. Because the patients are randomly chosen, differences in outcomes between the two groups are likely to be treatment-related. In observational studies, patients who are receiving a specific treatment are enrolled and outcomes in this group are compared to those in a similar group of untreated patients. Because the patient groups are not randomly chosen, differences in outcomes between cases and controls may be the result of a hidden shared characteristic among the cases rather than treatment-related (so-called confounding variables).
Why Was This Study Done?
Although data from individual trials and studies are valuable, much more information about a potential new treatment can be obtained by systematically reviewing all the evidence and then doing a meta-analysis (so-called evidence-based medicine). A systematic review uses predefined criteria to identify all the research on a treatment; meta-analysis is a statistical method for combining the results of several studies to yield “pooled estimates” of the treatment effect (the efficacy of a treatment) and the risk of harm. Treatment effect estimates can differ between RCTs and observational studies, but what about adverse effect estimates? Can different study designs provide a consistent picture of the risk of harm, or are the results from different study designs so disparate that it would be meaningless to combine them in a single review? In this methodological overview, which comprises a systematic review and meta-analyses, the researchers assess the level of agreement in the estimates of harm derived from meta-analysis of RCTs with estimates derived from meta-analysis of observational studies.
What Did the Researchers Do and Find?
The researchers searched literature databases and reference lists, consulted experts, and hand-searched various other sources for studies in which the pooled estimate of an adverse effect from RCTs could be directly compared to the pooled estimate for the same adverse effect from observational studies. They identified 19 studies that together covered 58 separate adverse effects. In almost all instances, the estimates of harm obtained from meta-analyses of RCTs and observational studies had overlapping 95% confidence intervals. That is, in statistical terms, the estimates of harm were similar. Moreover, in nearly two-thirds of cases, there was agreement between RCTs and observational studies about whether a treatment caused a significant increase in adverse effects, a significant decrease, or no significant change (a significant change is one unlikely to have occurred by chance). Finally, the researchers used meta-analysis to calculate that the pooled ratio of the odds ratios (a statistical measurement of risk) of RCTs compared to observational studies was 1.03. This figure suggests that there was no consistent difference between risk estimates obtained from meta-analysis of RCT data and those obtained from meta-analysis of observational study data.
What Do These Findings Mean?
The findings of this methodological overview suggest that there is no difference on average in the risk estimate of an intervention's adverse effects obtained from meta-analyses of RCTs and from meta-analyses of observational studies. Although limited by some aspects of its design, this overview has several important implications for the conduct of systematic reviews of adverse effects. In particular, it suggests that, rather than limiting systematic reviews to certain study designs, it might be better to evaluate a broad range of studies. In this way, it might be possible to build a more complete, more generalizable picture of potential harms associated with an intervention, without any loss of validity, than by evaluating a single type of study. Such a picture, in combination with estimates of treatment effects also obtained from systematic reviews and meta-analyses, would help clinicians decide the best treatment for their patients.
4.  Bias in meta-analysis detected by a simple, graphical test. 
BMJ : British Medical Journal  1997;315(7109):629-634.
OBJECTIVE: Funnel plots (plots of effect estimates against sample size) may be useful to detect bias in meta-analyses that were later contradicted by large trials. We examined whether a simple test of asymmetry of funnel plots predicts discordance of results when meta-analyses are compared to large trials, and we assessed the prevalence of bias in published meta-analyses. DESIGN: Medline search to identify pairs consisting of a meta-analysis and a single large trial (concordance of results was assumed if effects were in the same direction and the meta-analytic estimate was within 30% of the trial); analysis of funnel plots from 37 meta-analyses identified from a hand search of four leading general medicine journals 1993-6 and 38 meta-analyses from the second 1996 issue of the Cochrane Database of Systematic Reviews. MAIN OUTCOME MEASURE: Degree of funnel plot asymmetry as measured by the intercept from regression of standard normal deviates against precision. RESULTS: In the eight pairs of meta-analysis and large trial that were identified (five from cardiovascular medicine, one from diabetic medicine, one from geriatric medicine, one from perinatal medicine) there were four concordant and four discordant pairs. In all cases discordance was due to meta-analyses showing larger effects. Funnel plot asymmetry was present in three out of four discordant pairs but in none of the concordant pairs. In 14 (38%) journal meta-analyses and 5 (13%) Cochrane reviews, funnel plot asymmetry indicated that there was bias. CONCLUSIONS: A simple analysis of funnel plots provides a useful test for the likely presence of bias in meta-analyses, but as the capacity to detect bias will be limited when meta-analyses are based on a limited number of small trials, the results from such analyses should be treated with considerable caution.
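The outcome measure named above, the intercept from a regression of standard normal deviates against precision, can be computed with ordinary least squares. The sketch below is a minimal unweighted version using only the standard library; the function name and the two small data sets are hypothetical, and the significance test on the intercept is omitted.

```python
def egger_intercept(effects, std_errors):
    """Regress effect/SE (standard normal deviate) on 1/SE (precision).

    Returns (intercept, slope). An intercept far from zero suggests
    funnel plot asymmetry; the slope estimates the underlying effect.
    """
    snd = [e / s for e, s in zip(effects, std_errors)]
    precision = [1.0 / s for s in std_errors]
    n = len(snd)
    mean_x = sum(precision) / n
    mean_y = sum(snd) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precision, snd))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical symmetric case: the same effect regardless of trial size.
sym_intercept, sym_slope = egger_intercept(
    [-0.5, -0.5, -0.5, -0.5], [0.05, 0.10, 0.25, 0.40]
)

# Hypothetical asymmetric case: less precise trials report larger effects.
asym_intercept, asym_slope = egger_intercept(
    [-0.2, -0.3, -0.6, -0.9], [0.05, 0.10, 0.25, 0.40]
)
```

In the symmetric case the regression line passes through the origin, whereas in the asymmetric case the intercept moves away from zero in the direction of the exaggerated small-trial effects.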
PMCID: PMC2127453  PMID: 9310563
5.  Greater Response to Placebo in Children Than in Adults: A Systematic Review and Meta-Analysis in Drug-Resistant Partial Epilepsy 
PLoS Medicine  2008;5(8):e166.
Despite guidelines establishing the need to perform comprehensive paediatric drug development programs, pivotal trials in children with epilepsy have been completed mostly in Phase IV as a postapproval replication of adult data. However, it has been shown that the treatment response in children can differ from that in adults. It has not been investigated whether differences in drug effect between adults and children might occur in the treatment of drug-resistant partial epilepsy, although such differences may have a substantial impact on the design and results of paediatric randomised controlled trials (RCTs).
Methods and Findings
Three electronic databases were searched for RCTs investigating any antiepileptic drug (AED) in the add-on treatment of drug-resistant partial epilepsy in both children and adults. The treatment effect was compared between the two age groups using the ratio of the relative risk (RR) of the 50% responder rate between active AED treatment and placebo groups, as well as meta-regression. Differences in the response to placebo and to active treatment were examined using logistic regression. A comparable approach was used for analysing secondary endpoints, including seizure-free rate, total and adverse events-related withdrawal rates, and withdrawal rate for seizure aggravation. Five AEDs were evaluated in both adults and children with drug-resistant partial epilepsy in 32 RCTs. The treatment effect was significantly lower in children than in adults (RR ratio: 0.67 [95% confidence interval (CI) 0.51–0.89]; p = 0.02 by meta-regression). This difference was related to an age-dependent variation in the response to placebo, with a higher rate in children than in adults (19% versus 9.9%, p < 0.001), whereas no significant difference was observed in the response to active treatment (37.2% versus 30.4%, p = 0.364). The relative risk of the total withdrawal rate was also significantly lower in children than in adults (RR ratio: 0.65 [95% CI 0.43–0.98], p = 0.004 by meta-regression), due to a higher withdrawal rate for seizure aggravation in children (5.6%) than in adults (0.7%) receiving placebo (p < 0.001). Finally, there was no significant difference in the seizure-free rate between adult and paediatric studies.
Children with drug-resistant partial epilepsy receiving placebo in double-blind RCTs demonstrated a significantly greater 50% responder rate than adults, probably reflecting increased placebo and regression to the mean effects. Paediatric clinical trial designs should account for these age-dependent variations of the response to placebo to reduce the risk of an underestimated sample size that could result in falsely negative trials.
In a systematic review of antiepileptic drugs, Philippe Ryvlin and colleagues find that children with drug-resistant partial epilepsy enrolled in trials seem to have a greater response to placebo than adults enrolled in such trials.
Editors' Summary
Whenever an adult is given a drug to treat a specific condition, that drug will have been tested in “randomized controlled trials” (RCTs). In RCTs, a drug's effects are compared to those of another drug for the same condition (or to a placebo, dummy drug) by giving groups of adult patients the different treatments and measuring how well each drug deals with the condition and whether it has any other effects on the patients' health. However, many drugs given to children have only been tested in adults, the assumption being that children can safely take the same drugs as adults provided the dose is scaled down. This approach to treatment is generally taken in epilepsy, a common brain disorder in children in which disruptions in the electrical activity of part (partial epilepsy) or all (generalized epilepsy) of the brain cause seizures. The symptoms of epilepsy depend on which part of the brain is disrupted and can include abnormal sensations, loss of consciousness, or convulsions. Most but not all patients can be successfully treated with antiepileptic drugs, which reduce or stop the occurrence of seizures.
Why Was This Study Done?
It is increasingly clear that children and adults respond differently to many drugs, including antiepileptic drugs. For example, children often break down drugs differently from adults, so a safe dose for an adult may be fatal to a child even after scaling down for body size, or it may be ineffective because of quicker clearance from the child's body. Consequently, regulatory bodies around the world now require comprehensive drug development programs in children as well as in adults. However, for pediatric trials to yield useful results, the general differences in the treatment response between children and adults must first be determined and then allowed for in the design of pediatric RCTs. In this study, the researchers investigate whether there is any evidence in published RCTs for age-dependent differences in the response to antiepileptic drugs in drug-resistant partial epilepsy.
What Did the Researchers Do and Find?
The researchers searched the literature for reports of RCTs on the effects of antiepileptic drugs in the add-on treatment of drug-resistant partial epilepsy in children and in adults—that is, trials that compared the effects of giving an additional antiepileptic drug with those of giving a placebo by asking what fraction of patients given each treatment had a 50% reduction in seizure frequency during the treatment period compared to a baseline period (the “50% responder rate”). This “systematic review” yielded 32 RCTs, including five pediatric RCTs. The researchers then compared the treatment effect (the ratio of the 50% responder rate in the treatment arm to the placebo arm) in the two age groups using a statistical approach called “meta-analysis” to pool the results of these studies. The treatment effect, they report, was significantly lower in children than in adults. Further analysis indicated that this difference was because more children than adults responded to the placebo. Nearly 1 in 5 children had a 50% reduction in seizure rate when given a placebo compared to only 1 in 10 adults. About a third of both children and adults had a 50% reduction in seizure rate when given antiepileptic drugs.
What Do These Findings Mean?
These findings, although limited by the small number of pediatric trials done so far, suggest that children with drug-resistant partial epilepsy respond more strongly in RCTs to placebo than adults. Although additional studies need to be done to find an explanation for this observation and to discover whether anything similar occurs in other conditions, this difference between children and adults should be taken into account in the design of future pediatric trials on the effects of antiepileptic drugs, and possibly drugs for other conditions. Specifically, to reduce the risk of false-negative results, this finding suggests that it might be necessary to increase the size of future pediatric trials to ensure that the trials have enough power to discover effects of the drugs tested, if they exist.
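The link between a higher placebo response and the required trial size can be illustrated with a textbook sample-size formula for comparing two proportions. This is only a rough sketch: it plugs in the pooled responder rates quoted in the abstract (9.9% versus 30.4% in adults, 19% versus 37.2% in children), assumes a two-sided 5% significance level and 80% power, and is not a calculation reported by the authors. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_arm(p_placebo, p_active, alpha=0.05, power=0.80):
    """Patients per arm needed to detect p_active vs p_placebo.

    Normal-approximation formula for two independent proportions
    with equal allocation and a two-sided test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_placebo + p_active) / 2
    pooled_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    separate_term = z_beta * math.sqrt(
        p_placebo * (1 - p_placebo) + p_active * (1 - p_active)
    )
    return math.ceil((pooled_term + separate_term) ** 2
                     / (p_active - p_placebo) ** 2)

# Pooled 50% responder rates quoted in the abstract above.
adults_n = n_per_arm(0.099, 0.304)
children_n = n_per_arm(0.190, 0.372)
```

Under these assumptions the paediatric design needs noticeably more patients per arm than the adult one, because the gap between active treatment and placebo is narrower in children.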
PMCID: PMC2504483  PMID: 18700812
6.  Can we rely on the best trial? A comparison of individual trials and systematic reviews 
The ideal evidence to answer a question about the effectiveness of treatment is a systematic review. However, for many clinical questions a systematic review will not be available, or may not be up to date. One option could be to use the evidence from an individual trial to answer the question.
We assessed how often (a) the estimated effect and (b) the p-value in the most precise single trial in a meta-analysis agreed with the whole meta-analysis. For a random sample of 200 completed Cochrane Reviews (January, 2005) we identified a primary outcome and extracted: the number of trials, the statistical weight of the most precise trial, the estimate and confidence interval for both the highest weighted trial and the meta-analysis overall. We calculated the p-value for the most precise trial and meta-analysis.
Of 200 reviews, only 132 provided a meta-analysis of 2 or more trials, with a further 35 effect estimates based on single trials. The average number of trials was 7.3, with the most precise trial contributing, on average, 51% of the statistical weight to the summary estimate from the whole meta-analysis. The estimates of effect from the most precise trial and the overall meta-analyses were highly correlated (rank correlation of 0.90).
There was an 81% agreement in statistical conclusions. Results from the most precise trial were statistically significant in 60 of the 167 evaluable reviews, with 55 of the corresponding systematic reviews also being statistically significant. The five discrepant results were not strikingly different with respect to their estimates of effect, but showed considerable statistical heterogeneity between trials in these meta-analyses. However, among the 101 cases in which the most precise trial was not statistically significant, the corresponding meta-analyses yielded 31 statistically significant results.
Single most precise trials provided similar estimates of effects to those of the meta-analyses to which they contributed, and statistically significant results are generally in agreement. However, "negative" results were less reliable, as may be expected from single underpowered trials. For systematic reviewers we suggest that: (1) key trial(s) in a review deserve greater attention; and (2) systematic reviewers should check agreement of the most precise trial and the meta-analysis. For clinicians using trials we suggest that when a meta-analysis is not available, a focus on the most precise trial is reasonable provided it is adequately powered.
PMCID: PMC2851704  PMID: 20298582
7.  An empirical investigation of the potential impact of selective inclusion of results in systematic reviews of interventions: study protocol 
Systematic Reviews  2013;2:21.
Systematic reviewers may encounter a multiplicity of outcome data in the reports of randomised controlled trials included in the review (for example, multiple measurement instruments measuring the same outcome, multiple time points, and final and change from baseline values). The primary objectives of this study are to investigate in a cohort of systematic reviews of randomised controlled trials of interventions for rheumatoid arthritis, osteoarthritis, depressive disorders and anxiety disorders: (i) how often there is multiplicity of outcome data in trial reports; (ii) the association between selection of trial outcome data included in a meta-analysis and the magnitude and statistical significance of the trial result; and (iii) the impact of the selection of outcome data on meta-analytic results.
Forty systematic reviews (20 Cochrane, 20 non-Cochrane) of RCTs published from January 2010 to January 2012 and indexed in the Cochrane Database of Systematic Reviews (CDSR) or PubMed will be randomly sampled. The first meta-analysis of a continuous outcome within each review will be included. From each review protocol (where available) and published review we will extract information regarding which types of outcome data were eligible for inclusion in the meta-analysis (for example, measurement instruments, time points, analyses). From the trial reports we will extract all outcome data that are compatible with the meta-analysis outcome as it is defined in the review and with the outcome data eligibility criteria and hierarchies in the review protocol. The association between selection of trial outcome data included in a meta-analysis and the magnitude and statistical significance of the trial result will be investigated. We will also investigate the impact of the selected trial result on the magnitude of the resulting meta-analytic effect estimates.
The strengths of this empirical study are that our objectives and methods are pre-specified and transparent. The results may inform methods guidance for systematic review conduct and reporting, particularly for dealing with multiplicity of randomised controlled trial outcome data.
PMCID: PMC3626625  PMID: 23575367
Systematic review; Randomised controlled trials; Reporting; Bias; Research methodology
8.  ROAST: rotation gene set tests for complex microarray experiments 
Bioinformatics  2010;26(17):2176-2182.
Motivation: A gene set test is a differential expression analysis in which a P-value is assigned to a set of genes as a unit. Gene set tests are valuable for increasing statistical power, organizing and interpreting results and for relating expression patterns across different experiments. Existing methods are based on permutation. Methods that rely on permutation of probes unrealistically assume independence of genes, while those that rely on permutation of sample are suitable only for two-group comparisons with a good number of replicates in each group.
Results: We present ROAST, a statistically rigorous gene set test that allows for gene-wise correlation while being applicable to almost any experimental design. Instead of permutation, ROAST uses rotation, a Monte Carlo technology for multivariate regression. Since the number of rotations does not depend on sample size, ROAST gives useful results even for experiments with minimal replication. ROAST allows for any experimental design that can be expressed as a linear model, and can also incorporate array weights and correlated samples. ROAST can be tuned for situations in which only a subset of the genes in the set are actively involved in the molecular pathway. ROAST can test for uni- or bi-direction regulation. Probes can also be weighted to allow for prior importance. The power and size of the ROAST procedure is demonstrated in a simulation study, and compared to that of a representative permutation method. Finally, ROAST is used to test the degree of transcriptional conservation between human and mouse mammary stems.
Availability: ROAST is implemented as a function in the Bioconductor package limma available from
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2922896  PMID: 20610611
9.  Psychosocial Interventions for Perinatal Common Mental Disorders Delivered by Providers Who Are Not Mental Health Specialists in Low- and Middle-Income Countries: A Systematic Review and Meta-Analysis 
PLoS Medicine  2013;10(10):e1001541.
In a systematic review and meta-analysis, Kelly Clarke and colleagues examine the effect of psychosocial interventions delivered by non–mental health specialists for perinatal common mental disorders in low- and middle-income countries.
Perinatal common mental disorders (PCMDs) are a major cause of disability among women. Psychosocial interventions are one approach to reduce the burden of PCMDs. Working with care providers who are not mental health specialists, in the community or in antenatal health care facilities, can expand access to these interventions in low-resource settings. We assessed effects of such interventions compared to usual perinatal care, as well as effects of interventions based on intervention type, delivery method, and timing.
Methods and Findings
We conducted a systematic review, meta-analysis, and meta-regression. We searched databases including Embase and the Global Health Library (up to 7 July 2013) for randomized and non-randomized trials of psychosocial interventions delivered by non-specialist mental health care providers in community settings and antenatal health care facilities in low- and middle-income countries. We pooled outcomes from ten trials for 18,738 participants. Interventions led to an overall reduction in PCMDs compared to usual care when using continuous data for PCMD symptomatology (effect size [ES] −0.34; 95% CI −0.53, −0.16) and binary categorizations for presence or absence of PCMDs (odds ratio 0.59; 95% CI 0.26, 0.92). We found a significantly larger ES for psychological interventions (three studies; ES −0.46; 95% CI −0.58, −0.33) than for health promotion interventions (seven studies; ES −0.15; 95% CI −0.27, −0.02). Both individual (five studies; ES −0.18; 95% CI −0.34, −0.01) and group (three studies; ES −0.48; 95% CI −0.85, −0.11) interventions were effective compared to usual care, though delivery method was not associated with ES (meta-regression β coefficient −0.11; 95% CI −0.36, 0.14). Combined group and individual interventions (based on two studies) had no benefit compared to usual care, nor did interventions restricted to pregnancy (three studies). Intervention timing was not associated with ES (β 0.16; 95% CI −0.16, 0.49). The small number of trials and heterogeneity of interventions limit our findings.
Psychosocial interventions delivered by non-specialists are beneficial for PCMDs, especially psychological interventions. Research is needed on interventions in low-income countries, treatment versus preventive approaches, and cost-effectiveness.
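As a rough check on the subgroup contrast reported above (psychological versus health promotion interventions), the following sketch back-calculates standard errors from the quoted 95% confidence intervals and applies a simple z-test to the difference. It assumes the intervals are symmetric normal-theory intervals and that the two subgroups are independent; the function names are illustrative, and this is not necessarily the method the authors used.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Back-calculate a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z_crit)

def subgroup_difference(es_a, ci_a, es_b, ci_b):
    """z-test for the difference between two independent pooled effect sizes."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    diff = es_a - es_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, se_diff, z, p

# Effect sizes and intervals as quoted in the abstract.
diff, se_diff, z, p = subgroup_difference(
    -0.46, (-0.58, -0.33),   # psychological interventions (three studies)
    -0.15, (-0.27, -0.02),   # health promotion interventions (seven studies)
)
print(f"difference = {diff:.2f}, SE = {se_diff:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the difference is about 0.31 standard deviation units in favour of psychological interventions, which agrees with the abstract's statement that the psychological subgroup had a significantly larger effect.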
Editors' Summary
Perinatal common mental health disorders are among the most common health problems in pregnancy and the postpartum period. In low- and middle-income countries, about 16% of women during pregnancy and about 20% of women in the postpartum period will suffer from a perinatal common mental health disorder. These disorders, including depression and anxiety, are a major cause of disability in women and have been linked to young children under their care being underweight and stunted.
Why Was This Study Done?
While research shows that both pharmacological (e.g., antidepressants or anti-anxiety medications) and non-pharmacological (e.g., psychotherapy, education, or health promotion) interventions are effective for preventing and treating perinatal common mental disorders, most of this research took place in high-income countries. These findings may not be applicable in low-resource settings, where there is limited access to mental health care providers such as psychiatrists and psychologists, and to medications. Thus, non-pharmacological interventions delivered by providers who are not mental health specialists may be important as ways to treat perinatal common mental health disorders in these types of settings. In this study the researchers systematically reviewed research estimating the effectiveness of non-pharmacological interventions for perinatal common mental disorders that were delivered by providers who were not mental health specialists (including health workers, lay persons, and doctors or midwives) in low- and middle-income countries. The researchers also used meta-analysis and meta-regression—statistical methods that are used to combine the results from multiple studies—to estimate the relative effects of these interventions on mental health symptoms.
What Did the Researchers Do and Find?
The researchers searched multiple databases using key search terms to identify randomized and non-randomized clinical trials. Using specific criteria, the researchers retrieved and assessed 37 full papers, of which 11 met the criteria for their systematic review. Seven of these studies were from upper middle-income countries (China, South Africa, Colombia, Mexico, Argentina, Cuba, and Brazil), and four trials were from the lower middle-income countries of Pakistan and India, but there were no trials from low-income countries. The researchers assessed the quality of the selected studies, and one study was excluded from meta-analysis because of poor quality.
Combining results from the ten remaining studies, the researchers found that compared to usual perinatal care (which in most cases included no mental health care), interventions delivered by providers who were not mental health specialists were associated with an overall reduction in mental health symptoms and the likelihood of being diagnosed with a mental health disorder. The researchers then performed additional analyses to assess relative effects by intervention type, timing, and delivery mode. They observed that both psychological interventions, such as psychotherapy and cognitive behavioral therapy, and health promotion interventions that were less focused on mental health led to significant improvement in mental health symptoms, but psychological interventions were associated with greater effects than health promotion interventions. Interventions delivered both during pregnancy and postnatally were associated with significant benefits when compared to usual care; however, when interventions were delivered during pregnancy only, the benefits were not significantly greater than usual care. When investigating mode of delivery, the researchers observed that both group and individual interventions were associated with improvements in symptoms.
What Do These Findings Mean?
These findings indicate that non-pharmacological interventions delivered by providers who are not mental health specialists could be useful for reducing symptoms of perinatal mental health disorders in middle-income countries. However, these findings should be interpreted with caution given that they are based on a small number of studies with a large amount of variation in the study designs, settings, timing, personnel, duration, and whether the intervention was delivered to a group, individually, or both. Furthermore, when the researchers excluded studies of the lowest quality, the observed benefits of these interventions were smaller, indicating that this analysis may overestimate the true effect of interventions. Nevertheless, the findings do provide support for the use of non-pharmacological interventions, delivered by non-specialists, for perinatal mental health disorders. Further studies should be undertaken in low-income countries.
10.  Standardized Treatment of Active Tuberculosis in Patients with Previous Treatment and/or with Mono-resistance to Isoniazid: A Systematic Review and Meta-analysis 
PLoS Medicine  2009;6(9):e1000150.
Performing a systematic review of studies evaluating retreatment of tuberculosis or treatment of isoniazid mono-resistant infection, Dick Menzies and colleagues find a paucity of evidence to support the WHO-recommended regimen.
A standardized regimen recommended by the World Health Organization for retreatment of active tuberculosis (TB) is widely used, but treatment outcomes are suspected to be poor. We conducted a systematic review of published evidence of treatment of patients with a history of previous treatment or documented isoniazid mono-resistance.
Methods and Findings
PubMed, EMBASE, and the Cochrane Central database for clinical trials were searched for randomized trials in previously treated patients and/or those with mono-resistance to isoniazid, published in English, French, or Spanish between 1965 and June 2008. The first two sources were also searched for cohort studies evaluating specifically the current retreatment regimen. In studies selected for inclusion, rifampin-containing regimens were used to treat patients with bacteriologically confirmed pulmonary TB, in whom bacteriologically confirmed failure and/or relapse had been reported. Pooled cumulative incidences and 95% CIs of treatment outcomes were computed with random effects meta-analyses and negative binomial regression. No randomized trials of the currently recommended retreatment regimen were identified. Only six cohort studies were identified, in which failure rates were 18%–44% in those with isoniazid resistance. In nine trials, using very different regimens in previously treated patients with mono-resistance to isoniazid, the combined failure and relapse rates ranged from 0% to over 75%. From pooled analysis of 33 trials in 1,907 patients with mono-resistance to isoniazid, lower failure, relapse, and acquired drug resistance rates were associated with longer duration of rifampin, use of streptomycin, daily therapy initially, and treatment with a greater number of effective drugs.
There are few published studies to support use of the current standardized retreatment regimen. Randomized trials of treatment of persons with isoniazid mono-resistance and/or a history of previous TB treatment are urgently needed.
Editors' Summary
Every year, nearly ten million people develop tuberculosis—a contagious infection, usually of the lungs—and about 2 million people die from the disease. Tuberculosis is caused by Mycobacterium tuberculosis, bacteria that are spread in airborne droplets when people with the disease cough or sneeze. Its symptoms include a persistent cough, fever, weight loss, and night sweats. Diagnostic tests for tuberculosis include chest X-rays and sputum slide exams and cultures in which bacteriologists try to grow M. tuberculosis from mucus brought up from the lungs by coughing. The disease can be cured by taking several powerful antibiotics regularly (daily or several times a week) for at least 6 months. However, 10%–20% of patients treated for tuberculosis in low- and middle-income countries need re-treatment because the initial treatment fails to clear M. tuberculosis from their body or because their disease returns after they have apparently been cured (treatment relapse). Patients who need re-treatment are often infected with bacteria that are resistant to one or more of the antibiotics commonly used to treat tuberculosis.
Why Was This Study Done?
As part of its strategy to reduce the global burden of tuberculosis, the World Health Organization (WHO) recommends standardized treatment regimens for tuberculosis. For re-treatment, WHO recommends an 8-month course of isoniazid, rifampin, and ethambutol with pyrazinamide and streptomycin added for the first 3 and 2 months, respectively. All these drugs are given daily (the preferred regimen) or three times a week. Unfortunately, although this regimen is now used to treat about 1 million patients each year, it yields poor results, particularly in regions where drug resistance is common. In this study (which was commissioned by WHO to provide the evidence needed for a revision of its treatment guidelines), the researchers undertake a systematic review (a search using specific criteria to identify relevant research studies, which are then appraised) and a meta-analysis (a statistical approach that pools the results of several studies) of randomized trials and cohort studies (two types of study that investigate the efficacy of medical interventions) of re-treatment regimens in previously treated tuberculosis patients, and in patients with infection that was resistant to isoniazid (“mono-resistance”).
What Did the Researchers Do and Find?
The researchers' systematic search for published reports of randomized trials and cohort studies of the currently recommended re-treatment regimen identified no relevant randomized trials and only six cohort studies. In the three cohort studies in which the participants carried M. tuberculosis strains that were sensitive to all the antibiotics in the regimen, failure rates were generally low. However, in the studies in which the participants carried drug-resistant bacteria, failure rates ranged from 18% to 44%. The researchers also identified and analyzed the results of nine trials in which several re-treatment regimens, all of which deviated from the standardized regimen, were used in previously treated patients with isoniazid mono-resistance. In these trials, the combined failure and relapse rates ranged from 0% to more than 75%. Finally, the researchers analyzed the pooled results of 33 trials that investigated the effect of various regimens on nearly 2,000 patients (some receiving their first treatment for tuberculosis, some being re-treated) with isoniazid mono-resistance. This meta-analysis showed that lower relapse, failure, and acquired drug resistance rates were associated with longer duration of rifampicin treatment, use of streptomycin, daily therapy early in the treatment, and regimens that included a greater number of drugs to which the M. tuberculosis carried by the patient were sensitive.
What Do These Findings Mean?
These findings reveal that there is very little published evidence that supports the regimen currently recommended by WHO for the re-treatment of tuberculosis. Furthermore, this limited body of evidence is a patchwork of results gleaned from a few cohort studies and a set of randomized trials not specifically designed to test the efficacy of the standardized regimen. There is an urgent need, therefore, for a concerted international effort to initiate randomized trials of potential treatment regimens in both previously untreated and previously treated patients with all forms of drug-resistant tuberculosis. Because these trials will take some time to complete, the limited findings of the meta-analysis presented here may be used in the meantime to redesign and, hopefully, improve the current standardized re-treatment regimen. In fact, the revised WHO TB treatment guidelines will provide updated recommendations for patients with previously treated TB.
11.  Predicting the extent of heterogeneity in meta-analysis, using empirical data from the Cochrane Database of Systematic Reviews 
Background Many meta-analyses contain only a small number of studies, which makes it difficult to estimate the extent of between-study heterogeneity. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, and offers advantages over conventional random-effects meta-analysis. To assist in this, we provide empirical evidence on the likely extent of heterogeneity in particular areas of health care.
Methods Our analyses included 14 886 meta-analyses from the Cochrane Database of Systematic Reviews. We classified each meta-analysis according to the type of outcome, type of intervention comparison and medical specialty. By modelling the study data from all meta-analyses simultaneously, using the log odds ratio scale, we investigated the impact of meta-analysis characteristics on the underlying between-study heterogeneity variance. Predictive distributions were obtained for the heterogeneity expected in future meta-analyses.
Results Between-study heterogeneity variances for meta-analyses in which the outcome was all-cause mortality were found to be on average 17% (95% CI 10–26) of variances for other outcomes. In meta-analyses comparing two active pharmacological interventions, heterogeneity was on average 75% (95% CI 58–95) of variances for non-pharmacological interventions. Meta-analysis size was found to have only a small effect on heterogeneity. Predictive distributions are presented for nine different settings, defined by type of outcome and type of intervention comparison. For example, for a planned meta-analysis comparing a pharmacological intervention against placebo or control with a subjectively measured outcome, the predictive distribution for heterogeneity is a log-normal (−2.13, 1.58²) distribution, which has a median value of 0.12. In an example of meta-analysis of six studies, incorporating external evidence led to a smaller heterogeneity estimate and a narrower confidence interval for the combined intervention effect.
Conclusions Meta-analysis characteristics were strongly associated with the degree of between-study heterogeneity, and predictive distributions for heterogeneity differed substantially across settings. The informative priors provided will be very beneficial in future meta-analyses including few studies.
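The median quoted for the example predictive distribution can be checked directly, because the median of a log-normal distribution is the exponential of its log-scale mean. The sketch below assumes the two parameters quoted in the abstract are the mean (−2.13) and standard deviation (1.58) of the log of the heterogeneity variance; the variable names are illustrative, and the simulation is only a sanity check.

```python
import math
import random
import statistics

# Log-scale parameters as quoted in the abstract (assumed: mean and SD of log heterogeneity variance).
MU, SIGMA = -2.13, 1.58

# Analytic median of a log-normal distribution: exp(mu), independent of sigma.
median_tau2 = math.exp(MU)

# Monte Carlo check with a fixed seed, using only the standard library.
rng = random.Random(0)
draws = [rng.lognormvariate(MU, SIGMA) for _ in range(20000)]
sample_median = statistics.median(draws)

print(f"analytic median  = {median_tau2:.3f}")
print(f"simulated median = {sample_median:.3f}")
```

The analytic value rounds to 0.12, matching the figure in the abstract. In a Bayesian meta-analysis with few studies, draws such as these could serve as an informative prior for the between-study variance.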
PMCID: PMC3396310  PMID: 22461129
Meta-analysis; heterogeneity; intervention studies; Bayesian analysis
12.  Chinese Herbal Medicine for Diabetic Peripheral Neuropathy: An Updated Meta-Analysis of 10 High-Quality Randomized Controlled Studies 
PLoS ONE  2013;8(10):e76113.
Diabetic peripheral neuropathy (DPN) is very common in people with diabetes. Chinese herbal medicine (CHM) therapy has been developed for DPN empirically over the years. The aim of this systematic review and meta-analysis was to assess the efficacy and safety of CHMs for patients suffering from DPN.
We performed a meta-analysis of randomized controlled clinical trials (RCTs) evaluating the efficacy and safety of CHM on DPN. Six databases were searched up to November 2012. The primary outcome measures were the absolute values of, or changes in, motor or sensory nerve conduction velocity (NCV), and the secondary outcome measures were improvement in clinical symptoms and adverse events. The methodological quality was assessed using the Jadad scale and the twelve criteria recommended by the Cochrane Back Review Group.
One hundred and sixty-three studies claimed to be RCTs. Ten studies with 653 individuals were further identified based on a Jadad score ≥3. These 10 studies were all of high methodological quality with a low risk of bias. Meta-analysis showed effects on NCV favoring CHMs when compared with western conventional medicines (WCM) (P<0.05 or P<0.01). There was a significant difference in the total efficacy rate between the two groups (P<0.001). Adverse effects were reported in all ten included studies, and CHMs were well tolerated in all patients with DPN.
Despite the apparently positive findings and low risk of bias, it is premature to draw conclusions about the efficacy of CHMs for the treatment of DPN because of the high clinical heterogeneity and small sample sizes of the included studies. However, CHM therapy was safe for DPN. Further RCTs with standardized preparations, large sample sizes and rigorous designs are required.
PMCID: PMC3797714  PMID: 24146822
13.  Assessing Differential Expression in Two-Color Microarrays: A Resampling-Based Empirical Bayes Approach 
PLoS ONE  2013;8(11):e80099.
Microarrays are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. Multiple testing methods in microarray data analysis aim at controlling both Type I and Type II error rates; however, real microarray data do not always fit their distribution assumptions. Smyth's ubiquitous parametric method, for example, inadequately accommodates violations of normality assumptions, resulting in inflated Type I error rates. The Significance Analysis of Microarrays, another widely used microarray data analysis method, is based on a permutation test and is robust to non-normally distributed data; however, its fold change criteria are problematic, and can critically alter the conclusion of a study, as a result of compositional changes of the control data set in the analysis. We propose a novel approach, combining resampling with empirical Bayes methods: the Resampling-based empirical Bayes Methods. This approach not only reduces false discovery rates for non-normally distributed microarray data, but it is also impervious to the fold change threshold since no control data set selection is needed. Through simulation studies, sensitivities, specificities, total rejections, and false discovery rates are compared across Smyth's parametric method, the Significance Analysis of Microarrays, and the Resampling-based empirical Bayes Methods. Differences in false discovery rate control between the approaches are illustrated through a preterm delivery methylation study. The results show that the Resampling-based empirical Bayes Methods offer significantly higher specificity and lower false discovery rates compared to Smyth's parametric method when data are not normally distributed. The Resampling-based empirical Bayes Methods also offer higher statistical power than the Significance Analysis of Microarrays method when the proportion of significantly differentially expressed genes is large for both normally and non-normally distributed data. Finally, the Resampling-based empirical Bayes Methods are generalizable to next generation sequencing RNA-seq data analysis.
PMCID: PMC3842292  PMID: 24312198
14.  Fewer permutations, more accurate P-values 
Bioinformatics  2009;25(12):i161-i168.
Motivation: Permutation tests have become a standard tool to assess the statistical significance of an event under investigation. The statistical significance, as expressed in a P-value, is calculated as the fraction of permutation values that are at least as extreme as the original statistic, which was derived from non-permuted data. This empirical method directly couples both the minimal obtainable P-value and the resolution of the P-value to the number of permutations. Thereby, it imposes upon itself the need for a very large number of permutations when small P-values are to be accurately estimated. This is computationally expensive and often infeasible.
Results: A method of computing P-values based on tail approximation is presented. The tail of the distribution of permutation values is approximated by a generalized Pareto distribution. A good fit and thus accurate P-value estimates can be obtained with a drastically reduced number of permutations when compared with the standard empirical way of computing P-values.
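The coupling between the number of permutations and the attainable P-value, described in the Motivation paragraph, can be illustrated with a minimal sketch of the standard empirical estimator. The test statistic (absolute difference in group means), the function names, and the sample values are all hypothetical choices made for illustration; the sketch does not implement the generalized Pareto tail approximation the abstract proposes.

```python
import random

def permutation_p_value(x, y, n_perm=1000, seed=0):
    """Empirical two-sided permutation P-value for a difference in means.

    Returns the fraction of permuted statistics at least as extreme as the
    statistic observed on the non-permuted data.
    """
    rng = random.Random(seed)
    n_x = len(x)
    observed = abs(sum(x) / n_x - sum(y) / len(y))
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled samples
        perm_x, perm_y = pooled[:n_x], pooled[n_x:]
        stat = abs(sum(perm_x) / len(perm_x) - sum(perm_y) / len(perm_y))
        if stat >= observed:
            extreme += 1
    return extreme / n_perm

def smallest_nonzero_p(n_perm):
    """Resolution of the empirical P-value: it can only move in steps of 1/n_perm."""
    return 1 / n_perm

# Hypothetical measurements for two groups (illustrative values only).
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1, 4.6]

p = permutation_p_value(group_a, group_b, n_perm=1000)
print(f"empirical P-value: {p}")
print(f"smallest non-zero P-value with 1000 permutations: {smallest_nonzero_p(1000)}")
```

Because the estimate is always a multiple of 1/n_perm, resolving a P-value near 10⁻⁶ with this estimator would need on the order of a million permutations or more, which is the computational cost that motivates a tail approximation.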
Availability: The Matlab code can be obtained from the corresponding author on request.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2687965  PMID: 19477983
15.  Intra-Articular Viscosupplementation With Hylan G-F 20 To Treat Osteoarthritis of the Knee 
Executive Summary
To assess the effectiveness and cost-effectiveness of hylan G-F 20 as a substitute for existing treatments for pain due to osteoarthritis (OA) of the knee, other viscosupplementation devices, and/or as an adjunct to conventional therapy.
Hylan G-F 20 (brand name Synvisc, which is manufactured by Genzyme) is a high molecular weight derivative of hyaluronan, a component of joint synovial fluid. It acts as a lubricant and shock absorber. It is administered by injection into the joint space to treat pain associated with OA of the knee. Although the injection procedure is an insured service in Ontario, the device, hylan G-F 20, is not.
Clinical Need
Osteoarthritis is prevalent in 10% to 12% of Ontario adults, and exceeds 40% in Ontario residents aged 65 years and older. About one-half of these people have mild, moderate, or severe OA of the knee. Conventional treatment involves a combination of nonpharmacological management (e.g., weight loss, exercise, social support, and patient education), drugs (e.g., acetaminophen, COX-2 inhibitors, nonsteroidal anti-inflammatory drugs with/without misoprostol, intra-articular glucocorticoids, opioids, and topical analgesics), and surgical interventions, such as debridement and total knee replacement, when pharmacological management fails.
The growing burden of OA of the knee in the aging Ontario population combined with recent safety concerns about COX-2 inhibitors and long wait times for total joint replacement is placing pressure on the demand for new, effective technologies to manage the pain of OA.
The Technology
Hylan G-F 20 is derived from rooster comb hyaluronan (HA). At the time of writing, eight viscosupplement hyaluronic products are licensed in Canada. Hylan G-F 20 is distinguished from the other products by its chemical structure (i.e., cross-linked hyaluronan, hence hylan) and relatively higher molecular weight, which may bestow greater therapeutic viscoelastic properties. A complete treatment cycle of hylan G-F 20 involves an intra-articular injection of 2 ml of hylan G-F 20 once a week for 3 weeks. It is licensed for use for patients in all stages of joint pathology, but should not be used in infected or severely inflamed joints, in joints with large effusion, in patients that have skin diseases or infections in the area of the injection site, or in patients with venous stasis. It is also contraindicated in patients with hypersensitivities to avian proteins.
Review Strategy
The Medical Advisory Secretariat used its standard search protocol to review the literature for evidence on the effectiveness of intra-articular hylan G-F 20 compared with placebo, as a substitute for alternate active treatments, or as an adjunct to conventional care for treatment of the pain of OA of the knee. All English-language journal articles and reviews with clearly described designs and methods (i.e., those sufficient to assign a Jadad score to) published or released between 1966 and February 2005 were included. Two more recently published meta-analyses were also included. The databases searched were Ovid MEDLINE, EMBASE, the Cochrane database and leading international organizations for health technology assessments, including the International Network of Agencies for Health Technology Assessments. The search terms were as follows: hyaluronan, hyaluronate adj sodium, hylan, hylan G-F 20 (Synvisc), Synvisc, Hyalgan, Orthovisc, Supartz, Artz, Artzal, BioHY, NASHA, NRD101, viscosupplementation, osteoarthritis, knee, knee joint. The primary outcome of interest was a clinically significant difference, defined as greater than 10 mm on a 100 mm visual analogue scale, or a change from baseline of more than 20% in the mean magnitude of pain relief experienced among patients treated with hylan G-F 20 compared with those treated with the control intervention.
One clinical epidemiologist reviewed the full-text reports and extracted data using an extraction form. Key variables included, but were not limited to, the characteristics of the patients, method of randomization, type of control intervention, outcome measures for effectiveness and safety, and length of follow-up. The quality of the studies and level of the evidence was initially scored by one clinical epidemiologist using the Jadad scale and GRADE approach. Level of quality depends on the amount of certainty about the magnitude of effect and is based on study designs, extent of methodological limitations, consistency of results and applicability (i.e. directness) to the Ontario clinical context. The GRADE approach also permits comment on the strength of recommendations resulting from the evidence, based on estimates of the magnitude of effect relative to the magnitude of risk and burden and the level of certainty around these estimates. The quality assessments were subsequently peer-reviewed.
Summary of Findings
The literature search revealed 2 previous health technology assessments, 3 meta-analyses of placebo-controlled trials, 1 Cochrane review and meta-analysis encompassing 18 randomized controlled trials (RCTs) that compared hylan G-F 20 to either placebo or active treatments, 11 RCTs of hylan G-F 20 (all included in the Cochrane review), and 10 observational studies. Given the preponderance of evidence, the Medical Advisory Secretariat’s analysis focused on studies with Level 1 evidence of effectiveness (i.e., the meta-analyses of RCTs and the RCTs). Only safety data from the observational studies were included.
The authors of the 2 health technology assessments concluded that the data were sparse and of poor quality. There was some evidence that hylan G-F 20 delivered a small clinical benefit at 3 to 6 months after treatment, of a magnitude comparable to NSAIDs and intra-articular steroids. Hylan G-F 20 appeared to carry a risk of a local adverse reaction in the range of 3% to 18% per 100 injections, but there was no apparent risk of a severe adverse event, although the data were limited.
Each of the 3 meta-analyses of placebo-controlled trials of intra-articular hyaluronans had only 3 trials involving hylan G-F 20. Their results were inconsistent, with one study concluding that intra-articular hyaluronans were efficacious, whereas the 2 other analyses concluded the effect size was small (0.32) and probably not clinically significant. The risk of a minor adverse event ranged from 8% to 19% per 100 injections. Major adverse events were rare.
The authors of the Cochrane review concluded that a pooled analysis supported the efficacy of hyaluronans, including hylan G-F 20. The 5- to 13-week post-injection period showed an improvement from baseline of 11% to 54% for pain and 9% to 15% for function. Comparable efficacy was noted against NSAIDs, and longer-term benefits were noted against steroids. Few adverse events were noted.
When the Medical Advisory Secretariat applied the criterion of clinical significance to the magnitude of pain relief reported in the RCTs on hylan G-F 20, the following was noted:
There was inconsistent evidence that hylan G-F 20 was clinically superior to placebo at 5 to 26 weeks after treatment.
There was consistent evidence that, in terms of delivering pain relief, hylan G-F 20 was no better or worse than NSAIDs or intra-articular steroids at 5 to 26 weeks after treatment.
There was consistent evidence that hylan G-F 20 was not clinically superior to other hyaluronic products.
There was consistent evidence that hylan G-F 20 delivered a small magnitude of clinical benefit at 12 to 52 weeks post-injection when administered as an adjunct to conventional care.
There were limitations to the methods in many of the RCTs involving hylan G-F 20. When only the results from the higher-quality studies were considered, there was level 2 evidence that hylan G-F 20 was not clinically superior to placebo (or another hyaluronan) at 1 to 26 weeks after treatment in older patients with advanced disease for whom total knee replacement was indicated. There was level 2 evidence that hylan G-F 20 was comparable to NSAIDs at 4 to 13 weeks after treatment, and level 2 evidence that hylan G-F 20 was superior to placebo as an adjunct to conventional care 4 to 26 weeks after treatment.
With respect to safety, overall, hylan G-F 20 carries a risk of minor, local adverse events of about 8% to 19% per 100 injections. Incidents of moderate to severe post-injection inflammatory joint reactions have been reported, but the likelihood appears to be low (0.15% of patients).
Economic Analysis
Case-costing estimates suggest that the annual cost of 2 treatment cycles of hylan G-F 20 (plus analgesics for breakthrough pain) is almost equivalent to the annual cost of taking an NSAID (with a gastroprotective agent) and is more expensive than taking intra-articular corticosteroids (plus analgesics for breakthrough pain). The estimated cost of funding hylan G-F 20 as an adjunct to conventional therapy (i.e., any of analgesics, NSAIDs, intra-articular steroids, physiotherapy, and surgery) is $700 per patient per year. Given the huge burden of mild to moderate OA among adults who seek medical care for it in Ontario (about 300,000), funding hylan G-F 20 as an adjunct to existing treatment could be expensive, depending on its diffusion and uptake. If only 10% to 30% of patients choose this option, then the estimated budget impact would be $21 million to $63 million (Cdn) per year.
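The budget impact range quoted above follows from simple multiplication of the three figures given in the text. The sketch below reproduces that arithmetic; the function and variable names are illustrative and not part of the original analysis, and uptake is passed as a whole-number percentage to keep the calculation in integers.

```python
def annual_budget_impact(eligible_patients: int, uptake_percent: int,
                         cost_per_patient: int) -> int:
    """Annual cost if `uptake_percent` of eligible patients receive treatment."""
    treated = eligible_patients * uptake_percent // 100
    return treated * cost_per_patient


# Figures taken from the text: about 300,000 adults, $700 per patient per year.
ELIGIBLE_PATIENTS = 300_000
COST_PER_PATIENT = 700

low = annual_budget_impact(ELIGIBLE_PATIENTS, 10, COST_PER_PATIENT)
high = annual_budget_impact(ELIGIBLE_PATIENTS, 30, COST_PER_PATIENT)
print(f"${low:,} to ${high:,} (Cdn) per year")
```

With 10% and 30% uptake this gives $21,000,000 and $63,000,000, matching the range stated in the paragraph above.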
When the benefits relative to the risks and costs are considered, NSAIDs and hylan G-F 20 appear comparable, as the table shows. Consequently, there’s little evidence on which to recommend hylan G-F 20 over NSAIDs, except perhaps for patients who cannot tolerate NSAIDs, although this evidence is indirect, since no studies looked specifically at this population.
CC indicates conventional care; IA, intra-articular; NSAID, nonsteroidal anti-inflammatory drug.
Intra-articular steroids appear to deliver the same risks and clinical benefits as hylan G-F 20 at a lower cost; therefore, there’s evidence that intra-articular steroids are the preferred option. Hylan G-F 20 as an adjunct to conventional care appears to deliver some clinical benefit, although funding hylan G-F 20 as an adjunct would have considerable budget impact, so the benefits of this option do not clearly outweigh the costs. There’s some uncertainty about the effect of hylan G-F 20 relative to other hyaluronans, mostly because some of the trials of this comparison were not published.
Many of the studies of hylan G-F 20 have considerable methodological limitations that result in uncertainty about the magnitude of effect. An upcoming review of the evidence by the Osteoarthritis Advisory Panel of clinical experts will likely help to reduce some of this uncertainty.
There is moderate evidence that hylan G-F 20 is no more clinically effective than NSAIDs. The evidence that hylan G-F 20 might be an appropriate option for a person with OA of the knee who cannot tolerate NSAIDs is indirect. The possible benefit of fewer cases of NSAID-induced gastropathy in this population must be weighed against the uncertainty of a severe inflammatory adverse reaction to hylan G-F 20.
Similarly, there is moderate evidence that hylan G-F 20 is no more clinically effective than intra-articular corticosteroids. The lower cost of intra-articular corticosteroids makes them the preferred option.
There is moderate evidence that hylan G-F 20 is effective as an adjunct to conventional care, delivering a small magnitude of temporary relief at 4 to 26 weeks after treatment. The estimated additional cost to the system of providing hylan G-F 20 as an adjunct to conventional care is about $700 (Cdn) per patient annually. The magnitude and duration of clinical benefit of hylan G-F 20 must be weighed against the uncertainty and potential magnitude of the budget impact (about $35 million to $105 million (Cdn) per year) of funding this device given the high burden of OA in Ontario adults.
There is level 2 evidence that hylan G-F 20 is not effective in people with advanced OA for whom total knee replacement is indicated.
PMCID: PMC3382385  PMID: 23074461
16.  Airway Clearance Devices for Cystic Fibrosis 
Executive Summary
The purpose of this evidence-based analysis is to examine the safety and efficacy of airway clearance devices (ACDs) for cystic fibrosis and attempt to differentiate between devices, where possible, on grounds of clinical efficacy, quality of life, safety and/or patient preference.
Cystic fibrosis (CF) is a common, inherited, life-limiting disease that affects multiple systems of the human body. Respiratory dysfunction is the primary complication and leading cause of death due to CF. CF causes abnormal mucus secretion in the airways, leading to airway obstruction and mucus plugging, which in turn can lead to bacterial infection and further mucus production. Over time, this almost cyclical process contributes to severe airway damage and loss of respiratory function. Removal of airway secretions, termed airway clearance, is thus an integral component of the management of CF.
A variety of methods are available for airway clearance, some requiring mechanical devices, others physical manipulation of the body (e.g. physiotherapy). Conventional chest physiotherapy (CCPT), through the assistance of a caregiver, is the current standard of care for achieving airway clearance, particularly in young patients up to the ages of six or seven. CF patients are, however, living much longer now than in decades past. The median age of survival in Canada has risen to 37.0 years for the period of 1998-2002 (5-year window), up from 22.8 years for the 5-year window ending in 1977. The prevalence has also risen accordingly, last recorded as 3,453 in Canada in 2002, up from 1,630 in 1977. With individuals living longer, there is a greater need for independent methods of airway clearance.
Airway Clearance Devices
There are at least three classes of airway clearance devices: positive expiratory pressure devices (PEP), airway oscillating devices (AOD; either handheld or stationary) and high frequency chest compression (HFCC)/mechanical percussion (MP) devices. Within these classes are numerous different brands of devices from various manufacturers, each with subtle iterations. At least 10 devices are licensed by Health Canada (ranging from Class 1 to Class 3 devices).
Evidence-Based Analysis of Effectiveness
Research Questions
Does long-term use of ACDs improve outcomes of interest in comparison to CCPT in patients with CF?
Does long-term use of one class of ACD improve outcomes of interest in comparison to another class of ACD in CF patients?
Literature Search
A comprehensive literature search was performed on March 7, 2009 using OVID MEDLINE, MEDLINE In-Process and Other Non-Indexed Citations, EMBASE, the Cumulative Index to Nursing & Allied Health Literature (CINAHL), the Cochrane Library, and the International Agency for Health Technology Assessment (INAHTA) for studies published from January 1, 1950 to March 7, 2009.
Inclusion Criteria
All randomized controlled trials including those of parallel and crossover design,
Systematic reviews and/or meta-analyses.
Exclusion Criteria
Abstracts were generally excluded because their methods could not be examined; however, abstract data were included in several Cochrane meta-analyses presented in this paper;
Studies of less than seven days' duration (including single treatment studies);
Studies that did not report primary outcomes;
Studies in which fewer than 10 patients completed the study.
Outcomes of Interest
Primary outcomes under review were percent-predicted forced expiratory volume (FEV-1), forced vital capacity (FVC), and forced expiratory flow between 25%-75% (FEF25-75). Secondary outcomes included number of hospitalizations, adherence, patient preference, quality of life and adverse events. All outcomes were decided a priori.
Summary of Findings
Literature searching and back-searching identified 13 RCTs meeting the inclusion criteria, along with three Cochrane systematic reviews. The Cochrane reviews were identified in preliminary searching and used as the basis for formulating this review. Results were subgrouped by comparison and according to the available literature. For example, results from Cochrane meta-analyses included abstract data and therefore, additional meta-analyses were also performed on trials reported as full publications only (MAS generally excludes abstracted data when full publications are available as the methodological quality of trials reported in abstract cannot be properly assessed).
Executive Summary Table 1 summarizes the results across all comparisons and subgroupings for primary outcomes of pulmonary function. Only two comparisons yielded evidence of moderate or high quality according to GRADE criteria (CCPT vs. PEP and handheld AOD vs. PEP), and only the comparison of CCPT vs. PEP noted a significant difference between treatment groups. In comparison to CCPT, there was a significant difference in favour of PEP for % predicted FEV-1 and FVC according to one long-term, parallel RCT. This trial was accepted as the best available evidence for the comparison. The body of evidence for the remaining comparisons was low to very low, according to GRADE criteria, being downgraded most often because of poor methodological quality and low generalizability. Specifically, trials were likely not adequately powered (low sample sizes), did not conduct intention-to-treat analyses, were conducted primarily in children and young adolescents, and were outdated (conducted more than 10 years ago).
Secondary outcomes were poorly or inconsistently reported, and were generally not of value to decision-making. Of note, there was a significantly higher number of hospitalizations among participants undergoing AOD therapy in comparison to PEP therapy.
Summarization of results for primary outcomes by comparison and subgroupings
Bolding indicates significant difference
Positive summary statistics favour the former intervention
Abbreviations: AOD, airway oscillating device; CCPT, conventional chest physiotherapy; CI, confidence interval; HFCC, high frequency chest compression; MP, mechanical percussion; N/A: not applicable; PEP, positive expiratory pressure
Economic Analysis
Devices ranged in cost from around $60 for PEP and handheld AODs to upwards of $18,000 for an HFCC vest device. Although the majority of device costs are paid out-of-pocket by the patients themselves or their parents, or are covered by third-party medical insurance, Ontario did provide funding assistance through the Assistive Devices Program (ADP) for postural drainage boards and MP devices. These technologies, however, are either obsolete or their clinical efficacy is not supported by evidence. ADP provided roughly $16,000 in funding for the 2008/09 fiscal year. Using device costs and prevalent and incident cases of CF in Ontario, budget impact projections were generated for Ontario. Prevalence of CF in Ontario for patients from ages 6 to 71 was cited as 1,047 cases in 2002, while incidence was estimated at 46 new cases of CF diagnosed per year in 2002. Budget impact projections indicated that PEP and handheld AODs were highly economically feasible, costing around $90,000 for the entire prevalent population and less than $3,000 per year to cover new incident cases. HFCC vest devices were by far the most expensive, costing in excess of $19 million to cover the prevalent population alone.
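As a rough order-of-magnitude check on these projections, the quoted unit costs can be multiplied by the quoted case counts. This is only a sketch under stated assumptions: it uses the rounded prices given in the text ("around $60", "upwards of $18,000"), whereas the original projections presumably used more detailed device prices, so the totals below come out somewhat lower than the reported figures. All names are illustrative.

```python
# Case counts quoted in the text (Ontario, 2002).
PREVALENT_CASES = 1_047
INCIDENT_CASES_PER_YEAR = 46

# Rounded unit costs quoted in the text; treated here as approximate
# floor values, which is an assumption of this sketch.
UNIT_COST = {
    "PEP or handheld AOD": 60,
    "HFCC vest": 18_000,
}


def device_budget(cases: int, unit_cost: int) -> int:
    """Cost of supplying one device to each case."""
    return cases * unit_cost


for device, cost in UNIT_COST.items():
    prevalent = device_budget(PREVALENT_CASES, cost)
    incident = device_budget(INCIDENT_CASES_PER_YEAR, cost)
    print(f"{device}: ${prevalent:,} for prevalent cases, "
          f"${incident:,} per year for new cases")
```

Even with these rounded inputs, the contrast is clear: tens of thousands of dollars for PEP or handheld AODs against close to $19 million for HFCC vests across the prevalent population.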
There is currently a lack of sufficiently powered, long-term, parallel randomized controlled trials investigating the use of ACDs in comparison to other airway clearance techniques. While much of the current evidence suggests no significant difference between various ACDs and alternative therapies/technologies, at least according to outcomes of pulmonary function, there is a strong possibility that past trials were not sufficiently powered to identify a difference. Unfortunately, it is unlikely that there will be any future trials comparing ACDs to CCPT as withholding therapy using an ACD may be seen as unethical at present.
Conclusions of clinical effectiveness are as follows:
Moderate quality evidence suggests that PEP is at least as effective as or more effective than CCPT, according to primary outcomes of pulmonary function.
Moderate quality evidence suggests that there is no significant difference between PEP and handheld AODs, according to primary outcomes of pulmonary function; however, secondary outcomes may favour PEP.
Low quality evidence suggests that there is no significant difference between AODs or HFCC/MP and CCPT, according to both primary and secondary outcomes.
Very low quality evidence suggests that there is no significant difference between handheld AOD and CCPT, according to primary outcomes of pulmonary function.
Budget impact projections show PEP and handheld AODs to be highly economically feasible.
PMCID: PMC3377547  PMID: 23074531
17.  Gene-Based Tests of Association 
PLoS Genetics  2011;7(7):e1002177.
Genome-wide association studies (GWAS) are now used routinely to identify SNPs associated with complex human phenotypes. In several cases, multiple variants within a gene contribute independently to disease risk. Here we introduce a novel Gene-Wide Significance (GWiS) test that uses greedy Bayesian model selection to identify the independent effects within a gene, which are combined to generate a stronger statistical signal. Permutation tests provide p-values that correct for the number of independent tests genome-wide and within each genetic locus. When applied to a dataset comprising 2.5 million SNPs in up to 8,000 individuals measured for various electrocardiography (ECG) parameters, this method identifies more validated associations than conventional GWAS approaches. The method also provides, for the first time, systematic assessments of the number of independent effects within a gene and the fraction of disease-associated genes housing multiple independent effects, observed at 35%–50% of loci in our study. This method can be generalized to other study designs, retains power for low-frequency alleles, and provides gene-based p-values that are directly compatible for pathway-based meta-analysis.
Author Summary
Genome-wide association studies (GWAS) have successfully identified genetic variants associated with complex human phenotypes. Despite a proliferation of analysis methods, most studies rely on simple, robust SNP–by–SNP univariate tests with ever-larger population sizes. Here we introduce a new test motivated by the biological hypothesis that a single gene may contain multiple variants that contribute independently to a trait. Applied to simulated phenotypes with real genotypes, our new method, Gene-Wide Significance (GWiS), has better power to identify true associations than traditional univariate methods, previous Bayesian methods, popular L1 regularized (LASSO) multivariate regression, and other approaches. GWiS retains power for low-frequency alleles that are increasingly important for personal genetics, and it is the only method tested that accurately estimates the number of independent effects within a gene. When applied to human data for multiple ECG traits, GWiS identifies more genome-wide significant loci (verified by meta-analyses of much larger populations) than any other method. We estimate that 35%–50% of ECG trait loci are likely to have multiple independent effects, suggesting that our method will reveal previously unidentified associations when applied to existing data and will improve power for future association studies.
PMCID: PMC3145613  PMID: 21829371
18.  Effect of Statins on Venous Thromboembolic Events: A Meta-analysis of Published and Unpublished Evidence from Randomised Controlled Trials 
PLoS Medicine  2012;9(9):e1001310.
A systematic review and meta-analysis conducted by Kazem Rahimi and colleagues re-evaluates the hypothesis, generated in previous studies, that statins may reduce the risk of venous thromboembolic events. Their meta-analysis does not support the previous findings.
It has been suggested that statins substantially reduce the risk of venous thromboembolic events. We sought to test this hypothesis by performing a meta-analysis of both published and unpublished results from randomised trials of statins.
Methods and Findings
We searched MEDLINE, EMBASE, and Cochrane CENTRAL up to March 2012 for randomised controlled trials comparing statin with no statin, or comparing high dose versus standard dose statin, with 100 or more randomised participants and at least 6 months' follow-up. Investigators were contacted for unpublished information about venous thromboembolic events during follow-up. Twenty-two trials of statin versus control (105,759 participants) and seven trials of an intensive versus a standard dose statin regimen (40,594 participants) were included. In trials of statin versus control, allocation to statin therapy did not significantly reduce the risk of venous thromboembolic events (465 [0.9%] statin versus 521 [1.0%] control, odds ratio [OR] = 0.89, 95% CI 0.78–1.01, p = 0.08) with no evidence of heterogeneity between effects on deep vein thrombosis (266 versus 311, OR 0.85, 95% CI 0.72–1.01) and effects on pulmonary embolism (205 versus 222, OR 0.92, 95% CI 0.76–1.12). Exclusion of the trial result that provided the motivation for our meta-analysis (JUPITER) had little impact on the findings for venous thromboembolic events (431 [0.9%] versus 461 [1.0%], OR = 0.93 [95% CI 0.82–1.07], p = 0.32 among the other 21 trials). There was no evidence that higher dose statin therapy reduced the risk of venous thromboembolic events compared with standard dose statin therapy (198 [1.0%] versus 202 [1.0%], OR = 0.98, 95% CI 0.80–1.20, p = 0.87). Risk of bias overall was small but a certain degree of effect underestimation due to random error cannot be ruled out.
Please see later in the article for the Editors' Summary.
The findings from this meta-analysis do not support the previous suggestion of a large protective effect of statins (or higher dose statins) on venous thromboembolic events. However, a more moderate reduction in risk up to about one-fifth cannot be ruled out.
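One way to read the "up to about one-fifth" statement is through the lower bound of the 95% confidence interval for the main comparison (OR 0.89, 95% CI 0.78–1.01): the interval includes 1, so the result is not statistically significant, but its lower bound is still compatible with a reduction in odds of roughly 22%. The sketch below illustrates that reading; the helper names are hypothetical, and equating an odds reduction with a risk reduction is an approximation that is reasonable here only because events were rare (about 1% of participants).

```python
def max_plausible_reduction_percent(ci_lower: float) -> int:
    """Largest reduction in odds (in percent) compatible with the CI lower bound."""
    return round((1 - ci_lower) * 100)


def excludes_no_effect(ci_lower: float, ci_upper: float) -> bool:
    """True if the interval for the odds ratio does not contain 1 (no effect)."""
    return not (ci_lower <= 1.0 <= ci_upper)


# Statin versus control, all 22 trials (values from the abstract).
print(excludes_no_effect(0.78, 1.01))
print(max_plausible_reduction_percent(0.78))

# Same comparison after excluding JUPITER (95% CI 0.82 to 1.07).
print(max_plausible_reduction_percent(0.82))
```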
Editors' Summary
Blood normally flows smoothly throughout the human body, supplying its organs and tissues with oxygen and nutrients. But, when an injury occurs, proteins called clotting factors make the blood gel (coagulate) at the injury site. The resultant blood clot (thrombus) plugs the wound and prevents blood loss. Occasionally, however, a thrombus forms inside an uninjured blood vessel and partly or completely blocks the blood flow. A clot inside one of the veins (vessels that take blood towards the heart) deep within the body is called a deep vein thrombosis (DVT). Symptoms of DVT (which usually occurs in the leg) include pain, swelling, and redness in the affected limb. DVT is treated with heparin and warfarin, two anticoagulant drugs that stop the blood clot growing. If left untreated, part of the clot (an embolus) can break off and travel to the lungs, where it can cause a pulmonary embolism (PE), a life-threatening condition characterized by chest pain, breathlessness, coughing, and dizziness. Little is known about how to prevent DVTs and PEs but risk factors for these venous thromboembolic events include having an inherited blood clotting disorder, oral contraceptive use, having surgery, and prolonged inactivity (on long-haul plane flights, for example).
Why Was This Study Done?
In 2009, a secondary (add-on) analysis of data from a randomized controlled trial (RCT, a study that randomly assigns individuals to receive different treatments and compares the outcomes associated with each treatment) called the JUPITER trial reported that rosuvastatin—a cholesterol-lowering drug (statin)—halved the risk of venous thromboembolic events among apparently healthy adults. The JUPITER trial was initiated to test whether statins reduce the risk of strokes, heart attacks, and other cardiovascular diseases (conditions that involve the heart and the blood vessels) among adults with raised levels of a predictor for these diseases called C-reactive protein; statins reduce the levels of this protein as well as those of cholesterol. Because fewer than 100 of the participants in the JUPITER trial developed a DVT or PE, the reduction in the risk of a venous thromboembolic event among the participants who took rosuvastatin could have happened by chance. In this systematic review and meta-analysis of 29 RCTs of statins that collected information on many more venous thromboembolic events, the researchers test the hypothesis that statins substantially reduce the risk of such events. A systematic review uses predefined criteria to identify all the research on a given topic; a meta-analysis is a statistical approach that combines the results of several studies.
What Did the Researchers Do and Find?
The researchers identified 22 RCTs (105,759 participants) that compared the effects of statins with control (dummy) tablets and seven (40,594 participants) that compared an intensive statin regimen with a standard regimen. They then obtained largely unpublished information about the venous thromboembolic events that occurred during these trials (about 1,000 DVTs and PEs) from the original investigators. In the trials of statin versus control, allocation to statin therapy did not significantly reduce the risk of venous thromboembolic events. Thus, although events occurred in 465 participants who were given statins (0.9% of the participants) and in 521 participants who were given control tablets (1% of the participants), this difference in outcomes was not statistically significant—it could have happened by chance. Exclusion of the JUPITER trial results from the meta-analysis did not alter this finding. The researchers also found no evidence that intensive statin therapy reduced the risk of venous thromboembolic events compared to standard therapy.
What Do These Findings Mean?
The findings of this meta-analysis do not support the suggestion that statins, either at the standard dose or at higher doses, reduce the risk of venous thromboembolic events substantially among healthy adults. It is possible that the effect of statins has been underestimated in this meta-analysis because of missing data or because of some other source of bias. Furthermore, because the total number of events in this meta-analysis is still relatively modest, these findings do not rule out the possibility that statins may reduce the risk of venous thromboembolic events by up to about one-fifth in some or all individuals. Additional large RCTs are now needed to investigate whether statin treatment does in fact reduce the risk of venous thromboembolic events in adults and, if it does, whether all statins have a similar effect and whether statin treatment is beneficial in everyone or only in specific subgroups of people.
PMCID: PMC3445446  PMID: 23028261
19.  Is Immediate Imaging Important in Managing Low Back Pain? 
Journal of Athletic Training  2011;46(1):99-102.
Chou R, Fu R, Carrino JA, Deyo RA. Imaging strategies for low-back pain: systematic review and meta-analysis. Lancet. 2009;373(9662):463–472.
Clinical Questions:
In patients with low back pain (LBP) who do not have indications of a serious underlying condition, does routine, immediate lumbar imaging result in improved patient outcomes when compared with clinical care without immediate imaging?
Data Sources:
Studies were identified by searching MEDLINE (1966 through first week of August 2008) and the Cochrane Central Register of Controlled Trials (third quarter of 2008). The reference lists of identified studies were manually reviewed for additional citations. The search terms spine, low-back pain, diagnostic imaging, and randomized controlled trials were used in both databases. The complete search strategy was made available as an online supplement.
Study Selection:
The search criteria were applied to the articles obtained from the electronic searches and the subsequent manual searches with no language restrictions. This systematic review and meta-analysis included randomized, controlled trials that compared immediate, routine lumbar imaging (or routine provision of imaging findings) with usual clinical care without immediate lumbar imaging (or not routinely providing results of imaging) for LBP without indications of serious underlying conditions.
Data Extraction:
Data extraction and assessment of study quality were well described. The trials assessed one or more of the following outcomes: pain, function, mental health, quality of life, patient satisfaction, and overall patient-reported improvement. Two reviewers independently appraised citations considered potentially relevant, with disagreements between reviewers resolved by consensus. Two independent reviewers abstracted data from the trials and assessed quality with modified Cochrane Back Review Group criteria. The criterion for blinding of patients and providers was excluded because of lack of applicability to imaging studies. In addition, the criterion of co-intervention similarity was excluded because a potential effect of different imaging strategies is to alter subsequent treatment decisions. As a result of excluding these criteria, quality ratings were based on the remaining 8 criteria. The authors resolved disagreements about quality ratings through discussion and consensus. Trials that met 4 or more of the 8 criteria were classified as higher quality, whereas those that met 3 or fewer of the 8 criteria were classified as lower quality. In addition, the authors categorized duration of symptoms as acute (<4 weeks), subacute (4–12 weeks), or chronic (>12 weeks). The investigators also contacted the study authors for additional data if included outcomes were not published or if median (rather than mean) outcomes were reported. Statistical analysis was conducted on the primary outcomes of improvement in pain or function. Secondary outcomes of improvement in mental health, quality of life, patient satisfaction, and overall improvement were also analyzed. Outcomes were categorized as short term (≤3 months), long term (>6 months to ≤1 year), or extended (>1 year). For continuous outcomes, standardized mean differences (SMDs) of interventions for change between baseline and follow-up measurements were calculated. 
In studies reporting the same pain (visual analog scale [VAS] or Short Form-36 bodily pain score) or function (Roland-Morris Disability Questionnaire [RDQ]) outcomes, weighted mean differences (WMDs) were calculated. In all analyses, lower pain and function scores indicated better outcomes. For quality-of-life and mental health outcomes, higher scores indicated improved outcomes. All statistical analyses were performed with Stata 10.0. For outcomes in which SMDs were calculated, values of 0.2 to 0.5 were considered small, 0.5 to 0.8 were considered moderate, and values greater than 0.8 were considered large. For WMDs, mean improvements of 5 to 10 points on a 100-point scale (or equivalent) were considered small, 10-point to 20-point changes were considered moderate, and changes greater than 20 points were considered large. For the RDQ, mean improvements of 1 to 2 points were termed small, and improvements of 2 to 5 points were termed moderate.
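The magnitude thresholds described above can be written as two small classification helpers. This is a minimal sketch, not the reviewers' own code: the function names are hypothetical, the label used for values below the "small" range is an assumption (the text does not name that category), and so is the assignment of the shared boundary values (0.5, 0.8, 10, 20) to the larger category.

```python
def classify_smd(smd: float) -> str:
    """Classify a standardized mean difference by absolute magnitude."""
    magnitude = abs(smd)
    if magnitude > 0.8:
        return "large"
    if magnitude >= 0.5:   # boundary handling is an assumption
        return "moderate"
    if magnitude >= 0.2:
        return "small"
    return "below small threshold"  # label not defined in the source


def classify_wmd_100_point(wmd: float) -> str:
    """Classify a weighted mean difference on a 100-point scale."""
    magnitude = abs(wmd)
    if magnitude > 20:
        return "large"
    if magnitude >= 10:    # boundary handling is an assumption
        return "moderate"
    if magnitude >= 5:
        return "small"
    return "below small threshold"  # label not defined in the source


print(classify_smd(-0.10))
print(classify_smd(0.6))
print(classify_wmd_100_point(12))
```

Under these rules, an SMD of −0.10 falls below the "small" range, which is in line with a conclusion of no meaningful difference between groups.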
Main Results:
The total number of citations identified using the search criteria was 479 articles and abstracts. Of these, 466 were excluded because either they were not randomized trials or they did not use imaging strategies for LBP. At this step, 13 articles were retrieved for further analysis. This analysis resulted in 3 additional articles being excluded (1 was not a randomized trial and the other 2 compared 2 imaging techniques rather than immediate imaging versus no imaging). The final step resulted in the inclusion of 6 trials reported in 10 publications for the meta-analysis. In the studies meeting the inclusion criteria, 4 assessed lumbar radiography and 2 assessed magnetic resonance imaging (MRI) or computed tomography (CT) scans. In these 6 trials, 1804 patients were randomly assigned to the intervention group. The duration of patient follow-up ranged from 3 weeks to 2 years. In addition, 1 trial excluded patients with sciatica or other radiculopathy symptoms, whereas another did not report the proportion of patients with these symptoms. In the other 4 studies, the proportion of patients with sciatica or radiculopathy ranged from 24% to 44%. Of the included trials, 3 compared immediate lumbar radiography with usual clinical care without immediate radiography, and a fourth study compared immediate lumbar radiography and a brief educational intervention with lumbar radiography if no improvement was seen by 3 weeks. The final 2 studies assessed advanced imaging modalities. Specifically, one group compared immediate MRI or CT with usual clinical care without advanced imaging in patients with primarily chronic LBP (82% with LBP for >3 months) who were referred to a surgeon. In the other advanced imaging study, all patients with LBP for <3 weeks underwent MRI and were then randomized to routine notification of results or to notification of results only if clinically indicated. 
With respect to study quality, 5 trials met at least 4 of the 8 predetermined quality criteria, leading to a classification of higher quality. In addition, 5 trials were included in the primary meta-analysis on pain or function improvement at 1 or more follow-up periods. With regard to short-term and long-term improvements in pain, no differences were noted between routine, immediate lumbar imaging and usual clinical care without immediate imaging (Table 1). In studies using the VAS pain score, the WMD (0.62, 95% confidence interval [CI] = 0.03, 1.21) at short-term follow-up slightly favored no immediate imaging. No differences in outcome were seen in studies using the Short Form-36 bodily pain score. No improvements in function at short-term or long-term follow-up were noted between imaging strategies. Specifically, short-term function measured with the RDQ in 3 studies showed a WMD of 0.48 points (95% CI = −1.39, 2.35) between imaging strategies, whereas long-term function in 3 studies, also measured with the RDQ, showed a WMD of 0.33 points (95% CI = −0.65, 1.32). One included trial reported pain outcomes at extended (2-year) follow-up and found no differences between imaging strategies for pain (Short Form-36 bodily pain or Aberdeen pain score), with SMDs of −2.7 (95% CI = −6.17, 0.79) and −1.6 (−4.04, 0.84), respectively. The outcomes between immediate imaging and usual clinical care without immediate imaging did not differ for short-term follow-up in those studies reporting quality of life (SMD = −0.10, 95% CI = −0.53, 0.34), mental health (SMD = 0.12, 95% CI = −0.37, 0.62), or overall improvement (mean risk ratio = 0.83, 95% CI = 0.65, 1.06). In those studies reporting long-term follow-up periods, similar results can be seen for quality of life (SMD = −0.15, 95% CI = −0.33, 0.04) and mental health (SMD = 0.01, 95% CI = −0.32, 0.34).
In the study reporting extended follow-up, immediate imaging was not better in terms of improving quality of life (SMD = 0.02, 95% CI = −0.02, 0.07) or mental health (SMD = −1.50, 95% CI = −4.09, 1.09) when compared with usual clinical care without immediate imaging. In the included studies, no cases of cancer, infection, cauda equina syndrome, or other serious diagnoses were reported in patients randomly assigned to either imaging strategy.
Available evidence indicates that immediate, routine lumbar spine imaging in patients with LBP and without features indicating a serious underlying condition did not improve outcomes compared with usual clinical care without immediate imaging. Clinical care without immediate imaging seems to result in no increased odds of failure in identifying serious underlying conditions in patients without risk factors for these conditions. In addition to lacking clinical benefit, routine lumbar imaging is associated with radiation exposure (radiography and CT) and increased direct expenses for patients and may lead to unnecessary procedures. This evidence confirms that clinicians should refrain from routine, immediate lumbar imaging in primary care patients with nonspecific, acute or subacute LBP and no indications of underlying serious conditions. Specific consideration of patient expectations about the value of imaging was not addressed here; however, this aspect must be considered to avoid unnecessary imaging while also meeting patient expectations and increasing patient satisfaction.
PMCID: PMC3017496  PMID: 21214357
spine; assessment; outcomes
20.  Clinical Utility of Serologic Testing for Celiac Disease in Ontario 
Executive Summary
Objective of Analysis
The objective of this evidence-based evaluation is to assess the accuracy of serologic tests in the diagnosis of celiac disease in subjects with symptoms consistent with this disease. Furthermore, the impact of these tests on the diagnostic pathway of the disease and on decision making was evaluated.
Celiac Disease
Celiac disease is an autoimmune disease that develops in genetically predisposed individuals. The immunological response is triggered by ingestion of gluten, a protein that is present in wheat, rye, and barley. The treatment consists of strict lifelong adherence to a gluten-free diet (GFD).
Patients with celiac disease may present with a myriad of symptoms, including diarrhea, abdominal pain, weight loss, iron deficiency anemia, and dermatitis herpetiformis, among others.
Serologic Testing in the Diagnosis of Celiac Disease
There are a number of serologic tests used in the diagnosis of celiac disease.
Anti-gliadin antibody (AGA)
Anti-endomysial antibody (EMA)
Anti-tissue transglutaminase antibody (tTG)
Anti-deamidated gliadin peptides antibodies (DGP)
Serologic tests are automated with the exception of the EMA test, which is more time-consuming and operator-dependent than the other tests. For each serologic test, either immunoglobulin A (IgA) or immunoglobulin G (IgG) can be measured; however, IgA is the standard antibody measured in celiac disease.
Diagnosis of Celiac Disease
According to celiac disease guidelines, the diagnosis of celiac disease is established by small bowel biopsy. Serologic tests are used to initially detect and to support the diagnosis of celiac disease. A small bowel biopsy is indicated in individuals with a positive serologic test. In some cases an endoscopy and small bowel biopsy may be required even with a negative serologic test. The diagnosis of celiac disease must be performed on a gluten-containing diet since the small intestine abnormalities and the serologic antibody levels may resolve or improve on a GFD.
Since IgA measurement is the standard for the serologic celiac disease tests, false negatives may occur in IgA-deficient individuals.
Incidence and Prevalence of Celiac Disease
The incidence and prevalence of celiac disease in the general population and in subjects with symptoms consistent with or at higher risk of celiac disease based on systematic reviews published in 2004 and 2009 are summarized below.
Incidence of Celiac Disease in the General Population
Adults or mixed population: 1 to 17/100,000/year
Children: 2 to 51/100,000/year
In one of the studies, a stratified analysis showed that there was a higher incidence of celiac disease in younger children compared to older children, i.e., 51 cases/100,000/year in 0 to 2 year-olds, 33/100,000/year in 2 to 5 year-olds, and 10/100,000/year in children 5 to 15 years old.
Prevalence of Celiac Disease in the General Population
The prevalence of celiac disease reported in population-based studies identified in the 2004 systematic review varied between 0.14% and 1.87% (median: 0.47%, interquartile range: 0.25%, 0.71%). According to the authors of the review, the prevalence did not vary by age group, i.e., adults and children.
Prevalence of Celiac Disease in High Risk Subjects
Type 1 diabetes (adults and children): 1 to 11%
Autoimmune thyroid disease: 2.9 to 3.3%
First degree relatives of patients with celiac disease: 2 to 20%
Prevalence of Celiac Disease in Subjects with Symptoms Consistent with the Disease
The prevalence of celiac disease in subjects with symptoms consistent with the disease varied widely among studies, i.e., 1.5% to 50% in adult studies, and 1.1% to 17% in pediatric studies. Differences in prevalence may be related to the referral pattern as the authors of a systematic review noted that the prevalence tended to be higher in studies whose population originated from tertiary referral centres compared to general practice.
Research Questions
What are the sensitivity and specificity of serologic tests in the diagnosis of celiac disease?
What is the clinical validity of serologic tests in the diagnosis of celiac disease? The clinical validity was defined as the ability of the test to change diagnosis.
What is the clinical utility of serologic tests in the diagnosis of celiac disease? The clinical utility was defined as the impact of the test on decision making.
What is the budget impact of serologic tests in the diagnosis of celiac disease?
What is the cost-effectiveness of serologic tests in the diagnosis of celiac disease?
Literature Search
A literature search was performed on November 13th, 2009 using OVID MEDLINE, MEDLINE In-Process and Other Non-Indexed Citations, EMBASE, the Cumulative Index to Nursing & Allied Health Literature (CINAHL), the Cochrane Library, and the International Agency for Health Technology Assessment (INAHTA) for studies published from January 1st 2003 to November 13th 2010. Abstracts were reviewed by a single reviewer and, for those studies meeting the eligibility criteria, full-text articles were obtained. Reference lists were also examined for any additional relevant studies not identified through the search. Articles with unknown eligibility were reviewed with a second clinical epidemiologist, then a group of epidemiologists, until consensus was established. The quality of evidence was assessed as high, moderate, low, or very low according to GRADE methodology.
Inclusion Criteria
Studies that evaluated diagnostic accuracy, i.e., both sensitivity and specificity of serology tests in the diagnosis of celiac disease.
Study population consisted of untreated patients with symptoms consistent with celiac disease.
Studies in which both serologic celiac disease tests and small bowel biopsy (gold standard) were used in all subjects.
Systematic reviews, meta-analyses, randomized controlled trials, prospective observational studies, and retrospective cohort studies.
At least 20 subjects included in the celiac disease group.
English language.
Human studies.
Studies published from 2000 on.
Clearly defined cut-off value for the serology test. If more than one test was evaluated, only those tests for which a cut-off was provided were included.
Description of small bowel biopsy procedure clearly outlined (location, number of biopsies per patient), unless it was specified that celiac disease diagnosis guidelines were followed.
Patients in the treatment group had untreated CD.
Exclusion Criteria
Studies on screening of the general asymptomatic population.
Studies that evaluated rapid diagnostic kits for use either at home or in physician’s offices.
Studies that evaluated diagnostic modalities other than serologic tests such as capsule endoscopy, push enteroscopy, or genetic testing.
Cut-off for serologic tests defined based on controls included in the study.
Study population defined based on positive serology or subjects pre-screened by serology tests.
Celiac disease status known before study enrolment.
Sensitivity or specificity estimates based on repeated testing for the same subject.
Non-peer-reviewed literature such as editorials and letters to the editor.
The population consisted of adults and children with untreated, undiagnosed celiac disease with symptoms consistent with the disease.
Serologic Celiac Disease Tests Evaluated
Anti-gliadin antibody (AGA)
Anti-endomysial antibody (EMA)
Anti-tissue transglutaminase antibody (tTG)
Anti-deamidated gliadin peptides antibody (DGP)
Combinations of some of the serologic tests listed above were evaluated in some studies
Both IgA and IgG antibodies were evaluated for the serologic tests listed above.
Outcomes of Interest
Positive and negative likelihood ratios
Diagnostic odds ratio (OR)
Area under the sROC curve (AUC)
Small bowel biopsy was used as the gold standard in order to estimate the sensitivity and specificity of each serologic test.
Statistical Analysis
Pooled estimates of sensitivity, specificity, and diagnostic odds ratios (DORs) for the different serologic tests were calculated using a bivariate, binomial generalized linear mixed model. Statistical significance for differences in sensitivity and specificity between serologic tests was defined by P values less than 0.05, with “false discovery rate” adjustments made for multiple hypothesis testing. The bivariate regression analyses were performed using SAS version 9.2 (SAS Institute Inc.; Cary, NC, USA). Using the bivariate model parameters, summary receiver operating characteristic (sROC) curves were produced using Review Manager 5.0.22 (The Nordic Cochrane Centre, The Cochrane Collaboration, 2008). The area under the sROC curve (AUC) was estimated within a bivariate mixed-effects binary regression modeling framework. Model specification, estimation, and prediction were carried out with xtmelogit in Stata release 10 (StataCorp, 2007). Statistical tests for the differences in AUC estimates could not be carried out.
The study results were stratified according to patient or disease characteristics, such as age and severity of Marsh grade abnormalities, among others, if reported in the studies. The literature indicates that the diagnostic accuracy of serologic tests for celiac disease may be affected in patients with chronic liver disease; therefore, the studies identified through the systematic literature review that evaluated the diagnostic accuracy of serologic tests for celiac disease in patients with chronic liver disease were summarized. The effect of the GFD in patients diagnosed with celiac disease was also summarized if reported in the studies eligible for the analysis.
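The text does not name the false discovery rate procedure that was applied. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up adjustment, which is a common choice; the function name and the input P values are hypothetical and are not taken from the analysis.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: this is one common FDR procedure; the report does not
    state which adjustment was actually used.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw P values for three pairwise comparisons.
    print(benjamini_hochberg([0.04, 0.001, 0.30]))
```

A comparison would then be declared significant when its adjusted P value is below 0.05.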
Summary of Findings
Published Systematic Reviews
Five systematic reviews of studies that evaluated the diagnostic accuracy of serologic celiac disease tests were identified through our literature search. Seventeen individual studies identified in adults and children were eligible for this evaluation.
In general, the studies included evaluated the sensitivity and specificity of at least one serologic test in subjects with symptoms consistent with celiac disease. The gold standard used to confirm the celiac disease diagnosis was small bowel biopsy. Serologic tests evaluated included tTG, EMA, AGA, and DGP, using either IgA or IgG antibodies. Indirect immunofluorescence was used for the EMA serologic tests whereas enzyme-linked immunosorbent assay (ELISA) was used for the other serologic tests.
Common symptoms described in the studies were chronic diarrhea, abdominal pain, bloating, unexplained weight loss, unexplained anemia, and dermatitis herpetiformis.
The main conclusions of the published systematic reviews are summarized below.
IgA tTG and/or IgA EMA have a high accuracy (pooled sensitivity: 90% to 98%, pooled specificity: 95% to 99% depending on the pooled analysis).
Most reviews found that AGA (IgA or IgG) are not as accurate as IgA tTG and/or EMA tests.
A 2009 systematic review concluded that DGP (IgA or IgG) seems to have an accuracy similar to that of tTG; however, since only 2 of the studies identified evaluated its accuracy, the authors believe that additional data are required to draw firm conclusions.
Two systematic reviews also concluded that combining two serologic celiac disease tests has little contribution to the accuracy of the diagnosis.
MAS Analysis
The pooled analysis performed by MAS showed that IgA tTG has a sensitivity of 92.1% [95% confidence interval (CI) 88.0, 96.3], compared to 89.2% (83.3, 95.1, p=0.12) for IgA DGP, 85.1% (79.5, 94.4, p=0.07) for IgA EMA, and 74.9% (63.6, 86.2, p=0.0003) for IgA AGA. Among the IgG-based tests, the results suggest a sensitivity of 88.4% (95% CI: 82.1, 94.6) for IgG DGP, 44.7% (30.3, 59.2) for IgG tTG, and 69.1% (56.0, 82.2) for IgG AGA. The difference was significant when IgG DGP was compared to IgG tTG but not when it was compared to IgG AGA. Combining serologic celiac disease tests yielded a slightly higher sensitivity compared to individual IgA-based serologic tests.
IgA deficiency
The prevalence of total or severe IgA deficiency was low in the studies identified, varying between 0 and 1.7%, as reported in 3 studies in which IgA deficiency was not used as a referral indication for celiac disease serologic testing. The results of IgG-based serologic tests were positive in all patients with IgA deficiency in whom celiac disease was confirmed by small bowel biopsy, as reported in four studies.
The MAS pooled analysis indicates a high specificity across the different serologic tests, including the combination strategy; pooled estimates ranged from 90.1% to 98.7% depending on the test.
Likelihood Ratios
According to the likelihood ratio estimates, both IgA tTG and serologic test combinations were considered very useful tests (positive likelihood ratio above ten and the negative likelihood ratio below 0.1).
Moderately useful tests included IgA EMA, IgA DGP, and IgG DGP (positive likelihood ratio between five and ten and the negative likelihood ratio between 0.1 and 0.2).
Somewhat useful tests: IgA AGA, IgG AGA, generating small but sometimes important changes from pre- to post-test probability (positive LR between 2 and 5 and negative LR between 0.2 and 0.5)
Not Useful: IgG tTG, altering pre- to post-test probability to a small and rarely important degree (positive LR between 1 and 2 and negative LR between 0.5 and 1).
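The likelihood ratio categories above follow from sensitivity and specificity. The sketch below is a minimal illustration, not the MAS computation: the function names are hypothetical, the 95% specificity is an assumed value (only the 92.1% sensitivity of IgA tTG is taken from the text), and a test is assigned to a category only when both ratios meet that category's thresholds.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for accuracy values given as proportions in (0, 1)."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def usefulness(lr_pos, lr_neg):
    """Map a pair of likelihood ratios to the categories used in the text.

    Assumption: when the two ratios fall in different bands, the test is
    placed in the lower category.
    """
    if lr_pos > 10 and lr_neg < 0.1:
        return "very useful"
    if lr_pos > 5 and lr_neg < 0.2:
        return "moderately useful"
    if lr_pos > 2 and lr_neg < 0.5:
        return "somewhat useful"
    return "not useful"


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    # Sensitivity from the text; specificity is an assumed illustrative value.
    lr_pos, lr_neg = likelihood_ratios(0.921, 0.95)
    print(usefulness(lr_pos, lr_neg))
    # Hypothetical 10% pre-test probability, positive and negative results.
    print(post_test_probability(0.10, lr_pos))
    print(post_test_probability(0.10, lr_neg))
```

The last function shows why the bands matter: a larger positive ratio or a smaller negative ratio moves the post-test probability further from the pre-test probability.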
Diagnostic Odds Ratios (DOR)
Among the individual serologic tests, IgA tTG had the highest DOR, 136.5 (95% CI: 51.9, 221.2). The statistical significance of the difference in DORs among tests was not calculated; however, considering the wide confidence intervals obtained, the differences may not be statistically significant.
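For a single study, the DOR can be computed from the 2x2 table, or equivalently as the ratio of the positive to the negative likelihood ratio. The sketch below uses hypothetical counts and the standard log-scale (Woolf) approximation for the confidence interval; it is not the bivariate mixed model used for the pooled estimates, and the function names are illustrative.

```python
import math


def dor_from_counts(tp, fp, fn, tn, z=1.96):
    """Return (DOR, lower, upper) from a 2x2 table.

    Assumption: all four cells are non-zero, so no continuity
    correction is applied.
    """
    dor = (tp * tn) / (fp * fn)
    # Standard error of log(DOR) under the Woolf approximation.
    se_log = math.sqrt(1.0 / tp + 1.0 / fp + 1.0 / fn + 1.0 / tn)
    lower = math.exp(math.log(dor) - z * se_log)
    upper = math.exp(math.log(dor) + z * se_log)
    return dor, lower, upper


def dor_from_accuracy(sensitivity, specificity):
    """DOR expressed as LR+ divided by LR-."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos / lr_neg


if __name__ == "__main__":
    # Hypothetical study: 100 biopsy-positive and 100 biopsy-negative subjects.
    print(dor_from_counts(tp=90, fp=5, fn=10, tn=95))
    print(dor_from_accuracy(0.90, 0.95))
```

Because the interval is symmetric on the log scale, it is wide and asymmetric on the original scale when any cell count is small.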
Area Under the sROC Curve (AUC)
The sROC AUCs obtained ranged between 0.93 and 0.99 for most IgA-based tests with the exception of IgA AGA, with an AUC of 0.89.
Sensitivity and Specificity of Serologic Tests According to Age Groups
Serologic test accuracy did not seem to vary according to age (adults or children).
Sensitivity and Specificity of Serologic Tests According to Marsh Criteria
Four studies observed a trend towards a higher sensitivity of serologic celiac disease tests when Marsh 3c grade abnormalities were found in the small bowel biopsy compared to Marsh 3a or 3b (statistical significance not reported). The sensitivity of serologic tests was much lower when Marsh 1 grade abnormalities were found in small bowel biopsy compared to Marsh 3 grade abnormalities. The statistical significance of these findings was not reported in the studies.
Diagnostic Accuracy of Serologic Celiac Disease Tests in Subjects with Chronic Liver Disease
A total of 14 observational studies that evaluated the specificity of serologic celiac disease tests in subjects with chronic liver disease were identified. All studies evaluated the frequency of false positive results (1 − specificity) of IgA tTG; however, the IgA tTG test kits used different substrates, i.e., human recombinant, human, and guinea pig substrates. The gold standard, small bowel biopsy, was used to confirm the result of the serologic tests in only 5 studies. The studies do not seem to have been designed or powered to compare the diagnostic accuracy among different serologic celiac disease tests.
The results of the studies identified in the systematic literature review suggest that there is a trend towards a lower frequency of false positive results if the IgA tTG test using human recombinant substrate is used compared to the guinea pig substrate in subjects with chronic liver disease. However, the statistical significance of the difference was not reported in the studies. When IgA tTG with human recombinant substrate was used, the number of false positives seems to be similar to what was estimated in the MAS pooled analysis for IgA-based serologic tests in a general population of patients. These results should be interpreted with caution since most studies did not use the gold standard, small bowel biopsy, to confirm or exclude the diagnosis of celiac disease, and since the studies were not designed to compare the diagnostic accuracy among different serologic tests. The sensitivity of the different serologic tests in patients with chronic liver disease was not evaluated in the studies identified.
Effects of a Gluten-Free Diet (GFD) in Patients Diagnosed with Celiac Disease
Six studies identified evaluated the effects of GFD on clinical, histological, or serologic improvement in patients diagnosed with celiac disease. Improvement was observed in 51% to 95% of the patients included in the studies.
Grading of Evidence
Overall, the quality of the evidence ranged from moderate to very low depending on the serologic celiac disease test. Reasons to downgrade the quality of the evidence included the use of a surrogate endpoint (diagnostic accuracy) since none of the studies evaluated clinical outcomes, inconsistencies among study results, imprecise estimates, and sparse data. The quality of the evidence was considered moderate for IgA tTG and IgA EMA, low for IgA DGP and serologic test combinations, and very low for IgA AGA.
Clinical Validity and Clinical Utility of Serologic Testing in the Diagnosis of Celiac Disease
The clinical validity of serologic tests in the diagnosis of celiac disease was considered high in subjects with symptoms consistent with this disease due to
High accuracy of some serologic tests.
Serologic tests detect possible celiac disease cases and avoid unnecessary small bowel biopsy if the test result is negative, unless an endoscopy/ small bowel biopsy is necessary due to the clinical presentation.
Serologic tests support the results of small bowel biopsy.
The clinical utility of serologic tests for the diagnosis of celiac disease, as defined by its impact in decision making was also considered high in subjects with symptoms consistent with this disease given the considerations listed above and since celiac disease diagnosis leads to treatment with a gluten-free diet.
Economic Analysis
A decision analysis was constructed to compare costs and outcomes between the tests based on the sensitivity, specificity and prevalence summary estimates from the MAS Evidence-Based Analysis (EBA). A budget impact was then calculated by multiplying the expected costs and volumes in Ontario. The outcome of the analysis was expected costs and false negatives (FN). Costs were reported in 2010 CAD$. All analyses were performed using TreeAge Pro Suite 2009.
Four strategies made up the efficiency frontier: IgG tTG, IgA tTG, EMA, and small bowel biopsy. All other strategies were dominated. IgG tTG was the least costly and least effective strategy ($178.95, FN avoided=0). Small bowel biopsy was the most costly and most effective strategy ($396.60, FN avoided =0.1553). The costs per FN avoided were $293, $369, and $1,401 for EMA, IgA tTG, and small bowel biopsy, respectively. One-way sensitivity analyses did not change the ranking of strategies.
All testing strategies with small bowel biopsy are cheaper than biopsy alone; however, they also result in more FNs. The most cost-effective strategy will depend on the decision makers’ willingness to pay. Findings suggest that IgA tTG was the most cost-effective and feasible strategy based on its incremental cost-effectiveness ratio (ICER) and the convenience of conducting the test. The potential impact of the IgA tTG test in the province of Ontario would be $10.4M, $11.0M, and $11.7M, respectively, in the following three years, based on past volumes and trends in the province and base-case expected costs.
A panel of tests is the strategy commonly used in the province of Ontario; therefore, the impact on the system would be $13.6M, $14.5M, and $15.3M, respectively, in the next three years, based on past volumes and trends in the province and base-case expected costs.
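The cost per FN avoided can be read as an incremental ratio: the extra expected cost of a strategy divided by the extra false negatives it avoids relative to a comparator. The sketch below assumes the comparator is the least costly strategy (IgG tTG); that assumption is not stated in the text, but it is consistent with the $1,401 reported for small bowel biopsy. The function name is hypothetical.

```python
def cost_per_fn_avoided(cost, fn_avoided, comparator_cost, comparator_fn_avoided):
    """Incremental cost per additional false negative avoided."""
    extra_cost = cost - comparator_cost
    extra_fn_avoided = fn_avoided - comparator_fn_avoided
    if extra_fn_avoided <= 0:
        raise ValueError("strategy avoids no additional false negatives")
    return extra_cost / extra_fn_avoided


if __name__ == "__main__":
    # Values reported in the text: biopsy ($396.60, 0.1553 FN avoided)
    # versus IgG tTG ($178.95, 0 FN avoided), in 2010 CAD$.
    ratio = cost_per_fn_avoided(396.60, 0.1553, 178.95, 0.0)
    print(round(ratio, 2))  # about 1401.48
```

The expected costs of EMA and IgA tTG are not given in the summary, so their ratios cannot be recomputed here.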
The clinical validity and clinical utility of serologic tests for celiac disease were considered high in subjects with symptoms consistent with this disease, as the tests aid in the diagnosis of celiac disease and some have a high accuracy.
The study findings suggest that IgA tTG is the most accurate and the most cost-effective test.
The AGA test (IgA) has a lower accuracy compared to other IgA-based tests.
Serologic test combinations appear to be more costly with little gain in accuracy. In addition there may be problems with generalizability of the results of the studies included in this review if different test combinations are used in clinical practice.
IgA deficiency seems to be uncommon in patients diagnosed with celiac disease.
The generalizability of study results is contingent on performing both the serologic test and small bowel biopsy in subjects on a gluten-containing diet as was the case in the studies identified, since the avoidance of gluten may affect test results.
PMCID: PMC3377499  PMID: 23074399
21.  Activation Likelihood Estimation meta-analysis revisited 
Neuroimage  2011;59(3):2349-2361.
A widely used technique for coordinate-based meta-analysis of neuroimaging data is activation likelihood estimation (ALE), which determines the convergence of foci reported from different experiments. ALE analysis involves modelling these foci as probability distributions whose width is based on empirical estimates of the spatial uncertainty due to the between-subject and between-template variability of neuroimaging data. ALE results are assessed against a null-distribution of random spatial association between experiments, resulting in random-effects inference. In the present revision of this algorithm, we address two remaining drawbacks of the previous algorithm. First, the assessment of spatial association between experiments was based on a highly time-consuming permutation test, which nevertheless entailed the danger of underestimating the right tail of the null-distribution. In this report, we outline how this previous approach may be replaced by a faster and more precise analytical method. Second, the previously applied correction procedure, i.e. controlling the false discovery rate (FDR), is supplemented by new approaches for correcting the family-wise error rate and the cluster-level significance. The different alternatives for drawing inference on meta-analytic results are evaluated on an exemplary dataset on face perception as well as discussed with respect to their methodological limitations and advantages. In summary, we thus replaced the previous permutation algorithm with a faster and more rigorous analytical solution for the null-distribution and comprehensively address the issue of multiple-comparison corrections. The proposed revision of the ALE-algorithm should provide an improved tool for conducting coordinate-based meta-analyses on functional imaging data.
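The abstract does not give the formulas, so the following is only a toy sketch of the general idea under stated assumptions: each experiment is summarized by a histogram of its modelled activation (MA) values, ALE scores combine MA values across experiments as a probabilistic union, and the null distribution of ALE scores is built analytically by combining the histograms one experiment at a time instead of permuting voxel locations. All function names, the binning precision, and the example histograms are hypothetical.

```python
from collections import defaultdict


def ale_union(ma_values):
    """Probabilistic union of MA values: 1 - prod(1 - ma)."""
    remaining = 1.0
    for ma in ma_values:
        remaining *= 1.0 - ma
    return 1.0 - remaining


def combine_null(histograms, digits=6):
    """Build the null distribution of ALE scores.

    histograms: one dict per experiment mapping an MA value to the
    proportion of voxels with that value. ALE values are rounded to
    `digits` decimals so that equal scores share a bin.
    """
    null = {0.0: 1.0}
    for hist in histograms:
        updated = defaultdict(float)
        for ale, p_ale in null.items():
            for ma, p_ma in hist.items():
                new_ale = round(ale_union([ale, ma]), digits)
                # Independence between experiments is the null assumption.
                updated[new_ale] += p_ale * p_ma
        null = dict(updated)
    return null


def p_value(null, observed_ale, digits=6):
    """Probability under the null of an ALE score at least as large."""
    threshold = round(observed_ale, digits)
    return sum(p for ale, p in null.items() if ale >= threshold)


if __name__ == "__main__":
    # Two toy experiments: 90% of voxels have MA = 0, 10% have MA = 0.1.
    toy = [{0.0: 0.9, 0.1: 0.1}, {0.0: 0.9, 0.1: 0.1}]
    null = combine_null(toy)
    print(sorted(null.items()))
    print(p_value(null, 0.19))
```

The cost of this construction grows with the number of histogram bins and experiments rather than with the number of permutations, which is where a speed advantage over a permutation test would come from in this sketch.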
PMCID: PMC3254820  PMID: 21963913
fMRI; PET; permutation; inference; cluster-thresholding
22.  Exercise therapy for chronic low back pain: protocol for an individual participant data meta-analysis 
Systematic Reviews  2012;1:64.
Low back pain (LBP) is one of the leading causes of disability and has a major socioeconomic impact. Despite a large amount of research in the field, there remains uncertainty about the best treatment approach for chronic LBP, and identification of relevant patient subgroups is an important goal. Exercise therapy is a commonly used strategy to treat chronic low back pain and is one of several interventions that evidence suggests is moderately effective.
In parallel with an update of the 2005 Cochrane review, we will undertake an individual participant data (IPD) meta-analysis, which will allow us to standardize analyses across studies and directly derive results, and to examine differential treatment effects across individuals to estimate how patients’ characteristics modify treatment benefit.
We will use standard systematic review methods advocated by the Cochrane Collaboration to identify relevant trials. We will include trials evaluating exercise therapy compared to any or no other interventions in adult non-specific chronic LBP. Our primary outcomes of interest include pain, functional status, and return-to-work/absenteeism. We will assess potential risk of bias for each study meeting selection criteria, using criteria and methods recommended by the Cochrane BRG.
The original individual participant data will be requested from the authors of selected trials having moderate to low risk of bias. We will test original data and compile a master dataset with information about each trial mapped on a pre-specified framework, including reported characteristics of the study sample, exercise therapy characteristics, individual patient characteristics at baseline and all follow-up periods, and the subgroups and treatment effect modifiers investigated. Our analyses will include descriptive, study-level meta-analysis and meta-regression analyses of the overall treatment effect, and individual-level IPD meta-analyses of treatment effect modification. IPD meta-analyses will be conducted using a one-step approach where the IPD from all studies are modeled simultaneously while accounting for the clustering of participants within studies.
We will analyze IPD across a large number of LBP trials. The resulting larger sample size and consistent presentation of data will allow additional analyses to explore patient-level heterogeneity in treatment outcomes and prognosis of chronic LBP.
PMCID: PMC3564764  PMID: 23259855
Low back pain; Exercise therapy; Meta-analysis; Systematic review
The methods to detect gene-gene interactions between variants in genome-wide association study (GWAS) datasets have not been well developed thus far. PLATO, the Platform for the Analysis, Translation and Organization of large-scale data, is a filter-based method bringing together many analytical methods simultaneously in an effort to solve this problem. PLATO filters a large, genomic dataset down to a subset of genetic variants, which may be useful for interaction analysis. As a precursor to the use of PLATO for the detection of gene-gene interactions, the implementation of a variety of single locus filters was completed and evaluated as a proof of concept. To streamline PLATO for efficient epistasis analysis, we determined which of 24 analytical filters produced redundant results. Using a kappa score to identify agreement between filters, we grouped the analytical filters into 4 filter classes; thus all further analyses employed four filters. We then tested the MAX statistic put forth by Sladek et al. [1] in simulated data exploring a number of genetic models of modest effect size. To find the MAX statistic, the four filters were run on each SNP in each dataset and the smallest p-value among the four results was taken as the final result. Permutation testing was performed to empirically determine the p-value. The power of the MAX statistic to detect each of the simulated effects was determined in addition to the Type 1 error and false positive rates. The results of this simulation study demonstrate that PLATO using the four filters incorporating the MAX statistic has higher power on average to find multiple types of effects and a lower false positive rate than any of the individual filters alone. In the future we will extend PLATO with the MAX statistic to interaction analyses for large-scale genomic datasets.
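The MAX statistic and its permutation p-value, as described above, can be sketched as follows. This is a toy illustration and not PLATO's implementation: the three filters are simple stand-ins (additive, dominant, and recessive genotype codings compared between cases and controls with a normal approximation) for the four filter classes, and all function names and the example data are hypothetical.

```python
import math
import random


def mean_difference_p(values, phenotypes):
    """Two-sided p-value (normal approximation) for a case/control mean difference."""
    cases = [v for v, y in zip(values, phenotypes) if y == 1]
    controls = [v for v, y in zip(values, phenotypes) if y == 0]
    if len(cases) < 2 or len(controls) < 2:
        return 1.0

    def mean(xs):
        return sum(xs) / len(xs)

    def variance(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    diff = mean(cases) - mean(controls)
    se = math.sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    if se == 0.0:
        # No within-group variation: identical groups or perfect separation.
        return 1.0 if diff == 0.0 else 0.0
    return math.erfc(abs(diff / se) / math.sqrt(2.0))


# Toy single-locus filters; genotypes are coded as 0, 1, or 2 minor alleles.
FILTERS = [
    lambda g, y: mean_difference_p(g, y),                         # additive
    lambda g, y: mean_difference_p([int(x > 0) for x in g], y),   # dominant
    lambda g, y: mean_difference_p([int(x == 2) for x in g], y),  # recessive
]


def max_statistic(genotypes, phenotypes, filters):
    """Smallest p-value across all filters for one SNP."""
    return min(f(genotypes, phenotypes) for f in filters)


def permutation_p_value(genotypes, phenotypes, filters, n_permutations=1000, seed=0):
    """Empirical p-value of the MAX statistic under shuffled phenotype labels."""
    rng = random.Random(seed)
    observed = max_statistic(genotypes, phenotypes, filters)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if max_statistic(genotypes, shuffled, filters) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    genotypes = [2] * 20 + [1] * 10 + [0] * 20 + [1] * 10
    phenotypes = [1] * 30 + [0] * 30
    print(permutation_p_value(genotypes, phenotypes, FILTERS, n_permutations=200))
```

Taking the minimum over several filters inflates the chance of a small p-value, which is why the permutation step recomputes the minimum for every shuffle rather than reusing any single filter's nominal p-value.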
PMCID: PMC2903053  PMID: 19908384
24.  Community-Based Multidisciplinary Care for Patients With Stable Chronic Obstructive Pulmonary Disease (COPD) 
Executive Summary
In July 2010, the Medical Advisory Secretariat (MAS) began work on a Chronic Obstructive Pulmonary Disease (COPD) evidentiary framework, an evidence-based review of the literature surrounding treatment strategies for patients with COPD. This project emerged from a request by the Health System Strategy Division of the Ministry of Health and Long-Term Care that MAS provide them with an evidentiary platform on the effectiveness and cost-effectiveness of COPD interventions.
After an initial review of health technology assessments and systematic reviews of COPD literature, and consultation with experts, MAS identified the following topics for analysis: vaccinations (influenza and pneumococcal), smoking cessation, multidisciplinary care, pulmonary rehabilitation, long-term oxygen therapy, noninvasive positive pressure ventilation for acute and chronic respiratory failure, hospital-at-home for acute exacerbations of COPD, and telehealth (including telemonitoring and telephone support). Evidence-based analyses were prepared for each of these topics. For each technology, an economic analysis was also completed where appropriate. In addition, a review of the qualitative literature on patient, caregiver, and provider perspectives on living and dying with COPD was conducted, as were reviews of the qualitative literature on each of the technologies included in these analyses.
The Chronic Obstructive Pulmonary Disease Mega-Analysis series is made up of the following reports, which can be publicly accessed at the MAS website at:
Chronic Obstructive Pulmonary Disease (COPD) Evidentiary Framework
Influenza and Pneumococcal Vaccinations for Patients With Chronic Obstructive Pulmonary Disease (COPD): An Evidence-Based Analysis
Smoking Cessation for Patients With Chronic Obstructive Pulmonary Disease (COPD): An Evidence-Based Analysis
Community-Based Multidisciplinary Care for Patients With Stable Chronic Obstructive Pulmonary Disease (COPD): An Evidence-Based Analysis
Pulmonary Rehabilitation for Patients With Chronic Obstructive Pulmonary Disease (COPD): An Evidence-Based Analysis
Long-term Oxygen Therapy for Patients With Chronic Obstructive Pulmonary Disease (COPD): An Evidence-Based Analysis
Noninvasive Positive Pressure Ventilation for Acute Respiratory Failure Patients With Chronic Obstructive Pulmonary Disease (COPD): An Evidence-Based Analysis
Noninvasive Positive Pressure Ventilation for Chronic Respiratory Failure Patients With Stable Chronic Obstructive Pulmonary Disease (COPD): An Evidence-Based Analysis
Hospital-at-Home Programs for Patients With Acute Exacerbations of Chronic Obstructive Pulmonary Disease (COPD): An Evidence-Based Analysis
Home Telehealth for Patients With Chronic Obstructive Pulmonary Disease (COPD): An Evidence-Based Analysis
Cost-Effectiveness of Interventions for Chronic Obstructive Pulmonary Disease Using an Ontario Policy Model
Experiences of Living and Dying With COPD: A Systematic Review and Synthesis of the Qualitative Empirical Literature
For more information on the qualitative review, please contact Mita Giacomini at:
For more information on the economic analysis, please visit the PATH website:
The Toronto Health Economics and Technology Assessment (THETA) collaborative has produced an associated report on patient preference for mechanical ventilation. For more information, please visit the THETA website:
The objective of this evidence-based analysis was to determine the effectiveness and cost-effectiveness of multidisciplinary care (MDC) compared with usual care (UC, single health care provider) for the treatment of stable chronic obstructive pulmonary disease (COPD).
Clinical Need: Condition and Target Population
Chronic obstructive pulmonary disease is a progressive disorder with episodes of acute exacerbations associated with significant morbidity and mortality. Cigarette smoking is linked causally to COPD in more than 80% of cases. Chronic obstructive pulmonary disease is among the most common chronic diseases worldwide and has an enormous impact on individuals, families, and societies through reduced quality of life and increased health resource utilization and mortality.
The estimated prevalence of COPD in Ontario in 2007 was 708,743 persons.
Multidisciplinary care involves professionals from a range of disciplines, working together to deliver comprehensive care that addresses as many of the patient’s health care and psychosocial needs as possible.
Two variables are inherent in the concept of a multidisciplinary team: i) the multidisciplinary components, such as an enriched knowledge base and a range of clinical skills and experiences, and ii) the team components, which include, but are not limited to, communication and support measures. However, the most effective number of team members and the disciplines that should make up the team for optimal effect are not yet known.
Research Question
What is the effectiveness and cost-effectiveness of MDC compared with UC (single health care provider) for the treatment of stable COPD?
Research Methods
Literature Search
Search Strategy
A literature search was performed on July 19, 2010 using OVID MEDLINE, OVID MEDLINE In-Process and Other Non-Indexed Citations, OVID EMBASE, EBSCO Cumulative Index to Nursing & Allied Health Literature (CINAHL), the Wiley Cochrane Library, and the Centre for Reviews and Dissemination database, for studies published from January 1, 1995 until July 2010. Abstracts were reviewed by a single reviewer and, for those studies meeting the eligibility criteria, full-text articles were obtained. Reference lists were also examined for any additional relevant studies not identified through the search.
Inclusion Criteria
health technology assessments, systematic reviews, or randomized controlled trials
studies published between January 1995 and July 2010
COPD study population
studies comparing MDC (2 or more health care disciplines participating in care) with UC (single health care provider)
Exclusion Criteria
grey literature
duplicate publications
non-English language publications
study population less than 18 years of age
Outcomes of Interest
hospital admissions
emergency department (ED) visits
health-related quality of life
lung function
Quality of Evidence
The quality of each included study was assessed, taking into consideration allocation concealment, randomization, blinding, power/sample size, withdrawals/dropouts, and intention-to-treat analyses.
The quality of the body of evidence was assessed as high, moderate, low, or very low according to the GRADE Working Group criteria. The following definitions of quality were used in grading the quality of the evidence:
Summary of Findings
Six randomized controlled trials were obtained from the literature search. Four of the 6 studies were completed in the United States. The sample size of the 6 studies ranged from 40 to 743 participants, and the mean age of the study samples ranged from 66 to 71 years. Only 2 studies characterized the study sample in terms of the Global Initiative for Chronic Obstructive Lung Disease (GOLD) COPD stage criteria, and in general the description of the study population in the other 4 studies was limited. The mean percent predicted forced expiratory volume in 1 second (% predicted FEV1) among study populations was between 32% and 59%. Using this criterion, 3 studies included persons with severe COPD and 2 with moderate COPD. Information was not available to classify the population in the sixth study.
Four studies had MDC treatment groups that included a physician. All studies except 1 reported a respiratory specialist (i.e., respiratory therapist, specialist nurse, or physician) as part of the multidisciplinary team. The UC group consisted of a single health care practitioner who may or may not have been a respiratory specialist.
A meta-analysis was completed for 5 of the 7 outcome measures of interest:
health-related quality of life,
lung function,
all-cause hospitalization,
COPD-specific hospitalization, and
mortality.
There was only 1 study contributing to the outcome of all-cause and COPD-specific ED visits, which precluded pooling data for these outcomes. Subgroup analyses were not completed, either because heterogeneity was not significant or because only a small number of studies were meta-analysed for the outcome.
Quality of Life
Three studies reported results of quality of life assessment based on the St. George’s Respiratory Questionnaire (SGRQ). A mean decrease in the SGRQ indicates an improvement in quality of life, while a mean increase indicates deterioration in quality of life. In all studies, the mean change score from baseline to the end time point in the MDC treatment group showed either an improvement or less deterioration compared with the control group. The mean difference in change scores between MDC and UC groups was statistically significant in all 3 studies. The pooled weighted mean difference in total SGRQ score was −4.05 (95% confidence interval [CI], −6.47 to −1.63; P = 0.001). The GRADE quality of evidence was assessed as low for this outcome.
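A reported confidence interval and P value can be cross-checked against each other. The sketch below is an illustration rather than the method used in the report: it assumes a normal approximation, recovers the standard error from the width of the 95% interval, and computes a two-sided P value. It takes the pooled SGRQ estimate as −4.05 with bounds −6.47 and −1.63 (both negative, as a P value of 0.001 requires); the function name is hypothetical.

```python
import math


def p_value_from_ci(estimate, lower, upper, z_crit=1.96):
    """Two-sided P value implied by a point estimate and its 95% CI.

    Assumes the estimate is approximately normally distributed, so the
    interval equals estimate +/- z_crit * standard error.
    """
    se = (upper - lower) / (2.0 * z_crit)  # standard error from the CI width
    z = estimate / se                      # test statistic against a null of 0
    return math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area


# Pooled weighted mean difference in total SGRQ score (MDC vs UC)
p_sgrq = p_value_from_ci(-4.05, -6.47, -1.63)
print(round(p_sgrq, 3))
```

Under these assumptions the implied P value rounds to the reported 0.001.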
Lung Function
Two studies reported results of the FEV1 % predicted as a measure of lung function. A negative change from baseline indicates deterioration in lung function and a positive change from baseline indicates an improvement in lung function. The MDC group showed a statistically significant improvement in lung function up to 12 months compared with the UC group (P = 0.01). However, this effect is not maintained at 2-year follow-up (P = 0.24). The pooled weighted mean difference in FEV1 percent predicted was 2.78 (95% CI, −1.82 to 7.37). The GRADE quality of evidence was assessed as very low for this outcome, indicating that an estimate of effect is uncertain.
Hospital Admissions
Four studies reported results of all-cause hospital admissions in terms of number of persons with at least 1 admission during the follow-up period. Estimates from these 4 studies were pooled to determine a summary estimate. There is a statistically significant 25% relative risk (RR) reduction in all-cause hospitalizations in the MDC group compared with the UC group (P < 0.001). The index of heterogeneity (I²) value is 0%, indicating no statistical heterogeneity between studies. The GRADE quality of evidence was assessed as moderate for this outcome, indicating that further research may change the estimate of effect.
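For reference, the index of heterogeneity is commonly derived from Cochran's Q statistic as I² = max(0, (Q − df)/Q) × 100%, where df is the number of studies minus 1. The sketch below assumes inverse-variance weights on log relative risks. The study-level inputs are made-up placeholders, not data from the included trials, and the function names are hypothetical.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(q, n_studies):
    """I² as a percentage, truncated at 0 when Q does not exceed its df."""
    df = n_studies - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical log relative risks and their variances for 4 studies
log_rr = [-0.30, -0.25, -0.35, -0.28]
var = [0.04, 0.05, 0.06, 0.03]

q = cochran_q(log_rr, var)
i2 = i_squared(q, len(log_rr))
print(f"Q = {q:.3f}, I² = {i2:.0f}%")
```

When the study estimates are as similar as in this placeholder input, Q falls below its degrees of freedom and I² is truncated to 0%.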
COPD-Specific Hospitalization
Three studies reported results of COPD-specific hospital admissions in terms of number of persons with at least 1 admission during the follow-up period. Estimates from these 3 studies were pooled to determine a summary estimate. There is a statistically significant 33% RR reduction in COPD-specific hospitalizations in the MDC group compared with the UC group (P = 0.002). The I² value is 0%, indicating no statistical heterogeneity between studies. The GRADE quality of evidence was assessed as moderate for this outcome, indicating that further research may change the estimate of effect.
Emergency Department Visits
Two studies reported results of all-cause ED visits in terms of number of persons with at least 1 visit during the follow-up period. There is a statistically nonsignificant reduction in all-cause ED visits when data from these 2 studies are pooled (RR, 0.64; 95% CI, 0.31 to 1.33; P = 0.24). The GRADE quality of evidence was assessed as very low for this outcome, indicating that an estimate of effect is uncertain.
One study reported results of COPD-specific ED visits in terms of number of persons with at least 1 visit during the follow-up period. There is a statistically significant 41% reduction in COPD-specific ED visits in this study (RR, 0.59; 95% CI, 0.43 to 0.81; P < 0.001). The GRADE quality of evidence was assessed as moderate for this outcome.
Mortality
Three studies reported mortality during the study follow-up period. Estimates from these 3 studies were pooled to determine a summary estimate. There is a statistically nonsignificant reduction in mortality between treatment groups (RR, 0.81; 95% CI, 0.52 to 1.27; P = 0.36). The I² value is 19%, indicating low statistical heterogeneity between studies. All studies had a 12-month follow-up period. The GRADE quality of evidence was assessed as low for this outcome.
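The relative risks above map directly onto the quoted percentage reductions and significance statements: the relative risk reduction is 1 − RR, and a result is statistically significant at the 5% level when the 95% CI excludes the null value of 1. A minimal sketch using the COPD-specific ED visit and mortality estimates as reported (the function names are hypothetical):

```python
def relative_risk_reduction(rr):
    """Relative risk reduction, as a percentage, implied by a relative risk."""
    return (1.0 - rr) * 100.0


def ci_excludes_null(lower, upper, null_value=1.0):
    """True when the confidence interval does not contain the null value."""
    return not (lower <= null_value <= upper)


# (RR, lower 95% bound, upper 95% bound) as reported in the summary
results = {
    "COPD-specific ED visits": (0.59, 0.43, 0.81),
    "Mortality": (0.81, 0.52, 1.27),
}

summary = {
    name: (round(relative_risk_reduction(rr)), ci_excludes_null(lo, hi))
    for name, (rr, lo, hi) in results.items()
}

for name, (rrr, significant) in summary.items():
    label = "significant" if significant else "nonsignificant"
    print(f"{name}: {rrr}% reduction, {label}")
```

This reproduces the 41% significant reduction for COPD-specific ED visits and shows why the 19% point reduction in mortality is reported as nonsignificant: its interval spans 1.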
Significant effect estimates with moderate quality of evidence were found for all-cause hospitalization, COPD-specific hospitalization, and COPD-specific ED visits (Table ES1). A significant estimate with low quality evidence was found for the outcome of quality of life (Table ES2). All other outcome measures were nonsignificant and supported by low or very low quality of evidence.
Summary of Dichotomous Data
Abbreviations: CI, confidence intervals; COPD, chronic obstructive pulmonary disease; n, number.
Summary of Continuous Data
Abbreviations: CI, confidence intervals; FEV1, forced expiratory volume in 1 second; n, number; SGRQ, St. George’s Respiratory Questionnaire.
PMCID: PMC3384374  PMID: 23074433
25.  Methodological characteristics and treatment effect sizes in oral health randomised controlled trials: Is there a relationship? Protocol for a meta-epidemiological study 
BMJ Open  2014;4(2):e004527.
It is fundamental that randomised controlled trials (RCTs) are properly conducted in order to reach well-supported conclusions. However, there is emerging evidence that RCTs are subject to biases which can overestimate or underestimate the true treatment effect, due to flaws in the study design characteristics of such trials. The extent to which this holds true in oral health RCTs, which have some unique design characteristics compared to RCTs in other health fields, is unclear. As such, we aim to examine the empirical evidence quantifying the extent of bias associated with methodological and non-methodological characteristics in oral health RCTs.
Methods and analysis
We plan to perform a meta-epidemiological study, for which a sample of 60 meta-analyses (MAs) including approximately 600 RCTs will be selected. The MAs will be randomly obtained from the Oral Health Database of Systematic Reviews using a random number table, and will be considered for inclusion if they include a minimum of five RCTs and examine a therapeutic intervention related to one of the recognised dental specialties. RCTs identified in selected MAs will be subsequently included if their study design includes a comparison between an intervention group and a placebo group or another intervention group. Data on a number of methodological and non-methodological characteristics will be extracted from the selected trials included in the MAs. Moreover, the risk of bias will be assessed using the Cochrane Risk of Bias tool. Effect size estimates and measures of variability for the main outcome will be extracted from each RCT included in selected MAs, and a two-level analysis will be conducted using a meta-meta-analytic approach with a random effects model to allow for intra-MA and inter-MA heterogeneity.
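The protocol does not name a specific random effects estimator. As one common choice, the sketch below uses the DerSimonian-Laird method to pool effect estimates while allowing for between-study heterogeneity; in a two-level design, the between-MA level could pool MA-level estimates in the same way. The input values are illustrative placeholders and the function name is hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random effects pooling with the DerSimonian-Laird tau² estimator.

    Returns (pooled estimate, standard error, tau²).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at 0 (and 0 for a single study)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Placeholder effect estimates (e.g., log odds ratios) and their variances
pooled, se, tau2 = dersimonian_laird([0.10, 0.35, -0.05, 0.22],
                                     [0.02, 0.03, 0.025, 0.04])
print(f"pooled = {pooled:.3f}, SE = {se:.3f}, tau² = {tau2:.3f}")
```

When tau² is estimated as 0, the random effects weights reduce to the fixed-effect weights and the two models give the same pooled estimate.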
Ethics and dissemination
The intended audiences of the findings will include dental clinicians, oral health researchers, policymakers and graduate students. These audiences will be introduced to the findings through workshops, seminars, round table discussions and targeted individual meetings. Other opportunities for knowledge transfer, such as key dental conferences, will also be pursued. Finally, the results will be published as a scientific report in a peer-reviewed dental journal.
PMCID: PMC3939646  PMID: 24568962
Oral & Maxillofacial Surgery; Oral Medicine
