1.  Is the Timed Up and Go test a useful predictor of risk of falls in community dwelling older adults: a systematic review and meta-analysis
BMC Geriatrics  2014;14:14.
Background
The Timed Up and Go test (TUG) is a commonly used screening tool to assist clinicians to identify patients at risk of falling. The purpose of this systematic review and meta-analysis is to determine the overall predictive value of the TUG in community-dwelling older adults.
Methods
A literature search was performed to identify all studies that validated the TUG test. The methodological quality of the selected studies was assessed using the QUADAS-2 tool, a validated tool for the quality assessment of diagnostic accuracy studies. A TUG score of ≥13.5 seconds was used to identify individuals at higher risk of falling. All included studies were combined using a bivariate random effects model to generate pooled estimates of sensitivity and specificity at ≥13.5 seconds. Heterogeneity was assessed using the variance of logit transformed sensitivity and specificity.
Results
Twenty-five studies were included in the systematic review and 10 studies were included in meta-analysis. The TUG test was found to be more useful at ruling in rather than ruling out falls in individuals classified as high risk (>13.5 sec), with a higher pooled specificity (0.74, 95% CI 0.52-0.88) than sensitivity (0.31, 95% CI 0.13-0.57). Logistic regression analysis indicated that the TUG score is not a significant predictor of falls (OR = 1.01, 95% CI 1.00-1.02, p = 0.05).
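For orientation, the likelihood ratios implied by these pooled estimates can be computed directly (an illustrative calculation; these figures are not reported in the abstract itself):

\[ LR^{+} = \frac{\text{sensitivity}}{1-\text{specificity}} = \frac{0.31}{1-0.74} \approx 1.2, \qquad LR^{-} = \frac{1-\text{sensitivity}}{\text{specificity}} = \frac{1-0.31}{0.74} \approx 0.93. \]

Both ratios lie close to 1, consistent with the conclusion that a TUG result shifts the probability of falling only modestly in either direction.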
Conclusion
The Timed Up and Go test has limited ability to predict falls in community dwelling elderly and should not be used in isolation to identify individuals at high risk of falls in this setting.
doi:10.1186/1471-2318-14-14
PMCID: PMC3924230  PMID: 24484314
2.  Predicting streptococcal pharyngitis in adults in primary care: a systematic review of the diagnostic accuracy of symptoms and signs and validation of the Centor score 
BMC Medicine  2011;9:67.
Background
Stratifying patients with a sore throat into the probability of having an underlying bacterial or viral cause may be helpful in targeting antibiotic treatment. We sought to assess the diagnostic accuracy of signs and symptoms and validate a clinical prediction rule (CPR), the Centor score, for predicting group A β-haemolytic streptococcal (GABHS) pharyngitis in adults (> 14 years of age) presenting with sore throat symptoms.
Methods
A systematic literature search was performed up to July 2010. Studies that assessed the diagnostic accuracy of signs and symptoms and/or validated the Centor score were included. For the analysis of the diagnostic accuracy of signs and symptoms and the Centor score, studies were combined using a bivariate random effects model, while for the calibration analysis of the Centor score, a random effects model was used.
Results
A total of 21 studies incorporating 4,839 patients were included in the meta-analysis on diagnostic accuracy of signs and symptoms. The results were heterogeneous and suggest that individual signs and symptoms generate only small shifts in post-test probability (range positive likelihood ratio (+LR) 1.45-2.33, -LR 0.54-0.72). As a decision rule for considering antibiotic prescribing (score ≥ 3), the Centor score has reasonable specificity (0.82, 95% CI 0.72 to 0.88) and a post-test probability of 12% to 40% based on a prior prevalence of 5% to 20%. Pooled calibration shows no significant difference between the numbers of patients predicted and observed to have GABHS pharyngitis across strata of Centor score (0-1 risk ratio (RR) 0.72, 95% CI 0.49 to 1.06; 2-3 RR 0.93, 95% CI 0.73 to 1.17; 4 RR 1.14, 95% CI 0.95 to 1.37).
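As a worked illustration of the post-test probabilities quoted above (the likelihood ratio here is back-calculated from the reported figures, not stated in the abstract): at a pre-test prevalence of 20%, the pre-test odds are 0.20/0.80 = 0.25, and a post-test probability of 40% corresponds to post-test odds of 0.40/0.60 ≈ 0.67, implying

\[ LR^{+} \approx \frac{\text{post-test odds}}{\text{pre-test odds}} = \frac{0.67}{0.25} \approx 2.7 \]

for a Centor score of ≥ 3; the same calculation at a 5% prevalence (pre-test odds ≈ 0.053, post-test probability 12%) gives roughly 2.6.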
Conclusions
Individual signs and symptoms are not powerful enough to discriminate GABHS pharyngitis from other types of sore throat. The Centor score is a well calibrated CPR for estimating the probability of GABHS pharyngitis. The Centor score can enhance appropriate prescribing of antibiotics, but should be used with caution in low prevalence settings of GABHS pharyngitis such as primary care.
doi:10.1186/1741-7015-9-67
PMCID: PMC3127779  PMID: 21631919
3.  Prevention of Falls and Fall-Related Injuries in Community-Dwelling Seniors 
Executive Summary
In early August 2007, the Medical Advisory Secretariat began work on the Aging in the Community project, an evidence-based review of the literature surrounding healthy aging in the community. The Health System Strategy Division at the Ministry of Health and Long-Term Care subsequently asked the secretariat to provide an evidentiary platform for the ministry’s newly released Aging at Home Strategy.
After a broad literature review and consultation with experts, the secretariat identified 4 key areas that strongly predict an elderly person’s transition from independent community living to a long-term care home. Evidence-based analyses have been prepared for each of these 4 areas: falls and fall-related injuries, urinary incontinence, dementia, and social isolation. For the first area, falls and fall-related injuries, an economic model is described in a separate report.
Please visit the Medical Advisory Secretariat Web site, http://www.health.gov.on.ca/english/providers/program/mas/mas_about.html, to review these titles within the Aging in the Community series.
Aging in the Community: Summary of Evidence-Based Analyses
Prevention of Falls and Fall-Related Injuries in Community-Dwelling Seniors: An Evidence-Based Analysis
Behavioural Interventions for Urinary Incontinence in Community-Dwelling Seniors: An Evidence-Based Analysis
Caregiver- and Patient-Directed Interventions for Dementia: An Evidence-Based Analysis
Social Isolation in Community-Dwelling Seniors: An Evidence-Based Analysis
The Falls/Fractures Economic Model in Ontario Residents Aged 65 Years and Over (FEMOR)
Objective
To identify interventions that may be effective in reducing the probability of an elderly person’s falling and/or sustaining a fall-related injury.
Background
Although estimates of fall rates vary widely based on the location, age, and living arrangements of the elderly population, it is estimated that each year approximately 30% of community-dwelling individuals aged 65 and older, and 50% of those aged 85 and older will fall. Of those individuals who fall, 12% to 42% will have a fall-related injury.
Several meta-analyses and cohort studies have identified falls and fall-related injuries as a strong predictor of admission to a long-term care (LTC) home. It has been shown that the risk of LTC home admission is over 5 times higher in seniors who experienced 2 or more falls without injury, and over 10 times higher in seniors who experienced a fall causing serious injury.
Falls result from the interaction of a variety of risk factors that can be both intrinsic and extrinsic. Intrinsic factors are those that pertain to the physical, demographic, and health status of the individual, while extrinsic factors relate to the physical and socio-economic environment. Intrinsic risk factors can be further grouped into psychosocial/demographic risks, medical risks, risks associated with activity level and dependence, and medication risks. Commonly described extrinsic risks are tripping hazards, balance and slip hazards, and vision hazards.
Note: It is recognized that the terms “senior” and “elderly” carry a range of meanings for different audiences; this report generally uses the former, but the terms are treated here as essentially interchangeable.
Evidence-Based Analysis of Effectiveness
Research Question
Since many risk factors for falls are modifiable, what interventions (devices, systems, programs) exist that reduce the risk of falls and/or fall-related injuries for community-dwelling seniors?
Inclusion and Exclusion Criteria
Inclusion Criteria
English language;
published between January 2000 and September 2007;
population of community-dwelling seniors (majority aged 65+); and
randomized controlled trials (RCTs), quasi-experimental trials, systematic reviews, or meta-analyses.
Exclusion Criteria
special populations (e.g., stroke or osteoporosis; however, studies restricted only to women were included);
studies only reporting surrogate outcomes; or
studies whose outcome cannot be extracted for meta-analysis.
Outcomes of Interest
number of fallers, and
number of falls resulting in injury/fracture.
Search Strategy
A search was performed in OVID MEDLINE, MEDLINE In-Process and Other Non-Indexed Citations, EMBASE, the Cumulative Index to Nursing & Allied Health Literature (CINAHL), The Cochrane Library, and the International Agency for Health Technology Assessment (INAHTA) for studies published between January 2000 and September 2007. Furthermore, all studies included in a 2003 Cochrane review were considered for inclusion in this analysis. Abstracts were reviewed by a single author, and studies meeting the inclusion criteria outlined above were obtained. Studies were grouped based on intervention type, and data on population characteristics, fall outcomes, and study design were extracted. Reference lists were also checked for relevant studies. The quality of the evidence was assessed as high, moderate, low, or very low according to the GRADE methodology.
Summary of Findings
The following 11 interventions were identified in the literature search: exercise programs, vision assessment and referral, cataract surgery, environmental modifications, vitamin D supplementation, vitamin D plus calcium supplementation, hormone replacement therapy (HRT), medication withdrawal, gait-stabilizing devices, hip protectors, and multifactorial interventions.
Exercise programs were stratified into targeted programs where the exercise routine was tailored to the individuals’ needs, and untargeted programs that were identical among subjects. Furthermore, analyses were stratified by exercise program duration (<6 months and ≥6 months) and fall risk of study participants. Similarly, the analyses on the environmental modification studies were stratified by risk. Low-risk study participants had had no fall in the year prior to study entry, while high-risk participants had had at least one fall in the previous year.
A total of 17 studies investigating multifactorial interventions were identified in the literature search. Of these studies, 10 reported results for a high-risk population with previous falls, while 6 reported results for study participants representative of the general population. One study provided stratified results by fall risk, and therefore results from this study were included in each stratified analysis.
Summary of Meta-Analyses of Studies Investigating the Effectiveness of Interventions on the Risk of Falls in Community-Dwelling Seniors* [summary table not reproduced here; *CI, confidence interval; RR, relative risk; a hazard ratio is reported where RR was not available]
Summary of Meta-Analyses of Studies Investigating the Effectiveness of Interventions on the Risk of Fall-Related Injuries in Community-Dwelling Seniors* [summary table not reproduced here; *CI, confidence interval; RR, relative risk; an odds ratio is reported where RR was not available]
Conclusions
High-quality evidence indicates that long-term exercise programs in mobile seniors and environmental modifications in the homes of frail elderly persons will effectively reduce falls and possibly fall-related injuries in Ontario’s elderly population.
A combination of vitamin D and calcium supplementation in elderly women will help reduce the risk of falls by more than 40%.
The use of outdoor gait-stabilizing devices for mobile seniors during the winter in Ontario may reduce falls and fall-related injuries; however, evidence is limited and more research is required in this area.
While psychotropic medication withdrawal may be an effective method for reducing falls, evidence is limited and long-term compliance has been demonstrated to be difficult to achieve.
Multifactorial interventions in high-risk populations may be effective; however, the effect is only marginally significant, and the quality of evidence is low.
PMCID: PMC3377567  PMID: 23074507
4.  How well do clinical prediction rules perform in identifying serious infections in acutely ill children across an international network of ambulatory care datasets? 
BMC Medicine  2013;11:10.
Background
Diagnosing serious infections in children is challenging, because of the low incidence of such infections and their non-specific presentation early in the course of illness. Prediction rules are promoted as a means to improve recognition of serious infections. A recent systematic review identified seven clinical prediction rules, of which only one had been prospectively validated, calling into question their appropriateness for clinical practice. We aimed to examine the diagnostic accuracy of these rules in multiple ambulatory care populations in Europe.
Methods
Four clinical prediction rules and two national guidelines, based on signs and symptoms, were validated retrospectively in seven individual patient datasets from primary care and emergency departments, comprising 11,023 children from the UK, the Netherlands, and Belgium. The accuracy of each rule was tested, with pre-test and post-test probabilities displayed using dumbbell plots, and serious infection settings stratified as low prevalence (LP; <5%), intermediate prevalence (IP; 5 to 20%), and high prevalence (HP; >20%). In LP and IP settings, sensitivity should exceed 90% for a rule to be effective at ruling out serious infection.
Results
In LP settings, a five-stage decision tree and a pneumonia rule had sensitivities of >90% (at a negative likelihood ratio (NLR) of < 0.2) for ruling out serious infections, whereas the sensitivities of a meningitis rule and the Yale Observation Scale (YOS) varied widely, between 33 and 100%. In IP settings, the five-stage decision tree, the pneumonia rule, and YOS had sensitivities between 22 and 88%, with NLR ranging from 0.3 to 0.8. In an HP setting, the five-stage decision tree provided a sensitivity of 23%. In LP or IP settings, the sensitivities of the National Institute for Clinical Excellence guideline for feverish illness and the Dutch College of General Practitioners alarm symptoms ranged from 81 to 100%.
Conclusions
None of the clinical prediction rules examined in this study provided perfect diagnostic accuracy. In LP or IP settings, prediction rules and evidence-based guidelines had high sensitivity, providing promising rule-out value for serious infections in these datasets, although all had a percentage of residual uncertainty. Additional clinical assessment or testing such as point-of-care laboratory tests may be needed to increase clinical certainty. None of the prediction rules identified seemed to be valuable for HP settings such as emergency departments.
doi:10.1186/1741-7015-11-10
PMCID: PMC3566974  PMID: 23320738
clinical prediction rules; serious infection in children; external validation; NICE guidelines feverish illness; Yale Observation Scale; diagnostic accuracy
5.  Validity of British Thoracic Society guidance (the CRB-65 rule) for predicting the severity of pneumonia in general practice: systematic review and meta-analysis 
The British Journal of General Practice  2010;60(579):e423-e433.
Background
The CRB-65 score is a clinical prediction rule that grades the severity of community-acquired pneumonia in terms of 30-day mortality.
Aim
The study sought to validate CRB-65 and assess its clinical value in community and hospital settings.
Design of study
Systematic review and meta-analysis of validation studies of CRB-65.
Method
Medline (1966 to June 2009), Embase (1988 to November 2008), British Nursing Index (BNI) and PsychINFO were searched, using a diagnostic accuracy search filter combined with subject-specific terms. The derived (index) rule was used as a predictive model and applied to all validation studies. Comparison was made between the observed and predicted number of deaths stratified by risk group (low, intermediate, and high) and setting of care (community or hospital). Pooled results are presented as risk ratios (RRs) in terms of over-prediction (RR>1) or under-prediction (RR<1) of 30-day mortality.
Results
Fourteen validation studies totalling 397 875 patients are included. CRB-65 performs well in hospitalised patients, particularly in those classified as intermediate (RR 0.91, 95% confidence interval [CI] = 0.71 to 1.17) or high risk (RR 1.01, 95% CI = 0.87 to 1.16). In community settings, CRB-65 over-predicts the probability of 30-day mortality across all strata of predicted risk, low (RR 9.41, 95% CI = 1.75 to 50.66), intermediate (RR 4.84, 95% CI = 2.61 to 8.69), and high (RR 1.58, 95% CI = 0.59 to 4.19).
Conclusion
CRB-65 performs well in stratifying severity of pneumonia and resultant 30-day mortality in hospital settings. In community settings, CRB-65 appears to over-predict the probability of 30-day mortality across all strata of predicted risk. Caution is needed when applying CRB-65 to patients in general practice.
doi:10.3399/bjgp10X532422
PMCID: PMC2944951  PMID: 20883616
general practice; meta-analysis; pneumonia; prognosis; severity of illness index
6.  The Alvarado score for predicting acute appendicitis: a systematic review 
BMC Medicine  2011;9:139.
Background
The Alvarado score can be used to stratify patients with symptoms of suspected appendicitis; the validity of the score in certain patient groups and at different cut points is still unclear. The aim of this study was to assess the discrimination (diagnostic accuracy) and calibration performance of the Alvarado score.
Methods
A systematic search of validation studies in Medline, Embase, DARE and The Cochrane library was performed up to April 2011. We assessed the diagnostic accuracy of the score at the two cut-off points: score of 5 (1 to 4 vs. 5 to 10) and score of 7 (1 to 6 vs. 7 to 10). Calibration was analysed across low (1 to 4), intermediate (5 to 6) and high (7 to 10) risk strata. The analysis focused on three sub-groups: men, women and children.
Results
Forty-two studies were included in the review. In terms of diagnostic accuracy, the cut-point of 5 was good at 'ruling out' admission for appendicitis (sensitivity 99% overall, 96% men, 99% women, 99% children). At the cut-point of 7, recommended for 'ruling in' appendicitis and progression to surgery, the score performed poorly in each subgroup (specificity overall 81%, men 57%, women 73%, children 76%). The Alvarado score is well calibrated in men across all risk strata (low RR 1.06, 95% CI 0.87 to 1.28; intermediate 1.09, 0.86 to 1.37 and high 1.02, 0.97 to 1.08). The score over-predicts the probability of appendicitis in children in the intermediate and high risk groups and in women across all risk strata.
Conclusions
The Alvarado score is a useful diagnostic 'rule out' score at a cut point of 5 for all patient groups. The score is well calibrated in men, inconsistent in children and over-predicts the probability of appendicitis in women across all strata of risk.
doi:10.1186/1741-7015-9-139
PMCID: PMC3299622  PMID: 22204638
7.  The Diagnostic Accuracy of Serologic and Molecular Methods for Detecting Visceral Leishmaniasis in HIV Infected Patients: Meta-Analysis 
Background
Human visceral leishmaniasis (VL), a potentially fatal disease, has emerged as an important opportunistic condition in HIV infected patients. In immunocompromised patients, serological investigation is not considered an accurate diagnostic method for VL, and molecular techniques seem especially promising.
Objective
This work is a comprehensive systematic review and meta-analysis to evaluate the accuracy of serologic and molecular tests for VL diagnosis specifically in HIV-infected patients.
Methods
Two independent reviewers searched PubMed and LILACS databases. The quality of studies was assessed by QUADAS score. Sensitivity and specificity were pooled separately and compared with overall accuracy measures: diagnostic odds ratio (DOR) and symmetric summary receiver operating characteristic (sROC).
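For reference, the diagnostic odds ratio named here has the standard definition below (a reminder for readers, not a restatement from the paper); the sROC curve then summarises how sensitivity trades off against specificity across studies that used different positivity thresholds:

\[ DOR = \frac{\text{sensitivity}/(1-\text{sensitivity})}{(1-\text{specificity})/\text{specificity}} = \frac{LR^{+}}{LR^{-}}. \]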
Results
Thirty-three studies recruiting 1,489 patients were included. The following tests were evaluated: the immunofluorescence antibody test (IFAT), enzyme-linked immunosorbent assay (ELISA), immunoblotting (Blot), the direct agglutination test (DAT), and polymerase chain reaction (PCR) in whole blood and bone marrow. Most studies were carried out in Europe. Serological tests varied widely in performance, but with overall limited sensitivity. IFAT had poor sensitivity, ranging from 11% to 82%. DOR (95% confidence interval) was higher for DAT 36.01 (9.95–130.29) and Blot 27.51 (9.27–81.66) than for IFAT 7.43 (3.08–1791) and ELISA 3.06 (0.71–13.10). PCR in whole blood had the highest DOR: 400.35 (58.47–2741.42). The accuracy of PCR based on the Q-point was 0.95 (95% CI 0.92–0.97), indicating good overall performance.
Conclusion
Based mainly on evidence gained by infection with Leishmania infantum chagasi, serological tests should not be used to rule out a diagnosis of VL among the HIV-infected, but a positive test at even low titers has diagnostic value when combined with the clinical case definition. Considering the available evidence, tests based on DNA detection are highly sensitive and may contribute to a diagnostic workup.
Author Summary
Human visceral leishmaniasis (VL), a potentially fatal disease, has emerged as an important opportunistic condition in HIV infected patients. In immunocompromised patients, serological investigation is not considered an accurate diagnostic method for VL, and molecular techniques seem especially promising. Demonstration of Leishmania parasites in bone marrow aspirate or in other biologic specimens, either by visualization or culture, remains the most reliable diagnostic technique in the setting of HIV co-infection. However, these tests are difficult to perform in rural areas, and some of them are invasive and carry a risk of complications. This work is a systematic review to evaluate the accuracy of serologic and molecular tests for VL diagnosis in HIV-infected patients. Two reviewers searched the literature, evaluating the quality of studies and comparing the performance of diagnostic tests. Thirty-three studies were included. Most studies were carried out in Europe. Serological tests varied in performance, but with overall limited sensitivity. Based on the evidence, serological tests should not be used to rule out a diagnosis of VL among HIV-infected patients, but a positive test at even low titers has diagnostic value when combined with the clinical case definition. Tests based on DNA detection are highly sensitive and may contribute to a diagnostic workup.
doi:10.1371/journal.pntd.0001665
PMCID: PMC3362615  PMID: 22666514
8.  Stress Echocardiography for the Diagnosis of Coronary Artery Disease 
Executive Summary
In July 2009, the Medical Advisory Secretariat (MAS) began work on Non-Invasive Cardiac Imaging Technologies for the Diagnosis of Coronary Artery Disease (CAD), an evidence-based review of the literature surrounding different cardiac imaging modalities to ensure that appropriate technologies are accessed by patients suspected of having CAD. This project came about when the Health Services Branch at the Ministry of Health and Long-Term Care asked MAS to provide an evidentiary platform on effectiveness and cost-effectiveness of non-invasive cardiac imaging modalities.
After an initial review of the strategy and consultation with experts, MAS identified five key non-invasive cardiac imaging technologies for the diagnosis of CAD. Evidence-based analyses have been prepared for each of these five imaging modalities: cardiac magnetic resonance imaging, single photon emission computed tomography, 64-slice computed tomographic angiography, stress echocardiography, and stress echocardiography with contrast. For each technology, an economic analysis was also completed (where appropriate). A summary decision analytic model was then developed to encapsulate the data from each of these reports (available on the OHTAC and MAS website).
The Non-Invasive Cardiac Imaging Technologies for the Diagnosis of Coronary Artery Disease series is made up of the following reports, which can be publicly accessed at the MAS website at www.health.gov.on.ca/mas or at www.health.gov.on.ca/english/providers/program/mas/mas_about.html
Single Photon Emission Computed Tomography for the Diagnosis of Coronary Artery Disease: An Evidence-Based Analysis
Stress Echocardiography for the Diagnosis of Coronary Artery Disease: An Evidence-Based Analysis
Stress Echocardiography with Contrast for the Diagnosis of Coronary Artery Disease: An Evidence-Based Analysis
64-Slice Computed Tomographic Angiography for the Diagnosis of Coronary Artery Disease: An Evidence-Based Analysis
Cardiac Magnetic Resonance Imaging for the Diagnosis of Coronary Artery Disease: An Evidence-Based Analysis
Please note that two related evidence-based analyses of non-invasive cardiac imaging technologies for the assessment of myocardial viability are also available on the MAS website:
Positron Emission Tomography for the Assessment of Myocardial Viability: An Evidence-Based Analysis
Magnetic Resonance Imaging for the Assessment of Myocardial Viability: an Evidence-Based Analysis
The Toronto Health Economics and Technology Assessment Collaborative has also produced an associated economic report entitled:
The Relative Cost-effectiveness of Five Non-invasive Cardiac Imaging Technologies for Diagnosing Coronary Artery Disease in Ontario [Internet]. Available from: http://theta.utoronto.ca/reports/?id=7
Objective
The objective of the analysis is to determine the diagnostic accuracy of stress echocardiography (ECHO) in the diagnosis of patients with suspected coronary artery disease (CAD) compared to coronary angiography (CA).
Stress Echocardiography
Stress ECHO is a non-invasive technology that images the heart using ultrasound. It is one of the most commonly employed imaging techniques for investigating a variety of cardiac abnormalities in both community and hospital settings. A complete ECHO exam includes M-mode, 2-dimensional (2-D) images and Doppler imaging.
In order to diagnose CAD and assess whether myocardial ischemia is present, images obtained at rest are compared to those obtained during or immediately after stress. Stress is most commonly induced either by exercise or by pharmacological agents such as dobutamine and dipyridamole. The hallmark of stress-induced myocardial ischemia is worsening of existing wall motion abnormalities or the development of new wall motion abnormalities. A major challenge for stress ECHO is that the interpretation of wall motion contractility and function is subjective, which leads to inter-observer variability and reduced reproducibility. Further, it is estimated that approximately 30% of patients have sub-optimal stress ECHO exams. To overcome this limitation, contrast agents for LV opacification have been developed.
Although stress ECHO is a relatively easy to use technology that poses only a low risk of adverse events compared to other imaging technologies, it may potentially be overused and/or misused in CAD diagnosis. Several recent advances have been made focusing on quantitative methods for assessment, improved image quality and enhanced portability, however, evidence on the effectiveness and clinical utility of these enhancements is limited.
Evidence-Based Analysis
Research Questions
What is the diagnostic accuracy of stress ECHO for the diagnosis of patients with suspected CAD compared to the reference standard of CA?
What is the clinical utility of stress ECHO?
Literature Search
A literature search was performed on August 28, 2009 using OVID MEDLINE, MEDLINE In-Process and Other Non-Indexed Citations, EMBASE, the Cumulative Index to Nursing & Allied Health Literature (CINAHL), the Cochrane Library, and the International Agency for Health Technology Assessment (INAHTA) for studies published from January 1, 2004 until August 21, 2009. Abstracts were reviewed by a single reviewer and, for those studies meeting the eligibility criteria, full-text articles were obtained. Reference lists were also examined for any relevant studies not identified through the search.
Inclusion Criteria
Systematic reviews, meta-analyses, randomized controlled trials, prospective observational studies, retrospective analyses
Minimum sample size of 20 enrolled patients
Comparison to CA (reference standard)
Definition of CAD specified as either ≥50%, ≥70% or ≥75% coronary artery stenosis on CA
Reporting accuracy data on individual patients (rather than accuracy data stratified by segments of the heart)
English
Human
Exclusion Criteria
Duplicate studies
Non-systematic reviews, case reports
Grey literature (e.g., conference abstracts)
Insufficient data for independent calculation of sensitivity and specificity
Use of ECHO for purposes other than diagnosis of CAD (e.g., arrhythmia, valvular disease, mitral stenosis, pre-operative risk of MI)
Transesophageal ECHO since its primary use is for non-CAD indications such as endocarditis, intracardiac thrombi, valvular disorders
Only resting ECHO performed
Outcomes of Interest
Accuracy outcomes (sensitivity, specificity, positive predictive value, negative predictive value)
Costs
Summary of Findings
Given the vast amount of published literature on stress ECHO, it was decided to focus on the studies contained in the comprehensive 2007 review by Heijenbrok-Kal et al. (1) as a basis for the MAS evidence-based analysis. In applying our inclusion and exclusion criteria, 105 observational studies containing information on 13,035 patients were included. Six studies examined stress ECHO with adenosine, 26 with dipyridamole, and 77 with dobutamine, the latter being the most commonly used pharmacological stress ECHO agent in Ontario. A further 18 studies employed exercise as the stressor. The prevalence of CAD ranged from 19% to 94%, with a mean estimated prevalence of 70%. Based on the results of these studies, the following conclusions were made:
Based on the available evidence, stress ECHO is a useful imaging modality for the diagnosis of CAD in patients with suspected disease. The overall pooled sensitivity is 0.80 (95% CI: 0.77 – 0.82) and the pooled specificity is 0.84 (95% CI: 0.82 – 0.87) using CA as the reference standard. The AUC derived from the sROC curve is 0.895 and the DOR is 20.64.
For pharmacological stress, the pooled sensitivity is 0.79 (95% CI: 0.71 – 0.87) and the pooled specificity is 0.85 (95% CI: 0.83 – 0.88). When exercise is employed as the stress agent, the pooled sensitivity is 0.81 (95% CI: 0.76 – 0.86) and the pooled specificity is 0.79 (95% CI: 0.71 – 0.87). Although pharmacological stress and exercise stress would be indicated for different patient populations based on their ability to exercise, there were no significant differences in sensitivity and specificity.
According to clinical experts, the diagnostic accuracy of stress ECHO depends on the patient population, the expertise of the interpreter, and the quality of the image.
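As a rough consistency check on the overall pooled estimates reported above (an illustrative calculation, not part of the original report), a sensitivity of 0.80 and specificity of 0.84 imply a diagnostic odds ratio of

\[ DOR \approx \frac{0.80 \times 0.84}{(1-0.80)(1-0.84)} = \frac{0.672}{0.032} = 21, \]

in line with the reported value of 20.64; the small difference reflects rounding of the pooled estimates.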
PMCID: PMC3377563  PMID: 23074412
9.  Development of a Standardized Screening Rule for Tuberculosis in People Living with HIV in Resource-Constrained Settings: Individual Participant Data Meta-analysis of Observational Studies 
PLoS Medicine  2011;8(1):e1000391.
Haileyesus Getahun and colleagues report the development of a simple, standardized tuberculosis (TB) screening rule for resource-constrained settings, to identify people living with HIV who need further investigation for TB disease.
Background
The World Health Organization recommends the screening of all people living with HIV for tuberculosis (TB) disease, followed by TB treatment, or isoniazid preventive therapy (IPT) when TB is excluded. However, the difficulty of reliably excluding TB disease has severely limited TB screening and IPT uptake in resource-limited settings. We conducted an individual participant data meta-analysis of primary studies, aiming to identify a sensitive TB screening rule.
Methods and Findings
We identified 12 studies that had systematically collected sputum specimens regardless of signs or symptoms, at least one mycobacterial culture, clinical symptoms, and HIV and TB disease status. Bivariate random-effects meta-analysis and the hierarchical summary relative operating characteristic curves were used to evaluate the screening performance of all combinations of variables of interest. TB disease was diagnosed in 557 (5.8%) of 9,626 people living with HIV. The primary analysis included 8,148 people living with HIV who could be evaluated on five symptoms from nine of the 12 studies. The median age was 34 years. The best performing rule was the presence of any one of: current cough (any duration), fever, night sweats, or weight loss. The overall sensitivity of this rule was 78.9% (95% confidence interval [CI] 58.3%–90.9%) and specificity was 49.6% (95% CI 29.2%–70.1%). Its sensitivity increased to 90.1% (95% CI 76.3%–96.2%) among participants selected from clinical settings and to 88.0% (95% CI 76.1%–94.4%) among those who were not previously screened for TB. Negative predictive value was 97.7% (95% CI 97.4%–98.0%) and 90.0% (95% CI 88.6%–91.3%) at 5% and 20% prevalence of TB among people living with HIV, respectively. Abnormal chest radiographic findings increased the sensitivity of the rule by 11.7% (90.6% versus 78.9%) with a reduction of specificity by 10.7% (49.6% versus 38.9%).
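The reported negative predictive values follow directly from the pooled sensitivity and specificity; as a worked check (illustrative, not shown in the abstract), at a TB prevalence of 5%,

\[ NPV = \frac{Sp\,(1-p)}{Sp\,(1-p) + (1-Se)\,p} = \frac{0.496 \times 0.95}{0.496 \times 0.95 + (1-0.789) \times 0.05} \approx 0.978, \]

and the same formula at a prevalence of 20% gives approximately 0.90, matching the values quoted above.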
Conclusions
Absence of all of current cough, fever, night sweats, and weight loss can identify a subset of people living with HIV who have a very low probability of having TB disease. A simplified screening rule using any one of these symptoms can be used in resource-constrained settings to identify people living with HIV in need of further diagnostic assessment for TB. Use of this algorithm should result in earlier TB diagnosis and treatment, and should allow for substantial scale-up of IPT.
Please see later in the article for the Editors' Summary
Editors' Summary
Background
In 2009, 1.7 million people died from tuberculosis (TB)—equating to 4,700 deaths a day—including 380,000 people living with HIV. TB remains the most common cause of death in people living with HIV and compared to people without HIV, people living with HIV are more than 20 times more likely to develop TB. Furthermore, TB infection may occur at any stage of HIV disease and is often the initial presentation of underlying HIV infection. Without antiretroviral treatment, up to 50% of people living with HIV who are diagnosed with TB die during the 6–8 months of TB treatment.
Although antiretroviral treatment can reduce the incidence of TB both at the individual and population level, people living with HIV on antiretroviral treatment still have higher TB incidence rates and a higher risk of dying from TB. Therefore, the World Health Organization recommends regular screening for active TB disease in all people living with HIV, so those identified as having active TB disease can be provided with appropriate treatment, and isoniazid preventive therapy (to help mitigate TB morbidity, mortality, and transmission) can be given to vulnerable individuals who do not yet have active TB.
Why Was This Study Done?
There is currently no internationally accepted evidence-based tool to screen for TB in people living with HIV—a serious gap given that the presenting signs and symptoms of TB in people living with HIV are different from those in people without HIV. Therefore, the researchers aimed to develop a simple, standardized TB screening rule for resource-constrained settings, on the basis of the best available evidence that would adequately distinguish between people living with HIV who are very unlikely to have TB from those who require further investigation for TB disease.
What Did the Researchers Do and Find?
The researchers selected 12 studies that met their strict criteria, then asked the authors of these studies for primary data so that they could map individual-level data to identify five symptoms common to most studies. Using a statistical model, the researchers devised 23 screening rules derived from these five symptoms and used meta-analysis methods (bivariate random-effects meta-analysis) and the association of study-level and individual-level correlates (hierarchical summary relative operating characteristic curves) to evaluate the sensitivity and specificity of each tool used in each individual study.
The authors of the selected studies were able to provide data for 29,523 participants, of whom 10,057 were people living with HIV. The dataset included 9,626 people living with HIV who had TB screening and sputum culture performed, of which 8,148 individuals could be evaluated on the five symptoms of interest from nine of 12 studies. TB disease was diagnosed in 5.8% of people living with HIV and the best performing rule was the presence of any one of the following: current cough (any duration), fever, night sweats, or weight loss. The overall sensitivity of the rule was 78.9% and the specificity was 49.6%. However, the sensitivity of the rule increased to 90.1% among participants selected from clinical settings and to 88.0% among those who were not previously screened for TB.
What Do These Findings Mean?
The results of this study suggest that in resource-constrained settings, the absence of current cough, fever, night sweats, and weight loss (all inclusive) can identify those people living with HIV who have a low probability of having TB disease. Furthermore, any one of these symptoms can be used in resource-constrained settings to identify people living with HIV who are in need of further diagnostic assessment for TB.
Despite the limitations of the methodology used in this study, until there are evidence-based and internationally recommended guidelines for the diagnosis and treatment of TB in people living with HIV, use of the algorithm developed and presented in this study could result in earlier TB diagnosis and treatment for people living with HIV and could help to substantially scale-up isoniazid preventive therapy.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1000391.
The World Health Organization has information about TB in people living with HIV
The US Centers for Disease Control and Prevention also provide information about TB and HIV coinfection
The World Health Organization also has information about isoniazid preventative therapy
The Stop TB Partnership's TB/HIV Working Group provide information about TB and HIV co-infection
doi:10.1371/journal.pmed.1000391
PMCID: PMC3022524  PMID: 21267059
10.  Usefulness of DWI in preoperative assessment of deep myometrial invasion in patients with endometrial carcinoma: a systematic review and meta-analysis 
Cancer Imaging  2014;14(1):32.
Background
The objective of this study was to perform a systematic review and a meta-analysis in order to estimate the diagnostic accuracy of diffusion weighted imaging (DWI) in the preoperative assessment of deep myometrial invasion in patients with endometrial carcinoma.
Methods
Studies evaluating DWI for the detection of deep myometrial invasion in patients with endometrial carcinoma were systematically searched for in the MEDLINE, EMBASE, and Cochrane Library from January 1995 to January 2014. Methodologic quality was assessed by using the Quality Assessment of Diagnostic Accuracy Studies tool. Bivariate random-effects meta-analytic methods were used to obtain pooled estimates of sensitivity, specificity, diagnostic odds ratio (DOR) and receiver operating characteristic (ROC) curves. The study also evaluated the clinical utility of DWI in preoperative assessment of deep myometrial invasion.
Results
Seven studies enrolling a total of 320 individuals met the study inclusion criteria. The summary area under the ROC curve was 0.91. There was no evidence of publication bias (P = 0.90, bias coefficient analysis). Sensitivity and specificity of DWI for detection of deep myometrial invasion across all studies were 0.90 and 0.89, respectively. Positive and negative likelihood ratios with DWI were 8 and 0.11, respectively. In patients with high pre-test probabilities, DWI enabled confirmation of deep myometrial invasion; in patients with low pre-test probabilities, DWI enabled exclusion of deep myometrial invasion. In the worst-case scenario (pre-test probability of 50%), post-test probabilities were 89% and 10% for positive and negative DWI results, respectively.
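These worst-case post-test probabilities can be reproduced with the odds form of Bayes' theorem (a worked illustration using the likelihood ratios quoted above): at a pre-test probability of 50% the pre-test odds are 1, so

\[ P(\text{invasion} \mid \text{DWI positive}) = \frac{1 \times 8}{1 + 1 \times 8} \approx 0.89, \qquad P(\text{invasion} \mid \text{DWI negative}) = \frac{1 \times 0.11}{1 + 1 \times 0.11} \approx 0.10. \]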
Conclusion
DWI has high sensitivity and specificity for detecting deep myometrial invasion and, more importantly, can reliably rule out deep myometrial invasion. Therefore, it would be worthwhile to add a DWI sequence to standard MRI protocols in the preoperative evaluation of endometrial cancer in order to detect deep myometrial invasion, which, along with other poor prognostic factors such as age, tumor grade, and LVSI, would be useful in stratifying high-risk groups and would thereby help in tailoring the surgical approach in patients with low-risk endometrial carcinoma.
doi:10.1186/s40644-014-0032-y
PMCID: PMC4331837  PMID: 25608571
Diffusion-weighted imaging; Magnetic resonance imaging; Endometrial carcinoma; Myometrial invasion
11.  Instruments for assessing the risk of falls in acute hospitalized patients: a systematic review and meta-analysis 
Background
Falls are a serious problem for hospitalized patients, reducing the duration and quality of life. It is estimated that over 84% of all adverse events in hospitalized patients are related to falls. Some fall risk assessment tools have been developed and tested in environments other than those for which they were developed, with serious validity discrepancies. The aim of this review is to determine the accuracy of instruments for detecting fall risk and predicting falls in acute hospitalized patients.
Methods
Systematic review and meta-analysis. Main databases, related websites and grey literature were searched. Two blinded reviewers evaluated title and abstracts of the selected articles and, if they met inclusion criteria, methodological quality was assessed in a new blinded process. Meta-analyses of diagnostic ORs (DOR) and likelihood (LH) coefficients were performed with the random effects method. Forest plots were calculated for sensitivity and specificity, DOR and LH. Additionally, summary ROC (SROC) curves were calculated for every analysis.
Results
Fourteen studies were selected for the review. The meta-analysis was performed with the Morse (MFS), STRATIFY and Hendrich II Fall Risk Model scales. The STRATIFY tool provided greater diagnostic validity, with a DOR value of 7.64 (4.86 - 12.00). A meta-regression was performed to assess the effect of average patient age over 65 years and the performance or otherwise of risk reassessments during the patient’s stay. The reassessment showed a significant reduction in the DOR on the MFS (rDOR 0.75, 95% CI: 0.64 - 0.89, p = 0.017).
Conclusions
The STRATIFY scale was found to be the best tool for assessing the risk of falls in acutely ill hospitalized adults. However, the behaviour of these instruments varies considerably depending on the population and the environment, and so their performance should be tested prior to implementation. Further studies are needed to investigate the effect of reassessment with these instruments in hospitalized adult patients, and to assess actual compliance by healthcare personnel with procedures related to patient safety, in particular those concerning the prevention of falls.
doi:10.1186/1472-6963-13-122
PMCID: PMC3637640  PMID: 23547708
Accidental falls; Adverse events; Clinical safety; Risk assessment; Inpatients; Systematic review; Nursing assessment
12.  Clinical Usefulness of the Ottawa Ankle Rules for Detecting Fractures of the Ankle and Midfoot 
Journal of Athletic Training  2010;45(5):480-482.
Abstract
Reference:
Bachmann LM, Kolb E, Koller MT, Steurer J, ter Riet G. Accuracy of Ottawa Ankle Rules to exclude fractures of the ankle and mid-foot: systematic review. BMJ. 2003;326(7386):417–423.
Clinical Question:
What is the evidence for the accuracy of the Ottawa Ankle Rules as a decision aid for excluding fractures of the ankle and midfoot?
Data Sources:
Studies were identified by searching MEDLINE and PreMEDLINE (Ovid version: 1990 to present), EMBASE (Datastar version: 1990–2002), CINAHL (Winspires version: 1990–2002), the Cochrane Library (2002, issue 2), and the Science Citation Index database (Web of Science by Institute for Science Information). Reference lists of all included studies were also searched, and experts and authors in the specialty were contacted. The search had no language restrictions.
Study Selection:
Minimal inclusion criteria consisted of (1) study assessment of the Ottawa Ankle Rules and (2) sufficient information to construct a 2 × 2 contingency table specifying the false-positive and false-negative rates.
Data Extraction:
Studies were selected in a 2-stage process. First, all abstracts and titles found by the electronic searches were independently scrutinized by the same 2 authors. Second, copies of all eligible papers were obtained. A checklist was used to ensure that all inclusion criteria were met. Disagreements related to the eligibility of studies were resolved by consensus. Both authors extracted data from each included study independently. Methods of data collection, patient selection, blinding and prevention of verification bias, and description of the instrument and reference standard were assessed. Sensitivities (using the bootstrap method), specificities, negative likelihood ratios (using a random-effects model), and their standard errors were calculated. Special interest was paid to the pooled sensitivities and negative likelihood ratios because of the calibration of the Ottawa Ankle Rules toward a high sensitivity. Exclusion criteria for the pooled analysis were (1) studies that used a nonprospective data collection, (2) unknown radiologist blinding (verification bias), (3) studies assessing the performance of other specialists (nonphysicians) using the rules, and (4) studies that looked at modifications to the rules.
Main Results:
The search yielded 1085 studies, and the authors obtained complete articles for 116 of the studies. The reference lists from these studies provided an additional 15 studies. Only 32 of the studies met the inclusion criteria and were used for the review; 5 of these met the exclusion criteria. For included studies, the total population was 15 581 (range = 18–1032), and average age ranged from 11 to 31.1 years in those studies that reported age. The 27 studies analyzed (pooled) consisted of 12 studies of ankle assessment, 8 studies of midfoot assessment, 10 studies of both ankle and midfoot assessment, and 6 studies of ankle or midfoot assessment in children (not all studies assessed all regions). Pooled sensitivities, specificities, and negative likelihood ratios for the ankle, midfoot, and combined ankle and midfoot are presented in the Table. Based on a 15% prevalence of actual fracture in patients presenting acutely after ankle or foot trauma, a negative result on the Ottawa Ankle Rules corresponded to less than a 1.4% probability of fracture. Because limited analysis was conducted on the data from the children, we elected not to include this cohort in our review.
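The strength of that rule-out figure can be seen by back-calculating the implied negative likelihood ratio (an illustrative calculation; the pooled ratios themselves appear in the Table, which is not reproduced in this summary): a 15% pre-test probability gives pre-test odds of 0.15/0.85 ≈ 0.18, a 1.4% post-test probability gives post-test odds of about 0.014, and therefore

\[ LR^{-} \approx \frac{0.014}{0.18} \approx 0.08, \]

consistent with a rule calibrated toward very high sensitivity.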
Conclusions:
Evidence supports the use of the Ottawa Ankle Rules as an aid in ruling out fractures of the ankle and midfoot. The rules have a high sensitivity (almost 100%) and modest specificity. Use of the Ottawa Ankle Rules holds promise for saving time and reducing both costs and radiographic exposure without sacrificing diagnostic accuracy in ankle and midfoot fractures.
doi:10.4085/1062-6050-45.5.480
PMCID: PMC2938320  PMID: 20831394
radiography; clinical guidelines; lower extremity injuries; ankle sprains
13.  Accuracy and Quality of Clinical Decision Rules for Syncope in the Emergency Department: A Systematic Review and Meta-analysis 
Annals of emergency medicine  2010;56(4):362-373.e1.
Objectives
We assessed the methodological quality and prognostic accuracy of clinical decision rules (CDR) in emergency department (ED) syncope patients.
Methods
We searched 5 electronic databases, reviewed reference lists of included studies and contacted content experts to identify articles for review. Studies that derived or validated CDRs in ED syncope patients were included. Two reviewers independently screened records for relevance, selected studies for inclusion, assessed study quality and abstracted data. Random effects meta-analysis was used to pool diagnostic performance estimates across studies that derived or validated the same CDR. Between-study heterogeneity was assessed with the I-squared statistic (I2), and subgroup hypotheses were tested using a test of interaction.
Results
We identified 18 eligible studies. Deficiencies in outcome (blinding) and inter-rater reliability assessment were the most common methodological weaknesses. Meta-analysis was performed for the San Francisco Syncope Rule (SFSR) [sensitivity 86% (95%CI 83-89); specificity 49% (95%CI 48-51)] and the Osservatorio Epidemiologico sulla Sincope nel Lazio (OESIL) risk score [sensitivity 95% (95%CI 88-98); specificity 31% (95%CI 29-34)]. Subgroup analysis identified study design [prospective, diagnostic odds ratio (DOR) 8.82 (95%CI 3.5-22) vs. retrospective, DOR 2.45 (95%CI 0.96-6.21)] and ECG determination [by evaluating physician, DOR 25.5 (95%CI 4.41-148) vs. researcher or cardiologist, DOR 4 (95%CI 2.15-7.55)] as potential explanations for the variability in SFSR performance.
Conclusion
The methodological quality and prognostic accuracy of CDRs for syncope are limited. Differences in study design and ECG interpretation may account for the variable prognostic performance of the SFSR when validated in different practice settings.
doi:10.1016/j.annemergmed.2010.05.013
PMCID: PMC2946941  PMID: 20868906
syncope; clinical decision rules
14.  Clinical Utility of Serologic Testing for Celiac Disease in Ontario 
Executive Summary
Objective of Analysis
The objective of this evidence-based evaluation is to assess the accuracy of serologic tests in the diagnosis of celiac disease in subjects with symptoms consistent with this disease. Furthermore, the impact of these tests on the diagnostic pathway of the disease and on decision making was also evaluated.
Celiac Disease
Celiac disease is an autoimmune disease that develops in genetically predisposed individuals. The immunological response is triggered by ingestion of gluten, a protein that is present in wheat, rye, and barley. The treatment consists of strict lifelong adherence to a gluten-free diet (GFD).
Patients with celiac disease may present with a myriad of symptoms such as diarrhea, abdominal pain, weight loss, iron deficiency anemia, dermatitis herpetiformis, among others.
Serologic Testing in the Diagnosis of Celiac Disease
There are a number of serologic tests used in the diagnosis of celiac disease.
Anti-gliadin antibody (AGA)
Anti-endomysial antibody (EMA)
Anti-tissue transglutaminase antibody (tTG)
Anti-deamidated gliadin peptides antibodies (DGP)
Serologic tests are automated, with the exception of the EMA test, which is more time-consuming and operator-dependent than the other tests. For each serologic test, either immunoglobulin A (IgA) or immunoglobulin G (IgG) antibodies can be measured; however, IgA is the standard antibody measured in celiac disease.
Diagnosis of Celiac Disease
According to celiac disease guidelines, the diagnosis of celiac disease is established by small bowel biopsy. Serologic tests are used to initially detect and to support the diagnosis of celiac disease. A small bowel biopsy is indicated in individuals with a positive serologic test. In some cases an endoscopy and small bowel biopsy may be required even with a negative serologic test. The diagnosis of celiac disease must be performed on a gluten-containing diet since the small intestine abnormalities and the serologic antibody levels may resolve or improve on a GFD.
Since IgA measurement is the standard for the serologic celiac disease tests, false negatives may occur in IgA-deficient individuals.
Incidence and Prevalence of Celiac Disease
The incidence and prevalence of celiac disease in the general population and in subjects with symptoms consistent with or at higher risk of celiac disease based on systematic reviews published in 2004 and 2009 are summarized below.
Incidence of Celiac Disease in the General Population
Adults or mixed population: 1 to 17/100,000/year
Children: 2 to 51/100,000/year
In one of the studies, a stratified analysis showed that there was a higher incidence of celiac disease in younger children compared to older children, i.e., 51 cases/100,000/year in 0 to 2 year-olds, 33/100,000/year in 2 to 5 year-olds, and 10/100,000/year in children 5 to 15 years old.
Prevalence of Celiac Disease in the General Population
The prevalence of celiac disease reported in population-based studies identified in the 2004 systematic review varied between 0.14% and 1.87% (median: 0.47%, interquartile range: 0.25%, 0.71%). According to the authors of the review, the prevalence did not vary by age group, i.e., adults and children.
Prevalence of Celiac Disease in High Risk Subjects
Type 1 diabetes (adults and children): 1 to 11%
Autoimmune thyroid disease: 2.9 to 3.3%
First degree relatives of patients with celiac disease: 2 to 20%
Prevalence of Celiac Disease in Subjects with Symptoms Consistent with the Disease
The prevalence of celiac disease in subjects with symptoms consistent with the disease varied widely among studies, i.e., 1.5% to 50% in adult studies, and 1.1% to 17% in pediatric studies. Differences in prevalence may be related to the referral pattern as the authors of a systematic review noted that the prevalence tended to be higher in studies whose population originated from tertiary referral centres compared to general practice.
Research Questions
What are the sensitivity and specificity of serologic tests in the diagnosis of celiac disease?
What is the clinical validity of serologic tests in the diagnosis of celiac disease? The clinical validity was defined as the ability of the test to change diagnosis.
What is the clinical utility of serologic tests in the diagnosis of celiac disease? The clinical utility was defined as the impact of the test on decision making.
What is the budget impact of serologic tests in the diagnosis of celiac disease?
What is the cost-effectiveness of serologic tests in the diagnosis of celiac disease?
Methods
Literature Search
A literature search was performed on November 13th, 2009 using OVID MEDLINE, MEDLINE In-Process and Other Non-Indexed Citations, EMBASE, the Cumulative Index to Nursing & Allied Health Literature (CINAHL), the Cochrane Library, and the International Agency for Health Technology Assessment (INAHTA) for studies published from January 1st, 2003 to November 13th, 2010. Abstracts were reviewed by a single reviewer and, for those studies meeting the eligibility criteria, full-text articles were obtained. Reference lists were also examined for any additional relevant studies not identified through the search. Articles with unknown eligibility were reviewed with a second clinical epidemiologist, then a group of epidemiologists, until consensus was established. The quality of evidence was assessed as high, moderate, low, or very low according to GRADE methodology.
Inclusion Criteria
Studies that evaluated diagnostic accuracy, i.e., both sensitivity and specificity of serology tests in the diagnosis of celiac disease.
Study population consisted of untreated patients with symptoms consistent with celiac disease.
Studies in which both serologic celiac disease tests and small bowel biopsy (gold standard) were used in all subjects.
Systematic reviews, meta-analyses, randomized controlled trials, prospective observational studies, and retrospective cohort studies.
At least 20 subjects included in the celiac disease group.
English language.
Human studies.
Studies published from 2000 on.
Clearly defined cut-off value for the serology test. If more than one test was evaluated, only those tests for which a cut-off was provided were included.
Description of the small bowel biopsy procedure clearly outlined (location, number of biopsies per patient), unless it was specified that celiac disease diagnosis guidelines were followed.
Patients in the treatment group had untreated CD.
Exclusion Criteria
Studies on screening of the general asymptomatic population.
Studies that evaluated rapid diagnostic kits for use either at home or in physician’s offices.
Studies that evaluated diagnostic modalities other than serologic tests such as capsule endoscopy, push enteroscopy, or genetic testing.
Cut-off for serologic tests defined based on controls included in the study.
Study population defined based on positive serology or subjects pre-screened by serology tests.
Celiac disease status known before study enrolment.
Sensitivity or specificity estimates based on repeated testing for the same subject.
Non-peer-reviewed literature such as editorials and letters to the editor.
Population
The population consisted of adults and children with untreated, undiagnosed celiac disease with symptoms consistent with the disease.
Serologic Celiac Disease Tests Evaluated
Anti-gliadin antibody (AGA)
Anti-endomysial antibody (EMA)
Anti-tissue transglutaminase antibody (tTG)
Anti-deamidated gliadin peptides antibody (DGP)
Combinations of some of the serologic tests listed above were evaluated in some studies
Both IgA and IgG antibodies were evaluated for the serologic tests listed above.
Outcomes of Interest
Sensitivity
Specificity
Positive and negative likelihood ratios
Diagnostic odds ratio (OR)
Area under the sROC curve (AUC)
Small bowel biopsy was used as the gold standard in order to estimate the sensitivity and specificity of each serologic test.
Statistical Analysis
Pooled estimates of sensitivity, specificity and diagnostic odds ratios (DORs) for the different serologic tests were calculated using a bivariate, binomial generalized linear mixed model. Statistical significance for differences in sensitivity and specificity between serologic tests was defined by P values less than 0.05, with “false discovery rate” adjustments made for multiple hypothesis testing. The bivariate regression analyses were performed using SAS version 9.2 (SAS Institute Inc.; Cary, NC, USA). Using the bivariate model parameters, summary receiver operating characteristic (sROC) curves were produced using Review Manager 5.0.22 (The Nordic Cochrane Centre, The Cochrane Collaboration, 2008). The area under the sROC curve (AUC) was estimated using a bivariate mixed-effects binary regression modelling framework; model specification, estimation and prediction were carried out with xtmelogit in Stata release 10 (StataCorp, 2007). Statistical tests for the differences in AUC estimates could not be carried out.
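To illustrate the pooling step, the sketch below uses a simplified univariate random-effects (DerSimonian-Laird) model on the logit scale rather than the full bivariate generalized linear mixed model described above; the per-study counts are hypothetical and included only to make the example runnable.

```python
import numpy as np
from scipy.special import logit, expit

# Hypothetical per-study counts among biopsy-confirmed celiac disease patients:
# tp = true positives (positive serology), fn = false negatives (negative serology).
tp = np.array([45, 88, 30, 61])
fn = np.array([4, 6, 5, 3])

def pool_logit_proportion(events, totals):
    """DerSimonian-Laird random-effects pooling of a proportion (e.g., sensitivity)
    on the logit scale, with a 0.5 continuity correction to avoid zero cells."""
    e, n = events + 0.5, totals + 1.0
    y = logit(e / n)                         # per-study logit(proportion)
    v = 1.0 / e + 1.0 / (n - e)              # approximate variance of the logit
    w = 1.0 / v
    y_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - y_fixed) ** 2)       # Cochran's Q statistic
    tau2 = max(0.0, (q - (len(y) - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_re = 1.0 / (v + tau2)                  # random-effects weights
    pooled = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return expit(pooled), expit(pooled - 1.96 * se), expit(pooled + 1.96 * se)

sens, lo, hi = pool_logit_proportion(tp, tp + fn)
print(f"Pooled sensitivity: {sens:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

The same function can be applied to true-negative and false-positive counts to pool specificity; the bivariate model used in the report additionally models the correlation between sensitivity and specificity across studies.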
The study results were stratified according to patient or disease characteristics, such as age and severity of Marsh grade abnormalities, if reported in the studies. The literature indicates that the diagnostic accuracy of serologic tests for celiac disease may be affected in patients with chronic liver disease; therefore, the studies identified through the systematic literature review that evaluated the diagnostic accuracy of serologic tests for celiac disease in patients with chronic liver disease were summarized. The effect of a gluten-free diet (GFD) in patients diagnosed with celiac disease was also summarized if reported in the studies eligible for the analysis.
Summary of Findings
Published Systematic Reviews
Five systematic reviews of studies that evaluated the diagnostic accuracy of serologic celiac disease tests were identified through our literature search. In addition, 17 individual studies in adults and children identified through the search were eligible for this evaluation.
In general, the included studies evaluated the sensitivity and specificity of at least one serologic test in subjects with symptoms consistent with celiac disease. The gold standard used to confirm the celiac disease diagnosis was small bowel biopsy. Serologic tests evaluated included tTG, EMA, AGA, and DGP, using either IgA or IgG antibodies. Indirect immunofluorescence was used for the EMA serologic tests, whereas enzyme-linked immunosorbent assay (ELISA) was used for the other serologic tests.
Common symptoms described in the studies were chronic diarrhea, abdominal pain, bloating, unexplained weight loss, unexplained anemia, and dermatitis herpetiformis.
The main conclusions of the published systematic reviews are summarized below.
IgA tTG and/or IgA EMA have a high accuracy (pooled sensitivity: 90% to 98%, pooled specificity: 95% to 99% depending on the pooled analysis).
Most reviews found that AGA (IgA or IgG) are not as accurate as IgA tTG and/or EMA tests.
A 2009 systematic review concluded that DGP (IgA or IgG) seems to have an accuracy similar to that of tTG; however, since only two of the identified studies evaluated its accuracy, the authors believe that additional data are required to draw firm conclusions.
Two systematic reviews also concluded that combining two serologic celiac disease tests contributes little to the accuracy of the diagnosis.
MAS Analysis
Sensitivity
The pooled analysis performed by MAS showed that IgA tTG has a sensitivity of 92.1% [95% confidence interval (CI) 88.0, 96.3], compared to 89.2% (83.3, 95.1, p=0.12) for IgA DGP, 85.1% (79.5, 94.4, p=0.07) for IgA EMA, and 74.9% (63.6, 86.2, p=0.0003) for IgA AGA. Among the IgG-based tests, the results suggest that IgG DGP has a sensitivity of 88.4% (95% CI: 82.1, 94.6), compared to 44.7% (30.3, 59.2) for IgG tTG and 69.1% (56.0, 82.2) for IgG AGA. The difference was significant when IgG DGP was compared to IgG tTG but not to IgG AGA. Combining serologic celiac disease tests yielded a slightly higher sensitivity compared to individual IgA-based serologic tests.
IgA deficiency
The prevalence of total or severe IgA deficiency was low in the studies identified, varying between 0% and 1.7%, as reported in 3 studies in which IgA deficiency was not used as a referral indication for celiac disease serologic testing. As reported in four studies, the results of IgG-based serologic tests were positive in all patients with IgA deficiency in whom celiac disease was confirmed by small bowel biopsy.
Specificity
The MAS pooled analysis indicates a high specificity across the different serologic tests, including the combination strategy; pooled estimates ranged from 90.1% to 98.7% depending on the test.
Likelihood Ratios
According to the likelihood ratio estimates, both IgA tTG and serologic test combinations were considered very useful tests (positive likelihood ratio above 10 and negative likelihood ratio below 0.1).
Moderately useful tests included IgA EMA, IgA DGP, and IgG DGP (positive likelihood ratio between 5 and 10 and negative likelihood ratio between 0.1 and 0.2).
Somewhat useful tests included IgA AGA and IgG AGA, generating small but sometimes important changes from pre- to post-test probability (positive likelihood ratio between 2 and 5 and negative likelihood ratio between 0.2 and 0.5).
IgG tTG was considered not useful, altering pre- to post-test probability to a small and rarely important degree (positive likelihood ratio between 1 and 2 and negative likelihood ratio between 0.5 and 1).
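The short sketch below illustrates how a test's likelihood ratios shift a pre-test probability to a post-test probability, and how the diagnostic odds ratio discussed in the next subsection is derived from the same sensitivity and specificity. The sensitivity, specificity, and pre-test probabilities used are illustrative placeholders rather than the pooled MAS estimates.

```python
# Likelihood ratios, post-test probabilities, and the diagnostic odds ratio (DOR)
# from a test's sensitivity and specificity. All numeric inputs are placeholders.

def likelihood_ratios(sens, spec):
    lr_pos = sens / (1 - spec)          # probability shift after a positive result
    lr_neg = (1 - sens) / spec          # probability shift after a negative result
    return lr_pos, lr_neg

def post_test_probability(pre_test_prob, lr):
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

def diagnostic_odds_ratio(sens, spec):
    return (sens / (1 - sens)) / ((1 - spec) / spec)

sens, spec = 0.92, 0.95                 # placeholder values for an IgA-based test
lr_pos, lr_neg = likelihood_ratios(sens, spec)
for prevalence in (0.05, 0.10, 0.20):   # plausible pre-test probabilities in symptomatic patients
    p_pos = post_test_probability(prevalence, lr_pos)
    p_neg = post_test_probability(prevalence, lr_neg)
    print(f"pre-test {prevalence:.0%}: post-test {p_pos:.1%} if positive, {p_neg:.1%} if negative")
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}, DOR = {diagnostic_odds_ratio(sens, spec):.0f}")
```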
Diagnostic Odds Ratios (DOR)
Among the individual serologic tests, IgA tTG had the highest DOR, 136.5 (95% CI: 51.9, 221.2). The statistical significance of the differences in DORs among tests was not calculated; however, considering the wide confidence intervals obtained, the differences may not be statistically significant.
Area Under the sROC Curve (AUC)
The sROC AUCs obtained ranged between 0.93 and 0.99 for most IgA-based tests with the exception of IgA AGA, with an AUC of 0.89.
Sensitivity and Specificity of Serologic Tests According to Age Groups
Serologic test accuracy did not seem to vary according to age (adults or children).
Sensitivity and Specificity of Serologic Tests According to Marsh Criteria
Four studies observed a trend towards a higher sensitivity of serologic celiac disease tests when Marsh 3c grade abnormalities were found in the small bowel biopsy compared to Marsh 3a or 3b. The sensitivity of serologic tests was much lower when Marsh 1 grade abnormalities were found in the small bowel biopsy compared to Marsh 3 grade abnormalities. The statistical significance of these findings was not reported in the studies.
Diagnostic Accuracy of Serologic Celiac Disease Tests in Subjects with Chronic Liver Disease
A total of 14 observational studies that evaluated the specificity of serologic celiac disease tests in subjects with chronic liver disease were identified. All studies evaluated the frequency of false positive results (1-specificity) of IgA tTG; however, IgA tTG test kits using different substrates were used, i.e., human recombinant, human, and guinea-pig substrates. The gold standard, small bowel biopsy, was used to confirm the result of the serologic tests in only 5 studies. The studies do not seem to have been designed or powered to compare the diagnostic accuracy among different serologic celiac disease tests.
The results of the studies identified in the systematic literature review suggest that there is a trend towards a lower frequency of false positive results if the IgA tTG test using human recombinant substrate is used compared to the guinea pig substrate in subjects with chronic liver disease. However, the statistical significance of the difference was not reported in the studies. When IgA tTG with human recombinant substrate was used, the number of false positives seems to be similar to what was estimated in the MAS pooled analysis for IgA-based serologic tests in a general population of patients. These results should be interpreted with caution since most studies did not use the gold standard, small bowel biopsy, to confirm or exclude the diagnosis of celiac disease, and since the studies were not designed to compare the diagnostic accuracy among different serologic tests. The sensitivity of the different serologic tests in patients with chronic liver disease was not evaluated in the studies identified.
Effects of a Gluten-Free Diet (GFD) in Patients Diagnosed with Celiac Disease
Six studies identified evaluated the effects of GFD on clinical, histological, or serologic improvement in patients diagnosed with celiac disease. Improvement was observed in 51% to 95% of the patients included in the studies.
Grading of Evidence
Overall, the quality of the evidence ranged from moderate to very low depending on the serologic celiac disease test. Reasons to downgrade the quality of the evidence included the use of a surrogate endpoint (diagnostic accuracy), since none of the studies evaluated clinical outcomes, as well as inconsistencies among study results, imprecise estimates, and sparse data. The quality of the evidence was considered moderate for IgA tTG and IgA EMA, low for IgA DGP and serologic test combinations, and very low for IgA AGA.
Clinical Validity and Clinical Utility of Serologic Testing in the Diagnosis of Celiac Disease
The clinical validity of serologic tests in the diagnosis of celiac disease was considered high in subjects with symptoms consistent with this disease due to:
High accuracy of some serologic tests.
Serologic tests detect possible celiac disease cases and avoid unnecessary small bowel biopsy if the test result is negative, unless an endoscopy/small bowel biopsy is necessary due to the clinical presentation.
Serologic tests support the results of small bowel biopsy.
The clinical utility of serologic tests for the diagnosis of celiac disease, as defined by their impact on decision making, was also considered high in subjects with symptoms consistent with this disease, given the considerations listed above and since a celiac disease diagnosis leads to treatment with a gluten-free diet.
Economic Analysis
A decision analysis was constructed to compare costs and outcomes between the tests based on the sensitivity, specificity and prevalence summary estimates from the MAS Evidence-Based Analysis (EBA). A budget impact was then calculated by multiplying the expected costs by the anticipated test volumes in Ontario. The outcomes of the analysis were expected costs and false negatives (FN). Costs were reported in 2010 CAD$. All analyses were performed using TreeAge Pro Suite 2009.
Four strategies made up the efficiency frontier: IgG tTG, IgA tTG, EMA and small bowel biopsy. All other strategies were dominated. IgG tTG was the least costly and least effective strategy ($178.95, FN avoided = 0). Small bowel biopsy was the most costly and most effective strategy ($396.60, FN avoided = 0.1553). The costs per FN avoided were $293, $369, and $1,401 for EMA, IgA tTG and small bowel biopsy, respectively. One-way sensitivity analyses did not change the ranking of strategies.
All testing strategies with small bowel biopsy are cheaper than biopsy alone; however, they also result in more FNs. The most cost-effective strategy will depend on the decision makers’ willingness to pay. Findings suggest that IgA tTG was the most cost-effective and feasible strategy based on its incremental cost-effectiveness ratio (ICER) and the convenience of conducting the test. The potential budget impact of the IgA tTG test in the province of Ontario would be $10.4M, $11.0M and $11.7M in the next three years, respectively, based on past volumes and trends in the province and base-case expected costs.
A panel of tests is the strategy commonly used in the province of Ontario; therefore, the impact to the system would be $13.6M, $14.5M and $15.3M in the next three years, respectively, based on past volumes and trends in the province and base-case expected costs.
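As a rough illustration of the decision-analytic arithmetic behind such figures, the sketch below computes the expected cost and expected false negatives of a serology-first strategy, the incremental cost per FN avoided relative to a cheaper comparator, and a simple budget impact. All prevalences, accuracies, costs, and volumes are hypothetical placeholders, not the inputs of the MAS model.

```python
# Simplified decision-analytic sketch: expected cost and expected false negatives (FN)
# of a serology-first strategy (biopsy only if serology is positive), incremental cost
# per FN avoided, and a budget impact. All inputs are hypothetical placeholders.

def strategy_outcomes(sens, spec, prevalence, test_cost, biopsy_cost):
    positives = prevalence * sens + (1 - prevalence) * (1 - spec)   # fraction referred to biopsy
    expected_cost = test_cost + positives * biopsy_cost             # per patient tested
    expected_fn = prevalence * (1 - sens)                           # missed cases per patient tested
    return expected_cost, expected_fn

prevalence = 0.10                                  # hypothetical pre-test probability
less_sensitive = strategy_outcomes(sens=0.45, spec=0.95, prevalence=prevalence,
                                   test_cost=15.0, biopsy_cost=600.0)
more_sensitive = strategy_outcomes(sens=0.92, spec=0.91, prevalence=prevalence,
                                   test_cost=25.0, biopsy_cost=600.0)

delta_cost = more_sensitive[0] - less_sensitive[0]
fn_avoided = less_sensitive[1] - more_sensitive[1]
print(f"Incremental cost per FN avoided: ${delta_cost / fn_avoided:,.0f}")

annual_volume = 70_000                             # hypothetical provincial test volume
print(f"Budget impact of the more sensitive strategy: ${more_sensitive[0] * annual_volume:,.0f} per year")
```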
Conclusions
The clinical validity and clinical utility of serologic tests for celiac disease were considered high in subjects with symptoms consistent with this disease, as the tests aid in the diagnosis of celiac disease and some tests have high accuracy.
The study findings suggest that IgA tTG is the most accurate and the most cost-effective test.
The IgA AGA test has lower accuracy compared to other IgA-based tests.
Serologic test combinations appear to be more costly with little gain in accuracy. In addition, there may be problems with the generalizability of the results of the studies included in this review if different test combinations are used in clinical practice.
IgA deficiency seems to be uncommon in patients diagnosed with celiac disease.
The generalizability of study results is contingent on performing both the serologic test and small bowel biopsy in subjects on a gluten-containing diet as was the case in the studies identified, since the avoidance of gluten may affect test results.
PMCID: PMC3377499  PMID: 23074399
15.  Protocol for a systematic review and individual patient data meta-analysis of prognostic factors of foot ulceration in people with diabetes: the international research collaboration for the prediction of diabetic foot ulcerations (PODUS) 
Background
Diabetes-related lower limb amputations are associated with considerable morbidity and mortality and are usually preceded by foot ulceration. The available systematic reviews of aggregate data are compromised because the primary studies report both adjusted and unadjusted estimates. As adjusted meta-analyses of aggregate data can be challenging, the best way to standardise the analytical approach is to conduct a meta-analysis based on individual patient data (IPD).
There are, however, many challenges, and fundamental methodological omissions are common: protocols are rare, and the assessment of the risk of bias arising from the conduct of individual studies is frequently not performed, largely because of the absence of widely agreed criteria for assessing the risk of bias in this type of review. In this protocol we propose key methodological approaches to underpin our IPD systematic review of prognostic factors of foot ulceration in diabetes.
Review questions:
1. What are the most highly prognostic factors for foot ulceration (i.e. symptoms, signs, diagnostic tests) in people with diabetes?
2. Can the data from each study be adjusted for a consistent set of adjustment factors?
3. Does the model accuracy change when patient populations are stratified according to demographic and/or clinical characteristics?
Methods
MEDLINE and EMBASE databases were searched from their inception until early 2012, and the corresponding authors of all eligible primary studies were invited to contribute their raw data. We developed relevant quality assurance items likely to identify occasions when study validity may have been compromised from several sources. A confidentiality agreement, arrangements for communication and reporting, as well as ethical and governance considerations are explained.
We have agreement from the corresponding authors of all studies that meet the eligibility criteria, and they collectively possess data from more than 17,000 patients. We propose, as a provisional analysis plan, to use a multi-level mixed model, with “study” as one of the levels. Such a model can also allow for the within-patient clustering that occurs if a patient contributes data from both feet, although to aid interpretation, we prefer to use patients rather than feet as the unit of analysis. We intend to attempt this analysis only if the results of the investigation of heterogeneity do not rule it out and the model diagnostics are acceptable.
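For orientation, the sketch below shows a common two-stage simplification of an IPD analysis rather than the one-stage multi-level mixed model proposed in the protocol: a logistic regression is fitted within each study (one record per patient) and the log odds ratio of a single prognostic factor is then pooled with inverse-variance weights. The column names (study_id, foot_ulcer, monofilament_insensate, age) are hypothetical.

```python
# Two-stage simplification of an IPD meta-analysis (not the one-stage multi-level
# mixed model proposed in the protocol): fit a logistic regression within each study,
# then pool the log odds ratio of one prognostic factor with inverse-variance weights.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def two_stage_ipd(ipd: pd.DataFrame, factor: str = "monofilament_insensate"):
    log_ors, variances = [], []
    for _, study_df in ipd.groupby("study_id"):
        X = sm.add_constant(study_df[[factor, "age"]])    # same adjustment set in every study
        fit = sm.Logit(study_df["foot_ulcer"], X).fit(disp=0)
        log_ors.append(fit.params[factor])
        variances.append(fit.bse[factor] ** 2)
    w = 1.0 / np.asarray(variances)                       # inverse-variance (fixed-effect) weights
    pooled = np.sum(w * np.asarray(log_ors)) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return np.exp(pooled), np.exp(pooled - 1.96 * se), np.exp(pooled + 1.96 * se)

# Usage with a hypothetical one-row-per-patient data frame:
# or_, lo, hi = two_stage_ipd(ipd_dataframe)
# print(f"Pooled OR {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

A one-stage multi-level model would additionally allow between-study heterogeneity and within-patient clustering to be modelled directly, which is why the protocol prefers it.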
Discussion
This review is central to the development of a global evidence-based strategy for the risk assessment of the foot in patients with diabetes, ensuring future recommendations are valid and can reliably inform international clinical guidelines.
doi:10.1186/1471-2288-13-22
PMCID: PMC3599337  PMID: 23414550
16.  San Francisco Syncope Rule to predict short-term serious outcomes: a systematic review 
Background:
The San Francisco Syncope Rule has been proposed as a clinical decision rule for risk stratification of patients presenting to the emergency department with syncope. It has been validated across various populations and settings. We undertook a systematic review of its accuracy in predicting short-term serious outcomes.
Methods:
We identified studies by means of systematic searches in seven electronic databases from inception to January 2011. We extracted study data in duplicate and used a bivariate random-effects model to assess the predictive accuracy and test characteristics.
Results:
We included 12 studies with a total of 5316 patients, of whom 596 (11%) experienced a serious outcome. The prevalence of serious outcomes across the studies varied between 5% and 26%. The pooled estimate of sensitivity of the San Francisco Syncope Rule was 0.87 (95% confidence interval [CI] 0.79–0.93), and the pooled estimate of specificity was 0.52 (95% CI 0.43–0.62). There was substantial between-study heterogeneity (resulting in a 95% prediction interval for sensitivity of 0.55–0.98). The probability of a serious outcome given a negative score with the San Francisco Syncope Rule was 5% or lower, and the probability was 2% or lower when the rule was applied only to patients for whom no cause of syncope was identified after initial evaluation in the emergency department. The most common cause of false-negative classification for a serious outcome was cardiac arrhythmia.
Interpretation:
The San Francisco Syncope Rule should be applied only for patients in whom no cause of syncope is evident after initial evaluation in the emergency department. Consideration of all available electrocardiograms, as well as arrhythmia monitoring, should be included in application of the San Francisco Syncope Rule. Between-study heterogeneity was likely due to inconsistent classification of arrhythmia.
doi:10.1503/cmaj.101326
PMCID: PMC3193123  PMID: 21948723
17.  Falls prevention for the elderly 
Background
An ageing population, a growing prevalence of chronic diseases and limited financial resources for health care underpin the importance of prevention of disabling health disorders and care dependency in the elderly. A wide variety of measures is generally available for the prevention of falls and fall-related injuries. The spectrum ranges from diagnostic procedures for identifying individuals at risk of falling to complex interventions for the removal or reduction of identified risk factors. However, the clinical and economic effectiveness of the majority of recommended strategies for fall prevention is unclear. Against this background, the literature analyses in this HTA report aim to support decision-making for effective and efficient fall prevention.
Research questions
The pivotal research question addresses the effectiveness of single interventions and complex programmes for the prevention of falls and fall-related injuries. The target population is the elderly (>60 years) living in their own homes or in long-term care facilities. Further research questions refer to the cost-effectiveness of fall prevention measures, and their ethical, social and legal implications.
Methods
Systematic literature searches were performed in 31 databases covering the publication period from January 2003 to January 2010. While the effectiveness of interventions is solely assessed on the basis of randomised controlled trials (RCT), the assessment of the effectiveness of diagnostic procedures also considers prospective accuracy studies. In order to clarify social, ethical and legal aspects, all studies deemed relevant with regard to content were taken into consideration, irrespective of their study design. Study selection and critical appraisal were conducted by two independent assessors. Due to clinical heterogeneity of the studies no meta-analyses were performed.
Results
Out of 12,000 references retrieved by the literature searches, 184 met the inclusion criteria. However, the validity of their results must be rated as compromised to a variable degree due to different biasing factors. In summary, it appears that the performance of tests or the application of parameters to identify individuals at risk of falling yields little or no clinically relevant information. Positive effects of exercise interventions may be expected in relatively young and healthy seniors, while studies indicate opposite effects in the frail elderly. For this specific vulnerable population the modification of the housing environment shows protective effects. A low number of studies, low quality of studies or inconsistent results lead to the conclusion that the effectiveness of the following interventions must currently be rated as unclear: correction of vision disorders, modification of psychotropic medication, vitamin D supplementation, nutritional supplements, psychological interventions, education of nursing personnel, multiple and multifactorial programs, as well as the application of hip protectors.
For the context of the German health care system the economic evaluations of fall prevention retrieved by the literature searches yield very few useful results. Cost-effectiveness calculations of fall prevention are mostly based on weak effectiveness data as well as on epidemiological and cost data from foreign health care systems.
Ethical analysis demonstrates ambivalent views of the target population concerning fall risk and the necessity of fall prevention. The willingness to take up preventive measures depends on a variety of personal factors, the quality of information, guidance and decision-making, the prevention program itself and social support.
The analysis of papers regarding legal issues shows three main challenges: the uncertainty of which standard of care has to be expected with regard to fall prevention, the necessity to consider the specific conditions of every single case when measures for fall prevention are applied, and the difficulty of balancing the rights to autonomous decision making and physical integrity.
Discussion and conclusions
The assessment of clinical effectiveness of interventions for fall prevention is complicated by inherent methodological problems (esp. absence of blinding) and meaningful clinical heterogeneity of available studies. Therefore meta-analyses are not appropriate, and single study results are difficult to interpret. Both problems also impair the informative value of economic analyses. With this background it has to be stated that current recommendations regarding fall prevention in the elderly are not fully supported by scientific evidence. In particular, for the generation of new recommendations the dependency of probable effects on specific characteristics of the target populations or care settings should be taken into consideration. This also applies to the variable factors influencing the willingness of the target population to take up and pursue preventive measures.
In the planning of future studies equal weight should be placed on methodological rigour (freedom from biases) and transferability of results into routine care. Economic analyses require input of German data, either in the form of a “piggyback” study or in the form of a modelling study that reflects the structures of the German health care system and is based on German epidemiological and cost data.
doi:10.3205/hta000099
PMCID: PMC3334922  PMID: 22536299
accidental falls; accidents, home/*; activities of daily living; aged/*; aged/*psychology; adjustment of the living environment; cataract surgery; correction of the visual acuity; customisation of the living environment; diagnosis; dietary supplements; dose-response relationship, drug; EBM; economic evaluation; elderly; environment design; evidence-based medicine; exercise program; exercise/physiology; eye test; eyesight; eyesight test; fall; fall prevention; fall prophylaxis; fall risk; fall risk factors; falling consequences; falling danger; fall-related injuries; fracture; freedom/*; freedom-depriving measures; geriatric nursing home; health technology assessment; hip fracture; hip fractures; hip protectors; homes for the aged; HTA; humans; interventions; medical adjustment; meta-analysis as topic; motor activity; motor activity/drug effects; motor skills; motor function; multi-factorial programs; multimodal programs; nursing homes; peer review; power of movement; prevention; primary prevention; private domesticity; prophylaxis; randomized controlled trial; randomized controlled trials as topic; RCT; review literature as topic; risk assessment; risk factors; risk reduction behavior; seniors; sight; stabilized; systematic review; technology assessment, biomedical; training program; visual acuity; Vitamin D/administration & dosage
18.  Evidence-based Diagnostics: Adult Septic Arthritis 
Background
Acutely swollen or painful joints are common complaints in the emergency department (ED). Septic arthritis in adults is a challenging diagnosis, but prompt differentiation of a bacterial etiology is crucial to minimize morbidity and mortality.
Objectives
The objective was to perform a systematic review describing the diagnostic characteristics of history, physical examination, and bedside laboratory tests for nongonococcal septic arthritis. A secondary objective was to quantify test and treatment thresholds using derived estimates of sensitivity and specificity, as well as best-evidence diagnostic and treatment risks and anticipated benefits from appropriate therapy.
Methods
Two electronic search engines (PUBMED and EMBASE) were used in conjunction with a selected bibliography and scientific abstract hand search. Inclusion criteria included adult trials of patients presenting with monoarticular complaints if they reported sufficient detail to reconstruct partial or complete 2 × 2 contingency tables for experimental diagnostic test characteristics using an acceptable criterion standard. Evidence was rated by two investigators using the Quality Assessment Tool for Diagnostic Accuracy Studies (QUADAS). When more than one similarly designed trial existed for a diagnostic test, meta-analysis was conducted using a random effects model. Interval likelihood ratios (LRs) were computed when possible. To illustrate one method to quantify theoretical points in the probability of disease whereby clinicians might cease testing altogether and either withhold treatment (test threshold) or initiate definitive therapy in lieu of further diagnostics (treatment threshold), an interactive spreadsheet was designed and sample calculations were provided based on research estimates of diagnostic accuracy, diagnostic risk, and therapeutic risk/benefits.
Results
The prevalence of nongonococcal septic arthritis in ED patients with a single acutely painful joint is approximately 27% (95% confidence interval [CI] = 17% to 38%). With the exception of joint surgery (positive likelihood ratio [+LR] = 6.9) or skin infection overlying a prosthetic joint (+LR = 15.0), history, physical examination, and serum tests do not significantly alter posttest probability. Serum inflammatory markers such as white blood cell (WBC) counts, erythrocyte sedimentation rate (ESR), and C-reactive protein (CRP) are not useful acutely. The interval LR for synovial white blood cell (sWBC) counts of 0–25 × 10⁹/L was 0.33; for 25–50 × 10⁹/L, 1.06; for 50–100 × 10⁹/L, 3.59; and for counts exceeding 100 × 10⁹/L, infinity. Synovial lactate may be useful to rule in or rule out the diagnosis of septic arthritis, with a +LR ranging from 2.4 to infinity and a negative likelihood ratio (−LR) ranging from 0 to 0.46. Rapid polymerase chain reaction (PCR) of synovial fluid may identify the causative organism within 3 hours. Based on 56% sensitivity and 90% specificity for sWBC counts of >50 × 10⁹/L, in conjunction with best-evidence estimates for diagnosis-related risk and treatment-related risk/benefit, the arthrocentesis test threshold is 5%, with a treatment threshold of 39%.
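The review quantifies these thresholds with an interactive spreadsheet; the sketch below shows one standard threshold formulation (in the style of Pauker and Kassirer) that derives test and treatment thresholds from sensitivity, specificity, test risk, and the risk/benefit of treatment. The numeric inputs are illustrative placeholders, not the best-evidence estimates used in the review, so the printed thresholds will differ from the 5% and 39% reported above.

```python
# One standard threshold formulation (Pauker-Kassirer style): the disease probability
# above which testing (arthrocentesis) beats withholding treatment, and the probability
# above which empiric treatment beats further testing. Inputs are illustrative placeholders.

def test_and_treatment_thresholds(sens, spec, test_risk, treat_benefit, treat_harm):
    """All quantities on a common utility scale:
    treat_benefit = net benefit of treating a diseased patient,
    treat_harm    = net harm of treating a non-diseased patient,
    test_risk     = harm attributable to the test itself."""
    fpr = 1 - spec
    test_threshold = (fpr * treat_harm + test_risk) / (fpr * treat_harm + sens * treat_benefit)
    treatment_threshold = (spec * treat_harm - test_risk) / (spec * treat_harm + (1 - sens) * treat_benefit)
    return test_threshold, treatment_threshold

# Placeholder inputs: sWBC > 50 x 10^9/L as the test, empiric therapy as the treatment.
low, high = test_and_treatment_thresholds(sens=0.56, spec=0.90,
                                           test_risk=0.005, treat_benefit=0.30, treat_harm=0.05)
print(f"Test threshold ~{low:.0%}, treatment threshold ~{high:.0%}")
```

Below the test threshold the expected harm of testing outweighs its expected benefit; above the treatment threshold, treating empirically outperforms testing first.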
Conclusions
Recent joint surgery or cellulitis overlying a prosthetic hip or knee were the only findings on history or physical examination that significantly alter the probability of nongonococcal septic arthritis. Extreme values of sWBC (>50 × 10⁹/L) can increase, but not decrease, the probability of septic arthritis. Future ED-based diagnostic trials are needed to evaluate the role of clinical gestalt and the efficacy of nontraditional synovial markers such as lactate.
doi:10.1111/j.1553-2712.2011.01121.x
PMCID: PMC3229263  PMID: 21843213
19.  Accuracy of 3 Diagnostic Tests for Anterior Cruciate Ligament Tears 
Journal of Athletic Training  2006;41(1):120-121.
Reference: Scholten RJPM, Opstelten W, van der Plas CG, Bijl D, Deville WLJM, Bouter LM. Accuracy of physical diagnostic tests for assessing ruptures of the anterior cruciate ligament: a meta-analysis. J Fam Pract. 2003;52:689–694. PMID: 12967539.
Clinical Question: In patients presenting with possible rupture of the anterior cruciate ligament (ACL), which diagnostic test can provide an accurate diagnosis during the physical examination?
Data Sources: Two reviewers searched MEDLINE (1966 to February 14, 2003) and EMBASE (1980 to February 14, 2003). Articles written in English, French, German, or Dutch were included. The key search terms were knee injuries, knee joint, and knee. These terms were combined with the headings joint instability and anterior cruciate ligament, as well as the text words laxity, instability, cruciate, and effusion. The results of these searches were combined with the subject headings sensitivity and specificity, physical examination, and not (animal not [human and animal]). Additional text words searched were sensitivit*, specificit*, false positive, false negative, accuracy, screening, physical examination, and clinical examination. The reference lists of included articles were examined.
Study Selection: Inclusion criteria consisted of (1) investigation of at least one physical diagnostic test for assessment of ACL ruptures in the knee and (2) the use of a reference standard of arthrotomy, arthroscopy, or magnetic resonance imaging.
Data Extraction: Two independent reviewers extracted data from each included study. The methodologic quality of each test was assessed and recorded on a checklist for the screening of diagnostic tests (www.cochrane.de/cochrane/sadtdoc1.htm). The 3 diagnostic tests validated in this review were the pivot shift test, the anterior drawer test, and the Lachman test. A summary receiver operating characteristic curve was performed for each test, and the sensitivity, specificity, and predictive values were reported.
Main Results: The search strategy produced 1090 potentially eligible studies, of which 17 studies were selected. One study was included via reference list examination and 2 reports referred to the same study. Thus, 17 studies met the inclusion criteria and were used for this review. For the included studies, the sample size ranged from 32 to 300 patients. The authors of 4 studies failed to report the age of the subjects; across the remaining 13 of the 17 studies, the average age of patients was 28.6 years. Authors of all studies failed to measure the clinical test and reference standard separately and with blinding. In addition, all but 2 studies had a significant degree of verification bias. Arthrotomy was the lone reference standard in 4 studies, whereas arthrotomy/arthroscopy was the reference standard in 5 studies. Arthroscopy alone was the reference standard in 6 studies, while only 2 studies used MRI as the reference standard. Authors of 8 studies examined the anterior drawer test and reported sensitivity values ranging from 0.18–0.92 and specificity values ranging from 0.78–0.98. When pooled together using the bivariate random effects model (BREM), the sensitivity value of the 8 studies was 0.2 and the specificity value was 0.88. Authors of 9 studies examined the Lachman test and reported sensitivity values ranging from 0.63–0.93 and specificity values ranging from 0.55–0.99. Pooled together using the BREM, the sensitivity value was 0.86 and the specificity value was 0.91. Lastly, authors of 6 studies examined the pivot shift test and reported sensitivity values ranging from 0.18–0.48 and specificity values ranging from 0.97–0.99. Data for the pivot shift test could not be pooled using the BREM because of the low number of available studies. Predictive values were reported graphically, with the pivot shift test having the highest positive predictive value and the Lachman test having the best negative predictive value.
Conclusions: Based on predictive value statistics, it can be concluded that during the physical examination, a positive result for the pivot shift test is the best for ruling in an ACL rupture, whereas a negative result on the Lachman test is the best for ruling out an ACL rupture. It can also be concluded that, solely using sensitivity and specificity values, the Lachman test is a better overall test at both ruling in and ruling out ACL ruptures. The anterior drawer test appears to be inconclusive for drawing strong conclusions either way.
PMCID: PMC1421494  PMID: 16619105
sensitivity; specificity; physical examination; knee; validity; joint instability
20.  Hormonal Contraception Is Associated with a Reduced Risk of Bacterial Vaginosis: A Systematic Review and Meta-Analysis 
PLoS ONE  2013;8(9):e73055.
Objective
To examine the association between hormonal contraception (HC) and bacterial vaginosis (BV) by systematic review and meta-analysis.
Methods
Medline, Web of Science and Embase databases were searched to 24 January 2013 and duplicate references removed. Inclusion criteria were: 1) >20 BV cases; 2) accepted BV diagnostic method; 3) measure of HC-use either as combined oestrogen-progesterone HC (combined), progesterone-only contraception (POC) or unspecified HC (u-HC); 4) ≥10% of women using HC; 5) analysis of the association between BV and HC-use presented; 6) appropriate control group. Data extracted included: type of HC, BV diagnostic method and outcome (prevalent, incident, recurrent), and geographical and clinic setting. Meta-analyses were conducted to calculate pooled effect sizes (ES), stratified by HC-type and BV outcome. This systematic review is registered with PROSPERO (CRD42013003699).
Results
Of 1713 unique references identified, 502 full-text articles were assessed for eligibility and 55 studies met inclusion criteria. Hormonal contraceptive use was associated with a significant reduction in the odds of prevalent BV (pooled effect size by random effects [reES] = 0.68, 95% CI 0.63–0.73), and in the relative risk (RR) of incident (reES = 0.82, 95% CI 0.72–0.92), and recurrent (reES = 0.69, 95% CI 0.59–0.91) BV. When stratified by HC-type, combined-HC and POC were both associated with decreased prevalence of BV and risk of incident BV. In the pooled analysis of the effect of HC-use on the composite outcome of prevalent/incident/recurrent BV, HC-use was associated with a reduced risk of any BV (reES = 0.78, 95% CI 0.74–0.82).
Conclusion
HC-use was associated with a significantly reduced risk of BV. This negative association was robust and present regardless of HC-type and evident across all three BV outcome measures. When stratified by HC-type, combined-HC and POC were both individually associated with a reduction in the prevalence and incidence of BV. This meta-analysis provides compelling evidence that HC-use influences a woman’s risk of BV, with important implications for clinicians and researchers in the field.
doi:10.1371/journal.pone.0073055
PMCID: PMC3762860  PMID: 24023807
21.  CT coronary angiography vs. invasive coronary angiography in CHD 
Scientific background
Various diagnostic tests including conventional invasive coronary angiography and non-invasive computed tomography (CT) coronary angiography are used in the diagnosis of coronary heart disease (CHD).
Research questions
The present report aims to evaluate the clinical efficacy, diagnostic accuracy, prognostic value and cost-effectiveness, as well as the ethical, social and legal implications, of CT coronary angiography versus invasive coronary angiography in the diagnosis of CHD.
Methods
A systematic literature search was conducted in electronic databases (MEDLINE, EMBASE etc.) in October 2010 and was completed with a manual search. The literature search was restricted to articles published from 2006 onwards in German or English. Two independent reviewers were involved in the selection of the relevant publications.
The medical evaluation was based on systematic reviews of diagnostic studies with invasive coronary angiography as the reference standard and on diagnostic studies with intracoronary pressure measurement as the reference standard. Study results were combined in a meta-analysis with 95 % confidence intervals (CI). Additionally, data on radiation doses from current non-systematic reviews were taken into account.
A health economic evaluation was performed by modelling from the social perspective with clinical assumptions derived from the meta-analysis and economic assumptions derived from contemporary German sources.
Data on special indications (bypass or in-stent-restenosis) were not included in the evaluation. Only data obtained using CT scanners with at least 64 slices were considered.
Results
No studies were found regarding the clinical efficacy or prognostic value of CT coronary angiography versus conventional invasive coronary angiography in the diagnosis of CHD.
Overall, 15 systematic reviews with data from 44 diagnostic studies using invasive coronary angiography as the reference standard (identification of obstructive stenoses) and two diagnostic studies using intracoronary pressure measurement as the reference standard (identification of functionally relevant stenoses) were included in the medical evaluation.
Meta-analysis of the nine studies of higher methodological quality showed that CT coronary angiography, with invasive coronary angiography as the reference standard, had a sensitivity of 96 % (95 % CI: 93 % to 98 %), specificity of 86 % (95 % CI: 83 % to 89 %), positive likelihood ratio of 6.38 (95 % CI: 5.18 to 7.87) and negative likelihood ratio of 0.06 (95 % CI: 0.03 to 0.10). However, due to non-diagnostic CT images approximately 3.6 % of the examined patients required a subsequent invasive coronary angiography.
Using intracoronary pressure measurement as the reference standard, CT coronary angiography compared to invasive coronary angiography had a sensitivity of 80 % (95 % CI: 61 % to 92 %) versus 67 % (95 % CI: 51 % to 78 %), a specificity of 67 % (95 % CI: 47 % to 83 %) versus 75 % (95 % CI: 60 % to 86 %), an average positive likelihood ratio of 2.3 versus 2.6, and an average negative likelihood ratio of 0.3 versus 0.4, respectively.
Compared to invasive coronary angiography, the average effective radiation dose of CT coronary angiography was higher with retrospective electrocardiogram (ECG) gating and relatively similar with prospective ECG gating.
The health economic model using invasive coronary angiography as the reference standard showed that at a pretest probability of CHD of 50 % or lower, CT coronary angiography resulted in lower cost per patient with true positive diagnosis. At a pretest probability of CHD of 70 % or higher, invasive coronary angiography was associated with lower cost per patient with true positive diagnosis. Using intracoronary pressure measurement as the reference standard, both types of coronary angiographies resulted in substantially higher cost per patient with true positive diagnosis.
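To show how such a pretest-probability crossover can arise, the sketch below compares the cost per patient with a true-positive diagnosis for a CT-first strategy (invasive angiography only after a positive or non-diagnostic CT) against direct invasive angiography. The accuracy values follow the pooled estimates above, but all costs are hypothetical placeholders rather than the inputs of the German model, so only the qualitative pattern is meaningful.

```python
# Cost per patient with a true-positive diagnosis: CT-first triage versus direct
# invasive coronary angiography (ICA). Accuracy values follow the pooled estimates
# above; all costs are hypothetical placeholders.

def cost_per_true_positive(pretest, sens_ct=0.96, spec_ct=0.86, nondiagnostic=0.036,
                           cost_ct=500.0, cost_ica=1500.0):
    # CT-first: non-diagnostic scans go straight to ICA; the rest are triaged by CT,
    # and CT-positive patients are confirmed by ICA.
    referred = nondiagnostic + (1 - nondiagnostic) * (
        pretest * sens_ct + (1 - pretest) * (1 - spec_ct))
    cost_ct_first = cost_ct + referred * cost_ica
    tp_ct_first = pretest * (nondiagnostic + (1 - nondiagnostic) * sens_ct)
    # Direct ICA is treated as the reference standard and detects all obstructive stenoses.
    return cost_ct_first / tp_ct_first, cost_ica / pretest

for pretest in (0.3, 0.5, 0.7):
    ct_first, direct = cost_per_true_positive(pretest)
    print(f"pretest {pretest:.0%}: CT-first {ct_first:,.0f} vs direct ICA {direct:,.0f} per true positive")
```

With these placeholder costs, the CT-first strategy is cheaper per true positive at low-to-intermediate pretest probabilities and direct invasive angiography becomes cheaper at high pretest probabilities; the exact crossover point depends on the cost inputs.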
Two publications dealing explicitly with ethical aspects were identified. The first addressed ethical aspects regarding the principles of beneficence, autonomy and justice, and the second addressed those regarding radiation exposition, especially when used within studies.
Discussion
The discriminatory power of CT coronary angiography to identify patients with obstructive (above 50 %) coronary stenoses should be regarded as “high diagnostic evidence”, to identify patients without coronary stenoses as “persuasive diagnostic evidence”. The discriminatory power of both types of coronary angiography to identify patients with or without functionally relevant coronary stenoses should be regarded as “weak diagnostic evidence”.
It can be assumed that patients with a high pretest probability of CHD will need invasive coronary angiography and patients with a low pretest probability of CHD will not need subsequent revascularisation. Therefore, CT coronary angiography may be used before performing invasive coronary angiography in patients with an intermediate pretest probability of CHD.
For identifying or excluding of obstructive coronary stenosis, CT coronary angiography was shown to be more cost-saving at a pretest probability of CHD of 50 % or lower, and invasive coronary angiography at a pretest probability of CHD of 70 % or higher. The use of both types of coronary angiography to identify or to exclude functionally relevant coronary stenoses should be regarded as highly cost-consuming.
With regard to ethical, social or legal aspects, the following possible implications were identified: under-provision or over-provision of health care, unnecessary complications, anxiety, social stigmatisation, restriction of self-determination, unequal access to health care, unfair resource distribution and legal disputes.
Conclusion
From a medical point of view, CT coronary angiography using scanners with at least 64 slices should be recommended as a test to rule out obstructive coronary stenoses in order to avoid inappropriate invasive coronary angiography in patients with an intermediate pretest probability of CHD. From a health economic point of view, this recommendation should be limited to patients with a pretest probability of CHD of 50 % or lower.
From a medical and health economic point of view, neither CT coronary angiography using scanners with at least 64 slices nor invasive coronary angiography may be recommended as a single diagnostic test for identifying or ruling out functionally relevant coronary stenoses.
To minimise any potential negative ethical, social and legal implications, the general ethical and moral principles of benefit, autonomy and justice should be considered.
doi:10.3205/hta000100
PMCID: PMC3334923  PMID: 22536300
CHD; coronary angiography; coronary disease; coronary heart disease; cost-benefit-analysis; diagnosis; EBM; evidence based medicine; evidence-based medicine; health technology assessment; health-economic analysis; HTA; humans; meta-analysis; meta-analysis as topic; review literature as topic; stenosis; systematic review
22.  Can Falls Risk Prediction Tools Correctly Identify Fall-Prone Elderly Rehabilitation Inpatients? A Systematic Review and Meta-Analysis 
PLoS ONE  2012;7(7):e41061.
Background
Falls of elderly people may cause permanent disability or death. Particularly susceptible are elderly patients in rehabilitation hospitals. We systematically reviewed the literature to identify falls prediction tools available for assessing elderly inpatients in rehabilitation hospitals.
Methods and Findings
We searched six electronic databases using comprehensive search strategies developed for each database. Estimates of sensitivity and specificity were plotted in ROC space graphs and pooled across studies. Our search identified three studies which assessed the prediction properties of falls prediction tools in a total of 754 elderly inpatients in rehabilitation hospitals. Only the STRATIFY tool was assessed in all three studies; the other identified tools (PJC-FRAT and DOWNTON) were assessed by a single study. For a STRATIFY cut-score of two, pooled sensitivity was 73% (95% CI 63% to 81%) and pooled specificity was 42% (95% CI 34% to 51%). An indirect comparison of the tools across studies indicated that the DOWNTON tool has the highest sensitivity (92%), while the PJC-FRAT offers the best balance between sensitivity and specificity (73% and 75%, respectively). All studies presented major methodological limitations.
Conclusions
We did not identify any tool which had an optimal balance between sensitivity and specificity, or which was clearly better than a simple clinical judgment of risk of falling. The limited number of identified studies, with major methodological limitations, impairs sound conclusions on the usefulness of falls risk prediction tools in geriatric rehabilitation hospitals.
doi:10.1371/journal.pone.0041061
PMCID: PMC3398864  PMID: 22815914
23.  Multi-Detector Computed Tomography Angiography for Coronary Artery Disease 
Executive Summary
Purpose
Computed tomography (CT) scanning continues to be an important modality for the diagnosis of injury and disease, most notably for indications of the head and abdomen. (1) According to a recent report published by the Canadian Institute for Health Information, (1) there were about 10.3 scanners per million people in Canada as of January 2004. Ontario had the fewest CT scanners per million of all the provinces (8 CT scanners per million). The wait time for CT in Ontario of 5 weeks approaches the Canadian median of 6 weeks.
This health technology and policy appraisal systematically reviews the published literature on multidetector CT (MDCT) angiography as a diagnostic tool for the newest indication for CT, coronary artery disease (CAD), and will apply the results of the review to current health care practices in Ontario. This review does not evaluate MDCT to detect coronary calcification without contrast medium for CAD screening purposes.
The Technology
Compared with conventional CT scanning, MDCT can acquire finer detail (thinner slices) and can cover a larger area faster. (2) Advancing MDCT technology (8-, 16-, 32-, and 64-slice systems) is capable of producing more images in less time. For general CT scanning, this faster capability can reduce the time that patients must stay still during the procedure, thereby reducing potential movement artefact. However, the additional clinical utility of images obtained from faster scanners compared to the images obtained from conventional CT scanners for current CT indications (i.e., non-moving body parts) is not known.
There are suggestions that the new fast scanners can reduce wait times for general CT. MDCT angiography, which utilizes a contrast medium, has been proposed as a minimally invasive replacement for coronary angiography to detect coronary artery disease. MDCT may take between 15 and 45 minutes; coronary angiography may take up to 1 hour.
Although 16-slice and 32-slice CT scanners have been available for a few years, 64-slice CT scanners were released only at the end of 2004.
Review Strategy
There are many proven, evidence-based indications for conventional CT. It is not clear how MDCT will add to the clinical utility and management of patients for established CT indications. Therefore, because cardiac imaging, specifically MDCT angiography, is a new indication for CT, this literature review focused on the safety, effectiveness, and cost-effectiveness of MDCT angiography compared with coronary angiography in the diagnosis and management of people with CAD.
This review asked the following questions:
Is the most recent MDCT angiography effective in the imaging of the coronary arteries, compared with conventional angiography, to correctly diagnose significant (>50% lumen reduction) CAD?
What is the utility of MDCT angiography in the management and treatment of patients with CAD?
How does MDCT angiography in the management and treatment of patients with CAD affect long-term outcomes?
The published literature from January 2003 to January 31, 2005 was searched for articles that focused on the detection of coronary artery disease using 16-slice CT or faster, compared with coronary angiography. The search yielded 138 articles; however, 125 were excluded because they did not meet the inclusion criteria (comparison with coronary angiography, diagnostic accuracy measures calculated, and a sample size of 20 or more). As screening for CAD is not advised, studies that utilized MDCT for this purpose or studies that utilized MDCT without contrast media were also excluded. Overall, 13 studies were included in this review.
Summary of Findings
The published literature focused on 16-slice CT angiography for the detection of CAD. Two abstracts that were presented at the 2005 European Congress of Radiology meeting in Vienna compared 64-slice CT angiography with coronary angiography.
The 13 studies focussing on 16-slice CT angiography were stratified into 2 groups: Group 1 included 9 studies that focused on the detection of CAD in symptomatic patients, and Group 2 included 4 studies that examined the use of 16-slice CT angiography to detect disease progression after cardiac interventions. The 2 abstracts on 64-slice CT angiography were presented separately, but were not critically appraised due to the lack of information provided in the abstracts.
16-Slice Computed Tomography Angiography
The STARD initiative to evaluate the reporting quality of studies that focus on diagnostic tests was used. Overall the studies were relatively small (fewer than 100 people), and only about one-half recruited consecutive patients. Most studies reported inclusion criteria, but 5 did not report exclusion criteria. In these 5, the patients were highly selected; therefore, how representative they are of the general population of people with suspicion of CAD, or of those with disease progression after cardiac intervention, is questionable. In most studies, patients were either already taking, or were given, β-blockers to reduce their heart rates to improve image quality sufficiently. Only 6 of the 13 studies reported interobserver reliability quantitatively. The studies typically assessed the quality of the images obtained from 16-slice CT angiography, excluded those of poor quality, and compared the rest with the gold standard, coronary angiography. This practice necessarily inflated the diagnostic accuracy measures. Only 3 studies reported confidence intervals around their measures.
Evaluation of the studies in Group 1 reported variable sensitivity, from just over 60% to 96%, but a more stable specificity, at more than 95%. The false positive rate ranged from 5% to 8%, but the false negative rate was at best under 10% and at worst about 30%. This means that up to one-third of patients who have disease may be missed. These patients may therefore progress to a more severe level of disease and require more invasive procedures. The calculated positive and negative likelihood ratios across the studies suggested that 16-slice CT angiography may be useful to detect disease, but it is not useful to rule out disease. The prevalence of disease, measured by conventional coronary angiography, ranged from 50% to 80% across the studies in this review. Overall, 16-slice CT angiography may be useful, but there is no conclusive evidence to suggest that it is equivalent to or better than coronary angiography to detect CAD in symptomatic patients.
In the 4 studies in Group 2, sensitivity and specificity were both reported at more than 95% (except for 1 that reported sensitivity of about 80%). The positive and negative likelihood ratios suggested that the test might be useful to detect disease progression in patients who had cardiac interventions. However, 2 of the 4 studies recruited patients who had been asymptomatic since their intervention. As many of the patients studied were not symptomatic, the relevance of performing MDCT angiography in the patient population may be in question.
64-Slice Computed Tomography Angiography
An analysis of the interim results reported in the 2 abstracts revealed that 64-slice CT angiography did not perform sufficiently well compared to coronary angiography and may not be better than 16-slice CT angiography to detect CAD.
Conclusions
Cardiac imaging is a relatively new indication for CT. A systematic review of the literature was performed from 2003 to January 2005 to determine the effectiveness of MDCT angiography (16-slice and 64-slice) compared to coronary angiography to detect CAD. At the time of this report, there was no published literature on 64-slice CT for any indications.
Based on this review, the Medical Advisory Secretariat concluded that there is insufficient evidence to suggest that 16-slice or 64-slice CT angiography is equal to or better than coronary angiography to diagnose CAD in people with symptoms or to detect disease progression in patients who had previous cardiac interventions. An analysis of the evidence suggested that in investigating suspicion of CAD, a substantial number of patients would be missed. This means that these people would not be appropriately treated. These patients might progress to more severe disease and possibly more adverse events. Overall, the clinical utility of MDCT in patient management and long-term outcomes is unknown.
Based on the current evidence, it is unlikely that CT angiography will replace coronary angiography completely, but will probably be used adjunctively with other cardiac diagnostic tests until more definitive evidence is published.
If multi-slice CT scanners are used for coronary angiography in Ontario, using the current complement of CT scanners for this purpose will necessarily increase wait times for general CT scanning. It is unlikely that these newer-generation scanners will improve patient throughput, despite the claim that they are faster.
Screening for CAD in asymptomatic patients who have no history of ischemic heart disease using any modality is not advised, based on the World Health Organization criteria for screening. Therefore, this review did not examine the use of multi-slice CT for this purpose.
PMCID: PMC3382628  PMID: 23074474
24.  World Health Organization fracture risk assessment tool in the assessment of fractures after falls in hospital 
Background
Falls are very common accidents in a hospital. Various risk factors and risk assessment tools are used to predict falls. However, outcomes of falls such as bone fractures have not been considered in these risk assessment tools, and the performance of risk assessment tools in a Japanese hospital setting is not clear.
Methods
This was a retrospective single-institution study of 20,320 inpatients aged from 40 to 90 years who were admitted to a tertiary-care university hospital during the period from April 2006 to March 2009. Possible risk factors for falls and fractures, including the STRATIFY score and FRAX™ score, as well as information on falls and their outcomes, were obtained from the hospital information system. The datasets were divided randomly into a development dataset and a test dataset. The chi-square test, logistic regression analysis and survival analysis were used to identify risk factors for falls and fractures after falls.
Results
Fallers accounted for 3.1% of the patients in the development dataset and 3.5% of the patients in the test dataset, and 2.6% and 2.9% of the fallers in those datasets suffered peripheral fractures. Sensitivity and specificity of the STRATIFY score to predict falls were not optimal. Most of the known risk factors for falls had no power to predict fractures after falls. Multiple logistic analysis and multivariate Cox's regression analysis with time-dependent covariates revealed that FRAX™ score was significantly associated with fractures after falls.
Conclusions
Risk assessment tools for falls are not appropriate for predicting fractures after falls. FRAX™ might be a useful tool for that purpose. The performance of STRATIFY to predict falls in a Japanese hospital setting was similar to that in previous studies.
doi:10.1186/1472-6963-10-106
PMCID: PMC2868843  PMID: 20423520
25.  The diagnostic accuracy of the Patient Health Questionnaire-2 (PHQ-2), Patient Health Questionnaire-8 (PHQ-8), and Patient Health Questionnaire-9 (PHQ-9) for detecting major depression: protocol for a systematic review and individual patient data meta-analyses 
Systematic Reviews  2014;3:124.
Background
Major depressive disorder (MDD) may be present in 10%–20% of patients in medical settings. Routine depression screening is sometimes recommended to improve depression management. However, studies of the diagnostic accuracy of depression screening tools have typically used data-driven, exploratory methods to select optimal cutoffs. Often, these studies report results from a small range of cutoff points around whatever cutoff score is most accurate in that given study. When published data are combined in meta-analyses, estimates of accuracy for different cutoff points may be based on data from different studies, rather than data from all studies for each possible cutoff point. As a result, traditional meta-analyses may generate exaggerated estimates of accuracy. Individual patient data (IPD) meta-analyses can address this problem by synthesizing data from all studies for each cutoff score to obtain diagnostic accuracy estimates. The nine-item Patient Health Questionnaire-9 (PHQ-9) and the shorter PHQ-2 and PHQ-8 are commonly recommended for depression screening. Thus, the primary objectives of our IPD meta-analyses are to determine the diagnostic accuracy of the PHQ-9, PHQ-8, and PHQ-2 to detect MDD among adults across all potentially relevant cutoff scores. Secondary analyses involve assessing accuracy accounting for patient factors that may influence accuracy (age, sex, medical comorbidity).
Methods/design
Data sources will include MEDLINE, MEDLINE In-Process & Other Non-Indexed Citations, PsycINFO, and Web of Science. We will include studies that included a Diagnostic and Statistical Manual or International Classification of Diseases diagnosis of MDD based on a validated structured or semi-structured clinical interview administered within 2 weeks of the administration of the PHQ. Two reviewers will independently screen titles and abstracts, perform full article review, and extract study data. Disagreements will be resolved by consensus. Risk of bias will be assessed with the Quality Assessment of Diagnostic Accuracy Studies-2 tool. Bivariate random-effects meta-analysis will be conducted for the full range of plausible cutoff values.
Discussion
The proposed IPD meta-analyses will allow us to obtain estimates of the diagnostic accuracy of the PHQ-9, PHQ-8, and PHQ-2.
Systematic review registration
PROSPERO CRD42014010673
doi:10.1186/2046-4053-3-124
PMCID: PMC4218786  PMID: 25348422
Patient health questionnaire; PHQ-9; PHQ-8; PHQ-2; Depression; Screening; Diagnostic test accuracy; Systematic review; Individual patient data meta-analysis
