1.  Comparing Screening Instruments to Predict Posttraumatic Stress Disorder 
PLoS ONE  2014;9(5):e97183.
Background
Following traumatic exposure, a proportion of trauma victims develops posttraumatic stress disorder (PTSD). Early PTSD risk screening requires sensitive instruments to identify everyone at risk for developing PTSD in need of diagnostic follow-up.
Aims
This study compares the accuracy of the 4-item SPAN, 10-item Trauma Screening Questionnaire (TSQ) and 22-item Impact of Event Scale-Revised (IES-R) in predicting chronic PTSD at a minimum sensitivity of 80%.
Method
Injury patients admitted to a level-I trauma centre (N = 311) completed the instruments at a median of 23 days and were clinically assessed for PTSD at 6 months. Areas under the curve and specificities at 80% sensitivity were compared between instruments.
Results
Areas under the curve in all instruments were adequate (SPAN: 0.83; TSQ: 0.82; IES-R: 0.83) with no significant differences. At 80% sensitivity, specificities were 64% for SPAN, 59% for TSQ and 72% for IES-R.
Conclusion
The SPAN, TSQ and IES-R show similar accuracy in early detection of individuals at risk for PTSD, despite differences in number of items. The modest specificities and low positive predictive values found for all instruments could lead to relatively many false positive cases, when applied in clinical practice.
doi:10.1371/journal.pone.0097183
PMCID: PMC4016271  PMID: 24816642
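The link between modest specificity and low positive predictive value (PPV) in the record above follows from Bayes' rule. A minimal sketch, assuming a hypothetical PTSD prevalence of 10% (the abstract does not report the prevalence in this cohort); the helper name `ppv` and the constant `ASSUMED_PREVALENCE` are illustrative and not taken from the study.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value computed with Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Specificities at 80% sensitivity, as reported in the abstract.
reported_specificity = {"SPAN": 0.64, "TSQ": 0.59, "IES-R": 0.72}
ASSUMED_PREVALENCE = 0.10  # hypothetical value, not reported in the abstract

for name, spec in reported_specificity.items():
    print(f"{name}: PPV = {ppv(0.80, spec, ASSUMED_PREVALENCE):.2f}")
```

Under this assumed prevalence, all three instruments give a PPV below 0.25, so most screen-positive patients would be false positives.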
2.  Use of Expert Panels to Define the Reference Standard in Diagnostic Research: A Systematic Review of Published Methods and Reporting 
PLoS Medicine  2013;10(10):e1001531.
Loes C. M. Bertens and colleagues survey the published diagnostic research literature for use of expert panels to define the reference standard, characterize components and missing information, and recommend elements that should be reported in diagnostic studies.
Please see later in the article for the Editors' Summary
Background
In diagnostic studies, a single and error-free test that can be used as the reference (gold) standard often does not exist. One solution is the use of panel diagnosis, i.e., a group of experts who assess the results from multiple tests to reach a final diagnosis in each patient. Although panel diagnosis, also known as consensus or expert diagnosis, is frequently used as the reference standard, guidance on preferred methodology is lacking. The aim of this study is to provide an overview of methods used in panel diagnoses and to provide initial guidance on the use and reporting of panel diagnosis as reference standard.
Methods and Findings
PubMed was systematically searched for diagnostic studies applying a panel diagnosis as reference standard published up to May 31, 2012. We included diagnostic studies in which the final diagnosis was made by two or more persons based on results from multiple tests. General study characteristics and details of panel methodology were extracted. Eighty-one studies were included, of which most reported on psychiatric (37%) and cardiovascular (21%) diseases. Data extraction was hampered by incomplete reporting; one or more pieces of critical information about panel reference standard methodology were missing in 83% of studies. In most studies (75%), the panel consisted of three or fewer members. Panel members were blinded to the index test results in 31% of studies. Reproducibility of the decision process was assessed in 17 (21%) studies. Reported details on panel constitution, information for diagnosis and methods of decision making varied considerably between studies.
Conclusions
Methods of panel diagnosis varied substantially across studies and many aspects of the procedure were either unclear or not reported. On the basis of our review, we identified areas for improvement and developed a checklist and flow chart for initial guidance for researchers conducting and reporting studies involving panel diagnosis.
Editors' Summary
Background
Before any disease or condition can be treated, a correct diagnosis of the condition has to be made. Faced with a patient with medical problems and no diagnosis, a doctor will ask the patient about their symptoms and medical history and generally will examine the patient. On the basis of this questioning and examination, the clinician will form an initial impression of the possible conditions the patient may have, usually with a most likely diagnosis in mind. To support or reject the most likely diagnosis and to exclude the other possible diagnoses, the clinician will then order a series of tests and diagnostic procedures. These may include laboratory tests (such as the measurement of blood sugar levels), imaging procedures (such as an MRI scan), or functional tests (such as spirometry, which tests lung function). Finally, the clinician will use all the data s/he has collected to reach a firm diagnosis and will recommend a program of treatment or observation for the patient.
Why Was This Study Done?
Researchers are continually looking for new, improved diagnostic tests and multivariable diagnostic models—combinations of tests and characteristics that point to a diagnosis. Diagnostic research, which assesses the accuracy of new tests and models, requires that each patient involved in a diagnostic study has a final correct diagnosis. Unfortunately, for most conditions, there is no single, error-free test that can be used as the reference (gold) standard for diagnosis. If an imperfect reference standard is used, errors in the final disease classification may bias the results of the diagnostic study and may lead to a new test being adopted that is actually less accurate than existing tests. One widely used solution to the lack of a reference standard is “panel diagnosis” in which two or more experts assess the results from multiple tests to reach a final diagnosis for each patient in a diagnostic study. However, there is currently no formal guidance available on the conduct and reporting of panel diagnosis. Here, the researchers undertake a systematic review (a study that uses predefined criteria to identify research on a given topic) to provide an overview of the methodology and reporting of panel diagnosis.
What Did the Researchers Do and Find?
The researchers identified 81 published diagnostic studies that used panel diagnosis as a reference standard. 37% of these studies reported on psychiatric diseases, 21% reported on cardiovascular diseases, and 12% reported on respiratory diseases. Most of the studies (64%) were designed to assess the accuracy of one or more diagnostic tests. Notably, one or more critical pieces of information on methodology were missing in 83% of the studies. Specifically, information on the constitution of the panel was missing in a quarter of the studies and information on the decision-making process (whether, for example, a diagnosis was reached by discussion among panel members or by combining individual panel members' assessments) was incomplete in more than two-thirds of the studies. In three-quarters of the studies for which information was available, the panel consisted of only two or three members; different fields of expertise were represented in the panels in nearly two-thirds of the studies. In a third of the studies for which information was available, panel members made their diagnoses without access to the results of the test being assessed. Finally, the reproducibility of the decision-making process was assessed in a fifth of the studies.
What Do These Findings Mean?
These findings indicate that the methodology of panel diagnosis varies substantially among diagnostic studies and that reporting of this methodology is often unclear or absent. Both the methodology and reporting of panel diagnosis could, therefore, be improved substantially. Based on their findings, the researchers provide a checklist and flow chart to help guide the conduct and reporting of studies involving panel diagnosis. For example, they suggest that, when designing a study that uses panel diagnosis as the reference standard, the number and background of panel members should be considered, and they provide a list of options that should be considered when planning the decision-making process. Although more research into each of the options identified by the researchers is needed, their recommendations provide a starting point for the development of formal guidelines on the methodology and reporting of panel diagnosis for use as a reference standard in diagnostic research.
Additional Information
Please access these Web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001531.
Wikipedia has a page on medical diagnosis (note: Wikipedia is a free online encyclopedia that anyone can edit; available in several languages)
The Equator Network is an international initiative that seeks to improve the reliability and value of medical research literature by promoting transparent and accurate reporting of research studies; its website includes information on a wide range of reporting guidelines, including the STAndards for the Reporting of Diagnostic accuracy studies (STARD), an initiative that aims to improve the accuracy and completeness of reporting of studies of diagnostic accuracy
doi:10.1371/journal.pmed.1001531
PMCID: PMC3797139  PMID: 24143138
3.  Development and validation of a model to predict the risk of exacerbations in chronic obstructive pulmonary disease 
Purpose
Prediction models for exacerbations in patients with chronic obstructive pulmonary disease (COPD) are scarce. Our aim was to develop and validate a new model to predict exacerbations in patients with COPD.
Patients and methods
The derivation cohort consisted of patients aged 65 years or over, with a COPD diagnosis, who were followed up over 24 months. The external validation cohort consisted of another cohort of COPD patients, aged 50 years or over. Exacerbations of COPD were defined as symptomatic deterioration requiring pulsed oral steroid use or hospitalization. Logistic regression analysis, including backward selection and shrinkage, was used to develop the final model and to adjust for overfitting. The adjusted regression coefficients were applied in the validation cohort to assess calibration of the predictions and to calculate changes in discrimination using C-statistics.
Results
The derivation and validation cohort consisted of 240 and 793 patients with COPD, of whom 29% and 28%, respectively, experienced an exacerbation during follow-up. The final model included four easily assessable variables: exacerbations in the previous year, pack years of smoking, level of obstruction, and history of vascular disease, with a C-statistic of 0.75 (95% confidence interval [CI]: 0.69–0.82). Predictions were well calibrated in the validation cohort, with a small loss in discrimination potential (C-statistic 0.66 [95% CI 0.61–0.71]).
Conclusion
Our newly developed prediction model can help clinicians to predict the risk of future exacerbations in individual patients with COPD, including those with mild disease.
doi:10.2147/COPD.S49609
PMCID: PMC3797610  PMID: 24143086
exacerbation of COPD; risk prediction; external validation; vascular disease
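The C-statistic used in the record above to express discrimination can be read as the proportion of (event, non-event) patient pairs in which the patient with the event received the higher predicted risk. A minimal sketch of that pairwise definition, using made-up toy data rather than study data; the function name `c_statistic` is illustrative.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Concordance: share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for event_risk, non_event_risk in product(events, non_events):
        if event_risk > non_event_risk:
            score += 1.0
        elif event_risk == non_event_risk:
            score += 0.5
    return score / (len(events) * len(non_events))


# Toy predicted risks and observed exacerbations (1 = exacerbation); not study data.
predicted = [0.10, 0.40, 0.35, 0.80, 0.20]
observed = [0, 0, 1, 1, 0]
print(round(c_statistic(predicted, observed), 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.75 (derivation) and 0.66 (validation) should be read.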
4.  The impact of the HEART risk score in the early assessment of patients with acute chest pain: design of a stepped wedge, cluster randomised trial 
Background
Chest pain remains a diagnostic challenge: physicians do not want to miss an acute coronary syndrome (ACS), but they also wish to avoid unnecessary additional diagnostic procedures. In approximately 75% of the patients presenting with chest pain at the emergency department (ED) there is no underlying cardiac cause. Therefore, diagnostic strategies focus on identifying patients in whom an ACS can be safely ruled out based on findings from history, physical examination and early cardiac marker measurement. The HEART score, a clinical prediction rule, was developed to provide the clinician with a simple, early and reliable predictor of cardiac risk. We set out to quantify the impact of the use of the HEART score in daily practice on patient outcomes and costs.
Methods/Design
We designed a prospective, multi-centre, stepped wedge, cluster randomised trial. Our aim is to include a total of 6600 unselected chest pain patients presenting at the ED in 10 Dutch hospitals during an 11-month period. All clusters (i.e. hospitals) start with a period of ‘usual care’ and are randomised in their timing when to switch to ‘intervention care’. The latter involves the calculation of the HEART score in each patient to guide clinical decision; notably reassurance and discharge of patients with low scores and intensive monitoring and early intervention in patients with high HEART scores. Primary outcome is occurrence of major adverse cardiac events (MACE), including acute myocardial infarction, revascularisation or death within 6 weeks after presentation. Secondary outcomes include occurrence of MACE in low-risk patients, quality of life, use of health care resources and costs.
Discussion
Stepped wedge designs are increasingly used to evaluate the real-life effectiveness of non-pharmacological interventions because of the following potential advantages: (a) each hospital has both a usual care and an intervention period, therefore, outcomes can be compared within and across hospitals; (b) each hospital will have an intervention period which enhances participation in case of a promising intervention; (c) all hospitals generate data about potential implementation problems. This large impact trial will generate evidence whether the anticipated benefits (in terms of safety and cost-effectiveness) of using the HEART score will indeed be achieved in real-life clinical practice.
Trial registration
ClinicalTrials.gov 80-82310-97-12154.
doi:10.1186/1471-2261-13-77
PMCID: PMC3849098  PMID: 24070098
HEART score; Chest pain; Clinical prediction rule; Risk score implementation; Impact; Stepped wedge design; Cluster randomised trial
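The stepped wedge allocation described in the record above (every hospital starts with usual care and is randomised only in the timing of its switch) can be sketched as follows. This assumes one hospital crosses over per month after the first month, which fits 10 hospitals in 11 months but is not stated in the abstract; the hospital labels and the function name `stepped_wedge_schedule` are hypothetical.

```python
import random


def stepped_wedge_schedule(clusters, n_periods, seed=0):
    """Map each cluster to its condition per period, with a randomised crossover order."""
    if len(clusters) != n_periods - 1:
        raise ValueError("this sketch assumes one crossover per period after the first")
    order = list(clusters)
    random.Random(seed).shuffle(order)  # randomise only the timing of the switch
    schedule = {}
    for step, cluster in enumerate(order, start=1):
        # `step` periods of usual care, then intervention care until the end.
        schedule[cluster] = ["usual"] * step + ["intervention"] * (n_periods - step)
    return schedule


hospitals = [f"hospital_{i}" for i in range(1, 11)]  # hypothetical labels
schedule = stepped_wedge_schedule(hospitals, n_periods=11)
for name in hospitals:
    print(name, "".join("U" if c == "usual" else "I" for c in schedule[name]))
```

Each row starts with at least one usual-care period and ends with at least one intervention period, which is what allows the within-hospital comparison mentioned in the Discussion.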
5.  Internet-Based Early Intervention to Prevent Posttraumatic Stress Disorder in Injury Patients: Randomized Controlled Trial 
Background
Posttraumatic stress disorder (PTSD) develops in 10-20% of injury patients. We developed a novel, self-guided Internet-based intervention (called Trauma TIPS) based on techniques from cognitive behavioral therapy (CBT) to prevent the onset of PTSD symptoms.
Objective
To determine whether Trauma TIPS is effective in preventing the onset of PTSD symptoms in injury patients.
Methods
Adult, level 1 trauma center patients were randomly assigned to receive the fully automated Trauma TIPS Internet intervention (n=151) or to receive no early intervention (n=149). Trauma TIPS consisted of psychoeducation, in vivo exposure, and stress management techniques. Both groups were free to use care as usual (nonprotocolized talks with hospital staff). PTSD symptom severity was assessed at 1, 3, 6, and 12 months post injury with a clinical interview (Clinician-Administered PTSD Scale) by blinded trained interviewers and self-report instrument (Impact of Event Scale—Revised). Secondary outcomes were acute anxiety and arousal (assessed online), self-reported depressive and anxiety symptoms (Hospital Anxiety and Depression Scale), and mental health care utilization. Intervention usage was documented.
Results
The mean number of intervention logins was 1.7, SD 2.5, median 1, interquartile range (IQR) 1-2. Thirty-four patients in the intervention group did not log in (22.5%), 63 (41.7%) logged in once, and 54 (35.8%) logged in multiple times (mean 3.6, SD 3.5, median 3, IQR 2-4). On clinician-assessed and self-reported PTSD symptoms, both the intervention and control group showed a significant decrease over time (P<.001) without significant differences in trend. PTSD at 12 months was diagnosed in 4.7% of controls and 4.4% of intervention group patients. There were no group differences on anxiety or depressive symptoms over time. Post hoc analyses using latent growth mixture modeling showed a significant decrease in PTSD symptoms in a subgroup of patients with severe initial symptoms (n=20) (P<.001).
Conclusions
Our results do not support the efficacy of the Trauma TIPS Internet-based early intervention in the prevention of PTSD symptoms for an unselected population of injury patients. Moreover, uptake was relatively low since one-fifth of individuals did not log in to the intervention. Future research should therefore focus on innovative strategies to increase intervention usage, for example, adding gameplay, embedding it in a blended care context, and targeting high-risk individuals who are more likely to benefit from the intervention.
Trial Registration
International Standard Randomized Controlled Trial Number (ISRCTN): 57754429; http://www.controlled-trials.com/ISRCTN57754429 (Archived by WebCite at http://webcitation.org/6FeJtJJyD).
doi:10.2196/jmir.2460
PMCID: PMC3742408  PMID: 23942480
early intervention; prevention; Internet; posttraumatic stress disorder; cognitive behavior therapy
6.  Variation of a test’s sensitivity and specificity with disease prevalence 
Background:
Anecdotal evidence suggests that the sensitivity and specificity of a diagnostic test may vary with disease prevalence. Our objective was to investigate the associations between disease prevalence and test sensitivity and specificity using studies of diagnostic accuracy.
Methods:
We used data from 23 meta-analyses, each of which included 10–39 studies (416 total). The median prevalence per review ranged from 1% to 77%. We evaluated the effects of prevalence on sensitivity and specificity using a bivariate random-effects model for each meta-analysis, with prevalence as a covariate. We estimated the overall effect of prevalence by pooling the effects using the inverse variance method.
Results:
Within a given review, a change in prevalence from the lowest to highest value resulted in a corresponding change in sensitivity or specificity from 0 to 40 percentage points. This effect was statistically significant (p < 0.05) for either sensitivity or specificity in 8 meta-analyses (35%). Overall, specificity tended to be lower with higher disease prevalence; there was no such systematic effect for sensitivity.
Interpretation:
The sensitivity and specificity of a test often vary with disease prevalence; this effect is likely to be the result of mechanisms, such as patient spectrum, that affect prevalence, sensitivity and specificity. Because it may be difficult to identify such mechanisms, clinicians should use prevalence as a guide when selecting studies that most closely match their situation.
doi:10.1503/cmaj.121286
PMCID: PMC3735771  PMID: 23798453
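The inverse variance pooling named in the Methods of the record above weights each per-review effect by the reciprocal of its variance. A minimal sketch of the fixed-effect form (the abstract does not say whether a fixed-effect or random-effects variant was used, so this is an assumption), with toy numbers rather than study data; the function name `inverse_variance_pool` is illustrative.

```python
import math


def inverse_variance_pool(estimates, standard_errors):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Toy per-review effects of prevalence on specificity; not study data.
effects = [0.2, -0.1]
ses = [0.1, 0.2]
pooled, pooled_se = inverse_variance_pool(effects, ses)
print(f"pooled effect = {pooled:.2f}, SE = {pooled_se:.3f}")
```

The more precise estimate (smaller standard error) dominates the pooled value, which is the intended behaviour of the method.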
7.  A decision rule to aid selection of patients with abdominal sepsis requiring a relaparotomy 
BMC Surgery  2013;13:28.
Background
Accurate and timely identification of patients in need of a relaparotomy is challenging since there are no readily available strongholds. The aim of this study is to develop a prediction model to aid the decision-making process in whom to perform a relaparotomy.
Methods
Data from a randomized trial comparing surgical strategies for relaparotomy were used. Variables were selected based on previous reports and common clinical sense and screened in a univariable regression analysis to identify those associated with the need for relaparotomy. Variables with the strongest association were considered for the prediction model which was constructed after backward elimination in a multivariable regression analysis. The discriminatory capacity of the model was expressed with the area under the curve (AUC). A cut-off analysis was performed to illustrate the consequences in clinical practice.
Results
One hundred and eighty-two patients were included; 46 were considered cases requiring a relaparotomy. A prediction model was built containing 6 variables. This final model had an AUC of 0.80, indicating good discriminatory capacity. However, acceptable sensitivity would require a low threshold for relaparotomy, leading to an unacceptable rate of negative relaparotomies (63%). Therefore, the prediction model was incorporated in a decision rule where the interval until re-assessment and the use of computed tomography are related to the outcome of the model.
Conclusions
To construct a prediction model that will provide a definite answer whether or not to perform a relaparotomy seems a utopia. However, our prediction model can be used to stratify patients on their underlying risk and could guide further monitoring of patients with abdominal sepsis in order to identify patients with suspected ongoing peritonitis in a timely fashion.
doi:10.1186/1471-2482-13-28
PMCID: PMC3750491  PMID: 23870702
Secondary peritonitis; Abdominal sepsis; Relaparotomy; On-demand; Prediction model; Decision rule
8.  Serum Mesothelin for Diagnosing Malignant Pleural Mesothelioma: An Individual Patient Data Meta-Analysis 
Journal of Clinical Oncology  2012;30(13):1541-1549.
Purpose
Mesothelin is currently considered the best available serum biomarker of malignant pleural mesothelioma. To examine the diagnostic accuracy and use of serum mesothelin in early diagnosis, we performed an individual patient data (IPD) meta-analysis.
Methods
The literature search identified 16 diagnostic studies of serum mesothelin, measured with the Mesomark enzyme-linked immunosorbent assay. IPD of 4,491 individuals were collected, including several control groups and 1,026 patients with malignant pleural mesothelioma. Mesothelin levels were standardized for between-study differences and age, after which the diagnostic accuracy and the factors affecting it were examined with receiver operating characteristic (ROC) regression analysis.
Results
At a common diagnostic threshold of 2.00 nmol/L, the sensitivities and specificities of mesothelin in the different studies ranged widely from 19% to 68% and 88% to 100%, respectively. This heterogeneity can be explained by differences in study population, because type of control group, mesothelioma stage, and histologic subtype significantly affected the diagnostic accuracy. The use of mesothelin in early diagnosis was evaluated by differentiating 217 patients with stage I or II epithelioid and biphasic mesothelioma from 1,612 symptomatic or high-risk controls. The resulting area under the ROC curve was 0.77 (95% CI, 0.73 to 0.81). At 95% specificity, mesothelin displayed a sensitivity of 32% (95% CI, 26% to 40%).
Conclusion
In patients suspected of having mesothelioma, a positive blood test for mesothelin at a high-specificity threshold is a strong incentive to urge further diagnostic steps. However, the poor sensitivity of mesothelin clearly limits its added value to early diagnosis and emphasizes the need for further biomarker research.
doi:10.1200/JCO.2011.39.6671
PMCID: PMC3383122  PMID: 22412141
9.  Triage of frail elderly with reduced exercise tolerance in primary care (TREE): a clustered randomized diagnostic study 
BMC Public Health  2012;12:385.
Background
Reduced exercise tolerance and breathlessness are common in the elderly and can result in substantial loss in functionality and health related quality of life. Heart failure (HF) and chronic obstructive pulmonary disease (COPD) are common underlying causes, but can be difficult to disentangle due to overlap in symptomatology. In addition, other potential causes such as obesity, anaemia, renal dysfunction and thyroid disorders may be involved.
We aim to assess whether screening of frail elderly with reduced exercise tolerance leads to high detection rates of HF, COPD, or alternative diagnoses, and whether detection of these diseases would result in changes in patient management and increase in both functionality and quality of life.
Methods/Design
A cluster randomized diagnostic trial. Primary care practices are randomized to the diagnostic-treatment strategy (screening) or care as usual.
Patient population: Frail (defined as having three or more chronic or vitality threatening diseases and/or receiving five or more drugs chronically during the last year) community-dwelling persons aged 65 years and older selected from the electronic medical files of the participating general practitioners. Those with reduced exercise tolerance or moderate to severe dyspnoea (≥2 score on the Medical Research Council dyspnoea scale) are included in the study.
The diagnostic screening in the intervention group includes history taking, physical examination, electrocardiography, spirometry, blood tests, and echocardiography. Subsequently, participants with new diagnoses will be managed according to clinical guidelines. Participants in the control arm receive care as usual. All participants fill out health status and other relevant questionnaires at baseline and after 6 months of follow-up.
Discussion
This study will generate information on the yield of screening for previously unrecognized HF, COPD and other chronic diseases in frail elderly with reduced exercise tolerance and/or exercise induced dyspnoea. The cluster randomized comparison will reveal whether this yield will result in subsequent improvements in functional health and/or health related quality of life.
Trial registration
ClinicalTrials.gov NCT01148719
doi:10.1186/1471-2458-12-385
PMCID: PMC3407748  PMID: 22640176
Reduced exercise tolerance; Dyspnoea; Breathlessness; Heart failure; COPD; Frail; Elderly; Screening
10.  Validation of a Dutch Risk Score Predicting Poor Outcome in Adults with Bacterial Meningitis in Vietnam and Malawi 
PLoS ONE  2012;7(3):e34311.
We have previously developed and validated a prognostic model to predict the risk for unfavorable outcome in Dutch adults with bacterial meningitis. The aim of the current study was to validate this model in adults with bacterial meningitis from two developing countries, Vietnam and Malawi. Demographic and clinical characteristics of Vietnamese (n = 426) and Malawian (n = 465) patients differed substantially from those of Dutch patients (n = 696). The Dutch model underestimated the risk of poor outcome in both Malawi and Vietnam. The discrimination of the original model (c-statistic [c] 0.84; 95% confidence interval 0.81 to 0.86) fell considerably when re-estimated in the Vietnam cohort (c = 0.70) or in the Malawian cohort (c = 0.68). Our validation study shows that new prognostic models have to be developed for these countries in a sufficiently large series of unselected patients.
doi:10.1371/journal.pone.0034311
PMCID: PMC3314623  PMID: 22470555
11.  Internet-based prevention of posttraumatic stress symptoms in injured trauma patients: design of a randomized controlled trial 
European Journal of Psychotraumatology  2011;2:10.3402/ejpt.v2i0.8294.
Background
Injured trauma victims are at risk of developing Posttraumatic Stress Disorder (PTSD) and other post-trauma psychopathology. So far, interventions using cognitive behavioral techniques (CBT) have proven most efficacious in treating early PTSD in highly symptomatic individuals. No early intervention for the prevention of PTSD for all victims has yet proven effective. In the acute psychosocial care for trauma victims, there is a clear need for easily applicable, accessible, cost-efficient early interventions.
Objective
To describe the design of a randomized controlled trial (RCT) evaluating the effectiveness of a brief Internet-based early intervention that incorporates CBT techniques with the aim of reducing acute psychological distress and preventing long-term PTSD symptoms in injured trauma victims.
Method
In a two-armed RCT, 300 injured trauma victims from two Level-1 trauma centers in Amsterdam, the Netherlands, will be assigned to an intervention or a control group. Inclusion criteria are: being 18 years of age or older, having experienced a traumatic event according to the diagnostic criteria of the DSM-IV and understanding the Dutch language. The intervention group will be given access to the intervention's website (www.traumatips.nl) and will be specifically requested to log in within the first month postinjury. The primary clinical study outcome is PTSD symptom severity. Secondary outcomes include symptoms of depression and anxiety, quality of life, and social support. In addition, a cost-effectiveness analysis of the intervention will be performed. Data are collected at one week post-injury, prior to first login (baseline), and at 1, 3, 6 and 12 months. Analyses will be on an intention-to-treat basis.
Discussion
The results will provide more insight into the effects of preventive interventions in general, and Internet-based early interventions specifically, on acute stress reactions and PTSD, in an injured population, during the acute phase after trauma. We will discuss possible strengths and limitations.
doi:10.3402/ejpt.v2i0.8294
PMCID: PMC3402131  PMID: 22893814
injury; trauma; early intervention; prevention; Internet; e-Mental Health; PTSD; cognitive behavioral therapy (CBT)
12.  Costs of relaparotomy on-demand versus planned relaparotomy in patients with severe peritonitis: an economic evaluation within a randomized controlled trial 
Critical Care  2010;14(3):R97.
Introduction
Results of the first randomized trial comparing on-demand versus planned-relaparotomy strategy in patients with severe peritonitis (RELAP trial) indicated no clear differences in primary outcomes. We now report the full economic evaluation for this trial, including detailed methods, nonmedical costs, further differentiated cost calculations, and robustness of different assumptions in sensitivity analyses.
Methods
An economic evaluation was conducted from a societal perspective alongside a randomized controlled trial in 229 patients with severe secondary peritonitis and an acute physiology and chronic health evaluation (APACHE)-II score ≥11 from two academic and five regional teaching hospitals in the Netherlands. After the index laparotomy, patients were randomly allocated to an on-demand or a planned-relaparotomy strategy. Primary resource-utilization data were used to estimate mean total costs per patient during the index admission and after discharge until 1 year after the index operation. Overall differences in costs between the on-demand relaparotomy strategy and the planned strategy, as well as relative differences across several clinical subgroups, were evaluated.
Results
Costs were substantially lower in the on-demand group (mean, €65,768 versus €83,450 per patient in the planned group; mean absolute difference, €17,682; 95% CI, €5,062 to €29,004). Relative differences in mean total costs per patient (approximately 21%) were robust to various alternative assumptions. Planned relaparotomy consistently generated more costs across the whole range of different courses of disease (quick recovery and few resources used on one end of the spectrum; slow recovery and many resources used on the other end). This difference in costs between the two surgical strategies also did not vary significantly across several clinical subgroups.
Conclusions
The reduction in societal costs renders the on-demand strategy a more-efficient relaparotomy strategy in patients with severe peritonitis. These differences were found across the full range of healthcare resources as well as across patients with different courses of disease.
Trial Registration
ISRCTN51729393
doi:10.1186/cc9032
PMCID: PMC2911734  PMID: 20507557
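The cost figures in the record above can be checked with simple arithmetic. A minimal sketch, assuming the "approximately 21%" relative difference is expressed as a share of the planned-group mean (the abstract does not name the denominator); variable names are illustrative.

```python
# Mean total costs per patient in euros, as reported in the abstract.
planned_cost = 83_450
on_demand_cost = 65_768

absolute_difference = planned_cost - on_demand_cost
# Assumption: the relative difference is taken against the planned-group mean.
relative_difference = absolute_difference / planned_cost

print(f"absolute difference: EUR {absolute_difference:,}")
print(f"relative difference: {relative_difference:.1%}")
```

The absolute difference reproduces the reported €17,682, and the ratio rounds to 21%, consistent with the abstract.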
13.  Polyp measurement based on CT colonography and colonoscopy: variability and systematic differences 
European Radiology  2009;20(6):1404-1413.
Objective
To assess the variability and systematic differences in polyp measurements on optical colonoscopy and CT colonography.
Materials
Gastroenterologists measured 51 polyps by visual estimation, forceps comparison and linear probe. CT colonography observers randomly assessed polyp size two-dimensionally (abdominal and intermediate window) and three-dimensionally (manually and semi-automatically). Linear mixed models were used to assess the variability and systematic differences between CT colonography and optical colonoscopy techniques.
Results
The variability of forceps and linear probe measurements was comparable and both showed less variability than measurement by visual assessment. Measurements by linear probe were 0.7 mm smaller than measurements by visual assessment or by forceps. The variability of all CT colonography techniques was lower than for measurements by forceps or visual assessment and sometimes lower (only 2D intermediate window and manual 3D) compared with measurements by linear probe. All CT colonography measurements judged polyps to be larger than optical colonoscopy, with differences ranging from 0.7 to 2.3 mm.
Conclusion
A linear probe does not reduce the measurement variability of endoscopists compared with the forceps. Measurement differences between observers on CT colonography were usually smaller than at optical colonoscopy. Polyps appeared larger when using various CT colonography techniques than when measured during optical colonoscopy.
doi:10.1007/s00330-009-1683-0
PMCID: PMC2861761  PMID: 20033180
CT colonography; Colon; Colonoscopy; Measurement; Cancer; 2D; 3D
14.  Hyperglycemia in bacterial meningitis: a prospective cohort study 
Background
Hyperglycemia has been associated with unfavorable outcome in several disorders, but few data are available in bacterial meningitis. We assessed the incidence and significance of hyperglycemia in adults with bacterial meningitis.
Methods
We collected data prospectively between October 1998 and April 2002, on 696 episodes of community-acquired bacterial meningitis, confirmed by culture of CSF in patients >16 years. Patients were dichotomized according to blood glucose level on admission. A cutoff random non-fasting blood glucose level of 7.8 mmol/L (140 mg/dL) was used to define hyperglycemia, and a cutoff random non-fasting blood glucose level of 11.1 mmol/L (200 mg/dL) was used to define severe hyperglycemia. Unfavorable outcome was defined on the Glasgow outcome scale as a score <5. We also evaluated characteristics of patients with a preadmission diagnosis of diabetes mellitus.
Results
Of the patients, 69% were hyperglycemic and 25% severely hyperglycemic on admission. Compared with non-hyperglycemic patients, hyperglycemic patients were older (median, 55 yrs vs. 44 yrs, P < 0.0001) and more often had a preadmission diagnosis of diabetes (9% vs. 3%, P = 0.005) and a distant focus of infection (37% vs. 28%, P = 0.02). They were also more often admitted in coma (16% vs. 8%; P = 0.004) and with pneumococcal meningitis (55% vs. 42%, P = 0.007). These differences remained significant after exclusion of patients with known diabetes. Hyperglycemia was associated with unfavorable outcome in a univariate analysis, but this association did not remain robust in a multivariate analysis. Factors predictive of neurologic compromise were associated with higher blood glucose levels, whereas factors predictive of systemic compromise were associated with lower blood glucose levels. Only a minority of severely hyperglycemic patients were known diabetics (19%). The vast majority of these known diabetic patients had meningitis due to Streptococcus pneumoniae (67%) or Listeria monocytogenes (13%) and they were at high risk for unfavorable outcome (52%).
Conclusion
The majority of patients with bacterial meningitis have hyperglycemic blood glucose levels on admission. Hyperglycemia can be explained by a physical stress reaction, by the central nervous system insult leading to disturbed blood-glucose regulation mechanisms, and by the predisposition of diabetic patients to pneumococcal meningitis. Patients with diabetes and bacterial meningitis are at high risk for unfavorable outcome.
doi:10.1186/1471-2334-9-57
PMCID: PMC2694198  PMID: 19426501
15.  Regional perinatal mortality differences in the Netherlands; care is the question 
BMC Public Health  2009;9:102.
Background
Perinatal mortality is an important indicator of health. European comparisons of perinatal mortality show an unfavourable position for the Netherlands. Our objective was to study regional variation in perinatal mortality within the Netherlands and to identify possible explanatory factors for the differences found.
Methods
Our study population comprised all singleton births (904,003) derived from the Netherlands Perinatal Registry for the period 2000–2004. Perinatal mortality, including stillbirth from 22+0 weeks gestation and early neonatal death (0–6 days), was our main outcome measure. Differences in perinatal mortality were calculated between 4 distinct geographical regions (North, East, South and West). We tried to explain regional differences by adjustment for the demographic factors maternal age, parity and ethnicity and by socio-economic status and urbanisation degree using logistic modelling. In addition, regional differences in mode of delivery and risk selection were analysed as health care factors. Finally, perinatal mortality was analysed among five distinct clinical risk groups based on the mediating risk factors gestational age and congenital anomalies.
Results
Overall perinatal mortality was 10.1 per 1,000 total births over the period 2000–2004. Perinatal mortality was elevated in the northern region (11.2 per 1,000 total births). Perinatal mortality in the eastern, western and southern region was 10.2, 10.1 and 9.6 per 1,000 total births respectively. Adjustment for demographic factors increased the perinatal mortality risk in the northern region (odds ratio 1.20, 95% CI 1.12–1.28, compared to reference western region); subsequent adjustment for socio-economic status and urbanisation explained a small part of the elevated risk (odds ratio 1.11, 95% CI 1.03–1.20). Risk group analysis showed that regional differences were absent among very preterm births (22+0 – 25+6 weeks gestation) and most prominent among births from 32+0 gestation weeks onwards and among children with severe congenital anomalies. Among term births (≥ 37+0 weeks), regional mortality differences were largest for births in women transferred from low to high risk during delivery.
Conclusion
Regional differences in perinatal mortality exist in the Netherlands. These differences could not be explained by demographic or socio-economic factors; however, clinical risk group analysis showed indications for a role of health care factors.
doi:10.1186/1471-2458-9-102
PMCID: PMC2674436  PMID: 19366460
16.  Transanal endoscopic microsurgery versus endoscopic mucosal resection for large rectal adenomas (TREND-study) 
BMC Surgery  2009;9:4.
Background
Recent non-randomized studies suggest that extended endoscopic mucosal resection (EMR) is equally effective in removing large rectal adenomas as transanal endoscopic microsurgery (TEM). If equally effective, EMR might be a more cost-effective approach as this strategy does not require expensive equipment, general anesthesia and hospital admission. Furthermore, EMR appears to be associated with fewer complications.
The aim of this study is to compare the cost-effectiveness and cost-utility of TEM and EMR for the resection of large rectal adenomas.
Methods/design
Multicenter randomized trial among 15 hospitals in the Netherlands. Patients with a rectal adenoma ≥ 3 cm, located between 1–15 cm ab ano, will be randomized to a TEM- or EMR-treatment strategy. For TEM, patients will be treated under general anesthesia, adenomas will be dissected en-bloc by a full-thickness excision, and patients will be admitted to the hospital. For EMR, either no sedation or conscious sedation is used, lesions will be resected through the submucosal plane in a piecemeal fashion, and patients will be discharged from the hospital. Residual adenoma that is visible during the first surveillance endoscopy at 3 months will be removed endoscopically in both treatment strategies and is considered as part of the primary treatment.
Primary outcome measure is the proportion of patients with recurrence after 3 months. Secondary outcome measures are: 2) number of days not spent in hospital from initial treatment until 2 years afterwards; 3) major and minor morbidity; 4) disease specific and general quality of life; 5) anorectal function; 6) health care utilization and costs. A cost-effectiveness and cost-utility analysis of EMR against TEM for large rectal adenomas will be performed from a societal perspective with respectively the costs per recurrence free patient and the cost per quality adjusted life year as outcome measures.
Based on comparable recurrence rates for TEM and EMR of 3.3% and considering an upper-limit of 10% for EMR to be non-inferior (beta-error 0.2 and one-sided alpha-error 0.05), 89 patients are needed per group.
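The sample size statement above can be approximated with a standard normal-approximation formula for a non-inferiority comparison of two proportions. The sketch below is only an illustration: the abstract does not say which formula or software the authors used, and the function name `noninferiority_n` and its parameters are hypothetical. Under these assumptions the result comes out at about 88 per group, close to but not identical to the reported 89.

```python
from math import ceil
from statistics import NormalDist


def noninferiority_n(p_expected, p_limit, alpha=0.05, power=0.80):
    """Per-group sample size, normal approximation (assumed method, not the authors').

    p_expected: recurrence rate assumed in both arms (3.3% in the abstract)
    p_limit:    upper limit still regarded as non-inferior (10% in the abstract)
    alpha:      one-sided type I error
    power:      1 - beta
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    margin = p_limit - p_expected
    # Variance of the difference between two independent proportions (per unit n)
    variance = 2 * p_expected * (1 - p_expected)
    return ceil((z_alpha + z_beta) ** 2 * variance / margin ** 2)


n_per_group = noninferiority_n(0.033, 0.10)
print(n_per_group)
```

Small differences from the published figure are to be expected, since exact methods, continuity corrections or rounding conventions can each shift the result by a patient or two.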
Discussion
The TREND study is the first randomized trial evaluating whether TEM or EMR is more cost-effective for the treatment of large rectal adenomas.
Trial registration number
(trialregister.nl) NTR1422
doi:10.1186/1471-2482-9-4
PMCID: PMC2664790  PMID: 19284647
17.  Ignoring Dependency between Linking Variables and Its Impact on the Outcome of Probabilistic Record Linkage Studies 
Objectives
This study sought to examine the differences between ignoring (naïve) and incorporating dependency (nonnaïve) among linkage variables on the outcome of a probabilistic record linkage study.
Design and Measurements
We used the outcomes of a previously developed probabilistic linkage procedure for different registries in perinatal care assuming independence among linkage variables. We estimated the impact of ignoring dependency by re-estimating the linkage weights after constructing a variable that combines the outcomes of the comparison of 2 correlated linking variables. The results of the original naïve and the new nonnaïve strategy were systematically compared for 3 scenarios: the empirical dataset using 9 variables, the empirical dataset using 5 variables, and a simulated dataset using 5 variables.
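The contrast between the two strategies can be sketched with the usual log2(m/u) agreement weight from the Fellegi-Sunter framework, which the abstract does not name explicitly, so treating it as the weight definition is an assumption. All probabilities below are hypothetical and chosen only to show the direction of the effect; they are not taken from the study.

```python
from math import log2


def agreement_weight(m, u):
    """Agreement weight: log2 of P(agree | match) / P(agree | nonmatch)."""
    return log2(m / u)


# Hypothetical agreement probabilities for two correlated linking variables A and B
m_a, u_a = 0.95, 0.01
m_b, u_b = 0.95, 0.02

# Naive strategy: assume independence and add the two separate weights
naive_weight = agreement_weight(m_a, u_a) + agreement_weight(m_b, u_b)

# Non-naive strategy: one combined variable "A and B both agree".
# Among nonmatches, correlated variables agree together more often than
# the product u_a * u_b (0.0002) would suggest (hypothetical value).
m_both, u_both = 0.92, 0.002
combined_weight = agreement_weight(m_both, u_both)

print(round(naive_weight, 2), round(combined_weight, 2))
```

With these illustrative numbers the naive sum is higher than the combined weight, which mirrors the direction of the difference reported in the abstract.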
Results
The linking weight for agreement on 2 correlated variables among nonmatches was estimated considerably higher in the naïve strategy than in the nonnaïve strategy (16.87 vs. 13.55). Therefore, ignoring dependency overestimates the amount of identifying information if both correlated variables agree. The impact on the number of pairs that was classified differently with both approaches was modest in the situation in which there were many different linking variables but grew substantially with fewer variables. The simulation study confirmed the results of the empirical study and suggests that the number of misclassifications can increase substantially by ignoring dependency under less favorable linking conditions.
Conclusion
Dependency often exists between linking variables and has the potential to bias the outcome of a linkage study. The nonnaïve approach is a straightforward method for creating linking weights that accommodate dependency. The impact on the number of misclassifications depends on the quality and number of linking variables relative to the number of correlated linking variables.
doi:10.1197/jamia.M2265
PMCID: PMC2528043  PMID: 18579842
18.  Factors associated with posttraumatic stress symptoms in a prospective cohort of patients after abdominal sepsis: a nomogram 
Intensive Care Medicine  2008;34(4):664-674.
Objective
To determine to what extent patients who have survived abdominal sepsis suffer from symptoms of posttraumatic stress disorder (PTSD) and depression, and to identify potential risk factors for PTSD symptoms.
Design and setting
PTSD and depression symptoms were measured using the Impact of Events Scale–Revised (IES-R), the Post-Traumatic Symptom Scale 10 (PTSS-10) and the Beck Depression Inventory II (BDI-II).
Patients and participants
A total of 135 peritonitis patients were eligible for this study, of whom 107 (80%) patients completed the questionnaire. The median APACHE-II score was 14 (range 12–16), and 89% were admitted to the ICU.
Measurements and results
The proportion of patients with “moderate” PTSD symptom scores was 28% (95% CI 20–37), whilst 10% (95% CI 6–17) of patients had “high” PTSD symptom scores. Only 5% (95% CI 2–12) of the patients expressed severe depression symptoms. Factors associated with increased PTSD symptoms in a multivariate ordinal regression model were younger age (0.74 per 10 years older, p = 0.082), length of ICU stay (OR = 1.4 per doubling of duration, p = 0.003) and having some (OR = 4.9, p = 0.06) or many (OR = 55.5, p < 0.001) traumatic memories of the ICU or hospital stay.
Conclusion
As many as 38% of patients after abdominal sepsis report elevated levels of PTSD symptoms on at least one of the questionnaires. Our nomogram may assist in identifying patients at increased risk for developing symptoms of PTSD.
Electronic supplementary material
The online version of this article (doi:10.1007/s00134-007-0941-3) contains supplementary material, which is available to authorized users.
doi:10.1007/s00134-007-0941-3
PMCID: PMC2271079  PMID: 18197398
Peritonitis; Sepsis; Posttraumatic stress disorder; PTSD; Depression; Intensive care; IES-R; PTSS-10; BDI-II
19.  Single-drug therapy or selective decontamination of the digestive tract as antifungal prophylaxis in critically ill patients: a systematic review 
Critical Care  2007;11(6):R126.
Introduction
The objective of this study was to determine and compare the effectiveness of different prophylactic antifungal therapies in critically ill patients on the incidence of yeast colonisation, infection, candidemia, and hospital mortality.
Methods
A systematic review was conducted of prospective trials including adult non-neutropenic patients, comparing single-drug antifungal prophylaxis (SAP) or selective decontamination of the digestive tract (SDD) with controls and with each other.
Results
Thirty-three studies were included (11 SAP and 22 SDD; 5,529 patients). Compared with control groups, both SAP and SDD reduced the incidence of yeast colonisation (SAP: odds ratio [OR] 0.38, 95% confidence interval [CI] 0.20 to 0.70; SDD: OR 0.12, 95% CI 0.05 to 0.29) and infection (SAP: OR 0.54, 95% CI 0.39 to 0.75; SDD: OR 0.29, 95% CI 0.18 to 0.45). Treatment effects were significantly larger in SDD trials than in SAP trials. The incidence of candidemia was reduced by SAP (OR 0.32, 95% CI 0.12 to 0.82) but not by SDD (OR 0.59, 95% CI 0.25 to 1.40). In-hospital mortality was reduced predominantly by SDD (OR 0.73, 95% CI 0.59 to 0.93, number needed to treat 15; SAP: OR 0.80, 95% CI 0.64 to 1.00). The effectiveness of prophylaxis decreased as the proportion of included surgical patients increased.
Conclusion
Antifungal prophylaxis (SAP or SDD) is effective in reducing yeast colonisation and infections across a range of critically ill patients. Indirect comparisons suggest that SDD is more effective in reducing yeast-related outcomes, except for candidemia.
doi:10.1186/cc6191
PMCID: PMC2246222  PMID: 18067657
20.  Health related quality of life six months following surgical treatment for secondary peritonitis – using the EQ-5D questionnaire 
Background
To compare health related quality of life (HR-QoL) in patients surgically treated for secondary peritonitis to that of a healthy population, and to prospectively identify factors associated with poorer (lower) HR-QoL.
Design
A prospective cohort of secondary peritonitis patients was mailed the EQ-5D and EQ-VAS 6 months after initial laparotomy.
Setting
Multicenter study in two academic and seven regional teaching hospitals.
Patients
130 of the 155 eligible patients (84%) responded to the HR-QoL questionnaires.
Results
HR-QoL was significantly worse on all dimensions in peritonitis patients than in a healthy reference population. Peritonitis characteristics at initial presentation were not associated with HR-QoL at six months. A more complicated course of the disease leading to longer hospitalization, and the presence of an enterostomy, had a negative impact on mobility (p = 0.02), self-care (p < 0.001) and daily activities (p = 0.01). In a multivariate analysis for the EQ-VAS, every doubling of hospital stay decreased the EQ-VAS by 3.8 points (p = 0.015). Morbidity during the six-month follow-up was not found to be predictive for the EQ-5D or EQ-VAS.
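An effect expressed "per doubling" implies that length of stay enters the model on a base-2 logarithmic scale. The sketch below shows how such a coefficient translates into an expected difference between two stays. It assumes a simple log2 term and ignores every other covariate in the authors' multivariate model; the function name `eq_vas_difference` is hypothetical.

```python
from math import log2


def eq_vas_difference(stay_days, reference_days, points_per_doubling=-3.8):
    """Expected EQ-VAS difference between two hospital stays (illustrative only).

    Assumes the reported -3.8 points applies to each doubling of stay,
    i.e. a linear effect of log2(length of stay).
    """
    return points_per_doubling * log2(stay_days / reference_days)


# A stay four times as long corresponds to two doublings
difference = eq_vas_difference(40, 10)
print(round(difference, 1))  # -7.6
```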
Conclusion
Six months following initial surgery, patients with secondary peritonitis report more problems in HR-QoL than a healthy reference population. Unfavorable disease characteristics at initial presentation were not predictive for poorer HR-QoL, but a more complicated course of the disease was most predictive of HR-QoL at 6 months.
doi:10.1186/1477-7525-5-35
PMCID: PMC1950493  PMID: 17601343
21.  The clinical effect of a new infant formula in term infants with constipation: a double-blind, randomized cross-over trial 
Nutrition Journal  2007;6:8.
Background
Nutrilon Omneo (new formula; NF) contains a high concentration of sn-2 palmitic acid, a mixture of prebiotic oligosaccharides and partially hydrolyzed whey protein. It is hypothesized that NF positively affects stool characteristics in constipated infants.
Methods
Thirty-eight constipated infants, aged 3–20 weeks, were included and randomized to NF (n = 20) or a standard formula (SF; n = 18) in period 1 and crossed-over after 3 weeks to treatment period 2. Constipation was defined by at least one of the following symptoms: 1) defecation frequency < 3/week; 2) painful defecation; 3) abdominal or rectal palpable mass.
Results
Period 1 was completed by 35 infants. A significant increase in defecation frequency (NF: 3.5 pre versus 5.6/week post treatment; SF 3.6 pre versus 4.9/week post treatment) was found in both groups, but was not significantly different between the two formulas (p = 0.36). Improvement of hard stool consistency to soft stool consistency was found more often with NF than SF, but did not reach statistical significance (90% versus 50%; RR, 1.8; 95% CI, 0.9–3.5; p = 0.14). No difference was found in painful defecation or the presence of an abdominal or rectal mass between the two groups. Twenty-four infants completed period 2. Only stool consistency was significantly different between the two formulas (17% had soft stools on NF and hard stools on SF; no infants had soft stools on SF and hard stools on NF, McNemar test p = 0.046).
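The McNemar result for period 2 can be retraced from the discordant pairs alone. The sketch below assumes that 17% of the 24 infants corresponds to 4 infants with soft stools on NF and hard stools on SF, and that the test was the chi-square version without continuity correction. Neither detail is stated in the abstract, although both are consistent with the reported p-value.

```python
from math import erfc, sqrt


def mcnemar_p(b, c):
    """Two-sided p-value of McNemar's chi-square test (1 df, no continuity correction).

    b and c are the two discordant-pair counts.
    """
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square distribution with 1 degree of freedom
    return erfc(sqrt(chi2 / 2))


soft_on_nf_only = 4  # assumed: 17% of 24 infants, rounded
soft_on_sf_only = 0  # stated in the abstract
p_value = mcnemar_p(soft_on_nf_only, soft_on_sf_only)
print(round(p_value, 3))  # 0.046
```

With so few discordant pairs an exact binomial version of the test would give a larger p-value, so the choice of test variant matters for the conclusion.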
Conclusion
The addition of a high concentration of sn-2 palmitic acid, prebiotic oligosaccharides and partially hydrolyzed whey protein resulted in a strong tendency toward softer stools in constipated infants, but not in a difference in defecation frequency. Formula transition to NF may be considered as treatment in constipated infants with hard stools.
doi:10.1186/1475-2891-6-8
PMCID: PMC1852321  PMID: 17428343
22.  Attenuated cerebrospinal fluid leukocyte count and sepsis in adults with pneumococcal meningitis: a prospective cohort study 
Background
A low cerebrospinal fluid (CSF) white-blood cell count (WBC) has been identified as an independent risk factor for adverse outcome in adults with bacterial meningitis. Whereas a low CSF WBC indicates the presence of sepsis with early meningitis in patients with meningococcal infections, the relation between CSF WBC and outcome in patients with pneumococcal meningitis is not understood.
Methods
We examined the relation between CSF WBC, bacteraemia and sepsis in a prospective cohort study that included 352 episodes of pneumococcal meningitis, confirmed by CSF culture, occurring in patients aged >16 years.
Results
CSF WBC was recorded in 320 of 352 episodes (91%). Median CSF WBC was 2530 per mm³ (interquartile range 531–6983 per mm³) and 104 patients (33%) had a CSF WBC <1000/mm³. Patients with a CSF WBC <1000/mm³ were more likely to have an unfavourable outcome (defined as a Glasgow Outcome Scale score of 1–4) than those with a higher WBC (74 of 104 [71%] vs. 87 of 216 [43%]; P < 0.001). CSF WBC was significantly associated with blood WBC (Spearman's test 0.29), CSF protein level (0.20), thrombocyte count (0.21), erythrocyte sedimentation rate (-0.15), and C-reactive protein levels (-0.18). Patients with a CSF WBC <1000/mm³ more often had a positive blood culture (72 of 84 [86%] vs. 138 of 196 [70%]; P = 0.01) and more often developed systemic complications (cardiorespiratory failure, sepsis) than those with a higher WBC (53 of 104 [51%] vs. 69 of 216 [32%]; P = 0.001). In a multivariate analysis, advanced age (Odds ratio per 10-year increments 1.22, 95% CI 1.02–1.45), a positive blood culture (Odds ratio 2.46, 95% CI 1.17–5.14), and a low thrombocyte count on admission (Odds ratio per 100,000/mm³ increments 0.67, 95% CI 0.47–0.97) were associated with a CSF WBC <1000/mm³.
Conclusion
A low CSF WBC in adults with pneumococcal meningitis is related to the presence of signs of sepsis and systemic complications. Invasive pneumococcal infections should possibly be regarded as a continuum from meningitis to sepsis.
doi:10.1186/1471-2334-6-149
PMCID: PMC1618396  PMID: 17038166
23.  Reproducibility of the STARD checklist: an instrument to assess the quality of reporting of diagnostic accuracy studies 
Background
In January 2003, STAndards for the Reporting of Diagnostic accuracy studies (STARD) were published in a number of journals, to improve the quality of reporting in diagnostic accuracy studies. We designed a study to investigate the inter-assessment reproducibility, and intra- and inter-observer reproducibility of the items in the STARD statement.
Methods
Thirty-two diagnostic accuracy studies published in 2000 in medical journals with an impact factor of at least 4 were included. Two reviewers independently evaluated the quality of reporting of these studies using the 25 items of the STARD statement. A consensus evaluation was obtained by discussing and resolving disagreements between reviewers. Almost two years later, the same studies were evaluated by the same reviewers. For each item, percentage agreement and Cohen's kappa between the first and second consensus assessments (inter-assessment) were calculated. Intraclass correlation coefficients (ICC) were calculated to evaluate the reliability of the checklist.
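Percentage agreement and Cohen's kappa for a single checklist item can be computed as in the following sketch. The ratings are made-up values for ten hypothetical studies (1 = item reported, 0 = not reported) and do not come from the study; the function names are likewise illustrative.

```python
from collections import Counter


def percent_agreement(first, second):
    """Proportion of studies rated identically in both assessments."""
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohens_kappa(first, second):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(first)
    observed = percent_agreement(first, second)
    counts_1, counts_2 = Counter(first), Counter(second)
    categories = set(first) | set(second)
    # Chance agreement from the marginal rating frequencies of each assessment
    expected = sum(counts_1[k] * counts_2[k] for k in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical consensus ratings for one item, first and second assessment
first_assessment = [1, 1, 1, 1, 0, 0, 0, 1, 1, 0]
second_assessment = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0]

agreement = percent_agreement(first_assessment, second_assessment)
kappa = cohens_kappa(first_assessment, second_assessment)
print(agreement, round(kappa, 2))  # 0.8 0.58
```

The example illustrates why the two measures are reported side by side: 80% raw agreement corresponds to a noticeably lower kappa once chance agreement is removed.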
Results
The overall inter-assessment agreement for all items of the STARD statement was 85% (Cohen's kappa 0.70) and varied from 63% to 100% for individual items. The largest differences between the two assessments were found for the reporting of the rationale of the reference standard (kappa 0.37), number of included participants that underwent tests (kappa 0.28), distribution of the severity of the disease (kappa 0.23), a cross tabulation of the results of the index test by the results of the reference standard (kappa 0.33) and how indeterminate results, missing data and outliers were handled (kappa 0.25). Large differences were also observed within and between reviewers for these items. The inter-assessment reliability of the STARD checklist was satisfactory (ICC = 0.79 [95% CI: 0.62 to 0.89]).
Conclusion
Although the overall reproducibility of the quality of reporting on diagnostic accuracy studies using the STARD statement was found to be good, substantial disagreements were found for specific items. These disagreements were not so much caused by differences in interpretation of the items by the reviewers but rather by difficulties in assessing the reporting of these items due to lack of clarity within the articles. Including a flow diagram in all reports on diagnostic accuracy studies would be very helpful in reducing confusion between readers and among reviewers.
doi:10.1186/1471-2288-6-12
PMCID: PMC1522016  PMID: 16539705
24.  Evaluation of QUADAS, a tool for the quality assessment of diagnostic accuracy studies 
Background
A quality assessment tool for diagnostic accuracy studies, named QUADAS, has recently been developed. Although QUADAS has been used in several systematic reviews, it has not been formally validated. The objective was to evaluate the validity and usefulness of QUADAS.
Methods
Three reviewers independently rated the quality of 30 studies using QUADAS. We assessed the proportion of agreements between each reviewer and the final consensus rating. This was done for all QUADAS items combined and for each individual item. Twenty reviewers who had used QUADAS in their reviews completed a short structured questionnaire on their experience of QUADAS.
Results
Over all items, the agreements between each reviewer and the final consensus rating were 91%, 90% and 85%. The results for individual QUADAS items varied between 50% and 100% with a median value of 90%. Items related to uninterpretable test results and withdrawals led to the most disagreements. The feedback on the content of the tool was generally positive with only small numbers of reviewers reporting problems with coverage, ease of use, clarity of instructions and validity.
Conclusion
Major modifications to the content of QUADAS itself are not necessary. The evaluation highlighted particular difficulties in scoring the items on uninterpretable results and withdrawals. Revised guidelines for scoring these items are proposed. It is essential that reviewers tailor guidelines for scoring items to their review, and ensure that all reviewers are clear on how to score studies. Reviewers should consider whether all QUADAS items are relevant to their review, and whether additional quality items should be assessed as part of their review.
doi:10.1186/1471-2288-6-9
PMCID: PMC1421422  PMID: 16519814
25.  Evidence of bias and variation in diagnostic accuracy studies 
Background
Studies with methodologic shortcomings can overestimate the accuracy of a medical test. We sought to determine and compare the direction and magnitude of the effects of a number of potential sources of bias and variation in studies on estimates of diagnostic accuracy.
Methods
We identified meta-analyses of the diagnostic accuracy of tests through an electronic search of the databases MEDLINE, EMBASE, DARE and MEDION (1999–2002). We included meta-analyses with at least 10 primary studies without preselection based on design features. Pairs of reviewers independently extracted study characteristics and original data from the primary studies. We used a multivariable meta-epidemiologic regression model to investigate the direction and strength of the association between 15 study features and estimates of diagnostic accuracy.
Results
We selected 31 meta-analyses with 487 primary studies of test evaluations. Only 1 study had no design deficiencies. The quality of reporting was poor in most of the studies. We found significantly higher estimates of diagnostic accuracy in studies with nonconsecutive inclusion of patients (relative diagnostic odds ratio [RDOR] 1.5, 95% confidence interval [CI] 1.0–2.1) and retrospective data collection (RDOR 1.6, 95% CI 1.1–2.2). The estimates were highest in studies that had severe cases and healthy controls (RDOR 4.9, 95% CI 0.6–37.3). Studies that selected patients based on whether they had been referred for the index test, rather than on clinical symptoms, produced significantly lower estimates of diagnostic accuracy (RDOR 0.5, 95% CI 0.3–0.9). The variance between meta-analyses of the effect of design features was large to moderate for type of design (cohort v. case–control), the use of composite reference standards and the use of differential verification; the variance was close to zero for the other design features.
Interpretation
Shortcomings in study design can affect estimates of diagnostic accuracy, but the magnitude of the effect may vary from one situation to another. Design features and clinical characteristics of patient groups should be carefully considered by researchers when designing new studies and by readers when appraising the results of such studies. Unfortunately, incomplete reporting hampers the evaluation of potential sources of bias in diagnostic accuracy studies.
doi:10.1503/cmaj.050090
PMCID: PMC1373751  PMID: 16477057
