Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org).
Editors’ note: In order to encourage dissemination of the TRIPOD Statement, this article is freely accessible on the Annals of Internal Medicine Web site (www.annals.org) and will also be published in BJOG, British Journal of Cancer, British Journal of Surgery, BMC Medicine, British Medical Journal, Circulation, Diabetic Medicine, European Journal of Clinical Investigation, European Urology, and Journal of Clinical Epidemiology. The authors jointly hold the copyright of this article. An accompanying Explanation and Elaboration article is freely available only on www.annals.org; Annals of Internal Medicine holds copyright for that article.
Prediction models; Prognostic; Diagnostic; Model development; Validation; Transparency; Reporting
Drawing conclusions from systematic reviews of test accuracy studies without considering the methodological quality (risk of bias) of included studies may lead to unwarranted optimism about the value of the test(s) under study. We sought to identify to what extent the results of quality assessment of included studies are incorporated in the conclusions of diagnostic accuracy reviews.
We searched MEDLINE and EMBASE for test accuracy reviews published between May and September 2012. We examined the abstracts and main texts of these reviews to see whether and how the results of quality assessment were linked to the accuracy estimates when drawing conclusions.
We included 65 reviews of which 53 contained a meta-analysis. Sixty articles (92%) had formally assessed the methodological quality of included studies, most often using the original QUADAS tool (n = 44, 68%). Quality assessment was mentioned in 28 abstracts (43%), with a majority (n = 21) mentioning it in the methods section. In only 5 abstracts (8%) were results of quality assessment incorporated in the conclusions. Thirteen reviews (20%) presented results of quality assessment in the main text only, without further discussion. Forty-seven reviews (72%) discussed results of quality assessment; the most frequent form was as limitations in assessing quality (n = 28). Only 6 reviews (9%) further linked the results of quality assessment to their conclusions, 3 of which did not conduct a meta-analysis due to limitations in the quality of included studies. In the reviews with a meta-analysis, 19 (36%) incorporated quality in the analysis. Eight reported significant effects of quality on the pooled estimates; in none of these were the effects factored into the conclusions.
While almost all recent diagnostic accuracy reviews evaluate the quality of included studies, very few consider results of quality assessment when drawing conclusions. The practice of reporting systematic reviews of test accuracy should improve if readers not only want to be informed about the limitations in the available evidence, but also on the associated implications for the performance of the evaluated tests.
Diagnostic tests; Test accuracy; Systematic reviews; Meta-analysis; Quality; QUADAS; Risk of bias
Risk prediction models estimate the risk of developing future outcomes for individuals based on one or more underlying characteristics (predictors). We review how researchers develop and validate risk prediction models within an individual participant data (IPD) meta-analysis, in order to assess the feasibility and conduct of the approach.
A qualitative review of the aims, methodology, and reporting in 15 articles that developed a risk prediction model using IPD from multiple studies.
The IPD approach offers many opportunities but methodological challenges exist, including: unavailability of requested IPD, missing patient data and predictors, and between-study heterogeneity in methods of measurement, outcome definitions and predictor effects. Most articles develop their model using IPD from all available studies and perform only an internal validation (on the same set of data). Ten of the 15 articles did not allow for any study differences in baseline risk (intercepts), potentially limiting their model’s applicability and performance in some populations. Only two articles used external validation (on different data), including a novel method which develops the model on all but one of the IPD studies, tests performance in the excluded study, and repeats by rotating the omitted study.
An IPD meta-analysis offers unique opportunities for risk prediction research. Researchers can make more of this by allowing separate model intercept terms for each study (population) to improve generalisability, and by using ‘internal-external cross-validation’ to simultaneously develop and validate their model. Methodological challenges can be reduced by prospectively planned collaborations that share IPD for risk prediction.
Meta-analysis; Prognostic factor; Prognosis; Individual participant (patient) data; Review; Reporting
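The 'internal-external cross-validation' rotation described above (develop on all but one IPD study, test on the omitted study, repeat) can be sketched as follows. This is a minimal illustration, not the reviewed authors' code: the record layout, the toy studies, and the placeholder "model" (a single overall event rate, checked by observed minus predicted risk) are all assumptions chosen so that the loop itself is easy to follow. In practice `fit` would be a regression model, possibly with study-specific intercepts.

```python
from typing import Callable, Dict, List, Tuple

Record = Tuple[float, int]  # (predictor value, binary outcome); hypothetical layout


def internal_external_cv(
    studies: Dict[str, List[Record]],
    fit: Callable[[List[Record]], float],
    evaluate: Callable[[float, List[Record]], float],
) -> Dict[str, float]:
    """Develop on all studies but one, test on the omitted study, rotate."""
    results = {}
    for held_out, test_data in studies.items():
        # Pool the records of every study except the one held out.
        train_data = [rec for name, recs in studies.items()
                      if name != held_out for rec in recs]
        model = fit(train_data)
        results[held_out] = evaluate(model, test_data)
    return results


def fit_mean_risk(data: List[Record]) -> float:
    """Placeholder model: predict the pooled event rate for everyone."""
    return sum(y for _, y in data) / len(data)


def observed_minus_predicted(model: float, data: List[Record]) -> float:
    """Crude calibration check: observed event rate minus predicted risk."""
    observed = sum(y for _, y in data) / len(data)
    return observed - model


# Toy data, invented for illustration only.
toy_studies = {
    "study_A": [(0.1, 0), (0.4, 0), (0.8, 1), (0.9, 1)],
    "study_B": [(0.2, 0), (0.3, 0), (0.5, 0), (0.7, 1)],
    "study_C": [(0.6, 1), (0.9, 1), (0.2, 0), (0.8, 1)],
}
performance = internal_external_cv(toy_studies, fit_mean_risk,
                                   observed_minus_predicted)
print(performance)
```

Each study serves once as the external validation set, so the spread of the per-study results gives a direct view of how performance varies between populations.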
Prediction models for exacerbations in patients with chronic obstructive pulmonary disease (COPD) are scarce. Our aim was to develop and validate a new model to predict exacerbations in patients with COPD.
Patients and methods
The derivation cohort consisted of patients aged 65 years or over, with a COPD diagnosis, who were followed up over 24 months. The external validation cohort consisted of another cohort of COPD patients, aged 50 years or over. Exacerbations of COPD were defined as symptomatic deterioration requiring pulsed oral steroid use or hospitalization. Logistic regression analysis including backward selection and shrinkage was used to develop the final model and to adjust for overfitting. The adjusted regression coefficients were applied in the validation cohort to assess calibration of the predictions and calculate changes in discrimination applying C-statistics.
The derivation and validation cohorts consisted of 240 and 793 patients with COPD, of whom 29% and 28%, respectively, experienced an exacerbation during follow-up. The final model included four easily assessable variables: exacerbations in the previous year, pack years of smoking, level of obstruction, and history of vascular disease, with a C-statistic of 0.75 (95% confidence interval [CI]: 0.69–0.82). Predictions were well calibrated in the validation cohort, with a small loss in discrimination potential (C-statistic 0.66 [95% CI 0.61–0.71]).
Our newly developed prediction model can help clinicians to predict the risk of future exacerbations in individual patients with COPD, including those with mild disease.
exacerbation of COPD; risk prediction; external validation; vascular disease
A medical test may influence the health of patients by guiding clinical decisions, such as treatment in case of a positive test result. However, a medical test can influence the health of patients through other mechanisms as well, like giving reassurance. To make a clinical recommendation about a medical test, we should be aware of the full range of effects of that test on patients. This requires an understanding of the range of effects that medical testing can have on patients. This study evaluates the mechanisms through which medical testing can influence patients’ health, other than the effect on clinical management, from a gynecologist’s perspective.
A qualitative study in which explorative focus groups were conducted with gynecologists, gynecological residents and gynecological M.D. researchers (n = 43). Discussions were transcribed verbatim. Transcriptions were coded inductively and analyzed by three researchers.
All participants contributed various clinical examples in which medical testing had influenced patients’ health. Clinical examples illustrated that testing, in itself or in interaction with contextual factors, may provoke a wide range of effects on patients. Our data showed that testing can influence the doctor’s perceptions of the patients’ appraisal of their illness, their perceived control, or the doctor-patient relationship. This may lead to changes in psychological, behavioral, and/or medical outcomes, either favorably or unfavorably. The data were used to construct a conceptual framework of effects of medical testing on patients.
Besides supporting clinical decision making, medical testing may have favorable or unfavorable effects on patients’ health through several mechanisms.
Test evaluation; Patient outcomes; Diagnostic test; Methodology; Qualitative research
When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions.
Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated.
The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept.
The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters.
Logistic regression analysis; Prediction model with random intercept; Validation
Proper evaluation of new diagnostic tests is required to reduce overutilization and to limit potential negative health effects and costs related to testing. A decision analytic modelling approach may be worthwhile when a diagnostic randomized controlled trial is not feasible. We demonstrate this by assessing the cost-effectiveness of modified transesophageal echocardiography (TEE) compared with manual palpation for the detection of atherosclerosis in the ascending aorta.
Based on a previous diagnostic accuracy study, actual Dutch reimbursement data, and evidence from literature we developed a Markov decision analytic model. Cost-effectiveness of modified TEE was assessed for a life time horizon and a health care perspective. Prevalence rates of atherosclerosis were age-dependent and low as well as high rates were applied. Probabilistic sensitivity analysis was applied.
The model synthesized all available evidence on the risk of stroke in cardiac surgery patients. The modified TEE strategy consistently resulted in more adapted surgical procedures and, hence, a lower risk of stroke and a slightly higher number of life-years. With 10% prevalence of atherosclerosis the incremental cost-effectiveness ratio was €4,651 and €481 per quality-adjusted life year in 55-year-old men and women, respectively. In all patients aged 65 years or older the modified TEE strategy was cost saving and resulted in additional health benefits.
Decision analytic modelling to assess the cost-effectiveness of a new diagnostic test based on characteristics, costs and effects of the test itself and of the subsequent treatment options is both feasible and valuable. Our case study on modified TEE suggests that it may reduce the risk of stroke in cardiac surgery patients older than 55 years at acceptable cost-effectiveness levels.
Diagnostic test; Patient outcomes; Cost-effectiveness analysis; Stroke; Cardiac surgery
Nested case–control studies are becoming increasingly popular as they can be very efficient for quantifying the diagnostic accuracy of costly or invasive tests or (bio)markers. However, they do not allow for direct estimation of the test’s predictive values or post-test probabilities, let alone for their confidence intervals (CIs). Correct estimates of the predictive values themselves can easily be obtained using a simple correction by the (inverse) sampling fractions of the cases and controls. But using this correction to estimate the corresponding standard error (SE) falsely increases the number of patients actually studied, yielding CIs that are too narrow. We compared different approaches for estimating the SE and thus CI of predictive values or post-test probabilities of diagnostic test results in a nested case–control study.
We created datasets based on a large, previously published diagnostic study on 2 different tests (D-dimer test and calf difference test) with a nested case–control design. We compared six different approaches: (1) the standard formula for the SE of a proportion; (2) adaptation of the standard formula with the sampling fraction; (3) a bootstrap procedure; (4) an approach that uses the sensitivity, the specificity and the prevalence; (5) weighted logistic regression; and (6) approach 4 on the log odds scale. The approaches were compared with respect to coverage of the CI and CI-width.
The bootstrap procedure (approach 3) showed good coverage and relatively small CI widths. Approaches 4 and 6 showed some undercoverage, particularly for the D-dimer test with frequent positive results (positive results around 70%). Approaches 1, 2 and 5 showed clear overcoverage at low prevalences of 0.05 and 0.1 in the cohorts for all case–control ratios.
The results from our study suggest that a bootstrap procedure is necessary to assess the confidence interval for the predictive values or post-test probabilities of diagnostic test results in studies using a nested case–control design.
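The inverse-sampling-fraction correction and a bootstrap CI for the positive predictive value can be sketched as follows. This is an illustrative outline under stated assumptions, not the procedure used in the study: the counts are invented, the interval is a simple percentile interval, and cases and controls are resampled separately to mirror the fixed case–control sample sizes. Function names are hypothetical.

```python
import random


def corrected_ppv(tp, fp, case_fraction, control_fraction):
    """Positive predictive value with each sampled subject weighted by the
    inverse of its sampling fraction."""
    weighted_tp = tp / case_fraction
    weighted_fp = fp / control_fraction
    return weighted_tp / (weighted_tp + weighted_fp)


def bootstrap_ppv_ci(case_tests, control_tests, case_fraction,
                     control_fraction, n_boot=2000, seed=1):
    """Percentile 95% CI; cases and controls are resampled separately
    (an assumption of this sketch, not a detail taken from the study)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        tp = sum(rng.choices(case_tests, k=len(case_tests)))
        fp = sum(rng.choices(control_tests, k=len(control_tests)))
        if tp + fp > 0:  # PPV is undefined without any positive result
            estimates.append(
                corrected_ppv(tp, fp, case_fraction, control_fraction))
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return lower, upper


# Hypothetical data: 1 = positive test, 0 = negative test.
case_tests = [1] * 90 + [0] * 10       # all 100 cases sampled (fraction 1.0)
control_tests = [1] * 60 + [0] * 140   # 200 of 800 non-cases sampled (0.25)

ppv = corrected_ppv(sum(case_tests), sum(control_tests), 1.0, 0.25)
ci_low, ci_high = bootstrap_ppv_ci(case_tests, control_tests, 1.0, 0.25)
print(f"PPV = {ppv:.3f}, bootstrap 95% CI {ci_low:.3f} to {ci_high:.3f}")
```

The point estimate uses the weighted counts, whereas the bootstrap draws only from the subjects actually sampled, so the interval reflects the true number of patients studied and not the inflated, weighted number.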
Reduced exercise tolerance and breathlessness are common in the elderly and can result in substantial loss in functionality and health related quality of life. Heart failure (HF) and chronic obstructive pulmonary disease (COPD) are common underlying causes, but can be difficult to disentangle due to overlap in symptomatology. In addition, other potential causes such as obesity, anaemia, renal dysfunction and thyroid disorders may be involved.
We aim to assess whether screening of frail elderly with reduced exercise tolerance leads to high detection rates of HF, COPD, or alternative diagnoses, and whether detection of these diseases would result in changes in patient management and increase in both functionality and quality of life.
A cluster randomized diagnostic trial. Primary care practices are randomized to the diagnostic-treatment strategy (screening) or care as usual.
Patient population: Frail (defined as having three or more chronic or vitality threatening diseases and/or receiving five or more drugs chronically during the last year) community-dwelling persons aged 65 years and older selected from the electronic medical files of the participating general practitioners. Those with reduced exercise tolerance or moderate to severe dyspnoea (≥2 score on the Medical Research Council dyspnoea scale) are included in the study.
The diagnostic screening in the intervention group includes history taking, physical examination, electrocardiography, spirometry, blood tests, and echocardiography. Subsequently, participants with new diagnoses will be managed according to clinical guidelines. Participants in the control arm receive care as usual. All participants fill out health status and other relevant questionnaires at baseline and after 6 months of follow-up.
This study will generate information on the yield of screening for previously unrecognized HF, COPD and other chronic diseases in frail elderly with reduced exercise tolerance and/or exercise induced dyspnoea. The cluster randomized comparison will reveal whether this yield will result in subsequent improvements in functional health and/or health related quality of life.
Reduced exercise tolerance; Dyspnoea; Breathlessness; Heart failure; COPD; Frail; Elderly; Screening
Preterm birth is the principal factor contributing to adverse outcomes in multiple pregnancies. Randomized controlled trials of progestogens to prevent preterm birth in twin pregnancies have shown no clear benefits. However, individual studies have not had sufficient power to evaluate potential benefits in women at particular high risk of early delivery (for example, women with a previous preterm birth or short cervix) or to determine adverse effects for rare outcomes such as intrauterine death.
We propose an individual participant data meta-analysis of high quality randomized, double-blind, placebo-controlled trials of progestogen treatment in women with a twin pregnancy. The primary outcome will be adverse perinatal outcome (a composite measure of perinatal mortality and significant neonatal morbidity). Missing data will be imputed within each original study, before data of the individual studies are pooled. The effects of 17-hydroxyprogesterone caproate or vaginal progesterone treatment in women with twin pregnancies will be estimated by means of a random effects log-binomial model. Analyses will be adjusted for variables used in stratified randomization as appropriate. Pre-specified subgroup analysis will be performed to explore the effect of progestogen treatment in high-risk groups.
Combining individual patient data from different randomized trials has potential to provide valuable, clinically useful information regarding the benefits and potential harms of progestogens in women with twin pregnancy overall and in relevant subgroups.
Child abuse and neglect is an important international health problem with unacceptable levels of morbidity and mortality. Although maltreatment as a cause of injury is estimated to be only 1% or less of the injured children attending the emergency room, the consequences of both missed child abuse cases and wrong suspicions are substantial. Therefore, the accuracy of ongoing detection at emergency rooms by health care professionals is highly important. Internationally, several diagnostic instruments or strategies for child abuse detection are used at emergency rooms, but their diagnostic value is still unknown. The aim of the study 'Child Abuse Inventory at Emergency Rooms' (CHAIN-ER) is to assess if active structured inquiry by emergency room staff can accurately detect physical maltreatment in children presenting at emergency rooms with physical injury.
CHAIN-ER is a multi-centre, cross-sectional study with 6 months diagnostic follow-up. Five thousand children aged 0-7 presenting with injury at an emergency room will be included. The index test - the SPUTOVAMO-R questionnaire- is to be tested for its diagnostic value against the decision of an expert panel. All SPUTOVAMO-R positives and a 15% random sample of the SPUTOVAMO-R negatives will undergo the same systematic diagnostic work up, which consists of an adequate history being taken by a pediatrician, inquiry with other health care providers by structured questionnaires in order to obtain child abuse predictors, and by additional follow-up information. Eventually, an expert panel (reference test) determines the true presence or absence of child abuse.
CHAIN-ER will determine both positive and negative predictive value of a child abuse detection instrument used in the emergency room. We mention a benefit of the use of an expert panel and of the use of complete data. Conducting a diagnostic accuracy study on a child abuse detection instrument is also accompanied by scientific hurdles, such as the lack of an accepted reference standard and potential (non-) response. Notwithstanding these scientific challenges, CHAIN-ER will provide accurate data on the predictive value of SPUTOVAMO-R.
Chronic obstructive pulmonary disease (COPD) and asthma are underdiagnosed in primary care.
To determine how often COPD or asthma are present in middle-aged and older patients who consult their GP for persistent cough.
Design of study
A cross-sectional study in 353 patients older than 50 years, visiting their GP for persistent cough and not known to have COPD or asthma.
General practice in the Netherlands.
All participants underwent extensive diagnostic work-up, including symptoms, signs, spirometry, and body plethysmography. All results were studied by an expert panel to diagnose or exclude COPD and/or asthma. The reproducibility of the panel diagnosis was assessed by calculation of Cohen's κ statistic in a sample of 41 participants.
Of the 353 participants, 102 (29%, 95% confidence interval [CI] = 24 to 34%) were diagnosed with COPD. In 14 of these 102 participants, both COPD and asthma were diagnosed (4%, 95% CI = 2 to 7%). Asthma (without COPD) was diagnosed in 23 (7%, 95% CI = 4 to 10%) participants. Mean duration of cough was 93 days (median 40 days). The reproducibility of the expert panel was good (Cohen's κ = 0.90).
In patients aged over 50 years who consult their GP for persistent cough, undetected COPD or asthma is frequently present.
asthma; cough; COPD; early diagnosis
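The abstract does not state how the confidence intervals for the prevalences were calculated. As a worked check, the sketch below applies the Wilson score interval, which is an assumption on our part; with the reported counts (102, 14 and 23 out of 353) it gives intervals that round to the published values.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(events, n, level=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the results above; the interval method is assumed.
copd = wilson_interval(102, 353)         # COPD
both = wilson_interval(14, 353)          # COPD and asthma
asthma_only = wilson_interval(23, 353)   # asthma without COPD

for label, (low, high) in [("COPD", copd), ("COPD + asthma", both),
                           ("Asthma only", asthma_only)]:
    print(f"{label}: {100 * low:.1f}% to {100 * high:.1f}%")
```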
There is a need for brief instruments to ascertain the diagnosis of major depressive disorder. In this study, we present the reliability, construct validity and accuracy of the PHQ-9 and PHQ-2 to detect major depressive disorder in primary care.
Cross-sectional analyses within a large prospective cohort study (PREDICT-NL). Data were collected in seven large general practices in the centre of the Netherlands. A total of 1338 subjects were recruited in the general practice waiting room, irrespective of their presenting complaint. The diagnostic accuracy (the area under the ROC curve and sensitivities and specificities for various thresholds) was calculated against a diagnosis of major depressive disorder determined with the Composite International Diagnostic Interview (CIDI).
The PHQ-9 showed a high degree of internal consistency (ICC = 0.88) and test-retest reliability (correlation = 0.94). With respect to construct validity, it showed a clear association with functional status measurements, sick days and number of consultations. The discriminative ability was good for the PHQ-9 (area under the ROC curve = 0.87, 95% CI: 0.84-0.90) and the PHQ-2 (ROC area = 0.83, 95% CI 0.80-0.87). Sensitivities at the recommended thresholds were 0.49 for the PHQ-9 at a score of 10 and 0.28 for a categorical algorithm. Adjustment of the threshold and the algorithm improved sensitivities to 0.82 and 0.84 respectively but the specificity decreased from 0.95 to 0.82 (threshold) and from 0.98 to 0.81 (algorithm). Similar results were found for the PHQ-2: the recommended threshold of 3 had a sensitivity of 0.42 and lowering the threshold resulted in an improved sensitivity of 0.81.
The PHQ-9 and the PHQ-2 are useful instruments to detect major depressive disorder in primary care, provided a high score is followed by an additional diagnostic work-up. However, often recommended thresholds for the PHQ-9 and the PHQ-2 resulted in many undetected major depressive disorders.
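The trade-off reported above, where lowering the threshold raises sensitivity at the cost of specificity, can be illustrated with a small sketch. The scores and reference diagnoses below are invented and do not come from PREDICT-NL; a score at or above the threshold counts as a positive test, and the function name is hypothetical.

```python
def sensitivity_specificity(scores, diagnoses, threshold):
    """Sensitivity and specificity of 'score >= threshold' against a
    binary reference diagnosis (1 = disorder present, 0 = absent)."""
    tp = sum(1 for s, d in zip(scores, diagnoses) if d == 1 and s >= threshold)
    fn = sum(1 for s, d in zip(scores, diagnoses) if d == 1 and s < threshold)
    tn = sum(1 for s, d in zip(scores, diagnoses) if d == 0 and s < threshold)
    fp = sum(1 for s, d in zip(scores, diagnoses) if d == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical questionnaire scores with a reference-standard diagnosis.
scores = [4, 7, 9, 11, 15, 20, 0, 1, 2, 3, 5, 6, 8, 12]
diagnoses = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

sens_10, spec_10 = sensitivity_specificity(scores, diagnoses, threshold=10)
sens_6, spec_6 = sensitivity_specificity(scores, diagnoses, threshold=6)
print(f"threshold 10: sensitivity {sens_10:.2f}, specificity {spec_10:.2f}")
print(f"threshold  6: sensitivity {sens_6:.2f}, specificity {spec_6:.2f}")
```

In this toy example the lower threshold detects more of the patients with the disorder but also flags more patients without it, which is why a high score needs an additional diagnostic work-up.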
Despite its benefits, it is uncommon to apply the nested case-control design in diagnostic research. We aim to show advantages of this design for diagnostic accuracy studies.
We used data from a full cross-sectional diagnostic study comprising a cohort of 1295 consecutive patients who were selected on their suspicion of having deep vein thrombosis (DVT). We drew nested case-control samples from the full study population with case:control ratios of 1:1, 1:2, 1:3 and 1:4 (per ratio 100 samples were taken). We calculated diagnostic accuracy estimates for two tests that are used to detect DVT in clinical practice.
Estimates of diagnostic accuracy in the nested case-control samples were very similar to those in the full study population. For example, for each case:control ratio, the positive predictive value of the D-dimer test was 0.30 in the full study population and 0.30 in the nested case-control samples (median of the 100 samples). As expected, variability of the estimates decreased with increasing sample size.
Our findings support the view that the nested case-control study is a valid and efficient design for diagnostic studies and should also be (re)appraised in current guidelines on diagnostic accuracy research.
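The sampling exercise described above can be sketched as follows: take all cases, draw a random sample of non-cases at a given case:control ratio, and weight the sampled controls back by the inverse of their sampling fraction before computing the positive predictive value. The cohort below is invented for illustration and is not the DVT data set; the choice to keep all cases and the function names are assumptions of this sketch.

```python
import random
import statistics

# Hypothetical full cohort: (has_disease, test_positive) per patient.
cohort = ([(1, 1)] * 180 + [(1, 0)] * 20 +
          [(0, 1)] * 400 + [(0, 0)] * 600)


def ppv_full(cohort):
    """PPV in the full cohort: diseased fraction among test positives."""
    positives = [disease for disease, test in cohort if test == 1]
    return sum(positives) / len(positives)


def ppv_nested(cohort, controls_per_case, rng):
    """PPV from one nested case-control sample (all cases, sampled controls),
    with controls weighted by the inverse of their sampling fraction."""
    cases = [rec for rec in cohort if rec[0] == 1]
    non_cases = [rec for rec in cohort if rec[0] == 0]
    controls = rng.sample(non_cases, controls_per_case * len(cases))
    control_fraction = len(controls) / len(non_cases)
    tp = sum(test for _, test in cases)
    fp = sum(test for _, test in controls)
    return tp / (tp + fp / control_fraction)


rng = random.Random(2024)
full = ppv_full(cohort)
# Median over 100 samples at a 1:1 ratio, as in the design described above.
median_1to1 = statistics.median(ppv_nested(cohort, 1, rng) for _ in range(100))
print(f"full cohort PPV {full:.3f}, nested 1:1 median PPV {median_1to1:.3f}")
```

Without the weighting step the sampled data would overstate the PPV, because cases are over-represented relative to the source population.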
Our objective was to systematically assess the differences in features, results, and usability of currently available meta-analysis programs.
Systematic review of software. We did an extensive search on the internet (Google, Yahoo, Altavista, and MSN) for specialized meta-analysis software. We included six programs in our review: Comprehensive Meta-analysis (CMA), MetAnalysis, MetaWin, MIX, RevMan, and WEasyMA. Two investigators compared the features of the software and their results. Thirty independent researchers evaluated the programs on their usability while analyzing one data set.
The programs differed substantially in features, ease-of-use, and price. Although most results from the programs were identical, we did find some minor numerical inconsistencies. CMA and MIX scored highest on usability and these programs also have the most complete set of analytical features.
In consideration of differences in numerical results, we believe the user community would benefit from openly available and systematically updated information about the procedures and results of each program's validation. The most suitable program for a meta-analysis will depend on the user's needs and preferences and this report provides an overview that should be helpful in making a substantiated choice.
The increased prevalence of unrecognised malignancy in patients with deep vein thrombosis (DVT) has been well established in secondary care settings. However, data from primary care settings, needed to tailor the diagnostic workup, are lacking.
To quantify the prevalence of unrecognised malignancy in primary care patients who have been diagnosed with DVT.
Prospective follow-up study.
All primary care physicians affiliated/associated with a non-teaching hospital in a geographically circumscribed region participated in the study.
A total of 430 consecutive patients without known malignancy, but with proven DVT were included in the study and compared with a control group of 442 primary care patients, matched according to age and sex. Previously unrecognised, occult malignancy was considered present if a new malignancy was diagnosed within 2 years following DVT diagnosis (DVT group) or inclusion in the control group. Patients with DVT were categorised into those with unprovoked idiopathic DVT and those with risk factors for DVT (that is, secondary DVT).
During the 2-year follow-up period, a new malignancy was diagnosed 3.6 times more often in patients with idiopathic DVT than in the control group (2-year incidence: 7.4% and 2.0%, respectively). The incidence in patients with secondary DVT was 2.6%; only slightly higher than in control patients.
Unrecognised malignancies are more common in both primary and secondary care patients with DVT than in the general population. In particular, patients with idiopathic DVT are at risk and they could benefit from individualised case-finding to detect malignancy.
deep vein thrombosis; idiopathic; neoplasms; primary health care
Cardiotocography (CTG) is the method used worldwide for fetal surveillance during labour. However, CTG alone shows many false positive test results and without fetal blood sampling (FBS), it results in an increase in operative deliveries without improvement of fetal outcome. FBS requires additional expertise, is invasive and often has to be repeated during labour. Two clinical trials have shown that a combination of CTG and ST-analysis of the fetal electrocardiogram (ECG) reduces the rates of metabolic acidosis and instrumental delivery. However, in both trials FBS was still performed in the ST-analysis arm, and it is therefore still unknown if the observed results were indeed due to the ST-analysis or to the use of FBS in combination with ST-analysis.
We aim to evaluate the effectiveness of non-invasive monitoring (CTG + ST-analysis) as compared to normal care (CTG + FBS), in a multicentre randomised clinical trial setting. Secondary aims are: 1) to judge whether ST-analysis of fetal electrocardiogram can significantly decrease frequency of performance of FBS or even replace it; 2) perform a cost analysis to establish the economic impact of the two treatment options.
Women in labour with a gestational age ≥ 36 weeks and an indication for CTG-monitoring can be included in the trial.
Eligible women will be randomised for fetal surveillance with CTG and, if necessary, FBS or CTG combined with ST-analysis of the fetal ECG.
The primary outcome of the study is the incidence of serious metabolic acidosis (defined as pH < 7.05 and Bdecf > 12 mmol/L in the umbilical cord artery). Secondary outcome measures are: instrumental delivery, neonatal outcome (Apgar score, admission to a neonatal ward), incidence of performance of FBS in both arms and cost-effectiveness of both monitoring strategies across hospitals.
The analysis will follow the intention to treat principle. The incidence of metabolic acidosis will be compared across both groups. Assuming a reduction of metabolic acidosis from 3.5% to 2.1%, using a two-sided test with an alpha of 0.05 and a power of 0.80, in favour of CTG plus ST-analysis, about 5100 women have to be randomised. Furthermore, the cost-effectiveness of CTG and ST-analysis as compared to CTG and FBS will be studied.
This study will provide data about the use of intrapartum ST-analysis with a strict protocol for performance of FBS to limit its incidence. We aim to clarify to what extent intrapartum ST-analysis can be used without the performance of FBS and in which cases FBS is still needed.
Trial Registration Number
Meta-analysis has become a well-known method for synthesis of quantitative data from previously conducted research in applied health sciences. So far, meta-analysis has been particularly useful in evaluating and comparing therapies and in assessing causes of disease. Consequently, the number of software packages that can perform meta-analysis has increased over the years. Unfortunately, it can take a substantial amount of time to get acquainted with some of these programs, and most contain little or no interactive educational material. We set out to create and validate an easy-to-use and comprehensive meta-analysis package that would be simple enough to program that it could remain available as a free download. We specifically aimed at students and researchers who are new to meta-analysis, with important parts of the development oriented towards creating internal interactive tutoring tools and designing features that would facilitate use of the software as a companion to existing books on meta-analysis.
We took an unconventional approach and created a program that uses Excel as a calculation and programming platform. The main programming language was Visual Basic, as implemented in Visual Basic 6 and Visual Basic for Applications in Excel 2000 and higher. The development took approximately two years and resulted in the 'MIX' program, which can be downloaded from the program's website free of charge. Next, we validated the MIX output against two major software packages as reference standards, namely STATA (metan, metabias, and metatrim) and Comprehensive Meta-Analysis Version 2 (CMA). Eight meta-analyses that had been published in major journals were used as data sources. All numerical and graphical results from analyses with MIX were identical to their counterparts in STATA and CMA. The MIX program distinguishes itself from most other programs by its extensive graphical output, its click-and-go (Excel) interface, and its educational features.
The MIX program is a valid tool for performing meta-analysis and may be particularly useful in educational environments. It can be downloaded free of charge from the program's website.
Postherpetic neuralgia (PHN) is by far the most common complication of herpes zoster (HZ) and one of the most intractable pain disorders. Since PHN is seen most often in the elderly, the number of patients with this disorder is expected to increase in our ageing society. PHN may last for months to years and has a high impact on the quality of life. The results of PHN treatment are rather disappointing. Epidural injection of local anaesthetics and steroids in the acute phase of HZ is a promising therapy for the prevention of PHN. Since randomised trials on the effectiveness of this intervention are lacking, the PINE (Prevention by epidural Injection of postherpetic Neuralgia in the Elderly) study was set up. The PINE study compares the effectiveness and cost-effectiveness of a single epidural injection of local anaesthetics and steroids during the acute phase of HZ with that of care-as-usual (i.e. antivirals and analgesics) in preventing PHN in elderly patients.
Methods / design
The PINE study is an open, multicenter clinical trial in which 550 elderly patients (age ≥ 50 years) who consult their general practitioner in the acute phase of HZ (rash < 7 days) are randomised to one of the two treatment groups. The primary clinical endpoint is the presence of HZ-related pain one month after the onset of the rash. Secondary endpoints include duration and severity of pain, re-interventions to treat existing pain, side effects, quality of life, and cost-effectiveness.
The PINE study aims to quantify the (cost-)effectiveness of a single epidural injection during the acute phase of HZ in preventing PHN.