1.  Moderate agreement between self-reported stroke and hospital-recorded stroke in two cohorts of Australian women: a validation study 
Background
Conflicting findings on the validity of self-reported stroke from existing studies create uncertainty about the appropriateness of using self-reported stroke in epidemiological research. We aimed to compare self-reported stroke against hospital-recorded stroke, and to investigate reasons for disagreement.
Methods
We included participants from the Australian Longitudinal Study on Women’s Health born in 1921–26 (n = 1556) and 1946–51 (n = 2119), who were living in New South Wales and who returned all survey questionnaires over a defined period of time. We determined agreement between self-reported and hospitalised stroke by calculating sensitivity, specificity and kappa statistics. We investigated whether characteristics including age, education, area of residence, country of birth, language spoken at home, recent mental health at survey completion and proxy completion of questionnaire were associated with disagreement, using logistic regression analysis to obtain odds ratios (ORs) with 95% confidence intervals (CIs).
Results
Agreement between self-report and hospital-recorded stroke was fair in older women (kappa 0.35, 95% CI 0.25 to 0.46) and moderate in mid-aged women (0.56, 95% CI 0.37 to 0.75). There was a high proportion with unverified self-reported stroke, partly due to: reporting of transient ischaemic attacks; strokes occurring outside the period of interest; and possible reporting of stroke-like conditions. In the older cohort, a large proportion with unverified stroke had hospital records of other cerebrovascular disease. In both cohorts, higher education was associated with agreement, whereas recent poor mental health was associated with disagreement.
Conclusion
Among women who returned survey questionnaires within the period of interest, the validity of self-reported stroke was fair to moderate, though probably underestimated. Agreement between self-report and hospital-recorded stroke was associated with individual characteristics. Where clinically verified stroke data are unavailable, self-report may be a reasonable alternative method of stroke ascertainment for some epidemiological studies.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-15-7) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-15-7
PMCID: PMC4320610  PMID: 25613556
Epidemiology; Stroke; Cerebrovascular disease; Validation studies; Self-report; Hospitalisation
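The agreement statistics reported in this validation study (sensitivity, specificity and kappa) can be sketched in a few lines. This is an illustrative computation with made-up 2x2 counts, not the authors' code or data; hospital records are treated as the reference standard.

```python
def agreement_stats(tp, fp, fn, tn):
    """Agreement between self-report and hospital record from a 2x2
    table, with hospital records as the reference standard."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)          # self-report detects true strokes
    specificity = tn / (tn + fp)          # self-report rules out non-strokes
    p_observed = (tp + tn) / n            # raw agreement
    # Chance agreement expected from the marginal proportions.
    p_chance = ((tp + fp) / n) * ((tp + fn) / n) \
             + ((fn + tn) / n) * ((fp + tn) / n)
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return sensitivity, specificity, kappa

# Hypothetical counts: 40 true positives, 60 false positives,
# 20 false negatives, 1880 true negatives.
sens, spec, kappa = agreement_stats(tp=40, fp=60, fn=20, tn=1880)
# kappa here comes out near 0.48, "moderate" on the usual benchmarks
```

The same kappa benchmarks (fair roughly 0.21-0.40, moderate 0.41-0.60) underlie the paper's "fair" and "moderate" labels.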
2.  Clinical research and medical care: towards effective and complete integration 
Background
Despite their close relationship, clinical research and medical care have become separated by clear boundaries. The purpose of clinical research is to generate generalizable knowledge useful for future patients, whereas medical care aims to promote the well-being of individual patients. The evolution towards patient-centered medicine and patient-oriented research, and the gradual standardization of medicine are contributing to closer ties between clinical research and medical practice. But the integration of both activities requires addressing important ethical and methodological challenges.
Discussion
From an ethical perspective, clinical research should evolve from a position of paternalistic beneficence to a situation in which the principle of non-maleficence and patient autonomy predominate. The progressive adoption of “patient-oriented informed consent”, “patient equipoise”, and “altruism-based research”, and the application of risk-based ethical oversight, in which the level of regulatory scrutiny is adapted to the potential risk for patients, are crucial steps to achieve the integration between research and care.
From a methodological standpoint, careful and systematic observations should have greater relevance in clinical research, and experiments should be embedded into usual clinical practice. Clinical research should focus on individuals through the development of patient-oriented research. In a complementary way, the integration of experiments into medical practice through the systematic application of “point of care research” could help to generate knowledge for the individuals and for the populations.
Summary
The integration of clinical research and medical care will require researchers, clinicians, health care managers, and patients to reevaluate the way they understand both activities. The development of an integrated learning health care system will contribute to generating and applying clinically relevant medical knowledge, producing benefits for present and future patients.
doi:10.1186/1471-2288-15-4
PMCID: PMC4323129  PMID: 25575454
Research; Medical care; Patient; Patient-centered care; Preferences; Patient-reported outcomes; Randomized clinical trials; Observational studies; Evidence-based medicine; Bioethics
3.  Maximising response from GPs to questionnaire surveys: do length or incentives make a difference? 
Background
General Practitioners (GPs) respond poorly to postal surveys. Consequently there is potential for reduced data quality and bias in the findings. In general population surveys, response to postal questionnaires may be improved by reducing their length and offering incentives. The aim of this study was to investigate whether questionnaire length and/or the offer of an incentive improves the response of GPs to a postal questionnaire survey.
Methods
A postal questionnaire survey was sent to 800 UK GPs randomly selected from Binley’s database; a database containing contact details of professionals working in UK general practices. The random sample of GPs was assigned to one of four groups of 200, each receiving a different questionnaire, either a standard (eight sides of A4) or an abbreviated (four sides of A4) questionnaire, with or without the offer of an incentive (a prize draw entry for a £100 voucher) for completion. The effects of questionnaire length and offer of incentive on response were calculated.
Results
Of 800 mailed questionnaires, 19 GPs did not meet inclusion criteria and 172 completed questionnaires were received (adjusted response 22.0%). Among the four groups, response ranged from 20.1% (standard questionnaire with no incentive and abbreviated questionnaire with incentive), through 21.8% (standard questionnaire with incentive), to 26.0% (abbreviated questionnaire with no incentive). There were no significant differences in response between the four groups (p = 0.447), between the groups receiving the standard versus the abbreviated questionnaire (% difference -2.1% (95% confidence interval (CI) -7.9, 3.7)), or between the groups offered an incentive versus no incentive (% difference -2.1% (95% CI -7.9, 3.7)).
Conclusions
Strategies known to improve response to postal questionnaire surveys in the general population do not significantly improve the response to postal questionnaire surveys among GPs. Further refinements to these strategies, or more novel strategies, aimed at increasing response specifically among GPs need to be identified in order to maximise data quality and generalisability of research results.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-15-3) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-15-3
PMCID: PMC4293861  PMID: 25563390
Cross-sectional survey; Postal questionnaires; General practice; Incentive; Non-response; Questionnaire length; Response
4.  Strategy for recruitment and factors associated with motivation and satisfaction in a randomized trial with 210 healthy volunteers without financial compensation 
Background
The aim was to describe a strategy for recruitment of healthy volunteers (HV) to a randomized trial that assessed the efficacy of different telephone techniques to assist HV in performing cardiac massage in a vital emergency. Participation in the randomized trial was not financially compensated; however, HV were offered emergency first-aid training. We also studied factors associated with HV motivation and satisfaction regarding participation in the trial.
Methods
The strategy for recruitment of 210 HV aged 18 to 60 years was based on: (1) the updated records of all telephone numbers since January 2000 of HV registered in the Rouen Clinical Investigation Centre HV database, and (2) a communication campaign for the general public focussing on posters and media advertisements. Data on the recruitment, socio-demographics, motivation and satisfaction of the 210 HV were collected by anonymous self-administered questionnaire.
Results
Of the 210 HV included, 63.3% (n = 133) were recruited from the HV database and 36.7% (n = 77) by the communication campaign. On the one hand, the HV database enabled screening of 1315 HV, 54.8% (n = 721) of whom were reached by phone; 55.2% (n = 398) of these accepted to participate in the study and 10.1% of the initial screening (n = 133) were finally included. On the other hand, for the 77 HV not recruited from the HV database, word-of-mouth (56.1%) was the main means of recruitment. The male/female ratio of the 210 HV was 0.5 and mean age was 43.5 years (Standard Deviation = 12.4). The main motivations given for participating in the trial were to support research (87.6%) and to receive emergency first-aid training (85.7%). Overall satisfaction with the welcome process was significantly higher for older HV (46–60 years) (adjusted odds ratio (AOR): 3.44; 95% confidence interval (95% CI): 1.48-7.99) and for HV in management jobs (AOR: 4.26; 95% CI: 1.22-14.87). Satisfaction with protocol management was higher for women (AOR: 2.33; 95% CI: 1.18-4.60) and for older HV (46–60 years) (AOR: 4.76; 95% CI: 1.97-11.52).
Conclusions
Recruitment of non-compensated HV required broad screening with a primary HV database alongside word-of-mouth communication, which seemed more efficient than media advertising. To enhance HV recruitment to randomized trials without financial compensation, it seems crucial not only to provide volunteers with a direct benefit but also to ensure their satisfaction.
doi:10.1186/1471-2288-15-2
PMCID: PMC4293827  PMID: 25559410
Interventional study; Recruitment; Healthy volunteers; Motivation; Satisfaction
5.  Assessing the validity of the Global Activity Limitation Indicator in fourteen European countries 
Background
The Global Activity Limitation Indicator (GALI), the measure underlying the European indicator Healthy Life Years (HLY), is widely used to compare population health across countries. However, the comparability of the item has been questioned. This study aims to further validate the GALI in the adult European population.
Methods
Data from the European Health Interview Survey (EHIS), covering 14 European countries and 152,787 individuals, were used to explore how the GALI was associated with other measures of disability and whether the GALI was consistent or reflected different disability situations in different countries.
Results
When considering each country separately or all combined, we found that the GALI was significantly associated with measures of activities of daily living, instrumental activities of daily living, and functional limitations (P < 0.001 in all cases). Associations were strongest for activities of daily living and weakest, though still strong, for functional limitations. For each measure, the magnitude of the association was similar across most countries. Overall, however, the GALI differed significantly between countries in terms of how it reflected each of the three disability measures (P < 0.001 in all cases). We suspect cross-country differences in the results may be due to variations in: the implementation of the EHIS, the perception of functioning and limitations, and the understanding of the GALI question.
Conclusion
The study confirms both the relevance of this indicator for measuring general activity limitations in the European population and the need for caution when comparing the level of the GALI from one country to another.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-15-1) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-15-1
PMCID: PMC4298058  PMID: 25555466
Global activity limitation indicator; Health expectancy; Disability-free life expectancy; Healthy life years; Disability; Functioning; Measurement
6.  Testing non-inferiority of a new treatment in three-arm clinical trials with binary endpoints 
Background
A two-arm non-inferiority trial without a placebo is usually adopted to demonstrate that an experimental treatment is not worse than a reference treatment by a small pre-specified non-inferiority margin due to ethical concerns. Selection of the non-inferiority margin and establishment of assay sensitivity are two major issues in the design, analysis and interpretation for two-arm non-inferiority trials. Alternatively, a three-arm non-inferiority clinical trial including a placebo is usually conducted to assess the assay sensitivity and internal validity of a trial. Recently, some large-sample approaches have been developed to assess the non-inferiority of a new treatment based on the three-arm trial design. However, these methods behave badly with small sample sizes in the three arms. This manuscript aims to develop some reliable small-sample methods to test three-arm non-inferiority.
Methods
Saddlepoint approximation, exact and approximate unconditional, and bootstrap-resampling methods are developed to calculate p-values of the Wald-type, score and likelihood ratio tests. Simulation studies are conducted to evaluate their performance in terms of type I error rate and power.
Results
Our empirical results show that the saddlepoint approximation method generally behaves better than the asymptotic method based on the Wald-type test statistic. For small sample sizes, approximate unconditional and bootstrap-resampling methods based on the score test statistic perform better in the sense that their corresponding type I error rates are generally closer to the prespecified nominal level than those of other test procedures.
Conclusions
Both approximate unconditional and bootstrap-resampling test procedures based on the score test statistic are generally recommended for three-arm non-inferiority trials with binary outcomes.
doi:10.1186/1471-2288-14-134
PMCID: PMC4277823  PMID: 25524326
Approximate unconditional test; Bootstrap-resampling test; Non-inferiority trial; Rate difference; Saddlepoint approximation; Three-arm design
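The bootstrap-resampling idea recommended above can be illustrated with a deliberately simplified two-arm sketch: resample binary outcomes at the boundary of the non-inferiority null and count how often the resampled rate difference reaches the observed one. The paper's actual procedures are three-arm and based on restricted estimates and score statistics; every number and name below is hypothetical.

```python
import random

def bootstrap_noninferiority_p(x_exp, n_exp, x_ref, n_ref, margin,
                               n_boot=2000, seed=7):
    """Two-arm illustration of a bootstrap-resampling p-value for
    H0: p_exp - p_ref <= -margin (experimental worse than reference
    by more than the margin). The test statistic is the observed rate
    difference; resampling is done at the boundary of H0 with a
    simple plug-in estimate."""
    rng = random.Random(seed)
    observed = x_exp / n_exp - x_ref / n_ref
    p_ref_hat = x_ref / n_ref
    p_exp_null = max(0.0, p_ref_hat - margin)   # boundary of H0
    exceed = 0
    for _ in range(n_boot):
        xe = sum(rng.random() < p_exp_null for _ in range(n_exp))
        xr = sum(rng.random() < p_ref_hat for _ in range(n_ref))
        if xe / n_exp - xr / n_ref >= observed:
            exceed += 1
    return exceed / n_boot

# Hypothetical trial: 45/50 responders vs 44/50, margin 0.1.
p = bootstrap_noninferiority_p(x_exp=45, n_exp=50, x_ref=44, n_ref=50,
                               margin=0.1)
```

A small p-value would support non-inferiority; the paper's point is that with small arms this resampling calibration keeps the type I error closer to the nominal level than the asymptotic Wald-type test.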
7.  Bias in the study of prediction of change: a Monte Carlo simulation study of the effects of selective attrition and inappropriate modeling of regression toward the mean 
Background
Medical researchers often use longitudinal observational studies to examine how risk factors predict change in health over time. Selective attrition and inappropriate modeling of regression toward the mean (RTM) are two potential sources of bias in such studies.
Method
The current study used Monte Carlo simulations to examine bias related to selective attrition and inappropriate modeling of RTM in the study of prediction of change. This was done for multiple regression (MR) and change score analysis.
Results
MR provided biased results when attrition was dependent on follow-up and baseline variables to quite substantial degrees, while results from change score analysis were biased when attrition was more strongly dependent on variables at one time point than the other. A positive association between the predictor and change in the health variable was underestimated in MR and overestimated in change score analysis due to selective attrition. Inappropriate modeling of RTM, on the other hand, led to overestimation of this association in MR and underestimation in change score analysis. Hence, selective attrition and inappropriate modeling of RTM biased the results in opposite directions.
Conclusion
MR and change score analysis are both quite robust against selective attrition. The interplay between selective attrition and inappropriate modeling of RTM emphasizes that it is not an easy task to assess the degree to which obtained results from empirical studies are over- versus underestimated due to attrition or RTM. Researchers should therefore use modern techniques for handling missing data and be careful to model RTM appropriately.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-133) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-133
PMCID: PMC4298063  PMID: 25519494
Monte Carlo simulation; Bias; Prediction of change; Longitudinal studies
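Regression toward the mean, one of the two bias sources this paper simulates, can be demonstrated in a few lines: measure a stable quantity twice with independent error and condition on an extreme first measurement. The minimal simulation below is only an illustration of the phenomenon, not the authors' Monte Carlo design; all numbers are invented.

```python
import random

# Two measurements of the same stable underlying health score, each
# with independent measurement error. Subjects selected for extreme
# baseline values drift back toward the mean at follow-up even though
# nothing truly changed.
rng = random.Random(42)
baseline, followup = [], []
for _ in range(20000):
    true_score = rng.gauss(0, 1)                 # stable true health
    baseline.append(true_score + rng.gauss(0, 1))  # baseline + noise
    followup.append(true_score + rng.gauss(0, 1))  # follow-up + noise

# Condition on a high baseline measurement.
high = [(b, f) for b, f in zip(baseline, followup) if b > 1.5]
mean_high_baseline = sum(b for b, _ in high) / len(high)
mean_high_followup = sum(f for _, f in high) / len(high)
# mean_high_followup sits well below mean_high_baseline: pure RTM,
# which a naive change-score analysis would read as real improvement.
```

Modeling change in this selected group without accounting for RTM produces exactly the kind of spurious "prediction of change" the paper warns about.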
8.  Visualizing inconsistency in network meta-analysis by independent path decomposition 
Background
In network meta-analysis, several alternative treatments can be compared by pooling the evidence of all randomised comparisons made in different studies. Incorporated indirect conclusions require a consistent network of treatment effects. An assessment of this assumption and of the influence of deviations is fundamental for the validity evaluation.
Methods
We show that network estimates for single pairwise treatment comparisons can be approximated by the evidence of a subnet that is decomposable into independent paths. Path-based estimates and the estimate of the residual evidence can be used with their contribution to the network estimate to set up a forest plot for the consistency assessment. Using a network meta-analysis of twelve antidepressants and controlled perturbations in the real and constructed consistent data, we discuss the consistency assessment by the independent path decomposition in contrast to an approach using a recently presented graphical tool, the net heat plot. In addition, we define influence functions that describe how changes in study effects are translated into network estimates.
Results
While the consistency assessment by the net heat plot comprises all network estimates, an independent path decomposition and its visualisation in a forest plot are tailored to one specific treatment comparison. This makes it possible to recognise whether inconsistencies between different paths of evidence, or outlier effects, affect the considered treatment comparison.
Conclusions
The approximation of the network estimate for a single comparison by the evidence of a subnet, and the visualisation of its decomposition into independent paths, make available a graphical validation instrument that is known from classical meta-analysis.
doi:10.1186/1471-2288-14-131
PMCID: PMC4279676  PMID: 25510877
Network meta-analysis; Multiple treatments comparison meta-analysis; Mixed treatment comparison meta-analysis; Inconsistency; Influence diagnostics; Forest plot
9.  Presenting simulation results in a nested loop plot 
Background
Statisticians investigate new methods in simulations to evaluate their properties for future real data applications. Results are often presented in a number of figures, e.g., Trellis plots. We had conducted a simulation study on six statistical methods for estimating the treatment effect in binary outcome meta-analyses, where selection bias (e.g., publication bias) was suspected because of apparent funnel plot asymmetry. We varied five simulation parameters: true treatment effect, extent of selection, event proportion in control group, heterogeneity parameter, and number of studies in meta-analysis. In combination, this yielded a total number of 768 scenarios. To present all results using Trellis plots, 12 figures were needed.
Methods
Choosing bias as criterion of interest, we present a ‘nested loop plot’, a diagram type that aims to have all simulation results in one plot. The idea was to bring all scenarios into a lexicographical order and arrange them consecutively on the horizontal axis of a plot, whereas the treatment effect estimate is presented on the vertical axis.
Results
The plot illustrates how parameters simultaneously influenced the estimate. It can be combined with a Trellis plot in a so-called hybrid plot. Nested loop plots may also be applied to other criteria such as the variance of estimation.
Conclusion
The nested loop plot, similar to a time series graph, summarizes all information about the results of a simulation study with respect to a chosen criterion in one picture and provides a suitable alternative or an addition to Trellis plots.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-129) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-129
PMCID: PMC4272778  PMID: 25495636
Simulation study; Plot; Diagram; Graphical representation; Trellis plot
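The lexicographical ordering at the heart of the nested loop plot can be sketched directly: enumerate all parameter combinations in nested-loop order and assign each scenario a consecutive horizontal position. The parameter grids below are invented and far smaller than the paper's 768 scenarios.

```python
from itertools import product

# Invented simulation parameters (the paper varies five).
true_effect = [0.0, 0.5]
selection = ["none", "moderate", "strong"]
n_studies = [5, 10]

# product() varies the last factor fastest, which is exactly the
# lexicographical ("nested loop") order; each scenario then gets one
# consecutive position on the horizontal axis, and the estimate for
# that scenario is plotted on the vertical axis.
scenarios = list(product(true_effect, selection, n_studies))
x_position = {s: i for i, s in enumerate(scenarios)}
```

Stepped reference lines showing each parameter's current value along the axis, as in the paper's figures, can then be drawn from the same ordering.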
10.  A trivariate meta-analysis of diagnostic studies accounting for prevalence and non-evaluable subjects: re-evaluation of the meta-analysis of coronary CT angiography studies 
Background
A recent paper proposed an intent-to-diagnose approach to handle non-evaluable index test results and discussed several alternative approaches, with an application to the meta-analysis of coronary CT angiography diagnostic accuracy studies. However, no simulation studies have been conducted to test the performance of the methods.
Methods
We propose an extended trivariate generalized linear mixed model (TGLMM) to handle non-evaluable index test results. The performance of the intent-to-diagnose approach, the alternative approaches and the extended TGLMM approach is examined by extensive simulation studies. The meta-analysis of coronary CT angiography diagnostic accuracy studies is re-evaluated by the extended TGLMM.
Results
Simulation studies showed that the intent-to-diagnose approach under-estimates sensitivity and specificity. Under the missing at random (MAR) assumption, the TGLMM gives nearly unbiased estimates of test accuracy indices and disease prevalence. After applying the TGLMM approach to re-evaluate the coronary CT angiography meta-analysis, overall median sensitivity is 0.98 (0.967, 0.993), specificity is 0.875 (0.827, 0.923) and disease prevalence is 0.478 (0.379, 0.577).
Conclusions
Under the MAR assumption, the intent-to-diagnose approach under-estimates both sensitivity and specificity, while the extended TGLMM gives nearly unbiased estimates of sensitivity, specificity and prevalence. We recommend the extended TGLMM for handling non-evaluable index test subjects.
doi:10.1186/1471-2288-14-128
PMCID: PMC4280699  PMID: 25475705
Meta-analysis; Diagnostic test; Non-evaluable subjects
11.  Methodological and ethical issues in research using social media: a metamethod of Human Papillomavirus vaccine studies 
Background
Online content is a primary source of healthcare information for internet-using adults and a rich resource for health researchers. This paper explores the methodological and ethical issues of engaging in health research using social media.
Methods
A metamethod was performed on systematically selected studies that used social media as a data source for exploring public awareness and beliefs about Human Papillomaviruses (HPV) and HPV vaccination. Seven electronic databases were searched using a variety of search terms identified for each of three concepts: social media, HPV vaccine, and research method. Abstracts were assessed for eligibility of inclusion; six studies met the eligibility criteria and were subjected to content analysis. A 10-item coding scheme was developed to assess the clarity, congruence and transparency of research design, epistemological and methodological underpinnings and ethical considerations.
Results
The designs of the six selected studies were sound, although most studies could have been more transparent about how they built in rigor to ensure the trustworthiness and credibility of findings. Statistical analysis that intended to measure trends and patterns did so without the benefit of randomized sampling and other design elements for ensuring generalizability or reproducibility of findings beyond the specified virtual community. Most researchers did not sufficiently engage virtual users in the research process or consider the risk of privacy incursion. Most studies did not seek ethical approval from an institutional research board or permission from host websites or web service providers.
Conclusions
The metamethod exposed missed opportunities for using the dialogical character of social media as well as a lack of attention to the unique ethical issues inherent in operating in a virtual community where social boundaries and issues of public and private are ambiguous. This suggests the need for more self-conscious and ethical research practices when using social media as a data source. Given the relative newness of virtual communities, researchers and ethics review boards must work together to develop expertise in evaluating the design of studies undertaken with virtual communities. We recommend that the principles of concern for welfare, respect for persons, and justice be applied in research using social media.
doi:10.1186/1471-2288-14-127
PMCID: PMC4265425  PMID: 25468265
Metamethod; Social media; Data collection; HPV vaccination; Ethics; Methodology
12.  Identifying complications of interventional procedures from UK routine healthcare databases: a systematic search for methods using clinical codes 
Background
Several authors have developed and applied methods to routine data sets to identify the nature and rate of complications following interventional procedures. But, to date, there has been no systematic search for such methods. The objective of this article was to find, classify and appraise published methods, based on analysis of clinical codes, which used routine healthcare databases in a United Kingdom setting to identify complications resulting from interventional procedures.
Methods
A literature search strategy was developed to identify published studies that referred, in the title or abstract, to the name or acronym of a known routine healthcare database and to complications from procedures or devices. The following data sources were searched in February and March 2013: Cochrane Methods Register, Conference Proceedings Citation Index – Science, Econlit, EMBASE, Health Management Information Consortium, Health Technology Assessment database, MathSciNet, MEDLINE, MEDLINE in-process, OAIster, OpenGrey, Science Citation Index Expanded and ScienceDirect. Of the eligible papers, those which reported methods using clinical coding were classified and summarised in tabular form using the following headings: routine healthcare database; medical speciality; method for identifying complications; length of follow-up; method of recording comorbidity. The benefits and limitations of each approach were assessed.
Results
From 3688 papers identified from the literature search, 44 reported the use of clinical codes to identify complications, from which four distinct methods were identified: 1) searching the index admission for specified clinical codes, 2) searching a sequence of admissions for specified clinical codes, 3) searching for specified clinical codes for complications from procedures and devices within the International Classification of Diseases 10th revision (ICD-10) coding scheme which is the methodology recommended by NHS Classification Service, and 4) conducting manual clinical review of diagnostic and procedure codes.
Conclusions
The four distinct methods for identifying complications from codified data offer great potential for generating new evidence on the quality and safety of new procedures using routine data. However, the most robust method, which uses the methodology recommended by the NHS Classification Service, was the least frequently used, highlighting that much valuable observational data is being ignored.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-126) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-126
PMCID: PMC4280749  PMID: 25430568
Adverse effects; Medical errors; Patient safety; Health information systems
13.  Analysis of four studies in a comparative framework reveals: health linkage consent rates on British cohort studies higher than on UK household panel surveys 
Background
A number of cohort studies and longitudinal household panel studies in Great Britain have asked for consent to link survey data to administrative health data. We explore commonalities and differences in the process of collecting consent, achieved consent rates and biases in consent with respect to socio-demographic, socio-economic and health characteristics. We hypothesise that British cohort studies which are rooted within the health sciences achieve higher consent rates than the UK household longitudinal studies which are rooted within the social sciences. By contrast, the lack of a specific health focus in household panel studies means there may be less selectivity in consent, in particular, with respect to health characteristics.
Methods
Survey designs and protocols for collecting informed consent to health record linkage on two British cohort studies and two UK household panel studies are systematically compared. Multivariate statistical analysis is then performed on information from one cohort and two household panel studies that share a great deal of the data linkage protocol but vary according to study branding, survey design and study population.
Results
We find that consent is higher in the British cohort studies than in the UK household panel studies, and is higher the more health-focused the study is. There are no systematic patterns of consent bias across the studies, and where effects exist within a study or study type they tend to be small. On the basis of all three studies, minority ethnic groups will be underrepresented in record linkage studies.
Conclusions
Systematic analysis of three studies in a comparative framework suggests that the factors associated with consent are idiosyncratic to the study. Analysis of linked health data is needed to establish whether selectivity in consent means the resulting research databases suffer from any biases that ought to be considered.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-125) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-125
PMCID: PMC4280701  PMID: 25430545
14.  Shame for disrespecting evidence: the personal consequences of insufficient respect for structural equation model testing 
Background
Inappropriate and unacceptable disregard for structural equation model (SEM) testing can be traced back to: factor-analytic inattention to model testing, misapplication of the Wilkinson task force’s [Am Psychol 54:594-604, 1999] critique of tests, exaggeration of test biases, and uncomfortably-numerous model failures.
Discussion
The arguments for disregarding structural equation model testing are reviewed and found to be misguided or flawed. The fundamental test-supporting observations are: a) that the null hypothesis of the χ2 structural equation model test is not nil, but notable because it contains substantive theory claims and consequences; and b) that the amount of covariance ill fit cannot be trusted to report the seriousness of model misspecifications. All covariance-based fit indices risk failing to expose model problems because the extent of model misspecification does not reliably correspond to the magnitude of covariance ill fit – seriously causally misspecified models can fit, or almost fit.
Summary
The only reasonable research response to evidence of non-chance structural equation model failure is to diagnostically investigate the reasons for failure. Unfortunately, many SEM-based theories and measurement scales will require reassessment if we are to clear the backlogged consequences of previous deficient model testing. Fortunately, it will be easier for researchers to respect evidence pointing toward required reassessments, than to suffer manuscript rejection and shame for disrespecting evidence potentially signaling serious model misspecifications.
doi:10.1186/1471-2288-14-124
PMCID: PMC4297459  PMID: 25430437
Factor analysis; Factor model; Testing; Close fit; Structural equation model; SEM
15.  Impact of adding a limitations section to abstracts of systematic reviews on readers’ interpretation: a randomized controlled trial 
Background
To allow an accurate evaluation of abstracts of systematic reviews, the PRISMA Statement recommends that the limitations of the evidence (e.g., risk of bias, publication bias, inconsistency, imprecision) should be described in the abstract. We aimed to evaluate the impact of adding such limitations sections on readers’ interpretation.
Method
We performed a two-arm parallel group randomized controlled trial (RCT) using a sample of 30 abstracts of systematic reviews evaluating the effects of healthcare interventions, with conclusions favoring the beneficial effect of the experimental treatments. Two formats of these abstracts were derived: one without and one with a standardized limitations section written according to the PRISMA statement for abstracts. The primary outcome was readers’ confidence in the results of the systematic review as stated in the abstract, assessed by a Likert scale from 0, not at all confident, to 10, very confident. In total, 300 participants (corresponding authors of RCT reports indexed in PubMed) were randomized by a web-based randomization procedure to interpret one abstract with a limitations section (n = 150) or without a limitations section (n = 150). Participants were blinded to the study hypothesis.
Results
Adding a limitations section did not modify readers’ interpretation of findings in terms of confidence in the results (mean difference [95% confidence interval] 0.19 [−0.37 to 0.74], p = 0.50), confidence in the validity of the conclusions (0.07 [−0.49 to 0.62], p = 0.80), or benefit of the experimental intervention (0.12 [−0.42 to 0.44], p = 0.65).
This study is limited because the participants were expert readers and are not representative of all systematic review readers.
Conclusion
Adding a limitations section to abstracts of systematic reviews did not affect readers’ interpretation of the abstract results. Other studies are needed to confirm the results and explore the impact of a limitations section on a less expert panel of participants.
Trial registration
ClinicalTrial.gov (NCT01848782).
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-123) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-123
PMCID: PMC4247631  PMID: 25420433
Meta-analysis; Systematic review; Bias; Limits; Limitation; Interpretation; Interpretation bias; Misinterpretation; Abstract; Results
16.  Conditional Poisson models: a flexible alternative to conditional logistic case cross-over analysis 
Background
The time stratified case cross-over approach is a popular alternative to conventional time series regression for analysing associations between time series of environmental exposures (air pollution, weather) and counts of health outcomes. These are almost always analysed using conditional logistic regression on data expanded to case–control (case crossover) format, but this has some limitations. In particular, adjusting for overdispersion and auto-correlation in the counts is not possible. It has been established that a Poisson model for counts with stratum indicators gives identical estimates to those from conditional logistic regression and does not have these limitations, but it is little used, probably because of the overheads in estimating many stratum parameters.
Methods
The conditional Poisson model avoids estimating stratum parameters by conditioning on the total event count in each stratum, thus simplifying the computing and increasing the number of strata for which fitting is feasible compared with the standard unconditional Poisson model. Unlike the conditional logistic model, the conditional Poisson model does not require expanding the data, and can adjust for overdispersion and auto-correlation. It is available in Stata, R, and other packages.
Results
By applying both approaches to real data and using simulations, we demonstrate that conditional Poisson models were simpler to code and faster to run than conditional logistic analyses, and could be fitted to larger data sets than is possible with standard Poisson models. Allowing for overdispersion or autocorrelation was possible with the conditional Poisson model, but when not required this model gave identical estimates to those from conditional logistic regression.
Conclusions
Conditional Poisson regression models provide an alternative to case crossover analysis of stratified time series data with some advantages. The conditional Poisson model can also be used in other contexts in which primary control for confounding is by fine stratification.
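As background to the computational claim, the conditional Poisson likelihood arises because, given the total count in a stratum, the counts are multinomial with cell probabilities proportional to exp(βx), so the stratum parameters cancel. A minimal stdlib-Python sketch for a single exposure (the toy data and the golden-section optimiser are our own illustration, not the paper's code or the Stata/R implementations):

```python
import math

def cond_poisson_loglik(beta, strata):
    """Conditional Poisson log-likelihood (constants dropped).
    strata: list of (counts, exposures) pairs, one pair per stratum."""
    ll = 0.0
    for ys, xs in strata:
        total = sum(ys)
        log_denom = math.log(sum(math.exp(beta * x) for x in xs))
        ll += sum(y * beta * x for y, x in zip(ys, xs)) - total * log_denom
    return ll

def fit_beta(strata, lo=-5.0, hi=5.0, tol=1e-8):
    """Golden-section search for the maximising beta (the problem is
    concave in one dimension, so no derivatives are needed)."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if cond_poisson_loglik(c, strata) < cond_poisson_loglik(d, strata):
            a = c
        else:
            b = d
    return (a + b) / 2

# Invented toy data: three strata of daily counts with a pollution-like exposure
strata = [
    ([3, 5, 8], [0.0, 1.0, 2.0]),
    ([2, 4, 7], [0.0, 1.0, 2.0]),
    ([4, 6, 9], [0.0, 1.0, 2.0]),
]
beta_hat = fit_beta(strata)  # log rate ratio per unit of exposure
```

No per-stratum intercepts are ever estimated, which is exactly why the approach scales to many strata.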
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-122) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-122
PMCID: PMC4280686  PMID: 25417555
Statistics; Conditional distributions; Poisson regression; Time series regression; Environment
17.  Comparison of confidence interval methods for an intra-class correlation coefficient (ICC) 
Background
The intraclass correlation coefficient (ICC) is widely used in biomedical research to assess the reproducibility of measurements between raters, labs, technicians, or devices. For example, in an inter-rater reliability study, a high ICC value means that noise variability (between-raters and within-raters) is small relative to variability from patient to patient. A confidence interval or Bayesian credible interval for the ICC is a commonly reported summary. Such intervals can be constructed employing either frequentist or Bayesian methodologies.
Methods
This study examines the performance of three different methods for constructing an interval in a two-way, crossed, random effects model without interaction: the Generalized Confidence Interval method (GCI), the Modified Large Sample method (MLS), and a Bayesian method based on a noninformative prior distribution (NIB). Guidance is provided on interval construction method selection based on study design, sample size, and normality of the data. We compare the coverage probabilities and widths of the different interval methods.
Results
We show that, for the two-way, crossed, random effects model without interaction, care is needed in interval method selection because the interval estimates do not always have properties that the user expects. While the different methods generally perform well when there are a large number of levels of each factor, large differences between the methods emerge when the number of levels of one or more factors is limited. In addition, all methods are shown to lack robustness to certain hard-to-detect violations of normality when the sample size is limited.
Conclusions
Decision rules and software programs for interval construction are provided for practical implementation in the two-way, crossed, random effects model without interaction. All interval methods perform similarly when the data are normal and there are sufficient numbers of levels of each factor. The MLS and GCI methods outperform the NIB when one of the factors has a limited number of levels and the data are normally distributed or nearly normally distributed. None of the methods works well if the number of levels of a factor is limited and the data are markedly non-normal. The software programs are implemented in the popular R language.
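For orientation, the single-measurement absolute-agreement ICC in the two-way crossed random effects model without interaction can be estimated from the ANOVA mean squares. A minimal sketch using the standard ICC(A,1) point estimator (the ratings are invented; the paper's focus is the harder interval-construction problem):

```python
def icc_a1(data):
    """ICC(A,1): absolute agreement, single rater, two-way random effects
    model without interaction. data[i][j] = rating of subject i by rater j."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)  # subjects
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)  # raters
    sse = sum((data[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(n) for j in range(k))
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

ratings = [  # 5 subjects rated by 3 raters (invented)
    [9, 2, 5],
    [6, 1, 3],
    [8, 4, 6],
    [7, 1, 2],
    [10, 5, 6],
]
icc = icc_a1(ratings)
```

Here the large between-rater differences (column means 8.0, 2.6, 4.4) pull the absolute-agreement ICC down even though subjects are rank-ordered fairly consistently.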
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-121) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-121
PMCID: PMC4258044  PMID: 25417040
Confidence interval; Credible interval; Generalized confidence interval; Intraclass correlation coefficient; Modified large sample
18.  Thresholds for statistical and clinical significance in systematic reviews with meta-analytic methods 
Background
Traditional 95% confidence intervals and P-values are insufficient for demonstrating thresholds for statistical significance when assessing meta-analysis results. Assessment of intervention effects in systematic reviews with meta-analysis deserves greater rigour.
Methods
Methodologies for assessing statistical and clinical significance of intervention effects in systematic reviews were considered. Balancing simplicity and comprehensiveness, an operational procedure was developed, based mainly on The Cochrane Collaboration methodology and the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) guidelines.
Results
We propose an eight-step procedure for better validation of meta-analytic results in systematic reviews:
(1) Obtain the 95% confidence intervals and the P-values from both fixed-effect and random-effects meta-analyses and report the most conservative results as the main results.
(2) Explore the reasons behind substantial statistical heterogeneity using subgroup and sensitivity analyses (see step 6).
(3) To take account of problems with multiplicity, adjust the thresholds for significance according to the number of primary outcomes.
(4) Calculate required information sizes (≈ the a priori required number of participants for a meta-analysis to be conclusive) for all outcomes, analyse each outcome with trial sequential analysis, and report whether the trial sequential monitoring boundaries for benefit, harm, or futility are crossed.
(5) Calculate Bayes factors for all primary outcomes.
(6) Use subgroup analyses and sensitivity analyses to assess the potential impact of bias on the review results.
(7) Assess the risk of publication bias.
(8) Assess the clinical significance of the statistically significant review results.
Conclusions
If followed, the proposed eight-step procedure will increase the validity of assessments of intervention effects in systematic reviews of randomised clinical trials.
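Step 4's required information size reduces, when heterogeneity is ignored, to the sample size of a single adequately powered trial. A hedged sketch of that baseline calculation for a binary outcome (the function name and example values are our own; trial sequential analysis proper additionally adjusts for heterogeneity and interim looks):

```python
import math
from statistics import NormalDist

def required_information_size(alpha, power, control_rate, rrr):
    """Diversity-free (heterogeneity-ignoring) required information size for
    a binary outcome: total participants needed to detect a relative risk
    reduction `rrr` from the control event rate, at two-sided alpha and the
    given power, using the pooled-variance two-proportion approximation."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p1 = control_rate
    p2 = control_rate * (1 - rrr)
    p_bar = (p1 + p2) / 2
    delta = p1 - p2
    n_per_arm = (z_a + z_b) ** 2 * 2 * p_bar * (1 - p_bar) / delta ** 2
    return 2 * math.ceil(n_per_arm)

# Invented example: 10% control event rate, 20% relative risk reduction,
# two-sided alpha of 5%, power of 90%
ris = required_information_size(0.05, 0.90, 0.10, 0.20)
```

A meta-analysis whose accumulated participants fall well short of this number cannot be considered conclusive under step 4, whatever its P-value.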
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-120) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-120
PMCID: PMC4251848  PMID: 25416419
19.  Handling missing data in RCTs; a review of the top medical journals 
Background
Missing outcome data is a threat to the validity of treatment effect estimates in randomized controlled trials. We aimed to evaluate the extent, handling, and sensitivity analysis of missing data and intention-to-treat (ITT) analysis of randomized controlled trials (RCTs) in top tier medical journals, and compare our findings with previous reviews related to missing data and ITT in RCTs.
Methods
Review of RCTs published between July and December 2013 in the BMJ, JAMA, Lancet, and New England Journal of Medicine, excluding cluster randomized trials and trials whose primary outcome was survival.
Results
Of the 77 identified eligible articles, 73 (95%) reported some missing outcome data. The median percentage of participants with a missing outcome was 9% (range 0–70%). The most commonly used method to handle missing data in the primary analysis was complete case analysis (33, 45%), while 20 (27%) performed simple imputation, 15 (19%) used model-based methods, and 6 (8%) used multiple imputation. Twenty-seven (35%) trials with missing data reported a sensitivity analysis; however, most did not alter the assumptions about the missing data from the primary analysis. Reports of ITT or modified ITT were found in 52 (85%) trials, with 21 (40%) of them including all randomized participants. A comparison with a review of trials reported in 2001 showed that missing data rates and approaches are similar, but the use of the term ITT has increased, as has the reporting of sensitivity analyses.
Conclusions
Missing outcome data continues to be a common problem in RCTs. Definitions of the ITT approach remain inconsistent across trials. A large gap is apparent between statistical methods research related to missing data and use of these methods in application settings, including RCTs in top medical journals.
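One of the less-used approaches above, multiple imputation, ends with the m completed-data analyses being combined by Rubin's rules. A minimal sketch (the example estimates and variances are invented):

```python
import statistics

def pool_rubin(estimates, variances):
    """Rubin's rules: combine point estimates and their (squared-SE)
    variances from m analyses of multiply imputed data sets."""
    m = len(estimates)
    q_bar = statistics.mean(estimates)      # pooled point estimate
    w_bar = statistics.mean(variances)      # within-imputation variance
    b = statistics.variance(estimates)      # between-imputation variance
    total = w_bar + (1 + 1 / m) * b         # total variance of the estimate
    return q_bar, total

# Invented example: m = 5 imputed-data analyses of one treatment effect
est, var = pool_rubin([1.8, 2.1, 2.0, 1.9, 2.2],
                      [0.25, 0.24, 0.26, 0.25, 0.25])
```

The between-imputation term is what a single (simple) imputation discards, which is why simple imputation tends to understate uncertainty.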
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-118) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-118
PMCID: PMC4247714  PMID: 25407057
Missing data; Intention-to-treat; Sensitivity analysis
20.  A methodological systematic review of what’s wrong with meta-ethnography reporting 
Background
Syntheses of qualitative studies can inform health policy, services and our understanding of patient experience. Meta-ethnography is a systematic seven-phase interpretive qualitative synthesis approach well-suited to producing new theories and conceptual models. However, there are concerns about the quality of meta-ethnography reporting, particularly the analysis and synthesis processes. Our aim was to investigate the application and reporting of methods in recent meta-ethnography journal papers, focusing on the analysis and synthesis process and output.
Methods
Methodological systematic review of health-related meta-ethnography journal papers published from 2012–2013. We searched six electronic databases, Google Scholar and Zetoc for papers using key terms including ‘meta-ethnography.’ Two authors independently screened papers by title and abstract with 100% agreement. We identified 32 relevant papers. Three authors independently extracted data and all authors analysed the application and reporting of methods using content analysis.
Results
Meta-ethnography was applied in diverse ways, sometimes inappropriately. In 13% of papers, the approach did not suit the research aim. In 66% of papers, reviewers did not follow the principles of meta-ethnography. The analytical and synthesis processes were poorly reported overall. In only 31% of papers did reviewers clearly describe how they analysed conceptual data from primary studies (phase 5, ‘translation’ of studies), and in only one paper (3%) did reviewers explicitly describe how they conducted the analytic synthesis process (phase 6). In 38% of papers, we could not ascertain whether reviewers had achieved any new interpretation of primary studies. In over 30% of papers, seminal methodological texts that could have informed the methods were not cited.
Conclusions
We believe this is the first in-depth methodological systematic review of meta-ethnography conduct and reporting. Meta-ethnography is an evolving approach. Current reporting of methods, analysis and synthesis lacks clarity and comprehensiveness. This is a major barrier to use of meta-ethnography findings that could contribute significantly to the evidence base because it makes judging their rigour and credibility difficult. To realise the high potential value of meta-ethnography for enhancing health care and understanding patient experience requires reporting that clearly conveys the methodology, analysis and findings. Tailored meta-ethnography reporting guidelines, developed through expert consensus, could improve reporting.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-119) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-119
PMCID: PMC4277825  PMID: 25407140
Meta-ethnography; Systematic review; Qualitative health research; Reporting; Qualitative synthesis; Health; Evidence-based practice
21.  Added predictive value of omics data: specific issues related to validation illustrated by two case studies 
Background
In recent years, the importance of independently validating the prediction ability of a new gene signature has been widely recognized. Recently, with the development of gene signatures which integrate rather than replace the clinical predictors in the prediction rule, the focus has shifted to the validation of the added predictive value of a gene signature, i.e. to the verification that the inclusion of the new gene signature in a prediction model is able to improve its prediction ability.
Methods
The high-dimensional nature of the data from which a new signature is derived raises challenging issues and necessitates the modification of classical methods to adapt them to this framework. Here we show how to validate the added predictive value of a signature derived from high-dimensional data and critically discuss the impact of the choice of methods on the results.
Results
The analysis of the added predictive value of two gene signatures developed in two recent studies on the survival of leukemia patients allows us to illustrate and empirically compare different validation techniques in the high-dimensional framework.
Conclusions
The issues related to the high-dimensional nature of the omics predictors space affect the validation process. An analysis procedure based on repeated cross-validation is suggested.
doi:10.1186/1471-2288-14-117
PMCID: PMC4271356  PMID: 25352096
Added predictive value; Omics score; Prediction model; Time-to-event data; Validation
22.  Validation of prediction models based on lasso regression with multiply imputed data 
Background
In prognostic studies, the lasso technique is attractive since it improves the quality of predictions by shrinking regression coefficients, compared to predictions based on a model fitted via unpenalized maximum likelihood. Since some coefficients are set to zero, parsimony is achieved as well. It is unclear whether the performance of a model fitted using the lasso still shows some optimism. Bootstrap methods have been advocated to quantify optimism and generalize model performance to new subjects. It is unclear how resampling should be performed in the presence of multiply imputed data.
Method
The data came from a cohort of Chronic Obstructive Pulmonary Disease patients. We constructed models to predict Chronic Respiratory Questionnaire dyspnea 6 months ahead. Optimism of the lasso model was investigated by comparing four approaches to handling multiply imputed data in the bootstrap procedure, using the study data and simulated data sets. In the first three approaches, data sets that had been completed via multiple imputation (MI) were resampled, while the fourth approach resampled the incomplete data set and then performed MI.
Results
The discriminative performance of the lasso model was optimistic. There was suboptimal calibration due to over-shrinkage. The estimate of optimism was sensitive to the choice of how imputed data were handled in the bootstrap resampling procedure. Resampling the completed data sets underestimates optimism, especially if, within a bootstrap step, selected individuals differ over the imputed data sets. Incorporating the MI procedure in the validation yields estimates of optimism that are closer to the true value, albeit slightly too large.
Conclusion
Performance of prognostic models constructed using the lasso technique can be optimistic as well. Results of the internal validation are sensitive to how bootstrap resampling is performed.
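The shrinkage and parsimony properties mentioned in the Background are easiest to see in the orthonormal-design special case, where the lasso solution is simply soft-thresholding of the OLS coefficients. A minimal sketch (illustrative only; the study fitted penalised regressions to real data, not this special case):

```python
import math

def soft_threshold(beta_ols, lam):
    """Lasso solution under an orthonormal design: shrink each OLS
    coefficient towards zero by lam, setting small ones exactly to zero
    (this exact-zeroing is what yields a parsimonious model)."""
    return [math.copysign(max(abs(b) - lam, 0.0), b) for b in beta_ols]

# Invented coefficients: large ones are shrunk, small ones are dropped
shrunk = soft_threshold([2.5, -0.3, 0.9, 0.05], lam=0.4)
```

The penalty `lam` is the tuning parameter whose choice (typically by cross-validation) is itself part of the model-building process and so must be repeated inside any honest internal validation loop.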
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-116) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-116
PMCID: PMC4209042  PMID: 25323009
Clinical prediction models; Model validation; Multiple imputation; Quality of life; Shrinkage
23.  Comparison of two instruments for measurement of quality of life in clinical practice - a qualitative study 
Background
The study aimed to investigate the meaning patients assign to two measures of quality of life: the Schedule for Evaluation of Individual Quality of Life Direct Weighting (SEIQoL-DW) and the SEIQoL-DW Disease Related (DR) version, in a clinical oncology setting. Even though the use of quality of life assessments has increased during the past decades, uncertainty regarding how to choose the most suitable measure remains. Both SEIQoL-DW versions assess the individual’s perception of his or her present quality of life by allowing the individual to nominate the domains to be evaluated, followed by a weighting procedure resulting in qualitative (domains) as well as quantitative outcomes (index score).
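The weighting procedure described above yields the quantitative index as a weighted sum of domain ratings. A minimal illustration (domain names, ratings and weights below are invented; consult the SEIQoL-DW manual for the actual administration procedure):

```python
import math

def seiqol_index(domains):
    """SEIQoL-style index: each nominated life domain carries a 0-100
    satisfaction rating and a relative weight; the weights sum to 1.
    Returns an overall 0 (worst) to 100 (best) quality-of-life index."""
    total_weight = sum(w for _, _, w in domains)
    if not math.isclose(total_weight, 1.0):
        raise ValueError("domain weights must sum to 1")
    return sum(rating * weight for _, rating, weight in domains)

# Invented example: four patient-nominated domains with ratings and weights
cues = [
    ("family",  80, 0.40),
    ("health",  40, 0.30),
    ("work",    60, 0.20),
    ("leisure", 70, 0.10),
]
index = seiqol_index(cues)
```

The qualitative output is the list of nominated domains itself; the index is only the quantitative summary of how satisfied the patient is within them.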
Methods
The study applied a cross-sectional design with a qualitative approach and collected data from a purposeful sample of 40 patients with gastrointestinal cancer. Patients were asked to complete two measures, the SEIQoL-DW and the SEIQoL-DR, to assess quality of life. This included nomination of the areas in life considered most important and rating of these areas; after completion, patients participated in cognitive interviews around their selections of areas. Interviews were audiotaped and transcribed verbatim, then analysed using a phenomenographic approach.
Results
The analyses of nominated areas of the two measures resulted in 11 domains reflecting what patients perceived had greatest impact on their quality of life. Analysis of the cognitive interviews resulted in 16 thematic categories explaining the nominated domains. How patients reflected around their quality of life appeared to differ by version (DW vs. DR). The DW version more often related to positive aspects in life while the DR version more often related to negative changes in life due to having cancer.
Conclusions
The two SEIQoL versions tap into different concepts: health-related quality of life, addressing losses and problems related to having cancer, and quality of life, more associated with aspects perceived as positive in life. Using both the SEIQoL-DR and the SEIQoL-DW in clinical practice is recommended, to take both negative and positive aspects into account and to act on the problems of greatest importance to the patient.
doi:10.1186/1471-2288-14-115
PMCID: PMC4210549  PMID: 25300493
Cognitive interviews; Gastrointestinal cancer; Health-related quality of life; Measures; SEIQoL; Quality of life
24.  Accuracy of the Berger-Exner test for detecting third-order selection bias in randomised controlled trials: a simulation-based investigation 
Background
Randomised controlled trials (RCTs) are highly influential upon medical decisions. Thus RCTs must not distort the truth. One threat to internal trial validity is the correct prediction of future allocations (selection bias). The Berger-Exner test detects such bias but has not been widely utilised in practice. One reason for this non-utilisation may be a lack of information regarding its test accuracy. The objective of this study was to assess the accuracy of the Berger-Exner test on the basis of relevant simulations for RCTs with dichotomous outcomes.
Methods
Simulated RCTs with various parameter settings were generated, using R software, and subjected to bias-free and selection bias scenarios. The effect size inflation due to bias was quantified. The test was applied in both scenarios and the pooled sensitivity and specificity, with 95% confidence intervals for alpha levels of 1%, 5%, and 20%, were computed. Summary ROC curves were generated and the relationships of parameters with test accuracy were explored.
Results
An effect size inflation of 71% – 99% was established. Test sensitivity was 1.00 (95% CI 0.99–1.00) at alpha levels of 1%, 5%, and 20%; test specificity was 0.94 (95% CI 0.93–0.96), 0.82 (95% CI 0.80–0.84), and 0.56 (95% CI 0.54–0.58) at alpha levels of 1%, 5%, and 20%, respectively. Test accuracy was best with the maximal procedure with a maximum tolerated imbalance (MTI) = 2 as the randomisation method at alpha 1%.
Conclusions
The results of this simulation study suggest that the Berger-Exner test is generally accurate for identifying third-order selection bias.
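The pooled sensitivities and specificities above are proportions over simulated trials, so confidence intervals of the reported kind can be obtained from a standard proportion interval. A sketch using the Wilson score interval (an assumption on our part; the authors may have pooled differently, and the counts below are invented):

```python
import math

def wilson_ci(successes, n, z=1.96):
    """Wilson score 95% interval for a proportion, e.g. the sensitivity or
    specificity of a bias test estimated over n simulated trials."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Invented counts: the test flags 1960 of 2000 truly biased simulated trials
lo, hi = wilson_ci(1960, 2000)
```

Unlike the naive Wald interval, the Wilson interval stays inside [0, 1] even when the estimated proportion is near 1, as it is for the sensitivities reported here.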
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2288-14-114) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2288-14-114
PMCID: PMC4209086  PMID: 25283963
Randomised trials; Selection bias; Berger-Exner test; Sensitivity; Specificity; ROC curve
25.  A comparison of three clustering methods for finding subgroups in MRI, SMS or clinical data: SPSS TwoStep Cluster analysis, Latent Gold and SNOB 
Background
There are various methodological approaches to identifying clinically important subgroups, and one method is to identify clusters of characteristics that differentiate people in cross-sectional and/or longitudinal data using Cluster Analysis (CA) or Latent Class Analysis (LCA). There is a scarcity of head-to-head comparisons that could inform the choice of clustering method for particular clinical datasets and research questions. Therefore, the aim of this study was to perform a head-to-head comparison of three commonly available methods (SPSS TwoStep CA, Latent Gold LCA and SNOB LCA).
Methods
The performance of these three methods was compared: (i) quantitatively using the number of subgroups detected, the classification probability of individuals into subgroups, the reproducibility of results, and (ii) qualitatively using subjective judgments about each program’s ease of use and interpretability of the presentation of results.
We analysed five real datasets of varying complexity in a secondary analysis of data from other research projects. Three datasets contained only MRI findings (n = 2,060 to 20,810 vertebral disc levels), one dataset contained only pain intensity data collected for 52 weeks by text (SMS) messaging (n = 1,121 people), and the last dataset contained a range of clinical variables measured in low back pain patients (n = 543 people). Four artificial datasets (n = 1,000 each) containing subgroups of varying complexity were also analysed testing the ability of these clustering methods to detect subgroups and correctly classify individuals when subgroup membership was known.
Results
The results from the real clinical datasets indicated that the number of subgroups detected varied, the certainty of classifying individuals into those subgroups varied, the findings had perfect reproducibility, some programs were easier to use than others, and the interpretability of the presentation of their findings also varied. The results from the artificial datasets indicated that all three clustering methods showed a near-perfect ability to detect known subgroups and correctly classify individuals into those subgroups.
Conclusions
Our subjective judgement was that Latent Gold offered the best balance of sensitivity to subgroups, ease of use and presentation of results with these datasets but we recognise that different clustering methods may suit other types of data and clinical research questions.
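When subgroup membership is known, as in the artificial datasets, agreement between a derived and a true partition is often summarised by a chance-corrected index such as the adjusted Rand index (our illustration; the paper reports classification ability rather than this particular index):

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same
    individuals: 1 = identical subgroups, ~0 = chance-level agreement."""
    n = len(labels_a)
    pairs = Counter(zip(labels_a, labels_b))          # contingency cells
    sum_cells = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)             # chance agreement
    max_index = (sum_a + sum_b) / 2
    return (sum_cells - expected) / (max_index - expected)

truth   = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # known subgroup membership
derived = [0, 0, 1, 1, 1, 1, 2, 2, 2]  # clustering output, one misclassified
ari = adjusted_rand_index(truth, derived)
```

Because the index compares pairs of individuals rather than labels, it is unaffected by arbitrary relabelling of the derived clusters, which is necessary when comparing programs that number their subgroups differently.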
doi:10.1186/1471-2288-14-113
PMCID: PMC4192340  PMID: 25272975
Cluster analysis; Latent Class Analysis; Head-to-head comparison; Reproducibility; MRI; SMS
