Related Articles
SUMMARY
Using validation sets for outcomes can greatly improve the estimation of vaccine efficacy (VE) in the field (Halloran and Longini, 2001; Halloran and others, 2003). Most statistical methods for using validation sets rely on the assumption that outcomes on those with no cultures are missing at random (MAR). However, often the validation sets will not be chosen at random. For example, confirmational cultures are often done on people with influenza-like illness as part of routine influenza surveillance. VE estimates based on such non-MAR validation sets could be biased. Here we propose frequentist and Bayesian approaches for estimating VE in the presence of validation bias. Our work builds on the ideas of Rotnitzky and others (1998, 2001), Scharfstein and others (1999, 2003), and Robins and others (2000). Our methods require expert opinion about the nature of the validation selection bias. In a re-analysis of an influenza vaccine study, we found, using the beliefs of a flu expert, that within any plausible range of selection bias the VE estimate based on the validation sets is much higher than the point estimate using just the non-specific case definition. Our approach is generally applicable to studies with missing binary outcomes with categorical covariates.
doi:10.1093/biostatistics/kxj031
PMCID: PMC2766283
PMID: 16556610
Bayesian; Expert opinion; Identifiability; Influenza; Missing data; Selection model; Vaccine efficacy
SUMMARY
In this paper, we develop a Bayesian method for joint analysis of longitudinal measurements and competing risks failure time data. The model allows one to analyze the longitudinal outcome with nonignorable missing data induced by multiple types of events, to analyze survival data with dependent censoring for the key event, and to draw inferences on multiple endpoints simultaneously. Compared with the likelihood approach, the Bayesian method has several advantages. It is computationally more tractable for high-dimensional random effects. It is also convenient to draw inference. Moreover, it provides a means to incorporate prior information that may help to improve estimation accuracy. An illustration is given using a clinical trial data of scleroderma lung disease. The performance of our method is evaluated by simulation studies.
doi:10.1002/sim.3562
PMCID: PMC3168565
PMID: 19308919
joint modeling; competing risks; longitudinal data; Bayesian approach
For decades, statisticians, philosophers, medical investigators and others interested in data analysis have argued that the Bayesian paradigm is the proper approach for reporting the results of scientific analyses for use by clients and readers. To date, the methods have been too complicated for non-statisticians to use. In this paper we argue that the World-Wide Web provides the perfect environment to put the Bayesian paradigm into practice: the likelihood function of the data is parsimoniously represented on the server side, the reader uses the client to represent her prior belief, and a downloaded program (a Java applet) performs the combination. In our approach, a different applet can be used for each likelihood function, prior belief can be assessed graphically, and calculation results can be reported in a variety of ways. We present a prototype implementation, BayesApplet, for two-arm clinical trials with normally-distributed outcomes, a prominent model for clinical trials. The primary implication of this work is that publishing medical research results on the Web can take a form beyond or different from that currently used on paper, and can have a profound impact on the publication and use of research results.
PMCID: PMC2232964
PMID: 8947703
While Berkson’s bias is widely recognized in the epidemiologic literature, it remains underappreciated as a model of both selection bias and bias due to missing data. Simple causal diagrams and 2×2 tables illustrate how Berkson’s bias connects to collider bias and selection bias more generally, and show the strong analogies between Berksonian selection bias and bias due to missing data. In some situations, considerations of whether data are missing at random or missing not at random is less important than the causal structure of the missing-data process. While dealing with missing data always relies on strong assumptions about unobserved variables, the intuitions built with simple examples can provide a better understanding of approaches to missing data in real-world situations.
doi:10.1097/EDE.0b013e31823b6296
PMCID: PMC3237868
PMID: 22081062
Summary
In the analysis of missing data, sensitivity analyses are commonly used to check the sensitivity of the parameters of interest with respect to the missing data mechanism and other distributional and modeling assumptions. In this article, we formally develop a general local influence method to carry out sensitivity analyses of minor perturbations to generalized linear models in the presence of missing covariate data. We examine two types of perturbation schemes (the single-case and global perturbation schemes) for perturbing various assumptions in this setting. We show that the metric tensor of a perturbation manifold provides useful information for selecting an appropriate perturbation. We also develop several local influence measures to identify influential points and test model misspecification. Simulation studies are conducted to evaluate our methods, and real datasets are analyzed to illustrate the use of our local influence measures.
doi:10.1111/j.1541-0420.2008.01179.x
PMCID: PMC2819734
PMID: 19210738
Influence measure; Local influence; Missing covariates; Perturbation manifold; Perturbation scheme
Case-control studies are particularly susceptible to differential exposure misclassification when exposure status is determined following incident case status. Probabilistic bias analysis methods have been developed as ways to adjust standard effect estimates based on the sensitivity and specificity of exposure misclassification. The iterative sampling method advocated in probabilistic bias analysis bears a distinct resemblance to a Bayesian adjustment; however, it is not identical. Furthermore, without a formal theoretical framework (Bayesian or frequentist), the results of a probabilistic bias analysis remain somewhat difficult to interpret. We describe, both theoretically and empirically, the extent to which probabilistic bias analysis can be viewed as approximately Bayesian. While the differences between probabilistic bias analysis and Bayesian approaches to misclassification can be substantial, these situations often involve unrealistic prior specifications and are relatively easy to detect. Outside of these special cases, probabilistic bias analysis and Bayesian approaches to exposure misclassification in case-control studies appear to perform equally well.
doi:10.1097/EDE.0b013e31823b539c
PMCID: PMC3257063
PMID: 22157311
Introduction
Systematic reviewer authors intending to include all randomized participants in their meta-analyses need to make assumptions about the outcomes of participants with missing data.
Objective
The objective of this paper is to provide systematic reviewer authors with a relatively simple guidance for addressing dichotomous data for participants excluded from analyses of randomized trials.
Methods
This guide is based on a review of the Cochrane handbook and published methodological research. The guide deals with participants excluded from the analysis who were considered ‘non-adherent to the protocol’ but for whom data are available, and participants with missing data.
Results
Systematic reviewer authors should include data from ‘non-adherent’ participants excluded from the primary study authors' analysis but for whom data are available. For missing, unavailable participant data, authors may conduct a complete case analysis (excluding those with missing data) as the primary analysis. Alternatively, they may conduct a primary analysis that makes plausible assumptions about the outcomes of participants with missing data. When the primary analysis suggests important benefit, sensitivity meta-analyses using relatively extreme assumptions that may vary in plausibility can inform the extent to which risk of bias impacts the confidence in the results of the primary analysis. The more plausible assumptions draw on the outcome event rates within the trial or in all trials included in the meta-analysis. The proposed guide does not take into account the uncertainty associated with assumed events.
Conclusions
This guide proposes methods for handling participants excluded from analyses of randomized trials. These methods can help in establishing the extent to which risk of bias impacts meta-analysis results.
doi:10.1371/journal.pone.0057132
PMCID: PMC3581575
PMID: 23451162
Summary
Longitudinal studies with binary repeated measures are widespread in biomedical research. Marginal regression approaches for balanced binary data are well developed, while for binary process data, where measurement times are irregular and may differ by individuals, likelihood-based methods for marginal regression analysis are less well developed. In this article, we develop a Bayesian regression model for analyzing longitudinal binary process data, with emphasis on dealing with missingness. We focus on the settings where data are missing at random, which require a correctly specified joint distribution for the repeated measures in order to draw valid likelihood-based inference about the marginal mean. To provide maximum flexibility, the proposed model specifies both the marginal mean and serial dependence structures using nonparametric smooth functions. Serial dependence is allowed to depend on the time lag between adjacent outcomes as well as other relevant covariates. Inference is fully Bayesian. Using simulations, we show that adequate modeling of the serial dependence structure is necessary for valid inference of the marginal mean when the binary process data are missing at random. Longitudinal viral load data from the HIV Epidemiology Research Study (HERS) are analyzed for illustration.
doi:10.1002/sim.3265
PMCID: PMC2581820
PMID: 18351709
Repeated measures; Marginal model; Nonparametric regression; Penalized splines; HIV/AIDS; Antiviral treatment
Background
Missing outcome data are very common in smoking cessation trials. It is often assumed that all such missing data are from participants who have been unsuccessful in giving up smoking (“missing=smoking”). Here we use data from a recent Internet based smoking cessation trial in order to investigate which of a set of a priori chosen baseline variables are predictive of missingness, and the evidence for and against the “missing=smoking” assumption.
Methods
We use a selection model, which models the probability that the outcome is observed given the outcome and other variables. The selection model includes a parameter for which zero indicates that the data are Missing at Random (MAR) and large values indicate “missing=smoking”. We examine the evidence for the predictive power of baseline variables in the context of a sensitivity analysis. We use data on the number and type of attempts made to obtain outcome data in order to estimate the association between smoking status and the missing data indicator.
Results
We apply our methods to the iQuit smoking cessation trial data. From the sensitivity analysis, we obtain strong evidence that older participants are more likely to provide outcome data. The model for the number and type of attempts to obtain outcome data confirms that age is a good predictor of missing data. There is weak evidence from this model that participants who have successfully given up smoking are more likely to provide outcome data but this evidence does not support the “missing=smoking” assumption. The probability that participants with missing outcome data are not smoking at the end of the trial is estimated to be between 0.14 and 0.19.
Conclusions
Those conducting smoking cessation trials, and wishing to perform an analysis that assumes the data are MAR, should collect and incorporate baseline variables into their models that are thought to be good predictors of missing data in order to make this assumption more plausible. However they should also consider the possibility of Missing Not at Random (MNAR) models that make or allow for less extreme assumptions than “missing=smoking”.
doi:10.1186/1471-2288-12-157
PMCID: PMC3507670
PMID: 23067272
There is accumulating evidence that prior knowledge about expectations plays an important role in perception. The Bayesian framework is the standard computational approach to explain how prior knowledge about the distribution of expected stimuli is incorporated with noisy observations in order to improve performance. However, it is unclear what information about the prior distribution is acquired by the perceptual system over short periods of time and how this information is utilized in the process of perceptual decision making. Here we address this question using a simple two-tone discrimination task. We find that the “contraction bias”, in which small magnitudes are overestimated and large magnitudes are underestimated, dominates the pattern of responses of human participants. This contraction bias is consistent with the Bayesian hypothesis in which the true prior information is available to the decision-maker. However, a trial-by-trial analysis of the pattern of responses reveals that the contribution of most recent trials to performance is overweighted compared with the predictions of a standard Bayesian model. Moreover, we study participants' performance in a-typical distributions of stimuli and demonstrate substantial deviations from the ideal Bayesian detector, suggesting that the brain utilizes a heuristic approximation of the Bayesian inference. We propose a biologically plausible model, in which decision in the two-tone discrimination task is based on a comparison between the second tone and an exponentially-decaying average of the first tone and past tones. We show that this model accounts for both the contraction bias and the deviations from the ideal Bayesian detector hypothesis. These findings demonstrate the power of Bayesian-like heuristics in the brain, as well as their limitations in their failure to fully adapt to novel environments.
Author Summary
In this paper we study how history affects perception using an auditory delayed comparison task, in which human participants repeatedly compare the frequencies of two, temporally-separated pure tones. We demonstrate that the history of the experiment has a substantial effect on participants' performance: when both tones are high relative to past stimuli, people tend to report that the 2nd tone was higher, and when they are relatively low, they tend to report that the 1st tone was higher. Interestingly, only the most recent trials bias performance, which can be interpreted as if the participants assume that the statistics of stimuli in the experiment is highly volatile. Moreover, this bias persists even in settings, in which it is detrimental to performance. These results demonstrate the abilities, as well as limitations, of the cognitive system when incorporating expectations in perception.
doi:10.1371/journal.pcbi.1002731
PMCID: PMC3486920
PMID: 23133343
This article studies a general joint model for longitudinal measurements and competing risks survival data. The model consists of a linear mixed effects sub-model for the longitudinal outcome, a proportional cause-specific hazards frailty sub-model for the competing risks survival data, and a regression sub-model for the variance–covariance matrix of the multivariate latent random effects based on a modified Cholesky decomposition. The model provides a useful approach to adjust for non-ignorable missing data due to dropout for the longitudinal outcome, enables analysis of the survival outcome with informative censoring and intermittently measured time-dependent covariates, as well as joint analysis of the longitudinal and survival outcomes. Unlike previously studied joint models, our model allows for heterogeneous random covariance matrices. It also offers a framework to assess the homogeneous covariance assumption of existing joint models. A Bayesian MCMC procedure is developed for parameter estimation and inference. Its performances and frequentist properties are investigated using simulations. A real data example is used to illustrate the usefulness of the approach.
doi:10.1007/s10985-010-9169-6
PMCID: PMC3162577
PMID: 20549344
Cause-specific hazard; Bayesian analysis; Cholesky decomposition; Mixed effects model; MCMC; Modeling covariance matrices
Having substantial missing data is a common problem in administrative and cancer registry data. We propose a sensitivity analysis to evaluate the impact of a covariate that is potentially missing not at random in survival analyses using Weibull proportional hazards regressions. We apply the method to an investigation of the impact of missing grade on post-surgical mortality outcomes in individuals with metastatic kidney cancer. Data came from the Surveillance Epidemiology and End Results (SEER) registry which provides population based information on those undergoing cytoreductive nephrectomy. Tumor grade is an important component of risk stratification for patients with both localized and metastatic kidney cancer. Many individuals in SEER with metastatic kidney cancer are missing tumor grade information. We found that surgery was protective, but that the magnitude of the effect depended on assumptions about the relationship of grade with missingness.
doi:10.1002/sim.3557
PMCID: PMC2741403
PMID: 19235263
When a randomized controlled trial has missing outcome data, any analysis is based on untestable assumptions, e.g. that the data are missing at random, or less commonly on other assumptions about the missing data mechanism. Given such assumptions, there is an extensive literature on suitable methods of analysis. However, little is known about what assumptions are appropriate. We use two sources of ancillary data to explore the missing data mechanism in a trial of adherence therapy in patients with schizophrenia: carer-reported (proxy) outcomes and the number of contact attempts. This requires additional assumptions to be made whose plausibility we discuss. Proxy outcomes are found to be unhelpful in this trial because they are insufficiently associated with patient outcome and because the ancillary assumptions are implausible. The number of attempts required to achieve a follow-up interview is helpful and suggests that these data are unlikely to depart far from being missing at random. We also perform sensitivity analyses to departures from missingness at random, based on the investigators’ prior beliefs elicited at the start of the trial. Wider use of techniques such as these will help to inform the choice of suitable assumptions for the analysis of randomized controlled trials.
doi:10.1111/j.1467-985X.2009.00627.x
PMCID: PMC2916212
PMID: 20711246
Informatively missing; Missing data; Missingness not at random; Prior elicitation; Proxy data; Repeated attempts; Sensitivity analysis
Treatment noncompliance and missing outcomes at posttreatment assessments are common problems in field experiments in naturalistic settings. Although the two complications often occur simultaneously, statistical methods that address both complications have not been routinely considered in data analysis practice in the prevention research field. This paper shows that identification and estimation of causal treatment effects considering both noncompliance and missing outcomes can be relatively easily conducted under various missing data assumptions. We review a few assumptions on missing data in the presence of noncompliance, including the latent ignorability proposed by Frangakis and Rubin (Biometrika 86:365–379, 1999), and show how these assumptions can be used in the parametric complier average causal effect (CACE) estimation framework. As an easy way of sensitivity analysis, we propose the use of alternative missing data assumptions, which will provide a range of causal effect estimates. In this way, we are less likely to settle with a possibly biased causal effect estimate based on a single assumption. We demonstrate how alternative missing data assumptions affect identification of causal effects, focusing on the CACE. The data from the Johns Hopkins School Intervention Study (Ialongo et al., Am J Community Psychol 27:599–642, 1999) will be used as an example.
doi:10.1007/s11121-010-0175-4
PMCID: PMC2912956
PMID: 20379779
Causal inference; Complier average causal effect; Latent ignorability; Missing at random; Missing data; Noncompliance
Background
Many randomized trials involve missing binary outcomes. Although many previous adjustments for missing binary outcomes have been proposed, none of these makes explicit use of randomization to bound the bias when the data are not missing at random.
Methods
We propose a novel approach that uses the randomization distribution to compute the anticipated maximum bias when missing at random does not hold due to an unobserved binary covariate (implying that missingness depends on outcome and treatment group). The anticipated maximum bias equals the product of two factors: (a) the anticipated maximum bias if there were complete confounding of the unobserved covariate with treatment group among subjects with an observed outcome and (b) an upper bound factor that depends only on the fraction missing in each randomization group. If less than 15% of subjects are missing in each group, the upper bound factor is less than .18.
Results
We illustrated the methodology using data from the Polyp Prevention Trial. We anticipated a maximum bias under complete confounding of .25. With only 7% and 9% missing in each arm, the upper bound factor, after adjusting for age and sex, was .10. The anticipated maximum bias of .25 × .10 =.025 would not have affected the conclusion of no treatment effect.
Conclusion
This approach is easy to implement and is particularly informative when less than 15% of subjects are missing in each arm.
doi:10.1186/1471-2288-3-8
PMCID: PMC194902
PMID: 12734019
SUMMARY
In a large, prospective longitudinal study designed to monitor cardiac abnormalities in children born to HIV-infected women, instead of a single outcome variable, there are multiple binary outcomes (e.g., abnormal heart rate, abnormal blood pressure, abnormal heart wall thickness) considered as joint measures of heart function over time. In the presence of missing responses at some time points, longitudinal marginal models for these multiple outcomes can be estimated using generalized estimating equations (GEE) (Liang and Zeger, 1986), and consistent estimates can be obtained under the assumption of a missing completely at random (MCAR) mechanism. When the missing data mechanism is missing at random (MAR), that is the probability of missing a particular outcome at a time-point depends on observed values of that outcome and the remaining outcomes at other time points, we propose joint estimation of the marginal models using a single modified GEE based on an EM-type algorithm. The proposed method is motivated by the longitudinal study of cardiac abnormalities in children born to HIV-infected women and analyses of these data are presented to illustrate the application of the method. Further, in an asymptotic study of bias, we show that under an MAR mechanism in which missingness depends on all observed outcome variables, our joint estimation via the modified GEE produces almost unbiased estimates, provided the correlation model has been correctly specified, whereas estimates from standard GEE can lead to substantial bias.
doi:10.1111/j.1467-985X.2008.00564.x
PMCID: PMC2888330
PMID: 20585409
EM-type algorithm; generalized estimating equations; missing at random; missing completely at random
This article proposes a joint model for longitudinal measurements and competing risks survival data. The model consists of a linear mixed effects sub-model with t-distributed measurement errors for the longitudinal outcome, a proportional cause-specific hazards frailty sub-model for the survival outcome, and a regression sub-model for the variance-covariance matrix of the multivariate latent random effects based on a modified Cholesky decomposition. A Bayesian MCMC procedure is developed for parameter estimation and inference. Our method is insensitive to outlying longitudinal measurements in the presence of non-ignorable missing data due to dropout. Moreover, by modeling the variance-covariance matrix of the latent random effects, our model provides a useful framework for handling high-dimensional heterogeneous random effects and testing the homogeneous random effects assumption which is otherwise untestable in commonly used joint models. Finally, our model enables analysis of a survival outcome with intermittently measured time-dependent covariates and possibly correlated competing risks and dependent censoring, as well as joint analysis of the longitudinal and survival outcomes. Illustrations are given using a real data set from a lung study and simulation.
PMCID: PMC3166346
PMID: 21892381
Joint model; Competing risks; Bayesian analysis; Cholesky decomposition; Mixed effects model; MCMC; Modeling random effects covariance matrix; Outlier
Case-control studies are widely used to detect gene-environment interactions in the etiology of complex diseases. Many variables that are of interest to biomedical researchers are difficult to measure on an individual level, e.g. nutrient intake, cigarette smoking exposure, long-term toxic exposure. Measurement error causes bias in parameter estimates, thus masking key features of data and leading to loss of power and spurious/masked associations. We develop a Bayesian methodology for analysis of case-control studies for the case when measurement error is present in an environmental covariate and the genetic variable has missing data. This approach offers several advantages. It allows prior information to enter the model to make estimation and inference more precise. The environmental covariates measured exactly are modeled completely nonparametrically. Further, information about the probability of disease can be incorporated in the estimation procedure to improve quality of parameter estimates, what cannot be done in conventional case-control studies. A unique feature of the procedure under investigation is that the analysis is based on a pseudo-likelihood function therefore conventional Bayesian techniques may not be technically correct. We propose an approach using Markov Chain Monte Carlo sampling as well as a computationally simple method based on an asymptotic posterior distribution. Simulation experiments demonstrated that our method produced parameter estimates that are nearly unbiased even for small sample sizes. An application of our method is illustrated using a population-based case-control study of the association between calcium intake with the risk of colorectal adenoma development.
PMCID: PMC3178196
PMID: 21949562
Bayesian inference; Errors in variables; Gene-environment interactions; Markov Chain Monte Carlo sampling; Missing data; Pseudo-likelihood; Semiparametric methods
Many have documented the difficulty of using the current paradigm of Randomized Controlled Trials (RCTs) to test and validate the effectiveness of alternative medical systems such as Ayurveda. This paper critiques the applicability of RCTs for all clinical knowledge-seeking endeavors, of which Ayurveda research is a part. This is done by examining statistical hypothesis testing, the underlying foundation of RCTs, from a practical and philosophical perspective. In the philosophical critique, the two main worldviews of probability are that of the Bayesian and the frequentist. The frequentist worldview is a special case of the Bayesian worldview requiring the unrealistic assumptions of knowing nothing about the universe and believing that all observations are unrelated to each other. Many have claimed that the first belief is necessary for science, and this claim is debunked by comparing variations in learning with different prior beliefs. Moving beyond the Bayesian and frequentist worldviews, the notion of hypothesis testing itself is challenged on the grounds that a hypothesis is an unclear distinction, and assigning a probability on an unclear distinction is an exercise that does not lead to clarity of action. This critique is of the theory itself and not any particular application of statistical hypothesis testing. A decision-making frame is proposed as a way of both addressing this critique and transcending ideological debates on probability. An example of a Bayesian decision-making approach is shown as an alternative to statistical hypothesis testing, utilizing data from a past clinical trial that studied the effect of Aspirin on heart attacks in a sample population of doctors. As a big reason for the prevalence of RCTs in academia is legislation requiring it, the ethics of legislating the use of statistical methods for clinical research is also examined.
doi:10.4103/0975-9476.85548
PMCID: PMC3193681
PMID: 22022152
Bayesian; decision analysis; statistical hypothesis testing
An Approximate Bayesian Bootstrap (ABB) offers advantages in incorporating appropriate uncertainty when imputing missing data, but most implementations of the ABB have lacked the ability to handle nonignorable missing data where the probability of missingness depends on unobserved values. This paper outlines a strategy for using an ABB to multiply impute nonignorable missing data. The method allows the user to draw inferences and perform sensitivity analyses when the missing data mechanism cannot automatically be assumed to be ignorable. Results from imputing missing values in a longitudinal depression treatment trial as well as a simulation study are presented to demonstrate the method’s performance. We show that a procedure that uses a different type of ABB for each imputed data set accounts for appropriate uncertainty and provides nominal coverage.
doi:10.1016/j.csda.2008.07.042
PMCID: PMC2678725
PMID: 20016665
Not Missing at Random; NMAR; Multiple Imputation; Hot-Deck
A common concern in Bayesian data analysis is that an inappropriately informative prior may unduly influence posterior inferences. In the context of Bayesian clinical trial design, well chosen priors are important to ensure that posterior-based decision rules have good frequentist properties. However, it is difficult to quantify prior information in all but the most stylized models. This issue may be addressed by quantifying the prior information in terms of a number of hypothetical patients, i.e., a prior effective sample size (ESS). Prior ESS provides a useful tool for understanding the impact of prior assumptions. For example, the prior ESS may be used to guide calibration of prior variances and other hyperprior parameters. In this paper, we discuss such prior sensitivity analyses by using a recently proposed method to compute a prior ESS. We apply this in several typical Bayesian biomedical data analysis and clinical trial design settings. The data analyses include cross-tabulated counts, multiple correlated diagnostic tests, and ordinal outcomes using a proportional-odds model. The study designs include a phase I trial with late-onset toxicities, a phase II trial that monitors event times, and a phase I/II trial with dose-finding based on efficacy and toxicity.
doi:10.1007/s12561-010-9018-x
PMCID: PMC2910452
PMID: 20668651
Bayesian biostatistics; Bayesian clinical trial design; Bayesian analysis; effective sample size; parametric prior distribution
We propose a variety of methods based on the generalized estimation equations to address the issues encountered in haplotype-based pharmacogenetic analysis, including analysis of longitudinal data with outcome-dependent dropouts, and evaluation of the high-dimensional haplotype and haplotype-drug interaction effects in an overall manner. We use the inverse probability weights to handle the outcome-dependent dropouts under the missing-at-random assumption, and incorporate the weighted L1-penalty to select important main and interaction effects with high dimensionality. The proposed methods are easy to implement, computationally efficient, and provide an optimal balance between false positives and false negatives in detecting genetic effects.
doi:10.1080/10543400903572787
PMCID: PMC2845928
PMID: 20309762
adaptive LASSO; haplotype-based association analysis; haplotype-treatment interaction; missing data; repeated outcome; variable selection
Summary
With increasing frequency, epidemiologic studies are addressing hypotheses regarding gene-environment interaction. In many well studied candidate genes and for standard dietary and behavioral epidemiologic exposures, there is often substantial prior information available which may be used to analyze current data as well as for designing a new study. In this paper, first, we propose a proper full Bayesian approach for analyzing studies of gene-environment interaction. The Bayesian approach provides a natural way to incorporate uncertainties around the assumption of gene-environment independence, often used in such analysis. We then consider Bayesian sample size determination criteria for both estimation and hypothesis testing regarding the multiplicative gene-environment interaction parameter. We illustrate our proposed methods using data from a large ongoing case-control study of colorectal cancer investigating the interaction of N-acetyl transferase type 2 (NAT2) with smoking and red meat consumption. We use the existing data to elicit a design prior and show how to use this information in allocating cases and controls in planning a future study which investigates the same interaction parameters. The Bayesian design and analysis strategies are compared with their corresponding frequentist counterparts.
doi:10.1111/j.1541-0420.2009.01357.x
PMCID: PMC3103064
PMID: 19930190
case-only design; gene-environment independence; highest posterior density interval; molecular epidemiology of colorectal cancer; multinomial-Dirichlet; posterior odds
The fundamental objective for health research is to determine whether changes should be made to clinical decisions. Decisions made by veterinary surgeons in the light of new research evidence are known to be influenced by their prior beliefs, especially their initial opinions about the plausibility of possible results. In this paper, clinical trial results for a bovine mastitis control plan were evaluated within a Bayesian context, to incorporate a community of prior distributions that represented a spectrum of clinical prior beliefs. The aim was to quantify the effect of veterinary surgeons’ initial viewpoints on the interpretation of the trial results.
A Bayesian analysis was conducted using Markov chain Monte Carlo procedures. Stochastic models included a financial cost attributed to a change in clinical mastitis following implementation of the control plan. Prior distributions were incorporated that covered a realistic range of possible clinical viewpoints, including scepticism, enthusiasm and uncertainty. Posterior distributions revealed important differences in the financial gain that clinicians with different starting viewpoints would anticipate from the mastitis control plan, given the actual research results. For example, a severe sceptic would ascribe a probability of 0.50 for a return of <£5 per cow in an average herd that implemented the plan, whereas an enthusiast would ascribe this probability for a return of >£20 per cow. Simulations using increased trial sizes indicated that if the original study was four times as large, an initial sceptic would be more convinced about the efficacy of the control plan but would still anticipate less financial return than an initial enthusiast would anticipate after the original study. In conclusion, it is possible to estimate how clinicians’ prior beliefs influence their interpretation of research evidence. Further research on the extent to which different interpretations of evidence result in changes to clinical practice would be worthwhile.
doi:10.1016/j.prevetmed.2009.05.029
PMCID: PMC2729300
PMID: 19576643
Prior distribution; Clinical decision making; Bayesian analysis; Mastitis
The fundamental objective for health research is to determine whether changes should be made to clinical decisions. Decisions made by veterinary surgeons in the light of new research evidence are known to be influenced by their prior beliefs, especially their initial opinions about the plausibility of possible results. In this paper, clinical trial results for a bovine mastitis control plan were evaluated within a Bayesian context, to incorporate a community of prior distributions that represented a spectrum of clinical prior beliefs. The aim was to quantify the effect of veterinary surgeons’ initial viewpoints on the interpretation of the trial results.
A Bayesian analysis was conducted using Markov chain Monte Carlo procedures. Stochastic models included a financial cost attributed to a change in clinical mastitis following implementation of the control plan. Prior distributions were incorporated that covered a realistic range of possible clinical viewpoints, including scepticism, enthusiasm and uncertainty. Posterior distributions revealed important differences in the financial gain that clinicians with different starting viewpoints would anticipate from the mastitis control plan, given the actual research results. For example, a severe sceptic would ascribe a probability of 0.50 for a return of <£5 per cow in an average herd that implemented the plan, whereas an enthusiast would ascribe this probability for a return of >£20 per cow. Simulations using increased trial sizes indicated that if the original study was four times as large, an initial sceptic would be more convinced about the efficacy of the control plan but would still anticipate less financial return than an initial enthusiast would anticipate after the original study. In conclusion, it is possible to estimate how clinicians’ prior beliefs influence their interpretation of research evidence. Further research on the extent to which different interpretations of evidence result in changes to clinical practice would be worthwhile.
doi:10.1016/j.prevetmed.2009.05.029
PMCID: PMC2729300
PMID: 19576643
Prior distribution; Clinical decision making; Bayesian analysis; Mastitis