1.  Evaluating the Effect of Early Versus Late ARV Regimen Change if Failure on an Initial Regimen: Results From the AIDS Clinical Trials Group Study A5095 
The current goal of initial antiretroviral (ARV) therapy is suppression of plasma human immunodeficiency virus (HIV)-1 RNA levels to below 200 copies per milliliter. A proportion of HIV-infected patients who initiate antiretroviral therapy in clinical practice or antiretroviral clinical trials either fail to suppress HIV-1 RNA or have HIV-1 RNA levels rebound on therapy. Frequently, these patients have sustained CD4 cell count responses and limited or no clinical symptoms and, therefore, have potentially limited indications for altering therapy, which they may be tolerating well despite increased viral replication. On the other hand, increased viral replication on therapy leads to selection of resistance mutations to the antiretroviral agents comprising their therapy, and potentially cross-resistance to other agents in the same class, decreasing the likelihood of response to subsequent antiretroviral therapy. The optimal time to switch antiretroviral therapy to ensure sustained virologic suppression and prevent clinical events in patients who have rebound in their HIV-1 RNA, yet are stable, is not known. Randomized clinical trials to compare early versus delayed switching have been difficult to design and more difficult to enroll. In some clinical trials, such as the AIDS Clinical Trials Group (ACTG) Study A5095, patients randomized to initial antiretroviral treatment combinations who fail to suppress HIV-1 RNA, or who have a rebound of HIV-1 RNA on therapy, are allowed to switch from the initial ARV regimen to a new regimen, based on clinician and patient decisions. We delineate a statistical framework to estimate the effect of early versus late regimen change using data from ACTG A5095 in the context of two-stage designs.
In causal inference, a large class of doubly robust estimators are derived through semiparametric theory with applications to missing data problems. This class of estimators is motivated through geometric arguments and relies on large samples for good performance. By now, several authors have noted that a doubly robust estimator may be suboptimal when the outcome model is misspecified even if it is semiparametric efficient when the outcome regression model is correctly specified. Through auxiliary variables, two-stage designs, and within the contextual backdrop of our scientific problem and clinical study, we propose improved doubly robust, locally efficient estimators of a population mean and average causal effect for early versus delayed switching to second-line ARV treatment regimens. Our analysis of the ACTG A5095 data further demonstrates how methods that use auxiliary variables can improve over methods that ignore them. Using the methods developed here, we conclude that patients who switch within 8 weeks of virologic failure have better clinical outcomes, on average, than patients who delay switching to a new second-line ARV regimen after failing on the initial regimen. Ordinary statistical methods fail to find such differences. This article has online supplementary material.
doi:10.1080/01621459.2011.646932
PMCID: PMC3545451  PMID: 23329858
Causal inference; Double robustness; Longitudinal data analysis; Missing data; Rubin causal model; Semiparametric efficient estimation
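To make the doubly robust idea in entry 1 concrete, here is a minimal augmented inverse-probability-weighted (AIPW) sketch for a population mean under ignorable missingness, in Python with simulated data. The variable names, working models, and data-generating process are illustrative assumptions; this is the basic estimator the abstract builds on, not the improved two-stage, auxiliary-variable estimator proposed in the paper.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000

# Simulated data: X = baseline covariate, R = response indicator, Y observed only when R == 1
X = rng.normal(size=n)
p_obs = 1 / (1 + np.exp(-(0.3 + 0.8 * X)))            # true missingness mechanism
R = rng.binomial(1, p_obs)
Y = 1.0 + 2.0 * X + rng.normal(size=n)                 # true outcome model
Y_obs = np.where(R == 1, Y, np.nan)

# Working propensity (missingness) model: logistic regression of R on X
pi_hat = sm.Logit(R, sm.add_constant(X)).fit(disp=0).predict(sm.add_constant(X))

# Working outcome regression, fit on complete cases and predicted for everyone
ols = sm.OLS(Y_obs[R == 1], sm.add_constant(X[R == 1])).fit()
m_hat = ols.predict(sm.add_constant(X))

# AIPW / doubly robust estimator of E[Y]
y_fill = np.where(R == 1, Y_obs, 0.0)                  # zero out unobserved outcomes
aipw = np.mean(R * y_fill / pi_hat - (R - pi_hat) / pi_hat * m_hat)
print("AIPW estimate of E[Y]:", aipw)
```

The estimate remains consistent if either the missingness model or the outcome regression is correctly specified, which is the double robustness the abstract refers to.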
2.  Child Mortality Estimation: Consistency of Under-Five Mortality Rate Estimates Using Full Birth Histories and Summary Birth Histories 
PLoS Medicine  2012;9(8):e1001296.
Romesh Silva assesses and analyzes differences in direct and indirect methods of estimating under-five mortality rates using data collected from full and summary birth histories in Demographic and Health Surveys from West Africa, East Africa, Latin America, and South/Southeast Asia.
Background
Given the lack of complete vital registration data in most developing countries, for many countries it is not possible to accurately estimate under-five mortality rates from vital registration systems. Heavy reliance is often placed on direct and indirect methods for analyzing data collected from birth histories to estimate under-five mortality rates. Yet few systematic comparisons of these methods have been undertaken. This paper investigates whether analysts should use both direct and indirect estimates from full birth histories, and under what circumstances indirect estimates derived from summary birth histories should be used.
Methods and Findings
Using Demographic and Health Surveys data from West Africa, East Africa, Latin America, and South/Southeast Asia, I quantify the differences between direct and indirect estimates of under-five mortality rates, analyze data quality issues, note the relative effects of these issues, and test whether these issues explain the observed differences. I find that indirect estimates are generally consistent with direct estimates, after adjustment for fertility change and birth transference, but do not add substantial additional insight beyond direct estimates. However, choice of direct or indirect method was found to be important in terms of both the adjustment for data errors and the assumptions made about fertility.
Conclusions
Although adjusted indirect estimates are generally consistent with adjusted direct estimates, some notable inconsistencies were observed for countries that had experienced either a political or economic crisis or stalled health transition in their recent past. This result suggests that when a population has experienced a smooth mortality decline or only short periods of excess mortality, both adjusted methods perform equally well. However, the observed inconsistencies identified suggest that the indirect method is particularly prone to bias resulting from violations of its strong assumptions about recent mortality and fertility. Hence, indirect estimates of under-five mortality rates from summary birth histories should be used only for populations that have experienced either smooth mortality declines or only short periods of excess mortality in their recent past.
Please see later in the article for the Editors' Summary.
Editors' Summary
Background
In 1990, 12 million children died before they reached their fifth birthday. Faced with this largely avoidable loss of young lives, in 2000, world leaders set a target of reducing under-five mortality (death) to one-third of its 1990 level by 2015 as Millennium Development Goal 4 (MDG 4); this goal, together with seven others, aims to eradicate extreme poverty globally. To track progress towards MDG 4, experts need accurate estimates of the global and country-specific under-five mortality rate (U5MR, the probability of a child dying before age five). The most reliable sources of data for U5MR estimation are vital registration systems—national records of all births and deaths. Unfortunately, developing countries, which are where most childhood deaths occur, rarely have such records, so full or summary birth histories provide the data for U5MR estimation instead. In full birth histories (FBHs), which are collected through household surveys such as those conducted by Demographic and Health Surveys (DHS), women are asked for the date of birth of all their children and the age at death of any children who have died. In summary birth histories (SBHs), which are collected through household surveys and censuses, women are asked how many children they have had and how many are alive at the time of the survey.
Why Was This Study Done?
“Direct” estimates of U5MRs can be obtained from FBHs because FBHs provide detailed information about the date of death and the exposure of children to the risk of dying. By contrast, because SBHs do not contain information on children's exposure to the risk of dying, “indirect” estimates of U5MR are obtained from SBHs using model life tables (mathematical models of the variation of mortality with age). Indirect estimates are often also derived from FBHs, but few systematic comparisons of direct and indirect methods for U5MR estimation have been undertaken. In this study, Romesh Silva investigates whether direct and indirect methods provide consistent U5MR estimates from FBHs and whether there are any circumstances under which indirect methods provide more reliable U5MR estimates than direct methods.
What Did the Researcher Do and Find?
The researcher used DHS data from West Africa, East Africa, Latin America, and South/Southeast Asia to quantify the differences between direct and indirect estimates of U5MR calculated from the same data and analyzed possible reasons for these differences. Estimates obtained using a version of the “Brass” indirect estimation method were uniformly higher than those obtained using direct estimation. Indirect and direct estimates generally agreed, however, after adjustment for changes in fertility—the Brass method assumes that country-specific fertility (the number of children born to a woman during her reproductive life) remains constant—and for birth transference, an important source of data error in FBHs that arises because DHS field staff can lessen their workload by recording births as occurring before a preset cutoff date rather than after that date. Notably, though, for countries that had experienced political or economic crises, periods of excess mortality due to conflicts, or periods during which the health transition had stalled (as countries become more affluent, overall mortality rates decline and noncommunicable diseases replace infectious diseases as the major causes of death), marked differences between indirect and direct estimates of U5MR remained, even after these adjustments.
What Do These Findings Mean?
Because the countries included in this study do not have vital registration systems, these findings provide no information about the validity of either direct or indirect estimation methods for U5MR estimation. They suggest, however, that for countries where there has been a smooth decline in mortality or only short periods of excess mortality, both direct and indirect methods of U5MR estimation work equally well, after adjustment for changes in fertility and for birth transference, and that indirect estimates add little to the insights provided into childhood mortality by direct estimates. Importantly, the inconsistencies observed between the two methods that remain after adjustment suggest that indirect U5MR estimation is more susceptible to bias (systematic errors that arise because of the assumptions used to estimate U5MR) than direct estimation. Thus, indirect estimates of U5MR from SBHs should be used only for populations that have experienced either smooth mortality declines or only short periods of excess mortality in their recent past.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001296.
This paper is part of a collection of papers on Child Mortality Estimation Methods published in PLOS Medicine
The United Nations Children's Fund (UNICEF) works for children's rights, survival, development, and protection around the world; it provides information on Millennium Development Goal 4, and its Childinfo website provides detailed statistics about child survival and health, including a description of the United Nations Inter-agency Group for Child Mortality Estimation; the 2011 UN IGME report Levels & Trends in Child Mortality is available
The World Health Organization has information about Millennium Development Goal 4 and provides estimates of child mortality rates (some information in several languages)
Further information about the Millennium Development Goals is available
Information is available about infant and child mortality data collected by Demographic and Health Surveys
doi:10.1371/journal.pmed.1001296
PMCID: PMC3429405  PMID: 22952436
3.  Performance of mixed effects models in the analysis of mediated longitudinal data 
Background
Linear mixed effects models (LMMs) are a common approach for analyzing longitudinal data in a variety of settings. Although LMMs may be applied to complex data structures, such as settings where mediators are present, it is unclear whether they perform well relative to methods for mediational analyses such as structural equation models (SEMs), which have obvious appeal in such settings. For some researchers, SEMs may be more difficult than LMMs to implement, e.g. due to lack of training in the methodology or the need for specialized SEM software. It therefore is of interest to evaluate whether the LMM performs sufficiently well in a scenario particularly suitable for SEMs. We focus on evaluation of the total effect (i.e. direct and indirect) of an exposure on an outcome of interest when a mediating factor is present. Our aim is to explore whether the LMM performs as well as the SEM in a setting that is conducive to using the SEM.
Methods
We simulated mediated longitudinal data from an SEM where a binary, main independent variable has both direct and indirect effects on a continuous outcome. We conducted analyses with both the LMM and SEM to evaluate the performance of the LMM in a setting where the SEM is expected to be preferable. Models were evaluated with respect to bias, coverage probability and power. Sample size, effect size and error distribution of the simulated data were varied.
Results
Both models performed well in a range of settings. Marginal increases in power estimates were observed for the SEM, although generally there were no major differences in performance. Power for both models was good with a sample size of 250 and a small to medium effect size. Bias did not substantially increase for either model when data were generated from distributions that were both skewed and kurtotic.
Conclusions
In settings where the goal is to evaluate the overall effects, the LMM excluding mediating variables appears to have good performance with respect to power, bias and coverage probability relative to the SEM. The major benefit of SEMs is that they simultaneously and efficiently model both the direct and indirect effects of the mediation process.
doi:10.1186/1471-2288-10-16
PMCID: PMC2842282  PMID: 20170503
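As a rough companion to entry 3, the sketch below simulates a simple mediated longitudinal structure and fits a linear mixed effects model that omits the mediator, so the exposure coefficient targets the total (direct plus indirect) effect. The data-generating values and model formula are assumptions for illustration, not the simulation design of the paper.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_subj, n_time = 250, 4

# Simulate: binary exposure x with a direct effect on y and an indirect effect through mediator m
subj = np.repeat(np.arange(n_subj), n_time)
time = np.tile(np.arange(n_time), n_subj)
x = np.repeat(rng.binomial(1, 0.5, n_subj), n_time)
b_subj = np.repeat(rng.normal(0, 1, n_subj), n_time)           # subject-level random intercept
m = 0.5 * x + rng.normal(0, 1, size=n_subj * n_time)           # mediator
y = 1.0 + 0.4 * x + 0.6 * m + 0.2 * time + b_subj + rng.normal(0, 1, size=n_subj * n_time)

df = pd.DataFrame({"y": y, "x": x, "time": time, "subj": subj})

# LMM excluding the mediator: the coefficient on x targets the total effect (0.4 + 0.6 * 0.5 = 0.7)
lmm = smf.mixedlm("y ~ x + time", data=df, groups="subj").fit()
print(lmm.summary())
```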
4.  Inverse Odds Ratio-Weighted Estimation for Causal Mediation Analysis 
Statistics in medicine  2013;32(26):4567-4580.
An important scientific goal of studies in the health and social sciences is increasingly to determine to what extent the total effect of a point exposure is mediated by an intermediate variable on the causal pathway between the exposure and the outcome. A causal framework has recently been proposed for mediation analysis, which gives rise to new definitions, formal identification results and novel estimators of direct and indirect effects. In the present paper, the author describes a new inverse odds ratio-weighted (IORW) approach to estimate so-called natural direct and indirect effects. The approach, which uses as a weight the inverse of an estimate of the odds ratio function relating the exposure and the mediator, is universal in that it can be used to decompose total effects in a number of regression models commonly used in practice. Specifically, the approach may be used for effect decomposition in generalized linear models with a nonlinear link function, and in a number of other commonly used models such as the Cox proportional hazards regression for a survival outcome. The approach is simple and can be implemented in standard software provided a weight can be specified for each observation. An additional advantage of the method is that it easily incorporates multiple mediators of a categorical, discrete or continuous nature.
doi:10.1002/sim.5864
PMCID: PMC3954805  PMID: 23744517
Causal Mediation Analysis; Inverse odds ratio weighted estimation; natural direct and indirect effects; double robustness
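A minimal sketch of the inverse odds ratio-weighting recipe described in entry 4, for one continuous mediator, a binary exposure, and a continuous outcome on simulated data. The exposure model, the weight construction with M = 0 as reference, and the subtraction-based decomposition are illustrative simplifications of the general approach; standard errors would require bootstrapping in practice.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 5000

# Simulated data: C = confounder, A = binary exposure, M = mediator, Y = outcome
C = rng.normal(size=n)
A = rng.binomial(1, 1 / (1 + np.exp(-0.5 * C)))
M = 0.8 * A + 0.3 * C + rng.normal(size=n)
Y = 1.0 + 0.5 * A + 0.7 * M + 0.4 * C + rng.normal(size=n)

# Step 1: logistic model for the exposure given mediator and confounders
Xam = sm.add_constant(np.column_stack([M, C]))
beta = sm.Logit(A, Xam).fit(disp=0).params
or_func = np.exp(beta[1] * M)                # odds ratio function linking A and M (reference M = 0)

# Step 2: inverse odds ratio weights; unexposed subjects keep weight 1
w = np.where(A == 1, 1.0 / or_func, 1.0)

# Step 3: weighted outcome regression of Y on A and C -> natural direct effect (approx. 0.5)
nde = sm.WLS(Y, sm.add_constant(np.column_stack([A, C])), weights=w).fit().params[1]

# Total effect from the unweighted regression; indirect effect by subtraction
te = sm.OLS(Y, sm.add_constant(np.column_stack([A, C]))).fit().params[1]
print("NDE approx.", nde, " NIE approx.", te - nde)
```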
5.  Mediation Analysis for Nonlinear Models with Confounding 
Epidemiology (Cambridge, Mass.)  2012;23(6):879-888.
Recently, researchers have used a potential-outcome framework to estimate causally interpretable direct and indirect effects of an intervention or exposure on an outcome. One approach to causal-mediation analysis uses the so-called mediation formula to estimate the natural direct and indirect effects. This approach generalizes classical mediation estimators and allows for arbitrary distributions for the outcome variable and mediator. A limitation of the standard (parametric) mediation formula approach is that it requires a specified mediator regression model and distribution; such a model may be difficult to construct and may not be of primary interest. To address this limitation, we propose a new method for causal-mediation analysis that uses the empirical distribution function, thereby avoiding parametric distribution assumptions for the mediator. In order to adjust for confounders of the exposure-mediator and exposure-outcome relationships, inverse-probability weighting is incorporated based on a supplementary model of the probability of exposure. This method, which yields estimates of the natural direct and indirect effects for a specified reference group, is applied to data from a cohort study of dental caries in very-low-birth-weight adolescents to investigate the oral-hygiene index as a possible mediator. Simulation studies show low bias in the estimation of direct and indirect effects in a variety of distribution scenarios, whereas the standard mediation formula approach can be considerably biased when the distribution of the mediator is incorrectly specified.
doi:10.1097/EDE.0b013e31826c2bb9
PMCID: PMC3773310  PMID: 23007042
6.  Semiparametric Maximum Likelihood Estimation in Normal Transformation Models for Bivariate Survival Data 
Biometrika  2008;95(4):947-960.
SUMMARY
We consider a class of semiparametric normal transformation models for right censored bivariate failure times. Nonparametric hazard rate models are transformed to a standard normal model and a joint normal distribution is assumed for the bivariate vector of transformed variates. A semiparametric maximum likelihood estimation procedure is developed for estimating the marginal survival distribution and the pairwise correlation parameters. This produces an efficient estimator of the correlation parameter of the semiparametric normal transformation model, which characterizes the bivariate dependence of bivariate survival outcomes. In addition, a simple positive-mass-redistribution algorithm can be used to implement the estimation procedures. Since the likelihood function involves infinite-dimensional parameters, empirical process theory is utilized to study the asymptotic properties of the proposed estimators, which are shown to be consistent, asymptotically normal and semiparametric efficient. A simple estimator for the variance of the estimates is also derived. The finite sample performance is evaluated via extensive simulations.
doi:10.1093/biomet/asn049
PMCID: PMC2600666  PMID: 19079778
Asymptotic normality; Bivariate failure time; Consistency; Semiparametric efficiency; Semiparametric maximum likelihood estimate; Semiparametric normal transformation
7.  Modeling the impact of hepatitis C viral clearance on end-stage liver disease in an HIV co-infected cohort with Targeted Maximum Likelihood Estimation 
Biometrics  2013;70(1):144-152.
Summary
Despite modern effective HIV treatment, hepatitis C virus (HCV) co-infection is associated with a high risk of progression to end-stage liver disease (ESLD) which has emerged as the primary cause of death in this population. Clinical interest lies in determining the impact of clearance of HCV on risk for ESLD. In this case study, we examine whether HCV clearance affects risk of ESLD using data from the multicenter Canadian Co-infection Cohort Study. Complications in this survival analysis arise from the time-dependent nature of the data, the presence of baseline confounders, loss to follow-up, and confounders that change over time, all of which can obscure the causal effect of interest. Additional challenges included non-censoring variable missingness and event sparsity.
In order to efficiently estimate the ESLD-free survival probabilities under a specific history of HCV clearance, we demonstrate the doubly-robust and semiparametric efficient method of Targeted Maximum Likelihood Estimation (TMLE). Marginal structural models (MSM) can be used to model the effect of viral clearance (expressed as a hazard ratio) on ESLD-free survival and we demonstrate a way to estimate the parameters of a logistic model for the hazard function with TMLE. We show the theoretical derivation of the efficient influence curves for the parameters of two different MSMs and how they can be used to produce variance approximations for parameter estimates. Finally, the data analysis evaluating the impact of HCV on ESLD was undertaken using multiple imputations to account for the non-monotone missing data.
doi:10.1111/biom.12105
PMCID: PMC3954273  PMID: 24571372
Double-robust; Inverse probability of treatment weighting; Kaplan-Meier; Longitudinal data; Marginal structural model; Survival analysis; Targeted maximum likelihood estimation
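Entry 7 applies TMLE in a longitudinal survival setting; as background, here is a much simpler point-treatment TMLE sketch for E[Y(1)] with a binary outcome on simulated data. The working models, clever covariate, and single fluctuation step are the textbook version under assumed data, not the marginal-structural-model-based longitudinal estimator of the paper.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 5000
expit = lambda z: 1 / (1 + np.exp(-z))

# Simulated data: W = covariate, A = binary treatment, Y = binary outcome
W = rng.normal(size=n)
A = rng.binomial(1, expit(0.4 * W))
Y = rng.binomial(1, expit(-0.5 + 0.8 * A + 0.6 * W))

# Initial outcome regression Q(A, W) and propensity score g(W)
XQ = np.column_stack([np.ones(n), A, W])
Qfit = sm.Logit(Y, XQ).fit(disp=0)
Q_AW = Qfit.predict(XQ)
X1 = np.column_stack([np.ones(n), np.ones(n), W])      # same design with A set to 1
Q_1W = Qfit.predict(X1)
g = sm.Logit(A, sm.add_constant(W)).fit(disp=0).predict(sm.add_constant(W))

# Targeting step: fluctuate the initial fit along the clever covariate H = A / g
H = A / g
eps = sm.GLM(Y, H.reshape(-1, 1), offset=np.log(Q_AW / (1 - Q_AW)),
             family=sm.families.Binomial()).fit().params[0]

# Updated prediction under A = 1 and the targeted estimate of E[Y(1)]
Q_1W_star = expit(np.log(Q_1W / (1 - Q_1W)) + eps / g)
print("TMLE estimate of E[Y(1)]:", Q_1W_star.mean())
```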
8.  Dynamic regression hazards models for relative survival 
Statistics in medicine  2008;27(18):3563-3584.
SUMMARY
A natural way of modelling relative survival through regression analysis is to assume an additive form between the expected population hazard and the excess hazard due to the presence of an additional cause of mortality. Within this context, the existing approaches in the parametric, semiparametric and non-parametric setting are compared and discussed. We study the additive excess hazards models, where the excess hazard is of additive form. This makes it possible to assess the importance of time-varying effects for regression models in the relative survival framework. We show how recent developments can be used to make inferential statements about the non-parametric version of the model. This makes it possible to test the key hypothesis that an excess risk effect is time varying in contrast to being constant over time. In case some covariate effects are constant, we show how the semiparametric additive risk model can be considered in the excess risk setting, providing a better and more useful summary of the data. Estimators have explicit form and inference based on a resampling scheme is presented for both the non-parametric and semiparametric models. We also describe a new suggestion for goodness of fit of relative survival models, which consists of statistical and graphical tests based on cumulative martingale residuals. This is illustrated on the semiparametric model with proportional excess hazards. We analyze data from the TRACE study using different approaches and show the need for more flexible models in relative survival.
doi:10.1002/sim.3242
PMCID: PMC2737139  PMID: 18338318
9.  An information criterion for marginal structural models 
Statistics in medicine  2012;32(8):1383-1393.
Summary
Marginal structural models were developed as a semiparametric alternative to the G-computation formula to estimate causal effects of exposures. In practice, these models are often specified using parametric regression models. As such, the usual conventions regarding regression model specification apply. This paper outlines strategies for marginal structural model specification, and considerations for the functional form of the exposure metric in the final structural model. We propose a quasi-likelihood information criterion adapted from use in generalized estimating equations. We evaluate the properties of our proposed information criterion using a limited simulation study. We illustrate our approach using two empirical examples. In the first example, we use data from a randomized breastfeeding promotion trial to estimate the effect of breastfeeding duration on infant weight at one year. In the second example, we use data from two prospective cohort studies to estimate the effect of highly active antiretroviral therapy on CD4 count in an observational cohort of HIV-infected men and women. The marginal structural model specified should reflect the scientific question being addressed, but can also assist in exploration of other plausible and closely related questions. In marginal structural models, as in any regression setting, correct inference depends on correct model specification. Our proposed information criterion provides a formal method for comparing model fit for different specifications.
doi:10.1002/sim.5599
PMCID: PMC4180061  PMID: 22972662
Bias; Causal inference; Marginal structural model; Regression analysis; Model specification
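For orientation, the sketch below fits a point-exposure marginal structural model with stabilized inverse-probability-of-treatment weights on simulated data. The weight models and the single candidate MSM are illustrative assumptions; the paper's quasi-likelihood information criterion for comparing MSM specifications is not reproduced here, and robust (sandwich or bootstrap) standard errors would be needed in practice.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 4000

# Simulated data: L = confounder, A = binary exposure, Y = continuous outcome
L = rng.normal(size=n)
A = rng.binomial(1, 1 / (1 + np.exp(-0.7 * L)))
Y = 2.0 + 1.5 * A + 1.0 * L + rng.normal(size=n)

# Denominator: P(A = 1 | L); numerator: marginal P(A = 1) -> stabilized weights
p_den = sm.Logit(A, sm.add_constant(L)).fit(disp=0).predict(sm.add_constant(L))
p_num = A.mean()
sw = np.where(A == 1, p_num / p_den, (1 - p_num) / (1 - p_den))

# Marginal structural model: weighted regression of Y on A alone (coefficient targets the causal effect, 1.5)
msm = sm.WLS(Y, sm.add_constant(A.astype(float)), weights=sw).fit()
print(msm.params)
```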
10.  Longitudinal studies of binary response data following case-control and stratified case-control sampling: design and analysis 
Biometrics  2009;66(2):365-373.
SUMMARY
We discuss design and analysis of longitudinal studies after case-control sampling, wherein interest is in the relationship between a longitudinal binary response that is related to the sampling (case-control) variable, and a set of covariates. We propose a semiparametric modelling framework based on a marginal longitudinal binary response model and an ancillary model for subjects’ case-control status. In this approach, the analyst must posit the population prevalence of being a case, which is then used to compute an offset term in the ancillary model. Parameter estimates from this model are used to compute offsets for the longitudinal response model. Examining the impact of population prevalence and ancillary model misspecification, we show that time-invariant covariate parameter estimates, other than the intercept, are reasonably robust, but intercept and time-varying covariate parameter estimates can be sensitive to such misspecification. We study design and analysis issues impacting study efficiency, namely: choice of sampling variable and the strength of its relationship to the response, sample stratification, choice of working covariance weighting, and degree of flexibility of the ancillary model. The research is motivated by a longitudinal study following case-control sampling of the time course of ADHD symptoms.
doi:10.1111/j.1541-0420.2009.01306.x
PMCID: PMC3051172  PMID: 19673861
Bias; binary data; efficiency; Generalized Estimating Equations; longitudinal data; logistic regression; outcome dependent sampling
11.  Repeated Measures Semiparametric Regression Using Targeted Maximum Likelihood Methodology with Application to Transcription Factor Activity Discovery 
In longitudinal and repeated measures data analysis, often the goal is to determine the effect of a treatment or aspect on a particular outcome (e.g., disease progression). We consider a semiparametric repeated measures regression model, where the parametric component models effect of the variable of interest and any modification by other covariates. The expectation of this parametric component over the other covariates is a measure of variable importance. Here, we present a targeted maximum likelihood estimator of the finite dimensional regression parameter, which is easily estimated using standard software for generalized estimating equations.
The targeted maximum likelihood method provides double robust and locally efficient estimates of the variable importance parameters and inference based on the influence curve. We demonstrate these properties through simulation under correct and incorrect model specification, and apply our method in practice to estimating transcription factor (TF) activity over the cell cycle in yeast. We specifically target the importance of SWI4, SWI6, MBP1, MCM1, ACE2, FKH2, NDD1, and SWI5.
The semiparametric model allows us to determine the importance of a TF at specific time points by specifying time indicators as potential effect modifiers of the TF. Our results are promising, showing significant importance trends during the expected time periods. This methodology can also be used as a variable importance analysis tool to assess the effect of a large number of variables such as gene expressions or single nucleotide polymorphisms.
doi:10.2202/1544-6115.1553
PMCID: PMC3122882  PMID: 21291412
targeted maximum likelihood; semiparametric; repeated measures; longitudinal; transcription factors
12.  Mediation analysis of the relationship between institutional research activity and patient survival 
Background
Recent studies have suggested that patients treated in research-active institutions have better outcomes than patients treated in research-inactive institutions. However, little attention has been paid to explaining such effects, probably because, until recently, techniques for mediation analysis have not been applicable to survival data.
Methods
We investigated the underlying mechanisms using a recently developed method for mediation analysis of survival data. Our analysis of the effect of research activity on patient survival was based on 352 patients who had been diagnosed with advanced ovarian cancer at 149 hospitals in 2001. All hospitals took part in a quality assurance program of the German Cancer Society. Patient outcomes were compared between hospitals participating in clinical trials and non-trial hospitals. Surgical outcome and chemotherapy selection were explored as potential mediators of the effect of hospital research activity on patient survival.
Results
The 219 patients treated in hospitals participating in clinical trials had more complete surgical debulking, were more likely to receive the recommended platinum-taxane combination, and had better survival than the 133 patients treated in non-trial hospitals. Taking into account baseline confounders, the overall adjusted hazard ratio of death was 0.58 (95% confidence interval: 0.42 to 0.79). This effect was decomposed into a direct effect of research activity of 0.67 and two indirect effects of 0.93 each mediated through either optimal surgery or chemotherapy. Taken together, about 26% of the beneficial effect of research activity was mediated through the proposed pathways.
Conclusions
Mediation analysis allows one to proceed from the question “Does it work?” to the question “How does it work?” In particular, we have shown that the research activity of a hospital contributes to superior patient survival through better use of surgery and chemotherapy. This methodology may be applied to analyze direct and indirect natural effects for almost any combination of variable types.
doi:10.1186/1471-2288-14-9
PMCID: PMC3917547  PMID: 24447677
Trial effect; Research activity; Healthcare outcomes; Mediation; Survival analysis
13.  Identification and efficient estimation of the natural direct effect among the untreated 
Biometrics  2013;69(2):310-317.
Summary
The natural direct effect (NDE), or the effect of an exposure on an outcome if an intermediate variable was set to the level it would have been in the absence of the exposure, is often of interest to investigators. In general, the statistical parameter associated with the NDE is difficult to estimate in the non-parametric model, particularly when the intermediate variable is continuous or high dimensional. In this paper we introduce a new causal parameter called the natural direct effect among the untreated, discuss identifiability assumptions, propose a sensitivity analysis for some of the assumptions, and show that this new parameter is equivalent to the NDE in a randomized controlled trial. We also present a targeted minimum loss estimator (TMLE), a locally efficient, double robust substitution estimator for the statistical parameter associated with this causal parameter. The TMLE can be applied to problems with continuous and high dimensional intermediate variables, and can be used to estimate the NDE in a randomized controlled trial with such data. Additionally, we define and discuss the estimation of three related causal parameters: the natural direct effect among the treated, the indirect effect among the untreated and the indirect effect among the treated.
doi:10.1111/biom.12022
PMCID: PMC3692606  PMID: 23607645
Causal inference; direct effect; indirect effect; mediation analysis; semiparametric models; targeted minimum loss estimation
14.  Method for Evaluating Multiple Mediators: Mediating Effects of Smoking and COPD on the Association between the CHRNA5-A3 Variant and Lung Cancer Risk 
PLoS ONE  2012;7(10):e47705.
A mediation model explores the direct and indirect effects between an independent variable and a dependent variable by including other variables (or mediators). Mediation analysis has recently been used to dissect the direct and indirect effects of genetic variants on complex diseases using case-control studies. However, bias could arise in the estimation of the genetic variant-mediator association because the presence or absence of the mediator in the study samples is not sampled following the principles of case-control study design. In this case, the mediation analysis using data from case-control studies might lead to biased estimates of coefficients and indirect effects. In this article, we investigated a multiple-mediation model involving a three-path mediating effect through two mediators using case-control study data. We propose an approach to correct bias in coefficients and provide accurate estimates of the specific indirect effects. Our approach can also be used when the original case-control study is frequency matched on one of the mediators. We employed bootstrapping to assess the significance of indirect effects. We conducted simulation studies to investigate the performance of the proposed approach, and showed that it provides more accurate estimates of the indirect effects as well as the percent mediated than standard regressions. We then applied this approach to study the mediating effects of both smoking and chronic obstructive pulmonary disease (COPD) on the association between the CHRNA5-A3 gene locus and lung cancer risk using data from a lung cancer case-control study. The results showed that the genetic variant influences lung cancer risk indirectly through all three different pathways. The percent of genetic association mediated was 18.3% through smoking alone, 30.2% through COPD alone, and 20.6% through the path including both smoking and COPD, and the total genetic variant-lung cancer association explained by the two mediators was 69.1%.
doi:10.1371/journal.pone.0047705
PMCID: PMC3471886  PMID: 23077662
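The bootstrap assessment of specific indirect effects described in entry 14 can be sketched for a simple two-mediator, three-path linear model (X -> M1 -> M2 -> Y) with simulated data. The continuous outcome, path coefficients, and percentile intervals are illustrative assumptions; the paper's bias correction for case-control sampling is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000

# Simulated continuous analogue: X -> M1 -> M2 -> Y, plus direct paths
X = rng.binomial(1, 0.5, n).astype(float)
M1 = 0.6 * X + rng.normal(size=n)
M2 = 0.5 * X + 0.4 * M1 + rng.normal(size=n)
Y = 0.3 * X + 0.5 * M1 + 0.6 * M2 + rng.normal(size=n)

def indirect_effects(idx):
    """OLS path coefficients and the three specific indirect effects on a bootstrap sample."""
    x, m1, m2, y = X[idx], M1[idx], M2[idx], Y[idx]
    a1 = np.polyfit(x, m1, 1)[0]                                          # X -> M1
    d, a2 = np.linalg.lstsq(np.column_stack([m1, x, np.ones_like(x)]), m2, rcond=None)[0][:2]
    b1, b2 = np.linalg.lstsq(np.column_stack([m1, m2, x, np.ones_like(x)]), y, rcond=None)[0][:2]
    return a1 * b1, a2 * b2, a1 * d * b2                                  # via M1, via M2, via M1 -> M2

boot = np.array([indirect_effects(rng.integers(0, n, n)) for _ in range(2000)])
print("95% percentile CIs:", np.percentile(boot, [2.5, 97.5], axis=0))
```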
15.  SIMEX and standard error estimation in semiparametric measurement error models 
SIMEX is a general-purpose technique for measurement error correction. There is a substantial literature on the application and theory of SIMEX for purely parametric problems, as well as for purely non-parametric regression problems, but there is neither application nor theory for semiparametric problems. Motivated by an example involving radiation dosimetry, we develop the basic theory for SIMEX in semiparametric problems using kernel-based estimation methods. This includes situations in which the mismeasured variable is modeled purely parametrically, purely non-parametrically, or has components that are modeled both parametrically and nonparametrically. Using our asymptotic expansions, easily computed standard error formulae are derived, as are the bias properties of the nonparametric estimator. The standard error method represents a new method for estimating variability of nonparametric estimators in semiparametric problems, and we show in both simulations and in our example that it improves dramatically on first-order methods.
We find that for estimating the parametric part of the model, standard bandwidth choices of order O(n^{-1/5}) are sufficient to ensure asymptotic normality, and undersmoothing is not required. SIMEX has the property that it fits misspecified models, namely ones that ignore the measurement error. Our work thus also more generally describes the behavior of kernel-based methods in misspecified semiparametric problems.
doi:10.1214/08-EJS341
PMCID: PMC2710855  PMID: 19609371
Berkson measurement errors; measurement error; misspecified models; nonparametric regression; radiation epidemiology; semiparametric models; SIMEX; simulation-extrapolation; standard error estimation; uniform expansions
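For the parametric part of a model, the core SIMEX recipe referenced in entry 15 can be sketched in a few lines: add extra measurement error at several levels lambda, refit the naive estimator, and extrapolate back to lambda = -1. The toy linear model, the number of simulation replicates, and the quadratic extrapolant are assumptions for illustration; the paper's kernel-based semiparametric setting is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(6)
n, sigma_u = 2000, 0.5                                   # sigma_u: assumed known measurement error SD

# True model Y = 1 + 2 X + e, but only W = X + U is observed
X = rng.normal(size=n)
W = X + rng.normal(0, sigma_u, n)
Y = 1.0 + 2.0 * X + rng.normal(0, 1.0, n)

lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
slopes = []
for lam in lambdas:
    # Simulation step: add extra error with variance lam * sigma_u^2, average over replicates
    b_hats = []
    for _ in range(50):
        W_lam = W + rng.normal(0, np.sqrt(lam) * sigma_u, n)
        b_hats.append(np.polyfit(W_lam, Y, 1)[0])        # naive OLS slope at this error level
    slopes.append(np.mean(b_hats))

# Extrapolation step: fit a quadratic in lambda and evaluate at lambda = -1
coef = np.polyfit(lambdas, slopes, 2)
print("SIMEX-corrected slope:", np.polyval(coef, -1.0))  # slopes[0] is the attenuated naive slope
```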
16.  Estimation of Causal Mediation Effects for a Dichotomous Outcome in Multiple-Mediator Models using the Mediation Formula 
Statistics in medicine  2013;32(24):4211-4228.
Mediators are intermediate variables in the causal pathway between an exposure and an outcome. Mediation analysis investigates the extent to which exposure effects occur through these variables, thus revealing causal mechanisms. In this paper, we consider the estimation of the mediation effect when the outcome is binary and multiple mediators of different types exist. We give a precise definition of the total mediation effect as well as decomposed mediation effects through individual or sets of mediators using the potential outcomes framework. We formulate a model of joint distribution (probit-normal) using continuous latent variables for any binary mediators to account for correlations among multiple mediators. A mediation formula approach is proposed to estimate the total mediation effect and decomposed mediation effects based on this parametric model. Estimation of mediation effects through individual or subsets of mediators requires an assumption involving the joint distribution of multiple counterfactuals. We conduct a simulation study that demonstrates low bias of mediation effect estimators for two-mediator models with various combinations of mediator types. The results also show that the power to detect a non-zero total mediation effect increases as the correlation coefficient between two mediators increases, while power for individual mediation effects reaches a maximum when the mediators are uncorrelated. We illustrate our approach by applying it to a retrospective cohort study of dental caries in adolescents with low and high socioeconomic status. Sensitivity analysis is performed to assess the robustness of conclusions regarding mediation effects when the assumption of no unmeasured mediator-outcome confounders is violated.
doi:10.1002/sim.5830
PMCID: PMC3789850  PMID: 23650048
mediation analysis; multiple mediators; latent variables; overall mediation effect; decomposed mediation effect; mediation formula; sensitivity analysis
17.  A Partial Linear Model in the Outcome Dependent Sampling Setting to Evaluate the Effect of Prenatal PCB Exposure on Cognitive Function in Children 
Biometrics  2010;67(3):876-885.
Summary
Outcome-dependent sampling (ODS) has been widely used in biomedical studies because it is a cost effective way to improve study efficiency. However, in the setting of a continuous outcome, the representation of the exposure variable has been limited to the framework of linear models, due to the challenge in terms of both theory and computation. Partial linear models (PLM) are a powerful inference tool to nonparametrically model the relation between an outcome and the exposure variable. In this article, we consider a case study of a partial linear model for data from an ODS design. We propose a semiparametric maximum likelihood method to make inferences with a PLM. We develop the asymptotic properties and conduct simulation studies to show that the proposed ODS estimator can produce a more efficient estimate than that from a traditional simple random sampling design with the same sample size. Using this newly developed method, we were able to explore an open question in epidemiology: whether in utero exposure to background levels of PCBs is associated with children’s intellectual impairment. Our model provides further insights into the relation between low-level PCB exposure and children’s cognitive function. The results shed new light on a body of inconsistent epidemiologic findings.
doi:10.1111/j.1541-0420.2010.01500.x
PMCID: PMC3182522  PMID: 21039397
Cost-effective designs; Empirical likelihood; Outcome dependent sampling; Partial linear model; Polychlorinated biphenyls; P-spline
18.  Community-Based Care for the Specialized Management of Heart Failure 
Executive Summary
In August 2008, the Medical Advisory Secretariat (MAS) presented a vignette to the Ontario Health Technology Advisory Committee (OHTAC) on a proposed targeted health care delivery model for chronic care. The proposed model was defined as multidisciplinary, ambulatory, community-based care that bridged the gap between primary and tertiary care, and was intended for individuals with a chronic disease who were at risk of a hospital admission or emergency department visit. The goals of this care model were thought to include: the prevention of emergency department visits, a reduction in hospital admissions and re-admissions, facilitation of earlier hospital discharge, a reduction or delay in long-term care admissions, and an improvement in mortality and other disease-specific patient outcomes.
OHTAC approved the development of an evidence-based assessment to determine the effectiveness of specialized community based care for the management of heart failure, Type 2 diabetes and chronic wounds.
Please visit the Medical Advisory Secretariat Web site at: www.health.gov.on.ca/ohtas to review the following reports associated with the Specialized Multidisciplinary Community-Based care series.
Specialized multidisciplinary community-based care series: a summary of evidence-based analyses
Community-based care for the specialized management of heart failure: an evidence-based analysis
Community-based care for chronic wound management: an evidence-based analysis
Please note that the evidence-based analysis of specialized community-based care for the management of diabetes titled: “Community-based care for the management of type 2 diabetes: an evidence-based analysis” has been published as part of the Diabetes Strategy Evidence Platform at this URL: http://www.health.gov.on.ca/english/providers/program/mas/tech/ohtas/tech_diabetes_20091020.html
Please visit the Toronto Health Economics and Technology Assessment Collaborative Web site at: http://theta.utoronto.ca/papers/MAS_CHF_Clinics_Report.pdf to review the following economic project associated with this series:
Community-based Care for the specialized management of heart failure: a cost-effectiveness and budget impact analysis.
Objective
The objective of this evidence-based analysis was to determine the effectiveness of specialized multidisciplinary care in the management of heart failure (HF).
Clinical Need: Target Population and Condition
HF is a progressive, chronic condition in which the heart becomes unable to sufficiently pump blood throughout the body. There are several risk factors for developing the condition including hypertension, diabetes, obesity, previous myocardial infarction, and valvular heart disease.(1) Based on data from a 2005 study of the Canadian Community Health Survey (CCHS), the prevalence of congestive heart failure in Canada is approximately 1% of the population over the age of 12.(2) This figure rises sharply after the age of 45, with prevalence reports ranging from 2.2% to 12%.(3) Extrapolating this to the Ontario population, an estimated 98,000 residents in Ontario are believed to have HF.
Disease management programs are multidisciplinary approaches to care for chronic disease that coordinate comprehensive care strategies along the disease continuum and across healthcare delivery systems.(4) Evidence for the effectiveness of disease management programs for HF has been provided by seven systematic reviews completed between 2004 and 2007 (Table 1) with consistency of effect demonstrated across four main outcome measures: all cause mortality and hospitalization, and heart-failure specific mortality and hospitalization. (4-10)
However, while disease management programs are multidisciplinary by definition, the published evidence lacks consistency and clarity as to the exact nature of each program and usual care comparators are generally ill defined. Consequently, the effectiveness of multidisciplinary care for the management of persons with HF is still uncertain. Therefore, MAS has completed a systematic review of specialized, multidisciplinary, community-based care disease management programs compared to a well-defined usual care group for persons with HF.
Evidence-Based Analysis Methods
Research Questions
What is the effectiveness of specialized, multidisciplinary, community-based care (SMCCC) compared with usual care for persons with HF?
Literature Search Strategy
A comprehensive literature search was completed of electronic databases including MEDLINE, MEDLINE In-Process and Other Non-Indexed Citations, EMBASE, Cochrane Library and Cumulative Index to Nursing & Allied Health Literature. Bibliographic references of selected studies were also searched. After a review of the titles and abstracts, relevant studies were obtained and the full reports evaluated. All studies meeting explicit inclusion and exclusion criteria were retained. Where appropriate, a meta-analysis was undertaken to determine the pooled estimate of effect of specialized multidisciplinary community-based care for explicit outcomes. The quality of the body of evidence, defined as one or more relevant studies, was determined using GRADE Working Group criteria. (11)
Inclusion Criteria
Randomized controlled trial
Systematic review with meta analysis
Population includes persons with New York Heart Association (NYHA) classification I-IV HF
The intervention includes a team consisting of a nurse and physician, one of whom is a specialist in HF management.
The control group receives care by a single practitioner (e.g. primary care physician (PCP) or cardiologist)
The intervention begins after discharge from the hospital
The study reports 1-year outcomes
Exclusion Criteria
The intervention is delivered predominately through home-visits
Studies with mixed populations where discrete data for HF is not reported
Outcomes of Interest
All cause mortality
All cause hospitalization
HF specific mortality
HF specific hospitalization
All cause duration of hospital stay
HF specific duration of hospital stay
Emergency room visits
Quality of Life
Summary of Findings
One large and seven small randomized controlled trials were obtained from the literature search.
A meta-analysis was completed for four of the seven outcomes including:
All cause mortality
HF-specific mortality
All cause hospitalization
HF-specific hospitalization.
Where the pooled analysis was associated with significant heterogeneity, subgroup analyses were completed using two primary categories:
direct and indirect model of care; and
type of control group (PCP or cardiologist).
The direct model of care was a clinic-based multidisciplinary HF program and the indirect model of care was a physician supervised, nurse-led telephonic HF program.
All studies, except one, were completed in jurisdictions outside North America. (12-19) Similarly, all but one study had a sample size of less than 250. The mean age in the studies ranged from 65 to 77 years. Six of the studies(12;14-18) included populations with a NYHA classification of II-III. In two studies, the control treatment was a cardiologist (12;15) and two studies reported the inclusion of a dietitian, physiotherapist and psychologist as members of the multidisciplinary team (12;19).
All Cause Mortality
Eight studies reported all cause mortality (number of persons) at 1 year follow-up. (12-19) When the results of all eight studies were pooled, there was a statistically significant RRR of 29% with moderate heterogeneity (I² of 38%). The results of the subgroup analyses indicated a significant RRR of 40% in all cause mortality when SMCCC was delivered through a direct team model (clinic) and a 35% RRR when SMCCC was compared with a primary care practitioner.
HF-Specific Mortality
Three studies reported HF-specific mortality (number of persons) at 1 year follow-up. (15;18;19) When the results of these were pooled, there was a non-significant RRR of 42% with high statistical heterogeneity (I² of 60%). The GRADE quality of evidence is moderate for the pooled analysis of all studies.
All Cause Hospitalization
Seven studies reported all cause hospitalization at 1-year follow-up (13-15;17-19). When pooled, their results showed a statistically non-significant 12% increase in hospitalizations in the SMCCC group with high statistical heterogeneity (I² of 81%). A significant RRR of 12% in all cause hospitalization in favour of the SMCCC care group was achieved when SMCCC was delivered using an indirect model (telephonic), with no associated heterogeneity (I² of 0%). The GRADE quality of evidence was found to be low for the pooled analysis of all studies and moderate for the subgroup analysis of the indirect team care model.
HF-Specific Hospitalization
Six studies reported HF-specific hospitalization at 1-year follow-up. (13-15;17;19) When pooled, the results of these studies showed a non-significant RRR of 14% with high statistical heterogeneity (I² of 60%); however, the quality of evidence for the pooled analysis was low.
Duration of Hospital Stay
Seven studies reported duration of hospital stay, four in terms of mean duration of stay in days (14;16;17;19) and three in terms of total hospital bed days (12;13;18). Most studies reported all cause duration of hospital stay while two also reported HF-specific duration of hospital stay. These data were not amenable to meta-analyses as standard deviations were not provided in the reports. However, in general (and in all but one study) it appears that persons receiving SMCCC had shorter hospital stays, whether measured as mean days in hospital or total hospital bed days.
Emergency Room Visits
Only one study reported emergency room visits. (14) This was presented as a composite of readmissions and ER visits, where the authors reported that 77% (59/76) of the SMCCC group and 84% (63/75) of the usual care group were either readmitted or had an ER visit within 1 year of follow-up (P=0.029).
Quality of Life
Quality of life was reported in five studies using the Minnesota Living with HF Questionnaire (MLHFQ) (12-15;19) and in one study using the Nottingham Health Profile Questionnaire (16). The MLHFQ results are reported in our analysis. Two studies reported the mean score at 1 year follow-up, although they did not provide the standard deviation of the mean in their report. One study reported the median and range scores at 1 year follow-up in each group. Two studies reported the change scores of the physical and emotional subscales of the MLHFQ, of which only one study reported a statistically significant change from baseline to 1 year follow-up between treatment groups in favour of the SMCCC group in the physical sub-scale. A significant change in the emotional subscale scores from baseline to 1 year follow-up in the treatment groups was not reported in either study.
Conclusion
There is moderate quality evidence that SMCCC reduces all cause mortality by 29%. There is low quality evidence that SMCCC contributes to a shorter duration of hospital stay and improves quality of life compared to usual care. The evidence supports that SMCCC is effective when compared to usual care provided by either a primary care practitioner or a cardiologist. It does not, however, suggest an optimal model of care or discern what the effective program components are. A field evaluation could address this uncertainty.
PMCID: PMC3377506  PMID: 23074521
19.  Mediation Analysis with Multiple Mediators 
Epidemiologic methods  2014;2(1):95-115.
Recent advances in the causal inference literature on mediation have extended traditional approaches to direct and indirect effects to settings that allow for interactions and non-linearities. In this paper, these approaches from causal inference are further extended to settings in which multiple mediators may be of interest. Two analytic approaches, one based on regression and one based on weighting, are proposed to estimate the effect mediated through multiple mediators and the effects through other pathways. The approaches proposed here accommodate exposure-mediator interactions and, to a certain extent, mediator-mediator interactions as well. The methods handle binary or continuous mediators and binary, continuous or count outcomes. When the mediators affect one another, the strategy of trying to assess direct and indirect effects one mediator at a time will in general fail; the approach given in this paper can still be used. A characterization is moreover given as to when the sum of the mediated effects for multiple mediators considered separately will be equal to the mediated effect of all of the mediators considered jointly. The approach proposed in this paper is robust to unmeasured common causes of two or more mediators.
doi:10.1515/em-2012-0010
PMCID: PMC4287269  PMID: 25580377
Direct and indirect effects; joint effects; mediation; regression; weighting
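A toy linear illustration of the point in entry 19 that one-at-a-time mediator analyses can fail when mediators affect one another, whereas the mediators considered jointly still give a coherent decomposition. The data-generating model and the naive difference-method contrasts are assumptions for illustration, not the regression and weighting estimators developed in the paper.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20000

# Linear toy model: A -> M1 -> M2 -> Y, with direct A -> M2 and A -> Y paths
A = rng.binomial(1, 0.5, n).astype(float)
M1 = 0.7 * A + rng.normal(size=n)
M2 = 0.4 * A + 0.5 * M1 + rng.normal(size=n)
Y = 0.3 * A + 0.6 * M1 + 0.8 * M2 + rng.normal(size=n)

def ols_coef_on_A(y, cols):
    """Coefficient on A from an OLS fit of y on the given columns plus an intercept."""
    Xd = np.column_stack(cols + [np.ones(n)])
    return np.linalg.lstsq(Xd, y, rcond=None)[0][0]

total = ols_coef_on_A(Y, [A])                        # total effect of A on Y
nde_joint = ols_coef_on_A(Y, [A, M1, M2])            # direct effect adjusting for both mediators jointly
nie_joint = total - nde_joint                        # effect through M1 and M2 considered jointly

# Naive one-at-a-time "indirect effects": generally do not sum to the joint effect when M1 affects M2
nie_m1 = total - ols_coef_on_A(Y, [A, M1])
nie_m2 = total - ols_coef_on_A(Y, [A, M2])
print("joint NIE:", nie_joint, " sum of one-at-a-time NIEs:", nie_m1 + nie_m2)
```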
20.  A robust two-way semi-linear model for normalization of cDNA microarray data 
BMC Bioinformatics  2005;6:14.
Background
Normalization is a basic step in microarray data analysis. A proper normalization procedure ensures that the intensity ratios provide meaningful measures of relative expression values.
Methods
We propose a robust semiparametric method in a two-way semi-linear model (TW-SLM) for normalization of cDNA microarray data. This method does not make the usual assumptions underlying some of the existing methods. For example, it does not assume that: (i) the percentage of differentially expressed genes is small; or (ii) the numbers of up- and down-regulated genes are about the same, as required in the LOWESS normalization method. We conduct simulation studies to evaluate the proposed method and use a real data set from a specially designed microarray experiment to compare the performance of the proposed method with that of the LOWESS normalization approach.
Results
The simulation results show that the proposed method performs better than the LOWESS normalization method in terms of mean square errors for estimated gene effects. The results of analysis of the real data set also show that the proposed method yields more consistent results between the direct and the indirect comparisons and also can detect more differentially expressed genes than the LOWESS method.
Conclusions
Our simulation studies and the real data example indicate that the proposed robust TW-SLM method works at least as well as the LOWESS method and works better when the underlying assumptions for the LOWESS method are not satisfied. Therefore, it is a powerful alternative to the existing normalization methods.
doi:10.1186/1471-2105-6-14
PMCID: PMC549200  PMID: 15663789
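For context on entry 20, the LOWESS intensity-dependent normalization that the TW-SLM is compared against can be sketched as follows on simulated two-channel log-intensities. The simulated dye bias and smoothing span are illustrative assumptions.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(8)
n_genes = 2000

# Simulated spot-level quantities with an intensity-dependent dye bias
A = rng.uniform(6, 14, n_genes)                       # average log2 intensity per spot
bias = 0.05 * (A - 10) ** 2 - 0.3                     # smooth intensity-dependent bias
M = bias + rng.normal(0, 0.4, n_genes)                # log2 ratio (most genes not differentially expressed)

# LOWESS normalization: subtract the fitted smooth trend of M on A
trend = lowess(M, A, frac=0.4, return_sorted=False)
M_norm = M - trend
print("mean |M| before:", np.abs(M).mean(), "after:", np.abs(M_norm).mean())
```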
21.  A marginal approach to reduced-rank penalized spline smoothing with application to multilevel functional data 
Multilevel functional data are collected in many biomedical studies. For example, in a study of the effect of Nimodipine on patients with subarachnoid hemorrhage (SAH), patients underwent multiple 4-hour treatment cycles. Within each treatment cycle, subjects’ vital signs were reported every 10 minutes. These data have a natural multilevel structure with treatment cycles nested within subjects and measurements nested within cycles. Most literature on nonparametric analysis of such multilevel functional data focuses on conditional approaches using functional mixed effects models. However, parameters obtained from the conditional models do not have direct interpretations as population average effects. When population effects are of interest, we may employ marginal regression models. In this work, we propose marginal approaches to fit multilevel functional data through penalized spline generalized estimating equations (penalized spline GEE). The procedure is effective for modeling multilevel correlated generalized outcomes as well as continuous outcomes without suffering from numerical difficulties. We provide a variance estimator robust to misspecification of the correlation structure. We investigate the large sample properties of the penalized spline GEE estimator with multilevel continuous data and show that the asymptotics fall into two categories. In the small knots scenario, the estimated mean function is asymptotically efficient when the true correlation function is used and the asymptotic bias does not depend on the working correlation matrix. In the large knots scenario, both the asymptotic bias and variance depend on the working correlation. We propose a new method to select the smoothing parameter for penalized spline GEE based on an estimate of the asymptotic mean squared error (MSE). We conduct extensive simulation studies to examine properties of the proposed estimator under different correlation structures and the sensitivity of the variance estimation to the choice of smoothing parameter. Finally, we apply the methods to the SAH study to evaluate a recent debate on discontinuing the use of Nimodipine in the clinical community.
doi:10.1080/01621459.2013.826134
PMCID: PMC3909538  PMID: 24497670
Penalized spline; GEE; Semiparametric models; Longitudinal data; Functional data
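A minimal sketch of the flavor of estimator described above, assuming a working-independence correlation structure, a truncated power spline basis, a fixed (not MSE-selected) smoothing parameter, and simulated clustered continuous data. None of these choices come from the paper; the paper's multilevel correlation structures and smoothing-parameter selection are not reproduced here.

```python
# Sketch: penalized spline fit via a working-independence GEE-type estimating
# equation, with a cluster-robust (sandwich) variance estimate. Illustrative only.
import numpy as np

rng = np.random.default_rng(1)

def spline_basis(x, knots, degree=1):
    """Truncated power basis: [1, x, ..., x^degree, (x - k)_+^degree for each knot]."""
    cols = [x**d for d in range(degree + 1)]
    cols += [np.clip(x - k, 0, None)**degree for k in knots]
    return np.column_stack(cols)

# Simulated clustered data: n subjects, m observations each, shared random intercept.
n, m = 100, 10
subj = np.repeat(np.arange(n), m)
x = rng.uniform(0, 1, size=n * m)
b = rng.normal(0, 0.5, size=n)
y = np.sin(2 * np.pi * x) + b[subj] + rng.normal(0, 0.3, size=n * m)

knots = np.linspace(0.1, 0.9, 9)
X = spline_basis(x, knots, degree=1)
D = np.diag([0, 0] + [1] * len(knots))   # penalize only the knot coefficients
lam = 1.0                                 # smoothing parameter (assumed, not MSE-selected)

# Working-independence penalized estimating equation: (X'X + lam*D) beta = X'y
A = X.T @ X + lam * D
beta = np.linalg.solve(A, X.T @ y)

# Cluster-robust sandwich variance: A^{-1} (sum_i X_i' r_i r_i' X_i) A^{-1}
resid = y - X @ beta
meat = np.zeros((X.shape[1], X.shape[1]))
for i in range(n):
    idx = subj == i
    s = X[idx].T @ resid[idx]
    meat += np.outer(s, s)
A_inv = np.linalg.inv(A)
cov_beta = A_inv @ meat @ A_inv

# Estimated mean curve and pointwise standard errors on a small grid.
x_grid = np.linspace(0, 1, 5)
X_grid = spline_basis(x_grid, knots, degree=1)
fit = X_grid @ beta
se = np.sqrt(np.einsum('ij,jk,ik->i', X_grid, cov_beta, X_grid))
print(np.round(np.column_stack([x_grid, fit, se]), 3))
```

The sandwich step is the piece that makes the variance estimate robust to the (here deliberately wrong) working-independence assumption, mirroring the robustness property highlighted in the abstract.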
22.  Mediation Analysis with Principal Stratification 
Statistics in medicine  2009;28(7):1108-1130.
In assessing the mechanism of treatment efficacy in randomized clinical trials, investigators often perform mediation analyses that examine whether a significant intent-to-treat effect on the outcome occurs through or around a third, intermediate or mediating, variable: indirect and direct effects, respectively. Standard mediation analyses assume sequential ignorability, i.e., that conditional on covariates the intermediate or mediating factor is randomly assigned, just as the treatment is in a randomized clinical trial. This research focuses on applying the principal stratification approach to estimate the direct effect of a randomized treatment without the standard sequential ignorability assumption. The approach estimates the direct effect of treatment as a difference between expectations of potential outcomes within latent subgroups of participants for whom the intermediate variable would be constant regardless of the randomized treatment assignment. Using a Bayesian estimation procedure, we also assess the sensitivity of results based on the principal stratification approach to heterogeneity of the variances among these principal strata. We assess the approach with simulations and apply it to two psychiatric examples. Both the examples and the simulations indicated that our findings are robust to the homogeneous-variance assumption. However, the simulations showed that the magnitude of treatment effects derived under the principal stratification approach was sensitive to model misspecification. (A small simulated illustration of the principal-stratification estimand follows this entry.)
doi:10.1002/sim.3533
PMCID: PMC2669107  PMID: 19184975
Principal stratification; mediating variables; direct effects; principal strata probabilities; heterogeneous variances
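The following sketch illustrates only the estimand described above, not the paper's Bayesian estimation procedure or its sensitivity analysis: in simulated data where both potential values of a binary mediator are known by construction, the direct effect is the mean difference in potential outcomes within the stratum whose mediator would be the same under either treatment assignment. All data-generating values are assumptions made purely for illustration.

```python
# Oracle illustration of the principal-stratification direct-effect estimand.
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Potential mediator values under control (M0) and treatment (M1), binary,
# generated to satisfy monotonicity (M1 >= M0).
M0 = rng.binomial(1, 0.3, size=n)
M1 = np.where(M0 == 1, 1, rng.binomial(1, 0.4, size=n))

# Potential outcomes: a direct treatment effect of 1.0 plus an effect of 2.0
# per unit change in the mediator (purely hypothetical numbers).
eps = rng.normal(0, 1, size=n)
Y0 = 2.0 * M0 + eps
Y1 = 1.0 + 2.0 * M1 + eps

# Principal strata are defined by (M0, M1); the direct effect is E[Y1 - Y0]
# within the strata where treatment does not change the mediator.
constant_m = M0 == M1
print("direct effect (mediator unchanged):",
      round((Y1[constant_m] - Y0[constant_m]).mean(), 3))   # about 1.0
print("intent-to-treat (total) effect    :",
      round((Y1 - Y0).mean(), 3))                           # larger than 1.0
```

In practice neither potential mediator value is observed for any participant, which is precisely why the paper turns to a Bayesian procedure over the latent strata rather than the oracle comparison shown here.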
23.  Dimension reduced kernel estimation for distribution function with incomplete data 
SUMMARY
This work focuses on the estimation of distribution functions with incomplete data, where the variable of interest Y has ignorable missingness but the covariate X is always observed. When X is high dimensional, parametric approaches to incorporating the X-information are encumbered by the risk of model misspecification, and nonparametric approaches by the curse of dimensionality. We propose a semiparametric approach, developed under a nonparametric kernel regression framework but with a parametric working index that condenses the high-dimensional X-information into a reduced dimension. This kernel dimension reduction estimator is doubly robust to model misspecification and is most efficient when the working index adequately conveys the X-information about the distribution of Y. Numerical studies indicate better performance of the semiparametric estimator over its parametric and nonparametric counterparts. We apply the kernel dimension reduction estimation to an HIV study of the effect of antiretroviral therapy on HIV virologic suppression. (A minimal numerical sketch of the underlying idea follows this entry.)
doi:10.1016/j.jspi.2011.03.030
PMCID: PMC3127551  PMID: 21731174
curse of dimensionality; dimension reduction; distribution function; ignorable missingness; kernel regression; quantile
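A minimal numerical sketch of the general idea, under assumptions of my own choosing (a known working index, a Gaussian kernel with a fixed bandwidth, and simulated data): smooth the indicator 1{Y ≤ y} on the scalar index u = X'β among complete cases, then average the fitted values over all subjects. This is not the paper's exact estimator or tuning.

```python
# Dimension-reduced kernel estimate of F_Y(y) when Y is missing at random given X.
import numpy as np

rng = np.random.default_rng(3)
n, p = 2000, 5

X = rng.normal(size=(n, p))
beta_work = np.array([1.0, -0.5, 0.5, 0.0, 0.0])   # assumed working index direction
Y = X @ beta_work + rng.normal(size=n)

# Ignorable missingness: the chance of observing Y depends only on X.
prob_obs = 1 / (1 + np.exp(-(0.5 + X[:, 0])))
R = rng.binomial(1, prob_obs).astype(bool)

def F_hat(y, h=0.3):
    """Kernel-smooth 1{Y <= y} on the index among complete cases, then average."""
    u = X @ beta_work
    u_obs = u[R]
    ind_obs = (Y[R] <= y).astype(float)
    # Nadaraya-Watson fit evaluated at every subject's index value.
    w = np.exp(-0.5 * ((u[:, None] - u_obs[None, :]) / h) ** 2)
    m_hat = (w @ ind_obs) / w.sum(axis=1)
    return m_hat.mean()

y0 = 0.0
print("dimension-reduced kernel estimate of F(0):", round(F_hat(y0), 3))
print("naive complete-case estimate             :", round((Y[R] <= y0).mean(), 3))
print("true F(0) in this simulation             :", 0.5)
```

Because missingness depends on a covariate that also drives Y, the complete-case estimate is biased, while averaging the index-based kernel fit over all subjects recovers the target distribution function.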
24.  Empirical Likelihood-Based Estimation of the Treatment Effect in a Pretest–Posttest Study 
The pretest–posttest study design is commonly used in medical and social science research to assess the effect of a treatment or an intervention. Recently, there has been rising interest in developing inference procedures that improve efficiency while relaxing the assumptions used in pretest–posttest data analysis, especially when the posttest measurement may be missing. In this article we propose a semiparametric estimation procedure based on empirical likelihood (EL) that incorporates the common baseline covariate information to improve efficiency. The proposed method also yields an asymptotically unbiased estimate of the response distribution, so functions of the response distribution, such as the median, can be estimated straightforwardly, and the EL method can provide a more appealing estimate of the treatment effect for skewed data. We show that, compared with existing methods, the proposed EL estimator has appealing theoretical properties, especially when the working model for the underlying relationship between the pretest and posttest measurements is misspecified. A series of simulation studies demonstrates that the EL-based estimator outperforms its competitors when the working model is misspecified and the data are missing at random. We illustrate the methods by analyzing data from an AIDS clinical trial (ACTG 175).
doi:10.1198/016214508000000625
PMCID: PMC3666595  PMID: 23729942
Auxiliary information; Biased sampling; Causal inference; Observational study; Survey sampling
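The sketch below is not the empirical-likelihood estimator of the preceding entry; it is a simpler augmented inverse-probability-weighted (AIPW) analogue, included only to illustrate how baseline covariates, a working outcome model, and a missingness model enter when the posttest may be missing. The models, variable names, and simulated data are all assumptions.

```python
# AIPW-style estimate of the treatment effect on a posttest outcome that is
# missing at random given treatment arm and a baseline covariate. Illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(4)
n = 4000

Z = rng.binomial(1, 0.5, size=n)                           # randomized arm
X = rng.normal(size=n)                                      # baseline / pretest covariate
Y = 1.0 + 0.8 * X + 0.5 * Z + rng.normal(0, 1, size=n)      # posttest (truth: effect 0.5)

# Posttest observed with probability depending on the baseline covariate.
p_obs = 1 / (1 + np.exp(-(0.3 + 0.8 * X)))
R = rng.binomial(1, p_obs).astype(bool)

def aipw_arm_mean(arm):
    idx = Z == arm
    Xa, Ya, Ra = X[idx].reshape(-1, 1), Y[idx], R[idx]
    # Working missingness model (logistic) and working outcome model (linear),
    # the latter fit on complete cases only.
    pi = LogisticRegression().fit(Xa, Ra).predict_proba(Xa)[:, 1]
    m = LinearRegression().fit(Xa[Ra], Ya[Ra]).predict(Xa)
    # AIPW: outcome-model prediction plus inverse-probability-weighted residual.
    # (Ya is available here only because the data are simulated; with real data
    # the residual term is computed for observed posttests only.)
    return np.mean(m + Ra * (Ya - m) / pi)

effect = aipw_arm_mean(1) - aipw_arm_mean(0)
print("AIPW treatment-effect estimate:", round(effect, 3))
```

Like the EL estimator described in the abstract, this construction retains consistency if either the missingness model or the outcome working model is correct, though it does not share the EL estimator's other properties.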
25.  Mediation analysis with multiple versions of the mediator 
Epidemiology (Cambridge, Mass.)  2012;23(3):454-463.
The causal inference literature has provided definitions of direct and indirect effects based on counterfactuals that generalize the approach found in the social science literature. However, these definitions presuppose well-defined hypothetical interventions on the mediator. In many settings there may be multiple ways to fix the mediator to a particular value, and these different hypothetical interventions may have very different implications for the outcome of interest. In this paper we consider mediation analysis when multiple versions of the mediator are present. Specifically, we consider the problem of attempting to decompose a total effect of an exposure on an outcome into the portion operating through the intermediate and the portion operating through other pathways. We consider the setting in which there are multiple versions of the mediator but the investigator has access only to data on the particular measurement, not to which version of the mediator brought that value about. We show that the quantity estimated as a natural indirect effect using only the available data does indeed have an interpretation as a particular type of mediated effect; however, the quantity estimated as a natural direct effect in fact captures both a true direct effect and an effect of the exposure on the outcome mediated through the effect of the version of the mediator that is not captured by the mediator measurement. The results are illustrated using two examples from the literature, one in which the versions of the mediator are unknown and another in which the mediator itself has been dichotomized.
doi:10.1097/EDE.0b013e31824d5fe7
PMCID: PMC3771529  PMID: 22475830
