A dynamic regime provides a sequence of treatments that are tailored to patient-specific characteristics and outcomes. In 2004, James Robins proposed g-estimation using structural nested mean models for making inference about the optimal dynamic regime in a multi-interval trial. The method provides clear advantages over traditional parametric approaches. Robins’ g-estimation method always yields consistent estimators, but these can be asymptotically biased under a given structural nested mean model for certain longitudinal distributions of the treatments and covariates, termed exceptional laws. In fact, under the null hypothesis of no treatment effect, every distribution constitutes an exceptional law under structural nested mean models that allow for interaction of current treatment with past treatments or covariates. This paper provides an explanation of exceptional laws and describes a new approach to g-estimation, which we call Zeroing Instead of Plugging In (ZIPI). ZIPI yields estimators nearly identical to recursive g-estimators at non-exceptional laws while substantially reducing bias at an exceptional law when decision rule parameters are not shared across intervals.
adaptive treatment strategies; asymptotic bias; dynamic treatment regimes; g-estimation; optimal structural nested mean models; pre-test estimators
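The mechanics of the pre-test idea behind ZIPI can be sketched briefly: before an estimated later-interval blip is carried back into earlier-interval estimation, it is first tested against zero and zeroed out if the data are consistent with no effect. The Python fragment below is a schematic illustration of that step only, not the authors' exact procedure; the function name, the Wald statistic, and the default level are our own choices.

```python
# Schematic pre-test step: zero a later-interval blip estimate when a test
# of "no treatment effect" is not rejected, instead of plugging the noisy
# estimate into earlier-interval estimation. A sketch of the idea only.
import numpy as np
from scipy import stats

def pretest_blip(psi_hat, cov_hat, alpha=0.05):
    psi_hat = np.atleast_1d(psi_hat)
    wald = psi_hat @ np.linalg.solve(np.atleast_2d(cov_hat), psi_hat)
    p_value = stats.chi2.sf(wald, df=psi_hat.size)
    return psi_hat if p_value < alpha else np.zeros_like(psi_hat)
```

At a non-exceptional law the test rejects with probability tending to one, recovering the recursive g-estimator; under the null, zeroing avoids propagating a nonregular plug-in estimate.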
Individualized treatment rules, or rules for altering treatments over time in response to changes in individual covariates, are of primary importance in the practice of clinical medicine. Several statistical methods aim to estimate the rule, termed an optimal dynamic treatment regime, which will result in the best expected outcome in a population. In this article, we discuss estimation of an alternative type of dynamic regime—the statically optimal treatment rule. History-adjusted marginal structural models (HA-MSM) estimate individualized treatment rules that assign, at each time point, the first action of the future static treatment plan that optimizes expected outcome given a patient’s covariates. However, as we discuss here, HA-MSM-derived rules can depend on the way in which treatment was assigned in the data from which the rules were derived. We discuss the conditions sufficient for treatment rules identified by HA-MSM to be statically optimal, or in other words, to select the optimal future static treatment plan at each time point, regardless of the way in which past treatment was assigned. The resulting treatment rules form appropriate candidates for evaluation using randomized controlled trials. We demonstrate that a history-adjusted individualized treatment rule is statically optimal if it depends on a set of covariates that are sufficient to control for confounding of the effect of past treatment history on outcome. Methods and results are illustrated using an example drawn from the antiretroviral treatment of patients infected with HIV. Specifically, we focus on rules for deciding when to modify the treatment of patients infected with resistant virus.
causal inference; longitudinal data; dynamic treatment regime; adaptive treatment strategy; history-adjusted marginal structural model; human immunodeficiency virus
Ideally, randomized trials would be used to compare the long-term
effectiveness of dynamic treatment regimes on clinically relevant outcomes.
However, because randomized trials are not always feasible or timely, we often
must rely on observational data to compare dynamic treatment regimes. An example
of a dynamic treatment regime is “start combined antiretroviral therapy
(cART) within 6 months of CD4 cell count first dropping below x
cells/mm³ or diagnosis of an AIDS-defining illness, whichever
happens first” where x can take values between 200 and
500. Recently, Cain et al. (2011) used
inverse probability (IP) weighting of dynamic marginal structural models to find
the x that minimizes 5-year mortality risk under similar
dynamic regimes using observational data. Unlike standard methods, IP weighting
can appropriately adjust for measured time-varying confounders (e.g., CD4 cell
count, viral load) that are affected by prior treatment. Here we describe an
alternative method to IP weighting for comparing the effectiveness of dynamic
cART regimes: the parametric g-formula. The parametric g-formula naturally
handles dynamic regimes and, like IP weighting, can appropriately adjust for
measured time-varying confounders. However, estimators based on the parametric
g-formula are more efficient than IP weighted estimators, though often at the
expense of stronger parametric assumptions. Here we describe how to use the
parametric g-formula to estimate risk by the end of a user-specified follow-up
period under dynamic treatment regimes. We describe an application of this
method to answer the “when to start” question using data from
the HIV-CAUSAL Collaboration.
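Computationally, the parametric g-formula amounts to Monte Carlo simulation: fit parametric models for each time-varying covariate and for the discrete-time hazard of the outcome, then simulate cohorts forward under each regime of interest. A minimal sketch for a "start cART once CD4 first drops below x" regime follows; cd4_model and death_model are hypothetical stand-ins for the fitted models, and the monthly time scale is illustrative.

```python
# Minimal Monte Carlo sketch of the parametric g-formula for the dynamic
# regime "start treatment once CD4 first drops below x". The model objects
# cd4_model and death_model are hypothetical stand-ins for regressions
# fitted to the observed data.
import numpy as np

def simulate_risk(x, baseline_cd4, cd4_model, death_model, months=60, rng=None):
    rng = rng if rng is not None else np.random.default_rng(0)
    n = len(baseline_cd4)
    cd4 = baseline_cd4.astype(float)
    on_cart = np.zeros(n, dtype=bool)
    alive = np.ones(n, dtype=bool)
    for k in range(months):
        on_cart |= cd4 < x                        # regime: start once CD4 < x
        p_death = death_model(cd4, on_cart, k)    # fitted discrete-time hazard
        alive &= rng.random(n) >= p_death
        cd4 = cd4_model(cd4, on_cart, k, rng)     # fitted covariate model
    return 1.0 - alive.mean()                     # risk by end of follow-up
```

Repeating the simulation over a grid of thresholds x (say, 200 to 500) and comparing the estimated risks is then a direct search for the best regime in the class.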
A treatment regime is a rule that assigns a treatment, among a set of possible treatments, to a patient as a function of his/her observed characteristics, hence “personalizing” treatment to the patient. The goal is to identify the optimal treatment regime that, if followed by the entire population of patients, would lead to the best outcome on average. Given data from a clinical trial or observational study, for a single treatment decision, the optimal regime can be found by assuming a regression model for the expected outcome conditional on treatment and covariates, where, for a given set of covariates, the optimal treatment is the one that yields the most favorable expected outcome. However, treatment assignment via such a regime is suspect if the regression model is incorrectly specified. Recognizing that, even if misspecified, such a regression model defines a class of regimes, we instead consider finding the optimal regime within that class by finding the regime that optimizes an estimator of the overall population mean outcome. To take into account possible confounding in an observational study and to increase precision, we use a doubly robust augmented inverse probability weighted estimator for this purpose. Simulations and application to data from a breast cancer clinical trial demonstrate the performance of the method.
Doubly robust estimator; Inverse probability weighting; Outcome regression; Personalized medicine; Potential outcomes; Propensity score
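To make the value-search idea concrete: for a binary treatment, the doubly robust augmented inverse probability weighted (AIPW) estimator of the mean outcome under a candidate regime d combines a weighted outcome term with an augmentation term built from the outcome regression. In the sketch below, pi_hat and m_hat are hypothetical fitted propensity and outcome-regression functions.

```python
# Sketch of the AIPW estimator of the population mean outcome under a
# candidate regime d(.), for binary treatment A in {0, 1}. The functions
# pi_hat (propensity) and m_hat (outcome regression) are hypothetical
# stand-ins for models fitted to the data.
import numpy as np

def aipw_value(Y, A, X, d, pi_hat, m_hat):
    dX = d(X)                              # treatments the regime assigns
    C = (A == dX).astype(float)            # received regime-consistent treatment?
    pc = np.where(dX == 1, pi_hat(X), 1.0 - pi_hat(X))
    m = m_hat(X, dX)                       # predicted outcome under the regime
    return np.mean(C * Y / pc - (C - pc) / pc * m)
```

The estimator is consistent if either pi_hat or m_hat is correctly specified; the estimated optimal regime is the one maximizing aipw_value over the class of regimes, e.g., by a direct search over the parameters indexing the class.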
A dynamic treatment regime consists of a set of decision rules that dictate how to individualize treatment to patients based on available treatment and covariate history. A common method for estimating an optimal dynamic treatment regime from data is Q-learning, which involves nonsmooth operations on the data. This nonsmoothness causes standard asymptotic approaches to inference, such as the bootstrap or Taylor series arguments, to break down if applied without correction. Here, we consider the m-out-of-n bootstrap for constructing confidence intervals for the parameters indexing the optimal dynamic regime. We propose an adaptive choice of m and show that it produces asymptotically correct confidence sets under fixed alternatives. Furthermore, the proposed method has the advantage of being conceptually and computationally much simpler than competing methods possessing this same theoretical property. We provide an extensive simulation study to compare the proposed method with currently available inference procedures. The results suggest that the proposed method delivers nominal coverage while being less conservative than alternatives. The proposed methods are implemented in the qLearn R package, available on the Comprehensive R Archive Network (http://cran.r-project.org/). Analysis of the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study is used as an illustrative example.
dynamic treatment regime; Q-learning; non-regularity; m-out-of-n bootstrap
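The generic mechanics of the interval construction are worth spelling out: resample m < n observations with replacement, form roots scaled by the square root of m, and invert their quantiles. The sketch below shows only this generic construction; the paper's adaptive, data-driven choice of m is not reproduced here.

```python
# Generic m-out-of-n bootstrap confidence interval. `data` is a numpy
# array of observations and `estimator` maps a sample to a scalar
# estimate; the paper's adaptive choice of m is not shown.
import numpy as np

def m_out_of_n_ci(data, estimator, m, B=1000, level=0.95, rng=None):
    rng = rng if rng is not None else np.random.default_rng(0)
    n = len(data)
    theta_hat = estimator(data)
    roots = np.empty(B)
    for b in range(B):
        resample = data[rng.integers(0, n, size=m)]       # m < n draws
        roots[b] = np.sqrt(m) * (estimator(resample) - theta_hat)
    lo, hi = np.quantile(roots, [(1 - level) / 2, (1 + level) / 2])
    # invert the root sqrt(n)(theta_hat - theta); note the quantile swap
    return theta_hat - hi / np.sqrt(n), theta_hat - lo / np.sqrt(n)
```

Taking m = n recovers the ordinary nonparametric bootstrap, which is exactly the case that fails under nonregularity; choosing m smaller than n restores validity at some cost in interval width, which is what motivates an adaptive choice of m.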
Treatment of schizophrenia is notoriously difficult and typically requires personalized adaptation of treatment due to lack of treatment efficacy, poor adherence, or intolerable side effects. The Clinical Antipsychotic Trials in Intervention Effectiveness (CATIE) Schizophrenia Study is a sequential multiple assignment randomized trial comparing the typical antipsychotic medication, perphenazine, to several newer atypical antipsychotics. This paper describes the marginal structural modeling method for estimating optimal dynamic treatment regimes and applies the approach to the CATIE Schizophrenia Study. Missing data and valid estimation of confidence intervals are also addressed.
Adaptive treatment strategies; causal effects; dynamic treatment regimes; inverse probability weighting; marginal structural models; personalized medicine; schizophrenia
A dynamic treatment regime is a set of decision rules, one per stage, each taking a patient’s treatment and covariate history as input and outputting a recommended treatment. When estimating the optimal dynamic treatment regime from longitudinal data, estimation of the treatment effect parameters at any stage prior to the last can be nonregular under certain distributions of the data. This results in biased estimates and invalid confidence intervals for the treatment effect parameters. In this paper, we discuss both the problem of nonregularity and available estimation methods. We provide an extensive simulation study to compare the estimators in terms of their ability to lead to valid confidence intervals under a variety of nonregular scenarios. Analysis of a data set from a smoking cessation trial is provided as an illustration.
dynamic treatment regime; nonregularity; bias; hard-threshold; soft-threshold; empirical Bayes; bootstrap
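Two of the approaches named in the keywords act by shrinking the nonsmooth max term that enters the stage-1 pseudo-outcome in Q-learning. Schematically, with z the fitted stage-2 effect term and lambda_ a tuning constant (the specific data-driven choices studied in the paper are not reproduced here):

```python
# Generic hard- and soft-threshold operators of the kind compared in the
# paper; lambda_ is a tuning constant, not a recommended choice.
import numpy as np

def hard_threshold(z, lambda_):
    return np.where(np.abs(z) > lambda_, z, 0.0)

def soft_threshold(z, lambda_):
    return np.sign(z) * np.maximum(np.abs(z) - lambda_, 0.0)
```

Both operators leave large, clearly nonzero effects essentially untouched while zeroing effects indistinguishable from noise, which is where nonregularity bites.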
The pretest–posttest study is commonplace in numerous applications. Typically, subjects are randomized to two treatments, and response is measured at baseline, prior to intervention with the randomized treatment (pretest), and at a prespecified follow-up time (posttest). Interest focuses on the effect of treatments on the change between mean baseline and follow-up response. Missing posttest response for some subjects is routine, and disregarding missing cases can lead to invalid inference. Despite the popularity of this design, there is no consensus on an appropriate analysis when no data are missing, let alone one that takes into account missing follow-up. Under a semiparametric perspective on the pretest–posttest model, in which limited distributional assumptions on pretest or posttest response are made, we show how the theory of Robins, Rotnitzky and Zhao may be used to characterize a class of consistent treatment effect estimators and to identify the efficient estimator in the class. We then describe how the theoretical results translate into practice. The development not only shows how a unified framework for inference in this setting emerges from the Robins, Rotnitzky and Zhao theory, but also provides a review and demonstration of the key aspects of this theory in a familiar context. The results are also relevant to the problem of comparing two treatment means with adjustment for baseline covariates.
Analysis of covariance; covariate adjustment; influence function; inverse probability weighting; missing at random
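Under our reading, the Robins, Rotnitzky and Zhao theory delivers, in the no-missing-data case, a class of consistent estimators indexed by arbitrary functions h_0 and h_1 of the baseline data. Schematically, with Z_i the treatment indicator, delta = P(Z = 1) the randomization probability, Y_i the posttest response, and X_i the baseline data,

$$\hat{\beta}=\frac{1}{n}\sum_{i=1}^{n}\left\{\frac{Z_i Y_i}{\delta}-\frac{Z_i-\delta}{\delta}\,h_1(X_i)\right\}-\frac{1}{n}\sum_{i=1}^{n}\left\{\frac{(1-Z_i)Y_i}{1-\delta}+\frac{Z_i-\delta}{1-\delta}\,h_0(X_i)\right\},$$

where the augmentation terms have mean zero by randomization, so any choice of h_0, h_1 yields a consistent estimator; the efficient member takes h_z(X) = E(Y | X, Z = z). Missing posttest responses are then handled by inverse probability weighting of complete cases with further augmentation.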
Using validation sets for outcomes can greatly improve the estimation of vaccine efficacy (VE) in the field (Halloran and Longini, 2001; Halloran and others, 2003). Most statistical methods for using validation sets rely on the assumption that outcomes on those with no cultures are missing at random (MAR). However, often the validation sets will not be chosen at random. For example, confirmational cultures are often done on people with influenza-like illness as part of routine influenza surveillance. VE estimates based on such non-MAR validation sets could be biased. Here we propose frequentist and Bayesian approaches for estimating VE in the presence of validation bias. Our work builds on the ideas of Rotnitzky and others (1998, 2001), Scharfstein and others (1999, 2003), and Robins and others (2000). Our methods require expert opinion about the nature of the validation selection bias. In a re-analysis of an influenza vaccine study, we found, using the beliefs of a flu expert, that within any plausible range of selection bias the VE estimate based on the validation sets is much higher than the point estimate using just the non-specific case definition. Our approach is generally applicable to studies with missing binary outcomes with categorical covariates.
Bayesian; Expert opinion; Identifiability; Influenza; Missing data; Selection model; Vaccine efficacy
A dynamic treatment regime is a list of rules for how the level of treatment will be tailored through time to an individual’s changing severity. In general, individuals who receive the highest level of treatment are the individuals with the greatest severity and need for treatment; thus there is planned selection of the treatment dose. In addition to the planned selection mandated by the treatment rules, the use of staff judgment results in unplanned selection of the treatment level. Given observational longitudinal data, or data in which there is unplanned selection of the treatment level, the methodology proposed here allows estimation of the mean response to a dynamic treatment regime under the assumption of sequential randomization.
dynamic treatment regimes; nondynamic treatment regimes; causal inference; confounding
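Under sequential randomization, the mean response to a dynamic regime d is identified by the g-computation formula. Schematically, for discrete covariates and with \bar d(\bar l_k) denoting the treatments the regime assigns along covariate history \bar l_k,

$$E\left[Y_{d}\right]=\sum_{\bar l_{K}}E\left[Y\mid\bar L_{K}=\bar l_{K},\,\bar A_{K}=\bar d(\bar l_{K})\right]\prod_{k=0}^{K}P\left(L_{k}=l_{k}\mid\bar L_{k-1}=\bar l_{k-1},\,\bar A_{k-1}=\bar d(\bar l_{k-1})\right),$$

with \bar L_{-1} and \bar A_{-1} empty by convention. This is the estimand that remains identifiable despite both the planned and the unplanned selection of the treatment level described above.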
The effect of spatial structure has proved highly relevant in repeated games. In this work we propose an agent-based model in which a fixed finite population of tagged agents iteratively play the Nash demand game on a regular lattice. The model extends the multiagent bargaining model of Axtell, Epstein and Young by modifying the assumption of global interaction. Each agent is endowed with a memory and plays the best reply against the opponent's most frequent demand. We focus our analysis on the transient dynamics of the system, studying by computer simulation the set of states in which the system spends a considerable fraction of the time. The results show that all the possible persistent regimes in the global interaction model can also be observed in this spatial version. We also find that the mesoscopic properties of the interaction networks that the spatial distribution induces in the model have a significant impact on the diffusion of strategies, and can lead to new persistent regimes different from those found in previous research. In particular, community structure in the intratype interaction networks may cause communities to reach different persistent regimes as a consequence of the hindering diffusion effect of fluctuating agents at their borders.
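For readers who want the flavor of the dynamics, here is a deliberately stripped-down sketch of a lattice bargaining model of this kind: memory-endowed agents on a periodic grid best-reply to the most frequent demand in their memory of past opponents. Agent tags, experimentation noise, and the cited model's exact parameterization are omitted, and all parameter values are illustrative.

```python
# Stripped-down lattice Nash demand game: each agent best-replies to the
# most frequent demand in its memory of past opponents. Tags, noise and
# the cited model's exact parameters are deliberately omitted.
import numpy as np
from collections import deque

DEMANDS = (30, 50, 70)                      # low, medium, high; the pie is 100

def best_reply(memory):
    mem = list(memory)
    modal = max(set(mem), key=mem.count)    # opponent's most frequent demand
    feasible = [d for d in DEMANDS if d + modal <= 100]
    return max(feasible) if feasible else min(DEMANDS)

def run(n=20, mem_len=5, steps=50_000, seed=0):
    rng = np.random.default_rng(seed)
    memory = [[deque([rng.choice(DEMANDS)], maxlen=mem_len)
               for _ in range(n)] for _ in range(n)]
    neighbours = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    for _ in range(steps):
        i, j = int(rng.integers(n)), int(rng.integers(n))
        di, dj = neighbours[int(rng.integers(4))]
        a, b = (i + di) % n, (j + dj) % n   # periodic boundary
        d1 = best_reply(memory[i][j])
        d2 = best_reply(memory[a][b])
        memory[i][j].append(d2)             # each side records the opponent's demand
        memory[a][b].append(d1)
    return memory
```

Persistent regimes (e.g., an equitable all-50 norm or fractious mixtures of high and low demands) can then be read off from the distribution of modal demands across the lattice.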
The joint effects of multiple exposures on an outcome are frequently of interest in epidemiologic research. In 2001, Hernán, Brumback, and Robins (JASA 2001; 96: 440–448) presented methods for estimating the joint effects of multiple time-varying exposures subject to time-varying confounding affected by prior exposure using joint marginal structural models. Nonetheless, the use of these joint models is rare in the applied literature. Minimal uptake of these joint models, in contrast to the now widely used standard marginal structural model, is due in part to a lack of examples demonstrating the method. In this paper, we review the assumptions necessary for unbiased estimation of joint effects as well as the distinction between interaction and effect measure modification. We demonstrate the use of marginal structural models for estimating the joint effects of alcohol consumption and injection drug use on HIV acquisition, using data from 1,525 injection drug users in the AIDS Link to Intravenous Experience cohort study. In the joint model, the hazard ratio (HR) for heavy drinking in the absence of any drug injections was 1.58 (95% confidence interval = 0.67–3.73). The HR for any drug injections in the absence of heavy drinking was 1.78 (1.10–2.89). The HR for heavy drinking and any drug injections was 2.45 (1.45–4.12). The P values for multiplicative and additive interaction were 0.7620 and 0.9200, respectively, indicating a lack of departure from effects that multiply or add. However, we could not rule out interaction on either scale due to imprecision.
A dynamic treatment regime is a decision rule in which the choice of treatment for an individual at any given time can depend on the known past history of that individual, including baseline covariates, earlier treatments, and their measured responses. In this paper we argue that finding an optimal regime can, at least in moderately simple cases, be accomplished by a straightforward application of nonparametric Bayesian modeling and predictive inference. As an illustration we consider an inference problem in a subset of the Multicenter AIDS Cohort Study (MACS) data set, studying the effect of AZT initiation on future CD4-cell counts during a 12-month follow-up.
Bayesian nonparametric regression; causal inference; dynamic programming; monotonicity; optimal dynamic regimes
Antiviral resistance in influenza is rampant and has the possibility of causing major morbidity and mortality. Previous models have identified treatment regimes to minimize total infections and keep resistance low. However, the bulk of these studies have ignored stochasticity and heterogeneous contact structures. Here we develop a network model of influenza transmission with treatment and resistance, and present both standard mean-field approximations and simulated dynamics. We find differences in the final epidemic sizes for identical transmission parameters (bistability), leading to different optimal treatment timing depending on the number initially infected. We also find, contrary to previous results, that treatment targeted by number of contacts per individual (node degree) gives rise to more resistance at lower levels of treatment than non-targeted treatment. Finally, we highlight important differences between the two methods of analysis (mean-field versus stochastic simulations), and show where traditional mean-field approximations fail. Our results have important implications not only for the timing and distribution of influenza chemotherapy, but also for mathematical epidemiological modeling in general. Antiviral resistance in influenza may carry large consequences for pandemic mitigation efforts, and models ignoring contact heterogeneity and stochasticity may provide misleading policy recommendations.
Resistance of influenza to common antiviral agents carries the possibility of causing large morbidity and mortality through failure of treatment, and should be taken into account when planning public health interventions focused on stopping transmission. Here we present a mathematical model of influenza transmission that incorporates heterogeneous contact structure and stochastic transmission events. We find scenarios in which treatment induces either large levels of resistance or no resistance at identical values of the transmission rate, depending on the number initially infected. We also find, contrary to previous results, that targeted treatment causes more resistance at lower treatment levels than non-targeted treatment. Our results have important implications for the timing and distribution of antivirals in epidemics, and highlight important differences in how transmission is modeled, showing where assumptions made in previous models lead them to erroneous conclusions.
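A deliberately compact sketch of this class of model follows: stochastic SIR dynamics on a contact network, a treated fraction of nodes with reduced wild-type transmissibility, and de novo emergence under treatment pressure of a resistant strain paying a small transmission cost. All rates and probabilities are illustrative, and the published model is certainly richer than this.

```python
# Compact stochastic network SIR sketch with antiviral treatment and
# emergence of a resistant strain; all parameter values are illustrative.
import numpy as np
import networkx as nx

def simulate(G, beta=0.06, beta_r=0.05, gamma=0.25, f_treat=0.3,
             eff=0.7, p_emerge=0.01, i0=5, seed=0):
    rng = np.random.default_rng(seed)
    state = {v: "S" for v in G}            # S, Iw (wild-type), Ir (resistant), R
    treated = {v: rng.random() < f_treat for v in G}
    for v in rng.choice(list(G), size=i0, replace=False):
        state[v] = "Iw"
    while any(s in ("Iw", "Ir") for s in state.values()):
        new = dict(state)
        for v, s in state.items():
            if s not in ("Iw", "Ir"):
                continue
            if rng.random() < gamma:       # recovery
                new[v] = "R"
                continue
            b = beta_r if s == "Ir" else beta * (1 - eff * treated[v])
            for u in G.neighbors(v):       # per-contact transmission
                if state[u] == "S" and rng.random() < b:
                    new[u] = s
            if s == "Iw" and treated[v] and rng.random() < p_emerge:
                new[v] = "Ir"              # de novo resistance under treatment
        state = new
    return sum(s == "R" for s in state.values())   # final epidemic size
```

For example, simulate(nx.barabasi_albert_graph(5000, 3)) runs one epidemic on a heterogeneous contact network; degree-targeted treatment corresponds to assigning treated to the highest-degree nodes rather than at random.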
Oncolytic viruses are viruses that specifically infect cancer cells and kill them, while leaving healthy cells largely intact. Their ability to spread through the tumor makes them an attractive therapy approach. While promising results have been observed in clinical trials, solid success remains elusive because we lack understanding of the basic principles that govern the dynamical interactions between the virus and the cancer. In this respect, computational models can help experimental research optimize treatment regimes. Although preliminary mathematical work has been performed, it suffers from the fact that individual models are largely arbitrary and based on biologically uncertain assumptions. Here, we present a general framework to study the dynamics of oncolytic viruses that is independent of uncertain and arbitrary mathematical formulations. We find two categories of dynamics, depending on the assumptions about the spatial constraints that govern the spread of the virus from cell to cell. If infected cells are mixed among uninfected cells, there exists a viral replication rate threshold beyond which tumor control is the only outcome. On the other hand, if infected cells are clustered together (e.g., in a solid tumor), then we observe more complicated dynamics in which the outcome of therapy might go either way, depending on the initial number of cells and viruses. We fit our models to previously published experimental data and discuss aspects of model validation, selection, and experimental design. This framework can be used as a basis for model selection and validation in the context of future, more detailed experimental studies. It can further serve as the basis for future, more complex models that take into account other clinically relevant factors such as immune responses.
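To make the modeling setting concrete, here is one generic mass-action formulation of the kind the paper argues should not be relied on in isolation: uninfected tumor cells x grow logistically and are converted by infection into infected cells y, which die at rate a. The functional forms and parameter values are arbitrary illustrations, not the paper's framework.

```python
# Generic mass-action oncolytic virus dynamics: x uninfected tumor cells,
# y infected cells. Functional forms and parameters are illustrative only.
from scipy.integrate import solve_ivp

def oncolytic(t, z, r=0.5, K=1e9, beta=1e-9, a=0.3):
    x, y = z
    growth = r * x * (1 - (x + y) / K)     # logistic tumor growth
    infection = beta * x * y               # well-mixed mass-action spread
    return [growth - infection, infection - a * y]

sol = solve_ivp(oncolytic, (0, 200), [1e7, 1e4], rtol=1e-8)
x_final, y_final = sol.y[:, -1]            # tumor burden at end of run
```

In such well-mixed formulations the infection term scales with the product x*y, which is consistent with the replication-rate threshold behavior described above; the clustered, spatially constrained case requires a spatially explicit formulation and is where outcomes become sensitive to initial cell and virus numbers.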
This article considers the problem of assessing causal effect moderation in longitudinal settings in which treatment (or exposure) is time-varying and so are the covariates said to moderate its effect. Intermediate causal effects that describe time-varying causal effects of treatment conditional on past covariate history are introduced and considered as part of Robins’ structural nested mean model. Two estimators of the intermediate causal effects, and their standard errors, are presented and discussed: the first is a proposed 2-stage regression estimator; the second is Robins’ g-estimator. Results of a small simulation study are presented that begin to shed light on the small- versus large-sample performance of the estimators and on the bias-variance trade-off between them. The methodology is illustrated using longitudinal data from a depression study.
Causal inference; Effect modification; Estimating equations; G-Estimation; 2-stage estimation; Time-varying treatment; Time-varying covariates; Bias-variance trade-off
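For reference, the intermediate effects in a structural nested mean model are typically parameterized through blip functions. Schematically, in our notation (which may differ in detail from the article's),

$$\gamma_{t}\left(\bar a_{t},\bar l_{t}\right)=E\left[Y\left(\bar a_{t},\underline 0\right)-Y\left(\bar a_{t-1},\underline 0\right)\mid\bar L_{t}=\bar l_{t},\,\bar A_{t-1}=\bar a_{t-1}\right],$$

the effect of a final blip of treatment a_t at time t, with treatment set to the reference level thereafter, conditional on covariate and treatment history. These conditional, time-varying contrasts are the quantities both the 2-stage regression estimator and the g-estimator target, and the covariates in \bar l_t are exactly the candidate moderators.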
Chaotic dynamics in a recurrent neural network model and in two-dimensional cellular automata, both of which have finite but large degrees of freedom, are investigated from the viewpoint of harnessing chaos and are applied to motion control to show that both have potential capabilities for complex function control by simple rule(s). An important point is that the chaotic dynamics generated in these two systems give rise to autonomous complex pattern dynamics that itinerate through intermediate states between embedded patterns (attractors) in high-dimensional state space. An application of these chaotic dynamics to complex control is proposed, based on the idea that complex problems can be solved by simple adaptive switching between a weakly chaotic regime and a strongly chaotic regime. As a concrete example, a two-dimensional maze, whose spatial structure is a typical ill-posed problem, is solved with the use of chaos in both systems. Our computer simulations show that the success rate over 300 trials is much better than that of a random number generator. Our functional simulations indicate that the two systems are almost equivalent in functional terms with respect to this idea of harnessing chaos.
Chaotic dynamics; Recurrent neural network; Cellular automata; Information processing; Complex control; Adaptive function
In the presence of time-varying confounders affected by prior treatment, standard statistical methods for failure time analysis may be biased. Methods that correctly adjust for this type of covariate include the parametric g-formula, inverse probability weighted estimation of marginal structural Cox proportional hazards models, and g-estimation of structural nested accelerated failure time models. In this article, we propose a novel method to estimate the causal effect of a time-dependent treatment on failure in the presence of informative right-censoring and time-dependent confounders that may be affected by past treatment: g-estimation of structural nested cumulative failure time models (SNCFTMs). An SNCFTM considers the conditional effect of a final treatment at time m on the outcome at each later time k by modeling the ratio of two counterfactual cumulative risks at time k under treatment regimes that differ only at time m. Inverse probability weights are used to adjust for informative censoring. We also present a procedure that, under certain “no-interaction” conditions, uses the g-estimates of the model parameters to calculate unconditional cumulative risks under nondynamic (static) treatment regimes. The procedure is illustrated with an example using data from a longitudinal cohort study, in which the “treatments” are healthy behaviors and the outcome is coronary heart disease.
Causal inference; Coronary heart disease; Epidemiology; G-estimation; Inverse probability weighting
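Schematically, the SNCFTM described above models, for each k > m,

$$\frac{E\left[Y_{k}\left(\bar A_{m},\underline 0\right)\mid\bar L_{m},\bar A_{m}\right]}{E\left[Y_{k}\left(\bar A_{m-1},\underline 0\right)\mid\bar L_{m},\bar A_{m}\right]}=\exp\left\{\gamma_{km}\left(\bar L_{m},\bar A_{m};\psi\right)\right\},$$

the ratio of counterfactual cumulative risks at time k under treatment as received through m versus through m − 1, with treatment set to zero thereafter; psi = 0 encodes the null of no effect. (Our notation, intended only to match the verbal description above.)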
We consider the increasingly important and highly complex immunological control problem: control of the dynamics of immunosuppression for organ transplant recipients. The goal in this problem is to maintain the delicate balance between over-suppression (where opportunistic latent viruses threaten the patient) and under-suppression (where rejection of the transplanted organ is probable). First, a mathematical model is formulated to describe the immune response to both viral infection and introduction of a donor kidney in a renal transplant recipient. Some numerical results are given to qualitatively validate and demonstrate that this initial model exhibits appropriate characteristics of primary infection and reactivation for immunosuppressed transplant recipients. In addition, we develop a computational framework for designing adaptive optimal treatment regimes with partial observations and low frequency sampling, where the state estimates are obtained by solving a second deterministic optimal tracking problem. Numerical results are given to illustrate the feasibility of this method in obtaining optimal treatment regimes with a balance between under-suppression and over-suppression of the immune system.
Renal transplant; human cytomegalovirus; mathematical model; optimal feedback control; state estimation; model predictive control
Multistability of oscillatory and silent regimes is a ubiquitous phenomenon exhibited by excitable systems such as neurons and cardiac cells. Multistability can play functional roles in short-term memory and in maintaining posture, and it would seem to confer an evolutionary advantage on neurons that form part of multifunctional central pattern generators. The mechanisms supporting multistability of bursting regimes are not well understood or classified.
Our study focuses on determining the biophysical mechanisms underlying different types of co-existence of the oscillatory and silent regimes observed in a neuronal model. We develop a low-dimensional model typifying the dynamics of a single leech heart interneuron. We carry out a bifurcation analysis of the model and show that it possesses six different types of multistability of dynamical regimes: the co-existence of (1) bursting and silence, (2) tonic spiking and silence, (3) tonic spiking and subthreshold oscillations, (4) bursting and subthreshold oscillations, (5) bursting, subthreshold oscillations and silence, and (6) bursting and tonic spiking. The first five types occur due to the presence of a separating regime that is either a saddle periodic orbit or a saddle equilibrium. We find that the parameter range in which multistability is observed is limited by the parameter values at which the separating regimes emerge and terminate.
We developed a neuronal model which exhibits a rich variety of different types of multistability. We described a novel mechanism supporting the bistability of bursting and silence. This neuronal model provides a unique opportunity to study the dynamics of networks with neurons possessing different types of multistability.
Dynamic treatment regimes are time-varying treatments that individualize sequences of treatments to the patient. The construction of dynamic treatment regimes is challenging because a patient will be eligible for some treatment components only if he has not responded (or has responded) to other treatment components. In addition, there are usually a number of potentially useful treatment components and combinations thereof. In this article, we propose new methodology for identifying promising components and screening out negligible ones. First, we define causal factorial effects for treatment components that may be applied sequentially to a patient. Second, we propose experimental designs that can be used to study the treatment components; surprisingly, (fractional) factorial designs, more commonly found in the engineering statistics literature, can be modified for screening in this setting. Furthermore, we provide an analysis model that can be used to screen the factorial effects. We demonstrate the proposed methodology using examples motivated by the literature and also via a simulation study.
Multi-stage Decisions; Experimental Design; Causal Inference
Size-selective mortality caused by fishing can impose strong selection on harvested fish populations, causing evolution in important life-history traits. Understanding and predicting harvest-induced evolutionary change can help maintain sustainable fisheries. We investigate the evolutionary sustainability of alternative management regimes for lacustrine brook charr (Salvelinus fontinalis) fisheries in southern Canada and aim to optimize these regimes with respect to the competing objectives of maximizing mean annual yield and minimizing evolutionary change in maturation schedules. Using a stochastic simulation model of brook charr populations consuming a dynamic resource, we investigate how harvesting affects brook charr maturation schedules. We show that when approximately 5% to 15% of the brook charr biomass is harvested, yields are high, and harvest-induced evolutionary changes remain small. Intensive harvesting (removing more than approximately 15% of brook charr biomass) results in high average yields and little evolutionary change only when harvesting is restricted to brook charr larger than the size at 50% maturation probability at the age of 2 years. Otherwise, intensive harvesting lowers average yield and causes evolutionary change in the maturation schedule of brook charr. Our results indicate that intermediate harvesting efforts offer an acceptable compromise between avoiding harvest-induced evolutionary change and securing high average yields.
Fisheries-induced adaptive change; management regimes; models; probabilistic maturation reaction norm; Salvelinus fontinalis
Background: Most clinical guidelines recommend that AIDS-free, HIV-infected persons with CD4 cell counts below 0.350 × 10⁹ cells/L initiate combined antiretroviral therapy (cART), but the optimal CD4 cell count at which cART should be initiated remains a matter of debate.
Objective: To identify the optimal CD4 cell count at which cART should be initiated.
Design: Prospective observational data from the HIV-CAUSAL Collaboration and dynamic marginal structural models were used to compare cART initiation strategies for CD4 thresholds between 0.200 and 0.500 × 10⁹ cells/L.
Setting: HIV clinics in Europe and the Veterans Health Administration system in the United States.
Patients: 20 971 HIV-infected, therapy-naive persons with baseline CD4 cell counts at or above 0.500 × 10⁹ cells/L and no previous AIDS-defining illnesses, of whom 8392 had a CD4 cell count that decreased into the range of 0.200 to 0.499 × 10⁹ cells/L and were included in the analysis.
Measurements: Hazard ratios and survival proportions for all-cause mortality and a combined end point of AIDS-defining illness or death.
Results: Compared with initiating cART at the CD4 cell count threshold of 0.500 × 10⁹ cells/L, the mortality hazard ratio was 1.01 (95% CI, 0.84 to 1.22) for the 0.350 threshold and 1.20 (CI, 0.97 to 1.48) for the 0.200 threshold. The corresponding hazard ratios were 1.38 (CI, 1.23 to 1.56) and 1.90 (CI, 1.67 to 2.15), respectively, for the combined end point of AIDS-defining illness or death.
Limitations: CD4 cell count at cART initiation was not randomized. Residual confounding may exist.
Conclusion: Initiation of cART at a threshold CD4 count of 0.500 × 10⁹ cells/L increases AIDS-free survival. However, mortality did not vary substantially with the use of CD4 thresholds between 0.300 and 0.500 × 10⁹ cells/L.
Primary Funding Source: National Institutes of Health.
Milestoning is a procedure to compute the time evolution of complicated processes such as barrier crossing events or long diffusive transitions between predefined states. Milestoning reduces the dynamics to transition events between intermediates (the milestones) and computes the local kinetic information to describe these transitions via short molecular dynamics (MD) runs between the milestones. The procedure relies on the ability to reinitialize MD trajectories on the milestones to get the right kinetic information about the transitions. It also rests on the assumptions that the transition events between successive milestones and the time-lags between these transitions are statistically independent. In this paper, we analyze the validity of these assumptions. We show that sets of optimal milestones exist, i.e. sets such that successive transitions are indeed statistically independent. The proof of this claim relies on the results of transition path theory and uses the isocommittor surfaces of the reaction as milestones. For systems in the overdamped limit, we also obtain the probability distribution to reinitialize the MD trajectories on the milestones, and we discuss why this distribution is not available in closed form for systems with inertia. We explain why the time-lags between transitions are not statistically independent even for optimal milestones, but we show that working with such milestones allows one to compute mean first passage times between milestones exactly. Finally, we discuss some practical implications of our results and we compare milestoning with Markov state models in view of our findings.
transition path theory; committor function; Markov chain model; transition rate; reduced dynamics
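A practical payoff of the optimal-milestone result is that mean first passage times follow exactly from milestone-level statistics by solving a small linear system. A minimal sketch, assuming the milestone-to-milestone transition probabilities K and mean lag times t have already been harvested from the short MD runs (the names and discretization are ours):

```python
# Mean first passage times from milestone statistics. K[i, j] is the
# probability that a trajectory leaving milestone i next hits milestone j;
# t[i] is the mean lag time before that next hit. With the target made
# absorbing, tau_i = t_i + sum_j K[i, j] * tau_j, i.e. (I - K_sub) tau = t_sub.
import numpy as np

def mfpt_to_target(K, t, target):
    n = K.shape[0]
    idx = [i for i in range(n) if i != target]    # non-absorbing milestones
    K_sub = K[np.ix_(idx, idx)]                   # drop target row and column
    tau = np.linalg.solve(np.eye(n - 1) - K_sub, np.asarray(t)[idx])
    out = np.zeros(n)
    out[idx] = tau                                # MFPT from the target is 0
    return out
```

Per the result summarized above, these mean first passage times are exact when the milestones are isocommittor surfaces, even though the lag times between transitions are not statistically independent.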
The parametric g-formula can be used to contrast the distribution of potential outcomes under arbitrary treatment regimes. Like g-estimation of structural nested models and inverse probability weighting of marginal structural models, the parametric g-formula can appropriately adjust for measured time-varying confounders that are affected by prior treatment. However, there have been few implementations of the parametric g-formula to date. Here, we apply the parametric g-formula to assess the impact of highly active antiretroviral therapy (HAART) on time to AIDS or death in two US-based HIV cohorts including 1,498 participants. These participants contributed approximately 7,300 person-years of follow-up, of which 49% was exposed to HAART, and 382 events occurred; 259 participants were censored due to dropout. Using the parametric g-formula, we estimated that antiretroviral therapy substantially reduces the hazard of AIDS or death (hazard ratio [HR] = 0.55; 95% confidence limits [CL]: 0.42, 0.71). This estimate was similar to one previously reported using a marginal structural model (HR = 0.54; 95% CL: 0.38, 0.78). The 6.5-year difference in risk of AIDS or death was 13% (95% CL: 8%, 18%). Results were robust to assumptions about the temporal ordering, and the extent of history modeled, for time-varying covariates. The parametric g-formula is a viable alternative to inverse probability weighting of marginal structural models and g-estimation of structural nested models for the analysis of complex longitudinal data.
Cohort study; Confounding; g-formula; HIV/AIDS; Monte Carlo methods