1.  Comparative effectiveness of dynamic treatment regimes: an application of the parametric g-formula 
Statistics in biosciences  2011;3(1):119-143.
Ideally, randomized trials would be used to compare the long-term effectiveness of dynamic treatment regimes on clinically relevant outcomes. However, because randomized trials are not always feasible or timely, we often must rely on observational data to compare dynamic treatment regimes. An example of a dynamic treatment regime is “start combined antiretroviral therapy (cART) within 6 months of CD4 cell count first dropping below x cells/mm³ or diagnosis of an AIDS-defining illness, whichever happens first” where x can take values between 200 and 500. Recently, Cain et al (2011) used inverse probability (IP) weighting of dynamic marginal structural models to find the x that minimizes 5-year mortality risk under similar dynamic regimes using observational data. Unlike standard methods, IP weighting can appropriately adjust for measured time-varying confounders (e.g., CD4 cell count, viral load) that are affected by prior treatment. Here we describe an alternative method to IP weighting for comparing the effectiveness of dynamic cART regimes: the parametric g-formula. The parametric g-formula naturally handles dynamic regimes and, like IP weighting, can appropriately adjust for measured time-varying confounders. However, estimators based on the parametric g-formula are more efficient than IP weighted estimators. This is often at the expense of more parametric assumptions. Here we describe how to use the parametric g-formula to estimate risk by the end of a user-specified follow-up period under dynamic treatment regimes. We describe an application of this method to answer the “when to start” question using data from the HIV-CAUSAL Collaboration.
PMCID: PMC3769803  PMID: 24039638
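The following is a minimal sketch of the g-formula computation for a dynamic regime, not the authors' implementation. It assumes a toy setting with two visits and binary variables, and every coefficient in the three "fitted" models is invented for illustration. The function names (`pr_l0`, `pr_l1`, `pr_death`, `g_formula_risk`) and the regime "start when CD4 is low, then stay on treatment" are likewise hypothetical stand-ins for the regimes described in the abstract.

```python
import math
from itertools import product

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical "fitted" models. L_t = 1 means a low CD4 count at visit t,
# A_t = 1 means on cART at visit t, Y = 1 means death by end of follow-up.
# All coefficients below are invented for illustration only.
def pr_l0(l0):
    return 0.3 if l0 == 1 else 0.7

def pr_l1(l1, l0, a0):
    p = expit(-1.0 + 1.5 * l0 - 1.0 * a0)
    return p if l1 == 1 else 1.0 - p

def pr_death(l0, a0, l1, a1):
    return expit(-2.0 + 0.8 * l0 + 1.0 * l1 - 0.5 * a0 - 0.7 * a1)

def g_formula_risk(regime):
    """Risk of death under a regime, where regime(l_t, a_prev) returns A_t.

    Sums Pr[Y=1 | history] over all covariate histories, weighting each
    history by the probability of the covariates given past treatment,
    with treatment set by the regime rather than modelled.
    """
    risk = 0.0
    for l0, l1 in product((0, 1), repeat=2):
        a0 = regime(l0, 0)    # nobody is treated before the first visit
        a1 = regime(l1, a0)
        risk += pr_l0(l0) * pr_l1(l1, l0, a0) * pr_death(l0, a0, l1, a1)
    return risk

def always_treat(l_t, a_prev):
    return 1

def never_treat(l_t, a_prev):
    return 0

def start_when_low(l_t, a_prev):
    # Dynamic regime: start once CD4 is low, then stay on treatment.
    return 1 if (a_prev == 1 or l_t == 1) else 0

for name, regime in [("always treat", always_treat),
                     ("start when CD4 low", start_when_low),
                     ("never treat", never_treat)]:
    print(f"{name}: {g_formula_risk(regime):.3f}")
```

With only two binary covariates the sum can be enumerated exactly. With many visits and continuous covariates, the same quantity is usually approximated by Monte Carlo simulation from the fitted models.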
2.  Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease 
Epidemiology (Cambridge, Mass.)  2008;19(6):766-779.
The Women’s Health Initiative randomized trial found greater coronary heart disease (CHD) risk in women assigned to estrogen/progestin therapy than in those assigned to placebo. Observational studies had previously suggested reduced CHD risk in hormone users.
Using data from the observational Nurses’ Health Study, we emulated the design and intention-to-treat (ITT) analysis of the randomized trial. The observational study was conceptualized as a sequence of “trials” in which eligible women were classified as initiators or noninitiators of estrogen/progestin therapy.
The ITT hazard ratios (95% confidence intervals) of CHD for initiators versus noninitiators were 1.42 (0.92 – 2.20) for the first 2 years, and 0.96 (0.78 – 1.18) for the entire follow-up. The ITT hazard ratios were 0.84 (0.61 – 1.14) in women within 10 years of menopause, and 1.12 (0.84 – 1.48) in the others (P value for interaction = 0.08). These ITT estimates are similar to those from the Women’s Health Initiative. Because the ITT approach causes severe treatment misclassification, we also estimated adherence-adjusted effects by inverse probability weighting. The hazard ratios were 1.61 (0.97 – 2.66) for the first 2 years, and 0.98 (0.66 – 1.49) for the entire follow-up. The hazard ratios were 0.54 (0.19 – 1.51) in women within 10 years after menopause, and 1.20 (0.78 – 1.84) in others (P value for interaction = 0.01). Finally, we also present comparisons between these estimates and previously reported NHS estimates.
Our findings suggest that the discrepancies between the Women’s Health Initiative and Nurses’ Health Study ITT estimates could be largely explained by differences in the distribution of time since menopause and length of follow-up.
PMCID: PMC3731075  PMID: 18854702
3.  Improved double-robust estimation in missing data and causal inference models 
Biometrika  2012;99(2):439-456.
Recently proposed double-robust estimators for a population mean from incomplete data and for a finite number of counterfactual means can have much higher efficiency than the usual double-robust estimators under misspecification of the outcome model. In this paper, we derive a new class of double-robust estimators for the parameters of regression models with incomplete cross-sectional or longitudinal data, and of marginal structural mean models for cross-sectional data with similar efficiency properties. Unlike the recent proposals, our estimators solve outcome regression estimating equations. In a simulation study, the new estimator shows improvements in variance relative to the standard double-robust estimator that are in agreement with those suggested by asymptotic theory.
PMCID: PMC3635709  PMID: 23843666
Drop-out; Marginal structural model; Missing at random
4.  Relation between three classes of structural models for the effect of a time-varying exposure on survival 
Lifetime data analysis  2009;16(1):71-84.
Standard methods for estimating the effect of a time-varying exposure on survival may be biased in the presence of time-dependent confounders themselves affected by prior exposure. This problem can be overcome by inverse probability weighted estimation of Marginal Structural Cox Models (Cox MSM), g-estimation of Structural Nested Accelerated Failure Time Models (SNAFTM) and g-estimation of Structural Nested Cumulative Failure Time Models (SNCFTM). In this paper, we describe a data generation mechanism that approximately satisfies a Cox MSM, an SNAFTM and an SNCFTM. Besides providing a procedure for data simulation, our formal description of a data generation mechanism that satisfies all three models allows one to assess the relative advantages and disadvantages of each modeling approach. A simulation study is also presented to compare effect estimates across the three models.
PMCID: PMC3635680  PMID: 19894116
5.  Credible Mendelian Randomization Studies: Approaches for Evaluating the Instrumental Variable Assumptions 
American Journal of Epidemiology  2012;175(4):332-339.
As with other instrumental variable (IV) analyses, Mendelian randomization (MR) studies rest on strong assumptions. These assumptions are not routinely systematically evaluated in MR applications, although such evaluation could add to the credibility of MR analyses. In this article, the authors present several methods that are useful for evaluating the validity of an MR study. They apply these methods to a recent MR study that used fat mass and obesity-associated (FTO) genotype as an IV to estimate the effect of obesity on mental disorder. These approaches to evaluating assumptions for valid IV analyses are not fail-safe, in that there are situations where the approaches might either fail to identify a biased IV or inappropriately suggest that a valid IV is biased. Therefore, the authors describe the assumptions upon which the IV assessments rely. The methods they describe are relevant to any IV analysis, regardless of whether it is based on a genetic IV or other possible sources of exogenous variation. Methods that assess the IV assumptions are generally not conclusive, but routinely applying such methods is nonetheless likely to improve the scientific contributions of MR studies.
PMCID: PMC3366596  PMID: 22247045
causality; confounding factors; epidemiologic methods; instrumental variables; Mendelian randomization analysis
6.  Pandemic Influenza: Risk of Multiple Introductions and the Need to Prepare for Them 
PLoS Medicine  2006;3(6):e135.
Containing an emerging influenza H5N1 pandemic in its earliest stages may be feasible, but containing multiple introductions of a pandemic-capable strain would be more difficult. Mills and colleagues argue that multiple introductions are likely, especially if risk of a pandemic is high.
PMCID: PMC1370924  PMID: 17214503
7.  On doubly robust estimation in a semiparametric odds ratio model 
Biometrika  2009;97(1):171-180.
We consider the doubly robust estimation of the parameters in a semiparametric conditional odds ratio model. Our estimators are consistent and asymptotically normal in a union model that assumes either of two variation independent baseline functions is correctly modelled but not necessarily both. Furthermore, when either outcome has finite support, our estimators are semiparametric efficient in the union model at the intersection submodel where both nuisance functions models are correct. For general outcomes, we obtain doubly robust estimators that are nearly efficient at the intersection submodel. Our methods are easy to implement as they do not require the use of the alternating conditional expectations algorithm of Chen (2007).
PMCID: PMC3412601  PMID: 23049119
Doubly robust; Generalized odds ratio; Locally efficient; Semiparametric logistic regression
8.  Higher Order Inference On A Treatment Effect Under Low Regularity Conditions 
Statistics & probability letters  2011;81(7):821-828.
We describe a novel approach to nonparametric point and interval estimation of a treatment effect in the presence of many continuous confounders. We show the problem can be reduced to that of point and interval estimation of the expected conditional covariance between treatment and response given the confounders. Our estimators are higher order U-statistics. The approach applies equally to the regular case where the expected conditional covariance is root-n estimable and to the irregular case where slower non-parametric rates prevail.
PMCID: PMC3088168  PMID: 21552339
Minimax; U-statistics; Influence functions; Nonparametric; Semi-parametric; Robust Inference
9.  Time-dependent cross ratio estimation for bivariate failure times 
Biometrika  2011;98(2):341-354.
In the analysis of bivariate correlated failure time data, it is important to measure the strength of association among the correlated failure times. One commonly used measure is the cross ratio. Motivated by Cox’s partial likelihood idea, we propose a novel parametric cross ratio estimator that is a flexible continuous function of both components of the bivariate survival times. We show that the proposed estimator is consistent and asymptotically normal. Its finite sample performance is examined using simulation studies, and it is applied to the Australian twin data.
PMCID: PMC3376771  PMID: 22822258
Correlated survival times; Empirical process theory; Local dependency measure; Pseudo-partial likelihood
10.  Estimating absolute risks in the presence of nonadherence: An application to a follow-up study with baseline randomization 
Epidemiology (Cambridge, Mass.)  2010;21(4):528-539.
The intention-to-treat (ITT) analysis provides a valid test of the null hypothesis and naturally results in both absolute and relative measures of risk. However, this analytic approach may miss the occurrence of serious adverse effects that would have been detected under full adherence to the assigned treatment. Inverse probability weighting of marginal structural models has been used to adjust for nonadherence, but most studies have provided only relative measures of risk. In this study, we used inverse probability weighting to estimate both absolute and relative measures of risk of invasive breast cancer under full adherence to the assigned treatment in the Women’s Health Initiative estrogen-plus-progestin trial. In contrast to an ITT hazard ratio (HR) of 1.25 (95% confidence interval [CI] = 1.01 to 1.54), the HR for 8-year continuous estrogen-plus-progestin use versus no use was 1.68 (1.24 to 2.28). The estimated risk difference (cases/100 women) at year 8 was 0.83 (−0.03 to 1.69) in the ITT analysis, compared with 1.44 (0.52 to 2.37) in the adherence-adjusted analysis. Results were robust across various dose-response models. We also compared the dynamic treatment regime “take hormone therapy until certain adverse events become apparent, then stop taking hormone therapy” with no use (HR= 1.64; 95% CI = 1.24 to 2.18). The methods described here are also applicable to observational studies with time-varying treatments.
PMCID: PMC3315056  PMID: 20526200
11.  Structural Nested Cumulative Failure Time Models to Estimate the Effects of Interventions 
Journal of the American Statistical Association  2012;107(499):10.1080/01621459.2012.682532.
In the presence of time-varying confounders affected by prior treatment, standard statistical methods for failure time analysis may be biased. Methods that correctly adjust for this type of covariate include the parametric g-formula, inverse probability weighted estimation of marginal structural Cox proportional hazards models, and g-estimation of structural nested accelerated failure time models. In this article, we propose a novel method to estimate the causal effect of a time-dependent treatment on failure in the presence of informative right-censoring and time-dependent confounders that may be affected by past treatment: g-estimation of structural nested cumulative failure time models (SNCFTMs). An SNCFTM considers the conditional effect of a final treatment at time m on the outcome at each later time k by modeling the ratio of two counterfactual cumulative risks at time k under treatment regimes that differ only at time m. Inverse probability weights are used to adjust for informative censoring. We also present a procedure that, under certain “no-interaction” conditions, uses the g-estimates of the model parameters to calculate unconditional cumulative risks under nondynamic (static) treatment regimes. The procedure is illustrated with an example using data from a longitudinal cohort study, in which the “treatments” are healthy behaviors and the outcome is coronary heart disease.
PMCID: PMC3860902  PMID: 24347749
Causal inference; Coronary heart disease; Epidemiology; G-estimation; Inverse probability weighting
12.  Multiply robust inference for statistical interactions 
A primary focus of an increasing number of scientific studies is to determine whether two exposures interact in the effect that they produce on an outcome of interest. Interaction is commonly assessed by fitting regression models in which the linear predictor includes the product between those exposures. When the main interest lies in the interaction, this approach is not entirely satisfactory because it is prone to (possibly severe) bias when the main exposure effects or the association between outcome and extraneous factors are misspecified. In this article, we therefore consider conditional mean models with identity or log link which postulate the statistical interaction in terms of a finite-dimensional parameter, but which are otherwise unspecified. We show that estimation of the interaction parameter is often not feasible in this model because it would require nonparametric estimation of auxiliary conditional expectations given high-dimensional variables. We thus consider ‘multiply robust estimation’ under a union model that assumes at least one of several working submodels holds. Our approach is novel in that it makes use of information on the joint distribution of the exposures conditional on the extraneous factors in making inferences about the interaction parameter of interest. In the special case of a randomized trial or a family-based genetic study in which the joint exposure distribution is known by design or by Mendelian inheritance, the resulting multiply robust procedure leads to asymptotically distribution-free tests of the null hypothesis of no interaction on an additive scale. We illustrate the methods via simulation and the analysis of a randomized follow-up study.
PMCID: PMC3097121  PMID: 21603124
Double robustness; Gene-environment interaction; Gene-gene interaction; Longitudinal data; Semiparametric inference
13.  Effectiveness of Early Antiretroviral Therapy Initiation to Improve Survival among HIV-Infected Adults with Tuberculosis: A Retrospective Cohort Study 
PLoS Medicine  2011;8(5):e1001029.
Molly Franke, Megan Murray, and colleagues report that early cART reduces mortality among HIV-infected adults with tuberculosis and improves retention in care, regardless of CD4 count.
Randomized clinical trials examining the optimal time to initiate combination antiretroviral therapy (cART) in HIV-infected adults with sputum smear-positive tuberculosis (TB) disease have demonstrated improved survival among those who initiate cART earlier during TB treatment. Since these trials incorporated rigorous diagnostic criteria, it is unclear whether these results are generalizable to the vast majority of HIV-infected patients with TB, for whom standard diagnostic tools are unavailable. We aimed to examine whether early cART initiation improved survival among HIV-infected adults who were diagnosed with TB in a clinical setting.
Methods and Findings
We retrospectively reviewed charts for 308 HIV-infected adults in Rwanda with a CD4 count≤350 cells/µl and a TB diagnosis. We estimated the effect of cART on survival using marginal structural models and simulated 2-y survival curves for the cohort under different cART strategies: start cART 15, 30, 60, or 180 d after TB treatment or never start cART. We conducted secondary analyses with composite endpoints of (1) death, default, or loss to follow-up and (2) death, hospitalization, or serious opportunistic infection. Early cART initiation led to a survival benefit that was most marked for individuals with low CD4 counts. For individuals with CD4 counts of 50 or 100 cells/µl, cART initiation at day 15 yielded 2-y survival probabilities of 0.82 (95% confidence interval: [0.76, 0.89]) and 0.86 (95% confidence interval: [0.80, 0.92]), respectively. These were significantly higher than the probabilities computed under later start times. Results were similar for the endpoint of death, hospitalization, or serious opportunistic infection. cART initiation at day 15 versus later times was protective against death, default, or loss to follow-up, regardless of CD4 count. As with any observational study, the validity of these findings assumes that biases from residual confounding by unmeasured factors and from model misspecification are small.
Early cART reduced mortality among individuals with low CD4 counts and improved retention in care, regardless of CD4 count.
Editors' Summary
HIV infection has exacerbated the global tuberculosis (TB) epidemic, especially in sub-Saharan Africa, where in some countries 70% of people with TB are also HIV positive, a condition commonly described as HIV/TB co-infection. The management of patients with HIV/TB co-infection is a major public health concern.
There is relatively little good evidence on the best time to initiate combination antiretroviral therapy (cART) in adults with HIV/TB co-infection. Clinicians sometimes defer cART in individuals initiating TB treatment because of concerns about complications (such as immune reconstitution inflammatory syndrome) and the risk of reduced adherence if patients have to remember to take two sets of pills. However, starting cART later in those patients who are infected with both HIV and TB can result in potentially avoidable deaths during therapy.
Why Was This Study Done?
Several randomized control trials (RCTs) have been carried out, and the results of three of these studies suggest that, among individuals with severe immune suppression, early initiation of cART (two to four weeks after the start of TB treatment) leads to better survival than later ART initiation (two to three months after the start of TB treatment). These results were reported in abstract form, but the full papers have not yet been published. One problem with RCTs is that they are carried out under controlled conditions that might not represent well the conditions in varied settings around the world. Therefore, observational studies that examine how effective a treatment is in routine clinical conditions can provide information that complements that obtained during clinical trials. In this study, the researchers aimed to confirm the results from RCTs among a cohort of adult patients with HIV/TB co-infection in Rwanda, diagnosed under routine program conditions and using routinely collected clinical data. The researchers also wanted to investigate whether early cART initiation reduced the risk of other adverse outcomes, including treatment default and loss to follow-up.
What Did the Researchers Do and Find?
The researchers retrospectively reviewed the charts and other program records of 308 patients with HIV, who had CD4 counts≤350 cells/µl, were aged 15 years or more, had never previously taken cART, and received their first TB treatment at one of five cART sites (two urban, three rural) in Rwanda between January 2004 and February 2007. Using this method, the researchers collected baseline demographic and clinical variables and relevant clinical follow-up data. They then used this data to estimate the effect of cART on survival by using sophisticated statistical models that calculated the effects of initiating cART at 15, 30, 60, or 180 d after the start of TB treatment or not at all.
The researchers then conducted a further analysis to assess combined outcomes of (1) death, default, or loss to follow-up and (2) death, hospitalization due to any cause, or occurrence of severe opportunistic infections, such as Kaposi's sarcoma. The researchers used the resulting multivariable model to estimate survival probabilities for each individual, based on his/her baseline characteristics.
The researchers found that when they set their model to first CD4 cell counts of 50 and 100 cells/µl, and starting cART at day 15, mean survival probabilities at two years were 0.82 and 0.86, respectively, statistically significantly higher than the survival probabilities calculated for each of the other treatment strategies, where cART was started later. They observed a similar pattern for the combined outcome of death, hospitalization, or serious opportunistic infection. In addition, two-year outcomes for death or loss to follow-up were also improved with early cART, regardless of CD4 count at treatment initiation.
What Do These Findings Mean?
These findings show that in a real world program setting, starting cART 15 d after the start of TB treatment is more beneficial (measured by differences in survival probabilities) among patients with HIV/TB co-infection who have CD4 cell counts≤100 cells/µl than starting later. Early cART initiation may also increase retention in care for all individuals with CD4 cell counts≤350 cells/µl.
As the outcomes of this modeling study are based on data from a retrospective observational study, the biases associated with use of these data must be carefully addressed. However, the results support the recommendation of cART initiation after 15 d of TB treatment for patients with CD4 cell counts≤100 cells/µl and can be used as an advocacy base for TB treatment to be used as an opportunity to refer and retain HIV-infected individuals in care, regardless of CD4 cell count.
Additional Information
Information is available on HIV/TB co-infection from the World Health Organization, the US Centers for Disease Control and Prevention, and the International AIDS Society
PMCID: PMC3086874  PMID: 21559327
14.  When to Start Treatment? A Systematic Approach to the Comparison of Dynamic Regimes Using Observational Data* 
Dynamic treatment regimes are the type of regime most commonly used in clinical practice. For example, physicians may initiate combined antiretroviral therapy the first time an individual’s recorded CD4 cell count drops below either 500 cells/mm³ or 350 cells/mm³. This paper describes an approach for using observational data to emulate randomized clinical trials that compare dynamic regimes of the form “initiate treatment within a certain time period of some time-varying covariate first crossing a particular threshold.” We applied this method to data from the French Hospital database on HIV (FHDH-ANRS CO4), an observational study of HIV-infected patients, in order to compare dynamic regimes of the form “initiate treatment within m months after the recorded CD4 cell count first drops below x cells/mm³” where x takes values from 200 to 500 in increments of 10 and m takes values 0 or 3. We describe the method in the context of this example and discuss some complications that arise in emulating a randomized experiment using observational data.
PMCID: PMC3406513  PMID: 21972433
dynamic treatment regimes; marginal structural models; HIV infection; antiretroviral therapy
15.  Dynamic Regime Marginal Structural Mean Models for Estimation of Optimal Dynamic Treatment Regimes, Part II: Proofs of Results* 
In this companion article to “Dynamic Regime Marginal Structural Mean Models for Estimation of Optimal Dynamic Treatment Regimes, Part I: Main Content” [Orellana, Rotnitzky and Robins (2010), IJB, Vol. 6, Iss. 2, Art. 7] we present (i) proofs of the claims in that paper, (ii) a proposal for the computation of a confidence set for the optimal index when this lies in a finite set, and (iii) an example to aid the interpretation of the positivity assumption.
PMCID: PMC2854089  PMID: 20405047
dynamic treatment regime; double-robust; inverse probability weighted; marginal structural model; optimal treatment regime; causality
16.  Marginal Structural Models for Sufficient Cause Interactions 
American Journal of Epidemiology  2010;171(4):506-514.
Sufficient cause interactions concern cases in which there is a particular causal mechanism for some outcome that requires the presence of 2 or more specific causes to operate. Empirical conditions have been derived to test for sufficient cause interactions. However, when regression outcome models are used to control for confounding variables in tests for sufficient cause interactions, the outcome models impose restrictions on the relation between the confounding variables and certain unidentified background causes within the sufficient cause framework; often, these assumptions are implausible. By using marginal structural models, rather than outcome regression models, to test for sufficient cause interactions, modeling assumptions are instead made on the relation between the causes of interest and the confounding variables; these assumptions will often be more plausible. The use of marginal structural models also allows for testing for sufficient cause interactions in the presence of time-dependent confounding. Such time-dependent confounding may arise in cases in which one factor of interest affects both the second factor of interest and the outcome. It is furthermore shown that marginal structural models can be used not only to test for sufficient cause interactions but also to give lower bounds on the prevalence of such sufficient cause interactions.
PMCID: PMC2877448  PMID: 20067916
causal inference; interaction; marginal structural models; sufficient causes; synergism; weighting
17.  Intervening on risk factors for coronary heart disease: an application of the parametric g-formula 
Estimating the population risk of disease under hypothetical interventions—such as the population risk of coronary heart disease (CHD) were everyone to quit smoking and start exercising or to start exercising if diagnosed with diabetes—may not be possible using standard analytic techniques. The parametric g-formula, which appropriately adjusts for time-varying confounders affected by prior exposures, is especially well suited to estimating effects when the intervention involves multiple factors (joint interventions) or when the intervention involves decisions that depend on the value of evolving time-dependent factors (dynamic interventions). We describe the parametric g-formula, and use it to estimate the effect of various hypothetical lifestyle interventions on the risk of CHD using data from the Nurses’ Health Study. Over the period 1982–2002, the 20-year risk of CHD in this cohort was 3.50%. Under a joint intervention of no smoking, increased exercise, improved diet, moderate alcohol consumption and reduced body mass index, the estimated risk was 1.89% (95% confidence interval: 1.46–2.41). We discuss whether the assumptions required for the validity of the parametric g-formula hold in the Nurses’ Health Study data. This work represents the first large-scale application of the parametric g-formula in an epidemiologic cohort study.
PMCID: PMC2786249  PMID: 19389875
g-formula; coronary heart disease; hypothetical interventions
18.  Transmission Dynamics and Control of Severe Acute Respiratory Syndrome 
Science (New York, N.Y.)  2003;300(5627):1966-1970.
Severe acute respiratory syndrome (SARS) is a recently described illness of humans that has spread widely over the past 6 months. With the use of detailed epidemiologic data from Singapore and epidemic curves from other settings, we estimated the reproductive number for SARS in the absence of interventions and in the presence of control efforts. We estimate that a single infectious case of SARS will infect about three secondary cases in a population that has not yet instituted control measures. Public-health efforts to reduce transmission are expected to have a substantial impact on reducing the size of the epidemic.
PMCID: PMC2760158  PMID: 12766207
19.  Incorporating prior beliefs about selection bias into the analysis of randomized trials with missing outcomes 
Biostatistics (Oxford, England)  2003;4(4):495-512.
In randomized studies with missing outcomes, non-identifiable assumptions are required to hold for valid data analysis. As a result, statisticians have been advocating the use of sensitivity analysis to evaluate the effect of varying assumptions on study conclusions. While this approach may be useful in assessing the sensitivity of treatment comparisons to missing data assumptions, it may be dissatisfying to some researchers/decision makers because a single summary is not provided. In this paper, we present a fully Bayesian methodology that allows the investigator to draw a ‘single’ conclusion by formally incorporating prior beliefs about non-identifiable, yet interpretable, selection bias parameters. Our Bayesian model provides robustness to prior specification of the distributional form of the continuous outcomes.
PMCID: PMC2748253  PMID: 14557107
Dirichlet process prior; Identifiability; MCMC; Non-parametric Bayes; Selection model; Sensitivity analysis
20.  Identifiability, exchangeability and confounding revisited 
In 1986 the International Journal of Epidemiology published "Identifiability, Exchangeability and Epidemiological Confounding". We review the article from the perspective of a quarter century after it was first drafted and relate it to subsequent developments on confounding, ignorability, and collapsibility.
PMCID: PMC2745408  PMID: 19732410
21.  Estimating causal effects from epidemiological data 
In ideal randomised experiments, association is causation: association measures can be interpreted as effect measures because randomisation ensures that the exposed and the unexposed are exchangeable. On the other hand, in observational studies, association is not generally causation: association measures cannot be interpreted as effect measures because the exposed and the unexposed are not generally exchangeable. However, observational research is often the only alternative for causal inference. This article reviews a condition that permits the estimation of causal effects from observational data, and two methods—standardisation and inverse probability weighting—to estimate population causal effects under that condition. For simplicity, the main description is restricted to dichotomous variables and assumes that no random error attributable to sampling variability exists. The appendix provides a generalisation of inverse probability weighting.
PMCID: PMC2652882  PMID: 16790829
causal inference; confounding; inverse probability weighting; randomisation; standardisation
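As a rough illustration of the two methods named in the abstract of entry 21, the sketch below computes a counterfactual risk by standardisation and by inverse probability weighting for dichotomous variables. It is not taken from the article: the counts, the single covariate L, and the function names are hypothetical, and it assumes that the exposed and unexposed are exchangeable within levels of L and that every level of L contains both exposed and unexposed individuals. As in the abstract, sampling variability is ignored.

```python
# Hypothetical counts of individuals by (L, A, Y); illustrative numbers only.
# L = measured covariate, A = exposure, Y = outcome, all dichotomous (0/1).
COUNTS = {
    (0, 0, 0): 3, (0, 0, 1): 1,
    (0, 1, 0): 2, (0, 1, 1): 2,
    (1, 0, 0): 1, (1, 0, 1): 2,
    (1, 1, 0): 3, (1, 1, 1): 6,
}


def _n(counts, **fixed):
    """Number of individuals whose l, a, y values match the fixed ones."""
    total = 0
    for (l, a, y), c in counts.items():
        row = {"l": l, "a": a, "y": y}
        if all(row[k] == v for k, v in fixed.items()):
            total += c
    return total


def standardised_risk(counts, a):
    """Sum over l of Pr[Y=1 | A=a, L=l] * Pr[L=l]."""
    n = _n(counts)
    return sum(
        _n(counts, l=l, a=a, y=1) / _n(counts, l=l, a=a) * _n(counts, l=l) / n
        for l in (0, 1)
    )


def ip_weighted_risk(counts, a):
    """Risk in the pseudo-population where each individual with A=a
    is weighted by 1 / Pr[A=a | L=l]."""
    n = _n(counts)
    weighted_cases = 0.0
    for l in (0, 1):
        pr_a_given_l = _n(counts, l=l, a=a) / _n(counts, l=l)
        weighted_cases += _n(counts, l=l, a=a, y=1) / pr_a_given_l
    return weighted_cases / n


if __name__ == "__main__":
    for a in (0, 1):
        print(a, standardised_risk(COUNTS, a), ip_weighted_risk(COUNTS, a))
```

With these made-up counts both functions give a risk of about 0.5 under no exposure and about 0.6 under exposure, which reflects the algebraic equivalence of the two estimators in this simple setting.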
22.  Generation interval contraction and epidemic data analysis 
Mathematical biosciences  2008;213(1):71-79.
The generation interval is the time between the infection time of an infected person and the infection time of his or her infector. Probability density functions for generation intervals have been an important input for epidemic models and epidemic data analysis. In this paper, we specify a general stochastic SIR epidemic model and prove that the mean generation interval decreases when susceptible persons are at risk of infectious contact from multiple sources. The intuition behind this is that when a susceptible person has multiple potential infectors, there is a “race” to infect him or her in which only the first infectious contact leads to infection. In an epidemic, the mean generation interval contracts as the prevalence of infection increases. We call this global competition among potential infectors. When there is rapid transmission within clusters of contacts, generation interval contraction can be caused by a high local prevalence of infection even when the global prevalence is low. We call this local competition among potential infectors. Using simulations, we illustrate both types of competition. Finally, we show that hazards of infectious contact can be used instead of generation intervals to estimate the time course of the effective reproductive number in an epidemic. This approach leads naturally to partial likelihoods for epidemic data that are very similar to those that arise in survival analysis, opening a promising avenue of methodological research in infectious disease epidemiology.
PMCID: PMC2365921  PMID: 18394654
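The "race" described in the abstract of entry 22 can be illustrated with a toy simulation. This is only a sketch and not the paper's general stochastic SIR model: it assumes that all potential infectors are infected at the same moment, that they never recover, and that each makes infectious contact with the susceptible after an independent exponential waiting time. The function name and parameters are hypothetical.

```python
import random


def mean_generation_interval(n_infectors, contact_rate, n_trials, seed=0):
    """Average time from the infectors' (common) infection time to the
    infection of one susceptible exposed to n_infectors at once."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # Each infector draws its own infectious contact time;
        # only the earliest contact causes the infection.
        total += min(
            rng.expovariate(contact_rate) for _ in range(n_infectors)
        )
    return total / n_trials


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, mean_generation_interval(k, contact_rate=1.0, n_trials=20000))
```

Under these assumptions the minimum of k exponential waiting times has mean 1/(k * contact_rate), so the simulated mean shrinks as the number of competing infectors grows, which is the contraction effect in its simplest form.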
23.  Network-based analysis of stochastic SIR epidemic models with random and proportionate mixing 
Journal of theoretical biology  2007;249(4):706-722.
In this paper, we outline the theory of epidemic percolation networks and their use in the analysis of stochastic SIR epidemic models on undirected contact networks. We then show how the same theory can be used to analyze stochastic SIR models with random and proportionate mixing. The epidemic percolation networks for these models are purely directed because undirected edges disappear in the limit of a large population. In a series of simulations, we show that epidemic percolation networks accurately predict the mean outbreak size and probability and final size of an epidemic for a variety of epidemic models in homogeneous and heterogeneous populations. Finally, we show that epidemic percolation networks can be used to re-derive classical results from several different areas of infectious disease epidemiology. In an appendix, we show that an epidemic percolation network can be defined for any time-homogeneous stochastic SIR model in a closed population and prove that the distribution of outbreak sizes given the infection of any given node in the SIR model is identical to the distribution of its out-component sizes in the corresponding probability space of epidemic percolation networks. We conclude that the theory of percolation on semi-directed networks provides a very general framework for the analysis of stochastic SIR models in closed populations.
PMCID: PMC2186204  PMID: 17950362
24.  Second look at the spread of epidemics on networks 
In an important paper, M.E.J. Newman claimed that a general network-based stochastic Susceptible-Infectious-Removed (SIR) epidemic model is isomorphic to a bond percolation model, where the bonds are the edges of the contact network and the bond occupation probability is equal to the marginal probability of transmission from an infected node to a susceptible neighbor. In this paper, we show that this isomorphism is incorrect and define a semi-directed random network we call the epidemic percolation network that is exactly isomorphic to the SIR epidemic model in any finite population. In the limit of a large population, (i) the distribution of (self-limited) outbreak sizes is identical to the size distribution of (small) out-components, (ii) the epidemic threshold corresponds to the phase transition where a giant strongly-connected component appears, (iii) the probability of a large epidemic is equal to the probability that an initial infection occurs in the giant in-component, and (iv) the relative final size of an epidemic is equal to the proportion of the network contained in the giant out-component. For the SIR model considered by Newman, we show that the epidemic percolation network predicts the same mean outbreak size below the epidemic threshold, the same epidemic threshold, and the same final size of an epidemic as the bond percolation model. However, the bond percolation model fails to predict the correct outbreak size distribution and probability of an epidemic when there is a nondegenerate infectious period distribution. We confirm our findings by comparing predictions from percolation networks and bond percolation models to the results of simulations. In an appendix, we show that an isomorphism to an epidemic percolation network can be defined for any time-homogeneous stochastic SIR model.
PMCID: PMC2215389  PMID: 17930312
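One way to picture the epidemic percolation network from the abstract of entry 24 is the sketch below. It is an illustration under stated assumptions rather than the authors' construction in full: infectious periods and contact intervals are taken to be exponential, the contact network is supplied as a plain adjacency dictionary, and the names `build_epn` and `out_component` are hypothetical. An edge i -> j is drawn when i would transmit to j if i were ever infected; a pair with edges in both directions plays the role of an undirected edge.

```python
import random
from collections import deque


def build_epn(neighbors, contact_rate, recovery_rate, seed=0):
    """Draw one realisation of a percolation network on a contact network.

    neighbors maps each node to the nodes it is in contact with.
    """
    rng = random.Random(seed)
    out_edges = {i: [] for i in neighbors}
    for i, nbrs in neighbors.items():
        # One infectious period per node, shared by all of its contacts.
        infectious_period = rng.expovariate(recovery_rate)
        for j in nbrs:
            # i transmits to j if contact happens before i recovers.
            if rng.expovariate(contact_rate) < infectious_period:
                out_edges[i].append(j)
    return out_edges


def out_component(out_edges, start):
    """Nodes reachable from start along directed edges (including start)."""
    seen = {start}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        for j in out_edges[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return seen


if __name__ == "__main__":
    n = 6
    complete = {i: [j for j in range(n) if j != i] for i in range(n)}
    epn = build_epn(complete, contact_rate=0.5, recovery_rate=1.0, seed=1)
    print(len(out_component(epn, 0)))  # outbreak size if node 0 is infected
```

In this picture the outbreak started by a given node corresponds to that node's out-component, which is the correspondence the abstract states for outbreak size distributions.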
