Antibiotic treatment of childhood illnesses is common in India. In addition to contributing to antimicrobial resistance, antibiotics might result in increased susceptibility to diarrhea through interactions with the gastrointestinal microbiota. Breast milk, which enriches the microbiota early in life, may increase the resilience of the microbiota against perturbations by antibiotics.
In a prospective observational cohort study, we assessed whether antibiotic exposures from birth to 6 months affected rates of diarrhea up to age 3 years among 465 children from Vellore, India. Adjusting for treatment indicators, we modeled diarrheal rates among children exposed and unexposed to antibiotics using negative binomial regression. We further assessed whether the effect of antibiotics on diarrheal rates was modified by exclusive breastfeeding at 6 months.
More than half of the children (n = 267, 57.4%) were given at least one course of antibiotics in the first 6 months of life. The adjusted relative incidence rate of diarrhea was 33% higher among children who received antibiotics under 6 months of age compared with those who did not (incidence rate ratio: 1.33, 95% confidence interval: 1.12, 1.57). Children who were exclusively breastfed until 6 months of age did not have increased diarrheal rates following antibiotic use.
Antibiotic exposures early in life were associated with increased rates of diarrhea in early childhood. Exclusive breastfeeding might protect against this negative impact.
antimicrobials; diarrhea; microbiota; India
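As a worked check on the reported estimate, the following minimal sketch recovers the log-scale standard error implied by the incidence rate ratio's confidence interval and rebuilds a Wald-type interval from it. It assumes the interval is symmetric on the log scale with a multiplier of 1.96, which the abstract does not state; the function names are illustrative only.

```python
import math

def log_scale_se(lower, upper, z=1.96):
    """Standard error of log(ratio) implied by a Wald-type 95% interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def wald_ci(ratio, se, z=1.96):
    """Rebuild the interval as ratio * exp(-/+ z * se)."""
    return (ratio * math.exp(-z * se), ratio * math.exp(z * se))

# Values reported in the abstract: IRR 1.33 (95% CI: 1.12, 1.57).
irr, lower, upper = 1.33, 1.12, 1.57
se = log_scale_se(lower, upper)   # assumed symmetric on the log scale
ci = wald_ci(irr, se)
print(f"implied SE of log(IRR): {se:.3f}")
print(f"rebuilt interval: {ci[0]:.2f}, {ci[1]:.2f}")
```

The rebuilt limits agree with the reported ones to within rounding, which is what one would expect if the published interval came from a model on the log scale.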
Many epidemiologic studies identify contrasts between an “always-exposed” population and a “never-exposed” population. Such “exposure effects” are perhaps most valuable in discussing individual lifestyle changes, or in clinical care; they may be less valuable in estimating the potential effects of realistic public health interventions. Various methods, among them population attributable fractions and generalized impact fractions, attempt to obtain more policy-relevant estimates of “population intervention” effects, but such methods remain rare in the epidemiologic literature. Here, we describe the use of the parametric g-formula as a tool for the estimation of population intervention effects in longitudinal data. Our discussion is motivated by a previous study of the effect of incident pregnancy on time to virological failure among human immunodeficiency virus-positive women initiating antiretroviral therapy in South Africa between 2004 and 2011. We show that 1) interventional estimates of effect can be estimated in longitudinal data using the parametric g-formula and 2) exposure effects and population interventional effects can have dramatically different interpretations and magnitudes in real-world data. Epidemiologists should consider estimating interventional effects in addition to exposure effects; doing so would allow the results of epidemiologic studies to be more immediately relevant to policy-makers and to implementation science efforts.
causal inference; generalized impact fraction; implementation science; population attributable fraction; population intervention effects
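For orientation, the two simpler measures named above can be sketched for a single binary exposure. This is not the parametric g-formula approach the abstract describes; it uses the textbook point-exposure formulas, assumes an unconfounded risk ratio, and all numbers are hypothetical.

```python
def attributable_fraction(prevalence, rr):
    """Levin's population attributable fraction for a binary exposure."""
    excess = prevalence * (rr - 1)
    return excess / (1 + excess)

def impact_fraction(prevalence, new_prevalence, rr):
    """Generalized impact fraction: proportional change in risk when
    exposure prevalence shifts from `prevalence` to `new_prevalence`."""
    return (prevalence - new_prevalence) * (rr - 1) / (1 + prevalence * (rr - 1))

# Hypothetical inputs: 30% exposed, risk ratio of 2.0.
paf = attributable_fraction(0.30, 2.0)
gif_half = impact_fraction(0.30, 0.15, 2.0)   # intervention halves exposure
gif_all = impact_fraction(0.30, 0.0, 2.0)     # intervention removes exposure
print(f"PAF: {paf:.3f}, GIF (halved exposure): {gif_half:.3f}")
```

Setting the post-intervention prevalence to zero reproduces the attributable fraction, which shows why the attributable fraction corresponds to an often unrealistic intervention that eliminates exposure entirely.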
Few data exist regarding the effect of hormonal contraception (HC) on incidence and progression of cervical disease (e.g., cervical dysplasia, squamous intraepithelial lesions, cervical intraepithelial neoplasia) in HIV-infected African women.
We conducted an observational study of HIV-seropositive women in Johannesburg, South Africa. The effect of individual HC types on the incidence and progression of cervical disease was determined using Poisson regression to obtain adjusted incidence rate ratios (IRR).
We evaluated 594 HIV-infected women, with median follow-up time of 445 days; 75 of these women were receiving some form of hormonal contraception (largely DMPA, NET-EN, or COCs) at baseline. Risks of incidence and progression of cervical disease were similar comparing women not receiving HCs to women receiving DMPA, NET-EN, or COCs both individually by HC-type and considering all HC together.
There was no statistically significant effect of particular HC methods or of HC use in general on rates of incidence or progression of cervical disease in this study. These results should reassure us that use of HC is unlikely to substantially increase risks of cervical disease among HIV-positive women.
Determining whether hormonal contraception (HC), particularly the injectable contraceptive depot-medroxyprogesterone acetate (DMPA), increases a woman's risk of HIV acquisition is a priority question for public health. However, assessing the relationship between various HC methods and HIV acquisition with observational data involves substantial analytic design issues and challenges. Studies to date have used inconsistent approaches and generated a body of evidence that is complex and challenging to interpret.
In January 2013, USAID and FHI 360 supported a meeting of epidemiologists, statisticians, and content experts to develop recommendations for future observational analyses of HC and HIV acquisition.
Meeting participants generated recommendations regarding careful definition of exposure groups; handling potential confounders, mediators, and effect modifiers; estimating and addressing the magnitude of measurement error; using multiple methods to account for pregnancy; and exploring the potential for differential exposure to HIV-infected partners. Advantages and disadvantages of various statistical approaches to account for time-varying confounding and estimating total and direct effects were also discussed.
Implementing these recommendations in future observational HC-HIV acquisition analyses will enhance interpretation of existing studies and strengthen the overall evidence base for this complex and important area.
HIV acquisition; contraception; DMPA; injectable; observational epidemiology
Marginal structural models were developed as a semiparametric alternative to the G-computation formula to estimate causal effects of exposures. In practice, these models are often specified using parametric regression models. As such, the usual conventions regarding regression model specification apply. This paper outlines strategies for marginal structural model specification, and considerations for the functional form of the exposure metric in the final structural model. We propose a quasi-likelihood information criterion adapted from use in generalized estimating equations. We evaluate the properties of our proposed information criterion using a limited simulation study. We illustrate our approach using two empirical examples. In the first example, we use data from a randomized breastfeeding promotion trial to estimate the effect of breastfeeding duration on infant weight at one year. In the second example, we use data from two prospective cohort studies to estimate the effect of highly active antiretroviral therapy on CD4 count in an observational cohort of HIV-infected men and women. The marginal structural model specified should reflect the scientific question being addressed, but can also assist in exploration of other plausible and closely related questions. In marginal structural models, as in any regression setting, correct inference depends on correct model specification. Our proposed information criterion provides a formal method for comparing model fit for different specifications.
Bias; Causal inference; Marginal structural model; Regression analysis; Model specification
Missing outcome data due to loss to follow-up occurs frequently in clinical cohort studies of HIV-infected patients. Censoring patients when they become lost can produce inaccurate results if the risk of the outcome among the censored patients differs from the risk of the outcome among patients remaining under observation. We examine whether patients who are considered lost to follow-up are at increased risk of mortality compared to those who remain under observation. Patients from the US Centers for AIDS Research Network of Integrated Clinical Systems (CNICS) who newly initiated combination antiretroviral therapy between January 1, 1998 and December 31, 2009 and survived for at least one year were included in the study. Mortality information was available for all participants regardless of continued observation in the CNICS. We compare mortality between patients retained in the cohort and those lost-to-clinic, as commonly defined by a 12-month gap in care. Patients who were considered lost-to-clinic had modestly elevated mortality compared to patients who remained under observation after 5 years (risk ratio (RR): 1.2; 95% CI: 0.9, 1.5). Results were similar after redefining loss-to-clinic as 6 months (RR: 1.0; 95% CI: 0.8, 1.3) or 18 months (RR: 1.2; 95% CI: 0.8, 1.6) without a documented clinic visit. The small increase in mortality associated with becoming lost-to-clinic suggests that these patients were not lost to care; rather, they likely transitioned to care at a facility outside the study. The modestly higher mortality among patients who were lost-to-clinic implies that when we necessarily censor these patients in studies of time-varying exposures, we are likely to incur at most a modest selection bias.
The Themba Lethu Clinical Cohort was established in 2004 to allow large patient-level analyses from a single HIV treatment site to evaluate National Treatment Guidelines, answer questions of national and international policy relevance and to combine an economic and epidemiologic focus on HIV research. The current objectives of the Themba Lethu Clinical Cohort analyses are to: (i) provide cohort-level information on the outcomes of HIV treatment; (ii) evaluate aspects of HIV care and treatment that have policy relevance; (iii) evaluate the cost and cost-effectiveness of different approaches to HIV care and treatment; and (iv) provide a platform for studies on improving HIV care and treatment. Since 2004, Themba Lethu Clinic has enrolled approximately 30 000 HIV-positive patients into its HIV care and treatment programme, over 21 000 of whom have received anti-retroviral therapy since being enrolled. Patients on treatment are typically seen at least every 3 months with laboratory monitoring every 6 months to 1 year. The data collected include demographics, clinical visit data, laboratory data, medication history and clinical diagnoses. Requests for collaborations on analyses can be submitted to our data centre.
HIV-1 and CMV are important pathogens transmitted via breastfeeding. Furthermore, perinatal CMV transmission may impact growth and disease progression in HIV-exposed infants. Although maternal antiretroviral therapy reduces milk HIV-1 RNA load and postnatal transmission, its impact on milk CMV load is unclear. We examined the relationship between milk CMV and HIV-1 load (4–6 weeks postpartum) and the impact of antiretroviral treatment in 69 HIV-infected, lactating Malawian women and assessed the relationship between milk CMV load and postnatal growth in HIV-exposed, breastfed infants through six months of age. Despite an association between milk HIV-1 RNA and CMV DNA load (0.39 log10 rise CMV load per log10 rise HIV-1 RNA load, 95% CI 0.13–0.66), milk CMV load was similar in antiretroviral-treated and untreated women. Higher milk CMV load was associated with lower length-for-age (−0.53, 95% CI: −0.96, −0.10) and weight-for-age (−0.40, 95% CI: −0.67, −0.13) Z-score at six months in exposed, uninfected infants. As the impact of maternal antiretroviral therapy on the magnitude of postnatal CMV exposure may be limited, our findings of an inverse relationship between infant growth and milk CMV load highlight the importance of defining the role of perinatal CMV exposure on growth faltering of HIV-exposed infants.
It is common to present multiple adjusted effect estimates from a single model in a single table. For example, a table might show odds ratios for one or more exposures and also for several confounders from a single logistic regression. This can lead to mistaken interpretations of these estimates. We use causal diagrams to display the sources of the problems. Presentation of exposure and confounder effect estimates from a single model may lead to several interpretative difficulties, inviting confusion of direct-effect estimates with total-effect estimates for covariates in the model. These effect estimates may also be confounded even though the effect estimate for the main exposure is not confounded. Interpretation of these effect estimates is further complicated by heterogeneity (variation, modification) of the exposure effect measure across covariate levels. We offer suggestions to limit potential misunderstandings when multiple effect estimates are presented, including precise distinction between total and direct effect measures from a single model, and use of multiple models tailored to yield total-effect estimates for covariates.
causal diagrams; causal inference; confounding; direct effects; epidemiologic methods; mediation analysis; regression modeling
Effective behavioral HIV prevention is needed for stable HIV-discordant couples at risk for HIV, especially those without access to biomedical prevention. This analysis addressed whether HIV testing and counseling (HTC) with ongoing counseling and condom distribution leads to reduced unprotected sex in HIV-discordant couples.
Partners in Prevention HSV/HIV Transmission Study was a randomized trial conducted from 2004–2008 assessing whether acyclovir reduced HIV transmission from HSV-2/HIV-1 co-infected persons to HIV-uninfected sex partners. This analysis relied on self-reported behavioral data from 508 HIV-infected South African participants. The exposure was timing of first HTC: 0–7, 8–14, 15–30, or >30 days before baseline. In each exposure group, predicted probabilities of unprotected sex in the last month were calculated at baseline, month one, and month twelve using generalized estimating equations with a logit link and exchangeable correlation matrix.
At baseline, participants who knew their HIV status for less time experienced higher predicted probabilities of unprotected sex in the last month: 0–7 days, 0.71; 8–14 days, 0.52; 15–30 days, 0.49; >30 days, 0.26. At month one, once all participants had been aware of being in HIV-discordant relationships for ≥ 1 month, predicted probabilities declined: 0–7 days, 0.08; 8–14 days, 0.08; 15–30 days, 0.15; >30 days, 0.14. Lower predicted probabilities were sustained through month twelve: 0–7 days, 0.08; 8–14 days, 0.11; 15–30 days, 0.05; >30 days, 0.19.
Unprotected sex declined after HIV-positive diagnosis, and declined further after awareness of HIV-discordance. Identifying HIV-discordant couples for behavioral prevention is important for reducing HIV transmission risk.
HIV; condom; unprotected sex; HIV counseling and testing; discordant couple; South Africa
Treatment outcomes for antiretroviral therapy (ART) patients may vary by gender, but estimates from current evidence may be confounded by disease stage and adherence. We investigated the gender differences in treatment response among HIV-positive patients virally suppressed within 6 months of treatment initiation.
We analyzed data from 7,354 patients initiating ART between April 2004 and April 2010 at Themba Lethu Clinic, a large urban public sector treatment facility in South Africa. We estimated the relations among gender, mortality, and mean CD4 response in HIV-infected adults virally suppressed within 6 months of treatment initiation and used inverse probability of treatment weights to correct estimates for loss to follow-up.
Male patients had a 20% greater risk of death at both 24 months and 36 months of follow-up compared to females. Older patients and those with a low hemoglobin level or low body mass index (BMI) were at increased risk of mortality throughout follow-up. Men gained fewer CD4 cells after treatment initiation than did women. The mean differences in CD4 count gains made by women and men between baseline and 12, 24, and 36 months were 28.2 cells/mm3 (95% confidence interval [CI] 22.2–34.3), 60.8 cells/mm3 (95% CI 50.5–71.1), and 83.0 cells/mm3 (95% CI 68.8–97.1), respectively. Additionally, patients with a current detectable viral load (>400 copies/mL) and older patients had a lower mean CD4 increase at the same time points.
In this initially virally suppressed population, women showed consistently better immune response to treatment than did men. Promoting earlier uptake of HIV treatment among men may improve their immunologic outcomes.
Recent studies have raised concerns about a change in rates of pregnancy among HIV-negative women exposed to tenofovir. Here, our objective was to determine among HIV-positive women whether use of tenofovir at HAART initiation or thereafter is associated with subsequent changes in incidence of pregnancy.
Analysis of prospectively collected clinical data.
We used Cox proportional hazards models and logistic regression to estimate hazard ratios and odds ratios for the association between baseline tenofovir use and time to first incident pregnancy. We used marginal structural Cox models to estimate hazard ratios for the association between current tenofovir use and time to first incident pregnancy.
We studied 7,275 women, of whom 1,199 were initiated on tenofovir-based HAART regimens, and who experienced a total of 894 pregnancies in 17,200 person-years of follow-up. Analyses showed slight reductions in hazards of pregnancy among women who used tenofovir, but without sufficient precision to draw strong conclusions. Sensitivity analyses confirmed main results.
Tenofovir may be associated with a lower hazard or rate of pregnancy in women receiving HAART. However, conclusions are limited by low precision, the observational nature of the data, and possible uncontrolled confounding by temporal trends in contraception use and other factors.
Setting and Objective
We examined the effect of initiating ART on CD4 and viral response at different time periods during TB therapy (<14 days; 15–60 days; or >60 days) using prospectively collected clinical data from a large HIV clinic in South Africa.
Cohort data analysis for 1499 TB/HIV co-infected patients classified according to timing of ART after the initiation of TB therapy.
In adjusted modified Poisson regression models, CD4 and viral responses showed no significant differences according to timing of ART initiation (failure to increase CD4 by 6 months, <14 days vs. >60 days: RR 1.02 (95% CI 0.85–1.22), 15–60 days vs. >60 days: RR 1.00 (95% CI 0.86–1.15); failure to suppress virus by 6 months, <14 days vs. >60 days: RR 0.98 (95% CI 0.59–1.63), 15–60 days vs. >60 days: RR 0.96 (95% CI 0.66–1.41); and viral rebound at 12 months, <14 days vs. >60 days: RR 1.43 (95% CI 0.50–4.12), 15–60 days vs. >60 days: RR 1.14 (95% CI 0.39–3.34)). Similar estimates were found in analysis restricted to patients with severe immunosuppression.
Concerns over the overlapping impact of TB treatment with ART on ART response should not be a reason to delay ART in patients with HIV-associated TB.
Timing of ART; CD4 response; Viral response; TB–HIV co-infection
Motivated by a previously published study of HIV treatment, we simulated data subject to time-varying confounding affected by prior treatment to examine some finite-sample properties of marginal structural Cox proportional hazards models. We compared (a) unadjusted, (b) regression-adjusted, (c) unstabilized and (d) stabilized marginal structural (inverse probability-of-treatment [IPT] weighted) model estimators of effect in terms of bias, standard error, root mean squared error (MSE) and 95% confidence limit coverage over a range of research scenarios, including relatively small sample sizes and ten study assessments. In the base-case scenario resembling the motivating example, where the true hazard ratio was 0.5, both IPT-weighted analyses were unbiased while crude and adjusted analyses showed substantial bias towards and across the null. Stabilized IPT-weighted analyses remained unbiased across a range of scenarios, including relatively small sample size; however, the standard error was generally smaller in crude and adjusted models. In many cases, unstabilized weighted analysis showed a substantial increase in standard error compared to other approaches. Root MSE was smallest in the IPT-weighted analyses for the base-case scenario. In situations where time-varying confounding affected by prior treatment was absent, IPT-weighted analyses were less precise and therefore had greater root MSE compared with adjusted analyses. The 95% confidence limit coverage was close to nominal for all stabilized IPT-weighted analyses but poor in crude, adjusted, and unstabilized IPT-weighted analyses. Under realistic scenarios, marginal structural Cox proportional hazards models performed according to expectations based on large-sample theory and provided accurate estimates of the hazard ratio.
Bias; Causal inference; Marginal structural models; Monte Carlo study
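The contrast between unstabilized and stabilized IPT weights can be illustrated with a minimal sketch for a single time point. This is a simplification: the study above concerns time-varying treatment, where each person's weight is a product of such terms over visits. The data are hypothetical, and treatment probabilities are estimated by simple stratum proportions in place of a fitted regression model.

```python
from collections import Counter

def ipt_weights(records):
    """records: list of (confounder_level, treatment) pairs, treatment in {0, 1}.
    Returns (unstabilized, stabilized) weights in the same order."""
    n = len(records)
    n_by_l = Counter(l for l, _ in records)    # stratum sizes
    n_by_la = Counter(records)                 # stratum-by-treatment counts
    n_by_a = Counter(a for _, a in records)    # marginal treatment counts
    unstabilized, stabilized = [], []
    for l, a in records:
        p_a_given_l = n_by_la[(l, a)] / n_by_l[l]   # P(A = a | L = l)
        p_a = n_by_a[a] / n                         # P(A = a)
        unstabilized.append(1 / p_a_given_l)
        stabilized.append(p_a / p_a_given_l)
    return unstabilized, stabilized

# Hypothetical cohort: treatment is more common when L = 1.
records = [(0, 1)] * 10 + [(0, 0)] * 40 + [(1, 1)] * 30 + [(1, 0)] * 20
unstab, stab = ipt_weights(records)
print(f"unstabilized: sum {sum(unstab):.0f}, max {max(unstab):.2f}")
print(f"stabilized:   mean {sum(stab) / len(stab):.2f}, max {max(stab):.2f}")
```

In this example the unstabilized weights create a pseudo-population twice the size of the original and have a larger maximum, while the stabilized weights average one. That narrower range is one intuition for the smaller standard errors of stabilized analyses reported above.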
HIV-related outcomes may be affected by biological sex and by pregnancy. Including women in general and pregnant women in particular in HIV-related research is important for generalizability of findings.
To characterize representation of pregnant and non-pregnant women in HIV-related research conducted in general populations.
All HIV-related articles published in fifteen journals from January to March of 2011. We selected the top five journals by 2010 impact factor in each of internal medicine, infectious diseases, and HIV/AIDS.
Study Eligibility Criteria
HIV-related studies reporting original research on questions applicable to both men and women of reproductive age were considered; studies were excluded if they did not include individual-level patient data.
Study appraisal and synthesis methods.
Articles were doubly reviewed and abstracted; discrepancies were resolved through consensus. We recorded proportion of female study participants, whether pregnant women were included or excluded, and other key factors.
In total, 2014 articles were published during this period. After screening, 259 articles were included as original HIV-related research reporting individual-level data; of these, 226 were determined to be articles relevant to both men and women of reproductive age. In these articles, women were adequately represented within geographic region. The vast majority of published articles, 183/226 (81%), did not mention pregnancy (or related issues); still fewer included pregnant women (n=33), reported numbers of pregnant women (n=19), or analyzed using pregnancy status (n=9).
Data were missing for some key variables, including pregnancy. The time period over which published works were evaluated was relatively short.
Conclusions and implications of key findings.
The under-reporting and inattention to pregnancy in the HIV literature may reduce policy-makers’ ability to set evidence-based policy around HIV/AIDS care for pregnant women and women of child-bearing age.
Pregnancy is a common indication for initiation of highly active antiretroviral therapy (HAART) in sub-Saharan Africa. Our objective was to evaluate how pregnancy at treatment initiation predicts virologic response to HAART.
We evaluated an open cohort of 9,173 patients who initiated HAART between April 2004 and September 2009 in the Themba Lethu Clinic in Johannesburg, South Africa. Risk ratios were estimated using log-binomial regression; hazard ratios were estimated using Cox proportional hazards models; time ratios were estimated using accelerated failure time models. We controlled for calendar date, age, ethnicity, employment status, history of smoking, tuberculosis, WHO stage, weight, body mass index, hemoglobin, CD4 count and CD4 percent, and whether clinical care was free. Extensive sensitivity and secondary analyses were performed.
During follow-up, 822 non-pregnant women and 70 pregnant women experienced virologic failure. In adjusted analyses, pregnancy at baseline was associated with reduced risk of virologic failure by six months (risk ratio 0.66, 95% confidence limits [CL] 0.35, 1.22) and with reduced hazard of virologic failure over follow-up (hazard ratio 0.69, 95% CL 0.50, 0.95). The adjusted time ratio for failure was 1.44 (95% CL 1.13, 1.84), indicating 44% longer time to event among women pregnant at baseline. Sensitivity analyses generally confirmed main findings.
Pregnancy at HAART initiation is not associated with increased risk of virologic failure at six months or during longer follow-up.
Pregnancy; HIV; highly active antiretroviral therapy (HAART); South Africa
The parametric g-formula can be used to contrast the distribution of potential outcomes under arbitrary treatment regimes. Like g-estimation of structural nested models and inverse probability weighting of marginal structural models, the parametric g-formula can appropriately adjust for measured time-varying confounders that are affected by prior treatment. However, there have been few implementations of the parametric g-formula to date. Here, we apply the parametric g-formula to assess the impact of highly active antiretroviral therapy (HAART) on time to AIDS or death in two US-based HIV cohorts including 1,498 participants. These participants contributed approximately 7,300 person-years of follow-up, of which 49% was exposed to HAART, and 382 events occurred; 259 participants were censored due to drop-out. Using the parametric g-formula, we estimated that antiretroviral therapy substantially reduces the hazard of AIDS or death (HR = 0.55; 95% confidence limits [CL]: 0.42, 0.71). This estimate was similar to one previously reported using a marginal structural model (HR = 0.54; 95% CL: 0.38, 0.78). The 6.5-year difference in risk of AIDS or death was 13% (95% CL: 8%, 18%). Results were robust to assumptions about temporal ordering, and extent of history modeled, for time-varying covariates. The parametric g-formula is a viable alternative to inverse probability weighting of marginal structural models and g-estimation of structural nested models for the analysis of complex longitudinal data.
Cohort study; Confounding; g-formula; HIV/AIDS; Monte Carlo methods
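The core computation behind the g-formula can be sketched with two treatment occasions and one binary time-varying confounder that is affected by the first treatment. This is an illustration only: every probability below is hypothetical rather than taken from the cohorts, and the sum is evaluated exactly, whereas the parametric g-formula replaces these tables with fitted regression models and approximates the sum by Monte Carlo simulation.

```python
# Hypothetical P(L1 = 1 | A0 = a0): early treatment lowers the chance of
# a poor intermediate marker (L1 = 1).
P_L1_GIVEN_A0 = {0: 0.6, 1: 0.3}

def p_outcome(a0, l1, a1):
    """Hypothetical P(Y = 1 | A0 = a0, L1 = l1, A1 = a1)."""
    return 0.10 + 0.20 * l1 - 0.03 * a0 - 0.05 * a1

def g_formula_risk(a0, a1):
    """Risk under the static regime (a0, a1):
    sum over l1 of P(Y = 1 | a0, l1, a1) * P(L1 = l1 | a0)."""
    risk = 0.0
    for l1 in (0, 1):
        p_l1 = P_L1_GIVEN_A0[a0] if l1 == 1 else 1 - P_L1_GIVEN_A0[a0]
        risk += p_outcome(a0, l1, a1) * p_l1
    return risk

risk_always = g_formula_risk(1, 1)   # always treated
risk_never = g_formula_risk(0, 0)    # never treated
print(f"risk if always treated: {risk_always:.2f}")
print(f"risk if never treated:  {risk_never:.2f}")
print(f"risk difference:        {risk_always - risk_never:.2f}")
```

Because the distribution of L1 is taken under each regime rather than held fixed, the contrast includes the part of the treatment effect that operates through the confounder, which conditioning on L1 in a standard regression would block.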
The effect of tuberculosis on mortality in people initiating highly active antiretroviral treatment (HAART) remains unclear; here, we strengthened a previous cohort analysis. Multivariate Cox proportional hazards models were used to assess the association of baseline tuberculosis and time to all-cause mortality among HAART initiators. In reanalysis, treatment for tuberculosis at time of HAART initiation remained unassociated with increased risks of all-cause mortality, with adjusted hazard ratios ranging from 1.00 to 1.09.
Little is known about the impact of pregnancy on response to highly active antiretroviral therapy (HAART) in sub-Saharan Africa. We examined the effect of incident pregnancy after HAART initiation on clinical response to HAART.
We evaluated a prospective clinical cohort of adult women initiating HAART in Johannesburg, South Africa between 1 April 2004 and 31 March 2011, and followed up until an event, transfer, drop-out, or administrative end of follow-up on 30 September 2011. Women over age 45 and women who were pregnant at HAART initiation were excluded from the study. Main exposure was having experienced pregnancy after HAART initiation; main outcome was death and (separately) death or new AIDS event. We calculated adjusted hazard ratios (HRs) and 95% confidence limits (CL) using marginal structural Cox proportional hazards models.
The study included 7,534 women, and 20,813 person-years of follow-up; 918 women had at least one recognized pregnancy during follow-up. For death alone, the weighted (adjusted) HR was 0.84 (95% CL 0.44, 1.60). Sensitivity analyses confirmed main results, and results were similar for analysis of death or new AIDS event. Incident pregnancy was associated with a substantially reduced hazard of drop-out (HR = 0.62, 95% CL 0.51, 0.75).
Recognized incident pregnancy after HAART initiation was not associated with increases in hazard of clinical events, but was associated with a decreased hazard of drop-out. High rates of pregnancy after initiation of HAART may point to a need to better integrate family planning services into clinical care for HIV-infected women.
This study examines the timing of menarche in relation to infant feeding methods, specifically addressing the potential effects of soy isoflavone exposure through soy-based infant feeding. Subjects were participants in the Avon Longitudinal Study of Parents and Children (ALSPAC). Mothers were enrolled during pregnancy and their children have been followed prospectively. Early life feeding regimes, categorized as primarily breast, early formula, early soy, and late soy were defined using infant feeding questionnaires administered during infancy. For this analysis, age at menarche was assessed through questionnaires administered approximately annually between ages 8 and 14.5. Eligible subjects were limited to term, singleton, white females. We used Kaplan-Meier survival curves and Cox proportional hazards models to assess age at menarche and risk of menarche over the study period.
The present analysis included 2,920 girls. Approximately 2% of mothers reported that soy products were introduced into the infant diet at or before 4 months of age (early soy). The median age at menarche [interquartile range (IQR)] in the study sample was 153 months [144–163], approximately 12.8 years. The median age at menarche among early soy fed girls was 149 months (12.4 years) [IQR, 140–159]. Compared to girls fed non-soy based infant formula or milk (early formula), early soy fed girls were at 25% higher risk of menarche throughout the course of follow-up (Hazard Ratio 1.25 [95% confidence interval, 0.92, 1.71]). Our results also suggest that girls fed soy products in early infancy may have an increased risk of menarche specifically in early adolescence. These findings may be the observable manifestation of mild endocrine disrupting effects of soy isoflavone exposure. However, our study is limited by few soy-exposed subjects and is not designed to assess biological mechanisms. Because soy formula use is common in some populations, this subtle association with menarche warrants more in-depth evaluation in future studies.
While Berkson’s bias is widely recognized in the epidemiologic literature, it remains underappreciated as a model of both selection bias and bias due to missing data. Simple causal diagrams and 2×2 tables illustrate how Berkson’s bias connects to collider bias and selection bias more generally, and show the strong analogies between Berksonian selection bias and bias due to missing data. In some situations, considering whether data are missing at random or missing not at random is less important than understanding the causal structure of the missing-data process. While dealing with missing data always relies on strong assumptions about unobserved variables, the intuitions built with simple examples can provide a better understanding of approaches to missing data in real-world situations.
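A 2×2 table of the kind referred to above can be built in a few lines. In this minimal sketch, exposure and disease are independent in the source population, but both raise the probability of selection (for example, hospitalization). All probabilities are hypothetical, and the selection model, with independent contributions from exposure and disease plus a baseline, is an assumption made for illustration.

```python
P_EXPOSED, P_DISEASED = 0.20, 0.10   # independent in the source population

def p_selected(exposed, diseased):
    """Hypothetical P(selected | E, D): baseline 0.05, with exposure and
    disease acting as independent causes of selection."""
    not_selected = 0.95 * (0.7 if exposed else 1.0) * (0.5 if diseased else 1.0)
    return 1 - not_selected

def odds_ratio(cells):
    """cells[(e, d)] holds the probability mass for exposure e, disease d."""
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])

population, selected = {}, {}
for e in (0, 1):
    for d in (0, 1):
        p_e = P_EXPOSED if e else 1 - P_EXPOSED
        p_d = P_DISEASED if d else 1 - P_DISEASED
        population[(e, d)] = p_e * p_d
        selected[(e, d)] = population[(e, d)] * p_selected(e, d)

population_or = odds_ratio(population)
selected_or = odds_ratio(selected)
print(f"odds ratio in source population: {population_or:.2f}")
print(f"odds ratio among the selected:   {selected_or:.2f}")
```

The odds ratio is 1 in the source population but falls well below 1 among the selected, even though exposure has no effect on disease. Reading "selected" as "has complete data" gives the analogous missing-data problem for a complete-case analysis.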