Longitudinal studies are often viewed as the “gold standard” of observational epidemiologic research. Establishing a temporal association is a necessary criterion to identify causal relations. However, when covariates in the causal system vary over time, a temporal association is not straightforward. Appropriate analytical methods may be necessary to avoid confounding and reverse causality. These issues come to light in 2 studies of breastfeeding described in the articles by Al-Sahab et al. (Am J Epidemiol. 2011;173(9):971–977) and Kramer et al. (Am J Epidemiol. 2011;173(9):978–983) in this issue of the Journal. Breastfeeding has multiple time points and is a behavior that is affected by multiple factors, many of which themselves vary over time. This creates a complex causal system that requires careful scrutiny. The methods presented here may be applicable to a wide range of studies that involve time-varying exposures and time-varying confounders.
breast feeding; causality; confounding
Lung cancer is the leading cause of cancer death among women in the United States and other Western nations. The predominant cause of lung cancer in women is active cigarette smoking. Secondhand exposure to tobacco smoke is another important cause. The hypothesis that women are more susceptible than men to smoking-induced lung cancer has not been supported by the preponderance of current data, as noted by De Matteis et al. (Am J Epidemiol. 2013;177(7):601–612) in the accompanying article. However, aspects of lung cancer in men and women continue to indicate potential male-female differences in the etiology of lung cancer, based on several observations: 1) among never smokers, women have higher lung cancer incidence rates than men; 2) there is evidence that estrogen may contribute to lung cancer risk and progression; and 3) there are different clinical characteristics of lung cancer in women compared with men, such as the higher percentage of adenocarcinomas in never smokers, the greater prevalence of epidermal growth factor receptor gene (EGFR) mutations in adenocarcinomas among never smokers, and better prognosis. Considered in total, observations such as these offer enticing clues that, even amid cigarette smoking and other commonalities in the etiology of lung cancer in men and women, distinct differences may remain to be delineated that could potentially be of scientific and clinical relevance.
cigarettes; estrogen; lung cancer; men; secondhand smoke exposure; sex; smoking; women
Case-control and cohort studies are almost always complicated by nonrandom exposure allocation, which must be minimized in the design and analysis phases. Tubal sterilization is a common gynecological procedure that may be associated with other reproductive organ surgeries, which in turn may be associated with breast cancer risk. In this issue of the Journal, Gaudet et al. (Am J Epidemiol. 2013;177(6):492–499) argue successfully that tubal sterilization is unassociated with breast cancer risk. Scrutiny of the heterogeneous studies included in their meta-analysis underscores the role of confounding and effect modification in observational epidemiologic studies. Specifically, tubal sterilization is unassociated with breast cancer risk, but either oophorectomy or hysterectomy, or both, and the timing of these procedures warrant careful consideration in the design, analysis, and interpretation of observational research on reproductive factors.
breast neoplasms; case-control studies; cohort studies; hysterectomy; meta-analysis; oophorectomy; tubal sterilization
Research on racial residential segregation and health typically uses multilevel, population-based, slice-in-time data. Although research using this approach, including that by Kershaw et al. (Am J Epidemiol. 2013;177(4):299–309), has been valuable, I argue that to advance our understanding of how residential segregation influences health and health disparities, it is critical to incorporate a life-course perspective and integrate social theory. Applying a life-course perspective would entail modeling transitions, cumulative risk, and developmental and dynamic processes and mechanisms, as well as recognizing the contingency of contextual effects on different social groups. I discuss the need for analytic methods appropriate for modeling health effects of distal causes experienced across the life course, such as segregation, that operate through multiple levels and sequences of mediators, potentially across decades. Sociological theories of neighborhood attainment (e.g., segmented assimilation, ethnic resurgence, and place stratification theories) can guide effect-modification tests to help illuminate health effects resulting from intersections of residential processes, race/ethnicity, immigration, and other social determinants of health. For example, nativity and immigration history may crucially shape residential processes and exposures, but these have received limited attention in prior segregation-health literature.
neighborhood effects; place; racial residential segregation; social epidemiology; social theory
Arsenic exposure affects millions of people worldwide, causing substantial mortality and morbidity from cancers and cardiovascular and respiratory diseases. An article in the current issue (Am J Epidemiol. 2013;177(3):202–212) reports that classic dermatological manifestations, typically associated with chronic arsenic exposure, are predictive of internal cancers among Taiwanese decades after the cessation of exposure. Specifically, the risk of lung and urothelial cancers was elevated, which was evident regardless of arsenic dose, smoking, and age. There was also an unexpected elevated risk of prostate cancer. Despite some methodological limitations, these findings underscore the need for assessing whether dermatological manifestations are also predictive of cardiovascular, respiratory, and other arsenic-related, long-term health consequences. Given the emerging evidence of arsenic exposure from dietary sources beyond contaminated drinking water and occupational and environmental settings, and also because the vast majority of diseases and deaths among exposed populations do not show classic dermatological manifestations, larger and more comprehensive investigations of the health effects of arsenic exposure, especially at lower doses, are needed. In parallel, because the risk of known arsenic-related health outcomes remains elevated decades after exposure cessation, research toward identification of early clinical and biological markers of long-term risk as well as avenues for prevention, in addition to policy actions for exposure reductions, is warranted.
arsenic; cancer; prevention; skin lesions
Approximately 2 million new cases of cancer are caused by infections each year. For many of these cancers, we have been successful at developing methods for prevention or effective treatment/control. Epstein-Barr virus (EBV), a ubiquitous infection that establishes lifelong latency, was the first infection to be linked to the development of cancers, including nasopharyngeal carcinoma, lymphomas, and gastric cancer. EBV infection is linked to the development of approximately 200,000 new cancers each year, yet there have been no successful efforts to implement EBV-based strategies for the reduction in the burden of EBV-associated cancers. In this issue of the Journal, Liu et al. (Am J Epidemiol. 2013;177(3):242–250) report results from the enrollment phase of a large effort to demonstrate the efficacy of an EBV-based screening strategy to detect nasopharyngeal carcinoma at early stages and hopefully reduce the mortality associated with this disease. In this invited commentary, the design and initial findings from this demonstration project are reviewed, possible ways to enrich the effort are discussed, and populations that might benefit from EBV-based screening in the future are identified.
cancer; Epstein-Barr virus; infections; nasopharyngeal carcinoma; prevention; screening
Chronic inflammation, an established risk factor for cardiovascular disease, is increasingly being recognized as an etiologic factor in several cancers. In this issue of the Journal, Touvier et al. (Am J Epidemiol. 2013;177(1):3–13) report on the association of 7 markers of inflammation, adiposity, and endothelial function with risk of overall cancer and breast and prostate cancers in a nested case-control study carried out within the SU.VI.MAX cohort (France, 1994–2007). Consistent with previous reports on this topic, Touvier et al. focused on a limited number of markers. Future studies of inflammation and cancer should be able to capitalize on emerging multiplexed methods for the simultaneous detection of larger numbers of inflammatory markers in low-volume specimens. This should allow a more comprehensive evaluation of the role of inflammation in cancer development. In this commentary, the authors review emerging methods for measurement of multiplexed inflammation markers, the design and analytic implications of the use of these methods in epidemiologic studies, and potential public health implications of such studies. Given that many large prospective cohort studies have already collected and banked serum/plasma samples, rapid gains in our understanding of chronic inflammation and its role in cancer etiology are possible.
cardiovascular disease; circulating markers; inflammation; multiplexed assays; neoplasms; reproducibility
Previous epidemiologic studies have shown an inverse association between a personal history of atopy/allergies, both overall and among asthma, eczema, and hay fever investigated separately, and childhood acute lymphoblastic leukemia (ALL) with some consistency; however, in most of these studies, exposure data were collected by maternal interview. Now, in a population-based and records-based study in this issue of the Journal (Am J Epidemiol. 2012;176(11):970–978), Chang et al. report an increased risk for allergic conditions across different etiologic time periods, calling the former paradigm into doubt. A review of the basic biology literature shows that proposed mechanisms support either a positive or an inverse association. In light of this ambiguity, it is epidemiology's turn to determine the direction of association.
child; hypersensitivity; leukemia
The conditions under which children are raised have a long-term impact on health throughout the life course. Because childhood conditions can have such a strong influence on adult risk factors for disease, failure to account for their influences could distort observed associations between adult risk factors and subsequent health outcomes. In other words, childhood conditions could confound the association between every X and Y when X is measured in adulthood. Comparisons of health outcomes between exposed and unexposed siblings have the potential to eliminate confounding effects due to vulnerability factors shared between siblings (i.e., 50% of their genes and aspects of the childhood environment that affect siblings equally). In a large, population-based study of siblings in Denmark, Søndergaard et al. (Am J Epidemiol. 2012;176(8):675–683) found that individuals with higher educational qualifications lived longer than did their siblings with lower educational qualifications. Their results provide evidence for the returns to health resulting from investment in expanded educational opportunities. However, even sibling designs are not conclusive regarding causality; they remain subject to the unmeasured confounding influences of factors that vary within families. Nonetheless, sibling-based approaches should be used more often in studies of adult risk factors to address the long-term influences of the childhood environment on health.
causal inference; education; health; mortality; siblings; social epidemiology
In this issue of the Journal, Pencina et al. (Am J Epidemiol. 2012;176(6):492–494) examine the operating characteristics of measures of incremental value. Their goal is to provide benchmarks for the measures that can help identify the most promising markers among multiple candidates. They consider a setting in which new predictors are conditionally independent of established predictors. In the present article, the authors consider more general settings. Their results indicate that some of the conclusions made by Pencina et al. are limited to the specific scenarios the authors considered. For example, Pencina et al. observed that continuous net reclassification improvement was invariant to the strength of the baseline model, but the authors of the present study show this invariance does not hold generally. Further, they disagree with the suggestion that such invariance would be desirable for a measure of incremental value. They also do not see evidence to support the claim that the measures provide complementary information. In addition, they show that correlation with baseline predictors can lead to much larger gains in performance than the conditional independence scenario studied by Pencina et al. Finally, the authors note that the motivation of providing benchmarks actually reinforces previous observations that the problem with these measures is that they do not have useful clinical interpretations. If they did, researchers could use the measures directly and benchmarks would not be needed.
area under curve; biomarkers; bivariate binomial distribution; receiver operating characteristic; risk assessment; risk factors
In a 1993 paper (Am J Epidemiol. 1993;137(1):1–8), Weinberg considered whether a variable that is associated with the outcome and is affected by exposure but is not an intermediate variable between exposure and outcome should be considered a confounder in etiologic studies. As an example, she examined the common practice of adjusting for history of spontaneous abortion when estimating the effect of an exposure on the risk of spontaneous abortion. She showed algebraically that such an adjustment could substantially bias the results even though history of spontaneous abortion would meet some definitions of a confounder. Directed acyclic graphs (DAGs) were introduced into epidemiology several years later as a tool with which to identify confounders. The authors now revisit Weinberg's paper using DAGs to represent scenarios that arise from her original assumptions. DAG theory is consistent with Weinberg's finding that adjusting for history of spontaneous abortion introduces bias in her original scenario. In the authors' examples, treating history of spontaneous abortion as a confounder introduces bias if it is a descendant of the exposure and is associated with the outcome conditional on exposure or is a child of a collider on a relevant undirected path. Thoughtful DAG analyses require clear research questions but are easily modified for examining different causal assumptions that may affect confounder assessment.
bias (epidemiology); causality; confounding factors (epidemiology); reproductive history
Risk reclassification methods have become popular in the medical literature as a means of comparing risk prediction models. In this issue of the Journal, Pencina et al. (Am J Epidemiol. 2012;176(6):492–494) present further results for continuous measures of model discrimination and describe their characteristics in nested models with normally distributed variables. Measures include the change in the area under the receiver operating characteristic curve, the integrated discrimination improvement, and the continuous net reclassification improvement. Although theoretically interesting, these continuous measures may not be the most appropriate to assess clinical utility. The continuous net reclassification improvement, in particular, is a measure of effect rather than model improvement and can sometimes exhibit erratic behavior, as illustrated in 2 examples. Caution is needed before using this as a measure of improvement. Further, the test of the continuous net reclassification improvement and that for the integrated discrimination improvement are similar to the likelihood ratio test in nested models and may be overinterpreted. Reclassification in risk strata, while requiring thresholds, may be more relevant clinically with its ability to examine potential changes in treatment decisions.
calibration; discrimination; model fit; risk prediction
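The two reclassification measures named in the preceding abstract can be made concrete with a short sketch. It uses the commonly cited definitions: the continuous net reclassification improvement counts only the direction of each change in predicted risk, whereas the integrated discrimination improvement averages the size of the change. The function names and the 6-subject data set are hypothetical and serve only as illustration; they are not taken from Pencina et al. or from the commentary.

```python
from statistics import mean


def continuous_nri(old_risk, new_risk, event):
    """Continuous (category-free) net reclassification improvement.

    Net proportion of events whose predicted risk moved up, minus the
    net proportion of nonevents whose predicted risk moved up.
    """
    def net_up(indices):
        up = sum(new_risk[i] > old_risk[i] for i in indices)
        down = sum(new_risk[i] < old_risk[i] for i in indices)
        return (up - down) / len(indices)

    events = [i for i, e in enumerate(event) if e == 1]
    nonevents = [i for i, e in enumerate(event) if e == 0]
    return net_up(events) - net_up(nonevents)


def idi(old_risk, new_risk, event):
    """Integrated discrimination improvement.

    Mean change in predicted risk among events minus the mean change
    among nonevents.
    """
    change_events = [n - o for o, n, e in zip(old_risk, new_risk, event) if e == 1]
    change_nonevents = [n - o for o, n, e in zip(old_risk, new_risk, event) if e == 0]
    return mean(change_events) - mean(change_nonevents)


# Hypothetical predicted risks from a baseline model and an expanded model.
old = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
new = [0.05, 0.25, 0.20, 0.45, 0.70, 0.55]
event = [0, 0, 0, 1, 1, 1]

nri = continuous_nri(old, new, event)
idi_value = idi(old, new, event)
print(f"continuous NRI = {nri:.3f}, IDI = {idi_value:.3f}")

# A model that nudges every risk by 0.001 in the correct direction attains
# the maximum continuous NRI of 2 while the IDI stays close to 0.
tiny_new = [r + 0.001 if e == 1 else r - 0.001 for r, e in zip(old, event)]
tiny_nri = continuous_nri(old, tiny_new, event)
tiny_idi = idi(old, tiny_new, event)
```

In this toy example the continuous net reclassification improvement is about 0.67 and the integrated discrimination improvement is about 0.10. The second calculation shows why the former can look large when the change in predicted risk is negligible, which is one way to read the caution expressed in the abstract.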
The interaction estimates from Bhavnani et al. (Am J Epidemiol. 2012;176(5):387–395) are used to evaluate evidence for mechanistic interaction between coinfecting pathogens for diarrheal disease. Mechanistic interaction is said to be present if there are individuals for whom the outcome would occur if both of 2 exposures are present but would not occur if 1 or the other of the exposures is absent. In the epidemiologic literature, mechanistic interaction is often conceived of as synergism within Rothman's sufficient-cause framework. Tests for additive interaction are sometimes used to assess such synergism or mechanistic interaction, but testing for positive additive interaction only allows for the conclusion of mechanistic interaction under fairly strong “monotonicity” assumptions. Alternative tests for mechanistic interaction, which do not require monotonicity assumptions, have been developed more recently but require more substantial additive interaction to draw the conclusion of the presence of mechanistic interaction. The additive interaction reported by Bhavnani et al. is of sufficient magnitude to provide strong evidence of mechanistic interaction between rotavirus and Giardia and between rotavirus and Escherichia coli/Shigellae, even without any assumptions about monotonicity.
coinfecting pathogens; diarrhea; interaction; mechanism; synergism
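The contrast drawn in the preceding abstract between tests that do and do not rely on monotonicity can be sketched numerically. The sketch assumes the conditions usually stated in the sufficient-cause interaction literature, in the absence of confounding: with risks p_ab for exposure levels a and b, p11 - p10 - p01 + p00 > 0 implies mechanistic interaction if both exposures have monotonic effects, whereas p11 - p10 - p01 > 0 is required without monotonicity. Dividing by p00 gives the thresholds RERI > 0 and RERI > 1 on the relative excess risk due to interaction. The function names and the risk ratios are hypothetical, are not the estimates of Bhavnani et al., and ignore sampling variability.

```python
def reri(rr10, rr01, rr11):
    """Relative excess risk due to interaction from 3 risk ratios.

    Each risk ratio compares one exposure pattern with the doubly
    unexposed group (rr00 = 1).
    """
    return rr11 - rr10 - rr01 + 1.0


def mechanistic_interaction_evidence(rr10, rr01, rr11):
    """Compare the point estimate of the RERI with the 2 thresholds."""
    value = reri(rr10, rr01, rr11)
    return {
        "reri": value,
        # Sufficient only if both exposures have monotonic effects.
        "under_monotonicity": value > 0.0,
        # Sufficient without any monotonicity assumption.
        "without_monotonicity": value > 1.0,
    }


# Hypothetical risk ratios for exposure A alone, B alone, and both.
strong = mechanistic_interaction_evidence(rr10=2.0, rr01=3.0, rr11=8.0)
weak = mechanistic_interaction_evidence(rr10=2.0, rr01=3.0, rr11=4.5)
print(strong)
print(weak)
```

The first hypothetical set of risk ratios clears both thresholds; the second clears only the threshold that depends on monotonicity. In practice the comparison would be made with a confidence limit rather than a point estimate.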
Because of the aging of the population, dementia has become a major public health problem. There has been growing evidence for a possible association between lipids and dementia. A large body of literature has demonstrated multiple hypothesized biologic links between lipids and neurodegenerative or other biologic pathways connected to dementing processes. However, the epidemiologic associations have been conflicting: dyslipidemia at middle age, but not in later life, seems to be associated with higher dementia risk in some but not all studies. Results from the Honolulu-Asia Aging Study reported by Saczynski et al. (Am J Epidemiol 2007;165:000–00) suggest that lipoprotein constituents, such as apolipoprotein A-I, a major component of high density lipoprotein, may be more informative in clarifying the association between lipids and dementia. In this commentary, the epidemiology and biology of apolipoprotein A-I in relation to dementia are reviewed.
Alzheimer disease; apolipoproteins; dementia; lipids
Pleiotropy across the 8q24 region is perhaps the most intriguing of the genome-wide association findings relating to cancer. This region of chromosome 8 is a gene desert, far from any recognized genes. Guarrera et al., whose work is reported in this issue (Am J Epidemiol. 2012;175(6):479–487), took an epidemiologic approach to learn more about the 8q24 region. They capitalized on their ascertainment of other endpoints in members of the cohort at the Turin site of the European Prospective Investigation Into Cancer and Nutrition to investigate multiple outcomes for additional pleiotropic effects in the 8q24 region. Alternative design options might involve genotyping of more variants, incorporation of more cases, or use of a single control group close to the size of the most common case group. Their analytic methods reflect the uncertainty of the underlying biology. The findings sharpen the scientific question about how variation in the 8q24 region affects pathogenesis. The genome-wide association effort is possible because of the economy of scale afforded by extremely dense genotyping. Strict adherence to the hypothesis-driven approach would ignore information that is obtainable at a trivial cost. The genome-wide association strategy tests whether agnostic data-mining methods can advance knowledge alongside or even in place of the standard hypothesis-driven approach, which is the conventional scientific method children learn in kindergarten and onward, even through graduate school and beyond.
neoplasms; chromosomes, human, pair 8; diabetes mellitus; DNA, intergenic; genetic pleiotropy; mortality
The typical dilemma with sex-ratio findings is that when they are real, they aren’t interesting, and when they are interesting, they aren’t real. In this issue of the Journal, Fernández et al. (Am J Epidemiol. 2011;174(12):1327–1331) describe a deviation of the sex ratio that is apparently both large and real. There was a temporary but distinct spike in the proportion of boys born in Cuba around the time of the collapse of the national economy during the 1990s. Although an excess of boys does not fit the prevailing biologic theory regarding maternal stress and the sex ratio, the data are consistent with results from the Dutch famine (where population-level deprivation was even more extreme). A new quandary arises in the modern era with interpretation of the sex ratio: If the decision to abort a pregnancy is influenced by the sex of the fetus, a change in the behavior of even a small proportion of women could influence the sex ratio at birth. The possible role of sex selection in the Cuban context is discussed.
abortion; sex ratio
In choosing covariates for adjustment or inclusion in propensity score analysis, researchers must weigh the benefit of reducing confounding bias carried by those covariates against the risk of amplifying residual bias carried by unmeasured confounders. The latter is characteristic of covariates that act like instrumental variables—that is, variables that are more strongly associated with the exposure than with the outcome. In this issue of the Journal (Am J Epidemiol. 2011;174(11):1213–1222), Myers et al. compare the bias amplification of a near-instrumental variable with its bias-reducing potential and suggest that, in practice, the latter outweighs the former. The author of this commentary sheds broader light on this comparison by considering the cumulative effects of conditioning on multiple covariates and showing that bias amplification may build up at a faster rate than bias reduction. The author further derives a partial order on sets of covariates which reveals preference for conditioning on outcome-related, rather than exposure-related, confounders.
bias (epidemiology); confounding factors (epidemiology); epidemiologic methods; instrumental variable; precision; simulation; variable selection
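The bias amplification discussed in the preceding abstract can be illustrated with covariance algebra in a simple linear model. The model is an assumption made here for illustration and is not necessarily the derivation used in the commentary: an instrument Z affects exposure X with coefficient c3, an unmeasured confounder U affects X with coefficient c1 and outcome Y with coefficient c2, X affects Y with coefficient c0, Z and U are independent, and X, Z, and U have unit variance. All symbol names and numerical values are hypothetical.

```python
def exposure_slopes(c0, c1, c2, c3):
    """Return the crude and Z-adjusted regression slopes of Y on X.

    Assumed structure: Z -> X (c3), U -> X (c1), U -> Y (c2), X -> Y (c0),
    with Z independent of U and Var(X) = Var(Z) = Var(U) = 1.
    """
    if c1 ** 2 + c3 ** 2 > 1.0:
        raise ValueError("c1 and c3 are incompatible with Var(X) = 1")
    cov_xy = c0 + c1 * c2   # causal path plus the path through U
    cov_xz = c3
    cov_zy = c0 * c3        # Z reaches Y only through X
    crude = cov_xy          # slope of Y on X alone, since Var(X) = 1
    # Partial regression coefficient of X when Z is also in the model.
    adjusted = (cov_xy - cov_xz * cov_zy) / (1.0 - cov_xz ** 2)
    return crude, adjusted


c0, c1, c2, c3 = 0.5, 0.4, 0.5, 0.6   # hypothetical coefficients
crude, adjusted = exposure_slopes(c0, c1, c2, c3)
crude_bias = crude - c0
adjusted_bias = adjusted - c0
print(f"bias without Z: {crude_bias:.4f}, bias with Z: {adjusted_bias:.4f}")
```

With these values the confounding bias grows from 0.20 to about 0.31 after conditioning on Z, a factor of 1/(1 - c3²). Under the stated assumptions, the stronger the instrument, the larger the amplification of whatever bias the unmeasured confounder carries.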
The Icelandic study of melanoma trends by Héry et al. in this issue of the Journal (Am J Epidemiol. 2010;172(7):762–767) is a fascinating analysis of an ecologic association. The authors noted a sharp increase in melanoma incidence that appeared to lag a few years behind the increased prevalence of sunbeds in Iceland. Caution, however, must be exercised in interpreting the data because of the lack of understanding of emissions of ultraviolet radiation from sunbeds and the ecologic nature of the data.
Iceland; melanoma; ultraviolet rays
Metrics such as relative hazards and relative risks do not account for the prevalence of a marker over time and its relation to whether and when an outcome occurs. Uncommon markers that have good predictive values and common markers that are poorly predictive may not be (clinically) useful in predicting disease and other health outcomes. Recent work by Little et al. (Am J Epidemiol. 2011;173(12):1380–1387) highlights the development of a new method that considers both factors in predicting outcomes. Measures that incorporate both marker prevalence and predictive values and therefore are measures of “effectiveness” may be broadly helpful in deciding which markers or exposures are useful in disease screening or should be targeted by health interventions.
biological markers; censored data; life change events; menopause; premenopause; survival analysis
The very insightful and clear paper by VanderWeele and Vansteelandt in this issue of the Journal (Am J Epidemiol. 2010;172(12):1339–1348) bridges the gap between biostatistics methodologists focusing on causal methods for mediation analyses and the practitioners of mediational analyses to the benefit of both groups. In an effort to continue the bridging of this gap, this invited commentary relates the important issue of “natural direct effects” to the well-known epidemiologic method of direct standardization. Additionally, attention is paid to the importance of temporal sequencing to help substantiate the mediation relations among the exposure, mediator, and outcome. A crucial mathematical distortion under the logistic model, called “absence of collapsibility,” is noted in motivating VanderWeele and Vansteelandt's use of the log-linear model for comparing the effect of exposure adjusted for the mediator with the effect of exposure unadjusted for the mediator. It is also noted that this issue applies to one approach to assessing confounding. Finally, some issues are raised for consideration when testing the interaction between the exposure and mediator before assessing mediation.
collapsibility; confounding; epidemiologic methods; logistic regression; log-linear models; standardization
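The “absence of collapsibility” mentioned in the preceding abstract can be shown with a small worked example. It assumes 2 equal-sized strata of a covariate that is independent of exposure, so there is no confounding. In scenario A the odds ratio is the same in both strata, yet the marginal odds ratio is closer to the null. In scenario B the risk ratio is the same in both strata and the marginal risk ratio equals it. All risks and names are hypothetical.

```python
def odds_ratio(p1, p0):
    """Odds ratio comparing risk p1 (exposed) with risk p0 (unexposed)."""
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


def risk_ratio(p1, p0):
    """Risk ratio comparing risk p1 (exposed) with risk p0 (unexposed)."""
    return p1 / p0


def marginal_risks(strata):
    """Average the stratum-specific risks over equal-sized strata.

    strata is a list of (risk_exposed, risk_unexposed) pairs. Simple
    averaging is valid because exposure is assumed independent of stratum.
    """
    p1 = sum(s[0] for s in strata) / len(strata)
    p0 = sum(s[1] for s in strata) / len(strata)
    return p1, p0


# Scenario A: common odds ratio of 4 in both strata.
strata_a = [(0.5, 0.2), (0.8, 0.5)]
conditional_or = [odds_ratio(p1, p0) for p1, p0 in strata_a]
marginal_or = odds_ratio(*marginal_risks(strata_a))

# Scenario B: common risk ratio of 2 in both strata.
strata_b = [(0.2, 0.1), (0.6, 0.3)]
conditional_rr = [risk_ratio(p1, p0) for p1, p0 in strata_b]
marginal_rr = risk_ratio(*marginal_risks(strata_b))

print(f"conditional ORs {conditional_or}, marginal OR {marginal_or:.3f}")
print(f"conditional RRs {conditional_rr}, marginal RR {marginal_rr:.3f}")
```

The marginal odds ratio in scenario A is about 3.45 although both stratum-specific odds ratios are 4, whereas the marginal risk ratio in scenario B remains 2. A difference between adjusted and unadjusted odds ratios therefore need not reflect mediation or confounding, which is the motivation the abstract gives for the log-linear model.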
In this issue of the Journal, VanderWeele and Vansteelandt (Am J Epidemiol. 2010;172(12):1339–1348) provide simple formulae for estimation of direct and indirect effects using standard logistic regression when the exposure and outcome are binary, the mediator is continuous, and the odds ratio is the chosen effect measure. They also provide concisely stated lists of assumptions necessary for estimation of these effects, including various conditional independencies and homogeneity of exposure and mediator effects over covariate strata. They further suggest that this will allow effect decomposition in case-control studies if the sampling fractions and population outcome prevalence are known with certainty. In this invited commentary, the author argues that, in a well-designed case-control study in which the sampling fraction is known, it should not be necessary to rely on the odds ratio. The odds ratio has well-known deficiencies as a causal parameter, and its use severely complicates evaluation of confounding and effect homogeneity. Although VanderWeele and Vansteelandt propose that a rare disease assumption is not necessary for estimation of controlled direct effects using their approach, collapsibility concerns suggest otherwise when the goal is causal inference rather than merely measuring association. Moreover, their clear statement of assumptions necessary for the estimation of natural/pure effects suggests that these quantities will rarely be viable estimands in observational epidemiology.
causal inference; conditional independence; confounding; decomposition; estimation; interaction; logistic regression; odds ratio
Weathering—the cumulative burden of adverse psychosocial and economic circumstances on the bodies of minority women—has been repeatedly described in epidemiologic studies. The most common application has been the documentation of rapidly increasing risks of adverse birth outcomes as African-American women age. Previous work has been based largely on cross-sectional data that aggregate women across a variety of socioeconomic circumstances. When more specific information about women's life-course socioeconomic status is taken into account, however, heterogeneity in the weathering experience of African-American women becomes more readily apparent. Adverse birth outcome risk trajectories with advancing age for African-American women who reside in wealthier neighborhoods look much more similar to those of white women. The accompanying article by Love et al. (Am J Epidemiol. 2010;172(2):127–134) provides a more nuanced investigation of the social conditions that contribute to the weathering of African-American women and points to the critical role played by social and economic conditions over the life course in producing adverse birth outcome disparities.
African Americans; infant, premature; infant, small for gestational age; maternal age; poverty; preterm birth; residence characteristics
Epidemiologists are well aware of the negative consequences of measurement error in exposure and outcome variables for their ability to detect putative causal associations. However, empirical proof that remedying the misclassification problem improves estimates of epidemiologic effect is seldom examined in detail. Of all areas in cancer epidemiology, perhaps the best example of the consequences of misclassification and of the steps taken to circumvent them was the pursuit, beginning in the mid-1980s, of the human papillomavirus (HPV) infection–cervical cancer association. The stakes were high: Had the wrong conclusions been reached, epidemiologists would have been led astray in the search for competing hypotheses for the sexually transmissible agent causing cervical cancer or in ascribing to HPV infection a mere ancillary role among many lifestyle, hormonal, and environmental factors. The article by Castle et al. in this issue of the Journal (Am J Epidemiol. 2010;171(2):155–163) provides a detailed account of the joint influences of improved HPV and cervical precancer measurements in gradually unveiling the strong magnitude of the underlying association between viral exposure and cervical lesion risk. In this commentary, the authors extend the findings of Castle et al. by providing additional empirical evidence in support of their arguments.
cytology; measurement error; misclassification; papillomavirus infections; uterine cervical neoplasms; vaginal smears
Genomic data will become an increasingly important component of epidemiologic studies in coming years. The authors of the accompanying Journal article, van Ballegooijen et al. (Am J Epidemiol. 2009;170(12):1455–1463), are to be commended for attempting to use the coalescent analysis of viral sequence data to evaluate a hepatitis B vaccination program. Coalescent theory attempts to link the phylogenetic history of populations with rates of population growth and decline. In particular, under certain assumptions, a reduction in genetic diversity can be interpreted as a reduction in disease incidence. However, the authors of this commentary contend that van Ballegooijen et al.’s interpretation of changes in viral genetic diversity as a measure of hepatitis B vaccine effectiveness has major limitations. Because of the potential use of these methods in future vaccination studies, the authors discuss the utility of these methods and the data requirements needed for them to be convincing. First, data sets should be large enough to provide sufficient epidemiologic-scale resolution. Second, data need to reflect sufficiently fine-grained temporal sampling. Third, other processes that can potentially influence genetic diversity and confuse demographic inferences should be considered.
communicable diseases; disease notification; disease transmission, infectious; genetic variation; hepatitis B virus; molecular sequence data; vaccination
Making decisions about medical treatments based upon valid evidence is critical to improve health-care quality, outcomes, and value. Although such research commonly connotes the use of randomized controlled trials, experimental methods are not always feasible, and research using observational, quasi-experimental, and other nonexperimental methods may also be important. At the same time, nonexperimental methods are inherently susceptible to various types of bias and thus present special challenges in the search for valid and generalizable evidence. The study by Gardarsdottir et al. (Am J Epidemiol. 2009;170(3):280–285), on which this commentary is based, addresses a key potential source of bias—mismeasurement of patients’ duration of treatment—in previous research on pharmacotherapy for depression. However, the authors’ study is unlikely to address other potential sources of bias, which may make interpretation of their findings more difficult.
bias (epidemiology); depression; observation; research design; treatment outcome