Cigarette smoking is the major environmental risk factor for chronic obstructive pulmonary disease (COPD). Genome-wide association studies have provided compelling associations for three loci with COPD. In this study, we aimed to estimate direct, i.e., independent from smoking, and indirect effects of those loci on COPD development using mediation analysis. We included a total of 3,424 COPD cases and 1,872 unaffected controls with data on two smoking-related phenotypes: lifetime average smoking intensity and cumulative exposure to tobacco smoke (pack years). Our analysis revealed that effects of two linked variants (rs1051730 and rs8034191) in the AGPHD1/CHRNA3 cluster on COPD development are significantly, yet not entirely, mediated by the smoking-related phenotypes. Approximately 30% of the total effect of variants in the AGPHD1/CHRNA3 cluster on COPD development was mediated by pack years. Simultaneous analysis of modestly (r² = 0.21) linked markers in CHRNA3 and IREB2 revealed that an even larger (~42%) proportion of the total effect of the CHRNA3 locus on COPD was mediated by pack years after adjustment for an IREB2 single nucleotide polymorphism. This study confirms the existence of direct effects of the AGPHD1/CHRNA3, IREB2, FAM13A and HHIP loci on COPD development. While the association of the AGPHD1/CHRNA3 locus with COPD is significantly mediated by smoking-related phenotypes, IREB2 appears to affect COPD independently of smoking.
Assisted reproductive techniques (ART) are associated with a higher risk of tetralogy of Fallot (TOF) and multiple pregnancies may be associated with a higher risk of congenital anomalies. We assessed the extent to which the association between ART and risk of TOF may be mediated by the higher risk of multiple pregnancies associated with ART.
We conducted a case–control study using population-based data from the Paris Registry of Congenital Malformations for the period 1987–2009 and a cohort study of congenital heart defects (EPICARD). The study population included 395 cases of TOF and 4104 malformed controls with no known associations with ART. The analysis was based on a path-analysis model using a counterfactual approach, which allows decomposition of the total effect of ART into an indirect effect (that mediated by the association between ART and multiple pregnancies) and a direct effect.
ART (all methods combined) were associated with a 2.6-fold higher odds of TOF after adjustment for maternal and paternal characteristics and year of birth (adjusted OR 2.6, 95% CI, 1.5-4.5). Most (79%) of the effect associated with ART was a direct effect (i.e., not mediated by multiple pregnancies), whereas 21% of the effect of ART was due to its association with multiple pregnancies (i.e., the indirect effect). In vitro fertilization with intracytoplasmic sperm injection was associated with a 3.5-fold higher odds of TOF (adjusted OR 3.5, 95% CI, 1.1-11.2); 11% of this effect was mediated through the association of ICSI with multiple pregnancies.
By far, most of the higher risk of TOF associated with ART is a direct effect and only a small proportion of the effect may be mediated by multiple pregnancies.
Multiple pregnancies; Reproductive techniques; Assisted; Heart defects; Congenital; Tetralogy of Fallot; Epidemiology
The goal of mediation analysis is to assess direct and indirect effects of a treatment or exposure on an outcome. More generally, we may be interested in the context of a causal model as characterized by a directed acyclic graph (DAG), where mediation via a specific path from exposure to outcome may involve an arbitrary number of links (or ‘stages’). Methods for estimating mediation (or pathway) effects are available for a continuous outcome and a continuous mediator related via a linear model, while for a categorical outcome or categorical mediator, methods are usually limited to two-stage mediation. We present a method applicable to multiple stages of mediation and mixed variable types using generalized linear models. We define pathway effects using a potential outcomes framework and present a general formula that provides the effect of exposure through any specified pathway. Some pathway effects are nonidentifiable and their estimation requires an assumption regarding the correlation between counterfactuals. We provide a sensitivity analysis to assess the impact of this assumption. Confidence intervals for pathway effect estimates are obtained via a bootstrap method. The method is applied to a cohort study of dental caries in very low birth weight adolescents. A simulation study demonstrates low bias of pathway effect estimators and close-to-nominal coverage rates of confidence intervals. We also find low sensitivity to the counterfactual correlation in most scenarios.
Copula; Generalized linear model; G-computation algorithm; Path analysis; Potential outcome; Sensitivity analysis
Applied researchers often include mediation effects in applications of advanced methods such as latent variable models and linear growth curve models. Guidance on how to estimate statistical power to detect mediation for these models has not yet been provided in the literature. We describe a general framework for power analyses for complex mediational models. The approach is based on the well-known technique of generating a large number of samples in a Monte Carlo study, and estimating power as the percentage of cases in which an estimate of interest is significantly different from zero. Examples of power calculation for commonly used mediational models are provided. Power analyses for the single mediator, multiple mediators, three-path mediation, mediation with latent variables, moderated mediation, and mediation in longitudinal designs are described. Annotated sample syntax for Mplus is appended and tabulated values of required sample sizes are shown for some models.
Mediation; Statistical Power; Monte Carlo; Mplus
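The Monte Carlo idea described above can be sketched for the simplest case, a single-mediator model X -> M -> Y with standard normal errors. This is an illustrative sketch only, not the appended Mplus syntax: the function names, the use of the Sobel z test as the significance criterion, and the 1.96 cutoff are all assumptions made for the example.

```python
import math
import random


def sobel_z(x, m, y):
    """Sobel z statistic for the indirect effect a*b in X -> M -> Y (assumed test)."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    smm = sum((v - mean_m) ** 2 for v in m)
    syy = sum((v - mean_y) ** 2 for v in y)
    sxm = sum((u - mean_x) * (v - mean_m) for u, v in zip(x, m))
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    smy = sum((u - mean_m) * (v - mean_y) for u, v in zip(m, y))
    # Path a: slope of M regressed on X, with its standard error.
    a = sxm / sxx
    se_a = math.sqrt((smm - a * sxm) / (n - 2) / sxx)
    # Path b: coefficient of M in the regression of Y on X and M.
    det = sxx * smm - sxm ** 2
    b = (sxx * smy - sxm * sxy) / det
    c_prime = (smm * sxy - sxm * smy) / det
    resid_var = (syy - c_prime * sxy - b * smy) / (n - 3)
    se_b = math.sqrt(resid_var * sxx / det)
    return a * b / math.sqrt(a * a * se_b ** 2 + b * b * se_a ** 2)


def mediation_power(a, b, c_prime, n, reps=500, seed=1):
    """Share of simulated samples in which the indirect effect is significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        m = [a * xi + rng.gauss(0, 1) for xi in x]
        y = [c_prime * xi + b * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
        if abs(sobel_z(x, m, y)) > 1.96:  # two-sided 5% level (assumed)
            hits += 1
    return hits / reps
```

Calling `mediation_power(a=0.3, b=0.3, c_prime=0.1, n=100)` returns the estimated power for those hypothetical path values; rerunning over a grid of `n` gives a rough required sample size. A different test (for example a bootstrap interval) could be swapped in for `sobel_z` without changing the outer loop.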
Recent genome-wide association (GWA) studies of lung cancer have shown that the CHRNA5-A3 region on chromosome 15q24-25.1 is strongly associated with an increased risk of lung cancer and nicotine dependence, and thought to be associated with chronic obstructive airways disease as well. However, it has not been established whether the association between genetic variants and lung cancer risk is a direct one or one mediated by nicotine dependence.
In this paper we applied a rigorous statistical approach, mediation analysis, to examine the mediating effect of smoking behavior and self-reported physician-diagnosed emphysema (chronic obstructive pulmonary disease [COPD]) on the relationship between the CHRNA5-A3 region genetic variant rs1051730 and the risk of lung cancer.
Our results showed that rs1051730 is directly associated with lung cancer risk, but that it is also associated with lung cancer risk through its effect on both smoking behavior and COPD. Furthermore, we showed that COPD is a mediating phenotype that explains part of the effect of smoking behavior on lung cancer. Our results also suggested that smoking behavior is a mediator of the relationship between rs1051730 and COPD risk.
Smoking behavior and COPD are mediators of the association between the SNP rs1051730 and the risk of lung cancer. Also, COPD is a mediator of the association between smoking behavior and lung cancer. Finally, smoking behavior also has mediating effects on the association between the SNP and COPD.
Lung Cancer; COPD; Mediation analysis; smoking behavior; genetic variants
Direct and indirect effects of the new psychotropic paliperidone extended-release (paliperidone ER) tablets on negative symptom improvement in schizophrenia were investigated using path analysis. A post hoc analysis of pooled data from three 6-week, double-blind, placebo-controlled studies of paliperidone ER in patients experiencing acute exacerbation was conducted. Regression analysis explored relationships between baseline/study characteristics and negative symptoms. Change in Positive and Negative Syndrome Scale (PANSS) negative factor score at endpoint was the dependent variable; explanatory variables included demographic and clinical characteristics. Path analysis determined direct and indirect effects of treatment on negative symptom change. Indirect mediators of negative symptom change in the model included changes in positive symptoms, anxiety/depression symptoms and movement disorders. Path analysis indicated that up to 33% of negative symptom improvement was a direct treatment effect. Indirect effects on negative symptoms were mediated through changes in positive symptoms (51%) and anxiety/depression symptoms (18%), whereas changes in movement disorders had a 2.1% inverse effect. Path analysis indicated that paliperidone ER has a direct effect on negative symptoms. Negative symptom improvement also was indirectly mediated via changes in positive and depressive symptoms.
antipsychotic; paliperidone ER; path analysis; psychotropic; schizophrenia
For dichotomous outcomes, the authors discuss when the standard approaches to mediation analysis used in epidemiology and the social sciences are valid, and they provide alternative mediation analysis techniques when the standard approaches will not work. They extend definitions of controlled direct effects and natural direct and indirect effects from the risk difference scale to the odds ratio scale. A simple technique to estimate direct and indirect effect odds ratios by combining logistic and linear regressions is described that applies when the outcome is rare and the mediator continuous. Further discussion is given as to how this mediation analysis technique can be extended to settings in which data come from a case-control study design. For the standard mediation analysis techniques used in the epidemiologic and social science literatures to be valid, an assumption of no interaction between the effects of the exposure and the mediator on the outcome is needed. The approach presented here, however, will apply even when there are interactions between the effect of the exposure and the mediator on the outcome.
case-control studies; causal inference; decomposition; dichotomous response; epidemiologic methods; interaction; logistic regression; odds ratio
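As an illustration of combining a logistic outcome model with a linear mediator model, the sketch below evaluates the closed-form odds ratio expressions commonly given for this setting (rare outcome, continuous mediator with normal errors of constant variance). The parameter names are assumptions introduced here, not taken from the abstract, and the formulas should be checked against the paper before any real use. The outcome model is assumed to be logit P(Y=1) = theta0 + theta1*a + theta2*m + theta3*a*m + covariates, and the mediator model E[M] = beta0 + beta1*a + beta2'c.

```python
import math


def natural_effect_odds_ratios(theta1, theta2, theta3, beta0, beta1,
                               beta2_c, sigma2, a=1.0, a_star=0.0):
    """Return (NDE odds ratio, NIE odds ratio) for exposure change a_star -> a.

    theta1, theta2, theta3: exposure, mediator, and interaction coefficients
        of the logistic outcome model.
    beta0, beta1: intercept and exposure coefficient of the linear mediator model.
    beta2_c: covariate contribution beta2'c at the chosen covariate values.
    sigma2: residual variance of the mediator model.
    """
    log_nde = ((theta1 + theta3 * (beta0 + beta1 * a_star + beta2_c
                                   + theta2 * sigma2)) * (a - a_star)
               + 0.5 * theta3 ** 2 * sigma2 * (a ** 2 - a_star ** 2))
    log_nie = (theta2 * beta1 + theta3 * beta1 * a) * (a - a_star)
    return math.exp(log_nde), math.exp(log_nie)
```

With `theta3 = 0` the expressions reduce to the familiar no-interaction case: the direct effect odds ratio is `exp(theta1)` and the indirect effect odds ratio is `exp(theta2 * beta1)` for a one-unit change in exposure.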
Several studies have found replicable associations between nicotine dependence and specific variants in the nicotinic receptor genes CHRNA5(rs16969968) and CHRNA3(rs3743078). How these newly identified genetic risks combine with known environmental risks is unknown. This study examined whether the level of parent monitoring during early adolescence modified the risk of nicotine dependence associated with these genetic variants.
In a cross-sectional case-control study of a US-based community sample of 2027 subjects, we used a systematic series of regression models to examine the effect of parent monitoring on risk associated with two distinct variants in the nicotinic receptor genes CHRNA5(rs16969968) and CHRNA3(rs3743078).
Low parent monitoring as well as the previously identified genetic variants were associated with an increased risk of nicotine dependence. An interaction was found between the SNP(rs16969968) and parent monitoring (p=0.034). The risk for nicotine dependence increased significantly with the risk genotype of SNP(rs16969968) when combined with lowest quartile parent monitoring. In contrast, there was no evidence of an interaction between SNP(rs3743078) and parent monitoring (p=0.80).
The genetic risk of nicotine dependence associated with rs16969968 was modified by level of parent monitoring, while the genetic risk associated with rs3743078 was not, suggesting that the increased risk due to some genes may be mitigated by environmental factors such as parent monitoring.
nicotine dependence; parent monitoring; phenotype; gene-environmental interaction; nicotinic receptor genes; case control study
Recent advances in testing mediation have found that certain resampling methods and tests based on the mathematical distribution of 2 normal random variables substantially outperform the traditional z test. However, these studies have primarily focused only on models with a single mediator and 2 component paths. To address this limitation, a simulation was conducted to evaluate these alternative methods in a more complex path model with multiple mediators and indirect paths with 2 and 3 paths. Methods for testing contrasts of 2 effects were evaluated also. The simulation included 1 exogenous independent variable, 3 mediators and 2 outcomes and varied sample size, number of paths in the mediated effects, test used to evaluate effects, effect sizes for each path, and the value of the contrast. Confidence intervals were used to evaluate the power and Type I error rate of each method, and were examined for coverage and bias. The bias-corrected bootstrap had the least biased confidence intervals, greatest power to detect nonzero effects and contrasts, and the most accurate overall Type I error. All tests had less power to detect 3-path effects and more inaccurate Type I error compared to 2-path effects. Confidence intervals were biased for mediated effects, as found in previous studies. Results for contrasts did not vary greatly by test, although resampling approaches had somewhat greater power and might be preferable because of ease of use and flexibility.
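A bias-corrected bootstrap interval for a 2-path indirect effect can be sketched as follows. This is a minimal illustration, not the simulation code from the study: the function names, the single-mediator model, and the default of 1000 resamples are assumptions for the example.

```python
import random
from statistics import NormalDist


def indirect_effect(x, m, y):
    """Product a*b: a from M ~ X, b from Y ~ X + M (ordinary least squares)."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    smm = sum((v - mean_m) ** 2 for v in m)
    sxm = sum((u - mean_x) * (v - mean_m) for u, v in zip(x, m))
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    smy = sum((u - mean_m) * (v - mean_y) for u, v in zip(m, y))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bc_bootstrap_ci(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Return (estimate, lower, upper) using the bias-corrected bootstrap."""
    rng = random.Random(seed)
    n = len(x)
    est = indirect_effect(x, m, y)
    boots = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        boots.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    boots.sort()
    nd = NormalDist()
    # Bias correction z0: how far the bootstrap median sits from the estimate.
    p_below = sum(bv < est for bv in boots) / n_boot
    p_below = min(max(p_below, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = nd.inv_cdf(p_below)
    # Shift the percentile cut points by 2*z0 before reading off the interval.
    lo_p = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    hi_p = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))
    lo = boots[min(n_boot - 1, max(0, int(lo_p * n_boot)))]
    hi = boots[min(n_boot - 1, max(0, int(hi_p * n_boot)))]
    return est, lo, hi
```

The effect is judged nonzero when the interval excludes zero. When `z0` is zero the procedure reduces to the ordinary percentile bootstrap; the correction matters because the sampling distribution of a product of coefficients is typically skewed.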
The goal of mediation analysis is to identify and explicate the mechanism that underlies a relationship between a risk factor and an outcome via an intermediate variable (mediator). In this paper, we consider the estimation of mediation effects in zero-inflated (ZI) models intended to accommodate 'extra' zeros in count data. Focusing on the ZI negative binomial (ZINB) models, we provide a mediation formula approach to estimate the (overall) mediation effect in the standard two-stage mediation framework under a key sequential ignorability assumption. We also consider a novel decomposition of the overall mediation effect for the ZI context using a three-stage mediation model. Estimation of the components of the overall mediation effect requires an assumption involving the joint distribution of two counterfactuals. Simulation study results demonstrate low bias of mediation effect estimators and close-to-nominal coverage probability (CP) of confidence intervals. We also modify the mediation formula method by replacing 'exact' integration with a Monte Carlo integration method. The method is applied to a cohort study of dental caries in very low birth weight adolescents. For overall mediation effect estimation, sensitivity analysis was conducted to quantify the degree to which the key assumption must be violated to reverse the original conclusion.
causal mediation analysis; indirect effect; Monte-Carlo integration; mediation formula; sensitivity analysis; zero-inflation
Recently, researchers have used a potential-outcome framework to estimate causally interpretable direct and indirect effects of an intervention or exposure on an outcome. One approach to causal-mediation analysis uses the so-called mediation formula to estimate the natural direct and indirect effects. This approach generalizes classical mediation estimators and allows for arbitrary distributions for the outcome variable and mediator. A limitation of the standard (parametric) mediation formula approach is that it requires a specified mediator regression model and distribution; such a model may be difficult to construct and may not be of primary interest. To address this limitation, we propose a new method for causal-mediation analysis that uses the empirical distribution function, thereby avoiding parametric distribution assumptions for the mediator. In order to adjust for confounders of the exposure-mediator and exposure-outcome relationships, inverse-probability weighting is incorporated based on a supplementary model of the probability of exposure. This method, which yields estimates of the natural direct and indirect effects for a specified reference group, is applied to data from a cohort study of dental caries in very-low-birth-weight adolescents to investigate the oral-hygiene index as a possible mediator. Simulation studies show low bias in the estimation of direct and indirect effects in a variety of distribution scenarios, whereas the standard mediation formula approach can be considerably biased when the distribution of the mediator is incorrectly specified.
Genetic association studies for binary diseases are designed as case-control studies: the cases are those affected with the primary disease and the controls are free of the disease. At the time of case-control collection, information about secondary phenotypes is also collected. Association studies of secondary phenotype and genetic variants have received a great deal of interest recently. To study the secondary phenotypes, investigators use standard regression approaches, where individuals with secondary phenotypes are coded as cases and those without secondary phenotypes are coded as controls. However, using the secondary phenotype as an outcome variable in a case-control study might lead to a biased estimate of odds ratios (ORs) for genetic variants. The secondary phenotype is associated with the primary disease; therefore, individuals with and without the secondary phenotype are not sampled following the principles of a case-control study. In this article, we demonstrate that such analyses will lead to a biased estimate of OR and propose new approaches to provide more accurate OR estimates of genetic variants associated with the secondary phenotype for both unmatched and frequency-matched (with respect to the secondary phenotype) case-control studies. We also propose a bootstrapping method to estimate the empirical confidence intervals for the corrected ORs. Using simulation studies and analysis of lung cancer data for single-nucleotide polymorphism associated with smoking quantity, we compared our new approaches to standard logistic regression and to an extended version of the inverse-probability-of-sampling-weighted regression. The proposed approaches provide more accurate estimation of the true OR.
Odds ratio; bias; secondary phenotype; un-matched and frequency-matched study; SNP; genome-wide association study
To build upon state-of-the-art theory and empirical data to estimate the strength of multiple mediators of the efficacious Keep Active Minnesota (KAM) physical activity (PA) maintenance intervention.
The total, direct, and indirect effects through which KAM helped randomized participants (KAM n=523; UC n=526) maintain moderate or vigorous PA (MVPA) for up to 2 years were estimated using structural equation modeling.
Multiple mediators explained half (β=.052, P=.13) of the effect of KAM on MVPA (β=.105, P=.004). Self-efficacy was the upstream variable in 2 endogenously mediated effects, and the self-concept mediator emerged as the strongest predictor of MVPA.
KAM positively impacted self-efficacy, which was associated with PA enjoyment, integration into the self-concept, and PA maintenance. Successful long-term PA maintenance appears to be influenced by multiple small interrelated mediational pathways. Future research evaluating maintenance models should specify recursive relationships among mediators and outcomes.
maintenance; physical activity; multiple mediation; behavioral intervention; structural equation modeling
A non-synonymous coding polymorphism, rs16969968, of the CHRNA5 gene which encodes the alpha-5 subunit of the nicotinic acetylcholine receptor (nAChR) has been found to be associated with nicotine dependence (20). The goal of the present study is to examine the association of this variant with cocaine dependence.
Genetic association analysis in two independent samples of unrelated cases and controls: (1) 504 European Americans participating in the Family Study on Cocaine Dependence (FSCD); (2) 814 European Americans participating in the Collaborative Study on the Genetics of Alcoholism (COGA).
In the FSCD, there was a significant association between the CHRNA5 variant and cocaine dependence (OR = 0.67 per allele, p = 0.0045, assuming an additive genetic model), but in the reverse direction compared to that previously observed for nicotine dependence. In multivariate analyses that controlled for the effects of nicotine dependence, both the protective effect for cocaine dependence and the previously documented risk effect for nicotine dependence were statistically significant. The protective effect for cocaine dependence was replicated in the COGA sample. In COGA, effect sizes for habitual smoking, a proxy phenotype for nicotine dependence, were consistent with those observed in FSCD.
The minor (A) allele of rs16969968, relative to the major G allele, appears to be both a risk factor for nicotine dependence and a protective factor for cocaine dependence. The biological plausibility of such a bidirectional association stems from the involvement of nAChRs with both excitatory and inhibitory modulation of dopamine-mediated reward pathways.
Smoking; Nicotine dependence; Addiction; Substance-use disorders; Genetics; Receptors; nicotinic; Cocaine
Case-control association studies often collect extensive information on secondary phenotypes, which are quantitative or qualitative traits other than the case-control status. Exploring secondary phenotypes can yield valuable insights into biological pathways and identify genetic variants influencing phenotypes of direct interest. All publications on secondary phenotypes have used standard statistical methods, such as least-squares regression for quantitative traits. Because of unequal selection probabilities between cases and controls, the case-control sample is not a random sample from the general population. As a result, standard statistical analysis of secondary phenotype data can be extremely misleading. Although one may avoid the sampling bias by analyzing cases and controls separately or by including the case-control status as a covariate in the model, the associations between a secondary phenotype and a genetic variant in the case and control groups can be quite different from the association in the general population. In this article, we present novel statistical methods that properly reflect the case-control sampling in the analysis of secondary phenotype data. The new methods provide unbiased estimation of genetic effects and accurate control of false-positive rates while maximizing statistical power. We demonstrate the pitfalls of the standard methods and the advantages of the new methods both analytically and numerically. The relevant software is available at our website.
case-control sampling; complex diseases; genomewide association studies; linear regression; maximum likelihood; meta-analysis; quantitative traits; secondary traits; SNPs
Genetic association studies are a powerful tool to detect genetic variants that predispose to human disease. Once an associated variant is identified, investigators are also interested in estimating the effect of the identified variant on disease risk. Estimates of the genetic effect based on new association findings tend to be upwardly biased due to a phenomenon known as the “winner's curse”. Overestimation of genetic effect size in initial studies may cause follow-up studies to be underpowered and so to fail. In this paper, we quantify the impact of the winner's curse on the allele frequency difference and odds ratio estimators for one- and two-stage case-control association studies. We then propose an ascertainment-corrected maximum likelihood method to reduce the bias of these estimators. We show that overestimation of the genetic effect by the uncorrected estimator decreases as the power of the association study increases and that the ascertainment-corrected method reduces absolute bias and mean square error unless power to detect association is high.
winner's curse; ascertainment bias; genome-wide association study; maximum likelihood
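The upward bias itself can be illustrated with a small simulation. The sketch below assumes the estimated log odds ratio is normally distributed around a hypothetical true value with a known standard error, and compares the mean of all estimates with the mean of only those that reach significance. It demonstrates the phenomenon only; it does not implement the ascertainment-corrected maximum likelihood method, and all names and default values are assumptions for the example.

```python
import random
from statistics import mean


def winners_curse_demo(true_log_or=0.1, se=0.05, z_crit=1.96,
                       reps=20000, seed=0):
    """Return (mean of all estimates, mean of significant estimates, power)."""
    rng = random.Random(seed)
    # Each replicate is one study's estimated log odds ratio (normal approximation).
    estimates = [rng.gauss(true_log_or, se) for _ in range(reps)]
    # Keep only the studies that would have been reported as a "finding".
    significant = [e for e in estimates if abs(e / se) > z_crit]
    return mean(estimates), mean(significant), len(significant) / reps
```

Raising the power, for example by shrinking `se`, narrows the gap between the two means, which matches the observation that overestimation decreases as the power of the association study increases.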
Suppose that having established a marginal total effect of a point exposure on a time-to-event outcome, an investigator wishes to decompose this effect into its direct and indirect pathways, also known as natural direct and indirect effects, mediated by a variable known to occur after the exposure and prior to the outcome. This paper proposes a theory of estimation of natural direct and indirect effects in two important semiparametric models for a failure time outcome. The underlying survival model for the marginal total effect and thus for the direct and indirect effects, can either be a marginal structural Cox proportional hazards model, or a marginal structural additive hazards model. The proposed theory delivers new estimators for mediation analysis in each of these models, with appealing robustness properties. Specifically, in order to guarantee ignorability with respect to the exposure and mediator variables, the approach, which is multiply robust, allows the investigator to use several flexible working models to adjust for confounding by a large number of pre-exposure variables. Multiple robustness is appealing because it only requires a subset of working models to be correct for consistency; furthermore, the analyst need not know which subset of working models is in fact correct to report valid inferences. Finally, a novel semiparametric sensitivity analysis technique is developed for each of these models, to assess the impact on inference, of a violation of the assumption of ignorability of the mediator.
natural direct effect; natural indirect effect; Cox proportional hazards model; additive hazards model; multiple robustness
This study examined the association of two distinct self-regulation constructs, effortful control and dysregulation, with weight-related behaviors in adolescents and tested whether these effects were mediated by self-efficacy variables.
A school-based survey was conducted with 1771 adolescents from 11 public schools in the Bronx, New York. Self-regulation was assessed by multiple indicators and defined as two latent constructs. Dependent variables included fruit/vegetable intake, intake of snack/junk food, frequency of physical activity, and time spent in sedentary behaviors. Structural equation modeling examined the relation of effortful control and dysregulation to lifestyle behaviors, with self-efficacy variables as possible mediators.
Study results showed that effortful control had a positive indirect effect on fruit and vegetable intake, mediated by self-efficacy, as well as a direct effect. Effortful control also had a positive indirect effect on physical activity, mediated by self-efficacy. Dysregulation had direct effects on intake of junk food/snacks and time spent in sedentary behaviors.
These findings indicate that self-regulation characteristics are related to diet and physical activity and that some of these effects are mediated by self-efficacy. Different effects were noted for the two domains of self-regulation. Prevention researchers should consider including self-regulation processes in programs to improve health behaviors in adolescents.
Percent mammographic breast density (PMD) is a strong heritable risk factor for breast cancer. However, the pathways through which this risk is mediated are still unclear. To explore whether PMD and breast cancer have a shared genetic basis, we identified genetic variants most strongly associated with PMD in a published meta-analysis of five genome-wide association studies (GWAS) and used these to construct risk scores for 3628 breast cancer cases and 5190 controls from the UK2 GWAS of breast cancer. The signed per-allele effect estimates of SNPs were multiplied by the respective allele counts in the individual and summed over all SNPs to derive the risk score for an individual. These scores were included as the exposure variable in a logistic regression model with breast cancer case-control status as the outcome. This analysis was repeated using ten different cut-offs for the most significant density SNPs (1-10% representing 5,222-50,899 SNPs). Permutation analysis was also performed across all 10 cut-offs. The association between risk score and breast cancer was significant for all cut-offs from 3-10% of top density SNPs, being most significant for the 6% (2-sided P=0.002) to 10% (P=0.001) cut-offs (overall permutation P=0.003). Women in the top 10% of the risk score distribution had a 31% increased risk of breast cancer [OR= 1.31 (95%CI 1.08-1.59)] compared to women in the bottom 10%. Together, our results demonstrate that PMD and breast cancer have a shared genetic basis that is mediated through a large number of common variants.
breast cancer; mammographic density; SNPs; polygenic; Mendelian Randomisation
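The risk score construction described above (signed per-allele effect times allele count, summed over SNPs) can be written in a few lines. The function name and the example values are hypothetical and are not data from the study.

```python
def polygenic_risk_score(effects, allele_counts):
    """Sum of signed per-allele effect estimates weighted by allele counts.

    effects: per-SNP signed effect estimates for the trait (here, density).
    allele_counts: the individual's count (0, 1, or 2) of the effect allele
        at each SNP, in the same order as `effects`.
    """
    if len(effects) != len(allele_counts):
        raise ValueError("effects and allele_counts must have the same length")
    return sum(beta * count for beta, count in zip(effects, allele_counts))


# Hypothetical individual genotyped at three SNPs.
example_score = polygenic_risk_score([0.10, -0.20, 0.05], [2, 0, 1])
```

In the analysis described, one such score per individual would then enter a logistic regression as the exposure, with case-control status as the outcome.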
Only 10-15% of smokers develop chronic obstructive pulmonary disease (COPD) which indicates genetic susceptibility to the disease. Recent studies suggested an association between COPD and polymorphisms in CHRNA coding subunits of nicotinic acetylcholine receptor. Herein, we performed a meta-analysis to clarify the impact of CHRNA variants on COPD.
We searched Web of Knowledge and Medline from 1990 through June 2011 for COPD gene studies reporting variants on CHRNA. Pooled odds ratios (ORs) were calculated using the major allele or genotype as reference group.
Among seven reported variants in CHRNA, only rs1051730 had sufficient studies for analysis. In total, 3460 COPD cases and 11437 controls from 7 individual studies were pooled. The A allele of rs1051730 was associated with an increased risk of COPD regardless of smoking exposure (pooled OR = 1.26, 95% CI 1.18-1.34, p < 10⁻⁵). At the genotypic level, the ORs gradually increased per A allele (OR = 1.27 and 1.50 for GA and AA respectively, p < 10⁻⁵). In addition, the AA genotype was associated with reduced FEV1% predicted (mean difference 3.51%, 95% CI 0.87-6.16%, p = 0.009) and increased risk of emphysema (OR 1.93, 95% CI 1.29-2.90, p = 0.001).
Our findings suggest that rs1051730 in CHRNA is a susceptibility variant for COPD, in terms of both airway obstruction and parenchyma destruction.
Chronic Obstructive Pulmonary Disease (COPD); Nicotinic acetylcholine receptor (nAChR); CHRNA; Single nucleotide polymorphism (SNP)
Recent research found that among patients in aftercare treatment for alcoholism the level of therapist structure interacted with the level of patients’ interpersonal reactance to predict alcohol use outcomes. The present study examined 2 sets of potential mediators of this interaction effect among a sample from 2 aftercare sites of Project MATCH (n = 127). The mediator constructs were types of pro-recovery change talk and resistance to therapeutic work. Dependent variables were percentage of days abstinent (PDA) and percentage of heavy drinking days (PHDD) across the year after treatment. Multiple-mediator models using bootstrapped estimates of indirect effects were used to test for mediation. Results indicated that the ‘taking steps’ aspect of change talk partially mediated the Structure X Reactance interaction effect on both PDA and PHDD post treatment. Resistance was not found to mediate the interaction effect though resistance did predict worse drinking outcomes. Depending on patients’ openness to being influenced by others, therapist structure early in treatment may promote or inhibit pro-recovery steps taken by aftercare patients between treatment sessions. Those steps in turn play an important role in predicting future alcohol use.
alcohol; treatment; reactance; structure; change talk; mediation
Several independent studies show that the chromosome 15q25.1 region, which contains the CHRNA5-CHRNA3-CHRNB4 gene cluster, harbors variants strongly associated with nicotine dependence, other smoking behaviors, lung cancer, and chronic obstructive pulmonary disease.
We investigated whether variants in other cholinergic nicotinic receptor subunit (CHRN) genes affect risk for nicotine dependence in a new sample of African-Americans (N = 710). We also analyzed this African-American sample together with a European-American sample (N=2062, 1608 of which have been previously studied), allowing for differing effects in the two populations. Cases are current nicotine-dependent smokers and controls are non-dependent smokers.
Variants in or near CHRND-CHRNG, CHRNA7, and CHRNA10 show modest association with nicotine dependence risk in the African-American sample. In addition, CHRNA4, CHRNB3-CHRNA6, and CHRNB1 show association in at least one population. CHRNG and CHRNA4 harbor SNPs that have opposite directions of effect in the two populations. In each of the population samples, these loci substantially increase the trait variation explained, although no loci meet Bonferroni-corrected significance in the African-American sample alone. The trait variation explained by three key associated SNPs in CHRNA5-CHRNA3-CHRNB4 is 1.9% in European-Americans and also 1.9% in African-Americans; this increases to 4.5% in EAs and 7.3% in AAs when we add six variants representing associations at other CHRN genes.
Multiple nicotinic receptor subunit genes outside of chromosome 15q25 are likely to be important in the biological processes and development of nicotine dependence, and some of these risks may be shared across diverse populations.
genetic association; smoking; cholinergic nicotinic receptors; nicotinic acetylcholine receptors
Analytical solutions for point and variance estimators of the mediated effect, the ratio of the mediated to the direct effect, and the proportion of the total effect that is mediated were studied with statistical simulations. We compared several approximate solutions based on the multivariate delta method and second order Taylor series expansions to the empirical standard deviation of each estimator and theoretical standard error when available. The simulations consisted of 500 replications of three normally distributed variables for eight sample sizes (N = 10, 25, 50, 100, 500, 1000, and 5000) and 64 parameter value combinations. The different solutions for the standard error of the indirect effect were very similar for sample sizes of at least 50, except when the independent variable was dichotomized. A sample size of at least 500 was needed for accurate point and variance estimates of the proportion mediated. The point and variance estimates of the ratio of the mediated to nonmediated effect did not stabilize until the sample size was 2,000 for the all continuous variable case. Implications for the estimation of mediated effects in experimental and nonexperimental studies are discussed.
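For orientation, the quantities compared in the abstract above can be sketched as follows. This is a minimal example, assuming the usual single-mediator notation (a: effect of the independent variable on the mediator, b: effect of the mediator on the outcome, c_prime: direct effect). The function names and the numeric inputs are hypothetical and are not values from the simulation study.

```python
import math

def first_order_se(a, se_a, b, se_b):
    """First-order multivariate delta-method (Sobel) SE of the mediated effect a*b."""
    return math.sqrt(a**2 * se_b**2 + b**2 * se_a**2)

def second_order_se(a, se_a, b, se_b):
    """Second-order Taylor series SE, which adds the product of the two variances."""
    return math.sqrt(a**2 * se_b**2 + b**2 * se_a**2 + se_a**2 * se_b**2)

def mediation_summary(a, se_a, b, se_b, c_prime):
    """Point estimates of the mediated effect and the two ratio measures."""
    mediated = a * b
    return {
        "mediated_effect": mediated,
        "se_first_order": first_order_se(a, se_a, b, se_b),
        "se_second_order": second_order_se(a, se_a, b, se_b),
        # proportion of the total effect (a*b + c') that is mediated
        "proportion_mediated": mediated / (mediated + c_prime),
        # ratio of the mediated effect to the direct effect
        "ratio_mediated_to_direct": mediated / c_prime,
    }

# Hypothetical coefficients and standard errors, for illustration only.
summary = mediation_summary(a=0.5, se_a=0.1, b=0.4, se_b=0.1, c_prime=0.3)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The two ratio measures divide by estimated quantities that can be close to zero, which is one intuitive reason why they need much larger samples to stabilize than the mediated effect itself.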
A genome-wide association (GWA) study is usually designed as a case-control study, where the presence and absence of the primary disease define the cases and controls, respectively. Using the existing data from GWA studies, investigators are also trying to identify associations between genetic variants and secondary phenotypes, which are defined as traits associated with the primary disease. However, recent studies have shown that bias arises when the marker-secondary phenotype association is estimated from the originally collected data. We recently proposed a bias correction approach to accurately estimate the odds ratio (OR) for marker-secondary phenotype association. In this communication, we further investigated whether our bias correction approach is robust in a scenario involving an interactive effect of the secondary phenotype and genetic variants on the primary disease. We found that in such a scenario, our bias correction approach also provides an accurate estimate of the OR for marker-secondary phenotype association. We investigated the accuracy of our approach using simulation studies and showed that it controlled type I error better than the existing approaches. We also applied our bias correction approach to a real data analysis of the association between an N-acetyltransferase gene, NAT2, and smoking on the basis of colorectal cancer data.
odds ratio; bias; secondary phenotype; SNP; genome-wide association study; frequency-matched study design
Genetic variants at the 15q25 CHRNA5-CHRNA3 locus have been shown to influence lung cancer risk; however, there is controversy as to whether these variants have a direct carcinogenic effect or act indirectly through smoking behavior. We performed a detailed analysis of the 15q25 risk variants rs12914385 and rs8042374 with smoking behavior and lung cancer risk in 4,343 lung cancer cases and 1,479 controls from the Genetic Lung Cancer Predisposition Study (GELCAPS). Both rs12914385 and rs8042374 were strongly associated with lung cancer risk, with odds ratios (OR) of 1.44 (95% confidence interval (CI): 1.29–1.62, P = 3.69×10^−10) and 1.35 (95% CI: 1.18–1.55, P = 9.99×10^−6), respectively. Each copy of the risk alleles at rs12914385 and rs8042374 was associated with increased cigarette consumption of 1.0 and 0.9 cigarettes per day (CPD), respectively (P = 5.18×10^−5 and P = 5.65×10^−3). These genetically determined modest differences in smoking behavior can be shown to be sufficient to account for the 15q25 association with lung cancer risk. To further verify the indirect effect of 15q25 on lung cancer risk, we restricted our analysis to never-smokers and conducted a meta-analysis of previously published studies of lung cancer risk in never-smokers. Never-smoker studies published in English were identified in PubMed using the search terms lung cancer, risk, genome-wide association, and candidate genes. Our study and five previously published studies provided data on 2,405 never-smoker lung cancer cases and 7,622 controls. In the pooled analysis, no association was found between 15q25 variation and lung cancer risk (OR = 1.09, 95% CI: 0.94–1.28). This study affirms the 15q25 association with smoking and is consistent with an indirect link between genotype and lung cancer risk.
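The reported odds ratios, confidence intervals, and P values can be related to one another with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which the abstract does not state; the function name is hypothetical. Because the inputs are rounded to two decimals, the recovered P values should only match the reported ones to within rounding.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_p_from_or_ci(odds_ratio, ci_low, ci_high):
    """Recover the log-OR standard error, z statistic, and two-sided P value
    from an odds ratio and its 95% CI, assuming a Wald interval on the log scale."""
    se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    z = math.log(odds_ratio) / se_log_or
    # Two-sided normal tail probability: P(|Z| > z) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se_log_or, z, p_value

# Values as reported in the abstract above.
reported = {
    "rs12914385": (1.44, 1.29, 1.62),
    "rs8042374": (1.35, 1.18, 1.55),
    "never-smokers (pooled)": (1.09, 0.94, 1.28),
}
for label, (or_, low, high) in reported.items():
    se, z, p = wald_p_from_or_ci(or_, low, high)
    print(f"{label}: SE(log OR) = {se:.4f}, z = {z:.2f}, P = {p:.2e}")
```

For the pooled never-smoker estimate the interval spans 1, so the recovered P value is well above conventional significance thresholds, in line with the abstract's statement that no association was found.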