Although inversions have occasionally been found to be associated with disease susceptibility by interrupting a gene or its regulatory region, or by increasing the risk for deleterious secondary rearrangements, no association study has been specifically conducted for risks associated with inversions, mainly because existing approaches to detecting and genotyping inversions do not readily scale to a large number of samples. Based on our recently proposed approach to identifying and genotyping inversions using principal components analysis (PCA), we herein develop a method of detecting association between inversions and disease in a genome-wide fashion. Our method uses genotype data for single nucleotide polymorphisms (SNPs), and is thus cost-efficient and computationally fast. For an inversion polymorphism, local PCA around the inversion region is performed to infer the inversion genotypes of all samples. For many inversions, we found that some of the SNPs inside an inversion region are fixed in the two lineages of different orientations and thus can serve as surrogate markers. Our method can be applied to case-control and quantitative trait association studies to identify inversions that may interrupt a gene or the connection between a gene and its regulatory agents. Our method also offers a new avenue to identify inversions that are responsible for disease-causing secondary rearrangements. We applied our proposed approach to case-control data for psoriasis and identified novel associations with a few inversion polymorphisms.
Chromosomal inversion; Principal components analysis; Genome-wide association scan; Single-Nucleotide Polymorphism; Psoriasis
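The local-PCA idea in the abstract above can be illustrated with a minimal sketch. It assumes a small matrix of 0/1/2 SNP genotypes from a window around a candidate inversion, extracts the first principal component by power iteration, and assigns each sample to one of three groups along that component. The function names, the toy data and the crude three-centre assignment rule are illustrative assumptions, not the authors' implementation, and which end of the component corresponds to the inverted orientation cannot be determined from the scores alone.

```python
import math


def first_pc(genotypes, iters=200):
    """Return (loadings, scores) for the first principal component.

    genotypes: list of samples, each a list of 0/1/2 allele counts for
    the SNPs in a window around the candidate inversion (toy input).
    """
    n, m = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in genotypes]

    v = [1.0] * m  # starting vector for power iteration on X^T X
    for _ in range(iters):
        s = [sum(xi[j] * v[j] for j in range(m)) for xi in x]          # X v
        w = [sum(x[i][j] * s[i] for i in range(n)) for j in range(m)]  # X^T (X v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # no variation in the window
            break
        v = [c / norm for c in w]
    if sum(v) < 0:  # the sign of a principal component is arbitrary
        v = [-c for c in v]
    scores = [sum(xi[j] * v[j] for j in range(m)) for xi in x]
    return v, scores


def call_three_clusters(scores):
    """Assign each score to the nearest of three evenly spaced centres.

    This is a deliberately simple stand-in for a proper clustering step.
    """
    lo, hi = min(scores), max(scores)
    centers = [lo, (lo + hi) / 2.0, hi]
    return [min(range(3), key=lambda k: abs(s - centers[k])) for s in scores]


# Toy data: the first three SNPs behave like surrogate markers that track
# orientation; the fourth is unrelated noise.
toy_genotypes = [
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 2],
    [1, 1, 1, 0],
    [2, 2, 2, 1],
    [2, 2, 2, 1],
]
loadings, scores = first_pc(toy_genotypes)
inversion_calls = call_three_clusters(scores)
```

In real data the inferred groups would then be used as genotypes in an ordinary case-control or quantitative-trait association test.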
Epistasis, or gene–gene interaction, results from joint effects of genes on a trait; thus, the same alleles of one gene may display different genetic effects in different genetic backgrounds. In this study, we generalized the coding technique of a natural and orthogonal interaction (NOIA) model for association studies along with gene–gene interactions for dichotomous traits and human complex diseases. The NOIA model, which has uncorrelated estimators for genetic effects, is important for estimating the influence of multiple loci. We conducted simulations and data analyses to evaluate the performance of the NOIA model. Both simulation and real data analyses revealed that the NOIA statistical model had higher power for detecting main genetic effects and usually had higher power for some interaction effects than the usual model. Although genes that predispose people to melanoma have been identified (HERC2 at 15q13.1, MC1R at 16q24.3 and CDKN2A at 9p21.3), gene–gene interactions have not been fully explored for melanoma. By applying the NOIA statistical model to a genome-wide melanoma dataset, we confirmed the previously identified significantly associated genes and found potential regions at chromosomes 5 and 4 that may interact with the HERC2 and MC1R genes, respectively. Our study not only generalized the orthogonal NOIA model but also provided useful insights for understanding the influence of interactions on melanoma risk.
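To make the idea of an orthogonal coding concrete, the sketch below builds additive and dominance codes for a single biallelic locus from its genotype frequencies and checks that the columns are uncorrelated when weighted by those frequencies. The formulas follow the form commonly given for the NOIA statistical parameterization; they are stated here as an assumption and should be checked against the original model description. The function and variable names are hypothetical.

```python
def noia_statistical_coding(p11, p12, p22):
    """Additive and dominance codes for genotypes with 0, 1, 2 copies of
    the reference allele, given genotype frequencies p11, p12, p22.

    Assumed form of the NOIA statistical coding; does not require
    Hardy-Weinberg equilibrium.
    """
    mean_count = p12 + 2.0 * p22
    additive = [0.0 - mean_count, 1.0 - mean_count, 2.0 - mean_count]
    denom = p11 + p22 - (p11 - p22) ** 2
    dominance = [
        -2.0 * p12 * p22 / denom,
        4.0 * p11 * p22 / denom,
        -2.0 * p11 * p12 / denom,
    ]
    return additive, dominance


def weighted_dot(weights, u, v):
    """Frequency-weighted inner product of two coding columns."""
    return sum(w * a * b for w, a, b in zip(weights, u, v))


# Genotype frequencies deliberately chosen to violate Hardy-Weinberg.
freqs = [0.5, 0.3, 0.2]
additive, dominance = noia_statistical_coding(*freqs)
ones = [1.0, 1.0, 1.0]

mean_additive = weighted_dot(freqs, ones, additive)     # should be ~0
mean_dominance = weighted_dot(freqs, ones, dominance)   # should be ~0
cross_term = weighted_dot(freqs, additive, dominance)   # should be ~0
```

Because the three columns are mutually orthogonal under the genotype frequencies, dropping or adding the dominance term does not change the estimate of the additive effect, which is the property the abstract refers to.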
Genome-wide association studies of European and East Asian populations have identified lung cancer susceptibility loci on chromosomes 5p15.33, 6p22.1-p21.31 and 15q25.1. We investigated whether these regions contain lung cancer susceptibility loci in African-Americans and refined previous association signals by utilizing the reduced linkage disequilibrium observed in African-Americans.
1308 African-American cases and 1241 African-American controls from three centers were genotyped for 760 single nucleotide polymorphisms spanning three regions, and additional SNP imputation was performed. Associations between polymorphisms and lung cancer risk were estimated using logistic regression, stratified by tumor histology where appropriate.
The strongest associations were observed on 15q25.1 in/near CHRNA5, including a missense substitution (rs16969968: OR = 1.57, 95% CI = 1.25–1.97, P = 1.1 × 10−4) and variants in the 5′-UTR. Associations on 6p22.1-p21.31 were histology-specific and included a missense variant in BAT2 associated with squamous-cell carcinoma (rs2736158: OR = 0.64, 95% CI = 0.48–0.85, P = 1.82 × 10−3). Associations on 5p15.33 were detected near TERT, the strongest of which was rs2735940 (OR = 0.82, 95% CI = 0.73–0.93, P = 1.1 × 10−3). This association was stronger among cases with adenocarcinoma (OR = 0.75, 95% CI = 0.65–0.86, P = 8.1 × 10−5).
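As a rough consistency check on results of this kind, the standard error, Wald statistic and two-sided p-value can be approximately recovered from a reported odds ratio and its 95% confidence interval, assuming the interval was built as exp(beta ± 1.96·SE) on the log-odds scale. The helper below is a sketch under that assumption; because the published values are rounded, the recovered p-value only approximates the reported one.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover (beta, se, z, p) from an odds ratio and its 95% CI.

    Assumes a symmetric Wald interval on the log-odds scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p


# Values reported above for rs16969968 (OR = 1.57, 95% CI = 1.25-1.97).
beta, se, z, p = wald_from_or_ci(1.57, 1.25, 1.97)
```

For this variant the recovered p-value is of the same order as the reported P = 1.1 × 10−4, which suggests the interval and p-value come from the same Wald test.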
Polymorphisms in 5p15.33, 6p22.1-p21.31 and 15q25.1 are associated with lung cancer in African-Americans. Variants on 5p15.33 are stronger risk factors for adenocarcinoma and variants on 6p21.33 are associated only with squamous-cell carcinoma.
Results implicate the BAT2, TERT and CHRNA5 genes in the pathogenesis of specific lung cancer histologies.
Lung cancer; adenocarcinoma; squamous-cell carcinoma; fine-mapping; African-American; genetic association
Acquired uniparental disomy (aUPD) can lead to homozygosity for tumor suppressor genes or oncogenes. Our purpose was to determine the frequency and profile of aUPD regions in serous ovarian cancer (SOC) and to investigate the association of aUPD with clinical features and patient outcomes.
We analyzed single nucleotide polymorphism (SNP) array-based genotyping data on 532 SOC specimens from The Cancer Genome Atlas database to identify aUPD regions. Cox univariate regression and Cox multivariate proportional hazards analyses were performed for survival analysis.
We found that 94.7% of SOC samples harbored aUPD; the most common aUPD regions were in chromosomes 17q (76.7%), 17p (39.7%), and 13q (38.3%). In Cox univariate regression analysis, two independent regions of aUPD on chromosome 17q (A and C), and whole-chromosome aUPD were associated with shorter overall survival (OS), and five regions on chromosome 17q (A, D-G) and BRCA1 were associated with recurrence-free survival time. In Cox multivariable proportional hazards analysis, whole-chromosome aUPD was associated with shorter OS. One region of aUPD on chromosome 22q (B) was associated with unilateral disease. A statistically significant association was found between aUPD at TP53 loci and homozygous mutation of TP53 (p < 0.0001).
aUPD is a common event and some recurrent loci are associated with a poor outcome for patients with serous ovarian cancer.
Electronic supplementary material
The online version of this article (doi:10.1186/s12943-015-0289-1) contains supplementary material, which is available to authorized users.
Acquired uniparental disomy; Ovarian cancer; Overall survival; Recurrence-free survival
Obesity and diabetes are potentially alterable risk factors for pancreatic cancer. Genetic factors that modify the associations of obesity and diabetes with pancreatic cancer have previously not been examined at the genome-wide level.
Using GWAS genotype and risk factor data from the Pancreatic Cancer Case Control Consortium, we conducted a discovery study of 2,028 cases and 2,109 controls to examine gene-obesity and gene-diabetes interactions in relation to pancreatic cancer risk by employing the likelihood ratio test (LRT) nested in logistic regression models and Ingenuity Pathway Analysis (IPA).
After adjusting for multiple comparisons, a significant interaction of the chemokine signaling pathway with obesity (P = 3.29 × 10−6) and a near significant interaction of the calcium signaling pathway with diabetes (P = 1.57 × 10−4) in modifying the risk of pancreatic cancer were observed. These findings were supported by results from IPA analysis of the top genes with nominal interactions. The major contributing genes to the two top pathways include GNGT2, RELA, TIAM1 and GNAS. None of the individual genes or SNPs except one SNP remained significant after adjusting for multiple testing. Notably, SNP rs10818684 of the PTGS1 gene showed an interaction with diabetes (P = 7.91 × 10−7) at a false discovery rate of 6%.
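The abstract reports a false discovery rate but does not say which procedure produced it. As an illustration only, the sketch below computes Benjamini-Hochberg adjusted p-values, one widely used way of controlling the false discovery rate; the choice of this procedure and the toy p-values are assumptions, not details taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy interaction p-values (not from the study).
toy_pvalues = [0.01, 0.04, 0.03, 0.005]
toy_adjusted = benjamini_hochberg(toy_pvalues)
```

A test is then declared significant at a chosen false discovery rate, for example 5%, when its adjusted value falls at or below that threshold.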
Genetic variations in inflammatory response and insulin resistance may affect the risk of obesity and diabetes-related pancreatic cancer. These observations should be replicated in additional large datasets.
Gene-environment interaction analysis may provide new insights into the genetic susceptibility and molecular mechanisms of obesity- and diabetes-related pancreatic cancer.
GWAS; obesity; diabetes; interaction; pancreatic cancer; genetic susceptibility
In omic research, such as genome wide association studies, researchers seek to repeat their results in other datasets to reduce false positive findings and thus provide evidence for the existence of true associations. Unfortunately this standard validation approach cannot completely eliminate false positive conclusions, and it can also mask many true associations that might otherwise advance our understanding of pathology. These issues beg the question: How can we increase the amount of knowledge gained from high throughput genetic data? To address this challenge, we present an approach that complements standard statistical validation methods by drawing attention to both potential false negative and false positive conclusions, as well as providing broad information for directing future research. The Diverse Convergent Evidence approach (DiCE) we propose integrates information from multiple sources (omics, informatics, and laboratory experiments) to estimate the strength of the available corroborating evidence supporting a given association. This process is designed to yield an evidence metric that has utility when etiologic heterogeneity, variable risk factor frequencies, and a variety of observational data imperfections might lead to false conclusions. We provide proof of principle examples in which DiCE identified strong evidence for associations that have established biological importance, when standard validation methods alone did not provide support. If used as an adjunct to standard validation methods this approach can leverage multiple distinct data types to improve genetic risk factor discovery/validation, promote effective science communication, and guide future research directions.
Replication; Validation; Complex disease; Heterogeneity; GWAS; Omics; Type 2 error; Type 1 error; False negatives; False positives
Angiogenesis and lymphangiogenesis are important in the progression of melanoma. We investigated associations between genetic variants in these pathways with sentinel lymph node (SLN) metastasis and mortality in two independent series of melanoma patients.
Participants at Moffitt Cancer Center were 552 patients, all Caucasian, with primary cutaneous melanoma referred for SLN biopsy. A total of 177 patients had SLN metastasis, among whom 60 died from melanoma. Associations between 238 SNPs in 26 genes and SLN metastasis were estimated as odds ratios and 95%CI using logistic regression. Competing risk regression was used to estimate hazard ratios and 95%CI for each SNP and melanoma-specific mortality. We attempted to replicate significant findings using data from a genome-wide association study comprising 1,115 melanoma patients, who were referred for SLN biopsy from MD Anderson Cancer Center (MDACC), among whom 189 patients had SLN metastasis and 92 patients died from melanoma.
In the Moffitt dataset, we observed significant associations in 18 SNPs with SLN metastasis and 17 SNPs with mortality. Multiple SNPs in COL18A1, EGFR, FLT1, IL10, PDGFD, PIK3CA and TLR3 were associated with risk of SLN metastasis and/or patient mortality. The MDACC data set replicated an association between mortality and rs2220377 in PDGFD. Further, in a meta-analysis, three additional SNPs were significantly associated with SLN metastasis (EGFR rs723526 and TLR3 rs3775292) and melanoma specific death (TLR3 rs7668666).
These findings suggest that genetic variation in angiogenesis and lymphangiogenesis contributes to regional nodal metastasis and progression of melanoma.
Additional research attempting to replicate these results is warranted.
SNP; lymph/angiogenesis; melanoma; sentinel lymph node
Cutaneous melanoma (CM) is the most lethal skin cancer. The Fanconi Anemia (FA) pathway involved in DNA crosslink repair may affect CM susceptibility and prognosis. Using data derived from a published genome-wide association study, we comprehensively analyzed the associations of 2339 common single nucleotide polymorphisms (SNPs) in 14 autosomal FA genes with overall survival (OS) in 858 CM patients. By performing false-positive report probability corrections and stepwise Cox proportional hazards regression analyses, we identified significant associations between CM OS and four putatively functional SNPs: BRCA2 rs10492396 [AG vs. GG: adjusted hazard ratio (adjHR)=1.85, 95% confidence interval (CI)=1.16-2.95, P=0.010], rs206118 (CC vs. TT+TC: adjHR=2.44, 95% CI=1.27-4.67, P=0.007), rs3752447 (CC vs. TT+TC: adjHR=2.10, 95% CI=1.38-3.18, P=0.0005), and FANCA rs62068372 (TT vs. CC+CT: adjHR=1.85, 95% CI=1.27-2.69, P=0.001). Moreover, patients with an increasing number of unfavorable genotypes (NUG) of these loci had markedly reduced OS and melanoma-specific survival (MSS). The final model incorporating NUG, tumor stage and Breslow thickness showed an improved discriminatory ability to classify both 5-year OS and 5-year MSS. Additional investigations, preferably prospective studies, are needed to validate our findings.
cutaneous melanoma; Fanconi Anemia pathway; survival; single nucleotide polymorphisms; Cox regression
Mutations in BRCA1 and BRCA2 increase a woman's lifetime risk of developing breast cancer to 43%-84%. It was originally postulated that BRCA1/2-associated breast cancers develop more rapidly than sporadic cancers and may lack pre-invasive lesions. More recent studies have found pre-invasive lesions in prophylactic mastectomy specimens from mutation carriers; however, there is little information on the presence of pre-invasive lesions in tissue adjacent to breast cancers. Our aim is to investigate the role of pre-invasive lesions in BRCA-associated breast carcinogenesis.
We retrospectively compared BRCA1/2-associated breast cancers and sporadic breast cancers for the prevalence of pre-invasive lesions (ductal carcinoma in situ [DCIS], lobular carcinoma in situ [LCIS], and atypical lobular hyperplasia [ALH]) in tissue adjacent to invasive breast cancers.
Pathology was reviewed for 73 BRCA1/2-associated tumors from breast cancer patients. We selected 146 mutation-negative breast cancer patients as age-matched controls. Of BRCA1/2-associated breast cancers, 59% had at least one associated pre-invasive lesion compared with 75% of controls. Pre-invasive lesions were more prevalent in BRCA2 mutation carriers than in BRCA1 mutation carriers (70% vs. 52%, respectively). The most common pre-invasive lesion in both groups was DCIS; 56% of BRCA1/2-associated breast cancers and 71% of the sporadic breast cancers had adjacent intraductal disease, respectively.
Pre-invasive lesions, most notably DCIS, are common in BRCA1/2-associated breast cancers. These findings suggest that BRCA1/2-associated breast cancers progress through the same intermediate steps as sporadic breast cancers, and that DCIS should be considered as a part of the BRCA1/2 tumor spectrum.
Alopecia areata (AA) is a prevalent autoimmune disease with ten known susceptibility loci. Here we perform the first meta-analysis in AA by combining data from two genome-wide association studies (GWAS), and replication with supplemented ImmunoChip data for a total of 3,253 cases and 7,543 controls. The strongest region of association is the MHC, where we fine-map 4 independent effects, all implicating HLA-DR as a key etiologic driver. Outside the MHC, we identify two novel loci that exceed statistical significance, containing ACOXL/BCL2L11(BIM) (2q13) and GARP (LRRC32) (11q13.5), as well as a third nominally significant region, SH2B3(LNK)/ATXN2 (12q24.12). Candidate susceptibility gene expression analysis in these regions demonstrates expression in relevant immune cells and the hair follicle. We integrate our results with data from seven other autoimmune diseases and provide insight into the alignment of AA within these disorders. Our findings uncover new molecular pathways disrupted in AA, including autophagy/apoptosis, TGFβ/Tregs and JAK kinase signaling, and support the causal role of aberrant immune processes in AA.
Genome-wide association studies (GWAS) have generated sufficient data to assess the role of selection in shaping allelic diversity of disease-associated SNPs. Negative selection against disease risk variants is expected to reduce their frequencies, making them overrepresented in the group of minor (<50%) alleles. Indeed, we found that the overall proportion of risk alleles was higher among alleles with frequency <50% (minor alleles) compared to that in the group of major alleles. We hypothesized that negative selection may have different effects on environment (or lifestyle)-dependent versus environment (or lifestyle)-independent diseases. We used an environment/lifestyle index (ELI) to assess the influence of environmental/lifestyle factors on disease etiology. ELI was defined as the number of publications mentioning “environment” or “lifestyle” AND disease per 1,000 disease-mentioning publications. We found that the frequency distributions of the risk alleles for the diseases with strong environmental/lifestyle components follow the distribution expected under a selectively neutral model, while frequency distributions of the risk alleles for the diseases with weak environmental/lifestyle influences are shifted to the lower values, indicating effects of negative selection. We hypothesized that previously selectively neutral variants become risk alleles when environment changes. The hypothesis of ancestrally neutral, currently disadvantageous risk-associated alleles predicts that the distribution of risk alleles for the environment/lifestyle dependent diseases will follow a neutral model since natural selection has not had enough time to influence allele frequencies. The results of our analysis suggest that prediction of SNP functionality based on the level of evolutionary conservation may not be useful for SNPs associated with environment/lifestyle dependent diseases.
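The ELI definition given above reduces to a simple ratio. A minimal sketch, with hypothetical publication counts chosen only to show the arithmetic (the function name and the numbers are not from the study):

```python
def environment_lifestyle_index(n_joint, n_disease):
    """ELI: publications mentioning ("environment" or "lifestyle") AND the
    disease, per 1,000 publications mentioning the disease."""
    if n_disease <= 0:
        raise ValueError("n_disease must be positive")
    return 1000.0 * n_joint / n_disease


# Hypothetical counts: 50 joint mentions among 2,000 disease publications.
example_eli = environment_lifestyle_index(50, 2000)
```

A higher index is read as a stronger environmental or lifestyle component in the literature on that disease.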
We reviewed several thousand genome wide association studies that were conducted to identify genetic variants influencing risk of human diseases. We tested the hypothesis that single nucleotide polymorphisms (SNPs) that influence disease risk undergo positive or negative selection more frequently than an average SNP in the human genome. We found no evidence for excess of positive selection on disease-associated SNPs. At the same time we found that alleles associated with a higher disease risk undergo negative selection. We also demonstrated that risk alleles for diseases with strong influence of environment/lifestyle factors (e.g. Type II diabetes) show little evidence of negative selection, while risk alleles for diseases with weak influence of environment/lifestyle factors (e.g. Pathological myopia) show clear signs of negative selection. The approach used in this study can be used to estimate the number of genetic variants in the human genome influencing risk of human diseases.
To identify genetic determinants of granulomatosis with polyangiitis (Wegener’s) (GPA).
We carried out a genome-wide association study (GWAS) of 492 GPA cases and 1,506 healthy controls (white subjects of European descent), followed by replication analysis of the most strongly associated signals in an independent cohort of 528 GPA cases and 1,228 controls.
Genome-wide significant associations were identified in 32 single-nucleotide polymorphic (SNP) markers across the HLA region, the majority of which were located in the HLA–DPB1 and HLA–DPA1 genes encoding the class II major histocompatibility complex (MHC) DPβ chain 1 and DPα chain 1 proteins, respectively. Peak association signals in these 2 genes, emanating from SNPs rs9277554 (for DPβ chain 1) and rs9277341 (DPα chain 1) were strongly replicated in an independent cohort (in the combined analysis of the initial cohort and the replication cohort, P = 1.92 × 10−50 and 2.18 × 10−39, respectively). Imputation of classic HLA alleles and conditional analyses revealed that the SNP association signal was fully accounted for by the classic HLA–DPB1*04 allele. An independent single SNP, rs26595, near SEMA6A (the gene for semaphorin 6A) on chromosome 5, was also associated with GPA, reaching genome-wide significance in a combined analysis of the GWAS and replication cohorts (P = 2.09 × 10−8).
We identified the SEMA6A and HLA–DP loci as significant contributors to risk for GPA, with the HLA–DPB1*04 allele almost completely accounting for the MHC association. These two associations confirm the critical role of immunogenetic factors in the development of GPA.
The risk of glioma has consistently been shown to be increased two-fold in relatives of patients with primary brain tumors (PBT). A recent genome-wide linkage study of glioma families provided evidence for a disease locus on 17q12-21.32, with the possibility of four additional risk loci at 6p22.3, 12p13.33-12.1, 17q22-23.2, and 18q23.
To identify the underlying genetic variants responsible for the linkage signals, we compared the genotype frequencies of 5,122 SNPs mapping to these five regions in 88 glioma cases with and 1,100 cases without a family history of PBT (discovery study). An additional series of 84 familial and 903 non-familial cases were used to replicate associations.
In the discovery study, 12 SNPs showed significant associations with family history of PBT (P < 0.001). In the replication study, two of the 12 SNPs were confirmed: 12p13.33-12.1 PRMT8 rs17780102 (P = 0.031) and 17q12-21.32 SPOP rs650461 (P = 0.025). In the combined analysis of discovery and replication studies, the strongest associations were attained at four SNPs: 12p13.33-12.1 PRMT8 rs17780102 (P = 0.0001), SOX5 rs7305773 (P = 0.0001) and STKY1 rs2418087 (P = 0.0003), and 17q12-21.32 SPOP rs6504618 (P = 0.0006). Further, a significant gene-dosage effect was found for increased risk of family history of PBT with these four SNPs in the combined data set (Ptrend < 1.0 ×10−8).
The results support the linkage finding that some loci in the 12p13.33-12.1 and 17q12-q21.32 regions may contribute to gliomagenesis and suggest potential target genes underlying the linkage signals.
Association; Polymorphisms; Glioma; Family history of primary brain tumor; Linkage analysis
Genetic imprinting is the most well-known cause for parent-of-origin effect (POE) whereby a gene is differentially expressed depending on the parental origin of the same alleles. Genetic imprinting is related to several human disorders, including diabetes, breast cancer, alcoholism, and obesity. This phenomenon has been shown to be important for normal embryonic development in mammals. Traditional association approaches ignore this important genetic phenomenon. In this study, we generalize the natural and orthogonal interactions (NOIA) framework to allow for estimation of both main allelic effects and POEs. We develop a statistical (Stat-POE) model that has the orthogonal estimates of parameters including the POEs. We conducted simulation studies for both quantitative and qualitative traits to evaluate the performance of the statistical and functional models with different levels of POEs. Our results showed that the newly proposed Stat-POE model, which ensures orthogonality of variance components if Hardy-Weinberg Equilibrium (HWE) or equal minor and major allele frequencies is satisfied, had greater power for detecting the main allelic additive effect than a Func-POE model, which codes according to allelic substitutions, for both quantitative and qualitative traits. The power for detecting the POE was the same for the Stat-POE and Func-POE models under HWE for quantitative traits.
We aimed at extending the natural and orthogonal interaction (NOIA) framework, developed for modeling gene-gene interactions in the analysis of quantitative traits, to allow for reduced genetic models, dichotomous traits, and gene-environment interactions. We evaluate the performance of the NOIA statistical models using simulated data and lung cancer data.
The NOIA statistical models are developed for the additive, dominant, recessive genetic models, and a binary environmental exposure. Using the Kronecker product rule, a NOIA statistical model is built to model gene-environment interactions. By treating the genotypic values as the logarithm of odds, the NOIA statistical models are extended to the analysis of case-control data.
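The Kronecker-product construction described above can be sketched for the simplest case: an additive genetic model combined with a binary exposure. Each factor gets a small matrix whose columns are an intercept and a mean-centred code, and their Kronecker product yields the intercept, main-effect and interaction columns for all six genotype-by-exposure cells. The sketch assumes genotype and exposure are independent in the population, which is what makes the resulting columns orthogonal; the names, frequencies and column ordering are illustrative assumptions rather than the authors' specification.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


# Assumed population frequencies (illustrative only).
genotype_freqs = [0.5, 0.3, 0.2]   # 0, 1, 2 copies of the allele
exposure_freq = 0.4                # P(E = 1)

mean_g = genotype_freqs[1] + 2.0 * genotype_freqs[2]
s_g = [[1.0, g - mean_g] for g in (0, 1, 2)]        # intercept, additive
s_e = [[1.0, e - exposure_freq] for e in (0, 1)]    # intercept, exposure

# Rows: (E=0,G=0), (E=0,G=1), (E=0,G=2), (E=1,G=0), ...
# Columns: intercept, G main effect, E main effect, G x E interaction.
design = kron(s_e, s_g)

# Cell frequencies under independence of genotype and exposure.
cell_freqs = [qe * pg
              for qe in (1.0 - exposure_freq, exposure_freq)
              for pg in genotype_freqs]


def weighted_cross(j, k):
    """Frequency-weighted inner product of design columns j and k."""
    return sum(w * row[j] * row[k] for w, row in zip(cell_freqs, design))


off_diagonal = [weighted_cross(j, k)
                for j in range(4) for k in range(j + 1, 4)]
```

All off-diagonal cross-products vanish, so the main-effect estimates are unaffected by whether the interaction column is included, which is the practical benefit of the orthogonal formulation.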
Our simulations showed that power for testing associations while allowing for interaction using the statistical model is much higher than using functional models for most of the scenarios we simulated. When applied to the lung cancer data, much smaller P-values were obtained using the NOIA statistical model for either the main effects or the SNP-smoking interactions for some of the SNPs tested.
The NOIA statistical models are usually more powerful than the functional models in detecting main effects and interaction effects for both quantitative traits and binary traits.
Statistical power; Genetic association studies; Case-control association analysis; Gene-environment interaction; Environmental risk factor; Association mapping; Orthogonal modeling
We conducted gene–smoking interaction analysis in GWAS data of pancreatic cancer. We found a possible interaction of axon guidance pathway genes with smoking in modifying the risk of pancreatic cancer. Once confirmed, it will open a new avenue to unveiling the etiology of smoking-associated pancreatic cancer.
Cigarette smoking is the best established modifiable risk factor for pancreatic cancer. Genetic factors that underlie smoking-related pancreatic cancer have previously not been examined at the genome-wide level. Taking advantage of the existing genome-wide association study (GWAS) genotype and risk factor data from the Pancreatic Cancer Case Control Consortium, we conducted a discovery study in 2028 cases and 2109 controls to examine gene–smoking interactions at pathway/gene/single nucleotide polymorphism (SNP) level. Using the likelihood ratio test nested in logistic regression models and ingenuity pathway analysis (IPA), we examined 172 KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, 3 manually curated gene sets, 3 nicotine dependency gene ontology pathways, 17 912 genes and 468 114 SNPs. None of the individual pathways, genes or SNPs showed significant interaction with smoking after adjusting for multiple comparisons. Six KEGG pathways showed nominal interactions (P < 0.05) with smoking, and the top two are the pancreatic secretion and salivary secretion pathways (major contributing genes: RAB8A, PLCB and CTRB1). Nine genes, i.e. ZBED2, EXO1, PSG2, SLC36A1, CLSTN1, MTHFSD, FAT2, IL10RB and ATXN2, had Pinteraction < 0.0005. Five intergenic region SNPs and two SNPs of the EVC and KCNIP4 genes had Pinteraction < 0.00003. In IPA analysis of genes with nominal interactions with smoking, the axonal guidance signaling (P=2.12×10−7) and α-adrenergic signaling (P=2.52×10−5) canonical pathways were significantly overrepresented. Genes contributing to the axon guidance signaling pathway included the SLIT/ROBO signaling genes that were frequently altered in pancreatic cancer. These observations need to be confirmed in additional data sets. Once confirmed, they will open a new avenue to unveiling the etiology of smoking-associated pancreatic cancer.
We explored the contribution of nitrosamine metabolism to lung cancer in a pilot investigation of genetic variation in CYP2B6, a high-affinity enzymatic activator of tobacco-specific nitrosamines with a negligible role in nicotine metabolism. Previously we found that variation in CYP2A6 and CHRNA5-CHRNA3-CHRNB4 combined to increase lung cancer risk in a case-control study in European American ever-smokers (n = 860). However, these genes are involved in the pharmacology of both nicotine, through which they alter smoking behaviours, and carcinogenic nitrosamines. Herein, we separated participants by CYP2B6 genotype into a high- vs. low-risk group (*1/*1 + *1/*6 vs. *6/*6). Odds ratios estimated through logistic regression modeling were 1.25 (95% CI 0.68–2.30), 1.27 (95% CI 0.89–1.79) and 1.56 (95% CI 1.04–2.31) for CYP2B6, CYP2A6 and CHRNA5-CHRNA3-CHRNB4, respectively, with negligible differences when all genes were evaluated concurrently. Modeling the combined impact of high-risk genotypes yielded odds ratios that rose from 2.05 (95% CI 0.39–10.9) to 2.43 (95% CI 0.47–12.7) to 3.94 (95% CI 0.72–21.5) for those with 1, 2 and 3 vs. 0 high-risk genotypes, respectively. Findings from this pilot point to genetic variation in CYP2B6 as a lung cancer risk factor supporting a role for nitrosamine metabolic activation in the molecular mechanism of lung carcinogenesis.
CYP2B6; CYP2A6; CHRNA5-CHRNA3-CHRNB4; tobacco specific nitrosamines; lung cancer risk; genetic variation
Genetic variants located at 15q25, including those in the cholinergic receptor nicotinic cluster (CHRNA5), have been implicated in both lung cancer risk and nicotine dependence in recent genome-wide association studies. Among these variants, a 22-base-pair insertion/deletion, rs3841324, showed the strongest association with CHRNA5 mRNA expression levels. However, the influence of rs3841324 on lung cancer risk has not been studied in depth.
We have therefore evaluated the association of rs3841324 genotypes with lung cancer risk in a case-control study of 624 Caucasian subjects with lung cancer and 766 age- and sex-matched cancer-free Caucasian controls. We also evaluated the joint effects of rs3841324 with single-nucleotide polymorphisms (SNPs) rs16969968 and rs8034191 in the 15q25 region that have been consistently implicated in lung cancer risk.
We found that the homozygous genotype with both short alleles (SS) of rs3841324 was associated with a decreased lung cancer risk in female ever smokers relative to the homozygous wild-type (LL) and heterozygous (LS) genotypes combined in a recessive model (OR adjusted = 0.55, 95% CI = 0.31–0.89, P = 0.0168). There was no evidence for a sex difference in the association between this variant and cigarettes smoked per day (CPD). Diplotype analysis of rs3841324 with either rs16969968 or rs8034191 showed that these polymorphisms influenced the lung cancer risk independently.
Conclusions and impact
This study has shown a sex difference in the association between the 15q25 variant rs3841324 and lung cancers. Further research is warranted to elucidate the mechanisms underlying these observations.
lung cancer; CHRNA5; Chromosome 15q25; rs3841324; sex-specific association
Genetic researchers often collect disease related quantitative traits in addition to disease status because they are interested in understanding the pathophysiology of disease processes. In genome-wide association (GWA) studies, these quantitative phenotypes may be relevant to disease development and serve as intermediate phenotypes or they could be behavioral or other risk factors that predict disease risk. Statistical tests combining both disease status and quantitative risk factors should be more powerful than case-control studies, as the former incorporates more information about the disease. In this paper, we proposed a modified inverse-variance weighted meta-analysis method to combine disease status and quantitative intermediate phenotype information. The simulation results showed that when an intermediate phenotype was available, the inverse-variance weighted method had more power than did a case-control study of complex diseases, especially in identifying susceptibility loci having minor effects. We further applied this modified meta-analysis to a study of imputed lung cancer genotypes with smoking data in 1154 cases and 1137 matched controls. The most significant SNPs came from the CHRNA3-CHRNA5-CHRNB4 region on chromosome 15q24–25.1, which has been replicated in many other studies. Our results confirm that this CHRNA region is associated with both lung cancer development and smoking behavior. We also detected three significant SNPs—rs1800469, rs1982072, and rs2241714—in the promoter region of the TGFB1 gene on chromosome 19 (p = 1.46×10−5, 1.18×10−5, and 6.57×10−6, respectively). The SNP rs1800469 is reported to be associated with chronic obstructive pulmonary disease and lung cancer in cigarette smokers. The present study is the first GWA study to replicate this result. Signals in the 3q26 region were also identified in the meta-analysis. 
We demonstrate that an intermediate phenotype can enhance the power of complex disease association analysis and that the modified meta-analysis method robustly incorporates intermediate phenotypes or other quantitative risk factors into the analysis.
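For orientation, the standard (unmodified) fixed-effect inverse-variance weighted combination can be sketched as follows. This is only the textbook estimator, not the modified method proposed in the abstract, whose details are not given here. The function name `ivw_meta` and the example inputs are hypothetical, and the per-SNP effect estimates are assumed to be on a comparable scale.

```python
import math

def ivw_meta(betas, ses):
    """Textbook fixed-effect inverse-variance weighted combination.

    betas: per-analysis effect estimates for one SNP (assumed comparable in scale)
    ses:   their standard errors
    Returns (combined beta, combined SE, z statistic, two-sided p-value).
    """
    # Each estimate is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Hypothetical inputs: one estimate from a disease-status analysis and one
# from an intermediate-phenotype analysis of the same SNP.
beta, se, z, p = ivw_meta([0.2, 0.4], [0.1, 0.2])
```

The more precise estimate (smaller standard error) dominates the combined effect, which is why adding a well-measured intermediate phenotype can increase power.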
The genetic basis of sporadic colorectal cancer (CRC) is not well explained by known risk polymorphisms. Here we perform a meta-analysis of two genome-wide association studies in 2,627 cases and 3,797 controls of Japanese ancestry and 1,894 cases and 4,703 controls of African ancestry, to identify genetic variants that contribute to CRC susceptibility. We replicate genome-wide statistically significant associations (P < 5×10−8) in 16,823 cases and 18,211 controls of European ancestry. This study reveals a new pan-ethnic CRC risk locus at 10q25 (rs12241008, intronic to VTI1A; P=1.4×10−9), providing additional insight into the etiology of CRC and highlighting the value of association mapping in diverse populations.
Glioma is a rare, but highly fatal, cancer that accounts for the majority of malignant primary brain tumors. Inherited predisposition to glioma has been consistently observed within non-syndromic families. Our previous studies, which involved non-parametric and parametric linkage analyses, both yielded significant linkage peaks on chromosome 17q. Here, we use data from next generation and Sanger sequencing to identify familial glioma candidate genes and variants on chromosome 17q for further investigation. We applied a filtering schema to narrow the original list of 4830 annotated variants down to 21 very rare (<0.1% frequency), non-synonymous variants. Our findings implicate the MYO19 and KIF18B genes and rare variants in SPAG9 and RUNDC1 as candidates worthy of further investigation. Burden testing and functional studies are planned.
Recent genome-wide association studies (GWASs) have identified common genetic variants at 5p15.33, 6p21–6p22 and 15q25.1 associated with lung cancer risk. Several other genetic regions including variants of CHEK2 (22q12), TP53BP1 (15q15) and RAD52 (12p13) have been demonstrated to influence lung cancer risk in candidate- or pathway-based analyses. To identify novel risk variants for lung cancer, we performed a meta-analysis of 16 GWASs, totaling 14 900 cases and 29 485 controls of European descent. Our data provided increased support for previously identified risk loci at 5p15 (P = 7.2 × 10−16), 6p21 (P = 2.3 × 10−14) and 15q25 (P = 2.2 × 10−63). Furthermore, we demonstrated histology-specific effects for 5p15, 6p21 and 12p13 loci but not for the 15q25 region. Subgroup analysis also identified a novel disease locus for squamous cell carcinoma at 9p21 (CDKN2A/p16INK4A/p14ARF/CDKN2B/p15INK4B/ANRIL; rs1333040, P = 3.0 × 10−7) which was replicated in a series of 5415 Han Chinese (P = 0.03; combined analysis, P = 2.3 × 10−8). This large analysis provides additional evidence for the role of inherited genetic susceptibility to lung cancer and insight into biological differences in the development of the different histological types of lung cancer.
Biological pathways provide rich information and biological context on the genetic causes of complex diseases. The logistic kernel machine test integrates prior knowledge on pathways in order to analyze data from genome-wide association studies (GWAS). Here, the kernel converts the genomic information of two individuals into a quantitative value reflecting their genetic similarity. By selecting a kernel, one implicitly chooses a genetic effect model. As with many other pathway methods, none of the available kernels accounts for the topological structure of the pathway or for gene-gene interaction types. However, evidence indicates that the connectivity and neighborhood of genes are crucial in the context of GWAS, because genes associated with a disease often interact. Thus, we propose a novel kernel that incorporates the topology of pathways and information on interactions. Using simulation studies, we demonstrate that the proposed method maintains the type I error correctly and can be more effective in identifying pathways associated with a disease than non-network-based methods. We apply our approach to genome-wide association case-control data on lung cancer and rheumatoid arthritis. We identify some promising new pathways associated with these diseases, which may improve our current understanding of the genetic mechanisms.
Kernel Machine Test; Pathways; Networks; Gene-Gene Interactions; Score Test; Generalized Linear Model; Lung Cancer; Rheumatoid Arthritis; Disease Association; Genetic Association Studies
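To make the idea of a kernel as a pairwise genetic similarity concrete, the sketch below contrasts a plain linear kernel with one illustrative way of letting pathway topology enter the similarity. The network-aware construction here is an assumption chosen for simplicity, not the kernel proposed in the abstract. The names `linear_kernel` and `network_kernel`, the genotype matrix `G`, and the adjacency matrix `A` are hypothetical.

```python
import numpy as np

def linear_kernel(G):
    """Linear kernel: K[i, j] is the inner product of the genotype
    vectors of individuals i and j (rows of G, columns are markers)."""
    return G @ G.T

def network_kernel(G, A):
    """Illustrative network-aware kernel (an assumption, not the paper's).

    A is a symmetric 0/1 adjacency matrix over the columns of G. Each
    marker's value is first augmented with the values of its neighbors,
    then a linear kernel is taken, so K = (G S)(G S)^T with S = I + A.
    This form is positive semidefinite by construction.
    """
    S = np.eye(A.shape[0]) + A
    Z = G @ S
    return Z @ Z.T

# Hypothetical toy data: 2 individuals, 3 markers coded 0/1/2,
# with markers 0 and 1 connected in the pathway.
G = np.array([[0.0, 1.0, 2.0],
              [1.0, 1.0, 0.0]])
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
K_lin = linear_kernel(G)
K_net = network_kernel(G, A)
```

In this toy example the off-diagonal similarity rises once connected markers share signal, which is the qualitative behavior a topology-aware kernel is meant to capture.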
After a prolonged period of increasing rates of lung cancer incidence and mortality for both men and women, incidence and mortality rates are decreasing in men and stabilizing in women. The goal of this study was to assess changes over 20 years in the prevalence of known risk factors for lung cancer and to elucidate possible predictors associated with lung cancer survival.
The study included a total of 908 patients with primary lung cancer referred to The University of Texas M. D. Anderson Cancer Center over three study periods: 1985–1989 (N=392), 1993–1997 (N=216), and 2000–2004 (N=300). Detailed questionnaires were used to collect information from the patients. Hazard ratios were estimated by fitting a Cox proportional hazards model. Using the Kaplan-Meier method, survival in months was calculated up to 2 years from the date of diagnosis to achieve comparability across the three groups.
We observed a decrease in the proportion of patients who were current cigarette smokers and increases in the proportions of patients who presented with adenocarcinoma of the lung, who were obese, and who presented with localized disease. We also found an increase in the number of patients who reported a family history of lung cancer. The overall median survival duration increased from 12.0 months in 1985–1989 to 17.5 months in 2000–2004. The probability of survival to 2 years after diagnosis also increased (from 26.5% in 1985–1989 to 40.8% in 2000–2004). Overall, women had a better median survival than men.
The results show that the demographic, histologic, clinical, and outcome variables of patients with lung cancer have changed over the past 20 years. Most important, the survival of patients with lung cancer has improved.
Lung cancer survival; time-trends; predictors for lung cancer survival
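The Kaplan-Meier calculation with follow-up truncated at 2 years, as described in the methods above, can be sketched as follows. This is a minimal product-limit estimator under stated assumptions, not the study's analysis code. The function name `kaplan_meier`, the `horizon` parameter, and the toy data are hypothetical.

```python
def kaplan_meier(times, events, horizon=None):
    """Product-limit (Kaplan-Meier) survival estimate.

    times:   follow-up time per patient, in months
    events:  1 if the patient died at that time, 0 if censored
    horizon: optional cutoff; follow-up beyond it is treated as
             censored at the cutoff (e.g. 24 months)
    Returns a list of (event time, estimated survival probability).
    """
    data = []
    for t, e in zip(times, events):
        if horizon is not None and t > horizon:
            t, e = horizon, 0  # administrative censoring at the cutoff
        data.append((t, e))

    surv = 1.0
    curve = []
    # Step through the distinct times at which at least one death occurred.
    for t in sorted({t for t, e in data if e == 1}):
        at_risk = sum(1 for u, _ in data if u >= t)
        deaths = sum(1 for u, e in data if u == t and e == 1)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

# Hypothetical toy cohort of five patients, truncated at 24 months.
curve = kaplan_meier([5, 10, 10, 20, 30], [1, 1, 0, 1, 1], horizon=24)
```

Truncating every group at the same horizon keeps the comparison fair when later cohorts have had less time to accrue long follow-up.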
We conducted imputation to the 1000 Genomes Project of four genome-wide association studies of lung cancer in populations of European ancestry (11,348 cases and 15,861 controls) and genotyped an additional 10,246 cases and 38,295 controls for follow-up. We identified large-effect genome-wide associations for squamous lung cancer with the rare variants of BRCA2-K3326X (rs11571833; odds ratio [OR]=2.47, P=4.74×10−20) and of CHEK2-I157T (rs17879961; OR=0.38, P=1.27×10−13). We also showed an association between common variation at 3q28 (TP63; rs13314271; OR=1.13, P=7.22×10−10) and lung adenocarcinoma previously only reported in Asians. These findings provide further evidence for inherited genetic susceptibility to lung cancer and its biological basis. Additionally, our analysis demonstrates that imputation can identify rare disease-causing variants having substantive effects on cancer risk from pre-existing GWAS data.