Motivation: Multivariate tests derived from the logistic regression model are widely used to assess the joint effect of multiple predictors on a disease outcome in case–control studies. These tests become less optimal if the joint effect cannot be approximated adequately by the additive model. The tree-structure model is an attractive alternative, as it is more apt to capture non-additive effects. However, the tree model is used most commonly for prediction and seldom for hypothesis testing, mainly because of the computational burden associated with the resampling-based procedure required for estimating the significance level.
Results: We designed a fast algorithm for building the tree-structure model and proposed a robust TREe-based Association Test (TREAT) that incorporates an adaptive model selection procedure to identify the optimal tree model representing the joint effect. We applied TREAT as a multilocus association test on >20 000 genes/regions in a study of esophageal squamous cell carcinoma (ESCC) and detected a highly significant novel association between the gene CDKN2B and ESCC (). We also demonstrated, through simulation studies, the power advantage of TREAT over other commonly used tests.
Availability and implementation: The package TREAT is freely available for download at http://www.hanzhang.name/softwares/treat, implemented in C++ and R and supported on 64-bit Linux and 64-bit MS Windows.
Supplementary information: Supplementary data are available at Bioinformatics online.
Our GWAS of smoking and bladder cancer risk based on data from 5,424 cases and 10,162 controls suggests that exploring additive and multiplicative gene–environment interactions can identify novel susceptibility loci that are associated with risk for different subgroups.
Bladder cancer is a complex disease with known environmental and genetic risk factors. We performed a genome-wide interaction study (GWAS) of smoking and bladder cancer risk based on primary scan data from 3002 cases and 4411 controls from the National Cancer Institute Bladder Cancer GWAS. Alternative methods were used to evaluate both additive and multiplicative interactions between individual single nucleotide polymorphisms (SNPs) and smoking exposure. SNPs with interaction P values < 5 × 10−5 were evaluated further in an independent dataset of 2422 bladder cancer cases and 5751 controls. We identified 10 SNPs that showed association in a consistent manner with the initial dataset and in the combined dataset, providing evidence of interaction with tobacco use. Further, two of these novel SNPs showed strong evidence of association with bladder cancer in tobacco use subgroups that approached genome-wide significance. Specifically, rs1711973 (FOXF2) on 6p25.3 was a susceptibility SNP for never smokers [combined odds ratio (OR) = 1.34, 95% confidence interval (CI) = 1.20–1.50, P value = 5.18 × 10−7]; and rs12216499 (RSPH3-TAGAP-EZR) on 6q25.3 was a susceptibility SNP for ever smokers (combined OR = 0.75, 95% CI = 0.67–0.84, P value = 6.35 × 10−7). In our analysis of smoking and bladder cancer, the tests for multiplicative interaction seemed to more commonly identify susceptibility loci with associations in never smokers, whereas the additive interaction analysis identified more loci with associations among smokers, including the known smoking and NAT2 acetylation interaction. Our findings provide additional evidence of gene–environment interactions for tobacco and bladder cancer.
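The difference between the two interaction scales can be illustrated with a minimal sketch. It assumes odds ratios are expressed relative to the doubly unexposed group (non-carriers who never smoked) and uses the standard relative excess risk due to interaction (RERI) as the additive measure. The function name and the input values are hypothetical and are not taken from the study.

```python
def interaction_measures(or_g, or_e, or_ge):
    """Return (multiplicative ratio, RERI) for a gene-environment pair.

    or_g:  odds ratio for carriers who are unexposed
    or_e:  odds ratio for non-carriers who are exposed
    or_ge: odds ratio for carriers who are exposed
    All three are relative to unexposed non-carriers.
    """
    ratio = or_ge / (or_g * or_e)      # multiplicative scale: 1 means no interaction
    reri = or_ge - or_g - or_e + 1.0   # additive scale: 0 means no interaction
    return ratio, reri


# Hypothetical odds ratios chosen so that the two scales disagree
ratio, reri = interaction_measures(or_g=1.2, or_e=3.0, or_ge=3.6)
print(f"multiplicative ratio = {ratio:.2f}, RERI = {reri:.2f}")
```

With these illustrative inputs the joint effect is exactly the product of the separate effects, so there is no interaction on the multiplicative scale, yet the RERI is positive. This is one reason the two kinds of test can flag different loci.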
As increasing evidence suggests that multiple correlated genetic variants could jointly influence the outcome, a multilocus test that aggregates association evidence across multiple genetic markers in a considered gene or a genomic region may be more powerful than a single-marker test for detecting susceptibility loci. We propose a multilocus test, AdaJoint, which adopts a variable selection procedure to identify a subset of genetic markers that jointly show the strongest association signal, and defines the test statistic based on the selected genetic markers. The P-value from the AdaJoint test is evaluated by a computationally efficient algorithm that effectively adjusts for multiple comparisons, and is hundreds of times faster than the standard permutation method. Simulation studies demonstrate that AdaJoint has the most robust performance among several commonly used multilocus tests. We perform multilocus analysis of over 26 000 genes/regions on two genome-wide association studies of pancreatic cancer. Compared with its competitors, AdaJoint identifies a much stronger association between the gene CLPTM1L and pancreatic cancer risk (6.0 × 10−8), with the signal optimally captured by two correlated single-nucleotide polymorphisms (SNPs). Finally, we show AdaJoint as a powerful tool for mapping cis-regulating methylation quantitative trait loci on normal breast tissues, and find many CpG sites whose methylation levels are jointly regulated by multiple SNPs nearby.
genome-wide association study; cis-regulating meQTLs mapping; multilocus test; variable selection; multiple comparisons; pathway analysis
A genome-wide association study (GWAS) of bladder cancer identified a genetic marker rs8102137 within the 19q12 region as a novel susceptibility variant. This marker is located upstream of the CCNE1 gene, which encodes cyclin E, a cell cycle protein. We performed genetic fine mapping analysis of the CCNE1 region using data from two bladder cancer GWAS (5,942 cases and 10,857 controls). We found that the original GWAS marker rs8102137 represents a group of 47 linked SNPs (with r2≥0.7) associated with increased bladder cancer risk. From this group we selected a functional promoter variant rs7257330, which showed strong allele-specific binding of nuclear proteins in several cell lines. In both GWAS, rs7257330 was associated only with aggressive bladder cancer, with a combined per-allele odds ratio (OR) =1.18 (95%CI=1.09-1.27, p=4.67×10−5) vs. OR =1.01 (95%CI=0.93-1.10, p=0.79) for non-aggressive disease, with p=0.0015 for case-only analysis. Cyclin E protein expression analyzed in 265 bladder tumors was increased in aggressive tumors (p=0.013) and, independently, with each rs7257330-A risk allele (ptrend=0.024). Over-expression of recombinant cyclin E in cell lines caused significant acceleration of cell cycle. In conclusion, we defined the 19q12 signal as the first GWAS signal specific for aggressive bladder cancer. Molecular mechanisms of this genetic association may be related to cyclin E over-expression and alteration of cell cycle in carriers of CCNE1 risk variants. In combination with established bladder cancer risk factors and other somatic and germline genetic markers, the CCNE1 variants could be useful for inclusion into bladder cancer risk prediction models.
Aggressive bladder cancer; cyclin E; cell cycle; single nucleotide polymorphism; GWAS
Candidate gene and genome-wide association studies (GWAS) have identified 11 independent susceptibility loci associated with bladder cancer risk. To discover additional risk variants, we conducted a new GWAS of 2422 bladder cancer cases and 5751 controls, followed by a meta-analysis with two independently published bladder cancer GWAS, resulting in a combined analysis of 6911 cases and 11 814 controls of European descent. TaqMan genotyping of 13 promising single nucleotide polymorphisms with P < 1 × 10−5 was pursued in a follow-up set of 801 cases and 1307 controls. Two new loci achieved genome-wide statistical significance: rs10936599 on 3q26.2 (P = 4.53 × 10−9) and rs907611 on 11p15.5 (P = 4.11 × 10−8). Two notable loci were also identified that approached genome-wide statistical significance: rs6104690 on 20p12.2 (P = 7.13 × 10−7) and rs4510656 on 6p22.3 (P = 6.98 × 10−7); these require further studies for confirmation. In conclusion, our study has identified new susceptibility alleles for bladder cancer risk that require fine-mapping and laboratory investigation, which could further understanding into the biological underpinnings of bladder carcinogenesis.
We conducted a joint (pooled) analysis of three genome-wide association studies (GWAS) 1-3 of esophageal squamous cell carcinoma (ESCC) in ethnic Chinese (5,337 ESCC cases and 5,787 controls) with 9,654 ESCC cases and 10,058 controls for follow-up. In a logistic regression model adjusted for age, sex, study, and two eigenvectors, two new loci achieved genome-wide significance, marked by rs7447927 at 5q31.2 (per-allele odds ratio (OR) = 0.85, 95% CI 0.82-0.88; P=7.72x10−20) and rs1642764 at 17p13.1 (per-allele OR= 0.88, 95% CI 0.85-0.91; P=3.10x10−13). rs7447927 is a synonymous single nucleotide polymorphism (SNP) in TMEM173 and rs1642764 is an intronic SNP in ATP1B2, near TP53. Furthermore, a locus in the HLA class II region at 6p21.32 (rs35597309) achieved genome-wide significance in the two populations at highest risk for ESCC (OR=1.33, 95% CI 1.22-1.46; P=1.99x10−10). Our joint analysis identified new ESCC susceptibility loci overall as well as a new locus unique to the ESCC high risk Taihang Mountain region.
Populations in north central China are at high risk for gastric cancers (GC), and altered FAS-mediated cell signaling and/or apoptosis may contribute to this risk. We examined the association of 554 single nucleotide polymorphisms (SNPs) in 53 Fas signaling-related genes using a pathway-based approach in 1758 GC cases (1126 gastric cardia adenocarcinomas (GCA) and 632 gastric noncardia adenocarcinomas (GNCA)), and 2111 controls from a genome-wide association study (GWAS) of GC in ethnic Chinese. SNP associations with risk of overall GC, GCA and GNCA were evaluated using unconditional logistic regressions controlling for age, sex and study. Gene- and pathway-based associations were tested using the adaptive rank-truncated product (ARTP) method. Statistical significance was evaluated empirically by permutation. Significant pathway-based associations were observed for Fas signaling with risk of overall GC (P = 5.5E-04) and GCA (P = 6.3E-03), but not GNCA (P = 8.1E-02). Among examined genes in the Fas signaling pathway, MAP2K4, FAF1, MAPK8, CASP10, CASP8, CFLAR, MAP2K1, CASP8AP2, PAK2 and IKBKB were associated with risk of GC (nominal P < 0.05), and FAF1 and MAPK8 were significantly associated with risk of both GCA and GNCA (nominal P < 0.05). Our examination of genetic variation in the Fas signaling pathway is consistent with an association of altered Fas signaling and/or apoptosis with risk of GC. As one of the first attempts to investigate a pathway-level association, our results suggest that these genes and the Fas signaling pathway warrant further evaluation in relation to GC risk in other populations.
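The abstract names the ARTP method without describing its steps. The following is a minimal sketch of the procedure as it is commonly described: combine the K smallest SNP-level P values for several candidate K, take the best result, and calibrate it against permutations. It assumes the permutation P values have already been computed by re-running the SNP-level tests on datasets with shuffled case/control labels. The function name, the inputs and the truncation points are illustrative assumptions, not the software used in the study.

```python
import math


def artp_pvalue(observed, permuted, truncation_points):
    """Adaptive rank-truncated product P value (illustrative sketch).

    observed: SNP-level P values from the real data (all > 0).
    permuted: list of lists, one set of SNP-level P values per permutation.
    truncation_points: candidate numbers K of smallest P values to combine.
    """
    datasets = [observed] + permuted  # index 0 is the real data

    def rtp(pvals, k):
        # Sum of logs of the k smallest P values (more negative = stronger signal)
        return sum(math.log(p) for p in sorted(pvals)[:k])

    n = len(datasets)
    min_p = [1.0] * n
    for k in truncation_points:
        stats = [rtp(d, k) for d in datasets]
        for b in range(n):
            # Empirical P value of dataset b at this truncation point
            p_k = sum(s <= stats[b] for s in stats) / n
            min_p[b] = min(min_p[b], p_k)
    # Comparing the observed minimum with the minima of the same permutations
    # accounts for having searched over several truncation points
    return sum(m <= min_p[0] for m in min_p) / n


# Hypothetical P values for a 3-SNP gene and 4 permutations
observed = [0.001, 0.2, 0.5]
permuted = [[0.3, 0.6, 0.9], [0.4, 0.5, 0.7], [0.25, 0.8, 0.35], [0.6, 0.45, 0.95]]
gene_p = artp_pvalue(observed, permuted, truncation_points=[1, 2, 3])
print(gene_p)
```

With only four permutations the smallest attainable P value is 1/5, so a real analysis would need many more permutations to reach the significance levels reported above.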
Gastric cancer; gastric cardia; gastric noncardia; Fas signaling; genetic variants; GWAS; single nucleotide polymorphisms; pathway genes
We conducted imputation to the 1000 Genomes Project of four genome-wide association studies of lung cancer in populations of European ancestry (11,348 cases and 15,861 controls) and genotyped an additional 10,246 cases and 38,295 controls for follow-up. We identified large-effect genome-wide associations for squamous lung cancer with the rare variants of BRCA2-K3326X (rs11571833; odds ratio [OR]=2.47, P=4.74×10−20) and of CHEK2-I157T (rs17879961; OR=0.38, P=1.27×10−13). We also showed an association between common variation at 3q28 (TP63; rs13314271; OR=1.13, P=7.22×10−10) and lung adenocarcinoma previously only reported in Asians. These findings provide further evidence for inherited genetic susceptibility to lung cancer and its biological basis. Additionally, our analysis demonstrates that imputation can identify rare disease-causing variants having substantive effects on cancer risk from pre-existing GWAS data.
The genetic regulation of the human epigenome is not fully appreciated. Here we describe the effects of genetic variants on the DNA methylome in human lung based on methylation-quantitative trait loci (meQTL) analyses. We report 34,304 cis- and 585 trans-meQTLs, a genetic-epigenetic interaction of surprising magnitude, including a regulatory hotspot. These findings are replicated in both breast and kidney tissues and show distinct patterns: cis-meQTLs mostly localize to CpG sites outside of genes, promoters, and CpG islands (CGIs), while trans-meQTLs are over-represented in promoter CGIs. meQTL SNPs are enriched in CTCF binding sites, DNaseI hypersensitivity regions and histone marks. Importantly, 4 of the 5 established lung cancer risk loci in European ancestry are cis-meQTLs and, in aggregate, cis-meQTLs are enriched for lung cancer risk in a genome-wide analysis of 11,587 subjects. Thus, inherited genetic variation may affect lung carcinogenesis by regulating the human methylome.
Dysplastic nevi (DN) is a strong risk factor for cutaneous malignant melanoma (CMM), and it frequently occurs in melanoma-prone families. To identify genetic variants for DN, we genotyped 677 tagSNPs in 38 melanoma candidate genes that are involved in pigmentation, DNA repair, cell cycle control, and melanocyte proliferation pathways in a total of 504 individuals (310 with DN, 194 without DN) from 53 melanoma-prone families (23 CDKN2A mutation positive and 30 negative). Conditional logistic regression, conditioning on families, was used to estimate the association between DN and each SNP separately, adjusted for age, sex, CMM and CDKN2A status. P-values for SNPs in the same gene were combined to yield gene-specific p-values. Two genes, CDK6 and XRCC1, were significantly associated with DN after Bonferroni correction for multiple testing (P=0.0001 and 0.00025, respectively), whereas neither gene was significantly associated with CMM. Associations for CDK6 SNPs were stronger in CDKN2A mutation positive families (rs2079147, Pinteraction=0.0033), whereas XRCC1 SNPs had similar effects in mutation-positive and negative families. The association for one of the associated SNPs in XRCC1 (rs25487) was replicated in two independent datasets (random effect meta-analysis: P<0.0001). Our findings suggest that some genetic variants may contribute to DN risk independently of their association with CMM in melanoma-prone families.
The DNA repair pathways help to maintain genomic integrity and therefore genetic variation in the pathways could affect the propensity to develop cancer. Selected germline single nucleotide polymorphisms (SNPs) in the pathways have been associated with esophageal cancer and gastric cancer (GC) but few studies have comprehensively examined the pathway genes. We aimed to investigate associations between DNA repair pathway genes and risk of esophageal squamous cell carcinoma (ESCC) and GC, using data from a genome-wide association study in a Han Chinese population where ESCC and GC are the predominant cancers. In sum, 1942 ESCC cases, 1758 GC cases and 2111 controls from the Shanxi Upper Gastrointestinal Cancer Genetics Project (discovery set) and the Linxian Nutrition Intervention Trials (replication set) were genotyped for 1675 SNPs in 170 DNA repair-related genes. Logistic regression models were applied to evaluate SNP-level associations. Gene- and pathway-level associations were determined using the resampling-based adaptive rank-truncated product approach. The DNA repair pathways overall were significantly associated with risk of ESCC (P = 6.37 × 10−4), but not with GC (P = 0.20). The most significant gene in ESCC was CHEK2 (P = 2.00 × 10−6) and in GC was CLK2 (P = 3.02 × 10−4). We observed several other genes significantly associated with either ESCC (SMUG1, TDG, TP53, GTF2H3, FEN1, POLQ, HEL308, RAD54B, MPG, FANCE and BRCA1) or GC risk (MRE11A, RAD54L and POLE) (P < 0.05). We provide evidence for an association between specific genes in the DNA repair pathways and the risk of ESCC and GC. Further studies are warranted to validate these associations and to investigate underlying mechanisms.
Elevated resting heart rate is associated with greater risk of cardiovascular disease and mortality. In a 2-stage meta-analysis of genome-wide association studies in up to 181,171 individuals, we identified 14 new loci associated with heart rate and confirmed associations with all 7 previously established loci. Experimental downregulation of gene expression in Drosophila melanogaster and Danio rerio identified 20 genes at 11 loci that are relevant for heart rate regulation and highlight a role for genes involved in signal transmission, embryonic cardiac development and the pathophysiology of dilated cardiomyopathy, congenital heart failure and/or sudden cardiac death. In addition, genetic susceptibility to increased heart rate is associated with altered cardiac conduction and reduced risk of sick sinus syndrome, and both heart rate–increasing and heart rate–decreasing variants associate with risk of atrial fibrillation. Our findings provide fresh insights into the mechanisms regulating heart rate and identify new therapeutic targets.
With recent advances in sequencing, genotyping arrays, and imputation, GWAS now aim to identify associations with rare and uncommon genetic variants. Here, we describe and evaluate a class of statistics, generalized score statistics (GSS), that can test for an association between a group of genetic variants and a phenotype. GSS are a simple weighted sum of single-variant statistics and their cross-products. We show that the majority of statistics currently used to detect associations with rare variants are equivalent to choosing a specific set of weights within this framework. We then evaluate the power of various weighting schemes as a function of variant characteristics, such as MAF, the proportion associated with the phenotype, and the direction of effect. Ultimately, we find that two classical tests are robust and powerful, but details are provided as to when other GSS may perform favorably. The software package CRaVe is available at our website (http://dceg.cancer.gov/bb/tools/crave).
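The description of GSS as a weighted sum of single-variant statistics and their cross-products can be sketched as a quadratic form. The sketch assumes the single-variant score statistics are already available and shows two weight choices: one that squares a weighted sum (a burden-type test) and one that drops the cross-products (a sum of squared scores). All function names and numbers are hypothetical and do not reflect the CRaVe interface.

```python
def generalized_score_statistic(scores, weights):
    """Quadratic form: sum over j, k of weights[j][k] * scores[j] * scores[k]."""
    m = len(scores)
    return sum(weights[j][k] * scores[j] * scores[k]
               for j in range(m) for k in range(m))


def burden_weights(w):
    # Includes every cross-product, so the statistic equals (sum_j w[j] * U_j)^2
    return [[wj * wk for wk in w] for wj in w]


def diagonal_weights(w):
    # Drops the cross-products, so the statistic equals sum_j (w[j] * U_j)^2
    m = len(w)
    return [[w[j] ** 2 if j == k else 0.0 for k in range(m)] for j in range(m)]


# Hypothetical single-variant score statistics with effects in opposite directions
scores = [1.5, -1.5, 0.5]
w = [1.0, 1.0, 1.0]
burden = generalized_score_statistic(scores, burden_weights(w))
squared = generalized_score_statistic(scores, diagonal_weights(w))
print(burden, squared)
```

In this toy input the opposing effects cancel in the burden-type statistic but not in the sum of squares, which illustrates why the direction of effect matters when comparing weighting schemes.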
rare variants; score test; GWAS; association test
In China, esophageal cancer is the fourth leading cause of cancer death where essentially all cases are histologically esophageal squamous cell carcinoma (ESCC), in contrast to esophageal adenocarcinoma in the West. Globally, ESCC is 2.4 times more common among men than women and recently it has been suggested that sex hormones may be associated with the risk of ESCC. We examined the association between genetic variants in sex hormone metabolic genes and ESCC risk in a population from north central China with high-incidence rates. A total of 1026 ESCC cases and 1452 controls were genotyped for 797 unique tag single-nucleotide polymorphisms (SNPs) in 51 sex hormone metabolic genes. SNP-, gene- and pathway-based associations with ESCC risk were evaluated using unconditional logistic regression adjusted for age, sex and geographical location and the adaptive rank truncated product (ARTP) method. Statistical significance was determined through use of permutation for pathway- and gene-based associations. No associations were observed for the overall sex hormone metabolic pathway (P = 0.14) or subpathways (androgen synthesis: P = 0.30, estrogen synthesis: P = 0.15 and estrogen removal: P = 0.19) with risk of ESCC. However, six individual genes (including SULT2B1, CYP1B1, CYP3A7, CYP3A5, SHBG and CYP11A1) were significantly associated with ESCC risk (P < 0.05). Our examination of genetic variation in the sex hormone metabolic pathway is consistent with a potential association with risk of ESCC. These positive findings warrant further evaluation in relation to ESCC risk and replication in other populations.
Neuronal nicotinic acetylcholine receptor (nAChR) genes (CHRNA5/CHRNA3/CHRNB4) have been reproducibly associated with nicotine dependence, smoking behaviors, and lung cancer risk. Of the few reports that have focused on early smoking behaviors, association results have been mixed. This meta-analysis examines early smoking phenotypes and SNPs in the gene cluster to determine: (1) whether the most robust association signal in this region (rs16969968) for other smoking behaviors is also associated with early behaviors, and/or (2) if additional statistically independent signals are important in early smoking. We focused on two phenotypes: age of tobacco initiation (AOI) and age of first regular tobacco use (AOS). This study included 56,034 subjects (41 groups) spanning nine countries and evaluated five SNPs including rs1948, rs16969968, rs578776, rs588765, and rs684513. Each dataset was analyzed using a centrally generated script. Meta-analyses were conducted from summary statistics. AOS yielded significant associations with SNPs rs578776 (beta = 0.02, P = 0.004), rs1948 (beta = 0.023, P = 0.018), and rs684513 (beta = 0.032, P = 0.017), indicating protective effects. There were no significant associations for the AOI phenotype. Importantly, rs16969968, the most replicated signal in this region for nicotine dependence, cigarettes per day, and cotinine levels, was not associated with AOI (P = 0.59) or AOS (P = 0.92). These results provide important insight into the complexity of smoking behavior phenotypes, and suggest that association signals in the CHRNA5/A3/B4 gene cluster affecting early smoking behaviors may be different from those affecting the mature nicotine dependence phenotype.
CHRNA5; CHRNA3; CHRNB4; meta-analysis; nicotine; smoke
We conducted a genome-wide association study of gastric cancer (GC) and esophageal squamous cell carcinoma (ESCC) in ethnic Chinese subjects in which we genotyped 551,152 single nucleotide polymorphisms (SNPs). We report a combined analysis of 2,240 GC cases, 2,115 ESCC cases, and 3,302 controls drawn from five studies. In logistic regression models adjusted for age, sex, and study, multiple variants at 10q23 had genome-wide significance for GC and ESCC independently. A notable signal was rs2274223, a nonsynonymous SNP located in PLCE1, for GC (P=8.40×10−10; per allele odds ratio (OR) = 1.31) and ESCC (P=3.85×10−9; OR = 1.34). The association with GC differed by anatomic subsite. For tumors located in the cardia the association was stronger (P=4.19 × 10−15; OR= 1.57) and for those located in the noncardia stomach it was absent (P=0.44; OR=1.05). Our findings at 10q23 could provide insight into the high incidence rates of both cancers in China.
Recent evidence suggests a link between constitutional telomere length (TL) and cancer risk. Previous studies have suggested that longer telomeres were associated with an increased risk of melanoma and larger size and number of nevi. The goal of this study was to examine whether TL modified the risk of melanoma in melanoma-prone families with and without CDKN2A germline mutations.
Materials and Methods
We measured TL in blood DNA in 119 cutaneous malignant melanoma (CMM) cases and 208 unaffected individuals. We also genotyped 13 tagging SNPs in TERT.
We found that longer telomeres were associated with an increased risk of CMM (adjusted OR = 2.81, 95% CI = 1.02–7.72, P = 0.04). The association of longer TL with CMM risk was seen in CDKN2A- cases but not in CDKN2A+ cases. Among CMM cases, the presence of solar injury was associated with shorter telomeres (P = 0.002). One SNP in TERT, rs2735940, was significantly associated with TL (P = 0.002) after Bonferroni correction.
Our findings suggest that TL regulation could be variable by CDKN2A mutation status, sun exposure, and pigmentation phenotype. Therefore, TL measurement alone may not be a good marker for predicting CMM risk.
Recent studies have shown an association between cigarettes per day (CPD) and a nonsynonymous single-nucleotide polymorphism in CHRNA5, rs16969968.
To determine whether the association between rs16969968 and smoking is modified by age at onset of regular smoking.
Available genetic studies containing measures of CPD and the genotype of rs16969968 or its proxy.
Uniform statistical analysis scripts were run locally. Starting with 94 050 ever-smokers from 43 studies, we extracted the heavy smokers (CPD >20) and light smokers (CPD ≤10) with age-at-onset information, reducing the sample size to 33 348. Each study was stratified into early-onset smokers (age at onset ≤16 years) and late-onset smokers (age at onset >16 years), and a logistic regression of heavy vs light smoking with the rs16969968 genotype was computed for each stratum. Meta-analysis was performed within each age-at-onset stratum.
Individuals with 1 risk allele at rs16969968 who were early-onset smokers were significantly more likely to be heavy smokers in adulthood (odds ratio [OR]=1.45; 95% CI, 1.36–1.55; n=13 843) than were carriers of the risk allele who were late-onset smokers (OR = 1.27; 95% CI, 1.21–1.33, n = 19 505) (P = .01).
These results highlight an increased genetic vulnerability to smoking in early-onset smokers.
Relationships are unclear between polymorphisms in genes involved in metabolism and detoxification of various chemicals and papillary thyroid cancer (PTC) risk as well as their potential modification by alcohol or tobacco intake. We evaluated associations between 1647 tagging single nucleotide polymorphisms (SNPs) in 132 candidate genes/regions involved in metabolism of exogenous and endogenous compounds (Phase I/II, oxidative stress, and metal binding pathways) and PTC risk in 344 PTC cases and 452 controls. For 15 selected regions and their respective SNPs, we also assessed interaction with alcohol and tobacco use. Logistic regression models were used to evaluate the main effect of SNPs (Ptrend) and interaction with alcohol/tobacco intake. Gene- and pathway-level associations and interactions (Pgene interaction) were evaluated by combining Ptrend values using the adaptive rank-truncated product method. While we found associations between PTC risk and nine SNPs (Ptrend≤0.01) and seven genes/regions (Pregion<0.05), none remained significant after correction for the false discovery rate. We found a significant interaction between UGT2B7 and NAT1 genes and alcohol intake (Pgene interaction=0.01 and 0.02 respectively) and between the CYP26B1 gene and tobacco intake (Pgene interaction=0.02). Our results are suggestive of interaction between the genetic polymorphisms in several detoxification genes and alcohol or tobacco intake on risk of PTC. Larger studies with improved exposure assessment should address potential modification of PTC risk by alcohol and tobacco intake to confirm or refute our findings.
Cutaneous malignant melanoma (CMM) is an etiologically heterogeneous disease with genetic, environmental (sun exposure) and host (pigmentation/nevi) factors, and their interactions contributing to risk. Genetic variants in DNA repair genes may be particularly important since their altered function in response to sun exposure-related DNA damage may be related to risk for CMM. However, systematic evaluations of genetic variants in DNA repair genes are limited, particularly in high-risk families.
We comprehensively analyzed DNA repair gene polymorphisms and CMM risk in melanoma-prone families with/without CDKN2A mutations. A total of 586 individuals (183 CMM) from 53 families (23 CDKN2A (+), 30 CDKN2A (−)) were genotyped for 2964 tagSNPs in 131 DNA repair genes. Conditional logistic regression, conditioning on families, was used to estimate trend p-values, odds ratios and 95% confidence intervals for the association between CMM and each SNP separately, adjusted for age and sex. P-values for SNPs in the same gene were combined to yield gene specific p-values. Two genes, POLN and PRKDC, were significantly associated with melanoma after Bonferroni correction for multiple testing (p=0.0003 and 0.00035, respectively). DCLRE1B showed suggestive association (p=0.0006). 28~56% of genotyped SNPs in these genes had single SNP p<0.05. The most significant SNPs in POLN and PRKDC had similar effects in CDKN2A (+) and CDKN2A (−) families. Our finding suggests that polymorphisms in DNA repair genes, POLN and PRKDC, were associated with increased melanoma risk in melanoma families with and without CDKN2A mutations.
Genome-wide association studies have identified susceptibility loci for esophageal squamous cell carcinoma (ESCC). We conducted a meta-analysis of all single-nucleotide polymorphisms (SNPs) that showed nominally significant P-values in two previously published genome-wide scans that included a total of 2961 ESCC cases and 3400 controls. The meta-analysis revealed five SNPs at 2q33 with P< 5 × 10−8, and the strongest signal was rs13016963, with a combined odds ratio (95% confidence interval) of 1.29 (1.19–1.40) and P= 7.63 × 10−10. An imputation analysis of 4304 SNPs at 2q33 suggested a single association signal, and the strongest imputed SNP associations were similar to those from the genotyped SNPs. We conducted an ancestral recombination graph analysis with 53 SNPs to identify one or more haplotypes that harbor the variants directly responsible for the detected association signal. This showed that the five SNPs exist in a single haplotype along with 45 imputed SNPs in strong linkage disequilibrium, and the strongest candidate was rs10201587, one of the genotyped SNPs. Our meta-analysis found genome-wide significant SNPs at 2q33 that map to the CASP8/ALS2CR12/TRAK2 gene region. Variants in CASP8 have been extensively studied across a spectrum of cancers with mixed results. The locus we identified appears to be distinct from the widely studied rs3834129 and rs1045485 SNPs in CASP8. Future studies of esophageal and other cancers should focus on comprehensive sequencing of this 2q33 locus and functional analysis of rs13016963 and rs10201587 and other strongly correlated variants.
Accumulating evidence suggests that alterations in immune function may be important in the etiology of papillary thyroid cancer (PTC). To identify genetic markers in immune-related pathways, we evaluated 3,985 tag single nucleotide polymorphisms (SNPs) in 230 candidate gene regions (adhesion-extravasation-migration, arachidonic acid metabolism/eicosanoid signaling, complement and coagulation cascade, cytokine signaling, innate pathogen detection and antimicrobials, leukocyte signaling, TNF/NF-kB pathway or other) in a case-control study of 344 PTC cases and 452 controls. We used logistic regression models to estimate odds ratios (OR) and calculate one degree of freedom P values of linear trend (PSNP-trend) for the association between genotype (common homozygous, heterozygous, variant homozygous) and risk of PTC. To correct for multiple comparisons, we applied the false discovery rate method (FDR). Gene region- and pathway-level associations (PRegion and PPathway) were assessed by combining individual PSNP-trend values using the adaptive rank truncated product method. Two SNPs (rs6115, rs6112) in the SERPINA5 gene were significantly associated with risk of PTC (PSNP-FDR/PSNP-trend = 0.02/6×10−6 and PSNP-FDR/PSNP-trend = 0.04/2×10−5, respectively). These associations were independent of a history of autoimmune thyroiditis (OR = 6.4; 95% confidence interval: 3.0–13.4). At the gene region level, SERPINA5 was suggestively associated with risk of PTC (PRegion-FDR/PRegion = 0.07/0.0003). Overall, the complement and coagulation cascade pathway was the most significant pathway (PPathway = 0.02) associated with PTC risk largely due to the strong effect of SERPINA5. Our results require replication but suggest that the SERPINA5 gene, which codes for the protein C inhibitor involved in many biological processes including inflammation, may be a new susceptibility locus for PTC.
Hormonal differences are hypothesized to contribute to the approximately ≥2-fold higher thyroid cancer incidence rates among women compared with men worldwide. Although thyroid cancer cells express estrogen receptors and estrogen has a proliferative effect on papillary thyroid cancer (PTC) cells in vitro, epidemiologic studies have not found clear associations between thyroid cancer and female hormonal factors. We hypothesized that polymorphic variation in hormone pathway genes is associated with the risk of developing papillary thyroid cancer.
We evaluated the association between PTC and 1151 tag single nucleotide polymorphisms (SNPs) in 58 candidate gene regions involved in sex hormone synthesis and metabolism, gonadotropins, and prolactin in a case-control study of 344 PTC cases and 452 controls, frequency matched on age and sex. Odds ratios and p-values for the linear trend for the association between each SNP genotype and PTC risk were estimated using unconditional logistic regression. SNPs in the same gene region or pathway were aggregated using adaptive rank-truncated product methods to obtain gene region-specific or pathway-specific p-values. To account for multiple comparisons, we applied the false discovery rate method.
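The aggregation of SNP-level p-values into a gene region-level p-value can be sketched as follows. This is a simplified illustration of an adaptive rank truncated product calculation, not the authors' implementation. It assumes that null p-value vectors have already been obtained by recomputing the SNP tests after permuting case-control labels. The function names, inputs, and example values are hypothetical.

```python
import math


def rtp_statistic(pvalues, k):
    """Negative log of the product of the k smallest p-values.

    Larger values indicate stronger evidence. Assumes every p-value is > 0.
    """
    smallest = sorted(pvalues)[:k]
    return -sum(math.log(p) for p in smallest)


def artp_pvalue(observed, null_sets, truncation_points):
    """Permutation-based adaptive rank truncated product p-value (sketch).

    observed: SNP-level p-values for one gene region.
    null_sets: p-value vectors from label-permuted data (assumed input).
    truncation_points: candidate numbers of top SNPs to combine.
    """
    all_sets = [observed] + list(null_sets)  # index 0 is the observed data
    n = len(all_sets)
    min_p = [1.0] * n
    for k in truncation_points:
        stats = [rtp_statistic(p, k) for p in all_sets]
        for b in range(n):
            # Empirical p-value of set b for this truncation point.
            p_k = sum(1 for s in stats if s >= stats[b]) / n
            min_p[b] = min(min_p[b], p_k)
    # Compare the observed minimum p-value against the same quantity
    # computed for each permuted set, which adjusts for choosing the best k.
    return sum(1 for value in min_p if value <= min_p[0]) / n


# Hypothetical inputs: three SNPs in one region, three permutations.
observed = [0.001, 0.002, 0.5]
null_sets = [[0.3, 0.6, 0.9], [0.2, 0.5, 0.8], [0.4, 0.7, 0.95]]
region_p = artp_pvalue(observed, null_sets, truncation_points=[1, 2])
```

With only three permutations the smallest attainable p-value is 1/4. A real analysis would use many more permutations to resolve small p-values.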
Seven SNPs had p-values for linear trend <0.01, including four in the CYP19A1 gene, but none of the SNPs remained significant after correction for multiple comparisons. Results were similar when restricting the dataset to women. p-values for examined gene regions and for all genes combined were ≥0.09.
Based on these results, SNPs in selected hormone pathway genes do not appear to be strongly related to PTC risk. This observation is in accord with the lack of consistent associations between hormonal factors and PTC risk in epidemiologic studies.
In an analysis of 31,717 cancer cases and 26,136 cancer-free controls drawn from 13 genome-wide association studies (GWAS), we observed large chromosomal abnormalities in a subset of clones from DNA obtained from blood or buccal samples. Mosaic chromosomal abnormalities, either aneuploidy or copy-neutral loss of heterozygosity, of size >2 Mb were observed in autosomes of 517 individuals (0.89%) with abnormal cell proportions between 7% and 95%. In cancer-free individuals, the frequency increased with age: 0.23% under 50 and 1.91% between 75 and 79 (p=4.8×10^−8). Mosaic abnormalities were more frequent in individuals with solid tumors (0.97% versus 0.74% in cancer-free individuals, OR=1.25, p=0.016), with a stronger association for cases who had DNA collected prior to diagnosis or treatment (OR=1.45, p=0.0005). Detectable clonal mosaicism was more common in individuals for whom DNA was collected at least one year prior to diagnosis of leukemia than in cancer-free individuals (OR=35.4, p=3.8×10^−11). These findings underscore the importance of the role and time-dependent nature of somatic events in the etiology of cancer and other late-onset diseases.
Cancer is an important cause of morbidity in the elderly, and many medical conditions and treatments influence cancer risk. The Surveillance, Epidemiology, and End Results (SEER)-Medicare database can be used to conduct population-based case-control studies that elucidate the etiology of cancer among the US elderly. SEER-Medicare links data on malignancies ascertained through SEER cancer registries to claims from Medicare, the US government insurance program for people over age 65 years. Under one approach described herein, elderly cancer cases are ascertained from SEER data (1987–2005). Matched controls are selected from a 5% random sample of Medicare beneficiaries. Risk factors of interest, including medical conditions and procedures, are identified by using linked Medicare claims. Strengths of this design include the ready availability of data, representative sampling from the US elderly population, and large sample size (e.g., under one scenario: 1,176,950 cases, including 221,389 prostate cancers, 185,853 lung cancers, 138,041 breast cancers, and 124,442 colorectal cancers; and 100,000 control subjects). Limitations reflect challenges in exposure assessment related to Medicare claims: restricted range of evaluable risk factors, short time before diagnosis/selection for ascertainment, and inaccuracies in claims. With awareness of limitations, investigators have in SEER-Medicare data a valuable resource for epidemiologic research on cancer etiology.
aged; case-control studies; data collection; epidemiologic methods; Medicare; neoplasms; risk factors; SEER Program