Conceived and designed the experiments: IM JDF MG NC SJC DTS NR. Performed the experiments: ZW KBJ AH LB. Analyzed the data: IM JF QY DM. Contributed reagents/materials/analysis tools: NM AP LP FXR MT DA MPP MK YF WT AT CS AC RG JL AJ MS AS GA AB AJJ RWD SMG SJW JV NEC MTL JFF. Wrote the paper: IM JF.
Pathway analysis of genome-wide association studies (GWAS) offers a unique opportunity to collectively evaluate genetic variants with effects that are too small to be detected individually. We applied a pathway analysis to a bladder cancer GWAS containing data from 3,532 cases and 5,120 controls of European background (n=5 studies). Thirteen hundred and ninety-nine pathways were drawn from five publicly available resources (BioCarta, KEGG, NCI-PID, HumanCyc, and Reactome), and we constructed 22 additional candidate pathways previously hypothesized to be related to bladder cancer. In total, 1,421 pathways, 5,647 genes and ~90,000 SNPs were included in our study. A logistic regression model adjusting for age, sex, study, DNA source, and smoking status was used to assess the marginal trend effect of SNPs on bladder cancer risk. Two complementary pathway-based methods (gene-set enrichment analysis [GSEA] and adapted rank-truncated product [ARTP]) were used to assess the enrichment of association signals within each pathway. Eighteen pathways were detected by either GSEA or ARTP at P≤0.01. To minimize false positives, we used the I2 statistic to identify SNPs displaying heterogeneous effects across the five studies. After removing these SNPs, seven pathways (‘Aromatic amine metabolism’ [PGSEA=0.0100, PARTP=0.0020], ‘NAD biosynthesis’ [PGSEA=0.0018, PARTP=0.0086], ‘NAD salvage’ [PARTP=0.0068], ‘Clathrin derived vesicle budding’ [PARTP=0.0018], ‘Lysosome vesicle biogenesis’ [PGSEA=0.0023, PARTP<0.00012], ‘Retrograde neurotrophin signaling’ [PGSEA=0.0084], and ‘Mitotic metaphase/anaphase transition’ [PGSEA=0.0040]) remained. These pathways seem to belong to three fundamental cellular processes (metabolic detoxification, mitosis, and clathrin-mediated vesicles). Identification of the aromatic amine metabolism pathway provides support for the ability of this approach to identify pathways with established relevance to bladder carcinogenesis.
Genome-wide association studies (GWAS) have served as a useful tool to identify common genetic variants associated with various complex traits. As expected, each variant explains only a tiny portion of the heritable component of its associated phenotype. Recently, Park and colleagues estimated that some proportion of the ‘missing heritability’ may reside in additional common low-penetrance susceptibility variants that can be discovered in larger GWAS. In principle, other methods could complement the primary single-locus tests of GWAS in identifying additional susceptibility loci. One such approach is pathway (gene-set) analysis, which examines whether association signals of a collection of functionally related loci (typically genes) consistently deviate from what is expected by chance. This approach may suggest new candidate susceptibility loci and possibly provide insights into the mechanisms underlying complex traits. Pathway-based analyses have been applied to GWAS of complex diseases, including multiple sclerosis, type-2 diabetes, Crohn's disease, Parkinson's disease, and colon and breast cancers.
Bladder cancer is the fourth most common malignancy among men in the western world. Epidemiological studies have shown that exposure to aromatic amines (AAs) from tobacco smoking or occupation is strongly associated with bladder cancer risk. Additionally, genetic studies have demonstrated that functional polymorphisms in two genes involved in carcinogen metabolism (N-acetyltransferase 2 [NAT2] and glutathione S-transferase M1 [GSTM1]) are associated with bladder cancer risk. Notably, the risk of bladder cancer associated with the NAT2 slow acetylation genotype is restricted to smokers. Recently, a series of GWAS have identified previously unknown susceptibility loci for bladder cancer, with the prospect of more to be discovered. To identify additional regions that harbor plausible candidate genes and shed further light on the genetic basis of this disease, we applied pathway analysis to the first stage of the NCI's CGEMS bladder cancer GWAS containing 3,532 cases and 5,120 controls. We report here seven pathways, implicated in diverse carcinogenic processes, that are enriched with bladder cancer susceptibility loci.
We applied our analyses to primary scan data of 591,637 SNPs from NCI's bladder cancer GWAS containing 3,532 cases and 5,120 controls of European ancestry from five studies (Spanish Bladder Cancer Study [SBCS], New England, Maine and Vermont Bladder Cancer Study [NEBCS-ME/VT], Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study [ATBC], the American Cancer Society Cancer Prevention Study II Nutrition Cohort [CPS-II], and the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial [PLCO]).
We collected gene-sets from five publicly available pathway resources: BioCarta, Kyoto Encyclopedia of Genes and Genomes (KEGG), NCI's Pathway Interaction Database (PID), Reactome, and Encyclopedia of Homo sapiens Genes and Metabolism (HumanCyc). We included only pathways containing 5–100 genes, to avoid testing functional categories that are defined too narrowly or too broadly. In addition, we constructed 22 candidate pathways (Table S2), based on known bladder cancer risk factors and general carcinogenic processes, which were not represented in the public databases above. Specifically, selection of genes was determined through 1) biochemical data for the detoxification of aromatic amines; 2) Ingenuity pathway lists; and 3) Gene Ontology lists.
To explore the similarity between pathways in our database, we assessed the percentage of overlapping genes between each two pathways (A and B) as:
where NA and NB are the number of genes within pathways A and B.
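As a rough illustration of this kind of pairwise comparison, the sketch below computes a percentage overlap between two gene sets. The normalization by the smaller of NA and NB is an assumption made for illustration only and may differ from the definition used in the study. The function name is hypothetical, and the two gene lists are small illustrative subsets built from gene names mentioned in the results, not complete pathway definitions.

```python
def overlap_percent(pathway_a, pathway_b):
    """Percentage of shared genes, relative to the smaller pathway.

    Dividing by min(N_A, N_B) is an ASSUMED normalization, chosen only
    to make the example concrete.
    """
    genes_a, genes_b = set(pathway_a), set(pathway_b)
    if not genes_a or not genes_b:
        raise ValueError("both pathways must contain at least one gene")
    shared = genes_a & genes_b
    n_a, n_b = len(genes_a), len(genes_b)
    return 100.0 * len(shared) / min(n_a, n_b)


# Illustrative subsets only (hypothetical pathway contents).
nad_one = ["NMNAT1", "NMNAT2", "NMNAT3", "QPRT"]
nad_two = ["NMNAT1", "NMNAT2", "NMNAT3", "ACP6", "ITGB1BP3", "ACPL2"]
print(overlap_percent(nad_one, nad_two))
```

With this normalization, a small pathway that is fully contained in a larger one scores 100%, which makes nested gene-sets easy to spot.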
SNPs from the first stage of the NCI bladder cancer GWAS were mapped to genes in these pathways if they were located in a region encompassing 20 kb 5′ upstream and 10 kb 3′ downstream from the genes' coding regions (NCBI's human genome build 36.3). These gene boundaries were chosen to capture most of each gene's coding and regulatory variants while minimizing the overlap between genes. Overall, 1,421 pathways containing 5,647 genes (24.3±21.7 [mean ± SD] genes per pathway) and ~92,000 SNPs were included in our database. A complete list of the studied pathways is available in Table S1.
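The mapping rule above can be sketched as a simple interval check. This is a minimal illustration, not the study's pipeline: the `Gene` record, the function names, and all coordinates are hypothetical, and the strand handling (swapping the two margins for genes on the minus strand) is an assumption about how "5′ upstream" and "3′ downstream" translate into genomic coordinates.

```python
from collections import namedtuple

# Hypothetical gene record: coordinates are 1-based, start <= end.
Gene = namedtuple("Gene", "name chrom start end strand")

UPSTREAM = 20_000    # 20 kb on the 5' side of the gene
DOWNSTREAM = 10_000  # 10 kb on the 3' side of the gene


def gene_window(gene):
    """Return (low, high) bounds of the mapping window for one gene."""
    if gene.strand == "+":
        return gene.start - UPSTREAM, gene.end + DOWNSTREAM
    # Minus strand: the 5' end sits at the higher coordinate (assumption).
    return gene.start - DOWNSTREAM, gene.end + UPSTREAM


def map_snps_to_genes(snps, genes):
    """Map SNPs, given as (snp_id, chrom, position), to every gene whose window contains them."""
    mapping = {gene.name: [] for gene in genes}
    for snp_id, chrom, pos in snps:
        for gene in genes:
            low, high = gene_window(gene)
            if chrom == gene.chrom and low <= pos <= high:
                mapping[gene.name].append(snp_id)
    return mapping


# Hypothetical genes and SNPs, for illustration only.
genes = [
    Gene("GENE_PLUS", "8", 100_000, 120_000, "+"),
    Gene("GENE_MINUS", "8", 200_000, 220_000, "-"),
]
snps = [
    ("snp1", "8", 85_000),   # inside the 20 kb upstream margin of GENE_PLUS
    ("snp2", "8", 135_000),  # beyond the 10 kb downstream margin of GENE_PLUS
    ("snp3", "8", 185_000),  # beyond the 10 kb downstream margin of GENE_MINUS
    ("snp4", "8", 235_000),  # inside the 20 kb upstream margin of GENE_MINUS
    ("snp5", "2", 110_000),  # different chromosome
]
mapping = map_snps_to_genes(snps, genes)
print(mapping)
```

Because windows of neighboring genes can overlap, one SNP may be assigned to more than one gene under this rule.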
SNPs with MAF<1% among controls were excluded from the analysis. We fitted logistic regression models adjusted for age, sex, study center, DNA source (buccal/blood), and smoking status (current/former/never/occasional) to assess the marginal effect of each SNP (1 degree of freedom trend test) on the risk of bladder cancer, as previously described. For each gene Gj (j=1, …, N, where N is the total number of genes in our dataset), the SNP with the lowest p-value among all SNPs that were mapped to its region was selected to represent the gene in the pathway analysis. We used two approaches to test for overrepresentation of association signals within pathways in our database:
Because GSEA and ARTP assess the enrichment of gene-based signals within predefined gene-sets in different ways, using both methods may capture a broader range of candidate pathways for bladder cancer susceptibility.
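The gene-level summary that feeds both methods, in which each gene is represented by its most significant SNP, can be sketched as follows. The function name, SNP identifiers, and p-values are hypothetical and serve only as an illustration.

```python
def representative_snps(snp_pvalues, gene_to_snps):
    """For each gene, pick the mapped SNP with the lowest trend-test p-value.

    snp_pvalues:  dict of SNP id -> p-value
    gene_to_snps: dict of gene name -> list of SNP ids mapped to that gene
    Genes with no tested SNP are omitted from the result.
    """
    representatives = {}
    for gene, snp_ids in gene_to_snps.items():
        tested = [snp for snp in snp_ids if snp in snp_pvalues]
        if tested:
            best = min(tested, key=lambda snp: snp_pvalues[snp])
            representatives[gene] = (best, snp_pvalues[best])
    return representatives


# Hypothetical inputs for illustration.
pvalues = {"snp1": 0.30, "snp2": 0.004, "snp3": 0.04, "snp4": 0.62}
gene_map = {
    "GENE_A": ["snp1", "snp2"],
    "GENE_B": ["snp3", "snp4"],
    "GENE_C": ["snp9"],  # snp9 was not tested, so GENE_C is dropped
}
reps = representative_snps(pvalues, gene_map)
print(reps)
```

Taking the minimum p-value favors genes with many SNPs, so any significance assessment built on this summary needs to repeat the same selection within each permutation.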
Finally, we calculated a false discovery rate (FDR) to assess the proportion of expected false positive findings in the GSEA and ARTP analyses. In short, we normalized the GSEA and ARTP statistics for each pathway (NSs(GSEA) and NSs(ARTP) respectively) based on the mean and standard deviation of the corresponding permutation data. This procedure allows a direct comparison of pathways with different sizes and gene compositions. Then, we used these normalized statistics to calculate the FDR as:
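The normalization step described above, which rescales a pathway's observed statistic by the mean and standard deviation of its own permutation distribution, can be sketched as follows. The sketch covers only the normalization, not the FDR calculation. The function name and the numbers are hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption.

```python
from statistics import mean, stdev


def normalized_statistic(observed, permuted):
    """Standardize an observed pathway statistic against its permutation values.

    observed: the pathway's statistic in the real data
    permuted: the same statistic recomputed in each permuted dataset
    Uses the sample standard deviation (an assumption).
    """
    mu = mean(permuted)
    sigma = stdev(permuted)
    if sigma == 0:
        raise ValueError("permutation statistics have zero variance")
    return (observed - mu) / sigma


# Hypothetical values for one pathway.
ns = normalized_statistic(4.5, [1.0, 2.0, 3.0])
print(ns)
```

Because each pathway is standardized against its own null distribution, the resulting values are on a common scale regardless of how many genes the pathway contains.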
To minimize false positives, we estimated the I-squared statistic (I2) to identify SNPs displaying heterogeneous effects across the five studies (ATBC, CPS-II, NEBCS [ME, VT], PLCO, and SBCS). I2 describes the proportion of total variation in study estimates that is due to heterogeneity. In short, a meta-analysis was applied to every SNP belonging to one of the top pathways, using the genotype frequency counts of cases and controls to estimate per-allele ORs and CIs. SNPs with I2 P-values<0.2 were removed from further analyses. We evaluated the ORs, CIs, and P values under both meta-analysis models; they were similar and did not change the interpretation of the data. These analyses were done using STATA (Version 11, STATA Corporation, College Station, TX).
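As a hedged illustration of the heterogeneity screen, the sketch below computes Cochran's Q and the I2 statistic from per-study log odds ratios and standard errors, using the standard definition I2 = max(0, (Q − df)/Q). It is not the STATA procedure used in the study: the inverse-variance fixed-effect pooling, the function names, and the input values are assumptions, and the study started from genotype counts rather than pre-computed log ORs. The heterogeneity P-value uses the closed-form chi-square tail that holds for an even number of degrees of freedom, which covers the five-study case (df = 4).

```python
import math


def heterogeneity(log_ors, std_errs):
    """Return (pooled log OR, Cochran's Q, I2 in percent) for one SNP."""
    weights = [1.0 / se ** 2 for se in std_errs]  # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; valid only for even, positive df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires an even, positive df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Hypothetical per-study estimates for two SNPs across five studies.
ses = [0.05] * 5
_, q_homog, i2_homog = heterogeneity([0.10, 0.12, 0.08, 0.11, 0.09], ses)
_, q_heter, i2_heter = heterogeneity([0.40, -0.30, 0.10, 0.35, -0.25], ses)
p_homog = chi2_sf_even_df(q_homog, 4)
p_heter = chi2_sf_even_df(q_heter, 4)

# A SNP would be dropped under the paper's rule when its heterogeneity P < 0.2.
print(i2_homog, p_homog >= 0.2)
print(i2_heter, p_heter < 0.2)
```

The P<0.2 threshold is deliberately liberal: it removes SNPs with even modest evidence of inconsistent effects, at the cost of discarding some SNPs whose heterogeneity is due to chance.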
Overall, there was good correlation between the results of the GSEA and the ARTP methods (r=0.74, P<0.0001). A detailed examination of the results revealed that, on average, GSEA performed better in detecting pathways enriched with multiple weak association signals, while ARTP appeared to be more powerful in detecting pathways dominated by only a few genes with relatively strong signals. Notably, the AA metabolism pathway, which contains several known bladder cancer susceptibility loci, was detected by both GSEA and ARTP methods (PGSEA=0.0100, PARTP=0.0020). Therefore, we used its significance level as a reference for highlighting additional candidate susceptibility pathways. Of the 1,421 pathways examined, 18 were significantly enriched with association signals at the P≤0.01 level (Table 1). Of these, seven pathways were detected by both GSEA and ARTP, four pathways were detected only by GSEA, and seven were detected only by ARTP. After removing SNPs with heterogeneous effects across the five studies (I2 P-value<0.2), the enrichment signals remained significant (P≤0.01) in seven pathways belonging to four cellular processes (“aromatic amine [AA] metabolism”, “Nicotinamide adenine dinucleotide [NAD] metabolism”, “Clathrin-mediated vesicles”, and “Mitosis”). For clarity, from this point forward, we will refer only to the results from the post-heterogeneity analysis.
Table 2 displays the results for the genes in the AA pathway. The enrichment signals in this pathway were mainly driven by SNPs in the UGT1A9 and NAT2 genes. SNPs in these genes were identified in the primary analysis of this GWAS. Removing these two genes from the pathway analyses reduced the enrichment signal in the AA metabolism pathway in both methods, but the pathway still ranked relatively high using GSEA (PGSEA=0.0130, PARTP=0.1217). Apart from UGT1A9 and NAT2, five additional genes in this pathway had SNPs with significant genetic effects (Ptrend<0.05): NAT1, UGT1A4, UGT1A6, NQO1 and CYP1B1.
Some of the genes in the AA metabolism pathway (i.e. CYP1A1 and CYP1A2; UGT1A4, UGT1A6 and UGT1A9; SULT1A1 and SULT1A2) occur at the same chromosomal locus and consequently share similar tagging SNPs. To assess the effect of this redundancy on the pathway enrichment signal, we pooled together genes with overlapping SNPs and treated them as a single genetic unit in our pathway analyses. Consequently, the number of loci included in the AA metabolism pathway was reduced to seven (Table S2), and the corresponding enrichment signals were strengthened (PGSEA=0.0046, PARTP=0.0001). Even after removing the NAT2 and UGT1A regions from this gene-set, its corresponding enrichment signal remained relatively high (PGSEA=0.024, PARTP=0.0921).
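The pooling of genes with overlapping SNPs into single genetic units can be sketched as a connected-components problem: two genes belong to the same unit if they share a SNP, directly or through a chain of other genes. The union-find approach, the function name, and the SNP identifiers below are illustrative assumptions, not a description of how the pooling was actually carried out.

```python
def pool_genes_by_shared_snps(gene_to_snps):
    """Group genes into loci: genes sharing any SNP end up in the same locus."""
    parent = {gene: gene for gene in gene_to_snps}

    def find(gene):
        # Follow parent links to the group's root, compressing the path.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    first_gene_for_snp = {}
    for gene, snps in gene_to_snps.items():
        for snp in snps:
            if snp in first_gene_for_snp:
                # Merge this gene's group with the group already holding the SNP.
                parent[find(gene)] = find(first_gene_for_snp[snp])
            else:
                first_gene_for_snp[snp] = gene

    groups = {}
    for gene in gene_to_snps:
        groups.setdefault(find(gene), set()).add(gene)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical SNP assignments (s1, s2, ... are placeholder ids).
loci = pool_genes_by_shared_snps({
    "UGT1A4": ["s1", "s2"],
    "UGT1A6": ["s2", "s3"],
    "UGT1A9": ["s3"],
    "NAT2": ["s9"],
})
print(loci)
```

In this toy input, UGT1A4 and UGT1A9 share no SNP directly but are still pooled, because both overlap with UGT1A6.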
Two nicotinamide adenine dinucleotide (NAD) metabolism pathways were detected in this analysis. The “NAD biosynthesis I” pathway (HumanCyc) was detected by both GSEA and ARTP (PGSEA=0.0018, PARTP=0.0086), and the “NAD salvage II” pathway (HumanCyc) was detected only by the ARTP method (PARTP=0.0068). Table 3 presents the results for the genes in these pathways. The three NMNAT genes (NMNAT1, NMNAT2, and NMNAT3) that are shared by these two pathways harbor SNPs with significant genetic effects (Ptrend<0.05) and are therefore likely to dominate the significant enrichment signals in these pathways. Other genes displaying significant associations with bladder cancer risk are QPRT in the “NAD I” pathway, and ACP6, ITGB1BP3, and ACPL2 in the “NAD II” pathway.
Three pathways involved in clathrin-dependent vesicle biogenesis and budding were detected in this analysis. The “Lysosome Vesicle Biogenesis” pathway (Reactome) showed the strongest enrichment signal among all pathways in this study, and was detected by both GSEA and ARTP (PGSEA=0.0023, PARTP<0.0001). The “Clathrin derived vesicle budding” pathway (Reactome) was detected only by ARTP (PARTP=0.0018), while the “Retrograde neurotrophin signaling” pathway (Reactome) was detected only by GSEA (PGSEA=0.0084). Table 4 displays the results for the genes in these pathways. Three genes are shared by the three pathways: CLTA and CLTC, which encode the light and heavy chains of clathrin respectively, and SH3GL2, which is associated with clathrin-mediated endocytosis. Based on the association of their SNPs with bladder cancer risk, these three genes ranked among the top four genes in these pathways.
The “Mitotic metaphase/anaphase transition” pathway (Reactome) was detected by the GSEA method (PGSEA=0.0040) and was marginally significant using ARTP (PARTP=0.0187). Interestingly, all eight genes in this pathway are included in the more comprehensive “Mitotic prometaphase” pathway, which was detected in the initial pathway screening but had a less significant signal after removing SNPs with heterogeneous signals (Table 1). Results for the eight genes included in the “Mitotic metaphase/anaphase transition” pathway are presented in Table 5. Three SNPs in three genes (FBXO5, SMC3 and SPC24) were associated with a significant protective effect on bladder cancer (Ptrend<0.05).
Our pathway-based analysis of a large bladder cancer GWAS using two complementary pathway-based methods (GSEA and ARTP) identified an overrepresentation of association signals in seven pathways (‘Aromatic amine metabolism’, ‘NAD biosynthesis’, ‘NAD salvage’, ‘Clathrin derived vesicle budding’, ‘Lysosome vesicle biogenesis’, ‘Retrograde neurotrophin signaling’, and ‘Mitotic metaphase/anaphase transition’) and suggests involvement of at least three cellular processes (metabolic detoxification, mitosis, and clathrin-mediated vesicles).
The identification of the AA metabolism pathway in this study by both GSEA and ARTP could be considered a good indication of the utility of this approach, since AA metabolism has established relevance to bladder cancer susceptibility. Interestingly, the enrichment signal in this pathway is driven by variations in the UGT1A gene cluster and the NAT1, NAT2, and NQO1 genes (Table 1), which are involved in detoxification processes in the AA pathway. The strong enrichment signal left in this pathway even after the removal of the UGT1A and NAT2 genes from the analysis indicates that other genetic variations affecting aromatic amine detoxification may contribute to bladder cancer susceptibility.
The detection of the NAD metabolism pathway may be relevant to bladder cancer susceptibility through several carcinogenic mechanisms. First, NAD homeostasis has been shown to play a role in various redox reactions that may lead to irreversible cellular damage and consequently to the initiation of malignant tumors. In addition, NAD has been shown to be involved in DNA repair and telomere maintenance as well as in energy production, all of which are important processes in cancer development. Interestingly, the NAD metabolism pathway has been implicated in a recent pathway-based analysis of a colon cancer GWAS. Both colon and bladder cancers have been associated with NAT2 acetylation status. For bladder cancer, in which N-acetylation is a detoxification step, the NAT2 slow acetylator phenotype confers a higher risk. In contrast, for heterocyclic amine-related colon cancer, in which N-acetylation is negligible and O-acetylation is a carcinogen-activation step, the NAT2 rapid acetylator phenotype confers a higher risk. Thus, similar metabolic pathways could play diverse roles in the etiology of these two cancers.
Three clathrin-mediated vesicle pathways are also highlighted in this study. Clathrin-coated vesicles play an essential role in intracellular trafficking, endocytosis, and exocytosis. In this realm, it has been shown that clathrin-mediated vesicle pathways regulate the signaling and cellular localization of several growth factors that are known to play a role in cancer susceptibility. Interestingly, clathrin may also be relevant to the Mitotic Metaphase/Anaphase transition pathway that was implicated in this study. During mitosis, clathrin helps stabilize the kinetochore fibers, which are required for the proper function of the mitotic spindle. Thus, the overrepresentation of association signals in two distinct pathways associated with mitosis suggests that perturbations in the mitotic process, and particularly those related to the metaphase/anaphase transition, may modify the risk of human bladder cancer.
Strengths of our study are the large sample size; the use of primary scan data from five independent studies, allowing us to address consistency of effects across the different populations; and the use of two complementary pathway-based methods. A limitation of our study is that the pathway-based signals did not reach a noteworthy FDR significance level, with only one pathway (Lysosome Vesicle Biogenesis) having an FDR value <0.2. This could be partially due to the inherent limits of the methods used, the inadequate annotation of relevant pathways in public databases, or weak association signals in our data. Recent analyses of bladder cancers using RNA expression data have also highlighted enrichment of genes involved in processes similar to those we identified in our genomic data, including metabolic processes, which lends further plausibility to the relevance of the identified pathways to bladder cancer susceptibility. Furthermore, the high rank of the AA metabolism pathway in both GSEA and ARTP supports the power of these methods to highlight pathways with established relevance to bladder cancer susceptibility and may therefore similarly suggest the involvement of metabolic detoxification, mitosis and clathrin-mediated pathways in bladder carcinogenesis.
Table S1. Details and results for all 1,421 pathways included in this study.
Table S2. List of genes included in the 22 self-constructed candidate pathways.
We would like to thank Leslie Carroll (Information Management Services, Silver Spring, MD, USA), Gemma Castaño-Vinyals (Institut Municipal d'Investigació Mèdica, Barcelona, Spain), Fernando Fernández (Institut Municipal d'Investigació Mèdica, Barcelona, Spain), Paul Hurwitz (Westat, Inc., Rockville, MD, USA), Charles Lawrence (Westat, Inc., Rockville, MD, USA), Marta Lopez-Brea (Marqués de Valdecilla University Hospital, Santander, Cantabria, Spain), Anna McIntosh (Westat, Inc., Rockville, MD, USA), Angeles Panadero (Hospital Ciudad de Coria, Coria (Cáceres), Spain), Fernando Rivera (Marqués de Valdecilla University Hospital, Santander, Cantabria, Spain), Robert Saal (Westat, Rockville, MD, USA), Maria Sala (Institut Municipal d'Investigació Mèdica, Barcelona, Spain), Kirk Snyder (Information Management Services, Inc., Silver Spring, MD), Anne Taylor (Information Management Services, Inc., Silver Spring, MD), Montserrat Torà (Institut Municipal d'Investigació Mèdica, Barcelona, Spain), and Jane Wang (Information Management Services, Silver Spring, MD, USA).
Competing Interests: The authors have declared that no competing interests exist.
Funding: This project has been funded in part with federal funds from the National Cancer Institute, National Institutes of Health, under Contract No. HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. Support for individual studies that participated in the effort is as follows: SBCS (Dr. Silverman) - Intramural Research Program of the National Institutes of Health, National Cancer Institute, Division of Cancer Epidemiology and Genetics and intramural contract number NCI N02-CP-11015. FIS/Spain 98/1274, FIS/Spain 00/0745, PI061614, and G03/174, Fundació Marató TV3, Red Temática Investigación Cooperativa en Cáncer (RTICC), Consolíder ONCOBIO, EU-FP7-201663; and RO1- CA089715 and CA34627. NEBCS (Dr. Silverman) - Intramural research program of the National Institutes of Health, National Cancer Institute, Division of Cancer Epidemiology and Genetics and intramural contract number NCI N02-CP-01037 PLCO (Dr. Purdue) - The National Institutes of Health (NIH) Genes, Environment and Health Initiative (GEI) partly funded DNA extraction and statistical analyses (HG-06-033-NCI-01 and RO1HL091172-01), genotyping at the Johns Hopkins University Center for Inherited Disease Research (U01HG004438 and NIH HHSN268200782096C), and study coordination at the GENEVA (Dr. Caporaso)- The NIH Genes, Environment and Health Initiative [GEI] partly funded DNA extraction and statistical analyses (HG-06-033-NCI-01 and RO1HL091172-01), genotyping at the Johns Hopkins University Center for Inherited Disease Research (U01HG004438 and NIH HHSN268200782096C) and study coordination at the GENEVA Coordination Center (U01 HG004446) for EAGLE and part of PLCO studies. 
Genotyping for the remaining part of PLCO and all ATBC and CPS-II samples were supported by the Intramural Research Program of the National Institutes of Health, NCI, Division of Cancer Epidemiology and Genetics. The PLCO is supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics and supported by contracts from the Division of Cancer Prevention, National Cancer Institute, National Institutes of Health. ATBC (Dr. Albanes) - This research was supported in part by the Intramural Research Program of the NIH and the National Cancer Institute. Additionally, this research was supported by U.S. Public Health Service contracts N01-CN-45165, N01-RC-45035, and N01-RC-37004 from the National Cancer Institute, Department of Health and Human Services. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.