Gliomas account for approximately 80% of all primary malignant brain tumors and, despite improvements in clinical care over the last 20 years, remain among the most lethal tumors, underscoring the need for new insights that could translate into clinical advances. Recent genome-wide association studies (GWAS) have identified seven new susceptibility regions. We conducted a new independent GWAS of glioma using 1,856 cases and 4,955 controls (from 14 cohort studies, 3 case-control studies, and 1 population-based case-only study) and found evidence of strong replication for three of the seven previously reported associations at 20q13.33 (RTEL), 5p15.33 (TERT), and 9p21.3 (CDKN2BAS), and consistent association signals for the remaining four at 7p11.2 (EGFR both loci), 8q24.21 (CCDC26) and 11q23.3 (PHLDB1). The direction and magnitude of the signal were consistent for samples from cohort and case-control studies, but the strength of the association was more pronounced for loci rs6010620 (20q13.33; RTEL) and rs2736100 (5p15.33; TERT) in cohort studies despite the smaller number of cases in this group, likely because relatively more high-grade tumors were captured in the cohort studies. We further examined the 85 most promising single nucleotide polymorphism (SNP) markers identified in our study in three replication sets (5,015 cases and 11,601 controls), but no new markers reached genome-wide significance. Our findings suggest that larger studies focusing on novel approaches as well as specific tumor subtypes or subgroups will be required to identify additional common susceptibility loci for glioma risk.
Gliomas account for approximately 80% of all primary malignant brain tumors (Kohler et al. 2011) and, despite improvements in clinical care over the last 20 years, remain associated with considerable morbidity, with the most common histological subtype, glioblastoma (GBM), having a median survival of only 15 months (CBTRUS 2012). To date, the only established environmental risk factor is exposure to moderate-to-high doses of ionizing radiation (Bondy et al. 2008). A heritable component of glioma is supported by: the two-fold elevated risk in individuals with a positive family history (Hemminki et al. 2009; Malmer et al. 2003; Scheurer et al. 2007; Wrensch et al. 1997); an increased risk observed in rare genetic syndromes (Farrell and Plotkin 2007); a possible moderately penetrant risk locus in the 3′ untranslated region of TP53 (Stacey et al. 2011); and recent identification by genome-wide association studies (GWAS) of common susceptibility variants at 5p15.33 (TERT), 8q24.21 (CCDC26), 9p21.3 (CDKN2A-CDKN2B), 20q13.33 (RTEL1), 11q23.3 (PHLDB1), and two independent signals at 7p11.2 (EGFR) (Sanson et al. 2011; Shete et al. 2009; Wrensch et al. 2009).
To search for additional common genetic variants, we conducted a new independent GWAS in 1,856 cases and 4,955 controls ascertained from 14 cohort studies, 3 case-control studies, and 1 population-based case-only study (Table 1). Previous GWAS were based on case-control samples only. Our study was designed to include a large number of incident cases from cohort studies (556 of 1,856, i.e. 30% of all cases) to minimize potential bias toward glioma with longer survival.
Study-specific population characteristics are summarized in Table 1. The mean age of cases ranged from 48.7 years in the NIOSH Upper Midwest Health Study to 73.5 years in the Multi-Ethnic Cohort. 55.1% of glioma cases were of the glioblastoma subtype, with a larger percentage of high grade tumors (WHO III or IV) observed in cohort (74.7%) versus case-control (64.5%) studies (Supplementary Table 1).
After quality control metrics were applied to the scan data, 559,977 SNPs were available for analysis in 1,856 cases and 4,955 controls (details in Materials and Methods). Concordance between known duplicates was greater than 99.95%. The main effect model was adjusted for sex, age, study and seven eigenvectors (to account for small differences in population substructure). Examination of the Q-Q plot indicated the likelihood of additional loci associated with glioma risk (Figure 1). The genomic control lambda for the study was estimated at 1.006, suggesting that the results were not appreciably inflated by differences in the underlying population substructure.
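As an illustration of the genomic control lambda quoted above, the following is a minimal sketch, not the study's own pipeline. It assumes lambda is computed in the conventional way, as the median of the observed 1-degree-of-freedom chi-square statistics divided by the null median (about 0.455). The function name and the synthetic p-values are hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_control_lambda(p_values):
    """Observed median 1-df chi-square statistic divided by its null median.

    p_values must lie in (0, 1]; each two-sided p-value is mapped back to
    the statistic z**2, where z is the standard normal quantile at p / 2.
    """
    chi2_stats = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Hypothetical input: evenly spaced p-values, mimicking a scan with no inflation.
    n = 999
    null_p = [(i - 0.5) / n for i in range(1, n + 1)]
    print(round(genomic_control_lambda(null_p), 3))
```

Values close to 1, such as the 1.006 reported here, indicate that the bulk of the test statistics follow the null distribution. Values well above 1 would point to systematic inflation.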
The results of this genome-wide association scan confirmed the previously reported seven regions as risk susceptibility loci for glioma (Figure 2). Specifically, we replicated three of seven previously reported associations at 20q13.33 (RTEL), 5p15.33 (TERT), and 9p21.3 (CDKN2BAS) (Table 2). Associations for the remaining loci were consistent with reported findings with respect to the direction of the odds ratios, but were not statistically significant at the genome-wide level (i.e. p < 5.0 × 10⁻⁸). When results were examined separately for samples from cohort versus case-control studies, the direction and magnitude of the signal were generally consistent. However, the strength of the association was more pronounced for loci rs6010620 (20q13.33; RTEL) and rs2736100 (5p15.33; TERT) in the cohort studies despite the smaller number of cases in this group (Table 2a). Conversely, the strength of the association for loci at 11q23.3 (PHLDB1) and 9p21.3 (CDKN2BAS) was higher in case-control studies.
We further examined associations for previously reported loci by gender and tumor subtype (Tables 3 and 4). In analyses by gender, the signals at 8q24.21 (CCDC26) rs4295627 and 7p11.2 (EGFR) rs2252586 were stronger in women compared with men in our data (p-value for heterogeneity 0.0037 and 0.057, respectively). However, this effect modification by gender was not observed in the joint data from the UK, US-MDA, French and German replication groups. By tumor subtypes, the three regions most strongly associated with glioma risk overall at 5p15.33 (TERT), 9p21.3 (CDKN2B) and 20q13.33 (RTEL1), were mainly associated with glioblastoma. Associations with the marker at 8q24.21 (CCDC26) appeared more pronounced for oligodendroglioma, while the signal at 11q23.3 (PHLDB1) was preferentially associated with low-grade glioma.
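The text does not state which heterogeneity test produced the p-values by gender. One common choice, shown here purely as an assumed sketch, compares two stratum-specific log odds ratios with a 1-degree-of-freedom z-test (equivalent to Cochran's Q for two strata). The function name and input values are hypothetical and are not taken from the study tables.

```python
import math


def heterogeneity_p(log_or_a, se_a, log_or_b, se_b):
    """Two-sided p-value for a difference between two independent log odds ratios."""
    # Under the null of no effect modification, the difference of the two
    # estimates is approximately normal with variance se_a**2 + se_b**2.
    z = (log_or_a - log_or_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical per-allele estimates for women and men.
    p_het = heterogeneity_p(math.log(1.5), 0.10, math.log(1.1), 0.09)
    print(f"p for heterogeneity = {p_het:.4f}")
```

A small p-value from such a test suggests that the per-allele effect differs between strata. As the joint replication data show, an isolated finding of this kind can still arise by chance.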
In addition to previously reported loci, we identified 85 previously unreported loci with associations of p-trend ≤ 4.0 × 10⁻⁴ after removing probable genotyping artifacts, known associations, and highly correlated SNP markers (r² > 0.6). We performed an in silico replication by meta-analysis with data from three previously reported GWAS, which provided data on a total of 5,015 cases and 11,601 controls (Table 1) (Sanson et al. 2011; Shete et al. 2009; Wrensch et al. 2009). Summary measures (odds ratios and 95% confidence intervals) were obtained from each study, and a meta-analysis was performed using an inverse-variance fixed-effect model. However, none of these associations reached statistical significance at the genome-wide association level (Supplementary Table 2). A similar exercise was undertaken for 85 promising loci identified in combined data from the UK, US-MDA, French and German replication groups, but again, none of these associations reached statistical significance at p < 5.0 × 10⁻⁸ (Supplementary Table 3).
In this study, we present the data from a new independent GWAS of glioma based on 1,856 cases and 4,955 controls. While we did not observe any novel locus that reached genome-wide significance, the new scan provides further evidence for confirmation of the established loci. Similar to previously published reports, we note that TERT rs2736100, CDKN2B rs4977756, and RTEL1 rs6010620 were most strongly associated with glioblastoma, CCDC26 rs4295627 with oligodendroglioma, and PHLDB1 rs498872 with low-grade glioma (Egan et al. 2011; Jenkins et al. 2011; Simon et al. 2010). These results suggest different genetic etiologies for different subtypes of glioma, and underscore the importance of considering tumor heterogeneity in GWAS studies.
Although we observed differential associations for the two loci on 8q24.21 (CCDC26) and 7p11.2 (EGFR) by gender in our data, effect modification by gender was not observed for these loci in the joint data from the UK, US-MDA, French and German replication groups, suggesting that the observed gender differences in our data could have been due to chance. However, it will be important to re-examine potential effect modification by gender in larger datasets, along with consideration of potential risk covariates of interest such as allergy or smoking (Lachance et al. 2011; Schoemaker et al. 2010).
Previous GWAS of glioma were based on case-control studies only, which would generally not include rapidly fatal gliomas. One concern with results from these studies is that associations may be influenced by survival and therefore potentially biased toward glioma with longer survival. It is noteworthy that in our GWAS scan, the strength of the association was more pronounced for rs6010620 (20q13.33; RTEL) and rs2736100 (5p15.33; TERT) in the cohort studies despite the smaller number of cases in this group. These regions have been particularly associated with high-grade glioma in other studies (Egan et al. 2011; Jenkins et al. 2011; Simon et al. 2010), and the differences in cohort versus case-control results in our scan likely reflect the fact that a higher proportion of highly fatal tumors (WHO Grade III and IV) was captured by the cohort studies than by the case-control studies. Similarly, stronger results for the CCDC26 and PHLDB1 variants in the case-control studies are consistent with previous associations of these loci with low-grade tumors. Nonetheless, overall results from GliomaScan, which comprised a large number of incident gliomas from cohort studies, support GWAS associations based on previous case-control studies. Our data thus suggest that previously reported associations are generalizable to incident glioma cases.
Our study had adequate power to detect variants of moderate effect sizes for common allele frequencies. However, we did not observe additional signals with in silico analysis in three previously reported scans totaling 5,015 cases and 11,601 controls. This suggests that the underlying architecture of genetic susceptibility to glioma may not include as large a proportion of common variants as has been seen for other cancers to date. Alternatively, the underlying heterogeneity of glioma may limit our ability to identify more highly significant variants. For example, recent advances in understanding of glioma subtypes (e.g. proneural, neural, mesenchymal) based on gene expression (2008; Phillips et al. 2006), somatic mutations (e.g. IDH1) (Yan et al. 2009) and global patterns of methylation (glioma CpG island methylator phenotype; G-CIMP) (Noushmehr et al. 2010) suggest there are important subgroups of glioma that may represent distinct pathological entities. Still, given the relatively small sizes of the glioma scans to date, and in order to comprehensively define the catalog of common variants associated with risk for glioma (Park et al. 2010), further genome-wide association studies will need to involve sufficiently large study populations along with analysis of tumor subtypes to assess these risks.
Studies participating in GliomaScan are described in Table 1, and comprise 1,856 glioma cases and 4,955 controls from 14 cohort studies, 3 case-control studies, and 1 community-based case-only study. Cases were newly diagnosed glioma [ICDO-3 codes 9380-9480 or equivalent], and controls were cancer-free at the time of glioma diagnosis. Cases and 2,429 newly genotyped controls (pre-QC) were scanned with the Illumina 660W chip. Newly genotyped controls for this project were selected in a 2:1 ratio, frequency matched on age, sex and race/ethnicity. GWAS data were already available on 2,591 controls and 12 cases from cohorts that had participated in the PANSCAN study (pancreatic cancer GWAS), CGEMS studies (Hunter et al. 2007; Landi et al. 2009; Yeager et al. 2007) and the NCI lung cancer GWAS (Landi et al. 2009). These were scanned with the commercial HumanHap 550 or HumanHap 610 Illumina SNP arrays.
We conducted a new genome-wide association scan of glioma (GliomaScan) to validate previously reported risk regions and to attempt to identify additional novel risk loci. Details of the 19 studies participating in GliomaScan are provided in Table 1. We evaluated 85 additional loci of potential interest by conducting a fixed-effects meta-analysis using in silico data from 3 previously reported genome-wide association scans in a total of 5,015 cases and 11,601 controls (Sanson et al. 2011; Shete et al. 2009; Wrensch et al. 2009).
All GliomaScan samples were genotyped at the NCI Core Genotyping Facility (CGF, Division of Cancer Epidemiology and Genetics (DCEG), National Cancer Institute, Bethesda, USA). Samples from the United Kingdom, MD Anderson Cancer Center, France and Germany were genotyped as described previously (Sanson et al. 2011; Shete et al. 2009; Wrensch et al. 2009). Summary estimates were provided from previously genotyped studies for the purpose of meta-analysis.
Genotyping was attempted for a total of 5,084 GliomaScan samples on Illumina 660W arrays at the CGF. After excluding 6 samples due to lab processing error, 5,078 samples remained (2,215 cases, 2,859 new controls and 4 QC samples). Genotype clusters were estimated with high performing samples having overall completion rates greater than 98%, and genotype calls for the rest of the samples were based on the clusters defined by the high performing samples only. Additionally, 2,591 previously scanned (on 550 or 610 chips) controls and 12 previously scanned individual cases from ATBC, CLUE, CPSII, HPFS, NHS, NYUWHS, PHS, PLCO, SMWHS and WHS were included.
SNP assays were excluded if they had a completion rate below 90% or showed extreme deviation from Hardy-Weinberg proportions (P < 1 × 10⁻¹⁰). Participants were excluded based on: 1) completion rates lower than 94-96% as per the QC groups (n=420 samples); 2) abnormal heterozygosity values of less than 25% or greater than 35% (n=45; some samples were excluded for both low completion rates and abnormal heterozygosity, and the total number of unique samples excluded for either criterion was 438); 3) unexpected duplicates (n=8 forming 4 pairs), and one sample that also failed due to low completion rate; 4) sex discordance between self-reported gender and gender imputed from X chromosome heterozygosity (n=9); 5) one sample from each unexpected inter-study duplicate pair (n=20); and 6) phenotype exclusions (due to ineligibility or incomplete information) (n=27). Utilizing a set of 12,000 unlinked SNPs (pairwise r² < 0.004) common to all GWAS chips (Yu et al. 2008), 215 subjects with less than 80% European ancestry were excluded from downstream analyses based on STRUCTURE analysis (Falush 2007) and PCA (Price et al. 2006). For the planned 154 duplicate pairs, concordance was 99.96%.
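The Hardy-Weinberg filter described above can be sketched as follows. The text does not specify whether an exact or an asymptotic test was used, so this example assumes the simple 1-degree-of-freedom chi-square goodness-of-fit test. The function name and genotype counts are hypothetical.

```python
import math

HWE_P_THRESHOLD = 1e-10  # exclusion threshold quoted in the text


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p-value for deviation from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip empty expected cells so monomorphic SNPs do not divide by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a 1-df chi-square distribution: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, AB, BB) for two SNP assays.
    for counts in [(250, 500, 250), (500, 0, 500)]:
        p_value = hwe_chi2_p(*counts)
        status = "exclude" if p_value < HWE_P_THRESHOLD else "keep"
        print(counts, status)
```

The very strict threshold removes only assays with gross departures, which usually reflect genotype clustering failures rather than true biological deviation.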
The final participant count for the association analysis was 1,856 cases and 4,955 controls. 559,977 SNPs were available for analysis in one or more studies. Each participating study obtained informed consent from study participants and approval from its Institutional Review Board (IRB) for this study and obtained IRB certification permitting data sharing in accordance with the NIH Policy for Sharing of Data Obtained in NIH Supported or Conducted Genome-Wide Association Studies (GWAS). The dbGaP data portal provides access to individual level data from the NCI scan ONLY to investigators from certified scientific institutions after approval of their submitted Data Access Request.
The association between the 559,977 SNPs and risk of glioma was estimated by the odds ratio (OR) and 95% confidence interval (CI) using unconditional logistic regression assuming a trend effect genetic model with 1 degree of freedom. Principal component analysis (PCA) revealed seven significant (p<0.05) eigenvectors when included in the NULL model (logistic regression with dummy variables for sex, age and study). The main effect model was adjusted for sex, age, study and seven eigenvectors. In addition to overall analyses of SNP associations, models were also examined by gender and stratified by the following tumor subtypes: glioblastoma (ICDO-3 codes 9440, 9441, 9442, 9443), oligodendroglioma/mixed glioma (ICDO-3 codes 9382, 9450, 9451, 9460), low-grade glioma (grade I or II according to current WHO classifications), or high-grade glioma (grade III or IV according to current WHO classifications) (2000; Louis et al. 2007). Top-ranked SNPs for further follow-up were selected based on the p-value for additive trend, after known hits and loci in high linkage disequilibrium (pairwise r² value > 0.6) were removed.
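To make the 1-degree-of-freedom trend model concrete, the sketch below fits logit P(case) = a + b × g, where g is the number of risk alleles (0, 1 or 2), by Newton-Raphson on grouped genotype counts. It is a simplified, unadjusted illustration: the study's models also included sex, age, study and seven eigenvectors, which are omitted here. The function names and counts are hypothetical.

```python
import math


def _score_and_information(a, b, cases, controls):
    """Score vector and Fisher information for logit P(case) = a + b * g."""
    s0 = s1 = i00 = i01 = i11 = 0.0
    for g, (d, c) in enumerate(zip(cases, controls)):
        n = d + c
        p = 1.0 / (1.0 + math.exp(-(a + b * g)))
        resid = d - n * p          # observed minus expected cases
        w = n * p * (1.0 - p)      # binomial variance weight
        s0 += resid
        s1 += resid * g
        i00 += w
        i01 += w * g
        i11 += w * g * g
    return s0, s1, i00, i01, i11


def trend_test(cases, controls, max_iter=50, tol=1e-10):
    """Per-allele OR, 95% CI and Wald p-value from counts by genotype 0, 1, 2."""
    a = b = 0.0
    for _ in range(max_iter):
        s0, s1, i00, i01, i11 = _score_and_information(a, b, cases, controls)
        det = i00 * i11 - i01 * i01
        # Newton-Raphson step: inverse information matrix times score.
        step_a = (i11 * s0 - i01 * s1) / det
        step_b = (i00 * s1 - i01 * s0) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    _, _, i00, i01, i11 = _score_and_information(a, b, cases, controls)
    se_b = math.sqrt(i00 / (i00 * i11 - i01 * i01))
    z = b / se_b
    return {
        "or": math.exp(b),
        "ci95": (math.exp(b - 1.96 * se_b), math.exp(b + 1.96 * se_b)),
        "p": math.erfc(abs(z) / math.sqrt(2.0)),
    }


if __name__ == "__main__":
    # Hypothetical counts of cases and controls carrying 0, 1 and 2 risk alleles.
    print(trend_test(cases=(100, 200, 400), controls=(100, 100, 100)))
```

In practice, covariate-adjusted models of this kind are fitted with dedicated software such as PLINK or SAS, both of which the study used. The sketch only shows what the trend coefficient and its Wald test represent.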
For the 85 loci of interest, each participating center provided the results of logistic regression analysis for individuals of European ancestry (CEU) adjusted for age and study-specific factors (e.g. study site). The following information was provided for each SNP: minor allele frequency (MAF), genotype counts for both cases and controls, risk allele, per-allele odds ratio (OR), associated 95% confidence intervals, and the associated p-value of the 1 degree of freedom (df) test of the trend effect for the SNP. Summary estimates for each center were combined using a fixed-effect meta-analysis.
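The inverse-variance fixed-effect combination of per-center summary estimates can be sketched as follows. This is a minimal illustration that assumes each center reports a per-allele OR with a Wald-type 95% confidence interval, from which the standard error of the log OR is recovered. The function name and the example estimates are hypothetical, not values from the study.

```python
import math

Z_975 = 1.959964  # standard normal quantile used for 95% intervals


def fixed_effect_meta(estimates):
    """Pool (odds_ratio, ci_lower, ci_upper) tuples by inverse-variance weighting."""
    sum_w = 0.0
    sum_w_beta = 0.0
    for odds_ratio, ci_lower, ci_upper in estimates:
        beta = math.log(odds_ratio)
        # Recover the standard error of log(OR) from the width of the 95% CI.
        se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * Z_975)
        weight = 1.0 / se ** 2
        sum_w += weight
        sum_w_beta += weight * beta
    beta_pooled = sum_w_beta / sum_w
    se_pooled = math.sqrt(1.0 / sum_w)
    z = beta_pooled / se_pooled
    return {
        "or": math.exp(beta_pooled),
        "ci95": (
            math.exp(beta_pooled - Z_975 * se_pooled),
            math.exp(beta_pooled + Z_975 * se_pooled),
        ),
        "p": math.erfc(abs(z) / math.sqrt(2.0)),
    }


if __name__ == "__main__":
    # Hypothetical per-center estimates for one SNP: (OR, lower 95% CI, upper 95% CI).
    centers = [(1.15, 1.02, 1.30), (1.08, 0.95, 1.23), (1.20, 1.01, 1.43)]
    print(fixed_effect_meta(centers))
```

Because the weights are the reciprocals of the variances, larger and more precise studies dominate the pooled estimate. The fixed-effect model assumes that all centers estimate the same underlying per-allele effect.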
Data analysis and management were performed with GLU (Genotyping Library and Utilities version 1.0), PLINK and SAS® version 9.2 (Raleigh, NC, USA).
This research was supported by intramural funds from the NCI, NIH, Department of Health and Human Services, and has been funded in whole or in part with federal funds from the NCI, NIH, under contract N01-CO-12400. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. We are indebted to the scientific and field efforts of Michelle Brotzman (Westat), Laurie Burdette, Peter Hui (IMS), Annelie Landgren, Leah Mechanic, Lisa Newman (RTI), Aurelie Vogt, Tim Sheehy, Mitchel Berger, Susan Chang, Mike Prados, Tarik Tihan, Ivan Smirnov and Lucie McCoy.
URLs: CGEMS portal: http://cgems.cancer.gov/
Author Contributions PR, BSM, RMC, DM, SSW, PH, SC designed the study and wrote the proposal.
PR, BSM, PH, SC organized the study and sample co-ordination.
MY, SC conducted and supervised genotyping of samples.
PR, ZW, BSM, MY, NC, SC contributed to the design and execution of statistical analysis.
PR, ZW, SC wrote the first draft of the manuscript.
PR, BSM, RMC, DM, MB, RH, RJ, MW, AA, DA, UA, LBF, JEB, MAB, TC, MF, SMG, JMG, GGG, GH, RH, JHB, PDI, CJ, ML, LLM, MSL, SL, UP, MPP, NR, AMR, MS, HDS, GS, XOS, MS, MS, VLS, KV, EW, AW, AZJ, WZ, AS conducted the epidemiologic studies and contributed samples to the glioma cancer GWAS and/or replication.
All authors contributed to the writing of the manuscript.
Please see Supplementary Material for information on support for individual studies that participated in the effort.