Genomewide association studies (GWAS) and candidate-gene studies have implicated single-nucleotide polymorphisms (SNPs) in at least 45 different genes as putative glioma risk factors. Attempts to validate these associations have yielded variable results and few genetic risk factors have been consistently replicated. We conducted a case-control study of Caucasian glioma cases and controls from the University of California San Francisco (810 cases, 512 controls) and the Mayo Clinic (852 cases, 789 controls) in an attempt to replicate previously reported genetic risk factors for glioma. Sixty SNPs selected from the literature (eight from GWAS and 52 from candidate-gene studies) were successfully genotyped on an Illumina custom genotyping panel. Eight SNPs in/near seven different genes (TERT, EGFR, CCDC26, CDKN2A, PHLDB1, RTEL1, TP53) were significantly associated with glioma risk in the combined dataset (P < 0.05), with all associations in the same direction as in previous reports. Several SNP associations showed considerable differences across histologic subtype. All eight successfully replicated associations were first identified by GWAS, whereas none of the putative risk SNPs from candidate-gene studies was associated in the full case-control sample (all P values > 0.05). Although several confirmed associations are located near genes long known to be involved in gliomagenesis (e.g., EGFR, CDKN2A, TP53), these associations were first discovered by the GWAS approach and are in noncoding regions. These results highlight that the deficiencies of the candidate-gene approach lie in selecting both appropriate genes and relevant SNPs within these genes.
Heritable susceptibility to glioma was originally suggested by the increased risk observed in patients with an affected first-degree relative, and also by the association of glioma with several well-defined Mendelian disorders (e.g., Neurofibromatosis Type 1, Lynch syndrome, Li-Fraumeni syndrome) [Malmer et al., 2007]. Gliomagenesis is, however, a complex and multifaceted process influenced by both heritable and somatic genetic variation. Studies of glioma tumor DNA have found that mutations in the genes IDH1 and IDH2 occur in approximately 50–80% of grades 2–3 glioma, but in < 10% of primary glioblastomas [Christensen et al., 2010; Yan et al., 2009]. These mutations are associated with younger age of onset and better survival among glioma patients, and are also associated with other somatic genetic and epigenetic alterations [Hartmann et al., 2009].
Studies of constitutional DNA from glioma patients have implicated at least 45 different genes in gliomagenesis. These investigations have mostly been candidate-gene studies, which frequently examine genes involved in a biological pathway of interest, such as DNA repair [Bethke et al., 2008c; Felini et al., 2007; Liu et al., 2007], apoptosis [Bethke et al., 2008a; Rajaraman et al., 2007], or folate metabolism [Bethke et al., 2008b]. However, robustly replicated risk genes have not emerged from these candidate studies and inconsistent associations are the norm.
Genomewide association studies (GWAS) of glioma have been conducted in recent years, identifying eight single-nucleotide polymorphisms (SNPs) in seven different genes which are independently associated with glioma risk [Sanson et al., 2011; Shete et al., 2009; Stacey et al., 2011; Wrensch et al., 2009]. The “hypothesis-free” GWAS approach has realized greater success than the candidate-gene approach, at least in part because it is not gene-centric. Indeed, across all the GWAS conducted to date for more than 500 diseases/traits, most significantly associated SNPs have been found in noncoding regions of the genome [Hindorff et al., 2009]. Furthermore, because of the lack of prior hypotheses, most GWAS are designed to include a replication phase to minimize false positives. Despite the shortcomings of the candidate-gene approach, such studies frequently evaluate well-considered a priori hypotheses. Because several of the glioma risk genes identified through GWAS had prior data indicating their potential involvement in gliomagenesis (i.e., TP53, p15/CDKN2B, EGFR), there is strong rationale to attempt replication of putative glioma risk loci appearing in the candidate-gene literature.
In order to assess the role of genetic variation at loci previously implicated in influencing glioma risk, we conducted a case-control study of Caucasian glioma patients and ancestry-matched controls. A total of 2,963 individuals recruited at The University of California, San Francisco and the Mayo Clinic were genotyped on an Illumina GoldenGate (Illumina, San Diego, CA) custom panel containing candidate SNPs selected from 28 previous publications. In total, 61 candidate SNPs were assayed, including eight previous GWAS hits on chromosomes 5, 7 (two loci), 8, 9, 11, 17, and 20. We sought to replicate previously detected SNP associations using a larger sample size than any of the candidate-gene studies, and also to evaluate these associations within specific histologic subgroups.
This study included European-ancestry glioma cases and controls from two collaborating institutions: The University of California, San Francisco (810 cases, 512 controls) and the Mayo Clinic (852 cases, 789 controls). Both institutions received institutional review board approval, and informed consent was obtained from subjects. Patient recruitment methods have been described in detail elsewhere [Jenkins et al., 2012]. Briefly, UCSF cases and controls were taken from the San Francisco Bay Area Adult Glioma Study. Cases aged 20 years or older with histologically confirmed incident glioma were recruited from the local population-based registry, the Northern California Rapid Case Ascertainment program, and the UCSF Neuro-oncology clinic between 1997 and 2012. Controls aged 20 years or older from the same residential area as cases were ascertained through random digit dialing, had no history of brain tumor at time of recruitment, and were frequency matched to population-based cases on age, sex, and ethnicity.
Mayo Clinic cases consisted of patients 18 years of age and older who had surgical resection or biopsy of a glioma between 1989 and 2012. Cases were identified at diagnosis (for those initially seen at the Mayo Clinic) or at the time of pathologic confirmation (for those initially diagnosed elsewhere and subsequently treated at Mayo). The Mayo control group consisted of consented individuals 18 years of age and older who underwent a general medical examination at the Mayo Clinic and had no previous history of a brain tumor. Subject characteristics, including histopathologic classification of glioma cases, are outlined in Table 1. Pathology review was performed as previously described [Jenkins et al., 2012; Wrensch et al., 2009].
A Medline search was performed on September 1, 2011 to retrieve association studies identifying at least one significantly associated glioma risk SNP, using the following search expression: (glioma[Title/Abstract] OR glioblastoma[Title/Abstract] OR astrocytoma[Title/Abstract] OR oligodendroglioma[Title/Abstract]) AND association[Title/Abstract] AND (SNP[Title/Abstract] OR single nucleotide polymorphism). Additionally, the bibliographies of selected articles were scanned to identify pertinent publications that the electronic search may have missed. Results were not filtered by language. Only studies that assayed SNPs were included because other variant types (e.g., microsatellites) are not amenable to the genotyping platform used in our study. Manuscripts were scanned and eliminated from further analysis as outlined in Supporting information Figure SI.
SNPs from 28 publications which investigated the role of inherited genetic variation in influencing glioma risk were included, of which 4 studies were GWAS [Sanson et al., 2011; Shete et al., 2009; Stacey et al., 2011; Wrensch et al., 2009] and 24 were candidate-gene studies [Bethke et al., 2008a,b,c; Brenner et al., 2007; Caggana et al., 2001; Chang et al., 2008; Chen et al., 2000; Dobbins et al., 2011; Dou et al., 2010; Felini et al., 2007; Liu et al., 2009, 2007, 2008; Rajaraman et al., 2007; Ruan et al., 2011; Schwartzbaum et al., 2005, 2007, 2010; Semmler et al., 2006; Wang et al., 2010; Wiemels et al., 2007; Wiencke et al., 2005; Wrensch et al., 2005; Yang et al., 2005]. Sample size, patient ethnicity, histological inclusion criteria, and the study hypothesis of included studies are outlined in Table 2.
From these studies, 61 SNPs, including eight identified in glioma GWAS, were selected for replication analysis in our study sample. The remaining 53 SNPs were reported to be associated with glioma risk in candidate-gene studies, with reported P values ranging from 1.45 × 10−4 to 0.042.
GoldenGate custom genotyping arrays were designed by Illumina (San Diego, CA). Genotyping was performed by the Mayo Clinic Genotyping and UCSF Genome Center core facilities as previously published [Jenkins et al., 2012]. Samples were submitted in 96-well plates containing intra- and inter-plate replicates to ensure genotype reproducibility. All cluster plots were visually inspected.
For both study sites, samples with genotyping call rate <95% were excluded from analysis. SNPs with genotyping call rates <95% in any site were excluded from all analyses. To exclude poorly genotyped SNPs, any SNP with a Hardy-Weinberg Equilibrium (HWE) P-value < 0.001 in controls, stratified by site, was removed from further analyses.
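The SNP-level quality-control rules above can be sketched in a few lines of Python. This is an illustrative sketch only: the text does not state which HWE test was used, so a 1-degree-of-freedom chi-square goodness-of-fit test is assumed here, and the function names, the per-site dictionaries, and the reading of "stratified by site" as "fails if either site's controls fail" are all assumptions rather than the authors' pipeline. Only the two thresholds (95% call rate, HWE P < 0.001) come from the text.

```python
from math import erfc, sqrt

CALL_RATE_MIN = 0.95   # threshold stated in the text
HWE_P_MIN = 0.001      # threshold stated in the text


def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumed test; an exact test could be substituted.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:                   # monomorphic SNP: nothing to test
        return 1.0
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    return erfc(sqrt(chi2 / 2.0))


def snp_passes_qc(call_rate_by_site: dict, control_counts_by_site: dict) -> bool:
    """Keep a SNP only if every site meets the call-rate and HWE thresholds."""
    if any(rate < CALL_RATE_MIN for rate in call_rate_by_site.values()):
        return False
    return all(
        hwe_chi2_p(*counts) >= HWE_P_MIN
        for counts in control_counts_by_site.values()
    )
```

For example, `snp_passes_qc({"UCSF": 0.99, "Mayo": 0.98}, {"UCSF": (25, 50, 25), "Mayo": (30, 50, 20)})` keeps the SNP, whereas a call rate below 0.95 at either site, or strong heterozygote deficit among either site's controls, removes it. The genotype counts shown are made up for illustration.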
Single SNP association statistics were calculated using logistic regression in Plink v1.07, assuming a log-additive model [http://pngu.mgh.harvard.edu/purcell/plink/] [Purcell et al., 2007]. The effect of individual SNPs on glioma risk was calculated in the full case-control dataset using a logistic regression model adjusted for sex, age, and study site. Reported associations are for an allelic additive model, adjusted for these covariates, where odds ratios are for each additional copy of the minor allele. All P values are two-sided. SNP associations were assessed in the full case-control sample, and also stratified by tumor grade/histology (glioblastoma vs. controls, anaplastic astrocytoma vs. controls, grade 2 astrocytoma vs. controls, oligodendroglioma vs. controls, mixed oligoastrocytoma vs. controls) and IDH mutation status of patient tumors (cases with IDH mutant tumors vs. controls).
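For readers less familiar with the log-additive model, the regression described above can be written out as follows. The notation is introduced here for illustration and does not appear in the original text: $G_i$ is the number of minor alleles carried by subject $i$, and the covariate coefficients are labeled arbitrarily.

```latex
\begin{aligned}
\operatorname{logit}\,\Pr(\text{case}_i)
  &= \beta_0 + \beta_G\, G_i
     + \beta_{\text{sex}}\,\text{sex}_i
     + \beta_{\text{age}}\,\text{age}_i
     + \beta_{\text{site}}\,\text{site}_i,
  \qquad G_i \in \{0, 1, 2\}, \\
\text{OR}_{\text{per allele}} &= e^{\beta_G}.
\end{aligned}
```

Under this coding, carrying two copies of the minor allele multiplies the odds of glioma by $e^{2\beta_G}$ relative to carrying none, which is what "log-additive" means. The two-sided P value for each SNP tests $\beta_G = 0$.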
Of the 60 candidate SNPs for which replication was attempted, three were originally reported in a glioma GWAS published by our group [Wrensch et al., 2009]. That GWAS case-control sample partially overlaps with the individuals reported on in this manuscript. We therefore removed 433 individuals from the analysis of these three SNPs in order to eliminate sample overlap with the previous GWAS report (i.e., N = 2,963 for 57 SNPs and N = 2,530 for 3 SNPs).
UCSF tumor specimens were sequenced to identify IDH1 and IDH2 mutations using previously described methods [Christensen et al., 2010]. Briefly, the region spanning the R132 codon of IDH1 and the region spanning the R172 codon of IDH2 were amplified by polymerase chain reaction with M13 tagged primers to facilitate amplification and sequencing. Products were run on a 1.5% agarose gel and subsequently sequenced in both directions at the UCSF Genomics Core Facility according to the manufacturer’s protocol. Sequences were analyzed with Applied Biosystems Sequence Scanner Software v1.0. Mayo Clinic tumor specimens were assayed for IDH1 mutations using pyrosequencing and IDH2 mutations using both pyrosequencing and Sanger sequencing as previously described [Kipp et al., 2012].
After excluding samples with call rates <95%, 2,959 participants remained for analysis (1,660 cases, 1,299 controls). After excluding SNPs that did not meet call-rate or HWE thresholds, 60 SNPs remained for analysis.
In the combined analysis of all glioma tumor histologies, seven SNPs were associated with case-control status at a P-value < 0.05. These were seven of the eight SNPs identified as associated with glioma risk in previous GWAS. The eighth SNP identified via GWAS, rs498872 on chromosome 11, was associated only when analyses were restricted to the oligodendroglioma, mixed oligoastrocytoma, or IDH mutant subgroups, consistent with previous reports of the histologic specificity of this association (Table 3) [Jenkins et al., 2012]. For all eight SNPs, the direction of association in our data matches that in previous GWAS.
Many of the significantly associated SNPs in Table 3 show marked differences in association across histologic subtypes. As previously reported, rs4295627 on chromosome 8 and rs498872 on chromosome 11 are more strongly associated with tumors having an oligodendroglial component or an IDH mutation [Jenkins et al., 2012; Jenkins et al., 2011]. SNPs in EGFR and CDKN2A, on the other hand, show weak associations with oligodendroglial tumors but stronger associations with high-grade astrocytic tumors. SNPs in TERT, TP53, and RTEL1 show only modest differences in effect size across histology strata and appear to be more general glioma risk factors.
Of the 52 remaining successfully genotyped SNPs, all selected from the candidate-gene literature, no significant associations were detected in analysis of the full case-control dataset (all P values > 0.05, Supporting information Table SI). Associations stratified by tumor histology and by IDH-mutation status can also be found in Supporting information Table SI. We were able to determine the previously reported direction of association and reference allele for 39 of these 52 SNPs. The direction of association reported in the literature matched that observed in our data for just 13/39 (33%) of the putative risk SNPs abstracted from the candidate-gene literature, compared with 50% expected by chance and 100% for SNPs identified by previous GWAS.
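The comparison of 13/39 concordant directions against the 50% expected by chance can be checked with an exact two-sided binomial (sign) test. The sketch below is not an analysis reported in the text; it only shows how a reader might quantify that comparison, and the function name and tie tolerance are hypothetical choices.

```python
from math import comb


def binom_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial P value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null success probability p.
    """
    pmf = [
        comb(trials, k) * p ** k * (1.0 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small relative tolerance so floating-point ties are counted
    total = sum(prob for prob in pmf if prob <= observed * (1.0 + 1e-9))
    return min(1.0, total)


# Counts taken from the text: 13 of 39 candidate-gene SNPs matched the
# previously reported direction of association.
p_value = binom_two_sided_p(13, 39)
print(f"concordance = {13 / 39:.0%}, two-sided P = {p_value:.3f}")
```

The test treats the 39 SNPs as independent, which may not hold for SNPs in the same gene or in linkage disequilibrium, so the resulting P value should be read as a rough guide only.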
The importance of performing robust SNP replication studies cannot be overstated, as additional samples help to validate purported genotype-phenotype correlations and extend them to additional populations. We replicated associations at all eight glioma risk SNPs originally identified by the GWAS method, attesting to the efficacy of this approach and its ability to produce robust associations. Furthermore, the GWAS approach can identify associated genes acting in biological pathways not previously known to influence disease pathogenesis. Two such genes with relevance to glioma are involved in telomere elongation: TERT and RTEL1 (rs2736100 and rs6010620, respectively). Before GWAS were conducted, telomere function was not linked to gliomagenesis and as a result, this important aspect of glioma biology remained unstudied.
In the case of glioma, TP53 and p16Ink4a (containing CDKN2A, CDKN2B, and ANRIL) are obvious candidate loci based on the glioma-associated Mendelian syndromes resulting from deletion of these regions. Inherited mutations in TP53 cause Li-Fraumeni syndrome (Online Mendelian Inheritance in Man (OMIM): 151623) and inherited deletions of CDKN2A cause familial melanoma-astrocytoma syndrome (OMIM: 155755). Additionally, the relevance of EGFR to gliomagenesis has long been apparent, as it is commonly amplified in glioma tumor samples [Wong et al., 1987]. Yet, significant and robustly replicated SNPs in TP53, CDKN2B, and EGFR were first identified by GWAS, despite being preceded by a plethora of candidate-gene studies. Although the most obvious drawback of the candidate-gene approach is that selection of relevant genes is limited by the current state of biological knowledge, selecting the relevant SNPs within those genes can be a comparable challenge.
Of the eight significant SNPs identified through GWAS, none are located in coding regions (two intergenic, four intronic, one 3′-UTR, one 5′-UTR). This is in stark contrast to the 53 SNPs identified by candidate-gene studies, nearly half of which are located in exons but none of which were replicated. The impetus to identify coding variants of potential functional relevance is understandable. However, researchers performing candidate-gene studies in the future should recognize that genotyping such variants does not align with the genetic paradigm recently revealed by GWAS: Namely, that common variants are associated with common diseases, but such loci are often noncoding variants of a regulatory nature or are manifestations of “synthetic associations” [Dickson et al., 2010]. Selecting SNPs to genotype based on position within exons and predicted effect on protein function, as opposed to their ability to tag haplotype blocks, appears to be a poor strategy for identifying new associations given the results of the GWAS published to date.
The associations appearing in Supporting information Table SI can serve as a resource for researchers interested in these particular genes or in performing meta-analyses of SNPs potentially associated with glioma risk. The identification of constitutional genetic polymorphisms associated with the development of glioma has brought us closer to understanding the causal mechanisms underlying gliomagenesis. Additionally, excluding erroneous associations generated by studies with immoderate Type 1 error rates helps to focus research on more salient endeavors, such as fine-mapping of the confirmed risk loci.
Work at UCSF was supported by the NIH (grant numbers R25CA112355, R01CA52689, and P50CA097257), as well as the National Brain Tumor Foundation, the UCSF Lewis Chair in Brain Tumor Research, and by donations from families and friends of John Berardi, Helen Glaser, Elvera Olsen, Raymond E. Cooper, and William Martinusen. Work at the Mayo Clinic was supported by the NIH (grant numbers P50CA108961 and P30 CA15083), National Institute of Neurological Disorders and Stroke (grant number RC1NS068222Z), the Bernie and Edith Waterman Foundation, and the Ting Tsung and Wei Fong Chao Family Foundation. The collection of cancer incidence data used in this study was supported by the California Department of Public Health as part of the statewide cancer reporting program mandated by California Health and Safety Code Section 103885; the National Cancer Institute’s Surveillance, Epidemiology and End Results Program under contract HHSN261201000036C awarded to the Cancer Prevention Institute of California, contract HHSN261201000035C awarded to the University of Southern California, and contract HHSN261201000034C awarded to the Public Health Institute; and the Centers for Disease Control and Prevention’s National Program of Cancer Registries, under agreement #1U58 DP000807-01 awarded to the Public Health Institute. The ideas and opinions expressed herein are those of the author(s) and endorsement by the State of California Department of Public Health, the National Cancer Institute, and the Centers for Disease Control and Prevention or their Contractors and Subcontractors is not intended nor should be inferred.
The authors do not have any conflicts of interest, financial or otherwise.
Supporting Information is available in the online issue at wileyonlinelibrary.com.