Neuroblastoma is a malignancy of the developing sympathetic nervous system that most commonly affects young children and is often lethal. The etiology of this embryonal cancer is not known.
We performed a genome-wide association study by first genotyping 1,032 neuroblastoma patients and 2,043 controls of European descent using the Illumina HumanHap550 BeadChip. Three independent groups of neuroblastoma cases (N=720) and controls (N=2128) were then genotyped to replicate significant associations.
We observed highly significant association between neuroblastoma and the common minor alleles of three single nucleotide polymorphisms (SNPs) within a 94.2 kilobase (Kb) linkage disequilibrium block at chromosome band 6p22 containing the predicted genes FLJ22536 and FLJ44180 (P-value range = 1.71×10⁻⁹ to 7.01×10⁻¹⁰; allelic odds ratio range 1.39-1.40). Homozygosity for the at-risk G allele of the most significantly associated SNP, rs6939340, increased the likelihood of developing neuroblastoma (odds ratio 1.97; 95% CI 1.58-2.44). Subsequent genotyping of these 6p22 SNPs in the three independent case series confirmed our observation of association (P=9.33×10⁻¹⁵ at rs6939340 for joint analysis). Furthermore, neuroblastoma patients homozygous for the risk alleles at 6p22 were more likely to develop metastatic (Stage 4) disease (P=0.02), to show amplification of the MYCN oncogene in the tumor cells (P=0.006), and to have disease relapse (P=0.01).
Common genetic variation at chromosome band 6p22 is associated with susceptibility to neuroblastoma.
Despite marked improvements in the cure rates for many childhood cancers, neuroblastoma remains an important clinical problem accounting for 15% of the mortality attributable to malignancy in children.1 It is the most common solid cancer of early childhood, and approximately half of all neuroblastoma cases present with widely disseminated disease that is often refractory to intensive chemoradiotherapy. Cure rates for these high-risk patients remain less than 40% despite dramatic increases in chemoradiotherapy intensity, and survivors often suffer serious life-long morbidities.2-4 Somatically acquired genomic aberrations in neuroblastoma are of fundamental importance for predicting tumor phenotype in patients.1 Tumors with amplification of the MYCN oncogene, or deletions of chromosome arms 1p and/or 11q, typically are metastatic at diagnosis and resistant to therapy.5 Conversely, tumors showing no structural chromosomal changes, but hyperdiploidy due to whole chromosome gains, are more easily cured and may even spontaneously regress.6 Despite the wealth of knowledge on somatically acquired genomic aberrations that correlate with tumor phenotype, little is known about the genetic events that predispose to neuroblastoma tumorigenesis.
Knudson and Strong originally predicted that neuroblastoma fits the two-mutation model of oncogenesis,7 subsequently proven for the analogous embryonal cancer retinoblastoma.8 However, a family history of neuroblastoma is obtained in only about 1% of cases,9 and to date genetic studies in these rare families show locus heterogeneity with no commonly mutated gene identified.10-12 Mutations in the retinoblastoma gene (RB1) confer an increase of several orders of magnitude in the risk of developing a tumor, penetrance is nearly complete, and mutations in this suppressor gene are involved in essentially all familial and sporadic cases. Neuroblastoma, by contrast, appears to be more genetically complex and heterogeneous. We therefore hypothesized that the majority of neuroblastomas arise from relatively common DNA variations that predispose to an increased risk for neuroblastic malignant transformation.
For genome-wide genotyping, a case was defined as a child diagnosed with neuroblastoma or ganglioneuroblastoma and registered through the Children’s Oncology Group (COG). The blood samples from the neuroblastoma cases were identified through the COG Neuroblastoma bio-repository for specimen collection at the time of diagnosis. The majority of specimens were annotated with clinical and genomic information that included: age at diagnosis, site of origin, disease stage by the International Neuroblastoma Staging System,13 International Neuroblastoma Pathology Classification (INPC),14 MYCN oncogene copy number,15 DNA index (ploidy),16 registration on clinical trial(s), event-free and overall survival, second malignancies, and any associated conditions (e.g. congenital abnormalities).
Eligibility criterion for genome-wide genotyping was availability of 1.5 μg of high quality DNA from a tumor-free source such as peripheral blood or bone marrow mononuclear cells uninvolved by tumor. Because neuroblastoma in the United States is demographically a disease of Caucasians of European descent,17 we limited our initial analyses to this racial group to minimize genetic variability. The neuroblastoma cases genotyped genome-wide were randomly divided into a discovery set of 1,251 cases and an initial replication set of 409 cases, all annotated as self-reported Caucasian.
Control subjects were recruited from the Philadelphia region through the Children’s Hospital of Philadelphia (CHOP) Health Care Network, including four primary care clinics and several group practices and outpatient practices that included well child visits. Eligibility criteria for control subjects were: 1) self-reported as Caucasian; 2) availability of 1.5 μg of high quality DNA from peripheral blood mononuclear cells; and 3) no serious underlying medical disorder including cancer. A total of 3414 control subjects genotyped genome-wide were used in this study, randomly allocated to 2236 used in the discovery phase, and 1178 in the initial replication effort. The median age of the control subjects at the time of sample collection was 10.0 years.
Two additional case series were used for the purpose of replication. First, 252 randomly selected and unrelated individuals with neuroblastoma or ganglioneuroblastoma were recruited from Pediatric Oncology Centers in the United Kingdom either as part of a collection of sporadic neuroblastoma patients diagnosed at the Birmingham Children’s Hospital since 1992, or through the Factors Associated with Childhood Tumours (FACT) study or from the Childhood Cancer and Leukaemia Group Tumour and Leukaemia Bank. Control samples (N=788) were from the 1958 Birth Cohort Collection, an ongoing follow-up study of all persons born in Great Britain during one week in 1958, including a recent biomedical assessment during 2002–2004 at which blood samples and informed consent were obtained for creation of a genetic resource (National Child Developmental Study http://www.cls.ioe.ac.uk/). All cases and controls were from the United Kingdom and subjects known to be of non-European ethnic groups were excluded. The final replication group of 59 unrelated individuals with high-risk neuroblastoma was recruited from the US-based Children’s Cancer Group (CCG) protocols from the 1990s.2 Control DNA samples for the CCG cases (N=162) were collected from unrelated Caucasian individuals living in the Los Angeles area during 2000–2001 from buccal swabs obtained for the creation of an anonymized genetic resource.
Informed consent was obtained for all participants and this study was approved by each participating center’s Institutional Review Board as well as the COG Scientific Council, COG Neuroblastoma Disease Committee and Cancer Therapy Evaluation Program (CTEP) at the NCI.
Details of methods for genome-wide genotyping have been described,18,19 and are included with methods for replication genotyping by PCR-based allelic discrimination assays in the Supplementary Information.
Genome-wide genotyping data from the initial discovery set of 1,251 neuroblastoma patients and 2,236 disease-free controls were filtered based on pre-specified quality control measures. Individual SNPs were excluded from further analysis if they showed: 1) deviation from Hardy-Weinberg equilibrium with P<0.0001; 2) individual SNP genotype yield <98%; or 3) minor allele frequency (MAF) <5%. This resulted in 464,934 SNPs being utilized in the subsequent analyses. In addition, 33 samples had genotype yields <90% and were removed (23 cases and 10 controls). Finally, since the case samples were accrued nationwide, while the control set was recruited locally in Philadelphia, we performed principal components analyses to identify outlier samples in order to reduce the effects of population stratification.20,21 This approach removed 379 samples (196 cases and 183 controls), leaving 1,032 cases and 2,043 controls for our discovery case series with self-reported ethnicity of Caucasian. Evaluation of these 3,075 subjects using ancestry informative markers (AIMs) available on the HumanHap550 BeadChip predicted Caucasian status in all but two subjects (these two subjects remained in the analysis).23 The neuroblastoma subjects were representative of the expected distribution of clinical and biological covariates as observed in the general population (Supplemental Table S1).1,17
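The three SNP-level exclusion rules and the sample accounting above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names are hypothetical, the Hardy-Weinberg test shown is the simple 1-degree-of-freedom χ² approximation (the exact procedure used is not stated here), and each SNP is assumed to have at least one called genotype.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  hwe_p_min=1e-4, yield_min=0.98, maf_min=0.05):
    """Apply the three SNP-level exclusion rules to one SNP's genotype counts.

    A SNP is excluded if HWE P < 0.0001, genotype yield < 98%, or MAF < 5%.
    """
    n_called = n_aa + n_ab + n_bb
    genotype_yield = n_called / (n_called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)

    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = (n_called * freq_a ** 2,
                2 * n_called * freq_a * (1.0 - freq_a),
                n_called * (1.0 - freq_a) ** 2)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    hwe_p = chi2_sf_1df(chi2)

    return hwe_p >= hwe_p_min and genotype_yield >= yield_min and maf >= maf_min


# Sample-level accounting reported in the text: low-yield samples are removed
# first, then principal-components outliers.
cases_remaining = 1251 - 23 - 196       # 1,032 cases
controls_remaining = 2236 - 10 - 183    # 2,043 controls
```

The genotype counts passed to `snp_passes_qc` would come from the genotyping calls for each SNP; any counts used to exercise the function are illustrative only.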
The primary statistical tests for association in the discovery case series were carried out using the software package PLINK.22 We conservatively set 1.0 × 10⁻⁷ as the threshold for genome-wide significance, based on the fact that slightly fewer than 500,000 SNPs were used in the analysis (0.05/500,000 = 1.0 × 10⁻⁷). The single marker analyses for the genome-wide data were carried out using the χ² test based on allele count differences between 1,032 cases and 2,043 controls and the Cochran-Armitage test for trends on genotype frequencies. Allelic odds ratios and the corresponding 95% confidence intervals were calculated for the association analyses. In addition, to further control for the potential confounding influence of population stratification, we performed association analyses after correction for substructure based on principal components analysis as implemented in Eigenstrat.20,21 For each SNP, we utilized the default settings within the program and performed a modified Cochran-Armitage trend test adjusting for the top five principal components,21 and report the result as the Eigenstrat P-value. Supplementary Figure S1 shows quantile-quantile plots before and after correction.
Of the 1,032 patients included in the discovery case series, clinical and biological covariate data obtained at diagnosis were available for most (Supplementary Table S1). A total of 883 patients (85.6%) had complete outcome data available with a median follow-up interval of 4.02 years for patients without an event. Association analyses of chromosome 6p22 SNPs with clinical characteristics were performed with the χ² test on allele and genotype counts, and with outcome by comparing Kaplan-Meier survival curves by the logrank test in a pairwise fashion.
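The single-marker statistics named above can be sketched in a few lines. This is an illustration under stated assumptions, not a reproduction of the PLINK analysis: the function names are hypothetical, the allelic χ² is computed without continuity correction, the confidence interval uses the standard log odds ratio (Woolf) approximation, and the trend test uses additive weights 0, 1, 2 for the number of risk alleles. All four cell counts of the 2×2 table are assumed to be non-zero.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold, e.g. 0.05 / 500,000."""
    return alpha / n_tests


def allelic_test(case_risk, case_other, ctrl_risk, ctrl_other):
    """Chi-square test on a 2x2 table of allele counts, with allelic odds ratio.

    Returns (chi2, p_value, odds_ratio, (ci_low, ci_high)).
    """
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(odds_ratio) - 1.96 * se_log_or),
          math.exp(math.log(odds_ratio) + 1.96 * se_log_or))
    return chi2, chi2_sf_1df(chi2), odds_ratio, ci


def cochran_armitage_trend(case_counts, ctrl_counts, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts ordered by risk-allele dose.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    totals = [r + s for r, s in zip(case_counts, ctrl_counts)]
    n_cases, n_ctrls = sum(case_counts), sum(ctrl_counts)
    n = n_cases + n_ctrls
    sum_tr = sum(t * r for t, r in zip(weights, case_counts))
    sum_tn = sum(t * m for t, m in zip(weights, totals))
    sum_t2n = sum(t * t * m for t, m in zip(weights, totals))
    numerator = n * (n * sum_tr - n_cases * sum_tn) ** 2
    denominator = n_cases * n_ctrls * (n * sum_t2n - sum_tn ** 2)
    chi2 = numerator / denominator
    return chi2, chi2_sf_1df(chi2)
```

For example, `bonferroni_threshold(0.05, 500_000)` gives the 1.0 × 10⁻⁷ threshold quoted in the text, and a SNP would be declared genome-wide significant only if its P-value falls below that value.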
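As an illustration of the pairwise survival comparison described above, the following is a minimal sketch of a two-group logrank test. The software used for the survival analyses is not stated here, so the function name and data layout are assumptions: each group is given as a list of follow-up times and a parallel list of event indicators (1 for an event, 0 for a censored observation).

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group logrank test. Returns (chi2, p_value) with 1 degree of freedom."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    chi2 = observed_minus_expected ** 2 / variance if variance > 0 else 0.0
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

In a pairwise comparison of genotype groups, the function would be called once per pair, for instance patients homozygous for the risk allele against patients homozygous for the non-risk allele.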
The Eigenstrat corrected summary statistics for the entire discovery case series are available at http://www.caglab.net/pub/kiran/neuroblastoma.html.
To identify sequence variants that are associated with susceptibility to develop neuroblastoma, we compared single marker allele and genotype frequencies in our discovery case series using the χ² and Cochran-Armitage trend test statistics. The top three SNPs showing significant association with neuroblastoma were in tight linkage disequilibrium (LD) at chromosome 6p22 (rs6939340, rs4712653, rs9295536), yielding P-values of 1.71 × 10⁻⁹ to 7.01 × 10⁻¹⁰ (allelic odds ratio 1.39-1.40; Figure 1, Table 1, Table S2). Two additional SNPs at chromosome 20p11 (rs3790171 and rs7272481) showed genome-wide significant single marker P-values, and many others were very close to the genome-wide significance threshold (Figure 1). However, only the chromosome 6 association results retained genome-wide significance after further correction for population substructure using principal components analyses as implemented in Eigenstrat20,21 (Table 1). The chromosome 6 signal falls within a 94.2 Kb LD block containing the predicted overlapping genes FLJ22536 and FLJ44180 (Figure 2).
We next sought to replicate the chromosome 6p22 and 20p11 association signals in three separate pairings of neuroblastoma patients with controls. As shown in Table 2, the chromosome 6 risk alleles identified in the discovery phase were also significantly over-represented in all three sets of neuroblastoma cases compared to their controls, yielding a combined P-value and allelic odds ratio for the most strongly associated SNP rs6939340 of 9.33×10⁻¹⁵ and 1.37 (95% CI 1.27-1.49). There was no support for the chromosome 20 alleles identified in the discovery cohort truly being associated with predisposition to neuroblastoma in the three replication case series (Supplementary Table S3).
To determine if the risk alleles at the chromosome 6p22 locus were differentially associated with patient subsets and outcome, we analyzed the frequency distribution of the associated SNPs from the discovery phase COG case series with respect to survival rates and to prognostically relevant clinical and biological covariates present at diagnosis. For all three SNPs, there was a significantly different allele frequency distribution indicating that the risk alleles were more likely to be present in patients with a more malignant clinical presentation and disease course (Figure 3). Accordingly, neuroblastoma patients homozygous for the at-risk alleles were more likely to develop tumors that were metastatic at diagnosis (P=0.0060, 0.0084, and 0.0245 for rs4712653, rs9295536, and rs6939340 respectively), to have somatically acquired amplification of the MYCN oncogene in tumor cells (P=0.0018, 0.0028, and 0.0061 respectively), and to have a high-risk classification for the purposes of treatment stratification (P=0.0018, 0.0004 and 0.0107 respectively), compared to neuroblastoma patients homozygous for the non-risk alleles (Supplemental Tables S4 and S5). In addition, patients homozygous for the at-risk alleles had a significantly decreased event-free survival probability (P=0.0158 for SNP rs6939340; Supplemental Figure S2). This association with more malignant disease may explain why we were able to show replication of results in the relatively small series of 59 cases from CCG that consisted solely of high-risk patients (Table 2).
Embryonal cancers are postulated to arise from partially committed primordial cells during fetal or early childhood development. This comprehensive genetic analysis shows that the likelihood for malignant transformation of developing neuroblasts is influenced by common variation in the human genome at 6p22, and that the ultimate neuroblastoma phenotype is in part determined by germline variation at this locus. These definitive data support a model of pediatric cancers as complex disease, with common variations in the genome likely influencing susceptibility and phenotypic variability.
The motivation for this large and ongoing study was the paucity of information on how neuroblastoma tumorigenesis is initiated. Epidemiological studies have not identified a common environmental exposure that influences susceptibility to neuroblastoma,24-26 and genetic studies of hereditary disease have been hampered by the rarity of the condition and the small size of pedigrees due to the lethality of neuroblastoma in early childhood.27 Our first results not only provide proof-of-concept for a genome-wide association approach to neuroblastoma, but also identify novel candidates for susceptibility genes. Little is known about the predicted genes within the LD block identified here. FLJ22536 has multiple predicted isoforms and contains a potential epidermal growth factor-like domain. The FLJ44180 gene is apparently novel as there are no sequence similarities in human or mouse nucleic acid and protein databases. Studies are underway to determine how the presence of common DNA variants at 6p22 contributes to the risk of neuroblastic malignant transformation, presumably through altered expression and/or alternative splicing of regional candidate genes. The association of common SNP alleles at 6p22 with the sub-phenotype of high-risk disease also requires further exploration. Future work will define why tumors initiated with a genetic contribution from the 6p22 locus are more likely to evolve towards a more malignant clinical disease.
Figure 1 demonstrates that there are multiple other potential association signals that will emerge as we continue towards our eventual goal of genotyping 5000 neuroblastoma cases genome-wide. While speculative, these early data support a common variant-common disease model for neuroblastoma where initiation of neuroblastoma tumorigenesis occurs due to the interaction of multiple relatively common genetic variations in the developing neuroblast. Our ongoing genome-wide association study and international collaborations are expected to identify and verify additional risk alleles and how these may interact to increase susceptibility to neuroblastoma.
Neuroblastoma is a unique malignancy: it has the highest proportion of spontaneously regressing cases of any human cancer,28,29 but the majority of patients have a relentless malignant disease that is difficult to cure even with intensive chemoradiotherapy.1 Our data suggest that the chromosome 6p22 risk alleles not only influence the likelihood of developing neuroblastoma, but also the likelihood for developing a more malignant phenotype. This observation increases the urgency for understanding the molecular pathways influenced by these risk variants in somatic and tumor tissues, as this may lead to novel approaches to neuroblastoma therapy.
This work was made possible by the efforts of multiple investigators throughout the Children’s Oncology Group (U10-CA98543, Dr. Gregory Reaman Chair). This work was also supported in part by NIH Grants R01-CA78454 (JMM), R01-ES009911 (HL) and R01-CA60104 (RCS), the Alex’s Lemonade Stand Foundation (JMM), the Center for Applied Genomics at the Joseph Stokes Research Institute, which funded all GWA genotyping (HH), the Abramson Family Cancer Research Institute (JMM), the Neuroblastoma Society (NR, RHS), and the Institute of Cancer Research, UK (NR, RHS). We acknowledge use of DNA from the British 1958 Birth Cohort collection, funded by the Medical Research Council grant G0000934 and the Wellcome Trust grant 068545/Z/02. MC is supported in part by a fellowship from the Associazione Italiana Ricerca sul Cancro (AIRC). The Factors Associated with Childhood Tumours (FACT) Study and the Tumour and Leukemia Bank and Children’s Cancer and Leukaemia Group (CCLG) also supported these studies.
The authors have no financial disclosures. Dr. Maris authored the original manuscript draft.