Neuroblastoma is a pediatric cancer of the developing sympathetic nervous system that most commonly affects young children and is often lethal5. Most neuroblastomas arise sporadically, with fewer than 1% of cases inherited in an autosomal dominant fashion5. We recently identified the anaplastic lymphoma kinase (ALK) gene as the major hereditary neuroblastoma predisposition gene1. For the vast majority of neuroblastomas that arise without a family history of the disease, we hypothesize that multiple common DNA variations cooperate to increase the risk of neuroblastic malignant transformation. By performing a genome-wide association study (GWAS) of single nucleotide polymorphism (SNP) genotypes, we recently identified common SNP alleles within the putative FLJ22536 genes at 6p22 and within the BARD1 gene at 2q35 associated with malignant neuroblastoma, providing the first evidence that childhood cancers also arise due to complex interactions of polymorphic variants1,2. Here, we investigate constitutional DNA copy number variations (CNVs) as another source of genetic diversity that may contribute to the development of this disease. CNVs have been shown to significantly influence mRNA expression levels6, and recent studies have described associations of CNVs with systemic autoimmunity7,8 and psoriasis13. In addition, Shlien and colleagues reported an increased number of CNVs in Li-Fraumeni families harboring TP53 mutations, but did not explore associations of individual CNVs with cancer susceptibility14.
To identify CNVs associated with neuroblastoma, we first genotyped a discovery set of 1,032 Caucasian neuroblastoma patients and 2,043 disease-free Caucasian control subjects, as previously described1,15. We next applied the stringent quality control criteria necessary for accurate CNV detection and reliable association assessment (see Supplementary Methods). The final discovery set consisted of 846 Caucasian cases and 803 Caucasian controls (Supplementary Tables 1 and 2). These subjects showed tight clustering in a multi-dimensional scaling (MDS) analysis of SNPs not in linkage disequilibrium (LD; Supplementary Figure 1), indicating that population substructure was unlikely to have a significant impact on association testing.
By comparing single-marker binary copy number states at 531,689 SNPs mapped to autosomes, we observed a total of 131 SNPs showing significant association with neuroblastoma, defined as a two-sided Fisher’s exact test p-value below a genome-wide threshold of P = 1.0 × 10^−7 (Supplementary Table 3). Associations with deletion polymorphisms were seen at chromosomes 1, 7, and 14; no duplication polymorphisms reached genome-wide significance. Review of significant SNPs revealed four distinct regions of deletion. We next sought to validate significant association signals in two independent replication sets, the first consisting of 363 Caucasian cases and 1,139 Caucasian controls, the second of 232 Caucasian cases and 2,218 Caucasian controls (Supplementary Tables 1 and 2). All deletion associations showed robust replication in both independent case series (Supplementary Tables 4 and 5).
[Figure: Discovery of 1q21.1 CNV associated with neuroblastoma.]
[Table: Significant copy number variable regions from the discovery phase.]
[Table: Replication of significant copy number variable regions.]
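The single-marker test described above compares, at each SNP, how many cases and how many controls carry a deletion. The following is a minimal Python sketch of a two-sided Fisher's exact test on one such 2×2 table, not the authors' pipeline. The carrier counts are hypothetical; only the group sizes (846 cases, 803 controls), the number of SNPs and the threshold come from the text. The Bonferroni-style comparison is an assumption, since the text states the threshold but not how it was derived.

```python
from math import comb

GENOME_WIDE_THRESHOLD = 1.0e-7   # threshold stated in the text
N_AUTOSOMAL_SNPS = 531_689       # markers tested, from the text


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical carrier counts for one SNP (NOT taken from the study).
cases_del, cases_total = 130, 846
controls_del, controls_total = 70, 803

p_value = fisher_exact_two_sided(
    cases_del, cases_total - cases_del,
    controls_del, controls_total - controls_del,
)
bonferroni = 0.05 / N_AUTOSOMAL_SNPS  # assumed rationale for the threshold
print(f"p = {p_value:.3g}, significant: {p_value < GENOME_WIDE_THRESHOLD}")
print(f"0.05 / {N_AUTOSOMAL_SNPS} = {bonferroni:.2e}")
```

A Bonferroni correction of 0.05 across 531,689 markers gives roughly 9.4 × 10^-8, which is close to the stated threshold of 1.0 × 10^-7.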
We observed a seven-SNP deletion at 1q21.1 that occurred in 15.6% of cases but only 9.1% of controls overall (P_combined = 2.97 × 10^−17; OR = 2.49, 95% CI: 2.02 to 3.05). This association remained significant after additional adjustment for potential population substructure captured by SNPs not in LD (P_discovery < 0.0001), and was driven by a difference in hemizygous deletion frequency (P_combined = 1.83 × 10^−19). The observed frequency of homozygous deletion was identical in cases and controls (1.3% overall). The maximal deletion defined by neighboring 2-copy SNPs spanned 1.6 Mb and contained a cluster of “Neuroblastoma Breakpoint Family” (NBPF) genes. To refine the maximal deletion boundaries, we first genotyped 48 representative samples on the Illumina CNV-12 array (36 cases and 12 controls equally divided between those with and without the deletion CNV). These data reduced the maximal size of the deletion to approximately 300 kb, consistent with published reports of CNVs at this location16,17,18. Lastly, use of the Illumina HumanHap610 SNP platform reduced the maximal deletion size to only 143 kb, from 147,292,384 to 147,435,422 (Supplementary Figure 2). The minimal deletion based on significant SNPs in the association study spanned 121 kb, from 147,305,744 to 147,427,061, and did not contain any known genes.
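As an illustration of how an odds ratio and its confidence interval can be obtained from a 2×2 table, the sketch below uses the standard Wald (log-scale) interval. The counts are hypothetical, and the text does not say which estimator produced the reported OR of 2.49, so this is not a reconstruction of that figure. The final lines only check that the reported interval (2.02 to 3.05) is roughly symmetric around 2.49 on the log scale, as a Wald-type interval would be.

```python
from math import exp, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_wald_ci(a, b, c, d, z=Z_95):
    """Odds ratio and Wald CI for [[a, b], [c, d]] (all cells must be > 0)."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: deletion carriers / non-carriers in cases and controls.
or_hat, ci_low, ci_high = odds_ratio_wald_ci(130, 716, 70, 733)
print(f"OR = {or_hat:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")

# Consistency check on the reported interval: a Wald interval is symmetric on
# the log scale, so the geometric mean of its bounds should sit near the OR.
reported_or, reported_low, reported_high = 2.49, 2.02, 3.05
geometric_mean = sqrt(reported_low * reported_high)
print(f"geometric mean of reported bounds = {geometric_mean:.2f}")
```

The geometric mean of the reported bounds is about 2.48, in line with the reported point estimate.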
We next sought to confirm that this CNV is indeed a heritable genetic variation. First, we genotyped an independent set of 713 trios with varied phenotypes on the same 550K SNP array and generated CNV calls using a family-based approach16. Deletion at 1q21.1 was observed in 125 offspring and confirmed by parental analysis in 123 trios, giving an estimated inheritance rate of 98.4%. Next, we genotyped paired tumor DNA in 226 cases (Supplementary Table 2) using the same 550K SNP platform, and confirmed the existence of the CNV (deletion and duplication) in every tumor sample studied (Supplementary Figure 3a). We did not observe progression from hemizygous deletion in constitutional DNA to homozygous deletion in the matched tumor DNA for any case in this study, nor did we observe any expansion of CNV boundaries in tumor compared to matched constitutional DNA.
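The inheritance rate quoted above is a simple proportion (123 of 125 offspring deletions confirmed in a parent). The sketch below reproduces it and, as an added illustration that is not part of the original analysis, attaches a Wilson score interval to show how much uncertainty a sample of 125 carries.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


offspring_with_deletion = 125   # from the text
confirmed_in_parents = 123      # from the text

rate = confirmed_in_parents / offspring_with_deletion
low, high = wilson_interval(confirmed_in_parents, offspring_with_deletion)
print(f"inheritance rate = {rate:.1%}")
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```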
To investigate whether 1q21.1 deletions are associated with specific neuroblastoma phenotypes, we tested for association with each clinical covariate using the combined set of 1,441 cases (Supplementary Table 6). Although deletions at 1q21.1 were observed more frequently in patients with aggressive disease, this trend did not reach statistical significance in this study. An additive effect on the odds ratio was observed for those harboring both the 1q21.1 deletion and the 6p22 risk alleles (Supplementary Figure 4), but no significant interaction effect was detected to suggest epistasis (Supplementary Table 7).
In addition to the 1q21.1 CNV, we observed highly significant associations of deletion within all four T-cell receptor (TCR) loci clustered on chromosomes 7 and 14. In contrast to the 1q21.1 CNV, TCR deletions were not observed in paired tumor DNA samples (Supplementary Figure 3b). Heterogeneity was observed in the areas of apparent deletion, consistent with some cells harboring homozygous deletion and others exhibiting 2-copy heterozygosity. Deletions within the T-cell receptors tended to co-occur in patients (P < 0.0001, Supplementary Table 7), and were significantly more common in blood- than bone marrow-derived DNA samples (P_TCRG = 9.2 × 10^−31; P = 7.3 × 10^−6; P = 8.0 × 10^−50). Taken together, these findings suggest that we are detecting an oligoclonal expansion of T lymphocytes in a subset of neuroblastoma patients, and that this signal is diluted within the bone marrow compartment. Interestingly, TCR deletions showed a striking over-representation in the less aggressive subset of neuroblastoma (Supplementary Table 6, Supplementary Figure 5). It is possible that these events herald an immunologic response to neuroblastoma; however, this hypothesis requires further investigation.
To validate the 1q21.1 CNV, we first performed quantitative PCR (qPCR) on 46 neuroblastoma cases (ten 0-copy, sixteen 1-copy, twelve 2-copy, and eight duplications as predicted by SNP analysis). We observed 100% concordant results when comparing copy number estimated by qPCR with copy number based on SNP genotyping. To confirm that the detected CNV is not an artifact caused by segmental duplication of the 1p36 region, and that it indeed maps to 1q21.1, we validated the existence of the deletion in a sample harboring a single copy loss using fluorescent in situ hybridization.
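To make the qPCR comparison concrete, the sketch below estimates copy number with the widely used delta-delta-Ct calculation and scores agreement with SNP-based calls. The text does not state how copy number was derived from the qPCR data, so the method, the function names and all Ct values here are assumptions for illustration only.

```python
def copies_from_ct(ct_target, ct_reference, cal_target, cal_reference,
                   calibrator_copies=2):
    """Estimate copy number by the delta-delta-Ct method.

    ct_target / ct_reference: Ct of the CNV assay and of a control locus in
    the test sample; cal_*: the same for a calibrator of known copy number.
    A target with no amplification (ct_target is None) is called 0 copies.
    Assumes roughly 100% PCR efficiency (one doubling per cycle).
    """
    if ct_target is None:
        return 0
    delta_delta_ct = (ct_target - ct_reference) - (cal_target - cal_reference)
    return round(calibrator_copies * 2 ** (-delta_delta_ct))


def concordance(snp_calls, qpcr_calls):
    """Fraction of samples where the two methods give the same copy number."""
    matches = sum(s == q for s, q in zip(snp_calls, qpcr_calls))
    return matches / len(snp_calls)


# Hypothetical Ct values (target, reference) for four samples; the calibrator
# is a 2-copy sample with Ct 24.0 at both loci.
samples = [(None, 24.1), (25.0, 24.0), (24.1, 24.0), (23.4, 24.0)]
snp_calls = [0, 1, 2, 3]
qpcr_calls = [copies_from_ct(t, r, 24.0, 24.0) for t, r in samples]
print(qpcr_calls, concordance(snp_calls, qpcr_calls))
```

Rounding to the nearest integer is a simplification; in practice, samples whose estimate falls midway between two copy states would usually be flagged for repeat testing rather than forced into a call.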
[Figure: Validation and biological relevance of 1q21.1 CNV.]
Although no known genes mapped to the refined 1q21.1 CNV, we identified a spliced EST (BQ431323) from a melanoma library that mapped within the CNV with 100% identity across the entire sequence. Using primers designed against exon 1 and exon 3 of BQ431323, we PCR-amplified cDNA from fetal brain and a neuroblastoma cell line. These PCR products were cloned and sequenced; the resulting sequence mapped uniquely within the CNV with 100% identity across the entire sequence and showed splicing out of the predicted second exon of BQ431323 (Supplementary Figure 6). The top-scoring hit from a Blastn19 search of this sequence against available human RefSeq transcripts was NBPF3, followed by NBPF1 (E = 1.0 × 10^−75) and NBPF15 (E = 2.0 × 10^−24). These data provide strong evidence for a novel NBPF transcript, termed “NBPFX” here, mapping within the 1q21.1 CNV associated with neuroblastoma.
We investigated the expression of NBPFX in both neuroblastoma cells and normal human fetal and adult tissues using real-time quantitative reverse transcriptase PCR. Analysis of eighteen neuroblastomas (tumors and cell lines) of known copy number at the 1q21.1 CNV showed a clear correlation between CNV state and transcript expression. Notably, 2-copy samples clustered into two distinct expression classes (P = 0.007), the first with low expression and the second with high expression, and these likely represent different 1q21.1 CNV genotypes. There are two possible CNV genotypes for 2-copy samples, which we refer to as 2:0 (“cis”) and 1:1 (“trans”) based on the number of copies present on each chromosome in a diploid genome (Supplementary Figure 7a). Two neuroblastoma samples in the low-expression group can be shown to have the 2:0 constitutional CNV genotype because they have somatically acquired gain of chromosome 1q (3 copies) yet have 2 copies with heterozygous SNPs at the 1q21.1 CNV (see Supplementary Figure 7b–c for details). These findings are consistent with the hypothesis that two copies in “cis” (same chromosome) behave differently from two copies in “trans” (different chromosomes). We therefore propose a model whereby NBPFX expression is decreased when copies are in the “cis” configuration as opposed to the “trans” configuration. Importantly, however, even when the 2-copy samples are not clustered in this manner, a statistically significant difference in transcript levels is observed between the 2- and 3-copy samples (P = 0.001). Finally, we analyzed expression of NBPFX in a panel of twenty-eight normal fetal and adult tissues. We observed the highest transcript levels in fetal brain and fetal sympathetic ganglia from early gestation (13–22 wks), consistent with NBPFX being expressed in early sympathicoadrenal neurodevelopment.
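The ambiguity of the 2-copy state can be shown by enumerating the per-chromosome copy combinations behind each total copy number. The sketch below assumes, for illustration, that a single chromosome carries 0 (deletion), 1 (reference) or 2 (duplication) copies of the segment; that restriction and the function name are assumptions, not details given in the text.

```python
from collections import defaultdict
from itertools import combinations_with_replacement

# Assumption: each chromosome carries 0 (deletion), 1 (reference) or
# 2 (duplication) copies of the CNV segment.
HAPLOTYPE_COPIES = (2, 1, 0)


def genotypes_by_total_copies(haplotype_copies=HAPLOTYPE_COPIES):
    """Map total diploid copy number to the unordered genotypes producing it."""
    table = defaultdict(list)
    for first, second in combinations_with_replacement(haplotype_copies, 2):
        table[first + second].append((first, second))
    return dict(table)


genotypes = genotypes_by_total_copies()
for total in sorted(genotypes):
    labels = ", ".join(f"{a}:{b}" for a, b in genotypes[total])
    print(f"{total} copies: {labels}")
```

Under this assumption, only the 2-copy total maps to more than one genotype (2:0 and 1:1), which is why array-based copy number alone cannot separate the "cis" and "trans" configurations.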
NBPF genes were identified after the founding member, NBPF1, was determined to be disrupted via a constitutional chromosomal translocation in a neuroblastoma patient3,4. Subsequent scans of the genome identified three clusters of NBPF genes on chromosome 1 within areas of segmental duplication. The encoded proteins are recently evolved and primate-specific, share significant homology, and contain highly conserved domains of unknown function (DUF1220) that are thought to be neuronal-specific3,20. Constitutional deletions disrupting NBPF genes have been implicated in schizophrenia10,11,12, and rare recurrent structural variants just upstream of the CNV identified in this study have also been reported in a variety of phenotypes including mental retardation, autism, and congenital anomalies21. Although the specific functions of NBPF genes are not known, recently evolved genes may be involved in cancer predisposition due to a lack of selective pressure12. Expression of NBPF1 was recently shown to suppress anchorage-independent growth4. Conversely, over-expression of NBPF transcripts has been reported in several cancers including sarcomas22 and non-small-cell lung cancer23. A challenge for the field is to design experiments that clearly distinguish these highly homologous transcripts and define specific disease-causal mechanisms.
Neuroblastoma likely develops from the malignant transformation of partially committed sympathicoadrenal neuroblasts during fetal or early childhood development. We previously identified the ALK gene as the major familial neuroblastoma predisposition gene1 and showed that common variations at 6p22 and 2q35 are associated with sporadic neuroblastoma1,2. In our current study, we show that a common CNV at 1q21.1 likewise contributes to neuroblastoma susceptibility, and that this CNV leads to altered expression of a novel NBPFX transcript. These data provide the first definitive evidence for a specific CNV predisposing to human cancer, and ongoing efforts will define remaining susceptibility variants in the human genome associated with neuroblastoma.