Neuroblastomas are tumors of peripheral sympathetic neurons and are the most common solid tumor in children. To determine the genetic basis for neuroblastoma we performed whole-genome sequencing (6 cases), exome sequencing (16 cases), genome-wide rearrangement analyses (32 cases), and targeted analyses of specific genomic loci (40 cases) using massively parallel sequencing. On average each tumor had 19 somatic alterations in coding genes (range, 3–70). Among genes not previously known to be involved in neuroblastoma, chromosomal deletions and sequence alterations of chromatin remodeling genes, ARID1A and ARID1B, were identified in 8 of 71 tumors (11%) and were associated with early treatment failure and decreased survival. Using tumor-specific structural alterations, we developed an approach to identify rearranged DNA fragments in sera, providing personalized biomarkers for minimal residual disease detection and monitoring. These results highlight dysregulation of chromatin remodeling in pediatric tumorigenesis and provide new approaches for the management of neuroblastoma patients.
Neuroblastomas are pediatric tumors arising from neural crest-derived precursors of the peripheral sympathetic nervous system. As is typical of embryonal tumors, they arise early in childhood with 90% of all cases diagnosed before the age of 5 years. They are the most common extra-cranial solid tumor of childhood and are responsible for up to 15% of childhood cancer-related deaths1–3, with the majority of patients presenting with metastatic disease at the time of diagnosis. Neuroblastomas manifest marked heterogeneity in clinical outcome. The prognosis of children less than 18 months old, even those with metastatic disease, is favorable, and the tumors in children with stage 4S disease frequently regress spontaneously4. Unfortunately, children older than 18 months who are diagnosed with advanced stage disease have a grave prognosis despite multimodal, dose-intensive chemoradiotherapy5. Several recurrent genetic alterations have been elucidated, including amplification of the MYCN oncogene in ~20% of cases6,7, activating mutations in the ALK tyrosine kinase in ~8% of primary tumors8–11, and more recently mutations in ATRX in neuroblastomas presenting in older children and adolescents12. MYCN amplification is associated with advanced tumors and poor outcome, and ATRX mutations define indolent neuroblastoma with eventual progression, while the prognostic value of ALK alterations remains to be defined7.
To comprehensively analyze acquired genetic alterations in neuroblastoma, we used a combination of next generation sequencing approaches in a discovery screen: low-coverage whole-genome sequencing for detection of structural and copy number alterations in 26 cases; exome sequencing for detection of subtle sequence alterations in 16 cases; and high-coverage whole-genome sequencing for detection of both sequence and structural alterations in 6 cases (all of which were also subjected to exome sequencing) (Supplementary Fig. 1, Table 1). In total, 16 cases could be analyzed for subtle mutations such as single base substitutions and small insertions or deletions (indels), while 32 cases (26 with low coverage, 6 with high coverage) could be analyzed for large scale structural changes and copy number alterations. DNA was obtained from low-passage cell lines (n=6) or primary tumors (n=29) and matched normal controls as indicated in Supplementary Table 1. Following library construction and capture on a SureSelect (Agilent) Enrichment System, DNA was sequenced using Illumina GAIIx/HiSeq instruments (Supplementary Note). The average coverage of each base in the targeted regions was 31-fold and 94-fold for the high-coverage whole-genome and exome sequencing approaches, respectively (Supplementary Tables 2 and 3), while the low-coverage whole-genome sequencing achieved an average of 10-fold physical coverage (Supplementary Table 4).
The sequencing data were analyzed using stringent criteria to identify somatic single base substitutions, insertions or deletions (indels), and structural alterations (Online Methods). All single base substitutions and indels were confirmed by an independent sequencing method (Online Methods), and only confirmed mutations are included in the analyses described below. With the exception of one tumor, we found that neuroblastoma tumors had an average of 13 (range, 1 to 52) somatically acquired single base substitution or indel mutations that would be predicted to result in non-silent (NS) changes in coding regions. The NS substitutions were predominantly C:G to A:T transversions (Fig. 1; Supplementary Table 5), representing a mutational spectrum different from that of other pediatric and adult tumors13,14,15. Overall, we detected 368 mutations in 353 genes (Supplementary Table 5). The average number of somatic mutations in neuroblastomas was similar to that reported for neuroblastoma by Molenaar et al.16 and slightly higher than the number in medulloblastomas, a pediatric tumor analyzed by exome sequencing13. This is notably lower than the number of alterations observed in most common adult solid tumors14,15. One tumor-derived cell line, NB07C, had a substantially higher number of somatic mutations (169 NS changes) than the other neuroblastomas analyzed. This case was considered to be an outlier in this study but may identify a unique subset of cases if similar tumors are identified in future validation efforts.
Six samples were analyzed by both exome and high-coverage whole-genome sequencing, permitting independent validation of the somatic alterations as well as a comparison of these approaches for the detection of sequence alterations. Over 91% of the whole-genome and 94% of whole-exome targeted bases were represented by at least 10 reads (Supplementary Tables 2 and 3). A total of 245 somatic alterations in coding regions were detected by either approach with 219 mutations identified by whole-genome sequencing and 240 alterations identified by whole-exome sequencing. Exomic and genomic sequencing detected 98% and 89%, respectively, of the mutations, consistent with similar comparisons made by others17.
In addition to the single base substitutions and indels, we analyzed copy number changes corresponding to focal amplifications (≥5-fold copy number gain) or homozygous deletions (less than 20 Mb in size) as these are likely to harbor potential oncogenes and tumor suppressor genes. There was an average of two such focal copy number changes per tumor (range, 0 to 10 per tumor) whose boundaries included at least one protein-encoding gene (Supplementary Table 6); all were amplification events and the majority included either MYCN or ALK as the putative target gene. One tumor amplicon (in NB1395T) harbored LIN28B, which is downstream of MYCN and a putative neuroblastoma oncogenic driver18,19. There were also four structural rearrangements per tumor that were within protein-encoding genes (range, 0 to 18 per tumor; Supplementary Tables 4 and 7 and Supplementary Fig. 2). These included deletions, duplications, and inversions within the same chromosome as well as inter-chromosomal translocations. We did not find evidence of chromothripsis in these samples, although this has recently been reported in a subset of high-risk neuroblastoma tumors16.
The coding exons of all genes that were recurrently altered in the tumors analyzed by next generation sequencing were examined by PCR and Sanger sequencing in 74 additional neuroblastoma cases (Table 2, Supplementary Table 1 and Online Methods). Integration of these data with next generation sequencing data revealed a number of novel genes as well as those previously known to be involved in neuroblastoma. The ALK receptor tyrosine kinase gene was found to be mutated in 8 of 90 cases (9%) in our discovery screen (Table 2 and Supplementary Table 5). All eight sequence changes in ALK affected two amino acid residues in the tyrosine kinase domain (R1275Q, R1275L and F1174L) that have been reported to lead to constitutive kinase activity4,8,11. An additional 15-fold amplification of the ALK gene was identified in one of 32 cases evaluated for structural changes and copy number alterations (Supplementary Table 6). However, no ALK translocations were detected, suggesting that this mechanism of ALK activation, typical of large cell lymphomas, non-small cell lung cancers, and inflammatory myofibroblastic tumors, is uncommon in neuroblastoma20,21. Additionally, the MYCN oncogene was found to be focally amplified in 15 of the 32 (47%) neuroblastomas, including 5 of the 6 neuroblastoma cell lines, consistent with the previously reported frequency of MYCN amplification in high risk tumors and cell lines derived from such tumors7 (Table 2 and Supplementary Table 6). Co-amplification of ODC1, a MYCN target gene important for oncogenicity in neuroblastoma22, was seen in 3 of 15 (20%) MYCN amplified tumors (none of which displayed copy number changes of ALK). Other alterations in known cancer genes included a glutamine to lysine change at codon 61 in the HRAS oncogene, and single missense alterations in the PTCH1 tumor suppressor and in the EGF receptor family member ERBB4 (Supplementary Table 5).
In addition to these alterations, a number of mutations in genes not previously known to be involved in neuroblastoma were identified. The most prominent example was the detection of intragenic hemizygous deletions targeting the AT rich interactive domain 1B gene, ARID1B, in three of 32 tumors (9%) in the discovery screen (Fig. 2, Table 2, and Supplementary Table 7). The deletions in ARID1B were identified by virtue of their aberrantly spaced paired-end sequences and, due to their small size and hemizygous nature, would have been difficult to detect using conventional copy number analyses. These included an 83 kb deletion encompassing exon 6 and a 147 kb deletion encompassing exons 6–9 that were predicted to result in a frameshift and premature truncation of the gene products, and a 621 kb deletion that removed exons 1 and 2, including the protein translation start site (Fig. 2 and Table 2). All these deletions, which were confirmed by PCR amplification and sequencing across the deletion junction, would be expected to abolish functional translation of the key downstream DNA binding (ARID) and topoisomerase-II associated (PAT1) protein domains of ARID1B. An additional tumor had an insertion mutation in the homologous ARID1A gene that would be predicted to lead to premature termination of the protein.
To investigate the prevalence of these specific alterations identified in the discovery screen, we designed a custom capture approach to selectively sequence and detect point mutations and structural alterations in the genomic regions of ARID1A, ARID1B, ALK and MYCN in 40 additional neuroblastoma cases (Supplementary Fig. 1, Prevalence Screen). These analyses yielded an average sequence coverage of 723-fold per targeted base (Supplementary Tables 1 and 8). Through these analyses we were able to identify an intragenic hemizygous deletion, a splice-site mutation and a missense mutation in ARID1B in two additional tumors as well as an additional intragenic deletion in a previously analyzed sample (NB05) (Fig. 2, Table 2 and Supplementary Tables 5 and 7). Collectively, ARID1B point mutations or intragenic deletions were identified in 5/71 (7%) of neuroblastoma cases (Fig. 2 and Table 2). We further identified hemizygous deletions encompassing the entire coding region of ARID1B in the distal region of 6q in 5 additional cases (Supplementary Table 6). Furthermore, point mutations of ARID1A were identified in three additional cases, two of which led to biallelic inactivation through mutation predicted to result in premature termination of the protein and deletion of the alternative allele at 1p36 (Fig. 2 and Table 2, Supplementary Table 5). All of these alterations were confirmed by Sanger sequencing. Not surprisingly, we identified additional ALK missense changes and MYCN amplifications, resulting in somatic alterations of ALK in 18/130 (14%) and of MYCN in 43/71 (61%) of total cases (Table 2, Supplementary Tables 5 and 6).
ARID1B is a member of the SWI/SNF transcriptional complex that is thought to regulate chromatin structure23. Mutations recently identified in ARID1B suggest that it may serve as a potential tumorigenic driver in a small fraction of hepatocellular24, breast25, ovarian26, and medulloblastoma27,28 tumors. Our integrated genomic analyses identified five independent structural alterations and two sequence changes in ARID1B, the majority of which would result in a truncated protein; these findings strongly support this gene as a contributor to neuroblastoma oncogenesis (passenger probability P<0.001). Interestingly, we found sequence alterations in other genes involved in chromatin regulation in neuroblastoma. These included two frameshift, one nonsense and one missense mutation in ARID1A, another SWI/SNF complex member, nonsense mutations in the histone acetyl transferase (HAC) genes EP300 and CREBBP, and missense mutations in the SWI2/SNF2 family member TTF2 gene, the histone demethylase gene KDM5A, and the chromatin remodeling zinc finger gene IKZF1. Genes involved in chromatin structure or remodeling have been implicated in human cancers. These include a high frequency of alterations of ARID1A in ovarian clear cell carcinomas26, SMARCB1 in malignant rhabdoid tumors29, alterations of PBRM1 in renal cell carcinomas30, alterations of EP300 and CREBBP in transitional cell carcinomas of the bladder31 and B cell lymphomas32, alterations of DAXX and ATRX in pancreatic endocrine tumors33, and inactivation of histone methyltransferases MLL2 and MLL3 in medulloblastomas13, among others34–36. Of note, ATRX has recently been shown to be mutated in neuroblastoma tumors from adolescents and young adults (≥12 years old)12 but would not have been expected to be altered in a significant fraction of the patients evaluated in our study (median age of diagnosis <2 years old, range <1 to 6 years old).
Although the number of sequence alterations in neuroblastomas was low compared to adult tumors, the frequency of recurrent structural rearrangements in neuroblastomas was relatively high. Every tumor had at least one rearrangement (range, 1 to 66) and all cases that had recurrent copy number changes of the MYCN, ARID1B, or ALK genes also had rearrangements at these loci. Such rearrangements are not present in normal cells and could therefore be useful as biomarkers of neuroblastoma. Given the poor treatment outcomes of many neuroblastoma patients, the availability of non-invasive biomarkers to detect minimal residual disease after surgery and to measure molecular response to chemotherapy would be useful for clinical management of neuroblastoma patients.
To demonstrate the feasibility of this approach, we developed personalized biomarkers based on the rearrangements present in the cancers analyzed37. This was performed through analysis of either whole-genome sequencing or capture and sequencing of the MYCN locus to identify structural alterations associated with novel rearrangement junctions not present in the germline (Online Methods). We have previously shown that tumor-specific rearrangements have the potential to serve as highly sensitive biomarkers for tumor detection and monitoring37, and would therefore be expected to have fundamental advantages over measurement of wild-type sequences, including wild-type MYCN levels38, in neuroblastoma patients. Notably, both MYCN amplified and non-amplified tumors had identifiable somatic rearrangement biomarkers, and in three cases in which serum was available at the time of diagnosis, we were able to detect and quantify such specific tumor rearrangements in the patients’ serum (Table 3, Supplementary Table 9). Interestingly, quantitative analyses showed that there was much more tumor DNA freely floating in the serum than in circulating cells, suggesting that the cell free compartment of blood may represent a more sensitive source for detection of tumor burden (Table 3).
We developed personalized rearrangement biomarkers to monitor circulating tumor DNA (ctDNA) in serial plasma samples from four additional cases of neuroblastoma obtained during a post-consolidation minimal residual disease (MRD) immunotherapy trial39 (Supplementary Fig. 3). In two cases, NB2885T and NB2870T, ctDNA was detected at the end of standard high-risk neuroblastoma therapy and, despite MRD immunotherapy, both patients went on to relapse and eventually died of disease. The prolonged reduction in ctDNA in NB2885T during immunotherapy may be an indication of therapeutic response whereas the marked increase in ctDNA in NB2870T correlated with clinical relapse during the trial period. In cases NB6321T and NB2464T, no ctDNA was detectable and these patients were alive at the last follow-up over one and four years later, respectively. These data demonstrate that ctDNA may be a useful surrogate for the level of clinical disease, and that the presence of ctDNA may be a highly sensitive and specific predictor of minimal residual disease and subsequent relapse40.
These genome-wide sequence analyses suggest that neuroblastoma tumors are driven by a relatively small number of somatically acquired alterations and that genes involved in chromatin remodeling, including ARID1B and ARID1A, were enriched for alterations. ARID1 family genes are integral components of the SWI/SNF neural progenitors-specific chromatin remodeling BAF complex that is essential for the self-renewal of multipotent neural stem cells41. Tumor-specific deletions encompassing ARID1B have been reported in CNS tumors42 and multiple members of this complex have been identified as tumor suppressor genes26,41. We found that high expression of members unique to the neural-progenitor BAF complex correlates with a high-risk neuroblastoma phenotype while high expression of those specific to the neuron specific BAF complex, or downstream neuritogenesis target genes, correlates with lower risk neuroblastoma (Supplementary Fig. 4). These data support a model whereby disrupted BAF complex signaling may preserve an undifferentiated progenitor state.
The model above would suggest that alterations in ARID1 may correlate with a more aggressive neuroblastoma phenotype. All but one of the patients with alterations in ARID1A or ARID1B died of progressive disease, including a child with low-risk neuroblastoma (a group with a survival probability of >98%). ARID1 alterations were associated with inferior overall survival of 386 days compared to 1689 days for patients without such alterations (hazard ratio, HR 4.49; 95% confidence interval, CI 1.24–16.33; P=0.0226, log-rank test; Fig. 3 and Supplementary Table 10). An analysis that also included hemizygous deletions of the entire coding region of ARID1B further increased the significance of the survival difference between patients with mutant and wildtype ARID1B/A (hazard ratio, HR 6.41; 95% confidence interval, CI 1.93–21.25; P=0.0024, log-rank test). The median survival of patients with ARID1 alterations was lower than that of any other genetic alterations assessed, including MYCN amplification (median survival 726 days), providing a potential marker for early therapy failure and disease progression.
This study underscores the importance of integrated genomic analyses, including detection of sequence alterations, copy number changes, and rearrangements that can now be performed using massively parallel sequencing approaches to identify subtle genomic changes. Despite the comprehensive efforts of this study, some alterations may not have been detected. First, a small fraction of the exome was not analyzed, either due to low sequence coverage in the whole-genome analyses or inadequate capture in the exome analyses. Second, it is possible that point mutations in non-protein-coding regions of the genome may be involved in neuroblastoma. Such data were obtained for six neuroblastoma cases and did not identify any clear clustering of alterations; analysis of additional neuroblastoma cases could be useful to further interpret these non-coding changes. Third, germline neuroblastoma susceptibility variants have been identified43,44 and additional such variants yet to be discovered may be present in our neuroblastoma cases. Fourth, it is possible that epigenetic alterations contribute to the initiation or progression of neuroblastomas. This possibility is intriguing given the new data on ARID1B and ARID1A in this tumor type. Finally, although rearrangements and copy number changes were detected in a genome-wide fashion, many of these occurred in non-coding regions and their functional roles remain to be elucidated.
Our data add to the growing knowledge of the genomic landscapes of human cancers. They are consistent with the idea that pediatric tumors do not require as many genetic alterations as typical adult cancers13,45. Although few alterations were identified in known therapeutically-targetable oncogenes such as ALK, there are many other alterations, both subtle and large, that are found in these cancers and many of these affect chromatin-modifying genes. These data highlight the important connection between genetic alterations in the cancer genome and epigenetic pathways, and provide new avenues for research and disease management in neuroblastoma patients.
Neuroblastoma tumor DNA (from cell lines and primary tumors), matched germline DNA (from peripheral blood or lymphoblastoid cell line) and patient serum or plasma were obtained from the Children’s Oncology Group (COG) cell line repository and the COG Neuroblastoma biobank following committee approval (study #COG NB 2008-02). Informed consent for research use was obtained from all patients and/or parents at the enrolling COG member institution prior to tissue banking or cell line generation, and study approval was obtained from The Children’s Hospital of Philadelphia Institutional Review Board. All samples were STR genotyped to confirm identity. Primary tumor samples were selected from patients with COG high-risk disease, and specimens were verified to have >75% viable tumor cell content by histopathology assessment. Serial plasma samples for MRD assays were obtained from patients enrolled on the COG ANBL0032 immunotherapy study.
Genomic DNA libraries were prepared and captured following Illumina’s (Illumina, San Diego, CA) suggested protocol with the modifications described in the Supplementary Note, or by Personal Genome Diagnostics (Baltimore, MD). DNA libraries were sequenced with the Illumina GAIIx/HiSeq Genome Analyzer, yielding 100 or 200 base pairs of sequence from the final library fragments for high coverage exome/low coverage genome and high coverage genome analyses respectively. Sequencing reads were analyzed and aligned to human genome hg18 with the Eland algorithm in CASAVA 1.7 software (Illumina). Reads were mapped using the default seed-and-extend algorithm, which allowed a maximum of 2 mismatched bases in the first 32bp of sequence. Identification of somatic alterations was performed as previously described46–49 utilizing a next-generation sequencing analysis pipeline that enriched for tumor-specific single nucleotide alterations and small insertions/deletions. Briefly, for each position with a mismatch (as compared to the hg18 reference sequence using the Eland algorithm) the read coverage of the mismatch and wild-type sequence at that base was calculated. A candidate mismatched base was identified as a mutation only when (i) two or more distinct paired-tags contained the mismatched base; (ii) the number of distinct paired-tags containing a particular mismatched base was at least 7.5% of the total distinct tags; and (iii) the mismatched base was not present in >0.5% of the tags in the matched normal sample. Candidate somatic point mutations identified by next generation sequencing approaches were confirmed by an independent sequencing method (either a different next-generation sequencing approach or polymerase chain reaction (PCR) followed by Sanger sequencing, Supplementary Table 5).
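As an illustrative sketch only (this is not the analysis pipeline used in this study), the three filtering criteria above can be written as a small Python function. The class and function names are hypothetical, and the handling of positions with no matched-normal coverage is an assumption that the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Distinct paired-tag counts at one mismatched position (hypothetical container)."""
    tumor_mismatch_tags: int   # distinct tumor paired-tags carrying the mismatched base
    tumor_total_tags: int      # all distinct tumor paired-tags covering the position
    normal_mismatch_tags: int  # matched-normal tags carrying the same mismatched base
    normal_total_tags: int     # all matched-normal tags covering the position


def is_candidate_somatic(site: SiteCounts,
                         min_tags: int = 2,
                         min_tumor_fraction: float = 0.075,
                         max_normal_fraction: float = 0.005) -> bool:
    """Apply criteria (i)-(iii) from the text to a single position."""
    if site.tumor_total_tags == 0:
        return False
    # (i) two or more distinct paired-tags contain the mismatched base
    if site.tumor_mismatch_tags < min_tags:
        return False
    # (ii) mismatched tags are at least 7.5% of the distinct tumor tags
    if site.tumor_mismatch_tags / site.tumor_total_tags < min_tumor_fraction:
        return False
    # (iii) mismatched base is not present in >0.5% of matched-normal tags.
    # Assumption: a position without normal coverage is not rejected here.
    if site.normal_total_tags > 0:
        normal_fraction = site.normal_mismatch_tags / site.normal_total_tags
        if normal_fraction > max_normal_fraction:
            return False
    return True
```

A position passing this filter would still be only a candidate; as described above, every reported mutation was additionally confirmed by an independent sequencing method.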
For 12 selected genes that were somatically altered, the coding region was sequenced in a validation set composed of an independent series of 74 additional neuroblastomas and matched controls. These genes included ALK, ANKRD34B, ARID1B, ARID1A, FAR1, PRSS16, PRSS23, RASGRP3, TTLL6, VANGL1, VCAN and ZHX2. PCR amplification and Sanger sequencing analyses were performed following protocols described previously15.
Single tags passing filter were grouped by genomic position in nonoverlapping 3-kb bins. A tag density ratio was calculated for each bin by dividing the number of tags observed in the bin by the average number of tags expected to be in each bin (on the basis of the total number of tags obtained for chromosomes 1 to 22 for each library divided by 849,434 total bins). The tag density ratio thereby allowed a normalized comparison between libraries containing different numbers of total tags. A control group of libraries made from the six matched normal high coverage whole-genome samples from Supplementary Table 1 and six additional normal samples [Co84N, Co108N, B5N, B7N37 and CEPH (Centre d’Etude du Polymorphisme Humain) samples NA07357 and NA18507] was used to define areas of germline copy number variation or that contained a large fraction of repeated or low-complexity sequences. Any bin where at least two of the normal libraries had a tag density ratio of <0.25 or >1.75 was removed from further analysis.
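The normalization and normal-panel filter described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: bins are represented by integer indices, the function names are hypothetical, and only the constants (849,434 bins; thresholds of 0.25 and 1.75; two or more normal libraries) come from the text.

```python
from typing import Dict, List, Set

TOTAL_BINS = 849_434  # number of 3-kb bins on chromosomes 1 to 22, as stated in the text


def tag_density_ratios(tags_per_bin: Dict[int, int],
                       total_bins: int = TOTAL_BINS) -> Dict[int, float]:
    """Divide each bin's tag count by the library-wide average number of tags per bin."""
    total_tags = sum(tags_per_bin.values())
    if total_tags == 0:
        raise ValueError("library contains no tags")
    expected_per_bin = total_tags / total_bins
    return {b: n / expected_per_bin for b, n in tags_per_bin.items()}


def unreliable_bins(normal_ratios: List[Dict[int, float]],
                    low: float = 0.25,
                    high: float = 1.75,
                    min_libraries: int = 2) -> Set[int]:
    """Return bins where at least `min_libraries` normal libraries have a ratio <low or >high."""
    counts: Dict[int, int] = {}
    for ratios in normal_ratios:
        for b, r in ratios.items():
            if r < low or r > high:
                counts[b] = counts.get(b, 0) + 1
    return {b for b, c in counts.items() if c >= min_libraries}
```

Because each ratio is relative to the library's own average, libraries sequenced to different depths become directly comparable, which is the purpose of the normalization stated above.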
For all samples analyzed with low coverage whole-genome sequencing (Supplementary Table 4), amplifications were identified as three or more bins with tag ratios of >2, separated by no more than ten intervening bins with a tag ratio <2. For all amplifications, at least one bin had a tag ratio of ≥5. For samples with high coverage whole-genome sequencing (Supplementary Table 3), homozygous deletions were identified as three or more bins with tag ratios of <0.25, separated by no more than ten intervening bins with a tag ratio >0.25. Single-copy gains and losses were identified through visual inspection of tag density data for each sample.
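One way to read the amplification rule above is sketched below. It is an interpretation rather than the study's implementation: the function name is hypothetical, the input is assumed to be the tag density ratios of consecutive bins on a single chromosome, and the gap of ten intervening bins is assumed to apply between neighboring bins with a ratio >2.

```python
from typing import List, Tuple


def call_amplifications(ratios: List[float],
                        gain: float = 2.0,
                        peak: float = 5.0,
                        min_bins: int = 3,
                        max_gap: int = 10) -> List[Tuple[int, int]]:
    """Return (first_bin, last_bin) index pairs for candidate amplifications."""
    # Indices of bins whose tag ratio exceeds the gain threshold (>2).
    high = [i for i, r in enumerate(ratios) if r > gain]
    segments: List[List[int]] = []
    for i in high:
        # Extend the current segment if at most `max_gap` bins lie in between.
        if segments and i - segments[-1][-1] - 1 <= max_gap:
            segments[-1].append(i)
        else:
            segments.append([i])
    calls = []
    for seg in segments:
        # Require three or more gained bins and at least one bin with ratio >= 5.
        if len(seg) >= min_bins and max(ratios[i] for i in seg) >= peak:
            calls.append((seg[0], seg[-1]))
    return calls
```

The homozygous deletion rule has the same shape with the comparisons reversed (ratios <0.25 instead of >2), so the same segment-building logic could be reused with different thresholds.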
For all samples analyzed with targeted capture sequencing, the tag ratio for each gene was calculated as the average read coverage for the gene, divided by the average read coverage of the ALK, ARID1A and ARID1B genes (MYCN was not used as it is frequently amplified). These values were normalized to the average coverage for each gene in a normal sample. Amplifications and hemizygous deletions were identified if the tag ratio for a gene was ≥ 5.0 or < 0.65, respectively. Hemizygous deletions were confirmed through LOH analyses of SNPs in the genomic region of each gene.
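The targeted-capture calculation can be illustrated with the short sketch below. It assumes one plausible reading of the normalization step, namely that the tumor's relative coverage for a gene is divided by the same quantity computed in a normal sample; the function names and the example coverage values are hypothetical.

```python
from statistics import mean
from typing import Dict

REFERENCE_GENES = ("ALK", "ARID1A", "ARID1B")  # MYCN excluded, as in the text


def relative_coverage(coverage: Dict[str, float]) -> Dict[str, float]:
    """Average coverage of each gene divided by the mean coverage of the reference genes."""
    ref = mean(coverage[g] for g in REFERENCE_GENES)
    return {g: c / ref for g, c in coverage.items()}


def classify_genes(tumor: Dict[str, float],
                   normal: Dict[str, float],
                   amp: float = 5.0,
                   hemi_del: float = 0.65) -> Dict[str, str]:
    """Label each gene using the thresholds from the text (>=5.0 and <0.65)."""
    t = relative_coverage(tumor)
    n = relative_coverage(normal)
    calls = {}
    for gene in tumor:
        ratio = t[gene] / n[gene]  # assumed normalization to the normal sample
        if ratio >= amp:
            calls[gene] = "amplification"
        elif ratio < hemi_del:
            calls[gene] = "hemizygous deletion"
        else:
            calls[gene] = "no call"
    return calls


# Made-up coverage values, for illustration only.
example = classify_genes(
    tumor={"ALK": 100.0, "ARID1A": 100.0, "ARID1B": 40.0, "MYCN": 2000.0},
    normal={"ALK": 100.0, "ARID1A": 100.0, "ARID1B": 100.0, "MYCN": 100.0},
)
```

Note that a deletion in one of the reference genes lowers the denominator for every gene, which is one reason an orthogonal confirmation such as the LOH analysis mentioned above is useful.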
Six samples with high coverage whole-genome sequencing were analyzed for amplifications at the MYCN locus. The boundary coordinates for these amplifications were compared and a one megabase (hg18 chr2:15.5Mb–16.5Mb) region was identified that contained at least one amplification boundary region from each sample.
Somatic rearrangements were identified by querying aberrantly mapping reads from one flow cell of an Illumina GAIIx run (100bp PE) or up to two lanes of an Illumina HiSeq Genome Analyzer run (50bp PE) to achieve a physical coverage of >8X. The discordantly mapping pairs were grouped into 1kb bins when at least 2 distinct tag pairs (with distinct start sites) spanned the same two 1kb bins (known bins which contained aberrantly mapping tags were removed as described above37, as well as 1kb bins involved in known germline structural alterations50).
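The grouping of discordant pairs into 1 kb bins can be sketched as follows. This is a simplified illustration, not the study's code: read ends are reduced to (chromosome, start) tuples, "distinct" is interpreted as differing in at least one end coordinate, and the set of excluded bins stands in for the germline and artifact filters described above.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Tuple

ReadEnd = Tuple[str, int]            # (chromosome, start position)
BinKey = Tuple[str, int]             # (chromosome, 1-kb bin index)
BinPair = Tuple[BinKey, BinKey]


def candidate_bin_pairs(pairs: Iterable[Tuple[ReadEnd, ReadEnd]],
                        bin_size: int = 1000,
                        min_pairs: int = 2,
                        excluded: FrozenSet[BinKey] = frozenset()) -> Dict[BinPair, int]:
    """Count distinct discordant pairs linking the same two bins; keep links with enough support."""
    support = defaultdict(set)
    for end1, end2 in pairs:
        b1 = (end1[0], end1[1] // bin_size)
        b2 = (end2[0], end2[1] // bin_size)
        if b1 in excluded or b2 in excluded:
            continue  # bin flagged by the normal panel or a known germline variant
        key = tuple(sorted((b1, b2)))
        # Storing sorted end coordinates in a set collapses duplicate pairs
        # that share identical start sites.
        support[key].add(tuple(sorted((end1, end2))))
    return {k: len(v) for k, v in support.items() if len(v) >= min_pairs}
```

Requiring distinct start sites guards against counting PCR duplicates of a single library fragment as independent evidence for a rearrangement.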
To identify all high-confidence genomic rearrangements, candidate rearrangements were filtered using the above described criteria and were required to have at least one tag sequenced across the rearrangement breakpoint. Breakpoints were determined using BLAT alignment to the human genome sequence (hg18)51. In order to ensure that no recurrent rearrangements in coding genes were missed, genes which harbored rearrangements were evaluated for all candidate rearrangements without the requirement that the breakpoint be present in a sequenced tag and any recurrent gene rearrangement was further analyzed. Candidate rearrangements were confirmed as somatic when a 10 uL PCR based reaction (containing 5.9 uL H2O, 1 uL 10X PCR buffer, 1 uL 10mM dNTPs, 0.6 uL DMSO, 0.4 uL 25uM primers, 0.1 uL Platinum Taq and 1 uL DNA, 3 ng/uL) resulted in the amplification of a product of the expected size in the tumor but not in the matched normal on a 1% ethidium bromide stained agarose gel. Utilizing this stringent pipeline, of the 26 candidate genomic rearrangements tested, 25 were confirmed as somatic (96%) as well as 15 of the 16 candidate rearrangements tested that were identified by the MYCN capture sequencing method (94%). In all three cases of ARID1B somatic rearrangement, the PCR product was Sanger sequenced to identify the breakpoint at base-pair resolution. For biomarker analyses, rearrangements were identified with the method described above, with a subsequent PCR product sequenced and aligned using BLAT to hg1851 in order to design primers to amplify a PCR product in the serum, plasma or peripheral blood between 70 and 120 bp.
Circulating tumor DNA was amplified using 2x Phusion Flash PCR Master Mix and patient specific primers (at a final concentration of 0.5uM each) in DNA isolated from serum or plasma and DNA isolated from peripheral blood cells. Subsequently, the level of tumor DNA was quantified after amplification by digital PCR on SYBR green I stained 10% TBE gels37.
For gene expression profiling by Affymetrix U95Av2 microarrays, the expression measures for each probe set were extracted and normalized using robust multi-array average protocols from raw CEL files as described previously. Basic linear correlation and regression were used to define r, r2 and two-tailed p value to assess correlation among gene expression values.
Curves for overall survival (calculated as the time from diagnosis) were constructed using the Kaplan-Meier method and compared between groups using the log-rank test for descriptive purposes. Cox proportional hazards models were used to test for the effect of clinical and genetic parameters on survival. Passenger probabilities were calculated using the binomial test adjusted for gene sizes and corrected for multiple comparisons52.
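For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines of Python. This is a generic textbook implementation, not the statistical software used for this study; the function name is hypothetical and any input values are the caller's own (no patient data are reproduced here). The log-rank test and Cox models are not covered by this sketch.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    `times` are follow-up durations from diagnosis; `events` is True for a
    death and False for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all subjects sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set.
        at_risk -= removed
    return curve
```

Censored subjects contribute to the number at risk until their last follow-up but never trigger a step in the curve, which is how patients alive at last follow-up are handled in such analyses.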
We thank the families and children with neuroblastoma who contributed to this work. We thank John Maris for valuable input to this work, J. Ptak, N. Silliman, L. Dobbyn, M. Whalen, J. Schaefer, and T. Mosbruger for technical assistance with sequencing analyses, Lisa Kann and Sam Angiuoli of Personal Genome Diagnostics for targeted sequence analyses, the Children’s Oncology Group (COG), Wendy B. London and the COG Statistics and Data Center, Julie Gastier-Foster and the Neuroblastoma Reference Laboratory, Nilsa Ramirez and the Biopathology Center, Cindy Winter and the CHOP Nucleic Acids Bank, and Tito Woodburn and the COG Cell Line Repository. This work was generously supported by the St. Baldrick’s Foundation for childhood cancer research, the Virginia and D. K. Ludwig Fund for Cancer Research, Swim Across America, AACR Stand Up To Cancer-Dream Team Translational Cancer Research Grant, and NIH grant CA121113.
AUTHOR CONTRIBUTIONS
M.S. and R.J.L. are joint first authors. C.P.R. established cell lines and C.P.R. and X.L. purified DNA samples from which M.S. prepared next generation DNA sequencing libraries. J.W. performed MYCN capture of genomic DNA libraries for massively parallel sequencing. M.S. and R.J.L. analyzed sequencing data for structural alterations. M.S., S.J., N.P., B.V., K.W.K., and V.E.V. sequenced next-generation DNA libraries and performed mutational analyses. A.B., G.P., and L.A.D. performed statistical analyses of clinical and sequencing data. M.S., R.J.L., B.V., K.W.K., V.E.V., and M.D.H. conceived the research and wrote the manuscript.
COMPETING FINANCIAL INTERESTS
L.A.D., N.P., B.V., K.W.K., and V.E.V. are founders of Inostics and Personal Genome Diagnostics and are members of their Scientific Advisory Boards. L.A.D., N.P., B.V., K.W.K., and V.E.V. own Inostics and Personal Genome Diagnostics stock, which is subject to certain restrictions under university policy. The terms of these arrangements are managed by Johns Hopkins University in accordance with its conflict-of-interest policies.
Sequence data have been deposited at the European Genome-Phenome Archive (EGA, http://www.ebi.ac.uk/ega/) which is hosted at the EBI, under accession number EGAS00001000369. Expression data have been previously deposited at the NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE3960.