MYCN gene amplification (MNA) is a hallmark of aggressive neuroblastoma. We sought to determine the univariate and multivariate predictors of tumor MNA.
Data from the International Neuroblastoma Risk Group (INRG) were analyzed from the subset of 7,102 patients with known MYCN status. We used chi-squared testing and logistic regression to identify univariate and multivariate predictors of MYCN status. Recursive partitioning was used to identify groups of patients with maximal difference in rates of MNA.
All clinical [age ≥18 months; high ferritin; high lactate dehydrogenase (LDH); INSS stage 4; adrenal site] and pathology/biology [DNA index ≤1; high MKI; undifferentiated/poorly differentiated grade; unfavorable histology by International Neuroblastoma Pathology Classification (INPC); segmental chromosomal aberrations (SCA)] features were significantly associated with MNA. High LDH (OR 8.4; p<0.001) and chromosome 1p LOH (OR 19.8; p<0.001) were the clinical and biologic variables, respectively, most strongly associated with MNA. In logistic regression, all variables except chromosome 17q aberration and pooled SCAs were independently predictive of MNA. Recursive partitioning identified subgroups with disparate rates of MNA, including subgroups with 85.7% MNA [patients with high LDH who had poorly differentiated adrenal tumors with chromosome 1p deletion] and 0.6% MNA [localized tumors with hyperdiploidy, low MKI, and lacking chromosome 1p aberration].
MNA is strongly associated with other clinical and biologic variables in neuroblastoma. Recursive partitioning identifies subgroups of neuroblastoma patients with highly disparate rates of MNA. These findings can be used to inform investigations of molecular mechanisms of MNA.
Neuroblastoma (NB) is a childhood tumor with marked clinical and biological heterogeneity.1,2,3 Stage, age4, histology3,5, lactate dehydrogenase (LDH) and ferritin levels3, MYCN status6,7, ploidy8,9, segmental chromosomal aberrations (SCA)3, and primary site10 have all been shown to be independently prognostic. In particular, MYCN amplification (MNA) is associated with rapid disease progression in patients of all ages or stages2,6,11 and is one of the strongest independent adverse prognostic factors.3
MYCN status is also associated with other adverse prognostic factors. For example, MNA is associated with unfavorable histology5, high mitosis-karyorrhexis index (MKI)12, and diploid/tetraploid tumors.13 MNA is associated with the presence of a range of SCAs, especially 1p deletion1,14,15, though 11q aberration shows a strong inverse relationship.16,17,18 The adverse prognostic effect of MNA can supersede otherwise favorable tumor genetics (e.g., near-triploid DNA content).2 MNA is found in a higher proportion of adrenal primary tumors10 and its incidence may also differ based upon the involvement of specific metastatic sites.19,20 Tumor MNA also varies by patient age, though this has been a complex nonlinear relationship in prior reports.21,22,23 To our knowledge, a comprehensive and definitive analysis of clinical and pathology/biology predictors of MNA in a large cohort of patients has not been reported. Such an analysis could clarify the extent to which variables associated with MNA interact or act as independent predictors, and therefore whether these associations are stronger in specific patient subgroups.
In the current study, we used the largest available international neuroblastoma dataset to define comprehensively the association of MNA with clinical and pathology/biology factors. We identified variables independently associated with MNA and patient subgroups with significantly different rates of MNA. These findings will provide a valuable resource for the field, will facilitate care of patients with neuroblastoma, and may lead to greater biological insight into the propensity for MNA. In particular, the use of recursive partitioning has the potential to reveal differences in the relative impact of predictors of MNA depending upon specific contexts defined by other variables.
The International Neuroblastoma Risk Group (INRG) database includes 8,800 individual patients <21 years of age with pathologically confirmed neuroblastoma who were diagnosed between January 1, 1990 and December 31, 2002.3 All patients were consented and enrolled on a neuroblastoma study within INRG member countries (France, Germany, Japan, Italy, Spain, United Kingdom) or within a cooperative group (Children’s Oncology Group, SIOPEN LNESG1 study). Only patients with known MYCN status were included.
MYCN status was the primary dichotomous outcome variable for this analysis. Within each INRG member group or country, MNA was determined by either interphase fluorescence in situ hybridization (I-FISH)24, polymerase chain reaction (PCR)25, array-based comparative genomic hybridization (aCGH)2, or multiplex ligation-dependent probe amplification (MLPA)2. For I-FISH, MNA has been defined as a more than 4-fold increase in the MYCN signal number compared to the reference probe on chromosome 2q.2 Data on number of copies of MYCN were not available for this analysis.
The INRG collects data on the following predictor variables used in this analysis: age; International Neuroblastoma Staging System (INSS) stage26,27; ploidy; presence of SCAs (1p, 11q, or 17q); ferritin level; LDH; primary site; sites of metastasis; histology (favorable vs. unfavorable by International Neuroblastoma Pathology Classification [INPC] system5); tumor diagnostic category (neuroblastoma vs ganglioneuroblastoma [intermixed, maturing, well-differentiated, or nodular]); grade (differentiating vs. poorly or undifferentiated); and MKI. Each continuous predictor variable was transformed into a binary variable for the analysis. Ferritin was categorized as high (≥92 ng/mL) versus low (<92 ng/mL) and LDH was categorized as high (≥587 U/L) versus low (<587 U/L), based on the mean values of ferritin and LDH in the overall INRG cohort, as previously reported3. Tumor primary site was categorized as thoracic vs. non-thoracic and adrenal vs. non-adrenal primary site. These site categories were chosen based on our previous work demonstrating the greatest effects of these sites on rates of MNA and because they are the most common primary sites10. We created a binary variable to reflect the presence of any SCA, defined as having 1p, 11q, and/or 17q aberrations.
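The binary recoding described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the handling of missing values in the pooled SCA variable (returning a missing value unless an aberration is known to be present or all three loci are known to be normal) is an assumption that the text does not specify. The cutoffs are those stated above.

```python
# Cutoffs stated in the text; names are hypothetical.
FERRITIN_CUTOFF_NG_ML = 92
LDH_CUTOFF_U_L = 587


def dichotomize(value, cutoff):
    """Return 'high' if value >= cutoff, 'low' if below, None if missing."""
    if value is None:
        return None
    return "high" if value >= cutoff else "low"


def any_sca(loh_1p, aberration_11q, gain_17q):
    """Pooled SCA flag from three True/False/None inputs.

    Assumption: one known aberration is enough to return True;
    False requires all three loci to be known and normal;
    otherwise the pooled value is treated as missing (None).
    """
    flags = (loh_1p, aberration_11q, gain_17q)
    if any(f is True for f in flags):
        return True
    if all(f is False for f in flags):
        return False
    return None


print(dichotomize(120, FERRITIN_CUTOFF_NG_ML))  # ferritin of 120 ng/mL
print(dichotomize(400, LDH_CUTOFF_U_L))         # LDH of 400 U/L
print(any_sca(False, None, True))
```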
We used chi-squared testing to perform univariate analyses comparing the frequency of MNA as a function of each categorical predictor variable. We used a t-test to compare the distribution of ages as a continuous variable between patients with and without MNA.
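As an illustration of the univariate comparison, the sketch below computes an odds ratio and a Pearson chi-squared test for a single 2x2 table. The function name and the counts are hypothetical (they are not taken from Table 1), no continuity correction is applied, and the sketch assumes all four cells are nonzero. The analyses reported here were run in STATA, so this is only a minimal stand-in.

```python
import math


def two_by_two_summary(a, b, c, d):
    """Odds ratio, chi-squared statistic, and p-value for a 2x2 table.

    a: unfavorable category, MNA present
    b: unfavorable category, MNA absent
    c: favorable category, MNA present
    d: favorable category, MNA absent
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-squared statistic for a 2x2 table, no continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2/2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical counts, for illustration only.
or_, chi2, p = two_by_two_summary(a=300, b=700, c=100, d=900)
print(f"OR = {or_:.2f}, chi2 = {chi2:.1f}, p = {p:.3g}")
```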
We evaluated appropriate variables for inclusion in multivariate testing (logistic regression and recursive partitioning). Most potential predictor variables had missing data for >10% of patients; the proportion of missing values ranged from 0% (age) to 95.2% (17q chromosomal aberration). For each predictor variable, we assessed whether values were missing at random with respect to MYCN status. The proportion of patients with MNA was calculated for the favorable category (decreased rates of MNA expected), the unfavorable category (increased rates of MNA expected), and the missing category. If values were missing at random, we expected the proportion of MNA in patients with missing data for the variable to fall between the proportions seen in the favorable and unfavorable groups. Tumor diagnosis category violated this pattern and was excluded from multivariate models.
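The missingness check described above reduces to a simple comparison of three proportions. A minimal sketch follows; the function names and counts are hypothetical and only illustrate the rule that the MNA rate in the missing category should lie between the rates in the favorable and unfavorable categories.

```python
def mna_rate(n_amplified, n_total):
    """Proportion of patients with MNA in one category."""
    return n_amplified / n_total


def missing_category_is_plausible(favorable, unfavorable, missing):
    """Each argument is a (n_amplified, n_total) pair for one category.

    Returns True when the MNA rate among patients with missing data
    falls between the rates of the favorable and unfavorable categories.
    """
    rate_fav = mna_rate(*favorable)
    rate_unfav = mna_rate(*unfavorable)
    low, high = min(rate_fav, rate_unfav), max(rate_fav, rate_unfav)
    return low <= mna_rate(*missing) <= high


# Hypothetical counts, for illustration only.
print(missing_category_is_plausible((50, 1000), (300, 1000), (120, 800)))
print(missing_category_is_plausible((50, 1000), (300, 1000), (400, 1000)))
```

A variable failing this check would be a candidate for exclusion from the multivariate models, as was done for tumor diagnosis category.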
The INPC histology classification system incorporates age as well as grade and MKI. To avoid confounding of INPC histology and age in multivariate models, only the underlying components (grade, age, and MKI) were included.
We performed multivariate logistic regression to select variables independently associated with MNA. Excluding 17q chromosomal aberration, only 188/7102 patients had known data for all other predictors. To utilize the full sample size of the dataset without selection bias and allow inclusion of patients for whom some predictor variables had missing values, we created a series of “dummy” variables that fully describe each predictor variable as unfavorable (yes/no), favorable (yes/no), and missing (yes/no). The “dummy” variables for unfavorable and missing categories were included in the model, leaving the favorable category as the reference. This approach prevents bias that might occur if only the subset of patients with complete data was utilized in the model.28,29 We used a backward selection approach with p<0.05 to enter the model and p<0.05 to remain in the final model. Three multivariate logistic regression models were built: Model A - testing all covariates; Model B – testing clinical variables only; and, Model C – testing pathology/biology variables only. Model A was repeated including a binary variable for vital status (alive vs. dead), to see which baseline variables of interest remain associated with MNA after adjustment for vital status.
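The missing-indicator coding described above can be sketched as follows. The function and column names are hypothetical. Each predictor contributes two columns (unfavorable and missing), with the favorable category as the implicit reference. The resulting rows could then be passed to any logistic regression routine, which is not shown here.

```python
def encode_with_missing_indicator(value):
    """Map one predictor value to (unfavorable_dummy, missing_dummy).

    value is 'unfavorable', 'favorable', or None (missing).
    The favorable category is the reference, coded (0, 0).
    """
    if value is None:
        return (0, 1)
    if value == "unfavorable":
        return (1, 0)
    if value == "favorable":
        return (0, 0)
    raise ValueError(f"unexpected category: {value!r}")


def build_design_row(patient, predictors):
    """Build the dummy-coded columns for one patient record (a dict)."""
    row = {}
    for name in predictors:
        unfavorable, missing = encode_with_missing_indicator(patient.get(name))
        row[f"{name}_unfavorable"] = unfavorable
        row[f"{name}_missing"] = missing
    return row


# Hypothetical patient with known LDH, missing ploidy, and no MKI entry.
example = build_design_row(
    {"ldh": "unfavorable", "ploidy": None}, ["ldh", "ploidy", "mki"]
)
print(example)
```

Because every patient receives a value for every column, no patient is dropped for incomplete data, which is the point of the approach.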
We used the classification and regression tree (CART) analysis method to identify age cut points associated with lower and higher rates of MNA.30 In this analysis, age as a continuous variable was the only predictor variable and MNA was the outcome variable. The first three (most highly predictive) age cut points from CART were eligible for selection in future models.
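To illustrate how a single age cut point can be chosen, the sketch below scans all candidate thresholds and keeps the one with the largest decrease in Gini impurity. The choice of Gini impurity is an assumption (it is a common CART splitting criterion, but the criterion used in this analysis is not stated), the function names are hypothetical, and the data are invented. Repeating the search within each resulting subgroup would yield further cut points.

```python
def gini(n_pos, n_total):
    """Gini impurity of a node with a binary outcome."""
    if n_total == 0:
        return 0.0
    p = n_pos / n_total
    return 2 * p * (1 - p)


def best_age_cut(ages_days, amplified):
    """Return (cut, impurity_decrease) for the best split 'age < cut'.

    ages_days: ages in days; amplified: 1 if MNA, else 0.
    Returns (None, 0.0) when no split reduces impurity.
    """
    pairs = sorted(zip(ages_days, amplified))
    n = len(pairs)
    total_pos = sum(y for _, y in pairs)
    parent = gini(total_pos, n)
    best_cut, best_gain = None, 0.0
    left_pos = 0
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical ages
        right_n = n - i
        weighted = (i * gini(left_pos, i)
                    + right_n * gini(total_pos - left_pos, right_n)) / n
        gain = parent - weighted
        if gain > best_gain:
            best_gain = gain
            # Place the cut midway between the two neighboring ages.
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_cut, best_gain


# Invented data, for illustration only.
cut, gain = best_age_cut([30, 60, 90, 400, 500, 600], [0, 0, 0, 1, 1, 1])
print(cut, gain)
```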
For multivariate recursive partitioning models, we used the univariate odds ratios for MNA to prioritize variables for selection. Each predictor variable was tested within the overall cohort and the statistically significant predictor with the largest odds ratio was selected manually to form the split. Each split created two nodes; odds ratios were recalculated and the process was repeated within each node. This process proceeded iteratively until either or both of the following pre-specified conditions were satisfied: 1) no remaining variables with a statistically significant (p<0.05) odds ratio; and/or 2) further split yielded a subgroup with fewer than 15 total patients.
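The manual procedure above can be approximated in code. The sketch below is an automated stand-in, not the method as performed: splits in this study were selected by hand. Several details are assumptions: predictors are coded 1 (unfavorable), 0 (favorable), or None (missing); patients with a missing value for the chosen variable are left out of both child nodes; a variable is skipped if either resulting subgroup would contain fewer than 15 patients or if any cell of its 2x2 table is empty; and significance is assessed with an uncorrected chi-squared test. All names and data are hypothetical.

```python
import math

MIN_NODE_SIZE = 15
ALPHA = 0.05


def table_for(patients, var):
    """2x2 counts (a, b, c, d) for var versus MNA, ignoring missing values."""
    a = b = c = d = 0
    for p in patients:
        x = p.get(var)
        if x is None:
            continue
        if x == 1:
            a, b = (a + 1, b) if p["mna"] == 1 else (a, b + 1)
        else:
            c, d = (c + 1, d) if p["mna"] == 1 else (c, d + 1)
    return a, b, c, d


def odds_ratio_and_p(a, b, c, d):
    """Odds ratio and chi-squared p-value (1 df); None if a cell is empty."""
    if min(a, b, c, d) == 0:
        return None
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d) / (b * c), math.erfc(math.sqrt(chi2 / 2))


def grow(patients, variables):
    """Recursively split on the significant variable with the largest OR."""
    n = len(patients)
    node = {"n": n, "mna_rate": sum(p["mna"] for p in patients) / n}
    best = None
    for var in variables:
        a, b, c, d = table_for(patients, var)
        if a + b < MIN_NODE_SIZE or c + d < MIN_NODE_SIZE:
            continue  # split would create a subgroup that is too small
        result = odds_ratio_and_p(a, b, c, d)
        if result is None:
            continue
        odds_ratio, p_value = result
        if p_value < ALPHA and (best is None or odds_ratio > best[0]):
            best = (odds_ratio, var)
    if best is None:
        return node  # leaf: no eligible significant predictor remains
    var = best[1]
    remaining = [v for v in variables if v != var]
    node["split_on"] = var
    node["unfavorable"] = grow([p for p in patients if p.get(var) == 1], remaining)
    node["favorable"] = grow([p for p in patients if p.get(var) == 0], remaining)
    return node


# Invented data, for illustration only.
patients = (
    [{"ldh": 1, "mna": 1}] * 30 + [{"ldh": 1, "mna": 0}] * 10
    + [{"ldh": 0, "mna": 1}] * 5 + [{"ldh": 0, "mna": 0}] * 35
)
tree = grow(patients, ["ldh"])
print(tree)
```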
We utilized CART in R (http://www.R-project.org/) for recursive partitioning using age as the only predictor variable. STATA version 12 (StataCorp, College Station, TX) was used for all other analyses.
The characteristics of the 7,102 patients with known MYCN status are shown in Table 1 and are similar to the full INRG cohort reported previously.3 MNA was reported in 1,155 patients in our cohort (16.3%).
Patients with MYCN amplified tumors were older than patients with MYCN non-amplified tumors (mean age 28.1 versus 24.3 months, p<0.001). A higher percentage of patients ≥18 months had MNA compared to patients <18 months (24.7% versus 9.9%); in addition, 65% of patients with MNA were ≥18 months of age. To further characterize the relationship of MNA with age, we analyzed the proportion of total cases with and without MNA as a function of age (Figure 1A). Patients 18–20.9 months old contributed the highest proportion of cases with MNA (9.3% of all cases with MNA). In contrast, patients 0–2.9 months old contributed the highest proportion of cases without MNA (17.6% of all cases without MNA). We then plotted the percent of patients in a given age interval with MNA (Figure 1B). The 3-month interval with the highest incidence of MNA was in patients 21–23.9 months old (38.4%) and the lowest incidence was in patients 0–2.9 months old (3.6%).
We used CART to identify optimal age cut points associated with the most disparate rates of MNA (Supplemental Figure 1). The top three age cutoffs identified by recursive partitioning were 367 days (12 months), 110 days (3.5 months), and 1282 days (42 months). Patients <110 days had an incidence of MNA of 3.6%, whereas patients 516–1086 days (17–35.5 months) had an incidence of MNA of 33.5%.
All clinical and pathology/biology variables were significantly associated with MYCN status in univariate analyses (Table 1). The clinical factor most strongly associated with MNA was high LDH [odds ratio (OR) 8.4; p<0.001], though multiple other variables had OR>4. The pathology/biology variable most strongly associated with MNA was LOH at 1p [OR 19.8; p<0.001], though multiple other variables had OR>10.
We created three separate logistic regression models (Table 2). When building a model that tested all clinical and pathology/biology variables, we demonstrated that all variables except gain of 17q and pooled SCAs were independently associated with MNA (Model A). After the inclusion of vital status, the results were similar to Model A except age was no longer independently associated with MNA (data not shown). Model B tested only clinical variables, and demonstrated that all clinical variables were independently associated with MNA. Model C tested only pathology/biology variables, and demonstrated that all pathology/biology variables except gain of 17q and pooled SCAs were independent predictors of MNA. In Model C, the point estimate for the OR for 1p LOH was similar to the univariate OR (17.8 from Table 2 model C vs. 19.8 from Table 1).
In multivariate recursive partitioning using only clinical variables (Figure 2A), we identified two extreme patient subgroups. The first subgroup (low LDH, age <3.5 months, and non-stage 4 disease) had an MNA rate of 1.3%. In contrast, patients with high LDH, non-thoracic primary sites, and age >12 months but <42 months had an MNA rate of 50.7%. The tree continued to split beyond these groups, but we made a post hoc decision to truncate the model after 5 splits to increase the practicality of the model (full tree with all splits shown in Supplemental Figure 2).
Our second tree utilized only pathology/biology variables (Figure 2B) and two extreme subgroups were identified. Tumors without 1p LOH and with low MKI had an MNA rate of 3.2%. In contrast, tumors with 1p LOH that were poorly differentiated, lacked 11q aberration, and were diploid had an MNA rate of 87.5%.
In our final recursive partitioning tree, all clinical and pathology/biology variables were available for selection (Figure 2C). This tree showed that non-stage 4 patients with low-MKI, hyperdiploid tumors lacking 1p LOH had an MNA rate of 0.6%. In contrast, patients with high LDH and adrenal primary tumors that were poorly differentiated and had 1p LOH had an MNA rate of 85.7%. In this final tree with all variables available for selection, age was not chosen as a significant variable.
We repeated univariate analyses among the 2,176 patients with stage 4 disease and at least one known specific metastatic site to assess whether metastatic sites are associated with MNA (Table 3). Lung metastasis was the site most strongly associated with MNA (OR 3.0; p<0.001). Bone marrow and bone metastases were also significantly associated with MNA (OR 1.4 and 1.3, respectively; p<0.01), while skin metastases had lower likelihood of MNA (OR 0.4; p=0.01).
In this same group, a multivariate logistic regression model, testing all of the potential clinical and pathology/biology variables previously evaluated, demonstrated that LOH at 1p, high LDH, absence of 11q aberration, high MKI, non-thoracic site, lung metastases, diploid/hypodiploid tumors, absence of skin metastases, and adrenal site were independently significantly associated with MNA (Table 4).
In this comprehensive analysis of predictors of MYCN amplified status, we demonstrate the complex interaction between MNA and other features of this disease. Each prognostic factor was statistically associated with MYCN status, with most remaining significant on multivariate testing. This independent association between MNA and almost all variables was unexpected. The result of our multivariable logistic regression model adjusting for vital status demonstrates that our findings cannot be fully explained by an association between MNA and prognosis. Pathology/biology variables demonstrated the strongest associations with MNA. Tumors with 1p LOH had almost 20-fold higher odds of MNA than tumors without 1p LOH, with a similar association even after controlling for other biological predictors. This is consistent with prior observations1,14,15 and, in context with our other findings, highlights that MNA is more strongly associated with tumor genetic features than with clinical features. For example, of the clinical variables, elevated LDH was the strongest predictor with a univariate odds ratio of 8.4 compared to 19.8 for 1p LOH.
Our recursive partitioning approach revealed dramatic differences in the rates of MNA between identified subgroups of patients, including two trees that yielded >80% absolute differences in rates of MNA between subgroups. In addition to highlighting the importance of SCA in prediction of MNA, this approach illustrates the relative importance of some variables over others in predicting MNA, as well as how the impact of a specific variable depends on the context of other predictor variables. For example, in the absence of LOH at 1p, the presence of high MKI is associated with a maximum incidence of MNA of 25% whereas, in the presence of LOH at 1p (along with other features), the presence of high MKI is associated with a maximum incidence of MNA of 71.4%. Moreover, among poorly differentiated tumors with LOH at 1p, aberration at chromosome 11q is inversely associated with MNA, though the difference in maximum incidence of MNA based upon presence or absence of this aberration is not large (maximum possible incidence of 87.5% for those without 11q aberration compared to 71.4% for those with 11q aberration).
Along these same lines, it is noteworthy that age was a key predictor of differential rates of MNA in models relying solely on clinical variables, but age was not selected as a predictive variable in the recursive partitioning model that included all clinical and biologic/pathologic variables as potential predictors. This finding, along with the results of our multivariate logistic regression models, suggests that other variables associated with age account in part for the association between MYCN status and age, which fits with the concept that age behaves as a surrogate for the effects of other clinical and biological variables. Graphical representations of the relationship of age with MNA show a complex relationship. London and colleagues showed that any age cutoff between 15–20 months appropriately separates neuroblastoma patients into high and low risk groups for treatment decisions, with 18 months selected for future risk stratification.23 We complement these findings by showing that MNA peaks in incidence at approximately this same age. Our results demonstrate the nonlinear relationship between age and MNA and should motivate additional studies into the developmental and molecular pathways involved in this association.
Our findings extend previous observations demonstrating positive associations between MNA and 1p LOH and gain of 17q, as well as a negative association between MNA and 11q aberration.2,14–16,31,32 Our multivariate logistic regression model focused solely on pathology/biology variables showed that the 1p and 11q associations with MYCN status were independent of each other, while the 17q association was not. This finding may be due to the fact that 17q and 1p aberrations are themselves correlated,2 or, more likely, may reflect the small number of patients with available 17q data. The composite variable that included 1p, 11q, and/or 17q segmental aberrations was also not independently associated with MYCN status, likely due to opposing effects of 1p and 11q aberrations on the incidence of MNA. The overall pattern observed in this and other studies33 is noteworthy as it indicates that specific loci rather than non-specific SCAs are associated with the presence or absence of MNA, though the underlying biologic mechanisms for these associations remain unknown.
Our results address the association between MNA and sites of primary and metastatic disease. We previously reported the association of primary site with MYCN status.10 We have extended that finding to demonstrate that adrenal tumors are more likely and thoracic tumors are less likely to be MYCN amplified even after controlling for other predictors of MYCN status. Likewise, we have previously demonstrated that patients with lung metastases are more likely to have MYCN amplified tumors.20 We now demonstrate that this association is independent of other predictors of MYCN status, though the underlying mechanisms for this association remain obscure. Unlike previous analyses, we did not find an association between MNA and central nervous system metastases, though this may be due to the rarity of this metastatic site at diagnosis.34 We also demonstrate that skin metastases are predictive of being MYCN non-amplified. This site of disease is more common in stage 4S disease and also in young infants, groups which are less likely to have MYCN amplified tumors.35
Although we utilized the largest available patient database, with 7,102 patients with known MYCN status, our work has certain limitations. Our overall rate of MNA of 16% is lower than widely cited estimates of 20–25%.1 It is possible that the comprehensive nature of the INRG database provides a more accurate estimate of the rate of MNA across neuroblastoma. We note that our estimate is identical to that reported by a large analysis from the COG16, though we acknowledge that data from the COG are included within the INRG database and may bias the estimate towards those previously reported by the COG. It is also possible that the contribution of low-risk patients identified from national neuroblastoma screening efforts and/or higher rates of MYCN testing in localized patients enriched the population with biologically favorable tumors, thereby reducing the proportion with MNA. Although our primary outcome, MYCN status, was not determined by identical techniques in each INRG member country, all those used are standard validated methods2,24,25. We chose to dichotomize LDH and ferritin based upon the median of the initial INRG cohort to maintain consistency with prior INRG analyses, though we acknowledge that other cutpoints could be explored in future analysis for their association with MYCN status. The multivariate logistic regression analyses were limited by missing data. We attempted to circumvent this issue by utilizing dummy variables to allow the full data set to be used, thereby reducing the risk of selection bias. This approach could not be used as part of our recursive partitioning models and therefore certain variables with limited data, such as gain of 17q, may have been less likely to be selected, thus potentially decreasing the generalizability of our findings.
Although the patient subgroups revealed by recursive partitioning can inform insights into the biological and pathological factors that lead to MYCN amplification and prompt further research in the field, some of the resulting subgroups are small, and caution is therefore warranted when interpreting these subgroups for clinical decisions.
Despite these limitations, the current study is the largest analysis of predictors of MNA. Our work highlights the importance of obtaining adequate tissue for detailed molecular testing as part of the diagnostic process. While we identified groups of patients with very low probabilities of MNA, no subgroups that completely lack MNA could be identified. Moreover, other pathology/biology variables, particularly SCAs, were key to identifying groups of patients with maximally different rates of MNA and improved our stratification beyond one restricted to the use of only clinical variables. Our findings should stimulate additional laboratory studies into the mechanisms of MNA as a function of age, as well as the genetics or epigenetics that may be orchestrating the complex interaction between the described molecular findings. Finally, additional studies exploring the ways that molecular changes associated with MNA interact with clinical features associated with MNA will be critical.
Supplemental Figure 1. Recursive partitioning using CART to identify age subgroups associated with MYCN amplification by testing age as a continuous variable.
Supplemental Figure 2. Manual recursive partitioning using logistic regression to identify factors associated with MYCN amplification, testing only clinical variables (full tree without truncation).
Support: The INRG database is supported in part by the William Guy Forbeck Research Foundation, the Little Heroes Cancer Research Fund, Children’s Neuroblastoma Cancer Foundation, Neuroblastoma Children’s Cancer Foundation, and the Super Jake Foundation. Data included in the INRG database were provided by Children’s Oncology Group (COG), Pediatric Oncology Group (POG), Children’s Cancer Study Group (CCSG), German Gesellschaft für Pädiatrische Onkologie und Hämatologie (GPOH), European Neuroblastoma Study Group (ENSG), International Society of Paediatric Oncology Europe Neuroblastoma Group (SIOPEN), Japanese Advanced Neuroblastoma Study Group (JANB), Japanese Infantile Neuroblastoma Co-operative Study Group (JINCS), Spanish Neuroblastoma Group and the Italian Neuroblastoma Group. Supported also in part by the Alex’s Lemonade Stand Foundation (KTV, KKM, SGD), Frank A. Campini Foundation (KKM and SGD), Edward Conner Fund (KKM), Dougherty Foundation (KKM and SGD), NIH grant CA039771 and the Audrey Evans Endowed Chair (GMB), and the Mildred V. Strouss Chair (KKM). The contents are solely the responsibility of the authors and do not necessarily represent the official views of the funding sources listed above.
Disclaimers/Disclosures: The authors have no conflicts of interest and/or acknowledgements to disclose.