Although five-year survival rates for childhood acute lymphoblastic leukemia (ALL) are now over 80% in most industrialized countries1, not all children have benefited equally from this progress2. Ethnic differences in survival after childhood ALL have been reported in many clinical studies3-11, with poorer survival observed among African Americans or those with Hispanic ethnicity when compared with European American or Asian patients3-5. The causes of ethnic differences remain uncertain, although both genetic and non-genetic factors are likely important4,12. Interrogating genome-wide germline SNP genotypes in an unselected large cohort of children with ALL, we observed that the component of genomic variation that co-segregated with Native American ancestry was associated with the risk of relapse (P=0.0029), even after adjusting for known prognostic factors (P=0.017). Ancestry-related differences in relapse risk were abrogated by the addition of a single extra phase of chemotherapy, indicating that modifications to therapy can mitigate ancestry-related risk of relapse.
After applying quality control measures (Supplementary Note), we analyzed 444,044 germline genetic single nucleotide polymorphisms (SNPs) in an ethnically diverse group of 2,534 children with ALL. To summarize genetic variation, we applied principal component analysis (PCA) to genotypes of 2,849 individuals, including 2,534 patients with ALL, plus 210 HapMap samples from descendants of Northern Europeans (CEU, N=60), West Africans (YRI, N=60), East Asians (CHB, N=45; JPT, N=45), and 105 Native American (NA)13 reference groups (Fig. 1). The top ranked principal component (PC1) separated self-reported black patients (N=250) and the YRI HapMap samples from all other groups (Fig. 1A); PC2 separated self-reported Asian patients (N=76) and the CHB/JPT HapMap samples from non-Asian populations (Fig. 1B). PC3 primarily captured genetic variation characteristic of NA ancestry, and self-reported Hispanics (N=405) exhibited a cline in PC3 between NA reference populations and other ancestral groups (Fig. 1C), consistent with the extensive ancestral admixture in Hispanics. Because the components of genomic variation (e.g. PCs) clearly co-segregated with geographic ancestries, we applied STRUCTURE14 to quantitatively determine ancestral composition of children with ALL. Patients encompassed a wide array of self-declared ethnic groups (Table 1) and we observed substantial variability in the ancestral genetic background of this unselected group of 2,534 children (Fig. 2A and Table 2).
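As a rough illustration of how PCA summarizes genotype data, the sketch below computes principal component scores from a small individuals-by-SNPs matrix of 0/1/2 genotype codes. It is not the authors' pipeline: the function name `genotype_pca`, the per-SNP scaling by the binomial standard deviation (an EIGENSTRAT-style normalization assumed here), and the toy matrix are all illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """Return PC scores for an (individuals x SNPs) matrix of 0/1/2 codes."""
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0          # frequency of the counted allele per SNP
    keep = (freq > 0) & (freq < 1)       # drop monomorphic SNPs (zero variance)
    g, freq = g[:, keep], freq[keep]
    # Center each SNP and scale by its binomial standard deviation
    # (assumed normalization, not taken from the paper).
    z = (g - 2.0 * freq) / np.sqrt(freq * (1.0 - freq))
    # SVD of the normalized matrix; left singular vectors times singular
    # values give the per-individual PC scores.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Invented toy data: rows 0-2 and rows 3-5 mimic two ancestral groups.
toy = [
    [0, 0, 0, 1, 2, 0],
    [0, 0, 1, 1, 2, 0],
    [0, 1, 0, 0, 2, 1],
    [2, 2, 2, 1, 2, 0],
    [2, 2, 1, 1, 2, 1],
    [2, 1, 2, 0, 2, 0],
]
pc1 = genotype_pca(toy, n_components=1)[:, 0]
print(pc1)  # the two groups fall on opposite sides of zero (overall sign is arbitrary)
```

In the toy matrix the first three SNPs differ strongly between the two groups, so the first component separates them, much as PC1 to PC3 separate ancestral groups in Fig. 1.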
We tested whether genetic ancestry itself was associated with treatment outcome, after stratifying for treatment protocols, and found that the cumulative incidence of relapse was significantly associated with higher NA ancestry, treated as a continuous variable (Fig. 2B, P=0.0029, N=2,534, Supplementary Table S1). NA ancestry was also negatively associated with event-free survival (P=0.018). Within patients self-reporting as whites, there was a trend for higher NA ancestry to be related to a higher risk of ALL relapse (Fig. 2C, P=0.08, N=1,687). In a multivariate analysis adjusting for known risk factors for relapse (e.g. leukocyte count [≥ or <50,000/μl], age [≥ or <10 years], ALL lineage and molecular ALL subtypes [T or B-cell ALL, presence or absence of MLL rearrangements, ETV6-RUNX1, TCF3-PBX1, and BCR-ABL], DNA index [≥ or <1.16], and minimal residual disease [MRD, ≥ or <0.01%]), NA ancestry remained prognostic (P=0.017 when NA ancestry was treated as a continuous variable [Table 3]). To dichotomize NA ancestry in a manner similar to the dichotomization used for other ALL prognostic features such as leukocyte count, we divided patients into those with low (<10%) vs high (≥ 10%) NA ancestry (see Supplementary Fig. S1 and Supplementary Note for details), and NA ancestry remained associated with relapse (P=3.6×10−4, Supplementary Table S2) in the context of other dichotomized prognostic features. The same multivariate analysis that included self-declared Hispanic ethnicity instead of NA ancestry also showed that Hispanic ethnicity was associated with relapse risk (P=3.4×10−3, Supplementary Table S2). Unfavorable clinical features were not associated with NA genetic ancestry (Table 1). An important clinical indicator of relapse risk in ALL is the early response to therapy, determined by the level of MRD at the end of remission induction therapy.
Even within the group of patients with negative MRD status, higher NA genetic ancestry was linked to a higher risk of relapse (P=0.08 when NA ancestry was treated as a continuous variable, P=0.006 for dichotomous NA ancestry [≥ or <10%], N=1,834, Supplementary Fig. S2). This is of clinical relevance, in that identifying a subgroup of patients who may need more intensive therapy despite negative MRD (good response to initial therapy) would provide additional prognostic information.
Notably, the prognostic impact of NA ancestry varied depending on the specific treatment regimen. In the Children’s Oncology Group (COG) P9904/P9905 study, patients were either randomized or non-randomly assigned to receive or not to receive a delayed intensification phase (i.e. an 8-week multi-agent treatment, Supplementary Table S3). Higher NA ancestry was associated with higher relapse risk in children who did not receive delayed intensification (P=0.015, N=938, Fig. 2D), but not in those who did receive this phase of therapy (P=0.73, N=667, Fig. 2E). Similar results were observed when NA ancestry was dichotomized as a prognostic feature (≥ or <10%): P=0.0016 for those who did not receive the delayed intensification and P=0.51 for those who received the delayed intensification.
Delayed intensification consists of 8 weeks of chemotherapy involving 7 widely-used anticancer drugs: dexamethasone, vincristine, daunorubicin, asparaginase, thioguanine, cyclophosphamide, and cytarabine. For patients at lower risk of relapse, there has been some controversy as to whether the increased intensity of this phase of therapy (and its attendant slightly increased risk of complications such as infection15) is worth the benefit of lower relapse rates, which have been observed in most but not all settings16-18. Delayed intensification is relatively well-tolerated; it has been associated with an extra 5 days of hospitalization for its attendant toxicity (out of a total of ~ 134 weeks of ALL therapy)19. Thus, the benefits of delayed intensification are likely to outweigh the costs among the subset of patients with ≥ 10% NA ancestry, exemplifying the importance and possibility of individualizing ALL therapy on the basis of genetic variation. Additional insights will be gained from clinical trials to examine the therapeutic efficacy of various phases of therapy (delayed intensification and other phases), set in the context of other therapeutic regimens, in patients with high versus low % NA ancestry.
To further illustrate the evidence for the association between ancestry and relapse, we examined local NA ancestry across the genome for association with ALL relapse, using admixture mapping20. Of 3,682 genomic segments with unique NA ancestry status, local NA ancestry at several loci was associated with relapse, with a locus at 2p25.3 (rs17039396) exhibiting the strongest association between local NA ancestry and relapse (nominal P=3.2×10−7, genome-wide threshold for significance is P=1.4×10−5 based on 3,682 independent loci tested, Supplementary Note and Supplementary Fig. S3). Likewise, admixture mapping using a previously published admixture map for US Hispanics13 also identified 2p25.3 as having genome-wide significance for association with relapse (nominal P =1.1×10−6, genome-wide threshold is P=2.4 × 10−5, Supplementary Fig. S3).
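The genome-wide threshold quoted for the 3,682 tested loci is consistent with a Bonferroni correction. The short sketch below reproduces that arithmetic, assuming a family-wise alpha of 0.05; the alpha value and the function name are assumptions, since the text defers the details to the Supplementary Note.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# 3,682 independent loci are reported in the text; alpha = 0.05 is assumed.
threshold = bonferroni_threshold(3682)
nominal_p = 3.2e-7  # nominal P reported for the 2p25.3 locus
is_significant = nominal_p < threshold

print(f"threshold = {threshold:.1e}, 2p25.3 significant: {is_significant}")
```

Under this assumption the computed threshold rounds to the 1.4×10−5 stated in the text, and the 2p25.3 signal falls well below it.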
To explore the mechanisms by which ancestry may affect ALL relapse risk, we also examined which individual SNP genotypes were significantly associated with relapse risk, with the phenotype defined as “any” relapse (hematologic plus extramedullary) as well as more narrowly defined as the most common (and deadly) form of relapse, hematologic relapse. A SNP in PDE4B (rs6683977) was the highest-ranked SNP associated with hematologic relapse (P=2.2×10−6), and also associated with any relapse risk (Supplementary Fig. S4 and S5), and admixture mapping indicated that local NA ancestry at 1p32.2-31.3 that encompassed SNPs in PDE4B (including rs6683977) exhibited a significant association signal (P=3.2×10−6 for that locus, P value threshold for genome-wide significance=1.4×10−5, Supplementary Note and Supplementary Fig. S3). Interestingly, primary ALL cells expressing higher levels of PDE4B were also more resistant to prednisolone (Supplementary Fig. S6), indicating that PDE4B might play a role in glucocorticoid response in ALL.
The association we observed between genomically-defined NA ancestry and relapse is consistent with higher ALL relapse risk among Hispanics3-5. The high risk of relapse associated with NA ancestry was not explained by an association of NA ancestry with known ALL relapse risk factors (Tables 1 and 3). The association was similar when we confined the analysis to self-declared whites (Fig. 2C), which is consistent with a genetic (rather than cultural or environmental) basis for the elevated risk of leukemia relapse in Hispanics. However, we cannot exclude the possibility that environmental, sociocultural, or dietary differences that are associated with NA ancestry also or even primarily influenced relapse risk.
Although current risk classification schemes identify children with more aggressive ALL, a substantial portion of patients are not cured with contemporary chemotherapy, and a substantial portion of patients who ultimately relapse are considered at “low risk” for relapse or do not exhibit MRD21,22. For these reasons, it was interesting that NA ancestry identified a group of patients who benefited from an extra phase of chemotherapy (delayed intensification, Fig. 2D and E), and there was a trend for prognostic importance of NA ancestry even within the group exhibiting negative MRD (Supplementary Fig. S2), indicating that germline genomic variation may add prognostic value to current ALL risk stratification schema.
There are likely many mechanisms by which ancestry-related germline polymorphisms could affect drug response: our data illustrate that ancestry-related genomic variation could affect the probability of cure in part by affecting drug resistance. However, all such outcomes (and therefore the identification of genetic prognostic features) are also dependent upon therapy. Our data illustrate that giving additional chemotherapy can overcome the negative prognostic impact conferred by a set of ancestry-related polymorphisms, and thereby mitigate ethnic disparities in outcome of childhood ALL.
Included in this study were all 2,534 children with newly diagnosed ALL treated on St. Jude Children’s Research Hospital (St. Jude) Total Therapy XIIIB23 or XV (N=707)24, the Children’s Oncology Group (COG) P9906 (N=222)22, or COG P9904/P9905 clinical trials (N=1,605)22 who had successfully-genotyped germline DNA (Supplementary Note). None of the children received any ALL treatment prior to enrollment on these clinical trials. The studies were approved by the Institutional Review Boards and informed consent was obtained from the parents, guardians, or patients, as appropriate. Risk-directed treatment was described previously for St. Jude23,24 and COG22 trials (Supplementary Table S3). Minimal residual disease (MRD) was determined in bone marrow at the end of remission induction therapy22,25.
Genotyping was performed using the Affymetrix GeneChip Human Mapping 500K Array sets or the Genome-Wide Human SNP Array 6.0; a subset of genotypes was validated using Illumina GoldenGate assays (see Supplementary Note). Genotypes were coded as 0, 1, 2 for AA, AB, BB genotypes. Genotype calling and quality control were performed as described26 (Supplementary Note). The final analyses included 444,044 SNPs in 2,534 patients.
Self-reported race/ethnicity was designated in mutually exclusive categories of white, black, Hispanic, or Asian based on criteria in place for the original clinical trials (Table 1). An individual designated as self-reported Hispanic was considered in the Hispanic race/ethnicity category regardless of his or her racial background (which was usually not noted). Remaining patients included Native American/Native Alaskans, Native Hawaiian/Pacific Islanders, and individuals for whom race/ethnicity was not noted, or reported as “other.”
Population structure was determined using EIGENSTRAT27 and was compared with geographic ancestral reference populations (HapMap samples from descendants of Northern Europeans, West Africans, East Asians, and NA13). We also estimated ancestral composition using STRUCTURE14 (version 2.2.3) on the basis of the genotypes at 30,000 SNPs (Supplementary Note and Supplementary Table S4). European, African, Asian, and Native American genetic ancestries were assumed to sum to 100% in each patient.
Relapse was defined as bone marrow relapse and/or extramedullary relapse. Lineage switch, second malignancy, and death during remission were incorporated as competing events. We evaluated associations between genetic ancestries (as a continuous variable, with each ancestry varying from 0 to 100%) and the risk of relapse using Fine and Gray’s regression model28 and stratifying by 9 risk-adapted treatment arms: St. Jude Total XIIIB low risk23, St. Jude Total XIIIB high risk23, St. Jude Total XV low risk24, St. Jude Total XV standard/high risk24, COG P9906, and COG P9904/9905 regimens A, B, C, and D (Supplementary Table S3)22. When noted, NA ancestry was also analyzed as a dichotomous variable (i.e. ≥ or < 10%). The basis of the dichotomization is described in detail in the Supplementary Note. A threshold of 10% was the antimode that discriminated 2 major modes in the US ALL population (Supplementary Fig. S1)29, and approximately the same 10% value maximally differentiated relapse risk (see Supplementary Note). In multivariate analyses, known risk factors (leukocyte count [≥ or <50,000/μl], age [≥ or <10 years], ALL lineage and molecular ALL subtypes [T-cell ALL, MLL rearrangements, ETV6-RUNX1, TCF3-PBX1, BCR-ABL, or other B-lineage ALL], DNA index [≥ or <1.16], and MRD [≥ or <0.01%]) were included together with genetic ancestries. When evaluating the association between genotypes at individual germline SNPs and the risk of relapse, we adjusted for treatment arms and also included genetic ancestry as covariates to account for population stratification (Supplementary Note).
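To make the competing-risks setup concrete, the sketch below dichotomizes NA ancestry at 10% and computes a nonparametric cumulative incidence of relapse in each group. This is not the Fine and Gray regression used in the study, only the simpler descriptive estimator it builds on. The status coding (0 = censored, 1 = relapse, 2 = competing event), the function names, and the patient records are invented for illustration and are not study data.

```python
from itertools import groupby

NA_THRESHOLD = 0.10  # the 10% cut point described in the text


def is_high_na(na_fraction, threshold=NA_THRESHOLD):
    """True when NA ancestry is at or above the dichotomization threshold."""
    return na_fraction >= threshold


def cumulative_incidence(times, statuses, cause=1):
    """Nonparametric cumulative incidence of `cause` with competing risks.

    statuses: 0 = censored, `cause` = event of interest, other = competing event.
    Returns (time, cumulative incidence) at each time the cause occurred.
    """
    records = sorted(zip(times, statuses))
    at_risk = len(records)
    event_free = 1.0   # probability of no event of any type before this time
    incidence = 0.0
    curve = []
    for t, group in groupby(records, key=lambda r: r[0]):
        codes = [status for _, status in group]
        n_cause = sum(1 for c in codes if c == cause)
        n_any = sum(1 for c in codes if c != 0)
        if n_cause:
            incidence += event_free * n_cause / at_risk
            curve.append((t, incidence))
        event_free *= 1.0 - n_any / at_risk
        at_risk -= len(codes)  # events and censored patients leave the risk set
    return curve


# Invented toy records: (NA ancestry fraction, follow-up in years, status).
patients = [
    (0.02, 1.0, 1), (0.05, 2.0, 2), (0.00, 3.0, 0), (0.08, 4.0, 1), (0.01, 5.0, 0),
    (0.35, 0.5, 1), (0.12, 1.5, 1), (0.60, 2.5, 0), (0.10, 3.5, 2),
]
low = [(t, s) for na, t, s in patients if not is_high_na(na)]
high = [(t, s) for na, t, s in patients if is_high_na(na)]
low_curve = cumulative_incidence(*zip(*low))
high_curve = cumulative_incidence(*zip(*high))
print("low NA:", low_curve)
print("high NA:", high_curve)
```

Treating competing events as removing patients from the event-free pool, rather than censoring them, is what distinguishes cumulative incidence from one minus a Kaplan-Meier estimate.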
The Spearman rank correlation test was used to determine the relationships between MRD (classified as negative [<0.01%] or positive [≥0.01%]) and genetic variation (genetic ancestry or genotypes at individual SNPs), as previously described26.
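For reference, the Spearman coefficient is the Pearson correlation of the ranks, with tied values sharing their average rank. The minimal sketch below computes only the coefficient (not the test's P value); the function names and the four example values are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented example: MRD status (0 = negative, 1 = positive) vs NA ancestry fraction.
mrd_status = [0, 0, 1, 1]
na_fraction = [0.01, 0.02, 0.20, 0.30]
rho = spearman_rho(mrd_status, na_fraction)
print(round(rho, 4))
```

Because MRD is binary here, most of its values are tied, which is why the average-rank handling matters.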
Statistical and computational analyses were performed using S-Plus software, version 7.0 (Insightful Corp, Seattle, WA), R version 2.6.1 (www.r-project.org), and SAS software, version 9.1 (SAS Institute Inc, Cary, NC).
We are indebted to all patients and their parents who participated in the St. Jude and COG protocols included in this study, clinicians and research staff at St. Jude and COG institutions, and Dr. Jeannette Pullen from the University of Mississippi at Jackson for assistance in classification of patients with ALL. Genome-wide genotyping of COG P9904/9905 samples was performed by the Center for Molecular Medicine with generous financial support from the Jeffrey Pride Foundation and the National Childhood Cancer Foundation. Steve P. Hunger is the Ergen Family Chair in Pediatric Cancer and Jun J. Yang is supported by the American Society for Clinical Pharmacology and Therapeutics Young Investigator Award and Alex’s Lemonade Stand Foundation for Childhood Cancer Young Investigator Grant. We also thank Dr. Mark Shriver at the Pennsylvania State University for sharing SNP genotype data of the Native American references. We are especially inspired by Damon Ingersoll, who bravely fought but lost his battle with ALL. This work was supported by the National Cancer Institute and National Institute of General Medical Sciences of the National Institutes of Health [grant numbers: CA093552, CA78224, CA21765, R37CA36401, CA98543, CA114766, RC4CA156449, U10CA98413, U01GM 61393, and U01GM 92666], ALSAC, and by CureSearch.
Author Contributions: Study concept and design: J.J.Y., C.C., and M.V.R.; Acquisition of data: D.C., C-H.P., W.P.B., P.L.M., N.J.W., C.L.W., M.J.B., M.V.R., G.N., Y.F., M.D., B.M.C., and A.C.; Drafting of the manuscript: J.J.Y. and M.V.R.; Critical revision of the manuscript for important intellectual content: J.J.Y., C.C., W.Y., W.E.E., C-H.P., B.M.C., M.J.B., W.L.C., S.P.H., G.H.R., M.L., Y.F., M.V.R.; Statistical analysis: C.C., W.Y., X.C., M.D., N.J.C., and P.S.; Obtaining funding: G.H.R., D.C., W.E.E., M.B., W.L.C., S.P.H. and M.V.R.; Study supervision: M.V.R.
Competing Interests and Financial Disclosures: The authors do not have any relevant competing interests to disclose, and full disclosures are provided in the Supplementary Note.