Although five-year survival rates for childhood acute lymphoblastic leukemia (ALL) are now over 80% in most industrialized countries1, not all children have benefited equally from this progress2. Ethnic differences in survival after childhood ALL have been reported in many clinical studies3-11, with poorer survival observed among African Americans or those with Hispanic ethnicity when compared with European American or Asian patients3-5. The causes of ethnic differences remain uncertain, although both genetic and non-genetic factors are likely important4,12. Interrogating genome-wide germline SNP genotypes in an unselected large cohort of children with ALL, we observed that the component of genomic variation that co-segregated with Native American ancestry was associated with the risk of relapse (P=0.0029), even after adjusting for known prognostic factors (P=0.017). Ancestry-related differences in relapse risk were abrogated by the addition of a single extra phase of chemotherapy, indicating that modifications to therapy can mitigate ancestry-related risk of relapse.
After applying quality control measures (Supplementary Note), we analyzed 444,044 germline genetic single nucleotide polymorphisms (SNPs) in an ethnically diverse group of 2,534 children with ALL. To summarize genetic variation, we applied principal component analysis (PCA) to genotypes of 2,849 individuals: the 2,534 patients with ALL, 210 HapMap samples from descendants of Northern Europeans (CEU, N=60), West Africans (YRI, N=60), and East Asians (CHB, N=45; JPT, N=45), and 105 individuals from Native American (NA) reference groups13 (Fig. 1). The top-ranked principal component (PC1) separated self-reported black patients (N=250) and the YRI HapMap samples from all other groups (Fig. 1A); PC2 separated self-reported Asian patients (N=76) and the CHB/JPT HapMap samples from non-Asian populations (Fig. 1B). PC3 primarily captured genetic variation characteristic of NA ancestry, and self-reported Hispanics (N=405) exhibited a cline in PC3 between NA reference populations and other ancestral groups (Fig. 1C), consistent with the extensive ancestral admixture in Hispanics. Because the components of genomic variation (i.e., the PCs) clearly co-segregated with geographic ancestries, we applied STRUCTURE14 to quantitatively determine the ancestral composition of children with ALL. Patients encompassed a wide array of self-declared ethnic groups (Table 1), and we observed substantial variability in the ancestral genetic background of this unselected group of 2,534 children (Fig. 2A and Table 2).
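The PCA step described above can be illustrated with a minimal sketch. It assumes a common normalization for genotype data (centering each SNP and scaling by its binomial standard deviation); the text does not state which software or scaling was actually used, so this is an illustration rather than the study's pipeline. The function name `genotype_pca` and the two synthetic populations are hypothetical.

```python
import numpy as np

def genotype_pca(genotypes, n_components=3):
    """Return PC scores for an (individuals x SNPs) matrix of allele counts 0/1/2."""
    g = np.asarray(genotypes, dtype=float)
    # Estimated allele frequency per SNP; drop monomorphic SNPs (zero variance).
    freq = g.mean(axis=0) / 2.0
    keep = (freq > 0) & (freq < 1)
    g, freq = g[:, keep], freq[keep]
    # Center each SNP and scale by its expected binomial standard deviation
    # (an assumed normalization, not one confirmed by the text).
    z = (g - 2.0 * freq) / np.sqrt(2.0 * freq * (1.0 - freq))
    # Singular value decomposition; u * s gives each individual's PC scores.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Synthetic data only: two populations with unrelated allele frequencies.
rng = np.random.default_rng(0)
n_snps = 500
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = rng.uniform(0.1, 0.9, n_snps)
pop_a = rng.binomial(2, freq_a, size=(40, n_snps))
pop_b = rng.binomial(2, freq_b, size=(40, n_snps))

scores = genotype_pca(np.vstack([pop_a, pop_b]))
pc1_a, pc1_b = scores[:40, 0], scores[40:, 0]
print("PC1 range, population A:", pc1_a.min(), pc1_a.max())
print("PC1 range, population B:", pc1_b.min(), pc1_b.max())
```

In this toy setting the first component separates the two synthetic populations, which mirrors how PC1 to PC3 separate reference groups in Fig. 1; admixed individuals would fall along a cline between the clusters.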
We tested whether genetic ancestry itself was associated with treatment outcome, after stratifying for treatment protocols, and found that the cumulative incidence of relapse was significantly associated with higher NA ancestry, treated as a continuous variable (Fig. 2B, P=0.0029, N=2,534, Supplementary Table S1). NA ancestry was also negatively associated with event-free survival (P=0.018). Among patients self-reporting as white, there was a trend for higher NA ancestry to be related to a higher risk of ALL relapse (Fig. 2C, P=0.08, N=1,687). In a multivariate analysis adjusting for known risk factors for relapse (e.g. leukocyte count [≥ or <50,000/μl], age [≥ or <10 years], ALL lineage and molecular ALL subtypes [T or B-cell ALL, presence or absence of MLL rearrangements, ETV6-RUNX1, TCF3-PBX1, and BCR-ABL], DNA index [≥ or <1.16], and minimal residual disease [MRD, ≥ or <0.01%]), NA ancestry remained prognostic (P=0.017 when NA ancestry was treated as a continuous variable [Table 3]). To dichotomize NA ancestry in a manner similar to the dichotomization used for other ALL prognostic features such as leukocyte count, we divided patients into those with low (<10%) vs high (≥ 10%) NA ancestry (see Supplementary Fig. S1 and Supplementary Note for details), and NA ancestry remained associated with relapse (P=3.6×10−4, Supplementary Table S2) in the context of other dichotomized prognostic features. The same multivariate analysis, when it included self-declared Hispanic ethnicity instead of NA ancestry, also showed that Hispanic ethnicity was associated with relapse risk (P=3.4×10−3, Supplementary Table S2). Unfavorable clinical features were not associated with NA genetic ancestry (Table 1).
An important clinical indicator of relapse risk in ALL is the early response to therapy, determined by the level of MRD at the end of remission induction therapy. Even within the group of patients with negative MRD status, higher NA genetic ancestry was linked to a higher risk of relapse (P=0.08 when NA ancestry was treated as a continuous variable, P=0.006 for dichotomous NA ancestry [≥ or <10%], N=1,834, Supplementary Fig. S2). This is of clinical relevance: identifying a subgroup of patients who may need more intensive therapy despite negative MRD (a good response to initial therapy) would provide additional prognostic information.
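As a rough illustration of the dichotomized analysis, the sketch below labels patients as low or high NA ancestry at the 10% cutoff and estimates the cumulative incidence of relapse in each group, treating other first events as competing risks. The estimator is a standard nonparametric one chosen for illustration; the text does not specify the estimator or test used, so this should not be read as the study's method. All records, function names, and event codes are hypothetical.

```python
def dichotomize(na_fraction, cutoff=0.10):
    """Label a patient 'high' when estimated NA ancestry is >= cutoff, else 'low'."""
    return "high" if na_fraction >= cutoff else "low"

def cumulative_incidence(times, events, cause=1):
    """Nonparametric cumulative incidence for one cause with competing risks.

    events: 0 = censored; any other integer is an event type (assumed coding).
    Returns a list of (time, cumulative incidence) at each distinct event time.
    """
    event_times = sorted({t for t, e in zip(times, events) if e != 0})
    surv = 1.0  # probability of remaining free of any event just before t
    cif = 0.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_cause = sum(1 for u, e in zip(times, events) if u == t and e == cause)
        d_any = sum(1 for u, e in zip(times, events) if u == t and e != 0)
        cif += surv * d_cause / at_risk
        surv *= 1.0 - d_any / at_risk
        curve.append((t, cif))
    return curve

# Hypothetical records: (NA ancestry fraction, follow-up in years, event code),
# with 1 = relapse, 2 = competing event, 0 = censored.
records = [
    (0.02, 1.0, 1), (0.05, 2.0, 0), (0.01, 3.0, 2), (0.08, 4.0, 1), (0.00, 5.0, 0),
    (0.35, 0.5, 1), (0.12, 1.5, 1), (0.60, 2.5, 0), (0.10, 3.5, 0),
]

by_group = {"low": [], "high": []}
for na, years, event in records:
    by_group[dichotomize(na)].append((years, event))

final_cif = {}
for group, rows in by_group.items():
    times, events = zip(*rows)
    final_cif[group] = cumulative_incidence(times, events)[-1][1]
    print(group, round(final_cif[group], 3))
```

A formal comparison between groups would additionally need a stratified test and adjustment for the prognostic factors listed above, which this sketch omits.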
Notably, the prognostic impact of NA ancestry depended on the specific treatment regimen. In the Children’s Oncology Group (COG) P9904/P9905 study, patients were assigned, either by randomization or non-randomly, to receive or not receive a delayed intensification phase (i.e. an 8-week multi-agent treatment, Supplementary Table S3). Higher NA ancestry was associated with higher relapse risk in children who did not receive delayed intensification (P=0.015, N=938, Fig. 2D), but not in those who did receive this phase of therapy (P=0.73, N=667, Fig. 2E). Similar results were observed when NA ancestry was dichotomized as a prognostic feature (≥ or <10%): P=0.0016 for those who did not receive delayed intensification and P=0.51 for those who did.
Delayed intensification consists of 8 weeks of chemotherapy involving 7 widely-used anticancer drugs: dexamethasone, vincristine, daunorubicin, asparaginase, thioguanine, cyclophosphamide, and cytarabine. For patients at lower risk of relapse, there has been some controversy as to whether the increased intensity of this phase of therapy (and its attendant slightly increased risk of complications such as infection15) is worth the benefit of lower relapse rates, which have been observed in most but not all settings16-18. Delayed intensification is relatively well-tolerated; it has been associated with an extra 5 days of hospitalization for its attendant toxicity (out of a total of ~ 134 weeks of ALL therapy)19. Thus, the benefits of delayed intensification are likely to outweigh the costs among the subset of patients with ≥ 10% NA ancestry, exemplifying the importance and possibility of individualizing ALL therapy on the basis of genetic variation. Additional insights will be gained from clinical trials to examine the therapeutic efficacy of various phases of therapy (delayed intensification and other phases), set in the context of other therapeutic regimens, in patients with high versus low % NA ancestry.
To further illustrate the evidence for the association between ancestry and relapse, we examined local NA ancestry across the genome for association with ALL relapse, using admixture mapping20. Of 3,682 genomic segments with unique NA ancestry status, local NA ancestry at several loci was associated with relapse, with a locus at 2p25.3 (rs17039396) exhibiting the strongest association between local NA ancestry and relapse (nominal P=3.2×10−7, genome-wide threshold for significance is P=1.4×10−5 based on 3,682 independent loci tested, Supplementary Note and Supplementary Fig. S3). Likewise, admixture mapping using a previously published admixture map for US Hispanics13 also identified 2p25.3 as having genome-wide significance for association with relapse (nominal P =1.1×10−6, genome-wide threshold is P=2.4 × 10−5, Supplementary Fig. S3).
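The genome-wide threshold quoted for the 3,682 loci is consistent with a Bonferroni correction at a family-wise error rate of 0.05. That error rate is an assumption made here for illustration; the Supplementary Note gives the authors' own derivation. A short check of the arithmetic, with a hypothetical helper name:

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

# 3,682 independent loci tested (from the text); alpha = 0.05 is assumed.
threshold = bonferroni_threshold(3682)
print(f"{threshold:.1e}")  # prints 1.4e-05

# The nominal P value reported for the 2p25.3 locus falls below this threshold.
p_2p25 = 3.2e-7
print(p_2p25 < threshold)  # prints True
```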
To explore the mechanisms by which ancestry may affect ALL relapse risk, we also examined which individual SNP genotypes were significantly associated with relapse risk, with the phenotype defined as “any” relapse (hematologic plus extramedullary) as well as more narrowly defined as the most common (and deadly) form of relapse, hematologic relapse. A SNP in PDE4B (rs6683977) was the highest-ranked SNP associated with hematologic relapse (P=2.2×10−6), and also associated with any relapse risk (Supplementary Fig. S4 and S5), and admixture mapping indicated that local NA ancestry at 1p32.2-31.3 that encompassed SNPs in PDE4B (including rs6683977) exhibited a significant association signal (P=3.2×10−6 for that locus, P value threshold for genome-wide significance=1.4×10−5, Supplementary Note and Supplementary Fig. S3). Interestingly, primary ALL cells expressing higher levels of PDE4B were also more resistant to prednisolone (Supplementary Fig. S6), indicating that PDE4B might play a role in glucocorticoid response in ALL.
The association we observed between genomically-defined NA ancestry and relapse is consistent with higher ALL relapse risk among Hispanics3-5. The high risk of relapse associated with NA ancestry was not explained by an association of NA ancestry with known ALL relapse risk factors (Tables 1 and 3). The association was similar when we confined the analysis to self-declared whites (Fig. 2C), which is consistent with a genetic (rather than cultural or environmental) basis for the elevated risk of leukemia relapse in Hispanics. However, we cannot exclude the possibility that environmental, sociocultural, or dietary differences that are associated with NA ancestry also or even primarily influenced relapse risk.
Although current risk classification schemes identify children with more aggressive ALL, a substantial portion of patients are not cured with contemporary chemotherapy, and a substantial portion of patients who ultimately relapse are considered at “low risk” for relapse or do not exhibit MRD21,22. For these reasons, it was interesting that NA ancestry identified a group of patients who benefited from an extra phase of chemotherapy (delayed intensification, Fig. 2D and E), and that there was a trend for prognostic importance of NA ancestry even within the group exhibiting negative MRD (Supplementary Fig. S2), indicating that germline genomic variation may add prognostic value to current ALL risk stratification schemes.
There are likely many mechanisms by which ancestry-related germline polymorphisms could affect drug response: our data illustrate that ancestry-related genomic variation could affect the probability of cure in part by affecting drug resistance. However, all such outcomes (and therefore the identification of genetic prognostic features) are also dependent upon therapy. Our data illustrate that giving additional chemotherapy can overcome the negative prognostic impact conferred by a set of ancestry-related polymorphisms, and thereby mitigate ethnic disparities in outcome of childhood ALL.