Patients treated successfully for Hodgkin lymphoma (HL) in childhood are at significant risk for radiation therapy (RT)-induced second malignant neoplasms (SMNs), with a cumulative incidence of 18.4% by 30 years after treatment and an absolute excess risk of 6.9 per 1,000 person-years of follow-up [1]. This high incidence makes SMNs the second leading cause of mortality in HL survivors. SMNs primarily affect organs in the involved mediastinal RT field, including the thyroid, skin, gastrointestinal tract, and female breast [2,3]. Risk is positively associated with cumulative radiation dose and inversely correlated with age at treatment [4,5].
Despite the clinical significance of this devastating late consequence of RT exposure, little is known about predisposing risk factors. We performed a genome-wide association study (GWAS) to identify variants associated with RT-induced SMNs in HL survivors. In studies of sporadic cancers, non-genetic heterogeneity can obscure genetic associations [6], but here RT exposure is common to both the HL patients who develop SMNs and those who do not. Thus, we hypothesized that limiting our study to RT-treated survivors would improve our power to detect the genetic contribution to SMN risk.
The discovery set consisted of 100 SMN cases and 89 SMN-free controls (Supplementary Table 1a, Supplementary Table 2). All cases and controls were diagnosed with HL as children (median age: 15.6 years, range: 8–20) and treated with 25–44 Gy RT with or without alkylating chemotherapy [7]. Cases developed SMNs with a mean latency of 20.0 years (s.d. = 5.8 years, range: 6–34). Controls were followed for at least 27 years (median: 32 years, range: 27–38) to ensure that the maximal contamination of controls by future cases was < 2%. For a detailed description of the study populations and experimental protocols, see the Supplementary Methods.
Following genotype quality control, 665,313 single nucleotide polymorphisms (SNPs) were successfully genotyped in 96 cases and 82 controls. We compared allele frequencies between cases and controls using a chi-square test of homogeneity. A quantile-quantile (Q-Q) plot of the expected and observed distribution of P values revealed no evidence for systematic genotype calling error or hidden population substructure (genomic control λ = 1.007) (Supplementary Fig. 1) [8]. Principal component analysis using Eigenstrat indicated cases and controls were of European descent (Supplementary Fig. 2) [9].
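The allelic comparison and the genomic control check described above can be sketched as follows. This is a minimal illustration, not the study's pipeline: the function names are hypothetical, the test is the standard Pearson chi-square on a 2x2 table of allele counts (1 degree of freedom), and λ is taken as the median observed statistic divided by the median of a 1-df chi-square distribution (about 0.4549). The counts in the usage lines are invented for illustration.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square statistic (1 df) for a 2x2 table of allele counts.

    Rows are cases/controls, columns are the two alleles. Assumes every
    row and column total is non-zero.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    numerator = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2
    denominator = ((case_a + case_b) * (ctrl_a + ctrl_b)
                   * (case_a + ctrl_a) * (case_b + ctrl_b))
    return numerator / denominator

def chi2_1df_pvalue(stat):
    """Upper-tail P value of a 1-df chi-square statistic.

    Uses P(X > x) = erfc(sqrt(x / 2)), which holds for 1 df only.
    """
    return math.erfc(math.sqrt(stat / 2.0))

def genomic_control_lambda(stats):
    """Genomic control inflation factor from a list of 1-df statistics."""
    return median(stats) / CHI2_1DF_MEDIAN

# Hypothetical allele counts for one SNP (96 cases and 82 controls give
# 192 and 164 alleles respectively); these are NOT data from the study.
stat = allelic_chi2(150, 42, 90, 74)
p_value = chi2_1df_pvalue(stat)
```

A λ close to 1, as reported here, indicates that the bulk of the test statistics follow the null distribution, so inflation from stratification or systematic genotyping error is unlikely.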
We empirically determined the threshold for a genome-wide type I error rate of 0.05 by permutation (P < 1.0×10⁻⁷). At this threshold, our study had 80% power to detect a SNP with a frequency of 35% and an odds ratio of 3.5 (Supplementary Fig. 3). Three SNPs (rs4946728, rs1040411, rs8083533) achieved genome-wide significance (Supplementary Fig. 4 and Table 1). rs4946728 and rs1040411 mapped to chromosome 6q21, intergenic between ATG5 and PRDM1. The strongest evidence for association in this region was for rs4946728 (P = 1.09×10⁻⁸, allelic OR = 4.22 [95% CI = 2.53–7.05]). rs8083533 mapped to 18q11.2, intronic to TAF4B (P = 4.98×10⁻⁸, allelic OR = 3.78 [95% CI = 2.31–6.18]). Logistic regression, adjusting for gender, age at diagnosis, year of HL diagnosis, gonadal radiation (in females), and alkylating chemotherapy exposure, demonstrated that these risk variables had no effect on the observed associations (Supplementary Table 3).
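The allelic odds ratios and confidence intervals quoted above can be illustrated with a short sketch. It assumes a Wald (Woolf) interval computed on the log-odds scale, which the text does not state explicitly; the function names are hypothetical, and the allele counts in the first usage line are invented rather than taken from the study. The second helper back-calculates the standard error of the log odds ratio from a published interval.

```python
import math

def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Allelic odds ratio with a Woolf (log-scale Wald) confidence interval.

    Inputs are allele counts; z = 1.96 gives an approximate 95% interval.
    Assumes all four counts are non-zero.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    se_log_or = math.sqrt(1 / case_risk + 1 / case_other
                          + 1 / ctrl_risk + 1 / ctrl_other)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) implied by a log-symmetric interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Hypothetical allele counts, for illustration only.
odds_ratio, lower, upper = allelic_odds_ratio(20, 10, 10, 20)

# Interval reported in the text for rs4946728 in the discovery set.
se_rs4946728 = se_from_ci(2.53, 7.05)
```

The reported interval for rs4946728 has a geometric midpoint of about 4.22 (the square root of 2.53 × 7.05), which matches the point estimate and is consistent with an interval that is symmetric on the log scale.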
We sought to replicate these findings in an independent set of 62 cases with SMNs and 71 SMN-free controls, all treated for HL in childhood with 25–44 Gy mediastinal RT (Supplementary Table 1b). We observed significant associations with SMNs for both SNPs on chromosome 6q21, rs4946728 (P = 0.002) and rs1040411 (P = 0.03), but not for rs8083533 (P = 0.82). In the combined set, odds of an SMN were increased over 3-fold per copy of the major allele for rs4946728 (allelic OR = 3.32 [95% CI = 2.25–4.90], combined P = 5.99×10⁻¹⁰) and over 2-fold for rs1040411 (allelic OR = 2.39 [95% CI = 1.73–3.30], combined P = 1.18×10⁻⁷).
Table 1. Association of SNPs with SMNs following HL treatment.
We found no evidence that the association of rs4946728 and rs1040411 differed between breast cancer and other SMNs (Phet = 0.41 for rs4946728 and Phet = 0.58 for rs1040411) or between males and females (Phet = 0.83 for rs4946728 and Phet = 0.29 for rs1040411) (Supplementary Tables 4a and 4b). To determine whether rs4946728 or rs1040411 was associated with SMNs after adult HL, we genotyped both SNPs in 57 SMN cases and 37 controls who were treated with RT as adults (median age: 24.0 years, range: 21–43) (Supplementary Table 1b). We did not observe an association for either rs4946728 (P = 0.87) or rs1040411 (P = 0.65) (Supplementary Table 5), suggesting that age at RT exposure modifies the association between these variants and SMN risk. These results should be interpreted with caution, however, given the small number of individuals genotyped.
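The heterogeneity comparisons reported above (Phet) ask whether the odds ratio differs between two subgroups, such as breast cancer versus other SMNs. The text does not say which heterogeneity test was used, so the sketch below is only one common choice: a two-sided z-test on the difference between two log odds ratios, which for two strata is equivalent to Cochran's Q. The function name and all input values are hypothetical.

```python
import math

def heterogeneity_p(or_1, se_1, or_2, se_2):
    """Two-sided P value for a difference between two subgroup odds ratios.

    se_1 and se_2 are standard errors of log(OR) in each subgroup.
    Assumes the two subgroup estimates are independent.
    """
    z = (math.log(or_1) - math.log(or_2)) / math.sqrt(se_1 ** 2 + se_2 ** 2)
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical subgroup estimates, for illustration only.
p_het = heterogeneity_p(3.6, 0.30, 3.0, 0.35)
```

With subgroups this small, such a test has limited power, so a non-significant Phet does not establish that the effects are equal.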
Both rs4946728 and rs1040411 (r² = 0.4) are noncoding SNPs located between PRDM1 and ATG5 on chromosome 6q21. Imputation of the locus with the 1000 Genomes reference panel [10] did not reveal any variant with a stronger association than either genotyped SNP (Supplementary Table 6). Logistic regression conditioning on rs4946728 revealed a modest residual association for rs1040411 (P = 0.05) (Supplementary Table 7), suggesting that an unobserved causal variant may be correlated with a haplotype harboring both SNPs.
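The r² value quoted above measures linkage disequilibrium between the two SNPs. As a minimal sketch of the standard definition, r² = D² / (p_A(1 − p_A) p_B(1 − p_B)), where D is the difference between the observed frequency of the haplotype carrying both reference alleles and the product of the two allele frequencies. The function name and the haplotype frequencies below are hypothetical and are not the study's values.

```python
def ld_r2(f_11, f_10, f_01, f_00):
    """Squared correlation (r^2) between two biallelic SNPs.

    Arguments are the four haplotype frequencies, where the first digit is
    the allele at SNP 1 and the second digit is the allele at SNP 2.
    Assumes the frequencies sum to 1 and both SNPs are polymorphic.
    """
    p_a = f_11 + f_10          # frequency of allele 1 at SNP 1
    p_b = f_11 + f_01          # frequency of allele 1 at SNP 2
    d = f_11 - p_a * p_b       # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical frequencies with one of the four haplotypes absent.
r2 = ld_r2(0.6, 0.0, 0.2, 0.2)
```

In this invented example r² works out to 0.375, showing that two SNPs can be only moderately correlated even when just three of the four possible haplotypes are present.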
rs4946728 and rs1040411 form three common haplotypes in Caucasians that represent 99.9% of the haplotypes at this locus (Supplementary Table 8). As noncoding risk variants frequently regulate gene expression [11], we performed expression quantitative trait locus (eQTL) analysis to determine whether these haplotypes were associated with expression of PRDM1, ATG5 or other genes within five megabases. We found that increasing dosage of the risk haplotype (comprising the risk alleles for both rs4946728 and rs1040411) was significantly associated with lower PRDM1 mRNA expression (P = 0.03) (Supplementary Fig. 5). In contrast, no association with expression was observed for any other gene, including ATG5 (P = 0.39).
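The eQTL analysis above relates expression to haplotype dosage (0, 1 or 2 copies of the risk haplotype). A minimal sketch of one common approach, ordinary least-squares regression of expression on dosage, is given below; the text does not specify the exact model, and the function name and data are hypothetical. A negative slope corresponds to lower expression with each additional copy of the risk haplotype.

```python
from statistics import mean

def dosage_regression(dosages, expression):
    """Least-squares slope and intercept of expression on haplotype dosage.

    dosages: copies of the risk haplotype per sample (0, 1 or 2).
    expression: expression level per sample, in the same order.
    Assumes at least two distinct dosage values are present.
    """
    mean_x, mean_y = mean(dosages), mean(expression)
    s_xx = sum((x - mean_x) ** 2 for x in dosages)
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(dosages, expression))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical samples, for illustration only.
slope, intercept = dosage_regression([0, 0, 1, 1, 2, 2],
                                     [3.1, 2.9, 2.2, 1.8, 1.1, 0.9])
```

A P value for the slope would additionally require the residual variance and a t distribution, which this sketch omits to stay within the standard library.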
Because SMNs after HL are caused by radiation exposure, we investigated the relationship between ionizing radiation (IR) exposure and PRDM1 protein levels in cell lines homozygous for either the risk haplotype (n = 4) or the protective haplotype (comprising the protective alleles for both rs4946728 and rs1040411, n = 4). In untreated cells, PRDM1 was more abundant in cells homozygous for the protective haplotype than in cells homozygous for the risk haplotype (P = 0.048). In cells homozygous for the protective haplotype, PRDM1 was significantly induced within 2 hours of IR exposure (P = 0.020). Strikingly, PRDM1 was not induced by IR in cells homozygous for the risk haplotype (P = 0.19).
PRDM1 (PR domain containing 1, with ZNF domain; also known as BLIMP1; OMIM# 603423) encodes a zinc finger transcriptional repressor involved in a variety of cellular processes including proliferation, differentiation, and apoptosis [12]. It was recently shown to be a tumor suppressor in activated B cell-like diffuse large B cell lymphoma [13,14], and is frequently lost in many cancer types, including solid tumors [15]. Of note, loss of heterozygosity at chromosome 6q was found to be significantly more common in breast cancers following RT for HL than in sporadic breast cancers (42% vs 10%), suggesting this region may be targeted for loss in these RT-induced cancers [16].
PRDM1 negatively regulates pro-proliferative genes, such as MYC [17]. Therefore, we investigated whether the 6q21 variants were associated with repression of MYC by radiation concomitant with PRDM1 induction. Though basal MYC levels did not correlate with carriage of the 6q21 risk haplotype (P = 0.19), MYC was significantly more repressed following IR exposure in cells homozygous for the protective haplotype than in cells homozygous for the risk haplotype (P = 0.02) (Supplementary Fig. 6).
In summary, these data demonstrate that variants at 6q21 are strongly associated with risk for SMNs following RT for HL in childhood and suggest that common variants can have large effect sizes in the context of specific exposures. The SNPs we identified are associated with basal and radiation-induced PRDM1 expression, as well as radiation-induced MYC repression. Taken together, our findings support a novel role for PRDM1 as a radiation-responsive tumor suppressor. We cannot rule out, however, either long-range effects of these variants on other genes or tissue-specific differences in PRDM1 function. Additionally, the observation that SNPs intergenic between PRDM1 and ATG5 are associated with autoimmune disease [18,19] raises the intriguing possibility that altered immune function or inflammation may be associated with SMN risk. Although the RT doses currently used to treat HL are considerably lower than those used to treat the children analyzed in this study, recent data indicate that children treated with lower-dose RT for HL remain at significant risk for SMNs [20]. Thus, our findings may be important for understanding the etiology of SMNs in these pediatric HL survivors, as well as in other cancer patients treated with RT.