

HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal.
Nat Genet. Author manuscript; available in PMC 2011 February 1.
Published in final edited form as:
Published online 2010 July 18. doi:  10.1038/ng.626
PMCID: PMC2913472

Genome-wide association study of follicular lymphoma identifies a risk locus at 6p21.32


To identify susceptibility loci for non-Hodgkin lymphoma (NHL) subtypes, we conducted a three-stage genome-wide association study. We identified two variants associated with follicular lymphoma (FL) in 1,465 FL cases/6,958 controls at 6p21.32 (rs10484561, rs7755224, r²=1.0; combined p-values=1.12×10⁻²⁹, 2.00×10⁻¹⁹), providing further support that MHC genetic variation influences FL susceptibility. Confirmatory evidence of a previously reported association was also found between chronic lymphocytic leukemia/small lymphocytic lymphoma and rs735665 (combined p-value=4.24×10⁻⁹).

Non-Hodgkin lymphoma (NHL) is a complex group of B- and T-cell neoplasms with >300,000 new cases diagnosed worldwide each year. Family and epidemiological studies suggest an important genetic role in the etiology of lymphoma [1], though the inherited genetic basis of the disease is largely unknown. Recently, we conducted a genome-wide association study (GWAS) of three common histological subtypes of NHL, follicular lymphoma (FL), chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL), and diffuse large B-cell lymphoma (DLBCL), using a pooled DNA genotyping strategy [2]. Due to experimental and technical noise associated with pooled DNA GWAS, we conducted a new individual genotyping-based study on a larger subset of NHL cases and controls using a three-stage GWAS design (Supplementary Table 1, Supplementary Fig. 1).

In Stage 1, we conducted a GWAS using a subset of samples (SF1 study) chosen from a larger population-based case-control study of NHL based in the San Francisco Bay Area [3]. After applying quality control metrics (Supplementary Methods), 213 FL, 211 CLL/SLL and 257 DLBCL cases and 750 controls were included in the final statistical analysis of Stage 1. The genome-wide results for FL, CLL/SLL and DLBCL are represented in Supplementary Fig. 2a-c, where overall, 18 SNPs showed unadjusted trend p-values below a 10⁻⁵ threshold (Supplementary Table 2). The most notable findings were for SNPs associated with FL in the major histocompatibility complex (MHC) region on chromosome 6, concentrated around two independent peaks at 6p21.33 and 6p21.32 (r²<0.01; Figure 1). The strongest signal in 6p21.33 is located at the psoriasis susceptibility region 1 (PSORS1) (rs1265086, trend p-value=3.34×10⁻⁶, Supplementary Table 3), where we have previously detected a FL susceptibility locus [2]. The 6p21.32 association peak is located in a region encompassing the HLA-DR and HLA-DQ genes. Here, eight SNPs exhibited trend p-values ≤10⁻⁴ (Supplementary Table 3).
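The trend p-values reported above can be illustrated with a minimal sketch of a Cochran-Armitage trend test on genotype counts. This is an assumption for illustration: the text does not name the exact trend test or software used, and the function name `trend_test`, the additive scores (0, 1, 2) and the genotype counts in the usage note are hypothetical, not taken from the study.

```python
import math

def trend_test(case_counts, control_counts, scores=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    case_counts / control_counts: counts per genotype, ordered by the
    number of copies of the tested allele (0, 1, 2).
    Returns (chi-square on 1 d.f., two-sided p-value).
    """
    totals = [a + b for a, b in zip(case_counts, control_counts)]
    n = sum(totals)        # all subjects
    r = sum(case_counts)   # all cases
    # Score statistic: observed minus expected cases, weighted by allele dose.
    u = sum(x * (c - t * r / n) for x, c, t in zip(scores, case_counts, totals))
    sum_x = sum(x * t for x, t in zip(scores, totals))
    sum_x2 = sum(x * x * t for x, t in zip(scores, totals))
    # Variance of the score statistic under the null hypothesis.
    var_u = r * (n - r) / n**3 * (n * sum_x2 - sum_x**2)
    chi2 = u * u / var_u
    # Upper tail of a chi-square distribution with 1 d.f.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p
```

For example, `trend_test([50, 40, 10], [70, 25, 5])` uses made-up counts in which the tested allele is enriched in cases; identical genotype proportions in cases and controls give a chi-square of zero.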

Figure 1
LD and association results for the FL-associated regions in the major histocompatibility complex (MHC)

In Stage 2, 40 SNPs with the lowest trend p-values for each NHL subtype (Supplementary Tables 4-6) were genotyped in two independent population-based case-control studies (SCALE [4] for FL, Mayo-GEC [5] for CLL/SLL) and in a separate sample of 118 DLBCL population-based cases and 651 controls (SF1B study) drawn from the same population as SF1 (Supplementary Table 1, Supplementary Fig. 1). Seven SNPs were associated with FL risk (p<0.05) in the SCALE study (Supplementary Table 7). The two SNPs with the lowest p-values, rs7755224 and rs10484561 (allelic p-values=6.30×10⁻⁵ and 1.20×10⁻⁵), are located in the MHC Class II region at 6p21.32. They are in complete linkage disequilibrium (LD) (r²=1.0) and lie, respectively, 16 kb and 29 kb upstream of HLA-DQB1. Four additional SNPs were associated with FL in the MHC region at 6p21.33, including the two SNPs, rs6457327 and rs2517448 (allelic p-values=2.90×10⁻² and 3.45×10⁻²), previously reported in our pooled DNA GWAS [2].

Two SNPs with trend p-values <0.05 were positively associated with CLL/SLL risk in the Mayo-GEC study (rs735665 and rs484458, trend p-values=2.64×10⁻³ and 6.43×10⁻³; Supplementary Table 7). Located in an intergenic region on 11q24.1, rs735665 was previously reported as a risk allele for CLL in a genome scan by Di Bernardo and colleagues [6]. Interestingly, two other SNPs, rs872071 and rs9378805, reported to be highly associated with CLL in that study were ranked among the top 100 CLL/SLL SNPs in our Stage 1 GWAS (trend p-values=7.17×10⁻⁴ and 1.75×10⁻⁴, respectively).

None of the SNPs genotyped in Stage 2 for DLBCL showed evidence of association with disease risk at the p-value<0.05 level. Failure to identify associated alleles may be due to the heterogeneity of DLBCL as evidenced from gene expression and immunophenotyping studies.

In Stage 3, the nine SNPs associated with FL and CLL/SLL risk were genotyped in 873 FL, 471 CLL/SLL, 916 DLBCL cases and 4,470 controls recruited from six case-control studies of European descent participating in the InterLymph Consortium (Supplementary Table 1, Supplementary Fig. 1). Again, rs10484561 was associated with increased FL risk across all studies, with trend p-values ranging from 2.21×10⁻² to 1.40×10⁻¹⁰ (Table 1). The combined p-value reached 1.12×10⁻²⁹ in the meta-analysis of samples from all three stages (combined odds ratio [OR]=1.95, 95% confidence interval [CI] 1.72-2.22, Supplementary Table 8, Supplementary Fig. 3). No evidence was found of heterogeneity across studies (Cochran's Q statistic=5.61, d.f.=7, p-value=0.5857; I² heterogeneity index=0%). Likewise, rs7755224 was associated with FL in the meta-analysis from all stages (combined p-value=2.00×10⁻¹⁹, OR=2.07, 95% CI 1.76-2.42, Supplementary Table 8, Supplementary Fig. 4) and no evidence was found of significant heterogeneity (Q=5.42, d.f.=3, p-value=0.1438; I²=44.6%). While the association between rs6457327 and FL was replicated in Stage 2, and in three other independent sample sets [2], this association was poorly replicated in the smaller studies from Stage 3, resulting in a weaker combined p-value (6.64×10⁻⁶; OR=0.68, 95% CI 0.58-0.79, Supplementary Table 8) than previously reported [2] (see Supplementary Table 9 for additional information).
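The heterogeneity indices quoted above follow from Cochran's Q and its degrees of freedom through the standard definition I² = max(0, (Q − d.f.)/Q). A minimal sketch, in which the function name `i_squared` is illustrative and not taken from the study:

```python
def i_squared(q, df):
    """I-squared heterogeneity index (in percent) from Cochran's Q."""
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0

# Values reported in the text for the two FL-associated SNPs.
print(round(i_squared(5.61, 7), 1))  # rs10484561: Q below d.f., so 0.0
print(round(i_squared(5.42, 3), 1))  # rs7755224: 44.6
```

Both results agree with the reported I² values of 0% and 44.6%.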

Table 1
Summary statistics for association between rs10484561 and FL in all three stages.

For CLL/SLL, rs735665 exhibited a trend p-value of 3.58×10⁻⁴ in Stage 3 and a combined p-value of 4.24×10⁻⁹ in the meta-analysis of all samples (OR=1.81, 95% CI 1.50-2.20; Supplementary Table 8). This finding confirms the previous association of rs735665 with CLL risk [6], and further supports its role in CLL susceptibility.
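As a consistency check on reported odds ratios, the point estimate of a Wald-type interval is the geometric midpoint of its bounds, because such intervals are symmetric on the log scale. The sketch below assumes the reported intervals are of this type, which the text does not state explicitly; the function name `log_scale_summary` is hypothetical.

```python
import math

def log_scale_summary(ci_low, ci_high, z=1.959964):
    """Recover the implied OR and log-scale standard error from a 95% CI.

    Assumes a Wald-type interval: exp(log(OR) +/- z * SE).
    """
    log_lo, log_hi = math.log(ci_low), math.log(ci_high)
    point = math.exp((log_lo + log_hi) / 2)  # geometric midpoint of the bounds
    se = (log_hi - log_lo) / (2 * z)         # half-width on the log scale
    return point, se

# rs735665 in CLL/SLL: reported OR=1.81, 95% CI 1.50-2.20.
point, se = log_scale_summary(1.50, 2.20)
print(round(point, 2), round(se, 3))
```

Because the published bounds are rounded to two decimals, the recovered values are approximate and should match the reported OR only to within rounding.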

To search for additional FL-associated variants that were not genotyped in the HLA-DQB1 region in Stage 1, we imputed SNP genotypes in a 500 kb region centered on rs10484561 (Figure 1). We identified four SNPs with imputed p-values<10⁻³ that were in complete LD with rs10484561 (D′=1, r²=1 in HapMap-CEU), although none showed stronger signals than rs10484561 or rs7755224 (Supplementary Table 10). These SNPs are located in a 100 kb region of relatively high LD in chromosome 6p21.32 that covers HLA-DQB1 and HLA-DQA1, and in close proximity to HLA-DRB1. Logistic regression analysis conditional on the associated SNPs in the region suggested that none were independent signals (Supplementary Table 11), and that a single locus or haplotype in LD with rs10484561/rs7755224 may harbor the causal variant(s). Analysis of SNP interactions and preliminary functional analyses did not provide further refinement of the signal (Supplementary Methods). However, one of the imputed SNPs, rs6457614, reported as a tag SNP for the HLA-DQB1*0501 allele in European, African and Japanese populations [7], suggested that the association signal may be driven by this protein variant. To verify our imputation, rs6457614 was genotyped in the SF1 study. We found 99% concordance between imputed and observed genotypes for rs6457614, which was in strong LD with rs10484561 (D′=0.99, r²=0.95 in controls). Because HLA-DRB1*0101 and HLA-DQA1*0101 form the most frequent haplotype containing HLA-DQB1*0501 in European American populations [8], tag SNPs for these two MHC Class II alleles (rs4947332 and rs1794265, respectively) [7] also were genotyped in SF1. Genotyping results revealed that markers for an extended haplotype that includes HLA-DRB1*0101-HLA-DQA1*0101-HLA-DQB1*0501-rs10484561 were associated with FL risk (OR=2.07, 95% CI=1.40-3.06, p-value=2.32×10⁻⁴).
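The LD measures D′ and r² used above can be computed from haplotype and allele frequencies with the standard two-locus definitions. The following is a minimal sketch; the function name `ld_stats` and the frequencies in the usage note are hypothetical and are not data from this study.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d_prime, r2
```

For instance, `ld_stats(0.1, 0.1, 0.1)` describes two alleles that always occur together and gives D′ and r² of 1 up to floating-point error, the situation the text calls complete LD. Two loci can have D′ near 1 and a lower r² when their allele frequencies differ.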

Several small studies have reported links between MHC Class II alleles and NHL [9,10,11], with somewhat conflicting results that may be attributable to small sample size, the combined analysis of mixed NHL subtypes, and/or differences in ethnic groups being analyzed [11]. In a large pooled study within the InterLymph consortium, the variant allele for TNF-308G>A (rs1800629) and a TNF/LTA haplotype located in the MHC Class III region were positively associated with DLBCL risk, but no association was found for FL [12]. Further, MHC Class I and II alleles have been evaluated in the context of TNF extended haplotypes, which revealed independent positive associations for TNF-308A and HLA-B*0801 alleles in risk of DLBCL [13]. These loci are not in LD with rs10484561 in our controls (r²=0.014 for rs10484561 and TNF-308A; r²=0.007 and 0.001 for rs10484561 and HLA-B*0801 tag SNPs [rs6457374, rs2844535] [7]), suggesting that our signal is not driven by these MHC Class III and I loci. Importantly, the association found here in the MHC Class II region also appears to be independent of the FL susceptibility locus at PSORS1 [2], since the LD block at HLA-DRB1-HLA-DQA1-HLA-DQB1 is located 1.43 Mb downstream of the PSORS1 locus (Figure 1), and the LD measurement between rs6457327 and rs10484561 (r²<0.01) in our controls and in HapMap-CEU indicates no correlation between these loci. Results from conditional logistic regression analysis adjusted for the additive effects of rs6457327 in the SF1 study provided additional evidence for an independent role of rs10484561 (p-value=3.46×10⁻⁵) in FL risk.

In conclusion, we have identified a new FL susceptibility locus at chromosome 6p21.32 with combined p-values of 1.12×10⁻²⁹ and 2.00×10⁻¹⁹ for rs10484561 and rs7755224, respectively, providing evidence that genetic variation in the MHC Class II region is strongly associated with FL susceptibility. These loci appear to be part of an extended haplotype that includes HLA-DRB1*0101-HLA-DQA1*0101-HLA-DQB1*0501. Of note, although rs10484561 showed a trend towards association in DLBCL in SF1 (trend p-value=3.58×10⁻²), we did not observe markedly significant associations between the MHC region and risk of CLL/SLL or DLBCL (Supplementary Fig. 5), which suggests that the influence of MHC genetic variation differs by NHL subtype.

Supplementary Material


This work was supported by grants CA122663 and CA104682 from the National Cancer Institute (NCI), National Institutes of Health (NIH) (C.F.S.); grants CA45614 and CA89745 from the NCI, NIH (E.A.H.); and the American Cancer Society (IRG-07-06401) and a charitable donation by Sylvia Chase (K.B.). E.H. is a faculty fellow of the Edmond J. Safra Bioinformatics program at Tel-Aviv University.

SCALE study: we are indebted to X.Y. Chen, H.B. Toh, K.K. Heng and W.Y. Meah from Genome Institute of Singapore for their support in genotyping analyses. We are also grateful to Professor L. Klareskog, Center for Molecular Medicine, and L. Alfredsson, Institute of Environmental Medicine, at the Karolinska Institute, Stockholm, Sweden for sharing DNA from their EIRA study control population and to Emil Rehnberg for help with imputing genotypes in the SCALE study. The following agencies contributed funding to SCALE that facilitated the present project: Agency for Science & Technology and Research of Singapore (A*STAR), National Cancer Institute, the Swedish Cancer Society, the Swedish Research Council, and the Danish Medical Research Council.

Mayo-GEC study: This study is supported by R01 CA91253 and R01 CA118444 from the National Cancer Institute.

NCI-SEER study: We thank Peter Hui of the Information Management Services, Inc. for programming support and gratefully acknowledge the contributions of the staff and scientists at the SEER centers of Iowa, Los Angeles, Detroit, and Seattle for the conduct of the study's field effort. We especially acknowledge the contributions of study site principal investigator Leslie Bernstein (Los Angeles). The NCI-SEER study was supported by the Intramural Research Program of the NIH (NCI), and by Public Health Service (PHS) contracts N01-PC-65064, N01-PC-67008, N01-PC-67009, N01-PC-67010, and N02-PC-71105.

Yale Study: This study is supported by grant CA62006 from the National Cancer Institute (NCI) and the Intramural Research Program of the National Institutes of Health (NCI).

NSW study: The NSW study was supported by a National Health and Medical Research Council of Australia project grant, Cancer Council NSW, a University of Sydney Medical Foundation Program Grant, and the Intramural Research Program of the US NIH (NCI).

BC study: Funding for the British Columbia study was from the Canadian Cancer Society and the Canadian Institutes of Health Research. AB-W is a Senior Scholar of the Michael Smith Foundation for Health Research.

EpiLymph study: Funding was available from the German José Carreras Leukemia Foundation (DJCLS_R04/08 and R07/26f) (A.N.), the EC 5th Framework Program Quality of Life grant No. QLK4-CT-2000-00422 (P.Boffetta, P.Brennan), the Federal Office for Radiation Protection grants No. StSch4261 and StSch4420 (Germany), the Spanish Ministry of Health grants CIBERESP (06/02/0073), FIS 08-1555 and Marato TV3 (051210) (Spain), La Fondation de France, n° 1999 0084 71 (M.Maynadié), Compagnia di San Paolo di Torino, Programma Oncologia 2001 (P.C.), the Health Research Board, Ireland, and the Ministry of Health of the CR, MZ0 MOU 2005 (L.F.).


Author Contributions: Design and interpretation of overall study: C.F.S., E.H., K.M.B., M.T.S., P.M.B. Primary data analysis: L.C., E.H. Drafting of manuscript: L.C., C.F.S. Critical revision of manuscript: E.H., K.M.B., A.B-W., B.A., C.M.Vajdic, S.C., M.T.S., P.M.B. Study design, genotyping and statistical analysis of individual studies: SF1, SF1B and SF2: L.C., E.H., J.R., N.K.A., L.A., E.A.H., M.T.S., P.M.B., C.F.S. SCALE: K.E.S., J.L., H-O.A., H.D., H.H., H-Q.L., K.H., M.Melbye, E.T.C., B.G. NCI-SEER: N.R., W.C., S.D., P.H., L.M.M., M.S., S.S.W., S.C., J.R.C. NSW: B.A., A.K., S.M., M.P.P., C.M.Vajdic. Yale: P.Boyle, Q.L., S.H.Z., Y.Z., T.Z. EpiLymph: A.N., N.B., Y.B., P.Boffetta, P.Brennan, K.B., P.C., L.F., M.Maynadié, L.F., S.S., A.S. BC: A.B-W., J.J.S. Mayo-GEC: S.L.S., S.J.A., T.G.C., N.J.C., N.E.C., J.R.C., J.M.C., L.R.G., C.A.H., N.E.K., M.C.L., J.F.L., G.E.M., K.G.R., L.Z.R., L.G.S., S.S.S., C.M.Vachon, J.B.W. All authors contributed to the final manuscript and approved its content.


1. Skibola CF, Curry JD, Nieters A. Haematologica. 2007;92:960–969.
2. Skibola CF, et al. Nat Genet. 2009;41:873–875.
3. Skibola CF, et al. PLoS One. 2008;3
4. Smedby KE, et al. J Natl Cancer Inst. 2005;97:199–209.
5. Cerhan JR, et al. Blood. 2007;110:4455–4463.
6. Di Bernardo MC, et al. Nat Genet. 2008;40:1204–1210.
7. de Bakker PI, et al. Nat Genet. 2006;38:1166–1172.
8. Klitz W, et al. Tissue Antigens. 2003;62:296–307.
9. Al-Tonbary Y, et al. Hematology. 2004;9:139–145.
10. Nathalang O, et al. Eur J Immunogenet. 1999;26:389–392.
11. Choi HB, et al. Int J Hematol. 2008;87:203–209.
12. Skibola CF, et al. Am J Epidemiol. 2010;171:267–276.
13. Abdou AM, et al. Leukemia. 2010;24:1055–1058.