Br J Haematol. Author manuscript; available in PMC 2013 December 1.
Published in final edited form as:
PMCID: PMC3614403

Common variants within 6p21.31 locus are associated with chronic lymphocytic leukaemia and potentially other non-Hodgkin lymphoma subtypes


A recent meta-analysis of three genome-wide association studies of chronic lymphocytic leukaemia (CLL) identified two common variants at the 6p21.31 locus that are associated with CLL risk. To verify and further explore the association of these variants with other non-Hodgkin lymphoma (NHL) subtypes, we genotyped 1196 CLL cases, 1699 NHL cases, and 2410 controls. We found significant associations between the 6p21.31 variants and CLL risk (rs210134: P=0.01; rs210142: P=6.8×10⁻³). These variants also showed a trend towards association with some of the other NHL subtypes. Our results validate the prior work and support specific genetic pathways for risk among NHL subtypes.

Keywords: CLL, NHL, SNPs, BAK1, risk locus


Chronic lymphocytic leukaemia (CLL) is a B-cell malignancy and one of the most common non-Hodgkin lymphomas (NHL). Through a meta-analysis of three genome-wide association studies (GWAS) of CLL, we provided evidence that two common variants (rs210134, rs210142) at the 6p21.31 locus contribute to the heritability of CLL (Slager et al, 2012). These two variants are highly correlated with each other (r²=0.8), with rs210142 located within intron 1 of BAK1 and rs210134 located 100 kb telomeric to BAK1. Because replication is important for the validity of the findings, we evaluated the association of these two variants with CLL risk in additional independent samples of 1,196 CLL cases and 2,142 controls collected from five CLL studies.

Further, because NHL comprises a group of closely related B- and T- cell neoplasms, we explored the association of these CLL-associated variants in 1,699 patients with other common NHL subtypes (583 diffuse large B-cell lymphomas [DLBCL], 585 follicular lymphomas [FL], 229 marginal zone lymphomas [MZL], 156 T-cell lymphomas [TCL], and 135 mantle cell lymphomas [MCL]).

Finally, we sequenced the BAK1 gene in germline DNA from 67 CLL cases and 39 controls to identify functional variants that are correlated with either rs210134 or rs210142.


Study Participants

Five studies contributed Caucasian samples for genotyping (Supplemental Table I). The Mayo Clinic Lymphoma case-control study is a clinic-based study of incident cases and frequency-matched controls (based on age, sex, and residence) conducted in Rochester, Minnesota, between 2002 and 2008. The University of Utah CLL case-control study ascertained cases through the Huntsman Cancer Hospital clinics and the Utah Cancer Registry between 1979 and 2011. Controls were frequency matched (based on age, sex, and residence) and ascertained via the Utah Population Database. The San Francisco (SF) Bay Area study is a population-based case-control study of NHL and included incident cases diagnosed from 2001 to 2006 and controls that were frequency matched to cases by age in five-year groups, sex, and county. Duke University ascertained incident CLL cases through the Duke Haematology clinic between 1999 and 2011. Finally, the Genetic Epidemiology of CLL (GEC) Consortium started in 2004 and is an ongoing family-based study in which families with two or more members with prevalent CLL are recruited through haematology clinics or through the internet.

The NHL diagnoses were confirmed by study pathologists and classified according to the WHO classification (Harris et al, 2008; Swerdlow et al, 2008). Each study was approved by its respective institutional review board; all participants provided written informed consent.


Genotyping of the GEC/Mayo Clinic samples was performed at Mayo Clinic as part of a larger genotyping project using a custom Illumina Infinium array. Genotyping of the University of Utah/Duke University samples was performed using the Illumina GoldenGate assay. Genotyping of the SF samples was done using TaqMan (Applied Biosystems, Carlsbad, California, USA) and was corroborated using Sanger sequencing. Standard genotyping quality control procedures were performed at each genotyping center and included genotyping duplicate samples within each center, dropping samples with call rates <80%, and testing for Hardy-Weinberg equilibrium. We found >99% genotyping concordance among duplicate samples within genotyping centers.
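As an illustration of the Hardy-Weinberg equilibrium check mentioned among the quality control steps, the following is a minimal sketch of a 1-df Pearson chi-square test on genotype counts. The function name `hwe_chi_square` and its inputs are hypothetical; the text does not state which HWE test (chi-square or exact) or which threshold each genotyping center applied.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts (hypothetical inputs).
    Assumes the SNP is polymorphic, so every expected count is positive.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A small P value from such a test would flag a SNP whose genotype proportions depart from Hardy-Weinberg expectations, which often indicates a genotyping problem.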

Statistical analysis

Main analyses were performed using SAS v9.2. The association between each SNP and NHL risk was assessed by the Cochran-Armitage trend test. Odds ratios (OR) and 95% confidence intervals (CI) were calculated using logistic regression with and without adjustment for age and sex covariates. Meta-analysis of the results reported here and our previously reported results (Slager et al, 2012) was conducted under a fixed-effects model. Cochran’s Q statistic, to test for heterogeneity (P.het), and the I² statistic, to quantify the proportion of the total variation due to heterogeneity, were calculated (Higgins & Thompson, 2002). The linkage disequilibrium (LD) metric (r²) between rs210134/rs210142 and other variants in the region was calculated using the CEU samples from version 2 of the 1000 Genomes data. Case-only analysis of rs210142 genotypes with either CLL platelet counts or CLL Rai stage was conducted using the Kruskal-Wallis test or the chi-square test, respectively.
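For readers unfamiliar with the Cochran-Armitage trend test, the following is a minimal sketch of the statistic for a single SNP, assuming additive genotype scores of 0, 1, and 2 copies of the tested allele. The analyses in this study were run in SAS; the function name `cochran_armitage_trend` and the input layout are hypothetical and serve only to show the calculation.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x 3 genotype table.

    cases and controls hold genotype counts ordered by number of copies
    of the tested allele (hypothetical inputs). Returns the z statistic
    and a two-sided P value from the normal approximation. Assumes at
    least two genotype groups are non-empty, so the variance is positive.
    """
    n_cases = sum(cases)
    n_controls = sum(controls)
    n_total = n_cases + n_controls
    col_totals = [r + s for r, s in zip(cases, controls)]

    sum_w_cases = sum(w * r for w, r in zip(weights, cases))
    sum_w_all = sum(w * n for w, n in zip(weights, col_totals))
    sum_w2_all = sum(w * w * n for w, n in zip(weights, col_totals))

    # Trend statistic and its variance under the null hypothesis
    t_stat = n_total * sum_w_cases - n_cases * sum_w_all
    variance = (n_cases * n_controls / n_total) * (
        n_total * sum_w2_all - sum_w_all ** 2
    )
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return z, p_value
```

With this scoring, a negative z indicates that the tested allele is less frequent in cases than in controls, which corresponds to a per-allele OR below 1.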


To identify potential functional variants, we sequenced the BAK1 gene in 67 CLL cases from CLL pedigrees obtained in the GEC consortium and 39 controls from the Mayo Clinic Biobank (Slager et al, 2011). Sequencing of the exons was performed using the Agilent SureSelect 50Mb capture kit. Briefly, 100 bp paired-end sequencing reads were aligned to Build 37 using Novoalign software (Novocraft Technologies, Selangor, Malaysia), and realignment was performed using the Genome Analysis Toolkit (GATK) (McKenna et al, 2010). Single-sample variant calling was performed using the GATK UnifiedGenotyper. Variants with total read depth >9 and high quality scores (>Q20) were included in analyses. The frequency of the variants was compared between CLL cases and controls using a chi-square test.

Results and discussion

A total of 1,196 CLL cases, 1,699 other NHL cases, and 2,410 controls were available for genotyping. However, two genotyping teams (Duke/Utah and Mayo/GEC) were each able to successfully genotype only one of the two SNPs. rs210142 had a low call rate for the Duke/Utah samples and was dropped from statistical analyses; rs210134 could not be multiplexed with the other variants genotyped on the Mayo/GEC Infinium array. For the SNPs that were successfully genotyped, the SNP call rates were >92% and did not differ significantly between cases and controls. Further, because these two variants are highly correlated with each other, the results were expected to be similar across these two variants; this was confirmed in our results (Table I). For both variants, we found an association with CLL risk (P<0.01 for the combined samples) and effect sizes similar to each other and to our previously reported findings (Slager et al, 2012). We report the ordinal OR=0.82 (95% CI: 0.71, 0.95) and OR=0.78 (95% CI: 0.65, 0.93) for rs210142 and rs210134, respectively (Table I). These results did not change after adjusting for age and sex (results not shown). In a meta-analysis of our previous results (Slager et al, 2012) with the new data reported herein, we continue to have strong support for an association between these two SNPs and CLL risk (Fig. 1).
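The fixed-effects meta-analysis with Cochran's Q and I² can be sketched as an inverse-variance weighted average of per-study log odds ratios. This is an illustrative outline only: the function names are hypothetical, the standard error is back-calculated from a 95% CI under a normal approximation, and the heterogeneity P value (from a chi-square distribution with k-1 degrees of freedom) is not computed here.

```python
import math


def log_or_and_se(odds_ratio, ci_lower, ci_upper):
    """Convert an OR and its 95% CI to a log OR and standard error.

    Assumes the CI was built as exp(log OR +/- 1.96 * SE).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * 1.96)
    return log_or, se


def fixed_effects_meta(log_ors, std_errs):
    """Inverse-variance fixed-effects pooling with Cochran's Q and I^2.

    Returns (pooled log OR, pooled SE, Q, I^2 as a percentage).
    """
    weights = [1 / se ** 2 for se in std_errs]
    w_sum = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / w_sum
    pooled_se = math.sqrt(1 / w_sum)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I^2 (Higgins & Thompson, 2002), truncated at zero
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, q, i2
```

For example, `log_or_and_se(0.82, 0.71, 0.95)` converts the rs210142 estimate reported above to the log scale; pooling it with estimates from other studies would require their ORs and CIs, which are not reproduced in this text.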

Fig 1
Forest plots of effect size for (A) rs210134 and (B) rs210142. Boxes denote per allele odds ratios; the size of box is proportional to the sample size. Horizontal lines represent 95% confidence intervals. The dotted vertical line denotes the null value. ...
Table I
Association between rs210142 and rs210134 with risk of NHL/CLL

We next assessed the association of rs210142 with the other NHL subtypes (Table I). No association was observed for FL, DLBCL, or MZL (P>0.05). MCL and TCL showed effect sizes similar to that of CLL, although these were not statistically significant given the limited sample sizes and statistical power for these two subtypes. MCL and CLL have very similar immunophenotype patterns, but MCL overexpresses cyclin D1 due to an invariant t(11;14) chromosomal translocation. In contrast, TCL and CLL most likely arise from different cells of origin, and this suggestive association of BAK1 variants with TCL is of potential interest.

Using the 1000 Genomes CEU data, we computed the LD among all available variants at the 6p21.31 locus. Two variants (rs511515, rs210143) were in high LD (r²>0.85) with the two GWAS SNPs. rs511515 is located within the 3′ UTR of BAK1, and rs210143 is in intron 1 and is 107 bp from rs210142. Sequencing the exons of BAK1 in 67 CLL cases and 39 controls, we identified four variants in our CLL cases, including one of the two correlated SNPs (rs511515), rs561276, and two novel rare variants. The two novel variants were not seen in our controls, and had an allele frequency of 1% (rs511515) and 3% (rs561276) in our cases. The allele frequencies of rs511515 and rs561276 did not statistically differ between the 67 cases and 39 controls (P=0.56 and P=0.26, respectively). Neither of these novel variants nor the two exonic SNPs changed the amino acid of the protein based on SIFT (Kumar et al, 2009).
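The r² values used to identify variants correlated with the GWAS SNPs can be illustrated with a short sketch that computes r² for a pair of biallelic SNPs from phased haplotypes. The function name `ld_r2` and the 0/1 allele coding are assumptions for illustration; the text does not state which software was used for the LD calculation.

```python
def ld_r2(haplotypes):
    """Compute the LD metric r^2 between two biallelic SNPs.

    haplotypes is a list of (a, b) pairs, one per phased chromosome,
    where a and b are 0/1 allele codes at the first and second SNP
    (hypothetical coding).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n  # allele 1 frequency, SNP 1
    p_b = sum(b for _, b in haplotypes) / n  # allele 1 frequency, SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                     # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom
```

An r² of 1 means the two SNPs always carry the same information, whereas an r² of 0 means the alleles at the two sites are statistically independent in the sample.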

Interestingly, variants in or near the BAK1 gene have been shown to be associated with platelet counts (Lo et al, 2011; Qayyum et al, 2012; Soranzo et al, 2009), with platelet counts decreasing as the number of major alleles increases. CLL staging at diagnosis includes determining the absolute platelet count. Thus, a CLL patient presenting with thrombocytopenia (i.e., <100×10⁹ cells/L) due to bone marrow replacement by CLL is designated Rai Stage 4 and will have a poorer prognosis. We therefore assessed the relationship of CLL Rai stage and platelet counts at diagnosis with rs210142 genotypes in a case-only analysis using the CLL cases from the GEC/Mayo Clinic case-control study. As expected, we found that the major allele tended to be more frequent among CLL cases presenting with Rai stage 4 disease (major allele frequency=0.86) than among those presenting with the lower stages (major allele frequency=0.77), although this finding was not statistically significant (P=0.17). However, we did not see a trend in platelet counts that decreased with the number of major alleles; the median platelet count for CLL patients with two copies of the major allele was only slightly higher (mean=208×10⁹ cells/L) than that for CLL patients with no copies of the major allele (mean=187×10⁹ cells/L), P=0.15.

In conclusion, this study confirms the previously reported associations between CLL risk and SNPs located within the 6p21.31 locus (Slager et al, 2012) and extends the findings to potentially include associations with MCL and TCL. Our sequencing effort has not yet identified any functional variants linked to the 6p21.31 variants, and additional studies are needed to evaluate other biological mechanisms of this locus in relation to the leukemic process.

Supplementary Material

Suppl. Table I


We thank the study participants and the study coordinators for their work in recruitment.

In the GEC Consortium and Mayo Clinic case-control study, the work was supported in part by National Institutes of Health (NIH) grants CA118444 (SLS), CA148690 (SLS), CA97274 (JRC) and CA92153 (JRC). The genotyping at the Mayo Clinic Genotyping Core is supported, in part, by CA15083 (JMC).

The Utah case-control study was supported in part by NIH grant CA134674 (NJC). Data collection in Utah was made possible by the Utah Population Database (UPDB) and the Utah Cancer Registry (UCR). Partial support for all data in the UPDB was provided by the University of Utah Huntsman Cancer Institute. The UCR is funded by contract HHSN261201000026C from the NCI Surveillance Epidemiology and End Results program with additional support from the Utah State Department of Health and the University of Utah.

Sample collection at Duke University was supported by a Leukemia & Lymphoma Society Career Development Award (to MCL), by the Bernstein Family Fund for Leukemia and Lymphoma Research, the Veterans Affairs Research Service, and by NIH CA134919 (MCL).

The SF-case-control study was supported by National Institutes of Health grants CA122663, CA154643 and CA104682 (CFS) and CA45614 and CA89745 (PMB). E.H. is a faculty fellow of the Edmond J. Safra Bioinformatics program at Tel-Aviv University.


Author contributions The study was designed and financial support was obtained by SLS and JRC. The manuscript was drafted by SLS with contributions from all co-authors. Statistical analyses were conducted by SJA, DJS, SKM, AHW, and KGR. JMC oversaw genotyping at Mayo Clinic; LC, FCMS, SS, NKA, and CFS oversaw genotyping and CFS processed the biological specimens for the SF study; and NJC oversaw genotyping at Utah. PMB oversaw recruitment of individuals for the SF study; JBW and MCL oversaw recruitment of individuals from Duke University; SLS, JRC, and CMV oversaw recruitment of individuals from Mayo Clinic; NJC and MG oversaw recruitment of individuals from Utah. All authors contributed to the final paper.

Conflict of interest The authors declare no competing financial interests.


  • Harris NL, Swerdlow S, Campo E, Jaffe ES, Stein H, Pileri S, Thiele J, Vardiman J. The World Health Organization (WHO) classification of lymphoid neoplasms: What’s new? Annals of Oncology. 2008;19:119.
  • Higgins JP, Thompson SG. Quantifying heterogeneity in a meta-analysis. Statistics in Medicine. 2002;21:1539–1558. [PubMed]
  • Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nature Protocols. 2009;4:1073–1081. [PubMed]
  • Lo KS, Wilson JG, Lange LA, Folsom AR, Galarneau G, Ganesh SK, Grant SF, Keating BJ, McCarroll SA, Mohler ER, 3rd, O’Donnell CJ, Palmas W, Tang W, Tracy RP, Reiner AP, Lettre G. Genetic association analysis highlights new loci that modulate hematological trait variation in Caucasians and African Americans. Human Genetics. 2011;129:307–317. [PMC free article] [PubMed]
  • McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research. 2010;20:1297–1303. [PubMed]
  • Qayyum R, Snively BM, Ziv E, Nalls MA, Liu Y, Tang W, Yanek LR, Lange L, Evans MK, Ganesh S, Austin MA, Lettre G, Becker DM, Zonderman AB, Singleton AB, Harris TB, Mohler ER, Logsdon BA, Kooperberg C, Folsom AR, Wilson JG, Becker LC, Reiner AP. A meta-analysis and genome-wide association study of platelet count and mean platelet volume in african americans. PLoS Genetics. 2012;8:e1002491. [PMC free article] [PubMed]
  • Slager SL, Rabe KG, Achenbach SJ, Vachon CM, Goldin LR, Strom SS, Lanasa MC, Spector LG, Rassenti LZ, Leis JF, Camp NJ, Glenn M, Kay NE, Cunningham JM, Hanson CA, Marti GE, Weinberg JB, Morrison VA, Link BK, Call TG, Caporaso NE, Cerhan JR. Genome-wide association study identifies a novel susceptibility locus at 6p21.3 among familial CLL. Blood. 2011;117:1911–1916. [PubMed]
  • Slager SL, Skibola CF, Di Bernardo MC, Conde L, Broderick P, McDonnell SK, Goldin LR, Croft N, Holroyd A, Harris S, Riby J, Serie DJ, Kay NE, Call TG, Bracci PM, Halperin E, Lanasa MC, Cunningham JM, Leis JF, Morrison VA, Spector LG, Vachon CM, Shanafelt TD, Strom SS, Camp NJ, Weinberg JB, Matutes E, Caporaso NE, Wade R, Dyer MJ, Dearden C, Cerhan JR, Catovsky D, Houlston RS. Common variation at 6p21.31 (BAK1) influences the risk of chronic lymphocytic leukemia. Blood. 2012;120:843–846. [PubMed]
  • Soranzo N, Spector TD, Mangino M, Kuhnel B, Rendon A, Teumer A, Willenborg C, Wright B, Chen L, Li M, Salo P, Voight BF, Burns P, Laskowski RA, Xue Y, Menzel S, Altshuler D, Bradley JR, Bumpstead S, Burnett MS, Devaney J, Doring A, Elosua R, Epstein SE, Erber W, Falchi M, Garner SF, Ghori MJ, Goodall AH, Gwilliam R, Hakonarson HH, Hall AS, Hammond N, Hengstenberg C, Illig T, Konig IR, Knouff CW, McPherson R, Melander O, Mooser V, Nauck M, Nieminen MS, O’Donnell CJ, Peltonen L, Potter SC, Prokisch H, Rader DJ, Rice CM, Roberts R, Salomaa V, Sambrook J, Schreiber S, Schunkert H, Schwartz SM, Serbanovic-Canic J, Sinisalo J, Siscovick DS, Stark K, Surakka I, Stephens J, Thompson JR, Volker U, Volzke H, Watkins NA, Wells GA, Wichmann HE, Van Heel DA, Tyler-Smith C, Thein SL, Kathiresan S, Perola M, Reilly MP, Stewart AF, Erdmann J, Samani NJ, Meisinger C, Greinacher A, Deloukas P, Ouwehand WH, Gieger C. A genome-wide meta-analysis identifies 22 loci associated with eight hematological parameters in the HaemGen consortium. Nature Genetics. 2009;41:1182–1190. [PMC free article] [PubMed]
  • Swerdlow SH. World Health Organization Classification of Tumours of Haematopoietic and Lymphoid Tissues; Volume 2 of WHO Classification of Tumours Series. 2008.