There is strong and consistent evidence that a genetic component contributes to the etiology of chronic lymphocytic leukemia (CLL). A recent genome-wide association study (GWAS) of CLL identified 7 genetic variants that increased the risk of CLL within a European population.
We evaluated the association of these variants, or variants in linkage disequilibrium (LD) with these variants, with CLL risk in an independent sample of 438 CLL cases and 328 controls.
Of these 7 SNPs, 6 had p-trend < 0.05 and estimated odds ratios (ORs) that were strikingly comparable to those of the previous study. Associations were seen for rs9378805 near IRF4 (OR = 1.47; 95% CI: 1.19, 1.80; p-trend = 0.0003) and rs735665 near GRAMD1B (OR = 1.47; 95% CI: 1.14, 1.89; p-trend = 0.003). However, no associations (P > 0.05) were found for rs11083846 or for any SNPs in LD with rs11083846.
Our results confirm the previous findings and further support the role of a genetic basis in the etiology of CLL; however, more research is needed to elucidate the causal SNP(s) and the potential manner in which these SNPs or linked SNPs function in CLL pathogenesis.
Chronic lymphocytic leukemia (CLL) is a malignant B-cell lymphoproliferative disorder and is most common in male Caucasians, with a lifetime risk of 1 in 200(1). It has been known for decades that a genetic component affects CLL risk, with risk on the order of 8-fold higher for individuals who have relatives with CLL than for the general population(2, 3). However, identification of susceptibility genes has remained elusive. Three genome-wide linkage studies have been conducted(4-6). Although regions of interest were found, no definitive susceptibility genes were identified. Further, a number of candidate gene studies have been conducted (see review in Slager et al.(7)), but no consistent results were identified across these studies. Only recently, through a GWAS of 517 CLL cases from the United Kingdom and 1,438 British 1958 Birth Cohort controls, have researchers identified 7 genetic variants that increase CLL risk(8). These findings were replicated in two internal validation samples of 520 cases and 891 controls (validation 1) and 504 cases and 786 controls (validation 2). We evaluated the association of these 7 variants, or variants in linkage disequilibrium (LD) with these variants, with CLL risk in an independent sample of 438 U.S. non-Hispanic Caucasian CLL cases and 328 controls.
All studies were approved by the institutional review boards. Peripheral blood samples were obtained from two ongoing studies: the Genetic Epidemiology of CLL (GEC) Consortium and the Mayo Clinic non-Hodgkin lymphoma (NHL)/CLL study. The GEC consortium is a collaboration of researchers from seven institutions with the overall aim of investigating the genetic basis of CLL through the collection of CLL families (i.e., families with two or more relatives with CLL). A total of 110 Caucasian CLL patients from 110 families were available at the time of genotyping. These families were collected from: Duke University, Mayo Clinic, M.D. Anderson Cancer Center, National Cancer Institute, University of Minnesota/Minneapolis Veteran Administration Medical Center, University of California San Diego, and University of Utah. The second study is the Mayo Clinic NHL/CLL case-control study, which is an ongoing, clinic-based study conducted in Rochester, Minnesota(9). Briefly, newly diagnosed NHL/CLL patients who were aged 20 years or older, HIV negative, and residents of the Midwest United States at the time of diagnosis are enrolled. Clinic-based controls are ascertained from patients visiting the General Internal Medicine clinic. Eligibility requirements included age 20 years or older and a resident of Minnesota, Iowa or Wisconsin; patients were excluded if they had prior diagnoses of lymphoma, leukemia, or HIV infection. From this study, genotype data were available from 328 CLL cases and 475 controls.
The diagnoses of all CLL cases across both studies were reviewed and confirmed by a hematopathologist and classified according to the WHO criteria(10).
This replication used genotype data from two larger ongoing genotyping projects from the GEC Consortium and the Mayo Clinic case-control studies (Figure 1). For the first project, we genotyped 438 CLL cases (110 familial CLL and 328 sporadic CLL) and 328 controls using the Affymetrix 6.0 SNP Array. Rigorous quality control measures were implemented, including dropping SNPs or individuals with call rates < 95% and evaluating subject relatedness, evidence of population stratification, and sex discrepancies. Concordance of genotype calls across duplicates was > 99.7%. Four of the 7 SNPs (rs17483466, rs13397985, rs9378805, and rs735665) identified by Di Bernardo et al.(8) were directly available on the Affymetrix 6.0 platform for the 407 CLL cases and 296 controls who passed quality control criteria. In the second genotyping project, 129 sporadic CLL cases and 475 controls were genotyped on the ParAllele Immune and Inflammation SNP panel. Quality control measures similar to those listed above were implemented in this project; for full details see(9). One SNP (rs872071) of the 7 identified by Di Bernardo et al.(8) was available from this platform in all 129 CLL cases and 468 controls who were available at the time of genotyping. A total of 223 subjects and 1,974 SNPs overlapped between these two projects, providing an additional assessment of genotype concordance (> 99.5% concordant).
Tests for Hardy-Weinberg Equilibrium were done using a Pearson goodness-of-fit test. We used logistic regression to estimate odds ratios (ORs) and corresponding 95% confidence intervals (CIs). A p-trend was calculated assuming an ordinal genotypic relationship. We also evaluated whether the associations between CLL risk and each SNP varied by family history status. We used polytomous logistic regression with three outcome categories (familial CLL cases, sporadic CLL cases, and controls), and we tested for heterogeneity of the ORs across the two case groups.
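The two core calculations described above can be illustrated with a minimal sketch: a Pearson goodness-of-fit test for Hardy-Weinberg equilibrium and a logistic regression with the genotype coded as 0, 1, or 2 copies of the risk allele, which yields a per-allele OR and a Wald p-trend. This is not the authors' software; the function names and the genotype counts in the usage lines are hypothetical, the fit is unadjusted, and the Newton-Raphson solver is one of several ways to obtain the estimates.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Assumes the SNP is polymorphic, so every expected count is > 0.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)              # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))   # upper tail of chi-square, 1 df
    return chi2, p_value

def additive_logistic(cases, controls, n_iter=25):
    """Fit logit P(case) = b0 + b1 * g, where g = 0, 1, 2 risk alleles.

    cases[g] and controls[g] are subject counts for genotype g.
    Returns (per-allele OR, 95% CI lower, 95% CI upper, Wald p-trend).
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):                      # Newton-Raphson iterations
        u0 = u1 = i00 = i01 = i11 = 0.0
        for g in range(3):
            n = cases[g] + controls[g]
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * g)))
            resid = cases[g] - n * p             # score contribution
            w = n * p * (1.0 - p)                # information weight
            u0 += resid
            u1 += g * resid
            i00 += w
            i01 += g * w
            i11 += g * g * w
        det = i00 * i11 - i01 * i01
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    se = math.sqrt(i00 / det)                    # SE of b1 from the last information matrix
    z = b1 / se
    p_trend = math.erfc(abs(z) / math.sqrt(2.0)) # two-sided Wald p-value
    return math.exp(b1), math.exp(b1 - 1.96 * se), math.exp(b1 + 1.96 * se), p_trend

# Hypothetical counts, for illustration only (not data from this study).
print(hwe_chi_square(25, 50, 25))
print(additive_logistic(cases=(10, 20, 40), controls=(10, 10, 10)))
```

In the second usage line the case-to-control odds double with each additional allele, so the fitted per-allele OR is 2.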
Two SNPs (rs7176508 and rs11083846) were not genotyped on either of the two genotyping platforms. For these, we reported statistical results for all neighboring SNPs that were in high LD (r² > 0.9) with these SNPs based on HapMap, NCBI build 36. The pairwise LD measure (r²) was estimated using the program Haploview. Further, we imputed genotypes for these two untyped SNPs using MACH 1.0.14(11). The 60 unrelated HapMap CEU samples were used to obtain the phased chromosomes, and the expected genotype was estimated based on the posterior probability. We imputed genotypes for only the 407 cases and 296 controls who were genotyped on the Affymetrix 6.0 platform, since the regions around the two untyped SNPs had no genotype data available on the ParAllele Immune and Inflammation SNP panel.
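To make the two quantities in this paragraph concrete, the sketch below computes the pairwise LD measure r² and the expected genotype (dosage) from genotype posterior probabilities. It makes no claim about how Haploview or MACH work internally: for simplicity it assumes phased haplotype counts are already available, and the function names and example numbers are hypothetical.

```python
def ld_r_squared(n_ab, n_aB, n_Ab, n_AB):
    """r^2 between two biallelic SNPs from phased haplotype counts.

    n_ab, n_aB, n_Ab, n_AB are counts of the four two-locus haplotypes,
    where a/A are the alleles at SNP 1 and b/B the alleles at SNP 2.
    Assumes both SNPs are polymorphic in the sample.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    p_a = (n_ab + n_aB) / n          # allele a frequency at SNP 1
    p_b = (n_ab + n_Ab) / n          # allele b frequency at SNP 2
    d = n_ab / n - p_a * p_b         # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

def expected_genotype(p_hom_ref, p_het, p_hom_alt):
    """Expected allele count (0 to 2) from posterior genotype probabilities."""
    return p_het + 2.0 * p_hom_alt

# Hypothetical inputs, for illustration only.
print(ld_r_squared(30, 0, 0, 70))        # only two haplotypes observed: complete LD
print(ld_r_squared(25, 25, 25, 25))      # haplotypes in equilibrium: no LD
print(expected_genotype(0.1, 0.3, 0.6))
```

A proxy SNP would pass the filter described above when its r² with the untyped SNP exceeds 0.9.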
Gender and age distribution were similar across the two genotyping projects. Among the cases, 64% were male, and the mean age at diagnosis was 61.4 years (+/− 11.1). Among the controls, 57% were male and the mean age of recruitment was 61.8 years (+/− 12.5).
The associations between CLL risk and the 7 SNPs identified by Di Bernardo et al.(8) are shown in Table 1. For all 5 typed SNPs, the observed p values were < 0.05. Our strongest finding was for SNP rs9378805 on chromosome 6p25.3, with a 47% increased risk of CLL with each allele (OR = 1.47; 95% CI: 1.19, 1.80; p trend = 0.0003). This effect size is very similar to that of Di Bernardo et al., who reported a 51% increased risk (95% CI: 1.38, 1.65). For rs872071, which was in LD (r² = 0.74) with rs9378805, we also observed an association (OR = 1.35; 95% CI: 1.02, 1.78; p trend = 0.037), but the effect was slightly attenuated compared to that of Di Bernardo et al. (OR = 1.54; 95% CI: 1.41, 1.69). Our second strongest finding among the genotyped SNPs was rs735665 on chromosome 11q24.1 (OR = 1.47; 95% CI: 1.14, 1.89; p trend = 0.003), followed by the two SNPs on chromosome 2q13 (rs17483466 and rs13397985). The magnitudes of these associations were also comparable to those of Di Bernardo et al.
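As a consistency check on the reported estimates, an OR and its 95% CI can be converted back to an approximate standard error, Wald z statistic, and two-sided p value on the log-odds scale. The sketch below assumes the intervals are symmetric Wald intervals on the log scale, which the text does not state explicitly; because the published values are rounded, the recovered p values are only approximate. The function name is hypothetical.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log OR), Wald z, and two-sided p from an OR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail probability
    return se, z, p

# Rounded estimates as reported in the text above.
for snp, (or_, lo, hi) in {
    "rs9378805": (1.47, 1.19, 1.80),
    "rs735665": (1.47, 1.14, 1.89),
}.items():
    se, z, p = wald_from_ci(or_, lo, hi)
    print(f"{snp}: SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under this assumption the recovered p values agree, after rounding, with the reported p trend values of 0.0003 and 0.003.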
For the remaining two SNPs identified by Di Bernardo et al., located on 15q23 (rs7176508) and 19q13 (rs11083846), we did not have genotype data available. However, we evaluated genotyped SNPs that were in high LD (r² > 0.9) with rs7176508 and rs11083846. As shown in Table 1, the 4 SNPs in high LD with rs7176508 on chromosome 15q23 were associated with CLL risk (p trend < 0.001), with effect sizes comparable to that of Di Bernardo et al., who reported an OR = 1.37 (95% CI: 1.26, 1.50). We also imputed genotypes for rs7176508, with a resulting imputed OR = 1.50 (95% CI: 1.19, 1.90). The quality score for the imputed genotypes was 0.99, indicating that our imputed genotypes are accurate. We did not replicate the findings on chromosome 19, either with the two neighboring typed SNPs (p trend > 0.05) or with imputed genotypes for rs11083846. The quality score for these imputed genotypes was 0.96.
Because 102 of the CLL cases had a family history of CLL, we evaluated CLL risk by family history status for the genotyped SNPs. Risk estimates were generally higher among the familial CLL cases than among the sporadic CLL cases (Table 2), but none of the differences had P < 0.05.
We found evidence of association for 6 of the 7 SNPs previously implicated in CLL risk in an independent sample of US Caucasians. Even after a Bonferroni correction for 6 independent regions (i.e., correcting for 6 independent tests), 4 SNPs remained significant in our sample. There was suggestive evidence that the associations of the SNPs differed by family history status of CLL. A next step would be to evaluate whether these variants co-segregate with CLL in affected relatives.
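The Bonferroni correction mentioned here divides the nominal significance level by the number of independent tests, giving a threshold of 0.05 / 6, or about 0.0083. The sketch below applies that threshold to the three p trend values quoted explicitly in the text; it does not reproduce Table 1, so it is not a full account of which 4 SNPs remained significant. The dictionary and function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

# p trend values quoted in the text (Table 1 itself is not reproduced here).
reported_p_trend = {
    "rs9378805": 0.0003,
    "rs735665": 0.003,
    "rs872071": 0.037,
}

threshold = bonferroni_threshold(alpha=0.05, n_tests=6)
survivors = [snp for snp, p in reported_p_trend.items() if p < threshold]
print(f"threshold = {threshold:.4f}")
print("below threshold:", survivors)
```

Of these three values, only 0.037 (rs872071) exceeds the corrected threshold.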
We were unable to replicate the association between CLL risk and rs11083846 located on 19q13. Limited statistical power may be an explanation. Assuming a 5% type I error rate and an effect size of 1.35 (reported in Di Bernardo et al.), we have between 67% and 74% power to detect this effect size with an allele frequency between 0.22 and 0.28. Alternatively, this could be related to inherent differences in the populations under study. The allele frequency of this SNP in our controls was higher than that in the controls of Di Bernardo et al., resulting in a smaller allele frequency difference between cases and controls.
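The text does not say how the power figures were computed. One simple way to approximate them is a two-sided, two-sample z-test on allele frequencies, sketched below under several assumptions that are not confirmed by the source: the OR of 1.35 acts multiplicatively on the allele odds, the sample is the 407 cases and 296 controls with Affymetrix 6.0 data, and the negligible rejection probability in the opposite tail is ignored. The function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def allelic_power(n_cases, n_controls, control_freq, odds_ratio, z_crit=1.959964):
    """Approximate power of a two-sided allelic test at a 5% type I error rate."""
    case_odds = odds_ratio * control_freq / (1.0 - control_freq)
    case_freq = case_odds / (1.0 + case_odds)
    # Each subject contributes two alleles.
    variance = (case_freq * (1.0 - case_freq) / (2 * n_cases)
                + control_freq * (1.0 - control_freq) / (2 * n_controls))
    z = (case_freq - control_freq) / math.sqrt(variance)
    return normal_cdf(z - z_crit)

# Assumed sample sizes: subjects with Affymetrix 6.0 genotypes.
for freq in (0.22, 0.28):
    power = allelic_power(407, 296, control_freq=freq, odds_ratio=1.35)
    print(f"allele frequency {freq:.2f}: power = {power:.2f}")
```

Under these assumptions the approximation gives power of roughly two-thirds to three-quarters, in line with the range quoted above.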
As reported in Di Bernardo et al., three of the replicated SNPs are located within genes: rs872071 within the 3′ UTR of IRF4, rs17483466 in intron 10 of ACOXL, and rs13397985 in intron 1 of SP140. The SNP rs9378805 is about 10 kb from the 3′ end of IRF4. IRF4 is of great interest because of its role in lymphocyte development and previous associations with other lymphomas(12-15).
The strengths of this study include the well-characterized CLL cases and controls, as well as a number of CLL cases fairly comparable to that in each of the three phases of Di Bernardo et al. This study has several limitations. First, genotype data were pulled from two separate ongoing genotyping studies. As a result, not all subjects were genotyped across the 5 typed SNPs. The impact of this, however, was minor: even with fewer subjects genotyped and thus reduced statistical power, we were still able to identify all the SNPs but one, with effect sizes comparable to those previously reported. Second, two previously implicated SNPs were not typed in our study. To overcome this limitation, we imputed genotypes at these SNPs; we also evaluated associations of neighboring genotyped SNPs in high LD with the implicated SNPs. The observed genotype data at neighboring SNPs and the imputed data at the previously implicated SNPs were consistent with each other for each of the two regions on chromosomes 15 and 19. Further, our imputation results had high accuracy. Thus, although we did not have actual data for these SNPs, surrogate information was available and reliable.
In conclusion, this study confirms the previously reported associations between CLL risk and SNPs located on chromosomes 2q13, 2q37.1, 6p25.3, 11q24.1, and 15q23. Candidate genes have been identified in or near the regions of interest, but further work is needed to identify the causal SNPs, as well as the biological mechanisms by which CLL risk is increased.
This work was supported by NIH grants CA118444 and CA92153; Intramural Research Program of the NIH, NCI; and CLL Research Consortium. Data collection in Utah was made possible by the Utah Population DataBase (UPDB) and the Utah Cancer Registry (UCR). Partial support for all data in the UPDB was provided by the University of Utah Huntsman Cancer Institute. The UCR is funded by contract N01-PC-35141 from the NCI’s SEER program with additional support from the Utah State Department of Health and the University of Utah.
Support: NCI U01 CA118444; Intramural Research Program of the NIH, NCI; NCI R01 CA92153
The authors declare no competing financial interests.