Chromosomal translocations are the hallmark genetic aberration in non-Hodgkin lymphoma (NHL), with specific translocations often selectively associated with specific NHL subtypes. Because many NHL-associated translocations involve cell cycle, apoptosis, and lymphocyte development regulatory genes, we evaluated NHL risk associated with common genetic variation in 20 candidate genes in these pathways. Genotyping of 203 tag single nucleotide polymorphisms (SNPs) was conducted in 1946 NHL cases and 1808 controls pooled from three independent population-based case-control studies. We used logistic regression to compute odds ratios (OR) and 95% confidence intervals (CI) for NHL and four major NHL subtypes in relation to tag SNP genotypes and haplotypes. We observed the most striking associations for tag SNPs in the pro-apoptotic gene BCL2L11 (BIM) and BCL7A, which is involved in a rare NHL-associated translocation. Variants in BCL2L11 were strongly related to follicular lymphoma only, particularly rs3789068 (ORAG=1.41, 95%CI 1.10–1.81; ORGG=1.65, 95%CI 1.25–2.19; p-trend=0.0004). Variants in BCL7A were strongly related to diffuse large B-cell lymphoma only, particularly rs1880030 (ORAG=1.34, 95%CI 1.08–1.68; ORAA=1.60, 95%CI 1.22–2.08; p-trend=0.0004). The associations for both variants were similar in all three studies and supported by haplotype analyses. We also observed notable associations for variants in BCL6, CCND1, and MYC. Our results support the role of common genetic variation in cell cycle, apoptosis, and lymphocyte development regulatory genes in lymphomagenesis, and suggest that effects may vary by NHL subtype. Replication of our findings and further study to identify functional SNPs are warranted.
Non-Hodgkin lymphomas (NHL) are closely related diseases, each involving the malignant transformation of lymphoid cells, but with distinctive morphologic, immunophenotypic, genetic, and clinical features (1). The strongest known NHL risk factor is severe immunodeficiency, but the etiologies of most lymphomas remain unexplained (1, 2). Although no major susceptibility gene has been identified, several lines of evidence reveal the contributions of genetic predisposition to NHL etiology: NHL risk is elevated among individuals with a family history of hematopoietic malignancy, migrant studies show that migrants tend to retain the NHL incidence rates and patterns of their country of origin, and common genetic variations have recently been associated with NHL risk (3–6).
Chromosomal translocations are the hallmark genetic aberration in NHL, with specific translocations often selectively associated with particular NHL subtypes (7–10). Most translocations occur as a side-effect of the single- and double-stranded DNA breaks induced during endogenous processes critical to normal lymphocyte development. Specifically, early in lymphocyte development, DNA in the variable (V), diversity (D), and joining (J) regions of the immunoglobulin heavy chain (IgH) and lambda light chain (IgL) loci recombines to form a functioning B-cell receptor in a process known as V(D)J recombination. Mature, antigen-stimulated B-cells undergo two processes that entail DNA breaks. The first, class switch recombination, involves recombination of DNA in the IgH constant region to produce effector antibody classes. The second, somatic hypermutation, typically induces a high rate of point mutations in the Ig V regions to produce antibodies with improved antigen affinity.
NHL-associated translocations typically result in transcriptional deregulation of a proto-oncogene or oncogene by juxtaposing it with Ig regulatory sequences, although some non-Ig translocations can also occur (7–11). Many of the genes involved in NHL-associated translocations regulate the cell cycle, apoptosis, and lymphocyte development, such as MYC, BCL2, CCND1, and BCL6. Genes in these pathways (e.g., MYC, BCL6, and PIM1) also have been identified as targets of aberrant (non-Ig) somatic hypermutation (12).
The likely importance of cell cycle, apoptosis, and lymphocyte development regulatory genes in lymphomagenesis is evident from their participation in NHL-associated translocations and their identification as targets of aberrant somatic hypermutation, yet few studies have investigated the relationship between risk of developing lymphoma and common genetic variation in these genes. We therefore investigated risk of NHL and NHL subtypes associated with common genetic variation in 20 candidate genes involved in regulating the cell cycle, apoptosis, and lymphocyte development, 7 of them in or near breakpoints for lymphoma-associated chromosomal translocations (Table 1). Our study population included 1946 patients with NHL and 1808 controls derived from pooling three population-based case-control studies. Combining data from three studies enabled us to evaluate pooled risk estimates as well as risk estimates in three independent populations, and provided sufficient sample size to investigate risk of NHL overall and the four most common NHL subtypes.
Our study population was derived from pooling three independent population-based case-control studies, which have been described in detail previously: the National Cancer Institute-Surveillance Epidemiology and End Results (NCI-SEER) NHL Case-Control Study (13, 14), the Connecticut NHL Case-Control Study (15, 16), and the New South Wales (NSW) NHL Case-Control Study (17, 18). Selected characteristics for each study are presented in Table 2. All three studies included first primary NHL cases only, and population controls were frequency matched to cases (Table 2). The pooled study population had more women than men because the Connecticut study was limited to women, and the age distribution was somewhat younger than a typical series of NHL cases because the NCI-SEER and NSW studies were limited to adults younger than age 75 years. Like the underlying populations, the study population was predominantly Caucasian and non-Hispanic.
The protocol for each study was approved by the relevant Institutional Review Boards: those of the NCI and each SEER center for the NCI-SEER study; Yale University, the Connecticut Department of Public Health, and the NCI for the Connecticut study; and all participating institutions for the NSW study. All study participants provided informed consent.
All cases were histologically confirmed by the local diagnosing pathologist in the NCI-SEER study and by central review of diagnostic slides by two independent expert hematopathologists in the Connecticut study. In the NSW study, all cases were histologically confirmed by the local diagnosing pathologist, and a confirmatory central pathology review was performed for cases judged to be <90% certain to be NHL on review of the diagnostic pathology report by an expert hematopathologist. In the present analyses, we evaluated NHL overall and specific NHL subtypes, grouping cases according to the World Health Organization classification (1) using the International Lymphoma Epidemiology Consortium (InterLymph) guidelines (19). For analyses by NHL subtype, we evaluated only the four most common subtypes: diffuse large B-cell lymphoma (DLBCL) (28%), follicular lymphoma (28%), marginal zone lymphoma (8%), and chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL) (8%) (Table 2). Our studies primarily included SLL rather than CLL cases because these diseases were not considered the same entity until the WHO classification was introduced in 2001 (1).
Study participants who did not provide a biologic specimen, did not have sufficient material for DNA extraction or sufficient DNA for genotyping, or whose genotyped sex was discordant from the questionnaire data were excluded from this analysis (Table 2). For the NCI-SEER study, DNA was extracted from blood clots or buffy coats (BBI Biotech, Gaithersburg, MD) using Puregene Autopure DNA extraction kits (Gentra Systems, Minneapolis, MN), and from buccal cell samples by phenol-chloroform extraction methods (20). Genotype frequencies for individuals who provided blood compared with buccal cells were equivalent (21). For the Connecticut study, DNA was extracted from the blood samples using phenol-chloroform extraction methods (20). For the NSW study, DNA was extracted from buffy coats using Qiagen QIAamp® DNA Blood Midi Kits by laboratory staff at the Viral Epidemiology Section, SAIC-Frederick, NCI-Frederick.
Genotyping of tag SNPs from 20 candidate genes involved in regulating the cell cycle, apoptosis, and lymphocyte development was conducted at the NCI Core Genotyping Facility (Advanced Technology Center, Gaithersburg, MD; http://snp500cancer.nci.nih.gov) (22) using a custom-designed GoldenGate assay (Illumina, www.illumina.com). The GoldenGate assay included a total of 1536 tag SNPs; this analysis was therefore conducted as part of a larger panel that also included SNPs from candidate genes in other pathways. Tag SNPs were chosen from the designable set of common SNPs (minor allele frequency (MAF)>5%) genotyped in the Caucasian (CEU) population sample of the HapMap Project (Data Release 20/Phase II, NCBI Build 35 assembly, dbSNPb125) using the software Tagzilla (http://tagzilla.nci.nih.gov/), which implements a tagging algorithm based on the pairwise binning method of Carlson et al. (23). For each gene, SNPs within the region spanning 20kb 5’ of the start of transcription (exon 1) to 10kb 3’ of the end of the last exon were grouped using a binning threshold of r2>0.8. When multiple transcripts were available for a gene, only the primary transcript was assessed.
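The pairwise binning step described above can be illustrated with a short sketch. This is not the Tagzilla implementation; it is a simplified greedy version of the general idea, in which the SNP that exceeds the r2 threshold with the most not-yet-binned SNPs is chosen as a tag and its partners form a bin. The function name, the SNP names, and the r2 values are all hypothetical, and pairs absent from the input are assumed to have r2 = 0.

```python
def greedy_tag_bins(snps, r2, threshold=0.8):
    """Group SNPs into bins by a simplified greedy pairwise binning.

    snps: list of SNP identifiers (hypothetical names in the example).
    r2: dict mapping SNP pairs (a, b) to pairwise r^2, in either order.
        Pairs missing from the dict are treated as r^2 = 0 (an assumption).
    Returns a list of (tag_snp, members) tuples, largest bins found first.
    """
    def ld(a, b):
        if a == b:
            return 1.0
        return r2.get((a, b), r2.get((b, a), 0.0))

    remaining = list(snps)
    bins = []
    while remaining:
        best_tag, best_members = None, []
        for candidate in remaining:
            # SNPs this candidate would capture at the chosen threshold.
            members = [s for s in remaining if ld(candidate, s) > threshold]
            if len(members) > len(best_members):
                best_tag, best_members = candidate, members
        bins.append((best_tag, best_members))
        remaining = [s for s in remaining if s not in best_members]
    return bins


# Toy example with made-up SNP names and r^2 values.
toy_r2 = {
    ("s1", "s2"): 0.90,
    ("s2", "s3"): 0.85,
    ("s1", "s3"): 0.50,
    ("s3", "s4"): 0.20,
}
toy_bins = greedy_tag_bins(["s1", "s2", "s3", "s4"], toy_r2)
```

In the toy data, s2 is selected as the tag for s1, s2, and s3 because it exceeds the threshold with both neighbors, while s4 forms a singleton bin. Ties are broken by input order in this sketch, which a production tool may handle differently.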
We excluded tag SNPs (N=3) that failed to cluster in the genotype calling algorithm (analyzed separately for buccal cell and peripheral blood cell samples) or did not amplify during the amplification step of the genotyping assay. SNPs with a low completion rate (<90% of samples) were excluded by study (NCI-SEER blood samples: N=1; NCI-SEER buccal cell samples: N=4). QC duplicates and replicates from each study were genotyped, with laboratory personnel blinded to their identity. SNPs with concordance <95% in the study-specific QC samples were excluded for that study (NCI-SEER buccal cell samples: N=1). We also excluded samples with a low completion rate (<90% of the full panel of 1536 tag SNPs; NCI-SEER: 11 cases, 6 controls; Connecticut: 2 controls; NSW: 4 cases, 9 controls). We additionally included in our analyses 5 candidate SNPs that had previously been genotyped by Taqman assay in at least two of the three studies and are located within one of the 20 candidate genes in this analysis; some of these results have been published previously (24, 25).
The final pooled analytic study population included 1946 cases and 1808 controls with data for 203 SNPs (198 tag SNPs, 5 previously genotyped Taqman SNPs) in or near the 20 candidate genes in this analysis (Table 1, Supplementary Table 1). Hardy–Weinberg equilibrium was evaluated among non-Hispanic Caucasian controls (N=1578, 87% of the analytic population) for the pooled study population and by study (Supplementary Table 1). In the pooled study population, three SNPs showed evidence (p<0.001) for deviation from Hardy–Weinberg proportions but were retained in the analysis because the QC data did not suggest any obvious genotyping error (rs9392454, rs7941248, rs17757541).
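As an illustration of how a Hardy–Weinberg check of this kind can be carried out, the sketch below applies a one-degree-of-freedom Pearson chi-square test to genotype counts among controls. The text does not state which test was used, so the choice of the chi-square test (rather than, say, an exact test) is an assumption, and the function name and counts are hypothetical.

```python
import math


def hwe_chi_square(n_common_hom, n_het, n_variant_hom):
    """Pearson chi-square test (1 df) for Hardy-Weinberg proportions.

    Arguments are genotype counts among controls.
    Returns (chi_square, p_value).
    """
    n = n_common_hom + n_het + n_variant_hom
    if n == 0:
        raise ValueError("no genotyped samples")
    p = (2 * n_common_hom + n_het) / (2 * n)  # common allele frequency
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        # Monomorphic SNP: no departure can be assessed.
        return 0.0, 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_common_hom, n_het, n_variant_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: a heterozygote deficit relative to expectation.
chi2_example, p_example = hwe_chi_square(300, 400, 300)
```

With the hypothetical counts above, the allele frequency is 0.5, the expected counts are 250, 500, and 250, and the statistic is large, so such a SNP would fall below the p<0.001 threshold used here to flag deviation.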
We calculated odds ratios (OR) and 95% confidence intervals (CI) estimating the relative risk of NHL and NHL subtypes in relation to SNP genotype using dichotomous and polytomous unconditional logistic regression models, respectively. The homozygote of the most common allele in the pooled study population was used as the referent group. Tests for trend under the co-dominant model used a three-level ordinal variable for each SNP (0=homozygote common, 1=heterozygote, 2=homozygote variant). All models were adjusted for age, race/ethnicity, sex, and study center (categories listed in Table 2). We conducted analyses restricted to non-Hispanic Caucasians and stratified by age (<50, ≥ 50 years) and sex to evaluate the consistency of our results by various demographic groups. To evaluate the consistency of our results by NHL subtype, we assessed heterogeneity among NHL subtypes in the polytomous multivariate unconditional logistic regression models using the Wald chi-square statistic (results presented in Supplementary Tables). Analyses were conducted using SAS version 9.1 (SAS Institute, Cary, NC).
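To make the odds ratio and confidence interval calculation concrete, the following sketch computes an unadjusted OR with a Wald 95% CI from a 2×2 table. This is a simplification: the analyses described above used SAS and adjusted for age, race/ethnicity, sex, and study center, which a 2×2 table cannot do. The function names and the example counts are hypothetical. The second helper only back-calculates the standard error of the log OR implied by a published interval.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_wald(case_variant, case_referent, control_variant, control_referent):
    """Unadjusted odds ratio with a Wald 95% CI from 2x2 genotype counts.

    'variant' = carriers of the genotype of interest,
    'referent' = homozygotes for the most common allele.
    """
    counts = (case_variant, case_referent, control_variant, control_referent)
    if min(counts) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (case_variant * control_referent) / (case_referent * control_variant)
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - Z_95 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + Z_95 * se_log_or)
    return odds_ratio, lower, upper


def log_se_from_ci(lower, upper):
    """Standard error of log(OR) implied by a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


# Hypothetical counts for illustration only.
or_example, lo_example, hi_example = odds_ratio_wald(20, 10, 10, 20)

# A Wald interval is symmetric on the log scale, so the geometric mean of
# its limits should recover the point estimate (up to rounding).
implied_or = math.sqrt(1.10 * 1.81)
```

Applied to an interval of 1.10–1.81, the geometric mean of the limits is about 1.41, which is a quick consistency check a reader can run on any reported OR and CI pair.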
We obtained a gene-level summary of association by computing the minimum p-value (“minP test”), which assesses the true statistical significance of the smallest p-trend within each gene (determined by dichotomous logistic regression, comparing NHL or NHL subtypes to controls; SNPs listed in Supplementary Table 2) by permutation-based resampling methods (10,000 permutations) that automatically adjust for the number of tag SNPs tested within that gene and the underlying linkage disequilibrium pattern (26, 27). To account for multiple comparisons with 20 candidate genes in this analysis, we applied the false discovery rate (FDR) method of Benjamini and Hochberg (28) to the minP test separately for NHL and each subtype. We considered minP test results with FDR values <0.2 to be the least likely to reflect false positive findings and thus to represent our most interesting results. Finally, we summarized the overall evidence of association of the 203 SNPs with NHL or an NHL subtype by using the “tail strength” statistic (29), a summary measure for the departure of the observed p-value distribution from its expected distribution under the global null hypothesis of no association in the group of 20 candidate genes in this analysis. We assessed the significance of the tail strength statistics by generating their null distributions by permutation-based resampling of the data. Higher tail strength values (and corresponding lower p-values) provide stronger evidence of association. Analyses were conducted using the MATLAB Statistics Toolbox™ 6.2 (The Mathworks, Inc., Natick, MA).
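Two of the summaries described above can be sketched briefly. The code below is an illustrative Python version, not the MATLAB code used for the analysis: it computes Benjamini–Hochberg adjusted p-values and the tail strength statistic in its usual form, TS = (1/m) Σ [1 − p(k)·(m+1)/k] over the ordered p-values p(1) ≤ … ≤ p(m). The permutation step used to obtain minP and tail strength significance is not reproduced, and the function names and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def tail_strength(p_values):
    """Tail strength: mean of 1 - p_(k) * (m + 1) / k over ordered p-values.

    Values near 0 are expected under the global null; values approaching 1
    indicate an excess of small p-values.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("no p-values supplied")
    ordered = sorted(p_values)
    return sum(1.0 - p * (m + 1) / k for k, p in enumerate(ordered, start=1)) / m


# Hypothetical gene-level p-values for illustration only.
example_fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Because the expected value of the k-th ordered p-value under the null is k/(m+1), each term of the tail strength sum has expectation zero, which is why the statistic is centered at 0 when no association is present.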
For those genes with at least one SNP with a p-trend <0.05 for NHL or an NHL subtype (N=12 genes), we further conducted two multi-locus tests. The purpose of these tests was to detect stronger associations that might have been missed by the single-SNP analyses based on linkage disequilibrium between the genotyped SNPs and a causally-associated SNP. First, we conducted a likelihood ratio test, assessing the relative improvement in model fit from the inclusion of parameters for all independent SNPs (r2<0.8 among controls) in a particular gene, assuming a codominant model for each SNP, compared with a model with just age, sex, race, and study center. Second, we conducted haplotype analyses among non-Hispanic Caucasians. We evaluated risk of NHL and NHL subtypes associated with haplotypes defined by SNPs within a sliding window of three loci across a gene (Haplo Stats, version 1.2.1, haplo.score.slide, http://mayoresearch.mayo.edu/mayo/research/schaid_lab/software.cfm). A global score statistic was used to summarize the evidence of association of disease with the haplotypes for each window. In addition, we visualized haplotype structures using Haploview, version 3.11 (30) based on measures of pairwise linkage disequilibrium between SNPs. For blocks of linkage disequilibrium (Supplementary Table 3), we obtained ORs and 95% CIs for the underlying haplotypes under the assumption of an additive model (haplo.glm, minimum haplotype frequency 1%). Two SNPs (MYC rs3824120, BCL2 rs1982673) were excluded from haplotype analyses because they were genotyped in only two of three studies (Supplementary Table 1). All haplotype analyses were adjusted for age, sex, and study center.
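The pairwise linkage disequilibrium measures referred to above (D′ and r2) can be computed from allele and haplotype frequencies as in the sketch below. It assumes the two-locus haplotype frequency is already known; in practice, tools such as Haploview estimate it from unphased genotypes, a step not shown here. The function name and the example frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2 (assumed known or previously estimated).
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d_prime, r_squared


# Hypothetical frequencies: allele B occurs only on haplotypes carrying A.
d_prime_example, r2_example = pairwise_ld(p_ab=0.25, p_a=0.5, p_b=0.25)
```

The hypothetical example gives D′ = 1 but r2 of only about 0.33, illustrating how two SNPs with different allele frequencies can show high D′ alongside a modest r2.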
In this analysis of 203 SNPs from 20 candidate genes among 1946 patients with NHL and 1808 population controls, the overall statistical significance for NHL of the biologic pathway(s) captured by all 20 genes was p=0.0544 (tail strength statistic=0.1546). We observed suggestive associations (p-trend<0.05) for 15 SNPs with risk of NHL overall, 17 SNPs with DLBCL, 12 SNPs with follicular lymphoma, 10 SNPs with marginal zone lymphoma, and 13 SNPs with CLL/SLL (Supplementary Table 4).
We observed the most striking associations for BCL2L11 (also known as BIM) and BCL7A (FDR value for minP test <0.2). BCL2L11 was associated with follicular lymphoma (minP=0.0068) (Table 3). SNP-based analyses revealed suggestive associations (p-trend<0.05) for 4 SNPs with NHL overall, and 6 SNPs with follicular lymphoma, but no significant associations with any other NHL subtype (Supplementary Table 4). Two variants in linkage disequilibrium in our control population (D’=0.99, r2=0.75) were particularly strongly related to follicular lymphoma (rs7567444: ORCT=0.87, 95%CI 0.70–1.08; ORTT=0.60, 95%CI 0.44–0.80; p-trend=0.0009; rs3789068: ORAG=1.41, 95%CI 1.10–1.81; ORGG=1.65, 95%CI 1.25–2.19; p-trend=0.0004), with very similar risk estimates in all three studies (Table 4, Figure 1, Supplementary Table 5). The multi-locus analyses supported the association of BCL2L11 with follicular lymphoma and did not show stronger evidence of association than the single SNP-based analyses (Supplementary Table 6–Supplementary Table 7).
BCL7A was particularly associated with DLBCL (minP=0.0025) (Table 3). Three SNPs were significantly related to DLBCL only, with one variant particularly strongly associated with DLBCL (rs1880030: ORAG=1.34, 95%CI 1.08–1.68; ORAA=1.60, 95%CI 1.22–2.08; p-trend=0.0004), which was consistent in all three studies (Table 4, Figure 1, Supplementary Table 5). The multi-locus analyses supported the association of this variant with DLBCL and did not show stronger evidence of association than the single SNP-based analyses (Supplementary Table 6–Supplementary Table 7). Another SNP was related to risk of NHL overall (rs12827036: ORGT=0.83, 95%CI 0.71–0.97; ORTT=0.77, 95%CI 0.64–0.93; p-trend=0.0044), with similar statistically significant risk estimates for both DLBCL and follicular lymphoma (Table 4), and consistent risk estimates in all three studies (Supplementary Table 8).
We also observed notable associations for BCL6, MYC, and CCND1 (FDR value for minP test 0.2–0.5). BCL6 was marginally associated with NHL overall (minP=0.0616) and most subtypes (Table 3). In SNP-based analyses, scattered suggestive associations (p-trend<0.05) were observed for 8 SNPs for NHL overall and/or at least one NHL subtype (Supplementary Table 4). The SNP most strongly associated with NHL overall was rs1523475 (ORCT=1.14, 95%CI 0.99–1.31; ORTT=1.50, 95%CI 1.07–2.11; p-trend=0.0079) (Table 4). Consistent with our previous report on rs1056932 from the Connecticut study (25), the strongest SNP associations in the pooled dataset were observed for CLL/SLL (5 SNPs p-trend<0.05, including rs1056932). Compared with the Connecticut study, the associations for CLL/SLL in the NCI-SEER and New South Wales studies tended to be weaker but were in the same direction. In the pooled dataset, CLL/SLL was particularly associated with rs3172469 (ORGT=1.20, 95%CI 0.85–1.70; ORGG=2.29, 95%CI 1.33–3.93; p-trend=0.0094), with similar risk estimates in all three studies (Table 4, Supplementary Table 5). In multi-locus analyses, the likelihood ratio test showed a slightly stronger association than the minP test with NHL overall (likelihood ratio test: p=0.0122) (Supplementary Table 6), whereas the analyses of haplotypes defined by SNPs within a sliding window of three loci were similar to the SNP-based analyses and supported a stronger association for CLL/SLL than other subtypes (Supplementary Table 7).
MYC was associated with CLL/SLL (minP=0.0361) (Table 3). The two SNPs most strongly associated with CLL/SLL were in modest linkage disequilibrium in our control population (D’=0.77, r2=0.45) and the homozygote was rare (3.0–4.8% among controls). Thus, we evaluated risk estimates under the dominant genetic model (rs3891248: ORAT/AA=0.57, 95%CI 0.38–0.85, p=0.0060; rs16902359: ORCT/TT=0.52, 95%CI 0.33–0.82, p=0.0049), which were similar in all three studies (Table 4, Supplementary Table 5). The multi-locus analyses did not show stronger evidence of association than the single SNP-based analyses (Supplementary Table 6–Supplementary Table 7).
CCND1 was weakly associated with NHL (minP=0.0744) (Table 3). Two SNPs in linkage disequilibrium in our control population (D’=0.96, r2=0.53) were modestly related to NHL in the pooled study population (rs603965: ORGA=1.10, 95%CI 0.94–1.27; ORAA=1.25, 95%CI 1.04–1.52; p-trend=0.0203; rs2450254: ORAT=0.94, 95%CI 0.82–1.09; ORTT=0.83, 95%CI 0.68–1.00; p-trend=0.0623), with consistent risk estimates across all four subtypes (Table 4). The risk estimates were also generally similar across all three studies (Supplementary Table 8), although the risk estimates for the splice variant G870A (rs603965), which we previously reported for the NCI-SEER study (24), were attenuated and not significant in the Connecticut and New South Wales studies. The multi-locus analyses did not show stronger evidence of association than the single SNP-based analyses (Supplementary Table 6–Supplementary Table 7).
LMO2 and BCL2 were not statistically significantly associated with NHL or any NHL subtype (FDR value for minP test >0.5) (Table 3). However, in each gene the association with NHL for at least one SNP could not be disregarded based on a p-trend<0.01 and consistency of risk estimates in all three studies and across all four NHL subtypes (LMO2 rs3824848: ORCT=1.10, 95%CI 0.96–1.27; ORTT=1.35, 95%CI 1.08–1.69; p-trend=0.0095; BCL2 rs2849377: ORAT=0.86, 95%CI 0.74–1.01; ORTT=0.41, 95%CI 0.22–0.76; p-trend=0.0041) (Supplementary Table 4, Supplementary Table 5, and Supplementary Table 8). It was also notable that of the 12 SNPs in BCL2 related to NHL or an NHL subtype, 8 were particularly related to marginal zone lymphoma, though no clear patterns emerged to implicate a particular variant (Supplementary Table 4). The multi-locus analyses did not show stronger evidence of association than the single SNP-based analyses (Supplementary Table 6–Supplementary Table 7).
Although we observed suggestive associations (p-trend<0.05) for the one SNP we genotyped in PIM1 and for one of the two SNPs we genotyped in TP53, we could not explore these findings further because we did not have data for additional SNPs within these genes (Supplementary Table 4). We also observed suggestive associations (p-trend<0.05) for individual SNPs in BCL10, AICDA, and BAX, but the minP test, study-specific SNP-based analyses, and multi-locus analyses generally did not support an association with risk of NHL overall or any NHL subtype (Table 3, Supplementary Table 4–Supplementary Table 8).
Risk estimates were similar when we conducted the SNP-based analyses restricted to non-Hispanic Caucasians and stratified by age (<50, ≥ 50 years) and sex (data not shown).
In this pooled analysis, we demonstrated consistent evidence from three population-based case-control studies that common genetic variation in cell cycle, apoptosis, and lymphocyte development regulatory genes may play a role in lymphomagenesis, and the effects may vary by NHL subtype. In particular, we found that two variants in linkage disequilibrium in the pro-apoptotic gene BCL2L11 (BIM) were significantly related to follicular lymphoma risk, and one variant in BCL7A, which is involved in a rare NHL-associated translocation, was significantly related to DLBCL risk. We also observed notable associations for variants in BCL6 and CCND1 with risk of NHL overall, and variants in MYC with risk of CLL/SLL. We observed suggestive associations for at least one variant in 7 of the remaining 15 genes we evaluated, but overall the findings for these genes were not compelling.
BCL2L11 (also known as BIM) is a key pro-apoptotic member of the BCL2 family that maintains hematopoietic cell homeostasis by initiating apoptosis in lymphocytes, regulating the negative selection of autoreactive lymphocytes, and balancing the proliferative and anti-apoptotic effects of BCL2 (31–34). Several isoforms of BCL2L11 created by both transcriptional and posttranslational modification have been identified and shown to have varying pro-apoptotic activity (35, 36). Further, diminished expression of BCL2L11 has been associated with melanoma progression (37), renal cell carcinoma (38), and glioblastoma (39). We present here the first report on common genetic variation in BCL2L11. The two variants in BCL2L11 for which we observed a particularly striking association with follicular lymphoma (rs7567444, rs3789068) were in linkage disequilibrium in our control population and tag variants spanning most of BCL2L11. If our findings are replicated, it will be necessary to conduct additional genotyping across the entire gene to determine which region contains the causal variant(s).
BCL7A was identified by its participation in a three-way chromosomal translocation with MYC and IgH in a Burkitt lymphoma cell line and has also been shown to be rearranged in a mediastinal B-cell lymphoma cell line (40). Although the function of BCL7A is unknown, the protein shows homology with the actin-binding protein, caldesmon, and is part of an evolutionarily conserved family that also includes BCL7B and BCL7C (41). We present the first report on common genetic variation in BCL7A, although diminished expression of BCL7A has been associated with mycosis fungoides (42), peripheral T-cell lymphoma (43), more aggressive clinical behavior of cutaneous T-cell lymphoma (44), and poorer prognosis for DLBCL (45). The variant in BCL7A for which we observed a particularly strong association with DLBCL (rs1880030) tags eight other loci located in or near exon 5. More research is needed to discover the function of BCL7A and replicate our findings, particularly focusing on the region of the gene surrounding exon 5.
We also observed notable associations for variants in BCL6 and CCND1 with risk of NHL overall, and variants in MYC with risk of CLL/SLL. All three of these genes play important roles in the cell cycle and/or lymphocyte development (46–48) and have been implicated in lymphomagenesis by several lines of evidence (7–12, 45, 49, 50). However, there is limited previous research associating lymphoma with common genetic variation in BCL6 and CCND1, and no previous research for MYC. The BCL6 findings from the pooled dataset were consistent with our previous report from the Connecticut study only (25), but do not provide support for two other previous studies of follicular lymphoma in relation to SNPs in the regulatory first intronic region of BCL6 (51, 52). The CCND1 splice variant G870A (rs603965), which we previously reported for the NCI-SEER study (24), has also been associated with acute lymphoblastic leukemia (48). Although no previous research has associated lymphoma with common genetic variation in MYC, the two rare variants in MYC (rs3891248, rs16902359) associated with CLL/SLL in this pooled analysis are singletons located in the promoter and first intronic region of MYC. Chromosomal translocation breakpoints clustered in this region have been shown to have a greater effect on MYC overexpression in Burkitt lymphomas than breakpoints in other regions of MYC (53). Because of the importance of BCL6, CCND1, and MYC in the cell cycle and/or lymphocyte development as well as carcinogenesis, we believe further study of common genetic variation in these genes and lymphoma risk is warranted.
Of the remaining 15 candidate genes we evaluated in this pooled analysis, we observed suggestive associations for at least one variant in each of 7 genes, but overall the findings for these genes were not compelling. For three of these genes (LMO2, BCL2, BCL10), we successfully genotyped ≥85% of the SNPs identified by our tagging algorithm from both HapMap Build 20 and the current version of HapMap (Build 22). However, for the remaining four genes (TP53, PIM1, BAX, AICDA), we successfully genotyped ≤70% of the SNPs identified by our tagging algorithm from both HapMap Build 20 and the current version of HapMap (Build 22). The publication of our complete results from all SNPs in all 20 of the candidate genes can be used to compare results of future research on these variants in relation to lymphomagenesis (Supplementary Table 4).
The main strength of this analysis was our ability to evaluate the associations in three independent study populations. Interpretation of our results should also take into account several limitations. We did not have data on a sufficient number of unlinked, unassociated SNPs to quantitatively assess population structure within our data. However, it is unlikely that our results were biased by population stratification because our results were similar in three independent study populations, and it is unlikely that the same substructure would be repeated in multiple studies. In addition, our risk estimates were similar when we restricted the analytic population to non-Hispanic Caucasians (data not shown). Participation (percentage interviewed among those approached) was low in the three studies, particularly for controls. However, it is unlikely that participation bias would completely explain our findings because it is unlikely that genotype frequencies vary by willingness to participate (21). Survival bias could have influenced our results for those genotypes also associated with prognosis because some patients with more aggressive disease were too ill to participate or died before study investigators could contact them, and common genetic variants associated with NHL etiology may also be associated with survival (54). Although all cases had histologically confirmed NHL, our results for NHL subtypes could have been biased by disease misclassification among the subtypes. However, diagnostic accuracy is estimated to be more than 80% for most NHL subtypes (55, 56), and any disease misclassification was likely to be non-differential, thus biasing our results toward the null hypothesis. We may have had some false negative results because of inadequate coverage of the SNPs identified in HapMap, or because the genetic variation identified by HapMap does not uniformly cover the genome. 
Finally, our results require replication in other study populations because some findings may be the result of false positive associations. However, by combining data from three studies we were able to evaluate pooled risk estimates as well as risk estimates in three independent populations, minimizing the chance of false positive associations particularly for our strongest findings.
In summary, we found consistent evidence in three population-based case-control studies that common genetic variation in cell cycle, apoptosis, and lymphocyte development regulatory genes may play a role in lymphomagenesis, and the effects may vary by NHL subtype. Replication of our results, particularly in studies with sufficient power to evaluate NHL subtypes, and further study to identify functional SNPs are warranted.
We thank Mary McAdams, Peter Hui, Michael Stagner and Zeynep Kalaylioglu of Information Management Services, Inc. for their programming support. For the NCI-SEER study, we also gratefully acknowledge the contributions of the staff and scientists at the SEER centers of Iowa, Los Angeles, Detroit, and Seattle for the conduct of the study’s field effort. The NSW study was made possible by access to new notifications to the NSW Central Cancer Registry, which is funded by the NSW Health Department. Ann-Maree Hughes oversaw conduct of the study and Melisa Litchfield, Maria Agaliotis, Chris Goumas, Jackie Turner and staff of the Hunter Valley Research Foundation contributed to the data collection. Jenny Turner, study pathologist, reviewed all pathology reports and original slides as necessary.
Financial support: All genotyping and statistical analysis for this project was supported by the Intramural Research Program of the National Institutes of Health (National Cancer Institute). The NCI-SEER study was also supported by the Intramural Research Program of the National Institutes of Health (National Cancer Institute) and by Public Health Service (PHS) contracts N01-PC-65064, N01-PC-67008, N01-PC-67009, N01-PC-67010, N02-PC-71105. The Connecticut study was also supported by NIH grant CA62006 from the National Cancer Institute. The NSW study was also supported by the National Health and Medical Research Council of Australia ([Bruce Armstrong] Project Grant number 990920), The Cancer Council NSW, and The University of Sydney Medical Foundation.