Germline mutations in complement genes have been associated with susceptibility to infections and autoimmune diseases, conditions that are associated with non-Hodgkin lymphoma (NHL) risk. To test the hypothesis that common genetic variation in complement genes affects risk of NHL, we genotyped 167 single nucleotide polymorphisms (SNPs) from 31 genes in 441 NHL cases and 475 controls. Principal components (PC) and haplotype analyses were used for gene-level tests of NHL risk, while individual SNPs were modeled as having a log-additive effect. In gene-level PC analyses, C2 (p=0.023), C5 (p=0.0032) and C9 (p=0.020) were associated with NHL risk; haplotype analyses showed similar results, as well as a haplotype association for C7 (p=0.046). When all 4 genes were considered simultaneously, only C5 and C9 remained significant (p<0.05). In SNP-level results from these genes, 10 SNPs had a p<0.05. However, after correcting for multiple testing, only the C5 SNPs rs7026551 (q=0.015; OR=1.54; 95% CI 1.21-1.95) and rs2416810 (q=0.015; OR=1.57; 95% CI 1.22-2.01), and the C9 SNP rs187875 (q=0.015; OR=0.68; 95% CI 0.56-0.84) remained noteworthy. Associations were similar for the common NHL subtypes. In summary, we provide evidence for a role of genetic variation in complement genes, particularly C5 and C9, in NHL risk.
Non-Hodgkin lymphoma (NHL) is the most commonly diagnosed hematologic malignancy in the United States, and the lifetime odds of developing NHL are 1 in 47 for men and 1 in 55 for women (Jemal, et al 2008). Lymphomagenesis is closely related to suppression of the cellular immune system, particularly in the setting of chronic antigenic stimulation (driving proliferation of the B-cell compartment) and genetic instability (from V(D)J recombination, somatic hypermutation, and class-switch recombination) (Shaffer, et al 2002). The strongest known epidemiologic risk factors for NHL are immune suppressed states due to primary immune deficiency diseases or acquired immune alterations (e.g., iatrogenic suppression in solid organ transplantation; disease-related); the risk of developing NHL in these patients is ten to several hundred times greater than in the general population (Alexander, et al 2007). Other putative immune and inflammation-related risk factors for NHL include autoimmune diseases, atopy, asthma, later birth order, as well as infectious agents for specific NHL subtypes (Alexander, et al 2007, Chiorazzi and Ferrarini 2003). These epidemiologic data, which support a prominent role for immune dysfunction in the etiology of NHL, lead to the hypothesis that NHL risk may be associated with variation in polymorphic genes that modulate immune function and regulation and response to inflammatory stimuli, for which there are growing empiric data (Cerhan, et al 2007a, Rothman, et al 2006, Skibola, et al 2007).
The complement system comprises a group of more than 30 proteins in a tightly regulated pathway of plasma proteins and membrane-bound regulators and receptors (Figure I) (Markiewski and Lambris 2007). Complement has long been known to play a central role in the innate immune response and functions to protect the host from pathogenic microorganisms. Complement has been further shown to play a prominent role as a mediator of inflammation, in the removal of immune complexes and apoptotic cells, and as a regulator of the immune response, including T-cell responses (Barrington, et al 2001, Kemper and Atkinson 2007, Markiewski and Lambris 2007). Complement can also target and remove tumor cells (Gorter and Meri 1999), although recent evidence suggests that in other settings it may also play a role in augmenting tumor growth (Markiewski, et al 2008). Germline mutations in complement genes have been associated with susceptibility to infections and autoimmune diseases (Crawford and Alper 2000, O'Neil 2000), conditions that have also been associated with NHL risk in many studies (Alexander, et al 2007). We therefore evaluated the hypothesis that common genetic variation in complement genes is associated with risk of NHL. We have previously reported our top results from an evaluation of 1253 immune genes (Cerhan, et al 2007a); here we report the specific results for the complement pathway also generated from that study, which have not been previously published.
This study was reviewed and approved by the Human Subjects Institutional Review Board at the Mayo Clinic, and all participants provided written, informed consent. Full details of this on-going, clinic-based case-control study conducted at the Mayo Clinic in Rochester, Minnesota have been previously reported (Cerhan, et al 2007a). Briefly, we offered enrollment to consecutive patients with newly diagnosed (within nine months of first diagnosis), histologically-confirmed Hodgkin and non-Hodgkin lymphoma (including the subtype of chronic lymphocytic leukemia/small lymphocytic lymphoma) who were aged 20 years or older, HIV negative, and a resident of Minnesota, Iowa or Wisconsin at the time of diagnosis. All cases were reviewed and confirmed by a hematopathologist, and classified according to the WHO criteria (Jaffe, et al 2001). Of the 956 eligible cases, 626 (65%) participated.
Clinic-based controls were selected from patients visiting the Mayo Clinic Department of Medicine for a pre-scheduled general medical exam. Eligibility requirements included being 20 years or older and a resident of Minnesota, Iowa or Wisconsin; patients were excluded if they had prior diagnoses of lymphoma, leukemia, or HIV infection. Controls were randomly selected and frequency matched to our cases by 5-year age group, gender, and region of residence. Of the 818 eligible controls, 572 (70%) participated.
All participants were asked to complete a self-administered risk-factor questionnaire and provide a blood sample. DNA was extracted using a standard procedure (Gentra Inc., Minneapolis, MN). This (Phase 1) analysis included 498 cases and 497 controls enrolled from 9/1/02 through 9/30/05 and who had a DNA sample available at the time of genotyping (10/1/2005).
This analysis of complement genes is part of a larger genotyping project to assess the role of immune and other candidate genes in the etiology of NHL. All of the genes and SNPs reported here were from the ParAllele (now Affymetrix) Immune and Inflammation SNP panel that included 1253 genes and 9412 SNPs, and full details of genotyping (including quality controls) for this panel have been previously published (Cerhan, et al 2007a). Briefly, genes were selected for their involvement in inflammation and immunity, and tagging SNPs for the 1253 genes were selected using CEPH (European-American) and Yoruba (African) samples from release 16 of the HapMap Consortium (The International HapMap Consortium 2003). Tagging SNPs covered 5 kb up and downstream of each gene with minor allele frequency (MAF) ≥0.05 and pairwise r2 threshold of 0.8. In addition, the panel included 748 validated non-synonymous SNPs. Genotyping was conducted at the Affymetrix facility using the Molecular Inversion Probe (MIP) genotyping technology (Hardenbol, et al 2005). Overall, the sample success rate was 98.75%, the assay call rate was 99.13%, and the concordance rate (based on 48 duplicates) was 98.95%. For this analysis, a total of 916 people (441 cases and 475 controls) passed all quality control measures, had NHL (Hodgkin lymphoma excluded) and were available for analysis (Cerhan, et al 2008).
Allele frequencies for cases and controls were estimated from the observed genotype frequencies. The genotype frequencies in the controls were compared to the genotype frequencies expected under Hardy-Weinberg Equilibrium (HWE) using a Pearson goodness-of-fit test, or Fisher's exact test for SNPs with MAF<0.05. In this analysis, there were 8 SNPs from the candidate genes that had a Hardy-Weinberg p-value of p<0.05 (see Supplementary Table I); none of these SNPs were excluded from the analysis. We restricted our analyses to subjects whose self-reported race was Caucasian. Furthermore, we previously tested and found no evidence of population stratification in our data (Cerhan, et al 2007a, Cerhan, et al 2007b) using STRUCTURE (Pritchard, et al 2000).
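The Pearson goodness-of-fit comparison against Hardy-Weinberg expectations can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name and the genotype counts are hypothetical, and the Fisher's exact variant used for low-MAF SNPs is not shown.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson goodness-of-fit test for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts (hypothetical inputs).
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    Assumes the SNP is polymorphic, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Illustrative control genotype counts (not taken from the study).
chi2, p_value = hwe_chi_square(30, 40, 30)
```

The test has one degree of freedom because three genotype classes are fitted with one estimated allele frequency.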
Two methods were used to analyze the association of each gene with risk of NHL: haplotype and principal components analysis. For the haplotype analysis, all SNPs from a gene with HWE p-value for the controls greater than 0.0001 were used to determine haplotype frequencies, and a global score test was used, as implemented in the S-plus program Haplo.stats (Schaid, et al 2002). As a global gene test, we used principal components to create uncorrelated components that are linear combinations of the SNPs from a gene. These components were then ranked according to the amount of the total SNP variance explained. The resulting smallest subset of components that accounted for at least 90% of the variability amongst the SNPs was included in a multivariable logistic regression model. A likelihood ratio test was then used to jointly test the significance of the selected principal components. This method decreases the dimensionality of the correlated SNPs by reducing the number of independent degrees of freedom (Gauderman, et al 2007). Gene level tests with p<0.05 were declared of interest.
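The component-selection step of the principal components test can be sketched as below. This is an illustration under stated assumptions: the eigenvalues are hypothetical, the function name is invented for the example, and the subsequent logistic regression and likelihood ratio test on the retained components are not shown.

```python
def n_components_to_keep(eigenvalues, threshold=0.90):
    """Return the smallest number of principal components whose
    cumulative share of the total SNP variance reaches `threshold`.

    `eigenvalues` are the variances explained by each component of
    one gene's SNPs (hypothetical inputs for this sketch).
    """
    ranked = sorted(eigenvalues, reverse=True)  # rank by variance explained
    total = sum(ranked)
    cumulative = 0.0
    for k, value in enumerate(ranked, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ranked)

# Illustrative eigenvalues for a gene with four typed SNPs.
kept = n_components_to_keep([5.0, 3.0, 1.5, 0.5])
```

The retained components would then enter the logistic model together, so the gene-level test has as many degrees of freedom as components kept rather than one per SNP.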
Individual SNPs were examined using unconditional logistic regression to estimate odds ratios (ORs) and corresponding 95% confidence intervals (CIs) separately for heterozygotes and minor allele homozygotes, using homozygotes for the major allele as the reference. Each polymorphism was also modeled as a gene-dosage effect (i.e., a p-trend) in which the rare variant allele is arbitrarily chosen as the “high-risk” allele and the scores take on values of 0, 1, or 2 corresponding to the number of copies of the high-risk allele. SNPs with a p<0.05 in the setting of a global gene test of p<0.05 were declared of interest. In exploratory analyses, we reported subtype results for SNPs declared of interest, and also investigated logistic regression models that simultaneously modeled effects from multiple genes.
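The gene-dosage (log-additive) model can be sketched as a logistic regression on the allele count. The example below is a simplified illustration: it fits the model by Newton-Raphson on grouped counts, omits the adjustment for the design variables, and uses a hypothetical function name and hypothetical counts.

```python
import math

def fit_log_additive(case_counts, control_counts, max_iter=50, tol=1e-10):
    """Fit logit P(case) = b0 + b1 * g by Newton-Raphson.

    case_counts / control_counts: subjects carrying g = 0, 1, 2 copies
    of the minor allele (hypothetical inputs; no covariate adjustment).
    Returns the per-allele OR and its 95% Wald confidence limits.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        u0 = u1 = 0.0            # score vector
        i00 = i01 = i11 = 0.0    # information matrix entries
        for g, (cases, controls) in enumerate(zip(case_counts, control_counts)):
            n = cases + controls
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * g)))
            resid = cases - n * p
            w = n * p * (1.0 - p)
            u0 += resid
            u1 += resid * g
            i00 += w
            i01 += w * g
            i11 += w * g * g
        det = i00 * i11 - i01 * i01
        step0 = (i11 * u0 - i01 * u1) / det
        step1 = (i00 * u1 - i01 * u0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    se = math.sqrt(i00 / det)    # standard error of b1
    return math.exp(b1), math.exp(b1 - 1.96 * se), math.exp(b1 + 1.96 * se)

# Illustrative genotype counts (not taken from the study).
odds_ratio, ci_low, ci_high = fit_log_additive((150, 200, 90), (200, 200, 75))
```

Under this coding the OR is the multiplicative change in odds per additional copy of the minor allele, so carrying two copies corresponds to the square of the per-allele OR.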
To assess the robustness of our results we calculated the tail strength for the genes and SNPs that we evaluated (Taylor and Tibshirani 2006). The tail strength assesses the overall statistical significance of testing for no association across all the SNPs or genes evaluated. A tail strength > 0 indicates more significant findings than expected by chance. We also calculated q-values (Storey 2002) for the SNP-level tests. The q-value for a SNP is the expected proportion of false positives among all results at least as significant as that SNP; q-values <0.05 were declared to be of interest.
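Both multiple-testing summaries can be sketched from a list of p-values. The tail strength below follows the definition of Taylor and Tibshirani (2006). The q-value function is a simplification that fixes the proportion of true nulls (pi0) at 1, which makes it equivalent to a Benjamini-Hochberg adjustment rather than the full Storey (2002) estimator. Function names and p-values are hypothetical.

```python
def tail_strength(pvals):
    """Tail strength: TS = (1/m) * sum_k [1 - p_(k) * (m + 1) / k],
    where p_(1) <= ... <= p_(m) are the ordered p-values.
    TS is near 0 under the global null and positive when small
    p-values are over-represented.
    """
    m = len(pvals)
    ordered = sorted(pvals)
    return sum(1.0 - p * (m + 1) / k
               for k, p in enumerate(ordered, start=1)) / m

def q_values(pvals, pi0=1.0):
    """Simplified q-values with a fixed pi0 (assumption: pi0 = 1,
    the conservative choice; Storey's method estimates pi0 from data).
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * pvals[i] * m / rank)
        q[i] = running_min
    return q

# Illustrative p-values (not taken from the study).
example = [0.01, 0.04, 0.03, 0.5]
ts = tail_strength(example)
qs = q_values(example)
```

With pi0 fixed at 1 the q-values can only be larger than those from the full estimator, so this sketch errs on the side of declaring fewer SNPs of interest.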
Analyses were conducted using SAS (SAS Institute, Cary, NC, Version 8, 1999), S-Plus (Insightful Corp, Seattle, WA, Version 7.05, 2005), and R software systems. All analyses were adjusted for the design variables of age, gender, and county of residence.
There were 441 cases and 475 controls available for analysis. Compared to controls, cases were slightly younger (60.1 versus 61.7 years) and were less likely to have attended graduate or professional school (13.8% versus 20.3%), but were similar with respect to gender (58% of cases and 55% of controls were male) and state of residence (65% of cases and 67% of controls were from Minnesota) (See Supplemental Table II). The most common NHL subtypes were CLL/SLL (N=123), follicular (N=113), and DLBCL (N=69).
We evaluated 31 genes (Table I) from the complement pathway (Figure I). Gene coverage ranged from 5% to 100%, with a median coverage of 67% (Table II). Coverage was defined as the number of SNPs accounted for by a successfully genotyped tagSNP divided by the total number of SNPs, as defined by HapMap, within each complement gene. In gene-level analysis, C2, C5, and C9 were significant at a nominal p<0.05 in principal components analysis, which was in agreement with the haplotype results (Table II). In addition, C7 was statistically significantly associated with NHL risk in haplotype analysis (p=0.046) but not in the principal components analysis (p=0.11). The tail strength of this set of 31 genes from the principal components analysis was 0.26 (95% CI -0.09, 0.61). Simultaneous inclusion of the C2 SNP (p=0.068) and the principal components for C5 (p=0.028), C7 (p=0.19), and C9 (p=0.044) in the same statistical model suggested that the strongest signal (p<0.05) was from C5 and C9 (results not shown).
In Table III we report the SNP-level results for these 4 genes, and in Table IV we report the NHL subtype-specific results for SNPs that were significant at p<0.05 in Table III. For C2 we only genotyped one SNP (gene coverage, 25%) and this variant was associated with increased NHL risk overall (ordinal OR=1.32; 95% CI 1.04, 1.69). NHL subtype results were consistent with the overall associations reported in Table III.
For C5, 4 of the 14 typed SNPs had a p<0.05, including 3 intronic SNPs and one non-synonymous coding SNP (rs17612), which leads to a missense protein change (Glu1437Asp). The latter variant was associated with an increased risk of NHL (ordinal OR=1.64; 95% CI 1.14, 2.35). The subtype results for the 4 SNPs from C5 were generally consistent with the risk estimates for overall NHL, although the ORs were somewhat stronger (but not significantly so) for CLL/SLL.
For C7, 2 of the 12 typed SNPs had a p<0.05, including one intronic SNP (ordinal OR=1.26; 95% CI 1.01, 1.58) and one non-synonymous coding SNP (OR=1.30; 95% CI 1.03, 1.62); the latter SNP is associated with a missense protein change (Thr587Pro). Both SNPs showed similar patterns for each of the NHL subtypes.
For C9, 3 of the 7 typed SNPs had a p<0.05, and all were intronic. Subtype results were similar to the overall results for the SNP rs187875, and variable for the other two SNPs.
With the exception of a SNP in CR1, no other SNPs in genes from the complement pathway were significant at p<0.05. The tail strength test for the 167 SNPs was 0.14 (95% CI, -0.01, 0.29), and 3 of the SNPs had q-values<0.05: rs7026551 from C5 (q=0.015), rs2416810 from C5 (q=0.015), and rs187875 from C9 (q=0.015).
C5 is adjacent to TRAF1, which we previously reported to be associated with risk of NHL in this study (Cerhan, et al 2007a). Figure II shows the p-values and LD pattern for the SNPs genotyped across this region, and suggests there are multiple signals across this region of the genome. When all significant SNPs in C5 and TRAF1 were included in the same logistic model, none of the SNPs were individually significant at p<0.05.
In our analysis of genetic variation in the complement pathway and risk of NHL, we found evidence at the gene and SNP level for an association of C2, C5, C7, and C9 with risk of NHL, and after accounting for multiple testing, SNPs from C5 and C9 remained noteworthy. Our results were broadly similar for the major NHL subtypes of CLL/SLL, follicular NHL and DLBCL, suggesting that these genes may play a general role in lymphomagenesis, although relatively small numbers for the selected subtypes and lack of data on the rarer subtypes make this conclusion preliminary. Strengths of this study include the use of newly diagnosed patients from a regionally-defined clinic population, central pathology review, careful matching of clinic controls, and extensive quality controls for data collection and genotyping. Population stratification is unlikely to explain our results based on our restriction to Caucasians in the study and lack of evidence using the STRUCTURE program. However, by limiting to Caucasians of mainly northern European ancestry, we recognize that our results do not readily generalize to other racial and ethnic groups.
Both false positive and false negative results are a concern in association studies. To address false positive concerns generated from multiple testing, we first calculated the tail strength at both the gene and SNP level and found that both point estimates were above zero, consistent with a signal beyond chance, although the 95% confidence intervals did not exclude zero. We also calculated q-values for individual SNPs, and two SNPs from C5 and a single SNP from C9 remained noteworthy at q<0.05. False negatives are also a potential concern. While our analysis included nearly all of the genes from the complement pathway, the tagSNP coverage was variable (5 to 100%) and only 8 genes had ≥80% coverage, although the median coverage was reasonable (67%). For lower coverage genes in particular, we cannot rule out a genetic association. There may be other genetic mechanisms (e.g., copy number variants) by which these genes could influence NHL. Ultimately, replication in independent populations will be required.
C2 is part of the classical pathway of the complement system: activated C1 cleaves C2 into C2a and C2b, and C2a leads to the activation of C3 (Figure I). Deficiency in C2 is the most frequently inherited complement deficiency, affecting 1 in 20,000 Caucasians (Jonsson, et al 2005), and more than 90% of all C2 deficiency cases are thought to be due to a 28 base pair deletion in the C2 gene of the HLA-B*18,S042,DRB1*15 MHC haplotype (Johnson, et al 1992). Approximately 50% of C2 deficient persons develop severe and/or repeated infections or an autoimmune disease, most commonly systemic lupus erythematosus, Henoch-Schonlein purpura, or polymyositis (Jonsson, et al 2005). Of additional interest, in age-related macular degeneration, a chronic degenerative disease thought to be due to chronic inflammation from deposition of complement proteins, a common risk haplotype and two protective haplotypes have been identified based on variants in CFB (L9H, R32Q) and C2 (E318D, intron 10) (Gold, et al 2006), which was confirmed in a second study (Maller, et al 2006). None of our SNPs were in LD with these SNPs. Both CFB and C2 are located close to both the MHC class III region and the TNF loci. Thus, further characterization of this region is of high priority to identify one or more SNPs that could play a causal role in NHL.
C5 is composed of 2 disulfide-linked polypeptide chains, alpha (C5a) and beta (C5b), and is activated after cleavage by C3b (Figure I). C5a is a highly potent mediator of acute inflammatory reactions, while C5b-C9 compose the membrane attack complex (MAC), which can lead to direct lysis of target cells by complement-dependent cytotoxicity (CDC) (Guo and Ward 2005, Kemper and Atkinson 2007). C5a is a strong chemoattractant for a variety of inflammatory cells (neutrophils, eosinophils, monocytes, and T-lymphocytes), up-regulates expression of adhesion molecules on endothelial cells, and induces the release of a number of pro-inflammatory cytokines and chemokines including IL-1, IL-6, MCP-1, and TNFα (Guo and Ward 2005, Markiewski and Lambris 2007). C5a also mediates effector T-cell apoptosis, further emphasizing the importance of the complement pathway in regulating T-cell responses (Lalli, et al 2008). C5a is involved in sepsis, allergy and asthma (Guo and Ward 2005), and also appears to contribute to tumor growth in mouse models, with the latter effect primarily due to the impact of C5a on the host microenvironment rather than tumor cells (Markiewski, et al 2008). Deficiencies of C5b-C9 are associated with recurrent and clinically mild to moderate infections by Neisseria species, particularly meningococcal and gonococcal (O'Neil 2000). Complement-mediated signaling can override the intracellular inhibitory mechanisms that maintain clonal anergy in B cells, and exposure to complement fixing bacteria may increase the risk of autoimmunity (Lyubchenko, et al 2007). Taken together, these findings suggest the hypothesis that inadequate control of infections may result in chronic B-cell receptor stimulation or T-cell/B-cell interactions and the subsequent development of a malignant clone.
C5 has also been identified as a susceptibility locus in asthma (Ober, et al 1998, Wjst, et al 1999) and hepatic fibrosis (Hillebrandt, et al 2005). Furthermore, one of the top hits from a recent genome-wide association study (GWAS) of rheumatoid arthritis (RA) was a SNP in the intergenic region between C5 and TRAF1 (Plenge, et al 2007), which was further confirmed in a candidate gene study (Kurreeman, et al 2007). We previously identified TRAF1, which is involved in TNF signaling, as a top candidate gene in an analysis of over 1200 immune genes (Cerhan, et al 2007a). The combined results, shown in Figure II, suggest several SNPs that are of interest across this region; however, none of our SNPs were in LD with the SNP from the RA GWAS. Since both genes are plausible candidates for both RA and NHL, further fine mapping and functional studies are warranted.
Many types of solid tumors and hematologic malignancies have been shown to express molecules that regulate the complement cascade, including CD46, CD55, and CD59, and this may allow tumor cells to escape complement attack (Gorter and Meri 1999). We did not find evidence for genetic variation in these molecules impacting risk of NHL, but given that complement-dependent cellular cytotoxicity (CDCC) may be relevant to the mechanism of action for monoclonal antibody therapy in NHL including CLL/SLL (e.g., Rituximab), the role of genetic variation of these genes in NHL prognosis remains to be evaluated.
Evidence on the role of genetic variation in immune genes in risk of NHL is still limited, but several patterns are emerging, including a role for genes encoding pro-inflammatory cytokines (Cerhan, et al 2007a, Rothman, et al 2006, Skibola, et al 2007). Furthermore, there are suggestions that genes involved in innate immunity, including TLR2, TLR4, TLR6 and CARD15, may also be susceptibility loci (Cerhan, et al 2007a, Forrest, et al 2006, Nieters, et al 2006, Wang, et al 2006). Complement activation can be initiated by the interaction of pattern recognition receptors such as TLRs, and given the central role of complement in innate immunity and the critical link it provides to adaptive immunity, further evaluation of this pathway is warranted.
In summary, we provide the first evidence for a role of genetic variation in complement genes, particularly C5 and C9, in risk of NHL. These initial findings require replication, and the C5-TRAF1 region in particular will require fine mapping to identify any causal variants.
This study was funded, in part, by the National Institutes of Health (R01 CA92153; P50 CA97274). We thank Sondra Buehler for her editorial assistance.
Conflicts of Interest: The authors have no conflicts of interest to declare.