The high incidence of lung cancer in Xuanwei County, China has been attributed to exposure to indoor smoky coal emissions that contain polycyclic aromatic hydrocarbons. The inflammatory response induced by coal smoke components may promote lung tumor development. We studied the association between single nucleotide polymorphisms (SNPs) in genes involved in innate immunity and lung cancer risk in a population-based case-control study (122 cases and 122 controls) in Xuanwei. A total of 1,360 tag SNPs in 149 gene regions were included in the analysis. FCER2 rs7249320 was the most significant SNP (OR: 0.30; 95% CI: 0.16–0.55; P, 0.0001; false discovery rate value, 0.13) for variant carriers. The gene regions ALOX12B/ALOX15B and KLK2 were associated with increased lung cancer risk globally (false discovery rate value < 0.15). In addition, there were positive interactions between KLK15 rs3745523 and smoky coal use (OR: 9.40; P interaction = 0.07), and between FCER2 rs7249320 and KLK2 rs2739476 (OR: 10.77; P interaction = 0.003). Our results suggest that genetic polymorphisms in innate immunity genes may play a role in the carcinogenesis of lung cancer caused by polycyclic aromatic hydrocarbon-containing coal smoke. The integrin/receptor and complement pathways, as well as IgE regulation, are particularly noteworthy.
Xuanwei County has the highest incidence rate of lung cancer in China, a fact that is attributed to the wide use of smoky coal and unvented stoves. The use of smoky coal in unvented stoves accounts for more than 90% of lung cancer cases for both men and women [Mumford et al., 1987; He & Yang, 1994]. Burning the local smoky coal, a bituminous coal, is known to generate a large amount of carcinogenic polycyclic aromatic hydrocarbons (PAHs). Although very few women smoke, the lung cancer mortality rates in Xuanwei County were similar between men and women (27.7 and 25.3 per 100,000 for males and females, respectively), most likely because women were exposed to indoor coal smoke for a longer time [Mumford et al., 1987; He & Yang, 1994].
In addition to PAHs, indoor smoky coal burning generates high concentrations of other pollutants, including particulate matter that contains ultrafines, sulfur dioxide, nitrogen oxides, and ozone, all of which are thought to provoke inflammatory and/or allergic disorders. Some studies have shown that the inflammatory response may promote lung tumor development [Ballaz & Mulshine, 2003]. Because of the high prevalence of indoor air pollution, pulmonary inflammation is prevalent in Xuanwei, as documented by the high incidence of chronic obstructive pulmonary disease [Zhou et al., 1995].
Innate immunity is the body’s first barrier to clear non-specific antigens. Furthermore, the innate immune system interacts with the adaptive immune system during physiological and chronic inflammation [Kabelitz & Medzhitov, 2007]. As exposure to coal smoke is the primary risk factor for lung cancer in Xuanwei, we hypothesized that genetic variation in genes involved in innate immunity would play a role in lung carcinogenesis in this special population. Here, we report the analysis of genetic polymorphisms in 209 innate immunity genes in a population-based case-control study in Xuanwei, China.
This was a population-based case-control study of lung cancer in Xuanwei, China. Details of the study have been described elsewhere [Shen et al., 2005]. Briefly, from March 1995 to March 1996, a total of 122 newly diagnosed lung cancer cases and 122 controls were recruited. Matched controls were enrolled within 2 weeks after the diagnosis and recruitment of each lung cancer case, and matched factors were sex, age (+/−2 years), village, and type of fuel (smoky coal, smokeless coal, and wood) currently used for cooking and home heating. The participation rates for cases and controls were 98 % and 100 %, respectively. A standardized structured questionnaire was used to obtain information about demographic characteristics, life-time use of different types of coal, tobacco smoking, family history of lung cancer, and personal medical history. Detailed information on tobacco smoking history was collected including type of tobacco smoked (i.e., cigarette tobacco, tobacco leaf), the amount of tobacco used (i.e., number of cigarettes smoked per day, amount of tobacco leaf per month), age at start and cessation of smoking, and means of smoking (cigarette, pipe, and water pipe).
DNA was extracted from sputum samples using phenol-chloroform extraction. A total of 111 cases and 95 controls had enough DNA for the assay. DNA samples from cases and controls were randomly sorted to generate a list of 46 duplicated DNA samples for genotyping quality control.
A GoldenGate assay (Illumina, http://www.illumina.com) was developed using SNPs in genes for which there is plausible evidence that the gene product is related to innate immunity. Genes located in the same gene region as innate immunity genes were also genotyped. A gene region was defined as a chromosome region containing multiple genes within a range of 200,000 bases. The GoldenGate assay was initially designed to examine 1,536 SNPs in 150 gene regions. However, 140 SNPs were excluded due to assay failure, and 36 SNPs were dropped from the analysis because of low minor allele frequency (< 1%) or poor quality performance (low completion rate, low concordance, or departure from Hardy-Weinberg equilibrium). A total of 1,360 tag SNPs in 149 gene regions (209 genes) were included in the analysis (Supplementary Table 1). Among them, a further 187 SNPs were excluded from the statistical analyses due to strong correlation with other SNPs in the same gene (Spearman correlation coefficient > 0.9). The final analysis dataset contained 1,173 SNPs in 149 gene regions. Approximately 99.7% of assays had a completion rate of ≥99% for all samples, and 95.3% of assays reached ≥99% concordance among quality control samples.
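The correlation-based exclusion described above can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used in the study: the function names, the greedy "keep the first SNP seen" rule, and the use of the absolute value of the coefficient are all assumptions, and the genotype codes are made up.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def prune_correlated(snps, threshold=0.9):
    """Greedy pruning (assumed rule): keep a SNP only if its |Spearman r|
    with every SNP already kept does not exceed the threshold.
    snps maps a SNP name to a list of genotype codes (0, 1, 2)."""
    kept = []
    for name, codes in snps.items():
        if all(abs(spearman(codes, snps[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical genotype codes for six subjects at three SNPs in one gene
example_snps = {
    "snpA": [0, 1, 2, 0, 1, 2],
    "snpB": [0, 1, 2, 0, 1, 2],  # identical to snpA, so it is pruned
    "snpC": [2, 0, 1, 1, 0, 2],
}
kept_snps = prune_correlated(example_snps)
```

Which member of a correlated pair is retained depends on the iteration order in this sketch; the source does not state how that choice was made.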
The genotype data were analyzed by comparing the homozygotes and the heterozygotes of the variant with the homozygote of the common allele. An additive model was applied by treating the genotypes as values of 0, 1, and 2 in one model in order to test for a linear trend. As genotype data were not obtained for all cases and controls, unconditional logistic regression was used to estimate the odds ratios (ORs) and 95% confidence intervals (CIs) with two-sided P values for the association between lung cancer risk and the SNPs, adjusted for age (continuous variable) and sex. Since additional adjustment for coal use (< 130 tons and ≥ 130 tons of smoky coal used lifetime) and variables of tobacco smoking (ever/never smoking, pack-year, duration, and daily amount) generated very similar results, the final models adjusted for age and sex only.
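As a rough illustration of the genotype coding and odds ratio estimation described above, the following Python sketch codes genotypes additively (0, 1, 2 copies of the variant allele) and computes an odds ratio with a Wald 95% confidence interval from a 2×2 table. Note that this is a simplification: the study fitted unconditional logistic regression adjusted for age and sex, whereas this sketch is unadjusted. The function names and all counts are hypothetical.

```python
import math

def code_genotype(genotype, variant_allele):
    """Additive coding: number of copies of the variant allele (0, 1, or 2)."""
    return genotype.count(variant_allele)

def is_carrier(genotype, variant_allele):
    """Dominant coding: 1 for heterozygotes and variant homozygotes, else 0."""
    return 1 if code_genotype(genotype, variant_allele) > 0 else 0

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval.

    a: carrier cases        b: non-carrier cases
    c: carrier controls     d: non-carrier controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the study
example_or, example_lower, example_upper = odds_ratio_wald(20, 80, 40, 60)
```

With covariates such as age and sex, the same quantity would instead be obtained by exponentiating the genotype coefficient of a fitted logistic regression model.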
We performed three gene region-based tests for association: (1) a Likelihood Ratio Test (LRT) for each gene comparing models with and without terms for heterozygotes and homozygotes for variant carriers for all SNPs in one given gene region (degrees of freedom = 2 × number of SNPs per gene region); (2) an LRT for each gene comparing models with and without terms for each SNP (genotypes coded as 0, 1, and 2) in a gene region (degrees of freedom = number of SNPs per gene region); (3) permutation-based resampling methods (1,000 permutations) to assess the true statistical significance of the smallest p-trend within each gene region. This method automatically adjusts for the number of tag SNPs tested within that gene region, as well as the underlying linkage disequilibrium pattern [Chen et al., 2006]. In short, the method considers p-values one at a time for a set of SNP-disease associations, marginally over all other SNPs in one gene region. Inference is based on the permutation distribution of the minimum of the ordered p-values, which takes the correlations into account.
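The permutation approach in (3) can be sketched as below. This is only an illustration under stated assumptions: the per-SNP trend p-value is taken from an unadjusted trend statistic (N times the squared correlation between genotype code and case status, referred to a chi-square distribution with 1 degree of freedom) rather than from the adjusted logistic models used in the study, and the function names and the (hits + 1) / (permutations + 1) convention are choices made for this sketch.

```python
import math
import random

def trend_p(genotypes, status):
    """Unadjusted trend-test p-value for one SNP.
    genotypes: codes 0/1/2 per subject; status: 1 = case, 0 = control."""
    n = len(status)
    mean_g = sum(genotypes) / n
    mean_y = sum(status) / n
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, status))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in status)
    if sxx == 0 or syy == 0:
        return 1.0  # no variation, so no evidence of association
    chi2 = n * sxy * sxy / (sxx * syy)
    # Upper tail of a chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))

def min_p(region, status):
    """Smallest trend p-value across all SNPs in a gene region."""
    return min(trend_p(snp, status) for snp in region)

def region_permutation_p(region, status, n_perm=1000, seed=0):
    """Permutation p-value for the smallest p-trend in a gene region.
    Shuffling case/control labels leaves the genotype rows untouched,
    so the correlation (LD) among SNPs is preserved in every permutation."""
    rng = random.Random(seed)
    observed = min_p(region, status)
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if min_p(region, shuffled) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical region with two SNPs genotyped in 20 subjects
example_status = [0] * 10 + [1] * 10
example_region = [
    [0] * 10 + [2] * 10,   # strongly associated with status
    [0, 1] * 10,           # unrelated to status
]
example_region_p = region_permutation_p(example_region, example_status, n_perm=200)
```

Because the minimum is recomputed over all SNPs in each permutation, regions with more SNPs or weaker linkage disequilibrium are automatically held to a stricter standard.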
Haplotype analysis was carried out for gene regions in which ≥ 1 SNP was significantly associated with lung cancer risk. The haplotype block structure for each gene region was examined with HaploView [Barrett et al., 2005]. All contiguous locus subsets in a gene region were scanned to identify sub-haplotypes with the strongest omnibus association with lung cancer risk using window sizes of 3 to 6 loci [Schaid et al., 2002]. For significant sub-haplotypes, individual haplotypes were estimated using the EM algorithm and their association with lung cancer risk was calculated using unconditional logistic regression with the most common haplotype as the reference [Schaid et al., 2002].
Interactions between smoky coal use and a SNP, and between SNPs, were evaluated by adding an interaction term between smoky coal use (< 130 tons and ≥ 130 tons lifetime) and the SNP (common allele homozygotes and minor allele carriers), or between two SNPs, in a model. Joint effects of coal use and a SNP, and joint effects of two SNPs, were estimated by combining them as one variable in one model.
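The relationship between joint effects and a multiplicative interaction term can be illustrated with a small sketch. It is unadjusted and uses made-up counts; in an unadjusted logistic model containing both main effects and their product, the exponentiated product-term coefficient equals the ratio computed by the hypothetical `multiplicative_interaction` function below.

```python
def joint_odds_ratios(cases, controls):
    """Odds ratios for each combination of two binary factors.

    cases / controls: dicts keyed by (heavy_coal_use, variant_carrier),
    each value a count. The reference category is (0, 0)."""
    ref_odds = cases[(0, 0)] / controls[(0, 0)]
    return {key: (cases[key] / controls[key]) / ref_odds for key in cases}

def multiplicative_interaction(ors):
    """Ratio of the joint OR to the product of the two single-factor ORs.
    A value above 1 indicates a positive (more than multiplicative) interaction."""
    return ors[(1, 1)] / (ors[(1, 0)] * ors[(0, 1)])

# Hypothetical counts, not taken from the study
example_cases = {(0, 0): 10, (1, 0): 20, (0, 1): 15, (1, 1): 60}
example_controls = {(0, 0): 40, (1, 0): 40, (0, 1): 30, (1, 1): 30}

example_joint = joint_odds_ratios(example_cases, example_controls)
example_interaction = multiplicative_interaction(example_joint)
```

In this made-up example each factor alone doubles the odds, while the two together raise the odds eightfold, giving an interaction ratio of 2.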
We evaluated the robustness of our results using the False Discovery Rate (FDR) [Benjamini & Hochberg, 1995]. The FDR is the expected ratio of erroneous rejections of the null hypothesis to the total number of rejected hypotheses among all the SNPs or genes analyzed in this report. An FDR value of 0.15 was set as the cutoff point for promising SNPs or genes in the statistical analyses. All data were analyzed with the Statistical Analysis Software, version 9.13 (SAS Institute Inc, 1996), unless otherwise specified.
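For readers unfamiliar with the Benjamini-Hochberg procedure, a minimal sketch of how FDR values can be computed from a list of p-values is given below. The function name and the example p-values are hypothetical; the study itself used SAS, and this sketch is not a reproduction of that analysis.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values, not taken from the study
example_p = [0.0001, 0.004, 0.03, 0.2, 0.6]
example_fdr = benjamini_hochberg(example_p)

# Apply the 0.15 cutoff used for promising SNPs or genes
promising = [p for p, q in zip(example_p, example_fdr) if q <= 0.15]
```

Each FDR value is the p-value scaled by the number of tests divided by its rank, with a running minimum applied so that a smaller p-value never receives a larger FDR value than a larger one.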
Demographic features, including age and sex, were comparable between cases and controls (Table 1). Compared with studies in other populations, the impact of tobacco smoking in Xuanwei was weak, with a 1.7-fold (95% CI = 0.8–3.5) risk of lung cancer for exposure to more than 25 pack-years among men, which is consistent with previous studies in Xuanwei [Lan et al., 2002; Lan et al., 2008]. The risk estimate for coal use is attenuated because cases and controls were matched on fuel type. Heavy smoky coal users (≥ 130 tons during lifetime) had a 2.3-fold (95% CI = 1.2–4.1) risk of lung cancer compared with subjects who used less than 130 tons of smoky coal during their lifetime.
Among the 1,173 SNPs, a few criteria were preset to screen promising SNPs or gene regions: 1) a significance level of 0.01 or smaller in the additive model, dominant model (all variant carriers combined), or homozygote variant carriers; or 2) a significance level of 0.05 or smaller in either of the 2 LRTs; or 3) a significance level of 0.05 or smaller in the permutation test. The genotype frequencies for cases and controls, and the effect of these SNPs and gene regions on lung cancer risk are shown in Supplementary Table 2. Ten gene regions (i.e., ALOX12B, ALOX5AP, FCER2, SELP, KLK15, KLK2, KLK4, KLKB1, IRAK3, and MBP) displayed global association with lung cancer risk at the P<0.01 level. One or more SNPs in FCER2, SECTM1, C7, DAB2, KLK15, KLK2, SERPING1, MBP, MUC7, TLR4, NCF2, NOS3, and INSL3 were associated with lung cancer risk at the P<0.005 significance level.
FDR values at or below the 0.15 cutoff were obtained for FCER2 rs7249320 (SNP analysis, 0.13) and, in the LRT, for ALOX12B/ALOX15B (0.10) and KLK2 (0.15). FCER2 rs7249320 (IVS4 +197C>A) was the most significant SNP, with an OR of 0.30 (95% CI: 0.16–0.55; P, 0.0001) for variant carriers. Carriers of variants in KLK2 rs2739476 (−2911 G>A; P = 0.001) and ALOX12B rs6503075 (IVS12 +162G>A; P = 0.01) had a >2-fold risk of lung cancer (Table 2). Haplotype analyses did not identify any loci beyond the significant SNPs found in the individual SNP analyses.
Finally, we found KLK15 rs3745523 displayed a borderline significant positive interaction with smoky coal use (P = 0.07) in addition to a significant main effect (P = 0.002) (Table 3). There is a larger risk of lung cancer among heavy smoky coal users who are variant carriers of KLK15 rs3745523. The main effect is shown in Table 2. In addition, FCER2 rs7249320 and KLK2 rs2739476 displayed a significant positive interaction (Table 4). Carriers of at-risk alleles in the two SNPs were associated with a 10-fold risk of lung cancer.
This study represents the largest evaluation of innate immunity genes in a population with extremely high coal emission exposure and lung cancer incidence. We found that several SNPs in genes associated with innate immunity were associated with lung cancer risk in Xuanwei, suggesting that innate immunity plays a role in coal emission-related lung carcinogenesis. Out of six pre-defined pathways (oxidative response; pattern recognition molecules and antimicrobials; integrins/receptors; complement; chemokines; and response genes and tissue factors), the two pathways with the most significant associations were the integrin/receptor and the complement pathways. Even though the most significant findings were based on variation in an untranslated region (5′ or 3′) or an intron, these variants may be markers of causal loci or may influence gene expression.
Chronic inflammation is an important risk factor in the pathogenesis of lung cancer [Coussens & Werb, 2002; Schottenfeld & Beebe-Dimmer, 2006; Ames et al., 1995]. Recurrent and persistent inflammation may cause or facilitate malignant transformation, incite reparative tissue proliferation, and create a microenvironment that promotes carcinogenesis and tumor progression. Up-regulation of the host's immunoinflammatory response may induce carcinogenesis, as has been documented in smoking-induced lung cancer [Ballaz & Mulshine, 2003]. Innate immunity is the first barrier protecting the host from invasion by exogenous chemicals or microbes. There is also some molecular crosstalk between components of the innate and adaptive immunity pathways through pattern recognition receptors and cytokines [Kabelitz & Medzhitov, 2007]. Coal smoke contains a variety of particulates and chemicals that can stimulate the airways and induce acute and chronic inflammatory reactions. The innate immune system is recruited and activated when individuals inhale coal smoke. Excessive inflammation can increase the risk of malignant transformation through oxidative DNA damage, or facilitate the carcinogenesis of PAHs in coal smoke. In addition, local chronic inflammation can promote the development of cancer. Tumor-associated macrophages and mast cells are two documented innate immunity facilitators of lung cancer progression [Vella & Finn, 2006].
The genes whose SNPs showed highly significant associations or gene-environment interactions play important roles in innate immunity. FCER2 is known to encode a key molecule for B-cell activation and growth. It is a receptor for IgE and is important in the regulation of IgE levels. The expression of FCER2 is highly up-regulated in normal activated follicular B cells and in chronic lymphocytic leukemia B cells [Pathan et al., 2008]. Atopy, defined by the presence of elevated levels of total and specific IgE antibodies, was associated with elevated risk of lung cancer in a meta-analysis [Wang & Diepgen, 2005]. A few SNPs, including FCER2 rs7249320 (included in our study) and haplotypes of FCER2, have been found to be associated with a higher level of serum IgE [Laitinen et al., 2000; Tantisira et al., 2007]. Our results support the hypothesis that altered IgE levels due to genetic variation are involved in lung carcinogenesis. We found three significant tag SNPs, covering the region from intron 4 to the 3′ end, which overlaps the conserved domain of this gene (CD03590). More SNPs in this region should be investigated to identify the causal locus.
Kallikreins are a group of serine proteases that are implicated in carcinogenesis [Borgono & Diamandis, 2004]. Expression and genetic polymorphisms of kallikreins have been used as biomarkers for the diagnosis and prognosis of hormone-dependent cancers, including prostate cancer, breast cancer, and ovarian cancer [Yousef et al., 2001; Yousef et al., 2002; Yousef et al., 2003]. Kallikreins have also been found to be dysregulated and differentially expressed in lung cancer [Bhattacharjee et al., 2001; Planque et al., 2005]. Kallikrein markers might be useful for lung cancer diagnosis and prognosis estimation [Planque et al., 2008; Sher et al., 2006]. The chromosome region 19q13.3–q13.4, where KLK15 and KLK2 are localized, is an area known to undergo rearrangements in various tumors [Obiezu & Diamandis, 2005]. Such rearrangements may lead to dysregulated expression of the genes involved. Consistent with these findings, our results point to the promoter regions of KLK15 and KLK2 as promising functional regions. Epigenetic changes leading to up- or down-regulation of expression of these genes may participate in lung cancer pathogenesis.
Kallikreins also affect IgE production [Matsushita & Katz, 1993] and have been associated with atopic asthma and dermatitis [Christiansen et al., 1992; Komatsu et al., 2007]. We found a significant positive interaction between SNPs in FCER2 and KLK2. These findings suggest that IgE regulation is important in lung cancer development in Xuanwei.
Our study is limited by its small sample size and consequently low power to detect effects that may truly exist. Also, given the multiple comparisons carried out, there is a possibility that one or more findings are false positives. As such, our results need to be considered as preliminary. However, the findings derive from a unique population that provides a special model for non-smoking PAH-(particulates) carcinogenesis, and are biologically plausible. A substantially larger case-control study of lung cancer is currently being conducted in this region of China and will provide an opportunity to replicate and extend these findings.
In summary, we found genetic polymorphisms in a number of innate immunity genes that were associated with lung cancer risk in Xuanwei, China. This suggests that innate immunity, especially the regulation of IgE, may play a role in the pathogenesis of lung cancer in populations exposed to high levels of PAHs and coal emissions.
This study was supported by the Chinese Center for Disease Control and Prevention and the Intramural Research Program of the National Institutes of Health (NIH), National Cancer Institute.