Transforming growth factor beta (TGF-β) receptors are centrally involved in TGF-β-mediated cell growth and differentiation and are frequently inactivated in non-small cell lung cancer (NSCLC). Constitutively decreased type I TGF-β receptor (TGFBR1) expression is emerging as a novel tumor-predisposing phenotype. The association of TGFBR1 haplotypes with risk for NSCLC has not yet been studied. We tested the hypothesis that single nucleotide polymorphisms (SNPs) and/or TGFBR1 haplotypes are associated with risk of NSCLC. We genotyped six TGFBR1 haplotype tagging SNPs (htSNPs) by PCR-restriction fragment length polymorphism (PCR-RFLP) assays and one htSNP by PCR-single strand conformation polymorphism (PCR-SSCP) assay in two case-control studies. Case-control study 1 included 102 NSCLC patients and 104 healthy controls from Suzhou. Case-control study 2 included 131 patients with NSCLC and 133 healthy controls from Wuxi. All individuals included in the two case-control studies were Han Chinese. Haplotypes were reconstructed according to the genotyping data and linkage disequilibrium (LD) status of these seven htSNPs. None of the htSNPs was associated with NSCLC risk in either study. However, a four-marker haplotype, CTGC, was significantly more common among controls than among cases in both studies (P=0.014 and P=0.010, respectively), indicating that this haplotype is associated with decreased NSCLC risk (adjusted OR, 0.09; 95% CI, 0.01-0.61 and adjusted OR, 0.11; 95% CI, 0.02-0.59, respectively). Combined analysis of both studies shows a strong association of this four-marker haplotype with decreased NSCLC risk (adjusted OR, 0.11; 95% CI, 0.03-0.39). This is the first evidence of an association between a TGFBR1 haplotype and risk for NSCLC.
Lung cancer is one of the most common cancers worldwide. It is estimated that approximately 41.8 men and 19.3 women per 100,000 Chinese individuals die every year of lung cancer(1). Non-small cell lung cancer (NSCLC) accounts for approximately 85% of all cases of lung cancer. Although smoking is considered a major risk factor for lung cancer, less than 20% of lifetime smokers develop lung cancer(2), suggesting that genetic susceptibility plays an important role in lung carcinogenesis.
Transforming growth factor beta (TGF-β) is a potent inhibitor of normal epithelial cell growth in vivo. TGF-β signaling is mediated by two specific cellular serine/threonine kinase receptors, the type I (TGFBR1) and type II (TGFBR2) TGF-β receptors. TGF-β binds directly to TGFBR2, and is then recognized by TGFBR1, which is phosphorylated and activated by TGFBR2(3).
Previous studies have demonstrated that TGF-β signaling alterations significantly contribute to NSCLC progression(4). Furthermore, increased levels of circulating TGF-β are associated with poor prognosis(5), which indicates that TGF-β is ineffective in inhibiting the growth of lung tumors in vivo and may enhance disease progression. Alterations of TGF-β receptors are a potential mechanism underlying refractoriness to TGF-β growth inhibitory signals and the development and progression of human cancers, including NSCLC(6-10). Several studies have focused on the association between the polyalanine polymorphism of TGFBR1 (TGFBR1*6A) and several types of cancer(11-14). Another single nucleotide polymorphism (SNP), Int7G24A (rs334354), may be associated with risk of kidney, bladder and invasive breast cancer(15, 16). Given that neither TGFBR1*6A nor Int7G24A is associated with risk for NSCLC(17, 18), we decided to investigate the association of TGFBR1 haplotypes with NSCLC risk.
To comprehensively study the genetic variants of TGFBR1 associated with susceptibility to NSCLC, we genotyped six TGFBR1 haplotype tagging SNPs (htSNPs) using PCR-restriction fragment length polymorphism (PCR-RFLP) assays and one htSNP using a PCR-single strand conformation polymorphism (PCR-SSCP) assay in two Chinese population-based case-control studies. The seven htSNPs, including three in the 5′ flanking region, three in intronic regions and one in the 3′ flanking region of the TGFBR1 gene, appropriately capture all the common haplotype blocks reconstructed in HapMap Phase II data.
In case-control study 1, blood specimens were collected from 102 consecutive patients diagnosed with NSCLC at the First Affiliated Hospital of Soochow University between January 2005 and May 2008. None of the NSCLC patients had received either radiotherapy or chemotherapy prior to blood sampling. As controls, we collected blood samples from 104 geographically-matched individuals with the same age range and without a history of cancer at the First Affiliated Hospital of Soochow University between January and December 2005. In case-control study 2, blood specimens were collected from 131 patients with a diagnosis of NSCLC who had not received radiotherapy or chemotherapy and 133 geographically-matched controls with the same age range at Wuxi Third People's Hospital between October 2004 and June 2007. Blood specimens were obtained after informed consent from all subjects. This study was approved by the Academic Advisory Board of Soochow University. A standardized questionnaire was administered to collect data concerning age, sex and smoking history.
HapMap SNP Phase II data (www.hapmap.org) were used to determine the frequency of SNPs among Han Chinese (CHB), and 74 SNPs were obtained from a 76kb region of TGFBR1 extending from 28kb upstream of the transcriptional start site to 7kb downstream of the 3′ untranslated region. Three haplotype blocks were reconstructed using these 74 SNPs with the Haploview program(19). Haplotype tagging SNP selection was also performed using the Haploview program, which implements an htSNP selection method proposed by Carlson et al(20). This method selects a set of htSNPs such that each SNP considered has r2 greater than a pre-specified threshold with at least one of the htSNPs. In our selection, only SNPs with minor allele frequency (MAF) greater than 10% were considered and the threshold of pairwise LD was set as r2=0.8. A total of seven htSNPs within three blocks were selected among 47 SNPs considered across TGFBR1, including three in the 5′ flanking region, three in intronic regions and one in the 3′ flanking region (Supplemental Table 1). The LD map of these seven htSNPs is shown in Supplemental Figure 1.
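As a rough illustration of the r2-based tagging step described above, the following Python sketch performs a greedy binning in the spirit of the method of Carlson et al. It is not the Haploview implementation. The function name greedy_tag_snps, the choice to treat r2 values at or above the threshold as tagged, and the small r2 matrix are assumptions made purely for illustration and are not HapMap data.

```python
def greedy_tag_snps(r2, threshold=0.8):
    """Greedy tag-SNP selection from a symmetric pairwise r2 matrix.

    Returns indices of tag SNPs such that every SNP either is a tag or
    has r2 >= threshold with at least one tag (assumed comparison rule).
    """
    untagged = set(range(len(r2)))
    tags = []
    while untagged:
        # Count how many still-untagged SNPs each candidate would cover.
        def coverage(i):
            return sum(1 for j in untagged if j == i or r2[i][j] >= threshold)

        # Ties are broken by the lowest index to keep the result deterministic.
        best = max(sorted(untagged), key=coverage)
        tags.append(best)
        untagged -= {j for j in untagged if j == best or r2[best][j] >= threshold}
    return tags


# Hypothetical r2 values for five SNPs (illustrative only).
example_r2 = [
    [1.00, 0.90, 0.85, 0.10, 0.20],
    [0.90, 1.00, 0.50, 0.10, 0.20],
    [0.85, 0.50, 1.00, 0.10, 0.20],
    [0.10, 0.10, 0.10, 1.00, 0.95],
    [0.20, 0.20, 0.20, 0.95, 1.00],
]
example_tags = greedy_tag_snps(example_r2)  # SNP 0 tags SNPs 1 and 2; SNP 3 tags SNP 4
```

In this toy matrix two tags suffice for five SNPs, which is the same kind of reduction that allows seven htSNPs to represent the 47 common SNPs considered here.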
Genomic DNA from blood specimens was isolated according to standard proteinase K digestion and phenol-chloroform extraction. The seven TGFBR1 htSNPs were amplified by polymerase chain reaction (PCR). The sequences of PCR primers and annealing temperatures are reported in Supplemental Table 2. The PCR reaction was carried out in a total volume of 25 μl, containing 50 to 100 ng genomic DNA, 1 unit Ex Taq DNA polymerase (Takara, Japan), 0.2 μmol/L of each primer, 1×Ex Taq Buffer (Mg2+ Plus), and 0.25 mmol/L of each dNTP. Genotyping for the htSNPs was performed by restriction fragment length polymorphism (RFLP) with restriction endonucleases (Supplemental Table 2). The different alleles were identified on a 2.5% agarose gel and visualized with ethidium bromide. One htSNP (rs1888223) was genotyped using single strand conformation polymorphism (SSCP) because no suitable restriction endonuclease was available. For SSCP, the PCR products were mixed at a 1:1 ratio with loading buffer (95% formamide, 0.05% xylene cyanol and 0.05% bromophenol blue), denatured at 95°C for 5 min, and cooled on ice for 2 min. Electrophoresis was done in 8% non-denaturing polyacrylamide gels run at a constant 20 W for 5 hours in 1×TBE running buffer, with the gel temperature maintained at 7°C. Ethidium bromide staining was used for detection of single-strand DNA in polyacrylamide gels.
Pairwise measures of linkage disequilibrium (LD) measured by Lewontin coefficient (D′) and squared correlation coefficient (r2) between the SNPs genotyped were calculated with the Haploview program(19). The frequencies of individual haplotypes were estimated from the genotype data using the SAS 9.1.3 PROC HAPLOTYPE and SHEsis program(21), which implement an expectation-maximization (EM) algorithm and a Full-Precise-Iteration algorithm for reconstructing haplotypes, respectively. Haplotypes with a frequency of less than 0.05 were not considered in the analysis. Logistic regression analysis was performed using SAS PROC LOGISTIC to estimate odds ratios (OR) and 95% confidence intervals (CI) of individual SNPs or haplotypes, with adjustment for age, sex, and smoking status.
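To make the EM idea concrete, the sketch below estimates haplotype frequencies for two biallelic SNPs from unphased genotypes. It is a deliberately simplified stand-in, not SAS PROC HAPLOTYPE or SHEsis, and it handles only two loci rather than the four analyzed in this study. The function name, the 0/1/2 genotype coding and the toy genotypes are assumptions for illustration.

```python
from collections import Counter


def em_two_snp_haplotypes(genotypes, n_iter=100, tol=1e-10):
    """EM estimate of the frequencies of haplotypes '00', '01', '10', '11'.

    genotypes: list of (g1, g2) pairs, each the count (0, 1 or 2) of
    allele '1' at SNP 1 and SNP 2 (assumed coding). Only double
    heterozygotes (1, 1) have ambiguous phase.
    """
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    base = {"00": 0.0, "01": 0.0, "10": 0.0, "11": 0.0}
    for (g1, g2), n in counts.items():
        if (g1, g2) == (1, 1):
            continue
        # Phase is unique here: one haplotype carries allele '1' only at
        # homozygous-'1' sites, the other wherever any '1' allele is present.
        h1 = ("1" if g1 == 2 else "0") + ("1" if g2 == 2 else "0")
        h2 = ("1" if g1 >= 1 else "0") + ("1" if g2 >= 1 else "0")
        base[h1] += n
        base[h2] += n

    n_dh = counts.get((1, 1), 0)
    freqs = {h: 0.25 for h in base}
    for _ in range(n_iter):
        # E-step: split double heterozygotes between 00/11 and 01/10 phases.
        cis = freqs["00"] * freqs["11"]
        trans = freqs["01"] * freqs["10"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        expected = dict(base)
        expected["00"] += n_dh * w
        expected["11"] += n_dh * w
        expected["01"] += n_dh * (1 - w)
        expected["10"] += n_dh * (1 - w)
        # M-step: convert expected haplotype counts back to frequencies.
        new = {h: c / n_chrom for h, c in expected.items()}
        delta = max(abs(new[h] - freqs[h]) for h in freqs)
        freqs = new
        if delta < tol:
            break
    return freqs


# Toy data (hypothetical): 4 x (0,0), 2 x (2,2), 4 x double heterozygotes.
toy_freqs = em_two_snp_haplotypes([(0, 0)] * 4 + [(2, 2)] * 2 + [(1, 1)] * 4)
```

In the toy data no individual unambiguously carries a 01 or 10 haplotype, so the iterations push the double heterozygotes toward the 00/11 phase.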
A two-sided χ2 test or independent-samples t test was used to compare gender, age and smoking status between NSCLC cases and controls. Hardy-Weinberg equilibrium (HWE) analysis for genotype distribution in controls was carried out by a χ2 goodness-of-fit test. Differences in genotype and allele frequencies between cases and controls were assessed using the χ2 test. Logistic regression was performed to assess OR and 95% CI, which were adjusted for gender, age and smoking status. All the statistical analyses were implemented with SAS 9.1.3. The statistical significance cutoff was P<0.05.
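The HWE goodness-of-fit test mentioned above can be sketched for a single biallelic SNP as follows. This is a minimal illustration in Python rather than the SAS procedure used in the study. The function name and the genotype counts in the example are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts for AA, AB and BB and returns
    (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts (not data from this study).
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)     # matches HWE expectations exactly
chi2_bad, p_bad = hwe_chi_square(40, 20, 40)   # strong heterozygote deficit
```

A P value above the 0.05 cutoff indicates no detectable departure from HWE, which is the condition checked in controls before the association analysis.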
As shown in Table 1, there was no significant difference regarding sex and age between patients with NSCLC and controls in both studies. However, there was a higher proportion of smokers among patients with NSCLC than among controls (P<0.001 and P=0.006, respectively).
The allele and genotype distributions for seven TGFBR1 htSNPs among NSCLC cases and controls are summarized in Table 2. The genotype frequencies of these polymorphisms were in Hardy-Weinberg equilibrium in controls in both studies. No significant difference in allele and genotype frequencies at any of these seven polymorphic sites was observed between NSCLC patients and controls in either study.
D′ and r2 values for these seven polymorphisms were calculated according to the genotyping data reported in Table 2. The different degrees of linkage disequilibrium between cases and controls are summarized in Table 3, and their LD maps measured by D′ in cases and controls are shown in Figure 1. In case-control study 1, four polymorphisms (rs10819638, rs6478974, rs10733710 and rs597457) were in LD with each other in cases (D′>0.8). In contrast, the D′ values of rs10733710 with rs10819638 and rs6478974 and the D′ value of rs6478974 with rs597457 were less than 0.80 in controls. In particular, linkage disequilibrium between rs6478974 and rs10733710 was very weak in controls (D′=0.383, r2=0.014). Moreover, two htSNPs in the 5′ flanking region, rs7040869 and rs4743325, had weaker LD in cases (D′=0.607, r2=0.111) than in controls (D′=0.848, r2=0.237). The LD findings in study 2 are similar to those in study 1 (Table 3 and Figure 1).
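Because D′ and r2 can differ markedly for the same SNP pair, as in several of the pairs above, a short worked sketch of the two standard definitions may help. The Python function below computes both from one haplotype frequency and the two allele frequencies. The function name and the input frequencies are hypothetical and are not taken from Table 3.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D', r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max  # Lewontin's normalized coefficient
    r2 = d * d / denom        # squared correlation between alleles
    return d_prime, r2


# Hypothetical case: allele B (10%) occurs only on haplotypes carrying allele A (50%).
d_prime_ex, r2_ex = pairwise_ld(p_ab=0.1, p_a=0.5, p_b=0.1)
# D' is 1.0 while r2 is only about 0.11
```

The example shows why a pair can have a high D′ and a low r2 at the same time: D′ is scaled by the maximum disequilibrium possible given the allele frequencies, whereas r2 also penalizes mismatched allele frequencies.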
Accordingly, 4-SNP haplotypes (rs10819638, rs6478974, rs10733710 and rs597457) and 2-SNP haplotypes (rs7040869 and rs4743325) were reconstructed according to the genotyping data in NSCLC patients and controls. Using haplotypes with frequencies of more than 0.05 for further analysis, four 4-SNP haplotypes accounted for 90.0% and 92.2% of the corresponding haplotypes in controls of study 1 and 2, respectively; and three 2-SNP haplotypes accounted for 98.1% and 97.5% of the corresponding haplotypes in controls of study 1 and 2, respectively (Table 4). After adjustment for gender, age and smoking status, the 4-SNP CTGC haplotype was significantly more common in controls than in cases in both case-control studies (P=0.014; adjusted OR, 0.09; 95% CI, 0.01-0.61; and P=0.010; adjusted OR, 0.11; 95% CI, 0.02-0.59, respectively), while the frequencies of the 2-SNP haplotypes were not significantly different between NSCLC patients and controls. Moreover, as summarized in Table 5, combined analysis of both studies shows an association of this 4-SNP haplotype with decreased NSCLC risk (adjusted OR, 0.11; 95% CI, 0.03-0.39). Interestingly, four individuals were homozygous for the 4-SNP haplotype among controls (4/237) and none among cases (0/233) (P=0.124). We did not observe any association between the 4-SNP haplotype and gender (P=0.745), age assessed either as a categorical (P=0.584) or a continuous (P=0.317) variable, histology (P=0.599), or TNM stage (P=0.804). Importantly, we found that the pairwise LD values between these four SNPs were quite strong, especially for cases in both studies (Supplementary Table 3). These findings provide strong support for the novel notion that the CTGC haplotype is associated with lung cancer risk.
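The test used for the homozygote comparison (4/237 controls versus 0/233 cases) is not named above. As one plausible way to examine such a sparse 2×2 table, the Python sketch below computes a two-sided Fisher's exact test from the hypergeometric distribution. The choice of test and the function name are assumptions, so the result need not reproduce the reported P value exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(col1, row1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )


# Counts from the text: rows are controls and cases, columns are
# CTGC homozygotes and all other individuals.
p_homozygotes = fisher_exact_two_sided(4, 233, 0, 233)
```

With these counts the sketch gives a P value of roughly 0.12, in line with the non-significant result reported for homozygotes.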
Morbidity and mortality of lung cancer have dramatically increased in China in the past decade. It has been suggested that NSCLC results from accumulation of multiple genetic and/or epigenetic aberrations, and genetic variants that modulate susceptibility to complex diseases may be identified through association studies. Recent studies have confirmed that variants in DNA and cell cycle pathway are weakly associated with risk of lung cancer(22). A recent genome-wide association study has identified two markers at 5p15.33 that are associated with risk of lung cancer(23). Recent studies suggest that, compared with single SNP approaches for genetic association studies, analyses based on haplotypes can significantly improve the power of mapping disease genes(24).
This is the first study investigating the association of TGFBR1 haplotypes with risk for NSCLC. We performed tagging SNP and haplotype analyses to comprehensively capture the various genetic variants of TGFBR1 in the Chinese population. No significant differences in allele and genotype frequencies were observed between NSCLC patients and controls, which suggests that none of the individual TGFBR1 SNPs examined in this study is associated with NSCLC risk. However, the 4-SNP TGFBR1 CTGC haplotype was significantly more frequent in controls (10.4% for study 1 and 8.8% for study 2) than in NSCLC patients (2.9% for study 1 and 3.1% for study 2), indicating that this haplotype may confer protection against NSCLC (combined adjusted OR, 0.11; 95% CI, 0.03-0.39). To test the hypothesis that constitutively decreased TGFBR1 signaling enhances cancer risk, we have developed a novel mouse model of Tgfbr1 haploinsufficiency(25). We observed that Tgfbr1+/− mice do not exhibit an obvious phenotype but, when crossed with ApcMin/+ mice, have a dramatically increased susceptibility to colorectal cancer. These findings led us to test this hypothesis in humans and led to the discovery that constitutively decreased TGFBR1 signaling in humans is also associated with dramatically increased colorectal cancer susceptibility(26). This allele-specific quantitative trait is dominantly inherited, and two TGFBR1 haplotypes are associated with a substantially increased risk of colorectal cancer in Caucasians(26). We hypothesize that constitutively decreased TGFBR1 signaling may be associated with increased cancer susceptibility that is not limited to colorectal cancer. Because of the observed protective effect of the TGFBR1 CTGC haplotype with respect to NSCLC risk, we predict that the CTGC haplotype is associated with increased TGF-β signaling.
TGF-β is a potent naturally-occurring inhibitor of cell growth. Decreased TGF-β signaling may increase susceptibility to cancer development(27, 28). There is compelling evidence supporting the concept that TGFBR1 is a tumor suppressor gene, and TGFBR1 mutations are associated with various human cancers, including head and neck cancers and cervical and ovarian carcinomas(11, 29-31). However, such an association has not been found in lung cancer(18). Although polymorphisms of the TGFBR1 gene, including TGFBR1*6A and Int7G24A, are associated with cancer risk in some studies(18, 32, 33), only limited data exist on the role of TGFBR1 htSNPs in NSCLC. TGFBR1*6A, located in exon 1 of TGFBR1, has a deletion of three alanines within a stretch of nine alanines(12). There is accumulating evidence showing that TGFBR1*6A may be associated with risk of breast, ovarian, and cervical cancer as well as with risk for abdominal aortic aneurysm(11-14, 34). Moreover, TGFBR1*6A is somatically acquired in 29.5% of liver metastases from colorectal cancer(35) and enhances MCF-7 breast cancer cell migration(36). Nevertheless, we previously reported no association between TGFBR1*6A and lung cancer(17). Although the common SNP Int7G24A may modify risk of kidney, bladder and invasive breast cancer(15, 16), it was not associated with risk for NSCLC(18). In this study, we selected a total of seven tagging SNPs. No single htSNP was significantly associated with risk of NSCLC, suggesting that interplay between these htSNPs is related to predisposition to NSCLC, similar to what we observed in colorectal cancer(26). In summary, our results suggest that a 4-SNP haplotype of TGFBR1 is associated with significantly decreased risk for NSCLC. These findings warrant additional functional studies as well as validation studies in large series of NSCLC cases and matched controls in various ethnic groups.
We gratefully acknowledge the participation and cooperation of patients with NSCLC and individuals without a history of cancer.
Grant support: National Natural Science Foundation of China (30672400 and 30400533 to H.-T. Z.), Science and Technology Committee of Jiangsu Province (BK2008162 to H.-T. Z.), SRF for ROCS, State Education Ministry (2008890 to H.-T. Z.), Qing-Lan Project of Education Bureau of Jiangsu Province (to H.-T. Z.), R01 GM74913 from NIH (to K.Z.), and R01 CA108741, R01 CA112520 and P60 AR048098 from NIH (to B. P.).