NIH Public Access Author Manuscript; accepted for publication in a peer-reviewed journal.
Cancer Res. Author manuscript; available in PMC Mar 15, 2011.
Published in final edited form as:
PMCID: PMC2855643
NIHMSID: NIHMS171466
A susceptibility locus on chromosome 6q greatly increases lung cancer risk among light and never smokers
C.I. Amos,1 S. M. Pinney,2 Y. Li,1 E Kupert,2 J Lee,2 M. A. de Andrade,3 P. Yang,3 A.G. Schwartz,4 P. R. Fain,5 A. Gazdar,6 J. Minna,6 J.S. Wiest,7 D. Zeng,1 H. Rothschild,8 D. Mandal,8 M. You,9 Teresa Coons,10 C. Gaba,11 J.E. Bailey-Wilson,12 and M.W. Anderson2
1Epidemiology, U.T. M.D. Anderson Cancer Center, Houston, TX
2University of Cincinnati, Cincinnati
3Mayo Clinic College of Medicine, Rochester, MN
4Karmanos Cancer Institute at Wayne State University, Detroit
5University of Colorado, Denver
6UT Southwestern Medical Center, Dallas
7National Cancer Institute, National Institutes of Health, Bethesda, MD
8Louisiana State University Health Sciences Center, New Orleans
9Washington University, St. Louis
10Saccomanno Research Institute, Grand Junction, Colorado, USA
11Medical College of Ohio, Toledo
12National Human Genome Research Institute, Baltimore, MD
Address Correspondence to: Christopher I. Amos, Ph.D. Epidemiology, U.T. M.D. Anderson Cancer Center, 1155 Pressler St, Unit 1340, Houston, TX 77005, TEL: 713-792-3020, FAX: 713-792-0807, camos/at/mdanderson.org
Cigarette smoking is the major cause of lung cancer, but genetic factors also affect susceptibility. We studied families that included multiple relatives affected by lung cancer. Results from linkage analysis showed strong evidence that a region of chromosome 6q affects lung cancer risk. To characterize the effects that this region of chromosome 6q has on lung cancer risk, we identified a haplotype that segregated with lung cancer. We then performed Cox regression analysis to estimate the differential effects that smoking behaviors have upon lung cancer risk according to whether each individual carried a risk-associated haplotype or could not be classified and was assigned unknown haplotypic status. We divided smoking exposures into never smokers, light smokers (<20 pack years), moderate smokers (20 to <40 pack years) and heavy smokers (40 or more pack years). When results were stratified by carrier status and compared with never smokers, carriers showed only weakly increasing risk with increasing smoking exposure, with hazards ratios of 3.44, 4.91, and 5.18 for light, moderate, and heavy smokers, respectively. Among individuals from families without the risk haplotype, the risks associated with smoking increased strongly with exposure, with hazards ratios of 4.25, 9.77, and 11.89 for light, moderate, and heavy smokers, respectively. The never smoking carriers had a 4.71-fold higher risk than the never smoking individuals without known risk haplotypes. These results identify a region of chromosome 6q that increases risk for lung cancer and that confers particularly higher risks to never and light smokers.
Over 40 years ago, Tokuhata and Lilienfeld (1) provided clear epidemiologic evidence for familial aggregation of lung cancer after accounting for personal smoking, suggesting the possible interaction of genes and smoking behavior in the etiology of lung cancer. The familial effects were most pronounced among smoking relatives, among whom relatives of cases showed a 2.4-fold higher risk than smoking relatives of controls. A later study by Ooi et al. (2) similarly showed a 2.5-fold higher risk in the relatives of cases compared with controls after conditioning on smoking behavior and age. In case-control studies, a positive family history has consistently been found to be a risk factor for lung cancer (3-5). The study of Jonsson (5) employed a population-based approach and obtained familial risks of 2.69 comparing parents of cases to controls, and this relative risk increased to 3.48 comparing parents of cases younger than 60 to age-matched controls.
Genetic modeling studies have suggested that at least some of the observed familial aggregation of lung cancer may be due to inheritance of strongly-acting genetic factors. Sellers et al. (6) performed segregation analyses on the families studied by Ooi et al. (2), and found results that were compatible with Mendelian codominant inheritance of a rare major autosomal gene that acts in conjunction with cigarette smoking to produce earlier age of onset of the cancer (6). Under this model, average smoking heterozygotes had relative risks of 14, 11.8 and 6.2 at ages 50, 60 and 70, respectively, compared with average smoking non-carriers. However, the model that was fitted could not allow for a possible interaction between unmeasured genetic effects and the measured environmental factor, tobacco smoke and could not evaluate the potential effects of multiple genetic factors. Gauderman et al. (7) applied a Gibbs sampling method to examine gene - environment interaction models on the same lung cancer data set and found evidence for a dominant major locus with significant effects of smoking, and weak evidence of gene-environmental statistical interaction.
The Genetic Epidemiology of Lung Cancer Consortium (GELCC) identified a region of chromosome 6q that cosegregates with lung cancer susceptibility in families that included 4 or more individuals affected with lung cancer (8). The HLOD score associating lung cancer with the chromosome 6q region increased from 2.79 when families that included 3 or more relatives affected by lung cancer were studied, to 3.47 when families with 4 or more affected individuals were studied, and to 4.26 when multigenerational families with 5 or more lung cancer-affected relatives were analyzed. The HLOD score represents the log10 ratio of the likelihood of the data under a model including linkage to the likelihood under a model without linkage, assuming genetic heterogeneity; that is, assuming that only a proportion, α, of the families are linked to the region and that the remaining proportion, 1-α, of the families do not show evidence of linkage to this region. Hence the best model for these multigenerational families allowing for linkage in the presence of genetic heterogeneity was about 18,000 times more likely than a model excluding linkage, and this result yields a p value less than 1 × 10^-5 (9).
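Because a LOD score is a base-10 logarithm of a likelihood ratio, the "about 18,000 times more likely" figure can be recovered directly from the HLOD of 4.26. The short sketch below only performs that conversion; the function name is illustrative and is not taken from any linkage package.

```python
def lod_to_likelihood_ratio(lod: float) -> float:
    """Convert a LOD score (log10 likelihood ratio) to a likelihood ratio."""
    return 10.0 ** lod

# HLOD scores quoted in the text for families with >=3, >=4 and >=5 affected relatives
for hlod in (2.79, 3.47, 4.26):
    ratio = lod_to_likelihood_ratio(hlod)
    print(f"HLOD {hlod:.2f} corresponds to a likelihood ratio of about {ratio:,.0f}")
```

For the HLOD of 4.26 this gives a ratio a little above 18,000, in line with the statement in the text.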
These data suggest segregation of a dominant major gene in a subset of families that show excess lung cancer. Preliminary evaluation of smoking behavior in the families studied by Bailey-Wilson et al., (8) provided some evidence for a differential effect of smoking among individuals who are carriers of the chromosome 6q susceptibility locus compared to noncarriers. The goal of this study is to further characterize the impact that smoking behavior has upon susceptibility to lung cancer according to whether or not a family is segregating a risk allele at the 6q susceptibility locus and to provide updated results from linkage analyses including an additional 41 families that have been genotyped since our first report.
The methods for sample collection have been summarized by Bailey-Wilson et al., (8). Samples and data have been collected by the familial lung cancer recruitment sites of the Genetic Epidemiology of Lung Cancer Consortium (GELCC): University of Cincinnati, University of Colorado, Karmanos Cancer Institute, Louisiana State University Health Sciences Center, Mayo Clinic, Johns Hopkins University and Medical College of Ohio. Of the 28,085 lung cancer cases screened at GELCC sites for use in this report, 23.7% had at least 1 first degree relative with lung cancer (details by data collection site shown in Table 1). Families were identified from the Mayo Clinic and Karmanos Cancer Institute as a part of ongoing case-series based in these hospitals. All other sites accrued patients by physician referral, and in addition, some patients were self-referred to the Johns Hopkins, Karmanos Cancer Institute and University of Cincinnati sites. All sites accrued participants to IRB-approved protocols and obtained informed consent from each participant, and the analytical site at the U.T. M.D. Anderson Cancer Center also maintained an IRB-approved protocol for analysis of the data.
Table 1
Total Number of Lung Cancer Cases and Families Accrued by the Data Collection Sites
The pedigree development process began at all GELCC sites by screening lung cancer cases for family history (focusing on number of first degree relatives affected with lung cancer). After the initial screening process, we collected additional data from 3,827 willing probands or their family representatives regarding additional cancer affected persons in the extended family, vital status of cancer-affected individuals, availability of archival tissue, and willingness of family members to participate in the study. We then initiated full pedigree development and biospecimen collection on 871 families, most with ≥3 affected relatives. We eliminated the majority of these families from further study because they did not contain enough family members with lung cancer from whom blood samples or non-tumor tissues could be obtained for genotyping or, if affected member(s) were deceased, who had children willing to participate in the study, from whom the genotype of the affected parent could be deduced. To date, 93 families that include genetic information for at least 2 lung-cancer affected relatives have been genotyped, representing 0.3% of the cases we screened and 2.4% of the potential families that were identified (Table 1).
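The two percentages quoted for the genotyped families can be reproduced from the counts given in the text. The sketch below assumes that the "potential families" denominator is the 3,827 probands or family representatives who provided additional data; that reading is an inference from the numbers, not something the text states explicitly.

```python
screened_cases = 28085        # lung cancer cases screened at GELCC sites
potential_families = 3827     # assumed denominator for "potential families"
genotyped_families = 93       # families genotyped to date

pct_of_cases = 100.0 * genotyped_families / screened_cases
pct_of_families = 100.0 * genotyped_families / potential_families

print(f"{pct_of_cases:.1f}% of screened cases")
print(f"{pct_of_families:.1f}% of potential families")
```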
Data on tumors in the families have been obtained by requesting pathology reports, death certificates and original tumor blocks and slides, where available. When tumor blocks or slides could be obtained, they were transmitted to the tumor pathology core, headed by Adi Gazdar at the University of Texas Southwestern Medical Center. Otherwise, tumor histology was assigned according to pathology report or death certificate. Cancer diagnoses could not be verified for 72 of the 489 subjects from the 93 families who were reported by relatives in the families to have had cancers.
Sample Preparation and Genotyping
Blood, buccal cells, and archival biospecimens have been used as sources of DNA for genotyping family members of the lung cancer kindreds. DNA isolated from blood has been genotyped at the Center for Inherited Disease Research (CIDR; an NIH-supported core research facility) and DNA from buccal cells, archival tissue or sputum were genotyped at the University of Cincinnati.
DNA from archival tissue for genotyping was obtained from ten 10 μm paraffin sections containing normal tissue. The archival tissue blocks were examined at University of Texas Southwestern, and sections of normal tissue were prepared for genotyping at the University of Cincinnati. We required the specimen to have at least 50% normal cells for genotyping to ensure that the germline rather than the tumor genotype was observed. DNA was isolated from paraffin sections and sputum samples by a modified Wright and Manos [10,10] procedure, performed by incubating the tissue with 0.5 μg/μl of Proteinase K in 1× PCR buffer with NP-40 and Tween 20 for 1 hour at 55°C. This was followed by a 95°C incubation for 10 minutes to inactivate the Proteinase K, and then treatment of the isolated DNA with 24:1 (v:v) chloroform:isoamyl alcohol. DNA was isolated from the buccal cells and from whole blood using the Puregene Kit (Gentra Systems Inc., Minneapolis, MN) following the manufacturer's protocols.
The CIDR global genotyping set consisted of 392 markers (15 families genotyped from 1998-2000) or 388 markers (78 families). PCR amplifications, using the primer set for each of the markers, were performed at CIDR and the University of Cincinnati. The standard protocol for PCR at CIDR can be found on the CIDR Web site (http://www.cidr.jhmi.edu). Conditions for genotyping markers using archived DNA were similar to CIDR's protocol, but with a modification of increasing the number of amplification cycles to 35. All samples were amplified in an MJ Research Thermocycler. Briefly, the cycles were as follows: 95° for 12 minutes; 94° for 45 seconds; 55° for 1 minute; 72° for 1 minute; for an initial 10 cycles; then 89° for 1 minute; 55° for 1 minute; 72° for 1 minute; for an additional 20 cycles; followed by a final extension at 72° for 10 minutes. PCR amplifications were performed using a single fluorescently labeled primer obtained from CIDR. Following the reactions, PCR products were resolved on an ABI 3100 automated DNA sequencer and analyzed with genotype software. Due to the reduced amounts of genomic DNA in the archived samples, none of the amplification products were pooled prior to loading onto the 96 wells of a plate for subsequent analysis.
Integrating genotype data across platforms and quality control procedures
Assignment of alleles generated at CIDR and the University of Cincinnati was accomplished by genotyping several samples in common for each gel (or plate) at both facilities. These common samples included CEPH controls 1331-01 and 1331-02 as well as several lymphocyte DNA samples from members of the lung cancer families.
Our first step in evaluating the genetic data was to appropriately bin the allele lengths. To allow us to jointly analyze data across the different platforms used at CIDR and the University of Cincinnati (UC), we first compared the raw allele lengths for 16 subjects who had been genotyped on both platforms. We next fitted a linear regression to predict CIDR lengths from the UC data, and identified errors in the data as alleles that failed to satisfy the criterion Distance = |cos(arctan(b)) × (ŷ − y)| < 1, where b is the fitted slope, y is the observed value, and ŷ is the predicted value of a point. The prediction of allele lengths between centers yielded an R-squared value greater than 99% for all but 2 markers (which had R-squared values of 97% and 98%, respectively). However, the intercepts were routinely different from 0, indicating a shift in allele lengths between labs, and the slope often differed from 1, indicating that without regression adjustment, alleles at the extremes could have been misclassified.
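The distance criterion above is the perpendicular distance from an observed point to the fitted regression line, since multiplying the vertical residual by cos(arctan(b)) projects it onto the normal of a line with slope b. A minimal sketch of the check follows, assuming ordinary least squares on paired allele lengths; the allele lengths are invented for illustration and the function names are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def flag_discordant(x, y, threshold=1.0):
    """Return (a, b, indices) where indices fail |cos(arctan(b)) * (y_hat - y)| < threshold."""
    a, b = fit_line(x, y)
    scale = math.cos(math.atan(b))  # converts a vertical residual to a perpendicular distance
    flagged = []
    for i, (xi, yi) in enumerate(zip(x, y)):
        y_hat = a + b * xi
        if abs(scale * (y_hat - yi)) >= threshold:
            flagged.append(i)
    return a, b, flagged

# Hypothetical allele lengths (base pairs) for samples typed on both platforms;
# the fourth CIDR value is deliberately shifted to mimic a miscalled allele.
uc_lengths = [100, 102, 104, 106, 108, 110, 112, 114]
cidr_lengths = [102, 104, 106, 113, 110, 112, 114, 116]

intercept, slope, flagged = flag_discordant(uc_lengths, cidr_lengths)
print(f"intercept={intercept:.2f}, slope={slope:.3f}, flagged indices={flagged}")
```

Whether the regression was refitted after removing flagged alleles is not stated in the text; this sketch performs a single pass.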
The programs Relative (10) and PREST (11) were used to verify relationships among individuals in the data. SIBPAIR (12) and PEDCHECK (13) were used to check for Mendelian inconsistencies. All such errors were corrected by eliminating the genotypes indicated to have been most likely to cause errors. Once verification of pedigree structures and elimination of marker inconsistencies had been completed we estimated allele frequencies for the chromosome 6 linkage analysis using maximum likelihood methods as discussed by Boehnke(14). To perform this analysis we used the FastIlink program which is a module of Fastlink (15). To allow for both genotyping heterogeneity and racial heterogeneity in allele frequencies we estimated allele lengths separately for Caucasians and non-Caucasians and by genotyping set (three sets of samples were separately analyzed by CIDR).
LOD score analyses and haplotyping
Our primary analytical approach in analysis of data from the GELCC assumed a model with 10% penetrance in carriers and 1% penetrance in the noncarriers. This analytical approach weights information only from the affected subjects (16) and so provides an essentially model-free analysis. To obtain linkage results we used SIMWALK2 (17) and calculated heterogeneity LOD scores (18) from the output using Perl scripts we have developed and provide on our website (www.epigenetic.org). In this analysis we estimated the evidence for linkage from each family separately using the Markov chain Monte Carlo (MCMC) procedure provided by SIMWALK2. MCMC analysis was used to estimate LOD scores because the pedigrees were too large to permit exact multipoint computation of the likelihood of the data. The LOD scores from each family were then combined allowing for an additional heterogeneity parameter, which models the effect on the LOD score that will occur if not all of the families are linked to a specific region. We performed all analyses separately within genotyping set and within racial group to avoid any issues that might arise if marker alleles were not faithfully mapped among studies. Results from each LOD score analysis were then summed across study and ethnicity to obtain the final results.
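The combination of per-family LOD scores with a heterogeneity parameter can be illustrated with the standard admixture formulation, in which the HLOD at a given position is the maximum over α of the sum over families of log10(α × 10^LOD + 1 − α). The sketch below is written in Python rather than Perl, uses a simple grid search, and operates on invented per-family LOD scores; it is an illustration of the formula under those assumptions, not the consortium's scripts.

```python
import math

def hlod_at_alpha(family_lods, alpha):
    """Admixture LOD at one map position for a given proportion of linked families."""
    return sum(math.log10(alpha * 10.0 ** lod + (1.0 - alpha)) for lod in family_lods)

def max_hlod(family_lods, steps=1000):
    """Grid search over alpha in [0, 1]; returns (best_alpha, best_hlod)."""
    best_alpha, best_value = 0.0, 0.0  # alpha = 0 always gives an HLOD of exactly 0
    for k in range(steps + 1):
        alpha = k / steps
        value = hlod_at_alpha(family_lods, alpha)
        if value > best_value:
            best_alpha, best_value = alpha, value
    return best_alpha, best_value

# Hypothetical per-family LOD scores at a single position (not data from this study)
example_lods = [0.8, 0.5, -0.6, -1.2, 0.4, -0.3]

alpha_hat, hlod = max_hlod(example_lods)
print(f"estimated proportion linked = {alpha_hat:.3f}, HLOD = {hlod:.3f}")
```

At α = 1 the expression reduces to the ordinary summed LOD score, and at α = 0 it is zero, so the HLOD can never be smaller than either of those values.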
To obtain haplotypes for the linked region of chromosome 6q, we used a feature of SIMWALK2 that assigns marker genotypes to haplotypes using markers in the linked region including D6S2436 and D6S1035, covering the region from 155 to 165 cM on chromosome 6q. We then integrated the haplotype data onto pedigree drawings that we developed using Progeny. Finally, where possible, in 40 multigenerational pedigrees we visually identified haplotypes that cosegregate with disease susceptibility by tracing the segregation of haplotypes with disease in families. This tracing procedure was only possible in families that supported evidence for linkage and included multiple generations. To assign phase, it was helpful to have more than one generation available for study. In addition, to assign a haplotype indicating risk we required that the family provide positive support for linkage. There were 2 families that appeared to segregate two risk haplotypes because of bilineality in the family (i.e. the inheritance of disease susceptibility appeared to segregate from both parents of the proband). Then, conditional upon the carrier status and smoking behavior of subjects, we performed Kaplan-Meier analyses and Cox regression analysis to assess the relationship between smoking behavior and lung cancer risk, according to the carrier status of the subjects we were studying. We defined never smokers as individuals who had smoked fewer than 100 cigarettes, light smokers as individuals who reported having smoked less than 20 pack years, moderate smokers as those with 20 or more but less than 40 pack years of exposure, and heavy smokers as those with 40 or more pack years of exposure.
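The smoking strata used in the Cox models can be written as a small classification rule. The sketch below follows the cut points given in the text, places the boundary value of exactly 40 pack years in the heavy group as in the abstract ("40 or more pack years"), and uses hypothetical argument names.

```python
def smoking_category(pack_years: float, lifetime_cigarettes: int) -> str:
    """Assign a smoking stratum from pack years and lifetime cigarette count."""
    if lifetime_cigarettes < 100:
        return "never"      # smoked fewer than 100 cigarettes in total
    if pack_years < 20:
        return "light"      # less than 20 pack years
    if pack_years < 40:
        return "moderate"   # 20 to <40 pack years
    return "heavy"          # 40 or more pack years

# One pack year is 20 cigarettes per day for a year, so 7,300 cigarettes
for py in (0.0, 5.0, 20.0, 39.9, 40.0, 50.0):
    print(py, smoking_category(py, lifetime_cigarettes=int(py * 7300)))
```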
To adjust for nonrandom sampling of individuals into our study, we also used a previously developed approach that weights the cases and controls according to population-based incidence rates of cancer (19). Specifically, we obtained incidences of lung cancer for 5 year age-intervals from statistics compiled by the American Cancer Society (Cancer Facts and Figures 2008 from http://www.cancer.org/), averaging rates for males and females. These age-specific incidences were then used to obtain sampling fractions to weight the proportion of cases versus controls in each age-stratum. The weights so derived are presented in supplementary table 1 and show that there were more cases observed in the sample than expected from a population based sample. Therefore, for all intervals, we downweighted the information from cases compared with controls in the subsequent analysis. While this weighted Cox regression approach eliminated bias in simulation studies, it also had reduced power compared with unweighted analysis. Therefore, we present results from both analyses.
The 93 families that have been studied include 489 persons affected with lung cancer of whom 45 are unrelated (marrying-in to the pedigree) and 444 are related to other affected family members, and informative for linkage analysis. From these families we have accrued 1,156 blood samples, 24 buccal cell samples, 58 sputum samples, and 274 archival blocks containing normal tissue. Archival tumor blocks of lung cancer-affected subjects have been collected from 186 persons and 88 blocks from other tissues. When other sources of DNA were not available, we used archival tissue blocks for genotyping. Where possible, since we are interested in studying the coinheritance of lung cancer with genetic markers present in the germline, we have performed analyses on tumor blocks from non lung cancer specimens. Otherwise, when lung cancers have been studied, one of us (AG) has retrieved normal tissue from the tumor margins. Of the 93 families, three are African-American and 1 family has mixed racial composition (African-American, Creole, and Caucasian); the remaining 89 families are Caucasian.
Lung cancer-affected individuals are 63.4% male, 81.8% deceased at the time of data collection, and 86.3% ever smokers, with a median value of 50 pack years. For the unaffected individuals who reported cigarette smoking history data, 73.2% were ever smokers with a median pack year value of 26, a generally higher level of smoking than in the general population (20). However, because these persons come from families with a strong history of lung cancer among smoking relatives, and smoking aggregates in families, they are more likely to be smokers. Smoking histories for deceased individuals were obtained from surrogates. Numerous studies have reported that surrogate-reported data are about 90%-95% accurate for smoking status, but usually underestimate pack years (21-25). Cancer status has been verified with medical records, cancer registry data or death certificates on 417 (85.3%) of the 489 lung cancer-affected persons. Pathology reports were obtained whenever possible, i.e., when the tissue sample was obtained for diagnosis, medical records could be located and the patient or family had signed a medical record release. The distribution of cell type of lung cancer was similar to that reported in the past for the general population (26). In the 59 families studied by Bailey-Wilson et al. (8), of 224 lung cancer-affected persons on whom we have pathology reports, 75 (33.5%) had adenocarcinoma, 69 (30.8%) had squamous cell carcinoma, and 22 (9.85%) had small cell carcinoma. Seven families presented with predominantly either adenocarcinoma (N=3) or squamous cell carcinoma (N=4).
Two pedigree characteristics that affect informativeness for linkage analysis are the number of affected persons in the family and the number of generations with affected persons (Table 2). For assessing informativeness, we count only affected persons who have at least a third-degree relationship to another affected person. In bilineal families, we count only those in the predominant lineage with lung cancer when both parents are lung cancer affected (Table 2). Because at least some of the families with only 2 or 3 affected relatives may not segregate effects from a major susceptibility factor but may rather reflect chance clustering of lung cancer, we have separated this group into subset 1. Similarly, families that include 5 or more affected relatives in 2 or more generations are most likely to segregate a dominantly inherited locus that increases susceptibility, and these families have been denoted as subset 2. Families that include 4 or more relatives in a sibship are denoted subset 3. The median number of affected persons per family is 5. In the 93 families, there are 66 families with 4 or more affected and 57 of these families have affected persons in more than 1 generation (subset 4, results in supplementary figures). Of the 50 families with 5 or more affected persons, 47 have affected persons in multiple generations. Linkage analyses of chromosome 6 show that families with 5 or more affected persons in multiple generations exhibited linkage to chromosome 6q.
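The subset definitions can be summarized as a rule applied to each family. The sketch below is one possible reading of the text: it assumes that subset 3 refers to 4 or more affected relatives within a single sibship and that a family may belong to more than one subset; the argument names are hypothetical.

```python
def family_subsets(n_affected: int, n_generations_affected: int,
                   max_affected_in_sibship: int) -> set:
    """Return the analysis subsets (1-4) that a family falls into."""
    subsets = set()
    if n_affected in (2, 3):
        subsets.add(1)  # few affected relatives; may reflect chance clustering
    if n_affected >= 5 and n_generations_affected >= 2:
        subsets.add(2)  # 5 or more affected in 2 or more generations
    if max_affected_in_sibship >= 4:
        subsets.add(3)  # 4 or more affected relatives in one sibship (assumed reading)
    if n_affected >= 4 and n_generations_affected >= 2:
        subsets.add(4)  # 4 or more affected in more than 1 generation
    return subsets

print(family_subsets(n_affected=3, n_generations_affected=1, max_affected_in_sibship=2))
print(family_subsets(n_affected=6, n_generations_affected=3, max_affected_in_sibship=4))
```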
Table 2
Number of lung cancer affected individuals in families, having at least a third-degree relationship to each other
Linkage and haplotype analyses of risk
Maximal heterogeneity LOD (HLOD) scores from genomewide linkage analyses are presented in Table 3. Results from linkage analyses are presented in Figure 1 and Supplementary Figure 1. In Figure 1 we present the results of linkage analysis on chromosome 6, while Supplementary Figure 1 provides results for other chromosomes that yielded an HLOD score > 1.0 in any subset. The proportion of families estimated to be linked in the heterogeneity LOD score analysis was 0.53 for the entire dataset, and for subsets 1-3 the estimates were 0.74, 1.0, and 0.35, respectively. Of the entire set of 93 families, 10 had a LOD score over 0.3 on chromosome 6q at 158 cM.
Table 3
Maximum heterogeneity LOD scores over 1.0 in linkage analysis of any subset
Figure 1
Heterogeneity LOD scores from analysis of chromosome 6 for 93 families selected to include multiple relatives with lung cancer. Subset 1 includes families with 2 or 3 individuals affected by lung cancer, Subset 2 includes families with 5 or more individuals (more ...)
Further analysis of the impact of smoking on risk for cancer was carried out as indicated above by first defining carrier status and then by performing Cox regression modeling treating the intensity of smoking as an ordinal variable. There were 292 individuals who carried a risk haplotype, 441 who were in families segregating a risk haplotype but were noncarriers of that haplotype, and 2248 individuals for whom carrier status could not be derived and who were classified as having unknown carrier status. Figure 2 shows results from Kaplan-Meier analysis, indicating that among carriers the overall risk for lung cancer was higher than among noncarriers. There is also significantly higher risk for lung cancer among ever compared with never smokers, as assessed by the log-rank test. However, among smoking carriers there was no evidence for increasing risk with an increasing exposure level to cigarette smoke (p=0.36). On the other hand, among noncarriers (p=0.085) and individuals with unknown carrier status (p=0.0008) a more usual dose-effect relationship between smoking and lung cancer risk is observed. These findings suggest that any level of tobacco exposure increases risk among those with inherited lung cancer susceptibility, and that such individuals should be heavily targeted for smoking prevention and monitored by early detection procedures. Compared with never smokers (Table 4a), carriers had hazards ratios of 3.44 (95% CI = [1.40, 8.48], p=0.007) for light smokers, 4.91 (95% CI = [2.46, 9.8], p<0.0001) for moderate smokers and 5.18 (95% CI = [2.81, 9.56], p<0.0001) for heavy smokers. Among noncarriers, no events occurred in never smokers, so that hazards ratios could not be estimated. For unknown carrier status there was a much stronger effect of smoking, with all groups having highly significant differences from never smokers (p<0.0001).
For those light smokers with unknown carrier status, the hazards ratio compared to never smokers was 4.25 (95% CI = [2.11, 8.54]), for moderate smokers the hazards ratio was 9.77 (95% CI = [5.9, 16.20]) and for heavy smokers the hazards ratio was 11.89 (95% CI = [7.59, 18.61]). When the analyses were adjusted for excess selection for affected individuals (Table 4b), we found very little trend in carriers, with the hazards ratios in carriers being 2.67 (95% CI = [1.22, 5.86]), 2.34 (95% CI = [1.37, 3.98]), and 2.75 (95% CI = [1.74, 4.37]) in light, moderate, and heavy smokers respectively, while for those with unknown carrier status the hazards ratios were 3.00 (95% CI = [1.64, 5.88]), 5.20 (95% CI = [3.67, 7.58]) and 7.32 (95% CI = [5.28, 10.14]) respectively for light, moderate and heavy smokers.
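A quick internal-consistency check on the unadjusted estimates quoted above is possible if the confidence intervals are Wald intervals that are symmetric on the log scale, which is the usual output of Cox regression software but is an assumption here rather than something the text states. Under that assumption the hazards ratio equals the geometric mean of the interval limits, and the standard error of the log hazards ratio can be recovered from the interval width. The function names below are illustrative.

```python
import math

def implied_hazard_ratio(ci_low: float, ci_high: float) -> float:
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.sqrt(ci_low * ci_high)

def implied_log_se(ci_low: float, ci_high: float, z: float = 1.96) -> float:
    """Standard error of log(HR) implied by a 95% Wald interval."""
    return (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)

# (reported HR, lower limit, upper limit) for the unadjusted analysis, as quoted in the text
reported = {
    "carriers, light": (3.44, 1.40, 8.48),
    "carriers, moderate": (4.91, 2.46, 9.80),
    "carriers, heavy": (5.18, 2.81, 9.56),
    "unknown status, light": (4.25, 2.11, 8.54),
    "unknown status, moderate": (9.77, 5.90, 16.20),
    "unknown status, heavy": (11.89, 7.59, 18.61),
}

for label, (hr, low, high) in reported.items():
    print(f"{label}: reported {hr:.2f}, implied {implied_hazard_ratio(low, high):.2f}, "
          f"SE(log HR) {implied_log_se(low, high):.3f}")
```

For each of these six estimates the implied value agrees with the reported hazards ratio to within rounding.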
Figure 2
Time to lung cancer among carriers (left panel), noncarriers (middle panel), and individuals with unknown carrier status (right panel). Smoking strata are shown with the black line reserved for nonsmokers, the red line for light smokers (1-19 pack years), (more ...)
Table 4
Table 4a. Comparison of risks for light, moderate, and heavy smokers versus nonsmokers stratified by carrier status, without adjustment for sampling through multiple affected relatives.
An alternative approach to evaluating risk compares risk among carrier groups, conditioning on smoking behavior (supplemental table 2a). Using individuals with unknown carrier status as the referent, for never smokers the hazards ratio for noncarriers was 0 (no events, p=0.99) and 4.71 for carriers (95% CI = [2.35, 9.43], p<0.0001). For light smokers the hazards ratio was 1.08 for noncarriers (95% CI = [0.31, 3.83], p=0.90) and 4.34 for carriers (95% CI = [1.76, 10.7], p=0.0001). For moderate smokers the hazards ratios were 0.83 for noncarriers (95% CI = [0.41, 1.165], p=0.59) and 2.51 for carriers (95% CI = [1.53, 4.13], p=0.0003). For heavy smokers the hazards ratio was 0.83 for noncarriers (95% CI = [0.54, 1.29], p=0.41) and 2.21 for carriers (95% CI = [1.65, 2.97], p<0.0001). Thus, there is no significant difference in risk between noncarriers and those with no known haplotype (unknown carrier status) at any level of smoking. Among carriers, however, the increased risk is most prominent in never smokers. Nevertheless, as shown in Figure 2, any degree of smoking confers a marked increase in risk beyond this baseline. The decrease in hazards ratios with increasing smoking reflects the rising risk among noncarriers of risk haplotypes as smoking exposure increases, whereas carriers have comparable risks for lung cancer at any degree of smoking exposure.
Inclusion of additional families collected and analyzed since our 2004 report continues to support evidence for linkage in the 6q region. Among 22 new families that have been collected, 4 showed substantive evidence for linkage (LOD > 0.3), with one family yielding a LOD score of 0.826, while for the entire set of 93 families, 10 showed substantive evidence for linkage (LOD > 0.3). Interestingly, analysis including all the families now shows a bifurcation in the linkage signal around D6S1048. Aside from the linkage studies we report here, there has also been a report of linkage of mesothelioma susceptibility to the same region of chromosome 6q from a family study in an area of Turkey exposed to mineral fibers (27).
Further association analysis of the chromosome 6q region identified one locus that influences lung cancer susceptibility (28). The identified gene, RGS17, is a signalling protein with homology to opioid receptors that has an oncogenic effect in cell culture. While genetic analysis has shown a strong effect of this locus in selected high risk families that we have studied, its effects remain insufficient to explain the high penetrance observed in Figure 2. Therefore, additional variability either in the promoter region of RGS17 or in additional linked loci seems likely to explain the high penetrance observed in these families. It is possible that the region on 6q harbors more than a single genetic locus influencing susceptibility to lung cancer. Towards the aim of fully querying the region of chromosome 6q, the Genetic Epidemiology of Lung Cancer Consortium is performing a comprehensive resequencing effort for all of the loci within a 10 megabase region of the linkage peak. Because we cannot yet fully identify all of the risk alleles for lung cancer that exist on chromosome 6q, we have used a haplotype-based approach to identify individuals who are at increased lung cancer risk.
Statistical modeling of the risk for cancer among those carrying a haplotype associated with increased lung cancer risk showed evidence for an interaction between exposure to smoking and inherent susceptibility to lung cancer. Among those with inherited susceptibility to lung cancer, the risk for lung cancer among never smokers was higher than that among never smokers who did not inherit susceptibility. However, the more dramatic observation from our analysis was the finding that any degree of smoking yielded a similar and substantive increase in risk for developing lung cancer among carriers of inherited susceptibility, while there was a quantitative increase in risk according to the increasing level of smoking among individuals whom we did not infer to carry a lung cancer susceptibility haplotype. The observation that environmental factors can have striking effects upon individuals with inherited susceptibility to disease parallels many observations in medical genetics. For example, individuals with metabolic deficiencies in phenylalanine hydroxylase or porphobilinogen deaminase are greatly adversely affected by exposure to even small levels of, respectively, phenylalanine (29) or barbiturates or other drugs (30). Therefore adverse response to even small amounts of exogenous compounds such as those present in tobacco smoke may be a particular effect of the genetic locus we have identified on chromosome 6q. Because we were not able to obtain detailed information about passive smoking in the family study we conducted, we do not know what level of exposure never smokers have had in our families, but it is possible that the elevated risks we observe in never smoking carriers reflect in part their exposure patterns.
While our ongoing studies of families collected by the Genetic Epidemiology of Lung Cancer Consortium (GELCC) continue to support effects on risk of a locus on chromosome 6q, we also report here additional evidence for loci on chromosomes 6p, 1q, 8q, and 9p in subsets of families that have multiple relatives affected with lung cancer. To characterize these regions of linkage more fully and to refine the region of linkage on chromosome 6q, further efforts to identify familial lung cancer cases and families are underway. Families that include multiple affected relatives are informative for identifying highly penetrant causal genetic factors (31). The GELCC is pursuing initial genome-wide SNP-based association studies as well as resequencing. Initially, the resequencing efforts by the GELCC have targeted selected regions that showed linkage as well as known candidate loci, but we anticipate that as more global resequencing becomes cost-effective we will seek to adopt this strategy. The continued collection of families with multiple affected relatives will allow us to identify additional loci through both linkage and association studies.
Supplementary Material
Acknowledgments
This research was partially supported by NIH grants P30ES06096, P30CA016772, UO1CA076293, R01CA133996, NIEHSP30ES06096, RO1CA060691, RO1CA87895, P50CA70907 and NO1PC35145 and the intramural programs of the National Cancer Institute and the National Human Genome Research Institute, National Institutes of Health.
References
1. Tokuhata GK, Lilienfeld AM. Familial aggregation of lung cancer in humans. J Natl Cancer Inst. 1963;30:289–312.
2. Ooi WL, Elston RC, Chen VW, Bailey-Wilson JE, Rothschild H. Increased familial risk for lung cancer. J Natl Cancer Inst. 1986;76(2):217–22.
3. Shaw GL, Falk RT, Pickle LW, Mason TJ, Buffler PA. Lung cancer risk associated with cancer in relatives. J Clin Epidemiol. 1991;44:429–37.
4. Etzel CJ, Amos CI, Spitz MR. Risk for smoking-related cancer among relatives of lung cancer patients. Cancer Res. 2003 Dec 1;63:8531–5.
5. Jonsson S, Thorsteinsdottir U, Gudbjartsson DF, Jonsson HH, Kristjansson K, Arnason S, et al. Familial risk of lung carcinoma in the Icelandic population. JAMA. 2004;292(24):2977–83.
6. Sellers TA, Bailey-Wilson JE, Elston RC, Wilson AF, Elston GZ, Ooi WL, et al. Evidence for mendelian inheritance in the pathogenesis of lung cancer. J Natl Cancer Inst. 1990;82(15):1272–9.
7. Gauderman WJ, Morrison JL, Carpenter CL, Thomas DC. Analysis of gene-smoking interaction in lung cancer. Genet Epidemiol. 1997;14:199–214.
8. Bailey-Wilson JE, Amos CI, Pinney SM, Petersen GM, de Andrade M, Wiest JS, et al. A major lung cancer susceptibility locus maps to chromosome 6q23-25. Am J Hum Genet. 2004;75:460–74.
9. Ott J. Analysis of human genetic linkage. Baltimore: Johns Hopkins University Press; 1999.
10. Goring HH, Ott J. Relationship estimation in affected sib pair analysis of late-onset diseases. Eur J Hum Genet. 1997;5:69–77.
11. McPeek MS, Sun L. Statistical tests for detection of misspecified relationships by use of genome-screen data. Am J Hum Genet. 2000;66:1076–94.
12. SIB-PAIR [computer program]. Brisbane: Queensland Institute for Medical Research; 2009.
13. O'Connell JR, Weeks DE. PedCheck: a program for identification of genotype incompatibilities in linkage analysis. Am J Hum Genet. 1998;63:259–66.
14. Boehnke M. Allele frequency estimation from data on relatives. Am J Hum Genet. 1991;48:22–5.
15. Cottingham RW Jr, Idury RM, Schaffer AA. Faster sequential genetic linkage computations. Am J Hum Genet. 1993;53:252–63.
16. Speer MC. Use of LINKAGE programs for linkage analysis. Curr Protoc Hum Genet. 2006;Chapter 1(Unit 4).
17. Elston RC, Sobel E. Sampling considerations in the gathering and analysis of pedigree data. Am J Hum Genet. 1979;31:62–9.
18. Ott J. Complex traits on the map. Nature. 1996;379:772–3.
19. Antoniou AC, Goldgar DE, Andrieu N, Chang-Claude J, Brohet R, Rookus MA, et al. A weighted cohort approach for analysing factors modifying disease risks in carriers of high-risk susceptibility genes. Genet Epidemiol. 2005;29:1–11.
20. Schoenborn CA, Adams PF, Schiller JS. Summary health statistics for the U.S. population: National Health Interview Survey, 2000. Vital Health Stat 10. 2003;214:1–83.
21. Woo JG, Pinney SM. Retrospective smoking history data collection for deceased workers: completeness and accuracy of surrogate reports. J Occup Environ Med. 2002;44:915–23.
22. Hansen KS. Validity of occupational exposure and smoking data obtained from surviving spouses and colleagues. Am J Ind Med. 1996;30:392–7.
23. Lerchen ML, Samet JM. An assessment of the validity of questionnaire responses provided by a surviving spouse. Am J Epidemiol. 1986;123:481–9.
24. McLaughlin N. Smoking bans, restrictions increase at country's healthcare facilities. Mod Healthc. 1987;17:92.
25. Hyland A, Cummings KM, Lynn WR, Corle D, Giffen CA. Effect of proxy-reported smoking status on population estimates of smoking prevalence. Am J Epidemiol. 1997;145:746–51.
26. Alberg AJ, Samet JM. Epidemiology of lung cancer. Chest. 2003;123(1 Suppl):21S–49S.
27. Below JE, Pluznikov A, Aquino-Michaels K, Nasu M, Paz V, Mossman B, et al. Localization of a Dominant Genetic Susceptibility Factor in Familial Malignant Mesothelioma (Abstract program number 1202). American Society of Human Genetics. Presented at the annual meeting of The American Society of Human Genetics; November, 2008; Philadelphia, Pennsylvania. 2008.
28. You M, Wang D, Liu P, Vikis H, James M, Lu Y, et al. Fine mapping of chromosome 6q23-25 region in familial lung cancer families reveals RGS17 as a likely candidate gene. Clin Cancer Res. 2009;15:2666–74.
29. Paine RS. The variability in manifestations of untreated patients with phenylketonuria (phenylpyruvic aciduria). Pediatrics. 1957;20:290–302.
30. Tschudy DP, Valsamis M, Magnussen CR. Acute intermittent porphyria: clinical and selected research aspects. Ann Intern Med. 1975;83:851–64.
31. Jones S, Hruban RH, Kamiyama M, Borges M, Zhang X, Parsons DW, et al. Exomic sequencing identifies PALB2 as a pancreatic cancer susceptibility gene. Science. 2009;324:217.