In this admixture analysis of 1812 African-Americans, ADMIXMAP results revealed ancestry associations at 1q42 and 3q25. Both of these areas were identified by increased Z-scores in both the case-only and case–control analyses. This is the first reported admixture analysis of lung cancer.
For the excess European ancestry region on chromosome 1, the most significant Z-scores were found near a gene that has been strongly associated with lung cancer risk, Epoxide Hydrolase 1 (EPHX1), located at 1q42.1 (http://www.ncbi.nlm.nih.gov/gene). This gene metabolizes polycyclic aromatic hydrocarbons, which are found in cigarette smoke as well as other sources. A meta-analysis by Kiyohara showed a significantly decreased risk of lung cancer associated with the low-activity variant of the exon 3 (Tyr113His) polymorphism among whites (22). Only two studies of EPHX1 appear to have included African-Americans (23). Wu et al. did not find significant associations between either Tyr113His or His139Arg and lung cancer among African-Americans. When London et al. combined these two polymorphisms, a small decrease in risk was associated with the predicted slow activity genotype. Neither of these two more extensively studied SNPs was included in this admixture panel. It is possible that these or other SNPs in this gene or in this region are responsible for the association we see with lung cancer risk.
Excess African ancestry was displayed on chromosome 3q25. Many studies have reported increased copy number in this region in lung tumors (25). In a 2008 paper, Qian et al. suggest many possible candidate genes in this region, including the oncogene PIK3CA, and report that multiple studies have shown amplification of the 3q region to be much more probable in squamous cell carcinoma cases as compared with adenocarcinoma cases. We did not have sufficient numbers of squamous cell carcinoma cases (n = 172) to repeat our analyses in this subpopulation; however, our adenocarcinoma results add support to this finding. When results were restricted to adenocarcinoma cases, Z-scores were significantly lower in this region.
Several recent GWAS for lung cancer have also identified potential susceptibility gene regions at 15q15, 15q25, 5p15 and 6p21 (6). Within our study, we found excess European ancestry at 6p24 and excess African ancestry at 15q11–13 in the case–control analyses (supplementary Table I is available at Carcinogenesis Online). These areas are a large distance removed from the regions identified in the GWAS (>17 Mb and >14 Mb for the 6p21 and 15q15 regions, respectively). Potential genes of interest in the 15q11–13 region include the genes encoding the inhibitory neurotransmitter GABA receptor gamma 3 (GABRG3, rs8042276), receptor alpha 5 (GABRA5) and receptor beta 3 (GABRB3). Although these 15q11–13 receptor genes have yet to be linked to lung cancer risk or nicotine addiction, GABRA gene clusters on chromosomes 4 and 5 have been associated with nicotine dependence (28). Another potential candidate gene is CHRNA7, located further downstream on chromosome 15q14. While CHRNA3 were both identified as candidate genes in the lung cancer GWAS, CHRNA7 was not. However, it has been shown to be upregulated in lung cell lines exposed to the tobacco-specific nitrosamine NNK (4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone) and estrogen (30). It is also a candidate gene for schizophrenia, where it has been shown to be differentially expressed in smokers compared with nonsmokers (31).
It is possible that we did not see significant associations in the same areas as the GWAS mentioned above because those studies only included white and Asian populations. Furthermore, the regions identified using GWAS methods might be associated with lung cancer risk but might not be as strongly linked to ancestry.
We report both case-only and case–control analyses because these methods complement each other. The case-only analysis is more powerful than the case–control analysis because it compares the ancestry estimate at a given locus, with its inherent variability, to the case’s genome-wide ancestry, which has been estimated over many markers to reduce variability (32). The case–control method compares the admixture at a given locus between the cases and controls (32). Thus, while our Z-scores were sometimes lower for the case–control results, they validate the case-only results in these regions by ruling out the possibility that the findings are due to selection in these regions among African-Americans.
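The contrast between the two designs can be sketched with a deliberately simplified toy calculation. This is not ADMIXMAP's score test; the function names, the paired and two-sample z-like statistics, and the data values are all hypothetical and chosen only to show which quantities each design compares.

```python
from math import sqrt
from statistics import mean, stdev


def case_only_z(locus_ancestry, genome_ancestry):
    """Paired z-like statistic among cases only (illustrative assumption).

    Compares each case's ancestry at one locus with that same case's
    genome-wide ancestry, which serves as the individual's own baseline.
    """
    diffs = [loc - gen for loc, gen in zip(locus_ancestry, genome_ancestry)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def case_control_z(case_locus, control_locus):
    """Two-sample z-like statistic (illustrative assumption).

    Compares ancestry at one locus between cases and controls.
    """
    se = sqrt(stdev(case_locus) ** 2 / len(case_locus)
              + stdev(control_locus) ** 2 / len(control_locus))
    return (mean(case_locus) - mean(control_locus)) / se


# Hypothetical toy data: proportion of European ancestry per subject.
case_locus = [0.5, 1.0, 0.5, 1.0, 0.5, 1.0]          # cases, at the locus
case_genome = [0.20, 0.25, 0.15, 0.30, 0.20, 0.20]   # cases, genome-wide
control_locus = [0.0, 0.5, 0.0, 0.5, 0.0, 0.0]       # controls, at the locus

print("case-only z-like statistic:   ", round(case_only_z(case_locus, case_genome), 2))
print("case-control z-like statistic:", round(case_control_z(case_locus, control_locus), 2))
```

In this toy setting both statistics are positive because the hypothetical cases carry more European ancestry at the locus than either their own genome-wide average or the controls. A locus under selection in the whole population would be expected to shift cases and controls alike, which is why the case–control comparison serves as a check on the case-only result.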
There are several strengths to this study. The admixture mapping approach to gene discovery is a powerful method that takes advantage of the LD patterns among the recently admixed African-American population so that fewer individuals and fewer SNPs are required than for a GWAS. In addition, this study focused on African-Americans, a population that is known to have different smoking patterns and possibly different genetic risks than white and Asian populations. Although racial differences in LD and smoking behaviors warrant the inclusion of African-Americans in genetic studies, few studies have been done in this population.
A cutoff for genome-wide statistical significance has been proposed for admixture mapping studies in African-Americans at P < 10⁻⁵. In the case-only NSCLC stratum, SNP rs6587361 (P = 1.5 × 10⁻⁵) at the 1q42.13 locus approached this significance level, and among women with NSCLC, rs6587361 (P = 1.4 × 10⁻⁶) met this threshold. Furthermore, we performed a post-hoc power calculation based on the methods of Hoggart et al. for the case-only analysis. Based on the mean observed marker information (37.98) in our sample across the marker map and a two-tailed genome-wide significance threshold of 10⁻⁵, we had 80% power to detect an aRR < 0.40 (or >2.52). This calculation indicates that the sample was reasonably powered to detect the ancestry association signal observed among NSCLC subjects at 1q42.13 (for SNP rs6587361, an aRR of 0.49 among NSCLC subjects overall and an aRR of 0.36 among female NSCLC subjects). Taken together, these results indicate that at 1q42.13, there is a suggestive ancestry association locus for NSCLC.
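To make the relationship between the reported P-values and the genome-wide threshold concrete, the following minimal sketch converts between two-tailed P-values and Z-scores. It assumes the test statistics are approximately standard normal under the null, which is an assumption made here for illustration; the helper names are hypothetical and this is not the code used in the study.

```python
from math import erfc, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed null distribution of the Z-scores
THRESHOLD_P = 1e-5         # two-tailed genome-wide threshold quoted in the text


def z_to_two_tailed_p(z: float) -> float:
    """Two-tailed P-value for a Z-score under a standard normal null."""
    return erfc(abs(z) / sqrt(2.0))


def two_tailed_p_to_z(p: float) -> float:
    """Absolute Z-score corresponding to a two-tailed P-value."""
    return STD_NORMAL.inv_cdf(1.0 - p / 2.0)


# P-values reported in the text for rs6587361 (case-only analysis).
reported = {"NSCLC overall": 1.5e-5, "NSCLC women": 1.4e-6}

print("Z at threshold:", round(two_tailed_p_to_z(THRESHOLD_P), 2))
for label, p in reported.items():
    z = two_tailed_p_to_z(p)
    print(f"{label}: |Z| = {z:.2f}, meets threshold = {p < THRESHOLD_P}")
```

Under this assumption, a two-tailed threshold of 10⁻⁵ corresponds to an absolute Z-score of roughly 4.4, so the overall NSCLC signal falls just short of the cutoff while the signal among women exceeds it.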
The study has a large sample size with >1800 cases and controls and used a well-developed set of ancestry informative markers. Women made up 57% of the study population and were disproportionately represented in the WSU group due to the nature of one of the WSU studies. It is possible that the findings in women are driven by study site differences rather than a sex-specific effect; however, study-stratified analyses did not support a study site-specific finding (data not shown). The only histology categories with large enough numbers for meaningful analyses were NSCLC and adenocarcinoma; therefore, our results might not be generalizable to all histology types. We also had too few never-smoking cases for meaningful analyses.
Lung cancer is one of the few cancers for which no substantial progress has been made in early detection and treatment. Given that it is the leading cause of death from cancer, it is exceedingly important to better understand the underlying biology to develop targeted treatments and to identify high-risk populations for targeted behavioral interventions and access to developing screening methods. It is probable that many genes contribute to an individual’s lung cancer risk. Admixture mapping provides an alternative approach to the identification of lung cancer susceptibility genes. New regions were identified in this admixture study. These results add to the findings from the GWAS in Caucasian populations and suggest novel candidate gene areas. These regions need to be confirmed in additional studies with finer mapping.