We conducted imputation to the 1000 Genomes Project of four genome-wide association studies of lung cancer in populations of European ancestry (11,348 cases and 15,861 controls) and genotyped an additional 10,246 cases and 38,295 controls for follow-up. We identified large-effect genome-wide associations for squamous lung cancer with the rare variants of BRCA2-K3326X (rs11571833; odds ratio [OR]=2.47, P=4.74×10−20) and of CHEK2-I157T (rs17879961; OR=0.38, P=1.27×10−13). We also showed an association between common variation at 3q28 (TP63; rs13314271; OR=1.13, P=7.22×10−10) and lung adenocarcinoma previously only reported in Asians. These findings provide further evidence for inherited genetic susceptibility to lung cancer and its biological basis. Additionally, our analysis demonstrates that imputation can identify rare disease-causing variants having substantive effects on cancer risk from pre-existing GWAS data.
We performed a genome-wide association study on 1,292 individuals with abdominal aortic aneurysms (AAAs) and 30,503 controls from Iceland and The Netherlands, with a follow-up of top markers in up to 3,267 individuals with AAAs and 7,451 controls. The A allele of rs7025486 on 9q33 was found to associate with AAA, with an odds ratio (OR) of 1.21 and P = 4.6 × 10−10. In tests for association with other vascular diseases, we found that rs7025486[A] is associated with early onset myocardial infarction (OR = 1.18, P = 3.1 × 10−5), peripheral arterial disease (OR = 1.14, P = 3.9 × 10−5) and pulmonary embolism (OR = 1.20, P = 0.00030), but not with intracranial aneurysm or ischemic stroke. No association was observed between rs7025486[A] and common risk factors for arterial and venous diseases—that is, smoking, lipid levels, obesity, type 2 diabetes and hypertension. Rs7025486 is located within DAB2IP, which encodes an inhibitor of cell growth and survival.
Loss-of-function mutations protective against human disease provide in vivo validation of therapeutic targets1,2,3, yet none are described for type 2 diabetes (T2D). Through sequencing or genotyping ~150,000 individuals across five ethnicities, we identified 12 rare protein-truncating variants in SLC30A8, which encodes an islet zinc transporter (ZnT8)4 and harbors a common variant (p.Trp325Arg) associated with T2D risk, glucose, and proinsulin levels5–7. Collectively, protein-truncating variant carriers had 65% reduced T2D risk (p=1.7×10−6), and non-diabetic Icelandic carriers of a frameshift variant (p.Lys34SerfsX50) demonstrated reduced glucose levels (−0.17 s.d., p=4.6×10−4). The two most common protein-truncating variants (p.Arg138X and p.Lys34SerfsX50) individually associate with T2D protection and encode unstable ZnT8 proteins. Previous functional study of SLC30A8 suggested reduced zinc transport increases T2D risk8,9, yet phenotypic heterogeneity was observed in rodent Slc30a8 knockouts10–15. Contrastingly, loss-of-function mutations in humans provide strong evidence that SLC30A8 haploinsufficiency protects against T2D, proposing ZnT8 inhibition as a therapeutic strategy in T2D prevention.
To search for new sequence variants that confer risk of cutaneous basal cell carcinoma (BCC), we conducted a genome-wide association study of 38.5 million single nucleotide polymorphisms (SNPs) and small indels identified through whole-genome sequencing of 2230 Icelanders. We imputed genotypes for 4208 BCC patients and 109 408 controls using Illumina SNP chip typing data, carried out association tests and replicated the findings in independent population samples. We found new BCC susceptibility loci at TGM3 (rs214782[G], P = 5.5 × 10−17, OR = 1.29) and RGS22 (rs7006527[C], P = 8.7 × 10−13, OR = 0.77). TGM3 encodes transglutaminase type 3, which plays a key role in production of the cornified envelope during epidermal differentiation.
We report the results of an association study of melanoma based on the genome-wide imputation of the genotypes of 1,353 cases and 3,566 controls of European origin conducted by the GenoMEL consortium. This revealed a novel association between melanoma and several single nucleotide polymorphisms (SNPs) in intron 8 of the FTO gene, including rs16953002, which replicated using 12,313 cases and 55,667 controls of European ancestry from Europe, the USA and Australia (combined p=3.6×10−12, per-allele OR for A=1.16). As well as identifying a novel melanoma susceptibility locus, this is the first study to identify and replicate an association with SNPs in FTO not related to body mass index (BMI). These SNPs are not in intron 1 (the BMI-related region) and show no association with BMI. This suggests FTO’s function may be broader than the existing paradigm that FTO variants influence multiple traits only through their associations with BMI and obesity.
Effects of susceptibility variants may depend on the parent from which they are inherited. While many associations between sequence variants and human traits have been discovered through genome-wide associations, the impact of parental origin has largely been ignored. Combining genealogy with long range phasing, we demonstrate that for 38,167 Icelanders genotyped using SNP chips, the parental origin of most alleles can be determined. We then focused on SNPs that associate with diseases and are within 500kb of known imprinted genes. Seven independent SNP associations were examined. Five, one each with breast cancer and basal cell carcinoma, and three with type 2 diabetes (T2D), exhibit parental-origin specific associations. These variants are located in two genomic regions, 11p15 and 7q32, each harbouring a cluster of imprinted genes. Furthermore, a novel variant rs2334499 at 11p15 was seen to associate with T2D where the allele that confers risk when paternally inherited is protective when maternally transmitted. We identified a differentially methylated CTCF binding site at 11p15 and demonstrated correlation of rs2334499 with decreased methylation of that site.
Genome-wide association studies have mainly relied on common HapMap sequence variations. Recently, sequencing approaches have allowed analysis of low frequency and rare variants in conjunction with common variants, thereby improving the search for functional variants and thus the understanding of the underlying biology of human traits and diseases. Here, we used a large Icelandic whole genome sequence dataset combined with Danish exome sequence data to gain insight into the genetic architecture of serum levels of vitamin B12 (B12) and folate. Up to 22.9 million sequence variants were analyzed in combined samples of 45,576 and 37,341 individuals with serum B12 and folate measurements, respectively. We found six novel loci associating with serum B12 (CD320, TCN2, ABCD4, MMAA, MMACHC) or folate levels (FOLR3) and confirmed seven loci for these traits (TCN1, FUT6, FUT2, CUBN, CLYBL, MUT, MTHFR). Conditional analyses established that four loci contain additional independent signals. Interestingly, 13 of the 18 identified variants were coding and 11 of the 13 target genes have known functions related to B12 and folate pathways. Contrary to epidemiological studies we did not find consistent association of the variants with cardiovascular diseases, cancers or Alzheimer's disease although some variants demonstrated pleiotropic effects. Although to some degree impeded by low statistical power for some of these conditions, these data suggest that sequence variants that contribute to the population diversity in serum B12 or folate levels do not modify the risk of developing these conditions. Yet, the study demonstrates the value of combining whole genome and exome sequencing approaches to ascertain the genetic and molecular architectures underlying quantitative trait associations.
Genome-wide association studies have in recent years revealed a wealth of common variants associated with common diseases and phenotypes. We took advantage of the advances in sequencing technologies to study the association of low frequency and rare variants in conjunction with common variants with serum levels of vitamin B12 (B12) and folate in Icelanders and Danes. We found 18 independent signals in 13 loci associated with serum B12 or folate levels. Interestingly, 13 of the 18 identified variants are coding and 11 of the 13 target genes have known functions related to B12 and folate pathways. These data indicate that the target genes at all of the loci have been identified. Epidemiological studies have shown a relationship between serum B12 and folate levels and the risk of cardiovascular diseases, cancers, and Alzheimer's disease. We investigated association between the identified variants and these diseases but did not find consistent association.
In Western countries, prostate cancer is the most prevalent cancer of men, and one of the leading causes of cancer-related death in men. Several genome-wide association studies have yielded numerous common variants conferring risk of prostate cancer. In the present study we analyzed 32.5 million variants discovered by whole-genome sequencing 1,795 Icelanders. One variant was found to be associated with prostate cancer in European populations: rs188140481[A] (OR = 2.90, Pcomb = 6.2×10−34) located on 8q24, with an average risk allele control frequency of 0.54%. This variant is only very weakly correlated (r2 ≤ 0.06) with previously reported risk variants on 8q24, and remains significant after adjustment for all of them. Carriers of rs188140481[A] were diagnosed with prostate cancer 1.26 years younger than non-carriers (P = 0.0059). We also report results for the previously described HOXB13 mutation (rs138213197[T]), confirming it as a prostate cancer risk variant in populations from all over Europe.
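The reported control allele frequency of 0.54% can be turned into an approximate carrier frequency. The following is a minimal sketch that assumes Hardy-Weinberg proportions, which the abstract does not state; the function name `carrier_frequency` is illustrative only.

```python
def carrier_frequency(allele_freq: float) -> float:
    """Fraction of individuals carrying at least one copy of an allele.

    Assumes Hardy-Weinberg proportions (an assumption made for this
    sketch, not a statement from the abstract).
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    # Non-carriers have two copies of the other allele: (1 - p)^2.
    return 1.0 - (1.0 - allele_freq) ** 2


# rs188140481[A]: average control allele frequency of 0.54%
print(f"{carrier_frequency(0.0054):.4f}")
```

Under this assumption, a little over 1% of controls would carry at least one copy of the risk allele.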
In order to search for sequence variants conferring risk of thyroid cancer we conducted a genome-wide association study in 192 and 37,196 Icelandic cases and controls, respectively, followed by a replication study in individuals of European descent. Here we show that two common variants, located on 9q22.33 and 14q13.3, are associated with the disease. Overall, the strongest association signals were observed for rs965513 on 9q22.33 (OR = 1.75; P = 1.7 × 10−27) and rs944289 on 14q13.3 (OR = 1.37; P = 2.0 × 10−9). The gene nearest to the 9q22.33 locus is FOXE1 (TTF2) and NKX2-1 (TTF1) is among the genes located at the 14q13.3 locus. Both variants contribute to an increased risk of both papillary and follicular thyroid cancer. Approximately 3.7% of individuals are homozygous for both variants, and their estimated risk of thyroid cancer is 5.7-fold greater than that of noncarriers. In a study on a large sample set from the general population, both risk alleles are associated with low concentrations of thyroid stimulating hormone (TSH), and the 9q22.33 allele is associated with low concentration of thyroxin (T4) and high concentration of triiodothyronine (T3).
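The 5.7-fold figure for double homozygotes can be related to the two per-allele odds ratios. The sketch below assumes a multiplicative model, in which each copy of a risk allele multiplies the risk and the two loci act independently; this model is an assumption of the example, and `combined_risk` is a hypothetical helper name.

```python
def combined_risk(per_allele_ors, copies):
    """Risk relative to non-carriers under a multiplicative model.

    per_allele_ors: per-allele odds ratio for each variant
    copies: number of risk alleles carried at each variant (0, 1 or 2)
    The multiplicative model is an assumption of this sketch.
    """
    risk = 1.0
    for odds_ratio, n_copies in zip(per_allele_ors, copies):
        risk *= odds_ratio ** n_copies
    return risk


# rs965513 (OR = 1.75) and rs944289 (OR = 1.37), homozygous at both loci
risk = combined_risk([1.75, 1.37], [2, 2])
print(f"{risk:.1f}")
```

With these inputs the product is 1.75² × 1.37² ≈ 5.7, which agrees with the estimate quoted in the abstract.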
To search for sequence variants conferring risk of nonmedullary thyroid cancer, we focused our analysis on 22 SNPs with a P < 5 × 10−8 in a genome-wide association study on levels of thyroid stimulating hormone (TSH) in 27,758 Icelanders. Of those, rs965513 has previously been shown to associate with thyroid cancer. The remaining 21 SNPs were genotyped in 561 Icelandic individuals with thyroid cancer (cases) and up to 40,013 controls. Variants suggestively associated with thyroid cancer (P < 0.05) were genotyped in an additional 595 non-Icelandic cases and 2,604 controls. After combining the results, three variants were shown to associate with thyroid cancer: rs966423 on 2q35 (OR = 1.34; Pcombined = 1.3 × 10−9), rs2439302 on 8p12 (OR = 1.36; Pcombined = 2.0 × 10−9) and rs116909374 on 14q13.3 (OR = 2.09; Pcombined = 4.6 × 10−11), a region previously reported to contain an uncorrelated variant conferring risk of thyroid cancer. A strong association (P = 9.1 × 10−91) was observed between rs2439302 on 8p12 and expression of NRG1, which encodes the signaling protein neuregulin 1, in blood.
Anaemia is a chief determinant of global ill health, contributing to cognitive impairment, growth retardation and impaired physical capacity. To understand further the genetic factors influencing red blood cells, we carried out a genome-wide association study of haemoglobin concentration and related parameters in up to 135,367 individuals. Here we identify 75 independent genetic loci associated with one or more red blood cell phenotypes at P <10−8, which together explain 4–9% of the phenotypic variance per trait. Using expression quantitative trait loci and bioinformatic strategies, we identify 121 candidate genes enriched in functions relevant to red blood cell biology. The candidate genes are expressed preferentially in red blood cell precursors, and 43 have haematopoietic phenotypes in Mus musculus or Drosophila melanogaster. Through open-chromatin and coding-variant analyses we identify potential causal genetic variants at 41 loci. Our findings provide extensive new insights into genetic mechanisms and biological pathways controlling red blood cell formation and function.
We conducted a genome-wide SNP association study on prostate cancer on over 23,000 Icelanders, followed by a replication study including over 15,500 individuals from Europe and the United States. Two newly identified variants were shown to be associated with prostate cancer: rs5945572 on Xp11.22 and rs721048 on 2p15 (odds ratios (OR) = 1.23 and 1.15; P = 3.9 × 10−13 and 7.7 × 10−9, respectively). The 2p15 variant shows a significantly stronger association with more aggressive, rather than less aggressive, forms of the disease.
Mutations generate sequence diversity and provide a substrate for selection. The rate of de novo mutations is therefore of major importance to evolution. We conducted a study of genome-wide mutation rate by sequencing the entire genomes of 78 Icelandic parent-offspring trios at high coverage. Here we show that in our samples, with an average father’s age of 29.7, the average de novo mutation rate is 1.20×10−8 per nucleotide per generation. Most strikingly, the diversity in mutation rate of single-nucleotide polymorphisms (SNPs) is dominated by the age of the father at conception of the child. The effect is an increase of about 2 mutations per year. After accounting for random Poisson variation, father’s age is estimated to explain nearly all of the remaining variation in the de novo mutation counts. These observations shed light on the importance of the father’s age on the risk of diseases such as schizophrenia and autism.
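The relationship between the per-nucleotide rate and the paternal-age effect can be illustrated with a simple linear sketch. The rate, mean paternal age and slope are taken from the abstract; the number of callable bases per haploid genome (`HAPLOID_BASES`) is an assumed value introduced only to convert the rate into a count, and the linear form is itself a simplifying assumption.

```python
MUTATION_RATE = 1.20e-8   # per nucleotide per generation (from the abstract)
MEAN_FATHER_AGE = 29.7    # years (from the abstract)
SLOPE_PER_YEAR = 2.0      # "about 2" extra mutations per year (from the abstract)
HAPLOID_BASES = 2.63e9    # ASSUMED callable bases per haploid genome


def expected_de_novo(father_age: float) -> float:
    """Expected de novo mutation count for a given paternal age.

    Linear sketch: the baseline count at the mean paternal age is the
    per-nucleotide rate times the diploid genome size, shifted by the
    per-year slope. Both the genome size and linearity are assumptions.
    """
    baseline = MUTATION_RATE * 2 * HAPLOID_BASES
    return baseline + SLOPE_PER_YEAR * (father_age - MEAN_FATHER_AGE)


for age in (20.0, 29.7, 40.0):
    print(age, round(expected_de_novo(age), 1))
```

Under these assumptions a 20-year difference in paternal age shifts the expected count by about 40 mutations, which is large relative to a baseline of roughly 63.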
Measuring serum levels of the prostate specific antigen (PSA) is the most common screening method for prostate cancer. However, PSA levels are affected by a number of factors apart from neoplasia. Notably, around 40% of the variability of PSA levels in the general population is accounted for by inherited factors, suggesting that it may be possible to improve both sensitivity and specificity by adjusting test results for genetic effects. In order to search for sequence variants that associate with PSA levels, we performed a genome-wide association study and follow-up analysis using PSA information from 15,757 Icelandic and 454 British men not diagnosed with prostate cancer. Overall, we detected a genome-wide significant association between PSA levels and SNPs at six loci: 5p15.33 (rs2736098), 10q11 (rs10993994), 10q26 (rs10788160), 12q24 (rs11067228), 17q12 (rs4430796), and 19q13.33 (rs17632542; KLK3: I179T), each with Pcombined < 3×10−10. Among 3,834 men who underwent a biopsy of the prostate, the 10q26, 12q24, and 19q13.33 alleles that associate with high PSA levels are associated with higher probability of a negative biopsy (OR between 1.15 and 1.27). Assessment of association between the 6 loci and prostate cancer risk in 5,325 cases and 41,417 controls from Iceland, the Netherlands, Spain, Romania, and the US showed that the SNPs at 10q26 and 12q24 were exclusively associated with PSA levels, whereas the other 4 loci also were associated with prostate cancer risk. We propose that a personalized PSA cutoff value, based on genotype, should be used when deciding to perform a prostate biopsy.
We report a genome-wide association follow-up study on prostate cancer. We identify four variants associated with the disease in European populations: rs10934853-A (OR = 1.12, P = 2.9×10−10) on 3q21.3; two moderately correlated (r2 = 0.07) variants on 8q24.21, rs16902094-G (OR = 1.21, P = 6.2×10−15) and rs445114-T (OR = 1.14, P = 4.7×10−10); and rs8102476-C (OR = 1.12, P = 1.6×10−11) on 19q13.2. We also refine a previous association signal on 11q13 with the SNP rs11228565-A (OR = 1.23, P = 6.7 × 10−12). In a multi-variant analysis, using 22 prostate cancer risk variants typed in the Icelandic population, we estimate that carriers belonging to the top 1.3% of the risk distribution have a risk of developing the disease that is more than 2.5 times greater than the population average risk estimates.
Early menopause (EM) affects up to 10% of the female population, reducing reproductive lifespan considerably. Currently, it constitutes the leading cause of infertility in the western world, affecting mainly those women who postpone their first pregnancy beyond the age of 30 years. The genetic aetiology of EM is largely unknown in the majority of cases. We have undertaken a meta-analysis of genome-wide association studies (GWASs) in 3493 EM cases and 13 598 controls from 10 independent studies. No novel genetic variants were discovered, but the 17 variants previously associated with normal age at natural menopause as a quantitative trait (QT) were also associated with EM and primary ovarian insufficiency (POI). Thus, EM has a genetic aetiology which overlaps variation in normal age at menopause and is at least partly explained by the additive effects of the same polygenic variants. The combined effect of the common variants captured by the single nucleotide polymorphism arrays was estimated to account for ∼30% of the variance in EM. The association between the combined 17 variants and the risk of EM was greater than the best validated non-genetic risk factor, smoking.
Three genome-wide association studies in Europe and the USA have reported eight urinary bladder cancer (UBC) susceptibility loci. Using extended case and control series and 1000 Genomes imputations of 5 340 737 single-nucleotide polymorphisms (SNPs), we searched for additional loci in the European GWAS. The discovery sample set consisted of 1631 cases and 3822 controls from the Netherlands and 603 cases and 37 781 controls from Iceland. For follow-up, we used 3790 cases and 7507 controls from 13 sample sets of European and Iranian ancestry. Based on the discovery analysis, we followed up signals in the urea transporter (UT) gene SLC14A1. The strongest signal at this locus was represented by a SNP in intron 3, rs17674580, that reached genome-wide significance in the overall analysis of the discovery and follow-up groups: odds ratio = 1.17, P = 7.6 × 10−11. SLC14A1 codes for UTs that define the Kidd blood group and are crucial for the maintenance of a constant urea concentration gradient in the renal medulla and, through this, the kidney's ability to concentrate urine. It is speculated that rs17674580, or other sequence variants in LD with it, indirectly modifies UBC risk by affecting urine production. If confirmed, this would support the ‘urogenous contact hypothesis’ that urine production and voiding frequency modify the risk of UBC.
To identify new risk variants for cutaneous basal cell carcinoma, we performed a genome-wide association study of 16 million SNPs identified through whole-genome sequencing of 457 Icelanders. We imputed genotypes for 41,675 Illumina SNP chip-typed Icelanders and their relatives. In the discovery phase, the strongest signal came from rs78378222[C] (odds ratio (OR) = 2.36, P = 5.2 × 10−17), which has a frequency of 0.0192 in the Icelandic population. We then confirmed this association in non-Icelandic samples (OR = 1.75, P = 0.0060; overall OR = 2.16, P = 2.2 × 10−20). rs78378222 is in the 3′ untranslated region of TP53 and changes the AATAAA polyadenylation signal to AATACA, resulting in impaired 3′-end processing of TP53 mRNA. Investigation of other tumor types identified associations of this SNP with prostate cancer (OR = 1.44, P = 2.4 × 10−6), glioma (OR = 2.35, P = 1.0 × 10−5) and colorectal adenoma (OR = 1.39, P = 1.6 × 10−4). However, we observed no effect for breast cancer, a common Li-Fraumeni syndrome tumor (OR = 1.06, P = 0.57, 95% confidence interval 0.88–1.27).
To identify novel loci for age at natural menopause, we performed a meta-analysis of 22 genome-wide association studies in 38,968 women of European descent, with replication in up to 14,435 women. In addition to four known loci, we identified 13 new age at natural menopause loci (P < 5 × 10−8). The new loci included genes implicated in DNA repair (EXO1, HELQ, UIMC1, FAM175A, FANCI, TLK1, POLG, PRIM1) and immune function (IL11, NLRP11, BAT2). Gene-set enrichment pathway analyses using the full GWAS dataset identified exodeoxyribonuclease, NFκB signalling and mitochondrial dysfunction as biological processes related to timing of menopause.
Androgenetic alopecia (AGA) is a highly heritable condition and the most common form of hair loss in humans. Susceptibility loci have been described on the X chromosome and chromosome 20, but these loci explain a minority of its heritable variance. We conducted a large-scale meta-analysis of seven genome-wide association studies for early-onset AGA in 12,806 individuals of European ancestry. While replicating the two AGA loci on the X chromosome and chromosome 20, six novel susceptibility loci reached genome-wide significance (p = 2.62×10−9–1.01×10−12). Unexpectedly, we identified a risk allele at 17q21.31 that was recently associated with Parkinson's disease (PD) at a genome-wide significant level. We then tested the association between early-onset AGA and the risk of PD in a cross-sectional analysis of 568 PD cases and 7,664 controls. Early-onset AGA cases had significantly increased odds of subsequent PD (OR = 1.28, 95% confidence interval: 1.06–1.55, p = 8.9×10−3). Further, the AGA susceptibility alleles at the 17q21.31 locus are on the H1 haplotype, which is under negative selection in Europeans and has been linked to decreased fertility. Combining the risk alleles of six novel and two established susceptibility loci, we created a genotype risk score and tested its association with AGA in an additional sample. Individuals in the highest risk quartile of a genotype score had an approximately six-fold increased risk of early-onset AGA [odds ratio (OR) = 5.78, p = 1.4×10−88]. Our results highlight unexpected associations between early-onset AGA, Parkinson's disease, and decreased fertility, providing important insights into the pathophysiology of these conditions.
While most genome-wide association studies (GWAS) focus on the identification of susceptibility loci for a specific disease, this hypothesis-free approach also enables the identification of unexpected associations between different diseases by taking advantage of the previously published GWAS associations. Androgenetic Alopecia (AGA, also known as male pattern baldness) is the most common type of hair loss in humans. Parkinson's disease is reported to occur more commonly in men than in women; however, there are no studies investigating the link between AGA and Parkinson's disease. Here, we show that a specific genetic locus, chromosome 17q21.31, which is associated with Parkinson's disease, is also a susceptibility locus for early-onset AGA. We further investigate the association between early-onset AGA and Parkinson's disease, irrespective of genotype, directly in a large-scale web-based study. We find that men with early-onset AGA have 28% higher risk of developing Parkinson's disease. The early-onset AGA locus on chromosome 17q21.31 has also been linked to decreased fertility previously. Future studies of this locus may implicate novel biological pathways affecting these three conditions.
Coffee is the most commonly used stimulant and caffeine is its main psychoactive ingredient. The heritability of coffee consumption has been estimated at around 50%. We performed a meta-analysis of four genome-wide association studies of coffee consumption among coffee drinkers from Iceland (n = 2680), the Netherlands (n = 2791), the Sorbs Slavonic population isolate in Germany (n = 771) and the USA (n = 369) using both directly genotyped and imputed single nucleotide polymorphisms (SNPs) (2.5 million SNPs). SNPs at the two most significant loci were also genotyped in a sample set from Iceland (n = 2430) and a Danish sample set consisting of pregnant women (n = 1620). Combining all data, two sequence variants significantly associated with increased coffee consumption: rs2472297-T located between CYP1A1 and CYP1A2 at 15q24 (P = 5.4 · 10−14) and rs6968865-T near aryl hydrocarbon receptor (AHR) at 7p21 (P = 2.3 · 10−11). An effect of ∼0.2 cups a day per allele was observed for both SNPs. CYP1A2 is the main caffeine metabolizing enzyme and is also involved in drug metabolism. AHR detects xenobiotics, such as polycyclic aryl hydrocarbons found in roasted coffee, and induces transcription of CYP1A1 and CYP1A2. The association of these SNPs with coffee consumption was present in both smokers and non-smokers.
DNA repair genes are important for maintaining genomic stability and limiting carcinogenesis. We analyzed all single nucleotide polymorphisms (SNPs) of 125 DNA repair genes covered by the Illumina HumanHap300 (v1.1) BeadChips in a previously conducted genome-wide association study (GWAS) of 1,154 lung cancer cases and 1,137 controls and replicated the top hits of XRCC4 SNPs in an independent set of 597 cases and 611 controls in Texas populations. We found that six of 20 XRCC4 SNPs were associated with a decreased risk of lung cancer with a P value of 0.01 or lower in the discovery dataset, of which the most significant SNP was rs10040363 (P for allelic test = 4.89 ×10−4). Moreover, the data in this region allowed us to impute a potentially functional SNP rs2075685 (imputed P for allelic test = 1.3 ×10−3). A luciferase reporter assay demonstrated that the rs2075685G>T change in the XRCC4 promoter increased expression of the gene. In the replication study of rs10040363, rs1478486, rs9293329, and rs2075685, however, only rs10040363 achieved a borderline association with a decreased risk of lung cancer in a dominant model (adjusted OR = 0.80, 95% CI = 0.62–1.03, P = 0.079). In the final combined analysis of both the Texas GWAS discovery and replication datasets, the strength of the association was increased for rs10040363 (adjusted OR = 0.77, 95% CI = 0.66–0.89, Pdominant = 5×10−4 and P for trend = 5×10−4) and rs1478486 (adjusted OR = 0.82, 95% CI = 0.71–0.94, Pdominant = 6×10−3 and P for trend = 3.5×10−3). Finally, we conducted a meta-analysis of these XRCC4 SNPs with available data from published GWA studies of lung cancer with a total of 12,312 cases and 47,921 controls, in which none of these XRCC4 SNPs was associated with lung cancer risk. It appeared that rs2075685, although associated with increased expression of a reporter gene and lung cancer risk in the Texas populations, did not have an effect on lung cancer risk in other populations.
This study underscores the importance of replication using published data in larger populations.
XRCC4; variant; Genetic susceptibility; genome-wide association study; replication study
Published genome-wide association studies (GWASs) have identified few variants in the known biological pathways involved in lung cancer etiology. To mine the possibly hidden causal single nucleotide polymorphisms (SNPs), we explored all SNPs in the extrinsic apoptosis pathway from our published GWAS dataset for 1154 lung cancer cases and 1137 cancer-free controls. In an initial association analysis of 611 tagSNPs in 41 apoptosis-related genes, we identified only 10 tagSNPs associated with lung cancer risk with a P value <10−2, including four tagSNPs in DAPK1 and three tagSNPs in TNFSF8. Unlike DAPK1 SNPs, TNFSF8 rs2181033 tagged four other predicted functional but untyped SNPs (rs776576, rs776577, rs31813148 and rs2075533) in the promoter region. Therefore, we further tested binding affinity of these four SNPs by performing the electrophoretic mobility shift assay. We found that only the rs2075533T allele modified levels of nuclear proteins bound to DNA, leading to significantly decreased expression of luciferase reporter constructs by 5- to 10-fold in H1299, HeLa and HCT116 cell lines compared with the C allele. We also performed a replication study of the untyped rs2075533 in an independent Texas population but did not confirm the protective effect. We further performed a mini meta-analysis for SNPs of TNFSF8 obtained from four other published lung cancer GWASs with 12 214 cases and 47 721 controls, and we found that only rs3181366 (r2 = 0.69 with the untyped rs2075533) was associated with lung cancer risk (P = 0.008). Our findings suggest a possible role of novel TNFSF8 variants in susceptibility to lung cancer.
We conducted a genome-wide association study on 969 bladder cancer cases and 957 controls from Texas. For fast-track validation, we evaluated 60 SNPs in three additional US populations and validated the top SNP in nine European populations. A missense variant (rs2294008) in the PSCA gene showed consistent association with bladder cancer in US and European populations. Combining all subjects (6,667 cases, 39,590 controls), the overall P-value was 2.14 × 10−10 and the allelic odds ratio was 1.15 (95% confidence interval 1.10–1.20). rs2294008 alters the start codon and is predicted to cause truncation of nine amino acids from the N-terminal signal sequence of the primary PSCA translation product. In vitro reporter gene assay showed that the variant allele significantly reduced promoter activity. Resequencing of the PSCA genomic region showed that rs2294008 is the only common missense SNP in PSCA. Our data identify rs2294008 as a new bladder cancer susceptibility locus.
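An odds ratio and its 95% confidence interval carry enough information to approximate the test statistic and P-value. The sketch below uses a Wald approximation on the log odds ratio scale, which is an assumption of this example and not necessarily the test used in the study; because the published OR and interval are rounded, the result can only approximate the reported P-value. The function name `wald_from_or_ci` is hypothetical.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z-score and two-sided P-value from an OR and its CI.

    Wald approximation on the log-OR scale (an assumption of this sketch):
    the standard error is recovered from the width of the interval.
    """
    log_or = math.log(odds_ratio)
    std_err = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / std_err
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# rs2294008: allelic OR 1.15, 95% CI 1.10-1.20
z, p = wald_from_or_ci(1.15, 1.10, 1.20)
print(f"z = {z:.2f}, p = {p:.2e}")
```

The recovered P-value falls in the same order of magnitude as the reported 2.14 × 10−10, which is as close as rounded inputs allow.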
Genome-wide association studies (GWAS) have identified three genomic regions, at 15q24-25.1, 5p15.33 and 6p21.33, which associate with risk of lung cancer. Large meta-analyses of GWA data have failed to find additional associations of genome-wide significance. In this study, we sought to confirm 7 variants with suggestive association to lung cancer (P<10−5) in a recently published meta-analysis. In a GWA dataset of 1,447 lung cancer cases and 36,256 controls in Iceland, three correlated variants on 15q15.2 (rs504417, rs11853991 and rs748404) showed a significant association with lung cancer whereas rs4254535 on 2p14, rs1530057 on 3p24.1, rs6438347 on 3q13.31 and rs1926203 on 10q23.31 did not. The most significant variant, rs748404, was genotyped in an additional 1,299 lung cancer cases and 4,102 controls from the Netherlands, Spain and the USA and the results combined with published GWAS data. In this analysis, the T allele of rs748404 reached genome-wide significance (OR=1.15, P=1.1×10−9). Another variant at the same locus, rs12050604, showed association with lung cancer (OR=1.09, P=3.6×10−6) and remained significant after adjustment for rs748404 and vice versa. rs748404 is located 140 kb centromeric of the TP53BP1 gene that has been implicated in lung cancer risk. Two fully correlated, non-synonymous coding variants in TP53BP1, rs2602141 (Q1136K) and rs560191 (E353D), showed association with lung cancer in our sample set; however, this association did not remain significant after adjustment for rs748404. Our data show that one or more lung cancer risk variants of genome-wide significance and distinct from the coding variants in TP53BP1 are located at 15q15.2.
Lung cancer; genome-wide association studies; GWAS; 15q15.2; TP53BP1