The increasing availability of tumor and germline genomic data allows integration of these two data sets to better understand cancer risk. We provide an overview of the types of research being performed and the tools and data available to researchers.
Cancer is characterized by a diversity of genetic and epigenetic alterations occurring in both the germline and somatic (tumor) genomes. Hundreds of germline variants associated with cancer risk have been identified, and large amounts of data identifying mutations in the tumor genome that participate in tumorigenesis have been generated. Increasingly, these two genomes are being explored jointly to better understand how cancer risk alleles contribute to carcinogenesis and whether they influence development of specific tumor types or mutation profiles. To understand how data from germline risk studies and tumor genome profiling are being integrated, we reviewed 160 articles describing research that incorporated data from both genomes, published between January 2009 and December 2012, and summarized the current state of the field. We identified three principal types of research questions being addressed using these data: (i) use of tumor data to determine the putative function of germline risk variants; (ii) identification and analysis of relationships between host genetic background and particular tumor mutations or types; and (iii) use of tumor molecular profiling data to reduce genetic heterogeneity or refine phenotypes for germline association studies. We also found descriptive studies that compared germline and tumor genomic variation in a gene or gene family, and papers describing research methods, data sources, or analytical tools. We identified a large set of tools and data resources that can be used to analyze and integrate data from both genomes. Finally, we discuss opportunities and challenges for cancer research that integrates germline and tumor genomics data.
Pancreatic cancer is a leading cause of cancer-related mortality in the U.S. and both incidence and mortality are highest in African Americans. Obesity is also disproportionately high in African Americans, but limited data are available on the relation of obesity to pancreatic cancer in this population.
Seven large prospective cohort studies pooled data from African American participants. Body mass index (BMI) was calculated from self-reported height and weight at baseline. Cox regression was used to calculate hazard ratios (HRs) and 95% confidence intervals (CIs) for levels of BMI relative to BMI 18.5–24.9, with adjustment for covariates. Primary analyses were restricted to participants with ≥5 years of follow-up because weight loss prior to diagnosis may have influenced baseline BMI in cases who died during early follow-up.
In follow-up of 239,597 participants, 897 pancreatic cancer deaths occurred. HRs were 1.08 (95% CI, 0.90–1.31) for BMI 25.0–29.9, 1.25 (95% CI, 0.99–1.57) for BMI 30.0–34.9, and 1.31 (95% CI, 0.97–1.77) for BMI ≥35.0 among those with ≥5 years of follow-up (Ptrend = 0.03). The association was evident among both sexes and was independent of a history of diabetes. A stronger association was observed among never-smokers (BMI ≥30 vs. referent: HR = 1.44; 95% CI, 1.02–2.03) than among smokers (HR = 1.16; 95% CI, 0.87–1.54; Pinteraction = 0.02).
The findings suggest that obesity is independently associated with increased pancreatic cancer mortality in African Americans.
Interventions to reduce obesity may also reduce risk of pancreatic cancer mortality, particularly among never-smokers.
body mass index; obesity; pancreatic cancer mortality; African American
Over the past two decades, researchers have increasingly used human biospecimens to evaluate hypotheses related to disease risk, outcomes and treatment. We conducted an analysis of population-science cancer research grants funded by the National Cancer Institute (NCI) to gain a more comprehensive understanding of biospecimens and common derivatives involved in those studies and identify opportunities for advancing the field. Data available for 1,018 extramural, peer-reviewed grants (active as of July 2012) supported by the Division of Cancer Control and Population Sciences (DCCPS), the NCI Division that supports cancer control and population-science extramural research grants, were analyzed. Of these, 455 grants were determined to involve biospecimens or derivatives. The most common specimen types included were whole blood (51% of grants), serum or plasma (40%), tissue (39%), and the biospecimen derivative, DNA (66%). While use of biospecimens in molecular epidemiology has become common, their use in behavioral and social research is only emerging, as observed in our analysis. Additionally, we found that the majority of grants were using already existing biospecimens (63%). Grants that involved use of existing biospecimens resulted in lower costs (studies that used existing serum/plasma biospecimens were 4.2 times less expensive) and more publications per year (1.4 times) than grants collecting new biospecimens. This analysis serves as a first step toward understanding the types of biospecimen collections supported by NCI DCCPS. There is room to encourage increased use of archived biospecimens and new collections of rarer specimen and cancer types, as well as collections for behavioral and social research. To facilitate these efforts, we are working to better catalogue our funded resources and make those data available to the extramural community.
Next Generation Sequencing (NGS) technologies are used to detect somatic mutations in tumors and study germ line variation. Most NGS studies use DNA isolated from whole blood or fresh frozen tissue. However, formalin-fixed paraffin-embedded (FFPE) tissues are one of the most widely available clinical specimens. Their potential utility as a source of DNA for NGS would greatly enhance population-based cancer studies. While preliminary studies suggest FFPE tissue may be used for NGS, the feasibility of using archived FFPE specimens in population based studies and the effect of storage time on these specimens needs to be determined. We conducted a study to determine whether DNA in archived FFPE high-grade ovarian serous adenocarcinomas from Surveillance, Epidemiology and End Results (SEER) registries Residual Tissue Repositories (RTR) was present in sufficient quantity and quality for NGS assays. Fifty-nine FFPE tissues, stored from 3 to 32 years, were obtained from three SEER RTR sites. DNA was extracted, quantified, quality assessed, and subjected to whole exome sequencing (WES). Following DNA extraction, 58 of 59 specimens (98%) yielded DNA and moved on to the library generation step followed by WES. Specimens stored for longer periods of time had significantly lower coverage of the target region (6% lower per 10 years, 95% CI: 3-10%) and lower average read depth (40x lower per 10 years, 95% CI: 18-60), although sufficient quality and quantity of WES data was obtained for data mining. Overall, 90% (53/59) of specimens provided usable NGS data regardless of storage time. This feasibility study demonstrates FFPE specimens acquired from SEER registries after varying lengths of storage time and under varying storage conditions are a promising source of DNA for NGS.
Over the past several years, genome-wide association studies (GWAS) have succeeded in identifying hundreds of genetic markers associated with common diseases. However, most of these markers confer relatively small increments of risk and explain only a small proportion of familial clustering. To identify obstacles to future progress in genetic epidemiology research and provide recommendations to NIH for overcoming these barriers, the National Cancer Institute sponsored a workshop entitled “Next Generation Analytic Tools for Large-Scale Genetic Epidemiology Studies of Complex Diseases” on September 15–16, 2010. The goal of the workshop was to facilitate discussions on (1) statistical strategies and methods to efficiently identify genetic and environmental factors contributing to the risk of complex disease; and (2) how to develop, apply, and evaluate these strategies for the design, analysis, and interpretation of large-scale complex disease association studies in order to guide NIH in setting the future agenda in this area of research. The workshop was organized as a series of short presentations covering scientific (gene-gene and gene-environment interaction, complex phenotypes, and rare variants and next generation sequencing) and methodological (simulation modeling and computational resources and data management) topic areas. Specific needs to advance the field were identified during each session and are summarized.
gene-gene interactions; gene-environment interactions; rare variants; next generation sequencing; complex phenotypes; simulations; computational resources
Genome-wide association studies (GWAS) have identified 76 variants associated with prostate cancer risk predominantly in populations of European ancestry. To identify additional susceptibility loci for this common cancer, we conducted a meta-analysis of >10 million SNPs in 43,303 prostate cancer cases and 43,737 controls from studies in populations of European, African, Japanese and Latino ancestry. Twenty-three novel susceptibility loci were revealed at P<5×10−8; 15 variants were identified among men of European ancestry, 7 from multiethnic analyses and one was associated with early-onset prostate cancer. These 23 variants, in combination with the known prostate cancer risk variants, explain 33% of the familial risk of the disease in European ancestry populations. These findings provide new regions for investigation into the pathogenesis of prostate cancer and demonstrate the utility of combining ancestrally diverse populations to discover risk loci for disease.
At least 17 genomic regions are established as harboring melanoma susceptibility variants, in most instances with genome-wide levels of significance and replication in independent samples. Based on genome-wide single nucleotide polymorphism (SNP) data augmented by imputation to the 1000 Genomes reference panel, we have fine mapped these regions in over 5,000 individuals with melanoma (mainly from the GenoMEL consortium) and over 7,000 ethnically matched controls. A penalized regression approach was used to discover those SNP markers that most parsimoniously explain the observed association in each genomic region. For the majority of the regions, the signal is best explained by a single SNP, which sometimes, as in the tyrosinase region, is a known functional variant. However, in five regions the explanation is more complex. At the CDKN2A locus, for example, there is strong evidence that not only multiple SNPs but also multiple genes are involved. Our results illustrate the variability in the biology underlying genome-wide susceptibility loci and make steps toward accounting for some of the “missing heritability.”
melanoma; fine mapping; penalized regression; heritability; genome-wide signal
Telomere length has been associated with risk of many cancers, but results are inconsistent. Seven single nucleotide polymorphisms (SNPs) previously associated with mean leukocyte telomere length were either genotyped or well-imputed in 11108 case patients and 13933 control patients from Europe, Israel, the United States and Australia. Four of the seven SNPs reached a P value under .05 (two-sided). A genetic score that predicts telomere length, derived from these seven SNPs, is strongly associated (P = 8.92×10−9, two-sided) with melanoma risk. This demonstrates that the previously observed association between longer telomere length and increased melanoma risk is not attributable to confounding via shared environmental effects (such as ultraviolet exposure) or reverse causality. We provide the first proof that multiple germline genetic determinants of telomere length influence cancer risk.
The major factors individually reported to be associated with an increased frequency of CDKN2A mutations are increased number of patients with melanoma in a family, early age at melanoma diagnosis, and family members with multiple primary melanomas (MPM) or pancreatic cancer.
These four features were examined in 385 families with ⩾3 patients with melanoma pooled by 17 GenoMEL groups, and these attributes were compared across continents.
Overall, 39% of families had CDKN2A mutations ranging from 20% (32/162) in Australia to 45% (29/65) in North America to 57% (89/157) in Europe. All four features in each group, except pancreatic cancer in Australia (p = 0.38), individually showed significant associations with CDKN2A mutations, but the effects varied widely across continents. Multivariate examination also showed different predictors of mutation risk across continents. In Australian families, ⩾2 patients with MPM, median age at melanoma diagnosis ⩽40 years and ⩾6 patients with melanoma in a family jointly predicted the mutation risk. In European families, all four factors concurrently predicted the risk, but with less stringent criteria than in Australia. In North American families, only ⩾1 patient with MPM and age at diagnosis ⩽40 years simultaneously predicted the mutation risk.
The variation in CDKN2A mutations for the four features across continents is consistent with the lower melanoma incidence rates in Europe and higher rates of sporadic melanoma in Australia. The lack of a pancreatic cancer–CDKN2A mutation relationship in Australia probably reflects the divergent spectrum of mutations in families from Australia versus those from North America and Europe. GenoMEL is exploring candidate host, genetic and/or environmental risk factors to better understand the variation observed.
multiple primary melanomas; pancreatic cancer
Common variants in two of the five genetic regions recently identified from genome-wide association studies (GWAS) of risk of glioma were reported to interact with a history of allergic symptoms. In a pooled analysis of five epidemiologic studies, we evaluated the associations of the five GWAS-implicated gene variants, allergies, and autoimmune conditions (AIC) with glioma risk (851 adult glioma cases and 3,977 controls). We further evaluated the joint effects of allergies and AIC and these gene variants on glioma risk. Risk estimates were calculated as odds ratios (OR) and 95% confidence intervals (95% CI), adjusted for age, gender, and study. Joint effects were evaluated by conducting stratified analyses whereby the risk associations (OR and 95% CI) of allergy or autoimmune conditions with glioma were evaluated by the presence or absence of the ‘at-risk’ variant; p for interaction was estimated by fitting models with the main effects of allergy or autoimmune conditions and genotype and an interaction (product) term between them. Four of the five SNPs previously reported by others were statistically significantly associated with increased risk of glioma in our study (rs2736100, rs4295627, rs4977756, and rs6010620); rs498872 was not associated with glioma in our study. Reporting any allergies or AIC was associated with reduced risks of glioma (allergy: adjusted OR = 0.71, 95% CI 0.55–0.91; AIC: adjusted OR = 0.65, 95% CI 0.47–0.90). We did not observe differential association between allergic or autoimmune conditions and glioma by genotype, and there were no statistically significant interactions. Stratified analysis by glioma grade (low and high grade) did not suggest risk differences by disease grade. Our results do not provide evidence that allergies or AIC modulate the association between the four GWAS-identified SNPs examined and risk of glioma.
Single-nucleotide polymorphisms; Glioma; Allergies; Autoimmune conditions; Gene–environment interaction
In contemporary oncology practices there is an increasing emphasis on concurrent evaluation of multiple genomic alterations within the biological pathways driving tumorigenesis. At the foundation of this paradigm shift are several commercially available tumor panels using next-generation sequencing to develop a more complete molecular blueprint of the tumor. Ideally, these would be used to identify clinically actionable variants that can be matched with available molecularly targeted therapy, regardless of the tumor site or histology. Currently, there is little information available on the post-analytic processes unique to next-generation sequencing platforms used by the companies offering these tests. Additionally, evidence of clinical validity showing an association between the genetic markers curated in these tests and treatment response to approved molecularly targeted therapies is lacking across all solid-tumor types. To date, there are no published data showing improved outcomes when the commercially available tests are used to guide treatment decisions. What distinguishes these tests from other genomic applications used to guide clinical treatment decisions is the sequencing platforms used to generate large amounts of genomic data, which have their own related issues regarding analytic and clinical validity, necessary precursors to the evaluation of clinical utility. The generation and interpretation of these data will require new evidentiary standards for establishing not only clinical utility, but also analytical and clinical validity for this emerging paradigm in oncology practice.
Genetic and environmental factors jointly influence cancer risk. The National Institutes of Health (NIH) has made the study of gene-environment (GxE) interactions a research priority since the year 2000.
To assess the current status of GxE research in cancer, we analyzed the extramural grant portfolio of the National Cancer Institute (NCI) from Fiscal Years 2007 to 2009. Publications attributed to selected grants were also evaluated.
From the 1,106 research grants identified in our portfolio analysis, a random sample of 450 grants (40%) was selected for data abstraction; of these, 147 (33%) were considered relevant. The most common cancer type was breast (20%, n=29), followed by lymphoproliferative (10%, n=14), colorectal (9%, n=13), melanoma/other skin (9%, n=13), and lung/upper aero-digestive tract (8%, n=12) cancers. The majority of grants were studies of candidate genes (68%, n=100) compared to genome-wide association studies (GWAS) (8%, n=12). Approximately one third studied environmental exposures categorized as energy balance (37%, n=54) or drugs/treatment (29%, n=43). From the 147 relevant grants, 108 publications classified as GxE or pharmacogenomic were identified. These publications were linked to 37 of the 147 grant applications (25%).
The findings from our portfolio analysis suggest that GxE studies are concentrated in specific areas. There is room for investments in other aspects of GxE research, including, but not limited to, developing alternative approaches to exposure assessment, broadening the spectrum of cancer types investigated, and performing GxE analyses within GWAS.
This portfolio analysis provides a cross-sectional review of NCI support for GxE research in cancer.
Gene-Environment Interaction; Grants
Genome-wide association studies (GWAS) have identified 36 loci associated with body mass index (BMI), predominantly in populations of European ancestry. We conducted a meta-analysis to examine the association of >3.2 million SNPs with BMI in 39,144 men and women of African ancestry, and followed up the most significant associations in an additional 32,268 individuals of African ancestry. We identified one novel locus at 5q33 (GALNT10, rs7708584, p=3.4×10−11) and another at 7p15 when combined with data from the GIANT consortium (MIR148A/NFE2L3, rs10261878, p=1.2×10−10). We also found suggestive evidence of an association at a third locus at 6q16 in the African ancestry sample (KLHL32, rs974417, p=6.9×10−8). Thirty-two of the 36 previously established BMI variants displayed directionally consistent effect estimates in our GWAS (binomial p=9.7×10−7), of which five reached genome-wide significance. These findings provide strong support for shared BMI loci across populations as well as for the utility of studying ancestrally diverse populations.
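The binomial p value quoted for directional consistency can be checked with a one-sided sign test. The following is a minimal sketch, assuming that under the null hypothesis each of the 36 established variants independently has a 50% chance of showing a directionally consistent effect; the function name is illustrative and does not come from the study.

```python
from math import comb


def sign_test_upper_tail(successes: int, trials: int) -> float:
    """One-sided binomial tail P(X >= successes) with success probability 0.5."""
    # Count the outcomes with at least `successes` consistent directions,
    # then divide by the 2**trials equally likely outcomes under the null.
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials


# 32 of the 36 previously established BMI variants were directionally consistent.
p_value = sign_test_upper_tail(32, 36)
print(f"{p_value:.2e}")  # 9.71e-07
```

Under these assumptions the tail probability agrees with the reported binomial p of 9.7×10−7 to the precision given.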
We report the results of an association study of melanoma based on the genome-wide imputation of the genotypes of 1,353 cases and 3,566 controls of European origin conducted by the GenoMEL consortium. This revealed a novel association between several single nucleotide polymorphisms (SNPs) in intron 8 of the FTO gene, including rs16953002, which replicated using 12,313 cases and 55,667 controls of European ancestry from Europe, the USA and Australia (combined p=3.6×10−12, per-allele OR for A=1.16). As well as identifying a novel melanoma susceptibility locus, this is the first study to identify and replicate an association with SNPs in FTO not related to body mass index (BMI). These SNPs are not in intron 1 (the BMI-related region) and show no association with BMI. This suggests FTO’s function may be broader than the existing paradigm that FTO variants influence multiple traits only through their associations with BMI and obesity.
The Epidemiology and Genomics Research Program (EGRP) at the National Cancer Institute (NCI) is developing scientific priorities for cancer epidemiology research in the next decade. We would like to engage the research community and other stakeholders in a planning effort that will include a workshop, in December 2012, to help shape new foci for cancer epidemiology research. To facilitate the process of defining the future of cancer epidemiology, we invite the research community to join in an ongoing Web-based conversation at http://blog-epi.grants.cancer.gov/ to develop priorities and the next generation of high-impact studies.
In an analysis of 31,717 cancer cases and 26,136 cancer-free controls drawn from 13 genome-wide association studies (GWAS), we observed large chromosomal abnormalities in a subset of clones from DNA obtained from blood or buccal samples. Mosaic chromosomal abnormalities, either aneuploidy or copy-neutral loss of heterozygosity, of size >2 Mb were observed in autosomes of 517 individuals (0.89%) with abnormal cell proportions between 7% and 95%. In cancer-free individuals, the frequency increased with age; 0.23% under 50 and 1.91% between 75 and 79 (p=4.8×10−8). Mosaic abnormalities were more frequent in individuals with solid-tumors (0.97% versus 0.74% in cancer-free individuals, OR=1.25, p=0.016), with a stronger association for cases who had DNA collected prior to diagnosis or treatment (OR=1.45, p=0.0005). Detectable clonal mosaicism was common in individuals for whom DNA was collected at least one year prior to diagnosis of leukemia compared to cancer-free individuals (OR=35.4, p=3.8×10−11). These findings underscore the importance of the role and time-dependent nature of somatic events in the etiology of cancer and other late-onset diseases.
Advances in genomics and related fields are promising tools for risk assessment, early detection, and targeted therapies across the entire cancer care continuum. In this commentary, we submit that this promise cannot be fulfilled without an enhanced translational genomics research agenda firmly rooted in the population sciences. Population sciences include multiple disciplines that are needed throughout the translational research continuum. For example, epidemiologic studies are needed not only to accelerate genomic discoveries and new biological insights into cancer etiology and pathogenesis, but to characterize and critically evaluate these discoveries in well defined populations for their potential for cancer prediction, prevention and response to treatments. Behavioral, social and communication sciences are needed to explore genomic-modulated responses to old and new behavioral interventions, adherence to therapies, decision-making across the continuum, and effective use in health care. Implementation science, health services, outcomes research, comparative effectiveness research and regulatory science are needed for moving validated genomic applications into practice and for measuring their effectiveness, cost effectiveness and unintended consequences. Knowledge synthesis, evidence reviews and economic modeling of the effects of promising genomic applications will facilitate policy decisions, and evidence-based recommendations. Several independent and multidisciplinary panels have recently made specific recommendations for enhanced research and policy infrastructure to inform clinical and population research for moving genomic innovations into the cancer care continuum. An enhanced translational genomics and population sciences agenda is urgently needed to fulfill the promise of genomics in reducing the burden of cancer.
cancer; genetics; genomics; medicine; population sciences; public health; translation
Genome-wide association studies have broadened our understanding of the genetic architecture of cancer to include common variants, in addition to the rare variants previously identified by linkage analysis. We review current knowledge on the genetic architecture of four cancers—breast, lung, prostate and colorectal—for which the balance of common and rare alleles identified ranges from fewer common alleles (lung cancer) to more common alleles (prostate cancer). Although most variants are cancer specific, pleiotropy has been observed for several variants, for example, variants at the 8q24 locus and breast, ovarian and prostate cancers or variants in KITLG in relation to hair color and testicular cancer. Although few studies have been adequately powered to investigate heterogeneity among ancestry groups, effect sizes associated with common variants have been reported to be fairly homogenous among ethnic groups. Some associations appear to be ancestry specific, such as HNF1B, which is associated with prostate cancer in European Americans and Latinos but not in African-Americans. Studies of cancer and other complex diseases suggest that a simple dichotomy between rare and common allelic architectures may be too simplistic and that future research is needed to characterize a fuller spectrum of allele frequency (common (>5%), uncommon (1–5%) and rare (<<1%) alleles) and effect size. In addition, a broadening of the concept of genetic architecture to encompass both population architecture, which reflects differences in exposures, genetic factors and population level risk among diverse groups of people, and genomic architecture, which includes structural, epigenomic and somatic variation, is envisioned.
We report a genome-wide association study of melanoma, conducted by GenoMEL, of 2,981 cases, of European ancestry, and 1,982 study-specific controls, plus a further 6,426 French and UK population controls, all genotyped for 317,000 or 610,000 SNPs. The analysis confirmed previously known melanoma susceptibility loci. The 7 novel regions with at least one SNP with p<10−5 and further local imputed or genotyped support were selected for replication using two other genome-wide studies (from Australia and Houston, Texas). Additional replication came from UK and Dutch case-control series. Three of the 7 regions replicated at p<10−3: an ATM missense polymorphism (rs1801516, overall p=3.4×10−9); a polymorphism within MX2 (rs45430, p=2.9×10−9) and a SNP adjacent to CASP8 (rs13016963, p=8.6×10−10). A fourth region near CCND1 remains of potential interest, showing suggestive but inconclusive evidence of replication. Unlike the previously known regions, the novel loci showed no association with nevus or pigmentation phenotypes in a large UK case-control series.
Recombination, together with mutation, is the ultimate source of genetic variation in populations. We leverage the recent mixture of people of African and European ancestry in the Americas to build a genetic map measuring the probability of crossing-over at each position in the genome, based on about 2.1 million crossovers in 30,000 unrelated African Americans. At intervals of more than three megabases it is nearly identical to a map built in Europeans. At finer scales it differs significantly, and we identify about 2,500 recombination hotspots that are active in people of West African ancestry but nearly inactive in Europeans. The probability of a crossover at these hotspots is almost fully controlled by the alleles an individual carries at PRDM9 (P<10−245). We identify a 17 base pair DNA sequence motif that is enriched in these hotspots, and is an excellent match to the predicted binding target of African-enriched alleles of PRDM9.
A significant proportion of high-risk breast cancer families are not explained by mutations in known genes. Recent genome-wide searches (GWS) have not revealed any single major locus reminiscent of BRCA1 and BRCA2, indicating that still unidentified genes may explain relatively few families each or interact in a way obscure to linkage analyses. This has drawn attention to possible benefits of studying populations where genetic heterogeneity might be reduced. We thus performed a GWS for linkage on nine Icelandic multiple-case non-BRCA1/2 families of desirable size for mapping highly penetrant loci. To follow up suggestive loci, an additional 13 families from other Nordic countries were genotyped for selected markers.
GWS was performed using 811 microsatellite markers providing about five centiMorgan (cM) resolution. Multipoint logarithm of odds (LOD) scores were calculated using parametric and nonparametric methods. For selected markers and cases, tumour tissue was compared to normal tissue to look for allelic loss indicative of a tumour suppressor gene.
The three highest signals were located at chromosomes 6q, 2p and 14q. One family contributed suggestive LOD scores (LOD 2.63 to 3.03, dominant model) at all these regions, without consistent evidence of a tumour suppressor gene. Haplotypes in nine affected family members mapped the loci to 2p23.2 to p21, 6q14.2 to q23.2 and 14q21.3 to q24.3. No evidence of a highly penetrant locus was found among the remaining families. The heterogeneity LOD (HLOD) at the 6q, 2p and 14q loci in all families was 3.27, 1.66 and 1.24, respectively. The subset of 13 Nordic families showed supportive HLODs at chromosome 6q (ranging from 0.34 to 1.37 by country subset). The 2p and 14q loci overlap with regions indicated by large families in previous GWS studies of breast cancer.
Chromosomes 2p, 6q and 14q are candidate sites for genes contributing together to high breast cancer risk. A polygenic model is supported, suggesting the joint effect of genes in contributing to breast cancer risk to be rather common in non-BRCA1/2 families. For genetic counselling it would seem important to resolve the mode of genetic interaction.
We report a genome-wide association study of melanoma conducted by the GenoMEL consortium based on 317k tagging SNPs for 1650 genetically-enriched cases (from Europe and Australia) and 4336 controls and subsequent replication in 1149 genetically-enriched cases and 964 controls and a population-based case-control study of 1163 cases and 903 controls. The genome-wide screen identified five regions with genotyped or imputed SNPs reaching p < 5×10−7; three regions were replicated: 16q24 encompassing MC1R (overall p=2.54×10−27 for rs258322), 11q14-q21 encompassing TYR (p=2.41×10−14 for rs1393350) and 9p21 adjacent to MTAP and flanking CDKN2A (p=4.03×10−7 for rs7023329). MC1R and TYR are associated with pigmentation, freckling and cutaneous sun sensitivity, well-recognised melanoma risk factors, while the 9p21 locus is novel for common variants associated with melanoma. Despite wide variation in allele frequency, these genetic variants show notable homogeneity of effect across populations of European ancestry living at different latitudes and contribute independently to melanoma risk.
We conducted a genome-wide association pooling study for cutaneous melanoma and performed validation in samples totalling 2019 cases and 2105 controls. Using pooling we identified a novel melanoma risk locus on chromosome 20 (rs910873, rs1885120), with replication in two further samples (combined P < 1×10−15). The odds ratio is 1.75 (1.53, 2.01), with evidence for stronger association in early onset cases.
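An odds ratio and its interval can be converted into an approximate test statistic as a rough plausibility check. The sketch below assumes the interval in parentheses is a 95% Wald-type confidence interval that is symmetric on the log-odds scale; the abstract does not state this, and the function names are illustrative. Because the published limits are rounded, the result is only approximate.

```python
from math import erfc, log, sqrt


def z_from_or_ci(odds_ratio: float, lower: float, upper: float,
                 z_crit: float = 1.959964) -> float:
    """Approximate Wald z statistic recovered from an OR and its CI limits."""
    # On the log scale the interval spans 2 * z_crit standard errors (assumed).
    standard_error = (log(upper) - log(lower)) / (2 * z_crit)
    return log(odds_ratio) / standard_error


def two_sided_p(z: float) -> float:
    """Two-sided normal-theory p value for a z statistic."""
    return erfc(abs(z) / sqrt(2))


z = z_from_or_ci(1.75, 1.53, 2.01)
p = two_sided_p(z)
print(f"z = {z:.2f}, p = {p:.1e}")
```

With these inputs the z statistic is roughly 8, which is of the same order as the very small combined P value reported, although the study's own P value may come from a different model.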
By assaying hundreds of thousands of single nucleotide polymorphisms, genome wide association studies (GWAS) allow for a powerful, unbiased review of the entire genome to localize common genetic variants that influence health and disease. Although it is widely recognized that some correction for multiple testing is necessary, in order to control the family-wide Type 1 Error in genetic association studies, it is not clear which method to utilize. One simple approach is to perform a Bonferroni correction using all n single nucleotide polymorphisms (SNPs) across the genome; however this approach is highly conservative and would "overcorrect" for SNPs that are not truly independent. Many SNPs fall within regions of strong linkage disequilibrium (LD) ("blocks") and should not be considered "independent".
We proposed to approximate the number of "independent" SNPs by counting 1 SNP per LD block, plus all SNPs outside of blocks (interblock SNPs). We examined the effective number of independent SNPs for Genome Wide Association Study (GWAS) panels. In the CEPH Utah (CEU) population, by considering the interdependence of SNPs, we could reduce the total number of effective tests within the Affymetrix and Illumina SNP panels from 500,000 and 317,000 to 67,000 and 82,000 "independent" SNPs, respectively. For the Affymetrix 500 K and Illumina 317 K GWAS SNP panels we recommend using 10−5, 10−7 and 10−8 and for the Phase II HapMap CEPH Utah and Yoruba populations we recommend using 10−6, 10−7 and 10−9 as "suggestive", "significant" and "highly significant" p-value thresholds to properly control the family-wide Type 1 error.
By approximating the effective number of independent SNPs across the genome we are able to 'correct' for a more accurate number of tests and therefore develop 'LD adjusted' Bonferroni corrected p-value thresholds that account for the interdependence of SNPs on well-utilized commercially available SNP "chips". These thresholds will serve as guides to researchers trying to decide which regions of the genome should be studied further.
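The effect of replacing the total SNP count with an effective number of independent SNPs can be illustrated with plain Bonferroni arithmetic. This is a minimal sketch, assuming a family-wide alpha of 0.05 (the abstract does not state the alpha used) and using the panel sizes and effective SNP counts quoted above; the function and variable names are illustrative. The recommended thresholds in the abstract are rounded powers of ten and may rest on considerations not reproduced here.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test p-value threshold that keeps the family-wide error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# (all SNPs on the panel, effective "independent" SNPs in the CEU population)
panels = {
    "Affymetrix 500K": (500_000, 67_000),
    "Illumina 317K": (317_000, 82_000),
}

for name, (n_all, n_effective) in panels.items():
    naive = bonferroni_threshold(n_all)
    ld_adjusted = bonferroni_threshold(n_effective)
    print(f"{name}: naive {naive:.2e}, LD-adjusted {ld_adjusted:.2e}")
```

Because the effective number of tests is smaller than the panel size, the LD-adjusted threshold is always less stringent than the naive one, which is the sense in which the naive correction "overcorrects".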
Several mutations in the PALB2 gene (partner and localizer of BRCA2) have been associated with an increased risk of breast cancer, including a founder mutation, 1592delT, reported in Finnish breast cancer families. Although the risk is most often moderate, this does not exclude the existence of families with high-risk mutations, and such observations have been reported. To see whether high-risk PALB2 mutations may be present in the geographically confined population of Iceland, linkage analysis targeting the PALB2 region was done on 111 individuals, including 61 breast cancer cases, from 9 high-risk non-BRCA1/BRCA2 breast cancer families. The 9 high-risk families and 638 unselected breast cancer cases were also screened for the 1592delT founder mutation. The results indicate no linkage in any of the high-risk families, and screening for the 1592delT mutation was negative in all samples. PALB2 does not appear to be a significant factor in high-risk breast cancer families in Iceland, and the 1592delT mutation was not found to be associated with breast cancer in Iceland.