We propose a two-step model-based approach, with correction for ascertainment, to linkage analysis of a binary trait with variable age of onset and apply it to a set of multiplex pedigrees segregating for adult glioma.
First, we fit segregation models by formulating the likelihood for a person to have a bivariate phenotype, affection status and age of onset, along with other covariates, and from these we estimate population trait allele frequencies and penetrance parameters as a function of age (N=281 multiplex glioma pedigrees). Second, the best fitting models are used as trait models in multipoint linkage analysis (N=74 informative multiplex glioma pedigrees). To correct for ascertainment, a prevalence constraint is used in the likelihood of the segregation models for all 281 pedigrees. Then the trait allele frequencies are re-estimated for the pedigree founders of the subset of 74 pedigrees chosen for linkage analysis.
Using the best fitting segregation models in model-based multipoint linkage analysis, we identified two separate peaks on chromosome 17: the first agreed with a region identified by Shete et al., who used model-free affected-only linkage analysis, but with a narrower peak; the second agreed with a second region they found but had a larger maximum log of the odds (LOD) score.
Our approach has the advantage of not requiring markers to be in linkage equilibrium unless the minor allele frequency is small (markers which tend to be uninformative for linkage), and of using more of the available information for LOD-based linkage analysis.
Glioma; model-based linkage; segregation; age of onset; prevalence constraint
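To make the LOD statistic used above concrete, the following is a minimal sketch of the simplest textbook case: a two-point LOD score for phase-known, fully informative meioses. It is not the multipoint, age-dependent-penetrance analysis described in the abstract; the function names and the example counts are hypothetical and chosen only for illustration.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( L(theta) / L(0.5) ), with
    L(theta) = theta**r * (1 - theta)**(n - r).
    """
    r, n = recombinants, meioses
    if n <= 0 or not 0 <= r <= n:
        raise ValueError("need meioses > 0 and 0 <= recombinants <= meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if theta == 0.0:
        # L(0) is 1 when no recombinants were seen, and 0 otherwise.
        log10_lik = 0.0 if r == 0 else float("-inf")
    else:
        log10_lik = r * math.log10(theta) + (n - r) * math.log10(1.0 - theta)
    # Subtract the log-likelihood under no linkage (theta = 0.5).
    return log10_lik - n * math.log10(0.5)


def max_lod(recombinants, meioses):
    """Return (theta_hat, LOD at theta_hat), with theta_hat capped at 0.5."""
    if meioses <= 0:
        raise ValueError("need meioses > 0")
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, lod_score(recombinants, meioses, theta_hat)


if __name__ == "__main__":
    # Hypothetical counts: 2 recombinants observed in 10 informative meioses.
    theta_hat, z = max_lod(2, 10)
    print(f"theta_hat = {theta_hat:.2f}, max LOD = {z:.3f}")
```

A model-based analysis such as the one in the abstract replaces the simple binomial likelihood with a pedigree likelihood that depends on trait allele frequencies and age-specific penetrances, but the LOD is still a log10 likelihood ratio against the no-linkage hypothesis.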
Many individuals with multiple or large colorectal adenomas, or early-onset colorectal cancer (CRC), have no detectable germline mutations in the known cancer predisposition genes. Using whole-genome sequencing, supplemented by linkage and association analysis, we identified specific heterozygous POLE or POLD1 germline variants in several multiple adenoma and/or CRC cases, but in no controls. The susceptibility variants appear to have high penetrance. POLD1 is also associated with endometrial cancer predisposition. The mutations map to equivalent sites in the proof-reading (exonuclease) domain of DNA polymerases ε and δ, and are predicted to impair correction of mispaired bases inserted during DNA replication. In agreement with this prediction, mutation carriers’ tumours were microsatellite-stable, but tended to acquire base substitution mutations, as confirmed by yeast functional assays. Further analysis of published data showed that the recently-described group of hypermutant, microsatellite-stable CRCs is likely to be caused by somatic POLE exonuclease domain mutations.
Gliomas account for approximately 80% of all primary malignant brain tumors, and despite improvements in clinical care over the last 20 years remain among the most lethal tumors, underscoring the need for gaining new insights that could translate into clinical advances. Recent genome-wide association studies (GWAS) have identified seven new susceptibility regions. We conducted a new independent GWAS of glioma using 1,856 cases and 4,955 controls (from 14 cohort studies, 3 case-control studies, and 1 population-based case-only study) and found evidence of strong replication for three of the seven previously reported associations at 20q13.33 (RTEL), 5p15.33 (TERT), and 9p21.3 (CDKN2BAS), and consistent association signals for the remaining four at 7p11.2 (EGFR both loci), 8q24.21 (CCDC26) and 11q23.3 (PHLDB1). The direction and magnitude of the signal were consistent for samples from cohort and case-control studies, but the strength of the association was more pronounced for loci rs6010620 (20q13.33; RTEL) and rs2736100 (5p15.33; TERT) in cohort studies despite the smaller number of cases in this group, likely due to relatively more higher-grade tumors being captured in the cohort studies. We further examined the 85 most promising single nucleotide polymorphism (SNP) markers identified in our study in three replication sets (5,015 cases and 11,601 controls), but no new markers reached genome-wide significance. Our findings suggest that larger studies focusing on novel approaches as well as specific tumor subtypes or subgroups will be required to identify additional common susceptibility loci for glioma risk.
The risk of glioma has consistently been shown to be increased two-fold in relatives of patients with primary brain tumors (PBT). A recent genome-wide linkage study of glioma families provided evidence for a disease locus on 17q12-21.32, with the possibility of four additional risk loci at 6p22.3, 12p13.33-12.1, 17q22-23.2, and 18q23.
To identify the underlying genetic variants responsible for the linkage signals, we compared the genotype frequencies of 5,122 SNPs mapping to these five regions in 88 glioma cases with and 1,100 cases without a family history of PBT (discovery study). An additional series of 84 familial and 903 non-familial cases was used to replicate associations.
In the discovery study, 12 SNPs showed significant associations with family history of PBT (P < 0.001). In the replication study, two of the 12 SNPs were confirmed: 12p13.33-12.1 PRMT8 rs17780102 (P = 0.031) and 17q12-21.32 SPOP rs650461 (P = 0.025). In the combined analysis of discovery and replication studies, the strongest associations were attained at four SNPs: 12p13.33-12.1 PRMT8 rs17780102 (P = 0.0001), SOX5 rs7305773 (P = 0.0001) and STYK1 rs2418087 (P = 0.0003), and 17q12-21.32 SPOP rs6504618 (P = 0.0006). Further, a significant gene-dosage effect was found for increased risk of family history of PBT with these four SNPs in the combined data set (Ptrend < 1.0 × 10−8).
The results support the linkage finding that some loci in the 12p13.33-12.1 and 17q12-21.32 regions may contribute to gliomagenesis and suggest potential target genes underlying the linkage signals.
Association; Polymorphisms; Glioma; Family history of primary brain tumor; Linkage analysis
We conducted a genome-wide association study of male breast cancer using 823 cases and 2,795 controls of European ancestry with validation in independent sample sets totalling 438 cases and 474 controls. A novel variant in RAD51B (14q24.1) was significantly associated with male breast cancer risk (P = 3.02 ×10−13, odds ratio (OR) = 1.57). TOX3 (16q12.1) was also a susceptibility locus (P = 3.87 ×10−15, OR = 1.50).
Breast cancer is the most common cancer in women in developed countries. To identify common breast cancer susceptibility alleles, we conducted a genome-wide association study in which 582,886 SNPs were genotyped in 3,659 cases with a family history of the disease and 4,897 controls. Promising associations were evaluated in a second stage, comprising 12,576 cases and 12,223 controls. We identified five new susceptibility loci, on chromosomes 9, 10 and 11 (P = 4.6 × 10−7 to P = 3.2 × 10−15). We also identified SNPs in the 6q25.1 (rs3757318, P = 2.9 × 10−6), 8q24 (rs1562430, P = 5.8 × 10−7) and LSP1 (rs909116, P = 7.3 × 10−7) regions that showed more significant association with risk than those reported previously. Previously identified breast cancer susceptibility loci were also found to show larger effect sizes in this study of familial breast cancer cases than in previous population-based studies, consistent with polygenic susceptibility to the disease.
Many colorectal cancers (CRCs) develop in genetically susceptible individuals, most of whom are not carriers of germ line mismatch repair or APC gene mutations, and much of the heritable risk of CRC appears to be attributable to the co-inheritance of multiple low-risk variants. The accumulated experience to date in identifying this class of susceptibility allele has highlighted the need to conduct statistically and methodologically rigorous studies and the need for multi-centre collaboration. This has been the motivation for establishing the COGENT (COlorectal cancer GENeTics) consortium, which now includes over 20 research groups in Europe, Australia, the Americas, China and Japan actively working on CRC genetics. Here, we review the rationale for identifying low-penetrance variants for CRC and the current and future challenges for COGENT.
Gliomas, which generally have a poor prognosis, are the most common primary malignant brain tumors in adults. Recent genome-wide association studies have demonstrated that inherited susceptibility plays a role in the development of glioma. Although first-degree relatives of patients exhibit a two-fold increased risk of glioma, the search for susceptibility loci in familial forms of the disease has been challenging because the disease is relatively rare, fatal, and heterogeneous, making it difficult to collect sufficient biosamples from families for statistical power. To address this challenge, the Genetic Epidemiology of Glioma International Consortium (Gliogene) was formed to collect DNA samples from families with two or more cases of histologically confirmed glioma. In this study, we present results obtained from 46 U.S. families in which multipoint linkage analyses were undertaken using nonparametric (model-free) methods. After removal of high linkage disequilibrium SNPs, we obtained a maximum nonparametric linkage score (NPL) of 3.39 (P=0.0005) at 17q12–21.32 and the Z-score of 4.20 (P=0.000007). To replicate our findings, we genotyped 29 independent U.S. families and obtained a maximum NPL score of 1.26 (P=0.008) and the Z-score of 1.47 (P=0.035). Accounting for the genetic heterogeneity using the ordered subset analysis approach, the combined analyses of 75 families resulted in a maximum NPL score of 3.81 (P=0.00001). The genomic regions we have implicated in this study may offer novel insights into glioma susceptibility, focusing future work to identify genes that cause familial glioma.
Glioma; family studies; linkage; haplotype pattern; NPL
Genome-wide association study (GWAS) data on a disease are increasingly available from multiple related populations. In this scenario, meta-analyses can improve power to detect homogeneous genetic associations, but if there exist ancestry-specific effects, via interactions on genetic background or with a causal effect that co-varies with genetic background, then these will typically be obscured. To address this issue, we have developed a robust statistical method for detecting susceptibility gene-ancestry interactions in multi-cohort GWAS based on closely-related populations. We use the leading principal components of the empirical genotype matrix to cluster individuals into “ancestry groups” and then look for evidence of heterogeneous genetic associations with disease or other trait across these clusters. Robustness is improved when there are multiple cohorts, as the signal from true gene-ancestry interactions can then be distinguished from gene-collection artefacts by comparing the observed interaction effect sizes in collection groups relative to ancestry groups. When applied to colorectal cancer, we identified a missense polymorphism in iron-absorption gene CYBRD1 that associated with disease in individuals of English, but not Scottish, ancestry. The association replicated in two additional, independently-collected data sets. Our method can be used to detect associations between genetic variants and disease that have been obscured by population genetic heterogeneity. It can be readily extended to the identification of genetic interactions on other covariates such as measured environmental exposures. We envisage our methodology being of particular interest to researchers with existing GWAS data, as ancestry groups can be easily defined and thus tested for interactions.
Pipelines for the analysis of Next-Generation Sequencing (NGS) data are generally composed of a set of different publicly available software tools, configured together in order to map short reads of a genome and call variants. The fidelity of pipelines is variable. We have developed ArtificialFastqGenerator, which takes a reference genome sequence as input and outputs artificial paired-end FASTQ files containing Phred quality scores. Since these artificial FASTQs are derived from the reference genome, they provide a gold standard for read alignment and variant calling, thereby enabling the performance of any NGS pipeline to be evaluated. The user can customise DNA template/read length, the modelling of coverage based on GC content, whether to use real Phred base quality scores taken from existing FASTQ files, and whether to simulate sequencing errors. Detailed coverage and error summary statistics are output. Here we describe ArtificialFastqGenerator and illustrate its implementation in evaluating a typical bespoke NGS analysis pipeline under different experimental conditions. ArtificialFastqGenerator was released in January 2012. Source code, example files and binaries are freely available under the terms of the GNU General Public License v3.0 from https://sourceforge.net/projects/artfastqgen/.
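The general idea of deriving paired-end reads with known origins from a reference can be sketched as follows. This is an illustrative toy only and does not reproduce ArtificialFastqGenerator's code, options or coverage model: every function name, parameter and default here is an assumption, qualities are a single fixed Phred value, and the error model is a simple uniform substitution.

```python
import random

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement a DNA string over the alphabet ACGTN."""
    return seq.translate(_COMPLEMENT)[::-1]


def add_errors(seq, p_err, rng):
    """Substitute each base with probability p_err (hypothetical error model)."""
    out = []
    for base in seq:
        if rng.random() < p_err:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def fastq_record(name, seq, qual):
    """Format one four-line FASTQ record."""
    return f"@{name}\n{seq}\n+\n{qual}\n"


def simulate_pairs(reference, n_pairs, read_len=50, template_len=200,
                   phred=30, simulate_errors=True, seed=0):
    """Return a list of (read1, read2) FASTQ records drawn from `reference`.

    The 0-based template start is written into the read name, so the true
    origin of every pair is known when a pipeline's output is evaluated.
    """
    if not read_len <= template_len <= len(reference):
        raise ValueError("need read_len <= template_len <= len(reference)")
    rng = random.Random(seed)
    qual = chr(phred + 33) * read_len   # Phred+33 encoding
    p_err = 10 ** (-phred / 10)         # error probability implied by Phred
    pairs = []
    for i in range(n_pairs):
        start = rng.randrange(len(reference) - template_len + 1)
        template = reference[start:start + template_len]
        fwd = template[:read_len]
        rev = reverse_complement(template)[:read_len]
        if simulate_errors:
            fwd = add_errors(fwd, p_err, rng)
            rev = add_errors(rev, p_err, rng)
        name = f"sim_{i}_{start}"
        pairs.append((fastq_record(name + "/1", fwd, qual),
                      fastq_record(name + "/2", rev, qual)))
    return pairs


if __name__ == "__main__":
    toy_reference = "ACGTTGCA" * 50   # hypothetical 400 bp reference
    for read1, read2 in simulate_pairs(toy_reference, 2):
        print(read1 + read2, end="")
```

Because each read name carries its template start, aligned positions reported by a pipeline can be compared directly with the known origin, which is what makes simulated reads usable as a gold standard.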
Objectives: Using a novel candidate SNP approach, we aimed to identify a possible genetic basis for the higher glioma incidence in Whites relative to East Asians and African-Americans. Methods: We hypothesized that genetic regions containing SNPs with extreme differences in allele frequencies across ethnicities are most likely to harbor susceptibility variants. We used International HapMap Project data to identify 3,961 candidate SNPs with the largest allele frequency differences in Whites compared to East Asians and Africans and tested these SNPs for association with glioma risk in a set of White cases and controls. Top SNPs identified in the discovery dataset were tested for association with glioma in five independent replication datasets. Results: No SNP achieved statistical significance in either the discovery or replication datasets after accounting for multiple testing or conducting meta-analysis. However, the most strongly associated SNP, rs879471, was found to be in linkage disequilibrium with a previously identified risk SNP, rs6010620, in RTEL1. We estimate rs6010620 to account for a glioma incidence rate ratio of 1.34 for Whites relative to East Asians. Conclusion: We explored genetic susceptibility to glioma using a novel candidate SNP method which may be applicable to other diseases with appropriate epidemiologic patterns.
glioma; candidate SNP association study; ancestry informative markers; admixture; race; ethnicity; brain cancer
Genome-wide association (GWA) studies, where hundreds of thousands of single-nucleotide polymorphisms (SNPs) are tested simultaneously, are becoming popular for identifying disease loci for common diseases. Most commonly, a GWA study involves two stages: the first stage includes testing the association between all SNPs and the disease and the second stage includes replication of SNPs selected from the first stage to validate associations in an independent sample. The first stage is considered to be more fundamental since the second stage is contingent on the results of the first stage. Selection of SNPs from stage one for genotyping in stage two is typically based on an arbitrary threshold or on controlling type I errors. These strategies can be inefficient and have the potential to exclude disease-associated SNPs from genotyping in stage two. We propose an approach for selecting top SNPs that uses a strategy based on the false-negative rate (FNR). Using the FNR approach, we propose the number of SNPs that should be selected based on the observed p-values and a pre-specified multi-testing power in the first stage. We applied our method to simulated data and to data from a GWA study of glioma (a rare form of brain tumor). Results from simulation and the glioma GWA indicate that the proposed approach provides an FNR-based way to select SNPs using pre-specified power.
False negative rate; SNP selection; Two-stage genome-wide association study
Using data from a genome-wide association study of 907 individuals with childhood acute lymphoblastic leukemia (cases) and 2,398 controls and with validation in samples totaling 2,386 cases and 2,419 controls, we have shown that common variation at 9p21.3 (rs3731217, intron 1 of CDKN2A) influences acute lymphoblastic leukemia risk (odds ratio = 0.71, P = 3.01 × 10−11), irrespective of cell lineage.
While lung cancer is largely caused by tobacco smoking, inherited genetic factors play a role in its etiology. Genome-wide association studies (GWAS) in Europeans have robustly demonstrated only three polymorphic variations influencing lung cancer risk. Tumor heterogeneity may have hampered the detection of association signal when all lung cancer subtypes were analyzed together. In a GWAS of 5,355 European smoking lung cancer cases and 4,344 smoking controls, we conducted a pathway-based analysis in lung cancer histologic subtypes with 19,082 SNPs mapping to 917 genes in the HuGE-defined “inflammation” pathway. We identified a susceptibility locus for squamous cell lung carcinoma (SQ) at 12p13.33 (RAD52, rs6489769), and replicated the association in three independent samples totaling 3,359 SQ cases and 9,100 controls (odds ratio=1.20, Pcombined=2.3×10−8).
The combination of pathway-based approaches and information on disease specific subtypes can improve the identification of cancer susceptibility loci in heterogeneous diseases.
Lung cancer; histology; squamous cell carcinoma; pathway analysis; RAD52
We have previously identified several colorectal cancer (CRC)-associated polymorphisms using genome-wide association (GWA) analysis. We sought to fine-map the location of the functional variants for three of these regions at 8q23.3 (EIF3H), 16q22.1 (CDH1/CDH3) and 19q13.11 (RHPN2). We genotyped two case–control sets at high density in the selected regions and used existing data from four other case–control sets, comprising a total of 9328 CRC cases and 10 480 controls. To improve marker density, we imputed genotypes from the 1000 Genomes Project and HapMap3 data sets. All three regions contained smaller areas in which a cluster of single nucleotide polymorphisms (SNPs) showed clearly stronger association signals than surrounding SNPs, allowing us to assign those areas as the most likely location of the disease-associated functional variant. Further fine-mapping within those areas was generally unhelpful in identifying the functional variation based on strengths of association. However, functional annotation suggested a relatively small number of functional SNPs, including some with potential regulatory function at 8q23.3 and 16q22.1 and a non-synonymous SNP in RHPN2. Interestingly, the expression quantitative trait locus browser showed a number of highly associated SNP alleles correlated with mRNA expression levels not of EIF3H and CDH1 or CDH3, but of UTP23 and ZFP90, respectively. In contrast, none of the top SNPs within these regions was associated with transcript levels at EIF3H, CDH1 or CDH3. Our post-GWA study highlights benefits of fine-mapping of common disease variants in combination with publicly available data sets. In addition, caution should be exercised when assigning functionality to candidate genes in regions discovered through GWA analysis.
While gliomas are the most common primary brain tumors, their etiology is largely unknown. To identify novel risk loci for glioma, we conducted genome-wide association (GWA) analysis of two case–control series from France and Germany (2269 cases and 2500 controls). Pooling these data with previously reported UK and US GWA studies provided data on 4147 glioma cases and 7435 controls genotyped for 424 460 common tagging single-nucleotide polymorphisms. Using these data, we demonstrate two statistically independent associations between glioma and rs11979158 and rs2252586, at 7p11.2 which encompasses the EGFR gene (population-corrected statistics, Pc = 7.72 × 10−8 and 2.09 × 10−8, respectively). Both associations were independent of tumor subtype, and were independent of EGFR amplification, p16INK4a deletion and IDH1 mutation status in tumors; compatible with driver effects of the variants on glioma development. These findings show that variation in 7p11.2 is a determinant of inherited glioma risk.
The pathogenesis of classical Hodgkin lymphoma (cHL) involves environmental and genetic factors. To explore the role of the human leukocyte antigen (HLA) genes, we performed a case-control genotyping study in 338 Dutch cHL patients using a PCR-based sequence-specific oligonucleotide probe (SSOP) hybridization approach. The allele frequencies were compared to HLA typings of more than 6,000 controls. The age of the cHL patients varied between 13 and 81 years with a median of 35 years. Nodular sclerosis subtype was the most common subtype (87%) and EBV was detected in 25% of the cHL patients. HLA-B5 was significantly increased and HLA-DR7 significantly decreased in the total cHL patient population as compared to controls. Two class II associations were observed to be specific for the EBV− cHL population with an increase of HLA-DR2 and HLA-DR5. Allele frequencies of HLA-A1, HLA-B37 and HLA-DR10 were significantly increased in the EBV+ cHL population; these alleles are in strong linkage disequilibrium and form a common haplotype in Caucasians. The allele frequency of HLA-A2 was significantly decreased in the EBV+ cHL population. Analysis of haplotypes with a frequency of >1% revealed a significant increase of HLA-A2-B7-DR2 in EBV− cHL as compared to controls. SSOP association analysis revealed significant differences between EBV+ and EBV− cHL patients for 19 probes that discriminate between HLA-A*01 and HLA-A*02. In conclusion, the HLA-A1 and HLA-A2 antigens and not specific single nucleotide variants shared by multiple alleles are responsible for the association with EBV+ cHL. Furthermore several new protective and predisposing HLA class I and II associations for the EBV+, the EBV− and the entire cHL population were identified.
Acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML) are common early-onset malignancies. Their causes are largely unknown but infectious etiology has been implicated. Type 1 diabetes (T1D) is an autoimmune disease for which infectious triggers of disease onset have been sought, with evidence increasingly pointing to enteroviruses. Based on our previous results on co-morbidity between leukemia and T1D, we updated the Swedish dataset and focused on early-onset leukemias in patients who had been hospitalized for T1D, comparing them to those not hospitalized for T1D.
Methods and Findings
Standardized incidence ratios (SIRs) were calculated for leukemia in 24,052 patients hospitalized for T1D covering years 1964 through 2008. T1D patients were included if hospitalized before age 21 years. Practically all Swedish children and adolescents with T1D are hospitalized at the start of insulin treatment. SIR for ALL was 8.30 (N = 18, 95% confidence interval 4.91–13.14) when diagnosed at age 10 to 20 years after hospitalization for T1D and it was 3.51 (13, 1.86–6.02) before hospitalization for T1D. The SIR for ALL was 19.85 (N = 33, 13.74–27.76) and that for AML was 25.28 (8, 10.80–50.06) when the leukemias were diagnosed within the year of T1D hospitalization. The SIRs increased to 38.97 (26, 25.43–57.18) and 40.11 (8, 17.13–79.42) when T1D was diagnosed between ages 10 to 20 years. No consistent time-dependent changes were found in leukemia risk.
A shared infectious etiology could be a plausible explanation for the observed co-morbidity. Other possible contributing factors could be insulin therapy or T1D-related metabolic disturbances.
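For readers unfamiliar with standardized incidence ratios, the sketch below shows one common way to compute an SIR (observed cases divided by expected cases) with an exact Poisson confidence interval. The source does not state which interval method was used, and the expected count in the example is back-calculated from the rounded SIR of 8.30 with N = 18, so both are assumptions. All function names are hypothetical, and the plain summation is only suitable for small counts.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), by direct summation (small k only)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)


def poisson_exact_ci(n, alpha=0.05):
    """Exact (Garwood-type) confidence limits for a Poisson mean given count n."""
    half = alpha / 2.0
    ceiling = n + 10.0 * math.sqrt(n + 1.0) + 20.0  # generous search bound

    # Lower limit: the mean at which P(X >= n) equals alpha / 2.
    lower = 0.0
    if n > 0:
        lo, hi = 0.0, ceiling
        for _ in range(100):
            mid = (lo + hi) / 2.0
            if 1.0 - poisson_cdf(n - 1, mid) < half:
                lo = mid
            else:
                hi = mid
        lower = (lo + hi) / 2.0

    # Upper limit: the mean at which P(X <= n) equals alpha / 2.
    lo, hi = 0.0, ceiling
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if poisson_cdf(n, mid) > half:
            lo = mid
        else:
            hi = mid
    upper = (lo + hi) / 2.0
    return lower, upper


def sir_with_ci(observed, expected, alpha=0.05):
    """Return (SIR, lower, upper), where SIR = observed / expected."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    lower, upper = poisson_exact_ci(observed, alpha)
    return observed / expected, lower / expected, upper / expected


if __name__ == "__main__":
    # Assumed expected count, back-calculated from the rounded SIR.
    sir, lo, hi = sir_with_ci(18, 18 / 8.30)
    print(f"SIR = {sir:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

With these assumed inputs the interval comes out close to the reported 4.91–13.14; exact agreement should not be expected, because the expected count is reconstructed from a rounded figure.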
DNA repair genes are important for maintaining genomic stability and limiting carcinogenesis. We analyzed all single nucleotide polymorphisms (SNPs) of 125 DNA repair genes covered by the Illumina HumanHap300 (v1.1) BeadChips in a previously conducted genome-wide association study (GWAS) of 1,154 lung cancer cases and 1,137 controls and replicated the top hits of XRCC4 SNPs in an independent set of 597 cases and 611 controls in Texas populations. We found that six of 20 XRCC4 SNPs were associated with a decreased risk of lung cancer with a P value of 0.01 or lower in the discovery dataset, of which the most significant SNP was rs10040363 (P for allelic test = 4.89 × 10−4). Moreover, the data in this region allowed us to impute a potentially functional SNP rs2075685 (imputed P for allelic test = 1.3 × 10−3). A luciferase reporter assay demonstrated that the rs2075685G>T change in the XRCC4 promoter increased expression of the gene. In the replication study of rs10040363, rs1478486, rs9293329, and rs2075685, however, only rs10040363 achieved a borderline association with a decreased risk of lung cancer in a dominant model (adjusted OR = 0.80, 95% CI = 0.62–1.03, P = 0.079). In the final combined analysis of both the Texas GWAS discovery and replication datasets, the strength of the association was increased for rs10040363 (adjusted OR = 0.77, 95% CI = 0.66–0.89, Pdominant = 5 × 10−4 and P for trend = 5 × 10−4) and rs1478486 (adjusted OR = 0.82, 95% CI = 0.71–0.94, Pdominant = 6 × 10−3 and P for trend = 3.5 × 10−3). Finally, we conducted a meta-analysis of these XRCC4 SNPs with available data from published GWA studies of lung cancer with a total of 12,312 cases and 47,921 controls, in which none of these XRCC4 SNPs was associated with lung cancer risk. It appeared that rs2075685, although associated with increased expression of a reporter gene and lung cancer risk in the Texas populations, did not have an effect on lung cancer risk in other populations.
This study underscores the importance of replication using published data in larger populations.
XRCC4; variant; Genetic susceptibility; genome-wide association study; replication study
Published genome-wide association studies (GWASs) have identified few variants in the known biological pathways involved in lung cancer etiology. To mine the possibly hidden causal single nucleotide polymorphisms (SNPs), we explored all SNPs in the extrinsic apoptosis pathway from our published GWAS dataset for 1154 lung cancer cases and 1137 cancer-free controls. In an initial association analysis of 611 tagSNPs in 41 apoptosis-related genes, we identified only 10 tagSNPs associated with lung cancer risk with a P value <10−2, including four tagSNPs in DAPK1 and three tagSNPs in TNFSF8. Unlike DAPK1 SNPs, TNFSF8 rs2181033 tagged four other predicted functional but untyped SNPs (rs776576, rs776577, rs31813148 and rs2075533) in the promoter region. Therefore, we further tested binding affinity of these four SNPs by performing the electrophoretic mobility shift assay. We found that only the rs2075533T allele modified levels of nuclear proteins bound to DNA, leading to significantly decreased expression of luciferase reporter constructs by 5- to 10-fold in H1299, HeLa and HCT116 cell lines compared with the C allele. We also performed a replication study of the untyped rs2075533 in an independent Texas population but did not confirm the protective effect. We further performed a mini meta-analysis for SNPs of TNFSF8 obtained from four other published lung cancer GWASs with 12 214 cases and 47 721 controls, and we found that only rs3181366 (r2 = 0.69 with the untyped rs2075533) was associated with lung cancer risk (P = 0.008). Our findings suggest a possible role of novel TNFSF8 variants in susceptibility to lung cancer.
In genome-wide association studies (GWASs) of colorectal cancer, we have identified two genomic regions in which pairs of tagging-single nucleotide polymorphisms (tagSNPs) are associated with disease; these comprise chromosomes 1q41 (rs6691170, rs6687758) and 12q13.13 (rs7136702, rs11169552). We investigated these regions further, aiming to determine whether they contain more than one independent association signal and/or to identify the SNPs most strongly associated with disease. Genotyping of additional sample sets at the original tagSNPs showed that, for both regions, the two tagSNPs were unlikely to identify a single haplotype on which the functional variation lay. Conversely, one of the pair of SNPs did not fully capture the association signal in each region. We therefore undertook more detailed analyses, using imputation, logistic regression, genealogical analysis using the GENECLUSTER program and haplotype analysis. In the 1q41 region, the SNP rs11118883 emerged as a strong candidate based on all these analyses, sufficient to account for the signals at both rs6691170 and rs6687758. rs11118883 lies within a region with strong evidence of transcriptional regulatory activity and has been associated with expression of PDGFRB mRNA. For 12q13.13, a complex situation was found: SNP rs7972465 showed stronger association than either rs11169552 or rs7136702, and GENECLUSTER found no good evidence for a two-SNP model. However, logistic regression and haplotype analyses supported a two-SNP model, in which a signal at the SNP rs706793 was added to that at rs11169552. Post-GWAS fine-mapping studies are challenging, but the use of multiple tools can assist in identifying candidate functional variants in at least some cases.
Male breast cancer accounts for approximately 1% of all breast cancer. To date, risk factors for male breast cancer are poorly defined, but certain risk factors and genetic features appear common to both male and female breast cancer. Genome-wide association studies (GWAS) have recently identified common single nucleotide polymorphisms (SNPs) that influence female breast cancer risk; 12 of these have been independently replicated. To examine if these variants contribute to male breast cancer risk, we genotyped 433 male breast cancer cases and 1,569 controls. Five SNPs showed a statistically significant association with male breast cancer: rs13387042 (2q35) (odds ratio (OR) = 1.30, p = 7.98×10−4), rs10941679 (5p12) (OR = 1.26, p = 0.007), rs9383938 (6q25.1) (OR = 1.39, p = 0.004), rs2981579 (FGFR2) (OR = 1.18, p = 0.03), and rs3803662 (TOX3) (OR = 1.48, p = 4.04×10−6). Comparing the ORs for male breast cancer with the published ORs for female breast cancer, three SNPs—rs13387042 (2q35), rs3803662 (TOX3), and rs6504950 (COX11)—showed significant differences in ORs (p<0.05) between sexes. Breast cancer is a heterogeneous disease; the relative risks associated with loci identified to date show subtype and, based on these data, gender specificity. Additional studies of well-defined patient subgroups could provide further insight into the biological basis of breast cancer development.
Breast cancer is the most common female cancer in the United Kingdom but also occurs in men, albeit at a much lower frequency. Relatively little is known regarding risk factors for male breast cancer. Here, we examine the effect of common genetic variants that are known to be associated with female breast cancer to determine whether they also affect risk of male breast cancer. We show that certain of these variants are also associated with male breast cancer risk but that the magnitudes of their effects differ in males from females. Future analyses of the genetics of male breast cancer may shed light on the biology of both male and female breast cancer.
Regions of restricted genetic heterogeneity due to identity by descent (autozygosity) are known to confer susceptibility to a number of diseases. Regions of germline homozygosity (ROHs) of 1–2 Mb, the result of autozygosity, are detectable at high frequency in outbred populations. Recent studies have reported that ROHs, possibly through exposing recessive disease-causing alleles or alternative mechanisms, are associated with an increased cancer risk. To examine whether homozygosity is associated with breast or prostate cancer risk, we analysed 500K single-nucleotide polymorphism data from two genome-wide association studies conducted by the Cancer Genetics Markers of Susceptibility initiatives (http://cgems.cancer.gov/). Six common ROHs were associated with breast cancer risk and four with prostate cancer (P<0.01). Intriguingly, one of the breast cancer ROHs maps to 6q22.31–6q22.3, a region that has been previously shown to confer breast cancer risk. Although none of the ROHs remained significantly associated with cancer risk after adjustment for multiple testing, a number of ROHs merit further interrogation. However, our findings provide no strong evidence that levels of measured homozygosity, whatever their aetiology (autozygosity, uniparental isodisomy or hemizygosity), confer an increased risk of developing breast or prostate cancer in predominantly outbred populations.
homozygosity; risk; prostate; breast; cancer
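The notion of a region of homozygosity can be illustrated with a simple scan over ordered SNP genotypes on one chromosome. This is a minimal sketch under stated assumptions, not the detection method used in the study: genotypes are coded as minor-allele counts (0, 1, 2), any heterozygous or missing call ends a run, and the thresholds and the function name are hypothetical.

```python
def find_rohs(positions, genotypes, min_snps=50, min_length_bp=1_000_000):
    """Find runs of consecutive homozygous genotypes on one chromosome.

    positions : base-pair coordinates, sorted in ascending order
    genotypes : minor-allele counts per SNP (0 or 2 = homozygous,
                1 = heterozygous, None = missing call)
    Returns a list of (start_bp, end_bp, n_snps) for each run that meets
    both thresholds.
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have equal length")

    runs = []
    run_start = None  # index of the first SNP in the current run

    def close_run(end_index):
        n_snps = end_index - run_start + 1
        span = positions[end_index] - positions[run_start]
        if n_snps >= min_snps and span >= min_length_bp:
            runs.append((positions[run_start], positions[end_index], n_snps))

    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if run_start is None:
                run_start = i
        else:
            # A heterozygous or missing call ends the run (a simplification).
            if run_start is not None:
                close_run(i - 1)
                run_start = None
    if run_start is not None:
        close_run(len(genotypes) - 1)
    return runs


if __name__ == "__main__":
    # Toy data: six SNPs with hypothetical coordinates and genotypes.
    pos = [0, 500_000, 1_200_000, 1_300_000, 2_000_000, 3_500_000]
    geno = [0, 2, 2, 1, 0, 0]
    print(find_rohs(pos, geno, min_snps=3))
```

Practical ROH callers usually tolerate a small number of heterozygous or missing calls within a window to allow for genotyping error; this sketch omits that to keep the core idea visible.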