Multiple genome-wide and candidate gene association studies have been performed in search of common risk variants for breast cancer. Recent large meta-analyses, consolidating evidence from these studies, have consistently highlighted the caspase-8 (CASP8) gene as important in this regard. In order to define a risk haplotype and map the CASP8 gene region with respect to the underlying susceptibility variant/s, we screened four genes in the CASP8 region on 2q33-q34 for breast cancer risk.
Two independent data sets from the United Kingdom and the United States, including 3,888 breast cancer cases and controls, were genotyped for 45 tagging single nucleotide polymorphisms (tSNP) in the expanded CASP8 region. SNP and haplotype association tests were carried out using Monte Carlo based methods.
We identified a three-SNP haplotype across rs3834129, rs6723097 and rs3817578 that was significantly associated with breast cancer (p<5×10−6), with a dominant risk ratio and 95% confidence interval of 1.28 (1.21-1.35) and frequency of 0.29 in controls. Evidence for this risk haplotype was extremely consistent across the two study sites and also consistent with previous data.
This three-SNP risk haplotype represents the best characterization so far of the chromosome upon which the susceptibility variant resides.
Characterization of the risk haplotype provides a strong foundation for re-sequencing efforts to identify the underlying risk variant, which may prove useful for individual-level risk prediction, and provide novel insights into breast carcinogenesis.
The caspase-8 (CASP8) gene is one of only three genes identified as possessing common variants with strong and noteworthy associations with breast cancer risk based on cumulative evidence from candidate gene and genome-wide association studies [1,2]. Similarly, pooled and meta-analyses focusing specifically on CASP8 have indicated strong associations with breast cancer [3-5]. In particular, these refer to the highly significant association between the minor allele at D302H in exon 12 (rs1045485) and decreased risk. Some data suggest that another variant, a 6-bp deletion in the CASP8 promoter (−652 6N del, rs3834129), is associated with breast cancer, although the evidence for this variant is much less consistent [6,7,8]. There is no known functional effect of rs1045485, and it is very rare in Asian populations. The del allele of rs3834129 has been suggested to remove an Sp1 transcription factor binding site, although functional data on the effects of this change in lymphocytes are conflicting [6,9]. Evidence thus far suggests that other variant/s in linkage disequilibrium (LD) with rs1045485 and/or rs3834129 are likely to be the critical variant/s.
CASP8 resides at chromosome 2q33. Two other genes, caspase-10 (CASP10) and amyotrophic lateral sclerosis 2 (juvenile) chromosome region candidate 12 (ALS2CR12), lie directly adjacent to CASP8. Like CASP8, CASP10 is an initiator of apoptosis, and the CASP10 V410I variant has been reported to be associated with breast cancer. Another gene, called ‘CASP8 and FADD-like Apoptosis Regulator’ (CFLAR), lies centromeric to CASP10. It is a member of the same gene family as CASP8 and CASP10, but acts as a negative regulator of apoptosis. Given their physical proximity to CASP8 and functional relevance (CASP10/CFLAR), the critical variants could reasonably lie in any of these four genes.
We previously genotyped 14 tagging SNPs (tSNPs) in CASP8 in 2,450 breast cancer case and control subjects from the Sheffield Breast Cancer Study (SBCS) and identified a four-SNP risk haplotype (1-1-2-1 across SNPs rs7608692, rs1861269, rs6723097, rs3817578; p = 8.0×10−5), with a per-allele OR (95% CI) of 1.30 (1.12–1.49). This haplotype was substantially more significant than any individual SNP, and was consistent with previous findings (i.e. the common (aspartate) allele at rs1045485 and the ins allele of rs3834129 are associated with the increased risk haplotype). The aim of the current study was to consider the broader four-gene region and refine the risk haplotype upon which the susceptibility variant/s lie. Here, we have studied 3,888 breast cancer cases and controls from two collaborating sites, genotyped for 45 tSNPs across the four genes in the CASP8 region at chromosome 2q33-q34.
A joint resource of 3,888 breast cancer cases and controls was genotyped: SBCS (N=2,049) and the Utah Breast Cancer Study (UBCS; N=1,839). The SBCS set included 1,015 histopathologically confirmed breast cancer patients recruited from the surgical outpatient clinics of the Royal Hallamshire Hospital, Sheffield, United Kingdom between 1998 and 2005. Case subjects were a mixture of incident and prevalent cases (median time to diagnosis 2.3 years), with median age at diagnosis (range) 59 (28-92) years. Fifteen percent of SBCS cases had at least one first degree relative with breast cancer. Control SBCS subjects (n=1,034) were healthy women attending the Sheffield Mammography Screening Service between 2000 and 2004. In the UK, women are invited for routine mammography screening every 3 years between the ages of 50 and 70 years, and the average uptake in Sheffield is >80%. Women whose mammograms showed no evidence of breast lesions were eligible as controls for this study, and median age at recruitment (range) was 57 (45-78) years. Eleven percent of SBCS controls had at least one first degree relative with breast cancer. The UBCS set included 905 breast cancer cases identified using the Utah Population Database (UPDB) and confirmed and ascertained through the Utah Cancer Registry; median age at diagnosis (range) was 56 (21-92) years and 41% of UBCS cases had at least one first degree relative with breast cancer. Controls (n=934) were birth cohort- and sex-matched cancer-free individuals, and 2% of these had at least one first degree relative with breast cancer. Using genealogy from the UPDB, it was established that 208 cases and 564 controls were singletons and the remaining individuals were members of 31 extended pedigrees, although most relationships were distant (average kinship coefficient=0.017, i.e. approximately 6 meioses distant, or second cousins). All cases and controls were of North European ancestry.
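As a rough check on the stated average kinship of 0.017, the standard path-counting rule for non-inbred relatives can be sketched as follows. The function name and the choice of second cousins as the worked case are illustrative assumptions, not part of the study's analysis.

```python
from fractions import Fraction

def kinship_from_paths(meioses_per_path):
    """Kinship coefficient for two relatives, by path counting.

    Each entry is the number of meioses on one path linking the pair
    through a distinct common ancestor. The common ancestors are
    assumed not to be inbred (an assumption of this sketch).
    """
    return sum(Fraction(1, 2) ** (m + 1) for m in meioses_per_path)

# Second cousins share two great-grandparents; each path has 6 meioses.
second_cousins = kinship_from_paths([6, 6])
print(second_cousins, float(second_cousins))  # 1/64 0.015625
```

The second-cousin value of 1/64 (about 0.016) is close to the reported average of 0.017, which agrees with the description of most relationships as distant.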
We excluded individuals with genotype call rates <80%, resulting in a total sample of 1,882 cases and 1,896 controls included in the genetic analyses.
For replication of significant single SNPs, we used a replication cohort comprising cases with a strong family history of breast cancer and control subjects from Manchester, UK. The cases comprised 713 subjects fulfilling the NICE criteria for BRCA1 and BRCA2 screening (>20% risk of mutation), but negative for BRCA1 or BRCA2 mutations as determined by DNA sequence analysis of coding regions, and multiplex ligation-dependent probe amplification to detect deletions and duplications. The 236 control subjects had no cancer and no immediate family history of breast cancer. All cases and controls were unrelated white British women.
We identified 60 tSNPs in the four genes for genotyping using LDselect, based on an analysis including all known SNPs with data available for the CEPH Utah individuals from HapMap and NIEHS. Genotyping was carried out using the Applied Biosystems SNPlex™ multiplex system (55 SNPs) or 5′ nuclease PCR (TaqMan™) (3 SNPs). Genotype data for the remaining 2 SNPs, rs3834129 and rs6723097, were already available for these subjects. Genotyping quality was assessed by examining duplicate concordance and call rates for each SNP and by testing for compliance with Hardy-Weinberg equilibrium (HWE) in controls. SNPs were removed if, in either SBCS or UBCS, their duplicate concordance rate was <98% (n=2), more than one plate failed (n=7), or HWE p<0.005 (n=1). We also removed SNPs that were monomorphic (n=5). This resulted in a final set of 45 tSNPs for analysis (Supplementary Table S1).
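The HWE filter described above can be illustrated with a minimal sketch. It assumes a Pearson chi-square goodness-of-fit test with one degree of freedom; the text does not state which HWE test was used, and the function name and genotype counts are hypothetical.

```python
import math

HWE_P_THRESHOLD = 0.005  # exclusion threshold quoted in the text

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Takes genotype counts for a biallelic SNP and returns (chi2, p).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical control genotype counts with a clear heterozygote deficit.
chi2, p_value = hwe_chi2_pvalue(600, 200, 200)
print(p_value < HWE_P_THRESHOLD)  # True means the SNP would be excluded
```

The removal counts in the text are also internally consistent: 60 − 2 − 7 − 1 − 5 = 45 tSNPs.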
In order to account for familial relatedness in the UBCS subjects, all analyses were carried out using the meta-association options in the Genie (single SNPs) and hapConstructor (haplotypes) software packages, which use Monte Carlo testing to derive empirical estimates of significance and 95% confidence intervals [16,17]. Odds ratios, confidence intervals and significance tests for individual SNPs were derived based on allele dose, dominant and recessive models. HapConstructor is a data-mining algorithm that builds multi-SNP haplotypes based on association evidence. Starting with evidence from single SNPs, the process adds or removes SNPs using a forward-backward algorithm. Each step includes tests for dominant, additive, and recessive models for each haplotype. The process continues provided that pre-defined significance thresholds are met at each step. The results from the data-mining are the haplotype, genetic model, risk ratio and p-value. The significance thresholds used for the haplotype construction process were 0.05, 0.005, 0.0005 and 0.0001 for haplotypes of one to four SNPs, respectively, and 0.00005 thereafter. Haplotypes were estimated via the expectation-maximization algorithm, and any missing genotypes were internally imputed. All p-values were estimated using between 100,000 and 500,000 simulations. Observations more extreme than all simulated data sets were designated p<1/simulations.
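The threshold schedule and the p<1/simulations convention can be sketched as follows. This is an illustration under stated assumptions, not the Genie or hapConstructor implementation: the null distribution here comes from simple permutation of case labels, which is only appropriate for unrelated subjects, and all function names and the test statistic are hypothetical.

```python
import random

def step_threshold(n_snps):
    """Significance threshold for a haplotype of n_snps SNPs, as listed in the text."""
    schedule = {1: 0.05, 2: 0.005, 3: 0.0005, 4: 0.0001}
    return schedule.get(n_snps, 0.00005)

def empirical_p(observed_stat, null_stats):
    """Fraction of null statistics at least as extreme as the observed one.

    Returns (p, is_bound). is_bound is True when the observation is more
    extreme than every simulated value, so p is the bound 1/simulations.
    """
    n = len(null_stats)
    hits = sum(1 for s in null_stats if s >= observed_stat)
    if hits == 0:
        return 1.0 / n, True
    return hits / n, False

def carrier_diff(carrier, is_case):
    """Carrier frequency in cases minus carrier frequency in controls."""
    cases = [c for c, k in zip(carrier, is_case) if k]
    controls = [c for c, k in zip(carrier, is_case) if not k]
    return sum(cases) / len(cases) - sum(controls) / len(controls)

def permutation_null(carrier, is_case, n_sims, seed=0):
    """Null statistics from shuffled case labels (unrelated subjects only)."""
    rng = random.Random(seed)
    labels = list(is_case)
    stats = []
    for _ in range(n_sims):
        rng.shuffle(labels)
        stats.append(carrier_diff(carrier, labels))
    return stats
```

Under this convention, a result reported as p<5×10−6 would correspond to an observation more extreme than all of 200,000 simulations, and p<1×10−5 to 100,000 simulations.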
Table 1 and Figure 1 illustrate the individual tSNP results based on the Cochran-Mantel-Haenszel test for trend (allele dose risk model). For comparison, Supplemental Table S2 and Figure S1 show the most significant evidence for each tSNP (based on dominant or recessive models). Three of the 45 tSNPs showed at least nominally significant association with breast cancer (ptrend<0.05) in single SNP meta-association analyses across the two sites (rs3769821, rs6723097, rs700635; Table 1 and Figure 1). As illustrated in Figure 1, the most significant single SNPs cluster in CASP8 and ALS2CR12, with rs3769821 in CASP8 being the most significant, with OR per-allele (95% CI) of 1.17 (1.05–1.30; p=0.0032; Table 1) and ORdom of 1.28 (1.21–1.38; p=5.3×10−4; Table S2). For confirmation, we genotyped the three most significant tSNPs in a cohort of unrelated familial cases negative for mutations in BRCA1 or BRCA2 from Manchester, UK and local controls. Nominally significant results were replicated for rs3769821 (ptrend = 0.042) and rs6723097 (ptrend = 0.014).
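To make the idea of combining evidence across the two sites concrete, the following sketch computes a Mantel-Haenszel pooled odds ratio over site-specific 2×2 tables under a dominant model (carrier versus non-carrier). It is not the Genie procedure, which additionally accounts for relatedness through Monte Carlo testing; the function names and all counts are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.

    a, b: carrier and non-carrier cases
    c, d: carrier and non-carrier controls
    """
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio over strata (here, study sites).

    Each table is a tuple (a, b, c, d) laid out as in odds_ratio.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Hypothetical carrier counts for two sites (not data from this study).
site_1 = (520, 480, 460, 540)
site_2 = (470, 430, 430, 500)
print(odds_ratio(*site_1), odds_ratio(*site_2))
print(mantel_haenszel_or([site_1, site_2]))
```

Stratifying by site in this way keeps differences in allele frequency or case-control ratio between sites from distorting the pooled estimate.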
HapConstructor analyses for the genes CASP8 and ALS2CR12 identified highly significant risk haplotypes (p<5×10−6 and p<1×10−5, respectively). No significant haplotypes (p<0.001) were identified in the downstream genes CFLAR or CASP10.
The most significant risk haplotype with the greatest effect size in CASP8 was a six-SNP haplotype 1-2-1-1-1-1 across rs3834129, rs6723097, rs3817578, rs7571586, rs36043647 and rs35010052 (p<5×10−6; ORdom=1.29, 95% CI 1.22-1.33). The first five SNPs reside in CASP8; the last SNP, rs35010052, is contained in the ~700bp region between CASP8 and ALS2CR12. The frequency of this haplotype was 0.27 in controls and 0.30 in cases. Risk estimates from the two separate sites were ORdom = 1.31, 95% CI 1.09-1.57 (p=0.004) and 1.27, 95% CI 1.22-1.33 (p=0.0002) for SBCS and UBCS, respectively. The association strength of this six-SNP haplotype was driven by a three-SNP sub-haplotype, which gave a slightly lower risk estimate but attained the same significance in the meta-analysis (1-2-1 at rs3834129, rs6723097 and rs3817578; p<5×10−6; ORdom=1.28, 95% CI 1.21-1.35; freqcontrol=0.29, freqcases=0.32). Single site results for this three-SNP sub-haplotype were ORdom = 1.29, 95% CI 1.07-1.55 (p=0.008) and 1.27, 95% CI 1.23-1.31 (p=0.0002) for SBCS and UBCS, respectively. These three SNPs are not observed to have substantial levels of LD between them in the population (0.54, 0.03 and 0.05 for rs3834129-rs6723097, rs3834129-rs3817578 and rs6723097-rs3817578, respectively).
The findings for CASP8 are consistent in direction with previous single SNP results for rs3834129 and rs1045485 [3-5], that is, the rs3834129 ins allele and rs1045485 aspartate allele are on or in LD with the risk haplotypes. Similarly, they are consistent with the risk haplotype we defined in our previous single-site SBCS analyses (1-1-2-1 across rs7608692, rs1861269, rs6723097, rs3817578; Table 2).
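For readers less familiar with pairwise LD, the sketch below shows how such a value can be computed from haplotype and allele frequencies. It assumes the r² measure; the text does not state which LD statistic the quoted values represent, and the function name and frequencies are hypothetical.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical frequencies: nearly independent alleles give r^2 close to 0.
print(ld_r2(p_ab=0.13, p_a=0.40, p_b=0.30))
```

Low pairwise values of this kind indicate that a multi-SNP haplotype carries information that no single member SNP captures on its own.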
The most significant risk haplotype and highest effect size in ALS2CR12 was a three-SNP haplotype 2-2-1 across rs1035140, rs1035142 and rs10185177 (p<1×10−5; ORdom=1.26, 95% CI 1.17-1.35; freqcontrol=0.27, freqcases=0.30). Single site results were ORdom = 1.29 95% CI 1.08-1.54 (p=0.005) and 1.21 95% CI 1.12-1.31 (p=0.003) for SBCS and UBCS, respectively. The first SNP on this haplotype, rs1035140, lies between CASP8 and ALS2CR12. Due to the strong association between the minor alleles at rs1035142 and rs700635 on this risk haplotype, the four-SNP and three-SNP haplotypes created by either adding allele 2 at rs700635 or substituting rs1035142 with rs700635, gave very similar results (both p<1×10−5; ORdom=1.25, 95% CI 1.20-1.30; freqcontrol=0.27, freqcases=0.30).
We performed a hapConstructor analysis across the CASP8-ALS2CR12 two-gene region. This analysis of 27 tSNPs converged to the same results identified from the CASP8-only analysis, involving haplotype 1-2-1 at pivotal SNPs rs3834129, rs6723097 and rs3817578. This confirmed the expectation that the risk haplotypes identified in CASP8 and ALS2CR12 are due to the same underlying variant/s and suggests that only one risk haplotype exists at this region.
Previously, we defined a four-SNP risk haplotype in CASP8 based on an analysis of 14 tSNPs and data from a single site (SBCS). Here, we provide a more thorough interrogation of a broader region using 45 tSNPs across four genes on chromosome 2q33 genotyped in two independent data sets (SBCS and UBCS). Two single-SNP results were additionally replicated in a third set of familial cases without BRCA1 or BRCA2 mutations. We have identified a common three-SNP risk haplotype in CASP8 that drives the association evidence at both sites and results in a highly significant meta-association finding (p<5×10−6; ORdom=1.28; freqcontrol=0.29). Evidence from multi-gene analyses indicates haplotypes that are centered in CASP8, and continues to confirm the involvement of CASP8 in breast cancer risk. The consistency of results across two independent sites lends robustness to the finding and credibility to the risk haplotype.
A recent genome-wide association study identified that the CASP8 region is associated with melanoma risk, and there are reports that it is also associated with other cancers including chronic lymphocytic leukaemia and pancreatic cancer [18-20]. These observations suggest that this region may be of broader interest for cancer in general. The risk haplotype provides a strong foundation for re-sequencing efforts by refining the haplotype upon which the susceptibility variant likely resides. Identification of the critical underlying risk variant may prove useful for individual-level risk prediction, aid in deciphering the role of CASP8 in risk of other cancers [6,18-20], and provide novel insights into breast carcinogenesis.
The work reported in this paper would not have been possible without the contribution of many individuals. In particular we thank Steve Backus, Kim Nguyen, Jathine Wong, Thomas Naranjo and Jim Farnham (UBCS) and Sue Higham, Gordon MacPherson, Helen Cramp, Dan Connley and Ian Brock (SBCS). We would also like to thank all the women who took part in these studies.
Grant Support: Data collection for the UBCS was made possible by the Utah Population Database (UPDB) and the Utah Cancer Registry (UCR). Partial support for all datasets within the UPDB was provided by the University of Utah Huntsman Cancer Institute (HCI) and the HCI Cancer Center Support grant, P30 CA42014 from the NCI. The UCR is funded by contract HHSN261201000026C from the NCI SEER program with additional support from the Utah State Department of Health and the University of Utah. The genotyping and analysis was supported by funding from the Susan G. Komen Foundation (BCTR0706911) and the Avon Foundation (02-2009-080), the Breast Cancer Campaign (2004Nov 49), and Yorkshire Cancer Research (S295, S299 and core funding). L-C-A acknowledges support from the Huntsman Cancer Foundation. HMcB, AL, WGN and DGE were all supported by the Manchester NIHR Biomedical Research Centre.
Financial support: Susan G. Komen Foundation (BCTR0706911 to NJC), Avon Foundation (02-2009-080 to NJC), Breast Cancer Campaign (grant 2004Nov 49 to AC), and Yorkshire Cancer Research (S295 and S299 to AC).
Conflict of Interest: No conflicts of interest to declare.