Recent large-scale studies have been successful in identifying common, low penetrance variants associated with common cancers. One such variant in the caspase-8 (CASP8) gene, D302H (rs1045485), has been confirmed to be associated with breast cancer risk, although the functional impact of this polymorphism (if any) is not yet clear. In order to further map the CASP8 gene with respect to breast cancer susceptibility, we performed extensive haplotype analyses using single nucleotide polymorphisms (SNPs) chosen to tag all common variation in the gene (tSNPs). We used a staged study design based on 3,200 breast cancer and 3,324 control subjects from the UK, Utah and Germany. Using a haplotype-mining algorithm in the UK cohort, we identified a 4-SNP haplotype that was significantly associated with breast cancer and superior to any other single or multi-locus combination (P=8.0×10⁻⁵), with a per allele odds ratio and 95% confidence interval [OR (95% CI)] of 1.30 (1.12, 1.49). The result remained significant after adjustment for the multiple testing inherent in mining techniques (false discovery rate (FDR), q=0.044). As expected, this haplotype includes the D302H locus. Multi-center analyses on a subset of the tSNPs yielded consistent results. This risk haplotype is likely to carry one or more underlying breast cancer susceptibility alleles, making it an excellent candidate for re-sequencing in homozygous individuals. An understanding of the mode of action of these alleles will aid risk assessment and may lead to the identification of novel treatment targets in breast cancer.
Recent genome-wide and candidate gene association studies have started to convincingly identify low-penetrance variants associated with breast cancer (1–4). The only confirmed common variant to have emerged from candidate gene studies for breast cancer so far is in the gene for the apoptosis-related cysteine protease caspase-8 (CASP8), located on chromosome region 2q33 (2, 5, 6). The rare allele of the non-synonymous variant D302H (rs1045485) was associated with a reduced risk of breast cancer, with a per allele OR (95% CI) of 0.88 (0.84, 0.92) in a large study of 16,423 cases and 17,109 controls carried out by the Breast Cancer Association Consortium (2). As yet, there is no known functional effect of rs1045485, and it is non-polymorphic in Asian populations. Another CASP8 polymorphism, a six-base pair insertion-deletion (indel) in the promoter of CASP8 (rs3834129), was found to reduce breast cancer risk in a Chinese population (7). However, subsequent larger studies failed to replicate this finding (8, 9).
The aim of the present work was to use a SNP-tagging approach to further map the CASP8 gene with respect to breast cancer risk, in order to move towards identification of potential susceptibility variant(s) (10).
The primary set of case and control subjects were drawn from the Sheffield Breast Cancer Study (SBCS) and consisted of histopathologically confirmed breast cancer patients recruited from the surgical outpatient clinics of the Royal Hallamshire Hospital, Sheffield UK, between November 1998 and January 2005. Controls were recruited from patients attending the Sheffield Mammography Screening Service between September 2000 and January 2004, whose mammograms showed no evidence of breast lesions. All cases and controls were of North European origin and resident in the Sheffield area (5, 11). The second set comprised unrelated BRCA1/2 mutation-negative breast cancer patients recruited between 1997 and 2007 by three centres from the German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC) (6, 8). All patients had been screened for mutations in the BRCA1 and BRCA2 genes by denaturing high performance liquid chromatography analysis of all exons followed by direct sequencing. Ethnically matched controls were selected from unrelated healthy female blood donors collected by the Institute of Transfusion Medicine and Immunology (Mannheim, Germany) between the years 2004 and 2007. The Utah Breast Cancer Study (UBCS) cohort consisted of BRCA1/2 mutation-negative cases (established by sequencing, family inference or linkage evidence) from extended high-risk Utah pedigrees ascertained using the Utah Population Database (12). Controls were unrelated birth cohort- and sex-matched cancer-free individuals.
All available HapMap SNPs within a 50 kb region spanning the CASP8 gene, and SNPs from dbSNP with a minor allele frequency (MAF) >0.05, were genotyped on 135 random SBCS control samples. The optimal set of 12 tSNPs was identified from these data by Principal Components Analysis (13). The 12 tSNPs were supplemented by two further SNPs identified by DNA sequencing of regions containing putative SNPs, plus the 6 bp promoter indel variant rs3834129. Thus a total of 15 SNPs were selected for genotyping.
Genotyping was carried out using the Applied Biosystems SNPlex™ multiplex system (SBCS samples) or 5′ nuclease PCR (UBCS and GC-HBOC samples). The 6 bp indel was genotyped by fragment analysis on an ABI 3730 automated sequencer. Genotyping quality was assessed by examination of duplicate concordance and call rates for each SNP and a test for compliance with Hardy-Weinberg equilibrium (HWE) in controls. A summary of genotyping quality data is shown in Supplementary Table 1. SNPs with duplicate concordance rates of less than 98%, call rates of less than 90%, or an HWE P value below 0.005 were removed from the analysis.
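The quality filters described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: it assumes a Pearson chi-square test with one degree of freedom for HWE (the text does not state which HWE test was applied), and the function names `hwe_chi2_pvalue` and `passes_qc` are hypothetical.

```python
import math

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P value of a 1-df Pearson chi-square test for Hardy-Weinberg equilibrium.

    Assumption: a chi-square test is used; the source does not name the test.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(concordance, call_rate, p_hwe):
    """Apply the three exclusion thresholds quoted in the text."""
    return concordance >= 0.98 and call_rate >= 0.90 and p_hwe >= 0.005
```

A SNP whose control genotype counts match HWE expectations exactly gives a P value of 1 and is retained, provided its concordance and call rate also pass.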
All statistical tests were two-sided. Evidence of association for single SNPs in the primary discovery set was initially assessed by use of a trend test. Per allele and genotypic OR and 95% CIs were estimated within a logistic regression framework with the common homozygotes as reference group. In order to account for familial relatedness in the UBCS subjects, meta-analyses of individual SNPs across study populations were carried out using the Genie software package which uses Monte Carlo testing to derive empirical estimates of significance and CIs (14, 15).
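As an illustration of the single-SNP trend test, the sketch below implements the Cochran-Armitage test for trend with additive scores 0, 1 and 2. This choice is an assumption: the text says only "a trend test", and the empirical Monte Carlo procedure in Genie used for the meta-analyses is not reproduced here. The function name `trend_test` is hypothetical.

```python
import math

def trend_test(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    cases, controls: counts for the three genotypes, ordered by number of
    copies of the tested allele. Returns (chi-square, two-sided P value).
    """
    n = [r + s for r, s in zip(cases, controls)]  # genotype totals
    total = sum(n)
    n_cases = sum(cases)
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * k for x, k in zip(scores, n))
    sum_x2n = sum(x * x * k for x, k in zip(scores, n))
    numerator = total * (total * sum_xr - n_cases * sum_xn) ** 2
    denominator = (n_cases * (total - n_cases)
                   * (total * sum_x2n - sum_xn ** 2))
    chi2 = numerator / denominator
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

When cases and controls have identical genotype distributions the statistic is 0 and the P value is 1. The statistic grows as allele dosage shifts between the two groups.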
Pairwise R² and D′ values were estimated based on genotype data from 123 SBCS controls using Haploview (16). Haplotype frequencies were estimated by use of the expectation-maximization (EM) algorithm within SNPHAP. The hapConstructor module of Genie was used to build combinations of SNPs associated with breast cancer (17). This data-mining module includes tests for dominant, additive, recessive and allelic models for each haplotype with OR, χ² and statistics calculated. Individuals with >50% missing genotype data were excluded from the analysis. In the remaining individuals, missing genotypes were internally imputed, and the haplotypes were estimated via the EM algorithm. The significance thresholds used for the haplotype construction process were 0.05, 0.005, 0.0005, 0.0001 for haplotypes of one to four markers, respectively, and 0.00005 thereafter. Construction-wide FDR q values for the best haplotypes, which appropriately account for the construction process, were determined empirically using 100,000 simulations.
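To make the EM step concrete, the following sketch estimates haplotype frequencies for the simplest case of two biallelic SNPs, where only double heterozygotes have ambiguous phase. It is a toy version under stated assumptions (complete genotypes, random mating, a fixed number of iterations) and is not the SNPHAP or Genie implementation. The function name `em_two_snp_haplotypes` is hypothetical.

```python
def em_two_snp_haplotypes(geno_counts, n_iter=50):
    """Estimate frequencies of haplotypes 11, 12, 21, 22 for two SNPs.

    geno_counts[i][j] is the number of individuals carrying i copies of
    allele 2 at the first SNP and j copies at the second SNP (i, j in 0..2).
    """
    # Haplotype pair implied by each phase-unambiguous genotype
    # (indices into the list [11, 12, 21, 22]).
    known = {
        (0, 0): (0, 0), (0, 1): (0, 1), (0, 2): (1, 1),
        (1, 0): (0, 2), (1, 2): (1, 3),
        (2, 0): (2, 2), (2, 1): (2, 3), (2, 2): (3, 3),
    }
    base = [0.0] * 4
    for (i, j), pair in known.items():
        for h in pair:
            base[h] += geno_counts[i][j]
    n_double_het = geno_counts[1][1]
    n_chromosomes = 2 * sum(sum(row) for row in geno_counts)
    freqs = [0.25] * 4  # uninformative starting values
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 11/22, not 12/21.
        cis = freqs[0] * freqs[3]
        trans = freqs[1] * freqs[2]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: add expected haplotype counts from double heterozygotes.
        counts = base[:]
        counts[0] += n_double_het * w
        counts[3] += n_double_het * w
        counts[1] += n_double_het * (1 - w)
        counts[2] += n_double_het * (1 - w)
        freqs = [c / n_chromosomes for c in counts]
    return freqs
```

With more than two SNPs the number of phase-compatible haplotype pairs per individual grows quickly, which is why dedicated software is used in practice.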
Polytomous logistic regression and logistic regression (stratified by study) were used to compare genotype frequencies in different sub-groups of cases, based on an additive model for genotype as above. Likelihood ratio testing was used to compare models with and without terms for genotype.
We applied a staged study design based on three case-control population sets; the primary, discovery set (SBCS), and two additional sets to establish the robustness of findings (GC-HBOC and UBCS). A total of 14 SNPs were successfully genotyped in 1,228 case and 1,222 control subjects in the SBCS discovery set (Supplementary Tables 1 and 2). Four SNPs (rs3834129, rs6435074, rs6723097 and rs1045485) demonstrated significant associations with breast cancer (Ptrend<0.05), with rs6723097 being the most significant, with per allele OR (95% CI) 1.16 (1.03–1.31), Ptrend=0.017 (Table 1). These four SNPs were genotyped in samples from 1,220 cases and 1,664 controls from GC-HBOC and 752 cases and 438 controls from UBCS (Supplementary Table 2). Three of the four SNPs yielded smaller empirical Ptrend values in the 3-study meta-analysis compared to SBCS alone, with rs6723097 again yielding the most significant result (Ptrend=0.0008), with no evidence of heterogeneity between studies (Table 1). Table 2 shows that there is generally a low degree of pairwise correlation between the four SNPs, with the exception of rs6723097 and rs6435074 (R²=0.67). As expected, the D′ values are somewhat higher, suggesting that the associated SNPs may be marking one or more underlying breast cancer haplotypes.
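The observation that D′ exceeds R² for the same SNP pair follows from how the two measures are defined. The sketch below computes both from two-locus haplotype frequencies using the standard definitions; the input frequencies in the usage note are invented for illustration and are not taken from Table 2. The function name `ld_measures` is hypothetical.

```python
def ld_measures(f11, f12, f21, f22):
    """Return (|D'|, r^2) from the four two-locus haplotype frequencies."""
    p_a = f11 + f12  # frequency of allele 1 at the first SNP
    p_b = f11 + f21  # frequency of allele 1 at the second SNP
    d = f11 - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    variance = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / variance if variance > 0 else 0.0
    return d_prime, r2
```

For example, `ld_measures(0.3, 0.0, 0.2, 0.5)` describes two SNPs for which one of the four possible haplotypes is absent. D′ is then at its maximum of 1, while r² is only about 0.43 because the allele frequencies at the two SNPs differ.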
With the aim of identifying any such haplotypes that might carry functional aetiological variants, we searched for susceptibility haplotypes using the hapConstructor module of Genie in the SBCS data set (17). Table 3 shows a summary of all haplotypes with frequency >1% in SBCS. HapConstructor identified a four-locus haplotype 1-1-2-1 at rs7608692, rs1861269, rs6723097 and rs3817578 as being most significant (P=8.0×10⁻⁵), with a per-allele OR (95% CI) of 1.30 (1.12, 1.49), and construction-wide FDR q-value of 0.044. This four-allele haplotype has frequency 19.8% in controls and 24.2% in cases and is present on haplotypes 2, 8, 11 and 14 (Table 3). The only other four-locus haplotype to surpass the significance thresholds set in the data-mining process was identical at the first three SNP positions and replaced rs3817578 with D302H (rs1045485) (P=1.0×10⁻⁴). These two haplotypes constituted 16 of the 18 significant tests that were contained in the group of tests with the FDR of 0.044. Hence, there is extremely good evidence that these related haplotypes are true indicators of an underlying susceptibility variant. Furthermore, in a stepwise logistic regression, the 1-1-2-1 haplotype alone provided the best fitting model, compared to models involving any of the individual SNPs.
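The reported haplotype frequencies can be checked against the reported odds ratio with a crude calculation. The sketch below computes an unadjusted allelic OR with a Woolf (log-scale) 95% confidence interval. It rests on several assumptions: the rounded frequencies of 24.2% and 19.8% are treated as exact, all 1,228 cases and 1,222 controls are assumed to contribute two chromosomes each, and haplotype phase is treated as known. It is a plausibility check only, not the model-based estimate produced by hapConstructor. The function name `allelic_or_with_ci` is hypothetical.

```python
import math

def allelic_or_with_ci(freq_cases, freq_controls, n_cases, n_controls, z=1.96):
    """Crude allelic odds ratio with a Woolf confidence interval."""
    a = freq_cases * 2 * n_cases              # risk haplotypes in cases
    b = (1 - freq_cases) * 2 * n_cases        # other haplotypes in cases
    c = freq_controls * 2 * n_controls        # risk haplotypes in controls
    d = (1 - freq_controls) * 2 * n_controls  # other haplotypes in controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

odds_ratio, lower, upper = allelic_or_with_ci(0.242, 0.198, 1228, 1222)
print(round(odds_ratio, 2))  # 1.29
```

The crude estimate of about 1.29, with an interval of roughly 1.13 to 1.48, is close to the reported per-allele OR of 1.30 (1.12, 1.49). The remaining difference is expected given rounding and the simplifications listed above.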
To assess the robustness of these results, we also carried out a meta-haplotype-construction with the four SNPs typed in the three study populations (rs3834129, rs6435074, rs6723097 and rs1045485). HapConstructor extracted a two-SNP haplotype across rs6723097 and rs1045485 (1-2) as the most significant (P=2.0×10⁻⁵, FDR q-value 0.002), with a protective per allele OR (95% CI) of 0.76 (0.68, 0.85). The complement of this haplotype, 2-1, increased risk and was also significant (P=3.3×10⁻⁴), with a per allele OR (95% CI) of 1.15 (1.06, 1.24). This two-SNP combination also lies on haplotypes 2, 8, 11 and 14 (Table 3), and these two SNPs are also found on the 4-locus risk haplotype in the discovery analysis. Thus the meta-analysis haplotypic associations are extremely consistent with the 4-allele haplotype association seen in SBCS.
A case-only meta-analysis across the three studies yielded no evidence that either the individual SNPs or the haplotypes were associated with age at onset, family history, bilateral disease, or estrogen or progesterone receptor tumour status (data not shown).
Our haplotype mining results, based on three independent data sets, provide evidence that an extended multi-locus CASP8 haplotype is associated with breast cancer. The risk haplotype provides a better fitting model than any combination of the individual SNPs. This suggests that additional untyped variants carried on this haplotype may be responsible for the increased breast cancer risk. Re-sequencing of DNA samples from individuals carrying the high and low risk haplotypes should allow the underlying causative variants to be identified. Such variants might affect the molecular interactions of caspase-8, caspase-8 activity (coding variants), or caspase-8 levels, via effects on transcription factor binding, RNA splicing, or RNA stability (intronic/intergenic variants).
Aside from a well-defined role as an initiator of apoptosis, caspase-8 has been proposed as a molecular switch between cell motility, promoted by procaspase-8, and apoptosis, promoted by mature caspase-8 (18). Caspase-8 processing to the mature form is in turn controlled by phosphorylation by c-SRC, a proto-oncogene tyrosine kinase whose activity is upregulated in many types of tumour (19). It will be important to determine whether cancer-associated variants in CASP8 affect these processes. Furthermore it is intriguing to note that although the rare allele of CASP8 D302H is associated with a decreased risk of breast cancer, it is associated with an increased risk of glioma (20). Further studies including more comprehensive SNP panels and cancer characteristics are therefore needed to help us understand the roles of caspase-8 in different cancer types.
Financial support: Genotyping and data analysis in Sheffield, UK were supported by the Breast Cancer Campaign [grant no. 2004Nov49] and Yorkshire Cancer Research [grant no. S295]. For the Utah Breast Cancer Study, genotype data and analysis were supported by a Susan G. Komen Foundation grant (BCTR0706911) and an NIH grant (CA98364). Recruitment in Utah was supported in part by the Utah Cancer Registry (UCR) and the Utah Population Database (UPDB). The UCR is funded by contract N01-PC-35141 from the National Cancer Institute’s SEER program with additional support from the Utah State Department of Health and the University of Utah. Partial support for the UPDB was provided by the University of Utah Huntsman Cancer Institute.
We would like to thank all study subjects for their participation in this research. We would like to thank Helen Cramp and Dan Connley for subject recruitment and data collection in Sheffield, and Sandrine Tchatchou for overseeing the DNA sample collection for GC-HBOC.