We applied a staged study design based on three case-control population sets: the primary discovery set (SBCS) and two additional sets used to establish the robustness of the findings (GC-HBOC and UBCS). A total of 14 SNPs were successfully genotyped in 1,228 case and 1,222 control subjects in the SBCS discovery set (Supplementary Tables 1 and 2). Four SNPs (rs3834129, rs6435074, rs6723097 and rs1045485) demonstrated significant associations with breast cancer (Ptrend < 0.05); rs6723097 was the most significant, with a per-allele OR (95% CI) of 1.16 (1.03–1.31) and Ptrend = 0.017. These four SNPs were then genotyped in samples from 1,220 cases and 1,664 controls from GC-HBOC and 752 cases and 438 controls from UBCS (Supplementary Table 2). Three of the four SNPs yielded smaller empirical Ptrend values in the three-study meta-analysis than in SBCS alone, with rs6723097 again yielding the most significant result (Ptrend = 0.0008) and no evidence of heterogeneity between studies. Pairwise correlation between the four SNPs is generally low, with the exception of rs6723097 and rs6435074 (R² = 0.67). As expected, the D′ values are somewhat higher, suggesting that the associated SNPs may be marking one or more underlying breast cancer haplotypes.
Association statistics for SNPs in the CASP8 gene.
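As a rough plausibility check on the reported single-SNP statistics, a Wald z-statistic and two-sided p-value can be approximated from an odds ratio and its 95% confidence interval. The sketch below is illustrative only: the function name `wald_from_ci` is hypothetical, and the normal approximation on the log-odds scale is an assumption, not the trend test used in the study. Applied to the rounded values for rs6723097 in SBCS, it gives a p-value of the same order as the reported Ptrend, with the difference attributable to rounding and to the different test.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate Wald z and two-sided p from an OR and its 95% CI.

    Assumes the CI is symmetric on the log-odds scale (normal approximation).
    """
    log_or = math.log(odds_ratio)
    # Width of the CI on the log scale is 2 * z_crit * SE
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Rounded values reported for rs6723097 in SBCS: OR 1.16 (1.03-1.31)
z, p = wald_from_ci(1.16, 1.03, 1.31)
print(f"z = {z:.2f}, p = {p:.3f}")
```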
Levels of correlation between breast cancer associated SNPs
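The observation that D′ tends to exceed R² follows from how the two measures are normalised. The following minimal sketch computes D, D′ and R² for a pair of biallelic loci; the function name `ld_measures` and the input frequencies are hypothetical and are not taken from the study data. In the example, one allele occurs only on haplotypes carrying the other, so D′ reaches 1 while R² stays low because the allele frequencies differ.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (both strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b
    # D' scales |D| by its maximum attainable value given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the two allele indicators
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Hypothetical frequencies: allele B is found only on A-bearing haplotypes
d, d_prime, r2 = ld_measures(p_ab=0.10, p_a=0.40, p_b=0.10)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```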
With the aim of identifying any such haplotypes that might carry functional aetiological variants, we searched for susceptibility haplotypes in the SBCS data set using the hapConstructor module of Genie (17). The haplotypes with frequency >1% in SBCS are summarised in the table of haplotype frequencies for SBCS. HapConstructor identified a four-locus haplotype, 1-1-2-1 at rs7608692, rs1861269, rs6723097 and rs3817578, as the most significant (P = 8.0×10⁻⁵), with a per-allele OR (95% CI) of 1.30 (1.12, 1.49) and a construction-wide FDR q-value of 0.044. This four-allele haplotype has a frequency of 19.8% in controls and 24.2% in cases and is present on haplotypes 2, 8, 11 and 14. The only other four-locus haplotype to surpass the significance thresholds set in the data-mining process was identical at the first three SNP positions and replaced rs3817578 with D302H (rs1045485) (P = 1.0×10⁻⁴). These two haplotypes accounted for 16 of the 18 significant tests in the group of tests with an FDR of 0.044. Hence, there is extremely good evidence that these related haplotypes are true indicators of an underlying susceptibility variant. Furthermore, in a stepwise logistic regression, the 1-1-2-1 haplotype alone provided the best-fitting model compared with models involving any of the individual SNPs.
Haplotype frequencies for SBCS
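The reported frequencies of the 1-1-2-1 haplotype (24.2% in cases, 19.8% in controls) can be related to the reported per-allele OR with a crude allelic odds ratio. This is a back-of-the-envelope sketch, not the model used in the study; the function name `allelic_odds_ratio` is hypothetical. The crude value comes out at about 1.29, close to the reported 1.30.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Crude odds ratio comparing haplotype frequency in cases vs controls."""
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls

# Haplotype 1-1-2-1 frequencies reported for SBCS
crude_or = allelic_odds_ratio(freq_cases=0.242, freq_controls=0.198)
print(f"crude OR = {crude_or:.2f}")
```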
To assess the robustness of these results, we also carried out a meta-haplotype construction with the four SNPs typed in all three study populations (rs3834129, rs6435074, rs6723097 and rs1045485). HapConstructor extracted a two-SNP haplotype across rs6723097 and rs1045485 (1) as the most significant (P = 2.0×10⁻⁵, FDR q-value 0.002), with a protective per-allele OR (95% CI) of 0.76 (0.68, 0.85). The complement of this haplotype, 2-1, increased risk and was also significant (P = 3.3×10⁻⁴), with a per-allele OR (95% CI) of 1.15 (1.06, 1.24). This two-SNP combination also lies on haplotypes 2, 8, 11 and 14, and these two SNPs are also found on the four-locus risk haplotype in the discovery analysis. Thus the meta-analysis haplotypic associations are extremely consistent with the four-allele haplotype association seen in SBCS.
A case-only meta-analysis across the three studies yielded no evidence that either the individual SNPs or the haplotypes were associated with age at onset, family history, bilateral disease, or estrogen or progesterone receptor tumour status (data not shown).