A total of 1,792 breast cancer cases and 1,867 cancer-free controls were included in the final analysis; the characteristics of these subjects are summarized in Table 2. Age at menarche (P < 0.001) and age at first live birth (P < 0.001) were consistently distributed differently between cases and controls across all samples. Among the 1,437 breast cancer cases with known ER and PR status, 662 (46.07%) were positive for both ER and PR, and 498 (34.66%) were negative for both.
| Table 2. Distribution of demographic characteristics and known breast cancer risk factors for cases and controls included in the study |
The associations between the 15 selected SNPs and breast cancer risk in the testing set are presented in Table 3. The call rates of the 15 SNPs were all above 95%, and the minor allele frequencies (MAFs) in the controls were all above 0.05. Five SNPs at 2q35, 3p24, 6q22, 6q25 and 10q26 were significantly associated with breast cancer risk (2q35: rs13387042, P = 0.039; 3p24.1: rs2307032, P = 0.017; 6q22.33: rs2180341, P = 0.040; 6q25.1: rs2046210, P = 1.26 × 10⁻⁵; 10q26.13: rs2981582, P = 0.037). These five SNPs were therefore carried forward to the validation analyses.
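The quality-control thresholds quoted above (call rate above 95%, MAF in controls above 0.05) can be illustrated with a minimal sketch. The genotype coding (0, 1 or 2 copies of a designated allele, `None` for a failed call) and the function names are assumptions made for illustration, not the authors' pipeline.

```python
def call_rate(calls):
    """Fraction of samples with a non-missing genotype call."""
    return sum(c is not None for c in calls) / len(calls)


def minor_allele_frequency(calls):
    """MAF from genotypes coded as allele counts (0/1/2); missing calls are skipped."""
    observed = [c for c in calls if c is not None]
    freq = sum(observed) / (2 * len(observed))
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq, 1 - freq)


def passes_qc(all_calls, control_calls, min_call_rate=0.95, min_maf=0.05):
    """Apply the two thresholds stated in the text (both strict inequalities)."""
    return (call_rate(all_calls) > min_call_rate
            and minor_allele_frequency(control_calls) > min_maf)


# Hypothetical toy data, not taken from the study.
toy_controls = [0, 1, 0, 2, 1, 0, 0, 1, 0, 0]
toy_all = toy_controls + [1, 2, 0, 1, 1, 0, 2, 1, 0, 1]
print(call_rate(toy_all), minor_allele_frequency(toy_controls),
      passes_qc(toy_all, toy_controls))
```

Computing the MAF in controls only, as the text describes, avoids letting a true case-control frequency difference influence the filter.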
The call rates of the five SNPs in the validation stage were all above 95% (Table 3). Consistent associations were observed for the five SNPs, with significant or borderline significant P-values. Overall, after adjustment for age, age at menarche, menopausal status and age at first live birth, the five SNPs showed significant associations with breast cancer susceptibility (dominant genetic model: 2q35, rs13387042: OR = 1.26, 95% CI = 1.07 to 1.49; 3p24.1, rs2307032: OR = 1.24, 95% CI = 1.07 to 1.44; 6q22.33, rs2180341: OR = 1.22, 95% CI = 1.06 to 1.40; 6q25.1, rs2046210: OR = 1.51, 95% CI = 1.31 to 1.75; 10q26.13, rs2981582: OR = 1.31, 95% CI = 1.14 to 1.50).
| Table 3. Association of SNPs with breast cancer risk in both testing and validation sets |
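As a reading aid for the adjusted odds ratios reported above, the sketch below back-calculates the log-OR, its approximate standard error and a Wald z statistic from each published OR and 95% CI. It assumes the intervals are symmetric Wald intervals on the log scale, which is usual for logistic regression but is not stated in the text; the resulting P-values are approximations and need not match the paper's tables.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_summary(odds_ratio, ci_low, ci_high):
    """Return (beta, se, z, p) implied by an OR and its 95% CI (Wald assumption)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = beta / se
    # Two-sided P-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# ORs and CIs as quoted in the text (dominant model, combined samples).
reported = {
    "rs13387042": (1.26, 1.07, 1.49),
    "rs2307032": (1.24, 1.07, 1.44),
    "rs2180341": (1.22, 1.06, 1.40),
    "rs2046210": (1.51, 1.31, 1.75),
    "rs2981582": (1.31, 1.14, 1.50),
}

for snp, (or_, low, high) in reported.items():
    beta, se, z, p = log_or_summary(or_, low, high)
    print(f"{snp}: beta={beta:.3f} SE={se:.3f} z={z:.2f} p={p:.2e}")
```

Because the published values are rounded to two decimals, the recovered standard errors carry rounding error of their own.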
The cumulative effects of the five SNPs and the two risk factors (age at menarche and age at first live birth) on breast cancer risk were examined by two methods (Table 4). The first method counted risk alleles/factors. Women carrying six or more risk alleles of the five SNPs (5.75% of case patients and 3.23% of control subjects) had a nearly three-fold increased risk of developing breast cancer compared with those carrying fewer than one risk allele (11.08% of case subjects and 16.70% of control subjects). When age at menarche and age at first live birth were also taken into consideration, the top group (more than seven risk alleles/factors) had a 5.61-fold increased risk compared with the reference group (adjusted OR = 5.61, 95% CI = 4.16 to 7.56). The second method used a risk score, calculated as a linear combination of the SNP alleles or risk factors weighted by their individual odds ratios and then classified into four groups by quartiles. Subjects in the highest quartile of the risk score had a 91% increased breast cancer risk compared with those in the lowest quartile (adjusted OR = 1.91, 95% CI = 1.56 to 2.35, P for trend: 5.60 × 10⁻¹⁰). Similarly, a 4.73-fold increased risk was observed when age at menarche and age at first live birth were taken into consideration (adjusted OR = 4.73, 95% CI = 3.80 to 5.88, P for trend: 2.27 × 10⁻⁴⁷). We then assessed how well the two risk prediction methods discriminated cases from controls by ROC curve analysis. The AUC for the risk score method (0.649, 95% CI: 0.631 to 0.667; sensitivity = 62.60%, specificity = 57.05%, Figure ) was significantly higher than that for the risk-factor counting method (AUC: 0.637, 95% CI: 0.619 to 0.655; sensitivity = 62.16%, specificity = 60.03%, Figure ) (P < 0.0001).
| Table 4. Cumulative effects of associated SNPs and clinical risk factors on the risk of breast cancer in all samples |
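The two scoring approaches described above (risk-allele counting and an odds-ratio-weighted score grouped by quartiles) and the AUC comparison can be sketched as follows. Several choices here are assumptions rather than the authors' stated procedure: the weights are ln(OR) taken from the dominant-model ORs quoted in the text and applied to allele counts, the quartile cut points come from the control scores, and the AUC is the rank-based (Mann-Whitney) estimate. The genotypes are made-up toy data.

```python
import math
from statistics import quantiles

# Dominant-model ORs quoted in the text, used here as stand-in weights (assumption).
ODDS_RATIOS = {
    "rs13387042": 1.26, "rs2307032": 1.24, "rs2180341": 1.22,
    "rs2046210": 1.51, "rs2981582": 1.31,
}
WEIGHTS = {snp: math.log(or_) for snp, or_ in ODDS_RATIOS.items()}


def genotype(*counts):
    """Build a hypothetical genotype: risk-allele counts (0/1/2) in SNP order."""
    return dict(zip(ODDS_RATIOS, counts))


def allele_count(geno):
    """Method 1: unweighted count of risk alleles."""
    return sum(geno[snp] for snp in ODDS_RATIOS)


def risk_score(geno):
    """Method 2: linear combination of allele counts weighted by ln(OR)."""
    return sum(WEIGHTS[snp] * geno[snp] for snp in ODDS_RATIOS)


def quartile_group(score, cut_points):
    """Return 1 (lowest) to 4 (highest) given three quartile cut points."""
    return 1 + sum(score > c for c in cut_points)


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            wins += 1.0 if c > k else 0.5 if c == k else 0.0
    return wins / (len(case_scores) * len(control_scores))


# Toy data for illustration only.
controls = [genotype(0, 0, 0, 0, 0), genotype(1, 0, 0, 0, 1),
            genotype(0, 1, 1, 0, 0), genotype(1, 1, 0, 1, 0)]
cases = [genotype(1, 1, 1, 1, 1), genotype(2, 1, 0, 2, 1),
         genotype(0, 0, 1, 1, 0), genotype(2, 2, 1, 2, 2)]

cuts = quantiles([risk_score(g) for g in controls], n=4)
for g in cases:
    print(allele_count(g), round(risk_score(g), 3), quartile_group(risk_score(g), cuts))
print("AUC (count):", auc([allele_count(g) for g in cases],
                          [allele_count(g) for g in controls]))
print("AUC (score):", auc([risk_score(g) for g in cases],
                          [risk_score(g) for g in controls]))
```

The weighted score can separate subjects with the same allele count, which is one plausible reason its AUC differs from that of the counting method.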
Absolute risk was also calculated with a modified Gail model to evaluate the combined effects of the five SNPs and the two risk factors, and a 65-year absolute risk of breast cancer among women aged 20 to 85 years was estimated for each subject. As shown in Table 5, the proportion of subjects classified as high risk rose with the number of risk alleles/factors. However, the spread of the absolute risk distribution also widened as more factors were included in the risk prediction model. Compared with a uniform 65-year cumulative risk of 0.07 for women carrying four risk factors (the category with the largest proportion of controls: 22.01%, Table ), a wide spectrum of absolute risk estimates was found using the five markers and the two clinical risk factors (Figure ). At a cutoff of 0.14 (twice the population median risk) or 0.21 (three times the population median risk), 26.57% or 10.43% of women were classified as high risk, respectively. We also used ROC curve analysis to evaluate how well absolute risk classified cases and controls. As shown in Figure , we obtained an AUC of 0.658 (95% CI: 0.640 to 0.676; sensitivity = 61.98%, specificity = 60.26%) for the five SNPs plus two risk factors. Cross-validation gave similar AUCs (0.572 for the five SNPs only, 0.644 for the two risk factors only, and 0.660 for the five SNPs plus two risk factors), suggesting that the models are reasonably reliable.
| Table 5. Absolute risk estimated in all samples |
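The high-risk cutoffs mentioned above are simple multiples of the 0.07 reference risk, and the sketch below shows that arithmetic together with a sensitivity/specificity calculation at a chosen threshold. It does not implement the modified Gail model; the per-subject risks are hypothetical inputs, and treating a risk equal to the cutoff as high risk is an assumption.

```python
REFERENCE_RISK = 0.07  # uniform 65-year cumulative risk quoted in the text


def cutoff(fold):
    """Cutoff as a multiple of the reference risk, rounded to two decimals."""
    return round(fold * REFERENCE_RISK, 2)


def high_risk_fraction(risks, threshold):
    """Share of subjects whose estimated absolute risk is at or above the threshold."""
    return sum(r >= threshold for r in risks) / len(risks)


def sensitivity_specificity(case_risks, control_risks, threshold):
    """Sensitivity and specificity when 'risk >= threshold' predicts case status."""
    true_pos = sum(r >= threshold for r in case_risks)
    true_neg = sum(r < threshold for r in control_risks)
    return true_pos / len(case_risks), true_neg / len(control_risks)


# Hypothetical absolute risks, for illustration only.
toy_case_risks = [0.05, 0.15, 0.25]
toy_control_risks = [0.03, 0.08, 0.16]

for fold in (2, 3):
    t = cutoff(fold)
    everyone = toy_case_risks + toy_control_risks
    print(fold, t, high_risk_fraction(everyone, t),
          sensitivity_specificity(toy_case_risks, toy_control_risks, t))
```

Raising the cutoff from two-fold to three-fold shrinks the high-risk group, which mirrors the drop from 26.57% to 10.43% reported in the text.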
The stratified analyses of the five SNPs by ER or PR status are summarized in Additional file 2. However, no significant heterogeneity was observed for the effect of each SNP across the ER or PR subgroups. A further stratified analysis of the cumulative effects of the five SNPs (coded 0 to 2 risk alleles as 0 and more than 3 risk alleles as 1) also found no heterogeneity between subgroups (Additional file 3).