Patients with early-onset breast and/or ovarian cancer frequently wish to know if they inherited a mutation in one of the cancer susceptibility genes, BRCA1 or BRCA2. Accurate carrier prediction models are needed to target costly testing. Two widely used models, BRCAPRO and BOADICEA, were developed using data from non-Hispanic Whites (NHW), but their accuracies have not been evaluated in other racial/ethnic populations.
We evaluated the BRCAPRO and BOADICEA models in a population-based series of African-American, Hispanic and NHW breast cancer patients tested for BRCA1 and BRCA2 mutations. We assessed model calibration by evaluating observed versus predicted mutations and attribute diagrams, and model discrimination using areas under receiver operating characteristic curves (AUCs).
Both models were well-calibrated within each racial/ethnic group, with some exceptions. BOADICEA over-predicted mutations in African Americans and older NHWs, and BRCAPRO under-predicted in Hispanics. In all racial/ethnic groups, the models over-predicted in cases whose personal and family histories indicated greater than 80% probability of carriage. The two models showed similar discrimination in each racial/ethnic group, discriminating least well in Hispanics. For example, BRCAPRO’s AUCs were 83% (95% confidence interval 63–93%) for NHWs, compared to 74% (59–85%) for African Americans and 58% (45–70%) for Hispanics.
The models’ poor performance for Hispanics may be due to model misspecification in this racial/ethnic group. However, it also may reflect racial/ethnic differences in the distributions of personal and family histories among breast cancer cases in the Northern California population.
Since identification of the BRCA1 and BRCA2 cancer susceptibility genes, predictive models have been developed to identify individuals likely to carry inherited deleterious BRCA mutations. These models assign a patient a probability of mutation carriage using the cancer histories of the patient and her first-, second-degree, and, in some cases, more distant relatives, and estimates of BRCA mutation prevalence and penetrance. The BRCAPRO model assumes that all genetic susceptibility to breast cancer is due to BRCA mutations (1–3). The BOADICEA model considers the simultaneous effects of BRCA1, BRCA2 and the residual familial clustering of breast cancer not explained by these genes, which is assumed to be polygenic (4, 5). Multiple studies have validated BRCAPRO’s performance, and most have found it to discriminate as well as or more accurately than other models (2, 3, 6–18), although some studies suggest that BOADICEA performs slightly better in certain high-risk populations (1, 6, 19). Most evaluations of BRCAPRO and BOADICEA have included primarily non-Hispanic White (NHW) breast cancer patients, with a few exceptions (13, 17, 18, 20–22); consequently, little is known about the performance of these models across different racial/ethnic groups.
The BRCAPRO model was developed and has been validated with data from patients presenting for clinical genetics evaluation because of strong family cancer history; the BOADICEA model was developed using population-based data. As we and others have observed (23), a substantial portion of population-sampled young breast cancer patients with BRCA mutations do not have a family history of breast or ovarian cancer. Many women with early-onset breast cancer but no family history wish to know their mutation status, because results may guide subsequent management of breast and ovarian cancer risk, and family planning; such patients increasingly present for evaluation at cancer genetics clinics. An accurate and precise risk prediction algorithm based on personal and family history could identify individuals who lack a known deleterious mutation, but who, if tested, would have a 6–13% chance of receiving ambiguous and anxiety-producing results (24). We report an analysis of the performance of the BRCAPRO and BOADICEA mutation carriage prediction models in African-American, Hispanic and NHW female breast cancer patients from a population-based study in the San Francisco Bay Area.
To increase the precision of our inferences while maintaining their population-based properties, we identified breast cancer patients diagnosed at age <65 years through the population-based Greater San Francisco Bay Area Cancer Registry, which ascertains all incident cancers as part of the Surveillance, Epidemiology and End Results (SEER) Program and the California Cancer Registry, and invited them to participate in the Northern California Breast Cancer Family Registry (NC-BCFR) (25–27). We recruited patients using a two-stage sampling design, with over-sampling of patients having characteristics that suggest an inherited basis for their cancers (25–27). In stage one of sampling, we administered a brief telephone interview to all patients and assessed self-identified race/ethnicity and family history of breast and ovarian cancer. Based on age at diagnosis and personal and family history, patients were classified into either Category A (patients whose cancers are likely to be hereditary) or Category B (all other patients with cancers less likely to be hereditary). Category A patients were those who met at least one of the following criteria: (1) breast cancer diagnosis before age 35 years; (2) bilateral breast cancer, with first diagnosis before age 50 years; (3) prior ovarian or childhood cancer; or (4) at least one first-degree relative with breast or ovarian cancer. Categories A and B were designed with the aim of reducing the variance of overall prevalence estimates in the entire population-based series of patients. In stage two, we invited all patients in Category A and a random sample of patients in Category B to enroll in the family registry. Participants completed questionnaires on family history of cancer and breast cancer risk factors and provided a biospecimen sample. 
This two-stage sampling design provides unbiased estimates of population-based mutation carrier prediction performance having greater precision than those obtained from a simple random sample of the same size.
In telephone interviews, participants provided information on date of birth, vital status, date of death or last observation, and diagnosis dates and types of all site-specific cancers for themselves and all first-degree relatives. The questionnaire also elicited the occurrence and age at diagnosis of breast or ovarian cancer in second-degree relatives. When possible, reports of breast or ovarian cancer in relatives were verified by interviewing the relatives themselves, by obtaining medical records, or both. Informed consent was obtained from each study subject; the Institutional Review Boards of the Northern California Cancer Center, Stanford University, and the Dana-Farber Cancer Institute approved the study, in accordance with an assurance filed with and approved by the Department of Health and Human Services. The present analysis was restricted to patients aged <65 years diagnosed with invasive breast cancer between January 1, 1995 and April 30, 2003, who identified themselves as African American, Hispanic or NHW. The number of BRCA mutation carriers among Asian-American patients was too small for reliable analysis. We restricted NHW patients to those without self-identified Ashkenazi Jewish ancestry, given the known elevated prevalence of BRCA mutations among Ashkenazim.
Biospecimens from the NC-BCFR were processed by the Coriell Cell Repositories (Coriell Institute for Medical Research, Camden, NJ). Testing was performed using Exon Grouping Analysis (EGAN) (28, 29), full sequencing performed by Myriad Genetics (30), or two-dimensional gene scanning (31, 32). The numbers of patients tested by these three methods were 28, 691, and 646, respectively, for BRCA1 and 363, 1,002 and 0, respectively, for BRCA2. All coding exons and surrounding intronic sequences were amplified with 34 primer pairs and analyzed on ABI-377 instruments. PCR fragments with aberrant mobility were sequenced. For two-dimensional gene scanning, the entire coding exons and surrounding intronic sequences were amplified in a two-step PCR process involving six individual multiplex reactions (31, 32); these methods permit detection of variants in coding regions and splice site mutations. Mutations were classified according to the Breast Cancer Information Core (BIC) database (footnote 1) and considered pathogenic as described by Couch et al. (33). Regulatory mutations outside of the coding region and splice junctions and large genomic rearrangements are not detected by the methods used here.
We calculated BRCAPRO and BOADICEA probabilities of mutation carriage (hereafter called prediction scores) in patients tested for BRCA1 and BRCA2 mutations. BRCAPRO probabilities were obtained using the CancerGene 4b program (University of Texas Southwestern, Dallas, TX), and BOADICEA probabilities were obtained using the program BOADICEA V3 provided by Antonis Antoniou (19). All available demographic data on all patients’ first- and second-degree relatives aged ≥20 years, whether affected by cancer or not, were included in the analyses. Information on all cases of breast cancer, ovarian cancer, male breast cancer, pancreatic cancer, and prostate cancer among patients and their first- and second-degree relatives was included in the prediction models.
Our goal was to obtain precise, population-based evaluation of the calibration and discrimination of each of the two models applied to each of the three racial/ethnic groups. Here calibration refers to agreement between mean prediction scores and observed carrier prevalences within subgroups of a population. Discrimination refers to the extent of separation between the prediction scores of carriers and noncarriers. We evaluated model calibration by comparing observed prevalences to mean prediction scores visually using attribute diagrams (34). We evaluated model discrimination using the areas under receiver operating characteristic curves (AUCs). Horvitz-Thompson (HT) estimating equations (26, 27) were used to adjust all analyses for the two-stage sampling design of the study.
We estimated the prevalence of BRCA mutation carriers in a given racial/ethnic group as a weighted average of the two category-specific prevalences $\pi_A$ and $\pi_B$. Here, $\pi_A$ is the number of carriers identified in Category A divided by the total number of tested patients in Category A, and $\pi_B$ is defined analogously for Category B. The overall prevalence estimate (Categories A and B combined) was $\pi = w\pi_A + (1-w)\pi_B$, where the weight $w$ is the proportion of all screened patients who were classified in Category A. We compared the prevalence $\pi$ to similarly weighted averages of the category-specific mean prediction scores. In these calculations, we multiplied each prediction score by 90%, assuming that laboratory testing methods were 90% sensitive (35). The overall predicted prevalence was then computed as a weighted average $\bar{p} = w\bar{p}_A + (1-w)\bar{p}_B$ of the two category-specific mean scores $\bar{p}_A$ and $\bar{p}_B$. We evaluated statistical significance of the observed/predicted differences via the statistic $S = (\pi - \bar{p})^2 / V$, where $V$ is the HT variance estimate (26, 27). This statistic has approximately a chi-squared distribution on one degree of freedom, under the null hypothesis that the prediction score for each breast cancer patient in a given racial/ethnic population equals her actual probability of mutation carriage.
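The weighted prevalence and the test statistic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the example counts and scores, and the variance value are all hypothetical, and the Horvitz-Thompson variance is taken as an input rather than derived.

```python
import math

def weighted_prevalence(carriers_a, tested_a, carriers_b, tested_b, w):
    """Observed prevalence: w * pi_A + (1 - w) * pi_B."""
    pi_a = carriers_a / tested_a
    pi_b = carriers_b / tested_b
    return w * pi_a + (1 - w) * pi_b

def weighted_predicted(scores_a, scores_b, w, sensitivity=0.90):
    """Predicted prevalence: weighted mean score, scaled by assay sensitivity."""
    mean_a = sum(scores_a) / len(scores_a)
    mean_b = sum(scores_b) / len(scores_b)
    return sensitivity * (w * mean_a + (1 - w) * mean_b)

def calibration_test(observed, predicted, variance):
    """Return S = (observed - predicted)^2 / V and its chi-squared (1 df) p-value."""
    s = (observed - predicted) ** 2 / variance
    # For one degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(s / 2))
    return s, p_value

# Hypothetical inputs, chosen only to exercise the functions.
observed = weighted_prevalence(30, 400, 3, 150, w=0.3)
predicted = weighted_predicted([0.10, 0.02, 0.06], [0.01, 0.03], w=0.3)
s, p_value = calibration_test(observed, predicted, variance=1e-4)
print(f"observed={observed:.4f} predicted={predicted:.4f} S={s:.2f} p={p_value:.3f}")
```

Scaling every score by the assumed sensitivity and then averaging gives the same result as scaling the weighted mean, which is why the sketch applies the factor once at the end.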
We also used attribute diagrams to compare observed carrier prevalence to mean prediction scores within subgroups of patients whose scores lie in subintervals of the unit interval. To obtain an even distribution of patients within the subintervals, we grouped the patients according to quantiles of the logits of their scores. Within each subinterval, we computed weighted estimates and HT-based confidence intervals for the proportion of patients with observed mutations. We then plotted these points in attribute diagrams, which are plots of BRCA mutation carrier prevalence against median scores within each score subinterval (1). Results are provided for subgroups according to age and family cancer history, and for the group as a whole.
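The grouping step for an attribute diagram can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the function name, the number of bins, the clipping constant, and the example data are hypothetical, and confidence intervals are omitted.

```python
import math
from statistics import median

def logit(p):
    return math.log(p / (1 - p))

def attribute_points(scores, is_carrier, weights, n_bins=4):
    """Return one (median score, weighted observed prevalence) pair per bin.

    Patients are ordered by the logit of their score and split into
    equal-count bins. Scores are clipped away from 0 and 1 so the logit is
    finite (the clipping constant is an assumption of this sketch).
    """
    eps = 1e-6
    order = sorted(
        range(len(scores)),
        key=lambda i: logit(min(max(scores[i], eps), 1 - eps)),
    )
    n = len(order)
    points = []
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue
        total_w = sum(weights[i] for i in idx)
        prevalence = sum(weights[i] * is_carrier[i] for i in idx) / total_w
        points.append((median(scores[i] for i in idx), prevalence))
    return points

# Hypothetical patients: score, carrier status (0/1), sampling weight.
scores = [0.01, 0.02, 0.05, 0.10, 0.30, 0.60, 0.85, 0.95]
carriers = [0, 0, 0, 0, 0, 1, 1, 1]
weights = [5, 5, 5, 5, 1, 1, 1, 1]
points = attribute_points(scores, carriers, weights)
for x, y in points:
    print(f"median score {x:.3f} -> observed prevalence {y:.2f}")
```

Points near the 45 degree line (observed prevalence close to the median score) indicate good calibration within that score range.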
To evaluate the models’ ability to discriminate between carriers and non-carriers in each racial/ethnic group, we constructed receiver operating characteristic curves and evaluated the areas under these curves (AUCs) (36). We estimated these AUCs as weighted sums of the category-specific AUCs, using weights as described above. We used bootstrap variance estimates to obtain confidence intervals for the AUCs. Results are provided for subgroups according to age and family cancer history, and for the group as a whole.
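A minimal sketch of a category-weighted AUC with a bootstrap confidence interval follows. Several details are assumptions made for illustration: the function names, resampling carriers and noncarriers separately within each category, the normal-approximation interval on the untransformed AUC scale, and the example scores. The source states only that bootstrap variance estimates were used.

```python
import random
from statistics import stdev

def auc(carrier_scores, noncarrier_scores):
    """Share of carrier/noncarrier pairs where the carrier scores higher (ties count 1/2)."""
    wins = 0.0
    for c in carrier_scores:
        for n in noncarrier_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(carrier_scores) * len(noncarrier_scores))

def weighted_auc(cat_a, cat_b, w):
    """Each category is a (carrier_scores, noncarrier_scores) pair."""
    return w * auc(*cat_a) + (1 - w) * auc(*cat_b)

def bootstrap_ci(cat_a, cat_b, w, n_boot=200, seed=0):
    """Point estimate +/- 1.96 bootstrap standard deviations, clipped to [0, 1]."""
    rng = random.Random(seed)

    def resample(xs):
        return [rng.choice(xs) for _ in xs]

    replicates = [
        weighted_auc(
            (resample(cat_a[0]), resample(cat_a[1])),
            (resample(cat_b[0]), resample(cat_b[1])),
            w,
        )
        for _ in range(n_boot)
    ]
    estimate = weighted_auc(cat_a, cat_b, w)
    half_width = 1.96 * stdev(replicates)
    return max(0.0, estimate - half_width), min(1.0, estimate + half_width)

# Hypothetical scores for carriers and noncarriers in each sampling category.
cat_a = ([0.9, 0.4], [0.5, 0.1])
cat_b = ([0.3], [0.3, 0.4])
estimate = weighted_auc(cat_a, cat_b, w=0.3)
low, high = bootstrap_ci(cat_a, cat_b, w=0.3)
print(f"AUC={estimate:.2f} (approx. 95% CI {low:.2f} to {high:.2f})")
```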
Figure 1 shows the distribution of African-American, Hispanic and NHW patients who were screened, according to sampling category. They include 1,050 (21%) African Americans, 1,487 (30%) Hispanics, and 2,478 (49%) NHWs; 1,466 (29%) were classified in Category A, and the remaining 3,549 (71%) in Category B. All Category A patients (N=1,466) were invited to participate, and of these, 1,002 (68%) provided a blood or buccal sample and were tested for BRCA1 and BRCA2 mutations. Of the 3,549 Category B patients, 671 were randomly selected for participation and of these, 435 (65%) provided a biospecimen sample and were tested. Of the 1,437 patients tested for BRCA1 and BRCA2 mutations, 72 were excluded from analysis: 23 because a family member had already been enrolled in the study through ascertainment of a first-degree relative with breast cancer (7 African Americans, 14 Hispanics, and 2 NHWs), one NHW because no family history information was available, and 48 NHWs because of Ashkenazi Jewish ancestry. A total of 1,365 tested patients were included in the analysis.
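The counts in this paragraph can be cross-checked with a short script. The inverse-sampling-fraction weights at the end are only a simplified illustration of the Horvitz-Thompson idea: they are pooled across racial/ethnic groups and computed before exclusions, which is an assumption of this sketch and not necessarily how the study's weights were constructed.

```python
# Counts taken from the text above.
screened = {"A": 1466, "B": 3549}
invited = {"A": 1466, "B": 671}
tested = {"A": 1002, "B": 435}
excluded = {
    "relative already enrolled": 23,
    "no family history information": 1,
    "Ashkenazi Jewish ancestry": 48,
}

total_screened = sum(screened.values())
w = screened["A"] / total_screened  # share of screened patients in Category A
response_rate = {c: tested[c] / invited[c] for c in tested}
analyzed = sum(tested.values()) - sum(excluded.values())

# Simplified weights: screened patients represented by each tested patient.
ht_weight = {c: screened[c] / tested[c] for c in tested}

print(f"screened={total_screened}, w={w:.3f}, analyzed={analyzed}")
for c in ("A", "B"):
    print(f"Category {c}: response {response_rate[c]:.0%}, weight {ht_weight[c]:.2f}")
```

Under this simplification, each tested Category B patient stands in for several times as many screened patients as a tested Category A patient, which is what the weighting has to correct for.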
Table 1 shows the distribution of tested patients according to BRCA mutation status, by sampling category (A and B) and race/ethnicity (African Americans, Hispanics, and non-Ashkenazi Jewish (non-AJ) NHWs). Of 1,365 patients tested for BRCA mutations, 44 tested positive for BRCA1 mutations (40 in Category A and 4 in Category B), and 41 tested positive for BRCA2 mutations (32 in Category A and 9 in Category B). Estimates of BRCA1 and BRCA2 mutation prevalence were highest among Hispanic patients (prevalence = 3.2% for each gene) and lowest among African-American patients (1.1% and 1.8% for BRCA1 and BRCA2, respectively), as we previously reported (23).
Table 2 shows prediction scores and observed prevalence of BRCA1 and BRCA2 mutations, specific for age and racial/ethnic group. Comparison of prediction scores to observed prevalence indicates generally similar levels of calibration for the two models, with notable differences as follows. The BRCAPRO model under-predicted mutation carriage in Hispanics, with the observed prevalence estimate of 6.4% exceeding the prediction of 3.8% (p=0.04); this under-prediction was statistically significant in the subset of Hispanic patients with no family history (FH) of breast cancer (p=0.01), but not in those with FH (p=0.14). BRCAPRO over-predicted in the subset of patients with FH among African Americans (p<0.01) and NHWs (p<0.01), but did not over-predict significantly in African Americans and NHWs without FH. The BOADICEA model over-predicted in African Americans (3.0% observed vs. 5.0% predicted, p=0.02) and older non-AJ NHWs (2.8% observed vs. 4.0% predicted, p=0.03). BOADICEA’s over-prediction was statistically significant in the subset of patients with FH among African Americans (p<0.01) and NHWs (p<0.001); it under-predicted significantly among Hispanics without FH (p=0.04), but not among those with FH.
Attribute diagrams for each racial/ethnic group are presented in Figure 2 as a measure of model resolution and reliability, with optimal performance represented by data points on the 45 degree line (1). For both models, data points are clustered on the 45 degree line, with the exception of the BRCAPRO model for Hispanics, in which most points fell above the 45 degree line, consistent with the observed under-prediction of BRCA mutation carriage as reported in Table 2. In all racial/ethnic groups, BRCAPRO shows over-prediction in patients with 80% or greater predicted probability of mutation carriage; BOADICEA shows over-prediction for non-AJ NHWs with 80% or greater predicted probability of mutation carriage (Figure 2).
Accuracy in discriminating between carriers and non-carriers, as measured by AUC, was similar for both models within each racial/ethnic group (Table 3). The highest AUC values were observed in non-AJ NHWs (BRCAPRO 83%, 95% confidence interval (CI) 63–93%; BOADICEA 83%, CI 63–93%), followed by African Americans (BRCAPRO 74%, CI 59–85%; BOADICEA 75%, CI 60–85%) and Hispanics (BRCAPRO 58%, CI 45–70%; BOADICEA 56%, CI 43–68%). Within subsets defined by age and FH of breast cancer, there was a trend toward worse performance of both models in older (BRCAPRO 49%, CI 22–76%; BOADICEA 44%, CI 17–74%), compared to younger (BRCAPRO 81%, CI 67–90%; BOADICEA 85%, CI 74–92%) African Americans. There also was a trend toward worse discrimination by both models in Hispanics without FH (BRCAPRO 52%, CI 38–65%; BOADICEA 50%, CI 34–66%), compared to Hispanics with FH (BRCAPRO 69%, CI 56–79%; BOADICEA 69%, CI 55–80%).
We evaluated the performance of the BRCAPRO and BOADICEA BRCA mutation carrier prediction models in three racial/ethnic groups, consisting of African-American, Hispanic and non-Ashkenazi Jewish NHW breast cancer patients from the San Francisco Bay Area. To our knowledge, this is the first study to compare these BRCA mutation-prediction models across population samples of such racial/ethnic diversity. In general, the models showed similar discrimination within each racial/ethnic group, but differences in calibration: BRCAPRO under-predicted mutation carriage in Hispanics, whereas BOADICEA over-predicted mutations in African Americans and in older NHWs.
The strength of this study is its focus on population-based samples of racial/ethnic minorities, in contrast to most prior evaluations of BRCA mutation prediction models. Some prior studies have found similar accuracy of BRCAPRO in racial/ethnic minorities as in NHWs (13, 17), but we and others reported under-prediction by BRCAPRO and other models among clinic-based minorities including Asian Americans and Hispanics (18, 20). Models may perform less well in racial/ethnic minorities because the prevalence of carriers among breast cancer cases may differ by race/ethnicity. In non-Ashkenazi Jewish NHW cases, our prevalence estimates for BRCA1 (2.1%) and BRCA2 (2.3%) are similar to those used by the BRCAPRO and BOADICEA models (3, 4); by contrast, we found that African-American cases had lower (BRCA1 1.1%; BRCA2 1.8%), and Hispanic cases higher (BRCA1 3.2%; BRCA2 3.2%), carrier prevalence. Notably, the exception to BOADICEA’s general over-prediction occurred in Hispanics, as did a significant under-prediction by BRCAPRO, both consistent with Hispanics’ higher mutation prevalence than observed in non-Ashkenazi Jewish NHWs. Within subsets specific for family history and age, calibration worsened with increasing divergence from the mutation prevalence expected by the models; for example, both models over-predicted significantly only in African Americans with family history of breast cancer, while under-predicting in Hispanics lacking such family history. Recent publications have reported a higher prevalence of the 185delAG BRCA1 founder mutation in Hispanics than was initially appreciated (18, 37, 38), leading some to suggest a common origin for this mutation in Sephardic and Ashkenazi Jewish populations (18, 37, 38). The present results contrast with those from a recent single-center clinic-based study of BRCAPRO in Hispanics, which reported better model performance than we found (17); variations in the use of BRCAPRO between studies may explain some of this difference. 
Our finding of lower BRCAPRO model accuracy in Hispanics also prompts questions as to whether BRCA mutation penetrance, or associated cancer risk, might differ between Hispanics and NHWs. Given the growing size of the U.S. Hispanic population, further study of this issue has important implications for health policy and resource allocation.
In contrast to prior studies of the calibration of BRCAPRO and similar models in clinic-based settings (1, 2, 6–22), this analysis considered populations having lower BRCA mutation prevalence, with 85 (6%) carriers identified among 1,365 tested patients. This study sample reflects the reality of current clinical BRCA mutation testing, given patient preference and practice guidelines that support more inclusive testing than previously advised (footnote 2). Our finding that BRCAPRO and BOADICEA over-predicted in a substantial proportion of patients, particularly in patients with family history of breast cancer or with greater than 80% predicted probability of mutation carriage, likely results from lower mutation frequency, and perhaps higher sporadic breast cancer incidence (39, 40), in these groups than the model parameters assume; we anticipate that population-specific corrections may improve model calibration, as others have demonstrated (7).
Prior studies of the BRCAPRO model’s discrimination have reported AUCs in the range of 60–88%. Comparisons of BRCAPRO to other BRCA mutation prediction models, including BOADICEA, Couch, Finnish, National Cancer Institute, Frank/Myriad II, the Manchester Scoring System, the Family History Assessment Tool, and Shattuck-Eidens/Myriad I, have revealed relatively few differences in terms of discrimination (1, 2, 6–16, 18, 19, 41, 42). Exceptions include the slightly superior performance of BOADICEA and the Manchester Scoring System in the United Kingdom, of the Italian IC software modification of BRCAPRO among Italians, and of the LAMBDA model among Ashkenazi Jewish probands; some of these models use population-specific mutation prevalence estimates, which tailor them to the groups under study (7, 9, 19, 43). In the present study, we evaluated model discrimination in three separate racial/ethnic populations, which may differ in their prevalence of BRCA mutations, and in the variance of their carriage probabilities. Variation in the probability of mutation carriage within a racial/ethnic population affects a model’s AUC. For example, if all breast cancer patients in a racial/ethnic group had the same mutation carrier probability, the model’s AUC (which is the probability that the carriage probability assigned to a randomly selected carrier exceeds that of a randomly selected noncarrier) would equal 50%, indicating that the model is no better at discriminating between carriers and noncarriers than random chance. Given such intra-group variability, it is difficult to compare a model’s discrimination across different racial/ethnic groups. Comparing BRCAPRO’s and BOADICEA’s discriminative abilities within a single population is more straightforward, and we found no difference between models in any of the three racial/ethnic groups under study.
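The dependence of AUC on the spread of carriage probabilities can be illustrated numerically. The sketch below assumes a perfectly calibrated model and uses a large-population approximation in which each pair of patients contributes in proportion to the chance that the first is a carrier and the second is not. The function name and the example probabilities are hypothetical.

```python
def expected_auc(probabilities):
    """Approximate AUC of a perfectly calibrated model.

    Each ordered pair (i, j) is weighted by p_i * (1 - p_j), the chance that
    patient i is a carrier and patient j is a noncarrier. The pair counts as
    concordant if p_i > p_j and as half-concordant if the scores are tied.
    """
    numerator = 0.0
    for p_i in probabilities:      # candidate carrier
        for p_j in probabilities:  # candidate noncarrier
            if p_i > p_j:
                concordance = 1.0
            elif p_i == p_j:
                concordance = 0.5
            else:
                concordance = 0.0
            numerator += p_i * (1 - p_j) * concordance
    denominator = sum(probabilities) * sum(1 - p for p in probabilities)
    return numerator / denominator

identical = expected_auc([0.05] * 10)   # everyone has the same probability
narrow = expected_auc([0.04, 0.06])     # little variation between patients
wide = expected_auc([0.01, 0.50])       # large variation between patients
print(f"identical={identical:.2f} narrow={narrow:.2f} wide={wide:.2f}")
```

Even a correctly specified model shows a lower AUC in a population whose carriage probabilities vary little, which is why AUCs are hard to compare across groups.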
Within subsets defined by age and family history of breast cancer, we observed some trends in model performance which did not reach statistical significance (for example, both models discriminated better in younger, compared to older, African Americans). Future research should evaluate race/ethnicity-specific modifications to BRCAPRO’s and BOADICEA’s mutation prevalence assumptions, and compare each model’s discrimination to that of other prediction tools, within the racial/ethnic groups we studied. As understanding of the spectrum of BRCA mutations across race/ethnicity matures, it may prove optimal to develop models specific to each racial/ethnic population.
Although BRCA mutation testing was completed for only 67% of those invited to enroll in the NC-BCFR, the testing rate was similar for patients in Categories A (68%) and B (65%). This similarity suggests that family history was not related to patients’ willingness to participate in the registry and provide biospecimens for research. We assumed that the combination of BRCA testing methods used was 90% sensitive for detection of deleterious mutations (35); however, if testing sensitivity was actually lower than 90%, then the models may over-predict less, and under-predict more, than we report here.
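The direction of this sensitivity effect can be checked with simple arithmetic. The sketch below assumes that the predicted prevalences quoted in the Results already include the 90% factor, as the Methods describe, and rescales them to a hypothetical 80% sensitivity. The function name and the 80% value are illustrative assumptions.

```python
def rescale_prediction(reported_pct, assumed_sensitivity=0.90, alt_sensitivity=0.80):
    """Re-express a predicted prevalence under a different assay sensitivity."""
    return reported_pct * alt_sensitivity / assumed_sensitivity

# (observed %, predicted % at 90% sensitivity), as quoted in the Results.
examples = {
    "BOADICEA, African Americans": (3.0, 5.0),
    "BRCAPRO, Hispanics": (6.4, 3.8),
}

for label, (observed, predicted) in examples.items():
    alt = rescale_prediction(predicted)
    print(
        f"{label}: predicted minus observed = {predicted - observed:+.1f} points "
        f"at 90%, {alt - observed:+.1f} points at 80%"
    )
```

A positive difference indicates over-prediction and a negative one under-prediction; lowering the assumed sensitivity shifts both differences downward.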
In conclusion, the BRCAPRO and BOADICEA models showed differences in performance across racial/ethnic and age groups in a large population-based series of breast cancer patients. This finding emphasizes the need for further study of BRCA mutations in specific racial/ethnic and age groups, and for development of more accurate mutation prediction methods, with customization for the populations to which they are applied.
Notes: This research was supported by the National Cancer Institute, National Institutes of Health, under RFA #CA-95-003 through a cooperative agreement with the Northern California Cancer Center (U01 CA69417), as well as NIH grants CA69417 and CA94069. The content of this manuscript does not necessarily reflect the views or policies of the National Cancer Institute or any of the collaborating centers in the Breast CFR, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government or the Breast CFR. The authors thank Antonis Antoniou for providing BOADICEA software, and Frederick Li for his contributions to the laboratory work for this analysis.
Footnote 1: Available at: http://research.nhgri.nih.gov/bic/.
Footnote 2: Guidelines of the National Comprehensive Cancer Network. Available at: http://www.nccn.org/professionals/physician_gls/PDF/genetics_screening.pdf