Models have been developed to predict the probability that a person carries a detectable germline mutation in the BRCA1 or BRCA2 genes. Their relative performance in a clinical setting is unclear.
To compare the performance characteristics of four BRCA1/BRCA2 gene mutation prediction models: LAMBDA, based on a checklist and scores developed from data on Ashkenazi Jewish (AJ) women; BRCAPRO, a Bayesian computer program; modified Couch tables based on regression analyses; and Myriad II tables collated by Myriad Genetics Laboratories.
Family cancer history data were analyzed from 200 probands from the Mayo Clinic Familial Cancer Program, in a multispecialty tertiary care group practice. All probands had clinical testing for BRCA1 and BRCA2 mutations conducted in a single laboratory.
For each model, performance was assessed by the area under the receiver operating characteristic (ROC) curve and by tests of accuracy and dispersion. Cases “missed” by one or more models (the model predicted less than a 10% probability of mutation when a mutation was actually found) were compared across models.
All models gave similar areas under the ROC curve of 0.71 to 0.76. All models except LAMBDA substantially under-predicted the numbers of carriers. All models were too dispersed.
In terms of ranking, all prediction models performed reasonably well with similar performance characteristics. Model predictions were widely discrepant for some families. Review of cancer family histories by an experienced clinician continues to be vital to ensure that critical elements are not missed and that the most appropriate risk prediction figures are provided.
It is estimated that about 1 in 500 people in the U.S.A. carry deleterious mutations in the genes BRCA1 or BRCA2. The prevalence is higher in some ethnic groups, such as individuals of Ashkenazi Jewish (AJ) descent, for whom the prevalence is about 1 in 40. In a pooled analysis of 22 studies of the relatives of 500 women with breast and ovarian cancer not selected for family history, the cumulative risk to age 70 years of female breast cancer was estimated to be 65% (95% confidence interval [CI] 44–78%) for BRCA1 mutation carriers and 45% (95% CI 31–56%) for BRCA2 mutation carriers. For ovarian cancer, the same cumulative risks were 39% (95% CI 18–54%) and 11% (95% CI 2.4–19%) for BRCA1 and BRCA2 mutation carriers, respectively. Other studies, particularly those performed on selected ethnic groups or on clinic-based populations, have reported even greater risks [2, 3]. The breast cancers that occur in BRCA1 and BRCA2 mutation carriers are on average diagnosed twenty years earlier than in the general population. There is also evidence that carriers have increased risks of cancers of the male breast, peritoneum, fallopian tube, prostate, pancreas, biliary system, and stomach, and of cutaneous and ocular melanoma [4–6].
These well-documented facts have led to recommendations for cancer risk management that are vastly different and more aggressive than recommendations for women with average cancer risks [7, 8]. Therefore, it is incumbent upon health care providers to consider carefully if a woman may have a high risk of being a carrier of a BRCA1 or BRCA2 mutation so that appropriate medical options can be offered.
Mutation analysis of the BRCA1 and BRCA2 genes has been clinically available since the mid-1990s. The decision to proceed with genetic testing is complex and requires discussion about how a test might impact on medical decision making, consideration of possible psychological results of testing, and addressing any cost and insurance implications. The original genetic testing guidelines proposed in 1996 by the American Society of Clinical Oncology  were sometimes interpreted as implying that mutation testing for BRCA genes should be offered to those individuals in whom there was a 10% or greater probability of finding a BRCA1 or BRCA2 mutation. This threshold remains a non-binding guide for genetic counseling of women considering mutation testing.
A variety of models have now been developed to predict the probability that a woman carries a detectable germline mutation in BRCA1 or BRCA2, based on her personal and family cancer history. In practice, predictions across different models are usually compared when providing a risk assessment and making decisions: a high probability would support genetic testing and/or aggressive clinical management, whereas a low probability might support a decision to forego testing. Thus, medical decisions may be directly linked to the quality of information generated by these models. In this study, we compared the performance characteristics of four mutation prediction models using families seen in the Mayo Clinic Familial Cancer Program.
Between 1996 and 2003, probands (defined as the initial consultants in the families) from 260 independent families had cancer risk assessment evaluation and counseling from the Mayo Clinic Familial Cancer Program and subsequently had clinical genetic testing for mutations in BRCA1 and BRCA2. No a priori selection criteria were required. This is a clinic-based population representing those who opted to have genetic testing after having comprehensive genetic risk assessment counseling. Thirteen families with male probands were excluded from this study because only one of the models being evaluated estimated the mutation probability for males. Eleven families were excluded because the proband had only a variant of unknown significance (VUS) in BRCA1 (n = 2) or BRCA2 (n = 9). A detailed family history of cancer diagnoses was collected by either a genetic counselor or a Familial Cancer Program study coordinator trained in genealogic collection. If possible, the pedigrees were extended to all affected and unaffected third-degree relatives of the consultant. Efforts were routinely made to verify the diagnoses of breast and ovarian cancers in relatives, but this was inconsistently accomplished (less than half of cases), so risk estimates were most often reliant on family history reports alone, as is common in clinic practices. Thirty-six families were excluded because several ages were unknown in a pedigree or ages at diagnosis of breast or ovarian cancer were not provided. For the purposes of this study, ductal carcinoma in situ was not considered to be invasive breast cancer, and peritoneal cancer was considered equivalent to ovarian cancer. No fallopian tube cancers were reported in these families. All consultants studied had given permission for use of their medical records for research purposes and this study was approved by the Mayo Clinic Institutional Review Board.
The final study sample consisted of 200 Caucasian probands; 30 (15%) were self-reported to be of AJ ancestry and 46 (23%) had no previous diagnosis of breast or ovarian cancer. Of the 154 with a previous diagnosis, 117 (76%) had breast cancer only, 27 (18%) had ovarian cancer only, and 10 (6%) had both breast and ovarian cancer. Forty-six probands had no type of cancer diagnosis. They were still considered to be the probands in these families.
All mutation testing for BRCA1 and BRCA2 was performed by Myriad Genetics Laboratories, Inc. and consisted of complete sequencing of both genes and, since 8/2003, testing for a large rearrangement-panel in BRCA1. Comprehensive testing was performed for all probands, affected or unaffected, Jewish and non-Jewish.
Table 1 describes the four models used to estimate the probability that the proband was a mutation carrier. The LAMBDA model  was developed specifically for AJ women who were tested only for the three founder mutations. It involves identifying whether the consultant and her first- and second-degree relatives had a diagnosis of breast or ovarian cancer, and if so at what age, checking appropriate boxes, summing scores and reading a probability off a prepared table. Figure 1 shows a sample worksheet for LAMBDA calculations. We applied it to both our AJ and non-AJ probands, who all had the comprehensive genetic testing, with no modification of the scoring system.
The second model was derived from the Myriad II tables, updated in Spring 2004 (http://www.myriad-tests.com/provider/brca-mutation-prevalence.htm). Although a 2006 update has since been posted, the data underlying the 2004 tables are closer in time to the testing of our cohort.
The fourth model tested was an adaptation of the BRCA1-only prediction model based on the Couch tables (sometimes called the Penn I model), and was generated from the BRCAPRO program output. We calculated and analyzed a score that we designated “Couch 1.5”, which involved multiplying the Couch table score by 1.5 and truncating at 1 to simulate a practice that is sometimes used by genetic counselors in clinical risk assessment (personal communications with author). The number derived is thought to predict the combined probability of a mutation in either BRCA1 or BRCA2, based on observations that BRCA1 mutations outnumber BRCA2 mutations in most series by a factor of about two to one.
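The arithmetic behind the “Couch 1.5” score can be sketched as follows. This is a minimal illustration, assuming the Couch table value is expressed as a probability between 0 and 1; the function name `couch_1_5` is hypothetical and is not part of BRCAPRO or any published tool. If BRCA1 mutations outnumber BRCA2 mutations by about two to one, the combined probability is roughly the BRCA1 probability times (1 + 1/2), which is where the factor of 1.5 comes from.

```python
def couch_1_5(couch_brca1_probability: float) -> float:
    """Scale a Couch-table BRCA1 probability by 1.5 and truncate the result at 1.

    Hypothetical helper for illustration only; the input is assumed to be
    a probability on the 0-1 scale, not a percentage.
    """
    if not 0.0 <= couch_brca1_probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return min(1.0, 1.5 * couch_brca1_probability)
```

For example, a Couch value of 0.20 becomes about 0.30, while a value of 0.80 would exceed 1 after scaling and is therefore truncated to 1.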
There was adequate information to calculate an estimate for all 200 probands using the LAMBDA and Myriad II scores, for 182 using BRCAPRO, and for 170 using the Couch 1.5 score. (Note that only BRCAPRO requires information on all unaffected relatives, Couch tables require exact ages of all affected relatives, whereas the Myriad tables require only under-50/over-50 dichotomy for breast cancer.)
The fits of models to the observed data were assessed in several ways. The Receiver Operating Characteristic (ROC) curve is a plot of Y = sensitivity (the proportion of subjects correctly classified when the subject is a carrier) against corresponding values of X = 1 – specificity (the proportion of subjects incorrectly classified when the subject is not a carrier). If a model has no discriminatory power the curve will not differ from the reference line Y = X, while perfect discriminatory power will result in the lines X = 0 and Y = 1. The area under the ROC curve is, therefore, a global measure of a model’s predictive performance. It can be interpreted as the probability that the model’s prediction for a randomly selected carrier (from the pool of probands) will be greater than its prediction for a randomly selected non-carrier. The greater the area, the better the model’s performance in ranking probands by carrier probability.
Each ROC curve was generated by defining a sensitivity and 1 – specificity for every number c between 0 and 1. Each c was interpreted as a cut-off to which the model’s outcome probabilities were compared, giving a binary test for which sensitivity and 1 – specificity could be calculated in the usual way. The resulting points were connected by straight lines, and the area under the curve calculated by the trapezoid rule. Standard errors and confidence intervals for the area under the ROC curve, and tests for the equality of areas under the ROC curve, were calculated using algorithms suggested by DeLong et al. Note that the area under the ROC curve depends only on the relationship between sensitivity and specificity and not on the actual values of the predicted probabilities provided their order is unchanged. That is, it is a test of the ranking of observed versus predicted probability status only.
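The construction described above can be sketched in a few lines of Python. This is an illustrative sketch, not the STATA or R code used in the study: the function names are hypothetical, a proband is treated as test-positive when the predicted probability is at least the cut-off (an assumption about the direction of the comparison), and the standard errors of DeLong et al. are not reproduced. The second function computes the same area directly from its interpretation as the probability that a randomly selected carrier is ranked above a randomly selected non-carrier, with ties counted as one half.

```python
def roc_points(predicted, is_carrier):
    """Return (1 - specificity, sensitivity) points, one per distinct cut-off."""
    carriers = sum(1 for y in is_carrier if y)
    non_carriers = len(is_carrier) - carriers
    # Only cut-offs equal to an observed prediction change the classification.
    cutoffs = sorted(set(predicted), reverse=True)
    points = [(0.0, 0.0)]  # cut-off above every prediction: nobody is positive
    for c in cutoffs:
        true_pos = sum(1 for p, y in zip(predicted, is_carrier) if p >= c and y)
        false_pos = sum(1 for p, y in zip(predicted, is_carrier) if p >= c and not y)
        points.append((false_pos / non_carriers, true_pos / carriers))
    return points


def auc_trapezoid(points):
    """Area under straight-line segments joining consecutive ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def auc_by_ranking(predicted, is_carrier):
    """Probability that a random carrier outranks a random non-carrier."""
    carrier_scores = [p for p, y in zip(predicted, is_carrier) if y]
    other_scores = [p for p, y in zip(predicted, is_carrier) if not y]
    wins = 0.0
    for a in carrier_scores:
        for b in other_scores:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5  # ties count as half a win
    return wins / (len(carrier_scores) * len(other_scores))
```

Applying any order-preserving transformation to the predictions (squaring them, for instance) leaves both areas unchanged, which illustrates why the area under the ROC curve assesses ranking only and says nothing about calibration.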
Therefore, we also conducted two tests to assess the fit of the predicted carrier probabilities to the observed data, as described in Cox and Snell and applied in Apicella et al. These tests assess the predicted carrier probabilities for systematic under- or over-estimation (i.e. accuracy) and for under- or over-dispersion. Over-dispersion typically occurs when too few of the probands with high predicted carrier probabilities are carriers (i.e. the predictions are too high at the top end) and/or too many of the probands with low predicted probabilities are carriers (i.e. the predictions are too low at the bottom end).
The degree to which a model over-estimates or is over-dispersed can be estimated (with standard errors) using logistic regression. We can therefore compare the accuracy and dispersion of models by comparing their corresponding regression coefficients. We assume (conservatively) that these coefficients are uncorrelated. This likely gives an over-estimate of the variance of their difference so, for example, when testing whether two models differ in accuracy or dispersion the true P-value will be less than the one cited.
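The role of the independence assumption can be made concrete with a small sketch of a two-sided z-test for the difference between two regression coefficients. The function name and the numerical inputs in the usage note are hypothetical and are not taken from the study; the point is only to show where the covariance term enters the variance of the difference.

```python
import math


def difference_test(coef_a, se_a, coef_b, se_b, covariance=0.0):
    """Two-sided z-test for coef_a - coef_b.

    covariance=0.0 corresponds to treating the two estimates as uncorrelated.
    Returns (z statistic, P-value).
    """
    # Var(a - b) = Var(a) + Var(b) - 2 Cov(a, b)
    variance = se_a ** 2 + se_b ** 2 - 2.0 * covariance
    z = (coef_a - coef_b) / math.sqrt(variance)
    # Two-sided P-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

With illustrative coefficients 1.0 and 0.2 and standard errors 0.3 and 0.4, treating the estimates as uncorrelated gives z = 1.6. If the estimates are in fact positively correlated, as is plausible when both models are fitted to the same probands, the variance of the difference is smaller, the z statistic is larger, and the P-value is smaller than the one obtained under independence.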
Statistical analyses were performed with STATA version 9.0 and R version 2.1.0. Following convention, all statistical tests were two-sided and statistical significance was based on a P-value of less than 0.05.
Overall, 46 (23%) consultants had deleterious BRCA1 mutations, and 20 (10%) had deleterious BRCA2 mutations (total 33%). Of the 30 AJ consultants, six (20%) were carriers, one of whom had a BRCA mutation other than a founder mutation. Table 2 shows the relationship between proband age, cancer status, and mutation results.
Table 3 and Fig. 2 describe the performance of the four models. The median predicted probabilities were 36% for LAMBDA, 29% for BRCAPRO, 28% for Couch 1.5 and 17% for Myriad II. LAMBDA was the only model that over-predicted carriers (by 11%). The other three models underpredicted carriers by 15% (BRCAPRO), 19% (Couch 1.5) and 48% (Myriad II). A formal test of observed versus predicted values found that the deviation was not statistically significant for LAMBDA (P = 0.3), but was for all other models; BRCAPRO (P = 0.01), Couch 1.5 (P = 0.01) and Myriad II (P < 0.001). LAMBDA gave a more accurate prediction than Myriad II (P < 0.001) and Couch 1.5 (P = 0.05).
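One simple way to express over- or under-prediction of this kind is to compare the expected number of carriers, taken as the sum of the predicted probabilities, with the observed number. The sketch below assumes that the percentage is defined relative to the observed count; the text does not state the exact definition used, so this is an assumption, and the function name is hypothetical.

```python
import math


def carrier_count_summary(predicted, observed_carriers):
    """Compare the expected and observed numbers of carriers.

    Returns (expected count, percentage difference relative to observed).
    A positive percentage means over-prediction, a negative one under-prediction.
    The choice of denominator is an assumption made for this illustration.
    """
    expected = math.fsum(predicted)  # sum of individual carrier probabilities
    percent_diff = 100.0 * (expected - observed_carriers) / observed_carriers
    return expected, percent_diff
```

For example, four probands with predicted probabilities 0.1, 0.2, 0.3 and 0.4 give an expected count of one carrier; if two carriers were observed, the model under-predicts by 50% on this definition.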
A consistent feature of all models was the large under-estimation for women with low predicted carrier probabilities. For example, each model predicted that a substantial proportion of women had a carrier probability of less than 10%, yet there were at least three times as many observed carriers in this subgroup as the models predicted. This phenomenon persisted for all models, except LAMBDA, up to the predicted probability range of 10–25%. On the other hand, all models except Myriad II over-predicted the number of carriers in the group of women with the highest predicted carrier probabilities (e.g. ≥50%). A formal test found evidence of over-dispersion for all models (all P < 0.001), increasing in strength from LAMBDA to Myriad II to Couch 1.5 to BRCAPRO ( , 40, 100 and 180, respectively). LAMBDA was less dispersed than the latter two models (P < 0.02).
Table 4 shows that the areas under the ROC curves were similar for the four models, ranging only from 0.71 to 0.76. BRCAPRO had the highest area under the ROC curve, but the confidence intervals illustrate that the small differences were not statistically significant (P = 0.3).
Table 5 shows the 21 families in which one or more of the models predicted <10% chance of a mutation, but a deleterious mutation was found. In this subset of 21 families, the LAMBDA model “missed” the fewest carriers (33%), compared with BRCAPRO (71%), Couch 1.5 (61%), and Myriad II (76%).
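The tabulation of “missed” carriers can be sketched as below. The data layout (a dictionary of predictions for carrier probands, keyed by model and then by proband identifier, with `None` where a model could not be calculated) and the function name are assumptions made for illustration; they do not reflect how the study data were stored.

```python
THRESHOLD = 0.10  # predicted carrier probability below which a carrier is "missed"


def missed_carriers(predictions_by_model):
    """Summarize carriers missed by each model.

    predictions_by_model: {model: {proband_id: probability or None}},
    containing confirmed mutation carriers only.
    Returns (probands missed by at least one model,
             {model: (number missed, number of such probands)}).
    """
    flagged = sorted({
        pid
        for preds in predictions_by_model.values()
        for pid, p in preds.items()
        if p is not None and p < THRESHOLD
    })
    summary = {}
    for model, preds in predictions_by_model.items():
        missed = [
            pid for pid in flagged
            if preds.get(pid) is not None and preds[pid] < THRESHOLD
        ]
        summary[model] = (len(missed), len(flagged))
    return flagged, summary
```

A proband for whom a model's prediction is unavailable is not counted as missed by that model in this sketch, which is one of several reasonable conventions.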
Evaluation of human pedigree disease data for possible transmission of genetic susceptibility is complex and is made all the more difficult by small family size, missing information, incomplete penetrance, variable expressivity, and environmental exposures that vary from person to person. Nevertheless, the family’s medical history remains a powerful tool for identifying individuals most likely to have a high genetically inherited risk for specific diseases. This has become all the more relevant now with the discovery of some genes which, when mutated in the germline, are responsible for a proportion of common diseases.
We have considered four models that derive, for a woman with a particular personal and family history of breast and ovarian cancer, the probability that she carries a germline deleterious mutation in the genes BRCA1 or BRCA2, and therefore is at substantially increased risk of breast and ovarian cancer. All models gave comparable areas under the ROC curve; i.e. they are globally similar to one another in how well they rank women according to carrier probability. All models except for LAMBDA underestimated the total number of carriers in the cohort. All models were over-dispersed; i.e. they generally predicted too high a probability for those most likely to be carriers, and/or too low a probability for those least likely to be carriers.
Our interest in assessing the LAMBDA model derived from observing how straightforward it was to use in an office setting without requiring a computer program (Fig. 1), while being more nuanced in its use of age ranges than the Myriad II tables. LAMBDA was developed using data from AJ women. It is, perhaps, surprising that it performed well without any adjustment of the scoring system in families attending the Mayo Clinic, given that only 15% were of AJ descent. The LAMBDA model was less over-dispersed than the other three models. LAMBDA was also less likely to “miss” mutation carriers than all other models, which is extremely important in clinical practice, where missing a carrier is of more significance than overestimating the risk of a non-carrier. Further studies of this model in other non-AJ populations are warranted.
The computer-based BRCAPRO model showed substantial under-estimation of risk in the lower risk groups and over-estimation in the highest risk group. It was the most demanding to implement due to the amount of data required—it is necessary to know of the existence and ages of each unaffected relative. The integration of such data using a Bayesian approach likely contributed to the model’s ability to give a higher probability to carriers in some families that were given a lower probability (i.e. more likely to be “missed”) by other models. The time required to enter data into a computer program is a deterrent to BRCAPRO becoming an office tool for any busy genetics clinic in its current form.
While the Myriad II tables were convenient to use, the quality of the data they are based on is highly dependent on the clinicians who ordered the mutation testing. Furthermore, the categorization of breast cancer cases in relatives is restrictive. Only those under 50 years at diagnosis are considered relevant. The tables do not recognize the difference between having no relatives with breast cancer after age 50 and having multiple relatives with breast cancer after age 50. As with the other models, information on third-degree relatives is not incorporated. Despite these recognized issues, the tables performed about as well as the other models in ranking carriers.
The Couch tables, while convenient and simple to use, were created strictly for BRCA1 prediction. A rule of thumb sometimes used in the genetic counseling community (personal communications) is to multiply the “Couch” score for an individual by 1.5 so as to predict being a carrier of a mutation in either BRCA1 or BRCA2. To our knowledge this approach has never been validated. Used as intended, the “Couch” score matched BRCA1 results reasonably well (Fig. 3). Using the 1.5 multiplication rule of thumb gave results similar to Myriad II and BRCAPRO, with over-estimation of risk in the highest risk group and substantial under-estimation in the lower risk group.
The area under the ROC curve has been used by almost every study of carrier prediction models as a test of performance, yet it only tests the ability of models to rank carriers. This test does not reveal if the models gave similar results across families. It is of great clinical relevance to know if the four models “missed” the same probands, so we reviewed the 21 women for whom a deleterious mutation was discovered (15 in BRCA1 and 6 in BRCA2) but for whom one or more of the models predicted a <10% carrier probability (Table 5). For some, such as proband 1, no model predicted a high probability. The proband was unaffected but had two paternal aunts with ovarian cancer and several third and fourth degree relatives with early onset breast cancer. None of the models take into account relatives of the consultant more distant than second degree. A genetic counselor would likely detect this pattern of cancer in the extended paternal lineage, underscoring the importance of collecting an extended pedigree and experienced clinical review. On the other hand, there were families with wide discrepancies in prediction, such as proband 12. In her family there were four women with breast cancer, one bilateral. Two of the breast cancers were diagnosed before the age of 50 years, and the others were diagnosed in the 50s and 60s. The high incidence of breast cancer in this family was interpreted by the BRCAPRO and LAMBDA models as suggesting the proband was highly likely to be a carrier, but not by the Myriad II and Couch 1.5 models. Of this set of 21 carrier probands “missed” by one or more models through using the 10% threshold, the LAMBDA model “missed” the fewest (33%), whereas BRCAPRO “missed” 71%, Couch 1.5 “missed” 61%, and Myriad II “missed” 76%.
We considered what factors related to the nature of our Mayo Clinic practice may have affected the performance characteristics of the models. Unlike in some model development processes, we did not have any pre-determined inclusion criteria on who could have BRCA testing. Individuals are referred to the clinic either because of their own interest or concern about their cancer risk or because of the concern of their personal provider. Once referred to the genetic clinic, they receive comprehensive individualized risk assessment and counseling on the risks and benefits of testing, including information on costs, psychological factors, and medical and surgical options for screening and risk reduction. After that discussion, individuals decide if they want to have testing. This is quite different from some centers or research registries in which predetermined family history criteria may be required to access genetic testing. As a result, we may see people who are concerned but who are actually at low risk, and we may see people with more extended family histories that are not captured by simplistic family history criteria but may be important nevertheless. For example, cancer in cousins (third-degree relatives) is not included in most models. Small family size can limit eligibility for registries with pre-determined requirements, and some registries might not recruit or test an unaffected family member. We thus may be ascertaining families outside the boundaries upon which the models were originally designed. We would argue, though, that understanding the performance of the models in real-world populations is of considerable interest and value.
Several other groups have been evaluating risk prediction model performance [19–24]. James et al. studied 257 probands from a cancer family clinic in Melbourne, Australia (27% with mutations) and compared performance of the Myriad, Couch, BRCAPRO, FHAT and MANCHESTER approaches. They found the best discriminator between carriers and non-carriers was BRCAPRO, and this could be enhanced by incorporating pathology data. Antoniou et al. studied 195 French-Canadian probands and compared predicted carrier probabilities under the BOADICEA and BRCAPRO computer models. They found, as we did, that BRCAPRO over-predicted carriers with high predicted probabilities. Barcenas et al. studied 472 probands (21% AJ) and compared BRCAPRO, BOADICEA, Myriad II, Manchester and Couch models. The conclusion was that BOADICEA performed better than the other models for AJ families, while overall, BRCAPRO, BOADICEA, and Myriad II performed similarly, with Myriad II being the easiest of these to use. Nanda et al. evaluated BRCAPRO in African-American probands and found that BRCAPRO performed as well in this population as in the white and AJ families. Euhus et al. studied 301 probands (42% AJ), comparing BRCAPRO with six experienced counselors. The sensitivity for identifying mutation carriers was equivalent, but BRCAPRO showed slightly superior ability to discriminate carriers from non-carriers. Gerdes et al. studied 267 Danish families with high-risk family histories and compared results between the Myriad tables and the Manchester model. The Manchester model uses a scoring system with a maximum of 10 points for each gene, intended to reflect a >10% probability of finding a mutation in that gene. Using a 10% threshold for recommending testing, the updated Manchester model would have had 84% sensitivity and 44% specificity compared to the Myriad model, which would have had 79% sensitivity and 43% specificity. Kang et al. analyzed 380 pedigrees on the BRCAPRO, Manchester, Couch/Penn, and Myriad models and reported the area under the ROC curve as about 0.75 for all models. Simard et al. studied 256 high-risk families among French-Canadians in Quebec, Canada. They compared the Myriad tables with the Manchester model and a logistic regression approach derived from the data in this study and reported that their logistic regression and the Manchester scores provided equal predictive powers and both were significantly better than the Myriad tables. Using a Manchester score of ≥18 provided a sensitivity of 86% and specificity of 82%. No published studies have reported cross-model comparisons with the LAMBDA model to date.
An important observation from our study is that LAMBDA, developed for AJ women, out-performed or at worst matched the other major tools when applied to a general clinical setting in which the vast majority (85%) were not AJ. This study suggests that the simple LAMBDA scoring system may have considerable validity outside the AJ setting, at least for ranking women in terms of their carrier probabilities. It is likely that, given an appropriately large data set, LAMBDA can be adjusted to give a better fit to non-AJ women than has been achieved here, especially if information on pathology features of tumors is available.
The results of our study reflect in part the referral and testing pattern of one institution and may not reflect model performance at other institutions. Our study has shown that BRCA1 and BRCA2 mutation probability prediction models can give misleading and discordant results in some families, especially at the extreme ends of the probability scale. It also suggests that review of cancer family histories by an experienced clinician, supplementing the use of multiple models to provide risk estimates, is the best strategy to ensure that critical elements of clinical risk assessment are not overlooked.
Noralane M. Lindor, Department of Medical Genetics, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA.
Rachel A. Lindor, College of St. Benedict, St. Joseph, MN, USA.
Carmel Apicella, The University of Melbourne, Melbourne, Australia.
James G. Dowty, The University of Melbourne, Melbourne, Australia.
Amanda Ashley, Department of Medical Oncology, Mayo Clinic College of Medicine, Rochester, MN, USA.
Katherine Hunt, Department of Hematology/Oncology, Mayo Clinic Arizona, Scottsdale, USA.
Betty A. Mincey, Cancer Clinical Studies Unit, Mayo Clinic Jacksonville, Jacksonville, USA.
Marcia Wilson, Department of Medical Genetics, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA.
M. Cathie Smith, Cancer Clinical Studies Unit, Mayo Clinic Jacksonville, Jacksonville, USA.
John L. Hopper, The University of Melbourne, Melbourne, Australia.