It is well established that protein-truncating variants in CHEK2 confer a moderately increased risk of breast cancer. The OR that we observed for T+SJVs is numerically somewhat higher than that reported in the 2004 CHEK2 Breast Cancer Case-Control Consortium study of c.1100delC ], but not significantly so, as our 95% CIs include the point estimate from that study. Moreover, as previous studies have observed higher ORs for c.1100delC in familial versus sporadic cases and in early-onset versus later-onset cases [9], we should expect that this study's focus on early-onset breast cancer cases, with oversampling of familial cases, would result in relatively high OR estimates.
Previous studies have shown that some CHEK2 missense substitutions are pathogenic, but the scale of their contribution to breast cancer susceptibility relative to that of T+SJVs is not known. Although we hesitate to extrapolate our current data to true population-attributable risks (within the age groups that we sampled) or familial relative risks, the data do provide a basis on which to compare the relative contributions of these two classes of variants. Working from the control carrier frequencies and the OR point estimates (adjusted for race or ethnicity, study center, and age) observed in the population-based Breast CFR sample series, we calculate attributable fractions of 0.014 for T+SJVs as compared with 0.015 for the sum of C15-C65 rMSs. In addition, we calculate a familial relative risk among first-degree relatives of 1.036 for T+SJVs as compared with 1.033 for a product across the C15-C65 rMSs. Thus, as a first approximation, the attributable fractions and familial relative risks of truncating variants and rare missense substitutions are virtually identical. It is important to remember that these attributable fraction and familial relative risk point estimates are inflated compared with those that would be obtained from a population-based study that included patients diagnosed in their 70s or older. In addition, as more than 25% of the T+SJVs observed in this study were nonsense and frameshift mutations other than c.1100delC, these data also speak to the importance of full open reading frame mutation screening to observe the majority of genetically relevant sequence variants in this cancer susceptibility gene.
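As an illustration of how attributable fractions and familial relative risks of this kind can be derived from a carrier frequency and an OR, the following Python sketch may be useful. It assumes Levin's formula for the attributable fraction (with the OR standing in for the relative risk) and a commonly used multiplicative single-locus formula for the first-degree familial relative risk; the text above does not state which formulas were used, so both are assumptions. The function names and the input values are hypothetical and are not the study's data.

```python
import math


def attributable_fraction(carrier_freq, odds_ratio):
    """Levin's attributable fraction, using the OR as a stand-in for the RR (assumption)."""
    excess = carrier_freq * (odds_ratio - 1.0)
    return excess / (1.0 + excess)


def familial_relative_risk(allele_freq, relative_risk):
    """First-degree familial relative risk for one locus under a multiplicative model (assumption)."""
    p = allele_freq
    q = 1.0 - p
    r = relative_risk
    return (p * r * r + q) / (p * r + q) ** 2


def combined_attributable_fraction(fractions):
    """Sum per-class attributable fractions (reasonable only when each is small)."""
    return sum(fractions)


def combined_familial_relative_risk(risks):
    """Multiply per-class familial relative risks across variant classes."""
    return math.prod(risks)


# Hypothetical inputs for illustration only: 1% of controls carry a variant with OR 3.
carrier_freq = 0.01
odds_ratio = 3.0
# For a rare variant, nearly all carriers are heterozygous, so allele frequency
# is taken as half the carrier frequency (assumption).
allele_freq = carrier_freq / 2.0

print("attributable fraction:", attributable_fraction(carrier_freq, odds_ratio))
print("familial relative risk:", familial_relative_risk(allele_freq, odds_ratio))
```

Summing attributable fractions and multiplying familial relative risks across variant classes mirrors the "sum" and "product" wording used above for the C15-C65 rMSs.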
Several of the missense substitutions observed in this study have been subjected to functional assays in one or more published works. For the 14 missense substitutions that Align-GVGD scored C0 and which we would consequently predict to be neutral or nearly so, assay results have been reported for 4 (p.P85L, p.R137Q, p.R180H and p.T323P). Using a Saccharomyces cerevisiae
Rad53 complementation assay, Shaag et al.
] found that p.P85L
is equivalent to wild-type CHEK2
. While Bell et al.
] found this allele to have modestly reduced activity in an in vitro
kinase function assay, both Bell et al.
and Shaag et al.
concluded that the allele is effectively neutral. Sodha et al.
] assayed the p.R137Q
allele and found that it encodes a protein with normal stability and normal response to DNA damage. Bell et al.
] also assayed the p.R137Q
allele and found that it has normal kinase activity. In addition, Sodha et al.
] assayed the p.R180H
allele and found that it encodes a protein with slightly reduced stability but normal response to DNA damage. Thus existing functional assay results for these three variants are consistent with their being either neutral or at most weakly pathogenic. Wu et al.
] found the fourth C0 substitution, p.T323P, to have moderately reduced autophosphorylation and Cdc25C kinase activity. Classification of this substitution as C0 is probably a true Align-GVGD error, because the crystal structure of the protein reveals that T323 is located in an α-helix, which would not typically tolerate substitution to proline. The algorithmic problem is that the atomic composition and polarity of proline are intermediate between those of threonine and isoleucine, which are the two amino acids observed at position 323 in our alignment (atomic composition, polarity and volume are the amino acid side chain characteristics considered by both the original Grantham difference [57] and Align-GVGD). The consequence is that proline is only slightly outside the range of variation represented by these two wild-type residues and is consequently predicted to be neutral or nearly so. Misclassification of substitutions to proline that map within an α-helix is a problem that we have observed before (unpublished data) and is an obvious issue to bear in mind when considering missense substitution analyses made using Align-GVGD. p.I157T is perhaps the most interesting of the substitutions observed in our study that have been subjected to functional assays. Align-GVGD scores the variant as C15, indicative of modest evidence in favor of pathogenicity. Initially, Lee et al.
] found that kinase activity of the p.I157T
allele was comparable to the wild type. More recent studies have reported that the allele is at least partially defective in dimerization and autophosphorylation, binding and phosphorylating Cdc25, and binding BRCA1
]. In populations in which p.I157T and c.1100delC are both present at appreciable frequencies and have been subject to independent risk estimates, p.I157T does appear to confer increased risk of breast cancer, but the OR or penetrance associated with the missense substitution appears to be more modest than that associated with the frameshift c.1100delC
]. At the other end of the spectrum, of the five C65 substitutions that we observed, only one, p.R117G, has been subjected to functional assays. Summing across several studies, the protein encoded by this allele is phosphorylated by ATM in response to DNA damage, shows slightly to markedly reduced autophosphorylation, probably fails to oligomerize and has severely compromised kinase activity toward Cdc25C [39
]. Therefore, the p.R117G
allele encodes a functionally defective protein and is in all likelihood pathogenic. Thus, for the missense substitutions that were observed in our mutation-screening study and subjected to functional assays, there is a qualitative trend toward agreement between the Align-GVGD classification and the functional assay result, consistent with the trend in ORs that we observed across the Align-GVGD-defined ordered series of missense substitution grades. However, since concordant results between in silico
assessments and functional assays are not yet considered sufficient for formal clinical classification of missense substitutions observed in BRCA1
], it does not appear that the state of the art of CHK2 functional assays has reached the point at which concordant results from an in silico assessment and a functional assay would be sufficient for clinically relevant classification of a CHEK2 missense substitution.
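The proline misclassification described above for p.T323P can be made concrete with a small calculation. The Python sketch below is a simplified reading of the Grantham variation (GV) and Grantham deviation (GD) idea: GV is the Grantham-weighted size of the range of side chain properties seen at an alignment position, and GD is the Grantham-weighted distance from the variant residue to that range. The property values and weights are those published by Grantham (1974); the function names are hypothetical, the alignment is reduced to the two residues mentioned in the text (threonine and isoleucine), and the Align-GVGD grade boundaries are not reproduced, so this is an illustration and not the Align-GVGD implementation.

```python
import math

# Grantham (1974) side chain properties: (composition, polarity, volume).
GRANTHAM = {
    "T": (0.71, 8.6, 61.0),
    "I": (0.00, 5.2, 111.0),
    "P": (0.39, 8.0, 32.5),
}
# Grantham's weights for composition, polarity and volume, and the overall scale factor.
WEIGHTS = (1.833, 0.1018, 0.000399)
SCALE = 50.723


def _distance(deltas):
    """Grantham-weighted Euclidean distance for per-property differences."""
    return SCALE * math.sqrt(sum(w * d * d for w, d in zip(WEIGHTS, deltas)))


def grantham_variation(observed):
    """GV sketch: distance spanned by the property ranges of the observed residues."""
    props = [GRANTHAM[aa] for aa in observed]
    ranges = [max(col) - min(col) for col in zip(*props)]
    return _distance(ranges)


def grantham_deviation(observed, variant):
    """GD sketch: distance from the variant to the nearest edge of each property range."""
    props = [GRANTHAM[aa] for aa in observed]
    deltas = []
    for col, value in zip(zip(*props), GRANTHAM[variant]):
        lo, hi = min(col), max(col)
        # Zero when the variant's value lies inside the observed range.
        deltas.append(max(lo - value, value - hi, 0.0))
    return _distance(deltas)


# Position 323: threonine and isoleucine observed, proline as the variant.
gv = grantham_variation("TI")
gd = grantham_deviation("TI", "P")
print(f"GV = {gv:.1f}, GD = {gd:.1f}")
```

In this sketch proline's composition and polarity fall inside the threonine-isoleucine range, so only its smaller volume contributes to GD. That is why the deviation comes out small relative to the variation, even though proline is structurally disruptive within an α-helix.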
The genetic results described in this work, combined with the above functional assay summary, have implications for potential clinical genetic susceptibility tests that might include CHEK2
and other genes with similar mutation profiles. In the 2003 American Society of Clinical Oncology Policy Statement Update on Genetic Testing for Cancer Susceptibility, the second and third "indications for genetic testing for cancer susceptibility" were that "2) the genetic test can be adequately interpreted, and 3) the test results will aid in diagnosis or influence the medical or surgical management of the patient or family members at hereditary risk of cancer" (p. 2398) [67
]. With regard to the third criterion, some investigators have argued that in the context of a high-risk family, the difference in risk between carriers and noncarriers of clearly pathogenic CHEK2
sequence variants is sufficient to justify a difference in cancer surveillance strategies [68
]. However, our results in addition to similar work regarding ATM
] point toward a problem under the second criterion. If roughly one-half of the genetically relevant risk that the test can detect actually resides in rare missense substitutions that will be considered unclassified variants at their initial detection, it may not currently be possible to adequately interpret the test results. Therefore, while it is now technically feasible to design a massively parallel sequencing-based test that can accurately and relatively inexpensively identify mutations in a panel of breast cancer susceptibility genes that includes ATM
], it may be inappropriate to introduce such a test into widespread use before a clinically validated method of assessing unclassified missense substitutions in these genes has been developed.
The rare missense substitution analysis model combining Align-GVGD with the logistic regression test for trends grew out of the in silico
analysis of missense substitutions that has now become a standard component in the integrated evaluation of unclassified variants in BRCA1
]. We proposed the model on the basis of clinical BRCA1
mutation-screening data and then demonstrated its effectiveness by an analysis of ATM
case-control mutation-screening data [7
]. Thus the CHEK2
analysis presented here stands as a methodological confirmation of our approach to the inclusion of rare missense substitution data in case-control mutation-screening studies. The logistic regression test for trends that we used also provides a simple way to combine evidence from rare missense substitutions with evidence from protein-truncating sequence variants, yielding a more complete and statistically powerful assessment of case-control mutation-screening data than would be afforded by either class of variant alone. From a technological perspective, we can envision combining exon capture and massively parallel sequencing to extend case-control mutation screening to entire biochemical pathways and beyond. On the basis of our post hoc power calculations, at least 2,000 patients and 2,000 controls would be required for a whole pathway (such as DNA double-strand break repair and allied cell cycle checkpoints) study, and 3,300 patients and 3,300 controls would be required to undertake a whole-exome study. On the one hand, these numbers could be an underestimate because CHEK2 might be among the most important (in terms of familial relative risk) of the intermediate-risk class of breast cancer susceptibility genes. On the other hand, it could turn out that a test based on observations of evolutionarily unlikely sequence variants has an intrinsically lower false-positive rate than anonymous marker GWASs and consequently would not require a full Bonferroni multiple testing correction to reasonably constrain the rate of false-positive results.
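To illustrate the logistic regression test for trends mentioned above, the following Python sketch fits a one-covariate logistic model, logit P(case) = b0 + b1 × grade, to grouped case-control counts by Newton-Raphson and reports a Wald test on the slope. It is a minimal, unadjusted sketch: the analysis described in this work adjusted for race or ethnicity, study center and age, which is not attempted here. The function name, the integer coding of the ordered variant grades, and all counts are hypothetical and are not the study's data.

```python
import math


def logistic_trend_test(grades, cases, controls, max_iter=50, tol=1e-10):
    """Fit logit P(case) = b0 + b1 * grade on grouped counts; return (b1, se, p_value)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for x, y, c in zip(grades, cases, controls):
            n = y + c
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * p * (1.0 - p)   # information weight for this group
            resid = y - n * p       # observed minus expected cases
            g0 += resid
            g1 += x * resid
            i00 += w
            i01 += x * w
            i11 += x * x * w
        det = i00 * i11 - i01 * i01
        # Newton-Raphson step: solve the 2x2 system (information) * delta = score.
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # Standard error of the slope from the information matrix at the last evaluated iterate.
    se = math.sqrt(i00 / det)
    z = b1 / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return b1, se, p_value


# Hypothetical counts: grade 0 = noncarriers, higher grades = more severe variant classes.
grades = [0, 1, 2, 3]
cases = [1000, 20, 20, 20]
controls = [1000, 20, 10, 4]

slope, se, p_value = logistic_trend_test(grades, cases, controls)
print("log OR per grade:", slope, "SE:", se, "p:", p_value)
```

Because truncating variants can be coded as the top grade of the same ordered series, a single slope term of this kind is one way to pool evidence from both variant classes, which is the combination described in the text.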