We studied 17,576 members of 166 MLH1 and 224 MSH2 mutation-carrying families from the Colon Cancer Family Registry. Average cumulative risks of colorectal cancer (CRC), endometrial cancer (EC) and other cancers for carriers were estimated using modified segregation analysis conditioned on ascertainment criteria. Heterogeneity in risks was investigated using a polygenic risk modifier. Average CRC cumulative risks to age 70 years (95% confidence intervals) for MLH1 and MSH2 mutation carriers, respectively, were estimated to be 34% (25%-50%) and 47% (36%-60%) for male carriers and 36% (25%-51%) and 37% (27%-50%) for female carriers. Corresponding EC risks were 18% (9.1%-34%) and 30% (18%-45%). A high level of CRC risk heterogeneity was observed (p<0.001), with cumulative risks to age 70 years estimated to follow U-shaped distributions. For example 17% of male MSH2 mutation carriers have estimated lifetime risks of 0-10% while 18% have risks of 90-100%. Therefore, average risks are similar for the two genes but there is so much individual variation about the average that large proportions of carriers have either very low or very high lifetime cancer risks. Our estimates of CRC and EC cumulative risks for MLH1 and MSH2 mutation carriers are the most precise currently available.
DNA mismatch repair (MMR) genes encode proteins which detect and repair DNA mismatches that can occur during cell replication (Aaltonen, et al., 1994). A person with a germline mutation in any of the MMR genes MSH2, MLH1, MSH6 or PMS2 (MIM 120436, 609309, 600678, 600259) has an increased risk of colorectal carcinoma (CRC), endometrial carcinoma (EC) and cancers of the stomach, ovary, ureter, renal pelvis, brain, small bowel, hepatobiliary tract (Umar, et al., 2004) and possibly the breast (Jensen, et al., 2010; Lynch, et al., 1988; Walsh, et al., 2010; Win, et al., 2012), prostate (Grindedal, et al., 2009; Soravia, et al., 2003) and pancreas (Kastrinos, et al., 2009). Carriers of deleterious MMR gene mutations who develop cancers with MMR deficiency are said to have Lynch syndrome (MIM 120435), formerly known as hereditary nonpolyposis colorectal cancer (HNPCC) (Jass, 2006). Lynch syndrome is the most common genetic CRC syndrome (Southey, et al., 2005; Wagner, et al., 2003; Wijnen, et al., 1997) and 70-85% of Lynch syndrome is caused by mutations in MLH1 or MSH2 (Barnetson, et al., 2006; Hampel, et al., 2005; Southey, et al., 2005).
While it is known that carriers of germline mutations in MLH1 and MSH2 have high lifetime cumulative risks (penetrance) of CRC, EC and other cancers, estimates of these risks are imprecise (see Supp. Tables S1 and S2). It is also not known whether the estimates derived from carriers identified from familial cancer clinics are applicable to carriers sampled without regard to family histories of cancer. If heritable modifiers of risk exist then carriers with strong family histories of cancer would be expected to carry, on average, more familial disease-causing factors and hence to have greater cancer risks. However, the amount of variability in cancer risks between carriers has not been previously estimated.
Precise estimates of the average age-specific cumulative risks for MLH1 and MSH2 mutation carriers are needed for sound genetic counselling of carriers and their relatives, for choosing optimal surveillance strategies for known carriers and for the efficient identification of carriers based on their family histories of cancer using risk prediction models. The extent of variability in these risks is also of biological, epidemiological and clinical interest. We therefore estimated cancer risks for MLH1 and MSH2 mutation carriers as well as the variability in these risks using one of the largest series of MMR gene mutation-carrying families in the world.
Subjects were from all families recruited between 1997 and 2010 by the Colon Cancer Family Registry (Colon CFR; see Newcomb, et al., 2007 for a detailed description) in which a deleterious mutation in MLH1 or MSH2 has been identified. Mutations were considered pathogenic if the sequence variant resulted in a stop codon, a large duplication or deletion, a frameshift mutation or a missense mutation previously reported in the scientific literature as being pathogenic. All participants who donated blood samples or completed questionnaires gave written informed consent for their data and biospecimens to be used in approved Colon CFR projects. The present study has been approved by the Colon CFR Steering Committee (project number C-CP-0606-03) and by the institutional human ethics committees of all participating centres.
Families were recruited via probands who were either recently diagnosed CRC cases ascertained through population-complete cancer registries in the USA (Puget Sound, Washington State; the State of Minnesota; Los Angeles, California; Arizona; Colorado; New Hampshire; North Carolina; and Hawaii), Australia (Victoria) and Canada (Ontario) (population-based recruitment) or were persons from multiple-case families referred to family cancer clinics in Australia (Melbourne, Adelaide, Perth, Brisbane, Sydney), New Zealand (Auckland) and the USA (Mayo Clinic, Rochester, Minnesota and Cleveland) (clinic-based recruitment). Probands were asked for permission to contact their relatives to seek their enrollment in the Colon CFR. For population-based families, first-degree relatives of probands were recruited at all centers and recruitment was extended to more distant relatives at some centres. For clinic-based families, there were pre-specified rules, consistent across centres, governing which family members were to be approached for recruitment. Written informed consent was obtained from all study participants and the study protocol was approved at each Colon CFR site.
Standardized questionnaires were used to collect data on each participant and his or her first- and second-degree relatives, including their sexes, cancer sites and dates of birth, death (if applicable), onset of any cancer and any prophylactic surgery to the colon, rectum, endometrium or cervix. Validation was sought for all reported diagnoses of CRC and, at some centres, for all invasive cancers. In this paper, hepatobiliary cancer means cancer of the liver, gall bladder and biliary tract while urinary tract cancer means cancer of the bladder, ureter, renal pelvis and kidney. Due to a lack of access to pathology reports, we could not differentiate cancers of the renal pelvis from other kidney cancers nor adenocarcinomas of the cervix from squamous cell carcinomas.
For population-based families, subjects were restricted to probands and their first- and second-degree relatives. For clinic-based families, all available family members were included in the analysis. Five probands with de novo mutations (Win, et al., 2011c) (two in MLH1 and three in MSH2) and their families could not be easily included in the segregation analyses so were excluded from all analyses. Families who were recruited through separate probands but found to contain members in common were combined, giving nine families which each had two probands, and these families were treated as clinic-based in the segregation analyses. All ages were truncated at age 80 years and subjects for whom sex was unknown were censored at birth. Affected subjects with missing ages at diagnosis (comprising 14% of CRC diagnoses, 8% of EC diagnoses and 22% of non-colorectal, non-endometrial (NCNE) Lynch cancer diagnoses) were included in the analysis by marginalizing over the missing ages (see Statistical methods). Unaffected subjects with missing ages were censored at birth, effectively removing them from the analysis, since their affected statuses were considered to be unreliable.
Standardized protocols were used to obtain and prepare biospecimens and to conduct all laboratory analyses. Sequence variants in MLH1 and MSH2 were termed ‘mutations’ if they encoded stop codons, large duplications or deletions, frameshift mutations or one of the missense mutations previously reported in the scientific literature as being pathogenic. Screening for germline mutations in MLH1 and MSH2 (and, at some centres, in MSH6 and PMS2) was performed for all clinic-based probands and for those population-based probands whose colorectal tumours displayed impaired MMR function, as evidenced by either microsatellite instability or the absence of MMR protein expression in immunohistochemical assays. Mutation testing was performed by Sanger sequencing or denaturing high pressure liquid chromatography (dHPLC), followed by confirmatory DNA sequencing (Newcomb, et al., 2007; Southey, et al., 2005). Large duplications and deletions in MMR genes were detected by multiplex ligation-dependent probe amplification (MLPA) according to the manufacturer's instructions (MRC Holland, Amsterdam, The Netherlands). Each participant who donated a blood sample and was related to a mutation-carrying proband underwent genetic testing for the proband's mutation.
Mean ages at diagnosis and the corresponding sample standard deviations were calculated for various cancer sites using R version 2.11.1 (R Development Core Team, 2010). Probands and cases with missing or imputed ages at diagnosis were excluded from this descriptive analysis.
Age-specific hazard ratios (HRs), i.e. the age-specific cancer incidence for carriers divided by that for the population, were estimated using modified segregation analysis (Antoniou, et al., 2001; Lange, 2002) (as described in detail in Supplementary Statistical Methods). This analytical method is not subject to population stratification, can be rigorously adjusted for many methods of ascertainment and uses data on all study participants, whether genotyped or not, thereby maximising statistical power. Models were fitted by the method of maximum likelihood using MENDEL (Lange, et al., 1988) version 3.2 and were appropriately adjusted for the clinic- and population-based ascertainment of study participants using a combination of retrospective likelihood and ascertainment-corrected joint likelihood (Antoniou, et al., 2001; Gong, et al., 2010; Kraft and Thomas, 2000). More specifically, a conditional likelihood was maximized where each pedigree's data was conditioned on the proband's genotype, cancer status and age of onset (for population-based families) or on the proband's genotype and the affected statuses and ages of onset of all family members at the time the proband was found to be a MMR gene mutation carrier (for clinic-based families). HRs and measures of risk heterogeneity were estimated and the corresponding cumulative risks, risk distributions and 10-year risks were then derived from these estimates (see below). Individuals were censored at the earliest of death, last known age alive, prophylactic surgery (polypectomy, bowel surgery or hysterectomy) or the first diagnosis of cancer (as applicable) so the resulting estimates describe risks of first primary cancers for people who have not undergone prophylactic surgery, regardless of whether or not they are undergoing surveillance. Censoring for surgery was site-specific, e.g. individuals were still considered to be at risk of CRC following hysterectomy.
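The censoring rule described above can be illustrated with a minimal sketch. It is not the MENDEL implementation; the function name `follow_up_end` and its arguments are hypothetical, and it assumes that the caller passes only surgery relevant to the cancer site in question (for example, hysterectomy for EC but not for CRC) and that a first cancer at any other site ends follow-up.

```python
from typing import Optional, Tuple


def follow_up_end(site_cancer_age: Optional[float] = None,
                  other_cancer_age: Optional[float] = None,
                  site_surgery_age: Optional[float] = None,
                  death_age: Optional[float] = None,
                  last_alive_age: Optional[float] = None,
                  max_age: float = 80.0) -> Tuple[float, bool]:
    """Return (age at end of follow-up, True if a cancer at the site was observed).

    Hypothetical helper: all names are illustrative, not taken from the paper.
    """
    # Pair every possible end-point with a flag saying whether it is the
    # event of interest. Only a cancer at the site counts as an event;
    # everything else censors the individual.
    candidates = [
        (site_cancer_age, True),
        (other_cancer_age, False),
        (site_surgery_age, False),  # site-specific prophylactic surgery only
        (death_age, False),
        (last_alive_age, False),
    ]
    known = [(age, is_event) for age, is_event in candidates if age is not None]
    if not known:
        # No usable age at all: censored at birth (contributes no follow-up).
        return (0.0, False)
    # The earliest age ends follow-up; on a tie, prefer the event.
    age, is_event = min(known, key=lambda pair: (pair[0], not pair[1]))
    if age >= max_age:
        # Ages are truncated at 80 years, so later events are not counted.
        return (max_age, False)
    return (age, is_event)
```

For instance, `follow_up_end(site_cancer_age=60, site_surgery_age=50)` ends follow-up at the surgery, so the later cancer does not count as an event.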
Two different genetic models were used in the modified segregation analyses: a major gene model, in which only genotypes at the relevant MMR gene for each family were modelled, and a mixed model, which incorporated an unmeasured polygene in addition to the major gene (Antoniou, et al., 2001; Cannings, et al., 1978). All estimates presented in this paper for cumulative risks, 10-year risks, estimates of risk heterogeneity and HRs for CRC and EC were based on the mixed model. HR estimates for separate NCNE Lynch cancers and non-Lynch syndrome associated cancers were based on the major gene model, as were hypothesis tests for the dependence of cancer risks on sex, gene, country, setting (clinic versus population), mutation category, proband's age at onset and modelling assumptions. The use of different genetic models for different purposes was necessary because simulation has shown that major gene models are likely to give biased estimates of risk (Gong, et al., 2010) but analyses using the mixed model were prohibitively slow to run so could not be used to individually estimate a large number of parameters or to test a large number of hypotheses. Therefore, a major gene model was used for model-building while the final estimates of risk were based on a mixed model.
HRs for CRC, EC and NCNE Lynch cancers were estimated simultaneously to allow proper adjustment for CRC-based ascertainment schemes when estimating the risks of non-CRC cancers and to increase power (by helping the model identify likely carriers from the placement of Lynch syndrome-associated cancers within each family). For each site, the age at cancer diagnosis was modelled as a random variable whose hazard was the relevant population incidence multiplied by a site-specific HR. Except when testing for differences in risk by country, HRs were assumed to be independent of country of residence, as would be expected if different population incidences only occur because of variation in the prevalence of risk factors between countries and if these risk factors and MMR gene mutations all act multiplicatively on the HR. The ages at diagnosis of different family members were assumed to be conditionally independent given genotype; however, residual correlation of cancer within families was incorporated in most analyses by including an unmeasured polygene (in addition to the major gene) in the definition of genotype (see below). As in Quehenberger, et al. (2005), the ages at diagnosis for different cancer sites within the same individual were assumed to be conditionally independent, given genotype, until diagnosis of the first cancer. Missing ages at diagnosis were treated using the standard method for missing data in likelihood-based estimation, namely marginalization (integration) over all missing values before maximizing the likelihood function (Little and Rubin, 2002).
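The hazard model and the treatment of missing ages can be sketched as follows. This is an illustration under simplifying assumptions, not the fitted model: it works in one-year age steps, uses a single constant HR (the HRs estimated in this paper were age-specific), ignores the polygenic component and competing events, and uses made-up population incidences. All function names are hypothetical.

```python
import math


def survival(pop_incidence, hr, age):
    """Probability of remaining cancer-free up to (not including) `age`.

    The hazard at each year of age is the population incidence for that
    year multiplied by the hazard ratio `hr`.
    """
    cumulative_hazard = sum(hr * pop_incidence[a] for a in range(age))
    return math.exp(-cumulative_hazard)


def cumulative_risk(pop_incidence, hr, to_age):
    """Cumulative risk of diagnosis before `to_age`."""
    return 1.0 - survival(pop_incidence, hr, to_age)


def prob_diagnosis_at(pop_incidence, hr, age):
    """Probability of being diagnosed during the year of age `age`."""
    return survival(pop_incidence, hr, age) - survival(pop_incidence, hr, age + 1)


def affected_likelihood(pop_incidence, hr, age=None, max_age=80):
    """Likelihood contribution of an affected person.

    If the age at diagnosis is missing (None), marginalize by summing the
    contribution over every possible age up to the truncation age.
    """
    if age is not None:
        return prob_diagnosis_at(pop_incidence, hr, age)
    return sum(prob_diagnosis_at(pop_incidence, hr, a) for a in range(max_age))


# Illustrative inputs only: a flat incidence of 5 per 10,000 person-years
# and a hazard ratio of 20 are assumptions, not estimates from this study.
example_incidence = [0.0005] * 80
example_risk_to_70 = cumulative_risk(example_incidence, 20.0, 70)
```

With these illustrative inputs the cumulative hazard to age 70 is 70 × 0.0005 × 20 = 0.7, so the cumulative risk is 1 − exp(−0.7), roughly 50%. In this simplified setting the marginalized likelihood for a missing age equals the cumulative risk to age 80, because the yearly probabilities sum to the probability of diagnosis at any age before 80.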
Our analyses were based on 17,576 people from 166 families segregating MLH1 mutations and 224 families segregating MSH2 mutations. Forty-eight (29%) of the MLH1 and 64 (29%) of the MSH2 mutation carrying families were population-based. Of all families, 253 (65%) carried a protein-truncating mutation while 137 (35%) carried another type of pathogenic mutation. There were 160 (41%) families from Australasia (Australia and New Zealand combined), 104 (27%) from Canada and 126 (32%) from USA.
Mutation-specific testing of those relatives of the probands who had provided blood samples identified an additional 768 carriers and 1128 non-carriers. The number and average ages at diagnosis of the first primary cancers of all cases, other than those of probands and cases with unknown ages at diagnosis, are listed in Table 1. There were almost twice as many extra-intestinal Lynch cancers (i.e. Lynch cancers other than those of the colon, rectum or stomach) in MSH2 mutation-carrying families than in MLH1 mutation-carrying families.
The estimated average HRs for various cancers are given in Table 2.
The estimated age-dependent CRC HRs for MSH2 mutation carriers were similar to those for MLH1, with any differences consistent with chance (p=0.5 for females and p=0.9 for males). Similarly, no statistically significant difference in EC risks between MLH1 and MSH2 mutation carriers was observed (p=0.3) though the point estimates for MSH2 were higher (see Table 2). CRC HRs for male MLH1 mutation carriers were higher than those for female carriers (p=0.01) but no such difference by sex was detected for MSH2 mutation carriers (p=0.8).
Female MLH1 mutation carriers with protein-truncating mutations were estimated to have CRC incidences 0.40 (95% confidence interval (CI) 0.16-0.98; p= 0.03) times those for female MLH1 mutation carriers with other types of mutations. No other differences in risk between families according to the type of mutation were observed for the other five combinations of sex, gene and site (CRC, EC), raising the possibility that this marginally-significant result is spurious.
For female MLH1 mutation carriers, the estimated CRC incidence varied between families according to the age of diagnosis of the family's proband (a proxy for heritable risk modifiers), being lower by a factor of 0.94 (95% CI 0.90-0.98; p<0.001) for every year of increase in the proband's age at diagnosis. No other differences in risk were observed for any other combinations of sex, gene and site.
No differences were observed between clinic- and population-based families for either CRC or EC risks except that female MLH1 mutation carriers were estimated to have incidences of CRC which are 2.2 times (95% CI: 1.2-4.0; p=0.03) higher in the clinic-based setting.
No differences between countries were observed for CRC or EC risks except for the CRC risks of female carriers. Female carriers from Canada and USA, respectively, were estimated to have CRC incidences equal to those of carriers from Australasia multiplied by factors (95% CI) of 0.44 (0.22-0.87) and 0.31 (0.14-0.71) for MLH1 (p=0.01) and 2.2 (0.81-5.8) and 0.85 (0.28-2.6) for MSH2 (p=0.02).
There was no evidence that male carriers had higher than population-levels of cancer risks at any of the non-Lynch syndrome-associated cancer sites considered (listed in Table 2) except that male MSH2 mutation carriers had incidences of pancreatic cancer estimated to be 18.1 times (95% CI: 8.4-39.0; p<0.001) the population rates. For female MLH1 and MSH2 mutation carriers, cervical cancer incidences were estimated to be 5.5 (95% CI: 1.7-17.7; p=0.01) and 9.7 (95% CI: 3.8-24.8; p<0.001) times the population incidences, respectively. No increased risks were observed for MLH1 and MSH2 mutation carriers at any other sites, notably not for breast cancer (p=0.8 and 0.3, respectively) or prostate cancer (p=0.7 and 0.9, respectively). The CIs for the HR estimates give likely upper bounds for the true breast and prostate HRs (respectively) of 2.6 and 2.5 for MLH1, and 3.3 and 2.3 for MSH2.
There was strong evidence for large heterogeneity in the CRC and EC risks for carriers about the average risks (p<0.001), with the standard deviation of the polygenic component of risk estimated to be 1.6 (95% CI: 1.1-2.1) for CRC and 1.2 (95% CI: 0.1-2.2) for EC. Figure 1 illustrates the predicted U-shaped distribution of lifetime risks of CRC. The standard deviation of the polygenic component of CRC risk was also estimated separately for population- and clinic-based families and found to be 1.5 (95% CI: 1.2-1.8) and 2.0 (95% CI: 1.4-2.7) respectively, consistent with no difference between the two settings (p=0.1).
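To show how a large polygenic standard deviation can produce a U-shaped distribution of lifetime risks, the sketch below assumes a simplified parameterisation: each carrier's cumulative hazard is a common baseline multiplied by exp(g), where g is normally distributed with mean zero and the given standard deviation. The baseline cumulative hazard of 0.7 is an arbitrary illustrative value, the function names are hypothetical, and the sketch is not intended to reproduce Figure 1 or the model detailed in the Supplementary Statistical Methods.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def risk_band_proportions(baseline_cum_hazard, polygenic_sd):
    """Proportion of carriers whose lifetime risk falls in each 10% band.

    Assumes lifetime risk = 1 - exp(-baseline_cum_hazard * exp(g)) with
    g ~ Normal(0, polygenic_sd**2); this is an illustrative assumption.
    """
    def cdf_of_risk(r):
        # P(lifetime risk <= r), obtained by solving the risk formula for g.
        if r <= 0.0:
            return 0.0
        if r >= 1.0:
            return 1.0
        g = math.log(-math.log(1.0 - r) / baseline_cum_hazard)
        return normal_cdf(g / polygenic_sd)

    edges = [i / 10 for i in range(11)]  # 0%, 10%, ..., 100%
    return [cdf_of_risk(hi) - cdf_of_risk(lo) for lo, hi in zip(edges, edges[1:])]


# Standard deviation 1.6 is the CRC estimate reported above; 0.7 is assumed.
heterogeneous = risk_band_proportions(baseline_cum_hazard=0.7, polygenic_sd=1.6)
# For contrast, a hypothetical population with very little heterogeneity.
homogeneous = risk_band_proportions(baseline_cum_hazard=0.7, polygenic_sd=0.1)
```

Under these assumptions the 0-10% and 90-100% bands each hold more carriers than the central 40-50% and 50-60% bands, whereas with a standard deviation of 0.1 almost all carriers fall in the central bands.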
The estimated average age-specific cumulative risks of CRC, EC and other cancers for carriers from USA are given in Table 3 and illustrated in Figure 2. It is estimated that 34% (95% CI: 25-50) and 47% (95% CI: 36-60) of male MLH1 and MSH2 mutation carriers (respectively) will be diagnosed with CRC by the age of 70 years. Of all female MLH1 and MSH2 mutation carriers, an estimated 36% (95% CI: 25-51) and 37% (95% CI: 27-50) respectively will be diagnosed with CRC by the age of 70 years while 18% (95% CI: 9.1-34) and 30% (95% CI: 18-45) respectively will develop EC. Ten-year risks of cancer for unaffected carriers from USA at various ages are given in Table 4. Corresponding results for carriers from Australasia and Canada are given in Supp. Tables S3 - S6 and Figure 2.
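The relationship between cumulative risks and 10-year risks for unaffected carriers can be written as a conditional probability: the risk accrued over the next ten years divided by the probability of still being unaffected at the current age. The sketch below uses hypothetical cumulative risks, not values from Table 3 or Table 4, and the function name is illustrative.

```python
def ten_year_risk(cum_risk_now, cum_risk_in_ten_years):
    """Risk of diagnosis within 10 years for a carrier unaffected at the current age.

    Both arguments are cumulative risks from birth, expressed as
    proportions (e.g. 0.20 for 20%).
    """
    if not 0.0 <= cum_risk_now < 1.0:
        raise ValueError("cum_risk_now must be in [0, 1)")
    if cum_risk_in_ten_years < cum_risk_now:
        raise ValueError("cumulative risk cannot decrease with age")
    return (cum_risk_in_ten_years - cum_risk_now) / (1.0 - cum_risk_now)


# Hypothetical example: cumulative risk of 20% at age 50 and 30% at age 60.
example_ten_year_risk = ten_year_risk(0.20, 0.30)
```

In this hypothetical example the 10-year risk at age 50 is 0.10 / 0.80 = 12.5%, which is higher than the 10 percentage point increase in cumulative risk because it applies only to carriers who are still unaffected at age 50.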
Our estimates of the average risks of CRC to age 70 years are broadly consistent with the estimates of previous studies that have correctly adjusted for ascertainment (see Supp. Table S1). In summary, we observed no differences between the CRC HRs for MLH1 and MSH2 mutation carriers, though male carriers had higher CRC HRs than female carriers for MLH1 but not MSH2 (see Table 2). Evidence that CRC and EC risks depended on setting (clinic versus population) was weak or absent, consistent with no difference by setting, perhaps because family cancer clinics tend to genetically test families with histories of cancer which are less severe now than they were in the past. The estimated 10-year risks of CRC for unaffected carriers at various ages were roughly constant from the fifth decade of life onwards (see Table 4), suggesting that aggressive surveillance is as important for older mutation carriers as it is for younger ones.
Despite our study participants being drawn from countries with low population incidences of stomach cancer, we found relatively high risks of stomach cancer in MLH1 and, to a lesser extent, MSH2 mutation carriers. Imprecision in these estimates warrants caution but they highlight potential benefits of gastroscopy screening, as suggested by (Capelle, et al., 2010). Differences in the estimated stomach cancer risks between male MLH1 and MSH2 mutation carriers could be due to differences in the true risks, imprecision in the HR estimates, younger ages at onset for MLH1 mutation carriers, a higher proportion of MLH1 mutation carriers among the stomach cancer cases than MSH2 mutation carriers or a combination of these.
We assessed risks at cancer sites not currently part of the Lynch syndrome tumour spectrum and found strong evidence for higher incidences of pancreatic cancer, in agreement with a previous study (Kastrinos, et al., 2009), but no evidence for higher incidences of breast or prostate cancer (all p>0.3), though we cannot rule out 2- or 3-fold increased risks, consistent with Win, et al. (2012) (though this study partly overlaps with the present one). Despite these null results, a number of studies have shown that breast and prostate cancers in carrier families often show signs of MMR deficiency (Grindedal, et al., 2009; Jensen, et al., 2010; Lynch, et al., 1988; Soravia, et al., 2003; Walsh, et al., 2010), suggesting some increase in risk and a possible role for molecular characteristics of these tumours in predicting carrier status. We also found a higher risk for cervical cancer, though this could be due to misclassification of adenocarcinomas of the lower uterine segment.
Using a polygenic model of risk heterogeneity, we found very strong evidence that CRC and EC risks are highly heterogeneous, as has been observed anecdotally by a leading clinician (Henry Lynch, personal communication). Our estimates of the size of this risk heterogeneity imply that a substantial proportion of mutation carriers are at population-level risk while a significant minority are almost certain to develop CRC. The U-shapes of the histograms of Figure 1 are not likely to be caused by the mixture of countries, settings (clinic versus population) or mutation types (protein-truncating versus all other types) of our study participants because direct tests for differences in CRC risks by these characteristics found weak or no effects, whereas the estimated risk heterogeneity was quite large. Similarly, the risk heterogeneity is not likely to be caused by differences in screening practices between families since the effect of this must be weaker than that of polypectomy which roughly halves the risk (Jarvinen, et al., 2000). The estimated standard deviation for the polygenic component of CRC risk was 1.6, which corresponds to a relative risk of 3.6 associated with having an affected first-degree relative with early-onset disease (Pharoah, et al., 2002). The cause of this risk heterogeneity is unknown but its size is consistent with the existence of approximately 100 modifier SNPs, each with an HR of 1.05 and a minor allele frequency of 50%, acting independently and multiplicatively to alter the CRC risks for carriers. However, it could also be caused by environmental factors which are correlated within families or by mutation-specific penetrances. 
To date, only a few genetic variants and environmental risk factors have been found to modify the cancer risks of MMR gene mutation carriers (Botma, et al., 2010; Campbell, et al., 2007; Diergaarde, et al., 2007; Felix, et al., 2006; Frazier, et al., 2001; Kruger, et al., 2007; Pande, et al., 2008; Pande, et al., 2010; Reeves, et al., 2008; Talseth-Palmer, et al., 2011; Wijnen, et al., 2009; Win, et al., 2011a; Win, et al., 2011b) and these reports have not been replicated by large studies. Large genome-wide association studies of mutation carriers, similar to those carried out for BRCA1 and BRCA2 by the CIMBA consortium (Couch and Wang, 2009), are needed.
The largest study of cancer risks for MLH1 and MSH2 mutation carriers to date (Bonadona, et al., 2011) provided estimates that either agree with our results or are consistent with them. However, our estimates are far more precise, with CIs half as wide, probably due to the fact that the families of Bonadona et al. were all clinic-based while our study also included population-based families which are more informative in penetrance analyses because they require a less stringent adjustment for ascertainment. We also note that the study of Bonadona et al. did not allow for any risk heterogeneity so their estimates are probably downwardly biased (Gong, et al., 2010). For the same reason, the cumulative risks of (Quehenberger, et al., 2005) are also likely to be underestimates.
Our study has several notable strengths. It is one of the largest studies so far to estimate the penetrance of MLH1 and MSH2 mutations and it included population-based families. Standardized questionnaires and protocols were used by the six different study sites comprising the Colon CFR, ensuring a high degree of homogeneity across sites. Systematic attempts were made to verify all reports of CRC and (at some Colon CFR sites) all cancers, using pathology reports, medical records, corroboration by relatives, cancer registry reports and/or death certificates (where available). Lastly, sophisticated statistical techniques were used which properly adjusted for ascertainment, accounted for residual familial aggregation of disease (thereby avoiding bias) and made use of data on all family members, whether genotyped or not (thereby maximizing statistical power). The main weaknesses of our study are its incomplete validation of cancers and that we could not adequately differentiate between cancers within the same organ. Another weakness, shared by all Lynch syndrome penetrance studies, was the need to either assume non-informative censoring at polypectomy (the approach of our study and most others) or to make speculative assumptions about CRC risks after polypectomy.
We have obtained unbiased estimates of CRC and EC penetrance for MLH1 and MSH2 mutation carriers which are more precise than any currently available. These estimates will be useful for genetic counselling, designing optimal surveillance strategies for carriers and as the key ingredient for risk prediction models which identify likely carriers from their cancer family histories. We have also shown that penetrance varies greatly between carriers, perhaps due to genetic or environmental risk factors, with some mutation carriers at population levels of risk and others almost certain to develop CRC.
This work was supported by the National Cancer Institute, National Institutes of Health under RFA # CA-95-011 and through cooperative agreements with members of the Colon Cancer Family Registry and PIs of the Australasian Colorectal Cancer Family Registry (U01 CA097735), Familial Colorectal Neoplasia Collaborative Group (U01 CA074799) [USC], Mayo Clinic Cooperative Family Registry for Colon Cancer Studies (U01 CA074800), Ontario Registry for Studies of Familial Colorectal Cancer (U01 CA074783), Seattle Colorectal Cancer Family Registry (U01 CA074794), and University of Hawaii Colorectal Cancer Family Registry (U01 CA074806). This work was also supported by a project grant (400160) and JLH's Australia Fellowship from National Health and Medical Research Council, Australia and a grant (454695) from The Cancer Council of Victoria, Australia. The authors thank all study participants of the Colon Cancer Family Registry and staff for their contributions to this project.
The content of this manuscript does not necessarily reflect the views or policies of the National Cancer Institute or any of the collaborating centers in the Cancer Family Registries, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government or the Cancer Family Registry. Authors had full responsibility for the design of the study, the collection of the data, the analysis and interpretation of the data, the decision to submit the manuscript for publication, and the writing of the manuscript.
The authors have no conflicts of interest to declare with respect to this manuscript.