Among the 1562 somatic mutations detected with this strategy, 25.5% were synonymous, 62.4% were missense, 3.8% were nonsense, 5.0% were small insertions and deletions, and 3.3% were at splice sites or within the untranslated region (UTR) (Table 1 and table S3). The spectra of somatic mutations can yield insights into potential carcinogens and other environmental exposures. Table 1 lists the spectra observed in the four tumor types that have been subjected to large-scale sequencing analyses of the majority of protein-encoding genes. It is evident that breast tumors have a unique somatic mutation spectrum, with a preponderance of mutations at 5′-TpC sites and a relatively small number of mutations at 5′-CpG sites. However, the spectra of colorectal (13), brain (15), and pancreatic tumors are similar, suggesting that breast epithelial cells are exposed to different levels or types of carcinogens or use distinctive repair systems (16). Given that cells in the colon are expected to be exposed to dietary carcinogens more than breast, brain, or pancreatic cells, one possible interpretation of these results is that dietary components are not directly responsible for causing most of the mutations found in human cancers.
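As an illustration of how a mutation spectrum of this kind can be tallied, the following minimal Python sketch classifies each single-base substitution by its dinucleotide context. It is not the pipeline used in this study. The function names, the input format (5′ neighbor, mutated base, 3′ neighbor), the rule that a mutated G is read as a C on the opposite strand, and the choice to count a C in a TpCpG context as CpG are all assumptions made for the example, and the mutations listed are hypothetical.

```python
from collections import Counter

# Watson-Crick complements, used to read a mutated G as a C on the other strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def context_class(five_prime, ref, three_prime):
    """Return 'CpG', 'TpC', or 'other' for one mutated base and its neighbors.

    Assumption for this sketch: mutations at G are reverse-complemented so
    that every C:G base-pair mutation is scored from the C strand, and CpG
    takes priority when a site is both TpC and CpG.
    """
    five_prime, ref, three_prime = (b.upper() for b in (five_prime, ref, three_prime))
    if ref == "G":
        # Reverse-complement the trinucleotide so the mutated base reads as C.
        five_prime, ref, three_prime = COMPLEMENT[three_prime], "C", COMPLEMENT[five_prime]
    if ref != "C":
        return "other"
    if three_prime == "G":
        return "CpG"
    if five_prime == "T":
        return "TpC"
    return "other"


def spectrum(mutations):
    """Fraction of mutations in each context class."""
    counts = Counter(context_class(*m) for m in mutations)
    total = sum(counts.values())
    if total == 0:
        return {"CpG": 0.0, "TpC": 0.0, "other": 0.0}
    return {k: counts[k] / total for k in ("CpG", "TpC", "other")}


# Hypothetical mutations: (5' neighbor, mutated base, 3' neighbor).
example = [("A", "C", "G"), ("T", "C", "A"), ("C", "G", "T"), ("G", "A", "T")]
example_spectrum = spectrum(example)
```

Comparing such fractions across tumor types is what allows a statement like the one above, that breast tumors show relatively many 5′-TpC and relatively few 5′-CpG mutations.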
Table 1 Summary of somatic mutations in four tumor types. Pancreas data have their basis in 24 tumors analyzed in the current study; brain data have their basis in 21 nonhypermutable tumors analyzed in (15); and colorectal and breast data have their basis in (more ...)
Of the 20,661 genes analyzed by sequencing, 1327 had at least one mutation, and 148 had two or more mutations among the 24 cancers surveyed (table S3). In addition to the frequency of mutations, the type of mutation can provide information useful for evaluating its potential role in disease (18). Nonsense mutations, out-of-frame insertions or deletions, and splice-site changes generally lead to inactivation of the protein products. To evaluate missense mutations, we developed an algorithm that uses machine learning of 58 predictive features based on the physical-chemical properties of amino acids involved in the substitutions and their evolutionary conservation at equivalent positions of conserved proteins (9). Of the 924 missense mutations that could be scored with this algorithm, 160 (17.3%) were predicted to contribute to tumorigenesis when assessed by this method (table S3).
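The proportions in this paragraph follow directly from the reported counts. The short Python check below recomputes them. The variable and function names are chosen for this example only, and the two gene-level percentages are derived here and are not quoted in the text.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Counts reported in the text.
genes_analyzed = 20661
genes_with_one_or_more = 1327
genes_with_two_or_more = 148
missense_scored = 924
missense_predicted = 160

pct_predicted = percent(missense_predicted, missense_scored)        # 17.3, as reported
pct_genes_mutated = percent(genes_with_one_or_more, genes_analyzed)  # derived, not in text
pct_genes_recurrent = percent(genes_with_two_or_more, genes_analyzed)  # derived, not in text
```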
We also generated structural models of 404 of the missense mutations identified in this study [links to structural models available at (19)]. In each case, the model was based on x-ray crystallography or nuclear magnetic resonance spectroscopy of the normal protein or a closely related homolog. This analysis showed that 55 of the 404 mutations were located near a domain interface or ligand-binding site and were likely to affect function (examples in Fig. 1).
Fig. 1 Examples of structural models of mutations. (A). The x-ray crystal structure of the C2 domain of protein kinase C γ (PKCG) [Protein Data Bank identification number (PDBID) 2UZP]. R252 (41) is shown as yellow space-fills; Ca2+ ions are shown as (more ...)
The average number of somatic mutations in pancreatic cancers (48; Table 1) is considerably lower than that in breast cancers (101) or colorectal cancers (77) (P < 0.001), even though fewer genes were sequenced in the latter two tumor types (14). One plausible explanation for this lower rate is that the cells that initiate pancreatic tumorigenesis have gone through fewer divisions than colorectal or breast cancer cells. It has been previously shown that the majority of mutations observed in colorectal cancers are likely to have occurred in the normal stem cells that gave rise to the initiating neoplastic cell (12). Our data are thus consistent with observations showing that normal pancreatic epithelial cells divide infrequently (20).
Table 2 Core signaling pathways and processes genetically altered in most pancreatic cancers. A complete listing of the gene sets defining these signaling pathways and processes and the statistical significance of each gene set are provided in table S8.
In a prevalence screen consisting of 90 pancreatic cancers, we further evaluated 39 genes that were mutated in more than one of the 24 discovery screen cancers. In this screen, we detected 255 nonsilent somatic mutations among 23 genes (table S4). The nonsilent mutation rate of the genes in the prevalence screen (excluding KRAS, and SMAD4) was higher than that in the discovery screen (3.6 versus 1.47 nonsilent mutations per Mbase, P < 0.001). The fraction of nonsilent mutations observed in these 19 genes was also higher than that observed in the genes assessed in the discovery screen (P = 0.052). These data are consistent with the hypothesis that a greater fraction of the genes tested in the prevalence screen were positively selected during tumorigenesis.
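To make the rate comparison concrete, the sketch below computes mutations per Mbase and compares two rates with an exact conditional binomial test. The text does not state which statistical test produced the reported P values, so this test is an assumption chosen for illustration. The mutation counts and numbers of sequenced bases are hypothetical, picked only so that the resulting rates match the reported 3.6 and 1.47 per Mbase.

```python
import math


def rate_per_mb(n_mutations, bases_sequenced):
    """Mutations per megabase of sequence."""
    return n_mutations / (bases_sequenced / 1e6)


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1, summed term by term."""
    def log_pmf(i):
        return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                + i * math.log(p) + (n - i) * math.log(1 - p))
    return min(1.0, sum(math.exp(log_pmf(i)) for i in range(k, n + 1)))


def compare_rates(n1, bases1, n2, bases2):
    """One-sided P value that screen 1 has a higher mutation rate than screen 2.

    Under equal rates, each of the n1 + n2 mutations falls in screen 1 with
    probability bases1 / (bases1 + bases2).
    """
    p_null = bases1 / (bases1 + bases2)
    return binom_upper_tail(n1, n1 + n2, p_null)


# Hypothetical counts (NOT from the study), chosen to reproduce the reported rates.
prevalence = (36, 10_000_000)    # 36 mutations in 10 Mb  -> 3.6 per Mb
discovery = (147, 100_000_000)   # 147 mutations in 100 Mb -> 1.47 per Mb

prevalence_rate = rate_per_mb(*prevalence)
discovery_rate = rate_per_mb(*discovery)
p_value = compare_rates(*prevalence, *discovery)
```

A higher nonsilent rate in the prevalence screen is what would be expected if the genes selected for that screen are enriched for mutations that were positively selected during tumorigenesis.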