

HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Nat Genet. Author manuscript; available in PMC 2012 October 3.
Published in final edited form as:
PMCID: PMC3372921

Detectable clonal mosaicism and its relationship to aging and cancer

Kevin B Jacobs,1,2 Meredith Yeager,1,2 Weiyin Zhou,1,2 Sholom Wacholder,1 Zhaoming Wang,1,2 Benjamin Rodriguez-Santiago,3,5 Amy Hutchinson,1,2 Xiang Deng,1,2 Chenwei Liu,1,2 Marie-Josephe Horner,1 Michael Cullen,1,2 Caroline G Epstein,1 Laurie Burdett,1,2 Michael C Dean,6 Nilanjan Chatterjee,1 Joshua Sampson,1 Charles C Chung,1 Joseph Kovaks,1 Susan M Gapstur,7 Victoria L Stevens,7 Lauren T Teras,7 Mia M Gaudet,7 Demetrius Albanes,1 Stephanie J Weinstein,1 Jarmo Virtamo,8 Philip R Taylor,1 Neal D Freedman,1 Christian C Abnet,1 Alisa M Goldstein,1 Nan Hu,1 Kai Yu,1 Jian-Min Yuan,9,10 Linda Liao,1 Ti Ding,11 You-Lin Qiao,12 Yu-Tang Gao,13 Woon-Puay Koh,14 Yong-Bing Xiang,13 Ze-Zhong Tang,11 Jin-Hu Fan,12 Melinda C Aldrich,15,16 Christopher Amos,17 William J Blot,16,18 Cathryn H Bock,19,20 Elizabeth M Gillanders,21 Curtis C Harris,22 Christopher A Haiman,23 Brian E Henderson,23 Laurence N Kolonel,24 Loic Le Marchand,24 Lorna H McNeill,25,26 Benjamin A Rybicki,27 Ann G Schwartz,19,20 Lisa B Signorello,16,18,28 Margaret R Spitz,29 John K Wiencke,30 Margaret Wrensch,30 Xifeng Wu,17 Krista A Zanetti,21,22 Regina G Ziegler,1 Jonine D Figueroa,1 Montserrat Garcia-Closas,1,31 Nuria Malats,32 Gaelle Marenne,32 Ludmila Prokunina-Olsson,1 Dalsu Baris,1 Molly Schwenn,33 Alison Johnson,34 Maria Teresa Landi,1 Lynn Goldin,1 Dario Consonni,35,36 Pier Alberto Bertazzi,35,36 Melissa Rotunno,1 Preetha Rajaraman,1 Ulrika Andersson,37 Laura E Beane Freeman,1 Christine D Berg,38 Julie E Buring,39,40 Mary A Butler,41 Tania Carreon,41 Maria Feychting,42 Anders Ahlbom,42 J Michael Gaziano,40,43,44 Graham G Giles,45,46 Goran Hallmans,47 Susan E Hankinson,48 Patricia Hartge,1 Roger Henriksson,37,49 Peter D Inskip,1 Christoffer Johansen,50 Annelie Landgren,1 Roberta McKean-Cowdin,23 Dominique S Michaud,51,52 Beatrice S Melin,37 Ulrike Peters,53,54 Avima M Ruder,41 Howard D Sesso,40 Gianluca Severi,45,46 Xiao-Ou Shu,28 Kala Visvanathan,55 Emily White,53,54 Alicja Wolk,56 Anne 
Zeleniuch-Jacquotte,57 Wei Zheng,28 Debra T Silverman,1 Manolis Kogevinas,58,61 Juan R Gonzalez,59,61 Olaya Villa,3,5 Donghui Li,62 Eric J Duell,63 Harvey A Risch,64 Sara H Olson,65 Charles Kooperberg,53 Brian M Wolpin,48,66 Li Jiao,67 Manal Hassan,62 William Wheeler,68 Alan A Arslan,69,71 H Bas Bueno-de-Mesquita,72,73 Charles S Fuchs,48,66 Steven Gallinger,74 Myron D Gross,75 Elizabeth A Holly,76 Alison P Klein,77 Andrea LaCroix,53 Margaret T Mandelson,53,78 Gloria Petersen,79 Marie-Christine Boutron-Ruault,80 Paige M Bracci,76 Federico Canzian,81 Kenneth Chang,82 Michelle Cotterchio,83 Edward L Giovannucci,48,84,85 Michael Goggins,76,86,87 Judith A Hoffman Bolton,55 Mazda Jenab,88 Kay-Tee Khaw,89 Vittorio Krogh,90 Robert C Kurtz,91 Robert R McWilliams,92 Julie B Mendelsohn,1 Kari G Rabe,79 Elio Riboli,51 Anne Tjønneland,50 Geoffrey S Tobias,1 Dimitrios Trichopoulos,84,93 Joanne W Elena,38 Herbert Yu,24 Laufey Amundadottir,1 Rachael Z Stolzenberg-Solomon,1 Peter Kraft,48,84 Fredrick Schumacher,23 Daniel Stram,23 Sharon A Savage,1 Lisa Mirabello,1 Irene L Andrulis,74,94 Jay S Wunder,74,94 Ana Patiño García,95 Luis Sierrasesúmaga,95 Donald A Barkauskas,23 Richard G Gorlick,96,97 Mark Purdue,1 Wong-Ho Chow,1 Lee E Moore,1 Kendra L Schwartz,98 Faith G Davis,99 Ann W Hsing,1 Sonja I Berndt,1 Amanda Black,1 Nicolas Wentzensen,1 Louise A Brinton,1 Jolanta Lissowska,100 Beata Peplonska,101 Katherine A McGlynn,1 Michael B Cook,1 Barry I Graubard,1 Christian P Kratz,1,102 Mark H Greene,1 Ralph L Erickson,103 David J Hunter,48,84 Gilles Thomas,103 Robert N Hoover,1 Francisco X Real,3,105 Joseph F Fraumeni, Jr.,1 Neil E Caporaso,1 Margaret Tucker,1 Nathaniel Rothman,1 Luis A Pérez-Jurado,3,4, and Stephen J Chanock1,*


In an analysis of 31,717 cancer cases and 26,136 cancer-free controls drawn from 13 genome-wide association studies (GWAS), we observed large chromosomal abnormalities in a subset of clones from DNA obtained from blood or buccal samples. Mosaic chromosomal abnormalities, either aneuploidy or copy-neutral loss of heterozygosity, of size >2 Mb were observed in autosomes of 517 individuals (0.89%) with abnormal cell proportions between 7% and 95%. In cancer-free individuals, the frequency increased with age: 0.23% under 50 and 1.91% between 75 and 79 (p=4.8×10−8). Mosaic abnormalities were more frequent in individuals with solid tumors (0.97% versus 0.74% in cancer-free individuals, OR=1.25, p=0.016), with a stronger association for cases who had DNA collected prior to diagnosis or treatment (OR=1.45, p=0.0005). Detectable clonal mosaicism was more common in individuals for whom DNA was collected at least one year prior to diagnosis of leukemia than in cancer-free individuals (OR=35.4, p=3.8×10−11). These findings underscore the importance of the role and time-dependent nature of somatic events in the etiology of cancer and other late-onset diseases.

Classically, genetic mosaicism is defined as the co-existence of cells with two or more distinct karyotypes within an individual that results from a post-zygotic event during development and can occur in both somatic and germline cells1,2. Errors in chromosomal duplication and subsequent transmission to daughter cells may lead to aneuploidy, the gain or loss of chromosomes or segments of chromosomes, and reciprocal gain and loss events manifesting in copy-neutral loss of heterozygosity (cnloh) or acquired uniparental disomy. Somatic mosaicism has been established as a cause of miscarriage, birth defects, developmental delay, and cancer3-9. Because mosaicism can be benign or may manifest with diverse clinical phenotypes, there are no accurate estimates of its frequency in the general population3,6. On rare occasions the propensity to develop chromosomal abnormalities is inherited and leads to multiple phenotypic abnormalities including cancer predisposition as reported in families with mutations in BUB1B and CEP57 10,11. Recently, two groups have identified somatic mosaic mutations in IDH1 and IDH2 in tumors of individuals with Ollier disease and Maffucci syndrome12,13 while another group has characterized somatic mosaicism of a HRAS mutation in an individual with urothelial cancer and epidermal nevus14. Recent work in a population of twins has suggested that the detection of somatic structural variants in blood increases with aging and may be related to reduction in blood cell clonality15. In this report, we define mosaic chromosomal abnormalities broadly: the presence of both normal karyotypes and those with large structural genomic events resulting in alteration of copy number or loss of heterozygosity in distinct and detectable subpopulations of cells regardless of the clonal or developmental origin of the subpopulations.

Recently, we reported on 1,991 individuals from the Spanish Bladder Cancer/EPICURO population-based case-control study in which we had performed a GWAS of adult-onset bladder cancer using DNA obtained from blood or buccal samples16. The SNP array data generated for the GWAS was subsequently used to detect clonal mosaic abnormalities in the autosomes of 1.7% of study subjects, suggesting a higher frequency in adults than previously suspected. Even though somatic mosaicism has been implicated in several cancers, this study did not reveal a significant difference in frequency between cases and controls. A computational algorithm was used to detect 42 large mosaic events involving two or more distinct clones in DNA extracted from blood or buccal samples and we experimentally validated the findings using multiplex ligation-dependent probe amplification (MLPA) and microsatellite analysis (as well as fluorescent in situ hybridization in a subset), establishing the robustness of the software detection method. A similar proportion of cells carrying each event was found in 5 of 6 events (in four individuals with bladder cancer in whom three had one event and one individual with three separate events) in which it was possible to examine more than one tissue (whole blood and bladder mucosa), suggesting an early embryonic origin of the somatic mutation leading to the observed mosaic chromosomal abnormalities16.


In this report, we extend our analysis of clonal mosaic abnormalities in the autosomes to 57,853 individuals (including those previously published16). We tested 31,717 cancer cases and 26,136 cancer free controls for evidence of mosaic abnormalities using genome-wide SNP array data generated as part of 13 distinct cancer GWAS drawn from 48 epidemiological case-control and case-cohort studies (Supplementary Table 1). DNA samples were extracted from blood or buccal samples using a variety of collection and extraction techniques and genotyped using one or more Infinium Human SNP arrays from Illumina Inc. (including versions of Hap300, Hap240, Hap550, Hap610, Hap660, Hap1, Omni Express, and Omni1). Genotype clusters were empirically estimated in 45 batches to optimize accuracy while minimizing potential batch effects (Online Methods).

Detection of clonal mosaic events was based on assessment of allelic imbalance and copy number changes. We used the B-allele frequency (BAF) measurement, derived from the ratio of probe values relative to the locations of the estimated genotype-specific clusters, for initial segmentation using the Mosaic Alteration Detection (MAD) algorithm implemented in GADA-R with modifications17,18. The BAF and log2 relative probe intensity ratio (LRR), which provides data on copy number, were used to classify each event as copy-altering (gain or loss) or neutral (reciprocal gain and loss resulting in loss of heterozygosity, LOH) and to assign the proportions of abnormal (p) and normal (1-p) cells. Mosaic proportions were required to deviate from levels expected from constitutional (non-mosaic) changes in order to exclude homozygous chromosomal segments inherited identical by descent and non-mosaic instances of trisomy, monosomy and uniparental disomy. A minimum event size threshold was set to detect only clonal mosaic events greater than 2 Mb to minimize the false discovery of constitutional copy number variants. Copy-neutral LOH and copy-loss events could be detected for mosaic proportions between 7% and 95% (Figure 1) with sensitivity that was affected by the signal-to-noise ratio characteristic of each microarray assay and sample quality. There was reduced sensitivity to distinguish between copy-neutral LOH and copy-loss events for mosaic proportions less than 15% across the autosomes. The magnitude of BAF differences for single-copy gain events was 1/3 of the magnitude of copy-neutral LOH or copy-loss events, reducing the sensitivity for calling copy-gain events. As a result, single copy gain events could only be reliably detected for mosaic proportions between 22% and 88%, with ambiguity in distinguishing copy-gain from copy-neutral LOH for mosaic proportions of less than 20%.
Since DNA was obtained for the purpose of performing a GWAS, it was not possible to further explore the developmental and clonal characteristics of mosaic events detected in these individuals (e.g. by studying DNA from fractionated blood and other tissue types, determining cell composition of buccal samples, or effect of DNA collection and extraction methods on detection and accuracy of the estimation of mosaic proportions). We report only autosomal chromosomal abnormalities, as the analysis of the sex chromosomes presents distinct technical and interpretative challenges.
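
The way BAF and LRR jointly separate the three event types can be illustrated with a simple cell-mixture model. The sketch below is not the MAD/GADA-R implementation; it assumes an idealized, noise-free array, a heterozygous (AB) SNP, a single abnormal clone at proportion p in which the B allele is the one retained or gained, and an uncompressed LRR. The function name `expected_signals` and the event labels are hypothetical.

```python
import math

def expected_signals(event: str, p: float) -> tuple[float, float]:
    """Idealized BAF deviation and LRR for a mosaic event (assumed model).

    p is the proportion of abnormal cells; the remaining 1 - p are normal
    diploid cells. Returns (|BAF - 0.5| at heterozygous SNPs, LRR).
    """
    if event == "cnloh":      # abnormal cells carry BB; total copies stay at 2
        b_copies, total = 1 + p, 2.0
    elif event == "loss":     # abnormal cells lose the A allele
        b_copies, total = 1.0, 2 - p
    elif event == "gain":     # abnormal cells carry one extra B allele
        b_copies, total = 1 + p, 2 + p
    else:
        raise ValueError(f"unknown event type: {event}")
    baf_deviation = abs(b_copies / total - 0.5)
    lrr = math.log2(total / 2)   # log2 of average copy number relative to diploid
    return baf_deviation, lrr

for event in ("cnloh", "loss", "gain"):
    for p in (0.07, 0.22, 1.0):
        dev, lrr = expected_signals(event, p)
        print(f"{event:5s} p={p:.2f}  BAF deviation={dev:.3f}  LRR={lrr:+.3f}")
```

Under these assumptions the BAF deviation of a single-copy gain at p = 1 is one third of that of copy-neutral LOH or copy-loss, and the LRR is zero for copy-neutral LOH, negative for loss and positive for gain. Real array data are noisier and the LRR response is typically compressed, so the detection limits in practice depend on assay and sample quality, as described above.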

Figure 1
Characteristics of detectable clonal mosaic events

We observed 681 mosaic segments of size greater than 2 Mb on 641 autosomal chromosomes in 517 individuals for an overall frequency of individuals with mosaicism of 0.89% (Tables 1 and 2). The most frequent type of event observed was copy-neutral LOH (48.2%), while copy-gains and copy-losses were observed for 15.1% and 34.8% of mosaic events, respectively (Table 1). A small proportion (1.9%) of mosaic chromosomes were complex, harboring more than one type of event. Of the mosaic chromosomal events, 18.7% spanned the entire chromosome, including 62 complete trisomies, predominantly in chromosomes 8, 12 and 15, and 47.9% began at a telomere and extended across some portion of the chromosomal arm (Table 1 and Figure 2). The majority of telomeric events were mosaic copy-neutral LOH (85.7%), most frequently on 9p (Table 3). The remaining mosaic chromosomal events were interstitial (31.5%), spanning neither telomere nor centromere, while an additional small proportion (1.8%) spanned the centromere or had more complex structure (e.g. distinct events involving both telomeres, but not the whole chromosome). The majority of interstitial events were mosaic copy-loss (91.6%), which was most frequently observed within specific regions of chromosomes 13q and 20q (Figure 2). We observed 69 individuals (46 cancer cases and 23 cancer-free individuals) with clonal mosaic events on multiple chromosomes. The distribution of the number of clonal mosaic chromosomal events per individual is shown in Supplementary Table 3. Among cancer-free individuals, the greatest number observed was 5 mosaic chromosomal events, whereas six individuals with cancer had greater than 5 events, including two individuals with gastric cancer who each had 20. A list of mosaic events with phenotype data is available as Supplementary Data.

Figure 2
Circular genomic plot of detectable clonal mosaic events
Table 1
Count and frequency of mosaic chromosomal events by event type and location
Table 2
Count and frequency of individuals with detectable clonal mosaic events for cancer-free individuals and by first diagnosed cancer site
Table 3
Distribution and frequency of recurrent detectable clonal mosaic events

The strongest predictor of mosaic autosomal abnormalities was age at DNA collection. We examined the effect of aging on the frequency of mosaicism across all studies, which predominantly comprised individuals over the age of 50. The frequency of cancer-free individuals with detectable clonal mosaic events increased with age, from 0.23% for those under 50 to 1.91% (p=4.8×10−8) for those between the ages of 75 and 79, with slightly higher frequencies for individuals with cancer (Figure 3). In the early onset cancers (under age 40), which constituted less than 5% of analyzed cases (e.g., testicular cancer and osteogenic sarcoma), we did not observe an increase in mosaic abnormalities. Further studies are needed to investigate the relationship between mosaic abnormalities and cancer in children and young adults, particularly because of the strong association between mosaicism and many developmental disorders. There was no apparent relationship between age at DNA collection and the number or size of mosaic events or the proportion of abnormal cells (Supplementary Figures 1 and 2).

Figure 3
Frequency of detectable clonal mosaic events by age and cancer status

We regressed the presence of detectable clonal mosaicism in 26,136 cancer-free individuals on age at DNA collection (in 5 year intervals), sex (male versus female), DNA source (buccal cells versus blood), smoking (ever versus never) and admixture coefficients for African and East Asian ancestry in a logistic model to determine the additional factors that influenced frequency of detectable clonal mosaicism. The source of DNA was known for 87% of individuals; for 19% of these, DNA was derived from buccal cells and for the remainder from blood. DNA source was not significantly associated with mosaicism (OR=0.83, 0.55-1.26 95% confidence interval (CI), p=0.39). By admixture analysis, 75% of subjects were determined to be of European ancestry, 9% of African ancestry and 16% of East Asian ancestry. Although power was limited, we observed that cancer-free individuals with African admixture were at a lower risk of being mosaic (OR=0.43, 0.20-0.92 95% CI, p=0.03), whereas the association for those with East Asian admixture was not significant (OR=0.60, 0.32-1.15 95% CI, p=0.12). We did not observe an association between smoking (ever/never) and frequency of mosaic abnormalities (OR=1.04, 0.75-1.44 95% CI, p=0.81).

In 26,136 cancer-free controls and 23,093 cancer cases drawn from non-sex specific and non-hematological cancer sites (i.e. excluding 8,470 individuals with leukemia, lymphoma, multiple myeloma and cancers of the breast, endometrium, ovary, testis, and prostate), we observed a higher frequency of males with mosaic abnormalities than females. In cancer-free individuals, we observed mosaic events in 0.56% of females and 0.87% of males (OR=1.35, 0.98-1.88 95% CI, p=0.07); for individuals with cancer we observed mosaic events in 0.79% of females and 1.21% of males (OR=1.48, 1.08-2.03 95% CI, p=0.015); and overall, 0.65% of females and 1.04% of males (OR=1.42, 1.14-1.80 95% CI, p=0.002) in logistic models adjusted for cancer diagnosis (if applicable), age at DNA collection, ancestry, DNA source and smoking. These differences could be due to a true sex-specific effect akin to sex-differential mutation and recombination rates19; however, the complex and heterogeneous nature of the inclusion of individual studies and the differences in their entry and selection criteria could result in spurious associations. Although this observation was consistent across cancer types, it should be confirmed in additional studies better designed to address this question.

To determine the relationship between detectable mosaic autosomal abnormalities and non-hematological cancers, we regressed the presence of detectable clonal mosaicism on cancer diagnosis, age, sex, DNA source, smoking and ancestry in a logistic model. We observed a modest increase in cancer risk for mosaic individuals (OR=1.27, 1.05-1.52 95% CI, p=0.012) (Table 2 and Supplementary Table 2). Notable associations were observed in stratified analyses of lung (OR=1.56, 1.18-2.08 95% CI, p=0.002) and kidney (OR=1.98, 1.27-3.06 95% CI, p=0.002) cancers, both tobacco-associated malignancies. However, no cancer site-specific associations were observed for bladder, esophagus, stomach and pancreas cancers, which are also typically associated with tobacco use. There was no significant association in non-hematological cancer cases overall between smoking (ever/never) and frequency of mosaicism (OR=1.19, 0.92-1.54 95% CI, p=0.19) or when stratified by cancer site (results not shown).
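
The odds ratios reported here come from covariate-adjusted logistic models. As a simpler illustration of how an odds ratio and its 95% confidence interval are read, the sketch below computes an unadjusted odds ratio with a Wald (log-scale) interval from a 2×2 table. This is a swapped-in, simpler technique, not the adjusted analysis used in the study, and the counts are hypothetical and chosen only for illustration.

```python
import math

def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: mosaic cases        b: non-mosaic cases
    c: mosaic controls     d: non-mosaic controls
    z: normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the study
or_, lo, hi = odds_ratio_wald(a=30, b=2970, c=20, d=2980)
print(f"OR={or_:.2f}, {lo:.2f}-{hi:.2f} 95% CI")
```

An interval that excludes 1 corresponds to a nominally significant association at the 5% level. An adjusted estimate from a logistic model can differ from this crude value when covariates such as age or sex are unevenly distributed between cases and controls.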

In an analysis of the subset of 14,050 individuals with cancer for whom it was possible to determine that DNA was likely obtained before or at the time of diagnosis and prior to treatment with radiation or chemotherapy for a primary tumor (designated as “likely untreated”), we observed a stronger association between mosaic abnormalities and non-hematological cancer diagnosis (OR=1.45, 1.18-1.80 95% CI, p=0.0005). The associations for lung and kidney also increased in significance (Table 3). It is notable that the evidence for association with non-hematological cancer diminished in individuals who were potentially treated (OR=1.03, 0.81-1.30 95% CI, p=0.80). We had approached this analysis with the hypothesis that there could be an increased frequency in detectable clonal mosaicism in non-hematological cancers induced by chemotherapy or radiotherapy but were surprised to observe the frequency was reduced to virtually the same as in the cancer-free population. Although this attenuated effect could have many explanations (e.g., related to the diagnosis and treatment of a solid tumor leading to a decrease in populations of cells with mosaic alteration), we had a limited capacity to model and control for treatment-effects since many of the studies did not provide any treatment information or only provided incomplete, retrospective ascertainment of the specifics. Although many of the participating studies were prospectively ascertained cohorts, DNA collection often occurred after cancer diagnosis. Additional studies are needed in prospectively ascertained cohorts and longitudinal studies in which multiple DNA samples were collected prior to and after diagnosis in order to explore treatment and disease effects.

For the 43 individuals with hematological cancers for whom DNA was obtained at least a year prior to diagnosis, the frequency of detectable clonal mosaicism was 20% for myeloid leukemia and 22% for lymphocytic leukemia (predominantly chronic lymphocytic leukemia, Table 2) compared to 0.74% in 26,136 cancer-free controls (overall OR=35.4, 14.7-76.6 95% CI, Fisher exact p=3.8×10−11). Of the 8 mosaic individuals with leukemia for whom DNA samples were collected at least a year prior to diagnosis, 4 were diagnosed with chronic lymphocytic leukemia (CLL), of whom 2 had a mosaic deletion in a region of chromosome 13q14 previously described to be deleted in CLL20. DNA was obtained more than 5 years prior to diagnosis for 6 mosaic individuals, with the longest interval being 14 years, suggesting that detectable clonal mosaicism could be a marker of hematological cancer or its precursors, i.e., monoclonal B cell lymphocytosis (MBL) for CLL and myelodysplastic syndrome for acute myelogenous leukemia. Recent work shows that the majority of MBL have mono- or biallelic 13q14 abnormalities21. However, further studies will be needed, preferably with serial pre- and post-diagnosis sampling, to investigate the predictive nature of detectable clonal mosaicism, especially involving regions of chromosomes 13 and 20, with respect to leukemia risk20.

We further explored the 4 most recurrent altered regions (>20), which also harbor well-known cancer genes (as noted in the COSMIC22 and Mitelman databases); these were on chromosomes 9p (cnloh), 13q (del), 14 (cnloh) and 20q (del) (Table 3). Notably, the most recurrent mosaic events were observed in cancer-free individuals as well as across multiple solid tumors. We observed a comparable frequency in non-hematological cancer cases and cancer-free controls for three of the regions, whereas the chromosome 14 cnloh abnormalities were more frequent in non-hematological cancer cases (OR=3.32, 1.42-9.00 95% CI, Fisher’s exact p=0.003), particularly in individuals with bladder or kidney cancer. Copy-neutral LOH in this region of chromosome 14 has been associated with increased susceptibility to sporadic cancers, and the region harbors imprinted genes, such as the tumor-suppressing non-coding RNA, Maternally expressed gene 3 (MEG3)8,23. The recurrent segmental deletion of 13q14 was observed in 5 leukemia cases, but also in 18 individuals with solid tumors (9 with lung cancer and 4 with prostate cancer), and in 10 cancer-free individuals. This region includes the tumor suppressor gene DLEU7 (Deleted In Leukemia 7) and related genes, DLEU1 and DLEU2, the latter harboring two microRNAs within one of its introns (miR-15a and miR-16-1)24-26. The retinoblastoma gene, RB1, was also included within a subset harboring a mosaic deletion of 13q14. It cannot be ruled out that these individuals have either undiagnosed CLL or MBL. The 20q- was seen in two individuals with myeloid leukemia, as has been described previously27, but also in cancer-free individuals and individuals with solid tumors.

The accuracy of our software methods to detect clonal mosaic abnormalities was previously addressed and we were able to validate 100% of 42 events in 34 individuals from the Spanish Bladder Cancer Study using confirmatory cytogenetic assays16 (Supplementary Figure 3). We have also performed a comparison of mosaic events in samples from the EAGLE and PLCO lung cancer studies which were independently analyzed as part of the Gene, Environment Association studies consortium (GENEVA) report on mosaic events28. A total of 83 mosaic events in individuals from the EAGLE and PLCO lung cancer studies were detected in common, 20 additional events of size less than 2 Mb and 8 events greater than 2 Mb were detected by GENEVA and not by our study, while we detected 20 additional events (size > 2 Mb) that were not detected by GENEVA. Although additional cytogenetic or molecular validation was not performed, neither method detected notable false-positive events based on manual review of the data. The concordance rate is 75% if considering events > 2 Mb (the cut-off for this analysis) or 63% if considering all events, both of which are considerably better than the 25-50% concordance rates observed across CNV detection methods29-31. Our method is more conservative in the size of events detected, while the GENEVA method is more conservative with respect to sample quality, but provides calls for smaller events when assay quality is sufficient. Better approaches are needed to characterize smaller size events accurately as either mosaic or constitutional and to estimate their frequency. Further improvements to data normalization, segmentation and event classification methods will also likely reduce false-negative rates.
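
The concordance figures above can be reproduced from the event counts in this paragraph. The sketch below assumes that concordance is defined as the number of events detected by both methods divided by the number detected by either method; this definition is an assumption, chosen because it matches the reported 75% and 63%. The function and variable names are hypothetical.

```python
def concordance(shared: int, only_first: int, only_second: int) -> float:
    """Events found by both methods divided by events found by either (assumed definition)."""
    return shared / (shared + only_first + only_second)

shared_events = 83        # detected by both analyses
ours_only_large = 20      # > 2 Mb, detected here but not by GENEVA
geneva_only_large = 8     # > 2 Mb, detected by GENEVA only
geneva_only_small = 20    # < 2 Mb, detected by GENEVA only

# Restricting to events > 2 Mb (the size cut-off for this analysis)
large_rate = concordance(shared_events, ours_only_large, geneva_only_large)

# Considering all events, including those smaller than 2 Mb
all_rate = concordance(
    shared_events, ours_only_large, geneva_only_large + geneva_only_small
)

print(f"events > 2 Mb: {large_rate:.1%}")
print(f"all events:    {all_rate:.1%}")
```

With these counts the two rates are 83/111 and 83/131, which round to the 75% and 63% quoted in the text.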


Our study has important implications for the design and analysis of molecular epidemiology studies in cancer as well as the somatic characterization of cancer genomes, like The Cancer Genome Atlas32 and International Cancer Genome Consortium33. Investigators will need to carefully analyze samples used as exemplars of germline DNA for somatic alterations, such as detectable clonal mosaicism. Otherwise, comparisons between “germline” and tumor DNA may result in implausible somatic changes (e.g. large gains of heterozygosity) and it may be impossible to determine whether somatic events pre-date changes secondary to driver mutations. Since the detection of mosaic events with next-generation sequencing technologies is neither routine nor well understood, for the near future it may be prudent to continue to utilize SNP microarrays for such analyses. Due to the increased frequency of detectable clonal mosaicism with age, this will be particularly important for the analysis of epithelial cancers, which characteristically occur in the older population. For future large-scale GWAS in prospective studies, it may be wise to consider analyzing the earliest, pre-diagnosis DNA samples and to consider time from collection to diagnosis in the analysis of longitudinally collected biospecimens.

We have extended our initial observation that detectable clonal mosaicism of the autosomes is present in the population with surprising frequency and particularly in the aging genome. A recent study of detectable clonal mosaicism in twins reported an increase in frequency with age and suggested that this increase could lead to a less diverse blood cell population and immune system15. These emerging data raise a number of critical issues regarding the mechanisms underlying the possible shift in the repertoire of clones with large structural abnormalities. Thus, cells with abnormal karyotypes could have an early developmental origin in which a somatic event in a single stem cell progenitor during embryogenesis could become apparent when cellular diversity decreases with age and cell populations become increasingly oligoclonal. Higher rates of detectable clonal mosaicism in older cancer-free individuals could also be due to increased rates of somatic mutation or diminished capacity for genomic maintenance, such as with telomere attrition34, leading to proliferation of somatically altered cell populations. A survival bottleneck of cellular progenitors could also lead to observable mosaic alterations that were previously below the threshold of detection but subsequently expanded due to positive selection. Further work is required to begin to unravel the underlying mechanisms that result in mosaic abnormalities, particularly as they relate to how and when altered clones are created, tissue-specificity, and the timing and expansion of distinct populations of cells with age. Finally, these findings underscore the importance of considering the role and time-dependent nature of somatic events in the etiology of cancer as well as other late-onset diseases.

This research was supported by the Intramural Research Program and by contract number HHSN261200800001E of the USA National Institutes of Health, National Cancer Institute. Support for each contributing study is listed in the Supplementary Acknowledgement Section. We thank Cathy Laurie and Bruce Weir for constructive discussion and a comparison of methodology and results for the GENEVA study. The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the National Cancer Institute, the National Institute for Occupational Safety and Health, or the Maryland Cancer Registry.


Author Contributions

K.B.J., M.Y., WY.Z., Z.W., X.D., C.L., S.W., N.E.C., M.T., N.R., and S.J.C. designed the study.

K.B.J., M.Y., L.P.-J., WY.Z., Z.W., S.W., N.E.C., N.R., M.G.C., M.C.D., D.A., B.I.G., R.N.H., F.X.R., and S.J.C. interpreted the primary results.

K.B.J., M.Y., L.P.-J., B.R.-S., and J.R.G. developed the study methods.

K.B.J., M.Y., L.P.-J., WY.Z., Z.W., X.D., C.L., M.G.C., C.G.E., M.C.D., N.C., J.S., and C.C.C. analyzed the data.

K.B.J., M.Y., WY.Z., Z.W., X.D., C.L., A.H., L.B., and J.K. were responsible for production and analysis of the genotype data.

K.B.J., M.Y., and S.W. performed statistical analysis.

K.B.J., M.Y., S.W., M.-J.H. and S.J.C. drafted the manuscript.

M.T., R.N.H., S.J.C. and J.F.F. provided vital programmatic and institutional support.

J.R.G., N.E.C., M.T., N.R., S.J.C., S.M.G.,V.L.S., L.T.T, M.M.G., D.A., S.J.W. J.V., P.R.T., N.D.F., C.C.A., A.M.G., N.H.,K.Y., J-M.Y., L.L., T.D., Y-L.Q., Y-T.G.,W-P.K., Y-B.X., Z-Z.T., J-H.F., M.C.A., C.A., W.J.B., C.H.B., E.M.G., C.C.H., C.A.H., B.E.H., L.N.K., L.L.M., L.H.M., B.A.R., A.G.S., L.B.S., M.R.S., J.K.W., M.W., X.W., K.A.Z., R.G.Z., J.D.F., M.G-C., N.M., G.M., L.P-O., D.B., M.S., A.J., M.T.L., L.G., D.C., P.A.B., M.R., P.R., U.A., L.E.B.F.,C.D.B., J.E.B., M.A.B., T.C., M.F., A.A., J.M.G., G.G.G., G.H., S.E.H., P.H., R.H., P.D.I., C.J., A.L., R.M-C., D.S.M., B.S.M., U.P., A.M.R., H.D.S., G.S., X-O.S., K.V., E.W., A.W., A.Z-J., W.Z., D.T.S., M.K., O.V., D.L., E.J.D., H.A.R., S.H.O., C.K., B.M.W., L.J., M.H., W.W., A.A.A., H.B.B-d-M., C.S.F., S.G., M.D.G., E.A.H., A.P.K., A.LC., M.T.M., G.P., M-C.B-R., P.M.B., F.C., K.C., M.C., E.L.G., M.G., J.A.H.B., M.J., K-T.K.,V.K.,R.C.K., R.R.M., J.B.M., K.G.R., E.R., A.T., G.S.T., D.T., J.W.E., H.Y., L.A., R.Z.S-S., P.K., F.S., D.S., S.A.S., L.M., I.L.A., J.S.W., A.P.G., L.S., D.A.B., R.G.G., M.P., WH.C., L.E.M., K.L.S., F.G.D., A.W.H., S.I.B., A.B., N.W., L.A.B., J.L., B.P., K.A.M., M.B.C., B.I.G., C.P.K., M.H.G., R.L.E., D.J.H., G.T., R.N.H., F.X.R., and J.F.F. contributed data or samples.

All authors contributed critical feedback, review, and approval of the manuscript.

Disclosures: BRS and OV are currently employees of the qGenomics company while LAPJ is a member of its scientific advisory board.


1. Youssoufian H, Pyeritz RE. Mechanisms and consequences of somatic mosaicism in humans. Nat Rev Genet. 2002;3:748–58.
2. Notini AJ, Craig JM, White SJ. Copy number variation and mosaicism. Cytogenet Genome Res. 2008;123:270–7.
3. Hsu LY, et al. Proposed guidelines for diagnosis of chromosome mosaicism in amniocytes based on data derived from chromosome mosaicism and pseudomosaicism studies. Prenat Diagn. 1992;12:555–73.
4. Menten B, et al. Emerging patterns of cryptic chromosomal imbalance in patients with idiopathic mental retardation and multiple congenital anomalies: a new series of 140 patients and review of published reports. J Med Genet. 2006;43:625–33.
5. Lu XY, et al. Genomic imbalances in neonates with birth defects: high detection rates by using chromosomal microarray analysis. Pediatrics. 2008;122:1310–8.
6. Conlin LK, et al. Mechanisms of mosaicism, chimerism and uniparental disomy identified by single nucleotide polymorphism array analysis. Hum Mol Genet. 2010;19:1263–75.
7. Heim S, Mitelman F. Nonrandom chromosome abnormalities in cancer - an overview. In: Mitelman F, Heim S, editors. Cancer Cytogenetics. John Wiley & Sons, Inc.; Hoboken, NJ: 2009.
8. Tuna M, Knuutila S, Mills GB. Uniparental disomy in cancer. Trends Mol Med. 2009;15:120–8.
9. Solomon DA, et al. Mutational inactivation of STAG2 causes aneuploidy in human cancer. Science. 2011;333:1039–43.
10. Rio Frio T, et al. Homozygous BUB1B mutation and susceptibility to gastrointestinal neoplasia. N Engl J Med. 2010;363:2628–37.
11. Snape K, et al. Mutations in CEP57 cause mosaic variegated aneuploidy syndrome. Nat Genet. 2011;43:527–9.
12. Amary MF, et al. Ollier disease and Maffucci syndrome are caused by somatic mosaic mutations of IDH1 and IDH2. Nat Genet. 2011
13. Pansuriya TC, et al. Somatic mosaic IDH1 and IDH2 mutations are associated with enchondroma and spindle cell hemangioma in Ollier disease and Maffucci syndrome. Nat Genet. 2011
14. Hafner C, Toll A, Real FX. HRAS mutation mosaicism causing urothelial cancer and epidermal nevus. N Engl J Med. 2011;365:1940–2.
15. Forsberg LA, et al. Age-related somatic structural changes in the nuclear genome of human blood cells. Am J Hum Genet. 2012;90:217–28.
16. Rodriguez-Santiago B, et al. Mosaic uniparental disomies and aneuploidies as large structural variants of the human genome. Am J Hum Genet. 2010;87:129–38.
17. Gonzalez JR, et al. A fast and accurate method to detect allelic genomic imbalances underlying mosaic rearrangements using SNP array data. BMC Bioinformatics. 2011;12:166.
18. Pique-Regi R, Caceres A, Gonzalez JR. R-Gada: a fast and flexible pipeline for copy number analysis in association studies. BMC Bioinformatics. 2010;11:380.
19. Hedrick PW. Sex: differences in mutation, recombination, selection, gene flow, and genetic drift. Evolution. 2007;61:2750–71. [PubMed]
20. Dohner H, et al. Genomic aberrations and survival in chronic lymphocytic leukemia. N Engl J Med. 2000;343:1910–6. [PubMed]
21. Lanasa MC, et al. Immunophenotypic and gene expression analysis of monoclonal B-cell lymphocytosis shows biologic characteristics associated with good prognosis CLL. Leukemia. 2011;25:1459–66. [PMC free article] [PubMed]
22. Forbes SA, et al. COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res. 2011;39:D945–50. [PMC free article] [PubMed]
23. Benetatos L, Vartholomatos G, Hatzimichael E. MEG3 imprinted gene contribution in tumorigenesis. Int J Cancer. 2011;129:773–9. [PubMed]
24. Lee S, et al. Forerunner genes contiguous to RB1 contribute to the development of in situ neoplasia. Proc Natl Acad Sci U S A. 2007;104:13732–7. [PubMed]
25. Migliazza A, et al. Nucleotide sequence, transcription map, and mutation analysis of the 13q14 chromosomal region deleted in B-cell chronic lymphocytic leukemia. Blood. 2001;97:2098–104. [PubMed]
26. Pekarsky Y, Zanesi N, Croce CM. Molecular basis of CLL. Semin Cancer Biol. 2010;20:370–6. [PMC free article] [PubMed]
27. Gurvich N, et al. L3MBTL1 polycomb protein, a candidate tumor suppressor in del(20q12) myeloid disorders, is essential for genome stability. Proc Natl Acad Sci U S A. 2010;107:22552–7. [PubMed]
28. Laurie CC, Laurie CA, Kenneth R, Doheny KF. Somatic mosaicism for large chromosomal anomalies from birth to old age and its relationship to cancer. submitted to Nature Genetics. 2011 [PMC free article] [PubMed]
29. Pinto D, et al. Comprehensive assessment of array-based platforms and calling algorithms for detection of copy number variants. Nat Biotechnol. 2011;29:512–20. [PMC free article] [PubMed]
30. Dellinger AE, et al. Comparative analyses of seven algorithms for copy number variant identification from single nucleotide polymorphism arrays. Nucleic Acids Res. 2010;38:e105. [PMC free article] [PubMed]
31. Marenne G, et al. Assessment of copy number variation using the Illumina Infinium 1M SNP-array: a comparison of methodological approaches in the Spanish Bladder Cancer/EPICURO study. Hum Mutat. 2011;32:240–8. [PMC free article] [PubMed]
32. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008;455:1061–8. [PMC free article] [PubMed]
33. Hudson TJ, et al. International network of cancer genome projects. Nature. 2010;464:993–8. [PMC free article] [PubMed]
34. Sahin E, Depinho RA. Linking functional decline of telomeres, mitochondria and stem cells during ageing. Nature. 2010;464:520–8. [PMC free article] [PubMed]
35. Petersen GM, et al. A genome-wide association study identifies pancreatic cancer susceptibility loci on chromosomes 13q22.1, 1q32.1 and 5p15.33. Nat Genet. 2010;42:224–228. [PMC free article] [PubMed]
36. Staaf J, et al. Normalization of Illumina Infinium whole-genome SNP data improves copy number estimates and allelic intensity ratios. BMC Bioinformatics. 2008;9:409. [PMC free article] [PubMed]
37. Diskin SJ, et al. Adjustment of genomic waves in signal intensities from whole-genome SNP genotyping platforms. Nucleic acids research. 2008;36:e126. [PMC free article] [PubMed]
38. Peiffer DA, et al. High-resolution genomic profiling of chromosomal aberrations using Infinium whole-genome genotyping. Genome Res. 2006;16:1136–1148. [PubMed]
39. Itsara A, et al. Population analysis of large copy number variants and hotspots, of human genetic disease. Am J Hum Genet. 2009;84:148–161. [PubMed]
40. Akaike H. A new look at the statistical model identification. IEEE Transactions on Automatic Control. 1974;19:716–723.
41. Krzywinski MI, et al. Circos: An information aesthetic for comparative genomics. Genome Research. 2009 [PubMed]
42. Agresti A, Coull BA. Approximate is better than "exact" for interval estimation of binomial proportions. Am Stat. 1998;52:119–126.
43. Frazer KA, et al. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007;449:851–861. [PMC free article] [PubMed]