There is widespread agreement that cancer gene discovery requires high-quality tumor samples. However, whether primary tumors or cultured samples are superior for cancer genomics has been a longstanding subject of debate. This debate has recently become more important because federally funded cancer genomics has been centralized under The Cancer Genome Atlas, which has chosen to focus exclusively on primary tumors. Here, we provide a data-driven “perspective” on the effect of sample type selection on cancer genomics research. We show that, in the case of glioblastoma multiforme, primary tumors and xenografts are best for the identification of amplifications, whereas xenografts and cell lines are superior for the identification of homozygous deletions. We also note that many of the most important oncogenes and tumor suppressor genes have been discovered through the use of cell lines and xenografts, and highlight the lack of published evidence supporting the dogma that ex vivo culture generates artifactual genetic lesions. Based on this analysis, we suggest that cancer genomics projects such as The Cancer Genome Atlas should include a variety of sample types such as xenografts and cell lines in their integrated genomic analysis of cancer.
After several decades in which cancer genomics research was performed in individual laboratories and funded by single-investigator grants, the field has recently been centralized and expanded under the auspices of The Cancer Genome Atlas (TCGA), which is performing integrated genomic analysis on a large number of samples from a wide range of common human tumor types. TCGA was initiated in December 2005, recently completed a 3-year pilot project [focused on glioblastoma multiforme (GBM), ovarian cancer, and lung cancer], and is currently organizing itself to begin the production phase of genomic analysis on a wider range of tumor types.
The procurement of high-quality cancer samples is the critical first step for cancer genomics projects such as TCGA. There are four principal types of human cancer samples available for such studies: primary tumors, primary cultures, primary xenografts, and established cell lines. The availability of each sample type is somewhat tumor type–specific (e.g., breast cancers do not efficiently form xenografts). Each of these sample types has unique advantages and disadvantages that are thought to affect the success of genomic analyses (see Supplementary Table S1).
Unlike other ongoing cancer genomics projects (1–3), TCGA has chosen to focus exclusively on the collection and analysis of primary tumor samples. This decision was based on considerations such as the fact that primary tumors can most easily be collected in large numbers in a prospective fashion, and the concern that ex vivo culture could induce artifactual genetic lesions. However, this decision was not based strictly on scientific data, as few (if any) published studies have directly evaluated the advantages and disadvantages of various sample types for genetic analysis.
We initially became interested in sample type selection for cancer genomics because, as TCGA was performing copy number analyses on GBM primary tumor samples (4), we were performing similar analyses on a panel of all four GBM sample types (5, 6). The results of these studies, described in detail for the first time below, suggested that whereas primary tumors are an ideal sample type for the identification of genomic amplifications, they are inferior to xenografts and cell lines for the identification of genomic deletions. As such, this “perspective” will describe the effects of sample type on copy number analysis in GBM, examine the evidence supporting the widely accepted idea that cultured sample types contain artifactual genetic lesions, and review the role of different sample types in the history of cancer gene discovery.
In an effort to experimentally address issues in sample type selection for cancer genomics projects, copy number analysis was performed on 58 GBM samples derived from all four GBM sample types: primary tumors, primary cultures, primary xenografts, and established cell lines (see footnote 5). Copy number data from an additional panel of 50 cell lines were also analyzed (see footnote 6).
Initially, we identified amplifications and deletions of the major GBM oncogenes and tumor suppressor genes (Table 1A; Supplementary Table S2). There was a substantial discrepancy in the frequency of oncogene amplification between sample types. For example, amplification of EGFR was commonly found in primary tumors and xenografts, but rarely found in primary cultures and cell lines. This phenomenon of loss of amplifications in GBM cell lines has been previously described but was thought to be specific to EGFR (7, 8). However, our data indicate that amplification of other GBM oncogenes such as PDGFRA, CDK4, and MDM4 is similarly lost during in vitro culture, and suggest that primary tumors and xenografts are the best sample type for the identification of novel amplicons containing candidate oncogenes.
Of note, this loss of oncogene amplification during tissue culture seems to be tumor type–specific, as there are examples of tumor types in which oncogenes are amplified at a similar frequency in both cultured and uncultured samples. For example, MYC or MYCN is amplified in 28 of 37 neuroblastoma cell lines (76%; see footnote 7), a frequency comparable to that observed in neuroblastoma primary tumor samples (9).
There was also a discrepancy in the frequency of identifiable tumor suppressor gene deletions between sample types. For example, deletions of the CDKN2A/B locus were identifiable in a much higher fraction of xenografts and cell lines than in primary tumors and primary cultures (Table 1A; Supplementary Table S2). Importantly, this disparity was not limited to CDK inhibitors, but was also present for PTEN, NF1, and PTPRD. In the case of PTPRD, deletions in primary tumors were very rarely identified, and therefore TCGA did not sequence the gene in their GBM pilot project (4). It was only the use of additional sample types that enabled the identification of frequent deletions and somatic mutations of this emerging tumor suppressor gene in GBM (6).
To determine whether admixed nonneoplastic cells and intratumoral genetic heterogeneity were responsible for impeding the identification of deletions in primary tumor samples, we analyzed CDKN2A/B and CDKN2C in both a first passage xenograft and the primary tumor from which it was derived. Deletions of both loci were present in the xenograft, but were largely masked in the primary tumor by the presence of admixed nonneoplastic cells and intratumoral genetic heterogeneity (5, 10). The same effect is evident when comparing copy numbers at each of the major tumor suppressor genes: deletions in primary tumors are more difficult to identify because their average copy number is significantly higher and their boundaries are less discrete (Table 1B; Fig. 1; Supplementary Table S3).
Taken together, these data indicate that xenografts and cell lines are superior to primary tumors for the identification of genomic deletions. The presence of nonneoplastic cells and heterogeneity in even the most homogeneous tumor types such as GBM results in substantial “noise” in the analysis, which hinders the identification of deletions and leads to a high rate of false-negatives. Such noise would be expected to pose similar problems in other cancer genomics assays as well, including DNA sequencing.
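The masking effect described above can be illustrated with a simple mixture calculation. The sketch below is not the analysis performed in this study (which used dChip on SNP array data); it assumes a two-population model in which a bulk sample is a blend of tumor cells and diploid nonneoplastic cells. The function names, the purity values, and the 0.5-copy calling threshold are all hypothetical choices made for illustration.

```python
def observed_copy_number(tumor_cn: float, purity: float, normal_cn: float = 2.0) -> float:
    """Expected average copy number of a locus in a bulk sample.

    Assumes a simple two-population mixture: a fraction `purity` of cells are
    tumor cells carrying `tumor_cn` copies, and the rest are nonneoplastic
    cells carrying `normal_cn` copies (diploid by assumption).
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    return purity * tumor_cn + (1.0 - purity) * normal_cn


def called_homozygous_deletion(observed_cn: float, threshold: float = 0.5) -> bool:
    """Hypothetical calling rule: flag a homozygous deletion when the observed
    average copy number falls below `threshold` (an illustrative cutoff only)."""
    return observed_cn < threshold


if __name__ == "__main__":
    # A locus that is homozygously deleted (0 copies) in every tumor cell,
    # measured at several illustrative tumor purities.
    for purity in (1.0, 0.9, 0.7, 0.5):
        cn = observed_copy_number(tumor_cn=0.0, purity=purity)
        print(f"purity={purity:.1f}  observed CN={cn:.2f}  "
              f"called={called_homozygous_deletion(cn)}")
```

Under these assumptions, the same homozygous deletion yields an observed copy number that rises toward 2 as the fraction of nonneoplastic cells increases, so a fixed threshold misses it at lower purities. Intratumoral heterogeneity, in which only some tumor cells carry the deletion, would raise the observed value in the same way.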
Many cancer researchers favor using primary tumors rather than cultured samples because of the widespread belief that ex vivo culture can lead to the accumulation of spurious genetic alterations. Concerns of this type reached a pinnacle 15 years ago, when there was substantial controversy about whether the recently identified deletions and mutations of the p16INK4a tumor suppressor gene could be artifacts of ex vivo culture (11, 12). After substantial high-profile debate, this concern was eventually refuted and it is now universally accepted that p16INK4a is one of the most commonly inactivated tumor suppressor genes in human cancer. However, such concerns remain firmly entrenched in the minds of most cancer researchers.
To test whether these concerns are valid, we catalogued all the copy number alterations present in each of our 58 samples. Strikingly, there were no examples of recurrent deletions or amplifications present exclusively in cultured samples. Additionally, if ex vivo culture specifically enriches for cells with deleted tumor suppressor genes, one would similarly expect culture to enrich for cells with amplified oncogenes. Yet as we show in Table 1A, ex vivo culture leads to a decrease in oncogene amplification in GBM cells, not the predicted increase.
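The logic of this cataloguing step can be sketched as a set comparison. The example below is a minimal illustration, not the procedure used in this study: it assumes each sample has already been reduced to a set of alteration labels, treats every sample type other than primary tumor as "cultured", and defines "recurrent" as present in at least two cultured samples. The function name, the recurrence cutoff, and the toy data (with placeholder locus names) are invented for illustration.

```python
from collections import Counter


def culture_exclusive_recurrent(samples, min_recurrence=2):
    """Return alterations that recur in cultured samples but are never
    seen in a primary tumor.

    `samples` is an iterable of (sample_type, set_of_alteration_labels).
    Any sample type other than "primary tumor" is treated as cultured
    (an assumption made for this sketch).
    """
    cultured_counts = Counter()
    seen_in_primary = set()
    for sample_type, alterations in samples:
        if sample_type == "primary tumor":
            seen_in_primary |= alterations
        else:
            cultured_counts.update(alterations)
    return {
        alteration
        for alteration, count in cultured_counts.items()
        if count >= min_recurrence and alteration not in seen_in_primary
    }


# Toy data with placeholder labels; these are not results from the study.
toy_samples = [
    ("primary tumor", {"locus_A amp", "locus_B del"}),
    ("primary tumor", {"locus_A amp"}),
    ("xenograft", {"locus_A amp", "locus_B del"}),
    ("cell line", {"locus_B del", "locus_C del"}),
    ("primary culture", {"locus_B del"}),
]

if __name__ == "__main__":
    print(culture_exclusive_recurrent(toy_samples))
```

In the toy data, the only alteration absent from primary tumors occurs in a single cultured sample, so it does not meet the recurrence cutoff and the function returns an empty set. An empty result of this kind corresponds to the observation reported above.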
Next, a comprehensive search of the literature was performed in an effort to identify studies that document copy number alterations and/or mutations present exclusively in cultured samples but not in primary tumors. Although we were able to identify several studies which showed expression differences between primary tumors and cultured samples (13, 14), we were unable to identify any studies documenting genetic lesions unique to cultured samples.
In contrast, Jones and colleagues recently provided remarkably strong evidence in support of the idea that cultured samples faithfully recapitulate the genetic profile present in the tumor from which they were derived. In their study, 287 of 289 mutations (99.3%) initially discovered in human colon cancer xenografts and cell cultures were similarly present in the primary tumors from which the cultured samples were derived (15). These data indicate that ex vivo culture of colon tumors does not lead to the formation or accumulation of spurious genetic aberrations.
Based on these findings, we believe that there is little convincing evidence to support the dogma that ex vivo culture leads to artifactual deletions, amplifications, and somatic mutations. As such, the risk of failing to identify deletions in human cancer samples due to an exclusive focus on primary tumors is likely to be substantially greater than the risk of identifying spurious genetic events by including other sample types in the analysis. This is especially true because it is relatively trivial to determine whether an event initially discovered in cultured samples is similarly present in primary tumors, as was the case, for example, with the recent identification of CDKN2C as a GBM tumor suppressor gene (5, 16).
Finally, we looked back through the modern history of cancer genetics to identify the sample types used to discover the most commonly altered oncogenes and tumor suppressor genes (Table 2). Notably, most somatically altered cancer genes that were not discovered via linkage analysis were initially identified using xenografts and cell lines. This includes p53, PTEN, p16INK4a, K-Ras, PIK3CA, B-Raf, and others (11, 17–30). Based on this history, it seems prudent to include cultured samples in any cancer genomics initiative whose major goal is the identification of novel somatically altered cancer genes.
Here, we provide three rationales for the inclusion of cultured samples in TCGA and other cancer genomics efforts. First, we show that in the case of one major human tumor type, there are significant differences in the utility of different sample types for the identification of copy number alterations. Second, we document that there is little evidence supporting the popular notion that ex vivo culture of human tumors leads to spurious genetic alterations. And third, we show that most major somatically altered cancer genes discovered to date were identified using xenografts and cell lines. Based on these arguments, we believe it would be prudent for TCGA to include a range of sample types in their burgeoning analysis of cancer genomics. We also note that the use of cultured samples is supported by the Cancer Genome Project of the Wellcome Trust Sanger Institute and is within the agreed guidelines of the International Cancer Genome Consortium.
Grant support: National Cancer Institute, American Cancer Society, and Georgetown University School of Medicine (T. Waldman).
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
5 These data were generated using Affymetrix 250K NspI SNP arrays and analyzed using dChip, a publicly available software program (http://biosun1.harvard.edu/complab/dchip/). These data have been reported previously (5, 6), and the raw and processed data sets have been deposited into the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/), accession number GSE13021.
6 Copy number data for a panel of 50 malignant glioma cell lines using Affymetrix SNP 6.0 arrays were generated by the Cancer Genome Project of the Wellcome Trust Sanger Institute, and are publicly available at http://www.sanger.ac.uk/genetics/CGP.
7 Copy number data for a panel of 37 neuroblastoma cell lines using Affymetrix SNP 6.0 arrays were generated by the Cancer Genome Project of the Wellcome Trust Sanger Institute, and are publicly available at http://www.sanger.ac.uk/genetics/CGP.
Disclosure of Potential Conflicts of Interest: No potential conflicts of interest were disclosed.