To investigate the genomic aberrations that are involved in lung tumorigenesis and therefore may be developed as biomarkers for lung cancer diagnosis, we characterized the genomic copy number changes associated with individual genes in 14 tumors from patients with primary non small cell lung cancer (NSCLC). Six squamous cell carcinomas (SQCAs) and eight adenocarcinomas (ADCAs) were examined by high-resolution comparative genomic hybridization (CGH) analysis of cDNA microarray. The SQCAs and ADCAs shared common frequency distributions of recurrent genomic gains of 63 genes and losses of 72 genes. Cluster analysis using 57 genes defined the genomic differences between these two major histologic types of NSCLC. Genomic aberrations from a set of 18 genes showed distinct difference of primary ADCAs from their paired normal lung tissues. The genomic copy number of four genes was validated by fluorescence in situ hybridization of 32 primary NSCLC tumors, including those used for cDNA microarray CGH analysis; a strong correlation with cDNA microarray CGH data emerged. The identified genomic aberrations may be involved in the initiation and progression of lung tumorigenesis and, most importantly, may be developed as new biomarkers for the early detection and classification of lung cancer.
Lung cancer is the most common cause of cancer death in North America. The unsatisfactory cure rate and poor prognosis of affected patients support efforts for better risk assessment and early detection. Non small cell lung cancer (NSCLC), the predominant form of lung cancer, has two major histologic subtypes: squamous cell carcinoma (SQCA) and adenocarcinoma (ADCA).
Lung tumorigenesis is a heterogeneous process that arises after a series of clonal molecular genetic alterations, including genomic gains and losses, particularly deletion of tumor-suppressor genes (TSGs) and amplification of oncogenes. Therefore, defining these genomic aberrations may help us identify tumor-specific signatures involved in the initiation and progression of lung cancer and thus help produce genomic biomarkers for the early detection of lung tumors. Cytogenetic karyotypes have shown that NSCLCs display multiple numeric and structural chromosomal alterations. Loss of heterozygosity (LOH) analysis further disclosed major differences in patterns of allelic imbalances between ADCA and SQCA [3–6]. Metaphase comparative genomic hybridization (CGH) studies detected genomewide copy number changes in lung cancers [7–10]. CGH analysis of microarrayed bacterial artificial chromosome (BAC) clones that cover a limited fraction of the human genome was also used to analyze NSCLCs and showed a clear pattern of genomic changes for SQCAs. However, the low resolution of all these techniques makes it difficult to identify the causal genes whose structural alteration is critical for biologic behavior.
Most recently, CGH analysis of cDNA microarrays has provided high-resolution maps of genomic locations of single genes because it uses cDNA and expressed sequence tag clones as targets [12,13]. This technique has been proven to define genomic copy number gains and losses of individual genes in human cancer [12–14]. With sufficient genetic representation in cDNA microarrays, CGH resolution can be substantially improved to provide important genetic information underlying complex chromosomal rearrangements and genomic imbalances leading to tumorigenesis.
In this study, we used cDNA microarray CGH analysis to characterize, in detail, genomic aberrations associated with single genes in 14 primary NSCLC tumors and their paired normal tissues. The results demonstrate that NSCLC tumors share common frequency distributions of recurrent genomic gains and losses of sets of genes. Our study also defines the genomic difference between the two most common lung tumor subtypes, SQCA and ADCA, and provides a clear genomic profile of primary ADCA, which is distinguished from paired normal lung tissues by a cluster of genomic aberrations. Validation of some of these genomic signatures raises the possibility of using the findings as new biomarkers for early detection of lung cancer.
For cDNA microarray CGH and metaphase CGH analysis, surgical specimens were obtained from 14 patients with stage I NSCLC between March 1, 2002 and June 28, 2003 at The University of Texas M. D. Anderson Cancer Center. All patients had a smoking history ranging from 32 to 95 pack-years. Six SQCAs and eight ADCAs had been definitively resected by either a lobectomy or a pneumonectomy. None of the patients had received preoperative adjuvant chemotherapy or radiotherapy. Tissue samples had been routinely dissected intraoperatively from the surrounding lung parenchyma; paired normal lung tissues had also been obtained from the same patients at an area distant from their tumors. Tissue acquisition was approved by the institutional review board at our institution. Tissue sections (4 µm thick) were stained with hematoxylin and eosin, and reviewed to confirm the diagnosis and to verify the presence of greater than 70% tumor cells. For fluorescence in situ hybridization (FISH) analysis, touch imprints were made from surgical specimens obtained from 32 patients with stage I NSCLC (16 SQCAs and 16 ADCAs, including those used for the cDNA microarray CGH analysis) and then fixed in methanol and acetic acid (3:1).
Cancer cell lines BT474 and H358 were purchased from the American Type Culture Collection (Rockville, MD) and maintained in RPMI medium supplemented with 10% fetal bovine serum. Genomic DNA was extracted from cell lines, surgical tissues, and normal human lymphocytes using a DNA tissue kit (QIAGEN, Inc., Valencia, CA) following the manufacturer's instructions.
cDNA microarrays contained a total of 8000 cDNA clones (Research Genetics; Invitrogen, Huntsville, AL). Of these clones, 6894 represented known genes, and the remainder corresponded to uncharacterized expressed sequence tags. The preparation of array slides was performed essentially as described previously [12,13]. Chromosomal assignments of clones were determined from the July 2003 freeze of the assembled human genome available through the UCSC Genome Browser (http://genome.cse.ucsc.edu). CGH experiments on cDNA microarrays were performed as described previously [12,13]. Briefly, 20 µg of genomic DNA from cancer cell lines, tissue specimens, and normal human lymphocytes was digested for 14 to 18 hours with AluI and RsaI (New England Biolabs, Beverly, MA) and purified by phenol-chloroform extraction. Six micrograms of tested DNA was labeled with Cy3 using the Bioprime labeling kit (Invitrogen, Carlsbad, CA), and normal lung tissue DNA was labeled with Cy5-dUTP (Invitrogen). Hybridization and postwashes were performed as described previously [12,14]. A laser confocal scanner (Agilent Technologies, Palo Alto, CA) was used to measure the fluorescence intensities at the target locations using DEARRAY software. After background subtraction, the average intensity of each clone in the test hybridization was divided by the average intensity of the corresponding clone in the control hybridization. For the copy number analysis, the ratios were normalized on the basis of the distribution of ratios of all targets on the array, using 126 housekeeping genes that were each spotted four times on the array. The distributions of fluorescence ratios were used to define cutpoints for increased or decreased copy number. Only clones that exhibited a log2 hybridization ratio of either >1 or <-1 were considered candidates for amplification or deletion, respectively.
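The ratio calculation and copy number calls described above can be sketched roughly as follows. This is a minimal illustration, not the DEARRAY procedure: the function names and the toy intensities are hypothetical, a single scalar background is assumed, normalization is assumed to be a shift that sets the median housekeeping log2 ratio to zero, and the cutoff is assumed to be symmetric at a log2 ratio of 1.

```python
import math
from statistics import median


def log2_ratios(test, control, background=0.0):
    """Background-subtract each clone, then return log2(test / control)."""
    ratios = {}
    for clone, test_signal in test.items():
        t = test_signal - background
        c = control[clone] - background
        if t > 0 and c > 0:  # skip clones with no usable signal
            ratios[clone] = math.log2(t / c)
    return ratios


def normalize(ratios, housekeeping):
    """Shift all log2 ratios so the median housekeeping ratio is 0 (assumption)."""
    ref = median(ratios[c] for c in housekeeping if c in ratios)
    return {clone: r - ref for clone, r in ratios.items()}


def call_copy_number(ratios, cutoff=1.0):
    """Label each clone as gain, loss, or neutral using a symmetric cutoff."""
    calls = {}
    for clone, r in ratios.items():
        if r > cutoff:
            calls[clone] = "gain"
        elif r < -cutoff:
            calls[clone] = "loss"
        else:
            calls[clone] = "neutral"
    return calls


# Toy intensities (hypothetical values, for illustration only).
test = {"HK1": 1000, "HK2": 1100, "GENE_A": 4400, "GENE_B": 220}
control = {"HK1": 1000, "HK2": 1000, "GENE_A": 1000, "GENE_B": 1000}

calls = call_copy_number(normalize(log2_ratios(test, control), ["HK1", "HK2"]))
print(calls)
```

With these toy values, the two housekeeping clones stay neutral while GENE_A and GENE_B fall outside the cutoff in opposite directions.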
Chromosomal CGH experiments were carried out as described in our previous publications [15,16]. Briefly, genomic DNA was labeled by nick translation using a nick translation labeling system (Vysis, Downers Grove, IL). Tumor samples were directly labeled with SpectrumGreen (Vysis) and hybridized with SpectrumRed-labeled reference DNA (Vysis). Samples were counterstained with 4',6-diamidino-2-phenylindole (DAPI). Each CGH experiment included at least one normal human lymphocyte DNA as a negative control. Images were analyzed with CGH analysis software (Applied Imaging, Santa Clara, CA). A gain of DNA sequence copy number was defined as a tumor-to-reference ratio >1.2 on both standard and inverse hybridizations. A copy number decrease was defined as a tumor-to-reference ratio of <0.8 on both hybridizations.
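The two-hybridization rule for metaphase CGH can be written as a small decision function. This sketch assumes that both inputs are already expressed as tumor-to-reference ratios (that is, the inverse hybridization has been re-oriented); the function name and example values are hypothetical.

```python
def metaphase_cgh_call(standard_ratio, inverse_ratio, gain=1.2, loss=0.8):
    """Call a region only when both hybridizations agree.

    Both arguments are tumor-to-reference ratios; the inverse (dye-swapped)
    hybridization is assumed to have been converted to the same orientation.
    """
    if standard_ratio > gain and inverse_ratio > gain:
        return "gain"
    if standard_ratio < loss and inverse_ratio < loss:
        return "loss"
    return "balanced"


print(metaphase_cgh_call(1.30, 1.25))  # both above 1.2
print(metaphase_cgh_call(1.30, 1.10))  # only one hybridization supports a gain
print(metaphase_cgh_call(0.70, 0.75))  # both below 0.8
```

Requiring agreement between the two hybridizations guards against dye-specific artifacts being scored as copy number changes.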
By searching the genome sequence database (http://www.ncbi.nlm.nih.gov/BLAST) using the BLAST algorithm, we identified the following BAC clones: 2320O4 for Skp2, 307C12 for Cks1, 391M1 for Gc20/Sui1, and 506M13 for SFTPA1. These clones were used as FISH probes. BAC DNA was prepared using the DNA Maxi Kit (QIAGEN, Inc.). Dual-color FISH was performed as described in our previous publication. Briefly, 1 µg of BAC DNA was labeled with SpectrumGreen (Vysis). Tissue imprint slides were denatured in 70% formamide and 2× SSC for 5 minutes at 72°C, dehydrated in graded ethanol, and incubated with a hybridization mixture consisting of 60% formamide, 2× SSC, Cot-1 DNA, and 100 ng of both a SpectrumGreen-labeled BAC DNA probe and a SpectrumOrange-labeled corresponding chromosomal centromeric probe (Vysis). After overnight incubation at 37°C, the slides were washed at 45°C in 50% formamide and 2× SSC for 10 minutes and counterstained with antifade solution containing DAPI. Two hundred cells on each slide were counted using Leica microscopes equipped with appropriate filter sets (Leica Microsystems, Buffalo, NY). Greater or lesser copy numbers of the tested probe compared with copy numbers of the centromeric reference probe indicated gain or loss of the gene, respectively. The cutoff value was calculated from normal tissue samples as the mean number of cells having an abnormal FISH signal pattern plus 3 SD.
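The mean-plus-3-SD cutoff can be illustrated with a short calculation. The counts below are hypothetical numbers of abnormal cells per 200 scored in normal tissue imprints, and the use of the sample standard deviation is an assumption; the text does not say which estimator was applied.

```python
from statistics import mean, stdev


def fish_cutoff(abnormal_counts_in_normals, n_sd=3):
    """Cutoff = mean + n_sd * SD of abnormal-cell counts in normal samples."""
    return mean(abnormal_counts_in_normals) + n_sd * stdev(abnormal_counts_in_normals)


def is_aberrant(abnormal_count, cutoff):
    """A specimen is scored as aberrant when its count exceeds the cutoff."""
    return abnormal_count > cutoff


# Hypothetical abnormal-cell counts (out of 200 cells) in five normal samples.
normal_counts = [4, 6, 5, 7, 3]
cutoff = fish_cutoff(normal_counts)

print(round(cutoff, 2))
print(is_aberrant(40, cutoff), is_aberrant(8, cutoff))
```

A cutoff derived this way is specific to each probe, because the background rate of abnormal signal patterns in normal tissue differs between probes.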
To analyze the cDNA microarray CGH data, clustering analysis was performed with cluster analysis software and visualized in the TreeView program written by Michael Eisen. Before the clustering algorithm was applied, the fluorescence ratio for each spot was first log-transformed (log2), and then the data for each sample were median-centered to remove experimental biases. To distinguish differences in the copy number of genes between SQCAs, ADCAs, and normal tissues, we analyzed 7668 clones because these clones were present in more than 12 of 14 specimens. Each clone was assessed by computing a two-sample t statistic with equal variances. To estimate the ability of individual clones to distinguish between the subtypes of lung cancer, the P value for each test was determined using a permutation method in which the group labels were reshuffled 10,000 times. P values less than .05 were considered significant, and the clones associated with these significant values were considered able to distinguish between any two groups of tissues.
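The per-clone test described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the P value is taken to be two-sided, labels are permuted by shuffling the pooled values, and the add-one correction in the final line is a common convention rather than something stated in the text. The log2 ratios in the example are invented.

```python
import math
import random
from statistics import mean, median


def median_center(sample_log_ratios):
    """Subtract the sample median so each array is centered at 0."""
    m = median(sample_log_ratios)
    return [x - m for x in sample_log_ratios]


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    se = math.sqrt(ss / (na + nb - 2) * (1 / na + 1 / nb))
    return (ma - mb) / se if se > 0 else 0.0


def permutation_p(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation P value for the pooled t statistic."""
    rng = random.Random(seed)
    observed = abs(pooled_t(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        t = abs(pooled_t(pooled[: len(a)], pooled[len(a):]))
        if t >= observed - 1e-12:  # tolerance for floating-point noise
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction (assumption)


# Invented log2 ratios for one clone in 6 SQCAs and 8 ADCAs.
sqca = [1.2, 1.0, 1.4, 1.1, 1.3, 0.9]
adca = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1, 0.0, -0.3]
p = permutation_p(sqca, adca)
print(p < 0.05)
```

Because 7668 clones are each tested at P < .05, some clones would be expected to pass by chance alone; the sketch does not apply any multiple-testing correction, and the text does not describe one.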
A Wilcoxon rank sum test was applied to compare the number of genomic alterations detected by conventional CGH between different histologic subtypes, and the Student's t test was used to evaluate the relationships between genomic copy number changes detected by FISH in the different histologic subtypes. Chi-square analysis was performed to examine the correlation between cDNA microarray CGH and FISH results regarding the genomic copy number of the genes. A P value of less than .05 was considered statistically significant.
To assess the sensitivity of the cDNA microarray in detecting genomic copy numbers, we first tested its ability to measure single-copy chromosomal changes by cohybridizing male DNA labeled with Cy5 and female DNA labeled with Cy3 on the cDNA microarrays. The average log2 Cy5:Cy3 hybridization ratio for X chromosome genes was -1, which corresponds to the ideal log2 value of -1 for a 1:2 male-to-female X chromosome ratio. We then tested the ability of the cDNA microarray to detect the genomic gain of a single gene by hybridizing genomic DNA from the breast cancer cell line BT474, in which the genomic copy number of the ERBB2 gene is approximately 10. When one third the amount of this DNA was compared with the normal reference DNA, the log2 hybridization ratio for ERBB2 was 3.2, suggesting that the ERBB2 copy number was approximately 3 (Figure 1).
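As a reading aid for the log2 ratios quoted here, the expected value for a given copy number comparison can be worked out directly. The helper names below are hypothetical, and the calculation assumes an idealized array on which signal is strictly proportional to copy number, which real hybridizations only approximate.

```python
import math


def expected_log2_ratio(numerator_copies, denominator_copies):
    """Ideal log2 ratio if signal were strictly proportional to copy number."""
    return math.log2(numerator_copies / denominator_copies)


def fold_change(log2_ratio):
    """Convert a log2 ratio back to a linear fold change."""
    return 2 ** log2_ratio


# X chromosome: one copy in male DNA, two copies in female DNA.
male_over_female = expected_log2_ratio(1, 2)
female_over_male = expected_log2_ratio(2, 1)
print(male_over_female, female_over_male)
```

The sign of the ideal value therefore depends only on which sample is placed in the numerator of the ratio: a male-to-female comparison of X chromosome genes gives -1, and the reverse orientation gives +1.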
To assess the ability of the array to detect the deletion of a single gene, we cohybridized DNA from NCI-H358 lung cancer cells, which have a homozygous deletion of the TP53 gene, with normal reference DNA. The average log2 hybridization ratio for TP53 was -1.2 (Figure 1).
Our study also allowed a direct comparison of the sensitivity of the cDNA array CGH with that of the metaphase CGH because the same samples were applied to the two methods simultaneously. As illustrated in Figure 2, all of the imbalances identified by metaphase CGH were confirmed by microarray CGH, whereas the copy number imbalances at 2p, 2q, 4q, 7p, 7q, 6p, 10q, 14q, 15q, 16q, and Xq detected by microarray CGH were not identified by metaphase CGH. In addition, the genomic aberrations identified by metaphase CGH were delineated by microarray CGH to much smaller regions, including gains at 1q21-22, 5p13, 8q22.1-23.1, 11q13, 19q13.1, 20q13.3, and 22q11.23, and losses at 5q23-32, 8p21-22, 9p21, 19p13.1, and 21q22.3. There was no statistically significant relationship between smoking pack-years and specific genomic aberrations.
Consistent with the results of conventional CGH, those of cDNA microarray CGH showed that primary SQCAs and ADCAs share common frequency distributions of recurrent gains and losses of genes (Figure 2). The genomic aberrations involved several known oncogenes and TSGs within 1q, 3q, 5p, 8q, 16p, 17q, 19p, 19q, and 20q for gains and 1p, 3p, 5q, 8p, 9p, 11p, 11q, 13q, and 18q for losses. The total numbers of genes with genomic aberrations were 228 in SQCAs and 194 in ADCAs. Furthermore, using clustering analysis of cDNA microarray CGH data, we identified 63 genes with a high number of genomic copy number gains and another 72 genes with a high number of genomic copy number losses in both SQCAs and ADCAs. Notably, as Table 1 shows, molecular genetic alteration of some of the genes has been previously described in lung tumors [19–49].
Genomic differences associated with individual genes between lung SQCA and ADCA subgroups can be deduced from Figure 2, c and d. A two-sample t-test performed on the log-transformed cDNA microarray CGH data identified the 57 most informative genes that allowed accurate discrimination between SQCAs and ADCAs (Figure 3). Furthermore, a permutation t-test using 18 genes with genomic changes showed a significant difference between ADCAs and their paired normal lung tissues (Figure 4).
To confirm whether the genomic signatures detected by our cDNA microarray CGH analysis reflected the real frequencies of the gene alterations in primary NSCLCs and have the potential to correctly identify the two major histologic types of lung tumor, we selected two genes (Gc20/Sui1 and SFTPA1) with deletions in both the SQCAs and ADCAs and two genes (Skp2 and Cks1) with genomic gains in either SQCAs or ADCAs. The copy number aberrations of the genes were detected by FISH in a set of lung tissue specimens, including those used for the cDNA microarray CGH analysis. As shown in Table 2, there was strong concordance between the cDNA microarray CGH and FISH results. The genomic deletion of Gc20/Sui1 and SFTPA1 is common to both subtypes of NSCLC. Genomic amplification of Skp2 is specific for SQCA, whereas Cks1 amplification is more common in ADCA (Table 2, Figure 5).
Chromosomal aberrations reflect the selective retention of genomic fragments housing driver genes, whose abnormality contributes to tumorigenesis. The metaphase CGH assay has been used for the identification of novel driver genes, and its profiles correspond well to the chromosomal location of some known or suspected oncogenes and TSGs in lung tumors [7–10]. However, for other frequently observed aberrations, no specific driver genes have yet been implicated because the method has relatively low resolution. We demonstrated in our study that the use of cDNA microarray CGH analysis may address this issue because cDNA microarrays represent high-resolution maps (in our study, one clone every 376 kb through the human genome), an approximately 20-fold higher mapping resolution than that attained by metaphase CGH. With the completion of the human genomic database, cDNA microarray CGH can map genomic gains and losses by their gene position rather than their chromosomal band, and therefore can immediately provide a list of candidate genes that occur within the region of interest.
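The quoted average spacing can be checked with simple arithmetic. The genome size used below (3.0 Gb) is an assumed round figure, not a value from the study, and the calculation treats the 8000 clones as evenly spread, whereas cDNA clones actually cluster in gene-rich regions.

```python
def mean_spacing_kb(genome_bp, n_clones):
    """Average distance between clones if they were evenly distributed."""
    return genome_bp / n_clones / 1000


ASSUMED_GENOME_BP = 3.0e9  # assumed round haploid genome size
N_CLONES = 8000            # clones on the array, as stated in the methods

spacing = mean_spacing_kb(ASSUMED_GENOME_BP, N_CLONES)
print(spacing)       # average spacing in kb
print(spacing * 20)  # spacing implied for a method with 20-fold lower resolution
```

Under these assumptions the average spacing comes out at about 375 kb, close to the 376 kb quoted, and a 20-fold lower resolution would correspond to roughly 7.5 Mb.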
The genomic copy number imbalances identified by our cDNA microarray CGH analysis appear comparable to those found in a recent study of lung cancer that used a BAC array CGH technique . Furthermore, because the cDNA microarray has a much higher mapping resolution (376 kb) than that achieved by BAC array CGH (1.4 Mb), our study restricted the larger fragment of genomic copy number changes to a small focal point of copy number aberrations of individual genes in primary lung tumors. For example, we determined two peaks of genomic gain at 5p13 and 5p35.3 in both the SQCAs and ADCAs that were not detected by the BAC array CGH, suggesting that the use of cDNA microarrays for analysis of DNA copy number variation has marked advantages over the use of large genomic DNA clone array-based CGH methods. Moreover, we determined the pattern of composite genomic losses of variable regions on several chromosomal loci in lung cancer, which was in keeping with the complex pattern of chromosomal rearrangements observed for deletions discovered by LOH [4,5]; however, our results defined a narrower region and even identified the individual genes with genomic deletion. Thus, our cDNA microarray CGH analysis has a higher resolution than other methods and can be used to detect a small region or individual genes of amplification or deletion, and, finally, define the unique genomic signatures associated with lung tumorigenesis.
The results of our cDNA microarray CGH analysis also imply that primary SQCAs and ADCAs share common recurrent DNA copy number gains and losses of certain gene clusters, confirming previous findings that lung tumors involve a series of clonal molecular-genetic alterations [11,50]. We also showed that substantial genomic differences exist between SQCAs and ADCAs. For example, when using metaphase CGH, we detected genomic gains in both ADCAs and SQCAs; the frequency of genomic gain of 3q was higher in SQCAs (100%, 6/6) than in ADCAs (50%, 4/8) (P < .001). Correspondingly, in the cDNA array analysis of the same specimens, gene amplification on chromosome 3q was more frequent in SQCAs than in ADCAs. ADCAs tended to have a more heterogeneous gene transcript pattern and, in some cases, to exhibit a genomic profile in 3q more similar to that of nonneoplastic parenchymal lung tissues. The genomic copy number difference between primary SQCAs and ADCAs suggests that they may differ in the level of genomic instability or in the mechanisms by which they initiate and progress; the genomic aberrations of specific genes in the genomic phenotype of each histologic subtype reflect their different clonal evolution, and the subtypes appear as different diseases at the molecular level. Most importantly, these unique genomic abnormalities may be developed as predictor sets of biomarkers for the early detection and classification of lung cancers.
Previous reports using the serial analysis of gene expression, oligonucleotides, or cDNA array analysis have described sets of genes overexpressed or downregulated in primary SQCAs and ADCAs [21,22,30,31,36,51]. In contrast to SQCAs, which always clustered as a distinct tumor group, ADCAs tended to have a more heterogeneous gene transcript pattern and, in some cases, to exhibit a profile more similar to that of nonneoplastic parenchymal lung tissues. Both of these facts make it difficult to molecularly classify ADCA using transcript signatures. In our study, cDNA microarray CGH analysis provided a clear genomic profile of genes in primary ADCAs that is distinctly different from that in normal lung tissues. That genes in primary ADCAs had a distinct genomic pattern in our study but no clear transcriptional profile in other studies is not surprising for several reasons: 1) genomic DNA is a different mixture from the mRNA representation of cells; 2) transcription of genes has different biologic changes and behaviors from their genomic ancestors in lung cancer; and 3) the level of mRNA expression does not completely reflect genomic copy number changes. In addition, the grouping of some ADCAs with normal lung in the previous reports may be due to the profiling of bronchioloalveolar carcinomas. The comparison of ADCAs with normal lung at the genomic DNA level by cDNA microarray CGH analysis should reveal the differences. Future assessment of transcript level and gene copy number changes of the same set of lung tumors in parallel using the same array may define whether genomic structural abnormalities directly affect imbalances of expression in lung tumorigenesis. Nevertheless, our finding that primary ADCAs have a set of genes with a unique genomic profile may be of interest because these minimal gene sets can be used for developing biomarkers for ADCAs.
This finding is particularly important because ADCAs have become more prevalent than SQCAs, a trend that is occurring worldwide, and are more difficult to diagnose than SQCAs because they typically arise from the smaller airways.
There was no statistically significant relationship between smoking pack year and certain genomic aberrations; one possible reason may be the small sample size of the current study. Currently, we are analyzing a large cohort of clinical specimens in an ongoing study, assessing the concordance of the genomic findings detected by cDNA microarray CGH and correlating these data with smoking history, prognosis, tumor progression, and treatment of the patients.
Although we used only four genes for confirmation in this study, all four showed a strong correlation between FISH analysis and cDNA microarray CGH data for genomic copy number changes, indicating that the genomic signatures discovered by cDNA microarray CGH might be developed as biomarkers for early interventional strategies for lung cancer. Furthermore, our results suggest that the genomic changes we observed are likely relevant to lung tumorigenesis. In fact, alterations of some of the genes have been previously reported in lung cancer (Table 1). For example, SFTPA1 is a phospholipid-protein complex that lowers the surface tension at the air-liquid interface in the alveoli of the lung and plays a key role in the innate host defenses there. Transcript-level and protein-level aberrations of SFTPA1 have previously been observed in lung tumorigenesis [52–54]. The product of GC20/Sui1 is a general monitor of the translational accuracy of proteins through recognition of the protein synthesis initiation codon, and the induced expression of GC20/Sui1 is related to cellular stress and may represent an important adaptive response to genotoxic agents. GC20/Sui1 has been detected in normal liver cells but not in hepatocellular carcinoma cells. We found that the GC20/Sui1 transcript was diminished in 80% of lung cancer cell lines tested by reverse transcription polymerase chain reaction (RT-PCR) (data not shown). Skp2 displays an S-phase-promoting function in the cell cycle and is implicated in the ubiquitin-mediated degradation of several key regulators of mammalian G1 progression, including p27, a dosage-dependent tumor-suppressor protein. Skp2 protein is overexpressed in oral epithelial carcinomas, and its expression levels correlate positively with prognosis. A positive correlation between an increased relative copy number of Skp2 and its transcript level was found in small cell lung cancer cell lines.
Cks1 is one of the components of the Skp1-Cullin1-F-box-Roc1 complex. Inui et al. recently found a high expression of Cks1 in ADCAs and suggested that such high expression may be involved in the pathogenesis of the disease. In agreement with that report, genomic gain of Cks1 was commonly detected in ADCAs by both FISH and cDNA microarray CGH analyses in our study. However, further characterization of the genes with a genomic aberration identified in our study is needed to evaluate the effects of the genomic aberrations on transcriptional and protein levels in lung cancer.
In summary, we have generated a profile of genomic copy number aberrations in the two major histologic subtypes of primary NSCLC tumors. Our findings may be a step toward defining a new genomic taxonomy of such tumors and demonstrate the potential power of genomic copy number profiling in lung cancer diagnosis. The development and implementation of a relevant panel of probes for detecting genomic signatures might be of great value in lung cancer diagnosis and surveillance strategies in a clinical laboratory setting. Nevertheless, double-blind, prospective, confirmatory studies by independent groups are necessary to further validate these findings.
We thank Vickie J. Williams and Ellen M. McDonald for editing the manuscript.
1This work was supported by an M. Keck Center for Cancer Gene Therapy Award, the University of Texas M. D. Anderson Cancer Center Institutional Research Grant, the Developmental Project/Career Development Award from The University of Texas specialized programs of research excellence in lung cancer, and the M. D. Anderson Cancer Center Tobacco Settlement Fund.