Comparative genomic hybridization by means of BAC microarrays (array CGH) allows high-resolution profiling of copy-number aberrations in tumor DNA. However, specific genetic lesions associated with small but clinically relevant tumor areas may pass undetected due to intra-tumor heterogeneity and/or the presence of contaminating normal cells. Here, we show that the combination of laser capture microdissection, Φ29 DNA polymerase-mediated isothermal genomic DNA amplification, and array CGH allows genomic profiling of very limited numbers of cells. Moreover, by means of simple statistical models, we were able to avoid excluding regions prone to amplification distortions and high variability, and to detect tumor-specific chromosomal gains and losses. We applied this new combined experimental and analytical approach to the genomic profiling of colorectal adenomatous polyps and demonstrated our ability to accurately detect single copy gains and losses affecting either whole chromosomes or small genomic regions from as little as 2 ng of DNA or 1000 microdissected cells.
Chromosomal instability (CIN) plays a central role in the progression and malignant transformation of solid tumors (1–3). However, evaluation of genome-wide CIN is often hampered by technical and sample limitations. Tumor cytogenetics is often not feasible even from freshly isolated tumors and requires short-term culturing of parenchymal cells. Moreover, different areas within a tumor may show heterogeneous CIN patterns associated with different malignant potentials. Also, tumor specimens are invariably contaminated with varying amounts of surrounding and/or infiltrating normal cells interfering with the analysis. Thus, a method is ideally required that enables the detection of chromosomal aberrations in small numbers of microdissected tumor cells (4,5). Such a method should also be sufficiently sensitive to detect single copy gains and losses, the most frequent genomic alterations in most tumor types (6).
Comparative genomic hybridization (CGH) by means of microarrays containing large-insert genomic clones such as bacterial artificial chromosomes (BACs) provides a sensitive and quantitative approach to assess DNA copy-number aberrations in tissue samples (6–8). Classical array CGH requires hundreds of nanograms of genomic DNA for fluorescent labeling, a prohibitive amount when working with microdissected samples. Several methods to amplify total genomic DNA have been applied to array CGH, most of them based on thermocycling protocols, such as degenerate oligonucleotide-primed (DOP) PCR (9) and ligation-mediated (LM) PCR (10). These methods produce relatively low molecular weight DNA that may not be representative of the entire genome (11). In contrast, multiple strand displacement amplification (MDA) generates thousands of high-molecular weight copies of genomic DNA in a robust, simple protocol without the use of thermocycling or ligation of DNA adaptors (12,13). This amplification mechanism favors equal representation of sequences because each priming event is propagated over very long distances in the genome. Array CGH with MDA-amplified DNA performs comparably to unamplified array CGH (14), and allows reliable detection of high-level copy-number changes (gene-dosage alterations of 3-fold or more) on cDNA microarrays (15) and synthetic oligonucleotide arrays (16). We aimed to improve this method significantly so that it reliably detects low-range alterations, and to apply it to microdissected neoplastic lesions.
We developed an experimental approach combining laser capture microdissection (LCM), isothermal genomic DNA amplification and array CGH on BAC microarrays, which have a better signal-to-noise ratio than cDNA microarrays. To validate the method, we generated a series of amplified CGH profiles from normal female and male DNA, and from cell lines with known gains on chromosome X or 20. Using two simple statistical methods, we were able to avoid the exclusion of genomic regions affected by amplification distortions and high variability, and to detect single copy changes varying in length from a few clones to a whole chromosome. Finally, we applied the combined experimental and statistical approach to the analysis of microdissected dysplastic cells from colorectal adenomatous polyps derived from patients affected by familial adenomatous polyposis (FAP), an autosomal dominant genetic predisposition to the development of hundreds to thousands of benign adenomas in the distal gastro-intestinal tract, caused by germline mutations in the APC gene on chromosome 5q (17). We demonstrate the sensitivity of our LCM-CGH array approach by the detection of a small 5q deletion subsequently validated by PCR-based loss-of-heterozygosity (LOH) analysis. These results underline the usefulness of our approach in the study of chromosomal imbalances in small subpopulations of microdissected tumor cells or in other cases where limited amounts of DNA are available.
The human 3600 BAC/PAC genomic clone set, covering the full genome at 1 Mb spacing and used for the production of our arrays, was obtained from the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/). Information on this clone set can be obtained at the BAC/PAC Resources Center Web Site (http://bacpac.chori.org). Degenerate oligonucleotide-primed PCR products were prepared for spotting on CodeLink® slides (Amersham Biosciences) according to detailed protocols (5) with some modifications (18).
The data set comprised the measurements from 17 independent array CGH experiments. Normal male and female DNA samples used as controls and as reference in the hybridization were obtained from a commercial source and consisted of pools of individuals (Promega). The genomic DNA from sample 5cl20 was selected for testing, since it contained a known trisomy over five clones on chromosome 20 (base position 12639830 to 16682277 at 20p12.1). The lymphoblastoid cell line used contains a known X trisomy. Normal epithelial (D5_14) and dysplastic adenoma (D5_1) cells were microdissected from human colon tissue specimens from a male individual affected by FAP. Three normal mucosa samples were microdissected from healthy control individuals, one male (NC1) and two females (NC2 and NC3). The frozen colon specimens used for microdissection were collected in Tissue-Tek® (Sakura) embedding medium and snap-frozen in 2-methylbutane (Sigma) and dry ice. Tissue sections (10 μm) were sliced with a Shandon Cryotome® Cryostat (Shandon), and directly transferred to a UV-cross-linked PEN membrane (P.A.L.M. Microlaser Technologies) mounted on a PALM® Membrane Slide (1 mm). Slides with mounted sections were immersed immediately and stored up to one week in 70% ethanol at 4°C.
Following the staining of the tissue sections with Mayer's Hematoxylin and Eosin Y solutions, 1000 parenchymal cells (600 000 μm²) were microdissected and laser pressure catapulted using the PALM® MicroBeam microscope system (P.A.L.M. Microlaser Technologies). The genomic DNA was extracted using the protocol from Isola et al. (19), adapted to small DNA amounts. Briefly, the adaptations included the resuspension of the microdissected cells in 100 μl of DNA extraction buffer, an overnight digestion with 0.6 mg/ml of proteinase K and a replacement of glycogen by 10 μg of GenElute™ linear polyacrylamide (Sigma) in the precipitation step. All samples were resuspended in 10 μl of TE buffer (10 mM Tris–HCl pH 7.4, 0.1 mM EDTA). Tests performed with DNA amounts that could be assessed on conventional agarose gels indicated that the average size of genomic DNA purified using this procedure was >20 kb.
A volume of 1 μl out of 2 ng/μl dilutions from cell lines and control DNA samples and from all microdissected samples (concentrated by speedvac from 8 μl) was used as starting material for the amplification. The Φ29 amplification was carried out according to the GenomiPhi kit manufacturer's instructions (Amersham Biosciences), using an incubation time of 16 h. GenomiPhi reactions were checked on a 0.6% agarose gel. An amplification was considered successful when a smear of DNA fragments, ranging from 1 to 20 kb, was visible. Samples were purified using Microcon® YM-100 spin columns (Millipore) according to the manufacturer's protocol and concentrated to 25 μl in water.
To roughly assess the amount of DNA isolated from microdissected samples, we used a PCR specific for the TH01 human marker (20). Each 2 μl of microdissected sample was compared to 1 μl of serial dilutions of female and male control DNAs (10, 5, 2, 1, 0.5, 0.1, 0.01, 0.001 ng/μl). The 25 μl reactions contained 1X GeneAmp® PCR Buffer (Applied Biosystems), 1.5 mM MgCl2, 0.2 mM dNTPs Mix, 500 nM oligonucleotides TH01-1 (forward: 5′-GTGGGCTGAAAAGCTCCCGATTAT-3′) and TH01-2 (reverse: 5′-ATTCAAAGGGTATCTGGGCTCTGG-3′) and 1 unit of AmpliTaq® DNA Polymerase (Applied Biosystems). The cycling conditions were as follows: 1 min at 94°C; 35 cycles of 20 s at 94°C, 20 s at 55°C and 20 s at 72°C; 10 min at 72°C and then held at 25°C. TH01 amplification products (~200 bp) were checked on a 2% agarose gel or on a Bioanalyzer 2100 (Agilent) using a DNA 1000 LabChip kit, according to the manufacturer's instructions. Because background synthesis was frequently observed in the GenomiPhi-amplified water control, the above-described PCR methodology was routinely used as an extra quality control on 1 μl (1/50 dilution) of unpurified Φ29-amplified DNA. The TH01-specific band should be present in all GenomiPhi-amplified samples and absent from the GenomiPhi water control.
By default, Cy3 and Cy5 fluorescent channels were used to label test and reference DNA, respectively, except when otherwise specified. The labeling and hybridization protocols described by Fiegler et al. (5) were used, with some modifications in the labeling. Briefly, 130.5 μl of solution containing 5 μl of Φ29-amplified or 450 ng of unamplified DNA, 60 μl of BioPrime® DNA Labeling System random primers solution (Invitrogen) and water was incubated for 10 min at 100°C, and subsequently cooled on ice. After the addition of 15 μl of 10× dNTPs labeling mix (1 mM dCTP, 2 mM dATP, 2 mM dGTP and 2 mM dTTP), 1.5 μl of 1 mM Cy3- or Cy5-dCTP (Amersham Biosciences) and 120 U of BioPrime® DNA Labeling System Klenow fragment (Invitrogen), the mixture was gently mixed and subsequently incubated overnight at 37°C. The addition of 15 μl of BioPrime® DNA Labeling System Stop Buffer (Invitrogen) ended the reaction. In one tube, Cy3-labeled sample and Cy5-labeled reference DNAs were mixed together, and 135 μg of human Cot-1 (Roche), 55 μl of 3 M sodium acetate (pH 5.2) and 1 ml of cold 100% ethanol were added. In a second tube, 80 μg of denatured herring sperm DNA (Sigma) was mixed with 135 μg of human Cot-1 (Roche), 23 μl of 3 M sodium acetate (pH 5.2) and 400 μl of cold 100% ethanol. After gentle mixing of the two labeling mixes, the labeled nucleic acids were precipitated overnight at −20°C. Hybridizations were performed as described in (5). Sixteen-bit fluorescent images were acquired with a DNA Microarray Scanner (Agilent) and the resulting TIFF images were analyzed with the GenePix Pro 4.0 software (Axon Instruments). Per array, a GenePix results file (.gpr) with the extracted Cy3 and Cy5 spot and background raw intensities was generated.
Data analysis was performed with a set of functions implemented in R (21) (http://www.r-project.org/). Briefly, gpr files were directly loaded into the R environment using the marrayTools package. Intensity data were normalized with the local robust regression function Lowess contained in the package marrayNorm. The resulting log2(Cy3/Cy5) ratios from each set of triplicate spots were subsequently averaged to produce a single ratio for each BAC clone. After this step, spots designated as empty, blank and Cy-dye controls, and spots with uncertain chromosome locations (clone id >35650 in the data set) were removed. The final data set, used in subsequent analysis, consisted of 3615 independent observations per sample. For all experiments, data were organized per chromosome and the clones were ordered from chromosome 1 to chromosome Y according to their Golden Path Mb position (http://genome.ucsc.edu/). Although data from all chromosomes were analyzed, only the data from chromosomes 5, 20, X and Y are shown to illustrate the methods. The data were analyzed separately per chromosome, given that chromosomes are not directly related and some display more variability than others. Histograms of the normalized log2 clone-specific residuals were also produced; they indicated that the measurement errors approximately follow a normal distribution.
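The triplicate-averaging step can be illustrated with a minimal Python sketch. It is not the authors' R code: the function name, the tuple layout and the clone identifiers are hypothetical, and the Lowess normalization performed before averaging is deliberately left out.

```python
import math
from collections import defaultdict

def clone_log2_ratios(spots):
    """Average log2(Cy3/Cy5) over replicate spots of each clone.

    `spots` is an iterable of (clone_id, cy3, cy5) tuples holding
    background-corrected intensities (hypothetical layout).
    """
    per_clone = defaultdict(list)
    for clone_id, cy3, cy5 in spots:
        # The log ratio is undefined for non-positive intensities; skip them.
        if cy3 > 0 and cy5 > 0:
            per_clone[clone_id].append(math.log2(cy3 / cy5))
    return {clone: sum(vals) / len(vals) for clone, vals in per_clone.items()}

# Illustrative values only: three replicate spots for one clone, one for another.
spots = [
    ("clone_A", 200.0, 100.0),
    ("clone_A", 400.0, 200.0),
    ("clone_A", 100.0, 50.0),
    ("clone_B", 100.0, 100.0),
]
ratios = clone_log2_ratios(spots)
```

Averaging on the log2 scale, as described in the text, keeps gains and losses symmetric around zero.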
The explanatory factors of interest are Fx and Fy, which are qualitative variables with levels (k) defined by the columns Fx and Fy level (Table 1). For any given chromosome with n clones, the model used for the copy-number ratios, observed for the level k of the factor under study and clone j, Yjk was

$$Y_{jk} = \mu + \alpha_k + \beta_j + \varepsilon_{jk} \qquad (1)$$
where αk and βj represent the factor-specific and clone-specific effect respectively, μ is the intercept (level 0, no change) based on the two normal male versus male comparisons (CGH3_2 and 4), and εjk is the error with mean zero and constant variance. The term βj in the model accounts for the varying technical performance of individual clones. The model in Equation 1 was first fitted to X chromosome clones from the pilot and microdissected subsets (Table 1) and subsequently, the same approach was used for the remaining chromosomes. Finally, we extracted the P-values for the factor under study from the ANOVA table for the fitting of the model in Equation 1. These P-values were not used as probabilities but merely to compare model fits for different chromosomes. Factor effects plots were drawn with the qvcalc R package. Comparable error bars were computed using, instead of the computed variances, quasi-variances (22), which consist of variances adjusted for the covariance structure.
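As a rough illustration of how an additive model with an intercept, a factor-level effect and a clone effect can be estimated, the Python sketch below uses closed-form least-squares estimates. It assumes a complete, balanced layout (exactly one ratio per clone and factor level), baseline coding for the factor (effect of level 0 fixed at zero) and clone effects that sum to zero. These are simplifying assumptions of the sketch, not a description of the authors' R analysis, and neither the ANOVA P-values nor the quasi-variances are reproduced.

```python
def fit_additive_model(y):
    """Least-squares fit of y[j][k] = mu + alpha[k] + beta[j] + error.

    y[j][k] is the log2 ratio of clone j at factor level k, with level 0
    as the baseline. Assumes a complete, balanced table.
    """
    n_clones, n_levels = len(y), len(y[0])
    level_means = [sum(row[k] for row in y) / n_clones for k in range(n_levels)]
    clone_means = [sum(row) / n_levels for row in y]
    grand_mean = sum(clone_means) / n_clones

    mu = level_means[0]                          # intercept = baseline level mean
    alpha = [m - mu for m in level_means]        # level effects relative to baseline
    beta = [m - grand_mean for m in clone_means] # clone effects, summing to zero
    return mu, alpha, beta

# Made-up example: three clones, baseline level and one "gain" level.
y = [
    [0.1, 0.9],
    [-0.1, 0.7],
    [0.0, 0.8],
]
mu, alpha, beta = fit_additive_model(y)
```

In this toy table the clone effects absorb the clone-to-clone offsets, so the level effect is estimated from the shift shared by all clones.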
To detect gains and deletions over a few neighboring clones and simultaneously test their statistical significance against a reference baseline, we used the Smith–Waterman analytical algorithm (23), implemented in R (24) (http://www.well.ox.ac.uk/~tprice/). The algorithm was first applied to chromosome 20 data from sample 5cl20, comparing each of its clone ratios to the median ratio of the same clone in the remaining eight samples, using a δ threshold of 3.6. The same algorithm was applied to the data for chromosome 20 and the other chromosomes from the remaining pilot and microdissected samples (Table 1).
LOH in 5q21.1–5q31.1 was analyzed by amplification of the dinucleotide repeat markers D5S400 (distal), D5S409 (LOH region) and D5S427 (proximal). The localization and the primer sequence of the markers were obtained from the ENSEMBL project (http://www.ensembl.org). Allele sizes were obtained from The Genome Database of the Human Genome Project (http://www.gdb.org). A radioactive PCR was performed using 1 μl of a 1:10 dilution of Φ29-amplified tumor and normal mucosa DNA from patient D5. An aliquot of 100 ng of unamplified DNA isolated from ethanol-fixed normal intestinal mucosa from the same patient was used as additional control. Amplifications were carried out in a 10 μl volume, containing 10 mM Tris–HCl, pH 8.9, 50 mM KCl, 2.5 mM MgCl2, 10% glycerol, 200 μg/ml BSA, 0.01% gelatin, 0.2 mM of each dATP, dGTP and dTTP, 0.05 mM dCTP, 0.25 μCi of dCTP (3000 Ci/mmol), 0.2 U Taq polymerase and 10 pmol of each primer (25). The cycling conditions were as follows: 1 cycle at 94°C for 4 min, followed by 35 cycles at 94°C for 1 min, 55°C for 1 min, 72°C for 1 min and a final cycle at 72°C for 6 min. The amplified fragments were resolved in a denaturing 6% polyacrylamide gel and dried on paper, and subsequently scanned on a Typhoon 9200 imager (Amersham Biosciences). The generated TIFF images were subsequently analyzed with ImageQuant v5.2® software (Amersham Biosciences). The allelic imbalance was calculated as described elsewhere (26). LOH is interpreted as significant when the calculated comparative allelic ratio values are >1.5 (loss of the smaller allele) or <0.6 (loss of the larger allele).
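The decision rule on the comparative allelic ratio can be sketched in Python as follows. The exact formula is given in reference 26 and is not restated here, so the definition used below (the large-to-small allele intensity ratio in tumor divided by the same ratio in normal tissue) is an assumption, chosen only because it is consistent with the stated directions of the two thresholds. Function names and intensities are hypothetical.

```python
def allelic_ratio(tumor_small, tumor_large, normal_small, normal_large):
    """Comparative allelic ratio (assumed definition, see lead-in).

    Arguments are band intensities of the smaller and larger allele
    in tumor and matched normal DNA.
    """
    return (tumor_large / tumor_small) / (normal_large / normal_small)

def classify_loh(ratio, upper=1.5, lower=0.6):
    """Apply the thresholds quoted in the text (>1.5 or <0.6)."""
    if ratio > upper:
        return "loss of smaller allele"
    if ratio < lower:
        return "loss of larger allele"
    return "no allelic imbalance"

# Made-up intensities: the larger allele is reduced in the tumor sample.
example = classify_loh(allelic_ratio(100.0, 40.0, 100.0, 100.0))
```

Normalizing against the matched normal tissue cancels allele-specific amplification differences that are unrelated to copy number.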
The genomic microarray data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) and are accessible through GEO Series accession number GSE1841.
For this study, we generated CGH microarrays covering the full human genome by employing the human 3600 BAC/PAC genomic clone set from the Wellcome Trust Sanger Institute. Individual BAC clones were amplified with degenerate oligonucleotides and the PCR products spotted on glass slides (see Materials and Methods). The specificity and sensitivity of our genome-wide 1 Mb-spaced CGH microarray were tested with a series of control hybridizations using unamplified DNA samples (Figure 1A and B). A chromosomal region is classified as amplified or deleted when clone ratios fall outside fixed thresholds (+0.26 and −0.26 in log2 scale) for copy-number gain or loss, respectively (27–30). For both arrays, >99% of the autosomal BAC clone ratios are contained within these limits (0 ± 0.07). Similar results are obtained for the log2 ratios of clones on the X chromosome (0 ± 0.11) and Y chromosome (0 ± 0.09) in the male versus male comparison (Figure 1A), while the female versus male comparison resulted in log2 ratios of 0.63 ± 0.22 for X-derived and −1.66 ± 1.05 for Y-derived BAC clones (Figure 1B). These results are concordant with the overall results expected for CGH profiling using unamplified control samples (6,8), confirming the quality of our arrays. The log2 ratios observed for X and Y in the female versus male hybridization are smaller than expected for the gene copy differences. This ‘dynamic range compression’ is commonly observed with array CGH, both for unamplified and amplified samples (6,8).
Classical array CGH requires genomic DNA in amounts well above those obtained from LCM tissues. To overcome the problem of limited amounts of starting material, the Φ29 polymerase was used to linearly amplify high-molecular weight genomic DNA and LCM-derived templates. It was expected that Φ29 polymerase-mediated reactions using small amounts of template would reach high amplification yields (12,15). However, in addition to high amplification yields, Φ29-mediated reactions also generated non-specific products, both in the presence of non-human DNA (λ genomic DNA) and even when no DNA template was employed (Supplementary Figure A). Apparently, Φ29 is capable of primer-directed DNA synthesis in the absence of template via a yet unknown mechanism (15,31). To control for human-specific DNA synthesis, we employed conventional PCR on the Φ29 products to amplify the single-copy human marker TH01 (20). As expected, the 200 bp TH01-specific product was observed exclusively when human DNA templates were employed (Supplementary Figure B). Because the non-specific amplification contributes to the amount of amplified DNA, conventional DNA quantification methods such as spectrophotometry could not be used. Therefore, we also used the TH01 PCR to semi-quantify the amount of Φ29-amplified DNA to be used as input for the labeling reaction. Finally, the TH01 PCR assisted us in the estimation of LCM-derived genomic DNA amounts used as input for the amplification reaction (Supplementary Figure C). The comparison of TH01 amplification products generated from serial dilutions of genomic DNA controls with TH01 amplification products generated using genomic DNA obtained from ~1000 microdissected colon epithelial cells showed that the microdissection procedure on a tissue section corresponding to an area of approximately 600 000 μm² yielded 5–10 ng of genomic DNA. The Φ29 reaction yielded approximately 1250–6250 ng of amplified DNA from 5 to 10 ng of starting DNA.
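The quantities quoted above can be cross-checked with a short back-of-the-envelope calculation in Python. The figure of about 6.6 pg of DNA per diploid human cell is a commonly used approximation that does not come from this study, and the function names are hypothetical. Because the text does not state which input amount gave which yield, only the widest possible fold-amplification range is computed.

```python
PG_DNA_PER_DIPLOID_CELL = 6.6  # assumed textbook approximation, not from this study

def expected_dna_ng(n_cells):
    """Theoretical genomic DNA content of n diploid cells, in nanograms."""
    return n_cells * PG_DNA_PER_DIPLOID_CELL / 1000.0

def fold_amplification_bounds(input_ng, output_ng):
    """Widest fold-amplification range compatible with two (min, max) ranges."""
    return min(output_ng) / max(input_ng), max(output_ng) / min(input_ng)

theoretical_ng = expected_dna_ng(1000)
low_fold, high_fold = fold_amplification_bounds((5.0, 10.0), (1250.0, 6250.0))
```

Under that assumption, the theoretical content of 1000 cells falls inside the 5–10 ng range estimated by the TH01 PCR.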
To test the reproducibility of the Φ29 amplification reaction, we co-hybridized Φ29-amplified female and male control DNA samples against an unamplified male reference. Apart from the expected overall increase in background noise, a marked and reproducible ratio distortion is observed at specific chromosomal regions (Figure 1C and D). Major affected regions include 1p, and several telomeres (e.g. 7q and 8p) and centromeres, known to be prone to ratio distortions due to the repetitive nature of their sequences (32). The reproducible change in ratio structure becomes evident when replicate normal versus normal hybridizations are compared. The Pearson correlation between the autosomal log2 ratios obtained in two amplified normal female versus male experiments (Table 1, #1, 3) was high (0.90). In addition, we compared the CGH profiles of each individual sample to an ‘average’ of amplified normal CGH profiles (Figure 1E and F; for boxplots of individual BAC clones per chromosome see Supplementary Material). The average normal profile is calculated for autosomal clones as the mean intensity values obtained from pilot subset samples (Table 1) with the exclusion of 5cl20; for X- and Y-specific chromosomal regions, the mean intensity values were calculated from three male samples included in the pilot subset (Table 1, #2, 4, 27). The reproducible wave-like pattern of the ratio distortions is likely to result from variability in the amplification process when repetitive and polymorphic genomic regions are differentially processed by the Φ29 polymerase (15). Accordingly, no significant ratio distortion was observed when two normal genomic DNA samples were separately microdissected from the same sections, Φ29 amplified and co-hybridized on the same array (Supplementary Figure D). The estimated mean log2 ratios were comparable to those previously observed with unamplified control DNA samples.
Furthermore, we compared the ratio variability of Φ29-amplified samples co-hybridized with an unamplified or an amplified reference DNA and found no apparent difference (Supplementary Figure E). By including clone-specific characteristics such as median log2 ratio and variability in our downstream statistical analyses, we were able to analyze the amplified array CGH profiles without excluding variability-prone clones or regions.
In order to evaluate the quality of Φ29-amplified genomic DNA from microdissected cells, we compared the CGH profiles of Φ29-amplified diluted normal genomic DNA samples (Figure 1E and F) with the profiles from Φ29-amplified DNA obtained from microdissected normal epithelial cells (Figure 1G and H). Since these DNA profiles were comparable, the quality of the LCM-derived amplified DNA apparently was not compromised by the microdissection procedure. Therefore, we conclude that genomic DNA samples derived from laser capture microdissection and amplified by Φ29 are suitable for array CGH profiling.
To determine the sensitivity of our amplified array CGH approach in detecting low levels of chromosomal gains and losses, we generated a set of array CGH profiles with DNA samples from cell lines differing in the number of X and Y chromosomes (Table 1). Since, in our experience, ratio distortions introduced by Φ29-mediated amplification are systematic, we set out to implement a statistical model that accounts for this variation and identifies copy-number changes. Employing chromosome X as a model, we assessed the sensitivity of the CGH array technology in accurately quantifying chromosome-wide changes on Φ29-amplified DNA. Using our pilot data set encompassing different chromosome X copy numbers (Table 1), we modeled six different gain or loss scenarios: 2:1, 3:2, 1:1, 2.5:2 (gain of one copy in 50% of the cells), 1.5:2 (loss of one copy in 50% of the cells), and 1:2. The observed log2 ratios for the X and Y chromosomes correlated with the expected chromosome dosage increments for the known karyotypes, while the log2 ratio for autosomal clones was approximately 0 (Figure 2). The graphs in Figure 2 also show that the variability of the measurements increased proportionally to the chromosome X and Y effect size.
For the detection of gains and losses of whole chromosomes, a linear regression model (see Materials and Methods) was fitted separately per chromosome to all pilot subset samples, using as explanatory factor the class of expected ratios for chromosome X (Fx). As expected, the tested variable Fx showed a significant estimated difference from the baseline level 0, indicating copy-number changes, only when data from chromosome X or Y were used (Figure 3). Estimated X chromosome ratios of 2:1, 3:2, 1.5:2 and 1:2 were highly significant, as well as the more extreme Y chromosome ratios of 0:1, 0.5:0 and 1:0 (significant values highlighted in bold in Table 2). Similar to classical array CGH, the estimated ratios were compressed compared to the expected ratios: an X chromosomal gene dosage ratio of 2:1, with an expected log2 ratio of 1, was measured as a difference to the intercept of 0.8; 3:2 (log2 ratio 0.59) as 0.35; 2.5:2 (log2 ratio 0.32) as 0.06; 1.5:2 (log2 ratio −0.41) as −0.42; and 1:2 (log2 ratio −1) as −0.85. Note, however, that this compression is linear (Figure 3C) and, based on its estimate from the pilot subset samples, it can be corrected. All Fx levels showed a significant estimated effect difference from the intercept (level 0) except for the estimated log2 ratio for 2.5:2 X chromosomes (Fx level 0.25, Table 1). For this chromosome dosage level, the difference to the intercept (0.06) was neither statistically nor biologically meaningful, indicating that the thresholds for the detection of copy-number changes in Φ29-amplified genomic DNA are a gain of one copy in all cells, and a loss of one copy in 50% of the cells.
Based on this pilot data set, we implemented a threshold for biological relevance in addition to statistical significance (P < 0.00001), requiring a log2 ratio compared to the intercept larger than +0.35 or smaller than −0.35, respectively for gain or loss.
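The compression described above can be made concrete with a small Python sketch that regresses the reported differences to the intercept on the expected log2 ratios and then applies the ±0.35 threshold. The five value pairs are those quoted in the text; the helper names are hypothetical, ordinary least squares is an assumption of this sketch rather than the authors' stated fitting procedure, and the accompanying P-value criterion is not modeled.

```python
import math

# (copy-number ratio, reported difference to the intercept) from the text
reported = [
    (2 / 1, 0.80),
    (3 / 2, 0.35),
    (2.5 / 2, 0.06),
    (1.5 / 2, -0.42),
    (1 / 2, -0.85),
]

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

expected = [math.log2(ratio) for ratio, _ in reported]
observed = [diff for _, diff in reported]
slope, intercept = fit_line(expected, observed)  # slope < 1 reflects compression

def call_change(diff, threshold=0.35):
    """Biological-relevance call; the P-value criterion is not modeled here."""
    if diff > threshold:
        return "gain"
    if diff < -threshold:
        return "loss"
    return "no change"
```

A slope below 1 summarizes the compression, and dividing an observed difference (minus the fitted intercept) by that slope is one way the linear correction mentioned in the text could be applied. Values lying exactly on a threshold are sensitive to rounding and are treated as unchanged in this sketch.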
Subchromosomal regions containing copy-number alterations can be difficult to detect, especially when they overlap with highly variable regions. We used the analytical Smith–Waterman algorithm as modified by Karlin et al. (23,33), implemented in R by Price et al. (24) (http://www.well.ox.ac.uk/~tprice/1), to detect subchromosomal gains and losses in our pilot CGH data (Figure 4A–C). After subtracting a user-defined baseline threshold delta (δ) to generate a negative mean for each chromosome, the genome is scanned for contiguous sequences of high positive or low negative scores, which may indicate polysomic or monosomic regions, respectively. High- or low-scoring segments of clones are denominated as ‘islands’, when their scores cannot be increased by shrinkage or expansion of the segment boundaries. The statistical significance of an island is estimated as the proportion of times that a higher-scoring island is found in 1000 random permutations of the coordinates of the scores, based on the premise that successive scores from the permuted data approximate the null distribution. The trisomic region present over five clones on chromosome 20 from sample 5cl20 was correctly predicted with a δ threshold of 3.6 (Figure 4B), with no false-positive islands detected. Applying the same threshold to chromosome 20 data from other samples (Figure 4A and C) did not result in the detection of any significant islands (p < 0.001). Also, chromosome 5 ratios (Figure 4D and E) and ratios from the remaining chromosomes (data not shown) were analyzed using the same approach and, as expected, no significant copy-number alterations were detected.
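The island search and permutation test described above can be sketched in Python. This is a simplified illustration, not the R implementation by Price et al.: it reports only the single highest-scoring island, handles gains only (losses could be found by negating the scores), and assumes the per-clone scores are already on the scale to which δ applies. All names and example values are hypothetical.

```python
import random

def best_island(scores, delta):
    """Highest-scoring contiguous segment after subtracting delta.

    Returns (start, end, score) with `end` exclusive; score 0.0 and an
    empty segment mean that no positive-scoring island exists.
    """
    best = (0, 0, 0.0)
    run_start, run_sum = 0, 0.0
    for i, s in enumerate(scores):
        if run_sum <= 0.0:
            # A non-positive running sum cannot help; start a new segment.
            run_start, run_sum = i, 0.0
        run_sum += s - delta
        if run_sum > best[2]:
            best = (run_start, i + 1, run_sum)
    return best

def island_p_value(scores, delta, n_perm=1000, seed=0):
    """Share of permutations whose best island scores higher than observed."""
    rng = random.Random(seed)
    observed = best_island(scores, delta)[2]
    shuffled = list(scores)
    higher = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_island(shuffled, delta)[2] > observed:
            higher += 1
    return higher / n_perm

# Made-up chromosome: 45 clones, five neighboring clones with elevated scores.
scores = [0.0] * 20 + [5.0] * 5 + [0.0] * 20
start, end, score = best_island(scores, delta=3.6)
p_value = island_p_value(scores, delta=3.6)
```

Subtracting δ makes typical clones contribute negative scores, so only runs of consistently elevated neighbors accumulate a positive island score; shuffling clone order destroys such runs, which is what the permutation test exploits.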
To obtain proof of principle for our combined LCM–array CGH approach in the genomic profiling of human tissues, the method was applied to Φ29-amplified DNA samples derived from approximately 1000 microdissected human colon cells. Three samples originated from control individuals, one male (NC1) and two females (NC2 and NC3), and two samples were derived from a male individual affected by FAP, normal epithelium (D5_14) and dysplastic adenoma cells (D5_1). Two illustrative normal CGH array profiles (NC1 and NC2) co-plotted with the average normal CGH profile showed no autosomal genomic imbalances (Figure 1G and H).
We fitted the linear model using the Fx levels (Table 1) as factor to these samples to detect whole chromosome copy-number changes (Figure 3, right panel of each plot; Table 2). Except for the sex chromosomes in sex-mismatched hybridizations, none of the chromosomes from the microdissected normal and patient samples showed statistically significant and biologically meaningful changes from the expected number of two copies. Remarkably, the estimated X and Y chromosome average log2 ratios in the female samples versus male reference are lower than the corresponding estimated ratios from diluted control DNA samples. We believe this might be due to variations in the non-measurable low quantities of LCM-derived DNA used as templates in Φ29 amplifications.
Co-plotting of the adenoma sample CGH ratios (D5_1) with the average normal CGH profile (Figure 5A) suggested a loss of genetic material in a subchromosomal region of chromosome 5 that was not observed in the corresponding normal tissue (Figure 5B). Consequently, using the Smith–Waterman algorithm with a δ threshold of 2.5, we detected a sequence of 37 clones (~36 Mb), deleted in the region spanning from 5q21.1 to 5q31.1 (base position 100714436 to 135902806) encompassing the APC gene (5q22) causal to FAP. Allelic imbalance at chromosome 5q was confirmed by LOH analysis of dinucleotide repeat markers in the region of interest. The comparative allelic ratio of marker D5S409 (≤0.6), encompassed by the deletion, is indicative of loss of heterozygosity at this locus (Figure 6). Markers D5S400 and D5S427, mapping respectively distal and proximal to the deletion predicted by array CGH, did not reveal allelic imbalances, thus confirming the subchromosomal nature of the 5q deletion.
We present a robust, standardized protocol for the analysis of genome copy-number alterations in DNA derived from as few as 1000 microdissected cells from histology specimens. The Φ29 polymerase-based MDA generates high-molecular weight DNA replication products in high yields, and is suitable when processing large numbers of samples. Some representational distortion occurs, which is likely to result from variability in priming density and processing by the Φ29 polymerase of repetitive and polymorphic sequences. Similar systematic changes in copy-number structure were reported in other array CGH studies using Φ29-amplified DNA (15,16). Some amplification biases can be partially compensated for by using test and reference samples amplified under the same conditions (10,15,34–36). We compared the results of co-hybridization of our test sample with amplified and unamplified reference DNA samples, and surprisingly found similar ratio distributions. Since previous studies used either a different strand displacement polymerase (Bst) and cDNA microarrays (15), or a PCR-based amplification method (10,34–36), these results may not be directly comparable to ours. Moreover, Lage et al. also successfully used an unamplified reference sample in an evaluation experiment on human BAC arrays (15). Given that an unamplified reference is simpler and less expensive for high-throughput analyses of many samples, we opted for an unamplified reference. As we show, the observed over- and under-representations of specific genomic regions were reproducible and as such can be handled with adequate statistical tools. By taking individual clone-specific effects into account in our statistical analyses, we handled these distortions effectively without excluding clones based on an arbitrary threshold.
This approach allowed us to analyze the whole genome, including the Y chromosome, which is routinely excluded entirely due to variable hybridization results, even in classical array CGH (6,15,36).
Using the X chromosome as a model, we show that our amplified array CGH method detects monosomy and trisomy of whole chromosomes using the linear regression model. It was even possible to detect a loss of one copy in 50% of the cells, while a similar sensitivity was not reached for a single copy gain. The observed underestimation of the magnitude of copy-number changes has been previously reported in array CGH studies, and most likely results from incomplete suppression of repetitive sequences or errors in background subtraction (6,8). However, for all the analyzed X chromosome dosage levels, the relationship between the estimated and the expected ratios was linear, indicating that the amplitude of compression is constant over these levels. As the copy number departed farther from the genome average, the variance of the ratios measured for the X and Y chromosome clones increased. These ratio variations were reproducible, suggesting that the sequence characteristics of individual clones, possibly differing amounts of sequence shared between the X and Y chromosomes, play a role (6). To establish a limit of accuracy for small regions of gains or deletions, we applied amplified array CGH to a sample known to harbor a trisomic region spanning five BAC clones (5 Mb) on chromosome 20. This aberration was readily detected using a simple, nonparametric Smith–Waterman dynamic programming algorithm. The lack of assumptions in this method makes its use very convenient. For our purpose, it demonstrated good sensitivity and specificity, indicating that the resolution of our amplified array CGH method is at least 5 Mb. Both statistical models have a very low false discovery rate when log2 ratio thresholds of −0.35 (0.78 in linear scale) and 0.35 (1.27 in linear scale) relative to the intercept, representing a biologically meaningful loss or gain, are applied in addition to a significant p-value.
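The threshold arithmetic quoted above can be checked with a short sketch. It converts the log2 thresholds to linear scale and computes the theoretical, uncompressed log2 ratio for a single-copy change present in a fraction of cells. The two-copy baseline, the significance level `ALPHA`, and all function names are assumptions made for illustration; they are not taken from the study, whose X-chromosome dosage series and regression intercept are not modeled here.

```python
import math

LOG2_THRESHOLD = 0.35  # magnitude of the log2 threshold quoted in the text
ALPHA = 0.05           # assumed significance level, not stated in the text


def to_linear(log2_ratio):
    """Convert a log2 ratio to a linear-scale ratio."""
    return 2.0 ** log2_ratio


def expected_log2_ratio(copies_changed, fraction_of_cells, baseline_copies=2):
    """Theoretical log2 ratio when `copies_changed` copies are gained (+)
    or lost (-) in a given fraction of cells, assuming the remaining cells
    and the reference both carry `baseline_copies` copies."""
    mean_copies = baseline_copies + copies_changed * fraction_of_cells
    return math.log2(mean_copies / baseline_copies)


def call_change(log2_ratio, p_value, threshold=LOG2_THRESHOLD, alpha=ALPHA):
    """Call a gain or loss only if the ratio passes the threshold AND the
    p-value is significant; otherwise report no change."""
    if p_value >= alpha:
        return "none"
    if log2_ratio >= threshold:
        return "gain"
    if log2_ratio <= -threshold:
        return "loss"
    return "none"


loss_half = expected_log2_ratio(-1, 0.5)  # log2(1.5 / 2)
gain_half = expected_log2_ratio(+1, 0.5)  # log2(2.5 / 2)
```

Under these assumptions, a single-copy loss in half of the cells has a theoretical log2 ratio of about −0.42, while the corresponding gain is only about 0.32. This asymmetry of the log scale is one possible contributor to the difference in sensitivity between losses and gains, in addition to the ratio compression discussed above.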
Our data also show the importance of using a pilot study, with normal and control samples carrying known copy-number changes. The generated data can be subsequently used for calibration and for sensitivity determination in the analytical approach.
Applying the novel LCM-array CGH protocol to the analysis of normal colon mucosa and colonic adenomatous polyps from FAP patients provided proof of principle. A subchromosomal 5q deletion was detected by LCM-array CGH analysis and confirmed by an independent PCR-based method. Although we have not tested the minimum amount of input DNA for the MDA reaction, the recommended minimal amount is 1 ng, representing approximately 300–500 human genomic equivalents.
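As a rough check on the genome-equivalent figure, the mass of one haploid human genome can be estimated from its length. The sketch below uses commonly quoted approximations (about 3.1 × 10^9 bp per haploid genome and about 650 g/mol per base pair); these constants and the function names are assumptions for illustration and are not values taken from this study.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one base pair
HAPLOID_GENOME_BP = 3.1e9  # approximate haploid human genome length


def haploid_genome_mass_pg():
    """Approximate mass of one haploid human genome in picograms."""
    grams = HAPLOID_GENOME_BP * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e12


def haploid_genome_equivalents(dna_ng):
    """Approximate number of haploid genome copies in `dna_ng` of DNA."""
    return dna_ng * 1000.0 / haploid_genome_mass_pg()


equivalents_in_1_ng = haploid_genome_equivalents(1.0)
```

With these approximations one haploid genome weighs roughly 3.3 pg, so 1 ng corresponds to about 300 haploid genome equivalents, which is at the lower end of the range quoted above.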
As shown here, cryo-preserved tissues provide excellent starting material for φ29 amplification. However, the amplification efficiency is reduced proportionally to a decrease in molecular weight of the starting material, which is problematic for amplification of formalin-fixed archival DNA (15). Recently, several other methods for whole-genome amplification in combination with array CGH were described. Balanced-PCR amplification (36) employs digestion and ligation of a target and control genome with distinct linkers, which are mixed and amplified in a single PCR, thereby avoiding biases associated with PCR saturation and impurities. This procedure showed equivalent performance compared to MDA on cDNA microarrays using intact genomic DNA, but overcomes problems associated with modest DNA degradation in formalin-fixed paraffin-embedded tissues (36). Guillaud-Bataille et al. present an optimized LM-PCR protocol using 1 ng of starting DNA and BAC arrays. This approach preserves the initial ratios observed with BAC array CGH, allowing the reliable detection of one-copy-level variations among the amplified material (10). Although this method has not yet been applied to microdissected samples, the results on cell line DNA are very promising.
cDNA arrays typically consisting of several tens of thousands of features are used for array CGH experiments because of their more common availability and higher resolution (15,36–38). Compared to BAC arrays, however, the signal-to-noise ratio is lower and signals of two to five neighboring clones are averaged to improve signal reproducibility. To reach a similar resolution using BAC arrays, around 10000–30000 BACs would be necessary, resulting in a tiling array for the human genome (39). Since the size of the BACs, 150–200 kb, ultimately obscures higher resolution, microarrays containing 25mer (38,40,41) or 60–70mer (42–44) oligonucleotide probes are currently being explored for measuring DNA copy-number changes. The commercially available synthetic 25mer high-density oligonucleotide arrays, which were originally designed to detect single-nucleotide polymorphisms (SNPs) (45), have the advantage of giving genotyping data in conjunction with copy-number analysis (38,40,41). Overall, this platform exhibited more variability than BAC array-based CGH, and while high-level amplifications and homozygous deletions were reliably reported, changes resulting in loss or gain of a single copy were often missed in unamplified tumor cell line DNA (38,41). It is anticipated that as SNP density increases, resolution and the ability to assess subtle copy-number changes will increase. Importantly, in a small-scale study, Wong et al. used SNP arrays in combination with φ29-amplified DNA from two tumor biopsies, and showed good concordance between amplified and unamplified DNA for a high-level amplification and deletion (40). This study shows the feasibility of combining MDA whole-genome amplification with sensitive copy-number analysis at high resolution.
In summary, we show that the strand displacement polymerase φ29 reproducibly amplifies starting amounts of genomic DNA as low as 2 ng or 1000 laser-capture microdissected cells, resulting in the reliable detection of single copy variations on BAC array CGH. This method allows the detection of specific genetic alterations in small neoplastic lesions, including tumor biopsies obtained by non-surgical sampling methods like endoscopies, and in clinically relevant subpopulations of tumor cells, e.g. invading fronts of tumors. The significance of the capability to detect single-copy alterations in tissue samples consisting of only a few thousand cells lies in the greatly expanded potential for discovery of novel genetic alterations limited to small clonal patches in tumors, or present in small preneoplastic lesions, as demonstrated here by the detection of the subchromosomal 5q deletion in a FAP-derived adenomatous polyp by combined LCM-array CGH analysis. These features will facilitate the identification of novel oncogenes or tumor suppressors mapping to regions of gene gain or loss (6,46). An additional application in medical genetics is the detection of copy-number changes in DNA derived from buccal swabs for diagnosis of congenital chromosomal abnormalities, such as microdeletions and duplications, and unbalanced chromosomal translocations.
Supplementary Material is available at NAR Online.
We thank Stephan White for kindly providing us with DNA from the XXX cell line. This work was supported by grants from the Dutch Cancer Society (2001-2482; J.C. and L.M.), the Center of Biomedical Genetics, the Netherlands (J.M.B.), and the Center of Medical System Biology (CMSB) established by the Netherlands Genomics Initiative/Netherlands Organisation for Scientific Research (NGI/NWO).