Data from five sets of experiments were used in the development of the normalization strategy (Table ). The first four datasets were generated from array CGH experiments performed using the SMRT (Sub Mega base Resolution Tiling) arrays. These arrays are tiling resolution BAC arrays with complete coverage of the human genome using 32,433 fingerprint-verified individually amplified BAC clones [4]. The experimental procedures for array CGH and generating spot images have been described previously [4]. The entire set of 32,433 solutions was spotted in triplicate onto two slides by a 4 × 12 pin arrayer. For the purpose of this study, only the data from the first array out of the two arrays were used.
Data description. In this table, the array data of this study are summarized.
The fifth dataset is a public dataset downloaded from the Stanford Microarray Database (http://smd.stanford.edu). This dataset was generated from array CGH experiments performed using human cDNA microarrays [12
The first dataset (self-self hybridization data) was derived from hybridization of the same DNA sample, i.e., normal male genomic DNA was used for both test and reference materials but labelled with different dyes. The four microarrays used in this CGH experiment are referred to as MM-1 to MM-4 in the following text.
The second dataset (hybridization data from replicate experiments) was derived from comparison of a tumor cell DNA sample with well characterized chromosomal aberrations (lung cancer cell line H526) [4] against normal male DNA. The 8 arrays used in this experiment are denoted H526-1 through H526-8.
The third dataset (hybridization data from male and female DNA mimicking single copy deletion) was derived from comparison of normal male DNA versus normal female DNA, using arrays named MF-1 and MF-2.
The fourth dataset (hybridization data from samples mimicking heterogeneous cell populations) was derived from a series of array CGH experiments in which the samples to be compared were mixtures of male and female DNA. These mixtures alter X chromosome dosage and so mimic tumor samples with varying levels of normal cell contamination. Precise proportions of DNA were mixed to simulate increasing levels of heterogeneity as previously described [5]. Arrays T1 compared male DNA against female DNA, generating a 1:2 ratio for X chromosome sites and mimicking a single copy deletion. Contamination from normal cells was then simulated by spiking varying amounts of female DNA into the male DNA sample. Arrays T6 compared a 50/50 mixture of male and female DNA against a male DNA reference, generating a 3:2 ratio for X chromosome sites and mimicking single copy amplifications. Contamination from normal cells was simulated by spiking varying amounts of female DNA into the male/female DNA mixture.
The fifth dataset was derived from hybridization of genomic DNAs from cell lines containing varying numbers of X chromosomes to simulate varying levels of gene amplification and deletion for each of the X-chromosomal genes present in the cDNA array [12]. The five experiments comprising the fifth dataset are denoted X1 to X5.
After a thorough investigation of the systematic variations in the data from our array CGH experiments, four kinds of bias were identified. Below we explain each bias type.
Intensity bias is evident in the frequently used M-A plots, which plot the log ratio M = log2(Ir/Ig) = log2(Ir) - log2(Ig) against the mean of the log intensities A = 1/2(log2(Ir) + log2(Ig)), where Ir and Ig are the intensities of the cyanine-5 and cyanine-3 channels respectively. In our data, this bias predominantly appears as curvature in the low intensity end of the M-A plot.
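As a minimal sketch of the M and A definitions above, the following computes both quantities for a few spots. The function name `ma_values` and the intensity values are illustrative assumptions and are not taken from the study's data.

```python
import math

def ma_values(red, green):
    """Return (M, A) for one spot from its Cy5 (red) and Cy3 (green) intensities."""
    log_r = math.log2(red)
    log_g = math.log2(green)
    m = log_r - log_g          # log ratio M = log2(Ir/Ig)
    a = 0.5 * (log_r + log_g)  # mean log intensity A
    return m, a

# Hypothetical intensities for three spots (illustrative values only).
spots = [(1024.0, 512.0), (200.0, 200.0), (64.0, 256.0)]
ma = [ma_values(r, g) for r, g in spots]
```

A spot with equal intensities in both channels gives M = 0, so systematic departures of M from zero at low A are what the curvature in the M-A plot reflects.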
Spatial bias can be revealed by plotting each log ratio at the location of its spot on the microarray. We refer to this plot as the M-XY plot. The spatially smoothed M-XY plot reveals the general trend of log ratios against their locations on the array (Fig. ). For genomic loci distributed randomly across an array, this plot should be a flat plane.
A smoothed M-XY plot illustrating spatial bias. The plot displays log2 ratios at the corresponding spot locations on the microarray and is smoothed with a moving median filter.
Spatial heterogeneity has been attributed to the different print tips used in printing the targets on the arrays [6]. However, our data show that the spatial heterogeneity is not caused by print tip effects, because the spatial patterns are not organized in a block-wise fashion (as they would be if the bias were introduced by specific print tips). In fact, the patterns appear as a continuous function across the entire array.
Plate bias is a spatial pattern that can be seen in the data after the spatial gradient has been removed by the spatial normalization step. This pattern is repeated in all subgrids in the M-XY plot and corresponds to the plate groups (groups of spots on the microarray that are all printed from the same microplate).
Plate bias is evident when box-plots of log2 ratios from each plate group are compared. These box plots show a systematic difference among the log2 ratios of the different plate groups. The median log2 ratio of each plate group is expected to be near zero, i.e. positive and negative deviations should cancel out in each plate group, unless the copy numbers of the clones in a plate biologically differ between the test and the control samples. We do not believe this is the case in our experiments.
This bias arises because clones produced in different microplates may have experienced slightly different physical conditions during the polymerase chain reaction (PCR) or in subsequent purification steps [7]. This variation in the efficiency of spot solution synthesis appears to affect different plate groups, resulting in a plate level bias.
The measured intensity for each microarray spot contains a contribution from the background fluorescence within the spot. This introduces a bias in the ratios of the spots' intensities, which we refer to as background bias. In the M-A plot this bias appears as deviation from zero in the log2 ratios of the lower intensity spots.
Methods of bias removal
In order to remove these types of biases, we evaluated the following stepwise normalization procedure:
1. The spatial trend is estimated by computing, for each spot on the array, the median of log2 ratios for the spots within a spatial neighbourhood window of size 11 rows by 11 columns centred on that spot. The spatial bias that is estimated for each spot in this way is then subtracted from the log2 ratio of that spot. This step is referred to as "Spatial" normalization.
2. The plate bias is removed by calculating the median of the log2 ratios for all spots in the same plate group and subtracting it from the log2 ratios for all those spots. This step is referred to as "Plate" normalization.
3. The intensity bias is estimated using robust LOWESS curve fitting [8]. Because the bias is assumed to be multiplicative (and therefore additive on the log scale), the estimated bias is subtracted from the log ratios. This step is denoted as "Intensity LOWESS" normalization.
4. To remove the background bias, one of the following two different approaches is usually taken: either the estimate of the background intensity is subtracted from the estimated foreground intensity of each spot before taking the ratios, or it is not subtracted. In the latter case, the introduced bias is dealt with by treating it as intensity dependent bias. We evaluated both of these approaches in our experiments (see below).
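The first two steps of the procedure above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the function names are hypothetical, the moving-median window is simply clipped at the array edges (the text does not say how edges were handled), missing spots are not handled, and the LOWESS step is omitted.

```python
from statistics import median

def spatial_normalize(m, window=11):
    """Subtract a moving-median estimate of the spatial trend from a 2-D grid of log2 ratios."""
    half = window // 2
    n_rows, n_cols = len(m), len(m[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            # Neighbourhood centred on (i, j); clipped at the edges (an assumption).
            neigh = [m[r][c]
                     for r in range(max(0, i - half), min(n_rows, i + half + 1))
                     for c in range(max(0, j - half), min(n_cols, j + half + 1))]
            out[i][j] = m[i][j] - median(neigh)
    return out

def plate_normalize(m, plate):
    """Subtract each plate group's median log2 ratio from the spots in that group."""
    groups = {}
    for i, row in enumerate(m):
        for j, value in enumerate(row):
            groups.setdefault(plate[i][j], []).append(value)
    med = {p: median(values) for p, values in groups.items()}
    return [[value - med[plate[i][j]] for j, value in enumerate(row)]
            for i, row in enumerate(m)]

# Toy data (illustrative only): a constant offset is removed entirely by the spatial step.
grid = [[0.3, 0.3, 0.3], [0.3, 0.3, 0.3]]
flat = spatial_normalize(grid, window=3)

# Two hypothetical plate groups with different medians.
ratios = [[1.0, 1.0], [3.0, 5.0]]
plates = [["P1", "P1"], ["P2", "P2"]]
centred = plate_normalize(ratios, plates)
```

Using the median rather than the mean in both steps keeps the estimates insensitive to the minority of spots that carry genuine copy number changes.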
Below we show that the above stepwise procedure is effective in removing the mentioned types of systematic variations. We demonstrate the efficacy of our procedure by comparing several quantitative characteristics of data normalized by our proposed strategy to those of non-normalized data and data normalized by other techniques listed in Table .
Table 2 Summary of normalization methods. Each of the normalization methods in this table will be denoted by its number throughout the text. For full description of methods refer to "Methods of bias removal" section in Results and Discussion and the "Normalization (more ...)
Normalization of self-self array CGH data
The self-self experiments (arrays MM-1 through MM-4) were used to study the effect of normalization on removing the bias from the data and increasing the accuracy of the measurements. The 19 methods of normalization listed in Table were evaluated on the data obtained from these arrays.
Since the same male genomic DNA serves as both sample and reference DNA, the copy numbers detected in both the Cyanine-3 and Cyanine-5 channels are expected to be the same at all loci, resulting in a zero theoretical value for the log2 ratio of intensities at all spots on the array. The effects of normalization on removing the bias were examined by calculating the standard deviation (s.d.) of the log2 ratios for each array in the experiment, evaluating each of the 19 methods listed in Table . Then all 19 standard deviations were scaled against the standard deviation of the raw ratios before normalization (i.e. against the s.d. value from the first method of Table ). For each normalization method, the scaled s.d. values were then averaged across the four arrays. Figure shows these average standard deviations.
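The scaling and averaging described above can be sketched as follows. The dictionary layout, the key `"raw"` for the non-normalized ratios, and the toy values are assumptions made for illustration only.

```python
from statistics import mean, stdev

def relative_sd(ratios_by_method, raw_key="raw"):
    """For one array: s.d. of log2 ratios under each method, divided by the s.d. of the raw ratios."""
    raw_sd = stdev(ratios_by_method[raw_key])
    return {name: stdev(values) / raw_sd for name, values in ratios_by_method.items()}

def average_relative_sd(arrays, raw_key="raw"):
    """Average the scaled s.d. of each method across arrays."""
    per_array = [relative_sd(a, raw_key) for a in arrays]
    return {name: mean(r[name] for r in per_array) for name in per_array[0]}

# Two hypothetical self-self arrays, each with raw and stepwise-normalized log2 ratios.
arrays = [
    {"raw": [-2.0, 0.0, 2.0], "stepwise": [-1.0, 0.0, 1.0]},
    {"raw": [-4.0, 0.0, 4.0], "stepwise": [-1.0, 0.0, 1.0]},
]
summary = average_relative_sd(arrays)
```

In this toy case the raw data score 1.0 by construction and the normalized data score 0.375, the mean of 0.5 and 0.25. Scaling before averaging prevents a single noisy array from dominating the comparison between methods.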
Figure 2 Normalization of self-self hybridization data. Relative standard deviation (s.d.) of log2 ratios averaged across arrays MM-1 through MM-4 using all data points are shown in blue. The repeated analysis of relative s.d. after removal of the weakest 10% (more ...)
The three different window sizes of 10%, 25% and 40% of the data points, used for LOWESS intensity normalization (methods 4-6 in Table ) did not have a significant effect on the effectiveness of normalization.
Among the 12 normalization methods that are performed on the ratios of background subtracted intensities, the stepwise strategy (method 12) results in the lowest s.d. for all four arrays. Likewise, among the 7 normalization methods that are performed on the ratios of non-background subtracted intensities, the stepwise strategy (method 19) results in the smallest s.d.
When the three-step proposed normalization is performed on the ratios of non-background subtracted intensities, it yields better performance, in terms of reducing the s.d. of log2 ratios, than when it is applied to the ratios of background-subtracted intensities.
To further explore the effect of the background intensities, the standard deviations were recalculated for these four arrays with the lowest intensity spots removed from each data set. On the reduced datasets, the difference between the s.d. of the normalized ratios with and without background subtraction became smaller. As an example, the new s.d. values when 10% of the lowest intensity spots are removed are plotted in Fig. . This suggests that subtracting background increases the variability of the ratios of lower intensity spots, while the variability of higher intensity spots is not affected much by subtracting or not subtracting the background.
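A minimal sketch of the filtering step is given below. It assumes that spot intensity is measured by the mean log intensity A, which the text does not state explicitly; the function name and the values are hypothetical.

```python
def drop_weakest(m_values, a_values, fraction=0.10):
    """Remove the given fraction of spots with the lowest mean log intensity A."""
    n_drop = int(len(a_values) * fraction)
    # Spot indices ordered from weakest to strongest intensity.
    order = sorted(range(len(a_values)), key=lambda k: a_values[k])
    keep = sorted(order[n_drop:])  # restore the original spot order
    return [m_values[k] for k in keep]

# Ten hypothetical spots; the first one is both the weakest and the most deviant.
m = [0.9, 0.1, -0.1, 0.0, 0.1, -0.2, 0.05, 0.0, -0.05, 0.1]
a = [6.0, 9.0, 10.0, 11.0, 9.5, 12.0, 10.5, 13.0, 9.8, 11.5]
trimmed = drop_weakest(m, a)
```

The standard deviation of `trimmed` can then be compared with that of `m` in the same way as for the full datasets.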
Normalization of hybridization data from replicate experiments
In order to see how normalization affects the consistency of the data from replicate experiments, 8 replicate experiments were performed. H526-1 through H526-8 represent independent array CGH experiments using the same source of sample DNA (isolated from the well studied lung cancer cell line H526).
The standard deviations of the log2 ratios of the same spot across the 8 replicate arrays were calculated and averaged across all the spots for each normalization method. The results are shown in Fig. . The standard deviation measure attains its smallest value after method 12 or 19 is performed on the data. When the three-step normalization is performed on the ratios of non-background subtracted data (method 19), its performance is slightly better than when it is performed on ratios of background subtracted intensities (method 12).
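The replicate consistency measure described above can be sketched as follows, assuming each replicate array is stored as a list of log2 ratios in the same spot order. The function name and the toy values are illustrative.

```python
from statistics import mean, stdev

def mean_spot_sd(replicates):
    """Average, over spots, of the s.d. of each spot's log2 ratio across replicate arrays."""
    per_spot = zip(*replicates)  # one tuple of replicate values per spot
    return mean(stdev(values) for values in per_spot)

# Three hypothetical replicates of two spots.
replicates = [[0.0, 1.0], [0.0, 2.0], [0.0, 3.0]]
score = mean_spot_sd(replicates)
```

Here the first spot is perfectly reproducible (s.d. 0) and the second has s.d. 1, so `score` is 0.5. A lower value indicates better agreement between replicates.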
Figure 3 Normalization of hybridization data from replicate experiments. 8 replicate array CGH experiments were done comparing sample DNA from H526 cell line and the reference normal male genomic DNA. A. Graph shows the average of the standard deviations of log (more ...)
The Pearson correlation coefficient was calculated for the data from each pair of the replicate arrays, with 28 possible pairings. The average of the 28 correlation coefficients for each single method was then calculated (Fig. ).
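The pairwise calculation can be sketched as below. The helper names and the toy replicates are assumptions; with 8 arrays, `itertools.combinations` yields the 28 pairings mentioned in the text.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mean_pairwise_correlation(replicates):
    """Average Pearson correlation over all unordered pairs of replicate arrays."""
    return mean(pearson(x, y) for x, y in combinations(replicates, 2))

# Three hypothetical replicates that are exact linear functions of one another.
replicates = [[0.0, 1.0, 2.0], [0.0, 2.0, 4.0], [1.0, 2.0, 3.0]]
avg_r = mean_pairwise_correlation(replicates)
n_pairs_for_8 = len(list(combinations(range(8), 2)))
```

Because Pearson correlation is invariant to scale and offset, it rewards replicates that share the same profile shape even if their absolute log ratios differ.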
The intraclass correlation coefficient (ICC) was calculated for the set of data obtained from the 8 replicate arrays normalized using each of the methods described above. The results are also summarized in Fig. . The ICC and Pearson correlation coefficient show similar results across the methods. Both attain their highest values after the three-step normalization method. This applies to both the ratios of non-background subtracted intensities and the ratios of background subtracted intensities. Both coefficients are slightly higher when background subtraction is not performed on the spot intensity measures.
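The text does not say which form of the ICC was used. As one possible illustration, the sketch below computes the one-way random-effects ICC, often written ICC(1,1), treating spots as targets and replicate arrays as repeated measurements. The choice of this form, the function name, and the data are all assumptions.

```python
from statistics import mean

def icc_oneway(replicates):
    """One-way random-effects ICC(1,1) for replicate arrays listed in the same spot order."""
    spots = list(zip(*replicates))        # one tuple of k measurements per spot
    n, k = len(spots), len(replicates)
    grand = mean(v for s in spots for v in s)
    spot_means = [mean(s) for s in spots]
    # Between-spot and within-spot mean squares from a one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in spot_means) / (n - 1)
    msw = sum((v - m) ** 2 for s, m in zip(spots, spot_means) for v in s) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Identical hypothetical replicates give perfect agreement.
perfect = icc_oneway([[0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]])
```

Unlike the Pearson coefficient, this form of the ICC penalizes offsets between replicates, since it measures agreement in absolute value rather than in shape alone.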
Normalization of hybridization data from male and female DNA
To evaluate the effect of normalization on improving detection of single copy loss, two array CGH experiments were conducted comparing male (XY) genomic DNA against female (XX) genomic DNA. The copy numbers of autosomal loci (clones on chromosome 1 through 22) are equal, while the X loci exhibit a 1:2 ratio, simulating a single copy loss.
The normalization methods described above were applied to the data obtained from these two experiments. To determine which method results in the best separation of clones with normal copy number from those with a single copy loss, a two-sample two-tailed T-test was performed on the data from each array normalized by each method. The T-test evaluates the difference between the means of two groups of log ratios. The first group consists of log ratios for clones from chromosomes 1 through 22 and the second group consists of log ratios for clones from chromosome X. The value of the T-statistic is shown in Fig. for both arrays and for each normalization method. A larger value for the T-statistic indicates better separation between the means of the two samples.
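A sketch of the statistic is shown below. The text does not specify whether equal variances were assumed, so this example uses Welch's unequal-variance form as an assumption; the function name and values are illustrative.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Two-sample t statistic without assuming equal variances (Welch's form)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

# Hypothetical log2 ratios: autosomal clones near 0, X clones near -1 (1:2 ratio).
autosomal = [0.1, -0.1, 0.0, 0.05, -0.05]
x_clones = [-0.9, -1.1, -1.0]
t_stat = welch_t(autosomal, x_clones)
```

Normalization that reduces within-group scatter without shrinking the difference in means lowers the denominator and therefore raises the magnitude of the statistic.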
Figure 4 Normalization of hybridization data from male and female DNA. For each of arrays MF-1 and MF-2, a T-test was performed on the two groups of log ratios, i.e. log ratios for the autosomal clones and those for the X chromosome clones. Values of T-statistic (more ...)
For the data from array MF-1, the largest T-statistic was obtained after our three-step normalization procedure was performed on the ratios of background subtracted intensities. For this array, the normalization methods performed on the ratios of the non-background subtracted intensities were not as effective.
For the MF-2 array data, the normalization methods do not significantly change the value of the T-statistic. The three-step normalization performed on the ratio of non-background subtracted intensities slightly increases the T-statistic. In fact, the correlation coefficient between the log ratios and the estimated intensity bias and the correlation coefficient between the log ratios and the estimated spatial bias were both quite low for this array compared to the other arrays (below 15%). The background intensities for this array were also quite low compared to the other arrays. This suggests that the T-statistic values did not change significantly after normalization because the data from this particular array did not have significant bias.
Normalization of hybridization data from samples mimicking heterogeneous cell populations and single copy alterations
Array CGH is often used to detect genetic alterations in tumor cells. However, tumours generally consist of heterogeneous cell populations including a variety of infiltrating non-cancerous cells. Contamination from normal cells may affect the ability to detect copy number aberrations. In the case of a single copy gain, contamination from diploid normal cells dampens the expected 3:2 signal ratio produced by the single copy gained sequences in the tumour cells due to the averaging effect in the mixed cell population. In the case of a single copy loss, normal cell contamination increases the average copy number, deviating from the expected 1:2 ratio. In a previous study, this effect on detection sensitivity was evaluated by mixing male (XY) and female (XX) DNA in precise proportions to mimic 0%, 15%, 30%, 50% and 75% normal cell contamination affecting the dosage of the X chromosome [5
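The dampening effect described above can be illustrated with a simple idealized mixture model, in which the measured copy number is the weighted average of the tumour and normal copy numbers. The function and parameter names are hypothetical, and the model ignores experimental effects such as ratio compression.

```python
import math

def expected_log2_ratio(tumour_copies, normal_fraction, normal_copies=2, reference_copies=2):
    """Expected log2 ratio for a locus in a sample contaminated with normal cells."""
    mixed = (1.0 - normal_fraction) * tumour_copies + normal_fraction * normal_copies
    return math.log2(mixed / reference_copies)

# Contamination levels used in the titration series.
levels = [0.0, 0.15, 0.30, 0.50, 0.75]
loss = [expected_log2_ratio(1, c) for c in levels]  # single copy loss
gain = [expected_log2_ratio(3, c) for c in levels]  # single copy gain
```

With no contamination, the loss gives a log2 ratio of -1 (the 1:2 ratio) and the gain gives about 0.585 (the 3:2 ratio). At 50% contamination these shrink to about -0.415 and 0.322, which is why detection becomes harder as the proportion of normal cells rises.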
In this study, we wish to determine how our three-step normalization method affects the estimated log2 ratios for the clones with single copy number changes and increasing levels of heterogeneity. The stepwise normalization method was applied to the data from the titration series (arrays T1-T10) that simulated different contamination levels for both single copy gains and losses (Fig. ).
Figure 5 Normalization of hybridization data from samples mimicking heterogeneous cell populations and single copy alterations. Array CGH data were generated for samples mimicking single copy loss (deletion) or single copy gain (amplification) with contamination (more ...)
We compared the data obtained after performing the three-step normalization procedure to data obtained after performing global median normalization on both the ratios of background subtracted intensities and the ratios of non-background subtracted intensities. For each array, a T-test was performed on the two groups of log ratios, i.e. log ratios for the autosomal clones and those for the X chromosome clones. T-values are shown in Fig. .
The T-statistic values are higher after normalization in all cases, which indicates that the separation of the two groups is increased and that the low-level copy number changes are preserved and even magnified. Comparing the T-statistic values for data with no normalization to the normalized results shows that normalization increases the sensitivity of detection of single copy number changes by up to a factor of 5. However, the T-statistic values are considerably lower for the ratios of non-background subtracted intensities than for the ratios of background subtracted intensities.
Functional normalization increases the separation between the distributions of the clones with normal and abnormal copy numbers, and this facilitates the analysis of heterogeneous samples. For example, after normalization, the T-statistic for array T9, which simulates a single copy amplification with 50% contamination, becomes quite close to the T-statistic of array T6, which simulates a case with no contamination.
Normalization of hybridization data from cDNA arrays simulating varying levels of gene amplification and deletion for X-chromosomal genes on the array
To evaluate the performance of the stepwise normalization strategy on hybridization data from cDNA arrays, we used public data from hybridization of genomic DNAs from cell lines containing varying numbers of X chromosomes that simulate varying levels of gene amplification and deletion for each of the X-chromosomal genes present on the array (arrays X1 to X5).
We compared the data obtained after performing the three-step normalization procedure to data obtained after performing global median normalization on both the ratios of background subtracted intensities and the ratios of non-background subtracted intensities. For each array, a T-test was performed on the two groups of log ratios, i.e. log ratios for the autosomal clones and those for the X chromosome clones. The T-statistic values are shown in Fig. .
Figure 6 Normalization of hybridization data from cDNA arrays. Array CGH data were generated for samples simulating varying levels of gene amplification and deletion for X-chromosomal genes on the array. Global median normalization (method 1), stepwise normalization (more ...)
The T-statistic values are higher after normalization in all cases. The increase in the T-statistic values may be interpreted as an increase in the separation between the distributions of the log2 ratios of the two groups, normal and altered genes.
Visual comparison of the genomic profiles
The use of the genomic location of the clones allows us to compare profiles before and after normalization and to use the visual correlation between observed and expected profiles as a measure of success. (This is not possible when analyzing gene expression array data.)
In Figures and , chromosome plots of the data from two of the replicate H526 arrays, generated by SeeGH software [10], are shown. Chromosome plots show the log2 of ratios for each of the target DNA clones, as a function of the location of the clone in the chromosome. Figure shows the chromosome plots for chromosome 1 of arrays H526-1. Figure shows the chromosome plots for chromosome 2 of arrays H526-1. For each array and each chromosome the log2 ratios are shown after global median normalization and after the three-step normalization. The variability of log2 ratios in array H526-5 is much higher than that of array H526-1. For the H526 genome, the regions of copy number changes are known [4]. As the figures show, for data from array H526-5 (low quality data), normalization reduced the unwanted variations. Consequently, after normalization the altered regions are clearer. An important point to note for data from array H526-1 (high quality data), where the variation of the log2 ratios is quite low even before normalization, is that normalization did not remove the true biological variation present in the sample.
Figure 7 Chromosome plots before and after normalization. Plot of log2 signal ratios for clones (from chromosome 1 in A and chromosome 2 in B) versus their location across the chromosome. The profiles from left to right are: H526-1 data with global median normalization (more ...)
Whether to subtract background intensities has been an open question in microarray data analysis. Some groups use the raw intensities while others use the background subtracted intensities. In our experiments, not subtracting the background resulted in slightly less variability and better repeatability of the ratios. However, knowing the true ratios in the array CGH experiments enabled us to examine how subtracting or not subtracting the background intensities affects the ability to detect copy number changes. For the array CGH data from SMRT arrays, the ability to detect copy number changes was degraded when using the ratios of non-background subtracted intensities, compared to using the ratios of background subtracted intensities. For the array CGH data from cDNA arrays, however, the ability to detect copy number changes increased when using the ratios of non-background subtracted intensities. We believe this inconsistency arises because different methods of background estimation were used in the two cases and because the average level of background intensity differs between the arrays. The data from SMRT arrays, along with the image analysis methods used, suggest that background subtraction improves normalization and should be performed for these data.