Cancer Inform. 2006; 2: 48–58.
Published online Feb 10, 2007.
PMCID: PMC2067254
NIHMSID: NIHMS22991
Computational Methods for the Analysis of Array Comparative Genomic Hybridization
Raj Chari,1,2 William W. Lockwood,1,2 and Wan L. Lam1
1Cancer Genetics and Developmental Biology, British Columbia Cancer Research Centre, Vancouver BC, Canada V5Z 1L3;
2These authors contributed equally to this work
Correspondence: Raj Chari, BC Cancer Research Centre, 675 West 10th Avenue, Vancouver, BC, V5Z 1L3, Canada. Tel: + 1 604-675-8111; Fax: + 1 604-675-8232; Email: rchari/at/bccrc.ca
Abstract
Array comparative genomic hybridization (array CGH) is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has led to a rapid accumulation of high throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development.
Keywords: array CGH, microarray, cancer genome, software, bioinformatics, alteration detection
Segmental deletion and duplication of chromosomal regions have been associated with both constitutional diseases and somatic alterations in cancer (Inazawa et al. 2004; Lockwood et al. 2006; Oostlander et al. 2004; Vissers et al. 2005). Recent studies have demonstrated that large scale copy number variations exist in the human population (Conrad et al. 2006; de Vries et al. 2005; Hinds et al. 2006; Iafrate et al. 2004; McCarroll et al. 2006; Sebat et al. 2004; Tuzun et al. 2005). Array comparative genomic hybridization (array CGH) is a method designed for identifying genomic regions with copy number aberration (Pinkel et al. 1998; Solinas-Toldo et al. 1997). In this method, DNA from the reference and test genomes is differentially labeled with fluorescent dyes and competitively hybridized to DNA targets arrayed on a glass slide (Fig. 1). The hybridized slide is then scanned and the resulting signal intensity ratio at each DNA target reflects the copy number status of the DNA segment. By referring the segment to its corresponding position on the human genome map, the genes affected by copy number alteration can be identified (Fiegler et al. 2003; Ishkanian et al. 2004; Snijders et al. 2001).
Numerous advances in array CGH technology have been made since its development in the mid 1990s, with increased genome coverage and target density improving the resolution and sensitivity of detection. The majority of array CGH platforms use either oligonucleotide (oligo) or large insert clone (LIC) DNA targets (Davies et al. 2005). Oligos are short DNA fragments of approximately 21–60 nucleotides in length, whereas LICs are typically bacterial artificial chromosome (BAC) clones, which are ~100 kb in size. Historically, arrays were designed to cover specific chromosomes (Buckley et al. 2002; Buckley et al. 2005), chromosome arms (Coe et al. 2005; Garnis et al. 2003; Henderson et al. 2005) or selected regions of the genome implicated in disease (Albertson et al. 2000; Schwaenen et al. 2004). In contrast, genome wide arrays that sample the copy number status of loci at megabase intervals have facilitated rapid surveys for regions of loss and gain (Fiegler et al. 2003; Greshock et al. 2004; Snijders et al. 2001). Alternatively, cDNA microarrays, initially designed for gene expression profiling, have been used to assess the copy number status of coding regions (Pollack et al. 1999; Squire et al. 2003). The development of high density arrays consisting of tens of thousands of DNA targets spanning the entire human genome has enabled precision mapping of the boundaries of genetic alterations throughout the genome in a single experiment (Barrett et al. 2004; Bignell et al. 2004; Ishkanian et al. 2004; Selzer et al. 2005; Zhao et al. 2004).
Figure 1.
Generation of array comparative genomic hybridization profiles. Tumor and normal reference DNA are differentially labeled with cyanine-5 and cyanine-3 respectively and competitively hybridized to a genomic microarray. The array consists of DNA targets (more ...)
The production and use of these high density arrays rely not only on technical precision in array synthesis but also on computational platforms tailored to the imaging, mapping, and analysis of replica sets of tens of thousands of DNA targets with spot signals in a narrow dynamic range. This article describes the principles behind visualization and analysis of whole genome array CGH data and reviews the software currently available.
Array CGH software applications can be classified according to three general functions: data preprocessing, visualization, and analysis (Fig. 2). Some software programs are specific to a particular function while others perform multiple tasks. The following section explains the principles and describes the methods for performing these functions.
Figure 2.
Principles of array CGH analysis. The process is grouped into three general functions: data preprocessing, visualization, and detection of segmental alterations, in no particular order. Methodologies for each function are indicated in a horizontal manner. (more ...)
Data pre-processing
Upon completion of an array CGH experiment, the microarray slide is scanned in two channels to generate high resolution fluorescence images corresponding to the two cyanine dyes. The localization of spots on the array is a semi-automatic process supported by “spot finding” functions, available in most microarray scanner software packages and custom packages (Jain et al. 2002). The signal intensity for each spot is quantified for each channel. Image normalization is critical to the sensitive detection of copy number alterations because the expected shifts are small: a single copy loss reduces the signal by only 50%, resulting in a 1:2 signal ratio, and a single copy gain results in a 3:2 ratio. (These shifts in ratios are subtle compared to gene expression changes.) In tumor samples, these ratios are further dampened by tissue heterogeneity, as the sample contains a mixed population of normal and cancer cells (Garnis et al. 2005). Therefore, before the signal ratio can be deduced, the intensities of the two images need to be balanced and systematic biases influencing measurements need to be removed (Fig. 3). Intensity bias (different intensities for the dye channels), spatial bias (the location of the DNA target on the array), plate bias (the source plate of the spotted target DNA) and background bias (the contribution of background fluorescence to spot signal intensity) are factors that have been shown to affect signal intensity ratio in high density array CGH experiments (Khojasteh et al. 2005).
Figure 3.
Normalization of array CGH data. A: A plot illustrating spatial bias across the microarray. B: The copy number profile of a chromosome before and after normalization. The removal of systematic biases improves the conformity of the profile.
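The ratio arithmetic described in this section can be illustrated with a short sketch. It assumes a diploid reference and a simple linear mixing of tumor and normal cells; the function name `expected_log2_ratio` and the `tumor_fraction` parameter are illustrative and do not come from any of the packages reviewed here.

```python
import math

def expected_log2_ratio(tumor_copies, tumor_fraction=1.0, normal_copies=2):
    """Expected log2(test/reference) ratio at a locus in a mixed sample.

    Assumes a linear mixture: a fraction `tumor_fraction` of cells carries
    `tumor_copies` copies, the remainder carries `normal_copies` copies.
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    # Average copy number over tumor cells and contaminating normal cells.
    mixed = tumor_fraction * tumor_copies + (1.0 - tumor_fraction) * normal_copies
    if mixed <= 0:
        raise ValueError("zero copies in a pure sample has no finite log2 ratio")
    return math.log2(mixed / normal_copies)

pure_gain = expected_log2_ratio(3)                       # 3:2 ratio
pure_loss = expected_log2_ratio(1)                       # 1:2 ratio
mixed_loss = expected_log2_ratio(1, tumor_fraction=0.5)  # half normal cells
print(round(pure_gain, 3), pure_loss, round(mixed_loss, 3))
```

Under these assumptions a single copy loss in a sample containing 50% normal cells yields a 3:4 ratio rather than 1:2, which is why thresholds based on the theoretical shift tend to be too strict for tumor material.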
Data visualization
As replica spots are necessary to ensure experimental precision, arrays often contain multiple measurements of a DNA target. Therefore, basic operations are applied to determine the mean or median ratio of the replicates, and the standard deviation is used for quality assessment and filtering.
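As a minimal sketch of this step, the following summarizes the replica spots of one DNA target and flags the target when the replicates disagree. The cut-off `max_sd=0.1` is an arbitrary illustrative value, not a recommendation from the text.

```python
from statistics import mean, median, stdev

def summarize_replicates(ratios, max_sd=0.1, use_median=False):
    """Return (summary_ratio, sd, passed) for the replica spots of one target."""
    if len(ratios) < 2:
        raise ValueError("need at least two replica spots to estimate a standard deviation")
    sd = stdev(ratios)  # sample standard deviation across replicates
    summary = median(ratios) if use_median else mean(ratios)
    return summary, sd, sd <= max_sd

good = summarize_replicates([0.52, 0.55, 0.58])    # consistent triplicate
noisy = summarize_replicates([0.10, 0.60, -0.40])  # replicates disagree
print(good, noisy)
```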
To display spot data in the context of genomic position, the log2 signal ratio for each spot is plotted against its corresponding location in the human genome. Graphical representations and interactive displays are the two main approaches used in visualization. Graphical representations are XY scatter plots, with the X axis representing the array elements in ordered chromosomal position (typically, the chromosomes are arranged in series) and the Y axis representing the corresponding log2 signal ratio. However, with arrays containing tens of thousands of DNA elements (high density arrays), the data points are too numerous to resolve individually on this scale (Fig. 4a). Interactive displays are designed for high density arrays, allowing the sequential magnification of selected chromosomes and chromosome segments to visualize individual data points. Typically, ratio data are displayed in parallel to a chromosome ideogram. Advanced visualization software provides practical features such as displaying the actual segment length represented by the spotted element (as opposed to non-overlapping single points), displaying aligned gene annotation (gene track), and providing immediate linkage to public databases such as Online Mendelian Inheritance in Man (OMIM), NCBI Entrez and the UCSC Genome Browser (Fig. 4b).
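Arranging chromosomes in series amounts to adding a cumulative offset to each within-chromosome position. The sketch below shows one way to compute such X coordinates; the chromosome lengths are toy values chosen for readability, and in practice they would be taken from the genome build used to map the array.

```python
def genome_positions(spots, chrom_lengths):
    """Map (chromosome, position) pairs onto a single axis with chromosomes in series."""
    offsets, running = {}, 0
    for chrom, length in chrom_lengths:  # list order defines the plotting order
        offsets[chrom] = running
        running += length
    return [offsets[chrom] + pos for chrom, pos in spots]

# Toy chromosome lengths (hypothetical, not real base-pair lengths).
toy_lengths = [("chr1", 1000), ("chr2", 800), ("chr3", 600)]
toy_spots = [("chr1", 100), ("chr2", 50), ("chr3", 10)]
x = genome_positions(toy_spots, toy_lengths)
print(x)
```

The resulting values can be used directly as the X axis of a scatter plot, with the log2 signal ratios on the Y axis.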
Figure 4.
Visualization of array CGH data. A: A graphical representation of array CGH data. The chromosomes are alternately labeled in green and black. In this graph, log2 signal ratio for each clone is plotted against its chromosomal position ordered in series. (more ...)
Detection of segmental alterations
A variety of methods are used in the identification of segmental copy number alterations. Here we describe the principles behind the commonly used analytical approaches (Fig. 2).
Direct thresholding
One of the simplest approaches to data analysis is direct thresholding at a particular ratio. This methodology was very commonly used in early array CGH publications (Albertson et al. 2000; Garnis et al. 2004; Veltman et al. 2003). The threshold value can be defined in different ways. Ratio thresholds can signify gains and losses based on the theoretical ratio of a single copy gain (3:2, log2 ratio of 0.585) and a single copy loss (1:2, log2 ratio of -1), although the shift actually observed is typically substantially smaller than the theoretical one. Another approach relies on a sex mismatched experiment, using the signal ratio of the X chromosome to define the ratio for a single copy change (Fig. 5a). The drawback to this approach is that the ratio shift dampened by tissue heterogeneity is not reflected in the sex mismatch, as both cancerous and non-cancerous cells in a sample have the same number of X chromosomes. Spectral karyotyping (SKY) or fluorescence in situ hybridization (FISH) can be used to calibrate the relationship between the copy number and the amplitude of the signal shift.
Figure 5.
Analysis of array CGH data. Three of the methods described in the text for the detection of segmental alterations are illustrated. A) Direct thresholding, gains and losses are based on a theoretical ratio, in this case the indicated purple line, using (more ...)
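Direct thresholding can be sketched in a few lines. The example below derives thresholds from the theoretical single copy ratios and then relaxes them by an arbitrary factor of one half to reflect dampened signals; that factor, and the function name `call_by_threshold`, are assumptions made for illustration only.

```python
import math

def call_by_threshold(log2_ratios, gain_threshold, loss_threshold):
    """Label each spot 'gain', 'loss' or 'retention' by direct thresholding."""
    calls = []
    for value in log2_ratios:
        if value >= gain_threshold:
            calls.append("gain")
        elif value <= loss_threshold:
            calls.append("loss")
        else:
            calls.append("retention")
    return calls

theoretical_gain = math.log2(3 / 2)  # about 0.585
theoretical_loss = math.log2(1 / 2)  # exactly -1

# Hypothetical choice: call at half of the theoretical shift.
profile = [0.02, 0.35, -0.05, -0.60, 0.10]
calls = call_by_threshold(profile, 0.5 * theoretical_gain, 0.5 * theoretical_loss)
print(calls)
```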
Moving average based thresholding
In this method, thresholding is applied to multiple consecutive data points, rather than individual ones. This involves calculating the average across a sliding window of data points (e.g. 30 kb windows sliding at 10 kb intervals) (Fig. 5b). As such, larger-sized windows which incorporate more adjacent points would produce a smoother curve, but at a lower detection sensitivity. Conversely, smaller windows will detect the smaller alterations, but may introduce a higher number of false positives.
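The effect of averaging before thresholding can be seen in a small sketch. For simplicity the window here is defined as a number of consecutive data points rather than a genomic distance in kb, and the threshold of 0.4 is an arbitrary illustrative value.

```python
def moving_average(values, window):
    """Mean of each run of `window` consecutive data points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide the window by one point: add the new value, drop the oldest.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means

# One isolated high spot (index 2) and one run of four raised spots (indices 5-8).
profile = [0.0, 0.1, 0.9, 0.0, -0.1, 0.5, 0.6, 0.5, 0.6, 0.0]
smoothed = moving_average(profile, window=3)
gained_windows = [i for i, m in enumerate(smoothed) if m >= 0.4]
print(gained_windows)
```

In this toy profile the isolated spot at index 2 would be called by direct thresholding at 0.4, but no window containing it reaches the threshold, whereas the windows starting at indices 5 and 6, which lie entirely within the raised run, do. This is the trade-off between smoothing and sensitivity to small alterations described above.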
K-means clustering
K-means clustering involves the a priori determination of the number of clusters, k, and the assignment of objects to clusters such that a given quantity is minimized relative to the centroids of the clusters (MacQueen, 1967). Variants of K-means clustering differ in the method of measuring distance and in the quantity to be minimized. For example, one quantity to minimize is the maximum distance of an object to its centroid using a distance measure such as the Euclidean distance (Autio et al. 2003). In terms of array CGH, three centroids are normally used, to represent “gain”, “loss” and “retention” respectively. However, the number of centroids may be increased to reflect multiple levels of gains and losses.
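A three-centroid clustering of log2 ratios can be sketched with the standard Lloyd iteration, which minimizes the within-cluster sum of squared distances. This is a generic illustration, not the variant used by any particular package, and the starting centroids are assumed values.

```python
def kmeans_1d(values, centroids, iterations=100):
    """Lloyd's algorithm on one-dimensional data; returns (centroids, labels)."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by absolute (1-D Euclidean) distance.
        labels = [min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
                  for v in values]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [v for v, lab in zip(values, labels) if lab == k]
            updated.append(sum(members) / len(members) if members else old)
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, labels

ratios = [-0.52, -0.48, -0.50, 0.01, -0.02, 0.03, 0.00, 0.41, 0.45, 0.38]
# Assumed starting centroids for loss, retention and gain.
centroids, labels = kmeans_1d(ratios, centroids=[-0.5, 0.0, 0.5])
names = ["loss", "retention", "gain"]
calls = [names[k] for k in labels]
print(calls)
```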
Hidden Markov model
Briefly, a hidden Markov model (HMM) is a statistical model of a system that moves between states which cannot be observed directly; the sequence of hidden states is inferred from the data that are observed. An HMM can be described by three main components: a set of probabilities associated with transitions between all states (Λ), a set of probability distributions of the observations associated with each state (B), and a distribution of initial states (π). Commonly, any HMM with a discrete, finite number of states can be defined as λ = (Λ, B, π) (Rabiner, 1989).
In the application of HMMs to array CGH analysis, a simple version of this approach was utilized in which the hidden states represented the states of copy number change: gain, loss and retention (de Vries et al. 2005). This method has also been used to extrapolate levels of copy number while accounting for factors such as tissue heterogeneity, which dampens the expected ratio change for a single copy gain or loss (Fridlyand et al. 2004). In addition to its application to BAC based microarrays, this approach has been employed in the context of oligonucleotide platforms (Iafrate et al. 2004; Nannya et al. 2005; Zhao et al. 2004).
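To make the three-state idea concrete, the sketch below decodes the most likely sequence of loss, retention and gain states with the Viterbi algorithm. It is a generic textbook formulation, not the implementation of any cited study: the Gaussian emissions, the state means, the shared standard deviation `sd`, the uniform initial distribution and the self-transition probability `stay_prob` are all assumed values that a real analysis would estimate from the data.

```python
import math

def viterbi(observations, means, sd, stay_prob):
    """Most likely hidden-state path for a Gaussian-emission HMM (log space)."""
    n_states = len(means)
    switch_prob = (1.0 - stay_prob) / (n_states - 1)
    # Transition matrix: stay with stay_prob, otherwise switch uniformly.
    log_trans = [[math.log(stay_prob if i == j else switch_prob)
                  for j in range(n_states)] for i in range(n_states)]

    def log_emit(x, mu):
        return -0.5 * ((x - mu) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

    # Uniform initial state distribution.
    score = [math.log(1.0 / n_states) + log_emit(observations[0], m) for m in means]
    back = []
    for x in observations[1:]:
        prev, score, pointers = score, [], []
        for j in range(n_states):
            best = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][j] + log_emit(x, means[j]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# States: 0 = loss, 1 = retention, 2 = gain (assumed means in log2 ratio units).
observations = [0.05, -0.02, 0.40, 0.03, -0.01, 0.55, 0.48, 0.52, 0.01, -0.04]
path = viterbi(observations, means=[-0.5, 0.0, 0.5], sd=0.15, stay_prob=0.95)
print(path)
```

With these settings the single raised value at index 2 is absorbed into the retention state because two state changes cost more than one poorly fitting observation, while the run of three raised values is decoded as a gain.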
Circular binary segmentation
Circular binary segmentation (CBS) is a change-point based method which searches for points at which neighboring regions of DNA exhibit a statistically significant difference in copy number. By modifying the standard binary segmentation approach to a circular approach, this algorithm can detect an altered region that is flanked on both sides by regions of a different copy number level and therefore requires two breakpoints. This algorithm, implemented in the DNAcopy package, has been applied to test BAC array and representational oligonucleotide microarray analysis (ROMA) datasets (Olshen et al. 2004). The application of CBS to describe genetic alterations in myeloid sarcoma has been reported recently (Deeb et al. 2005).
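The two-breakpoint search at the core of the circular approach can be sketched as follows. This is a deliberately reduced illustration: it finds the single interval whose mean differs most from the rest of the chromosome, and it omits the noise variance estimate, the permutation-based significance test and the recursive application that a full segmentation method would include. The function name is hypothetical and is not part of the DNAcopy package.

```python
import math

def best_circular_split(values):
    """Return (start, end, statistic) for the interval values[start:end] whose
    mean differs most from the remaining points, scaled by the segment sizes."""
    n = len(values)
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)  # prefix sums give O(1) interval sums
    total = prefix[-1]
    best = (0, n, 0.0)
    for i in range(n):
        for j in range(i + 1, n + 1):
            inside = j - i
            outside = n - inside
            if outside == 0:
                continue  # the interval must leave some points outside
            inside_mean = (prefix[j] - prefix[i]) / inside
            outside_mean = (total - (prefix[j] - prefix[i])) / outside
            stat = abs(inside_mean - outside_mean) / math.sqrt(1.0 / inside + 1.0 / outside)
            if stat > best[2]:
                best = (i, j, stat)
    return best

profile = [0.0, 0.1, -0.1, 0.0, 0.6, 0.5, 0.7, 0.6, 0.0, -0.1, 0.1, 0.0]
start, end, stat = best_circular_split(profile)
print(start, end, round(stat, 2))
```

Because the interval may sit anywhere within the chromosome, one search yields both breakpoints of an interstitial alteration, which a single change-point search would have to find in two steps.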
Wavelet-based
Another approach for array CGH analysis revolves around the use of wavelets. Briefly, this is a spatially-adaptive and non-parametric approach used to denoise (smooth) and segment data. Furthermore, this method can handle small discrete alterations which appear as an abrupt aberration and deal with the inherent property of variable sized alterations with different magnitudes seen in array CGH data (Hsu et al. 2005). This approach has been implemented in a few different algorithms used to smooth and segment array CGH data (Hsu et al. 2005; Khojasteh et al. 2006).
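As a rough illustration of wavelet denoising, the sketch below applies a plain Haar transform, zeroes small detail coefficients (hard thresholding) and inverts the transform. It is not the algorithm of the cited studies; it assumes a profile whose length is a power of two and uses an arbitrary threshold.

```python
import math

def haar_denoise(values, threshold):
    """Denoise by hard-thresholding Haar detail coefficients (length must be 2**k)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    root2 = math.sqrt(2.0)
    approx, details = list(values), []
    # Forward transform: repeatedly split into pairwise averages and differences.
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / root2 for a, b in pairs])
        approx = [(a + b) / root2 for a, b in pairs]
    # Hard thresholding: small detail coefficients are treated as noise.
    details = [[d if abs(d) > threshold else 0.0 for d in level] for level in details]
    # Inverse transform, from the coarsest level back to the finest.
    for level in reversed(details):
        rebuilt = []
        for s, d in zip(approx, level):
            rebuilt.extend([(s + d) / root2, (s - d) / root2])
        approx = rebuilt
    return approx

noisy = [0.05, -0.05, 0.02, -0.02, 0.62, 0.58, 0.61, 0.59]
denoised = haar_denoise(noisy, threshold=0.1)
print([round(v, 3) for v in denoised])
```

The small fluctuations are removed while the abrupt step between the fourth and fifth points survives, since it is carried by one large coefficient at the coarsest level.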
Genetic local search
The genetic local search approach is an algorithm which tries to partition the data by placing a user-defined number of breakpoints across a particular chromosome. Breakpoints are placed in a random fashion and the algorithm iteratively tries to improve the location of the breakpoints such that the negative log-likelihood of the data and the penalty associated with too many breakpoints within a partition are minimized (Jong et al. 2004). Furthermore, the data becomes segmented and the values are “smoothed” such that they are the average of all the data points in that segment (Fig. 5c). This method, implemented in the aCGH-Smooth software package, has been used in the analysis of non-small cell lung cancer (NSCLC) cell lines (Garnis et al. 2006), small cell lung cancer (SCLC) cell lines (Coe et al. 2006), and oral squamous cell carcinoma (Baldwin et al. 2005).
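The objective and the smoothing step can be sketched as below. This illustration covers only a simple local search that shifts breakpoints one position at a time; it omits the genetic (population-based) component and is not the aCGH-Smooth implementation. The residual sum of squares stands in for a Gaussian negative log-likelihood up to constants, and the `penalty` value is assumed.

```python
def segment_bounds(n, breakpoints):
    """Turn breakpoint indices into half-open (start, end) segments."""
    edges = [0] + sorted(breakpoints) + [n]
    return list(zip(edges[:-1], edges[1:]))

def smooth(values, breakpoints):
    """Replace every value by the mean of its segment."""
    out = []
    for start, end in segment_bounds(len(values), breakpoints):
        seg_mean = sum(values[start:end]) / (end - start)
        out.extend([seg_mean] * (end - start))
    return out

def cost(values, breakpoints, penalty):
    """Residual sum of squares plus a penalty for every breakpoint."""
    fitted = smooth(values, breakpoints)
    rss = sum((v - f) ** 2 for v, f in zip(values, fitted))
    return rss + penalty * len(breakpoints)

def local_search(values, breakpoints, penalty):
    """Shift breakpoints by one position for as long as the cost decreases."""
    current = sorted(breakpoints)
    improved = True
    while improved:
        improved = False
        for idx in range(len(current)):
            for step in (-1, 1):
                trial = list(current)
                trial[idx] += step
                # Keep breakpoints inside the data, distinct and in order.
                valid = (0 < trial[idx] < len(values)
                         and trial == sorted(set(trial)))
                if valid and cost(values, trial, penalty) < cost(values, current, penalty):
                    current, improved = trial, True
    return current

profile = [0.0, 0.1, -0.1, 0.0, 0.6, 0.5, 0.7, 0.6, 0.0, -0.1, 0.1, 0.0]
breakpoints = local_search(profile, [2, 9], penalty=0.05)
smoothed = smooth(profile, breakpoints)
print(breakpoints, [round(v, 2) for v in smoothed])
```

Starting from the deliberately misplaced breakpoints 2 and 9, the search moves them to the edges of the raised region, and each segment is then replaced by its mean.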
False discovery rate analysis and validation of copy number alterations
It should be noted that any algorithm used for the detection of segmental alterations has an associated error rate, comprising both false positive and false negative calls. The algorithm may not be able to consistently identify and correct for intrinsic noise in the data arising from the technical and biological variance encountered in array CGH experiments (Ylstra et al. 2006). Complementary methods such as fluorescence in situ hybridization and quantitative PCR can provide independent confirmation of the CGH findings. Alternatively, detection of changes in the expression of genes within regions of alteration can provide support for biological significance.
Table 1 summarizes currently available array CGH software programs and compares the algorithms used in the detection of segmental copy number changes and the types of visualization available.
Table 1.
Software for analysis and visualization of array CGH data.
Typically, software programs are developed to support the analysis and/or visualization of specific array platforms, especially the commercially available platforms. For example, the Affymetrix Copy Number Analysis Tool and Nimblegen SignalMap have been developed by the respective companies for their manufactured arrays. In contrast, software applications developed by academic laboratories were generally designed to handle the primary array utilized by the research group and, after subsequent improvements, could handle data from other commonly used array platforms. The application SeeGH, for example, was initially developed to visualize and analyze BAC array CGH data, but newer versions of the application can accommodate data from oligonucleotide or cDNA platforms. Other programs such as ArrayCyGHt, CGH-Explorer, M-CGH and Normalise Suite v2.5 also demonstrate versatility by handling the data generated by all three types of array platforms (Table 1). The visualization capabilities of these applications are compared based on the ability to view single or multiple experiments, and on whether they offer simple static graphical representations or interactive displays (Table 1). Here, we highlight three software examples to illustrate interactive display: CGHPro, CGHAnalyzer v2.2 and SeeGH v3.0.
CGHPro
CGHPro is Java-based software that runs on multiple operating systems. It requires the installation of the Java Runtime Environment Version 1.4.2 or higher, the statistical package R (Ihaka and Gentleman, 1996) Version 1.9.1 and the MySQL database server to store array CGH experiments (Chen et al. 2005). The major functionalities of this software include data quality assessment through graphical means, normalization of data using techniques commonly applied to microarray data, integration of previously designed algorithms for alteration detection, and multiple methods for visualization. In addition, CGHPro can input formatted data from a variety of array platforms.
Data quality assessment is achieved using graphical methods such as scatter plots of the log2 spot intensities, box plots, histograms, M-A plots and QQ plots. Data filtering is achieved using user-defined parameters. Normalization routines include Global Median, Subgrid Median, LOWESS (locally weighted scatter plot smoothing), Subgrid LOWESS, and dye-swap normalization. Alteration detection algorithms include direct thresholding and thresholding after the use of segmentation algorithms, incorporating the aCGH (HMM) and DNAcopy (CBS) Bioconductor packages (Fridlyand et al. 2004; Olshen et al. 2004). Visualization is interactive, allowing sequential magnification and viewing of multiple experiments.
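Of the normalization routines listed, global median normalization is the simplest to illustrate. The sketch below is a generic version, not CGHPro's own code, and it rests on the assumption that most DNA targets on the array are unaltered, so that the array-wide median log2 ratio should be zero.

```python
from statistics import median

def global_median_normalize(log2_ratios):
    """Center log2 ratios so that the array-wide median is zero."""
    center = median(log2_ratios)
    return [r - center for r in log2_ratios]

# Toy array with a dye bias of about +0.30 on every spot.
raw = [0.30, 0.25, 0.35, 0.90, -0.20]
normalized = global_median_normalize(raw)
print([round(v, 2) for v in normalized])
```

Subgrid variants apply the same centering separately within each print block of the array, which also addresses spatial bias.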
CGHAnalyzer v2.2
CGHAnalyzer is also Java-based software and requires the Java Runtime Environment version 1.4 or later (Margolin et al. 2005). This program allows querying of pre-loaded or custom gene sets for copy number status and integrates the clustering options of the TIGR Multi-Experiment Viewer (Saeed et al. 2003). CGHAnalyzer has no normalization functions and therefore requires pre-normalized data. However, mapping information for the UPenn BAC array and the Affymetrix P501 SNP array is pre-loaded.
Two visualization layouts are provided to give the option of viewing the chromosomes in concentric circles or as traditional chromosome ideograms. Multiple experiments can be viewed using heatmap alignment of individual chromosomes. Alteration detection depends on direct thresholding or by variation from a pre-defined distribution.
SeeGH v3.0
SeeGH was developed in C++, runs on the Windows platform, and requires MySQL as its database. It accepts pre-normalized data and allows filtering of replica data points based on standard deviation and signal-to-noise ratio cut-offs. SeeGH accommodates data from a variety of sources, for example copy number, gene expression, and global methylation profiles. Interactive display functions include sequential magnification and linking of clones to genes and, in turn, to biological databases (e.g. UCSC Genome Browser). Specific regions of interest can be located by querying identifiers such as gene name, clone name, and base pair position. Experimental parameters and user comments are stored within SeeGH, allowing convenient information retrieval.
In addition, users can add customized or preloaded tracks to display gene location, CpG island position, microRNA location, etc. Multiple chromosome alignment, frequency summary plot, and heatmap display are included options for viewing multiple experiments (Fig. 6). Direct thresholding and moving average based thresholding are built in for alteration detection. Alternatively, segmentation using external software (e.g. aCGH-Smooth) can be imported for visualization.
Figure 6.
Examples of multiple experiment visualization methods in SeeGH. A: Multiple alignment of individual chromosome profiles. B: Frequency plot summarizing multiple experiments. Here, red histograms represent the frequency of gains and green histograms the frequency of losses. C: Heatmap display (more ...)
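The values behind a frequency summary plot can be computed by counting, for each DNA target, the proportion of experiments in which it was called as gained or lost. The sketch below assumes that each experiment has already been converted to per-target calls by one of the detection methods above; it does not reflect SeeGH's internal implementation.

```python
def alteration_frequency(calls_per_sample):
    """Return (gain_freq, loss_freq): per-target fractions across samples."""
    n_samples = len(calls_per_sample)
    n_targets = len(calls_per_sample[0])
    if any(len(sample) != n_targets for sample in calls_per_sample):
        raise ValueError("every sample must cover the same targets")
    gain_freq = [sum(sample[t] == "gain" for sample in calls_per_sample) / n_samples
                 for t in range(n_targets)]
    loss_freq = [sum(sample[t] == "loss" for sample in calls_per_sample) / n_samples
                 for t in range(n_targets)]
    return gain_freq, loss_freq

# Four hypothetical experiments, three DNA targets each.
samples = [
    ["gain", "retention", "loss"],
    ["gain", "retention", "retention"],
    ["retention", "retention", "loss"],
    ["gain", "loss", "loss"],
]
gain_freq, loss_freq = alteration_frequency(samples)
print(gain_freq, loss_freq)
```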
Considerations for future software development
With the rapid accumulation of large scale high throughput data describing cancer genomes, epigenomes, and transcriptomes, cross-platform meta-analysis will become prevalent. However, researchers with limited genomics and computational expertise will not be able to readily take advantage of such information. The development of facile, web-based software for the integration of large scale multidisciplinary databases will facilitate the widespread mining of genomic data and their correlation with clinical features (Kingsley et al. 2006). These issues are more pronounced with the increasing emphasis on translational research as array CGH technology moves towards clinical application. Added consideration of the ease of use, information security, automation and incorporation of prior knowledge of disease to assist in interpretation is necessary to deliver these emerging technologies to a clinical setting.
Acknowledgments
We thank Jonathan Davies, Timon Buys, and Bradley Coe for useful discussion, and Bryan Chi and Mehrnoush Khojasteh for providing software prior to publication. This work was supported by funds from Genome Canada/Genome British Columbia, Canadian Institute of Health Research, and NIDCR grant RO1 DE15965-01. WWL is supported by a scholarship from Natural Sciences and Engineering Research Council.
Albertson DG, Ylstra B, Segraves R, et al. Quantitative mapping of amplicon structure by array CGH identifies CYP24 as a candidate oncogene. Nat. Genet. 2000;25:144–6. [PubMed]
Autio R, Hautaniemi S, Kauraniemi P, et al. CGH-Plotter: MATLAB toolbox for CGH-data analysis. Bioinformatics. 2003;19:1714–5. [PubMed]
Awad IA, Rees CA, Hernandez-Boussard T, et al. Caryoscope: an Open Source Java application for viewing microarray data in a genomic context. BMC Bioinformatics. 2004;5:151. [PMC free article] [PubMed]
Baldwin C, Garnis C, Zhang L, et al. Multiple microalterations detected at high frequency in oral cancer. Cancer Res. 2005;65:7561–7. [PubMed]
Barrett MT, Scheffer A, Ben-Dor A, et al. Comparative genomic hybridization using oligonucleotide microarrays and total genomic DNA. Proc. Natl. Acad. Sci., U.S.A. 2004;101:17765–70. [PubMed]
Beheshti B, Braude I, Marrano P, et al. Chromosomal localization of DNA amplifications in neuroblastoma tumors using cDNA microarray comparative genomic hybridization. Neoplasia. 2003;5:53–62. [PMC free article] [PubMed]
Bignell GR, Huang J, Greshock J, et al. High-resolution analysis of DNA copy number using oligonucleotide microarrays. Genome Res. 2004;14:287–95. [PubMed]
Buckley PG, Mantripragada KK, Benetkiewicz M, et al. A full-coverage, high-resolution human chromosome 22 genomic microarray for clinical and research applications. Hum. Mol. Genet. 2002;11:3221–9. [PubMed]
Buckley PG, Jarbo C, Menzel U, et al. Comprehensive DNA copy number profiling of meningioma using a chromosome 1 tiling path microarray identifies novel candidate tumor suppressor loci. Cancer Res. 2005;65:2653–61. [PubMed]
Chen W, Erdogan F, Ropers HH, et al. CGHPRO—a comprehensive data analysis tool for array CGH. BMC Bioinformatics. 2005;6:85. [PMC free article] [PubMed]
Chi B, DeLeeuw RJ, Coe BP, et al. SeeGH—a software tool for visualization of whole genome array comparative genomic hybridization data. BMC Bioinformatics. 2004;5:13. [PMC free article] [PubMed]
Coe BP, Henderson LJ, Garnis C, et al. High-resolution chromosome arm 5p array CGH analysis of small cell lung carcinoma cell lines. Genes Chromosomes Cancer. 2005;42:308–13. [PubMed]
Coe BP, Lee EH, Chi B, et al. Gain of a region on 7p22.3, containing MAD1L1, is the most frequent event in small-cell lung cancer cell lines. Genes Chromosomes Cancer. 2006;45:11–9. [PubMed]
Conrad DF, Andrews TD, Carter NP, et al. A high-resolution survey of deletion polymorphism in the human genome. Nat. Genet. 2006;38:75–81. [PubMed]
Davies JJ, Wilson IM, Lam WL. Array CGH technologies and their applications to cancer genomes. Chromosome Res. 2005;13:237–48. [PubMed]
de Vries BB, Pfundt R, Leisink M, et al. Diagnostic genome profiling in mental retardation. Am. J. Hum. Genet. 2005;77:606–16. [PubMed]
Deeb G, Baer MR, Gaile DP, et al. Genomic profiling of myeloid sarcoma by array comparative genomic hybridization. Genes Chromosomes Cancer. 2005;44:373–83. [PubMed]
Fiegler H, Carr P, Douglas EJ, et al. DNA microarrays for comparative genomic hybridization based on DOP-PCR amplification of BAC and PAC clones. Genes Chromosomes Cancer. 2003;36:361–74. [PubMed]
Fridlyand J, Snijders A, Pinkel D, et al. Hidden Markov models approach to the analysis of array CGH data. J. Multivar. Anal. 2004;90:132–53.
Garnis C, Baldwin C, Zhang L, et al. Use of complete coverage array comparative genomic hybridization to define copy number alterations on chromosome 3p in oral squamous cell carcinomas. Cancer Res. 2003;63:8582–5. [PubMed]
Garnis C, Coe BP, Zhang L, et al. Overexpression of LRP12, a gene contained within an 8q22 amplicon identified by high-resolution array CGH analysis of oral squamous cell carcinomas. Oncogene. 2004;23:2582–6. [PubMed]
Garnis C, Coe BP, Lam SL, et al. High-resolution array CGH increases heterogeneity tolerance in the analysis of clinical samples. Genomics. 2005;85:790–3. [PubMed]
Garnis C, Lockwood WW, Vucic E, et al. High resolution analysis of non-small cell lung cancer cell lines by whole genome tiling path array CGH. Int. J. Cancer. 2006;118:1556–64. [PubMed]
Greshock J, Naylor TL, Margolin A, et al. 1-Mb resolution array-based comparative genomic hybridization using a BAC clone set optimized for cancer gene analysis. Genome Res. 2004;14:179–87. [PubMed]
Henderson LJ, Coe BP, Lee EH, et al. Genomic and gene expression profiling of minute alterations of chromosome arm 1p in small-cell lung carcinoma cells. Br. J. Cancer. 2005;92:1553–60. [PMC free article] [PubMed]
Hinds DA, Kloek AP, Jen M, et al. Common deletions and SNPs are in linkage disequilibrium in the human genome. Nat. Genet. 2006;38:82–5. [PubMed]
Hsu L, Self SG, Grove D, et al. Denoising array-based comparative genomic hybridization data using wavelets. Biostatistics. 2005;6:211–26. [PubMed]
Huang J, Wei W, Zhang J, et al. Whole genome DNA copy number changes identified by high density oligonucleotide arrays. Hum. Genomics. 2004;1:287–99. [PMC free article] [PubMed]
Hupe P, Stransky N, Thiery JP, et al. Analysis of array CGH data: from signal ratio to gain and loss of DNA regions. Bioinformatics. 2004;20:3413–22. [PubMed]
Iafrate AJ, Feuk L, Rivera MN, et al. Detection of large-scale variation in the human genome. Nat. Genet. 2004;36:949–51. [PubMed]
Ihaka R, Gentleman R. R: a language for data analysis and graphics. J. Comput. Graphical Statist. 1996;5:299–314.
Inazawa J, Inoue J, Imoto I. Comparative genomic hybridization (CGH)-arrays pave the way for identification of novel cancer-related genes. Cancer Sci. 2004;95:559–63. [PubMed]
Ishkanian AS, Malloff CA, Watson SK, et al. A tiling resolution DNA microarray with complete coverage of the human genome. Nat. Genet. 2004;36:299–303. [PubMed]
Jain AN, Tokuyasu TA, Snijders AM, et al. Fully automatic quantification of microarray image data. Genome Res. 2002;12:325–32. [PubMed]
Jong K, Marchiori E, Meijer G, et al. Breakpoint identification and smoothing of array comparative genomic hybridization data. Bioinformatics. 2004;20:3636–7. [PubMed]
Khojasteh M, Lam WL, Ward RK, et al. A stepwise framework for the normalization of array CGH data. BMC Bioinformatics. 2005;6:274. [PMC free article] [PubMed]
Khojasteh M, Coe BP, Shah S, et al. A novel algorithm for the analysis of array CGH data. IEEE International Conference on Acoustics, Speech, and Signal Processing. 2006; in press.
Kim SY, Nam SW, Lee SH, et al. ArrayCyGHt: a web application for analysis and visualization of array-CGH data. Bioinformatics. 2005;21:2554–5. [PubMed]
Kingsley CB, Kuo WL, Polikoff D, et al. Magellan: A Web Based System for the Integrated Analysis of Heterogeneous Biological Data and Annotations: Application to DNA Copy Number and Expression Data in Ovarian Cancer. Cancer Informatics. 2006;1:10–21. [PMC free article] [PubMed]
Lingjaerde OC, Baumbusch LO, Liestol K, et al. CGH-Explorer: a program for analysis of array-CGH data. Bioinformatics. 2005;21:821–2. [PubMed]
Lockwood WW, Chari R, Chi B, et al. Recent advances in array comparative genomic hybridization technologies and their applications in human genetics. Eur. J. Hum. Genet. 2006;14:139–48. [PubMed]
MacQueen JB. Some Methods for classification and Analysis of Multivariate Observations. Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and Probability. 1967;1:281–97.
Margolin AA, Greshock J, Naylor TL, et al. CGHAnalyzer: a stand-alone software package for cancer genome analysis using array-based DNA copy number data. Bioinformatics. 2005;21:3308–11. [PubMed]
McCarroll SA, Hadnott TN, Perry GH, et al. Common deletion polymorphisms in the human genome. Nat. Genet. 2006;38:86–92. [PubMed]
Myers CL, Dunham MJ, Kung SY, et al. Accurate detection of aneuploidies in array CGH and gene expression microarray data. Bioinformatics. 2004;20:3533–43. [PubMed]
Nannya Y, Sanada M, Nakazaki K, et al. A robust algorithm for copy number detection using high-density oligonucleotide single nucleotide polymorphism genotyping arrays. Cancer Res. 2005;65:6071–9. [PubMed]
Olshen AB, Venkatraman ES, Lucito R, et al. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics. 2004;5:557–72. [PubMed]
Oostlander AE, Meijer GA, Ylstra B. Microarray-based comparative genomic hybridization and its applications in human genetics. Clin. Genet. 2004;66:488–95. [PubMed]
Pinkel D, Segraves R, Sudar D, et al. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat. Genet. 1998;20:207–11. [PubMed]
Pollack JR, Perou CM, Alizadeh AA, et al. Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nat. Genet. 1999;23:41–6. [PubMed]
Rabiner LR. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE. 1989;77:257–85.
Saeed AI, Sharov V, White J, et al. TM4: a free, open-source system for microarray data management and analysis. Biotechniques. 2003;34:374–8. [PubMed]
Schwaenen C, Nessling M, Wessendorf S, et al. Automated array-based genomic profiling in chronic lymphocytic leukemia: development of a clinical tool and discovery of recurrent genomic alterations. Proc. Natl. Acad. Sci., U.S.A. 2004;101:1039–44. [PubMed]
Sebat J, Lakshmi B, Troge J, et al. Large-scale copy number polymorphism in the human genome. Science. 2004;305:525–8. [PubMed]
Selzer RR, Richmond TA, Pofahl NJ, et al. Analysis of chromosome breakpoints in neuroblastoma at sub-kilobase resolution using fine-tiling oligonucleotide array CGH. Genes Chromosomes Cancer. 2005;44:305–19. [PubMed]
Shankar G, Rossi MR, McQuaid DE, et al. aCGHViewer: A Generic Visualization Tool For aCGH data. Cancer Informatics. 2006;2:36–43. [PMC free article] [PubMed]
Snijders AM, Nowak N, Segraves R, et al. Assembly of microarrays for genome-wide measurement of DNA copy number. Nat. Genet. 2001;29:263–4. [PubMed]
Solinas-Toldo S, Lampel S, Stilgenbauer S, et al. Matrix-based comparative genomic hybridization: biochips to screen for genomic imbalances. Genes Chromosomes Cancer. 1997;20:399–407. [PubMed]
Squire JA, Pei J, Marrano P, et al. High-resolution mapping of amplifications and deletions in pediatric osteosarcoma by use of CGH analysis of cDNA microarrays. Genes Chromosomes Cancer. 2003;38:215–25. [PubMed]
Tuzun E, Sharp AJ, Bailey JA, et al. Fine-scale structural variation of the human genome. Nat. Genet. 2005;37:727–32. [PubMed]
Veltman JA, Fridlyand J, Pejavar S, et al. Array-based comparative genomic hybridization for genome-wide screening of DNA copy number in bladder tumors. Cancer Res. 2003;63:2872–80. [PubMed]
Vissers LE, Veltman JA, van Kessel AG, et al. Identification of disease genes by whole genome CGH arrays. Hum. Mol. Genet. 2005;14 Spec No. 2:R215–23. [PubMed]
Wang J, Meza-Zepeda LA, Kresse SH, et al. M-CGH: analysing microarray-based CGH experiments. BMC Bioinformatics. 2004;5:74. [PMC free article] [PubMed]
Wang P, Kim Y, Pollack J, et al. A method for calling gains and losses in array CGH data. Biostatistics. 2005;6:45–58. [PubMed]
Yi Y, Mirosevich J, Shyr Y, et al. Coupled analysis of gene expression and chromosomal location. Genomics. 2005;85:401–12. [PubMed]
Ylstra B, van den Ijssel P, Carvalho B, et al. BAC to the future! or oligonucleotides: a perspective for micro array comparative genomic hybridization (array CGH). Nucleic Acids Res. 2006;34:445–50. [PMC free article] [PubMed]
Zhao X, Li C, Paez JG, et al. An integrated view of copy number and allelic alterations in the cancer genome using single nucleotide polymorphism arrays. Cancer Res. 2004;64:3060–71. [PubMed]
Articles from Cancer Informatics are provided here courtesy of
Libertas Academica