Related Articles
With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License.
PMCID: PMC2759124
PMID: 19936083
aCGH; expression profiling; visualization; correlation; and data integration
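The clone-to-gene matching described for CGI above can be illustrated with a small coordinate-overlap sketch. This is not CGI's implementation (which the abstract says runs against a MySQL database of UCSC annotations); the function name, the tuple layout and the toy coordinates are all assumptions made for illustration.

```python
from collections import defaultdict


def genes_overlapping_clones(clones, genes):
    """Map each clone name to the sorted list of gene names it overlaps.

    clones, genes: iterables of (name, chrom, start, end) using half-open
    [start, end) coordinates on the same genome build (an assumption).
    """
    # Index genes by chromosome so each clone is only compared to genes
    # on its own chromosome.
    genes_by_chrom = defaultdict(list)
    for name, chrom, start, end in genes:
        genes_by_chrom[chrom].append((start, end, name))

    matches = {}
    for clone_name, chrom, c_start, c_end in clones:
        # Two half-open intervals overlap when each starts before the other ends.
        hits = [g_name for g_start, g_end, g_name in genes_by_chrom[chrom]
                if g_start < c_end and c_start < g_end]
        matches[clone_name] = sorted(hits)
    return matches


# Hypothetical toy data, not taken from any real annotation.
clones = [("clone_A", "chr1", 1000, 5000), ("clone_B", "chr2", 100, 900)]
genes = [("GENE1", "chr1", 4000, 6000),
         ("GENE2", "chr1", 7000, 8000),
         ("GENE3", "chr2", 900, 1200)]
matches = genes_overlapping_clones(clones, genes)
print(matches)
```

A linear scan per chromosome is enough for a sketch; a production tool would more likely use an indexed database query or an interval tree.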
Assessing variations in DNA copy number is crucial for understanding constitutional or somatic diseases, particularly cancers. The recently developed array-CGH (comparative genomic hybridization) technology allows this to be investigated at the genomic level. We report the availability of a web tool for analysing array-CGH data. CAPweb (CGH array Analysis Platform on the Web) is intended as a user-friendly tool enabling biologists to completely analyse CGH arrays from the raw data to the visualization and biological interpretation. The user typically performs the following bioinformatics steps of a CGH array project within CAPweb: the secure upload of the results of CGH array image analysis and of the array annotation (genomic position of the probes); first level analysis of each array, including automatic normalization of the data (for correcting experimental biases), breakpoint detection and status assignment (gain, loss or normal); validation or deletion of the analysis based on a summary report and quality criteria; visualization and biological analysis of the genomic profiles and results through a user-friendly interface. CAPweb is accessible at .
doi:10.1093/nar/gkl215
PMCID: PMC1538852
PMID: 16845053
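The status assignment step mentioned in the CAPweb abstract (gain, loss or normal) can be sketched as a simple threshold on a log2 ratio. The thresholds of +0.25 and -0.25 are hypothetical values chosen for illustration; the abstract does not state which rule or cut-offs CAPweb applies.

```python
def assign_status(log2_ratio, gain_threshold=0.25, loss_threshold=-0.25):
    """Classify a log2(test/reference) ratio as 'gain', 'loss' or 'normal'.

    The default thresholds are illustrative assumptions, not CAPweb settings.
    """
    if log2_ratio >= gain_threshold:
        return "gain"
    if log2_ratio <= loss_threshold:
        return "loss"
    return "normal"


# Hypothetical mean log2 ratios for three segments.
segment_means = [0.58, -0.02, -0.9]
statuses = [assign_status(value) for value in segment_means]
print(statuses)
```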
Genetic and epigenetic changes contribute to deregulation of gene expression and development of human cancer. Changes in DNA methylation are key epigenetic factors regulating gene expression and genomic stability. Recent progress in microarray technologies has resulted in the development of high-resolution platforms for profiling genetic, epigenetic and gene expression changes. Osteosarcoma (OS) is a pediatric bone tumor with a characteristically high level of numerical and structural chromosomal changes. Furthermore, little is known about DNA methylation changes in OS. Our objective was to develop an integrative approach for analysis of high-resolution epigenomic, genomic, and gene expression profiles in order to identify functional epi/genomic differences between OS cell lines and normal human osteoblasts. A combination of Affymetrix Promoter Tiling Arrays for DNA methylation, the Agilent array-CGH platform for genomic imbalance and the Affymetrix Gene 1.0 platform for gene expression analysis was used. As a result, an integrative high-resolution approach for interrogation of genome-wide tumour-specific changes in DNA methylation was developed. This approach was used to provide the first genomic DNA methylation maps, and to identify and validate genes with aberrant DNA methylation in OS cell lines. This first integrative analysis of global cancer-related changes in DNA methylation, genomic imbalance, and gene expression has provided comprehensive evidence of the cumulative roles of epigenetic and genetic mechanisms in deregulation of gene expression networks.
doi:10.1371/journal.pone.0002834
PMCID: PMC2515339
PMID: 18698372
Background
Array CGH (Comparative Genomic Hybridisation) is a molecular cytogenetic technique for the genome wide detection of chromosomal imbalances. It is based on the co-hybridisation of differentially labelled test and reference DNA onto arrays of genomic BAC clones, cDNAs or oligonucleotides, and after correction for various intervening variables, loss or gain in the test DNA can be indicated from spots showing aberrant signal intensity ratios.
Now that this technique is no longer confined to highly specialized laboratories and is entering the realm of clinical application, there is a need for a user-friendly software package that facilitates estimates of DNA dosage from raw signal intensities obtained by array CGH experiments, and which does not depend on a sophisticated computational environment.
Results
We have developed a user-friendly and versatile tool for the normalization, visualization, breakpoint detection and comparative analysis of array-CGH data. CGHPRO is a stand-alone Java application that guides the user through the whole process of data analysis. The import option for image analysis data covers several data formats, but users can also customize their own data formats. Several graphical representation tools assist in the selection of the appropriate normalization method. Intensity ratios of each clone can be plotted in a size-dependent manner along the chromosome ideograms. The interactive graphical interface makes it possible to explore the characteristics of each clone, such as the involvement of the clone's sequence in segmental duplications. Circular Binary Segmentation and unsupervised Hidden Markov Model algorithms facilitate objective detection of chromosomal breakpoints. The storage of all essential data in a back-end database allows the simultaneous comparative analysis of different cases. The various display options also facilitate the definition of shortest regions of overlap and simplify the identification of odd clones.
Conclusion
CGHPRO is a comprehensive and easy-to-use data analysis tool for array CGH. Since all of its features are available offline, CGHPRO may be especially suitable in situations where protection of sensitive patient data is an issue. It is distributed under GNU GPL licence and runs on Linux and Windows.
doi:10.1186/1471-2105-6-85
PMCID: PMC1274268
PMID: 15807904
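The "shortest region of overlap" mentioned in the CGHPRO abstract is commonly understood as the stretch shared by the aberrant regions of all compared cases. A minimal sketch of that idea follows; the function name and the toy coordinates are assumptions, and the abstract does not describe how CGHPRO itself derives these regions.

```python
def shortest_region_of_overlap(regions):
    """Return the (start, end) interval common to all regions, or None.

    regions: list of (start, end) half-open intervals on one chromosome,
    one aberrant region per case (an illustrative data layout).
    """
    if not regions:
        return None
    # The shared stretch begins at the latest start and stops at the earliest end.
    start = max(region_start for region_start, _ in regions)
    end = min(region_end for _, region_end in regions)
    return (start, end) if start < end else None


# Hypothetical gained regions from three cases on the same chromosome.
sro = shortest_region_of_overlap([(100, 500), (300, 800), (250, 450)])
print(sro)
```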
Background
Cervical carcinoma develops as a result of multiple genetic alterations. Different studies investigated genomic alterations in cervical cancer mainly by means of metaphase comparative genomic hybridization (mCGH) and microsatellite marker analysis for the detection of loss of heterozygosity (LOH). Currently, high throughput methods such as array comparative genomic hybridization (array CGH), single nucleotide polymorphism array (SNP array) and gene expression arrays are available to study genome-wide alterations. Integration of these 3 platforms allows detection of genomic alterations at high resolution and investigation of an association between copy number changes and expression.
Results
Genome-wide copy number and genotype analysis of 10 cervical cancer cell lines by array CGH and SNP array showed highly complex large-scale alterations. A comparison between array CGH and SNP array revealed that the overall concordance in detection of the same areas with copy number alterations (CNA) was above 90%. The use of SNP arrays demonstrated that about 75% of LOH events would not have been found by methods which screen for copy number changes, such as array CGH, since these were LOH events without CNA. Regions frequently targeted by CNA, as determined by array CGH, such as amplification of 5p and 20q, and loss of 8p were confirmed by fluorescent in situ hybridization (FISH). Genome-wide, we did not find a correlation between copy-number and gene expression. At chromosome arm 5p however, 22% of the genes were significantly upregulated in cell lines with amplifications as compared to cell lines without amplifications, as measured by gene expression arrays. For 3 genes, SKP2, ANKH and TRIO, expression differences were confirmed by quantitative real-time PCR (qRT-PCR).
Conclusion
This study showed that copy number data retrieved from either array CGH or SNP array are comparable and that the integration of genome-wide LOH, copy number and gene expression is useful for the identification of gene specific targets that could be relevant for the development and progression in cervical cancer.
doi:10.1186/1471-2164-8-53
PMCID: PMC1805756
PMID: 17311676
The main focus in pin-tip (or print-tip) microarray analysis is determining which probes, genes, or oligonucleotides are differentially expressed. Specifically in array comparative genomic hybridization (aCGH) experiments, researchers search for chromosomal imbalances in the genome. To model these data, scientists apply statistical methods to the structure of the experiment and assume that the data consist of the signal plus random noise. In this paper we propose “SmoothArray”, a new method to preprocess comparative genomic hybridization (CGH) bacterial artificial chromosome (BAC) arrays and we show the effects on a cancer dataset. As part of our R software package “aCGHplus,” this freely available algorithm removes the variation due to the intensity effects, pin/print-tip, the spatial location on the microarray chip, and the relative location from the well plate. Removal of this variation improves the downstream analysis and subsequent inferences made on the data. Further, we present measures to evaluate the quality of the dataset according to the arrayer pins, 384-well plates, plate rows, and plate columns. We compare our method against competing methods using several metrics to measure the biological signal. With this novel normalization algorithm and quality control measures, users can improve their inferences on datasets and pinpoint problems that may arise in their BAC aCGH technology.
doi:10.1155/2011/860732
PMCID: PMC3043322
PMID: 21403910
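To make the idea of removing print-tip variation concrete, the sketch below subtracts each print-tip's median log2 ratio from its spots. This is a deliberately simpler stand-in, not the SmoothArray algorithm from the abstract above, which also models intensity, spatial and well-plate effects; the function name and data layout are assumptions.

```python
from collections import defaultdict
from statistics import median


def center_by_print_tip(spots):
    """Median-center log2 ratios within each print-tip group.

    spots: list of (print_tip_id, log2_ratio) pairs (illustrative layout).
    Returns the centered ratios in the same order as the input.
    """
    ratios_by_tip = defaultdict(list)
    for tip, ratio in spots:
        ratios_by_tip[tip].append(ratio)
    tip_median = {tip: median(values) for tip, values in ratios_by_tip.items()}
    # Subtracting the per-tip median removes a constant offset per print-tip.
    return [ratio - tip_median[tip] for tip, ratio in spots]


# Hypothetical spots from two print-tips with different systematic offsets.
spots = [(1, 0.5), (1, 0.7), (1, 0.6), (2, -0.25), (2, -0.75)]
normalized = center_by_print_tip(spots)
print(normalized)
```

Median centering assumes most clones under a given print-tip are copy-neutral, which may not hold for highly rearranged tumour genomes.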
Menten, Björn | Pattyn, Filip | De Preter, Katleen | Robbrecht, Piet | Michels, Evi | Buysse, Karen | Mortier, Geert | De Paepe, Anne | van Vooren, Steven | Vermeesch, Joris | Moreau, Yves | De Moor, Bart | Vermeulen, Stefan | Speleman, Frank | Vandesompele, Jo
Background
The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH). One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment.
Results
We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment) supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Following its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser.
Conclusion
ArrayCGHbase is a web-based and platform-independent arrayCGH data analysis tool that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at .
doi:10.1186/1471-2105-6-124
PMCID: PMC1173083
PMID: 15910681
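The BED export mentioned in the arrayCGHbase abstract can be pictured with a short sketch that writes four-column BED lines. BED uses 0-based, half-open coordinates, so a 1-based inclusive start is decremented by one. The record layout and the choice to encode the copy number status in the name column are assumptions for illustration, not arrayCGHbase's actual export format.

```python
def to_bed_lines(records):
    """Convert clone records to tab-separated BED4 lines.

    records: iterable of (chrom, start, end, clone_name, status) with
    1-based inclusive coordinates (an assumed input convention).
    """
    lines = []
    for chrom, start, end, clone_name, status in records:
        # BED is 0-based half-open: shift the start, keep the end.
        lines.append(f"{chrom}\t{start - 1}\t{end}\t{clone_name}|{status}")
    return lines


# Hypothetical clone with a gain call.
records = [("chr5", 1001, 2000, "clone_A", "gain")]
bed_lines = to_bed_lines(records)
print("\n".join(bed_lines))
```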
The Overlay Tool© has been developed to combine high throughput data derived from various microarray platforms. This tool analyzes high-resolution correlations between gene expression changes and either copy number abnormalities (CNAs) or loss of heterozygosity events detected using array comparative genomic hybridization (aCGH). Using an overlay analysis which is designed to be performed using data from multiple microarray platforms on a single biological sample, the Overlay Tool© identifies potentially important genes whose expression profiles are changed as a result of losses, gains and amplifications in the cancer genome. In addition, the Overlay Tool© will incorporate loss of heterozygosity (LOH) probability data into this overlay procedure. To facilitate this analysis, we developed an application which computationally combines two or more high throughput datasets (e.g. aCGH/expression) into a single categorized dataset for visualization and interrogation using a gene-centric approach. As such, data from virtually any microarray platform can be incorporated without the need to remap entire datasets individually. The resultant categorized (overlay) data set can be conveniently viewed using our in-house visualization tool, aCGHViewer© (Shankar et al. 2006), which serves as a conduit to public databases such as UCSC and NCBI, to rapidly investigate genes of interest.
PMCID: PMC2675835
PMID: 19455250
Overlay Analysis; Microarray; ACGH; expression profiling; CNAs; aCGHViewer
Background
In translational cancer research, gene expression data is collected together with clinical data and genomic data arising from other chip based high throughput technologies. Software tools for the joint analysis of such high dimensional data sets together with clinical data are required.
Results
We have developed an open source software tool which provides interactive visualization capability for the integrated analysis of high-dimensional gene expression data together with associated clinical data, array CGH data and SNP array data. The different data types are organized by a comprehensive data manager. Interactive tools are provided for all graphics: heatmaps, dendrograms, barcharts, histograms, eventcharts and a chromosome browser, which displays genetic variations along the genome. All graphics are dynamic and fully linked so that any object selected in a graphic will be highlighted in all other graphics. For exploratory data analysis the software provides unsupervised data analytics like clustering, seriation algorithms and biclustering algorithms.
Conclusions
The SEURAT software meets the growing needs of researchers to perform joint analysis of gene expression, genomic and clinical data.
doi:10.1186/1755-8794-3-21
PMCID: PMC2893446
PMID: 20525257
Array-Comparative Genomic Hybridization (aCGH) is a powerful high throughput technology for detecting chromosomal copy number aberrations (CNAs) in cancer, aiming at identifying related critical genes from the affected genomic regions. However, advancing from a dataset with thousands of tabular lines to a few candidate genes can be an onerous and time-consuming process. To expedite the aCGH data analysis process, we have developed a user-friendly aCGH data viewer (aCGHViewer) as a conduit between the aCGH data tables and a genome browser. The data from a given aCGH analysis are displayed in a genomic view comprised of individual chromosome panels which can be rapidly scanned for interesting features. A chromosome panel containing a feature of interest can be selected to launch a detail window for that single chromosome. Selecting a data point of interest in the detail window launches a query to the UCSC or NCBI genome browser to allow the user to explore the gene content in the chromosomal region. Additionally, aCGHViewer can display aCGH and expression array data concurrently to visually correlate the two. aCGHViewer is a stand alone Java visualization application that should be used in conjunction with separate statistical programs. It operates on all major computer platforms and is freely available at http://falcon.roswellpark.org/aCGHview/.
PMCID: PMC1847423
PMID: 17404607
array-CGH; CNA; gene expression; visualization
Background
As high-throughput technologies rapidly generate genome-scale data, it becomes increasingly important to visually integrate these data so that specific hypotheses can be formulated and tested.
Results
We present MochiView, platform-independent Java software that integrates browsing of genomic sequences, features, and data with DNA motif visualization and analysis in a visually appealing and user-friendly application.
Conclusions
While highly versatile, the software is particularly useful for organizing, exploring, and analyzing large genomic data sets, such as those from deep RNA sequencing, chromatin immunoprecipitation experiments (ChIP-Seq and ChIP-Chip), and transcriptional profiling. MochiView provides an extensive suite of utilities to identify and to explore connections between these data sets and short sequence motifs present in DNA or RNA.
doi:10.1186/1741-7007-8-49
PMCID: PMC2867778
PMID: 20409324
Motivation: Array-based comparative genomic hybridization (arrayCGH) has recently become a popular tool to identify DNA copy number variations along the genome. These profiles are starting to be used as markers to improve prognosis or diagnosis of cancer, which implies that methods for automated supervised classification of arrayCGH data are needed. Like gene expression profiles, arrayCGH profiles are characterized by a large number of variables usually measured on a limited number of samples. However, arrayCGH profiles have a particular structure of correlations between variables, due to the spatial organization of bacterial artificial chromosomes along the genome. This suggests that classical classification methods, often based on the selection of a small number of discriminative features, may not be the most accurate methods and may not produce easily interpretable prediction rules.
Results: We propose a new method for supervised classification of arrayCGH data. The method is a variant of support vector machine that incorporates the biological specificities of DNA copy number variations along the genome as prior knowledge. The resulting classifier is a sparse linear classifier based on a limited number of regions automatically selected on the chromosomes, leading to easy interpretation and identification of discriminative regions of the genome. We test this method on three classification problems for bladder and uveal cancer, involving both diagnosis and prognosis. We demonstrate that the introduction of the new prior on the classifier leads not only to more accurate predictions, but also to the identification of known and new regions of interest in the genome.
Availability: All data and algorithms are publicly available.
Contact: franck.rapaport@curie.fr
doi:10.1093/bioinformatics/btn188
PMCID: PMC2718663
PMID: 18586737
Holcomb, Ilona N. | Young, Janet M. | Coleman, Ilsa M. | Salari, Keyan | Grove, Douglas I. | Hsu, Li | True, Lawrence D. | Roudier, Martine P. | Morrissey, Colm M. | Higano, Celestia S. | Nelson, Peter S. | Vessella, Robert L. | Trask, Barbara J.
Androgen deprivation is the mainstay of therapy for progressive prostate cancer. Despite initial and dramatic tumor inhibition, most men eventually fail therapy and die of metastatic castration-resistant (CR) disease. Here, we characterize the profound degree of genomic alteration found in CR tumors using array CGH, gene expression arrays, and FISH. By cluster analysis, we show that the similarity of the genomic profiles from primary and metastatic tumors is driven by the patient. Using data adjusted for this similarity, we identify numerous high-frequency alterations in the CR tumors, such as 8p loss and chromosome 7 and 8q gain. By integrating array CGH and expression array data, we reveal genes whose correlated values suggest they are relevant to prostate cancer biology. We find alterations that are significantly associated with the metastases of specific organ sites, and others with CR tumors versus the tumors of patients with localized prostate cancer not treated with androgen deprivation. Within the high-frequency sites of loss in CR metastases, we find an over-representation of genes involved in cellular lipid metabolism, including PTEN. Finally, using FISH we verify the presence of a gene fusion between TMPRSS2 and ERG suggested by chromosome-21 deletions detected by array CGH. We find the fusion in 54% of our CR tumors, and 81% of the fusion-positive tumors contain cells with multiple copies of the fusion. Our investigation lays the foundation for a better understanding of and possible therapeutic targets for CR disease, the poorly responsive and final stage of prostate cancer.
doi:10.1158/0008-5472.CAN-08-3810
PMCID: PMC2771763
PMID: 19773449
Prostate cancer; castration-resistant metastases; metastatic prostate cancer; array CGH; expression; genomic alterations; TMPRSS2-ERG fusion
Background
Alternative RNA splicing greatly increases proteome diversity and thereby contributes to species- or tissue-specific functions. The possibility of studying alternative splicing (AS) events on a genomic scale using splicing-sensitive microarrays, including the Affymetrix GeneChip Exon 1.0 ST microarray (exon array), has appeared very recently. However, the application of this new technology is hindered by the lack of free and user-friendly software devoted to these novel platforms.
Results
In this study we present a Java-based freeware tool, easyExon, to process, filter and visualize exon array data with an analysis pipeline. This tool implements the most commonly used probeset summarization methods as well as AS-oriented filtration algorithms, e.g. MIDAS and PAC, for the detection of alternative splicing events. We include a biological filtration function according to GO terms, and provide a module to visualize and interpret the selected exons and transcripts. Furthermore, easyExon can integrate with other related programs, such as Integrated Genome Browser (IGB) and Affymetrix Power Tools (APT), to make the whole analysis more comprehensive. We applied easyExon to a publicly accessible colon cancer dataset as an example to illustrate the analysis pipeline of this tool.
Conclusion
EasyExon can efficiently process and analyze the Affymetrix exon array data. The simplicity, flexibility and brevity of easyExon make it a valuable tool for AS event identification in genomic research.
doi:10.1186/1471-2105-9-432
PMCID: PMC2579307
PMID: 18851762
Recently, microarray-based comparative genomic hybridization (array-CGH) has emerged as a very efficient technology with higher resolution for the genome-wide identification of copy number alterations (CNAs). Although CNAs are thought to affect gene expression, there is no platform currently available for integrated CNA-expression analysis. To achieve high-resolution copy number analysis integrated with expression profiles, we established a human 30k oligoarray-based genome-wide copy number analysis system and explored its applicability to integrated genome and transcriptome analysis using the MDA-MB-231 cell line. We compared the CNAs detected by the oligoarray with those detected by the 3k BAC array for validation. The oligoarray identified single-copy differences more accurately and sensitively than the BAC array. Seventeen CNAs detected by both platforms in MDA-MB-231, such as gains of 5p15.33-13.1, 8q11.22-8q21.13, 17p11.2, and losses of 1p32.3, 8p23.3-8p11.21, and 9p21, were consistently identified in previous studies on breast cancer. There were 122 other small CNAs (mean size 1.79 Mb) that were detected by the oligoarray only, not by the BAC array. We performed genomic qPCR targeting 7 CNA regions detected by the oligoarray only and one non-CNA region to validate the oligoarray CNA detection. All qPCR results were consistent with the oligoarray-CGH results. When we explored the possibility of combined interpretation of both DNA copy number and RNA expression profiles, mean DNA copy number and RNA expression levels showed a significant correlation. In conclusion, this 30k oligoarray-CGH system can be a reasonable choice for analyzing whole genome CNAs and RNA expression profiles at a lower cost.
doi:10.3858/emm.2009.41.7.051
PMCID: PMC2721143
PMID: 19322034
cell line, tumor; gene dosage; gene expression profiling; oligonucleotide array sequence analysis
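The correlation between mean DNA copy number and RNA expression reported in the oligoarray abstract above can be illustrated with a Pearson correlation coefficient. The abstract does not say which correlation measure the authors used, so Pearson's r is an assumption here, and the input values are invented toy numbers rather than study data.

```python
from math import sqrt


def pearson(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / sqrt(var_x * var_y)


# Hypothetical per-region mean log2 copy number and expression values.
copy_number = [-1.0, 0.0, 1.0, 2.0]
expression = [-2.0, 0.0, 2.0, 4.0]
r = pearson(copy_number, expression)
print(r)
```

A significance test on r would be needed to support a claim of significant correlation; that step is omitted from this sketch.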
Array comparative genomic hybridization (array CGH) is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has led to a rapid accumulation of high throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development.
PMCID: PMC2067254
PMID: 17992253
array CGH; microarray; cancer genome; software; bioinformatics; alteration detection
Aim
To discover putative oncogenes in head and neck squamous cell carcinoma (HNSCC) by integrating data from whole-genome comparison of array-based comparative genomic hybridization (CGH) and expression microarray analysis of HNSCC.
Methods
We integrated published data defining regions of loss/gain identified from the profiling of 21 HNSCC using high-resolution (<1 Mb) CGH arrays and data from an mRNA expression microarray (approx. 12,000 genes) comparing 6 normal tissues and 8 HNSCC tumor tissues. Eukaryotic translation initiation factor 2C subunit 2 (EIF2C2) was found to be the most significantly overexpressed gene by mRNA expression array, and corresponded to the most common region of amplification found by the CGH array described by Sparano et al. We validated EIF2C2 overexpression in primary tissue, overexpression and amplification in HNSCC lines (JHU-011, JHU-012, FADU) relative to a minimally transformed oral keratinocyte cell line (OKF6) and performed knockdown experiments.
Results
The tumor tissues had an average mRNA expression level of 123 (SD = 49) compared to the normal tissues (18.6, SD = 10) (p = 0.0005) by expression array. Quantitative RT-PCR validation of our expression arrays found that normal tissues had an average expression of 0.76 (SE = 0.08) and tumor tissues of 2.1 (SE = 0.35) (p = 0.0008). EIF2C2 was found to be amplified and overexpressed in 3 HNSCC cell lines. Knockdown of EIF2C2 in cell lines (JHU-012 and JHU-011) inhibited proliferation.
Conclusion
EIF2C2 is amplified and overexpressed in HNSCC cell lines and primary tumors and functionally significant in cell lines.
doi:10.1159/000320597
PMCID: PMC2975733
PMID: 20924207
Head and neck squamous cell carcinoma; EIF2C2
Background
Genomic tiling arrays have been described in the scientific literature since 2003, yet there is a shortage of user-friendly applications available for their analysis.
Methodology/Principal Findings
Tiling Array Analyzer (TiArA) is a software program that provides a user-friendly graphical interface for the background subtraction, normalization, and summarization of data acquired through the Affymetrix tiling array platform. The background signal is empirically measured using a group of nonspecific probes with varying levels of GC content and normalization is performed to enforce a common dynamic range.
Conclusions/Significance
TiArA is implemented as a standalone program for Linux systems and is available as a cross-platform virtual machine that will run under most modern operating systems using virtualization software such as Sun VirtualBox or VMware. The software is available as a Debian package or a virtual appliance at http://purl.org/NET/tiara.
doi:10.1371/journal.pone.0009993
PMCID: PMC2848623
PMID: 20376318
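The TiArA abstract describes estimating background from nonspecific probes with varying GC content. One simple way to picture this is to take the median intensity of background probes at each GC count and subtract it from probes with the same GC count. The sketch below follows that assumption; it is not TiArA's published procedure, and the function names and short toy sequences are hypothetical.

```python
from collections import defaultdict
from statistics import median


def gc_count(sequence):
    """Count G and C bases in a probe sequence."""
    return sum(1 for base in sequence.upper() if base in "GC")


def background_by_gc(background_probes):
    """Return {gc_count: median intensity} from (sequence, intensity) pairs."""
    intensities = defaultdict(list)
    for sequence, intensity in background_probes:
        intensities[gc_count(sequence)].append(intensity)
    return {gc: median(values) for gc, values in intensities.items()}


def subtract_background(probes, background):
    """Subtract the GC-matched background estimate from each probe.

    Probes whose GC count has no estimate are left unchanged, which is an
    arbitrary choice made for this sketch.
    """
    return [intensity - background.get(gc_count(sequence), 0.0)
            for sequence, intensity in probes]


# Hypothetical background probes (real tiling probes are longer).
background_probes = [("ATAT", 10.0), ("TATA", 14.0), ("GCGC", 30.0)]
background = background_by_gc(background_probes)
corrected = subtract_background([("AATT", 50.0), ("CCGG", 50.0)], background)
print(corrected)
```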
Background
Copy number alterations (CNAs) in genomic DNA have been associated with complex human diseases, including cancer. One of the most common techniques to detect CNAs is array-based comparative genomic hybridization (aCGH). The availability of aCGH platforms and the need for identification of CNAs has resulted in a wealth of methodological studies.
Methodology/Principal Findings
ADaCGH is an R package and a web-based application for the analysis of aCGH data. It implements eight methods for detection of CNAs, gains and losses of genomic DNA, including all of the best performing ones from two recent reviews (CBS, GLAD, CGHseg, HMM). For improved speed, we use parallel computing (via MPI). Additional information (GO terms, PubMed citations, KEGG and Reactome pathways) is available for individual genes, and for sets of genes with altered copy numbers.
Conclusions/Significance
ADaCGH represents a qualitative increase in the standards of these types of applications: a) all of the best performing algorithms are included, not just one or two; b) we do not limit ourselves to providing a thin layer of CGI on top of existing BioConductor packages, but instead carefully use parallelization, examining different schemes, and are able to achieve significant decreases in user waiting time (factors up to 45×); c) we have added functionality not currently available in some methods, to adapt to recent recommendations (e.g., merging of segmentation results in wavelet-based and CGHseg algorithms); d) we incorporate redundancy, fault-tolerance and checkpointing, which are unique among web-based, parallelized applications; e) all of the code is available under open source licenses, allowing others to build upon, copy, and adapt our code for other software projects.
doi:10.1371/journal.pone.0000737
PMCID: PMC1940324
PMID: 17710137
Almagro-Garcia, Jacob | Manske, Magnus | Carret, Celine | Campino, Susana | Auburn, Sarah | MacInnis, Bronwyn L | Maslen, Gareth | Pain, Arnab | Newbold, Christopher I | Kwiatkowski, Dominic P | Clark, Taane G
Summary: Array-based comparative genomic hybridization (CGH) technology is used to discover and validate genomic structural variation, including copy number variants, insertions, deletions and other structural variants (SVs). The visualization and summarization of the array CGH data outputs, potentially across many samples, is an important process in the identification and analysis of SVs. We have developed a software tool for SV analysis using data from array CGH technologies, which is also amenable to short-read sequence data.
Availability and implementation: SnoopCGH is written in Java and is available from http://snoopcgh.sourceforge.net/
Contact: jg10@sanger.ac.uk; tc5@sanger.ac.uk
doi:10.1093/bioinformatics/btp488
PMCID: PMC2759554
PMID: 19687029
The increasing availability and maturity of DNA microarray technology have led to an explosion of cancer profiling studies for identifying cancer biomarkers and predicting treatment response. Uncovering complex relationships, however, remains the most challenging task as it requires compiling and efficiently querying data from various sources. Here, we describe the Stress Response Array Profiler (StRAP), an open-source, web-based resource for storage, profiling, visualization, and sharing of cancer genomic data. StRAP houses multi-cancer microarray data with major emphasis on radiotherapy studies, and takes a systems biology approach towards the integration, comparison, and cross-validation of multiple cancer profiling studies. The database is a comprehensive platform for comparative analysis of gene expression data. For effective use of arrays, we provide user-friendly and interactive visualization tools that can display the data and query results. StRAP is web-based, platform-independent, and freely accessible at http://strap.nci.nih.gov/.
doi:10.1371/journal.pone.0051693
PMCID: PMC3524254
PMID: 23284744
Freeman, Jennifer L. | Ceol, Craig | Feng, Hui | Langenau, David M. | Belair, Cassandra | Stern, Howard M. | Song, Anhua | Paw, Barry H. | Look, A. Thomas | Zhou, Yi | Zon, Leonard I. | Lee, Charles
The zebrafish is emerging as a prominent model system for studying the genetics of human development and disease. Genetic alterations that underlie each mutant model can exist in the form of single base changes, balanced chromosomal rearrangements, or genetic imbalances. To detect genetic imbalances in an unbiased genome-wide fashion, array comparative genomic hybridization (CGH) can be used. We have developed a 5 Mb resolution array CGH platform specifically for the zebrafish. This platform contains 286 BAC clones, enriched for orthologous sequences of human oncogenes and tumor suppressor genes. Each BAC clone has been end-sequenced and cytogenetically assigned to a specific location within the zebrafish genome, allowing for ease of integration of array CGH data with the current version of the genome assembly. This platform has been applied to three zebrafish cancer models. Significant genomic imbalances were detected in each model, identifying different regions which may potentially play a role in tumorigenesis. Hence, this platform should be a useful resource for genetic dissection of additional zebrafish developmental and disease models as well as a benchmark for future array CGH platform development.
doi:10.1002/gcc.20623
PMCID: PMC2605212
PMID: 18973135
Rose, Amy E | Satagopan, Jaya M | Oddoux, Carole | Zhou, Qin | Xu, Ruliang | Olshen, Adam B | Yu, Jessie Z | Dash, Atreya | Jean-Gilles, Jerome | Reuter, Victor | Gerald, William L | Lee, Peng | Osman, Iman
Background
The goal of our study was to investigate the molecular underpinnings associated with the relatively aggressive clinical behavior of prostate cancer (PCa) in African American (AA) compared to Caucasian American (CA) patients using a genome-wide approach.
Methods
AA and CA patients treated with radical prostatectomy (RP) were frequency matched for age at RP, Gleason grade, and tumor stage. Array-CGH (BAC SpectralChip2600) was used to identify genomic regions with significantly different DNA copy number between the groups. Gene expression profiling of the same set of tumors was also evaluated using Affymetrix HG-U133 Plus 2.0 arrays. Concordance between copy number alteration and gene expression was examined. A second aCGH analysis was performed in a larger validation cohort using an oligo-based platform (Agilent 244K).
Results
The BAC-based array identified 27 chromosomal regions with significantly different copy number changes between the AA and CA tumors in the first cohort (Fisher's exact test, P < 0.05). Copy number alterations in these 27 regions were also significantly associated with gene expression changes. aCGH performed in a larger, independent cohort of AA and CA tumors validated 4 of the 27 (15%) most significantly altered regions from the initial analysis (3q26, 5p15-p14, 14q32, and 16p11). Functional annotation of overlapping genes within the 4 validated regions of AA/CA DNA copy number changes revealed significant enrichment of genes related to immune response.
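As an illustration of the per-region group comparison mentioned above, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of altered versus unaltered tumors in two groups. It is not the authors' code: the function name fisher_exact_two_sided and the tumor counts in the usage lines are hypothetical, and the abstract does not state how the tables were constructed or which software was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two patient groups; columns are tumors with and without
    a copy number alteration in the region being tested.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x of the col1 altered tumors
        # fall in the first group, with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the
    # observed one (small tolerance guards against floating-point ties).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts for one region (not taken from the study):
# alteration present in 9 of 20 tumors in group 1 and 2 of 20 in group 2.
p_value = fisher_exact_two_sided(9, 11, 2, 18)
print(f"two-sided p-value: {p_value:.4f}")
```

In a genome-wide screen this test would be repeated once per region, so the P < 0.05 threshold applies to each region separately; whether any multiple-testing correction was applied is not stated in the abstract.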
Conclusions
Our data reveal molecular alterations at the level of gene expression and DNA copy number that are specific to African American and Caucasian prostate cancer and may be related to underlying differences in immune response.
doi:10.1186/1479-5876-8-70
PMCID: PMC2913940
PMID: 20649978