Related Articles

1.  Flexible and Accurate Detection of Genomic Copy-Number Changes from aCGH 
PLoS Computational Biology  2007;3(6):e122.
Genomic DNA copy-number alterations (CNAs) are associated with complex diseases, including cancer: CNAs are indeed related to tumoral grade, metastasis, and patient survival. CNAs discovered from array-based comparative genomic hybridization (aCGH) data have been instrumental in identifying disease-related genes and potential therapeutic targets. To be immediately useful in both clinical and basic research scenarios, aCGH data analysis requires accurate methods that do not impose unrealistic biological assumptions and that provide direct answers to the key question, “What is the probability that this gene/region has CNAs?” Current approaches fail, however, to meet these requirements. Here, we introduce reversible jump aCGH (RJaCGH), a new method for identifying CNAs from aCGH; we use a nonhomogeneous hidden Markov model fitted via reversible jump Markov chain Monte Carlo; and we incorporate model uncertainty through Bayesian model averaging. RJaCGH provides an estimate of the probability that a gene/region has CNAs while incorporating interprobe distance and the capability to analyze data on a chromosome or genome-wide basis. RJaCGH outperforms alternative methods, and the performance difference is even larger with noisy data and highly variable interprobe distance, both commonly found features in aCGH data. Furthermore, our probabilistic method allows us to identify minimal common regions of CNAs among samples and can be extended to incorporate expression data. In summary, we provide a rigorous statistical framework for locating genes and chromosomal regions with CNAs with potential applications to cancer and other complex human diseases.
Author Summary
As a consequence of problems during cell division, the number of copies of a gene in a chromosome can either increase or decrease. These copy-number alterations (CNAs) can play a crucial role in the emergence of complex multigenic diseases. For example, in cancer, amplification of oncogenes can drive tumor activation, and CNAs are associated with metastasis development and patient survival. Studies on the relationship between CNAs and disease have been recently fueled by the widespread use of array-based comparative genomic hybridization (aCGH), a technique with much finer resolution than previous experimental approaches. Detection of CNAs from these data depends on methods of analysis that do not impose biologically unrealistic assumptions and that provide direct answers to fundamental research questions. We have developed a statistical method, using a Bayesian approach, that returns estimates of the probabilities of CNAs from aCGH data, the most direct and valuable answer to the key biological question: “What is the probability that this gene/region has an altered copy number?” The output of the method can therefore be immediately used in different settings from clinical to basic research scenarios, and is applicable over a wide variety of aCGH technologies.
PMCID: PMC1894821  PMID: 17590078
2.  A probe-density-based analysis method for array CGH data: simulation, normalization and centralization 
Bioinformatics  2008;24(16):1749-1756.
Motivation: Genomic instability is one of the fundamental factors in tumorigenesis and tumor progression. Many studies have shown that copy-number abnormalities at the DNA level are important in the pathogenesis of cancer. Array comparative genomic hybridization (aCGH), developed based on expression microarray technology, can reveal the chromosomal aberrations in segmental copies at a high resolution. However, due to the nature of aCGH, many standard expression data processing tools, such as data normalization, often fail to yield satisfactory results.
Results: We demonstrated a novel aCGH normalization algorithm, which provides an accurate aCGH data normalization by utilizing the dependency of neighboring probe measurements in aCGH experiments. To facilitate the study, we have developed a hidden Markov model (HMM) to simulate a series of aCGH experiments with random DNA copy number alterations that are used to validate the performance of our normalization. In addition, we applied the proposed normalization algorithm to an aCGH study of lung cancer cell lines. By using the proposed algorithm, data quality and the reliability of experimental results are significantly improved, and the distinct patterns of DNA copy number alterations are observed among those lung cancer cell lines.
Supplementary information: Source codes and figures may be found at
PMCID: PMC2732214  PMID: 18603568
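The HMM-based simulation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' simulator: the three states, their mean log2 ratios, the stay probability and the noise level are all hypothetical values chosen only to show how a hidden state path makes neighboring probes dependent.

```python
import random

def simulate_acgh(n_probes, seed=0, stay_prob=0.98, noise_sd=0.15):
    """Simulate log2 ratios along one chromosome from a 3-state HMM.

    All numeric settings are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    # Hypothetical mean log2 ratio for each hidden copy-number state.
    means = {"loss": -0.5, "neutral": 0.0, "gain": 0.4}
    states = list(means)
    state = "neutral"
    path, ratios = [], []
    for _ in range(n_probes):
        # Neighboring probes usually share a state; switch only rarely.
        if rng.random() > stay_prob:
            state = rng.choice([s for s in states if s != state])
        path.append(state)
        # Observed value = state mean plus Gaussian measurement noise.
        ratios.append(rng.gauss(means[state], noise_sd))
    return path, ratios

path, ratios = simulate_acgh(200, seed=1)
```

Because the true state path is returned alongside the noisy ratios, a normalization or calling method can be scored against known ground truth, which is the role simulation plays in the study.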
3.  Bayesian Disease Classification Using Copy Number Data 
Cancer Informatics  2014;13(Suppl 2):83-91.
DNA copy number variations (CNVs) have been shown to be associated with cancer development and progression. The detection of these CNVs has the potential to impact the basic knowledge and treatment of many types of cancers, and can play a role in the discovery and development of molecular-based personalized cancer therapies. One of the most common types of high-resolution chromosomal microarrays is array-based comparative genomic hybridization (aCGH) methods that assay DNA CNVs across the whole genomic landscape in a single experiment. In this article we propose methods to use aCGH profiles to predict disease states. We employ a Bayesian classification model and treat disease states as outcome, and aCGH profiles as covariates in order to identify significant regions of the genome associated with disease subclasses. We propose a principled two-stage method where we first make inferences on the underlying copy number states associated with the aCGH emissions based on hidden Markov model (HMM) formulations to account for serial dependencies in neighboring probes. Subsequently, we infer associations with disease outcomes, conditional on the copy number states, using Bayesian linear variable selection procedures. The selected probes and their effects are parameters that are useful for predicting the disease categories of any additional individuals on the basis of their aCGH profiles. Using simulated datasets, we investigate the method’s accuracy in detecting disease category. Our methodology is motivated by and applied to a breast cancer dataset consisting of aCGH profiles assayed on patients from multiple disease subtypes.
PMCID: PMC4196891  PMID: 25336897
breast cancer; classification; Bayesian network; hidden Markov model
4.  Microarray Comparative Genomic Hybridisation Analysis Incorporating Genomic Organisation, and Application to Enterobacterial Plant Pathogens 
PLoS Computational Biology  2009;5(8):e1000473.
Microarray comparative genomic hybridisation (aCGH) provides an estimate of the relative abundance of genomic DNA (gDNA) taken from comparator and reference organisms by hybridisation to a microarray containing probes that represent sequences from the reference organism. The experimental method is used in a number of biological applications, including the detection of human chromosomal aberrations, and in comparative genomic analysis of bacterial strains, but optimisation of the analysis is desirable in each problem domain.
We present a method for analysis of bacterial aCGH data that encodes spatial information from the reference genome in a hidden Markov model. This technique is the first such method to be validated in comparisons of sequenced bacteria that diverge at the strain and at the genus level: Pectobacterium atrosepticum SCRI1043 (Pba1043) and Dickeya dadantii 3937 (Dda3937); and Lactococcus lactis subsp. lactis IL1403 and L. lactis subsp. cremoris MG1363. In all cases our method is found to outperform common and widely used aCGH analysis methods that do not incorporate spatial information. This analysis is applied to comparisons between commercially important plant pathogenic soft-rotting enterobacteria (SRE) Pba1043, P. atrosepticum SCRI1039, P. carotovorum 193, and Dda3937.
Our analysis indicates that it should not be assumed that hybridisation strength is a reliable proxy for sequence identity in aCGH experiments, and robustly extends the applicability of aCGH to bacterial comparisons at the genus level. Our results in the SRE further provide evidence for a dynamic, plastic ‘accessory’ genome, revealing major genomic islands encoding gene products that provide insight into, and may play a direct role in determining, variation amongst the SRE in terms of their environmental survival, host range and aetiology, such as phytotoxin synthesis, multidrug resistance, and nitrogen fixation.
Author Summary
We describe the first use of a method for the analysis of bacterial microarray comparative genomic hybridisation (aCGH) that includes information about the spatial organisation of genes in the reference bacterium. We demonstrate that using this information improves predictive performance over standard bacterial aCGH methods in discriminating between genes from the reference organism that either do or do not have putative orthologues in the comparator organism. Our approach produces good results on more distantly related bacteria than can successfully be analysed by the standard methods. We apply our analysis to comparisons between four commercially-significant plant pathogenic bacteria, and identify several regions of the genome that are likely to contribute to their ability to cause disease, and to proliferate in the environment, generating hypotheses for future experiments.
PMCID: PMC2718846  PMID: 19696881
5.  wuHMM: a robust algorithm to detect DNA copy number variation using long oligonucleotide microarray data 
Nucleic Acids Research  2008;36(7):e41.
Copy number variants (CNVs) are currently defined as genomic sequences that are polymorphic in copy number and range in length from 1000 to several million base pairs. Among current array-based CNV detection platforms, long-oligonucleotide arrays promise the highest resolution. However, the performance of currently available analytical tools suffers when applied to these data because of the lower signal:noise ratio inherent in oligonucleotide-based hybridization assays. We have developed wuHMM, an algorithm for mapping CNVs from array comparative genomic hybridization (aCGH) platforms comprised of 385 000 to more than 3 million probes. wuHMM is unique in that it can utilize sequence divergence information to reduce the false positive rate (FPR). We apply wuHMM to 385K-aCGH, 2.1M-aCGH and 3.1M-aCGH experiments comparing the 129X1/SvJ and C57BL/6J inbred mouse genomes. We assess wuHMM's performance on the 385K platform by comparison to the higher resolution platforms and we independently validate 10 CNVs. The method requires no training data and is robust with respect to changes in algorithm parameters. At a FPR of <10%, the algorithm can detect CNVs with five probes on the 385K platform and three on the 2.1M and 3.1M platforms, resulting in effective resolutions of 24 kb, 2–5 kb and 1 kb, respectively.
PMCID: PMC2367727  PMID: 18334530
6.  Comparison of chromosomal and array-based comparative genomic hybridization for the detection of genomic imbalances in primary prostate carcinomas 
Molecular Cancer  2006;5:33.
In order to gain new insights into the molecular mechanisms involved in prostate cancer, we performed array-based comparative genomic hybridization (aCGH) on a series of 46 primary prostate carcinomas using a 1 Mbp whole-genome coverage platform. As chromosomal comparative genomic hybridization (cCGH) data was available for these samples, we compared the sensitivity and overall concordance of the two methodologies, and used the combined information to infer the best of three different aCGH scoring approaches.
Our data demonstrate that the reliability of aCGH in the analysis of primary prostate carcinomas depends to some extent on the scoring approach used, with the breakpoint estimation method being the most sensitive and reliable. The pattern of copy number changes detected by aCGH was concordant with that of cCGH, but the higher resolution technique detected 2.7 times more aberrations and 15.2% more carcinomas with genomic imbalances. We additionally show that several aberrations were consistently overlooked using cCGH, such as small deletions at 5q, 6q, 12p, and 17p. The latter were validated by fluorescence in situ hybridization targeting TP53, although only one carcinoma harbored a point mutation in this gene. Strikingly, homozygous deletions at 10q23.31, encompassing the PTEN locus, were seen in 58% of the cases with 10q loss.
We conclude that aCGH can significantly improve the detection of genomic aberrations in cancer cells as compared to previously established whole-genome methodologies, although contamination with normal cells may influence the sensitivity and specificity of some scoring approaches. Our work delineated recurrent copy number changes and revealed novel amplified loci and frequent homozygous deletions in primary prostate carcinomas, which may guide future work aimed at identifying the relevant target genes. In particular, biallelic loss seems to be a frequent mechanism of inactivation of the PTEN gene in prostate carcinogenesis.
PMCID: PMC1570364  PMID: 16952311
7.  A fused lasso latent feature model for analyzing multi-sample aCGH data 
Biostatistics (Oxford, England)  2011;12(4):776-791.
Array-based comparative genomic hybridization (aCGH) enables the measurement of DNA copy number across thousands of locations in a genome. The main goals of analyzing aCGH data are to identify the regions of copy number variation (CNV) and to quantify the amount of CNV. Although there are many methods for analyzing single-sample aCGH data, the analysis of multi-sample aCGH data is a relatively new area of research. Further, many of the current approaches for analyzing multi-sample aCGH data do not appropriately utilize the additional information present in the multiple samples. We propose a procedure called the Fused Lasso Latent Feature Model (FLLat) that provides a statistical framework for modeling multi-sample aCGH data and identifying regions of CNV. The procedure involves modeling each sample of aCGH data as a weighted sum of a fixed number of features. Regions of CNV are then identified through an application of the fused lasso penalty to each feature. Some simulation analyses show that FLLat outperforms single-sample methods when the simulated samples share common information. We also propose a method for estimating the false discovery rate. An analysis of an aCGH data set obtained from human breast tumors, focusing on chromosomes 8 and 17, shows that FLLat and Significance Testing of Aberrant Copy number (an alternative, existing approach) identify similar regions of CNV that are consistent with previous findings. However, through the estimated features and their corresponding weights, FLLat is further able to discern specific relationships between the samples, for example, identifying 3 distinct groups of samples based on their patterns of CNV for chromosome 17.
PMCID: PMC3169672  PMID: 21642389
Cancer; DNA copy number; False discovery rate; Mutation
8.  Whole-Genome Array CGH Evaluation for Replacing Prenatal Karyotyping in Hong Kong 
PLoS ONE  2014;9(2):e87988.
To evaluate the effectiveness of whole-genome array comparative genomic hybridization (aCGH) in prenatal diagnosis in Hong Kong.
Array CGH was performed on 220 samples recruited prospectively as the first-tier test study. In addition 150 prenatal samples with abnormal fetal ultrasound findings found to have normal karyotypes were analyzed as a ‘further-test’ study using NimbleGen CGX-135K oligonucleotide arrays.
Array CGH findings were concordant with conventional cytogenetic results with the exception of one case of triploidy. It was found in the first-tier test study that aCGH detected 20% (44/220) clinically significant copy number variants (CNV), of which 21 were common aneuploidies and 23 had other chromosomal imbalances. There were 3.2% (7/220) samples with CNVs detected by aCGH but not by conventional cytogenetics. In the ‘further-test’ study, the additional diagnostic yield of detecting chromosome imbalance was 6% (9/150). The overall detection for CNVs of unclear clinical significance was 2.7% (10/370) with 0.9% found to be de novo. Eleven loci of common CNVs were found in the local population.
Whole-genome aCGH offered a higher resolution diagnostic capacity than conventional karyotyping for prenatal diagnosis either as a first-tier test or as a ‘further-test’ for pregnancies with fetal ultrasound anomalies. We propose replacing conventional cytogenetics with aCGH for all pregnancies undergoing invasive diagnostic procedures after excluding common aneuploidies and triploidies by quantitative fluorescent PCR. Conventional cytogenetics can be reserved for visualization of clinically significant CNVs.
PMCID: PMC3914896  PMID: 24505343
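The percentages quoted in this abstract can be re-derived from the counts it gives. The short check below uses only numbers stated above; the helper name `pct` is an illustrative choice.

```python
def pct(numerator, denominator, digits=1):
    """Return a percentage rounded to the given number of decimals."""
    return round(100 * numerator / denominator, digits)

# Counts as stated in the abstract.
first_tier_significant = pct(44, 220)   # clinically significant CNVs
first_tier_acgh_only = pct(7, 220)      # missed by conventional cytogenetics
further_test_yield = pct(9, 150)        # additional diagnostic yield
unclear_significance = pct(10, 370)     # CNVs of unclear significance

# The 44 significant findings split into 21 aneuploidies and 23 others,
# and the 370 samples are the 220 first-tier plus 150 further-test cases.
assert 21 + 23 == 44
assert 220 + 150 == 370
```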
9.  CGHnormaliter: an iterative strategy to enhance normalization of array CGH data with imbalanced aberrations 
BMC Genomics  2009;10:401.
Array comparative genomic hybridization (aCGH) is a popular technique for detection of genomic copy number imbalances. These play a critical role in the onset of various types of cancer. In the analysis of aCGH data, normalization is deemed a critical pre-processing step. In general, aCGH normalization approaches are similar to those used for gene expression data, albeit both data-types differ inherently. A particular problem with aCGH data is that imbalanced copy numbers lead to improper normalization using conventional methods.
In this study we present a novel method, called CGHnormaliter, which addresses this issue by means of an iterative normalization procedure. First, provisory balanced copy numbers are identified and subsequently used for normalization. These two steps are then iterated to refine the normalization. We tested our method on three well-studied tumor-related aCGH datasets with experimentally confirmed copy numbers. Results were compared to a conventional normalization approach and two more recent state-of-the-art aCGH normalization strategies. Our findings show that, compared to these three methods, CGHnormaliter yields a higher specificity and precision in terms of identifying the 'true' copy numbers.
We demonstrate that the normalization of aCGH data can be significantly enhanced using an iterative procedure that effectively eliminates the effect of imbalanced copy numbers. This also leads to a more reliable assessment of aberrations. An R-package containing the implementation of CGHnormaliter is available at .
PMCID: PMC2748095  PMID: 19709427
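The two-step loop described in this abstract (identify provisionally balanced probes, normalize on them, repeat) can be sketched as follows. This is a deliberate simplification and not the CGHnormaliter implementation: a fixed threshold around zero stands in for the identification of balanced copy numbers, and the function name, threshold and iteration cap are assumptions.

```python
from statistics import median

def iterative_center(log_ratios, threshold=0.3, max_iter=10):
    """Centre log ratios using only probes provisionally called balanced.

    A fixed threshold is a stand-in (assumption) for a proper
    copy-number calling step.
    """
    values = list(log_ratios)
    for _ in range(max_iter):
        # Step 1: provisional call of balanced (no gain or loss) probes.
        balanced = [v for v in values if abs(v) <= threshold]
        if not balanced:
            break
        # Step 2: normalize so that the balanced probes centre on zero.
        shift = median(balanced)
        if abs(shift) < 1e-9:
            break  # converged: balanced probes are already centred
        values = [v - shift for v in values]
    return values

# Six of ten probes are gained, so the overall median sits on the gains.
raw = [0.1] * 4 + [0.6] * 6
centered = iterative_center(raw)
```

In this toy input, centring on the median of all probes would shift the gained probes to zero and make the normal probes look like losses, which is the imbalance problem the abstract describes; restricting the centring step to the provisionally balanced probes avoids that.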
10.  Quantitative copy number analysis by Multiplex Ligation-dependent Probe Amplification (MLPA) of BRCA1-associated breast cancer regions identifies BRCAness 
Breast Cancer Research : BCR  2011;13(5):R107.
Our group has previously employed array Comparative Genomic Hybridization (aCGH) to assess the genomic patterns of BRCA1-mutated breast cancers. We have shown that the so-called BRCA1-like aCGH profile is also present in about half of all triple-negative sporadic breast cancers and is predictive for benefit from intensified alkylating chemotherapy. As aCGH is a rather complex method, we translated the BRCA1 aCGH profile to a Multiplex Ligation-dependent Probe Amplification (MLPA) assay, to identify both BRCA1-mutated breast cancers and sporadic cases with a BRCA1-like aCGH profile.
The most important genomic regions of the original aCGH based classifier (3q22-27, 5q12-14, 6p23-22, 12p13, 12q21-23, 13q31-34) were mapped to a set of 34 MLPA probes. The training set consisted of 39 BRCA1-like aCGH breast cancers and 45 non-BRCA1-like aCGH breast cancers, which had previously been analyzed by aCGH. The BRCA1-like aCGH group consisted of germline BRCA1-mutated cases and sporadic tumours with low BRCA1 gene expression and/or BRCA1 promoter methylation. We trained a shrunken centroids classifier on the training set and validation was performed on an independent test set of 40 BRCA1-like aCGH breast cancers and 32 non-BRCA1-like aCGH breast cancer tumours. In addition, we validated the set prospectively on 69 new triple-negative tumours.
BRCAness in the training set of 84 tumours could accurately be predicted by prediction analysis of microarrays (PAM) (accuracy 94%). Application of this classifier on the independent validation set correctly predicted BRCA-like status of 62 out of 72 breast tumours (86%). Sensitivity and specificity were 85% and 87%, respectively. When the MLPA-test was subsequently applied to 46 breast tumour samples from a randomized clinical trial, the same survival benefit for BRCA1-like tumours associated with intensified alkylating chemotherapy was shown as was previously reported using the aCGH assay.
Since the MLPA assay can identify BRCA1-deficient breast cancer patients, this method could be applied both for clinical genetic testing and as a predictor of treatment benefit. BRCA1-like tumours are highly sensitive to chemotherapy with DNA damaging agents, and most likely to poly ADP ribose polymerase (PARP)-inhibitors. The MLPA assay is rapid and robust, can easily be multiplexed, and works well with DNA derived from paraffin-embedded tissues.
PMCID: PMC3262220  PMID: 22032731
11.  Identification of cancer genes using a statistical framework for multiexperiment analysis of nondiscretized array CGH data 
Nucleic Acids Research  2008;36(2):e13.
Tumor formation is in part driven by DNA copy number alterations (CNAs), which can be measured using microarray-based Comparative Genomic Hybridization (aCGH). Multiexperiment analysis of aCGH data from tumors allows discovery of recurrent CNAs that are potentially causal to cancer development. Until now, multiexperiment aCGH data analysis has been dependent on discretization of measurement data to a gain, loss or no-change state. Valuable biological information is lost when a heterogeneous system such as a solid tumor is reduced to these states. We have developed a new approach which inputs nondiscretized aCGH data to identify regions that are significantly aberrant across an entire tumor set. Our method is based on kernel regression and accounts for the strength of a probe's signal, its local genomic environment and the signal distribution across multiple tumors. In an analysis of 89 human breast tumors, our method showed enrichment for known cancer genes in the detected regions and identified aberrations that are strongly associated with breast cancer subtypes and clinical parameters. Furthermore, we identified 18 recurrent aberrant regions in a new dataset of 19 p53-deficient mouse mammary tumors. These regions, combined with gene expression microarray data, point to known cancer genes and novel candidate cancer genes.
PMCID: PMC2241875  PMID: 18187509
12.  Detection limit of intragenic deletions with targeted array comparative genomic hybridization 
BMC Genetics  2013;14:116.
Pathogenic mutations range from single nucleotide changes to deletions or duplications that encompass a single exon to several genes. The use of gene-centric high-density array comparative genomic hybridization (aCGH) has revolutionized the detection of intragenic copy number variations. We implemented an exon-centric design of high-resolution aCGH to detect single- and multi-exon deletions and duplications in a large set of genes using the OGT 60 K and 180 K arrays. Here we describe the molecular characterization and breakpoint mapping of deletions at the smaller end of the detectable range in several genes using aCGH.
The method, initially implemented to detect single to multiple exon deletions, was able to detect deletions much smaller than anticipated. The selected deletions we describe vary in size, ranging from over 2 kb to as small as 12 base pairs. The smallest of these deletions are only detectable after careful manual review during data analysis. Suspected deletions smaller than the detection size for which the method was optimized were rigorously followed up and confirmed with PCR-based investigations to uncover the true detection size limit of intragenic deletions with this technology. False-positive deletion calls often turned out to be single nucleotide changes or an insertion causing lower hybridization of probes, demonstrating the sensitivity of aCGH.
With an optimized aCGH design and a careful review process, aCGH can uncover intragenic deletions as small as a dozen bases. These data provide insight that will help optimize probe coverage in array design and illustrate the true assay sensitivity. Mapping of the breakpoints confirms smaller deletions and contributes to the understanding of the mechanism behind these events. Our knowledge of the mutation spectra of several genes can be expected to change as previously unrecognized intragenic deletions are uncovered.
PMCID: PMC4235222  PMID: 24304607
aCGH; Intragenic deletions; Breakpoint analysis; Molecular characterization
13.  Application of Array CGH on Archival Formalin-Fixed Paraffin-Embedded Tissues including small numbers of microdissected cells 
Array-based comparative genomic hybridisation (aCGH) has diverse applications in cancer gene discovery and translational research. Currently, aCGH is performed primarily using high molecular weight DNA samples and its application to formalin-fixed and paraffin-embedded (FFPE) tissues remains to be established. To explore how aCGH can be reliably applied to archival FFPE tissues and whether it is possible to apply aCGH to small numbers of cells microdissected from FFPE tissue sections, we have systematically performed aCGH on 15 pairs of matched frozen and FFPE glioblastoma tissues using a well established in-house human 1Mb BAC/PAC genomic array. By spiking glioblastoma DNA with normal DNA, we demonstrated that at least 70% of tumour DNA was required for reliable aCGH analysis. Using aCGH data from frozen tissue as a reference, it was found that only FFPE glioblastoma tissues that supported PCR amplification of >300bp DNA fragment provided high quality, reproducible aCGH data. The presence of necrosis in a tissue specimen had an adverse effect on the quality of aCGH, while fixation in formalin for up to 96 hours of fresh tissue did not appear to affect the quality of the result. As little as 10-20ng DNA from frozen or FFPE tissues could be readily used for aCGH analysis following whole genome amplification. Furthermore, as few as 2000 microdissected cells from haematoxylin stained slides of archival FFPE tissues could be successfully used for aCGH investigations when whole genome amplification was used. By careful assessment of DNA integrity and review of histology, to exclude necrosis and select specimens with a high proportion of tumour cells, it is feasible to pre-select archival FFPE tissues adequate for aCGH analysis. With the help of microdissection and whole genome amplification, it is also possible to apply aCGH to histologically defined lesions, such as carcinoma in situ.
PMCID: PMC2815849  PMID: 16751780
array CGH; archival fixed tissue; microdissection; whole genome amplification; glioblastoma
14.  Bayesian Random Segmentation Models to Identify Shared Copy Number Aberrations for Array CGH Data 
Array-based comparative genomic hybridization (aCGH) is a high-resolution high-throughput technique for studying the genetic basis of cancer. The resulting data consists of log fluorescence ratios as a function of the genomic DNA location and provides a cytogenetic representation of the relative DNA copy number variation. Analysis of such data typically involves estimation of the underlying copy number state at each location and segmenting regions of DNA with similar copy number states. Most current methods proceed by modeling a single sample/array at a time, and thus fail to borrow strength across multiple samples to infer shared regions of copy number aberrations. We propose a hierarchical Bayesian random segmentation approach for modeling aCGH data that utilizes information across arrays from a common population to yield segments of shared copy number changes. These changes characterize the underlying population and allow us to compare different population aCGH profiles to assess which regions of the genome have differential alterations. Our method, referred to as BDSAcgh (Bayesian Detection of Shared Aberrations in aCGH), is based on a unified Bayesian hierarchical model that allows us to obtain probabilities of alteration states as well as probabilities of differential alteration that correspond to local false discovery rates. We evaluate the operating characteristics of our method via simulations and an application using a lung cancer aCGH data set.
PMCID: PMC3079218  PMID: 21512611
Bayesian methods; Comparative Genomic Hybridization; Copy number; Functional data analysis; Mixed Models; Mixture Models
15.  A Bayesian Analysis for Identifying DNA Copy Number Variations Using a Compound Poisson Process 
To study chromosomal aberrations that may lead to cancer formation or genetic diseases, the array-based Comparative Genomic Hybridization (aCGH) technique is often used for detecting DNA copy number variants (CNVs). Various methods have been developed for gaining CNVs information based on aCGH data. However, most of these methods make use of the log-intensity ratios in aCGH data without taking advantage of other information such as the DNA probe (e.g., biomarker) positions/distances contained in the data. Motivated by the specific features of aCGH data, we developed a novel method that takes into account the estimation of a change point or locus of the CNV in aCGH data with its associated biomarker position on the chromosome using a compound Poisson process. We used a Bayesian approach to derive the posterior probability for the estimation of the CNV locus. To detect loci of multiple CNVs in the data, a sliding window process combined with our derived Bayesian posterior probability was proposed. To evaluate the performance of the method in the estimation of the CNV locus, we first performed simulation studies. Finally, we applied our approach to real data from aCGH experiments, demonstrating its applicability.
PMCID: PMC3171362  PMID: 20976296
16.  A Statistical Change Point Model Approach for the Detection of DNA Copy Number Variations in Array CGH Data 
Array comparative genomic hybridization (aCGH) provides a high-resolution and high-throughput technique for screening of copy number variations (CNVs) within the entire genome. This technique, compared to the conventional CGH, significantly improves the identification of chromosomal abnormalities. However, due to the random noise inherent in the imaging and hybridization process, identifying statistically significant DNA copy number changes in aCGH data is challenging. We propose a novel approach that uses the mean and variance change point model (MVCM) to detect CNVs or breakpoints in aCGH data sets. We derive an approximate p-value for the test statistic and also give the estimate of the locus of the DNA copy number change. We carry out simulation studies to evaluate the accuracy of the estimate and the p-value formulation. These simulation results show that the approach is effective in identifying copy number changes. The approach is also tested on fibroblast cancer cell line data, breast tumor cell line data, and breast cancer cell line aCGH data sets that are publicly available. Changes that have not been identified by the circular binary segmentation (CBS) method but are biologically verified are detected by our approach on these cell lines with higher sensitivity and specificity than CBS.
PMCID: PMC4154476  PMID: 19875853
Statistical hypothesis testing; aCGH microarray data; gene expression; DNA copy numbers; CNVs
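As an illustration of the idea behind a mean and variance change point model, the sketch below scans a sequence of log ratios for the single split that maximizes a Gaussian likelihood-ratio statistic allowing both the mean and the variance to differ on either side. It is a textbook-style sketch under that assumption, not the authors' MVCM procedure; it does not compute their approximate p-value, and the function names and minimum segment length are hypothetical.

```python
import math

def _mle_var(xs):
    """Maximum-likelihood (divide by n) variance of a sequence."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def best_change_point(xs, min_seg=2):
    """Return (k, statistic) for the best split xs[:k] | xs[k:].

    The statistic is n*log(var_all) - k*log(var_left)
    - (n-k)*log(var_right), i.e. -2 log likelihood ratio for
    Gaussian data with a change in mean and variance at k.
    """
    n = len(xs)
    total = n * math.log(_mle_var(xs))
    best_k, best_stat = None, float("-inf")
    for k in range(min_seg, n - min_seg + 1):
        v1, v2 = _mle_var(xs[:k]), _mle_var(xs[k:])
        if v1 <= 0 or v2 <= 0:
            continue  # skip degenerate segments with zero variance
        stat = total - k * math.log(v1) - (n - k) * math.log(v2)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat

# Toy profile: five probes near 0 followed by five probes near 1.
profile = [0.0, 0.1, -0.1, 0.05, -0.05, 1.0, 1.1, 0.9, 1.05, 0.95]
k, stat = best_change_point(profile)
```

For multiple changes, a scan of this kind is typically applied recursively or within windows; judging significance requires a null distribution for the maximized statistic, which is what the abstract's approximate p-value addresses.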
17.  Micro-Scale Genomic DNA Copy Number Aberrations as Another Means of Mutagenesis in Breast Cancer 
PLoS ONE  2012;7(12):e51719.
In breast cancer, the basal-like subtype has high levels of genomic instability relative to other breast cancer subtypes with many basal-like-specific regions of aberration. There is evidence that this genomic instability extends to smaller scale genomic aberrations, as shown by a previously described micro-deletion event in the PTEN gene in the Basal-like SUM149 breast cancer cell line.
We sought to identify whether small regions of genomic DNA copy number change exist by using a high-density, gene-centric Comparative Genomic Hybridization (CGH) array on cell lines and primary tumors. A custom tiling array for CGH (244,000 probes, 200 bp tiling resolution), focused on previously identified basal-like-specific and general cancer genes, was created to identify small regions of genomic change. Tumor genomic DNA from 94 patients and 2 breast cancer cell lines was labeled and hybridized to these arrays. Aberrations were called using SWITCHdna and the smallest 25% of SWITCHdna-defined genomic segments were called micro-aberrations (<64 contiguous probes, ∼ 15 kb).
Our data showed that primary tumor breast cancer genomes frequently contained many small-scale copy number gains and losses, termed micro-aberrations, most of which are undetectable using typical-density genome-wide aCGH arrays. The basal-like subtype exhibited the highest incidence of these events. These micro-aberrations sometimes altered expression of the involved gene. We confirmed the presence of the PTEN micro-amplification in SUM149 and by mRNA-seq showed that this resulted in loss of expression of all exons downstream of this event. Micro-aberrations disproportionately affected the 5′ regions of the affected genes, including the promoter region, and high frequency of micro-aberrations was associated with poor survival.
Using a high-probe-density, gene-centric aCGH microarray, we present evidence of small-scale genomic aberrations that can contribute to gene inactivation. These events may contribute to tumor formation through mechanisms not detected using conventional DNA copy number analyses.
PMCID: PMC3524128  PMID: 23284754
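The rule of calling the smallest 25% of segments micro-aberrations can be sketched as a quartile cut on segment size. This is only an illustration: the segment sizes are made-up probe counts, the function name is hypothetical, SWITCHdna itself is not modeled, and the quartile is computed with Python's default `statistics.quantiles` method, which may differ from how the authors derived their 64-probe cutoff.

```python
import statistics


def call_micro_aberrations(segment_sizes):
    """Return (threshold, flagged) where flagged holds the sizes below the first quartile.

    segment_sizes: number of contiguous probes in each called segment
    (hypothetical input; any size unit works as long as it is consistent).
    """
    # quantiles(..., n=4) returns the three quartile cut points; [0] is Q1.
    first_quartile = statistics.quantiles(segment_sizes, n=4)[0]
    flagged = [size for size in segment_sizes if size < first_quartile]
    return first_quartile, flagged


# Made-up probe counts for eight segments.
sizes = [10, 20, 30, 40, 50, 60, 70, 80]
threshold, micro = call_micro_aberrations(sizes)
print(threshold, micro)
```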
18.  A Robust Method to Analyze Copy Number Alterations of Less than 100 kb in Single Cells Using Oligonucleotide Array CGH 
PLoS ONE  2013;8(6):e67031.
Comprehensive genome-wide analyses of single cells have become increasingly important in cancer research, but remain a technically challenging task. Here, we provide a protocol for array comparative genomic hybridization (aCGH) of single cells. The protocol is based on an established adapter-linker PCR (WGAM) and allowed us to detect copy number alterations as small as 56 kb in single cells. In addition, we report on factors influencing the success of single cell aCGH downstream of the amplification method, including the characteristics of the reference DNA, the labeling technique, the amount of input DNA, reamplification, the aCGH resolution, and data analysis. In comparison with two other commercially available non-linear single cell amplification methods, WGAM showed a very good performance in aCGH experiments. Finally, we demonstrate that cancer cells that were processed and identified by the CellSearch® System and that were subsequently isolated from the CellSearch® cartridge as single cells by fluorescence activated cell sorting (FACS) could be successfully analyzed using our WGAM-aCGH protocol. We believe that even in the era of next-generation sequencing, our single cell aCGH protocol will be a useful and (cost-) effective approach to study copy number alterations in single cells at a resolution comparable to those reported currently for single cell digital karyotyping based on next-generation sequencing data.
PMCID: PMC3692546  PMID: 23825608
19.  The pitfalls of platform comparison: DNA copy number array technologies assessed 
BMC Genomics  2009;10:588.
The accurate and high resolution mapping of DNA copy number aberrations has become an important tool by which to gain insight into the mechanisms of tumourigenesis. There are various commercially available platforms for such studies, but there remains no general consensus as to the optimal platform. There have been several previous platform comparison studies, but they have either described older technologies, used less-complex samples, or have not addressed the issue of the inherent biases in such comparisons. Here we describe a systematic comparison of data from four leading microarray technologies (the Affymetrix Genome-wide SNP 5.0 array, Agilent High-Density CGH Human 244A array, Illumina HumanCNV370-Duo DNA Analysis BeadChip, and the Nimblegen 385 K oligonucleotide array). We compare samples derived from primary breast tumours and their corresponding matched normals, well-established cancer cell lines, and HapMap individuals. By careful consideration and avoidance of potential sources of bias, we aim to provide a fair assessment of platform performance.
By performing a theoretical assessment of the reproducibility, noise, and sensitivity of each platform, notable differences were revealed. Nimblegen exhibited between-replicate array variances an order of magnitude greater than the other three platforms, with Agilent slightly outperforming the others, and a comparison of self-self hybridizations revealed similar patterns. An assessment of the single probe power revealed that Agilent exhibits the highest sensitivity. Additionally, we performed an in-depth visual assessment of the ability of each platform to detect aberrations of varying sizes. As expected, all platforms were able to identify large aberrations in a robust manner. However, some focal amplifications and deletions were only detected in a subset of the platforms.
Although there are substantial differences in the design, density, and number of replicate probes, the comparison indicates a generally high level of concordance between platforms, despite differences in the reproducibility, noise, and sensitivity. In general, Agilent tended to be the best aCGH platform and Affymetrix, the superior SNP-CGH platform, but for specific decisions the results described herein provide a guide for platform selection and study design, and the dataset a resource for more tailored comparisons.
PMCID: PMC2797821  PMID: 19995423
20.  Pre-Descemet Corneal Dystrophy and X-linked Ichthyosis Associated with Deletion of Xp22.31 Containing the STS Gene 
Cornea  2013;32(9):1283-1287.
To report the association of X-linked ichthyosis and pre-Descemet corneal dystrophy with a deletion of the steroid sulfatase gene (STS) detected with microarray-based comparative genomic hybridization (aCGH).
A slit lamp biomicroscopic and cutaneous examination were performed, after which a saliva sample was collected as a source of genomic DNA. PCR amplification of each of the 10 exons of STS was performed, as was aCGH on genomic DNA to detect copy number variation (CNV).
Slit lamp examination revealed punctate opacities in the posterior corneal stroma of each eye. Cutaneous examination demonstrated scaling and flaking skin of the arms and legs. PCR amplification using primers designed to amplify each of the 10 exons of STS failed to produce any amplicons. Subsequently, aCGH performed on genomic DNA revealed a microdeletion in the Xp22.31 cytoband of approximately 1.7 megabases, containing STS.
The identification of a microdeletion within Xp22.3 containing STS with aCGH in an individual with suspected pre-Descemet corneal dystrophy and X-linked ichthyosis demonstrates the clinical utility of CNV analysis in confirming a presumptive clinical diagnosis.
PMCID: PMC3740086  PMID: 23807007
ichthyosis; pre-Descemet corneal dystrophy; steroid sulfatase deficiency; STS
21.  aCGHViewer: A Generic Visualization Tool For aCGH data 
Cancer informatics  2006;2:36-43.
Array-Comparative Genomic Hybridization (aCGH) is a powerful high throughput technology for detecting chromosomal copy number aberrations (CNAs) in cancer, aiming at identifying related critical genes from the affected genomic regions. However, advancing from a dataset with thousands of tabular lines to a few candidate genes can be an onerous and time-consuming process. To expedite the aCGH data analysis process, we have developed a user-friendly aCGH data viewer (aCGHViewer) as a conduit between the aCGH data tables and a genome browser. The data from a given aCGH analysis are displayed in a genomic view comprised of individual chromosome panels which can be rapidly scanned for interesting features. A chromosome panel containing a feature of interest can be selected to launch a detail window for that single chromosome. Selecting a data point of interest in the detail window launches a query to the UCSC or NCBI genome browser to allow the user to explore the gene content in the chromosomal region. Additionally, aCGHViewer can display aCGH and expression array data concurrently to visually correlate the two. aCGHViewer is a stand alone Java visualization application that should be used in conjunction with separate statistical programs. It operates on all major computer platforms and is freely available at
PMCID: PMC1847423  PMID: 17404607
array-CGH; CNA; gene expression; visualization
22.  aCGHViewer: A Generic Visualization Tool For aCGH data 
Cancer Informatics  2007;2:36-43.
Array-Comparative Genomic Hybridization (aCGH) is a powerful high throughput technology for detecting chromosomal copy number aberrations (CNAs) in cancer, aiming at identifying related critical genes from the affected genomic regions. However, advancing from a dataset with thousands of tabular lines to a few candidate genes can be an onerous and time-consuming process. To expedite the aCGH data analysis process, we have developed a user-friendly aCGH data viewer (aCGHViewer) as a conduit between the aCGH data tables and a genome browser. The data from a given aCGH analysis are displayed in a genomic view comprised of individual chromosome panels which can be rapidly scanned for interesting features. A chromosome panel containing a feature of interest can be selected to launch a detail window for that single chromosome. Selecting a data point of interest in the detail window launches a query to the UCSC or NCBI genome browser to allow the user to explore the gene content in the chromosomal region. Additionally, aCGHViewer can display aCGH and expression array data concurrently to visually correlate the two. aCGHViewer is a stand alone Java visualization application that should be used in conjunction with separate statistical programs. It operates on all major computer platforms and is freely available at
PMCID: PMC1847423  PMID: 17404607
array-CGH; CNA; gene expression; visualization
23.  A Multi-Sample Based Method for Identifying Common CNVs in Normal Human Genomic Structure Using High-Resolution aCGH Data 
PLoS ONE  2011;6(10):e26975.
It is difficult to identify copy number variations (CNV) in normal human genomic data due to noise and non-linear relationships between different genomic regions and signal intensity. A high-resolution array comparative genomic hybridization (aCGH) containing 42 million probes, which is very large compared to previous arrays, was recently published. Most existing CNV detection algorithms do not work well because of noise associated with the large amount of input data and because most of the current methods were not designed to analyze normal human samples. Normal human genome analysis often requires a joint approach across multiple samples. However, the majority of existing methods can only identify CNVs from a single sample.
Methodology and Principal Findings
We developed a multi-sample-based genomic variations detector (MGVD) that uses segmentation to identify common breakpoints across multiple samples and a k-means-based clustering strategy. Unlike previous methods, MGVD simultaneously considers multiple samples with different genomic intensities and identifies CNVs and CNV zones (CNVZs); CNVZ is a more precise measure of the location of a genomic variant than the CNV region (CNVR).
Conclusions and Significance
We designed a specialized algorithm to detect common CNVs from extremely high-resolution multi-sample aCGH data. MGVD showed high sensitivity and a low false discovery rate for a simulated data set, and outperformed most current methods when real, high-resolution HapMap datasets were analyzed. MGVD also had the fastest runtime compared to the other algorithms evaluated when actual, high-resolution aCGH data were analyzed. The CNVZs identified by MGVD can be used in association studies for revealing relationships between phenotypes and genomic aberrations. Our algorithm was developed with standard C++ and is available in Linux and MS Windows format in the STL library. It is freely available at:
PMCID: PMC3205051  PMID: 22073121
24.  Neurobeachin (NBEA) is a target of recurrent interstitial deletions at 13q13 in patients with MGUS and multiple myeloma 
Experimental hematology  2009;37(2):234-244.
Chromosome 13 deletions (del[13]), detected by metaphase cytogenetics, predict poor outcome in multiple myeloma (MM), but the gene(s) responsible have not been conclusively identified. We sought to identify tumor suppressor genes on chromosome 13 using a novel array comparative genomic hybridization (aCGH) strategy.
We identified DNA copy number losses on chromosome 13 using genomic DNA isolated from CD138 enriched bone marrow cells (tumor) from twenty patients with MM, monoclonal gammopathy of undetermined significance (MGUS) or amyloidosis. We used matched skin biopsy (germline) genomic DNA to control for copy number polymorphisms and a novel aCGH array dedicated to chromosome 13 to map somatic DNA gains and losses at ultra-high resolution (>385,000 probes; median probe spacing 60bp). We analyzed microarray expression data from an additional 262 patient samples both with and without del[13].
Two distinct minimally deleted regions at 13q14 and 13q13 were defined that affected the RB1 and NBEA genes, respectively. RB1 is a canonical tumor suppressor previously implicated in MM. NBEA is implicated in membrane trafficking in neurons and in PKA binding, and has no known role in cancer. Non-coding RNAs on chromosome 13 were not affected by interstitial deletions. Both the RB1 and NBEA genes were deleted in 40% of cases (8/20; five patients with del[13] detected by traditional methods and three patients with interstitial deletions detected by aCGH). Forty-one additional MM patient samples were used for complete exonic sequencing of RB1, but no somatic mutations were found. Along with RB1, NBEA gene expression was significantly reduced in cases with del[13].
The NBEA gene at 13q13, and its expression, are frequently disrupted in MM. Additional studies are warranted to evaluate the role of NBEA as a novel candidate tumor suppressor gene.
PMCID: PMC2868587  PMID: 19135901
Multiple Myeloma; array comparative genomic hybridization; RB1; NBEA
25.  Microdeletion and Microduplication Analysis of Chinese Conotruncal Defects Patients with Targeted Array Comparative Genomic Hybridization 
PLoS ONE  2013;8(10):e76314.
The current study aimed to develop a reliable targeted array comparative genomic hybridization (aCGH) to detect microdeletions and microduplications in congenital conotruncal defects (CTDs), especially in the 22q11.2 region, and some other chromosomal aberrations, such as 5p15-5p, 7q11.23 and 4p16.3.
Twenty-seven patients with CTDs, including 12 pulmonary atresia (PA), 10 double-outlet right ventricle (DORV), 3 transposition of great arteries (TGA), 1 tetralogy of Fallot (TOF) and one ventricular septal defect (VSD), were enrolled in this study and screened for pathogenic copy number variations (CNVs), using Agilent 8 x 15K targeted aCGH. Real-time quantitative polymerase chain reaction (qPCR) was performed to test the molecular results of targeted aCGH.
Four of 27 patients (14.8%) had 22q11.2 CNVs, 1 microdeletion and 3 microduplications. qPCR test confirmed the microdeletion and microduplication detected by the targeted aCGH.
Chromosomal abnormalities are a well-known cause of multiple congenital anomalies (MCA). This aCGH, using arrays with high-density coverage in the targeted regions, can detect genomic imbalances, including 22q11.2 and 10 other kinds of CNVs, effectively and quickly. This approach has the potential to be applied to detect aneuploidy and common microdeletion/microduplication syndromes on a single microarray.
PMCID: PMC3788710  PMID: 24098474
