PMCC

Related Articles

1.  BACOM: in silico detection of genomic deletion types and correction of normal cell contamination in copy number data 
Bioinformatics  2011;27(11):1473-1480.
Motivation: Identification of somatic DNA copy number alterations (CNAs) and significant consensus events (SCEs) in cancer genomes is a main task in discovering potential cancer-driving genes such as oncogenes and tumor suppressors. The recent development of SNP array technology has facilitated studies on copy number changes at a genome-wide scale with high resolution. However, existing copy number analysis methods are oblivious to normal cell contamination and cannot distinguish between contributions of cancerous and normal cells to the measured copy number signals. This contamination could significantly confound downstream analysis of CNAs and affect the power to detect SCEs in clinical samples.
Results: We report here a statistically principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately estimate genomic deletion type and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells. We tested the proposed method on two simulated datasets, two prostate cancer datasets and The Cancer Genome Atlas high-grade ovarian dataset, and obtained very promising results supported by the ground truth and biological plausibility. Moreover, based on a large number of comparative simulation studies, the proposed method gives significantly improved power to detect SCEs after in silico correction of normal tissue contamination. We develop a cross-platform open-source Java application that implements the whole pipeline of copy number analysis of heterogeneous cancer tissues including relevant processing steps. We also provide an R interface, bacomR, for running BACOM within the R environment, making it straightforward to include in existing data pipelines.
Availability: The cross-platform, stand-alone Java application, BACOM, the R interface, bacomR, all source code and the simulation data used in this article are freely available at authors' web site: http://www.cbil.ece.vt.edu/software.htm.
Contact: yuewang@vt.edu
Supplementary Information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btr183
PMCID: PMC3102226  PMID: 21498400
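The contamination correction described in the BACOM abstract above can be illustrated with a simple linear mixture. This is a minimal sketch, not BACOM's Bayesian estimator: it assumes that normal cells carry two copies, that the observed copy number is a weighted average of normal and tumor cells, and that the deletion type of a segment is already known. The function names are hypothetical and do not come from the BACOM or bacomR code.

```python
def normal_fraction_from_deletion(observed_cn, deletion_type):
    """Estimate the normal-cell fraction alpha from one deleted segment.

    Assumed mixture: observed = 2 * alpha + tumor_cn * (1 - alpha),
    where tumor_cn is 1 (hemizygous deletion) or 0 (homozygous deletion).
    """
    if deletion_type == "hemizygous":
        tumor_cn = 1.0
    elif deletion_type == "homozygous":
        tumor_cn = 0.0
    else:
        raise ValueError("deletion_type must be 'hemizygous' or 'homozygous'")
    # Solve the mixture equation for alpha and clamp to [0, 1].
    alpha = (observed_cn - tumor_cn) / (2.0 - tumor_cn)
    return min(max(alpha, 0.0), 1.0)


def correct_copy_number(observed_cn, alpha):
    """Recover the tumor-cell copy number given the normal fraction alpha."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    # Invert observed = 2 * alpha + tumor_cn * (1 - alpha); floor at zero.
    return max(0.0, (observed_cn - 2.0 * alpha) / (1.0 - alpha))


# Illustrative values only: a hemizygous deletion observed at 1.4 copies.
alpha = normal_fraction_from_deletion(1.4, "hemizygous")
corrected_gain = correct_copy_number(2.6, alpha)
```

Under this assumption, a gain that appears attenuated in the mixed sample is rescaled to the tumor-only copy number once alpha is known, which is why uncorrected contamination lowers the power to detect consensus events.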
2.  Network modeling of the transcriptional effects of copy number aberrations in glioblastoma 
DNA copy number aberrations (CNAs) are a characteristic feature of cancer genomes. In this work, Rebecka Jörnsten, Sven Nelander and colleagues combine network modeling and experimental methods to analyze the systems-level effects of CNAs in glioblastoma.
We introduce a modeling approach termed EPoC (Endogenous Perturbation analysis of Cancer), enabling the construction of global, gene-level models that causally connect gene copy number with expression in glioblastoma. On the basis of the resulting model, we predict genes that are likely to be disease-driving and validate selected predictions experimentally. We also demonstrate that further analysis of the network model by sparse singular value decomposition allows stratification of patients with glioblastoma into short-term and long-term survivors, introducing decomposed network models as a useful principle for biomarker discovery. Finally, in systematic comparisons, we demonstrate that EPoC is computationally efficient and yields more consistent results than mRNA-only methods, standard eQTL methods, and two recent multivariate methods for genotype–mRNA coupling.
Gains and losses of chromosomal material (DNA copy number aberrations; CNAs) are a characteristic feature of cancer genomes. At the level of a single locus, it is well known that increased copy number (gene amplification) typically leads to increased gene expression, whereas decreased copy number (gene deletion) leads to decreased gene expression (Pollack et al, 2002; Lee et al, 2008; Nilsson et al, 2008). However, CNAs also affect the expression of genes located outside the amplified/deleted region itself via indirect mechanisms. To fully understand the action of CNAs, it is therefore necessary to analyze their action in a network context. Toward this goal, improved computational approaches will be important, if not essential.
To determine the global effects on transcription of CNAs in the brain tumor glioblastoma, we develop EPoC (Endogenous Perturbation analysis of Cancer), a computational technique capable of inferring sparse, causal network models by combining genome-wide, paired CNA- and mRNA-level data. EPoC aims to detect disease-driving copy number aberrations and their effect on target mRNA expression, and stratify patients into long-term and short-term survivors. Technically, EPoC relates CNA perturbations to mRNA responses by matrix equations, derived from a steady-state approximation of the transcriptional network. Patient prognostic scores are obtained from singular value decompositions of the network matrix. The models are constructed by solving a large-scale, regularized regression problem.
We apply EPoC to glioblastoma data from The Cancer Genome Atlas (TCGA) consortium (186 patients). The identified CNA-driven network comprises 10 672 genes, and contains a number of copy number-altered genes that control multiple downstream genes. Highly connected hub genes include well-known oncogenes and tumor suppressor genes that are frequently deleted or amplified in glioblastoma, including EGFR, PDGFRA, CDKN2A and CDKN2B, confirming a clear association between these aberrations and transcriptional variability of these brain tumors. In addition, we identify a number of hub genes that have previously not been associated with glioblastoma, including interferon alpha 1 (IFNA1), myeloid/lymphoid or mixed-lineage leukemia translocated to 10 (MLLT10, a well-known leukemia gene), glutamate decarboxylase 2 (GAD2), a postulated glutamate receptor (GPR158) and Necdin (NDN). Furthermore, we demonstrate that the network model contains useful information on downstream target genes (including stem cell regulators), and possible drug targets.
We proceed to explore the validity of a small network region experimentally. Introducing experimental perturbations of NDN and other targets in four glioblastoma cell lines (T98G, U-87MG, U-343MG and U-373MG), we confirm several predicted mechanisms. We also demonstrate that the TCGA glioblastoma patients can be stratified into long-term and short-term survivors, using our proposed prognostic scores derived from a singular value decomposition of the network model. Finally, we compare EPoC to existing methods for mRNA network analysis and expression quantitative trait locus (eQTL) methods, and demonstrate that EPoC produces more consistent models between technically independent glioblastoma data sets, and that the EPoC models exhibit better overlap with known protein–protein interaction networks and pathway maps.
In summary, we conclude that large-scale integrative modeling reveals mechanistically and prognostically informative networks in human glioblastoma. Our approach operates at the gene level and our data support that individual hub genes can be identified in practice. Very large aberrations, however, cannot be fully resolved by the current modeling strategy.
DNA copy number aberrations (CNAs) are a hallmark of cancer genomes. However, little is known about how such changes affect global gene expression. We develop a modeling framework, EPoC (Endogenous Perturbation analysis of Cancer), to (1) detect disease-driving CNAs and their effect on target mRNA expression, and to (2) stratify cancer patients into long- and short-term survivors. Our method constructs causal network models of gene expression by combining genome-wide DNA- and RNA-level data. Prognostic scores are obtained from a singular value decomposition of the networks. By applying EPoC to glioblastoma data from The Cancer Genome Atlas consortium, we demonstrate that the resulting network models contain known disease-relevant hub genes, reveal interesting candidate hubs, and uncover predictors of patient survival. Targeted validations in four glioblastoma cell lines support selected predictions, and implicate the p53-interacting protein Necdin in suppressing glioblastoma cell growth. We conclude that large-scale network modeling of the effects of CNAs on gene expression may provide insights into the biology of human cancer. Free software in MATLAB and R is provided.
doi:10.1038/msb.2011.17
PMCID: PMC3101951  PMID: 21525872
cancer biology; cancer genomics; glioblastoma
3.  Landscape of somatic allelic imbalances and copy number alterations in HER2-amplified breast cancer 
Breast Cancer Research : BCR  2011;13(6):R129.
Introduction
Human epidermal growth factor receptor 2 (HER2)-amplified breast cancer represents a clinically well-defined subgroup due to availability of targeted treatment. However, HER2-amplified tumors have been shown to be heterogeneous at the genomic level by genome-wide microarray analyses, pointing towards a need of further investigations for identification of recurrent copy number alterations and delineation of patterns of allelic imbalance.
Methods
High-density whole genome array-based comparative genomic hybridization (aCGH) and single nucleotide polymorphism (SNP) array data from 260 HER2-amplified breast tumors or cell lines, and 346 HER2-negative breast cancers with molecular subtype information were assembled from different repositories. Copy number alteration (CNA), loss-of-heterozygosity (LOH), copy number neutral allelic imbalance (CNN-AI), subclonal CNA and patterns of tumor DNA ploidy were analyzed using bioinformatical methods such as genomic identification of significant targets in cancer (GISTIC) and genome alteration print (GAP). The patterns of tumor ploidy were confirmed in 338 unrelated breast cancers analyzed by DNA flow cytometry with concurrent BAC aCGH and gene expression data.
Results
A core set of 36 genomic regions commonly affected by copy number gain or loss was identified by integrating results with a previous study, together comprising > 400 HER2-amplified tumors. While CNN-AI frequency appeared evenly distributed over chromosomes in HER2-amplified tumors, not targeting specific regions and often < 20% in frequency, the occurrence of LOH was strongly associated with regions of copy number loss. HER2-amplified and HER2-negative tumors stratified by molecular subtypes displayed different patterns of LOH and CNN-AI, with basal-like tumors showing highest frequencies followed by HER2-amplified and luminal B cases. Tumor aneuploidy was strongly associated with increasing levels of LOH, CNN-AI, CNAs and occurrence of subclonal copy number events, irrespective of subtype. Finally, SNP data from individual tumors indicated that genomic amplification in general appears as monoallelic, that is, it preferentially targets one parental chromosome in HER2-amplified tumors.
Conclusions
We have delineated the genomic landscape of CNAs, amplifications, LOH, and CNN-AI in HER2-amplified breast cancer, but also demonstrated a strong association between different types of genomic aberrations and tumor aneuploidy irrespective of molecular subtype.
doi:10.1186/bcr3075
PMCID: PMC3326571  PMID: 22169037
4.  CMDS: a population-based method for identifying recurrent DNA copy number aberrations in cancer from high-resolution data 
Bioinformatics  2009;26(4):464-469.
Motivation: DNA copy number aberration (CNA) is a hallmark of genomic abnormality in tumor cells. Recurrent CNA (RCNA) occurs in multiple cancer samples across the same chromosomal region and has greater implication in tumorigenesis. Current commonly used methods for RCNA identification require CNA calling for individual samples before cross-sample analysis. This two-step strategy may result in a heavy computational burden, as well as a loss of the overall statistical power due to segmentation and discretization of individual sample's data. We propose a population-based approach for RCNA detection with no need of single-sample analysis, which is statistically powerful, computationally efficient and particularly suitable for high-resolution and large-population studies.
Results: Our approach, correlation matrix diagonal segmentation (CMDS), identifies RCNAs based on a between-chromosomal-site correlation analysis. Directly using the raw intensity ratio data from all samples and adopting a diagonal transformation strategy, CMDS substantially reduces computational burden and can obtain results very quickly from large datasets. Our simulation indicates that the statistical power of CMDS is higher than that of single-sample CNA calling based two-step approaches. We applied CMDS to two real datasets of lung cancer and brain cancer from Affymetrix and Illumina array platforms, respectively, and successfully identified known regions of CNA associated with EGFR, KRAS and other important oncogenes. CMDS provides a fast, powerful and easily implemented tool for the RCNA analysis of large-scale data from cancer genomes.
Availability: The R and C programs implementing our method are available at https://dsgweb.wustl.edu/qunyuan/software/cmds.
Contact: qunyuan@wustl.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btp708
PMCID: PMC2852218  PMID: 20031968
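The population-based idea in the CMDS abstract above can be sketched as follows. This is a simplified illustration, not the authors' R/C implementation: it assumes the score for a site is the mean Pearson correlation, across samples, between that site and its neighbours within a narrow band around the diagonal of the site-by-site correlation matrix. The function names and the `band` parameter are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def diagonal_scores(data, band=2):
    """Score each site by its mean correlation with nearby sites.

    data[s][j] is the raw intensity (log) ratio of sample s at site j.
    Only entries within `band` sites of the correlation-matrix diagonal
    are used, so no per-sample segmentation or CNA calling is needed.
    """
    n_sites = len(data[0])
    columns = [[row[j] for row in data] for j in range(n_sites)]
    scores = []
    for j in range(n_sites):
        lo, hi = max(0, j - band), min(n_sites, j + band + 1)
        r = [pearson(columns[j], columns[k]) for k in range(lo, hi) if k != j]
        scores.append(sum(r) / len(r))
    return scores


# Toy data (6 samples x 6 sites): samples 0-2 share a gain over sites 2-4.
toy = [
    [0.1, -0.1, 1.0, 1.1, 0.9, 0.0],
    [-0.1, 0.0, 0.9, 1.0, 1.1, 0.1],
    [0.0, 0.1, 1.1, 0.9, 1.0, -0.1],
    [0.1, 0.0, 0.0, 0.1, -0.1, -0.1],
    [-0.1, 0.1, 0.1, -0.1, 0.0, 0.1],
    [0.0, -0.1, -0.1, 0.0, 0.1, 0.0],
]
scores = diagonal_scores(toy, band=1)
```

Sites inside a recurrent aberration move together across samples, so their neighbourhood correlation is high, while sites dominated by noise score near zero or below. A full analysis would follow this with segmentation of the score profile and a significance test.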
5.  Conditional random pattern model for copy number aberration detection 
BMC Bioinformatics  2010;11:200.
Background
DNA copy number aberration (CNA) is very important in the pathogenesis of tumors and other diseases. For example, CNAs may result in suppression of anti-oncogenes and activation of oncogenes, which would cause certain types of cancers. High density single nucleotide polymorphism (SNP) array data are widely used for CNA detection. However, it is nontrivial to detect CNAs automatically because the signals obtained from high density SNP arrays often have a low signal-to-noise ratio (SNR), which might be caused by whole genome amplification, mixtures of normal and tumor cells, experimental noise or other technical limitations. As the SNR decreases, many false CNA regions are detected and true CNA regions are missed. Thus, more sophisticated statistical models are needed to make CNA detection from low-SNR signals more robust and reliable.
Results
This paper presents a conditional random pattern (CRP) model for CNA detection in which many contextual cues are explored to suppress noise and improve CNA detection accuracy. Both simulated and real data are used to evaluate the proposed model, and the validation results show that the CRP model is more robust and reliable in the presence of noise for CNA detection using high density SNP array data, compared to a number of widely used software packages.
Conclusions
The proposed conditional random pattern (CRP) model could effectively detect the CNA regions in the presence of noise.
doi:10.1186/1471-2105-11-200
PMCID: PMC2876128  PMID: 20412592
6.  Clinical Omics Analysis of Colorectal Cancer Incorporating Copy Number Aberrations and Gene Expression Data 
Cancer Informatics  2010;9:147-161.
Background:
Colorectal cancer (CRC) is one of the most frequently occurring cancers in Japan, and thus a wide range of methods have been deployed to study the molecular mechanisms of CRC. In this study, we performed a comprehensive analysis of CRC, incorporating copy number aberration (CNA) and gene expression data. For the last four years, we have been collecting data from CRC cases and organizing the information as an “omics” study by integrating many kinds of analysis into a single comprehensive investigation.
In our previous studies, we had experienced difficulty in finding genes related to CRC, as we observed higher noise levels in the expression data than in the data for other cancers.
Because chromosomal aberrations are often observed in CRC, here, we have performed a combination of CNA analysis and expression analysis in order to identify some new genes responsible for CRC.
This study was performed as part of the Clinical Omics Database Project at Tokyo Medical and Dental University. The purpose of this study was to investigate the mechanism of genetic instability in CRC by this combination of expression analysis and CNA, and to establish a new method for the diagnosis and treatment of CRC.
Materials and methods:
Comprehensive gene expression analysis was performed on 79 CRC cases using an Affymetrix Gene Chip, and comprehensive CNA analysis was performed using an Affymetrix DNA Sty array. To avoid the contamination of cancer tissue with normal cells, laser micro-dissection was performed before DNA/RNA extraction. Data analysis was performed using original software written in the R language.
Result:
We observed a high percentage of CNA in colorectal cancer, including copy number gains at 7, 8q, 13 and 20q, and copy number losses at 8p, 17p and 18. Gene expression analysis provided many candidates for CRC-related genes, but their association with CRC did not reach the level of statistical significance. The combination of CNA and gene expression analysis, together with the clinical information, suggested UGT2B28, LOC440995, CXCL6, SULT1B1, RALBP1, TYMS, RAB12, RNMT, ARHGDIB, S100A2, ABHD2, OIT3 and ABHD12 as genes that are possibly associated with CRC. Some of these genes have already been reported as being related to CRC. TYMS has been reported as being associated with resistance to the anti-cancer drug 5-fluorouracil, and we observed a copy number increase for this gene. RALBP1, ARHGDIB and S100A2 have been reported as oncogenes, and we observed copy number increases in each. ARHGDIB has been reported as a metastasis-related gene, and our data also showed copy number increases of this gene in cases with metastasis.
Conclusion:
The combination of CNA analysis and gene expression analysis was a more effective method for finding genes associated with the clinicopathological classification of CRC than either analysis alone. Using this combination of methods, we were able to detect genes that have already been associated with CRC. We also identified additional candidate genes that may be new markers or targets for this form of cancer.
PMCID: PMC2918356  PMID: 20706620
colorectal cancer; clinical omics; microarray; copy number aberration
7.  Novel Genes Associated with Colorectal Cancer Are Revealed by High Resolution Cytogenetic Analysis in a Patient Specific Manner 
PLoS ONE  2013;8(10):e76251.
Genomic abnormalities leading to colorectal cancer (CRC) include somatic events causing copy number aberrations (CNAs) as well as copy neutral manifestations such as loss of heterozygosity (LOH) and uniparental disomy (UPD). We studied the causal effect of these events by analyzing high resolution cytogenetic microarray data of 15 tumor-normal paired samples. We detected 144 genes affected by CNAs. A subset of 91 genes are known to be CRC related, yet high GISTIC scores indicate 24 genes on chromosomes 7, 8, 18 and 20 to be strongly relevant. Combining GISTIC ranking with functional analyses and degree of loss/gain, we identify three genes in regions of significant loss (ATP8B1, NARS, and ATP5A1) and eight in regions of gain (CTCFL, SPO11, ZNF217, PLEKHA8, HOXA3, GPNMB, IGF2BP3 and PCAT1) as novel in their association with CRC. Pathway and target prediction analysis of CNA affected genes and microRNAs, respectively, indicates the TGF-β signaling pathway to be involved in causing CRC. Finally, LOH and UPD collectively affected nine cancer related genes. Transcription factor binding sites on regions of >35% copy number loss/gain influenced 16 CRC genes. Our analysis shows patient specific CRC manifestations at the genomic level and that these different events affect individual CRC patients differently.
doi:10.1371/journal.pone.0076251
PMCID: PMC3813709  PMID: 24204606
8.  Integrative analysis reveals the direct and indirect interactions between DNA copy number aberrations and gene expression changes 
Bioinformatics (Oxford, England)  2008;24(7):889-896.
Motivation
DNA copy number aberrations (CNAs) and gene expression (GE) changes provide valuable information for studying chromosomal instability and its consequences in cancer. While it is clear that the structural aberrations and the transcript levels are intertwined, their relationship is more complex and subtle than initially suspected. Most studies so far have focused on how a CNA affects the expression levels of those genes contained within that CNA.
Results
To better understand the impact of CNAs on expression, we investigated the correlation of each CNA to all other genes in the genome. The correlations are computed over multiple patients that have both expression and copy number measurements in brain, bladder, and breast cancer data sets. We find that a CNA has a direct impact on the gene amplified or deleted, but it also has a broad, indirect impact elsewhere. To identify a set of CNAs that is coordinately associated with the expression changes of a set of genes, we used a biclustering algorithm on the correlation matrix. For each of the three cancer types examined, the aberrations in several loci are associated with cancer-type specific biological pathways that have been described in the literature: CNAs of chromosome (chr) 7p13 were significantly correlated with epidermal growth factor receptor signaling pathway in glioblastoma multiforme, chr 13q with NF-kappaB cascades in bladder cancer, and chr 11p with Reck pathway in breast cancer. In all three data sets, gene sets related to cell cycle/division such as M phase, DNA replication, and cell division were also associated with CNAs. Our results suggest that CNAs are both directly and indirectly correlated with changes in expression and that it is beneficial to examine the indirect effects of CNAs.
doi:10.1093/bioinformatics/btn034
PMCID: PMC2600603  PMID: 18263644
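The correlation step described in the integrative-analysis abstract above can be sketched as follows. This is an illustrative outline only, assuming Pearson correlation computed across patients who have both copy number and expression measurements; it omits the biclustering step. The function names, the `GENE_X` label and all numeric values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cna_expression_correlations(copy_number, expression):
    """Correlate every CNA locus with every gene across patients.

    copy_number and expression map a gene name to a per-patient list,
    with patients in the same order. Returns two dicts keyed by
    (locus, gene): direct effects (locus == gene) and indirect effects.
    """
    direct, indirect = {}, {}
    for locus, cn_values in copy_number.items():
        for gene, expr_values in expression.items():
            r = pearson(cn_values, expr_values)
            target = direct if locus == gene else indirect
            target[(locus, gene)] = r
    return direct, indirect


# Made-up values for five patients, for illustration only.
cn = {"EGFR": [2, 4, 2, 5, 3]}
expr = {
    "EGFR": [1.0, 2.1, 0.9, 2.6, 1.5],
    "GENE_X": [3.0, 1.9, 3.1, 1.4, 2.5],
}
direct, indirect = cna_expression_correlations(cn, expr)
```

Separating the two dictionaries mirrors the abstract's distinction: the direct entry measures how the aberration affects the amplified or deleted gene itself, while the indirect entries capture associations with genes elsewhere in the genome.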
9.  Genomic Profiling of Submucosal-Invasive Gastric Cancer by Array-Based Comparative Genomic Hybridization 
PLoS ONE  2011;6(7):e22313.
Genomic copy number aberrations (CNAs) in gastric cancer have already been extensively characterized by array comparative genomic hybridization (array CGH) analysis. However, involvement of genomic CNAs in the process of submucosal invasion and lymph node metastasis in early gastric cancer is still poorly understood. In this study, to address this issue, we collected a total of 59 tumor samples from 27 patients with submucosal-invasive gastric cancers (SMGC), analyzed their genomic profiles by array CGH, and compared them between paired samples of mucosal (MU) and submucosal (SM) invasion (23 pairs), and SM invasion and lymph node (LN) metastasis (9 pairs). Initially, we hypothesized that acquisition of specific CNA(s) is important for these processes. However, we observed no significant difference in the number of genomic CNAs between paired MU and SM, and between paired SM and LN. Furthermore, we were unable to find any CNAs specifically associated with SM invasion or LN metastasis. Among the 23 cases analyzed, 15 had some similar pattern of genomic profiling between SM and MU. Interestingly, 13 of the 15 cases also showed some differences in genomic profiles. These results suggest that the majority of SMGCs are composed of heterogeneous subpopulations derived from the same clonal origin. Comparison of genomic CNAs between SMGCs with and without LN metastasis revealed that gain of 11q13, 11q14, 11q22, 14q32 and amplification of 17q21 were more frequent in metastatic SMGCs, suggesting that these CNAs are related to LN metastasis of early gastric cancer. In conclusion, our data suggest that generation of genetically distinct subclones, rather than acquisition of specific CNA at MU, is integral to the process of submucosal invasion, and that subclones that acquire gain of 11q13, 11q14, 11q22, 14q32 or amplification of 17q21 are likely to become metastatic.
doi:10.1371/journal.pone.0022313
PMCID: PMC3141024  PMID: 21811585
10.  Wavelet-based identification of DNA focal genomic aberrations from single nucleotide polymorphism arrays 
BMC Bioinformatics  2011;12:146.
Background
Copy number aberrations (CNAs) are an important molecular signature in cancer initiation, development, and progression. However, these aberrations span a wide range of chromosomes, making it hard to distinguish cancer related genes from other genes that are not closely related to cancer but are located in broadly aberrant regions. With the current availability of high-resolution data sets such as single nucleotide polymorphism (SNP) microarrays, it has become an important issue to develop a computational method to detect driving genes related to cancer development located in the focal regions of CNAs.
Results
In this study, we introduce a novel method referred to as the wavelet-based identification of focal genomic aberrations (WIFA). Because wavelet analysis is a multi-resolution approach, it makes it possible to effectively identify focal genomic aberrations within broadly aberrant regions. The proposed method integrates multiple cancer samples, enabling the detection of aberrations that are consistent across samples. We then apply this method to glioblastoma multiforme and lung cancer data sets from the SNP microarray platform. Through this process, we confirm the ability to detect previously known cancer related genes from both cancer types with high accuracy. Also, the application of this approach to a lung cancer data set identifies focal amplification regions that contain known oncogenes but are not reported by a recent CNA detection algorithm, GISTIC: SMAD7 (chr18q21.1) and FGF10 (chr5p12).
Conclusions
Our results suggest that WIFA can be used to reveal cancer related genes in various cancer data sets.
doi:10.1186/1471-2105-12-146
PMCID: PMC3114745  PMID: 21569311
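The multi-resolution idea behind the WIFA abstract above can be illustrated with a Haar approximation. This is a minimal sketch under stated assumptions, not the WIFA algorithm: it treats the coarse Haar approximation (block means over 2**level probes) as the broad aberration and the residual as the focal component, on a single sample. The function names, the `level` and `threshold` parameters, and the toy profile are hypothetical.

```python
def haar_approximation(signal, level):
    """Haar approximation at `level`: replace each block of 2**level
    consecutive probes with the block mean."""
    width = 2 ** level
    if len(signal) % width != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    approx = []
    for start in range(0, len(signal), width):
        block = signal[start:start + width]
        mean = sum(block) / width
        approx.extend([mean] * width)
    return approx


def focal_component(signal, level):
    """Residual after removing the coarse (broad) approximation."""
    broad = haar_approximation(signal, level)
    return [x - b for x, b in zip(signal, broad)]


def focal_probes(signal, level, threshold):
    """Indices of probes whose focal component exceeds `threshold`."""
    residual = focal_component(signal, level)
    return [i for i, d in enumerate(residual) if d > threshold]


# Toy profile: probes 8-15 carry a broad gain of 0.5, and probes 10-11
# carry an additional focal amplification.
profile = [0.0] * 8 + [0.5, 0.5, 2.0, 2.0, 0.5, 0.5, 0.5, 0.5]
hits = focal_probes(profile, level=3, threshold=1.0)
```

Here the broad gain is absorbed by the coarse approximation, so only the two focally amplified probes stand out in the residual. Detecting aberrations that are consistent across samples, as the abstract describes, would require combining such residuals over many tumors.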
11.  Microsatellite Stable Colorectal Cancers Stratified by the BRAF V600E Mutation Show Distinct Patterns of Chromosomal Instability 
PLoS ONE  2014;9(3):e91739.
The BRAF (V600E) mutation in colorectal cancers that are microsatellite stable (MSS) confers a poor patient prognosis, whereas BRAF mutant microsatellite-unstable (MSI) colorectal cancers have an excellent prognosis. BRAF wild type cancers are typically MSS and display chromosomal instability (CIN). CIN has not been extensively studied on a genome-wide basis in relation to BRAF mutational status in colorectal cancer. BRAF mutant/MSS (BRAFmut/MSS) cancers (n = 33) and BRAF mutant/MSI (BRAFmut/MSI) cancers (n = 30) were compared for presence of copy number aberrations (CNAs) indicative of CIN, with BRAF wild type/MSS (BRAFwt/MSS) cancers (n = 18) using Illumina CytoSNP-12 arrays. BRAFmut/MSS and BRAFwt/MSS cancers showed comparable numbers of CNAs/cancer at 32.8 and 29.8 respectively. However, there were differences in patterns of CNA length between MSS cohorts, with BRAFmut/MSS cancers having significantly greater proportions of focal CNAs compared to BRAFwt/MSS cancers (p<0.0001), whereas whole chromosomal arm CNAs were more common in BRAFwt/MSS cancers (p<0.0001). This related to a reduced average CNA length in BRAFmut/MSS compared to BRAFwt/MSS cancers (20.7 Mb vs 33.4 Mb; p<0.0001), and a smaller average percent of CIN affected genomes in BRAFmut/MSS compared to BRAFwt/MSS cancers (23.9% vs 34.9% respectively). BRAFmut/MSI cancers were confirmed to have low CNA rates (5.4/cancer) and minimal CIN-affected genomes (average of 4.5%) compared to MSS cohorts (p<0.0001). BRAFmut/MSS cancers had more frequent deletion CNAs compared to BRAFwt/MSS cancers on 6p and 17q at loci not typically correlated with colorectal cancer, and greater amplification CNAs on 8q and 18q compared to BRAFwt/MSS cancers. These results indicate that comparable rates of CIN occur between MSS subgroups; however, significant differences in their patterns of instability exist, with BRAFmut/MSS cancers showing a ‘focal pattern’ and BRAFwt/MSS cancers having a ‘whole arm pattern’ of CIN. This and the genomic loci more frequently affected in BRAFmut/MSS cancers provides further evidence of the biological distinctions of this important cancer subgroup.
doi:10.1371/journal.pone.0091739
PMCID: PMC3961279  PMID: 24651849
12.  A Comprehensive Characterization of Genome-Wide Copy Number Aberrations in Colorectal Cancer Reveals Novel Oncogenes and Patterns of Alterations 
PLoS ONE  2012;7(7):e42001.
To develop a comprehensive overview of copy number aberrations (CNAs) in stage-II/III colorectal cancer (CRC), we characterized 302 tumors from the PETACC-3 clinical trial. Microsatellite-stable (MSS) samples (n = 269) had 66 minimal common CNA regions, with frequent gains on 20q (72.5%), 7 (41.8%), 8q (33.1%) and 13q (51.0%) and losses on 18 (58.6%), 4q (26%) and 21q (21.6%). MSS tumors have significantly more CNAs than microsatellite-instable (MSI) tumors: within the MSI tumors a novel deletion of the tumor suppressor WWOX at 16q23.1 was identified (p<0.01). Focal aberrations identified by the GISTIC method confirmed amplifications of oncogenes including EGFR, ERBB2, CCND1, MET, and MYC, and deletions of tumor suppressors including TP53, APC, and SMAD4, and gene expression was highly concordant with copy number aberration for these genes. Novel amplicons included putative oncogenes such as WNK1 and HNF4A, which also showed high concordance between copy number and expression. Survival analysis associated a specific patient segment characterized by chromosome 20q gains with improved overall survival, which might be due to higher expression of genes such as EEF1B2 and PTK6. The CNA clustering also grouped tumors characterized by a poor prognosis BRAF-mutant-like signature derived from mRNA data from this cohort. We further revealed non-random correlation between CNAs among unlinked loci, including positive correlation between 20q gain and 8q gain, and 20q gain and chromosome 18 loss, consistent with co-selection of these CNAs. These results reinforce the non-random nature of somatic CNAs in stage-II/III CRC and highlight loci and genes that may play an important role in driving the development and outcome of this disease.
doi:10.1371/journal.pone.0042001
PMCID: PMC3409212  PMID: 22860045
13.  Determining Frequent Patterns of Copy Number Alterations in Cancer 
PLoS ONE  2010;5(8):e12028.
Cancer progression is often driven by an accumulation of genetic changes but also accompanied by increasing genomic instability. These processes lead to a complicated landscape of copy number alterations (CNAs) within individual tumors and great diversity across tumor samples. High resolution array-based comparative genomic hybridization (aCGH) is being used to profile CNAs of ever larger tumor collections, and better computational methods for processing these data sets and identifying potential driver CNAs are needed. Typical studies of aCGH data sets take a pipeline approach, starting with segmentation of profiles, calls of gains and losses, and finally determination of frequent CNAs across samples. A drawback of pipelines is that choices at each step may produce different results, and biases are propagated forward. We present a mathematically robust new method that exploits probe-level correlations in aCGH data to discover subsets of samples that display common CNAs. Our algorithm is related to recent work on maximum-margin clustering. It does not require pre-segmentation of the data and also provides grouping of recurrent CNAs into clusters. We tested our approach on a large cohort of glioblastoma aCGH samples from The Cancer Genome Atlas and recovered almost all CNAs reported in the initial study. We also found additional significant CNAs missed by the original analysis but supported by earlier studies, and we identified significant correlations between CNAs.
doi:10.1371/journal.pone.0012028
PMCID: PMC2920822  PMID: 20711339
14.  Germline DNA Copy Number Aberrations Identified as Potential Prognostic Factors for Breast Cancer Recurrence 
PLoS ONE  2013;8(1):e53850.
Breast cancer recurrence (BCR) is a common treatment outcome despite curative-intent primary treatment of non-metastatic breast cancer. Currently used prognostic and predictive factors utilize tumor-based markers, and are not optimal determinants of risk of BCR. Germline-based copy number aberrations (CNAs) have not been evaluated as determinants of predisposition to experience BCR. In this study, we accessed germline DNA from 369 female breast cancer subjects who received curative-intent primary treatment following diagnosis. Of these, 155 experienced BCR and 214 did not, after a median duration of follow up after breast cancer diagnosis of 6.35 years (range = 0.60–21.78) and 8.60 years (range = 3.08–13.57), respectively. Whole genome CNA genotyping was performed on the Affymetrix SNP array 6.0 platform. CNAs were identified using the SNP-Fast Adaptive States Segmentation Technique 2 algorithm implemented in Nexus Copy Number 6.0. Six samples were removed due to poor quality scores, leaving 363 samples for further analysis. We identified 18,561 CNAs with ≥1 kb as a predefined cut-off for observed aberrations. Univariate survival analyses (log-rank tests) identified seven CNAs (two copy number gains and five copy-neutral losses of heterozygosity, CN-LOHs) showing significant differences (P<2.01×10^-5) in recurrence-free survival (RFS) probabilities with and without CNAs. We also observed three additional but distinct CN-LOHs showing significant differences in RFS probabilities (P<2.86×10^-5) when analyses were restricted to stratified cases (luminal A, n = 208) only. After adjusting for tumor stage and grade in multivariate analyses (Cox proportional hazards models), all the CNAs remained strongly associated with the phenotype of BCR. Of these, we confirmed three CNAs at 17q11.2, 11q13.1 and 6q24.1 in representative samples using independent genotyping platforms.
Our results suggest further investigations on the potential use of germline DNA variations as prognostic markers in cancer-associated phenotypes.
doi:10.1371/journal.pone.0053850
PMCID: PMC3547038  PMID: 23342018
15.  Global Burden of Sickle Cell Anaemia in Children under Five, 2010–2050: Modelling Based on Demographics, Excess Mortality, and Interventions 
PLoS Medicine  2013;10(7):e1001484.
Frédéric Piel and colleagues combine national sickle cell anemia (SCA) frequencies with projected demographic data to estimate the number of SCA births in children under five globally from 2010 to 2050, and then estimate the number of lives that could be saved following implementation of specific health interventions starting in 2015.
Please see later in the article for the Editors' Summary
Background
The global burden of sickle cell anaemia (SCA) is set to rise as a consequence of improved survival in high-prevalence low- and middle-income countries and population migration to higher-income countries. The host of quantitative evidence documenting these changes has not been assembled at the global level. The purpose of this study is to estimate trends in the future number of newborns with SCA and the number of lives that could be saved in under-five children with SCA by the implementation of different levels of health interventions.
Methods and Findings
First, we calculated projected numbers of newborns with SCA for each 5-y interval between 2010 and 2050 by combining estimates of national SCA frequencies with projected demographic data. We then accounted for under-five mortality (U5m) projections and tested different levels of excess mortality for children with SCA, reflecting the benefits of implementing specific health interventions for under-five patients in 2015, to assess the number of lives that could be saved with appropriate health care services. The estimated number of newborns with SCA globally will increase from 305,800 (confidence interval [CI]: 238,400–398,800) in 2010 to 404,200 (CI: 242,500–657,600) in 2050. It is likely that Nigeria (2010: 91,000 newborns with SCA [CI: 77,900–106,100]; 2050: 140,800 [CI: 95,500–200,600]) and the Democratic Republic of the Congo (2010: 39,700 [CI: 32,600–48,800]; 2050: 44,700 [CI: 27,100–70,500]) will remain the countries most in need of policies for the prevention and management of SCA. We predict a decrease in the annual number of newborns with SCA in India (2010: 44,400 [CI: 33,700–59,100]; 2050: 33,900 [CI: 15,900–64,700]). The implementation of basic health interventions (e.g., prenatal diagnosis, penicillin prophylaxis, and vaccination) for SCA in 2015, leading to significant reductions in excess mortality among under-five children with SCA, could, by 2050, prolong the lives of 5,302,900 [CI: 3,174,800–6,699,100] newborns with SCA. Similarly, large-scale universal screening could save the lives of up to 9,806,000 (CI: 6,745,800–14,232,700) newborns with SCA globally, 85% (CI: 81%–88%) of whom will be born in sub-Saharan Africa. The study findings are limited by the uncertainty in the estimates and the assumptions around mortality reductions associated with interventions.
Conclusions
Our quantitative approach confirms that the global burden of SCA is increasing, and highlights the need to develop specific national policies for appropriate public health planning, particularly in low- and middle-income countries. Further empirical collaborative epidemiological studies are vital to assess current and future health care needs, especially in Nigeria, the Democratic Republic of the Congo, and India.
Editors' Summary
Background
More than seven million babies are born each year with a structural or functional abnormality. Although some birth defects are caused by environmental factors, many are caused by the inheritance of a defective gene. One common inherited birth defect is sickle cell anemia (SCA). SCA arises when a baby inherits the gene for sickle hemoglobin (HbS), a structural variant of normal adult hemoglobin (HbA, the protein in the disc-shaped red blood cells that carry oxygen round the body), from both its parents. Every cell in the human body contains two full sets of genes, and babies inherit one set of genes from each parent. The parents usually each have one HbS gene and one HbA gene, and are unaffected. However, the red blood cells of their offspring who inherit two copies of HbS develop a sickle (crescent) shape. Sickle cells can block blood vessels in the limbs and organs and have a shorter lifespan than normal red blood cells, which causes anemia. Together, these changes can cause acute pain and organ damage, and can increase the risk of severe infections. SCA can be prevented by prenatal diagnosis and managed by interventions such as the provision of antibiotics and vaccination to prevent infections.
Why Was This Study Done?
Without early diagnosis and treatment, children with SCA often die within the first few years of life. Having one copy of the HbS gene provides people with protection from malaria, therefore SCA occurs mainly in low- and middle-income countries in tropical regions, where early diagnosis and treatment is often unavailable. Recent improvements in overall infant and childhood survival in these countries and population migration to higher-income countries mean that the global burden of SCA is likely to increase over the coming decades. To date, no one has tried to quantify this increase, although this information is needed to guide decisions on public health spending. In this modeling study, the researchers assess the size of the expected global burden of SCA between 2010 and 2050 in children under five years old and estimate the number of newborn lives that might be saved by implementation of various health interventions.
What Did the Researchers Do and Find?
The researchers used estimates of national SCA frequencies and data on projected birth rates to calculate that the number of newborns with SCA will increase from about 305,800 in 2010 to about 404,200 in 2050. They estimated that Nigeria, the Democratic Republic of Congo (DRC), and India accounted for 57% of newborns with SCA in 2010, and that Nigeria and the DRC will probably still be the countries most in need of policies for the prevention and management of SCA in 2050. The researchers then assessed how many newborns might be saved by the implementation of various health measures in 2015 that affect excess mortality (the difference between the frequency of SCA in newborns and in five-year-olds divided by the frequency of SCA in newborns) among children born with SCA. Implementation of prenatal diagnosis and newborn screening programs, and provision of antibiotics and vaccinations (interventions assumed by the researchers to reduce excess mortality from 90% to 50% in low- and middle-income countries and from 10% to 5% in high-income countries) could prolong the life of more than five million newborns with SCA by 2050. Implementation of universal screening and provision of other specific measures predicted to reduce excess mortality to 5% and 0% in low-to-middle-income countries and high-income countries, respectively, could save nearly ten million lives by 2050.
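The definition of excess mortality given above (the difference between the frequency of SCA in newborns and in five-year-olds, divided by the frequency in newborns) can be written out as a small sketch. The function names and the example frequency of 2% are hypothetical and chosen only for illustration; the 90% and 50% levels are the low- and middle-income scenario values quoted in the summary.

```python
def excess_mortality(freq_newborns, freq_five_year_olds):
    """Excess mortality as defined in the summary:
    (frequency in newborns - frequency in five-year-olds) / frequency in newborns."""
    if freq_newborns <= 0:
        raise ValueError("newborn frequency must be positive")
    return (freq_newborns - freq_five_year_olds) / freq_newborns

def frequency_at_five(freq_newborns, excess):
    """Invert the definition: SCA frequency among five-year-olds implied by a
    given excess mortality."""
    return freq_newborns * (1.0 - excess)

# Hypothetical newborn SCA frequency of 2% (not a figure from the study).
newborn_freq = 0.02
baseline_at_five = frequency_at_five(newborn_freq, 0.90)   # 90% excess mortality
improved_at_five = frequency_at_five(newborn_freq, 0.50)   # reduced to 50%
recovered = excess_mortality(newborn_freq, baseline_at_five)
```

Under these assumed numbers, lowering excess mortality from 90% to 50% raises the implied frequency among five-year-olds fivefold, from 0.2% to 1%.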
What Do These Findings Mean?
In estimating the global burden of SCA in children under five years old between 2010 and 2050 and the number of newborn lives that could be saved by implementation of health interventions, the researchers made numerous assumptions reflected in the uncertainty associated with the projections. For example, they assumed that implementation of specific interventions would lead to an immediate reduction of excess mortality in newborns with SCA. The study's findings confirm, however, that the global burden of SCA is increasing and indicate that the implementation of specific interventions could extend the lives of millions of newborns with SCA. Although further studies are needed to assess the current and future health care needs of children with SCA, these findings highlight the need to develop and implement national public health planning and funding policies for SCA, particularly in low- and middle-income countries.
Additional Information
Please access these websites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.1001484.
This study is further discussed in a PLOS Medicine Perspective by Edward Fottrell and David Osrin
The US National Heart, Lung, and Blood Institute provides detailed information (including personal stories) about sickle cell anemia (in English and Spanish)
The UK National Health Service Choices website also provides detailed information and a personal story about sickle cell anemia
The Sickle Cell Society, a UK-based not-for-profit organization, provides information for patients and carers and includes a children's website
The World Health Organization has a factsheet on sickle cell anemia and other inherited hemoglobin diseases (in several languages)
MedlinePlus provides links to further resources about sickle cell anemia (in English and Spanish)
The Malaria Atlas Project provides epidemiological information on the inherited blood disorders (including sickle cell anemia) that affect our response to malaria infection
The Global Sickle Cell Disease Network is a portal bringing together leading sickle cell disease researchers and clinicians from high-, middle-, and low-income countries to form a network
doi:10.1371/journal.pmed.1001484
PMCID: PMC3712914  PMID: 23874164
16.  High-resolution genomic and expression analyses of copy number alterations in HER2-amplified breast cancer 
Introduction
HER2 gene amplification and protein overexpression (HER2+) define a clinically challenging subgroup of breast cancer with variable prognosis and response to therapy. Although gene expression profiling has identified an ERBB2 molecular subtype of breast cancer, it is clear that HER2+ tumors reside in all molecular subtypes and represent a genomically and biologically heterogeneous group that needs to be further characterized in large sample sets.
Methods
Genome-wide DNA copy number profiling, using bacterial artificial chromosome (BAC) array comparative genomic hybridization (aCGH), and global gene expression profiling were performed on 200 and 87 HER2+ tumors, respectively. Genomic Identification of Significant Targets in Cancer (GISTIC) was used to identify significant copy number alterations (CNAs) in HER2+ tumors, which were related to a set of 554 non-HER2 amplified (HER2-) breast tumors. High-resolution oligonucleotide aCGH was used to delineate the 17q12-q21 region in high detail.
Results
The HER2-amplicon was narrowed to an 85.92 kbp region including the TCAP, PNMT, PERLD1, HER2, C17orf37 and GRB7 genes, and higher HER2 copy numbers indicated worse prognosis. In 31% of HER2+ tumors the amplicon extended to TOP2A, defining a subgroup of HER2+ breast cancer associated with estrogen receptor-positive status and with a trend of better survival than HER2+ breast cancers with deleted (18%) or neutral TOP2A (51%). HER2+ tumors were clearly distinguished from HER2- tumors by the presence of recurrent high-level amplifications and firestorm patterns on chromosome 17q. While there was no significant difference between HER2+ and HER2- tumors regarding the incidence of other recurrent high-level amplifications, differences in the co-amplification pattern were observed, as shown by the almost mutually exclusive occurrence of 8p12, 11q13 and 20q13 amplification in HER2+ tumors. GISTIC analysis identified 117 significant CNAs across all autosomes. Supervised analyses revealed: (1) significant CNAs separating HER2+ tumors stratified by clinical variables, and (2) CNAs separating HER2+ from HER2- tumors.
Conclusions
We have performed a comprehensive survey of CNAs in HER2+ breast tumors, pinpointing significant genomic alterations including both known and potentially novel therapeutic targets. Our analysis sheds further light on the genomically complex and heterogeneous nature of HER2+ tumors in relation to other subgroups of breast cancer.
doi:10.1186/bcr2568
PMCID: PMC2917012  PMID: 20459607
17.  Basal-like Breast cancer DNA copy number losses identify genes involved in genomic instability, response to therapy, and patient survival 
Breast cancer is a heterogeneous disease with known expression-defined tumor subtypes. DNA copy number studies have suggested that tumors within gene expression subtypes share similar DNA copy number aberrations (CNA) and that CNA can be used to further sub-divide expression classes. To gain further insights into the etiologies of the intrinsic subtypes, we classified tumors according to gene expression subtype and next identified subtype-associated CNA using a novel method called SWITCHdna, using a training set of 180 tumors and a validation set of 359 tumors. Fisher’s exact tests, Chi-square approximations, and Wilcoxon rank-sum tests were performed to evaluate differences in CNA by subtype. To assess the functional significance of loss of a specific chromosomal region, individual genes were knocked down by shRNA, and drug sensitivity and DNA repair foci assays were performed. Most tumor subtypes exhibited specific CNA. The Basal-like subtype was the most distinct, with common losses of the regions containing RB1, BRCA1, and INPP4B, and the greatest overall genomic instability. One Basal-like subtype-associated CNA was loss of 5q11–35, which contains at least three genes important for BRCA1-dependent DNA repair (RAD17, RAD50, and RAP80); these genes were predominantly lost as a pair, or all three simultaneously. Loss of two or three of these genes was associated with significantly increased genomic instability and poor patient survival. RNAi knockdown of RAD17, or RAD17/RAD50, in immortalized human mammary epithelial cell lines caused increased sensitivity to a PARP inhibitor and carboplatin, and inhibited BRCA1 foci formation in response to DNA damage. These data suggest a possible genetic cause for genomic instability in Basal-like breast cancers and a biological rationale for the use of DNA repair inhibitor related therapeutics in this breast cancer subtype.
Electronic supplementary material
The online version of this article (doi:10.1007/s10549-011-1846-y) contains supplementary material, which is available to authorized users.
doi:10.1007/s10549-011-1846-y
PMCID: PMC3387500  PMID: 22048815
Basal-like breast cancer; Genome instability; BRCA1 pathway; Copy number aberration; Molecular subtypes; Array CGH
18.  High-resolution analysis of copy number alterations and associated expression changes in ovarian tumors 
BMC Medical Genomics  2009;2:21.
Background
DNA copy number alterations are frequently observed in ovarian cancer, but it remains a challenge to identify the most relevant alterations and the specific causal genes in those regions.
Methods
We obtained high-resolution 500K SNP array data for 52 ovarian tumors and identified the most statistically significant minimal genomic regions with the most prevalent and highest-level copy number alterations (recurrent CNAs). Within a region of recurrent CNA, comparison of expression levels in tumors with a given CNA to tumors lacking that CNA and to whole normal ovary samples was used to select genes with CNA-specific expression patterns. A public expression array data set of laser capture micro-dissected (LCM) non-malignant fallopian tube epithelia and LCM ovarian serous adenocarcinoma was used to evaluate the effect of cell-type mixture biases.
Results
Fourteen recurrent deletions were detected on chromosomes 4, 6, 9, 12, 13, 15, 16, 17, 18, 22 and most prevalently on X and 8. Copy number and expression data suggest several apoptosis mediators as candidate drivers of the 8p deletions. Sixteen recurrent gains were identified on chromosomes 1, 2, 3, 5, 8, 10, 12, 15, 17, 19, and 20, with the most prevalent gains localized to 8q and 3q. Within the 8q amplicon, PVT1, but not MYC, was strongly over-expressed relative to tumors lacking this CNA and showed over-expression relative to normal ovary. Likewise, the cell polarity regulators PRKCI and ECT2 were identified as putative drivers of two distinct amplicons on 3q. Co-occurrence analyses suggested potential synergistic or antagonistic relationships between recurrent CNAs. Genes within regions of recurrent CNA showed an enrichment of Cancer Census genes, particularly when filtered for CNA-specific expression.
Conclusion
These analyses provide detailed views of ovarian cancer genomic changes and highlight the benefits of using multiple reference sample types for the evaluation of CNA-specific expression changes.
doi:10.1186/1755-8794-2-21
PMCID: PMC2694826  PMID: 19419571
19.  Assessing the Significance of Conserved Genomic Aberrations Using High Resolution Genomic Microarrays 
PLoS Genetics  2007;3(8):e143.
Genomic aberrations recurrent in a particular cancer type can be important prognostic markers for tumor progression. Typically in early tumorigenesis, cells incur a breakdown of the DNA replication machinery that results in an accumulation of genomic aberrations in the form of duplications, deletions, translocations, and other genomic alterations. Microarray methods allow for finer mapping of these aberrations than has previously been possible; however, data processing and analysis methods have not taken full advantage of this higher resolution. Attention has primarily been given to analysis on the single sample level, where multiple adjacent probes are necessarily used as replicates for the local region containing their target sequences. However, regions of concordant aberration can be short enough to be detected by only one, or very few, array elements. We describe a method called Multiple Sample Analysis for assessing the significance of concordant genomic aberrations across multiple experiments that does not require a-priori definition of aberration calls for each sample. If there are multiple samples, representing a class, then by exploiting the replication across samples our method can detect concordant aberrations at much higher resolution than can be derived from current single sample approaches. Additionally, this method provides a meaningful approach to addressing population-based questions such as determining important regions for a cancer subtype of interest or determining regions of copy number variation in a population. Multiple Sample Analysis also provides single sample aberration calls in the locations of significant concordance, producing high resolution calls per sample, in concordant regions. 
The approach is demonstrated on a dataset representing a challenging but important resource: breast tumors that have been formalin-fixed, paraffin-embedded, archived, and subsequently UV-laser capture microdissected and hybridized to two-channel BAC arrays using an amplification protocol. We demonstrate the accurate detection on simulated data, and on real datasets involving known regions of aberration within subtypes of breast cancer at a resolution consistent with that of the array. Similarly, we apply our method to previously published datasets, including a 250K SNP array, and verify known results as well as detect novel regions of concordant aberration. The algorithm has been fully implemented and tested and is freely available as a Java application at http://www.cbil.upenn.edu/MSA.
Author Summary
Cancer is a genetic disease caused by genomic mutations that confer an increased ability to proliferate and survive in a specific environment. It is now known that many regions of genomic DNA are deleted or amplified in specific cancer types. These aberrations are believed to occur randomly in the genome. If these aberrations overlap more than would be expected by chance across individual occurrences of the cancer this suggests a selective pressure on this aberration. These conserved aberrations likely represent regions that are important for the development, progression, and survival of a specific cancer type in its environment. We present a method for identifying these conserved aberrations within a class of samples. The applications for this method include accurate high resolution mapping of aberrations characteristic of cancer subtypes as well as other genetic diseases and determination of conserved copy number variations in the population. With the use of high resolution microarray methods we have profiled different tumor types. We have been able to create high resolution profiles of conserved aberrations in specific cancer types. These conserved aberrations are prime targets for cancer therapies and many of these regions have already been used to develop effective cancer therapeutics.
doi:10.1371/journal.pgen.0030143
PMCID: PMC1950957  PMID: 17722985
20.  Flexible and Accurate Detection of Genomic Copy-Number Changes from aCGH 
PLoS Computational Biology  2007;3(6):e122.
Genomic DNA copy-number alterations (CNAs) are associated with complex diseases, including cancer: CNAs are indeed related to tumoral grade, metastasis, and patient survival. CNAs discovered from array-based comparative genomic hybridization (aCGH) data have been instrumental in identifying disease-related genes and potential therapeutic targets. To be immediately useful in both clinical and basic research scenarios, aCGH data analysis requires accurate methods that do not impose unrealistic biological assumptions and that provide direct answers to the key question, “What is the probability that this gene/region has CNAs?” Current approaches fail, however, to meet these requirements. Here, we introduce reversible jump aCGH (RJaCGH), a new method for identifying CNAs from aCGH; we use a nonhomogeneous hidden Markov model fitted via reversible jump Markov chain Monte Carlo; and we incorporate model uncertainty through Bayesian model averaging. RJaCGH provides an estimate of the probability that a gene/region has CNAs while incorporating interprobe distance and the capability to analyze data on a chromosome or genome-wide basis. RJaCGH outperforms alternative methods, and the performance difference is even larger with noisy data and highly variable interprobe distance, both commonly found features in aCGH data. Furthermore, our probabilistic method allows us to identify minimal common regions of CNAs among samples and can be extended to incorporate expression data. In summary, we provide a rigorous statistical framework for locating genes and chromosomal regions with CNAs with potential applications to cancer and other complex human diseases.
Author Summary
As a consequence of problems during cell division, the number of copies of a gene in a chromosome can either increase or decrease. These copy-number alterations (CNAs) can play a crucial role in the emergence of complex multigenic diseases. For example, in cancer, amplification of oncogenes can drive tumor activation, and CNAs are associated with metastasis development and patient survival. Studies on the relationship between CNAs and disease have been recently fueled by the widespread use of array-based comparative genomic hybridization (aCGH), a technique with much finer resolution than previous experimental approaches. Detection of CNAs from these data depends on methods of analysis that do not impose biologically unrealistic assumptions and that provide direct answers to fundamental research questions. We have developed a statistical method, using a Bayesian approach, that returns estimates of the probabilities of CNAs from aCGH data, the most direct and valuable answer to the key biological question: “What is the probability that this gene/region has an altered copy number?” The output of the method can therefore be immediately used in different settings from clinical to basic research scenarios, and is applicable over a wide variety of aCGH technologies.
doi:10.1371/journal.pcbi.0030122
PMCID: PMC1894821  PMID: 17590078
21.  Genetic markers associated with early cancer-specific mortality following prostatectomy 
Cancer  2013;119(13):10.1002/cncr.27954.
BACKGROUND
To identify novel effectors and markers of localized but potentially life-threatening prostate cancer (PCa), we evaluated chromosomal copy number alterations (CNAs) in tumors from patients who underwent prostatectomy and correlated these with clinicopathologic features and outcome.
METHODS
CNAs in tumor DNAs from 125 prostatectomy patients in the discovery cohort were assayed with high resolution Affymetrix 6.0 SNP microarrays and then analyzed using the Genomic Identification of Significant Targets in Cancer (GISTIC) algorithm.
RESULTS
The assays revealed twenty significant regions of CNAs, four of them novel, and identified the target genes of four of the alterations. By univariate analysis, seven CNAs were significantly associated with early PCa-specific mortality. These included gains of chromosomal regions that contain the genes MYC, ADAR, or TPD52 and losses of sequences that incorporate SERPINB5, USP10, PTEN, or TP53. On multivariate analysis, only the CNAs of PTEN and MYC contributed additional prognostic information independent of that provided by pathologic stage, Gleason score, and initial PSA level. Patients whose tumors had alterations of both genes had a markedly elevated risk of PCa-specific mortality (OR = 53; C.I. = 6.92–405, P = 1 × 10^-4). Analyses of 333 tumors from three additional distinct patient cohorts confirmed the relationship between CNAs of PTEN and MYC and lethal PCa.
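The very wide interval reported for the odds ratio above can be read more easily on the log scale. The sketch below is a consistency check only, under two assumptions that the abstract does not state: that the interval is a 95% interval and that it is a Wald-type interval, which is symmetric around log(OR). The variable names are illustrative.

```python
import math

# Values quoted in the abstract.
or_point, ci_low, ci_high = 53.0, 6.92, 405.0

# Assumption: 95% Wald interval, i.e. log(OR) +/- 1.96 * SE.
Z_95 = 1.96

# For a Wald interval the point estimate is the geometric mean of the bounds.
geo_mean = math.sqrt(ci_low * ci_high)

# Standard error of log(OR) implied by the width of the interval.
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)

# Implied z statistic and two-sided normal p-value.
z = math.log(or_point) / se_log_or
p_two_sided = math.erfc(z / math.sqrt(2))
```

Under these assumptions the geometric mean of the bounds is about 52.9, close to the reported odds ratio of 53, and the implied p-value is on the order of 10^-4, in line with the reported P. The width of the interval reflects a large standard error on the log scale, as expected when few patients carry both alterations.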
CONCLUSION
This study identified new CNAs and genes that likely contribute to the pathogenesis of localized PCa and suggests that patients whose tumors have acquired CNAs of PTEN, MYC, or both have an increased risk of early PCa-specific mortality.
doi:10.1002/cncr.27954
PMCID: PMC3863778  PMID: 23609948
prostate cancer death; PTEN; MYC; somatic DNA copy number
22.  Significance of genomic instability in breast cancer in atomic bomb survivors: analysis of microarray-comparative genomic hybridization 
Background
It has been postulated that ionizing radiation induces breast cancers among atomic bomb (A-bomb) survivors. We have reported a higher incidence of HER2 and C-MYC oncogene amplification in breast cancers from A-bomb survivors. The purpose of this study was to clarify the effect of A-bomb radiation exposure on genomic instability (GIN), which is an important hallmark of carcinogenesis, in archival formalin-fixed paraffin-embedded (FFPE) tissues of breast cancer by using microarray-comparative genomic hybridization (aCGH).
Methods
Tumor DNA was extracted from FFPE tissues of invasive ductal cancers from 15 survivors who were exposed at 1.5 km or less from the hypocenter and 13 calendar year-matched non-exposed patients, followed by aCGH analysis using a high-density oligonucleotide microarray. The total length of copy number aberrations (CNA) was used as an indicator of GIN, and correlations with clinicopathological factors were statistically tested.
Results
The mean of the derivative log ratio spread (DLRSpread), which estimates the noise by calculating the spread of log ratio differences between consecutive probes for all chromosomes, was 0.54 (range, 0.26 to 1.05). The concordance of results between aCGH and fluorescence in situ hybridization (FISH) for HER2 gene amplification was 88%. The incidence of HER2 amplification and the histological grade were each significantly higher in the A-bomb survivors than in the control group (P = 0.04 for each). The total length of CNA tended to be larger in the A-bomb survivors (P = 0.15). Correlation analysis of CNA and clinicopathological factors revealed a significant negative correlation with DLRSpread (P = 0.034, r = -0.40). Multivariate analysis of covariance revealed that exposure to the A-bomb was a significant (P = 0.005) independent factor associated with a larger total length of CNA in breast cancers.
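The DLRSpread noise measure described above can be sketched as follows. This is one common robust formulation and an assumption on our part: it takes the interquartile range of the differences between consecutive probes and rescales it to a standard-deviation-like value. The software used in the study may compute it differently, and the function name is illustrative.

```python
import math
import statistics

def dlr_spread(log_ratios):
    """Robust spread of log-ratio differences between consecutive probes.

    Assumed formulation: IQR of the consecutive differences, divided by
    1.349 (the IQR of a standard normal) and by sqrt(2) (the difference of
    two independent noisy probes has twice the single-probe variance).
    """
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    q1, _median, q3 = statistics.quantiles(diffs, n=4)
    return (q3 - q1) / (1.349 * math.sqrt(2))

# A noiseless profile with a single copy-number breakpoint: the one large
# jump falls outside the interquartile range, so the estimate stays at zero.
step_profile = [0.0] * 50 + [1.0] * 50
step_noise = dlr_spread(step_profile)

# A profile that alternates between 0 and 1 at every probe is pure
# probe-to-probe noise and yields a large value.
noisy_profile = [float(i % 2) for i in range(101)]
noisy_value = dlr_spread(noisy_profile)
```

Using differences between neighbouring probes, rather than the raw log ratios, keeps genuine copy number changes from inflating the noise estimate, since a breakpoint contributes only a single large difference.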
Conclusions
Thus, archival FFPE tissues from A-bomb survivors are useful for genome-wide aCGH analysis. Our results suggested that A-bomb radiation may affect the increased amount of CNA as a hallmark of GIN and, subsequently, be associated with a higher histologic grade in breast cancer found in A-bomb survivors.
doi:10.1186/1748-717X-6-168
PMCID: PMC3280193  PMID: 22152285
breast cancer; atomic bomb survivors; radiation; genomic instability; CGH; microarray
23.  arrayMap: A Reference Resource for Genomic Copy Number Imbalances in Human Malignancies 
PLoS ONE  2012;7(5):e36944.
Background
The delineation of genomic copy number abnormalities (CNAs) from cancer samples has been instrumental for identification of tumor suppressor genes and oncogenes and has proven useful for clinical marker detection. An increasing number of projects have mapped CNAs using high-resolution microarray based techniques. So far, no single resource provides a global collection of readily accessible oncogenomic array data.
Methodology/Principal Findings
We here present arrayMap, a curated reference database and bioinformatics resource targeting copy number profiling data in human cancer. The arrayMap database provides a platform for meta-analysis and systems level data integration of high-resolution oncogenomic CNA data. To date, the resource incorporates more than 40,000 arrays in 224 cancer types extracted from several resources, including the NCBI’s Gene Expression Omnibus (GEO), EBI’s ArrayExpress (AE), The Cancer Genome Atlas (TCGA), publication supplements and direct submissions. For the majority of the included datasets, probe level and integrated visualization facilitate gene level and genome wide data review. Results from multi-case selections can be connected to downstream data analysis and visualization tools.
Conclusions/Significance
To our knowledge, currently no data source provides an extensive collection of high resolution oncogenomic CNA data which could readily be used for genomic feature mining across a representative range of cancer entities. arrayMap represents our effort to provide a long term platform for oncogenomic CNA data independent of specific platform considerations or specific project dependence. The online database can be accessed at http://www.arraymap.org.
doi:10.1371/journal.pone.0036944
PMCID: PMC3356349  PMID: 22629346
24.  Specific Genomic Regions Are Differentially Affected by Copy Number Alterations across Distinct Cancer Types, in Aggregated Cytogenetic Data 
PLoS ONE  2012;7(8):e43689.
Background
Regional genomic copy number alterations (CNA) are observed in the vast majority of cancers. Besides specifically targeting well-known, canonical oncogenes, CNAs may also play more subtle roles in terms of modulating genetic potential and broad gene expression patterns of developing tumors. Any significant differences in the overall CNA patterns between different cancer types may thus point towards specific biological mechanisms acting in those cancers. In addition, differences among CNA profiles may prove valuable for cancer classifications beyond existing annotation systems.
Principal Findings
We have analyzed molecular-cytogenetic data from 25,579 tumor samples, which were classified into 160 cancer types according to the International Classification of Disease (ICD) coding system. When correcting for differences in the overall CNA frequencies between cancer types, related cancers were often found to cluster together according to similarities in their CNA profiles. Based on a randomization approach, distance measures from the cluster dendrograms were used to identify those specific genomic regions that contributed significantly to this signal. This approach identified 43 non-neutral genomic regions whose propensity for the occurrence of copy number alterations varied with the type of cancer at hand. Only a subset of these identified loci overlapped with previously implied, highly recurrent (hot-spot) cytogenetic imbalance regions.
Conclusions
Thus, for many genomic regions, a simple null-hypothesis of independence between cancer type and relative copy number alteration frequency can be rejected. Since a subset of these regions display relatively low overall CNA frequencies, they may point towards second-tier genomic targets that are adaptively relevant but not necessarily essential for cancer development.
doi:10.1371/journal.pone.0043689
PMCID: PMC3427184  PMID: 22937079
25.  Molecular cytogenetic characterization of canine histiocytic sarcoma: A spontaneous model for human histiocytic cancer identifies deletion of tumor suppressor genes and highlights influence of genetic background on tumor behavior 
BMC Cancer  2011;11:201.
Background
Histiocytic malignancies in both humans and dogs are rare and poorly understood. While canine histiocytic sarcoma (HS) is uncommon in the general domestic dog population, there is a strikingly high incidence in a subset of breeds, suggesting heritable predisposition. Molecular cytogenetic profiling of canine HS in these breeds would serve to reveal recurrent DNA copy number aberrations (CNAs) that are breed and/or tumor associated, as well as defining those shared with human HS. This process would identify evolutionarily conserved cytogenetic changes to highlight regions of particular importance to HS biology.
Methods
Using genome-wide array comparative genomic hybridization, we assessed CNAs in 104 spontaneously occurring HS from two breeds of dog exhibiting a particularly elevated incidence of this tumor, the Bernese Mountain Dog and Flat-Coated Retriever. Recurrent CNAs were evaluated further by multicolor fluorescence in situ hybridization and loss of heterozygosity analyses. Statistical analyses were performed to identify CNAs associated with tumor location and breed.
Results
Almost all recurrent CNAs identified in this study were shared between the two breeds, suggesting that they are associated more with the cancer phenotype than with breed. A subset of recurrent genomic imbalances suggested involvement of known cancer associated genes in HS pathogenesis, including deletions of the tumor suppressor genes CDKN2A/B, RB1 and PTEN. A small number of aberrations were unique to each breed, implying that they may contribute to the major differences in tumor location evident in these two breeds. The most highly recurrent canine CNAs revealed in this study are evolutionarily conserved with those reported in human histiocytic proliferations, suggesting that human and dog HS share a conserved pathogenesis.
Conclusions
The breed associated clinical features and DNA copy number aberrations exhibited by canine HS offer a valuable model for the human counterpart, providing additional evidence towards elucidation of the pathophysiological and genetic mechanisms associated with histiocytic malignancies. Extrapolation of data derived from canine histiocytic disorders to human histiocytic proliferation may help to further our understanding of the propagation and cancerization of histiocytic cells, contributing to development of new and effective therapeutic modalities for both species.
doi:10.1186/1471-2407-11-201
PMCID: PMC3121728  PMID: 21615919
