Results 1-7 (7)
1.  Effective normalization for copy number variation detection from whole genome sequencing 
BMC Genomics  2012;13(Suppl 6):S16.
Background
Whole genome sequencing enables a high resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. A number of tools have been developed to infer copy number variation in the genome. These tools, while validated, also include a number of parameters that are configurable to the genome data being analyzed. These algorithms allow for normalization to account for individual and population-specific effects on individual genome CNV estimates, but the impact of these changes on the estimated CNVs is not well characterized. We evaluate in detail the effect of normalization methodologies in two CNV algorithms, FREEC and CNV-seq, using whole genome sequencing data from 8 individuals spanning four populations.
Methods
We apply FREEC and CNV-seq to a sequencing data set consisting of 8 genomes. We use multiple configurations corresponding to different read-count normalization methodologies in FREEC, and statistically characterize the concordance of the CNV calls between FREEC configurations and the analogous output from CNV-seq. The normalization methodologies evaluated in FREEC are: GC content, mappability and control genome. We further stratify the concordance analysis within genic, non-genic, and a collection of validated variant regions.
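As an illustration of what read-count normalization by GC content means in general, the following is a minimal sketch, not FREEC's actual procedure. It assumes the genome has already been divided into windows with a read count and a GC fraction each; the function name `gc_normalize` and the 5% stratum width are hypothetical choices made only for this example. Each window's count is rescaled by the median count of windows with similar GC content.

```python
from collections import defaultdict
from statistics import median


def gc_normalize(read_counts, gc_fractions, stratum_percent=5):
    """Rescale per-window read counts by the median of their GC stratum.

    Illustrative only: a simple median-based correction, not the method
    implemented in FREEC or CNV-seq.
    """
    def stratum(gc):
        # Work in integer percent to avoid floating-point binning surprises.
        return int(round(gc * 100)) // stratum_percent

    # Group read counts by GC stratum.
    groups = defaultdict(list)
    for count, gc in zip(read_counts, gc_fractions):
        groups[stratum(gc)].append(count)

    overall_median = median(read_counts)
    stratum_medians = {key: median(values) for key, values in groups.items()}

    normalized = []
    for count, gc in zip(read_counts, gc_fractions):
        m = stratum_medians[stratum(gc)]
        # Windows in a stratum with zero median coverage are set to 0.0.
        normalized.append(count * overall_median / m if m > 0 else 0.0)
    return normalized


# Toy data: GC-rich windows have roughly double the raw coverage.
counts = [100, 110, 90, 200, 220, 180]
gc = [0.40, 0.41, 0.42, 0.60, 0.61, 0.62]
normalized_counts = gc_normalize(counts, gc)
```

After this correction the two GC strata in the toy data have the same coverage profile, so the remaining differences between windows no longer reflect GC content.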
Results
The GC content normalization methodology generates the highest number of altered copy number regions. Both mappability and control genome normalization reduce the total number and length of copy number regions. Relative to calls made with GC content normalization, mappability normalization yields Jaccard indices in the 0.07-0.3 range, whereas control genome normalization yields Jaccard index values around 0.4. The most critical impact of using mappability as a normalization factor is a substantial reduction of deletion CNV calls. The output of another method based on control genome normalization, CNV-seq, resulted in comparable CNV call profiles and substantial agreement in variable gene and CNV region calls.
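The Jaccard index reported above measures the overlap between two sets of CNV calls as the size of their intersection divided by the size of their union. The following is a minimal sketch of one way to compute it at base-pair resolution. It assumes calls are half-open `(start, end)` intervals on a single chromosome; the function names are hypothetical, and the abstract does not state whether the authors computed the index over base pairs or over regions.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(intervals):
    """Total number of bases covered by non-overlapping intervals."""
    return sum(end - start for start, end in intervals)


def intersection_length(a, b):
    """Bases shared by two sorted, non-overlapping interval lists."""
    i = j = 0
    shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def jaccard_index(calls_a, calls_b):
    """Base-pair Jaccard index between two CNV call sets (0.0 if both empty)."""
    a = merge_intervals(calls_a)
    b = merge_intervals(calls_b)
    shared = intersection_length(a, b)
    union = total_length(a) + total_length(b) - shared
    return shared / union if union else 0.0


# Toy call sets standing in for two normalization configurations.
gc_calls = [(0, 100), (200, 300)]
mappability_calls = [(50, 150), (250, 400)]
score = jaccard_index(gc_calls, mappability_calls)
```

In the toy example the two call sets share 100 bases out of a 350-base union. A genome-wide version would apply the same computation per chromosome and sum the intersection and union lengths before dividing.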
Conclusions
Choice of read-count normalization methodology has a substantial effect on CNV calls and the use of genomic mappability or an appropriately chosen control genome can optimize the output of CNV analysis.
doi:10.1186/1471-2164-13-S6-S16
PMCID: PMC3481445  PMID: 23134596
2.  Major Chromosomal Breakpoint Intervals in Breast Cancer Co-Localize with Differentially Methylated Regions 
Frontiers in Oncology  2012;2:197.
Solid tumors exhibit chromosomal rearrangements resulting in gain or loss of multiple chromosomal loci (copy number variation, or CNV), and translocations that occasionally result in the creation of novel chimeric genes. In the case of breast cancer, although each individual tumor has a unique CNV landscape, the breakpoints, as measured over large datasets, appear to be non-randomly distributed in the genome. Breakpoints show a significant regional concentration at genomic loci spanning perhaps several megabases. The proximal cause of these breakpoint concentrations is a subject of speculation, but is, as yet, largely unknown. To shed light on this issue, we have performed a bio-statistical analysis on our previously published data for a set of 119 breast tumors and normal controls (Wiedswang et al., 2003), where each sample has both high-resolution CNV and methylation data. The method examined the distribution of closeness of breakpoint regions with differentially methylated regions (DMR), coupled with additional genomic parameters, such as repeat elements and designated “fragile sites” in the reference genome. Through this analysis, we have identified a set of 93 regional loci called breakpoint enriched DMR (BEDMRs), characterized by altered DNA methylation in cancer compared to normal cells, that are associated with frequent breakpoint concentrations within a distance of 1 Mb. BEDMR loci are further associated with local hypomethylation (66%) and concentrations of the Alu SINE repeats within 3 Mb (35% of the cases), and tend to occur near a number of cancer related genes such as the protocadherins, AKT1, DUB3, and GAB2. Furthermore, BEDMRs seem to deregulate members of the histone gene family and chromatin remodeling factors, e.g., JMJD1B, which might affect the chromatin structure and disrupt coordinate signaling and repair.
From this analysis we propose that preference for chromosomal breakpoints is related to genome structure coupled with alterations in DNA methylation and hence, chromatin structure, associated with tumorigenesis.
doi:10.3389/fonc.2012.00197
PMCID: PMC3530719  PMID: 23293768
DNA methylation; copy number variation; Alu repeat element; genome instability; multi-modal analysis; breast cancer
3.  Identification of Tumor Suppressors and Oncogenes from Genomic and Epigenetic Features in Ovarian Cancer 
PLoS ONE  2011;6(12):e28503.
The identification of genetic and epigenetic alterations from primary tumor cells has become a common method to identify genes critical to the development and progression of cancer. We seek to identify those genetic and epigenetic aberrations that have the most impact on gene function within the tumor. First, we perform a bioinformatic analysis of copy number variation (CNV) and DNA methylation covering the genetic landscape of ovarian cancer tumor cells. We separately examined CNV and DNA methylation for 42 primary serous ovarian cancer samples using MOMA-ROMA assays and 379 tumor samples analyzed by The Cancer Genome Atlas. We have identified 346 genes with significant deletions or amplifications among the tumor samples. Utilizing associated gene expression data, we predict 156 genes with altered copy number and correlated changes in expression. Among these genes, CCNE1, POP4, UQCRB, PHF20L1 and C19orf2 were identified within both data sets. We were specifically interested in copy number variation as our base genomic property in the prediction of tumor suppressors and oncogenes in the altered ovarian tumor. We therefore identify changes in DNA methylation and expression for all amplified and deleted genes. We statistically define tumor suppressor and oncogenic features for these modalities and perform a correlation analysis with expression. We predicted 611 potential oncogene and tumor suppressor candidates by integrating these data types. Genes with a strong correlation for methylation dependent expression changes exhibited at varying copy number aberrations include CDCA8, ATAD2, CDKN2A, RAB25, AURKA, BOP1 and EIF2C3. We provide copy number variation and DNA methylation analysis for over 11,500 individual genes covering the genetic landscape of ovarian cancer tumors. We show the extent of genomic and epigenetic alterations for known tumor suppressors and oncogenes and also use these defined features to identify potential ovarian cancer gene candidates.
doi:10.1371/journal.pone.0028503
PMCID: PMC3234280  PMID: 22174824
4.  Expression-Based Network Biology Identifies Alteration in Key Regulatory Pathways of Type 2 Diabetes and Associated Risk/Complications 
PLoS ONE  2009;4(12):e8100.
Type 2 diabetes mellitus (T2D) is a multifactorial and genetically heterogeneous disease which leads to impaired glucose homeostasis and insulin resistance. The advanced form of the disease causes acute cardiovascular, renal, neurological and microvascular complications. Thus there is a constant need to discover new and efficient treatments against the disease by seeking to uncover various novel alternate signalling mechanisms that can lead to diabetes and its associated complications. The present study allows detection of molecular targets by unravelling their role in altered biological pathways during diabetes and its associated risk factors and complications. We have used an integrated functional networks concept, merging a co-expression network and an interaction network, to detect the transcriptionally altered pathways and regulations involved in the disease. Our analysis reports four novel significant networks which could lead to the development of diabetes and other associated dysfunctions. (a) The first network illustrates the upregulation of TGFBRII facilitating oxidative stress and causing the expression of early transcription genes via the MAPK pathway, leading to cardiovascular and kidney related complications. (b) The second network demonstrates novel interactions between GAPDH and inflammatory and proliferation candidate genes, i.e., SUMO4 and EGFR, indicating a new link between obesity and diabetes. (c) The third network portrays unique interactions of PTPN1 with EGFR and CAV1 which could lead to impaired vascular function in diabetic nephropathy. (d) Lastly, from our fourth network we have inferred that the interaction of β-catenin with CDH5 and TGFBR1 through Smad molecules could contribute to endothelial dysfunction. This might suggest a probability of kidney complications emerging in T2D.
An experimental investigation of this aspect may provide more decisive observations for drug target identification and a better understanding of the pathophysiology of T2D and its complications.
doi:10.1371/journal.pone.0008100
PMCID: PMC2785475  PMID: 19997558
5.  PAPAyA: a platform for breast cancer biomarker signature discovery, evaluation and assessment 
BMC Bioinformatics  2009;10(Suppl 9):S7.
Background
The decision environment for cancer care is becoming increasingly complex due to the discovery and development of novel genomic tests that offer information regarding therapy response, prognosis and monitoring, in addition to traditional histopathology. There is, therefore, a need for translational clinical tools based on molecular bioinformatics, particularly in current cancer care, that can acquire and analyze the data, and interpret and present information from multiple diagnostic modalities to help the clinician make effective decisions.
Results
We present a platform for molecular signature discovery and clinical decision support that relies on genomic and epigenomic measurement modalities as well as clinical parameters such as histopathological results and survival information. Our Physician Accessible Preclinical Analytics Application (PAPAyA) integrates a powerful set of statistical and machine learning tools that leverage the connections among the different modalities. It is easily extendable and reconfigurable to support integration of existing research methods and tools into powerful data analysis and interpretation pipelines. A current configuration of PAPAyA with examples of its performance on breast cancer molecular profiles is used to present the platform in action.
Conclusion
PAPAyA enables analysis of data from (pre)clinical studies, formulation of new clinical hypotheses, and facilitates clinical decision support by abstracting molecular profiles for clinicians.
doi:10.1186/1471-2105-10-S9-S7
PMCID: PMC2745694  PMID: 19761577
6.  Methylation detection oligonucleotide microarray analysis: a high-resolution method for detection of CpG island methylation 
Nucleic Acids Research  2009;37(12):e89.
Methylation of CpG islands associated with genes can affect the expression of the proximal gene, and methylation of non-associated CpG islands correlates with genomic instability. This epigenetic modification has been shown to be important in many pathologies, from development and disease to cancer. We report the development of a novel high-resolution microarray that detects the methylation status of over 25 000 CpG islands in the human genome. Experiments were performed to demonstrate low system noise in the methodology and that the array probes have a high signal to noise ratio. Methylation measurements between different cell lines were validated, demonstrating the accuracy of measurement. We then identified alterations in CpG islands, both those associated with gene promoters and non-promoter-associated islands, in a set of breast and ovarian tumors. We demonstrate that this methodology accurately identifies methylation profiles in cancer; in principle, it can differentiate any CpG methylation alterations and can be adapted to analyze other species.
doi:10.1093/nar/gkp413
PMCID: PMC2709589  PMID: 19474344
7.  T2D-Db: An integrated platform to study the molecular basis of Type 2 diabetes 
BMC Genomics  2008;9:320.
Background
Type 2 Diabetes Mellitus (T2DM) is a non-insulin-dependent, complex trait disease that develops due to genetic predisposition and environmental factors. The advanced stage of type 2 diabetes mellitus leads to several micro- and macrovascular complications such as nephropathy, neuropathy, retinopathy and heart related problems. Studies performed on the genetics, biochemistry and molecular biology of this disease to understand the pathophysiology of type 2 diabetes mellitus have led to the generation of a surfeit of data on candidate genes and related aspects. The research is highly progressive towards defining the exact etiology of this disease.
Results
T2D-Db (Type 2 diabetes Database) is a comprehensive web resource which provides integrated and curated information on almost all known molecular components involved in the pathogenesis of type 2 diabetes mellitus in the three widely studied mammals, namely human, mouse and rat. Information on candidate genes, SNPs (Single Nucleotide Polymorphisms) in candidate genes or candidate regions, genome wide association studies (GWA), tissue specific gene expression patterns, EST (Expressed Sequence Tag) data, expression information from microarray data, pathways, protein-protein interactions and disease associated risk factors or complications has been structured in this online resource.
Conclusion
Information available in T2D-Db provides an integrated platform for a better molecular level understanding of type 2 diabetes mellitus and its pathogenesis. Importantly, the resource facilitates graphical presentation of the gene/genome wide map of SNP markers and protein-protein interaction networks, besides providing the heat map diagram of the selected gene(s) in an organism across microarray expression experiments from either single or multiple studies. These features aid data interpretation in an integrative way. T2D-Db is, to our knowledge, the first publicly available resource that can cater to the needs of researchers working on different aspects of type 2 diabetes mellitus.
doi:10.1186/1471-2164-9-320
PMCID: PMC2491641  PMID: 18605991