1.  Enrichment-based DNA methylation analysis using next-generation sequencing: sample exclusion, estimating changes in global methylation, and the contribution of replicate lanes 
BMC Genomics  2012;13(Suppl 8):S6.
Background
DNA methylation is an important epigenetic mark and dysregulation of DNA methylation is associated with many diseases including cancer. Advances in next-generation sequencing now allow unbiased methylome profiling of entire patient cohorts, greatly facilitating biomarker discovery and presenting new opportunities to understand the biological mechanisms by which changes in methylation contribute to disease. Enrichment-based sequencing assays such as MethylCap-seq are a cost effective solution for genome-wide determination of methylation status, but the technical reliability of methylation reconstruction from raw sequencing data has not been well characterized.
Methods
We analyze three MethylCap-seq data sets and perform two different analyses to assess data quality. First, we investigate how data quality is affected by excluding samples that do not meet quality control cutoff requirements. Second, we consider the effect of additional reads on enrichment score, saturation, and coverage. Lastly, we verify a method for the determination of the global amount of methylation from MethylCap-seq data by comparing to a spiked-in control DNA of known methylation status.
Results
We show that rejection of samples based on our quality control parameters leads to a significant improvement of methylation calling. Additional reads beyond ~13 million unique aligned reads improved coverage, modestly improved saturation, and did not impact enrichment score. Lastly, we find that a global methylation indicator calculated from MethylCap-seq data correlates well with the global methylation level of a sample as obtained from a spike-in DNA of known methylation level.
Conclusions
We show that with appropriate quality control MethylCap-seq is a reliable tool, suitable for cohorts of hundreds of patients, that provides reproducible methylation information on a feature by feature basis as well as information about the global level of methylation.
doi:10.1186/1471-2164-13-S8-S6
PMCID: PMC3535705  PMID: 23281662
2.  Systematic investigation of insertional and deletional RNA-DNA differences in the human transcriptome 
BMC Genomics  2012;13:616.
Background
The genomic information which is transcribed into the primary RNA can be altered by RNA editing at the transcriptional or post-transcriptional level, which provides an effective way to create transcript diversity in an organism. Altering can occur through substitutional RNA editing or via the insertion or deletion of nucleotides relative to the original template. Taking advantage of recent high throughput sequencing technology combined with bioinformatics tools, several groups have recently studied the genome-wide substitutional RNA editing profiles in human. However, while insertional/deletional (indel) RNA editing is well known in several lower species, only very scarce evidence supports the existence of insertional editing events in higher organisms such as human, and no previous work has specifically focused on indel differences between RNA and their matching DNA in human. Here, we provide the first study to examine the possibility of genome-wide indel RNA-DNA differences in one human individual, NA12878, whose RNA and matching genome have been deeply sequenced.
Results
We apply different computational tools that are capable of identifying indel differences between RNA reads and the matching reference genome and we initially find hundreds of such indel candidates. However, with careful further analysis and filtering, we conclude that all candidates are false-positives created by splice junctions, paralog sequences, diploid alleles, and known genomic indel variations.
Conclusions
Overall, our study suggests that indel RNA editing events are unlikely to exist broadly in the human transcriptome and emphasizes the necessity of a robust computational filter pipeline to obtain high confidence RNA-DNA difference results when analyzing high throughput sequencing data as suggested in the recent genome-wide RNA editing studies.
doi:10.1186/1471-2164-13-616
PMCID: PMC3505181  PMID: 23148664
Indel RNA-DNA differences; RNA-seq data analysis; Computational filtering
3.  Methods for high-throughput MethylCap-Seq data analysis 
BMC Genomics  2012;13(Suppl 6):S14.
Background
Advances in whole genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges. For next generation sequencing (NGS), these include data storage, computational costs, sequence processing and alignment, delineating appropriate statistical measures, and data visualization. Currently there is a lack of workflows for efficient analysis of large, MethylCap-seq datasets containing multiple sample groups.
Methods
The NGS application MethylCap-seq involves the in vitro capture of methylated DNA and subsequent analysis of enriched fragments by massively parallel sequencing. The workflow we describe performs MethylCap-seq experimental Quality Control (QC), sequence file processing and alignment, differential methylation analysis of multiple biological groups, hierarchical clustering, assessment of genome-wide methylation patterns, and preparation of files for data visualization.
Results
Here, we present a scalable, flexible workflow for MethylCap-seq QC, secondary data analysis, tertiary analysis of multiple experimental groups, and data visualization. We demonstrate the experimental QC procedure with results from a large ovarian cancer study dataset and propose parameters which can identify problematic experiments. Promoter methylation profiling and hierarchical clustering analyses are demonstrated for four groups of acute myeloid leukemia (AML) patients. We propose a Global Methylation Indicator (GMI) function to assess genome-wide changes in methylation patterns between experimental groups. We also show how the workflow facilitates data visualization in a web browser with the application Anno-J.
Conclusions
This workflow and its suite of features will assist biologists in conducting methylation profiling projects and facilitate meaningful biological interpretation.
doi:10.1186/1471-2164-13-S6-S14
PMCID: PMC3481483  PMID: 23134780
4.  Identification of cytoplasmic capping targets reveals a role for cap homeostasis in translation and mRNA stability 
Cell reports  2012;2(3):674-684.
Summary
The notion that decapping leads irreversibly to mRNA decay changed with the identification of capped transcripts missing portions of their 5′ ends and a cytoplasmic complex that can restore the cap on uncapped mRNAs. The current study used accumulation of uncapped transcripts in cells inhibited for cytoplasmic capping to identify the targets of this pathway. Inhibition of cytoplasmic capping resulted in the destabilization of some transcripts and the redistribution of others from polysomes to non-translating mRNPs, where they accumulate in an uncapped state. Only a portion of the mRNA transcriptome is affected by cytoplasmic capping, and its targets encode proteins involved in nucleotide binding, RNA and protein localization and the mitotic cell cycle. The 3′-UTRs of recapping targets are enriched for AU-rich elements and microRNA binding sites, both of which function in cap-dependent mRNA silencing. These findings identify a cyclical process of decapping and recapping we term cap homeostasis.
doi:10.1016/j.celrep.2012.07.011
PMCID: PMC3462258  PMID: 22921400
cytoplasmic capping; cap homeostasis; translation; polysomes; mRNP; non-translating mRNA; mRNA stability; uncapped transcriptome
5.  Regulation of the nucleosome unwrapping rate controls DNA accessibility 
Nucleic Acids Research  2012;40(20):10215-10227.
Eukaryotic genomes are repetitively wrapped into nucleosomes that then regulate access of transcription and DNA repair complexes to DNA. The mechanisms that regulate extrinsic protein interactions within nucleosomes are unresolved. We demonstrate that modulation of the nucleosome unwrapping rate regulates protein binding within nucleosomes. Histone H3 acetyl-lysine 56 [H3(K56ac)] and DNA sequence within the nucleosome entry-exit region additively influence nucleosomal DNA accessibility by increasing the unwrapping rate without impacting rewrapping. These combined epigenetic and genetic factors influence transcription factor (TF) occupancy within the nucleosome by at least one order of magnitude and enhance nucleosome disassembly by the DNA mismatch repair complex, hMSH2–hMSH6. Our results combined with the observation that ∼30% of Saccharomyces cerevisiae TF-binding sites reside in the nucleosome entry–exit region suggest that modulation of nucleosome unwrapping is a mechanism for regulating transcription and DNA repair.
doi:10.1093/nar/gks747
PMCID: PMC3488218  PMID: 22965129
6.  A Scalable, Flexible Workflow for MethylCap-Seq Data Analysis 
Advances in whole genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges. For next generation sequencing (NGS), these include data storage, computational costs, sequence processing and alignment, delineating appropriate statistical measures, and data visualization. The NGS application MethylCap-seq involves the in vitro capture of methylated DNA and subsequent analysis of enriched fragments by massively parallel sequencing. Here, we present a scalable, flexible workflow for MethylCap-seq Quality Control, secondary data analysis, tertiary analysis of multiple experimental groups, and data visualization. This workflow and its suite of features will assist biologists in conducting methylation profiling projects and facilitate meaningful biological interpretation.
doi:10.1109/GENSiPS.2011.6169426
PMCID: PMC3320741  PMID: 22484542
next generation sequencing; DNA methylation; epigenetics; cancer; data analysis; data visualization
7.  Comparison of Insertional RNA Editing in Myxomycetes 
PLoS Computational Biology  2012;8(2):e1002400.
RNA editing describes the process in which individual or short stretches of nucleotides in a messenger or structural RNA are inserted, deleted, or substituted. A high level of RNA editing has been observed in the mitochondrial genome of Physarum polycephalum. The most frequent editing type in Physarum is the insertion of individual Cs. RNA editing is extremely accurate in Physarum; however, little is known about its mechanism. Here, we demonstrate how analyzing two organisms from the Myxomycetes, namely Physarum polycephalum and Didymium iridis, allows us to test hypotheses about the editing mechanism that cannot be tested from a single organism alone. First, we show that using the recently determined full transcriptome information of Physarum dramatically improves the accuracy of computational editing site prediction in Didymium. We use this approach to predict genes in the mitochondrial genome of Didymium and identify six new edited genes as well as one new gene that appears unedited. Next, we investigate sequence conservation in the vicinity of editing sites between the two organisms in order to identify sites that harbor the information for the location of editing sites based on increased conservation. Our results imply that the information contained within only nine or ten nucleotides on either side of the editing site (a distance previously suggested through experiments) is not enough to locate the editing sites. Finally, we show that the codon position bias in C insertional RNA editing of these two organisms is correlated with the selection pressure on the respective genes, thereby directly testing an evolutionary theory on the origin of this codon bias. Beyond revealing interesting properties of insertional RNA editing in Myxomycetes, our work suggests possible approaches to be used when finding sequence motifs for any biological process fails.
Author Summary
RNA is an important biomolecule that is deeply involved in all aspects of molecular biology, such as protein production, gene regulation, and viral replication. However, many significant aspects such as the mechanism of RNA editing are not well understood. RNA editing is the process in which an organism's RNA is modified through the insertion, deletion, or substitution of single or short stretches of nucleotides. The slime mold Physarum polycephalum is a model organism for the study of RNA editing; however, hardly anything is known about its editing machinery. We show that the combination of two organisms (Physarum polycephalum and Didymium iridis) can provide a better understanding of insertional RNA editing than one organism alone. We predict several new edited genes in Didymium. By comparing the sequences of the two organisms in the vicinity of the editing sites we establish minimal requirements for the location of the information by which these editing sites are recognized. Lastly, we directly verify a theory for one of the most striking features of the editing sites, namely their codon bias.
doi:10.1371/journal.pcbi.1002400
PMCID: PMC3285571  PMID: 22383871
8.  A quantitative model of nucleosome dynamics 
Nucleic Acids Research  2011;39(19):8306-8313.
The expression, replication and repair of eukaryotic genomes require the fundamental organizing unit of chromatin, the nucleosome, to be unwrapped and disassembled. We have developed a quantitative model of nucleosome dynamics which provides a fundamental understanding of these DNA processes. We calibrated this model using results from high precision single molecule nucleosome unzipping experiments, and then tested its predictions for experiments in which nucleosomes are disassembled by the DNA mismatch recognition complex hMSH2-hMSH6. We found that this calibrated model quantitatively describes hMSH2-hMSH6 induced disassembly rates of nucleosomes with two separate DNA sequences and four distinct histone modification states. In addition, this model provides mechanistic insight into nucleosome disassembly by hMSH2-hMSH6 and the influence of histone modifications on this disassembly reaction. This model's precise agreement with current experiments suggests that it can be applied more generally to provide important mechanistic understanding of the numerous nucleosome alterations that occur during DNA processing.
doi:10.1093/nar/gkr422
PMCID: PMC3201853  PMID: 21764779
9.  Genome annotation in the presence of insertional RNA editing 
Bioinformatics  2008;24(22):2571-2578.
Motivation: Insertional RNA editing renders gene prediction very difficult compared to organisms without such RNA editing. A case in point is the mitochondrial genome of Physarum polycephalum in which only about one-third of the number of genes that are to be expected given its length are annotated. Thus, gene prediction methods that explicitly take into account insertional editing are needed for successful annotation of such genomes.
Results: We annotate the mitochondrial genome of P.polycephalum using several different approaches for gene prediction in organisms with insertional RNA editing. We computationally validate our annotations by comparing the results from different methods against each other and as proof of concept experimentally validate two of the newly predicted genes. We more than double the number of annotated putative genes in this organism and find several intriguing candidate genes that are not expected in a mitochondrial genome.
Availability: The C source code of the programs described here is available upon request from the corresponding author.
Contact: bundschuh@mps.ohio-state.edu
doi:10.1093/bioinformatics/btn487
PMCID: PMC2579709  PMID: 18819938
10.  The flexibility of locally melted DNA 
Nucleic Acids Research  2009;37(14):4580-4586.
Protein-bound duplex DNA is often bent or kinked. Yet, quantification of intrinsic DNA bending that might lead to such protein interactions remains enigmatic. DNA cyclization experiments have indicated that DNA may form sharp bends more easily than predicted by the established worm-like chain (WLC) model. One proposed explanation suggests that local melting of a few base pairs introduces flexible hinges. We have expanded this model to incorporate sequence and temperature dependence of the local melting, and tested it for three sequences at temperatures from 23°C to 42°C. We find that small melted bubbles are significantly more flexible than double-stranded DNA and can alter DNA flexibility at physiological temperatures. However, these bubbles are not flexible enough to explain the recently observed very sharp bends in DNA.
doi:10.1093/nar/gkp442
PMCID: PMC2724272  PMID: 19487242
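For orientation, the worm-like chain (WLC) baseline mentioned in this abstract can be sketched numerically. In the standard WLC model, tangent directions decorrelate as exp(-s/P) along the chain, and the mean-square end-to-end distance is 2PL[1 - (P/L)(1 - exp(-L/P))]. The function names are hypothetical, and the default persistence length of 50 nm is a commonly quoted value for double-stranded DNA that is assumed here, not taken from the paper. The sketch does not implement the local-melting extension the authors describe.

```python
import math

def wlc_tangent_correlation(separation_nm, persistence_nm=50.0):
    """Mean cosine of the angle between tangents a given contour distance apart."""
    return math.exp(-separation_nm / persistence_nm)

def wlc_mean_square_end_to_end(contour_nm, persistence_nm=50.0):
    """Mean-square end-to-end distance (nm^2) of an ideal worm-like chain."""
    ratio = persistence_nm / contour_nm
    return 2.0 * persistence_nm * contour_nm * (
        1.0 - ratio * (1.0 - math.exp(-1.0 / ratio))
    )

# Illustrative only: a roughly 100 bp fragment (about 34 nm of contour length)
# is shorter than the assumed persistence length, so it behaves almost like a rod.
rms_nm = math.sqrt(wlc_mean_square_end_to_end(34.0))
print(f"RMS end-to-end distance: {rms_nm:.1f} nm of 34.0 nm contour")
```

The two limits are a quick sanity check: a chain much shorter than P has an end-to-end distance close to its contour length, while a much longer chain approaches the random-walk value 2PL.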
11.  SIB-BLAST: a web server for improved delineation of true and false positives in PSI-BLAST searches 
Nucleic Acids Research  2009;37(Web Server issue):W53-W56.
A SIB-BLAST web server (http://sib-blast.osc.edu) has been established for investigators to use the SimpleIsBeautiful (SIB) algorithm for sequence-based homology detection. SIB was developed to overcome the model corruption frequently observed in the later iterations of PSI-BLAST searches. The algorithm compares resultant hits from the second iteration to the final iteration of a PSI-BLAST search, calculates the figure of merit for each ‘overlapped’ hit and re-ranks the hits according to their figure of merit. By validating hits generated from the last profile against hits from the first profile when the model is least corrupted, the true and false positives are better delineated, which in turn, improves the accuracy of iterative PSI-BLAST searches. Notably, this improvement to PSI-BLAST comes at minimal computational cost as SIB-BLAST utilizes existing results already produced in a PSI-BLAST search.
doi:10.1093/nar/gkp301
PMCID: PMC2703926  PMID: 19429693
12.  Contribution of ribosomal residues to P-site tRNA binding 
Nucleic Acids Research  2009;37(12):4033-4042.
Structural studies have revealed multiple contacts between the ribosomal P site and tRNA, but how these contacts contribute to P-tRNA binding remains unclear. In this study, the effects of ribosomal mutations on the dissociation rate (koff) of various tRNAs from the P site were measured. Mutation of the 30S P site destabilized tRNAs to various degrees, depending on the mutation and the species of tRNA. These data support the idea that ribosome-tRNA interactions are idiosyncratically tuned to ensure stable binding of all tRNA species. Unlike deacylated elongator tRNAs, N-acetyl-aminoacyl-tRNAs and tRNAfMet dissociated from the P site at a similar low rate, even in the presence of various P-site mutations. These data provide evidence for a stability threshold for P-tRNA binding and suggest that ribosome-tRNAfMet interactions are uniquely tuned for tight binding. The effects of 16S rRNA mutation G1338U were suppressed by 50S E-site mutation C2394A, suggesting that G1338 is particularly important for stabilizing tRNA in the P/E site. Finally, mutation C2394A or the presence of an N-acetyl-aminoacyl group slowed the association rate (kon) of tRNA dramatically, suggesting that deacylated tRNA binds the P site of the ribosome via the E site.
doi:10.1093/nar/gkp296
PMCID: PMC2709574  PMID: 19417061
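To make the role of the dissociation rate concrete, the sketch below assumes simple first-order dissociation, in which the bound fraction decays as exp(-koff·t) and the half-life is ln(2)/koff. The function names and the rate constants are hypothetical and chosen only for illustration; they are not measurements from the study, and the abstract does not state that this exact kinetic model was used.

```python
import math

def fraction_bound(time_s, koff_per_s):
    """Fraction of complexes still bound after time_s, assuming first-order decay."""
    return math.exp(-koff_per_s * time_s)

def half_life_s(koff_per_s):
    """Time for half of the complexes to dissociate under first-order decay."""
    return math.log(2.0) / koff_per_s

# Hypothetical rate constants: a tenfold larger koff means a tenfold shorter half-life.
for label, koff in [("slow", 0.001), ("fast", 0.01)]:
    print(f"{label}: koff = {koff} per s, half-life = {half_life_s(koff):.0f} s")
```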
13.  A Structured Viroid RNA Serves as a Substrate for Dicer-Like Cleavage To Produce Biologically Active Small RNAs but Is Resistant to RNA-Induced Silencing Complex-Mediated Degradation
Journal of Virology  2007;81(6):2980-2994.
RNA silencing is a potent means of antiviral defense in plants and animals. A hallmark of this defense response is the production of 21- to 24-nucleotide viral small RNAs via mechanisms that remain to be fully understood. Many viruses encode suppressors of RNA silencing, and some viral RNAs function directly as silencing suppressors as counterdefense. The occurrence of viroid-specific small RNAs in infected plants suggests that viroids can trigger RNA silencing in a host, raising the question of how these noncoding and unencapsidated RNAs survive cellular RNA-silencing systems. We address this question by characterizing the production of small RNAs of Potato spindle tuber viroid (srPSTVds) and investigating how PSTVd responds to RNA silencing. Our molecular and biochemical studies provide evidence that srPSTVds were derived mostly from the secondary structure of viroid RNAs. Replication of PSTVd was resistant to RNA silencing, although the srPSTVds were biologically active in guiding RNA-induced silencing complex (RISC)-mediated cleavage, as shown with a sensor system. Further analyses showed that without possessing or triggering silencing suppressor activities, the PSTVd secondary structure played a critical role in resistance to RISC-mediated cleavage. These findings support the hypothesis that some infectious RNAs may have evolved specific secondary structures as an effective means to evade RNA silencing in addition to encoding silencing suppressor activities. Our results should have important implications in further studies on RNA-based mechanisms of host-pathogen interactions and the biological constraints that shape the evolution of infectious RNA structures.
doi:10.1128/JVI.02339-06
PMCID: PMC1865973  PMID: 17202210
14.  Discovery of new genes and deletion editing in Physarum mitochondria enabled by a novel algorithm for finding edited mRNAs 
Nucleic Acids Research  2005;33(16):5063-5072.
Gene finding is complicated in organisms that exhibit insertional RNA editing. Here, we demonstrate how our new algorithm Predictor of Insertional Editing (PIE) can be used to locate genes whose mRNAs are subjected to multiple frameshifting events, and extend the algorithm to include probabilistic predictions for sites of nucleotide insertion; this feature is particularly useful when designing primers for sequencing edited RNAs. Applying this algorithm, we successfully identified the nad2, nad4L, nad6 and atp8 genes within the mitochondrial genome of Physarum polycephalum, which had gone undetected by existing programs. Characterization of their mRNA products led to the unanticipated discovery of nucleotide deletion editing in Physarum. The deletion event, which results in the removal of three adjacent A residues, was confirmed by primer extension sequencing of total RNA. This finding is remarkable in that it comprises the first known instance of nucleotide deletion in this organelle, to be contrasted with nearly 500 sites of single and dinucleotide addition in characterized mitochondrial RNAs. Statistical analysis of this larger pool of editing sites indicates that there are significant biases in the 2 nt immediately upstream of editing sites, including a reduced incidence of nucleotide repeats, in addition to the previously identified purine-U bias.
doi:10.1093/nar/gki820
PMCID: PMC1201332  PMID: 16147990
15.  The estimation of statistical parameters for local alignment score distributions 
Nucleic Acids Research  2001;29(2):351-361.
The distribution of optimal local alignment scores of random sequences plays a vital role in evaluating the statistical significance of sequence alignments. These scores can be well described by an extreme-value distribution. The distribution’s parameters depend upon the scoring system employed and the random letter frequencies; in general they cannot be derived analytically, but must be estimated by curve fitting. For obtaining accurate parameter estimates, a form of the recently described ‘island’ method has several advantages. We describe this method in detail, and use it to investigate the functional dependence of these parameters on finite-length edge effects.
PMCID: PMC29669  PMID: 11139604
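As a rough illustration of how an extreme-value distribution converts a local alignment score into a significance estimate, the sketch below uses the standard form P(S ≥ x) ≈ 1 - exp(-K·m·n·exp(-λx)) for sequences of lengths m and n. The function name and the values of λ and K are assumptions chosen for illustration; in practice these parameters must be estimated for the scoring system in use, as the abstract explains. The sketch does not implement the island method or any edge-effect correction.

```python
import math

def evd_p_value(score, m, n, lam, k):
    """Approximate P(S >= score) for the optimal local alignment of two
    random sequences of lengths m and n under an extreme-value distribution."""
    expected_count = k * m * n * math.exp(-lam * score)  # the E-value
    # 1 - exp(-E), computed with expm1 to stay accurate when E is very small
    return -math.expm1(-expected_count)

# Illustrative parameters only (assumed, not taken from the paper).
p = evd_p_value(score=60, m=300, n=300, lam=0.27, k=0.04)
print(f"P(S >= 60) is approximately {p:.3g}")
```

Because the score enters through exp(-λx), small errors in the estimate of λ change the p-value multiplicatively, which is why accurate parameter estimation matters.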
