Results 1-5 (5)

1.  Meeting report of the RNA Ontology Consortium January 8-9, 2011 
Standards in Genomic Sciences  2011;4(2):252-256.
This report summarizes the proceedings of the structure mapping working group meeting of the RNA Ontology Consortium (ROC), held in Kona, Hawaii on January 8-9, 2011. The ROC hosted this workshop to facilitate collaborations among those researchers formalizing concepts in RNA, those developing RNA-related software, and those performing genome annotation and standardization. The workshop included three software presentations, extended round-table discussions, and the constitution of two new working groups, the first to address the need for better software integration and the second to discuss standardization and benchmarking of existing RNA annotation pipelines. These working groups have subsequently pursued concrete implementation of actions suggested during the discussion. Further information about the ROC and its activities can be found at
PMCID: PMC3111981  PMID: 21677862
2.  NoiseMaker: simulated screens for statistical assessment 
Bioinformatics  2010;26(19):2484-2485.
Summary: High-throughput screening (HTS) is a common technique for both drug discovery and basic research, but researchers often struggle with how best to derive hits from HTS data. While a wide range of hit identification techniques exist, little information is available about their sensitivity and specificity, especially in comparison to each other. To address this, we have developed the open-source NoiseMaker software tool for generation of realistically noisy virtual screens. By applying potential hit identification methods to NoiseMaker-simulated data and determining how many of the pre-defined true hits are recovered (as well as how many known non-hits are misidentified as hits), one can draw conclusions about the likely performance of these techniques on real data containing unknown true hits. Such simulations apply to a range of screens, such as those using small molecules, siRNAs, shRNAs, miRNA mimics or inhibitors, or gene over-expression; we demonstrate this utility by using it to explain apparently conflicting reports about the performance of the B score hit identification method.
Availability and implementation: NoiseMaker is written in C#, an ECMA and ISO standard language with compilers for multiple operating systems. Source code, a Windows installer and complete unit tests are available at Full documentation and support are provided via an extensive help file and tool-tips, and the developers welcome user suggestions.
Supplementary information: Supplementary data are available at Bioinformatics online.
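The evaluation loop described in the summary (plant known true hits in noisy simulated data, apply a hit identification method, then count recovered hits and misidentified non-hits) can be sketched as follows. This is a minimal illustration in Python, not NoiseMaker's own code (which is written in C#). The Gaussian noise model, the robust z-score method, the cutoff, and all function and parameter names are assumptions chosen for the example.

```python
import random
import statistics


def simulate_screen(n_wells=1000, n_true_hits=50, effect=-3.0, noise_sd=1.0, seed=0):
    """Hypothetical virtual screen: Gaussian noise plus a fixed shift for true hits."""
    rng = random.Random(seed)
    hit_wells = set(rng.sample(range(n_wells), n_true_hits))
    values, truth = [], []
    for well in range(n_wells):
        is_hit = well in hit_wells
        values.append(rng.gauss(effect if is_hit else 0.0, noise_sd))
        truth.append(is_hit)
    return values, truth


def robust_z_hits(values, cutoff=-2.0):
    """Call a well a hit when its median/MAD-based z-score is at or below the cutoff."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [(v - med) / scale <= cutoff for v in values]


def sensitivity_specificity(called, truth):
    """Compare the calls against the pre-defined true hits."""
    tp = sum(c and t for c, t in zip(called, truth))
    fn = sum((not c) and t for c, t in zip(called, truth))
    tn = sum((not c) and (not t) for c, t in zip(called, truth))
    fp = sum(c and (not t) for c, t in zip(called, truth))
    return tp / (tp + fn), tn / (tn + fp)


values, truth = simulate_screen()
called = robust_z_hits(values)
sens, spec = sensitivity_specificity(called, truth)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Repeating the run over many seeds, effect sizes, and candidate methods would give the kind of side-by-side sensitivity and specificity comparison the summary describes.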
PMCID: PMC2944205  PMID: 20702398
3.  Statistical Methods for Analysis of High-Throughput RNA Interference Screens 
Nature Methods  2009;6(8):569-575.
RNA interference (RNAi) has become a powerful technique for reverse genetics and drug discovery and, in both of these areas, large-scale high-throughput RNAi screens are commonly performed. The statistical techniques used to analyze these screens are frequently borrowed directly from small-molecule screening; however, small-molecule and RNAi data characteristics differ in meaningful ways. We examine the similarities and differences between RNAi and small-molecule screens, highlighting particular characteristics of RNAi screen data that must be addressed during analysis. Additionally, we provide guidance on selection of analysis techniques in the context of a sample workflow.
PMCID: PMC2789971  PMID: 19644458
4.  Genome Reshuffling for Advanced Intercross Permutation (GRAIP): Simulation and Permutation for Advanced Intercross Population Analysis 
PLoS ONE  2008;3(4):e1977.
Background
Advanced intercross lines (AIL) are segregating populations created using a multi-generation breeding protocol for fine mapping complex trait loci (QTL) in mice and other organisms. Applying QTL mapping methods for intercross and backcross populations, often followed by naïve permutation of individuals and phenotypes, does not account for the effect of AIL family structure in which final generations have been expanded and leads to inappropriately low significance thresholds. The critical problem with naïve mapping approaches in AIL populations is that the individual is not an exchangeable unit.
Methodology/Principal Findings
The effect of family structure has immediate implications for the optimal AIL creation (many crosses, few animals per cross, and population expansion before the final generation) and we discuss these and the utility of AIL populations for QTL fine mapping. We also describe Genome Reshuffling for Advanced Intercross Permutation (GRAIP), a method for analyzing AIL data that accounts for family structure. GRAIP permutes a more interchangeable unit in the final generation crosses, the parental genome, and simulates regeneration of a permuted AIL population based on exchanged parental identities. GRAIP determines appropriate genome-wide significance thresholds and locus-specific P-values for AILs and other populations with similar family structures. We contrast GRAIP with naïve permutation using a large densely genotyped mouse AIL population (1333 individuals from 32 crosses). A naïve permutation using coat color as a model phenotype demonstrates high false-positive locus identification and uncertain significance levels, which are corrected using GRAIP. GRAIP also detects an established hippocampus weight locus and a new locus, Hipp9a.
Conclusions and Significance
GRAIP determines appropriate genome-wide significance thresholds and locus-specific P-values for AILs and other populations with similar family structures. The effect of family structure has immediate implications for the optimal AIL creation and we discuss these and the utility of AIL populations.
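For orientation, the naïve baseline that the abstract criticizes can be sketched as below: phenotypes are shuffled across individuals, and the genome-wide threshold is taken from the permutation distribution of the maximum marker statistic. This is not GRAIP, which permutes parental genomes and simulates regeneration of the population. The association statistic (absolute Pearson correlation), the data layout, and every function name here are assumptions made for illustration only.

```python
import math
import random


def abs_corr(x, y):
    """Absolute Pearson correlation, used here as a stand-in association statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def max_stat(genotypes_by_marker, phenotype):
    """Largest statistic over all markers (one genotype list per marker)."""
    return max(abs_corr(g, phenotype) for g in genotypes_by_marker)


def naive_permutation_threshold(genotypes_by_marker, phenotype,
                                n_perm=200, alpha=0.05, seed=0):
    """Genome-wide threshold from shuffling phenotypes across individuals.

    Treats every individual as exchangeable, which is exactly the
    assumption that family structure in an AIL violates.
    """
    rng = random.Random(seed)
    shuffled = list(phenotype)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_max.append(max_stat(genotypes_by_marker, shuffled))
    null_max.sort()
    idx = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return null_max[idx]
```

A locus whose observed statistic exceeds the returned value would be called genome-wide significant under this scheme. In a population with expanded families, related individuals are not independent draws, so a threshold obtained this way can be too low, which is the problem the abstract says GRAIP corrects.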
PMCID: PMC2295257  PMID: 18431467
5.  PyCogent: a toolkit for making sense from sequence 
Genome Biology  2007;8(8):R171.
The COmparative GENomic Toolkit, a framework for probabilistic analyses of biological sequences, devising workflows and generating publication quality graphics, has been implemented in Python.
We have implemented in Python the COmparative GENomic Toolkit, a fully integrated and thoroughly tested framework for novel probabilistic analyses of biological sequences, devising workflows, and generating publication quality graphics. PyCogent includes connectors to remote databases, built-in generalized probabilistic techniques for working with biological sequences, and controllers for third-party applications. The toolkit takes advantage of parallel architectures and runs on a range of hardware and operating systems, and is available under the general public license from .
PMCID: PMC2375001  PMID: 17708774
