Results 1-25 (2745)
 

1.  A variable selection method for genome-wide association studies 
Bioinformatics  2010;27(1):1-8.
Motivation: Genome-wide association studies (GWAS) involving half a million or more single nucleotide polymorphisms (SNPs) allow genetic dissection of complex diseases in a holistic manner. The common practice of analyzing one SNP at a time does not fully realize the potential of GWAS to identify multiple causal variants and to predict risk of disease. Existing methods for joint analysis of GWAS data tend to miss causal SNPs that are marginally uncorrelated with disease and have high false discovery rates (FDRs).
Results: We introduce GWASelect, a statistically powerful and computationally efficient variable selection method designed to tackle the unique challenges of GWAS data. This method searches iteratively over the potential SNPs conditional on previously selected SNPs and is thus capable of capturing causal SNPs that are marginally correlated with disease as well as those that are marginally uncorrelated with disease. A special resampling mechanism is built into the method to reduce false positive findings. Simulation studies demonstrate that the GWASelect performs well under a wide spectrum of linkage disequilibrium patterns and can be substantially more powerful than existing methods in capturing causal variants while having a lower FDR. In addition, the regression models based on the GWASelect tend to yield more accurate prediction of disease risk than existing methods. The advantages of the GWASelect are illustrated with the Wellcome Trust Case-Control Consortium (WTCCC) data.
Availability: The software implementing GWASelect is available at http://www.bios.unc.edu/~lin.
Access to WTCCC data: http://www.wtccc.org.uk/
Contact: lin@bios.unc.edu
Supplementary information: Supplementary data are available at Bioinformatics Online.
doi:10.1093/bioinformatics/btq600
PMCID: PMC3025714  PMID: 21036813
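The idea of searching iteratively over SNPs conditional on those already selected, then using resampling to filter false positives, can be illustrated with a toy sketch. This is not the GWASelect algorithm: it assumes a quantitative trait instead of case-control status, uses a simple greedy stagewise fit on residuals as a stand-in for the conditional search, and keeps SNPs that are chosen in a large fraction of random half-samples. All function names and parameters (`forward_select`, `stable_select`, `steps`, `min_freq`) are hypothetical.

```python
import random


def _center(values):
    """Subtract the mean so that sums of products act as covariances."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def forward_select(X, y, rows, steps):
    """Greedy stagewise selection on the given sample rows.

    At each step, pick the unselected column that most reduces the residual
    sum of squares, then remove its fitted contribution from the residual.
    """
    resid = _center([y[i] for i in rows])
    cols = {j: _center([X[i][j] for i in rows]) for j in range(len(X[0]))}
    chosen = []
    for _ in range(steps):
        best_j, best_score, best_beta = None, 0.0, 0.0
        for j, x in cols.items():
            if j in chosen:
                continue
            sxx = sum(v * v for v in x)
            if sxx == 0:
                continue  # monomorphic in this subsample
            sxy = sum(a * b for a, b in zip(x, resid))
            score = sxy * sxy / sxx  # drop in residual sum of squares
            if score > best_score:
                best_j, best_score, best_beta = j, score, sxy / sxx
        if best_j is None:
            break
        chosen.append(best_j)
        resid = [r - best_beta * v for r, v in zip(resid, cols[best_j])]
    return chosen


def stable_select(X, y, steps=2, n_resamples=20, min_freq=0.8, seed=0):
    """Keep columns selected in at least min_freq of random half-samples."""
    rng = random.Random(seed)
    n = len(y)
    counts = {}
    for _ in range(n_resamples):
        rows = rng.sample(range(n), n // 2)
        for j in forward_select(X, y, rows, steps):
            counts[j] = counts.get(j, 0) + 1
    return sorted(j for j, c in counts.items() if c / n_resamples >= min_freq)
```

Here `X` is a list of per-individual genotype rows coded 0/1/2 and `y` is the trait. Requiring a SNP to recur across subsamples is what trades a little power for fewer spurious selections in this sketch.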
2.  A variable selection method for genome-wide association studies 
Bioinformatics (Oxford, England)  2010;27(1):1-8.
Motivation
Genome-wide association studies (GWAS) involving half a million or more single nucleotide polymorphisms (SNPs) allow genetic dissection of complex diseases in a holistic manner. The common practice of analyzing one SNP at a time does not fully realize the potential of GWAS to identify multiple causal variants and to predict risk of disease. Existing methods for joint analysis of GWAS data tend to miss causal SNPs that are marginally uncorrelated with disease and have high false discovery rates (FDRs).
Results
We introduce GWASelect, a statistically powerful and computationally efficient variable selection method designed to tackle the unique challenges of GWAS data. This method searches iteratively over the potential SNPs conditional on previously selected SNPs and is thus capable of capturing causal SNPs that are marginally correlated with disease as well as those that are marginally uncorrelated with disease. A special resampling mechanism is built into the method to reduce false-positive findings. Simulation studies demonstrate that the GWASelect performs well under a wide spectrum of linkage disequilibrium (LD) patterns and can be substantially more powerful than existing methods in capturing causal variants while having a lower FDR. In addition, the regression models based on the GWASelect tend to yield more accurate prediction of disease risk than existing methods. The advantages of the GWASelect are illustrated with the Wellcome Trust Case-Control Consortium (WTCCC) data.
doi:10.1093/bioinformatics/btq600
PMCID: PMC3025714  PMID: 21036813
3.  Efficient whole-genome association mapping using local phylogenies for unphased genotype data 
Bioinformatics  2008;24(19):2215-2221.
Motivation: Recent advances in genotyping technology have made data acquisition for whole-genome association study cost effective, and a current active area of research is developing efficient methods to analyze such large-scale datasets. Most sophisticated association mapping methods that are currently available take phased haplotype data as input. However, phase information is not readily available from sequencing methods and inferring the phase via computational approaches is time-consuming, taking days to phase a single chromosome.
Results: In this article, we devise an efficient method for scanning unphased whole-genome data for association. Our approach combines a recently found linear-time algorithm for phasing genotypes on trees with a recently proposed tree-based method for association mapping. From unphased genotype data, our algorithm builds local phylogenies along the genome, and scores each tree according to the clustering of cases and controls. We assess the performance of our new method on both simulated and real biological datasets.
Availability: The software described in this article is available at http://www.daimi.au.dk/~mailund/Blossoc and distributed under the GNU General Public License.
Contact: mailund@birc.au.dk
doi:10.1093/bioinformatics/btn406
PMCID: PMC2553438  PMID: 18667442
4.  iFoldRNA: three-dimensional RNA structure prediction and folding 
Bioinformatics  2008;24(17):1951-1952.
Summary: Three-dimensional RNA structure prediction and folding is of significant interest in the biological research community. Here, we present iFoldRNA, a novel web-based methodology for RNA structure prediction with near atomic resolution accuracy and analysis of RNA folding thermodynamics. iFoldRNA rapidly explores RNA conformations using discrete molecular dynamics simulations of input RNA sequences. Starting from simplified linear-chain conformations, RNA molecules (<50 nt) fold to native-like structures within half an hour of simulation, facilitating rapid RNA structure prediction. All-atom reconstruction of energetically stable conformations generates iFoldRNA predicted RNA structures. The predicted RNA structures are within 2–5 Å root mean square deviations (RMSDs) from corresponding experimentally derived structures. RNA folding parameters, including specific heat, contact maps, simulation trajectories, gyration radii, RMSDs from native state and fraction of native-like contacts, are accessible from iFoldRNA. We expect iFoldRNA will serve as a useful resource for RNA structure prediction and folding thermodynamic analyses.
Availability: http://iFoldRNA.dokhlab.org.
Contact: dokh@med.unc.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btn328
PMCID: PMC2559968  PMID: 18579566
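The 2–5 Å accuracy figure above is a root mean square deviation between matched atoms of a predicted and an experimental structure. A minimal sketch of that quantity follows, assuming the two coordinate lists are already superimposed and list the same atoms in the same order; the optimal superposition step that structure-comparison tools normally apply first is omitted, and the function name `rmsd` is illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superimposed and atoms are paired
    by position in the lists.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

With coordinates in ångströms, the return value is in ångströms as well.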
5.  Systematic biological prioritization after a genome-wide association study: an application to nicotine dependence 
Bioinformatics  2008;24(16):1805-1811.
Motivation: A challenging problem after a genome-wide association study (GWAS) is to balance the statistical evidence of genotype–phenotype correlation with a priori evidence of biological relevance.
Results: We introduce a method for systematically prioritizing single nucleotide polymorphisms (SNPs) for further study after a GWAS. The method combines evidence across multiple domains including statistical evidence of genotype–phenotype correlation, known pathways in the pathologic development of disease, SNP/gene functional properties, comparative genomics, prior evidence of genetic linkage, and linkage disequilibrium. We apply this method to a GWAS of nicotine dependence, and use simulated data to test it on several commercial SNP microarrays.
Availability: A comprehensive database of biological prioritization scores for all known SNPs is available at http://zork.wustl.edu/gin. This can be used to prioritize nicotine dependence association studies through a straightforward mathematical formula—no special software is necessary.
Contact: ssaccone@wustl.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btn315
PMCID: PMC2610477  PMID: 18565990
6.  Comprehensive in silico mutagenesis highlights functionally important residues in proteins 
Bioinformatics  2008;24(16):i207-i212.
Motivation: Mutating residues into alanine (alanine scanning) is one of the fastest experimental means of probing hypotheses about protein function. Alanine scans can reveal functional hot spots, i.e. residues that alter function upon mutation. In vitro mutagenesis is cumbersome and costly: probing all residues in a protein is typically as impossible as substituting by all non-native amino acids. In contrast, such exhaustive mutagenesis is feasible in silico.
Results: Previously, we developed SNAP to predict functional changes due to non-synonymous single nucleotide polymorphisms. Here, we applied SNAP to all experimental mutations in the ASEdb database of alanine scans; we identified 70% of the hot spots (≥1 kCal/mol change in binding energy); more severe changes were predicted more accurately. Encouraged, we carried out a complete all-against-all in silico mutagenesis for human glucokinase. Many of the residues predicted as functionally important have indeed been confirmed in the literature, others await experimental verification, and our method is ready to aid in the design of in vitro mutagenesis.
Availability: ASEdb and glucokinase scores are available at http://www.rostlab.org/services/SNAP. For submissions of large/whole proteins for processing please contact the author.
Contact: yb2009@columbia.edu
doi:10.1093/bioinformatics/btn268
PMCID: PMC2597370  PMID: 18689826
7.  Systematic biological prioritization after a genome-wide association study 
Bioinformatics (Oxford, England)  2008;24(16):1805-1811.
Motivation
A challenging problem after a genome-wide association study (GWAS) is to balance the statistical evidence of genotype-phenotype correlation with a priori evidence of biological relevance.
Results
We introduce a method for systematically prioritizing single nucleotide polymorphisms (SNPs) for further study after a GWAS. The method combines evidence across multiple domains, including statistical evidence of genotype-phenotype correlation, known pathways in the pathologic development of disease, SNP/gene functional properties, comparative genomics, prior evidence of genetic linkage, and linkage disequilibrium. We apply this method to a GWAS of nicotine dependence, and use simulated data to test it on several commercial SNP microarrays.
doi:10.1093/bioinformatics/btn315
PMCID: PMC2610477  PMID: 18565990
8.  LOT: a tool for linkage analysis of ordinal traits for pedigree data 
Bioinformatics  2008;24(15):1737-1739.
Summary: Existing linkage-analysis methods address binary or quantitative traits. However, many complex diseases and human conditions, particularly behavioral disorders, are rated on ordinal scales. Herein, we introduce LOT, a tool that performs linkage analysis of ordinal traits for pedigree data. It implements a latent-variable proportional-odds logistic model that relates inheritance patterns to the distribution of the ordinal trait. The likelihood-ratio test is used for testing evidence of linkage.
Availability: The LOT program is available for download at http://c2s2.yale.edu/software/LOT/
Contact: heping.zhang@yale.edu
doi:10.1093/bioinformatics/btn258
PMCID: PMC2566542  PMID: 18535081
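For readers unfamiliar with the model class named in the summary, the generic textbook form of a proportional-odds logistic model and of a likelihood-ratio statistic is sketched below. The symbols are assumptions for illustration: Y is the ordinal trait with K levels, u stands for whatever linear predictor carries the inheritance information and latent variable, and the sign convention in front of \beta varies between texts. How LOT itself parameterizes u is not stated in the abstract.

```latex
% Proportional-odds model: one slope shared across all K-1 cumulative logits
\operatorname{logit} \Pr(Y \le k \mid u) = \alpha_k - \beta u,
\qquad k = 1, \dots, K - 1,
\qquad \alpha_1 \le \dots \le \alpha_{K-1}

% Likelihood-ratio statistic comparing the fitted model with the null (no linkage)
\Lambda = 2 \left[ \ell(\hat{\theta}) - \ell(\hat{\theta}_0) \right]
```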
9.  Memory-efficient dynamic programming backtrace and pairwise local sequence alignment 
Bioinformatics (Oxford, England)  2008;24(16):1772-1778.
Motivation
A backtrace through a dynamic programming algorithm’s intermediate results in search of an optimal path, or to sample paths according to an implied probability distribution, or as the second stage of a forward–backward algorithm, is a task of fundamental importance in computational biology. When there is insufficient space to store all intermediate results in high-speed memory (e.g. cache) existing approaches store selected stages of the computation, and recompute missing values from these checkpoints on an as-needed basis.
Results
Here we present an optimal checkpointing strategy, and demonstrate its utility with pairwise local sequence alignment of sequences of length 10 000.
Availability
Sample C++-code for optimal backtrace is available in the Supplementary Materials.
doi:10.1093/bioinformatics/btn308
PMCID: PMC2668612  PMID: 18558620
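The checkpoint-and-recompute idea described in the Motivation can be sketched for Smith-Waterman local alignment. This sketch uses plain evenly spaced checkpoints (one stored row about every sqrt(n) rows), not the optimal strategy the paper presents, and the scoring values and function names are assumptions chosen for illustration.

```python
import math


def sw_row(prev, a_char, b, match, mismatch, gap):
    """Compute one Smith-Waterman row from the previous row."""
    cur = [0] * (len(b) + 1)
    for j in range(1, len(b) + 1):
        s = match if a_char == b[j - 1] else mismatch
        cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
    return cur


def local_align_checkpointed(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment that stores only every k-th row, k ~ sqrt(len(a))."""
    n = len(a)
    k = max(1, math.isqrt(n))
    checkpoints = {0: [0] * (len(b) + 1)}
    row = checkpoints[0]
    best, bi, bj = 0, 0, 0
    # Forward pass: keep checkpoint rows and the best-scoring cell only.
    for i in range(1, n + 1):
        row = sw_row(row, a[i - 1], b, match, mismatch, gap)
        if i % k == 0:
            checkpoints[i] = row
        j = max(range(len(row)), key=row.__getitem__)
        if row[j] > best:
            best, bi, bj = row[j], i, j

    block = {}  # rows of the one block currently recomputed

    def get_row(r):
        if r in checkpoints:
            return checkpoints[r]
        if r not in block:
            # Recompute the rows between the nearest checkpoints.
            block.clear()
            c = (r // k) * k
            cur = checkpoints[c]
            for i in range(c + 1, min(c + k, n + 1)):
                cur = sw_row(cur, a[i - 1], b, match, mismatch, gap)
                block[i] = cur
        return block[r]

    # Backtrace from the best cell until a zero score is reached.
    i, j = bi, bj
    out_a, out_b = [], []
    while get_row(i)[j] > 0:
        cur, up = get_row(i), get_row(i - 1)
        s = match if a[i - 1] == b[j - 1] else mismatch
        if cur[j] == up[j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif cur[j] == up[j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

At any time the sketch holds roughly n/k checkpoint rows plus at most k - 1 recomputed rows, and each block of rows is recomputed at most once during the backtrace because the trace only moves towards lower row indices.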
10.  Modeling peptide fragmentation with dynamic Bayesian networks for peptide identification 
Bioinformatics (Oxford, England)  2008;24(13):i348-i356.
Motivation
Tandem mass spectrometry (MS/MS) is an indispensable technology for identification of proteins from complex mixtures. Proteins are digested to peptides that are then identified by their fragmentation patterns in the mass spectrometer. Thus, at its core, MS/MS protein identification relies on the relative predictability of peptide fragmentation. Unfortunately, peptide fragmentation is complex and not fully understood, and what is understood is not always exploited by peptide identification algorithms.
Results
We use a hybrid dynamic Bayesian network (DBN)/support vector machine (SVM) approach to address these two problems. We train a set of DBNs on high-confidence peptide-spectrum matches. These DBNs, known collectively as Riptide, comprise a probabilistic model of peptide fragmentation chemistry. Examination of the distributions learned by Riptide allows identification of new trends, such as prevalent a-ion fragmentation at peptide cleavage sites C-term to hydrophobic residues. In addition, Riptide can be used to produce likelihood scores that indicate whether a given peptide-spectrum match is correct. A vector of such scores is evaluated by an SVM, which produces a final score to be used in peptide identification. Using Riptide in this way yields improved discrimination when compared to other state-of-the-art MS/MS identification algorithms, increasing the number of positive identifications by as much as 12% at a 1% false discovery rate.
Availability
Python and C source code are available upon request from the authors. The curated training sets are available at http://noble.gs.washington.edu/proj/intense/. The Graphical Model Tool Kit (GMTK) is freely available at http://ssli.ee.washington.edu/bilmes/gmtk.
Contact
noble@gs.washington.edu
doi:10.1093/bioinformatics/btn189
PMCID: PMC2665034  PMID: 18586734
11.  Comprehensive in silico mutagenesis highlights functionally important residues in proteins 
Bioinformatics (Oxford, England)  2008;24(16):i207-i212.
Motivation
Mutating residues into alanine (alanine scanning) is one of the fastest experimental means of probing hypotheses about protein function. Alanine scans can reveal functional hot spots, i.e. residues that alter function upon mutation. In vitro mutagenesis is cumbersome and costly: probing all residues in a protein is typically as impossible as substituting by all non-native amino acids. In contrast, such exhaustive mutagenesis is feasible in silico.
Results
Previously, we developed SNAP to predict functional changes due to non-synonymous single nucleotide polymorphisms. Here, we applied SNAP to all experimental mutations in the ASEdb database of alanine scans; we identified 70% of the hot spots (≥1 kCal/mol change in binding energy); more severe changes were predicted more accurately. Encouraged, we carried out a complete all-against-all in silico mutagenesis for human glucokinase. Many of the residues predicted as functionally important have indeed been confirmed in the literature, others await experimental verification, and our method is ready to aid in the design of in vitro mutagenesis.
Availability
ASEdb and glucokinase scores are available at http://www.rostlab.org/services/SNAP. For submissions of large/whole proteins for processing please contact the author.
Contact: yb2009@columbia.edu
doi:10.1093/bioinformatics/btn268
PMCID: PMC2597370  PMID: 18689826
12.  Powerful fusion: PSI-BLAST and consensus sequences 
Bioinformatics (Oxford, England)  2008;24(18):1987-1993.
Motivation
A typical PSI-BLAST search consists of iterative scanning and alignment of a large sequence database during which a scoring profile is progressively built and refined. Such a profile can also be stored and used to search against a different database of sequences. Using it to search against a database of consensus rather than native sequences is a simple add-on that boosts performance surprisingly well. The improvement comes at a price: we hypothesized that random alignment score statistics would differ between native and consensus sequences. Thus PSI-BLAST-based profile searches against consensus sequences might incorrectly estimate statistical significance of alignment scores. In addition, iterative searches against consensus databases may fail. Here, we addressed these challenges in an attempt to harness the full power of the combination of PSI-BLAST and consensus sequences.
Results
We studied alignment score statistics for various types of consensus sequences. In general, the score distribution parameters of profile-based consensus sequence alignments differed significantly from those derived for the native sequences. PSI-BLAST partially compensated for the parameter variation. We have identified a protocol for building specialized consensus sequences that significantly improved search sensitivity and preserved score distribution parameters. As a result, PSI-BLAST profiles can be used to search specialized consensus sequences without sacrificing estimates of statistical significance. We also provided results indicating that iterative PSI-BLAST searches against consensus sequences could work very well. Overall, we showed how a widely popular and effective method could be used to identify significantly more relevant similarities among protein sequences.
Availability
http://www.rostlab.org/services/consensus/
Contact:
dsp23@columbia.edu
doi:10.1093/bioinformatics/btn384
PMCID: PMC2577777  PMID: 18678588
13.  Efficient Whole-Genome Association Mapping using Local Phylogenies for Unphased Genotype Data 
Bioinformatics (Oxford, England)  2008;24(19):2215-2221.
Motivation
Recent advances in genotyping technology have made data acquisition for whole-genome association study cost effective, and a current active area of research is developing efficient methods to analyze such large-scale data sets. Most sophisticated association mapping methods that are currently available take phased haplotype data as input. However, phase information is not readily available from sequencing methods and inferring the phase via computational approaches is time-consuming, taking days to phase a single chromosome.
Results
In this paper, we devise an efficient method for scanning unphased whole-genome data for association. Our approach combines a recently found linear-time algorithm for phasing genotypes on trees with a recently proposed tree-based method for association mapping. From unphased genotype data, our algorithm builds local phylogenies along the genome, and scores each tree according to the clustering of cases and controls. We assess the performance of our new method on both simulated and real biological data sets.
doi:10.1093/bioinformatics/btn406
PMCID: PMC2553438  PMID: 18667442
14.  LOT 
Bioinformatics (Oxford, England)  2008;24(15):1737-1739.
Summary
Existing linkage-analysis methods address binary or quantitative traits. However, many complex diseases and human conditions, particularly behavioral disorders, are rated on ordinal scales. Herein, we introduce LOT, a tool that performs linkage analysis of ordinal traits for pedigree data. It implements a latent-variable proportional-odds logistic model that relates inheritance patterns to the distribution of the ordinal trait. The likelihood-ratio test is used for testing evidence of linkage.
doi:10.1093/bioinformatics/btn258
PMCID: PMC2566542  PMID: 18535081
15.  iFoldRNA: Three-dimensional RNA Structure Prediction and Folding 
Bioinformatics (Oxford, England)  2008;24(17):1951-1952.
Summary
Three-dimensional RNA structure prediction and folding is of significant interest in the biological research community. Here, we present iFoldRNA, a novel web-based methodology for RNA structure prediction with near atomic resolution accuracy and analysis of RNA folding thermodynamics. iFoldRNA rapidly explores RNA conformations using discrete molecular dynamics simulations of input RNA sequences. Starting from simplified linear-chain conformations, RNA molecules (<50 nucleotides) fold to native-like structures within half an hour of simulation, facilitating rapid RNA structure prediction. All-atom reconstruction of energetically stable conformations generates iFoldRNA predicted RNA structures. The predicted RNA structures are within 2–5 Angstrom root mean square deviations from corresponding experimentally derived structures. RNA folding parameters including specific heat, contact maps, simulation trajectories, gyration radii, root mean square deviations from native state, fraction of native-like contacts are accessible from iFoldRNA. We expect iFoldRNA will serve as a useful resource for RNA structure prediction and folding thermodynamic analyses.
doi:10.1093/bioinformatics/btn328
PMCID: PMC2559968  PMID: 18579566
16.  Powerful fusion: PSI-BLAST and consensus sequences 
Bioinformatics  2008;24(18):1987-1993.
Motivation: A typical PSI-BLAST search consists of iterative scanning and alignment of a large sequence database during which a scoring profile is progressively built and refined. Such a profile can also be stored and used to search against a different database of sequences. Using it to search against a database of consensus rather than native sequences is a simple add-on that boosts performance surprisingly well. The improvement comes at a price: we hypothesized that random alignment score statistics would differ between native and consensus sequences. Thus PSI-BLAST-based profile searches against consensus sequences might incorrectly estimate statistical significance of alignment scores. In addition, iterative searches against consensus databases may fail. Here, we addressed these challenges in an attempt to harness the full power of the combination of PSI-BLAST and consensus sequences.
Results: We studied alignment score statistics for various types of consensus sequences. In general, the score distribution parameters of profile-based consensus sequence alignments differed significantly from those derived for the native sequences. PSI-BLAST partially compensated for the parameter variation. We have identified a protocol for building specialized consensus sequences that significantly improved search sensitivity and preserved score distribution parameters. As a result, PSI-BLAST profiles can be used to search specialized consensus sequences without sacrificing estimates of statistical significance. We also provided results indicating that iterative PSI-BLAST searches against consensus sequences could work very well. Overall, we showed how a very popular and effective method could be used to identify significantly more relevant similarities among protein sequences.
Availability: http://www.rostlab.org/services/consensus/
Contact: dsp23@columbia.edu
doi:10.1093/bioinformatics/btn384
PMCID: PMC2577777  PMID: 18678588
17.  Modeling peptide fragmentation with dynamic Bayesian networks for peptide identification 
Bioinformatics  2008;24(13):i348-i356.
Motivation: Tandem mass spectrometry (MS/MS) is an indispensable technology for identification of proteins from complex mixtures. Proteins are digested to peptides that are then identified by their fragmentation patterns in the mass spectrometer. Thus, at its core, MS/MS protein identification relies on the relative predictability of peptide fragmentation. Unfortunately, peptide fragmentation is complex and not fully understood, and what is understood is not always exploited by peptide identification algorithms.
Results: We use a hybrid dynamic Bayesian network (DBN)/support vector machine (SVM) approach to address these two problems. We train a set of DBNs on high-confidence peptide-spectrum matches. These DBNs, known collectively as Riptide, comprise a probabilistic model of peptide fragmentation chemistry. Examination of the distributions learned by Riptide allows identification of new trends, such as prevalent a-ion fragmentation at peptide cleavage sites C-term to hydrophobic residues. In addition, Riptide can be used to produce likelihood scores that indicate whether a given peptide-spectrum match is correct. A vector of such scores is evaluated by an SVM, which produces a final score to be used in peptide identification. Using Riptide in this way yields improved discrimination when compared to other state-of-the-art MS/MS identification algorithms, increasing the number of positive identifications by as much as 12% at a 1% false discovery rate.
Availability: Python and C source code are available upon request from the authors. The curated training sets are available at http://noble.gs.washington.edu/proj/intense/. The Graphical Model Tool Kit (GMTK) is freely available at http://ssli.ee.washington.edu/bilmes/gmtk.
Contact: noble@gs.washington.edu
doi:10.1093/bioinformatics/btn189
PMCID: PMC2665034  PMID: 18586734
18.  Memory-efficient dynamic programming backtrace and pairwise local sequence alignment 
Bioinformatics  2008;24(16):1772-1778.
Motivation: A backtrace through a dynamic programming algorithm's intermediate results in search of an optimal path, or to sample paths according to an implied probability distribution, or as the second stage of a forward–backward algorithm, is a task of fundamental importance in computational biology. When there is insufficient space to store all intermediate results in high-speed memory (e.g. cache) existing approaches store selected stages of the computation, and recompute missing values from these checkpoints on an as-needed basis.
Results: Here we present an optimal checkpointing strategy, and demonstrate its utility with pairwise local sequence alignment of sequences of length 10 000.
Availability: Sample C++-code for optimal backtrace is available in the Supplementary Materials.
Contact: leen@cs.rpi.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btn308
PMCID: PMC2668612  PMID: 18558620
23.  Census 2: isobaric labeling data analysis 
Bioinformatics  2014;30(15):2208-2209.
Motivation: We introduce Census 2, an update of a mass spectrometry data analysis tool for peptide/protein quantification. New features for analysis of isobaric labeling, such as Tandem Mass Tag (TMT) or Isobaric Tags for Relative and Absolute Quantification (iTRAQ), have been added in this version, including a reporter ion impurity correction, a reporter ion intensity threshold filter and an option for weighted normalization to correct mixing errors. TMT/iTRAQ analysis can be performed on experiments using HCD (High Energy Collision Dissociation) only, CID (Collision Induced Dissociation)/HCD (High Energy Collision Dissociation) dual scans or HCD triple-stage mass spectrometry data. To improve measurement accuracy, we implemented weighted normalization, multiple tandem spectral approach, impurity correction and dynamic intensity threshold features.
Availability and implementation: Census 2 supports multiple input file formats including MS1/MS2, DTASelect, mzXML and pepXML. It requires JAVA version 6 or later to run. Free download of Census 2 for academic users is available at http://fields.scripps.edu/census/index.php.
Contact: jyates@scripps.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btu151
PMCID: PMC4155478  PMID: 24681903
24.  Covariate-modulated local false discovery rate for genome-wide association studies 
Bioinformatics  2014;30(15):2098-2104.
Motivation: Genome-wide association studies (GWAS) have largely failed to identify most of the genetic basis of highly heritable diseases and complex traits. Recent work has suggested this could be because many genetic variants, each with individually small effects, compose their genetic architecture, limiting the power of GWAS, given currently obtainable sample sizes. In this scenario, Bonferroni-derived thresholds are severely underpowered to detect the vast majority of associations. Local false discovery rate (fdr) methods provide more power to detect non-null associations, but implicit assumptions about the exchangeability of single nucleotide polymorphisms (SNPs) limit their ability to discover non-null loci.
Methods: We propose a novel covariate-modulated local false discovery rate (cmfdr) that incorporates prior information about gene element–based functional annotations of SNPs, so that SNPs from categories enriched for non-null associations have a lower fdr for a given value of a test statistic than SNPs in unenriched categories. This readjustment of fdr based on functional annotations is achieved empirically by fitting a covariate-modulated parametric two-group mixture model. The proposed cmfdr methodology is applied to a large Crohn’s disease GWAS.
Results: Use of cmfdr dramatically improves power, e.g. increasing the number of loci declared significant at the 0.05 fdr level by a factor of 5.4. Using the eight Crohn’s disease substudies as independent training and test datasets, we also demonstrate that SNPs declared significant using cmfdr replicate in much higher numbers than those declared significant using the usual fdr, while maintaining similar replication rates for a given fdr cutoff in de novo samples.
Availability and implementation: https://sites.google.com/site/covmodfdr/
Contact: wes.stat@gmail.com
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btu145
PMCID: PMC4103587  PMID: 24711653
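The two-group mixture behind local fdr methods, and the way covariates can modulate it, can be written out as follows. The first line is the standard two-group setup for a test statistic z. The second is a generic covariate-dependent extension given as an assumption: x denotes a SNP's annotation vector, and the abstract does not specify the parametric forms of \pi_0(x) and f_1(z \mid x), or whether the null density also depends on x.

```latex
% Standard two-group model: null density f_0, non-null density f_1, null proportion \pi_0
f(z) = \pi_0 f_0(z) + (1 - \pi_0) f_1(z),
\qquad
\mathrm{fdr}(z) = \frac{\pi_0 f_0(z)}{f(z)}

% Covariate-modulated version: mixture components depend on annotations x
\mathrm{fdr}(z \mid x) =
  \frac{\pi_0(x) f_0(z)}{\pi_0(x) f_0(z) + \bigl(1 - \pi_0(x)\bigr) f_1(z \mid x)}
```

Under this form, a SNP in an annotation category with smaller \pi_0(x) receives a lower fdr for the same z, which matches the behavior described in the Methods paragraph.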
25.  PRADA: pipeline for RNA sequencing data analysis 
Bioinformatics  2014;30(15):2224-2226.
Summary: Technological advances in high-throughput sequencing necessitate improved computational tools for processing and analyzing large-scale datasets in a systematic automated manner. For that purpose, we have developed PRADA (Pipeline for RNA-Sequencing Data Analysis), a flexible, modular and highly scalable software platform that provides many different types of information available by multifaceted analysis starting from raw paired-end RNA-seq data: gene expression levels, quality metrics, detection of unsupervised and supervised fusion transcripts, detection of intragenic fusion variants, homology scores and fusion frame classification. PRADA uses a dual-mapping strategy that increases sensitivity and refines the analytical endpoints. PRADA has been used extensively and successfully in the glioblastoma and renal clear cell projects of The Cancer Genome Atlas program.
Availability and implementation: http://sourceforge.net/projects/prada/
Contact: gadgetz@broadinstitute.org or rverhaak@mdanderson.org
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btu169
PMCID: PMC4103589  PMID: 24695405
