
Results 1-5 (5)

1.  Wave-spec: a preprocessing package for mass spectrometry data 
Bioinformatics  2011;27(5):739-740.
Summary: Wave-spec is a pre-processing package for mass spectrometry (MS) data. The package includes several novel algorithms that overcome conventional difficulties with the pre-processing of such data. In this application note, we demonstrate step-by-step use of this package on a real-world MALDI dataset.
Availability: The package can be downloaded at … A shared mailbox (…) is also available for questions regarding application of the package.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3105479  PMID: 21208983
2.  A novel comprehensive wave-form MS data processing method 
Bioinformatics  2009;25(6):808-814.
Motivation: Mass spectrometry (MS) can generate high-throughput protein profiles for biomedical research to discover biologically related protein patterns/biomarkers. The noisy functional MS data collected by current technologies, however, require consistent, sensitive and robust data-processing techniques for successful biomedical application. Therefore, it is important to detect features precisely for each spectrum, quantify them well and assign a unique label to features from the same protein/peptide across spectra.
Results: In this article, we propose a new comprehensive MS data preprocessing package, Wave-spec, which includes several novel algorithms. It can overcome several conventional difficulties. Wave-spec can be applied to multiple types of MS data generated with different MS technologies. Results from this new package were evaluated and compared to several existing approaches based on a MALDI-TOF MS dataset.
Availability: An example of MATLAB scripts used to implement the methods described in this article, along with Supplementary Figures, can be found at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2732299  PMID: 19176559
3.  ZOOM! Zillions of oligos mapped 
Bioinformatics  2008;24(21):2431-2437.
Motivation: The next generation sequencing technologies are generating billions of short reads daily. Resequencing and personalized medicine need much faster software to map these deep sequencing reads to a reference genome, to identify SNPs or rare transcripts.
Results: We present a framework for how full sensitivity mapping can be done in the most efficient way, via spaced seeds. Using the framework, we have developed software called ZOOM, which is able to map the Illumina/Solexa reads of 15× coverage of a human genome to the reference human genome in one CPU-day, allowing two mismatches, at full sensitivity.
Availability: ZOOM is freely available to non-commercial users at
PMCID: PMC2732274  PMID: 18684737
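The spaced-seed idea mentioned in the ZOOM abstract can be illustrated with a minimal sketch. This is not ZOOM's implementation: the seed pattern `SEED`, the function names, and the single-seed design are assumptions made for illustration. ZOOM's full-sensitivity guarantee depends on a designed set of seeds, which a single pattern like this does not provide in general.

```python
from collections import defaultdict

# Hypothetical seed pattern: '1' = position must match, '0' = "don't care".
# Illustrative only; not one of the seeds used by ZOOM.
SEED = "1101011"


def seed_key(text, start, seed=SEED):
    """Return the characters of text that fall under the '1' positions of the seed."""
    return "".join(text[start + i] for i, bit in enumerate(seed) if bit == "1")


def build_index(reference, seed=SEED):
    """Hash every reference position by its spaced-seed key."""
    index = defaultdict(list)
    for pos in range(len(reference) - len(seed) + 1):
        index[seed_key(reference, pos, seed)].append(pos)
    return index


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, max_mismatches=2, seed=SEED):
    """Return sorted reference start positions where the read aligns
    with at most max_mismatches substitutions (no indels)."""
    hits = set()
    for offset in range(len(read) - len(seed) + 1):
        key = seed_key(read, offset, seed)
        for ref_pos in index.get(key, ()):
            start = ref_pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue
            # Verify the candidate by comparing the whole read.
            window = reference[start:start + len(read)]
            if hamming(read, window) <= max_mismatches:
                hits.add(start)
    return sorted(hits)
```

A read is first located through seed hits, which tolerate mismatches at the "don't care" positions, and each candidate location is then verified against the full read with the two-mismatch limit described in the abstract.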
4.  PICKY: a novel SVD-based NMR spectra peak picking method 
Bioinformatics  2009;25(12):i268-i275.
Motivation: Picking peaks from experimental NMR spectra is a key unsolved problem for automated NMR protein structure determination. Such a process is a prerequisite for resonance assignment, nuclear Overhauser enhancement (NOE) distance restraint assignment, and structure calculation tasks. Manual or semi-automatic peak picking, which is currently the predominant practice in NMR labs, is tedious, time-consuming and costly.
Results: We introduce new ideas, including noise-level estimation, component forming and sub-division, singular value decomposition (SVD)-based peak picking and peak pruning and refinement. PICKY is developed as an automated peak picking method. Unlike previous research on peak picking, we provide a systematic study of the proposed method. PICKY is tested on 32 real 2D and 3D spectra of eight target proteins, and achieves an average of 88% recall and 74% precision. PICKY is efficient: it takes on average 15.7 s to process an NMR spectrum. More important than these numbers, PICKY actually works in practice. We feed peak lists generated by PICKY to IPASS for resonance assignment, feed IPASS assignments to SPARTA for fragment generation, and feed SPARTA fragments to FALCON for structure calculation. This results in high-resolution structures of several proteins, for example, TM1112, at 1.25 Å.
Availability: PICKY is available upon request. The peak lists of PICKY can be easily loaded by SPARKY to enable a better interactive strategy for rapid peak picking.
PMCID: PMC2687979  PMID: 19477998
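The recall and precision figures quoted in the PICKY abstract can be made concrete with a small sketch of how a picked peak list might be scored against a reference list. The abstract does not state how matches were defined, so the per-dimension tolerance `tol`, the greedy one-to-one matching, and the function name `score_peaks` are all assumptions.

```python
def score_peaks(picked, reference, tol):
    """Score a picked peak list against a reference peak list.

    picked, reference: lists of peaks, each a tuple of chemical shifts
    (one value per spectral dimension).
    tol: tuple of per-dimension matching tolerances (assumed values).
    Returns (precision, recall).
    """
    unmatched = list(reference)
    true_positives = 0
    for peak in picked:
        for ref in unmatched:
            # A picked peak matches if it is within tolerance in every dimension.
            if all(abs(p - r) <= t for p, r, t in zip(peak, ref, tol)):
                unmatched.remove(ref)  # each reference peak is matched at most once
                true_positives += 1
                break
    precision = true_positives / len(picked) if picked else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall
```

Under this definition, precision is the fraction of picked peaks that correspond to a reference peak, and recall is the fraction of reference peaks that were picked.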
5.  Designing succinct structural alphabets 
Bioinformatics  2008;24(13):i182-i189.
Motivation: The 3D structure of a protein sequence can be assembled from the substructures corresponding to small segments of this sequence. For each small sequence segment, there are only a few likely substructures. We call them the ‘structural alphabet’ for this segment. Classical approaches such as ROSETTA used sequence profile and secondary structure information to predict structural fragments. In contrast, we utilize more structural information, such as solvent accessibility and contact capacity, for finding structural fragments.
Results: An integer linear programming technique is applied to derive the best combination of these sequence and structural information items. This approach generates significantly more accurate and succinct structural alphabets, with more than 50% improvement over the previous accuracies. With these novel structural alphabets, we are able to construct more accurate protein structures than state-of-the-art ab initio protein structure prediction programs such as ROSETTA. We are also able to reduce the size of Kolodny's library by a factor of 8, at the same accuracy.
Availability: The online FRazor server is under construction.
PMCID: PMC2718643  PMID: 18586712
