J Biomol Tech. 2010 September; 21(3 Suppl): S11.
PMCID: PMC2918104

WASP: Wiki-based Automated Sequence Processor for Epigenomics and Genomics Applications

R. Dubin,1 Q. Jing,1 P. O’Broin,1 B. Calder,2 D. Moskowitz,2 M. Suzuki,2 and J. Greally2,3,4
1Computational and Statistical Epigenomics Group, Albert Einstein College of Medicine, Price Center for Genetics and Translational Medicine, Bronx, NY, United States;
2Department of Genetics (Division of Computational Genetics), Albert Einstein College of Medicine, Price Center for Genetics and Translational Medicine, Bronx, NY, United States;
3Epigenomics Shared Facility, Center for Epigenomics, Albert Einstein College of Medicine, Price Center for Genetics and Translational Medicine, Bronx, NY, United States;
4Department of Medicine (Division of Hematology), Albert Einstein College of Medicine, Price Center for Genetics and Translational Medicine, Bronx, NY, United States;
5Albert Einstein College of Medicine, Price Center for Genetics and Translational Medicine, Bronx, NY, United States

Abstract

w7-2

The advent of massively parallel sequencing (MPS) technology has led to the development of assays that facilitate the study of epigenomics and genomics at the genome-wide level. However, the computational burden of storing and processing the gigabytes of data streaming from sequencing machines, in addition to collecting metadata and returning data to users, is becoming a major issue for sequencing cores and users alike. We present WASP, a laboratory information management system (LIMS) designed to automate MPS data pre-processing and analysis. WASP integrates a user-friendly MediaWiki front end, a network file system (NFS) and a MySQL database for recording experimental data and metadata, plus a multi-node cluster for data processing. The workflow includes capture of sample submission information to the database using web forms on the wiki, recording of core facility operations on samples, and linking of samples to flowcells in the database, followed by automatic processing of sequence data and execution of data analysis pipelines once the sequencing run completes. WASP currently supports MPS using the Illumina GaIIx. For epigenomics applications, we provide a pipeline for our novel HpaII-tiny fragment enrichment by ligation-mediated PCR (HELP)-tag method, which enables us to quantify the methylation status of ~1.8 million CpGs located in 70% of the HpaII sites (CCGG) in the human genome. We also provide ChIP-seq analysis using MACS, which is also applicable to methylated DNA immunoprecipitation (MeDIP) assays, in addition to miRNA and mRNA analyses using custom pipelines. Output from the analysis pipelines is automatically linked to a user's wiki space, and the data generated can be viewed immediately as tracks in a local mirror of the UCSC genome browser. WASP also provides capabilities for automated billing and for tracking facility costs. We believe WASP represents a suitable model on which to develop LIMS for supporting MPS applications.
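The workflow described above (sample submission, linking of samples to flowcells, then selection of an analysis pipeline after the run) can be illustrated with a minimal sketch. This is not WASP's actual schema or code: the table names, column names, function names and pipeline identifiers are all hypothetical, and SQLite stands in for the MySQL database so that the example is self-contained. Only the assay types are taken from the abstract.

```python
import sqlite3

# Hypothetical mapping from assay type to pipeline name. The assay types come
# from the abstract; the pipeline identifiers are invented for illustration.
PIPELINES = {
    "HELP-tag": "help_tag_pipeline",
    "ChIP-seq": "macs_peak_calling",
    "MeDIP": "macs_peak_calling",
    "miRNA": "custom_mirna_pipeline",
    "mRNA": "custom_mrna_pipeline",
}


def create_schema(conn):
    """Create an assumed three-table layout: samples, flowcells, lanes."""
    conn.executescript("""
        PRAGMA foreign_keys = ON;
        CREATE TABLE sample (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            submitter TEXT NOT NULL,
            assay TEXT NOT NULL
        );
        CREATE TABLE flowcell (
            id INTEGER PRIMARY KEY,
            barcode TEXT NOT NULL UNIQUE
        );
        CREATE TABLE lane (
            flowcell_id INTEGER NOT NULL REFERENCES flowcell(id),
            lane_number INTEGER NOT NULL,
            sample_id INTEGER NOT NULL REFERENCES sample(id),
            PRIMARY KEY (flowcell_id, lane_number)
        );
    """)


def submit_sample(conn, name, submitter, assay):
    """Record a sample submission (the role played by the wiki web form)."""
    if assay not in PIPELINES:
        raise ValueError(f"unsupported assay type: {assay}")
    cur = conn.execute(
        "INSERT INTO sample (name, submitter, assay) VALUES (?, ?, ?)",
        (name, submitter, assay),
    )
    return cur.lastrowid


def load_flowcell(conn, barcode, lane_to_sample):
    """Link samples to the lanes of a flowcell (a core facility operation)."""
    cur = conn.execute("INSERT INTO flowcell (barcode) VALUES (?)", (barcode,))
    flowcell_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO lane (flowcell_id, lane_number, sample_id) VALUES (?, ?, ?)",
        [(flowcell_id, lane, sid) for lane, sid in lane_to_sample.items()],
    )
    return flowcell_id


def jobs_for_flowcell(conn, barcode):
    """After a run, list (lane, sample name, pipeline) for each loaded lane."""
    rows = conn.execute(
        """
        SELECT lane.lane_number, sample.name, sample.assay
        FROM lane
        JOIN flowcell ON flowcell.id = lane.flowcell_id
        JOIN sample ON sample.id = lane.sample_id
        WHERE flowcell.barcode = ?
        ORDER BY lane.lane_number
        """,
        (barcode,),
    ).fetchall()
    return [(lane, name, PIPELINES[assay]) for lane, name, assay in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
help_id = submit_sample(conn, "liver_HELP", "user_a", "HELP-tag")
chip_id = submit_sample(conn, "liver_ChIP", "user_b", "ChIP-seq")
load_flowcell(conn, "FC0001", {1: help_id, 2: chip_id})
jobs = jobs_for_flowcell(conn, "FC0001")
for lane, name, pipeline in jobs:
    print(lane, name, pipeline)
```

In this sketch the pipeline is chosen purely from the assay type recorded at submission, which is one simple way to make post-run processing automatic; a production system would additionally need to dispatch the resulting jobs to the cluster and record their outputs.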


Articles from Journal of Biomolecular Techniques : JBT are provided here courtesy of The Association of Biomolecular Resource Facilities