The advent of massively parallel sequencing (MPS) technology has led to the development of assays that facilitate the study of epigenomics and genomics at the genome-wide level. However, the computational burden of storing and processing the gigabytes of data streaming from sequencing machines, in addition to collecting metadata and returning data to users, is becoming a major issue for sequencing cores and users alike. We present WASP, a laboratory information management system (LIMS) designed to automate MPS data pre-processing and analysis. WASP integrates a user-friendly MediaWiki front end, a network file system (NFS) and a MySQL database for recording experimental data and metadata, plus a multi-node cluster for data processing. The workflow begins with capture of sample submission information to the database through web forms on the wiki. Core facility operations on samples are then recorded and samples are linked to flowcells in the database. Once the sequence run is complete, the sequence data are processed automatically and the data analysis pipelines are run. WASP currently supports MPS using the Illumina Genome Analyzer IIx (GAIIx). For epigenomics applications we provide a pipeline for our novel HpaII-tiny fragment enrichment by ligation-mediated PCR (HELP)-tag method, which enables us to quantify the methylation status of ~1.8 million CpGs located in 70% of the HpaII sites (CCGG) in the human genome. We also provide ChIP-seq analysis using MACS, which is also applicable to methylated DNA immunoprecipitation (MeDIP) assays, in addition to miRNA and mRNA analyses using custom pipelines. Output from the analysis pipelines is automatically linked to a user's wiki space, and the data generated can be viewed immediately as tracks in a local mirror of the UCSC genome browser. WASP also provides capabilities for automated billing and for keeping track of facility costs. We believe WASP represents a suitable model on which to develop LIMS for supporting MPS applications.
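The sample-to-flowcell linking described above can be illustrated with a minimal sketch. It is not WASP's actual schema or code: the table names, column names, pipeline labels and function names are all hypothetical, and Python's built-in sqlite3 module stands in for the MySQL database so that the example is self-contained. It shows only the general idea that, once samples are linked to flowcell lanes, the analysis to run after a sequence run can be looked up from the database.

```python
import sqlite3

# Hypothetical mapping from assay type to an analysis pipeline label.
# The labels are illustrative placeholders, not WASP's real pipeline names.
PIPELINES = {
    "HELP-tag": "help_tag_pipeline",
    "ChIP-seq": "macs_pipeline",
    "MeDIP": "macs_pipeline",
}


def build_demo_db():
    """Create an in-memory database with an assumed two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sample (
            id    INTEGER PRIMARY KEY,
            name  TEXT NOT NULL,
            assay TEXT NOT NULL
        );
        CREATE TABLE flowcell_lane (
            flowcell  TEXT NOT NULL,
            lane      INTEGER NOT NULL,
            sample_id INTEGER NOT NULL REFERENCES sample(id),
            PRIMARY KEY (flowcell, lane)
        );
        """
    )
    return conn


def submit_sample(conn, name, assay):
    """Record a sample submission (the web-form step) and return its id."""
    cur = conn.execute(
        "INSERT INTO sample (name, assay) VALUES (?, ?)", (name, assay)
    )
    return cur.lastrowid


def load_lane(conn, flowcell, lane, sample_id):
    """Link a submitted sample to one lane of a flowcell."""
    conn.execute(
        "INSERT INTO flowcell_lane (flowcell, lane, sample_id) VALUES (?, ?, ?)",
        (flowcell, lane, sample_id),
    )


def jobs_for_flowcell(conn, flowcell):
    """List (lane, sample name, pipeline) for a finished flowcell, by lane."""
    rows = conn.execute(
        """
        SELECT l.lane, s.name, s.assay
        FROM flowcell_lane AS l
        JOIN sample AS s ON s.id = l.sample_id
        WHERE l.flowcell = ?
        ORDER BY l.lane
        """,
        (flowcell,),
    ).fetchall()
    return [(lane, name, PIPELINES[assay]) for lane, name, assay in rows]


if __name__ == "__main__":
    db = build_demo_db()
    sample_id = submit_sample(db, "liver_1", "HELP-tag")
    load_lane(db, "FC001", 1, sample_id)
    print(jobs_for_flowcell(db, "FC001"))
```

Keeping the assay type on the sample record, rather than on the lane, means that in this sketch the choice of pipeline follows from the submission form alone. Whether WASP itself is organised this way is not stated in the abstract.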