High-throughput profiling platforms have produced large amounts of data, and public repositories such as the Gene Expression Omnibus (1) and ArrayExpress (2, 3) already store tens of thousands of profiles across different experimental conditions. The amount and diversity of profiling results continue to grow steadily, which creates challenges in data analysis and integration and a strong need for novel, comprehensive online bioinformatics tools that are easy for biologists to use and can process raw profiles in a single- or global-analysis manner.
Although many methods are now available for low- and high-level analysis of genomic and transcriptomic experiments (4–7), most require programming knowledge as well as bioinformatics expertise, and results can vary substantially among them. Analysis of large data sets may require powerful computational resources, as well as time and effort to set up the necessary infrastructure. For example, using aroma.affymetrix (4, 8) to analyse copy number data requires annotation files to be created and placed within a strict directory structure that organizes raw and processed data. Additionally, there is no analytical tool that can handle raw and/or partially processed genomics data and annotate and display results online in a user-friendly manner. Such a tool would alleviate the need for bioinformatics expertise and allow researchers to process their in-house data in isolation or alongside the accumulated publicly available data in their area of research.
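To illustrate the kind of set-up effort described above, the following minimal sketch creates the two input directories of an aroma.affymetrix-style project. The folder names `annotationData/chipTypes/<chipType>` and `rawData/<dataSet>/<chipType>` follow the layout convention described in the aroma.affymetrix documentation; the helper name `make_aroma_layout`, the data set name and the chip type are hypothetical and chosen only for illustration.

```python
from pathlib import Path
import tempfile


def make_aroma_layout(root: Path, data_set: str, chip_type: str) -> dict:
    """Create the annotation and raw-data folders of an aroma-style project.

    Assumption: the layout mirrors the aroma.affymetrix convention; the
    function itself is a hypothetical helper, not part of any package.
    """
    # Annotation files (for example the CDF for the chip) are expected here.
    annotation_dir = root / "annotationData" / "chipTypes" / chip_type
    # Raw CEL files are expected here, grouped by data set and chip type.
    raw_dir = root / "rawData" / data_set / chip_type
    for directory in (annotation_dir, raw_dir):
        directory.mkdir(parents=True, exist_ok=True)
    return {"annotation": annotation_dir, "raw": raw_dir}


if __name__ == "__main__":
    # Build the layout in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        dirs = make_aroma_layout(Path(tmp), "MyStudy", "GenomeWideSNP_6")
        for name, path in dirs.items():
            print(name, path.relative_to(tmp))
```

Every data set and chip type needs its own pair of folders, and files placed elsewhere are not found, which is the manual overhead that a server-side tool is meant to remove.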
To overcome these problems, we have developed O-miner (http://www.o-miner.org), which can analyse the most popular and widely used Affymetrix genomics and transcriptomics array types on the fly, starting from either the raw standard Affymetrix file format (CEL image files obtained from the scanner) or a partially processed format (normalized, segmented and/or binary), with minimal set-up effort. The analysis is performed on a dedicated server, removing memory and disk space requirements on end-user machines. All analytical pipelines are transparent, robust, well documented and based on well-established and recently developed statistical methods. Results can be viewed online as dynamic HTML reports that are easy to navigate through an interactive, user-friendly interface, or downloaded as text, Excel or graphics files.
O-miner is comprehensive, robust and memory-efficient, and can easily be extended with new methods and algorithms to cover additional chip types and platforms. In this article, we provide an overview of O-miner and discuss both the transcriptomics and genomics workflows. We outline some examples of use to show how to perform low-level (single) as well as high-level (global) analysis and to illustrate how to navigate through the results obtained. Finally, we discuss future updates of the software to accommodate and link additional data types.