Tilescope was developed entirely in Java. Java was chosen for its built-in threading capability and its excellent library support for graphical user interface and network programming. More importantly, it was chosen for its object-oriented nature: organizing the program code into coherent classes naturally modularizes the system, which greatly facilitates parallel development and subsequent updating, a desideratum for any software engineering project of non-trivial complexity.
As a web-accessible software system, Tilescope is composed of three connected components: an applet, a servlet, and a pipeline program. The applet is the graphical interface through which the user interacts with Tilescope. It is automatically downloaded and launched inside a Java-enabled web browser whenever the pipeline web page is visited. Through the applet, a user can upload array data files to the pipeline server, select pipeline parameters and methods, run the data processing program, and view or download analysis results. The applet, however, cannot run the pipeline program directly. Instead, it sends data processing requests to the servlet, a server program that acts as the proxy of the pipeline program on the web and responds to the applet's requests. The servlet, the central layer of Tilescope, runs two 'daemon' threads in the background. These threads handle file upload and data processing requests (that is, accept and schedule them, or reject them, depending on the current system load), prepare the pipeline running environment, and launch the back-end pipeline program with the user-specified parameters. The pipeline program does the heavy lifting: the actual data processing. This modular design, which separates request handling from data processing, makes it possible to use a computer farm for parallel computing and to process multiple requests concurrently.
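The accept-or-reject request handling described above can be illustrated with a minimal Java sketch. It is not taken from the Tilescope source: the class name `RequestScheduler`, the bounded queue, and the use of a fixed queue capacity as a stand-in for "current system load" are all assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Illustrative only: a load-aware request handler, not Tilescope's actual code. */
class RequestScheduler {
    // Bounded queue; its capacity is a hypothetical stand-in for "system load".
    private final BlockingQueue<Runnable> pending;
    private final Thread worker;

    RequestScheduler(int maxPending) {
        this.pending = new ArrayBlockingQueue<>(maxPending);
        this.worker = new Thread(this::drainQueue, "request-handler");
        this.worker.setDaemon(true); // a daemon thread does not keep the JVM alive
    }

    /** Starts the background thread that runs queued requests one at a time. */
    void start() {
        worker.start();
    }

    /** Accepts and schedules the request if there is room; otherwise rejects it. */
    boolean submit(Runnable request) {
        return pending.offer(request); // non-blocking: false when the queue is full
    }

    /** Number of accepted requests not yet picked up by the worker. */
    int pendingCount() {
        return pending.size();
    }

    private void drainQueue() {
        try {
            while (true) {
                Runnable request = pending.take(); // blocks until a request arrives
                try {
                    request.run(); // e.g. prepare the environment, launch the pipeline
                } catch (RuntimeException e) {
                    // A failed request must not kill the handler thread.
                    System.err.println("request failed: " + e.getMessage());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop when interrupted
        }
    }
}
```

Under these assumptions, a servlet could create one such scheduler for file uploads and another for data processing requests, which would account for two daemon threads. A `false` return from `submit` would then be reported back to the applet as a rejection.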
On the web form of the Tilescope applet (Figure ), a user can either upload a parameter file, if one is available from a previous use of Tilescope, to set all parameters in one step, or set the parameters one by one manually, which is more likely when an array data set is analyzed for the first time. The main body of the form is organized into two panels, one for setting the tile scoring parameters and the other for selecting the feature identification method, reflecting the two main stages of data processing in the pipeline. After the pipeline program is started on the server, the user can monitor its progress through pipeline messages, which the server updates continually throughout each pipeline run.
Screenshots of Tilescope. (a) The Tilescope applet, the graphical user interface of the pipeline. (b) An example of the data analysis result web page.
When data processing is done, a web page with the analysis results is presented to the user in a new browser window (Figure ). On the result web page, the parameters and methods that were used to analyze the data are summarized at the top, followed by log-intensity scatter plots for each array and log-intensity histograms for all arrays in the data set before and after normalization. These enlargeable plots enable the user to quickly identify any problematic arrays visually and exclude them from further consideration. Both the tile maps with log-ratio and P value annotations and the feature list in various text formats can be downloaded for further processing and analysis. The feature list in regular tab-delimited text format gives the user the chromosome (or other genomic sequence ID), the genomic start and end coordinates, the log-ratio, the P value, and, if the tiled genome is specified, the upstream and downstream genes of each feature. If the human genome is under investigation, Tilescope also provides links to display the identified features on custom tracks in the UCSC genome browser. Moreover, if the tiling array was designed from an earlier human genome build (for example hg16, NCBI 34), Tilescope provides an additional feature list with the coordinates lifted over to the current human genome build (for example hg17, NCBI 35).
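For downstream processing, one row of the tab-delimited feature list could be read into a small Java value class. The sketch below is hypothetical: the class name `Feature`, the column order (taken from the order in which the fields are listed above), the absence of a header line, and the rule that the two gene columns are either both present or both absent are assumptions, not a description of the actual file layout.

```java
/** Illustrative only: one row of a tab-delimited feature list (assumed layout). */
final class Feature {
    final String sequenceId;     // chromosome or other genomic sequence ID
    final long start;            // genomic start coordinate
    final long end;              // genomic end coordinate
    final double logRatio;
    final double pValue;
    final String upstreamGene;   // null when the tiled genome was not specified
    final String downstreamGene; // null when the tiled genome was not specified

    Feature(String sequenceId, long start, long end, double logRatio,
            double pValue, String upstreamGene, String downstreamGene) {
        this.sequenceId = sequenceId;
        this.start = start;
        this.end = end;
        this.logRatio = logRatio;
        this.pValue = pValue;
        this.upstreamGene = upstreamGene;
        this.downstreamGene = downstreamGene;
    }

    /** Parses one line with 5 columns (no genes) or 7 columns (with genes). */
    static Feature parse(String line) {
        String[] f = line.split("\t", -1); // -1 keeps trailing empty columns
        if (f.length != 5 && f.length != 7) {
            throw new IllegalArgumentException(
                "expected 5 or 7 columns, got " + f.length);
        }
        boolean hasGenes = f.length == 7;
        return new Feature(
            f[0],
            Long.parseLong(f[1]),
            Long.parseLong(f[2]),
            Double.parseDouble(f[3]),
            Double.parseDouble(f[4]),
            hasGenes ? f[5] : null,
            hasGenes ? f[6] : null);
    }

    /** True when the feature's P value is at or below the given cutoff. */
    boolean isSignificant(double cutoff) {
        return pValue <= cutoff;
    }
}
```

With such a class, a downloaded list could be filtered line by line, for example by keeping only the features for which `isSignificant` returns true at a cutoff of the user's choosing.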