The U-Compare system itself is a stand-alone application. On this platform, users can create a workflow from the components in the repository, or from any third-party UIMA components, in an easy drag-and-drop manner, and then compare, evaluate and visualize the workflow results. The entire system can be started with a single mouse click on the U-Compare website. Workflows can also be executed via the command line without the GUI-based platform.
Taverna plugin
The U-Compare Taverna plugin works with Taverna version 2.1.0. The user must specify two options: the U-Compare workflow to embed, and a post-processing Beanshell script with proper I/O ports that reformats the output for further processing in Taverna. This post-processing script is executed as the final UIMA component in the U-Compare workflow, and the UIMA and U-Compare APIs can be used in the script. When the Taverna workflow is run, the U-Compare application starts and runs the specified text mining workflow automatically, then shows U-Compare GUIs such as statistics and visualizations of the generated annotations. The plugin downloads, installs and updates U-Compare automatically; users are only required to install the plugin from Taverna's menu by entering the plugin URL. As running GUIs can be demanding when processing a large number of documents, this plugin is mainly intended for testing and for analyzing results with visualization.
Users can choose between two modes of workflow input to the U-Compare Taverna plugin. In the typical mode, the U-Compare workflow takes a collection reader as input and generates a list of annotated documents as output. We have also implemented another mode, specifically for linking with Taverna, in which the input is not a collection reader but a list of strings (depth 1) that is passed to the ‘input_text’ port of the U-Compare plugin.
2.2 U-Compare activity with command line mode
As any workflow can also be called via the command line, we provide special UIMA components that read input text and write generated annotations via the standard I/O streams. Using U-Compare's command line mode, we created an example Taverna workflow for protein–protein interaction extraction, selected to show its usefulness in systems biology (Ananiadou et al.). This workflow outputs a possible interaction network from the literature associated with a PubMed query. Extracted information is available in Supplementary Material. This example workflow is provided as a template for users to create their own workflows by imitating, reusing and modifying the interpretation part. Since U-Compare outputs results in a uniform format, users only need to change the specific data types and their corresponding fields to create their own workflows, reusing most of the template code without modification. The figure shows a diagram of the example Taverna workflow. The box labeled ‘UCompare’ corresponds to a Beanshell script activity, which calls U-Compare via the command line. This activity also implements a mechanism to download, install and update the U-Compare system, so there is no need to install anything explicitly. In the parts preceding ‘UCompare’, the workflow retrieves documents from PubMed and passes them to the U-Compare activity; the parts following ‘UCompare’ interpret the results.
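The standard I/O hand-off described above can be illustrated with a minimal Java sketch of the calling side: a process is started, the input text is written to its standard input and its standard output is collected. This is not U-Compare's own launcher code; the class name `StdioRunner` and the method `run` are hypothetical, and the actual command line used to start a U-Compare workflow is not given here, so the command is left as a parameter.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: runs an external command, feeding text on stdin
// and returning whatever the command writes to stdout.
class StdioRunner {
    static String run(List<String> command, String inputText)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Let the child's error messages appear on our own stderr.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();

        // Write stdin on a separate thread so that a full pipe buffer
        // on either side cannot deadlock the caller.
        Thread writer = new Thread(() -> {
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(inputText.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                // The child closed its stdin early; nothing more to send.
            }
        });
        writer.start();

        // Collect everything the child writes to stdout.
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try (InputStream stdout = process.getInputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = stdout.read(buffer)) != -1) {
                captured.write(buffer, 0, n);
            }
        }
        writer.join();

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("command exited with status " + exitCode);
        }
        return captured.toString(StandardCharsets.UTF_8.name());
    }
}
```

A Beanshell activity could follow the same pattern, since Beanshell accepts Java syntax; the interpretation steps downstream would then parse the returned string.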
An example run of the workflow, which sets the ‘PubMed_Query’ and ‘workflowPath’ parameters shown in the figure, and its result are given in Supplementary Material for the top 50 hits from the query ‘saccharomyces AND “translation initiation”’. The ‘workflowPath’ is set to a U-Compare workflow given in the Supplementary Material, which runs ‘UIMA Sentence Splitter’ to detect sentence boundaries, ‘ABNER’ (Settles, 2005) to detect protein named entities and ‘EventMine’ (Miwa et al.) to detect interaction events. Evaluations of the text mining tools are provided in U-Compare (Kano et al.). This example run detects 677 non-unique entities, which by exact string matching correspond to 371 unique entity names, and 254 non-unique binding events.
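The distinction between non-unique entities (every mention counted) and unique entity names (mentions collapsed by exact string matching) can be sketched as follows. This is an illustrative reconstruction of the counting rule only, not the code used in the example workflow; the class `EntityCounts` and the sample mentions are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper illustrating the two counts reported in the text.
class EntityCounts {
    // Non-unique count: every detected mention is counted.
    static int countMentions(List<String> mentions) {
        return mentions.size();
    }

    // Unique names by exact string matching: no case folding and no
    // normalization, so "eIF4E" and "EIF4E" remain distinct names.
    static Set<String> uniqueNames(List<String> mentions) {
        return new LinkedHashSet<>(mentions);
    }

    public static void main(String[] args) {
        // Made-up mentions, for illustration only.
        List<String> mentions = Arrays.asList("eIF4E", "eIF4G", "eIF4E", "EIF4E");
        System.out.println(countMentions(mentions) + " mentions, "
                + uniqueNames(mentions).size() + " unique names");
    }
}
```

Because matching is exact, spelling and case variants of the same protein are counted as separate names, so the unique count is an upper bound on the number of distinct proteins.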
2.3 U-Compare as a generic text mining workflow
The Taverna U-Compare workflow illustrated in the figure can be used to run any U-Compare workflow, and does not require reworking for different text mining analyses. Whilst developing a new text mining workflow does involve configuration within the U-Compare environment, once developed it can be deployed within the Taverna activity described here by resetting the ‘workflowPath’ value to the location of the new workflow's descriptor, relative to the directory [user home]/.UCompare/taverna/classpath-root/.
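How a relative ‘workflowPath’ value maps onto a file under [user home]/.UCompare/taverna/classpath-root/ can be sketched in Java as below. This is a minimal sketch under stated assumptions, not the plugin's actual lookup code: the class `WorkflowPathResolver` is hypothetical, the descriptor name in the usage line is made up, and the check that rejects paths leaving the root directory is an added precaution rather than documented behaviour.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: resolves a 'workflowPath' value against the
// classpath-root directory named in the text.
class WorkflowPathResolver {
    static Path classpathRoot(String userHome) {
        return Paths.get(userHome, ".UCompare", "taverna", "classpath-root");
    }

    static Path resolve(String userHome, String workflowPath) {
        Path root = classpathRoot(userHome).normalize();
        Path resolved = root.resolve(workflowPath).normalize();
        // Assumption of this sketch: descriptors must stay under the root,
        // so absolute paths and "../" escapes are rejected.
        if (!resolved.startsWith(root)) {
            throw new IllegalArgumentException(
                    "workflowPath must be relative to " + root);
        }
        return resolved;
    }

    public static void main(String[] args) {
        // "workflows/my-workflow.xml" is an illustrative descriptor name.
        System.out.println(resolve(System.getProperty("user.home"),
                "workflows/my-workflow.xml"));
    }
}
```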