The Chronic Lymphocytic Leukemia Research Consortium (CRC, http://cll.ucsd.edu) is an NCI-funded program/project (P01CA081534) consisting of eight sites. Initially funded in 1999, the CRC coordinates and facilitates an integrated translational research program, with specific emphasis on basic and clinical research targeting the genetic, biochemical and immunologic bases of Chronic Lymphocytic Leukemia (CLL). Critical to the CRC's ability to conduct such research are shared data repositories, associated data collection instruments, and data mining and analysis tools. The CRC Integrated Information Management System (CIMS) is the data management system currently used by the consortium. It incorporates: 1) multiple task-specific web portal interfaces supporting clinical trial, basic science and tissue bank data management; and 2) a set of shared data repositories. CIMS facilitates the collection and storage of numerous heterogeneous bio-molecular data sources generated by instrumentation and methodological approaches including quantitative and qualitative immunophenotyping, multiple modalities of gene expression analysis, and Fluorescence In Situ Hybridization (FISH) analyses of cytogenetic abnormalities. CIMS was initially deployed for use by the CRC in 2000 and, at the time of this submission, is being used to collect, manage and analyze data for well over 5000 patients involved in multiple clinical trial modalities, as well as hundreds of thousands of CLL-specific tissue samples. Despite the success of CIMS in satisfying the informatics requirements of the CRC over the past ten years, CRC participants have identified numerous usability and computational limitations of CIMS, including:
- A reliance upon proprietary software architectures and standards, thus limiting the extensibility of CIMS to other, analogous clinical and translational research programs, as well as the exchange of data with external systems that utilize modern electronic data interchange mechanisms;
- A loose coupling of constituent data entry, management and query tools, which introduces significant complexity to the design and execution of data integration and analysis workflows that span multiple levels of granularity from bio-molecules to clinical phenotypes; and
- A complex human-computer interface model that requires significant end-user training and acculturation in order to ensure optimal system utilization and high quality data.
Motivated by these limitations, the CRC has launched the TRITON (T
ibus) project to re-engineer the current CIMS platform and develop a highly usable, extensible, standards-based, open source, and integrative translational research information management platform. A primary goal of these efforts is to enable integration between TRITON and the basic science, clinical research and translational science data management tools and interchange mediums associated with the NCI’s Cancer Biomedical Informatics Grid (caBIG) initiative, including the caGrid service-oriented middleware (1). In doing so, our objective is to increase the translational capacity of the CRC by enabling consortium investigators to discover, integrate, analyze and disseminate heterogeneous, multi-dimensional data sets. It is anticipated that many of these data sets will be generated by high-throughput bio-molecular technologies or instrumentation, as well as electronic health record (EHR) systems that are currently in use at the majority of CRC sites.