Informatics has been an important part of cancer research efforts to develop more effective diagnostics and therapeutics. These initiatives have led to better clinical outcomes for many patients [19]. However, prognosis for many patients, including those with MPM, remains poor [19]. Consequently, it is imperative that we continue researching novel therapeutics to combat cancer as its incidence rises worldwide. To ensure that such research continues, we must develop informatics infrastructures that meet research needs, one of which is an easily implementable, comprehensive translational research database capable of handling imaging.
Relational databases that incorporate imaging have been developed by other groups [3–5], but they differ from ours in a fundamental way: ease of implementation. For example, the eDiaMoND database is designed to aid clinicians and researchers by compiling mammography and related clinical data [3]; the Biomedical Image Metadata Manager (BIMM) allows researchers to access and query images and associated metadata [4]; and the Pathology Analytic Imaging Standards (PAIS) data model database enables the storage and analysis of large TMA datasets [5].
All three of these databases are built on published data models that outside groups can replicate. While implementing one of them might benefit some groups, they are sophisticated enough that we believe replicating them would require a dedicated informatics specialist. Consequently, we saw a need for a simpler informatics infrastructure that incorporates imaging without focusing on it and that translational research groups without special informatics expertise could implement more easily.
To do so, we decided to use a ready-made database platform that required little to no coding. Unfortunately, widely available ready-made database platforms are designed to meet a variety of research needs and rarely meet all the needs of a specific researcher. Consequently, it was necessary to utilise a tandem database infrastructure in order to incorporate imaging. Microsoft Access has been a very useful platform for our translational research due to its relational nature, ease of querying, portability, ease of deployment, low cost and ubiquity, which together enable collaboration with institutions around the world. These features have allowed us to develop the TOPDP database, a comprehensive thoracic database containing patient demographic, clinical, proteomic and genomic data in a centralised location [6]. However, Microsoft Access is not without its problems: in particular, Access databases are limited to a 2 GB footprint. Thus, Access is well suited to capturing text-based data, but it is limited when capturing images or other large files.
For this reason, we developed the TORP database using the online REDCap database platform, which was developed at Vanderbilt University and made available to us by the University of Chicago CRI. Like Microsoft Access, the REDCap platform is well suited to meet some of our research needs, but falls short in other areas. REDCap is not relational, so we decided to maintain our comprehensive database in Microsoft Access. However, REDCap allows up to 1 TB of storage space and so is ideal for research projects utilising large files. This capability was especially important for this research project, as multiple representative images from CT scans and histological images for each patient were uploaded into the database. Moreover, REDCap interacts easily with Access: data can be exchanged between the two via Microsoft Excel or an API call. Like Access, REDCap also encourages collaboration within and among institutions, as it is web based and available freely.
In addition to facilitating more robust and novel analyses, this database structure also fosters intrainstitutional and interinstitutional collaboration. Microsoft Access is widely available for a minimal cost, and REDCap is available freely online to registered users. Moreover, researchers interested in adopting the Salgia Lab's TOPDP and TORP databases may access the lab's standard operating procedures for its Access [21] databases, which further detail the construction and utilisation of the databases and are freely available on the iBridge network. Only by developing a common infrastructure will we be able to facilitate fast and easy collaboration in MPM research, which will be essential if the global biomedical research community is to overcome this increasingly global disease.
This informatics infrastructure is not without its limitations. One is that data must be captured via patient report or chart abstraction and then manually entered into the TOPDP database, a process that is tedious, time-consuming and subject to error. There are plans to automate this process by enabling data to be transferred directly from the patient's EMR, which will considerably reduce workload and the potential for error. In this investigation, data were transferred easily from the Access database to REDCap using Microsoft Excel as an intermediary and REDCap's data upload functionality. This method was sufficient for the purposes of the present study, but if necessary or desired, it is also possible to automate the data transfer process using the REDCap API. Images, by contrast, must be uploaded manually using REDCap's online file upload field. The time required to upload images for this investigation was negligible, but manual upload would most likely be prohibitive for studies involving hundreds or thousands of patients.
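As an illustration of the automated transfer mentioned above, the following is a minimal sketch, not the procedure used in this study. It assumes that the Access data have been exported to CSV text and that the REDCap project uses `record_id` as its record identifier. The column names in the usage note, the function names and the placeholder URL are hypothetical. The request parameters follow the REDCap API's record import method as we understand it and should be checked against the API documentation of the local REDCap installation.

```python
import csv
import io
import json
from urllib import parse, request


def rows_to_redcap_payload(csv_text, api_token, id_field="record_id"):
    """Build the form fields for a REDCap record import request from CSV text.

    `id_field` is an assumption: it must match the project's record ID field.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Refuse to build a payload if any row lacks a record identifier.
    missing = [n for n, row in enumerate(rows, start=1)
               if not (row.get(id_field) or "").strip()]
    if missing:
        raise ValueError(f"rows without {id_field}: {missing}")
    return {
        "token": api_token,             # project-specific API token
        "content": "record",            # import records
        "format": "json",               # format of the `data` field
        "type": "flat",                 # one row per record
        "overwriteBehavior": "normal",  # blank values do not overwrite
        "data": json.dumps(rows),
        "returnContent": "count",       # ask for the number of records imported
    }


def post_to_redcap(api_url, payload):
    """POST the payload to a REDCap API endpoint and return the response text.

    `api_url` is installation specific, e.g. a placeholder such as
    "https://redcap.example.org/api/".
    """
    body = parse.urlencode(payload).encode("utf-8")
    req = request.Request(api_url, data=body, method="POST")
    with request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

A call such as `rows_to_redcap_payload("record_id,age\n1,64\n", token)` builds the request fields without contacting any server, so the payload can be inspected before `post_to_redcap` is used. Keeping payload construction separate from the network call also makes the validation step easy to test.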
Our proof-of-principle investigation was also limited in several ways, the first being its small sample size (n=22). As this study was retrospective, it was also limited by a lack of standardisation: when possible, we selected a follow-up CT scan acquired immediately after the second cycle of chemotherapy, but for some patients, follow-up CT scans were only available after the third or fourth cycle. Furthermore, some patient data remained unreported because they could not be found in physician notes during chart abstraction. Finally, tumour measurements were not acquired using modified RECIST, so they cannot be treated as valid data from which to draw clinical conclusions.