The REDCap project was developed to provide scientific research teams with intuitive and reusable tools for collecting, storing and disseminating project-specific clinical and translational research data. The following key features were identified as critical components for supporting research projects: 1) collaborative access to data across academic departments and institutions; 2) user authentication and role-based security; 3) intuitive electronic case report forms (CRFs); 4) real-time data validation, integrity checks and other mechanisms for ensuring data quality (e.g., double-data entry options); 5) data attribution and audit capabilities; 6) protocol document storage and sharing; 7) central data storage and backups; 8) data export functions for common statistical packages; and 9) data import functions to facilitate bulk import of data from other systems. Given the quantity and diversity of research projects within academic medical centers, we also identified two additional critical features for the REDCap project: 10) a software generation cycle sufficiently fast to accommodate multiple concurrent projects without the need for custom project-specific programming; and 11) a model capable of meeting the disparate data collection needs of projects across a wide array of scientific disciplines.
REDCap accomplishes key functions through use of a single study metadata table referenced by presentation-level operational modules. Based on this abstracted programming model, studies are developed in an efficient manner with little resource investment beyond the creation of a single data dictionary. The concept of metadata-driven application development is well established, so we realized early in the project that the critical factor for success would lie in creating a simple workflow methodology allowing research teams to autonomously develop study-related metadata in an efficient manner (3). In the following sections, we describe the workflow process developed for REDCap metadata creation and provide a brief overview of the user interface and underlying architecture.
A. Study Creation Workflow
The project initiation workflow figure provides a schematic representation of the methodology for building a REDCap database for an individual study. The process begins with a request from the research team to the Informatics Core (IC) for database services. A meeting is then scheduled between the research team and an IC representative for a formal REDCap demonstration. During this initial meeting, the IC representative stresses key program features (intuitive user interface, data security model, distributed work environment, data validation procedures, statistical package-ready export features, and audit trail logging) and provides project-specific consultation on data collection strategy. Researchers are given a Microsoft Excel spreadsheet template with detailed instructions for completing the required metadata (e.g., field name, end-user label, data type, data range) for each measurement in each case report form. They are asked to spend the next several days working with the template to define data elements for their specific project and then to return the worksheet to the IC by e-mail. The returned worksheet is used to build and populate the study-specific database tables that feed a working web-based electronic data collection (EDC) application for the project. The research team receives a web link to the prototype application, along with instructions for testing it and iterating on the metadata spreadsheet until the study data collection plan is complete. The system is then placed in production status for study data collection. The workflow process typically includes several members of the research group and allows the entire team to devise and test every aspect of study data collection requirements before study initiation.
REDCap Project Initiation Workflow
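The step in which the returned worksheet is used to build study-specific tables can be illustrated with a minimal sketch. It assumes the worksheet has been saved as comma-delimited text; the column names (`field_name`, `data_type`, and so on), the type mapping, and the table name `study_data` are hypothetical choices made for illustration and are not taken from REDCap itself.

```python
import csv
import io
import re

# Hypothetical data dictionary content; the column names are assumptions.
DICTIONARY_CSV = """field_name,form_name,field_label,data_type,min,max
subject_id,demographics,Subject ID,text,,
age,demographics,Age (years),integer,18,90
sbp,vitals,Systolic blood pressure (mmHg),integer,70,180
smoker,demographics,Current smoker,choice,,
"""

# Assumed mapping from dictionary data types to SQL column types.
SQL_TYPES = {
    "text": "VARCHAR(255)",
    "integer": "INT",
    "number": "DOUBLE",
    "choice": "INT",  # coded values such as 0=No, 1=Yes
}


def load_dictionary(text):
    """Parse the spreadsheet export into a list of per-field dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def build_create_table(table_name, fields, key_field="subject_id"):
    """Return a CREATE TABLE statement with one column per dictionary row."""
    columns = []
    for field in fields:
        name = field["field_name"]
        # Field names become SQL identifiers, so restrict them to a safe pattern.
        if not re.fullmatch(r"[a-z][a-z0-9_]*", name):
            raise ValueError(f"unsafe field name: {name!r}")
        sql_type = SQL_TYPES[field["data_type"]]
        suffix = " PRIMARY KEY" if name == key_field else ""
        columns.append(f"    {name} {sql_type}{suffix}")
    return f"CREATE TABLE {table_name} (\n" + ",\n".join(columns) + "\n);"


fields = load_dictionary(DICTIONARY_CSV)
ddl = build_create_table("study_data", fields)
print(ddl)
```

Because the table definition is derived entirely from the dictionary rows, each revision of the worksheet can regenerate the prototype without project-specific programming, which is the property the workflow relies on.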
B. User Interface
The REDCap user interface provides an intuitive method to securely and accurately input data relating to research studies. The accompanying figure shows a typical CRF view. Each form is accessible only to users who have been granted sufficient access privileges by the research team. Forms contain field-specific validation code to ensure strong data integrity. In addition to checking the mandatory field type (e.g., numeric entry for systolic blood pressure), embedded functions check data ranges (e.g., 70–180 mmHg) and alert the end-user whenever entered data violate the specified limits. Data fields may be populated using text fields or through embedded pull-down boxes or radio buttons, where the end-user is shown one value and a separate code is stored in the database for later statistical analysis (e.g., 0=No, 1=Yes).
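The type check, range check, and label-to-code mapping described above can be sketched as follows. This is an illustration under stated assumptions, not REDCap's validation code: the `FieldRule` structure and the `validate` function are hypothetical, and the sketch flags out-of-range values with a message while still returning them, which is only one possible policy.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FieldRule:
    """Hypothetical per-field rule, as it might be derived from study metadata."""
    name: str
    data_type: str  # "integer", "number", "text", or "choice"
    required: bool = False
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    choices: dict = field(default_factory=dict)  # stored code -> displayed label


def validate(rule, raw):
    """Return (stored_value, messages) for one raw form entry."""
    raw = raw.strip()
    if raw == "":
        return None, ([f"{rule.name}: a value is required"] if rule.required else [])
    if rule.data_type == "choice":
        # The form displays labels, but only the code is stored.
        for code, label in rule.choices.items():
            if raw == label:
                return code, []
        return None, [f"{rule.name}: '{raw}' is not an allowed choice"]
    if rule.data_type in ("integer", "number"):
        try:
            value = int(raw) if rule.data_type == "integer" else float(raw)
        except ValueError:
            return None, [f"{rule.name}: expected a {rule.data_type} value"]
        messages = []
        if rule.minimum is not None and value < rule.minimum:
            messages.append(f"{rule.name}: {value} is below the minimum of {rule.minimum}")
        if rule.maximum is not None and value > rule.maximum:
            messages.append(f"{rule.name}: {value} is above the maximum of {rule.maximum}")
        return value, messages
    return raw, []


sbp = FieldRule("sbp", "integer", required=True, minimum=70, maximum=180)
smoker = FieldRule("smoker", "choice", choices={0: "No", 1: "Yes"})

for rule, entry in [(sbp, "120"), (sbp, "abc"), (sbp, "200"), (smoker, "Yes")]:
    print(entry, "->", validate(rule, entry))
```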
Clickable project menu items are shown on the right side of the screen in the accompanying CRF view figure. All menu items in the Data Entry tab point to CRFs specific to the scientific project, while the Applications tab contains menu items pointing to general REDCap functions. Researchers use the “Data Export Tool” to push collected data out of the REDCap system for external analysis and may select entire forms and/or individual fields for export. The module returns downloadable raw data files (comma-delimited format) along with syntax script files used to automatically import the data and all context information (data labels, coded variable information) into common statistical packages (SPSS, R/S+, SAS, Stata). The “Data Import Tool” module allows bulk upload of data from existing files, with automatic data validation and audit trail creation similar to that performed during CRF data entry. The “Data Comparison Tool” module provides a mechanism to view and reconcile data for studies employing double-data entry or blinded-data entry methods. The “Data Logging” module provides users with a view of all data transactions for the duration of the project. The “File Repository” module allows end-users to store individual supporting files for the project and retrieve them wherever and whenever necessary. The “Data Dictionary” module allows researchers to download a clean copy of the project metadata during the iterative study data planning process. The “User Rights” module is used by project administrators to add or remove research personnel and set individual security rights.
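To make the export description concrete, the sketch below pairs a comma-delimited raw data file with a generated R syntax script that restores labels and coded values. It is only an assumption about how such a pairing could be produced: the `METADATA` layout, the function names, and the use of a `"label"` attribute in R are illustrative and do not reproduce the scripts REDCap actually emits.

```python
import csv
import io

# Hypothetical metadata: variable name -> (label, {stored code: display label}).
METADATA = {
    "subject_id": ("Subject ID", {}),
    "sbp": ("Systolic blood pressure (mmHg)", {}),
    "smoker": ("Current smoker", {0: "No", 1: "Yes"}),
}

RECORDS = [
    {"subject_id": "S001", "sbp": 118, "smoker": 0},
    {"subject_id": "S002", "sbp": 142, "smoker": 1},
]


def export_csv(records, fields):
    """Write only the selected fields as comma-delimited text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def r_string(text):
    """Quote a Python string as an R string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def export_r_script(csv_name, fields, metadata):
    """Generate R code that loads the CSV and reattaches labels and codes."""
    lines = [f"data <- read.csv({r_string(csv_name)}, stringsAsFactors = FALSE)"]
    for name in fields:
        label, choices = metadata[name]
        if choices:
            codes = ", ".join(str(code) for code in choices)
            labels = ", ".join(r_string(text) for text in choices.values())
            lines.append(
                f"data${name} <- factor(data${name}, "
                f"levels = c({codes}), labels = c({labels}))"
            )
        lines.append(f'attr(data${name}, "label") <- {r_string(label)}')
    return "\n".join(lines) + "\n"


selected = ["subject_id", "smoker"]  # the researcher exports a subset of fields
print(export_csv(RECORDS, selected))
print(export_r_script("export.csv", selected, METADATA))
```

Keeping the codes in the raw file and the labels in the script means the same data file can serve several statistical packages, with only the syntax script differing between them.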
C. Architecture
Hardware and software requirements are modest, and the system runs in Windows/IIS and Linux/Apache web server environments. In keeping with the goal of creating a rapid development methodology and an easily maintainable resource for multiple concurrent studies, we devised a set of similar database tables for use in each study. The standard REDCap project requires five distinct tables stored in a single MySQL database: 1) a METADATA table containing all study data pertaining to database storage (data field types and naming used for automatic creation of the separate data storage table) and CRF presentation (form names and security levels, field-specific display and validation rules); 2) a LOGGING table used to store all information about data changes and exports; 3) a DOCS table used to store uploaded files (e.g., consent forms, analysis code) or generated export files (SAS, SPSS, R, Stata, Excel); 4) a RIGHTS table containing specific researcher rights and expiration settings; and 5) a flat DATA table used to store all collected data for the study (one row per subject, with all collected data fields stored in columns). In studies requiring more than 1500 data fields per subject, multiple DATA tables are used, with a 1:1 relationship between tables linked on the subject identifier field. Although simplistic, this data model is easy to reproduce and tailor for individual research studies during the project creation process and has proven adequate for a wide variety of clinical and translational research studies seen in multiple academic research environments.
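The five-table layout and the rule for splitting wide studies across several DATA tables can be sketched as below. The sketch uses Python's built-in `sqlite3` module purely as a stand-in for MySQL, and every column layout, the per-project table prefix, and the function names are assumptions made for illustration. Only the five table roles and the 1500-field threshold come from the description above.

```python
import sqlite3

MAX_FIELDS_PER_TABLE = 1500  # threshold stated in the text


def split_fields(field_names, limit=MAX_FIELDS_PER_TABLE):
    """Partition field names into groups of at most `limit` columns."""
    return [field_names[i:i + limit] for i in range(0, len(field_names), limit)]


def create_project_tables(conn, project, field_names, limit=MAX_FIELDS_PER_TABLE):
    """Create the five table types for one study; return the DATA table names."""
    # Column layouts here are illustrative assumptions, not REDCap's schema.
    conn.execute(
        f"CREATE TABLE {project}_metadata (field_name TEXT PRIMARY KEY, "
        "form_name TEXT, field_label TEXT, data_type TEXT, validation TEXT)"
    )
    conn.execute(
        f"CREATE TABLE {project}_logging (id INTEGER PRIMARY KEY, "
        "username TEXT, event TEXT, detail TEXT, logged_at TEXT)"
    )
    conn.execute(
        f"CREATE TABLE {project}_docs (id INTEGER PRIMARY KEY, "
        "file_name TEXT, content BLOB)"
    )
    conn.execute(
        f"CREATE TABLE {project}_rights (username TEXT PRIMARY KEY, "
        "rights TEXT, expires_on TEXT)"
    )
    data_tables = []
    for index, chunk in enumerate(split_fields(field_names, limit), start=1):
        table = f"{project}_data" if index == 1 else f"{project}_data{index}"
        columns = ", ".join(f"{name} TEXT" for name in chunk)
        # Every DATA table shares the subject identifier as its key (1:1 link).
        conn.execute(f"CREATE TABLE {table} (subject_id TEXT PRIMARY KEY, {columns})")
        data_tables.append(table)
    return data_tables


# A study with 3200 fields would need three DATA tables at the stated threshold.
print([len(chunk) for chunk in split_fields([f"f{i}" for i in range(3200)])])

# Small demonstration with a reduced limit so the result is easy to inspect.
connection = sqlite3.connect(":memory:")
tables = create_project_tables(connection, "demo", [f"f{i}" for i in range(7)], limit=3)
print(tables)
```

Reassembling a subject's full record is then a join of the DATA tables on `subject_id`, which stays simple because each table holds at most one row per subject.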