This study describes the fusion of a PACS server (neaPACS®) with a HIS server (CRDAS) and their implementation in a large university hospital organization. To test the combined PACS/HIS software, a pilot study retrospectively examined 120 patients with asthma. Institutional review board permission was granted for the pilot study. The following sections describe the research enterprise workflow, the construction of Research PACS and CRDAS, their fusion, and finally the implementation of the ensemble.
Workflow of the Research Archival System
The Medical Imaging Centre and the Science Center of Pirkanmaa Hospital District divided responsibilities to construct, develop, and facilitate the practical use of TARAS. The Science Center is the department responsible for managing scientific studies. The Medical Imaging Centre defines the database variables and transfers images from the clinical PACS to the Research PACS. The Science Center administers the study databases within the CRDAS, the pseudonymization code register, and the Research PACS, as well as data transfer from external sources. This process is depicted in Fig. 1, which shows the pathway from a researcher’s request to use TARAS to the new research project database and the linked images in the Research PACS. Fig. 1 also shows how the system integrates into the organization.
Fig. 1 Workflow of the research enterprise. A principal investigator from any clinical department defines, collects, and delivers research information to the main user of CRDAS (Clinical Research Data Archival System) (1), who establishes a new database on the ...
Construction of Research PACS
The research image archive used to construct Research PACS, neaPACS®, is a standards-conforming Digital Imaging and Communications in Medicine (DICOM) PACS augmented with automatic pseudonymization capabilities. The image archive provides standard DICOM PACS services such as storage, query, and retrieval of images and other DICOM data, as well as a WWW administrative interface and a viewing application (neaView® radiology) for diagnostic radiological use and administrative purposes.
Each image sent to the archive is automatically pseudonymized according to configurable rules. The identity of the DICOM entity sending the images determines which rule to use. The pseudonymization (or lack thereof) of each DICOM field can be configured separately. Information that could be used to identify the patient or organization can be replaced with generated values that bear no relation to the original values. Certain information of limited sensitivity, e.g., patient’s age, sex, or date of imaging study, can be retained if research purposes deem it necessary. All vendor-specific information is removed because it may contain personal information.
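The rule selection described above can be illustrated with a minimal sketch. It is not the neaPACS® configuration format, which the text does not specify; the action names, the sender key `CLINICAL_PACS`, the default of clearing unlisted fields, and the `generate` callback are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical per-field actions (illustrative names, not neaPACS settings).
KEEP = "keep"        # retain a value of limited sensitivity
REPLACE = "replace"  # substitute a generated, unrelated value
CLEAR = "clear"      # replace with an empty value

# One rule set per sending DICOM entity, keyed by a made-up sender name.
RULES: Dict[str, Dict[str, str]] = {
    "CLINICAL_PACS": {
        "PatientName": REPLACE,
        "PatientID": REPLACE,
        "PatientSex": KEEP,
        "PatientAge": KEEP,
        "InstitutionName": CLEAR,
    },
}

# Assumption: any field not listed in the rule set is cleared.
DEFAULT_ACTION = CLEAR


def pseudonymize(
    sender: str,
    dataset: Dict[str, str],
    generate: Callable[[str, str], str],
) -> Dict[str, str]:
    """Apply the sender's rule set to every field of a flat dataset."""
    rules = RULES[sender]
    result: Dict[str, str] = {}
    for field, value in dataset.items():
        action = rules.get(field, DEFAULT_ACTION)
        if action == KEEP:
            result[field] = value
        elif action == REPLACE:
            result[field] = generate(field, value)
        else:
            result[field] = ""
    return result
```

In this sketch the sender identity only selects the rule set; how replacement values are generated is left to the caller.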
Most of the information replaced with pseudonymized values is permanently discarded; the exception is the single unique identifier for each patient, imaging study, series, and image. The original and corresponding pseudonymized identifiers are stored separately from the pseudonymized images and are used only internally by the pseudonymization process, to ensure that subsequent data on the same patient, study, or series receive the same pseudonymized patient, study, or series information as the previous images. Thus, the hierarchical relationship of the received data is maintained after pseudonymization, which allows longitudinal studies to be conducted.
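A minimal sketch of such an identifier table follows, assuming an in-memory dictionary and randomly generated replacement values. The class name `IdentifierMap` and the level labels are hypothetical; the text does not describe how neaPACS® stores or generates these identifiers. The `2.25.` prefix followed by the decimal form of a UUID is one standard way to form a DICOM UID.

```python
import uuid
from typing import Dict, Tuple


class IdentifierMap:
    """Original-to-pseudonymized identifier table, kept apart from the images."""

    def __init__(self) -> None:
        # Key: (level, original identifier); value: pseudonymized identifier.
        self._table: Dict[Tuple[str, str], str] = {}

    def lookup(self, level: str, original: str) -> str:
        """Return the existing pseudonymized identifier, or create a new one."""
        key = (level, original)
        if key not in self._table:
            # Random value with no relation to the original identifier.
            self._table[key] = f"2.25.{uuid.uuid4().int}"
        return self._table[key]
```

Because a repeated lookup returns the stored value, two images from the same original study end up under the same pseudonymized study, which is what preserves the patient/study/series hierarchy.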
In our study, each patient’s social security number was encoded in the Research PACS database by a pseudonymization code combining the encoding day and a running number within the day. If that social security number already exists in the database, the corresponding pseudonymization code is used and no new code is necessary. The following describes a typical configuration of which fields are pseudonymized and how. The PACS software places no other limits on which fields are pseudonymized except that Patient ID and Study, Series, and SOP Instance UIDs must always be assigned new values. In practice, several other values should always be pseudonymized or cleared as well to prevent identifying information from being revealed.
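The code assignment described above can be sketched as follows. The text states only that the code combines the encoding day with a running number within that day; the `YYYYMMDD-NNNN` layout, the class name `CodeRegister`, and the in-memory storage are assumptions for illustration.

```python
from datetime import date
from typing import Dict


class CodeRegister:
    """Assigns one pseudonymization code per social security number."""

    def __init__(self) -> None:
        self._codes: Dict[str, str] = {}      # social security number -> code
        self._counters: Dict[date, int] = {}  # encoding day -> last running number

    def code_for(self, ssn: str, today: date) -> str:
        # Reuse the existing code if this number was encoded before.
        if ssn in self._codes:
            return self._codes[ssn]
        number = self._counters.get(today, 0) + 1
        self._counters[today] = number
        # Assumed layout: encoding day, hyphen, zero-padded running number.
        code = f"{today:%Y%m%d}-{number:04d}"
        self._codes[ssn] = code
        return code
```

With this layout, the first patient encoded on 15 January 2009 would receive `20090115-0001`, and a later request for the same patient on any other day would return that same code.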
A sample of the fields that are assigned new values is listed in Table 1. Fields that are replaced with the date or time of pseudonymization include, for example, the patient’s birth date, the study date, and the study time. The same value is used for study-level and series-level fields in all instances of the same study or series, respectively, as required by DICOM. Fields that are cleared (replaced with an empty value) include the referring physician, station, and institution details. In addition, all private DICOM fields (odd group number, any element number) are completely removed from the pseudonymized dataset.
Example of the fields assigned with new values in the pseudonymization process
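The three kinds of field handling described above (replacement with the pseudonymization date or time, clearing, and removal of private fields) can be sketched as below. The dataset is modeled as a plain dictionary keyed by (group, element) tag pairs rather than by any DICOM library, and the function name `scrub` is hypothetical. Replacement of identifiers such as Patient ID and the UIDs is deliberately left out of this sketch.

```python
from datetime import datetime
from typing import Dict, Tuple

Tag = Tuple[int, int]  # (group, element)

# Standard DICOM tags for the fields named in the text.
PATIENT_BIRTH_DATE: Tag = (0x0010, 0x0030)
STUDY_DATE: Tag = (0x0008, 0x0020)
STUDY_TIME: Tag = (0x0008, 0x0030)
REFERRING_PHYSICIAN: Tag = (0x0008, 0x0090)
STATION_NAME: Tag = (0x0008, 0x1010)
INSTITUTION_NAME: Tag = (0x0008, 0x0080)

DATE_FIELDS = {PATIENT_BIRTH_DATE, STUDY_DATE}
TIME_FIELDS = {STUDY_TIME}
CLEARED_FIELDS = {REFERRING_PHYSICIAN, STATION_NAME, INSTITUTION_NAME}


def is_private(tag: Tag) -> bool:
    """Private DICOM fields have an odd group number."""
    return tag[0] % 2 == 1


def scrub(dataset: Dict[Tag, str], stamp: datetime) -> Dict[Tag, str]:
    """Apply date/time replacement, clearing, and private-field removal.

    The caller is assumed to pass the same stamp for every instance of a
    study, so that study-level fields stay identical across instances.
    """
    result: Dict[Tag, str] = {}
    for tag, value in dataset.items():
        if is_private(tag):
            continue  # removed entirely
        if tag in DATE_FIELDS:
            result[tag] = stamp.strftime("%Y%m%d")
        elif tag in TIME_FIELDS:
            result[tag] = stamp.strftime("%H%M%S")
        elif tag in CLEARED_FIELDS:
            result[tag] = ""
        else:
            result[tag] = value
    return result
```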
Only pseudonymized information is visible to the users and administrators of the archive, regardless of whether the archive is accessed through standard DICOM interfaces (e.g., a researcher’s workstation), the viewer application, or the administrative WWW interface. Original information is not displayed to users and, with the exception of the unique identifiers mentioned above, is not stored in the image archive. However, automatic pseudonymization is not sufficient to guarantee patient confidentiality; for example, images sent to the archive may contain identifying information such as textual annotations or facial photographs, for which automatic detection and removal are not feasible or reliable. Hence, human involvement is required when selecting images for pseudonymization.
In the Research PACS pilot study, the image data of 120 patients, in most cases comprising multiple X-ray and CT imaging studies per patient, were transferred to the Research PACS archive. The transfer was executed with a normal PACS client: the Research PACS archive received image data sent by the clinical PACS client. The transfer can be performed either manually, as in the pilot study, or automatically. If automated, images are sent to the Research PACS database directly from the scanner device in the scanning phase, based on the patient’s social security number or other parameters, provided that the patient has consented to the transfer.
After transfer and pseudonymization are complete, the radiological images can be viewed using the neaView® interface (Fig. 2), which includes several basic image processing tools needed to clinically analyze the images.
Fig. 2 neaView® interface. A patient’s images have been queried with the patient ID (pseudonymization code), which links Research PACS and CRDAS, and images can now be opened for analysis and processing
Construction of the Clinical Research Data Archival System
Pirkanmaa Hospital District’s Science Center maintains an information system for health research data called the Clinical Research Data Archival System. It is based on the Sapphire LIMS® information system and is customized for scientific database purposes. CRDAS uses Oracle® as a database solution and Business Objects® as a reporting tool. CRDAS has a web-based user interface; access to the system is restricted to the users of the Pirkanmaa Hospital District Intranet. The head of the Science Center grants user rights and roles based on principal investigators’ applications. The main user of CRDAS configures user details and roles. Roles vary from registered users with full access to the data (typing rights, etc.) to view-only access. Only people with user rights can access a specific database. User rights can be modified and expanded after the new research database has been created, for example, in cases of new co-operation projects between different research groups.
The Pirkanmaa Hospital District has many different information systems for patient health care; however, these systems usually do not communicate with each other, and information is difficult to retrieve from them. CRDAS can be configured to have many structural databases according to the researcher’s needs. Data can be stored in the CRDAS by automatic update, manual data submission, or updates from specific files, such as comma-separated value files. CRDAS has configurable user rights, database variables, and a graphic interface. Patient data can be normal, pseudonymized, or completely anonymized. In the pilot study, we used pseudonymized data, which enables follow-up studies. However, anonymization could also be applied in studies using data from deceased patients.
The most recent personal data on a patient can be obtained from the personal data register of Tampere University Hospital via HL7 connection. Most of the patient’s clinical information is obtained from the Information Service of the Hospital, which gathers patient data from various sources and provides sampling options. The Information Service of the Hospital uses an information system called SAS®, which researchers cannot use. Thus, CRDAS is needed to provide greater data access and to serve, metaphorically, as a window with the required information security properties for this large data warehouse.
In the pilot study, the data were collected from patient health records and from radiological image analysis. When collating data from different sources, social security numbers were used to identify the data of a single patient; however, the complete database used only a pseudonymization code as the personal identifier. Pseudonymization (Fig. 3) makes it possible to have a database that is open to different research groups with different research interests, and it simplifies the granting of access rights. Because the data are pseudonymized rather than anonymized, patient information can be updated in the future. The head of the Science Center controls the pseudonymization codes and can authorize possible decoding.
Fig. 3 Pseudonymization process. Data can be brought into TARAS from the electronic archives of Pirkanmaa Hospital District (PHD) or from other patient data sources, such as other hospital districts
The principal investigator first lists all data variables needed for the study and then delivers this information to the main user of CRDAS. The main user creates a new database for the study and configures its variables. When the new database is ready for data input, patient data are transferred via manual input and data transfer. One of the main goals of the TARAS project was to link pseudonymized radiological images to CRDAS so that analysis results of the images can be saved directly in the database.
CRDAS has an easy-to-use user interface, which is shown in Fig. 4. The database configuration determines how data are entered. Options include typing or selecting from a list, a data collection, or a calendar. Users may have different options available depending on the role they have been assigned. Patient queries can be done, for example, using the pseudonymization code as an identifier.
Fig. 4 User interface for Clinical Research Data Archival System
Linking Research PACS and Clinical Research Data Archival System
The newly created pairs of social security number and pseudonymization code in the Research PACS were transferred into the CRDAS database in order to enable the combination of patients’ electronic medical records and patients’ images for scientific purposes. However, a TARAS user is not able to see any identifying information. Instead, he or she is able to see patient images and the information saved to the CRDAS (e.g., age, profession, laboratory results, etc.) depending on the specific needs of the study.
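As a rough illustration of this linkage, the sketch below joins clinical rows to pseudonymization codes and drops the social security number from what the researcher sees. It is not the CRDAS implementation; the function name `link_records`, the `ssn` key, and the dictionary-based row format are hypothetical.

```python
from typing import Dict, List


def link_records(
    pairs: Dict[str, str],
    clinical_rows: List[Dict[str, str]],
) -> Dict[str, Dict[str, str]]:
    """Key clinical data by pseudonymization code and drop the identifier.

    pairs maps social security number -> pseudonymization code, as created
    in the Research PACS. Each clinical row carries an 'ssn' key (an assumed
    name) plus the study variables.
    """
    linked: Dict[str, Dict[str, str]] = {}
    for row in clinical_rows:
        code = pairs.get(row["ssn"])
        if code is None:
            continue  # no images for this patient in the Research PACS
        # The researcher-facing record contains no identifying number.
        linked[code] = {k: v for k, v in row.items() if k != "ssn"}
    return linked
```

The resulting code is the same value used as the patient ID in the image archive, so a record and its images can be matched without exposing the original identifier.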
We use a straightforward solution in which the user searches for images in the neaView® interface (Fig. 2) by entering a pseudonymization code, study time, or image modality issued by CRDAS. However, we plan to create a direct link from the CRDAS user interface to the neaView® interface in the near future.
Implementation of the Research PACS and Clinical Research Data Archival System
Before the final implementation of Research PACS and CRDAS, Dr. Prasun Dastidar, M.D., Ph.D., experienced radiologist, and Ms. Tiina Rajala, M.Sc., Bachelor of Medicine, performed a dummy run using TARAS on a personal computer (Osborne Mini 945 workstation) with 1 GB of memory, a 3 GHz Intel Pentium 4 processor, the Windows XP operating system (Service Pack 2), and a 19-in. monitor with 1,280 × 1,024 pixel resolution. The findings are presented in the Results and Discussion.