Requests for biological specimens and data for research purposes have increased significantly over the years, and data requests have become increasingly complex. This complexity is partly related to outcomes-related initiatives that evaluate biomarkers and their role in guiding therapy or predicting outcome. In addition, awareness of confidentiality issues has increased significantly since the implementation of HIPAA.
Requests for research projects consist primarily of:
- Tissue and biological specimens only.
- Clinical (phenotype) data, most frequently pathology data.
- Outcomes information, including treatment, progression and vital status.
We evaluated tissue and data requests at the University of Pittsburgh and found that 20% of research projects needed biological specimens only and 10% needed outcomes information, while the remaining 70% needed more fully annotated tissues requiring phenotypic (clinical) data. The annotation varied from easily accessible (e.g. pathology data) to complex (pre-therapy and post-therapy information). This breakdown of research requests is shown in . This suggested the need to design a system that could provide research biological specimens annotated with patient data while protecting the confidentiality of patient information and fully meeting the requirements of federal regulations [15]. This required implementation of a system that is HIPAA compliant and provides human subjects protection. The resulting system was based on the Honest Broker Concept.
In many instances, the collection of information on clinical progression, treatment and outcomes of cancer patients may fall under human subjects research and therefore require specific IRB review and approval. The overall aim is to obtain “informed consent” from all patients for research use of their biological materials. In addition, the various registries in the institute also aim to obtain “informed consent” from all patients for the research use of their data.
Human Subjects Protection – The Honest Broker Concept
The tissue/databank ensures protection of patient identity through “The Honest Broker Concept.” The honest broker is an individual, organization or system that acts for, or on behalf of, the tissue/databank. The role of the honest broker is to collect and provide health information to research investigators in such a manner that it would not be reasonably possible for the investigators, or other individuals, to identify the subjects directly or indirectly. The “honest broker” or “tissue/data bank trustee” acts as a well-defined barrier between the clinical environment (in which fully identified confidential patient information is routinely exchanged as part of medical care) and the general research community (in which all information must be completely de-identified). The honest broker also ensures that research data, which is generally not clinically validated, is not used for clinical care [16].
In our implementation, the honest broker is not part of either the clinical or the research team. The honest broker provides only “honest broker” services to a particular project and is not part of either its data collection team or its research team. This avoids any potential conflict of interest. It should be emphasized that an individual's role may change from project to project. However, the end result is that the dedicated honest broker for a project is not part of either the research team or the data/biological specimen aggregation team for that project.
This is important to ensure confidentiality and honest research. The honest broker is the only entity that can link research identifiers and clinical identifiers. This transfers control of, and responsibility for, the de-identification process to an independent third party, the honest broker, thereby reducing the risk of conflict of interest. Personal and clinical identifiers (names, addresses, medical record numbers, etc.) are limited to the clinical space. The research identifiers (e.g. “subject 12432”) cannot be traced back to the personal or clinical identifiers except through the honest broker’s linkage codes. This concept differs from anonymization. Anonymization is a one-way process in which the linkage between personal identifiers and research identifiers is removed. It precludes any subsequent updating of data: annotation of a specimen with data stops when anonymization is performed. Having the honest broker assign linkage codes (re-identification codes) allows information to be updated at any time in the future. The honest broker can identify the patient by means of the linkage code, access information related to this patient from the clinical domain, and provide updated information to the researchers in a de-identified fashion, using the original linkage code. The link between codes must be retained and protected by the honest broker. Subsequent requests to update information on research protocol participants (the research cohort) must be conducted through the honest broker. The honest broker system is therefore an improvement on anonymization: anonymization essentially provides information only up to the time of accrual, whereas the honest broker concept allows information to be updated in a manner that is consistent with current legal and ethical protocols.
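The difference between retained linkage codes and one-way anonymization can be illustrated with a minimal sketch. The class name `HonestBroker`, the field names, and the identifier format are hypothetical and are not taken from the system described here; the sketch only shows that a later update for the same patient is released under the same research identifier, while the linkage itself never leaves the broker.

```python
import secrets


class HonestBroker:
    """Holds the only mapping between clinical and research identifiers."""

    def __init__(self):
        # clinical identifier (e.g. a medical record number) -> research identifier
        self._linkage = {}

    def research_id(self, clinical_id):
        """Return the existing research identifier, or issue a new random one."""
        if clinical_id not in self._linkage:
            code = f"subject-{secrets.token_hex(4)}"
            while code in self._linkage.values():  # avoid reusing a code
                code = f"subject-{secrets.token_hex(4)}"
            self._linkage[clinical_id] = code
        return self._linkage[clinical_id]

    def release(self, clinical_id, record, identifying_fields):
        """Strip identifying fields and label the record with the research identifier."""
        safe = {k: v for k, v in record.items() if k not in identifying_fields}
        safe["research_id"] = self.research_id(clinical_id)
        return safe


# Hypothetical field names treated as identifiers in this sketch.
PHI_FIELDS = {"name", "mrn", "address"}

broker = HonestBroker()
first = broker.release(
    "MRN-001",
    {"name": "Jane Doe", "mrn": "MRN-001", "diagnosis": "adenocarcinoma"},
    PHI_FIELDS,
)
# A later update for the same patient reuses the original research identifier.
update = broker.release(
    "MRN-001",
    {"name": "Jane Doe", "mrn": "MRN-001", "vital_status": "alive"},
    PHI_FIELDS,
)
```

Under anonymization the `_linkage` table would be discarded after the first release, so the second call could not be tied to the same research subject.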
Discussions involving the Cancer Registry and the Health Sciences Tissue Bank identified the major sources of tissue and biological specimens and of annotating data for research use. The Privacy Rule of the Health Insurance Portability and Accountability Act (HIPAA) of 1996 permits access to protected health information without patient authorization in a limited number of situations [13]. One frequent situation is where the protected health information is used in a de-identified fashion. The honest broker plays a prominent role in this scenario, since neither the federal policy nor HIPAA regulations require prior written consent or authorization of patients when existing health information is used in a de-identified fashion. The honest broker can be a part of the facility providing the data. In addition, the honest broker can be a business associate of the facility [17]. The attached supplemental files include the University of Pittsburgh template business associate agreement. This approach allowed us to expand the circle of participating facilities. We decided to include the divisions/departments involved in data aggregation, as well as the facilities that create and implement software solutions and tools for these groups, as participants in this initiative. The software groups included Pathology and Oncology Informatics and the Electronic Medical Records team. This list may not include every possible entity that could play a role; nonetheless, it captures the major players involved in aggregating and providing specimens and data and in designing software tools for these efforts.
The facilities currently part of the “Honest broker facility” and their roles in this initiative are described below.
Participating facilities
1. The Health Sciences Tissue Bank: The Health Sciences Tissue Bank is the main institutional infrastructure for collecting tissue and other biological materials for research. For confidentiality reasons, these research specimens are stored in a de-identified fashion, annotated with linkage codes. However, the linkage codes allow access to specific information regarding the donor. This is important since many research projects require not only tissue and biological specimens but also additional data regarding family history, treatment history, and outcomes.
2. The Pathology Laboratory Information System: This is the clinical system used for reporting pathology information. This repository contains extensive information regarding clinical evaluation of tissue and other biological specimens. This information is extremely useful for understanding the composition of the research specimen. The system stores clinically reported information pertaining to tissue specimens (biopsy and resection reports), cytology specimens (exfoliated as well as aspirate specimens), and other biologic specimens (blood, blood products, urine and other biological specimens).
3. The Cancer Registry: The Registry performs the state-mandated function of collecting information on cancer patients. The information collected pertains to both diagnostic details and follow-up information. The data collected by the Registry consists of a set of defined data elements that are part of a standardized set of common data elements. We have extended this set by adding data elements, of primarily research value, as part of a separate IRB-approved initiative.
4. The Clinical Outcomes group: This institutional entity collects and provides information pertaining to ongoing clinical trials, health services research and patient safety research.
5. Radiation and Medical Oncology: Radiation and medical oncology are important caregivers for oncologic diseases. The clinical databases of these two entities provide critical information regarding therapeutic interventions and the responses to those specific therapies. Information accrued from Radiation and Medical Oncology is therefore critical in providing insight into patient response to therapeutic protocols.
6. Pathology and Oncology Informatics: This group is responsible for designing and maintaining the informatics infrastructure for collection, storage and disbursement of annotating information. It is important to affiliate this group with the honest broker infrastructure development, since Pathology and Oncology Informatics designs, tests and maintains the tools needed for the other components of the honest broker system. These include software packages needed for inventory management by the Health Sciences Tissue Bank, data aggregation software packages for the Cancer Registry and Clinical Outcomes group, clinical information and research information recording mechanisms for Medical and Radiation Oncology, and de-identification software packages needed by many participating facilities (Health Sciences Tissue Bank, Cancer Registry, the Electronic Medical Record team and others). NOTE: Our Pathology and Oncology Informatics groups were merged into the new Department of Biomedical Informatics as of June 2006 (http://www.dbmi.pitt.edu).
7. The University of Pittsburgh Health Systems Information Services Division: Most clinical data is captured in electronic form in various hospital information systems. This includes patient history, details of surgical and radiological procedures, therapeutic interventions and follow-up information. The clinical component of the electronic medical records consists of information in an identified form. However, this information must be de-identified before it is transferred into the research domain. The electronic medical record team therefore serves as a gatekeeper for this information and oversees implementation of appropriate de-identification protocols prior to the incorporation of this data into research databases. The electronic medical record team also plays a critical role in performing queries for specific research requests. This activity helps identify appropriate patient populations for research projects. The resulting patient lists then need to undergo de-identification.
Under this concept, at least one individual acts as an honest broker at each of the facilities listed above. For clinical and translational research studies in oncology, the cancer registrars are extremely valuable, since their federal mandate and job specifications allow them ready access to clinical information on cancer patients. In addition, they are not involved in specimen banking or research and thus do not have access to the data annotating tissue bank samples or to the results of the research studies. Including the cancer registry in an honest broker system facilitates data accrual from this purely clinical data entity, which maintains updated information on all oncology patients. This updating is done every six months and is part of the state-mandated function of the cancer registry.
The “Institutional Honest Broker” system ensures that the honest broker (“trustee”) is the only person who can link a patient with the tissue bank number that identifies that patient. The Institutional Honest Broker system also provides a process by which new clinical outcome information can be added to a file identified only by a code number, rather than a name. This creates a fail-safe mechanism for communicating with patients in the extremely rare event of an IRB-directed dissemination of important research data to the patient or their survivors.
We decided to incorporate the groups named above, which are involved in tissue and data aggregation with possible research application, into an Institutional Honest Broker system.
The University of Pittsburgh Academic Health Center consists of two closely interacting, but legally separate, entities. These are the University of Pittsburgh, which primarily oversees the research activities, and the University of Pittsburgh Medical Center (UPMC), which oversees clinical activity and in which the clinical data resides. Potential legal/ethical issues pertaining to the creation of this system were discussed with the Institutional Review Board (IRB) of the University of Pittsburgh as well as the legal team of the UPMC. A formal IRB application for this “Honest Broker Facility”, incorporating the comments and suggestions of the IRB and the legal team of the University of Pittsburgh Medical Center Health Systems, was approved by the IRB and formally went into effect on May 8, 2003.
The employees of the Honest Broker Facility have honest broker agreements with the University of Pittsburgh and the University of Pittsburgh Health Systems. This Honest Broker Facility encompasses several separate departments and divisions, each of which has contributed personnel to the honest broker pool. This arrangement has provided a large task force for honest broker activities, which is important since an honest broker should not be involved with the research requiring honest broker services. This approach ensures that the individual engaged in honest broker activities has no conflict of interest, thereby creating an appropriate work environment.
Honest Broker Process
The honest broker certification process requires completion of IRB-mandated education modules. These modules are Research Integrity, Human Subjects Research in Biomedical Sciences, and HIPAA Researchers Privacy Requirements. The education modules can be completed via the Web at the University of Pittsburgh IRB web site (https://cme.hs.pitt.edu/). A certificate of completion is generated once each module has been completed. In addition, the honest broker has to enter into a business associate agreement [17]. An individual can become a certified honest broker once these administrative requirements have been completed.
The honest broker facility provides an update to the IRB every six months. The update is an opportunity to add or delete honest brokers. The Institutional Honest Broker system at the University of Pittsburgh has assigned overall administrative responsibility for the honest broker service to the Manager of the Cancer Registry. However, this oversight can be provided by the leaders of any of the participating entities.
The Pathology and Oncology Informatics division has designed a Data Request Tracking Tool for the honest broker system. This tool is located on a password-protected website within the firewall of the University of Pittsburgh. The process, and its interaction with affiliated entities, is described on an accessible website (http://www.upci.upmc.edu/facilities/cis/serv.html). The tool provides the interface for entering descriptive details of a research project requiring honest broker services. After logging into the system, a menu of options is available to the honest broker. This is shown in . The honest broker handling a particular request enters all the information about the research project into the database using the initial data-entry screen of this tool. The initial data-entry screen captures information pertaining to the investigator and the nature of the request, as well as important workflow items such as requested turnaround time, IRB status and approval number. This screen also captures billing information, in case the services provided will be compensated through an institutional account rather than through grant-funded mechanisms. The tool has a built-in query capability. The honest broker designates the fields required for the data sources, the disease category, the method of output for tissue/biological specimens and data, the method of distribution and the purpose of the request. A screen capture of this aspect of the tool is shown in . The honest broker alerts their supervisor once all project information has been entered into the tracking tool. The supervisor reviews project details and provides input and approval. The tracking tool is used to follow a research tissue/data request from start to finish. This provides information regarding turnaround time as well as time spent on a project. All of this information is summarized and available in the final "complete request" snapshot of the tool. This is shown in .
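The kind of record such a tracking tool follows from entry to completion can be sketched as below. This is an illustration only: the class `DataRequest`, its field names, and the sample values are assumptions, not the schema of the actual Data Request Tracking Tool. It captures a few of the items named above (investigator, nature of the request, IRB approval number, requested turnaround time), the supervisor approval step, and the turnaround time derived at completion.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DataRequest:
    """One research tissue/data request followed from entry to completion."""
    investigator: str
    nature_of_request: str
    irb_approval_number: str
    requested_turnaround_days: int
    received: date
    supervisor_approved: bool = False
    completed: Optional[date] = None

    def approve(self) -> None:
        """Record the supervisor's approval of the entered project details."""
        self.supervisor_approved = True

    def complete(self, when: date) -> None:
        """Close the request; only approved requests can be completed."""
        if not self.supervisor_approved:
            raise ValueError("request has not been approved by the supervisor")
        self.completed = when

    def turnaround_days(self) -> Optional[int]:
        """Days from receipt to completion, or None while the request is open."""
        if self.completed is None:
            return None
        return (self.completed - self.received).days


request = DataRequest(
    investigator="Dr. Example",            # placeholder investigator
    nature_of_request="annotated lung tissue",
    irb_approval_number="IRB-0000",        # placeholder, not a real approval number
    requested_turnaround_days=14,
    received=date(2006, 3, 1),
)
request.approve()
request.complete(date(2006, 3, 11))
```

Comparing `request.turnaround_days()` with `requested_turnaround_days` gives the kind of turnaround summary the final snapshot is described as reporting.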
De-identification protocols
The de-identification of patient samples and data is performed using a variety of tools. The Pathology Lab Information System, CoPath, has limited de-identification capabilities. The electronic medical record system also has de-identification software systems. The honest broker system can be utilized for de-identifying specimens/data, with the honest broker retaining codes for the specimen/data provided. The Clinical Research Informatics Service (http://www.dbmi.pitt.edu/cris/) in the Department of Biomedical Informatics has created a HIPAA compliant de-identification engine. Electronic mechanisms for addressing honest broker issues are described in the literature [18, 19, 20].
This de-identification engine has been certified by the IRB of the University of Pittsburgh, as well as by the University of Pittsburgh Medical Center security office, for generating de-identified output from a variety of free-text medical reports. The engine identifies all HIPAA-mandated PHI (e.g. names) and replaces each item with a de-identified tag and replacement letters. If the same person is encountered in multiple places in the same report, the same replacement letters are used for every occurrence. Similarly, dates are shifted by an offset, so that intervals among aggregated reports can still be determined. An example of a de-identified report generated by this engine is shown in . The system generates a linkage file for each patient. This file is stored on a secure server. A diagrammatic representation of this process is shown in .
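The two behaviours described above, consistent replacement letters for a repeated name and date offsets that preserve intervals, can be sketched as follows. This is not the certified engine: the function `deidentify`, the tag format `[NAME-AAA]`, the ISO date pattern, and the requirement that names be supplied as a list are all simplifying assumptions. A real engine has to detect PHI in free text itself.

```python
import re
from datetime import datetime, timedelta

# Assumption: dates appear in ISO form (YYYY-MM-DD); real reports vary widely.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def deidentify(report, names, offset_days, linkage):
    """Replace known names with stable tags and shift ISO dates by a fixed offset.

    `linkage` maps each original name to its tag and plays the role of the
    per-patient linkage file retained by the honest broker.
    """
    for name in names:
        # Reuse the tag if this name was already seen, so repeats stay consistent.
        # This sketch supports at most 26 distinct names (AAA, BBB, ...).
        tag = linkage.setdefault(name, f"[NAME-{chr(ord('A') + len(linkage)) * 3}]")
        report = report.replace(name, tag)

    def shift(match):
        original = datetime.strptime(match.group(0), "%Y-%m-%d")
        return (original + timedelta(days=offset_days)).strftime("%Y-%m-%d")

    # The same offset is applied to every date, so intervals are unchanged.
    return DATE_PATTERN.sub(shift, report)


linkage = {}
text = "John Smith was seen on 2003-05-08. John Smith returned on 2003-05-22."
clean = deidentify(text, ["John Smith"], offset_days=-100, linkage=linkage)
shifted = [datetime.strptime(d, "%Y-%m-%d") for d in DATE_PATTERN.findall(clean)]
interval_days = (shifted[1] - shifted[0]).days
```

Because every date in a patient's reports moves by the same offset, `interval_days` equals the interval between the original visits even though neither original date appears in the output.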
Data sources
The collaborative honest broker service utilizes multiple sources of data. These include clinical applications (Pathology Laboratory Information Services, Radiation Oncology Systems, Outpatient Systems and Hospital Information Systems), Clinical Trials related applications, Cancer Registry applications, and Tissue Banking Inventory and Information Systems. In addition paper-based records in physician offices and legacy records in the hospital may be used. These multiple data sources are listed in Table 1.
Oversight of the honest broker system
Sharon Winters, the director of the cancer registry, serves as the overall manager of the honest broker facility. She is primarily responsible for maintaining oversight of administrative and regulatory issues. She is assisted in this role by the lead supervisors of the participating facilities. These include the manager of the Tissue Bank, the manager of the Quality Assurance facility, the manager of Pathology and Oncology Informatics, as well as the data managers for Medical Oncology and Radiation Oncology.
Oversight for tissue and biological specimen disbursement
Oversight for tissue and biological specimen disbursement is provided by a number of organ-specific Tissue Utilization Committees (TUCs). These committees sit within the University of Pittsburgh’s translational and clinical programs. The University of Pittsburgh has functioning TUCs for the following organ sites: Lung, Head & Neck, GU, GI, Women’s Health, Melanoma, Liver and Transplant, and non-neoplastic lung diseases. Each committee provides representation to the different groups involved in decision making and research specimen usage for that particular organ type (Surgery/Oncology/Pathology/Researchers). Each committee makes binding recommendations to the personnel of the Tissue Bank on the priorities for distribution of tissue and biological materials. An Institutional Oversight Committee oversees the different organ-specific TUCs and serves as the final arbitrator in case of conflicts that are not resolved in the organ-specific TUC. The oversight committee consists of clinical and research leaders at the University of Pittsburgh. It also serves as an internal scientific advisory board.
Mechanisms for prioritization of biological specimens
The prioritization protocol is consistent with institutional policies. However, there are variations from organ system to organ system, depending on the different projects being supported. The criteria for prioritization are:
- SPORE projects.
- Exploratory pilot projects directed at SPORE project development.
- Projects funded by federal peer-reviewed agencies.
- Projects funded by non-federal agencies.
- Projects funded by industry.
The tissue utilization committees do have the authority to make exceptions, with the approval of the oversight committee.
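A prioritization rule of this kind can be sketched as a sort over funding categories. The sketch assumes that the criteria are listed from highest to lowest priority, which the text does not state explicitly, and that an approved exception simply moves a request ahead of the others. The category keys, the request identifiers, and the function `prioritize` are hypothetical.

```python
# Assumed ranking: the criteria in the order listed above, highest priority first.
FUNDING_RANK = {
    "spore": 0,
    "spore_pilot": 1,
    "federal_peer_reviewed": 2,
    "non_federal": 3,
    "industry": 4,
}


def prioritize(requests, exceptions=frozenset()):
    """Order requests by funding rank; approved exceptions move to the front.

    `exceptions` holds the ids of requests for which a tissue utilization
    committee has made an exception with oversight committee approval.
    """
    return sorted(
        requests,
        key=lambda r: (r["id"] not in exceptions, FUNDING_RANK[r["funding"]]),
    )


requests = [
    {"id": "R1", "funding": "industry"},
    {"id": "R2", "funding": "spore"},
    {"id": "R3", "funding": "non_federal"},
    {"id": "R4", "funding": "federal_peer_reviewed"},
]
ordered = [r["id"] for r in prioritize(requests)]
with_exception = [r["id"] for r in prioritize(requests, exceptions={"R1"})]
```

Since the protocol varies from organ system to organ system, each committee would in practice need its own ranking table rather than the single `FUNDING_RANK` used here.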