Clinical data management of this 11PCV efficacy trial and related satellite studies was challenging because of the volume of data collected and the complexities of running several studies simultaneously. One challenging task was keeping track of children with multiple clinical events, i.e. all the different outcomes studied. These events, and all information from the different studies, had to be linked accurately. Maintaining data quality and linking the various clinical data records to establish relationships was laborious. However, thanks to careful trial preparation and deliberate efforts to follow SOPs for data collection and management, the data management team cleaned, corrected and properly joined data from the various sources.
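The linkage task described above can be illustrated with a minimal sketch. It is not the trial's actual CDMS code: the field names (`child_id`, `study`, `event`), the identifiers and the function `link_events` are hypothetical and chosen only for illustration. The sketch groups event records from several studies under each enrolled child and sets aside records whose identifier matches no enrolled child, so that they can be queried rather than silently dropped.

```python
from collections import defaultdict


def link_events(children, events):
    """Group event records by child ID; return (linked, unmatched).

    Field names are hypothetical, not taken from the trial's database.
    """
    known_ids = {child["child_id"] for child in children}
    linked = defaultdict(list)
    unmatched = []
    for event in events:
        if event["child_id"] in known_ids:
            linked[event["child_id"]].append(event)
        else:
            # No enrolled child with this ID: keep the record for a data query.
            unmatched.append(event)
    return dict(linked), unmatched


# Illustrative records only (invented IDs and labels).
children = [{"child_id": "C001"}, {"child_id": "C002"}]
events = [
    {"child_id": "C001", "study": "main", "event": "pneumonia episode"},
    {"child_id": "C001", "study": "satellite", "event": "satellite visit"},
    {"child_id": "C009", "study": "main", "event": "SAE"},
]

linked, unmatched = link_events(children, events)
```

Here the two records for `C001` end up under one child, while the record for the unknown identifier `C009` is returned separately for follow-up.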
Despite all the trial preparation activities, the initial design of the data entry application did not anticipate that major changes would need to be implemented in the CDMS during the course of the trial. The data management group attempted to reduce the effort needed to implement new modules in the system, and these efforts, combined with the original design, formed the final architecture.
Given the extensive data entry validation requirements, the initial approach was to use a low level of database normalisation and to keep validation-related code in a separate application. This greatly simplified the database design and coding. Reporting was done in an external system in SAS using data exported from the database, which made data entry and database design less dependent on reporting. It would have been possible to follow standard database design techniques instead of using straightforward copies of the paper forms in the database design; the chosen design complicated implementing changes in the database. The choice depends on which tools the personnel are already familiar with and how much existing code can be reused. For some institutions in the ARIVAC consortium, SAS, as a major statistical programming environment, was a natural choice for many tasks in data validation and reporting. Problems arose in meeting deadlines for the verbatim report for the SAE prepared by the local safety monitoring team. Lack of coordination among the data managers and the manual computation of the figures to be included in the report were among the causes of the delay.
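The idea of keeping validation code apart from the database design, as described in the paragraph above, can be sketched as follows. This is an assumption-laden illustration in Python, not the trial's validation application (which is not described in detail here): the record layout, the field names and the functions `check_required`, `check_date_order` and `validate` are all hypothetical. Each record is a flat copy of a paper form, and the rules live in ordinary functions that can be changed without touching the stored data structure.

```python
from datetime import date


def check_required(record, fields):
    """Return one message per required field that is missing or empty."""
    return [f"missing {name}" for name in fields if record.get(name) in (None, "")]


def check_date_order(record, earlier, later):
    """Flag the record if both dates are present and out of order."""
    first, second = record.get(earlier), record.get(later)
    if first is not None and second is not None and first > second:
        return [f"{earlier} is after {later}"]
    return []


def validate(record):
    """Apply all rules to one flat form record (hypothetical field names)."""
    problems = check_required(record, ["child_id", "visit_date"])
    problems += check_date_order(record, "birth_date", "visit_date")
    return problems


# Illustrative record only: the visit is dated before the birth date.
record = {
    "child_id": "C001",
    "birth_date": date(2001, 5, 1),
    "visit_date": date(2001, 3, 1),
}
issues = validate(record)
incomplete = validate({"child_id": "", "visit_date": None})
```

Adding or changing a rule then means editing `validate`, while the form-shaped records themselves stay untouched, which mirrors the trade-off discussed above.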
The trial recruited local statisticians who learned how to use data management and statistical software through regular training provided by the UQ data management experts. Although the CDMS evolved, the locally hired data management team became skilled in operating the system. There was considerable turnover in the data management team, but new staff quickly and efficiently gained knowledge of the system after training sessions conducted by the UQ.
Data collection officially ended in December 2004. Some field trial staff, i.e. study nurses and physicians, remained until the end of 2005 to answer questions arising from the data processing. Data management team personnel, particularly the statisticians, were still available part time until 2009, when the main trial results were published [3]. An additional study of the geographic locations of children’s homes and services in the study area took place in 2008–2009 with core members of the data management team [10]. Because of the skills they had acquired, the local data management team members were able to apply them to this and other new research projects that hired them. This was a great benefit to the research community at the RITM in Manila. Continuity of data management is also important for future use of the data in research, because personnel familiar with the data can resolve researchers' queries quickly and reliably.
At the time of this trial, data collection in developing country settings had to rely on paper CRFs, and the recent development towards Electronic Data Capture (EDC) and/or web-based distributed data collection and management was not considered [11]. The BHS had neither computer systems nor advanced telecommunications systems to support distributed data collection. Therefore, data collection and queries for quality control had to rely on paper forms, phone calls and fax [12].
We have shown that data management requirements were successfully met in this complex, randomized, controlled trial to determine the efficacy of an investigational pneumococcal conjugate vaccine against childhood pneumonia, which was conducted in a developing country setting and involved concurrent satellite studies. Several factors made the task of data management feasible and efficient. First, a pre-trial data management system, established during the preparatory hospital-based and immunogenicity studies that were carried out in the population to obtain important background information about pneumonia, gave the trial a head start on the data management tasks. Secondly, sufficient human resources, such as statisticians, programmers and support staff, were recruited to help in data collection and management. Thirdly, the personnel were trained by experienced senior statisticians and programmers, and refresher training was conducted locally and internationally throughout the study period. Fourthly, there was a health care data infrastructure that used the English language in the health records, which eased clinical data collection in this international project. Finally, a centralized data entry system combined with semi-automated quality control resulted in fairly fast feedback of corrections and high quality data with low error rates.
Although data management was conducted successfully, some problems arose during the trial. Because the data management system that built the final database for data analyses was trusted to be functional and reliable, a full-time statistical expert from the Australian collaborators was not hired for the entire duration of the trial to supervise the local data analyses [13]. We therefore encountered problems inherent to long-distance communication with the international statistical experts. In addition, the distributed administrative responsibilities of the consortium and unclear roles in new situations sometimes delayed decision making. Since the vaccine was an investigational product that was dropped from the further licensure pathway, the producer withdrew from the project and did not support post-project data management or analysis. The need for human resources for post-processing and analysis was underestimated. All this resulted in a one-year delay in providing the first results of the trial (Table ).
Summary table of key learnings