For several measures critical to vaccine effectiveness studies, this study showed that Boston's immunization registry would have significant shortcomings as a source of immunization data. Problems, discussed in more detail below, included inability to match subjects to registry records and incomplete and inaccurate immunization data. This registry is one of the nation's oldest, so its data may reflect the cumulative impact of both population mobility and historical methods of data capture that are not manifest to the same degree in newer registries. This interpretation is supported by our finding that data for adolescents were generally less reliable than for younger children. Also, reflecting the delivery of pediatric care in Boston, almost all sites participating in this registry are community health or academic medical centers. Nationwide, however, most immunizations are delivered in the private practice setting, with processes and documentation that may differ from those used in the Boston sites. Nevertheless, we doubt that the problems revealed in our study are unique to this registry. Previous studies of registries focusing on vaccination delivery and coverage, rather than on vaccine effectiveness, have shown varying but suboptimal completeness and accuracy [11]. Although the problems we found may (or may not) be particularly severe in this registry, our findings highlight general challenges that may face investigators wishing to use registry-based immunization data for vaccine effectiveness studies.
Matching persons identified with cases of the vaccine-preventable disease under study to their registry immunization records is likely to be a first step in many registry-based vaccine effectiveness studies. In this registry, matching was incomplete in all age groups and was particularly problematic in children who were in the 11–17-year-old age group at their first visit to their primary care provider. We do not know whether these children moved to Boston from another location or transferred from one provider to another within the city, but the finding provides a cautionary note for the use of registry data in studies of adolescents. It might be tempting to include persons with disease in a vaccine effectiveness study regardless of whether they match a registry record, since their immunization records are likely to be available from other sources. However, doing so could bias the estimate of vaccine effectiveness, if the vaccination status of persons who match a registry record is systematically different from that of persons who do not match. For example, persons with a matching registry record might be, on average, more completely vaccinated than persons without a matching registry record. If so, then a study including diseased persons who do not match a registry record would upwardly bias vaccine effectiveness estimates, because these cases would be less likely to be completely vaccinated than the population represented by the registry independent of any protective effect of the vaccine.
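The direction of this bias can be illustrated with a short numerical sketch. It assumes a screening-method style estimate, in which vaccine effectiveness is computed from the proportion of cases vaccinated and the proportion of the reference population vaccinated; this choice of estimator, the function name `screening_ve`, and every number below are hypothetical and are not taken from our data.

```python
def screening_ve(pcv: float, ppv: float) -> float:
    """Screening-method estimate: 1 - (odds of vaccination among cases)
    / (odds of vaccination in the reference population)."""
    return 1.0 - (pcv / (1.0 - pcv)) * ((1.0 - ppv) / ppv)


# Hypothetical inputs, chosen only to illustrate the direction of the bias.
ppv = 0.90                  # proportion vaccinated in the registry population
matched_cases = 80          # cases that match a registry record
matched_vaccinated = 48     # 60% of matched cases are vaccinated
unmatched_cases = 20        # cases with no registry record
unmatched_vaccinated = 6    # 30% of unmatched cases are vaccinated

# Estimate restricted to cases drawn from the population the registry represents.
ve_matched_only = screening_ve(matched_vaccinated / matched_cases, ppv)

# Estimate that also includes unmatched cases, who are less often vaccinated
# for reasons unrelated to any protective effect of the vaccine.
ve_all_cases = screening_ve(
    (matched_vaccinated + unmatched_vaccinated) / (matched_cases + unmatched_cases),
    ppv,
)

print(f"matched cases only: {ve_matched_only:.3f}")
print(f"all cases:          {ve_all_cases:.3f}")
```

Under these assumed inputs, adding the unmatched cases lowers the proportion of cases vaccinated while the reference proportion stays fixed, so the estimate rises even though the vaccine's protective effect has not changed.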
We suspect that two main factors, both with implications for vaccine effectiveness studies, account for the lower registry matching rates of older children. First, the registry was launched in 1992, meaning that the oldest patients enrolled during infancy would have been 13 years old during our evaluation; patients older than 13 were necessarily enrolled at older ages with retrospective entry of their historical immunization data. Vaccine effectiveness studies in adolescents are of current interest, as several new vaccines have recently been recommended for this age group [16]. If subjects in a registry-based study are older than the registry, investigators will need to consider how historical data were handled. Second, the US population has high rates of moving between states or regions [19]; even registries that automatically enroll children at birth could be prone to incomplete matching of children who move into the registry catchment area later in life. Direct transfer of data from EMRs into the registry might be expected to solve the problem of incomplete matching, at least for children served by providers using EMRs. Interestingly, though, using EMR data as a proxy for registry data and comparing these data to the entire medical record, we found that similar rates of failure to match (slightly lower in 0–10-year-olds and slightly higher in 11–17-year-olds) would be expected for children from sites using an EMR. This phenomenon occurred because most provider sites did not create an EMR for all of their existing patients during implementation of the EMR system; a registry record would only be generated through direct data transfer for patients with an EMR.
Patterns of immunization data discrepancy for children who did match a registry record also have important implications for vaccine effectiveness studies. Most discrepancies in the number of pertussis-related immunizations were due to immunizations that were recorded in the provider record but were absent from the registry (Table ), implying that the registry record, not the provider record, was incomplete. We also found that the data on vaccine formulation, manufacturer, and lot number were often either absent from the provider record or, when present, discrepant in the registry. These types of discrepancies have minimal consequences for achieving the registry's primary goal, supporting efforts to achieve and maintain high immunization coverage among young children, aside from time lost in tracking children who appear incorrectly to have immunization delay. For vaccine effectiveness studies, however, these discrepancies could have important implications. The importance of using comparable methods to determine vaccination status of diseased and non-diseased subjects has been emphasized previously [7]; for registry-based studies, this principle implies that, if the registry is the sole source of vaccination information for non-cases or the population as a whole, then it should be the sole source for cases also. Even if immunization data for all study subjects are obtained from a registry, however, to the extent that some immunizations are missing from the registry (and that the pattern of missing immunizations is not related to disease status), vaccine effectiveness estimates will be biased toward the null. Immunization data that are captured and transmitted to a registry from EMRs or by other electronic means should be more accurate than data transmitted by other methods [20], but our data show that, beyond the errors that can occur during actual use of electronic medical records, much information can be lost from paper records during the implementation of the electronic system. Investigators should therefore consider the impact of all the methods of data entry used over a registry's history for studies in which data for any immunizations under study would have been entered using past methods.
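The bias toward the null from immunizations missing independently of disease status can be sketched with a simple cohort calculation. The sketch makes a simplifying assumption: a fixed fraction of truly vaccinated persons is recorded as unvaccinated, whether or not they developed disease. The function name `observed_ve`, the cohort sizes, and the risks are hypothetical and are not estimates from this study.

```python
def observed_ve(n_vax: float, n_unvax: float,
                risk_vax: float, risk_unvax: float,
                miss_frac: float) -> float:
    """Cohort-style VE (1 - relative risk) computed from recorded status when
    a fraction `miss_frac` (< 1) of vaccinated persons is recorded as
    unvaccinated, independent of disease status."""
    cases_vax = n_vax * risk_vax
    cases_unvax = n_unvax * risk_unvax

    # Persons still recorded as vaccinated keep the true vaccinated risk.
    rec_vax_n = n_vax * (1.0 - miss_frac)
    rec_vax_cases = cases_vax * (1.0 - miss_frac)

    # The recorded "unvaccinated" group mixes truly unvaccinated persons
    # with vaccinated persons whose immunizations are missing.
    rec_unvax_n = n_unvax + n_vax * miss_frac
    rec_unvax_cases = cases_unvax + cases_vax * miss_frac

    relative_risk = (rec_vax_cases / rec_vax_n) / (rec_unvax_cases / rec_unvax_n)
    return 1.0 - relative_risk


# Hypothetical cohort: 9,000 vaccinated (1% risk) and 1,000 unvaccinated (5% risk).
ve_true = observed_ve(9000, 1000, 0.01, 0.05, miss_frac=0.0)
ve_with_missing = observed_ve(9000, 1000, 0.01, 0.05, miss_frac=0.2)

print(f"complete records:          {ve_true:.3f}")
print(f"20% of vaccinated missing: {ve_with_missing:.3f}")
```

With these assumed values the estimate falls from about 0.80 to about 0.59, because vaccinated persons recorded as unvaccinated dilute the apparent risk in the comparison group.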
Before relying on registry data for a vaccine effectiveness study, therefore, we suggest that it would be prudent to evaluate the registry's population coverage and data quality. We recognize, however, that conducting such an evaluation may in itself be challenging. Registries are intentionally and appropriately designed with a high level of confidentiality for participating patients and provider sites and so may be prohibited from releasing to investigators the individual-level data needed to conduct such an evaluation [1]. In our case, compliance with registry confidentiality policies meant that we needed to develop a somewhat cumbersome method for reviewing data abstracted from provider records that was labor-intensive for registry managers. However, a registry-based vaccine effectiveness study conducted without this evaluation would almost certainly have produced misleading results.