We sought to identify physician, patient, encounter, and billing characteristics associated with the PPV of syndromic surveillance case definitions. Several of the predictors of syndromic surveillance case definition accuracy that we identified are readily accessible to public health departments and other organizations that routinely perform syndromic surveillance. These predictors may be used to reduce syndromic surveillance system false-positive alerts, for example, by focusing on the data most likely to be accurate or by adjusting the observed data for known biases and performing surveillance using the adjusted values; however, future research is needed to quantify the impact of our 'improved' syndrome definitions on surveillance system performance and public health practice.
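One way to read "adjusting the observed data for known biases" is to weight each syndrome-positive claim by its predicted PPV, so that the adjusted daily count is the expected number of chart-confirmed cases. The sketch below illustrates that reading only; it is not the procedure used in this study. The function names and the idea of obtaining per-visit predicted PPVs from a model of the predictors are assumptions made for illustration.

```python
from typing import Iterable


def ppv(true_positives: int, false_positives: int) -> float:
    """Positive predictive value: the share of syndrome-positive claims
    that chart review confirms as syndrome-positive."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when no visits are syndrome-positive")
    return true_positives / flagged


def adjusted_count(predicted_ppvs: Iterable[float]) -> float:
    """Expected number of true syndrome-positive visits.

    Each element is the (hypothetical) predicted PPV for one
    syndrome-positive claim, e.g. from a model that uses physician,
    patient, encounter, and billing characteristics.
    """
    return sum(predicted_ppvs)


# Three syndrome-positive claims with assumed predicted PPVs.
daily_adjusted = adjusted_count([0.9, 0.5, 0.6])
```

Under this reading, a day with three syndrome-positive claims contributes an adjusted count equal to the sum of their predicted PPVs rather than a raw count of three.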
Specifically, we found that visits with a syndrome-positive diagnosis in physician claims were more likely to be confirmed as syndrome-positive by the medical chart when the physician was recently licensed. This finding is similar to those of other, general studies of billing diagnosis accuracy and physician experience [21]. A potential explanation for this finding is that younger physicians may be more likely to give greater attention to billing; also, more experienced physicians may be more likely to 'code from memory', which has been associated with more frequent diagnostic coding errors, as compared to coding from reference materials [23]. Similar to another study [21], we found that physicians with a higher workload on the day of the encounter had lower billing diagnosis accuracy. We also found that claims for less complex patients (i.e., younger and less socially deprived patients) were more likely to be confirmed as syndrome-positive by the medical chart, as compared to those of more complex patients. These findings may be due to higher physician workload and greater patient complexity increasing demands on limited physician resources, taxing working memory and increasing cognitive load, thereby increasing the likelihood of physician errors, including errors in billing diagnosis. Similar to prior studies' finding that common billing diagnoses are more likely to be accurate than rare ones [31], we found that syndrome-positive diagnoses in physician claims were more likely to represent true-positives when the physician had billed several visits for the same syndrome recently. The observation that billing diagnosis accuracy increases with frequency of use can be explained by widely accepted theories on the effect of repetition on recall [40].
We found that billing software had a significant impact on the PPV of syndromic surveillance case definitions: billing diagnoses abstracted from the electronic medical record in an automated manner were more accurate than diagnoses input manually for billing purposes. Although this finding is based upon only a few approaches that we were able to categorize as automatic or manual, it has important implications for both clinical users and public health surveillance. Whereas public health surveillance previously required health practitioners to submit case reports manually, it is now becoming a process where public health agencies automatically extract relevant data from clinical information systems. Indeed, the US federal government has allotted $39 billion to support the adoption and 'meaningful use' of electronic health records, and software purchased using these funds must support automated submission of data to public health agencies for three public health uses, including syndromic surveillance [41]. This investment presents an opportunity to improve syndromic surveillance systems by having electronic health records capture and transmit information on highly influential predictors of case definition accuracy. To this end, a working group of surveillance experts from the US Centers for Disease Control and Prevention and the International Society for Disease Surveillance recently proposed specifications for the data captured by emergency department electronic health records and transmitted to public health [42]; however, this process has yet to take place for community-based ambulatory care settings. Our study findings are directly relevant to the discussion of what data elements should be captured and transmitted by electronic health records from primary care settings to public health under the 'meaningful use' mandate.
Our study had several strengths. It was based on a large representative sample of physicians and patients. We had access to many physician, patient, encounter, and billing characteristics, which enabled us to perform a comprehensive assessment of the impact of a variety of factors on the accuracy of syndromic surveillance case definitions. Whereas some of our findings may be specific to our study population, most of our findings are likely generalizable across North American jurisdictions due to similar physician and patient populations. A limitation of our study was that the number of visits per syndrome was too small to identify predictors of case definition accuracy specific to each syndrome individually. Whereas most of the predictors of case definition accuracy that we identified would be expected to impact all syndrome definitions in a similar manner (e.g., physician workload, patient complexity), some predictors (e.g., season) may have a greater impact on some case definitions than others. Also, it should be noted that our study identified predictors of the PPV of billing diagnoses; therefore, our findings may not be directly applicable to surveillance systems that use different data, such as chief complaints from emergency departments. However, the research methodology described in this manuscript can be used to identify predictors of accuracy of other types of surveillance data.