Objective To generate a reference standard set of representative cases for seven broad syndromic case definitions and several narrower syndromic definitions used in biosurveillance, and to measure the reliability of that set.
Design From 527,228 eligible patients between 1990 and 2003, we generated a set of patients potentially positive for seven syndromes by classifying all eligible patients according to their ICD-9 primary discharge diagnoses. We selected a representative subset of the cases for chart review by physicians, who read emergency department reports and assigned values to 14 variables related to the seven syndromes.
Measurements (1) Positive predictive value of the ICD-9 diagnoses; (2) prevalence of the syndromic definitions and related variables; (3) agreement between physician raters, measured by κ, κ corrected for bias and prevalence, and Finn's r; and (4) reliability of the reference standard classifications, measured by generalizability coefficients.
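As an illustration of the first and third measurements, the following minimal Python sketch computes positive predictive value, Cohen's κ, and the prevalence- and bias-adjusted κ (often abbreviated PABAK) for two raters scoring a binary variable. The function names and the toy ratings are hypothetical and are not taken from the study; Finn's r and generalizability coefficients are not covered here.

```python
from collections import Counter


def observed_agreement(r1, r2):
    """Proportion of cases on which the two raters gave the same label."""
    if len(r1) == 0 or len(r1) != len(r2):
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohen_kappa(r1, r2):
    """Cohen's kappa: agreement corrected for chance using each rater's marginals."""
    n = len(r1)
    p_o = observed_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement: sum over labels of the product of marginal proportions.
    p_e = sum(c1[label] * c2[label] for label in c1) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


def pabak(r1, r2):
    """Prevalence- and bias-adjusted kappa for a binary variable: 2 * p_o - 1."""
    return 2.0 * observed_agreement(r1, r2) - 1.0


def positive_predictive_value(true_pos, false_pos):
    """Share of cases flagged positive that are confirmed positive."""
    return true_pos / (true_pos + false_pos)


# Toy data (hypothetical): a rare finding rated by two physicians on 20 charts.
rater_a = [0] * 18 + [1, 0]
rater_b = [0] * 18 + [0, 1]

print(round(observed_agreement(rater_a, rater_b), 3))
print(round(cohen_kappa(rater_a, rater_b), 3))
print(round(pabak(rater_a, rater_b), 3))
```

In this toy example the raters agree on 18 of 20 charts, yet κ falls slightly below zero because the finding is rare, while the adjusted κ stays high. This sensitivity to prevalence is a common reason for reporting an adjusted κ alongside the unadjusted value.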
Results Positive predictive value for ICD-9 classification ranged from 0.33 for the botulinic syndrome to 0.86 for the gastrointestinal syndrome. We generated between 80 and 566 positive cases for six of the seven syndromic definitions; the rash syndrome exhibited low prevalence (34 cases). Agreement between physician raters was high, with κ > 0.70 for most variables. Ratings showed no bias. Finn's r was > 0.70 for all variables. Generalizability coefficients were > 0.70 for all but three variables.
Conclusion Of the 27 syndromes generated by the 14 variables, 21 showed high enough prevalence, agreement, and reliability to be used as reference standard definitions against which an automated syndromic classifier could be compared. Syndromic definitions that showed poor agreement or low prevalence included febrile botulinic syndrome, febrile and nonfebrile rash syndrome, respiratory syndrome explained by a nonrespiratory or noninfectious diagnosis, and febrile and nonfebrile gastrointestinal syndrome explained by a nongastrointestinal or noninfectious diagnosis.