Design, setting, and subjects
A retrospective review was conducted of charts for infants tested for pertussis by culture, presenting to the pediatric emergency department (ED) of a large urban tertiary care US hospital from 1 January 2003 to 31 December 2007. The ED volume exceeds 50 000 patients per year. The study received institutional review board approval.
Inclusion and exclusion criteria
Subjects included all infants tested for pertussis by culture from 2003 to 2007. If a patient had multiple pertussis cultures from 2003 to 2007, only the first test was included.
Case definition
An infant was defined as pertussis-positive or pertussis-negative based on culture result, which is widely regarded as the gold standard.14 15 Alternate tests like PCR, serology, and direct fluorescent antibody (DFA) were not used in the case definition. Positive culture from a nasopharyngeal specimen is 100% specific for pertussis.4 16 Sensitivity, however, may be limited for several reasons including the organism's fastidious nature, specimen collection technique, when the patient is tested in the course of the illness, and prior or concurrent use of antibiotics.16 17 While PCR may have a better sensitivity, we did not rely on it because there is no FDA-approved test kit available, because test characteristics vary widely by laboratory and because outbreaks have recently been attributed to PCR false positives.4 18 PCR may, in fact, be oversensitive, and requires correlation with at least 2 weeks of cough and paroxysm, whoop or post-tussive emesis,4 which are difficult to assess accurately in a retrospective review. Serology is not recommended for infants, and DFA is not widely available.19
Clinical data collection
Demographics, signs and symptoms commonly associated with infant pertussis, local disease incidence data, and outcomes were collected for each patient.4 20 21 Demographics included visit date, gender, and age (months). Signs and symptoms included cough duration (days), fever duration (days), history of apnea, post-tussive emesis, cyanosis, seizure, and contact with a person with known pertussis. If the record did not contain information about these symptoms, they were coded as absent. Cough descriptors like paroxysm, staccato, and “whoop” were not included because they could not be measured accurately by chart review. Outcome data including antibiotic use, hospitalization, and mortality were collected to help describe the study population.
In the initial review, the pertussis culture result for each patient was obtained from the hospital laboratory information system. Subsequently, the chart abstractor (an attending physician specializing in pediatric emergency medicine) responsible for collecting and entering patient data into structured forms was blinded to the culture result. The culture result was accessible through a unique laboratory link to a PDF file from the external laboratory that performed the culture. These results were kept separate from the portion of the electronic chart documenting the ED clinical encounter. The culture result was not linked to the clinical portion of the chart until after all clinical charts had been reviewed. Historical and physical exam features were based on the electronic medical record (EMR) generated during the ED encounter. Outcome data were collected from the ED EMR, inpatient discharge summaries, and outpatient follow-up visits. To assess inter-rater reliability, a second abstractor (also an attending physician specializing in pediatric emergency medicine) reviewed a random sample of 7% of charts.22 23 The two chart abstractors had over 90% agreement (range 91–97%) and κ values24 ranging from 0.52 to 0.87 for all candidate predictors.
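The agreement statistics reported above can be illustrated with a minimal sketch. It assumes the reported κ is Cohen's κ for two raters on a categorical item, which the text does not state explicitly; the function names and the example ratings are hypothetical and are not study data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of charts on which both abstractors recorded the same value."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each rated independently at their own marginal rates.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings for one predictor (1 = present, 0 = absent).
abstractor_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
abstractor_2 = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
agreement = percent_agreement(abstractor_1, abstractor_2)
kappa = cohen_kappa(abstractor_1, abstractor_2)
```

Raw agreement can be high while κ is moderate when one category dominates, which is consistent with agreement above 90% alongside κ values as low as 0.52.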
Local disease-incidence data collection
A query of the State Laboratory of the Massachusetts Department of Public Health database yielded 19 907 pertussis culture results from patients of all ages over the study period (2003–2007). These data were obtained through a limited data sharing agreement. State data about cultures included date sent and culture result, but not demographics, clinical findings or outcomes.
Aggregate disease incidence variables were created for the number of pertussis cultures performed, the number of positives and the proportion positive at the state laboratory. Each of these variables was tabulated over a range of different timescales: 1–7, 8–14, 15–21, and 22–28 days prior to each visit date. Based on date of presentation, the corresponding public health incidence variables (number of cultures performed, positive, and proportion positive in the prior and cumulative 1–4 weeks) were assigned to each infant.
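The windowed tabulation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the record layout (a date sent and a positive flag), the feature names, and the choice to report a proportion of `None` when no cultures fall in a window are all hypothetical.

```python
from datetime import date, timedelta

# Look-back windows, in days prior to the visit date, as listed in the text.
WINDOWS = [(1, 7), (8, 14), (15, 21), (22, 28)]


def incidence_features(visit_date, state_cultures, windows=WINDOWS):
    """Tabulate state-laboratory cultures in each window before a visit.

    state_cultures: iterable of (date_sent, is_positive) pairs (assumed layout).
    Returns per-window and cumulative counts and proportions positive.
    """
    cultures = list(state_cultures)
    features = {}
    cum_tests = cum_pos = 0
    for start, end in windows:
        earliest = visit_date - timedelta(days=end)
        latest = visit_date - timedelta(days=start)
        in_window = [pos for sent, pos in cultures if earliest <= sent <= latest]
        tests = len(in_window)
        positives = sum(1 for pos in in_window if pos)
        features[f"tests_{start}_{end}"] = tests
        features[f"pos_{start}_{end}"] = positives
        # Assumption: proportion is undefined (None) when no cultures were sent.
        features[f"prop_{start}_{end}"] = positives / tests if tests else None
        # Windows are contiguous, so running totals give the cumulative values.
        cum_tests += tests
        cum_pos += positives
        features[f"cum_tests_1_{end}"] = cum_tests
        features[f"cum_pos_1_{end}"] = cum_pos
        features[f"cum_prop_1_{end}"] = cum_pos / cum_tests if cum_tests else None
    return features


# Hypothetical state records, not study data.
records = [
    (date(2005, 3, 28), True),   # 1 day before the visit
    (date(2005, 3, 22), False),  # 7 days before
    (date(2005, 3, 21), True),   # 8 days before
    (date(2005, 3, 29), True),   # same day as the visit: excluded
    (date(2005, 3, 1), False),   # 28 days before
    (date(2005, 2, 28), True),   # 29 days before: excluded
]
example = incidence_features(date(2005, 3, 29), records)
```

Only cultures sent before the visit date are counted, so each infant's incidence variables reflect information that would have been available at presentation.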
Building the decision models
The same sequence of steps was used to build three decision models: (1) “clinical only” model—candidate predictors included only clinical data based on demographics, history, and physical exam; (2) “local disease incidence” model—candidate predictors included only public health incidence data; and (3) “contextualized” model—all clinical and public health predictors were considered.
Variable discretization and selection
Dichotomous variables (history of apnea, post-tussive emesis, cyanosis, and seizure) associated with positive pertussis culture in the clinical data set were identified. Significance of association was tested with a χ² goodness-of-fit test (p<0.05). Continuous variables (duration of cough, duration of fever, and local disease incidence variables) were dichotomized at categorical cut-offs considered by the clinical investigators to be clinically useful and easy to remember (eg, cough at least 1 week, presence of fever, and proportion positive past 21 days >0.10).
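As an illustration of this step, the sketch below dichotomizes a continuous variable at a cut-off and computes Pearson's χ² statistic for a 2×2 table, without continuity correction. Whether this matches the exact test the authors ran is an assumption; the function names and the example values are hypothetical.

```python
import math


def dichotomize(value, cutoff, inclusive=True):
    """Return 1 if value meets the cut-off, else 0.

    inclusive=True models "at least" (eg, cough at least 7 days);
    inclusive=False models "greater than" (eg, proportion positive > 0.10).
    """
    return int(value >= cutoff) if inclusive else int(value > cutoff)


def chi_square_2x2(predictor, outcome):
    """Pearson chi-square (1 df, no continuity correction) and its p value."""
    pairs = list(zip(predictor, outcome))
    a = sum(1 for x, y in pairs if x and y)          # predictor+, culture+
    b = sum(1 for x, y in pairs if x and not y)      # predictor+, culture-
    c = sum(1 for x, y in pairs if not x and y)      # predictor-, culture+
    d = sum(1 for x, y in pairs if not x and not y)  # predictor-, culture-
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("a row or column of the 2x2 table is empty")
    chi2 = n * (a * d - b * c) ** 2 / margins
    # For 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical cough durations in days, dichotomized at "at least 1 week".
cough_days = [3, 10, 7, 1, 14, 21, 2, 0]
cough_week = [dichotomize(days, 7) for days in cough_days]
```

A variable would pass the univariate screen described above when the returned p value is below 0.05.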
In the multivariate analysis, candidate variables were entered into a forward stepwise logistic regression to identify independent predictors of infants testing pertussis positive. The p value cut-offs for entry into and removal from the logistic regression model were 0.25 and 0.10, respectively.
For the local disease incidence model, each variable was considered for entry into the model as an independent predictor. Because of the interdependence of these variables, it was established a priori that no more than one candidate incidence variable would be contained in the final model. Thresholds were defined for the numbers of tests performed, positive, and proportion positive over 1–4 weeks. For proportion positive, thresholds were tested from 0.01 to 0.20 in increments of 0.01.
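The threshold grid for proportion positive can be sketched as below. This only builds the candidate dichotomized variables; it does not reproduce the regression step. The strict "greater than" comparison follows the ">0.10" example given earlier in the text, and coding a missing proportion as 0 is an assumption, as are the function and variable names.

```python
def candidate_thresholds(low=1, high=20, scale=100):
    """Thresholds 0.01 to 0.20 in steps of 0.01.

    Built from integers to avoid floating-point drift from repeated addition.
    """
    return [i / scale for i in range(low, high + 1)]


def build_candidates(proportions, thresholds=None, label="prop_pos_21d"):
    """Map each threshold to a 0/1 indicator per infant.

    proportions: proportion positive assigned to each infant, or None when
    no cultures fell in the window (coded 0 here, which is an assumption).
    """
    if thresholds is None:
        thresholds = candidate_thresholds()
    candidates = {}
    for threshold in thresholds:
        name = f"{label}_gt_{threshold:.2f}"
        candidates[name] = [
            int(p is not None and p > threshold) for p in proportions
        ]
    return candidates


# Hypothetical proportions for four infants, not study data.
candidates = build_candidates([0.05, 0.10, 0.12, None])
```

Each resulting indicator would then be offered to the model as a separate candidate, with at most one incidence variable kept in the final model as stipulated a priori.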
For the contextualized model, each clinical and local disease incidence variable was considered for entry into the multivariate model. Variables not included in the final clinical only or final local disease incidence model were still considered for inclusion into the contextualized model.
Validation
After selecting the best final model for each analysis (clinical, local disease incidence, or contextualized), a bootstrap validation was performed. Predictors that were selected in over 50% of the 1000 bootstrap samples were retained in the final model.25 26 27
Measurement of model performance
Sensitivity, specificity, positive and negative predictive values, area under the receiver operating characteristic (ROC) curve (AUC), and percent correct classification were used to compare performance. The best model was defined as the one with the greatest specificity among those with the highest sensitivity, in order to minimize missed pertussis cases while also limiting misclassification of infants without pertussis.
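The classification measures and the model-selection rule above can be sketched as follows. AUC is omitted because it needs continuous scores rather than binary predictions. Reading the rule as "rank by sensitivity, then break ties by specificity" is an interpretation of the text, and all names and example data are hypothetical.

```python
def confusion_counts(predicted, actual):
    """Count true/false positives and negatives against culture result."""
    tp = fp = tn = fn = 0
    for pred, truth in zip(predicted, actual):
        if pred and truth:
            tp += 1
        elif pred and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    # Undefined ratios (empty denominator) are reported as None.
    return numerator / denominator if denominator else None


def performance(predicted, actual):
    """Sensitivity, specificity, predictive values, and percent correct."""
    tp, fp, tn, fn = confusion_counts(predicted, actual)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "percent_correct": _ratio(tp + tn, tp + fp + tn + fn),
    }


def best_model(results):
    """Pick the model with highest sensitivity, ties broken by specificity.

    Assumes both measures are defined (not None) for every model.
    """
    return max(
        results,
        key=lambda name: (results[name]["sensitivity"], results[name]["specificity"]),
    )


# Hypothetical culture results and model predictions, not study data.
culture = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
results = {
    "model_a": performance([1, 1, 1, 1, 1, 1, 0, 0, 0, 0], culture),
    "model_b": performance([1, 1, 1, 1, 0, 0, 0, 0, 0, 0], culture),
    "model_c": performance([1, 1, 0, 0, 0, 0, 0, 0, 0, 0], culture),
}
chosen = best_model(results)
```

Ranking on sensitivity first reflects the stated priority of not missing pertussis cases; specificity only separates models that are equally sensitive.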
Comparing clinician performance with decision models
Clinicians' actual performance was compared with the clinical, local disease incidence, and contextualized models by measuring correct classification. Whether a clinician classified an infant correctly was judged by the use or omission of antibiotics in the clinical encounter. The clinical actions taken, as determined by chart review, were compared with what would have been recommended by each of the three decision models.