This study demonstrates that incident DKA cases may be identified with greater than 88% PPV using a computer case definition based on inpatient medical care encounter claims with a diagnosis code consistent with DKA. While uncommon, the risk of antipsychotic drug-associated DKA is important to quantify because it is life-threatening, and because DKA may be the first manifestation of any metabolic disturbance after antipsychotic drug initiation [10]. The risk of treatment-emergent metabolic derangements, including DKA, appears to vary according to the specific agent [2]. However, five cases of DKA have been linked with aripiprazole [12], considered to be among the least metabolically liable atypical antipsychotics [11].
Our case definition included only inpatient claims, a decision based on the high likelihood that a majority of DKA cases would require inpatient or emergency medical care. We did not require a primary diagnosis of DKA because the clinical criteria for diagnosing DKA are well established; we therefore had no reason to assume that diagnostic reliability was higher for primary DKA diagnoses than for secondary diagnoses.
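The claims-screening logic described above can be sketched in a few lines. This is a minimal illustration only: the field names and the ICD-9-CM codes shown (250.1x, diabetes with ketoacidosis) are assumptions for the sketch, not the study's actual code list, and real claims processing would involve considerably more detail.

```python
# Illustrative sketch of an inpatient-claims DKA screen.
# Field names and the 250.1x code prefix are assumptions for illustration.

def is_dka_claim(claim):
    """Flag an inpatient claim carrying a DKA diagnosis code in any position."""
    if claim["setting"] != "inpatient":
        return False
    # Primary and secondary diagnoses are treated alike (any-position match).
    return any(code.startswith("250.1") for code in claim["diagnosis_codes"])

claims = [
    {"setting": "inpatient", "diagnosis_codes": ["295.70", "250.13"]},  # secondary DKA code
    {"setting": "outpatient", "diagnosis_codes": ["250.11"]},           # excluded: not inpatient
    {"setting": "inpatient", "diagnosis_codes": ["296.90"]},            # no DKA code
]
print([is_dka_claim(c) for c in claims])  # [True, False, False]
```

Note that the screen deliberately does not privilege the primary diagnosis position, consistent with the rationale above.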
To our knowledge, this is the first attempt to validate a computer case definition for DKA intended for use in pharmacoepidemiological studies with DKA as a study endpoint using automated databases. Given how infrequently DKA occurs, automated databases may be the only efficient means of quantifying the DKA risk associated with specific drug exposures. On the other hand, there are several challenges to conducting health outcomes studies using automated databases, and the potential for misclassification bias is among the most serious of these [6]. Most automated databases, including the one used in our study, are made up of medical encounter and other service utilization data that are collected for purposes other than research, and the quality of the collected data can vary substantially [7]. Thus, computerized medical encounter records are subject to misclassification due to coding errors or other problems [6]. Endpoint misclassification is a particular concern for database studies of medical conditions that are not reliably diagnosed or treated [7]. However, the potential for endpoint misclassification also exists when the endpoint of interest is DKA, a condition that may reliably come to medical attention due to its acuity and severity. Misclassification errors can introduce bias that cannot be overcome using data analytic or other techniques [6]. Therefore, in addition to improving the efficiency of database studies, a validated computer-based DKA endpoint definition is needed to reduce the potential for misclassification bias and improve the validity of study findings.
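The consequence of imperfect PPV can be illustrated with simple hypothetical arithmetic: when false-positive endpoints accrue non-differentially in both exposure groups, the observed rate ratio is pulled toward the null. The counts below are invented solely for illustration.

```python
# Hypothetical numbers: non-differential false positives bias a rate ratio toward 1.

py = 1000                                  # person-years in each exposure group
true_exposed, true_unexposed = 6, 2        # true DKA cases
true_rr = (true_exposed / py) / (true_unexposed / py)          # 3.0

fp = 4                                     # false positives added equally to both groups
obs_rr = ((true_exposed + fp) / py) / ((true_unexposed + fp) / py)

print(true_rr, round(obs_rr, 2))  # 3.0 1.67
```

Because the dilution cannot be undone analytically without external validation data, a case definition with a high, documented PPV is the more direct safeguard.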
Our DKA computer case definition was developed and validated in a single sample of children and youth in Tennessee Medicaid who recently initiated treatment with a psychotropic medication. Although our case definition has face validity, it is unclear how well it would perform in more general populations, including adults and those with existing diagnoses of diabetes mellitus. One might expect the PPV of our DKA case definition to increase among those with established diagnoses of diabetes mellitus, a necessary precondition for the development of DKA. However, for many patients, DKA may be the first manifestation of diabetes mellitus because of delays in the diagnosis and/or treatment of diabetes [17]. One might also suspect that our case definition would perform more poorly in adults, because DKA has classically been regarded as a feature of type 1 (rather than type 2) diabetes [18] and as a more common complication of diabetes mellitus in children and youth than in adults [19]. However, more recent epidemiological studies have documented increases in the occurrence of DKA in adults and among patients with type 2 diabetes [20], although the majority of DKA cases still occur in the setting of type 1 diabetes [21]. Further investigation of our DKA computer case definition in other settings is needed.
Interpretation of our results should proceed with additional limitations in mind. First, our sample size was small, and we were unable to abstract all of the records sought; the precision of our PPV estimates was reduced as a result. Second, we were unable to determine the sensitivity of our DKA case definition because we did not seek to identify cases presenting in the absence of an inpatient diagnosis. We believe this is unlikely to occur for moderate-to-severe DKA cases; however, some patients with mild DKA may be discharged without subsequent hospital admission after receiving appropriate treatment in the emergency department [22]. Moreover, determining sensitivity (the proportion of true DKA cases that the case definition identifies as having DKA) would quantify the performance of the case definition only for those already known to have DKA (established cases). Our objective was to develop a DKA case definition for use in automated database studies, in which suspected (not established) cases would be identified first; our results suggest that, using our definition, a high proportion of these will be true cases. Third, it should be emphasized that our case definition, which relies on inpatient ICD diagnosis codes from Tennessee Medicaid medical claims data that may be encoded days or weeks after discharge, is applicable to retrospective studies that use automated databases as a data source. Other data collection approaches should be considered for studies designed to identify cases prospectively. Finally, while the rate of DKA misclassification was low, the PPV was lower for antipsychotic initiators in our study than for control medication initiators. Our results also suggest that the performance of our case definition may vary somewhat depending on the clinical subgroup under investigation. Larger samples will be needed to determine whether the performance of our case definition varies according to the drug exposure or clinical subgroup of interest.
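In standard screening notation, the two measures contrasted above can be written as follows (with $TP$ = flagged patients confirmed by chart review to have DKA, $FP$ = flagged patients without DKA, and $FN$ = true DKA cases the definition misses):

$$
\mathrm{PPV} = \frac{TP}{TP + FP}, \qquad \mathrm{sensitivity} = \frac{TP}{TP + FN}
$$

Chart review of flagged cases yields $TP$ and $FP$, and hence the PPV; estimating $FN$ would require independently ascertaining DKA cases that the definition never flagged, which is why sensitivity could not be computed in this study.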