Our study has implications for research, quality assurance and performance improvement efforts in emergency mental health settings. Our experience reinforced that chart abstractions can be unreliable, even when careful attention is devoted to establishing the tool's reliability in advance of its use and to following rigorous protocols. Our instrument was created by experienced investigators, included domains covered by the Expert Consensus Guidelines for the Treatment of Behavioral Emergencies [9], underwent several stages of testing prior to use in the study and assessed domains similar to those in other studies on quality of care in emergency psychiatry [12]. We centralized training for research assistants, required demonstrable proof of accuracy of each reviewer's abstractions using preliminary reviews, used a written manual and detailed protocol and encouraged "as needed" clarification with the study's principal investigator. Despite these efforts, four of the seven sites that performed double-reviews yielded at least one item with a Kappa of <0.80. As one might expect, the lack of agreement seemed to plague those domains susceptible to poor or vague documentation, which allowed for greater difference in abstractor interpretation, such as indicators of agitation, mood and self-harm. For example, the original Kappa for the item assessing whether agitation was present during the ED visit was only moderate, at 0.58 for the entire sample. Fortunately, we incorporated efforts to reconcile discrepant data by having sites re-review the charts using original source documentation, so we have greater confidence in the reliability and validity of our final, reconciled data.
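For readers less familiar with the Kappa statistic referred to above, the following is a minimal sketch of how Cohen's Kappa can be computed for one item rated by two abstractors. It is illustrative only and is not the study's analysis code; the function name `cohen_kappa` and the example ratings are assumptions made for this sketch.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters scoring the same charts on one item.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of charts")
    n = len(rater_a)
    # Observed agreement: share of charts on which the two raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal category rates, summed.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; Kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up ratings for "agitation present" on ten charts (not study data).
first_abstractor = ["yes", "yes", "yes", "yes", "yes", "no", "no", "no", "no", "no"]
second_abstractor = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "yes"]
print(round(cohen_kappa(first_abstractor, second_abstractor), 2))
```

In this made-up example the raters agree on 8 of 10 charts, chance agreement is 0.5, and Kappa is therefore approximately 0.6, which illustrates how a seemingly high raw agreement can correspond to a Kappa well below the 0.80 threshold.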
The problem of reliability and validity in chart review methods has broad implications. First, it highlights the lack of standardization in definitions and data collection in emergency settings. Chart content varied across sites and included multiple sources for some data points. Further, our experience argues not only for continued development of robust tools with established reliability and validity but also for testing of inter-rater reliability within and across sites. Using a double-abstraction process with reconciliation of discrepant ratings is a labor-intensive strategy but is encouraged to improve the validity of the results of chart review efforts. Indicators that have strong objective documentation and that usually require a physician's order, such as the use of restraints or medications to manage agitation, will likely be associated with greater reliability and validity. Particular caution should be used when abstracting more subjective domains, such as signs of agitation, abnormal mood and self-harm. Not only can practitioner documentation of these domains be vague and idiosyncratic, but interpretation of the documentation by data abstractors adds another level of complexity.
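The double-abstraction and reconciliation step described above can be sketched as a simple comparison of two independent reviews, flagging the items that need re-review against the source documentation. This is a hedged illustration under assumed data structures, not the procedure or software used in the study; the function `find_discrepancies`, the chart identifiers and the item names are all hypothetical.

```python
def find_discrepancies(first_review, second_review):
    """Return {chart_id: [item, ...]} for items on which two abstractors disagree.

    Each review is assumed to be a dict mapping a chart ID to a dict of
    item name -> abstracted value. Charts reviewed only once are ignored.
    """
    discrepancies = {}
    for chart_id in sorted(first_review.keys() & second_review.keys()):
        a = first_review[chart_id]
        b = second_review[chart_id]
        # An item missing from one review counts as a disagreement.
        items = sorted(item for item in a.keys() | b.keys() if a.get(item) != b.get(item))
        if items:
            discrepancies[chart_id] = items
    return discrepancies


# Made-up abstractions for two charts (not study data).
first = {
    "chart-001": {"agitation_present": "yes", "restraints_used": "no"},
    "chart-002": {"agitation_present": "no", "restraints_used": "no"},
}
second = {
    "chart-001": {"agitation_present": "no", "restraints_used": "no"},
    "chart-002": {"agitation_present": "no", "restraints_used": "no"},
}
for chart_id, items in find_discrepancies(first, second).items():
    print(chart_id, "needs re-review of:", ", ".join(items))
```

A list like this only identifies where the reviewers disagree; resolving each discrepancy still requires returning to the original chart, which is where most of the labor lies.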
Substance use is one of several major challenges for emergency services. Substance use disorders generally make up about 30% of mental health-related visits to emergency departments in the United States [3], with comorbidity between substance abuse and other psychiatric symptoms being very common [16]. Urine toxicology screening and blood alcohol testing are controversial because of increased costs and transit times through the ED. Some proponents advocate routine screening while others argue for clinically indicated testing [17]. There remains no clear consensus on the standard of care. Our data do not clearly support either position, but they do demonstrate that substance use screening is common in clinical practice, and that, when completed, the urine toxicology and blood alcohol level screens are positive in more than 40% of patients (112/253).
Doshi et al. [4] have shown that there were approximately 412,000 annual ED visits for self-harm, or 0.4% of all ED visits, but emergency departments do not routinely screen for suicidal ideation or risk. Even for patients with known suicidal ideation or behavior, little is known about the emergency assessment or management of suicidal behavior. Seventeen percent of all visits in this sample were prompted by self-harm, and recent or current suicidal ideation, intent or plans for self-harm were present in more than half of all patients, with one-third of suicidal patients admitting to at least one prior attempt. Considering the frequency of the problem, the field should have well-established standards for suicide assessment and documentation.
Over half of the subjects demonstrated signs of agitation. Almost half of these (47%) received medications to reduce agitation, but for 40% of those medicated, there was no documentation of the response. Similarly, physical restraints were used with 22 subjects, but, contrary to regulation, adequate documentation of start and end times was absent for 9 (41%) of the restrained patients. The lack of documented treatment response in so many cases may indicate that clinicians are not routinely monitoring response by any objective standard or that medical records do not have adequate prompts for documentation of this procedure. These data suggest that studies designed to improve the measurement and management of agitation, and the documentation thereof, are warranted. Alternatively, the procedures for abstracting such data from the chart may be lacking. For example, restraint documentation may appear on a part of the chart that is not easily connected with the primary ED record, such as an area of the nursing notes that abstractors could have missed.
Time is important in emergency settings. Our data suggest that while the median time to request a psychiatric consult is 22 min, the median time to begin the psychiatric consultation is 1 1/2 h. The median total length of stay is more than 5 1/2 h, with 25% of patients staying more than 10 h. Our data are consistent with figures published from a study of hospitals in Illinois, which found an average total length of stay for psychiatric patients of approximately 5 h, nearly twice as long as the stay for other ED patients [21]. This may be driven by the need for medical assessment, toxicology or observation. Regardless of the cause, further efforts designed to improve turnaround time and decrease "boarders" are warranted.
Our sites varied in the resources available to them for disposition. On average, only 35% of patients were admitted to a hospital psychiatry service, 54% were released to the community and the remainder were placed in observation units, crisis residences, medical settings or substance abuse treatment facilities. Of those released, only 17% had a specific aftercare appointment, while 21% had no plan recorded. Beyond state laws that address appropriate criteria for involuntary admission, there are few standards to guide which patients should be admitted or methods for ensuring continuity of care to outpatient settings. Further research on alternatives to inpatient admission and on interventions designed to enhance linkage between the ED and outpatient settings is needed.
Different models of providing emergency psychiatric care, such as a consultation-liaison model versus a dedicated PES, may be associated with different outcomes. The heterogeneity both within and between these different models highlights the difficulty in studying the management of emergency mental health issues. For example, Woo et al. [13] found that several indices of quality care, including timeliness of psychiatric care, completion of mental status exam, pregnancy testing, use of seclusion and elopement, improved in one hospital that transitioned from a consultation-liaison model to a dedicated PES. Our sample is weighted toward patients treated in settings where a dedicated PES was available. Unfortunately, the study was not designed or powered to examine differences in the variables between the two models. While many clinicians believe that dedicated PESs provide better quality of care than other models, the existing research supporting this claim is tentative and relies on retrospective chart review data. This is an important area for future research, especially when one considers the resource investment in establishing and maintaining a dedicated PES.
Many features pertaining to the presentation and management of emergency mental health conditions were not assessed. For example, it would have been useful to collect data on agents commonly involved in overdose, such as salicylate, acetaminophen and tricyclic antidepressant levels. Unfortunately, we were limited in what we could expect sites to abstract and chose those variables we believed most important and generalizable.
As described in the article, some items on the abstraction form exhibited poor reliability, a common limitation of chart review studies. Reliability varied considerably by site, with some sites submitting highly reliable abstractions and others submitting highly unreliable abstractions. We implemented a second review and reconciliation procedure to help minimize the impact. However, because of the labor required, this process was not applied to items beyond the 10 primary indicators, so the reliability of the remaining items is unknown.