To describe the Psychiatric Emergency Research Collaboration (PERC), the methods used to create a structured chart review tool and the results of our multicenter study.
Members of the PERC Steering Committee created a structured chart review tool designed to provide a comprehensive picture of the assessment and management of psychiatric emergency patients. Ten primary indicators were chosen based on the Steering Committee’s professional experience, the published literature and existing consensus panel guidelines. Eight emergency departments completed data abstraction of 50 randomly selected emergency psychiatric patients, with seven providing data from two independent raters. Inter-rater reliability (Kappas) and descriptive statistics were computed.
Four hundred patient charts were abstracted. Initial concordance between raters was variable, with some sites achieving high agreement and others not. Reconciliation of discordant ratings through re-review of the original source documentation was necessary for four of the sites. Two hundred eighty-five (71%) subjects had some form of laboratory test performed, including 212 (53%) who had urine toxicology screening and 163 (41%) who had blood alcohol levels drawn. Agitation was present in 210 (52%), with 98 (25%) receiving a medication to reduce agitation and 22 (6%) being physically restrained. Self-harm ideation was present in 226 (55%), while other-harm ideation was present in 82 (20%). One hundred seventy-nine (45%) were admitted to an inpatient or observation unit.
Creating a common standard for documenting, abstracting and reporting on the nature and management of psychiatric emergencies is feasible across a wide range of health care institutions.
Of the more than 115 million annual visits to US emergency departments (EDs), approximately one in every 20 is for assessment and treatment of a psychiatric or behavioral emergency [2–5]. Results from the National Hospital Ambulatory Medical Care Survey documented 53 million mental health-related visits to EDs between 1992 and 2001. The annual proportion of psychiatric emergency visits increased over that time frame, from 4.9% of all ED visits in 1992 to 6.3% in 2001. These epidemiological data are corroborated by results from a recent large-scale survey of emergency care practitioners, which also reported an increasing number of ED patients presenting with psychiatric emergencies, resulting in significant increases in emergency psychiatric patients who are waiting in the ED until inpatient beds become available (i.e., “boarding”) [6,7].
The assessment and management of patients presenting to EDs with psychiatric emergencies is complex, but little is known about either decision making processes or the quality of care provided in these settings. There is little information on the prevalence of important problems, such as agitation, administration of medications and use of restraints. The ACEP Clinical Policy Committee  reviewed the available evidence for four critical questions related to managing psychiatric emergencies and, in some cases, found no Class A evidence to address the question. In the absence of evidence, expert consensus may be helpful. Guidelines endorsed by the American Association for Emergency Psychiatry were developed using the Rand consensus methodology [9,10]. Consensus was established around appropriate management of common clinical problems, like agitation and self-harm ideation. Nevertheless, empirical evidence to support these recommendations remains inadequate. Consequently, we created the Psychiatric Emergency Research Collaboration (PERC), a network of mental healthcare providers dedicated to advancing our understanding and management of psychiatric and behavioral emergencies. This inaugural article discusses a chart abstraction project developed to collect standardized clinical information from emergency mental health visits, as well as the results arising from using the abstraction tool in eight US EDs.
This study is a retrospective, structured chart review performed on psychiatric emergency patients presenting to eight PERC hospitals from January 1 through June 30, 2005. Table 1 lists the participating hospitals, including the model of their psychiatric emergency service and psychiatric patient volume during the study period. Each site generated a registry of all patients evaluated by psychiatric staff during the study period. Using a random number generator, the principal investigator (E.D.B.) randomly selected 50 subjects from each site’s registry. Missing charts, repeat visits and those who left without being seen were excluded.
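The selection step can be illustrated with a minimal sketch. It assumes the exclusions (missing charts, repeat visits, patients who left without being seen) are applied before the random draw; the article does not state the order, and the registry field names below are hypothetical.

```python
import random


def select_charts(registry, n=50, seed=None, is_excluded=lambda visit: False):
    """Draw n eligible visits at random, without replacement.

    registry    -- list of visit records for one site (structure is hypothetical)
    is_excluded -- callable returning True for visits that must not be sampled
    seed        -- optional seed so the draw can be reproduced
    """
    rng = random.Random(seed)
    # Assumption: exclusions are applied before sampling, not after.
    eligible = [visit for visit in registry if not is_excluded(visit)]
    if len(eligible) < n:
        raise ValueError("registry has fewer eligible visits than requested")
    return rng.sample(eligible, n)


def default_exclusion(visit):
    """Hypothetical exclusion rule mirroring the three criteria in the text."""
    return (
        visit.get("chart_missing", False)
        or visit.get("repeat_visit", False)
        or visit.get("left_without_being_seen", False)
    )
```

Fixing the seed makes the selection auditable: the same registry and seed always yield the same 50 charts.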
Two independent reviewers at each site used a standardized abstraction form to review each chart. One site used a single reviewer because it did not have the resources to complete the double-review. All site investigators and chart reviewers were trained on the protocol by the principal investigator by teleconference. A protocol describing the data abstraction form, procedures and frequently asked questions was provided. All chart abstractors completed five preliminary abstractions, which were reviewed by the site investigator for quality, accuracy and agreement. If problems were noted, additional charts were reviewed until the site investigator was satisfied the reviewers were reliable and valid in their abstractions.
The completed data abstraction forms were sent to the data coordinating center where they were reviewed and coded by a research assistant. Missing data and irregularities were resolved through query with the site investigators. The institutional review board for each hospital approved the study.
A structured data abstraction form was created using a three stage process. First, three authors (E.D.B., M.H.A., G.W.C.) and another colleague (Chiadi Onyike, MD) identified domains pertinent to the clinical management of emergency psychiatric patients. This was heavily informed by the Expert Consensus Guidelines for the Treatment of Behavioral Emergencies [9,10]. The form was highly structured and coding instructions were defined for each item. As the scope of assessment in emergency settings is ill defined, we included a “not documented” option for most of the items. Ten primary variables of interest were identified by the PERC Steering Committee: (1) laboratory testing, (2) urine toxicology screening, (3) blood or breath alcohol level assessment, (4) signs of agitation, (5) self-harm/suicidal ideation, (6) other-harm/homicidal ideation, (7) abnormal mood, (8) medications administered for agitation, (9) physical restraints used and (10) disposition. These 10 variables were chosen because of their relevance to the practice of emergency psychiatry, as reflected by their centrality in consensus panel and best practice guidelines [6,9–11]. In addition, a variety of other variables were assessed. The eight domains are described in the Domains assessed section.
An initial version of the abstraction form was piloted with 10 charts at one site (E.D.B.). Three independent reviewers, all of whom were psychiatrists or psychologists, abstracted data from the 10 charts, and their responses were compared. Problematic items and response options were revised to reduce ambiguity. The revised form was used with another sample of 50 charts to confirm the reliability with two new independent raters. The inter-rater reliability for the 10 primary items was confirmed by a Kappa >0.80 for each item using the 50 charts. This version was used for the present study.
Demographics, mode of arrival, chief complaint and treatment process times were recorded, including times for initial triage, requesting psychiatric consultation, beginning of the psychiatric evaluation and discharge or transfer.
Abstractors recorded whether the following medical assessments were conducted: laboratory tests, other diagnostic tests (e.g., imaging studies) and physical exam. They also noted if the chart documented an acute medical problem and whether the patient received treatment for one, if it was present. The nature of the medical problem and treatment was described, if present.
Abstractors recorded whether urine toxicology screening and blood alcohol testing were completed, as well as the results of these when conducted.
Abstractors recorded whether each of 14 items taken from the Agitated Behavior Scale (ABS) was documented. The ABS is a well-validated, comprehensive scale used to prospectively assess and quantify agitation. Since no comparable measure has been established for retrospective studies of agitation, we chose to use the ABS.
The abstractors recorded whether intentional self-harm and other-harm were documented, including ideation, intent, plan, past attempts and whether the current ED visit was prompted by such behaviors. If the visit was prompted by self or other harm, the mechanism or behavior was noted.
Eight domains were abstracted, including appearance, alertness, mood/affect, thought processes, percepts/ideas, cognition and meta-cognition. The Mini-Mental State Exam has been recommended, so the frequency of this was also noted.
Abstractors documented the psychiatric interventions each patient received, including medication administration to manage agitation, the use of physical restraints or treatment for an overdose or intoxication. If physical restraints were used, the start and end times were abstracted. If a medication was administered, the type of medication, voluntary or involuntary administration, route, response and adverse events were noted for each administration.
Abstractors recorded whether patients were discharged or admitted. If admitted, the type of treatment setting (medical, standard psychiatric, psychiatric observation unit, substance abuse/detoxification) and admission status (voluntary, involuntary) were abstracted. If discharged, the aftercare plan was abstracted, including the instructions for further psychiatric evaluation or treatment and whether patients were given a prescription for psychotropic medications.
Once all of the data were entered, the principal investigator compared the agreement between the abstractors from each site by calculating inter-rater reliability (Kappa) statistics for each of the 10 primary indicators. Each site that yielded a Kappa of <0.80 for any of the 10 items was required to resolve all discrepancies for all 10 items through a second review of the records.
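The agreement check can be sketched as follows. The article reports only "Kappa" and that SPSS was used; this sketch assumes the two-rater Cohen's kappa, and the function names and data layout (one list of ratings per item, in the same chart order for both raters) are illustrative rather than taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same charts on one item."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must code the same, non-empty set of charts")
    n = len(rater_a)
    # Observed agreement: share of charts where the two codes match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is
        # undefined (0/0). Returning 1.0 here is a convention, not a result.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def site_needs_second_review(site_a, site_b, threshold=0.80):
    """Return (flag, kappas); flag is True if any item falls below threshold.

    site_a / site_b map item name -> list of ratings from each rater.
    """
    kappas = {item: cohen_kappa(site_a[item], site_b[item]) for item in site_a}
    return any(k < threshold for k in kappas.values()), kappas
```

Under the rule described above, a single item below the threshold sends all 10 items back for reconciliation, which is why the helper returns one flag per site rather than one per item.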
Using the fully reconciled data, we calculated descriptive statistics, including proportions, means with S.D. or medians with inter-quartile ranges (IQR). No statistical comparisons were completed. This article is primarily concerned with describing our methods and descriptive results. Analyses describing relationships between variables, including predictors of admission, will be reported in subsequent publications. All analyses were performed using SPSS 15.0.
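For readers reproducing these summaries outside SPSS, the following is a minimal sketch using only the Python standard library. Quartile conventions differ between packages, so the interquartile range computed here may not match SPSS output exactly; the function names are illustrative.

```python
import statistics


def median_iqr(values):
    """Return (median, (q1, q3)) using the 'inclusive' quartile method."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)


def mean_sd(values):
    """Return (mean, sample standard deviation)."""
    return statistics.mean(values), statistics.stdev(values)


def percent(count, total):
    """Proportion expressed as an unrounded percentage."""
    return 100 * count / total
```

For example, `percent(179, 400)` gives 44.75, which the article reports as 45% admitted.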
Of the seven sites that completed double-data abstraction, four had Kappa <0.80 on at least one of the 10 primary indicators and had to reconcile their data using the original source documents. The final Kappas for the 10 items using the fully reconciled data are reported in Table 2.
The mean age of the subjects was 35 years old (S.D.=15 years, range=5–82 years). Two hundred five subjects (51%) were male, 227 (61%) white, 96 (26%) black/African American, 35 (9%) Hispanic and 14 (4%) of other race/ethnicity. One hundred ninety-five (49%) arrived by self-transportation, while 118 (30%) were brought in by emergency medical service and 47 (12%) were brought in by law enforcement. Eighty-five (21%) had no insurance, while 118 (30%) were insured by Medicaid, 58 (15%) by Medicare and 119 (29%) by private insurance. The vast majority, 368 (93%), presented with a psychiatric component to their chief complaint, while 53 (13%) and 12 (3%) also presented with a medical or trauma complaint, respectively. One hundred sixty-six (42%) were treated exclusively by a psychiatric consultant in a medical ED, while 166 (42%) were treated exclusively in a dedicated psychiatric emergency service (PES). The remainder (n=64; 16%) was treated in both the medical ED and the PES.
Table 3 summarizes the 10 primary variables of interest. Table 4 summarizes the treatment process times, while Table 5 presents other miscellaneous variables collected. Laboratory testing was quite common, with 285 (71%) receiving non-toxicology laboratory testing, such as complete blood count and basic metabolic panel. Of the 253 subjects who had a urine toxicology screen, a blood alcohol level drawn or both, 112 (44%) were positive for at least one substance, while 19 (7.5%) screened positive for both illicit drugs and alcohol. A large proportion of subjects had risk factors for harming themselves or others. More than half of our sample (n=226; 55%) had at least some self-harm ideation, while 120 (30%) reported a self-harm plan and 68 (17%) presented because of self-inflicted injuries. Aggressive or other-harm ideation was documented in 82 (20%) subjects, with 56 (14%) visits due to actual or threatened other-harm. Self- and other-harm ideation occurred in combination in 45 (11%) subjects.
Agitation was also common, with at least some signs present in 210 (52%) subjects. Of those with documented agitation, medications were used for agitation in 98 (47%), with 19 (19%) described as involuntary. A response to the medication was documented in only 59 (60%) of those who received one. Restraints were used in 22 (6%) cases for a median of just over 4 h (interquartile range, 2 h 5 min–7 h 35 min). One hundred seventy-nine (45%) subjects were admitted to an inpatient or observation unit, with 83 (46%) admitted involuntarily.
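Several percentages in this paragraph use a subgroup rather than the full sample as the denominator, which is why the same count of 98 medicated subjects appears as 47% here and as 25% in the abstract. The short sketch below recomputes them from the counts reported in the text; the denominators are inferred from the wording, not stated explicitly in the article.

```python
def pct(count, denominator):
    """Percentage of denominator represented by count (unrounded)."""
    return 100 * count / denominator


TOTAL = 400      # charts abstracted
AGITATED = 210   # subjects with at least some signs of agitation
MEDICATED = 98   # subjects given medication for agitation
ADMITTED = 179   # subjects admitted to an inpatient or observation unit

figures = {
    "medicated, of all subjects": pct(MEDICATED, TOTAL),        # 24.5
    "medicated, of agitated subjects": pct(MEDICATED, AGITATED),  # about 46.7
    "involuntary, of medicated": pct(19, MEDICATED),            # about 19.4
    "response documented, of medicated": pct(59, MEDICATED),    # about 60.2
    "involuntary, of admitted": pct(83, ADMITTED),              # about 46.4
}

for label, value in figures.items():
    print(f"{label}: {value:.1f}%")
```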
Our study has implications for research, quality assurance and performance improvement efforts in emergency mental health settings. Our experience reinforced that chart abstractions can be unreliable, even when careful attention is devoted to establishing the tool’s reliability in advance of its use and following rigorous protocols. Our instrument was created by experienced investigators, included domains covered by the Expert Consensus Guidelines for the Treatment of Behavioral Emergencies [9,10], underwent several stages of testing prior to use in the study and assessed similar domains as other studies on quality of care in emergency psychiatry [12,13]. We centralized training for research assistants, required demonstrable proof of accuracy of each reviewer’s abstractions using preliminary reviews, used a written manual and detailed protocol and encouraged “as needed” clarification with the study’s principal investigator. Despite these efforts, four of the seven sites that performed double-reviews yielded at least one item with a Kappa of <0.80. As one might expect, the lack of agreement seemed to plague those domains susceptible to poor or vague documentation, which allowed for greater difference in abstractor interpretation, such as indicators of agitation, mood and self-harm. For example, the original Kappa for the item assessing whether agitation was present during the ED visit was barely above chance, at 0.58 for the entire sample. Fortunately, we incorporated efforts to reconcile discrepant data by having sites re-review the charts using original source documentation, so we have greater confidence in the reliability and validity of our final, reconciled data.
The problem of reliability and validity in chart review methods has broad implications. First, it highlights the lack of standardization in definitions and data collection in emergency settings. The content and format of the charts varied and included multiple sources for some data points. Further, our experience argues for continued development of robust tools with established reliability and validity but also for testing of inter-rater reliability within and across sites. Using a double-abstraction process with reconciliation of discrepant ratings is a labor-intensive strategy but is encouraged to improve the validity of the results of chart review efforts. Indicators that have strong objective documentation and which usually require a physician’s order, such as the use of restraints or medications to manage agitation, will likely be associated with greater reliability and validity. Particular caution should be used when abstracting more subjective domains, such as signs of agitation, abnormal mood and self-harm. Not only can practitioner documentation of these domains be vague and idiosyncratic, but interpretation of the documentation by data abstractors adds another level of complexity.
Substance use is one of several major challenges for emergency services. Substance use disorders generally make up about 30% of mental health-related visits to emergency departments in the United States, with comorbidity between substance abuse and other psychiatric symptoms being very common. Urine toxicology screening and blood alcohol testing are controversial because of increased costs and transit times through the ED. Some proponents advocate routine screening while others argue for clinically indicated testing [17–20]. There remains no clear consensus for the standard of care. Our data do not clearly support either position, but they do demonstrate that substance use screening is common in clinical practice, and that, when completed, the urine toxicology and blood alcohol level screens are positive in more than 40% of patients (112/253).
Doshi et al. have shown that there were approximately 412,000 annual ED visits for self-harm, or 0.4% of all ED visits, but emergency departments do not routinely screen for suicidal ideation or risk. Even for patients with known suicidal ideation or behavior, little is known about the emergency assessment or management of suicidal behavior. Seventeen percent of all visits in this sample were prompted by self-harm, and recent or current suicidal ideation, intent or plans for self-harm were present in more than half of all patients, with one-third of suicidal patients admitting to at least one prior attempt. Considering the frequency of the problem, the field should have well-established standards for suicide assessment and documentation.
Over half of the subjects demonstrated signs of agitation. Almost half of these (47%) received medications to reduce agitation, but for 40% of these, there was no documentation of the response. Similarly, physical restraints were used with 22 subjects, but, contrary to regulation, adequate documentation of start and end times was absent for 9 (41%) of the restrained patients. The lack of documented treatment response in so many cases may indicate that clinicians are not routinely monitoring response by any objective standard or that medical records do not have adequate prompts for documentation of this procedure. These data suggest that studies designed to improve the measurement and management of agitation, and the documentation thereof, are warranted. Alternatively, the procedures for abstracting such data from the chart may be lacking. For example, restraint documentation may appear on a part of the chart that is not easily connected with the primary ED record, like an area of the nursing notes that abstractors could have missed.
Time is important in emergency settings. Our data suggest that while the median time to request a psychiatric consult is 22 min, the median time to begin the psychiatric consultation is 1 1/2 h. The median total length of stay is more than 5 1/2 h, with 25% staying more than 10 h. Our data are consistent with figures published from a study of hospitals in Illinois, which found an average total length of stay for psychiatric patients of approximately 5 h, nearly twice as long as the stay for other ED patients. This may be driven by the need for medical assessment, toxicology or observation. Regardless of the cause, further efforts designed to improve turnaround time and decrease “boarders” are warranted.
Our sites varied in the resources available to them for disposition. On average, only 35% were admitted to a hospital psychiatry service, 54% were released to the community and the remainder was placed in observation units, crisis residences, medical settings or substance abuse treatment facilities. Of those released, a small minority had a specific aftercare appointment (17%) while many had no plan recorded (21%). Beyond state laws that address appropriate criteria for involuntary admission, there are few standards to guide which patients should be admitted or methods for ensuring continuity of care to outpatient settings. Further research on alternatives to inpatient admission and interventions designed to enhance linkage between the ED and outpatient settings are needed.
Different models of providing emergency psychiatric care, such as a consultation-liaison model versus a dedicated PES, may be associated with different outcomes. The heterogeneity both within and between these different models highlights the difficulty in studying the management of emergency mental health issues. For example, Woo et al.  found that several indices of quality care, including timeliness of psychiatric care, completion of mental status exam, pregnancy testing, use of seclusion and elopement improved in one hospital that transitioned from a consultation-liaison model to a dedicated PES. As Table 1 indicates, our sample is weighted with patients treated in settings where a dedicated PES was available. Unfortunately, the study was not designed or powered to examine differences in the variables between the two models. While many clinicians believe that dedicated PESs provide better quality of care than other models, the existing research supporting this claim is tentative and relies on retrospective chart review data. This is an important area for future research, especially when one considers the resource investment in establishing and maintaining a dedicated PES.
There are many features pertaining to the presentation and management of emergency mental health conditions that were not assessed. For example, it would have been useful to collect data on common methods of overdose, like salicylate, acetaminophen and tricyclic antidepressant levels. Unfortunately, we were limited in what we could expect sites to abstract and chose those variables we believed most important and generalizable.
As described in the article, some items on the abstraction form exhibited poor reliability, a common limitation of chart review studies. This varied considerably by site, with some sites submitting highly reliable abstractions and others submitting highly unreliable abstractions. We implemented a second review and reconciliation procedure to help minimize the impact. However, because of the labor required, this process was not applied to items beyond the 10 primary indicators. Their reliability is unknown.
The lack of established benchmarks of care is a major impediment to quality assurance and performance improvement efforts in psychiatric emergency services. The PERC’s efforts represent an attempt to characterize the nature of the emergency mental health problems seen in acute care settings and the manner in which the patients are assessed and managed. Caution should be used when interpreting these statistics, however, considering the small number of sites, the different practice models used (i.e., medical EDs with consult services versus dedicated PESs), and the different contexts in which these sites function (inner-city public hospital vs. suburban private). While our study does not provide a robust benchmarking database, it does demonstrate the ability to study a variety of critical emergency issues across settings. Additional multisite efforts to describe and monitor the current state of care provided for psychiatric emergencies in the United States are urgently needed. Our experiences suggest caution, however, and dedication to rigorous abstraction procedures will be necessary to generate reliable, valid data.