Educating physicians and other health care professionals to identify and treat patients who drink above recommended limits is an ongoing challenge.
An educational randomized controlled trial (RCT) was conducted to test the ability of a stand-alone training simulation to improve the clinical skills of health care professionals in alcohol screening and intervention. The “virtual reality simulation” combines video, voice recognition and non-branching logic to create an interactive environment that allows trainees to encounter complex social cues and realistic interpersonal exchanges. The simulation includes 707 questions and statements and 1207 simulated patient responses.
A sample of 102 health care professionals (10 physicians; 30 physician assistants [PAs] or nurse practitioners [NPs]; 36 medical students; 26 pharmacy, PA or NP students) were randomly assigned to no training (n=51) or a computer based virtual reality intervention (n=51). Subjects in both groups had similar pre-test standardized patient alcohol screening skill scores – 53.2 (experimental) vs. 54.4 (controls), 52.2 vs. 53.7 alcohol brief intervention skills, and 42.9 vs. 43.5 alcohol referral skills. Following repeated practice with the simulation there were significant increases in the scores of the experimental group at 6 months post-randomization compared to the control group for the screening (67.7 vs. 58.1, p<.001) and brief intervention (58.3 vs. 51.6, p<.04) scenarios.
The technology tested in this trial is the first virtual reality simulation to demonstrate an increase in the alcohol screening and brief intervention skills of health care professionals.
Training health care professionals to ask or talk with patients about substance use, exposure to interpersonal violence, sexual practices and other sensitive topics is an ongoing challenge. Despite the critical importance of these areas several studies have shown that physicians infrequently ask about these topics as part of routine care. For example, in a study of primary care practices, patients with alcohol dependence received the recommended quality of care, including assessment and referral to treatment, only about 10 percent of the time.(10) Fiore, et al. (2000) reported that a population-based survey found that less than 15% of smokers who saw a physician in the last year were offered assistance and only 3% received a follow-up appointment to address tobacco use.(5) In a survey conducted by Elliot (2002), 10% of physicians reported screening for domestic violence and only 6% screened all of their patients.(4) In a Canadian survey over 80% of the physicians felt they had adequate or excellent medical training in assessing risk behaviors for heart disease and STD risk. The proportion who felt this way about their training in screening for substance use disorders, family violence and sexual abuse ranged between 12.7% and 31.6%.(9)
Traditional educational methods utilized to increase clinical skills of students and practitioners in these difficult clinical topics include lectures, case-based discussions, evidence-based journal reviews, role plays, videotape playback reviews, E-Learning and standardized patients.(11) Bowman and colleagues (2,13) utilized simulated patients to improve physician performance on the prevention of sexually transmitted diseases. They were able to demonstrate significant improvements in clinical practice. However, these methods give learners limited opportunity to practice new clinical skills repeatedly, which is a significant limitation. Students need the opportunity to repeat the words in multiple clinical situations, to observe patients’ reactions to sensitive questions, to practice behavioral intervention statements over and over again, and to receive direct and immediate feedback.
Highly interactive role-play simulations have been shown to improve training effectiveness and “boost learning retention rates dramatically.”(1,7) Virtual reality simulations offer many potential advantages over traditional educational methods. These advantages include: a) allowing learners to practice the simulation multiple times, b) the ability to ensure that learners receive a different patient response to the questions and behavioral statements for each virtual reality play, c) activated voice response that permits the learner to verbally ask questions and conduct brief intervention with what becomes a “real” clinician – patient interaction, d) portability, which lets the learner play the simulation at any time and in any location, e) the ability of the program to score the performance of the learner and provide immediate feedback, both from the patient and a computerized “coach”, f) a playback feature which can replay the clinician-patient interaction, repeating good interactions or trying purposeful mistakes; g) providing educational screens that give the learner access to read about basic screening and counseling skills prior to or during the simulation, h) the option of offering the learner course credit or continuing education credits by linking the simulation to an on-line internet connection.(15)
The goal of the project was to produce and test a self-contained, off-the-shelf virtual reality simulation system for health care professionals to improve their clinical skills in the areas of alcohol screening, brief alcohol interventions and referral for at-risk, problem and dependent drinkers. This report presents the results of an educational trial designed to test the ability of this system to improve the clinical skills of students and primary care clinicians. A successful demonstration of effect would lead to the development of additional “virtual reality” training simulations for other sensitive behavioral issues such as screening and intervening with underage drinkers, tobacco addiction, illicit drug use, non-prescription opioid abuse, interpersonal violence, sexual risk reduction, and suicide ideation.
A randomized controlled educational trial was conducted to test the hypothesis of interest. 102 health care professional students or practitioners were assigned to the experimental virtual reality simulation program or to a no education control group. The intervention was based on SIMmersion simulation technology.(12,13) Subjects assigned to the intervention group were expected to read the educational materials and to practice the simulation at least 10 times on their personal computer, over the three-month study period. The primary outcomes of interest were changes in the clinical skills of the participants. Clinical skills were assessed using standardized patients. Each participant was tested with three different case scenarios at baseline and six months post-randomization. The case scenarios were developed specifically for this study as there are no standard case scenarios that have been tested and validated for alcohol screening, brief intervention and referral. These scenarios build on prior work conducted by the PI’s research group.(8)
Participants were initially recruited by email and invited to participate in an educational study focused on alcohol screening and brief intervention. Eligible participants included physicians, residents, nurse practitioners (NPs), physician assistants (PA’s), NP/PA students, medical students and pharmacy doctoral students at the University of Wisconsin in Madison. 14% (102 out of 731) of the students and clinicians who were sent a blanket email participated in the trial. The primary groups were fourth year medical students and primary care physicians. Subjects who responded to the email invitation were contacted by telephone and screened to determine eligibility. These criteria included 18 years of age or older; currently enrolled in a professional training program or currently practicing medicine, nursing or pharmacy; ability to practice the required number of plays for the simulation; and availability to complete the six months post-test at the testing facility in Madison.
Subjects who met these criteria were then scheduled to participate in the baseline standardized patient skills test. Written informed consent was obtained at the time of the baseline testing scenario. Randomization also occurred at this time. The study was approved by the UW Madison Health Sciences Human Subjects Committee. Subjects in both groups were paid $50 for completion of the pre-test and $50 for the post-test. Intervention subjects received an additional $10 for each play up to 10 plays. Eight subjects who completed all aspects of the trial were randomly selected to receive an additional $500 for their participation at the end of the trial. The subjects in the control group were given the virtual reality simulation program and voice recognition equipment at the end of the 6-month post-test.
The testing scenarios were developed by the study research team. Three standardized cases (screening, intervention, and referral) and rating methods were used for the pre-test and a separate set of three cases for the post-tests. The pre-test screening case was a 43 year-old salesman who presented to a new physician for a blood pressure medication refill and evaluation of hypertension. He had a stressful job in a new location, had about four drinks most nights after work and occasionally had up to six to eight drinks. He was annoyed by his wife’s concern about his drinking and had tried to cut down, but wasn’t successful. The pre-test brief intervention case was a 48 year-old single female financial advisor who presented to a new physician for trouble sleeping, awakening often each night. She had three to four drinks each night, recently missed a crucial morning business appointment, and sprained her ankle when drinking. She had been diagnosed in the past as a problem drinker, but didn’t believe it. The pre-test referral case was a 48 year-old divorced female who had been a licensed practical nurse (LPN) for 30 years. She presented to her physician for a refill of a Valium prescription to help her relax and sleep. She had five to six drinks every night, her father and brother were alcoholics, and she had been irritable at work. She didn’t believe she was an alcoholic and wouldn’t go to AA. She was a dependent drinker.
The post-test screening case was a 40 year-old male salesman for a manufacturing company, who presented to his physician to have stitches removed from an injury suffered in a bar fight. He had about four drinks most nights after work, six to eight on an occasional heavy night, was annoyed by his wife’s concern about his drinking and had hangovers several times each month. The post-test brief intervention case was a 48 year-old female with two married children and three small grandchildren. Her marriage was failing and she did volunteer work for several organizations. She had three to four drinks most nights and her children were concerned about her drinking and were reluctant to have her babysit on weekends for her grandchildren. She had tried to cut down and her father and brother were alcoholics. She was a problem drinker. The post-test referral case was a 47 year-old woman who owned her own business and presented to her physician for increasing fatigue and trouble sleeping. She had three to five drinks per night, and lived with a woman partner who was concerned about her drinking. Her father was an alcoholic and she had tried to cut down but was unsuccessful. She was a dependent drinker. The scoring items for each of the six scenarios are listed in Tables 3, 4, and 5.
The University of Wisconsin School of Medicine and Public Health Clinical Teaching and Assessment Center (CTAC) was formally established in 1994 and has an active Standardized Patient program for research, teaching and clinical assessment. Standardized Patients (SP’s) for this project were drawn from a pool of 90 persons, who had previously participated as standardized patients and were selected based on longevity of experience, interest in the project, time availability and background in a health care-related field. There were nine used for the pre-test and ten for the post-test. The SP’s included six women and four men. Their ages ranged from 38–60 and they had been with the SP Program for up to four years.
The CTAC director was the primary trainer who taught the SP’s to portray the cases and score the clinical skills of the subjects. Each SP participated in two 2-hour training sessions that focused on practicing their assigned role. The CTAC training director provided feedback and comments to every SP. Each SP was expected to portray one case scenario for the pre-test and one for the post-test. SP’s were also asked to attend several meetings with the researchers to assist in script development as well as to refine checklist items so they were descriptive, clear, and intuitive to the SP’s. The research team created detailed scripts as well as directions for the checklist completion based on these discussions. The close training with, and collaboration between, the SP’s and members of the research team resulted in an open atmosphere where SP’s questions could be readily asked and addressed.
A pilot test of the cases, role plays, checklists, technology, and logistics took place in September 2007 with three subjects and eight SP’s either portraying roles or viewing the sessions. None of the three subjects involved in the pilot participated in the trial. Further modification of the scripts and scoring items were facilitated by immediate post-pilot discussion and videotape review with the SP’s and the research team observing the pilot and subjects.
Ten sessions were used to conduct the pre-test standardized patient scenarios over a six-week period in November and December of 2007, with 8–12 subjects tested during each session. Each of the 102 participants interviewed three standardized patients. The participants were given five minutes to read an abbreviated medical record on the outside of the door prior to entering the exam room and interviewing the SP. Each research participant had 15 minutes to interview and/or counsel the SP. The SP had five minutes to score the checklist before the next clinician entered the room. Research subjects completed the pre-test in 60 minutes. The post-test sessions occurred in April and May of 2008 using similar testing methodology. All subject-SP sessions were videotaped. A 20% sample of the videotaped sessions was reviewed by a panel of members of the research team to determine validity of the SP scoring. The reviews found strong agreement between the ratings of the SP’s and research team. The degree of agreement (Kappa) between the standardized patient ratings and the expert panel’s 20% review was 0.95.
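For readers unfamiliar with the agreement statistic, the following is a minimal sketch of how a kappa coefficient can be computed for two raters scoring the same yes/no checklist items. It assumes Cohen's kappa (the article does not state which variant was used), and the ratings shown are made up for illustration; they are not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical checklist ratings (1 = skill demonstrated, 0 = not demonstrated).
sp_scores = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
panel_scores = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(sp_scores, panel_scores)
print(round(kappa, 3))
```

Kappa corrects raw percent agreement for the agreement two raters would reach by chance, which is why it is usually lower than the simple proportion of matching scores.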
The virtual reality simulation software used for this study was developed by Dr. Dale Olsen at the Johns Hopkins University Applied Physics Laboratory. The stand-alone software modified for this study consisted of six elements. The first element was an educational program that could be read by the participant to give basic background on alcohol screening and intervention. This was based on the National Institute on Alcohol Abuse and Alcoholism (NIAAA) 2005 clinician’s guide and consisted of 20 screens of text. The second element consisted of 707 questions and statements learners were able to ask the simulated patient, to conduct counseling or make a referral to a treatment center. The third element was a set of 1,207 responses developed for the simulated patient to respond to the learner (an actress video recorded these responses in a production studio over a five-day period with varied mood and affect). An algorithm embedded in the program was programmed to respond based on the type and appropriateness of the question or statement made by the learner, as well as the history of the conversation between the character and the learner. Negative or inappropriate questions or statements were linked to audio and visual anger or mood changes in the simulated patient.
The fourth element was the on-screen help agent. The agent is an action figure in the corner of the screen who intermittently displays hand and body signals to indicate especially good or not-so-good questions by the learner. The fifth element was an instant replay feature. The sixth element was scoring and feedback to the learner on their performance during the play. The basic computer screen for the simulation is presented in Figure 1. For this simulation the program was designed so that 40% of the time the character in the simulation would be an at-risk drinker, 40% a problem or dependent drinker and 20% a low-risk drinker (based on criteria developed by NIAAA).
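The 40/40/20 mix of character types described above can be illustrated with a short weighted random draw. This is only a sketch: the category labels, the function name `draw_drinker_type` and the seed are assumptions made for the example, and the actual simulation's selection mechanism is not described in this report.

```python
import random

# Character types and the proportions stated in the text.
DRINKER_TYPES = ["at-risk", "problem or dependent", "low-risk"]
WEIGHTS = [0.40, 0.40, 0.20]


def draw_drinker_type(rng):
    """Pick the character type for one play using the stated proportions."""
    return rng.choices(DRINKER_TYPES, weights=WEIGHTS, k=1)[0]


# Simulate many plays to check that the long-run shares match the design.
rng = random.Random(2008)  # fixed seed so the sketch is repeatable
draws = [draw_drinker_type(rng) for _ in range(10_000)]
shares = {t: draws.count(t) / len(draws) for t in DRINKER_TYPES}
print(shares)
```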
Subjects were asked to conduct face-to-face alcohol screenings, brief interventions and referrals with the simulated character using a microphone or a computer mouse to communicate. The questions and statements were scripted to include a variety of natural choices. For each scripted question or statement, there are multiple simulated character responses available. The simulated character’s “brain” selects a response based on the level of rapport developed by the subject, the character’s risk level, previously discussed information, and chance. For example, if the subject selects a series of inappropriate options, the simulated character will become curt and uncooperative; if, however, the subject selects a series of appropriate options, the character will become friendly and forthcoming. This realistic emotional variation allows the simulated character to emulate actual human behavior.
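The four inputs named above (rapport, risk level, previously discussed information, and chance) can be combined in many ways. The sketch below shows one simple possibility and is purely illustrative: `CharacterState`, `RESPONSES`, `respond`, the rapport rule and the sample lines are all hypothetical and are not taken from the SIMmersion software.

```python
import random
from dataclasses import dataclass, field


@dataclass
class CharacterState:
    risk_level: str                      # fixed for the whole play
    rapport: int = 0                     # rises and falls with the learner's choices
    discussed: set = field(default_factory=set)


# Hypothetical response bank keyed by (topic, risk level, mood).
RESPONSES = {
    ("quantity", "at-risk", "friendly"): [
        "Usually three or four after work.",
        "Maybe four on a normal night.",
    ],
    ("quantity", "at-risk", "curt"): ["A few.", "I don't count."],
}


def mood(state):
    """Assumed rule: negative rapport makes the character curt."""
    return "friendly" if state.rapport >= 0 else "curt"


def respond(state, topic, appropriate, rng):
    """Update rapport, then choose a reply using state, history and chance."""
    state.rapport += 1 if appropriate else -1
    if topic in state.discussed:
        return "We already talked about that."
    state.discussed.add(topic)
    candidates = RESPONSES[(topic, state.risk_level, mood(state))]
    return rng.choice(candidates)  # chance picks among the eligible replies


state = CharacterState(risk_level="at-risk")
print(respond(state, "quantity", appropriate=False, rng=random.Random(0)))
```

Because the reply depends on accumulated state rather than on a fixed tree of dialogue paths, two plays that ask the same question can still produce different answers.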
Subjects received feedback from an on-screen help agent who provided non-verbal cues regarding the user’s choice of questions. In addition, the subject could click help buttons for assistance with question choices and character responses. The system scored the subject’s performance, and an instant replay feature enabled users to review portions of dialogue or their entire conversation. At the end of the conversation, subjects would have to decide which type of drinker (randomly selected by the program) they were talking to and were scored on their accuracy.
Each subject was expected to screen the patient for alcohol use and decide if the patient required a brief intervention and a referral for treatment. Since each of the three patients would require varying amounts of time, a completed play was based on the number of statements and questions that were used by the subject, with 10 statements set as the minimum requirement for one play. The guidelines could be accessed at anytime during the play as a reference for the subject.
Plays were tracked using SIMmersion’s on-line tracking system, with the goal being that each experimental subject would play at least 10 times in the 3–4 months prior to the final post-test. After pilot testing by the expert panel, there was a general consensus that 10 plays was a minimum number of plays in order to take advantage of the various patient scenarios and responses built into the simulation. The experimental subjects were contacted by research staff once during the practice period to ensure they were using the simulation software correctly and to aid in solving any issues.
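The practice requirement can be expressed as two thresholds: a session counts as a completed play if it contains at least 10 statements or questions, and the goal was at least 10 completed plays. A minimal sketch of that bookkeeping follows; the function names and the session counts are made up for illustration and do not reflect the actual tracking system.

```python
MIN_STATEMENTS_PER_PLAY = 10  # minimum statements/questions for a completed play
TARGET_PLAYS = 10             # goal for each experimental subject


def completed_plays(statements_per_session):
    """Count the sessions that qualify as a completed play."""
    return sum(1 for n in statements_per_session if n >= MIN_STATEMENTS_PER_PLAY)


def met_practice_target(statements_per_session):
    """True if the subject reached the target number of completed plays."""
    return completed_plays(statements_per_session) >= TARGET_PLAYS


# Hypothetical log: number of statements used in each of 12 sessions.
sessions = [12, 9, 25, 10, 31, 14, 18, 7, 22, 16, 11, 13]
print(completed_plays(sessions), met_practice_target(sessions))
```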
The questions and simulated patient responses were written and developed by an expert panel at the University of Wisconsin. The expert panel consisted of 14 UW Madison primary care clinicians, addiction medicine physicians, psychologists and members of the SIMmersion team with significant expertise in substance abuse. Members of the panel also played and pilot-tested a number of versions of the virtual reality simulation as it was developed. The simulation took approximately nine months to develop prior to testing.
The demographic variables for experimental and control groups were described by way of frequencies (%). Univariate analysis was used to assess potential differences on gender, age, clinician vs. student status and prior alcohol training. As noted in Table 1 there were no significant differences between groups on these four variables.
The primary outcomes were the Intervention, Screening and Referral scores from the standardized patient scenarios. As noted in Tables 3, 4 and 5, the clinical skills of the subjects were assessed using a set of clinical criteria developed for the screening, brief intervention and referral scenarios. Eighty-five points were allocated to 17 specific skills criteria, with five points for each. The standardized patients were instructed to score each skill as a simple yes (the learner demonstrated the skill) or no (the learner did not). Subjects received five points or zero points for each skill. No partial credit was given. Fifteen points were used to rate the clinician’s overall performance. These scores were aggregated to create a total score (0–100) for each scenario and described with means and standard deviations.
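The scoring arithmetic described above (17 yes/no skills at five points each, plus up to 15 points for overall performance) can be sketched as follows. The function name and the example checklist are hypothetical, and the sketch assumes the overall rating is a whole number from 0 to 15.

```python
N_SKILLS = 17         # yes/no checklist items per scenario
POINTS_PER_SKILL = 5  # all-or-nothing, no partial credit
MAX_OVERALL = 15      # points available for overall performance


def scenario_score(skills_demonstrated, overall_rating):
    """Total score (0-100) for one standardized patient scenario."""
    if len(skills_demonstrated) != N_SKILLS:
        raise ValueError(f"expected {N_SKILLS} checklist items")
    if not 0 <= overall_rating <= MAX_OVERALL:
        raise ValueError(f"overall rating must be between 0 and {MAX_OVERALL}")
    skill_points = POINTS_PER_SKILL * sum(bool(s) for s in skills_demonstrated)
    return skill_points + overall_rating


# Hypothetical subject: 11 of 17 skills demonstrated, overall rating of 9.
checklist = [True] * 11 + [False] * 6
print(scenario_score(checklist, 9))  # 11 * 5 + 9 = 64
```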
Ninety-one out of 102 subjects completed the six months post-test. The primary reasons the eleven subjects did not complete the post-test included scheduling conflicts with patient care and course work, relocation and illness. Intention to treat analysis was followed. Baseline scores were imputed to the missing data in the post test scores for the 11 subjects who did not complete. We elected to use the most conservative method to handle missing follow-up data. All 102 subjects originally randomized into the trial were included in the outcome analysis.
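The imputation rule described above, in which a subject with no post-test is assigned his or her own baseline score, amounts to assuming no change for non-completers. A minimal sketch, using made-up scores and a hypothetical function name:

```python
def impute_baseline(pre_scores, post_scores):
    """Replace each missing post-test score (None) with that subject's pre-test score."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post lists must describe the same subjects")
    return [pre if post is None else post
            for pre, post in zip(pre_scores, post_scores)]


# Hypothetical data for five subjects; None marks a missed post-test.
pre = [50, 55, 60, 45, 70]
post = [65, None, 62, None, 80]
post_itt = impute_baseline(pre, post)
print(post_itt)  # [65, 55, 62, 45, 80]
```

This approach is conservative for the experimental group because every imputed subject contributes a change of zero, which pulls the estimated improvement toward no effect.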
The mean values for the experimental and control groups on the post-test were compared with t-tests to derive the effect size for the educational trial on the three scenarios. T-tests were executed separately for all Intervention, Referral and Screening scale items. The sample was too small to assess a dose-response effect between the number of plays and changes in clinical behavior. We stratified the results on each of the covariates in Table 1 and found no statistical association between these variables and the primary outcomes. The analyses were performed with SAS version 9.1 for Linux (SAS Institute Incorporated, 2002–2003).
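As an illustration of the group comparison, the sketch below computes a two-sample t statistic and a standardized effect size using only the Python standard library. It assumes a pooled-variance (Student) t-test and Cohen's d; the article does not specify which t-test variant or effect size measure was used, the analysis itself was run in SAS, and the scores here are invented.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_and_d(group_a, group_b):
    """Return (t statistic, Cohen's d, degrees of freedom) for two independent groups."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    diff = mean(group_a) - mean(group_b)
    t_stat = diff / sqrt(pooled_var * (1 / na + 1 / nb))
    cohens_d = diff / sqrt(pooled_var)
    return t_stat, cohens_d, df


# Hypothetical post-test scores for two small groups.
experimental = [60, 70, 80]
control = [50, 60, 70]
t_stat, cohens_d, df = pooled_t_and_d(experimental, control)
print(round(t_stat, 3), cohens_d, df)
```

A p-value would then be read from the t distribution with `df` degrees of freedom, which requires a statistics package and is omitted here.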
Table 1 provides a general description of the 102 trial participants. As noted the majority of the research subjects were medical students and other health care professional students. Ten primary care physicians (four family physicians, four internal medicine physicians and two pediatricians) participated in addition to 30 nurse practitioners and physician assistants. As in most health care professions the majority of participants were women. Only 5% reported previous training in alcohol screening and brief intervention.
Table 2 presents the primary results of the trial. The scoring was set up to range between 0 and 100 points for each case. Baseline scores were similar for each group across all three scenarios. The post-test scores demonstrated significant differences in alcohol screening (p<.001) and brief intervention skills (p<.04) between groups. The screening skills in the control group increased by 3.7% whereas the skills in the experimental group improved by 14.4% over the 6 month period. The number of subjects who inquired about the frequency of alcohol use increased from 40 to 51 in the experimental group, with no change in the control group (42 to 43). The brief intervention skills went down in the control group by 2.1% and increased 5.7% in the experimental group. While there were significant changes in referral skills on pre- and post-tests, there were no differences between groups.
Tables 3–5 present the score and differences for the individual items contained in each of the scenarios. Table 3 illustrates each individual screening item on which subjects were scored. The items with significant pre-post changes included asking about quantity of alcohol use, the frequency of alcohol use and the frequency of heavy drinking. Subjects also asked more often about alcohol-related injuries and prior treatment.
Table 4 lists the items scored for the brief intervention scenario. Items with significant changes focused on drinking cons and readiness to change.
In table 5 there were some minor differences between the control and intervention groups in the frequency of referral to AA and making a follow-up appointment.
The technology tested is in a true sense a “virtual reality teaching method” that allows a learner to engage in a “real doctor patient interaction”. The algorithm on which the technology is based allows for an unlimited number of variations in the interaction so that every time the learner plays the program the situation is novel and different. Due to the large number of potential responses to questions and counseling statements, the patient can respond in a nearly limitless number of ways. There is also a realistic emotional variation in the patient response that allows the patient to react to the type and sequence of questions and statements. These emotional variations include negative or positive facial expressions, non-verbal body postures, negative vocalizations, changes in eye contact and expressions of affect that change the atmosphere of the interaction. The simulation attempted to overcome concerns about the “reality” of patient simulations (6) allowing learners to “practice” on fake patients before applying these new skills to real patients where patient safety is a concern (16).
The 14 members of the expert panel who practiced and played the program reported that the interview felt like entering an office examination room and dealing with a real patient. Similar unsolicited comments were made by the students and physicians in the intervention group. The program makes learning fun and interactive, especially for a topic that is difficult and sensitive. In an area like substance abuse, where learners often roll their eyes and become somnolent when being taught, this technology represents a refreshing, innovative way to teach and enhance clinical skills for a variety of health care clinicians.
The most robust finding of the study was an improvement in alcohol screening skills. This finding may be related to the nature of the simulation, which begins with screening and then moves into brief intervention and referral. With a limited number of plays (10 was the minimum number of plays) learners may not spend sufficient time practicing brief intervention and referral skills. While we were not able to test the dose effect of additional plays, future research may want to focus on the brief intervention and referral skills training portions of the simulation.
How can this virtual reality program be used to improve the knowledge and skills of health care professionals? First, the program could function as part of a core curriculum for teaching medical, physician assistant, nursing and pharmacy students about how to conduct alcohol screening and intervention. Other parts of the curriculum could include case discussions, E-learning sites, an evidenced-based review of clinical protocols and standardized patient testing. The program could serve as the platform for a more comprehensive program that could include rotations on consult services, use of these skills on clinical rotations with feedback by faculty supervisors, supervised assessments in addiction medicine programs as well as opportunities to practice these new skills in community based prevention and treatment programs.
Second, since the program is designed as a stand-alone resource, the protocol could be used for Continuing Medical Education (CME) for practicing physicians. In a review of studies on formal CME programs, Davis et al. (1999) presented evidence that interactive CME sessions that enhance participant activity and provide the opportunity to practice skills can effect change in professional practice and, on occasion, health care outcomes.(3) While the program is easy to install and learn how to use, a brief orientation by local Information Technology staff may be helpful in overcoming challenges many clinicians have with new computer technology. Third, for health care systems or specific groups such as hospital-based trauma surgeons, the program could be used to meet the requirements for national certifying organizations such as the Joint Commission or American Trauma Society.
The strengths of the study include random assignment of a control group, large diverse sample of learners, state-of-the-art measurement of clinical skills, intention-to-treat procedures and 90% post-test follow-up information collected. Weaknesses of the study include challenges in measuring changes in clinical behavior skills. There are no standard methods to measure changes in alcohol screening and intervention skills. We had to develop our own standardized patient scenarios and scoring methods. While we pilot tested the scenarios and used an expert panel to review 20% of the tapes to ensure consistency, scoring remains a challenge. The screening and brief intervention scenarios seemed to work well with large positive outcomes in the intervention group. The large change in the referral scenarios for both groups suggests a problem with the post-test scenario that needs further assessment.
Another potential limitation is the generalizability of a volunteer sample compared to a general sample of learners. On the one hand, one could argue that paid volunteers are more motivated or are more likely to change their clinical skills. However, in this case we make the opposite argument. A student or clinician who is required to play the simulation is more likely to learn the material and get a higher score on the post-test than a volunteer. Our volunteers did not receive a grade or feel any pressure to pass the post-test or to learn how to screen and conduct brief intervention. Grades and a passing requirement are powerful motivators to perform. Another argument is the observation that the pre-test baseline standardized patient scores on alcohol skills are likely to be lower in a general sample than in a volunteer sample. Most of our volunteers were interested in learning more about alcohol and had fairly high baseline pre-test scores. This created a ceiling effect that limited our ability to demonstrate change. Based on these arguments, we would expect clinicians and students who are required to play the simulation and pass a certain level of proficiency to be likely to improve their skills at least as much as our volunteer sample.
In summary, the technology successfully tested in this paper offers great promise. Virtual reality game-based teaching methods are ideally suited to increase the behavioral skills of health care clinicians. It is difficult to ask patients about personal issues (substance use, sexual practices, violence, depression) when they made an appointment for a blood pressure check or a headache. While patients appreciate concern and caring from their clinicians, they also expect personal questions to be asked with skill, empathy and confidentiality. Clinicians who do not have the skills to inquire or talk about these sensitive topics often generate fear and resistance on the part of the patient. Patients’ reactions to being asked about these topics can be strong and in some cases carry the risk of harm. Virtual reality simulation offers learners the opportunity to practice and develop skills before trying to apply these skills with real patients.
Expert Panel: University of Wisconsin: Bhushan Bhamb, MD; Randall Brown, MD; Richard Brown, MD, MPH; Jane Crone, NP; Tanya Jagodzinski, MD; Patricia Kokotailo, MD; Amy Miller, NP; Linda Roberts, PhD; Sharon Woodford, NP; Aleksandra Zgierska, MD, PhD. Evergreen University and University of Washington: Jason Kilmer, PhD.
Standardized Patients: Catherine Antczak, Steven Clark, Jeanne Harris, Richard Kreklow, Karyn McCann, Rob Rivard, Joyce Schwert, Kim Stalker-Herron, Deborah Sutinen, Dave Verban.
Production Team: Zachary Barrier, Henry Dewitt, and Peter Roca, Software Development; Clay Hopper, Director; Sean Kobrin and Mark Smith, Audio and Video Producers; Julie-Ann Elliott, Actress; Elizabeth H. Richards, Female Voice-Over; Michael Mortenson, Male Voice-Over.
Funded by: National Institute on Alcohol Abuse and Alcoholism, grant number 1R42 AA016486-01
No author has a conflict of interest.
Michael Fleming, Professor of Family Medicine, University of Wisconsin Madison, School of Medicine and Public Health, 777 South Mills Street, Madison WI, 53715, Phone: (608)263-9953, Fax: (608)263-5813, Email: mike.fleming/at/fammed.wisc.edu.
Dale Olsen, President, SIMmersion LLC, Columbia MD.
Hilary Stathes, Script Writer, SIMmersion LLC, Columbia MD.
Laura Boteler, Project Leader, SIMmersion LLC, Columbia MD.
Paul Grossberg, Clinical Professor, Pediatrics, UW Madison School of Medicine and Public Health.
Judie Pfeifer, Project Coordinator and Education Specialist, Department of Family Medicine, UW Madison.
Stephanie Schiro, Research Specialist, Dept. of Family Medicine, UW Madison.
Jane Banning, Director of UW School of Medicine and Public Health Clinical Teaching and Assessment Center (CTAC)
Susan Skochelak, Senior Associate Dean for Academic Affairs, UW School of Medicine and Public Health.