Design
A randomized controlled educational trial was conducted to test the hypothesis of interest. A total of 102 health care professional students or practitioners were assigned to the experimental virtual reality simulation program or to a no-education control group. The intervention was based on SIMmersion simulation technology (12, 13). Subjects assigned to the intervention group were expected to read the educational materials and to practice the simulation at least 10 times on their personal computer over the three-month study period. The primary outcomes of interest were changes in the clinical skills of the participants. Clinical skills were assessed using standardized patients. Each participant was tested with three different case scenarios at baseline and six months post-randomization. The case scenarios were developed specifically for this study because no standard case scenarios have been tested and validated for alcohol screening, brief intervention and referral. These scenarios build on prior work conducted by the PI’s research group (8).
Subject recruitment
Participants were initially recruited by email and invited to participate in an educational study focused on alcohol screening and brief intervention. Eligible participants included physicians, residents, nurse practitioners (NPs), physician assistants (PAs), NP/PA students, medical students and pharmacy doctoral students at the University of Wisconsin in Madison. Of the 731 students and clinicians who were sent a blanket email, 102 (14%) participated in the trial. The primary groups were fourth-year medical students and primary care physicians. Subjects who responded to the email invitation were contacted by telephone and screened to determine eligibility. The eligibility criteria were: 18 years of age or older; currently enrolled in a professional training program or currently practicing medicine, nursing or pharmacy; ability to complete the required number of plays of the simulation; and availability to complete the six-month post-test at the testing facility in Madison.
Subjects who met these criteria were then scheduled to participate in the baseline standardized patient skills test. Written informed consent was obtained at the time of the baseline testing scenario. Randomization also occurred at this time. The study was approved by the UW Madison Health Sciences Human Subjects Committee. Subjects in both groups were paid $50 for completion of the pre-test and $50 for the post-test. Intervention subjects received an additional $10 for each play up to 10 plays. Eight subjects who completed all aspects of the trial were randomly selected to receive an additional $500 for their participation at the end of the trial. The subjects in the control group were given the virtual reality simulation program and voice recognition equipment at the end of the 6-month post-test.
Standardized patient testing scenarios and scoring
The testing scenarios were developed by the study research team. Three standardized cases (screening, intervention, and referral) and rating methods were used for the pre-test, and a separate set of three cases was used for the post-test. The pre-test screening case was a 43-year-old salesman who presented to a new physician for a blood pressure medication refill and evaluation of hypertension. He had a stressful job in a new location, had about four drinks most nights after work and occasionally had up to six to eight drinks. He was annoyed by his wife’s concern about his drinking and had tried to cut down, but wasn’t successful. The pre-test brief intervention case was a 48-year-old single female financial advisor who presented to a new physician for trouble sleeping, awakening often each night. She had three to four drinks each night, recently missed a crucial morning business appointment, and sprained her ankle when drinking. She had been diagnosed in the past as a problem drinker, but didn’t believe it. The pre-test referral case was a 48-year-old divorced female who had been a licensed practical nurse (LPN) for 30 years. She presented to her physician for a refill of a Valium prescription to help her relax and sleep. She had five to six drinks every night, her father and brother were alcoholics, and she had been irritable at work. She didn’t believe she was an alcoholic and wouldn’t go to AA. She was a dependent drinker.
The post-test screening case was a 40-year-old male salesman for a manufacturing company, who presented to his physician to have stitches removed from an injury suffered in a bar fight. He had about four drinks most nights after work, six to eight on an occasional heavy night, was annoyed by his wife’s concern about his drinking and had hangovers several times each month. The post-test brief intervention case was a 48-year-old female with two married children and three small grandchildren. Her marriage was failing and she did volunteer work for several organizations. She had three to four drinks most nights, and her children were concerned about her drinking and were reluctant to have her babysit her grandchildren on weekends. She had tried to cut down, and her father and brother were alcoholics. She was a problem drinker. The post-test referral case was a 47-year-old woman who owned her own business and presented to her physician for increasing fatigue and trouble sleeping. She had three to five drinks per night and lived with a woman partner who was concerned about her drinking. Her father was an alcoholic and she had tried to cut down but was unsuccessful. She was a dependent drinker. The scoring items for each of the six scenarios are listed in Tables 3, 4, and 5.
Table 3. This table compares the mean score for each group for each individual skill tested in the standardized patient screening scenario.
Table 4. This table compares the mean score for each group for each individual skill tested in the standardized patient brief intervention scenario.
Table 5. This table compares the mean score for each group for each individual skill tested in the standardized patient referral scenario.
Training of standardized patients
The University of Wisconsin School of Medicine and Public Health Clinical Teaching and Assessment Center (CTAC) was formally established in 1994 and has an active Standardized Patient program for research, teaching and clinical assessment. Standardized patients (SPs) for this project were drawn from a pool of 90 persons who had previously participated as standardized patients, and were selected based on longevity of experience, interest in the project, time availability and background in a health care-related field. Nine SPs were used for the pre-test and ten for the post-test. The SPs included six women and four men. Their ages ranged from 38 to 60 and they had been with the SP Program for up to four years.
The CTAC director was the primary trainer who taught the SPs to portray the cases and score the clinical skills of the subjects. Each SP participated in two 2-hour training sessions that focused on practicing their assigned role. The CTAC training director provided feedback and comments to every SP. Each SP was expected to portray one case scenario for the pre-test and one for the post-test. SPs were also asked to attend several meetings with the researchers to assist in script development and to refine checklist items so they were descriptive, clear, and intuitive to the SPs. The research team created detailed scripts as well as directions for completing the checklist based on these discussions. The close training with, and collaboration between, the SPs and members of the research team resulted in an open atmosphere where SPs’ questions could be readily asked and addressed.
A pilot test of the cases, role plays, checklists, technology, and logistics took place in September 2007 with three subjects and eight SPs either portraying roles or viewing the sessions. None of the three subjects involved in the pilot participated in the trial. Further modification of the scripts and scoring items was facilitated by immediate post-pilot discussion and videotape review with the SPs and the research team observing the pilot and subjects.
Testing procedure
Ten sessions were used to conduct the pre-test standardized patient scenarios over a six-week period in November and December of 2007, with 8–12 subjects tested during each session. Each of the 102 participants interviewed three standardized patients. The participants were given five minutes to read an abbreviated medical record on the outside of the door prior to entering the exam room and interviewing the SP. Each research participant had 15 minutes to interview and/or counsel the SP. The SP had five minutes to score the checklist before the next clinician entered the room. Research subjects completed the pre-test in 60 minutes. The post-test sessions occurred in April and May of 2008 using similar testing methodology. All subject-SP sessions were videotaped. A 20% sample of the videotaped sessions was reviewed by a panel of members of the research team to determine the validity of the SP scoring. The reviews found strong agreement between the ratings of the SPs and the research team. The degree of agreement (kappa) between the SP scores and the expert panel’s review of the 20% sample was 0.95.
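The agreement statistic reported above can be illustrated with a minimal sketch. It assumes Cohen's kappa for two raters scoring the same checklist items; the source does not state which kappa variant was used, and the ratings below are hypothetical, not study data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical yes/no checklist ratings (1 = skill demonstrated, 0 = not).
sp_scores = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
panel_scores = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(sp_scores, panel_scores)
print(round(kappa, 2))
```

Kappa corrects raw agreement for the agreement expected by chance, so it is lower than the simple proportion of matching ratings whenever the raters disagree at all.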
Virtual reality simulation
The virtual reality simulation software used for this study was developed by Dr. Dale Olsen at the Johns Hopkins University Applied Physics Laboratory. The stand-alone software modified for this study consisted of six elements. The first element was an educational program that the participant could read for basic background on alcohol screening and intervention. This was based on the National Institute on Alcohol Abuse and Alcoholism (NIAAA) 2005 clinician’s guide and consisted of 20 screens of text. The second element consisted of 707 questions and statements that learners could use to question the simulated patient, conduct counseling or make a referral to a treatment center. The third element was a set of 1,207 responses developed for the simulated patient to respond to the learner (an actress video recorded these responses in a production studio over a five-day period with varied mood and affect). An algorithm embedded in the program selected the response based on the type and appropriateness of the question or statement made by the learner, as well as the history of the conversation between the character and the learner. Negative or inappropriate questions or statements were linked to audio and visual anger or mood changes in the simulated patient.
The fourth element was the on-screen help agent. The agent was an action figure in the corner of the screen who intermittently displayed hand and body signals to indicate especially good or not-so-good questions by the learner. The fifth element was an instant replay feature. The sixth element was scoring and feedback to the learner on their performance during the play. The basic computer screen for the simulation is presented in the accompanying figure. For this simulation the program was designed so that 40% of the time the character in the simulation would be an at-risk drinker, 40% a problem or dependent drinker and 20% a low-risk drinker (based on criteria developed by NIAAA).
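The 40/40/20 case mix described above can be sketched as a weighted random draw at the start of each play. This is an illustration only, not the simulation's actual code; the category labels, function name and seed are assumptions.

```python
import random

# Case mix stated in the text; the labels are illustrative names.
CASE_MIX = {
    "at_risk": 0.40,
    "problem_or_dependent": 0.40,
    "low_risk": 0.20,
}

def draw_character_type(rng):
    """Pick the simulated patient's drinker type for one play."""
    types = list(CASE_MIX)
    weights = list(CASE_MIX.values())
    return rng.choices(types, weights=weights, k=1)[0]

rng = random.Random(2007)  # fixed seed so the sketch is reproducible
draws = [draw_character_type(rng) for _ in range(10_000)]
shares = {t: draws.count(t) / len(draws) for t in CASE_MIX}
print(shares)
```

Over many plays the observed shares approach the stated proportions, although any individual subject completing about 10 plays could see a noticeably different mix.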
Subjects were asked to conduct face-to-face alcohol screenings, brief interventions and referrals with the simulated character using a microphone or a computer mouse to communicate. The questions and statements were scripted to include a variety of natural choices. For each scripted question or statement, multiple simulated character responses were available. The simulated character’s brain selected a response based on the level of rapport developed by the subject, the character’s risk level, previously discussed information, and chance. For example, if the subject selected a series of inappropriate options, the simulated character became curt and uncooperative; if, however, the subject selected a series of appropriate options, the character became friendly and forthcoming. This emotional variation allowed the simulated character to emulate actual human behavior.
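The rapport mechanism described above can be made concrete with a toy model. The sketch below is hypothetical and is not the SIMmersion algorithm: the rapport scale, thresholds, mood labels and function names are all assumptions, and it omits the character's risk level and the conversation history that the real system also used.

```python
import random

def update_rapport(rapport, appropriate):
    """Raise rapport after an appropriate option, lower it otherwise."""
    step = 1 if appropriate else -1
    return max(-5, min(5, rapport + step))  # clamp to an arbitrary range

def character_mood(rapport):
    """Map the current rapport level to the tone of the next response."""
    if rapport <= -2:
        return "curt"
    if rapport >= 2:
        return "friendly"
    return "neutral"

def pick_response(candidates, rapport, rng):
    """Choose by chance among the candidate responses matching the mood."""
    mood = character_mood(rapport)
    matching = [c for c in candidates if c["mood"] == mood]
    return rng.choice(matching or candidates)

# Hypothetical responses to a single scripted question.
candidates = [
    {"mood": "curt", "text": "I don't see why that matters."},
    {"mood": "neutral", "text": "A few drinks after work, I guess."},
    {"mood": "friendly", "text": "Honestly, about four most nights."},
]

rapport = 0
for appropriate in [False, False, False]:  # three inappropriate options
    rapport = update_rapport(rapport, appropriate)
reply = pick_response(candidates, rapport, random.Random(0))
print(reply["text"])
```

After three inappropriate options the rapport level falls below the lower threshold, so only the curt response remains eligible.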
Subjects received feedback from an on-screen help agent who provided non-verbal cues regarding the user’s choice of questions. In addition, the subject could click help buttons for assistance with question choices and character responses. The system scored the subject’s performance, and an instant replay feature enabled users to review portions of dialogue or their entire conversation. At the end of the conversation, subjects would have to decide which type of drinker (randomly selected by the program) they were talking to and were scored on their accuracy.
Each subject was expected to screen the patient for alcohol use and decide if the patient required a brief intervention and a referral for treatment. Since each of the three patients would require varying amounts of time, a completed play was defined by the number of statements and questions used by the subject, with 10 statements set as the minimum requirement for one play. The guidelines could be accessed at any time during the play as a reference for the subject.
Plays were tracked using SIMmersion’s on-line tracking system, with the goal being that each experimental subject would play at least 10 times in the 3–4 months prior to the final post-test. After pilot testing by the expert panel, there was a general consensus that 10 plays was a minimum number of plays in order to take advantage of the various patient scenarios and responses built into the simulation. The experimental subjects were contacted by research staff once during the practice period to ensure they were using the simulation software correctly and to aid in solving any issues.
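The completion rule and the practice target can be expressed as a short check. This is a sketch under stated assumptions: the log format (one statement count per session) and the function names are hypothetical and do not describe SIMmersion's tracking system.

```python
MIN_STATEMENTS_PER_PLAY = 10  # minimum statements for a play to count
TARGET_PLAYS = 10             # plays expected of each intervention subject

def completed_plays(statements_per_session):
    """Count sessions that meet the minimum-statement rule."""
    return sum(1 for n in statements_per_session
               if n >= MIN_STATEMENTS_PER_PLAY)

def met_target(statements_per_session):
    """True if the subject completed at least the target number of plays."""
    return completed_plays(statements_per_session) >= TARGET_PLAYS

# Hypothetical log: number of statements used in each session.
log = [12, 9, 15, 10, 22, 11, 3, 18, 14, 10, 16, 13]
print(completed_plays(log), met_target(log))
```

In this example, two of the twelve sessions fall short of 10 statements and do not count as plays.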
The questions and simulated patient responses were written and developed by an expert panel at the University of Wisconsin. The expert panel consisted of 14 UW Madison primary care clinicians, addiction medicine physicians, psychologists and members of the SIMmersion team with significant expertise in substance abuse. Members of the panel also played and pilot-tested a number of versions of the virtual reality simulation as it was developed. The simulation took approximately nine months to develop prior to testing.
Statistical Analysis
The demographic variables for the experimental and control groups were described by way of frequencies (%). Univariate analysis was used to assess potential differences in gender, age, clinician vs. student status and prior alcohol training. As noted in Table 1, there were no significant differences between groups on these four variables.
Table 1. This table presents socio-demographics on the 102 subjects who participated in the study by group status.
The primary outcomes were the Intervention, Screening and Referral scores from the standardized patient scenarios. As noted in Tables 3, 4, and 5, the clinical skills of the subjects were assessed using a set of clinical criteria developed for the screening, brief intervention and referral scenarios. Eighty-five points were allocated to 17 specific skills criteria, with five points for each. The standardized patients were instructed to score each skill as a simple yes (the learner demonstrated the skill) or no (they did not). Subjects received five points or zero points for each skill. No partial credit was given. Fifteen points were used to rate the clinician’s overall performance. These scores (0–100) were aggregated to create a total score for each scenario and described with means and standard deviations.
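The scoring scheme above can be written as a small function. This is an illustrative sketch: the function name and the input format are assumptions, and the overall rating is assumed to range from 0 to 15.

```python
SKILL_COUNT = 17       # skills criteria per scenario
POINTS_PER_SKILL = 5   # all-or-nothing credit per skill
MAX_OVERALL = 15       # points for the overall performance rating

def scenario_score(skills_demonstrated, overall_rating):
    """Total score (0-100) for one standardized patient scenario."""
    if len(skills_demonstrated) != SKILL_COUNT:
        raise ValueError("expected one yes/no rating per skill")
    if not 0 <= overall_rating <= MAX_OVERALL:
        raise ValueError("overall rating must be between 0 and 15")
    skill_points = POINTS_PER_SKILL * sum(bool(s) for s in skills_demonstrated)
    return skill_points + overall_rating

# Hypothetical checklist: 12 of 17 skills demonstrated, overall rating 11.
checklist = [True] * 12 + [False] * 5
total = scenario_score(checklist, 11)
print(total)  # 12 * 5 + 11 = 71
```

The maximum is 17 × 5 + 15 = 100, which matches the 0–100 range of the scenario scores.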
Ninety-one of the 102 subjects completed the six-month post-test. The primary reasons the 11 remaining subjects did not complete the post-test included scheduling conflicts with patient care and course work, relocation and illness. An intention-to-treat analysis was followed. Baseline scores were imputed for the missing post-test scores of the 11 subjects who did not complete the post-test. We elected to use the most conservative method to handle missing follow-up data. All 102 subjects originally randomized into the trial were included in the outcome analysis.
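The imputation rule described above amounts to carrying each non-completer's baseline score forward. A minimal sketch follows; the subject identifiers and scores are hypothetical, and the dictionary layout is an assumption.

```python
def impute_post_scores(baseline, post):
    """Fill each missing post-test score with that subject's baseline score."""
    return {
        subject: post[subject] if post.get(subject) is not None
        else baseline[subject]
        for subject in baseline
    }

# Hypothetical scenario scores; None or an absent key marks a missing post-test.
baseline = {"s01": 55, "s02": 60, "s03": 48, "s04": 70}
post = {"s01": 75, "s02": None, "s03": 66}

analysis_scores = impute_post_scores(baseline, post)
print(analysis_scores)
```

Because an imputed subject shows zero change from baseline, this approach can only pull the estimated improvement toward zero, which is why it is regarded as conservative.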
The mean post-test values for the experimental and control groups were compared with t-tests to derive the effect size of the educational trial on the three scenarios. T-tests were executed separately for all Intervention, Referral and Screening scale items. The sample was too small to assess a dose-response effect between the number of plays and changes in clinical behavior. We stratified the results on each of the covariates in Table 1 and found no statistical association between these variables and the primary outcomes. The analyses were performed with SAS version 9.1 for Linux (SAS Institute Incorporated, 2002–2003).
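The group comparison can be illustrated outside SAS with a short sketch. It assumes a pooled-variance two-sample t-test and Cohen's d as the effect size; the source does not specify either choice, and the scores below are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    return math.sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b))
                     / (na + nb - 2))

def t_statistic(a, b):
    """Two-sample t statistic assuming equal variances."""
    standard_error = pooled_sd(a, b) * math.sqrt(1 / len(a) + 1 / len(b))
    return (mean(a) - mean(b)) / standard_error

def cohens_d(a, b):
    """Standardized mean difference between the two groups."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)

# Hypothetical post-test scenario scores for each group.
experimental = [70, 75, 80, 65, 85]
control = [60, 65, 70, 55, 75]

t = t_statistic(experimental, control)
d = cohens_d(experimental, control)
degrees_of_freedom = len(experimental) + len(control) - 2
print(round(t, 2), round(d, 2), degrees_of_freedom)
```

The p-value would be obtained by referring the t statistic to a t distribution with the stated degrees of freedom, a step that a statistical package such as SAS performs automatically.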