Accurate assessment of resident competency is a fundamental requisite to assure the training of physicians is adequate. In surgical disciplines, structured tests as well as ongoing evaluation by faculty are used for evaluating resident competency. Although structured tests evaluate content knowledge, faculty ratings are a better measure of how that knowledge is applied to real-world problems. In this study, we sought to explore the performance of surgical residents in a simulation exercise (strategic management simulations [SMS]) as an objective surrogate of real-world performance.
Forty surgical residents participated in the SMS simulation that entailed decision making in a real-world-oriented task situation. The task requirements enable the assessment of decision making along several parameters of thinking under both crisis and noncrisis situations. Performance attributes range from “simpler” measures of competency (activity level), through intermediate categories (information management and emergency responses), to complex measures (breadth of approach and strategy). Scores obtained in the SMS were compared with the scores obtained on the American Board of Surgery In-Training Examination (ABSITE).
The data were intercorrelated and subjected to a multiple regression analysis with ABSITE as the dependent variable and simulation scores as independent variables. Using a 1-tailed test, only 3 simulation variables correlated with performance on ABSITE at the .01 level (ie, basic activity, focused activity, task orientation). Other simulation variables showed no meaningful relationships to ABSITE scores at all.
The more complex real-world-oriented decision-making parameters on SMS did not correlate with ABSITE scores. We believe that techniques such as the SMS, which focus on critical thinking, complement assessment of medical knowledge using ABSITE. The SMS technique provides an accurate measure of real-world performance and provides objective validation of faculty ratings.
Accurate assessment of resident competency is a fundamental requisite to assure the public that the training of physicians is adequate. Knowledge about how to best develop and assess physician competency continues to grow. Procedural specialties require the assessment of procedural as well as cognitive skills. In surgical disciplines, structured written tests evaluate content knowledge (eg, the American Board of Surgery In-training Examination [ABSITE]).1,2 Ongoing faculty ratings of resident behaviors during patient care provide documentation on how residents combine content knowledge, skills, and attitudes during real-world challenges. Structured tests measure content knowledge and have limited ability to measure how well the resident uses this knowledge. Faculty ratings, on the other hand, are a better measure of the residents' ability to apply knowledge in a timely fashion, yet they are often subject to bias (positive or negative).3 Additional techniques are beginning to emphasize different variables of performance, including the need for focus on organizational factors,4 emergency management,5 and improvement of performance through structured observations.6 Tools such as the Oxford Non-Technical Skills (NOTECHS) scale have emerged as a reliable and valid instrument for teamwork analysis.7
Meaningful assessments of clinical performance and the potential for clinical performance are important to resident selection and evaluation. Brothers and Wetherhalt8 found that faculty evaluations of personal characteristics and letters of reference at the time of interview were likely to predict subsequent clinical performance. United States Medical Licensing Examination scores and academic grade performance were predictive of subsequent formalized testing (ABSITE) but they were poor predictors of clinical performance. Superior faculty rating of resident performance was correlated with better rating on communication and professionalism competencies (residents who are able to communicate better and work in a team),3 but ratings on communication and professionalism competencies did not correlate with ABSITE scores.3 ABSITE scores also had a low correlation with technical skills or operative performance in surgical residents.9
Faculty ratings appear to judge the resident performance (in and out of the operating room) better than the ABSITE scores. Robust approaches for global evaluation tools by faculty can use a standard set of questions addressing specific competency-based behaviors and/or proficiencies, with standardization of results between those being rated through the use of a Likert scale.10 Still, a major concern in the use of faculty evaluation is the potential for assessment bias. A “halo effect” may result from evaluators being unduly influenced by resident affability and availability. This concern is echoed in the tradition of the “3 As of private practice”: affability, availability, and ability, which are the factors most important to a referring physician when selecting a consultant in descending order of priority.3 Physicians are frequently unaware of their own bias toward other physicians, and it has been shown that referral patterns appear to be based on relationships, not physician skill.11 On the other hand, 1 negative experience may establish the perception on the part of faculty that a resident is below average. All these concerns with faculty ratings could be minimized with an objective surrogate for a real-world performance of the surgical resident. This can potentially be achieved using techniques such as the strategic management simulation (SMS) described in this article.
Simulation has become increasingly important in graduate and undergraduate medical education in order to avoid observer bias. A variety of techniques using simulation for both needs assessment and evaluation have been developed. Procedural simulations (both low and high-tech), use of standardized patients and Objective Structured Clinical Examinations, and simulations of team performance and emergency situations are a few of the more commonly described techniques.12–16 However, capturing performance and providing useful, structured feedback can be challenging in some simulated situations. Just as we sometimes struggle to teach leadership and decision-making skills in the clinical setting, we face the same challenges in a simulated environment.
Patient care at the frontlines occurs under “VUCAD” conditions: volatility, uncertainty, complexity, ambiguity, and delayed feedback.17,18 Although many commonly used procedural simulations can recreate such conditions, they do not necessarily offer defined ways to evaluate performance in the areas of strategic thinking, prioritizing, decision making, and leadership. Cognitive simulation can do that because it offers another avenue for assessing and promoting the development of competent physicians, especially with regard to areas of decision making and leadership. These simulations are designed to replicate complex realities and can be used to evaluate decision-making and leadership skills in resident physicians.19,20
The technique of strategic management simulation has been widely used in many industries to assess cognitive and behavioral responses to task demands as well as the cognitive and behavioral components of “executive functions.”21 Strategic management simulation is a tool that explores how we think without assessing the content of the subject matter; in this case, surgery. We have previously reported the use of SMS in the evaluation of surgical residents.22,23 The simulation technique was able to accurately assess performance of the surgical residents in an objective manner in a brief period of time. A number of measures obtained on simulation performance (such as activity level, response speed, initiative, adequate usage of and appropriate search for relevant information) were highly correlated with comprehensive faculty assessment (eg, measures of crisis management, team interactions, flexibility of approach). In other words, simulation data were highly similar to faculty ratings that were based on at least 2 years of experience with the resident.
The SMS simulations are designed to assess multiple critical dimensions of decision making under dynamic environmental conditions. The SMS measurement system emphasizes the underlying parameters of thinking that are critical to communication, teamwork, utilization of knowledge, breadth of approach, integration of knowledge with incoming information, use of planning, and strategy.24 The SMS simulations have been used across the world to assess and train professional decision makers.25
In this article, we sought to explore how standardized test scores on ABSITE correlated with performance in SMS in surgery residents. In other words, we were interested in determining the commonalities between ABSITE scores and the real-world performance indicators measured by the cognitive simulation (SMS).
Forty surgical residents from Stanford School of Medicine participated in the SMS simulation. Scores obtained were compared with ABSITE, a test of the general level of knowledge attained by residents.
Strategic management simulations assess both cognitive and behavioral responses to task demands. The method provides more than 80 computer-gathered and computer-calculated measures of functioning, loading on 12 reliable factors (based on factor analytic varimax rotation for more than 2000 prior subjects). High levels of predictive validity, reliability, and applicability of the SMS simulations to real-world settings have been repeatedly demonstrated across multiple professions, cultures, and continents (predicting an individual's achievement and future success level on indicators such as “job level at age,” “income at age,” “promotions,” “number of persons supervised,” and so forth).26 Overall validity coefficients consistently exceed r = +.60. Reliability values range between r = +.7 and r = +.94.26
During the simulation, participants make decisions during a 90-minute task period that includes responding to scenarios that are specifically designed to assess decision making along multiple parameters.27 The real-world atmosphere of the task and setting, involving multiple potentially interactive components of task demands as well as multiple and interactive options to engage in various aspects of behavior, allows for a more realistic (ecologically relevant) assessment of competency.28 Assessed performance attributes on several validated performance indicators vary from “simpler” measures of competency in categories such as “activity” and “timeliness of response,” through intermediate categories such as “information orientation,” “information utilization,” and “emergency management” to increasingly complex measures in such areas of functioning as “initiative,” “breadth of approach to challenges,” “planning,” “strategy,” and so forth (table 1). During the simulation, the task requirements enable the assessment of decision making under both crisis and noncrisis situations along several parameters of thinking. These include levels of activity, task orientation, information management, and strategy among others.
The ABSITE is a test in clinical and basic science that is designed to serve as an objective measure of resident knowledge to determine competency for progression in the program. The data were intercorrelated and subjected to a multiple regression analysis with ABSITE as the dependent variable and simulation scores as independent variables.
The intercorrelations of ABSITE scores with simulation scores and their common variance values are presented in table 2. Because we are concerned with potential parallel predictions of ABSITE and SMS simulation scores, a 1-tailed test analysis was used. Only 3 simulation variables correlated with performance on ABSITE at the .01 level (ie, basic activity, focused activity, task orientation). Other simulation variables showed no meaningful relationships to ABSITE scores at all.
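As an illustration of the correlation screen described above, the following minimal sketch computes a Pearson correlation, its common variance (r squared), and the smallest correlation that would reach the .01 level in a 1-tailed test with 40 residents. It is not the authors' analysis code. The function names are hypothetical, and the critical t value of 2.429 for 38 degrees of freedom is an assumption taken from standard t tables, not from the article.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two samples of equal length with at least 3 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing a correlation r from n pairs (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def critical_r(t_crit, n):
    """Smallest |r| that reaches the critical t value for n pairs."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


# Sample size reported in the article.
N_RESIDENTS = 40
# Assumption: 1-tailed critical t at the .01 level for df = 38, from standard tables.
T_CRIT_ONE_TAILED_01 = 2.429

r_min = critical_r(T_CRIT_ONE_TAILED_01, N_RESIDENTS)
common_variance_min = r_min ** 2  # shared variance implied by that correlation

if __name__ == "__main__":
    print(f"minimum r for .01 (1-tailed, n=40): {r_min:.3f}")
    print(f"corresponding common variance:      {common_variance_min:.3f}")
```

Under these assumptions, a simulation variable would need a correlation with ABSITE of roughly .37 or larger (about 13% common variance) to reach the .01 level.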
To help define the primary contributors to any relationship between ABSITE performance and performance on the simulation, a multiple regression analysis was performed to single out those simulation variables that meaningfully predict the ABSITE score. Using a stepwise model of F ≤ 0.50 to enter and ≥ .10 to remove, an overall R² (adjusted) value of .827 was obtained for 2 of the simulation variables, basic activity and focused activity, generating an F value of 94.495 (P < .001) for the 2-variable model. In other words, the primary simulation predictors for performance on the ABSITE are measures of activity. Other simulation-based variables are less or not at all predictive of ABSITE performance.
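The reported regression statistics can be cross-checked with the standard relationships between adjusted R squared, unadjusted R squared, and the overall F value. The sketch below is only a consistency check, not a reproduction of the stepwise analysis. It assumes that all 40 residents entered the 2-predictor model (the article does not state the regression sample size), and the function names are hypothetical.

```python
def r2_from_adjusted(adj_r2, n, k):
    """Recover unadjusted R^2 from adjusted R^2 for n cases and k predictors."""
    return 1.0 - (1.0 - adj_r2) * (n - k - 1) / (n - 1)


def adjusted_r2(r2, n, k):
    """Adjusted R^2 from unadjusted R^2 for n cases and k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2, n, k):
    """Overall F statistic of a linear model with k predictors and n cases."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Assumption: all 40 residents were included in the regression.
N_CASES = 40
# Two predictors retained: basic activity and focused activity.
K_PREDICTORS = 2
# Adjusted R^2 as reported in the text.
ADJ_R2_REPORTED = 0.827

r2 = r2_from_adjusted(ADJ_R2_REPORTED, N_CASES, K_PREDICTORS)
f_value = overall_f(r2, N_CASES, K_PREDICTORS)

if __name__ == "__main__":
    print(f"implied unadjusted R^2: {r2:.3f}")
    print(f"implied overall F:      {f_value:.1f}")
```

With these inputs the implied F is roughly 94, which agrees with the reported 94.495 once rounding of the adjusted R² to three decimals is taken into account.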
Assessment of residents' competency in surgery is a complex process. Competence encompasses several components: knowledge of diagnosing and treating disease, appropriate judgment in treating a patient's condition in elective and emergent conditions, knowledge of surgical technique and performing the procedures with technical finesse, empathy and caring for patients, ability to deliver health care within a framework of rules and regulations (local hospital, state, and federal), and the ability to learn from experience (lifelong learning). Most residency programs use a combination of test scores and faculty evaluations based on performance at the bedside or in the clinic and in the operating room.
In this study, we used SMS to evaluate the surgical resident performance and compared it with ABSITE scores. ABSITE scores correlated significantly with simple measures such as activity level, focused activity, and task orientation. The more complex, real-world-oriented, decision-making parameters did not correlate with the ABSITE scores. These measures included basic initiative, information management, strategy, emergency responses, and breadth of approach. Translating these SMS measures to real-world situations is important to understand the implications of these findings. Initiative to look for information when evaluating or managing a patient's condition is critical in surgery. Similarly, appropriate use of information gathered (information management), considering different surgical and nonsurgical conditions during diagnosis, and considering different options in managing a surgical problem (including nonsurgical options such as radiation therapy and medical management, or no treatment) are important in optimizing the risk-to-benefit ratio for the patient. Considering various options comes under the rubric of breadth of approach and strategy. Finally, surgical problems can evolve rapidly, and the ability to gather critical pieces of information, come up with a plan of action, and execute the plan in a timely fashion is essential, especially in an emergency.
In conclusion, we believe that techniques such as the SMS, which focus on critical thinking, provide information about resident competency that is different from, and adds to, information generated by ABSITE (and potentially other objective measures across medical specialties). Because it has been repeatedly shown that the SMS technique provides an accurate measure of underlying performance indicators, data obtained by this method would be somewhat similar to faculty ratings and, because they are unbiased, could improve on them.
Satish Krishnamurthy, MD, MCh, is Director, Minimally Invasive Neurosurgery, Department of Neurosurgery, SUNY Upstate Medical University; Usha Satish, PhD, and Siegfried Streufert, PhD, are Professors, and Mantosh Dewan, MD, is Professor and Chair, all in the Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University; Tina Foster, MD, MPH, MS, is Assistant Professor of Obstetrics and Gynecology and Community and Family Medicine, Dartmouth-Hitchcock Medical Center, and Associate Program Director of the Dartmouth-Hitchcock Leadership Preventive Medicine Residency. Thomas Krummel, MD, is Professor and Chair, Department of Surgery, Stanford School of Medicine.