The construct validity of fresh human cadaver as a training tool has not been established previously. The aims of this study were to investigate the construct validity of fresh frozen human cadaver as a method of training in minimal access surgery and determine if novices can be rapidly trained using this model to a safe level of performance.
Junior surgical trainees who were novices in laparoscopic surgery (<3 laparoscopic procedures performed) performed 10 repetitions of a set of structured laparoscopic tasks on fresh frozen cadavers. Expert laparoscopists (>100 laparoscopic procedures) performed 3 repetitions of identical tasks. Performances were scored using a validated, objective Global Operative Assessment of Laparoscopic Skills scale. Scores for 3 consecutive repetitions were compared between experts and novices to determine construct validity. Furthermore, to determine if the novices reached a safe level, a trimmed mean of the experts' scores was used to define a benchmark.
Mann-Whitney U test was used for construct validity analysis and 1-sample t test to compare performances of the novice group with the benchmark safe score.
Ten novices and 2 experts were recruited. Four out of 5 tasks (nondominant to dominant hand transfer; simulated appendicectomy; intracorporeal and extracorporeal knot tying) showed construct validity. Novices’ scores became comparable to benchmark scores between the eighth and tenth repetition.
Minimal access surgical training using fresh frozen human cadavers appears to have construct validity. The laparoscopic skills of novices can be accelerated through to a safe level within 8 to 10 repetitions.
The history of using human cadavers for learning anatomy can be traced back to the 16th century AD, when Andreas Vesalius, referred to as the Father of Anatomy, published an epic on anatomical examination in 1543 AD, titled De humani corporis fabrica (On the Fabric of the Human Body).1 Until recently, cadavers have been used for dissection and demonstration of anatomy but not to train in surgical procedures.1 This changed substantially when the United Kingdom (UK) Human Tissue Act 2004 established standards and provided guidance to clinicians to carry out education and training using cadaveric materials. This allowed for research, education, and surgical skills training to be carried out on donors who bequeathed their bodies to medical science.
Prior to implementation of any training tool, evaluation and validation of the tool and its parameters are mandatory.2 Construct validity can be defined as “evaluating a testing instrument based on the degree to which the test item identifies the quality, ability, or trait it was designed to measure.”3 This is usually done by measuring the performance in 2 groups that are hypothesized to differ in the skill being measured by the instrument (e.g., experienced surgeons and novices).2,4,5 There is a paucity of studies on human cadavers as a method of simulated training for laparoscopic procedures and none dealing specifically with construct validity or evaluating “fresh cadaver” as a training tool.6–9
There are multiple factors that will aid rapid acquisition of technical surgical skills, but evidence of the construct validity of fresh frozen cadaver (FFC) use in operative skills training has not been established to date. With both access to and availability of human cadavers being strictly controlled, it is also important to know how much exposure (e.g., number of repetitions of a particular procedure) is needed to reach an acceptable level of safety to operate on live subjects.
The primary aim of the study was to assess the construct validity of FFC as a method of minimal access surgical skills training. A secondary aim was to identify how much exposure to this training medium, in terms of number of repetitions of a task, was required for the skills of the novices to rise to a predetermined safe level.
This was a single-center trial conducted in a Human Tissue Act-approved fresh cadaver laboratory in a UK teaching hospital. Appropriate ethical approval was obtained from the local research ethics committee. Recruitment to the study was through global E-mails sent to surgical trainees. Those who had previously performed >3 laparoscopic procedures were excluded from the novice group.
Ten junior surgical trainees (<2 y of surgical experience) who were novices in laparoscopic surgery and 2 experts, fully trained general surgeons (postlaparoscopic surgery fellowship and >100 laparoscopic procedures) with their practice composed primarily of minimal access surgery, were recruited. Neither experts nor novices had practiced on the cadaver model or were familiar with the structure of our curriculum before the study commenced.
A set of 5 tasks was adapted from the Fundamentals of Laparoscopic Surgery (FLS) curriculum, a valid and reliable curriculum developed by the Society of American Gastrointestinal and Endoscopic Surgeons.10
Task 1. Nondominant to dominant hand sharp peg transfer: Colored peg, with sharp end, to be lifted from 1 paracolic gutter using nondominant hand; transferred in midair to dominant hand and placed in other paracolic gutter
Task 2. Dominant to nondominant hand sharp peg transfer: Above-mentioned exercise to be started with dominant hand
Task 3. Simulated appendicectomy: Mesentery to be cut flush off 5-cm to 6-cm segment of divided small bowel (avoiding any perforations to bowel). This segment of bowel to be endolooped and divided. This task incorporates pattern cut and endoloop tasks of FLS
Task 4. Intracorporeal knot tying: The mesenteric rifts to be closed using 2-Vicryl suture with 3 square knots
Task 5. Extracorporeal knot tying: The mesenteric rifts to be closed using 2-PDS suture with 4 square knots
Novices and experts watched a live demonstration of the above-mentioned 5 tasks, performed by a national laparoscopic trainer, followed by an interactive session to clarify any doubts. In addition, a DVD of this demonstration was available for the trainees and experts to refer to during the technical skills exercise sessions, if needed.
Novices performed 10 repetitions of 1 task before they progressed on to perform the rest of the tasks mentioned above.
Participants worked in pairs (as operator and camera assistant). The operator rested for 2.5 min after each repetition. In addition, the operator and camera assistant took a 20-min break and swapped roles after completing 10 consecutive repetitions of 1 task or after a 1-h practice block, whichever came first. This was to ensure adequate rest and allow distributed training.11,12 The whole exercise was spread over 2 d for trainees.
Experts performed 3 repetitions of each task in a similar manner. An invigilator was present during training sessions to ensure no undue help was taken from the assistant and to document the number of times any participant referred to the demonstration DVD. All repetitions were video recorded for later analysis by 2 blinded assessors.
All of the experts’ and trainees’ recorded procedures were split into individual repetition clips using iMovie (Apple, USA) software. These were then randomised and coded using a computer-generated random number list to ensure blinding of assessors. The expert and trainee videos were kept in the same pool of clips. Hence, the assessor (postlaparoscopic fellowship consultant surgeon) knew neither the identity of the performer nor the order of repetition.
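The coding step can be illustrated with a minimal sketch. It assumes each clip is identified by a label of our own invention (performer, task, repetition) and uses a hypothetical helper, `code_clips`; the study itself reports only that a computer-generated random number list was used, so this is one possible way to do it, not the procedure actually followed.

```python
import random

def code_clips(clip_ids, seed=None):
    """Assign every clip a unique random 4-digit code (hypothetical helper).

    Returns (key, review_order):
      key          -- dict mapping the real clip label to its code; kept away
                      from the assessors
      review_order -- the codes sorted numerically; because the codes are
                      random, this order carries no information about the
                      performer or the repetition number
    """
    rng = random.Random(seed)
    codes = rng.sample(range(1000, 10000), len(clip_ids))  # unique codes
    key = dict(zip(clip_ids, codes))
    review_order = sorted(codes)
    return key, review_order

# Illustrative labels only; these are not the study's file names.
clips = [f"{who}-task4-rep{r}" for who in ("novice01", "expert01") for r in (1, 2, 3)]
key, review_order = code_clips(clips, seed=2024)
```

Expert and novice clips go into the same call, mirroring the single pool of clips described above.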
Autonomy was to be assessed as per the domain in the GOALS scale, and this was to be achieved by assessing the help required or requested from the invigilator or from the need to reference the training DVD. This was to be mentioned on coded video clips for the assessors to make an inference. As no novice referred to the DVD nor asked for help from the invigilator, the autonomy domain was removed from the assessment.
To check the reliability of assessments, scores of the principal assessor were compared with those of a second assessor for the task of intracorporeal knot tying. The second assessor was equivalent to the primary assessor in terms of surgical experience (postlaparoscopic fellowship consultant grade surgeon) and scored these blinded clips using the same assessment scale.
The benchmark score for each task was obtained by calculating the mean performance score of both experts for the 3 repetitions, excluding the values lying beyond 2 standard deviations from the mean. The trimmed mean was used as the benchmark score for safe performance.
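The benchmark calculation can be sketched as follows. This is a minimal illustration, assuming the sample standard deviation is used (the text does not say whether the sample or population form was applied); the function name `benchmark_score` and the scores are hypothetical and are not study data.

```python
from statistics import mean, stdev

def benchmark_score(expert_scores, k=2.0):
    """Mean of the expert scores after dropping values more than k SD from the mean.

    Assumption: the SD is the sample standard deviation (n - 1 denominator).
    """
    m = mean(expert_scores)
    s = stdev(expert_scores)
    kept = [x for x in expert_scores if abs(x - m) <= k * s]
    return mean(kept)

# Hypothetical scores for 2 experts x 3 repetitions of one task.
# The single low score of 10 lies more than 2 SD from the mean and is dropped.
hypothetical_scores = [18, 18, 18, 18, 18, 10]
benchmark = benchmark_score(hypothetical_scores)
```

With only 6 scores per task, a value can lie more than 2 sample SD from the mean only when it is an isolated extreme, so in many cases the trimmed mean equals the ordinary mean.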
Any significant difference in consecutive scores for the first 3 repetitions between novice and expert groups was taken as a measure of construct validity.
Comparison of each of the 10 repetition scores of the novices’ group with the benchmark score was used to assess the number of task-specific repetitions required for trainees to reach the benchmarked “safe level” on all construct valid tasks.
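As an illustration of this comparison, the sketch below computes a 1-sample t statistic for each repetition and reports the first repetition at which the novices' scores no longer differ significantly from the benchmark. The function names and the data are hypothetical. The critical value 2.262 is the two-sided 5% point of Student's t distribution with 9 degrees of freedom, which assumes 10 novices per repetition.

```python
from math import sqrt, inf
from statistics import mean, stdev

T_CRIT_DF9 = 2.262  # two-sided 5% critical value, 9 degrees of freedom

def t_statistic(scores, benchmark):
    """One-sample t statistic of the scores against a fixed benchmark."""
    diff = mean(scores) - benchmark
    sd = stdev(scores)
    if sd == 0:  # all scores identical: avoid dividing by zero
        return 0.0 if diff == 0 else (inf if diff > 0 else -inf)
    return diff / (sd / sqrt(len(scores)))

def first_comparable_repetition(scores_by_repetition, benchmark, t_crit=T_CRIT_DF9):
    """1-based index of the first repetition not significantly different
    from the benchmark, or None if every repetition differs."""
    for rep, scores in enumerate(scores_by_repetition, start=1):
        if abs(t_statistic(scores, benchmark)) < t_crit:
            return rep
    return None

# Hypothetical scores for 10 novices over 2 repetitions (not study data).
rep1 = [10, 10, 10, 10, 10, 11, 11, 11, 11, 11]
rep2 = [15, 16, 17, 15, 16, 17, 15, 16, 17, 16]
reached_at = first_comparable_repetition([rep1, rep2], benchmark=16)
```

A nonsignificant result shows only that no difference was detected, which with 10 novices is a modest standard of evidence that the benchmark was reached.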
The data were analysed with SPSS version 17 (SPSS, Chicago, Illinois, USA). The Mann-Whitney U test was used to compare the performance of novices and experts to determine construct validity. A 1-sample t test was used to compare the benchmark score with the novice group's scores for each of the 10 repetitions. Inter-rater reliability was assessed using Kendall's τ-b and Spearman's rho tests. P < .050 was considered statistically significant.
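For readers unfamiliar with the Mann-Whitney U test, the statistic itself can be computed by counting pairwise comparisons between the 2 groups. The sketch below is a plain-Python illustration with a hypothetical function name and made-up scores; it does not reproduce the SPSS analysis, and the P value would still need to be obtained from exact tables or a statistical package.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the (x, y) pairs in which the x score is higher,
    with tied pairs counting one half. U_x + U_y = len(x) * len(y).
    """
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = len(x) * len(y) - u_x
    return u_x, u_y

# Hypothetical scores (not study data): experts vs novices on one task.
expert_scores = [18, 17, 19, 18, 16, 18]
novice_scores = [14, 13, 15, 12, 14, 16, 13, 14, 15, 12]
u_expert, u_novice = mann_whitney_u(expert_scores, novice_scores)
```

The smaller of the 2 values is conventionally reported as U; a value near 0 means the groups barely overlap.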
The median (IQR) performance scores of experts were significantly higher than those of novices for nondominant to dominant hand peg transfer [15(13) vs 12(2); P = .006]; simulated appendicectomy [18(3) vs 14(3); P = .005]; intracorporeal knot tying [15.5(5) vs 12(3); P = .001]; and extracorporeal knot tying [16.5(3.75) vs 14(3.25); P = .033].
The scores achieved by experts for each task are shown in Table 1. The mean expert scores for the tasks of simulated appendicectomy, intracorporeal and extracorporeal knots were readjusted by trimming the values lying outside 2 SD of the group mean to obtain the benchmark scores.
The performance score of the novice group did not show any significant difference for the task of dominant to nondominant hand peg transfer compared to the benchmark score even for the first repetition [mean novices’ performance score 10.7, benchmark score 11.25; P = .569] or for subsequent repetitions (P > .05).
The mean scores and comparisons to benchmark scores for the remaining 4 tasks are shown in Tables 2A to 2D. For each of these tasks, the difference between the novices' mean score and the benchmark score became nonsignificant (that is, the scores became comparable) between the eighth and tenth repetitions.
Laparoscopic surgery training in the operating theater is often unstructured.16 The present era of time-restricted training requires trainees to acquire basic skills rapidly to progress to advanced training as soon as possible. Teaching complex skills like these on patients presents formidable obstacles as the teaching surgeon's role as an assistant is more limited than in open surgery.7 It has also been demonstrated that operating room training of junior surgeons is time consuming, resulting in increased cost.17,18
The value of practice on cadavers before proceeding to supervised live patient surgery and ultimately independent practice has been acknowledged by trainers.1 Traditionally, embalmed human cadaver specimens have been used in medical school dissection classes. The stiffening and discoloration caused by full embalming are unavoidable but unacceptable for surgical training.19 Trainees learning the visual and tactile components should work on tissues that have the same appearance and handling qualities as living tissue.19 The closest anatomical model to a live patient is the fresh human cadaver. These considerations make a strong case for using models such as human cadavers to train surgeons to a safe level before they operate on patients.
To provide a realistic learning experience, our lab uses human FFCs. FFCs can be kept indefinitely in freezers at temperatures between –17°C and –20°C. To defrost the cadavers for use, they are placed in a refrigerated area at a temperature between +3°C and +5°C. Defrosting takes up to 2 wk before the cadavers are ready for use. When this technique is used, the skeletal and visceral tissues retain the colors found in the living body. The compliance of tissues and the ease or difficulty in separating one structure from another resemble those of living tissue.
The usefulness of fresh cadavers to practice surgical skills is unique. The trainee has to practice maneuvers with real-life challenges, like proximity to other viscera and the inherent danger of injury to them; similar tissue compliance; similar depth perception and abdominal pressures; and similar tactile feedback of tissues along with the challenge of dealing with a similar color scheme intraabdominally. None of these can be replicated fully in any physical or virtual model of training.
A previous study on lightly embalmed cadavers for use in gynecology skills training evaluated the reduction in handling and manipulation times during procedure-specific maneuvers.7 We measured the performance score rather than the speed of the operation, because it was considered more important for a junior trainee to perform a procedure following appropriate steps than simply encouraging speed of completion, which would be best learned during the next level of training. Efficiency of movements is certainly to be encouraged, the ultimate outcome of which would be more rapid completion of tasks in subsequent repetitions.
Eversbusch et al.,20 in their randomized trial on psychomotor training on virtual reality simulated colonoscopy, found that learning curves reached a plateau for experienced surgeons, senior trainees, and novices after the second, fifth, and seventh repetitions, respectively. Greater realism in training was shown for FFC than for virtual reality simulators in a previous study.21 Hence, we made novices perform 10 repetitions of psychomotor skills on FFC, anticipating improvement within this period.
The GOALS scale, developed by McGill University Health Centre, Canada, can be used for evaluating performance (with construct validity and reliability) on synthetic simulation tasks and live animal models.14
Gumbs et al.13 concluded in their study that the same scale can be used to assess laparoscopic appendicectomy with construct validity, and Vassiliou et al.15 concluded that it can be used to assess laparoscopic skills based on videotaped performance. The GOALS scale modification (removal of autonomy) was necessary in our study to ensure blinding, because the performance was assessed from video recordings rather than by direct observation. Autonomy is an important domain for assessing a trainee's performance, and a provision was made during our experiment to record the level of autonomy for trainees. This was captured by a record kept by the invigilator of the number of times the trainee referred to the demonstration DVD. This was to be mentioned on the coded video clip for the assessor to make an inference. Since no trainee or expert referred to the demonstration DVD, this domain was removed from the scale in our study. This could be explained by the cognitive training (live demonstration and interactive session) that preceded the technical exercises, and it suggests that adequate cognitive training is of paramount importance for learning technical skills effectively and should precede technical skills training. We recommend that if the assessments are done in a face-to-face setting, the domain of autonomy should be retained in the assessment scale.
Our method of establishing construct validity is consistent with previous studies on the subject.2,16,22 The method of establishing a benchmark score using a trimmed mean score of 2 experts is also consistent with the previous study by Ritter et al.,10 who established benchmark scores for the FLS curriculum using 2 experts. Similarly, to make the proficiency levels achievable for novices in a reasonable amount of time and to compensate for the fact that the expert group consisted of only 2 experts, levels for all the tasks were set at 2 SD from the determined means. This allows for an increased chance of including the true mean performance scores of a larger sampling of expert surgeons while not setting the performance bar too high for the novices.10
The task of “dominant to nondominant hand peg transfer” did not show significant improvement with subsequent repetitions. This may be because it was too short a task to be effectively scored using an explicit GOALS scale. The other reason could be that the novices had performed 10 repetitions of nondominant to dominant hand peg transfer before embarking on the task. The similarity of the 2 tasks and repeated practice of that task could have improved the novices' dexterity enough to negate any statistical difference from the expert group. The transfer of peg from nondominant to dominant hand was, however, found to have construct validity and showed significant improvement with subsequent repetitions.
There are a few other identifiable limitations of training on the cadaver model compared to in vivo training. The most important one is the lack of bleeding on the cadaver model. Hence, the scenario of “iatrogenic bleeding control” cannot be created to make the trainees deal with bleeding disasters. Work is underway to overcome this limitation.
The cost and legal constraints make cadaveric training a very precious resource. The challenge for future expansion of FFC training rests on the limited number of licensed cadaver training centers. The cost of cadavers for this study was reduced by ensuring their use for a range of disciplines, including colorectal, orthopedics, otolaryngology, cardiovascular, and urology training courses. The creation of a simulated appendicectomy model using the small bowel, as shown in our study, ensures that the task is repeatable and integrates the practice of several basic skills within a full task. Such initiatives can ensure maximum use of cadavers.
Our study demonstrates that the use of fresh human cadavers is a very effective model to train junior trainees in basic laparoscopic skills up to a safe level within a short time period. Valid assessment scales can successfully differentiate between novice and expert surgeons on the fresh human cadaver model, suggesting construct validity.
The authors would like to thank Mr. Tony Fouweather, Statistician, Newcastle University, for his help with statistical analysis. The authors also thank Dr. Roger Searle, Director, Anatomy and Clinical Skills, Newcastle University, and Mr Julian Hance, SpR, Leeds Royal Infirmary, for their invaluable advice.
Mitesh Sharma, Newcastle Surgical Training Centre, Department of General Surgery, Freeman Hospital NHS Trust, Newcastle Upon Tyne, UK.
David Macafee, Department of General Surgery, The James Cook University Hospital, Middlesbrough, UK.
Nagarajan Pranesh, Newcastle Surgical Training Centre, Department of General Surgery, Freeman Hospital NHS Trust, Newcastle Upon Tyne, UK.
Alan F. Horgan, Newcastle Surgical Training Centre, Department of General Surgery, Freeman Hospital NHS Trust, Newcastle Upon Tyne, UK.