Reliable and valid written tests of higher cognitive function are difficult to produce, particularly for the assessment of clinical problem solving. Modified Essay Questions (MEQs) are often used to assess these higher-order abilities in preference to other forms of assessment, including multiple-choice questions (MCQs). MEQs often form a vital component of end-of-course assessments in higher education, yet it is not clear how effectively these questions assess higher-order cognitive skills. This study was designed to assess the effectiveness of MEQs in measuring higher-order cognitive skills at an undergraduate institution.
An analysis of multiple-choice questions and modified essay questions (MEQs) used for summative assessment in a clinical undergraduate curriculum was undertaken. A total of 50 MCQs and 139 stages of MEQs, drawn from three exams run over two years, were examined. The effectiveness of each question was determined by two assessors and was defined by its ability to measure higher cognitive skills, as determined by a modification of Bloom's taxonomy, and by its quality, as indicated by the presence of item-writing flaws.
Over 50% of all of the MEQs tested factual recall. This was similar to the percentage of MCQs testing factual recall. The modified essay question failed in its role of consistently assessing higher cognitive skills whereas the MCQ frequently tested more than mere recall of knowledge.
Constructing MEQs that assess higher-order cognitive skills cannot be assumed to be a simple task. Well-constructed MCQs should be considered a satisfactory replacement for MEQs if the MEQs cannot be designed to adequately test higher-order skills. Such MCQs are capable of withstanding the intellectual and statistical scrutiny imposed by a high-stakes exit examination.
Objectives: To evaluate Multiple Choice and Short Essay Question items in Basic Medical Sciences by determining item writing flaws (IWFs) of MCQs along with cognitive level of each item in both methods.
Methods: This analytical study evaluated the quality of the assessment tools used for the first batch in a newly established medical college in Karachi, Pakistan. First- and sixth-module assessment tools in Biochemistry during 2009-2010 were analyzed. The cognitive levels of MCQs and SEQs were noted, and MCQ item-writing flaws were also evaluated.
Results: A total of 36 SEQs and 150 four-option MCQs were analyzed. The cognitive level of 83.33% of SEQs was at the recall level, while the remaining 16.67% assessed interpretation of data. Seventy-six percent of the MCQs were at the recall level, while the remaining 24% were at the interpretation level. Regarding IWFs, 69 were found in the 150 MCQs. The commonest among them were implausible distracters (30.43%), unfocused stems (27.54%) and unnecessary information in the stem (24.64%).
Conclusion: There is a need to review the quality, including the content, of assessment tools. A structured faculty development program is recommended for developing improved assessment tools that align with learning outcomes and measure the competency of medical students.
Assessment; MCQ; SEQ; Item analysis
Objective: To evaluate assessment system of the 'Research Methodology Course' using utility criteria (i.e. validity, reliability, acceptability, educational impact, and cost-effectiveness). This study demonstrates comprehensive evaluation of assessment system and suggests a framework for similar courses.
Methods: Qualitative and quantitative methods were used to evaluate the course assessment components (50 MCQs, 3 Short Answer Questions (SAQs) and a research project) against the utility criteria. Results of multiple evaluation methods for all the assessment components were collected and interpreted together to arrive at holistic judgments, rather than judgments based on individual methods or individual assessments.
Results: Face validity, evaluated using a self-administered questionnaire (response rate 88.7%), showed that students perceived an imbalance in the content covered by the assessment; this was confirmed by the assessment blueprint. Construct validity was affected by the low correlation between MCQ and SAQ scores (r=0.326). There was a higher correlation between the project and MCQ (r=0.466) and SAQ (r=0.463) scores. Construct validity was also affected by the presence of recall-type MCQs (70%; 35/50), item-construction flaws and non-functioning distractors. High discrimination indices (>0.35) were found in MCQs with moderate difficulty indices (0.3-0.7). Reliability of the MCQs was 0.75, which could be improved to 0.8 by increasing the number of MCQs to at least 70. A positive educational impact was found in the form of the research project assessment driving students to present and publish their work in conferences and peer-reviewed journals. The cost per student to complete the course was US$164.50.
Conclusions: The multi-modal evaluation of an assessment system is feasible and provides thorough and diagnostic information. Utility of the assessment system could be further improved by modifying the psychometrically inappropriate assessment items.
Assessment; Evaluation; Utility Criteria; Research Course
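The reliability projection reported in this abstract (0.75 with 50 MCQs, rising to about 0.8 with 70) follows the standard Spearman-Brown prophecy formula. A minimal sketch, reproducing the abstract's figures (the function name is ours, not the paper's):

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predict reliability after multiplying test length by `length_factor`."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Abstract's figures: a 50-item MCQ paper with reliability 0.75, lengthened to 70 items.
predicted = spearman_brown(0.75, 70 / 50)
print(round(predicted, 2))  # 0.81, consistent with the reported "up to 0.8"
```

Note that the formula assumes the added items are psychometrically equivalent to the existing ones, which is why the abstract's "at least 70" is a lower bound rather than a guarantee.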
In the regional movement toward the ASEAN Economic Community (AEC), medical professionals, including physicians, can qualify to practice medicine in another country. Ensuring comparable, excellent medical qualification systems is crucial, but the availability and analysis of relevant information have been lacking.
This study had the following aims: 1) to comparatively analyze information on Medical Licensing Examinations (MLE) across ASEAN countries and 2) to assess stakeholders’ view on potential consequences of AEC on the medical profession from a Thai perspective.
To search for relevant information on MLE, we started with each country's national body as the primary data source. Where data were not available, secondary sources were used, including official websites of medical universities, colleagues in international and national medical student organizations, and other appropriate Internet sources. The feasibility of these sources, and concerns about their validity and reliability, were discussed among the investigators. Experts in the region, invited through HealthSpace.Asia, conducted the final data validation. For the second objective, in-depth interviews were conducted with 13 Thai stakeholders, purposively selected using a maximum variation sampling technique to represent the points of view of the medical licensing authority, the medical profession, ethicists and economists.
MLE systems exist in all ASEAN countries except Brunei, but vary greatly. Although the majority have a national MLE system, Singapore, Indonesia, and Vietnam accept the results of MLEs conducted at universities. Thailand adopted the USA's three-step approach, which checks pre-clinical knowledge, clinical knowledge, and clinical skills; most countries, however, require only one step. The multiple choice question (MCQ) is the most commonly used method of assessment; the modified essay question (MEQ) is the next most common. Although both tests assess candidates' knowledge, the Objective Structured Clinical Examination (OSCE) is used to verify the clinical skills of the examinee. Whether a medical license reflects a consistent and high standard of medical knowledge is a sensitive issue because of potentially unfair movement of physicians and an embedded sense of domination, at least from a Thai perspective.
MLE systems differ across ASEAN countries in some important aspects that might be of concern from a fairness viewpoint and therefore should be addressed in the movement toward AEC.
AEC; medical licensing examination; medical qualification; medical education; medical practice
Assessment has a powerful influence on curriculum delivery. Medical instructors must use tools which conform to educational principles, and audit them as part of curriculum review.
To generate information to support recommendations for improving curriculum delivery.
Pre-clinical and clinical departments in a College of Medicine, Saudi Arabia.
A self-administered questionnaire was used in a cross-sectional survey to determine whether the assessment tools in use met basic standards of validity, reliability and currency, and whether feedback to students was adequate. Cost, feasibility and tool combinations were excluded.
Thirty-one (out of 34) courses were evaluated. All 31 respondents used MCQs, especially one-best (28/31) and true/false (13/31) formats. Test questions were mostly selected by groups of teachers. Pre-clinical departments drew equally on “new” (10/14) and “used” (10/14) MCQs; clinical departments relied on “banked” MCQs (16/17). Departments decided pass marks (28/31) and chose the College-set 60%; the timing was pre-examination in 13/17 clinical departments but post-examination in 5/14 pre-clinical departments. Of six essay users, five used model answers but only one did double marking. OSCE was used by 7/17 clinical departments; five provided checklists. Only 3/31 used an optical reader. Post-marking review was done by 13/14 pre-clinical but only 10/17 clinical departments. Difficulty and discrimination indices were determined by only 4/31 departments. Feedback was provided by 12/14 pre-clinical and 7/17 clinical departments. Only 10/31 course coordinators had copies of the examination regulations.
The single-best-answer MCQ, if properly constructed and adequately critiqued, is the preferred tool for assessing the theory domain. However, there should be fresh questions, item analyses, comparisons with previous results, optical reader systems and double marking. Departments should use the OSCE or OSPE more often. Long essays, true/false, fill-in-the-blank and more-than-one-correct-answer formats can be safely abolished. Departments or teams should set test papers and take decisions collectively. Feedback rates should be improved. A Center of Medical Education, including an Examination Center, is required. Fruitful future studies could examine repeat audit, the use of “negative questions” and the number of MCQs per test paper. A comparative audit involving other regional medical schools may be of general interest.
Assessment Technique; Curriculum review; MCQ
Evidence-Based Medicine (EBM) is an important competency for the healthcare professional. Experimental evidence of EBM educational interventions from rigorous research studies is limited. The main objective of this study was to assess EBM learning (knowledge, attitudes and self-reported skills) in undergraduate medical students with a randomized controlled trial.
The educational intervention was a one-semester EBM course in the 5th year of a public medical school in Mexico. The study design was an experimental parallel-group randomized controlled trial for the main outcome measures in the 5th-year class (M5 EBM vs. M5 non-EBM groups), and quasi-experimental with static-group comparisons for the 4th-year (M4, not yet exposed) and 6th-year (M6, exposed six months to a year earlier) groups. EBM attitudes, knowledge and self-reported skills were measured using Taylor’s questionnaire and a summative exam comprising a 100-item multiple-choice question (MCQ) test.
A total of 289 medical students were assessed: M5 EBM=48, M5 non-EBM=47, M4=87, and M6=107. There was a higher reported use of the Cochrane Library and secondary journals in the intervention group (M5 vs. M5 non-EBM). Critical appraisal skills and attitude scores were higher in the intervention group (M5) and in the group of students exposed to EBM instruction during the previous year (M6). The knowledge level was higher after the intervention in the M5 EBM group compared to the M5 non-EBM group (p<0.001, Cohen's d=0.88 with Taylor's instrument and 3.54 with the 100-item MCQ test). M6 students who received the intervention in the previous year had a knowledge score higher than the M4 and M5 non-EBM groups, but lower than the M5 EBM group.
Formal medical student training in EBM produced higher scores in attitudes, knowledge and self-reported critical appraisal skills compared with a randomized control group. Data from the concurrent groups add validity evidence to the study, but rigorous follow-up needs to be done to document retention of EBM abilities.
Evidence-based medicine; Undergraduate medical education; Curriculum development; Educational assessment; Critical appraisal skills
Multiple-choice question (MCQ) examinations are increasingly used to assess theoretical knowledge in large class-size modules in many life science degrees. MCQ tests can objectively measure factual knowledge, ability and high-level learning outcomes, but may also introduce gender bias in performance depending on topic, instruction, scoring and difficulty. The ‘Single Answer’ (SA) test is often used, in which students choose one correct answer and are therefore unable to demonstrate partial knowledge. Negative marking eliminates the chance element of guessing but may be considered unfair. Elimination testing (ET) is an alternative form of MCQ that discriminates between all levels of knowledge while rewarding the demonstration of partial knowledge. Comparisons of performance and gender bias in negatively marked SA and ET tests have not previously been performed in the life sciences. Our results show that, under negative marking conditions, life science students were significantly advantaged by answering the MCQ test in elimination format rather than single-answer format, because partial knowledge of topics was rewarded. Importantly, we found no significant difference in performance between genders in either cohort for either MCQ test under negative marking conditions. Surveys showed that students generally preferred ET-style MCQ testing over SA-style testing. Students reported feeling more relaxed taking ET MCQs and more stressed when sitting SA tests, while disagreeing that they were distracted by thinking about the best tactics for scoring highly. Students agreed that ET testing improved their critical thinking skills. We conclude that appropriately designed MCQ tests do not systematically discriminate between genders. We recommend careful consideration in choosing the type of MCQ test, and propose applying negative scoring conditions to each test type to avoid the introduction of gender bias.
The student experience could be improved by incorporating elimination answering methods into MCQ tests, rewarding both partial and full knowledge.
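The contrast between single-answer and elimination formats can be made concrete with per-item scoring rules. Published ET marking schemes vary, and the abstract does not specify the one used, so the mark values below are purely illustrative assumptions:

```python
# Illustrative per-item scoring for a 4-option MCQ. The specific mark values
# (+3/-1 for SA; +1 per eliminated distractor, -3 for eliminating the key)
# are assumptions for this sketch, not the scheme from the study.

def score_sa(chosen: str, key: str) -> float:
    """Negatively marked single answer: full credit or a penalty, nothing between."""
    return 3.0 if chosen == key else -1.0

def score_et(eliminated: set, key: str, n_options: int = 4) -> float:
    """Elimination testing: credit each correctly eliminated distractor,
    penalize eliminating the correct answer."""
    if key in eliminated:
        return float(-(n_options - 1))
    return float(len(eliminated))

# Full knowledge: eliminating all three distractors matches a correct SA answer.
print(score_et({"A", "B", "D"}, key="C"))  # 3.0
# Partial knowledge: confidently eliminating one distractor still earns credit,
# which the SA format cannot express.
print(score_et({"A"}, key="C"))  # 1.0
```

The sketch shows why ET "discriminates between all levels of knowledge": scores form a graded scale between the SA test's all-or-nothing extremes.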
Objective: The purpose of the study was to identify technical item flaws in the multiple choice questions submitted for the final exams for the years 2009, 2010 and 2011.
Methods: This descriptive analytical study was carried out at Islamic International Medical College (IIMC). Data were collected from the MCQs submitted by the faculty for the final exams for the years 2009, 2010 and 2011, and were compiled and evaluated by a three-member assessment committee. Data were analyzed for frequencies and percentages; categorical data were analyzed using the chi-square test.
Results: The overall percentage of flawed items was 67% for 2009, of which 21% were for testwiseness and 40% for irrelevant difficulty. In 2010, total item flaws were 36%, with 11% for testwiseness and 22% for irrelevant difficulty. The 2011 data showed decreased overall flaws of 21%: testwiseness flaws were 7% and irrelevant-difficulty flaws were 11%.
Conclusion: Technical item flaws are frequently encountered during MCQ construction, and the identification of flaws leads to improved quality of single-best MCQs.
Frequency; Item writing flaws; Testwiseness
To assess students' attitudes, perceptions and feedback on teaching–learning methodology and evaluation methods in pharmacology.
Materials and Methods:
One hundred and forty second-year medical students studying at Smt. Kashibai Navale Medical College, Pune, were selected and administered a pre-validated questionnaire containing 22 questions. Suggestions were also sought regarding the qualities of good pharmacology teachers and modifications to pharmacology teaching methods. Descriptive statistics were used and results were expressed as percentages.
The majority of students found the cardiovascular system (49.25%) the most interesting topic in pharmacology, whereas most opined that the cardiovascular system (60.10%), chemotherapy (54.06%) and the central nervous system (44.15%) would be the most useful topics in internship. Clinical/patient-related pharmacology was preferred by 48.53% of students, and 39.13% suggested the use of audiovisual-aided lectures. Prescription writing and prescription criticism were among the most useful and interesting exercises in practical pharmacology. Students expressed interest in microteaching and problem-based learning, whereas seminars, demonstrations on manikins and museum studies were mentioned as good adjuvants to routine teaching. Multiple Choice Question (MCQ) practice tests and theory viva at the end of each system, along with periodic written tests, were mentioned as effective evaluation methods. Students showed considerable interest in gathering information on recent advances in pharmacology and suggested including new-drug information alongside prototype drugs in a comparative manner.
Undergraduate pharmacology teaching would benefit from a few microteaching sessions and more clinically oriented problem-based learning, with MCQ-based revision at the end of each class.
Evaluation methods; medical students; pharmacology; teaching–learning methodology
Multiple choice questions (MCQs) are frequently used to assess students in different educational streams because of their objectivity and wide coverage in less time. However, the MCQs used must be of high quality, which depends on the difficulty index (DIF I), discrimination index (DI) and distractor efficiency (DE).
To evaluate MCQs (items) and develop a pool of valid items by assessing them with DIF I, DI and DE, and to revise, store or discard items based on the results obtained.
The study was conducted in a medical school in Ahmedabad.
Materials and Methods:
An internal examination in Community Medicine was conducted after 40 hours of teaching during the 1st MBBS year and was attended by 148 of 150 students. A total of 50 MCQs (items) and 150 distractors were analyzed.
Data were entered and analyzed in MS Excel 2007; simple proportions, means, standard deviations and coefficients of variation were calculated, and the unpaired t-test was applied.
Out of 50 items, 24 had “good to excellent” DIF I (31 - 60%) and 15 had “good to excellent” DI (> 0.25). The mean DE was 88.6%, considered ideal/acceptable, and non-functional distractors (NFDs) made up only 11.4%. The mean DI was 0.14. Poor DI (< 0.15), with negative DI in 10 items, indicates poor preparedness of students and problems with the framing of at least some of the MCQs. An increased proportion of NFDs (incorrect alternatives selected by < 5% of students) in an item decreases DE and makes the item easier. Fifteen items had 17 NFDs, while the remaining items had no NFDs and a mean DE of 100%.
The study emphasizes the selection of quality MCQs that truly assess knowledge and can correctly differentiate students of different abilities.
Difficulty index; discrimination index; distractor efficiency; multiple choice question or item; nonfunctional distractor (NFD); teaching evaluation
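The three indices this abstract relies on are simple functions of examinee response counts. A minimal sketch using the abstract's thresholds (DIF I as percentage correct, DI from upper/lower group performance, NFDs as distractors chosen by under 5% of examinees); the response data below are hypothetical, not from the study:

```python
# Item-analysis sketch for one MCQ. Thresholds follow the abstract; the
# response counts are invented for illustration.

def difficulty_index(correct: int, total: int) -> float:
    """DIF I: percentage of examinees answering the item correctly."""
    return 100.0 * correct / total

def discrimination_index(upper_correct: int, lower_correct: int,
                         group_size: int) -> float:
    """DI: (upper-group correct - lower-group correct) / group size."""
    return (upper_correct - lower_correct) / group_size

def nonfunctioning_distractors(choice_counts: dict, key: str,
                               total: int) -> list:
    """NFDs: incorrect options selected by fewer than 5% of examinees."""
    return [opt for opt, n in choice_counts.items()
            if opt != key and n / total < 0.05]

counts = {"A": 80, "B": 40, "C": 22, "D": 6}   # hypothetical; key = "A", 148 examinees
total = sum(counts.values())
print(round(difficulty_index(counts["A"], total), 1))   # 54.1 -> within the 31-60% band
print(discrimination_index(32, 18, 40))                  # 0.35 -> "good to excellent" DI
print(nonfunctioning_distractors(counts, "A", total))    # ['D'] chosen by ~4% of examinees
```

An item like this would be retained: acceptable difficulty and discrimination, with one NFD ("D") flagged for replacement to restore full distractor efficiency.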
Characterizing and comparing cognitive skills assessed by introductory biology and physics indicate that (a) both course sequences assess primarily lower-order cognitive skills, (b) the distribution of items across cognitive skill levels differs significantly, and (c) there is no strong relationship between student performance and cognitive skill level.
Assessments and student expectations can drive learning: students selectively study and learn the content and skills they believe critical to passing an exam in a given subject. Evaluating the nature of assessments in undergraduate science education can, therefore, provide substantial insight into student learning. We characterized and compared the cognitive skills routinely assessed by introductory biology and calculus-based physics sequences, using the cognitive domain of Bloom's taxonomy of educational objectives. Our results indicate that both introductory sequences overwhelmingly assess lower-order cognitive skills (e.g., knowledge recall, algorithmic problem solving), but the distribution of items across cognitive skill levels differs between introductory biology and physics, which reflects and may even reinforce student perceptions typical of those courses: biology is memorization, and physics is solving problems. We also probed the relationship between level of difficulty of exam questions, as measured by student performance and cognitive skill level as measured by Bloom's taxonomy. Our analyses of both disciplines do not indicate the presence of a strong relationship. Thus, regardless of discipline, more cognitively demanding tasks do not necessarily equate to increased difficulty. We recognize the limitations associated with this approach; however, we believe this research underscores the utility of evaluating the nature of our assessments.
Exams are essential components of medical students’ knowledge and skill assessment during their clinical years of study. The paper provides a retrospective analysis of validity evidence for the internal medicine component of the written and clinical exams administered in 2012 and 2013 at King Abdulaziz University’s Faculty of Medicine.
Students’ scores for the clinical and written exams were obtained. Four faculty members (two senior members and two junior members) were asked to rate the exam questions, including MCQs and OSCEs, for evidence of content validity using a rating scale of 1–5 for each item.
Cronbach’s alpha was used to measure the internal consistency reliability. Correlations were used to examine the associations between different forms of assessment and groups of students.
A total of 824 students completed the internal medicine course and took the exam. The numbers of rated questions were 320 and 46 for the MCQ and OSCE, respectively. Significant correlations were found between the MCQ section, the OSCE section, and the continuous assessment marks, which include 20 long-case presentations during the course; participation in daily rounds, clinical sessions and tutorials; the performance of simple procedures, such as IV cannulation and ABG extraction; and the student log book.
Although the OSCE exam was reliable for the two groups that had taken the final clinical OSCE, the clinical long- and short-case exams were not reliable across the two groups that had taken the oral clinical exams. The correlation analysis showed a significant linear association between the raters with respect to evidence of content validity for both the MCQ and OSCE, r = .219 P < .001 and r = .678 P < .001, respectively, and r = .241 P < .001 and r = .368 P = .023 for the internal structure validity, respectively. Reliability measured using Cronbach’s alpha was greater for assessments administered in 2013.
The pattern of relationships between the MCQ and OSCE scores provides evidence of the validity of these measures for use in the evaluation of knowledge and clinical skills in internal medicine. The OSCE exam is more reliable than the short- and long-case clinical exams and requires less effort on the part of examiners and patients.
Validity; Assessment; Undergraduate medical education
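The internal-consistency statistic used throughout this abstract, Cronbach's alpha, can be computed directly from an item-score matrix. A minimal sketch with hypothetical dichotomous data (the matrix is invented for illustration, not taken from the exams described):

```python
# Cronbach's alpha for dichotomously scored items: rows are examinees,
# columns are items. Data below are hypothetical.

def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)."""
    k = len(scores[0])                       # number of items

    def var(xs):                             # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in scores]) for i in range(k)]
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(round(cronbach_alpha(data), 2))  # 0.8
```

Values around 0.8, as reported for the 2013 assessments here and targeted in the research-course abstract earlier, are conventionally taken as adequate for course-level examinations.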
The postgraduate training program in psychiatry in Saudi Arabia, established in 1997, is a 4-year residency program. Written exams comprising multiple choice questions (MCQs) are used as a summative assessment of residents to determine their eligibility for promotion from one year to the next. Test blueprints are not used in preparing examinations.
To develop test blueprints for the written examinations used in the psychiatry residency program.
Based on the guidelines of four professional bodies, documentary analysis was used to develop global and detailed test blueprints for each year of the residency program. An expert panel participated during piloting and final modification of the test blueprints. Their opinion about the content, weightage for each content domain, and proportion of test items to be sampled in each cognitive category as defined by modified Bloom’s taxonomy were elicited.
Eight global and detailed test blueprints, two for each year of the psychiatry residency program, were developed. The global test blueprints were reviewed by experts and piloted. Six experts participated in the final modification of the test blueprints. Based on expert consensus, the content, total weightage for each content domain, and proportion of test items to be included in each cognitive category were determined for each global test blueprint. Experts also suggested progressively decreasing the weightage for recall test items and increasing problem-solving test items in examinations from year 1 to year 4 of the psychiatry residency program.
A systematic approach using a documentary and content analysis technique was used to develop test blueprints with additional input from an expert panel as appropriate. Test blueprinting is an important step to ensure the test validity in all residency programs.
test blueprinting; psychiatry; residency program; summative assessment; documentary and content analysis; Kingdom of Saudi Arabia
This paper attempts to provide a guide for improving the quality of Multiple Choice Questions (MCQs) used in undergraduate and postgraduate assessment. The MCQ is the most frequently used type of assessment worldwide. Well-constructed, context-rich MCQs have a high reliability per hour of testing. Avoiding technical item flaws is essential to improve the validity evidence of MCQs. Technical item flaws are essentially of two types: (i) those related to testwiseness and (ii) those related to irrelevant difficulty. A list of such flaws is presented, together with a discussion of each flaw and examples, to make the paper learner-friendly. The paper was designed to be interactive, with self-assessment exercises followed by key answers with explanations.
Pitfalls; assessment; student
Introduction: Active learning strategies have been documented to enhance learning. We created an active learning environment in neuromuscular physiology lectures for first-year medical students by using the ‘pause procedure’.
Materials and Methods: A class of one hundred and fifty first-year medical students was divided into two groups (Group A and Group B), taught in separate classes. Each lecture for Group A (the experimental group) was divided into short presentations of 12-15 min each. Each presentation was followed by a pause of 2-3 min, three times in a 50-min lecture. During the pauses, students worked in pairs to discuss and rework their notes; any queries were directed to the teacher and discussed forthwith. At the end of each lecture, students were given 2-3 minutes to write down the key points they remembered from the lecture (free recall). Fifteen days after completion of the lectures, a 30-item MCQ test was administered to measure long-term recall. Group B (the control group) received the same lectures without the pause procedure and was tested in the same way.
Results: Students in the experimental group did significantly better on the MCQ test (p<0.05) than those in the control group. Most students (83.6%) agreed that the pause procedure helped them enhance lecture recall.
Conclusion: The pause procedure is a good active-learning strategy that helps students review their notes, reflect on them, and discuss and explain the key ideas with their partners. Moreover, it requires only 6-7 min of classroom time and can significantly enhance student learning.
Active learning strategies; Control group; Experimental group; Long term recall; Pause procedure
Four- or five-option multiple choice questions (MCQs) are the standard in health-science disciplines, both on certification-level examinations and on in-house developed tests. Previous research has shown, however, that few MCQs have three or four functioning distractors. The purpose of this study was to investigate non-functioning distractors in teacher-developed tests in one nursing program in an English-language university in Hong Kong.
Using item-analysis data, we assessed the proportion of non-functioning distractors on a sample of seven test papers administered to undergraduate nursing students. A total of 514 items were reviewed, including 2056 options (1542 distractors and 514 correct responses). Non-functioning options were defined as ones that were chosen by fewer than 5% of examinees and those with a positive option discrimination statistic.
The proportion of items containing 0, 1, 2, and 3 functioning distractors was 12.3%, 34.8%, 39.1%, and 13.8% respectively. Overall, items contained an average of 1.54 (SD = 0.88) functioning distractors. Only 52.2% (n = 805) of all distractors were functioning effectively and 10.2% (n = 158) had a choice frequency of 0. Items with more functioning distractors were more difficult and more discriminating.
The low frequency of items with three functioning distractors in the four-option items in this study suggests that teachers have difficulty developing plausible distractors for most MCQs. Test items should consist of as many options as is feasible given the item content and the number of plausible distractors; in most cases this would be three. Item analysis results can be used to identify and remove non-functioning distractors from MCQs that have been used in previous tests.
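The two non-functioning criteria used in this study combine naturally into a single check per distractor: low choice frequency, or a positive option discrimination (the option attracts more high scorers than low scorers, the opposite of what a distractor should do). A minimal sketch; the counts in the usage lines are hypothetical:

```python
# Classify one distractor as functioning or non-functioning using the
# abstract's two criteria: chosen by < 5% of examinees, or a positive
# option-discrimination statistic (upper-group minus lower-group choosers,
# divided by group size). Example counts are invented for illustration.

def is_nonfunctioning(freq: int, total: int,
                      upper_choosers: int, lower_choosers: int,
                      group_size: int) -> bool:
    low_frequency = freq / total < 0.05
    option_discrimination = (upper_choosers - lower_choosers) / group_size
    return low_frequency or option_discrimination > 0

# Chosen by 12/200 examinees, mostly weaker students: a functioning distractor.
print(is_nonfunctioning(12, 200, upper_choosers=2, lower_choosers=8,
                        group_size=54))  # False
# Chosen by only 4/200 examinees: non-functioning under the 5% rule.
print(is_nonfunctioning(4, 200, upper_choosers=1, lower_choosers=2,
                        group_size=54))  # True
```

Running this check over every distractor in an item bank is exactly the item-analysis step the conclusion recommends for pruning options from previously used MCQs.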
The aim of this study was to evaluate the efficacy of a new psychiatry clerkship curriculum which was designed to improve the knowledge and skills of medical students of Tehran University of Medical Sciences (TUMS), Iran.
This quasi-experimental study was conducted in two consecutive semesters from February 2009 to January 2010. In total, 167 medical students participated in the study. In the first semester, as the control group, the clerks’ training was based on the traditional curriculum. In the next semester, we constructed and applied a new curriculum based on the SPICES model (student-centered, problem-based, integrated, community-based, elective and systematic). At the end of the clerkship, the students were given two exams: Multiple Choice Questions (MCQ) to assess their knowledge, and an Objective Structured Clinical Examination (OSCE) to assess their skills. Baseline data and test performance for each student were analyzed.
Compared to the control group, students in the intervention group showed significantly higher OSCE scores (P= 0.01). With respect to MCQ score, no significant difference was found between the two groups.
The results suggest that the revised curriculum is more effective than the traditional one in improving the required clinical skills in medical students during their psychiatry clerkship.
Psychiatry; Clerkship; Education; Medical students; Curriculum
Chinese medical universities typically have large student numbers, a shortage of teachers and limited equipment; as a result, histology courses have been taught in traditional lecture-based formats, with textbooks and conventional microscopy. This method, however, has limited the training of creativity and problem-solving skills in the curriculum. The virtual microscope (VM) system has been shown to be an effective and efficient educational strategy. The present study describes a VM system for undergraduates and evaluates its effects on promoting active learning and problem-solving skills.
Two hundred and twenty-nine second-year undergraduate students at the Third Military Medical University were divided into two groups. The VM group contained 115 students and was taught using the VM system; the light microscope (LM) group consisted of 114 students and was taught using the LM system. Post-teaching performance was assessed with multiple-choice questions, short essay questions, case-analysis questions and the identification of tissue structures. Students’ teaching preferences and satisfaction were assessed using questionnaires.
Test scores in the VM group showed a significant improvement compared with those in the LM group (p < 0.05). There were no substantial differences between the two groups in the mean score rate of multiple-choice questions and the short essay category (p > 0.05); however, there were notable differences in the mean score rate of case analysis questions and identification of structure of tissue (p < 0.05). The questionnaire results indicate that the VM system improves students’ productivity and promotes learning efficiency. Furthermore, students reported other positive effects of the VM system in terms of additional learning resources, critical thinking, ease of communication and confidence.
The VM system is an effective assisted teaching platform at Chinese medical universities for promoting undergraduates’ active learning and problem-solving skills.
Virtual microscope; Active learning; Problem-solving skills; Assisted platform; Chinese medical university
The objective of this study was to report on the role of the Trauma Evaluation and Management (TEAM) module devised by the American College of Surgeons in the trauma education of senior medical students.
Twenty-nine medical students who completed their surgical clerkship at the University of Toronto were randomly divided into 2 groups: a control and a TEAM group. All students completed a 20-item multiple-choice questionnaire (MCQ) pre-test. The TEAM group (15 students) took a post-test after completing the TEAM program and the control group (14 students) took the same “post-test” without completing the TEAM program. Students in the control group did complete the TEAM program after taking the post-test, allowing all 29 students to complete a post-module evaluation questionnaire. Paired t-tests were used for within group comparisons and unpaired t-tests for between group comparisons. The results of the evaluation questionnaire were analyzed according to the percentage of response in each of 5 categories of 1 = strongly disagree to 5 = strongly agree, as well as according to the median, range and 95% confidence intervals.
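The within-group and between-group comparisons described above rely on paired and unpaired t statistics. As an illustration only, with made-up scores rather than the study's data, both statistics can be computed in plain Python:

```python
from math import sqrt
from statistics import mean, stdev

def unpaired_t(a, b):
    """Student's two-sample t statistic with pooled variance,
    for between-group comparisons of independent score lists."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))

def paired_t(pre, post):
    """Paired t statistic on pre/post differences,
    for within-group comparisons on the same students."""
    d = [y - x for x, y in zip(pre, post)]
    return mean(d) / (stdev(d) / sqrt(len(d)))

# Hypothetical pre/post MCQ scores for three students:
t = paired_t([46, 48, 45], [80, 83, 81])  # large positive t: scores rose
```

In practice the t statistic would be compared against the t distribution with the appropriate degrees of freedom to obtain the p values reported here; a statistics package normally handles that step.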
The students had similar mean (± standard deviation) scores on the MCQ pre-test (TEAM 46.3 [5.5], control 47.5 [9.9]), but the TEAM group showed a significant (p < 0.05) improvement in their scores after they completed the TEAM program (TEAM 80.7 [11.5], control 44.6 [6.3]). Eight of the 15 students in the TEAM group reached the Advanced Trauma Life Support (ATLS) pass mark of 80%, whereas none in the control group achieved this mark. With respect to the evaluation questionnaire, a score of 4 or greater was assigned by 100% of the students when asked if the objectives were met, 100% when asked if trauma knowledge was improved, 62% when asked whether clinical trauma skills were improved, 100% for overall satisfaction and 100% in recommending that the module be made mandatory in the undergraduate curriculum.
This study demonstrates the teaching effectiveness of the TEAM module. It was also very well accepted by the senior medical students, who unanimously indicated that this module should be mandatory in the undergraduate medical curriculum.
Internationally, tests of general mental ability are used in the selection of medical students. Examples include the Medical College Admission Test, Undergraduate Medicine and Health Sciences Admission Test and the UK Clinical Aptitude Test. The most widely used measure of their efficacy is predictive validity.
A new tool, the Health Professions Admission Test- Ireland (HPAT-Ireland), was introduced in 2009. Traditionally, selection to Irish undergraduate medical schools relied on academic achievement. Since 2009, Irish and EU applicants are selected on a combination of their secondary school academic record (measured predominately by the Leaving Certificate Examination) and HPAT-Ireland score. This is the first study to report on the predictive validity of the HPAT-Ireland for early undergraduate assessments of communication and clinical skills.
Students enrolled at two Irish medical schools in 2009 were followed up for two years. Data collected were gender, HPAT-Ireland total and subsection scores; Leaving Certificate Examination plus HPAT-Ireland combined score, Year 1 Objective Structured Clinical Examination (OSCE) scores (Total score, communication and clinical subtest scores), Year 1 Multiple Choice Questions and Year 2 OSCE and subset scores. We report descriptive statistics, Pearson correlation coefficients and Multiple linear regression models.
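The Pearson correlation coefficients reported in studies of this kind can be sketched in plain Python. The admission-test and MCQ scores below are invented for illustration; the study's own data are not reproduced:

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(sx * sy)

# Hypothetical admission-test scores vs Year 1 MCQ marks:
hpat = [152, 160, 171, 148, 165]
mcq = [58, 63, 70, 55, 66]
r = pearson_r(hpat, mcq)  # close to 1 for these strongly linear illustrative scores
```

A multiple linear regression, as used in the study, additionally partials out the shared variance among predictors, so a predictor that correlates with the outcome on its own can lose significance once the other predictors are in the model.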
Data were available for 312 students. In Year 1 none of the selection criteria were significantly related to student OSCE performance. The Leaving Certificate Examination and Leaving Certificate plus HPAT-Ireland combined scores correlated with MCQ marks.
In Year 2 a series of significant correlations emerged between the HPAT-Ireland and subsections thereof with OSCE Communication Z-scores; OSCE Clinical Z-scores; and Total OSCE Z-scores. However on multiple regression only the relationship between Total OSCE Score and the Total HPAT-Ireland score remained significant; albeit the predictive power was modest.
We found that none of our selection criteria strongly predict clinical and communication skills. The HPAT-Ireland appears to measure ability in domains different to those assessed by the Leaving Certificate Examination. While some significant associations did emerge in Year 2 between the HPAT-Ireland and total OSCE scores, further evaluation is required to establish whether this pattern continues during the senior years of the medical course.
Selection; Medical; Student; Validity; Predictive; HPAT-Ireland; Assessment; Cognitive; Ability
Aim: To determine what standard paediatric medical students would set for examining their peers and how that would compare with the university standard.
Design: Computer marked examination with questionnaire.
Subjects: Students during their final paediatric attachment.
Methods: Students were asked to derive 10 five-branch, negatively marked multiple choice questions (MCQs) to a standard that would fail those without sufficient knowledge. Each set of 10 was then assessed by another student for degree of difficulty and relevance to paediatrics. One year later, student peers sat a mock MCQ examination derived from a random 40 of these questions, unaware that the mock MCQs had been derived by peers.
Measures: Comparison of marks obtained in the mock and final MCQ examinations; student perception of the standard in the two examinations, assessed by questionnaire.
Results: Students derived 439 questions, of which 83% were considered an appropriate standard by a classmate. One year later, 62 students sat the mock examination. The distribution of marks was better in the mock MCQ examination than in the final MCQ examination. Students considered the mock questions to be a more appropriate standard (72% v 31%) and the topics more relevant (88% v 64%) to paediatric medical students. Questions were of a similar clarity in both examinations (73% …).
Conclusions: Students in this study were able to derive an examination of a satisfactory standard for their peers. Involvement of students in deriving examination standards may give them a better appreciation of how standards should be set and maintained.
The advent of newer technology and students’ growing familiarity with it has enabled information providers to introduce newer teaching methods such as audio podcasting in education. Inclusion of audio podcasts as a teaching aid for undergraduate medical or dental students could serve as a useful supplement to make reviewing more convenient and to enhance understanding and recall of the subject matter.
To assess the efficacy of podcasts as a supplementary teaching and learning aid for first-year dental students of Manipal.
To study students’ attitudes towards audio podcasts and perceived utility of podcasts.
This study was conducted at the Manipal College of Dental Sciences, India. The participants were first-year dental students. Live lecture classes were conducted for the students (n=80). The students were then divided randomly into two equal groups of 40 each. Group 1 students (n=40) had a study session followed by a multiple choice question (MCQ) test. This was followed by a podcasting session. Group 2 students had a study session along with an opportunity to listen to a podcast, followed by the test. Following this both groups completed a feedback form intended to assess their perceived utility and attitude towards podcasts. The performance score was analysed using SPSS and an independent sample t test was used to test the significance of differences in the mean score between the two groups.
Our analysis revealed a significant difference (p = 0.000) in the mean score between the two groups. Group 1 scored a mean of 7.95 out of 13 and group 2 scored a mean of 6.05 out of 13. Analysis of the feedback forms showed that 91.3 per cent of the students found the podcasts useful, as they could listen to lecture content repeatedly and at their own convenience. Sixty-three per cent of the students, however, felt that the absence of images and diagrams in podcasts was a disadvantage.
Students benefited when podcasts were used to supplement live lectures and textbook content, as indicated by better student performance in the podcast group. Students also showed a favourable attitude towards the use of podcasts as a supplementary teaching and learning aid.
Audio podcasts; dental students; student attitude
Background: Multiple choice questions (MCQs) are often used in exams of medical education and need careful quality management for example by the application of review committees. This study investigates whether groups communicating virtually by email are similar to face-to-face groups concerning their review process performance and whether a facilitator has positive effects.
Methods: 16 small groups of students were examined, which had to evaluate and correct MCQs under four different conditions. In the second part of the investigation the changed questions were given to a new random sample for the judgement of the item quality.
Results: There was no significant influence of the variables “form of review committee” and “facilitation”. However, face-to-face and virtual groups clearly differed in the required treatment times. The test condition “face to face without facilitation” was generally valued most positively concerning taking over responsibility, approach to work, sense of well-being, motivation and concentration on the task.
Discussion: Face-to-face and virtual groups are equally effective in the review of MCQs but differ in efficiency. Electronic review appears feasible but is hard to recommend because of the long process time and technical problems.
multiple choice questions; MCQ; face-to-face; virtual; facilitation; review committee
This study was carried out to assess the relationship between the various assessment parameters, viz. continuous assessment (CA), multiple choice questions (MCQ), essay, practical and oral, and the overall performance in the first professional examination in Physiology.
Materials and Methods:
The results of all 244 students who sat the examination over 4 years were used. The CA, MCQ, essay, practical, oral and overall performance scores were obtained. All scores were converted to percentages to give each parameter equal weighting.
Analysis showed that the average overall performance was 50.8 ± 5.3. The best average performance was in practical (55.5 ± 9.1), while the lowest was in MCQ (44.1 ± 7.8). In the study, 81.1% of students passed orals, 80.3% passed practical, 72.5% passed CA, 58.6% passed essay, 22.5% passed MCQ and 71.7% of students passed on the overall performance. All assessment parameters significantly correlated with overall performance. Continuous assessment had the strongest correlation (r = 0.801, P = 0.000), while oral had the weakest correlation (r = 0.277, P = 0.000) with overall performance. Essay was the best predictor of overall performance (β = 0.421, P = 0.000), followed by MCQ (β = 0.356, P = 0.000), while practical was the weakest predictor of performance (β = 0.162, P = 0.000).
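For a single predictor, the standardized regression coefficient (β) of the kind reported above reduces to the ordinary least squares slope rescaled by the standard deviations of predictor and outcome, in which case it equals Pearson's r. A minimal sketch in plain Python, with invented scores (the study's multiple regression, fitting all predictors jointly, requires a statistics package):

```python
from statistics import mean, stdev

def ols_slope(x, y):
    """Unstandardized slope of y on x by ordinary least squares."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)

def standardized_beta(x, y):
    """Slope after rescaling both variables to unit variance."""
    return ols_slope(x, y) * stdev(x) / stdev(y)

# Hypothetical essay scores predicting overall performance:
essay = [40, 55, 60, 48, 52]
overall = [45, 53, 58, 50, 51]
beta = standardized_beta(essay, overall)  # lies in [-1, 1] for one predictor
```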
We suggest that the department should uphold the principle of continuous assessment and more effort be made in the design of MCQ so that performance can improve.
Continuous assessment; essay; examination; MCQ; oral; practical
There has been comparatively little consideration of the impact that the changes to undergraduate curricula might have on postgraduate academic performance. This study compares the performance of graduates by UK medical school and gender in the Multiple Choice Question (MCQ) section of the first part of the Fellowship of the Royal College of Anaesthetists (FRCA) examination.
Data from each sitting of the MCQ section of the primary FRCA examination from June 1999 to May 2008 were analysed for performance by medical school and gender.
There were 4983 attempts at the MCQ part of the examination by 3303 graduates from the 19 United Kingdom medical schools. Using the standardised overall mark minus the pass mark, graduates from five medical schools performed significantly better than the mean for the group and five schools performed significantly worse. Males performed significantly better than females in all aspects of the MCQ: physiology, mean difference = 3.0% (95% CI 2.3, 3.7), p < 0.001; pharmacology, mean difference = 1.7% (95% CI 1.0, 2.3), p < 0.001; physics with clinical measurement, mean difference = 3.5% (95% CI 2.8, 4.1), p < 0.001; overall mark, mean difference = 2.7% (95% CI 2.1, 3.3), p < 0.001; and standardised overall mark minus the pass mark, mean difference = 2.5% (95% CI 1.9, 3.1), p < 0.001. Graduates from three medical schools that have undergone the change from traditional to problem-based learning curricula did not show any change in performance in any aspect of the MCQ pre- and post-curriculum change.
Graduates from each of the medical schools in the UK do show differences in performance in the MCQ section of the primary FRCA, but significant curriculum change does not lead to deterioration in postgraduate examination performance. Whilst females now outnumber males taking the MCQ, they are not performing as well as the males.