To track pharmacy student knowledge over time in an accelerated doctor of pharmacy program, using a proprietary software program, as a component of curricular assessment.
All students were required to complete a computerized comprehensive diagnostic examination 3 times during the doctor of pharmacy (PharmD) program: at the beginning of the second year, and near the end of the second and third years. The examination comprised 100 questions in 3 content areas: pharmacotherapy, preparation and dispensing of medications, and providing health care information. Within-subject differences in mean area and total percent scores were compared.
Based on 123 students' data, mean scores for pharmacotherapy and total percent scores for examination 1 were significantly different from examinations 2 and 3.
The computer-based comprehensive diagnostic examination shows promise for use as a component of a comprehensive assessment plan.
Progress assessments or examinations in pharmacy education have gained considerable attention, as evidenced by the amount of coverage in the pharmacy literature in recent years.1-4 A progress examination has been defined as a method of assessing both the acquisition and retention of knowledge at 1 or more points in the curriculum relative to curricular goals and objectives.1 In 2007, the assessment committee at the Massachusetts College of Pharmacy and Health Sciences (MCPHS) School of Pharmacy–Worcester/Manchester embarked on a search for an assessment tool that could be administered at key intervals in its accelerated, year-round, 34-month PharmD degree program to help diagnose curricular areas of strengths and weaknesses, so that timely and appropriate interventions could be initiated. During the first 24 months of the PharmD program, except for laboratory courses, all core courses were delivered via real-time distance education technology between the 2 MCPHS campuses located in Worcester, MA, and Manchester, NH. Approximately 85% of these courses were taught synchronously from the Worcester to the Manchester campus (ie, Worcester students received face-to-face instruction in 85% of the core courses in the first 2 years of the curriculum) and the other 15% from the Manchester to the Worcester campus. The advanced pharmacy practice experiences (APPEs) comprised the final 10 months of the program. The assessment committee wanted to find an assessment tool that could measure the knowledge acquisition of students on the Worcester and Manchester campuses as they progressed through the curriculum.
Initial discussions within the assessment committee focused on: (1) whether to use examination results to determine a student's preparedness to continue into subsequent courses such as APPEs or to obtain an ongoing assessment of a student's knowledge and skills for both the student and faculty members (ie, summative or formative assessment); (2) whether to develop an in-house examination or purchase a commercially available examination; and (3) whether the tool should be administered at the same time to the 3 classes periodically or to each class at key intervals throughout the 34-month curriculum (ie, cross-sectional versus longitudinal study design). The committee deliberated on the pros and cons of developing an in-house examination (ie, questions written by faculty members based primarily on their expertise or the course content taught) and determined that, while such an examination would most closely reflect faculty members' input and contribution to the curriculum, the process of developing the examination, establishing its psychometric properties, and keeping it current and relevant, would be time consuming and costly when weighed against the benefits of the information obtained. Ultimately, the committee decided to examine commercially available assessment tools and developed a list of selection criteria.
In addition to the tool providing useful and meaningful evidence of students' knowledge acquisition, the committee used the following criteria for selecting an assessment tool which it called the comprehensive diagnostic examination: comprehensive in scope (ie, the tool tests a broad range of content areas, especially those covered in the pharmacy curriculum); flexible to use across 3 campuses (the third campus was MCPHS-Boston); scores would be available in a timely manner; requires minimum involvement of non-academic departments (eg, registrar, information services); and incurs no additional direct financial cost to the students and minimum cost to MCPHS. The committee examined multiple resources that were commercially available to prepare pharmacy students for the North American Pharmacist Licensure Examination (NAPLEX). The majority of the resources were in print format and offered paper-and-pencil practice tests; the committee preferred those that were available electronically, ie, computer-based, considering that the NAPLEX uses the computer-adaptive testing model.5 Further, the practice test items in many of the resources often tested knowledge of a specific or limited content area (eg, chapter topic).
The assessment committee selected the commercially available Comprehensive Pharmacy Review NAPLEX Preparation CD-ROM.6 A member of the committee initiated contact with the publisher's electronic media specialist, mediated questions from the information services department, obtained information on the cost and terms for 3 network licenses, and facilitated the purchase of the software program. In 2007, the cost for 3 network licenses was $1,855. Each network license agreement allowed uploading of the contents of the CD-ROM to an unlimited number of computers at each campus facility.
The CD-ROM has over 1,000 questions that correspond to the review questions from each of the 61 chapters of the review book7 and from 2 practice examinations that also are available in a book format.8 Each review question is classified into 1 of the 3 NAPLEX competency statements published in the NAPLEX Blueprint:7-10 area 1, assess pharmacotherapy to ensure safe and effective therapeutic outcomes (approximately 54% of the examination); area 2, assess safe and accurate preparation and dispensing of medications (approximately 35%); and area 3, assess, recommend, and provide health care information that promotes public health (approximately 11%).
Another feature of the software program is that the user has the capability to build a test that matches the percentage distribution of questions across the 3 distinct content areas in the NAPLEX Blueprint as shown above. Thus, the comprehensive diagnostic examination is a 100-item examination that contains approximately 54 questions in area 1, 35 questions in area 2, and 11 questions in area 3, randomly selected from the pool of over 1,000 review questions or the 2 practice examinations. The individualized score report for each area displays a variety of information: (1) a summary section that shows the total number of questions answered, total number of correct answers, and percent correct; (2) a detailed section that shows the same information by chapter or subtopic; and (3) a “categories” section that shows the same information by NAPLEX competency area. At its most basic level, the examination may be able to measure students' knowledge levels in each of the 3 competency areas at select times in the curriculum.
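The blueprint-matching test-building feature described above can be illustrated with a short sketch. This is not the CD-ROM's actual implementation; the pool sizes, identifiers, and function names below are hypothetical, and only the 54/35/11 item distribution comes from the text.

```python
import random

# Target item counts per NAPLEX Blueprint area (from the text: 54/35/11).
BLUEPRINT = {"area_1": 54, "area_2": 35, "area_3": 11}


def build_exam(pools, blueprint, seed=None):
    """Randomly draw the blueprint-specified number of questions per area."""
    rng = random.Random(seed)
    exam = []
    for area, n_items in blueprint.items():
        exam.extend(rng.sample(pools[area], n_items))
    return exam


# Toy question pools standing in for the 1,000+ review questions.
pools = {
    "area_1": [f"q1_{i}" for i in range(540)],
    "area_2": [f"q2_{i}" for i in range(350)],
    "area_3": [f"q3_{i}" for i in range(110)],
}

exam = build_exam(pools, BLUEPRINT, seed=42)
assert len(exam) == 100
```

Because each area is sampled independently, every generated 100-item examination preserves the blueprint's percentage distribution regardless of which specific questions are drawn.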
One of the limitations of the software program is that the psychometric properties of the review questions or practice examinations are not known. It is for this reason that the assessment committee decided to use it for diagnostic purposes only and not as a high-stakes or capstone examination that determines student progression in the curriculum.
A longitudinal study design was chosen to track the knowledge acquisition of each student cohort. The examination was administered at 3 points in the accelerated, 34-month curriculum. Examination 1 was administered at the start of the second year. At this point, students had completed most of the building-block courses (eg, basic sciences, pharmaceutical sciences, social and administrative sciences, introductory clinical sciences) and were about to begin a 3-course series in pharmacotherapeutics and a 3-course series in pharmacology-toxicology-medicinal chemistry. Examination 2 was administered at the end of the second year, following the completion of these course series. Examination 3 occurred at the end of the fourth 6-week APPE; at Worcester/Manchester, the third year comprised six 6-week APPEs (Table 1). This cohort study was approved by the MCPHS Institutional Review Board.
The students first learned about the details of the examination during an orientation session to the second year of the program. All students were mandated to take the examination. Any student with a scheduling conflict had to notify the office of student affairs, obtain an excused absence, and make arrangements to take the examination on another date and time. Several months in advance of the examination, the registrar was contacted to obtain the most current roster of members of each student cohort. These students were e-mailed a “save the date” notice. The note included a brief description of the purposes of the examination, ie, to provide the student with an individualized self-assessment of his/her knowledge level at select times in the curriculum, to provide the faculty members with valuable feedback regarding students' knowledge levels at select times in the curriculum, and to meet the types of feedback and assessment activities required by regional and professional accrediting bodies. The note also described the examination format, highlighting the fact that it was akin to a mock NAPLEX in that it was a timed (2.5 hours), computer-based examination reflecting the appropriate percentage distribution of questions across the 3 distinct competency areas in the NAPLEX Blueprint. The students also were informed that the examination was mandatory but would not affect their grade point average or progression in the PharmD program. The message to the students also stated that “in order for the results to be meaningful to you and the faculty, you are required to complete these examinations. The results will only be of use to you and the Worcester/Manchester faculty if a sincere effort is made to take the examinations to the best of your ability.” Students were notified of their examination room assignments after the add/drop period so that such assignments could take students' schedules into account. 
A reminder e-mail was sent to the students about 4 weeks prior to the examination.
A call for examination proctors was e-mailed to the faculty members at the same time that the “save-the-date” notice was sent to the students. Approximately 2-3 proctors were needed for every 30-40 students taking the examination. As this was not a high-stakes examination, proctors were not present to prevent academic dishonesty. Rather, they were needed to assist with technical aspects of the examination administration, especially to ensure that the correct default printer was set up and individualized score reports for each competency area were printed. The CD-ROM was not designed to save multiple-user data in the network server at each campus (another limitation of the software program). Since data collection was such a critical piece to this longitudinal study, students were instructed to raise their hands upon completion of a competency area so that proctors could assist with the printing of individualized score reports. Two copies were printed; 1 copy was given to the student and the other was kept by the proctor. The latter was used to enter data manually into a database for analysis.
As this was a computer-based examination, rooms with computer terminals had to be reserved with the registrar several months in advance of the examination. On the Worcester campus, there were 2 rooms, each of which had 40 to 45 computers. On the Manchester campus, the campus library, which had 30 computers, was used. Although the examination was available to faculty members at all times via a link on the network server at each campus, student access was limited and only granted during examination times by the information services department. Arrangements also had to be made with the information services department so that printing fees for the individualized score reports were waived.
On the day of the examination, the room was set up to minimize confusion during examination administration. Each student was assigned a specific computer terminal at which pencils, blank paper, and step-by-step instructions for taking the examination were placed. These instructions included color screenshots for the windows observed at each step. Faculty proctors were also provided with copies of these instructions prior to the examination so that they could review the process ahead of time. Each examination began with a verbal summary of the purposes of the examination and additional directions regarding the technical aspects of the examination (eg, set up the default printer, print 2 copies of each individualized score report, view patient profiles using the scroll button, use calculator button/ function).
Data were analyzed using SPSS, version 17.0 (SPSS, Chicago, IL).11 A 1-way repeated measures ANOVA was performed to compare within-subject differences in mean area and total percent scores across the 3 examinations. Mauchly's test was used to test the assumption of sphericity of the data points. When significant differences among the mean percent scores were indicated by the ANOVA test (F ratio and Wilks' Λ), post hoc analyses (ie, multiple pairwise comparisons using Bonferroni adjustment) were performed to identify which pairs of mean percent scores differed significantly.
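The analysis was run in SPSS, but the underlying computations can be sketched in a few lines. The following is a minimal re-implementation, not the authors' procedure: a one-way repeated measures ANOVA computed from sums of squares, plus Bonferroni-adjusted paired t-tests for the post hoc pairwise comparisons. Function names and the toy data are illustrative.

```python
import numpy as np
from itertools import combinations
from scipy import stats


def rm_anova(scores):
    """One-way repeated measures ANOVA on a (subjects x conditions) array.

    Partitions total variability into condition, subject, and error
    components; returns the F ratio and its p value."""
    n, k = scores.shape
    grand = scores.mean()
    ss_cond = n * ((scores.mean(axis=0) - grand) ** 2).sum()
    ss_subj = k * ((scores.mean(axis=1) - grand) ** 2).sum()
    ss_error = ((scores - grand) ** 2).sum() - ss_cond - ss_subj
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f = (ss_cond / df_cond) / (ss_error / df_error)
    return f, stats.f.sf(f, df_cond, df_error)


def bonferroni_pairwise(scores):
    """Paired t-tests for every pair of conditions, with p values
    multiplied by the number of comparisons (Bonferroni adjustment)."""
    pairs = list(combinations(range(scores.shape[1]), 2))
    results = []
    for i, j in pairs:
        _, p = stats.ttest_rel(scores[:, i], scores[:, j])
        results.append((i + 1, j + 1, min(p * len(pairs), 1.0)))
    return results
```

For example, with 4 hypothetical students whose percent scores rise sharply between examinations 1 and 2 but barely between 2 and 3, `rm_anova` yields a large F ratio, and the adjusted pairwise p values localize the difference to the pairs involving examination 1, mirroring the pattern reported in the Results.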
The first cohort of students completed examination 1 in fall 2007, examination 2 in summer 2008, and examination 3 in spring 2009. Data that were included in the analysis needed to fit the following inclusion criteria: (1) data for an individual student must be complete for all 3 examinations (3 area scores per examination for a total of 9 area scores), and (2) the number of questions answered in each area must fall within the following pre-specified ranges: area 1, ≥ 52 and ≤ 54 questions; area 2, ≥ 33 and ≤ 35 questions; and area 3, ≥ 9 and ≤ 11 questions. The latter criterion was used to exclude “outliers” from the analysis, that is, the area scores of students who failed to follow instructions and may have answered the software's default number of 854 questions in area 1; 351 questions in area 2; or 47 questions in area 3. One-hundred and eighty-eight students took examination 1; after application of the inclusion criteria, data from 123 students were included in the analysis. Some of the reasons why 65 students' data (34.6% of the cohort) were excluded from analysis were irregular student status (ie, failure to maintain membership in the cohort due to academic or nonacademic problems) and incomplete data (eg, failure to capture or print 1 or more area scores).
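The two inclusion criteria above translate directly into a screening function. The sketch below assumes a simple record layout (one dictionary per student mapping examination number to per-area `(answered, correct)` pairs); the field names are hypothetical, but the 9-score completeness check and the per-area answered-count ranges are taken from the text.

```python
# Pre-specified acceptable ranges for the number of questions answered
# in each area (criterion 2 in the text).
RANGES = {"area_1": (52, 54), "area_2": (33, 35), "area_3": (9, 11)}


def meets_criteria(student):
    """Return True if a student's record satisfies both inclusion criteria.

    student: {exam_number: {area: (answered, correct)}} for exams 1-3.
    """
    for exam in (1, 2, 3):
        areas = student.get(exam)
        if areas is None:
            return False  # criterion 1: all 3 examinations must be present
        for area, (lo, hi) in RANGES.items():
            if area not in areas:
                return False  # criterion 1: all 9 area scores must be present
            answered, _ = areas[area]
            if not lo <= answered <= hi:
                return False  # criterion 2: excludes "outliers" who answered
                              # the software's default question counts
    return True
```

A student who, say, answered the default 854 area 1 questions on any examination fails the range check and is excluded, exactly as the criteria intend.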
Area percent score was calculated by dividing the number of correct answers by the number of questions answered and multiplying the quotient by 100. Mauchly's test statistics were nonsignificant; the assumption of sphericity of the data points was met in all data analyses. (For example, for area 1 percent scores, Mauchly's W = 0.983, χ2(2), p = 0.350.) Table 2 displays the results of the post hoc analyses using Bonferroni adjustment for multiple comparisons. Each row shows the pair of mean percent scores compared (columns 1 and 2), the mean difference between the paired mean percent scores and its corresponding standard error (column 3), the significance value (column 4), and the 95% confidence interval for the difference (column 5).
Area 1 items tested knowledge of topics related to assessing pharmacotherapy to ensure safe and effective therapeutic outcomes and comprised about 54% of the examination. Analysis of the mean percent scores for area 1 showed significant differences across the 3 examinations (p < 0.001; Wilks' Λ, p < 0.001). However, the ANOVA results did not specify where these differences existed (ie, between examinations 1 and 2, 2 and 3, or 1 and 3). As shown in Table 2, post hoc comparisons indicated a significant difference in the mean area 1 percent scores between examinations 1 and 2 and between examinations 1 and 3, but not between examinations 2 and 3.
Area 2 items tested knowledge of topics related to assessing safe and accurate preparation and dispensing of medications, and comprised about 35% of the examination. Analysis of the mean percent scores for area 2 showed no significant differences between the means across the 3 examinations (p = 0.190; Wilks' Λ, p = 0.182). Thus, follow-up post hoc analyses were not conducted.
Area 3 items tested knowledge of topics related to assessing, recommending, and providing health care information that promotes public health, and comprised about 11% of the examination. Analysis of the mean percent scores for area 3 showed no significant differences between the means across the 3 examinations (p = 0.455; Wilks' Λ, p = 0.425). Thus, follow-up post hoc analyses were not conducted.
Weighted and unweighted mean total percent scores for each examination were calculated, the former to account for the difference in the number of test items in each area (54 items in area 1; 35 items in area 2; and 11 items in area 3) which could artificially inflate a total percent score. As stated previously, only students with a complete set of area scores (3 area scores per examination for a total of 9 area scores) were included in the analysis. The weighted mean total percent score for an examination was calculated by dividing the total number of correct answers in the 3 areas by the total number of questions answered in the 3 areas (which ranged from 96 to 100 questions). The unweighted mean total percent score for an examination was calculated as the average of the 3 area percent scores.
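The two total-score definitions above differ only in how the 3 areas are pooled, which the following worked example makes concrete. The student's `(answered, correct)` counts are hypothetical; the formulas are those stated in the text.

```python
# One hypothetical student's results: area -> (questions answered, correct).
areas = {"area_1": (54, 38), "area_2": (35, 27), "area_3": (11, 9)}

total_answered = sum(a for a, _ in areas.values())  # 100 questions
total_correct = sum(c for _, c in areas.values())   # 74 correct

# Weighted total: pooled over all questions answered, so every item
# counts equally and larger areas contribute proportionally more.
weighted = 100 * total_correct / total_answered

# Unweighted total: simple average of the 3 area percent scores, so each
# area counts equally regardless of how many items it contains.
unweighted = sum(100 * c / a for a, c in areas.values()) / 3
```

Here the weighted total is 74.0%, while the unweighted total is about 76.4%, because the small, high-scoring area 3 (9/11 ≈ 81.8%) pulls the simple average up; this is exactly the inflation the weighted calculation guards against.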
There were significant differences between mean total percent scores across the 3 examinations for both weighted (p < 0.001; Wilks' Λ, p < 0.001) and unweighted means (p < 0.001; Wilks' Λ, p < 0.001). As shown in Table 2, post hoc comparisons of mean total percent scores for both weighted and unweighted means indicated a significant difference between examinations 1 and 2 and between examinations 1 and 3, but not between examinations 2 and 3.
The purpose of this cohort study was to evaluate the use of the Comprehensive Pharmacy Review NAPLEX Preparation CD-ROM6 for assessing the knowledge acquisition of Worcester/Manchester students as they progressed through the 34-month, accelerated PharmD curriculum. The assessment committee wanted to know if this software program would provide evidence that, if triangulated with other assessment-related information, it could be used to identify strengths and weaknesses of the curriculum so that timely and appropriate interventions could be initiated.
A cursory look at the mean percent scores for examination 1 in areas 1, 2, and 3 (Table 2) indicated that students had the lowest mean percent scores in area 1. A possible explanation for higher examination 1 mean percent scores in areas 2 and 3 relative to area 1 was that students were more knowledgeable in these areas as a result of their successful completion of the first-year courses in the 34-month curriculum (Table 1). The first-year courses (49 semester hours) include a mix of basic science courses (eg, biochemistry series, human physiology and pathophysiology series, immunology), pharmaceutical science courses (eg, pharmaceutics series, 1 including a laboratory component; pharmacokinetics-biopharmaceutics), administrative science courses (pharmacy law, health care delivery-public health, pharmacy management-outcomes assessment), and introductory pharmacy practice courses (introduction to pharmaceutical care series, 1 including a laboratory component; self-care therapeutics). Thus, this computer-based program seemed to be able to measure the impact of first-year courses on students' knowledge levels, as indicated by relatively higher examination 1 mean percent scores in areas 2 and 3 than in area 1. This supposition could be evaluated more definitively by administering the examination to students prior to the start of their first year to obtain baseline data, which the assessment committee chose not to do after deliberating the benefits and types of information it could obtain.
The mean percent scores for examination 1 were significantly different from the mean percent scores for examinations 2 and 3 in area 1 only. Students completed examination 1 at the beginning of their second year and examination 2 at the end of their second year. The bulk of the second year (29 of 47 semester hours, 4 of which are for 2 elective courses) in the 34-month curriculum consists of a 3-course series in pharmacotherapeutics (total of 18 semester hours) and a 3-course series in pharmacology-toxicology-medicinal chemistry (total of 11 semester hours). The second-year courses appeared to have contributed to students' knowledge acquisition of area 1 topics, as indicated by the significant increase in area 1 scores from examination 1 to examination 2, but not of areas 2 and 3 topics. Students also appeared to retain their knowledge of area 1 topics in their third year, as indicated by a significant difference between their examination 1 and 3 scores (examination 3 was administered in the third year, approximately 16 months after examination 1). Thus, this computer-based program seemed to be able to measure the impact of second-year courses on students' knowledge levels in areas that might have been predicted based on the curriculum content (Table 1).
The impact of the third-year courses on students' knowledge levels, as measured by the difference between examination 2 and 3 scores, was inconclusive. The third year in the 34-month curriculum is comprised of six 6-week APPEs (36 semester hours). The students completed examination 3 at the end of the fourth 6-week APPE, approximately 6 months after taking examination 2. In most cases, the mean percent scores increased slightly but not significantly from examination 2 to examination 3. The 6-month period between examinations may have been too short to measure change in knowledge, if any, in specific areas. The lack of a significant change in scores also may have been a function of the type of learning that occurs during the third year, comprised solely of APPEs, where the practice and application of previously acquired knowledge and skills are emphasized more than the acquisition of new knowledge. Also, the questions in the software program may not test higher-level thinking, per Bloom's Taxonomy of Thinking Skills.12
The use of weighted and unweighted means to calculate mean total percent score yielded comparable results; with either method, the mean total percent scores were significantly different between examinations 1 and 2 and between examinations 1 and 3, but not between examinations 2 and 3.
Compared to the progress examinations described in the pharmacy literature in recent years, the comprehensive diagnostic examination is different in many ways. At the college of pharmacy at the University of Houston, the faculty developed their own case-based, multiple-choice questions for their MileMarker Assessments.2 MileMarker I and II were administered at the beginning of the second and third years, and MileMarker III near the end of the third year in the 4-year curriculum. The examinations were both formative and summative in nature. Performance on MileMarkers I and II did not impact progression in the program, but students who failed them had to remediate identified areas of weakness; students who failed MileMarker III did not progress into their APPEs until they were successful.2 The Touro University California College of Pharmacy also developed its own progress examination program, called the Triple Jump Examination.4 Students took the examination 4 times, ie, at the end of each semester of the first 2 years of the 4-year curriculum. The examination had 3 components: 2 case-based, written examinations (one closed-book and the other open-book) and an objective structured clinical examination (OSCE). The examination also had formative and summative elements; it also had been used to evaluate the readiness of students for APPEs.4 Thus, unlike the MileMarker Assessments and the Triple Jump Examination, which were locally developed in-house assessment tools, the comprehensive diagnostic examination uses multiple-choice questions, some of which are case based, from a commercially available NAPLEX review software program. The comprehensive diagnostic examination, specifically examination 3, was administered beyond the completion of the didactic portion of the curriculum, ie, at the end of the fourth 6-week APPE.
Most importantly, rather than to determine student progression into APPEs, the results of the examination were intended to be used for diagnostic purposes only, ie, to help students self-assess their areas of strengths and weaknesses as they progressed through the 34-month curriculum, and to help faculty members identify curricular areas of strengths and weaknesses.
Students' knowledge levels were measured using a computer-based program that can be purchased readily by the students. More importantly, the program has unknown psychometric properties. Further, the 1,000+ review questions and the questions in the 2 practice examinations in the software program were taken at face value and were not individually scrutinized for correctness of NAPLEX competency area classification or for accuracy and relevance of content (eg, clarity of the stem and answer choices).
Another limitation is that the results presented in this paper were based on a cohort of 123 pharmacy students whose knowledge levels were measured at 3 different times over a period of about 16 months. Even though a longitudinal study design offers more power than a cross-sectional study design to study changes in students' knowledge acquisition, the generalizability of the results of this study outside of this first cohort of students is debatable. More cohorts of students need to be studied longitudinally to see if the same results are produced, before the utility of the CD-ROM as an assessment tool can be determined fully. With additional cohorts, studying and comparing trends across cohorts over time will also be possible.
In this study, the depth of analyses and reporting of knowledge levels was confined to the 3 competency categories or content areas. Until it is determined that the computer-based program can provide the type of information that could be used to help diagnose curricular areas of strengths and weaknesses in ongoing curricular assessment initiatives to improve the curriculum, the assessment committee is wary of expanding the depth of data analyses to the level of subcategories or subtopics (shown in the “detailed” section of the individualized score report) that correspond to the 61 chapters of the review book.7
How students perceived or used this assessment process is not known, ie, whether it provided valuable constructive feedback and whether they used their individualized score reports to identify and monitor their areas of strengths and weaknesses. These areas need to be explored.
The computer-based program seemed to show an impact of first-year courses on students' knowledge levels, as indicated by higher examination 1 mean percent scores in areas 2 and 3 relative to area 1. Areas 2 and 3 questions pertain mostly to the preparation and dispensing of medications, and provision of health care information to promote public health, while those in area 1 are related to pharmacotherapy and its outcomes. The program also was able to show significant within-subject differences in area 1 mean percent scores between examinations 1 and 2 (about a 10-month interval between examinations) and between examinations 1 and 3 (about a 16-month interval), but not between examinations 2 and 3 (about a 6-month interval). In light of these findings, this program shows promise for use as an assessment tool. However, its utility for longitudinally tracking and assessing students' knowledge acquisition as they progress through the curriculum needs to be further studied with more student cohorts. At the Worcester/Manchester campuses, the use of the examination on 2 additional cohorts of students is underway.
The examination, by itself, may be a rudimentary tool for assessing the areas of strengths and weaknesses of the PharmD curriculum. To enhance confidence in these findings, the assessment committee needs to corroborate them with findings from ensuing cohorts and findings from other assessment tools, as components of a comprehensive assessment plan.
The authors would like to thank the members of the assessment committee and the faculty members who volunteered to proctor the comprehensive diagnostic examinations.