To predict student performance in an introductory graduate-level biomedical informatics course from application data.
A predictive model built through retrospective review of student records using hierarchical binary logistic regression with half of the sample held back for cross-validation. The model was also validated against student data from a similar course at a second institution.
Earning an A grade (Mastery) or a C grade (Failure) in an introductory informatics course.
The authors analyzed 129 student records at the University of Texas School of Health Information Sciences at Houston (SHIS) and 106 at Oregon Health and Science University Department of Medical Informatics and Clinical Epidemiology (DMICE). In the SHIS cross-validation sample, the Graduate Record Exam verbal score (GRE-V) correctly predicted Mastery in 69.4%. Undergraduate grade point average (UGPA) and underrepresented minority status (URMS) predicted 81.6% of Failures. At DMICE, GRE-V, UGPA, and prior graduate degree significantly correlated with Mastery. Only GRE-V was a significant independent predictor of Mastery at both institutions. There were too few URMS students and Failures at DMICE to analyze. Course Mastery strongly predicted program performance defined as final cumulative GPA at SHIS (n = 19, r = 0.634, r² = 0.40, p = 0.0036) and DMICE (n = 106, r = 0.603, r² = 0.36, p < 0.001).
The authors identified predictors of performance in an introductory informatics course including GRE-V, UGPA and URMS. Course performance was a very strong predictor of overall program performance. Findings may be useful for selecting students for admission and identifying students at risk for Failure as early as possible.
“I believe that we need to break down the walls that exist between scientific disciplines, inside and outside NIH. We need to foster the growth of interdisciplinary teams in order to maximize the enormous potential of research to improve our lives.”
Elias Zerhouni, NIH director 1
Health informatics is an interdisciplinary field without a well-defined undergraduate feeder program. Like similar programs, the School of Health Information Sciences (SHIS) at the University of Texas Health Science Center in Houston (UT-Houston) and the Department of Medical Informatics and Clinical Epidemiology (DMICE) at Oregon Health and Science University admit students with technical (e.g., computer science, engineering) and biomedical (e.g., biology, nursing, medicine) backgrounds. Some students are established professionals seeking to augment skills or change careers; a small, increasing number enroll to acquire mandated informatics competencies, e.g., for the doctor of nursing practice (DNP) 2,3 degree program at UT-Houston.
Demand is growing for appropriately trained biomedical informaticians. 4,5 The AMIA 10 × 10 program (http://www.amia.org) has increased public awareness of informatics training. 6 States such as Texas have passed legislation to encourage informatics in healthcare. 7 Large healthcare employers have also encouraged employees to obtain training to help implement electronic medical records. Thus, a variety of students not previously engaged in informatics are now enrolling in introductory courses. Fewer “typical” informatics students exist and schools cannot assume that students have a common set of skills.
The variety of incoming students poses a challenge to informatics educators. First, admission committees must select students likely to be successful. Second, courses must provide a rigorous introduction to informatics that remains accessible to a variety of students—or alternatively, provide preintroductory curricula that prepare students to succeed in introductory informatics courses. Educators ideally should determine which student background factors predict success. This paper relates student characteristics and prior educational achievement to success in an introductory informatics course in two programs. We also determined if success in an introductory course predicted success in the overall programs.
Graduate student success has been defined variously. Some authors define success using grade point average (GPA) for the first or for all years of graduate school. Others use faculty ratings, or combinations of factors. 8–10 Additional considerations include grades in a particular course, performance on professional licensing examinations, and graduates' publications. For medical students, studies included academic performance (i.e., probation, honors) and scores on the 3-step United States Medical Licensing Examination (USMLE). 8,11 Only a few studies have examined success in graduate-level courses. 12
Three groups of predictors of graduate student performance are commonly used: demographics, prior academic performance, and other factors. Student attributes such as gender, age, marital status, ethnicity and underrepresented minority status (URMS) have been found to predict performance. 13,14 Underrepresented minorities are at higher risk of having at least one adverse academic event such as academic probation or dropping out of the program of study. 14 Programs often require a minimum undergraduate grade point average (UGPA) for admission; it predicts success in a variety of fields, for both Master's and PhD degrees. 9,11,15 Studies also examined overall UGPA, UGPA for the last 60 hours of undergraduate study, and/or grades in courses related to major. 15
Standardized test scores may also predict graduate student success in some fields. 15,16 The Graduate Record Examination (GRE), its subscores [verbal (GRE-V), quantitative (GRE-Q), analytic (GRE-A, pre-2002), and writing (GRE-W, post-2002)], and its subject tests are commonly used, but problems arise when using the GRE-V for non-native English-speaking students. 17 Several studies found that GRE subject tests are highly predictive of graduate school success; however, no GRE subject test focuses specifically on informatics.
Applicants' personal statements and letters of recommendation can potentially correlate with graduate student success. 9,18 Approaches vary in assessing inherently subjective data, such as the letter writer's prominence 18 and rating scales devised by admissions committees. 9 Cognitive factors may influence success in graduate school, such as reading ability 19 and critical thinking skills, 12 although for the general graduate student population, their impact is unclear. Non-cognitive factors, such as interest in the subject, motivation, and persistence are difficult to measure before admission; instruments exist 20 but must be administered to students by each school specifically. Such factors may predict persistence in an educational program as well as academic performance. 8,21
This study addressed two related research questions. First, what factors predict student performance in an introductory informatics course, in terms of the separate binary outcomes of Mastery and Failure? We hypothesized that (1) cognitive factors in admission portfolios could predict Mastery or Failure, and (2) the predictors for Mastery and for Failure are different. Second, does introductory course performance predict program performance more reliably than variables available at the time of admission? We hypothesized that course performance (grade) would predict program performance (overall GPA).
We developed predictive models using data from the SHIS, and cross-validated the models using additional SHIS data. We then tested significant predictors for Mastery (receiving an A grade in the introductory course) at a second institution, using DMICE data. Finally, we tested whether Mastery could predict overall program GPA at both institutions.
Located in the Texas Medical Center, SHIS has 24 faculty members and grants certificates, Master's (MS) and Doctoral (PhD) degrees in health informatics. The SHIS offers joint MS and PhD degrees with the School of Public Health. Certificate students and over half of the Master's students attend part-time; most PhD students are full-time. Nondegree seeking students, some from other academic programs in the Houston area (e.g., UT Nursing School, Baylor College of Medicine) also take SHIS courses. For fall semester 2008, 46 certificate/nondegree seeking students, 30 Master's and 23 PhD students were enrolled. To date SHIS has granted nine PhD and 119 Master's degrees. The analysis included all SHIS students who took the Foundations I course.
The DMICE, a department in the School of Medicine at Oregon Health and Science University, in Portland, OR, has 21 faculty members. It grants certificates as well as Master's (MS, MBI) and PhD degrees in Biomedical Informatics. Fall 2008 enrollment included 69 certificate, 40 Master's and 10 PhD students. The DMICE partners with Portland State University to offer a joint degree in biomedical informatics and computer science. The DMICE also offers a version of its introductory course as part of the AMIA 10 × 10 program. To date, DMICE has granted four PhD and 110 Master's degrees; 106 of these students took the introductory course. In contrast to SHIS, DMICE data were only available for degree-seeking students who had completed their degree.
The introductory course at SHIS, Foundations of Health Information Sciences I (Foundations), was taught for the first four of its 7 years in a conventional manner (face-to-face) during one 3-hour block per week. Students completed three homework assignments, a midterm and final examination. Performance on homework assignments (30% weight) and examinations (70% weight) determined the final grade. Since fall 2006, the course has been taught completely online. The online course included weekly online quizzes, and one homework assignment was removed. Course materials have been updated regularly as course content evolved. Nevertheless, core topics remained substantially unchanged since 2002. The course used the Shortliffe, et al textbook, 22,23 switching from the second to the third edition in fall 2006. With the 2006 transition, the Coiera text 24 was added. Since 2002, students also read selected current primary literature, which varied over time. The online course instruction used Moodle courseware (http://www.moodle.org) and was offered in each of three academic semesters [16 wks in the fall and spring, and 12 wks in the summer (Table 1a, available in online Appendix 1 at http://www.jamia.org)]. Grades in the online course were determined through weekly quizzes (20%), midterm (20%), two assignments (20%) and final examination (40%). The lowest quiz grade was dropped. Since 2002, only minor variations occurred in grading. Final course grades used a normalized scale emphasizing an individual's relative performance compared with other students in the same semester. By UT-Houston policy, only full letter grades (A, B, C, or F) were assigned; F is rarely used and a C grade reflects unsatisfactory graduate level performance.
The introductory course at DMICE, 25 Introduction to Biomedical Informatics, has been taught for 15 years. It was initially taught face-to-face, with an on-line version started in 1999. Since 2006, it was only taught on-line (with on-campus students having a live weekly discussion section and on-line students interacting in threaded discussions). The OHSU operates on an academic quarter system, with all offerings of the course spanning each of the four 11–week quarters. The on-line course was delivered via Blackboard (http://www.blackboard.com) from 1999 to 2007, when it was replaced by the Sakai (http://sakaiproject.org) system. The Shortliffe, et al textbook was used continuously. The OHSU grades were determined via weekly quizzes (30%), a term paper (30%), a take-home final examination (30%), and class participation (10%) (Table 1b, available in online Appendix 1 at http://www.jamia.org). Letter grades with pluses and minuses were assigned (i.e., A, A−, B+, etc), with a C representing unsatisfactory graduate level performance.
The predictor models were developed using half of the SHIS data, focusing on objective data in the student's record, specifically excluding personal statements, resumes and letters of recommendation. All candidate predictors were manually abstracted from the student's records (Table 2, available in online Appendix 1 at http://www.jamia.org). We considered three general groups of candidate predictors: demographics; prior academic record (UGPA, standardized test scores, other prior academic data), and school-related data. Details of how we derived the model parameters appear in online Appendix 2, available at http://www.jamia.org.
At both SHIS and DMICE, the course grade distributions did not satisfy the normality, homoscedasticity, or continuity assumptions for parametric multiple regression. To permit valid statistical analyses, we used binary logistic regression to separate predictors of Mastery from predictors of Failure. We defined two binary measures of course outcomes, Mastery (A = grade of “A”, No A = grade lower than “A”) and Failure (C = grade of “C”, No C = grade higher than “C”). While the study defined Failure as a C grade, we do not mean to imply any specific administrative consequences of that grade. No F grades were assigned, though permitted at both institutions. Course performance was determined from students' course records. Program performance was operationally defined as cumulative graduate GPA and was computed only for students who graduated. As educational research, this study was deemed to be exempt from full review by the UT-Houston Committee for the Protection of Human Subjects (Institutional Review Board).
Over the 7 years, 129 SHIS students who took Foundations received Mastery and Failure status scores (i.e., A grade: Yes/No, C grade: Yes/No). We randomly divided the students into model-development (n = 65) and cross-validation (n = 64) samples. We built and validated our predictive models using SHIS data and then determined if predictors found to be significant at SHIS were significant at DMICE. We examined whether course performance predicted program performance at both schools.
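A random division of this kind can be sketched as follows. This is only an illustration: the record identifiers, the seed, and the helper name `split_sample` are assumptions, and the paper does not state which procedure or software produced the actual split.

```python
import random

def split_sample(ids, n_dev, seed=0):
    """Shuffle ids reproducibly, then split into development and validation lists."""
    rng = random.Random(seed)      # fixed seed so the split can be reproduced
    shuffled = list(ids)           # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[:n_dev], shuffled[n_dev:]

# 129 hypothetical record identifiers (not real student IDs)
student_ids = list(range(1, 130))
dev, val = split_sample(student_ids, n_dev=65)
print(len(dev), len(val))
```

With 129 records and 65 assigned to model development, the remaining 64 form the cross-validation sample, matching the sample sizes reported above.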
In developing the predictive model, we first excluded disadvantaged status (n = 62) and GRE-W (n = 33) from the model because there were too many missing data. However, correlations of disadvantaged status, and GRE-W with Mastery and Failure are shown in . We computed individual composite student competency self-rating (SRComp) scores for 96 of the 129 students for whom sufficient self-rating data were available by summing their eight competency self-ratings. Preliminary Student's t-tests indicated no significant differences in the means of the model-development and cross-validation samples for any of the quantitative predictors (age, UGPA, GRE-V, GRE-Q, SRComp).
In creating the predictive model, we tested whether Mastery or Failure might be related to missing UGPAs or GREs. We created missing value indicator variables (absent = 1, present = 0) for UGPA (n = 91); GRE-V, GRE-Q (n = 71). To maintain the model-development sample size at 65, we employed the plugging strategy recommended by Cohen and Cohen. 26 For the model-development sample only, we included missing value indicator variables in the prediction equation while substituting means for missing UGPA, GRE-V, GRE-Q, and SRComp values. This enabled control of the “missingness” of data by the inclusion of explicit significance tests of data “missingness” as either a predictor or covariate in the regression analysis.
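The plugging strategy described above can be illustrated with a minimal sketch. The scores below are invented, and the function name `plug_means` is hypothetical; the sketch only shows the general idea of pairing mean substitution with a missing-value indicator (absent = 1, present = 0), not the authors' actual computation.

```python
def plug_means(values):
    """Return (filled_values, missing_indicator) for one predictor.

    Missing entries are given as None. Each is replaced by the mean of the
    observed entries, and the indicator is 1 where the value was absent.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("predictor has no observed values")
    mean = sum(observed) / len(observed)
    filled = [mean if v is None else v for v in values]
    missing = [1 if v is None else 0 for v in values]
    return filled, missing

# Hypothetical GRE-V scores for five students; two are missing.
gre_v = [520, None, 610, 430, None]
gre_v_filled, gre_v_missing = plug_means(gre_v)
print(gre_v_filled)   # missing entries replaced by the observed mean (520.0)
print(gre_v_missing)  # [0, 1, 0, 0, 1]
```

Entering both the filled predictor and its indicator into the regression keeps every case in the sample while letting the indicator's coefficient test whether missingness itself relates to the outcome.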
To further develop the Mastery and Failure prediction models, we employed a hierarchical forward inclusion strategy, starting with the remaining 16 potential predictors. We entered the predictors sequentially block by block, in five separate blocks, into each binary logistic regression model. We entered the blocks in the following order: (a) UGPA, including UGPA and the UGPA-missing indicator; (b) standardized test, including GRE-V, GRE-Q, and GRE-V/Q-missing value indicator; (c) demographics, including age, gender, foreign-non-English-language status, and URMS; (d) other prior academics, including type of undergraduate degree, United States MD, prior graduate degree, composite competency self-rating, and whether undergraduate institution offered graduate degrees; (e) school related factors, including online instruction and SHIS program.
We tested significance (α = 0.05) of each block before proceeding to the next. To control alpha inflation, we used the “protected-t” procedure, 26,27 recommended in the general linear model or multiple regression and correlation context. 26 However, if a variable within a nonsignificant block did not converge, or was significant, that variable was reentered after deleting all nonsignificant variables from the equation and its independent contribution tested again. If the variable still did not converge or was nonsignificant, it was then deleted from the equation. Otherwise, if a block's χ2 statistic was not significant, we excluded all its predictors from the equation. If its χ2 was significant, only predictors with significant (α = 0.05) Wald tests were retained in the equation. If χ2 and Wald tests provided inconsistent results for a block containing a single predictor, the χ2 was regarded as the definitive statistic. 28 If parameter estimates for a predictor did not converge to a solution, the predictor could not be included in the equation. For Mastery prediction, parameter estimates failed to converge for URMS. For Failure prediction, no convergence occurred for United States MD, and prior graduate degree.
We computed binary logistic regression model parameter estimates, omnibus likelihood ratio χ², and Nagelkerke R². We employed the resulting final model equations to compute predicted Mastery and Failure occurrences, probabilities and associated sensitivity (Se), specificity (Sp), positive predictive value (PPV), negative predictive value (NPV), and overall correct prediction percentages using observed Mastery and Failure percentages as threshold values. Additional measures of relationships between predicted and observed occurrences and probabilities of Mastery and Failure included Pearson r, r², odds ratios, and their statistical significance. 29 Pearson correlation effect sizes were considered small (approximately 0.1), moderate (approximately 0.3) or large (approximately 0.5).
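The classification measures named above can be sketched from predicted probabilities and observed outcomes. The data here are invented, the function name `classification_metrics` is hypothetical, and classifying a case as positive when its probability is strictly above the observed outcome rate is an assumption; the paper does not say how ties at the threshold were handled.

```python
def classification_metrics(probs, observed, threshold):
    """Compare thresholded predictions with observed 0/1 outcomes."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, observed):
        predicted = p > threshold  # assumption: strictly above the threshold
        if predicted and y == 1:
            tp += 1
        elif predicted and y == 0:
            fp += 1
        elif not predicted and y == 1:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else None  # undefined when the denominator is 0

    return {
        "Se": ratio(tp, tp + fn),            # sensitivity
        "Sp": ratio(tn, tn + fp),            # specificity
        "PPV": ratio(tp, tp + fp),           # positive predictive value
        "NPV": ratio(tn, tn + fn),           # negative predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "odds_ratio": ratio(tp * tn, fp * fn),
    }

# Hypothetical predicted probabilities and observed outcomes for 8 students
probs = [0.9, 0.7, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1]
observed = [1, 1, 0, 1, 0, 1, 0, 0]
base_rate = sum(observed) / len(observed)  # observed outcome rate as threshold
m = classification_metrics(probs, observed, base_rate)
print(m)
```

Using the observed outcome rate rather than 0.5 as the threshold matters when the outcome is uncommon, as with Failure, because a 0.5 cutoff would then classify almost no one as positive.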
To cross-validate the predictive model, we used binary logistic regression equations and their decision thresholds using the SHIS validation sample (n = 64). Students missing data on GRE-V, UGPA and URMS predictors (see “Predictive model data considerations” above) reduced the sample sizes for Mastery cross-validation to 36 and for Failure to 49. We computed predicted Mastery/Failure occurrences, probabilities of Mastery/Failure, associated Se, Sp, PPV, NPV, overall correct prediction percentages, as well as r, r², and odds ratios. We determined statistical significance for relationships between predicted and observed occurrences and probabilities of Mastery/Failure.
For the entire sample of 129 students, to determine the bivariate correlations of predictors with Mastery and Failure, we computed measures and two-tailed tests of significance of 16 predictors' associations. For the predictors in , we tested significance of their: (a) Pearson (i.e., point biserial) correlations with quantitative normally distributed predictors, (b) Pearson and Spearman correlations with quantitative non-normally distributed predictors, and (c) Pearson (i.e., ϕ) correlations and Fisher's exact tests with dichotomous predictors.
Tests of significance of Spearman correlations replicated and verified validity of reported levels of statistical significance for Pearson correlations reported in for nonnormally distributed predictors. The “unknown” category of the Carnegie classification predictor was treated as missing and excluded from the calculation of these correlations. All bivariate measures of association and statistical tests employed pairwise deletion of cases for unavailable data (see for resulting sample sizes). Hypothesis tests significant at p < 0.01 or p < 0.05 levels are reported in but should not be regarded as definitive, given that 32 hypothesis tests were conducted.
To validate the models at DMICE, we obtained data from 106 students who had completed the comparable DMICE course and graduated from the program. We examined all the significant (experimentwise α = 0.05) SHIS model predictors (i.e., URMS, UGPA, GRE-V, GRE-Q, GRE-V/Q-missing value indicator, prior graduate degree). Because only five (4.7%) experienced Failure grades (grade ≤ 2.0), only course Mastery (grade ≥ 3.7) could serve as a meaningful criterion for DMICE correlation and logistic regression validation tests. We also excluded the student with the lowest GRE-V as both a univariate outlier (GRE-V = 330, 50 below the next lowest; z < −2.5) and multivariate outlier (GRE-V and GRE-V Missing Mahalanobis D² = 12.553, χ² > 10.60, df = 2, p < 0.005) that was not representative of the population.
Starting with the five above-named predictors, and employing the plugging strategy used for missing values at SHIS, we used the hierarchical binary logistic regression forward inclusion strategy to predict course Mastery at DMICE. We sequentially entered five successive blocks (GRE-V, GRE-V Missing; GRE-Q; URMS; UGPA, UGPA-Missing; prior graduate degree). We then computed the appropriate phi, point-biserial, Pearson or Spearman bivariate correlations of the SHIS predictors with DMICE Mastery.
To address our second research question, concerning predictors of program performance, we first conducted simple regression analyses for both schools using course Mastery to predict cumulative graduate GPA. At both, Kolmogorov-Smirnov tests indicated graduate GPAs did not deviate significantly from normal distributions. We calculated and tested significance of the 17 Pearson correlations of SHIS demographic, prior academic performance, school related variables, and Failure with cumulative GPA for the 19 SHIS students who had graduated. Finally, we calculated and tested significance of the seven Pearson correlations of DMICE URMS, UGPA, prior graduate degree, GRE-V, GRE-Q, GRE-analytic, and GRE-writing variables with cumulative GPA for the 106 DMICE graduates. At both schools, course performance comprised less than 7% (three credit hours) of the cumulative GPA where the minimum credit hours for graduation at either school was 45. As noted, too few cases of Failure for meaningful prediction of low graduate GPA occurred at DMICE. reports results of hypothesis tests. Tests significant at 0.01 or 0.05 levels should not be regarded as definitive, given that 25 hypothesis tests were conducted.
While few publications have examined course performance in a single graduate course, several papers have examined performance in an introductory undergraduate course. 30–32 One paper used a decision tree classifier, 32 whereas the two others used regression analysis. 30,31 Prior academic performance was included in all analyses; the inclusion of other cognitive and noncognitive variables varied.
The average SHIS student was 37 years old, consistent with the fact that many SHIS students have previous experience in either the healthcare or technical fields (Table 2, available as an online data supplement at http://www.jamia.org). Sixty-one percent were female and 15% were URMS. The SHIS attracts a large number of foreign students; 34% of the students in Foundations I were not United States citizens.
Most SHIS students had a health-related undergraduate background (62%), while 21% had a technical education. About half the students had at least one prior graduate degree (48%), usually in a health-related discipline such as nursing (e.g., MSN). Ten students held a United States MD. UGPA ranged widely (2.1–4.0), with an average of 3.22. More than half (54%) received their undergraduate education at an institution that offers graduate degrees. Due to differences in educational systems, whether the undergraduate institution offers graduate degrees was not determined for foreign institutions.
Forty-two percent of the students taking Foundations I had been admitted to a degree-seeking program at SHIS. Of the students who had completed Foundations I, 19 had completed their SHIS program (MS or PhD) at the time of this study. The Foundations course was not required for graduation. Therefore, some students graduated without completing Foundations.
For DMICE, we had data only on 106 students who had completed the medical informatics track of the Master's or PhD program. These students had a slightly lower average age, much higher proportion of males, and lower proportion of URMS. DMICE also had a somewhat higher proportion of MDs, which explained the lower number of students with GRE scores, as the GRE was not required of applicants already holding doctoral degrees.
Note: for the following sections, the full analytic details for Mastery Model Development and Cross-validation appear in online Appendix 3, available at http://www.jamia.org.
The GRE-V (M = 510.86, SD = 125.75) was the only predictor that independently contributed significantly (p < 0.005, see ) in the final model for the prediction of Mastery. The final model prediction of Mastery employing the observed Mastery rate of 32% (model-development sample) as a decision threshold resulted in 73.8% correct predictions. The URMS and United States MD had bivariate correlation significance levels of p < 0.01 with Mastery in the sample of 129 students, but neither contributed significant (α = 0.05) variance independent of GRE-V. Odds of Mastery for students predicted to master the course were 5.76 times the odds of other model-development sample students.
Final model prediction of Mastery employing the model-development sample Mastery rate of 32% as a decision threshold resulted in 69.4% correct predictions for the cross-validation sample. Odds of Mastery for students predicted to master the course were 4.8 times the odds of other cross-validation sample students.
The UGPA and URMS were the only predictors (see ) in the final model for the prediction of Failure. Final model predictions of Failure employing the observed model-development sample Failure rate of 15% as a decision threshold resulted in overall 80% correct predictions. Each increase of one grade point in UGPA multiplied the odds of Failure by 0.176 (i.e., reduced them to 17.6% of their previous value), controlling for URMS. In other words, a decrease of one grade point in UGPA multiplied the odds of Failure by approximately 5.7 (i.e., 1/0.176). Odds of Failure for URMS students were 9.9 times those of non-URMS students, controlling for UGPA. Odds of Failure for students predicted to fail the course were 7.67 times those of other model-development sample students.
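The reciprocal relationship between the two UGPA statements follows from the form of the logistic model. As a brief worked step, write the UGPA coefficient as b (a symbol introduced here for illustration; the paper reports only the resulting odds ratio) and read the reported 17.6% as an odds ratio of 0.176:

```latex
\[
\mathrm{OR}_{+1} = e^{b} = 0.176
\quad\Longrightarrow\quad
\mathrm{OR}_{-1} = e^{-b} = \frac{1}{0.176} \approx 5.68 \approx 5.7
\]
```

A one-point increase and a one-point decrease in UGPA therefore describe the same fitted effect, expressed in opposite directions.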
Final model prediction of Failure employing the observed model-development sample Failure rate of 15% as a decision threshold resulted in 81.6% correct predictions for the cross-validation sample.
presents measures and significance of 16 predictors' bivariate associations with Mastery and Failure for the entire SHIS sample of 129 students. Correlations having p < 0.01 or p < 0.05 are reported in but should not be regarded as definitive given the 32 hypothesis tests reported. Prior graduate degree correlated significantly and negatively with Failure (r = −0.31, p < 0.001). In the sample of 129 students, observed odds of Failure for students with a prior graduate degree were 6% those of other students.
We explored whether the correlation between Failure and prior graduate degree may be mediated by more recent school experience. Students taking Foundations I at SHIS earned their most recent degree on average 8.7 years before enrolling in the course; however, the range was 0–35 years (Table 2, available in online Appendix 1 at http://www.jamia.org). Time since last degree did not correlate significantly with either Mastery or Failure ().
Finally, there was no significant bivariate correlation between course delivery method at SHIS (online vs. face-to-face) and Failure. However, when we controlled for effects related to UGPA and URMS, online instruction significantly predicted lower Failure rates than face-to-face instruction for both the model-development sample, and for the 114 students having complete data, but not the cross-validation sample. Subsequent χ2 tests for independence also revealed that for the total sample of 129, URMS students were disproportionately overrepresented (19.8%) among those that took the course online. URMS students had a significantly higher observed Failure rate, so URMS appeared to act as a suppressor variable and when statistically controlled (along with UGPA), participation in online instruction was significantly related to a reduction in Failures in the model-development and total samples.
The second research question examined the predictors of program performance. Analysis revealed a statistically significant (p < 0.01) relationship only between course Mastery and program performance. Correlation between Failure and program performance was not significant (). Course Mastery accounted for less than 7% of the cumulative graduate GPA credit hours but predicted approximately 40% of the variability. Approximately sixty-eight percent of the predicted GPAs were accurate to within 0.116 (less than a quarter of the observed GPA range).
The DMICE students' demographic and grade distributions differed substantially from those of SHIS students (Table 2, available in online Appendix 1 at http://www.jamia.org). At SHIS, 19 of 129 (14.7%) were URMS versus only 3 of 106 (2.8%) at DMICE, too few for meaningful correlations. The DMICE Mastery grades (67.9%) occurred with approximately double the prevalence of SHIS Mastery grades (32.6%). The DMICE Failure grades, 5 of 106 (4.7%), occurred with less than half the prevalence of SHIS Failure grades (11.6%).
The GRE-V remained the only predictor of Mastery independently significant at both institutions, and was accompanied at DMICE by the independently significant UGPA.
Three of the usable model variables having experiment wise significant bivariate correlations with either Mastery or Failure at SHIS (see ) correlated significantly with Mastery at DMICE: GRE-V, UGPA, and prior graduate degree.
Finally, at DMICE, simple regression analysis revealed a strong relationship between course Mastery and program performance in the 3.01–4.0 range of cumulative graduate GPAs earned by the 106 degree recipients (see ). Course Mastery, which accounted for less than 7% of the cumulative graduate GPA credit hours, predicted approximately 36% of the GPA variability. The 0.201 standard error of estimate indicated that approximately 68% of the predicted GPAs were accurate to within 0.201 (approximately a fifth of the DMICE GPA grade range) of the observed GPA (M = 3.71, SD = 0.251). At both DMICE and SHIS, Mastery-level course performance was a very strong 28 predictor of performance in the graduate program, accounting for approximately 19% more of the variance in DMICE graduate GPA than the next most significant predictor (UGPA). A follow-up analysis adding Mastery to a regression equation containing UGPA indicated the contribution of Mastery was significantly larger than that of UGPA. Correlation between Failure and program GPA was not significant at SHIS.
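The reported standard error of estimate can be checked against the other DMICE figures. The sketch below assumes the usual simple-regression formula, SD × sqrt((1 − r²)(n − 1)/(n − 2)); the paper does not state which formula or software was used, and the function name is hypothetical.

```python
import math

def standard_error_of_estimate(sd_y, r, n):
    """Standard error of estimate for a simple regression of y on one predictor."""
    return sd_y * math.sqrt((1 - r ** 2) * (n - 1) / (n - 2))

# Values reported for DMICE: SD of graduate GPA, Mastery-GPA correlation, sample size
see = standard_error_of_estimate(sd_y=0.251, r=0.603, n=106)
print(round(see, 3))           # 0.201
print(round(0.603 ** 2, 2))    # 0.36, the share of GPA variance explained
```

Under that assumption the reported r, SD, and n reproduce both the 0.201 standard error and the approximately 36% of variance explained.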
We found that just one predictor (GRE-V) for Mastery and just two predictors (UGPA and URMS) for Failure remained in final cross-validated binary logistic regression models that correctly predicted 69.4% of Mastery and 81.6% of Failure at SHIS. Controlling for URMS status, each one point increase in GRE-V multiplied predicted odds of Mastery by 1.012, meaning, for example, that a 100 point increase in GRE-V multiplied predicted odds of Mastery by approximately (1.012)^100 ≈ 3.3. URMS students' odds of Failure were 20 times those of non-URMS students, controlling for UGPA. Each one point drop in UGPA multiplied predicted odds of Failure by 10.5, controlling for URMS status. At SHIS, Failure odds for students holding a prior graduate degree were 6% those of students without a prior graduate degree.
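The GRE-V arithmetic above relies on odds ratios compounding multiplicatively across units of the predictor. A minimal check, with the function name `scale_odds_ratio` introduced here purely for illustration:

```python
def scale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units`, given the odds ratio for one unit."""
    return or_per_unit ** units

or_100 = scale_odds_ratio(1.012, 100)  # per-point odds ratio reported for GRE-V
print(round(or_100, 1))  # 3.3
```

The per-point odds ratio looks negligible on its own, which is why restating it for a 100 point difference gives a more interpretable effect size.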
At DMICE, the small number of outcomes classified as Failure (n = 5) precluded valid statistical tests addressing Failure. At both institutions, GRE-V emerged as the only predictor that contributed significant independent variance to the prediction of Mastery. Three of the four model variables having experimentwise significant bivariate correlations with either Mastery or Failure at SHIS were also significant at DMICE. The GRE-V (r = 0.250), UGPA (r = 0.341), and prior graduate degree (r = ϕ = 0.204) correlations with Mastery demonstrated statistical significance at both institutions.
Finally, Mastery in the introductory course was strongly predictive of program performance (see ) at both SHIS and DMICE. At both schools, approximately 68% of the GPAs predicted from Mastery were accurate to within less than a quarter of the observed range of GPAs. Mastery performance in the introductory course was a stronger predictor of program performance than any predictor available before enrollment at both institutions, exceeding the UGPA contribution to graduate GPA. Thus, the introductory course may be a useful screening tool for admission to the graduate program. All significant SHIS predictors of Mastery and GPA that could be validly tested at DMICE were significant at both institutions.
To our knowledge, this was the first study to quantitatively examine predictors of performance in a graduate biomedical informatics course and to quantify the relationship between course and program performance. Our student population included a wide variety of students ranging from those pursuing PhDs (presumably committed to a career in informatics involving research), to nondegree students who may only take a single course. Despite differences in student population, Mastery in the introductory course was consistently a strong predictor of program outcome measured as graduating GPA at two institutions.
Using citizenship as a proxy for English language ability, we noted that citizenship did not correlate significantly with either Mastery or Failure. Several possible explanations exist. Citizenship may be an inadequate proxy, since students who are citizens may not be proficient in English and foreign students may be native English speakers (e.g., Canada, UK, Australia) or have had extensive English language training. Instead, reading comprehension may be more closely tied to course performance. 19 Students with good reading comprehension in their primary language can use similar comprehension strategies in English, whereas students who have difficulty reading and understanding written language have less recourse. Lastly, foreign students who are accepted for study in the United States were successful in their home country and are highly motivated. Thus, they may simply be better prepared for graduate studies and have the motivation (noncognitive factors) to overcome their initial lack of language skills.
Self-reported demographic risk (i.e., participating in a Head Start program, being eligible for free/reduced-fee lunch) also did not correlate with course performance at SHIS. It is possible that these risk factors are more strongly correlated with other student outcomes, such as entry into the program 33 or persistence in the program, than with course outcome.
As part of the application to SHIS, students rated their own level of expertise in each of eight areas relevant to our program. None of the areas, individually or as a composite, significantly correlated with course performance. Problems known to occur with self-rating scales could contribute to this finding. 34,35 The scale was worded in broad terms, and when prospective students were not familiar with a topic, they may not have correctly estimated their abilities; they may also have misinterpreted the scale entirely. Alternatively, prior knowledge in these topics may not be correlated with course performance. Course topics varied widely and were not closely related to the questions on the self-rating scale.
The very low rate of Failure for students with a prior graduate degree was surprising. We explored further and found that time since most recent degree attained did not correlate significantly with either Mastery or Failure, nor did it add significantly to the regression equation with “having a prior graduate degree”. Perhaps students with a prior graduate degree were more motivated, better educationally prepared, and/or had mastered the time management skills required to succeed.
The introductory course at SHIS transitioned from face-to-face to online during the study. A persistent concern has been the effect of online instruction on student performance. At least in this course, online instruction was associated with a reduction in Failures. Several other factors changed concurrently, including removal of a programming assignment, introduction of weekly quizzes, addition of a second faculty member and TA, and more practice problems for medical decision making. While multiple factors may have contributed to the reduction in Failures, transitioning to the online format apparently did not worsen SHIS graduate student performance.
Our study has several limitations. We focused on one course at two institutions with significantly different student populations. While our results are consistent across two institutions, they may not generalize to other courses within our school and/or courses and programs at other schools. In addition, the predictors identified using SHIS data may not be the strongest predictors at DMICE. In other words, there may be variables found to be poor predictors at SHIS that would have been good predictors at DMICE. However, our study was not designed to identify the strongest predictors at two institutions. Instead, we sought to test the generalizability of our SHIS model across institutions. Student populations at other informatics programs may vary significantly from those at SHIS and DMICE, making generalization of the prediction models problematic without local validation.
The generalizability of this study was limited by several data constraints. In general, bivariate correlations significant at p ≥ 0.01 should not be regarded as definitive, given the number of hypothesis tests conducted. Additionally, several groups had too small a sample size for reliable conclusions. In particular, the number of Failures at SHIS was small in the cross-validation sample. With respect to URMS students at SHIS, our conclusions are based on a small sample. At DMICE, the low frequency of Failures and URMS students precluded meaningful statistical analyses. Lastly, while all students in the DMICE sample had graduated, the number of students who had taken Foundations I and had graduated from SHIS at the time of the study was small.
We defined program outcome by the graduating GPA. Thus, we did not have the data to predict retention. For example, does Failure in the introductory course predict program noncompletion? Since we did not have data on program retention, our samples were somewhat biased. We only reviewed program outcome data on students who completed the program, ignoring those who dropped out. This may have decreased the predictive power of introductory course performance. Despite this, we found course performance to be a stronger predictor of program GPA than any data available before matriculation.
Further, we used the standard government definition of underrepresented minority, which conflates race, ethnicity and national origin. Therefore, it is difficult to distinguish the effects of these individual factors. Finally, data on all predictors were not available for some students (e.g., old GRE score that is no longer in our files, student did not take the GRE, student provided graduate but not undergraduate transcript).
A final limitation is that we did not study noncognitive predictors such as motivation and interest in the field. These may be important for predicting program retention and success, as well as success after graduation. Several instruments designed to assess noncognitive factors exist and can be administered to students at the beginning of the course.
This was possibly the first study to examine success in an introductory course in informatics, and one of few examining performance in a single course at the graduate level. 12 Prior studies showed that past academic performance predicted performance in graduate programs, 8,36 with the proportion of variance explained ranging from 23–72%, depending on the number of prior academic variables included. The student performance in these studies was operationalized as final GPA in the program. In contrast, we examined two constructs of student performance, Mastery and Failure in a graduate course.
Prior academic performance, such as GRE scores, UGPA and holding a prior graduate degree, was significantly correlated with either Mastery or Failure in this study. We were able to find only one other study, from the field of economics, that examined the predictive power of a prior graduate degree on performance in a different graduate program, and it showed no effect. 18 In general, the effect sizes seen in our study were similar to those in prior studies.
Nelson, et al 10 found that GPA in the first 9 hours of graduate study was predictive of degree completion in a variety of degree fields. In contrast, our study used a single course to predict graduate program success as measured by the cumulative GPA of graduates. The quality of a student's undergraduate education has been used to predict graduate school performance. 10,18 Nelson, et al 10 found that having attended an institution that grants graduate degrees (Carnegie classification) was positively correlated with degree completion in the applied sciences, humanities and arts. Our study included the Carnegie undergraduate institution classification as well; however, attending an undergraduate institution that offers graduate degrees was not significantly associated with Mastery or Failure. Possible reasons for this discrepancy between studies include the difference in student populations (at risk vs. all students) and the difference in outcomes studied (degree completion vs. course performance).
Our findings can help determine which student characteristics should weigh most heavily in the admissions process and which students are likely to require additional resources to help them succeed. The presence of a prior graduate degree, URMS, UGPA and GRE scores can be used to predict student performance. Our data did not allow us to analyze the effect of English language proficiency on student performance. It may be important to correlate the performance of foreign graduate students to a more comprehensive variable set that includes information on language ability.
We found a small set of objective variables that can correctly classify most students who will master or fail an introductory informatics class. We found that prior academic performance predicts performance in informatics as it does for many other fields. The predictors of Mastery differed from the predictors of Failure. We quantified the degree and accuracy of those predictions. Further, we found that course mastery in an introductory informatics course was a strong predictor of overall program performance (GPA), substantially superior to any information available on or before admission. Our findings may inform student recruitment, retention and advising as well as the design and evaluation of educational initiatives.
The authors thank Debbie Todd and Andrea Ilg for help with data collection. This work was supported in part by the Center for Clinical and Translational Sciences at UT-Houston (NCRR grant 1UL1RR024148). The valuable input of the anonymous reviewers is also gratefully acknowledged.