Evaluating the residency selection process is difficult because there is no uniformly accepted or objective means of measuring a resident's overall performance or “success” in a residency training program. Scores on in-service training examinations provide 1 objective indicator of a resident's cognitive ability, but global assessments of residents' performance, specifically in the noncognitive competencies, remain largely subjective. Faculty ranking of residents is the most commonly used method for assessing “success.” Other indicators, such as fellowship matching, continuation in academic medicine, and passing specialty-specific licensing board examinations, have also been used. Typically, these modalities are not used in isolation but in combination with some form of faculty assessment.1,3,4,10,11
In our study, we defined success on the basis of the independent rankings of 2 experienced resident educators. Although more subjective than some of the other indicators, this method allowed us to detect the greatest possible difference within our study cohort and to define “success” broadly. Residents who are clinically proficient and have achieved competence in each of the 6 ACGME competencies may be highly “successful,” yet they may not elect to pursue subspecialty training or academic positions. Using markers such as fellowship placement or retention in academic medicine as the sole measures of success may skew the data to favor academic prowess over clinical and professional excellence. Thus, we believe that our methodology allowed us to investigate our primary question: whether we can reliably identify which applicants will go on to become successful residency graduates.
In our study, none of the objective measures of medical student performance predicted performance during residency. Sex was the only demographic characteristic to reach statistical significance, with a greater percentage of women in the highest-ranked group compared with the lowest-ranked group. The significance of this finding is questionable, however, as 80% of the residents during the study period were women.
Our finding that no reliable correlation exists between USMLE scores and performance during residency is in agreement with prior studies.5,6,9,14–16 This finding is not surprising, because the USMLE assesses only a student's cognitive ability and does not measure the varied noncognitive skills that are required for resident success. Our study was not designed to investigate a relationship between performance on standardized tests as a medical student and performance on standardized tests and/or board certification examinations as a resident; however, previous studies have almost universally demonstrated a positive correlation in both medical and surgical specialties16,17–20: a high score on one measure of cognitive aptitude predicts a high score on a second, similar measure.
While the purpose of our study was to specifically examine the objective components of medical student applications to assess their predictive value, even the few elements of a medical student's performance that had a subjective component (ie, grades in clinical rotations) did not correlate with resident success. This finding was somewhat surprising, as we had hypothesized that a partly subjective evaluation of a medical student would be associated with a partly subjective evaluation of a resident. Our study is one of only a few studies1,2 to examine the potential significance of a “distinctive talent” (such as being a championship athlete or musician) or leadership position(s) during medical school. These metrics are objective yet “softer” measures of a candidate's noncognitive abilities and might be expected to predict success. However, even these attributes were not predictive of a high ranking in the faculty evaluations in our study.
Although these residents may be considered a separate cohort, we included in our analysis the 7 residents who ranked in the lowest quartile owing to premature termination, because they make up an important element of “success failure.” By definition in our study, any resident who did not complete the 4-year residency training program in obstetrics-gynecology at our institution was considered unsuccessful. These “success failures” highlight the importance of predicting resident success from the start of residency. A resident who does not complete the training program not only loses personal time (owing to the need to start over in a new specialty) but also creates additional work for the program director in finding a replacement, uncertainty for fellow residents, and disruption of the educational flow of the residency program. Predicting which residents would not complete the residency training program would save resources and time as well as enhance learning. Premature termination from our program was suggestive of a poor career choice made as a medical student, with individuals finding themselves dissatisfied with a more surgically intensive specialty.
The primary limitations of our study are its small study group and single-institution setting. It is possible that the type of medical student applying to our residency program differs from applicants to academic programs in other geographic areas and/or programs without university affiliations. Although these limitations may affect the generalizability of our data, they are commonly noted in other similar studies,1,6,13 especially in a relatively small specialty such as ours. Unlike programs in internal medicine or general surgery, residency programs in obstetrics and gynecology match an average of 5 residents per year. These numbers make a large study difficult to perform without compromising the currency of the data.
An additional limitation is that resident success in our study was determined by only 2 individuals (the residency program director and the department chair). By nature, a faculty evaluation is subjective and may depend on the clinical setting in which the resident is evaluated.21
Including scores from a larger group of faculty members across all departmental divisions would have allowed a more robust determination of a resident's success; however, obtaining faculty consensus is often quite difficult, and a mechanism for decreasing interobserver variability is therefore necessary. We attempted to overcome this problem by selecting the program director and department chair to serve as the faculty evaluators. Both individuals had worked closely with each resident in a variety of clinical and nonclinical settings, allowing them to best evaluate a resident's overall performance. Additionally, the 2 evaluators ranked the residents independently to avoid biasing each other's opinions, and the 2 evaluations were given equal weight in the ultimate determination of a resident's final ranking. Interestingly, there was remarkable agreement between the 2 sets of evaluations, even with this blinding.
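For readers interested in how an equal-weight combination of 2 independent rankings and a check of interobserver agreement might be computed, the minimal sketch below illustrates one possible approach. The resident labels, rank values, and the use of Spearman's rho as the agreement statistic are illustrative assumptions for this sketch and are not drawn from the study itself.

```python
# Hypothetical sketch (not the study's actual method or data): combining
# 2 independent faculty rankings with equal weight and quantifying
# interobserver agreement with Spearman's rho.
from scipy.stats import spearmanr

residents = ["A", "B", "C", "D", "E"]          # invented resident labels
rank_program_director = [1, 2, 3, 4, 5]        # 1 = highest-ranked resident
rank_department_chair = [2, 1, 3, 5, 4]        # independently assigned ranks

# Equal weighting: average the 2 independent ranks for each resident.
combined = {
    r: (pd_rank + dc_rank) / 2
    for r, pd_rank, dc_rank in zip(
        residents, rank_program_director, rank_department_chair
    )
}
final_order = sorted(residents, key=lambda r: combined[r])
print("Combined ranking:", final_order)

# Spearman's rho near 1 would correspond to the kind of close agreement
# between the 2 independent evaluations described above.
rho, p_value = spearmanr(rank_program_director, rank_department_chair)
print(f"Spearman rho = {rho:.2f} (P = {p_value:.3f})")
```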