Conceived and designed the experiments: JD MS-P FR TKW LANA. Performed the experiments: JD XHTZ MS-P FR SO. Analyzed the data: JD XHTZ MS-P FR TKW LANA. Contributed reagents/materials/analysis tools: JD FR. Wrote the paper: JD XHTZ MS-P TKW LANA. Prepared the figures: XHTZ.
Many studies demonstrate that there is still a significant gender bias, especially at higher career levels, in many areas including science, technology, engineering, and mathematics (STEM). We investigated field-dependent, gender-specific effects of the selective pressures individuals experience as they pursue a career in academia within seven STEM disciplines. We built a unique database that comprises 437,787 publications authored by 4,292 faculty members at top United States research universities. Our analyses reveal that gender differences in publication rate and impact are discipline-specific. Our results also support two hypotheses. First, the widely-reported lower publication rates of female faculty are correlated with the amount of research resources typically needed in the discipline considered, and thus may be explained by the lower level of institutional support historically received by females. Second, in disciplines where pursuing an academic position incurs greater career risk, female faculty tend to have a greater fraction of higher impact publications than males. Our findings have significant, field-specific, policy implications for achieving diversity at the faculty level within the STEM disciplines.
The proportion of women faculty members in many STEM fields has been steadily increasing, but at the level of associate and full professor, men continue to far outnumber women. This is troubling because studies suggest that a lack of women in leadership positions has a negative impact on women’s aspirations and advancement, and may perpetuate gender biases. Many mechanisms have been proposed to explain the gradual loss of women along the STEM academic career path. For example, Carnes et al. suggested that female faculty in academic medical centers experience a number of systemic and selective pressures that put them at a disadvantage at each step of their pursuit of tenure, and in achieving positions of leadership. These pressures could amount to a “glass ceiling” preventing women’s advancement. Others have referred to the Matthew and Matilda effects as the cause of gender differences, that is, the greater resources awarded to men enable them to further advance their careers beyond what is possible for women.
In contrast to these concerns, Etzkowitz and Ranga recently suggested that the low number of females in academic positions within STEM disciplines should not be a cause for concern, because women do not drop out of STEM pursuits when they abandon academic careers but merely pursue STEM careers in other arenas. Curiously, Etzkowitz and Ranga’s “vanish box” perspective does not address whether the reasons women leave academia detract from a level playing field, or whether women have the opportunity to rise to positions of prominence in non-academic careers.
To determine how and why gender may affect the professional practices and scientific production of researchers, we quantitatively investigated, for seven STEM fields, the gender-specific and discipline-specific effects of (i) research resource requirements and (ii) relative risk in pursuing an academic career. We explicitly separated the researchers in our database along disciplinary lines in order to more carefully investigate the mechanisms potentially responsible for the observed differences. In contrast to most studies concerned with this matter, we did not conduct surveys but instead systematically analyzed the complete publication records of faculty at a large number of departments in selected research universities in the United States (Table 1, Fig. 1 and Tables S4, S5, S6, S7, S8, S9, S10). These data enabled us to characterize the career-long scientific production of a sizable sample of faculty from seven disciplines, and to measure statistically significant differences that would have otherwise remained hidden.
We collected data on the 2010 faculty rosters of selected top research institutions in the U.S. (see Supporting Information S1) in seven STEM disciplines (chemical engineering, chemistry, ecology, industrial engineering, material science, molecular biology and psychology) and measured scientific productivity and impact during the various phases of each faculty member’s academic career. We focus on faculty at top U.S. research university departments because most high impact research produced by U.S. authors is published by authors in the top departments. We chose these disciplines for three sets of reasons. First, for all seven disciplines, women only began to join faculty rosters in a consistent manner in the 1980s, and today they still comprise a small fraction of total faculty (Fig. 1 and Fig. 2). Second, these disciplines cover a broad range of scientific approaches: some place greater emphasis on theoretical or computational work, whereas others focus on industrial applications or on biological systems. Thus the institutional support required for success (be it lab space, size of start-up packages, or the ability to lead center-level projects) differs dramatically across these disciplines (Table 2). Third, these disciplines pose quite different relative risk profiles to individuals wishing to pursue an academic career. For example, the seven disciplines differ significantly in the prospective earnings of the career options available to Ph.D. graduates and in the time needed to achieve career stability within academia (Table 2).
We first focus on research resource requirements. As mentioned earlier, the typical annual research expenditures per faculty member differ substantially across the seven disciplines. For example, industrial engineering faculty tend, for the most part, to train a small number of students at a time. Additionally, much of the research in industrial engineering is theoretical or computational in nature. These two characteristics suggest that, for industrial engineering, researchers do not need to compete against each other for limited resources, and institutional support may not be as important a factor in faculty productivity.
In contrast, most faculty in molecular biology conduct experimental research, and many require significant lab space and expensive specialized equipment. Moreover, faculty in molecular biology are able to compete for funding supporting the creation of large centers or the acquisition of major equipment. Thus, availability of resources, especially institutionally granted resources or institutional support for securing large grants, can be a crucial component of academic success in molecular biology. Furthermore, consistent with the Matthew effect, researchers who have already received more institutional support are able to secure even more research resources.
Since historically female faculty members have received less institutional support and have had less access to research resources, these considerations prompt a question with significant policy implications: Could the differences in resource requirements lead to distinct gender-specific publication patterns across disciplines? In order to answer this question, we systematically investigated gender-specific publication rates for the seven disciplines. Even though several studies report greater publication rates by male authors, we hypothesize that only in disciplines where resource requirements are high and institutional support is vital will female faculty members typically publish fewer papers than their male peers. Thus, we predict that gender differences in publication rate in disciplines such as industrial engineering will be quite small. In contrast, we predict that gender differences in publication rate will be substantial in molecular biology and similar disciplines.
We define the publication rate of a faculty member $t$ years into her/his career as the number of scientific articles published by the individual $t$ years after her/his first publication. We cannot simply compare the raw publication numbers per year, because these numbers depend strongly on publication year and career stage (Fig. 3). Let $n_i^d(y)$ denote the number of publications published by author $i$ from discipline $d$ in year $y$, and let $N^d(y)$ be the total number of authors that have started their careers no later than year $y$. We calculate author $i$’s z-score (standard score) in year $y$ as

$$z_i^d(y) = \frac{n_i^d(y) - \mu^d(y)}{\sigma^d(y)}, \tag{1}$$

where $\mu^d(y)$ is the average number of publications per author from discipline $d$ published in year $y$,

$$\mu^d(y) = \frac{1}{N^d(y)} \sum_{i} n_i^d(y), \tag{2}$$

and $\sigma^d(y)$ is the standard deviation of the number of papers per author published in year $y$,

$$\sigma^d(y) = \sqrt{\frac{1}{N^d(y)} \sum_{i} \left[ n_i^d(y) - \mu^d(y) \right]^2}. \tag{3}$$

In order to account for the effect of career stage, we consider $z_i^d(t)$, which is the z-score of author $i$ as a function of the career stage $t = y - y_i^0$, where $y_i^0$ is the year of the first publication of author $i$ (Fig. 4, S10). Please note that by considering the z-score we are not making any assumption about normality of $n_i^d(y)$, but merely making the results easier to compare across disciplines and time periods.
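The standardization described above can be sketched in a few lines of Python. This is a minimal illustration for a single discipline, not the authors' code: the function name `yearly_z_scores`, the dictionary layout of the inputs, and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev


def yearly_z_scores(counts_by_author, first_year):
    """Standardize yearly publication counts within one discipline.

    counts_by_author: {author: {year: number of publications}} (hypothetical layout)
    first_year:       {author: year of first publication}
    Returns {author: {career_stage: z_score}}, where career_stage = year - first_year.
    """
    years = sorted({y for counts in counts_by_author.values() for y in counts})
    z = {author: {} for author in counts_by_author}
    for y in years:
        # Only authors whose careers started no later than year y are compared.
        active = [a for a in counts_by_author if first_year[a] <= y]
        values = [counts_by_author[a].get(y, 0) for a in active]
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation (assumption)
        if sigma == 0:
            continue  # z-score is undefined when all active authors have equal counts
        for author, n in zip(active, values):
            z[author][y - first_year[author]] = (n - mu) / sigma
    return z
```

Indexing the result by career stage rather than by calendar year is what allows authors who started in different years to be compared at the same point in their careers.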
Our analysis fully confirms our hypothesis (Figs. 3, 4, 5). As predicted, for disciplines where research expenditures are high, such as molecular biology, we found that females consistently publish at a rate significantly lower than males, whereas for industrial engineering we do not observe a significant difference between genders. More importantly, as shown in Fig. 5, we found that the gender difference in publication rate, measured as the average z-score of females, has a significant negative correlation with the magnitude of typical research expenditures. Our results thus support the hypothesis that gender differences in institutional support have had a crucial effect on the publication rates of females.
It is important to point out that our analysis did not count human and social capital, such as collaboration level and leadership position, as research resources, although these may also play critical roles in a productive career. Whether and how gender differences in the ability to acquire these harder-to-quantify resources affect career productivity is a matter worthy of further investigation.
We next investigated the gender-specific and discipline-specific effects of the relative risk profile of an academic career on publication patterns. The risk of pursuing a faculty position after obtaining a Ph.D. varies across disciplines. A graduate student considering an academic career in chemistry faces a small risk if unsuccessful. Within about six years from publication of their first paper, successful individuals will move into independent positions (Fig. 6, S11, S12, Table 2 and Methods). Doctoral degree holders in chemistry who are unable to obtain, or uninterested in, academic positions can choose from among a number of high-paying careers in industry and government.
In contrast, an individual considering an academic career in ecology faces a much more uncertain future. Instead of waiting six years post publication of the first paper to learn whether it will be possible to secure a faculty position, an ecologist has to wait an average of eight years (Fig. 6, S11, S12, Table 2). Perhaps even more challenging, doctoral degree holders in ecology who are not able or not interested in obtaining academic positions may have to settle for jobs that do not pay a significant premium over academic positions.
These observations raise a critical question: Could the different risk profiles of STEM disciplines lead to distinct gender-specific selective pressures? Because pursuing an academic career is a risky undertaking and because propensity towards risk-taking, self-motivation towards career development, social expectations, perception of gender stereotypes and biological constraints are different for females and males, we surmise that a female will choose to pursue an academic career in “high-risk” disciplines, such as ecology, only if she is so highly qualified that she will be quite confident of success. This biased self-selection for outstanding individuals among females likely happens prior to embarking on an academic career, leading to an advantage in career performance for females that would be magnified in later career stages by the Matthew effect. In contrast, because of the low risk profile of chemistry, we expect that female faculty members in chemistry will incur no extra burden when compared to their male colleagues. It is worth mentioning an alternative hypothesis: that high career risk selects for individuals with a greater propensity for risk-taking among females. However, this is consistent with our hypothesis, since risk-taking might be a necessary ingredient of success, alongside intellectual abilities, and individuals may augment their competence through risk-taking. Therefore, females who enter disciplines with high career risks may be not only risk-takers but also highly qualified.
We further hypothesize that the higher qualification of females in high-risk disciplines will become apparent through higher impact per publication. In order to uncover gender differences in publication impact, we studied a commonly used metric of academic performance, the h-index. We studied the h-index instead of the total number or average number of citations because the distributions of these numbers can be dramatically biased by a single highly-cited publication. The h-index avoids this bias: it is the largest number $h$ such that $h$ of an author’s publications each have at least $h$ citations. Moreover, because the h-index was introduced after the time period considered for the data, it will not be affected by behaviors of the authors aimed at deliberately increasing their h-indices.
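To make the definition concrete, the following short Python sketch computes the h-index from a list of per-publication citation counts. The helper name `h_index` is hypothetical and the snippet is only an illustration of the standard definition, not the code used in the study.

```python
def h_index(citations):
    """Return the largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h
```

For a citation record such as `[1000, 1, 1]`, the mean number of citations is 334 while `h_index` returns 1, which illustrates why a single highly-cited paper distorts averages but not the h-index.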
An identified weakness of the h-index is its dependence on the number of publications. In order to compare the publication impact of authors with different numbers of publications, we determined the dependence of the h-index on the number of publications for the faculty cohorts in the seven disciplines considered. We found that for these seven disciplines the h-index grows with the number of publications as a power law,

$$h \propto N_p^{\alpha}, \tag{4}$$

where $N_p$ is the number of publications (Fig. 7 and Methods). For $\alpha = 1$, the h-index would grow linearly with the number of publications. Importantly, since we find $\alpha > 1/2$, one cannot explain the observed values of $\alpha$ through self-citations alone (Methods).
We next measured the deviations of h-indices from the trend predicted by Eq. (4) for individual faculty members to obtain the z-scores (standard scores) of their publication impact (Fig. 8). Let $h_i$ denote the h-index of author $i$, and $N_{p,i}$ her/his total number of publications. The z-score of the h-index of author $i$ is

$$z_i^{h} = \frac{h_i - \bar{h}(N_{p,i})}{\sigma_h(N_{p,i})}, \tag{5}$$

where $\bar{h}(N_p)$ is the h-index predicted by Eq. (4) for an author with $N_p$ publications and $\sigma_h(N_p)$ is the corresponding standard deviation.
We then calculated the average z-scores of this publication-adjusted h-index for females (Fig. 8, S4 and Methods). Our analysis unambiguously shows that for all ranges of number of publications, female faculty members in ecology published research with higher impact than their male counterparts, whereas for faculty in chemistry we found no significant gender-specific differences in impact.
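The data in Fig. 8 suggest that the difference in publication impact, $\Delta$, may be an increasing function of the discipline-specific risk profile, $R$, associated with an academic career. That is,

$$\Delta = F(R). \tag{6}$$

While we lack a theory for the true definition of career risk, $R$, it is plausible that it will be a function of factors such as the time to reach career independence, $\tau$, the fraction of Ph.D. graduates that go on to careers in academia, $\phi$, and the reciprocal of the salary premium of non-academic careers, $\rho$ (Table 2), which we define as
Even though we do not know its functional form, we can expand $R$ as a multivariate polynomial,

$$R(\tau, \phi, \rho) = a_0 + a_1 \tau + a_2 \phi + a_3 \rho + a_{12}\, \tau\phi + a_{13}\, \tau\rho + a_{23}\, \phi\rho + \cdots, \tag{8}$$

and it follows that we can expand $\Delta$ as

$$\Delta = b_0 + b_1 \tau + b_2 \phi + b_3 \rho + b_{12}\, \tau\phi + b_{13}\, \tau\rho + b_{23}\, \phi\rho + \cdots. \tag{9}$$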
Because we only have seven data points, we can fit our data only to combinations of a small number of terms in the expansion. Ordinary least squares regression indicates that the difference in publication impact across the seven disciplines is positively correlated with several combinations of the factors in Eq. (9), thus confirming the existence of the relative risk associated with academic careers and its gender-specific role in publication impact (Table 3). In Fig. 9 we show the correlation between the gender difference in publication impact and the academic career risk, quantified as
This model suggests that pursuing an academic position is highly risky in disciplines where few non-academic career options are available, where the time to reach career independence is long, and where it is difficult to recover the salary lost to an unsuccessful academic career.
Our study reveals the possible contribution of perceived risk and resource allocation to the under-representation of women in STEM academic careers. Our results are not by themselves an empirical validation of the causal relationship between publication rate and resource requirements, and between publication impact and career risk, since we cannot conduct controlled experiments or account for other factors that could play a role in the measured outcome. However, the hypothesis that there is a causal relationship between gender differences in resource allocation and the reported gender differences in publication rates is plausible and well supported by our empirical observations, as is the hypothesis that there is a causal relationship between the relative risk associated with academic careers and the gender differences in publication impact.
The issues we identify here, together with the known socialization concerns surrounding work-life balance, may have created a “tipping point” that explains the nearly intractable problem of retaining women within STEM disciplines. It is equally important to think about the role these previously unrecognized risk factors may play in determining the number of under-represented minorities in the STEM pipeline. It is not possible to address this point using the methods we describe here, but there may be opportunity and new impetus to develop novel tools that can provide a more sophisticated insight into why some groups of people are not well represented in scientific subspecialties. More intriguingly, we wonder how the perceived or real risks associated with resource infrastructure and future opportunities translate to other fields (business, politics, the legal profession) where there is a paucity of women and minorities in the upper career rungs. Most importantly, now that these factors have been identified, it should be possible to create policies that provide better opportunities for all individuals with an aptitude for science, and perhaps in all kinds of careers, to ensure that our work force is diverse and can gain from the insights of all contributing members.
We obtained complete faculty rosters as of June, 2010 for several top research universities in the U.S. in the disciplines of chemical engineering, chemistry, ecology, industrial engineering, material science, molecular biology and psychology (see Table S4, S5, S6, S7, S8, S9, S10 for a complete list of institutions and departments that were included in our analysis). We considered all active faculty members, including tenure-track and research faculty, but excluded emeritus professors. For each faculty member, we collected the following data: gender, year of Ph.D. (if available), current and past positions, a list of publications published by the end of 2010 and indexed in Thomson Reuters Web of Science (WoS), and the number of citations for these publications as of June, 2011. To obtain a reliable list of publications for each investigator from the WoS, we designed a supervised disambiguation protocol. Our protocol uses biographic information for an investigator to build and refine a query that retrieves the entire list of publications from the WoS. For example:
The disambiguation protocol downloads all types of publications of the authors. In the analysis we included articles, conference proceedings and reviews. At each step, we obtained the number of publications assigned to a particular author and checked for anomalies using a number of data features, the most important of which were:
Our disambiguation protocol allows us to introduce different names or initials for each scientist. For example, for female faculty whose CV publication lists show evidence of a change of family name after marriage, we include both names in the query. Note that the error in the publication list introduced by name changes is small. To estimate the percentage of false positives in the publications assigned to an author, we randomly sampled about one hundred authors in our database who had an updated list of publications on their personal websites. We then manually checked these lists against the results we obtained from the WoS. We estimated that, using our disambiguation protocol, the percentage of false positives in the publications assigned to an author is less than .
For the analysis of the h-index and the number of publications, we considered only papers published by December 31, 2000. In order to have a reliable measure of the h-index, we need to consider papers that have accrued a number of citations that truly reflects the impact of that research. Based on prior studies, we set ten years as the threshold for papers to have accumulated their “ultimate” number of citations.
Assume that an author with $N_p$ publications makes $s$ self-citations in each of her/his publications. The total number of self-citations is thus $s N_p$. In order to maximize her/his h-index, the author will distribute these self-citations homogeneously among $h$ of her/his own publications. Thus, the average number of citations per publication is $s N_p / h$, yielding $h = s N_p / h$ or $h = \sqrt{s N_p}$. That is, $h \propto N_p^{1/2}$.
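We surmise that given the number of publications $N_p$, the h-index is a random variable obeying the Poisson distribution:

$$P(h \mid N_p) = \frac{\lambda^{h} e^{-\lambda}}{h!},$$

with mean $\lambda = \kappa N_p^{\alpha}$. The likelihood of the data given this model is then:

$$\mathcal{L}(\kappa, \alpha) = \prod_{(N_p, h)} P(h \mid N_p),$$

where the product runs over all pairs $(N_p, h)$ in real data. The best estimates of $\kappa$ and $\alpha$ are those that maximize $\mathcal{L}$. The estimates yield good fits to the data (see Fig. 7).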
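A maximum-likelihood fit of this kind can be sketched without external libraries. The example below assumes the Poisson mean has the form `kappa * n ** alpha` (a prefactor and an exponent); the function name, the search interval for the exponent, and the golden-section search are choices made for illustration and are not taken from the paper. For a fixed exponent the optimal prefactor has a closed form, so only a one-dimensional search over the exponent is needed.

```python
import math


def fit_power_law_poisson(pairs, alpha_lo=0.0, alpha_hi=2.0, tol=1e-8):
    """Fit h ~ Poisson(kappa * n**alpha) by maximum likelihood.

    pairs: iterable of (n_publications, h_index) with n_publications >= 1
           and at least one positive h_index.
    Returns (kappa, alpha). The search interval [alpha_lo, alpha_hi] is an assumption.
    """
    pairs = list(pairs)
    total_h = sum(h for _, h in pairs)

    def kappa_for(alpha):
        # Setting d(log L)/d(kappa) = 0 gives kappa = sum(h) / sum(n**alpha).
        return total_h / sum(n ** alpha for n, _ in pairs)

    def profile_loglik(alpha):
        kappa = kappa_for(alpha)
        # The log(h!) terms do not depend on the parameters and are dropped.
        return sum(h * (math.log(kappa) + alpha * math.log(n)) - kappa * n ** alpha
                   for n, h in pairs)

    # Golden-section search for the maximum; the profile log-likelihood
    # of this log-linear Poisson model is concave in alpha.
    g = (math.sqrt(5) - 1) / 2
    a, b = alpha_lo, alpha_hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if profile_loglik(c) < profile_loglik(d):
            a = c
        else:
            b = d
    alpha = (a + b) / 2
    return kappa_for(alpha), alpha
```

Comparing the fitted exponent with 1/2, the value produced by self-citations alone, is then a one-line check on the output.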
We fitted the data in Fig. 6 to the generalized logistic function,

$$f(t) = A + \frac{K - A}{1 + e^{-B (t - M)}},$$

where $A$ is the lower asymptote, $K$ is the upper asymptote, $B$ is the growth rate, and $M$ is the time of maximum growth. We provide the values of the fitting parameters for all data sets in Tables S11, S12, S13, S14, S15, S16, S17. We use $M$ as a proxy for the time for transition to professional independence.
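The role of the time-of-maximum-growth parameter can be illustrated with a small Python sketch. It assumes the four-parameter logistic form (lower asymptote, upper asymptote, growth rate, time of maximum growth); the function and argument names are hypothetical, and no fitting is performed here.

```python
import math


def generalized_logistic(t, lower, upper, rate, t_mid):
    """Four-parameter logistic curve rising from `lower` to `upper` (assumed form)."""
    return lower + (upper - lower) / (1.0 + math.exp(-rate * (t - t_mid)))


def time_of_steepest_rise(times, lower, upper, rate, t_mid):
    """Return the midpoint of the grid interval with the largest finite-difference slope."""
    best_t, best_slope = None, float("-inf")
    for t0, t1 in zip(times, times[1:]):
        y0 = generalized_logistic(t0, lower, upper, rate, t_mid)
        y1 = generalized_logistic(t1, lower, upper, rate, t_mid)
        slope = (y1 - y0) / (t1 - t0)
        if slope > best_slope:
            best_t, best_slope = (t0 + t1) / 2, slope
    return best_t
```

At `t = t_mid` the curve sits halfway between its two asymptotes and rises fastest, which is why this parameter is a natural marker for the transition between two career regimes.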
The p-values of the linear correlations in Figure 5 and Figure 9 are obtained using two statistical tests, the permutation test and Student’s t-test. Since the Student’s t-test is well known, we describe here only the permutation test. Suppose that we have $n$ data points $(x_k, y_k)$ on the two-dimensional plane. We consider all the permutations of the $x$ (or $y$) values of the data points, and calculate the correlation coefficient for each permutation, which will yield $n!$ correlation coefficients, $r_1, r_2, \ldots, r_{n!}$. We then calculate the probability that these coefficients are larger than or equal to the correlation coefficient of the original data set, $r_0$, that is, $P(r_k \geq r_0)$. This probability is the p-value given by the permutation test.
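The permutation test described above can be sketched as follows. This is an illustrative implementation with hypothetical function names; it enumerates all permutations exactly, which is feasible for a handful of points (seven points give 5,040 permutations), and it computes the one-sided probability of a coefficient at least as large as the observed one. For a negative correlation one would count coefficients that are smaller than or equal to the observed value instead.

```python
from itertools import permutations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(xs, ys):
    """Fraction of permutations of ys whose correlation with xs is >= the observed one."""
    r0 = pearson_r(xs, ys)
    count = total = 0
    for perm in permutations(ys):
        total += 1
        # Small tolerance so the identity permutation is not lost to rounding.
        if pearson_r(xs, perm) >= r0 - 1e-12:
            count += 1
    return count / total
```

Because the identity permutation is always included, the smallest p-value this procedure can return for $n$ distinct points is $1/n!$.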
Statistical significance of gender difference in publication rate. Probability that a female faculty member published more articles at a given stage of her career than a male peer at the same career stage (red lines). We use z-scores to account for two trends in the data: (i) the publication rate increases over years (Fig. 3), and (ii) the publication rate varies with the career length (Fig. 4). We indicate the 90% and 95% confidence intervals by the dark grey and light grey areas respectively, and the medians of the probabilities obtained from random ensembles by black lines.
Time to career independence of female faculty members. The fraction of publications authored by female faculty members in which the female faculty member is the last author (red diamonds) and the fraction of publications in which a faculty member is the first author (pink squares). The red/pink lines are fits of the data to a generalized logistic function (Methods, Table S11, S12, S13, S14, S15, S16, S17). The grey shaded areas indicate the periods of professional independence for the different disciplines.
Time to career independence of male faculty members. The fraction of publications authored by male faculty members in which the male faculty member is the last author (blue diamonds) and the fraction of publications in which a faculty member is the first author (azure squares). The blue/azure lines are fits of the data to a generalized logistic function (Methods, Table S11, S12, S13, S14, S15, S16, S17). The grey shaded areas indicate the periods of professional independence for the different disciplines.
Statistical significance of gender difference in publication impact. Probability that female authors have larger h-index than male authors when accounting for the number of publications. The red line shows the results for windows including authors with at least 30 publications and at most publications. Dark grey areas and light grey areas show the 90% and 95% confidence intervals (see Methods for details).
Gender of faculty in Chemical Engineering departments.
Gender of faculty in Chemistry departments.
Gender of faculty in Ecology departments.
Gender of faculty in Industrial Engineering departments.
Gender of faculty in Material Science departments.
Gender of faculty in Molecular Biology departments.
Gender of faculty in Psychology departments.
Estimated values of parameters of logistic function for Chemical Engineering data.
Estimated values of parameters of logistic function for Chemistry data.
Estimated values of parameters of logistic function for Ecology data.
Estimated values of parameters of logistic function for Industrial Engineering data.
Estimated values of parameters of logistic function for Material Science data.
Estimated values of parameters of logistic function for Molecular Biology data.
Estimated values of parameters of logistic function for Psychology data.
Estimated values of parameters of the power-law relation between impact (h-index) and number of publications.
We thank R. Guimerà, S. Mukherjee, R. D. Malmgren, P. McMullen, M. J. Stringer, and James A. Evans for comments and suggestions. We thank S. C. Tobin for editorial assistance.
The authors acknowledge the support of NSF awards SBE 0624318 and IIS 0830388. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. No additional external funding was received for this study.