The term “grade inflation” covers a multitude of phenomena, some of which are even alleged to be sins. Continuing increases in average grades have been widely documented in many universities over the last several decades (for example, Sabot and Wakeman-Linn, 1991; Johnson, 2003). Conversely, cases of grade deflation are rare and short-lived, although in some settings, such as first-year law courses, some universities have held to a strict curve. Also widely documented, and often associated with grade inflation, are systematic differences in grade levels by field of study, with a common belief that the sciences and math grade harder than the social sciences, which in turn grade harder than the humanities, and that economics behaves more like the natural sciences than like the social sciences. The general persistence of these relative differences in grades seems to us to be more interesting and more difficult to explain than the persistence of modest grade inflation in general, and these differences are the principal focus of this paper. Why, for example, should average grades in English be much higher than average grades in chemistry? And what is going on when a department’s grading practices change markedly relative to other departments?
We begin with an overview of some evidence on grade inflation by department and course level, focusing in particular on detailed data that we have from the University of Michigan. Grades in undergraduate arts and sciences courses at the University of Michigan have, with a few exceptions, been rising slowly and steadily since at least 1992. But our main focus is to explore some possible reasons for the highly (but not perfectly) stable differences in the grading practices of departments. Perhaps surprisingly, we uncover a story that is much richer and more interesting than some variant of “the sciences (which are virtuous or mean, depending on your point of view) grade tough, and the humanities (which are the opposite, again depending) grade easy.”
Our basic story is fairly simple. Grades are an element of an intra-university economy that determines, among other things, enrollments and the sizes of departments. Departments supply courses and students demand them, although the payment from students to faculty is mediated by the university administration, and there are also nonpecuniary rewards and costs associated with teaching. Departments generally would prefer small classes populated by excellent and highly motivated students. The dean, meanwhile, would like to see departments supply some target quantity of credit hours (the more the better, other things equal) and will penalize departments that don’t do enough teaching. In this framework, grades are one mechanism that departments can use to influence the number of students who will take a given class. But both the costs and consequences of different grading policies vary systematically across departments and courses. Grading is always at least somewhat costly, but the cost rises with the opportunities students have to quarrel with the fairness of the grading standards and methods that faculty use. On the demand side, some courses have close substitutes while others do not, and one would expect the grade-elasticity of demand to behave in the usual way: enrollment should respond more strongly to grades where close substitutes are available.
This framework leads to several hypotheses about relative grades across departments and courses. First, the distribution of grades is likely to be lower where courses are required, and where there are agreed-upon and readily assessed criteria (right or wrong answers) for grading. By contrast, departments that evaluate student performance using interpretative methods will tend to have higher grades, because using these methods increases the personal cost to instructors of assigning and defending low grades. Second, upper-division classes are likely to have higher grades than lower-division classes, both because students have selected into the upper-division courses where their performance is likely to be stronger and because faculty want to support (and may even like) their student majors. Third, grades can be used in conjunction with other tools to attract students to departments that have low enrollments and to deter students from courses of study that are congested.
We find some evidence in support of each of these patterns. As it happens, the consequence of the preceding tendencies is that, indeed, the sciences (mostly) grade harder than the humanities. But there are some surprises. For example, consistent with our framework but not consistent with the notion that the humanities grade softer than the sciences for some intrinsic reason, we find that at Michigan introductory physics and chemistry labs grade much easier than second-year French courses.
We find relative grades to be both more interesting and more amenable to analysis than the low rate of general grade inflation. Inflation, we expect, arises from two complementary features of the landscape: for any instructor in any course, grading a little more softly than expected is costless and it makes students happy. The instructor may gain some benefit in teaching evaluations (Johnson, 2003), but even if not, the opportunity to (in effect) print money is one that at least some instructors will find appealing. As long as some faculty respond to this opportunity, others in the department will be under pressure to adjust to the new norms for grades, and at least some other departments will endeavor to follow this trend in order to maintain market share and perhaps also to avoid the unpleasantness of widespread student grumbling. This story is hard to verify or to refute, in part because general grade inflation has proceeded without interruption for so long, but the key ingredients are surely in place.
We conclude with a discussion of implications for further research and for academic policy. We argue that differential grading standards have potentially serious negative consequences for the ideal of liberal education. At the same time, we conclude that any discussion of a policy response to grade inflation must begin by recognizing that American colleges and universities are now in at least the fifth decade of well-documented grade inflation and differences in grading norms by field. Current grading behavior must and will be interpreted in the context of current norms and expectations about grades, not according to some dimly imagined (anyone who actually remembers it is retired) age of uniform standards across departments. Proposals that attempt to alter grading behavior will face the costs of acting against prevailing customs and expectations, whether in altering pre-existing patterns of grades across departments within a college or university or in attempting to alter grades in one institution while recognizing that other universities may not change.