There are deep divides over the use of racial and ethnic categories in biomedical research and its application in both medical and non-medical contexts. On one side of a roughly described dividing line are practitioners who need to use every piece of information at their disposal to solve pressing, real-world problems in real time, such as making clinical diagnoses or identifying perpetrators of crime. On the other side are scientists and policy makers committed to meeting a scientific and social need for accuracy and thus trying to avoid miscategorization.
As Jay Cohn describes in this issue, medical practitioners in particular have used racial and ethnic categories to “enhance diagnostic and therapeutic precision.”1 He argues for retaining this practice. The plea, motivated by genuine concern for patients, is to avoid “throwing the baby out with the bathwater.” However, Cohn and others mischaracterize the nature of the debate. The argument is not about whether differences among populations exist, or even whether differences among “races” exist. There are clearly phenotypic and physiological differences within the human population, and some of these roughly track socially defined groupings that we call “race” or “ethnicity.” However, there is no single accepted set of racial and ethnic categories; we cannot even clearly define what “race” or “ethnicity” means. Moreover, while some clinically significant differences map onto racial or ethnic boundaries (however defined), many do not.2 This means that basing diagnostic and therapeutic decisions in part on perceived race or ethnicity will be imprecise. Further, this mapping will ultimately rely largely on visual identification, which is notoriously unreliable.3 Finally, reliance on racial and ethnic categories distracts from information that might actually be more relevant to research, diagnosis, or therapy, such as environmental factors or finer-grained differences in ancestral origins than the crude grouping of “race.”
The point of current research to characterize human genetic, environmental, and phenotypic variation is to bring precision to genomic analysis, and to help us understand when environmental, rather than genomic, variation is the major determinant of disease. The problem with using race or ethnicity as a measure is that it is really used as a proxy for an as-yet undetermined mix of genetic, biological, and environmental factors. While this may be perceived as “good enough” for use in daily clinical practice, it reinforces inaccurate perceptions about “racial” and “ethnic” groups.
Race is real, but as noted by Troy Duster and others,4 it is often not a measure of an individual, but an interactive measure of a perception of an individual by another. As such, it can be a useful measure in certain studies such as research on health disparities, where the effects of perceived race on social interactions and health are specifically of interest. In this situation, the “race” variable is appropriately treated as a combination of biological and social factors.
There is a long tradition in medicine and other professions of using racial groups to categorize individuals, and these categories are deeply embedded in the legal and social systems of the United States as well as the national psyche. However, the history of our classification scheme argues against, rather than for, continuing this tradition.5 Scientists and clinicians do not intend to imply hierarchy when they use racial classifications, but it is naive to think that hierarchy can be surgically removed from the concept of race. Hierarchy was an integral part of the concept as originally defined.
When Linnaeus introduced race into scientific taxonomy in 1758, he described four racial groups.6 Since then, the scientific literature has documented as many as thirty-four different races.7 More recent literature describing groupings of human populations based on genetic analysis has suggested the existence of anywhere from two to six groups.8 However, even Darwin questioned the distinctiveness of racial groupings in the human species: “It may be doubted whether any character can be named which is distinctive of a race and is constant.”9 This doubt is echoed in modern times by geneticists: “Thus, populations are never pure in a genetic sense, and definite boundaries between individuals or populations (e.g., “races”) will necessarily be somewhat inaccurate and arbitrary.”10
As described by Keita, Rotimi, and many others,11 “race” is used in human populations in a way that it is not used in other species. Definitions of race in other species include: “An interbreeding, usually geographically isolated population of organisms differing from other populations of the same species in the frequency of hereditary traits. A race that has been given formal taxonomic recognition is known as a subspecies.”12 Another definition is a “group of organisms (all of the same species) that is genetically self-sustaining and isolated geographically or temporally during reproduction.”13 Neither of these definitions applies to our use of the term “race” to describe human populations today. As Rotimi has noted, isolated groups such as Old Order Amish might be considered genetically isolated enough to be considered a “race” by this type of definition,14 but this clearly does not square with typically-used racial categories in the United States.15
This conceptual mismatch illustrates the most serious problem with the use of racial and ethnic categories in biomedical research. Even if “race” and “ethnicity” could be described consistently by genetic or biological measures (and they cannot), categorization of individuals in the clinic relies on classification by visually-identified characteristics, self-definition, or a combination of the two.16 Even though we may feel confident of our visual perceptions and racial or ethnic conclusions, we know that this kind of classification is dismally inaccurate. Further, self-definition may capture an individual’s identification with a particular social group, rather than biological ancestry or anything genomic.
For example, in a study comparing the racial classification on birth and death certificates of infants in the United States who died within a year of birth, “inconsistency in the coding of race is low for whites (1.2%), greater for blacks (4.3%), and greatest for races other than white or black (43.2%).”17 In another example, a study in the United Kingdom compared individuals’ “true” ethnic classification (by visual identification by police officers) with their predicted classification (by analysis of short tandem repeat, or STR, loci, which are used to identify individuals because they vary highly from person to person). No group was classified the same way by both methods more than sixty-seven percent of the time, and those genetically assessed to be in the “Middle-Eastern” category were perceived that way visually only thirty percent of the time.18
Relying on self-reporting or replacing race with ethnicity does not increase precision. “[I]n one American study where people had to assign themselves to an ethnic group in two consecutive years, one third of the population chose a different ethnic group on the second occasion.”19 As noted above, self-identification may have more to do with social identification and affiliation than genomics.
One of the most insidious consequences of clinging to racial and ethnic classifications is the assumption that if a difference between “racial groups” is found, (a) race is the best way to categorize this difference, and (b) race is the most relevant factor contributing to this difference. For example, pharmacogenetic differences among groups that differ by “race” are often cited as a reason to retain racial categorization. The authors of one study state that “5-10% of Europeans, but only ~1% of Japanese, have loss-of-function variants at [the CYP2D6 locus] that affect the metabolism of more than 40 drugs.”20 However, the authors also note that “[t]he CYP2D6 ultra-rapid metabolizer alleles also vary in frequency, even within Europe, from ~10% in Northern Spain to 1-2% in Sweden,” suggesting that differences within a “racial group” can be just as great and just as relevant as differences between such groups.
Even if genetic differences among populations might be associated with differences in, for example, drug reactions, the genes implicated in these differences might not be the most important factors driving the phenotype. Racial differences in thiopurine methyltransferase genotype21 and adverse reactions to chemotherapy are an example of a clinical use for racial classification.22 However,
[a]n analysis of six clinical studies correlating adverse thiopurine effects and TPMT genotype revealed that an average of seventy-eight percent of adverse drug reactions were not associated with TPMT polymorphisms. Pharmacogenetic testing will thus not eliminate the need for careful clinical monitoring of adverse drug reactions.23
All of this means that there is no “baby in the bathwater,” no clinical or scientific utility to racial and ethnic categories unless one is studying perceived race or ethnicity or self-perception. There are certainly clinically significant differences among individuals and among groups. However, what defines these individuals and groups is not what we call “race” or “ethnicity” because there is no consistent definition of racial or ethnic categories. Because social perceptions of the meaning of race and ethnicity are extremely fluid, basing research findings on these categories or applying scientific findings based on perceived race or ethnicity is fraught with problems. Thus, attempts to “better define [the racial and ethnic] structure [of drug response]” will be futile.24