The National Institutes of Health (NIH) Revitalization Act of 1993 requires that NIH-funded clinical trials include women and minorities as participants; other federal agencies have adopted similar guidelines. The objective of this study is to determine the current level of compliance with these guidelines for the inclusion, analysis, and reporting of sex and race/ethnicity in federally funded randomized controlled trials (RCTs) and to compare the current level of compliance with that from 2004, which was reported previously.
RCTs published in nine prominent medical journals in 2009 were identified by PubMed search. Studies where individuals were not the unit of analysis, those begun before 1994, and those not receiving federal funding were excluded. The PubMed search located 512 published articles. After exclusion of ineligible articles, 86 (17%) remained for analysis.
Thirty studies were sex specific. The median enrollment of women in the 56 studies that included both men and women was 37%. Seventy-five percent of the studies did not report any outcomes by sex, including 9 studies reporting <20% women enrolled. Among all 86 studies, 21% did not report sample sizes by racial and ethnic groups, and 64% did not provide any analysis by racial or ethnic groups. Only 3 studies indicated that the generalizability of their results may be limited by lack of diversity among those studied. There were no statistically significant changes in inclusion or reporting of sex or race/ethnicity when compared with 2004.
Ensuring enhanced inclusion, analysis, and reporting of sex and race/ethnicity entails the efforts of NIH, journal editors, and the researchers themselves.
In 1986, the National Institutes of Health (NIH) instituted a policy urging the inclusion of women in clinical trials. This policy became law when Congress passed the NIH Revitalization Act of 1993, which requires that NIH-supported clinical research include women and minorities as subjects.1 The guidelines for implementation, amended in 2001,2 require researchers to address inclusion of women and minorities in funding proposals and impose mandatory reporting of subject accrual in annual progress reports. They also state that phase III drug trials must be designed and carried out to allow for the valid analysis of differences between women and men when prior research has indicated that such differences may be important, and that preliminary trials must provide enough information to inform the design of subsequent phase III trials. In 1993, the Food and Drug Administration (FDA) published similar guidelines, allowing for the inclusion of women of childbearing potential in early studies of drugs. The Agency for Healthcare Research and Quality (AHRQ) and the Centers for Disease Control and Prevention (CDC) developed similar guidelines soon after the NIH Revitalization Act took effect.
The Office of Research on Women's Health (ORWH) with the NIH Office of Extramural Research and the Office of Intramural Research has played a critical role in funding clinical trials to study women's health and explore the role of sex and gender differences in health outcomes. It has also taken the lead on overseeing adherence to the NIH policies on the inclusion of women and minorities as subjects in clinical research. ORWH's outreach efforts assist investigators with the preparation of proposals and progress reports in compliance with NIH policies as well as with strategies to improve recruitment and retention of women and minorities in clinical research.3 The NIH Tracking and Inclusion Committee monitors the numbers of males and females by race and ethnicity who are enrolled in clinical research studies funded by NIH, which is published in a biennial report.4
In fact, in FY2007, women constituted 61.8% of participants in NIH-funded clinical research overall. When single-sex studies are excluded, participation of men and women was nearly equal (49.8% and 46.5%, respectively).4 Although the NIH and, specifically, ORWH have done a laudable job of increasing the proportion of women in clinical trials, they lack the authority to require subgroup analysis and reporting in publications resulting from the research they fund. Current wording in the guidelines “strongly encourages” subgroup analyses in all publication submissions or, at minimum, a brief explanation of why such information was not provided.2
To assess the rate of reporting of sex-specific and race-specific results from clinical trials, we conducted a review5 of randomized controlled trials (RCTs) published in 2004. We found that federally funded clinical trials adhered inconsistently to the standards mandated by the NIH. Women were generally underrepresented in clinical studies that enrolled both men and women, making up, on average, 37% of the study participants. The findings supported previous research that had included studies begun before the Congressional mandate in 1993.6–8
Given the continued emphasis by NIH and others over the past 5 years on including women and underrepresented minorities in clinical trials, we were interested in reassessing compliance in 2009. The current analysis evaluated the inclusion, analysis, and reporting of sex and race/ethnicity in clinical trials by examining federally funded studies published in 2009 in the areas of general internal medicine, oncology, cardiology, infectious disease, and obstetrics and gynecology (to assess race/ethnicity only) and comparing the current levels with those from 2004. We used the same methodology as in the 2004 analysis. We also examined a sample of studies that did not receive federal funding to assess if these studies adhered to the guidelines.
We located RCTs published in 2009 by computerized search of PubMed, focusing on the areas of general internal medicine, oncology, cardiology, infectious disease, and obstetrics and gynecology. We analyzed the same nine journals as in 2004. These had been selected on the basis of both their 2003 impact factor (a rating of the frequency with which articles are cited in a given year, determined by Journal Citation Reports® retrieved from the ISI Web of Knowledge®) and the number of RCTs published in 2004. The 2008 impact factors have increased for each of the nine journals. The journals evaluated were New England Journal of Medicine, Journal of the American Medical Association, Annals of Internal Medicine, American Journal of Medicine, Journal of Clinical Oncology, Circulation, Clinical Infectious Diseases, Obstetrics and Gynecology, and the American Journal of Obstetrics and Gynecology.
A PubMed search in each journal used limits as indexed by the National Library of Medicine to select all articles described as “randomized controlled trial” that were in English, based on data from humans, and published during 2009. In cases where an article was published online and also in print, the date of publication used for selection refers to the earlier of the two dates.
For each article, we determined the source of funding and the date when study recruitment began. Letters, Brief Communications, and clinical trials begun before 1994, as well as studies with no federal support, were excluded. Studies were also excluded when an individual was not the unit of randomization or analysis, where only a subset of a trial's subjects were analyzed, where data were combined from several trials, or where no subjects resided in the United States.
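The exclusion criteria above can be read as a simple screening rule applied to each article. The following is a minimal sketch of that rule, not the procedure the authors actually used: the `Trial` record, its field names, and the reason strings are hypothetical, and treating an unknown start date as grounds for exclusion is an assumption taken from the exclusion reasons listed in the results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trial:
    """Hypothetical record for one published RCT; field names are illustrative."""
    article_type: str                 # e.g. "original", "letter", "brief communication"
    federal_funding: bool             # any federal support acknowledged
    recruitment_start: Optional[int]  # year recruitment began; None if unknown
    individual_unit: bool             # an individual was the unit of randomization and analysis
    full_sample_analyzed: bool        # False if only a subset of the trial's subjects was analyzed
    pooled_trials: bool               # True if data were combined from several trials
    any_us_subjects: bool             # at least one subject resided in the United States


def exclusion_reasons(t: Trial) -> List[str]:
    """Return every exclusion criterion the trial meets (empty list if eligible)."""
    reasons = []
    if t.article_type.lower() in {"letter", "brief communication"}:
        reasons.append("letter or brief communication")
    if not t.federal_funding:
        reasons.append("no federal support")
    if t.recruitment_start is None:
        # Assumption: an unknown start date is treated as an exclusion.
        reasons.append("start date unknown")
    elif t.recruitment_start < 1994:
        reasons.append("recruitment began before 1994")
    if not t.individual_unit:
        reasons.append("individual not the unit of randomization or analysis")
    if not t.full_sample_analyzed:
        reasons.append("only a subset of subjects analyzed")
    if t.pooled_trials:
        reasons.append("data combined from several trials")
    if not t.any_us_subjects:
        reasons.append("no subjects residing in the United States")
    return reasons


def is_eligible(t: Trial) -> bool:
    return not exclusion_reasons(t)
```

Returning the list of reasons rather than a single flag mirrors the way a study can be eliminated "for one or more" reasons.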
For the analysis of sex-based reporting, studies published in obstetrics and gynecology journals and those that were specific to only males or females were excluded. Conditions that are not exclusive to one sex but may disproportionately affect members of one sex (e.g., autoimmune diseases) were not excluded. Studies based in treatment facilities for veterans were not excluded unless they addressed a condition found only in men (e.g., prostate cancer). We evaluated articles for the reporting of sex-specific and race/ethnicity-specific results. Obstetrics and gynecology articles were evaluated for race/ethnicity-specific results only. We also recorded whether sex and race/ethnicity were considered during the analysis of outcomes and whether the authors acknowledged the potential impact of sex, race, or ethnicity on either the results or their generalizability.
The sample distribution across sex and race/ethnicity was recorded in terms of both percent distribution and absolute numbers (because sample size drives the ability to find statistical significance). Comparisons between subsets of the articles were made using Fisher exact test. Each article was examined in its entirety, including abstract, text, and tables. In addition to the articles themselves, any published follow-up articles or comments by either the author(s) or another researcher were examined for information relating to sex, race, and ethnicity. The race/ethnicity portion of the analysis is limited to black and Hispanic subgroup reporting, as other racial/ethnic minorities were rarely and inconsistently reported.
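For readers unfamiliar with the Fisher exact test, the sketch below shows one common two-sided form for a 2×2 table, written with the Python standard library only. It is an illustration under stated assumptions rather than the authors' software: the function name is hypothetical, and the two-sided convention used (summing the probabilities of all tables no more likely than the observed one) is a common choice that the article does not specify.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed convention: sum the hypergeometric probabilities of every table
    with the same margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Probability that the top-left cell equals k, given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    tol = 1e-9  # guards against floating-point ties
    total = sum(p for p in (prob(k) for k in range(lo, hi + 1))
                if p <= p_obs * (1 + tol))
    return min(1.0, total)


# Example using counts reported in this article: 13 of 34 studies in 2009
# versus 2 of 33 studies in 2004 treated "Hispanic" as an ethnic category
# (reading the 2004 figure as 2 of 33 is an assumption about the text).
p = fisher_exact_two_sided(13, 34 - 13, 2, 33 - 2)
print(p < 0.01)
```

Under this convention the example yields a p value below 0.01, in line with the value reported for that comparison.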
We were also interested in whether nonfederally funded studies were more or less compliant with NIH guidelines. To select a sample of nonfederally funded studies, we assigned random numbers generated using Microsoft Excel to the 356 studies that did not acknowledge any federal funding. We then sorted the list and selected the first 25 studies for analysis. If a study was determined to be ineligible (e.g., no subjects resided in the United States, only a portion of trial subjects' data were analyzed), that study was not included, and the next study on the list was selected.
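The selection procedure just described (assign a random number to each study, sort, and take the first 25 eligible studies) can be sketched as follows. The authors used Microsoft Excel; this Python version is only an equivalent illustration, and the function name, the `is_eligible` callback, and the `seed` parameter are hypothetical.

```python
import random
from typing import Callable, Iterable, List, Optional


def select_sample(study_ids: Iterable[int],
                  is_eligible: Callable[[int], bool],
                  n: int = 25,
                  seed: Optional[int] = None) -> List[int]:
    """Pick n eligible studies in random order, skipping ineligible ones."""
    rng = random.Random(seed)
    # Assign each study a random number, then sort by that number.
    keyed = sorted((rng.random(), sid) for sid in study_ids)
    selected = []
    for _, sid in keyed:
        if is_eligible(sid):
            selected.append(sid)  # ineligible studies are passed over
        if len(selected) == n:
            break
    return selected


# Usage with 356 hypothetical study IDs and a stand-in eligibility rule.
sample = select_sample(range(1, 357), lambda sid: sid % 7 != 0, n=25, seed=1)
print(len(sample))
```

Walking down the sorted list and skipping ineligible studies is equivalent to replacing each ineligible draw with the next study on the list.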
The search found 512 publications in the areas of general and internal medicine, oncology, cardiac and cardiology, obstetrics and gynecology, and infectious disease, all meeting the search criteria as of January 6, 2010. Of these, 426 were eliminated for one or more of the following reasons: no federal funding, no support described, funding prior to 1994 or unknown date of funding, no subjects residing in the United States, full sample of subjects not described, or meta-analyses. The remaining 86 studies that were included in this analysis represent 17% of all RCTs published in these nine journals, similar to the 14% of studies that were eligible for the 2004 analysis (p=0.15). Eighty-one studies (94%) received funding from the NIH, a significant increase over the 57 studies (83%) funded by NIH in 2004 (p=0.02).
Of the 86 studies, 12 were published in obstetrics and gynecology journals and were excluded from the primary analysis for reporting of sex differences. Of the remaining 74 studies, 18 (24%) were sex specific, 14 enrolling only women and 4 enrolling only men. Table 1 shows the distribution of these studies by journal type. Among the 56 studies that were not sex specific, the majority of studies (n=40, 71%) enrolled ≥30% women. The percentage of women in each sample ranged from 1% to 78%, the number of women enrolled ranged from 3 to 6801, and 18 studies had >300 female subjects. Comparing 2004 with 2009, there were no significant differences in the median percent of women enrolled in nonsex-specific studies (43% and 38%, respectively); 9 (16%) studies had ≤20% women in 2009, compared with 7 studies (15%) in 2004.
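The enrollment summaries in this paragraph (median and range of percent women, and counts of studies above or below a threshold) follow directly from per-study counts. A minimal sketch is shown below; the function name and the five `(women, total)` pairs are hypothetical and are not data from the reviewed studies.

```python
from statistics import median
from typing import Dict, List, Tuple


def summarize_enrollment(counts: List[Tuple[int, int]]) -> Dict[str, float]:
    """Summarize women's enrollment from (women, total) pairs, one per study."""
    pct = [100.0 * women / total for women, total in counts]
    return {
        "median_pct_women": median(pct),
        "min_pct_women": min(pct),
        "max_pct_women": max(pct),
        "n_at_least_30_pct": sum(p >= 30 for p in pct),
        "n_at_most_20_pct": sum(p <= 20 for p in pct),
        "n_over_300_women": sum(women > 300 for women, _ in counts),
    }


# Hypothetical counts for five non-sex-specific studies (illustration only).
example = [(3, 300), (120, 400), (350, 700), (45, 100), (78, 100)]
summary = summarize_enrollment(example)
print(summary["median_pct_women"])  # 45.0
```

Recording absolute counts alongside percentages matters because, as noted in the methods, sample size drives the ability to detect subgroup differences.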
Fourteen of 56 studies included in this analysis (25%) reported at least one outcome by sex, compared with only 6 (13%) in 2004 (p=0.10). Table 2 provides a breakdown of the inclusion of sex in the modeling and analysis by journal type. Six studies included sex in the model but did not provide sex-specific data for any outcome because differences by sex were found not to be statistically significant. Thirty-six studies (64%) did not report their findings by sex, did not include sex as a factor in modeling (or mention “baseline characteristics” as a general category of covariate), and did not provide a rationale for disregarding sex as a potential influence. There was little change from 2004, when 67% of studies fell into this category (p=0.36). Of the 9 studies with <20% women subjects, 8 did not mention any limitations to generalizing the results to both sexes. Only 1 study, which enrolled 98% men from a Veterans Administration (VA) population, acknowledged that the findings may not be applicable to women.
None of the 86 studies addressed a disease or condition that might be considered race specific (e.g., sickle cell disease in African Americans). Table 3 provides information on the extent to which sample size for racial and ethnic groups was reported in 2004 and 2009. About half of all studies reported the percent of subjects who were black (n=49, 57%) or Hispanic (n=34, 40%). Eighteen (21%) studies did not report the number of subjects in any racial/ethnic categories; none gave any indication that recruitment was limited to a single race or ethnicity. More than half of the studies (n=52, 60%) included both white and nonwhite subjects, with the proportion of white subjects ranging from 10% to 99%. In 15 (17%) of the studies, at least 90% of subjects were white, without a rationale as to why. There has been little change in the reporting of enrollment by race/ethnicity since the 2004 analysis (all p values nonsignificant).
Thirteen of the 34 studies that reported sample counts for Hispanic subjects treated “Hispanic” as an ethnic category, separate from racial description, in accordance with the Office of Management and Budget census categories.9 The remaining studies considered Hispanic to be a mutually exclusive race category (i.e., no subject could be classified as both black and Hispanic). The number of studies regarding Hispanic as an ethnicity rather than a racial category has substantially increased since 2004, when only 2 of the 33 studies that reported enrollment counts for Hispanic subjects did so (p<0.01).
The majority of studies in both 2004 and 2009 (83% and 79%, respectively, p=0.37) did not report any outcome by race or include it in the model. Table 4 shows the distribution of the inclusion of race/ethnicity in the model or analysis by journal type. Twelve studies did report at least one outcome by racial/ethnic subgroup, and 3 studies found that race/ethnicity was a key predictor of the outcomes under study. Only 1 study, which enrolled 97% white subjects, noted that the lack of racial/ethnic diversity in their sample limited the applicability of their findings to other racial/ethnic groups. In 81 articles (94%), the findings were deemed generalizable without regard to race or ethnicity.
Of 12 eligible studies in obstetrics and gynecology journals, 9 reported the proportion of black women enrolled, which ranged from 1% to 57%, and 7 reported the proportion of Hispanic women, which ranged from 4% to 52%. There were no differences in reporting of sample size for race/ethnicity between these journals and the seven medicine journals (p=0.15 for blacks and 0.13 for Hispanics). Only 1 of the 12 obstetrics and gynecology studies reported results by race or ethnicity. The proportion of obstetrics and gynecology studies that did not report results by race/ethnicity, include it in a model, or provide an explanation did not differ from the other seven journals in 2004 (p=0.19) or in 2009 (p=0.28).
Of the 86 studies discussed here, 12 were described in the body of the article as phase III drug studies for which the NIH requires sample sizes adequate to allow subgroup comparisons. One third (n=4) of the phase III studies were sex specific and enrolled only women, with 3 studies focused on breast cancer and 1 focused on cervical cancer. Of the 8 studies that were not sex specific, 4 (50%) provided results for at least one outcome by sex. Women constituted 15%–50% of the sample populations for these studies. Among all 12 phase III studies, 3 (25%) reported race/ethnicity-specific results. In 2004, only 4 studies were designated to be phase III drug studies, and none reported outcomes by sex or race/ethnicity.
We reviewed 25 articles randomly selected from 356 studies that did not acknowledge any federal funding support to determine if privately funded clinical trials reported results for sex or racial differences at the same rates as those studies with federal funding. Aside from funding source, all other exclusion criteria (e.g., all participants resided outside the United States, only a portion of a trial's subjects was analyzed) were the same. In terms of the analysis and reporting of sex and/or racial/ethnic differences, nonfederally funded studies did not differ significantly from the federally funded studies which are bound by the guidelines of the NIH and other federal funding agencies. Of the 25 studies, 10 (40%) were sex specific (2 enrolled men only and 8 enrolled only women). Studies including both men and women enrolled women as, on average, 40% of their total sample. Three of the 15 studies that enrolled both sexes reported at least one outcome by sex. One additional study, a phase II drug study, noted that the subsequent phase III study would be designed to detect differences between the sexes. Although nearly half of the studies (n=12, 48%) reported the racial/ethnic distribution of their subjects, none of the 25 studies reported any outcome by race/ethnicity.
This research was undertaken to examine studies published in nine high-impact journals in 2009 for compliance with the NIH inclusion and reporting guidelines for sex and race/ethnicity and to compare the results with previous findings from 2004. We found very little improvement over the last 5 years. The median enrollment of women in studies that included both sexes remained low, at 37%. Our finding is reinforced by two reviews of published cardiovascular studies that also found that inclusion of women and sex-specific analysis and reporting remain low.10,11
One fifth of the studies we reviewed failed to report the racial/ethnic distribution of their participants, and when reported, black and Hispanic subjects remain underrepresented in clinical studies relative to their distribution in the U.S. population. Slightly more studies reported at least one outcome by sex or race/ethnicity in 2009 than in 2004, although these increases did not reach statistical significance. Perhaps the most notable change since 2004 was an increase in the number of studies that consider “Hispanic” as an ethnic rather than a racial category in accordance with the Office of Management and Budget census categories.9
Although subgroup analysis and reporting are explicitly required under the NIH guidelines only for phase III clinical trials, it is also important that such analyses be published for other studies to generate hypotheses about sex and racial/ethnic differences for subsequent studies.12,13 In fact, the NIH guidelines require designers of phase III studies to examine data from prior studies to determine whether or not clinically important sex or racial/ethnic differences are expected. Where the evidence supports the existence of significant differences, the study design must specifically accommodate this. When the evidence is inconclusive, however, the study design should address the possibility but does not need to be powered to detect subgroup differences. Consequently, the lack of analysis and reporting for sex and racial/ethnic differences in preliminary studies could become the basis for the failure to conduct and report such analyses in subsequent research.
Observers of trends in clinical research caution that when researchers do find subgroup differences, they frequently fail to distinguish between biological and social causes of difference, essentially portraying differences as biologically inevitable and drawing attention away from the social constructs that cause observed health disparities.14 We agree that the analysis and interpretation of differences between subgroups requires a nuanced approach that considers nonbiological as well as biological explanations; however, the complexity of etiologies should not be considered an excuse for not analyzing sex or racial differences and, in some cases, gender differences.
One study in the present analysis that presented results by both sex and race provides an example for the reporting of subgroup differences in a thoughtful way. In the discussion of why black and female patients are at greater risk for adverse outcomes after vein bypass surgery, the authors acknowledged the possibility that differential clinical outcomes may be the result of biological differences or underlying social and economic disparities associated with race and gender.15
The major limitation to our study is that in selecting studies published in nine of the highest impact journals, we do not cover the whole body of NIH-funded RCTs. This may account for the differences seen in reporting between this study and the NIH tracking report for inclusion of women in clinical studies.4 However, the selected journals were the same set evaluated in our earlier study.
If it is the case (as we believe it is) that there is no willful intent to be noncompliant with Congressional mandates, why is there a continued lack of appropriate attention to the underrepresentation of women and ethnic/minority participants in the reporting of clinical research? Difficulty recruiting women and nonwhite participants into clinical research is one issue that is receiving considerable and appropriate attention through such initiatives as the Community Engagement aspects of the NIH Clinical and Translational Science Awards16 and the growing awareness of the importance of community-based participatory research.17
Failure to acknowledge the limitations of clinical research overwhelmingly involving white males within reports of the study results is a separate issue.18 Multiple examples exist across history of the experience and perspectives of low status or marginalized groups being ignored. Medical education and research are also replete with such examples.19,20 Although it remains disappointing, respected scientists who are well versed in the importance of acknowledging the limitations of any study have a long historical precedent of failing to acknowledge that their findings may not be generalizable to women or to men from nonwhite groups.
What are the next steps for ensuring enhanced inclusion, analysis, and reporting? NIH/ORWH can mandate and track inclusion, but they lack the ability to require analysis and reporting of sex and race/ethnicity in publications. Journal reviewers and editors can play an integral role in requiring reporting of these criteria or an adequate explanation (e.g., a footnote) as to why such reporting is not appropriate. As in the past, the scientists who conduct and report on the results of the research have the greatest responsibility, but they are not complying. In the CONSORT 2010 reporting guidelines for RCTs,21 items 20 and 21 in the checklist for authors encourage discussion of limitations and generalizability, respectively. Perhaps the next CONSORT could be even more prescriptive and include in these items the directive to authors to specifically comment on limitations and generalizability relative to the gender and ethnic/racial composition of the study participants. Then, journal editors could insist that all reports of clinical trials follow the CONSORT guidelines.
On a positive note, some journals have instituted policies about reporting sex and race/ethnicity. If race or ethnicity is reported in JAMA, the authors must explain why it was assessed and describe the classifications used, who classified participants as to race/ethnicity, and whether the options were defined by the investigator or the participant. Circulation asks authors to provide subanalyses by sex or race/ethnicity when appropriate or to specifically state that their analyses did not detect subgroup differences. The journal Nature is working on a policy change that requires authors to clearly label single-sex studies and to justify including only one sex in their research design (personal communication) following publication of an opinion piece urging such a policy.22 We applaud these journals for taking important and tangible steps toward enhanced inclusion, analysis, and reporting by sex and racial/ethnic categories and encourage them to enforce these policies.
Given the ongoing focus by NIH and others over the past 5 years on the inclusion of women and underrepresented minorities in clinical trials, we expected to see greater compliance with the guidelines, but for these nine influential journals, we found little improvement.
The authors have no conflicts of interest to report.