Odds ratios (ORs) are widely used in scientific research to demonstrate associations between outcome variables and covariates (risk factors) of interest and are often described in language suitable for risks or probabilities, but odds and probabilities are related, not equivalent. In situations where the outcome is not rare (e.g., obesity), ORs no longer approximate the relative risk ratio and may be misinterpreted. Our study examines the extent of misinterpretation of ORs in Obesity and International Journal of Obesity. We reviewed all 2010 issues of these journals to identify all articles that presented ORs. Included articles were then primarily reviewed for correct presentation and interpretation of ORs; and secondarily reviewed for article characteristics that may have been associated with how ORs are presented and interpreted. Of the 855 articles examined, 62 (7.3%) presented ORs. ORs were presented incorrectly in 23.2% of these articles. Clinical articles were more likely to present ORs correctly than social science or basic science articles. Studies with outcome variables that had higher relative prevalence were less likely to present ORs correctly. Overall, almost a quarter of the studies presenting ORs in two leading journals on obesity misinterpreted them. Furthermore, even when researchers present ORs correctly, the lay media may misinterpret them as relative risk ratios. Therefore, we suggest that when the magnitude of associations is of interest, researchers should carefully and accurately present interpretable measures of association -- including risk ratios and risk differences -- to minimize confusion and misrepresentation of research results.
Odds ratios (ORs) and risk ratios (RRs)1 are commonly reported measures of association in the health care literature ([1-3]). Some authors have offered a preference for RRs because they may provide a more straightforward, intuitive way to interpret results than ORs ([1-3]). However, RRs cannot be easily calculated in certain situations, such as in case-control studies and meta-analyses. In these cases and/or other situations where the RR is simply not calculated, the OR is often used ([1-4]). ORs asymptotically approximate RRs as the prevalence of an outcome of interest observed in a study approaches zero (). However, in practical terms, when the prevalence of an outcome is greater than 10%, ORs can grossly misrepresent RRs and are therefore incorrectly interpreted if interpreted as RRs in such cases. Researchers have reported that ORs are misused in as many as 26% of published studies in some academic journals (), which can result in loss of study interpretability, confusion, and either over- or underestimation of the association of predictors with an outcome ([1-11]). Use of RRs and/or risk differences (often called marginal effects in the economics literature, where they are widely used) when presenting results can prevent these limitations ([1-3]), and several articles and textbooks provide guidance on this topic ([12, 13]). There exist mathematical ways to adjust ORs to approximate (and in some cases exactly equal) RRs ([3, 14]). In addition, other statistical techniques can be used to estimate the magnitude of association in lieu of RRs such as a modified Poisson regression () or a Mantel-Haenszel estimate (). These methods can help minimize confusion when interpreting study results ([15, 16]), but a large number of researchers are not familiar with these techniques ([1, 3, 17]).
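To make the divergence between the two measures concrete, the following minimal Python sketch holds the risk ratio fixed and computes the OR implied at several baseline prevalences. The fixed RR of 1.5, the baseline prevalences, and the function names are illustrative assumptions, not values or methods taken from the studies cited above.

```python
def odds(p):
    """Convert a probability (0 < p < 1) to odds."""
    return p / (1.0 - p)


def odds_ratio_from_risks(p_exposed, p_unexposed):
    """Odds ratio implied by the risks in two groups."""
    return odds(p_exposed) / odds(p_unexposed)


RISK_RATIO = 1.5  # assumed value, for illustration only

rows = []
for p_unexposed in (0.01, 0.05, 0.10, 0.20, 0.40):
    # Risk in the exposed group that yields the assumed risk ratio.
    p_exposed = RISK_RATIO * p_unexposed
    rows.append((p_unexposed, odds_ratio_from_risks(p_exposed, p_unexposed)))

for p_unexposed, or_value in rows:
    print(f"baseline risk {p_unexposed:.2f}: RR = {RISK_RATIO:.2f}, OR = {or_value:.2f}")
```

Under these assumptions the OR stays close to 1.5 when the outcome is rare and moves steadily away from it as the baseline risk rises, which is the pattern described in the paragraph above.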
ORs are particularly problematic when studying health outcomes with relatively high prevalence, such as obesity. The obesity literature has recently grown significantly in terms of both volume and breadth in the clinical, basic, and social sciences (). Furthermore, the prevalence of obesity is now estimated at 15-20% in adolescents and 30-35% in US adults ([19-24]), making obesity research particularly vulnerable to the potential misinterpretation of effect sizes2 based on ORs. In addition, obesity-related studies are often cited in the mainstream press, where ORs from scientific journals can be misinterpreted by the media as RRs, even if the original authors had not interpreted them as such. An example of media misinterpretation of ORs was presented in the New York Times () and other major media outlets in 1999. An OR of 0.60, presented by Schulman et al. in the New England Journal of Medicine (), was interpreted as showing that when compared to whites and males, blacks and females were “40 percent less likely” to be referred for cardiac testing (). The relative risk difference between whites and blacks was in actuality only 7% () (see Appendix A). A more recent instance occurred in Science Daily, where authors correctly reported the ORs but incorrectly interpreted them as a percent increase in risk, as in the following example: “results indicate that short sleep was associated with obesity, with the adjusted ORs for black Americans (1.78) and white Americans (1.43) showing that blacks had a 35 percent greater risk than whites of obesity associated with short sleep,” (). This type of misinterpretation is problematic given that scientific obesity studies are intended to influence policy makers, clinicians, and other researchers.
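The gap between the two readings of the cardiac referral result can be checked with a short calculation. The sketch below takes the OR (0.57) and RR (0.93) listed in Appendix A and back-solves for the referral rates they jointly imply, using the identity RR = OR / (1 - P0 + P0 × OR) for a simple two-group comparison. The function name is hypothetical, and because the inputs are rounded, the implied rates are approximate rather than the figures reported in the original study.

```python
def implied_baseline_risk(odds_ratio, risk_ratio):
    """Risk in the reference group implied by an OR and an RR for the same contrast.

    Obtained by solving RR = OR / (1 - P0 + P0 * OR) for P0.
    """
    if odds_ratio == 1.0:
        raise ValueError("baseline risk is not identified when the OR equals 1")
    return (1.0 - odds_ratio / risk_ratio) / (1.0 - odds_ratio)


OR_REPORTED = 0.57  # rounded value from Appendix A
RR_REPORTED = 0.93  # rounded value from Appendix A

p_reference = implied_baseline_risk(OR_REPORTED, RR_REPORTED)
p_comparison = RR_REPORTED * p_reference

print(f"implied referral rate, reference group:  {p_reference:.2f}")
print(f"implied referral rate, comparison group: {p_comparison:.2f}")
print(f"relative reduction based on the RR: {(1 - RR_REPORTED) * 100:.0f}%")
print(f"reduction suggested if the OR is read as an RR: {(1 - OR_REPORTED) * 100:.0f}%")
```

With referral this common in both groups, a small relative difference in risk corresponds to a much larger difference on the odds scale, which is why reading the OR as an RR overstated the disparity.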
Misuse of ORs has been examined in the field of obstetrics and gynecology () and in medical and epidemiology journals ([2, 29]), but to our knowledge, no previous study has examined the extent of potential misuse of ORs in the obesity literature. The purpose of this paper is to review the recent obesity literature for potential misuse and/or misinterpretation of ORs. Specifically, we are interested in estimating the prevalence of misuse (using ORs in studies where the outcome of interest has an occurrence rate greater than 10%) and the prevalence of misinterpretation of ORs (incorrectly quantifying the effect size) in obesity studies. To do so, we focus on two of the most prominent journals that concentrate on obesity research and publish primarily original research (i.e., not reviews): Obesity and International Journal of Obesity.
All articles in the 2010 issues of International Journal of Obesity and Obesity were searched to identify those that presented ORs. Our literature review consisted of two phases. First, we identified articles that were likely to use ORs by electronically searching the titles and abstracts of articles in these journals for the terms “odds ratio,” “OR,” and “logistic,” in reference to logistic regression. Second, the full text of these articles was assessed using a coding sheet designed to extract information that may relate to the use and misuse of ORs in these articles (see Appendix B). For each article reviewed, we extracted a number of variables, including the journal in which the article was published; whether the first author had a university appointment; whether the research was conducted in the US or internationally; and whether the research was basic science, clinical science, or social science, cellular or non-cellular, animal or human, and experimental (e.g., randomized controlled trial) or non-experimental (e.g., observational).
Information on sample size, prevalence of the outcome of interest, the reported main effect of the primary OR, and the OR calculation method were also extracted from each included article. In addition, we looked at whether P0, the prevalence of the outcome of interest in the unexposed group, was calculated and reported. P0 is useful because along with the OR it can be used to estimate the RR of an outcome (). In some cases where P0 was not calculated, we were able to impute the value, if sufficient data were provided. We also looked at whether “risk differences” were reported. Widely used in the economics literature, risk differences estimate the absolute change observed in an outcome variable given a change in a covariate([30, 31]), and, like RRs, risk differences are more intuitively interpretable than are ORs (). In order to ensure inter-rater reliability, the first 16% of articles examined were coded separately by three authors and discussed for discrepancies (<1% disagreement). Next, one author coded the remaining articles and at least one additional author reviewed a random sample of 15 articles for agreement; again little or no disagreement existed.
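As an illustration of why P0 matters, the sketch below applies one commonly used conversion, RR = OR / (1 - P0 + P0 × OR), to a single OR at several values of P0. Whether this is the exact formula given in the cited reference is an assumption; the OR of 2.0, the P0 values, and the function name are hypothetical. The conversion is exact for a crude two-by-two comparison and only approximate when applied to covariate-adjusted ORs.

```python
def risk_ratio_from_odds_ratio(odds_ratio, p0):
    """Approximate RR from an OR and the outcome prevalence in the unexposed group (P0)."""
    if not 0.0 <= p0 < 1.0:
        raise ValueError("p0 must be in the interval [0, 1)")
    return odds_ratio / (1.0 - p0 + p0 * odds_ratio)


EXAMPLE_OR = 2.0  # assumed value, for illustration only

converted = {}
for p0 in (0.01, 0.10, 0.30, 0.50):
    converted[p0] = risk_ratio_from_odds_ratio(EXAMPLE_OR, p0)
    print(f"P0 = {p0:.2f}: OR = {EXAMPLE_OR:.2f} corresponds to RR = {converted[p0]:.2f}")
```

The same OR corresponds to a progressively smaller RR as P0 increases, so an article that reports an OR without P0 (or enough data to impute it) leaves readers unable to recover the RR.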
Ultimately we were concerned with whether the articles we evaluated presented ORs correctly. We defined correct use of ORs as cases when authors:
ORs were defined as incorrect when authors interpreted ORs as risk ratios where the outcome of interest was not rare (e.g., greater than 10%). We looked for statements by the authors such as “increased risk,” “X% higher,” “X times as likely,” that denoted misinterpretation of their studies’ ORs as risk ratios4. All of these data were then entered into a database.
After all articles were coded, results were entered into STATA version 11 for analysis. We conducted chi-square or Fisher’s exact tests as appropriate to determine if there were differences in the association between categorical variables and correct OR usage in our sample. Fisher’s exact tests were used where cell sizes were small. Next, logistic regression was used to determine the relationship between the dependent variable, correct use of ORs, and the following independent variables: journal, author appointment and location, article type, and prevalence of study outcome variable. The prevalence of the main outcome variable was categorized as <20%, 20-49%, >50%, or missing. Article type was categorized as “basic science,” “clinical science,” or “social science.” “Basic science” articles were studies conducted at the cellular level (e.g., tissue composition’s effect on enzymes, antibodies’ effects on total cholesterol). “Clinical science” articles were intended to have prevention, treatment, or diagnosis implications (e.g., genetic makeup as a risk factor for higher BMI, predicting metabolic disease risk based on abdominal volume). “Social science” articles were concerned with behavioral factors related to obesity (e.g., leisure time activities’ association with obesity, exercise protocols’ effect on weight loss). Lastly, we calculated risk differences using the ‘mfx’ command in STATA (version 11).
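For readers unfamiliar with Fisher’s exact test, the following Python sketch computes a two-sided p-value for a two-by-two table using only the standard library. It is not the STATA routine used in this analysis, and the cell counts are hypothetical and chosen only for illustration; they are not data from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # The small relative tolerance guards against floating-point ties.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical counts: rows are two article types,
# columns are ORs presented correctly / incorrectly.
p_value = fisher_exact_two_sided(9, 1, 4, 6)
print(f"two-sided Fisher's exact p-value: {p_value:.4f}")
```

Because the test enumerates every table consistent with the observed margins, it remains valid when expected cell counts are too small for the chi-square approximation.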
A total of 855 articles from the International Journal of Obesity and from Obesity were examined, and 62 articles (7.3% of 855), all of which utilized logistic regression, were included in the current analysis. Relative to the total number of articles each published, both journals contributed similar proportions of included articles to our study. Table 1 displays the characteristics of these articles. Twenty-three of these articles (37.1%) came from International Journal of Obesity, and 39 (62.9%) came from Obesity. The majority of articles were published by university-based authors (75.8%), came from non-US researchers (67.7%), had observational study designs (95.2%), and focused on human subjects (98.4%). The most frequent article type was social science (53.2%), followed by clinical science (27.4%) and basic science (19.4%). The mean sample size across all articles was 59,992 subjects, with a median of 3,310 and a range from 61 to 1.7 million. A high percentage of the articles (82.8%) reported a prevalence rate for their outcome of interest, with a median prevalence of 27%.
ORs were presented correctly in 76.8% of the articles. In univariate analysis, several characteristics of articles were associated with correct presentation of ORs by authors (see Table 2). Mean outcome prevalence was lower in articles that reported ORs correctly than in those that did not (28.9% vs. 43.9%; p = 0.058). Moreover, articles published on clinical science topics were more likely than those on social science and basic science topics to report ORs correctly (94.1% vs. 72.4% vs. 60.0%; p = 0.065).
In a logistic regression model that controlled for journal, article type, appointment of first author, prevalence of outcome, and location of research, several article characteristics were still significantly associated with correctly reporting ORs (see Table 3). Article type was associated with authors reporting ORs correctly. Specifically, when compared to the basic science articles, clinical articles (OR= 140.7; risk difference +43%, p = 0.007) and social science articles (OR=16.2; risk difference +38%, p = 0.045) had higher odds of reporting ORs correctly. Also, as the prevalence of the outcome variable increased, the odds of correctly reporting the ORs decreased (see Table 3).
ORs are vulnerable to incorrect interpretation. When the prevalence of an outcome is greater than 10%, such as in many obesity studies, correct interpretation of ORs becomes particularly difficult. Even when researchers present ORs correctly, they can be misinterpreted by the mainstream press. In this analysis, we examined recent literature from two prominent obesity journals in order to estimate the prevalence of misuse and misinterpretation of ORs.
We found that almost 1 in 4 studies that present odds ratios discuss their results incorrectly. Although the problems we outlined regarding the misuse and limitations of ORs have been discussed in many different fields ([1-11]), only one previous study is comparable to ours in that it estimated the prevalence of misuse and misinterpretation of ORs in leading journals in a particular field. Specifically, Holcomb et al. estimated an OR misuse rate of 26% (39 out of 151 articles) in Obstetrics & Gynecology and the American Journal of Obstetrics and Gynecology (). This is similar to our own findings.
In our study, we also found that certain article characteristics were associated with incorrect presentation of ORs. For example, we found that an increase in the prevalence rate of an outcome variable was associated with increased odds of reporting the ORs incorrectly. Notably, this was not due to one of our definitions for correct presentation of odds ratios (interpreted ORs as risk ratios only when the outcome was rare, ≤10%), as none of the articles we examined with prevalence rates ≤10% presented ORs as RRs. This tendency to misuse ORs as study outcome prevalence rates increase is particularly worrisome given that at high prevalence rates the OR is more likely to overstate or understate the RR. We believe that, as researchers contributing to the obesity literature, we should be cognizant of the correct use of these techniques.
Article type was also associated with correct use of ORs in our study. Authors of articles categorized as basic science had higher odds of incorrectly presenting ORs when compared to authors of clinical science and social science articles. This may reflect the type of training and exposure to OR methods received by authors in these various areas of study. In fact, our finding illuminates the often confusing nature of ORs. Specifically, we found that clinical studies had 140 times the odds of basic science articles of correctly presenting ORs. If incorrectly interpreted as a risk ratio, this would suggest a 100+ fold increase in the correct use of ORs by authors of clinical studies. Notably, the corresponding association expressed as a risk difference was 43%: a measure that is several orders of magnitude lower. Although none of the article types was associated with 100% correct use of ORs, journal editors and authors of basic science articles could consider paying closer attention to the use and presentation of ORs in these types of articles.
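A rough calculation shows why an OR of this size cannot be read as a fold increase in risk. The sketch below applies the adjusted OR of 140.7 to an assumed baseline of 60%, the unadjusted share of basic science articles that presented ORs correctly in the univariate analysis. Combining an adjusted OR with an unadjusted baseline is a simplification made only for illustration, so the result is not expected to reproduce the 43% marginal effect obtained from the regression model; the function name is hypothetical.

```python
def risk_after_applying_or(p_baseline, odds_ratio):
    """Risk in the comparison group when an OR is applied to a baseline risk."""
    odds_baseline = p_baseline / (1.0 - p_baseline)
    odds_comparison = odds_ratio * odds_baseline
    return odds_comparison / (1.0 + odds_comparison)


P_BASELINE = 0.60    # assumption: unadjusted share for basic science articles
ADJUSTED_OR = 140.7  # clinical vs. basic science, from the regression model

p_clinical = risk_after_applying_or(P_BASELINE, ADJUSTED_OR)
risk_difference = p_clinical - P_BASELINE
risk_ratio = p_clinical / P_BASELINE

print(f"implied proportion correct, clinical articles: {p_clinical:.3f}")
print(f"implied risk difference: {risk_difference:.3f}")
print(f"implied risk ratio: {risk_ratio:.2f}")
```

Because a proportion cannot exceed 1, a group that starts at 60% can at most reach a risk ratio of about 1.67, however large the OR; under these assumptions the implied risk difference is on the order of 40 percentage points.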
Despite the advantages of mathematically converting ORs to risk ratios ([3, 14, 15]) there are some issues with risk ratios that are worth pointing out. First, when comparing more than two groups, the RR for each group changes if the reference group changes. Also, risk ratios cannot take into account differences in prevalence rates across groups. For instance, if the prevalence of disease A in one group is 3 percent of the sample and in the other group it is 1.5 percent, and the prevalence of disease B in one group is 40 percent and in the other group is 20 percent, then the risk ratio for the first group relative to the second for both diseases will be 2. In order to overcome these limitations, researchers in other fields, most notably economists (), have used the risk difference or marginal effect. Risk differences are not subject to the limits of risk ratios and would yield a result of 0.015 in the first case and 0.20 in the second case, thus helping illustrate the fact that the second disease has a higher prevalence in the overall sample than the first. However, in spite of the advantages of risk differences, none of the articles reviewed in our study presented them. We believe that for studies likely to be of interest to many stakeholders, providing the RR (its limitations notwithstanding) or risk differences in addition to ORs is more useful to academic and lay readers than providing ORs alone.
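The two-disease example above can be reproduced in a few lines. The sketch below uses the prevalences given in the text; the function and variable names are hypothetical.

```python
def risk_ratio(p_a, p_b):
    """Risk in group a relative to group b."""
    return p_a / p_b


def risk_difference(p_a, p_b):
    """Absolute difference in risk between group a and group b."""
    return p_a - p_b


# Prevalence in the first and second group for each disease, as in the text.
scenarios = {
    "disease A": (0.03, 0.015),
    "disease B": (0.40, 0.20),
}

results = {}
for name, (p_first, p_second) in scenarios.items():
    results[name] = (risk_ratio(p_first, p_second), risk_difference(p_first, p_second))
    rr, rd = results[name]
    print(f"{name}: risk ratio = {rr:.2f}, risk difference = {rd:.3f}")
```

Both diseases yield the same risk ratio, while the risk differences (0.015 and 0.20) differ by more than an order of magnitude, which is the information the risk ratio alone conceals.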
There are several limitations to our study. Although we examined the most recent articles of two of the preeminent obesity journals, our study only looked at one year of research from each journal. Because of this, there is no way to determine if the misuse of ORs in these journals is getting better or worse over time. The use of only one year of each journal also resulted in another limitation: a small sample size. However, even with a small sample size, we were able to detect statistically significant associations between some article characteristics and the correct presentation of ORs. Furthermore, our findings regarding the prevalence of incorrect use of ORs are similar to those in other literature on the topic. We recognize that it was not possible to collect all possible article characteristics. For example, we did not collect information on “funding source,” which may have been interesting to some readers. Although we targeted Obesity and International Journal of Obesity because of their exclusive publication of articles related to obesity, our focus on obesity journals limits the generalizability of our findings, given that articles related to obesity are published in many general public health journals, general clinical journals, and other outlets outside of specialized obesity journals.
The obesity epidemic has had major political, social, and economic impacts in communities. As such, it is incumbent upon those in the scientific community to ensure that results of original research are presented in a clear, accurate, and interpretable manner ([35-40]). We advocate the prudent use of ORs in general; and when doing so, including clear statements which specify that associations with odds (and not risks) are being estimated, presenting risk differences, and/or converting ORs to RRs. In cases where the outcome prevalence rates are greater than 10%, researchers should be especially cautious about the potential misinterpretation of ORs among the media and other diverse groups of stakeholders to their research.
The opinions expressed are those of the authors and not necessarily any organization with which we are affiliated. Supported in part by P30DK056336 (DBA).
|P (Prevalence of referral to|
|Odds = P/(1−P)|
|Odds Ratio: (Odds_Black / Odds_White)||0.57|
|Risk Ratio: (P_Black / P_White)||0.93|
1Odds ratios for group ‘a’ compared to group ‘b’ are calculated as (Pa/(1-Pa))/(Pb/(1-Pb)), while risk ratios are calculated as (Pa/Pb), where Pa and Pb are the prevalence of the outcome among the two groups respectively. ‘Risk differences,’ discussed later in the paragraph, are calculated as (Pa-Pb). Risk ratios are considered a well-known measure to determine the etiology of an outcome (disease), while risk differences are often used as a measure of the public health impact for that disease.
2We use the term ‘effect size’ in a simply descriptive manner and not necessarily to imply a cause and effect relationship.
3It is a general rule of thumb, and mathematically validated, that ORs approximate RRs in studies where outcome prevalence rates are at or below 10% ([1-3, 32, 33]). However, we are aware that some researchers find 10% too forgiving a cutoff and have suggested a cutoff point <10%, given that in studies where ORs are below 0.2 or above 2, they begin to diverge from RRs at outcome prevalence rates below 10% ().
4A ‘risk ratio’ of X should be interpreted as the risk being ((X-1)*100) percent higher. Odds ratios in and of themselves lack any similar intuitive interpretation. We refer readers to the following citations for more information on how to correctly interpret odds ratios ([3, 31]). In this context, we also note that there is another, more nuanced potential misinterpretation. An RR (or OR) of X (when X is > 1) is often interpreted as “an X-fold increase in the risk (or odds),” when in fact it is an (X-1)-fold increase. We are aware of this interpretation problem but did not study it in our analysis.
Gabriel Tajeu, Department of Health Care Organization and Policy.
Bisakha Sen, Department of Health Care Organization and Policy.
David B. Allison, Dean’s Office, School of Public Health and Nutrition Obesity Research Center.
Nir Menachemi, Department of Health Care Organization and Policy.