Odds ratio (OR), risk ratio (RR), and prevalence ratio (PR) are measures of association often reported in research studies to quantify the relationship between an independent variable and the outcome of interest. There has been much debate about which measure is appropriate to report for a given study design. However, the effect on statistical significance of selecting a particular category of the outcome to be modeled and/or changing the reference group for categorical independent variables, although known, is rarely discussed or illustrated with published examples. In this article, we provide an example of a cross-sectional study wherein PR was chosen over (Prevalence) OR and demonstrate the analytic implications of the choice of category to be modeled and the choice of reference level for independent variables.
Odds ratio (OR) and risk ratio (RR) are two commonly used measures of association reported in research studies. In cross-sectional studies, the odds ratio is also referred to as the prevalence odds ratio (POR) when prevalent cases are included, and the prevalence ratio (PR) is calculated instead of the RR. However, it should be noted that, although the mathematical calculations are the same, there are inherent differences in ORs across study designs. Similarly, the PR equals neither the RR nor the incidence (density) rate ratio.
The literature is abundant with articles discussing the advantages and disadvantages of POR/OR versus PR/RR and debating the “appropriate” measure of association.[1–19] One advantage of ORs is their convenient mathematical properties, for example in the Cornfield chi-square statistic in unstratified analysis, in the Mantel-Haenszel odds ratio in stratified analysis, and in the logistic regression model for multivariable analyses. For dichotomous data with binomial distributions, the log(OR) is considered a convenient means of modelling the probability of an outcome, whereas models for RR can produce estimated probabilities outside the zero-to-one range. Furthermore, the log(OR) is directly related to Bayes' theorem and is the natural (time-invariant) measure in stochastic-risk modelling. However, Poisson modelling is appropriate for incidence data that follow a Poisson distribution in stable cohorts, as is proportional hazards modelling in dynamic populations, where RR is more intuitive. Greenland presents a strong theoretical argument against the use of OR and comments that “only incidence differences and ratios possess direct interpretations as measures of impact on average risk or hazard.” He further comments that ORs are useful only when they serve as incidence-ratio (i.e. RR) estimates, and that logistic and log-linear models are useful only insofar as they provide improved (smoothed) estimates of incidence differences or ratios. The choice of measure of association also affects the assessment of confounding. When confounding is defined using “collapsibility”, RR and not the OR is an intrinsic measure of interest.
Similarly, the “overestimation” of the strength of associations and the reciprocity of OR have been extensively addressed in various articles and books.[1, 5, 7, 8, 10, 13, 18, 20–22] Note, however, that the term ‘overestimation’ applies when one wishes OR to be an approximation of RR; otherwise, both are valid measures of association estimating different population parameters. The property of reciprocity of OR (changing the reference category of a dichotomized variable yields “reciprocal” estimates) is also well known.
Although much literature is available on the above-mentioned properties of OR/RR (and POR/PR), changes in statistical significance (p-value) for PR depending upon the category of the outcome modeled or the choice of reference category for categorical independent variables are not commonly discussed. In this article, we provide an example of a cross-sectional study wherein PR was chosen over POR and demonstrate the analytic implications, especially with regard to statistical significance, for each measure of association. These implications could well affect the conclusions of an investigation, and they call for careful thought about the choice of reference group when planning a study.
The aim of our cross-sectional study was to examine predictors of hypertension (HT) control in a cohort of HIV-positive patients. Overall, the analysis of the study included descriptive statistics with univariate and multivariable analyses examining association of multiple predictors with HT control. However, for the purpose of this article and simplicity, we will discuss results only pertaining to one predictor: Race-Sex combination (White-Male, White-Female, Black-Male, and Black-Female).
Analyses were conducted using SAS statistical software (version 9.3, Cary NC). POR and PR were calculated using the PROC GENMOD procedure with a binomial distribution and logit or log links, respectively. For situations where convergence problems arise, Poisson regression and (modified) Poisson regression with robust standard errors have been suggested.[2, 14, 23, 24] Although we did not encounter convergence problems for the specified independent predictor (i.e. Race-Sex), models were also run using Poisson regression with robust standard errors to examine the consistency of the results obtained using the binomial distribution.
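As a sketch of what the two link functions estimate, the models can be written as follows. The notation is an assumption introduced here for illustration and does not come from the article: p_i is the probability of the modeled outcome category for participant i, and x_{ik} are indicator variables for the non-reference Race-Sex categories.

```latex
\begin{align}
\text{logit link:}\quad & \log\frac{p_i}{1-p_i} = \beta_0 + \sum_{k}\beta_k x_{ik}, & \mathrm{POR}_k &= e^{\beta_k},\\
\text{log link:}\quad   & \log p_i = \beta_0 + \sum_{k}\beta_k x_{ik},             & \mathrm{PR}_k  &= e^{\beta_k}.
\end{align}
```

Under the log link nothing constrains the fitted p_i to stay below 1, which is one commonly cited reason log-binomial models can fail to converge and Poisson regression with robust standard errors is used as a fallback.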
Of the 699 study participants, 380 (54.4%) had achieved hypertension control (Table 1). Due to the high prevalence of the outcome, we chose PR over POR, as POR would have “overestimated” the strength of the association considerably. For example, with Black-Male as the reference category, POR for White-Female was 2.63 (Table 2A) while PR was 1.48 (Table 2B) when (hypertension) control=“Yes” was modelled (“No” being the reference group). Likewise, overestimation by POR is evident when control=“No” was modelled (POR=0.38 versus PR=0.56).
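The gap between the two measures can be checked from the proportions reported in this article (70.6% of White-Females and 47.8% of Black-Males had HT control). The following is a minimal Python sketch, not the SAS analysis used in the study. The function names are illustrative, and because it starts from rounded percentages rather than the underlying counts, its results may differ from the tabulated values in the second decimal place.

```python
def prevalence_ratio(p_cmp, p_ref):
    """Ratio of outcome proportions: comparison group over reference group."""
    return p_cmp / p_ref


def prevalence_odds_ratio(p_cmp, p_ref):
    """Ratio of outcome odds: comparison group over reference group."""
    return (p_cmp / (1 - p_cmp)) / (p_ref / (1 - p_ref))


# Proportions with HT control = "Yes", as reported in the article (rounded).
p_white_female = 0.706
p_black_male = 0.478  # reference category

# Modelling control = "Yes"
pr_yes = prevalence_ratio(p_white_female, p_black_male)
por_yes = prevalence_odds_ratio(p_white_female, p_black_male)

# Modelling control = "No" uses the complementary proportions
pr_no = prevalence_ratio(1 - p_white_female, 1 - p_black_male)
por_no = prevalence_odds_ratio(1 - p_white_female, 1 - p_black_male)

print(f"Yes: POR = {por_yes:.2f}, PR = {pr_yes:.2f}")
print(f"No:  POR = {por_no:.2f}, PR = {pr_no:.2f}")
```

In both directions the POR lies farther from 1 than the PR, which is the sense in which it “overestimates” the association when the outcome is common.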
When the point estimates for PRs with different reference categories were compared (Tables 2B and 2C), the change was approximately 10%. For example, the PR for White-Female changed from 1.48 (Table 2B) to 1.32 (Table 2C) when the reference changed from Black-Male to Black-Female, respectively. A similar change was observed for White-Males (PR=1.23 versus PR=1.10).
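The effect of switching the reference category can be reproduced from the group proportions reported in this article. This is a small Python sketch under the assumption that rounded percentages are close enough to the underlying counts; the dictionary and function names are illustrative only.

```python
# Proportions with HT control = "Yes", as reported in the article (rounded).
control_yes = {
    "White-Male": 0.589,
    "White-Female": 0.706,
    "Black-Male": 0.478,
    "Black-Female": 0.533,
}


def pr_table(proportions, reference):
    """Prevalence ratio of every group against the chosen reference group."""
    p_ref = proportions[reference]
    return {group: p / p_ref for group, p in proportions.items()}


pr_vs_black_male = pr_table(control_yes, "Black-Male")
pr_vs_black_female = pr_table(control_yes, "Black-Female")

for group in ("White-Female", "White-Male"):
    old = pr_vs_black_male[group]
    new = pr_vs_black_female[group]
    change = 100 * (old - new) / old
    print(f"{group}: PR {old:.2f} -> {new:.2f} ({change:.0f}% lower)")
```

Because every PR is divided by the same reference proportion, changing the reference rescales all PRs by one common factor (here 0.478/0.533), which is why the relative change is the same for both groups.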
When PORs were compared for outcome=“Yes” versus “No”, as expected, they were reciprocals of each other [e.g. White-Female: Yes=2.63 and No=0.38 (=1/2.63)] (Table 2A). Again as expected, this reciprocity was not observed for PRs (PR: Yes=1.48 versus No=0.56) (Table 2B).
The p-values remained exactly the same for PORs irrespective of whether the outcome=“Yes” or “No” was modeled (e.g. White-Female: p=0.02) [Table 2A]. In contrast, the p-values changed considerably for PRs depending upon the outcome modelled (e.g. White-Female: Yes=0.003 versus No=0.04) [Table 2B]. Such a change was seen especially for White-Females, but not for White-Males and Black-Females. Furthermore, when the reference category was changed from Black-Males (Table 2B) to Black-Females (Table 2C), the p-value for White-Female changed from 0.04 when control=“Yes” was modelled to 0.10 when control=“No” was modelled; a change from being statistically significant (at the 0.05 level) to not being statistically significant. Similar results were obtained with Poisson regression using the robust standard error method; the 95% confidence intervals generated (by Poisson modeling) were the same as those generated for POR (logit link) and PR (log link) with the binomial distribution.
Overestimation of the strength of association by OR as compared to RR has been explained in detail in various books.[21, 22] Note, however, that the term ‘overestimation’ applies when one wishes OR to be an approximation of RR; otherwise, both are valid measures of association estimating different population parameters, and the choice between them depends on various considerations. The same logic applies to a cross-sectional study when explaining the discrepancy between POR and PR. In brief, as shown in Table 3, the discrepancy is a function of the mathematical formula. Writing p1 and p0 for the proportions with the outcome in the comparison and reference groups, POR = PR × [(1 − p0)/(1 − p1)], and it is the term [(1 − p0)/(1 − p1)] due to which POR overestimates PR.[21, 22] When the outcome is “rare” (e.g. <10%), this term is close to 1 and POR and PR are closer to each other. The magnitude of the discrepancy between the POR and PR as a function of the incidence/prevalence of the outcome is well presented in figures in the previously published papers by Zhang et al. and Schmidt et al. On a side note, although the mathematical computation of ORs is in general the same for various study designs, different values could be obtained due to selection bias related to study design.[16, 27]
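The dependence of the discrepancy on outcome prevalence can be illustrated with the standard identity POR = PR × (1 − p0)/(1 − p1), where p0 and p1 are the outcome proportions in the reference and comparison groups. The Python sketch below holds the PR fixed at the 1.48 reported in this article and varies the reference prevalence; all prevalences other than 0.478 are illustrative values chosen here, and the function name is hypothetical.

```python
def por_from_pr(pr, p_ref):
    """POR implied by a given PR and reference-group prevalence p_ref.

    Uses POR = PR * (1 - p_ref) / (1 - p_cmp), with p_cmp = PR * p_ref.
    """
    p_cmp = pr * p_ref
    if not 0 < p_cmp < 1:
        raise ValueError("PR * p_ref must lie strictly between 0 and 1")
    return pr * (1 - p_ref) / (1 - p_cmp)


pr = 1.48  # PR for White-Female vs Black-Male reported in the article

# Reference prevalences are illustrative, except 0.478 (Black-Males, Table 1).
for p_ref in (0.01, 0.05, 0.10, 0.30, 0.478):
    por = por_from_pr(pr, p_ref)
    print(f"reference prevalence {p_ref:.3f}: PR = {pr:.2f}, POR = {por:.2f}")
```

At low prevalence the two measures nearly coincide, and the POR drifts away from the PR as the outcome becomes more common.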
The magnitude of the difference in the point estimates of the PRs for White-Females (1.48 vs. 1.32, Tables 2B and 2C, respectively) and White-Males (1.23 vs. 1.10, Tables 2B and 2C, respectively) will depend on the difference between the proportions compared. This could also be mirrored in the discrepancy of p-values, as explained later.
As shown in Table 3, for POR, the mathematical terms modelled for outcome=“Yes” versus outcome=“No” are reciprocals of each other. However, for PR, writing p1 and p0 for the proportions with outcome=“Yes” in the comparison and reference groups, when the outcome=“Yes” is of interest, the term modeled is [p1/p0], while when outcome=“No” is of interest, the term is [(1 − p1)/(1 − p0)]. That is, the terms modeled are different, and therefore the property of reciprocity is not observed for PR. The interpretation of a comparison based on PR thus depends critically on whether the positive outcome or its negative complement is modelled, as also shown by Eckerman et al. with regard to RR.
For PORs, the reason for obtaining the same p-value irrespective of whether outcome=“Yes” or “No” was modeled is related to the property of reciprocity and the term modeled being symmetric. Because of this property, some researchers prefer POR as the only measure of association that needs to be calculated, since the choice between outcome=“Yes” and outcome=“No” does not affect the results/decisions.[29–31] In contrast, the property of reciprocity does not hold for PR, and yet such conversions are used in many applications, such as cost-effectiveness analyses and meta-analyses, where authors convert results into the same direction. The terms modeled are different for “Yes” and “No” when PR is calculated (Table 3), and therefore the p-values (statistical significance) obtained need not be the same. In other words, unlike the model for POR, the model being estimated is not symmetric with respect to the coding of the dependent variable. The issue of symmetry is less important when the outcome is rare.
Moreover, the magnitude of the discrepancy between the p-values depends on the difference between the proportions compared. If the two proportions are closer to each other (around 50%), the difference between the two p-values would not be too “dramatic.” For example (Table 1), with Black-Males as the reference category (47.8% had HT control=“Yes”), the difference from the reference proportion was larger for White-Females (70.6%) than for White-Males (58.9%) and Black-Females (53.3%). Therefore, the change in p-value was more “dramatic” for White-Females, while it remained the same for White-Males and Black-Females.
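The asymmetry in p-values can be sketched with large-sample Wald tests on the log scale, computed directly from a 2×2 table. This Python example is an illustration, not the study's analysis: the group sizes below are HYPOTHETICAL, chosen only so that the proportions match the reported 70.6% and 47.8%, and the Wald formulas are an assumption about how such p-values are typically obtained, so the output should not be read as a reproduction of Tables 2A and 2B.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def wald_p_ratio(events_cmp, n_cmp, events_ref, n_ref):
    """Wald p-value for log(PR) from event counts and group sizes."""
    log_pr = math.log((events_cmp / n_cmp) / (events_ref / n_ref))
    se = math.sqrt(1 / events_cmp - 1 / n_cmp + 1 / events_ref - 1 / n_ref)
    return two_sided_p(log_pr / se)


def wald_p_odds(events_cmp, n_cmp, events_ref, n_ref):
    """Wald p-value for log(POR) from event counts and group sizes."""
    a, b = events_cmp, n_cmp - events_cmp
    c, d = events_ref, n_ref - events_ref
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return two_sided_p(log_or / se)


# HYPOTHETICAL counts, chosen only to match the reported percentages
# (24/34 = 70.6% and 153/320 = 47.8%); they are not the study's data.
n_cmp, yes_cmp = 34, 24
n_ref, yes_ref = 320, 153
no_cmp, no_ref = n_cmp - yes_cmp, n_ref - yes_ref

p_por_yes = wald_p_odds(yes_cmp, n_cmp, yes_ref, n_ref)
p_por_no = wald_p_odds(no_cmp, n_cmp, no_ref, n_ref)
p_pr_yes = wald_p_ratio(yes_cmp, n_cmp, yes_ref, n_ref)
p_pr_no = wald_p_ratio(no_cmp, n_cmp, no_ref, n_ref)

print(f"POR: p(Yes) = {p_por_yes:.4f}, p(No) = {p_por_no:.4f}")
print(f"PR:  p(Yes) = {p_pr_yes:.4f}, p(No) = {p_pr_no:.4f}")
```

The standard error of log(POR) uses all four cells of the table, so it is unchanged when “Yes” and “No” are swapped and only the sign of the estimate flips. The standard error of log(PR) depends on the counts in the modeled category only, so it, and hence the p-value, changes with the category modeled.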
In this example cross-sectional study, reporting PR was deemed more appropriate than reporting POR because of the considerable “overestimation” of the strength of the association by POR. Although the direction/trend of the association remained the same, the statistical significance of the results did change when the reference category for the outcome and/or the independent variable was switched while calculating PRs. Therefore, researchers should be cautious of the lack of reciprocity and the potentially altered p-values for PRs. The study results and the discussion apply to OR vs. RR as well. Furthermore, they could be generalized to any disease (acute or chronic) where both POR and PR could be calculated. However, the duration of the (outcome) disease (and of the exposure too) does dictate the study design and, therefore, the measure of association preferred.
It is worth noting that the decision should not be based solely on statistical significance, but also on clinical significance, which is sometimes overlooked while undue importance is given to statistical significance, especially the p-value (e.g. selecting variables for inclusion in multivariable analysis solely on the basis of the p-value in univariate analysis). We acknowledge, however, that the results of this study may not be applicable to all studies where the prevalence of the outcome is high, as the results, in particular statistical significance, would vary depending upon the sample size.
Sources of support: This research was supported by the University of Alabama at Birmingham Center for AIDS Research, an NIH-funded program (P30 AI027767) that was made possible by the following institutes: NIAID, NCI, NICHD, NHLBI, NIDA, NIMH, NIA, FIC, and OAR. Dr. Greer Burkholder is funded by NHLBI grant K23HL126570. We would like to thank Dr. Paul Allison (University of Pennsylvania) and Dr. Gerald McGwin (University of Alabama at Birmingham) for their valuable comments.