The causal effect of a drug would ideally be assessed by administering the drug to a person and comparing this person’s experience with the counterfactual experience of what would have happened to the same person at the same time had the drug not been taken.30
As such an experiment is not practical, research seeks to mimic a causal experiment as closely as possible. There are three fundamental ways to vary exposure status, and all three can be imagined in an experimental setting (e.g., RCTs) or in a non-randomized setting (e.g., epidemiologic studies):
- Instead of varying exposure status within the same person at the same time, it is possible to examine the outcomes of varying drug exposure status in the same person but over time. In this way, a patient becomes his or her own control, and all non-time-varying patient characteristics are kept constant by design. This is the basis for randomized crossover trials or non-randomized case-crossover studies.
- Instead of varying exposure within patients, exposure may vary between patients. One group of patients will be exposed to a new drug and another group to a comparison drug. Under the assumption that patients in both groups are on average comparable with regard to their patient characteristics, this method will mimic a causal experiment. This is the basic consideration for the frequently used two-group randomized trial design or epidemiologic cohort studies.
- Instead of varying exposure between patients, exposure may vary between providers or larger patient groups. Some physicians prescribe one drug over another independent of patient characteristics, because of either randomization or treatment preference. This is the basis for cluster randomized trials or instrumental variable (IV) analyses of cohort studies.
Drug utilization patterns guide the choice of non-randomized study designs.
Although this framework provides a logical ordering that is derived from extending a causal experiment, it is not necessarily the order that epidemiologists would consider for a specific study question.
The structure of health-care utilization databases allows extraction of information on all three levels of drug exposure variation with little effort. These databases provide longitudinal strings of information on the use of health services, including drug dispensings. Because each service is tied to reimbursement, the recorded dates of service and dispensing are among the few highly reliable items in such databases. With the dispensing date and days' supply information, a drug exposure calendar can be established, and variation of drug exposure within a patient over time can be studied.
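As an illustration of how a drug exposure calendar might be derived from dispensing records, the following is a minimal sketch rather than a validated algorithm. The record layout (dispensing date, days' supply), the function names, and the `grace_days` parameter are assumptions made for this example. Early refills are simply merged, so stockpiled supply is not carried forward.

```python
from datetime import date, timedelta


def exposure_episodes(dispensings, grace_days=0):
    """Merge (dispensing_date, days_supply) records into exposure episodes.

    Returns a list of (start, end) date pairs with inclusive end dates.
    A gap of up to `grace_days` between episodes is bridged (assumption).
    """
    episodes = []
    for start, days_supply in sorted(dispensings):
        end = start + timedelta(days=days_supply - 1)
        if episodes and start <= episodes[-1][1] + timedelta(days=grace_days + 1):
            # Overlapping or contiguous dispensing: extend the current episode.
            prev_start, prev_end = episodes[-1]
            episodes[-1] = (prev_start, max(prev_end, end))
        else:
            episodes.append((start, end))
    return episodes


def is_exposed(episodes, day):
    """True if `day` falls inside any exposure episode."""
    return any(start <= day <= end for start, end in episodes)


# Hypothetical dispensing history for one patient.
records = [
    (date(2020, 1, 1), 30),
    (date(2020, 1, 29), 30),  # early refill, merged with the first
    (date(2020, 4, 1), 30),   # new episode after a gap
]
episodes = exposure_episodes(records)
```

With such a calendar, exposed and unexposed person-time within the same patient can be compared, which is the within-person variation that case-crossover designs rely on.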
In cohort studies, it is critical to first understand the prescribing of drugs by tabulating measured patient characteristics by drug exposure group, which will allow the investigator to identify imbalances in patient characteristics. In large randomized trials, such tables will show almost perfect balance of patient characteristics between randomly assigned treatment groups. In a cohort study, there are often substantial differences in the prevalence of measured patient factors between drug exposure groups that may lead to confounding if these factors are also independent risk factors for the study outcome. Such factors need to be adjusted for in further analyses. Instead of considering each factor individually, it is possible to combine all patient characteristics into a single propensity score (PS), which is the estimated probability of treatment, given all covariates. The distributions of the PSs for treated and untreated patients can be plotted, and the degree of non-overlap of the two distributions is a measure of the multivariate imbalance of the two treatment groups (see more discussion of PSs below). In rare circumstances, the two PS distributions may be fully overlapping, which indicates that in the observed setting there is a perception of clinical equipoise of the two drugs, and physicians will quasi-randomly choose one. Consequently, all measured patient risk factors may be balanced. Examples of such situations include celecoxib vs rofecoxib in their early marketing phase.31,32
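The tabulation of patient characteristics by exposure group can be sketched as follows. The standardized difference used here is one common balance metric and is not prescribed by the text above; the patient records, covariate names, and drug labels are hypothetical.

```python
from math import copysign, sqrt
from statistics import mean


def standardized_difference(treated, comparison):
    """Standardized difference in prevalence for a binary (0/1) covariate."""
    p1, p0 = mean(treated), mean(comparison)
    pooled_var = (p1 * (1 - p1) + p0 * (1 - p0)) / 2
    if pooled_var == 0:
        # Both groups have prevalence 0 or 1: either identical or fully separated.
        return 0.0 if p1 == p0 else copysign(float("inf"), p1 - p0)
    return (p1 - p0) / sqrt(pooled_var)


# Hypothetical cohort: exposure group plus two binary covariates.
patients = [
    {"drug": "A", "diabetes": 1, "prior_mi": 1},
    {"drug": "A", "diabetes": 1, "prior_mi": 0},
    {"drug": "A", "diabetes": 1, "prior_mi": 1},
    {"drug": "A", "diabetes": 0, "prior_mi": 0},
    {"drug": "B", "diabetes": 1, "prior_mi": 1},
    {"drug": "B", "diabetes": 0, "prior_mi": 0},
    {"drug": "B", "diabetes": 0, "prior_mi": 1},
    {"drug": "B", "diabetes": 0, "prior_mi": 0},
]

balance = {}
for covariate in ("diabetes", "prior_mi"):
    group_a = [p[covariate] for p in patients if p["drug"] == "A"]
    group_b = [p[covariate] for p in patients if p["drug"] == "B"]
    balance[covariate] = standardized_difference(group_a, group_b)

for covariate, d in balance.items():
    print(f"{covariate}: standardized difference = {d:.2f}")
```

In this toy cohort, diabetes is clearly imbalanced between the two groups while prior myocardial infarction is balanced; only the former would be a candidate confounder, and only if it is also a risk factor for the outcome.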
Regions of non-overlap of the exposure PS distributions of two treatment groups. In this example, study patients were restricted to those with largely overlapping exposure PSs by trimming patients with extreme PS values.
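Restriction to the region of overlapping PSs might be sketched as below. The PS values are taken as already estimated, and the min/max rule for defining the overlap region is only one possible trimming rule, assumed here for illustration; the field names and values are hypothetical.

```python
def common_support(ps_treated, ps_comparison):
    """Return (lower, upper) bounds of the PS range observed in both groups."""
    lower = max(min(ps_treated), min(ps_comparison))
    upper = min(max(ps_treated), max(ps_comparison))
    return lower, upper


def trim(patients, lower, upper):
    """Keep only patients whose PS lies within [lower, upper]."""
    return [p for p in patients if lower <= p["ps"] <= upper]


# Hypothetical, previously estimated propensity scores.
patients = [
    {"id": 1, "treated": True, "ps": 0.30},
    {"id": 2, "treated": True, "ps": 0.55},
    {"id": 3, "treated": True, "ps": 0.80},
    {"id": 4, "treated": True, "ps": 0.95},
    {"id": 5, "treated": False, "ps": 0.05},
    {"id": 6, "treated": False, "ps": 0.20},
    {"id": 7, "treated": False, "ps": 0.35},
    {"id": 8, "treated": False, "ps": 0.60},
]

lower, upper = common_support(
    [p["ps"] for p in patients if p["treated"]],
    [p["ps"] for p in patients if not p["treated"]],
)
restricted = trim(patients, lower, upper)
```

Patients outside the bounds have no counterpart with a similar PS in the other treatment group, so they contribute little to a valid comparison.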
Utilization databases are also well suited to understanding the properties and predictors of physicians’ prescribing decisions. Physician identifiers and limited physician characteristics can be linked to their patients, making it possible to identify provider subgroups that are more likely to prescribe one drug over another; if such a prescribing preference is largely independent of patient characteristics, it can be used as an instrument in place of actual exposure in an IV analysis.33,34
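One simple way to use a binary prescribing preference as an instrument is the Wald (ratio) estimator, sketched below. This is an illustrative choice rather than the method of the cited analyses, and it rests on the usual IV assumptions, which the data cannot fully verify. The variable names and the toy records are hypothetical.

```python
from statistics import mean


def wald_estimate(records):
    """Wald IV estimate of the risk difference.

    Each record is (z, x, y): z = physician preference (1 prefers drug A),
    x = drug actually received (1 = drug A), y = outcome (1 = event).
    """
    def group_mean(index, z_value):
        return mean(r[index] for r in records if r[0] == z_value)

    # Difference in treatment received between preference groups.
    exposure_diff = group_mean(1, 1) - group_mean(1, 0)
    if exposure_diff == 0:
        raise ValueError("preference does not predict treatment; not a usable instrument")
    # Difference in outcome risk between preference groups.
    outcome_diff = group_mean(2, 1) - group_mean(2, 0)
    return outcome_diff / exposure_diff


# Hypothetical records: (preference z, treatment x, outcome y).
records = [
    (1, 1, 1), (1, 1, 0), (1, 1, 0), (1, 0, 0),
    (0, 0, 1), (0, 0, 1), (0, 1, 0), (0, 0, 0),
]
estimate = wald_estimate(records)
```

The denominator shows how strongly preference predicts the drug actually received; when that difference is small, the estimate becomes unstable.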