Active drug safety monitoring based on longitudinal electronic healthcare databases (a Sentinel System), as outlined in recent FDA-commissioned reports, consists of several interlocked processes, including signal generation, signal strengthening, and signal evaluation. Once a signal of a potential drug safety issue is generated, signal strengthening and signal evaluation have to follow in short sequence in order to quickly provide as much information about the triggering drug-event association as possible.
This paper proposes a basic study design based on the incident user cohort design for expedited signal evaluation in longitudinal healthcare databases. It will not resolve all methodological issues, nor will it fit all study questions arising within the framework of a Sentinel System. It should rather be seen as guidance that will fit the majority of situations and serve as a starting point for adaptations to specific studies.
Such an approach will expedite and structure the process of study development and highlight specific assumptions, which is particularly valuable in a Sentinel System where signals are by definition preliminary and evaluation of signals is time critical.
Active drug safety monitoring based on longitudinal electronic healthcare databases (a Sentinel System), as outlined in recent FDA-commissioned reports,1 consists of several interlocked processes, including signal generation, signal strengthening, and signal evaluation. Once a signal of a potential drug safety issue is generated, signal strengthening and signal refutation/confirmation must follow in short sequence, even in parallel, and provide as much information about the triggering drug-event association as possible. At this stage, speed and high accuracy of analysis are of the essence. Information on true drug safety signals should not be withheld from physicians and patients, but false positive signals may cause substantial harm if they limit access to safe medications.2 This paper will focus on the last step in a Sentinel System: the fast implementation of pharmacoepidemiologic investigations to refute (or fail to refute) a safety signal. The paper focuses on design elements that may shorten the time necessary to design and implement a specific study.
Elaborations are based on the suggestion that the majority of drug safety signals generated by a Sentinel System can be investigated with a default cohort design that may be tailored to the drug-event pair of interest. This paper will not dictate one design, but rather will suggest a robust study design as a starting point for fast adaptation and implementation of an in-depth epidemiologic evaluation. Adaptations of this design to specific study needs are encouraged and will often make transparent the trade-offs between high validity and expeditious decision-making.
This paper proposes such a basic study design by combining well-known design elements and analytic strategies. It also provides a flowchart for implementing and adapting the design and discusses advantages and limitations as compared to alternative designs and analyses.
The following assumptions are made so that the proposal remains independent of any specific implementation of a drug safety Sentinel System:
Consider a basic cohort design comparing new users of one treatment to new users of a comparison treatment for the same or similar indication. Further, consider that covariate information will be assessed in the longitudinal health care claims stream during the 6 months preceding treatment initiation. Follow-up starts the day after treatment initiation (Figure 1).
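As a minimal sketch of the temporality just described, and assuming dispensing dates are available as Python `date` values (the 180-day window is an approximation of the 6 months named above), the design's windows can be computed as:

```python
from datetime import date, timedelta

COVARIATE_WINDOW_DAYS = 180  # roughly the 6 months of baseline claims history

def study_windows(index_date: date):
    """Given the treatment initiation (index) date, return the covariate
    assessment window and the start of follow-up per the basic design:
    covariates are measured in the ~6 months before initiation, and
    follow-up begins the day after initiation."""
    covariate_start = index_date - timedelta(days=COVARIATE_WINDOW_DAYS)
    covariate_end = index_date - timedelta(days=1)   # day before initiation
    followup_start = index_date + timedelta(days=1)  # day after initiation
    return covariate_start, covariate_end, followup_start

cov_start, cov_end, fu_start = study_windows(date(2009, 7, 1))
```

The strict ordering (covariates measured entirely before initiation, follow-up entirely after) is what later sections rely on to avoid adjusting for intermediates.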
For several reasons, such an incident user cohort study is a broadly applicable design that is fairly robust against investigator error.
Consideration of the sources of exposure variation is a fundamental decision point in choosing a design. In a causal experiment, one would expose a patient to an agent and observe the agent's effect on his or her health, then rewind time, leave the patient unexposed, and keep all other factors constant to establish a counterfactual experience.7 Since this experiment is impossible, the next logical expansion of the experiment is to generate or observe exposure variation within the same patient but over time. If we observe time-varying use of a drug with a short washout period, and the adverse event of interest has a rapid onset, then we can use the case-crossover design (Figure 2).8 An advantage of the case-crossover design is that time-invariant patient characteristics are implicitly controlled. In pharmacoepidemiology, however, treatment choice might change with changes in health status over time and thus introduce within-patient confounding. This may explain why we see few applications of the case-crossover design in drug safety research.9 For most safety studies, we will utilize variation in exposure between individual patients, and we will therefore apply a cohort study design.
There are several advantages to identifying patients who start a new drug and begin follow-up after initiation—similar to a parallel group randomized controlled trial which establishes an inception cohort.10 As medications have been started in patients of both the study group and the comparison group, they have been equally evaluated by physicians who concluded that they might benefit from the newly prescribed drug. This makes the treatment groups similar in characteristics that might not be observable in the study database.11 The clear temporal sequence of confounder adjustment before treatment initiation in an incident user design avoids mistakenly adjusting for consequences of treatment (intermediates) rather than predictors for treatment, a possible reason for over-adjustment.12 Identifying two active treatment groups further reduces the chances of immortal time bias, a mistake that most frequently emerges when defining a ‘non-user’ comparison group in healthcare databases.13 Because of the well-defined starting point of inception cohorts, it is possible to assess whether and in what form hazards vary over time by stratifying on duration of treatment (Figure 3). This is particularly useful when studying newly marketed medications: the incident user design avoids comparing populations predominantly composed of first-time users of a newly marketed drug with a population predominantly composed of prevalent users of the old drug (Figure 4). Such a comparison may be biased because patients who stay on treatment for a longer time may be less susceptible to the event of interest.14
A common criticism of the incident user design is that excluding prevalent users will reduce the study size, in some cases substantially. While this is true, researchers should be aware that if they decide against an incident user design, they may gain precision at the cost of validity. Screening and identifying incident users in secondary databases, however, requires only a bit more computing time.
In some incident user designs, particularly studies of second-line treatments in chronic conditions, we can only study patients who switch from one drug to another, as very few patients will be treatment naive. Such switching is often not random, but rather is determined by progressing disease and treatment failure or by side effects that may be related to the study outcome; thus, users are not really incident users. However, a fair treatment comparison can be achieved by comparing new switchers to the study drug with new switchers to a comparison drug (Figure 5b). In the study of disease-modifying anti-rheumatic drug (DMARD) safety in patients with rheumatoid arthritis (RA), a common first-line DMARD is methotrexate (MTX). Among all MTX users, it is, therefore, appropriate to compare switchers to one biologic agent with switchers to another biologic agent.15 In both cases, physicians decided that treatment should be changed, which makes the comparison groups similar and preserves the main advantages of the incident user design. If the comparison of interest is MTX versus biologics, then another common first-line medication like chloroquine could be used as the baseline medication, from which patients switch either to MTX or to a biologic. Analogously, stepping-up therapy can be studied by comparing the addition of two different agents to a common baseline medication (Figure 5c).
A common question is, why not conduct a case-control study? Unless additional data are to be collected, at meaningful additional expense, there is no advantage in nesting a case-control study in a cohort if all data are already collected and stored electronically.16 There are no efficiencies to be gained; information on absolute rates and rate differences must be computed indirectly; and case-control studies are prone to errors in confounder adjustment (see Appendix 1 for a detailed discussion).
The exposure risk window is the time period during which the medication puts a patient at risk for a measurable outcome. The period often starts shortly after taking the first tablet and ends soon after taking the last, though notable exceptions include disruptions of the body's physiology by medications that put patients at risk long beyond bioavailability (e.g., immunosuppressant agents, methylating agents) and the study of incident cancer outcomes which will not be causally linked to a newly-used medication until after substantial lag time (Figure 3). Such exceptions aside, the exact form of the exposure risk window generally depends on the pharmacokinetics and pharmacodynamics of the drug as well as the outcome under study.15 Because of the clear temporality in cohort studies, it is fairly easy to vary the exposure risk window and to assess empirically the most likely underlying risk window.17 An as-treated (AT) analysis censors patients as soon as their exposure risk window ends.
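As an illustration of how the risk window drives an as-treated analysis, the sketch below imputes the window's end from the last dispensing and its days supply; the 30-day extension is an arbitrary placeholder that would in practice be chosen from the drug's pharmacokinetics and the outcome's induction period:

```python
from datetime import date, timedelta

def exposure_risk_window_end(last_dispensing: date, days_supply: int,
                             extension_days: int = 30) -> date:
    """End of the exposure risk window: the imputed end of drug supply
    (last dispensing date plus days supply) plus an extension reflecting
    pharmacokinetics/pharmacodynamics of the drug-outcome pair.
    An as-treated (AT) analysis censors follow-up at this date."""
    return last_dispensing + timedelta(days=days_supply + extension_days)
```

Because the window is a simple function of the dispensing stream, varying `extension_days` over a grid is an easy way to assess empirically the most plausible underlying risk window.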
This paper is written under the assumption that a Sentinel System has raised a signal based on a specific drug-outcome pair definition. At the point of signal evaluation, investigators might consider broadening or narrowing the outcome definition in a way compatible with the triggering association and current medical knowledge to gain a better understanding of the underlying causality. Such changes in outcomes are easily established in cohort studies.
Depending on the availability and results of prior validation studies, it might become necessary to validate all or a sample of outcomes.18 Case-control studies would be equally affected by the resulting delay of such validation.
Subgroup analyses enable better characterization of a hypothesized drug-event association. In a cohort study, it is simple to predefine multiple patient subgroups based on their baseline characteristics. The resulting analysis is straightforward, although there is no clear guidance on the issue of multiple testing.19,20 Patient subgroups that should be considered include duration of drug use and dose categories of the study drug.
Confounding is a formidable threat to validity in non-randomized studies of treatment effects. A litany of options for reducing confounding is available to epidemiologists.21,22 Propensity score (PS) matching, however, has emerged as an expeditious and effective tool for adjusting large numbers of confounders, even if outcomes are infrequent.
A PS is the estimated probability of starting medication A versus starting medication B, conditional on pretreatment patient characteristics. Such prediction of treatment choice, based on preexisting patient characteristics, fits the structure of the proposed incident user cohort design. PS are known to balance large numbers of covariates in an efficient way even if the study outcome is rare, which fits the anticipated situation of most drug safety signals raised by a Sentinel System. Estimating the PS using logistic regression is mechanistically uncomplicated. Strategies for variable selection are well described,23 and potential confounders can be identified empirically in the study data.24 Macros for 1:1 greedy matching of patients who share the same estimated score but who received different treatments (Figure 6a) are available and perform well.25 Such matching will exclude patients in the extreme PS ranges where there is little clinical ambivalence in treatment choice (Figure 6b). These tails of the PS distribution often harbor extreme patient scenarios that are not representative of the majority of patients seen in clinical practice.9,26,27
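The matching step itself is small enough to sketch in full. The following is a minimal 1:1 greedy nearest-neighbor matcher on an already estimated PS (the caliper of 0.05 and the toy scores are assumptions for illustration, not part of the cited macros):

```python
def greedy_match(treated_ps, comparator_ps, caliper=0.05):
    """1:1 greedy matching on the estimated propensity score.
    Each treated patient (processed in PS order) is matched, without
    replacement, to the closest unmatched comparator within the caliper.
    Patients in the PS tails with no counterpart remain unmatched and
    are excluded, mirroring the trimming of extreme scenarios."""
    available = dict(comparator_ps)  # comparator id -> PS, still unmatched
    pairs = []
    for t_id, t_ps in sorted(treated_ps.items(), key=lambda item: item[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs

pairs = greedy_match({"t1": 0.30, "t2": 0.90},
                     {"c1": 0.32, "c2": 0.31, "c3": 0.10})
```

In this toy example, the treated patient with PS 0.90 finds no comparator within the caliper and is dropped, which is precisely the diagnostic behavior discussed below.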
One unjustly negative opinion of PS matching holds that if the treatment decision process can be modeled well with observed patient characteristics, a resulting PS will lead to substantial or even full separation of treated and untreated patients.28 This means that for patients initiated on a study drug, very few patients initiated on a comparison drug could be identified who had the same PS. This would leave few patients for analysis. In other words, treatment choice would be almost deterministic; little randomness would be left in the prescribing decision that could be exploited for inference about the drug effect.
Consider an example of such a situation, a comparison of combination ezetimibe and simvastatin (Vytorin) versus simvastatin alone. Assume that the health plan that provides the study data covers Vytorin only if LDL and HDL levels have crossed certain thresholds: every patient below those thresholds will use simvastatin alone. The LDL and HDL levels therefore become strong determinants of treatment choice, and including them in the PS estimation will lead to substantial if not complete separation of the PS distributions of the two treatment groups.
PS matching, therefore, serves as an important diagnostic. If situations occur where no matches can be found, it means that the specific comparison cannot be made validly in the study population. This is not a limitation of the method, but rather a very insightful description of a limitation inherent in the study population. The corresponding effect estimates from conventional multivariate outcome models will have substantial imprecision, reflecting the fact that few patients contribute to the estimation in such situations, despite a large study size. Investigators may want to reconsider the comparison agent and choose a more comparable drug or use another study population where there is less treatment separation in clinical practice.
In summary, PS matching embedded in an incident user cohort design is an effective covariate balancing tool that is robust against investigator error.
In addition to the validity gained by multivariate PS matching, matched incident user cohort studies can be analyzed as easily as randomized trials. Since covariate adjustment is already achieved by matching, simple 2 × 2 tables can be constructed, and risk differences and risk ratios can be computed expeditiously with their 95% confidence intervals or Kaplan–Meier plots and log rank tests. The easy computation of multivariate adjusted additive effect measures (risk differences, numbers needed to treat) leads to valuable metrics for examining the balance of benefits and risks. Such metrics consider the baseline risk of each outcome, which may vary considerably between intended and unintended effects.
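A minimal sketch of that expeditious analysis, using standard Wald-type confidence intervals on a 2 × 2 table from the matched cohort (the counts in the usage line are invented for illustration):

```python
import math

def risk_estimates(a, n1, c, n0):
    """Risk difference and risk ratio with Wald-type 95% CIs from a
    2x2 table of a PS-matched cohort: a events among n1 exposed
    patients, c events among n0 comparator patients."""
    p1, p0 = a / n1, c / n0
    rd = p1 - p0
    se_rd = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    rr = p1 / p0
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    return {
        "rd": rd,
        "rd_ci": (rd - 1.96 * se_rd, rd + 1.96 * se_rd),
        "rr": rr,
        "rr_ci": (math.exp(math.log(rr) - 1.96 * se_log_rr),
                  math.exp(math.log(rr) + 1.96 * se_log_rr)),
    }

res = risk_estimates(a=30, n1=1000, c=15, n0=1000)
```

The additive measure (`rd`, and its reciprocal, the number needed to treat or harm) is what makes the benefit-risk balance directly interpretable against each outcome's baseline risk.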
Standard analyses will include a cross-tabulation of all baseline characteristics by drug exposure, which will make transparent the extent to which covariate balancing by PS matching was achieved. If imbalances persist, further population restrictions should be considered.11 Displaying goodness-of-fit statistics for the PS model and duration of follow-up for each treatment group completes a first set of analyses.
The proposed analyses flow from the study design and are sufficient to provide robust effect estimates quickly. The robustness of results should be measured by applying customary sensitivity analyses, as illustrated in Figure 7.
When using retrospective databases, one cannot contact patients and ask when they began using a drug for the first time. Therefore, incident users are identified empirically by a drug dispensing that was not preceded by a dispensing of the same drug for a defined time period, or washout period. This washout period is identical for all patients. A typical length is 6 months. In sensitivity analyses, this interval can be extended to 9 and 12 months. Increasing the length of the washout increases the certainty that patients are truly incident users, but it also reduces the number of patients eligible for the study, and thus reduces precision.
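The empirical screen is simple enough to state as code. A sketch, assuming each patient's prior dispensings of the same drug are available as dates (the example patient, with a single fill about 8 months before the index date, is invented):

```python
from datetime import date, timedelta

def is_incident_user(prior_dispensings, index_date, washout_days=180):
    """A dispensing qualifies a patient as an incident user if no
    dispensing of the same drug occurred during the washout period
    (default ~6 months) immediately preceding the index date."""
    washout_start = index_date - timedelta(days=washout_days)
    return not any(washout_start <= d < index_date for d in prior_dispensings)

history = [date(2008, 10, 1)]          # one earlier fill of the same drug
index = date(2009, 6, 8)
six_months = is_incident_user(history, index, washout_days=180)   # qualifies
twelve_months = is_incident_user(history, index, washout_days=365)  # does not
```

The same patient is "incident" under a 6-month washout but not under a 12-month one, which is exactly the certainty/precision trade-off the sensitivity analysis probes.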
As discussed, there is often uncertainty about the right definition of the exposure risk window. This is further complicated in claims data, since the discontinuation date is imputed using the days supply field of the last dispensing. Varying the exposure risk window is therefore insightful as well as easy to accomplish in cohort studies.
Another set of sensitivity analyses concerns the potential for informative censoring. Patients change and discontinue treatment because they lack a treatment effect or experience early signs of a side effect (Figure 8). The stronger such non-adherence is associated with the outcome, the more an as-treated (AT) analysis, which censors at the point of discontinuation, will be biased. A cumulative risk (CR) analysis follows all patients for a fixed time period, carrying forward the initial exposure status and disregarding any changes in treatment status over time (Figure 7). Because this analysis disregards informative non-adherence, it will not suffer bias as a consequence of censoring, but it will suffer bias as a consequence of exposure misclassification. Such misclassification increases with a longer follow-up period and a shorter average time to discontinuation. In most cases, though not all, such misclassification will bias effects towards the null, similar to intention-to-treat analyses in randomized trials.29 Viewed separately, the AT and CR analyses trade biases, but together they give a range of plausible effect estimates.
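The contrast between the two analyses can be made concrete. A sketch (the 365-day CR horizon and the example dates are assumptions for illustration) showing whether a given event is counted under each analysis:

```python
from datetime import date, timedelta

def event_counted(index_date, event_date, risk_window_end, cr_days=365):
    """Whether an event is counted under the as-treated (AT) analysis,
    which censors when the exposure risk window ends, versus the
    cumulative risk (CR) analysis, which follows all patients for a
    fixed period carrying the initial exposure forward."""
    at = event_date is not None and event_date <= risk_window_end
    cr = event_date is not None and event_date <= index_date + timedelta(days=cr_days)
    return {"at": at, "cr": cr}

# A patient who discontinues in March and has the event in May:
result = event_counted(index_date=date(2009, 1, 1),
                       event_date=date(2009, 5, 1),
                       risk_window_end=date(2009, 3, 1))
```

Here the AT analysis misses the event (it falls after censoring) while the CR analysis counts it against the initial exposure; neither answer is unbiased on its own, which is why reporting both brackets the plausible effect.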
Adjusting for non-adherence in an analysis of a drug effect requires information about the predictors of treatment discontinuation,30 which is often not available with sufficient accuracy in secondary data.
Independent of the design, the sensitivity of findings toward residual confounding may be assessed by applying a set of predefined analyses, including the rule-out approach and array approach described elsewhere.22 Excel spreadsheets expedite this task and produce graphical illustrations of the effect estimate's sensitivity to possible residual confounding (Figure 9). A flowchart summarizing this basic design for expedited safety signal refutation is included in Appendix 2.
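One common formulation of such external adjustment for a single unmeasured binary confounder (in the spirit of the array approach; the specific parameter values below are hypothetical) divides the observed relative risk by the implied confounding bias:

```python
def externally_adjusted_rr(rr_observed, p_c1, p_c0, rr_cd):
    """Array-approach-style external adjustment for one unmeasured
    binary confounder: p_c1 and p_c0 are its assumed prevalences among
    exposed and comparator patients, rr_cd its assumed relative risk
    for the outcome. The implied bias factor is
    (p_c1*(rr_cd-1)+1) / (p_c0*(rr_cd-1)+1)."""
    bias = (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)
    return rr_observed / bias

# Observed RR of 2.0; a confounder twice as prevalent among the exposed
# (50% vs 25%) that triples outcome risk would explain part of it:
adjusted = externally_adjusted_rr(2.0, p_c1=0.5, p_c0=0.25, rr_cd=3.0)
```

Evaluating this function over an array of plausible (`p_c1`, `p_c0`, `rr_cd`) combinations reproduces the spreadsheet-style displays of how sensitive the estimate is to residual confounding.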
This paper proposes a basic study design based on the incident user cohort design for expedited signal evaluation in longitudinal healthcare databases. This proposal will not resolve all methodological issues, nor will it fit all study questions arising within the framework of a Sentinel System. It should rather be seen as a guideline that will fit the majority of study questions and serve as a starting point for adaptations to specific pharmacoepidemiologic study questions. This paper focuses on the evaluation of prescription drugs, but given the availability of adequate data sources, this design proposal is equally applicable to other medical products.
One way to implement rapid safety signal refutation analyses is to start the study design process with the proposed approach in mind. As adaptations become necessary because of data limitations or specific concerns related to confounding control or other anticipated biases, these changes can be made explicitly, and their implications for validity can be discussed. Such an approach will expedite and structure the process of study development and highlight specific assumptions. This is particularly valuable in a Sentinel System, where signals are by definition preliminary and evaluation of signals is time-critical so that consumers can be informed about existing safety issues or, equally important, their likely absence. This proposal is not dependent on any specific implementation of a Sentinel System built on healthcare databases, but for the reasons given above, monitoring will likely focus on incident drug users.
If possible, the expedited primary analysis should be supplemented by other approaches that rest on different assumptions for valid inference, including instrumental variable analyses31 and case-crossover designs.8 Comparing evidence from a variety of data sources and analysis types may substantially strengthen the evidence base for regulatory decision makers.32
Readers may want to think of the last few pharmacoepidemiologic studies they have performed and mentally begin the design process from scratch using the proposed approach. Likely you will realize that most studies can be designed following this approach and adapted a bit here or there to best fit your study questions and accommodate your external constraints. Thinking explicitly about your adaptations will draw your attention to potential trade-offs between validity and precision.
Funded by grants from the National Library of Medicine (RO1-LM010213; RC1-LM010351) and the National Center for Research Resources (RC1-RR028231). Dr Schneeweiss is principal investigator of the Brigham and Women's Hospital DEcIDE Center on Comparative Effectiveness Research funded by AHRQ and of the Harvard–Brigham Drug Safety and Risk Management Research Center contracted by FDA. Dr Schneeweiss is an investigator of the Mini-Sentinel project funded by FDA (PI: Dr Richard Platt); however, the opinions expressed here and any errors are his own. Opinions expressed here are only those of the author and not necessarily those of the agencies. Dr Schneeweiss is a paid member of scientific advisory boards for HealthCore and ii4sm and has received consulting fees from WHISCON, RTI Health Solutions, The Lewin Group, and HealthCore.
Case-control sampling nested within a cohort study will in expectation produce the same rate ratio estimates as a full cohort analysis.33,34 So why not perform case-control studies in large healthcare databases? It is well understood that case-control studies are not able to estimate absolute incidence rates and rate differences unless the sampling fractions of cases and controls are known, which means the underlying cohort needs to be enumerated.35 Rate differences are an important metric for benefit/risk assessment and population impact. Once the underlying cohort needs to be identified, why not implement a cohort study in the first place?
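A stylized illustration of the dependence on the sampling fraction (the counts are invented): with density sampling, controls represent a known fraction of the cohort's person-time, and only then can an absolute rate be recovered.

```python
def incidence_rate_from_density_sampling(n_cases, n_controls, sampling_fraction):
    """With density (risk-set) sampling, sampled controls represent a
    known fraction of the cohort's person-time, so the absolute
    incidence rate is cases / (controls / sampling_fraction).
    Without the sampling fraction -- i.e., without enumerating the
    underlying cohort -- only ratio measures are estimable."""
    person_time = n_controls / sampling_fraction
    return n_cases / person_time

# 50 cases, 1000 sampled controls at a sampling fraction of 1 per 1000
# person-time units implies a rate of 5 per 100,000 units:
rate = incidence_rate_from_density_sampling(50, 1000, 0.001)
```

The point of the example is that `sampling_fraction` is only knowable once the full cohort has been enumerated, at which stage the cohort analysis itself is already available.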
Another practical limitation of case-control studies is that, unless multiple case-control studies are implemented, it is not possible to study multiple outcomes, something that is often of interest in drug safety research to establish risk profiles. In contrast, cohort studies can study multiple outcomes as well as multiple exposures.36
The main limitation of cohort studies, the large size required to study rare events, is less of an issue in large databases. If it were an issue in a specific study, case-control studies embedded in the same database would suffer from the same limitation.
Unless additional information like biomarkers, detailed diagnostic information, or patient survey data are collected to enrich the longitudinal healthcare database at additional cost there is no reason for embedding a case-control study in an already existent cohort for which all data are already collected. All biases that may occur in the underlying cohort study will also affect the case-control study nested in the cohort.
In addition to the lack of an advantage of case-control studies in a database setting, they often raise issues of the accurate chronological sequence of confounder assessment in longitudinal claims or electronic medical record (EMR) data. Figure A1 depicts three case or control patients. For this example, their case status is immaterial. Drug exposure (blue box) is longitudinal, not a one-point exposure, and may be episodic. Two typical choices of covariate assessment periods are indicated in black shade. A covariate assessment period preceding the first exposure, as shown for Patient 1, is equivalent to the incident user cohort design proposed here. However, the window immediately preceding the case/control index date, a common choice illustrated in Patients 2 and 3, sometimes falls during drug exposure (Patient 2) or after treatment (Patient 3), and thus covariates are subject to the drug effect and possibly on the causal pathway. Think of a case-control study on the effects of high-dose rofecoxib on myocardial infarction. Covariate adjustment should include hypertension, an independent risk factor for myocardial infarction (MI). However, if the assessment is based on the time period just before the case/control index date, hypertension could be a consequence of rofecoxib use, and thus adjusting for it would bias results toward the null. This can be avoided by placing the covariate assessment period before the initial drug use (Patient 1), in which case the case-control study is no different from an incident user cohort study except for the disadvantages inherent to case-control analyses already discussed.
In addition to the concern about the chronology of confounder identification, there is little or no operational gain in choosing a case-control approach. As mentioned, the entire cohort that gives rise to the cases will have to be enumerated in order to estimate incidence rates. Once a cohort of incident users that gives rise to the cases is established, it is operationally easy to identify exposure and covariates for all patients at that point. In a case-control setting, computer code has to be written to identify covariates for all cases and the selected controls; there is no difference in programming when extending this algorithm to all patients in the underlying cohort, and the additional computing time is usually negligible compared to the programming time. Time-varying exposures can be assessed in both designs, though doing so complicates the analysis and requires assuming that treatment change is independent of risk factors for the outcome. If this assumption does not hold, then other methods like g-computation37 or marginal structural models30 are necessary, as discussed in the paper. Again, there is no advantage of case-control studies.
†This manuscript was presented at a meeting convened by the Engelberg Center at The Brookings Institution, in collaboration with the Centers for Education and Research on Therapeutics (CERTs), on ‘Methods, Tools, and Scientific Operations for the Sentinel System’, chaired by Dr Rich Platt and Dr Mark McClellan, Washington, DC, 7 May 2009.