AMIA Jt Summits Transl Sci Proc. 2012; 2012: 19–24.
Published online 2012 March 19.
PMCID: PMC3392060

A Simulation Platform to Examine Heterogeneity Influence on Treatment


Although a protocol aims to guide treatment management and optimize overall outcomes, the benefits and harms for each individual vary due to heterogeneity. Some protocols integrate clinical and genetic variation to provide treatment recommendations, but it is not clear whether such integration is sufficient. If it is not, treatment outcomes may be sub-optimal for certain patient sub-populations. Unfortunately, running a clinical trial to examine such outcome responses is cost-prohibitive and time-consuming. We propose a simulation approach that discovers this knowledge rapidly from electronic medical records. We use the well-known drug warfarin as an example to examine whether patient characteristics, including race and the genes CYP2C9 and VKORC1, have been fully integrated into dosing protocols. Both genes have been shown to be important in patient response to warfarin.


A warfarin protocol aims to minimize the risks of anticoagulant treatment (i.e., stroke if under-dosed and hemorrhage if over-dosed) and to reduce overall costs by quickly identifying a therapeutic dose that will maintain stable and safe warfarin treatment.

In general, the protocol guides warfarin treatment in two steps: an initial dose for the first few days and maintenance therapy for the rest of the treatment period. The first step aims to achieve stable treatment as quickly as possible; fixed doses and dosing algorithms based on clinical and genetic factors are often used at this step. The second step aims to continue the stable warfarin treatment by adjusting doses based on the drug response, measured as the International Normalized Ratio (INR).

Many patient characteristics (clinical and genetic factors such as CYP2C9) are associated with warfarin outcomes. A protocol fully integrated with important clinical and genetic factors to guide treatment makes warfarin management easier. These characteristics are known predictors of outcomes and dose requirement, so an efficient protocol that can guide treatment should take these factors into consideration. On the other hand, protocols which do not integrate these characteristics can recommend incorrect doses or dose adjustment, and may, consequently, result in poor outcomes for patient sub-populations.

Comparing treatment outcomes against patient characteristics in clinical trials provides a gold-standard solution but is cost prohibitive and requires many years to conduct the study. Instead of clinical trials, we used a simulation method to discover such knowledge using electronic medical records (EMR) and identify patient sub-populations that tend to have poor outcomes for a treatment protocol.


Warfarin, the vitamin K antagonist, is the most widely used anticoagulant in the world to prevent stroke. The therapeutic management of warfarin is highly individualized because (1) several clinical and genetic factors strongly influence drug metabolism, and (2) warfarin’s narrow therapeutic index means a patient can easily be over-dosed (increasing the risk of hemorrhage) or under-dosed (increasing the risk of stroke).

The common approach to achieving and maintaining a therapeutic dose is to adjust it based on the INR, an adjusted measure of coagulation time. A normal person’s INR is about one, and it increases with the initiation of warfarin treatment. In general, a high INR (>3) is associated with an increased risk of hemorrhage, and a low INR (<2) is associated with an increased risk of stroke. Thus, patients face the risk of stroke before warfarin treatment and both risks (hemorrhage and stroke) after starting the treatment. The optimal balance, the INR target range, is between 2 and 3, and reaching it is the goal of dose adjustment.

The risk of hemorrhage and stroke is small but definite [5]. Time in therapeutic range (TTR) is a commonly used surrogate risk measurement, which is the proportion of time in the target INR range (between 2 and 3).

Due to recent efforts in genetic studies, many genes have been found to be related to warfarin metabolism and drug responses. Among them, CYP2C9 and VKORC1 are most frequently included in modern protocols, e.g., [2, 1, 7].

One of the best-known clinical trials of a pharmacogenetic protocol is the Coumagen trial [1]. This trial compares two protocols, pharmacogenetic (predicting the therapeutic dose from clinical and genetic factors) and standard (predicting the therapeutic dose from INR measurements), with patient follow-up for up to 3 months. Although the pharmacogenetic protocol predicts the stable therapeutic dose more accurately (p<0.001), the primary end point, the percentage of out-of-range INRs, did not differ significantly from that of the standard protocol.


This section discusses each element of the clinical trial simulation platform. The platform was used to generate outcome data for the pharmacogenetic and standard warfarin dosing protocols used by the Coumagen trial [1]. Sensitivity analyses were conducted to identify patient characteristics not appropriately integrated in either protocol.

Data source

Data were collected by the International Warfarin Pharmacogenetics Consortium [7], which comprises 21 research teams from 9 countries. The data consist of 5700 patients on warfarin treatment with demographic characteristics, primary indication for warfarin treatment, stable therapeutic dose and the corresponding INR, target INR, concomitant medications, and genotype variants of CYP2C9 and VKORC1. Table 1 shows the subset of variables selected for this simulation project. Before this project, we conducted a thorough literature review and chose the variable set that is compatible with the largest number of dosing protocols and algorithms.

Table 1.
Simulated Data Inputs

Initial dose for warfarin treatment

The initial doses of warfarin treatment administered for the first two days are either fixed (at 10 mg/day) or pharmacogenetic [1] (based on the following function).


Maintenance therapy for warfarin treatment

Both Coumagen protocols, standard and pharmacogenetic, adjust doses based on INR measurements to maintain safe treatment. In general, if the INR is too low (<2), the protocols suggest increasing the dose by an amount that depends on how far the INR is from the target INR, and vice versa for a high INR (>3). In this simulation project, each patient receives warfarin treatment for 10 days.
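The INR-driven adjustment described above can be sketched as follows. This is only an illustration of the general idea (a larger dose change when the INR is further from the target range); the 10% and 20% steps and the 0.5 INR cut-off are hypothetical and are not the values of the Coumagen dosing tables.

```python
def adjust_dose(current_dose_mg, inr, target_low=2.0, target_high=3.0):
    """Return the next daily dose given the latest INR (illustrative rule only)."""
    if inr < target_low:
        # Below range: raise the dose, more so the further the INR is from target.
        distance = target_low - inr
        step = 0.10 if distance <= 0.5 else 0.20  # hypothetical percentages
        return round(current_dose_mg * (1 + step), 2)
    if inr > target_high:
        # Above range: lower the dose by the same hypothetical steps.
        distance = inr - target_high
        step = 0.10 if distance <= 0.5 else 0.20
        return round(current_dose_mg * (1 - step), 2)
    # Within the target range: keep the dose unchanged.
    return current_dose_mg
```

A real protocol would replace the two hard-coded steps with its own adjustment table.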

PK/PD modeling for INR response

We adapted the PK/PD model from Hamberg et al. [4] to predict the INR response for an individual with a given warfarin dose. Briefly, the PK/PD model was estimated from a set of 150 patients. Warfarin is a racemic mixture of two enantiomers, S-warfarin and R-warfarin, of which S-warfarin is 3–5 times more potent and was shown to be the dominant factor. The authors indicate that the influence of R-warfarin on the INR response was not statistically significant. Therefore, the model only considered the PK/PD effects of S-warfarin. We modeled the PK effects using a two-compartment model with first-order input and first-order elimination, and the PD effects using a two-chain transit compartment model. Because the complete covariance matrix was not provided with the original model, we had to make assumptions about the covariance of the variables. Many parameters and functions in this manuscript are provided in log form, and in our empirical tests the log-normal distribution met our expectations for parameter variation. Thus, we sampled from a log-normal distribution to estimate the variability of the clearance rate, the volume of the central compartment, and the volume of the peripheral compartment, and restricted the sampled values to physiological ranges.
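One way to draw log-normally distributed parameters restricted to a plausible range is rejection sampling, sketched below. The typical values, variability terms, and bounds are placeholders chosen for illustration and are not the estimates of Hamberg et al. [4]; the three parameters are also drawn independently here, which is a simplifying assumption of this sketch.

```python
import math
import random

def sample_lognormal_truncated(typical_value, omega, lower, upper, rng, max_tries=1000):
    """Draw typical_value * exp(eta), eta ~ N(0, omega^2), keeping only draws in [lower, upper]."""
    for _ in range(max_tries):
        value = typical_value * math.exp(rng.gauss(0.0, omega))
        if lower <= value <= upper:
            return value
    raise RuntimeError("no draw fell inside the allowed range")

# Hypothetical parameter specifications (NOT the published estimates).
PARAM_SPECS = {
    "clearance_l_per_h": dict(typical_value=0.3, omega=0.3, lower=0.05, upper=1.0),
    "central_volume_l": dict(typical_value=14.0, omega=0.25, lower=5.0, upper=30.0),
    "peripheral_volume_l": dict(typical_value=6.0, omega=0.25, lower=2.0, upper=15.0),
}

def sample_patient_parameters(rng):
    """Return one simulated patient's PK parameters as a dict."""
    return {name: sample_lognormal_truncated(rng=rng, **spec)
            for name, spec in PARAM_SPECS.items()}

patient = sample_patient_parameters(random.Random(42))
```

Passing an explicit `random.Random` instance keeps each simulated iteration reproducible.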

To model the accumulation of warfarin concentration over time (assuming daily doses), we used the principle of superposition. Superposition does not require assumptions regarding a PK model or absorption kinetics, but instead assumes that each dose of the drug acts independently, that the rate and extent of absorption and the average systemic clearance are the same for each dosing interval, and that linear PK applies [3]. We created a table of warfarin concentrations over time and summed across the rows at 24-hour time intervals to predict the amount of warfarin remaining in the system. The results agree with previously published methods [3, 4].
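The table-and-sum procedure can be sketched as below. The single-dose curve is a placeholder one-compartment oral model with made-up constants (`ka`, `ke`, `vd`), used only so that the example runs; the platform itself uses the two-compartment model described above, and any single-dose curve from a linear PK model could be substituted.

```python
import math

def single_dose_conc(t_h, dose_mg, ka=1.0, ke=0.03, vd=10.0):
    """Concentration (mg/L) t_h hours after one oral dose (placeholder curve)."""
    if t_h < 0:
        return 0.0  # this dose has not been given yet
    return dose_mg * ka / (vd * (ka - ke)) * (math.exp(-ke * t_h) - math.exp(-ka * t_h))

def concentration_table(daily_doses_mg, interval_h=24.0):
    """Rows are 24-hour time points, columns are doses.

    Each entry is the contribution of one dose to the concentration at one time.
    """
    times = [interval_h * (k + 1) for k in range(len(daily_doses_mg))]
    return [[single_dose_conc(t - j * interval_h, dose)
             for j, dose in enumerate(daily_doses_mg)]
            for t in times]

def concentrations_at_intervals(daily_doses_mg, interval_h=24.0):
    """Superposition: sum each row of the table to get the total concentration."""
    return [sum(row) for row in concentration_table(daily_doses_mg, interval_h)]

totals = concentrations_at_intervals([10.0, 10.0, 5.0])
```

Because superposition assumes linear PK, doubling every dose doubles every total, which is a convenient sanity check.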

Outcome metric

The time points where INR is measured depend on the protocols. INRs of a patient are further converted to time in therapeutic range (TTR), which is the proportion of time a patient’s INRs fall within the target range between 2 and 3.
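A minimal sketch of the TTR conversion, assuming the INR values are equally spaced in time (for example, one value per day) so that the proportion of in-range values approximates the proportion of time in range. Treating the bounds 2 and 3 as inclusive is also an assumption of this sketch.

```python
def time_in_therapeutic_range(inrs, low=2.0, high=3.0):
    """Fraction of equally spaced INR values that fall within [low, high]."""
    if not inrs:
        raise ValueError("at least one INR value is required")
    in_range = sum(1 for value in inrs if low <= value <= high)
    return in_range / len(inrs)
```

For INR values of 1.5, 2.0, 2.5, 3.0, and 3.5, three of the five values are in range, giving a TTR of 0.6.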

Clinical trial simulation

Figure 1 summarizes the clinical trial simulation platform. For the first two days, each simulated patient receives initial warfarin treatment, either a pharmacogenetic dose or a 10 mg fixed dose. From days 3 to 10, doses are adjusted according to protocol to reach the target INR range and maintain therapy. The PK/PD model predicts INRs, which are measured at times specified by the protocols.

Figure 1.
The simulation process. Each simulated patient receives warfarin for 10 days: initial doses for 2 days and dosing adjustment from days 3 to 10. Time to measure INR depends on the protocols.

We ran a two-armed simulation with 5700 patients per arm. The patient population in each arm was identical, and each arm was assigned to one of the two protocols. We then repeated the simulation 100 times in order to capture the variance of INR outcomes. Each iteration’s INRs were further converted to TTR.

Apply sensitivity analysis to identify patient characteristics

1,140,000 simulations (5700 patients × 2 protocols × 100 iterations) were conducted. For simplicity, we took the average TTR of the 100 iterations to represent each simulated patient’s TTR. Two sets of TTRs (one for each protocol) for 5700 simulated patients constitute the final data for the next analysis.
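The bookkeeping behind these numbers can be sketched as follows. The dictionary layout, keyed by a hypothetical `(patient_id, protocol)` pair, is an assumption made for illustration.

```python
from statistics import mean

N_PATIENTS, N_PROTOCOLS, N_ITERATIONS = 5700, 2, 100
TOTAL_RUNS = N_PATIENTS * N_PROTOCOLS * N_ITERATIONS  # 1,140,000 simulations

def average_ttr_per_patient(ttr_runs):
    """Collapse per-iteration TTRs into one average TTR per patient and protocol.

    ttr_runs maps (patient_id, protocol) -> list of TTRs, one per iteration.
    """
    return {key: mean(values) for key, values in ttr_runs.items()}

# Toy example with two iterations instead of 100.
example = average_ttr_per_patient({
    (1, "pharmacogenetic"): [0.5, 0.75],
    (1, "standard"): [0.25, 0.75],
})
```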

Sensitivity analysis (SA) [6] is a common way to identify how different values of a variable influence the variation of output. This project used standardized regression coefficients implemented in R for the SA.
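Standardized regression coefficients can be obtained by z-scoring the predictors and the outcome and then fitting an ordinary least-squares regression without an intercept. The sketch below does this in plain Python as a stand-in for the R implementation used in the project; the function names are illustrative, and the code assumes no predictor column is constant and no two columns are perfectly collinear.

```python
from statistics import mean, pstdev

def standardize(values):
    """Z-score a column (population standard deviation)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x

def standardized_regression_coefficients(columns, y):
    """Return one standardized coefficient per predictor column."""
    z_cols = [standardize(col) for col in columns]
    z_y = standardize(y)
    # Normal equations (Z^T Z) beta = Z^T z_y on the standardized data.
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in z_cols] for ci in z_cols]
    rhs = [sum(p * q for p, q in zip(ci, z_y)) for ci in z_cols]
    return solve_linear_system(gram, rhs)
```

A coefficient near 0 means the predictor explains little of the variation in the outcome; a large positive or negative value means the outcome varies strongly with that predictor.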

Three important variables, race, CYP2C9, and VKORC1, are chosen for the SA. To understand how much a particular value of a variable contributes to TTR variation (for example, TTR variation due to the characteristic African American), we convert each possible value of a variable into a binary dummy variable. For example, the four race groups, Caucasian, Asian, African American, and Unknown, are represented by the dummy vectors [1,0,0], [0,1,0], [0,0,1], and [0,0,0], respectively, which pull out the appropriate coefficients from our regression equations.
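The dummy coding for race can be sketched as below; the function and constant names are illustrative. The reference group (Unknown) has no column of its own and is therefore encoded as all zeros.

```python
# Non-reference race levels, in the column order used in the text.
RACE_LEVELS = ["Caucasian", "Asian", "African American"]

def dummy_code(value, levels):
    """Return a 0/1 vector with a 1 in the position of the matching level."""
    return [1 if value == level else 0 for level in levels]
```

The same function can encode CYP2C9 and VKORC1 by passing the non-reference genotypes as `levels`.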


The average TTRs of 5700 patients for the pharmacogenetic and standard protocols are 0.622 and 0.658 (SD 0.149 and 0.164) respectively. Figure 2 shows sensitivity scores for the four race groups (Caucasian, Asian, African American, and Unknown) and three VKORC1 genotypes (A/B, A/A, and B/B). Unknown race and B/B (wild type of VKORC1) are the reference group. The Y axis shows standardized coefficients or sensitivity scores. The scores of most characteristics in pharmacogenetic protocol are close to 0 and TTRs are very consistent. This indicates the protocol successfully integrates these characteristics. This is because the protocol guides the initial treatment based on these characteristics and thus captures the factor-dependent variation in patients’ dosing requirement. Consequently TTR variations are low. It is interesting to note that although the initial doses depend on clinical and genetic factors, dose adjustment for days 3 to 10 is only based on INRs. As a result, we conclude that the initial dose predictions sufficiently capture and propagate the variation due to these characteristics.

Figure 2.
Sensitivity scores of four race groups and three VKORC1 genotypes for pharmacogenetic and standard protocols. Unknown race group and VKORC1 genotype B/B are reference group so they are absent from the figure. Observed TTRs of four race groups, Caucasian, ...

On the other hand, the standard protocol fails to integrate most of these characteristics. Compared to the reference group (Unknown race), Asian is positively associated with TTR, but African American is negatively associated with TTR. In addition, the observed TTRs for African American and wild-type (B/B) VKORC1 are the lowest among their respective factor values (0.58 and 0.658, respectively). These results indicate that African Americans and persons with VKORC1 genotype B/B should receive more attention when treated with warfarin under the standard protocol because they tend to have low TTR and, consequently, higher risks of hemorrhage and stroke.

Figure 3 shows sensitivity scores for six CYP2C9 genotypes (*1/*1, *1/*2, *1/*3, *2/*2, *2/*3, and *3/*3 corresponding to CYP11, CYP12, CYP13, CYP22, CYP23, and CYP33, respectively) for both protocols. Genotype *1/*1 (wild type) is the reference group. Neither the pharmacogenetic nor standard protocols sufficiently integrate variant CYP2C9 genotypes, especially CYP33, which is negatively associated with TTR (compared to CYP11) and shows the lowest TTRs (0.392 for pharmacogenetic and 0.484 for standard protocols). Thus, the simulation framework determined that patients with CYP33 may be more at risk for an adverse event when they receive warfarin treatment based on either protocol.

Figure 3.
Sensitivity scores of six CYP2C9 genotypes for both protocols. Neither the pharmacogenetic nor the standard protocol controls the variant types of CYP2C9 well, especially CYP33. Observed TTRs of CYP2C9 genotypes, CYP33, CYP23, CYP22, CYP13, CYP12, and CYP11, ...


We propose a simulation approach to examine which patient characteristics a protocol “captures” and integrates. Several patient characteristics (see Table 1) influence warfarin outcomes. When a protocol captures important characteristics and accurately integrates them, patients displaying these characteristics will receive effective individualized treatment. Consequently, TTR variation between patients with these characteristics will be low.

We note that integrating important characteristics does not guarantee that a protocol has a high overall TTR. More complex issues such as the number of integrated characteristics, the weights of these characteristics, and interactions among characteristics should be considered. Rather, it indicates that people with these characteristics can benefit equally from the protocol, showing similar TTR measurements.

The simulation framework can identify patient sub-populations (when treated based on a certain protocol) that tend to have poor outcomes. Precautions can then be taken to reduce the chance of having a poor outcome in these sub-populations.

This proposed method merges three distinct knowledge sources (dosing protocols, an INR model, and EMRs) and integrates them into the simulation framework for knowledge discovery. The three knowledge sources, which exist in various forms, provide different aspects of information: protocols, which exist in the form of tables and algorithms, guide warfarin dosing; the INR model, which exists in the form of a complex two-compartment model, predicts dynamic INR measurements; finally, the PharmGKB EMR data provide the patient characteristics and their interactions needed for the INR model and protocol predictions, which generate TTR and facilitate the discovery process. This project demonstrates that a sophisticated simulation algorithm can discover valuable clinical knowledge from mixed types of data sources.

Each element in the proposed approach is highly modularized. One can easily replace the PharmGKB data, INR model, or protocol for other research purposes. In addition, we can use this simulation platform to examine which characteristics a newly designed warfarin protocol is able to capture and compare advantages and disadvantages among other existing protocols.

Future work will expand the current work to a large number of protocols and allow for protocol comparison. We also plan to use different INR models, compare them, and further improve the simulation platform. More importantly, we will refine the platform to support comparative effectiveness research that allows comparison of not only TTR but also other outcome metrics, such as bleeding and thrombosis risks, and treatment costs.


We thank NIH grant R01LM010130 and Dr. Michiyo Yamada for supporting this work.


[1] Anderson JL, Horne BD, Stevens SM, Grove AS, Barton S, Nicholas ZP, Kahn SF, May HT, Samuelson KM, Muhlestein JF, Carlquist JB, Couma-Gen Investigators. Randomized trial of genotype-guided versus standard warfarin dosing in patients initiating oral anticoagulation. Circulation. 2007;116(22):2563–2570.
[2] Gage BF, Eby C, Milligan PE, Banet GA, Duncan JR, McLeod HL. Use of pharmacogenetics and clinical factors to predict the maintenance dose of warfarin. Thromb Haemost. 2004;91(1):87–94.
[3] Gibaldi M, Perrier D. Pharmacokinetics, Second Edition. Marcel Dekker; New York: 1982.
[4] Hamberg AK, Dahl ML, Barban M, Scordo MG, Wadelius M, Pengo V, Padrini R, Jonsson EN. A PK-PD model for predicting the impact of age, CYP2C9, and VKORC1 genotype on individualization of warfarin therapy. Clinical Pharmacology and Therapeutics. 2007;81(4):529–538.
[5] Odén A, Fahlén M, Hart RG. Optimal INR for prevention of stroke and death in atrial fibrillation: a critical appraisal. Thrombosis Research. 2006;117(5):493–499.
[6] Saltelli A, Tarantola S, Campolongo F, Ratto M. Sensitivity analysis in practice: A guide to assessing scientific models. John Wiley and Sons, Inc; NY: 2004.
[7] The International Warfarin Pharmacogenetics Consortium. Estimation of the warfarin dose with clinical and pharmacogenetic data. The New England Journal of Medicine. 2009;360(8):753–764. PMCID: PMC2722908.

Articles from AMIA Summits on Translational Science Proceedings are provided here courtesy of American Medical Informatics Association