BMC Med Res Methodol. 2012; 12: 23.
Published online 2012 March 9. doi:  10.1186/1471-2288-12-23
PMCID: PMC3348070

Comparative effectiveness research on patients with acute ischemic stroke using Markov decision processes



Several methodological issues with non-randomized comparative clinical studies have been raised, one of which is whether the methods used can adequately identify uncertainties that evolve dynamically with time in real-world systems. The objective of this study is to compare the effectiveness of different combinations of Traditional Chinese Medicine (TCM) treatments and combinations of TCM and Western medicine interventions in patients with acute ischemic stroke (AIS) by using Markov decision process (MDP) theory. MDP theory appears to be a promising new method for use in comparative effectiveness research.


The electronic health records (EHR) of patients with AIS hospitalized at the 2nd Affiliated Hospital of Guangzhou University of Chinese Medicine between May 2005 and July 2008 were collected. Each record was partitioned into two "state-action-reward" stages divided by three time points: the first, third, and last day of hospital stay. We used the well-developed optimality technique in MDP theory with the finite horizon criterion to make the dynamic comparison of different treatment combinations.


A total of 1504 records with a primary diagnosis of AIS were identified. Only states with at least 10 patients' records were included, which left 960 records to be enrolled in the MDP model. Optimal combinations were obtained for 30 types of patient condition.


MDP theory makes it possible to dynamically compare the effectiveness of different combinations of treatments. However, the optimal interventions obtained by the MDP theory here require further validation in clinical practice. Further exploratory studies with MDP theory in other areas in which complex interventions are common would be worthwhile.

Keywords: Markov decision processes, Acute ischemic stroke, Comparative effectiveness research, Traditional Chinese Medicine/integrative medicine


Comparative effectiveness research (CER) is a way of identifying what works for which patients under which circumstances [1]. CER is not a single entity; it can take many forms, including cohort studies, systematic literature reviews, observational studies, and randomized controlled trials (RCTs) [1,2]. Non-randomized comparative clinical studies also play an important role in assessing the safety and effectiveness of medical interventions for routine practice. Recent attention to non-randomized comparative clinical studies in CER has focused on methodological issues [3,4]. Experts realize that there are methodological challenges for non-randomized comparative clinical studies that cannot be ignored, especially with the increased requirements for data analysis driven by the demand for real-world evidence. These challenges include [4] dealing adequately with multiple therapies and possible outcomes; an extremely heterogeneous baseline in terms of patient characteristics and setting; and confounding in studies that use different kinds of health databases. Methodology researchers have made great progress in the development and application of statistical methods for the description and analysis of CER data [5-7]. Such methods include using propensity score analysis to adjust for group differences [8,9], structural equation models and decomposition methods to identify how outcomes vary differentially with respect to patient characteristics and other factors for alternative treatment cohorts [10], and instrumental variable methods to address the problem of uncontrolled confounding [7,11-14]. However, the uncertainties in real-world systems that evolve dynamically with time have yet to be adequately identified.

Treatment with syndrome differentiation is considered the kernel of Traditional Chinese Medicine (TCM)[15], which means that therapeutic interventions are changed dynamically according to the variation of the state of the syndrome or disease over time. There is a general impression among Chinese medicine practitioners that treatments that change dynamically with syndrome differentiation and time are superior to those that remain unchanged. However, when TCM treatments are tailored to the individual patient, as is common practice, it is more difficult to assess their effectiveness than when they are applied to all patients in a standard manner in clinical studies. Methods that allow the researcher to model the uncertainties in real-world practice, and especially those that may dynamically change with time, are needed to describe TCM treatments and compare their effectiveness.

MDP theory is a versatile and powerful tool for analyzing sequential decision problems [16], with applications in many areas, such as natural science, engineering technology, and medical care, where it can increase the utilization of medical resources and optimize methods of diagnosis or treatment. MDP theory is also important for medical decision-making, such as the administration of medical devices, admission control in hospitals, decisions on operation timing, and the adjustment of treatment strategies [17-23].

Syndrome differentiation and TCM treatments are very often interdependent and interleaved over time, principally due to uncertainty about the underlying disease, uncertainty associated with patient responses to certain treatments, and the likelihood of patient states varying within the period of treatment, such as from one pattern of TCM to another pattern. The introduction of MDP theory into CER on TCM makes dynamic comparison and evaluation possible. In this study, we show how MDP theory can be used to model integrative medicine treatments (the blending of the best of conventional medicine and complementary and alternative medicine) [24] for patients with acute ischemic stroke (AIS), and to provide an optimal solution from dynamic effectiveness comparisons in sequential clinical practice.


Data collection

The electronic health records (EHR) of patients with AIS hospitalized at the 2nd Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China, were collected. The inclusion criteria for the records were a primary diagnosis of cerebral infarction and hospital admission within 14 days of the onset of stroke. Records of patients who had thrombolysis or had undergone early anticoagulation treatment were excluded.

All of the data were collected with an information acquisition form, one form for each record, that captured the general information of the patient, TCM and Western medicine diagnosis, all applied treatments with course detail, levels of neurological function defect on the first, third, and last day of hospitalization, and the results of brain imaging (i.e., computerized X-ray tomography or magnetic resonance imaging). This study was approved by the ethics committee of the 2nd Affiliated Hospital of Guangzhou University of Chinese Medicine.

Description of patients' condition and the criterion to be optimized

To determine the key characteristics for describing the condition of patients with AIS and the criterion to be optimized by using MDP theory, an expert panel was formed that included scholars, physicians of Western medicine, TCM practitioners, and doctors in the field of integrative medicine (with an educational background in both Western medicine and TCM), and a half-day expert panel meeting was held.

Six key characteristics were selected based on the results of the panel meeting (see Additional file 1: Appendix 1): (i1) age; (i2) any disease history, such as diabetes, hypertension, coronary heart disease, abnormal blood lipid level, or atrial fibrillation; (i3) any complication, such as pulmonary infection, urinary tract infection, or deep vein thrombosis; (i4) TCM diagnosis; (i5) TCM syndrome differentiation (TCM pattern); and (i6) level of neurological function (with items for evaluation taken from the NIHSS [25] and assessment standard of neurological function impairment [26]). A score was used to describe the level of neurological function defect (see Additional file 2: Appendix 2). The total score ranged from 0 to 29, where a high score indicates poor function. Patients who died were assigned a score of 29.

Duration of hospitalization for each patient was divided into two stages. Stage 1 ran from admission to the third day of hospital stay, and Stage 2 ran from the third day of hospital stay to discharge. This resulted in three time points for the state assessment: the first (timepoint 1, t1), third (timepoint 2, t2), and last day (timepoint 3, t3) of hospitalization. Each record was treated as two "state-action-reward" stages divided by the three timepoints. State refers to a patient's condition in terms of the six key characteristics; action represents the combination of treatments; and reward refers to the value of the differential between the scores for neurological function impairment [25,26] before and after treatment (equal to the total score before treatment minus the score after treatment). According to the expert panel's advice, the total reward values for the two stages became the criteria to be optimized. In terms of the reward values, 0 represents no change in a patient's condition, values larger than 0 represent improvement in a patient's condition, and values lower than 0 mean deterioration. If the value is larger than 0, then the larger the value, the better the improvement in state. The action that maximizes the total reward value is regarded as the optimal action, that is, the optimal intervention combination for the corresponding state.
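The reward definition above can be illustrated with a short sketch. It assumes that each record carries three impairment scores in the range 0 to 29 (one per time point); the function name `stage_rewards` and the example scores are hypothetical and are not taken from the study data.

```python
from typing import Tuple


def stage_rewards(score_t1: int, score_t2: int, score_t3: int) -> Tuple[int, int]:
    """Return (Stage 1 reward, Stage 2 reward) for one record.

    Reward = impairment score before treatment minus score after treatment,
    so positive values mean improvement and negative values mean deterioration.
    """
    for score in (score_t1, score_t2, score_t3):
        if not 0 <= score <= 29:
            raise ValueError("impairment score must lie in 0..29")
    return score_t1 - score_t2, score_t2 - score_t3


# Hypothetical record: scores of 12, 9, and 10 at t1, t2, and t3.
r1, r2 = stage_rewards(12, 9, 10)
total_reward = r1 + r2  # the quantity to be maximized over the two stages
```

In this hypothetical record the patient improves during Stage 1 and deteriorates slightly during Stage 2, so the two stage rewards have opposite signs while the total remains positive.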

Description of interventions

Five circumstances were used to distinguish different treatment combinations (action) at each stage (see Additional file 3: Appendix 3): (a1) whether to use antiplatelet and/or anticoagulant agents; (a2) whether to use TCM treatments for replenishing qi and wen yang (Yi Qi Wen Yang); (a3) whether to use TCM treatments for clearing heat and extinguishing wind (Qing Re Xi Feng); (a4) whether to use TCM treatments for relaxing the bowels; and (a5) whether to use herbal medicine.
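As a minimal sketch of how such a treatment combination can be written as a five-character label (for example "01011", the form used for actions in the results), the following assumes that every component a1 to a5 is binary (0 = not used, 1 = used). The helper names `encode_action` and `decode_action` are hypothetical.

```python
from typing import List, Sequence

# Order follows a1..a5 as listed in the text.
ACTION_COMPONENTS = (
    "a1 antiplatelet and/or anticoagulant agents",
    "a2 Yi Qi Wen Yang",
    "a3 Qing Re Xi Feng",
    "a4 relaxing the bowels",
    "a5 herbal medicine",
)


def encode_action(flags: Sequence[int]) -> str:
    """Turn five 0/1 flags into a label such as '01011'."""
    if len(flags) != len(ACTION_COMPONENTS) or any(f not in (0, 1) for f in flags):
        raise ValueError("expected five flags, each 0 or 1")
    return "".join(str(f) for f in flags)


def decode_action(label: str) -> List[str]:
    """List the components that are switched on in a label."""
    if len(label) != len(ACTION_COMPONENTS) or set(label) - {"0", "1"}:
        raise ValueError("expected a five-character string of 0s and 1s")
    return [name for name, bit in zip(ACTION_COMPONENTS, label) if bit == "1"]


label = encode_action([0, 1, 0, 1, 1])
used = decode_action(label)
```

With five binary components there are at most 32 distinct actions; only those actually observed for a given state enter that state's action set.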

Treatment strategies were carried out at the request of the physician in charge of the patient under the same theory of TCM [27]. Patients with a TCM diagnosis belonging to the Yin pattern were treated by "Yi Qi Wen Yang" treatments, and those with a TCM diagnosis belonging to the Yang pattern received "Qing Re Xi Feng" treatments. Herbal medicine was prescribed according to the current symptoms of the patient. If the patient was constipated, TCM treatments to relax the bowels were used. Aspirin or Clopidogrel was taken orally by each patient within 48 hours of hospital admission, except those who were allergic to or genuinely intolerant of these agents. Anticoagulant agents, including unfractionated heparin (UFH), low-molecular-weight heparin (LMWH), or warfarin were used if the patient had any of the following conditions: atrial fibrillation, serious artery angiostenosis, or advancing stroke. Any treatment might be changed at any time if the physician thought it necessary.

For patients with a history of hypertension, diabetes, or dyslipidemia, the agents that they had been taking before admission continued to be administered during their hospital stay. However, these interventions were not included in the analysis, as they did not focus on stroke treatment.

Data management and analysis

All of the information acquisition forms were double entered with EpiData 3.1 (EpiData Association, Odense, Denmark). The final dataset was converted into SPSS format. Missing data were replaced by the median of nearby points. Data were analyzed primarily with SPSS 13.0 (SPSS, USA). The Markov decision process (MDP) programs were written in C and compiled using Dev-C++.
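The replacement rule "median of nearby points" appears to correspond to the missing-value option of the same name in SPSS. The sketch below is one plausible reading of it, not the authors' code: a missing entry is replaced by the median of up to `span` valid values before it and up to `span` valid values after it, in file order. The function name, the default span of 2, and the handling of entries with too few neighbours are assumptions. Python is used here for brevity, although the study's own programs were written in C.

```python
from statistics import median
from typing import List, Optional, Sequence


def fill_with_nearby_median(values: Sequence[Optional[float]], span: int = 2) -> List[Optional[float]]:
    """Replace None entries by the median of nearby valid values.

    Neighbours are always taken from the original series, so imputed
    values are never reused to impute other entries.
    """
    filled = list(values)
    for idx, value in enumerate(values):
        if value is not None:
            continue
        before = [x for x in values[:idx] if x is not None][-span:]
        after = [x for x in values[idx + 1:] if x is not None][:span]
        neighbours = before + after
        if neighbours:  # leave the gap if no valid neighbour exists
            filled[idx] = median(neighbours)
    return filled


example = fill_with_nearby_median([1, 2, None, 4, 9], span=2)
```

Because the imputed value depends on the order of the records, any such replacement of the i6 items should be read with the caution noted in the limitations.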

Formulating an MDP model for the treatment of AIS

According to clinical experience and TCM theory, treatment decision-making depends on the current condition of the patient, and the corresponding TCM/integrative medicine (i.e., the combination of practices and methods of alternative medicine with conventional medicine) therapies are described as non-stationary finite horizon MDPs, in which each state variable denotes the patient's condition at a certain time. The optimality problem is solved by maximizing the non-stationary finite horizon expected total utility. For finite horizon MDPs, the state space is a set of vectors consisting of all possible conditions for a patient, the set of available actions for a state is composed of the treatments used for therapy in that state, the transition probabilities in the MDPs are determined by the records of therapeutic effectiveness, and the corresponding utility function is evaluated based on the neurological functional impairment score related to the patient's condition and the effectiveness of treatment. Thus, the optimality problem is described as a non-stationary finite horizon expected total utility MDP model, and the optimality technique already developed for MDPs can be used to solve it efficiently [16].

Formulating a model for MDPs with finite horizon reward criteria

First, it is necessary to specify the condition of the patient, which is the information known by the physician. A state $i$ in the MDP denotes the patient's condition. As described in the previous section, a patient's condition is evaluated based on an overall consideration of various factors; for example, $i_6$ summarizes the level of consciousness, visual field defects, muscle power of the limbs, and so on. Thus, the state is denoted by a vector $i = (i_1, \ldots, i_n)$, where the component $i_k$ ($k = 1, \ldots, n$) corresponds to one aspect of the patient's condition and $n$ is the dimension of the state vector. The state space is composed of all possible state vectors, that is, $S = \{ i = (i_1, \ldots, i_n) \mid i_k \in \{0, 1, \ldots, l_k\},\ k = 1, \ldots, n \}$, where $l_k$ is the largest value that the $k$-th factor can take.

Second, a vector of treatment choices $a = (a_1, \ldots, a_m)$ is regarded as the action $a$ available to the decision-maker. As explained in the section "Description of interventions", in the treatment of AIS each component $a_i$ corresponds to a type of treatment, and $a_i$ takes a value in $\{0, 1, \ldots, j_i\}$ ($i = 1, \ldots, m$). For example, for the antiplatelet component, 0 denotes that an antiplatelet agent should not be used and 1 denotes that aspirin and/or clopidogrel should be chosen. Similarly, for the herbal medicine component, 0 and 1 denote that herbal medicine should not and should be used, respectively. $A(i)$ denotes the set of all actions available to the controller when the process is in state $i \in S$; in other words, $A(i)$ represents the set of all treatment combinations available to the controller at state $i$.

Third, when a physician prescribes a treatment combination (action $a$) for a patient in state $i$, the corresponding effectiveness can be observed as the state $j$ of the patient at the next observable time point. Therapeutic effectiveness may differ when the same treatment combination is applied to different patients with the same condition. Thus, the dynamic evolution of the treatment process is specified by the transition probability $p_t(j \mid i, a)$, the probability that the state is $j \in S$ at time $t + 1$ when action $a \in A(i)$ is taken at state $i \in S$ at time $t$. We use $\#(j, i, a)$ to denote the number of transfers from state $i$ to the next state $j$ under action $a$. For all states $i, j \in S$ and any given action $a \in A(i)$, the transition probability is given by Equation (1):

$$p_t(j \mid i, a) = \frac{\#(j, i, a)}{\sum_{k \in S} \#(k, i, a)}. \tag{1}$$

Fourth, the reward function $u_t(i, a)$, which depends on the current state $i \in S$, the chosen action $a \in A(i)$, and the decision epoch $t$, is expressed as

$$u_t(i, a) = \sum_{j \in S} u_t(j, i, a)\, p_t(j \mid i, a), \tag{2}$$

where $u_t(j, i, a)$ denotes the reward value when the state of the treatment process is $i$ at stage $t$, an action $a \in A(i)$ is taken, and the treatment process results in state $j$ at the next stage $t + 1$.
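The two estimates just described, empirical transition probabilities and expected rewards, can be sketched as follows for a single stage $t$. This is an illustration under stated assumptions rather than the authors' implementation: each observation is a hypothetical tuple (state, action, next state, observed reward), and the transition reward $u_t(j, i, a)$ is taken to be the mean observed reward for that transition. The function name `estimate_model` and the toy state labels are invented for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Observation = Tuple[str, str, str, float]  # (state i, action a, next state j, reward)


def estimate_model(observations: Iterable[Observation]):
    """Estimate p_t(j | i, a) and u_t(i, a) from the records of one stage."""
    counts = Counter()                 # number of transfers #(j, i, a)
    totals = Counter()                 # number of records with (i, a)
    reward_sums = defaultdict(float)   # summed reward per (i, a, j)
    for i, a, j, r in observations:
        counts[(i, a, j)] += 1
        totals[(i, a)] += 1
        reward_sums[(i, a, j)] += r

    # Relative frequency of each observed transition.
    p: Dict[Tuple[str, str, str], float] = {
        (i, a, j): n / totals[(i, a)] for (i, a, j), n in counts.items()
    }
    # Expected reward: transition rewards weighted by transition probabilities.
    u: Dict[Tuple[str, str], float] = defaultdict(float)
    for (i, a, j), prob in p.items():
        mean_transition_reward = reward_sums[(i, a, j)] / counts[(i, a, j)]  # assumed u_t(j, i, a)
        u[(i, a)] += mean_transition_reward * prob
    return p, dict(u)


# Toy data: four hypothetical patients in state "s1" treated with action "01011".
toy = [("s1", "01011", "s1", 0), ("s1", "01011", "s2", 2),
       ("s1", "01011", "s2", 2), ("s1", "01011", "s2", 2)]
p_hat, u_hat = estimate_model(toy)
```

Under the mean-reward assumption, the weighted sum reduces to the average observed reward for the pair $(i, a)$, which is a convenient check on the estimates.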

Finally, to complete the model, it is necessary to introduce the $N$-horizon expected total reward criterion. This requires defining a class of policies (i.e., all possible sequences of treatment combinations) admissible to the controller. A policy can be denoted as a sequence of functions $\pi = \{f_1, f_2, \ldots, f_{N-1}\}$, where $f_t$ ($1 \le t \le N-1$) acts on $S$ and satisfies $f_t(i) \in A(i)$ for all $i \in S$. Hence, $f_t(i)$ is the treatment combination chosen at state $i$ at stage $t$. Let $\Pi$ be the set of all policies. For any given policy $\pi$ and initial state $i$, $J(\pi, i)$ denotes the corresponding expected total reward from the initial time to the end time $N$.

To that end, a model is specified for non-stationary MDPs with the $N$-horizon expected total reward criterion for the foregoing treatment processes:

$$\{ S,\ (A(i),\ i \in S),\ p_t(j \mid i, a),\ u_t(i, a) \},$$

where the state space $S$, the available action set $A(i)$ at state $i \in S$, the transition probability $p_t(j \mid i, a)$ with $i, j \in S$ and $a \in A(i)$, and the reward function $u_t(i, a)$ are as previously defined. To support the following arguments, some notation is introduced: for each fixed policy $\pi = \{f_1, f_2, \ldots, f_{N-1}\} \in \Pi$, a transition probability matrix $P(t, \pi)$ is defined with the $(i, j)$ element $p_t(j \mid i, f_t(i))$.

For each $\pi \in \Pi$ and initial state $i \in S$, the $N$-horizon expected total reward to be maximized is denoted by

$$J(\pi, i) = E_i^{\pi} \left[ \sum_{t=1}^{N-1} u_t(i(t), a(t)) + u_N(i(N)) \right],$$

where $E_i^{\pi}$ denotes the expectation operator determined by the given $p_t(j \mid i, f_t(i))$ and the initial state $i \in S$, $i(t)$ and $a(t)$ are the state and action variables at time $t$, and $u_N(i(N))$ is the terminal reward associated with the state $i(N) \in S$; see [16] for details.
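For a fixed policy, this expected total reward can be computed exactly by stepping backwards from the terminal reward. The following is a minimal sketch under assumptions: the dictionaries `p`, `u`, and `policy` are keyed by stage, the state and action labels are invented, and the numbers are toy values rather than estimates from the study.

```python
from typing import Dict, Tuple


def evaluate_policy(
    policy: Dict[int, Dict[str, str]],                       # policy[t][state] -> action
    p: Dict[int, Dict[Tuple[str, str], Dict[str, float]]],   # p[t][(i, a)][j] -> probability
    u: Dict[int, Dict[Tuple[str, str], float]],              # u[t][(i, a)] -> expected reward
    terminal: Dict[str, float],                              # terminal reward u_N(i)
    n_stages: int,                                           # N, the number of time points
) -> Dict[str, float]:
    """Return J(pi, i) for every initial state i under a fixed policy pi."""
    value = dict(terminal)  # value at the final time point N
    for t in range(n_stages - 1, 0, -1):
        value = {
            i: u[t][(i, a)] + sum(q * value[j] for j, q in p[t][(i, a)].items())
            for i, a in policy[t].items()
        }
    return value


# Toy two-stage problem (N = 3 time points), same dynamics at both stages.
stage_p = {("mild", "00001"): {"mild": 1.0},
           ("severe", "10001"): {"mild": 0.5, "severe": 0.5}}
stage_u = {("mild", "00001"): 1.0, ("severe", "10001"): 3.0}
fixed_policy = {t: {"mild": "00001", "severe": "10001"} for t in (1, 2)}
J_fixed = evaluate_policy(fixed_policy, {1: stage_p, 2: stage_p},
                          {1: stage_u, 2: stage_u},
                          {"mild": 0.0, "severe": 0.0}, n_stages=3)
```

Evaluating a fixed policy in this way gives a baseline, such as "keep the same treatment combination at both stages", against which an optimized policy can be compared.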

Finally, the corresponding optimal value function is defined as $J^*(i) = \sup_{\pi \in \Pi} J(\pi, i)$, $i \in S$. A policy $\pi^* \in \Pi$ is said to be optimal if $J(\pi^*, i) = J^*(i)$ for all $i \in S$.

Solutions to the optimality problem

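For each $\pi \in \Pi$, $U_t(\pi, i)$ denotes the corresponding expected total utility from time $t$ to the end time $N$ given state $i(t) = i$ at time $t$, that is (by the well-known Markov property),

$$U_t(\pi, i) = u_t(i, f_t(i)) + \sum_{j \in S} p_t(j \mid i, f_t(i))\, U_{t+1}(\pi, j), \quad t = 1, \ldots, N-1, \qquad U_N(\pi, i) = u_N(i),$$

so that $U_1(\pi, i) = J(\pi, i)$. Writing $U_t(i) = \sup_{\pi \in \Pi} U_t(\pi, i)$ for the optimal value from time $t$ onward and $J_t(i)$ for the value computed by the backward recursion in the algorithm below, the optimality equations (see [16]) imply that $J^*(i) = U_1(i) = J_1(i)$.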

To obtain an optimal policy, the following algorithm is used (see Theorem 4.3.3 in [16]).

Step I: Set $t = N$ and

$$J_N(i) = u_N(i) \quad \text{for all } i \in S.$$

Step II: Substitute $t - 1$ for $t$ and compute $J_t(i)$ for each $i \in S$ by

$$J_t(i) = \max_{a \in A(i)} \Big\{ u_t(i, a) + \sum_{j \in S} p_t(j \mid i, a)\, J_{t+1}(j) \Big\}. \tag{9}$$

Obtain $f_t^*$, which realizes the maximum in Eq. (9).

Step III: If $t = 1$, then stop. Otherwise return to Step II. The policy obtained, $\pi^* = \{f_1^*, \ldots, f_{N-1}^*\}$, is optimal (by Theorem 4.3.3 in [16]) because the control model consists of finite state and action spaces.
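Steps I to III amount to backward induction over a finite horizon. The sketch below illustrates the procedure on a toy problem; it is not the authors' C program, and the state names, action labels, and numbers are invented. Ties between actions are broken in favour of the first action listed, which is an arbitrary choice.

```python
from typing import Dict, List, Tuple


def backward_induction(
    states: List[str],
    actions: Dict[str, List[str]],                           # A(i)
    p: Dict[int, Dict[Tuple[str, str], Dict[str, float]]],   # p[t][(i, a)][j]
    u: Dict[int, Dict[Tuple[str, str], float]],              # u[t][(i, a)]
    terminal: Dict[str, float],                              # u_N(i)
    n_stages: int,                                           # N
):
    """Return value functions J[t][i] and an optimal policy policy[t][i]."""
    J = {n_stages: dict(terminal)}          # Step I: terminal values at t = N
    policy: Dict[int, Dict[str, str]] = {}
    for t in range(n_stages - 1, 0, -1):    # Steps II and III: t = N-1, ..., 1
        J[t], policy[t] = {}, {}
        for i in states:
            best_action, best_value = None, float("-inf")
            for a in actions[i]:
                value = u[t][(i, a)] + sum(
                    prob * J[t + 1][j] for j, prob in p[t][(i, a)].items()
                )
                if value > best_value:
                    best_action, best_value = a, value
            J[t][i], policy[t][i] = best_value, best_action
    return J, policy


# Toy two-stage problem (N = 3 time points) with identical stages.
toy_states = ["mild", "severe"]
toy_actions = {"mild": ["00001", "10001"], "severe": ["00001", "10001"]}
stage_p = {("mild", "00001"): {"mild": 1.0}, ("mild", "10001"): {"mild": 1.0},
           ("severe", "00001"): {"severe": 1.0},
           ("severe", "10001"): {"mild": 0.5, "severe": 0.5}}
stage_u = {("mild", "00001"): 1.0, ("mild", "10001"): 0.5,
           ("severe", "00001"): 1.0, ("severe", "10001"): 3.0}
J, policy = backward_induction(toy_states, toy_actions,
                               {1: stage_p, 2: stage_p}, {1: stage_u, 2: stage_u},
                               {"mild": 0.0, "severe": 0.0}, n_stages=3)
```

Each stage requires one pass over all state-action pairs, so with two stages, 30 states, and at most 32 actions per state the computation is very small.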

Numerical implementation

All of the records from the patients with AIS were broadly classified into several groups according to the patient's condition (each of which is called a "state"), and the types of treatments were divided into two stages during which different treatment combinations were used. Information was collected to form Tables 1 and 2, which show patient condition and the corresponding treatment combination (i.e., "actions") at Stage 1 and Stage 2, respectively. Patient condition as assessed by the six key characteristics is listed in columns 2 through 7. The first column denotes the number of patients with the same condition, and columns 8 through 12 list the main treatments (sometimes more than one for each "state") used for AIS (the columns in Tables 1 and 2 have the same meaning but refer to different treatment stages).

Table 1
The patients' conditions and treatments at Stage 1*
Table 2
The patients' conditions and treatments at Stage 2*

The elements of the MDP model can now be formulated. From Tables 1 and 2, the state space can be expressed as S = {200111, 200112, ......, 311122, 311123}, and the corresponding sets of admissible actions are given as

A(200111) = {00001; 00101; 00111; 10001; 10101; 10111}

A(200112) = {10011; 10101; 10110; 10111; 11111; 10001; 11101}, ......

The optimality problem is considered within a finite time horizon from Stage 1 to Stage 2. A terminal reward of 0 is assigned to all states. Based on Tables 1 and 2 and Eq. (1), the transition probabilities $p_t(j \mid i, a)$ ($t = 1, 2$) are computed and listed in Additional file 4: Appendix 4 and Additional file 5: Appendix 5. From the neurological functional impairment scores in Tables 1 and 2 and Eq. (2), the reward functions $u_t(i, a)$ ($t = 1, 2$) are obtained and listed in Additional file 6: Appendix 6 and Additional file 7: Appendix 7.

Using the algorithm to solve the optimality problem, an optimal policy $\pi^* = \{f_1^*, f_2^*\}$ (corresponding to the optimal treatments) can be obtained as follows.

$f_1^*(200111) = \{00001\}$, $f_1^*(200112) = \{10101\}$, ......

$f_2^*(200111) = \{00111\}$, $f_2^*(200112) = \{10001\}$, ......

The optimal treatments under this optimal policy are shown in Tables 3 and 4.

Table 3
Optimal combination of treatment at stage 1 (example)
Table 4
Optimal combination of treatment at stage 2 (example)


General information

A total of 1504 records with a primary diagnosis of AIS were identified for the period 1 May 2005 to 31 July 2008. Of these, 1337 met the inclusion criteria. Only states with at least 10 patients' records were included, resulting in 960 records being enrolled in the MDP model, representing 30 kinds of patient condition. Sixty-eight percent of records were from patients over 66 years old. A disease history was given for 74% of the 960 patients. Most of the records had fairly low scores for neurological function impairment, indicating that the severity of the patient's condition was minor to medium (see Table 5). The i6 value for the eight patients who died during Stage 2 was 29 (the highest score for neurological functional impairment).
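The selection of states with at least 10 records can be sketched as a simple frequency filter. The record layout (a dictionary with a "state" key) and the function name `keep_frequent_states` are hypothetical, and the sketch leaves open at which time point the state is counted.

```python
from collections import Counter
from typing import Dict, List


def keep_frequent_states(records: List[Dict[str, str]], min_count: int = 10) -> List[Dict[str, str]]:
    """Keep only records whose state occurs at least min_count times."""
    counts = Counter(record["state"] for record in records)
    return [record for record in records if counts[record["state"]] >= min_count]


# Toy data: state "A" has 10 records and is kept; state "B" has 9 and is dropped.
toy_records = [{"state": "A"}] * 10 + [{"state": "B"}] * 9
kept = keep_frequent_states(toy_records)
```

A threshold of this kind trades coverage for stability: rare states are excluded so that transition probabilities are not estimated from a handful of patients.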

Table 5
General information of the patients at admission

Missing data ranged from 0 to 1.12% for i1 to i5 and from 0.07 to 18.39% for the items of i6. The highest rates for i6 were for ataxia (18.39%), visual field defects (13.80%), and sensory disturbance (13.76%); the remaining items, such as level of consciousness, facial paralysis, muscle power of the upper and lower limbs, aphasia, and dysarthria, had between 0.07 and 7.11% of data missing. For a1 to a5, missing data ranged from 0 to 0.37%. All of the missing data were replaced.

Optimal combination of treatments for corresponding states

By calculating and screening with MDP theory, the optimal combinations of treatments for the 30 states were obtained (see Tables 6 and 7).

Table 6
Optimal combination of treatments for a variety of states at Stage 1
Table 7
Optimal combination of treatments for a variety of states at Stage 2

The results of six states (see Tables 8 and 9) can be used as an example to show how these can be used to individually compare the effectiveness of treatments. The states in Table 8 represent patients who were older than 66 (i1 = 3), had at least one kind of disease history (i2 = 1), were without complications during their hospitalization (i3 = 0), had Zhong Jing Luo (apoplexy involving channels or collaterals) (i4 = 1) as the TCM diagnosis and a Yin TCM pattern (i5 = 2). Different levels of neurological functional impairment (i6) were detected, which meant that the severity of stroke varied among patients, as represented by State 10036, State 10037, and State 10038.

Table 8
Example of states within which patient's pattern of Chinese medicine was Yin
Table 9
Examples of States within which patient's pattern of Chinese medicine was Yang

At Stage 1, 122 patients were in State 10036, and received a combination of therapeutic interventions including TCM treatments to replenish qi and wen yang (Yi Qi Wen Yang), TCM treatments to relax the bowels, and herbal medicine (labeled as 01011). Each patient was given a score for neurological functional impairment to describe their i6 level. Among patients in State 10036 at Stage 1, those who had been treated with a combination of a2, a4, and a5 (labeled as action "01011" at Stage 1) obtained the highest reward (1 unit; see Table 8) at t2 compared with the other treatment combinations given to patients in the same state.

One hundred and twenty-seven patients were in State 10036 at Stage 2, which implies that if the treatment combination labeled "01011" was maintained, then patients in this State at Stage 2 would obtain the highest reward (1 unit) at t3.

Similarly, for patients at Stage 1 in State 10037, who had a more severe clinical condition than those in State 10036, the results showed that if the action was "01011", then the reward value would be a maximum of 4 units. In contrast, for patients in State 10037 at Stage 2, an intervention with only herbal medicine (action labeled as "00001") resulted in the highest reward of 4 units. For patients in State 10038 at Stage 1, a "10001" action resulted in a reward of 6.28 units at t2, whereas the action "10001" at Stage 2 resulted in 4.67 units of reward at t3.

Patients in States 10031, 10032, and 10033 (see Table 9) all had a TCM pattern of Yang, whereas those in States 10036, 10037, and 10038 had a TCM pattern of Yin.

The results in the first line of Table 9 show that by combining TCM treatments for clearing heat and extinguishing wind (Qing Re Xi Feng) (labeled as a3) with herbal medicine (labeled as a5), the best reward value at Stage 1 for patients in State 10031 was 1 unit. At Stage 2, patients in the same State 10031 may have needed antiplatelet agents (a1) together with TCM treatments to relax the bowels (a4) and herbal medicine (a5), forming the action "10011", to gain the maximum reward. It seems that for State 10033, in which patients tended to have a more severe clinical condition, the two actions that involved TCM therapeutic interventions achieved the best rewards.


Based on inpatient EHR, MDPs were applied to describe and analyze the dynamic process of different combinations of TCM treatments and/or integrated treatments of TCM and Western medicine for patients with AIS, and to determine the optimal treatment combination for each State by comparing the rewards gained from the corresponding actions. To the best of our knowledge, no similar topic has been previously addressed in the field of integrative medicine (IM) or in complementary and alternative medicine (CAM).

No medication has yet been confirmed to have neuroprotective effects in the management of patients with AIS [28]. Although aspirin can reduce the risk of mortality and morbidity when administered within 48 hours after the onset of stroke, it cannot be used in up to 28% of patients, who show aspirin "resistance" [29]. The management of patients with AIS with heparin carries an increased risk of bleeding complications [30]. The use of intravenous recombinant tissue plasminogen activators (rt-PA) in cerebral infarctions is associated with improved outcomes, but cannot be used as a routine therapy outside special units [31].

Several commonly used and government-approved traditional Chinese patent medicines (TCPMs), such as Ginkgo biloba [32], milk vetch [33,34], Mailuoning [35], Qingkailing [36], and Danshen [37] agents, have shown promising effects for ischemic stroke. However, no definite conclusions can be drawn from studies of these agents due to a general lack of reporting on methodology [30,38-40]. Properly designed clinical research to study the role of traditional medicine in ischemic stroke is warranted, but a number of issues must first be addressed in the design of such studies [41]. One of these issues is complex interventions involving varying dosages and interactions. Randomized controlled trials (RCTs) are a possible approach to evaluating complex interventions as a whole compared with an appropriate alternative [42], but cannot separate the benefits of different combinations of components. The multi-component structure of treatments is closer to real-world practice, especially in therapy for stroke with complex dynamics from onset through progression [43]. Moreover, the model of applying a treatment and continuing it without any change through the whole course of acute stroke is inconsistent with the basic theory of TCM whereby treatment is altered according to syndrome differentiation [15,44].

The results of this study indicate that the new method of MDPs may prove useful for comparative effectiveness research (CER). MDPs can be applied to dynamically compare the effectiveness of various combinations of complex treatments, and may be able to overcome the uncertainties related to individual patients' responses to certain combinations of treatments and the uncertainties concerning dynamic changes in treatment for certain patients over the course of disease [21-23,45].

Past research implies that herbal medicine may possess neuroprotective properties [46,47], protect against ischemic reperfusion injury [48,49], reduce edema in the brain [48], improve cerebral microcirculation [33,47], and inhibit apoptosis [50]. Such properties may partly explain the effectiveness of the combinations of treatments identified in this research.

This study has several limitations. First, all of the data were taken from EHR, and missing data are inevitable. The amount of missing data was less than 1.12% in most categories, although up to 18.39% of data were missing for items of i6. As i6 is a key variable in describing the rewards of actions, the results should be interpreted cautiously because of the possible bias caused by the replacement of missing data. In addition, because of their great variety, different components of herbal medicine were classified as one action. As a result, the effectiveness of different prescriptions of herbal medicine cannot be compared. Another limitation is that each patient's record was divided into two stages according to three time points, with each episode being regarded as an independent sample when modeled by MDPs. This is consistent with the Markov (memoryless) property that underlies MDP theory, but it may, to a certain extent, ignore potential correlations between episodes obtained from the same patient at different stages. Finally, although the key characteristics representing the patient states were based on the results of an expert panel meeting, the states of patients with acute ischemic stroke are variable, and it is likely that some characteristics that might be important for certain patients were missed.


MDPs can be used as a new method for comparative effectiveness research on TCM. This new approach makes it possible to compare the effectiveness of certain combinations of treatments dynamically by considering state, action, and reward simultaneously. The method can be applied to optimize medical intervention combinations and to support clinical decision-making. However, the optimal interventions obtained by the MDPs in this study require further validation in clinical practice. The results from the MDP model should be interpreted with caution both due to the property of the MDPs themselves and because of possible bias that may have been generated either from the data collection or the data management. Further exploratory studies with MDPs in other areas in which complex interventions involving TCM, Western medicine, or a combination of both are common would be worthwhile.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

DRW, XPG: Study design, analysis and interpretation, drafting and revision of the article, and final approval for submission. YFC, JXC, YQZ, MZ: Study design, data acquisition and cleaning, revision of the article, and final approval for submission. QLL, JHC, YHH, LEY: Study design, data analysis, drafting and revision of the article, and final approval for submission. YBL: Study design, drafting and revision of the article, and final approval for submission. All authors read and approved the final manuscript.

Pre-publication history

The pre-publication history for this paper can be accessed here:

Supplementary Material

Additional file 1: Appendix 1. State, action, and corresponding values.

Additional file 2: Appendix 2. Clinical Neurological Functional Impairment Assessment for Stroke Patients.

Additional file 3: Appendix 3. Traditional Chinese Patent Medicine (TCPM) and Western medicine.

Additional file 4: Appendix 4. Transition probability of step 1.

Additional file 5: Appendix 5. Transition probability of step 2.

Additional file 6: Appendix 6. Utility functions of step 1.

Additional file 7: Appendix 7. Utility functions of step 2.


This work was supported by the Scientific Research Project of Public Welfare Industry, State Administration of Traditional Chinese Medicine of P. R. of China (No. 200707004); the Finance Department of Guangdong Province (No. [2006]143); National Natural Science Foundation of China (NSFC) and Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (GDUPS, 2011).


