**|**BMC Med Res Methodol**|**v.12; 2012**|**PMC3348070

BMC Med Res Methodol. 2012; 12: 23.

Published online 2012 March 9. doi: 10.1186/1471-2288-12-23

PMCID: PMC3348070

Darong Wu,^{1,2} Yefeng Cai,^{1} Jianxiong Cai,^{1,2} Qiuli Liu,^{3} Yuanqi Zhao,^{1} Jingheng Cai,^{4} Min Zhao,^{1} Yonghui Huang,^{4} Liuer Ye,^{4} Yubo Lu,^{1,2} and Xianping Guo^{4,5}

Darong Wu: moc.liamg@uwgnoradrd; Yefeng Cai: moc.621@gnefeyiac; Jianxiong Cai: moc.621@628sucal; Qiuli Liu: nc.oohay@7002lquil; Yuanqi Zhao: nc.moc.oohay@8002mctmct; Jingheng Cai: nc.ude.usys.liam@gnehjiac; Min Zhao: moc.361@ardnaseissac; Yonghui Huang: nc.ude.usys.liam@5hgnoyh; Liuer Ye: moc.liamtoh@reuiley; Yubo Lu: moc.621@gnahznauyl; Xianping Guo: nc.ude.usys.liam@pxgscm

Received 2011 May 16; Accepted 2012 March 9.

Copyright ©2012 Wu et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Several methodological issues with non-randomized comparative clinical studies have been raised, one of which is whether the methods used can adequately identify uncertainties that evolve dynamically with time in real-world systems. The objective of this study is to compare the effectiveness of different combinations of Traditional Chinese Medicine (TCM) treatments and combinations of TCM and Western medicine interventions in patients with acute ischemic stroke (AIS) by using Markov decision process (MDP) theory. MDP theory appears to be a promising new method for use in comparative effectiveness research.

The electronic health records (EHR) of patients with AIS hospitalized at the 2nd Affiliated Hospital of Guangzhou University of Chinese Medicine between May 2005 and July 2008 were collected. Each record was partitioned into two "*state*-*action*-*reward*" stages divided by three time points: the first, third, and last day of hospital stay. We used the well-developed optimality technique in MDP theory with the finite horizon criterion to compare different treatment combinations dynamically.

A total of 1504 records with a primary diagnosis of AIS were identified. Only *states* with at least 10 patients' records were included, which left 960 records to be enrolled in the MDP model. Optimal combinations were obtained for 30 types of patient condition.

MDP theory makes it possible to dynamically compare the effectiveness of different combinations of treatments. However, the optimal interventions obtained by the MDP theory here require further validation in clinical practice. Further exploratory studies with MDP theory in other areas in which complex interventions are common would be worthwhile.

Comparative effectiveness research (CER) is a way of identifying what works for which patients under which circumstances [1]. CER is not a single entity; it can take many forms, including cohort studies, systematic literature reviews, observational studies, and randomized controlled trials (RCTs) [1,2]. Non-randomized comparative clinical studies also play an important role in assessing the safety and effectiveness of medical interventions for routine practice. Recent attention to non-randomized comparative clinical studies in CER has focused on methodological issues [3,4]. Experts realize that there are methodological challenges for non-randomized comparative clinical studies that cannot be ignored, especially with the increased requirements for data analysis driven by the demand for real-world evidence. These challenges include [4] dealing adequately with multiple therapies and possible outcomes; an extremely heterogeneous baseline in terms of patient characteristics and setting; and confounding in studies that use different kinds of health databases. Methodology researchers have made great progress in the development and application of statistical methods for the description and analysis of CER data [5-7]. Such methods include propensity score analysis to adjust for group differences [8,9], structural equation models and decomposition methods to identify how outcomes vary with patient characteristics and other factors across alternative treatment cohorts [10], and instrumental variable methods to address the problem of uncontrolled confounding [7,11-14]. However, the uncertainties in real-world systems that evolve dynamically with time have yet to be adequately identified.

Treatment with syndrome differentiation is considered the kernel of Traditional Chinese Medicine (TCM)[15], which means that therapeutic interventions are changed dynamically according to the variation of the state of the syndrome or disease over time. There is a general impression among Chinese medicine practitioners that treatments that change dynamically with syndrome differentiation and time are superior to those that remain unchanged. However, when TCM treatments are tailored to the individual patient, as is common practice, it is more difficult to assess their effectiveness than when they are applied to all patients in a standard manner in clinical studies. Methods that allow the researcher to model the uncertainties in real-world practice, and especially those that may dynamically change with time, are needed to describe TCM treatments and compare their effectiveness.

MDP theory is a versatile and powerful tool for analyzing sequential decision problems [16], with applications in many areas, such as natural science, engineering technology, and medical care, where it can increase the utilization of medical resources and optimize methods of diagnosis or treatment. MDP theory is also important for medical decision-making, such as the administration of medical devices, admission control in hospitals, decisions on operation timing, and the adjustment of treatment strategies [17-23].

Syndrome differentiation and TCM treatments are very often interdependent and interleaved over time, principally due to uncertainty about the underlying disease, uncertainty associated with patient responses to certain treatments, and the likelihood of patient states varying within the period of treatment, such as from one pattern of TCM to another pattern. The introduction of MDP theory into CER on TCM makes dynamic comparison and evaluation possible. In this study, we show how MDP theory can be used to model integrative medicine treatments (the blending of the best of conventional medicine and complementary and alternative medicine) [24] for patients with acute ischemic stroke (AIS), and to provide an optimal solution from dynamic effectiveness comparisons in sequential clinical practice.

The electronic health records (EHR) of patients with AIS hospitalized at the 2nd Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China, were collected. The inclusion criteria for the records were a primary diagnosis of cerebral infarction and hospital admission within 14 days of the onset of stroke. Records of patients who had thrombolysis or had undergone early anticoagulation treatment were excluded.

All of the data were collected with an information acquisition form, one form for each record, that captured the general information of the patient, TCM and Western medicine diagnosis, all applied treatments with course detail, levels of neurological function defect on the first, third, and last day of hospitalization, and the results of brain imaging (i.e., computerized X-ray tomography or magnetic resonance imaging). This study was approved by the ethics committee of the 2nd Affiliated Hospital of Guangzhou University of Chinese Medicine.

To determine the key characteristics for describing the condition of patients with AIS and the criterion to be optimized by using MDP theory, an expert panel was formed that included scholars, physicians of Western medicine, TCM practitioners, and doctors in the field of integrative medicine (with an educational background in both Western medicine and TCM), and a half-day expert panel meeting was held.

Six key characteristics were selected based on the results of the panel meeting (see Additional file 1: Appendix 1): (*i _{1}*) age; (

The duration of hospitalization for each patient was divided into two stages. Stage 1 ran from admission to the third day of hospital stay, and Stage 2 ran from the third day of hospital stay to discharge. This resulted in three time points for the *state* assessment: the first (*timepoint 1*, *t1*), third (*timepoint 2*, *t2*), and last day (*timepoint 3*, *t3*) of hospitalization. Each record was treated as two "*state*-*action*-*reward*" stages divided by the three timepoints. *State* refers to a patient's condition in terms of the six key characteristics; *action* represents the combination of treatments; and *reward* refers to the difference between the scores for neurological function impairment [25,26] before and after treatment (the total score before treatment minus the score after treatment). On the expert panel's advice, the total reward value over the two stages was taken as the criterion to be optimized. A reward value of 0 represents no change in a patient's condition, values larger than 0 represent improvement, and values lower than 0 represent deterioration. The larger a positive value, the greater the improvement in *state*. The *action* that maximizes the total reward value is regarded as the optimal *action*, that is, the optimal intervention combination for the corresponding *state*.
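The stage structure described above can be illustrated with a minimal Python sketch. The `Stage` class, the `split_record` helper, and the example scores are hypothetical names and values introduced only for illustration; they are not taken from the study's own software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    state: str       # patient condition at the start of the stage
    action: str      # treatment combination applied during the stage
    next_state: str  # patient condition at the end of the stage
    reward: float    # impairment score before minus score after


def split_record(states, actions, scores):
    """Split one record (3 time points, 2 actions) into two stages."""
    if not (len(states) == len(scores) == 3 and len(actions) == 2):
        raise ValueError("expected 3 states, 3 scores and 2 actions")
    return [
        Stage(states[k], actions[k], states[k + 1], scores[k] - scores[k + 1])
        for k in range(2)
    ]


# Hypothetical record: the impairment score falls from 12 to 9, then rises to 10.
stages = split_record(
    states=["s_t1", "s_t2", "s_t3"],
    actions=["01011", "00001"],
    scores=[12, 9, 10],
)
total_reward = sum(stage.reward for stage in stages)
```

In this made-up record the first stage has a positive reward (improvement) and the second a negative one (deterioration); the total over both stages is the quantity the model seeks to maximize.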

Five circumstances were used to distinguish different treatment combinations (*action*) at each stage (see Additional file 3: Appendix 3): (*a _{1}*) whether to use antiplatelet and/or anticoagulant agents; (

Treatment strategies were carried out at the request of the physician in charge of the patient under the same theory of TCM [27]. Patients with a TCM diagnosis belonging to the *Yin* pattern were treated with "*Yi Qi Wen Yang*" treatments, and those with a TCM diagnosis belonging to the *Yang* pattern received "*Qing Re Xi Feng*" treatments. Herbal medicine was prescribed according to the current symptoms of the patient. If the patient was constipated, TCM treatments to relax the bowels were used. Aspirin or clopidogrel was taken orally by each patient within 48 hours of hospital admission, except those who were allergic to or genuinely intolerant of these agents. Anticoagulant agents, including unfractionated heparin (UFH), low-molecular-weight heparin (LMWH), or warfarin, were used if the patient had any of the following conditions: atrial fibrillation, severe arterial stenosis, or advancing stroke. Any treatment might be changed at any time if the physician thought it necessary.

For patients with a history of hypertension, diabetes, or dyslipidemia, the agents that they had been taking before admission continued to be administered during their hospital stay. However, these interventions were not included in the analysis, as they did not focus on stroke treatment.

All of the information acquisition forms were double entered with EpiData 3.1 (EpiData Association Odense, Denmark). The final dataset was converted into SPSS format. Missing data were replaced by the median of nearby points. Data were analyzed primarily with SPSS13.0 (SPSS, USA). The Markov decision processes (MDPs) were written in C language and compiled using Dev C++ 4.9.9.2.
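The replacement of missing data by the median of nearby points can be sketched as follows. This is only one plausible reading, not the SPSS procedure itself: the function name `fill_median_nearby`, the `span` of two valid neighbours on each side, and the treatment of the values as an ordered series are all assumptions.

```python
from statistics import median


def fill_median_nearby(values, span=2):
    """Replace None entries with the median of nearby valid values.

    Assumption: up to `span` valid values before and after the gap are used,
    and the original (not yet filled) values serve as neighbours.
    """
    filled = list(values)
    for idx, value in enumerate(values):
        if value is not None:
            continue
        before = [v for v in values[:idx] if v is not None][-span:]
        after = [v for v in values[idx + 1:] if v is not None][:span]
        neighbours = before + after
        if neighbours:  # leave the gap if no valid neighbour exists
            filled[idx] = median(neighbours)
    return filled


# Hypothetical impairment scores with one missing entry.
scores = [2, 3, None, 5, 9]
filled = fill_median_nearby(scores)
```

Because the reward is a difference of impairment scores, any imputed score feeds directly into the reward of the stage it belongs to.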

According to clinical experience and TCM theory, treatment decision-making depends on the current condition of the patient, and the corresponding TCM/integrative medicine (i.e., the combination of practices and methods of alternative medicine with conventional medicine) therapies are described as non-stationary finite horizon MDPs, in which each *state* variable denotes the patient's condition at a certain time. The optimality problem is solved by maximizing the non-stationary finite horizon expected total utility. For finite horizon MDPs, the *state* space is a set of vectors consisting of all possible conditions for a patient, the set of available *actions* for a *state* is composed of the treatments used for that *state*, the transition probabilities in the MDPs are determined by the records of therapeutic effectiveness, and the corresponding utility function is evaluated based on the neurological functional impairment score related to the patient's condition and the effectiveness of treatment. Thus, the optimality problem is described as a non-stationary finite horizon expected total utility MDP model, and the optimality technique already developed for MDPs can be used to solve it efficiently [16].

First, it is necessary to specify the condition of the patient, which is the information known to the physician. A *state i* in the MDP denotes the patient's condition. As described in the previous section, a patient's condition is evaluated from several factors; for example, *i*_{6} summarizes the level of consciousness, visual field defects, muscle power of the limbs, and so on. Thus, the *state* is denoted by a vector i = (i_{1}, ..., i_{n}), where each component i_{k} (k = 1, ..., n) corresponds to one aspect of the patient's condition and *n* is the dimension of the *state* vector. The *state* space is composed of all possible *state* vectors, that is, S = {i = (i_{1}, ..., i_{n}) | i_{k} ∈ {0, 1, ..., l_{k}}, k = 1, ..., n}, where l_{k} denotes the number of levels of the k-th factor.
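To make the size of such a state space concrete, the short sketch below enumerates all state vectors as strings. The level counts in `levels` are hypothetical placeholders, not the values used in the study, and the string encoding assumes every component fits in one digit.

```python
from itertools import product

# Hypothetical upper level l_k for each of the n = 6 characteristics;
# component k takes values in {0, 1, ..., l_k}.
levels = [3, 2, 2, 2, 2, 4]


def encode(state):
    """Concatenate the components of a state vector into a label."""
    return "".join(str(component) for component in state)


state_space = [
    encode(state) for state in product(*(range(l + 1) for l in levels))
]
```

Under these assumed level counts the full product contains 1620 vectors, far more than the number of conditions a clinical dataset of this size would populate, which is why only states actually observed in the records can enter an empirical model.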

Second, a vector consisting of treatment combinations *a* = (*a*_{1}, ..., *a _{m}*) is regarded as

Third, when a physician prescribes a type of treatment combination (*action a*) for a certain patient in *state i*, the corresponding effectiveness can be detected in *state j* of the patient at the next observable time point. Therapeutic effectiveness may differ when the same treatment combination is applied to different patients with the same condition. Thus, the dynamic evolution of the treatment process is specified using the so-called transition probability p_{t}(j|i,*a*), which denotes the probability that the *state* is j ∈ S at time t + 1 when *action a* ∈ A(i) is taken in *state i* ∈ S at time t. We use #(j, i, *a*) to denote the number of observed transitions from *state i* to the next *state j* under *action a*. For all *states* i, j ∈ S and any given *action a* ∈ A(i), the transition probability is given by Equation (1).

$$p_t(j \mid i, a) := \frac{\#(j, i, a)}{\sum_{k \in S} \#(k, i, a)}$$

(1)
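Equation (1) is a relative-frequency estimate, which can be sketched in a few lines of Python. The function name `estimate_transitions` and the toy observations are hypothetical; in a non-stationary model the function would be called once per decision epoch with that epoch's observations.

```python
from collections import Counter, defaultdict


def estimate_transitions(stages):
    """Estimate p(j | i, a) from observed (i, a, j) triples of one epoch."""
    stages = list(stages)
    counts = Counter(stages)                        # #(j, i, a), keyed (i, a, j)
    totals = Counter((i, a) for i, a, _ in stages)  # sum over next states
    probs = defaultdict(dict)
    for (i, a, j), n in counts.items():
        probs[(i, a)][j] = n / totals[(i, a)]
    return dict(probs)


# Hypothetical observations: (state, action, next state).
observed = (
    [("s1", "01011", "s1")] * 3
    + [("s1", "01011", "s2")]
    + [("s1", "00001", "s2")] * 2
)
probs = estimate_transitions(observed)
```

Each row `probs[(i, a)]` sums to one by construction, and state-action pairs that never occur in the data simply have no row, so they cannot be evaluated by the model.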

Fourth, the *reward* function u_{t}(i, *a*), which depends on the current *state i* ∈ S, the chosen *action a* ∈ A(i), and the decision epoch t, is expressed as

$$u_t(i, a) = \sum_{j \in S} p_t(j \mid i, a)\, u_t(j, i, a),$$

(2)

where u_{t}(j, i, *a*) denotes the *reward* value when the treatment process is in *state i* at stage t, *action a* ∈ A(i) is taken, and the process moves to *state j* at the next stage t + 1.
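Equation (2) is a probability-weighted average, as the following sketch shows. The function name `expected_reward` and the numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def expected_reward(p_row, u_row):
    """Compute u_t(i, a) = sum_j p_t(j | i, a) * u_t(j, i, a).

    p_row maps next state j to p_t(j | i, a);
    u_row maps next state j to the reward u_t(j, i, a).
    """
    return sum(prob * u_row[j] for j, prob in p_row.items())


# Hypothetical row for one (state, action) pair: the patient stays in s1
# with no change in score, or moves to s2 with a 4-point improvement.
p_row = {"s1": 0.75, "s2": 0.25}
u_row = {"s1": 0.0, "s2": 4.0}
reward = expected_reward(p_row, u_row)
```

Here the expected one-stage reward is 0.75 × 0 + 0.25 × 4 = 1 unit.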

Finally, to complete the model, it is necessary to introduce the N-horizon expected total reward criterion. This requires defining the class of policies (i.e., all possible sequences of treatment combinations) admissible to the controller. A policy can be denoted as a sequence of functions π = {f_{1}, f_{2}, ..., f_{N}}, where f_{t} (1 ≤ t ≤ N) acts on S and satisfies f_{t}(i) ∈ A(i) for all i ∈ S. Hence, function f_{t}(i) is the treatment combination chosen at *state i* at stage t. Let Π be the set of all policies. For any given policy π and initial *state i*, J(π, i) denotes the corresponding expected total reward from the initial time to the end time N.

To that end, a model is specified for non-stationary MDPs with the N-horizon expected total reward criterion for the foregoing treatment processes:

$$\{S, (A(i), i \in S), p_t(j \mid i, a), u_t(i, a)\},$$

(3)

where the *state* space S, the available *action* set A(i) at *state i* ∈ S, the transition probability p_{t}(j|i, *a*) with i, j ∈ S and *a* ∈ A(i), and the *reward* function u_{t}(i, *a*) are as previously defined. To simplify the following arguments, some notation is introduced: for each fixed policy π = {f_{1}, f_{2}, ..., f_{N}} ∈ Π, a transition probability matrix P(t, π) is defined whose (i, j) element is p_{t}(j|i, f_{t}(i)).

For each π ∈ Π and initial *state i* ∈ S, the N-horizon expected total reward to be maximized is denoted by

$$J(\pi, i) := E_i^{\pi}\left[\sum_{t=1}^{N-1} u_t(i(t), a(t)) + u_N(i(N))\right],$$

(4)

where ${E}_{i}^{\pi}$ denotes the expectation operator determined by the given p_{t}(j|i, f_{t}(i)) and the initial *state i* ∈ S, *i*(t) and *a*(t) are the *state* and *action* variables at time t, and u_{N}(i(N)) is the terminal *reward* associated with the *state i*(N) ∈ S; see [16] for details.

Finally, the corresponding optimal value function is defined as J*(i) = sup_{π ∈ Π} J(π, i), i ∈ S. A policy π* in Π is said to be optimal if J(π*, i) = J*(i) for all i ∈ S.
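For a fixed policy, the expected total reward J(π, i) can be computed by working backwards from the terminal reward. The sketch below assumes 0-based stage indices, dictionary-based tables, and a toy two-state model; `evaluate_policy` and all numbers are hypothetical.

```python
def evaluate_policy(policy, p, u, terminal, states):
    """Return J(pi, i) for every initial state i.

    policy[t][i] is the action chosen in state i at stage t (0-based),
    p[t][(i, a)] maps next states to probabilities,
    u[t][(i, a)] is the expected one-stage reward,
    terminal[i] is the terminal reward.
    """
    value = dict(terminal)
    for t in reversed(range(len(policy))):
        value = {
            i: u[t][(i, policy[t][i])]
            + sum(q * value[j] for j, q in p[t][(i, policy[t][i])].items())
            for i in states
        }
    return value


states = ["mild", "severe"]
p_stage = {("mild", "A"): {"mild": 1.0},
           ("severe", "A"): {"mild": 0.5, "severe": 0.5}}
u_stage = {("mild", "A"): 1.0, ("severe", "A"): 2.0}
always_a = {"mild": "A", "severe": "A"}

J = evaluate_policy(
    policy=[always_a, always_a],
    p=[p_stage, p_stage],           # same tables at both stages, for brevity
    u=[u_stage, u_stage],
    terminal={"mild": 0.0, "severe": 0.0},
    states=states,
)
```

Comparing such values across all admissible policies is what the supremum in the definition of J*(i) expresses; enumerating every policy is rarely practical, which motivates a recursive algorithm.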

For each πΠ, *U*_{t}(π, *i*) denotes the corresponding expected total utility from time t to the end time N given *state i _{t }*=

$$U_t(\pi, j) := E_i^{\pi}\left[\sum_{n=t}^{N-1} u_n(i(n), a(n)) + u_N(i(N)) \,\middle|\, i(t) = j\right] \quad \text{for } t = N-1, \ldots, 1$$

(5)

$$U_N(\pi, j) := u_N(j), \quad j \in S.$$

(6)

Further,

$$J_t(i) := \sup_{\pi \in \Pi} U_t(\pi, i) \quad \text{for } t = N, \ldots, 1$$

(7)

implies that J*(i) = sup_{π ∈ Π} *U*_{1}(π, i) = J_{1}(i).

To obtain an optimal policy, the following backward induction algorithm is used (Theorem 4.3.3 in [16]).

Step I: Set t = N and

$$J_N(i) := u_N(i) \quad \text{for all } i \in S$$

(8)

Step II: Substitute t - 1 for t and compute J_{t}(i) by

$$J_t(i) = \max_{a \in A(i)} \left\{ u_t(i, a) + \sum_{j \in S} p_t(j \mid i, a)\, J_{t+1}(j) \right\} \quad \text{for } t = N-1, \ldots, 1.$$

(9)

Obtain f*_{t}, which realizes the maximum in Eq. (9).

Step III: If t = 1, then stop. Otherwise return to Step II. The policy obtained, π* = {f*_{1}, ..., f*_{N-1}}, is optimal (by Theorem 4.3.3 in [16]) because the control model consists of finite *state* and *action* spaces.
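Steps I to III can be sketched in Python as follows. This is a minimal illustration rather than the authors' C implementation: the function name `backward_induction`, the 0-based stage indices, the dictionary layout, and the toy model are all assumptions.

```python
def backward_induction(states, actions, p, u, terminal):
    """Finite-horizon backward induction.

    actions[i] lists the admissible actions A(i),
    p[t][(i, a)] maps next states j to p_t(j | i, a),
    u[t][(i, a)] is the expected one-stage reward u_t(i, a),
    terminal[i] is u_N(i). Stages are 0-based here.
    """
    value = dict(terminal)                 # Step I: J_N(i) = u_N(i)
    policy = [None] * len(p)
    for t in reversed(range(len(p))):      # Steps II and III: t = N-1, ..., 1
        new_value, rule = {}, {}
        for i in states:
            best_action, best_q = None, float("-inf")
            for a in actions[i]:
                q = u[t][(i, a)] + sum(
                    prob * value[j] for j, prob in p[t][(i, a)].items()
                )
                if q > best_q:             # ties keep the first action listed
                    best_action, best_q = a, q
            new_value[i], rule[i] = best_q, best_action
        value, policy[t] = new_value, rule
    return value, policy


# Hypothetical two-state, two-action, two-stage model.
states = ["mild", "severe"]
actions = {"mild": ["A", "B"], "severe": ["A", "B"]}
p_stage = {
    ("mild", "A"): {"mild": 1.0},
    ("mild", "B"): {"mild": 0.5, "severe": 0.5},
    ("severe", "A"): {"severe": 1.0},
    ("severe", "B"): {"mild": 0.5, "severe": 0.5},
}
u_stage = {("mild", "A"): 1.0, ("mild", "B"): 0.5,
           ("severe", "A"): 1.0, ("severe", "B"): 3.0}

J1, pi_star = backward_induction(
    states, actions, [p_stage, p_stage], [u_stage, u_stage],
    terminal={"mild": 0.0, "severe": 0.0},
)
```

In this toy model the best action for the "mild" state differs between the two stages, which illustrates why a stage-dependent policy can outperform a fixed one.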

All of the records from the patients with AIS were broadly classified into several groups according to the patient's condition (each of which is called a "*state*"), and the types of treatments were divided into two stages during which different treatment combinations were used. Information was collected to form Tables 1 and 2, which show patient condition and the corresponding treatment combination (i.e., "*actions*") at Stage 1 and Stage 2, respectively. Patient condition as assessed by the six key characteristics is listed in columns 2 through 7. The first column denotes the number of patients with the same condition, and columns 8 through 12 list the main treatments (sometimes more than one for each "*state*") used for AIS (the columns in Tables 1 and 2 have the same meaning but are for a different treatment stage).

The elements of the MDP model can now be formulated. From Tables 1 and 2, the state space can be expressed as S = {200111, 200112, ......, 311122, 311123}, and the corresponding sets of admissible actions are given as

A(200111) = {00001; 00101; 00111; 10001; 10101; 10111}

A(200112) = {10011; 10101; 10110; 10111; 11111; 10001; 11101}

......

The optimality problem is considered to be within a finite time horizon from stage 1 to stage 2. A terminal *reward* of 0 is assigned to all *states*. Based on Tables 1 and 2 and Eq. (1), the transition probabilities p_{t}(j|i, *a*) (t = 1, 2) are computed and listed in Additional file 4: Appendix 4 and Additional file 5: Appendix 5. From the neurological functional impairment scores in Tables 1 and 2, the *reward* functions u_{t}(i, *a*) (t = 1, 2) are obtained by Eq. (2) and are listed in Additional file 6: Appendix 6 and Additional file 7: Appendix 7.

Using the algorithm to solve the optimality problem, an optimal policy π* = {f*_{1}, f*_{2}} (corresponding to the optimal treatments) can be obtained as follows.

f*_{1}(200111) = {00001}, f*_{1}(200112) = {10101}, ......

f*_{2}(200111) = {00111}, f*_{2}(200112) = {10001}, ......

The optimal treatments with this optimal policy are shown in Tables 3 and 4.

A total of 1504 records with a primary diagnosis of AIS were identified for the period 1 May 2005 to 31 July 2008. Of these, 1337 met the inclusion criteria. Only *states* with at least 10 patients' records were included, resulting in 960 records being enrolled in the MDP model, representing 30 kinds of patient condition. Sixty-eight percent of records were from patients over 66 years old. A disease history was given for 74% of the 960 patients. Most of the records had fairly low scores for neurological function impairment, indicating that the severity of the patient's condition was minor to medium (see Table 5). The *i*_{6} value for the eight patients who died in Stage 2 was 29 (the highest score for neurological functional impairment).
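The restriction to states with at least 10 records can be sketched as a simple frequency filter. The record layout (a dictionary with a `"state"` key) and the function name `keep_frequent_states` are hypothetical, and the sketch assumes the count is taken over a single state label per record.

```python
from collections import Counter


def keep_frequent_states(records, min_count=10):
    """Keep only records whose state occurs at least `min_count` times."""
    counts = Counter(record["state"] for record in records)
    return [r for r in records if counts[r["state"]] >= min_count]


# Hypothetical data: 12 records in one state and 3 in another.
records = [{"state": "200111"}] * 12 + [{"state": "311123"}] * 3
kept = keep_frequent_states(records)
```

Such a threshold trades coverage for stability: rare states are dropped because relative-frequency estimates based on a handful of records would be unreliable.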

Missing data ranged from 0 to 1.12% for *i*_{1} to *i*_{5} and from 0.07 to 18.39% for the components of *i*_{6}: 18.39% for ataxia, 13.80% for visual field defects, and 13.76% for sensory disturbance. For the other *i*_{6} indexes (level of consciousness, facial paralysis, muscle power of the upper and lower limbs, aphasia, and dysarthria), missing data ranged from 0.07 to 7.11%. For *a _{1 }*to

By calculating and screening with MDP theory, the optimal combinations of treatments for the 30 *states* (see Tables 6 and 7) were obtained.

The results for six *states* (see Tables 8 and 9) can be used as an example to show how these can be used to individually compare the effectiveness of treatments. The *states* in Table 8 represent patients who were older than 66 (*i*_{1} = 3), had at least one kind of disease history (*i*_{2} = 1), were without complications during their hospitalization (*i*_{3} = 0), and had *Zhong Jing Luo* (apoplexy involving channels or collaterals) (*i*_{4} = 1) as the TCM diagnosis and a *Yin* TCM pattern (*i*_{5} = 2). Different levels of neurological functional impairment (*i*_{6}) were detected, which meant that the severity of stroke varied among patients, as represented by *State* 10036, *State* 10037, and *State* 10038.

At Stage 1, 122 patients were in *State* 10036 and received a combination of therapeutic interventions comprising TCM treatments to replenish qi and warm yang (*Yi Qi Wen Yang*), TCM treatments to relax the bowels, and herbal medicine (labeled as 01011). Each patient was given a score for neurological functional impairment to describe their *i*_{6} level. Among patients in

One hundred and twenty-seven patients were in *State* 10036 at Stage 2, which implies that if the treatment combination labeled "01011" was maintained, then patients in this *State* at Stage 2 would obtain the highest *reward* (1 unit) at t_{3}.

Similarly, for patients at Stage 1 in *State* 10037, who had a more severe clinical condition than those in *State* 10036, the results showed that if the *action* was "01011", then the *reward* value would be a maximum of 4 units. In contrast, for patients in *State* 10037 at Stage 2, an intervention with only herbal medicine (*action* labeled as "00001") resulted in the highest *reward* of 4 units. For patients in *State* 10038 at Stage 1, a "10001" *action* resulted in a *reward* of 6.28 units at t_{2}, whereas the *action* "10001" at Stage 2 resulted in 4.67 units of *reward* at t_{3}.

Patients in *States* 10031, 10032, and 10033 (see Table 9) all had a TCM pattern of *Yang*, whereas those in *States* 10036, 10037, and 10038 had a TCM pattern of *Yin*.

The results in the first row of Table 9 show that by combining TCM treatments for clearing heat and extinguishing wind (*Qing Re Xi Feng*) (labeled as *a*_{3}) with herbal medicine (labeled as *a*_{5}), the best reward value at Stage 1 for patients in *State* 10031 was 1 unit. At Stage 2, patients in the same *State* 10031 may have needed a treatment of antiplatelet agents (*a*_{1}) together with TCM treatments to relax the bowels (*a*_{4}) and *a*_{5}, forming the *action* labeled "10011", to gain the maximum reward value. It seems that for *State* 10033, in which patients tended to have a more severe clinical condition, the two *actions* that involved TCM therapeutic interventions achieved the best *rewards*.
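The five-digit action labels used above can be read as one on/off flag per treatment component. The sketch below decodes a label into its components; the component descriptions are paraphrased from the text, the position of *Yi Qi Wen Yang* as *a*_{2} is inferred from the "01011" example rather than stated explicitly, and `decode_action` is a hypothetical helper.

```python
COMPONENTS = (
    "a1: antiplatelet and/or anticoagulant agents",
    "a2: Yi Qi Wen Yang TCM treatment",   # position inferred, see lead-in
    "a3: Qing Re Xi Feng TCM treatment",
    "a4: TCM treatment to relax the bowels",
    "a5: herbal medicine",
)


def decode_action(label):
    """Return the treatment components switched on in a 5-digit label."""
    if len(label) != len(COMPONENTS) or set(label) - {"0", "1"}:
        raise ValueError(f"not a valid action label: {label!r}")
    return [name for bit, name in zip(label, COMPONENTS) if bit == "1"]


stage2_action = decode_action("10011")
stage1_action = decode_action("00101")
```

Read this way, "10011" combines *a*_{1}, *a*_{4}, and *a*_{5}, matching the Stage 2 description for *State* 10031, and "00101" combines *a*_{3} with *a*_{5}.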

Based on inpatient EHR, MDPs were applied to describe and analyze the dynamic process of different combinations of TCM treatments and/or integrated treatments of TCM and Western medicine for patients with AIS, and to determine the optimal treatment combination for each *State *by comparing the *rewards *gained from the corresponding *actions*. To the best of our knowledge, no similar topic has been previously addressed in the field of integrative medicine (IM) or in complementary and alternative medicine (CAM).

No medication has yet been confirmed to have neuroprotective effects in the management of patients with AIS [28]. Although aspirin, an antiplatelet agent, can reduce the risk of mortality and morbidity when administered within 48 hours after the onset of stroke, it cannot be used in the up to 28% of patients with aspirin "resistance" [29]. The management of patients with AIS with heparin carries an increased risk of bleeding complications [30]. The use of intravenous recombinant tissue plasminogen activators (rt-PA) in cerebral infarctions is associated with improved outcomes, but cannot be used as a routine therapy outside special units [31].

Several commonly used and government-approved traditional Chinese patent medicines (TCPMs), such as *Ginkgo biloba* [32], milk vetch [33,34], Mailuoning [35], Qingkailing [36], and Danshen [37] agents, have shown promising effects for ischemic stroke. However, no definite conclusions can be drawn from studies of these agents due to a general lack of reporting on methodology [30,38-40]. Properly designed clinical research to study the role of traditional medicine in ischemic stroke is warranted, but a number of issues must first be addressed in the design of such studies [41]. One of these issues is complex interventions involving varying dosages and interactions. Randomized controlled trials (RCTs) are a possible approach to evaluating complex interventions as a whole compared with an appropriate alternative [42], but cannot separate the benefits of different combinations of components. The multi-component structure of treatments is closer to real-world practice, especially in therapy for stroke, which has complex dynamics from onset through progression [43]. Moreover, the model of applying a treatment without any change through the whole course of acute stroke is inconsistent with the basic theory of TCM, whereby treatment is altered according to syndrome differentiation [15,44].

The results of this study indicate that the new method of MDPs may prove useful for comparative effectiveness research (CER). MDPs can be applied to dynamically compare the effectiveness of various combinations of complex treatments, and may be able to overcome the uncertainties related to individual patients' responses to certain combinations of treatments and the uncertainties concerning dynamic changes in treatment for certain patients over the course of disease [21-23,45].

Past research implies that herbal medicine may possess neuroprotective properties [46,47], protect against ischemic reperfusion injury [48,49], reduce edema in the brain [48], improve cerebral microcirculation [33,47], and inhibit apoptosis [50]. Such properties may partly explain the effectiveness of the combinations of treatments identified in this research.

This study has several limitations. First, all of the data were taken from EHR, and missing data are inevitable. The amount of missing data was less than 1.12% in most categories, although up to 18.39% of data were missing for *i*_{6}. As *i*_{6} is a key variable in describing the *rewards* of *actions*, the results should be interpreted cautiously because of the possible bias caused by the replacement of missing data. In addition, because herbal prescriptions were too varied to model separately, all components of herbal medicine were classified as one *action*. As a result, the effectiveness of different herbal prescriptions cannot be compared. Another limitation is that each patient's record was divided into two stages according to three time points, with each episode being regarded as an independent sample when modeled by MDPs. This is consistent with the Markov property of *non-after effect* according to the basic theory of MDPs, but it may, to a certain extent, ignore potential correlations between episodes obtained from the same patient at different stages. Finally, although the key characteristics representing the patient states were based on the results of an expert panel meeting, the states of patients with acute ischemic stroke are variable, and it is likely that some characteristics that might be important for certain patients were missed.

MDPs can be used as a new method for comparative effectiveness research on TCM. This new approach makes it possible to compare the effectiveness of certain combinations of treatments dynamically by considering *state*, *action*, and *reward *simultaneously. The method can be applied to optimize medical intervention combinations and to support clinical decision-making. However, the optimal interventions obtained by the MDPs in this study require further validation in clinical practice. The results from the MDP model should be interpreted with caution both due to the property of the MDPs themselves and because of possible bias that may have been generated either from the data collection or the data management. Further exploratory studies with MDPs in other areas in which complex interventions involving TCM, Western medicine, or a combination of both are common would be worthwhile.

The authors declare that they have no competing interests.

DRW, XPG: Study design, analysis and interpretation, drafting and revision of the article, and final approval for submission. YFC, JXC, YQZ, MZ: Study design, acquisition and cleaning of the data, revision of the article, and final approval for submission. QLL, JHC, YHH, LEY: Study design, data analysis, drafting and revision of the article, and final approval for submission. YBL: Study design, drafting and revision of the article, and final approval for submission. All authors read and approved the final manuscript.

The pre-publication history for this paper can be accessed here:

**Appendix 1**. State, Action and corresponding values.


**Appendix 2**. Clinical Neurological Functional Impairment Assessment for Stroke Patients.


**Appendix 3**. Traditional Chinese Patent Medicine (TCPM) and Western medicine.


This work was supported by the Scientific Research Project of Public Welfare Industry, State Administration of Traditional Chinese Medicine of the P. R. China (No. 200707004); the Finance Department of Guangdong Province (No. [2006]143); the National Natural Science Foundation of China (NSFC); and the Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (GDUPS, 2011).

