Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC3042487

Formats

Article sections

- Abstract
- 1. Introduction
- 2. Anticipating “shock” disturbances; shock-anticipating LQR (SA-LQR)
- 3. SA-LQR with behavioral profiles
- 4. In silico preclinical trials
- 5. In silico comparison of SA-LQR to conventional therapy
- 6. Conclusions
- REFERENCES

Authors

Related links

Comput Methods Programs Biomed. Author manuscript; available in PMC 2012 May 1.

Published in final edited form as:

Published online 2010 June 19. doi: 10.1016/j.cmpb.2010.04.011

PMCID: PMC3042487

NIHMSID: NIHMS235669

The publisher's final edited version of this article is available at Comput Methods Programs Biomed

See other articles in PMC that cite the published article.

Automatic control of Type 1 Diabetes Mellitus (T1DM) with subcutaneous (SC) measurement of glucose concentration and subcutaneous (SC) insulin infusion is of great interest within the diabetes technology research community. The main challenge with the so-called “SC–SC” route to control is sensing and actuation delay, which tends to either destabilize the system or inhibit the aggressiveness of the controller in responding to meals and exercise. Model predictive control (MPC) is one strategy for mitigating delay, where optimal insulin infusions can be given in anticipation of future meal disturbances. Unfortunately, exact prior knowledge of meals can only be assured in a clinical environment and uncertainty about when and if meals will arrive could lead to catastrophic outcomes. As a follow-on to our recent paper in the IFAC symposium on Biological and Medical Systems (MCBMS 2009) [1], we develop a control law that can anticipate meals given a probabilistic description of the patient’s eating behavior in the form of a random meal (behavioral) profile. Preclinical *in silico* trials using the oral glucose meal model of Dalla Man et al. show that the control strategy provides a convenient means of accounting for uncertain prior knowledge of meals without compromising patient safety, even in the event that anticipated meals are skipped.

Recent national data show that nearly 21 million Americans have diabetes—a lifelong condition that affects people of every age, race, and nationality, and is the leading cause of kidney failure, blindness, and amputations not related to injury. Approximately 1.5 million of these people have Type 1 Diabetes Mellitus (T1DM), in which the immune system destroys the pancreatic beta cells, permanently suppressing insulin secretion. A typical T1DM patient will need 50,000 insulin shots over his/her lifetime, accompanied by testing of blood glucose levels several times a day. Automatic control of blood glucose has been studied for more than three decades and widely different solutions have been proposed. We refer to [2,3] for a review of the early literature, most of which focuses on the case of intravenous (IV) glucose measurement and IV actuation, with both positive (glucose) and negative (insulin) control variables.

As an extension to [1], this paper addresses the challenge to develop effective closed loop systems for regulating blood glucose concentration, via the “SC–SC” route to control, i.e. using commercially available subcutaneous Continuous Glucose Monitoring (CGM) sensors and conventional subcutaneous insulin pump devices. The SC–SC control problem is more challenging than control with intravenous measurement of glucose and insulin infusion, for several reasons. First, because measurement and actuation are achieved via interstitial fluids (and not directly in the bloodstream) there are significant delays associated with this mode of control. State of the art CGM technology relies on sensing glucose in interstitial fluids, which reflects blood glucose with an unknown time lag (between 0 and 45 min [4–6]). Also, the delay introduced by conventional (interstitial) insulin infusion pumps can lead to a limit on control performance, hindering the system’s ability to reject major disturbances, specifically meals and exercise.

Model Predictive Control (MPC) has been proposed as a means of overcoming measurement and actuation delay in either IV- or SC-based systems, cf. [2,7,8], where control actions are computed at update intervals based on model predicted optimal future evolution of the system taking into account knowledge about future disturbances. While clinical evaluation of these algorithms is underway, there are clearly many obstacles to automated control of blood glucose *outside of the clinic*, including the need for on-going adaptation to the patient’s time-varying physiological parameters and behavior. In this work we address this issue by developing a control methodology that plans insulin delivery given a probabilistic model of meal timing and meal amounts, which can be learned from historical data.

Run-to-run control methods adopted from the chemical process control literature have also been proposed as a means of responding to time-varying aspects of the patient’s behavior and physiology. Recent studies have shown that, even with infrequent sampling of blood glucose, run-to-run control can learn bolus and basal rate profiles that are optimal (on average) in the face of time-varying aspects of the patient’s daily routine (e.g. timing of meals), cf. [9–11], and circadian fluctuation in insulin resistance (e.g. dawn phenomenon), cf. [12]. The approach developed in this paper differs from run-to-run control in that it employs an explicit probabilistic model of patient meal behavior (including the possibility that the meal will be skipped) and accordingly plans a model-based dynamic response to both measured glucose and anticipated meals.

To outline the sequel, in Section 2, we derive a general control law that optimizes an expected quadratic cost criterion for a linear system subject to a random single shock disturbance, which for this paper represents a potential meal with unknown timing. We set up our Shock-Anticipating Linear Quadratic Regulator (SA-LQR) methodology in Section 3, introducing the notion of a random meal profile characterizing the likelihood, timing, and carbohydrate (CHO) content of the *next* meal and reviewing the glucose–insulin kinetics model used to derive our control law, where state observation is employed. Section 3 also describes a hybrid open-closed loop control strategy, where a closed loop control is used in anticipation of meal arrival, while a meal bolus which is informed by insulin-on-board is delivered at the time of meal arrival. In Section 4, we illustrate the benefits of our control strategy for a breakfast meal scenario. We focus in Section 4 on controller performance in cases where the anticipated meal fails to arrive. Throughout, we focus exclusively on the use of insulin for managing blood glucose concentration, ignoring the opportunity to infuse other metabolic hormones such as glucagon, which can provide a so-called “positive” control effect; however, our control methodology can be extended to this setting provided a suitable model of glucagon effect. In Section 5, we compare the SA-LQR strategy to an open loop conventional therapy approach, in which we deliver basal rate continuously and a meal bolus at mealtime. In Section 6, we wrap up the paper with brief conclusions and directions for future work. Finally, we prove the main technical result introduced in Section 2 in the Appendix.

In the context of control of T1DM, meals are the primary disturbance. These meal disturbances occur at random times with random magnitudes and can be treated as “shock” disturbances.

Here, we derive a generic optimal linear quadratic control law that anticipates a single random “shock” disturbance that, if it arrives at all, will arrive in stages *k* = 0 through . The setup here is general: we assume that the plant is linear, time-invariant, with *x*(*k*) ^{nx} (where throughout, *x*(*k*) is our best estimate of the state *$\widehat{x}$*(*k*), which we formulate in 3.1), *u*(*k*) ^{nu}, and ω(*k*) ^{nd}. Let + *w* ^{nd} denote the “shock disturbance,” which arrives (or not) according to the conditional probability of arrival *p _{k}* given that the shock has not arrived prior to stage

We assume that the controller is aware of the “shock” disturbance when it arrives, and consequently we augment the state vector *x*(*k*) of the dynamical system (18) with an additional “disturbance state” *m*(*k*) that provides an indication of whether the disturbance is still possible (*m*(*k*) = 0 if “yes,” *m*(*k*) = 1 if “no”). The augmented dynamical equations are as follows:

$$x(k+1)=\mathit{\text{Ax}}(k)+\mathit{\text{Bu}}(k)+G\omega (k)$$

(1)

$$m(k+1)=\text{max}\{m(k),{I}_{k}\}$$

(2)

where *I _{k}* is the indictor variable for the event that the shock arrives in stage

$$J(x(0),m(0))=\mathbf{E}\phantom{\rule{thinmathspace}{0ex}}\left\{{\displaystyle \sum _{t=0}^{\infty}[x(t)\prime \mathit{\text{Qx}}(t)+u(t)\prime \mathit{\text{Ru}}(t)]}\right\}$$

(3)

where *Q* and *R* are both symmetric and positive semidefinite and positive definite, respectively. Let *C* denote the “square root” of *Q*, such that *Q* = *C*′*C*. Also, for notational convenience in the sequel, let

$${J}_{k}^{*}(x,m)\equiv \text{min}\phantom{\rule{thinmathspace}{0ex}}\mathbf{E}\phantom{\rule{thinmathspace}{0ex}}\left\{{\displaystyle \sum _{t=k}^{\infty}[x(t)\prime \mathit{\text{Qx}}(t)+u(t)\prime \mathit{\text{Ru}}(t)]}\right\}$$

(4)

where the expectation above is conditional on *x*(*k*) = *x* and *m*(*k*) = *m*. For states where *m*(*k*) = 1 the shock disturbance is no longer possible, and the problem of minimizing the infinite horizon quadratic cost-to-go amounts to a standard LQR problem. However, for states where *m*(*k*) = 0 the shock may arrive at some future time, and the minimization of expected quadratic costs involves an optimal response in *anticipation* of the disturbance. The basic idea is:

$${J}_{k}^{*}(x,1)=\text{optimal LQR cost-to-go from}\phantom{\rule{thinmathspace}{0ex}}x,\times \phantom{\rule{thinmathspace}{0ex}}\text{with no more future disturbances}=x\prime \mathit{\text{Px}}$$

and

$$\begin{array}{cc}{J}_{k}^{*}(x,0)\hfill & =\text{optimal cost-to-go from}\phantom{\rule{thinmathspace}{0ex}}x,\text{with anticipation of a}\phantom{\rule{thinmathspace}{0ex}}\mathit{\text{possiblefuturedisturbance}}\hfill \\ \hfill & =\underset{u}{\text{min}}[E[{p}_{k}{J}_{k+1}(\mathit{\text{Ax}}+\mathit{\text{Bu}}+G\tilde{d},1)]]+[[(1-{p}_{k}){J}_{k+1}(\mathit{\text{Ax}}+\mathit{\text{Bu}},0)]]\hfill \end{array}$$

The following proposition establishes the form of both ${J}_{k}^{*}(x,m)$ and the optimal control law in detail.

**Proposition 1.** *Given that the pair* (*A*, *B*) *is controllable and the pair* (*A*, *C*) *is observable, the optimal control law is uniquely characterized as follows*:

*At states where**m*(*k*) = 1,$${J}_{k}^{*}(x,1)=x\prime \mathit{\text{Px}}$$(5)*and the minimizing control action at state*(*x*, 1)*in stage k is*$${u}_{k}^{*}(x,1)=-\mathit{\text{Kx}}$$(6)*where P is the unique positive semidefinite solution to the Riccati equation*$$P=A\prime (P-\mathit{\text{PB}}{\Psi}^{-1}B\prime P)A+Q$$(7)*with*Ψ =*R*+*B*′*PB*, and*K*= Ψ^{−1}*B*′*PA*.*At states where**m*(*k*) = 0,*the optimal expected cost-to-go function takes the form*$${J}_{k}^{*}(x,0)=x\prime \mathit{\text{Px}}+\tilde{d}\prime {H}_{k}x+\tilde{d}\prime {L}_{k}\tilde{d}+{\Xi}_{k}$$(8)*and the minimizing control action at state*(*x*, 0)*in stage k is*$${u}_{k}^{*}(x,0)=-\mathit{\text{Kx}}-{\tilde{K}}_{k}\tilde{d}$$(9)*where**H*,_{k}*L*, Ξ_{k}_{k}, and,_{k}*d*erive from the following.*Backwards recursion*:*Stage**k*= :$${\tilde{K}}_{\overline{k}}={p}_{\overline{k}}{\Psi}^{-1}B\prime \mathit{\text{PG}}$$(10)$${H}_{\overline{k}}=G\prime [2{p}_{\overline{k}}P-2{p}_{\overline{k}}\mathit{\text{PB}}{\Psi}^{-1}B\prime P]A$$(11)$${L}_{\overline{k}}=G\prime [{p}_{\overline{k}}P-{p}_{\overline{k}}^{2}\mathit{\text{PB}}{\Psi}^{-1}B\prime P]G$$(12)$${\Xi}_{\overline{k}}={p}_{\overline{k}}{\mathbf{E}}_{w}((G\tilde{w})\prime P(D\tilde{w}))$$(13)*Stage**k*< :$${\tilde{K}}_{k}={\Psi}^{-1}B\prime {Q}_{k}$$(14)$${H}_{k}=2{Q}_{k}^{\prime}(I-B{\Psi}^{-1}B\prime P)A$$(15)$${L}_{k}={p}_{k}G\prime \mathit{\text{PG}}+{q}_{k}{L}_{k+1}-{Q}_{k}^{\prime}B{\Psi}^{-1}B\prime {Q}_{k}$$(16)$${\Xi}_{k}={p}_{k}{\mathbf{E}}_{w}[w\prime G\prime \mathit{\text{PGw}}]+{q}_{k}{\Xi}_{k+1}$$(17)*where**Q*= (1 −_{k}*p*)(_{k}*H*_{k+1})′/2 +*p*._{k}PG

The proof of the proposition employs familiar techniques used in the analysis of linear-quadratic control models. We refer to the Appendix B for the formal arguments.

Our control strategy is motivated from the realization that historical data about a patient’s eating behavior can be used to characterize the timing and likelihood of a meal disturbance within pre-specified meal regimes (e.g. breakfast), along with the magnitude of the disturbance. We refer for example to [14], which outlines a method for establishing a probabilistic meal-pattern profile for individuals based on CGM data. “Profile” information can be used in planning insulin injections prior to the arrival of meals, as long as the uncertainties involved are appropriately considered. In Section 3.4 we also formulate an insulin-informed control action to be taken *at the time* of the disturbance arrival.

As a model for the *next meal* we assume that the patient will experience up to a single meal and that the meal, if it occurs, will involve a random total carbohydrate load of *D _{meal}* (g CHO) ingested in one sampling period within the discrete interval between stages 0 and . Thus, if a meal is taken at stage

As was introduced in Section 2, *p*(*k*) denotes the probability that the meal will “arrive” in stage *k*, given that the meal has not occurred up to and including stage *k* − 1. Note that ω(*k*) is not a white noise process; it is, in fact, highly correlated in time, since a meal can occur in up to one stage of the process. In practice, the conditional probabilities *p _{k}* would come from observation of the patient’s eating patterns. For example, a diary of eating behavior can be used to identify typical meal regimes for individual patients. Note that if $F\equiv {\displaystyle {\sum}_{i=1}^{\overline{k}}{f}_{i}<1}$, then the patient occasionally misses the meal and 1 −

The model from which our controller is derived can be traced back to the compartmental “minimal model” of [15]. The model employed here is an 8-state model with extensions which account for subcutaneous oral glucose sensing and actuation, along with oral glucose ingestion. The coefficients of the model reflect population-average glucose–insulin dynamics with a 15 min sample period (Δ = 15). We can express the linearized, discretized implementation of this model in state space form:

$$x(k+1)=\mathit{\text{Ax}}(k)+\mathit{\text{Bu}}(k)+G\omega (k)$$

(18)

where state space matrices *A*, *B*, and *G* are derived from a linearized, discretized form of the model, with model parameters chosen to be representative of an “average” adult patient with T1DM. The matrix values are given in Appendix A.

Additionally, *u*(*k*) = *J _{ctrl}*(

Because we cannot measure the state of this system at a given stage *k*, we estimate the state *$\widehat{x}$*(*k*) of the system using a metabolic state observer. The state observer is based on a steady state Kalman filter. Since only CGM measurements are available each minute, it is necessary to compute estimates *$\widehat{x}$*(*k*) of *x*(*k*) based on the knowledge of infused insulin rate *u*(*k*) and measurements *y*(*k*), where

$$y(k)=\mathit{\text{CGM}}(k)-{G}_{\mathit{\text{ref}}}$$

(19)

where *CGM*(*k*) (mg/dl) is the readout of the CGM at stage *k*. We model the measurement signal as

$$y(k)=\mathit{\text{Cx}}(k)+v(k)$$

(20)

where *v*(*k*) (mg/dl) represents CGM signal noise and the matrix *C* is given by

$${C}^{T}=[1,0,0,0,0,0,0,0]$$

(21)

The metabolic state observer is derived as a Kalman filter based on the state space model of Eq. (18), where, for convenience, we treat ω(*k*) (the meal disturbance process) as a zero-mean Gaussian process with covariance *R _{s}* = 1. (We do this despite the fact that in the SA-LQR methodology we treat ω(

$$\hat{x}(k|k-1)=A\hat{x}(k-1|k-1)+\mathit{\text{Bu}}(k-1)$$

(22)

$$\hat{x}(k|k)=\hat{x}(k|k-1)+{L}_{f}(y(k)-C\hat{x}(k|k-1))$$

(23)

where *$\widehat{x}$*(*k*|*k* − 1) refers to the best estimate of *$\widehat{x}$*(*k*) using data collected up to stage *k* − 1, *$\widehat{x}$*(*k*) refers to the best estimate of *x*(*k*) using data collected up to stage *k*, the filter gain matrix

$${L}_{f}={\mathit{\text{AP}}}_{f}{C}^{T}{({\mathit{\text{CP}}}_{f}{C}^{T}+{R}_{s})}^{-1}$$

(24)

the estimate update matrix

$${M}_{f}={P}_{f}{C}^{T}{({\mathit{\text{CP}}}_{f}{C}^{T}+{R}_{s})}^{-1}$$

(25)

the matrix *P _{f}* is the unique stabilizing solution to the algebraic Riccati equation

$${A}^{T}\mathit{\text{PA}}-{A}^{T}\mathit{\text{PG}}{({G}^{T}\mathit{\text{PG}}+{R}_{s})}^{-1}{G}^{T}\mathit{\text{PA}}+{Q}_{s}=P$$

(26)

In keeping with the discretization used to develop the model, our controller operates by sampling glucose (via CGM) every 15 min and computing a insulin infusion rates via Proposition 1, holding the rate constant between samples. For our control strategy, we apply our best estimate *$\widehat{x}$*(*k*) of the state *x*(*k*) in Proposition 1, except at the stage of meal arrival, when we deliver an open loop meal bolus as described in Section 3.4 below.

Section 2 gives us a method for *anticipating* meal arrival. The aggressiveness of this anticipatory effect is driven by the construction of the matrix *Q* introduced in Eq. (3). In this context, *Q* is taken to be diagonal, with a weight *q* > 0 associated with glucose states and a weight of zero associated with meal disturbance states; we set *R* = 1. The weight parameter *q* is a tuned patient-specific parameter (the only such parameter in the controller) that determines the aggressiveness with which the controller rejects glucose deviations away from the target operating point (110 mg/dl). At the time of meal arrival, we switch our control to an open loop strategy which is consistent with the action taken by a patient in a conventional therapy setting. At the time of meal arrival, assumed to be known, we deliver an open loop meal bolus which takes into account the insulin that has been delivered in anticipation of the meal, but has yet to act. This insulin is referred to as insulin-on-board (IOB) and we compute the IOB at stage *k* as follows:

$$\text{IOB}(k)={\text{IOB}}_{\mathit{\text{curve}}}\xb7[u(k),u(k-1),u(k-2),\dots ,u(k-24)]\prime $$

(27)

where IOB_{curve} is a vector containing entries which correspond to the proportion of a given insulin injection remaining at stage *k* (adapted from [16]) and [*u*(*k*), *u*(*k* − 1), *u*(*k* − 2), …, *u*(*k* − 24)] is a vector which contains 6 h of insulin injection history with a sample time of 15 min, where IOB(*k*) is the assessment of the insulin-on-board above the basal rate at stage *k*. Using this information, the meal bolus, *U*(*k*) (U), given at meal time is computed as:

$$U(k)=d(\mathit{\text{meal}})\xb7\text{CR-IOB}(k)$$

(28)

where *d*(*meal*) is the amount of the meal taken at stage *k* (in g CHO) and *CR* is the carbohydrate ratio of the subject (U/g CHO). (We note that the total amount of the meal bolus will be Δ · *U*(*k*), as we hold the injection constant over the Δ–minute interval of stage *k*. This meal bolus covers a meal of size *D*(*meal*) = *d*(*meal*) · Δ, where Δ in this formulation is 15 min.) Note that Eq. (28) determines the size of the meal bolus based in part on how much anticipatory insulin has been delivered and when it is delivered. The total amount of insulin can vary from subject to subject and according to different values of *p*(*skip*). In the following section, we evaluate the controller performance where we implement our anticipatory closed loop strategy prior to meals and the open loop meal bolus described by Eq. (28) at mealtime.

In this section, we evaluate the “hybrid” SA-LQR algorithm using a computer simulation environment based on the oral glucose “meal model” of [17,18] equipped with a population of 100 *in silico* adult patients with T1DM, focusing particularly on the controller’s ability to anticipate a breakfast meal without endangering the patient if the meal is skipped. Because the meal model parameters were developed under the assumption of a fixed “mixed” meal composition, we assume that meals are composed of a fixed percentage of protein, fat, and carbohydrates (45% carbohydrate, 15% protein, and 40% fat). The controller that we test is derived from an instance of the model of Eq. (18), representative of an “average” patient with T1DM. In keeping with the discretization used to develop the model, our controller operates by sampling glucose (via CGM) every 15 min and computing insulin infusion rates via Proposition 1, holding the rate constant between samples. In Section 4.3, we show illustrative results for a representative subject with sensor noise. In Section 4.4, we show aggregate results for the population of 100 *in silico* subjects. All simulations are conducted with sensor noise, where we use the CGM model proposed in [19], feeding the corrupted CGM data into the controller. Section 5 compares the SA-LQR strategy to an open loop conventional therapy approach to insulin delivery.

The *in silico* experiments presented here reveal performance characteristics of the controller in a protocol focused on the anticipation of the breakfast meal. The protocol runs from midnight to 12:30 p.m. and includes one meal at time 420 min (7 a.m.). The breakfast meal, if it arrives at all, will take place at minute 420 of the protocol and will involve 55 g CHO.

Table 1 shows the meal profile information used in the backwards recursion of Proposition 1 in planning insulin delivery prior to breakfast. Note that the profile for breakfast describes an earliest-possible meal arrival time of 240 min and a latest possible meal time of 525 min. The relative frequency distribution *f _{k}* within this interval is derived from a clipped Gaussian distribution centered at the most likely time for breakfast, 420 min. The standard deviation of the Gaussian distribution before clipping is 30 min. The distribution

The breakfast meal regime starts at minute 0, and at this time uses the backwards recursion of Proposition 1 to compute a feedback gain matrix *K* and set of feedforward gains * _{k}* for the 15 min sampling intervals

Here we illustrate the SA-LQR controller as applied to a representative subject within the *in silico* population of adult patients with T1DM, using the experimental scenario and meal profile information above. Of course, performance of the controller will depend strongly on the accuracy of the sensor. Fig. 1 illustrates the case where the meal arrives at minute 420, with the top plot showing the rate of insulin delivery (U/h) and the bottom plot showing BG (mg/dl). Fig. 2 shows the plots of insulin delivery rate and BG for the same subject in the case of a skipped meal, where we focus on the anticipatory insulin effect.

Notice that in either scenario, whether breakfast is taken or skipped, a profile with *p*(*skip*) = .5 injects less insulin in anticipation of the meal. Because there is high level of uncertainty associated with the meal occurrence, there is less anticipatory insulin delivery to guard against a skipped meal scenario. Thus in the case when *p*(*skip*) = .5 and the meal is in fact skipped, the subject is able to maintain a safe level of blood glucose, while in the case of *p*(*skip*) close to zero, the subject experiences a lower glucose minimum and is at greater risk of hypoglycemia. In any case, there is some level of anticipatory insulin delivery which precedes the meal bolus delivery. Table 2 gives the meal bolus amounts for the breakfast meal when it does arrive.

Referring to Table 2, in the case where *p*(*skip*) = .5 and the meal does occur, the subject experiences the largest meal bolus delivery at meal time, to compensate for insulin that was not delivered in anticipation of the meal; that is, IOB assessment is smallest for the case of *p*(*skip*) = .5 at the time of meal arrival. In the case of *p*(*skip*) close to zero, a more aggressive anticipatory insulin delivery results in a larger IOB value at the time of meal arrival, thus yielding the smallest meal bolus at mealtime. Note that the insulin trace of Fig. 2 (for the case the that breakfast meal is skipped) shows a gradual decline at minute 450 lasting to the end of the breakfast meal regime, which can be explained by the fact that the controller recognizes that the most likely time of the meal has passed, and the chance of the breakfast meal occurring after this most likely time gradually decreases until minute 525, the last possible arrival time for breakfast.

The above representative subject results suggest that, by designing the controller around a meal profile with a significant probability that the meal will be skipped, we can obtain control postprandial performance without the threat of severe hypoglycemia when the skipped-meal event actually occurs. In the case of a skipped meal, we are considering the most extreme scenario. We hypothesize that the case of a delayed meal would lead to results which fall somewhere between the skipped meal and the nominal scenario (when the anticipated meal amount is taken at the most likely time). Alternatively, if a meal is taken which is not accounted for in our behavioral profile, the linear quadratic Gaussian controller will inject insulin in reaction, as opposed to anticipation, to an increase in the blood glucose level.

The results of Section 4.4 confirm this observation for an *in silico* population of 100 adult patients with T1DM.

Here we illustrate the SA-LQR controller applied to the full *in silico* population of adults with T1DM, again with sensor noise. Table 3 summarizes the results of the study reporting max and min glucose concentrations from the onset of the protocol to just following the postprandial peak of the breakfast meal (focusing on controller performance leading up to and just after breakfast).

Looking at Table 3, we see that for larger values for *p*(*skip*), the controller delivers less insulin in anticipation of the meal on average, a strategy that protects the subject from hypoglycemia in the event that the meal is skipped. This can be seen in the mean glucose minimum of 90.81 mg/dl in Table 3 for *p*(*skip*) = .5. Because we compensate for a less aggressive anticipatory insulin delivery by delivering a larger meal bolus at meal time (when the meal arrives) in the case of *p*(*skip*) = .5, we avoid a higher postprandial peak that could result from this weaker anticipatory insulin delivery. Alternatively, when we consider the results for *p*(*skip*) close to zero, results indicate that when the meal is skipped, a greater risk of hypoglycemia results (a mean glucose min of 62.22 mg/dl). A *p*(*skip*) close to zero represents an essentially deterministic meal profile, where we control with near certainty the timing and the presence of the meal. Because the controller anticipates that a meal will occur with near certainty, this skipped meal poses an unexpected lack of meal disturbance for the controller. From this population study, we can infer that some optimal value of *p*(*skip*) exists so as to control both hypoglycemia (in the event that no meal occurs), as well as hyperglycemia (when a meal does occur). By viewing the probability of a skipped meal in this way, *p*(*skip*) is a parameter that can be tuned to optimize control. From the perspective of hypoglycemia avoidance and safety, we choose a larger *p*(*skip*) when there is uncertainty as to whether or not a subject will eat in a given regime. Thus choosing a larger *p*(*skip*) allows for us to control in a safe way while also taking advantage of our ability to anticipate the arrival of a meal to improve postprandial performance.

Fig. 3 presents population outcomes using the Control-Variability Grid Analysis technique of [20]. The plot shows each patient’s peak post-breakfast glucose concentration (given that breakfast arrives) plotted against the minimum glucose concentration associated with a skipped breakfast, for *p*(*skip*) = .0001 (blue) versus *p*(*skip*) = .5 (pink). (Each patient appears twice, once as a pink dot and once as a blue dot.) Despite the uncertainty about whether breakfast will arrive, a significant fraction of the population lands in the A-zone of the plot where peak glucose is below 180 mg/dl (after breakfast) and above 80 mg/dl (when breakfast is skipped).

In this section, we compare results obtained when we employ the SA-LQR strategy to those results obtained when a patient is subjected to open loop therapy. We consider again the scenario presented in Section 4, except that in this case, we investigate variability in the standard deviation of the meal arrival time, while keeping *p*(*skip*) fixed at a value of *p*(*skip*) = .1 (Table 4).

In this section, we consider only the case when the breakfast meal is taken. The purpose of this comparison is to demonstrate that, when we employ anticipatory insulin delivery in a closed loop control strategy, we can improve glucose performance by reducing the postprandial glucose concentration when compared to an open loop strategy. The open loop control strategy applied here is one in which basal insulin is delivered continuously to maintain a steady state glucose concentration of 112.5 mg/dl and at mealtime, a meal bolus is delivered. The amount of this bolus (U) is computed based on the known meal amount (g CHO) and the subject’s carbohydrate ratio (U/g CHO). This computation is described by Eq. (28), where, in the case of open loop therapy, IOB = 0 at the time of the meal bolus (as IOB measures insulin-on-board *above* the basal insulin).

Fig. 4 shows a comparison of the SA-LQR strategy to the open loop strategy.

Fig. 4 indicates that, in the case when SA-LQR is employed, there is a “ramping up” in the insulin delivery prior to the meal arrival. This increase in the insulin delivery rate prior to the meal arrival drives the glucose concentration down prior to the arrival of the breakfast meal (for this representative subject, the open loop pre-meal glucose concentration is 112.5 mg/dl, compared to the SA-LQR pre-meal glucose concentration of 86.5 mg/dl). The result of this glucose lowering prior to the meal in the case of the SA-LQR strategy is a lower postprandial peak (as compared to the open loop case). In Section 5.2, we confirm these results for the *in silico* adult population.

For the population comparison, mean minimum and maximum BG concentrations (mg/dl) are computed as mean values for the population of 100 adults, where these values are computed for the time prior to the breakfast meal (to obtain the minimum value) and the postprandial peak following the meal arrival (to obtain the maximum value).

Table 5 shows only a small deviation between the mean min and max glucose concentration in the case of the SA-LQR for the three values of σ. For the three cases of σ, there is a small decrease in the mean min and max BG as the value of σ increases. This is a result of the anticipatory insulin delivery beginning earlier in the case of larger σ, as in this case it is more likely that the meal will occur at the earlier stages of the meal regime than in the case of smaller σ values. This being said, the lower minimum BG in the case of larger σ values is slight; this is a result of the fact that, while the meal may occur at the early stages of the meal regime and we need to prepare for that through insulin delivery, it is also more likely that the meal will arrive at the later stages of the meal regime than in the case of smaller σ values. Thus for larger values of σ, the insulin delivery has a greater spread over the meal regime. In the case of a σ value of 1 min, the meal, if it does occur, will occur with near certainty in minute 420; thus anticipation of the breakfast in this case is concentrated over the most likely breakfast arrival time. In contrast to this, the distinction of mean min and max glucose values between the SA-LQR and the open loop strategy is much greater. This is a result of the anticipatory insulin delivery: prior to the meal arrival, the open loop BG (as we expect) is 109.84 mg/dl, while in the case of SA-LQR, anticipatory insulin drives the BG concentration to approximately 96 mg/dl. In the case of the open loop therapy, delivering basal rate insulin and an open loop bolus at meal time keeps the subject’s glucose level at or above 110 mg/dl (the basal steady state level). This translates to a 13 mg/dl difference in the postprandial peaks, where a higher postprandial peak results when no pre-meal insulin is given in the case of open loop therapy. The CVGA results shown in Fig. 5 correspond to the results of Table 5, where we compare a σ of 30 min to the open loop strategy.

CVGA comparing SA-LQR, Std. Dev. 30 min (blue) to open loop (pink); for SA-LQR: (A) 75% and (B) 25%; for open loop: (A) 88% and (B) 12%. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of **...**

In [1], we used the notion of a meal profile to develop a model-based stochastic predictive control law for optimal insulin delivery in anticipation of meals. Our *in silico* study shows that being able to account for uncertainty as to *if* and *when* a meal disturbance will occur allows us to achieve reasonable tradeoffs between aggressive anticipatory insulin treatment and the risk of hypoglycemia in the event that meals are skipped. In this paper, we have extended the methodology of [1] to allow the application of a state space model of glucose insulin kinetics with state estimation via a Kalman filter, and we have modified the postprandial action of the controller by introducing an open loop meal bolus that is calibrated to pre-meal insulin “on-board.” A linear control law applied to nonlinear system dynamics is shown to perform well in this setting. We have also demonstrated the ability of the control strategy to work in a hybrid environment; anticipating insulin delivery through a closed loop control strategy and delivering an insulin-on-board- informed meal bolus at the time of the meal.

The work presented here uses a model-based stochastic predictive control strategy to anticipate the arrival of meals. In effect, this strategy leads to an offset from the glucose target concentration of 110 mg/dl prior to the meal, with a greater offset associated with increased certainty as to whether not the meal will arrive at all. In this way, the SA-LQR strategy can be viewed as similar to a PID controller with time-varying setpoints chosen to reflect certainty about meal arrival and timing. The “effective setpoint reduction” of SA-LQR is achieved in a systematic fashion based on the patient’s behavioral profile. In this sense, the SA-LQR performs optimally under the level of uncertainty we have regarding future disturbances. Future work hopes to address the way that *p*(*skip*) is chosen for each subject, or for each meal regime. We are now considering an implementation of *p*(*skip*) in which the value of *p*(*skip*) is “learned” for each subject based on the subject’s history of past behavior. In this case, the value of *p*(*skip*) is also something that can vary over the course of 1 day, or from 1 day to the next.

This work presents a method for anticipating meals through the use of a behavioral profile. In future work we plan to more fully evaluate controller performance in the face of uncertainty about meal amounts and meal timing. (While Proposition 1 allows for all three types of uncertainty, the simulation results in this paper only considers uncertainty in meal arrival and meal timing.) The controller developed in this paper only looks ahead to the next (potential) meal, one meal regime at a time. In on-going efforts we are extending the methodology to plan over multiple meal regimes, so that the anticipated response to a lunch meal could be influenced by the anticipated response to breakfast, and vice versa. In conjunction with this effort, we plan to reexamine the form and structure of the meal profile itself, thinking of the model of Section 3.1 as a first step toward a more comprehensive description of patient behavior.

$$\begin{array}{cc}A\hfill & =\phantom{\rule{thinmathspace}{0ex}}\left[\begin{array}{cccccccc}\hfill .9913\hfill & \hfill -102.7\hfill & \hfill -1.5\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-8}\hfill & \hfill -2.89\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-6}\hfill & \hfill -4.1\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-4}\hfill & \hfill 0\hfill & \hfill 2.01\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-6}\hfill & \hfill 4.3\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-5}\hfill \\ \hfill 0\hfill & \hfill .839\hfill & \hfill 5.23\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-10}\hfill & \hfill 7.44\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-8}\hfill & \hfill 6.84\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-6}\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill 0\hfill & \hfill .9798\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill 0\hfill & \hfill .0200\hfill & \hfill .9798\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill 0\hfill & \hfill 1.9\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-4}\hfill & \hfill .0180\hfill & \hfill .7882\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill \\ \hfill .0865\hfill & \hfill -4.667\hfill & \hfill -2.73\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-10}\hfill & \hfill -6.59\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-8}\hfill & \hfill -1.26\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-5}\hfill & \hfill .9131\hfill & \hfill 6.00\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-8}\hfill & \hfill 1.90\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-6}\hfill \\ \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill .9083\hfill & \hfill 0\hfill \\ \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill 0\hfill & \hfill .09115\hfill & \hfill .9891\hfill \end{array}\right]\hfill \\ {B}^{T}\hfill & =\phantom{\rule{thinmathspace}{0ex}}[-3.05\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-9},\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}1.34\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-10},\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}.9900,\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}.0100,\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}6.50\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-5},\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}-4.61\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-11},\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}0,\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}0]\hfill \\ {G}^{T}\hfill & =\phantom{\rule{thinmathspace}{0ex}}[6.76\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-7},\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}0,\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}0,\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}0,\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}0,\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}1.52\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{10}^{-8},\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}.9534,\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}.0464]\hfill \end{array}$$

The case where *m*(*k*) = 1 is straightforward. Under the assumptions of the proposition, it is clear (e.g. from [DPvol1]), that ${J}_{k}^{*}(x,1)=x\prime \mathit{\text{Px}}\prime \phantom{\rule{thinmathspace}{0ex}}\text{and}\phantom{\rule{thinmathspace}{0ex}}{u}_{k}^{*}(x,1)=-\mathit{\text{Kx}}$, where *P* and *K* are as specified. The result for *m*(*k*) = 0 follows by mathematical induction, as argued in the following lemmas.

**Lemma 2.** *For* *k* = , *the expected cost to go function* ${J}_{\overline{k}}^{*}(x,0)$ *is of the form*

$${J}_{\overline{k}}^{*}(x,0)=x\prime \mathit{\text{Px}}+\tilde{d}\prime {H}_{\overline{k}}x+\tilde{d}\prime {L}_{\overline{k}}\tilde{d}+{\Xi}_{\overline{k}}$$

(B.1)

*where*

*H*_{}=*G*′[2*p*_{}*P*− 2*p*_{}*PB*(*R*+*B*′*PB*)^{−1}*B*′*P*]*A*,- ${L}_{\overline{k}}=G\prime [{p}_{\overline{k}}P-{p}_{\overline{k}}^{2}\mathit{\text{PB}}{(R+B\prime \mathit{\text{PB}})}^{-1}B\prime P]G$,
- Ξ
_{}=*p*_{}**E**_{w}((*G*)′*P*(*D*)),

*and the optimal control action at this state is*

$${u}_{\overline{k}}^{*}(x,0)=-{\Psi}^{-1}(B\prime \mathit{\text{PAx}}+{p}_{\overline{k}}B\prime \mathit{\text{PG}}\tilde{d})$$

(B.2)

*where* Ψ = *R* + *B*′*PB*.

**Proof.** By straightforward dynamic programming:

$$\begin{array}{cc}{J}_{\overline{k}}^{*}(x,0)\hfill & =\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+{p}_{\overline{k}}{\mathbf{E}}_{w}({J}_{\overline{k}+1}^{*}(\mathit{\text{Ax}}+\mathit{\text{Bu}}+G(\tilde{d}+w),1))+(1-{p}_{\overline{k}}){J}_{\overline{k}+1}^{*}(\mathit{\text{Ax}}+\mathit{\text{Bu}},1)]\hfill \\ \hfill & =\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+{p}_{\overline{k}}{\mathbf{E}}_{w}((\mathit{\text{Ax}}+\mathit{\text{Bu}}+G(\tilde{d}+w))\prime \phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}P(\mathit{\text{Ax}}+\mathit{\text{Bu}}+G(\tilde{d}+w)))+(1-{p}_{\overline{k}})(\mathit{\text{Ax}}+\mathit{\text{Bu}})\prime \phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}P(\mathit{\text{Ax}}+\mathit{\text{Bu}})]\hfill \\ \hfill & =\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+(\mathit{\text{Ax}}+\mathit{\text{Bu}})\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}})+2{p}_{\overline{k}}{\mathbf{E}}_{w}((G(\tilde{d}+w))\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}}))+{p}_{\overline{k}}{\mathbf{E}}_{w}((G(\tilde{d}+w))\prime \phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}\mathit{\text{PG}}(\tilde{d}+w))]\hfill \\ \hfill & =\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+(\mathit{\text{Ax}}+\mathit{\text{Bu}})\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}})+2{p}_{\overline{k}}(G\tilde{d})\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}})+{p}_{\overline{k}}(G\tilde{d})\prime P(D\tilde{d})+2{p}_{\overline{k}}{\mathbf{E}}_{w}((G\tilde{w})\prime P(D\tilde{d}))+{p}_{\overline{k}}{\mathbf{E}}_{w}((G\tilde{w})\prime P(D\tilde{w}))]\hfill \\ \hfill & =\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+(\mathit{\text{Ax}})\prime P(\mathit{\text{Ax}})+2(\mathit{\text{Ax}})\prime P(\mathit{\text{Bu}})+(\mathit{\text{Bu}})\prime P(\mathit{\text{Bu}})+2{p}_{\overline{k}}(G\tilde{d})\prime P(\mathit{\text{Ax}})+2{p}_{\overline{k}}(G\tilde{d})\prime P(\mathit{\text{Bu}})+{p}_{\overline{k}}(G\tilde{d})\prime P(D\tilde{d})+{\Xi}_{\overline{k}}].\hfill \end{array}$$

(B.3)

Taking the derivative with respect to *u* and setting equal to zero, we get the optimal control

$$\begin{array}{cc}{u}_{\overline{k}}^{*}(x,0)\hfill & =\mathit{\text{Arg}}\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+{p}_{\overline{k}}{\mathbf{E}}_{w}({J}_{\overline{k}+1}^{*}(\mathit{\text{Ax}}+\mathit{\text{Bu}}+G(\tilde{d}+w),1))+(1-{p}_{\overline{k}}){J}_{\overline{k}+1}^{*}(\mathit{\text{Ax}}+\mathit{\text{Bu}},1)]\hfill \\ \hfill & =-{(R+B\prime \mathit{\text{PB}})}^{-1}(B\prime \mathit{\text{PAx}}+{p}_{\overline{k}}B\prime \mathit{\text{PG}}\tilde{d})\hfill \end{array}$$

(B.4)

Plugging back into the expression above, we get

$$\begin{array}{cc}{J}_{\overline{k}}^{*}(x,0)\hfill & =[x\prime \mathit{\text{Qx}}+x\prime A\prime \mathit{\text{PAx}}+2{p}_{\overline{k}}\tilde{d}\prime G\prime \mathit{\text{PAx}}+{p}_{\overline{k}}\tilde{d}\prime G\prime \mathit{\text{PG}}\tilde{d}+{u}_{\overline{k}}^{*}(x,0)\prime (R+B\prime \mathit{\text{PB}}){u}_{\overline{k}}^{*}(x,0)+2(x\prime A\prime \mathit{\text{PB}}+{p}_{\overline{k}}\tilde{d}\prime G\prime \mathit{\text{PB}}){u}_{\overline{k}}^{*}(x,0)+{\Xi}_{\overline{k}}]\hfill \\ \hfill & =[x\prime \mathit{\text{Qx}}+x\prime A\prime \mathit{\text{PAx}}+2{p}_{\overline{k}}\tilde{d}\prime G\prime \mathit{\text{PAx}}+{p}_{\overline{k}}\tilde{d}\prime G\prime \mathit{\text{PG}}\tilde{d}-(x\prime A\prime \mathit{\text{PB}}+{p}_{\overline{k}}\tilde{d}\prime G\prime \mathit{\text{PB}}){(R+B\prime \mathit{\text{PB}})}^{-1}\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}(B\prime \mathit{\text{PAx}}+{p}_{\overline{k}}B\prime \mathit{\text{PG}}\tilde{d})+{\Xi}_{\overline{k}}]\hfill \\ \hfill & =[x\prime \mathit{\text{Qx}}+x\prime A\prime \mathit{\text{PAx}}-x\prime A\prime \mathit{\text{PB}}{(R+B\prime \mathit{\text{PB}})}^{-1}B\prime \mathit{\text{PAx}}+2{p}_{\overline{k}}\tilde{d}\prime G\prime \mathit{\text{PAx}}-2{p}_{\overline{k}}\tilde{d}\prime G\prime \mathit{\text{PB}}{(R+B\prime \mathit{\text{PB}})}^{-1}B\prime \mathit{\text{PAx}}+{p}_{\overline{k}}d\prime G\prime \mathit{\text{PG}}\tilde{d}-{p}_{\overline{k}}^{2}\tilde{d}\prime G\prime \mathit{\text{PB}}{(R+B\prime \mathit{\text{PB}})}^{-1}B\prime \mathit{\text{PG}}\tilde{d}+{\Xi}_{\overline{k}}]\hfill \\ \hfill & =[x\prime [Q+A\prime [P-\mathit{\text{PB}}{(R+B\prime \mathit{\text{PB}})}^{-1}B\prime P]A]x+\tilde{d}\prime G\prime [2{p}_{\overline{k}}P-2{p}_{\overline{k}}\mathit{\text{PB}}{(R+B\prime \mathit{\text{PB}})}^{-1}B\prime P]\mathit{\text{Ax}}+\tilde{d}\prime G\prime [{p}_{\overline{k}}P-{p}_{\overline{k}}^{2}\mathit{\text{PB}}{(R+B\prime \mathit{\text{PB}})}^{-1}B\prime P]G\tilde{d}+{\Xi}_{\overline{k}}]\hfill \\ \hfill & =x\prime \mathit{\text{Px}}+\tilde{d}\prime {H}_{\overline{k}}x+\tilde{d}\prime {L}_{\overline{k}}\tilde{d}+{\Xi}_{\overline{k}},\hfill \end{array}$$

(B.5)

where we have used the fact that *P* = *Q* + *A*′[*P* − *PB*(*R* + *B*′*PB*)^{−1}*B*′*P*]*A*.

**Lemma 3.** *For* *k* < , *the expected cost to go function* ${J}_{k}^{*}(x,0)$ *is also of the form*:

$${J}_{k}^{*}(x,0)=x\prime \mathit{\text{Px}}+\tilde{d}\prime {H}_{k}x+\tilde{d}\prime {L}_{k}\tilde{d}+{\Xi}_{k}$$

(B.6)

*where*

- ${H}_{k}=2{Q}_{k}^{\prime}(I-B{\Psi}^{-1}B\prime P)A$
- ${L}_{k}={p}_{k}G\prime \mathit{\text{PG}}+(1-{p}_{k}){L}_{k+1}-{Q}_{k}^{\prime}B{\Psi}^{-1}B\prime {Q}_{k}$ (
*symmetric*) - Ξ
_{k}=*p*_{k}**E**_{w}[*w*′*G*′*PGw*] + (1 −*p*) Ξ_{k}_{k+1}.and *Q*= (1 −_{k}*p*)(_{k}*H*_{k+1})′/2 +*p*._{k}PG

*The optimal policy is*

$${u}_{k}^{*}(x,0)=-{\Psi}^{-1}B\prime [\mathit{\text{PAx}}+{Q}_{k}\tilde{d}]$$

(B.7)

**Proof.** Using the induction hypothesis:

$${J}_{k}^{*}(x,0)=\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+{p}_{\overline{k}}{\mathbf{E}}_{w}({J}_{k+1}^{*}(\mathit{\text{Ax}}+\mathit{\text{Bu}}+G(\tilde{d}+w),1))+(1-{p}_{\overline{k}}){J}_{k+1}^{*}(\mathit{\text{Ax}}+\mathit{\text{Bu}},0)]$$

(B.8)

$$\begin{array}{cc}{J}_{k}^{*}(x,0)\hfill & =\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+{p}_{\overline{k}}{\mathbf{E}}_{w}[(\mathit{\text{Ax}}+\mathit{\text{Bu}}+G(\tilde{d}+w))\prime \phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}P(\mathit{\text{Ax}}+\mathit{\text{Bu}}+G(\tilde{d}+w))]+(1-{p}_{\overline{k}})\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}[(\mathit{\text{Ax}}+\mathit{\text{Bu}})\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}})+\tilde{d}\prime {H}_{k+1}(\mathit{\text{Ax}}+\mathit{\text{Bu}})+\tilde{d}\prime {L}_{k+1}\tilde{d}+{\Xi}_{k+1}]]\hfill \\ \hfill & =\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+{p}_{\overline{k}}[(\mathit{\text{Ax}}+\mathit{\text{Bu}})\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}})+2{\mathbf{E}}_{w}((G(\tilde{d}+w))\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}}))+{\mathbf{E}}_{w}((G(\tilde{d}+w))\prime \mathit{\text{PG}}(\tilde{d}+w))]+(1-{p}_{\overline{k}})[(\mathit{\text{Ax}}+\mathit{\text{Bu}})\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}})+\tilde{d}\prime {H}_{k+1}(\mathit{\text{Ax}}+\mathit{\text{Bu}})+\tilde{d}\prime {L}_{k+1}\tilde{d}+{\Xi}_{k+1}]]\hfill \\ \hfill & =\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+(\mathit{\text{Ax}}+\mathit{\text{Bu}})\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}})+{p}_{\overline{k}}[2(G\tilde{d})\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}})+(G\tilde{d})\prime \mathit{\text{PG}}\tilde{d}+{\mathbf{E}}_{w}((Gw)\prime \mathit{\text{PGw}})]+(1-{p}_{\overline{k}})[\tilde{d}\prime {H}_{k+1}(\mathit{\text{Ax}}+\mathit{\text{Bu}})+\tilde{d}\prime {L}_{k+1}\tilde{d}+{\Xi}_{k+1}]]\hfill \\ \hfill & =\underset{u}{\text{min}}[x\prime \mathit{\text{Qx}}+u\prime \mathit{\text{Ru}}+(\mathit{\text{Ax}}+\mathit{\text{Bu}})\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}})+{p}_{\overline{k}}[2(G\tilde{d})\prime P(\mathit{\text{Ax}}+\mathit{\text{Bu}})+(G\tilde{d})\prime \mathit{\text{PG}}\tilde{d}]+(1-{p}_{\overline{k}})[\tilde{d}\prime {H}_{k+1}(\mathit{\text{Ax}}+\mathit{\text{Bu}})+\tilde{d}\prime {L}_{k+1}\tilde{d}]+{\Xi}_{k}]\hfill \\ \hfill & =\underset{u}{\text{min}}[x\prime (Q+A\prime \mathit{\text{PA}})x+u\prime \Psi u+[2x\prime A\prime P+2\tilde{d}\prime {Q}_{k}^{\prime}]\mathit{\text{Bu}}+2\tilde{d}\prime {Q}_{k}^{\prime}\mathit{\text{Ax}}+\tilde{d}\prime [{p}_{k}G\prime \mathit{\text{PG}}+(1-{p}_{k}){L}_{k+1}]\tilde{d}\prime +{\Xi}_{k}]\hfill \end{array}$$

(B.9)

Taking the derivative with respect to *u* and setting equal to zero, we get

$$2\Psi u+2B\prime [\mathit{\text{PAx}}+{Q}_{k}\tilde{d}]=0$$

so that ${u}_{k}^{*}(x,0)=-{\Psi}^{-1}B\prime [\mathit{\text{PAx}}+{Q}_{k}\tilde{d}]$. Plugging into the expression above, we get

$$\begin{array}{cc}{J}_{k}^{*}(x,0)\hfill & =[x\prime (Q+A\prime \mathit{\text{PA}})x+(x\prime A\prime P+\tilde{d}\prime {Q}_{k}^{\prime})B{\Psi}^{-1}\Psi {\Psi}^{-1}B\prime (\mathit{\text{PAx}}+{Q}_{k}\tilde{d})-(2x\prime A\prime P+2\tilde{d}\prime {Q}_{k}^{\prime})B{\Psi}^{-1}B\prime (\mathit{\text{PAx}}+{Q}_{k}\tilde{d})+2\tilde{d}\prime {Q}_{k}^{\prime}\mathit{\text{Ax}}+\tilde{d}\prime ({p}_{k}G\prime \mathit{\text{PG}}+(1-{p}_{k}){L}_{k+1})\tilde{d}+{\Xi}_{k}]\hfill \\ \hfill & =[x\prime (Q+A\prime \mathit{\text{PA}})x-(x\prime A\prime P+\tilde{d}\prime {Q}_{k}^{\prime})B{\Psi}^{-1}B\prime (\mathit{\text{PAx}}+{Q}_{k}\tilde{d})+2\tilde{d}\prime {Q}_{k}^{\prime}\mathit{\text{Ax}}+\tilde{d}\prime ({p}_{k}G\prime \mathit{\text{PG}}+(1-{p}_{k}){L}_{k+1})\tilde{d}+{\Xi}_{k}]\hfill \\ \hfill & =[x\prime (Q+A\prime \mathit{\text{PA}})x-x\prime A\prime \mathit{\text{PB}}{\Psi}^{-1}B\prime \mathit{\text{PAx}}+2\tilde{d}\prime {Q}_{k}^{\prime}\mathit{\text{Ax}}-2x\prime A\prime \mathit{\text{PB}}{\Psi}^{-1}B\prime {Q}_{k}\tilde{d}+\tilde{d}\prime ({p}_{k}G\prime \mathit{\text{PG}}+(1-{p}_{k}){L}_{k+1})\tilde{d}-\tilde{d}\prime {Q}_{k}^{\prime}B{\Psi}^{-1}B\prime {Q}_{k}\tilde{d}+{\Xi}_{k}]\hfill \\ \hfill & =[x\prime [Q+A\prime (P-\mathit{\text{PB}}{\Psi}^{-1}B\prime P)A]x+\tilde{d}\prime [2{Q}_{k}^{\prime}(I-B{\Psi}^{-1}B\prime P)A]x+\tilde{d}\prime [{p}_{k}G\prime \mathit{\text{PG}}+(1-{p}_{k}){L}_{k+1}-{Q}_{k}^{\prime}B{\Psi}^{-1}B\prime {Q}_{k}]\tilde{d}+{\Xi}_{k}]\hfill \\ \hfill & =x\prime \mathit{\text{Px}}+\tilde{d}\prime {H}_{k}x+\tilde{d}\prime {L}_{k}\tilde{d}+{\Xi}_{k}\hfill \end{array}$$

(B.10)

where again we have used the fact that *P* = *Q* + *A*′[*P* − *PB*(*R* + *B*′*PB*)^{−1} *B*′*P*]*A*.

^{}This work was sponsored in part by (1) the Juvenile Diabetes Research Foundation (JDRF) Artificial Pancreas Project and (2) the National Library of Medicine (Award Number T15LM009462). This content is solely the responsibility of the authors and does not necessarily represent the official views of the JDRF, the National Library of Medicine, or the National Institutes of Health.

^{1}In the context of control of T1DM, *p _{k}* becomes the probability that the meal will “arrive” in stage

**Conflict of interest**

There are no conflicts of interest for the authors of this article.

1. Patek S, Hughes C, Breton M, Kovatchev B. Anticipating meals with behavioral profiles: towards stochastic model predictive control of T1DM; Proceedings of the 7th IFAC Symposium on Modelling and Control in Biomedical Systems; 2009. pp. 37–42. Available in IFAC Papers-OnLine: http://www.ifac-papersonline.net/Detailed/39984.html.

2. Parker RS, Doyle FJ, Peppas NA., III A model-based algorithm for blood glucose control in type 1 diabetic patients. IEEE Trans. Biomed. Eng. 1999;BME-46(2):148–157. [PubMed]

3. Bequette BW. A critical assessment of algorithms and challenges in the development of a closed-loop artificial pancreas. Diab. Technol. Ther. 2005;7:28–47. [PubMed]

4. Boyne M, Silver DM, Kaplan J, Saudek C. Timing of changes in interstitial and venous blood glucose measured with a continuous subcutaneous glucose sensor. Diabetes. 2003;52:2790–2794. [PubMed]

5. Rebrin K, Steil G, van Antwerp W, Mastrototaro J. Subcutaneous glucose predicts plasma glucose independent of insulin: implications for continuous monitoring. Am. J. Physiol. Endocrinol. Metab. 1999;277:561–571. [PubMed]

6. Kulcu E, Tamada J, Reach G, Potts R, Lesho M. Physiological differences between interstitial glucose and blood glucose measured in human subjects. Diabetes Care. 2003;26:2405–2409. [PubMed]

7. Hovorka R, Canonico V, Chassin LJ, Haueter U, Massi-Benedetti M, Federici MO, Pieber T, Schaller H, Schaupp L, Vering T, Wilinska ME. Nonlinear model predictive control of glucose concentration in subjects with type 1 diabetes. Physiol. Meas. 2004;25:905–920. [PubMed]

8. Magni L, Raimondo DM, Dalla Man C, Nicolao GD, Kovatchev B, Cobelli C. Model predictive control of glucose concentration in subjects with type 1 diabetes: an in silico trial; 17th IFAC World Congress; 2008. pp. 4246–4251.

9. Zisser H, Jovanovic L, Doyle FJ, Ospina P, Owens C., III Run-to-run control of meal-related insulin dosing. Diab. Technol. Ther. 2005;7:48–57. [PubMed]

10. Owens C, Zisser H, Jovanovic L, Srinivasan B, Bonvin D, Doyle FJ., III Run-to-run control of blood glucose concentrations for people with type 1 diabetes mellitus. IEEE Trans. Biomed. Eng. 2006;53:996–1005. [PubMed]

11. Palerm C, Zisser H, Bevier W, Jovanovic L, Doyle FJ., III Prandial insulin dosing using run-to-run control: application of clinical data and medical expertise to define a suitable performance metric. Diabetes Care. 2007;30:1131–1136. [PubMed]

12. Palerm C, Zisser H, Jovanovic L, Doyle FJ., III A run-to-run control strategy to adjust basal insulin infusion rates in type 1 diabetes. J. Process Control. 2008;18:258–265. [PMC free article] [PubMed]

13. Campo L, Bar-Shalom Y. Control of discrete-time hybrid stochastic systems. IEEE Trans. Autom. Control. 1992;37(10):1522–1527.

14. Patek S, Breton M, Cobelli C, Dalla Man C, Kovatchev B. Adaptive meal detection algorithm enabling closed-loop control in type 1 diabetes; Proceedings of the 2007 Diabetes Technology Meeting (Abstract); 2007.

15. Bergman R, Ider Y, Bowden C, Cobelli C. Quantitative estimation of insulin sensitivity. Am. J. Physiol. 1979;236:E667–E677. [PubMed]

16. Swan KL, Dziura JD, Steil GM, Voskanyan GR, Sikes KA, Steffen AT, Martin ML, Tamborlane WV, Weinzimer SA. Effect of age of infusion site and type of rapid-acting analog on pharmacodynamic parameters of insulin boluses in youth with type 1 diabetes receiving insulin pump therapy. Diabetes Care. 2009;32:240–244. [PMC free article] [PubMed]

17. Dalla Man C, Rizza RA, Cobelli C. Meal simulation model of the glucose–insulin system. IEEE Trans. Biomed. Eng. 2007;54(10):1740–1749. [PubMed]

18. Dalla Man C, Raimondo DM, Rizza RA, Cobelli C. GIM, simulation software of meal glucose–insulin model. J. Diabetes Sci. Technol. 2007;1(3):323–330. [PMC free article] [PubMed]

19. Breton M, Kovatchev B. Analysis, modeling, and simulation of the accuracy of continuous glucose sensors. J. Diabetes Sci. Technol. 2008;2(5):1–10. [PMC free article] [PubMed]

20. Magni L, Raimondo DM, Dalla Man C, Breton M, Patek SD, Nicolao GD, Cobelli C, Kovatchev BP. Evaluating the efficacy of closed-loop glucose regulation via control-variability grid analysis. J. Diabetes Sci. Technol. 2008;2(4):630–635. [PMC free article] [PubMed]

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |