PMCCPMCCPMCC

Search tips
Search criteria 

Advanced

 
Logo of plosonePLoS OneView this ArticleSubmit to PLoSGet E-mail AlertsContact UsPublic Library of Science (PLoS)
 
PLoS One. 2012; 7(5): e36160.
Published online May 2, 2012. doi:  10.1371/journal.pone.0036160
PMCID: PMC3342242
Numerical Integration of the Master Equation in Some Models of Stochastic Epidemiology
Garrett Jenkinson and John Goutsias*
Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland, United States of America
Stephen J. Cornell, Editor
University of Leeds, United Kingdom
* E-mail: goutsias/at/jhu.edu
Conceived and designed the experiments: GJ JG. Performed the experiments: GJ JG. Analyzed the data: GJ JG. Contributed reagents/materials/analysis tools: GJ JG. Wrote the paper: GJ JG.
Received September 14, 2011; Accepted April 2, 2012.
The processes by which disease spreads in a population of individuals are inherently stochastic. The master equation has proven to be a useful tool for modeling such processes. Unfortunately, solving the master equation analytically is possible only in limited cases (e.g., when the model is linear), and thus numerical procedures or approximation methods must be employed. Available approximation methods, such as the system size expansion method of van Kampen, may fail to provide reliable solutions, whereas current numerical approaches can induce appreciable computational cost. In this paper, we propose a new numerical technique for solving the master equation. Our method is based on a more informative stochastic process than the population process commonly used in the literature. By exploiting the structure of the master equation governing this process, we develop a novel technique for calculating the exact solution of the master equation – up to a desired precision – in certain models of stochastic epidemiology. We demonstrate the potential of our method by solving the master equation associated with the stochastic SIR epidemic model. MATLAB software that implements the methods discussed in this paper is freely available as Supporting Information S1.
Stochasticity can play an important role when studying a disease that spreads through a population of individuals [1][3]. A common approach to modeling this problem is by means of a Markov process, whose probability distribution satisfies a deterministic differential equation known as the master equation. Solving the master equation analytically however is not in general possible and Monte Carlo sampling, based on the Gillespie algorithm [4], is often used to accomplish this goal. Unfortunately, accurate evaluation of the probability distribution of a Markov process requires a prohibitively large number of Monte Carlo samples for most systems of interest. As a consequence, Monte Carlo sampling is mostly used to estimate statistical summaries of the underlying stochastic population dynamics, such as means and variances.
To evaluate the solution of the master equation, a number of approximation techniques have been proposed in the literature, such as the system-size expansion method of van Kampen [1], [5], [6]. While approximations may work well in certain circumstances, they often fail when the underlying assumptions are not satisfied. The system-size expansion method for example can only produce a normal approximation to the solution of the master equation. Therefore, if the probability distribution of the population process is bimodal, then this method will produce erroneous results.
Some effort has recently shifted away from Monte Carlo sampling and approximation techniques and has focused on exploiting the linear structure of the master equation associated with the population process. This results in a numerical solution to the master equation through matrix exponentiation; e.g., see [2], [7][10]. A popular technique along these lines employs a Krylov subspace approximation (KSA) method [7], [8] that dramatically reduces the size of matrix exponentiation and results in an attractive iterative algorithm for solving the master equation. However, the KSA technique is based on several approximations, whose cumulative effect may appreciably affect the method's accuracy, numerical stability, and computational efficiency.
There are two main issues that can affect performance of the KSA method. One is choosing the dimension of the approximating Krylov subspace used. If the dimension is chosen too small, the method may produce an inaccurate solution to the master equation, whereas, a value that is too large can result in an appreciable decrease of computational efficiency. Unfortunately, there is no rigorous way to optimally determine an appropriate value for this parameter, which is chosen manually, even in advanced implementations such as Expokit [7]. Another issue is the fact that, at each step, the KSA method may not necessarily produce a probability vector (i.e., a vector composed of nonnegative elements that sum to one). This problem can be addressed by using a sufficiently small step-size, but this may seriously affect the method's computational efficiency. In practice, the KSA method is equipped with a heuristic step that zeros-out all negative values and re-normalizes the positive values so that they sum to one. This step however introduces its own errors, which may affect the quality of the approximation in a manner that is not easy to predict.
Instead of using the population process, we can describe the stochastic spread of a disease by a more informative stochastic process known as the degree-of-advancement (DA). Exploiting the structure of the master equation governing this process results in a novel numerical algorithm for calculating the exact solution of the master equation, up to a desired precision, which we refer to as the implicit Euler (IE) method. This technique enjoys several advantages over the KSA method: its global error is of first-order with respect to the step-size, it is numerically stable regardless of the step-size used, and always produces a solution whose elements are nonnegative and sum to one. As we will discuss in this paper, the IE method shows great promise for solving certain problems in stochastic epidemiology in which the sample space associated with the DA process is reasonably sized. It is not however meant to replace the KSA method, which is still the best numerical method available for solving the master equation in problems where implementation of the IE method is not computationally attractive or possible. To illustrate the potential of the proposed IE method, we calculate the solution of the master equation associated with the stochastic SIR epidemic model and use this solution to study some important properties of this model.
Disease dynamics
The classical SIR epidemic model (without births, deaths, or imports of disease) is one of the simplest models in epidemiology [11]. Here, each individual in a population is either susceptible to a disease, infected, or recovered. If we denote by An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e001.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e002.jpg, and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e003.jpg the susceptible, infected and recovered individuals, respectively, and by An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e004.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e005.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e006.jpg their corresponding (and possibly random) population numbers, we can characterize the state of the SIR model at time An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e007.jpg by using the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e008.jpg vector An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e009.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e010.jpg denotes vector transpose. The state depends on time due to the (possibly random) occurrences of the following two reactions:
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e011.jpg
(1)
which model infection of a susceptible individual (first reaction) as well as recovery of an infected individual (second reaction).
We can model a complex epidemiological system in more general terms by using the following reactions:
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e012.jpg
(2)
where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e013.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e014.jpg. This model congregates individuals into An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e015.jpg different groups, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e016.jpg, which interact through An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e017.jpg coupled reactions. Parameters An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e018.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e019.jpg are the stoichiometry coefficients of the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e020.jpg reaction. These parameters tell us how individuals interact with each other as well as their status after occurrence of a particular reaction. For example, in the aforementioned SIR model, we may set An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e021.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e022.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e023.jpg, resulting in An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e024.jpg, with the remaining coefficients being zero.
The usual way to characterize an epidemiological system is by means of the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e025.jpg random vector An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e026.jpg with elements An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e027.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e028.jpg denotes the population of the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e029.jpg group of individuals present in the system at time An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e030.jpg. By convention, we set An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e031.jpg, for some known value An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e032.jpg (i.e., we assume that we know the initial population numbers at time An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e033.jpg). We refer to the multivariate stochastic process An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e034.jpg as the population process.
Let An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e035.jpg be the (possibly random) number of times that the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e036.jpg reaction occurs during the time interval An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e037.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e038.jpg be the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e039.jpg random vector with elements An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e040.jpg. Then, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e041.jpg is a counting process, known as the degree of advancement (DA) of the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e042.jpg reaction [5]. We set An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e043.jpg and refer to the multivariate stochastic process An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e044.jpg as the DA process. Note that
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e045.jpg
(3)
where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e046.jpg is the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e047.jpg net stoichiometry matrix of the system with elements An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e048.jpg. Therefore, and for a given initial population vector An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e049.jpg, equation (3) allows us to uniquely determine the population process An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e050.jpg from the DA process An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e051.jpg. However, we cannot in general determine the DA process from the population process. This can only be done when the nullity of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e052.jpg is zero, in which case An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e053.jpg. As a consequence, the DA process is more informative than the population process.
The master equation
To model an epidemiological system governed by the reactions in (2), we must specify, for each An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e054.jpg, the probability that one reaction An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e055.jpg will occur within an infinitesimally small time interval An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e056.jpg. For most systems of interest, this probability is given by An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e057.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e058.jpg is a term that goes to zero faster than An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e059.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e060.jpg is a (usually nonlinear) function of individual populations at time An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e061.jpg, known as the propensity function of the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e062.jpg reaction [12]. It turns out that the DA process An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e063.jpg is a Markovian counting process with intensity An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e064.jpg. As a result, it can be shown that the probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e065.jpg of the DA process satisfies the following master equation [13], [14]:
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e066.jpg
(4)
for An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e067.jpg, initialized by An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e068.jpg, with An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e069.jpg being the Kronecker delta function, where
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e070.jpg
(5)
and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e071.jpg is the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e072.jpg column of the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e073.jpg identity matrix. In the theory of Markov processes, the master equation is a special case of the more general forward Kolmogorov equation [5].
We can use the solution An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e074.jpg of the previous master equation to calculate the probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e075.jpg of the population process. Since we are dealing with discrete random variables, this calculation is straightforward:
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e076.jpg
(6)
where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e077.jpg. Therefore, by solving the master equation (4) [i.e., by calculating the probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e078.jpg, for An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e079.jpg], we can completely specify the dynamic properties of a Markovian model. However, and for most cases of interest, this is a notoriously difficult task, both analytically and computationally. In the following, we propose a promising numerical method to address this problem and illustrate its potential using a simple example based on the stochastic SIR epidemic model.
Exploiting structure
Most available algorithms for solving the master equation focus on the population process instead of the DA process. It turns out that, by using the DA process, we may reap some benefits that can lead to a simple numerical solver for the general master equation (4).
In the following, we assume that statistical analysis of an epidemiological model of interest is limited within a finite time interval An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e080.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e081.jpg is the maximum time for which the DA process is almost surely contained within an An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e082.jpg-dimensional discrete and finite sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e083.jpg; i.e.,
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e084.jpg
We index the elements in An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e085.jpg by An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e086.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e087.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e088.jpg is the cardinality of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e089.jpg (i.e., the total number of elements in An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e090.jpg). We can then define the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e091.jpg vector An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e092.jpg with elements An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e093.jpg, for An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e094.jpg. Clearly, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e095.jpg specifies the probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e096.jpg. It can be seen from (4) that An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e097.jpg can be calculated by solving the following system of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e098.jpg linear ordinary differential equations (ODEs):
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e099.jpg
(7)
where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e100.jpg is a An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e101.jpg matrix that can be directly constructed from the master equation. In the theory of Markov processes, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e102.jpg is known as the generator matrix. Note that the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e103.jpg column of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e104.jpg contains zeros in most places except for the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e105.jpg element that takes value An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e106.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e107.jpg off-diagonal elements that take values An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e108.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e109.jpg. Therefore, the elements of each column of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e110.jpg add to zero. Note also that equation (7) is initialized by a vector An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e111.jpg whose first element equals An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e112.jpg (assuming that An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e113.jpg), whereas, the remaining elements are all zero.
The main advantage of using the DA process An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e114.jpg is that, under an appropriate ordering of the elements in An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e115.jpg, the generator matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e116.jpg will be lower triangular. We will shortly demonstrate that this can result in substantial simplification of the numerical algorithm used to solve (7).
To obtain a matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e117.jpg that is lower triangular, we must order the points An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e118.jpg in the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e119.jpg lexicographically, such that An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e120.jpg, for An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e121.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e122.jpg denotes that one variable is lexicographically smaller than another [e.g., An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e123.jpg if and only if An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e124.jpg or An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e125.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e126.jpg]. Because a reaction can only increase (by one) the value of a single element of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e127.jpg, it is not possible for probability mass to be transferred from An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e128.jpg to An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e129.jpg when An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e130.jpg. Such monotonic transfer of probability does not generally occur when the population process An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e131.jpg is used. Therefore, when the points An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e132.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e133.jpg, in An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e134.jpg are ordered lexicographically, the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e135.jpg element of matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e136.jpg will be zero when An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e137.jpg and, therefore, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e138.jpg will be lower triangular. An example is provided in Supporting Information S2.
Numerical solver
We now proceed by exploiting the three key structural characteristics of matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e139.jpg: its stability, triangularity, and sparsity. We have noted that the diagonal elements of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e140.jpg are non-positive. However, since An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e141.jpg is triangular, its diagonal elements will be the eigenvalues of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e142.jpg. Thus, the linear constant coefficient system of ODEs given by (7) is stable, ensuring the efficacy of implicit ODE solvers [15]. As a consequence, we can use the implicit Euler method to estimate An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e143.jpg at discrete time points An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e144.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e145.jpg, for a given time step An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e146.jpg. Then, given an estimate An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e147.jpg of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e148.jpg, we can obtain an estimate An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e149.jpg of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e150.jpg by solving the following system of linear equations:
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e151.jpg
(8)
where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e152.jpg is the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e153.jpg identity matrix. By initializing this computation with An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e154.jpg, we can therefore recursively calculate the values of the probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e155.jpg of the DA process at the discrete time points An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e156.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e157.jpg. In Supporting Information S2, we show that solving the previous system is always possible, for any An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e158.jpg, due to the invertibility of matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e159.jpg. We also show that this procedure always returns a probability vector for any step-size An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e160.jpg. Moreover, we demonstrate that the resulting method is a first-order solver, since the global error An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e161.jpg is of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e162.jpg (i.e., the global error is proportional to the step-size An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e163.jpg). Finally, since the implicit Euler step is always stable for any choice of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e164.jpg [15], the errors from previous iterations will not be amplified in later stages, regardless of the step-size used. Therefore, a desired error can be achieved by simply reducing the value of the step-size An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e165.jpg. We refer to the resulting technique for solving the master equation based on (8) as the implicit Euler (IE) method.
In general, solving (8) would require An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e166.jpg computations, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e167.jpg is the cardinality of the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e168.jpg, which will be prohibitive. However, since An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e169.jpg is a triangular matrix, we can use forward substitution whose cost is usually of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e170.jpg. But since An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e171.jpg is a sparse matrix, with each column having only An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e172.jpg non-zero elements, forward substitution can be done at a cost of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e173.jpg [16], where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e174.jpg is the number of reactions. In addition, calculating the probability mass function at time An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e175.jpg requires storage of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e176.jpg nonzero numbers. In particular, we need to store An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e177.jpg nonzero elements of matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e178.jpg as well as An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e179.jpg elements of vectors An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e180.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e181.jpg [note that the elements of each column of matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e182.jpg and the elements of each of the two vectors An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e183.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e184.jpg sum to one]. Since An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e185.jpg, the computational and memory requirements of the IE method will be An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e186.jpg, which grow linearly in terms of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e187.jpg.
Matrix exponentiation
Instead of the previous approach, we may attempt to solve the master equation governing the population process An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e188.jpg by a matrix exponentiation method [2], [9]. Let An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e189.jpg be an An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e190.jpg-dimensional discrete and finite sample space such that
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e191.jpg
If we index the elements in An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e192.jpg by An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e193.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e194.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e195.jpg is the cardinality of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e196.jpg, then we can define the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e197.jpg vector An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e198.jpg with elements An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e199.jpg, for An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e200.jpg. In this case, the probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e201.jpg can be calculated by solving the following system of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e202.jpg linear ODEs:
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e203.jpg
(9)
where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e204.jpg is a sparse An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e205.jpg matrix whose structure can be inferred directly from the master equation [9]. Note that (9) is initialized by a vector An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e206.jpg whose first element equals An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e207.jpg [assuming that An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e208.jpg], whereas, the remaining elements are all zero.
We can obtain estimates An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e209.jpg of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e210.jpg, for An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e211.jpg, by using the following recursion:
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e212.jpg
(10)
initialized with An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e213.jpg. The main issue with this equation however is the need to evaluate the matrix exponential An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e214.jpg. Although many techniques are currently available to do this job, they are not very satisfactory due to issues related to stability, accuracy, and computational efficiency [17].
The best method currently available to compute (10) is based on a Krylov subspace approximation (KSA) method [7], [8], [18]. In its simplest form, the method approximates An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e215.jpg, for a small An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e216.jpg, by An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e217.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e218.jpg is an An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e219.jpg matrix whose columns form an orthonormal basis for the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e220.jpg-dimensional Krylov subspace An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e221.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e224.jpg is an An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e225.jpg matrix computed during the well-known Arnoldi procedure used to calculate An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e226.jpg, and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e227.jpg is the first column of the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e228.jpg identity matrix. Then, the value of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e229.jpg is recursively approximated by
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e230.jpg
for An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e231.jpg, with An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e232.jpg. The KSA method reduces the problem of calculating the exponential of the large and sparse An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e233.jpg matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e234.jpg to the problem of calculating the exponential of the much smaller and dense An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e235.jpg matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e236.jpg (note that An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e237.jpg, with An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e238.jpg30–50 being sufficient for most applications). Computation of the reduced size problem can be done by standard methods, such as Chebyshev or Padé approximation [7], [8], [17].
As we mentioned in the Introduction, there are several disadvantages of using the KSA method: possible error accumulation that may lead to instabilities, inability to produce, at each step, a probability vector without heuristically modifying the calculated values, and a need to specify an appropriate dimensionality for the Krylov subspace without appreciably affecting computational efficiency while achieving acceptable numerical accuracy. These issues are nicely circumvented by the IE method, but with a price: the proposed method can be applied to a smaller set of problems than the KSA method.
Practical considerations
In general, the computational and memory requirements of matrix exponentiation grow quadratically in terms of the cardinality An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e239.jpg of the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e240.jpg, and can quickly become prohibitive for large values of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e241.jpg. The KSA method however can greatly reduce this expense to An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e242.jpg computations and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e243.jpg memory locations, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e244.jpg is the dimension of the approximating Krylov subspace used and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e245.jpg is the number of reactions; see our discussion in Supporting Information S2. Thus, the relative efficiency of the IE method, which requires An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e246.jpg computation and storage cost, to the KSA approach will depend on the relative values of the cardinalities An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e247.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e248.jpg of the sample spaces An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e249.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e250.jpg, respectively.
As we mentioned before, if the nullity of the net stoichiometry matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e251.jpg is zero, then there is a one-to-one correspondence between An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e252.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e253.jpg. As a consequence of (6), the cardinalities of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e254.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e255.jpg will be the same, in which case An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e256.jpg. Under these circumstances, the IE method will outperform the KSA method. This is a consequence of the fact that An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e257.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e258.jpg in this case. We can easily verify that, for the simple SI model (An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e259.jpg), the SIR epidemic model characterized by (1), and the SEIR model (An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e260.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e261.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e262.jpg, where E denotes a group of individuals exposed to disease but not yet infectious), the nullity of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e263.jpg is indeed zero and, therefore, the IE method will be superior to the KSA method.
In general, the IE method will be computationally superior to the KSA method, provided that the cardinality of the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e264.jpg is not appreciably larger than An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e265.jpg times the cardinality of the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e266.jpg [or not much larger than An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e267.jpg times the cardinality of the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e268.jpg, if we also consider memory requirements]. Of course, in situations where the nullity of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e269.jpg is large, the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e270.jpg can become appreciably larger than An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e271.jpg, in which case the KSA method will be more preferable. Note that there are cases in which An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e272.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e273.jpg can become infinite (e.g., suppose an influx of people at some constant rate An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e274.jpg, in which case both sample spaces will be unbounded). In these situations, the use of a finite state projection approach [9] is required to reduce the sample spaces, and the relative efficiency of the two methods will depend on the sizes of the resulting subspaces.
For a given step-size An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e275.jpg, the IE method described so far generates a sequence of probability vectors An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e276.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e277.jpg. Assuming that the true solution An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e278.jpg is known at time An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e279.jpg, we can show [see equation (S.11) in Supporting Information S2] that the local error An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e280.jpg is of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e281.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e282.jpg is the approximation of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e283.jpg obtained by the IE method for a given value of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e284.jpg. We can further improve this result by employing a powerful computational tool known as Richardson extrapolation [19].
We have shown in the Supporting Information S2 that, if An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e285.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e286.jpg are the approximations of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e287.jpg obtained from An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e288.jpg by the IE method with step-sizes An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e289.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e290.jpg, respectively, then An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e291.jpg also approximates An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e292.jpg, but with a local error of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e293.jpg. We therefore expect that An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e294.jpg is a better approximation to An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e295.jpg than An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e296.jpg [or even An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e297.jpg; see Supporting Information S2] for a sufficiently small step-size An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e298.jpg. This suggests that we can use a valuable modification of the IE method to obtain a better approximation to the solution of the master equation than the original technique. This modification combines two runs of the IE method, with time steps An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e299.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e300.jpg, and produces a solution An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e301.jpg, given by
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e302.jpg
(11)
where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e303.jpg denotes the minimum value of the elements of vector An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e304.jpg. In this case, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e305.jpg is given by the ‘improved’ vector An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e306.jpg only when all elements of that vector are nonnegative. Otherwise, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e307.jpg is given by the vector An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e308.jpg calculated by the IE method with the smaller step-size An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e309.jpg. This assures that An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e310.jpg is always a probability vector. We will be referring to the resulting technique as the Richardson-based implicit Euler (RIE) method. We illustrate one step of this method in Figure 1.
Figure 1
Figure 1
One step of the RIE method.
Many ODE solvers, including the KSA method, adjust the step-size at each iteration to assure that the local error An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e313.jpg is less than a pre-specified error tolerance An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e314.jpg while minimizing the computational effort required to accomplish this goal. We can also modify the RIE method to accommodate variable step-sizes. By following our analysis in the Supporting Information S2, we can approximately calculate the local error An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e315.jpg at step An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e316.jpg by [see equation (S.16)]
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e317.jpg
where we use a factor of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e318.jpg to compensate for the possibility that the true (but unknown) local error is larger (by An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e319.jpg) than the actual error calculated by An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e320.jpg. If An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e321.jpg, then we consider the step successful and increase the step-size from An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e322.jpg to An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e323.jpg, where [see equation (S.17) in Supporting Information S2]
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e324.jpg
(12)
However, if An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e325.jpg, then the step is unsuccessful. In this case, we decrease the step-size from An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e326.jpg to An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e327.jpg by using (12) and redo the RIE step.
We finally note that some readers might be concerned with precision loss in the forward substitution step of the IE and RIE methods. To address this issue, we could employ the standard numerical technique of iterative improvement [15] with moderate additional computational cost. However, we show in the Supporting Information S2 that the matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e328.jpg being inverted is never singular. Moreover, this matrix is far from being singular (i.e., its eigenvalues are not numerically close to zero) as An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e329.jpg. We therefore suggest that a preferable method of combating precision loss is to reduce An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e330.jpg, since the step-size tightly regulates the global error as well (see Supporting Information S2). Although in the subsequent example we did not perform iterative improvement, the results indicate that any precision loss is negligible, despite the large dynamic range of probability values involved in the solution.
To demonstrate the efficacy of our method, we tackle the problem of modeling a well-documented 1978 influenza epidemic in an English boarding school [20]. A deterministic SIR model was originally developed to analyze these data [21]. Subsequently, the model was extended to the stochastic case and approximately solved using van Kampen's system-size expansion method [1]. In the following, we use the IE method to compute the exact solution of the underlying master equation up to a desired precision.
There are three classes of individuals, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e331.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e332.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e333.jpg, representing An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e334.jpg susceptible, infected and recovered pupils. Spreading of the epidemic is governed by the reactions in (1) with propensity functions
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e335.jpg
where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e336.jpg/day and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e337.jpg/day are the rate constants of infection and recovery, respectively [1]. The initial conditions are given by
A mathematical equation, expression, or formula.
 Object name is pone.0036160.e338.jpg
reflecting the fact that only one pupil is infected at the start of the epidemic. We take the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e339.jpg to be the rectangular region in the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e340.jpg plane that begins at An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e341.jpg and extends to include the maximal point An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e342.jpg. This is due to the fact that the first reaction can occur at most An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e343.jpg times, after which all pupils will have been infected, whereas, the second reaction can occur at most An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e344.jpg times, after which all pupils will have recovered from the infection. As a consequence, the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e345.jpg contains An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e346.jpg points.
Numerically solving the master equation over a period of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e347.jpg days by means of the KSA method using Expokit [7] took 72 minutes of CPU time on a 2.20 GHz Intel Mobile Core 2 Duo T7500 processor running Matlab 7.7. The resulting solution produces an L2 error An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e348.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e349.jpg is a solution of the master equation obtained by a stringent run of Expokit, which we consider to be the ‘true’ solution. The required An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e350.jpg value (used to determine a desired error tolerance for the KSA method and for the RIE method with variable step-size) was set to An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e351.jpg. We obtained the Expokit solution by using a Krylov subspace approximation with dimension An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e352.jpg. This value was determined by starting with the default value of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e353.jpg and successively increasing it by An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e354.jpg until the resulting Expokit error estimate was less than An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e355.jpg. The reported L2 errors were calculated using a solution obtained by a computationally more expensive Expokit run with An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e356.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e357.jpg, which we consider it to be the ‘true’ solution. This is based on the premise that Expokit will produce the true solution for sufficiently large An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e358.jpg and small An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e359.jpg.
To be compatible with Expokit, we report here the L2 error. Note however that the error analysis of our method, provided in the Supporting Information S2, is based on the L1 error. On the other hand, using equation (8) with An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e360.jpg days, the IE method took a mere 53 seconds of CPU time, achieving a smaller (by a factor of 2.8) final L2 error of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e361.jpg. We can achieve a further reduction of the L2 error by using the RIE method with fixed step-size. This is clear from the results summarized in Table 1. This performance can be achieved however at the expense of increasing the CPU time required to calculate the solution. Note that we may be able to decrease the CPU time by using the RIE method with variable step-size (see Table 1). This method however results in a noticeable decrease of accuracy (at least for the example considered here), with an L2 error that is 2.8 times larger than the one obtained with the KSA method.
Table 1
Table 1
The L2 error and the CPU time associated with the four numerical methods discussed in this paper.
Since An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e370.jpg, it suffices to focus on the joint probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e371.jpg of susceptible and infected pupils. It turns out however that the epidemic-free state occurs with high probability An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e372.jpg, a situation that visually obscures the values of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e373.jpg. For this reason, instead of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e374.jpg, we depict in Figure 2 a snapshot of the calculated joint conditional probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e375.jpg of the susceptible and infected pupils at the end of the 6th day, given that at least one pupil is infected. The Supporting Information S3 contains a .gif movie that depicts the dynamic evolutions of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e376.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e377.jpg during the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e378.jpg day period. We have obtained these and all subsequent results by exclusively using the basic IE method.
Figure 2
Figure 2
A snapshot of the calculated probability mass function.
In Figure 3, we depict the dynamic profiles of the mean numbers of susceptible, infected and recovered pupils (solid green lines) as well as the dynamic profiles of the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e380.jpg standard deviations (dashed red lines), computed directly from the joint probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e381.jpg. We also depict the observed data (blue circles) obtained from the literature [20]. These results are identical to the results obtained by Monte Carlo estimation based on An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e382.jpg trajectories sampled from the master equation using the Gillespie algorithm (only data related to the infected pupils are shown), and assures that the IE method produces the correct results. Unfortunately, we cannot employ the Gillespie algorithm to accurately estimate the joint probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e383.jpg in a reasonable time, due to the prohibitively large number of samples required by this method.
Figure 3
Figure 3
Dynamic mean and standard deviation profiles.
The bimodal nature of the probability mass function depicted in Figure 2 clearly demonstrates that the system-size expansion method used previously in [1] is not appropriate for this model, since the method leads to a unimodal Gaussian approximation. As a matter of fact, the results depicted in Figure 3 are different than the mean and standard deviation profiles depicted in Figures 34 of [1]. Because of the Gaussian nature of the system-size expansion method, the results reported in [1] over-estimate the means and under-estimate the standard deviations, since this technique is blind to the bimodal nature of the probability distribution. As a matter of fact, using the means and standard deviations to characterize the stochastic properties of individual classes in the SIR model is not appropriate. This is also evident by the fact that the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e385.jpg standard deviations can take negative values as well as values greater than An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e386.jpg. In Figure 3, we have truncated these misleading values.
Figure 4
Figure 4
Dynamic properties of the SIR model.
We can use the calculated joint probability mass functions An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e392.jpg to study a number of dynamic properties of the SIR model in a stochastic setting. In Figure 4(a), for example, we depict the evolution of the expected number of recovered pupils (solid green line), as well as the An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e393.jpg standard deviations (dashed red lines), given that at least one pupil is always infected. During the first few days, few infections occur, and the expected number of recovered pupils will almost be zero. Subsequently, this number increases monotonically to An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e394.jpg, following a near sigmoidal curve. The An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e395.jpg standard deviation curves and the evolution of the Fano factor (variance/mean) depicted in Figure 4(b) indicate that there is appreciable fluctuation in the number of recovered pupils during days 3–10, after which most pupils recover from the infection. According to the results depicted in Figure 4(b), the maximum fluctuation in the number of recovered pupils occurs during the 6th day.
In Figure 4(c), we depict the dynamic evolution of the calculated probability of extinction An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e396.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e397.jpg, during a period of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e398.jpg days. This evolution is characterized by four phases. During phase I (days 1–4), the probability of extinction increases rapidly from An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e399.jpg to about An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e400.jpg, due to the small number of infectious pupils. During phase II (days 5–17), the probability remains relatively constant to about An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e401.jpg. During this period of time, the epidemic takes its natural course, increasingly infecting susceptible individuals, who eventually recover from the disease. As a consequence, we do not expect the probability of extinction to increase during this phase. On the other hand, during phase III (days 18–40), the number of infected pupils monotonically decreases to zero. It is therefore expected that, during this phase, the probability of extinction will monotonically increase to its maximum value of one. Finally, during phase IV (days 40–50), there is no infectious pupils present. As a result, the influenza virus cannot be transmitted to the remaining susceptible pupils and the epidemic ceases to exist.
When studying an epidemic model with extinction, a task of practical interest is to calculate the number of individuals that escape infection. This is usually done by evaluating the expected number An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e402.jpg of individuals that escape infection (or the average number of susceptible individuals that remain after extinction) as the mean value of the stationary probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e403.jpg [2]. In Figure 4(d), we depict the joint probability An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e404.jpg at time An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e405.jpg days, which we assume to be a very close approximation to the stationary probability mass function An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e406.jpg. By using this probability, we compute An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e407.jpg. Note however that, due to the bimodal nature of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e408.jpg, calculating An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e409.jpg is misleading. On the other hand, by using the result depicted in Figure 4(d), we can confirm that there is a An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e410.jpg chance that An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e411.jpg pupils or less, and a An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e412.jpg chance that An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e413.jpg pupils or more, escape infection. Clearly, these ‘confidence intervals’ provide a more accurate statistical assessment of the number of individuals that escape infection than An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e414.jpg. Interestingly, there is only An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e415.jpg chance that the number of pupils escaping infection is within the range An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e416.jpg, which includes the value of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e417.jpg.
Modeling the stochastic dynamics of a disease that spreads through a small and well-mixed population of individuals is an increasingly important subject of modern epidemiology. Unfortunately, even for the simplest model, calculating the underlying probability distribution is a daunting task.
In an effort to address this problem, we have introduced in this paper a new approach to numerically compute the probability mass function of a Markovian population process governed by the master equation. Implementation of this approach is feasible when the number of possible states is not prohibitively large. In this case, the proposed method can lead to exact statistical analysis – up to a desired precision – of certain stochastic epidemiological models of interest, such as the SIR epidemic model.
The method introduced in this paper is linear – both in terms of memory and computational requirements – with respect to the cardinality An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e418.jpg of the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e419.jpg of the degrees of advancement of the underlying reactions. As a consequence, the method is feasible any time An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e420.jpg is relatively small. In general, however, the cardinality of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e421.jpg may grow arbitrarily large, making implementation of the method impossible without an appropriate FSP approximation [9]. Thus, the proposed technique is only applicable to models that constrain the number of reaction events, such as the SIR epidemic model considered in this paper, or models for which the number of reaction events is sufficiently small during a time period of interest (i.e., models without ‘fast’ reactions). Moreover, due to the well-known problem of the ‘curse of dimensionality,’ An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e422.jpg grows exponentially with respect to the number of reactions An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e423.jpg. Hence, models with many reactions cannot be solved by the proposed method.
An effort is currently underway to reduce the size of the sample space An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e424.jpg, without compromising accuracy. A plausible way to accomplish this goal is to reduce the number of reactions involved by removing ‘fast’ reactions using a multi-scale approximation technique, such as one of the techniques introduced for biochemical reaction systems [13], [14], [22], and to adaptively update An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e425.jpg at each time point An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e426.jpg by confining it to the smallest possible subspace An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e427.jpg of An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e428.jpg. Because of the lower-triangular and sparse nature of matrix An external file that holds a picture, illustration, etc.
Object name is pone.0036160.e429.jpg in (8), it is also plausible that we employ optimized algorithms developed for solving sparse triangular systems of linear equations on parallel and distributed memory computer architectures [23], indicating that future efforts towards solving the master equation could potentially focus on using high-performance computing systems.
Finally, it was brought to our attention by one of the reviewers that, in an earlier work, K. N. Crank proposed a method to map a general Markovian population process on a countable sample space to an augmented Markovian process with triangular generator matrix [24] by appropriately ordering that space. Crank's technique can be easily combined with our implicit Euler method to construct an alternative algorithm for numerically solving the master equation with focus on the population process instead of the DA process. However, we cannot find any advantage of using Crank's approach over ours. We believe that an approach for numerically solving the master equation based on the DA process is more preferable than an approach based on the population process. The former can provide the probability distributions of both the DA and population processes, whereas, the latter can only produce the probability distribution of the population process. Moreover, the IE method based on the DA process is easier to implement, due primarily to a faster and more natural implementation of the lexicographic ordering used by this approach as opposed to the more complex ordering of the population sample space proposed by Crank. For more details on this issue, see our discussion in Supporting Information S2.
Supporting Information S1
This file contains the MATLAB code used to generate the results presented in the paper.
(ZIP)
Supporting Information S2
This file contains additional information and proofs that elucidate various mathematical and numerical aspects of the work presented in the paper. It also provides a brief discussion of an alternative method for solving the master equation using the implicit Euler method, based on ordering the population sample space instead of the DA sample space.
(PDF)
Supporting Information S3
This file contains a video of the dynamic evolution of the joint conditional probability mass function of susceptible and infected pupils in an influenza epidemic predicted by the SIR model.
(GIF)
Acknowledgments
The authors would like to thank Prof. Hong Qian for his feedback on an earlier version of this paper, as well as the anonymous reviewers for their constructive comments and valuable suggestions.
Footnotes
Competing Interests: The authors have declared that no competing interests exist.
Funding: This research was supported in part by the DoD High Performance Computing Modernization Program through the National Defense Science and Engineering Graduate (NDSEG) Fellowship 32 CFR 168a, and in part by the National Science Foundation (NSF) Grant CCF-0849907. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
1. Chen WY, Bokka S. Stochastic modeling of nonlinear epidemiology. J Theor Biol. 2005;234:455–470. [PubMed]
2. Keeling MJ, Ross JV. On methods for studying stochastic disease dynamics. J R Soc Interface. 2008;5:171–181. [PMC free article] [PubMed]
3. Youssef M, Scoglio C. An individual-based approach to SIR epidemics in contact networks. J Theor Biol. 2011;283:136–144. [PubMed]
4. Gillespie DT. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J Comput Phys. 1976;22:403–434.
5. van Kampen NG. Stochastic Processes in Physics and Chemistry. Amsterdam: Elsevier, 3rd edition; 2007.
6. Black AJ, McKane AJ. Stochasticity in staged models of epidemics: quantifying the dynamics of whooping cough. J R Soc Interface. 2010;7:1219–1227. [PMC free article] [PubMed]
7. Sidje RB. Expokit: A software package for computing matrix exponentials. ACM T Math Software. 1998;24:130–156.
8. Sidje RB, Stewart WJ. A numerical study of large sparse matrix exponentials arizing in Markov chains. Comput Stat Data An. 1999;29:345–368.
9. Munsky B, Khammash M. The finite projection algorithm for the solution of the chemical master equation. J Chem Phys. 2006;124:044104. [PubMed]
10. Keeling MJ, Ross JV. Efficient methods for studying stochastic disease and population dynamics. Theor Popul Biol. 2009;75:133–141. [PubMed]
11. Hethcote HW. The mathematics of infectious diseases. SIAM Rev. 2000;42:599–653.
12. Gillespie DT. The chemical Langevin equation. J Chem Phys. 2000;113:297–306.
13. Haseltine EL, Rawlings JB. Approximate simulation of coupled fast and slow reactions for stochastic chemical kinetics. J Chem Phys. 2002;117:6959–6969.
14. Goutsias J. Quasiequilibrim approximation of fast reaction kinetics in stochastic biochemical systems. J Chem Phys. 2005;122:184102. [PubMed]
15. Press WH, Teukolsky SA, Vetterling WT, Flannery BP. Numerical Recipes: The Art of Scientific Computing. New York, NY: Cambridge University Press, 3rd edition; 2007.
16. Fox GC, Williams RD, Messina PC. Parallel computing works! San Francisco, CA: Morgan Kaufmann Publishers, Inc; 1994.
17. Moler C, Van Loan C. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev. 2003;45:3–49.
18. Philippe B, Sidje RB. Transient solutions of Markov processes by Krylov subspaces. In: Stewart W, editor. 2nd International Workshop on the Numerical Solution of Markov Chains. Raleigh, NC: Kluwer Academic Publishers; 1995. pp. 95–119.
19. Faragó I, Havasi Á, Zlatev Z. Efficient implementation of stable Richardson extrapolation algorithms. Comput Math Appl. 2010;60:2309–2325.
20. CDSC & CDU. Influenza in a boarding school. Brit Med J. 1978;1:587.
21. Murray JD. Mathematical Biology. I. An Introduction. New York, NY: Springer-Verlag, 3rd edition; 2001.
22. Haseltine EL, Rawlings JB. On the origins of approximations for stochastic chemical kinetics. J Chem Phys. 2005;123:164115. [PubMed]
23. Mayer J. Parallel algorithms for solving linear systems with sparse triangular matrices. Computing. 2009;86:291–312.
24. Crank KN. A method for approximating the probability functions of a Markov chain. J Appl Prob. 1988;25:808–814.
Articles from PLoS ONE are provided here courtesy of
Public Library of Science