To demonstrate the efficacy of our method, we tackle the problem of modeling a well-documented 1978 influenza epidemic in an English boarding school 
. A deterministic SIR model was originally developed to analyze these data 
. Subsequently, the model was extended to the stochastic case and approximately solved using van Kampen's system-size expansion method 
. In the following, we use the IE method to compute the exact
solution of the underlying master equation up to a desired precision.
There are three classes of individuals,
susceptible, infected and recovered pupils. Spreading of the epidemic is governed by the reactions in (1) with propensity functions
/day are the rate constants of infection and recovery, respectively 
. The initial conditions are given by
reflecting the fact that only one pupil is infected at the start of the epidemic. We take the sample space
to be the rectangular region in the
plane that begins at
and extends to include the maximal point
. This is due to the fact that the first reaction can occur at most
times, after which all pupils will have been infected, whereas, the second reaction can occur at most
times, after which all pupils will have recovered from the infection. As a consequence, the sample space
Numerically solving the master equation over a period of
days by means of the KSA method using Expokit 
took 72 minutes of CPU time on a 2.20 GHz Intel Mobile Core 2 Duo T7500 processor running Matlab 7.7. The resulting solution produces an L2 error
is a solution of the master equation obtained by a stringent run of Expokit, which we consider to be the ‘true’ solution. The required
value (used to determine a desired error tolerance for the KSA method and for the RIE method with variable step-size) was set to
. We obtained the Expokit solution by using a Krylov subspace approximation with dimension
. This value was determined by starting with the default value of
and successively increasing it by
until the resulting Expokit error estimate was less than
. The reported L2 errors were calculated using a solution obtained by a computationally more expensive Expokit run with
, which we consider it to be the ‘true’ solution. This is based on the premise that Expokit will produce the true solution for sufficiently large
To be compatible with Expokit, we report here the L2 error. Note however that the error analysis of our method, provided in the Supporting Information S2
, is based on the L1 error. On the other hand, using equation (8) with
days, the IE method took a mere 53 seconds of CPU time, achieving a smaller (by a factor of 2.8) final L2 error of
. We can achieve a further reduction of the L2 error by using the RIE method with fixed step-size. This is clear from the results summarized in . This performance can be achieved however at the expense of increasing the CPU time required to calculate the solution. Note that we may be able to decrease the CPU time by using the RIE method with variable step-size (see ). This method however results in a noticeable decrease of accuracy (at least for the example considered here), with an L2 error that is 2.8 times larger than the one obtained with the KSA method.
The L2 error and the CPU time associated with the four numerical methods discussed in this paper.
, it suffices to focus on the joint probability mass function
of susceptible and infected pupils. It turns out however that the epidemic-free state occurs with high probability
, a situation that visually obscures the values of
. For this reason, instead of
, we depict in a snapshot of the calculated joint conditional probability mass function
of the susceptible and infected pupils at the end of the 6th day, given that at least one pupil is infected. The Supporting Information S3
contains a .gif movie that depicts the dynamic evolutions of
day period. We have obtained these and all subsequent results by exclusively using the basic IE method.
A snapshot of the calculated probability mass function.
In , we depict the dynamic profiles of the mean numbers of susceptible, infected and recovered pupils (solid green lines) as well as the dynamic profiles of the
standard deviations (dashed red lines), computed directly from the joint probability mass function
. We also depict the observed data (blue circles) obtained from the literature 
. These results are identical to the results obtained by Monte Carlo estimation based on
trajectories sampled from the master equation using the Gillespie algorithm (only data related to the infected pupils are shown), and assures that the IE method produces the correct results. Unfortunately, we cannot employ the Gillespie algorithm to accurately estimate the joint probability mass function
in a reasonable time, due to the prohibitively large number of samples required by this method.
Dynamic mean and standard deviation profiles.
The bimodal nature of the probability mass function depicted in clearly demonstrates that the system-size expansion method used previously in 
is not appropriate for this model, since the method leads to a unimodal Gaussian approximation. As a matter of fact, the results depicted in are different than the mean and standard deviation profiles depicted in – of 
. Because of the Gaussian nature of the system-size expansion method, the results reported in 
over-estimate the means and under-estimate the standard deviations, since this technique is blind to the bimodal nature of the probability distribution. As a matter of fact, using the means and standard deviations to characterize the stochastic properties of individual classes in the SIR model is not appropriate. This is also evident by the fact that the
standard deviations can take negative values as well as values greater than
. In , we have truncated these misleading values.
Dynamic properties of the SIR model.
We can use the calculated joint probability mass functions
to study a number of dynamic properties of the SIR model in a stochastic setting. In , for example, we depict the evolution of the expected number of recovered pupils (solid green line), as well as the
standard deviations (dashed red lines), given that at least one pupil is always infected. During the first few days, few infections occur, and the expected number of recovered pupils will almost be zero. Subsequently, this number increases monotonically to
, following a near sigmoidal curve. The
standard deviation curves and the evolution of the Fano factor (variance/mean) depicted in indicate that there is appreciable fluctuation in the number of recovered pupils during days 3–10, after which most pupils recover from the infection. According to the results depicted in , the maximum fluctuation in the number of recovered pupils occurs during the 6th day.
In , we depict the dynamic evolution of the calculated probability of extinction
, during a period of
days. This evolution is characterized by four phases. During phase I (days 1–4), the probability of extinction increases rapidly from
, due to the small number of infectious pupils. During phase II (days 5–17), the probability remains relatively constant to about
. During this period of time, the epidemic takes its natural course, increasingly infecting susceptible individuals, who eventually recover from the disease. As a consequence, we do not expect the probability of extinction to increase during this phase. On the other hand, during phase III (days 18–40), the number of infected pupils monotonically decreases to zero. It is therefore expected that, during this phase, the probability of extinction will monotonically increase to its maximum value of one. Finally, during phase IV (days 40–50), there is no infectious pupils present. As a result, the influenza virus cannot be transmitted to the remaining susceptible pupils and the epidemic ceases to exist.
When studying an epidemic model with extinction, a task of practical interest is to calculate the number of individuals that escape infection. This is usually done by evaluating the expected number
of individuals that escape infection (or the average number of susceptible individuals that remain after extinction) as the mean value of the stationary probability mass function
. In , we depict the joint probability
days, which we assume to be a very close approximation to the stationary probability mass function
. By using this probability, we compute
. Note however that, due to the bimodal nature of
is misleading. On the other hand, by using the result depicted in , we can confirm that there is a
pupils or less, and a
pupils or more, escape infection. Clearly, these ‘confidence intervals’ provide a more accurate statistical assessment of the number of individuals that escape infection than
. Interestingly, there is only
chance that the number of pupils escaping infection is within the range
, which includes the value of