To evaluate the performance of the methods from Section 2, we ran four series of simulations: mass-action models with exponential and Weibull contact intervals and network-based models with exponential and Weibull contact intervals. All models had exponential infectious periods with mean one and no latent period. For each model, we used data from the first $m = 1000$ infections in a population of size $n = 100000$. For each infected person $j$, we recorded the infection time $t_j$ and the recovery time $t_j + \iota_j$. In network-based models, the degree $d_j$ and the indices of all neighbors of $j$ were also recorded. All outbreaks started with a single imported infection at time $t = 0$. Outbreaks that terminated with a final size less than 1000 were discarded; if a model ran 100 times without producing an epidemic of size at least 1000, it was discarded and another model was generated. In all models, $\ln R_0$ was sampled from a uniform distribution on $(\ln 1.05, 2)$, corresponding to $R_0$ between 1.05 and 7.39, and the rate parameter of the contact interval distribution was chosen to achieve the sampled $R_0$. All network-based models had undirected Erdős–Rényi contact networks, which have a Poisson degree distribution (Newman and others, 2006). A new contact network was generated for each simulation, with the log of its mean degree sampled uniformly from $(\ln R_0 + 0.1, 3)$.
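The parameter draws described above can be sketched in a few lines of Python. This is only an illustration, not the authors' supplementary code: it assumes that the lower bound of the uniform range for $\ln R_0$ is $\ln 1.05$ and that the quantity sampled on $(\ln R_0 + 0.1, 3)$ is the log of the mean degree. The function names are hypothetical, and the network is built with a plain $G(n, p)$ loop on a small population rather than with NetworkX.

```python
import math
import random

def sample_model_parameters(rng):
    """Draw R0 and a mean degree on the log scale (assumed ranges, see comments)."""
    # Assumption: ln R0 ~ Uniform(ln 1.05, 2), so R0 lies between 1.05 and e^2.
    log_r0 = rng.uniform(math.log(1.05), 2.0)
    # Assumption: the log of the mean degree ~ Uniform(ln R0 + 0.1, 3),
    # which keeps the mean degree above R0.
    log_mean_degree = rng.uniform(log_r0 + 0.1, 3.0)
    return math.exp(log_r0), math.exp(log_mean_degree)

def erdos_renyi_edges(n, mean_degree, rng):
    """Edge list of an undirected G(n, p) graph with p = mean_degree / (n - 1)."""
    p = mean_degree / (n - 1)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

rng = random.Random(1)
r0, mean_degree = sample_model_parameters(rng)
# A small population keeps the quadratic loop fast; the paper uses n = 100000.
edges = erdos_renyi_edges(500, mean_degree, rng)
observed_mean_degree = 2 * len(edges) / 500
```

Each edge contributes to the degree of two nodes, so the observed mean degree is twice the edge count divided by the number of nodes.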
The true contact interval distribution in each model was either exponential or Weibull. The exponential distribution has the hazard function $\lambda(\tau; \beta) = \beta$ for $\tau > 0$, where $\beta > 0$ is the rate parameter. The Weibull distribution has the hazard function $\lambda(\tau; \alpha, \beta) = \alpha\beta(\beta\tau)^{\alpha - 1}$ for $\tau > 0$, where $\alpha > 0$ is the shape parameter and $\beta > 0$ is the rate parameter. For models with Weibull contact interval distributions, $\ln\alpha$ was sampled from a uniform distribution on $(-1.25, 1.25)$, corresponding to $\alpha$ between 0.29 and 3.49. For each model, analyses were performed assuming exponential, Weibull, and gamma contact interval distributions. The gamma distribution has the hazard function
$$\lambda(\tau; \beta, \kappa) = \frac{\beta^{\kappa}\tau^{\kappa - 1}e^{-\beta\tau}}{\Gamma(\kappa, \beta\tau)},$$
where $\beta$ is the rate parameter, $\kappa$ is the shape parameter, and $\Gamma(a, b)$ is the upper incomplete gamma function. The exponential distribution is a Weibull distribution with $\alpha = 1$ and a gamma distribution with $\kappa = 1$. Using a gamma distribution allowed us to examine the effect of fitting a richer parametric model to exponential contact intervals and fitting a misspecified parametric model to Weibull contact intervals.
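As a small illustration of the exponential and Weibull hazards defined above, the following Python sketch evaluates them and draws Weibull contact intervals. The inverse-transform sampler is an assumption about how such intervals could be generated, not a description of the simulation code, and all function names are hypothetical. The gamma hazard is omitted because the standard library has no incomplete gamma function.

```python
import math
import random

def exponential_hazard(tau, beta):
    """Hazard of an exponential contact interval: constant at the rate beta."""
    return beta

def weibull_hazard(tau, alpha, beta):
    """Weibull hazard alpha * beta * (beta * tau)**(alpha - 1) for tau > 0."""
    return alpha * beta * (beta * tau) ** (alpha - 1.0)

def weibull_cumulative_hazard(tau, alpha, beta):
    """Integral of the Weibull hazard from 0 to tau, equal to (beta * tau)**alpha."""
    return (beta * tau) ** alpha

def sample_weibull_contact_interval(alpha, beta, rng):
    """Inverse-transform draw: solve (beta * tau)**alpha = -ln(U) for tau."""
    u = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    return (-math.log(u)) ** (1.0 / alpha) / beta

# With alpha = 1 the Weibull hazard reduces to the exponential hazard.
same = math.isclose(weibull_hazard(0.7, 1.0, 2.5), exponential_hazard(0.7, 2.5))
rng = random.Random(0)
intervals = [sample_weibull_contact_interval(2.0, 0.8, rng) for _ in range(5)]
```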
All analyses assumed that who-infected-whom was not observed. For network-based models, we used the corresponding log-likelihood. For mass-action models, we used the asymptotic log-likelihood from (2.16). When information about relationships between infected persons is missing, it is common to analyze the data using a mass-action model (Lipsitch and others, 2003; Mills; Wallinga and Teunis; Fraser; Yang). To examine the possible effects of this practice, we analyzed data generated by network-based models using the asymptotic log-likelihood for a mass-action model, ignoring all information about the links between infected persons and about uninfected neighbors of infected persons.
Simulations were implemented in Python 2.6 (www.python.org) using SciPy 0.7 (Jones and others, 2001–2009). Contact networks were generated using NetworkX 1.0 (Hagberg and others, 2008). Analyses were performed in R 2.10 (R Development Core Team, 2009) via RPy2 2.0 (Moreira and Warnes, 2002–2009). Maximum likelihood estimates and confidence intervals were calculated on the log scale using the “mle” and “confint” functions in R. Bias-corrected bootstrap confidence intervals were calculated for $R_0$. Multivariate normal samples were obtained using the Cholesky decomposition of the covariance matrix (Rizzo, 2008). Code for the models and estimates is provided as online supplementary material.
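The Cholesky approach to multivariate normal sampling mentioned above can be sketched as follows. This is a minimal pure-Python version under stated assumptions, not the R code used for the analyses: if the covariance matrix factors as $LL^{T}$ with $L$ lower triangular and $z$ has independent standard normal entries, then the mean vector plus $Lz$ has the desired distribution. The function names and the example covariance matrix are hypothetical.

```python
import math
import random

def cholesky(cov):
    """Lower-triangular L with L L^T = cov, for a symmetric positive definite cov."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def sample_multivariate_normal(mean, cov, rng):
    """One draw computed as mean + L z, where z has independent N(0, 1) entries."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)]

# Hypothetical estimates of (ln alpha, ln beta) and their covariance matrix.
rng = random.Random(0)
log_alpha_star, log_beta_star = sample_multivariate_normal(
    [0.2, 0.5], [[0.010, 0.002], [0.002, 0.004]], rng)
```

Sampling on the log scale and exponentiating keeps the resampled shape and rate parameters positive.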
3.1. Estimates of R0
Let $\iota_k$ denote the infectious period of the $k$th infection observed. For mass-action models with exponential contact intervals, $R_0 = \beta$. Our point estimate of $R_0$ is computed from the rate parameter maximum likelihood estimate (MLE) $\hat{\beta}$. A bootstrap replicate of $R_0$ is computed from $\beta^*$ and $\iota_1^*, \ldots, \iota_m^*$, where $\ln\beta^*$ is a sample from the approximate normal distribution of $\ln\hat{\beta}$ and $\iota_1^*, \ldots, \iota_m^*$ is a bootstrap sample from the observed $\iota_1, \ldots, \iota_m$. For mass-action models with Weibull contact intervals, $R_0 = \beta^{\alpha}\Gamma(\alpha + 1)$. Our point estimate of $R_0$ is computed from the shape parameter MLE $\hat{\alpha}$ and the rate parameter MLE $\hat{\beta}$. A bootstrap replicate of $R_0$ is computed from $\alpha^*$ and $\beta^*$, where $(\ln\alpha^*, \ln\beta^*)$ is a sample from the approximate joint normal distribution of $(\ln\hat{\alpha}, \ln\hat{\beta})$.
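The bootstrap procedure for mass-action models with exponential contact intervals might be sketched as follows. The plug-in form used here, the average of $(\beta\iota_k)^{\alpha}$ over the observed infectious periods, is an assumption chosen because its expectation is $\beta$ when $\alpha = 1$ and $\beta^{\alpha}\Gamma(\alpha + 1)$ for Weibull contact intervals when infectious periods are exponential with mean one; it is not taken from the text. The function names, the standard error, and the data are hypothetical, and the interval is Efron's bias-corrected percentile interval.

```python
import math
import random
from statistics import NormalDist

def r0_plug_in(alpha, beta, infectious_periods):
    """Assumed plug-in form: mean of (beta * iota)**alpha over the periods."""
    return sum((beta * iota) ** alpha for iota in infectious_periods) / len(infectious_periods)

def bootstrap_r0(log_beta_hat, se_log_beta, infectious_periods, n_boot, rng, alpha=1.0):
    """R0 replicates: resample ln(beta) from a normal and the periods with replacement."""
    m = len(infectious_periods)
    replicates = []
    for _ in range(n_boot):
        beta_star = math.exp(rng.gauss(log_beta_hat, se_log_beta))
        iota_star = [rng.choice(infectious_periods) for _ in range(m)]
        replicates.append(r0_plug_in(alpha, beta_star, iota_star))
    return replicates

def bias_corrected_interval(estimate, replicates, level=0.95):
    """Bias-corrected percentile interval (no acceleration term)."""
    std = NormalDist()
    reps = sorted(replicates)
    n = len(reps)
    below = sum(r < estimate for r in reps) / n
    below = min(max(below, 1.0 / n), 1.0 - 1.0 / n)  # keeps inv_cdf finite
    z0 = std.inv_cdf(below)
    tail = (1.0 - level) / 2.0
    lo_p = std.cdf(2.0 * z0 + std.inv_cdf(tail))
    hi_p = std.cdf(2.0 * z0 + std.inv_cdf(1.0 - tail))

    def pick(p):
        return reps[min(n - 1, max(0, int(p * n)))]

    return pick(lo_p), pick(hi_p)

# Hypothetical data: 1000 infectious periods, beta-hat = 2, standard error 0.05 for ln(beta-hat).
rng = random.Random(0)
periods = [rng.expovariate(1.0) for _ in range(1000)]
estimate = r0_plug_in(1.0, 2.0, periods)
replicates = bootstrap_r0(math.log(2.0), 0.05, periods, 200, rng)
lower, upper = bias_corrected_interval(estimate, replicates)
```

For Weibull contact intervals the same structure would apply with $(\ln\alpha^*, \ln\beta^*)$ drawn jointly from a bivariate normal distribution instead of a single normal draw for $\ln\beta^*$.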
In a contact network with $n$ nodes, $R_0$ depends on the expected number of edges across which a nonimported infection in the early stages of an epidemic can transmit infection (Andersson, 1998; Newman; Kenah). Let $\iota_k$ and $d_k$ denote the infectious period and degree, respectively, of the $k$th infection observed. When the contact interval has an exponential distribution, the point estimate of $R_0$ uses the rate parameter MLE $\hat{\beta}$ together with the observed infectious periods and degrees. A bootstrap replicate of $R_0$ is computed from $\beta^*$ and $(\iota_1^*, d_1^*), \ldots, (\iota_m^*, d_m^*)$, where $\ln\beta^*$ is a sample from the approximate normal distribution of $\ln\hat{\beta}$ and $(\iota_1^*, d_1^*), \ldots, (\iota_m^*, d_m^*)$ is a bootstrap sample from $(\iota_1, d_1), \ldots, (\iota_m, d_m)$. When the contact interval has a Weibull distribution, a bootstrap replicate of $R_0$ is computed in the same way from $\alpha^*$ and $\beta^*$, where $(\ln\alpha^*, \ln\beta^*)$ is a sample from the approximate joint normal distribution of $(\ln\hat{\alpha}, \ln\hat{\beta})$. We assume that infected persons can report their degree in the contact network, so these estimates of $R_0$ do not depend on any prior knowledge of the contact network on the part of the investigators.
3.2. Results
For mass-action models with exponential contact intervals, Figure 1 shows scatterplots of the estimated versus true $R_0$ and Table 1, panel A, shows 95% confidence interval coverage probabilities. For analyses assuming an exponential contact interval, $R_0$ estimates are close to the truth and coverage probabilities for $R_0$ and $\beta$ are excellent. Analyses assuming a Weibull or gamma distribution produce $R_0$ estimates that are biased upward and right skewed, particularly for the Weibull distribution. Confidence interval coverage probabilities for $R_0$ are above 80% but well below 95%. On the bright side, both analyses have a greater than 90% chance of failing to reject the null hypothesis that the contact interval distribution is exponential at the 5% level of significance. Performance of the gamma and Weibull analyses improves at lower $R_0$.
Table 1. Coverage probabilities and exact binomial 95% confidence intervals in simulations
For mass-action models with Weibull contact intervals, the corresponding figure shows scatterplots of the estimated versus true $R_0$ and Table 1, panel B, shows 95% confidence interval coverage probability estimates. Analyses assuming a Weibull contact interval distribution produce estimates that are right skewed but nearly unbiased, with confidence interval coverage probabilities only slightly below 95%. Analyses assuming an exponential or gamma contact interval distribution produce estimates that are biased upward and right skewed, but those assuming a gamma distribution perform far better than those assuming an exponential distribution. Performance of both misspecified analyses improves at lower $R_0$.
For network-based models with exponential contact intervals, the corresponding figure shows scatterplots of the estimated versus true $R_0$ and Table 1, panel C, shows 95% confidence interval coverage probabilities. Analyses assuming a network-based model with exponential contact intervals produce excellent point and interval estimates. The cost of relaxing this assumption to allow Weibull or gamma contact intervals is far less than that seen for mass-action models in Figure 1 and Table 1, panel A. $R_0$ estimates remain close to the truth, and all confidence intervals have coverage probabilities near 95%. Ignoring information about neighbors of infected persons and assuming a mass-action model with exponential contact intervals is far less successful, though performance improves at lower $R_0$.
For network-based models with Weibull contact intervals, the corresponding figure shows scatterplots of the estimated versus true $R_0$ and Table 1, panel D, shows 95% confidence interval coverage probabilities. Analyses assuming a network-based model with Weibull contact intervals produce excellent point and interval estimates. Assuming a gamma contact interval distribution produces $R_0$ estimates that are biased upward and confidence interval coverage probabilities well below 95%. Assuming an exponential contact interval distribution produces $R_0$ estimates that are biased upward and confidence intervals that are much too narrow. Ignoring information about neighbors of infected persons and assuming a mass-action model with Weibull contact intervals produces estimates that are strongly biased upward, although confidence interval coverage probabilities are superior to those of a network-based analysis assuming exponential contact intervals. The performance of all misspecified analyses improves at lower $R_0$.
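The coverage probabilities in Table 1 are reported with exact binomial 95% confidence intervals. A minimal sketch of such an interval, assuming the usual Clopper-Pearson construction and solving for the bounds by bisection, is given below. The function names and the example counts are hypothetical, and the direct summation of binomial terms is only suitable for a moderate number of simulations.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))

def exact_binomial_interval(k, n, level=0.95):
    """Clopper-Pearson interval for a proportion with k successes in n trials."""
    tail = (1.0 - level) / 2.0

    def bisect(f):
        # f is increasing on [0, 1]; find the argument where f equals tail.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2.0
            if f(mid) < tail:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower bound: P(X >= k | p) = tail, and P(X >= k | p) increases with p.
    lower = 0.0 if k == 0 else bisect(lambda p: 1.0 - binom_cdf(k - 1, n, p))
    # Upper bound: P(X <= k | p) = tail; written in q = 1 - p so that it increases.
    upper = 1.0 if k == n else 1.0 - bisect(lambda q: binom_cdf(k, n, 1.0 - q))
    return lower, upper

# Hypothetical counts: 188 of 200 simulated confidence intervals covered the true R0.
coverage = 188 / 200
coverage_lower, coverage_upper = exact_binomial_interval(188, 200)
```

If the resulting interval excludes 0.95, the nominal coverage is not supported by the simulations at that sample size.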