We obtain estimates of the reproductive number and the serial interval. These estimates, along with information on population susceptibility and risk of severe disease, help to inform public health policy, such as the potential utility or success of different community mitigation strategies, and help to characterize the spread of the disease. Our estimates of the early reproductive number of novel influenza A/H1N1 in the United States are higher than those obtained in other published studies of data from the Netherlands (14) and Mexico (10). Our estimates are slightly smaller than those obtained from an initial analysis of the outbreak in Japan (15) and an alternative analysis of data from Mexico (16). There are several possible explanations for these differences. First, the prior estimates were based on a completed outbreak of a respiratory infection in La Gloria, Mexico, and on virus genetic data, whereas our study uses the early phase of the epidemic curve from the United States as a whole. Each of these data sets has various uncertainties associated with it; we have highlighted and attempted to correct for changes in reporting, reporting delays, and missing dates of onset, but these corrections will only be approximate. Indeed, all data sets for an infection with a spectrum of severity and changing ascertainment patterns will be imperfect in these ways. Second, we have used a different approach (8) from that applied to the Mexico data; the results reported here use a method focused on a period of exponential growth of the epidemic, while the prior estimates used either viral sequence coalescence estimates or, in the case of La Gloria, analysis of a whole epidemic curve, including the declining phase. Finally, our estimate of the serial interval from the data is longer than that obtained for La Gloria, though somewhat shorter than that obtained from contact tracing in Spain (9). As expected, if we assume a serial interval distribution rather than estimate it, our estimate of the reproductive number shifts to adjust, as a consequence of the relationship between these two quantities (17).
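The dependence of the reproductive number on the serial interval can be illustrated with a minimal sketch. It assumes the commonly used Euler-Lotka form of the relationship, R = 1 / sum_k p_k exp(-r k), where r is the exponential growth rate and p_k is the probability of a serial interval of k days. This form may not be exactly the one used in the cited work. It is also not the estimation method of this study. The function name, the growth rate, and both serial interval distributions are hypothetical values chosen only for illustration.

```python
import math


def reproductive_number(growth_rate, serial_interval_pmf):
    """Return R = 1 / sum_k p_k * exp(-r * k), with k in days starting at 1.

    Assumes the serial interval distribution stands in for the
    generation interval distribution (an approximation).
    """
    total = sum(serial_interval_pmf)
    if not math.isclose(total, 1.0, rel_tol=1e-9):
        raise ValueError("serial interval probabilities must sum to 1")
    discounted = sum(
        p * math.exp(-growth_rate * day)
        for day, p in enumerate(serial_interval_pmf, start=1)
    )
    return 1.0 / discounted


# Hypothetical inputs, not taken from the study.
r = 0.2                                   # growth rate per day
short_si = [0.3, 0.4, 0.2, 0.1]           # mean 2.1 days
long_si = [0.1, 0.2, 0.3, 0.25, 0.15]     # mean 3.15 days

r_short = reproductive_number(r, short_si)
r_long = reproductive_number(r, long_si)
print(f"R with shorter serial interval: {r_short:.2f}")
print(f"R with longer serial interval:  {r_long:.2f}")
```

For a fixed positive growth rate, the longer assumed serial interval yields the larger reproductive number, which is the direction of adjustment described above.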
The results presented here should be interpreted with the following caveats in mind. First, the data are not from a closed system, and clearly there are imported cases, such as individuals who acquired the illness in Mexico after March 28. Although we account for cases that are known to be imported, it is likely that our data are incomplete and that several other infections were imported. Misclassifying truly imported cases as locally acquired will bias reproductive number estimates upwards. Second, incomplete reporting is a feature of nearly all data on the novel influenza A/H1N1, and certainly of any data sets large enough to estimate temporal trends in case numbers. If underreporting were consistent over time, it would have only a minor effect on our point estimates (which depend mainly on the growth rate and on cyclical signals in the data) but would increase uncertainty around these estimates. More likely, as we have noted, there are trends in reporting, with increasing reporting as awareness grows and declining reporting as public health workers become unable to obtain and report detailed information on each case. One might argue for analyzing only the subset of cases from the time period with optimal reporting, or for looking only at hospitalizations, which might be more accurately recorded. However, in the first case, we would ignore a large number of initial cases, which would undoubtedly lead to gross errors in the estimates: all secondary cases after the first day analyzed would be attributed to that day. In the second case, by considering only hospitalizations, we would violate the assumption of a closed system and would implicitly assume that every hospitalized case is attributable to another hospitalized case. The results from such an analysis would be challenging to interpret.
Instead, we have accounted for these changes by imputation of onset dates, augmentation of data to account for reporting delays, and adjustments for an estimated upward trend in reporting of the early data. We feel that such adjustments, while still imperfect, are superior to ignoring information in incomplete data. In all analyses of such data, the statistical confidence intervals obtained should not be interpreted as measuring all of the uncertainty in estimates; additional uncertainty comes from unmeasured changes in reporting.
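The distinction between constant underreporting and a trend in reporting can be illustrated with a minimal sketch. It uses noise-free expected case counts and a simple log-linear fit of the growth rate, which is a stand-in for illustration and not the estimation method of this study. The growth rate, reporting fractions, and function name are hypothetical.

```python
import numpy as np


def fitted_growth_rate(days, counts):
    """Slope of a least-squares line through log(counts) versus day."""
    slope, _intercept = np.polyfit(days, np.log(counts), 1)
    return slope


# Hypothetical epidemic: expected counts growing exponentially, no noise.
days = np.arange(20)
true_rate = 0.15
true_cases = 10.0 * np.exp(true_rate * days)

# Scenario 1: a constant fraction of cases is reported.
reported_constant = 0.15 * true_cases

# Scenario 2: the reported fraction rises over time (hypothetical trend;
# it stays below 1 over the 20 days considered).
rising_fraction = 0.05 * np.exp(0.05 * days)
reported_rising = rising_fraction * true_cases

rate_constant = fitted_growth_rate(days, reported_constant)
rate_rising = fitted_growth_rate(days, reported_rising)
print(f"true growth rate:               {true_rate:.3f}")
print(f"fitted, constant reporting:     {rate_constant:.3f}")
print(f"fitted, increasing reporting:   {rate_rising:.3f}")
```

Under these assumptions, constant underreporting leaves the fitted growth rate unchanged, whereas an upward trend in reporting inflates it. Real counts are noisy, so constant underreporting would still widen the uncertainty around the estimate even though the point estimate is largely unaffected.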
We have also examined the impact of the assumed reporting distribution on the estimates with a sensitivity analysis. While we have estimated the rate of increase in the reporting fraction through time from our data, our estimate of the initial reporting fraction is not based on data. We have illustrated the impact of variation in these quantities on our estimates and note that, while our estimates do change as these quantities vary, the changes are not dramatic. In fact, if we assume that the initial reporting fraction is as low as 1% rather than our assumed 15%, then the estimate of the reproductive number increases from 1.75 to 1.90. The impact that the difference between these two estimates would have on policy is minimal. We also note that, under the same circumstances, the estimated mean of the serial interval changes very little (from 2.21 to 2.19), illustrating the robustness of the mean to variations in the initial reporting fraction. What these results mean is that, as fewer of the cases are reported, our estimates of the reproductive number are likely to be overly conservative if we do not properly adjust for this underreporting.
We have discussed the impact of the assumed serial interval on the estimates of the reproductive number. Assuming a particular form for the serial interval directly affects the estimates of the reproductive number. External estimates of the serial interval distribution have the advantage that they are directly observed rather than inferred from properties of the epidemic curve; on the other hand, pairs of cases with known infector and infectee are nonrepresentative of the overall pattern of transmission in a population. For our baseline results, we estimate the serial interval nonparametrically rather than imposing a shape on it. We have also incorporated previous estimates of the serial interval to test the sensitivity of our conclusions.
The difference between our low estimates (when assuming an increased reporting fraction and using the serial interval distribution of Fraser et al. (10) from La Gloria) and our high estimates (when ignoring increased reporting and using the serial interval distribution of Cowling et al. for seasonal influenza (7)) is the difference between an epidemic that is readily controlled and one that is virtually uncontrollable according to existing models of pandemic interventions (6). It is clear that more precise estimates of the serial interval in various contexts for this virus are essential to reduce the uncertainty of estimates of the reproductive number; similarly, it is essential to estimate growth rates in a variety of contexts where reporting fractions can be better understood, possibly at local levels where a single reporting system is used.
Finally, it should be remembered that neither the serial interval (20) nor the reproductive number is a constant of nature; each depends on the population, the state of control measures and behavior, and other factors. Continued monitoring of the growth of the pandemic in various settings will be required to define the range of reproductive numbers achieved by this virus and their possible dependence on geography, population, season, and changes in the virus.