We propose a new epidemiological method for assessing the virulence of an emerging infectious disease at the early stage of an epidemic. The results with the Hong Kong SARS dataset demonstrate the usefulness of this method, which corrects the biased cCFR estimator, defined simply as the ratio of cumulative deaths to cumulative cases. Even early in the epidemic, the ultimately realized cCFR lies within the confidence interval obtained by our method. The proposed method is particularly useful when an epidemic curve of confirmed cases is the only data available (i.e. when individual data on the time from onset to death are not available, as is typical during the early stage of an epidemic).
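The contrast between the biased ratio and a delay-adjusted estimate can be illustrated with a minimal sketch. It assumes, for illustration only, that the adjustment divides the biased ratio by the fraction of confirmed cases whose outcome would already be known, computed from a discrete onset-to-death distribution. The function names, the toy epidemic curve and the delay distribution are hypothetical and are not taken from the study data.

```python
from itertools import accumulate


def biased_ccfr(cases_by_onset, total_deaths):
    """Crude ratio of cumulative deaths to cumulative confirmed cases."""
    return total_deaths / sum(cases_by_onset)


def known_outcome_fraction(cases_by_onset, delay_pmf):
    """Fraction of cases that would have died by the last observed day,
    had they been destined to die.

    cases_by_onset[k] : confirmed cases with symptom onset on day k
    delay_pmf[j]      : assumed P(death occurs j days after onset | death)
    """
    cdf = list(accumulate(delay_pmf))  # F(s), assumed known
    last_day = len(cases_by_onset) - 1
    resolved = 0.0
    for onset_day, count in enumerate(cases_by_onset):
        elapsed = last_day - onset_day
        # Beyond the support of the delay distribution, F(s) stays at its maximum.
        f_elapsed = cdf[elapsed] if elapsed < len(cdf) else cdf[-1]
        resolved += count * f_elapsed
    return resolved / sum(cases_by_onset)


def adjusted_ccfr(cases_by_onset, total_deaths, delay_pmf):
    """Biased ratio divided by the known-outcome fraction (illustrative)."""
    u = known_outcome_fraction(cases_by_onset, delay_pmf)
    if u <= 0.0:
        raise ValueError("no case has been observed long enough to adjust")
    return biased_ccfr(cases_by_onset, total_deaths) / u


# Toy data (hypothetical): a growing epidemic curve and one death so far.
cases = [1, 2, 4, 8, 16]
delay = [0.0, 0.25, 0.25, 0.25, 0.25]  # assumed onset-to-death distribution
u = known_outcome_fraction(cases, delay)
biased = biased_ccfr(cases, 1)
adjusted = adjusted_ccfr(cases, 1, delay)
print(f"biased={biased:.4f}  known-outcome fraction={u:.4f}  adjusted={adjusted:.4f}")
```

In a growing epidemic most cases are recent, so the known-outcome fraction is well below one and the adjusted value exceeds the crude ratio, which is the direction of bias discussed here.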
Our estimates suggest that the virulence of S-OIV H1N1 infection is comparable to the virulence observed in past influenza pandemics of the 20th century (<2.0% for the 1918–19 pandemic and <0.5% for the 1957–58 pandemic [21]). Although our estimates may not be as high as 2.0%, and even though the unbiased cCFR estimate for the USA is likely an overestimate (see below), we should emphasize that antiviral treatment and other medical interventions have been instituted from the beginning of this pandemic. Our results show that the few observed deaths in the USA and Canada give us no reason to believe that the unbiased cCFR, and therefore the virulence of the novel pandemic strain, is smaller in the USA and Canada than in Mexico. Nevertheless, given that the CFR of seasonal influenza is equal to or less than 0.1% [10], our estimates (with the lower bound of the cCFR close to 0.1%) do not conclusively indicate that S-OIV is more virulent than seasonal influenza, although they point in that direction.
It should be noted that our method adjusts only for underestimation due to the time delay from onset to death; other epidemiological characteristics associated with unbiased estimation of the cCFR have yet to be addressed. In the present study, we estimated the cCFR as the proportion of deaths among confirmed cases. This definition was chosen because we aimed to use the minimally available data. Consequently, we were not able to estimate the proportion of deaths among all symptomatic cases, nor the proportion of deaths among all those infected (symptomatic and asymptomatic). The issue of defining the correct denominator population can never be completely resolved, but it is essential to realize how the obtained estimate relates to other situations [8]. By using only confirmed cases, we clearly miss all cases that do not seek medical treatment or are not notified, as well as all asymptomatic cases. This means that our cCFR estimate is higher than the proportion of deaths among infected individuals, and may be considered an overestimate. However, when relating our estimate to previous pandemics, it should also be realized that the current pandemic is the first in which many confirmatory diagnoses of influenza have been made using RT-PCR techniques, allowing improved precision of cCFR estimates over those for previous influenza epidemics. Whereas the use of RT-PCR in the current pandemic may yield a smaller denominator (and thus an overestimate of the CFR compared to previous pandemics), other pandemics could have involved substantial numbers of false-positive cases in the denominator. Development of a method that permits comparable assessment of virulence is ongoing.
shows the time course of biased cCFR estimates in the USA and Canada based on the reporting date of confirmed cases and deaths to the World Health Organization. Note that the estimates in are different from our
b_t because the date of onset was unavailable, although they give an approximate indication of the time course of the biased cCFR. It is striking that the biased cCFR during the very early stage (i.e. from late April to mid-May) showed a declining trend following a single spike. The biased cCFR estimates at later time points show a slight increase as a function of time, which is consistent with our knowledge of underestimation of the cCFR [8]. The early spike may be explained by time-varying coverage of confirmed diagnoses, which could have increased over time (i.e. cases at the very beginning of the epidemic were less likely to be confirmed). Other plausible explanations include (1) demographic stochasticity, (2) effective treatment, and (3) heterogeneous risk of death among subpopulations. As for (1), because the number of deaths in the USA and Canada was very small during the early stage, the spike may reflect (unpredictable) probabilistic variations in the number of deaths among a small number of confirmed cases. If that is the case, our unbiased cCFR estimate for the USA (with data until May 1) may be too high, not because of a systematic bias but just by chance. In relation to (2), it is plausible that cases diagnosed in later stages of the epidemic receive treatment at an early stage of illness (or even before symptom onset). With respect to (3), the risk of dying is likely to differ among subpopulations [8], [10], [22], [23]. It should be noted that the composition of subpopulations (e.g. age groups and those with a specific underlying disease) is likely to vary as a function of time, and a cCFR estimate for the entire population, such as ours, is influenced by this variation. These points need to be addressed in future studies.
To fully clarify the virulence and its epidemiological characteristics (e.g. variable risks by age and underlying disease), two lessons for surveillance and data sharing should be noted. First, rather than updating the data by date of reporting, it is critically important to summarize the data by date of onset at both local and global levels. Knowing the date of symptom onset is key to applying our proposed estimation framework to empirical observations. Second, epidemiological data should be updated at a sufficiently short reporting interval, at least during the early stage of an epidemic, so that the data permit estimation of the unbiased cCFR. Given that the mean time from onset to death is around 9 days, weekly data do not enable us to make our explicit adjustment. Optimal reporting for early cCFR estimation could be incorporated into official pandemic response plans. Moreover, in addition to using death as an outcome measure of virulence, the usefulness of other epidemiological measurements of severe manifestations (e.g. the number of admissions to intensive care units) needs to be explored.
Despite the need to further clarify heterogeneous risks of death for the S-OIV pandemic, early assessment of virulence by means of our unbiased cCFR estimator is useful for informing policy makers and the general public about the potential severity of an infectious disease (of course, one needs to ensure that non-experts understand the above-mentioned biases). We have shown that underestimation can be adjusted for in a very simple manner: our approach yields an unbiased cCFR estimate merely by minimizing a binomial deviance. These methods are particularly useful when there have been only a few deaths, or even no deaths at all, by time t during the course of an epidemic. Uncertainties surrounding an unbiased cCFR estimate based on a few deaths can partly be addressed by a sensitivity analysis of the estimate to different lengths of time from onset to death. An observation of zero deaths in a given country (or a specific setting) should not be deemed a signature of a “benign” virus without observing a substantial number of cases. We have shown that a conservative upper bound of the cCFR is a more useful interpretation of an observed number of cases without death. In this way, provided that some prior knowledge or a few observations of death permit us to assume that F(s) is known, epidemiologists and biostatisticians in each country or locality can directly apply our method to assess the virulence of an infection at the early stage of any emerging infectious disease.
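As a rough illustration of the deviance-based estimate and of the upper bound for zero deaths, the sketch below assumes that cumulative deaths follow a binomial distribution with size equal to the cumulative number of confirmed cases and probability p·u, where p is the cCFR and u is the fraction of cases whose outcome would already be known (derived from F(s)). This likelihood, the function names and the numbers are assumptions made for the example; the exact formulation used in the study may differ.

```python
import math

CHI2_95_1DF = 3.841  # 95% quantile of the chi-square distribution, 1 d.f.


def binomial_deviance(p, deaths, cases, u):
    """Deviance of Binomial(cases, p*u) relative to the saturated model."""
    def loglik(prob):
        ll = 0.0
        if deaths > 0:
            ll += deaths * math.log(prob)
        if cases - deaths > 0:
            ll += (cases - deaths) * math.log(1.0 - prob)
        return ll
    return 2.0 * (loglik(deaths / cases) - loglik(p * u))


def ccfr_estimate(deaths, cases, u):
    """Value of p minimizing the deviance (closed form, capped at 1)."""
    return min(1.0, deaths / (cases * u))


def deviance_interval(deaths, cases, u, steps=10000):
    """Approximate 95% interval: grid values of p with deviance <= 3.841."""
    grid = (k / steps for k in range(1, steps))
    inside = [p for p in grid
              if binomial_deviance(p, deaths, cases, u) <= CHI2_95_1DF]
    return min(inside), max(inside)


def zero_death_upper_bound(cases, u, alpha=0.05):
    """Largest p still compatible (at level alpha) with observing no deaths:
    solves (1 - p*u)**cases = alpha for p, capped at 1."""
    return min(1.0, (1.0 - alpha ** (1.0 / cases)) / u)


# Hypothetical numbers: 100 confirmed cases, half with a known outcome.
p_hat = ccfr_estimate(deaths=5, cases=100, u=0.5)
lower, upper = deviance_interval(deaths=5, cases=100, u=0.5)
bound = zero_death_upper_bound(cases=100, u=0.5)
print(f"estimate={p_hat:.3f}  interval=({lower:.3f}, {upper:.3f})")
print(f"upper bound with zero deaths={bound:.3f}")
```

Repeating the calculation with values of u derived from different assumed onset-to-death distributions gives a simple form of the sensitivity analysis mentioned above.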
During the final stages of revision, it came to our attention that an epidemiological study on the cCFR of S-OIV, using similar techniques and statistical philosophy, has been published online [24]. That study indicates that the preliminary estimate of the cCFR for the USA, Canada and Mexico combined is 0.5%, and emphasizes the need to accurately capture the cases for the denominator.