Conceived and designed the experiments: JO MIC ARC VJL LGG. Performed the experiments: JO MIC. Analyzed the data: ARC HCL. Contributed reagents/materials/analysis tools: ARC HCL. Wrote the paper: JO MIC ARC HCL VJL RTPL PT LGG. Set up the surveillance network: ARC. Wrote the statistical software used to form predictions: ARC HCL. Wrote the software that automated the procedures: ARC JO. Contributed to writing the paper: ARC JO HCL. Set up the network: JO MC VL LGG. Managed the running of the project: JO. Devised the model used in predictions: ARC MC HCL. Provided medical and epidemiological insight, and contributed to writing the paper: MC VL LGG. Provided medical, epidemiological and national policy insights and contributed to writing the paper: RTPL PAT.
Reporting of influenza-like illness (ILI) from general practice/family doctor (GPFD) clinics is an accurate indicator of real-time epidemic activity and requires little effort to set up, making it suitable for developing countries currently experiencing the influenza A (H1N1-2009) pandemic or preparing for subsequent epidemic waves.
We established a network of GPFDs in Singapore. Participating GPFDs submitted returns via facsimile or e-mail on their work days using a simple, standard data collection format, capturing: gender; year of birth; “ethnicity”; residential status; body temperature (°C); and treatment (antiviral or not); for all cases with a clinical diagnosis of an acute respiratory illness (ARI). The operational definition of ILI in this study was an ARI with fever of 37.8°C or more. The data were processed daily by the study co-ordinator and fed into a stochastic model of disease dynamics, which was refitted daily using particle filtering, with data and forecasts uploaded to a website which could be publicly accessed. Twenty-three GPFD clinics agreed to participate. Data collection started on 2009-06-26 and lasted for the duration of the epidemic. The epidemic appeared to have peaked around 2009-08-03 and the ILI rates had returned to baseline levels by the time of writing.
This real-time surveillance system is able to show the progress of an epidemic and indicates when the peak is reached. The resulting information can be used to form forecasts, including how soon the epidemic wave will end and when a second wave will appear if at all.
On 2009-04-24, the World Health Organization (WHO) reported the spread of a novel influenza A (H1N1) strain in the United States and Mexico. Sentinel surveillance, which was mainly hospital based, had indicated increased numbers of influenza-like illness (ILI) cases in Mexico occurring since 2009-03-18. Over the next months, the virus spread rapidly across the globe, resulting in the WHO declaring a pandemic and advising countries to activate their pandemic preparedness plans. Singapore identified her first imported case of influenza A (H1N1-2009) on 2009-05-27, and the first unlinked cases on 2009-06-19, which indicated community transmission had begun in Singapore.
Singapore experienced all three influenza pandemics of the last century, in 1918, 1957 and 1968. During the 1957 pandemic, reporting of influenza cases by clinicians provided a reasonably clear indication of daily epidemic activity (Figure 1A). Influenza-like illness (ILI) has also been used widely as an indicator of influenza activity during non-pandemic epidemics, with ILI reporting by sentinel general practice/family doctor (GPFD) clinics forming the backbone of surveillance systems for influenza in many countries, and these have been used to monitor the current pandemic. In Singapore, though, acute respiratory illness (ARI) data captured from electronic medical records, as a more general indicator of infectious disease outbreaks, have traditionally been used by health authorities, including during the early part of the current pandemic. However, ILI monitoring can provide an estimate of case numbers and hence attack rates, hospitalisation and case fatality ratios, and is more specific for influenza than ARIs.
Data from ILI monitoring can also be used for modelling of influenza epidemics and pandemics. Modelling can be performed retrospectively to determine the relative importance of community compared to household transmission, or to determine the effect of pharmaceutical and non-pharmaceutical interventions. Modelling can also be performed in real-time during an epidemic, as proposed by Hall and colleagues, who used mortality data from England and Wales to demonstrate how models could have forecast when epidemic activity would peak during several historical pandemic events. Since H1N1-2009 has low hospitalisation and mortality rates (less than 1% of infected individuals), reporting of ILI from GPFD clinics would potentially provide a more accurate indicator of real-time epidemic activity and progress than hospitalisations and confirmed fatalities.
While data on ARIs are routinely collated and laboratory surveillance of influenza has been in place in Singapore for more than 30 years, there is currently no system for monitoring GPFD consults for ILI in Singapore. In order to monitor the epidemic and adjust response plans in real-time, we rapidly developed a system for ILI surveillance, with resulting data and forecasts made publicly available via a website. The purpose of this paper is twofold:
We started the project in early June 2009, shortly after Singapore identified her first imported case of influenza A (H1N1-2009) on 2009-05-27. We sent out mass appeals to 535 e-mail addresses of GPFDs or clinics, and 23 clinics agreed to participate; the locations of the participating GPFD clinics are shown in Figure 2. Four clinics were city or office area practices and the remainder were situated in residential areas across the island.
Figure 1(B,C) shows trends in consultations for ARIs and ILIs from the network. Data submissions started on 2009-06-25, by which time there had been 315 confirmed H1N1 cases (including 87 locally transmitted cases) in Singapore. There was a clear but initially unanticipated weekly periodicity to the data, with lower consultation rates on the weekend and a post-weekend surge in attendances. For descriptive purposes (but not analytical ones), we therefore used weekly averages to provide a smoothed picture of the epidemic trajectory. A comparison of Figure 1B or 1D with Figure 1C shows that ILI is a better indicator of epidemic activity than ARI. The weekly average ARI consults per doctor in the early epidemic period was between 10 and 15 (Figure 1B), and peaked at 17 in the week ending 2009-07-25, but from this alone it was difficult to determine how much H1N1-2009 epidemic activity there was around the time community transmission was starting; this is compounded by the high baseline rate making the height of the peak relatively low, at just around one and a half times the baseline level. Figure 1D shows the weekly epidemiological data for acute respiratory tract infections in Singapore based on government clinic attendances for ARI. The government clinic ARI data peaked in the week ending 2009-08-01, but, as with our ARI surveillance data, the high levels of background noise make it difficult to ascertain how much community-level infection there was near the start of the epidemic, especially since the epidemic was preceded by a considerable dip in ARI numbers. On the other hand, there was a marked, nearly five-fold increase in our ILI case data, from an average of about of an ILI per GPFD per day in the week ending 2009-06-27 to a peak of 3 in the week ending 2009-08-01. The highest recorded ILI rate occurred on 2009-08-03 (a Monday) with 6 ILIs per family doctor being reported.
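The weekly smoothing described above can be sketched in a few lines. This is an illustrative Python example rather than the authors' own routine (which was written in R); it assumes weeks end on a Saturday, as the "week ending" dates quoted in the text do, and the function names and the sample rates in the usage note are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_ending(day):
    """Return the Saturday that closes the week containing `day`."""
    # date.weekday() is 0 for Monday and 5 for Saturday.
    return day + timedelta(days=(5 - day.weekday()) % 7)


def weekly_means(daily_rates):
    """Average daily ILI-per-GPFD rates within each week ending on Saturday.

    `daily_rates` maps a date to that day's ILI count per reporting doctor.
    """
    buckets = defaultdict(list)
    for day, rate in daily_rates.items():
        buckets[week_ending(day)].append(rate)
    return {week: sum(rates) / len(rates) for week, rates in sorted(buckets.items())}
```

For example, `weekly_means({date(2009, 7, 26): 1.0, date(2009, 8, 1): 3.0})` places both (made-up) daily values in the week ending 2009-08-01 and returns their mean for that week.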
The sentinel network indicated that the epidemic had peaked around the start of August, and that ILI rates had returned to near baseline levels early in September.
Predictions of the number of ILIs being seen by our GPFDs and of the total number of people infected are presented in figure 3; animations of these forecasts and of the forecast total number seeking medical attention can be found in the supporting information (ILI/GPFD/d in video S1, total ILI/d in video S2 and cumulative infections in video S3). These incorporate both population stochasticity and parametric uncertainty. The eventual forecast was that 13% of the population had been infected, with a 95% credible interval of (9%, 19%). Initial forecasts were adversely affected by uncertainty in the parameters, caused by the vagueness of the subjective prior distributions we used and the scarcity of information from the data. By the middle of July, the algorithm was correctly forecasting the peak would occur at the start of August, although the magnitude of the epidemic was grossly overpredicted, and the accuracy of the forecast of the time of the peak may have been merely fortuitous. By the end of July, forecasts were stabilising around what transpired to be the eventual data, and by the middle of August, after the peak had come, the forecasts closely foreshadowed the tail of the epidemic. A measure of predictive accuracy is presented in figure 4. By the end of July, predictive error was averaging around 1 ILI per GPFD per day over a one-week time horizon. The sequence of subjective posterior distributions for the parameters and for the effective reproduction number over time are presented in figure 5, although we stress that these are our subjective distributions and do not expect the reader to share them.
We have shown that it is possible rapidly, and at short notice, to deploy a real-time influenza epidemic surveillance system using GPFDs in the absence of an existing system. This is likely to be a workable model in much of the developing world where a significant proportion of primary care is delivered by private practice GPFDs. Firstly, we provide proof of concept that it is feasible, within a month, and with no budget, to establish a protocol for daily data submission for ILI and begin submission. Secondly, we show that processing the data in near real-time—with cases seen each day entered by the following day—can provide graphical trends that describe the progress of an influenza epidemic. Finally, we demonstrate how such data can be used in real-time, and in combination with a process-based model refitted daily, to generate forecasts that can subsequently be verified against actual data as an epidemic unfolds, as is common in other dynamic applications such as weather and finance.
While ILI surveillance is used widely in temperate countries, there are few publications on the effectiveness of ILI surveillance in tropical countries to chart the spread of epidemic influenza, given the high baseline incidence of other non-influenza diseases and minimal seasonal forcing. Evidence is now emerging on the value of such surveillance systems in the tropics, and our study shows that ILI surveillance can track epidemic influenza activity in such settings. The slow uptake of influenza surveillance systems for tropical countries may be related to the lack of appreciation for the epidemiology and impact of tropical influenza. Previous work has shown that both non-pandemic (often called “seasonal” in temperate countries in which influenza is associated with winter) and pandemic influenza caused substantial excess mortality in tropical Singapore.
In Singapore, influenza activity has traditionally been monitored through a combination of laboratory and ARI morbidity data. ARI data reflect the total burden of acute respiratory illness from all causes, often including non-infectious causes such as exacerbations of chronic lung disease which may be environmental in origin. However, it is clear from this study that while both ARI and ILI counts give an indication of when epidemic activity peaks, ILI data provide better resolution of influenza epidemic activity, with the relative magnitude of increase over the baseline being far greater than for ARI data, since influenza activity in the early epidemic phase is masked by the high and obstreperous baseline rates of other respiratory illnesses diagnosed as ARI. The other system for tracking influenza activity in Singapore is based on laboratory confirmed diagnoses of influenza. This is similar to what is done in many countries throughout the world as part of the World Health Organization's Global Influenza Surveillance Network. Monitoring of laboratory confirmed diagnoses picked up an increase in H1N1-2009 isolates among a sub-sample of ILI cases presenting at government polyclinics about one week before the epidemic was apparent in our ILI data (data not shown). However, the advantages of ILI surveillance are that it is much cheaper than laboratory-based surveillance and that there are no capacity issues that may limit the number of samples that can be processed daily. In addition, laboratory testing of random samples is less sensitive to changes in absolute numbers of community cases at the peak of the pandemic when the influenza proportion among ILI cases remains relatively steady. ILI surveillance is therefore a cheaper and possibly more effective alternative to traditional laboratory surveillance, especially for resource-poor areas, to obtain reasonable sample sizes.
Setting up such a surveillance network has the secondary benefit of allowing real-time forecasting, which allows more informed policy making. By forecasting the epidemic ahead of time, we allow our forecasts of epidemic activity to be verified against data. We observed during the epidemic that modelling results correctly forecast the timing of peak epidemic activity on some days, but were off by up to a week at other times, and the actual magnitude of the peak was markedly different from early forecasts. We note though that even the relative accuracy of the forecast of the timing of the peak may have been merely fortuitous, and stress that we provide no theoretical results to guarantee this accuracy is repeatable. One particular difficulty we faced was ensuring the predictive accuracy of the system, given the lack of training data and the need to inform policy making as the epidemic unfolded. The results presented herein are therefore almost entirely the same as those presented on-line, including any shortcomings; the only alterations to the model and approach were to allow reporting rates to vary across the week (a change partially implemented part-way through the study) and to remove an ad hoc method intended to make the approach more robust to potential changes to the parameters in time (which transpired not to improve matters enough to warrant introducing statistical non-coherencies).
The eventual forecast for the final size of the outbreak was around 13% with a 95% credible interval of (9%, 19%). If true, then combined with the rolling out of vaccine and the potential for some additional existing immunity (a possibility we conservatively excluded from the analysis), this figure suggests Singapore is unlikely to experience a large second wave without substantial mutation of the pathogen. The estimate of around 13% corresponds closely to a paired serological study of Singaporean adults which estimated that 13% (11%, 16%), adjusting for the age distribution of the country, had experienced a four-fold rise in antibody titres (Mark I-Cheng Chen, personal communication). The close correspondence adds considerable confidence to the conclusions of the study.
Further evaluation is underway of the value of retaining a sentinel network permanently in a tropical city-state with year-round non-pandemic influenza transmission and additional bi-annual epidemics. By establishing an avenue for public display of infectious disease forecasts, we hope to build public and institutional confidence in and acceptance of modelling in the context of infectious diseases. To this end, the network was publicised in the local media and the website was made freely available to the general public. This helped provide an additional layer of transparency to reporting of the numbers of people infected with influenza and the relative impact on the wider community. We believe that this contributed to the overall national risk communication strategy and helped to reduce the level of panic and disruption to normal activity feared at the onset of the pandemic.
Several limitations of our work need to be highlighted. Firstly, this system of data collection was fully dependent on the goodwill of participating GPFDs, who received no monetary compensation. We found that we could continue to motivate the participating GPFDs by providing frequent updates based on their aggregated contributions. Although we sent out mass appeals to over 500 e-mail addresses, only 23 GPFDs agreed to participate. The poor response rate could be due to a combination of factors, including:
The final premise may be the most critical, and we suggest that some form of financial reimbursement be considered to compensate GPFDs for the effort and time needed to drive data submission in future, as this would likely improve recruitment rates and make such a system sustainable in the long term. Overall, the poor response rate highlights the challenge of recruiting appropriate clinics for any such system, particularly when using e-mails to disseminate such information, and at short notice. However, for a medium-sized city of 4.8 million residents, the network of around 20 GPFDs sufficed to provide considerable information on epidemic progress. Notwithstanding this small number of participating GPFDs, the surveillance system achieved its intended objective of tracking and forecasting influenza epidemic activity in near real-time. The small number of participating GPFDs (estimated to be about 2% of all GPFD clinics in Singapore) may make it difficult to assess if our ILI data are representative of all influenza diagnoses during the epidemic, but this is a limitation common to sentinel GPFD networks for influenza. Non-representativeness caused by non-response would not, however, affect the validity of the forecasts, since the methods used for that do not assume the sentinels were selected at random. Other countries have used GPFD networks for surveillance of other viral illnesses, and perhaps the combined lessons from these strategies could be applied more widely internationally.
In hindsight, several aspects of the approach could have been improved. We did not anticipate the strong day of the week effect on ILI consulting rates, and this had a deleterious effect on predictions, especially when moving to Mondays from Sundays. In mid-July we changed the model to allow different rates at the weekend from the rest of the week, but by mid-August it became clear that the model would fit much better were every day of the week allowed its own reporting rate; this is the model presented herein. Again, in hindsight, it is obvious that there was bound to be sufficient information in the data to be able to estimate the differential reporting rates over the days of the week. Alternative models, such as the Richards model, might have proven as or more effective, and certainly could be more parsimonious, than the compartmental model we used, but our experience was that the challenges of developing the software before any data had been collected effectively ruled out deciding on an optimal model to use. As is common in the field of infectious disease modelling, the model we used made many simplifying assumptions (see methods), all of which may potentially have reduced the quality of the forecasts. For instance, the presence of heterogeneous mixing or susceptibility in reality but not in the model may lead eventually to changes to the parameter estimates over time as the routine endeavours to fit a model excluding these effects, but in forming forecasts at an early stage, the future path of parameter estimates is unknown and so forecasts cannot take this into account. In this paper, we have used the term “forecast” sensu Keyfitz, to indicate the belief we invested in these predictions and the way they were used in contingency planning in some of the authors' institutions. This contrasts with his definition of a projection, which is the extrapolation of past trends without claiming to expect them to match the future.
A consequence of this reticence, according to Keyfitz, is that projections cannot be wrong (never being claimed right), while predictions or forecasts are “practically certain” to be in error, and are prone to black swan-type events. Accepting this, and excepting the initial predictions, the forecasts we made fared very well (figures 3 and 4). Had we concentrated instead on projecting the epidemic, via a suite of competing models, we might have learned more about the assumptions underlying those models, which would have informed future modelling efforts. A comparison of different projecting approaches, as has been done for seasonal influenza monitoring, would therefore be very useful to refine the general approach for future outbreaks of emerging diseases, but this remains work for the future.
In conclusion, a real-time GPFD surveillance system can be set up rapidly during an epidemic and is able to show the progress of the epidemic. Such an inexpensive system can be deployed even in resource-poor settings to track future influenza epidemics and pandemics and forecast their trajectories in near real-time.
Ethics approval for the project was obtained from the institutional review board of the National University of Singapore.
We obtained e-mail addresses of GPFDs in Singapore from the College of Family Physicians Singapore (CFPS) and the directory of Pandemic Preparedness Clinics, a group of over 200 clinics registered with the Ministry of Health to manage influenza cases. In all, invitations were sent to 535 e-mail addresses. A series of road shows was also conducted at the CFPS to describe how ILI surveillance could help to track an epidemic. GPFDs who agreed to participate were also asked to extend the recruitment to their contacts.
Participating GPFDs had to be doctors registered with the Singapore Medical Council who worked at least three full days a week in a general practice or family medicine clinic in the community. Participation was purely voluntary and participating GPFDs were given the option to withdraw from the project at any time.
Enrolled GPFDs were requested to submit returns on their work days by e-mail or facsimile by 2pm the following day. The data submitted comprised information on clinically diagnosed ARIs. Clinically, influenza is an acute respiratory infection. As a group, the ARIs may be defined as a clinical diagnosis of patients who present with new short-term (time from onset less than two weeks) respiratory symptoms of cough, rhinorrhœa, nasal congestion and/or sore throat, which may or may not be accompanied by fever. The syndrome is usually though not exclusively associated with viral ætiologies. The range of pathogens responsible for ARI besides influenza is described in a recent WHO paper. A number of viruses cause a clinical illness which is difficult to distinguish from influenza, including respiratory syncytial virus, picornaviruses, parainfluenza, and adenovirus. These produce an influenza-like illness. The operational definition of ILI we then used in performing the analyses was an ARI exhibiting a fever of 37.8°C or more; this approximates the definition used by the United States' Centers for Disease Control and Prevention, which defines ILI as an acute illness with cough and/or sore throat with a fever of 37.8°C or greater, in the absence of a known cause other than influenza. Other data elements collected in the data collection form (figure S1) included demographic, clinical, and antiviral treatment information.
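As an illustration of how the operational definition translates into daily counts, the following Python sketch classifies submitted ARI records as ILI when the recorded temperature is 37.8°C or more. The record layout (`Consultation` and its field names) is a hypothetical simplification of the data collection form, not the format actually used in the study.

```python
from dataclasses import dataclass

# Fever threshold for ILI as stated in the text (degrees Celsius).
ILI_FEVER_THRESHOLD_C = 37.8


@dataclass
class Consultation:
    """One clinically diagnosed ARI case (hypothetical, simplified record)."""
    year_of_birth: int
    temperature_c: float
    antiviral_given: bool


def is_ili(record):
    """An ARI counts as ILI when the fever threshold is met or exceeded."""
    return record.temperature_c >= ILI_FEVER_THRESHOLD_C


def daily_counts(records):
    """Return (number of ARIs, number of ILIs) for one doctor-day of returns."""
    ari = len(records)
    ili = sum(1 for record in records if is_ili(record))
    return ari, ili
```

Dividing the ILI count by the number of doctors reporting on a given day gives the ILI per GPFD per day series used throughout the paper.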
Disease dynamics are modeled via a standard, stochastic compartmental model [50, inter alios], with daily increments and individuals passing through a series of unobserved classes corresponding to clinical stages of infection: Susceptible, Exposed (infected but not infectious), Infectious and Removed (recovered and subsequently immune, or deceased). The model is formulated by the equations
where , , represent the number of people in the whole population newly infected, infectious, and removed, respectively. These are assumed to follow binomial distributions as follows
To be explicit, the infection model is formulated under the simplifying assumptions that:
As before, the validity of these assumptions is open to debate.
In the original formulation, we forced for all , i.e. to be equal. In the middle of July, in response to the obvious variation over the week, we changed the constraint of the model to and . By mid-August, it was apparent that the day of the week effect needed to differ on each day of the week to attain a good fit. It is therefore the model without constraints that we present in this paper.
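For readers unfamiliar with models of this kind, a generic daily-step stochastic SEIR model with binomially distributed transitions can be sketched as follows. This is a textbook chain-binomial formulation written in Python for illustration, not the authors' exact specification: the parameter names (`beta`, `latent_days`, `infectious_days`), the exponential form of the transition probabilities, and the omission of importation and reporting are all assumptions.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing Bernoulli trials (fine for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def seir_step(state, beta, latent_days, infectious_days, rng):
    """Advance (S, E, I, R) by one day with binomial transitions."""
    s, e, i, r = state
    n = s + e + i + r
    # Daily transition probabilities (assumed exponential waiting times).
    p_infect = 1.0 - math.exp(-beta * i / n)
    p_become_infectious = 1.0 - math.exp(-1.0 / latent_days)
    p_removed = 1.0 - math.exp(-1.0 / infectious_days)
    newly_infected = binomial(rng, s, p_infect)
    newly_infectious = binomial(rng, e, p_become_infectious)
    newly_removed = binomial(rng, i, p_removed)
    return (
        s - newly_infected,
        e + newly_infected - newly_infectious,
        i + newly_infectious - newly_removed,
        r + newly_removed,
    )


def simulate(state, days, beta, latent_days, infectious_days, seed=0):
    """Return the list of daily states, starting with the initial state."""
    rng = random.Random(seed)
    path = [state]
    for _ in range(days):
        state = seir_step(state, beta, latent_days, infectious_days, rng)
        path.append(state)
    return path
```

A call such as `simulate((495, 0, 5, 0), days=30, beta=1.2, latent_days=2.0, infectious_days=3.0)` returns one random realisation of the epidemic; the population size and period lengths here are placeholders chosen only to keep the example fast.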
The parameters of the model are estimated within the Bayesian statistical paradigm [52, for instance] in which semi-informative prior distributions are assigned to parameters and incoming data incorporated via the likelihood function to obtain a time series of posterior distributions for the parameters and unobserved state space.
Since the state space is unobserved, a statistical method called particle filtering is used to integrate over the possible realisations consistent with the daily observations. A set of 10 000 “particles” is created, each of which is assigned parameter values and a state space configuration generated from the prior distribution. Particles are iterated forward one day at a time via simulation of the state space, and the likelihood function is calculated conditional on the trajectory of that particle and its associated parameter values. The likelihood function is then used to weight the particle. Particle degeneracy is overcome via resampling, while particle diversity is maintained via kernel smoothing; the latter means that the resulting posterior distribution is approximate. The (approximate) posterior predictive distribution is derived by continuing the simulations beyond the last observation and weighting the resulting distribution via the particle weights at the last observation.
The particle filter algorithm proceeds as follows.
The algorithm provides the posterior distribution of any parameter, state or function thereof (such as the basic reproduction number or the effective reproduction number) by taking a weighted average of this characteristic according to the posterior weights at the last observation time. Here, only the posterior predictive distribution of the underlying states is of interest. Since the prior distributions taken were subjective (see below), the resulting posterior distributions are also subjective, and as a caveat lector we caution that our posterior distributions may differ from the reader's; for further information on subjective probability the reader is directed to the writings of de Finetti or Lindley.
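The general pattern of simulating, weighting, resampling and perturbing particles can be sketched as below. This Python example is a minimal bootstrap particle filter under stated assumptions, not the authors' algorithm: the Poisson observation model, the gamma prior (moment-matched to a mean of 1.2 and standard deviation of 0.8), the fixed latent and infectious periods, and the Gaussian jitter used as a crude stand-in for kernel smoothing are all illustrative choices, as are the function and parameter names.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing Bernoulli trials (fine for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def step(state, beta, rng, latent_days=2.0, infectious_days=3.0):
    """One day of a generic chain-binomial SEIR model (assumed form)."""
    s, e, i, r = state
    n = s + e + i + r
    new_e = binomial(rng, s, 1.0 - math.exp(-beta * i / n))
    new_i = binomial(rng, e, 1.0 - math.exp(-1.0 / latent_days))
    new_r = binomial(rng, i, 1.0 - math.exp(-1.0 / infectious_days))
    return (s - new_e, e + new_e - new_i, i + new_i - new_r, r + new_r)


def poisson_logpmf(k, mean):
    """Log of the Poisson probability of observing k given a positive mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def particle_filter(observations, n_particles, rng, population=500,
                    report_rate=0.5, background=0.5, jitter=0.05):
    """Return particles approximating the posterior after the last observation."""
    # Draw each particle's infection rate from the (assumed gamma) prior.
    shape, scale = (1.2 / 0.8) ** 2, 0.8 ** 2 / 1.2
    particles = [
        {"beta": rng.gammavariate(shape, scale), "state": (population - 5, 0, 5, 0)}
        for _ in range(n_particles)
    ]
    for observed in observations:
        log_weights = []
        for particle in particles:
            # Iterate the hidden state forward one day by simulation.
            particle["state"] = step(particle["state"], particle["beta"], rng)
            expected = report_rate * particle["state"][2] + background
            log_weights.append(poisson_logpmf(observed, expected))
        # Convert to weights stably, then resample to counter degeneracy.
        top = max(log_weights)
        weights = [math.exp(lw - top) for lw in log_weights]
        chosen = rng.choices(particles, weights=weights, k=n_particles)
        # Perturb parameters slightly to keep the particle set diverse.
        particles = [
            {"beta": abs(c["beta"] + rng.gauss(0.0, jitter)), "state": c["state"]}
            for c in chosen
        ]
    return particles
```

Because the particles are resampled after every observation, they carry equal weight on return, so a posterior summary such as the mean infection rate is a plain average over `particle["beta"]`. Forecasts would be obtained by continuing to call `step` on each returned particle beyond the last observation.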
The prior distributions used are given in figure 5. In setting these, we aimed to balance the need to supplement the information content of the sentinel data with relevant information from other sources, with the desire not to obliterate the signal from the data. We set the prior mean for the infection rate to be 1.2, with standard deviation 0.8. Combined with the prior distribution for the infectious period, this leads to a range for the basic reproduction number of 0 to around 6, i.e. more than spanning the range of estimates for historic pandemics. The prior distribution for the importation rate was derived from a crude extrapolation of the timeline of the first five weeks of importations to the country. The prior distributions for the latent period and infectious period were modelled loosely on symptom onset after infection on an aeroplane and a review of volunteer challenge studies. The prior distributions for the background rate of non-pandemic ILIs were based upon the clinical insight of the authors, and for the reporting probabilities from guesstimation, noting that it is common for employers or schools in Singapore to require a formal medical certificate before allowing staff or students off work or out of class. We conservatively forced the level of pre-existing immunity to be 0 since we did not know how the findings of studies in temperate countries relating to prior exposure would extrapolate to the tropics; in this way, forecasts may be seen as worst case scenarios. The prior distributions for and were derived from extrapolating the number of confirmed locally acquired cases.
Predictive error was assessed by taking the posterior distribution of absolute difference between forecasts and observations, averaged over a one-week time horizon, and then averaged to get the posterior mean prediction error.
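One plausible reading of this error measure is sketched below in Python: for each particle, the absolute difference between its forecast and the observed ILI rate is averaged over the one-week horizon, and these per-particle errors are then averaged using the posterior weights. The function name and argument layout are assumptions made for illustration.

```python
def posterior_mean_prediction_error(trajectories, weights, observed):
    """Weighted mean, over particles, of the mean absolute forecast error.

    trajectories: one forecast sequence per particle, covering the horizon
    weights:      posterior weight of each particle (need not sum to one)
    observed:     the values actually observed over the same horizon
    """
    total_weight = sum(weights)
    weighted_error = 0.0
    for forecast, weight in zip(trajectories, weights):
        # Mean absolute error of this particle's forecast over the horizon.
        horizon_error = sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)
        weighted_error += weight * horizon_error
    return weighted_error / total_weight
```

With a seven-day `observed` sequence this gives an error in units of ILI per GPFD per day, the scale on which the results are reported.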
All statistical routines were written by the authors using the R statistical programming language.
Modelling results were updated daily around 3pm to a website that could be publicly accessed. This was automated using a Bourne shell script that handled time, file transfer, archiving of previous forecasts, statistical processing, and posting of new output on the web. This was run on a Unix web server using ISC's cron.
Figure S1. Data collection form.
(0.01 MB PDF)
Video S1. Animation of forecast average ILI per GPFD per day. Note the change in scale on the y-axis.
(0.81 MB SWF)
Video S2. Animation of forecast total nationwide ILI cases seeking medical attention. The day of week effect has been removed for clarity by treating all days as being Mondays. Note the change in scale on the y-axis.
(0.48 MB SWF)
Video S3. Animation of forecast proportion of population infected or recovered, including those not seeking medical attention. Note the change in scale on the y-axis.
(0.50 MB SWF)
We thank the following doctors and their clinics for their valuable contribution to this study by generously volunteering data collection and submission: Dres Cheng Kah Ling Grace of Joyhealth Medical Clinic & Surgery, Chua Tee Lian of C & K Family Clinic Pte Ltd, Gan Tek Kah of Street 21 Clinic (Tampines), Julian Lim of Newlife Family Clinic & Surgery, Lew Yin Choo of Lew Clinic, Lim Bee Lin of BL Medical Associates, Low How Cheong of Healthwise Medical Clinic & Surgery and Tiffany Yap of Healthway Bukit Timah Clinic, and several others who wished to remain anonymous. We thank Yang Yang, Nishiura Hiroshi, and a further, anonymous, reviewer for their insightful comments on an earlier draft of the paper.
Competing Interests: PAT has received research support and honoraria from Baxter, Adamas, Merlion Pharma, and Novartis as well as travel support from Pfizer and Wyeth and sits on the boards of the Asia Pacific Advisory Committee on Influenza and the Asian Hygiene Council. VJL has received research support from GSK. The rest of the authors declare that they do not have any conflicts of interest, financial or otherwise, in this study.
Funding: The National University of Singapore provided research funding to ARC and a scholarship to HCL. Staff time for JBSO and MI-CC was partially funded by the National Medical Research Council, Singapore. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. URL: http://www.nus.edu.sg/ and https://www.nmrc.gov.sg/