Dengue fever (DF) and dengue hemorrhagic fever (DHF) are growing health concerns throughout Latin America and the Caribbean. This study focuses on Costa Rica, which experienced over 100 000 cases of DF/DHF from 2003 to 2007. We used data on sea-surface temperature anomalies related to the El Niño Southern Oscillation (ENSO) and two vegetation indices derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) on the Terra satellite to model the influence of climate and vegetation dynamics on DF/DHF cases in Costa Rica. Cross-correlations were calculated to evaluate both positive and negative lag effects on the relationships between independent variables and DF/DHF cases. The model, which uses a sinusoid and non-linear least squares to fit case data, was able to explain 83% of the variance in weekly DF/DHF cases when independent variables were shifted backwards in time. When the independent variables were shifted forward in time, consistent with a forecasting approach, the model explained 64% of the variance. Importantly, when five ENSO and two vegetation indices were included, the model reproduced a major DF/DHF epidemic of 2005. The unexplained variance in the model may be due to herd immunity and vector control measures, although information regarding these aspects of the disease system is generally lacking. Our analysis suggests that the model may be used to predict DF/DHF outbreaks as early as 40 weeks in advance and may also provide valuable information on the magnitude of future epidemics. In its current form it may be used to inform national vector control programs and policies regarding control measures; it is the first climate-based dengue model developed for this country and is potentially scalable to the broader region of Latin America and the Caribbean where dramatic increases in DF/DHF incidence and spread have been observed.
Dengue fever (DF) and dengue hemorrhagic fever (DHF) are the most important vector-borne viral diseases (family Flaviviridae: genus Flavivirus) globally (WHO 2000). Approximately 2.5 billion people are at risk and 50–100 million cases occur each year (PAHO 2002, WHO 2002). About two-thirds of the world’s population resides in areas infested with dengue vectors (Aedes aegypti and Ae. albopictus mosquitoes) and all four dengue virus serotypes affect urban populations (Gubler and Clark 1994, Jetten and Focks 1997). Dengue transmission is heavily influenced by environmental conditions, human behavior, and demographic changes. The main vector, Ae. aegypti, lives in close association with humans in urban and suburban environments, preferring human blood meals and laying its eggs in artificial containers such as drums, buckets, tires, flower pots, and vases (Service 1992, Focks and Chadee 1997, Gubler 1998). The incidence of DF has increased significantly over the past 25 years (Gubler 2004), qualifying it as an ‘emerging or uncontrolled disease’ (TDR 2005). In the Americas, vigorous control campaigns eliminated Ae. aegypti from most of Central and South America during the 1950s, but discontinuation of the program led to re-infestation during the 1970s and 1980s and to the re-emergence of dengue (Gubler 1998). Global trade, population growth and uncontrolled or unplanned urbanization (where inadequate housing, water supply, and waste collection services increase available larval habitats) have all been major factors influencing the current pandemic (Kuno 1995). These demographic and social changes, as well as a lack of effective mosquito control, have facilitated the spread and persistence of Ae. aegypti and dengue virus in many areas of the world (Gubler 1998).
Several studies have examined the wave-like behavior of DF/DHF epidemics in different areas and have demonstrated an association between DF/DHF incidence or vector populations and climate variables (Cazales et al 2005, Chadee et al 2007). Mechanistic models have been developed to simulate mosquito populations using temperature- and moisture-dependent epidemiological factors (Focks et al 1993a, 1993b, Cheng et al 1998, Hopp and Foley 2003), while other studies have analyzed DF/DHF time series using climatic indices that relate to global teleconnections such as the El Niño Southern Oscillation (ENSO) (Gagnon et al 2001, Cazales et al 2005). Climate-based studies have generally revealed strong relationships between DF/DHF outbreaks and climate oscillations using data from meteorological stations and sea-surface temperature observations (SST). Pacific SST anomalies, which are indicative of ENSO fluctuations, are often invoked to explain teleconnections that relate weather patterns over broad areas of the Earth’s surface. Precipitation and temperature oscillations over large parts of Latin America and the Caribbean are strongly influenced by changes in Pacific SST (Glantz 2001) and these in turn can influence vector competence and survivorship. In endemic areas, DF/DHF epidemics may also cycle over multiple years, although the period between epidemics may also be a function of herd immunity from previous epidemics. While ENSO may play a role in synchronizing epidemics (Cazales et al 2005), seasonal vegetation dynamics may also influence vector populations at relatively local scales (e.g. Gomez-Elipe et al 2007). Often, there is a close association between vegetation canopy development, local moisture supply and breeding of mosquito vectors (Linthicum et al 1999). 
Fully developed tree canopies not only provide shade that can reduce evaporation from containers, but may also decrease sub-canopy wind speed and increase humidity near the ground, factors that tend to increase vector competence (Linthicum et al 1999).
A major implication of macro-scale (i.e., ENSO) and micro-climate effects is that vector-disease dynamics may be explained using models that incorporate climate and vegetation data to predict the occurrence and spread of vector-borne diseases (Patz et al 2005). Such models have been developed to predict malaria incidence (Thomson et al 2005), but there has been limited progress in developing early warning systems for DF/DHF. For example, a dengue early warning model (based on 5 weeks of climate data) was developed to predict dengue incidence in San Juan, Puerto Rico, but this model was not considered reliable as a sole predictor of dengue in this area (Schreiber 2001). One of the limitations for developing an early warning system is that detailed multi-year studies of climate and dengue are generally lacking and climate data are often limited to a few meteorological stations (Chadee et al 2007), whose records often contain gaps. Further, the non-stationary behavior of most DF/DHF time series makes DF/DHF outbreaks difficult to predict, although variables such as sea-surface temperature (SST) may also display a degree of interannual nonstationarity (Mestas-Nuñez and Enfield 2001). In this paper, we present results from a new model developed to predict weekly DF/DHF cases in Costa Rica from 2003 to 2007. The model is based on weekly ENSO SST indices and interpolated vegetation index data obtained from polar-orbiting satellite observations. Model fitting was done using weekly DF/DHF case data aggregated to the national scale, which provides high temporal resolution appropriate for prediction of future epidemics.
Data on weekly cases of DF/DHF were obtained from Costa Rica Ministry of Health (MoH) reports available at http://www.ministeriodesalud.go.cr/estavigiepi.htm. The data cover the period from January 2003 through week 48 of 2007 and include a major epidemic in 2005. During this period a total of 104 288 DF/DHF cases were reported in Costa Rica, 522 of which were diagnosed as DHF. The MoH DF data are aggregated nationally and compiled from case reports supplied by regional and local clinics, and most cases are diagnosed clinically. DF and DHF cases are combined in the time series, although the latter constitute a small fraction (<1%) of total infections. Weekly ENSO SST index data were obtained from the Australian Bureau of Meteorology, which publishes time series of the NINO1–NINO4 SST indices at http://www.bom.gov.au/climate/enso/indices.shtml. Each index is defined as the average of SST anomalies over one of five Niño regions, which extend across the Pacific equatorial belt from 160°E to 80°W and include NINO3.4, an area that overlaps Niño regions 3 and 4 from 120°W to 170°W. Niño regions 1 and 2 are closest to South America, where upwelling processes are sensitive to air–sea interaction in the central and equatorial Pacific (Glantz 2001). Anomalies are variously defined for the different regions, show a quasi-periodic behavior and are weakly inter-correlated. Time series of the enhanced vegetation index (EVI) and the normalized difference vegetation index (NDVI) averaged for Costa Rica were extracted from 16-day 500 m MODIS composite imagery downloaded from the Land Processes Distributed Active Archive Center (https://lpdaac.usgs.gov/lpdaac/getdata/). Multitemporal vegetation indices provide a measurement of photosynthetic activity and vegetation phenology, and are related positively to rainfall and moisture availability.
EVI also possesses an advantage over other commonly used vegetation indices in that it incorporates a blue wave band that reduces the effects of background reflectance and atmospheric constituents such as aerosols (Huete et al 2002). EVI is related closely to near infrared reflectance (i.e., leaf display), while NDVI is more closely related to red-reflectance (i.e., photosynthesis) (Glenn et al 2008). National-level, one-dimensional NDVI and EVI time series were constructed from the mean of randomly sampled points within Costa Rica and the 16-day NDVI and EVI values were interpolated to weekly values using a cubic spline in order to match the temporal resolution of the DF/DHF cases and ENSO indices.
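The interpolation step described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes a natural cubic spline (zero second derivative at both ends), which the text does not specify, and the function names `natural_cubic_spline` and `resample_weekly` are hypothetical.

```python
import bisect


def natural_cubic_spline(xs, ys):
    """Return a function that evaluates the natural cubic spline through (xs, ys).

    Assumption: natural boundary conditions (second derivative is zero at
    both end points). xs must be strictly increasing.
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Right-hand side of the tridiagonal system for the quadratic coefficients.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])

    # Forward sweep of the tridiagonal solve.
    diag = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]

    # Back substitution gives the coefficients b, c, d of each cubic piece.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        # Find the interval that contains x, clamped to the first/last piece.
        j = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return evaluate


def resample_weekly(days, values, step=7):
    """Interpolate 16-day composite values to one value every `step` days."""
    spline = natural_cubic_spline(days, values)
    weekly = []
    t = days[0]
    while t <= days[-1]:
        weekly.append(spline(t))
        t += step
    return weekly
```

For example, calling `resample_weekly([0, 16, 32, 48, 64], ndvi)` with five composite values yields ten weekly values (days 0, 7, ..., 63), and the spline passes through every original 16-day value.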
Simple additive models of the general form

$$X_t = S_t + V_t \tag{1}$$

provide a common descriptor of time series, where $S_t$ denotes a signal and $V_t$ denotes a time series that may be correlated over time (Shumway and Stoffer 2006). In general, any time series can be described in terms of three additive components: a linear trend, a seasonal component and a random or irregular component (Janacek and Swift 1993). Further, we can often say that a time series depends on a set of independent inputs or independent series, $z_{t1}, z_{t2}, z_{t3}, \ldots, z_{tq}$, where the inputs are fixed and known:

$$X_t = \beta_1 z_{t1} + \beta_2 z_{t2} + \cdots + \beta_q z_{tq} + \omega_t \tag{2}$$

where $\beta_1, \ldots, \beta_q$ are unknown fixed regression coefficients and $\omega_t$ is a random error or noise process (Shumway and Stoffer 2006). It is natural to estimate the unknown coefficients by minimizing the residual sum of squares (RSS) with respect to $\beta_1, \ldots, \beta_q$, and many implementations are available to achieve this. Such approaches are often used to obtain accurate model fits and are generally considered the first step in developing reliable forecasting models for time series (Montgomery et al 2008). Further, models to predict vector-borne disease may include an autoregressive (AR) component (e.g., Gomez-Elipe et al 2007), which is based on the idea that the current value of the time series, $X_t$, can be explained as a function of past values. For example, Gomez-Elipe et al (2007) used an autoregressive integrated moving average (ARIMA) model in conjunction with a sinusoid to forecast malaria incidence using NDVI, temperature, rainfall and preceding malaria cases with 93% accuracy.
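The RSS minimization mentioned above can be written out explicitly. The block below is the standard ordinary least squares result, included for illustration only; the matrix $Z$ (with entries $z_{tj}$), the vector $\mathbf{X}$ of observations and the series length $n$ are notation introduced here and are not taken from the original text. The closed form assumes $Z^{\top}Z$ is invertible.

```latex
\mathrm{RSS}(\beta_1,\ldots,\beta_q)
  = \sum_{t=1}^{n} \Bigl( X_t - \sum_{j=1}^{q} \beta_j z_{tj} \Bigr)^{2},
\qquad
\frac{\partial\,\mathrm{RSS}}{\partial \beta_k}
  = -2 \sum_{t=1}^{n} z_{tk} \Bigl( X_t - \sum_{j=1}^{q} \beta_j z_{tj} \Bigr) = 0
  \quad (k = 1,\ldots,q)

\Longrightarrow\quad
Z^{\top} Z \,\hat{\boldsymbol{\beta}} = Z^{\top} \mathbf{X},
\qquad
\hat{\boldsymbol{\beta}} = (Z^{\top} Z)^{-1} Z^{\top} \mathbf{X}
```

Setting each partial derivative to zero gives the $q$ normal equations, which is why linear-in-parameter models of this kind can be fitted without an iterative solver.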
To assess lagged relationships between DF/DHF cases and ENSO–vegetation dynamics we used the cross-correlation function (CCF) to calculate the correlation coefficients between time series over +/−52 weeks. We hypothesized that such long lags may capture system memory effects over more than one wet season since the eggs of Ae. aegypti may survive desiccation for four to six months after oviposition in containers and survival times may be associated with humidity levels during the dry season (Sota and Mogi 1992). Therefore, the moisture conditions during the preceding seasons may influence the number of cases in the current season. As an initial step to fit the model, the maximum correlations over negative (i.e., independent variables shifted backward) and positive (i.e., independent variables shifted forward) lags were identified and the ENSO and EVI time series were shifted the appropriate number of lags to match the maximum correlation with DF/DHF cases. Thus, the model was fit with the different independent variables approximately in phase with the DF/DHF time series. These lagged series were included as independent variables in a model that contains a set of sinusoids having the general form
where ct is the number of cases at time t, ztn are independent input variables and where an and bn are parameters estimated using non-linear least squares regression. Equation (3) is the analog of the discrete Fourier series, which describes Xt in terms of contributions of different cycling components with different frequencies and having amplitudes of an and bn.
Models of this general form have been used extensively in time series analysis and generally explain phenomena that are periodic or quasi-periodic, which applies to many DF/DHF time series. We elected not to incorporate an AR component to the model as this generally requires extensive transformation of the variables to achieve the assumption of stationarity. Logically, using weakly stationary or non-stationary time series as independent input variables in equation (3) presents a solution to model the non-stationary behavior of most DF/DHF time series and thus we retained the seasonal component without any transformations. We then evaluated different combinations of climate and vegetation index variables in the model and used non-linear regression to estimate the coefficients and the percent explained variance (R2). In this way, we were able to determine the best combination of input model parameters for predicting DF/DHF outbreaks.
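As an illustration of the kind of fit described above, the following sketch regresses weekly cases on an intercept, one annual sine/cosine pair and a set of already-lagged input series, and then computes R2. It is a simplified stand-in and not the authors' equation (3): the single 52-week harmonic, the linear-in-parameters form (which lets ordinary least squares replace a non-linear solver) and all function names are assumptions made for this example.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def design_row(t, covariates, period=52.0):
    """Intercept, one sine/cosine pair, then each lagged covariate at week t."""
    w = 2.0 * math.pi * t / period
    return [1.0, math.cos(w), math.sin(w)] + [z[t] for z in covariates]


def fit_harmonic(cases, covariates, period=52.0):
    """Least squares fit; returns (coefficients, fitted values)."""
    rows = [design_row(t, covariates, period) for t in range(len(cases))]
    p = len(rows[0])
    # Normal equations (Z'Z) beta = Z'y minimize the residual sum of squares.
    ztz = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    zty = [sum(r[i] * y for r, y in zip(rows, cases)) for i in range(p)]
    beta = solve_linear(ztz, zty)
    fitted = [sum(bk * rk for bk, rk in zip(beta, r)) for r in rows]
    return beta, fitted


def r_squared(observed, fitted):
    """Fraction of variance explained: 1 - RSS / total sum of squares."""
    mean = sum(observed) / len(observed)
    rss = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    tss = sum((o - mean) ** 2 for o in observed)
    return 1.0 - rss / tss
```

Comparing `r_squared` across calls to `fit_harmonic` with different subsets of input series mirrors the variable-selection procedure described in the text, although the actual values reported in the paper come from the authors' own model form.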
Time series plots of each of the independent variables and dengue cases are shown in figures 1(a)–(f), which reveal lags between the SST departures in degrees Celsius, vegetation indices and DF/DHF cases. Figures 1(a)–(e) generally suggest a negative correlation between the ENSO indices and dengue, with the former reaching a minimum several weeks after the major dengue epidemic of 2005. Figure 1(f) shows vegetation indices plotted against case numbers and reveals a generally synchronous relationship with small lags between NDVI maxima, EVI maxima and annual case maxima. The cross-correlation coefficients for the ENSO SST indices and case numbers were negative over a range of +/−52 weeks, which indicates that the ENSO cool phase (La Niña) is more likely to favor greater numbers of DF/DHF cases in Costa Rica. These periods tend to be more humid in Central America (Glantz 2001) and may favor survival of greater numbers of Ae. aegypti eggs and adults (Sota and Mogi 1992). Other studies (e.g., Gagnon et al 2001) have shown that the warm phase of ENSO (El Niño) tends to be associated with increased DF/DHF incidence, especially after the major El Niño events of 1983–1984 and 1997–1998 in Indonesia, where dengue has been endemic for several decades (Arcari et al 2007). In contrast, the MODIS-EVI cross-correlation coefficients were positive from −6 to +15 lags, which indicates modest synchrony between canopy greenness and numbers of infections in Costa Rica. The maximum correlations between ENSO SST indices and case numbers were −0.42 (lag = −2), −0.18 (lag = −5), −0.43 (lag = −10) and −0.40 (lag = −17) for NINO1–NINO4, respectively. The maximum correlation coefficient for the NINO3.4 index and cases was −0.45 at lag = −11, which suggests that SST anomalies in this particular Niño region relate modestly to case numbers. For EVI the maximum cross-correlation coefficient was 0.36 at lag = −20 and for NDVI the relationship was strongest (−0.4) at lag = −40.
The different cross-correlation results for the two vegetation indices suggest that each responds to different aspects of the vegetated landscape (Glenn et al 2008) and that both may provide independent inputs to our model.
To assess the model’s forecasting potential, we also evaluated relationships for forward or positive lags in which the independent variables were shifted ahead in time. These relationships were generally weaker than those cited above; for example, maximum correlation coefficients were 0.20 (lag = 38), 0.24 (lag = 46), 0.21 (lag = 43) and 0.28 (lag = 46) for the NINO1–4 indices, respectively, and 0.24 (lag = 45) for the NINO3.4 index. However, for EVI the cross-correlation coefficient reached a maximum of 0.49 at lag = 5 and for NDVI it reached 0.37 at lag = 47.
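The lag search described in this section can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' code: here lag k pairs x[t + k] with y[t], which may not match the sign convention the paper uses for backward and forward shifts, and the correlation is a plain Pearson coefficient over the overlapping weeks rather than the classical CCF estimator based on full-series means. The function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def cross_correlation(x, y, lag):
    """Correlation between x[t + lag] and y[t] over the overlapping weeks."""
    if lag >= 0:
        x_part, y_part = x[lag:], y[:len(y) - lag]
    else:
        x_part, y_part = x[:len(x) + lag], y[-lag:]
    return pearson(x_part, y_part)


def strongest_lag(x, y, lags):
    """Return (lag, r) for the lag with the largest absolute correlation."""
    best = max(lags, key=lambda k: abs(cross_correlation(x, y, k)))
    return best, cross_correlation(x, y, best)
```

Calling `strongest_lag(index, cases, range(-52, 0))` and `strongest_lag(index, cases, range(1, 53))` separately would give one candidate shift in each direction, analogous to the backward and forward lags reported above. Taking the absolute value matters because the ENSO relationships in the text are negative.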
Figure 2 shows the results of a set of different model simulations in which different combinations of ENSO and vegetation index data were entered using the negative lags given above. Note that the model outputs are truncated to varying degrees in figure 2 owing to the variable shifts of the independent variables. Figure 2(a) shows how the model performed when only ENSO indices were included in the model and figure 3 reveals how the model results improved as more ENSO variables were added. For example, with all five ENSO indices included, the model explained 45% of the variance in the DF/DHF time series (figure 3) and showed a small peak in phase with the 2005 epidemic. Figure 2(b) shows how the model performed with only EVI and NDVI as inputs and the corresponding figure 3 reveals that these two variables explained 33% of the variance. Interestingly, EVI and NDVI alone were unable to produce a modeled increase in predicted cases during the 2005 period. Figure 3 shows an increase in model performance as more ENSO variables were added, but that the combination of ENSO and vegetation index data significantly improved the model’s ability to estimate the epidemic period of 2005. When EVI and the five ENSO indices were included, the model explained close to 58% of the variance but when NDVI was used with the ENSO indices the R2 improved to 0.75. This suggests that NDVI may be a more powerful index for predictive purposes, possibly because it is more closely related to moisture conditions than EVI. When all variables were included, the model R2 improved to 0.83.
Figure 4 shows how the model performed when the independent variables were shifted forward in time (positive lags) and focuses specifically on the epidemic period of 2005. As expected from the analysis of cross-correlation coefficients, the overall predictive power of the model was lower (64% of variance) when the independent variables were positively lagged to match the maximum cross-correlation in this direction. Nonetheless, figure 4(a), which provides the result for all independent variables, shows that the model was still able to predict a large increase in DF/DHF cases during 2005, consistent with the epidemic, although the overall magnitude was underestimated. When the model was run with positive lags and parameterized using the first 104 observations (figure 4(b)), a period that did not include a major epidemic, the percent variance explained decreased to 49% and the predicted peak of the epidemic was shifted to the right by several weeks. This suggests that the model parameterization may benefit from the incorporation of one or more past epidemics.
In order to forecast future DF/DHF epidemics, it will be necessary to run such a model with independent variables lagged in the positive direction. With the exception of EVI, which was fairly synchronous with DF/DHF case fluctuations, maximum cross-correlations were obtained for NDVI and the ENSO indices around +40 weeks, which would provide ample warning for public health authorities to prepare for a potential outbreak. This result is also consistent with studies that analyzed egg survival times in Ae. aegypti, which can range from four to six months (Sota and Mogi 1992) up to a year (Christophers 1960). Moreover, temperature variations may also explain some of the variance in dengue cases owing to the decrease in the extrinsic incubation period (the time between an infective blood meal and infectivity of the mosquito) when temperatures range between 32 and 35 °C (Schreiber 2001). This may explain why negative departures in Pacific SST (the cool phase of ENSO) appear to be inversely related to DF/DHF cases in our study area. Of course unexplained variance in DF/DHF cases may also relate to a variety of factors including herd immunity to the dominant circulating serotypes, specific health interventions or adoption of vector control measures. Considerable variation in national-level control measures and the expansion of circulating serotypes into highly susceptible populations may have affected the magnitude of the epidemic in 2005 and account for some of the unexplained variance in the model, in addition to the possible misreporting and underreporting of cases, which is a common problem in endemic areas (Halstead 2008). Nonetheless, model simulations that incorporate NDVI, EVI and ENSO indices all produce a notable peak in 2005, which suggests the model may be used to predict future outbreaks despite underestimating to varying degrees the recorded cases in 2005.
The model may be improved in a variety of ways, for example, by incorporating different types of indices relating to other climate teleconnections (e.g., the Pacific Decadal Oscillation, Atlantic Multi-decadal Oscillation), use of wavelet transforms (e.g., Cazales et al 2005) to better characterize changes in the dominant frequencies that control the non-stationary nature of dengue time series, inclusion of locally based climate data (e.g., temperature fields) that are used in more traditional models of vector population dynamics (Focks et al 1993a, 1993b), and use of seasonal autoregressive modeling (Chaves and Pascual 2007).
Both dengue and malaria are among a set of acute vector-borne diseases that show seasonality and a clear association with rainfall and temperature (Halstead 2008). However, in developing countries where meteorological station data tend to be spatially sparse and often include large temporal gaps, other climate variables may be substituted for station data when studying climate–disease patterns. The ability to substitute ENSO SST data for spatially sparse station data is a particular strength of our approach as ENSO data are compiled regularly and consistently at weekly intervals.
The overall success of climate-based models for vector-borne disease prediction has led to greater interest in the development of operational early warning systems (EWS) (Kelly-Hope and Thomson 2008). The feasibility of malaria EWS has been established for parts of Africa (e.g., Thomson et al 2005, Ceccato et al 2007, Jones et al 2007) and these approaches typically incorporate SST and remotely sensed vegetation indices (usually NDVI). Depending on the model, malaria EWS typically show comparable predictive power to our model for DF/DHF cases in Costa Rica (e.g., Jones et al 2007). According to Chaves and Pascual (2007), the accuracy of infectious disease EWS typically ranges from 50% to 80% for disease burdens at timescales of one year or less. DF/DHF is generally considered more difficult to predict using climate variables than malaria, since many malaria vectors (mainly Anopheles mosquitoes) tend to deposit their eggs in rain-fed water bodies that are much larger than the container habitats used by the Ae. aegypti mosquito, and the eggs of malaria vectors do not survive desiccation. Further, container reduction efforts in urban areas are largely independent of climate, as are various other control measures such as spraying and use of larvicides (Halstead 2008).
Further work is needed to develop a spatial dengue EWS that entails the creation of risk maps showing likely patterns of disease burden (Kelly-Hope and Thomson 2008). However, in the case of DF/DHF there may be clear patterns of spatial variability that relate to the built environment and tree cover (Troyo et al 2008, 2009), local climate variability, as well as differential herd immunity (Halstead 2008). Thus, more work is needed to develop spatiotemporal models that predict incidence and spread of dengue and dengue vectors for endemic urban areas. Fortunately, a range of remotely sensed data may be used to derive information on Ae. aegypti habitat suitability and rainfall in the tropics, and high-resolution (10′ × 10′) climate surfaces depicting rainfall and temperature (e.g., New et al 2002) may be used to drive a spatial version of our model for application at a regional scale to predict dengue cases. However, much higher-resolution surfaces would be needed to apply such an approach to small countries such as Costa Rica.
Our analysis shows that a relatively simple structural model that incorporates lagged SST and MODIS vegetation indices explained 83% of the variance in weekly DF/DHF cases in Costa Rica from 2003 to 2007. When run with the independent variables lagged in the positive direction, the model also performed reasonably well (R2 = 0.64); i.e., within the range of accuracies of most climate-based disease EWS (Chaves and Pascual 2007). Given all the factors that tend to be associated with DF/DHF, including poor sanitation, inadequate management of small containers, variable efficacy of vector control, underreporting of cases and immunity to circulating serotypes, the results reported here suggest the feasibility of advancing DF/DHF prediction at national-to-regional scales using climate-based statistical models that estimate future outbreaks and quiescence. Moreover, the temporal resolution of our model is higher than that used in other predictive tools such as malaria EWS, which typically rely on monthly data to generate advanced notification of disease risk. This is significant because the onset of DF/DHF epidemics can be rapid and weekly data are more appropriate than monthly observations to capture rapid fluctuations in the independent variable. Our CCF analysis suggests that the model may be used to predict DF/DHF outbreaks as early as 40 weeks in advance and may also provide valuable information on the magnitude of future epidemics. In its current form we believe the model may be used to inform national vector control programs and policies regarding control measures, including prevention and planning of medical services for those likely to be affected during future outbreaks. 
Our climate-based model is the first that has been developed for this country (Troyo et al 2006). It is potentially scalable to the broader region of Latin America and the Caribbean and may therefore be applied to other countries that are experiencing dramatic increases in DF/DHF incidence and spread.
The authors wish to thank Nelson Mena and Lucia Obando for their assistance in processing the MODIS-EVI data. Support for JCB comes from NIH grant P20 RR020770 as well as the Abess Center for Ecosystem Science and Policy, University of Miami.