We used WNV disease in Colorado as a case study to quantitatively examine 1) the degree to which estimates of vector-borne disease incidence are influenced by the spatial scale of data aggregation (i.e., county versus census tract), and 2) the extent of concordance between spatial risk patterns based on disease case counts versus disease incidence for commonly used spatial boundary units. The analyses showed that variability in WNV disease incidence within counties is approximately the same as the variability between counties, and that county-scale determinations of spatial WNV disease incidence patterns therefore account for only approximately 50% of the variance in WNV disease incidence that is shown at the census tract scale. This pattern was even stronger for WNND, with variability in incidence within counties approximately twice the variability between counties and the county scale accounting for only approximately 33% of the variability evident at the census tract scale. Use of the county scale was also found to mask hot-spots for WNV disease evident at finer scales (census tract or zip code) in counties with low overall WNV disease incidence. Furthermore, there was high concordance between spatial patterns of areas with high risk for exposure to WNV based on WNV disease incidence and WNV disease case counts for the census tract scale but not for the county or zip code scales. The primary weakness of the study, which needs to be addressed in prospective follow-up studies, is the lack of reliable information on WNV exposure sites for patients. Developing a more detailed understanding of the spatial dimensions of WNV transmission to humans in different environments, for example in urban versus rural areas, is an important next step to provide additional data to guide the public health community in the choice of appropriate spatial boundary units for presentation of aggregated vector-borne disease data.
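The link between the variance ratios and the percentages quoted above can be checked with a short calculation. The sketch below assumes a simple additive decomposition in which the variance seen at the census tract scale is the sum of a between-county and a within-county component; the function name `county_share` and the unit-free ratios are illustrative and are not taken from the study's own analysis.

```python
def county_share(within_to_between_ratio: float) -> float:
    """Fraction of tract-scale variance captured at the county scale.

    Assumes total variance = between-county + within-county variance,
    with the within-county component given as a multiple of the
    between-county component.
    """
    if within_to_between_ratio < 0:
        raise ValueError("variance ratio must be non-negative")
    between = 1.0
    within = within_to_between_ratio * between
    return between / (between + within)


wnv_share = county_share(1.0)   # within-county variance about equal to between-county
wnnd_share = county_share(2.0)  # within-county variance about twice between-county
print(f"WNV disease: {wnv_share:.0%}, WNND: {wnnd_share:.0%}")
```

Under this assumed decomposition, a 1:1 ratio gives one half and a 2:1 ratio gives one third, which matches the approximate 50% and 33% figures reported for WNV disease and WNND.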
There is a diverse stakeholder community with an interest in spatial patterns of risk for contracting diseases caused by vector-borne pathogens. In the specific case of WNV disease, stakeholders include federal, state, and local public health agencies, mosquito control programs, health care providers, purveyors of disease prevention products, and the general public. These stakeholders have needs for spatial information that differ not only in terms of scale but also in type of information. For example, a mosquito control program aiming to implement control activities to suppress vector mosquitoes and reduce the burden of WNV disease likely will be most interested in finding out where high numbers of WNV disease cases occur at sub-county scales to focus expensive prevention efforts. Conversely, a member of the public seeking information to help determine his/her personal risk of exposure to WNV, and the need for use of personal protective measures such as repellents, will be more interested in a spatial risk estimate based on WNV disease incidence (which accounts for population size) in the area of interest. The challenge presented to public health map-makers is to present stakeholders with a package of suitable and easy-to-understand information for spatial risk patterns in electronic map formats while at the same time protecting patient privacy and carefully considering benefits and drawbacks to determination and presentation of risk assessments at different spatial scales.23
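The difference between the two kinds of information can be made concrete with a small sketch. The area names, case counts, and population sizes below are invented for illustration only; the point is that ranking areas by case count and ranking them by incidence per 100,000 population can put different areas at the top.

```python
# Hypothetical areas: (name, cases, population). All values are invented.
areas = [
    ("Urban tract", 12, 60_000),
    ("Suburban tract", 6, 40_000),
    ("Rural tract", 3, 2_000),
]


def incidence_per_100k(cases: int, population: int) -> float:
    """Cases per 100,000 population."""
    return cases * 100_000 / population


# Ranking a mosquito control program might prefer: where are the most cases?
by_count = sorted(areas, key=lambda a: a[1], reverse=True)

# Ranking a member of the public might prefer: where is individual risk highest?
by_incidence = sorted(
    areas, key=lambda a: incidence_per_100k(a[1], a[2]), reverse=True
)

for name, cases, population in areas:
    print(f"{name}: {cases} cases, {incidence_per_100k(cases, population):.1f} per 100,000")
```

In this made-up example the urban tract has the most cases, whereas the rural tract has by far the highest incidence.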
Basic options to present information for spatial risk of vector-borne diseases in map formats include point locations for disease cases or aggregation of disease case counts or disease incidence to administrative boundary units (summarized in the table "Options for presentation of spatial patterns of vector-borne diseases"). A map showing individual case point locations is the most spatially precise way to present disease data. However, this approach has distinct disadvantages including 1) the possibility that the address of residence is not the site of pathogen exposure, 2) a lack of accounting for population size, and 3) in some countries, including the United States, strict regulations that govern the use of patient health information.24–26 The latter issue can be addressed by applying random offsets to the actual location of the patient's residence, but this essentially means that an inaccurate disease case location map is presented.
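The trade-off introduced by random offsets can be sketched as follows. This is only one possible masking scheme, assuming projected coordinates in metres and a maximum offset of 1 km; the function `offset_point`, the coordinates, and the offset distance are hypothetical and do not describe any agency's actual procedure.

```python
import math
import random


def offset_point(x_m, y_m, max_offset_m, rng):
    """Return (x, y) displaced by a random bearing and distance.

    Coordinates are assumed to be in a projected system measured in metres.
    Taking the square root spreads displaced points uniformly over the
    disc area instead of clustering them near the true location.
    """
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    distance = max_offset_m * math.sqrt(rng.random())
    return x_m + distance * math.cos(bearing), y_m + distance * math.sin(bearing)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
true_location = (500_000.0, 4_400_000.0)  # invented residence coordinates
masked = [offset_point(*true_location, 1_000.0, rng) for _ in range(5)]

# Positional error introduced by masking, in metres, for each displaced point.
errors_m = [math.dist(true_location, point) for point in masked]
```

Each masked point protects the residence location, but `errors_m` shows the price: every mapped case is misplaced by up to the chosen maximum offset.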
Options for presentation of spatial patterns of vector-borne diseases
A commonly used approach to avoid privacy issues is to aggregate disease case counts or disease incidence to administrative boundaries. This approach in turn raises the issue of the modifiable areal unit problem,27 which arises when numerical results vary depending on the level of spatial resolution at which the same set of data is grouped, and prompts the question of which boundary unit best captures the variability of spatial vector-borne disease data without compromising data quality.23 Another issue to consider is that data collection practices for patients afflicted with common and less severe vector-borne diseases, such as WNF and Lyme disease, often do not enable reliable determination of probable pathogen exposure sites.15 This issue introduces uncertainty for pathogen exposure sites related to recognized disease cases and places restrictions on the use of fine spatial boundary units such as census blocks. In the United States, the Centers for Disease Control and Prevention and nearly all individual state health agencies provide spatial WNV disease information to the public at the county scale. One exception is the Colorado Department of Public Health and Environment, which, in addition to county-based information, also provides maps for WNV disease incidence by census tract.
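The modifiable areal unit problem described above can be illustrated with a minimal sketch in which the same case records are aggregated at two scales. The counties, tracts, case counts, and populations are invented; they are chosen only to show how a county-scale summary can hide a tract-scale hot spot.

```python
from collections import defaultdict

# Hypothetical records: (county, tract, cases, population). All values invented.
records = [
    ("County A", "A-1", 9, 3_000),
    ("County A", "A-2", 1, 47_000),
    ("County B", "B-1", 10, 25_000),
    ("County B", "B-2", 10, 25_000),
]


def incidence_per_100k(cases, population):
    """Cases per 100,000 population."""
    return cases * 100_000 / population


# Tract scale: one incidence value per tract.
tract_incidence = {
    tract: incidence_per_100k(cases, pop) for _, tract, cases, pop in records
}

# County scale: pool cases and population before computing incidence.
totals = defaultdict(lambda: [0, 0])
for county, _, cases, pop in records:
    totals[county][0] += cases
    totals[county][1] += pop
county_incidence = {
    county: incidence_per_100k(cases, pop) for county, (cases, pop) in totals.items()
}

hottest_tract = max(tract_incidence, key=tract_incidence.get)
hottest_county = max(county_incidence, key=county_incidence.get)
```

With these made-up numbers, County B has the higher county-scale incidence, yet the tract with the highest incidence lies in County A, so the conclusion about where risk is greatest depends on the boundary unit chosen.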
Although our results provide a compelling argument for display of risk patterns for exposure to vector-borne pathogens at sub-county scales, there are several problems that need to be considered before sub-county information is presented to end-users. There is no question that sub-county variability exists for risk of exposure to mosquito and tick vectors of human pathogens such as WNV and the Lyme disease spirochete, Borrelia burgdorferi, in the United States.28–32 The basic problem when working with sub-county spatial risk patterns developed based on epidemiologic data is to determine which of the resulting patterns are real and which are likely to be analysis artifacts. Such artifacts may occur for several reasons including that 1) case files for common vector-borne diseases, such as WNV disease and Lyme disease, often lack information for likely site of vector and pathogen exposure and thus the address of residence may not be the exposure location; 2) information that a case has occurred may result in other nearby cases being detected through increased risk perception and health care seeking; and 3) lack of access to health care among lower income zip codes or census tracts may prevent reporting and thus mask the presence of disease in those areas. These problems also occur at the county scale but can be assumed to have greater impact at sub-county scales.
One way to evaluate the accuracy of sub-county scale risk patterns that are based on epidemiologic data is to develop complementary spatial models based on entomological risk measures such as abundance of vectors or pathogen-infected vectors and compare the spatial patterns based on epidemiologic versus entomological data.14,32,33 Concordance between epidemiologic and entomological risk measures can validate sub-county scale risk patterns, whereas discordance indicates the need for additional investigations. For example, ground-based entomological surveillance in areas with high projected epidemiologic risk but low projected entomological risk can be used to assess whether the observed epidemiologic pattern represents real risk or more likely is a data artifact.
When choosing the most appropriate spatial scale to use for presentation of epidemiologic data for vector-borne diseases to stakeholder communities, we are faced with a situation where use of the county scale obscures variability in spatial risk patterns evident at sub-county scales. However, use of sub-county scales introduces more potential error in terms of actual pathogen exposure location not falling within the spatial boundary unit containing the case's residence. Prospective studies are urgently needed to determine the extent of this error for county versus sub-county scales for various vector-borne diseases. Use of sub-county units with small population sizes may also present the problem of unstable incidence rates.23 Numerous spatial statistical smoothing methods exist to deal with the problem of rate instability including local-area averaging or geostatistical smoothing such as kriging.34,35 Finally, our findings also highlight the need to present maps of vector-borne disease incidence at either county or sub-county scales together with information on the limitations for the scale at which data are presented.
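The rate instability problem and one simple remedy can be sketched as follows. The example uses a basic form of local-area averaging in which each unit's cases and population are pooled with those of its neighbors before the rate is computed; it is not kriging, and it is not necessarily the method used in the cited references. The tract identifiers, counts, populations, and adjacency list are all invented.

```python
def incidence_per_100k(cases, population):
    """Cases per 100,000 population."""
    return cases * 100_000 / population


def locally_averaged_rates(cases, population, neighbors):
    """Pool each area with its neighbors, then compute incidence.

    `neighbors` maps an area to the list of areas assumed to adjoin it.
    """
    smoothed = {}
    for area in cases:
        window = [area, *neighbors.get(area, [])]
        pooled_cases = sum(cases[a] for a in window)
        pooled_population = sum(population[a] for a in window)
        smoothed[area] = incidence_per_100k(pooled_cases, pooled_population)
    return smoothed


# Hypothetical tracts: T1 has a very small population, so one case swings its rate.
cases = {"T1": 1, "T2": 4, "T3": 5}
population = {"T1": 500, "T2": 8_000, "T3": 10_000}
neighbors = {"T1": ["T2"], "T2": ["T1", "T3"], "T3": ["T2"]}

raw = {a: incidence_per_100k(cases[a], population[a]) for a in cases}
smoothed = locally_averaged_rates(cases, population, neighbors)
```

In this made-up case the raw rate for T1 is driven by a single case in 500 residents, and pooling with its neighbor pulls it toward the rates of the surrounding tracts. The cost is that genuine local peaks are damped as well, which is one reason the limitations of any displayed scale should accompany the map.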
provides a powerful visual example of the value of side-by-side presentations of spatial disease patterns based on case counts versus incidence. At the county scale, there was low overall correlation between WNV disease incidence and case counts and poor concordance (50%) for counties categorized as high risk for WNV exposure based on case counts versus incidence. Because some stakeholders are better served knowing disease case counts (e.g., mosquito control programs) whereas other stakeholders need information based on disease incidence (e.g., general public), our findings argue for presentations of WNV disease data at the county scale that include maps showing WNV disease case counts and WNV disease incidence. Concordance between high-risk areas determined by case counts versus incidence was also poor for the zip code scale (31%) but much higher for the census tract scale (83%). This pattern of higher concordance for census tracts than for either zip codes or counties in Colorado likely results, in part, from census tracts having a more uniform population size (mean population = 4,427, SD = 2,321) than either zip codes (mean population = 10,742, SD = 13,584) or counties (mean population = 74,355, SD = 148,158).
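The population figures quoted above can be compared on a common footing by computing the coefficient of variation (SD divided by mean) for each boundary unit. The means and SDs below are the values reported in the text; the use of the coefficient of variation as the summary measure is an assumption of this sketch rather than a statistic reported in the study.

```python
# (mean population, SD) per boundary unit, as reported in the text.
population_stats = {
    "census tract": (4_427, 2_321),
    "zip code": (10_742, 13_584),
    "county": (74_355, 148_158),
}

# Coefficient of variation: spread of population size relative to its mean.
cv = {unit: sd / mean for unit, (mean, sd) in population_stats.items()}

for unit, value in cv.items():
    print(f"{unit}: CV = {value:.2f}")
```

A lower coefficient of variation means population sizes are more uniform across units, so case counts and incidence tend to rank units more similarly. On this measure census tracts are the most uniform of the three units and counties the least, although zip codes showed the lowest concordance (31%), so population uniformity appears to be only part of the explanation, as the text notes.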
The analytical methods used in our study on WNV disease in Colorado are broadly applicable to vector-borne diseases in North America where humans are incidental pathogen hosts. These include a wide range of diseases caused by pathogens transmitted by fleas (e.g., plague), mosquitoes (e.g., eastern equine encephalitis, La Crosse encephalitis, St. Louis encephalitis, western equine encephalitis and WNV disease) and ticks (e.g., babesiosis, Colorado tick fever, human granulocytic anaplasmosis, human monocytic ehrlichiosis, Lyme disease, Rocky Mountain spotted fever, tick-borne relapsing fever, and tularemia). The same methods may also be applicable to mosquito-borne diseases where humans serve as important or primary pathogen hosts (e.g., dengue and malaria), but this needs to be corroborated in future studies.
Our study demonstrates the potential value of using sub-county scales to determine and present spatial assessments of risk for vector-borne pathogens based on epidemiologic data. It also underscores some problem areas that need to be addressed in future studies including 1) development of a more detailed understanding of the spatial dimensions of WNV transmission to humans in different environments, to assess how much more error arises when WNV disease cases are assigned by address of residence at the census tract or zip code scale than at the county scale because pathogen exposure occurred outside of the census tract or zip code of residence but within the county of residence, and 2) assessment of how data collection practices could be changed to provide improved information regarding potential pathogen exposure sites without placing undue burdens on the medical community. Other important research needs include 1) development of spatial risk models based on entomological risk measures to complement risk assessments based on epidemiologic data, and 2) assessment of the extent to which model results may differ based on the scale of the data used to develop the model (for example, home location versus census tract or county of residence for models based on epidemiologic data). The latter question applies not only to vector-borne diseases but also broadly to other diseases with causes linked to environmental conditions that are spatially heterogeneous.
There is also a need for extensive research on mechanisms for delivering spatial risk maps and other risk assessment information to stakeholder communities, especially web-based mechanisms. This need includes 1) gaining a better understanding of what type of information different stakeholder groups feel that they require, and 2) determining optimal map and text formats to ensure that the message we aim to transmit is clear to the user. Evaluating the effect of different data presentations for disease risk (e.g., maps of WNV disease case counts versus disease incidence) also merits future research because threat perception is closely linked to use of personal protective measures such as mosquito repellents.