|Home | About | Journals | Submit | Contact Us | Français|
The geographic pattern of human risk for infection with Borrelia burgdorferi sensu stricto, the tick-borne pathogen that causes Lyme disease, was mapped for the eastern United States. The map is based on standardized field sampling in 304 sites of the density of Ixodes scapularis host-seeking nymphs infected with B. burgdorferi, which is closely associated with human infection risk. Risk factors for the presence and density of infected nymphs were used to model a continuous 8 km×8 km resolution predictive surface of human risk, including confidence intervals for each pixel. Discontinuous Lyme disease risk foci were identified in the Northeast and upper Midwest, with a transitional zone including sites with uninfected I. scapularis populations. Given frequent under- and over-diagnoses of Lyme disease, this map could act as a tool to guide surveillance, control, and prevention efforts and act as a baseline for studies tracking the spread of infection.
Lyme disease, the most prevalent vector-borne disease in the United States is caused by Borrelia burgdorferi sensu stricto, a tick-borne spirochete. In the eastern United States, the bacterium is maintained in a horizontal transmission cycle between its vector, the black legged tick Ixodes scapularis, and vertebrate reservoir host species. Humans are incidental hosts, acquiring the pathogen through tick bites. Most patients develop a distinctive rash, erythema migrans, which is accompanied by flu-like symptoms such as fatigue, headache, mild stiff neck, joint and muscle aches, and fever.1 In some untreated cases, symptoms of disseminated disease involving neurologic, cardiac, or articular complications, may develop weeks or months after exposure.2 Since initially described in Lyme, Connecticut,3 the disease has steadily increased in incidence and expanded its geographic range,4 causing a regional epidemic in the eastern United States and southeastern Canada.
Accurate information on spatial patterns of human risk of exposure to infected ticks is essential for the public to make personal protection decisions and for efficient allocation of public health resources. Delineation of Lyme disease-endemic areas also assists local medical communities in considering a diagnosis of tick-borne disease. Accurate and timely diagnosis is critical as delay may lead to severe disease requiring more aggressive treatment.5 On the other hand, overuse of antibiotics sometimes results in serious negative outcomes, potentially including death.6,7 Considering the 2.7 million diagnostic assays for B. burgdorferi that are conducted annually in the United States,8 even a small proportion of false positive results could dwarf the number of reported cases (~20,000 cases/year4) and result in a skewed distribution of case reports. Finally, accurate information on local exposure risk can guide the use of antimicrobial prophylaxis for the prevention of Lyme disease after a recognized tick bite, currently recommended where tick infection prevalence is greater than 20%.9
The Lyme disease case definition currently adopted by the Centers for Disease Control and Prevention (CDC) considers endemic those counties with at least two confirmed, locally acquired cases or in which established populations of a known tick vector are infected with B. burgdorferi.10 However, significant Lyme disease underreporting and misdiagnosis4 and geographic expansion in vector distribution limit the reliability of using past human cases to predict risk. In addition, the variable interval between time of exposure to infected ticks and manifestation of symptoms confounds the precise determination of exposure location, and can result in the incorrect association of cases with specific counties. Acarological risk, as measured by the density of infected host-seeking I. scapularis nymphs has been previously found to be positively correlated to Lyme disease incidence on a regional scale11 and is free of some of the biases involved in human case reporting.
The use of geographic information systems and remote sensing techniques to map vector-borne diseases has evolved significantly over the past 25 years. The convergence of factors such as the availability of multi-temporal satellite data and sophisticated statistical and image processing algorithms have provided the necessary tools to generate predictive surfaces of disease risk based on vector or human case data that can be used to guide prevention and interventions (reviewed in Reference 12).
We developed an acarological risk map for Lyme disease based on standardized field sampling to estimate the density of B. burgdorferi-infected host-seeking nymphal I. scapularis throughout the range of the tick. We focused on the nymphal stage because it is the only tick life stage that has a significant role as a vector for B. burgdorferi in eastern North America11,13 as a result of its small size, propensity to feed to repletion on humans, and summer host-seeking activity. Building on our previous research to identify the environmental predictors for the density of ticks,14 here we assess environmental predictors for nymphal infection prevalence and combine them to model a continuous predictive surface of human risk of exposure to infected ticks independent of human case reports.
The study area included the continental United States east of the 100th meridian (37 states), encompassing the known I. scapularis distribution.15 A spatially stratified random design was applied by overlaying a two-degree sampling grid across the study area; state parks or other publicly accessible forested areas were randomly selected within each grid cell. A total of 304 sites were sampled between 2004 and 2007, 30 of which were repeatedly sampled in 2 to 4 years, resulting in a total of 348 site-year samples. No infection data were available for four sites resulting in 344 site-years. Within each site, we measured relative nymphal density by drag sampling16 of closed-canopy deciduous forest along five 200 m transects. To capture the host-seeking phenology of I. scapularis nymphs, we visited sites a median of five times during late spring and summer (range: 1–6), when nymphs actively seek their hosts in the northeastern United States.17 Sampling was performed between 19 May and 27 August in 2004, 9 May and 4 October in 2005, 10 May and 30 September in 2006, and 8 May and 4 August 2007. In 2007, data on tick density was not collected; therefore, only infection prevalence data are reported. Additional details of the sampling methodology are described in a previous publication with focus on modeling the distribution of host-seeking nymphs.14
The dependent variable in the analysis to follow is the annual mean density of infected nymphs per 1,000 m2. This was derived from multiplying the density of nymphs per 1,000 m2 by the infection prevalence at each site-year. The estimated annual mean density of nymphs per 1,000 m2 was derived by 1) calculating the mean number of I. scapularis nymphs collected per visit by averaging the collections from the five transects; 2) representing the observed phenology at a site by plotting the mean number of nymphs per visit by time; 3) calculating the area under phenology curve; and 4) dividing this area by the total number of days elapsed between the first and last sampling visit at a site for a given year.14
Following ammonium hydroxide total DNA extraction, the presence of B. burgdorferi DNA was determined for all I. scapularis nymphs in 344 of the sites/years by polymerase chain reaction amplification of a portion of the 16S rRNA gene with species-specific primers.18
Infection prevalence was calculated for each collection site-year by dividing the total number of B. burgdorferi-positive nymphs by the total number tested at each site-year. The prevalence estimate at each site was used to calculate the density of infected nymphs, regardless of sample size. We assumed infection prevalence did not vary over one transmission season.
The environmental covariates evaluated as predictors of the density of infected nymphs were selected from among those found to be associated with the density of host-seeking I. scapularis nymphs,14 in addition to landscape metrics describing forest fragmentation patterns that we hypothesized to be indirectly associated with nymphal infection prevalence with B. burgdorferi. Smaller forest fragments have been linked to higher abundance and infection prevalence of nymphal I. scapularis.19,20
Our previous model of nymphal density14 assessed a large number of variables (Supplementary Technical Appendix). On the basis of the best predictive model for the density of nymphs, we included in the current model elevation, mean vapor pressure deficit (VPDm), the annual amplitude of the maximum temperature cycle (TMAXaa), the annual phase of the minimum temperature cycle (TMINap), the annual amplitude of the normalized difference vegetation index (squared, NDVIaa), and an autocovariate term that accounts for the expectation of sites closer in space to be more similar to one another than sites farther apart.14
We used data derived from the 30 m resolution 2001 National Land Cover Database (NLCD21) to characterize spatial patterns in the distribution of deciduous and mixed deciduous-evergreen forest cover and fragmentation in 8 km× 8 km cells centered on each of our sampling sites. Each cell included 71,111 30 m NLCD pixels, and we calculated the following fragmentation metrics using Fragstats (version 3.3, Amherst, MA): total area, proportion of the landscape occupied by deciduous-mixed forest, number of deciduous-mixed forest patches, total linear edge between deciduous-mixed forest and other land cover types, proportion of the landscape occupied by the largest deciduous-mixed forest patch, mean deciduous-mixed forest patch area, average Euclidean distance between nearest-neighbor deciduous-mixed forest patches, an index of total aggregation of deciduous-mixed forest patches, an index of mean edge/area ratio for deciduous-mixed forest classes, and perimeter-area fractal dimension. We also calculated Shannon's and Simpson's diversity indices using all classes included in the 2001 NLCD (N = 27) independently, but collapsing deciduous and mixed forest.
After preprocessing the data, the values of the environmental data sets corresponding to each site were extracted for the point location of each site from the closest 8 km×8 km pixel.
We developed a zero-inflated negative binomial (ZINB) regression model for the expected density of infected I. scapularis nymphs using environmental covariates as predictors (countreg procedures, SAS software, SAS Institute Inc., Cary, NC). The ZINB regression model is used for data that appears to have a negative binomial distribution with an excess number of zero-valued observations. The ZINB regression model simultaneously uses a logistic function to model the process that is generating the excess zeros and a negative binomial regression function to model the process that is generating the remaining data.22
We used a modified forward stepwise procedure to select the variables to include in the model. We initially included all variables present in the model for the density of host-seeking nymphs14 and subsequently added a single variable to either the logistic or the negative binomial components of the model. At each step, we identified the best fit model as the one where all variables were significant (P < 0.05) and had the lowest Akaike information criterion. Variables were added until there were no improvements in the model fit.
The acarological risk map was developed by running the selected regression model using the values of the environmental predictors at each pixel location. In addition to presenting a continuous surface of acarological risk, we generated confidence bands for the predicted surface. A confidence band (CB) essentially estimates the 95% confidence interval (CI) for the entire range of expected values produced by the regression function. In contrast with simple regression models, for which a closed form solution can be calculated for the joint distribution of regression parameters, such closed form solution is not readily available for the ZINB. Therefore, we used bootstrap simulations to generate CIs for each density estimate produced in our analysis. Collectively, these bootstrap CIs form an estimate of the confidence band or confidence surface for the regression function. To generate these CIs, a total of 10,000 pseudo-samples of size 344 were created by randomly selecting (with replacement) from the data set used to do the original analysis. The ZINB regression model used for the original analysis was applied to each pseudo-sample, resulting in estimates of density of infected nymphs for every pixel. By ranking the estimates for a pixel and identifying the 5th and 95th percentiles, an empirical estimate for the upper and lower bootstrap confidence surface was obtained for the original estimates of our regression model. High-risk areas were defined as those for which the lower bound of the 95% CB was larger than a threshold number of infected nymphs and low risk were those for which the upper bound of the 95% CB was smaller than the threshold number of infected nymphs. Transitional areas were defined as those neither in the high nor low risk categories. A range of thresholds were examined and the one that maximized the sensitivity of the model was selected for the final risk map (Supplementary Figure S1).
A total of 5,332 I. scapularis nymphs were collected at 94 sites (107 site-years) out of 304 sites (344 site-years) (30.1%) sampled between 2004 and 2006, with yearly totals in positive sites ranging from 1 to 506. The weighted mean density of host-seeking nymphs per 1,000 m2 was 3.28 (s.d. 10.53), ranging from 0 to 101.29 (Supplementary Table 1). Two population foci were identified in the Northeast and upper Midwest, with no nymphs collected between these two regions (in and around Ohio). A map and details of the nymphal density data have been previously published.14
The DIN per 1,000 m2 was derived for each site-year by multiplying the annual mean density of host-seeking I. scapularis nymphs (DON) by the infection prevalence at each site-year. Mean DIN was 0.64 (s.d. 2.24), ranging from 0 to 18.62 (Figure 1). Two discontinuous acarological risk foci were identified, one from northern Virginia to southern Maine and one in the upper Midwest, centered in Wisconsin and Minnesota. The density of infected nymphs per 1,000 m2 was not significantly different between the two foci or between sites with high and low density of nymphs, split at the median (Mann-Whitney two-sample statistic, P < 0.05).
The ZINB model for the density of infected host-seeking I. scapularis nymphs with the lowest Akaike information criterion and for which all covariates were significant (P < 0.05) included elevation, VPDm, and TMINaa as predictors for the absence of infected nymphs. The spatial autocovariate term and the proportion of the landscape occupied by the largest deciduous-mixed forest patch were significant positive predictors for the density of infected nymphs in positive sites (Table 1). No spatial autocorrelation was detected in the residuals of the regression model (Moran's I Z = −1.28, P = 0.19).
A threshold of 0.3 infected nymphs per 1,000 m2 was selected to define high- and low-risk areas; the model's sensitivity was maximized using this threshold, although retaining a high model specificity and overall accuracy (Supplementary Figure S1). Of the 344 site-years sampled, 279 were classified with 95% confidence using this threshold, 62 as high risk and 217 as low risk (Figure 2), with just three high-risk sites misclassified as low risk. Accuracy and precision metrics are presented in Table 2.
Ixodes scapularis nymphs from 92 of the sites (5,328 nymphs; 4 nymphs from 2 sites could not be processed) were tested for the presence of B. burgdorferi DNA. Overall infection prevalence was 0.20 (1,044 positive nymphs/5,328 nymphs tested). At least one positive nymph was collected in 50 sites (61 site-years). Of the remaining 42 sites, in only 4 sites we sampled > 14 nymphs, which we estimated as the threshold to be 95% confident that the site is negative based on the binomial probability distribution. Two of these sites were located along the Illinois-Indiana border and two along the New York-Vermont border (Figure 3).
We selected a subsample of sites with a minimum of 100 nymphs per sampling year to examine spatial and temporal patterns in infection prevalence (Figure 4). We used a binomial log-likelihood probability function to estimate 95% CIs for each prevalence estimate.
Climate and landscape-based modeling of Lyme disease acarological risk revealed two discontinuous population foci for both B. burgdorferi and I. scapularis, one in the Northeast and one in the upper Midwest. The ZINB modeling approach allowed for the identification of separate sets of variables for predicting presence/absence and density of infected nymphs. Lower elevation, low vapor pressure deficit, and low seasonal extremes in minimum temperature were associated with the presence of infected nymphs. No nymphs were collected above a threshold elevation of 510 m. The less fragmented landscapes were associated with higher densities of infected nymphs in positive sites. The density of infected nymphs was also associated with the density of infected nymphs in nearby sites, as represented by the autocovariate term. The spatial dependency captured by this term is likely caused by recent and ongoing population expansion of I. scapularis from past refuges in the Northeast23 and the upper Midwest24 following environmental changes such as reforestation, suburbanization, and reintroduction of deer.25 The inclusion of an autocovariate term captures an important feature of this ongoing expansion, that is, the so far unrealized occupation of all suitable habitats by I. scapularis, a consequence of constraints on dispersal or establishment.26
The absence of host-seeking nymphs infected with B. burgdorferi in the southern portion of the range of I. scapularis is consistent with previous studies.27,28 Although I. scapularis populations are present in this region, the nymphs have an altered feeding behavior presumably adapted to lizards and skinks and cannot be readily sampled by the drag cloth collecting method.14 Because this method is directly correlated with human contact with host-seeking ticks, it is considered a direct measure of risk for tick bites.11,29 The absence of infected host-seeking nymphal I. scapularis in most southern states suggests that reported cases from this region may be mostly caused by either misdiagnosis or travel to an endemic area.
The threshold for classifying areas into low and high risk was selected to maximize the sensitivity of the model, given the high potential cost of underestimating risk of infection. Using this threshold, the final model had 91% accuracy, 93% sensitivity, and 90% specificity. The model predicted the presence of infected nymphs in some negative sites for which other sources indicate emerging risk, such as eastern Maine,30 the Illinois/Indiana border,24,31 the New York/Vermont border,32 southwestern Michigan,33 and eastern North Dakota (Vaughan J, personal communication). The density of infected nymphs was often underestimated by the model in these regions, and bootstrap CIs often included the threshold value for infected nymphs, therefore the sites could not be assigned to a 95% confidence risk category. Uninfected I. scapularis populations were also identified along two of these putative expansion fronts (Illinois/Indiana and New York/Vermont), consistent with a pattern of initial tick spread followed by the spread of B. burgdorferi.34 An alternative invasion pattern was detected in Michigan,33 where low-prevalence B. burgdorferi infection was detected in other tick species and in wildlife at inland sites before the arrival of I. scapularis. The authors suggested a cryptic B. burgdorferi transmission cycle by other vector-competent tick species in the absence of I. scapularis. Further studies are warranted to evaluate the accuracy of the predictive model and to describe the patterns and processes driving the expansion of I. scapularis and B. burgdorferi.
The model performed poorly in western Pennsylvania, underestimating the density of infected nymphs at Parker Dam State Park in northern Clearfield County and in Presque Isle State Park in Erie County. Borrelia burgdorferi was previously detected in Peromyscus leucopus in Elk County, north of Clearfield County,35 and both I. scapularis and B. burgdorferi were previously reported in Erie County.36 Low estimated density of infected nymphs at these sites is caused by their isolation from other high-density sites, resulting in low values of the autocovariate term. In addition, Parker Dam State Park's high elevation (490 m), near the limit of the nymphal distribution identified by our study (510 m), further reduced the density estimate for this site. The isolation of these infected tick populations suggests a separate Lyme disease focus from the known northeastern and midwestern foci37; further study is needed to determine the extent of these isolated Pennsylvania foci.
The spatial pattern of infected nymphs was highly consistent with the distribution of all nymphs found in our previous model,14 suggesting that nymphal density, rather than infection prevalence, drives the spatial patterns in acarological risk. No obvious spatial structure was detected in the distribution of nymphal infection prevalence (Figure 4). However, low precision of the infection prevalence estimates because of small sample sizes at the site level limit any inferences beyond a regional level.
The probability of finding nymphs was driven by elevation, VPDm, and seasonal variation in temperature. These variables were selected for use in the current model because of their predictive value for the density of I. scapularis nymphs.12 The density of host-seeking ticks of Lyme disease vectors in Europe has been found to decrease with increasing elevation38,39; the extent to which this effect is caused by temperature remains to be determined. The strong predictive power of VPD on the density of host-seeking nymphs is consistent with other studies.40–42 Water stress and high temperatures are hypothesized to regulate tick populations by decreasing tick survival during off-host periods43 and regulating host-seeking activity.44–47
Seasonality in temperature has also been an important predictor in other studies of the distribution of vectors and disease,48 including tick-borne diseases.49 Temporal Fourier analysis captures both the extremes in temperature and the rates of fall cooling and spring warming. Extreme winter temperatures can limit the northern distribution of ticks by directly killing the ticks,43,50 inhibiting host-seeking activity,44,46,47 or limiting the availability of hosts.51 The high rate of fall cooling can also affect tick population dynamics by limiting the time available for larvae to find a host in the fall, thus entering diapause unfed,52 which potentially increases their mortality rate.52 On the other hand, high rates of spring warming may have an opposite effect, because it would result in faster accumulation of degree-days for development, potentially leading to earlier egg deposition and larval emergence.51
Previous studies identified a positive link between forest fragmentation and both tick density and infection prevalence.19,20 Extensive data mining of our dataset did not identify any significant landscape predictors of nymphal infection prevalence (results not shown), and the only landscape predictor included in the final regression model reflected a negative, rather than positive, link between forest fragmentation and the density of infected nymphs. The larger spatial extent and coarser spatial scale of this study may partially explain differences with previous studies.53 Forest fragmentation metrics were calculated for each 8 km×8 km pixels in this work, although previous studies focused on a scale of a few hectares19 to 0.5 km.20 At the resolution of our study, highly urbanized or agricultural areas appear as highly fragmented pixels, whereas lower fragmentation is linked to more forested areas that typically support I. scapularis populations. A limitation of this study is that sampling was limited to natural areas to standardize sampling at a continental scale. However, human risk for Lyme disease appears to be peridomestic, at least in the northeastern United States.54,55 More studies on the associations between the density of infected nymphs and human risk at different scales are warranted.
In terms of public health applications, the lack of spatial structure in nymphal infection prevalence, combined with the difficulty in accurately estimating prevalence with small sample sizes, brings into question the validity of using an exact prevalence threshold to guide clinical decisions on treatment, such as the current recommendation of tick-bite prophylaxis when infection prevalence is > 0.2.9 Our results indicate that the presence of any number of infected nymphs may be considered sufficient to recommend post-exposure prophylaxis. Infected nymphs were found in 92.3% of the sites where a threshold of 14 nymphs per 1,000 m2 were collected.
The construction of an accurate map showing the spatial distribution and density of I. scapularis in the United States has been limited by the use of non-standardized, county-based data on the distribution of I. scapularis,15 the mapping of all I. scapularis life stages,56,57 and the absence of data on nymphal infection with B. burgdorferi.14 The Lyme disease risk map that we present here is based on standardized sampling of infected host-seeking I. scapularis nymphs, the primary vector stage of B. burgdorferi for humans throughout the geographic range of the species. Although the large spatial extent of this map necessarily limits its spatial resolution and localized studies are warranted in some focal areas, our map makes new contributions to surveillance, prevention, and control programs. Specifically, this risk map can assist in surveillance and control programs by identifying regions where human cases are expected and may assist treatment decisions such as the use of antimicrobial prophylaxis following a tick bite. Finally, this map provides an essential baseline for tracking spread of the infection from areas endemic in 2004–2007.
Supplementary Figure S1.
Supplementary Table 1
We thank the 80 field assistants who made this project possible. Special thanks to Tim Andreadis, Katherine Watkins, Laura Kramer, Jessica Payne, Elizabeth Racz, Kelly Liebman, Liza Lutzker, and David Boozer for tick identification and logistic support; Carlos Diuk for database support; Brad Lobitz and Andrew Michaelis for technical assistance; Dennis Grove, Lindsay Rollend, and Russell Barbour for arranging collection permits. We also acknowledge Corrine Folsom-O'Keefe for manuscript editing and Kimberly Tsao for helpful comments, and John Brownstein and Ben Beard for their early contributions to this work.
Note: Supplemental appendix, table and figure appear at www.ajtmh.org.
Financial support: This project was funded by CDC-Division of Vector-Borne Infectious Diseases Cooperative Agreement No. CI00171-01.
Authors' addresses: Maria A. Diuk-Wasser and Paul Cislo, Division of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, E-mails: maria.diuk/at/yale.edu and prcislo/at/juno.com. Anne Gatewood Hoen, Community and Family Medicine, Dartmouth Medical School, Lebanon, NH, E-mail: Anne.G.Hoen/at/dartmouth.edu. Robert Brinkerhoff, Biology Department, University of Richmond, Richmond, VA, E-mail: jbrinker/at/richmond.edu. Sarah A. Hamer and Jean I. Tsao, Department of Fisheries and Wildlife, Michigan State University, East Lansing, MI, E-mails: hamer/at/msu.edu and tsao/at/msu.edu. Michelle Rowland, Magee-Women's Hospital of UPMC, Pittsburgh, PA, E-mail: rowlandmr/at/upmc.edu. Roberto Cortinas, Department of Entomology, University of Nebraska, Lincoln, NE, E-mail: rcortinas/at/unlnotes.unl.edu. Gwenaël Vourc'h, Institut National de la Recherche Agronomique, Saint Genès Champanelle, France, E-mail: gvourch/at/clermont.inra.fr. Forrest Melton, California State University, Monterey Bay, Seaside, CA, E-mail: Forrest.S.Melton/at/nasa.gov. Graham Hickling, Department of Forestry, Wildlife and Fisheries, University of Tennessee, Knoxville, TN, E-mail: ghicklin/at/utk.edu. Jonas Bunikis, Vilnius University, Vilnius, Lithuania, E-mail: Jonas.BUNIKIS/at/ec.europa.e. Alan G. Barbour, Department of Microbiology and Molecular Genetics, University of California, Irvine, CA, E-mail: abarbour/at/uci.edu. Uriel Kitron, Department of Environmental Studies, Emory University, Atlanta, GA, E-mail: ukitron/at/emory.edu. Joseph Piesman, Division of Vector-Borne Infectious Diseases, Centers for Disease Control and Prevention, Fort Collins, CO, E-mail: jfp2/at/CDC.GOV.