Detailed information regarding the spatial and/or spatial–temporal distribution of mortality is required for the efficient implementation and targeting of public health interventions.
Identify high risk clusters of mortality within the Agincourt subdistrict for targeting of public health interventions, and highlight areas for further research.
Mortality data were extracted from the Agincourt health and socio-demographic surveillance system (HDSS) for the period 1992–2007. Mortality rates by age group and time were calculated assuming a Poisson distribution and using precise person-time contribution estimates. A spatial scan statistic (Kulldorff) was used to test for clusters of age group specific all-cause and cause-specific mortality both in space and time.
Many statistically significant clusters of higher all-cause and cause-specific mortality were identified both in space and time. Specific areas were consistently identified as high risk areas; namely, the east/south-east and upper east central regions. This corresponds to areas with higher mortality due to communicable causes (especially HIV/TB and diarrhoeal disease) and indicates a non-random element to the distribution of potential underlying causative factors, e.g. settlements comprising former Mozambican refugees in the east/south-east of the site, corresponding higher poverty areas, South African villages with higher HIV prevalence, etc. Clusters of older adult mortality were also observed, indicating potential non-random distribution of non-communicable disease mortality.
This study has highlighted distinct clusters of all-cause and cause-specific mortality within the Agincourt subdistrict. It is a first step in prioritizing areas for further, more detailed research as well as for future public health follow-on efforts such as targeting of vertical prevention of HIV/TB and antiretroviral rollout in significant child and adult mortality clusters; and assessment and provision of adequate water and sanitation in the child mortality clusters, particularly in the south-east where diarrhoeal mortality appears high. Underlying causative factors need to be identified and accurately quantified. Other questions for more detailed research are discussed.
Reliable statistics on mortality, its causes and trends are in high demand for assessing the global and regional health situation and developing appropriate interventions. Countries that monitor mortality and its causes are among those that have made substantial progress in health. In the absence of routine mortality statistics (especially sub-Saharan Africa), health and socio-demographic surveillance site (HDSS) data provide a valuable source for estimating all-cause adult mortality and mortality trends. An additional benefit of HDSS implementing the verbal autopsy (VA) is that they are often the only data in many countries to monitor cause-specific mortality of a population on a longitudinal basis (1).
In 1992, the Agincourt subdistrict of Bushbuckridge was demarcated by Wits University as a site for health and socio-demographic surveillance (HDSS) and a baseline census was conducted that same year (2–4). Life expectancy among both males and females in Agincourt has significantly and steadily decreased (12 years in females and 14 years in males). The increases in mortality were most prominent in children (0–4) and young adults (20–49) where increases of two- and fivefold, respectively, have been observed when comparing mortality rates from 1992–1993 to 2002–2003. Gender differences in mortality patterns are also evident with more marked increases in females in most adult age groups (5). According to a study by Tollman et al. (6), comparing periods from 1992–1994 to 2002–2005, the increase in infectious and/or parasitic (I&P) disease mortality was significant in all age and sex groups except children aged 5–14 years (the increase in HIV and tuberculosis mortality was significant, however) and the elderly (65+). With respect to increased I&P disease mortality, the change was driven by HIV/TB. Age-specific mortality from non-communicable disease increased significantly in adults who were 30 years and older; the change in younger age groups was not significant. Thus the prominent increase in all-cause mortality is being driven by the large increase in I&P disease (HIV) and a modest increase in non-communicable disease (6). However, few true spatial analyses have been undertaken and thus offer an area for more detailed research within this site.
Benzler and Sauerborn (7) suggest that when population-wide intervention programs are too expensive to implement, it is necessary to limit such efforts to high risk units where certain adverse health effects are the most likely to occur. Therefore, investigating the distribution of adverse health outcomes in a population (whether random or not) should be an important objective before starting a program for primary or secondary prevention of communicable disease. It is necessary to determine whether there are clusters where adverse health outcomes seem to aggregate. If this is the case, there is a need to identify them by means of simplified scores and to develop specific health strategies targeted at these clusters (8).
Spatial–temporal analysis has increasingly been applied in epidemiological research in recent years (9). Advances in data availability and analytic methods have created new opportunities for investigators to improve on the traditional reporting of disease on a national or regional scale by studying variations in disease occurrence rates at a local (small-area) scale (10). Among the most important exploratory methods for epidemiology and public health are those that identify significant clusters in space and/or time (11–15). Spatial, temporal, and space–time scan statistics are now commonly used to detect and evaluate statistically significant spatial clusters. These analyses can be carried out using the space–time scan statistic (SaTScan™) software (16), which is used widely in an increasing number of applications including epidemiology (8, 13, 17–19) and other research fields and minimizes the problem of multiple statistical tests. SaTScan™ is useful for determining those cluster alarms that merit further investigation and those clusters that are likely to occur by chance. Despite growing applications of spatial methodology, fewer studies have analyzed spatial variation of all-cause and cause-specific mortality, with little or no work on DSS longitudinal data. Analysis in Agincourt HDSS has thus far not utilized a proper spatial (and spatial–temporal) analysis of mortality or other outcomes, even though this and other HDSS sites keep track of the coordinates of all households and update these regularly.
This study will aim to identify clusters of all-cause and cause-specific mortality within the Agincourt subdistrict, which will be important for local and national health departments to minimize morbidity and mortality through timely and spatially directed implementation of prevention and control measures in a resource limited rural area. It will also thus direct future research efforts in terms of identifying the underlying reasons (risk factors) for the observed clustering both in space and time.
The Agincourt health and socio-demographic surveillance site (HDSS), established in 1992, contains a blend of former Mozambican refugees, migrant workers and a more stable permanent population (2). The site is situated in the northeast of South Africa (Fig. 1), covers an area in excess of 400 km² and consists of 21 villages with approximately 11,700 households and a population of 70,000 people at the end of 2007. A full geographic information system (GIS) exists for village boundaries (20) and households within the site and is updated annually. The study population comprised all individuals within the site during the period 1992–2007.
A VA is conducted on every death to determine its probable cause (21). The Agincourt VA tool was first validated in the mid-1990s (22) and again in 2006 with particular reference to HIV/AIDS and tuberculosis (manuscript in preparation). Cause-specific fraction analysis of main or underlying causes of death was limited to 1992–2006 as assessments of VAs for 2007 have not yet been completed.
Data on population size, structure, and deaths were extracted from the Agincourt HDSS using Microsoft SQL Server 2005. Data cleaning was done in Stata 10.0. Precise person-years (PY) at risk by age, gender, year, and village were used as the denominator. Observation dates were used for the calculation of person–time as they are the most reliable. We calculated the mortality rates by village and year by dividing the observed number of deaths by the total person-years contributed in village i (i = 1,…, 21) at year j (j = 1992,…, 2007). To identify villages in which the mortality rate was significantly above average in time, we constructed exact 95% confidence intervals (CI) for each rate using the Poisson distribution of the observed number of events, i.e. deaths (23). Village mortality was considered significantly above average for a given year if the overall rate for the given year was below the lower value (α = 0.025) of the mortality rate CI for that village (24). Temporal trends in rates were analyzed in Stata by using a simple Poisson regression model containing person–time exposure, a constant and temporal (annual) trend term (25).
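The rate and confidence interval calculation described above can be illustrated with a short Python sketch. It is not the Stata code used in the study: the function names, the bisection approach to the exact Poisson limits, and the example counts (30 deaths in 3,000 person-years) are assumptions made purely for illustration.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed in log space for stability."""
    if k < 0:
        return 0.0
    if lam <= 0:
        return 1.0
    logs = [i * math.log(lam) - lam - math.lgamma(i + 1) for i in range(k + 1)]
    peak = max(logs)
    return min(1.0, math.exp(peak) * sum(math.exp(t - peak) for t in logs))

def _root(f, lo, hi, iters=100):
    """Bisection for a function that is negative at lo and positive at hi."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def exact_poisson_ci(deaths, alpha=0.05):
    """Exact two-sided CI for the Poisson mean given an observed count."""
    half = alpha / 2
    # Lower limit: the mean at which P(X >= deaths) equals alpha/2.
    lower = 0.0 if deaths == 0 else _root(
        lambda lam: (1 - poisson_cdf(deaths - 1, lam)) - half, 0.0, float(deaths))
    # Upper limit: the mean at which P(X <= deaths) equals alpha/2.
    upper = _root(
        lambda lam: half - poisson_cdf(deaths, lam),
        float(deaths), deaths + 10 * math.sqrt(deaths + 1) + 20)
    return lower, upper

def rate_with_ci(deaths, person_years, per=1000):
    """Mortality rate and exact 95% CI per `per` person-years."""
    lo, hi = exact_poisson_ci(deaths)
    scale = per / person_years
    return deaths * scale, lo * scale, hi * scale

def above_average(village_deaths, village_py, overall_rate_per_1000):
    """Flag a village-year whose lower CI limit exceeds the overall rate."""
    _, lower, _ = rate_with_ci(village_deaths, village_py)
    return overall_rate_per_1000 < lower

# Hypothetical village-year: 30 deaths in 3,000 person-years.
rate, lo, hi = rate_with_ci(30, 3000)
print(f"{rate:.1f} per 1,000 PY (95% CI: {lo:.2f}-{hi:.2f})")
```

In practice a statistical package would obtain the same limits from chi-square quantiles; bisection is used here only to keep the sketch free of external dependencies.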
In this study, the Kulldorff spatial scan statistic (26) was used to identify space-only clusters of high mortality, by age group, in the Agincourt HDSS for the entire aggregated period (1992–2007). A circular window is imposed on a map by the statistic and the center of the circle moves across the study region. This window is centered on each of the possible grid points (village centroids) positioned throughout the study region; the radius of the circle changes continuously between zero and a specified upper limit and is thus flexible both in location and size. Each of these circles can contain a different set and number of neighboring villages, and each of the circles is a potential cluster of age-specific deaths in the Agincourt study area. A village is captured in the cluster if it lies within the circle. The spatial scan statistic calculates the likelihood of observing the number of deaths inside and outside each circle, and the one with the maximum likelihood is defined as the most likely cluster, i.e. least likely to have occurred by chance (it tests the null hypothesis that the risk of dying is the same in all villages in the study area). Kulldorff et al. (13) also extended the spatial scan statistic into a space–time scan statistic. The window imposed by the statistic on the study area is cylindrical with a circular geographical base and height corresponding to time. The center is again one of several possible village centroids located throughout the Agincourt study area and the height reflects the time interval. The cylindrical window is then moved in space and time. This was also applied to the Agincourt HDSS data for the period 1992–2007 (time aggregation of 1 year) to identify high space–time clusters only. The following age groups were used: <5 years, 5–14, 15–49, 50–64, and 65+. Person–time by age group, gender, and village was used as the denominator.
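The circular scan described above can be sketched in Python as follows. This is a simplified illustration of a Poisson-based spatial scan for high-rate clusters, not the SaTScan™ implementation: the data layout, the function names, the cap of 50% of total person-time on window size (the text specifies only "a specified upper limit") and the four example villages are all assumptions.

```python
import math

def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected deaths."""
    if c <= e:
        return 0.0  # only windows with excess mortality are of interest
    inside = c * math.log(c / e)
    outside = 0.0
    if total > c:
        outside = (total - c) * math.log((total - c) / (total - e))
    return inside + outside

def most_likely_cluster(villages, max_fraction=0.5):
    """villages: list of (x, y, deaths, person_years) per village centroid.

    Returns (llr, member_indices) for the window with the largest LLR.
    """
    total_deaths = sum(v[2] for v in villages)
    total_py = sum(v[3] for v in villages)
    best_llr, best_members = 0.0, []
    for cx, cy, _, _ in villages:
        # Growing the radius is equivalent to adding villages nearest-first.
        order = sorted(range(len(villages)),
                       key=lambda i: math.dist((cx, cy), villages[i][:2]))
        deaths = py = 0.0
        members = []
        for i in order:
            deaths += villages[i][2]
            py += villages[i][3]
            if py > max_fraction * total_py:
                break  # window exceeds the assumed upper size limit
            members.append(i)
            expected = total_deaths * py / total_py
            llr = poisson_llr(deaths, expected, total_deaths)
            if llr > best_llr:
                best_llr, best_members = llr, sorted(members)
    return best_llr, best_members

# Hypothetical data: two neighboring high-mortality villages, two low ones.
example = [(0, 0, 30, 1000), (1, 0, 28, 1000), (5, 0, 10, 1000), (6, 0, 10, 1000)]
llr, members = most_likely_cluster(example)
print(members, round(llr, 2))
```

A space–time version would additionally loop over start and end years for each circle, so that each candidate window becomes a cylinder rather than a circle.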
To ensure sufficient statistical power, the number of Monte Carlo replications was set to 19,999. The p-value of the statistic is obtained through Monte Carlo hypothesis testing. SaTScan™ gives the most likely cluster with a corresponding p-value (significance was set at the 5% level in this study). If other clusters not overlapping with the most likely cluster are identified (secondary, tertiary, etc.), these are also given with their corresponding p-values. Maps showing all significant non-overlapping clusters were constructed in MapInfo Professional 9.5. Larger circles do not represent greater risk clusters but rather contain a larger number of neighboring villages, i.e. extend over a larger geographical area. Village centroids were not displayed to preserve confidentiality in a small geographic area.
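Monte Carlo hypothesis testing of this kind can be sketched as follows. The sketch assumes the usual rank-based definition of the p-value and a null model in which the observed total number of deaths is redistributed across villages in proportion to person-time; the function names and example numbers are hypothetical and do not reproduce SaTScan™ internals.

```python
import random

def monte_carlo_p_value(observed_stat, simulated_stats):
    """Rank of the observed statistic among the null replicates."""
    exceed = sum(1 for s in simulated_stats if s >= observed_stat)
    return (exceed + 1) / (len(simulated_stats) + 1)

def simulate_null_deaths(person_years, total_deaths, rng):
    """One null replicate: each death falls in a village with probability
    proportional to that village's person-time."""
    counts = [0] * len(person_years)
    draws = rng.choices(range(len(person_years)),
                        weights=person_years, k=total_deaths)
    for i in draws:
        counts[i] += 1
    return counts

# Hypothetical use: the scan statistic would be recomputed on every replicate
# and the observed maximum compared with the simulated maxima.
rng = random.Random(2007)
replicate = simulate_null_deaths([1000, 2000, 3000], 60, rng)
print(replicate, sum(replicate))

# With 19,999 replicates the smallest attainable p-value is 1/20,000.
print(monte_carlo_p_value(10.0, [0.0] * 19999))
```

Choosing a replicate count of the form 19,999 makes the denominator a round number, so the attainable p-values are exact multiples of 0.00005.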
During 1992–2007 the highest mortality rates were observed among children and in the 50–64 and 65+ age groups (9, 19, and 46 per 1,000 person-years, respectively) (Table 1). Mortality rates were similar between genders within each age group, except in the 50–64 and 65+ age groups, where males had much higher rates.
A significant increase in the mortality rate over time was observed from 4.7 deaths per 1,000 person years (95% CI: 4.16–5.32) in 1992 to 12.5 deaths per 1,000 person years (95% CI: 11.35–13.82) in 2007. A significant increase in the mortality rate in all villages over time was observed. Overall there were significantly higher mortality rates in one village in the upper central part of the site and two in the south-east (Table 2). Two villages (both in the south-east part of the site) showed significantly higher mortality rates during specific periods, one in 2000–2003 and the other in 2004–2007. Several villages showed excessive increases in mortality when comparing the rate in the first to last period, with all but two (one in the west and the other in the upper central region) situated toward the eastern part of the site (Table 2).
There were significant increasing trends in mortality for the <5, 15–49, and 50–64-year age groups during the period 1992–2007 (Fig. 2). Mortality in the 5–14-year age group remained constant and at a low level. Significant increases in mortality in the age group 50–64 occurred for both genders but were more pronounced among males. The elderly (65+) had the highest mortality rates, higher (and slightly increasing) in males than in females (constant).
Among children (<5) and adults (15–49), I&P-related diseases remain the leading causes of death (560 or 48% of 1,165 and 1,306 or 45% of 2,883, respectively). This is largely due to HIV/TB mortality, which accounted for 23% (273) and 40% (1,141) of child and adult mortality, respectively. Diarrhoea and acute respiratory illness (ARI) feature as prominent causes of death among children (145 deaths or 12% and 94 or 8%, respectively). HIV/TB featured as a prominent cause of death in the 50–64-year age group (244 or 23% of 1,077).
Vascular disease (all circulatory system disease) and cancer (neoplasm) feature as the most prominent non-communicable causes of death, particularly in older age groups where they accounted for 14% (151) and 7% (76) in 50–64 years and 22% (401) and 11% (193) in those 65+. Malnutrition is a prominent cause of death among children (8% or 92 deaths). Vehicle accidents followed by assault are the two leading external causes of death (4% or 300 deaths and 2% or 170 deaths overall). Suicide was highest among children aged 5–14 years and adults 15–49 (3% or 6 deaths and 2% or 63 deaths, respectively).
Toward the south-east corner of the site, a statistically significant (at 5% level) cluster of higher mortality comprising five villages was observed for the period 1992–2007 (observed deaths = 1,831, expected deaths = 1,706, RR = 1.09, p = 0.025).
With the exception of 65+ mortality, all significant clusters of higher mortality were in the south-east corner of the site. There were clusters of higher child (<5) and adult mortality (15–49 years) in one village in the upper central region (Table 3). No significant clusters were identified for <1, 5–14, and 50–64-year age groups.
There were three statistically significant space–time clusters of higher all-cause mortality. The most likely cluster was situated in the south-east corner and comprised six villages for the period 2002–2007 (observed deaths = 1,155, expected deaths = 789, RR = 1.54, p < 0.001) using the space–time scan statistic. A secondary cluster of 7 villages was situated in the upper central to east region of the site during the period 2001–2007 (observed deaths = 1,237, expected deaths = 898, RR = 1.44, p < 0.001); while a tertiary cluster of three villages was situated in the central/west region during 2002–2007 (observed deaths = 1,038, expected deaths = 742, RR = 1.46, p < 0.001).
Spatial–temporal clustering of age-specific all-cause mortality can be seen in Table 4. Significant space–time clusters of higher all-cause mortality were observed among children in six villages in the upper central region (most likely cluster) of the site during 1999–2006 (233 observed cases, 148 expected, RR = 1.70, p < 0.001); and in five villages in the south-east (secondary cluster) during the same period (227 observed cases, 150 expected, RR = 1.62, p < 0.001). During 2001–2007, three significant clusters of high adult mortality (15–49) were observed. The most likely cluster during 2001–2007 was in the south-east corner of the site comprising seven villages (638 observed cases, 385 expected, RR = 1.80, p < 0.001); a secondary cluster of seven villages was in the upper central/east region during the same period (602 observed cases, 402 expected, RR = 1.60, p < 0.001); and a tertiary cluster of three villages was in the west/central region during 2003–2007 (426 observed cases, 278 expected, RR = 1.60, p < 0.001). Significant clusters of higher older adult mortality (50–64) were observed in similar areas during similar periods (Table 4). No significant space–time clusters of all-cause mortality were identified for the 5–14 and 65+ age groups. Graphical depictions of clusters by age group can be seen in Fig. 3.
Demographic surveillance systems provide a viable method for the collection of reliable data on vital events in rural sub-Saharan Africa, especially in the absence of accurate routine mortality statistics. Increasingly, there is renewed interest in the spatial clustering of infectious disease and mortality, especially in poor areas with limited resources. Little proper spatial analysis of longitudinal HDSS data has been done thus far. This study has demonstrated the usefulness of Kulldorff's scan statistic in highlighting high risk areas within the Agincourt sub-district for future targeting of health interventions, as well as focusing more detailed research regarding the underlying risk factors (at individual, household or community level) that may be driving these spatial–temporal all-cause and cause-specific mortality patterns. This study should be regarded as a first step in prioritizing areas for follow-up public health efforts and evaluating their impact (e.g. ARV rollout started in this area in 2007).
Increasing trends in mortality were observed in most age groups (<5, 15–49, and 50–64 years) during the period 1992–2007, largely due to the HIV epidemic. As can be seen in the cause-specific fractions, all I&P mortality (mainly HIV) is the leading cause of death in this population. Thus, mother-to-child HIV transmission prevention in clusters with high child mortality needs to be undertaken, along with other interventions.
Several statistically significant clusters of higher all-cause and cause-specific mortality rates were identified among 21 villages within the Agincourt sub-district both in space and space–time. The south-east and upper central regions of the site were consistently identified as high risk clusters, i.e. a non-random distribution. Former Mozambican refugees (about a third of the Agincourt population) entered South Africa via the Kruger National Park situated along the eastern border of the site and settled in this area. Kahn indicates that they are a vulnerable subgroup, poorer in more isolated villages with less infrastructure and generally further away from health facilities, with poor access to water and sanitation as well as labor markets (27). It also appears that the burden of communicable disease mortality (specifically HIV/TB and diarrhea) is highest in these areas (upper central and south-east for HIV/TB and south-east for diarrhea), all of which contributes to the all-cause space and space–time findings. Thus, settlements comprising former Mozambican refugees as well as selected South African villages appear to have increased risk of I&P-related mortality. Suitable interventions such as ARV treatment and assessment and provision of adequate water and sanitation need to be directed to these villages to overcome existing inequalities. More detailed research to elucidate the exact risk factors and the relative contribution of each needs to be undertaken. The confounding effect of settlement specific socio-economic status (SES) in space and time also needs to be adjusted for in future studies.
From the space–time analysis we observed that most of the significant mortality clusters appeared during the later period (1999–2007) with none in the earlier period (1992–1998) (Table 4). Significant increases in mortality rates particularly in <5, 15–49, and 50–64-year age groups were observed (6). Hence this temporal increase in clustering – a newly described phenomenon – is also linked to the increase in mortality over the time period.
A significant space cluster of older adult (65+) mortality (as well as a space–time cluster of higher mortality among 50–64 year olds during 2002–2006) was observed toward the west of the site in the later period. The study by Tollman et al. (6) also found a significant increase in the mortality rate from non-communicable diseases in adults 30+ from 1992–1994 to 2002–2005 (RR = 1.22, p = 0.026). As noted, most of the self-settled Mozambican settlements are to the east of the site with more of the South African settlements to the west. According to Hargreaves et al. (28), Mozambican households generally have a lower standard of living than South African households and were three times more likely to fall in the poorest quintile than South African households. Thus the South Africans' higher standard of living may be contributing to a relatively higher spatial risk of non-communicable disease. This needs to be investigated in more detailed future studies.
The increasing use of linked social-spatial and health-spatial data raises significant concerns regarding the confidentiality of research participants and the stigmatization that may arise if sensitive information were released. This is especially true in a small geographic area such as the Agincourt HDSS. Rural areas present an additional problem in that settlements are fewer, more dispersed and thus more distinct than in urban areas. Hence higher levels of buffering are required to ensure confidentiality and limit disclosure risk (29). Presenting information cartographically is a useful tool for ascertaining complex spatial patterns visually, yet disclosure risks are associated with this form of presentation (29). Increased layers (e.g. borders, roads, etc.) displayable on a map add to the security threat. In this study we removed all geographically identifying features (administrative and village boundaries, roads) from the subset of all-cause mortality maps that were developed. For other significant mortality clusters, tables describing their relative location within the site were used instead, to further protect those villages with high HIV burden.
Exploratory analysis of spatial data aims to describe spatial patterns using inferential statistics (for example, whether the occurrence of mortality is random or not), and to develop hypotheses. However, it does not answer the question as to what may be influencing the spatial patterns, while spatial modeling (incorporating spatial dependency) is better suited to predict mortality rates (e.g. at unsampled locations). A study by Sankoh et al. (30) demonstrated that mapping of mortality rates using Bayesian smoothing techniques is a useful graphical supplement to spatial analytical methods as it addresses the issue of heterogeneity in the population at risk. Future research will thus use Bayesian kriging, as suggested by Gelfand et al. (31), to produce smooth maps of mortality risk. As mentioned earlier, underlying risk factors (both quantified and unquantified) drive the spatial (and temporal) risk clustering observed in this study. Common exposures may influence mortality similarly in households of the same geographical area, introducing spatial correlation in mortality outcomes. Longitudinal data are also expected to be correlated in time. Standard statistical methods assume independence of outcome measures (e.g. mortality events) and overlook correlation biases. Recent developments recommend Bayesian techniques as the appropriate methodology for taking account of this spatial and temporal dependence. Future risk factor studies in the Agincourt subdistrict will employ Bayesian geostatistical models to correctly quantify risk factors for mortality by age group.
This study underscores the need for an exploratory approach to assess geographic and temporal patterns (both historical and emerging) in all-cause mortality within a relatively small geographic area such as the Agincourt sub-district. It highlights villages requiring more targeted health interventions, raising detailed questions regarding cause-specific and spatial–temporal changes as well as the risk factors that may drive the observed all-cause mortality patterns.
This work was supported by a grant from the INDEPTH Network as well as by a PhD fellowship from the South African Centre for Epidemiological Modelling and Analysis (SACEMA). Additional funding was provided by the MRC/Wits Rural Public Health and Health Transitions Research Unit (Agincourt) through the Wellcome Trust, UK [Grant No. 069683/Z/08/Z] and the Swiss South African Joint Research Programme. SaTScan™ is a trademark of Martin Kulldorff. The SaTScan™ software was developed under the joint auspices of (a) Martin Kulldorff, (b) the National Cancer Institute, and (c) Farzad Mostashari of the New York City Department of Health and Mental Hygiene.
The authors have not received any funding or benefits from industry to conduct this study.