The Atlas of Cancer Mortality in the United States, 1950–94 (Devesa et al.) published in 1999 by the National Institutes of Health suggests that there are elevated rates of brain and other nervous system cancer in the northwestern, north central, and southeastern parts of the country. Being descriptive in nature, the atlas does not evaluate whether observed patterns are simply due to random variation or if they are reflective of true geographical differences in disease risk or treatment practices. To formally test for geographical clustering of disease, we analyzed U.S. brain cancer mortality data from 1986 to 1995 with Tango’s Excess Events test, the Cuzick-Edwards k-Nearest-Neighbors test, and the spatial scan statistic. All tests revealed statistically significant geographical clustering for both adult men and women. The spatial scan statistic indicated that the most likely cluster of high mortality was in parts of Arkansas, Mississippi, and Oklahoma (relative risk [RR] = 1.22, P < 0.0001) for women and in parts of Tennessee and Kentucky (RR = 1.15, P < 0.0001) for men. Several secondary clusters were detected, but there were no statistically significant clusters of a very localized nature and a high RR. For childhood brain cancer, there were no statistically significant geographical clusters. It is reassuring that no local brain cancer mortality “hot spots” with very high RRs were found. While the causes of the large geographical clusters with modest RRs are unclear, the geographical pattern of brain cancer mortality provides valuable information that can help in formulating etiological hypotheses and in targeting high-risk populations for further epidemiological and health services research.
In the United States, the brain is among the top 10 sites for adult cancer mortality, accounting for more than 2% of all adult cancer deaths (Legler et al., 1999). Incidence is slightly higher among men than women and has increased over the last 20 years (Jukich et al., 2001; Surawicz et al., 1999). For U.S. children, brain cancer is the second most common cancer (Wu and Huber, 1994). While its cause is not known, it is generally accepted that brain cancer may be due to an alteration in the person’s genetic structure which could be inherited or caused by environmental factors (Bondy et al., 1994; Thomas and Inskip, 1996). As yet, very little is known about the causes of tumors of the brain (Harrington et al., 1997; Inskip et al., 1995; Kuijten and Bunin, 1993; Preston-Martin et al., 1996).
Descriptions of geographic variation of disease may provide important clues about etiology (Lawson and Kulldorff, 1999). For example, elevated lung cancer mortality rates across coastal areas in Georgia, Virginia, northeastern Florida, and Louisiana were linked to shipyard workers’ exposure to asbestos during World War II (Devesa et al., 1999). Moreover, targeting epidemiological case-control and cohort studies to geographical areas with high incidence raises the potential for discovering etiological risk factors not commonly present in the population at large. Because of the short survival of brain cancer patients, the geographical patterns of mortality may also be a good reflection of the geographical distribution of incidence, for which no national data are available.
Early studies in the United States and Sweden have revealed little geographic variation in brain cancer rates (Hjalmars et al., 1999; Kurtzke, 1969; Kurtzke and Stazio, 1967). However, the recently published Atlas of Cancer Mortality in the United States noted higher rates for brain and other nervous system cancers (disease categories 191–192 [WHO, 1975]) for white males and females in the northwestern, north central, and southeastern areas, with low rates in the Southwest and the Northeast (Devesa et al., 1999). With a rare disease like brain cancer and the inhomogeneous U.S. population density, one could expect mortality maps to include many unstable rates. As it is descriptive by nature, the atlas does not evaluate whether these geographic differences in rates reflect random variation or statistically significant clusters that, in turn, reflect differences in the occurrence and/or course of this disease.
In this paper, we use 3 statistical methods to evaluate the geographical variation in U.S. brain cancer mortality rates for the years 1986 to 1995: Tango’s Excess Events test (EET)³ (Tango, 1995), the Cuzick-Edwards k-Nearest-Neighbors (k-NN) test (Cuzick and Edwards, 1990), and the spatial scan statistic (Kulldorff, 1997). The first 2 methods are global clustering tests that evaluate the overall presence of disease clustering across the map, without pinpointing specific local clusters (Besag and Newell, 1991; Kulldorff et al., 2003). The third is a cluster detection test designed to detect local clusters and evaluate their statistical significance, adjusting for the multiple testing inherent in the many cluster locations and sizes considered. It is capable of finding very localized “hot-spot” clusters with high relative risks (RRs) as well as larger areas with elevated but modest RRs.
Brain cancer mortality data for the 48 contiguous U.S. states were obtained from the National Center for Health Statistics. All deaths during the years 1986 to 1995 for which primary brain cancer (disease categories 191.0–191.9 [HHS, 1980]) was listed as the underlying cause were identified, and data on age, gender, race/ethnicity, and county of residence at time of death were extracted. A total of 111,859 deaths were recorded, and of these, 5,149 persons were under the age of 20. Cancers in other parts of the nervous system, such as the cranial nerve, cerebral meninges, spinal cord, and spinal meninges, were excluded from the analyses since these may have a different etiology.
For every year from 1986 to 1995, population estimates for each of the 3111 counties in the 48 contiguous states were obtained from the U.S. Census Bureau (U.S. Bureau of the Census, 1996). These are official estimates based on the 1980 and 1990 censuses, among other things. Geographic coordinates for each county were also obtained from the Census Bureau (U.S. Bureau of the Census, 1996), approximately reflecting the geographical centroid of each county. Both the mortality and population numbers were available by gender, ethnicity (white, black, other), and age (ages 1–4, 5–9, 10–14, 15–19, 20–24, 25–34, 35–44, 45–54, 55–64, 65–74, 75–84, and 85+). Analyses were carried out separately for children (0–19 years of age), all adults (20 years and older), adult men, and adult women.
We calculated the expected number of brain cancer deaths for each county by indirect standardization (Last, 2001), adjusting for age, gender, and race/ethnicity, using the respective internal standard of the age-, gender- and race/ethnic-specific population in the United States for 1986 to 1995. We did this by (i) calculating the mortality rate in each age by gender by ethnicity combination using data from all of the 48 states, (ii) multiplying this rate by the age-, gender-, and ethnicity-specific population of each county to get the expected number of cases in each subgroup, and (iii) adding these expected numbers over all age, gender, and ethnicity combinations to get the total number of expected deaths in each county. For each county, the standardized mortality rate, defined as the observed divided by the expected number of cases, was calculated and mapped. Because we used the same observed and expected numbers for the standardized mortality rates and the statistical tests, the descriptive maps are directly comparable to the statistical analysis results.
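The three steps of indirect standardization described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the record layout, the function names `expected_deaths` and `smr`, and the toy numbers are all assumptions made for the example.

```python
from collections import defaultdict

def expected_deaths(records):
    """records: list of (county, stratum, deaths, population) tuples, where
    stratum stands for one age-by-gender-by-ethnicity combination."""
    deaths_by_stratum = defaultdict(float)
    pop_by_stratum = defaultdict(float)
    for county, stratum, deaths, pop in records:
        deaths_by_stratum[stratum] += deaths
        pop_by_stratum[stratum] += pop
    # (i) stratum-specific mortality rate over all counties combined
    rate = {s: deaths_by_stratum[s] / pop_by_stratum[s]
            for s in pop_by_stratum if pop_by_stratum[s] > 0}
    observed = defaultdict(float)
    expected = defaultdict(float)
    for county, stratum, deaths, pop in records:
        observed[county] += deaths
        # (ii) rate times county population, (iii) summed over strata
        expected[county] += rate.get(stratum, 0.0) * pop
    return dict(observed), dict(expected)

def smr(observed, expected):
    """Observed divided by expected deaths for each county."""
    return {c: observed[c] / expected[c] for c in expected if expected[c] > 0}

# Hypothetical toy data: two counties, two strata.
toy_records = [
    ("A", "young", 2, 1000), ("A", "old", 8, 1000),
    ("B", "young", 0, 3000), ("B", "old", 10, 1000),
]
obs, exp = expected_deaths(toy_records)
print(exp, smr(obs, exp))
```

Because the reference rates are computed from the same data (an internal standard), the expected counts sum to the total number of observed deaths.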
The presence of global clustering of brain cancer mortality in the United States was evaluated by 2 different methods. Tango’s EET evaluates whether counties with an excess number of brain cancer deaths are close to other counties with an excess number of deaths. With a parameter lambda (λ), Tango’s EET uses a weighting mechanism so that closer counties are weighted more than those further away, and with a smaller λ there is more weight on the closest counties (Tango, 1995). The test statistic is
where $o_i$ and $e_i$ are the observed and expected number of deaths, respectively, in county $i$, $d_{ij}$ is the distance in kilometers between the centroids of counties $i$ and $j$, and $\lambda$ is a clustering scale parameter chosen by the user.
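A statistic of this kind can be written as a weighted sum over pairs of counties, $\sum_i \sum_j a_{ij}(\lambda)\,(o_i - e_i)(o_j - e_j)$, where the weight $a_{ij}(\lambda)$ decreases with the distance $d_{ij}$. The sketch below assumes the kernel $a_{ij} = \exp(-4 d_{ij}^2 / \lambda^2)$; that kernel, the function name `tango_eet`, and the use of planar coordinates in kilometers are assumptions of this example, since the text specifies only that closer counties receive more weight.

```python
import math

def tango_eet(observed, expected, coords, lam):
    """Weighted sum of products of county excesses (o_i - e_i)(o_j - e_j).

    coords: planar (x, y) centroids in km (an assumption of this sketch).
    lam:    clustering scale parameter; smaller values concentrate the
            weight on the closest counties.
    """
    n = len(observed)
    excess = [o - e for o, e in zip(observed, expected)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            d = math.dist(coords[i], coords[j])
            # ASSUMED kernel: the weight decays with squared distance.
            weight = math.exp(-4.0 * d * d / (lam * lam))
            total += weight * excess[i] * excess[j]
    return total

# Two distant counties: only the diagonal terms contribute, 2**2 + (-2)**2.
print(tango_eet([12, 8], [10, 10], [(0, 0), (1000, 0)], lam=1.0))
```

Neighboring counties that both show an excess add positive cross terms, so the statistic grows when excesses are spatially close to one another.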
The Cuzick-Edwards k-NN test evaluates whether a county with individuals dying from brain cancer has nearest-neighbor counties with more individuals dying from brain cancer than would be expected by chance (Cuzick and Edwards, 1990). For each death in turn, the number of other deaths is noted among its nearest neighboring counties, and all these other deaths are then summed (many deaths are of course counted multiple times). The set of nearest neighbors is specified through a parameter k. Normally k is defined in terms of the raw population numbers, but we defined it in terms of the expected number of deaths under the null hypothesis. Ties are dealt with in accordance with the method described by Cuzick and Edwards (1990).
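One way to picture the county-level version of this statistic is the simplified sketch below. It is an assumption-laden illustration, not the authors' implementation: the function name `knn_statistic` is hypothetical, each neighborhood is grown county by county until the cumulative expected count reaches k, and the boundary county is included whole instead of applying the tie-handling rule of Cuzick and Edwards (1990).

```python
import math

def knn_statistic(observed, expected, coords, k):
    """Sum, over all deaths, of the other deaths in each death's neighborhood."""
    if k <= 0:
        raise ValueError("k must be positive")
    n = len(observed)
    total = 0.0
    for i in range(n):
        # Counties ordered by centroid distance from county i (itself first).
        order = sorted(range(n), key=lambda j: math.dist(coords[i], coords[j]))
        cum_expected = 0.0
        neighborhood_deaths = 0.0
        for j in order:
            if cum_expected >= k:
                break
            cum_expected += expected[j]
            neighborhood_deaths += observed[j]
        # Each death in county i counts every other death in its neighborhood.
        total += observed[i] * (neighborhood_deaths - 1)
    return total

obs_toy, exp_toy = [3, 1], [2, 2]
xy_toy = [(0, 0), (10, 0)]
print(knn_statistic(obs_toy, exp_toy, xy_toy, k=2))
print(knn_statistic(obs_toy, exp_toy, xy_toy, k=4))
```

Larger values of k widen every neighborhood, which is why results are usually reported over a range of k.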
To detect the specific locations of either high-rate or low-rate clusters with a minimum of assumptions about cluster size, and to evaluate their statistical significance, we also employed the spatial scan statistic (Kulldorff, 1997). This method imposes an infinite number of circles on the map at different locations, with each circle centered on one of the county coordinates and with a radius varying continuously from zero up to the point at which 50% of the population at risk is included in the circle. Each circle is a potential cluster that may consist of a single county or a large number of neighboring counties, and the circle with the maximum likelihood is the most likely cluster, that is, the cluster that is least likely to have occurred by chance. The method adjusts for the multiple testing inherent in the many potential cluster locations and sizes considered, thereby avoiding the common problem of deflated P-values due to preselection bias that has plagued cancer cluster studies in the past. This means that if the null hypothesis of constant risk everywhere is true, then the probability of having one or more false alarms on the map is at the nominal α = 0.05 level. Secondary clusters are also evaluated. We report only those secondary clusters that do not contain any counties in common with a more likely cluster that has a lower P-value.
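The circular scan can be sketched as below, assuming the Poisson form of the likelihood ratio and scanning for high-rate clusters only (the analysis described here also considers low rates). The function names, the toy inputs, and the requirement that expected counts are positive and sum to the observed total are assumptions of this illustration; it is not the SaTScan implementation.

```python
import math

def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected
    deaths out of `total`; zero unless the window shows an excess."""
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    outside = (total - c) * math.log((total - c) / (total - e)) if c < total else 0.0
    return inside + outside

def most_likely_cluster(observed, expected, population, coords, max_frac=0.5):
    """Return (llr, member counties) for the circle with the largest ratio."""
    n = len(observed)
    total = sum(observed)
    max_pop = max_frac * sum(population)
    best = (0.0, [])
    for i in range(n):  # each county centroid is a candidate circle center
        order = sorted(range(n), key=lambda j: math.dist(coords[i], coords[j]))
        c = e = pop = 0.0
        members = []
        for j in order:  # grow the circle one county at a time
            pop += population[j]
            if pop > max_pop:
                break
            c += observed[j]
            e += expected[j]
            members.append(j)
            llr = poisson_llr(c, e, total)
            if llr > best[0]:
                best = (llr, list(members))
    return best

# Hypothetical toy map: four counties on a line, excess in the first two.
llr, members = most_likely_cluster(
    observed=[20, 18, 1, 1], expected=[10, 10, 10, 10],
    population=[100, 100, 100, 100],
    coords=[(0, 0), (1, 0), (10, 0), (11, 0)])
print(sorted(members), llr)
```

Because only county centroids matter, growing a circle continuously is equivalent to adding counties in order of distance from the center, which is what the inner loop does.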
For all 3 methods, P-values were obtained by using Monte Carlo hypothesis testing, comparing the test statistic from the real data set to the test statistics from 9999 random data sets generated under the null hypothesis of no clustering. Analyses were performed by using the SaTScan software (Kulldorff et al., 1998) for the spatial scan statistic and with specially written C++ code for the other 2 methods.
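Monte Carlo hypothesis testing of this kind can be sketched as follows. The sketch assumes that each random data set redistributes the total number of deaths among counties with probabilities proportional to the expected counts; that null model, the function name `monte_carlo_p_value`, and the `statistic` callable are assumptions of this example.

```python
import random

def monte_carlo_p_value(statistic, observed, expected, n_sim=9999, seed=0):
    """Rank the real test statistic among statistics from simulated data.

    statistic: hypothetical callable mapping a list of county counts to a
               number, with larger values indicating stronger clustering.
    """
    rng = random.Random(seed)
    total = int(sum(observed))
    counties = list(range(len(expected)))
    t_obs = statistic(observed)
    at_least_as_large = 0
    for _ in range(n_sim):
        simulated = [0] * len(expected)
        # Assign each death to a county in proportion to its expected count.
        for county in rng.choices(counties, weights=expected, k=total):
            simulated[county] += 1
        if statistic(simulated) >= t_obs:
            at_least_as_large += 1
    # The real data set counts as one of the n_sim + 1 data sets.
    return (at_least_as_large + 1) / (n_sim + 1)

# Toy check with the largest county count as the statistic.
print(monte_carlo_p_value(max, [30, 0, 0], [10, 10, 10], n_sim=999))
```

With 9999 random data sets, the smallest P-value this procedure can return is 1/10,000 = 0.0001.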
The U.S. annual mortality rates for brain cancer are shown in Table 1. There were 5.6 annual deaths per 100,000 adults and 0.80 annual deaths per 100,000 children. Within specific gender and ethnic categories, rates among adults varied from 1.4 to 7.5, while variation among children was from 0.48 to 0.88. The geographic variation of county-level mortality rates adjusted for age, gender, and race/ethnicity is shown in Fig. 1.
According to both global clustering tests, there is statistically significant clustering of brain cancer among adult women (P < 0.0001), adult men (P < 0.0001), and all adults combined (P < 0.0001) (Table 2). The results are consistent over a wide range of different parameter values. For children, the results are less clear. Statistically significant clustering was noted when the scale parameter was small, but not for larger values.
For women, the spatial scan statistic detected 8 statistically significant clusters, 4 with high rates and 4 with low rates (Table 3, Fig. 2). The most likely cluster was found around Arkansas, Mississippi, and Oklahoma, where 2830 cases were observed, whereas 2328 were expected (RR = 1.22, P < 0.0001). Other locations with high mortality were in North and South Carolina (RR = 1.17, P < 0.0001); in the states of Oklahoma, Texas, and Kansas (RR = 1.14, P = 0.003); and in Minnesota, North Dakota, South Dakota, and Nebraska (RR = 1.10, P = 0.01). The most likely area of low brain cancer mortality for women was found in New York City and northern New Jersey (RR = 0.79, P < 0.0001), with secondary clusters in southern Texas (RR = 0.59; P < 0.0001) and the Southwest/Mountain states of New Mexico, Arizona, and Colorado (RR = 0.81, P < 0.0001).
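As a quick check of the arithmetic, and assuming that the relative risk reported for a cluster is the ratio of observed ($O$) to expected ($E$) deaths inside it, the figures for the most likely cluster among women give:

```latex
\mathrm{RR} = \frac{O}{E} = \frac{2830}{2328} \approx 1.216 \approx 1.22
```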
For men, there were 5 statistically significant clusters with elevated risk of brain cancer mortality and 5 statistically significant clusters with lower than expected mortality (Table 4, Fig. 2). The most likely cluster of elevated brain cancer mortality was centered in the states of Kentucky and Tennessee (RR = 1.15, P < 0.0001). Secondary clusters with significantly high rates were found in North and South Carolina (RR = 1.16; P < 0.0001); the southern Midwest states, including parts of Arkansas, Mississippi, and Oklahoma (RR = 1.19; P = 0.001); the Pacific Northwest, including Oregon and Washington State (RR = 1.14; P = 0.003); and the Great Lakes area, including Michigan (RR = 1.17, P = 0.005). The most likely low-mortality area was in northern New Jersey and New York City (RR = 0.80, P < 0.0001). Other clusters with low mortality were found in southern Texas (RR = 0.60, P < 0.0001); the Southwest (RR = 0.84; P < 0.0001); the New York and Pennsylvania area (RR = 0.87, P < 0.0001); and the Virginia and West Virginia region (RR = 0.75, P = 0.008). Overall, the spatial scan statistic yielded similar findings for men, women, and all adults (Fig. 2, Tables 3–5).
For children, no statistically significant clusters were detected by the spatial scan statistic (Table 6, Fig. 2). The most likely cluster of childhood brain cancer mortality was on the border of North and South Carolina, where the observed number of deaths was 86 compared to 51 expected (RR = 1.67, P = 0.24).
The causes of brain cancer are largely unknown, and while researchers have identified several genetic abnormalities that relate to specific malignant disease (Legler et al., 1999), only about 5% of primary brain cancers are known to be associated with hereditary factors (Bondy et al., 1994). Families with excess disease may have a genetic predisposition, or they may be similarly affected by exposure to common environmental/behavioral factors.
The only established environmental risk is high-dose ionizing radiation (Ron et al., 1988). Exposure to N-nitroso compounds, phenols, pesticides, polycyclic aromatic hydrocarbons, or organic solvents may increase the risk of brain cancer (Inskip et al., 1995; Thomas and Inskip, 1996), and some researchers have associated childhood brain tumors with agriculture-related exposures (Wilkins and Koutras, 1988; Wilkins and Sinks, 1990). Brain cancers are more frequent among workers in industries such as oil refining, rubber manufacturing, and drug manufacturing (Kuijten and Bunin, 1993; Marsh et al., 1991), as well as among farmers (Brownson et al., 1990; Musicco et al., 1988).
Consistent with previous findings from a Swedish incidence study (Hjalmars et al., 1999), our analysis did not reveal any strong evidence of geographical clustering of childhood brain cancer. Regarding adults, several areas of the country had significantly elevated rates, and several others were significantly low. All detected areas are fairly large with modest deviation from the nationwide rate. Both the high-rate and low-rate mortality clusters were for the most part in similar locations for men and women, which suggests common risk factors. Even though the exact borders often differ, the borders generated by the spatial scan statistic are approximate in nature, and it is more appropriate to look at the general regions with high and low rates. However, despite statistically significant variation in brain cancer mortality across the United States, there are no local hot spots with extreme rates. This is reassuring given that the spatial scan statistic has high power to detect such clusters (Kulldorff et al., 2003). On the other hand, our use of aggregate county-level data may hide hot-spot clusters covering only a small portion of a county. With an average of 15.6, 18.7, and 1.7 deaths per county for women, men, and children, respectively, this is especially a problem for large counties and the adult analyses.
The geographical variation observed could be related to the geographic distribution of genetic, occupational, environmental, and/or lifestyle risk factors. The variation could also be due to geographic differences in diagnosis and/or treatment. It would be interesting to compare the results with geographical analysis of brain cancer incidence, but such data is unfortunately not available for the country as a whole. White Americans have higher brain cancer mortality rates than African Americans (Ries et al., 1999), but that cannot explain the observed clusters because we adjusted for white, black, and “other” ethnicity. It was not possible to adjust for Hispanic ethnicity, though, and the high proportion of Hispanic population in Arizona, New Mexico, southern Texas, and New York City is one potential explanation for the low rates in those areas.
Alternative explanations for the findings can be offered. Brain cancer clusters could be artifacts due to variation in the recording of underlying causes of death, as notation in death certificates is not completely reliable and may exhibit geographic variation of its own (Gittelsohn and Senning, 1979). Factors affecting whether brain cancer is reported on death certificates, and if so, included as the underlying or a contributing cause, have not been fully investigated but may include misclassification between brain cancer and stroke as well as between primary and metastatic brain cancers. In this study, we tallied all brain cancer decedents from each county only on the basis of the underlying cause of death. Kurtzke and Stazio (1967) found that brain cancer rates in the United States correlate well with the distribution of physicians. It is unclear if the distribution of medical facilities plays a role in the clusters reported here. Nonetheless, if true, this relationship is important to note and monitor in subsequent surveillance efforts.
At this time it is impossible to associate the observed geographic pattern of brain cancer with any specific risk factor. Nevertheless, the geographical patterns of brain cancer provide a valuable source of information that can help in formulating etiological hypotheses. This information may also be used to select high-risk populations when designing future epidemiological case-control and cohort studies, recruiting patients in, for example, the Midwest and the Carolinas. It could also be used for studying the health services aspects of the disease, for example, by using cancer registries to compare brain cancer survival in low- versus high-mortality rate areas.
The authors thank Alison Hemhauser for help preparing the maps. Comments from four anonymous reviewers greatly improved the paper.
¹ This research was funded in part by grant #R01-CA95979-0 from the National Cancer Institute.
³ Abbreviations used are as follows: EET, Excess Events test; k-NN, k-Nearest-Neighbors [test]; RR, relative risk.