Firearm violence is the end result of a causative web of individual-level and geographic risk factors. Few, if any, studies of firearm violence have been able to simultaneously determine the population-based relative risks that individuals experience as a result of what they were doing at a specific point in time and where they were, geographically, at a specific point in time. This paper describes the linkage of individual and geographic data that was undertaken as part of a population-based case-control study of firearm violence in Philadelphia. New methods and applications of these linked data relevant to researchers and policymakers interested in firearm violence are also discussed.
For firearm violence to occur, certain situational inducements are needed. Past work has shown that these inducements are partly the by-products of an area’s geographic landscape and routine activities and that they can influence firearm violence independent of individual-level factors (Birkbeck & LaFree, 1993; Branas, Nance, Elliott, Richmond, & Schwab, 2004; Cohen & Felson, 1979; Eck & Weisburd, 1995; Felson, 1983). This past work also mirrors longstanding epidemiologic theory that views both victims and their environments as two major components in the creation of disease, in this case firearm injury (Branas, 2008).
Firearm violence is thus the end result of a causative web of risk factors (Romelsjo, 1995) generated both by individuals themselves and the geography within which individuals find themselves (Branas et al., 2004). Individuals may experience an increased probability of falling victim to firearm violence simply by being in an area where geographic risk factors are present regardless of whether they themselves are in a high-risk category or are engaging in risky behaviors. Alternatively, individuals engaging in risky behaviors and who are also in risky areas may experience an increased probability of firearm violence far beyond that generated by behavior or geography alone.
Past studies of firearm violence have been largely designed to determine either the risks generated by individual factors or the risks generated by geographic factors. Some analyses of firearm violence at the individual level have included comparison participants to estimate relative risks (Kellermann et al., 1993; Nielsen, Martinez, & Rosenfeld, 2005; Wiebe, 2003). Other, geographic analyses of firearm violence have studied small area risks such as those between neighborhoods (Branas et al., 2004; Shenassa, Daskalakis, & Buka, 2006; Tardiff et al., 1995; Wei, Hipwell, Pardini, Beyers, & Loeber, 2005). Few, if any, studies of firearm violence have been able to simultaneously determine the population-based, relative risks that an individual experiences as a result of what they were doing and where they were, geographically, at a specific point in time. The purpose of this paper is to describe the linkage of individual and geographic data that was undertaken as part of a case-control study of firearm violence in Philadelphia. New methods and applications of these linked data relevant to researchers focusing on violence, as well as other conditions, are also described.
The current study is a novel use of existing data sources and telephone interviews to conduct a population-based case-control study of risk factors for gunshot injury. The study sample of case and control participants was accrued on a continuous basis (incidence density sampling) over a 2-year data collection period in Philadelphia, Pennsylvania. Participants consisted of cases, who were fatally and nonfatally injured victims of assaultive gun violence (including homicide) or self-inflicted gun violence (including suicide), and matched controls who were concurrently recruited from the general population. Numerous exposures, which include both individual-level characteristics (i.e., characteristics of persons) and characteristics of the geographic environment, were studied.
The conceptual framework behind our study separates predictors and confounders of the likelihood of being violently injured with a firearm into both individual and geographic variables. Ecologic study designs have been extensively used by social scientists and epidemiologists in many areas of research. However, these studies are often difficult to properly interpret and have been called “incomplete” due to problems of causal inference (Morgenstern, 1998). What limits ecologic studies for testing causal hypotheses is that the unit of analysis is a group, often defined geographically. Thus, we know the number of exposed persons and the number of cases within each group but do not know the number of exposed cases. Without knowing the joint distribution of exposure and outcome within each group, we do not know whether the outcome was more common among one exposure group than the other. Aggregation bias, then, is a primary limitation of ecologic studies for making causal inferences in that effect estimates may fail to reflect the nature of the individual-level effect that is being studied.
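To make the aggregation problem concrete, the following minimal Python sketch computes the range of exposed-case counts that is consistent with a group's marginal totals alone. The function name and the counts are hypothetical illustrations, not study data.

```python
def exposed_case_bounds(n_total, n_exposed, n_cases):
    """Return the (min, max) number of exposed cases that the
    group-level margins allow. Ecologic data give only the margins,
    so any value in this range is possible."""
    lowest = max(0, n_exposed + n_cases - n_total)
    highest = min(n_exposed, n_cases)
    return lowest, highest


# Hypothetical area: 1,000 residents, 300 exposed, 50 cases.
bounds = exposed_case_bounds(n_total=1000, n_exposed=300, n_cases=50)
print(bounds)  # (0, 50)
```

In this made-up area, the margins are compatible with none of the cases being exposed and with all of them being exposed, which is why the joint distribution has to be observed at the individual level.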
A need has arisen for analyses using quantitative study designs other than the traditional ecologic study framework to jointly study the individual and geographic precursors of firearm violence. Randomized controlled trials of individual and geographic predictors of firearm injury are accompanied by compelling ethical problems (Robertson, 1998) and thus are often not feasible. Because firearm injury occurs with sufficient rarity, cohort studies are also often impractical (Rothman & Greenland, 1998). To perform a cohort study prospectively requires not only adequately sized exposed and unexposed samples, but also a follow-up period that is long enough to accrue sufficient numbers of individuals with the outcome of interest. Although a retrospective cohort study could obviate having to wait as a follow-up period proceeds, it would pose equally challenging limitations due to potentially heavy recall bias associated with collecting historical data.
Given that it begins by identifying a group of case participants and a group of control participants, and then works backward to identify the exposure status of each participant, the case-control design was an alternative that we turned to for this study comparing individual and geographic predictors of firearm violence (Rothman & Greenland, 1998). We describe below details of how we linked individual and geographic data in applying the case-control design to study firearm violence.
The study site was the City of Philadelphia (synonymous with Philadelphia County), which is bordered and bisected by two rivers and has a land area of 135 square miles. The current population of Philadelphia, the fifth largest city in the United States by population, is roughly 1.5 million. Approximately 43% of Philadelphia residents are Black and approximately 45% are White (U.S. Census Bureau, 2005). The median household income in Philadelphia is lower than that of Pennsylvania and the United States and approximately 25% of Philadelphia residents live below the poverty threshold. Philadelphia is also a city of about 70 distinct neighborhoods that geographically break down into 381 census tracts, 1,816 block groups, and 17,315 blocks.
These demographic statistics are comparable to those of most large cities in the United States. Gun-related violent crime rates, homicide rates, and suicide rates in many other large US cities are likewise similar to those in Philadelphia (Boyer & Mucha, 2006; Branas et al., 2004). Philadelphia was an excellent site for our work because its firearm violence problem is similar to, and thus generalizable to, that of other US cities, yet substantial enough to support an adequately sized study. Furthermore, the city’s public safety and medical communities were acutely aware of their firearm violence problem and thus receptive to any partnerships that might help reduce it (Fitzgerald, 2005; Gorenstein, Boyer, & Ciotta, 2005).
Incident cases of gunshot injury were identified as they occurred, from October 2003 to April 2006. The final 6 months of this period was limited to only fatal cases. Collection of nongunshot injury cases was not pursued because it was seen as a considerably more challenging data endeavor given that shootings were much better defined and monitored by the police and medical systems in Philadelphia. A complex process involving several data sources was implemented to both detect and collect comprehensive information on all eligible shooting victims. The complexity of this process was necessitated by the multifaceted nature of firearm injury: various public and private institutions held an interest in the documentation of shootings and some database elements had been balkanized between certain institutions. These institutions infrequently communicated with one another and no centralized shootings database existed.
Case participants were defined as assault-related or self-inflicted injury events caused by powder charge firearms. Unintentional gunshot injuries and those of undetermined intent, both of which occur relatively infrequently at the local level (Van Tuinen & Crosby, 1998), were excluded, as were gunshot wounds from BB or pellet guns. In addition, only adult cases 21 years of age and older who were shot in Philadelphia and also lived in Philadelphia at the time of their shooting were included. Our age restriction was based on the NIH definition of a child (National Institutes of Health, 1998) and the fact that individuals less than 21 years old were legally restricted from various activities such as purchasing firearms and consuming alcohol in Philadelphia. We also restricted our case sample to current Philadelphia residents to study a defined source population. Case identification and eligibility criteria were then applied to new and existing public safety and medical data systems as they became available for audit.
Although all incident cases were initially identified, a sampling scheme was implemented that electronically assigned a random number to assaultive shootings as they occurred to select a representative one-third of these case participants. This procedure was implemented because there were more assaultive shooting cases than the study needed to maintain sufficient statistical power and more than the study could afford in terms of the costs required to identify matching controls for all assaultive shooting cases. However, as notable exceptions to this random selection procedure, all female assaultive shooting cases and all self-inflicted shootings were retained for separate study.
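The selection rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the intent and gender labels, and the seeded random generator are hypothetical, and the study's actual electronic assignment procedure is not described in enough detail to reproduce.

```python
import random

SAMPLING_FRACTION = 1 / 3  # one-third of assaultive shootings, per the text


def retain_case(intent, gender, rng):
    """Return True if an incident case is kept for the study sample.

    intent: "assault" or "self-inflicted" (hypothetical labels)
    gender: "female" or "male" (hypothetical labels)
    """
    # All self-inflicted shootings and all female assaultive shootings
    # were retained rather than sampled.
    if intent == "self-inflicted" or gender == "female":
        return True
    # Remaining assaultive shootings are kept with probability one-third.
    return rng.random() < SAMPLING_FRACTION


rng = random.Random(42)  # fixed seed only to make the example repeatable
kept = sum(retain_case("assault", "male", rng) for _ in range(30000))
print(kept / 30000)  # close to one-third
```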
Study cases were identified from several state and local public safety and medical data sources (Figure 1). Using these data sources, the study protocol enabled us to ascertain the locations and activities of both assaultive and self-inflicted gunshot injury cases. Shooting cases that were possibly not captured by our system included those that were privately transported to a nontrauma center emergency department and then discharged alive as well as those that were never found. Because hospitals were required by law to notify police when they received a shooting victim, we used police data to determine that only a handful of shootings were taken to a nontrauma center hospital without eventually showing up at a trauma center hospital over our 2-year study period. Moreover, it is reasonable to contend that the vast majority of shooting victims who die on-scene and are not immediately reported are eventually found because of the difficult nature of surreptitious human body disposal (Cattaneo et al., 1999; Grevin, Bailet, Quatrehomme, & Ollier, 1998; Vesterby & Poulsen, 1997). Furthermore, the vast majority of individuals who initially survive a gunshot wound seek medical care for their injury. This is evident even among criminals who are very likely to enter the medical care system after a firearm injury (May, Hemenway, Oen, & Pitts, 2000). Therefore, the number of shootings that were not found in the public safety or medical system records that we used was negligible and, in fact, most were able to be identified within a very short period of time.
In the event that conflicting information was encountered among the same data elements in different databases, we resorted to the data element from the database that was designated as primary. For cases of assaultive shootings, information from the Philadelphia Police Department (PPD) data was primary; for self-inflicted shootings, information from the Philadelphia Medical Examiner’s Office (PMEO) was considered primary.
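A minimal Python sketch of this primary-source rule is given below. The dictionary layout, the function name, and the fallback to a secondary source when the primary value is missing are assumptions made for illustration; the text specifies only which database was primary for each intent.

```python
# Primary database for each type of shooting, as stated in the text.
PRIMARY_SOURCE = {"assault": "PPD", "self-inflicted": "PMEO"}


def resolve(intent, field_values):
    """Pick one value for a data element reported by several sources.

    field_values maps a source name to its reported value
    (None when the source has no value for this element).
    """
    primary = PRIMARY_SOURCE[intent]
    value = field_values.get(primary)
    if value is not None:
        return value
    # Assumption: if the primary source is silent, take the first
    # non-missing value from any other source.
    for reported in field_values.values():
        if reported is not None:
            return reported
    return None


print(resolve("assault", {"PPD": "10:30 p.m.", "PMEO": "10:45 p.m."}))
```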
Case information was wirelessly sent to the University of Pennsylvania on a daily basis from the PPD and the PMEO. Each time this occurred, a survey research firm was also wirelessly notified and an age-, race-, and gender-matched adult Philadelphia resident was randomly selected and interviewed over the telephone about their location and activities at the time of their case’s shooting (Figure 2). Rapid identification of cases followed by rapid interview of controls to determine where they were and what they were doing at the time of a case’s injury greatly minimized recall bias.
Controls identified from the general population, “population-based” controls, were used and found feasible because the base population was well defined (i.e., case participants were Philadelphia residents). Population-based controls are theoretically drawn from the base population known to have given rise to the cases (Wacholder, Silverman, McLaughlin, & Mandel, 1992). Given that the study was to develop risk estimates for the general population of Philadelphia, the same population that theoretically gave rise to the cases in our sample, population-based controls were ideal because they represented a true sample of individuals who were at risk of being shot and who would have been identified as cases had they been shot in Philadelphia. Because the study also intended to develop comparative risk estimates for individual and geographic factors, control participants were specifically not selected based on their geographic location. Thus, the use of population-based controls decreased the possibility of selection bias and greatly increased the study’s generalizability relative to other control groups that could have been used, namely, hospital, emergency department, morgue, neighborhood, or friend-based controls.
Control participants were matched to cases based on age group (21-24, 25-39, 40-64, and more than 65 years old), gender, race (Black or White), and the date and time (within 30 min intervals, e.g., 10:00 p.m., 10:30 p.m., 11:00 p.m.) when the case participant’s shooting occurred. Rather than adjust for them in our analysis, we pair-matched on these variables to avoid extremely sparse data in certain subgroups given that exceedingly different age, race, and gender distributions existed among assaultive and self-inflicted shootings relative to the general population of Philadelphia. As part of an incidence density sampling structure, we also pair-matched on time of shooting because the primary exposures we planned to test were often fleeting (for instance, geographic location may change with time) and, as such, the status of participants at the time of the shooting was most etiologically relevant (Roberts, 1995). Based on early power calculations, one control participant was matched to each assaultive shooting case and two control participants were matched to each self-inflicted shooting case.
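The matching criteria can be expressed as a key that a case and its candidate controls must share. The Python sketch below is illustrative only: the function names are hypothetical, placing persons aged exactly 65 in the oldest group is an assumption (the text does not say), and rounding the shooting time down to the half hour is one possible reading of the 30 min intervals.

```python
from datetime import datetime


def age_group(age):
    """Map an age in years to the study's matching age groups."""
    if age < 21:
        return None  # under 21 was ineligible
    if age <= 24:
        return "21-24"
    if age <= 39:
        return "25-39"
    if age <= 64:
        return "40-64"
    return "65+"  # assumption: age 65 falls in the oldest group


def half_hour_slot(ts):
    """Round a timestamp down to the start of its 30 min interval."""
    minute = 0 if ts.minute < 30 else 30
    return ts.replace(minute=minute, second=0, microsecond=0)


def match_key(age, gender, race, shot_at):
    """Return the tuple on which a case and a control are pair-matched."""
    return (age_group(age), gender, race, half_hour_slot(shot_at))


key = match_key(23, "male", "Black", datetime(2004, 5, 1, 22, 41))
print(key)
```

A control would then be asked about their location and activities during the interval identified by the last element of the key.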
Control participants were sampled from the study area (Philadelphia) using a modification of the Waksberg random digit dialing (RDD) method (Waksberg, 1978). The RDD selection of controls was performed by DataStat, Inc., a survey research firm in Ann Arbor, Michigan, and sought to ensure an equal and known probability of selection for all residential telephone numbers in Philadelphia. Sets of randomly chosen telephone numbers were assigned to random digit dialers. Each randomly derived telephone number was dialed up to 15 times (five attempts each during the day, evening, and weekend). This process continued until all eligible controls that consented to complete a telephone interview were obtained. Even if an eligible control was found from an assigned list, the eligibility of all numbers on that list was determined as described (i.e., 15 attempts per number). In this way, the potential bias of interviewing only the first person found at home, thereby preferentially selecting inactive participants, was eliminated. Controls who later were shot remained eligible to be included in the study as a case participant (Rothman & Greenland, 1998).
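The calling rule of up to 15 attempts, five in each of three periods, can be sketched as a simple schedule. The function name and the interleaved ordering of periods below are assumptions for illustration; the survey firm's actual call scheduling is not described in the text.

```python
PERIODS = ("day", "evening", "weekend")
ATTEMPTS_PER_PERIOD = 5  # five attempts in each period, per the text


def attempt_schedule():
    """Return the calling periods for the 15 attempts on one number.

    Assumption: periods are interleaved so that attempts are spread
    out rather than exhausted one period at a time.
    """
    total = ATTEMPTS_PER_PERIOD * len(PERIODS)
    return [PERIODS[i % len(PERIODS)] for i in range(total)]


schedule = attempt_schedule()
print(len(schedule), schedule[:3])
```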
We took several steps to maximize participation and thereby avoid selection biases due to nonresponse. Telephone screening was extensive with calls spread out over different times of the day, evening, and weekend. This approach should have produced higher response rates and avoided the potential for bias related to the fact that people who are frequently at home (and therefore more likely to be available for a telephone interview) may be systematically different (e.g., older, less healthy) than people who are not frequently at home. When interviewers made calls to arrange interviews and a telephone answering machine or voicemail was encountered, they left a message explaining who they were, why they were calling, and when they would call back. This approach has been shown to increase response rates (Harlow et al., 1993; Koepsell, McGuire, Longstreth, Nelson, & van Belle, 1996) and served to establish the study’s credibility and differentiate the interviewer from commercial telemarketers and solicitors. We called back later if a participant or a participant’s family member was unwell or hospitalized (Herzog & Rodgers, 1992).
Participants who initially refused to participate (unless they explicitly asked not to be recontacted) were assigned to one of the most effective interviewers who attempted to convert the refusal into an acceptance. The success of such conversion attempts has been shown to be as high as 40% (Perneger, Myers, Klag, & Whelton, 1993). To reduce respondent burden, we also conducted interviews at times that were convenient for control participants. We pretested our survey instrument to ensure that interviews were completed in a reasonable amount of time (no more than 20 min on average) and that respondent fatigue did not lead prospective controls to prematurely terminate the interview.
Once an eligible control was identified through a brief screen conducted by a telephone interviewer, a questionnaire began with a concise introduction of the study that clearly identified its affiliation with the University of Pennsylvania and its noncommercial, research-oriented intentions. On getting the respondent’s verbal consent to participate, a structured interview was conducted. Data from completed interviews were periodically sent from DataStat to study investigators who in turn conducted quality checks and merged the case and control databases into a single matched dataset.
The study involved two general categories of data: individual-level data, which refer to the case and control participants themselves, and geographic data, which refer to characteristics of the surroundings of case and control participants. The categories of data collected at each of these levels are described below and shown in Table 1. All data were obtained, stored, and analyzed under approval from both the University of Pennsylvania and the Philadelphia Department of Public Health Institutional Review Boards. A federal certificate of confidentiality was also obtained from the National Institutes of Health for the duration of the study.
The primary source of individual-level case participant data was the PPD. The PPD has 25 patrol districts and special patrol functions and has developed a geographic information systems infrastructure that is one of the largest distributed, integrated municipal geographic information systems in the United States (Cheetham, 1999). About 30% of shootings in Philadelphia are transported to either the morgue or the hospital emergency department by local police and nearly all shootings in Philadelphia involve the PPD (Branas, Sing, & Davidson, 1995). This also corresponds with the fact that the primary public safety answering point for the Philadelphia 9-1-1 system is the PPD, which then filters calls for emergency medical services to the Philadelphia Fire Department as necessary, making police and emergency medical services call data on shootings duplicative. The PPD was therefore designated our primary source for assaultive shooting cases. This primary designation has been used in past city-level firearms data efforts (Van Tuinen & Crosby, 1998).
The data provided by the PPD included information about the shooting victims themselves, the shooters, the circumstances of each shooting, and victim behaviors. Location information was also forwarded by the PPD in the form of blockface addresses of shootings and the home blockface addresses of victims and shooters.
Additional case data came from the PMEO, which is within the Philadelphia Department of Public Health. The PMEO investigates and determines the cause and manner of death in sudden, violent, and suspicious deaths, including all homicides and suicides occurring in Philadelphia. Because all homicides and suicides that occur in Philadelphia eventually pass through the PMEO, access to death certificate records housed by the Medical Examiner was vital to identifying all fatal shootings. This was particularly important for firearm homicides and suicides that were pronounced dead on-scene and taken directly to the city morgue by private means. Because of the very high lethality of self-inflicted gun injuries, the PMEO was our primary source of information for cases of self-inflicted firearm injury.
To corroborate and simplify the collection of data from the PMEO, we also obtained electronic mortality records from the Pennsylvania Department of Health (PDH). The PDH Division of Vital Records oversees the release of protected death records as per the Pennsylvania Vital Statistics Act of 1953. This Act permits access to individual death certificates for the purpose of medical research under the approval and strict supervision of the PDH. Many of the data elements that were available through the PMEO were also available in electronic format through the PDH.
The data provided by the PMEO and the PDH included information about the shooting victims themselves, the education and occupation of victims, and victim behaviors. Location information was also forwarded in the form of blockface addresses of the homes of victims.
The Pennsylvania Trauma Systems Foundation (PTSF) was an additional source of individual-level case participant data. The PTSF administers a statewide trauma registry whose data are collected by an unblinded registrar, audited for consistency and omissions, and entered into an electronic database. A full-time centralized staff also conducts annual on-site surveys of trauma centers to ensure data quality. Uniform, statewide definitions and reliability checks standardize the data further to ensure a high level of accuracy (Forrester & McMinn, 1990; Gillott, Thomas, & Forrester, 1989).
The trauma registry includes all fatal and nonfatal injuries that occur in Pennsylvania and that are transported to an accredited trauma center hospital. Over the course of the study, the City of Philadelphia had six or seven hospitals that were accredited trauma centers. Almost every shooting that did not die on-scene was taken directly to a trauma center as opposed to a nontrauma center hospital. Shooting cases and their intent were identified in the registry using external cause of injury codes E955.0-E955.4, E965.0-E965.4, and E985.0-E985.4.
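As an illustration of how the listed external cause of injury codes separate shootings by intent, a minimal Python sketch follows. The function name and the intent labels are hypothetical; the mapping reflects the standard ICD-9 categories for these code ranges (E955 self-inflicted, E965 assault, E985 undetermined intent) and is not taken from the registry's own software.

```python
# ICD-9 E-code categories named in the text, with their intent.
INTENT_BY_CATEGORY = {
    "E955": "self-inflicted",
    "E965": "assault",
    "E985": "undetermined",
}
FIREARM_SUBCODES = {"0", "1", "2", "3", "4"}  # the .0-.4 range in the text


def firearm_intent(ecode):
    """Return the intent for a firearm E-code, or None if the code
    is outside the ranges listed in the text."""
    category, _, subcode = ecode.strip().upper().partition(".")
    if subcode not in FIREARM_SUBCODES:
        return None
    return INTENT_BY_CATEGORY.get(category)


print(firearm_intent("E965.0"))  # assault
```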
Individual-level data on case participants also came from the Pennsylvania Health Care Cost Containment Council (PHC4). The PHC4 is an independent state agency that collects, analyzes, and makes available to the public, more than 2 million inpatient hospital discharge records each year. All hospitals in Philadelphia, regardless of their trauma center status, report their patient admissions to the PHC4. We identified shooting cases as well as their intent in the PHC4 data again using E-codes. The data provided by the PTSF and the PHC4 included information about the shooting victims themselves, the severity of their injuries, and their behaviors.
The questionnaire administered during the telephone interview elicited information from population-based control participants in seven sections: Background Information (e.g., age, gender, race, and household composition); Recent Activities (e.g., nearest street corner or intersection at the time of the shooting and nature of location [home, bar, etc.]); Alcohol Information (e.g., number of drinks at the time of the shooting; drinking habits); Drug Information (e.g., under the influence of any nonprescription, non-over-the-counter drugs at the time of the shooting, general drug consumption habits); Firearm Information (e.g., possession of a firearm at the time of the shooting; whether participant has ever been shot); General Information (e.g., height, weight, education, occupation, arrest record, mental health, and nearest street corner or intersection of home); and Interviewer Remarks (e.g., respondent interest, attentiveness, cooperation, comfort level, and perceived recall). Control participants were also given memory and recall anchors as needed and asked to report how accurately they could remember certain key data elements they provided that were tied to the time of their matched case’s shooting.
Geographic data on firearms were obtained from multiple sources. Among these was the Philadelphia Health Management Corporation (PHMC), a nonprofit, public health organization which administers the Southeastern Pennsylvania Household Health Survey every 2 years. More than 13,000 adult, older adult, adolescent, and child respondents at more than 10,000 households in five counties, including Philadelphia, are selected via RDD to participate in this telephone survey. We used responses pertaining to firearm ownership and storage practices from this survey’s 2002, 2004, and 2006 waves. Survey data were identifiable down to the census tract level for all of Philadelphia. Survey balancing weights and small area estimation techniques were taken into account to obtain adjusted and more representative estimates of gun availability at the tract level (Xie, Raghunathan, & Lepkowski, 2007).
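The weighting step alone can be illustrated with a short Python sketch that computes a weighted share of respondents reporting a firearm in each tract. The function name, tract identifiers, and weights are made up for illustration, and the sketch deliberately omits the small area estimation techniques that the study also applied.

```python
from collections import defaultdict


def weighted_share_by_tract(responses):
    """responses: iterable of (tract_id, weight, has_firearm) tuples.

    Returns the weighted proportion of respondents reporting a
    firearm in each tract.
    """
    reporting = defaultdict(float)
    total = defaultdict(float)
    for tract_id, weight, has_firearm in responses:
        total[tract_id] += weight
        if has_firearm:
            reporting[tract_id] += weight
    return {t: reporting[t] / total[t] for t in total if total[t] > 0}


# Hypothetical responses, not survey data.
sample = [
    ("T1", 2.0, True),
    ("T1", 1.0, False),
    ("T1", 1.0, False),
    ("T2", 1.0, True),
]
shares = weighted_share_by_tract(sample)
print(shares)
```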
Another source of geographic data on firearms was the U.S. Bureau of Alcohol, Tobacco, and Firearms (BATF). To curb the illegal use of firearms and enforce Federal firearms laws, BATF issues licenses to gun dealers. Every gun obtained via secondary, unregulated markets was at one time legally manufactured and sold to a dealer making the gatekeeping role of gun dealers and the locations of primary gun markets important considerations in the geographic study of firearm violence. Although most gun sales do not directly involve a legal gun dealer (Reiss & Roth, 1993), the potential indirect effect that legal gun dealers have on their surrounding communities merits consideration. An electronic list of all gun dealers was obtained annually for the period 2003-2006 from the BATF. This list includes the licensee name, street address, and license type (e.g., dealers, pawnbrokers, importers, and manufacturers).
Geographic data on alcohol and drugs were obtained from multiple sources. The Pennsylvania Liquor Control Board (PLCB) provides regulation over the beverage alcohol industry in Pennsylvania and issues beverage alcohol licenses for either on-premises retail sales of wine, liquor or beer, or off-premises wholesale sales of malt beverages by the case and keg. The PLCB maintains an electronic list of more than 2,000 beverage alcohol licenses for Philadelphia and its contiguous counties that is updated each day. To adequately account for turnovers in alcohol licenses, we acquired this list every 6 months for the duration of the study period. Alcohol outlets were identified by name, address, and license type.
By ordinance, the Philadelphia Department of Revenue collects a tax on sales of liquor and malt and brewed beverages in the City of Philadelphia at a rate of 10%. Every sale at retail by any business or person holding a license or permit issued by the Commonwealth of Pennsylvania to sell or dispense liquor or malt and brewed beverages is subject to the tax. Exempt from the tax are state-operated liquor stores and malt beverage distributors although these account for only about 6% of the alcohol outlets in Philadelphia (Philadelphia City Code 19, section 1805). Access to the quarterly liquor taxes (in dollars) paid by each alcohol outlet (liquor license holder) in Philadelphia was obtained for use by the study. These liquor tax data were confidential and blinded as to business identity but provided the study with an unprecedented opportunity to quantify geographic alcohol consumption.
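Because the tax rate is fixed at 10%, taxes paid can be converted back into taxable sales and summed over geographic areas. The Python sketch below shows this arithmetic; the function name, area identifiers, and dollar amounts are hypothetical, and the geographic unit the study actually used for these blinded data is not stated in the text.

```python
from collections import defaultdict

TAX_RATE = 0.10  # liquor tax rate stated in the text


def implied_sales_by_area(tax_records):
    """tax_records: iterable of (area_id, quarterly_tax_dollars).

    Returns the taxable alcohol sales implied by the taxes paid,
    summed within each area.
    """
    sales = defaultdict(float)
    for area_id, tax_paid in tax_records:
        sales[area_id] += tax_paid / TAX_RATE
    return dict(sales)


# Hypothetical quarterly payments from three outlets in two areas.
records = [("A", 150.0), ("A", 50.0), ("B", 20.0)]
implied = implied_sales_by_area(records)
print(implied)
```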
Other geographic alcohol data were obtained from the PHMC Southeastern Pennsylvania Household Health Survey which included questions pertaining to alcohol consumption practices at the census tract level for all tracts in Philadelphia. We also obtained data compiled by the Environmental Systems Research Institute (ESRI, Redlands, California) on consumer expenditures for alcoholic beverages at the block group level in Philadelphia for 2004.
Geographic data pertaining to illicit drug markets in 2003-2005 were also obtained for use by the study. Although illicit drug markets were not directly measured for purposes of the study, they were measured by proxy through drug-related arrest data (both sales and possession), a previously validated surrogate measure of illicit drug markets (Warner & Coomer, 2003), collected from the PPD and the U.S. Attorney’s Office in Philadelphia. These data were available at the census block level for all of Philadelphia.
Geographic data on fast food restaurants, grocery, and convenience stores were obtained from multiple sources. These business establishments often have excessively high levels of foot traffic which has been associated with violence (Anderson, 1999; Roncek & Maier, 1991). Fast food restaurant information for Philadelphia was obtained for 2004 from Restaurant Trends, Inc. (Wall, New Jersey) and included name, address, type, and annual sales volumes for each business establishment. Data for supermarkets, small groceries, and convenience stores were obtained from Trade Dimensions, Inc. (Wilton, Connecticut) and included name, address, and type of each business establishment in 2004. Annual data on these businesses were also obtained for 2003-2005 from the Philadelphia Department of Licenses and Inspections that provided access to an electronic list of all food service licensees in Philadelphia that included addresses, types, square footage, and number of seats.
Geographic data on crime and public safety were acquired from the PHMC household survey described above and the PPD. From PHMC we included data pertaining to respondents’ feelings of safety and encounters with physical violence from this survey’s 2002, 2004, and 2006 waves. These survey data were identifiable down to the census tract level for all of Philadelphia. From the PPD we incorporated information on locations of complaints and arrests for personal and property crimes in Philadelphia from 2003 to 2005. These crime data included robberies and assaults with guns, vandalism and criminal mischief, narcotics arrests, liquor law violations, and public drunkenness. All crime data were measured at the block level for all of Philadelphia.
The PHMC Southeastern Pennsylvania Household Health Survey was used for Philadelphia County to obtain important data pertaining to issues of social capital. We again combined this survey’s data from the 2002, 2004, and 2006 waves. From the survey we were able to ascertain the strength of social bonds, organizational participation, feelings of belonging, perceptions of trust among neighbors, and residential connectedness within census tracts for all of Philadelphia.
Data on land use in Philadelphia came from the Philadelphia Board of Revision of Taxes, the Philadelphia Department of Licenses and Inspections, the Philadelphia Department of Recreation, the School District of Philadelphia, the Free Library of Philadelphia, the American Hospital Association, ESRI, the US Postal Service, and the US Census Bureau. These data included 2003-2005 information at the block level on residential, commercial, industrial, and vacant properties, housing code violations, and owner occupied, renter occupied, and vacant housing units. These data also included addresses, centroid longitude and latitude coordinates, and polygon shape files for bodies of water, parks, playgrounds, brownfields, streets and highways, hospitals, health centers, and schools.
A variety of demographic data were also obtained. At the census block level for 2004 in Philadelphia, Geolytics, Inc. (New Brunswick, New Jersey) provided projections of population, housing units, households, families, median age, gender, race, ethnicity, average household size, household headship, and marital status. The methodology used by Geolytics in estimating block level projections for intradecennial census years is available online at www.geolytics.com. At the census block group level, ESRI data were obtained for 2004 and included per capita income, median household income, and unemployment rates.
The record of a given case participant in one dataset was linked to the record for the same case participant in other datasets using both deterministic and probabilistic methods. The method used depended on the availability of personal identifiers such as name and date of birth. Such personal identifiers were available to two data coordinators who were part-time employees of the study team but full-time civilian employees of the PPD. This enabled the police data collected for each case participant to be linked deterministically to OME data for the case participants who were deceased. In addition, an OME data manager sent the PPD data coordinators a fax each month that contained a list of gunshot decedents received by the OME. Along with information yielded during autopsy, the list included personal identifier information which additionally enabled the deterministic linkage of OME data to PPD data.
Other case participant information was linked across data sources in a probabilistic manner. This was accomplished through a comparative inspection of datasets to identify the same individuals. Although probabilistic software was tested for this purpose, we proceeded manually based on the high degree of success we encountered and the relative ease of the manual procedure. Our sample size was small enough to make this feasible and only a handful of records were not conclusively linked.
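The two linkage modes described above can be illustrated with a minimal sketch. It is not the procedure used in the study, where the probabilistic step was carried out by manual inspection. The `Record` fields, the unweighted agreement score, and the 0.75 threshold are all hypothetical choices made for illustration; formal probabilistic linkage would typically weight each field by how informative it is.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical record layout shared by all data sources."""
    source: str
    name: Optional[str]        # personal identifier, may be unavailable
    birth_date: Optional[str]  # ISO date, may be unavailable
    incident_date: str         # ISO date of the shooting
    sex: str
    race: str
    age: Optional[int]


def deterministic_match(a: Record, b: Record) -> bool:
    """Exact agreement on personal identifiers, when both records carry them."""
    if not (a.name and b.name and a.birth_date and b.birth_date):
        return False
    same_name = a.name.strip().lower() == b.name.strip().lower()
    return same_name and a.birth_date == b.birth_date


def agreement_score(a: Record, b: Record) -> float:
    """Share of non-identifying fields on which two records agree (0 to 1)."""
    checks = [
        a.incident_date == b.incident_date,
        a.sex == b.sex,
        a.race == b.race,
        a.age is not None and b.age is not None and abs(a.age - b.age) <= 1,
    ]
    return sum(checks) / len(checks)


def link(a: Record, b: Record, threshold: float = 0.75) -> str:
    """Classify a candidate pair; 'probable' pairs would go to manual review."""
    if deterministic_match(a, b):
        return "deterministic"
    if agreement_score(a, b) >= threshold:
        return "probable"
    return "no link"
```

In a workflow like the one described, pairs labeled "probable" would be the ones a reviewer inspects by hand before accepting the link.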
The goal of the data surveillance system was to capture all assaultive and self-inflicted shootings in Philadelphia within hours of their occurrence. The two data coordinators at the PPD were equipped with mobile laptops that had wireless modems and electronic data collection forms. These data coordinators were scheduled so that one was on duty for study purposes each day, six days per week (alternating taking Saturday off one week and Sunday off the next), every week. During their shifts, data coordinators continuously reviewed computerized police incident reports to identify shooting cases. For each new case, the data coordinator entered information into 10 fields of a relational database: case participant identification number; police record number; medical examiner record number (if warranted); shooting date; shooting time; victim age range (under 21 years, 21-24, 25-39, 40-64, 65 or older); victim sex; victim race; shooting intent; and resident status (Philadelphia or not). The database in turn generated a “Short Form,” a one-page text file that reported the 10 information fields as well as a computer-generated random number and a flag indicating whether the victim met study enrollment criteria as a case participant.
The Short Form was forwarded immediately to the study leader by password-protected, encrypted e-mail (with a cc to the principal investigator to monitor the process). The study leader and principal investigator possessed handheld e-mail devices at all times during the study period. On receipt of a Short Form, the study leader reviewed it for logical consistency (i.e., the flag was appropriate given the information listed), replied by e-mail to the data coordinator to confirm receipt, and, as indicated, forwarded Short Forms for eligible case participants to a data collection firm to begin the process of recruiting a matched control participant. On receipt of a new Short Form at the data collection firm, a computer server also automatically replied by e-mail to the study leader to confirm receipt.
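As a rough illustration of the Short Form, the sketch below models the 10 fields, the random number, and the enrollment flag as a small record type. The field names, the text layout, and the exact eligibility rule are assumptions; the rule is inferred from the exclusions reported for this study (under 21 or unknown age, non-residents, race other than Black or White) and omits criteria, such as police officer status, that are not among the 10 fields.

```python
import random
from dataclasses import asdict, dataclass, field
from typing import Optional

# Assumed eligibility sets, inferred from the exclusions reported in the text.
ELIGIBLE_AGE_RANGES = {"21-24", "25-39", "40-64", "65 or older"}
ELIGIBLE_RACES = {"Black", "White"}
ELIGIBLE_INTENTS = {"assault", "self-inflicted"}


@dataclass
class ShortForm:
    """Hypothetical layout of the one-page Short Form."""
    participant_id: int
    police_record_number: str
    medical_examiner_record_number: Optional[str]  # only if warranted
    shooting_date: str
    shooting_time: str
    victim_age_range: str
    victim_sex: str
    victim_race: str
    shooting_intent: str
    philadelphia_resident: bool
    # Computer-generated random number, usable later for random selection.
    random_number: float = field(default_factory=random.random)

    @property
    def meets_enrollment_criteria(self) -> bool:
        """Flag showing whether the victim qualifies as a case participant."""
        return (
            self.victim_age_range in ELIGIBLE_AGE_RANGES
            and self.victim_race in ELIGIBLE_RACES
            and self.shooting_intent in ELIGIBLE_INTENTS
            and self.philadelphia_resident
        )

    def as_text(self) -> str:
        """Render the fields, random number, and flag as plain text lines."""
        rows = list(asdict(self).items())
        rows.append(("meets_enrollment_criteria", self.meets_enrollment_criteria))
        return "\n".join(f"{key}: {value}" for key, value in rows)
```

Computing the flag from the stored fields, rather than entering it by hand, makes the study leader's logical consistency check a matter of re-running the same rule.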
Participant locations were collected as street intersection or blockface points. Environmental-level factors were collected as centroid and population-weighted centroid points of blocks, block groups, and tracts. All geographic data were linked to study participants using the known locations of cases and controls, which had been converted into longitude and latitude point coordinates. On completing this linkage, all data were also converted into point-based (longitude and latitude coordinates) and areal-based (census blocks, block groups, and tracts) measures.
Case and control participants were compared based on their geographic proximity to risk factors using both areal and point-based measures. Areal-based measures were defined within census blocks, block groups, and tracts and used to quantify the extent to which participants were exposed to a given type of risk factor within a defined area. Point-based measures were used to quantify the extent to which participants were exposed to a given type of risk factor at any point in space. Based on the point where they were located and the surrounding point locations and magnitudes of geographic factors, participants were assigned cumulative levels of exposure to these factors using inverse distance-weighted measures. The higher the measure the greater the clustering and magnitude of factors around a participant’s location. Distances were exponentiated to greatly de-emphasize factors that were far away and to avoid undefined fractions. A bandwidth of 2 miles, beyond which all values were assumed to be zero (Silverman, 1978, 1986), was also incorporated based on cross-validation techniques (Fotheringham, Brunsdon, & Charlton, 2000) and a heuristic calculated with the number of observed points under study and the square root of Philadelphia’s total land area (Bailey, 1995; Williamson, McLafferty, Goldsmith, McGuire, & Mollenkopf, 1998). Inverse distance-weighted measures produced no aggregation effects and needed no multilevel or clustering adjustments (Longley, Goodchild, Maguire, & Rhind, 2005) while accounting for spillover effects and the variability of neighboring areas (Geronimus, 2006; Holt, Steel, & Tranmer, 1996; Krieger et al., 2002; Openshaw, 1984; Scribner, 2000; Wong, 1991; Wrigley, 1995).
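A minimal sketch of an inverse distance-weighted exposure measure with a 2-mile bandwidth follows. The text does not give the exact weighting function, so the form used here (each factor's magnitude divided by the exponentiated distance, which equals the full magnitude at zero distance and never divides by zero) is an assumption, as are the function names and the use of great-circle distance.

```python
import math

EARTH_RADIUS_MILES = 3958.8
BANDWIDTH_MILES = 2.0  # factors beyond this distance contribute zero


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def idw_exposure(participant, factors, bandwidth=BANDWIDTH_MILES):
    """Cumulative exposure of one participant to surrounding geographic factors.

    participant: (lat, lon) of the participant's location.
    factors: iterable of (lat, lon, magnitude) for each factor location.
    Assumed weight: magnitude / exp(distance in miles), zero beyond the bandwidth.
    """
    lat, lon = participant
    total = 0.0
    for f_lat, f_lon, magnitude in factors:
        distance = haversine_miles(lat, lon, f_lat, f_lon)
        if distance > bandwidth:
            continue
        total += magnitude / math.exp(distance)
    return total
```

Because the measure is computed at each participant's own point, it yields one exposure value per person and per factor type, with no need to assign participants to areal units first.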
Over the study period, our research team was notified about 3,485 shootings occurring in Philadelphia. This translated into an average of 4.77 ± 2.82 shootings per day with a maximum of 21 shootings in a single day and an average of 9 days a year that were shooting-free. From among all these shootings, 3,202 (91.88%) were assaults, 167 were self-inflicted (4.79%), 60 were unintentional (1.72%), 54 were legal interventions (1.55%), and 2 were of undetermined intent (0.06%).
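The intent breakdown can be checked directly from the reported counts. The short sketch below recomputes the total and each percentage; the variable and function names are illustrative only.

```python
# Counts of shootings by intent, as reported in the text.
shootings_by_intent = {
    "assault": 3202,
    "self-inflicted": 167,
    "unintentional": 60,
    "legal intervention": 54,
    "undetermined": 2,
}

total_shootings = sum(shootings_by_intent.values())


def percent(part, whole):
    """Percentage of whole represented by part, rounded to two decimals."""
    return round(100 * part / whole, 2)


intent_shares = {
    intent: percent(count, total_shootings)
    for intent, count in shootings_by_intent.items()
}
```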
When considering only assaults, an average of 4.39 ± 2.70 individuals were shot per day in Philadelphia with a maximum of 20 in a single day and an average of 13 days a year in which no individuals were shot. These cases were geographically concentrated in a few areas of Philadelphia (Figure 3). For use in our study, we excluded assault cases who were under 21 years of age or of unknown age (29.83%), non-Philadelphia residents (4.34%), individuals not described as being Black or White (1.62%), and police officers who had been shot (0.09%). From the remaining group of 2,073 participants, 677 (32.66%) were randomly selected and enrolled. Among all 677 enrolled shooting assaults, the case fatality rate was 18.46%. An age, race, and gender-matched group of 684 control participants were also concurrently identified and enrolled.
When considering only self-inflicted gun injuries, an average of 0.23 ± 0.47 individuals shot themselves each day in Philadelphia with a maximum of three in a single day and an average of 75 days each year in which individuals shot themselves. Geographically, these cases were relatively diffuse across Philadelphia (Figure 3). For use in our study, we excluded self-inflicted cases who were under 21 years of age or of unknown age (7.19%), non-Philadelphia residents (1.80%), and individuals not described as being Black or White (2.39%). All 149 participants who remained were enrolled. The case fatality rate for these remaining participants was 91.89%. An age, race, and gender-matched group of 302 control participants were also concurrently identified and enrolled.
Geographically, both assault and self-inflicted control participants were found to be relatively diffuse across Philadelphia similar to the general population (Figure 3). The median number of days between the time a shooting occurred and the time a control participant was recruited and interviewed to completion was 2 days, with more than 75% of all control participant interviews being completed within 4 days of the shooting. As a check of their recall, controls were also asked how certain they were of their location at the specific time of their index case’s shooting. In total, 94.10% reported that they were very sure of their location, 3.82% said they were sure, and 1.82% said they were not very sure or did not respond. The ability to recall their activities at the index time was also very similar for controls, with 95.77% reporting that they were very sure of key activities, 2.03% saying they were sure, and 1.15% saying they were not very sure or not responding.
Using standard formulae, the cooperation rate for our control survey was calculated to be 74.4% and the response rate 56.0% (Daves, 2006). These rates exceeded those of other surveys conducted at about the same time (Galea & Tracy, 2007) and were high enough to produce a reasonably representative sample of our target population (Groves, 2006; Keeter, Kennedy, Dimock, Best, & Craighill, 2006). Our respondents were also statistically similar to the general population of Philadelphia in terms of marital status, retirement, education, general health status, and smoking status within the age, gender, and race categories on which they were matched (Southeastern Pennsylvania Household Health Survey, 2006). Our controls were, however, found to be unemployed significantly more often than the general population.
Data collection and linkage efforts were largely successful for key variables related to shooting cases and controls. Among cases, all police and medical examiner records were successfully linked, although only 35.03% of the linked records for cases who went to a hospital could then be linked to hospital data. Gender, race, and age data were obtained for all cases. Data on the location of participants at the reference time were missing for 0.7% of cases and 4.2% of controls. Data on the location of participants’ homes were missing for 5.5% of cases and 3.0% of controls. A nontrivial proportion of data, however, were missing for shooters. For example, shooter gender was missing for 29.7% of cases, shooter race was missing for 30.9% of cases, shooter age was missing for 49.2% of cases, and shooters’ home addresses were missing for 37.5% of cases.
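The text does not state which variants of the standard formulae were used. As a sketch only, the functions below assume the simplest AAPOR-style definitions (often labeled RR1 and COOP1), computed from final call dispositions. The parameter names are illustrative, and any counts passed in are the caller's own; the study's disposition counts are not reported here.

```python
def response_rate_1(complete, partial, refusal, non_contact, other,
                    unknown_household, unknown_other):
    """Completed interviews over all eligible and possibly eligible units."""
    denominator = (
        (complete + partial)
        + (refusal + non_contact + other)
        + (unknown_household + unknown_other)
    )
    return complete / denominator


def cooperation_rate_1(complete, partial, refusal, other):
    """Completed interviews over all eligible units that were contacted."""
    return complete / ((complete + partial) + refusal + other)
```

The cooperation rate leaves non-contacts and units of unknown eligibility out of the denominator, which is why it is typically higher than the response rate, as it is in this study.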
The study described here is a novel linkage of individual and geographic data to study firearm violence. Existing and newly captured citywide data from local, state, and federal sources as well as telephone interviews were used to assess acute risk factors for gunshot injury. The study included the rapid ascertainment of shooting victims (viz. cases) and the concurrent enrollment and interviewing of firearm injury-free participants selected via RDD (viz. controls). Our method of data collection, which included rapid enrollment and a well-coordinated comprehensive surveillance system employing multiple agencies, was unique and positions the study to significantly enhance our understanding of the relationships between individual and geographic risk factors for firearm violence.
Enrolling controls in near real time let us study risk factors, both individual and geographic, that were fleeting but potentially important for understanding and ultimately preventing firearm violence. This incidence density sampling also helped to avoid recall bias: case information was obtained while it was still fresh in the minds of data collectors (such as police officers), and controls were asked about their activities only a few days in the past. The alternative, cumulative sampling, would have enrolled participants at the end of the study period and introduced problems of recall and ascertainment, given that shooting events could have occurred long before the data were collected.
Incidence density sampling also conferred analytic advantages. Specifically, in addition to deriving estimates of relative rates of gunshot injury from the logistic regression analyses we conduct, we are also able to estimate the actual rates of gunshot injury associated with being exposed or not exposed to individual and geographic risk factors of interest (Kleinbaum, Kupper, & Morgenstern, 1982; Rothman & Greenland, 1998). In doing so, we will be better able to assess the magnitude of the risk burden of firearm violence on urban populations.
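The reasoning behind these advantages can be written out briefly. The notation below is introduced here for illustration and is not taken from the study; it follows the standard textbook argument for density-sampled case-control designs and ignores matching and covariate adjustment.

```latex
% A_1, A_0: exposed and unexposed cases
% B_1, B_0: exposed and unexposed controls, with B = B_1 + B_0
% T_1, T_0: exposed and unexposed person-time at risk, with T = T_1 + T_0
% I_1, I_0: incidence rates among the exposed and unexposed
\begin{align}
\frac{B_1}{B_0} &\approx \frac{T_1}{T_0}
  && \text{(controls sampled from person-time at risk)} \\
\mathrm{OR} = \frac{A_1 / A_0}{B_1 / B_0}
  &\approx \frac{A_1 / A_0}{T_1 / T_0}
   = \frac{A_1 / T_1}{A_0 / T_0}
   = \frac{I_1}{I_0} \\
\hat{I}_k &= \frac{A_k}{B_k / f}, \quad k \in \{0, 1\}
  && \text{(if the control sampling fraction } f = B / T \text{ is known)}
\end{align}
```

The second line shows why the exposure odds ratio estimates the rate ratio without a rare-disease assumption. The third shows how a known control sampling fraction recovers exposure-specific rates, provided that the case counts are complete or are scaled by their own known sampling fraction.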
A number of study limitations also deserve discussion. Our control population was more often unemployed than the target population of Philadelphians that it was intended to represent. Although our control population was found to be representative of Philadelphians for five other indicators, the excess of unemployment among our controls may mildly erode our study’s generalizability. It is also worth noting that our findings are possibly not generalizable to nonurban areas, whose gun injury risks can be significantly different from those of urban centers like Philadelphia (Branas et al., 2004).
As another limitation, we did not enroll nongun injuries and so cannot compare the risks of being injured with a gun as opposed to a nongun weapon. These would have been useful comparisons to make, but collecting nongun injury information was seen as considerably more challenging because shootings were much better defined and monitored by the police and medical systems in Philadelphia.
Finally, the study will not be able to conclusively determine that the geographic factors it has tested cause individuals to risk being shot. However, compared with ecologic studies, our study has taken a large step forward by much more precisely calculating geographic risk using inverse-distance weighted measures based on the longitude and latitude point locations of actual study participants (not aggregations of study participants). Because we also collected individual-level information, the study is thus able to test whether geographic factors independently generate risk over and above individual factors.
Few, if any, studies of firearm violence have been able to simultaneously determine the population-based, relative risks that individuals experience as a result of what they were doing at a specific point in time and where they were, geographically, at a specific point in time. We accomplished this using a population-based, case-control design that accounts for individual characteristics, individual behaviors, and the geography of fatal and nonfatal firearm injuries. The new data linkage methods used in this study, as well as the application of these linked data, are very relevant to researchers interested in estimating and comparing the variety of risk factors that may lead to violent victimization. By being able to calculate comparative risks between individuals and the geographic factors that are around them, this study will promote our understanding of how individuals interact with their environments. The data linkage and methods demonstrated here are also potentially of value to policymakers interested in individual-level prevention strategies but also geographic planning and zoning as politically feasible, yet often overlooked, strategies for local communities to contend with problems (Gordis, 1997; Hoyos, 1991, pp. 1-14; Wittman & Hilton, 1987) such as firearm violence.
Funded by the National Institutes of Health, National Institute on Alcohol Abuse and Alcoholism (under grant number R01AA013119).
Charles C. Branas is an associate professor in the Department of Biostatistics and Epidemiology at the University of Pennsylvania. He works to improve health and health care and is recognized for his studies to reduce violence and enhance emergency care. As codirector of the Penn Cartographic Modeling Laboratory, much of his work incorporates human geography and spatial interactions.
Dennis Culhane is a professor of social policy and psychology in the University of Pennsylvania School of Social Policy and Practice, where he codirects the Cartographic Modeling Laboratory. His research interests include modeling the built and social environments, and their impacts on health and behavior.
Therese S. Richmond is an associate professor in the Biobehavioral and Health Sciences Division of the School of Nursing at the University of Pennsylvania. Her continuing research focuses on the psychological and physiological repercussions of injury and reducing individual, environmental, and social risks for violent injuries.
Douglas J. Wiebe is an assistant professor in the Department of Biostatistics and Epidemiology at the University of Pennsylvania. He currently leads a study about the locations of adolescents’ daily activities and the impact of the built environment on the likelihood of being assaulted.