A. General population sampling
The difficulties associated with selecting a representative sample of disaster survivors differ depending on whether the disaster is or is not defined in terms of geography. In the case of natural disasters (e.g., tornados, hurricanes) or man-made disasters that have a geographic epicenter (e.g., the Oklahoma City bombing), it makes most sense to think in terms of area probability household sampling as the main basis for sample selection. There are inevitable practical problems with this form of sampling that can be exacerbated in situations of mass evacuation. As described below, multiple-frame sampling (Skinner and Rao, 1996) can be used to decrease coverage problems in situations of this sort. In the case of disasters that do not have a geographic epicenter (e.g., a plane crash), in comparison, the use of list samples is a necessity unless the researchers have the resources to engage in large-scale mass screening, using multiplicity sampling (Kalton and Anderson, 1986) whenever possible to increase the efficiency of the screening exercise. In any of these cases, frame biases have to be taken into consideration. Land line telephone frames, in particular, might under-represent the most disadvantaged segments of the population (Brick et al., 2006), making it particularly useful to implement a multiple-frame sampling approach that enriches the less restrictive frame for high-risk cases, possibly by over-sampling Census blocks with low rates of land line telephone penetration or high rates of poverty.
Some studies will involve both geographically clustered and dispersed cases. For example, the workers in a government building exposed to a terrorist attack with anthrax would be geographically dispersed during the initial time period when the building was evacuated and workers were sent home prior to a thorough evaluation of building contamination. The most feasible way to evaluate need for mental health treatment of these workers and their families during that time period would be from an administrative list sampling frame with home contact information for all such workers. Once the Environmental Protection Agency makes an evaluation that the building is safe for workers to return, though, the affected workers (although not their families) would become highly clustered geographically (i.e., at their place of work), making it possible to carry out mental health needs assessment surveys efficiently on site.
Another mixed case is the situation where a man-made disaster occurs at a place that involves both residents of the area in which the disaster occurred and people who were passing through the area at the time of the disaster. A good example is the 2005 train crash at a depot in the middle of the small town of Graniteville, South Carolina that released toxic chemicals into the local environment, leading to injury, death, and toxic exposure among the passengers and crew of the train and to risk of toxic exposure, evacuation, and community disruption among residents of the community in which the crash occurred (US Environmental Protection Agency, 2005). In a situation of this sort, the residents of the community would be geographically clustered while the surviving passengers and crew of the train would not be geographically clustered.
We faced an especially complex situation with regard to sampling in assembling the Hurricane Katrina CAG. A small proportion of the population, presumably representing the most high-risk pre-hurricane residents of the areas most hard hit by the storm and resulting flood in New Orleans, were living in evacuation centers (ECs) and later FEMA-sponsored hotel rooms, trailers, and even luxury liners. Many other pre-hurricane residents of the New Orleans Metropolitan Area were scattered throughout the country, largely living with relatives, but also in communities that had established evacuation centers and subsequently created community living situations in which a certain number of needy families from New Orleans were, in effect, adopted by the community. The vast majority of pre-hurricane residents of the other areas in Alabama, Louisiana, and Mississippi that were affected by the hurricane remained living either in their pre-hurricane households or in the surrounding community in which they lived before the hurricane as they went about repairing the damage caused to their homes and communities. Telephone lines were down in many parts of the affected areas for a considerably longer time than is typical in US natural disasters. In addition, physical movement was made difficult by infrastructure damage and difficulty finding gasoline for cars. Conventional household enumeration was made difficult in some areas by the fact that many pre-hurricane homes no longer existed.
At the same time, we had several important resources available to us that we used in building a multiple-frame sampling strategy that combined information from a number of restricted frames to assemble the sample of people who participated in the CAG. One rather unexpected resource was the use of random digit dialing (RDD). It seems counterintuitive that RDD could be used to study Katrina survivors in light of the fact that the vast majority of the New Orleans population was forced to evacuate their homes after the storm and the fact that many people who lived in other areas affected by the hurricane had nonworking land lines because of damage to telephone infrastructure. However, the main telephone provider in the hurricane area, Bell South, forwarded phone calls made into the hurricane area to new numbers (either land line numbers or cell phone numbers) outside the area that were registered by the owners of the pre-hurricane numbers. As a result of this service, we were able to call an RDD sample of phone numbers selected from 1+ telephone banks working in New Orleans prior to the hurricane and to connect with many displaced pre-hurricane New Orleans residents in temporary residences all across the country.
A second important resource was the availability of extensive ARC and FEMA lists of people who registered for assistance. Of the over four million adult residents of the area defined by FEMA as affected by Katrina (4,137,000 adult residents in the 2000 Census), a majority applied for assistance to one or both of the two major agencies that maintained comprehensive applicant lists. We were in the fortunate position of having access to both of these lists. In order to reduce overlap with the RDD frame, we restricted our use of these lists to cell phone exchanges and to land line exchanges in areas outside of the RDD sampling area. Over 1.4 million families representing more than 2.3 million adults applied to the ARC for assistance and provided post-hurricane contact information that included new residential addresses, telephone numbers (often cell phones), and email addresses. An even larger number of families (roughly 2.4 million) applied to FEMA for assistance and also provided post-hurricane contact information comparable to the ARC list information. As one would predict, considerable overlap existed in the entries on these two lists, but the more surprising finding was that a substantial number of families applied only to one of the two. There were also a number of families that fraudulently applied on multiple occasions and at different locations to the same agency. We corrected for these multiple counts in sampling from these lists.
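The correction for multiple counts and the overlap between the two applicant lists can be illustrated with a minimal sketch. The field names (`surname`, `phone`), the matching rule, and the records are hypothetical and chosen only for illustration; the source does not describe how duplicate or overlapping entries were actually identified.

```python
import re

def match_key(record):
    """Build a crude matching key: lower-cased surname plus phone digits.

    Assumption: this key is illustrative only; real list matching would
    need far more careful record linkage.
    """
    surname = record["surname"].strip().lower()
    phone = re.sub(r"\D", "", record["phone"])  # keep digits only
    return (surname, phone)

def merge_lists(arc_records, fema_records):
    """Collapse repeat entries and flag which list(s) each family is on."""
    merged = {}
    for source, records in (("arc", arc_records), ("fema", fema_records)):
        for record in records:
            flags = merged.setdefault(match_key(record),
                                      {"arc": False, "fema": False})
            flags[source] = True  # repeat applications collapse to one entry
    return merged

def overlap_counts(merged):
    """Count families on both lists, on the ARC list only, or FEMA only."""
    counts = {"both": 0, "arc_only": 0, "fema_only": 0}
    for flags in merged.values():
        if flags["arc"] and flags["fema"]:
            counts["both"] += 1
        elif flags["arc"]:
            counts["arc_only"] += 1
        else:
            counts["fema_only"] += 1
    return counts

# Hypothetical records: the first family applied to the ARC twice.
arc = [
    {"surname": "Smith", "phone": "504-555-0101"},
    {"surname": "smith ", "phone": "(504) 555-0101"},
    {"surname": "Jones", "phone": "504-555-0102"},
]
fema = [
    {"surname": "Jones", "phone": "5045550102"},
    {"surname": "Lee", "phone": "504-555-0103"},
]
counts = overlap_counts(merge_lists(arc, fema))
```

Counts of this kind (both lists, one list only) are the raw ingredients for any later adjustment for frame overlap.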
It is also noteworthy that a great many hurricane evacuees registered with one or more of the “safe lists” set up on the internet by CNN, MSNBC, the ARC, and others. These lists allowed people separated from their loved ones during the hurricane or aftermath to let it be known that they were alive and to record their whereabouts in the hopes of reconnecting with their loved ones. Google subsequently integrated all the names recorded on all the internet safe lists into a single consolidated list that contained over 400,000 names. We made extensive use of this consolidated list in piloting the baseline CAG interview. However, this pilot testing led to the discovery that virtually all people on the safe lists were also on the more inclusive ARC and FEMA lists of people who applied for assistance. As a result, we did not use the safe lists in our final sample selection for the CAG.
By the time the baseline CAG survey was fielded, all the Katrina ECs had been closed and only a small number of evacuees were still housed in FEMA-supported hotel rooms. This made it relatively easy to screen a representative sample of hotels selected from the Donnelly commercial sampling frame to find hotels housing evacuees, to use information provided by hotel managers to select a sample of rooms with probabilities proportional to size from these hotels, and to include the respondents interviewed in this way as a supplemental sample. Not surprisingly, though, this exercise showed that virtually all hotel evacuees were included with valid contact information on the FEMA relief list that we were using as one of the main sample frames. As with respondents sampled from each of the other frames, information was included about this overlap and used in making weighting adjustments in the consolidated CAG sample.
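Selection with probabilities proportional to size can be sketched as follows. This is a generic systematic PPS routine, not the procedure used for the CAG; it assumes that the units are hotels, that the measure of size is the number of evacuee rooms reported by each manager, and that the hotel names and room counts are hypothetical.

```python
import random

def pps_systematic(sizes, n, rng):
    """Systematic PPS selection of n units from a {unit: size} mapping.

    A unit whose size is at least the sampling interval is selected
    with certainty and may be selected more than once.
    """
    total = sum(sizes.values())
    step = total / n                      # sampling interval
    start = rng.random() * step           # random start in [0, step)
    points = iter(start + i * step for i in range(n))
    point = next(points, None)
    selected = []
    cumulative = 0.0
    for unit, size in sizes.items():
        cumulative += size
        # Select the unit once for every selection point it covers.
        while point is not None and point < cumulative:
            selected.append(unit)
            point = next(points, None)
    return selected

# Hypothetical hotels with the number of evacuee rooms in each.
evacuee_rooms = {"Hotel A": 50, "Hotel B": 10, "Hotel C": 30, "Hotel D": 10}
sample = pps_systematic(evacuee_rooms, n=2, rng=random.Random(0))
```

With an interval of 50 rooms, Hotel A is always selected, and Hotel C is three times as likely as Hotel B or Hotel D to be the second selection.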
The availability of these different frames allowed us to use relatively inexpensive telephone administration to reach the great majority of people who were living in the areas affected by Katrina before the hurricane. As noted above, we reduced overlap between the two main frames by restricting our use of the ARC and FEMA lists to cell phone exchanges and to land line exchanges in areas outside of the RDD sampling area. In addition, we collected data from every respondent in the entire sample that allowed us to determine whether they had a non-zero probability of selection in each frame. For example, we asked respondents in the RDD sample if they applied to the ARC and to FEMA for assistance. This information made it possible for us to use capture-recapture methods (Fisher et al., 1994) to estimate the size of each population segment defined by the multivariate profiles of their existence or non-existence in each frame and to use these estimates of size to develop weights that were used to combine these segments into an equal-probability sample of the population.
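The logic of capture-recapture estimation can be illustrated with the simplest two-frame case. The sketch below uses Chapman's bias-corrected version of the Lincoln-Petersen estimator and assumes that membership in the two frames is independent, an assumption the source does not state. The frame labels and all counts are invented for illustration; the estimation actually carried out for the CAG involved more than two frames and is not reproduced here.

```python
def chapman_estimate(n_a, n_b, n_both):
    """Chapman's two-list capture-recapture estimate of population size."""
    return (n_a + 1) * (n_b + 1) / (n_both + 1) - 1

def segment_sizes(n_a, n_b, n_both):
    """Estimated size of each segment defined by presence in frames A and B."""
    total = chapman_estimate(n_a, n_b, n_both)
    return {
        "both": n_both,
        "a_only": n_a - n_both,
        "b_only": n_b - n_both,
        # People on neither frame: estimated total minus those on any frame.
        "neither": total - (n_a + n_b - n_both),
    }

def segment_weights(sizes, respondents):
    """Weight per respondent = estimated segment size / respondents in it."""
    return {
        segment: sizes[segment] / respondents[segment]
        for segment in sizes
        if respondents.get(segment, 0) > 0
    }

# Invented counts: 600 people on frame A, 500 on frame B, 300 on both.
sizes = segment_sizes(n_a=600, n_b=500, n_both=300)

# Invented numbers of completed interviews in each segment.
respondents = {"both": 30, "a_only": 30, "b_only": 20, "neither": 20}
weights = segment_weights(sizes, respondents)
```

Weighting each respondent by the estimated size of his or her segment divided by the number of respondents in that segment makes the combined sample reflect the estimated population mix across segments.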
Concerns could be raised about the under-representation of three population segments in the frames discussed up to now: evacuees who lived outside the hurricane area, were reachable by RDD, but who were not included on either the ARC or FEMA lists (either because they did not apply or because they did not provide traceable telephone contact information); other evacuees who lived outside the hurricane area who could not be reached by telephone (whether or not they applied for ARC or FEMA assistance); and residents of the affected area who remained in the area but could not be contacted by telephone (because they did not have a working land line that could be reached by RDD and they either did not have a cell phone or did not apply to the ARC or FEMA and provide a cell phone contact number). We attempted to reach the first of these three groups (i.e., evacuees who lived outside the hurricane area, were reachable by RDD, but who were not included in either the ARC or FEMA lists) by experimenting with the use of a national RDD sample that employed multiplicity methods (i.e., asking for evacuees among current household residents and among first-degree relatives of a randomly selected informant in each household) either with live telephone interviewers or interactive voice response (IVR) messages with follow-up live telephone interviewers. Based on data from the ARC and FEMA lists about geographic evacuation patterns, we anticipated that approximately one in every 500 households in the US outside of the hurricane area would contain one or more hurricane evacuees and that some additional number of household informants would tell us about the whereabouts of such evacuees.
We screened a nationally representative sample of 20,000 listed telephone numbers to investigate the validity of these assumptions, a random half using IVR and the other half using live interviewers. We found a hit rate closer to one in 1000 in the households randomized to be screened by live interviewers, with the number of evacuees in these households typically quite large (4-7). This presumably reflects differential preferences for relocation destinations of evacuees with and without families. We found that the hit rate was much smaller in the households randomized to be screened by IVR. It is possible that this disadvantage of IVR could have been corrected if we had pursued additional iterations of alternative IVR scripts. We terminated the exercise before these iterations, though, based on the finding that all evacuees in telephone households with listed phone numbers outside the hurricane area had applied either to the ARC or to FEMA for assistance with traceable contact information. This means that these people were already part of our primary sample frames, making it unnecessary to screen for them in a supplemental national RDD sample.
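The screening yields implied by these hit rates can be worked through in a few lines. The sketch assumes, purely for illustration, that every one of the 10,000 numbers assigned to live interviewers produced a completed household screen, which the source does not claim; the target of 100 evacuee households is likewise hypothetical.

```python
def expected_hits(households_screened, one_in):
    """Expected number of evacuee households at a hit rate of one in `one_in`."""
    return households_screened / one_in

def screens_needed(target_hits, one_in):
    """Completed screens needed, on average, to find `target_hits` households."""
    return target_hits * one_in

live_screens = 20_000 // 2  # random half assigned to live interviewers

anticipated = expected_hits(live_screens, one_in=500)   # anticipated rate
observed = expected_hits(live_screens, one_in=1000)     # rate closer to observed

# Hypothetical target of 100 evacuee households at the observed rate.
needed = screens_needed(100, one_in=1000)
```

Under these assumptions the anticipated rate implies about 20 evacuee households among the live-interviewer screens and the observed rate about 10, and roughly 100,000 completed screens would be needed to locate 100 such households.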
The most feasible way to reach the remaining groups that are under-represented in the frames discussed above (i.e., evacuees who could not be reached by telephone) using probability sampling would have been to use a survey field staff to carry out face-to-face interviews on an area probability sample of households and group quarters. We did not do this in our survey of Katrina survivors due to financial constraints. If we had done so, it would have been important to include information that allowed us to determine whether each respondent sampled from this frame also had a probability of selection in the list samples and the RDD sample. With regard to design considerations, a sample of this sort that focused on people living in the area affected by the hurricane would be based on a conventional multi-stage clustered area probability sampling design.
Logistical complications would exist in sample selection, as the Census measures of size used to select sampling segments (i.e., blocks in urbanized areas and block-equivalents in rural areas) would be much less accurate than normal because of housing destruction. Block listing would also be more complex than usual in that the normal landmarks used to define sample segments would in some cases be destroyed, possibly making it necessary to work with knowledgeable local informants (e.g., mail delivery workers) to help define segment boundaries. It might also be efficient to select larger segments than in a usual household survey to allow for the likelihood of housing unit destruction and to invest more heavily in block listing than usual. Logistical complications would also exist in interviewer travel and housing and because of infrastructure damage. While making fieldwork more difficult, though, none of these problems would be insurmountable.
An argument could be made that even non-probability sampling would be useful in situations where probability sampling is prohibitively expensive so long as the sampling was based on characteristics identified as reflecting high exposure to disaster-related stressors (e.g., areas that were directly hit by a tornado or areas that were not reconnected to services after a natural disaster), as such an approach could provide useful information about the range of exposures and psychological reactions to the disaster. Quotas on the basis of a cross-classification of basic socio-demographic variables could be imposed in such a case in order to guarantee breadth of coverage.
B. High-risk population sampling
Initial needs assessment surveys of Hurricane Katrina survivors focused on high-risk populations, including pre-hurricane residents of New Orleans who remained in their homes shortly after the hurricane (Centers for Disease Control and Prevention, 2006a), people staying in evacuation centers (Centers for Disease Control and Prevention, 2006b), and people residing in FEMA-sponsored trailers or hotel rooms (Abramson and Garfield, 2006). First responders also are a high-risk population of importance that has been the focus of considerable research attention (Ben-Ezra et al., 2006; Fullerton et al., 2004). Although these populations make up only a small percentage of all the people who were affected by Katrina, their distinct geographic characteristics and their presumably high level of exposure to hurricane-related stressors make them important targets for needs assessment.
Such high-risk populations can be expected to vary widely across disaster situations. The workers in a government office building that was the target of an anthrax attack along with their families might be a high-risk group in one disaster situation, while the residents of a geographic area close to a toxic chemical spill might be a high-risk group in another disaster situation. Geographic propinquity need not be a defining feature of these groups. The families and close friends of the people killed in an airplane crash, for example, would be a high-risk group for needs assessment that is widely dispersed in terms of geography. In the case of natural disasters, there are some other high-risk groups that might be expected to be more consistent across situations, such as residents of nursing homes and people with physical disabilities who would have a difficult time evacuating.
One of the most important of these high-risk groups after Hurricane Katrina consisted of people with pre-hurricane severe-persistent mental illness (SPMI) whose medical records were temporarily lost in the storm, whose local pharmacies were destroyed, and who were unable to refill their antipsychotic medications. This group represents an extreme case of the much larger group of people with pre-existing chronic conditions who were found in assessments of EC residents often to have unmet need for maintenance medications to treat their chronic conditions (Brodie et al., 2006). An exacerbating factor is that the Strategic National Stockpile of emergency medications (Centers for Disease Control and Prevention) and short-term deployments of emergency medical personnel in the Public Health Security and Bioterrorism Preparedness and Response Act (Rosenbaum, 2006) both failed to anticipate this problem by providing ready access to desperately-needed medications for SPMI and other extreme chronic conditions. Once the problem was recognized, emergency mental health service planners made special efforts to obtain psychotropic medications for emergency medical clinics as well as to recruit psychopharmacology experts to provide appropriate medications to people with SPMI who sought care in these clinics.
In the course of these planning activities, questions arose about the magnitude and distribution of unmet needs for services of the pre-hurricane SPMI population. Needless to say, people with SPMI make up such a small part of the general population that we were unable to make reliable statements about the special needs of people with SPMI based on the CAG sample. Assessments could, of course, be made of unmet demand for treatment of SPMI based on systematic epidemiologic surveillance systems set up in emergency health clinics. However, we know that information on demand for services often fails to give an accurate assessment of need for services, which is why general population needs assessment surveys are of such great importance.
In the case of comparatively rare high-risk populations, the only practical option for needs assessment is to gain access to a list sample that can be used as a sampling frame for tracing. It might sometimes be possible to merge multiple list samples to refine sampling or to answer certain critical policy questions regarding high-risk populations. For example, a comprehensive list existed of all nursing home residents in the areas affected by Katrina that could be linked to the National Death Index (NDI) in order to address concerns that the relocation was associated with a substantial increase in mortality of nursing home residents, although this would involve substantial delays in light of the fact that posting in the NDI sometimes does not occur until as much as a year after death. Linkage of this sort could be done across multiple administrative data systems to generate very useful data, especially when done in conjunction with follow-up surveys. It would be possible, for example, to use linked income tax records and mortality records to track the mortality experience of pre-hurricane residents of the areas affected by Hurricane Katrina who either subsequently returned to their pre-hurricane residence or who moved to a different part of the country.
Similarly, it would be possible to link pre-disaster medical-pharmacy claims data of members of large health plans in areas affected by a disaster with post-disaster claims data, income tax data, and NDI mortality data to track the associations of pre-disaster morbidity with subsequent geographic mobility, healthcare utilization, and mortality. Targeted tracking surveys then could be used to investigate the determinants of substantially reduced healthcare utilization among people with evidence of high pre-disaster need for treatment. The main impediment to this kind of integrated analysis is lack of coordination among the agencies and organizations that maintain the many different administrative data systems that would be relevant to such undertakings. Legal constraints on sharing identifying information are important considerations here along with organizational inertia and structural disincentives to collaborate in inter-organizational initiatives. An inter-agency task force in the US federal government is currently grappling with these complex issues in an effort to develop a workable plan for the use of administrative databases in these ways in response to future disasters. In addition, legislation and regulations associated with the US federal government's Confidential Information Protection and Statistical Efficiency Act (CIPSEA; www.eia.doe.gov/oss/CIPSEA.pdf) call for increased data sharing among statistical units of federal agencies and for a correspondingly more extensive confidentiality umbrella over shared data.