The NSHAP sample consisted of multiple stages of selection: (a) two area stages, in which geographic areas were selected into the sample with probabilities proportional to their sizes, (b) a household selection stage in which a sample of households was selected from the selected areas for screening, and (c) an individual selection stage in which persons were selected for the NSHAP interview. These stages together determine the probabilities of selection of the individuals in the study. This design is a classic multistage area probability sample (for details on this class of sample designs, see Harter, Eckman, English, & O’Muircheartaigh, in press).
NSHAP wished to interview adults aged 55–85 years. However, only approximately 30% of U.S. households contain an individual in this age range. Identifying such households and the eligible individuals within them would have involved an extremely expensive (about $2 million) and time-consuming screening of a large sample of households. At the time that we were planning the NSHAP design, the HRS (also funded by the National Institute on Aging) was about to embark on the recruitment of a new cohort. Through an innovative collaboration between NSHAP and HRS (and between the NORC and the Institute for Social Research [ISR], the respective survey organizations), the screening for both surveys was carried out as a single operation, with substantial saving in costs. As HRS interviewers screened households in the selected segments for individuals eligible for their survey, they also identified individuals who were eligible for NSHAP. HRS screening took place from February to November 2004. At the end of their data collection period, they sent all NSHAP-eligible individuals to NORC, and we selected our final sample of households and individuals from this database. This sharing of field resources allowed NSHAP to have a much larger sample size than would otherwise have been possible. However, this collaboration did require that the NSHAP redefine its target population to adults aged 57–85 years. This change meant that the HRS and NSHAP populations were nearly nonoverlapping.
Coverage of the Sampling Frame
Undercoverage occurs whenever some eligible persons have no chance to be selected for the survey. There are two minor sources of undercoverage in the NSHAP design: First, as the survey was to be carried out as a household survey, the population was limited to adults living in households; thus, the institutionalized population and the homeless were excluded; second, those absent from the country during the period of the fieldwork were excluded.
The only substantial source of undercoverage arises from the link between HRS and NSHAP fieldwork. HRS was recruiting the 50- to 56-year-old cohort (the Early Baby Boomers) and their partners as well as the next cohort, the 44- to 49-year-old Middle Baby Boomers and their partners. Due to HRS’s complex eligibility rules, 57- to 85-year-olds living in households with 44- to 56-year-old nonpartners would not be available for NSHAP by virtue of their residence in the household of an HRS-eligible individual.
To estimate the magnitude of the undercoverage due to the loss of these nonpartners, we used the household composition data in the U.S. Census Bureau’s Public Use Microdata Set. Overall, we estimated that this constraint could exclude just more than 5% of the NSHAP-eligible population (6% of the eligible women and 4% of the men). This is a relatively low degree of undercoverage overall, but there is some concern that those excluded would be concentrated in particular (and potentially interesting) subclasses of the population. The most common reason that we expected individuals to be excluded was that they were living with an adult child. Other common reasons included living with a sibling, an unmarried partner, or an unrelated housemate. Data about the excluded cases from the HRS recruitment process itself are not available.
Area Stages of the Design
The sample designs for NSHAP and HRS are identical at the area stages. The first stage consists of primary sampling units (PSUs; either metropolitan areas or counties) selected with probability proportional to size. Within selected PSUs, second stage units (segments) were formed from Census blocks and selected with probability proportional to size. In order to generate sufficient sample size for African American and Latino subsamples, the probabilities of selection of segments with more than 10% African American or Latino population were more than doubled relative to all other segments. This meant that all adults living in these segments, whether African American or Latino, or not, were overrepresented in the sample at this stage. The spatial correspondence between the HRS and NSHAP samples also has significant potential for future joint analyses of the data from the two surveys. For details on selection of multistage area probability samples generally, see Harter et al. (in press). At the time of this writing, some details on the HRS 2004 design were available at the HRS Web site at the Institute for Social Research, University of Michigan (see http://hrsonline.isr.umich.edu/).
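As a rough illustration of selection with probability proportional to size, the sketch below draws units by systematic sampling along the cumulated size measures. It is not the procedure actually used by HRS or NSHAP: the function names, the boost factor of 2.0 (the text says only "more than doubled"), and the example sizes are all assumptions made for illustration.

```python
import random


def boosted_size(population, minority_share, threshold=0.10, factor=2.0):
    """Hypothetical measure of size: inflate segments above the minority threshold.

    The factor of 2.0 is an assumed value, not the one used in the study.
    """
    return population * factor if minority_share > threshold else population


def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    A unit whose size exceeds the sampling interval can be hit more than once.
    """
    total = sum(sizes)
    step = total / n                      # sampling interval
    start = rng.random() * step           # random start in [0, step)
    targets = [start + k * step for k in range(n)]

    chosen, cumulative, i = [], 0.0, 0
    for index, size in enumerate(sizes):
        cumulative += size
        # Every target falling inside this unit's cumulated range selects it.
        while i < n and targets[i] < cumulative:
            chosen.append(index)
            i += 1
    # Guard against floating-point shortfall at the very end of the list.
    while i < n:
        chosen.append(len(sizes) - 1)
        i += 1
    return chosen


if __name__ == "__main__":
    # Illustrative segments: (population, minority share).
    segments = [(400, 0.05), (250, 0.30), (600, 0.08), (150, 0.45)]
    sizes = [boosted_size(pop, share) for pop, share in segments]
    print(pps_systematic(sizes, 2, random.Random(2004)))
```

Inflating the measure of size is one simple way to raise the selection probability of a whole segment, which is why every resident of such a segment, minority or not, ends up overrepresented at this stage.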
Within these selected segments, a full listing of housing units (households) was carried out by HRS field staff. Health and Retirement Study interviewers selected 30,000 households from those listed. Although they planned to select households with equal probability within four domains defined by the concentration of minorities, they deviated from this plan and oversampled segments, regardless of domain, that were found to have higher proportions of eligible persons. HRS also subsampled cases to hasten the end of the fieldwork (private correspondence from HRS statisticians). ISR then delivered all screened households containing at least one NSHAP-eligible member (except as discussed earlier), and we performed additional stages of selection.
Design for Sample of Households and Individuals Within Households
In planning the analyses we wished to run with the NSHAP data, we identified six domains of interest, that is, subclasses of the population for which separate estimates would be required: three age groups, each subdivided by gender. We determined, using approximate power calculations, that a sample size of 500 would be required for each subclass, giving an overall sample size of 3,000. The binding constraints are those for men and women in the oldest age group. Although we did not use race or ethnicity in the formation of our explicit domains, another design goal was to overrepresent African Americans and Latinos in our final sample.
A general principle of estimation is that, ceteris paribus, a sample design in which individuals are selected with equal probabilities will provide more precise estimates (estimates with smaller standard errors) than a sample in which the probabilities vary arbitrarily from individual to individual (Kish, 1965, 1992). Thus, in selecting households and individuals into the study from the frame provided to us by the HRS screening, we wished to equalize the probabilities of selection of the individuals as much as possible within our six domains.
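One common way to quantify this loss of precision is the approximate design effect due to unequal weighting, often attributed to Kish: n times the sum of squared weights divided by the squared sum of weights. The sketch below computes it for hypothetical weights; the formula is a standard approximation and is not stated in the text above, and the function name and example values are assumptions.

```python
def weighting_design_effect(weights):
    """Approximate variance inflation from unequal weights.

    Computes n * sum(w^2) / (sum(w))^2, which equals 1 + CV^2 of the weights.
    It is 1.0 when all weights are equal and grows as the weights diverge.
    """
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


if __name__ == "__main__":
    equal = [1.0] * 500             # equal-probability sample
    unequal = [1.0] * 250 + [2.0] * 250   # half the cases carry double weight
    print(weighting_design_effect(equal))
    print(weighting_design_effect(unequal))
```

Under this approximation, a sample of 500 with a design effect of 1.11 carries roughly the precision of an equal-probability sample of about 450, which is the motivation for equalizing probabilities within each domain.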
To avoid possible within-household contamination of responses, we also wanted to select no more than one person in any household. This decision made it much more difficult to meet our domain targets as the number of respondents available for selection into a particular sample was thereby much reduced. For example, a household containing an 82-year-old woman and an 84-year-old man contains an individual from our two challenging domains, but it was not possible to have both of them in the sample.
The screener data delivered by ISR contained 7,407 households, each of which contained at least one person born before 1948, together with data on adults within these households. Sample eligibility for NSHAP was defined based on year of birth (1920–1947 inclusive). In all, 6,974 of the delivered households had at least one person born in this range (9,816 eligible people). Twenty-three percent of the eligible people were identified as African American and/or Latino. This list (6,974 households/9,816 people) constituted the sampling frame for the selection of individuals for NSHAP.
The HRS interviewers attempted to collect name, race/ethnicity (race was collected in the HRS screener instrument as a dichotomous variable: minority [meaning African American or Latino] and nonminority [all others]), and birth year for all eligible individuals in the households they screened. The data quality was imperfect, with design and estimation implications. Several steps were necessary to prepare the frame for the sample selection process: gender coding, gender and race imputation, and subsampling in segments oversampled by HRS due to race/ethnic composition.
The HRS screening operation did not collect gender. Because gender was so important to the NSHAP sample design, we coded gender for each age-eligible case from name and family relationship data. In conducting the household roster to identify individuals eligible for NSHAP and HRS, interviewers permitted respondents to identify household members by initials or family roles (“husband,” “abuela”) rather than by names. Tourangeau, Shapiro, Kearney, and Ernst (1997) find that this can reduce undercoverage in rosters. We coded 87.52% of the eligible cases (12.48% contained no data from which we could deduce gender) and 52.09% of these were determined to be women.
We imputed gender for the 12.48% of cases where it could not be determined, and we imputed race/ethnicity for the 1.23% of cases where it was missing. Age was not missing for any cases on the frame. For gender, imputation was done systematically, sorted on PSU, segment, and individual within segment. After this step was completed, the eligible individuals consisted of 52.10% women. Imputation of race/ethnicity was based on the dominant race of the segment. In 19 of the 416 segments, African American/Latino individuals were the dominant group and all cases with missing race/ethnicity were imputed to this category. In all other segments, cases with missing race/ethnicity data were imputed to “not African American/Latino.” After this step was completed, 22.61% of eligible individuals were coded as African American/Latino. Gender and race imputation was carried out only to form strata for use in the sample selection process: The final sample file does not include these imputed variables.
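The segment-based race/ethnicity imputation can be sketched as follows. The text does not define "dominant," so this sketch assumes it means that more than half of the cases with known race/ethnicity in the segment are coded as minority; the record layout and function name are likewise hypothetical.

```python
def impute_minority(records):
    """Fill missing minority flags with the segment's dominant category.

    records: list of dicts with keys 'segment' and 'minority'
    (True, False, or None when missing). Returns new records.
    Assumption: a segment is minority-dominant when more than half of its
    cases with known status are coded as minority.
    """
    # Tally [minority count, known count] per segment.
    counts = {}
    for rec in records:
        if rec["minority"] is not None:
            tally = counts.setdefault(rec["segment"], [0, 0])
            tally[0] += 1 if rec["minority"] else 0
            tally[1] += 1

    imputed = []
    for rec in records:
        value = rec["minority"]
        if value is None:
            minority, known = counts.get(rec["segment"], (0, 0))
            # Segments with no known cases default to nonminority.
            value = known > 0 and minority * 2 > known
        imputed.append({**rec, "minority": value})
    return imputed


if __name__ == "__main__":
    frame = [
        {"segment": "A", "minority": True},
        {"segment": "A", "minority": True},
        {"segment": "A", "minority": None},
        {"segment": "B", "minority": False},
        {"segment": "B", "minority": None},
    ]
    for row in impute_minority(frame):
        print(row)
```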
The HRS national sample design oversampled segments with high minority concentration and households within these segments, introducing unequal probabilities of selection for all residents in these segments. Our design intention, however, was to increase the selection probabilities only for African American and Latino individuals. To reduce diversity of selection probabilities of nonminorities in these segments (and thus produce a sample that has more nearly equal probability), we subsampled these cases. Prior to the selection of households for interview, we selected a sample of nonminority individuals (not households) and discarded them from the frame. (Because additional subsampling was carried out by ISR during the screening, this adjustment did not fully equalize the selection probabilities among the nonminority cases.)
After imputation and subsampling, the final frame consisted of 7,768 eligible individuals in 5,920 households (average of 1.31 eligible members per household). In total 51.66% of the eligible individuals were women and 28.57% were racial/ethnic minorities. Individuals were coded into three age categories based on year of birth: 49.11% of eligible individuals were in the first age category (57–65 years; 1939–1947), 29.70% were in the second age category (66–74 years; 1930–1938), and 21.19% were in the last age category (75–84 years; 1920–1929).
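The age coding described above depends only on year of birth, so it can be expressed as a small lookup. The birth-year boundaries come from the text; the function name and the numeric category labels are assumptions for illustration.

```python
def age_category(birth_year):
    """Map year of birth to the three NSHAP age categories.

    Returns 1 (born 1939-1947), 2 (born 1930-1938), 3 (born 1920-1929),
    or None when the birth year falls outside the eligible range.
    """
    if 1939 <= birth_year <= 1947:
        return 1
    if 1930 <= birth_year <= 1938:
        return 2
    if 1920 <= birth_year <= 1929:
        return 3
    return None


if __name__ == "__main__":
    for year in (1919, 1920, 1930, 1939, 1947, 1948):
        print(year, age_category(year))
```

Crossing this category with gender yields the six domains used to set sample targets.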
Size of sample.—
Our objective was to complete interviews with 3,000 eligible respondents, with approximately 500 completed interviews in each of the six age/gender domains. We anticipated a 5% ineligibility rate (although the sample had been recently screened, we did expect some loss due to moving or death) and a response rate of 70% or a little more. Thus, the necessary number of cases to issue to the field staff was 3,000/0.95/0.70 ≈ 4,500. To optimize representation, we felt that we should maximize the response rate; consequently, we decided to select 4,400 individuals from the frame; to generate 3,000 interviews under our assumption of 5% ineligibility, this would require a response rate of 71.7%.
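The arithmetic behind these figures can be checked directly. The sketch below uses only the quantities given in the text (3,000 target interviews, 5% ineligibility, 70% response, 4,400 cases issued); the function names are assumptions.

```python
def cases_to_issue(target_interviews, ineligible_rate, response_rate):
    """Cases needed so that eligible responders reach the interview target."""
    return target_interviews / (1 - ineligible_rate) / response_rate


def required_response_rate(target_interviews, cases_issued, ineligible_rate):
    """Response rate among eligible cases needed to reach the target."""
    return target_interviews / (cases_issued * (1 - ineligible_rate))


if __name__ == "__main__":
    needed = cases_to_issue(3000, 0.05, 0.70)
    rate = required_response_rate(3000, 4400, 0.05)
    print(f"cases to issue at 70% response: {needed:.0f}")
    print(f"response rate needed with 4,400 cases: {rate:.4f}")
```

Issuing 4,400 cases rather than about 4,500 leaves 4,180 expected eligible cases, so the required response rate rises slightly above the 70% planning assumption.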