Data Source: NHIS and Its Supplements
NCHS/CDC has administered NHIS annually since 1957 to assess the health of the civilian noninstitutionalized population in the United States. Each year, the NHIS randomly samples about 35,000 households (87,500 persons) from 201 defined geographical units throughout the United States to provide a representative sample of U.S. households. Basic health and demographic information is collected on every member of the household, and in-depth health information is collected on one adult and one child in that household.
The households and noninstitutional group quarters selected for interview each week in the NHIS are a probability sample representative of the target population. Survey participation is voluntary and the confidentiality of responses is assured under Section 308(d) of the Public Health Service Act. The NHIS annual response rate is close to 90 percent of the eligible households in the sample.
The description that follows is adapted from Appendices III and VII of the 2006 NHIS Survey Description. Sampling and interviewing for the NHIS are continuous throughout each year. Multistage sampling techniques are used to select the sample of dwelling units for the survey. The sampling plan follows a multistage area probability design that permits the representative sampling of households and noninstitutional group quarters (e.g., college dormitories); the plan is redesigned after every decennial census. In 1997, the survey was substantially redesigned [3], with questions added to capture data on health insurance, access to health care, and health behaviors.
The current sampling plan was implemented in 2006. The first stage of the plan consists of a sample of 428 primary sampling units (PSUs) drawn from about 1,900 geographically defined PSUs that cover the 50 states and the District of Columbia. The multistage methods partition the target universe into several nested levels of strata and clusters. Within a PSU, two types of second-stage units are used: area segments and permit segments. Area segments are defined geographically and contain an expected 8, 12, or 16 addresses. Permit segments cover housing units built after the 2000 census; they are defined using updated lists of building permits issued in the PSU since 2000 and contain an expected four addresses.
The sampling design oversamples Black, Hispanic, and Asian persons using two procedures. The first is "screening." Prior to interviewing, the sample addresses in area segments are randomly separated into two parts. In one part, the sample addresses are assigned to be "screened," and the NHIS interview proceeds through the collection of the household roster. The interview continues only if the household roster contains one or more Black, Hispanic, or Asian persons. Otherwise, the interview terminates and the household is said to be "screened out." In the second part of the NHIS sample, full interviews occur at all households. No screening occurs in permit segments. The second oversampling procedure is applied when area segments are sampled within PSUs. Segments are grouped by 2000 census concentrations of Black, Hispanic, or Asian persons, and groups with higher concentrations are sampled at a higher rate.
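The screening logic described above can be illustrated with a short sketch. This is a simplified illustration, not the actual NCHS field procedure: the `Household` type, the function names, and the 50/50 split (`screen_fraction`) are assumptions, since the text states only that area-segment addresses are randomly separated into two parts.

```python
import random
from dataclasses import dataclass

# Groups the design oversamples, per the text above.
OVERSAMPLED = {"Black", "Hispanic", "Asian"}

@dataclass
class Household:
    address_id: int
    roster: list  # race/ethnicity recorded for each household member

def assign_screening(address_ids, screen_fraction=0.5, seed=0):
    """Randomly split area-segment addresses into a screened part and a full-interview part.

    screen_fraction is a hypothetical value; the actual split is not given in the text.
    Permit-segment addresses are never screened and should not be passed in.
    """
    rng = random.Random(seed)
    shuffled = list(address_ids)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * screen_fraction)
    return set(shuffled[:cut]), set(shuffled[cut:])

def interview_outcome(household, screened_ids):
    """Return 'full interview' or 'screened out' for one area-segment household."""
    if household.address_id not in screened_ids:
        return "full interview"   # second part of the sample: no screening
    if OVERSAMPLED & set(household.roster):
        return "full interview"   # roster includes at least one oversampled group
    return "screened out"         # interview ends after the household roster
```

Because screened-out households are dropped after the roster, the retained sample contains a higher share of the oversampled groups than the population does, which is one reason estimates from such a design rely on sampling weights.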
Data collection procedures
Data are collected through a personal household interview conducted by interviewers employed and trained by the U.S. Census Bureau according to procedures specified by NCHS/CDC and detailed in the NHIS report found at: http://ftp.cdc.gov/pub/Health_Statistics/NCHS/Survey_Questionnaires/NHIS/2007/frmanual.pdf
Nationally, the NHIS uses about 400 interviewers, trained and directed by health survey supervisors in each of the 12 Census Bureau Regional Offices. The revised NHIS questionnaire fielded since 1997 uses a computer-assisted personal interviewing (CAPI) mode. Interviewers use a laptop to administer the CAPI version of the NHIS questionnaire and enter responses as they are given during the interview. This computerized mode of data collection offers distinct advantages in timeliness and improved quality of the data.
Content of questionnaire
The NHIS questionnaire has two parts: a set of basic health and demographic items – known as the Core Questionnaire – and one or more sets of questions on current health topics. Core Questionnaire questions generally do not vary from year to year, which allows for trend analysis and for pooling data from more than one year to increase sample size for analytic purposes. The Core Questionnaire has four major components: Household, Family, Sample Adult, and Sample Child questionnaires.
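Pooling years of Core Questionnaire data can be sketched as follows. This is a minimal illustration under stated assumptions: the record fields (`weight`, `current_smoker`) are hypothetical, and dividing each weight by the number of pooled years is a common convention for making weighted totals represent an average annual population rather than a rule taken from this article. Variance estimation, which requires the survey's design information, is not covered.

```python
def pool_years(records_by_year):
    """Combine per-year record lists, rescaling each weight by the number of years pooled."""
    n_years = len(records_by_year)
    pooled = []
    for year, records in records_by_year.items():
        for rec in records:
            # Copy the record so the caller's data are left unchanged.
            pooled.append({**rec, "year": year, "weight": rec["weight"] / n_years})
    return pooled

def weighted_prevalence(records, flag):
    """Weighted share of records for which the boolean field `flag` is true."""
    total = sum(r["weight"] for r in records)
    hits = sum(r["weight"] for r in records if r[flag])
    return hits / total

# Hypothetical two-year example with made-up weights.
example = {
    2005: [{"weight": 1000.0, "current_smoker": True},
           {"weight": 3000.0, "current_smoker": False}],
    2006: [{"weight": 2000.0, "current_smoker": True},
           {"weight": 2000.0, "current_smoker": False}],
}
pooled_example = pool_years(example)
```

Rescaling by a constant does not change a pooled prevalence, but it keeps weighted counts on the scale of a single year's population.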
The Household questionnaire collects limited demographic information on all individuals living in the household. The Family questionnaire verifies and collects additional demographic information on each family member in the household and data on health status and limitations, injuries, health care access and utilization, health insurance, and income and assets. The Family questionnaire also allows the NHIS to serve as a sampling frame for additional integrated surveys, as needed.
From each family in the NHIS, one sample adult and one sample child (if children are in the household) are randomly selected; information on each is collected with the Sample Adult questionnaire and the Sample Child questionnaire. Because health issues differ for children and adults, some items differ in the two questionnaires. However, both questionnaires collect basic information on health status, health care services, and health behaviors.
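The within-family selection step can be sketched as below. The text says only that one adult and one child are "randomly selected"; the equal-probability draw, the age cutoff of 18, and the `(person_id, age)` record layout are assumptions made for illustration.

```python
import random

def select_sample_persons(family, rng, adult_age=18):
    """Pick one sample adult and, if any children are present, one sample child.

    family: list of (person_id, age) tuples.
    adult_age: hypothetical cutoff separating adults from children.
    """
    adults = [p for p in family if p[1] >= adult_age]
    children = [p for p in family if p[1] < adult_age]
    sample_adult = rng.choice(adults) if adults else None
    sample_child = rng.choice(children) if children else None
    return sample_adult, sample_child

rng = random.Random(42)
adult, child = select_sample_persons([("a", 40), ("b", 38), ("c", 10)], rng)
```

Under an equal-probability draw, an adult in a two-adult family has half the chance of selection of an adult living alone, so person-level analyses generally need weights that account for family composition.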
Since 1965, the NHIS has included tobacco-related questions – such as questions on cigarette, pipe, and cigar smoking and use of smokeless tobacco (snuff and chewing tobacco) – although methods and questions have varied. Questions on cigarette smoking routinely included in the Core Questionnaire collect data on lifetime smoking status, current smoking status, number of cigarettes smoked per day, age of smoking initiation, and attempts to quit smoking.
In contrast to the Core Questionnaire, NHIS supplements are used to respond to new public health data needs as they arise. These questionnaires may be used to provide additional detail on a subject covered in the Core Questionnaire or on a topic not covered in other parts of the NHIS. Several supplements have included questions on tobacco, such as those designed to assess cancer control and occupational health.
NHIS supplements are administered to a subset of respondents, ranging from 10,000 to 80,000 people. For example, in 1970, questions on smoking were included in a special topic supplement sponsored by the National Cancer Institute (NCI) to explore smoking and health more fully. In one of the more detailed survey supplements, tobacco questions in the 2000 Cancer Control Supplement (http://appliedresearch.cancer.gov/surveys/nhis/) focused on the extent of current smoking; smoking cessation; switching to a lower tar and nicotine cigarette; intent to quit smoking; provision of smoking advice from health care professionals; use of other types of tobacco, such as cigars, pipes, chewing tobacco, moist snuff, and bidis; worksite smoke-free policies; home exposure to secondhand smoke; and opinions about smoke-free policies, health effects, and tax increases. These data can be invaluable in assessing the total impact of tobacco use on public health and in identifying strategies to promote healthier lifestyles. Additional file 1 highlights NHIS supplements that have included tobacco questions and provides the survey year, sample size, participant age, and survey topics.
Strengths and limitations of NHIS data
Strengths of the NHIS dataset include its large sample size, large number of variables, and links to other datasets. In addition to being large enough to provide estimates for a number of population subgroups, the data also can be used to compare demographic characteristics – such as gender, age, race/ethnicity, and socioeconomic status – with knowledge, attitudes, and behaviors related to health practices, including tobacco initiation, use, and cessation. The utility of the NHIS dataset is enhanced through links with other NCHS databases that include mortality data, Medicare Enrollment and Claims data, and Social Security Benefit History data. The NHIS also can be linked with Medical Expenditure Panel Survey Linkage Files and the National Immunization Provider Record Check Study (1997–1999).
NHIS questions on tobacco allow researchers to monitor trends in tobacco-related behaviors and can be used to evaluate the context of tobacco use. The criterion of having smoked at least 100 cigarettes (as the threshold for asking additional smoking questions) has been part of the NHIS from the beginning. The wording and positioning of the questions in the interview have been relatively stable. With few exceptions, smoking data are self-reported by the sample adult, so inaccuracies associated with proxy reporting are not an issue. Because tobacco-related questions are embedded in a broad range of questions, NHIS data can be used to relate tobacco behavior to other behaviors and information, such as stress, injury control, cancer screening and knowledge, family history of cancer, alcohol use, dietary knowledge and behaviors, physical activity, health insurance, and social activities. Although such analyses may provide important insights for tobacco prevention and cessation interventions or policies, they have been explored only on a limited basis to date. In addition, because the NHIS has been used in both research and policymaking arenas for so long, its results can be used as benchmarks.
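The 100-cigarette criterion acts as a gate for the follow-up smoking items, which can be sketched as a simple classification. The function name, the response labels ("every day", "some days", "not at all"), and the status categories are assumptions for illustration; they are not taken from the questionnaire text quoted in this article.

```python
def smoking_status(smoked_100, smokes_now=None):
    """Classify a sample adult's cigarette smoking status.

    smoked_100: True if the respondent reports smoking at least 100 cigarettes in their lifetime.
    smokes_now: answer to a hypothetical follow-up item, asked only when smoked_100 is True.
    """
    if not smoked_100:
        return "never"      # below the threshold: no follow-up questions asked
    if smokes_now in ("every day", "some days"):
        return "current"
    if smokes_now == "not at all":
        return "former"
    return "unknown"        # follow-up item refused or missing
```

Keeping the threshold fixed over time is what allows status categories built this way to be compared across survey years.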
The NHIS does not collect all the information that may be needed for some tobacco-related research and does not include all subgroup populations. Because it focuses on health information, the NHIS does not collect data in areas such as labor force participation or industry. Further, the health information collected does not include verifiable medical data or laboratory data, such as blood pressure readings, oximeter readings, or blood and urine data. Because the NHIS covers only the civilian noninstitutionalized population, it misses such segments of the population as military personnel and older adults in nursing homes and other long-term care facilities. Also, the age-tobacco use relationships may be biased, as older users may have died before the survey. Finally, because the survey data, including the questions on tobacco, are cross-sectional and based on a new sample each year, they describe a changing set of respondents rather than following the same individuals over time.
Researchers also need to take into account the limitations inherent in self-reported data, such as those collected by the NHIS. Some respondents may not be forthcoming about a behavior many consider to be undesirable, which could lead to underestimates of current tobacco use and overestimates of attempts to quit such use. The number of cigarettes smoked is subject to the respondents' rounding and estimation error. Information on the age of tobacco initiation depends on the respondents' recall of an event that may not have had a clear starting point and, especially for older respondents, may have occurred a long time ago. Additional file 2 summarizes additional national tobacco-related surveys used in analyzing other variables related to tobacco use, attitudes, knowledge, behaviors, and clinical data.