To assess nonresponse bias in a mixed-mode general population health survey.
Secondary analysis of linked survey sample frame and administrative data, including demographic and health-related information.
The survey was administered by mail with telephone follow-up to nonrespondents after two mailings. To determine whether an additional mail contact or mode switch reduced nonresponse bias, we compared all respondents (N = 3,437), respondents from each mailing, and telephone respondents to the sample frame (N = 6,716).
Switching modes did not minimize the under-representation of younger people, nonwhites, those with congestive heart failure, high users of office-based services, and low-utilizers of the emergency room but did reduce the over-representation of older adults.
Multiple contact and mixed-mode surveys may increase response rates, but they do not necessarily reduce nonresponse bias.
Survey participation is declining (Hox and de Leeuw 1994; Hartge 1999; Steeh et al. 2001; de Leeuw and de Heer 2002; Tickle et al. 2003; Curtin, Presser, and Singer 2005; Morton, Cahill, and Hartge 2006; Berk, Schur, and Feldman 2007); this trend is of great concern because response rate is the most widely used measure of survey quality (Atrostic et al. 2001) and nonresponse bias can be a serious threat to the validity of survey estimates (Sackett 1979; Barton et al. 1980). In an effort to increase response rates, and potentially reduce nonresponse bias, household surveys are increasingly turning to mixed-mode designs whereby instruments are designed to be administered in more than one mode, including mail, web, telephone, and/or in-person, and respondents are allowed to respond in the mode of their choice (de Leeuw 2005; Dillman, Smyth, and Christian 2009b). The attraction of mixed-mode designs is that the characteristics of nonrespondents may vary by the mode of data collection (Groves 2006) and a second mode will bring in potentially different types of respondents. For this reason (among others), the data collection protocols for three major surveys, the Consumer Assessment of Healthcare Providers and Systems, the Experience of Care and Health Outcomes studies, and the American Community Survey (ACS), call for an initial contact by mail with telephone follow-up to encourage initial nonrespondents to mail in their completed questionnaires or to complete a telephone interview.
Available evidence supports the notion that some respondents exhibit mode preference (Siemiatycki 1979; Brambilla and McKinlay 1987; Link and Mokdad 2005) and that a sequential strategy of multiple contacts that allows prospective respondents to respond in a particular mode will improve response rates. For example, in work evaluating the effect of pairing a mixed mail and telephone methodology with a prepaid cash incentive on response rates in a survey of Medicaid enrollees, response rates increased considerably after telephone follow-ups, from 54 to 69 percent in the incentive condition, and from 45 to 64 percent in the nonincentive condition (Beebe et al. 2005). Similarly, Gallagher, Fowler, and Stringfellow (2000) found that approximately 34 percent of a sample of Medicaid enrollees responded to a mailed survey and another 10–13 percent responded by telephone. Finally, the ACS, a large national demographic survey conducted by the U.S. Census Bureau, achieves a response rate of 56.2 percent to an initial mailed survey, an increase to 63.5 percent after telephone follow-up, and a final response rate of 95.4 percent after face-to-face interviews (Griffin and Obenski 2002).
Although these studies demonstrate the ability of mixed-mode surveys to increase response rates, they do not clarify their effect on nonresponse bias because little information on nonrespondents is available. Some research suggests that switching modes does bring in a different population from those that respond to the initial mode. For example, Fowler et al. (2002) found that telephone interviews with mail nonrespondents produced a less biased final sample in terms of gender and age in a sample of 800 health plan enrollees. In one of the few mixed-mode studies to have more detailed health-related information on the full sample of 1,900 adult patients enrolled in a randomized controlled trial to promote smoking cessation, a telephone followed by mail design improved representativeness in a number of health-related areas, such as seeking treatment, cardio-pulmonary comorbidities, and substance abuse (Baines et al. 2007). However, these studies had limited information on respondents and nonrespondents (Fowler et al. 2002); used an atypical sequential strategy (e.g., telephone followed by mail versus mail followed by telephone; Baines et al. 2007); and focused on specialized patient populations (Fowler et al. 2002; Baines et al. 2007) that render the generalizability of their results unclear.
This article reports a systematic analysis of nonresponse bias in a general population survey that used a mixed-mode, mail followed by telephone data collection approach, drawing on extensive sociodemographic and health-related information on both respondents and nonrespondents. Our primary focus is to assess whether nonresponse bias was reduced by the use of a mixed-mode, mail and telephone data collection design.
The data on response status come from a sequential mixed-mode, mail and telephone survey on recent gastrointestinal symptoms conducted between September 2005 and April 2006 by the Mayo Clinic Survey Research Center. Further details of the study and its methods are available elsewhere (Beebe et al. 2007, 2011). The population for the study survey included noninstitutionalized residents of Olmsted County, Minnesota, aged 18 and older as identified in a purchased list-based sample.
The study population is the 6,939 eligible cases that were sent a mailed survey packet. Initial nonresponders were sent a second survey 3 weeks later. A telephone interview was attempted approximately 2 weeks later for remaining nonrespondents. The overall response rate for the survey was 51.2 percent (American Association for Public Opinion Research 2006). The response rates for the first and second mailings were 24.1 and 38.3 percent, respectively.
The sampling frame for the study was linked to administrative data from the Rochester Epidemiology Project (REP). Each health care provider in Olmsted County (home of Mayo Clinic, Olmsted Medical Center, and the Rochester Family Medicine Clinic) uses a unit medical record system whereby all data collected on an individual are assembled in one place. Each participating site also solicits and documents permission from patients for their records to be used. Currently, 95 percent of patients have granted this permission. The REP includes medical diagnoses, hospital admissions and surgical procedures, and demographic information. Overall, at least 98 percent of the Olmsted County population has been seen by a REP provider at some point (Melton 1996; St Sauver et al. 2011). Approximately 97 percent of the cases in the sample file were matched to members in the REP database. Primary analyses focused on the 6,716 individuals for whom health care information was available. This study was approved by the Mayo Clinic and Olmsted Medical Center IRBs.
Respondents include those who completed a mailed survey or telephone interview (at least two-thirds of the items completed). Nonrespondents include those who refused or could not be contacted. Respondents are further categorized by whether they completed the survey at the first or second mailings, or completed the telephone interview.
Selected demographic variables were obtained from the REP frame, including age, gender, and race/ethnicity. Race/ethnicity was classified as white versus other because sample sizes did not permit analysis of specific minority cultural groups. All medical and surgical diagnoses received by patients at a health care site participating in the REP are coded using either the Hospital Adaptation of the International Classification of Diseases (Commission on Professional and Hospital Activities 1973) or the International Classification of Diseases, 9th Edition (ICD-9) codes. Also included was the formal diagnosis in the past decade of a number of disease statuses (see Table 1), dichotomized as presence or absence of each condition. The severity-weighted Charlson Index (Charlson et al. 1987; Deyo, Cherkin, and Ciol 1992) based on these diagnoses was used to provide a summary score of comorbidity. The Charlson measure is an effective method of estimating future morbidity and mortality in longitudinal studies (Charlson et al. 1987) and therefore has utility as a measure of current health.
Also ascertained was whether each subject had a surgical or nonsurgical procedure at one of the hospitals in Olmsted County in the past decade. Finally, the numbers of emergency room (ER) visits, outpatient clinic visits, and hospital admissions during the 2 years spanning the survey field period (2005 and 2006) were calculated. Utilization was dichotomized, with cut-offs chosen to facilitate analysis and interpretation, informed by the items’ marginal distributions to identify natural breaks, and designed to accord with prior authorization studies in Olmsted County using the REP (Jacobsen et al. 1999).
The key research question was, “What effect did deploying a mixed-mode, mail and telephone data collection strategy have on nonresponse bias?” Using the distribution from the total eligible sample as population estimates, we used chi-square goodness-of-fit tests to compare respondents by mode (first and second mailing and phone) to the population. Although throughout we refer to the survey protocol as reflecting a mixed-mode design, we acknowledge that part of our analysis, looking at response patterns between the first and second mailings, is not an evaluation of mixing modes but rather an evaluation of a second contact in the same mode. Multivariable logistic regression analysis was used to assess whether our mixed-mode design affected sample representation across data collection phases, including all sociodemographic and health-related variables. Three regression models were analyzed, considering three outcomes: (1) probability of responding to the first mailing (versus second mailing or phone or nonresponse), (2) probability of responding to either mailing (versus phone or nonresponse), and (3) probability of any response (versus nonresponse). Odds ratios (adjusted for all predictors included in the model) and 95 percent confidence intervals were estimated. All analyses were performed using SAS v. 9.1 software. p-values less than 0.05 were considered statistically significant.
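The three regression outcomes are nested, which is easiest to see when they are coded from a single response-phase label. The following is a minimal sketch in Python rather than the SAS code used in the study; the phase labels, the dictionary keys, and the function name `code_outcomes` are hypothetical and do not come from the study.

```python
# Hypothetical phase labels; the study does not publish its variable names.
PHASES = ("mail1", "mail2", "phone", "nonresponse")


def code_outcomes(phase):
    """Return the three nested binary outcomes for one sample member."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    return {
        # Model 1: first mailing versus everything else
        "model1_first_mailing": int(phase == "mail1"),
        # Model 2: either mailing versus phone or nonresponse
        "model2_either_mailing": int(phase in ("mail1", "mail2")),
        # Model 3: any response versus nonresponse
        "model3_any_response": int(phase != "nonresponse"),
    }


for phase in PHASES:
    print(phase, code_outcomes(phase))
```

Because each outcome adds the respondents from one more data collection phase, comparing coefficients across the three models shows how the composition of the responding sample shifts as phases accumulate.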
Table 1 compares respondents reached after the first mailing, after the second mailing, and via telephone (first three columns of Table 1) to the population. Demographically, men are under-represented in the first mail contact, with 48.9 percent being male compared to 52.6 percent of the population. Older people, particularly those over 65, and white individuals are over-represented in the first mail contact.
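To illustrate the kind of chi-square goodness-of-fit comparison described in the methods, the sketch below tests a two-category variable (male versus female) against fixed population proportions. It is written in plain Python and is not the study's SAS analysis. The respondent counts (782 men and 818 women, about 48.9 percent male) are hypothetical, chosen only to approximate the percentage reported above; the actual number of first-mailing respondents is not given in this passage. The closed-form p-value used here is valid only for one degree of freedom.

```python
import math


def chi_square_gof(observed, expected_props):
    """Pearson chi-square statistic for observed counts versus fixed proportions."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts (male, female) among first-mailing respondents.
observed = [782, 818]
# Population proportions treated as fixed: 52.6 percent male in the frame.
population = [0.526, 0.474]

stat = chi_square_gof(observed, population)
p = p_value_df1(stat)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

Treating the frame distribution as fixed mirrors the study's choice to use the total eligible sample as the population estimate, rather than comparing two independent samples.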
With respect to health status, individuals with a severity-weighted Charlson score of two or more are over-represented by about 12 percent in the first mail contact. For most of the measured health conditions, the sample reached by mail (either contact) closely matched the population, with the exception of other cancer types where the sample responding to the first mailing was significantly more likely to have cancer (15 percent) compared to the population (11.8 percent), an over-representation of approximately 27 percent. The telephone mode brought in respondents who were less representative of the population with respect to some of the other health conditions. That is, telephone respondents were less likely to have congestive heart failure, cerebrovascular disease, moderate/severe renal disease, and other cancer than was the total eligible sample. With respect to office visits and procedures, early respondents were heavier utilizers than the population. The same was true (but to a lesser degree) for those responding to the second mailing with respect to office visits. The sample obtained with the second mail survey was less likely to have a hospital admission than the population. Finally, the sample obtained after the first mailing significantly under-represented individuals who had used the ER.
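The over-representation figures quoted above are relative differences, not percentage-point differences. A short Python sketch of the arithmetic follows; the function name `relative_representation` is an illustrative assumption, and the inputs are the two percentages reported in the text.

```python
def relative_representation(respondent_pct, population_pct):
    """Relative over-representation (+) or under-representation (-), in percent."""
    return (respondent_pct - population_pct) / population_pct * 100.0


# Other cancer types: 15 percent of first-mailing respondents
# versus 11.8 percent of the population.
cancer = relative_representation(15.0, 11.8)
print(f"other cancer: {cancer:+.1f} percent")
```

The absolute gap here is 3.2 percentage points, which corresponds to the relative over-representation of approximately 27 percent given in the text.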
Table 2 provides the results of the multivariable logistic regression analyses that included all sociodemographic and health-related variables. The first of the three regression analyses (Model 1) shows the likelihood of response to the first mail contact compared to not responding to that initial contact, revealing biases in the sample gathered from the first mail contact. Adjusting for selected demographics, health status and utilization, older adults (50–65 and 65+) are more likely to respond to the first mail contact than are 18- to <35-year-olds. White individuals are more likely to respond as are those with three or more office visits. Individuals with one or more ER visits are less likely to respond. Most important, the results indicate younger people, those from minority cultural groups, ER users, and those who have fewer doctor visits would have been under-represented if estimates had been based only on respondents to the initial mailing.
With a few exceptions, the above biases persist after considering the sample characteristics following the second mailing of the survey (Model 2), and after additional respondents completed the survey by phone (Model 3). Adding a second mailing and a phone mode did not measurably reduce the biases that were observed in the mail sample; however, it does appear to reduce, but not eliminate, the over-representation of older persons that was observed in the first mailing. The over-representation of high-utilizers of clinics and low users of the ER that was observed after the first mail contact remains substantially unchanged after the second mail attempt and after the phone mode.
Interestingly, individuals with congestive heart failure were less likely to be a respondent once the mode switched to telephone (odds ratio [OR] = 0.61, p = 0.001), indicating that the third contact resulted in a respondent population that may be less representative of the underlying population. Of note, in a similar set of models that used the severity-weighted Charlson score as a predictor of response status instead of individual diseases, all the demographic and utilization relationships remained the same (data not shown). Across all three models, however, individuals with a weighted Charlson score of two or more were less likely to be respondents.
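For readers less familiar with logistic regression output, the sketch below shows how an adjusted odds ratio and a Wald 95 percent confidence interval are typically obtained from a model coefficient. The coefficient is back-calculated from the odds ratio of 0.61 reported above; the standard error of 0.15 is purely hypothetical (it is not reported in this passage), so the printed interval is illustrative and should not be read as a study result.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence interval from a logit coefficient."""
    return (
        math.exp(beta),           # point estimate
        math.exp(beta - z * se),  # lower bound
        math.exp(beta + z * se),  # upper bound
    )


beta = math.log(0.61)   # coefficient implied by the reported OR
SE_HYPOTHETICAL = 0.15  # assumed value for illustration only

or_, lower, upper = odds_ratio_ci(beta, SE_HYPOTHETICAL)
print(f"OR = {or_:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

An odds ratio below 1 whose interval excludes 1 indicates lower odds of response for the group in question, holding the other predictors in the model constant.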
There is ample evidence that attaining high levels of survey participation is increasingly difficult (Hox and de Leeuw 1994; Hartge 1999; Steeh et al. 2001; Tickle et al. 2003; Curtin, Presser, and Singer 2005; Morton, Cahill, and Hartge 2006; Berk, Schur, and Feldman 2007) and that deployment of a mixed-mode data collection protocol can be an effective way of increasing survey response rates (Gallagher, Fowler, and Stringfellow 2000; Griffin and Obenski 2002; Beebe et al. 2005). However, emerging evidence suggests that a low response rate does not necessarily portend major study bias (Groves 2006; Groves and Peytcheva 2008), and there is little evidence that mixing modes minimizes the latter. In our general population survey with an overall response rate of 51.2 percent, contrary to expectations, we found that switching modes from a mail survey to a telephone interview did not uniformly increase the representativeness of the responding sample. Indeed, we found evidence that switching modes may make the sample less representative of the population in terms of at least one clinical variable. Incidentally, we also found that a second contact in the same mode did not increase sample representativeness either.
Our finding that switching mode did not increase the representation of the final sample runs counter to the few studies investigating this issue. In the two studies most similar to our study with respect to order of contact, this approach yielded a more representative sample, although only one study had health and health care utilization for nonrespondents (Gallagher, Fowler, and Stringfellow 1999; Fowler et al. 2002). However, the populations from neither study were representative of the general population and, as such, may be more attuned to the nuances of data collection strategies and more susceptible to the deployment of specific modes. Tacit support for this notion is supplied by the juxtaposition of two studies deploying a mixed-mode design representing the converse of ours: initial telephone contact followed by another mode (e.g., mail, web). Whereas switching to a mailed survey after a telephone interview reached a segment of the population quite different from the segment that would have been reached through telephone alone among adult patients enrolled in a trial to promote treatment for relapsed smokers at five Veteran's Administration Centers (Baines et al. 2007), a similar effect was not seen in a similarly designed general population survey of close to 9,000 households, albeit in an area unrelated to health (Dillman et al. 2009a).
For general populations, switching modes may be more akin to a multiple attempt strategy, perceived only as an increased effort on our part to enlist cooperation, rather than the introduction of a new method, per se. As such, our results are more aligned with the literature investigating the effects of multiple attempts on response rates (Keeter et al. 2000; Davern et al. 2010). Additional measures to enlist participation, such as multiple contacts and/or switching modes, may actually bring in respondents for whom the topic is less salient, leading to an under-representation of those who are less healthy and higher utilizers. This interpretation is consistent with Leverage-Salience Theory proposed by Groves and colleagues (Groves, Singer, and Corning 2000; Groves, Presser, and Dipko 2004), which posits that survey features, such as mode, could have variable leverage for different types of sample members and that switching modes may make a given survey more or less salient for certain types of people, thus increasing or decreasing participation. Regardless of the cause, it appears that use of a mixed-mode approach does not represent a wholesale good when considering use among general population samples, particularly if the topic of the survey pertains to health.
In considering our findings, we note potentially important limitations. Our data may not be generalizable to the U.S. population because the racial composition of the population is predominantly white; the prevalence of clinical disease status may vary by ethnicity, but at a minimum our data are probably generalizable to the U.S. white population. Additionally, our study relied on the medical chart to determine disease status and utilization, which may be subject to underreporting of mild symptoms or disease status. However, we assume that more severe symptoms or disease conditions would have been charted and that utilization history was accurately characterized as payment is based on such documentation. Finally, this relatively health-literate population has been heavily surveyed and lives in close proximity to a well-known medical center with close community ties that may have reduced nonresponse bias; the results may not apply in all other U.S. population-based studies.
Survey researchers usually work with fixed resources and are faced with difficult choices of how to allocate efforts to maximize study goals. The choice to use multiple modes of data collection is increasingly popular because it is assumed to serve multiple goals. First, starting with a relatively inexpensive mode such as mail allows one to reach a substantial proportion of the sample at relatively low costs. Second, multiple modes typically are effective at reaching the goal of achieving higher response rates. The research presented here, however, suggests that it is overly simplistic to assume that reaching higher response rates in itself is consistent with a goal of reduced bias. Finally, sample size is also an important goal of survey research, especially when it comes to providing precise estimates for small subpopulations. Balancing the competing goals of survey research will always prove difficult, but further study of which types of designs actually reduce nonresponse bias is essential for informed decisions about how to allocate efforts.
Joint Acknowledgment/Disclosure Statement: Supported by funds from the National Cancer Institute (R03 CA132974; PI: Beebe) and the Mayo Clinic Foundation for Education and Research. The study was made possible by the Rochester Epidemiology Project (R01 AG034676 from the National Institute on Aging; PI: Rocca).
Additional supporting information may be found in the online version of this article:
Appendix SA1: Author Matrix.