Personal network studies are particularly burdensome for respondents. In this paper, we investigated how randomly sampling alters to reduce respondent burden affects the behavior of four structural measures. We also assessed the range of the total error incurred when these measures are computed from a sample of alters, and showed how this error varies as a function of the number of alters sampled and the amount of time saved.
We provide researchers with a figure illustrating the amount of error they should expect to incur when sampling alters, together with the amount of time saved. This figure offers guidance for making an informed decision on the number of alters to sample when respondent burden must be reduced. The main limitation of the figure is that the likely total error was derived from a sample of 28 networks of homeless women. This sample might not be representative of the networks of other populations, although these 28 networks show a wide range of values for the four structural measures considered.
We think that sampling alters is an effective way of reducing respondent burden. While the focus of this paper was on structural measures, sampling a smaller set of alters can also reduce respondent burden in the network composition phase of the interview. The time savings, in terms of a shorter interview, can actually be more substantial for the composition phase than for the structure phase. However, since most (if not all) composition measures are either means or percentages, such as the alters' mean age or the percentage of family members in the network, their statistical properties are well known. Composition variables computed on a random sample are unbiased; hence the only source of error is their variability. In the study that motivated this paper we reduced respondent burden in both the composition and the structure sections. Only a small set of the composition questions (4 questions) was asked about all 20 alters, while the bulk of these questions (14 questions) was asked only about the 12 sampled alters. Respondents took on average 5 seconds per composition question; this means that the second part of the composition phase took on average 14 minutes (14 questions × 12 alters × 5 seconds) instead of the 23.3 minutes it would have taken for all 20 alters. The total time saving per respondent (composition and structure phases combined) obtained by sampling 12 of the 20 named alters therefore amounts to about 20 minutes, a substantial reduction of the overall interview time.
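The composition-phase arithmetic above can be reproduced with a short calculation. The sketch below uses only the figures quoted in the text (5 seconds per question, 14 questions, 20 named and 12 sampled alters). The pair counts at the end assume that the structure phase asks one tie question per unordered pair of alters, which is an assumption made for illustration and is not stated in this section.

```python
# Worked arithmetic for the composition-phase figures quoted in the text.
SECONDS_PER_QUESTION = 5      # average response time per composition question
SAMPLED_ONLY_QUESTIONS = 14   # composition questions asked only about sampled alters
ALTERS_NAMED = 20
ALTERS_SAMPLED = 12


def phase_minutes(n_questions: int, n_alters: int, seconds_per_question: float) -> float:
    """Minutes needed to ask n_questions about each of n_alters."""
    return n_questions * n_alters * seconds_per_question / 60.0


def n_pairs(n_alters: int) -> int:
    """Number of unordered alter pairs (assumed: one tie question per pair)."""
    return n_alters * (n_alters - 1) // 2


minutes_all = phase_minutes(SAMPLED_ONLY_QUESTIONS, ALTERS_NAMED, SECONDS_PER_QUESTION)
minutes_sampled = phase_minutes(SAMPLED_ONLY_QUESTIONS, ALTERS_SAMPLED, SECONDS_PER_QUESTION)
composition_saving = minutes_all - minutes_sampled

print(f"all 20 alters: {minutes_all:.1f} min")         # 23.3 min
print(f"12 sampled:    {minutes_sampled:.1f} min")     # 14.0 min
print(f"saving:        {composition_saving:.1f} min")  # 9.3 min
print(n_pairs(ALTERS_NAMED), n_pairs(ALTERS_SAMPLED))  # 190 66
```

Under the one-question-per-pair assumption, the number of tie questions drops from 190 to 66, which is why the structure phase also shortens noticeably when only 12 alters are retained.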
In that study we went a step further and took a stratified sample of 12 alters. More specifically, the 20 named alters were grouped into two strata: sex partners and non-sex partners. Since one of the major goals of the study was to analyze the relationship between risky sexual behaviors (such as unprotected sex) and social network characteristics, it was important that the sample of 12 alters include some of the sex partners the woman named.
Stratified sampling is one solution when researchers need to collect information on specific types of alters and also need to reduce respondent burden. The only drawback is that if alters from different strata are sampled at different rates, the sample measures need to be weighted (Lehtonen and Pahkinen 1994).
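To illustrate why weighting is needed, the following minimal sketch draws a stratified sample of alters and computes a weighted estimate of a composition measure. The function names, the 4/16 split between strata, and the per-stratum sample sizes are hypothetical choices for illustration; they are not the design or the data of the study. Each stratum mean is weighted by that stratum's share of the named alters, which is the standard stratified estimator of a mean.

```python
import random


def stratified_sample(alters, stratum_of, n_per_stratum, rng):
    """Simple random sample without replacement within each stratum.

    Returns the sampled alters per stratum and the full stratum sizes.
    Assumes n_per_stratum requests at least one alter per non-empty stratum.
    """
    by_stratum = {}
    for a in alters:
        by_stratum.setdefault(stratum_of[a], []).append(a)
    sample = {h: rng.sample(members, min(n_per_stratum[h], len(members)))
              for h, members in by_stratum.items()}
    sizes = {h: len(members) for h, members in by_stratum.items()}
    return sample, sizes


def weighted_mean(sample, sizes, value_of):
    """Stratified estimate: stratum means weighted by stratum share N_h / N."""
    total = sum(sizes.values())
    return sum(sizes[h] / total * sum(value_of[a] for a in members) / len(members)
               for h, members in sample.items())


# Hypothetical respondent: 20 alters, the first 4 of whom are sex partners.
alters = list(range(20))
stratum_of = {a: "sex partner" if a < 4 else "other" for a in alters}
is_sex_partner = {a: 1.0 if a < 4 else 0.0 for a in alters}

rng = random.Random(0)
sample, sizes = stratified_sample(alters, stratum_of,
                                  {"sex partner": 4, "other": 8}, rng)

sampled = [a for members in sample.values() for a in members]
unweighted = sum(is_sex_partner[a] for a in sampled) / len(sampled)
weighted = weighted_mean(sample, sizes, is_sex_partner)

print(f"unweighted share of sex partners: {unweighted:.3f}")  # 0.333
print(f"weighted share of sex partners:   {weighted:.3f}")    # 0.200
```

Because sex partners are sampled at a rate of 4/4 and the other alters at 8/16, the unweighted share (4/12) overstates the true share (4/20); the weighted estimate recovers it.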
A stratified sample represents a potential solution to the disadvantage of randomly sampling alters noted by McCarty et al. (2007): since the selection of the alters is random, it is likely that key alters are not sampled, and therefore the sample measures might differ significantly from the true structural measures. In a study like ours, with 445 cases, poor estimation of the true structural measures for a few respondents is likely to have very little consequence for the analysis results. It is unlikely, for example, that the correlation between drug use and network density is affected if the network density is poorly estimated for a few respondents.
While this paper focuses on measuring the quality of estimates of personal network structural measures, these estimates are rarely the end product of the analysis. More commonly, these network features act as explanatory variables in regression models. Including covariates measured with error, as would be the case with sample structural measures, results in their regression coefficients being biased toward 0. Methods for adjusting for this “attenuation bias” (Frost and Thompson, 2000) use the estimated variance of the covariate to project what the coefficient would be if the true structural measure had been used. Explanatory variables measured with bias are more problematic. Future work should assess the impact of estimation error in the sample structural measures on estimated regression coefficients and develop methods for adjusting regression coefficients for bias and variance in the estimated structural measures.
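The attenuation effect and one common style of correction can be sketched with a small simulation. This is an illustration under the classical measurement-error model (mean-zero error, independent of the true value, with known error variance), not necessarily the specific adjustment of Frost and Thompson (2000). All values are simulated and hypothetical: the true slope, the error standard deviation, and the range of the covariate are arbitrary choices. The sketch applies only to an unbiased covariate; it does not address covariates measured with bias.

```python
import random


def variance(x):
    """Sample variance (n - 1 denominator)."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / (len(x) - 1)


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)
n = 5000            # simulated respondents (hypothetical)
true_slope = 2.0    # hypothetical effect of the true structural measure
error_sd = 0.1      # assumed known s.d. of the sampling error in the covariate

x_true = [rng.uniform(0.0, 0.6) for _ in range(n)]          # e.g. true density
y = [true_slope * x + rng.gauss(0.0, 0.2) for x in x_true]  # outcome
x_obs = [x + rng.gauss(0.0, error_sd) for x in x_true]      # density with error

naive = ols_slope(x_obs, y)

# Reliability ratio: share of the observed variance due to the true covariate.
reliability = (variance(x_obs) - error_sd ** 2) / variance(x_obs)
corrected = naive / reliability

print(f"naive slope:     {naive:.2f}")
print(f"reliability:     {reliability:.2f}")
print(f"corrected slope: {corrected:.2f}")
```

In this setup the naive slope is pulled toward 0 by roughly the reliability ratio, and dividing by that ratio moves the estimate back toward the slope used to generate the data.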
This paper showed that if researchers are interested in measuring network density, respondent burden can be reduced substantially, since the overall estimation error is low even for relatively small samples. For the other three sample measures the overall estimation error tends to be higher, particularly for the percentage of isolates. Researchers therefore need to decide how much error they are willing to tolerate in order to reduce respondent burden. The fact that three of the four sample structural measures considered are biased suggests that sample measures might not be the best estimators of the true structural measures. Future research should investigate alternative estimators that eliminate or reduce the bias and that ultimately exhibit a smaller overall estimation error. The sample structural measures nevertheless have the advantage of requiring hardly any computation time; the use of alternative estimators is therefore warranted only if the reduction in estimation error outweighs the increase in computing time and complexity.
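The contrast between density and the percentage of isolates can be illustrated with a small simulation. The sketch below uses a hypothetical 20-alter network (a 10-alter ring, a 4-alter clique, and 6 isolates) rather than any of the 28 networks analyzed in this paper, and the function names are illustrative. It repeatedly draws 12 alters at random and compares the average sample measure with the value computed on the full network.

```python
import random


def density(nodes, ties):
    """Proportion of possible ties present among the given alters."""
    node_set = set(nodes)
    n = len(node_set)
    if n < 2:
        return 0.0
    present = sum(1 for u, v in ties if u in node_set and v in node_set)
    return present / (n * (n - 1) / 2)


def pct_isolates(nodes, ties):
    """Percentage of the given alters with no tie to any other given alter."""
    node_set = set(nodes)
    connected = set()
    for u, v in ties:
        if u in node_set and v in node_set:
            connected.update((u, v))
    return 100.0 * (len(node_set) - len(connected)) / len(node_set)


# Hypothetical network: ring on alters 0-9, clique on 10-13, isolates 14-19.
ALTERS = list(range(20))
ring = [(i, (i + 1) % 10) for i in range(10)]
clique = [(u, v) for u in range(10, 14) for v in range(u + 1, 14)]
TIES = ring + clique

true_density = density(ALTERS, TIES)
true_isolates = pct_isolates(ALTERS, TIES)

rng = random.Random(1)
draws = [rng.sample(ALTERS, 12) for _ in range(2000)]
mean_density = sum(density(s, TIES) for s in draws) / len(draws)
mean_isolates = sum(pct_isolates(s, TIES) for s in draws) / len(draws)

print(f"density:    true {true_density:.3f}, mean of samples {mean_density:.3f}")
print(f"% isolates: true {true_isolates:.1f}, mean of samples {mean_isolates:.1f}")
```

Under simple random sampling every pair of alters is equally likely to be retained, so the sample density averages out to the true density. The percentage of isolates, in contrast, can only be inflated by sampling: a true isolate remains an isolate in any sample, while a connected alter becomes an isolate whenever none of its neighbors is sampled.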