Assessment of Recruitment Bias
To better understand the sources and relative contributions of recruitment bias, we examined discrepancies between network compositions of those recruited and those who refused coupons; we explored the relationship between recruiters and recruits to determine whether recruits were known to recruiters, an important requirement of RDS; we assessed the samples for possible homophily bias; we conducted a sensitivity analysis to test the assumption of random recruitment; and we examined points at which samples reached equilibrium.
Recruited Social Network Composition Versus Composition of Those Who Refused Coupons
At the Raleigh–Durham site, the racial/ethnic composition of those who refused coupons did not differ significantly (χ², p > 0.05) from the composition of those recruited. In Los Angeles, the percentage of coupon refusers who were Hispanic (21%) differed significantly from the percentage of recruits who were Hispanic (18%) (χ², p < 0.05). In Chicago, the percentage of coupon refusers who were black (59%) differed significantly from the percentage of recruits who were black (50%) (χ², p < 0.05), and the percentage of refusers who were white (28%) differed significantly from the percentage of recruits who were white (38%) (χ², p < 0.05).
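These site-level comparisons can be sketched with a Pearson chi-square test of independence on a refusers-versus-recruits contingency table. The counts below are illustrative placeholders, not study data, and the helper is written in plain Python for self-containment.

```python
def chi_square(observed):
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand
            stat += (obs - exp) ** 2 / exp
    return stat

# rows: coupon refusers, recruits; columns: black, Hispanic, white
# (made-up counts for illustration only)
table = [[118, 42, 56],
         [250, 90, 190]]
stat = chi_square(table)
df = (len(table) - 1) * (len(table[0]) - 1)
print(f"chi-square = {stat:.2f} on {df} df")
# compare the statistic to the critical value 5.99 (df = 2, alpha = 0.05)
```

A statistic above the critical value would indicate that refusers and recruits differ in racial/ethnic composition, as reported for Los Angeles and Chicago.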
An assumption of RDS is that there is a reciprocal relationship between recruiters and recruits;25
that is, recruiters are instructed to recruit people “they know” who are members of their MSM or DU networks. To test this assumption, we asked recruits how they would describe their relationship with their recruiter: friend, acquaintance, or stranger. Most participants at each site described their recruiter as a friend or acquaintance (Table ). There was, however, a small percentage of participants at each site who said their recruiter was a stranger, indicating some breakdown in study procedures and, in those cases, a violation of the reciprocity assumption.
The homophily index provides information about the tendency among study participants to recruit others with characteristics like or unlike their own.25
Random mixing produces homophily values that approximate zero. A homophily level can be said to approximate random mixing statistically if the confidence interval (CI) for homophily includes zero. When the question at issue is the extent to which homophily affects affiliation patterns, no hard line of demarcation defines low homophily, but a figure of 0.4 is commonly used; at that level, a majority of affiliations (i.e., 60%) are formed in a manner independent of homophily.36
Tables , , , and show the homophily scores by site for, respectively, risk group, gender, race/ethnicity, and HIV status. A score of 0 on a homophily index indicates equivalent recruitment across groups, a score of 1 indicates exclusive in-group recruitment, and a score of −1 indicates exclusive out-group recruitment. When viewing homophily scores along the diagonal in the tables, a score of 1 indicates that individuals recruit only others like themselves; a score of −1 indicates that they recruit only others unlike themselves. Homophily index values of greater than ±0.7 are thought to be problematic in RDS studies because design effects are estimated at 4 or more, with steep increases in design effect as homophily values rise.55
A homophily value of +0.7 may be interpreted as indicating that recruitment was consistent with random mixing 30% of the time and was in-group 70% of the time. Increases in design effect require larger sample sizes for point prevalence estimates across groups, and high homophily values require additional recruitment waves to overcome population segmentation.
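The interpretation above (a value of +0.7 meaning in-group recruitment 70% of the time and random mixing the remaining 30%) can be expressed as a small helper. The function name and mixing rule are our paraphrase of this interpretation, not a formula quoted from the paper.

```python
def in_group_probability(homophily, group_proportion):
    """Probability a recruit comes from the recruiter's own group under
    the mixing interpretation above: with probability |h| recruitment is
    purely in-group (h > 0) or purely out-group (h < 0); with probability
    1 - |h| it is random mixing at the group's population proportion."""
    if homophily >= 0:
        return homophily + (1 - homophily) * group_proportion
    return (1 + homophily) * group_proportion

# h = +0.7 in a group forming half the population:
# in-group 0.7 of the time, plus (1 - 0.7) * 0.5 from random mixing
print(in_group_probability(0.7, 0.5))
```

At the extremes, h = 1 yields exclusive in-group recruitment and h = −1 yields none, matching the index definitions in the text.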
Risk group homophily by site
Race/ethnicity homophily by site
The results for the risk groups (Table ) indicate that DU had a tendency to recruit other DU and not to recruit non-DU MSM. Similarly, non-DU MSM tended to recruit other non-DU MSM and not DU. In Chicago, however, homophily values were quite large, with DU demonstrating out-group recruitment of MSM 78% of the time and MSM demonstrating out-group recruitment of DU 67% of the time. In St. Petersburg, homophily values were even larger, with DU demonstrating in-group recruitment of other DU 74% of the time and out-group recruitment of MSM 93% of the time. MSM in St. Petersburg demonstrated in-group recruitment of other MSM 84% of the time and out-group recruitment of DU 97% of the time. Of particular interest, the MSM/DU group at all sites demonstrated lower homophily values and a much greater tendency toward random mixing than either DU or MSM. These values indicate an important potential role for MSM/DU as a bridge between the two risk groups in Chicago and St. Petersburg.
A review of the homophily values for gender (Table ) indicates that females at all of the US sites demonstrated a slight tendency to recruit other females and not to recruit males. The homophily values for males demonstrated recruitment patterns that were generally consistent with random mixing.
The racial/ethnic recruitment patterns (Table ) are more complex. In general, black participants at the US sites demonstrated a slight tendency to recruit other blacks and not to recruit Hispanics or whites. This tendency was more pronounced in Chicago where black participants demonstrated in-group recruitment of other black participants approximately 71% of the time and out-group recruitment of Hispanic participants 81% of the time. At the same time, Hispanic participants demonstrated out-group recruitment of black participants 77% of the time. Other recruitment combinations in Chicago did not demonstrate the same level of segmentation (e.g., white participants’ recruitment of Hispanic participants was consistent with random mixing almost 90% of the time).
With respect to HIV status (Table ), in Los Angeles, HIV-negative participants tended to recruit persons who were also HIV-negative (homophily = 0.66) and not to recruit persons who were HIV-positive (homophily = −0.66). HIV-positive participants behaved similarly: they tended to recruit persons who were HIV-positive (homophily = 0.63) and not persons who were HIV-negative (homophily = −0.63).
Evaluating the Random Recruitment Assumption
An analytic assumption of the RDS method holds that respondents recruit randomly from their personal networks. One means of testing this assumption is to compare the composition of personal networks as revealed by self-reports with actual recruitment behavior. If the assumption of random recruitment is satisfied and the self-reports are accurate, the two should differ only due to stochastic variation.
The plausibility of this assumption depends importantly on research design. For example, the location of the interview site can be significant, as was shown in a Bridgeport, Connecticut study of IDUs.55
Sampling in an interview site located in a black neighborhood yielded only blacks, though respondents also reported knowing Hispanic IDUs. Consequently, the random recruitment assumption was violated; that is, recruitment behavior (i.e., all blacks) did not coincide with self-reported network composition (i.e., a mix of blacks and Hispanics). The solution to this problem involved moving the interview site to neutral ground, the downtown area within which no ethnic group predominated. Recruitment then yielded a mix of blacks and Hispanics in a manner more consistent with the self-reports. Similarly, the plausibility of the random recruitment assumption can be affected if the times during which interviews are conducted exclude some subjects, if the incentives are not salient for some groups of respondents, and if the remoteness of the interview site limits access. More generally, the random recruitment assumption is only plausible if members of the target population have reasonably easy and comfortable access to the interview site along with appropriate incentives to take advantage of that access.
Testing the random recruitment assumption by comparing network composition as revealed by self-reports and by recruitment behavior has an inherent limitation. Comparisons are possible for visible personal characteristics, such as gender, race/ethnicity, or homelessness. However, they are not possible for characteristics that are not public knowledge, such as HIV status or frequency of HIV risk behavior. If the assumption is satisfied with respect to characteristics that are public knowledge, it becomes more plausible for other characteristics as well.
This section reports a comparison based on gender for both phases of the SATH-CAP study. This characteristic was selected because, with rare exceptions, it is an unambiguous matter of public knowledge. In contrast, a characteristic such as race/ethnicity is increasingly ambiguous for respondents of diverse racial/ethnic background. Furthermore, whereas gender is meaningful at all four sites, race/ethnicity would not be meaningful in St. Petersburg, Russia, where the population is racially/ethnically homogeneous. Analysis was limited to recruitment of drug users, a group that encompassed most respondents at all sites, because non-drug-using MSM and MSMW are by definition male. Therefore, non-drug users and respondents whose drug use status was unknown were eliminated from the analysis.
The central issue for the comparison is how much bias might have been introduced into the RDS population estimate by violations of the assumption. To measure this potential bias, a two-step procedure was employed. First, the population estimate was calculated in the usual manner for an RDS analysis. This involved calculating the population estimate based on two terms, proportional cross-group recruitment (e.g., the proportion of males who recruit females) and the estimated network size for each group. This yields the RDS population estimate.
The second step is to calculate cross-group recruitment based on network self-reports. This reveals what cross-group recruitment would have been had respondents recruited in a manner consistent with their self-reports. For example, respondents were asked how many male drug users and how many female drug users they knew. For male respondents, self-report-based cross-gender recruitment of drug users is then the number of female drug users known divided by the sum of the male and female drug users known. This self-report-based cross-group recruitment proportion was entered into the RDS estimator equation to calculate what the population estimate would have been had actual recruitment patterns exactly coincided with the self-reports.
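The two-step procedure can be sketched with Heckathorn's two-group reciprocity-based estimator. The recruitment and self-report proportions below echo the Los Angeles pattern described in this section, but the mean network sizes are invented for illustration; this is a sketch, not the study's exact computation.

```python
def rds_estimate(c_mf, c_fm, d_m, d_f):
    """Two-group RDS population estimate of the female proportion.

    c_mf: proportion of males' recruits (or known peers) who are female
    c_fm: proportion of females' recruits (or known peers) who are male
    d_m, d_f: mean self-reported network sizes for males and females
    (Heckathorn's reciprocity-based estimator, sketched for two groups.)
    """
    return (c_mf * d_m) / (c_mf * d_m + c_fm * d_f)

# Step 1: estimate from actual recruitment behavior
p_f_recruit = rds_estimate(c_mf=0.06, c_fm=0.545, d_m=12.0, d_f=12.0)

# Step 2: the same estimator, with cross-group proportions implied by
# self-reported network composition substituted for recruitment behavior
p_f_self = rds_estimate(c_mf=0.242, c_fm=0.652, d_m=12.0, d_f=12.0)

# the gap between the two estimates indicates the potential bias from
# violations of the random recruitment assumption
print(f"recruitment-based: {p_f_recruit:.3f}, self-report-based: {p_f_self:.3f}")
```

When recruitment behavior matches the self-reports exactly, the two estimates coincide and the implied bias is zero.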
The difference between the RDS population estimator and the self-report-based population estimator indicates the potential bias resulting from violations of the random recruitment assumption. Figure shows the outcome of these analyses where, for each of the four study sites, the bar with the diagonal lines is the RDS estimate, the bar with the dots is the self-report estimate, and the error bars correspond to 95% confidence intervals.
Analysis of random recruitment by gender.
The estimates for the Raleigh–Durham and St. Petersburg sites are convergent, with each estimate lying within the other’s confidence intervals. This correspondence provides support for the random recruitment assumption. In contrast, the Chicago site estimates are more divergent, though the confidence intervals are adjacent, so the difference is not statistically significant.
Finally, the Los Angeles estimates are divergent, with no overlap in the confidence intervals, so the difference is statistically significant. An examination of the origins of this discrepancy reveals that it lies principally in the recruitment patterns of males. Although males reported that 24.2% of the drug users they knew were female, only 6% of their recruits were female; they thus appear to have massively under-recruited females. Recruitment by females showed a similar, though smaller, divergence between actual recruitment and self-report: females reported that 65.2% of the drug users they knew were male, but only 54.5% of their recruits were male. Consequently, using self-reports as a baseline, each gender under-recruited members of the opposite gender.
No definitive explanation of the Los Angeles site’s results is possible because neither self-reports nor recruitment behavior constitutes a gold standard for measuring social network composition. As a result, when discrepancies arise, it is not clear which measure is at fault. Alternative arguments are possible. On the one hand, comparisons of the accuracy of self-report network indicators have found reports of network size to be among the more reliable and valid.23
On the other hand, behavioral indicators such as actual recruitment behavior are frequently viewed as more reliable and valid than self-report-based data. The validity of self-reports can be assessed, at least in part, by examining their internal consistency and plausibility. In the Los Angeles data set, internal consistency was imperfect: 74.3% of respondents (710/956) reported knowing a number of drug users that did not equal the sum of the male and female drug users they reported knowing (e.g., one respondent reported knowing 70 drug users, of whom three were male and three were female). Also, some self-reports appeared implausible (e.g., 31.7% of drug users (303/956) reported not knowing any other drug users), and 2% of respondents (19/956) failed to answer the question. Overall, 82.8% of respondents’ self-reports (792/956) were in some respect problematic. Unfortunately, no comparable method for assessing the internal consistency of recruitment patterns is available; however, an indirect measure is available for assessing the relative role
of problematic self-reports versus recruitment patterns in producing discrepant estimates. If the discrepancy between the two types of estimates results from errors in self-reports, a positive relationship would be expected between the percentage of problematic self-reports and the discrepancy between the estimates, and the two types of estimates would become convergent when problematic self-reports are rare. Evidence for this pattern emerges from a cross-site comparison of problematic self-reports. In the site where the estimates are most divergent, problematic self-reports are more common, 82.8%, than in the sites where estimates are more convergent, 9.4%, 10%, and 17.2%, for Chicago, Raleigh–Durham, and St. Petersburg, respectively. This pattern suggests that errors in self-reports may be contributors to the discrepancies.
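The consistency and plausibility checks described above can be operationalized as a simple classifier. The category labels and example records below are hypothetical illustrations of the rules, not the study's coding scheme.

```python
def classify_self_report(total_known, male_known, female_known):
    """Flag a network self-report as problematic per the checks above:
    missing answer, implausibility (a drug user reporting no other drug
    users), or internal inconsistency (total != male + female)."""
    if total_known is None:
        return "missing"
    if total_known == 0:
        return "implausible"
    if male_known + female_known != total_known:
        return "inconsistent"
    return "ok"

# Illustrative records: (total known, male known, female known)
reports = [(70, 3, 3), (0, 0, 0), (None, 0, 0), (12, 7, 5)]
flags = [classify_self_report(*r) for r in reports]
problematic = sum(f != "ok" for f in flags) / len(flags)
print(flags, f"{problematic:.0%} problematic")
```

Applying such a rule per site yields the percentages of problematic self-reports compared across sites in the text.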
Several explanations of the discrepancy that reflect either discrepancies between network composition and recruitment behavior or errors in the self-reports are possible. First, the discrepancy could reflect greater social cohesion within as compared to across genders, so members of each gender are more effective at recruiting from their own group, thereby inducing a discrepancy between recruitment and network composition. Should this prove to be the case, then RDS population estimates could potentially be improved if the network size question is reframed to focus only on close associates and exclude distant acquaintances. However, it is not clear why this pattern appeared at sites other than LA for males, but not for females. Another possibility is that cross-gender associations are differentially valued, so self-reports could have been distorted by a social acceptability bias in which members of each gender exaggerate their number of cross-gender peers. Should the social desirability explanation prove to be the case, the validity of network self-reports for assessing the random recruitment assumption could be refined based on measures of social desirability, and this might differentiate the LA case from the other cases.
In sum, evidence consistent with the random recruitment assumption was found at two sites (Raleigh–Durham and St. Petersburg), and discrepant evidence was found at the Los Angeles site. However, interpretation of the discrepancy is difficult given that the origin of the disparity (either errors in self-reports or divergence between recruitment behavior and network composition) remains unclear. Clearly, more research is needed both on recruitment behavior in RDS and on how respondents formulate their descriptions of network composition. Such research might lead to ways to structure the recruitment relationship more effectively, so that the random recruitment assumption is satisfied, and to more valid and reliable means of eliciting self-reports of network composition. A useful place to begin would be cases where the discrepancy is especially large, e.g., a closer study both of recruitment by gender among male respondents in LA and of the process by which this group reported its network composition.
Sample Equilibrium
Sample equilibrium in RDS is the point in the recruitment process at which the sample stabilizes, i.e., becomes independent of the choice of initial seeds and, theoretically, reflects the composition of the population being studied. When we examined race/ethnicity for the US sites, we found that all three eventually reached equilibrium, although they did so at different points in the recruitment process (see Figure a–c, which reflects equilibrium for phase 1 of the study).
Race/ethnicity proportions by wave (phase 1): a Raleigh–Durham, b Los Angeles, c Chicago.
Waves are the number of levels in a recruitment chain that follow from an initial seed. The Raleigh–Durham sample stabilized almost from the start in both phases of the study (Figure a), indicating that the proportions recruited during the first waves accurately reflected the reported sample composition for race/ethnicity. The Los Angeles sample (Figure b) reached equilibrium for race/ethnicity by about wave 10 in phase 1 and by about wave 6 in phase 2 (data not shown). The Chicago sample reached equilibrium by about wave 6 at its largest interview site (Figure c); we present Chicago data from only one interview site as an illustration.
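The wave-by-wave stabilization described above can be checked with a simple cumulative-proportion rule. The tolerance and stopping criterion below are an assumed operationalization for illustration, not the study's exact equilibrium criterion.

```python
from collections import Counter

def cumulative_proportions(waves):
    """waves: one list of category labels (e.g., race/ethnicity) per
    recruitment wave. Returns the cumulative sample composition after
    each wave."""
    counts, total, out = Counter(), 0, []
    for wave in waves:
        counts.update(wave)
        total += len(wave)
        out.append({k: v / total for k, v in counts.items()})
    return out

def equilibrium_wave(props, tol=0.02):
    """First (1-indexed) wave from which every category's cumulative
    proportion changes by less than tol in all later waves; None if the
    sample never stabilizes by this rule."""
    for i in range(len(props) - 1):
        if all(abs(props[j + 1].get(k, 0.0) - props[j].get(k, 0.0)) < tol
               for j in range(i, len(props) - 1)
               for k in set(props[j]) | set(props[j + 1])):
            return i + 1
    return None

# Illustrative chain: composition shifts early, then stabilizes
waves = [["b"] * 8 + ["w"] * 2,
         ["b"] * 6 + ["w"] * 4,
         ["b"] * 6 + ["w"] * 4,
         ["b"] * 6 + ["w"] * 4]
props = cumulative_proportions(waves)
print(equilibrium_wave(props, tol=0.03))
```

Plotting each category's cumulative proportion against wave number, as in Figure a–c, shows the same stabilization visually.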