Several assumptions determine whether respondent-driven sampling (RDS) is an appropriate sampling method for a particular group: members of the population being recruited must know one another as members of the group (i.e., injection drug users [IDUs] must know each other as IDUs), the population must be networked, and the sample size must be small relative to the overall size of the group. To assess these three assumptions, we analyzed city-specific data collected using RDS through the US National HIV Behavioral Surveillance System among IDUs in 23 cities. Overall, 5% of non-seed participants reported that their recruiter was “a stranger.” All 20 cities with multiple field sites had ≥1 cross-recruitment, a proxy for linked networks. Sample sizes were small in relation to the IDU population size (median = 2.3%; range: 0.6%-8.0%). Researchers must evaluate whether these three assumptions were met to justify using RDS to sample specific populations.
Behavioral surveillance of persons at risk of HIV infection is an important component of an overall HIV surveillance program [1,2]; these data are used to estimate prevalence, identify correlates of behaviors and determine prevention needs. Multiple methods have been used to sample populations at high risk of HIV infection including venue-based, time-space sampling; targeted sampling; snowball sampling; and respondent-driven sampling [3,4]. Respondent-driven sampling (RDS) [5,6] has been used successfully to reach injecting drug users (IDUs) in the United States [7,8] and elsewhere .
Certain assumptions must be met for RDS to be an appropriate sampling method for a particular group [10,11]. First, members of the population being recruited must know one another as members of the target population (i.e., IDUs must know each other as IDUs). If members of the population cannot identify each other, then participants will not be able to produce eligible recruits and the method will fail to produce a sample. Second, the population being recruited must be adequately networked to accommodate a chain referral process; ideally, networks should form a single component (network of networks), rather than multiple, disconnected networks, so that referral chains can reach all subsets of the population in a defined area. Subsets of the population that are completely disconnected from the primary network cannot be reached by the peer recruitment process and thus the RDS findings will not be generalizable to these groups. A third assumption, that the sample size to be recruited using RDS is small relative to the overall size of the target population (i.e., a small sampling fraction), is required to ensure that each participant’s ability to be recruited remains constant over time because the pool of potential recruiters is not noticeably diminished . Given that respondents may only participate once, it is important to ensure that the sample size does not exhaust the pool of potential recruiters in the population as sampling progresses. Two other RDS assumptions, that participants can accurately report their personal network size and that recruitment is a random selection from the recruiter’s network, are applicable to RDS analysis. Discussion of these assumptions is beyond the scope of this paper and has been reported elsewhere [12,13].
Few RDS studies have assessed these three assumptions. To build the literature on situations in which RDS does and does not work well as a recruitment and sampling strategy for hard-to-reach groups, quantitative indicators are needed to assess the RDS assumptions. This paper defines quantitative measures to evaluate, post hoc, the extent to which the three assumptions were met in the US National HIV Behavioral Surveillance System among injecting drug users for the first cycle of data collection from May 2005 to February 2006 (NHBS-IDU1). Based on this evaluation, we describe the lessons learned that were then applied to the second cycle (NHBS-IDU2).
Methods for NHBS-IDU are reported in detail elsewhere  and briefly described here. NHBS-IDU1 was conducted by the Centers for Disease Control and Prevention (CDC) in collaboration with state and local health departments in 23 metropolitan statistical areas (“cities”) within the United States. CDC determined that NHBS-IDU1 was not research; each local area obtained human subjects approval in accordance with its institution’s determination.
Local project staff in each city started the NHBS-IDU1 cycle with formative research to determine logistics of survey operations and to gather information on the local IDU population . Each city set up at least one interview field site accessible to the various local drug-use networks and began RDS with a limited number (8-10) of initial recruiters or ‘seeds’ representing various drug networks and geographic or demographic characteristics.
NHBS-IDU1 procedures included eligibility screening, obtaining oral informed consent from participants, and an interviewer-administered survey. Eligibility for NHBS includes being of age 18 or older, being a resident of the city, not having already participated in the current NHBS data collection cycle, and being able to complete the survey in English or Spanish. An additional IDU cycle eligibility criterion was having injected drugs within 12 months preceding the interview date, measured by self-report and either evidence of recent injection or adequate description of injection practices . The survey measured characteristics of participants’ IDU networks (total number, gender and race/ethnicity), demographics, drug use and injection practices, sexual behaviors, HIV testing history, and use of HIV prevention services. Interviewers used handheld computers to administer the survey and record responses.
Participants could take the survey at any NHBS field site in their city. Participants who completed the survey were asked and trained to recruit others who also injected drugs by distributing number-coded coupons. Participants were compensated for their participation and for each eligible recruit who completed the survey; this dual-incentive structure is unique to RDS [5,6]. Compensation levels were determined in each city, but generally were about $25 for participation and $10 for recruitment.
NHBS-IDU1 was conducted from May 2005 through February 2006. The duration of data collection varied across cities because of differences in the timing of human subjects approval, logistics, and speed of sample accrual.
Participants who agreed to be recruiters were told to give coupons to someone they knew as an IDU. Participants (excluding seeds) described their relationship to the person who gave them their coupon. Multiple responses were allowed, including: sex partner, drug partner, family, friend, colleague, acquaintance, and stranger (“you don’t really know the person, just met him/her”). For analysis purposes, the participant’s recruiter was categorized as a stranger if this response option was selected with no additional relationship reported.
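The classification rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the code used for NHBS-IDU1 (the analysis was run in SAS); the function name and the lowercase response labels are hypothetical.

```python
def recruiter_is_stranger(relationships):
    """Return True only when 'stranger' is the sole relationship reported.

    `relationships` is the list of response options a participant selected
    (hypothetical labels such as "friend", "drug partner", "stranger").
    """
    selected = {r.strip().lower() for r in relationships}
    # A participant who selected "stranger" together with any other
    # relationship is NOT classified as recruited by a stranger.
    return selected == {"stranger"}
```

For example, `recruiter_is_stranger(["stranger"])` is true, whereas `recruiter_is_stranger(["stranger", "acquaintance"])` is false because an additional relationship was reported.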
Five variables which may affect participants’ recruitment selections and introduce sampling bias were assessed: race/ethnicity, gender, age, preferred drug, and self-reported HIV status. Race and ethnicity were coded into one variable with mutually exclusive categories: white, black, Hispanic (regardless of race), and other (including Asians, Native Hawaiian and Pacific Islanders, multiracial persons, and those with no recorded race). The variable “preferred drug” was derived from questions asking frequency of use of several drug types and then grouped into 5 categories: heroin only, heroin and cocaine (equal frequency or combined as speedball), cocaine or crack only, amphetamine (including methamphetamine), and other (all other drugs or combinations thereof). Self-reported HIV status was categorized as HIV-positive or not (which included those whose results were negative or indeterminate, those who never received the result or never tested, and those whose HIV status could not be determined).
Coupon numbers and other information linking recruiters to their recruits were collected and maintained in RDS Coupon Manager (RDSCM) 2.0 software (Cornell University, Version 2.0, Ithaca, New York, USA). Survey data were transferred from the handheld to a computer and then uploaded to a secure server; some survey records were lost during collection or transfer and only the recruitment data from RDSCM 2.0 remained. Survey and RDSCM 2.0 data were merged using SAS software (SAS Institute Inc., Version 9.1, Cary, North Carolina, USA) and output to an electronic text file for analysis in RDSAT software (Cornell University, Version 5.6, Ithaca, New York, USA). The analyses for this paper included only eligible participants, except where otherwise noted. For some analyses, city-specific samples were aggregated to report on the whole NHBS-IDU1 sample.
Using SAS, we calculated the proportion of participants reporting that their recruiter was a stranger; a low proportion (2-4%) indicates that this assumption is met . We also assessed the proportion of potential participants who were eligible as a way to determine the extent to which participants knew one another as IDUs; a high proportion of ineligible recruits would suggest that this assumption was not met.
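As a rough illustration of the first indicator, the proportion of non-seed participants recruited by a stranger can be computed per city as sketched below. The record layout (`city`, `is_seed`, `stranger_only`) is a hypothetical one chosen for the example; the original analysis was carried out in SAS on the NHBS-IDU1 data set.

```python
from collections import defaultdict

def stranger_proportion_by_city(records):
    """Proportion of non-seed participants whose recruiter was a stranger.

    `records` is an iterable of dicts with the hypothetical keys
    'city', 'is_seed' (bool) and 'stranger_only' (bool).
    """
    # city -> [recruited by a stranger, all non-seed participants]
    counts = defaultdict(lambda: [0, 0])
    for rec in records:
        if rec["is_seed"]:
            continue  # seeds have no recruiter, so they are excluded
        counts[rec["city"]][1] += 1
        if rec["stranger_only"]:
            counts[rec["city"]][0] += 1
    return {city: s / n for city, (s, n) in counts.items()}
```

The same tally, applied to screened persons and an eligibility flag instead of the stranger flag, would give the proportion of eligible recruits used as the second check on this assumption.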
We used RDSAT to create a matrix of cross-recruitments. To determine whether the IDU networks within each city were linked, cross-recruitment was assessed by field site, as networks often are defined by geography. An example of cross-recruitment is a participant interviewed at Field Site B who had received his/her coupon from a recruiter interviewed at Field Site A. We also assessed cross-recruitment for the 5 variables; we report data only for race/ethnicity as it had the most impact on sampling. Two field sites, or two racial/ethnic groups, were considered linked if there was at least one recruitment between them. The presence of at least one cross-recruitment in the sample suggests the presence of a large number of connections across groups in the population; the higher the proportion of cross-recruitments, the greater the number of network connections among IDUs.
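A cross-recruitment matrix of this kind can be sketched as a simple tally of recruiter-recruit pairs. The sketch below is not RDSAT output and makes two assumptions: each recruitment is available as a (recruiter group, recruit group) pair, and the proportion of cross-recruitments is taken over all recruitments in the sample. Function names are hypothetical.

```python
from collections import Counter

def cross_recruitment(pairs):
    """Tally recruitments by (recruiter_group, recruit_group).

    Returns the matrix as a Counter plus the proportion of recruitments
    in which recruiter and recruit belong to different groups
    (e.g., different field sites or racial/ethnic groups).
    """
    matrix = Counter(pairs)
    total = sum(matrix.values())
    crossed = sum(n for (a, b), n in matrix.items() if a != b)
    proportion = crossed / total if total else 0.0
    return matrix, proportion

def groups_linked(matrix, group_a, group_b):
    """Minimum standard: at least one recruitment in either direction."""
    return matrix[(group_a, group_b)] + matrix[(group_b, group_a)] >= 1
```

With pairs such as `[("Site A", "Site A"), ("Site A", "Site B")]`, Site A and Site B would count as linked because one coupon crossed between them.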
The sampling fraction was defined as the number of persons screened for NHBS-IDU1 (regardless of eligibility) divided by the total number of IDUs in each city .
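The definition above amounts to a single division per city. A minimal sketch follows; the city names and counts are invented for illustration and are not NHBS-IDU1 data, and the population figures stand in for published IDU population estimates.

```python
from statistics import median

def sampling_fraction(n_screened, population_estimate):
    """Persons screened (regardless of eligibility) divided by the
    estimated number of IDUs in the city."""
    if population_estimate <= 0:
        raise ValueError("population estimate must be positive")
    return n_screened / population_estimate

# Hypothetical inputs: city -> (persons screened, estimated number of IDUs).
cities = {
    "City A": (500, 20_000),
    "City B": (600, 10_000),
    "City C": (400, 40_000),
}
fractions = {name: sampling_fraction(n, pop) for name, (n, pop) in cities.items()}
median_fraction = median(fractions.values())
```

Summarizing the city-level fractions by their median and range mirrors the way the sampling fraction is reported in this paper.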
From May 2005 to February 2006 a total of 13,519 persons were recruited, 384 of whom were seeds. A total of 1,563 (12%) persons were deemed ineligible and excluded from analysis: 196 did not meet NHBS general eligibility criteria (86 of whom were ineligible due to previous participation) and 1,367 did not meet current injection drug use criteria. Additionally, 46 persons had no recruitment information so their records could not be used. There were 334 persons with lost survey records. In addition, we did not include for analysis 38 persons with responses of highly questionable validity and 67 who were not classified as either male or female.
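The exclusions listed above can be tallied as a quick consistency check. The counts are taken directly from the preceding paragraph; treating the exclusion categories as mutually exclusive is an assumption of this sketch, and the dictionary keys are illustrative labels only.

```python
recruited = 13_519  # all persons recruited, including seeds

# Exclusion counts as reported above (assumed mutually exclusive).
exclusions = {
    "ineligible": 196 + 1_367,        # general NHBS criteria + injection criteria
    "no_recruitment_information": 46,
    "lost_survey_record": 334,
    "questionable_validity": 38,
    "not_classified_male_or_female": 67,
}

analysis_n = recruited - sum(exclusions.values())
ineligible_pct = round(100 * exclusions["ineligible"] / recruited)
```

Under that assumption, `analysis_n` works out to 11,471 and `ineligible_pct` to 12, matching the figures reported for the analysis dataset.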
In the complete analysis dataset, there were 334 seeds and 11,137 peer-recruited participants, for a total of 11,471 participants. Table 1 displays characteristics of the overall sample; city-specific characteristics of NHBS-IDU1 participants are reported elsewhere . Among the 11,471 participants, most were male (71%) and 35 years of age or older (81%) (Table 1). Nearly half (49%) were black, 25% white, and 21% Hispanic. Heroin was the preferred drug for 53% of the sample, and 8% self-reported that they were HIV-infected.
Table 1 shows responses regarding the relationship to the recruiter (as reported by the participant). The most common (59%) relationship was “friend”; many reported relationships related to drug use such as someone they “buy drugs with” or “buy drugs from.” Overall, 5% of non-seed participants reported that their recruiter was “a stranger” (with no other relationship; only 26 persons reported stranger and another relationship); this proportion varied by city (range 1.2%-20%), with 5 cities having >5% recruitment by strangers (Table 2).
The proportion of potential participants who were eligible for NHBS-IDU1 was high overall (90%) and in each city (range 83%-98%, Table 2). The majority of potential participants (61%, range 40%-86%) had physical signs of recent injection (data not shown). Although a higher proportion of ineligibles in cities with a high proportion of participants recruited by a stranger might be expected, we did not see this pattern (Table 2).
Of the 23 NHBS-IDU1 cities, 3 used a single field site, so cross-recruitment was not assessed. All other cities had multiple field sites, ranging from 2 to 7 with an average of 4 field sites. In 3 cities with multiple field sites, each had 1 field site with no cross-recruitment to any other field site. In 1 of these cities, a new field site was opened after the existing ones were closed, making cross-recruitment to this site impossible. We assume cross-recruitment would have occurred from this field site had it been possible and therefore included all data in the analysis dataset. The other 2 cities each had a field site located in an area that was geographically distant from the other locations, with limited hours of operation; there was no evidence suggesting that participants interviewed at these 2 field sites were part of the same networks as participants from other field sites. Therefore, data from these 2 field sites (n=90) were considered separate networks (i.e., not part of one component) and were excluded from the analysis dataset.
In all of the cities with multiple field sites there was at least 1 cross-recruitment by field site and by race/ethnicity. The proportion of cross-recruitments by field site ranged from 0.2% to 74% (Table 2). The proportion of cross-recruitments by race/ethnicity ranged from 8% to 52% (Table 2). In the two cities with the lowest proportion of cross-recruitments by race/ethnicity, nearly all the participants were black (Table 2).
The sample sizes by city ranged from 341 to 785 (Table 2). Overall, the sampling fraction was low, with less than 10% of the IDU population sampled in each city (median = 2.3%; range: 0.6%-8.0%).
In summary, NHBS-IDU1 met the three RDS assumptions we assessed based on the quantitative indicators we created. Results for each assumption varied by city. Related to the first assumption, that participants knew one another as members of the target population, we found that, for most cities, the proportion of recruitments by a stranger was low while the proportion of eligible recruits was high. In 5 cities the proportion recruited by a stranger was >5%, but these cities still had high eligibility rates, suggesting that participants knew each other well enough to recognize each other as IDUs. This assumption also has implications for analysis, as RDS weighting is based on individuals with larger networks having greater likelihood of being recruited; if many participants recruit strangers (i.e., persons outside their network), then RDS weights based on network size would not be applicable. To examine the second RDS assumption, that the IDU networks within the NHBS cities were linked, we examined cross-recruitment by field site and by race/ethnicity. Cross-recruitment by field site ranged from 0.2% to 74%. Two cities had limited cross-recruitment by race/ethnicity, which may suggest that IDU networks in these cities are racially defined. When there is a low proportion of cross-recruitments, RDS analysis may still produce valid estimates; however, the variance around these estimates will be high. For the third assumption, we found that in each city the sampling fraction was too small to noticeably diminish the recruiter pool, allowing for robust recruitment.
This is the first paper to assess the extent to which the three RDS assumptions were met in samples from a standardized, multi-city behavioral surveillance system in the United States using quantitative indicators. The results from this paper can be used to guide other researchers to conduct similar evaluations of their own RDS studies. We created indicators for the assumptions that are easy to calculate; although we conducted our assessment post-hoc, the assumptions should be considered during formative research and the indicators can be used while planning an RDS study (e.g., considering sampling fraction by using existing population size estimates and planned sample size) or monitored as part of process evaluation during sample accrual (proportion recruited by a stranger and cross-recruitments) so that recruitment can be adjusted as needed. Rudolph et al.  also described ways they tested RDS assumptions in New York City among IDUs, using similar metrics reported here.
Two papers reviewing 123 RDS studies outside the US discussed challenges  and summarized characteristics of RDS studies . Papers such as these have not reported data on whether these 3 assumptions were met empirically. Few other studies have reported on relationships between recruiters and recruits, including the proportion recruited by a stranger or cross-recruitments . Other RDS studies have reported high proportions of eligible recruits, similar to the high proportion found in NHBS-IDU1 [21-23]. The hidden nature of most RDS target populations often precludes knowledge of population size and therefore makes calculation of the sampling fraction more challenging; we were able to use existing published estimates of the IDU population size in each NHBS city . This is the first paper to report sampling fractions for 23 RDS samples collected using a standard protocol. Our data can contribute to refinement of theoretical work related to RDS estimation: in NHBS-IDU1, the median sampling fraction was 2.3%, a figure well below the threshold of 50%, at which sampling-with-replacement can become a source of bias .
Our analyses had some limitations that suggest further development of quantitative indicators of the three RDS assumptions. Field site may not be the best variable to assess whether networks are sufficient to sustain a chain-referral process; other factors such as neighborhood of residence or zip code may be more relevant within each NHBS city to determine the extent to which networks are related. Our findings on cross-recruitment by race/ethnicity are similar to those reported in another IDU study in New York City . Future research should consider what proportion of cross-recruitment is adequate to demonstrate linked networks; our standard of 1 cross-recruitment is the minimum needed to rule out a complete lack of cross-recruitment, rather than a level of adequate cross-recruitment. Local NHBS project staff are encouraged to examine the assumptions considered here for their own data, and staff from each NHBS city should consider their knowledge of the local IDU population to determine how well RDS sampled different groups of IDUs within their city. The sample of IDUs reached by RDS can be compared to other methods of recruitment to determine if key sub-populations were missed .
Based on the analysis reported here, additional operational procedures were developed for NHBS-IDU2. A more refined definition of ‘knowing’ someone was added to the question assessing the relationship to the recruiter as well as to the recruiter training script (By “know,” I mean you know their name OR you see them around even if you don’t know their name). Participants who reported that their recruiter was a stranger were probed using standardized questions; if participants reported never seeing the recruiter prior to being given a coupon or reported having first seen the recruiter in a situation related to NHBS-IDU, then the relationship classification of ‘stranger’ was considered validated. In addition, recruiters were trained not to give coupons to strangers. As part of their formative research, NHBS-IDU staff were required to analyze peer recruitment patterns in their NHBS-IDU1 data by race/ethnicity, gender, and other characteristics of potentially insular sub-populations of IDU (i.e., networks that are not linked to other networks). Based on this information, staff selected seeds from loosely networked sub-populations to ensure each group’s representation, whereas closely networked sub-populations did not require the same extent of planning for selecting seeds. In addition, staff assessed potential field sites in part for the location’s ability to serve as a “bridge” between major IDU sub-populations. Other formative research activities such as identifying studies of local IDU populations that describe networks and other characteristics of drug users can also help lay the foundation for the success of an RDS sample in reaching all groups of IDUs .
RDS is increasingly used to sample IDUs and other populations at high risk of HIV infection. As RDS is still a relatively new sampling and analysis method, it is important for investigators to share operational findings. As use of RDS increases, researchers must not only report on whether RDS assumptions were met to justify its use among specific populations, as we did here, but also plan formative research to ensure that assumptions can be met.
We would like to thank Drs. Lillian Lin and Christopher Johnson, Michael Spiller III, and Cristin Haggard for their consultation regarding the analyses in this report. We recognize contributions to this report made by the persons who were NHBS-IDU Principal Investigators (R. Luke Shouse, Georgia Division of Human Resources; Colin Flynn, Maryland Department of Health and Mental Hygiene; Eric Rubinstein, Massachusetts Department of Public Health; Carol Ciesielski, Chicago Department of Public Health; Sharon Melville, Texas Department of State Health Services; Beth Dillon, Colorado Department of Health and Environment; Eve Mokotoff, Michigan Department of Community Health; Marcia Wolverton, Houston Department of Health; Dave Crockett, Nevada Department of Public Health; Trista Bingham, Los Angeles County Department of Public Health; Marlene LaLota, Florida Department of Health; Chris Nemeth, New York Department of Health; Christopher Murrill, New York City Department of Health and Mental Hygiene; Helene Cross, New Jersey Department of Health and Senior Services; Dena Bensen, Virginia Department of Public Health; Kathleen Brady, Philadelphia Department of Health; Assunta Ritieni, California Department of Health; H Fisher Raymond, San Francisco Department of Public Health; Sandra Miranda De Leon, Puerto Rico Department of Health; Yelena Friedberg, Missouri Department of Health and Senior Services; Maria Courogen, Washington Department of Health) and the Behavioral Surveillance Team, Behavioral and Clinical Surveillance Branch, Division of HIV/AIDS Prevention, CDC.
The findings and conclusions in this manuscript are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention.
This work was supported by the Centers for Disease Control and Prevention (212-2005-M-11776 and 200-2006-M-16175 to D.D.H).
The authors confirm that the content of this article presents no conflicts of interest.