To empirically evaluate Respondent-Driven Sampling (RDS) recruitment methods, which have been proposed as an advantageous means of surveying hidden populations.
The National HIV Behavioral Surveillance system used RDS to recruit 370 IDU in the Seattle area in 2005 (NHBS-IDU1). We compared the NHBS-IDU1 estimates of participants’ area of residence, age, race, sex and drug most frequently injected to corresponding data from two previous surveys, RAVEN and Kiwi, and to persons newly diagnosed with HIV/AIDS and reported 2001–2005.
The NHBS-IDU1 population was estimated to be more likely to reside in downtown Seattle (52%) than participants in the other data sources (22%–25%), be over 50 years old (29% vs. 5%–10%) and report multiple races (12% vs. 3%–5%). The NHBS-IDU1 population resembled persons using the downtown needle exchange in age and race distribution. An examination of cross-group recruitment frequencies in NHBS-IDU1 suggested barriers to recruitment across different areas of residence, races and drugs most frequently injected.
The substantial differences in age and area of residence between NHBS-IDU1 and the other data sources suggest that RDS may not have accessed the full universe of Seattle area injection networks. Further empirical data is needed to guide the evaluation of RDS-generated samples.
Injection drug users (IDU) are a population at elevated risk for several infections of public health importance including HIV, hepatitis B and C.1;2 Epidemiologic surveys of drug-injecting populations are important in measuring the prevalence of these diseases, evaluating public health measures to control these infections, identifying unmet needs and noting opportunities for prevention efforts. However, obtaining an unbiased sample of IDU has proved problematic due to the illegal nature of drug injection and the social marginalization of many IDU. Most methods of IDU recruitment contain well-recognized sources of bias whose effects are difficult or impossible to quantify.3;4
Respondent-driven sampling (RDS) has been proposed as an advantageous means of accessing hidden populations.5-7 In RDS, participants are given coupons with which to recruit their peers and offered payments when new recruits bring in the coupons. Theoretical reasoning and mathematical modeling propose that RDS methods can produce a study population from which unbiased estimates of the properties of a target population can be calculated. RDS offers further advantages in the ease of its implementation and the standardization of its methods, advantages which have made RDS attractive in international studies with limited resources.8;9 While RDS is being used in a growing number of settings,10-24 there remains a need for empirical data evaluating how well RDS fulfills its promise in practice.
One means of evaluating RDS is to compare RDS results for a population with data on the same population obtained through other means.25 In 2005, IDU were surveyed for the National HIV Behavioral Surveillance system using RDS in 23 U.S. cities (the NHBS-IDU1 survey), including Seattle.25 Two previous studies, the Risk Activity Variables, Epidemiology and Network Study (RAVEN) and the Kiwi Study, recruited Seattle area IDU using differing sampling strategies. Data on IDU diagnosed with HIV/AIDS in 2001–2005 were available through the HIV/AIDS Reporting System (HARS). We compared these different populations in terms of area of residence, age, race, gender and drug most frequently injected, variables commonly used to characterize IDU populations.15;18;20;26
The methodology for surveying IDU in NHBS-IDU1 has been described.25 Participants were required to be 18 years of age or older, have injected in the previous 12 months, reside in King, Snohomish or Island counties and be able to communicate in English. In Seattle, 19 initial IDU (seeds) were each given 3 coupons to pass on to their injecting peers. Participants who completed a survey questionnaire were paid $20 and offered in turn coupons to distribute to their IDU peers. They received a payment of $10 for each eligible study participant they referred. Recruitment began 5/25/05 and was terminated 1/31/06. In-person interviews were conducted using hand-held computers in a storefront office in Seattle’s south downtown, which was open for interviews 3–4 days per week, or at one of two sites in south King County, open 1–2 days a week.
The RAVEN study recruited IDU from June 1994 through May 1997 from among persons entering four methadone treatment centers (39% of participants), a drug treatment evaluation agency (17%), a drug detoxification centre (15%), two social service agencies (19%) and entering the King County jail in Seattle on drug-related charges (10%).27 In each of these settings, candidates were selected for recruitment on the basis of a random number algorithm. Among persons potentially eligible for the study, 10% refused participation. The 2538 RAVEN participants meeting the criteria described below for inclusion in this analysis constitute 16% of the approximately 16,000 IDU estimated to reside in the Seattle area.28
The Kiwi study recruited IDU from among persons incarcerated in the two main King County jails, in Seattle and Kent, from September 1998 through December 2002.29 Participants were recruited by screening all persons booked into jail during randomly selected time intervals in the Seattle jail or from inmates visiting the Kent jail health clinic for a regular 14 day physical exam, or from persons requesting HIV counseling and testing in both the Seattle and Kent jails. Among persons screened and determined to be eligible, 60% completed an interview. The 1567 Kiwi participants constituted about 10% of the estimated IDU in the Seattle area.28
HARS collects data on all reported cases of HIV/AIDS. We present results from HARS on persons newly diagnosed with HIV/AIDS between 2001 and 2005 with residence in King, Snohomish or Island Counties. Analysis is restricted to the 263 cases whose exposure category was recorded as IDU (48% of cases) or IDU/MSM (52%).
Our findings are supplemented by a survey of Seattle area needle exchangers. Public Health Seattle-King County staff attempted to briefly interview every person exchanging needles at a needle exchange in King County from 11/11/2006 through 11/24/2006. Overall, 49% of persons approached completed a survey.
Informed consent was obtained from participants in RAVEN, Kiwi and NHBS-IDU1 before administration of the questionnaire. For RAVEN, study procedures were approved by the institutional review boards of the state of Washington and the University of Washington; for Kiwi by the state of Washington and the Centers for Disease Control and Prevention (CDC); and for NHBS-IDU1 by the state of Washington.
To obtain consistency among study populations, analysis was restricted to participants in the three studies who were 18 years of age or older and residents of King, Snohomish or Island Counties. For the NHBS-IDU1 data, the RDS Analysis Tool (RDSAT) was used to calculate RDS-adjusted population proportion estimates, their 95% confidence intervals, RDS-adjusted standard errors, sample population and equilibrium proportions, and cross-group recruitment probabilities.30 The RDS-adjusted estimates incorporate adjustments based on network size, cross-group recruitment probabilities and group-specific recruitment efficiency. Network size was based on a question about the number of injectors that a participant knew and whom they had seen at least once in the previous 6 months. Recruitment chains were diagrammed using NETDRAW.31
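As an illustration of how equilibrium proportions and a network-size adjustment relate to cross-group recruitment probabilities, the following Python sketch treats recruitment as a first-order Markov chain. It is not RDSAT's implementation. The two-group estimator shown follows from the reciprocity assumption commonly made in the RDS literature, and the function names, the recruitment matrix and the network sizes are all hypothetical.

```python
def equilibrium_proportions(transition, iterations=500):
    """Approximate the equilibrium of a recruitment Markov chain by power iteration.

    transition[i][j] is the probability that a recruiter in group i
    recruits someone in group j; each row must sum to 1.
    """
    k = len(transition)
    dist = [1.0 / k] * k  # arbitrary starting composition
    for _ in range(iterations):
        dist = [sum(dist[i] * transition[i][j] for i in range(k))
                for j in range(k)]
    return dist


def two_group_estimate(s_ab, s_ba, degree_a, degree_b):
    """Estimated population proportion of group A under the reciprocity assumption.

    s_ab: probability that a group A recruiter recruits from group B
    s_ba: probability that a group B recruiter recruits from group A
    degree_a, degree_b: mean network sizes of the two groups
    """
    return (s_ba * degree_b) / (s_ab * degree_a + s_ba * degree_b)


# Hypothetical recruitment matrix for two groups (A, B); not study data.
recruitment = [[0.9, 0.1],
               [0.3, 0.7]]
print(equilibrium_proportions(recruitment))
print(two_group_estimate(0.1, 0.3, degree_a=10, degree_b=5))
```

With equal network sizes the two-group estimate coincides with the equilibrium proportion; when one group reports larger networks, its estimated population share is reduced accordingly.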
While a number of different approaches to the statistical evaluation of RDS-recruited populations have been published, no one method has found general acceptance.8;17;18;32-34 We used a test based on chi-square methods, after adjustment for RDS design effects according to a method proposed by Heckathorn.35 RDSAT generates standard errors for population proportion estimates by a bootstrap method ($\mathrm{S.E.}_{\mathrm{bootstrap}}$).36 The design effect quantifies the proportionate difference between the variance from the RDSAT bootstrap method and the variance that would be expected were the RDS estimates normally distributed.36 It was calculated as $(\mathrm{S.E.}_{\mathrm{bootstrap}})^2 / (p(1-p)/n)$, where $p$ is the RDS-adjusted population proportion estimate. For the RAVEN, Kiwi and HARS populations the chi-square tests used the actual numbers of participants in each category of interest. For NHBS-IDU1, we used the population proportion estimate times the number of NHBS-IDU1 participants (i.e. the expected number of participants), then divided this by the design effect. This adjusts for the weaker power in an RDS sample (as reflected in the wider bootstrap confidence intervals). Even so, these methods may overestimate the precision of the estimates in the RDS sample.37 Analyses were conducted using SPSS and Epistat.38;39
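The design-effect adjustment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' SPSS or Epistat procedure: n is assumed to be the number of NHBS-IDU1 participants (the text does not define it), the bootstrap standard error of 0.05 is a made-up value, and the 2x2 Pearson statistic without continuity correction is one possible form of the chi-square test.

```python
def design_effect(se_bootstrap, p, n):
    """Bootstrap variance divided by the binomial variance p(1-p)/n."""
    return se_bootstrap ** 2 / (p * (1 - p) / n)


def adjusted_count(p, n, deff):
    """Expected number of participants (p * n) divided by the design effect."""
    return p * n / deff


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]]."""
    total = a + b + c + d
    return total * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )


# p and n are taken from the text (52% downtown, 370 participants);
# the standard error is hypothetical.
p, n, se = 0.52, 370, 0.05
deff = design_effect(se, p, n)
in_category = adjusted_count(p, n, deff)
# Assumption: the complementary category is shrunk by the same design effect.
out_category = adjusted_count(1 - p, n, deff)
print(deff, in_category, out_category)
```

The adjusted counts would then take the place of the raw NHBS-IDU1 counts in a table such as `pearson_chi2_2x2(in_category, out_category, c, d)`, where `c` and `d` are the actual counts from a comparison population.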
Of the 19 seeds interviewed, 10 recruited at least one study participant. Sixty percent of the sample population derived from one seed. Fifty-eight percent of participants recruited at least one new study participant; 31% of the coupons distributed were returned by new participants. The rate of recruitment was substantially slower than anticipated. At the fixed time the survey was terminated (as mandated by CDC protocol) data from 370 eligible participants were available for analysis. This was less than the initial target sample size of 500. Despite concerted efforts to plant seeds in south King County, these seeds generated few participants. Ninety-two percent of study interviews were conducted at the downtown Seattle site. Only ten participants reported being recruited by a stranger.
There were 28 waves of recruitment (Figure 1). Ninety-one percent of the sample population was recruited at the fourth wave or higher. All categories of area of residence, age, race, sex, and drug most frequently injected were represented in the NHBS-IDU1 sample population within 2% of their calculated equilibrium estimates, except for residence in south King County and south Seattle where the deviation was 4% (Table 1). There was cross-recruitment across each of the three interview sites.
There were substantial differences between the RDS estimates of the geographic distribution of the NHBS-IDU1 population and the distributions of other IDU populations (Table 2). The RDS estimate indicated that 52% of the NHBS-IDU1 population resided in downtown Seattle, compared to 24% in both RAVEN and Kiwi and 25% in HARS.
The NHBS-IDU1 age estimates indicated an IDU population older than the other data sources (Table 2). Twenty-nine percent of the NHBS-IDU1 study population was estimated to be over 50 years old compared to 6% in RAVEN, 5% in Kiwi and 10% in the HARS data.
To investigate the extent to which the NHBS-IDU1 study population represents an aging cohort of IDU, we performed a linear regression analysis of participants’ age against the date of interview within the RAVEN and Kiwi studies. There was a modest trend towards decreasing age within both RAVEN (β=−.06; p=.001) and Kiwi (β=−.04; p=.14). Similar analyses revealed no evidence of a temporal trend towards an increasing proportion of IDU residing in downtown Seattle within RAVEN or Kiwi.
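A regression of this kind can be sketched as an ordinary least-squares slope of age on interview date. The Python below is an illustration only: the helper names and the three interview records are hypothetical, and expressing time in years is an assumption, since the units of the reported β are not stated.

```python
from datetime import date


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def age_trend_per_year(interviews):
    """Slope of participant age against interview date, in years of age per calendar year.

    interviews: iterable of (interview_date, age) pairs.
    """
    interviews = list(interviews)
    start = min(d for d, _ in interviews)
    years = [(d - start).days / 365.25 for d, _ in interviews]
    ages = [age for _, age in interviews]
    return ols_slope(years, ages)


# Hypothetical records, not study data.
records = [
    (date(1994, 6, 1), 40),
    (date(1995, 6, 1), 39),
    (date(1996, 5, 31), 38),
]
print(age_trend_per_year(records))
```

A negative slope corresponds to later interviewees being younger on average, which is the direction of the trend reported for RAVEN and Kiwi.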
The proportion of participants reporting multiple races was higher in NHBS-IDU1 than the other data sources (13% vs. 3%–5%). Among participants reporting multiple races, 77% in NHBS-IDU1 and 75% in Kiwi reported Native American as one of the races selected. The percent of Hispanics rose in the successive interview studies over time.
RAVEN had a significantly higher proportion of females (37%) than the other data sources (14%–24%). Females were more likely than males to have been in drug or alcohol treatment in the previous year in both the Kiwi (45% vs. 38%; p=.02) and NHBS-IDU1 populations (54% vs. 42%; p=.05).
Heroin was by far the most commonly reported drug in each of the interview studies (HARS collected no data on use of specific drugs). The proportion of participants reporting amphetamines was higher in Kiwi (26%) than RAVEN (6%) and at an intermediate level in the NHBS-IDU1 estimate (18%). Among participants from south King County, a region currently affected by high levels of amphetamine injection,40 NHBS-IDU1 estimated a lower proportion injected amphetamines (23%) than found in Kiwi participants (50%).
The distribution of age, race and sex among needle exchange users participating in the 2006 Seattle needle exchange survey at sites in three different Seattle neighborhoods is shown in Table 3. Needle exchange users at the downtown site differed from those at the other two sites in having a higher proportion over 50 years of age (25% vs. 6% and 11%) and a higher proportion reporting Black race (17% vs. 3% and 9%); with respect to these variables, they most closely resembled the NHBS-IDU1 population.
We evaluated recruitment probabilities across groups of participants defined by differing areas of residence, age, race, sex and drug most frequently injected in order to assess differential recruitment patterns on the basis of these characteristics (Table 4).
A pronounced pattern of preferential recruitment of participants from the same area of residence was seen for participants from south King County (62% did so) and south Seattle (54%). Participants tended to recruit persons broadly similar in age to themselves; recruitment probabilities were substantially higher for persons in the same age category as the recruiter, or an adjacent one, than for those farther apart in age.
Black participants showed a marked tendency to recruit other Black participants (55% did so) and were themselves recruited by very low proportions of persons of other races (from 0% to 8%). Amphetamine injectors most frequently recruited other amphetamine injectors (65%).
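Cross-group recruitment probabilities of the kind reported above can be tabulated directly from recruiter and recruit pairs. The Python sketch below shows one simple way to do so; it does not reproduce RDSAT's output, and the function name and the example pairs are hypothetical.

```python
from collections import Counter, defaultdict


def recruitment_probabilities(pairs):
    """Row-normalised recruitment table from (recruiter_group, recruit_group) pairs.

    Returns {recruiter_group: {recruit_group: probability}}; groups that
    never recruited anyone are omitted because their row is undefined.
    """
    counts = defaultdict(Counter)
    groups = set()
    for recruiter_group, recruit_group in pairs:
        counts[recruiter_group][recruit_group] += 1
        groups.update((recruiter_group, recruit_group))
    table = {}
    for g in sorted(groups):
        total = sum(counts[g].values())
        if total == 0:
            continue
        table[g] = {h: counts[g][h] / total for h in sorted(groups)}
    return table


# Hypothetical recruitment pairs by drug most frequently injected; not study data.
pairs = [
    ("amphetamine", "amphetamine"),
    ("amphetamine", "amphetamine"),
    ("amphetamine", "heroin"),
    ("heroin", "heroin"),
    ("heroin", "heroin"),
    ("heroin", "heroin"),
    ("heroin", "amphetamine"),
]
for recruiter, row in recruitment_probabilities(pairs).items():
    print(recruiter, row)
```

A diagonal entry well above a group's share of the sample indicates preferential within-group recruitment, the pattern described here for area of residence, race and drug most frequently injected.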
The NHBS-IDU1 results stand out from RAVEN, Kiwi and HARS data in the high estimated proportion of downtown Seattle residents and in the markedly older age distribution. Lacking a definitive gold standard, we cannot determine with assurance which of these populations, if any, accurately reflects the characteristics of Seattle area IDU. The various data sources in this report have strengths and weaknesses. RAVEN’s random number based sampling reduced volunteer bias, and the multiple recruitment settings reduced the influence of any single site. IDU who had no contact with the institutions where RAVEN recruitment occurred, however, would have been missed. With its jail-based recruitment, Kiwi provided access to the substantial proportion of IDU experiencing incarceration,41 but would have missed IDU who were not arrested. As HIV/AIDS is a reportable infection, HARS data would be expected to identify essentially all persons diagnosed with HIV/AIDS,42 but are likely to be affected by patterns of HIV testing, and to over-sample IDU at higher risk for HIV transmission (such as IDU/MSM) and older IDU. The RDS methodology of NHBS-IDU1 has a body of theory-based investigations asserting its capacity to produce unbiased estimates of population characteristics, but this has not been convincingly verified by empirical data.
The closer concordance among the three other sources of data constitutes an argument that the NHBS estimates for these characteristics are the less representative portrayal of Seattle area IDU. We offer the hypothesis that RDS coupon distribution did not effectively penetrate the full universe of injector networks in the Seattle area. The similarity in age and racial distribution between the NHBS-IDU1 population and downtown needle exchangers suggests that NHBS-IDU1 recruitment occurred disproportionately among networks of downtown IDU. Our data on recruitment probabilities within and across groups support the idea of incomplete network penetration in NHBS-IDU1 by documenting lower recruitment probabilities, and hence network barriers, between IDU across differing areas of residence, races and injection drugs.
The criteria generally considered important for valid RDS recruitment appear to have been fulfilled in the Seattle NHBS-IDU1 survey: recruitment chains were long and the preponderance of participants were recruited in the fourth or higher waves, the sample population approached estimated equilibrium values for the characteristics analyzed, recruitment occurred across interviewing sites and few participants reported being recruited by strangers. The numbers of participants in the present report and the number of survey sites are comparable to what has been published in other studies. Our findings raise the question whether the efficient penetration of the injector networks throughout a large metropolitan area might require a wider dispersal of interview sites and larger numbers of participants than has been the practice in RDS studies.
In addition to differing in recruiting methods, the four sources of data were collected over an 11-year time span. The differing patterns of drug preference in the different studies may reflect increasing amphetamine use in the Seattle area over the time period of our data,43 as has been seen in other areas.44 The increasing proportions of Hispanic participants across the studies likely reflect the continuing growth of the Hispanic population in King County, which increased from 2.9% in 1990 to 6.7% in 2005.45 The higher proportion of females in RAVEN may be a product of a higher likelihood that females take part in the drug treatment programs that were a source of a substantial proportion of RAVEN participants.
Other reports have found discrepancies between RDS-recruited IDU study populations and those recruited by other methods. RDS-derived IDU populations were compared with contemporaneous samples derived from targeted sampling methods in Detroit, Houston and New Orleans, finding no significant difference between the different sample populations in gender or age distributions but differences in racial distributions in Houston and New Orleans.18 An RDS-generated study population of drug users in New York found similar age and gender distributions to those in two previous studies but a different racial makeup, possibly as a result of population changes over time.26 Outside the United States, RDS-recruited IDU study populations were compared to participants recruited by earlier studies by indigenous field worker sampling in Volgograd, Russia and Barnaul, Estonia.15 In both locations, significant differences were found between the sample populations in age, gender, education, and needle sharing. A St. Petersburg study found a substantially higher proportion of females than seen in 2 previous studies.20 Also, a web-based RDS survey of Cornell University students found substantial discrepancies between RDS-adjusted estimates for race and gender proportions and their true distribution.24
Two RDS studies in MSM provide data relevant to our findings. Ma et al. reported on MSM in Beijing surveyed by RDS in three consecutive years. Substantial differences were observed among the surveys in a number of key study variables.13 While these differences may be a product of rapid changes over time, or a change in how network size data were elicited, it is also possible they reflect material variation in repeat samples recruited by RDS methodology. Kendall et al. compared an RDS-generated MSM study population in Fortaleza, Brazil with previous surveys that used snowball sampling and venue-based recruitment.12 The RDS population had characteristics, most notably lower SES, less consistent with the other study populations than those populations were with one another. To explain this, the authors note that the club venues at which recruitment occurred might have biased the earlier samples towards MSM of higher SES and that the RDS sample more closely resembled the census characteristics of Fortaleza. On the other hand, they remark that higher SES MSM may have hesitated to travel to the two interview sites in the central city.
The discrepancies among the different sources of data on Seattle area IDU make it difficult to determine which, if any, accurately reflects the characteristics of the underlying IDU universe. We offer a hypothesis that incomplete penetration of injector networks lies behind the observed differences between the NHBS-IDU1 estimates for age and area of residence and those of the other sources of data. We acknowledge that there are other plausible sources of bias among the sources of data we examined. Given the limited empirical evidence of RDS functioning currently available, we recommend caution in drawing overly broad conclusions from any individual study. Claims that RDS efficiently accesses all but the most isolated networks need to be validated in practice. Further empirical data on RDS functioning in a variety of settings are called for.
Funding for this work came from the National Institute on Drug Abuse (1RO1DA08023) and Centers for Disease Control and Prevention (U62/CCU006260). The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention. We wish to acknowledge the contributions of Nadine Snyder (Project Coordinator), Carrie Shriver, Susan Nelson and Jef St. De Lore (interviewers) and study participants, without whose efforts this study would not have been possible.
Current affiliation: New York University, New York, NY
Current affiliation: World Health Organization, Geneva