Participants were recruited from eight geographically contiguous reservations, with a total population of about 3,000 individuals, using a combination of a venue-based method for sampling hard-to-reach populations (Kalton and Anderson, 1986; Muhib et al., 2001) and a respondent-driven procedure (Heckathorn, 1997), as previously described (Ehlers et al., 2004a; Gilder et al., 2004). Recruitment venues included tribal halls and culture centers, health clinics, tribal libraries, and stores on the reservations. Refusal rates ranged from 10% to 25% depending on venue, and were higher at tribal libraries and stores than at health clinics and tribal halls/culture centers. Transportation from participants’ homes to The Scripps Research Institute was provided by the study.
To be included in the study, participants had to be a Native American Indian indigenous to the catchment area, of at least 1/16th Native American Heritage (NAH), between 18 and 70 years of age, and mobile enough to be transported from his or her home to The Scripps Research Institute (TSRI). The protocol for the study was approved by the Institutional Review Board (IRB) of TSRI and by the Indian Health Council, a tribal review group overseeing health issues for the reservations where recruitment was undertaken.
Potential participants first met individually with research staff to have the study explained and to give written informed consent. During a screening period, participants had their blood pressure and pulse taken and completed a questionnaire that gathered information on demographics, personal medical history, ethnicity, and drinking history (Schuckit, 1985). Participants were asked to refrain from alcohol and drug use for 24 hours prior to testing. Three individuals with detectable breath alcohol levels were excluded from the study dataset (n=3). During the screening period, the study coordinator also noted whether the participant appeared agitated, tremulous, or diaphoretic; data from such participants were eliminated from subsequent analyses. Each participant also completed an interview with the Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA) and the family history assessment module (FHAM) (Bucholz et al., 1994), which was used to make substance use disorder and psychiatric disorder diagnoses according to Diagnostic and Statistical Manual (DSM-III-R) criteria in the probands and their family members (American Psychiatric Association, 1987). The SSAGA is a semi-structured, poly-diagnostic psychiatric interview that has undergone both reliability and validity testing (Bucholz et al., 1994; Hesselbrock et al., 1999) and has been used in another Native American sample (Hesselbrock et al., 2000). Personnel from the Collaborative Study on the Genetics of Alcoholism (COGA) trained all interviewers. The SSAGA interview includes retrospective lifetime assessments of alcohol use, abuse, and dependence. A research psychiatrist/addiction specialist made all best final diagnoses.
The phenotypes chosen for the present linkage analyses, on the basis of their significant heritability, were: (1) a DSM-III-R stimulant (amphetamine or cocaine) dependence diagnosis; (2) stimulant “craving,” defined as endorsing “In situations where you couldn’t use stimulants, did you ever have such a strong desire for it that you couldn’t think of anything else”; and (3) a measure of a period of heavy use of stimulants, defined as endorsing “Was there ever a period of a month or more when a great deal of your time was spent using stimulants, getting stimulants, or getting over its effects.”
One hundred and eighty-one pedigrees containing 1,600 individuals were used in the genetic analyses. Sixty-six families had only a single individual with phenotype data; these individuals were included in some analyses to the extent that they contributed information about trait means and variances and the impact of covariates. Family sizes for the remaining families ranged from 4 to 41 subjects (average 12.19 ± 8.19). Eighty-one families were genetically informative. The data include 142 parent-child, 260 sibling, 53 half-sibling, 11 grandparent-grandchild, 235 avuncular, and 240 cousin relative pairs; only sibling, half-sibling, avuncular, and cousin pairs were treated as potentially genetically informative. Several pedigrees contained large numbers of individuals and/or complex loops that could not be analyzed because of the high computational demands involved. These pedigrees were broken using procedures originally described by Lange and Elston (1975) and treated as independent to allow their inclusion in the linkage analysis.
DNA was isolated from whole blood using an automated extraction procedure, and genotyping was performed as previously described (Wilhelmsen et al., 2003). Genotypes were determined for a panel of 791 autosomal microsatellite polymorphisms (Weber and May, 1989) using fluorescently labeled PCR primers under conditions recommended by the manufacturer (HD5 version 2.0; Applied Biosystems, Foster City, CA). The HD5 panel set has an average marker-to-marker distance of 4.6 cM and an average heterozygosity of greater than 77% in a Caucasian population. Allele frequencies observed in the unrelated founders were used for linkage analysis.
Genotypes were determined for 381 subjects. The PREST software program, which assesses the degree of allele sharing among relative pairs, was used to identify potential errors in pedigree structure (McPeek and Sun, 2000); six individuals were identified as problematic and removed from further analyses. PedCheck was then used to detect non-Mendelian inheritance patterns (O’Connell and Weeks, 1998). When a Mendelian inconsistency was observed, genotypes for the nuclear family at that polymorphism were removed, resulting in the removal of 772 genotypes (0.3%). To further reduce errors, the maximum-likelihood error-checking algorithm implemented in Merlin (Abecasis et al., 2002) was used to identify genotypes with a probability of less than 0.025 of being correct; a further 508 genotypes (0.2%) were removed in this step. Ultimately, 273,598 genotypes (99.5%) were accepted.
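The stage-by-stage bookkeeping above can be verified directly. This is a minimal sketch, using only the counts stated in the text; the starting total is inferred as the sum of removals and accepted genotypes.

```python
# Hedged sketch: reproduce the genotype QC percentages reported above.
# Per-stage counts come from the text; the total is inferred as their sum.
pedcheck_removed = 772    # Mendelian inconsistencies (nuclear-family removals)
merlin_removed = 508      # unlikely genotypes flagged by Merlin (p < 0.025)
accepted = 273_598        # genotypes retained for analysis

total = pedcheck_removed + merlin_removed + accepted
print(f"inferred total:    {total}")
print(f"PedCheck removals: {pedcheck_removed / total:.1%}")  # ~0.3%
print(f"Merlin removals:   {merlin_removed / total:.1%}")    # ~0.2%
print(f"accepted:          {accepted / total:.1%}")          # ~99.5%
```

The inferred total of roughly 275,000 genotypes is consistent with the quoted removal percentages.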
Analyses were conducted to estimate the heritability of the three phenotypes of interest (DSM-III-R stimulant dependence, stimulant craving, and heavy use) using SOLAR (Almasy and Blangero, 1998), as previously described (see Ehlers et al., 2009). Participants’ age at the time of evaluation and sex were evaluated as potential covariates and retained if they accounted for at least 5% of the total variance. The total additive genetic heritability (h²) and its standard error were estimated, and the probability that h² was greater than zero was determined using a Student’s t-test for each scale. All three phenotypes were found to be heritable and thus suitable for linkage analysis. A total of 684 individuals with full phenotype data were included in these analyses.
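The significance check described above can be sketched as follows. This is not SOLAR itself: it uses a large-sample normal approximation to the t statistic, and the h² and standard-error values are illustrative, not the study’s estimates.

```python
from statistics import NormalDist

def h2_p_value(h2, se):
    """One-sided p-value for H0: h2 = 0, using a large-sample normal
    approximation to the Student's t statistic (z = h2 / SE)."""
    z = h2 / se
    return NormalDist().cdf(-z)

# Illustrative values only (not estimates from the study):
p = h2_p_value(0.35, 0.12)
print(f"h2 = 0.35 (SE 0.12): one-sided p = {p:.4f}")
```

With exact degrees of freedom, SOLAR’s likelihood-based test can differ slightly from this approximation, but the logic of the decision is the same.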
For linkage analysis, a variance components approach was used to calculate multipoint LOD scores at 1 cM intervals across the genome for the three stimulant phenotypes using SOLAR v4.2.0 (Almasy and Blangero, 1998; S.F.B.R., 2011). Because the Native American Mission Indian sample contains large extended pedigrees, a variance components approach allowing for multiple pedigree types was preferred over sibling-pair approaches (e.g., the Kong and Cox (1997) statistic) because of the greater statistical power afforded by the former (Amos et al., 1997; Duggirala et al., 1997). All traits were analyzed using a latent threshold model, in which a normally distributed liability is assumed along with a threshold above which an individual is designated affected.
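The latent threshold model can be made concrete with a short sketch: the binary trait is treated as a standard-normal liability, and the threshold is the point exceeded with probability equal to the trait prevalence. The 10% prevalence used here is purely illustrative, not a figure from the study.

```python
from statistics import NormalDist

def liability_threshold(prevalence):
    """Threshold t on a standard-normal liability scale such that
    P(liability > t) equals the trait prevalence; individuals whose
    latent liability exceeds t are designated affected."""
    return NormalDist().inv_cdf(1.0 - prevalence)

# Illustrative prevalence of 10% (not a figure from the study):
t = liability_threshold(0.10)
print(f"threshold for 10% prevalence: {t:.2f} SD above the mean")
```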
Variance components linkage analysis assumes that phenotypes are normally distributed, and violations of this assumption can inflate LOD scores. To protect against this possibility, simulations were conducted in which a single genetic locus was simulated under the null hypothesis of no linkage across 100,000 trials to derive pointwise empirical p-values. These p-values were used to determine the significance of the reported LOD scores (Blangero et al., 2000), with p < 2.2 × 10⁻⁵ used to identify genome-wide significance, as suggested by Lander and Kruglyak (1995), and p < 0.001 used to identify suggestive evidence for linkage. The simulations suggested some negative bias in LOD scores for the stimulant dependence diagnosis but little bias for the remaining phenotypes: 17, 105, and 72 of the 100,000 simulations for the stimulant dependence, “craving,” and “heavy use” phenotypes, respectively, yielded LOD scores greater than 2.00, compared with an expected 100 simulations per phenotype; and 0, 4, and 3 of the 100,000 simulations, respectively, yielded LOD scores greater than 3.00, compared with an expected 10 simulations per phenotype.
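The expected counts quoted above can be approximated analytically. Under the null, a variance-components LOD score asymptotically follows a 50:50 mixture of a point mass at zero and a chi-square with 1 df, so the pointwise p-value is 0.5 · P(χ²₁ > 2 ln(10) · LOD). This sketch uses that approximation, which yields roughly 120 and 10 expected exceedances of LOD 2 and LOD 3 per 100,000 trials, broadly consistent with the counts used in the text.

```python
import math

def lod_pointwise_p(lod):
    """Asymptotic pointwise p-value for a variance-components LOD score,
    whose null distribution is a 50:50 mixture of a point mass at zero
    and a chi-square with 1 df: p = 0.5 * P(chi2_1 > 2*ln(10)*LOD)."""
    x = 2.0 * math.log(10.0) * lod              # chi-square equivalent of the LOD
    return 0.5 * math.erfc(math.sqrt(x / 2.0))  # P(chi2_1 > x) = erfc(sqrt(x/2))

trials = 100_000
print(f"expected trials with LOD > 2: {trials * lod_pointwise_p(2.0):.0f}")
print(f"expected trials with LOD > 3: {trials * lod_pointwise_p(3.0):.0f}")
```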
To better characterize the evidence for linkage across families at the reported peaks, heterogeneity tests of the family-specific LOD scores were performed using the SOLAR HLOD test (Göring, 2002). This test contrasts a null model, in which all families belong to a single distribution exhibiting genetic linkage to the tested locus, against an alternative model, in which families belong to one of two distributions, only one of which shows evidence of linkage to the tested locus.
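A heterogeneity LOD of this general kind can be sketched with Smith’s classic admixture formulation, in which a proportion alpha of families is assumed linked and the statistic maximizes the summed family contributions over alpha. This is an illustration of the admixture idea only; SOLAR’s HLOD implementation may differ in detail, and the family-specific LOD values below are made up.

```python
import math

def hlod(family_lods, steps=1000):
    """Smith-style admixture HLOD: maximize over alpha (the proportion of
    linked families) the sum over families of
    log10(alpha * 10**LOD_i + (1 - alpha))."""
    best = 0.0  # alpha = 0 gives a total of 0, so 0 is always attainable
    for k in range(steps + 1):
        alpha = k / steps
        total = sum(math.log10(alpha * 10.0 ** lod + (1.0 - alpha))
                    for lod in family_lods)
        best = max(best, total)
    return best

# Illustrative family-specific LODs (made up): one strongly linked family
# among three. Allowing alpha < 1 yields a higher score than the plain
# summed LOD of 0.7, which is the signature of linkage heterogeneity.
score = hlod([2.0, -0.5, -0.8])
```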