Participants were recruited from eight geographically contiguous reservations, with a total population of about 3,000 individuals, using a combination of a venue-based method for sampling hard-to-reach populations [58, 59] and a respondent-driven procedure [60], as previously described [5, 13]. Recruitment venues included tribal halls and culture centers, health clinics, tribal libraries, and stores on the reservations. Refusal rates ranged from 10% to 25% depending on venue and were higher at tribal libraries and stores than at health clinics and tribal halls/culture centers. The study provided transportation from participants' homes to The Scripps Research Institute.
To be included in the study, participants had to be Indians indigenous to the catchment area, have at least 1/16th Native American Heritage (NAH), be between 18 and 70 years of age, and be mobile enough to be transported from home to The Scripps Research Institute (TSRI). Participants were excluded from electrophysiological analyses if they had a history of head trauma or were currently using medications that could bias the EEG recording. The protocol for the study was approved by the Institutional Review Board (IRB) of TSRI and by the Indian Health Council, a tribal review group overseeing health issues for the reservations where recruitment was undertaken.
Potential participants first met individually with research staff to have the study explained and to give written informed consent. During a screening period, participants had blood pressure and pulse taken and completed a questionnaire that was used to gather information on demographics, personal medical history, ethnicity, and drinking history [61]. Participants were asked to refrain from alcohol and drug use for 24 hours prior to testing. Individuals with detectable breath alcohol levels (n = 3) were excluded from the study dataset. During the screening period, the study coordinator also noted whether the participant was agitated, tremulous, or diaphoretic; data from participants showing these signs were eliminated from subsequent analyses. Each participant also completed an interview with the Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA) and the family history assessment module (FHAM) [62], which was used to make substance use disorder and psychiatric disorder diagnoses according to Diagnostic and Statistical Manual (DSM-III-R) [63] criteria in the probands and their family members [63]. The SSAGA is a semi-structured, poly-diagnostic psychiatric interview that has undergone both reliability and validity testing [62, 64]. It has been used in another Native American sample [65, 66]. Personnel from the Collaborative Study on the Genetics of Alcoholism (COGA) trained all interviewers. The SSAGA interview includes retrospective lifetime assessments of alcohol use, abuse, and dependence. A research psychiatrist/addiction specialist made all best final diagnoses.
Six channels of bipolar EEG (F3-C3, C3-P3, P3-O1 and F4-C4, C4-P4, P4-O2; international 10-20 system) were obtained using an electrode cap (impedance < 5 kΩ), as described. Bipolar recordings were obtained for comparison with previous studies in a wide range of ethnic groups [67-70]. A forehead ground electrode was used. An electrode placed left lateral infra-orbitally and referenced to the left earlobe was used to monitor both horizontal and vertical eye movement. Resting EEG was recorded in a temperature- and noise-controlled room while the participant sat comfortably in a chair. Participants were instructed to relax and keep their eyes closed, but to remain awake throughout the EEG recording. Ten to 15 minutes of EEG was collected on paper (Nihon Kohden, high-low pass filters 1-70 Hz) and also digitized for subsequent analyses. EEG records were monitored during all recordings for signs of drowsiness or artifact. Ten minutes of artifact-free, drowsiness-free EEG, as defined by Daly and Pedley [71], was computer analyzed for each channel. Muscle and movement artifacts were identified by a computer-driven algorithm that flags epochs with waveforms between 0.25 and 0.5 Hz with an amplitude higher than 100 microvolts squared per Hz (movement artifact) and waveforms between 20 and 50 Hz with amplitudes above 25 microvolts squared per Hz (muscle artifact). These epochs were verified by the user as artifact and then removed prior to processing. Time of recording with respect to the menstrual cycle was not controlled, as previous studies have demonstrated that the EEG variables under study are not sensitive to time during the cycle [72].
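The artifact screening rule described above can be illustrated with a minimal sketch. The thresholds and frequency ranges are taken from the text; everything else is an assumption made for illustration, including the function names, the representation of an epoch as parallel lists of frequencies and power values, and the choice to compare the peak value within each range against the threshold. It is not the software used in the study, and flagged epochs would still require verification by the user.

```python
# Thresholds and ranges from the text; the data layout is hypothetical.
MOVEMENT_BAND = (0.25, 0.5)   # Hz
MOVEMENT_LIMIT = 100.0        # microvolts squared per Hz
MUSCLE_BAND = (20.0, 50.0)    # Hz
MUSCLE_LIMIT = 25.0           # microvolts squared per Hz


def peak_in_band(freqs, power, band):
    """Largest power value whose frequency lies inside band (inclusive)."""
    lo, hi = band
    values = [p for f, p in zip(freqs, power) if lo <= f <= hi]
    return max(values) if values else 0.0


def flag_epoch(freqs, power):
    """Return candidate artifact labels for one epoch (assumed peak rule)."""
    labels = []
    if peak_in_band(freqs, power, MOVEMENT_BAND) > MOVEMENT_LIMIT:
        labels.append("movement")
    if peak_in_band(freqs, power, MUSCLE_BAND) > MUSCLE_LIMIT:
        labels.append("muscle")
    return labels


# Example: a flat spectrum on a 0.25 Hz grid with one large slow component.
freqs = [i * 0.25 for i in range(257)]   # 0 to 64 Hz
power = [1.0] * len(freqs)
power[1] = 150.0                          # 0.25 Hz bin exceeds 100
print(flag_epoch(freqs, power))
```

Returning labels rather than deleting epochs mirrors the two-stage process in the text, where the algorithm proposes candidate epochs and the user confirms them before removal.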
Records were digitized at 128 Hz. The Fourier transform of consecutive four-second epochs (minimum 140) was calculated and the power spectrum produced using an IBM-compatible PC with software developed by Ehlers and Havstad [73]. Power density was calculated in microvolts squared per octave, a transformation that expands amplitudes at high frequencies and reduces them at low frequencies, producing a spectrum with less 1/f characteristics [74]. A rectangular window was used. The transformed data were compressed into frequency bands. Mean spectral power density (microvolts squared/octave) in the alpha (7.5-12.0 Hz) frequency band was calculated by summing the raw power spectral values within the band, multiplying by a scale factor derived from the calibration signal to produce the total power in the band in microvolts squared, and dividing by the width of the band in octaves. This width is the logarithm of the ratio of the maximum to the minimum frequency in the band, divided by the logarithm of two. The details of the spectral analysis procedures have been previously described [50, 73, 74].
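The band-power calculation described above reduces to a short piece of arithmetic, sketched here under stated assumptions. The function names, the example raw values, and the scale factor are hypothetical; only the band limits, sampling rate, epoch length, and the order of operations (sum, scale, divide by octave width) come from the text.

```python
import math

SAMPLING_RATE_HZ = 128
EPOCH_SECONDS = 4
SAMPLES_PER_EPOCH = SAMPLING_RATE_HZ * EPOCH_SECONDS   # 512 samples
BIN_SPACING_HZ = 1.0 / EPOCH_SECONDS                   # 0.25 Hz per FFT bin


def octave_width(f_min, f_max):
    """Band width in octaves: log(f_max / f_min) / log(2)."""
    return math.log(f_max / f_min) / math.log(2)


def mean_power_per_octave(raw_values, scale, f_min, f_max):
    """Sum raw spectral values, scale to microvolts squared, divide by octaves."""
    total_power = sum(raw_values) * scale
    return total_power / octave_width(f_min, f_max)


# The alpha band (7.5-12.0 Hz) spans roughly 0.678 octaves.
alpha_width = octave_width(7.5, 12.0)
print(round(alpha_width, 3))

# Hypothetical raw values and calibration scale factor, for illustration only.
print(mean_power_per_octave([4.0, 6.0, 5.0], scale=0.5, f_min=7.5, f_max=12.0))
```

Because the alpha band is narrower than one octave, dividing by its width yields a density larger than the total power in the band.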
The data analyses were guided by the overall aim of mapping loci linked to EEG alpha power phenotypes and determining whether these overlap with loci previously mapped for alcohol dependence phenotypes in an American Indian community. To reduce the number of dependent variables in our linkage analyses, a principal component analysis (PCA) was performed over the six bipolar electrode locations for the alpha frequency band. Varimax rotation yielded two components (eigenvalues > 1, range 2.64-2.67). The electrode sites loading on the first component were the two fronto-central leads (F3-C3, F4-C4), and the electrode sites loading on the second component were the four more posterior leads (C3-P3, C4-P4, P3-O1, P4-O2) (loadings ranged from 0.64 to 0.93). The two orthogonal components explained 87% of the variance. Mean alpha power was averaged across the electrode sites within each of the identified components, generating a value for mean power (microvolts squared/octave) for each of the regions identified by the PCA for each participant. These values were the dependent variables in the linkage analyses.
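The final averaging step, in which the six leads are collapsed into two regional phenotypes, can be sketched as follows. The grouping of leads follows the PCA result reported in the text; the function name, the dictionary layout, and the example power values are hypothetical and serve only to show the computation.

```python
# Lead groupings as reported for the two rotated components.
FRONTO_CENTRAL = ("F3-C3", "F4-C4")
POSTERIOR = ("C3-P3", "C4-P4", "P3-O1", "P4-O2")


def regional_alpha_power(power_by_lead):
    """Average alpha power (microvolts squared/octave) within each region."""
    def mean(leads):
        return sum(power_by_lead[lead] for lead in leads) / len(leads)

    return {
        "fronto_central": mean(FRONTO_CENTRAL),
        "posterior": mean(POSTERIOR),
    }


# Hypothetical values for one participant.
participant = {
    "F3-C3": 2.0, "F4-C4": 4.0,
    "C3-P3": 1.0, "C4-P4": 2.0, "P3-O1": 3.0, "P4-O2": 6.0,
}
print(regional_alpha_power(participant))
```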
One hundred and eighty-one pedigrees containing 1600 individuals were used in the genetic analyses. Of these, 410 individuals had both genotype and phenotype data and 236 additional individuals had only phenotypic data. Sixty-six families had only a single individual with phenotype data. These individuals were included in some analyses to the extent that they contributed information about trait means and variance and the impact of covariates. The family sizes for the remaining families ranged between 4 and 41 subjects (average 12.19 ± 8.19). Eighty-one families were genetically informative. The data include 142 parent-child, 260 sibling, 53 half-sibling, 11 grandparent-grandchild, 235 avuncular, and 240 cousin relative pairs. Only sibling, half-sibling, avuncular, and cousin pairs were included as being potentially genetically informative. Many individuals can be linked to a few large extended pedigrees with many founders and complex "loop" structures, which were "broken" to simplify the analysis.
DNA was isolated from whole blood using an automated DNA extraction procedure, and genotyping was done as previously described [75]. Genotypes were determined for a panel of 791 autosomal microsatellite polymorphisms [76] using fluorescently labeled PCR primers under conditions recommended by the manufacturer (HD5 version 2.0; Applied Biosystems, Foster City, CA). The HD5 panel set has an average marker-to-marker distance of 4.6 cM and an average heterozygosity of greater than 77% in a Caucasian population. Allele frequencies were estimated from the entire Mission Indian population with genotype data. Gender and age accounted for greater than 5% of the phenotypic variance for each of the phenotypes. Therefore, age and gender were included as covariates in the analyses.
Genotypes were ultimately determined for 410 subjects. Samples for which fewer than 90% of genotypes met the quality standard were repeated for the entire panel. Ultimately 273,598 genotypes were accepted. Fewer than 10% of the samples accounted for the majority of the failed genotypes. All available genotypes for all of the autosomal markers were analyzed for each family using PREST [77] to detect sample and pedigree structure errors, resulting in the removal of 6 individuals from further analyses. PREST assesses the degree of allele sharing and calculates several statistics for each relative pair, each of which is sensitive to a different type of pedigree misspecification. Pedcheck was used to detect non-Mendelian inheritance patterns [78]. Relevant genotypes were reviewed blind to diagnosis. Very few Mendelian inconsistencies could be resolved by review of electropherograms. Genotypes for the nuclear family were removed for each Mendelian inconsistency. A total of 772 genotypes were removed from linkage analysis because of Mendelian inconsistencies. To further reduce errors, the probability that each genotype is correct was assessed in the context of all other available genotypes using the maximum-likelihood error-checking algorithm implemented in Merlin [79]. Genotypes that had a probability of less than 0.025 of being correct were removed from further consideration. A total of 508 genotypes were removed in this step. Duplicate genotypes were available for a large fraction of the genotype problems detected by Pedcheck and Merlin. In almost all cases these problematic results were reproducible, suggesting somatic mutations, mosaicism, or "null alleles" (the failure to amplify the allele from one chromosome, resulting in the assumption that an individual is homozygous for the other allele). In our previous experience, about 0.5% of microsatellite genotypes using the HD5 panel give reproducible results that are inconsistent with other family genotype data.
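To make the notion of a Mendelian inconsistency concrete, the following is a minimal sketch of the check for a single marker in a father-mother-child trio. It is not the algorithm used by Pedcheck, which handles full pedigrees and missing genotypes; the function name, the representation of a genotype as a pair of allele sizes, and the example allele values are all assumptions made for illustration.

```python
from itertools import product


def trio_is_consistent(father, mother, child):
    """True if the child could inherit one allele from each parent.

    Each genotype is a pair of allele labels; order within a pair is ignored.
    """
    child_sorted = sorted(child)
    return any(
        sorted((paternal, maternal)) == child_sorted
        for paternal, maternal in product(father, mother)
    )


# Hypothetical microsatellite allele sizes (base pairs).
print(trio_is_consistent((120, 124), (122, 126), (120, 126)))  # consistent

# A pattern a null allele can produce: the father appears homozygous 120/120
# because his second allele failed to amplify, and the child, who inherited
# that unamplified allele, appears homozygous for the maternal allele.
print(trio_is_consistent((120, 120), (124, 126), (124, 124)))  # inconsistent
```

The second case shows why such results can be reproducible on re-genotyping: the apparent inconsistency arises from the assay rather than from a sample mix-up.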
Variance component estimation methods were used to calculate LOD scores using SOLAR v2.0.4 [80, 81]. Simulation analyses were then conducted in which a genetic locus was simulated under the null hypothesis of no linkage across 50,000 trials to derive nominal p-values for the reported LOD scores [82].
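One common way to turn such null simulations into a nominal p-value is to take the fraction of simulated LOD scores that equal or exceed the observed score. The sketch below assumes that definition; the text does not state the exact estimator used, and the function name and the simulated values are hypothetical. It does not call SOLAR.

```python
def empirical_p_value(observed_lod, simulated_lods):
    """Fraction of null-simulation LOD scores at or above the observed LOD."""
    exceed = sum(1 for lod in simulated_lods if lod >= observed_lod)
    return exceed / len(simulated_lods)


# Hypothetical null distribution: 50,000 trials, 10 of which reach LOD 3.5.
simulated = [0.0] * 49990 + [3.5] * 10
print(empirical_p_value(3.0, simulated))
```

With 50,000 trials, the smallest nonzero p-value this estimator can resolve is 1/50,000, or 0.00002.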