Dependence on alcohol and dependence on illicit drugs frequently co-occur. Results from a number of twin studies suggest that heritable influences on alcohol dependence and drug dependence may substantially overlap. Using large, genetically informative pedigrees from the Collaborative Study on the Genetics of Alcoholism (COGA), we performed quantitative linkage analyses using a panel of 1,717 SNPs. Genome-wide linkage analyses were conducted for quantitative measures of DSM-IV alcohol dependence criteria, cannabis dependence criteria and dependence criteria across any illicit drug (including cannabis), individually and in combination as an average score across alcohol and illicit drug dependence criteria. For alcohol dependence, LOD scores of 2.0 or greater were noted on chromosomes 1 (2.0 at 213 cM), 2 (3.4 at 234 cM) and 10 (3.7 at 60 cM). For cannabis dependence, a maximum LOD of 1.9 was noted at 95 cM on chromosome 14. For any illicit drug dependence, LODs of 2.0 and 2.4 were observed on chromosomes 10 (116 cM) and 13 (64 cM), respectively. Finally, the combined alcohol and/or drug dependence symptoms yielded LODs > 2.0 on chromosomes 2 (3.2 at 234 cM), 10 (2.4 and 2.6 at 60 cM and 116 cM) and 13 (2.1 at 64 cM). These regions may harbor genes that contribute to the biological basis of alcohol and drug dependence.
Twin studies have demonstrated that heritable influences play a prominent role in the etiology of both alcohol and illicit drug dependence. In adults, genetic factors contribute to 40-60% of the total variance in risk for alcohol dependence (Dick et al., 2006; Heath et al., 1997; Kendler et al., 1994; Prescott et al., 1995; Tsuang et al., 2001), while for illegal drug dependence, estimates of heritability range from 35-75% (Kendler et al., 2003a; Lynskey et al., 2002; Tsuang et al., 1996; Tsuang et al., 2001; van den Bree et al., 1998). Results from multivariate twin analyses suggest that (i) the liability for dependence on various illicit drug classes is governed, in part, by common genetic factors (Kendler et al., 2003a; Tsuang et al., 1998), and (ii) a significant proportion of the genetic influences on alcohol and illicit drug dependence may be overlapping (Bierut et al., 1998; Kendler et al., 2003b; McGue et al., 2000; Tsuang et al., 2001).
Linkage studies of alcohol dependence have identified several chromosomal regions, including chromosomes 1, 2, 4, 7 and 11 (see Dick et al., 2006 for a review). Gelernter et al. (2005, 2006) have found evidence for linkage on chromosomes 2, 3, 9, 11, 12, 17 and 18 for various definitions of cocaine and opioid dependence. Recently, Hopfer and colleagues (2006) reported linkage on chromosomes 3 and 9 for cannabis dependence criteria in adolescents. Stallings et al. (2003) performed linkage analyses, in the same sample of adolescent probands and controls for average number of dependence criteria across alcohol, tobacco and illicit drugs, and reported elevations on chromosomes 3 and 9. Uhl and colleagues (2001) used 1,494 single nucleotide polymorphisms (SNPs) to examine genome-wide association for general substance dependence vulnerability in pooled samples from a case-control study. Evidence for association was reported for regions on chromosomes 3, 4, 9, 11 and 13 in both European-Americans and African-Americans (Uhl et al., 2001). Subsequently, these authors conducted genome-wide association analyses using 639,401 SNPs in pooled samples to identify a number of cell-adhesion genes (e.g. CNTN4, CNTN5, CNTN6: Contactin genes) associated with a general vulnerability to substance abuse/dependence (Liu et al., 2006).
A number of linkage studies of alcohol or substance dependence have used dichotomous phenotypes (i.e. sib pairs affected or unaffected for alcohol dependence). Such a dichotomous phenotype is limited in power and relies on diagnostic thresholds (e.g. a diagnosis of dependence upon endorsement of three or more criteria), which have repeatedly been demonstrated to be methodologically limited (Helzer et al., 2006a; Helzer et al., 2006b; Saha et al., 2006). Alternatively, ordinal measures (e.g. dependence criteria) that are adjusted for influential covariates (e.g. age, gender, ethnicity) may be more informative phenotypes, especially as they are not restricted by diagnostic thresholds (e.g. affected if and only if at least 3 criteria are endorsed). Unlike an analysis of affected sibling pairs, ordinal and continuous measures allow for incorporation of phenotypic data from all eligible family members, and thus better utilize the entire sample.
In the present study, we used data from the Collaborative Study on the Genetics of Alcoholism (COGA) (Begleiter et al., 1995; Reich et al., 1996, 1998) to perform regression-based linkage analyses (Sham et al., 2002) for DSM-IV criteria of alcohol dependence, cannabis dependence and any illicit drug dependence (including cannabis). Instead of a microsatellite marker panel, we used a panel of 1,717 SNPs (Edenberg et al., 2005), which yielded a significant increase in information content across the genome.
Genotyping of a panel of single-nucleotide polymorphisms (SNPs) was performed on a subset of COGA families determined to be most informative for linkage analyses (Edenberg et al., 2005). In general, the COGA high-risk families consist of multigenerational pedigrees ascertained using probands who met diagnostic criteria for both DSM-IIIR alcohol dependence (American Psychiatric Association, 1987) and for definite alcoholism specified by Feighner et al. (1972). Multiplex alcoholic families that were not bilineal and had at least two affected first-degree relatives in addition to the proband, along with informative second and third degree relatives comprise the original COGA high-risk genetic sample (Nurnberger et al., 2004). For SNP-typing, a sub-set of 143 families with at least 6 members with genotypic and interview data, a total of 1,614 individuals, was drawn from the high risk genetic sample. Of these 1,614 individuals, 1,364 were genotyped with the SNP panel, with the remaining 250 family members included to link members of the family in a genotypically informative manner (i.e. pedigree cohesion). Further details can be found in Edenberg et al. (2005).
The full COGA sample also consists of 984 individuals from the community recruited using a variety of methods including driver’s license records, medical/dental clinics, mailed questionnaires and advertisements, who were neither ascertained nor excluded for psychiatric or substance-related disorders. This community sample was interviewed using the same diagnostic instruments as the high-risk sample; however, no genotypic information is available on these individuals.
We used a thinned panel of the original 4,596 SNPs typed by Illumina. Details surrounding the genotyping procedure may be found elsewhere (Edenberg et al., 2005). COGA pedigrees selected for SNP-typing were analyzed for Mendelian inconsistencies and genetic relatedness, including issues of non-paternity, by two independent groups during the GAW14 workshop (Wang et al., 2005; Hinrichs & Suarez, 2005). Final genotypic cleaning was performed using PREST (McPeek & Sun, 2000). Here, we focus on the procedure used to account for the substantial inter-marker linkage disequilibrium in the original Illumina panel and the consequent thinning procedure. Huang et al. (2004) have shown that in the absence of parental genotypes, linkage disequilibrium can produce spurious inflations in IBD estimates and inflate information content as a consequence. While a proportion of the COGA parental generations have been genotyped, we opted to utilize a thinning procedure that would both reduce the likelihood of IBD inflation and retain the information content afforded by the full panel of SNPs. Thinning was performed by deleting SNPs with r² of 0.1 or greater with any other SNP within 1 Mb. Whenever possible, SNPs with the highest minor allele frequency were retained. This produced our final thinned map of 1,717 SNPs, which constitutes the panel used for the present analyses. The thinned map provided similar information content across the genome when compared to the full panel of SNPs (Hinrichs et al., 2005a; Hinrichs et al., 2005b). In general, information content was high (between 80-95%) across the genome. For larger chromosomes, such as chromosome 1, a modest dip (to between 50-65%) in information was noted in telomeric regions.
Further, as shown by Hinrichs and colleagues (2005a), using chromosome 7 as an example, the dense and thinned maps afforded identical information, ranging from 75-95% (with lower information at telomeres), across this chromosome, which was significantly greater than information afforded by microsatellites (range 55-90%).
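The thinning rule described above (drop SNPs with r² of 0.1 or greater with another SNP within 1 Mb, preferring SNPs with higher minor allele frequency) can be illustrated with a minimal sketch. This is not the procedure actually run on the COGA panel: the greedy ordering by minor allele frequency, the `Snp` record, the `thin_panel` function and the lookup table of pairwise r² values are all assumptions made for illustration, and pairs absent from the table are assumed to have r² of 0.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Snp(NamedTuple):
    """Hypothetical SNP record used only for this illustration."""
    name: str
    chrom: str
    pos_bp: int
    maf: float  # minor allele frequency


def thin_panel(
    snps: List[Snp],
    r2: Dict[FrozenSet[str], float],
    r2_max: float = 0.1,
    window_bp: int = 1_000_000,
) -> List[Snp]:
    """Greedy thinning sketch (an assumed strategy, not the authors' code).

    SNPs are visited from highest to lowest MAF. A SNP is kept only if no
    already-kept SNP on the same chromosome within window_bp has pairwise
    r2 >= r2_max with it. Pairs missing from r2 are treated as r2 = 0.
    """
    kept: List[Snp] = []
    for cand in sorted(snps, key=lambda s: s.maf, reverse=True):
        conflict = False
        for k in kept:
            if k.chrom != cand.chrom:
                continue
            if abs(k.pos_bp - cand.pos_bp) > window_bp:
                continue
            if r2.get(frozenset((k.name, cand.name)), 0.0) >= r2_max:
                conflict = True
                break
        if not conflict:
            kept.append(cand)
    # Return the retained SNPs in map order.
    return sorted(kept, key=lambda s: (s.chrom, s.pos_bp))


# Toy input: rsA and rsB are close and correlated; rsC is correlated with
# rsA but lies more than 1 Mb away, so it falls outside the window.
example_snps = [
    Snp("rsA", "1", 100_000, 0.45),
    Snp("rsB", "1", 150_000, 0.30),
    Snp("rsC", "1", 2_000_000, 0.20),
]
example_r2 = {
    frozenset(("rsA", "rsB")): 0.4,
    frozenset(("rsA", "rsC")): 0.5,
}
thinned_names = [s.name for s in thin_panel(example_snps, example_r2)]
```

In this toy input, rsB is dropped because it is within 1 Mb of the higher-frequency rsA and exceeds the r² threshold, whereas rsC is retained because it lies outside the 1 Mb window.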
Interview data on alcohol and illicit drug dependence criteria were collected using the Semi-Structured Assessment for the Genetics of Alcoholism (Bucholz et al., 1994; Hesselbrock et al., 1999). DSM-IV dependence criteria were coded for alcohol and individual illicit drugs including cannabis, cocaine, sedatives, stimulants and opiates (American Psychiatric Association, 1994). This included 7 criteria (tolerance, using larger quantities/longer periods of time, spending time acquiring/using, multiple attempts to quit/cut back, recurrent use despite physiological/emotional problems, giving up or cutting down important activities to use the substance and, for each substance except cannabis, withdrawal). The following phenotypes were log-transformed (using a “log (n+1)” transformation, where n is the raw variable and 1 is added to account for zeroes) and used for linkage analyses:
Criterion counts were coded as missing for those who reported never having used the substance even once in their lifetime. Average scores were created by dividing the total number of criteria by the number of substances used. Stallings et al. (2003) have previously demonstrated that this assessment of dependence vulnerability is highly heritable (h² = 48%) and also least likely to be affected by environmental influences shared by siblings.
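The phenotype construction described above can be illustrated with a short sketch. The function names and the dictionary layout are hypothetical; the use of the natural logarithm is an assumption (the base of the "log (n+1)" transformation is not stated), as is applying the same transformation to the average score.

```python
import math
from typing import Dict, Optional


def average_criteria(counts: Dict[str, Optional[int]]) -> Optional[float]:
    """Average criterion count across substances ever used.

    A value of None marks a substance never used (count is missing).
    Returns None when no substance was ever used.
    """
    used = [c for c in counts.values() if c is not None]
    if not used:
        return None
    return sum(used) / len(used)


def log_transform(n: float) -> float:
    """log(n + 1) transform; natural log is assumed here."""
    return math.log(n + 1)


# Toy respondent: 5 alcohol criteria, 1 cannabis criterion, never used cocaine.
toy_counts = {"alcohol": 5, "cannabis": 1, "cocaine": None}
toy_average = average_criteria(toy_counts)      # (5 + 1) / 2 substances used
toy_phenotype = log_transform(toy_average)      # log(3 + 1)
```

Adding 1 before taking the logarithm keeps individuals who endorse zero criteria in the analysis, since their transformed score is 0 rather than undefined.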
A total of 1,214 Caucasians (N=112 families, average family size of 10.8) and 150 African-Americans (N=11 families, average family size of 13.6) were used for linkage analyses. The mean age of participants was 40.6 years (range 17-91 years). Table 1 shows the distribution of dependence criteria in 984 individuals from the community sample and 2,773 individuals from the high risk sample: 13% and 46.4% of the community and high risk participants, respectively, endorsed 3 or more alcohol dependence criteria. Similarly, 3.1% and 11.5% of participants endorsed 3 or more cannabis dependence criteria in the community and high risk samples, respectively. The mean number of alcohol dependence criteria was 0.9 and 2.7 in the community and high risk samples, with the comparable estimates for mean number of cannabis dependence criteria being 0.4 and 1.3. The mean number of DSM-IV illicit drug dependence criteria was 0.7 for the community sample and 3.9 for the high risk sample. The mean number of DSM-IV alcohol and illicit drug dependence criteria was 1.2 and 5.4 for the community and high risk samples, respectively.
We used the regression-based quantitative trait linkage analysis available through MERLIN-REGRESS (Sham et al., 2002). While this method is, in general, robust to distributional assumptions of the trait, it is sensitive to ascertainment effects (phenotypic) and to differences in allele frequencies (genotypic) across populations. To account for these effects, we employed the following analytic steps:
To estimate levels for genomewide significance, 1,000 Caucasian and African-American pedigrees were simulated using the gene-dropping algorithm available from MERLIN (Abecasis et al., 2002). Simulations assign random chromosomes to founders and segregate these alleles through the pedigrees while retaining the phenotypic data and segregation patterns. Pedigrees were combined after simulation using pedmerge and multipoint linkage analyses were performed. Maximum LOD scores across simulations were collected and the empirical p-value was computed as (r + 1)/(n + 1) = (r + 1)/1001, where r was the number of times a simulated LOD score exceeded an observed LOD score value and n = 1,000 was the number of simulations (North et al., 2003).
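The empirical p-value calculation can be written out as a minimal sketch. The function name and the toy LOD values are hypothetical; the comparison follows the wording above (simulated maxima that exceed the observed LOD), and counting ties as well would be a more conservative convention.

```python
from typing import Sequence


def empirical_p_value(observed_lod: float, simulated_max_lods: Sequence[float]) -> float:
    """Empirical p-value (r + 1) / (n + 1).

    r is the number of simulated genome-wide maximum LOD scores that exceed
    the observed LOD, and n is the number of simulations.
    """
    n = len(simulated_max_lods)
    r = sum(1 for lod in simulated_max_lods if lod > observed_lod)
    return (r + 1) / (n + 1)


# Toy example with 4 simulated maxima (illustrative values, not study data):
# only 3.8 exceeds the observed LOD of 3.4, so r = 1 and p = 2 / 5.
toy_p = empirical_p_value(3.4, [1.0, 2.5, 3.8, 0.7])
```

With n = 1,000 simulations the denominator is 1,001, so the smallest attainable empirical p-value is 1/1001, obtained when no simulated maximum exceeds the observed LOD.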
For DSM-IV alcohol dependence criteria, we observed linkage peaks on chromosomes 1 (LOD 2.0, 213 cM), 2 (LOD 3.4, 234 cM), 10 (LOD 3.7, 60 cM) and 13 (LOD 1.4, 64 cM) (see Table 2). For cannabis dependence criteria, a maximum LOD score of 1.9 was noted at 95 cM on chromosome 14, with a LOD of 1.4 on chromosome 13, coinciding with the LOD of similar magnitude for alcohol dependence at the same position. This elevation on chromosome 13 was considerably higher for any illicit drug dependence (LOD 2.4, 64 cM). In addition, for any illicit drug dependence, a LOD of 2.0 was noted at 116 cM on chromosome 10. When criteria of alcohol dependence and drug dependence were averaged, two regions of the genome showed significant linkage: on chromosome 2, a LOD of 3.2 at 234 cM and on chromosome 10, a LOD of 2.4 at 60 cM. Both linkage peaks overlapped with linkage peaks for DSM-IV alcohol dependence criteria (see Figure 2). Other evidence for linkage with the combined phenotype included chromosome 1 (LOD 1.4, 213 cM) and chromosome 13 (LOD 2.1, 64 cM). Empirical p-values based on 1,000 multipoint simulations suggest that our linkage peaks are suggestive (p-values of 0.10 or less) for LOD scores exceeding 3.4, with a p-value of 0.05 for LOD scores of 3.7 and greater.
Using an informative panel of single nucleotide polymorphisms (SNPs) that were typed in a subset of individuals from the Collaborative Study on the Genetics of Alcoholism (COGA), we conducted multipoint linkage analyses for DSM-IV criteria of alcohol and drug dependence. Linkage peaks on chromosome 2 and 10 were noted for alcohol dependence and for an average score of alcohol and drug dependence.
Prior linkage studies, especially those utilizing COGA data and a categorical phenotype (i.e. affected sibling pairs), have noted linkage peaks on chromosome 2 for a variety of substance-related phenotypes, but these peaks have converged to 120-135 cM (Bierut et al., 2004; Dick et al., 2004; Foroud et al., 2000; Hesselbrock et al., 2004; Reich et al., 1998). Our findings, in contrast, occur at 230-260 cM, where Schuckit et al. (2001) report linkage for SRE (Subjective Response to Ethanol) scores during the first five times alcohol was consumed (FIRST 5) and Nurnberger et al. (2001) report linkage for comorbid alcoholism and depression. In addition, Straub et al. (1999) report linkage in this 2q region for nicotine dependence in their Christchurch sample. Gelernter et al. (2005, 2006) have also reported linkage for late-onset cocaine dependence and for DSM-IV opioid dependence at 221 cM on chromosome 2, but only in their African-American pedigrees. Zubenko et al. (2003a, b) find linkage in this region for Depression Spectrum Disorder, and recently, Kuo et al. (2006) report linkage for an alcohol withdrawal criteria factor score. While there is considerable phenotypic, and potentially genetic, overlap across the phenotypes assessed in our manuscript and those used in the above referenced studies, it is also possible that imprecise localization of linkage signals, or random chance, may have contributed to convergence of linkage signals.
Our highest LOD score for alcohol dependence was on chromosome 10. The elevations at 60 and 116 cM are in regions where several studies have reported linkage for correlated psychiatric traits. Li and colleagues (2006) report a LOD of 4.17 for a peak ranging from 60-100 cM for number of cigarettes smoked per day in a sample of African-American families ascertained for nicotine dependence. In contrast, Uhl et al. (2001), in their genomewide association study, found evidence for association between a polymorphism (WIAF-3336) at 86 cM on chromosome 10 and substance dependence vulnerability in their European-American cases and controls alone. While the African-American families contributed to our LOD score on chromosome 10, re-doing the linkage analyses without these families did not greatly reduce the LOD score (LOD was reduced to 3.5), suggesting that our finding on chromosome 10 is not specific to African-American families. Vink et al. (2004) showed LOD scores of similar magnitude for both smoking initiation and quantity at 50-60 cM on chromosome 10. Strikingly, Zubenko and colleagues have reported linkage for mood disorders at 76 and 113 cM in their family study of recurrent unipolar major depressive disorder (Zubenko et al., 2003b). The latter peak at 116 cM also maps to within 10 cM of the region where Schuckit et al. (2005) report linkage for the Subjective High Assessment Scale (SHAS), a measure of alcohol response.
We did not, however, find significant evidence for linkage peaks on chromosomes 3, 9 or 17, where Stallings et al. (2003) report linkage for average scores of alcohol and drug dependence, as well as for cannabis dependence, in their sample of adolescent probands from treatment samples and matched controls. In fact, our highest LOD score for cannabis dependence criteria (LOD 1.92) occurred on chromosome 14. There may be several reasons for this, including different ascertainment strategies between the two studies (i.e. COGA used alcohol dependent probands and dense families, while Stallings et al. used probands with substance use or conduct problems) and the higher information content afforded by using SNP panels instead of microsatellites.
It is also worth noting that some linkage peaks previously identified in COGA on chromosomes 4 and 7 (Saccone et al., 2000; Saccone et al., 2005; Foroud et al., 2000; Reich et al., 1998; Nurnberger, Jr. et al., 2001) were not identified in the current study. Those peaks harbor genes such as GABRA2 and ADH (on chromosome 4) and CHRM2 and hTAS2R16 (on chromosome 7), which have subsequently been found to be associated with risk for alcoholism (Edenberg et al., 2004; Wang et al., 2004; Edenberg et al., 2006a; Hinrichs et al., 2006). In other samples, however, quantitative indices of alcohol dependence similar to those used here have yielded linkage on chromosome 4 (see Prescott et al., 2006 for a summary). For instance, in the Irish high-density sample of alcoholics, Prescott and colleagues (2006) report linkage (LOD 4.5) on chromosome 4. Whether the lack of these linkage findings in the present study is related to the phenotype used here or is a false negative is unknown. While the non-replication poses some interpretive challenges, it is not unusual in the extant linkage literature for substance-related phenotypes.
Before considering plausible candidates in the genomic regions encompassing our highest LOD scores, some study limitations need to be considered. First, there is significant overlap between the high risk families that contributed to the linkage signal for alcohol dependence criteria and the families for illicit drug dependence criteria. In our sample, nearly half of those with alcoholism also meet criteria for DSM-IV drug dependence and fewer than 25% of those with drug dependence do not report a lifetime history of alcohol dependence. Therefore, the overlap of linkage findings is likely due to the high comorbidity in this sample. Second, due to the relatively small proportion of African-American families, we did not have sufficient power to conduct linkage analyses independently in these families. Differences in allele frequencies, however, were maintained by creating pedigrees in each racial group and using pedmerge to combine them. Third, the community-based sample was ascertained from various sources: driver’s license registries, dental clinics and health maintenance organizations. While these families were not enriched for substance dependence or psychiatric disorders, they were selected to be large and therefore may provide limited representation of the general population. Notwithstanding this limitation, using community-based estimates is highly recommended for regression-based linkage analyses. Furthermore, re-doing the analyses without using the community sample (as discussed below) did not significantly alter our findings. Lastly, diagnostic criteria for nicotine dependence were not included in the initial COGA assessments, and hence dependence vulnerability to tobacco was not included in these analyses.
Several subsidiary linkage analyses were conducted to validate the robustness of our LOD scores. First, analyses were performed using means and variances from the high-risk sample (i.e. without scoring) and without regressing out the influence of covariates. Second, analyses were replicated using varying estimates of heritability (0.45-0.55). Third, analyses were also conducted using DSM-IIIR dependence criteria for alcohol and illicit drugs. Fourth, linkage analyses were repeated in the full panel of 4,596 SNPs and were also conducted using the original microsatellite panel. Across all subsidiary analyses, LOD scores remained consistent, although modest magnitude changes were noted in some instances. In addition, LOD scores remained unchanged when particular large families were trimmed from the analysis.
Our strongest evidence for linkage was observed on 2q and 10p-q. On the q-arm of chromosome 2, putative candidate genes include HTR2B (5-hydroxytryptamine receptor 2B). Recently, Lin et al. (2004) reported association between 3 coding SNPs (2 non-synonymous, resulting in double-mutants of the protein, and one synonymous) in HTR2B and substance abuse vulnerability. Other possible candidate genes include GPR55 (G protein-coupled receptor 55), which is expressed in human brain tissue and involved in extracellular-intracellular signal transduction, and HDAC4 (histone deacetylase 4), which participates in epigenetic modification of core histones. Work by Zubenko and colleagues has also suggested a role for CREB1 in mood disorders; this gene lies 20 cM centromeric to our linkage peak (Zubenko et al., 2003a). These authors have also identified a sex-specific association between the 124bp repeat allele of D2S944 and recurrent, early-onset major depression and with anxious depression in an independent U.S. and Dutch sample (Beem et al., 2006; Philibert et al., 2003; Zubenko et al., 2002). Due to the high level of comorbidity between alcohol and drug dependence and major depression, and as indexed by a high LOD score of 3.49 for comorbid alcoholism and major depression in prior linkage analyses using COGA data (Nurnberger, Jr. et al., 2001), this region on chromosome 2 may be tapping into a region of susceptibility for both disorders.
The linkage peak on chromosome 10 spans from 42 to 65 cM. GAD2 (glutamic acid decarboxylase 2), a possible candidate in this region, has been putatively implicated, although not unequivocally, for its role in acute ethanol consumption and withdrawal in animal models (Fehr et al., 2003). Two recent association studies in humans, however, report negative findings for the role of GAD2 in the risk for alcoholism, while another found limited evidence for the role of this gene in association with anxiety disorders, depression and neuroticism (Hettema et al., 2006; Lappalainen et al., 2007; Loh et al., 2006). A smaller linkage peak at 90-110 cM encompasses candidate genes such as HTR7 (5-hydroxytryptamine receptor 7) and VMAT2 (vesicular monoamine transporter 2), as well as KCNMA1 (calcium-activated potassium channel). SNPs in VMAT2 and KCNMA1 have been shown to be associated with alcoholism and the endophenotype of subjective responses to ethanol (Lin et al., 2005; Schuckit et al., 2005). Also located at 83 cM is CTNNA3 (catenin alpha 3), which was recently identified in genomewide association studies of substance dependence vulnerability and nicotine dependence (Bierut et al., 2007; Liu et al., 2006).
Finally, our finding on chromosome 14 for cannabis dependence, while modest, is in a region of biological interest. Our linkage peak on chromosome 14 harbors candidate genes such as GPR68 (G-protein coupled receptor 68), involved in cAMP regulation; CKB (creatine kinase, brain), encoding a cytoplasmic enzyme involved in energy homeostasis; and SERPINA1 and SERPINA2 (serine-peptidase inhibitor, clade A, members 1 and 2), which were previously identified in the genomewide association study for substance abuse vulnerability by Liu and colleagues (2006) (and also by Bierut et al., 2007, for SERPINA1 and nicotine dependence).
In the past, linkage analyses in COGA have been extremely successful in identifying putative candidate genes which have led to positive association results (Dick et al., 2006; Edenberg et al., 2006b). We have now identified additional genomic regions that may harbor genetic loci that contribute to the etiology of alcohol and illicit drug dependence. Future efforts will target candidate genes in these regions for association analyses. This continued effort at understanding the biological basis of alcohol and drug use disorders is critical as, even today, the challenge posed by the considerable morbidity and mortality due to alcohol and drug problems is a substantial concern (Centers for Disease Control (CDC), 2004; Williams et al., 1988).
In memory of Henri Begleiter and Theodore Reich, Principal and Co-Principal Investigators of COGA since its inception; we are indebted to their leadership in the establishment and nurturing of COGA, and acknowledge with great admiration their seminal scientific contributions to the field.
Acknowledgement and Role of Funding Source: This national collaborative study is supported by the NIH Grant U10AA008401 from the National Institute on Alcohol Abuse and Alcoholism (NIAAA) and the National Institute on Drug Abuse (NIDA). A.A. is also funded by DA023668. L.J.B. is supported by DA13423, DA019963 & PO1 CA89392. SNP data were obtained through participation in a collaborative effort between COGA, organizers of the 14th Genetic Analysis Workshop (GAW), the Center for Inherited Disease Research (CIDR: CIDR is fully funded through a federal contract from the National Institutes of Health to the Johns Hopkins University, contract number N01-HG-65403), Illumina and Affymetrix. Funding sources were not involved in conducting these analyses.
Contributors: The Collaborative Study on the Genetics of Alcoholism (COGA), Co-Principal Investigators B. Porjesz, V. Hesselbrock, H. Edenberg, L. Bierut, includes nine different centers where data collection, analysis, and storage take place. The nine sites and Principal Investigators and Co-Investigators are: University of Connecticut (V. Hesselbrock); Indiana University (H.J. Edenberg, J. Nurnberger Jr., P.M. Conneally, T. Foroud); University of Iowa (S. Kuperman); SUNY Downstate (B. Porjesz); Washington University in St. Louis (L. Bierut, A. Goate, J. Rice); University of California at San Diego (M. Schuckit); Howard University (R. Taylor); Rutgers University (J. Tischfield); Southwest Foundation (L. Almasy). Zhaoxia Ren serves as the NIAAA Staff Collaborator.
All authors provided feedback and revisions and approved the manuscript for submission. In addition to data collection (see above), individual contributions to this paper, in alphabetical order, are as follows: Agrawal: Conducted analyses, responsible for first draft and incorporating feedback into all subsequent drafts; Bertelsen: Conducted analyses; Bierut: Phenotypic & Bioinformatics expertise; on-site supervision; Bucholz: Phenotypic expertise; Cloninger: Phenotypic and statistical expertise; Dick: Phenotypic and statistical expertise; Dunn: Conducted analyses; Edenberg: Phenotypic and bioinformatics expertise; Foroud: Phenotypic and bioinformatics expertise; Goate: Bioinformatics expertise; Grucza: Phenotypic expertise; Hesselbrock: Phenotypic expertise; Hinrichs: Conducted analyses, statistical/bioinformatics expertise; Kramer: Phenotypic expertise; Kuperman: Phenotypic expertise; Nurnberger: Phenotypic and statistical expertise; Porjesz: Phenotypic and statistical expertise; Saccone N: Bioinformatics expertise; Saccone S: Bioinformatics expertise; Schuckit: Phenotypic and statistical expertise; Wang: Bioinformatics expertise;
Conflict of Interest: None