To our knowledge, this is the first GWA study for multiple substance dependence traits performed simultaneously in two cohorts from distinct populations. Recognizing the limited power of our sample, which was ascertained through and composed primarily of sib pairs affected with CD or OD, we aimed to take advantage of the combined sample to generate new hypotheses about the genetic basis of substance dependence by identifying and prioritizing candidate genes or chromosome regions implicated in both samples as targets for further investigation.
The most significant finding in this study was the association of CIP with rs1133503 (p = 0.00005 in the combined sample), which is located in the 3′UTR of the α-endomannosidase (MANEA) gene. Absence or defective function of lysosomal α-endomannosidase could result in accumulation of undegraded mannose-rich oligosaccharides that can induce progressive neurologic deterioration and premature death (Crawley and Walkley 2007). From a biological perspective, the most noteworthy finding among the significant results for CD was the association with rs8929 (p = 0.003 in the combined sample). This SNP is located in the 3′UTR of the gene encoding synaptotagmin XIII (SYT13), which belongs to a family of proteins serving as calcium sensors in facilitation and asynchronous neurotransmitter release (Saraswati et al. 2007). These calcium sensors regulate baseline synaptic transmission and short-term synaptic plasticity, and may play a key role in the etiology of substance dependence. We also detected significant association (p = 0.0003 in the combined sample) of OD with a SNP (rs770124) in the neuron navigator 3 (NAV3) gene; neuron navigators are expressed predominantly in the nervous system and are involved in axon guidance (Maes et al. 2002). A SNP (rs8688) in the TTC9 gene on chromosome 14 was significantly associated with OD in our study (p = 0.003 in the combined sample) and is located approximately 5 cM from a linkage peak for OD in an ethnically mixed sample from New York City (Lachman et al. 2007). TTC9 encodes tetratricopeptide repeat domain 9, a hormonally regulated protein whose function is not yet clear. We have shown previously that another TTC gene, TTC12, is associated with nicotine (Gelernter et al. 2006b) and alcohol (Yang et al. 2007) dependence.
Our GWA study identified association of FTND with two SNPs from unlinked genes, PLEKHG1 on chromosome 6 and PHLPP on chromosome 18, which encode homologous proteins involved in cell signaling. Each of the encoded proteins contains a pleckstrin homology (PH) domain, which plays a key role in cell signaling and cytoskeletal regulation by binding to phosphoinositides (Harlan et al. 1994). PHLPP is involved in the selective termination of PI3K/Akt signaling pathways (Brognard et al. 2007), which could be activated by nicotine via the nicotinic acetylcholine receptors (Carlisle et al. 2007).
is located under a broad linkage peak for a smoking-related quantitative trait in an independent sample of EA and AA families (Li et al. 2008). A recent study showed that a mutation in the PH domain of PLEKHG5, another member of the PLEKHG family, causes lower motor neuron disease (Maystadt et al. 2007). According to the UniGene expression profile database (http://www.ncbi.nlm.nih.gov/sites/entrez?db=unigene), PLEKHG1 and PHLPP are both expressed in the brain and peripheral nervous system. It is possible that variants or isoforms of these PH domain-containing proteins have an impact on the cell signaling pathway that regulates neuronal plasticity, and thus could influence predisposition to ND.
The use of GWA is increasingly recognized as a promising approach to identify common genetic variants that contribute substantially to the risk of human disease (Risch and Merikangas 1996; Kruglyak 1999; Hirschhorn and Daly 2005; Christensen and Murray 2007), and there is an impressive list of robust associations for several complex disorders (The Wellcome Trust Case Control Consortium 2007). As discussed above, the results from our study that were strongest statistically also make sense biologically, which is encouraging. Nonetheless, highly significant genetic association findings for complex traits are not often replicated, and thus must be interpreted cautiously (Colhoun et al. 2003; Tabor et al. 2002). In response to this problem, an expert panel (Chanock et al. 2007) suggested several criteria for establishing replication of genetic associations, including: (1) replication studies should be conducted in independent data sets of sufficient sample size to distinguish convincingly the proposed effect from no effect; (2) the same or a very similar phenotype should be analyzed; (3) similar magnitude of effect and significance should be demonstrated with the same SNP or a SNP in high linkage disequilibrium with the prior SNP; and (4) a joint or combined analysis should lead to a smaller p-value than that seen in the original report. Two aspects of our study address these guidelines. First, because our results were obtained from family-based samples and by comparing allele transmission rates, they are unlikely to be caused by stratification within a population group. Second, our criteria for selecting SNPs or genes for further consideration included significant results in both population samples with the same pattern of association. Consistent results from independent samples of distinctive genetic background not only lessen the concern that the results are due to chance, but also increase the likelihood that the association is generalizable. In addition, we took advantage of a rich dataset containing detailed information on dependence on several psychoactive drugs (for which diagnosis has been shown to be reliable), conducting a simultaneous search for potential candidate genes influencing several substance dependence traits. The benefit of a single large and well-characterized population was recently demonstrated in a GWA study of seven common diseases in a British population (The Wellcome Trust Case Control Consortium 2007). Similarly, our findings offer a set of candidates for future genetic studies of substance dependence traits.
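The selection rule described in this paragraph (significance in both population samples with the same pattern of association) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis pipeline used in the study: the record layout, the use of the sign of a test statistic to represent the direction of association, the SNP labels, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SnpResult:
    """Hypothetical per-SNP summary for two population samples."""
    snp: str        # placeholder label, not a real rs number
    p_pop1: float   # association p-value in population sample 1
    z_pop1: float   # signed test statistic in sample 1 (sign = direction)
    p_pop2: float   # association p-value in population sample 2
    z_pop2: float   # signed test statistic in sample 2


def consistent_in_both(results, alpha=0.05):
    """Return SNP labels meeting the threshold in both samples with the same sign."""
    selected = []
    for r in results:
        both_significant = r.p_pop1 <= alpha and r.p_pop2 <= alpha
        same_direction = (r.z_pop1 > 0) == (r.z_pop2 > 0)
        if both_significant and same_direction:
            selected.append(r.snp)
    return selected


# Made-up values chosen only to exercise each branch of the rule.
example = [
    SnpResult("snp_A", 0.010, 2.6, 0.030, 2.2),   # significant twice, same direction
    SnpResult("snp_B", 0.010, 2.6, 0.030, -2.2),  # significant twice, opposite direction
    SnpResult("snp_C", 0.001, 3.3, 0.200, 1.3),   # significant in one sample only
]

if __name__ == "__main__":
    print(consistent_in_both(example))
```

In this toy input only `snp_A` passes. `snp_C` shows the cost of the rule noted elsewhere in the discussion: a strong signal confined to one population is discarded.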
To avoid high genotyping cost and multiple testing problems, GWA studies often follow a staged design, in which a large number of markers are genotyped in a portion of the sample in the first stage, and a relatively small number of markers showing association in the discovery dataset are genotyped in the remainder of the sample in the second stage. Association test findings in the second stage are usually considered to be a replication. However, in spite of the recommendations for stringent significance levels in the discovery sample, Skol et al. (2006) demonstrated that analysis of a single undivided dataset often has greater power to detect association than the two-stage design. Although our GWA study included two datasets derived from a single study population and thus appears to conform to the staged design, we treated the datasets as independent discovery samples since they are genetically distinct and thus may have some unique genetic associations with substance dependence (which we had to forgo identifying owing to our requirement for significance in each dataset individually). We capitalized instead on the opportunity to replicate findings within the discovery sample.
Our results should be interpreted cautiously in light of several limitations of our study design. First, we analyzed 5,240 SNPs, a number that is much smaller than that of contemporary high-density GWA studies and insufficient to cover most gene regions. Many genes influencing risk for substance dependence traits were probably not detected because the SNP array panel in our study included SNPs from fewer than 10 percent of all known genes. Second, the FBAT approach is one of the most conservative methods for genetic association analysis and is less powerful than methods used in population-based designs, due in part to families that are uninformative for the transmission component of the association test (Van Steen et al. 2005). Third, none of the results in our study would be considered significant after adjustment for multiple comparisons using a Bonferroni correction (threshold p = 0.05/5240 ≈ 0.00001). However, since all of the results proposed for follow-up required evidence for association in each data set, this correction is probably overly conservative. Moreover, given our requirement that a result attain a p-value of 0.05 in both population samples to be considered significant, the expected number of findings for a trait surpassing this threshold would be approximately six (i.e., (0.05)² × 5240 × 0.5 ≈ 6.55) assuming a one-tailed test. Seven or more significant results were obtained with CD, OD and CIP. Finally, our selection criteria ignored potential true associations that are evident in only one population. Population-specific associations may account for the lack of correspondence, within the same dataset, between the association signals reported here and the linkage peaks for these traits, each of which was found in only one population (Gelernter et al. 2005; Gelernter et al. 2006a; Gelernter et al. 2007). Given that the purpose of this study was hypothesis generation rather than hypothesis testing, the latter two concerns would be lessened by follow-up studies involving more detailed analysis of candidate genes and testing in additional populations.
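The two multiple-testing quantities quoted in this paragraph follow from simple arithmetic, reproduced in the short Python sketch below. The function names are illustrative, and the factor of 0.5 is taken directly from the one-tailed assumption stated in the text rather than derived independently. The calculation also treats the 5,240 tests as independent, which is a simplification.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def expected_chance_findings(alpha: float, n_tests: int,
                             direction_factor: float = 0.5) -> float:
    """Expected number of SNPs reaching `alpha` in both samples by chance.

    alpha**2 is the chance that one null SNP meets the threshold in two
    independent samples; `direction_factor` is the 0.5 used in the text
    under its one-tailed assumption.
    """
    return alpha ** 2 * n_tests * direction_factor


N_SNPS = 5240   # number of SNPs analyzed, as stated in the text
ALPHA = 0.05    # nominal per-sample significance level

if __name__ == "__main__":
    print(f"Bonferroni threshold: {bonferroni_threshold(ALPHA, N_SNPS):.2e}")
    print(f"Expected chance findings per trait: "
          f"{expected_chance_findings(ALPHA, N_SNPS):.2f}")
```

The first value is about 9.5 × 10⁻⁶, which the text rounds to 0.00001, and the second is about 6.55, which the text reports as six.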
In summary, our GWA study identified several novel candidate genes for six substance dependence traits in sets of families from two distinct populations. This illustrates the merits of a GWA approach using distinct population samples in the discovery (i.e., hypothesis generating) stage. The results of this approach will encourage future investigations of the identified associations using this and other datasets.