Addictions are chronic and common brain disorders affected by many genetic, environmental, and behavioral factors. Recent genome-wide linkage and association studies have revealed several promising genomic regions and multiple genes related to addictions. To explore the biological processes underlying the development of addictions, we used 62 genes recently reviewed by Li and Burmeister (2009) as representative addiction-related genes and investigated their features in gene function, pathways, and protein interaction networks. We performed enrichment tests of their Gene Ontology (GO) annotations and of their pathways in the Ingenuity Pathways Analysis (IPA) system. The tests revealed that these addiction-related genes were highly enriched in neurodevelopment-related processes. Interestingly, circadian rhythm signaling was among the enriched pathways. Moreover, these addiction-related genes tended to have higher connectivity and shorter characteristic shortest-path distances than control genes in the protein–protein interaction (PPI) network. This investigation is the first of its kind in addiction studies, and it is useful for further addiction candidate-gene prioritization and verification, thus helping us to better understand the molecular mechanisms of addictions.
Drug addictions are complex, chronic mental diseases. Genetic studies of twins and families have suggested that genetic factors might account for 30 to 60% of the overall factors contributing to the risk of developing drug addictions. In addition, numerous studies aiming to discover genetic variants or candidate genes, including genome-wide linkage scans, candidate gene association studies, gene expression studies, and genome-wide association studies, have also suggested that multiple genes and genomic regions or markers might play important roles in the development of addictions. Therefore, we hypothesized that, in the study of complex diseases such as addictions, a systematic investigation of multiple genes might provide a high-level view of underlying biological processes that would otherwise be missed in individual gene or marker studies.
As many large data sets for addictions have been generated and more are expected, integrating them or arranging them into a proper order has become an important task. Such a task allows us to interpret the data and their related biological information effectively. To meet this demand, a new approach, ‘systems biology of the neuron’, has been proposed in neuropsychiatric genetics for a deeper understanding of mental illnesses at the molecular level. For neuropsychiatric disorders and many other complex diseases, one major problem in applying the systems biology approach is the lack of a complete and error-free candidate gene set. In addiction studies, this issue has recently been largely eased, thanks to a comprehensive review by Li and Burmeister. This review provided a list of genes with one or more variants that have been reported to be associated with more than one addiction phenotype. Although this list is still incomplete, it is error-free and offers us an opportunity to study multiple genes, rather than only one gene at a time, so that the biological processes of addictions might be explored.
Here, we retrieved the 62 genes in Li and Burmeister and considered them as representative addiction-related genes. We first examined the Gene Ontology (GO) terms of these addiction-related genes. Second, we identified addiction-enriched pathways and further explored their relationships by examining their crosstalk. Finally, we explored how strongly these genes interact with, or how close they lie to, each other in the whole human protein–protein interaction (PPI) network or in the addiction-specific network. This systems biology-based investigation provides many insights into the biological processes of addictions, which might provide valuable information for further candidate gene prioritization and the development of more effective diagnoses or interventions.
The GO-term enrichment test is a useful approach in the functional analysis of a set of genes. We performed a GO-term enrichment test of the addiction-related genes by comparing them with the control genes. We found 44 GO terms that were significantly enriched in the addiction-related genes. The details of these GO terms are shown in Table 1.
Among the 44 GO terms, 18 belonged to biological processes, 16 to molecular functions, and 10 to cellular components. Among them, 14 terms were directly related to neurodevelopment, 3 to amino acid metabolism, 19 to common signal transduction and receptor/ion transport activity, and 8 to plasma membrane or membrane. In the biological processes, three terms, ‘synaptic transmission’, ‘transmission of nerve impulse’, and ‘neurological system process’, were ranked first, second, and fourth, respectively. Moreover, 36 genes (58.1%) had annotations of the GO terms ‘synaptic transmission’ and ‘transmission of nerve impulse’, and 39 genes (62.9%) had annotations of ‘neurological system process’. This test indicated that most addiction-related genes are involved in neurodevelopment-related processes. In the GO category ‘molecular function’, 6 GO terms were directly related to neurotransmitter activity or, more specifically, to two neurotransmitters: glutamate and acetylcholine. This observation indicates that these two neurotransmitters might play important roles in the development of addictions. Interestingly, two nicotine-related GO terms stood out in the enrichment results, supporting the view that addictions are psychiatric diseases and that neuronal signaling cascades in the brain are important in the process of addictions.
We searched for pathways overrepresented in the addiction-related genes using Ingenuity Pathway Analysis (IPA). IPA enables researchers to find whether a set of genes is overrepresented in a pathway compared with all known genes. Although this analysis cannot directly prove the biological mechanisms underlying diseases, it can detect biological processes, as represented by the pathways, that are related to addictions.
After applying our statistical criteria to the IPA preliminary results, we identified 13 canonical pathways that were significantly overrepresented (Fisher’s exact test, p-value < 0.001) in the addiction-related genes (Table 2). Among them, six pathways (46.2%) were directly related to neurodevelopment, which further demonstrates the relationship between the pathology of addiction and the development of neuronal systems. Importantly, three neurotransmitter-related pathways stood out: serotonin receptor signaling (ranked third), glutamate receptor signaling (ranked fifth), and dopamine receptor signaling (ranked seventh). Serotonin and dopamine are monoamine neurotransmitters, whereas glutamate is an amino acid neurotransmitter; all three signaling processes are important for signal transduction between neurons. These observations confirmed that drug addictions involve neurobiological changes in the brain. Of note, two pathways, synaptic long-term depression (LTD) and synaptic long-term potentiation (LTP), were in the list. These two pathways are important for the development of synaptic plasticity and have been reported to be related to addictions. This observation further reflects the view that addictions represent a pathological form of learning and memory. Interestingly and importantly, circadian rhythm signaling was overrepresented in the addiction-related genes. This may support the view that addictions are associated with disruptions in circadian rhythmicity. Moreover, consistent with the GO term results, signaling-related pathways were prominent among the enriched pathways: G-protein coupled receptor signaling (ranked first), calcium signaling (ranked second), and cAMP-mediated signaling (ranked fourth). Finally, we also identified three amino acid metabolism pathways: tryptophan metabolism, phenylalanine metabolism, and tyrosine metabolism. Again, these three pathways are consistent with the GO term results (Table 1).
For the 13 enriched pathways, we further explored their relationships by examining their crosstalk (interaction). In a real cellular environment, pathways are likely to interact with or affect each other rather than function independently. For example, p53 is involved in many pathways and regulatory processes in the cell. Assessing pathway crosstalk is challenging because of the complexity of the underlying biological processes. Here, we used a simple approach that assesses the common proteins and protein interactions between any two pathways. There were a total of 78 pathway pairs for the 13 enriched pathways. As shown in Fig. 1, we found that 21 of them were statistically significant (p ≤ 0.001) based on our pathway assessment method (see pathway crosstalk in the Experimental Part). Among these 13 pathways, 8 formed a large cluster, which included four neurodevelopment-related pathways, three signaling-related pathways, and circadian rhythm signaling. Both synaptic long-term potentiation (ranked eighth) and neuropathic pain signaling in dorsal horn neurons (ranked sixth) had seven links to other pathways. It is worth noting that synaptic long-term potentiation plays an important role in drug adaptation in the nervous system. Interestingly, there seems to be no report on the relationship between the pathway ‘neuropathic pain signaling in dorsal horn neurons’ and addictions. This finding might point to an additional biological process involved in addictions.
Protein–protein interactions (PPIs) play critical roles in biological processes, including the development of addictions. Investigation of addiction-related genes in the human PPI network would provide valuable information for elucidating the molecular mechanisms of addictions. Among the 62 addiction-related genes, 52 could be mapped into our reconstructed human interactome. The average degree of the addiction-related genes was 13.06, which was significantly higher than that of the control genes (10.37, Wilcoxon test, p = 0.03). The result suggests that addiction-related genes tended to encode proteins with a higher degree than the control genes. This observation also supports a recent finding that disease genes tend to have higher degrees than non-disease genes.
Fig. 2 shows the degree distribution of the addiction-related genes and control genes by a degree interval of 3. When the degree was 4 or more, the proportion of the addiction-related genes was higher than that of the control genes, except in the degree interval 10–12. Summarized over the degree interval 4–20, the proportion of addiction-related genes (57.7%) was higher than that of control genes (42.0%), indicating that more than half of the addiction-related genes had an intermediate degree, a larger share than among the control genes.
In a network, the shortest-path distance between two nodes is the minimum number of links that must be traversed to travel from one node to the other, and its average measures the network’s overall navigability. The average characteristic shortest-path distance among the addiction-related genes (3.70) was significantly shorter than that of the control genes (3.93, p = 0.0003). Fig. 3 shows the distribution of the characteristic shortest-path distances of the addiction-related genes and control genes. In the interval 1–3, the proportion of proteins encoded by addiction-related genes (41.8%) was larger than that of the control genes (29.1%); conversely, in the interval 4–7, the proportion of addiction-related genes (58.2%) was smaller than that of the control genes (70.8%). The result indicates that addiction-related genes lie closer to each other in the network than the control genes do.
To explore the local environment of the proteins encoded by addiction-related genes, we extracted their network from the whole human protein interaction network using the Steiner minimum tree algorithm. Fig. 4 shows the addiction-specific network. It contains 99 protein nodes and 164 links. Among these 99 nodes, 49 were encoded by the addiction-related genes, which accounted for 94.2% of the addiction-related genes mapped into the human interactome. The high coverage indicates that these addiction-related genes tended to link tightly to each other. In this network, the average degree was 3.31, the average shortest-path distance was 4.38, the average betweenness was 165.5, and the clustering coefficient was 0.15.
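Two of the figures reported for the addiction-specific network follow directly from the node, link, and gene counts given above, and a short check makes the arithmetic explicit. The sketch below assumes an undirected network, so that every link adds one to the degree of each of its two end nodes; the function names are illustrative only.

```python
def average_degree(n_nodes, n_links):
    """Mean degree of an undirected network: each link counts for two nodes."""
    return 2 * n_links / n_nodes


def coverage(n_in_network, n_mapped):
    """Fraction of mapped genes that appear in the extracted network."""
    return n_in_network / n_mapped


# Counts taken from the text: 99 nodes, 164 links, 49 of 52 mapped genes.
print(round(average_degree(99, 164), 2))   # 3.31
print(round(100 * coverage(49, 52), 1))    # 94.2
```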
To test whether this addiction-specific network is random in the whole network, we generated 1000 randomized networks and compared their average shortest-path distance, average betweenness, and average clustering coefficient with the corresponding values of the addiction-specific network. The average shortest-path distance of the addiction-specific network (4.38) was significantly higher than that of the random networks (3.77, empirical p = 0.00). The average betweenness of the addiction-specific network (165.5) was significantly higher than that of the random networks (122.8, p = 0.04). The clustering coefficient of the addiction-specific network (0.15) was approximately four times that of the randomly generated networks (0.04, p = 0.00). These results indicated that the extracted addiction-specific network was not random in the whole network.
In this study, we investigated the features of function, pathways, and global and local PPI networks of the addiction-related genes. The GO term and pathway enrichment tests consistently indicated that these addiction-related genes were strongly associated with neurodevelopmental processes. The topological properties of the proteins encoded by these addiction-related genes in the human interactome suggested that the addiction-related genes tended to encode moderately connected proteins in the network and that the addiction-specific network was not random. Our results suggested that systems-level analysis is useful for exploring the underlying biological processes involved in addictions.
We considered the 62 candidate genes associated with at least one drug addiction, as reviewed by Li and Burmeister in 2009, as representative addiction-related genes. Associations of these genes with addictions were largely supported by linkage scans, association studies, and genome-wide association (GWA) studies. Most of these candidate genes belonged to the alcohol dehydrogenase (ADH) gene cluster, the set of genes encoding nicotinic acetylcholine receptor (nAChR) subunits, gamma-aminobutyric acid A (GABAA) receptor alpha 2 subunit (GABRA2), ankyrin repeat and kinase domain containing 1 (ANKK1), and neurexins.
For comparison purposes, we compiled a control gene set. We first downloaded (in February 2009) the gene information of 27,066 protein-coding genes from the NCBI Gene database. Second, we retrieved (in February 2009) the 2447 disease genes with at least three pieces of evidence from the OMIM (Online Mendelian Inheritance in Man) database. After excluding these disease genes from all the protein-coding genes, we had a total of 24,619 genes, which were considered as control genes.
We applied the web-based gene set analysis toolkit WebGestalt to explore the functional characteristics of the addiction-related genes. During data processing, the addiction-related genes were submitted to WebGestalt as the test genes and the control genes as the reference genes. For each GO term, Fisher’s exact test was performed to detect whether the GO term was significantly overrepresented among the test genes compared with the control genes. We applied the following three criteria to detect enriched GO terms: i) the GO term level should be higher than 3, ii) the number of addiction-related genes involved in each GO term should be higher than 5, and iii) the p-value from Fisher’s exact test should be less than 1.0 × 10⁻⁴.
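The enrichment test and the three filtering criteria can be illustrated with a minimal sketch. It is not WebGestalt's implementation: it assumes a one-sided Fisher's exact test written as a hypergeometric upper tail, treats the reference set as the population from which the test genes are drawn, and uses hypothetical function names and made-up counts.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact test as a hypergeometric upper tail.

    k: test genes annotated with the GO term
    n: test genes in total
    K: reference genes annotated with the GO term
    N: reference genes in total (assumed to contain the test genes)
    Returns P(X >= k) when n genes are drawn at random from the reference.
    """
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return tail / total


def passes_criteria(go_level, n_test_genes, p_value):
    """The three criteria from the text: level > 3, genes > 5, p < 1e-4."""
    return go_level > 3 and n_test_genes > 5 and p_value < 1.0e-4


# Made-up counts, for illustration only: 12 of 62 test genes carry a term
# that annotates 400 of 20000 reference genes at GO level 5.
p = enrichment_p_value(12, 62, 400, 20000)
print(p, passes_criteria(5, 12, p))
```

Exact integer arithmetic with `math.comb` avoids the rounding problems that floating-point factorials would cause for gene sets of this size.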
To obtain a global view of the pathways involved in addictions, we conducted pathway enrichment analysis using Ingenuity Pathway Analysis (IPA). IPA enables researchers to determine whether a set of genes is overrepresented, as compared with all known genes, in one or more functionally defined pathways. We submitted the addiction-related genes to IPA and extracted the canonical pathways. To obtain the enriched pathways, we applied two criteria: i) the score should be higher than 3, and ii) the number of candidate genes involved in a pathway should be more than 3. Here, the score is the negative base-10 logarithm (−log10) of the p-value from Fisher’s exact test.
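The relation between the score threshold and the p-value threshold reported in the results can be made concrete with a small sketch. It assumes the score is the negative base-10 logarithm of the p-value, so that a score above 3 corresponds to p < 0.001; the function names are hypothetical and do not belong to IPA.

```python
import math


def pathway_score(p_value):
    """Negative base-10 logarithm of a Fisher's exact test p-value."""
    return -math.log10(p_value)


def is_enriched(p_value, n_candidate_genes, min_score=3.0, min_genes=3):
    """Both criteria from the text: score > 3 and more than 3 genes."""
    return pathway_score(p_value) > min_score and n_candidate_genes > min_genes


print(pathway_score(1e-5))    # a p-value of 1e-5 gives a score of about 5
print(is_enriched(1e-5, 4))   # passes both criteria
print(is_enriched(1e-5, 3))   # fails: only 3 candidate genes
```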
To explore the relationships among these pathways, we assessed their pathway crosstalk. We considered two parameters: common proteins (nodes) and common interactions (edges) between any two pathways. For any pair of addiction-enriched pathways, we first constructed two 2 × 2 contingency tables, one based on common proteins and one based on common interactions, and then calculated a p-value for each by Fisher’s exact test. Each contingency table includes the following four counts: n, N–n, r, and R–r, where n denotes the count of the common nodes (or links) between the two tested pathways, N denotes the count of total nodes (or links) of the two tested pathways, r denotes the average number of common nodes (or links) between all possible pairs of addiction-enriched pathways, and R denotes the average number of nodes (or links) of all possible pairs of addiction-enriched pathways. We adjusted these p-values for the false discovery rate (FDR) using the Benjamini–Hochberg procedure. Therefore, for each pair of pathways, we had two p-values (p_nodes and p_links). We used the smaller p-value to evaluate the pathway crosstalk, with a significance level of 0.001.
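The crosstalk assessment described above can be sketched as follows. Several details are assumptions rather than statements of the method used: Fisher's exact test is taken as one-sided, the averages r and R are rounded to integers so that they can enter a contingency table, and the node-based and link-based p-values are each adjusted as a separate family before the smaller one is compared with 0.001. All function names are hypothetical.

```python
from math import comb


def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2 x 2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)
    num = sum(comb(col1, i) * comb(total - col1, row1 - i)
              for i in range(a, min(row1, col1) + 1))
    return num / denom


def pair_p_value(n, N, r_mean, R_mean):
    """Table [[n, N - n], [r, R - r]]; r and R are rounded (assumption)."""
    r, R = round(r_mean), round(R_mean)
    return fisher_greater(n, N - n, r, R - r)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def crosstalk_calls(p_nodes, p_links, alpha=0.001):
    """Flag pathway pairs whose smaller adjusted p-value is <= alpha."""
    adj_nodes = benjamini_hochberg(p_nodes)
    adj_links = benjamini_hochberg(p_links)
    return [min(pn, pl) <= alpha for pn, pl in zip(adj_nodes, adj_links)]


# Made-up p-values for two pathway pairs, for illustration only.
print(crosstalk_calls([0.0001, 0.5], [0.9, 0.6]))
```

For the 13 enriched pathways, each of the two input lists would hold one p-value per pathway pair, that is, 78 values.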
We mapped these genes into the human interactome and then calculated two basic global network properties: degree and characteristic shortest-path distance. The degree of a node is the number of links connecting it to other nodes in the same network. The characteristic shortest-path distance measures how many links must be traversed to travel from one node to the other nodes in the same gene set. The human interactome was reconstructed by integrating data with experimental evidence from six databases: BIND, DIP, HPRD, IntAct, MINT, and Reactome.
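Both network properties can be computed with a breadth-first search, as in the minimal sketch below. It assumes an undirected, unweighted network, and it takes the characteristic shortest-path distance of a gene set to be the mean shortest-path length over all reachable pairs of genes in that set, measured in the whole network; the source does not spell out this averaging, and the toy network and gene names are made up.

```python
from collections import deque


def build_graph(edges):
    """Adjacency sets for an undirected network; self-loops are ignored."""
    graph = {}
    for u, v in edges:
        if u == v:
            continue
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree(graph, node):
    """Number of distinct interaction partners of a node."""
    return len(graph.get(node, ()))


def bfs_distances(graph, source):
    """Shortest-path length (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_pairwise_distance(graph, gene_set):
    """Mean shortest-path length over reachable pairs within gene_set."""
    genes = [g for g in gene_set if g in graph]
    lengths = []
    for i, g in enumerate(genes):
        dist = bfs_distances(graph, g)
        lengths.extend(dist[h] for h in genes[i + 1:] if h in dist)
    return sum(lengths) / len(lengths) if lengths else float("nan")


# Toy network: a path A-B-C-D with an extra node E attached to B.
toy = build_graph([("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")])
print(degree(toy, "B"))                              # 3
print(mean_pairwise_distance(toy, ["A", "D", "E"]))  # (3 + 2 + 3) / 3
```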
After mapping the addiction-related genes into the human interactome, a network was extracted using the Steiner minimal tree algorithm and considered as the addiction-specific network. The Steiner minimal tree algorithm is designed to construct the smallest network covering as many of the nodes of interest as possible. To test whether the addiction-specific network is random, 1000 randomized networks were generated using the Erdős–Rényi model in the R igraph package. These randomized networks had the same number of nodes and links as the addiction-specific network. Based on these randomizations, p-values were computed by an empirical approach. We first calculated three network measures for each randomized network: betweenness, clustering coefficient, and shortest-path distance. Second, we compared them with the corresponding observed values and counted the number N_i (i = 1, 2, 3) of randomized networks whose betweenness, clustering coefficient, or shortest-path distance, respectively, was higher than the observed value of the addiction-specific network. Finally, we calculated the empirical p-value p_i = N_i/1000 for each of these three network topological measures.
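The randomization test can be sketched in plain Python rather than with the R igraph package used in the study. The sketch assumes the Erdős–Rényi G(n, m) variant, which fixes the number of links, and it shows only one of the three measures, taking the clustering coefficient to be the mean of the local coefficients with nodes of degree below 2 counted as zero. Those choices and all function names are assumptions, not the study's settings.

```python
import random
from itertools import combinations


def random_gnm_edges(n_nodes, n_edges, rng):
    """Erdos-Renyi G(n, m): n_edges distinct links drawn uniformly at random."""
    all_pairs = list(combinations(range(n_nodes), 2))
    return rng.sample(all_pairs, n_edges)


def average_clustering(n_nodes, edges):
    """Mean local clustering coefficient; degree < 2 counts as 0."""
    nbrs = [set() for _ in range(n_nodes)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    total = 0.0
    for u in range(n_nodes):
        k = len(nbrs[u])
        if k < 2:
            continue
        # Count links among the neighbors of u.
        links = sum(1 for a, b in combinations(nbrs[u], 2) if b in nbrs[a])
        total += 2.0 * links / (k * (k - 1))
    return total / n_nodes


def empirical_p(observed, n_nodes, n_edges, measure, n_random=1000, seed=0):
    """Fraction of random networks whose measure exceeds the observed value."""
    rng = random.Random(seed)
    higher = 0
    for _ in range(n_random):
        edges = random_gnm_edges(n_nodes, n_edges, rng)
        if measure(n_nodes, edges) > observed:
            higher += 1
    return higher / n_random


if __name__ == "__main__":
    # Size and observed clustering coefficient taken from the text.
    print(empirical_p(0.15, 99, 164, average_clustering))
```

Betweenness and shortest-path distance would be handled the same way by passing a different `measure` function.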
We acknowledge the financial support of this project by the National Institutes of Health (AA017437, AA017828, and LM009598), the Thomas F. and Kate Miller Jeffress Memorial Trust Fund (J-900), and the NARSAD Young Investigator Award to Z. Z.