A pressing challenge of human genetics is to combine diverse disease-related genetic variations to illuminate pathways and networks affected in common disorders. Schizophrenia represents an important example of a common psychiatric disorder in which a statistically significant contribution to disease susceptibility has now been demonstrated for different types of genetic variations. Specifically, several genomic loci associated with common human polymorphisms have been implicated by genome-wide association studies (GWAS) [1–4], a contribution from de novo and rare copy number variants (CNVs) has been established [5–7], and a significant contribution from de novo single nucleotide variants (SNVs) was demonstrated in a recent study based on exome sequencing in two populations [8].
Biological networks provide a natural framework for integration of diverse genetic variations associated with such a complex and multifactorial phenotype as schizophrenia [9,10]. To identify affected molecular networks, we have developed an algorithm (NETBAG+) that searches for cohesive clusters of genes perturbed by disease-associated genetic variations (Fig. 1a). The approach is based on the previously described phenotype network [11], which assigns every pair of human genes a score proportional to the likelihood ratio that these genes are involved in the same genetic phenotype (Online Methods). The phenotype network was used previously to identify a functionally cohesive gene cluster perturbed by de novo CNVs in autism [11]. The new NETBAG+ approach is able to integrate data from multiple types of genetic variation: SNVs, CNVs and GWAS-implicated loci. A greedy search algorithm identifies highly connected gene clusters that are affected by genetic variations, and the significance of the identified clusters is then established using an appropriate randomization (Online Methods). Although we and others have previously developed several methods to identify and analyze disease-related gene networks [11–15], to our knowledge NETBAG+ is the first principled approach for integration of diverse sources of genome-wide genetic variation under a unified framework. The statistical power of this integrative approach stems from the convergence of different types of genetic variations on a set of interrelated molecular processes.
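The two steps described above, a greedy search for a highly connected gene cluster followed by a randomization test, can be illustrated with a minimal Python sketch. This is not the NETBAG+ implementation. The function names, the cluster score (a plain sum of pairwise weights), the uniform random sampling of genes, and the toy network are all assumptions made for illustration; the actual scoring and randomization procedures are those given in the Online Methods.

```python
import itertools
import random


def pair_weight(weights, a, b):
    """Score of an unordered gene pair; pairs absent from the network count as 0."""
    return weights.get(frozenset((a, b)), 0.0)


def cluster_score(genes, weights):
    """Assumed scoring rule: sum of pairwise weights inside the gene set."""
    return sum(pair_weight(weights, a, b)
               for a, b in itertools.combinations(sorted(genes), 2))


def greedy_cluster(candidates, weights, size):
    """Grow a cluster from every seed gene and keep the best-scoring one."""
    best, best_score = set(), float("-inf")
    for seed in sorted(candidates):
        cluster = {seed}
        while len(cluster) < size and len(cluster) < len(candidates):
            # Add the candidate most strongly connected to the current cluster.
            nxt = max(sorted(candidates - cluster),
                      key=lambda g: sum(pair_weight(weights, g, m) for m in cluster))
            cluster.add(nxt)
        score = cluster_score(cluster, weights)
        if score > best_score:
            best, best_score = cluster, score
    return best, best_score


def empirical_p_value(observed, n_candidates, all_genes, weights, size,
                      n_random=200, seed=0):
    """Fraction of random gene sets whose best cluster scores at least `observed`.

    Assumption: random sets are drawn uniformly from all genes. A realistic
    randomization would preserve properties of the real variant data.
    """
    rng = random.Random(seed)
    pool = sorted(all_genes)
    hits = 0
    for _ in range(n_random):
        random_candidates = set(rng.sample(pool, n_candidates))
        _, score = greedy_cluster(random_candidates, weights, size)
        if score >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_random + 1)


# Hypothetical toy network: genes A, B and C are tightly connected.
all_genes = set("ABCDEFGH")
weights = {
    frozenset(("A", "B")): 3.0,
    frozenset(("A", "C")): 2.0,
    frozenset(("B", "C")): 2.5,
    frozenset(("D", "E")): 0.1,
}
# Genes hit by any variant type (SNV, CNV or GWAS locus) pooled into one set.
candidates = {"A", "B", "C", "D"}

cluster, score = greedy_cluster(candidates, weights, size=3)
p_value = empirical_p_value(score, len(candidates), all_genes, weights, size=3)
print(sorted(cluster), score, p_value)
```

In this toy example, pooling candidate genes from all variant types into a single set before the search stands in for the integration step: a cluster is significant only if genes from the different sources converge on the same strongly connected part of the network.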
Figure 1. The NETBAG+ approach and the identified schizophrenia gene clusters. (a) The NETBAG+ algorithm: different types of genetic variations are mapped to a phenotype network (pale gray) in which every pair of genes is assigned a score proportional to the likelihood ...
Here we applied the NETBAG+ algorithm to integrate several unbiased whole-genome data sets associated with schizophrenia. We identified several cohesive gene networks related to the disorder and characterized their biological and cellular functions. We also investigated the expression of the network genes in the brain. Finally, we examined the relationship between the genes forming the identified schizophrenia networks and genes associated with other neurodevelopmental disorders, such as autism and intellectual disability.