|Home | About | Journals | Submit | Contact Us | Français|
Conceived and designed the experiments: IM PG ECL SBB PPM. Performed the experiments: PG ECL. Analyzed the data: IM PG. Contributed reagents/materials/analysis tools: SBB PPM. Wrote the paper: IM PG.
Autism spectrum disorder (ASD) is one of the most prevalent and highly heritable neurodevelopmental disorders in humans. There is significant evidence that the onset and severity of ASD is governed in part by complex genetic mechanisms affecting the normal development of the brain. To date, a number of genes have been associated with ASD. However, the temporal and spatial co-expression of these genes in the brain remain unclear. To address this issue, we examined the co-expression network of 26 autism genes from AutDB (http://mindspec.org/autdb.html), in the framework of 3,041 genes whose expression energies have the highest correlation between the coronal and sagittal images from the Allen Mouse Brain Atlas database (http://mouse.brain-map.org). These data were derived from in situ hybridization experiments conducted on male, 56-day old C57BL/6J mice co-registered to the Allen Reference Atlas, and were used to generate a normalized co-expression matrix indicating the cosine similarity between expression vectors of genes in this database. The network formed by the autism-associated genes showed a higher degree of co-expression connectivity than seen for the other genes in this dataset (Kolmogorov–Smirnov P=5×10−28). Using Monte Carlo simulations, we identified two cliques of co-expressed genes that were significantly enriched with autism genes (A Bonferroni corrected P<0.05). Genes in both these cliques were significantly over-expressed in the cerebellar cortex (P=1×10−5) suggesting possible implication of this brain region in autism. In conclusion, our study provides a detailed profiling of co-expression patterns of autism genes in the mouse brain, and suggests specific brain regions and new candidate genes that could be involved in autism etiology.
Autism spectrum disorder (ASD) is a complex neurodevelopmental condition associated with many different genes. However, the neuroanatomical and functional properties of these genes in the brain are largely unknown. Here we examined the co-expression network of 26 genes associated with ASD, using data from the Allen Mouse Brain Atlas, which provides a whole-genome, high-resolution map of gene expression pattern in the adult mouse brain. We discovered that autism genes are significantly more co-expressed than expected by chance, suggesting common neuro-functional properties. We then examined the spatial properties of co-expression modules that are highly enriched with autism genes. Consequently, we found that genes in two of these modules are significantly over-expressed in the cerebellar cortex, and particularly in sections that are predominantly populated by granular cells. These findings provide the essential link between gene networks associated with ASD and specific brain regions, and hence lay out a basis for further exploration of the particular neuronal circuits involved in ASD etiology.
Autism spectrum disorder (ASD) is one of the most prevalent and highly heritable neurodevelopmental disorders in humans –. There is strong evidence that the onset and severity of ASD is governed in part by complex molecular mechanisms affecting the normal development of the brain , . While no major anatomical pathology have been observed in brains of ASD cases , various molecular and neuroimaging studies have linked several brain regions to ASD. For example, Voineagu et al. have found differences in gene expression patterns in the cortex of ASD brain . Cortical regions has also been highlighted in neuroimaging studies of autistic brains along the cerebellum and other brain areas , . In addition, other studies have pointed to various molecular mechanisms that might be altered in the autistic brain –. In this realm, genes involved in synapse formation and brain circuitry are consistently found to be dysregulated in people with ASD –.
Recent genomic advances have led to the discovery of diverse genetic loci linked to ASD, including chromosomal aberrations , , copy number variations , ,  and both common and rare single nucleotide variations (SNVs) –. Consequently, to date, more than 330 candidate genes have been associated with ASD susceptibility  and many more are projected to be found. However, despite the plethora of genetic variations associated with ASD, the molecular mechanisms and neuroanatomical structures underlying ASD traits remain largely unclear.
The mouse model system provides a convenient and safe approach to experimentally study neuroanatomical mechanisms and candidate genes for autism susceptibility –. At present, dozens of single gene knockout and transgenic mice models have been used to elucidate neuropathology that might underlie the autism-like behaviors . Despite the obvious genetic and neuroanatomical differences between mouse and human, mouse models are extremely valuable and effectively used in dissecting out the role of specific gene, pathway, neuron subtype, or brain region in a particular abnormal behavior shared by both these mammals. In this realm, the Allen Brain Atlas of the mouse – provides a comprehensive source of genome-wide high-resolution atlas of gene expression throughout the adult mouse brain. In this study, we utilized this database to examine the spatial co-expression characteristics of genes associated with autism susceptibility. Consequently, we identified several co-expression gene networks that are enriched with autism genes highlighting potential candidate genes and brain regions implicated in autism.
The autism gene database, AutDB , was used to construct a set of 84 genes, AutRef84, strongly implicated in autism pathogenesis , . Genes contained in AutDB were classified into four genetic categories: (1) Rare: rare submicroscopic copy number variants and single gene disruptions or mutations directly linked to ASD; (2) Syndromic: genes implicated in syndromes in which a subset of affected individuals also develop autistic symptoms; (3) Association: small risk-conferring candidate genes with common polymorphisms that have been identified from genome-wide association studies in idiopathic ASD cases; and (4) Functional: functional candidate genes relevant to ASD that are not covered by any of the other genetic categories. AutRef84 was generated by filtering out functional candidate genes that lack any experimentally derived genetic link to ASD as well as genes that solely belong to the Association classification. The resulting dataset consisted of 64 genes classified within the rare classification and 20 within the syndromic classification (Supplementary Table S1).
We utilized gene expression data available from the Allen Brain Atlas (ABA) of the mouse brain (http://mouse.brain-map.org) which contains voxelized expression profiles for ~20,000 genes derived from in situ hybridization (ISH) experiments conducted on male, 56-day old C57BL/6J mice) –. Gene expression profiles in the ABA were generated for the mouse brains from processed image data for the sagittal sections of a single hemisphere and, for 4,104 genes of high neurobiological interest, coronal and sagittal sections across the whole brain. We focused on genes for which brain-wide data are available. The ISH data were co-registered to the Allen Reference Atlas, which is partitioned into 49,742 cubic voxels of size 200 microns. A gene labeled g has a profile of expression energy presented as a function E(v,g) over the brain, where v is a voxel label. The Allen Mouse Brain Atlas Addiction Database (http://addiction.brainarchitecture.org/) used this expression dataset to create a high quality brain expression dataset, called Abest, which consists of 3,041 genes with the most highly correlated expression energy profiles between the coronal and sagittal sections . This gene-expression dataset was used for the co-expression analysis discussed below.
For the set of 3,041 genes defined above, a gene-by-gene co-expression matrix was computed as follows:
where V=49,742 is the total number of voxels in the brain. The quantity C(g,g') is the cosine similarity between the gene-expression profiles of genes g and g' with values ranging between zero and one by construction. The motivation for choosing the cosine similarity as the co-expression measure in this study, is its connection to the difference between the energies of two brain-wide functions. The proposed co-expression measure (Equation 1) can also be described as a simple decreasing function of the squared energy of the difference between two brain-wide functions. The co-expression matrix C is symmetric, with ones on the diagonal. Its size is 3,041, the number of genes in the full Abest dataset. Given a subset of this dataset, the co-expression matrix of this subset can be obtained by extracting the sub-matrix of C corresponding to the position of the genes in the full dataset.
To determine whether the autism genes are more co-expressed than other genes expected by chance, the cumulative distribution function (CDF) of the entries of the autism-related and Abest co-expression matrices were compared, and a two-sample Kolmogorov–Smirnov test was used to determine whether they were drawn from the same probability distribution. In addition, we compared the connectivity of the co-expression network of the autism genes to co-expression networks of the same size randomly drawn from Abest. For that, Monte Carlo simulations  were conducted to generate 100,000 random gene sets sampled from Abest, and the co-expression matrices of these sets of genes where extracted from the co-expression matrix C.
Given the co-expression matrix of G genes of interest, one can consider the underlying weighted graph with nodes corresponding to genes, and the weight of links equals to the co-expression of the nodes . The matrix can be cut at any value ρ between zero and one, resulting in links with weights lower than the threshold ρ. At any value of the threshold, the connected components (sets of connected genes) can be computed using Tarjan's algorithm. In particular, the maximum size M(ρ) and average size A(ρ) of connected components can be calculated for all gene sets over different co-expression values. If Nρ(k) is the number of connected components in the co-expression matrix of G genes defined by ρ threshold and contain exactly k genes out of G genes in the gene set, the maximum and average sizes are expressed as follows:
Further, we applied the classical definition of “clique” from graph theory ,  to our co-expression matrix to characterize networks of co-expressed genes such that every gene in the network is connected to all other genes at a co-expression threshold ρ.
Next, we aimed at evaluating the unique anatomical properties of co-expression cliques that are significantly enriched with autism genes. For that, we first identified virtually all cliques in the dataset that contained ng≥2 autism genes. Then, we used Monte Carlo simulations using 100,000 randomly generated gene sets of size G=26 to compute the likelihood of each clique to contain at least ng autism genes. Finally, for cliques that were significantly enriched with autism genes, we calculated the sum of the normalized expression profiles of the genes in the clique as follow:
We then used Sclique, to examine the neuroanatomical properties of the genes included in the clique as follows . For a given brain region ω in the Allen Reference Atlas , the fitting score between the expression profile and the region ω is defined as the cosine similarity between the expression profile and the characteristic function χω of the brain region.
where χω(v) equals one if voxel labeled v belongs to region ω, and zero otherwise. This fitting score would equal one if the sum of the gene expression in the clique were proportional to the characteristic function of the region ω, and zero if it were entirely supported outside the region. The fitting score defined in the above equation equals the co-expression between a (hypothetical) gene whose expression profile would be Sclique, and a (hypothetical) gene whose expression would coincide exactly with region ω. Fitting scores of a given clique can be computed in all brain regions, and the distribution of these fitting scores can be simulated by repeatedly drawing random cliques of genes (with the same number of genes) from Abest. These analyses were performed using a commercial software package (MATLAB R2011b, The MathWorks Inc., Natick, MA, 2000).
Finally, we used the Bioconductor GOstats package in R software  to assess whether genes belonging to a co-expression clique, also share other functional or molecular properties. The absolute list of GO terms were obtained using both a (a) cut-off=2*ratio (fg/fc) [where fg=frequency of occurrence of a GO term in the given gene set, fc=frequency of its occurrence in the complete list of human genes] and (b) cut-off=median value of ratio (fg/fc). Only significant terms (P<0.01) with an associated gene count>=5 were considered.
Overall, 26 genes were found in the intersection of the autism-genes dataset and the dataset of high-quality expression genes from the Allen Brain Atlas (ABA) of the mouse brain (AutRef84∩Abest=26). These autism-related genes showed a higher degree of co-expression connectivity than all other genes in this dataset (Kolmogorov–Smirnov P=5×10−28). Comparing the empirical distributions of co-expression values of the autism genes to the other genes in the Allen dataset revealed that the largest deviation between these distributions was at co-expression value of 47.53% ( Figure 1A ). Furthermore, we evaluated the connectivity of genes across different co-expression values. Here too, the average size of connected components (See methods) among autism gene was consistently larger than seen in 1000 randomly generated gene sets from the Allen database ( Figure 1B ).
Next, we examined cliques (see methods) of co-expressed genes delineated by autism genes inter-connected at co-expression values of ≥47.53%. A total of 59 overlapping cliques were characterized containing on average 563.5 genes and 8.3 autism genes ( Figure 2 ). Finally, using Monte Carlo simulations, we identified ten cliques that were significantly enriched with autism genes at P<0.01 ( Figure 3 ). Of note, the two top ranked cliques remained significant (P<0.05) even after accounting for multiple testing using the conservative Bonferroni correction.
The top-ranked clique in our analysis (hereafter will be referred as Clique I), was delineated by the autism genes: Ptchd1, Galnt13, Dpp6 and Astn2, and included another 29 genes, inter-connected with a co-expression values of ≥70% (Supplementary Table S2). The second top-ranked clique (hereafter will be referred as Clique II) included the autism genes: Rims3, and Astn2, and another four genes, all inter-connected at ≥79% co-expression level (Supplementary Table S3). Examining the neuroanatomical expression properties across a set of 134 brain regions of the left hemisphere (41 of which are cortical, and 93 subcortical) grouped by the 12 main brain regions according to the Allen Reference Atlas, revealed a significant over-expression of genes belonging to Cliques I in the cerebellar cortex ( Figures 4A, 4B , Supplementary Figure S1). Genes belonging to Clique II also showed a slight over-expression in the cerebellar cortex ( Figures 5A, 5B ), as well as in several cortical regions, however these signals were much weaker than the one of clique I. Next, we asked if the over-expression in the cerebellum is a unique property of these two cliques. For that, we examined the neuroanatomical expression of all cliques in Figure 2 and found only seven other cliques showing a similar over-expression in the cerebellar cortex. Interestingly, these cliques had substantial gene overlap with cliques I&II, and were ranked high in their autism gene enrichment scores (Supplementary Table S4), thus supporting the illumination of the cerebellar cortex in this analysis.
Finally, we asked if these two co-expressed cliques are associated with particular cell type or any other functional or cellular property. Examining cell-type-specific microarray data which we have for 64 cell types  revealed that 4 of them (stellate basket cells, granule cells, oligodendrocytes and Purkinje cells) are considerably populating the cerebellum (Supplementary Figure S2). Further, we looked through different coronal sections through the cerebellum from the Allen Reference Atlas  and visually compared them to sections of the sum of expressions of genes in Cliques I & II. Figure 6 shows the normalized volumetric expression quantities of both cliques along with the closest coronal section of the mouse brain in the Allen Reference Atlas. One can see that the voxels with the most intense expression in both cliques tend to follow the granular layer. Hence, the results of these analyses suggest that genes in both clique I & II tend to be over-expressed in granule cells in the cerebellar cortex. Using the Bioconductor GOstats package in R, we two biological processes: “Transmission of nerve impulse” (P=0.001842), and “Ion transport” (P=0.000733) and one cellular component “Vesicles” (P=0.001134) that were enriched with genes from Clique I (Supplementary Table S5). Unfortunately, the number of genes in Clique II was too small for this analysis.
In this study, we explored the co-expression network 26 autism genes within the framework of 3,041 genes exhibiting the highest-quality expression data in the Allen Mouse Brain Atlas database –. The significantly tighter co-expression connectivity among the 26 autism genes than other genes, implies common functional properties for these genes in the mouse brain. Further investigation into the co-expression patterns of these genes revealed two cliques of co-expressed genes that were significantly dominated by autism genes. Genes in both these cliques shown significant over-expression in the cerebellar cortex, and particularly in sections that are predominantly populated by granular cells. Some regions of the cerebral cortex are also highlighted by the second clique ( Figure 5 ), but to a lesser extent than the cerebellar cortex. Another recent study of our group examining the expression of the same autism gene set (AutRef84) in different human tissues, found a statistically significant enrichment in the frontal cortex . The cerebral cortex was highlighted in other neuroanatomical studies of autism in both human ,  and mouse  and is known to play a central role in cognitive and emotional processing , which are key deficits in autism and other neuropsychiatric disorders. In addition, a recent neuroimaging study  highlighted functional sub-regions in the cerebellum as playing a role in both motor and cognitive tasks. Other genes associated with autism have been shown to play a role in the development of this region –. Our results, provide additional support in the potential involvement of the cerebellum in autism etiology, and suggest additional candidate genes that are also over-expressed in the cerebellar cortex.
Two recent transcriptomic analyses in human brains ,  revealed additional co-expression modules enriched with autism-associated genes. Some of these modules partially overlap with our findings in either gene content or brain regions, suggesting common functional and neuroanatomical properties of autism gene in both human and mouse brains. Together, these studies provide new insights into the specific gene networks and brain regions that could be involved in autism etiology.
A major strength of our study is the utilization of the Allen Mouse Brain Atlas – which comprises a high-resolution genome-wide exploration of gene expression in the adult mouse brain. This data allows one to explore gene expression properties up to a resolution of 200 microns, which provide a good distinction between different brain regions as well as potentially tell apart different sub-regions and cell types. Another advantage of this study is the focus on those genes exhibiting the highest expression correlation between the coronal and sagittal sections  as well as restricting the autism gene to a subset demonstrating, to the best of our knowledge, the most compelling associations to autism susceptibility. These strict criteria reduce the chances of erroneous results. Our study has also some pitfalls. First, the analyses were done on data from mouse brains. Since autism is a human condition, one may ask how well finding of this study apply to human brain. A recent study comparing postsynaptic protein composition between mouse and human suggest a high correlation between these two mammals in those matters . Nevertheless, similar analyses in the human brain are still required to provide a finer validity to our findings. In addition, the strict criteria used here, restricted the number of studied genes to 3, 041 and 26 autism genes which are roughly represent 15% and 31% of the genes in the Allen Brain Atlas and AutDB datasets respectively. Such a small number of genes might results in false negatives and hence might miss other co-expression properties and brain regions associated with autism. Hence, larger studies are needed to complement the results of our analysis.
In conclusions, our study provides unique insights into the neuroanatomical co-expression properties of genes associated with autism in the mouse brain and suggest specific regions implicated in autism etiology. Complementing these findings with additional genomics and neuroimaging analyses from both mouse and human brains would help gaining a broader picture of the autistic brain.
ISH images of the autism genes of Clique I. Imaged sections of ISH-treated brains (close to bregma), for (A) Astn2, (B) Galnt13, and (C) Ptchd1 with the cerebellar cortex clearly visible.
Estimated brain-wide density for (A) Stellate Basket Cells, (B) Granule Cells, and (C) Mature Oligodendrocytes. These data correspond to microarray data from , using the data sets estimated in  to have the highest purity. Interestingly, the estimated brain-wide densities are almost zero in the cerebral cortex, suggesting that cell types characterized by their transcriptomes are indeed specific to the cerebellar cortex.
AutRef84: A Reference set of Rare and Syndromic ASD-linked genes.
Co-expression of genes in Clique I. The pairwise co-expression values between all 33 genes in Clique I are depicted and color-coded. The four genes already associated with autism are highlighted in yellow.
Co-expression of genes in Clique II. The pairwise co-expression values between all six genes in Clique II are depicted and color-coded. The two genes already associated with autism are highlighted in yellow.
Overlap of cliques displaying overexpression in the cerebellum with cliques I & II.
Results of the Bioconductor GOstat analysis for genes belonging to Clique I.
AutDB is funded by the Simons Foundation, which licenses it as SFARI Gene (https://gene.sfari.org/autdb/Welcome.do). This work was also supported by NIH-NIDA Grant 1R21DA027644-01, Co-expression networks of genes in the mouse and human brain. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.