|Home | About | Journals | Submit | Contact Us | Français|
Genome-wide transcriptional profiling was used to characterize the molecular underpinnings of neocortical organization in rhesus macaque, including cortical areal specialization and laminar cell type diversity. Microarray analysis of individual cortical layers across sensorimotor and association cortices identified robust and specific molecular signatures for individual cortical layers and areas, prominently involving genes associated with specialized neuronal function. Overall, transcriptome-based relationships were related to spatial proximity, being strongest between neighboring cortical areas and between proximal layers. Primary visual cortex (V1) displayed the most distinctive gene expression compared to other cortical regions in rhesus and human, both in the specialized layer 4 as well as other layers. Laminar patterns were more similar between macaque and human compared to mouse, as was the unique V1 profile that was not observed in mouse. These data provide a unique resource detailing neocortical transcription patterns in a non-human primate with great similarity in gene expression to human.
The mammalian neocortex is characterized by its stereotyped laminar cytoarchitecture and regional variations in cellular architecture that differentiate cortical areas. As emphasized by Brodmann over a century ago through the creation of cytoarchitectonic cortical maps (Brodmann, 1909), cortical organization is conserved across species, particularly between humans and non-human primates (reviewed in (Zilles and Amunts, 2010)). Gene expression is increasingly used as an empirical means of differentiating and delineating cortical areas, for example through identification of area-specific gene markers (Takahata et al., 2009), or boundary mapping based on differences in neurotransmitter receptor expression (Zilles et al., 2004). Whole genome transcriptional profiling has particular potential to elucidate cortical areal specification and specialization through identification of differentially regulated genes and molecular pathways that underlie cytoarchitectural and functional areal identity (Johnson et al., 2009).
The major factor that differentiates different cortical areas is their distinct laminar organization, reflecting the composition of specific cell types within each layer. Work in rodents has shown that the specific cell types that make up different cortical layers have robust and selective molecular signatures. Many gene markers have been identified through mining genome-wide cellular resolution gene expression data resources in the Allen Mouse Brain Atlas (Lein et al., 2007)(http://www.brain-map.org) and by using targeted approaches (Molyneaux et al., 2007). In addition, transcriptional profiling using DNA microarrays or RNA sequencing has been successful in identifying molecular signatures for discrete cortical layers in mice (Belgard et al., 2011; Hoerder-Suabedissen et al., 2009; Wang et al., 2009) using punches or laser microdissection, as well as in specific excitatory and inhibitory cortical cell types using selective genetic or tracer-based cell labeling and live isolation methods (Arlotta et al., 2005; Doyle et al., 2008; Sugino et al., 2006).
In contrast, other studies aiming to identify cortical area-enriched gene expression in humans and non-human primates were performed using macrodissected whole cortex, which yielded few genes that robustly differentiate between cortical areas (Khaitovich et al., 2004; Yamamori and Rockland, 2006). One likely reason for this is methodological variability associated with regional dissections, as precise dissections have yielded significantly more regional differences in the Vervet neocortex (Jasinska et al., 2009) and in developing and adult human brain (Johnson et al., 2009). Additionally, since gene markers differentiating cortical areas have been readily identified in mouse via cellular resolution in situ hybridization databases (Lein et al., 2007), the paucity of areal gene markers identified in primate transcriptional profiling studies might be due to dilution effects resulting from the high degree of cellular heterogeneity in whole cortical samples. Therefore, a more precise approach targeting more homogeneous cortical cell populations may reveal more robust areal signatures as well.
Rhesus macaque provides a tractable non-human primate model system to analyze the transcriptional organization of the primate neocortex. Macaque is genetically and physiologically similar to humans, with a sequence identity of approximately 93% (Gibbs et al., 2007). Many elements of cortical cytoarchitecture are similar in macaque and human, including specialized primary visual cortex and dorsal and ventral visual streams. In this study, we aimed to understand organizational principles of the primate neocortex using transcriptional profiling analysis of individually isolated cortical layers from a variety of well-defined cortical regions in the adult rhesus macaque, and to compare rhesus gene expression patterns in homologous cortical areas and cell types in human and mouse. The entire microarray data set is also available through the NIH Blueprint NHP atlas website (http://blueprintnhpatlas.org).
To analyze transcriptional profiles associated with major laminar and areal axes of cortical organization, laser microdissection (LMD) was used to selectively isolate individual cortical layers in ten discrete areas of the neocortex from two male and two female adult rhesus monkeys. As shown schematically in Figure 1A, these areas spanned primary sensorimotor cortices (S1, M1, A1, and V1), higher order visual areas (V2, MT and TE), and frontal cortical areas (DLPFC, OFC and ACG). In each cortical region, samples were isolated from layers definable on the basis of lightly stained Nissl sections used for the sample preparation, taking care to avoid layer boundaries. In most areas, 5 layers were isolated (L2, L3, L4, L5 and L6), although in M1, OFC and ACG no discernible L4 could be isolated. 8 layers were sampled in V1 (Figure 1B, C) to include the functionally specialized and cytoarchitecturally distinct sublayers of L4 (4A, 4B, 4Ca and 4Cb). For a non-neocortical comparator data set, samples were also isolated from subfields of the hippocampus (CA1, CA2, CA3 and dentate gyrus) and from the magno-, parvo- and koniocellular layers of the dorsal lateral geniculate nucleus (LGN). Collectively, the selected regions allowed for interrogation of differences in gene expression between cortical areas and layers located distal or proximal to each other, and from regions that comprise specific functional types or streams. Representative pre- and post-cut images from each structure are shown in Figure S1, and stereotaxic locations of sampled cortical regions in Table S1. RNA was isolated from LMD samples, and 5ng total RNA per sample was amplified to generate sufficient labeled probe for use on Affymetrix rhesus macaque microarrays.
Multiple analytical methods were used independently to identify the most robust patterns of gene expression. Principle component analysis (PCA) can often illustrate the major organizational features of microarray data sets (Colantuoni et al., 2011), and we initially applied it to the whole sample set comprising 225 cortical, hippocampal and thalamic samples across all 52,865 probes. A significant proportion of the variance was accounted for by the first three components (12.5%, 8.7% and 6.8% respectively; Figure S2). As shown in Figure 2A, samples from major structures (cortex, hippocampus, and thalamus) cluster together, have highly distinct molecular signatures and appear well-segregated. Considering the cortical samples alone, the first three components accounted for a similar proportion of variance (13.6%, 8.5% and 6.6% respectively), and plotting samples by areal or laminar class revealed striking organization along two orthogonal axes reflecting the areal (Figure 2B) and laminar (Figure 2C) dimensions of the neocortex. Remarkably, the spatial relationships between neocortical samples are recapitulated by the transcriptional relationships between samples. Samples align in a rostral to caudal orientation by cortical area along the first principle component horizontally in Figure 2B, and appear tightly clustered in their native laminar order along the second principle component vertically in Figure 2C.
To identify differentially expressed genes, three-way ANOVA of the cortical data set identified large numbers of probes that vary between cortical regions (6,170 at p<10-12), layers (4,923) and individual animals (2,347; Figure 2D, E and Table S2). Importantly, there was a high degree of overlap between the sets of genes varying by cortical region and layer, suggesting that a substantial proportion of the genes differentiating cortical areas vary within specific cortical layers. Gene set analysis of both areal and laminar genes showed enrichment for genes associated with axonal guidance signaling and ephrin receptor signaling, synaptic long term potentiation (LTP) and neuronal activities (Table S2). Gene expression patterns associated with gender and individual animals were also identified by ANOVA (Figure S2), and individual-associated differences were enriched with genes related to metabolism, mitochondria and antigen presentation (Table S2). Gender-specific gene expression was observed both on sex and autosomal chromosomes (Fig S2), and there was significant overlap (p<10-9) between the individual-related genes identified here and gender-related genes identified in human brain (Kang et al., 2011).
We next applied WGCNA to identify sets, or modules, of highly co-expressed genes by searching for genes with similar patterns of variation across samples as defined by high topological overlap (Zhang and Horvath, 2005). Applied to the entire set of neocortical samples, WGCNA revealed a series of gene modules (named here as colors) related to different features of the data set (Figure 2F,G, also Figure 3B,D and Figure 5B,C). Gene assignment to modules and gene ontology analysis for the whole cortex network are shown in Table S3. The majority of these modules correlated with laminar and regional patterns as described below. Several modules were related to gender and individual differences, as previously observed in humans (Oldham et al., 2008). In Figure 2G the lightyellow module was strongly enriched in male versus female samples (upper panel), while the grey60 module was selectively lowest in samples originating from one particular animal. The top (hub) genes in the lightyellow module were on the Y chromosome, including the putative RNA helicaseDDX3Y and the 40S ribosomal protein RPS4Y1.
The most striking features were the robust molecular signatures associated with different cortical layers. As shown in Figure 3, a wide variety of transcriptional patterns were associated with individual cortical layers or subsets of layers. For example, ANOVA of laminar expression in all cortical regions and clustering of these genes identified large gene sets enriched in specific subsets of (generally proximal) layers (Figure 3A). Notably, the majority of these laminar patterns are consistent across different cortical areas, reflecting conserved laminar and cellular architecture across the cortex. Gene set analysis suggests these layer-associated clusters are associated with neuronal function, including neuronal activity, LTP/LTD, calcium, glutamate and GABA signaling (Figure 3A and Table S4). Consistent with functional studies of superficial layer synaptic plasticity, genes and pathways involved in LTP and calcium signaling were most represented in L2 and L3. Pathways related to cholesterol metabolism were enriched in deeper layers, likely reflecting the greater proportion of oligodendrocytes closer to the underlying white matter. Similarly, many of the gene modules identified through WGCNA of all cortical samples were correlated with specific cortical layers (Figure 3B). By ANOVA-based clustering and WGCNA, proximal layers showed the strongest correlations, with superficial L2 and L3 highly correlated with one another, and the deeper L4-6 highly correlated as well (dendrograms in Figure 3B, E,F).
Individual layers showed highly specific gene expression signatures. Layer-enriched expression patterns were identified by searching for genes with high correlation to layer-specific artificial template patterns (Lein et al., 2004)(Table S5). Figure 3C shows cohorts of genes with remarkably layer-specific expression that was relatively constant across all cortical areas. These observations demonstrate the specificity of the laminar dissections with minimal interlaminar contamination, and also the constancy of laminar gene expression across the neocortex.
WGCNA gene modules derived from the whole cortex network also showed highly layer-enriched expression, demonstrating the robustness of our findings. For example, the black module contains genes enriched in superficial L2 (hub genes plotted in Figure 3D, top row). While some layer-specific genes could be identified by targeted analyses, the dominant patterns were more complex, with most network modules being associated with combinations of layers, typically proximal to one another. For example, individual modules were enriched in L2-4 (salmon), L3-5 (greenyellow), L4-5 (royalblue) and in a gradient increasing from L2 to L6 (red). This tendency for coexpression between adjacent layers is also apparent in the heatmap representation of gene clusters in Figure 3A and 3E. Gene ontology (GO) analysis of these modules provides some insight into their functional relevance (Table S3). The greenyellow module was enriched for genes associated with axons and neuron projections, potentially related to long-range pyramidal projection neurons in L3 and L5. The red module showed enrichment in genes associated with myelination, consistent with the presence of oligodendrocytes in deep layers, and this module was highly correlated with oligodendrocyte-associated gene networks in other studies (data not shown, (Oldham et al., 2008)).
Interestingly, the expanded L4 of V1 displayed a distinct signature from the rest of L4 (see top of middle box in Fig 3A). To explore this further, we performed ANOVA and WGCNA selectively on samples from V1 (Figure 3E, F and Tables S6 and S7). A comparison between V1 ANOVA-derived laminar differential expression and membership in whole cortex WGCNA modules is in Table S8. Similar to the whole cortex analysis, robust clusters and network modules were associated with individual cortical layers. As shown in the unsupervised hierarchical 2D clustering of ANOVA results in Figure 3E, individual samples from each layer cluster together, and neighboring cortical layers are most similar to one another. Interestingly, L4A clusters with more superficial layers, while L4B, L4Ca and L4Cb display a distinct transcriptional pattern, most easily seen by the dendrograms based on ANOVA and network analysis in Figures 3E and 3F.
To investigate whether layer specificity of gene expression may relate to selective patterns of connectivity, we examined the relationship between thalamocortical inputs and their targets in V1. L4Ca and L4Cb receive input selectively from magnocellular (M) and parvocellular (P) divisions of the LGN, respectively. Hypothesizing that there may be substantial shared gene expression patterns selective for specific pairs of connected neurons, we searched for genes that were differentially expressed between the thalamic inputs and between the cortical targets. 1,002 probes were differentially expressed between L4Ca and L4Cb (t-test, p<.01), and 825 probes between M and P. Surprisingly, these gene sets did not significantly overlap (13/1827, p=.08). Although the possibility certainly exists that specific ligand-receptor pairs are associated with this selective connectivity, it would appear that the specificity of these connections is not associated with specific large-scale correlated gene expression patterns.
To validate the specificity of the microarray findings and test hypotheses about laminar enrichment based on ANOVA and WGCNA, we examined a set of genes displaying layer-enriched patterns using in situ hybridization (ISH) in areas V1 and V2 (Figure 4). Overall the laminar specificity of gene expression and variations between cortical areas predicted by microarrays were confirmed by cellular-level analysis, and illustrate the high information content of layer-specific expression profiling and gene specificity of the microarray probesets. For example, GPR83 is selectively expressed in L2 of all cortical areas, both by microarray and ISH analysis (Figure 4C, D). Laminar specificity was confirmed for RORB (L3-5; Figure 4E, F), PDYN (L4-5; Figure 4G, H), CUX2 (L2-4; Figure 4I), and SV2C (L3-4 enriched; Figure 4J). Specificity for deep cortical layers was prominent, as shown for PDE1A (L5-6; Figure 4K), NR4A2 (L5-6; Figure 4L), COL24A1 (L6; Figure 4M) and RXFP1 (L5-6; Figure 4N). Differences in laminar specificity were sometimes apparent between V1 and V2 (generally V1 versus all other areas); CUX2 was expressed in L2 through L4Cb in V1 but more limited to L2 and L3 in V2 (Figure 4I), and SV2C was highest in L4B in V1, but highest in L3 in area V2 (Figure 4J).
Both ANOVA (Figure 5A) and WGCNA analysis (Figure 5B) identified gene clusters enriched in specific subsets of cortical regions. As illustrated in the dendrograms from both methods, the strongest relationships between cortical areas were based on areal proximity rather than functionally connectivity. For example, the caudal visual areas V1, V2 and MT showed highly correlated patterns of gene expression, while the functionally related but distal visual region TE had greater transcriptional similarity to its proximal neighbor A1 in temporal cortex. Strong relationships were observed for the adjacent primary motor and sensory cortices M1 and S1, and for the frontal DLPFC and OFC regions. Differentially expressed genes showed enrichment in specific subsets of (generally proximal) cortical areas (Figure 5A), generally related to neuronal development and function (axon guidance, neuronal activities, LTP/LTD; Table S9). Areal expression also had a strong laminar signature, easily visualized by grouping these ANOVA-derived genes by cortical layer (Figure S3).
Parallel relationships between cortical areas were observed by WGCNA demonstrating the robustness of these observations (Figure 5B), with individual gene modules showing enrichment in specific cortical regions (Figure 5C). Module eigengenes revealed additional patterning, including rostrocaudal gradients and laminar components to areal patterning. For example the tan module (Figure 5C, upper left) reflected a caudal low to rostral high patterning enriched in deep L5 and L6. Another gene module (purple, upper right) had an opposite gradient from high caudal to low rostral, in this case enriched in L3 and L4. Other modules were more area-specific: in V2, MT, DLPFC and OFC (blue) or lowest in V1, V2 and MT, with enrichment in L2 and L3 (pink).
Individual genes showed a wide range of areal patterns reflecting the modules described above, as well as patterns related to individual cortical areas or combinations of areas. Example gene patterns derived from the clustering analyses above, as well as analysis of the genes showing maximal cross-area fold changes, are shown in Figures 6 and and8.8. A large cohort of genes displayed rostrocaudal gradients. For example, MET, PVALB and RORB were expressed most strongly in caudal V1 and decreased moving rostrally. Typically this gradient expression also had a laminar component. For example, MET, which has been associated with autism (Campbell et al., 2006) was expressed preferentially in L4 of V1, expanded to include L2-4 in temporal areas, and had no detectable expression in frontal cortices (Figure 6A-D), consistent with observations in fetal human brain (Mukamel et al., 2011). Conversely, WFDC1 (Figure 6E-H) was expressed most strongly in frontal cortical areas (e.g. DLPFC) and lowest in caudal V1, with expression primarily restricted to L2.
Many genes showed area-selective expression (Figure 6I-L). NEFH, an intermediate filament heavy chain subunit generally expressed in long range projection neurons, was selectively enriched in L5 of M1 where the longest, spinally-projecting neurons, Betz cells, are located. Other genes were selectively enriched in specific regions of frontal cortex, including ACG (CALML4, IGFBP5, and LXN) or DLPFC and OFC (CD53). ISH showed selective enrichment of IGFBP5 in ACG compared to DLPFC and OFC (Figure 6K). Conversely, ISH analysis of CD53 (Figure 6L) showed enrichment in dorsolateral, ventrolateral and ventromedial cortex relative to dorsomedial cortex and ACG.
The laminar and areal patterning observed in macaque was then compared to homologous structures in human and mouse. This phylogenetic comparison is shown in Figure 7, made possible by the availability of mouse and human ISH data in the Allen Mouse Brain Atlas (http://mouse.brain-map.org) and Allen Human Brain Atlas (http://human.brain-map.org) generated using the same ISH technology platform as the macaque data. Some macaque data was also derived from the NIH Blueprint Non-Human Primate Atlas (www.blueprintnhpatlas.org). Most genes in Figure 4 above showed laminar expression patterns that were highly conserved between macaque and human, with lesser conservation in mouse. Laminar patterns were generally conserved between primates and mice in visual cortex for CUX2, RORB, RXFP1 and COL24A1 (left panels in Figure 7), although RXFP1 in mouse showed some regional differences between visual and somatosensory cortices not apparent in the macaque (Figure 7C). Other genes with highly laminar patterns showed major differences between species, indicated by blue arrowheads in Figure 7E-G. PDYN (Figure 7E) was enriched in excitatory neurons in L4 and L5 in rhesus and human, but in neurons with a broader laminar distribution that co-label with GAD1 indicating expression in GABAergic interneurons in mice (data not shown). The synaptic vesicle protein SV2C was predominantly enriched in superficial L3 pyramidal neurons in most cortical areas in primates (e.g. V2), while it is fairly selective for deep L5 pyramidal neurons in mice. NR4A2 was expressed selectively in deep L5 and L6 in macaque and human, and in both L6 and L2/3 in mice (Figure 7H).
While laminar distributions between V1 and V2 were highly conserved between macaque and human, differences were also noted. PDYN was expressed in L4B in macaque V1 in addition to the dominant L4Cb/5 expression (Figure 7E), while in human only L 4Cb/5 expression was observed. Similarly, faint L2 expression in addition to L5/6 enrichment was observed for PDE1A in macaque but not human (Figure 7G). In both cases, only the fainter expression in macaque was not observed in human, leaving the possibility that these differences relate less to true biological differences than to detection sensitivity on postmortem human tissues as compared to rapidly frozen rhesus macaque specimens.
The most robust patterns of areal specificity, both in terms of numbers of genes and their relative fold differences, were related to the highly specialized area V1 (Figure 8). Both selective enrichment in V1 and selective lack of expression in V1 were observed, with a sharp boundary corresponding to the cytoarchitectural boundary observed by Nissl staining. This areal patterning was typically restricted to particular cortical layers as well. Some of this selective expression related to the expanded input L4 in V1. For example, ASAM, VAV3 and ESRRG were enriched primarily in L4 of V1 (Figure 8A-C). However, selective enrichment or decreased expression was seen in all cortical layers, including L2 and L3 (MEPE and RBP4; Figs. 8D, E), L5 (HTR2C; Figure 8I), and L6 (CTGF, SYT6 and NPY2R; Figures 8F-H).
The V1-selective patterning appeared to be highly conserved between macaque and human, while significant differences were observed between primates and mice (Figure 8G-I). For example, the enrichment of SYT6 (Figure 8G) and NPY2R (Figure 8H) in L6 of V1 relative to V2 was conserved between macaque and human, as was absence in L5 of V1 for HTR2C (Figure 8I). NPY2R expression showed a completely different pattern in mice, restricted to sparse (presumably GABAergic) neurons scattered across the cortex. Conversely, for both SYT6 and HTR2C, laminar restriction to L6 and L5 respectively was conserved in mice, but with no selective enrichment or lack of expression in V1. Thus these V1-specific gene expression differences correlate with primate-specific cytoarchitectural and functional specialization, rather than with the functional sensory modality subserved by visual cortex.
The basic laminar structure of the neocortex is highly conserved across mammalian species, reflecting a general preservation of the constituent cell types and local circuitry (Brodmann, 1909). However, the specifics of laminar structure of the neocortex vary across both cortical region and species, with primates showing both a general expansion of superficial cortical layers and a massive expansion of cortical area with particular functional and cytoarchitectural specializations that is most dramatic in humans (Krubitzer, 2009). Understanding molecular differences between cortical layers and cell types across cortical regions, and the degree to which gene regulation is similar in homologous structures in humans and model organisms, may help explain features of cortical structure and function and the gene networks that underlie them. The current study aimed to examine organizational principles of the transcriptome of non-human primate neocortex by first determining the whole-genome mRNA profiles of adult rhesus macaque across cortical areas and individual layers within each area. We then performed a phyletic comparison of these laminar and areal gene expression patterns to those of human and mouse cortices. Our analyses show that these data provide a rich platform for species-specific phenotype discovery.
Individual layers of macaque cortex have highly distinct patterns of gene expression sufficient to unequivocally identify and cluster samples together based on the layer of origin across independent biological replicates, similar to recent observations in mouse cortex (Belgard et al., 2011) and validating the sampling strategy of profiling individual layers. These differential laminar patterns are both robust and widespread. For example, 5,537 probes showed differential expression among cortical layers in V1 where the greatest number of sublayers were assayed (ANOVA p<.01), with 3,353 genes showing at least 2-fold difference between two layers and 4,980 genes showing >1.5 fold differences. These data appeared to be highly reliable, as these laminar differences were consistent across animals and were readily confirmed by ISH. Very small numbers of individual genes show layer-specific expression; rather, most laminar genes appeared to be expressed in more complex patterns most often involving enrichment in multiple proximal layers. Interestingly, gene set annotation of different laminar gene clusters tended to give similar results for each cluster, namely enrichment in gene categories associated with neuronal development (including axon guidance), activity and plasticity. This suggests that the specific properties of particular neuronal subtypes are the product of unique combinations of members of the same gene families underlying those properties (e.g. ion channels, neurotransmitter receptors, axon guidance molecules, etc).
Network-based analytical techniques can be very effective at extracting biologically relevant information embedded in large, high-dimensional data sets. We applied WGCNA to identify groups of co-regulated genes, or “modules,” with high topological overlap across the data set (Langfelder et al., 2008; Zhang and Horvath, 2005). This approach has been successful at identifying biologically interpretable gene clusters in brain microarray data, including the identification of gene modules related to neural cell types (Oldham et al., 2008) and subcellular compartments (Winden et al., 2009). Applied either to the entire cortex data set or selectively to laminar samples from V1, many gene modules were identified whose expression was enriched in particular cortical layers. Individual modules were either enriched in specific layers (e.g. black module enriched in L2), or more commonly across multiple contiguous layers. Similar to previous WGCNA of human neocortex (Oldham et al., 2008), a robust oligodendrocyte module was identified with a laminar pattern similar to their cellular distribution from low numbers in superficial layers to higher in deep layers approaching the white matter.
Cytoarchitectural variation in cortical laminar architecture and cellular makeup have been the basis for parcellation of cortical areas for over a hundred years (Brodmann, 1909), yet identification of genes with clear areal specificity has proven to be remarkably difficult (Yamamori and Rockland, 2006). V1 in primates is easily distinguished by the relative expansion and specialization of the input L4 compared to other areas, and most genes described with areal specificity thus far differentiate primary visual cortex from other areas. For example, OCC1 (FSTL1) was identified as a V1-enriched transcript, which is additionally regulated by light-driven activity through direct retino-thalamo-cortical activation (Takahata et al., 2009). Importantly, the majority of studies to date have used samples containing the entire neocortex from a particular brain region (Abrahams et al., 2007; Johnson et al., 2009; Khaitovich et al., 2004; Takahata et al., 2009; Watakabe et al., 2009). This type of design, while permitting analysis of broad cell classes (Oldham et al., 2008), likely under-represents differential areal gene expression through a dilution effect due to the high degree of cellular heterogeneity in the cerebral cortex. We took advantage of laser microdissection from tissue sections to selectively isolate specific cortical areas and their component layers on the basis of cellular cytoarchitecture, thereby providing a great improvement in precise regional anatomical specificity over gross dissections. These dissections were consistent across animals, although it should be noted that we balanced achieving the finest areal specificity with our ability to clearly differentiate areas based on Nissl cytoarchitecture on fresh frozen tissue sections alone. Consequently, while consistent and well-separated from one another, in some instances these areas contain further subdivisions that may be molecularly distinct from one another as well (e.g. anterior cingulate cortex). We found a large number of differentially expressed genes between cortical areas, with a high degree of overlap between genes with laminar enrichment and areal enrichment. Furthermore, all of the genes we analyzed for areal enrichment by ISH, selected for maximal fold change between areas, were highly enriched in specific cortical layers as well. Together this suggests that much of what differentiates cortical areas is differential expression in specific layers (i.e. in specific excitatory neuron populations).
Specializations in cortical cellular and functional architecture were reflected by differential gene expression. For example, L5 of primary motor cortex, containing the large projection neurons (corticospinal Betz cells), showed highest expression of the neurofilament heavy chain (NEFH) which is expressed in large caliber and long range projection neurons (Elder et al., 1998; Jacobs et al., 1996). Consistent with previous analysis of whole cortex (Oldham et al., 2008) , we find V1 to have the most distinct areal molecular profile, with differential gene expression patterns that changed sharply at the Nissl-defined boundaries between V1 and V2. As anticipated, this difference was due in part to the expanded sublayers of L4, which were highly distinctive both at the transcriptome-wide level and at the level of individual genes as shown by ISH. For example, several genes with novel selective expression in V1 L4 were identified, including adipocyte-specific adhesion molecule (ASAM), a type I transmembrane immunoglobulin protein that may participate in cell-cell adhesion (Raschperger et al., 2004), the guanine nucleotide exchange factor VAV3 which has been implicated in Purkinje cell dendritogenesis (Quevedo et al., 2010), and the orphan estrogen-related receptor gamma ESRRG. Surprisingly, many of the most robust V1-selective genes were outside of L4, most notably in L6 where the synaptic vesicle fusion-related gene SYT6 and the neuropeptide Y receptor NPY2R were highly enriched. Finally, V1 appears to be demarcated equally by selective enrichment and selectively decreased gene expression, as for the matrix extracellular phosphoglycoprotein MEPE in L2 and the serotonin receptor HTR2C in L5. From a molecular perspective then, the cytoarchitectural and functional specialization of primate V1 appears to be mediated by complex differences in gene expression across many different excitatory neuronal subtypes.
An unanticipated finding from this study is that molecular similarities are strongest between spatial neighbors, both between cortical areas and between cortical layers. There are a number of potential explanations for this finding. One possibility, particularly for cortical layers, is that these similarities reflect a “spill-over” of cell types between layers, since layer boundaries are not sharp, cellular segregation by layer may not be complete, and our isolations were not cell type-specific. However, this seems unlikely for several reasons. First, we were careful to avoid laminar borders (see Figure S1). Furthermore, we were able to identify genes with nearly binary layer-specific expression, while most genes with laminar specificity appeared to be expressed across multiple contiguous layers at similar expression levels. These observations would appear to be inconsistent with spill-over of a small proportion of cells of a particular type across layers, although it is certainly possible that gradients of glial or inhibitory cell subtypes account for some proportion of adjacent layer similarity.
An alternate explanation for proximity relationships is that they reflect developmental origin, or lineage, an interpretation that is supported by our results. The development of laminar cortical structure involves the sequential generation of excitatory neurons in an “inside-out” fashion (Bystron et al., 2008; Rakic, 1974). Cell division of precursor pools occurs in the ventricular and subventricular zones, and later-generated cells destined for more superficial cortical layers migrate over earlier-generated deep layer neurons. Ultimately, this process results in the segregation of different cortical cell types into discrete layers, with corticocortical pyramidal projection neurons dominating superficial L2 and L3, corticofugal projection neurons dominating deep L5 and L6, and local circuit stellate neurons in L4. Despite morphological and projectional similarities between deep and superficial neurons, relationships between layers based on gene expression clearly reflected physical proximity. This organizational principle was highly robust, and was seen using a variety of analytical methods including PCA (based on all 52,865 probe sets on the arrays), ANOVA and unsupervised hierarchical clustering (based on 3,000-5,000 probe sets with significant differential expression), and WGCNA-derived gene networks (based on >18,000 probe sets). Since physical proximity between cortical layers also reflects temporal proximity in terms of the developmental genesis of neurons from the neocortical germinal zones, this suggests that the global mRNA signatures for cortical layers bear a developmental imprint resulting from the sequential generation from increasingly differentiated cortical progenitor cells. Similar conclusions have been made by others comparing transcriptional profiles of different brain regions in rodents (Zapala et al., 2005).
Our selection of cortical areas allowed a discrimination between molecular similarities based on proximity, functional type (sensory, motor, association), or functional stream (e.g. dorsal (MT) versus ventral (TE) visual streams). Similar to findings for layers, cortical areas cluster by proximity more so than by functional type or functional stream. The caudally located visual areas V1, V2 and dorsal stream area MT cluster together, while ventral stream area TE is most similar to the proximally located primary auditory cortex (A1). The adjacent S1 and M1 areas are highly similar despite different cytoarchitecture and function. Furthermore, WGCNA identified modules of covarying genes with rostrocaudal gradients. These patterns are highly reminiscent of molecular gradients of transcription factors in the early developing neocortex that are important for proper areal patterning (Bishop et al., 2002; O’Leary and Sahara, 2008). Therefore, although individual cortical areas have molecular signatures that relate to their distinct cellular makeup or functional properties, broad molecular coherence between cortical areas more closely reflect spatial, nearest neighbor relationships.
Molecular similarities between nearby cortical areas may be important from the perspective of selection pressure for wiring economy in cortico-cortical connectivity (Bullmore and Sporns, 2009; Raj and Chen, 2011). Cortical networks show features of small-world networks (Bassett and Bullmore, 2006), with dense local clustering of connections and relatively few long-range connections that serve to minimize wiring and energy costs while maintaining dynamical complexity (Sporns and Zwi, 2004). An intriguing idea is that cortical areas with strong molecular similarities preferentially wire together during development. In support of this idea, the top enriched GO categories for genes that vary by cortical area were axon guidance and ephrin receptor signaling, while gene clusters showing enrichment in proximal cortical areas were enriched for axon guidance molecules as well.
It has been argued extensively that species differences may be largely a product of differences in gene regulation as opposed to gene sequence or structure (King and Wilson, 1975). Consistent with this idea, a number of genes with specific cellular distributions were seen to vary across species, suggesting alterations in cis-regulation at the level of specific cortical cell types. While it is possible that differences in species-specific probe sequences may contribute to differences observed across species in some cases, several overall patterns were observed across the genes examined. In general, rhesus patterns closely matched human expression patterns, both in their laminar (cellular) distributions and their areal specificity for V1 versus V2. Several differences were noted, including a lack of PDYN labeling in human compared to macaque in L4A, the same layer where other molecular differences have been noted in humans compared to other primates (Preuss and Coleman, 2002). However, these differences involved low expressing cells that may not be detected in human postmortem tissues with much longer postmortem intervals than experimental model system-derived tissues. On the other hand, substantially greater differences were observed for specific cortical laminar gene expression patterns between primates and mice, ranging from partially matching laminar patterns to completely different cell populations labeled. For example, SV2C is expressed preferentially in L3 pyramidal neurons in primates, and in L5 pyramidal neurons in mice. Prodynorphin (PDYN), which produces dynorphin and other kappa opioid receptor peptide agonists, is expressed in L4Cb and L5 in primate V1, but only in scattered GABAergic interneurons in mice. This difference suggests alterations in cis-regulation, potentially supported by the finding that the promoter region of PDYN has been shown to vary across primates and human populations through positive natural selection (Rockman et al., 2005) . A similar shift from L6 neurons to sparse, putative GABAergic neurons in V1 is seen for the neuropeptide Y receptor NPY2R. These types of species differences are particularly important, as cell type class-shifting in gene usage, particularly for genes such as neurotransmitter receptors, could have profound effects on cortical function.
Finally, the differential molecular signatures identified in V1 in rhesus macaque and human were not observed in mouse visual cortex. For example, SYT6 is highly enriched in L6 in V1 relative to other cortical areas, and shows a sharp transition at the V1/V2 boundary in macaque and human. In mouse Syt6 is similarly enriched in L6, but with no discernable difference in expression between primary visual cortex and other cortical regions. Similarly, the serotonin receptor HTR2C, implicated in schizophrenia, bipolar disorder and major depression (Iwamoto et al., 2009) is expressed selectively in L5 in macaque, human and mouse. However, in mouse this pattern is in all regions, while in primates HTR2C is selectively decreased in V1. These observations suggest that cortical specialization may occur through variations on a basic cellular and molecular cortical architectural template.
The strong similarities in molecular architecture indicate that rhesus macaque is a highly predictive non-human primate model system for human neocortical structure and corresponding gene expression, at least for homologous functional areas of the neocortex. These data provide an information-rich resource, and it will be important in the future to extend these studies to human neocortex to understand the molecular underpinnings of human-specific neocortical specialization.
Adult (mean age ± SEM = 8.5 ± 1.0 years) male and female Rhesus monkeys (Macaca mulatta) were used for the study. All animals were housed at the New Iberia Primate Research Center (New Iberia, LA). Animals had negative histories for Cercopithecine Herpes virus I, measles, pox viruses, rabies and tuberculosis.All animal handling procedures were approved by Institutional Animal Care and Use Committees at Merck and Co. and the New Iberia Primate Research Center.
Animals were euthanized with an overdose of sodium pentobarbital and phenyltoin sodium, immediately after which the brain was removed,placed into cold (4°C) phosphate buffered saline (pH 7.4), and then placed ventral side up in a Rhesus brain matrix (EMS, Hatfield PA). Coronal slabs (6 mm thick) were made by placing razor blades (EMS) into slots on the matrix and gently depressing the blades through the tissue. Each slab was marked for orientation, placed on a metal disk that was embedded in dry ice until frozen, and stored at -80°C in bar-coded bags. The mean (± SEM) time between euthanasia and freezing was 49 ± 2.1 min.
Slabs from frozen male and female (n=2-3 animals/gender) brains were serially cryosectioned at 14 μm onto PEN slides for LMD (Leica Microsystems, Inc., Bannockburn, IL) and a 1:10 Nissl series was generated for neuroanatomical reference. After drying for 30 minutes at room temperature, PEN slides were frozen at -80°C. Slides were later rapidly fixed in ice cold 70% ethanol, lightly stained with cresyl violet to allow cytoarchitectural visualization, dehydrated, and frozen at -80°C. LMD was performed on a Leica LMD6000 (Leica Microsystems, Inc.), using the cresyl violet stain to identify target brain regions. Quadruplicate biological replicates were collected from ten cortical regions (4-8 individual layers), four hippocampal subfields, and three layers of the LGN (Table S1). Images of pre- and post-laser microdissection are shown in Figure S1.
Microdissected tissue was collected directly into RLT buffer from the RNeasy Micro kit (Qiagen Inc., Valencia, CA) supplemented with ß-mercaptoethanol. Samples were volume-adjusted with RLT Buffer to 75μl, vortexed, centrifuged, and frozen at -80°C. RNA was isolated for each brain region following the manufacturer’s directions. RNA samples were eluted in 14μl and 1μl was run on the Agilent 2100 Bioanalyzer (Agilent Technologies, Inc., Santa Clara, CA) using the Pico 6000 assay kit. Samples were quantitated using the Bioanalyzer concentration output. The average RNA Integrity Number (RIN) of all 225 passed experimental samples was 6.7.
Sample amplification, labeling, and microarray processing were performed by the Rosetta Inpharmatics Gene Expression Laboratory (Seattle, WA). Samples passing RNA QC were amplified and profiled as described (Winrow et al., 2009) with a few modifications. Briefly, samples were amplified and labeled using a custom 2 cycle version, using 2 kits of the GeneChip® HT One-Cycle cDNA Synthesis Kit from Affymetrix. Five nanograms of total RNA was added to the initial reaction mix together with 250 ng of pBR322 (Invitrogen). As little as 2ng was used in some cases where tissue was extremely limited. Hybridization was performed in three batches to GeneChip® Rhesus Macaque Genome Arrays from Affymetrix containing 52,803 probesets/sequences. To control for batch effects, common RNA pool control samples were amplified and hybridized in each batch (3 replicates per 96 well batch). Profile quality was assessed using standard Affymetrix quality control metrics as well as by PCA. A total of 8 outliers were identified, and these samples were recollected and hybridized successfully.
A total of 258 samples passed sample QC, including 225 experimental samples and 33 control samples. The experimental samples include two male and two female profiles for each region (except three missing samples; Table S1). The data discussed in this publication were deposited in NCBI’s Gene Expression Omnibus (Edgar et al., 2002) and are accessible through GEO Series accession number GSE31613.
Each batch was normalized within itself using RMA (Irizarry et al., 2003), and batch effects were removed by subtracting the difference for each probe between controls of one batch from the controls of each other batch. Following this correction, no correlation with batch was observed among all samples within the four primary principal components which explain approximately 40% of the cumulative variance (Figure S2A, B and data not shown).
ANOVA, principal component and agglomerative clustering was performed using Matlab2007a. Gene set annotation analysis was performed by comparing input sets to GeneGo (genego.com), Ingenuity (ingenuity.com), KEGG (genome.jp/keg) and PANTHER (pantherdb.org) pathway sets. Bonferroni (BF) corrected hypergeometric p-values of less than 0.1 were considered as significant overlap between sets.
Genes correlated to templates were identified using Microsoft Excel, based on correlation function scores across all cortical samples between an artificial template set with values of 100 for one cortical layer of interest versus 0 for all other layers.
WGCNA (Langfelder et al., 2008; Zhang and Horvath, 2005) was used to identify clusters of coregulated genes across the entire neocortical sample set or just within the laminar samples in area V1. Outlier samples were removed based on inter-array correlations (IAC) < 2 standard deviations from the mean IAC, and cross-batch normalization was performed using the R package “ComBat” (http://statistics.byu.edu/johnson/ComBat/). 182 samples were included in the whole cortex analysis, and 30 samples in the V1 analysis, using probes present in at least half of the samples (18,080 for whole cortex, 15,234 for V1). A signed weighted network (Zhang and Horvath, 2005) was constructed for each data set. Using a dynamic tree-cutting algorithm (Langfelder et al., 2008), we identified 20 modules in the entire neocortical data set, and 36 modules in V1 only data set. The Module Eigengene (ME), defined as the first principle component of a given module, was used to represent the characteristic anatomical expression pattern of individual modules (Oldham et al., 2008).
Non-isotopic colorimetric in situ hybridization (ISH) was performed as described previously (Lein et al., 2007). Briefly, following cryosectioning of fresh frozen samples at 20μm, tissue sections were fixed, acetylated, and subsequently dehydrated. Digoxigenin-based riboprobe labeling coupled with TSA amplification and alkaline-phosphatase-based colorimetric detection was used to label target mRNAs in expressing cells.
Riboprobes were designed to overlap probe designs for homologous genes in mouse and human used in the Allen Mouse Brain Atlas (http://mouse.brain-map.org) and Allen Human Brain Atlas (http://human.brain-map.org/), and cross-species comparisons were made to data publicly available in those databases. A subset of rhesus macaque ISH data shown was generated in 4-year old adult male specimens as part of the NIH Blueprint NHP atlas (http://www.blueprintnhpatlas.org/). Additional ISH data were generated on tissue sections collected from the frontal pole, medial/temporal areas and caudal/visual areas in two adult specimens from this study.
This work was sponsored by Merck Research Labs, the Allen Institute for Brain Science and NIH Grant 5R37 MH060233-11 (DHG, RL). The authors wish to thank the Allen Institute founders, Paul G. Allen and Jody Patton, for their vision, encouragement, and support. We thank Mike Citron for specimen identification, Ken Lodge for providing pilot tissue and sample collection methods, and Dr. Jane Fontenot, Dana Hasselschwert and Marcus Louis for assistance with tissue collection. Thanks to Crissa Wolkey for sample processing and Rachel Dalley and Sheila Shapouri for LMD images. We wish to acknowledge Paul Wohnoutka, Amanda Ebbert and Lon Luong for supporting data production, Chinh Dang for supporting database needs, Kelly Overly for contracting assistance, David Haynor for discussions on project design and Christof Koch for critical reading of the manuscript. Finally, thanks to Affymetrix for preferred pricing on rhesus microarrays.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.