We determined gene expression profiles across nine tissues in 12 male vervets: eight brain regions [cerebellar vermis, pulvinar, head of caudate, hippocampus, occipital pole, orbital frontal cortex, frontal pole and dorsolateral prefrontal cortex (DLPFC)] and peripheral blood. Having transcript measures from different tissues of the same individuals allowed us to evaluate, for each transcript, the sources of transcript level variation within and between individuals (Fig. 1). We further focused on two classes of transcripts, characterized by high variation of expression across brain regions or by high variation between individuals. High inter-individual variation, determined spatially (between brain tissues and blood) and temporally (between independent blood samplings), allowed us to investigate heritable brain gene expression traits in peripheral blood. The brain and blood gene expression data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus (19) and are accessible through GEO Series accession number GSE15301 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15301).
Figure 1. Components of transcript level variation. Observation of transcript levels across different tissues and individuals allows analysis of variation of transcript levels from the perspective of its two components: inter-individual variation (green) and intra-individual ...
To evaluate the impact of probe–target sequence incompatibility in our data set, we compared the number of probes widely detected in vervet brain tissues (at least 80% of tissues) with the number widely detected in 193 human cortical samples analyzed by Myers et al., using the 6791 probes common to the two data sets. At this threshold, 2410 probes were detected in vervet brain and 4622 probes were detected in human cortex, with considerable overlap between the two studies. Of the probes detected in human cortex, 46% (2128/4622) were also detected in vervet brain, whereas 88% (2128/2410) of the probes detected in vervet brain were also detected in human cortex. Human–vervet sequence differences therefore do not prevent reliable detection of a substantial fraction of the brain transcripts.
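The overlap percentages quoted above follow directly from the reported probe counts. The short check below is only an arithmetic illustration; the variable names are ours and are not part of the original analysis.

```python
# Probe counts as reported in the text.
detected_vervet = 2410   # probes widely detected in vervet brain
detected_human = 4622    # probes widely detected in human cortex
detected_both = 2128     # probes detected in both data sets


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


human_also_in_vervet = percent(detected_both, detected_human)
vervet_also_in_human = percent(detected_both, detected_vervet)

print(f"human probes also detected in vervet: {human_also_in_vervet:.0f}%")
print(f"vervet probes also detected in human: {vervet_also_in_human:.0f}%")
```

The asymmetry (46% versus 88%) reflects the smaller number of probes detected in vervet brain: most probes that hybridize reliably in vervet are also detected in human, but not the reverse.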
Transcripts differentially detected between brain tissues
We used the detection status of all 22 184 probes represented on the Illumina HumanRef-8 version 2 chip across the tissues sampled in tissue set one to determine relationships in gene expression between the eight brain regions. Distances between tissues were estimated on the basis of the probes that showed the most striking differences in terms of number of shared detections. Most such probes were either detected in only one tissue or detected in all or most brain tissues. Similarities between all possible pairs of tested tissues are illustrated by a heat map (Fig. 2A). Cortical tissues generally show the greatest similarity within this sample set, but three genes (KREMEN1, MED13L and ZMYM6) differentiate orbital frontal cortex from frontal pole and one gene (POLE) differentiates DLPFC from frontal pole. As reflected in a corresponding dendrogram showing relations between all brain tissues (Fig. 2B), neocortical regions cluster close to hippocampus and are more distantly linked to caudate and pulvinar tissues, with respect to the number of detected samples. On the dendrogram, cerebellar vermis and pulvinar tissues are more distant from other brain tissues in terms of the number of differentially detected transcripts.
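As an illustration of a detection-based distance, the sketch below counts, for each pair of tissues, the probes whose number of detections differs strongly. The toy detection counts, the probe names and the `MIN_DIFF` cutoff are hypothetical, and this simplified rule is an assumption rather than the exact procedure used in the study.

```python
from itertools import combinations

# Hypothetical toy data: for each tissue, the number of animals (out of 12)
# in which each probe was detected.
detections = {
    "cerebellar_vermis": {"p1": 12, "p2": 0, "p3": 11, "p4": 1},
    "head_of_caudate": {"p1": 12, "p2": 11, "p3": 0, "p4": 1},
    "frontal_pole": {"p1": 12, "p2": 12, "p3": 1, "p4": 12},
    "dlpfc": {"p1": 12, "p2": 12, "p3": 0, "p4": 11},
}

MIN_DIFF = 10  # assumed cutoff for a "striking" difference in detection count


def detection_distance(a, b, min_diff=MIN_DIFF):
    """Count probes whose detection counts differ by at least min_diff."""
    return sum(1 for probe in a if abs(a[probe] - b[probe]) >= min_diff)


# Pairwise distances between all tissues (the values behind a heat map).
pairwise = {
    (t1, t2): detection_distance(detections[t1], detections[t2])
    for t1, t2 in combinations(detections, 2)
}

closest_pair = min(pairwise, key=pairwise.get)
for pair, dist in pairwise.items():
    print(pair, dist)
```

A matrix of such counts could then be passed to any standard hierarchical clustering routine to obtain a dendrogram.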
Figure 2. Gene expression differences between brain tissues. Pairwise comparison between all eight brain regions is presented on a heat map (A) constructed based on differentially detected transcripts. Corresponding hierarchical clustering of tissues is presented ...
When tissues are ranked according to the number of region-specific detections (Table 1), cerebellar vermis is first, followed by head of the caudate. Cerebellar vermis shows a tissue-specific presence of 38 transcripts and absence of 54 transcripts. Transcripts with decreased detection in this tissue are significantly enriched for genes associated with developmental processes and genes encoding calcium-binding proteins. In the head of the caudate, 22 transcripts are preferentially detected. Of these transcripts, 18 have a decreased number of detections and are enriched for genes associated with neuronal activities (2.97E−04). Additionally, we grouped together the three frontal regions clustering at the bottom of the tissue dendrogram (frontal pole, DLPFC and orbital frontal cortex) and compared them against all other brain tissues, identifying two transcripts (GPR120 and RASGRF2) as specifically expressed in these frontal regions.
Region-specific transcripts identified based on differential detection
Ubiquitously detected transcripts differentially expressed between blood and brain
We focused on transcripts that are ubiquitously detected across tissues and individuals, as these transcripts provide a means to investigate brain-related biology using peripheral blood samples. Of the 22 184 probes on the array, we identified 2481 probes (representing 2430 genes) for which expression was detected in all 12 individuals in all eight brain regions and in blood (Supplementary Material, Table S1). The regional mean expression levels of these 2430 ubiquitously expressed genes were examined in all pairwise comparisons of brain tissues, and the number of differentially expressed probes was used as a distance metric in a hierarchical clustering analysis (Supplementary Material, Fig. S2). The relations between tested tissues based on the number of ubiquitously detected transcripts with differential expression levels are mostly concordant with the relationships determined from differentially detected transcripts (previous section) and with relationships known from human studies (20). The exceptions are that occipital pole and pulvinar cluster together and that hippocampus unexpectedly localizes closer to frontal cortex than does occipital cortex (Supplementary Material, Fig. S2B). Unlike the tissue grouping based on extreme differences in detection count, this tissue clustering is based on a group of ubiquitously detected transcripts that is clearly depleted of transcripts showing the most extreme differences in expression between tissues, such as those whose expression is restricted to a specific tissue.
Exclusion of the tissue-specific transcripts generates unexpected topography in the tissue dendrogram, possibly indicating that these transcripts determine important regional features to a greater extent than ubiquitously expressed transcripts. Nevertheless, the ubiquitously expressed transcripts may still suggest important aspects of the biology of different brain regions; as many as 1474 ubiquitously expressed genes show increased or decreased expression levels in specific brain regions, based on pairwise comparisons of mean expression values between tissues (Supplementary Material, Fig. S2A). Consistent with the suggestion from the detection-based distance measure, the frontal cortical region shows smaller inter-tissue differences than other tissues tested. Additionally, a differential mean probe signal was observed for 90 probes between DLPFC and frontal pole, 92 probes between DLPFC and orbital frontal cortex, and 215 probes between frontal pole and orbital frontal cortex.
Among the 2430 ubiquitously expressed genes, we identified genes showing region-specific differential expression, and we determined functional categories that were under- and over-represented among these genes (Supplementary Material, Table S2). Consistent with the results of the differential detection measures, cerebellar vermis and head of caudate show the largest differences from other tissues in the number of genes showing differential mean expression levels (790 and 383, respectively). Genes differentially expressed between cerebellar vermis and all other tissues are associated with two interrelated functions, metabolic processes of nucleic acids and mRNA transcription. These two biological processes are significantly over-represented (6.4E−11, 7.06E−03, respectively) among 533 up-regulated genes and under-represented (1.00E−05, 3.84E−02) among 257 down-regulated genes, which may indicate distinctive transcriptional regulation mechanisms acting in this brain region. In head of caudate, a group of 244 up-regulated genes includes structural proteins of the small and large subunits of cytoplasmic (21) and mitochondrial (5) ribosomes. As a result, among the genes preferentially expressed in this tissue, the molecular function of ribosomal proteins is significantly over-represented (2.16E−04), suggesting that specific protein synthesis mechanisms and regulation may be characteristic of this brain region. Among the 139 genes down-regulated in head of caudate, there is an over-representation of kinases (4.05E−02).
Genes ubiquitously detected in brain and blood tissues
Among 2481 transcripts that are ubiquitously detected in brain and blood tissues, 2430 also show differential expression between tissues. This group of 2430 transcripts does not include transcripts showing the most extreme differences in cross-tissue expression patterns, such as transcripts expressed exclusively in a single region. A large number of such differentially expressed transcripts (1474) are still widely detected across brain and blood tissues. This set of genes is clearly depleted of genes related to many brain-specific functions, including signal transduction and neurogenesis, while it is enriched (with P-values less than 0.05) in genes related to the ubiquitin proteasome system, Parkinson disease (e.g. SNCA, synuclein-alpha) and Ras pathways, as well as genes involved in various biological processes and molecular functions related to mRNA and protein metabolism (Supplementary Material, Table S3).
Over-representation of genes involved in basic biological processes is consistent with a high representation of housekeeping (HK) genes which, by definition, are widely expressed in numerous human tissues and are involved in maintenance of basal cellular functions. More than 44% of recognized HK genes (255/575) are ubiquitously detected in vervet brain and blood tissues. Among these HK genes are genes widely used in quantitative RT–PCR as control genes whose expression is assumed to be constant among samples (21). It is therefore noteworthy that from the genes utilized as endogenous controls, ARHGDIA showed differential regional expression and therefore do not suit the purpose of tissue-to-tissue normalization. As expected, this group of HK genes is enriched for genes involved in maintenance of constitutive functions such as protein and mRNA metabolism, ribosomal activity, energy release and cytoskeletal regulation (Supplementary Material, Table S4).
Among the genes broadly expressed in brain and blood tissues, we detected a considerable number of ubiquitous transcripts that were previously reported as HK genes with stable expression across tissues. Analysis of blood and high-quality brain tissues from precisely dissected regions revealed that levels of almost all these ubiquitous transcripts vary between at least one pair of tested tissues in the current study.
Variation of transcript profiles in brain and blood tissues
To identify probes showing correlated expression profiles in brain and blood tissues, we estimated, for each brain region, the Spearman rank correlation (SRC) between the paired brain and blood expression measures. Additionally, to identify probes with greater inter-individual variation than intra-individual variation, we used a variance components approach to estimate the percent variation (PV) attributable to the inter-individual component and to the within-monkey (between tissues) component. Both measures (SRC and PV) for the whole brain and blood data set are available on the Integrated Vervet Monkey Genomics website (http://genomequebec.mcgill.ca/compgen/submit_db/vervet_web), which enables searches for similarities of expression profiles between the eight brain regions and blood for specific genes. Examples of brain–blood expression patterns are shown in Supplementary Material, Figure S3, for highly correlated profiles (left column), moderately correlated profiles (middle column) and poorly correlated profiles (right column).
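For readers unfamiliar with the SRC, the following minimal sketch computes a Spearman rank correlation between paired brain and blood measures for one probe. It is a plain-Python illustration (Pearson correlation of average ranks); the expression values are invented, and the 0.55 cutoff is taken from the threshold used later in the text.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented expression values for one probe in 12 animals (one brain region
# paired with blood); blood is a monotone function of brain in this toy case.
brain = [7.1, 8.3, 6.9, 9.0, 7.7, 8.8, 7.4, 8.0, 6.5, 9.4, 7.9, 8.5]
blood = [2.0 * value + 1.0 for value in brain]

src = spearman(brain, blood)
passes_threshold = src > 0.55
print(src, passes_threshold)
```

Because the SRC depends only on ranks, it is insensitive to the very different signal scales a probe may show in brain and in blood.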
Among the 2481 widely detected probes, 825 show PV >0.55 and SRC >0.55. For these 2481 probes, the distribution of the number of brain regions with PV >0.55 or with SRC >0.55 was skewed toward zero, so that a majority of probes were excluded; it ranged from zero brain regions (PV: n = 1329; SRC: n = 1574) to all eight brain regions (PV: n = 52; SRC: n = 28). The probes exceeding these PV and SRC thresholds for each brain region are shown in Supplementary Material, Table S5. Correlated expression patterns and inter-individual variation (according to the above criteria) in all eight brain regions were observed for 23 genes: SPOCK2, HSBP1, CLN3, CIRBP, ANXA11, PNKP, CCT2, RPS20, GGA2, SCO1, CCDC115, DDOST, DUSP11, MRPL51, DMTF1, RPL31, RPL35A, LARP5, SS18L2, TUBA1B, C9orf114, SRF and MRPS17. Even though these genes displayed correlated patterns across all brain tissues, ten of them showed regionally increased or decreased expression levels; for example, three ribosomal genes (including RPS20 and MRPL51) are up-regulated in the head of the caudate in comparison to all other brain tissues. Supplementary Material, Table S6 shows the number of probes that met the PV and SRC criteria for the comparison between blood and each brain region. Both methods consistently identified hippocampus as the brain region most dissimilar to peripheral blood with regard to transcriptional profiles, but did not show considerable variation among the other brain tissues. Probes that passed the PV and SRC thresholds for at least one brain region show similar expression profiles in brain and blood and have more inter-monkey variation than intra-monkey variation. These selected probes can therefore be studied further in blood samples to identify brain gene expression traits.
For investigating specific regional brain functions, genes expressed in only one or a few brain regions are of interest. Such regional specificity, however, is likely to predict a lack of expression in peripheral tissues, making it impossible to use correlated brain and blood expression patterns to guide studies of such transcripts in peripheral blood. We verified that only three transcripts in our data set (LHX1, PPP1R1B and RGS9) meet both the tissue-specific detection and the brain–blood correlation criteria.
Variation of transcript profiles in peripheral blood in replicate samples
To assess the reproducibility of gene expression profiling, which could be affected by either technical variation or transcript level stability over time, we used expression data set two, derived from 18 monkeys each sampled twice for blood. From the initial list of 22 184 probes, we identified 1880 probes that were detected in peripheral blood in all 36 replicate samples from the 18 monkeys. We required detection in all replicates in order to identify the most reliable transcripts for future eQTL mapping in blood samples. The group of 1839 genes represented by these 1880 probes is greatly enriched for genes involved in protein and mRNA metabolism and various signaling pathways, as well as pathways characteristic of peripheral blood such as lymphocyte activation and inflammation (Supplementary Material, Table S7). Among the most over-represented biological processes in this set of genes are those implicated in oxidative phosphorylation, apoptosis, immunity and defense, and cell cycle, structure and motility. Nucleic acid binding and ribosomal and cytoskeletal functions are the molecular functions most enriched among these genes.
To determine the biological reproducibility over time of the gene expression profiles for these 1880 probes, we assessed the within-monkey (between duplicate samplings) versus between-monkey variance, including sex as a fixed effect in the model. Examples of the similarity in expression signal between replicate samples are shown in Supplementary Material, Figure S4, for highly correlated replicates (left column), moderately correlated replicates (middle column) and poorly correlated replicates (right column). There were significant differences (at the 0.05 level, uncorrected for multiple testing) between males and females in blood expression data for 238 of the 1880 probes that passed the detection threshold.
For each of the 18 vervets with duplicate blood expression measures, the correlation of duplicate measures across all probes was high (range across the 18 vervets: 0.89–0.99). To identify transcripts with stable levels in blood that are nevertheless differentially expressed between monkeys, we focused on transcripts that had more variation between monkeys than within monkeys (the latter reflecting temporal transcript variation in blood and technical reproducibility), as defined by the percent of total variance attributable to the between-monkey component (PV >0.55). Among the 1880 probes that passed the detection threshold in peripheral blood replicate samples, there were 134 probes (representing 133 genes) with PV >0.55, indicating that for these probes the majority of the total variation in probe signal was between monkeys rather than within monkeys (between replicates over time). Both high inter-individual variation and intra-individual reproducibility make these selected genes suitable candidates for genetic mapping of their eQTL.
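The PV criterion can be illustrated with a balanced one-way random-effects decomposition of duplicate measurements. This is a simplified sketch under stated assumptions: it uses the standard ANOVA (method of moments) estimator, omits the sex fixed effect included in the model described above, and the probe values are invented.

```python
def percent_variation_between(groups):
    """Fraction of total variance attributable to between-monkey differences.

    groups: one list of replicate measurements per monkey; every monkey is
    assumed to have the same number of replicates (balanced design).
    """
    n = len(groups)            # number of monkeys
    k = len(groups[0])         # replicates per monkey
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]

    # Mean squares between and within monkeys.
    ms_between = k * sum((m - grand_mean) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (value - m) ** 2 for g, m in zip(groups, means) for value in g
    ) / (n * (k - 1))

    # Method-of-moments variance components; negative estimates are set to 0.
    var_between = max((ms_between - ms_within) / k, 0.0)
    var_within = ms_within
    total = var_between + var_within
    return var_between / total if total > 0 else 0.0


# Invented duplicate blood measurements for four monkeys.
stable_probe = [[5.0, 5.1], [7.0, 6.9], [9.0, 9.1], [6.0, 6.1]]
noisy_probe = [[5.0, 9.0], [7.0, 6.0], [9.0, 5.5], [6.0, 8.0]]

pv_stable = percent_variation_between(stable_probe)
pv_noisy = percent_variation_between(noisy_probe)
print(pv_stable > 0.55, pv_noisy > 0.55)
```

A probe such as `stable_probe`, whose replicates agree closely while monkeys differ, passes the 0.55 threshold; a probe whose replicates disagree as much as monkeys do does not.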
Selection of candidate transcripts for eQTL mapping
We merged results from data sets one and two to select expression traits for future eQTL mapping. We examined the 2481 probes that passed detection thresholds in brain and blood expression data set one to identify probes that passed both the PV and SRC thresholds for brain–blood similarity (825) and the PV threshold for biological reproducibility from data set two (130) (Fig. 3A). We identified 53 of the 2481 probes that met both these criteria, i.e. having correlated expression profiles in brain and blood (for at least one brain region) and showing more inter-individual variation than intra-individual (between tissues) variation. Next, we limited the list to the probes that met the detection threshold (36 measurements) for the biological reproducibility data set (Fig. 3B). The reduced subset of 32 probes passing all variation, reproducibility and detection thresholds is presented in Table 2.
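The selection just described amounts to successive set intersections. The sketch below shows the logic on a handful of placeholder probe identifiers; the identifiers and set contents are hypothetical and do not correspond to probes in the data set.

```python
# Hypothetical probe identifiers passing each filter.
brain_blood_pass = {"probe_A", "probe_B", "probe_C", "probe_D"}   # PV and SRC, data set one
reproducible_pass = {"probe_B", "probe_C", "probe_E"}              # PV, data set two
detected_all_replicates = {"probe_B", "probe_E", "probe_F"}        # detected in all 36 blood samples

# Step 1: probes similar between brain and blood AND reproducible over time.
meets_variation_criteria = brain_blood_pass & reproducible_pass

# Step 2: keep only probes detected in every replicate blood sample.
candidates = meets_variation_criteria & detected_all_replicates

print(sorted(meets_variation_criteria))
print(sorted(candidates))
```

Applying the filters as intersections makes the order of the steps irrelevant to the final candidate list, although intermediate counts depend on which filter is applied first.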
Figure 3. Selection of candidate transcripts for mapping brain eQTL in peripheral blood. The diagram represents the set of probes in the brain–blood gene expression comparison that passed the 55% threshold for PV (PV BB) and the 0.55 threshold for the SRC ...
Thirty-two candidate genes for mapping brain eQTL in peripheral blood
Significant correlation and PV for the TUBA1B transcript, and PV for additional transcripts (BAT1, C19orf62, EIF1 and SUV420H1), were observed between all brain regions and blood, suggesting a common regulatory mechanism acting across brain tissues. Region-specific correlation of brain expression with blood was observed in caudate (EIF1), cerebellar vermis (SMOX and TSPAN14), DLPFC (SLC25A23), frontal pole (RAB5A), hippocampus (STOM) and orbital frontal cortex (ERAL1 and TFE3); no such region-specific correlation was observed for occipital pole or pulvinar.
Using transcript levels measured in the peripheral blood of 347 individuals (tissue set three) from the extended vervet pedigree with known and genetically confirmed structure, we estimated the heritability of the 32 selected transcripts (Fig. 3B, Table 2). Twenty-nine of these expression traits showed significant heritability at P < 0.05, and 25 at P < 0.001; 62.5% of these traits displayed heritability estimates of ≥0.4. The generally high heritability of these transcripts suggests that selecting transcripts whose inter-individual variation in transcript levels is greater than their intra-individual variation identifies transcripts whose regulation has a strong genetic component. Evidence consistent with this hypothesis is provided by much lower estimates of heritability in a comparison set of 32 transcripts chosen randomly from the data set (data not shown).
Polymorphisms located within a probe sequence may cause differential hybridization, which mimics differential expression and leads to inaccurate estimates of the heritability of levels of particular transcripts. To assess the effects of such polymorphisms on our results, we sequenced probes for 16 transcripts showing significant heritability in the vervet monkey pedigree. Ten of these probes were monomorphic. For the six probes in which we detected SNPs (TUBA1B, TMED3, BAT1, CDKN1A, C19orf62 and SMOX), we compared expression level measures between different SNP genotype classes to look for correlations between signal intensity and genotype. Four of these six probes showed marked correlation between signal intensity and genotype, raising the possibility that probe hybridization properties rather than differential gene regulation are responsible for the observed inter-individual variation in these transcripts. The probe for the SMOX transcript did not show such a correlation, most likely due to the low frequency of the minor allele. The probe for the CDKN1A transcript showed consistent differences between genotypes in seven tested tissues but not in hippocampus and pulvinar. This observation suggests that the CDKN1A probe is sensitive to transcript level, but confirmation of this interpretation will require use of an alternative gene expression assay such as quantitative real-time PCR.
Fourteen of the heritable transcripts showed differential expression levels across brain regions. This group includes transcripts specifically elevated in cerebellar vermis (5), head of caudate (3) and hippocampus (1). It will be of interest to examine further the genetic determinants of regional gene expression that may be involved in specific tissue functions.