The dramatic technical advances in methods to measure gene expression on a genome-wide level thus far have not been paralleled by breakthrough discoveries in psychiatric disorders—including major depression (MD)—using these hypothesis-free approaches. In this review, we first describe the methodologic advances made in gene expression analysis, from quantitative polymerase chain reaction to next-generation sequencing. We then discuss issues in gene expression experiments specific to MD, ranging from the choice of target tissues to the characterization of the case group. We provide a synopsis of the gene expression studies published thus far for MD, with a focus on studies using mRNA microarray methods. Finally, we discuss possible new strategies for the gene expression studies in MD that circumvent some of the addressed issues.
The pathophysiology of major depression (MD) and the mechanism of action of the antidepressant treatments remain largely obscure. With the human genome sequence publicly available since February 2001, an array of novel research tools has become available that may yield unbiased, hypothesis-free insight into the pathophysiologic underpinnings of this disorder. This article focuses on methods investigating disease-related changes in gene expression at the level of mRNA, the nucleic acid transcript of gene sequence from which protein is synthesized in all mammalian cells. We first discuss methodologic issues for the measurement of gene expression and factors related to the choice of the investigated tissue. We then summarize recent publications describing gene expression changes related to MD and how they may impact our understanding of the pathophysiology of this disorder. This discussion sets the stage for other articles in this journal that describe epigenetic mechanisms, as the direct consequence of epigenetic changes is a long-lasting impact at the level of gene transcription to mRNA.
The initial step in gene expression is transcription (transfer) of genetic information contained in genomic DNA to mRNA. Most of the genetic regulation in humans is thought to occur at the level of gene transcription. The objective of gene expression analysis at the transcriptional level is to determine whether specific mRNA sequences transcribed from particular genes are present in cells or tissues of interest, and if so, at what level. Transcripts can be directly measured as RNA species or can be converted into cDNA via reverse transcription—a laboratory procedure that uses viral enzymes to transcribe RNA to DNA. The cDNA copies are amplified using polymerase chain reaction (PCR), and levels of transcripts within samples in a particular disease state can be compared with those in healthy samples to identify transcriptional differences between the two conditions.
Traditional methods of gene expression analysis included Northern blots, quantitative PCR (qPCR), real time qPCR, and in situ hybridization, all of which allow profiling of the transcriptome on a limited scale restricted to single genes or small groups of genes (Fig. 1). Microarray technologies now allow parallel analysis of thousands of transcripts across many samples simultaneously. Expression levels serve as a surrogate to study the activity of a gene, even though microarrays only measure steady-state levels and do not provide information on the type of transcriptional regulation or post-transcriptional changes. Variation in transcript levels represents an intermediate stage between DNA sequence differences and complex human traits, thereby providing a snapshot of the consequences of DNA variance on cellular processes.
Microarrays function on the principle of complementary hybridization between nucleic acids (A pairs with T, and G pairs with C) and take advantage of the knowledge of the human genome sequence. DNA sequences of varying lengths representing all known genes, and even putative genes, are spotted onto a solid support (eg, glass, metal, nitrocellulose, beads). A typical high-density microarray contains sequences complementary to thousands of gene sequences (probes), each immobilized to a specific spatial coordinate on the microarray surface. The RNA is extracted from tissues or cells of interest and labelled with fluorescent tags or radioactivity. The labelled RNA hybridizes only to complementary sequences on the array, and the signal is proportional to the abundance of the RNA in the sample. This signal is detected using autoradiography, chemiluminescence, or fluorescent scanning. Analysis of the array identifies each gene by the position of its probe on the microarray and quantifies its expression by the level of hybridization (measured as signal intensity) detected for that probe. For data analysis, the signal intensity of each probe is compared between the experimental groups to determine whether the specific mRNA of a gene in one group is upregulated, downregulated, or unaffected compared with the other group. Although recent methodologic developments have significantly reduced the technical problems of microarrays, they are still hampered by the semiquantitative nature of the measures and the fact that only currently known genes or splice variants can be measured.
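The probe-by-probe comparison between experimental groups described above can be illustrated with a minimal sketch. It is not taken from any of the studies reviewed here: the probe names and intensity values are hypothetical, and the use of log2-transformed intensities with Welch's t statistic is an assumption about one common way such a comparison might be set up.

```python
import math
from statistics import mean, variance


def log2_fold_change(case, control):
    """Mean log2 intensity in cases minus mean log2 intensity in controls."""
    return mean(math.log2(x) for x in case) - mean(math.log2(x) for x in control)


def welch_t(case, control):
    """Welch's t statistic on log2 intensities (does not assume equal variances)."""
    a = [math.log2(x) for x in case]
    b = [math.log2(x) for x in control]
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical background-corrected signal intensities for two probes.
probes = {
    "probe_A": {"case": [400.0, 420.0, 380.0, 410.0],
                "control": [200.0, 210.0, 190.0, 205.0]},
    "probe_B": {"case": [150.0, 160.0, 145.0, 155.0],
                "control": [152.0, 158.0, 149.0, 151.0]},
}

# For each probe, store (log2 fold change, Welch t statistic).
results = {}
for name, groups in probes.items():
    results[name] = (
        log2_fold_change(groups["case"], groups["control"]),
        welch_t(groups["case"], groups["control"]),
    )

for name, (lfc, t_stat) in results.items():
    print(f"{name}: log2 fold change = {lfc:.2f}, t = {t_stat:.2f}")
```

In this toy example, probe_A shows roughly a twofold higher signal in cases (a log2 fold change near 1), whereas probe_B is essentially unchanged. A real analysis would additionally require normalization across arrays and a correction for the number of probes tested.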
The completion of the Human Genome Project, together with advances in sequencing technologies, has driven the emergence of a panel of revolutionary sequencing technologies termed next-generation sequencing (NGS). Three NGS platforms currently dominate the market: SOLiD (Applied Biosystems, Foster City, CA), Genome Analyzer (Illumina, San Diego, CA), and 454 GS FLX (Roche, Basel, Switzerland), with more coming in the near future. These technologies generate three to four orders of magnitude more sequence than the conventional Sanger sequencing method, and do so in a cost-effective manner. The fundamental nature of these systems is the miniaturization of single-molecule DNA sequencing reactions, allowing optimal spatial arrangement of each reaction and efficient scanning of millions of individual sequences on a standard glass slide.
NGS has already proven successful in de novo sequencing of genomes, including that of the giant panda, and in the sequencing of total cDNA for transcriptome studies, a method known as RNA-Seq. The key advantages of RNA-Seq are the combination of accurate, quantitative measurement of gene expression; unbiased discovery of novel transcribed regions; and global assessment of alternative splice sites in the genome, all in a single experiment. In RNA-Seq, a population of long RNA is converted to a library of cDNA fragments with adaptor sequences attached to one or both ends. Depending on the platform, each cDNA molecule, with or without amplification, is sequenced in a high-throughput manner to obtain short sequences of 30 to 400 bp. Following sequencing, the reads are typically aligned with a reference genome or transcriptome or assembled de novo to generate a base-resolution expression map for each gene at the transcriptional structure level and/or expression level. One major advantage of RNA-Seq over microarrays, especially for psychiatric disorders, in which smaller differences in gene expression are expected between disease and nondisease tissue, is that the technique is quantitative: RNA-Seq sequences every single transcript, so quantification is not confounded by the intermediate step of hybridization, as it is for microarrays. The wide dynamic range of RNA-Seq allows robust capture of low expressed transcripts and makes it possible to compare the transcriptome across different tissues without technical considerations such as normalization. Challenges of RNA-Seq include the huge amount of data and the bioinformatic analysis it requires, library preparation biases, and other technical issues (eg, the deep sequencing required for adequate coverage of certain low expressed transcripts, low-quality reads, and errors in image analysis).
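The step from aligned reads to a base-resolution expression map can be sketched in a few lines. This is a toy illustration only: the gene coordinates and read positions are hypothetical, the half-open interval convention is an assumption, and real RNA-Seq pipelines rely on dedicated alignment and counting software rather than code of this kind.

```python
from collections import Counter

# Hypothetical gene annotation on one toy reference sequence:
# gene name -> (start, end), 0-based, end-exclusive.
genes = {"geneX": (0, 50), "geneY": (80, 120)}

# Hypothetical aligned reads as (start, end) intervals, 0-based, end-exclusive.
reads = [(5, 35), (10, 40), (30, 60), (85, 115), (200, 230)]


def per_base_coverage(reads):
    """Count how many reads cover each reference position."""
    coverage = Counter()
    for start, end in reads:
        for position in range(start, end):
            coverage[position] += 1
    return coverage


def count_reads_per_gene(reads, genes):
    """Count reads that overlap each annotated gene by at least one base."""
    counts = {name: 0 for name in genes}
    for start, end in reads:
        for name, (gene_start, gene_end) in genes.items():
            if start < gene_end and end > gene_start:  # intervals overlap
                counts[name] += 1
    return counts


coverage = per_base_coverage(reads)
counts = count_reads_per_gene(reads, genes)

print(counts)
print("coverage at position 32:", coverage[32])
```

Here three reads overlap geneX and one overlaps geneY, while the read at positions 200 to 230 falls outside any annotated gene. Such unannotated reads are what allow RNA-Seq to reveal novel transcribed regions that a microarray with fixed probes cannot detect.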
Despite these limitations, with the emergence of new projects such as the 1000 Genomes Project (http://www.1000genomes.org) and the reduction of sequencing costs, it is expected that RNA-Seq will soon surpass microarrays as the gold standard for comprehensive surveying of transcriptomes.
To obtain meaningful results from gene expression experiments for MD research, one has to carefully consider the target tissue. Although brain tissue would be the optimal choice, its use in MD transcriptome experiments is challenging. Postmortem samples can retain their RNA quality and intact histologic architecture with careful processing but may be affected by gene expression changes accompanying death. In addition, phenotype information obtained through psychological autopsies can be confounded by its retrospective nature. This is why many researchers have investigated MD-related gene expression changes in peripheral tissue, most often peripheral blood.
Gene expression patterns are likely to undergo recognizable changes in specific regions of the brain to initiate, sustain, and/or modify the altered biological states that accompany behavioral phenotypes, thereby providing an opportunity to characterize the basis of mental disorders.
High-quality mRNA is a prerequisite for all molecular methods of mRNA profiling. Several groups have published detailed guidelines to assist brain banks and researchers in the processing and freezing of postmortem brains to ensure high-quality RNA. Some of the factors influencing RNA quality in postmortem tissue are listed below, including pH, temperature, and length of the agonal state. Studies have indicated that below the critical pH threshold of about 6.8, transcriptional changes in stress response, apoptosis, and inflammation genes are accelerated. The length of the agonal state has been shown to affect transcription, but it is still unclear to what extent neurons in different layers of the brain exhibit variable vulnerabilities and if apoptosis and RNA degradation occur at the same rates during the agonal period.
Another important factor is the selection of the brain region for analysis. Pierce and Small suggested the use of brain imaging approaches to select brain regions for microarray experiments. Many brain structures have been implicated in the pathophysiology of psychiatric disorders. However, the available data point to dysregulations that affect brain circuits involving several brain regions as opposed to single brain regions. Changes of expression in one area may only be disease relevant when accompanied by changes in other structures in the implicated circuit. This circuit-based approach, however, poses novel problems for the already complicated data analysis in expression microarray studies. In addition, smaller but relevant changes in a subpopulation of cells may be diluted and thus not recognized if a whole region is being analyzed. A combination of laser capture microdissection (LCM) and mRNA amplification techniques allows the comparison of expression changes in single cells.
In addition to these technical considerations, it is important to carefully match cases and controls for a series of factors, including age, gender, ethnicity, agonal state, medications, postmortem interval, laterality of the brain, and time to processing.
The main reason for using peripheral blood as a target tissue to pursue transcriptomic research in MD is that blood is readily accessible. In addition, peripheral blood cells may in fact serve as surrogate markers for some of the disease processes in MD or help characterize disease state even though they are not likely the cell type causally involved in the cognitive symptoms of MD. Peripheral blood cells share more than 80% of the transcriptome with nine tissues: brain, colon, heart, kidney, liver, lung, prostate, spleen, and stomach, and the expression levels of many classes of biological processes have been shown to be comparable between whole blood and prefrontal cortex. Indeed there is considerable communication between the immune system and the central nervous system (CNS). Many cytokine receptors have been located within the CNS, and interleukin-2 mRNA and T-cell receptors have been specifically detected in neurons. Lymphocytes also express several neurotransmitter and hormone receptors, including dopamine, cholinergic, and serotonergic receptors and glucocorticoid and mineralocorticoid receptors and their chaperones. Lymphocytes are directly influenced by glucocorticoids and catecholamines, and these two systems are perturbed in MD. Several studies have reported abnormalities in the immune system of psychiatric patients. Studying lymphocytes in psychiatric disorders thus may yield information on disease-specific immune changes, and changes in lymphocytic immune function may serve as markers of disease progression. In addition, some receptor systems may show similar abnormalities in lymphocytes and the brain. CNS glucocorticoid receptor resistance and its resolution with antidepressant treatment is one of the most consistent biological findings in MD. Steroid resistance also has been reported for the activation of T cells and monocytes in MD and bipolar disorder, suggesting comparable glucocorticoid receptor impairment in immune and CNS cells.
In addition, genetic polymorphisms may similarly affect the function of molecules that are expressed in both lymphocytes and the brain. Binder et al. reported that polymorphisms in FKBP5, a glucocorticoid receptor–regulating co-chaperone of hsp90, are associated with increased lymphocytic levels of FKBP5 protein, as well as an altered response of the stress hormone system, suggesting that the functional effects of these polymorphisms were not limited to immune cells, but also affected CNS function.
Transcriptional profiles in peripheral blood are highly sensitive to the collection method and other handling procedures. Three options are commonly used for transcriptional profiling of peripheral blood cells. The first is to use whole blood. This can be stabilized at the time of blood draw against RNA degradation and further transcriptional activation using proprietary reagents (eg, PAXgene RNA tubes [Qiagen, Venlo, The Netherlands] and Tempus blood RNA tubes [Applied Biosystems, Foster City, CA]). The advantage of investigating mRNA profiles stabilized at the time of blood draw may be counteracted by the fact that whole blood consists of a multitude of different cell types that may be present in varying ratios in diseased compared with control individuals and may consequently result in a heterogeneous cell mixture. Variability in the blood transcriptome may then indicate differences in cellular composition rather than the underlying disease processes. Furthermore, reticulocytes present in whole blood still have high levels of hemoglobin mRNA; this represents about 70% of the mRNA in whole blood. Globin mRNA has been shown to subdue signals from other transcripts and result in noisy data; procedures to remove it are advocated for some applications.
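The way in which cellular composition alone can produce an apparent expression difference in whole blood can be made concrete with a small numeric sketch. All values are hypothetical, and the model (bulk signal as a fraction-weighted sum of per-cell-type expression) is a simplifying assumption rather than a description of any cited study.

```python
# Hypothetical expression of one gene in each cell type (arbitrary units).
# The values are identical for cases and controls: no disease effect per cell.
expression_per_cell_type = {"lymphocytes": 10.0, "neutrophils": 2.0, "monocytes": 5.0}

# Hypothetical cell-type fractions in whole blood; each set sums to 1.
control_fractions = {"lymphocytes": 0.30, "neutrophils": 0.60, "monocytes": 0.10}
case_fractions = {"lymphocytes": 0.20, "neutrophils": 0.70, "monocytes": 0.10}


def bulk_expression(fractions, per_cell_type):
    """Whole-blood signal modelled as a fraction-weighted sum over cell types."""
    return sum(fractions[cell] * per_cell_type[cell] for cell in fractions)


control_bulk = bulk_expression(control_fractions, expression_per_cell_type)
case_bulk = bulk_expression(case_fractions, expression_per_cell_type)

print(f"control whole-blood signal: {control_bulk:.2f}")
print(f"case whole-blood signal:    {case_bulk:.2f}")
```

Under these assumed numbers the whole-blood signal falls from 4.7 in controls to 3.9 in cases, although the gene is expressed at exactly the same level within every cell type. The difference reflects only the shift from lymphocytes toward neutrophils.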
Isolation of specific cell subtypes from whole blood such as peripheral blood monocytes, or even more specific subgroups such as CD4+ or CD8+ T cells, requires additional cell separation and purification steps. These have been shown to alter gene expression by inducing several cell-stress–related genes due to prolonged handling or by activating certain receptor-specific pathways if subtypes are selected using antibodies against certain surface receptors. The third option is to use lymphoblastoid cell lines. Although these have been shown to be a good representation of the in vivo state, expression patterns of specific genes may be affected by the Epstein-Barr virus infection that is necessary for their transformation. Lymphoblastoid cell lines may also exhibit extreme clonality (ie, a situation in which most cells in a culture derive from the same single-cell ancestor) with random patterns of monoallelic expression in single clones.
Although the use of a single cell type reduces the range of factors influencing gene expression, thus increasing the power for genetic investigations [18, 21], expression profiles may be confounded by changes in gene expression due to handling and transformation. In addition, the relatively complex procedures necessary for isolating single cell types are often impractical for very large cohorts. Therefore, in practice, many studies rely on whole blood RNA collection tubes that can be easily used for data collection in large cohorts.
Animal tissue is often the only realistic option as a tissue source for examining brain-related gene expression changes. In animals, postmortem delay is much shorter and can be held constant for all experimental groups. Gene expression studies in inbred animal strains can identify gene expression changes in a homogeneous genetic background, with the signal not masked by the noise generated from the variable genetic background present in human studies [22•]. Furthermore, several inducible gene expression systems have been developed in transgenic animals that allow expression of certain genes of interest in distinct brain regions to be turned on and off. These animals represent powerful tools that enable detailed studies of the impact of individual genes on gene expression.
Optimal animal models for major depression, however, have not been developed yet, undoubtedly at least partly due to the intrinsically human nature of these complex behavioral phenotypes. One approach has been to focus on certain behavioral or endocrinologic dimensions of major depression that can be modelled in animals but do not necessarily represent the complexity of the disorder in humans. In addition, as described for human postmortem studies, neurons within a given brain region exhibit a very heterogeneous expression of neurotransmitters, receptors, and connections to other brain regions, likely leading to differential alterations of gene expression and modifications in neighboring neurons after exposure to the same stimulus. As for the human brain, cell type–specific dissection may be required.
Overall, there is no optimal tissue for examining gene expression changes related to MD, and we must weigh the advantages and disadvantages of each option for a specific research question.
The National Institutes of Health recently proposed an ambitious Genotype-Tissue Expression project, a database that will include expression analysis from 30 different tissues in 1,000 samples. In addition, the National Center for Biotechnology Information has initiated the Gene Expression Omnibus, a gene expression/molecular abundance repository (http://www.ncbi.nlm.nih.gov/geo/index.cgi) supporting high-quality data submissions from research groups worldwide. This curated online resource for gene expression allows data browsing, query, and retrieval in gene expression datasets from human and experimental animals, different tissue types, and different diseases, as well as physiologic states.
Large variability exists in the type of brain regions investigated in postmortem gene expression studies for MD. The investigated brain regions include the amygdala, anterior cingulate cortex [25-27••], prefrontal and orbitofrontal cortex regions [28-30], and hippocampus and hypothalamus [31, 32], all of which have been implicated in the pathophysiology of MD through animal and brain imaging studies. Transcriptional profiles of different brain regions show substantial differences [27••, 29], making comparisons between profiles derived from various brain regions difficult. Even if the same regions are chosen, the proportion of neurons and glia that make up the final cells for RNA extraction remains mostly undetermined, and this may underlie some of the conflicting results from MD gene expression studies. A possible alternative is LCM, as described previously, which allows selection of specific cell types for RNA extraction. Wang et al. used LCM to analyze mRNA from neurons in the paraventricular nucleus of the hypothalamus in postmortem tissue from patients with MD compared with that of controls. With a qPCR candidate gene approach, they observed an upregulation of genes activating the hypothalamus-pituitary-adrenal axis (including the corticotropin-releasing hormone and its receptor [CRHR1] and NR3C2 encoding the mineralocorticoid receptor) and downregulation of inhibiting genes (eg, androgen receptor). This was specific to neurons in the paraventricular nucleus and not observed in neurons of the supraoptic nucleus, underlining the importance of cell and region specificity in these analyses.
Genome-wide approaches, including transcriptomics using microarrays, carry the risk of false-positive associations due to the high number of performed tests. To reduce the number of reported false-positive associations, Sibille et al. [27••] used data from postmortem tissue, as well as an animal model. The authors reported an mRNA signature derived from the amygdala of depressed patients that was validated in an animal model of unpredictable chronic mild stress. In both approaches, genes implicated in oligodendrocyte structure and function were downregulated, whereas genes associated with neuronal enrichment were upregulated. This molecular signature was reversed in the stress model by antidepressive treatment. Oligodendroglial abnormalities were also shown in the temporal cortex of depressed individuals by Aston et al.
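The multiple-testing problem behind these false-positive associations can also be addressed statistically. The studies discussed here relied on cross-validation between human tissue and animal models; the sketch below instead shows the Benjamini-Hochberg false discovery rate adjustment, which is named here as one widely used alternative and is not a method attributed to any of the cited authors. The p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical unadjusted p-values for five probes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]

adjusted = benjamini_hochberg(p_values)
nominal_hits = sum(p <= 0.05 for p in p_values)
fdr_hits = sum(p <= 0.05 for p in adjusted)

print("adjusted p-values:", [round(p, 5) for p in adjusted])
print("hits at unadjusted p <= 0.05:", nominal_hits)
print("hits at adjusted p <= 0.05:", fdr_hits)
```

With these assumed values, four probes pass an unadjusted threshold of 0.05 but only two remain after adjustment. On an array with thousands of probes the gap between nominal and adjusted hits is typically far larger.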
Several other studies investigated gene expression changes in postmortem tissue of patients with MD, with or without suicide (Table 1). The most consistent findings were differences in expression of genes associated with glutamatergic and γ-aminobutyric acid–ergic function [25, 26, 29, 35].
To our knowledge, only one microarray study has investigated depression using RNA from peripheral white blood cells at a genome-wide level. Segman and colleagues found gene expression signatures that could differentiate women prone to postpartum depression from those who were not. These differential signatures were characterized by differences in immune activation and decreased transcriptional engagement in cell proliferation, DNA replication, and repair processes. All other studies have analyzed candidate genes using qPCR methods, the results of which often have been inconsistent or could not be replicated by other groups (Table 2).
For example, Iga et al. observed an increase in vascular endothelial cell growth factor mRNA in peripheral leukocytes of patients with MD that was reversed after psychopharmacologic treatment, but this finding could not be replicated at the protein level. Serotonin transporter (5-HTT) mRNA was shown to be increased in patients with MD in two reports [39, 40] but reduced in another.
Several different animal models for depression have been investigated using microarrays, including animals subjected to chronic stress or comparisons of behaviorally different inbred rodent lines or transgenic animals. However, no consistent results have emerged from these studies [23, 42-44]. This may be due to inherent differences in the stress paradigms, animal strains, and brain-region specific alterations. For example, Surget et al. showed specific transcriptome changes in different brain regions using the unpredictable chronic mild stress paradigm, a model of depression based on socioenvironmental stressors. Chronic administration of two pharmacologically completely different (putative) antidepressant drugs (fluoxetine and CRHR1 antagonist) reversed the behavioral effects and gene expression changes.
Each genome-wide molecular genetic approach bears a high risk of false-positive associations. The combination of results from several platforms to generate convergent evidence may decrease the number of false-positive hits. For example, Le-Niculescu et al. [22•] combined whole-genome gene expression data from the whole blood of bipolar patients with gene expression from the blood and brain of a pharmacologic murine model of bipolar disorder, as well as merged results from published genetic association and linkage studies and published human postmortem brain data. The authors identified five genes involved in myelination and six genes involved in growth factor signaling that showed regulation and associations on all required levels. Although this specific approach has limitations, including the choice of the animal model (valproate vs methamphetamine administration), the use of convergent evidence may be a way to identify new candidate genes in expression research in psychiatric disorders.
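The logic of convergent evidence can be sketched as a simple tally across independent lines of evidence. This is only a schematic illustration of the idea, not a reproduction of the scoring used by Le-Niculescu et al.: the gene names are placeholders, the evidence categories are simplified, and equal weighting of each line of evidence is an assumption.

```python
from collections import Counter

# Hypothetical candidate genes flagged by each line of evidence.
evidence = {
    "human_blood_expression": {"GENE_A", "GENE_B", "GENE_C"},
    "animal_model_expression": {"GENE_A", "GENE_C", "GENE_D"},
    "genetic_association_or_linkage": {"GENE_A", "GENE_B", "GENE_D"},
    "human_postmortem_brain": {"GENE_A", "GENE_C", "GENE_E"},
}

# Score each gene by the number of lines of evidence that support it.
scores = Counter(gene for gene_set in evidence.values() for gene in gene_set)

# Genes supported by every line of evidence.
convergent = sorted(gene for gene, score in scores.items() if score == len(evidence))

# All genes ranked by score (highest first), ties broken alphabetically.
ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))

print("supported on all levels:", convergent)
print("ranking:", ranked)
```

In this toy example only GENE_A is supported on all four levels. A gene flagged by a single platform, such as GENE_E, ranks lowest, which reflects the expectation that a false-positive hit on one platform is unlikely to recur independently on the others.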
Overall, the dramatic technical advances in methods to measure gene expression on a genome-wide level thus far have not been paralleled by breakthrough discoveries in MD using hypothesis-free approaches. This is due in part to technical problems. Microarrays are only semiquantitative, and their dynamic range may not be sufficiently sensitive to detect relatively subtle gene expression differences in psychiatric disorders. This problem may be solved by the use of next-generation sequencing. In addition, access to human brain tissue is difficult and only possible postmortem. Furthermore, it is still not clear whether it is sufficient to investigate changes within a single brain region or whether concerted changes in circuits have to be analyzed. Cell type–specific dissections within a region are likely necessary, but here again we lack the knowledge as to exactly which cells in which region to select. Gene expression changes in whole blood–derived mRNA can be used as disease state or treatment response markers, but they may not uncover major candidates directly involved in the pathomechanism of MD. No single animal model for major depression can claim full validity for the human condition. In addition, MD itself is mostly a biologically and genetically heterogeneous disorder. Studies relying on case-control associations without further biological or genetic stratification may never lead to consistent findings. Adding risk genotypes and endophenotypes may help select more homogeneous patient samples, which may increase the chances of finding consistent gene expression profiles for this disorder.
A next generation of gene expression experiments using biological phenotyping of patients, convergent evidence from animal models, brain-circuit related data analysis, and next-generation sequencing will be needed to make substantial progress in gene expression studies in MD.
Dr. Binder receives grant support from the National Institute of Mental Health and the Doris Duke Foundation.
Disclosure: Dr. Binder receives grant support from PharmaNeuroBoost. No other potential conflicts of interest relevant to this article were reported.
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.