The underlying genetic variations of late-onset Alzheimer’s disease (LOAD) cases remain largely unknown. A combination of genetic variations with variable penetrance and lifetime epigenetic factors may converge on transcriptomic alterations that drive the LOAD pathological process. Transcriptome profiling using deep sequencing technology offers insight into commonly altered pathways regardless of the underpinning genetic or epigenetic factors and thus represents an ideal tool to investigate molecular mechanisms related to the pathophysiology of LOAD. We performed directional RNA sequencing on high-quality RNA samples extracted from hippocampi of LOAD patients and age-matched controls. We further validated our data using qRT-PCR on a larger set of postmortem brain tissues, confirming downregulation of the gene encoding substance P (TAC1) and upregulation of the gene encoding the plasminogen activator inhibitor-1 (SERPINE1). Pathway analysis indicates dysregulation in neural communication, cerebral vasculature, and amyloid-β clearance. Besides protein-coding genes, we identified several annotated and non-annotated long noncoding RNAs that are differentially expressed in LOAD brain tissues; three of them are regulated in an activity-dependent manner and one is induced by Aβ1–42 exposure of human neural cells. Our data provide a comprehensive list of transcriptomic alterations in LOAD hippocampi and warrant a holistic approach including both coding and non-coding RNAs in functional studies aimed at understanding the pathophysiology of LOAD.
Alzheimer’s disease (AD) is a complex age-related neurodegenerative disorder characterized by progressive loss of synapses and neurons in the hippocampus and cortex, which is associated with a gradual decline in short-term memory and cognitive functions. It affects more than 36 million people worldwide, and by the year 2050, the number of people affected by AD may triple [1, 2]. At the molecular level, the hallmarks of AD are the presence of amyloid-β (Aβ) plaques and neurofibrillary tangles (NFTs) in the brains of patients affected by the disease. Mutations in genes involved in Aβ biogenesis (APP, PS1, and PS2) have been found in the small percentage (~13%) of cases of familial early-onset AD. Despite an enormous research effort, causative genetic alterations responsible for most cases of late-onset AD (LOAD) remain largely unknown, suggesting a polygenic multifactorial type of inheritance. The ε4 allele of apolipoprotein E (APOE) is the major genetic risk factor for LOAD; however, recent genome-wide association studies have found common low-penetrance variants associated with LOAD in more than 20 genomic loci [4–8]. Current research is still a long way from the ultimate goal of revealing clear factors that can help in the diagnosis, prevention, and treatment of the disease. Based on the ‘Common Disease, Common Variant’ hypothesis, genetic variations with high frequency and low penetrance might be the major contributors to LOAD. Alternatively, the ‘Common Disease, Rare Variant’ hypothesis suggests that rare variations with high penetrance might explain genetic susceptibility to LOAD. Beyond both of these genetic variation hypotheses, epigenetic factors and lifelong environmental or nutritional influences might contribute to disease without any apparent genetic background.
The low heritability and absence of clear genetic correlates imply that LOAD may represent a common pathogenic process resulting from an interplay of different genetic, environmental, and epigenetic factors. In an attempt to investigate transcriptomic changes that take place in LOAD, several gene expression profiling studies on postmortem brain tissues from AD patients have been performed, mostly using microarray technology. These studies identified AD-specific alterations in cellular processes such as mitochondrial activity, intracellular signaling, and neuroinflammation but, unfortunately, they reached little consensus and yielded little meaningful insight into AD pathophysiology. This is likely due to inherent limitations of microarray technology, which has since been largely replaced by next-generation RNA sequencing (RNAseq). RNAseq provides a more comprehensive and accurate transcriptomic analysis, with the main advantage being the accurate analysis of all RNA species, including non-protein-coding RNAs (ncRNAs) and alternatively spliced isoforms, thus representing an extraordinary tool to study transcriptomic changes associated with the pathogenesis of complex diseases.
Here we investigated transcriptomic changes in the hippocampus of AD patients with directional RNA sequencing and analyzed the data using an experimentally validated bioinformatics pipeline that allows the discovery of novel RNA transcripts and the precise measurement of the expression of protein-coding and non-protein-coding genes. Our study provides for the first time a comprehensive and experimentally validated data set of transcriptomic changes occurring in the hippocampus of LOAD patients and identifies conceptually novel molecular targets that may play a role in the pathogenesis of LOAD.
Human brain samples were prepared from rapid autopsy brain tissue that had been obtained from the Banner Sun Health Research Institute. The average postmortem interval (PMI) for all the samples is 2.48h (Supplementary File 1). All enrolled subjects or legal representatives had signed a Sun Health Research Institutional Review Board–approved informed consent form allowing both clinical assessments during life and several options for brain and bodily organ donation after death. Total RNA was isolated via CsCl purification from tissue dissected from specific regions of the brain. Although not all regions were available from all cases, we examined a total of 118 RNA samples from superior frontal gyrus, entorhinal cortex, hippocampus, and cerebellum in this study. Drs. Douglas E. Wood and Barbara G. Sahagan at Central Nervous System Discovery, Pfizer Global Research and Development, USA previously gifted these RNA samples to our laboratory, and they were utilized in our published work. Tissue sample information was received from the Banner Sun Health Research Institute. Complete clinicopathological information on the samples is provided as Supplementary File 1.
Average plaque density was scored in different brain regions (frontal, temporal, and parietal lobes, hippocampus, and entorhinal cortex) according to the CERAD templates. Values: 0=none; 1=sparse; 2=moderate; 3=frequent; 9=unknown or unavailable. Average NFT density was scored in the same brain regions (frontal, temporal, and parietal lobes, hippocampus, and entorhinal cortex) according to the CERAD templates. Values: 0=none; 1=sparse; 2=moderate; 3=frequent; 9=unknown or unavailable. The Braak stage describes the topographical progression of NFTs, as well as associated dystrophic neurites and neuropil threads, throughout the transentorhinal and entorhinal areas, CA1 subfield of the hippocampus, amygdala, and cerebral neocortex. Evaluation was made according to the original publication. The NIA-Reagan criteria are consensus recommendations for the diagnosis of AD developed by a committee jointly sponsored by the National Institute on Aging (NIA) and the Reagan Institute. The guidelines suggest that certain combinations of CERAD neuritic plaque density and Braak neurofibrillary stage confer probabilistic estimates of their likelihood of being responsible for dementia. Thus, there are low, intermediate, and high likelihoods for dementia due to AD histopathology. The category “not AD” is checked off if there are either no plaques at all or no tangles at all. The category “criteria not met” is checked off if the subject was not demented.
RNA quality was measured using the Agilent Bioanalyzer RNA nano chip, and the RNA integrity number (RIN) of the samples was between 6.5 and 8.6. RNA samples were prepared for directional RNA sequencing using a modified version of the Illumina Directional mRNA-Seq sample preparation protocol. Briefly, 1μg of total RNA was processed using the Ribo-Zero™ rRNA Removal Kit (Epibio) to remove ribosomal RNAs. Ribosome-depleted RNA was treated with phosphatase before being treated with T4 polynucleotide kinase (PNK). PNK-treated RNA was then purified with the QIAGEN RNeasy column purification kit, and different 3’ and 5’ RNA adapters were ligated to the two ends of the RNA in separate reactions. Next, the RNA was reverse transcribed and PCR amplified on standard thermal cyclers for 15 cycles. PCR products were purified using AMPure beads. RNA sequencing libraries were validated using the Agilent Bioanalyzer High Sensitivity DNA kit and sequenced using the Illumina HiSeq2000 platform at the Genomics sequencing core at the Hussman Institute for Human Genomics, University of Miami. Each sample was run in a single lane of a flow cell to increase depth of sequencing.
The RankProd R package was used to perform gene expression analysis of RNAseq data. First, genes were filtered based on read coverage, and only genes with an average coverage of at least 20 reads in one of the two groups were retained. The FPKM values of the retained genes were used as input for RankProd analysis, which ranks genes in each replicate from the experimental group based on their up- or downregulation compared to the control group. It then derives the rank product (RP), which is the product of all the individual ranks for a given gene across replicates. Genes with a small RP are consistently up- or downregulated across replicates. The p value for each gene is calculated from permutation tests and reflects the number of times RP values smaller than or equal to a given experimental RP value occur in the 100 random experiments. To correct for multiple comparisons, the percentage of false positives (pfp), equivalent to the FDR, is calculated by dividing the p value by the relative rank of a gene in the gene list. A pfp value of 0.1 was used as the cutoff for statistical significance.
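The rank product procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RankProd package itself: it takes one list of log fold changes per replicate comparison, considers upregulation only, and uses hypothetical function names and toy values that do not come from the study.

```python
import random


def rank_desc(values):
    """Rank values so the largest gets rank 1 (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def rank_product(log_fc_by_replicate):
    """Multiply, per gene, the ranks obtained in every replicate."""
    n_genes = len(log_fc_by_replicate[0])
    rp = [1.0] * n_genes
    for log_fc in log_fc_by_replicate:
        for gene, rank in enumerate(rank_desc(log_fc)):
            rp[gene] *= rank
    return rp


def permutation_p_and_pfp(log_fc_by_replicate, n_perm=100, seed=0):
    """Return (RP, permutation p value, pfp) for every gene."""
    rng = random.Random(seed)
    observed = rank_product(log_fc_by_replicate)
    n_genes = len(observed)
    counts = [0] * n_genes
    for _ in range(n_perm):
        # Shuffle gene labels independently within each replicate.
        shuffled = []
        for log_fc in log_fc_by_replicate:
            permuted = list(log_fc)
            rng.shuffle(permuted)
            shuffled.append(permuted)
        null_rp = rank_product(shuffled)
        for gene in range(n_genes):
            counts[gene] += sum(1 for v in null_rp if v <= observed[gene])
    # p value: fraction of null RPs at or below the observed RP.
    p = [c / (n_perm * n_genes) for c in counts]
    # pfp: p value divided by the relative rank of the gene in the RP-sorted list.
    order = sorted(range(n_genes), key=lambda g: observed[g])
    pfp = [0.0] * n_genes
    for position, gene in enumerate(order, start=1):
        pfp[gene] = p[gene] / (position / n_genes)
    return observed, p, pfp


# Toy input: 3 replicate comparisons x 5 genes (hypothetical log fold changes).
toy_log_fc = [
    [3.0, 0.1, -0.2, 0.0, -2.5],
    [2.8, -0.1, 0.2, 0.1, -3.0],
    [3.2, 0.0, 0.1, -0.3, -2.0],
]
rp, p_values, pfp_values = permutation_p_and_pfp(toy_log_fc, n_perm=100, seed=1)
significant = [g for g, v in enumerate(pfp_values) if v < 0.1]
```

Downregulated genes can be scored the same way by negating the log fold changes before ranking.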
Differentially expressed protein-coding genes were used as the input list to perform enrichment analysis with GeneGo MetaCore from Thomson Reuters. The GeneGo enrichment analysis tool uses manually annotated reference pathways to calculate the enrichment of a given list of genes in each pathway and provides a p value and an FDR value that reflect the probability that the observed number of genes from a pathway would appear in the list by chance. Only pathways with an FDR below 0.1 were considered significantly enriched.
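To make the enrichment statistics concrete, the sketch below computes an over-representation p value and an FDR for a gene list. It assumes a hypergeometric test and a Benjamini-Hochberg correction, which is a common choice for this kind of analysis; the exact statistics computed inside MetaCore are not described here, and the function names and numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_list, n_overlap):
    """P(overlap >= n_overlap) when n_list genes are drawn at random from
    n_universe genes, n_pathway of which belong to the pathway."""
    total = comb(n_universe, n_list)
    upper = min(n_pathway, n_list)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Convert p values to FDR-adjusted q values (step-up procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical example: 143 input genes, three pathways of different sizes.
pathways = {"pathway_a": (200, 12), "pathway_b": (150, 3), "pathway_c": (80, 6)}
p_by_pathway = {
    name: hypergeom_enrichment_p(12000, size, 143, overlap)
    for name, (size, overlap) in pathways.items()
}
q_values = benjamini_hochberg(list(p_by_pathway.values()))
enriched = [name for name, q in zip(p_by_pathway, q_values) if q < 0.1]
```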
All RNA samples were treated with DNase to remove any contaminating genomic DNA. RNA was then reverse transcribed using the Superscript first strand kit (Life Technologies), and the cDNA was diluted and used as template for SYBR Green or TaqMan qPCR on the ABI 7900 (Life Technologies). RNA from human brain, heart, kidney, liver, lung, and muscle was purchased from Life Technologies. Human neuron and astrocyte RNA was purchased from ScienCell Research Laboratories (cat# 1525 and cat# 1585, respectively). SERPINE1, TAC1, PGK1, and β-ACTIN TaqMan assays were purchased from Life Technologies. Primers used for SYBR Green qRT-PCR to validate and measure the expression of novel ncRNAs are listed in Supplementary File 7. Non-coding RNA expression was measured using Power SYBR Green (Life Technologies). For all qRT-PCR reactions, we included three technical replicates. To compare the expression of genes across different cellular compartments, GraphPad Prism software was used to perform ANOVA followed by the Tukey post-hoc test. A p value below 0.05 was considered statistically significant. The Student’s t-test was used to compare expression between normal and LOAD brain.
Human NSCs are isolated from three human fetal brains collected from 3rd trimester aborted fetuses, which we receive from the Birth Defects Research Lab at the University of Washington in Seattle, and maintained in culture as neurospheres as previously reported. Briefly, dissected brain tissues are mechanically dissociated into single-cell suspensions and seeded at a density of 5×10⁶ cells in 75-mm tissue culture flasks in Human Neural Progenitor Media from Lonza in the presence of EGF and FGF2. After 7–10 days of culture, NSCs form neurosphere colonies, whereas other cell types remain in suspension as single cells or attach to the bottom of the flask. The isolated hNSCs can be cultured as neurospheres in suspension for several months. Alternatively, hNSCs can be differentiated in vitro into a mixed population of neurons and astrocytes. Briefly, to induce differentiation, neurospheres are disaggregated into single cells and plated in 6-well plates coated with poly-L-ornithine (PLO) and laminin in B27/Neurobasal media in the absence of growth factors for 21 days. The differentiated culture contains a mix of neural cell lineages including astrocytes and neurons as well as their progenitor cells.
To induce cellular depolarization, NSCs are differentiated for 21 days in the presence of B27 (without retinoic acid) into a mixed population of neurons and astrocytes. This mixed population of neural cells is depolarized utilizing 50mM KCl for 1h prior to RNA extraction.
1mg of lyophilized Aβ42 peptide was purchased from Sigma Aldrich (Catalog Number A9810) and dissolved in 1mL of 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP) to promote formation of α-helix and minimize β-sheet structure. The solution was air-dried in a fume hood and the resulting clear film was re-suspended in DMSO to a stock concentration of 1mM. NSCs are differentiated for 21 days in the presence of B27 (without retinoic acid) into a mixed population of neurons and astrocytes. This mixed population of neural cells is exposed to 10μM of soluble Aβ42 for 48h prior to RNA extraction.
Human NSCs were fractionated into cytosol, nucleoplasm and chromatin using a modified NE-PER Kit (PIERCE) and RNA was isolated using a combination of two protocols: Trizol (Life Technologies) and RNeasy Mini Kit (QIAGEN). The kit was used as described in the protocol except the insoluble pellet after nucleoplasm extraction was washed with PBS once and re-suspended in Trizol reagent until completely dissolved before proceeding to RNA extraction from the chromatin fraction.
Human neural stem cells were plated on 8-well glass chamber slides (Millipore, PEZGS0816) coated with poly-l-ornithine for 3h and laminin overnight. Cells were plated at the density of 40,000 cells per well and differentiated in Neurobasal media containing B27 supplement without vitamin A in absence of growth factors for 21 days before staining. After 21 days, cells were fixed with 4% formaldehyde for 10min, permeabilized with 0.2% triton X, and incubated for 1h in 20% goat serum to prevent non-specific binding of primary antibodies. Cells were then incubated with primary antibody overnight at 4°C, subsequently cells were washed 3 times with PBS and incubated with fluorescently labeled secondary antibodies for 2h at room temperature. Antibodies used: rabbit anti-βIII-tubulin (Covance, MRB-435P), chicken anti-MAP2 (Abcam, ab5392), mouse anti-GFAP (Millipore, MAB360).
RNA deep sequencing (RNAseq) data analysis currently represents the bottleneck for the application of this powerful technology, since an experimentally validated and globally accepted pipeline for data analysis is still missing. In this study, we developed and experimentally validated a computational approach based on the TopHat and Cufflinks packages to accurately measure gene expression and to discover and annotate novel RNA transcripts. As schematically represented in Fig. 1A, TopHat is used to align reads to the human genome, while transcriptome reconstruction of each individual sample is performed using Cufflinks. Reconstructed transcriptomes for the individual samples are compared to each other using Cuffcompare, which also merges overlapping constructs and annotates them according to the reference transcriptome provided (Ensembl GRCh37). In order to discriminate real transcripts from sequencing artifacts, we adopted a filtering strategy based on two main criteria: 1) the presence of at least one splice junction in the transcript and 2) the recurrent expression of single-exon transcripts across multiple samples. We noted that applying this filtering strategy dramatically reduces the number of fragments originating from introns, repeat transcripts, and polymerase run-off RNAs.
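The two filtering criteria can be expressed as a simple predicate. The sketch below is illustrative only: the `Transcript` record, its field names, and the `min_samples=2` threshold for "multiple samples" are assumptions made for the example, since the exact recurrence threshold is not stated.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    transcript_id: str          # hypothetical identifier, e.g. an XLOC id
    n_exons: int                # number of exons in the reconstructed model
    n_samples_expressed: int    # samples in which the transcript was detected


def passes_filter(transcript, min_samples=2):
    """Keep spliced transcripts, or single-exon ones seen in several samples."""
    has_splice_junction = transcript.n_exons >= 2
    recurrent_single_exon = (
        transcript.n_exons == 1
        and transcript.n_samples_expressed >= min_samples
    )
    return has_splice_junction or recurrent_single_exon


# Toy candidates (hypothetical values).
candidates = [
    Transcript("spliced_once_seen", n_exons=3, n_samples_expressed=1),
    Transcript("single_exon_recurrent", n_exons=1, n_samples_expressed=5),
    Transcript("single_exon_singleton", n_exons=1, n_samples_expressed=1),
]
retained = [t.transcript_id for t in candidates if passes_filter(t)]
```

Under these assumptions the single-exon transcript seen in only one sample is the only candidate discarded, which is the kind of fragment the filter is meant to remove.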
After filtering, annotated transcripts are subdivided, based on Ensembl classification, into protein-coding genes, long intergenic non-coding RNAs (lincRNAs), and natural antisense transcripts (NATs) (Fig. 1A). Non-annotated transcripts are divided into intergenic RNAs, which are located between other transcripts without overlapping them, and NATs, which overlap other transcripts in the opposite orientation. These novel transcripts may encode novel proteins or may be noncoding RNAs. To assess the protein-coding potential of these novel transcripts we used the coding potential assessment tool (CPAT), an alignment-free method that rapidly recognizes coding and noncoding RNAs. The Cuffdiff module of Cufflinks is used to retrieve Fragments Per Kilobase of transcript per Million mapped fragments (FPKM) values for both annotated and novel genes, which are then used to calculate differential gene expression by applying Rank Products. Importantly, we noted that a certain level of read coverage is necessary to obtain reliable differential expression analysis. By comparing qRT-PCR with RNAseq data, we established a coverage of 20 reads per transcript as the minimum required to achieve reliable differential expression analysis.
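As a reminder of what the FPKM unit means and how a coverage cutoff of 20 reads could be applied, here is a minimal sketch. It uses the standard FPKM definition rather than Cuffdiff's full statistical model, and the function names and read counts are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def mean(values):
    return sum(values) / len(values)


def passes_coverage(control_reads, load_reads, min_mean_reads=20):
    """Keep a gene if its average read count reaches the cutoff in at
    least one of the two groups."""
    return max(mean(control_reads), mean(load_reads)) >= min_mean_reads


# Hypothetical gene: 500 fragments on a 2 kb transcript, 50 million mapped fragments.
example_fpkm = fpkm(500, 2000, 50_000_000)

# Hypothetical per-sample read counts for 4 controls and 4 LOAD samples.
keep_gene = passes_coverage([5, 8, 12, 7], [25, 30, 18, 22])
```

In this toy case the gene is retained because the LOAD group averages more than 20 reads even though the control group does not.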
We performed directional RNA sequencing using the Illumina HiSeq 2000 on RNA extracted from the hippocampi of four patients affected by LOAD and of four age-matched control individuals (Table 1 and Supplementary File 1). Directional RNAseq was performed running one sample per lane, and we generated ~177,000,000 reads per sample, yielding excellent coverage and sequencing depth (Tables 2 and 3). The directional sequencing technique, unlike conventional sequencing methods, keeps strand-of-origin information in order to accurately align reads from the positive and negative DNA strands. The strand-specific alignment helps to identify NATs and to accurately measure expression values of both sense and antisense RNA transcripts. We applied our bioinformatics pipeline and identified 13,054 annotated genes expressed in at least one of the sequenced samples (Fig. 1B). We further subdivided these annotated genes, based on Ensembl classification, into protein-coding genes (12,132), lincRNAs (499), and NATs (423) (Fig. 1B). We also found 2,082 novel lincRNAs and 443 novel NATs expressed in at least one of the analyzed samples. Of these novel transcripts, 51 intergenic RNAs and 32 antisense RNAs are predicted to have protein-coding potential by CPAT analysis, thus representing potential novel protein-coding genes (Fig. 1B). A complete list of all RNA transcripts retrieved from sequencing is provided as additional material (Supplementary File 2), and the raw data discussed in this publication have been deposited in NCBI’s Gene Expression Omnibus [24, 25] and are accessible through GEO Series accession number GSE67333.
In order to validate the accuracy of reconstruction and expression of novel RNA transcripts, we designed primers spanning exon junctions and performed qRT-PCR on RNA extracted from human hippocampus, cerebellum, and prefrontal cortex. With this method, we were able to validate the expression of 14 previously unannotated ncRNAs, confirming the accuracy of our bioinformatics analysis in reconstructing and detecting expression of novel ncRNA transcripts (Fig. 1C).
To identify AD-specific transcriptomic changes, we retrieved FPKM values for genes using Cuffdiff and performed Rank Products analysis [18, 26] to obtain q values corrected for multiple comparisons. As a criterion for determining differentially expressed genes, we used a percentage of false positives (pfp), which is the equivalent of the false discovery rate, of <0.1. In order to detect potential outlier samples, we used hierarchical clustering followed by plotting a heatmap of log-transformed FPKM values for each sample. The hierarchical clustering was performed by calculating the Euclidean distance of the logarithm of FPKM values for differentially expressed genes between each of the 4 control and 4 AD samples. As shown in Fig. 1D, control samples clustered together, as did AD samples, indicating the absence of obvious outlier samples in our analysis. Differential expression analysis identified 143 protein-coding genes, 90 lincRNAs, 31 antisense RNAs, and 1 novel putative protein-coding gene to be differentially expressed between AD and control (Fig. 2A and Supplementary File 3). Three of the AD samples we selected for sequencing were Braak stage VI and the remaining sample was stage V. Braak stages V and VI indicate the presence of NFTs in neocortical regions, which typically indicates with high probability an advanced stage of the disease. These cases of AD could be characterized by extensive neuronal cell death and the presence of gliosis in the brain, and this could result in differential expression of cell-type-specific genes in our data set. To test this possibility, we looked at the expression of known neuronal, astroglial, and microglial markers in control and AD hippocampi in our RNAseq data. We observed no changes in the expression of neuronal markers: DCX (Fc: 1.02, pfp: 1.07), MAP2 (Fc: 0.74, pfp: 0.99), NFH (Fc: 0.73, pfp: 0.99), NEFM (Fc: 0.71, pfp: 0.99), RBFOX3 (Fc: 0.92, pfp: 1). We observed no changes in the expression of astroglial markers: GFAP (Fc: 1.4, pfp: 0.8), AQP4 (Fc: 1.2, pfp: 0.99), ALDH1L1 (Fc: 1.22, pfp: 0.98), SLC1A3 (Fc: 1.08, pfp: 0.99). We also observed no changes in the expression of microglial markers: PTPRC (Fc: 0.74, pfp: 0.99), AIF1 (Fc: 0.74, pfp: 1.01). These data indicate that, although the AD samples we analyzed have high Braak scores, they were not depleted of neurons or enriched in astroglial and microglial cells, and that the differentially expressed genes from our analysis are not the result of an imbalance in cellular populations between AD and control hippocampi.
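The outlier check based on hierarchical clustering of log-transformed FPKM values can be illustrated with a small pure-Python sketch. The Euclidean distance follows the description above; the average-linkage rule, the pseudocount of 1 added before taking logarithms, the function names, and the toy expression values are all assumptions made for the example.

```python
import math


def log_profile(fpkm_values, pseudocount=1.0):
    """Log2-transform FPKM values (pseudocount avoids log of zero)."""
    return [math.log2(v + pseudocount) for v in fpkm_values]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_samples(profiles, n_clusters=2):
    """Agglomerative clustering with average linkage.
    profiles: dict mapping sample name -> expression vector."""
    clusters = [[name] for name in profiles]

    def average_distance(c1, c2):
        total = sum(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        # Merge the two closest clusters.
        i, j = min(pairs, key=lambda ij: average_distance(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Toy FPKM values for three genes in four samples (hypothetical).
toy_fpkm = {
    "CTRL1": [100, 5, 50],
    "CTRL2": [90, 6, 55],
    "AD1": [10, 80, 50],
    "AD2": [12, 70, 45],
}
toy_profiles = {name: log_profile(values) for name, values in toy_fpkm.items()}
groups = cluster_samples(toy_profiles, n_clusters=2)
```

A sample that joined the wrong group, or stayed isolated until the last merge, would be a candidate outlier.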
Interestingly, the proportion of differentially expressed genes is much higher for lincRNAs and NATs than for protein-coding genes, suggesting a substantial alteration of the noncoding part of the genome in complex disorders such as LOAD and an underestimated importance of long ncRNAs in the pathophysiological processes underlying AD. Next, we technically validated the RNAseq differential expression data by performing qRT-PCR analysis on the same RNA samples used for sequencing (Fig. 2B). From the RNAseq data we selected protein-coding genes, lincRNAs, and NATs with different expression levels and with a wide range of differential expression changes. We observed a high degree of correlation between the log2 fold changes from the two techniques for protein-coding genes (n=6, p=0.0255, r=0.8666), lincRNAs (n=6, p=0.0121, r=0.9086), and NATs (n=6, p=0.004, r=0.9481) (Fig. 2B, Supplementary File 4). Our RNAseq data therefore provide an accurate assessment of gene expression levels and an experimentally validated, comprehensive catalog of expression changes of both protein-coding and non-coding genes in the hippocampus of patients suffering from LOAD.
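The agreement between the two techniques is summarized by a correlation coefficient of log2 fold changes. The sketch below shows how such a coefficient can be computed, assuming Pearson correlation; the function name and the six fold-change pairs are hypothetical and are not the values reported in Supplementary File 4.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical log2 fold changes (LOAD vs control) for six genes.
rnaseq_log2_fc = [1.8, -1.2, 0.9, 2.5, -0.7, 1.1]
qpcr_log2_fc = [2.1, -0.9, 0.4, 3.0, -1.0, 1.5]
r = pearson_r(rnaseq_log2_fc, qpcr_log2_fc)
```

With only six points per class, a high coefficient is encouraging but carries a wide confidence interval, so the p value should be read alongside it.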
Among the differentially expressed genes, we identified 61 protein-coding genes expressed at lower levels and 82 at higher levels in AD hippocampi compared to control samples (pfp<0.1) (Supplementary File 3). We used GeneGo MetaCore from Thomson Reuters to perform pathway analysis of differentially expressed genes, focusing our attention on process networks. These represent a comprehensive classification of biological processes based on a specific functional theme and define the involvement of genes in a process based on both the gene function and its known interactions with other genes related to the process. The two most enriched process networks are related to nerve impulse transmission and neuropeptide signaling (Table 4). The analysis of neuropeptide signaling pathways enriched for differentially regulated genes in our dataset revealed that the gene encoding substance P (TAC1) is downregulated in the hippocampi of AD patients compared to controls. The neurotrophic and neuroprotective activity of substance P, a neuropeptide involved in pain perception, has been demonstrated both in vitro and in vivo, and reduced substance P levels and degeneration of substance P-immunopositive neurons in the brain of AD patients have been reported [27, 28]. In order to confirm our RNAseq results and to investigate the expression of TAC1 in other brain regions, we performed qRT-PCR analysis of the hippocampus, cerebellum, superior frontal gyrus, and entorhinal cortex of LOAD and control patients. We used a cohort of postmortem brain tissues representing four brain regions of 24 LOAD cases and 23 age- and gender-matched controls (Supplementary File 1). Although not all of the brain regions were available from every individual, we had access to RNA from the cerebellum of 24 patients and 20 controls, the hippocampus of 12 patients and 10 controls, the entorhinal cortex of 8 patients and 9 controls, and the superior frontal gyrus of 17 patients and 18 controls.
We validated the downregulation of substance P expression in the hippocampus and observed a clear reduction, although not statistically significant, in the entorhinal cortex of AD patients compared to controls (Fig. 3A). There were no significant changes in TAC1 expression in the superior frontal gyrus or in the cerebellum (Fig. 3A). The hippocampus and entorhinal cortex are among the first areas of the brain to be affected in AD; thus, substance P loss in these two regions may represent an early event during the pathogenesis of the disease.
The other two statistically enriched process networks were muscle contraction and development blood vessel morphogenesis, indicating alterations in hippocampus vascularization in LOAD patients (Table 4).
Besides the genes belonging to these networks, we noticed, among the most differentially expressed protein-coding genes in our RNAseq dataset, SERPINE1, which is a regulator of vascular function and was previously associated with AD [29, 30]. According to our RNAseq data, SERPINE1 was upregulated more than 3-fold in LOAD hippocampi. SERPINE1 codes for the plasminogen activator inhibitor type-1 (PAI-1), which negatively regulates plasminogen activator proteins, thus reducing the production of plasmin, a serine protease that plays a critical role in fibrinolysis but also in mediating Aβ clearance. Because of this dual role of SERPINE1, we decided to further study the expression of this gene in our larger cohort of AD samples. qRT-PCR analysis of SERPINE1 expression in different areas of the brain confirmed the abundant overexpression of SERPINE1 in LOAD brain regions known to be affected by the disease: hippocampus (~11 fold), superior frontal gyrus (~5 fold), and entorhinal cortex (~10 fold) (Fig. 3B). On the other hand, no differences in the expression of SERPINE1 were observed in the cerebellum, a brain region that is typically not affected by the disease (Fig. 3B). Thus, SERPINE1 overexpression in specific areas of the brain may result in alteration of the cerebral vasculature and reduction of Aβ degradation.
NATs are a class of lncRNAs that are transcribed from the opposite DNA strand of other RNA transcripts, with which they share sequence complementarity. Antisense RNAs exert regulatory functions on protein-coding gene expression by different mechanisms at both the transcriptional and post-transcriptional level, and the involvement of NATs in the pathophysiology of neuropsychiatric disorders has been previously demonstrated [32–35]. Among the 31 differentially expressed NATs, we found 21 (9 novel and 12 annotated) to be upregulated and 10 (4 novel and 6 annotated) to be downregulated in LOAD compared to control individuals (Supplementary File 3). Among the most differentially expressed NATs are two multi-exonic RNAs that were not previously annotated and that we named HAO2-AS (XLOC_051561) and EBF3-AS (XLOC_083817). HAO2-AS is a 5-exon RNA of 488 nt transcribed from the opposite strand of the protein-coding gene HAO2 on chromosome 1 (Fig. 4A), while EBF3-AS is a 2-exon RNA of 842 nt transcribed from the opposite strand of the protein-coding gene EBF3 on chromosome 10 (Fig. 4A). 5’ RACE experiments validated the structure of these novel antisense ncRNAs and confirmed the accuracy of our RNAseq analysis in reconstructing and annotating novel transcripts (Table 5). These two antisense ncRNAs show tissue-specific expression profiles, with HAO2-AS most abundantly expressed in the heart and EBF3-AS most abundantly expressed in the brain (Fig. 4B). The differential expression of these NATs in LOAD cases and their tissue-specific pattern of expression suggest that they are not spurious random transcripts and that they might have important regulatory functions in heart and brain.
In order to better characterize HAO2-AS and EBF3-AS, we performed subcellular fractionation of hNSCs isolated from three fetal brain tissues and extracted RNA from three separate fractions: cytosol, nucleoplasm, and chromatin. While protein-coding genes are enriched in the cytoplasm (data not shown), both NATs are more abundantly expressed in the nucleoplasm and chromatin fractions (Fig. 4C), suggesting their involvement in nuclear-related cellular processes such as epigenetic modification of histones, which has been demonstrated for many long noncoding RNAs (lncRNAs). qRT-PCR analysis of EBF3-AS expression in different areas of the brain of AD cases and controls revealed abundant overexpression of this NAT in AD brain regions known to be severely affected by the pathology (hippocampus, superior frontal gyrus, and entorhinal cortex), while no differences were observed in the cerebellum (Fig. 4D, E). Similar expression analysis for HAO2-AS revealed its upregulation in the hippocampus and superior frontal gyrus, while no differences were observed in the entorhinal cortex or cerebellum. Our data demonstrate that NATs are specifically dysregulated in brain regions affected by LOAD pathology and suggest involvement of this class of lncRNAs in nuclear-related cellular processes.
LincRNAs are lncRNAs transcribed in intergenic regions [36, 37] that can exert regulatory functions at different levels and by different mechanisms. Several independent studies confirmed that lincRNAs can function as enhancer RNAs (eRNAs) to regulate the expression of neighboring genes [39–41] or act as scaffolds for the recruitment of chromatin-modifying enzymes to regulate the epigenetic state of specific genomic loci [42–44]. In our RNAseq analysis, we identified 89 lincRNAs to be differentially expressed in LOAD hippocampi compared to control; 72 of these are novel non-annotated RNA transcripts (Supplementary File 3). Among the differentially expressed lincRNAs, two multiexonic lincRNAs were selected for qRT-PCR validation in our larger cohort of brain specimens of LOAD cases and controls. AD-linc1 (XLOC_753726) is a 264 nt long transcript composed of 14 exons transcribed from chromosome 9, while AD-linc2 (XLOC_612449) is a 3-exon, 246 nt long transcript transcribed from chromosome 6 (Fig. 5A). Both of these transcripts were upregulated in AD compared to control, and because they were among the most highly expressed lincRNAs in our dataset, we decided to focus our attention on them. Expression of these two RNAs is tissue specific, with AD-linc1 being mostly expressed in the brain, while AD-linc2 is abundantly expressed in the heart and kidney and almost undetectable in the normal brain (Fig. 5B). Similarly to what we observed for antisense transcripts, we noticed a clear localization of these two lincRNAs in the chromatin compartment, suggesting a role in regulating chromatin structure (Fig. 5C). qRT-PCR analysis of AD-linc1 and AD-linc2 expression in different brain regions of AD cases compared to controls revealed increased expression of these two ncRNAs in the hippocampus and superior frontal gyrus of AD patients, while no differences were observed in the cerebellum (Fig. 5D, E).
Although the difference was not statistically significant, likely because of the limited number of cases available, we observed the same trend in the entorhinal cortex, where these two ncRNAs are more abundantly expressed in AD than in controls (Fig. 5D, E).
To evaluate in which specific cell type (neuronal or astroglial) the LOAD-lncRNAs are enriched, we compared their expression in commercially available RNA extracted from human neurons and astrocytes. As shown in Fig. 6A, AD-linc1, AD-linc2, HAO2-AS, and EBF3-AS are all enriched, although to different degrees, in neurons compared to astrocytes. Recent studies revealed that the expression of a subset of long ncRNAs is dynamically regulated during neuronal activation, suggesting their involvement in activity-dependent neuronal processes [45, 46]. To investigate transcriptional changes of LOAD-dysregulated lncRNAs in response to activity and to identify potential novel activity-dependent players in the pathology, we depolarized human neurons derived from the in vitro differentiation of human neural stem cells (hNSCs) (Fig. 6B, C and Supplementary File 5) and measured the expression of AD-linc1, AD-linc2, HAO2-AS, and EBF3-AS using qRT-PCR. NSCs were isolated from human fetal brain and cultured as floating neurospheres as previously described [47, 48]. KCl depolarization induced canonical activity-dependent transcription of the immediate early gene c-FOS, confirming the ability of our neuronal culture to activate transcription of activity-dependent genes upon depolarization. Among the analyzed ncRNAs, AD-linc2, HAO2-AS, and EBF3-AS show activity-dependent transcriptional activation (Fig. 6C). Recent studies have shown that lncRNA expression changes in cells exposed to different stress conditions [49, 50]. To exclude the possibility that KCl treatment caused a generalized increase in lncRNA transcription, we measured the expression of BDNF-AS, MALAT1, and BACE1-AS, the last two having previously been shown to be modulated by stress conditions [14, 51]. As shown in Fig. 6C, expression of BDNF-AS, MALAT1, and BACE1-AS does not change in response to KCl treatment, thus excluding a generalized increase in lncRNA transcription.
Activity-dependent transcriptional regulation implies that these LOAD-lncRNAs might have functional regulatory significance. One possibility is that these activity-dependent lncRNAs regulate the expression of nearby genes in cis. To test this hypothesis, we looked at changes in the expression of genes located in the same locus after KCl treatment. Interestingly, we did not observe any changes in the expression of nearby genes upon KCl depolarization (Supplementary File 6), indicating that transcriptional regulation of these activity-regulated lncRNAs is independent of nearby genes and suggesting a trans-acting mechanism of function for these lncRNAs. Together, our data suggest involvement of these lncRNAs in LOAD pathophysiology.
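For readers who want to reproduce this kind of readout, the sketch below shows one common way to express qRT-PCR results as expression relative to a housekeeping gene, normalized to non-treated cells. It assumes the 2^-ΔΔCt approach with equal amplification efficiency for target and reference; this section does not state which quantification method was used, so that choice is an assumption. The function names and all Ct values are hypothetical and are not data from this study.

```python
# Sketch of relative quantification for qRT-PCR, ASSUMING the 2^-ddCt approach
# (equal amplification efficiency for target and reference gene).
# Function names and Ct values are hypothetical, not taken from the study.


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Target expression relative to a reference gene: 2^-(Ct_target - Ct_reference)."""
    return 2.0 ** -(ct_target - ct_reference)


def fold_change(
    ct_target_treated: float,
    ct_ref_treated: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Reference-normalized expression in treated cells divided by that in control cells."""
    treated = relative_expression(ct_target_treated, ct_ref_treated)
    control = relative_expression(ct_target_control, ct_ref_control)
    return treated / control


if __name__ == "__main__":
    # Hypothetical example: the target Ct drops by 2 cycles after treatment
    # while the reference gene Ct is unchanged, giving a 4-fold increase.
    print(fold_change(24.0, 18.0, 26.0, 18.0))  # 4.0
```

A fold change close to 1 for a nearby gene would correspond to the "no change upon KCl depolarization" outcome described above, whereas values well above 1 would indicate induction.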
Different long ncRNAs have been previously implicated in distinct aspects of AβPP processing and amyloid generation [14, 52–54]; however, the effects of Aβ exposure on the expression of lncRNAs in human neural cells have not been previously investigated. In order to assess whether the expression of LOAD-dysregulated lncRNAs was altered by Aβ exposure, we treated differentiated hNSCs with 10 μM of soluble Aβ42 and then measured the expression of AD-linc1, AD-linc2, HAO2-AS, and EBF3-AS using qRT-PCR. We observed that 48 h exposure to Aβ42 triggers the expression of the novel lincRNA overexpressed in LOAD, AD-linc1. As expected, exposure to Aβ42 also induces activation of c-FOS in differentiated neural cells. These data suggest that accumulation of Aβ42 in LOAD induces the expression of AD-linc1, which might contribute to Aβ42-induced neurotoxicity.
The complex etiology of LOAD suggests that the pathology results from an intricate interplay of genetic predisposition and environmental factors. This interplay is reflected in epigenomic and transcriptomic alterations of neural cells during disease progression [56–58]. Transcriptome profiling is an important tool for studying complex disorders because it identifies end-stage alterations on which multiple genetic, epigenetic, and environmental factors converge. The contributing factors themselves might be hard to identify without studying their end products, the transcriptome alterations in the affected brain tissues. Historically, transcriptome profiling was performed using microarray technology, whose major limitation is that it measures only transcripts targeted by previously designed probes. A second major limitation of transcriptome studies is the quality of RNA samples, which is affected by human variation, comorbidities, tissue type, and long PMIs. Recent advances in RNA sequencing technologies have greatly contributed to understanding the extent and complexity of mammalian transcriptomes and to unraveling molecular mechanisms involved in the pathogenesis of complex diseases.
In this study, we used RNAseq to profile the expression of protein-coding and non-protein-coding RNA transcripts in the hippocampus of 4 LOAD patients compared to 4 healthy age-matched control individuals. We identified known and novel RNA transcripts, including protein-coding genes, lincRNAs, and NATs, to be dysregulated in the brain of LOAD patients. We validated the sequencing data using qRT-PCR in a larger set of high-quality, short-PMI RNA samples, and we confirmed dysregulation of candidate genes in other brain regions affected by the disease. Moreover, we demonstrated that expression of several lncRNAs dysregulated in LOAD is dependent on neuronal activity, further indicating their role in disease-related alteration of neuronal function. We found more than a hundred protein-coding genes to be differentially expressed in LOAD, and pathway analysis suggested alterations in neural communication and cerebral vasculature. According to the “two-hit vascular hypothesis” of AD, a vascular-related dysfunction represents the first hit in the etiology of the disease. This initial hit is the cause of neurovascular defects such as blood-brain barrier (BBB) dysfunction and oligaemia, which have been repeatedly observed in LOAD patients and which may be directly responsible for both neuronal dysfunction and alteration in Aβ production and clearance. In healthy physiological conditions, the BBB protects the brain by regulating the entrance of blood-borne molecules through the activity of specific transporters or receptors [61–63]. Thus, BBB breakdown and the uncontrolled trafficking of molecules, peptides, and cells between the blood and the brain may participate in the activation of neuroinflammation and directly influence Aβ homeostasis in the brain. The imbalance in Aβ homeostasis caused by the first hit culminates in the accumulation of Aβ in the brain and represents the second hit necessary for the progression of the pathology.
When we mined our dataset for genes potentially implicated in regulating both vascular function and Aβ homeostasis, our attention was caught by SERPINE1, which was among the most overexpressed protein-coding genes. SERPINE1 codes for PAI-1, which is the main inhibitor of plasminogen activators and therefore of the plasmin cascade and fibrinolysis. In the brain, PAI-1 is mainly produced by endothelial cells, but it is also present in astrocytes and pericytes. Increased expression of SERPINE1 and impairment of the fibrinolytic cascade are associated with thrombotic conditions, stroke, artery disease, atherosclerosis, and diabetes. Interestingly, all these conditions have been reported by epidemiological studies to be risk factors for AD. Moreover, increased expression of SERPINE1 has been observed in the brain of AβPP/PS1 transgenic mice and in the cortex of AD patients, and knockout of SERPINE1 in AβPP/PS1 mice reduces the amounts of both soluble and insoluble Aβ and plaques in the brain [68–70]. In our study, using qRT-PCR we demonstrated robust overexpression of SERPINE1 in AD brain regions that are characterized by Aβ accumulation during the pathogenesis of the disease, but not in non-affected regions such as the cerebellum. Our findings open the possibility that increased expression of SERPINE1 and the resulting suppression of fibrinolysis may play a dual role in LOAD: first, it may lead to neurovascular dysfunction, and second, it may impair amyloid clearance, thus resulting in increased Aβ levels. SERPINE1 overexpression in LOAD cases, as reported here, and the crucial involvement of this protein in both the pathophysiology of cerebral vasculature and the regulation of amyloid homeostasis strongly support the “two-hit vascular hypothesis” for AD.
Data generated in the past 10 years, mostly through the efforts of two international research consortia, ENCODE and FANTOM, demonstrated that the majority of the human genome is transcribed, mostly into non-protein-coding RNAs. A clear functional classification of these RNA transcripts is still not available because of the paucity of functionally validated examples; however, ncRNAs can be classified by their length into short ncRNAs (<30 nt) and long ncRNAs (>200 nt). In the brain and nervous system, lncRNAs are particularly abundant, have cell-type and activity-dependent specificity of expression, and play fundamental roles in a variety of biological processes and diseases, including AD. For instance, our previous work demonstrated the pivotal role of the antisense lncRNA BACE1-AS in the pathophysiology of LOAD [14, 72, 73]. Even though catalogs of lncRNAs have been described in many human cell types and in multiple types of cancer, demonstrating the existence of cell-specific and disease-specific lncRNAs, a signature of lncRNAs for LOAD is still missing. We identified several annotated and non-annotated lncRNAs differentially expressed in LOAD hippocampi compared to controls, thus providing for the first time a list of unexplored potential targets that could be involved in the pathogenesis of the disease. We went further and investigated the expression of four lncRNA candidates using qRT-PCR in a larger cohort of brain samples. Interestingly, we noted clear changes in the expression of these lncRNAs not only in the hippocampus but also in the entorhinal cortex and superior frontal gyrus of LOAD patients. As a control, we analyzed the expression of these ncRNAs in the cerebellum, a brain region only partially affected by the disease, and could not detect any statistically significant changes in the expression of these four lncRNAs between patient and control samples.
Interestingly, all four ncRNAs are enriched in the nuclear or chromatin compartment, and the expression of three of them is triggered in response to neuronal activity, suggesting involvement of these ncRNAs in activity-dependent, nuclear-associated cellular processes. In cultured human neural cells, the expression of one of the discovered lincRNAs, AD-linc1, is triggered by Aβ1-42 exposure. These data suggest that the observed overexpression of AD-linc1 in LOAD is due to Aβ1-42 accumulation and open the possibility that AD-linc1 is involved in amyloid-induced neurotoxicity. Alterations in the activity of specific neuronal networks during AD are thought to be important contributors to the development and progression of the pathology, in part through regional vulnerability to Aβ deposition. The differential expression of these lncRNAs in LOAD cases, their tissue-specific pattern of expression, and their nuclear localization suggest involvement of the noncoding part of the genome in complex disorders such as LOAD and warrant holistic approaches that consider these transcripts together with protein-coding genes in studies aiming to understand the pathological processes leading to LOAD.
Our data demonstrate the existence of region-specific transcriptomic changes in protein-coding and non-protein-coding genes in the brain of LOAD patients and represent an important resource of LOAD-specific transcriptomic changes that can help to better understand the alterations in cellular pathways that contribute to the pathophysiology of AD.
Brain samples information. Clinicopathological information of the LOAD and control brain samples utilized for RNAseq and qRT-PCR in the study.
Gene expression profile of LOAD and control hippocampi. RNAseq data showing the expression of protein-coding genes (PCGs), long intergenic ncRNAs (lincRNAs) and antisense RNAs in the hippocampi of 4 LOAD patients and 4 controls.
Genes differentially expressed in LOAD hippocampus. Differentially expressed protein-coding genes (PCGs), long intergenic ncRNAs (lincRNAs) and antisense RNAs in the hippocampi of 4 LOAD patients and 4 controls.
RNAseq and qRT-PCR correlation analysis. Pearson correlation analysis of log2 fold change differences from RNAseq and qRT-PCR data for 6 different protein coding genes, 6 lincRNAs and 6 AS RNA transcripts.
Gene expression changes showing in vitro differentiation of human NSCs into a mixed population of neural cells. qRT-PCR analysis of the expression of the neural stem cell marker Nestin and different neuronal markers (NFH, DCX, βIII-Tub) in RNA isolated from human NSCs differentiated for 21 days. The Y axis shows the expression of the analyzed gene relative to the housekeeping gene β-ACTIN. Gene expression is normalized to hNSCs.
Expression of genes near LOAD-dysregulated lncRNAs upon KCl-induced depolarization of human neural cells. qRT-PCR analysis showing expression changes 1 h after KCl stimulation of genes located near LOAD-dysregulated lncRNAs. EBF3, MGMT and Linc00959 are located near EBF3-AS. OGFRL1 is located near AD-linc2. The Y axis shows the expression of the analyzed gene relative to the housekeeping gene β-ACTIN. Gene expression is normalized to non-treated cells. Error bars are S.D.
qPCR primer sequences.
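The RNAseq and qRT-PCR correlation analysis listed among the supplementary files above compares log2 fold changes from the two platforms with a Pearson correlation. A minimal sketch of that calculation follows, using only the Python standard library. The helper names are hypothetical and the input values in the usage example are placeholders, not results from this study.

```python
import math

# Sketch of a Pearson correlation between log2 fold changes from two platforms.
# Helper names and example inputs are hypothetical, not data from the study.


def log2_fold_change(case_mean: float, control_mean: float) -> float:
    """log2 of the ratio of mean expression in cases to mean expression in controls."""
    return math.log2(case_mean / control_mean)


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(ss_x * ss_y)


if __name__ == "__main__":
    # Placeholder log2 fold changes for the same transcripts on each platform.
    rnaseq_log2fc = [1.5, -2.0, 0.8, 2.4, -1.1, 0.3]
    qpcr_log2fc = [1.2, -1.7, 1.0, 2.0, -0.9, 0.1]
    print(round(pearson_r(rnaseq_log2fc, qpcr_log2fc), 3))
```

A coefficient close to 1 would indicate that the direction and magnitude of the changes measured by qRT-PCR track those measured by RNAseq.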
This work was supported by the US NIH NINDS R01NS081208-01A1 awarded to Mohammad Ali Faghihi. Dr. Magistri was supported by the fellowship for Prospective Researcher from the Swiss National Science Foundation, which covered the first year of his postdoctoral training.
Authors’ disclosures available online (http://j-alz.com/manuscript-disclosures/15-0398r2).
The supplementary material is available in the electronic version of this article: http://dx.doi.org/10.3233/JAD-150398.