Thousands of long noncoding RNAs (lncRNAs) have been reported in mammalian genomes. These RNAs represent an important subset of pervasive genes involved in a broad range of biological functions. Aberrant expression of lncRNAs is associated with many types of cancers. Here, in order to explore the potential lncRNAs involved in hepatocellular carcinoma (HCC) oncogenesis, we performed lncRNA gene expression profile analysis in 3 pairs of human HCC and adjacent non-tumor (NT) tissues by microarray.
Differentially expressed lncRNAs and mRNAs were detected by human lncRNA microarray containing 33,045 lncRNAs and 30,215 coding transcripts. Bioinformatic analyses (gene ontology, pathway and network analysis) were applied for further study of these differentially expressed mRNAs. By qRT-PCR analysis in nineteen pairs of HCC and adjacent normal tissues, we found that eight lncRNAs were aberrantly expressed in HCC compared with adjacent NT tissues, which is consistent with microarray data.
We identified 214 lncRNAs and 338 mRNAs abnormally expressed in all three HCC tissues (fold change ≥2.0, P<0.05 and FDR<0.05) by genome-wide lncRNA and mRNA expression profile analysis. An lncRNA-mRNA co-expression network was constructed, which may be used for predicting target genes of lncRNAs. Furthermore, we demonstrated for the first time that BC017743, ENST00000395084, NR_026591, NR_015378 and NR_024284 were up-regulated, whereas NR_027151, AK056988 and uc003yqb.1 were down-regulated in nineteen pairs of HCC samples compared with adjacent NT samples. Expression of seven lncRNAs was significantly correlated with that of their nearby coding genes. In conclusion, our results indicate that the lncRNA expression profile in HCC was significantly changed, and we identified a series of new hepatocarcinoma-associated lncRNAs. These results provide important insights into the role of lncRNAs in HCC pathogenesis.
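The selection criteria quoted above (fold change ≥2.0, P<0.05 and FDR<0.05) can be illustrated with a minimal sketch. It assumes that the FDR is a Benjamini-Hochberg adjusted p-value and that fold change is treated symmetrically for up- and down-regulation; the abstract states neither, and the function names, transcript IDs and input values are hypothetical.

```python
from typing import Dict, List, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_differential(
    records: Dict[str, Tuple[float, float]],
    min_fold: float = 2.0,
    max_p: float = 0.05,
    max_fdr: float = 0.05,
) -> List[str]:
    """records maps transcript ID -> (tumor/non-tumor ratio > 0, raw p-value)."""
    ids = list(records)
    fdr = benjamini_hochberg([records[t][1] for t in ids])
    selected = []
    for t, q in zip(ids, fdr):
        ratio, p = records[t]
        # A ratio of 0.5 is a 2-fold decrease, so fold change is symmetric.
        fold = ratio if ratio >= 1.0 else 1.0 / ratio
        if fold >= min_fold and p < max_p and q < max_fdr:
            selected.append(t)
    return selected


if __name__ == "__main__":
    # Hypothetical transcripts and values, for illustration only.
    example = {
        "lnc_a": (4.0, 0.001),
        "lnc_b": (0.25, 0.002),
        "lnc_c": (1.2, 0.001),
        "lnc_d": (3.0, 0.2),
    }
    print(select_differential(example))
```

Applying all three thresholds together means a transcript with a large fold change but weak statistical support, or strong support but a small change, is excluded.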
The continuously increasing human lifespan is accompanied by a rising incidence of neurodegenerative diseases, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means of identifying novel biomarkers and is rapidly becoming commonplace. However, there is a lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domain and microRNA binding annotations. We applied the novel workflow to RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post-Deep Brain Stimulation (DBS) treatment and compared them to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually exclusive exons. In the PD-regulated FUS and HNRNP A/B, the regulated regions included prion-like domains. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq data from the brains of PD patients and unaffected controls revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in leukocytes, including U1, which showed increases in both leukocytes and brain.
Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia-nigra, compared to controls. This novel workflow allows deep multi-level inspection of RNA-Seq datasets and provides a comprehensive new resource for understanding disease transcriptome modifications in PD and other neurodegenerative diseases.
Long non-coding RNAs (lncRNAs) comprise a novel, fascinating class of RNAs with largely unknown biological functions. Parkinson's disease (PD) is the most frequent motor disorder, and deep brain stimulation (DBS) treatment alleviates the symptoms, but early disease biomarkers are still unknown and new targets for genetic interference are urgently needed. Using RNA-sequencing technology and a novel computational workflow for in-depth exploration of whole-transcriptome RNA-Seq datasets, we detected and analyzed lncRNAs in sequenced libraries from PD patients' leukocytes pre- and post-treatment and from the brain, adding this full profile resource of over 7,000 lncRNAs to the few human tissue-derived lncRNA datasets that are currently available. Our study includes sample-specific database construction, detection of disease-derived changes in known and novel lncRNAs, exons and junctions, and prediction of corresponding changes in polyadenylation choices, protein domains and miRNA binding sites. We report widespread transcript structure variations at the splice junction and exon levels, including novel exons and junctions and alteration of lncRNAs, followed by experimental validation in PD leukocytes and two PD brain regions compared with controls. Our results suggest lncRNA involvement in neurodegenerative diseases, and specifically in PD. This comprehensive workflow will be of use to the increasing number of laboratories producing RNA-Seq data in a wide range of biomedical studies.
Pancreatic ductal adenocarcinoma (PDAC) is known for its aggressiveness and lack of effective therapeutic options. Thus, improved knowledge of the molecular changes associated with pancreatic cancer is urgently needed to explore novel avenues for the diagnosis and treatment of this dismal disease. While there is mounting evidence that long noncoding RNAs (lncRNAs) transcribed from intronic and intergenic regions of the human genome may play different roles in the regulation of gene expression in normal and cancer cells, their expression pattern and biological relevance in pancreatic cancer are currently unknown. In the present work we investigated the relative abundance of a collection of lncRNAs in patients' pancreatic tissue samples, aiming to identify gene expression profiles correlated with pancreatic cancer and metastasis.
Custom 3,355-element spotted cDNA microarrays interrogating protein-coding genes and putative lncRNAs were used to obtain expression profiles from 38 clinical samples of tumor and non-tumor pancreatic tissues. Bioinformatics analyses were performed to characterize the structure and conservation of lncRNAs expressed in pancreatic tissues, as well as to identify expression signatures correlated with tissue histology. Strand-specific reverse transcription followed by PCR and qRT-PCR were employed to determine the strandedness of lncRNAs and to validate microarray results, respectively.
We show that subsets of intronic/intergenic lncRNAs are expressed across tumor and non-tumor pancreatic tissue samples. Enrichment of promoter-associated chromatin marks and over-representation of conserved DNA elements and stable secondary structure predictions suggest that these transcripts are generated from independent transcriptional units and that at least a fraction is under evolutionary selection, and thus potentially functional.
Statistically significant expression signatures comprising protein-coding mRNAs and lncRNAs that correlate with PDAC or with pancreatic cancer metastasis were identified. Interestingly, loci harboring intronic lncRNAs differentially expressed in PDAC metastases were enriched in genes associated with the MAPK pathway. Orientation-specific RT-PCR documented that intronic transcripts are expressed in sense, antisense or both orientations relative to protein-coding mRNAs. Differential expression of a subset of intronic lncRNAs (PPP3CB, MAP3K14 and DAPK1 loci) in metastatic samples was confirmed by Real-Time PCR.
Our findings reveal sets of intronic lncRNAs expressed in pancreatic tissues whose abundance is correlated to PDAC or metastasis, thus pointing to the potential relevance of this class of transcripts in biological processes related to malignant transformation and metastasis in pancreatic cancer.
pancreatic cancer; molecular markers; noncoding RNAs; intronic transcription; metastasis; MAPK pathway; cDNA microarrays
Although analysis pipelines have been developed to use RNA-seq to identify long non-coding RNAs (lncRNAs), inference of their biological and pathological relevance remains a challenge. As a result, most transcriptome studies of autoimmune disease have only assessed protein-coding transcripts.
We used RNA-seq data from 99 lesional psoriatic, 27 uninvolved psoriatic, and 90 normal skin biopsies, and applied computational approaches to identify and characterize expressed lncRNAs. We detect 2,942 previously annotated and 1,080 novel lncRNAs which are expected to be skin specific. Notably, over 40% of the novel lncRNAs are differentially expressed and the proportions of differentially expressed transcripts among protein-coding mRNAs and previously-annotated lncRNAs are lower in psoriasis lesions versus uninvolved or normal skin. We find that many lncRNAs, in particular those that are differentially expressed, are co-expressed with genes involved in immune related functions, and that novel lncRNAs are enriched for localization in the epidermal differentiation complex. We also identify distinct tissue-specific expression patterns and epigenetic profiles for novel lncRNAs, some of which are shown to be regulated by cytokine treatment in cultured human keratinocytes.
Together, our results implicate many lncRNAs in the immunopathogenesis of psoriasis, and our results provide a resource for lncRNA studies in other autoimmune diseases.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0570-4) contains supplementary material, which is available to authorized users.
Long noncoding RNAs (lncRNAs) have crucial roles in cancer biology. We performed a genome-wide analysis of lncRNA expression in hepatoblastoma tissues to identify novel targets for further study of hepatoblastoma. Hepatoblastoma and normal liver tissue samples were obtained from hepatoblastoma patients. The genome-wide analysis of lncRNA expression in these tissues was performed using a 4×180 K lncRNA microarray and SurePrint G3 Human lncRNA Chips. Quantitative RT-PCR (qRT-PCR) was performed to confirm these results. Differential expression of lncRNAs and mRNAs was identified through fold-change filtering. Gene Ontology (GO) and pathway analyses were performed using the standard enrichment computation method. Associations between lncRNAs and adjacent protein-coding genes were determined through complex transcriptional loci analysis. We found that 2736 lncRNAs were differentially expressed in hepatoblastoma tissues. Among these, 1757 lncRNAs were upregulated more than two-fold relative to normal tissues and 979 lncRNAs were downregulated. Moreover, in hepatoblastoma there were 420 matched lncRNA-mRNA pairs for 120 differentially expressed lncRNAs and 167 differentially expressed mRNAs. The co-expression network analysis predicted 252 network nodes and 420 connections between 120 lncRNAs and 132 coding genes. Within this co-expression network, 369 pairs were positive, and 51 pairs were negative. Lastly, qRT-PCR data verified six upregulated and downregulated lncRNAs in hepatoblastoma, plus endothelial cell-specific molecule 1 (ESM1) mRNA. Our results demonstrated that expression of these aberrant lncRNAs could respond to hepatoblastoma development. Further study of these lncRNAs could provide useful insight into hepatoblastoma biology.
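The network summary above is internally consistent: 120 lncRNAs plus 132 coding genes give 252 nodes, and 369 positive plus 51 negative pairs give 420 connections. The sketch below shows one common way such a signed co-expression network could be built and counted. It assumes Pearson correlation with an absolute cut-off of 0.9; the abstract does not state the measure or threshold used, and all names and expression values are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / sqrt(vx * vy)


def coexpression_edges(
    lnc: Dict[str, List[float]],
    mrna: Dict[str, List[float]],
    min_abs_r: float = 0.9,  # assumed cut-off, not taken from the abstract
) -> List[Edge]:
    """Keep every lncRNA-mRNA pair whose |r| reaches the cut-off."""
    edges = []
    for l_id, l_values in lnc.items():
        for m_id, m_values in mrna.items():
            r = pearson(l_values, m_values)
            if abs(r) >= min_abs_r:
                edges.append((l_id, m_id, r))
    return edges


def summarize(edges: List[Edge]) -> Dict[str, int]:
    """Count nodes and positive/negative connections in the network."""
    nodes = {l for l, _, _ in edges} | {m for _, m, _ in edges}
    return {
        "nodes": len(nodes),
        "edges": len(edges),
        "positive": sum(1 for _, _, r in edges if r > 0),
        "negative": sum(1 for _, _, r in edges if r < 0),
    }


if __name__ == "__main__":
    # Hypothetical profiles across four samples, for illustration only.
    lnc = {"lnc1": [1, 2, 3, 4], "lnc2": [4, 3, 2, 1]}
    mrna = {"geneA": [2, 4, 6, 8], "geneB": [1, 3, 2, 4]}
    print(summarize(coexpression_edges(lnc, mrna)))
```

Only transcripts that take part in at least one retained pair count as nodes, which is why a network can contain fewer coding genes than the number of differentially expressed mRNAs.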
Long non-coding RNAs (lncRNAs) play an important role in carcinogenesis, but knowledge of lncRNA expression in renal cell carcinoma is rudimentary. As a basis for biomarker development, we aimed to explore the lncRNA expression profile in clear cell renal cell carcinoma (ccRCC) tissue.
Microarray experiments were performed to determine the expression of 32,183 lncRNA transcripts belonging to 17,512 lncRNAs in 15 corresponding normal and malignant renal tissues. Validation was performed using quantitative real-time PCR in 55 ccRCC and 52 normal renal specimens. Computational analysis was performed to determine lncRNA-microRNA (MiRTarget2) and lncRNA-protein (catRAPID omics) interactions. We identified 1,308 dysregulated transcripts (expression change >2-fold; upregulated: 568, downregulated: 740) in ccRCC tissue. Among these, aberrant expression of the following lncRNAs was validated using PCR: lnc-BMP2-2 (mean expression change: 37-fold), lnc-CPN2-1 (13-fold), lnc-FZD1-2 (9-fold), lnc-ITPR2-3 (15-fold), lnc-SLC30A4-1 (15-fold), and lnc-SPAM1-6 (10-fold) were highly overexpressed in ccRCC, whereas lnc-ACACA-1 (135-fold), lnc-FOXG1-2 (19-fold), lnc-LCP2-2 (2-fold), lnc-RP3-368B9 (19-fold), and lnc-TTC34-3 (314-fold) were downregulated. There was no correlation between lncRNA expression and clinicopathological parameters. Computational analyses revealed that these lncRNAs are involved in RNA-protein networks related to splicing, binding, transport, localization, and processing of RNA. Small interfering RNA (siRNA)-mediated knockdown of lnc-BMP2-2 and lnc-CPN2-1 did not influence cell proliferation.
We identified many novel lncRNA transcripts dysregulated in ccRCC, which may be useful as novel diagnostic biomarkers.
Electronic supplementary material
The online version of this article (doi:10.1186/s13148-015-0047-7) contains supplementary material, which is available to authorized users.
Long intergenic non-coding RNAs (lncRNAs) represent an emerging and under-studied class of transcripts that play a significant role in human cancers. Due to the tissue- and cancer-specific expression patterns observed for many lncRNAs it is believed that they could serve as ideal diagnostic biomarkers. However, until each tumor type is examined more closely, many of these lncRNAs will remain elusive.
Here we characterize the lncRNA landscape in lung cancer using publicly available transcriptome sequencing data from a cohort of 567 adenocarcinoma and squamous cell carcinoma tumors. Through this compendium we identify over 3,000 unannotated intergenic transcripts representing novel lncRNAs. Through comparison of both adenocarcinomas and squamous cell carcinomas with matched controls we discover 111 differentially expressed lncRNAs, which we term lung cancer-associated lncRNAs (LCALs). A pan-cancer analysis of 324 additional tumor and adjacent normal pairs enables us to identify a subset of lncRNAs that display enriched expression specific to lung cancer as well as a subset that appear to be broadly deregulated across human cancers. Integration of exome sequencing data reveals that expression levels of many LCALs have significant associations with the mutational status of key oncogenes in lung cancer. Functional validation, using both knockdown and overexpression, shows that the most differentially expressed lncRNA, LCAL1, plays a role in cellular proliferation.
Our systematic characterization of publicly available transcriptome data provides the foundation for future efforts to understand the role of LCALs, develop novel biomarkers, and improve knowledge of lung tumor biology.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0429-8) contains supplementary material, which is available to authorized users.
Mammalian genomes are extensively transcribed, producing thousands of long non-protein-coding RNAs (lncRNAs). The biological significance and function of the vast majority of lncRNAs remain unclear. Recent studies have implicated several lncRNAs as playing important roles in embryonic development and cancer progression. LncRNAs are characterized by different genomic architectures in relation to their associated protein-coding genes. Our study aimed to link lncRNA architecture with the dynamic patterns of lncRNA expression, using a model of differentiating human neuroblastoma cells.
LncRNA expression was studied in a 120-hour time course of differentiation of human neuroblastoma SH-SY5Y cells into neurons upon treatment with retinoic acid (RA), a compound used for the treatment of neuroblastoma. A custom microarray chip was used to interrogate expression levels of 9,267 lncRNAs in the course of differentiation. We categorized lncRNAs into 19 architecture classes according to their position relative to protein-coding genes. For each architecture class, the expression dynamics of lncRNAs were studied in association with their protein-coding partners. This allowed us to demonstrate positive correlation of lncRNAs with their associated protein-coding genes at bidirectional promoters and for sense-antisense transcript pairs. In contrast, lncRNAs located in introns and downstream of protein-coding genes were characterized by negative correlation modes. We further classified the lncRNAs by the temporal patterns of their expression dynamics. We found that intronic and bidirectional promoter architectures are associated with rapid RA-dependent induction or repression of the corresponding lncRNAs, followed by their constant expression. At the same time, lncRNAs expressed downstream of protein-coding genes are characterized by rapid induction, followed by transcriptional repression. Quantitative RT-PCR analysis confirmed the discovered functional modes for several selected lncRNAs associated with proteins involved in cancer and embryonic development.
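To make the idea of classifying lncRNAs by their position relative to protein-coding genes concrete, here is a deliberately simplified sketch. It does not reproduce the study's 19 architecture classes, which the abstract does not define. The five class labels, the 2 kb window and the `Transcript` type are all hypothetical, and without exon coordinates a same-strand overlap cannot be narrowed down to "intronic".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    chrom: str
    start: int   # 1-based inclusive; start <= end regardless of strand
    end: int
    strand: str  # "+" or "-"


def classify(lnc: Transcript, gene: Transcript, window: int = 2000) -> str:
    """Assign a coarse architecture label to an lncRNA/gene pair."""
    if lnc.chrom != gene.chrom:
        return "intergenic"
    same_strand = lnc.strand == gene.strand
    if lnc.start <= gene.end and gene.start <= lnc.end:
        # Overlapping loci: the label depends only on relative strand here.
        return "sense_overlap" if same_strand else "antisense_overlap"
    # "Upstream" is measured along the gene's direction of transcription.
    if gene.strand == "+":
        upstream = lnc.end < gene.start
        gap = gene.start - lnc.end if upstream else lnc.start - gene.end
    else:
        upstream = lnc.start > gene.end
        gap = lnc.start - gene.end if upstream else gene.start - lnc.end
    if upstream and not same_strand and gap <= window:
        # Divergent (head-to-head) pair with closely spaced start sites.
        return "bidirectional_promoter"
    if not upstream and same_strand and gap <= window:
        return "downstream"
    return "intergenic"


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    gene = Transcript("chr1", 1000, 2000, "+")
    for lnc in (
        Transcript("chr1", 500, 900, "-"),
        Transcript("chr1", 1500, 1800, "-"),
        Transcript("chr1", 2100, 2600, "+"),
    ):
        print(lnc, classify(lnc, gene))
```

Once each lncRNA carries a label of this kind, the correlation between its time course and that of its protein-coding partner can be compared class by class, as the abstract describes.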
This is the first report detailing dynamical changes of multiple lncRNAs during RA-induced neuroblastoma differentiation. Integration of genomic and transcriptomic levels of information allowed us to demonstrate specific behavior of lncRNAs organized in different genomic architectures. This study also provides a list of lncRNAs with possible roles in neuroblastoma.
Long non-coding RNAs (lncRNAs), a key group of non-coding RNAs, have gained wide attention. Although lncRNAs have been functionally annotated and systematically explored in higher mammals, few have been systematically identified and annotated. Owing to their expression specificity, the set of known lncRNAs expressed in embryonic brain tissues remains limited. Considering that a large number of lncRNAs are transcribed only in brain tissues, studies of lncRNAs in the developing brain are of special interest. Here, publicly available RNA-sequencing (RNA-seq) data from embryonic brain are integrated to identify thousands of embryonic brain lncRNAs using a customized pipeline. A significant proportion of the novel transcripts have not been annotated in available genomic resources. The putative embryonic brain lncRNAs are shorter, less spliced and less conserved than known genes. The expression of putative lncRNAs is on average one tenth that of known coding genes, and comparable to that of known lncRNAs. Chromatin data show that putative embryonic brain lncRNAs are associated with active chromatin marks, comparable with known lncRNAs. LncRNAs expressed in embryonic brain also show signs of expression in adult brain, although less evidently. Gene Ontology analysis of putative embryonic brain lncRNAs suggests that they are associated with brain development. The putative lncRNAs appear to have possible cis-regulatory roles in imprinting, and some are even deemed to be imprinted lncRNAs themselves. Re-analysis of one knockdown dataset suggests that four regulators are associated with lncRNAs. Taken together, the identification and systematic analysis of putative lncRNAs provide novel insights into uncharacterized mouse non-coding regions and their relationships with mammalian embryonic brain development.
Ventricular septal defects (VSD) are the most common form of congenital heart disease, which is the leading non-infectious cause of death in children; nevertheless, the exact cause of VSD is not yet fully understood. Long non-coding RNAs (lncRNAs) have been shown to play key roles in various biological processes, such as imprinting control, circuitry controlling pluripotency and differentiation, immune responses and chromosome dynamics. Notably, a growing number of lncRNAs have been implicated in disease etiology, although an association with VSD has not been reported. In the present study, we conducted an integrated analysis of dysregulated lncRNAs, focusing specifically on the identification and characterization of lncRNAs potentially involved in the initiation of VSD. Comparison of the transcriptome profiles of cardiac tissues from VSD-affected and normal hearts was performed using a second-generation lncRNA microarray, which covers the vast majority of expressed RefSeq transcripts (29,241 lncRNAs and 30,215 coding transcripts). In total, 880 lncRNAs were upregulated and 628 were downregulated in VSD. Furthermore, our established filtering pipeline indicated an association of two lncRNAs, ENST00000513542 and RP11-473L15.2, with VSD. This dysregulation of the lncRNA profile provides novel insight into the etiology of VSD and, furthermore, illustrates the intricate relationship between coding and ncRNA transcripts in cardiac development. These data may offer a background/reference resource for future functional studies of lncRNAs related to VSD.
Advances in vertebrate genomics have uncovered thousands of loci encoding long noncoding RNAs (lncRNAs). While progress has been made in elucidating the regulatory functions of lncRNAs, little is known about their origins and evolution. Here we explore the contribution of transposable elements (TEs) to the makeup and regulation of lncRNAs in human, mouse, and zebrafish. Surprisingly, TEs occur in more than two thirds of mature lncRNA transcripts and account for a substantial portion of total lncRNA sequence (∼30% in human), whereas they seldom occur in protein-coding transcripts. While TEs contribute less to lncRNA exons than expected, several TE families are strongly enriched in lncRNAs. There is also substantial interspecific variation in the coverage and types of TEs embedded in lncRNAs, partially reflecting differences in the TE landscapes of the genomes surveyed. In human, TE sequences in lncRNAs evolve under greater evolutionary constraint than their non–TE sequences, than their intronic TEs, or than random DNA. Consistent with functional constraint, we found that TEs contribute signals essential for the biogenesis of many lncRNAs, including ∼30,000 unique sites for transcription initiation, splicing, or polyadenylation in human. In addition, we identified ∼35,000 TEs marked as open chromatin located within 10 kb upstream of lncRNA genes. The density of these marks in one cell type correlates with elevated expression of the downstream lncRNA in the same cell type, suggesting that these TEs contribute to cis-regulation. These global trends are recapitulated in several lncRNAs with established functions. Finally, a subset of TEs embedded in lncRNAs are subject to RNA editing and predicted to form secondary structures likely important for function. In conclusion, TEs are nearly ubiquitous in lncRNAs and have played an important role in the lineage-specific diversification of vertebrate lncRNA repertoires.
An unexpected layer of complexity in the genomes of humans and other vertebrates lies in the abundance of genes that do not appear to encode proteins but produce a variety of non-coding RNAs. In particular, the human genome is currently predicted to contain 5,000–10,000 independent gene units generating long (>200 nucleotides) noncoding RNAs (lncRNAs). While there is growing evidence that a large fraction of these lncRNAs have cellular functions, notably to regulate protein-coding gene expression, almost nothing is known about the processes underlying the evolutionary origins and diversification of lncRNA genes. Here we show that transposable elements, through their capacity to move and spread in genomes in a lineage-specific fashion, as well as their ability to introduce regulatory sequences upon chromosomal insertion, represent a major force shaping the lncRNA repertoire of humans, mice, and zebrafish. Not only do TEs make up a substantial fraction of mature lncRNA transcripts, they are also enriched in the vicinity of lncRNA genes, where they frequently contribute to their transcriptional regulation. Through specific examples we provide evidence that some TE sequences embedded in lncRNAs are critical for the biogenesis of lncRNAs and likely important for their function.
Recent genome-wide expression profiling studies have uncovered a large number of novel, long non-protein-coding RNA transcripts (lncRNAs). In general, these transcripts show low but tissue-specific expression, and their nucleotide sequences are often poorly conserved. However, several studies have shown that lncRNAs can have important roles in normal tissue development and regulate cellular pluripotency as well as differentiation. Moreover, lncRNAs are implicated in the control of multiple molecular pathways leading to gene expression changes and thus ultimately modulate cell proliferation, migration and apoptosis. Consequently, deregulation of lncRNA expression contributes to carcinogenesis and is associated with human diseases, e.g., neurodegenerative disorders like Alzheimer’s Disease. Here, we will focus on some major challenges of lncRNA research, especially loss-of-function studies. We will delineate strategies for lncRNA gene targeting in vivo, and we will briefly discuss important considerations and pitfalls when investigating lncRNA functions in knockout animal models. Finally, we will highlight future opportunities for lncRNA research by applying the concept of cross-species comparison, which might contribute to novel disease biomarker discovery and might identify lncRNAs as potential therapeutic targets.
functional genomics; genetically engineered mouse models (GEMM); long intergenic RNA (lincRNA); metastasis; metastasis-associated lung adenocarcinoma transcript 1 (MALAT1); HOX transcript antisense RNA (HOTAIR)
Long non-coding RNAs (lncRNAs) are novel transcripts that may play important roles in cancer. Our study aimed to resolve the lncRNA profile of larynx squamous cell carcinoma (LSCC) and to determine its clinical significance. The global lncRNA expression profile in LSCC tissues was measured by lncRNA microarray. Distinctly expressed lncRNAs were identified and levels of AC026166.2-001 and RP11-169D4.1-001 lncRNAs in 87 LSCC samples and paired adjacent normal tissue were analyzed by real-time quantitative reverse transcriptase-polymerase chain reaction (qRT-PCR). The clinical significance of these lncRNAs in laryngeal cancer was analyzed and survival data were estimated by the Kaplan–Meier method and the log-rank test. A receiver operating characteristic (ROC) curve was constructed to check the diagnostic value. In the lncRNA expression profile of tumor samples, 684 lncRNAs were upregulated and 747 lncRNAs were downregulated (fold-change >2.0). Of these, AC026166.2-001 and RP11-169D4.1-001 were distinctly dysregulated, with AC026166.2-001 exhibiting lower expression in cancer tissues and RP11-169D4.1-001 higher expression. We verified that both AC026166.2-001 and RP11-169D4.1-001 were expressed at a lower level in cervical lymph nodes compared with paired laryngeal cancer tissues and paired normal tissues. RP11-169D4.1-001 levels were positively correlated with lymph node metastasis (P = 0.007). From the survival analysis, decreased levels of AC026166.2-001 and RP11-169D4.1-001 were associated with poorer prognosis. The area under the ROC curve was up to 0.65 and 0.67, respectively, and the cut-off point of ΔCt was 11.23 and 10.53, respectively. AC026166.2-001 and RP11-169D4.1-001 may act as novel biomarkers in LSCC and may be potential therapeutic targets for LSCC patients. Both AC026166.2-001 and RP11-169D4.1-001 could be independent prognostic factors for survival in LSCC.
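The diagnostic figures above (area under the ROC curve and a ΔCt cut-off) can be illustrated with a short sketch. The AUC is computed here as the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample, with ties counted as one half, which equals the area under the empirical ROC curve. The ΔCt values are invented, and treating ΔCt at or above the cut-off as a positive call is an assumption that suits a marker with lower expression in tumors; the abstract does not give either detail.

```python
from typing import List, Tuple


def roc_auc(positives: List[float], negatives: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a correct ranking
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(
    positives: List[float], negatives: List[float], cutoff: float
) -> Tuple[float, float]:
    """Call a sample positive when its score is at or above the cut-off."""
    true_pos = sum(1 for p in positives if p >= cutoff)
    true_neg = sum(1 for n in negatives if n < cutoff)
    return true_pos / len(positives), true_neg / len(negatives)


if __name__ == "__main__":
    # Hypothetical delta-Ct values; higher delta-Ct means lower expression.
    tumor_dct = [12.0, 11.5, 10.8, 11.9]
    normal_dct = [10.2, 11.0, 11.6, 9.9]
    print(roc_auc(tumor_dct, normal_dct))
    print(sensitivity_specificity(tumor_dct, normal_dct, cutoff=11.23))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, so reported values of 0.65 and 0.67 indicate modest diagnostic power for a single marker.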
Background: Recent research indicates that long non-coding RNAs (lncRNAs) represent a new family of RNAs that is of fundamental importance for controlling transcription and translation. There is also increasing evidence that lncRNAs are important in tumourigenesis. Valid expression profiling using quantitative PCR requires suitable, stably expressed normalisers to achieve reliable and reproducible data. However, no systematic analysis of suitable references for lncRNA studies in human glioma has been performed yet.
Methods: In this study, we investigated the expression stability of 90 lncRNAs in 30 tissue specimens of human diffuse astrocytoma (WHO-Grade II), anaplastic astrocytoma (WHO-Grade III) and glioblastoma (WHO-Grade IV), both alone and in comparison with normal white matter. Our identification procedure included a rigorous bioinformatic selection process that resulted in the inclusion of only highly abundant, equally expressed lncRNAs for further analysis. Additionally, lncRNAs were classified according to their stability value using the NormFinder algorithm.
Results: We identified 24 appropriate normalisers suitable for studies in diffuse astrocytoma, 22 for studies in anaplastic astrocytoma and 12 for studies in glioblastoma. Comparing all three glioma entities 7 lncRNAs showed stable expression levels. Addition of normal brain tissue resulted in only 4 suitable lncRNAs.
Conclusions: Our findings indicate that 4 lncRNAs (HOXA6as, H19 upstream conserved 1 and 2, Zfhx2as and BC200) are suitable as normalisers in glioma and normal brain. These lncRNAs may thus be regarded as universal references applicable for the accurate normalisation of lncRNA expression profiling in various gliomas (WHO-Grades II-IV), alone and in combination with brain tissue. This enables valid longitudinal studies, e.g. of glioma before and after malignant progression, to identify changes in lncRNA expression that probably drive malignant transformation.
long non-coding RNA; lncRNA; Glioma; References; qPCR; Profiling.
Accumulating evidence highlights the potential role of long non-coding RNAs (lncRNAs) as biomarkers and therapeutic targets in solid tumors. However, the role of lncRNA expression in human breast cancer biology, prognosis and molecular classification remains unknown. Herein, we established the lncRNA profile of 658 infiltrating ductal carcinomas of the breast from The Cancer Genome Atlas project. We found lncRNA expression to correlate with the gene expression and chromatin landscape of human mammary epithelial cells (non-transformed) and the breast cancer cell line MCF-7. Unsupervised consensus clustering of lncRNA revealed four subgroups that displayed different prognoses. Gene set enrichment analysis for cis- and trans-acting lncRNAs showed enrichment for breast cancer signatures driven by master regulators of breast carcinogenesis. Interestingly, the lncRNA HOTAIR was significantly overexpressed in the HER2-enriched subgroup, while the lncRNA HOTAIRM1 was significantly overexpressed in the basal-like subgroup. Estrogen receptor (ESR1) expression was associated with distinct lncRNA networks in lncRNA clusters III and IV. Importantly, almost two thirds of the lncRNAs were marked by enhancer chromatin modifications (i.e., H3K27ac), suggesting that expressed lncRNA in breast cancer drives carcinogenesis through increased activity of neighboring genes. In summary, our study depicts the first lncRNA subtype classification in breast cancer and provides the framework for future studies to assess the interplay between lncRNAs and the breast cancer epigenome.
breast cancer; enhancers; expression profiling; lncRNA; RNA-Seq
Dysregulation of long noncoding RNAs (lncRNAs) has been regarded as a primary feature of several human cancers. However, the genome-wide expression and functional significance of lncRNAs in bladder cancer remain unclear. The aim of this study was to identify aberrantly expressed lncRNAs that may play an important role in bladder cancer pathogenesis. In this study, we profiled lncRNA expression in four pairs of human bladder cancer and matched normal bladder tissues by microarray. We identified 3,324 differentially expressed human lncRNAs and 2,120 differentially expressed mRNAs (≥2-fold change). A total of 110 lncRNAs were significantly differentially expressed between the tumor and the control groups (≥8-fold change). Four lncRNAs (TNXA, CTA-134P22.2, CTC-276P9.1 and KRT19P3) were selected for further confirmation of microarray results using quantitative PCR (qPCR), and a strong correlation was identified between the qPCR results and microarray data. By constructing an lncRNA-mRNA co-expression network, we also observed that the expression levels of numerous lncRNAs were significantly correlated with the expression of tens of protein-coding genes. Kyoto Encyclopedia of Genes and Genomes annotation showed a significant association with p53, bladder cancer, cell cycle and propanoate metabolism pathway gene expression in the bladder cancer group compared with the normal tissue group, indicating that deregulated lncRNAs may act by regulating protein-coding genes in these pathways. We demonstrated the expression profiles of human lncRNAs in bladder cancer by microarray. We identified a collection of aberrantly expressed lncRNAs in bladder cancer compared with matched normal tissue. It is likely that these deregulated lncRNAs play a key or partial role in the development and/or progression of bladder cancer.
bladder cancer; long noncoding RNA; microarray
AIM: To investigate the expression patterns of long non-coding RNAs (lncRNAs) in gastric cancer.
METHODS: Two publicly available human exon array data sets for gastric cancer, together with data for the corresponding normal tissue, were downloaded from the Gene Expression Omnibus (GEO). We re-annotated the probes of the human exon arrays and retained the probes uniquely mapping to lncRNAs at the gene level. LncRNA expression profiles were generated using the robust multi-array average (RMA) method in Affymetrix Power Tools. The normalized data were then analyzed with the Bioconductor package limma (linear models for microarray data), and genes with adjusted P-values below 0.01 were considered differentially expressed. An independent data set was used to validate the results.
RESULTS: With the computational pipeline established to re-annotate over 6.5 million probes of the Affymetrix Human Exon 1.0 ST array, we identified 136053 probes uniquely mapping to lncRNAs at the gene level. These probes correspond to 9294 lncRNAs, covering nearly 76% of the GENCODE lncRNA data set. By analyzing GSE27342, which consists of 80 paired gastric cancer and normal adjacent tissue samples, we identified 88 lncRNAs that were differentially expressed in gastric cancer, some of which have been reported to play a role in cancer, such as LINC00152, taurine upregulated 1, urothelial cancer associated 1, Pvt1 oncogene, small nucleolar RNA host gene 1 and LINC00261. In the validation data set GSE33335, 59% of these differentially expressed lncRNAs showed significant expression changes (adjusted P-value < 0.01) in the same direction.
CONCLUSION: We identified a set of lncRNAs differentially expressed in gastric cancer, providing useful information for discovery of new biomarkers and therapeutic targets in gastric cancer.
Long non-coding RNA; Gastric cancer; Microarray analysis; Data mining
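The adjusted P-value cut-off described in the METHODS above can be illustrated with a minimal Python sketch. It assumes the Benjamini-Hochberg procedure as the adjustment method; the abstract does not state which adjustment was used, so this is an assumption. The function names and the gene labels in the usage example are hypothetical, and the sketch does not reproduce the limma model fitting itself.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values, in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_differential(p_by_gene, alpha=0.01):
    """Keep genes whose adjusted P-value is below alpha (0.01, as in the text)."""
    genes = list(p_by_gene)
    adjusted = benjamini_hochberg([p_by_gene[g] for g in genes])
    return {g: a for g, a in zip(genes, adjusted) if a < alpha}


if __name__ == "__main__":
    # Hypothetical raw P-values for three made-up gene labels.
    print(select_differential({"geneA": 0.0001, "geneB": 0.5, "geneC": 0.002}))
```

Applying the same selection to a validation data set and counting the genes that pass with the same sign of change would give a concordance figure of the kind reported in the RESULTS.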
Long non-coding RNAs (lncRNAs) are emerging as potent regulators of cell physiology, and recent studies highlight their role in tumor development. However, while established protein-coding oncogenes and tumor suppressors often display striking patterns of focal DNA copy-number alteration in tumors, similar evidence is largely lacking for lncRNAs. Here, we report on a genomic analysis of GENCODE lncRNAs in high-grade serous ovarian adenocarcinoma, based on The Cancer Genome Atlas (TCGA) molecular profiles. Using genomic copy-number data and deep coverage transcriptome sequencing, we derived dual copy-number and expression data for 10,419 lncRNAs across 407 primary tumors. We describe global correlations between lncRNA copy-number and expression, and associate established expression subtypes with distinct lncRNA signatures. By examining regions of focal copy-number change that lack protein-coding targets, we identified an intergenic lncRNA on chromosome 1, OVAL, that shows narrow focal genomic amplification in a subset of tumors. While weakly expressed in most tumors, focal amplification coincided with strong OVAL transcriptional activation. Screening of 16 other cancer types revealed similar patterns in serous endometrial carcinomas. This shows that intergenic lncRNAs can be specifically targeted by somatic copy-number amplification, suggestive of functional involvement in tumor initiation or progression. Our analysis provides testable hypotheses and paves the way for further study of lncRNAs based on TCGA and other large-scale cancer genomics datasets.
The human genome encodes tens of thousands of long non-coding RNAs (lncRNAs), a novel and important class of genes. Our knowledge of lncRNAs has grown exponentially since their discovery within the last decade. lncRNAs are expressed in a highly cell- and tissue-specific manner, and are particularly abundant within the nervous system. lncRNAs are subject to post-transcriptional processing and inter- and intra-cellular transport. lncRNAs act via a spectrum of molecular mechanisms leveraging their ability to engage in both sequence-specific and conformational interactions with diverse partners (DNA, RNA, and proteins). Because of their size, lncRNAs act in a modular fashion, bringing different macromolecules together within the three-dimensional context of the cell. lncRNAs thus coordinate the execution of transcriptional, post-transcriptional, and epigenetic processes and critical biological programs (growth and development, establishment of cell identity, and deployment of stress responses). Emerging data reveal that lncRNAs play vital roles in mediating the developmental complexity, cellular diversity, and activity-dependent plasticity that are hallmarks of the brain. Corresponding studies implicate these factors in brain aging and the pathophysiology of brain disorders, through evolving paradigms including the following: (i) genetic variation in lncRNA genes causes disease and influences susceptibility; (ii) epigenetic deregulation of lncRNA genes is associated with disease; (iii) genomic context links lncRNA genes to disease genes and pathways; and (iv) lncRNAs are otherwise interconnected with known pathogenic mechanisms. Hence, lncRNAs represent prime targets that can be exploited for diagnosing and treating nervous system diseases. Such clinical applications are in the early stages of development but are rapidly advancing because of existing expertise and technology platforms that are readily adaptable for these purposes.
Electronic supplementary material
The online version of this article (doi:10.1007/s13311-013-0199-0) contains supplementary material, which is available to authorized users.
ANRIL; Long Non-Coding RNA; Non-Coding RNA; Microvesicle; NEAT2; Neurological Disease
Hereditary Haemorrhagic Telangiectasia (HHT) is an autosomal dominantly inherited vascular disease characterized by the presence of mucocutaneous telangiectasia and arteriovenous malformations in visceral organs. HHT is predominantly caused by mutations in ENG and ACVRL1, which both belong to the TGF-β signalling pathway. The exact mechanism by which haploinsufficiency of ENG and ACVRL1 leads to HHT manifestations remains to be identified. As long non-coding RNAs (lncRNAs) are increasingly recognized as key regulators of gene expression and constitute a sizable fraction of the human transcriptome, we wanted to assess whether lncRNAs play a role in the molecular pathogenesis of HHT manifestations. By microarray technology, we profiled lncRNA transcripts from HHT nasal telangiectasial and non-telangiectasial tissue using a paired design. The microarray probes were annotated using the GENCODE v.16 dataset, identifying 4,810 probes mapping to 2,811 lncRNAs. Comparing HHT telangiectasial tissue with HHT non-telangiectasial tissue, we identified 42 lncRNAs that are differentially expressed (q<0.001). Using GREAT, a tool that assumes cis-regulation, we showed that differentially expressed lncRNAs are enriched for genomic loci involved in key pathways concerning HHT. Our study identified lncRNAs that are aberrantly expressed in HHT telangiectasia and indicates that lncRNAs may contribute to the regulation of protein-coding loci in HHT. These results suggest that the lncRNA component of the transcriptome deserves more attention in HHT. A deeper understanding of lncRNAs and their role in telangiectasia formation holds potential for discovering therapeutic targets in HHT.
Long non-coding RNAs (lncRNAs) are an important class of pervasive genes involved in a variety of biological functions. They are aberrantly expressed in many types of diseases. In this study, we aimed to investigate the lncRNA profiles in preeclampsia. Preeclampsia has been observed in patients with molar pregnancy, where a fetus is absent, which demonstrates that the placenta is sufficient to cause this condition. Thus, we analyzed the lncRNA profiles in preeclampsia placentas.
In this study, we described the lncRNA profiles in six preeclampsia placentas (T) and five normal pregnancy placentas (N) using microarray. With abundant and varied probes accounting for 33,045 lncRNAs in our microarray, 28,443 lncRNAs that were expressed at a specific level were detected. From the data, we found 738 lncRNAs that were differentially expressed (≥1.5-fold change) in preeclampsia placentas compared with controls. Coding-non-coding gene co-expression networks (CNC networks) were constructed based on the correlation analysis between the differentially expressed lncRNAs and mRNAs. According to the CNC network and GO analysis of differentially expressed lncRNAs/mRNAs, we selected three lncRNAs to analyze the relationship between lncRNAs and preeclampsia. LOC391533, LOC284100, and CEACAMP8 were evaluated using qPCR in 40 preeclampsia placentas and 40 controls. These results revealed that the three lncRNAs were aberrantly expressed in preeclampsia placentas compared with controls.
Our study is the first to determine genome-wide lncRNA expression patterns in preeclampsia placenta using microarray. These results revealed that clusters of lncRNAs were aberrantly expressed in preeclampsia placenta compared with controls, which indicated that lncRNAs differentially expressed in preeclampsia placenta might play a partial or key role in preeclampsia development. Misregulation of LOC391533, LOC284100, and CEACAMP8 might contribute to the mechanism underlying preeclampsia. Taken together, this study may provide potential targets for the future treatment of preeclampsia and novel insights into preeclampsia biology.
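The coding-non-coding co-expression network mentioned above is built from correlations between lncRNA and mRNA expression. The following is a minimal Python sketch of that idea, assuming Pearson correlation and an absolute correlation cut-off of 0.9; both the correlation measure and the cut-off are assumptions, since the abstract does not specify them. The identifiers in the usage example are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sd_x * sd_y)


def coexpression_edges(lncrna_expr, mrna_expr, min_abs_r=0.9):
    """Return (lncRNA, mRNA, r) edges whose |r| meets the assumed cut-off."""
    edges = []
    for lnc, lnc_profile in lncrna_expr.items():
        for gene, gene_profile in mrna_expr.items():
            r = pearson(lnc_profile, gene_profile)
            if abs(r) >= min_abs_r:
                edges.append((lnc, gene, r))
    return edges


if __name__ == "__main__":
    # Hypothetical expression profiles across four samples.
    lncrnas = {"lncA": [1.0, 2.0, 3.0, 4.0]}
    mrnas = {"g1": [2.0, 4.0, 6.0, 8.0], "g2": [4.0, 3.0, 2.0, 1.0], "g3": [1.0, 3.0, 2.0, 3.0]}
    for edge in coexpression_edges(lncrnas, mrnas):
        print(edge)
```

Both positively and negatively correlated pairs are kept as edges here, because an lncRNA could plausibly activate or repress a coding gene; a real analysis would also attach a significance test to each correlation.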
Here, we present LNCipedia (http://www.lncipedia.org), a novel database for human long non-coding RNA (lncRNA) transcripts and genes. LncRNAs constitute a large and diverse class of non-coding RNA genes. Although several lncRNAs have been functionally annotated, the majority remains to be characterized. Different high-throughput methods to identify new lncRNAs (including RNA sequencing and annotation of chromatin-state maps) have been applied in various studies, resulting in multiple unrelated lncRNA data sets. LNCipedia offers 21 488 annotated human lncRNA transcripts obtained from different sources. In addition to basic transcript information and gene structure, several statistics are determined for each entry in the database, such as secondary structure information, protein coding potential and microRNA binding sites. Our analyses suggest that, much like microRNAs, many lncRNAs have a significant secondary structure, in line with their presumed association with proteins or protein complexes. Available literature on specific lncRNAs is linked, and users or authors can submit articles through a web interface. Protein coding potential is assessed by two different prediction algorithms: Coding Potential Calculator and HMMER. In addition, a novel strategy has been integrated for detecting potentially coding lncRNAs by automatically re-analysing the large body of publicly available mass spectrometry data in the PRIDE database. LNCipedia is publicly available and allows users to query and download lncRNA sequences and structures based on different search criteria. The database may serve as a resource to initiate small- and large-scale lncRNA studies. As an example, the LNCipedia content was used to develop a custom microarray for expression profiling of all available lncRNAs.
Intronic and intergenic long noncoding RNAs (lncRNAs) are emerging gene expression regulators. The molecular pathogenesis of renal cell carcinoma (RCC) is still poorly understood, and in particular, limited studies are available for intronic lncRNAs expressed in RCC.
Microarray experiments were performed with custom-designed arrays enriched with probes for lncRNAs mapping to intronic genomic regions. Samples from 18 primary RCC tumors and 11 nontumor adjacent matched tissues were analyzed. Meta-analyses were performed with microarray expression data from three additional human tissues (normal liver, prostate tumor and kidney nontumor samples), and with large-scale public data for epigenetic regulatory marks and for evolutionarily conserved sequences.
A signature of 29 intronic lncRNAs differentially expressed between RCC and nontumor samples was obtained (false discovery rate (FDR) <5%). A signature of 26 intronic lncRNAs significantly correlated with the RCC five-year patient survival outcome was identified (FDR <5%, p-value ≤0.01). We identified 4303 intronic antisense lncRNAs expressed in RCC, of which 22% were significantly (p <0.05) cis correlated with the expression of the mRNA in the same locus across RCC and three other human tissues. Gene Ontology (GO) analysis of those loci pointed to 'regulation of biological processes' as the main enriched category. A module map analysis of the protein-coding genes significantly (p <0.05) trans correlated with the 20% most abundant lncRNAs identified 51 enriched GO terms (p <0.05). We determined that 60% of the expressed lncRNAs are evolutionarily conserved. At the genomic loci containing the intronic RCC-expressed lncRNAs, a strong association (p <0.001) was found between their transcription start sites and genomic marks such as CpG islands, RNA Pol II binding and histone methylation and acetylation.
Intronic antisense lncRNAs are widely expressed in RCC tumors. Some of them are significantly altered in RCC in comparison with nontumor samples. The majority of these lncRNAs is evolutionarily conserved and possibly modulated by epigenetic modifications. Our data suggest that these RCC lncRNAs may contribute to the complex network of regulatory RNAs playing a role in renal cell malignant transformation.
Renal cell carcinoma (RCC); Unspliced intronic long noncoding RNAs; Antisense lncRNAs; Microarray analysis; Molecular markers; Gene expression correlation; Histone methylation; Histone acetylation; Evolutionary lncRNA conservation
The study of long non-coding RNAs (lncRNAs) has been advanced by high-throughput RNA sequencing (RNA-Seq). However, identifying lncRNAs from RNA-Seq data is still not trivial, and uncovering their functions remains a challenge.
We present a computational pipeline for detecting novel lncRNAs from RNA-Seq data. First, genome-guided transcriptome reconstruction is used to generate initially assembled transcripts. Possible partial transcripts and artefacts are filtered out according to the quantified expression level. After that, novel lncRNAs are detected by further filtering out known transcripts and those with high protein-coding potential, using a newly developed program called lncRScan. We applied our pipeline to a mouse Klf1 knockout dataset and discussed the plausible functions of the novel lncRNAs we detected by differential expression analysis. We identified 308 novel lncRNA candidates, which have shorter transcript lengths, fewer exons and shorter putative open reading frames than known protein-coding transcripts. Of the lncRNAs, 52 large intergenic ncRNAs (lincRNAs) show lower expression levels than the protein-coding ones, and 13 lncRNAs show significant differential expression between the wild-type and Klf1 knockout conditions.
Our method can predict a set of novel lncRNAs from RNA-Seq data. Some of the lncRNAs are differentially expressed between the wild-type and Klf1 knockout strains, suggesting that those novel lncRNAs can be given high priority in further functional studies.
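The filtering steps of the pipeline described above (expression filter, removal of known transcripts, removal of transcripts with high coding potential) can be sketched in Python as follows. This is an illustrative outline only and is not the lncRScan implementation: the `Transcript` fields, the function name and both thresholds are hypothetical, and the coding score is assumed to come from some external coding-potential predictor.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    tid: str             # assembled transcript identifier (hypothetical)
    expression: float    # quantified expression level, e.g. FPKM (assumed unit)
    is_known: bool       # True if it matches an annotated transcript
    coding_score: float  # higher means more likely protein-coding (assumed scale)


def candidate_lncrnas(transcripts, min_expression=1.0, max_coding_score=0.0):
    """Apply the three filters in the order given in the text."""
    # Step 1: drop likely partial transcripts and artefacts with low expression.
    expressed = [t for t in transcripts if t.expression >= min_expression]
    # Step 2: drop transcripts that are already annotated.
    novel = [t for t in expressed if not t.is_known]
    # Step 3: drop transcripts with high protein-coding potential.
    return [t for t in novel if t.coding_score <= max_coding_score]


if __name__ == "__main__":
    assembled = [
        Transcript("T1", expression=5.2, is_known=False, coding_score=-0.8),
        Transcript("T2", expression=0.3, is_known=False, coding_score=-0.5),
        Transcript("T3", expression=9.1, is_known=True, coding_score=-0.9),
        Transcript("T4", expression=4.0, is_known=False, coding_score=1.7),
    ]
    print([t.tid for t in candidate_lncrnas(assembled)])
```

Keeping the filters as separate passes makes it easy to report how many transcripts each step removes, which is useful when tuning the thresholds for a new dataset.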
Long noncoding RNAs (lncRNAs) are an important class of pervasive genes involved in a variety of biological functions. They are aberrantly expressed in many types of cancers. In this study, we described lncRNA profiles in 6 pairs of human renal clear cell carcinoma (RCCC) and the corresponding adjacent nontumorous tissues (NT) by microarray.
With abundant and varied probes accounting for 33,045 lncRNAs in our microarray, 17,157 lncRNAs that were expressed at a certain level were detected. From the data, we found thousands of lncRNAs that were differentially expressed (≥2-fold change) in RCCC tissues compared with NT, and 916 lncRNAs that were differentially expressed in five or more of the six RCCC samples. Compared with NT, many lncRNAs were significantly up-regulated or down-regulated in RCCC. Our data showed that down-regulated lncRNAs were more common than up-regulated ones. ENST00000456816, X91348, BC029135 and NR_024418 were evaluated by qPCR in sixty-three pairs of RCCC and NT samples. The four lncRNAs were aberrantly expressed in RCCC compared with matched histologically normal renal tissues.
Our study is the first to determine genome-wide lncRNA expression patterns in RCCC by microarray. The results showed that clusters of lncRNAs were aberrantly expressed in RCCC compared with NT samples, suggesting that lncRNAs differentially expressed between tumor and normal tissues may play a partial or key role in tumor development. Taken together, this study may provide potential targets for future treatment of RCCC and novel insights into cancer biology.
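The criterion of a ≥2-fold change in five or more of six tumor/NT pairs can be illustrated with a short Python sketch. It assumes positive, linear-scale (not log-transformed) intensities and assumes that the change must be in the same direction across the counted pairs; the abstract does not state either point. The function name and the lncRNA labels are hypothetical.

```python
def recurrently_changed(pairs_by_lncrna, min_fold=2.0, min_pairs=5):
    """Label lncRNAs changed by at least min_fold in at least min_pairs pairs.

    pairs_by_lncrna maps a name to a list of (tumor, normal) intensities,
    one tuple per matched pair. Intensities are assumed to be positive.
    """
    calls = {}
    for name, pairs in pairs_by_lncrna.items():
        # Count pairs with tumor/normal >= min_fold (up-regulated).
        up = sum(1 for tumor, normal in pairs if tumor / normal >= min_fold)
        # Count pairs with normal/tumor >= min_fold (down-regulated).
        down = sum(1 for tumor, normal in pairs if normal / tumor >= min_fold)
        if up >= min_pairs:
            calls[name] = "up"
        elif down >= min_pairs:
            calls[name] = "down"
    return calls


if __name__ == "__main__":
    # Hypothetical intensities for six matched pairs per lncRNA.
    data = {
        "lncX": [(4, 1), (6, 2), (10, 4), (8, 2), (9, 3), (1, 1)],
        "lncY": [(1, 4), (1, 2), (2, 5), (1, 3), (2, 4), (3, 6)],
        "lncZ": [(2, 1), (1, 2), (1, 1), (3, 1), (1, 1), (1, 1)],
    }
    print(recurrently_changed(data))
```

Requiring recurrence across most pairs, rather than a large average change, reduces the influence of a single outlying sample in a small cohort.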