Leiomyosarcoma (LMS) is a malignant neoplasm with smooth muscle differentiation. Little is known about its molecular heterogeneity and no targeted therapy currently exists for LMS. We performed expression profiling on 99 cases of LMS with 3′ end RNA sequencing (3SEQ) and demonstrated the existence of 3 molecular subtypes in this cohort. We subsequently showed that these molecular subtypes are reproducible using an independent cohort of 82 LMS cases from TCGA. Two new formalin-fixed, paraffin-embedded (FFPE) tissue-compatible diagnostic immunohistochemical markers were identified for two of the three subtypes: LMOD1 for subtype I LMS and ARL4C for subtype II LMS. Subtype I and subtype II LMS were associated with good and poor prognosis, respectively. Here, we describe the details of LMS diagnosis, RNA isolation, 3SEQ library construction, 3SEQ sequencing data analysis and molecular subtype determination. The 3SEQ data produced in this study were deposited into Gene Expression Omnibus (GEO) under GSE45510.
Leiomyosarcoma; subtypes; expression profiling; 3’ end RNA sequencing
Leiomyosarcoma (LMS) is a malignant neoplasm with smooth muscle differentiation. Little is known about its molecular heterogeneity and no targeted therapy currently exists for LMS. Recognition of different molecular subtypes is necessary to evaluate novel therapeutic options. In a previous study on 51 LMS, we identified three molecular subtypes in LMS. The current study was performed to determine whether the existence of these subtypes could be confirmed in independent cohorts.
Ninety-nine cases of LMS were expression profiled with 3′ end RNA sequencing (3SEQ). Consensus clustering was conducted to determine the optimal number of subtypes.
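The consensus clustering step can be illustrated with a minimal sketch. It assumes a samples-by-genes expression matrix `X`, repeatedly subsamples the cases, clusters each subsample, and records how often each pair of cases lands in the same cluster. The base clusterer (a plain k-means), the 80% subsampling fraction, the number of resamples, and the stability summary (proportion of ambiguously clustered pairs, PAC) are all illustrative assumptions, not the settings or software used in the study; the synthetic data stand in for real expression profiles.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster went empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def consensus_matrix(X, k, n_resamples=100, frac=0.8, seed=0):
    """Fraction of resamples in which each pair of samples co-clusters."""
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))  # times a pair shared a cluster
    sampled = np.zeros((n, n))   # times a pair was drawn together
    for _ in range(n_resamples):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        labels = kmeans_labels(X[idx], k, rng)
        same = (labels[:, None] == labels[None, :]).astype(float)
        sampled[np.ix_(idx, idx)] += 1
        together[np.ix_(idx, idx)] += same
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(sampled > 0, together / sampled, 0.0)

def pac_score(M, lower=0.1, upper=0.9):
    """Proportion of sample pairs with ambiguous consensus (lower is more stable)."""
    vals = M[np.triu_indices_from(M, k=1)]
    return float(np.mean((vals > lower) & (vals < upper)))

# Hypothetical toy data: three well-separated groups of 20 samples each.
rng = np.random.default_rng(1)
group_centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in group_centers])

for k in (2, 3, 4):
    print(k, round(pac_score(consensus_matrix(X, k)), 3))
```

In this kind of analysis the candidate number of subtypes is chosen by comparing a stability summary across values of k; the PAC score here is only one of several summaries in use.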
We identified 3 LMS molecular subtypes and confirmed this finding by analyzing publicly available data on 82 LMS from The Cancer Genome Atlas (TCGA). We identified two new FFPE tissue-compatible diagnostic immunohistochemical markers: LMOD1 for subtype I LMS and ARL4C for subtype II LMS. An LMS tissue microarray with known clinical outcome was used to show that subtype I LMS is associated with good outcome in extrauterine LMS while subtype II LMS is associated with poor prognosis in both uterine and extrauterine LMS. The LMS subtypes showed significant differences in expression levels for genes for which novel targeted therapies are being developed, suggesting that LMS subtypes may respond differentially to these targeted therapies.
We confirm the existence of 3 molecular subtypes in LMS using two independent datasets and show that the different molecular subtypes are associated with distinct clinical outcomes. The findings offer an opportunity for treating LMS in a subtype-specific targeted approach.
Leiomyosarcoma; subtypes; outcome; biomarker; sequencing
The identification of high-risk stage II colon cancers is key to the selection of patients who require adjuvant treatment after surgery. Microarray-based multigene-expression signatures derived from stem cells and progenitor cells hold promise, but they are difficult to use in clinical practice.
We used a new bioinformatics approach to search for biomarkers of colon epithelial differentiation across gene-expression arrays and then ranked candidate genes according to the availability of clinical-grade diagnostic assays. With the use of subgroup analysis involving independent and retrospective cohorts of patients with stage II or stage III colon cancer, the top candidate gene was tested for its association with disease-free survival and a benefit from adjuvant chemotherapy.
The transcription factor CDX2 ranked first in our screening test. A group of 87 of 2115 tumor samples (4.1%) lacked CDX2 expression. In the discovery data set, which included 466 patients, the rate of 5-year disease-free survival was lower among the 32 patients (6.9%) with CDX2-negative colon cancers than among the 434 (93.1%) with CDX2-positive colon cancers (hazard ratio for disease recurrence, 3.44; 95% confidence interval [CI], 1.60 to 7.38; P = 0.002). In the validation data set, which included 314 patients, the rate of 5-year disease-free survival was lower among the 38 patients (12.1%) with CDX2 protein–negative colon cancers than among the 276 (87.9%) with CDX2 protein–positive colon cancers (hazard ratio, 2.42; 95% CI, 1.36 to 4.29; P = 0.003). In both these groups, these findings were independent of the patient's age, sex, and tumor stage and grade. Among patients with stage II cancer, the difference in 5-year disease-free survival was significant both in the discovery data set (49% among 15 patients with CDX2-negative tumors vs. 87% among 191 patients with CDX2-positive tumors, P = 0.003) and in the validation data set (51% among 15 patients with CDX2-negative tumors vs. 80% among 106 patients with CDX2-positive tumors, P = 0.004). In a pooled database of all patient cohorts, the rate of 5-year disease-free survival was higher among 23 patients with stage II CDX2-negative tumors who were treated with adjuvant chemotherapy than among 25 who were not treated with adjuvant chemotherapy (91% vs. 56%, P = 0.006).
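As a consistency check on the hazard ratios reported above, the standard error of log(HR) can be back-calculated from a 95% confidence interval and converted to a two-sided P value. The sketch below assumes a Wald-type interval that is symmetric on the log scale; the abstract does not state which test produced the reported P values, so the function name and the approach are illustrative assumptions rather than the authors' method.

```python
import math

def wald_p_from_ci(hr, lo, hi, z_crit=1.959964):
    """Back-calculate SE(log HR), z and two-sided P from a 95% CI.

    Assumes the interval is log(HR) +/- z_crit * SE (a Wald interval).
    """
    se = (math.log(hi) - math.log(lo)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of a standard normal variate.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values taken from the discovery and validation data sets in the text.
for name, hr, lo, hi in [
    ("discovery", 3.44, 1.60, 7.38),
    ("validation", 2.42, 1.36, 4.29),
]:
    se, z, p = wald_p_from_ci(hr, lo, hi)
    print(f"{name}: SE(log HR)={se:.3f}, z={z:.2f}, two-sided P={p:.4f}")
```

Under this assumption, both back-calculated values come out close to the reported P values, which suggests the hazard ratios, intervals and P values are mutually consistent.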
Lack of CDX2 expression identified a subgroup of patients with high-risk stage II colon cancer who appeared to benefit from adjuvant chemotherapy. (Funded by the National Comprehensive Cancer Network, the National Institutes of Health, and others.)
Sarcomas of soft tissue and bone are rare neoplasms that can be separated into a large number of different diagnostic entities. Over the years, a number of diagnostic markers have been developed that aid pathologists in reaching the appropriate diagnoses. Many of these markers are sarcoma-specific proteins that can be detected by immunohistochemistry in formalin-fixed, paraffin-embedded (FFPE) sections. In addition, a wide range of molecular studies have been developed that can detect gene mutations, gene amplifications or chromosomal translocations in FFPE material. Until recently, most sequencing-based approaches relied on the availability of fresh frozen tissue. However, with the advent of next-generation sequencing technologies, FFPE material is increasingly being used as a tool to identify novel immunohistochemistry markers, gene mutations, and chromosomal translocations, and to develop diagnostic tests.
molecular testing; next-generation sequencing; sarcoma
Well-differentiated leiomyosarcomas show morphologically recognizable smooth muscle differentiation, while poorly differentiated tumors may form a spectrum with a subset of undifferentiated pleomorphic sarcomas. Expression of certain muscle markers has been reported to have prognostic impact. We investigated the correlation between morphologic spectrum and muscle-marker expression profile of leiomyosarcoma and the impact of these factors on patient outcomes.
Methods and Results
Tissue microarrays including 203 non-uterine and 181 uterine leiomyosarcomas with a spectrum of tumor morphologies were evaluated for expression of immunohistochemical markers of muscle differentiation. Poorly differentiated tumors frequently lost one or more conventional smooth muscle markers (smooth muscle actin, desmin, h-caldesmon and smooth muscle myosin; p<0.0001), as well as the more recently described markers SLMAP, MYLK and ACTG2 (p<0.0001). In primary tumors, both desmin and CFL2 expression predicted improved overall survival in multivariate analyses (p=0.0111 and p=0.043, respectively). Muscle-marker enriched tumors (expressing all 4 conventional markers or any 3 of ACTG2, CFL2, CASQ2, MYLK, and SLMAP) had improved overall survival (p<0.05) in univariate analyses.
Morphologically and immunohistochemically, poorly differentiated leiomyosarcoma can masquerade as undifferentiated pleomorphic sarcoma with progressive loss of muscle markers. Expression of muscle markers has prognostic significance in primary leiomyosarcoma independent of tumor morphology.
Leiomyosarcoma; differentiation; immunohistochemistry; prognosis; desmin
Despite the importance of WRKY genes in plant physiological processes, little is known about their roles in Panax ginseng C.A. Meyer. Forty-eight unigenes in this species were previously reported as WRKY transcripts using next-generation sequencing (NGS) technology. Subsequently, one gene that encodes the PgWRKY1 protein, belonging to subgroup II-d, was cloned and functionally characterized. In this study, eight WRKY genes from the NGS-based transcriptome sequencing dataset, designated PgWRKY2-9, have been cloned and characterized. The genes encoding WRKY proteins were assigned to WRKY Group II (one subgroup II-c, four subgroup II-d, and three subgroup II-e) based on phylogenetic analysis. The cDNAs of the cloned PgWRKYs encode putative proteins ranging from 194 to 358 amino acid residues, each of which includes one WRKYGQK sequence motif and one C2H2-type zinc-finger motif. Quantitative real-time PCR (qRT-PCR) analysis demonstrated that the eight analyzed PgWRKY genes were expressed at different levels in various organs including leaves, roots, adventitious roots, stems, and seeds. Importantly, the transcription responses of these PgWRKYs to methyl jasmonate (MeJA) showed that PgWRKY2, PgWRKY3, PgWRKY4, PgWRKY5, PgWRKY6, and PgWRKY7 were downregulated by MeJA treatment, while PgWRKY8 and PgWRKY9 were upregulated to varying degrees. Moreover, expression of the PgWRKY genes was increased or decreased by salicylic acid (SA), abscisic acid (ABA), and NaCl treatments. The results suggest that the PgWRKYs may be multiple stress–inducible genes responding to both salt and hormones.
Panax ginseng; transcription factors; abiotic stress
Many common human mesenchymal tumors, including gastrointestinal stromal tumor (GIST), rhabdomyosarcoma (RMS), and leiomyosarcoma (LMS), feature myogenic differentiation [1–3]. Here we report that intragenic deletion of the dystrophin-encoding and muscular dystrophy-associated DMD gene is a frequent mechanism by which myogenic tumors progress to high-grade, lethal sarcomas. Dystrophin is expressed in nonneoplastic and benign counterparts for GIST, RMS and LMS, and the DMD deletions inactivate larger dystrophin isoforms, including 427 kDa dystrophin, while preserving expression of an essential 71 kDa isoform. Dystrophin inhibits myogenic sarcoma cell migration, invasion, anchorage independence, and invadopodia formation, and dystrophin inactivation was found in 96%, 100%, and 62% of metastatic GIST, embryonal RMS, and LMS, respectively. These findings validate dystrophin as a tumor suppressor and likely anti-metastatic factor, suggesting that therapies in development for muscular dystrophies may also have relevance in treatment of cancer.
The earliest recognizable stages of breast neoplasia are lesions that represent a heterogeneous collection of epithelial proliferations currently classified based on morphology. Their role in the development of breast cancer is not well understood but insight into the critical events at this early stage will improve efforts in breast cancer detection and prevention. These microscopic lesions are technically difficult to study so very little is known about their molecular alterations.
To characterize the transcriptional changes of early breast neoplasia, we sequenced 3′-end-enriched RNA-seq libraries from formalin-fixed paraffin-embedded tissue of early neoplasia samples and matched normal breast and carcinoma samples from 25 patients. We find that gene expression patterns within early neoplasias are distinct from both normal and breast cancer patterns and identify a pattern of pro-oncogenic changes, including elevated transcription of ERBB2, FOXA1, and GATA3 at this early stage. We validate these findings on a second independent gene expression profile data set generated by whole transcriptome sequencing. Measurements of protein expression by immunohistochemistry on an independent set of early neoplasias confirm that ER pathway regulators FOXA1 and GATA3, as well as ER itself, are consistently upregulated at this early stage. The early neoplasia samples also demonstrate coordinated changes in long non-coding RNA expression and microenvironment stromal gene expression patterns.
This study is the first examination of global gene expression in early breast neoplasia, and the genes identified here represent candidate participants in the earliest molecular events in the development of breast cancer.
Multiple studies have shown that the tumor microenvironment (TME) of carcinomas can play an important role in the initiation, progression, and metastasis of cancer. Here we test the hypothesis that specific benign fibrous soft tissue tumor gene expression profiles may represent distinct stromal fibroblastic reaction types that occur in different breast cancers. The discovered stromal profiles could classify breast cancer based on the type of stromal reaction patterns in the TME.
Next generation sequencing-based gene expression profiling (3SEQ) was performed on formalin fixed, paraffin embedded (FFPE) samples of 10 types of fibrous soft tissue tumors. We determined the extent to which these signatures could identify distinct subsets of breast cancers in four publicly available breast cancer datasets.
A total of 53 fibrous tumors were sequenced by 3SEQ with an average of 29 million reads per sample. Both the gene signatures derived from elastofibroma (EF) and fibroma of tendon sheath (FOTS) demonstrated robust outcome results for survival in the four breast cancer datasets. The breast cancers positive for the EF signature (20-33% of the cohort) demonstrated significantly better outcome for survival. In contrast, the FOTS signature-positive breast cancers (11-35% of the cohort) had a worse outcome.
We defined and validated two new stromal signatures in breast cancer (EF and FOTS), which are significantly associated with prognosis. Our group has previously identified novel cancer stromal gene expression signatures associated with outcome differences in breast cancer by gene expression profiling of three soft tissue tumors, desmoid-type fibromatosis (DTF), solitary fibrous tumor (SFT), and tenosynovial giant cell tumor (TGCT/CSF1), as surrogates for stromal expression patterns. By combining the stromal signatures of EF and FOTS with our previously identified DTF and TGCT/CSF1 signatures, we can now characterize clinically relevant stromal expression profiles in the TME for between 74% and 90% of all breast cancers.
Social caste in the honey bee is assumed to be determined by the dietary status of the young larvae, which is translated into physiological and epigenetic changes through nutrient-sensing pathways. We have employed Illumina/Solexa sequencing to examine the small RNA content in the bee larval food, and show that worker jelly is enriched in miRNA complexity and abundance relative to royal jelly. The miRNA levels in worker jelly were 7- to 215-fold higher than in royal jelly, and both jellies showed dynamic changes in miRNA content during the 4th to 6th day of larval development. Adding specific miRNAs to royal jelly elicited significant changes in queen larval mRNA expression and morphological characters of the emerging adult queen bee. We propose that miRNAs in the nurse bee secretions constitute an additional element in the regulatory control of caste determination in the honey bee.
Endometrial stromal sarcoma (ESS) characterized by YWHAE-FAM22 genetic fusion is histologically higher-grade and clinically more aggressive than ESS with JAZF1-SUZ12 or equivalent genetic rearrangements; hence it is clinically important to recognize this subset of ESS. To identify diagnostic immunomarkers for this biologically defined ESS subset, we compared gene expression profiles from YWHAE-FAM22 ESS, JAZF1-rearranged ESS and uterine leiomyosarcomas. These studies showed consistent upregulation of cyclin D1 in YWHAE-FAM22 ESS compared to JAZF1-SUZ12 ESS. Immunohistochemically, the high-grade round cell component of all 12 YWHAE-FAM22 ESS demonstrated diffuse (≥70%) moderate-to-strong nuclear cyclin D1 staining, and this diffuse positivity was not seen in 34 ESS with JAZF1 and equivalent genetic rearrangements or in 21 low-grade ESS with no demonstrable genetic rearrangements. In a series of 243 non-ESS pure uterine mesenchymal and mixed epithelial-mesenchymal tumors, only 2 of 8 undifferentiated endometrial sarcomas with nuclear uniformity and 1 of 80 uterine leiomyosarcomas demonstrated diffuse cyclin D1 immunoreactivity. Both cyclin D1-positive undifferentiated endometrial sarcomas showed diffuse strong CD10 staining, which is consistently absent in the high-grade round cell component of YWHAE-FAM22 ESS. The low-grade spindle cell component of YWHAE-FAM22 ESS showed a spatially heterogeneous cyclin D1 staining pattern that was weaker and less diffuse overall. Our findings indicate that cyclin D1 is a sensitive and specific diagnostic immunomarker for YWHAE-FAM22 ESS. When evaluating high-grade uterine sarcomas, cyclin D1 can be included in the immunohistochemical panel as an indicator of YWHAE-FAM22 ESS.
Endometrial stromal sarcoma; round cell; YWHAE-FAM22; cyclin D1; JAZF1-SUZ12
Long non-coding RNAs (lncRNAs), which have no protein-coding capacity, make up a large proportion of the transcriptome of various species. Many lncRNAs are expressed within the animal central nervous system in spatially and temporally specific patterns, suggesting that lncRNAs play important roles in cellular processes, neural development, and even in cognitive and behavioral processes. However, relatively little is known about their in vivo functions and underlying molecular mechanisms in the nervous system. Here, we report a neural-specific Drosophila lncRNA, CASK regulatory gene (CRG), which participates in locomotor activity and climbing ability by positively regulating its neighboring gene CASK (Ca2+/calmodulin-dependent protein kinase). CRG deficiency led to reduced locomotor activity and defective climbing ability, phenotypes that are often seen in CASK mutants. CRG mutants also showed reduced CASK expression, and conversely, CASK over-expression could rescue the CRG mutant phenotypes. At the molecular level, CRG was required for the recruitment of RNA polymerase II to the CASK promoter regions, which in turn enhanced CASK expression. Our work has revealed new functional roles of lncRNAs and has provided insights to explore the pathogenesis of neurological diseases associated with movement disorders.
Molecular characterization of tumors has been critical for identifying important genes in cancer biology and for improving tumor classification and diagnosis. Long non-coding RNAs, as a new, relatively unstudied class of transcripts, provide a rich opportunity to identify both functional drivers and cancer-type-specific biomarkers. However, despite the potential importance of long non-coding RNAs to the cancer field, no comprehensive survey of long non-coding RNA expression across various cancers has been reported.
We performed a sequencing-based transcriptional survey of both known long non-coding RNAs and novel intergenic transcripts across a panel of 64 archival tumor samples comprising 17 diagnostic subtypes of adenocarcinomas, squamous cell carcinomas and sarcomas. We identified hundreds of transcripts from among the known 1,065 long non-coding RNAs surveyed that showed variability in transcript levels between the tumor types and are therefore potential biomarker candidates. We discovered 1,071 novel intergenic transcribed regions and demonstrate that these show similar patterns of variability between tumor types. We found that many of these differentially expressed cancer transcripts are also expressed in normal tissues. One such novel transcript specifically expressed in breast tissue was further evaluated using RNA in situ hybridization on a panel of breast tumors. It was shown to correlate with low tumor grade and estrogen receptor expression, thereby representing a potentially important new breast cancer biomarker.
This study provides the first large survey of long non-coding RNA expression within a panel of solid cancers and also identifies a number of novel transcribed regions differentially expressed across distinct cancer types that represent candidate biomarkers for future research.
3SEQ; FFPE; human cancer; intergenic transcripts; lncRNAs; novel transcripts; solid tumors; transcriptional profiling
Small non-coding RNAs (sRNAs) play key roles in plant development, growth and responses to biotic and abiotic stresses. At least four classes of sRNAs have been well characterized in plants, including repeat-associated siRNAs (rasiRNAs), microRNAs (miRNAs), trans-acting siRNAs (tasiRNAs) and natural antisense transcript-derived siRNAs. Chinese fir (Cunninghamia lanceolata) is one of the most important coniferous evergreen tree species in China. No sRNA from Chinese fir has been described to date.
To obtain sRNAs in Chinese fir, we sequenced an sRNA library generated from seeds, seedlings, leaves, stems and calli, using Illumina high-throughput sequencing technology. A comprehensive set of sRNAs was acquired, including conserved and novel miRNAs, rasiRNAs and tasiRNAs. With BLASTN and MIREAP we identified a total of 115 conserved miRNAs comprising 40 miRNA families and one novel miRNA with a precursor sequence. The expression of 16 conserved miRNAs, one novel miRNA and one tasiRNA was detected by RT-PCR. Utilizing real-time RT-PCR, we revealed that four conserved miRNAs and one novel miRNA displayed developmental stage-specific expression patterns in Chinese fir. In addition, 209 unigenes were predicted to be targets of 30 Chinese fir miRNA families, of which five target genes were experimentally verified by 5' RACE, including a squamosa promoter-binding protein gene, a pentatricopeptide (PPR) repeat-containing protein gene, a BolA-like family protein gene, AGO1 and a gene of unknown function. We also demonstrated that the DCL3-dependent rasiRNA biogenesis pathway, which had been considered absent in conifers, existed in Chinese fir. Furthermore, the miR390-TAS3-ARF regulatory pathway was elucidated.
We unveiled a complex population of sRNAs in Chinese fir through high throughput sequencing. This provides an insight into the composition and function of sRNAs in Chinese fir and sheds new light on land plant sRNA evolution.
Chinese fir; miRNA; rasiRNA; tasiRNA; Cunninghamia lanceolata
Accumulating evidence shows that small non-protein-coding RNAs (ncRNAs) play important roles in development, stress response and other cellular processes. The silkworm is an important model for studies on insect genetics and control of lepidopterous pests. Here, we have performed the first systematic identification and analysis of intermediate-size ncRNAs (50–500 nt) in the silkworm. We identified 189 novel ncRNAs, including 141 snoRNAs, six snRNAs, three tRNAs, one SRP and 38 unclassified ncRNAs. Forty ncRNAs showed significantly altered expression during silkworm development or across specific stage transitions. Genomic comparisons revealed that 123 of these ncRNAs are potentially silkworm-specific. Analysis of the genomic organization of the ncRNA loci showed that 32.62% of the novel snoRNA loci are intergenic, and that all the intronic snoRNAs follow the pattern of one-snoRNA-per-intron. Target site analysis predicted a total of 95 2′-O-methylation and pseudouridylation modification sites of rRNAs, snRNAs and tRNAs. Together, these findings provide new clues for future functional study of ncRNA during insect development and evolution.
A major goal of post-genomics research is the integrated analysis of genes, regulatory elements and the chromatin architecture on a genome-wide scale. Mapping DNase I hypersensitive sites within the nuclear chromatin is a powerful and well-established method of identifying regulatory element candidates.
Here, we report the first genome-wide analysis of DNase I hypersensitive sites (DHSs) in Caenorhabditis elegans. The data were obtained by hybridizing DNase I-treated and end-captured material from young adult worms to a high-resolution tiling microarray. The data show that C. elegans DHSs were significantly enriched within intergenic regions located 2 kb upstream and downstream of coding genes, and also that a considerable fraction of all DHSs mapped to intergenic positions distant from annotated coding genes. Annotated transcribed loci were generally depleted in DHSs relative to intergenic regions, but DHSs were nonetheless enriched in coding exons and UTRs, whereas introns were significantly depleted in DHSs. Many DHSs appeared to be associated with annotated non-coding RNAs and recently detected transcripts of unknown function. It has been reported that highly conserved non-coding elements in nematodes are associated with cis-regulatory elements, and we also found that DHSs, particularly distal intergenic DHSs, were significantly enriched in regions that were conserved between the C. elegans and C. briggsae genomes.
We describe the first genome-wide analysis of C. elegans DHSs, and show that the distribution of DHSs is strongly associated with functional elements in the genome.
Small noncoding RNAs (ncRNAs), including short interfering RNAs (siRNAs) and microRNAs (miRNAs), can silence genes at the transcriptional, post-transcriptional or translational level [1,2].
Here, we show that microRNA-10a (miR-10a) targets a homologous DNA region in the promoter of the hoxd4 gene and represses its expression at the transcriptional level. Mutational analysis of the miR-10a sequence revealed that the 3' end of the miRNA sequence is the most critical element for the silencing effect. MicroRNA-10a-induced transcriptional gene inhibition requires the presence of Dicer and Argonautes 1 and 3, and it is related to promoter-associated noncoding RNAs. Bisulfite sequencing analysis showed that the reduced hoxd4 expression was accompanied by de novo DNA methylation at the hoxd4 promoter. We further demonstrated that trimethylation of histone 3 lysine 27 (H3K27me3) is involved in the miR-10a-induced transcriptional silencing of hoxd4.
In conclusion, our results demonstrate that miR-10a can regulate human gene expression in a transcriptional manner, and indicate that endogenous small noncoding RNA-induced control of transcription may be a potential system for expressional regulation in human breast cancer cells.