Advantages of RNA-Seq over array-based platforms are quantitative gene expression and discovery of expressed single nucleotide variants (eSNVs) and fusion transcripts from a single platform, but the sensitivity for each of these characteristics is unknown. We measured gene expression in a set of manually degraded RNAs and in nine pairs of matched fresh-frozen and FFPE RNA isolated from breast tumors, using the hybridization-based NanoString nCounter (226-gene panel) and whole-transcriptome RNA-Seq with RiboZeroGold ScriptSeq V2 library preparation kits. We performed correlation analyses of gene expression between samples and across platforms. We then specifically assessed whole-transcriptome expression of lincRNA and discovery of eSNVs and fusion transcripts in the FFPE RNA-Seq data. For gene expression in the manually degraded samples, we observed Pearson correlations of >0.94 and >0.80 with NanoString and ScriptSeq protocols, respectively. Gene expression data for matched fresh-frozen and FFPE samples yielded mean Pearson correlations of 0.874 and 0.783 for NanoString (226 genes) and ScriptSeq whole-transcriptome protocols, respectively (p < 2 × 10⁻¹⁶). Specifically for lincRNAs, we observed a very high Pearson correlation (0.988) between matched fresh-frozen and FFPE pairs. FFPE samples across NanoString and RNA-Seq platforms gave a mean Pearson correlation of 0.838. In FFPE libraries, we detected 53.4% of high-confidence SNVs and 24% of high-confidence fusion transcripts. The low sensitivity of fusion transcript detection was not overcome by an up to 3-fold increase in sequencing depth (from ~56 to ~159 million reads). Both NanoString and ScriptSeq RNA-Seq technologies yield reliable gene expression data for degraded and FFPE material. The high degree of correlation between NanoString and RNA-Seq platforms suggests that discovery-based whole-transcriptome studies from FFPE material will produce reliable expression data.
The RiboZeroGold ScriptSeq protocol performed particularly well for lincRNA expression from FFPE libraries, but detection of eSNV and fusion transcripts was less sensitive.
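The matched-pair comparison reported above rests on the Pearson correlation between expression vectors from the same tumor. The following is a minimal sketch of that calculation, not the authors' pipeline: the function names and the toy expression values are assumptions made for illustration, and real data would first need normalization and gene matching across platforms.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_pairwise_correlation(pairs):
    """Mean Pearson r over matched (fresh_frozen, ffpe) expression vectors."""
    return sum(pearson(ff, ffpe) for ff, ffpe in pairs) / len(pairs)


# Hypothetical expression values for two matched pairs (illustration only).
toy_pairs = [
    ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]),  # perfectly concordant
    ([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]),  # perfectly discordant
]
mean_r = mean_pairwise_correlation(toy_pairs)
```

Averaging per-pair coefficients, as in `mean_pairwise_correlation`, is one plausible reading of "mean Pearson correlation"; the abstract does not state how the mean was formed.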
Our goal in these analyses was to use genomic features from a test set of primary breast tumors to build an integrated transcriptome landscape model that makes relevant hypothetical predictions about the biological and/or clinical behavior of HER2-positive breast cancer. We interrogated RNA-Seq data from benign breast lesions, ER+, triple negative, and HER2-positive tumors to identify 685 differentially expressed genes, 102 alternatively spliced genes, and 303 genes that expressed single nucleotide sequence variants (eSNVs) that were associated with the HER2-positive tumors in our survey panel. These features were integrated into a transcriptome landscape model that identified 12 highly interconnected genomic modules, each of which represents a cellular process pathway that appears to define the genomic architecture of the HER2-positive tumors in our test set. The generality of the model was confirmed by the observation that several key pathways were enriched in HER2-positive TCGA breast tumors. The ability of this model to make relevant predictions about the biology of breast cancer cells was established by the observation that integrin signaling was linked to lapatinib sensitivity in vitro and strongly associated with risk of relapse in the NCCTG N9831 adjuvant trastuzumab clinical trial dataset. Additional modules from the HER2 transcriptome model, including ubiquitin-mediated proteolysis, TGF-beta signaling, RHO-family GTPase signaling, and M-phase progression, were linked to response to lapatinib and paclitaxel in vitro and/or risk of relapse in the N9831 dataset. These data indicate that an integrated transcriptome landscape model derived from a test set of HER2-positive breast tumors has potential for predicting outcome and for identifying novel therapeutic strategies for this breast cancer subtype.
KRAS mutations are highly prevalent in non-small cell lung cancer (NSCLC), and tumors harboring these mutations tend to be aggressive and resistant to chemotherapy. We used next-generation sequencing technology to identify pathways that are specifically altered in lung tumors harboring a KRAS mutation. Paired-end RNA-sequencing of 15 primary lung adenocarcinoma tumors (8 harboring mutant KRAS and 7 with wild-type KRAS) was performed. Sequences were mapped to the human genome, and genomic features, including differentially expressed genes, alternative splicing isoforms and single nucleotide variants, were determined for tumors with and without KRAS mutation using a variety of computational methods. Network analysis was carried out on genes showing differential expression (374 genes), alternative splicing (259 genes), and SNV-related changes (65 genes) in NSCLC tumors harboring a KRAS mutation. Genes exhibiting two or more connections in the lung adenocarcinoma network were used to carry out integrated pathway analysis. The most significant signaling pathways identified through this analysis were the NFκB, ERK1/2, and AKT pathways. A 27-gene mutant KRAS-specific subnetwork was extracted based on gene–gene connections from the integrated network, and interrogated for druggable targets. Our results confirm previous evidence that mutant KRAS tumors exhibit activated NFκB, ERK1/2, and AKT pathways and may be preferentially sensitive to therapeutics targeting these pathways. In addition, our analysis indicates novel, previously unappreciated links between mutant KRAS and the TNFR and PPARγ signaling pathways, suggesting that targeted PPARγ antagonists and TNFR inhibitors may be useful therapeutic strategies for treatment of mutant KRAS lung tumors. Our study is the first to integrate genomic features from RNA-Seq data from NSCLC and to define a first-draft genomic landscape model that is unique to tumors with oncogenic KRAS mutations.
transcriptome sequencing; RNA-Seq; KRAS mutation; NSCLC; bioinformatics; network analysis; data integration and computational methods
Rapid development of next-generation sequencing technology has enabled the identification of genomic alterations from short sequencing reads. A number of software pipelines are available for calling single nucleotide variants from genomic DNA, but there are no comprehensive pipelines to identify, annotate and prioritize expressed SNVs (eSNVs) from non-directional paired-end RNA-Seq data. We have developed eSNV-Detect, a novel computational system that uses data from multiple aligners to call variants from RNA-Seq, even at low read depths, and to rank them. Multi-platform comparisons with the eSNV-Detect variant candidates were performed. The method was first applied to RNA-Seq from a lymphoblastoid cell line, achieving 99.7% precision and 91.0% sensitivity in the expressed SNPs for the matching HumanOmni2.5 BeadChip data. Comparison of RNA-Seq eSNV candidates from 25 ER+ breast tumors from The Cancer Genome Atlas (TCGA) project with whole-exome coding data showed 90.6–96.8% precision and 91.6–95.7% sensitivity. Contrasting single-cell mRNA-Seq variants with matching traditional multicellular RNA-Seq data for the MDA-MB-231 breast cancer cell line delineated variant heterogeneity among the single cells. Further, Sanger sequencing validation was performed for an ER+ breast tumor with paired normal adjacent tissue, validating 29 out of 31 candidate eSNVs. The source code and user manuals of the eSNV-Detect pipeline for Sun Grid Engine and virtual machine are available at http://bioinformaticstools.mayo.edu/research/esnv-detect/.
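Precision and sensitivity figures such as those quoted above are typically obtained by comparing a set of called variants against a reference set from another platform. The sketch below illustrates that comparison under simple assumptions; it is not part of eSNV-Detect, and the function name, the `(chrom, pos, ref, alt)` tuple representation, and the toy variants are all hypothetical.

```python
def precision_and_sensitivity(called, truth):
    """Compare called variants against a reference ("truth") set.

    Each variant is a hashable key, here a (chrom, pos, ref, alt) tuple.
    precision   = true positives / all called variants
    sensitivity = true positives / all reference variants
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else float("nan")
    sensitivity = true_positives / len(truth) if truth else float("nan")
    return precision, sensitivity


# Hypothetical variants for illustration only.
truth_set = [
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 3000, "G", "A"),
    ("chr3", 4000, "T", "C"),
    ("chr4", 5000, "A", "C"),
]
called_set = [
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 3000, "G", "A"),
    ("chr9", 9999, "G", "T"),  # called but absent from the reference set
]
precision, sensitivity = precision_and_sensitivity(called_set, truth_set)
```

In practice the comparison would be restricted to positions that both platforms can assay (for example, expressed sites covered by the array or exome capture), which this sketch does not model. By the same arithmetic, the 29 of 31 Sanger-validated candidates correspond to a validation rate of about 93.5%.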
The c.4309A>C mutation in the LRRK2 gene (LRRK2 p.N1437H) has recently been reported as the seventh pathogenic LRRK2 mutation causing monogenic Parkinson's disease (PD). So far, only two families worldwide have been identified with this mutation. By screening DNA from seven brains of PD patients, we found one individual with seemingly sporadic PD and LRRK2 p.N1437H mutation. Clinically, the patient had levodopa-responsive PD with tremor, and developed severe motor fluctuations during a disease duration of 19 years. There was severe and painful ON-dystonia, and severe depression with suicidal thoughts during OFF. In the advanced stage, cognition was slow during motor OFF, but there was no noticeable cognitive decline. There were no signs of autonomic nervous system dysfunction. Bilateral deep brain stimulation of the subthalamic nucleus had unsatisfactory results on motor symptoms. The patient committed suicide. Neuropathological examination revealed marked cell loss and moderate alpha-synuclein positive Lewy body pathology in the brainstem. There was sparse Lewy pathology in the cortex. A striking finding was very pronounced ubiquitin-positive pathology in the brainstem, temporolimbic regions and neocortex. Ubiquitin positivity was most pronounced in the white matter, and was out of proportion to the comparatively weaker alpha-synuclein immunoreactivity. Immunostaining for tau was mildly positive, revealing non-specific changes, but staining for TDP-43 and FUS was entirely negative. The distribution and shape of ubiquitin positive lesions in this patient differed from the few previously described patients with LRRK2 mutations and ubiquitin pathology, and the ubiquitinated protein substrate remains undefined.
Autosomal Dominant Parkinsonism; LRRK2; alpha-Synuclein; Ubiquitin; Deep Brain Stimulation; Suicide
Genealogical investigation of a large Norwegian family (F04) with autosomal dominant parkinsonism has identified 18 affected family members over four generations. Genetic studies have revealed a novel pathogenic LRRK2 mutation c.4309A>C (p.Asn1437His) that co-segregates with disease manifestation (LOD=3.15, θ=0). Affected carriers have an early age at onset (48 ± 7.7 SD years) and are clinically asymmetric and levodopa-responsive. The variant was absent in 623 Norwegian control subjects. Further screening of patients from the same population identified one additional affected carrier (1/692) with familial parkinsonism who shares the same haplotype. The mutation is located within the Roc domain of the protein and enhances GTP-binding and kinase activity, further implicating these activities as the mechanisms that underlie LRRK2-linked parkinsonism.
LRRK2; Parkinson’s disease; genetic; kinase
Mutations in the glucocerebrosidase gene (GBA) have recently been associated with an increased risk of Parkinson disease (PD). GBA mutations have been observed to be particularly prevalent in the Ashkenazi Jewish population. Interestingly, this population also has a high frequency of the Lrrk2 p.G2019S mutation, similar to that seen in North African Arab-Berber populations. Herein, our sequencing of the GBA gene in 33 North African Arab-Berber familial parkinsonism probands identified two novel mutations in three individuals (p.K-26R and p.K186R). Segregation analysis of these two variants did not support a pathogenic role. Genotyping of p.K-26R, p.K186R and the common p.N370S in an ethnically matched series consisting of 395 patients with PD and 372 control subjects did not show a statistically significant association (P>0.05). The p.N370S mutation was identified in only 1 sporadic patient with PD and 3 control subjects, indicating that the frequency of this mutation in the North African Arab-Berber population is much lower than that observed in Ashkenazi Jews, and suggesting that it arose in the latter after expansion of the Lrrk2 p.G2019S variant in North Africa.
Parkinson disease; Gaucher disease; genetics
Recently, a variant in LINGO1 (rs9652490) was found to be associated with increased risk of essential tremor. We set out to replicate this association in an independent case-control series of essential tremor from North America. In addition, given the clinical and pathological overlap between essential tremor and Parkinson disease, we also evaluated the effect of LINGO1 rs9652490 in two case-control series of Parkinson disease. Our study demonstrates a significant association between LINGO1 rs9652490 and both essential tremor (P=0.014) and Parkinson disease (P=0.0003), thus providing the first evidence of a genetic link between the two diseases.
LINGO1; Parkinson disease; essential tremor
A de novo α-synuclein A53T (p.Ala53Thr; c.209G>A) mutation has been identified in a Swedish family with autosomal dominant Parkinson's disease (PD). Two affected individuals had early-onset (before 31 and 40 years), severe levodopa-responsive PD with prominent dysphasia, dysarthria, and cognitive decline. Longitudinal clinical follow-up, EEG, SPECT and CSF biomarker examinations suggested an underlying encephalopathy with cortical involvement. The mutated allele (c.209A) was present within a haplotype different from that shared among mutation carriers in the Italian (Contursi) and the Greek-American Family H kindreds. One unaffected family member carried the mutation haplotype without the c.209A mutation, strongly suggesting its de novo occurrence within this family. Furthermore, a novel mutation c.488G>A (p.Arg163His; R163H) in the presenilin-2 (PSEN2) gene was detected, but was not associated with disease state.
Parkinsonian disorders; Autosomal Dominant Parkinsonism; alpha-Synuclein; Biomarkers
Perry syndrome consists of early-onset parkinsonism, depression, severe weight loss and hypoventilation, in which brain pathology is characterized by TDP-43 immunostaining. Through genome-wide linkage analysis we have identified five disease-segregating dynactin (DCTN1) CAP-Gly domain substitutions in 8 families that diminish microtubule binding and lead to intracytoplasmic inclusions. DCTN1 mutations were previously associated with motor neuron disease but can underlie the selective vulnerability of other neuronal populations in distinct neurodegenerative disorders.
Dynactin; DCTN1; Perry syndrome; parkinsonism; neurodegeneration; TDP-43
Herein, we investigate whether single-nucleotide polymorphisms (SNPs) across the PARK10 locus are associated with susceptibility to Parkinson's disease (PD) or age at onset (AAO) of disease. One hundred and eighty-eight SNPs were genotyped across the PARK10 locus in 180 PD patients and 180 controls from central Norway (stage 1). We then used the linkage disequilibrium (LD) structure from stage 1 to select 75 SNPs for genotyping in 186 patients and 186 controls from Ireland (stage 2). Nineteen SNPs were selected from this and previous studies for follow-up in an extended Norwegian series (530 patients and 1142 controls), the Irish series and a US series (221 patients and 221 controls) (stage 3). After correction for multiple testing, markers within ubiquitin specific peptidase 24 (USP24) are significantly associated with PD in the combined Norwegian, Irish, and US series (rs13312: odds ratio (OR) 0.78, P<0.001; rs487230: OR 0.80, P=0.001). Considered independently, the association for rs13312 is strongest in the extended Norwegian series (OR 0.76, P=0.005), although it is not significant after correction for multiple testing (P≤0.003 is considered significant). ORs in the Irish series are almost identical, and a similar but weaker effect was observed for the US series. No marker showed consistent association with AAO. Our data indicate that genetic variability in USP24 is associated with PD. Although our work extends and confirms a previous report, the observed effect size does not explain the PARK10 linkage peak.
Parkinson's disease; linkage study; association study; risk factors; USP24
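The odds ratios and the multiple-testing threshold quoted in the PARK10 abstract can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the study's analysis: the function names and the 2x2 counts are hypothetical, the confidence interval uses the standard Wald approximation on the log odds ratio, and the abstract does not say which correction method was used. A Bonferroni correction for the 19 follow-up SNPs would give 0.05/19 ≈ 0.0026, which is close to the stated P≤0.003 cut-off.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a = cases carrying the allele,    b = cases not carrying it
    c = controls carrying the allele, d = controls not carrying it
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical counts for illustration only (not from the study).
or_value, ci_low, ci_high = odds_ratio_with_ci(40, 60, 50, 50)
threshold_19_snps = bonferroni_threshold(0.05, 19)
```

An odds ratio below 1, as reported for rs13312 and rs487230, indicates that the tested allele is less frequent in patients than in controls.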
Pathogenic substitutions in the leucine-rich repeat kinase 2 protein (Lrrk2), R1441G and G2019S, are a prevalent cause of autosomal dominant and sporadic Parkinson's disease in the Northern Spanish population. In this study we examined the frequency of these two substitutions in 166 Parkinson's disease patients and 153 controls from Chile, a population with Spanish/European-Amerindian admixture. Lrrk2 R1441G was not observed; however, Lrrk2 G2019S was detected in one familial and four sporadic Parkinson's disease patients. These findings suggest that Lrrk2 G2019S may play an important role in Parkinson's disease on the South American continent, and further studies are now warranted.
LRRK2; Parkinson's disease; mutation; Amerindian