1.  Gateways to the FANTOM5 promoter level mammalian expression atlas 
Genome Biology  2015;16(1):22.
The FANTOM5 project investigates transcription initiation activities in more than 1,000 human and mouse primary cells, cell lines and tissues using CAGE. Based on manual curation of sample information and development of an ontology for sample classification, we assemble the resulting data into a centralized data resource. This resource contains web-based tools and data-access points for the research community to search and extract data related to samples, genes, promoter activities, transcription factors and enhancers across the FANTOM5 atlas.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0560-6) contains supplementary material, which is available to authorized users.
PMCID: PMC4310165
2.  Telomerase Reverse Transcriptase Regulates microRNAs 
MicroRNAs are small non-coding RNAs that inhibit the translation of target mRNAs. In humans, most microRNAs are transcribed by RNA polymerase II as long primary transcripts and processed by sequential cleavage of the two RNase III enzymes, DROSHA and DICER, into precursor and mature microRNAs, respectively. Although the fundamental functions of microRNAs in RNA silencing have been gradually uncovered, less is known about the regulatory mechanisms of microRNA expression. Here, we report that telomerase reverse transcriptase (TERT) extensively affects the expression levels of mature microRNAs. Deep sequencing-based screens of short RNA populations revealed that the suppression of TERT resulted in the downregulation of microRNAs expressed in THP-1 cells and HeLa cells. Primary and precursor microRNA levels were also reduced under the suppression of TERT. Similar results were obtained with the suppression of either BRG1 (also called SMARCA4) or nucleostemin, which are proteins interacting with TERT and functioning beyond telomeres. These results suggest that TERT regulates microRNAs at the very early phases in their biogenesis, presumably through non-telomerase mechanism(s).
PMCID: PMC4307298  PMID: 25569094
telomerase reverse transcriptase; microRNA; RNA-dependent RNA polymerase; cancer
4.  Mesencephalic dopaminergic neurons express a repertoire of olfactory receptors and respond to odorant-like molecules 
BMC Genomics  2014;15(1):729.
The mesencephalic dopaminergic (mDA) cell system is composed of two major groups of projecting cells in the Substantia Nigra (SN) (A9 neurons) and the Ventral Tegmental Area (VTA) (A10 cells). Selective degeneration of A9 neurons occurs in Parkinson’s disease (PD) while abnormal function of A10 cells has been linked to schizophrenia, attention deficit and addiction. The molecular basis that underlies selective vulnerability of A9 and A10 neurons is presently unknown.
By taking advantage of transgenic labeling and laser capture microdissection coupled to nano Cap-Analysis of Gene Expression (nanoCAGE) technology on isolated A9 and A10 cells, we found that a subset of Olfactory Receptors (ORs) is expressed in mDA neurons. Gene expression analysis was integrated with the FANTOM5 Helicos CAGE sequencing datasets, showing the presence of these ORs in selected tissues and brain areas outside of the olfactory epithelium. OR expression in the mesencephalon was validated by RT-PCR and in situ hybridization. By screening 16 potential ligands on 5 mDA ORs recombinantly expressed in a heterologous in vitro system, we identified carvone enantiomers as agonists at Olfr287 that are able to evoke an intracellular Ca2+ increase in solitary mDA neurons. ORs were found to be expressed in human SN and down-regulated in PD post mortem brains.
Our study indicates that mDA neurons express ORs and respond to odor-like molecules providing new opportunities for pharmacological intervention in disease.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-729) contains supplementary material, which is available to authorized users.
PMCID: PMC4161876  PMID: 25164183
NanoCAGE; Odors; Odorant receptors; Dopaminergic neurons; Ventral midbrain
5.  Multiplicity of 5′ Cap Structures Present on Short RNAs 
PLoS ONE  2014;9(7):e102895.
Most RNA molecules are co- or post-transcriptionally modified to alter their chemical and functional properties to assist in their ultimate biological function. Among these modifications, the addition of 5′ cap structure has been found to regulate turnover and localization. Here we report a study of the cap structure of human short (<200 nt) RNAs (sRNAs), using sequencing of cDNA libraries prepared by enzymatic pretreatment of the sRNAs with cap sensitive-specificity, thin layer chromatographic (TLC) analyses of isolated cap structures and mass spectrometric analyses for validation of TLC analyses. Processed versions of snoRNA and tRNA sequences of less than 50 nt were observed in capped sRNA libraries, indicating additional processing and recapping of these annotated sRNA biotypes. We report for the first time 2,7 dimethylguanosine in human sRNA cap structures and, surprisingly, we find multiple type 0 cap structures (mGpppC, 7mGpppG, GpppG, GpppA, and 7mGpppA) in RNA length fractions shorter than 50 nt. Finally, we find additional uncharacterized cap structures that await determination through the creation of the reference compounds needed for TLC analyses. These studies suggest the existence of novel biochemical pathways leading to the processing of primary and sRNAs and the modification of their RNA 5′ ends with a spectrum of chemical modifications.
PMCID: PMC4117478  PMID: 25079783
6.  CAGE- Cap Analysis Gene Expression: a protocol for the detection of promoter and transcriptional networks 
We provide here a protocol for the preparation of cap-analysis gene expression (CAGE) libraries, which allow the expression of eukaryotic capped RNAs to be measured and the promoter regions to be mapped simultaneously. The presented protocol simplifies the previously published ones and, moreover, produces tags that are 27 nucleotides long, which facilitates mapping to the genome. The protocol takes less than 5 days to complete and presents a notable improvement compared to previously published versions.
PMCID: PMC4094367  PMID: 21938627
Cap-analysis gene expression; RNAseq; transcriptome; sequencing; RNA
7.  5’ end-centered expression profiling using Cap-analysis gene expression (CAGE) and next-generation sequencing 
Nature protocols  2012;7(3):542-561.
Cap-Analysis gene expression (CAGE) provides accurate high-throughput measurement of RNA expression. CAGE allows mapping of all the initiation sites of both capped coding and noncoding RNAs. In addition, transcriptional start sites (TSSs) within promoters are characterized at single nucleotide resolution. The latter allows the regulatory inputs driving gene expression to be studied, which in turn enables the construction of transcriptional networks. Here we provide an optimized protocol for the construction of CAGE libraries based on the preparation of 27 nucleotide (nt) long tags corresponding to initial bases at the 5’ ends of capped RNAs. We have optimized the methods using simple steps based on filtration, which altogether takes 4 days to complete. The CAGE tags can be readily sequenced with Illumina sequencers and upon modification, they are also amenable to sequencing using other platforms.
PMCID: PMC4094379  PMID: 22362160
Cap-analysis gene expression (CAGE); transcriptome; promoter; sequencing; RNA
8.  Chromatin-associated RNAi components contribute to transcriptional regulation in Drosophila 
Nature  2011;480(7377):391-395.
RNAi pathways have evolved as important modulators of gene expression that act in the cytoplasm by degrading RNA target molecules via the activity of short (21-30 nt) RNAs1-6. RNAi components have been reported to play a role in the nucleus as they are involved in epigenetic regulation and heterochromatin formation7-10. However, although RNAi-mediated post-transcriptional silencing (PTGS) is well documented, mechanisms of RNAi-mediated transcriptional gene silencing (TGS) and in particular the role of RNAi components in chromatin, especially in higher eukaryotes, are still elusive. Here we show that key RNAi components Dicer-2 (Dcr2) and Argonaute-2 (AGO2) associate with chromatin, with strong preference for euchromatic, transcriptionally active loci, and interact with core transcription machinery. Notably, Dcr2 and AGO2 loss of function show that transcriptional defects are accompanied by perturbation of Pol II positioning on promoters. Further, both Dcr2 and Ago2 null mutations as well as missense mutations compromising the RNAi activity impair global Pol II dynamics upon heat shock. Finally, AGO2 RIP-seq experiments reveal that AGO2 is strongly enriched in small RNAs encompassing promoter as well as other parts of heat shock and other gene loci on both sense and antisense, with a strong bias for antisense, particularly after heat shock. Taken together, our results reveal a new scenario in which Dcr2 and AGO2 are globally associated with transcriptionally active loci and may play a pivotal role in shaping the transcriptome by controlling RNA Pol II processivity.
PMCID: PMC4082306  PMID: 22056986
9.  MOIRAI: a compact workflow system for CAGE analysis 
BMC Bioinformatics  2014;15:144.
Cap analysis of gene expression (CAGE) is a sequencing-based technology to capture the 5’ ends of RNAs in a biological sample. After mapping, a CAGE peak on the genome indicates the position of an active transcriptional start site (TSS) and the number of reads corresponds to its expression level. CAGE is prominently used in both the FANTOM and ENCODE projects but presently there is no software package to perform the essential data processing steps.
Here we describe MOIRAI, a compact yet flexible workflow system designed to carry out the main steps in data processing and analysis of CAGE data. MOIRAI has a graphical interface allowing wet-lab researchers to create, modify and run analysis workflows. Embedded within the workflows are graphical quality control indicators allowing users to assess data quality and to quickly spot potential problems. We describe three main workflows allowing users to map, annotate and perform an expression analysis over multiple samples.
Due to the many built-in quality control features, MOIRAI is especially suitable to support the development of new sequencing-based protocols.
The MOIRAI source code is freely available at
PMCID: PMC4033680  PMID: 24884663
CAGE; Pipeline; Next generation sequencing
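The statement above that a CAGE peak marks an active TSS and that its read count reflects expression can be illustrated with a minimal sketch. This is not MOIRAI code: the function names, the tuple layout of a read, and the 0-based half-open coordinate convention are all assumptions made for illustration.

```python
from collections import Counter


def ctss_counts(reads):
    """Count mapped reads by the position of their 5' end.

    `reads` is an iterable of (chrom, start, end, strand) tuples using
    0-based, half-open coordinates (an assumed, BED-like convention).
    """
    counts = Counter()
    for chrom, start, end, strand in reads:
        # The 5' end is the leftmost base on "+" and the rightmost on "-".
        five_prime = start if strand == "+" else end - 1
        counts[(chrom, five_prime, strand)] += 1
    return counts


def tags_per_million(counts):
    """Scale raw counts to tags per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {site: n * 1e6 / total for site, n in counts.items()}


# Hypothetical 27 nt tags, invented for this example.
example_reads = [
    ("chr1", 100, 127, "+"),
    ("chr1", 100, 127, "+"),
    ("chr1", 105, 132, "+"),
    ("chr1", 90, 117, "-"),
]
example_counts = ctss_counts(example_reads)
example_tpm = tags_per_million(example_counts)
```

Normalizing to tags per million is one common way to compare samples sequenced to different depths; a real workflow would also filter by mapping quality before counting.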
10.  RECLU: a pipeline to discover reproducible transcriptional start sites and their alternative regulation using capped analysis of gene expression (CAGE) 
BMC Genomics  2014;15:269.
Next generation sequencing based technologies are being extensively used to study transcriptomes. Among these, cap analysis of gene expression (CAGE) is specialized in detecting the most 5’ ends of RNA molecules. After mapping the sequenced reads back to a reference genome, CAGE data highlight the transcriptional start sites (TSSs) and their usage at single nucleotide resolution.
We propose a pipeline to group the single nucleotide TSSs into larger reproducible peaks and compare their usage across biological states. Importantly, our pipeline discovers broad peaks as well as the fine structure of individual transcriptional start sites embedded within them. We assess the performance of our approach on a large CAGE dataset including 156 primary cell types and two cell lines with biological replicas. We demonstrate that genes have complicated structures of transcription initiation events. In particular, we discover that narrow peaks embedded in broader regions of transcriptional activity can be differentially used even if the larger region is not.
By examining the reproducible fine scaled organization of TSS we can detect many differentially regulated peaks undetected by previous approaches.
PMCID: PMC4029093  PMID: 24779366
CAGE; Peak finding; Reproducibility; Hierarchical stability
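To make the idea of grouping single-nucleotide TSSs into peaks concrete, here is a deliberately simple sketch that merges positions lying within a fixed distance of each other. It is a swapped-in distance-based clustering, not the reproducibility-based method of RECLU itself; the function names, the `max_gap` parameter and its default of 20 nt are assumptions.

```python
def summarise(members, positions):
    """Return (start, end, total_count, dominant_position) for one peak."""
    total = sum(positions[p] for p in members)
    # The dominant TSS is the position with the highest count.
    dominant = max(members, key=lambda p: positions[p])
    return (members[0], members[-1], total, dominant)


def cluster_tss(positions, max_gap=20):
    """Group TSS positions on one chromosome strand into peaks.

    `positions` maps a genomic position to its read count. Neighbouring
    positions are merged while the gap between them is at most `max_gap`.
    """
    peaks = []
    current = []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            peaks.append(summarise(current, positions))
            current = []
        current.append(pos)
    if current:
        peaks.append(summarise(current, positions))
    return peaks


# Hypothetical counts, invented for this example.
example_positions = {100: 5, 103: 12, 110: 2, 200: 7}
example_peaks = cluster_tss(example_positions, max_gap=20)
```

A fixed gap cannot distinguish a narrow peak embedded in a broader region from the region itself, which is the kind of fine structure the abstract says the pipeline is designed to recover.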
11.  Chromatin states reveal functional associations for globally defined transcription start sites in four human cell lines 
BMC Genomics  2014;15:120.
Deciphering the most common modes by which chromatin regulates transcription, and how this is related to cellular status and processes is an important task for improving our understanding of human cellular biology. The FANTOM5 and ENCODE projects represent two independent large scale efforts to map regulatory and transcriptional features to the human genome. Here we investigate chromatin features around a comprehensive set of transcription start sites in four cell lines by integrating data from these two projects.
Transcription start sites can be distinguished by chromatin states defined by specific combinations of both chromatin mark enrichment and the profile shapes of these chromatin marks. The observed patterns can be associated with cellular functions and processes, and they also show association with expression level, location relative to nearby genes, and CpG content. In particular we find a substantial number of repressed inter- and intra-genic transcription start sites enriched for active chromatin marks and Pol II, and these sites are strongly associated with immediate-early response processes and cell signaling. Associations between start sites with similar chromatin patterns are validated by significant correlations in their global expression profiles.
The results confirm the link between chromatin state and cellular function for expressed transcripts, and also indicate that active chromatin states at repressed transcripts may poise transcripts for rapid activation during immune response.
PMCID: PMC3986914  PMID: 24669905
Fantom; Encode; Cage; Transcription start sites; Chromatin states; Gene expression
12.  NanoCAGE analysis of the mouse olfactory epithelium identifies the expression of vomeronasal receptors and of proximal LINE elements 
By coupling laser capture microdissection to nanoCAGE technology and next-generation sequencing we have identified the genome-wide collection of active promoters in the mouse Main Olfactory Epithelium (MOE). Transcription start sites (TSSs) for the large majority of Olfactory Receptors (ORs) have been previously mapped increasing our understanding of their promoter architecture. Here we show that in our nanoCAGE libraries of the mouse MOE we detect a large number of tags mapped in loci hosting Type-1 and Type-2 Vomeronasal Receptors genes (V1Rs and V2Rs). These loci also show a massive expression of Long Interspersed Nuclear Elements (LINEs). We have validated the expression of selected receptors detected by nanoCAGE with in situ hybridization, RT-PCR and qRT-PCR. This work extends the repertory of receptors capable of sensing chemical signals in the MOE, suggesting intriguing interplays between MOE and VNO for pheromone processing and positioning transcribed LINEs as candidate regulatory RNAs for VRs expression.
PMCID: PMC3927265  PMID: 24600346
vomeronasal receptors; main olfactory epithelium; vomeronasal organ; VNO; MOE; V1Rs; V2Rs
13.  NMDA Receptor Regulation Prevents Regression of Visual Cortical Function in the Absence of Mecp2 
Neuron  2012;76(6):1078-1090.
Brain function is shaped by postnatal experience and vulnerable to disruption of Methyl-CpG-binding protein, Mecp2, in multiple neurodevelopmental disorders. How Mecp2 contributes to the experience-dependent refinement of specific cortical circuits and their impairment remains unknown. We analyzed vision in gene-targeted mice and observed an initial normal development in the absence of Mecp2. Visual acuity then rapidly regressed after postnatal day P35–40 and cortical circuits largely fell silent by P55-60. Enhanced inhibitory gating and an excess of parvalbumin-positive, perisomatic input preceded the loss of vision. Both cortical function and inhibitory hyperconnectivity were strikingly rescued independent of Mecp2 by early sensory deprivation or genetic deletion of the excitatory NMDA receptor subunit, NR2A. Thus, vision is a sensitive biomarker of progressive cortical dysfunction and may guide novel, circuit-based therapies for Mecp2 deficiency.
PMCID: PMC3733788  PMID: 23259945
14.  A comprehensive promoter landscape identifies a novel promoter for CD133 in restricted tissues, cancers, and stem cells 
Frontiers in Genetics  2013;4:209.
PROM1 is the gene encoding prominin-1 or CD133, an important cell surface marker for the isolation of both normal and cancer stem cells. PROM1 transcripts initiate at a range of transcription start sites (TSS) associated with distinct tissue and cancer expression profiles. Using high resolution Cap Analysis of Gene Expression (CAGE) sequencing we characterize TSS utilization across a broad range of normal and developmental tissues. We identify a novel proximal promoter (P6) within CD133+ melanoma cell lines and stem cells. Additional exon array sampling finds P6 to be active in populations enriched for mesenchyme, neural stem cells and within CD133+ enriched Ewing sarcomas. The P6 promoter is enriched with respect to previously characterized PROM1 promoters for a HMGI/Y (HMGA1) family transcription factor binding site motif and exhibits different epigenetic modifications relative to the canonical promoter region of PROM1.
PMCID: PMC3810939  PMID: 24194746
PROM1 protein; human; AC133 antigen; transcription start site; promoter regions; genetic; melanoma; cancer stem cells
15.  Temporal dynamics and transcriptional control using single-cell gene expression analysis 
Genome Biology  2013;14(10):R118.
Changes in environmental conditions lead to expression variation that manifests at the level of gene regulatory networks. Despite a strong understanding of the role noise plays in synthetic biological systems, it remains unclear how propagation of expression heterogeneity in an endogenous regulatory network is distributed and utilized by cells transitioning through a key developmental event.
Here we investigate the temporal dynamics of a single-cell transcriptional network of 45 transcription factors in THP-1 human myeloid monocytic leukemia cells undergoing differentiation to macrophages. We systematically measure temporal regulation of expression and variation by profiling 120 single cells at eight distinct time points, and infer highly controlled regulatory modules through which signaling operates with stochastic effects. This reveals dynamic and specific rewiring as a cellular strategy for differentiation. The integration of both positive and negative co-expression networks further identifies the proto-oncogene MYB as a network hinge to modulate both the pro- and anti-differentiation pathways.
Compared to averaged cell populations, temporal single-cell expression profiling provides a much more powerful technique to probe for mechanistic insights underlying cellular differentiation. We believe that our approach will form the basis of novel strategies to study the regulation of transcription at a single-cell level.
PMCID: PMC4015031  PMID: 24156252
16.  Comparison of RNA- or LNA-hybrid oligonucleotides in template-switching reactions for high-speed sequencing library preparation 
BMC Genomics  2013;14:665.
Analyzing the RNA pool or transcription start sites requires effective means to convert RNA into cDNA libraries for digital expression counting. With current high-speed sequencers, it is necessary to flank the cDNAs with specific adapters. Adding template-switching oligonucleotides to reverse transcription reactions is the most commonly used approach when working with very small quantities of RNA even from single cells.
Here we compared the performance of DNA-RNA, DNA-LNA and DNA oligonucleotides in template-switching during nanoCAGE library preparation. Test libraries from rat muscle and HeLa cell RNA were prepared in technical triplicates and sequenced for comparison of the gene coverage and distribution of the reads within transcripts. The DNA-RNA oligonucleotide showed the highest specificity for capped 5′ ends of mRNA, whereas the DNA-LNA provided similar gene coverage with more reads falling within exons.
While confirming the cap-specific preference of DNA-RNA oligonucleotides in template-switching reactions, our data indicate that DNA-LNA hybrid oligonucleotides could potentially find other applications in random RNA sequencing.
PMCID: PMC3853366  PMID: 24079827
CAGE; Template-switching; LNA; Transcriptome; Quantitative sequencing
17.  Trehalose-enhanced isolation of neuronal sub-types from adult mouse brain 
BioTechniques  2012;52(6):381-385.
Efficient isolation of specific, intact, living neurons from the adult brain is problematic due to the complex nature of the extracellular matrix consolidating the neuronal network. Here, we present significant improvements to the protocol for isolation of pure populations of neurons from mature postnatal mouse brain using fluorescence activated cell sorting (FACS). The 10-fold increase in cell yield enables cell-specific transcriptome analysis by protocols such as nano-CAGE and RNA seq.
PMCID: PMC3696583  PMID: 22668417
FACS; parvalbumin; pyramidal; nanoCAGE; RNA seq
18.  Landscape of transcription in human cells 
Djebali, Sarah | Davis, Carrie A. | Merkel, Angelika | Dobin, Alex | Lassmann, Timo | Mortazavi, Ali M. | Tanzer, Andrea | Lagarde, Julien | Lin, Wei | Schlesinger, Felix | Xue, Chenghai | Marinov, Georgi K. | Khatun, Jainab | Williams, Brian A. | Zaleski, Chris | Rozowsky, Joel | Röder, Maik | Kokocinski, Felix | Abdelhamid, Rehab F. | Alioto, Tyler | Antoshechkin, Igor | Baer, Michael T. | Bar, Nadav S. | Batut, Philippe | Bell, Kimberly | Bell, Ian | Chakrabortty, Sudipto | Chen, Xian | Chrast, Jacqueline | Curado, Joao | Derrien, Thomas | Drenkow, Jorg | Dumais, Erica | Dumais, Jacqueline | Duttagupta, Radha | Falconnet, Emilie | Fastuca, Meagan | Fejes-Toth, Kata | Ferreira, Pedro | Foissac, Sylvain | Fullwood, Melissa J. | Gao, Hui | Gonzalez, David | Gordon, Assaf | Gunawardena, Harsha | Howald, Cedric | Jha, Sonali | Johnson, Rory | Kapranov, Philipp | King, Brandon | Kingswood, Colin | Luo, Oscar J. | Park, Eddie | Persaud, Kimberly | Preall, Jonathan B. | Ribeca, Paolo | Risk, Brian | Robyr, Daniel | Sammeth, Michael | Schaffer, Lorian | See, Lei-Hoon | Shahab, Atif | Skancke, Jorgen | Suzuki, Ana Maria | Takahashi, Hazuki | Tilgner, Hagen | Trout, Diane | Walters, Nathalie | Wang, Huaien | Wrobel, John | Yu, Yanbao | Ruan, Xiaoan | Hayashizaki, Yoshihide | Harrow, Jennifer | Gerstein, Mark | Hubbard, Tim | Reymond, Alexandre | Antonarakis, Stylianos E. | Hannon, Gregory | Giddings, Morgan C. | Ruan, Yijun | Wold, Barbara | Carninci, Piero | Guigó, Roderic | Gingeras, Thomas R.
Nature  2012;489(7414):101-108.
Eukaryotic cells make many types of primary and processed RNAs that are found either in specific sub-cellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic sub-cellular localizations are also poorly understood. Since RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell’s regulatory capabilities are focused on its synthesis, processing, transport, modifications and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations taken together prompt a redefinition of the concept of a gene.
PMCID: PMC3684276  PMID: 22955620
19.  Endogenous Retrotransposition Activates Oncogenic Pathways in Hepatocellular Carcinoma 
Cell  2013;153(1):101-111.
LINE-1 (L1) retrotransposons are mobile genetic elements comprising ∼17% of the human genome. New L1 insertions can profoundly alter gene function and cause disease, though their significance in cancer remains unclear. Here, we applied enhanced retrotransposon capture sequencing (RC-seq) to 19 hepatocellular carcinoma (HCC) genomes and elucidated two archetypal L1-mediated mechanisms enabling tumorigenesis. In the first example, 4/19 (21.1%) donors presented germline retrotransposition events in the tumor suppressor mutated in colorectal cancers (MCC). MCC expression was ablated in each case, enabling oncogenic β-catenin/Wnt signaling. In the second example, suppression of tumorigenicity 18 (ST18) was activated by a tumor-specific L1 insertion. Experimental assays confirmed that the L1 interrupted a negative feedback loop by blocking ST18 repression of its enhancer. ST18 was also frequently amplified in HCC nodules from Mdr2−/− mice, supporting its assignment as a candidate liver oncogene. These proof-of-principle results substantiate L1-mediated retrotransposition as an important etiological factor in HCC.
► L1 retrotransposons promote tumorigenesis in hepatocellular carcinoma (HCC)
► Germline L1 and Alu insertions in MCC activate β-catenin/Wnt signaling
► L1 mobilization in tumor cells accelerates transformation of the HCC genome
► A tumor-specific L1 insertion interrupts a negative feedback loop regulating ST18
L1 retrotransposons, which are widespread in the human genome, can mobilize and activate oncogenes in the livers of individuals infected with the hepatitis B or hepatitis C virus, promoting the development and growth of hepatocellular carcinoma. Genes identified by the L1 insertions present new options for cancer screening and intervention.
PMCID: PMC3898742  PMID: 23540693
20.  Site-specific DICER and DROSHA RNA products control the DNA damage response 
Nature  2012;488(7410):231-235.
Non-coding RNAs (ncRNAs) are involved in an increasing number of cellular events1. Some ncRNAs are processed by DICER and DROSHA ribonucleases to give rise to small double-stranded RNAs involved in RNA interference (RNAi)2. The DNA-damage response (DDR) is a signaling pathway that originates from the DNA lesion and arrests cell proliferation3. So far, DICER or DROSHA RNA products have not been reported to control DDR activation. Here we show that DICER and DROSHA, but not downstream elements of the RNAi pathway, are necessary to activate DDR upon oncogene-induced genotoxic stress and exogenous DNA damage, as studied also by DDR foci formation in mammalian cells and zebrafish and by checkpoint assays. DDR foci are sensitive to RNase A treatment, and DICER- and DROSHA-dependent RNA products are required to restore DDR foci in treated cells. Through RNA deep sequencing and studies of DDR activation at an inducible unique DNA double-strand break (DSB), we demonstrate that DDR foci formation requires site-specific DICER- and DROSHA-dependent small RNAs, named DDRNAs, which act in a MRE11-RAD50-NBS1 (MRN) complex-dependent manner. Chemically synthesized or in vitro-generated by DICER cleavage, DDRNAs are sufficient to restore DDR in RNase A-treated cells, also in the absence of other cellular RNAs. Our results describe an unanticipated direct role of a novel class of ncRNAs in the control of DDR activation at sites of DNA damage.
PMCID: PMC3442236  PMID: 22722852
DICER; DROSHA; small non coding RNAs; DNA damage response (DDR); ATM; cellular senescence; zebrafish
21.  piRNAs Warrant Investigation in Rett Syndrome: An Omics Perspective 
Disease markers  2012;33(5):261-275.
Mutations in the MECP2 gene are found in a large proportion of girls with Rett Syndrome. Despite extensive research, the principal role of MeCP2 protein remains elusive. Is MeCP2 a regulator of genes, acting in concert with co-activators and co-repressors, predominantly as an activator of target genes, or is it a methyl CpG binding protein acting globally to change the chromatin state and to suppress transcription from repeat elements? If MeCP2 has no specific targets in the genome, what causes the differential expression of specific genes in the Mecp2 knockout mouse brain? We discuss the discrepancies in current data and propose a hypothesis to reconcile some differences in the two viewpoints. Since transcripts from repeat elements contribute to piRNA biogenesis, we propose that piRNA levels may be higher in the absence of MeCP2 and that increased piRNA levels may contribute to the mis-regulation of some genes seen in the Mecp2 knockout mouse brain. We provide preliminary data showing an increase in piRNAs in the Mecp2 knockout mouse cerebellum. Our investigation suggests that global piRNA levels may be elevated in the Mecp2 knockout mouse cerebellum and strongly supports further investigation of piRNAs in Rett syndrome.
PMCID: PMC3810717  PMID: 22976001
Rett Syndrome; MeCP2; piRNAs; LINE 1; short RNAs
22.  Suppression of artifacts and barcode bias in high-throughput transcriptome analyses utilizing template switching 
Nucleic Acids Research  2012;41(3):e44.
Template switching (TS) is an inherent mechanism of reverse transcriptase, which has been exploited in several transcriptome analysis methods, such as CAGE, RNA-Seq and short RNA sequencing. TS is an attractive option, given the simplicity of the protocol, which does not require an adaptor mediated step and thus minimizes sample loss. As such, it has been used in several studies that deal with limited amounts of RNA, such as in single cell studies. Additionally, TS has also been used to introduce DNA barcodes or indexes into different samples, cells or molecules. This labeling allows one to pool several samples into one sequencing flow cell, increasing the data throughput of sequencing and taking advantage of the increasing throughput of current sequencers. Here, we report TS artifacts that form owing to a process called strand invasion. Due to the way in which barcodes/indexes are introduced by TS, strand invasion becomes more problematic by introducing unsystematic biases. We describe a strategy that eliminates these artifacts in silico and propose an experimental solution that suppresses biases from TS.
PMCID: PMC3562004  PMID: 23180801
23.  Somatic retrotransposition alters the genetic landscape of the human brain 
Nature  2011;479(7374):534-537.
Retrotransposons are mobile genetic elements that employ a germ line “copy-and-paste” mechanism to spread throughout metazoan genomes1. At least 50% of the human genome is derived from retrotransposons, with three active families (L1, Alu and SVA) associated with insertional mutagenesis and disease2-3. Epigenetic and post-transcriptional suppression block retrotransposition in somatic cells4-5, excluding early embryo development and some malignancies6-7. Recent reports of L1 expression8-9 and copy number variation10-11 (CNV) in the human brain suggest L1 mobilization may also occur during later development. However, the corresponding integration sites have not been mapped. Here we apply a high-throughput method to identify numerous L1, Alu and SVA germ line mutations, as well as 7,743 putative somatic L1 insertions in the hippocampus and caudate nucleus of three individuals. Surprisingly, we also found 13,692 and 1,350 somatic Alu and SVA insertions, respectively. Our results demonstrate that retrotransposons mobilize to protein-coding genes differentially expressed and active in the brain. Thus, somatic genome mosaicism driven by retrotransposition may reshape the genetic circuitry that underpins normal and abnormal neurobiological processes.
PMCID: PMC3224101  PMID: 22037309
24.  Automated Workflow for Preparation of cDNA for Cap Analysis of Gene Expression on a Single Molecule Sequencer 
PLoS ONE  2012;7(1):e30809.
Cap analysis of gene expression (CAGE) is a 5′ sequence tag technology to globally determine transcriptional starting sites in the genome and their expression levels and has most recently been adapted to the HeliScope single molecule sequencer. Despite significant simplifications in the CAGE protocol, it has until now been a labour intensive protocol.
In this study we set out to adapt the protocol to a robotic workflow, which would increase throughput and reduce handling. The automated CAGE cDNA preparation system we present here can prepare 96 ‘HeliScope ready’ CAGE cDNA libraries in 8 days, as opposed to 6 weeks by a manual operator. We compare the results obtained using the same RNA in manual libraries and across multiple automation batches to assess reproducibility.
We show that the sequencing was highly reproducible and comparable to manual libraries with an 8-fold increase in productivity. The automated CAGE cDNA preparation system can prepare 96 CAGE sequencing samples simultaneously. Finally, we discuss how the system could be used for CAGE on Illumina/SOLiD platforms, RNA-seq and full-length cDNA generation.
PMCID: PMC3268765  PMID: 22303458
