Results 1-14 (14)
1.  Subcellular RNA Sequencing Reveals Broad Presence of Cytoplasmic Intron-Sequence Retaining Transcripts in Mouse and Rat Neurons 
PLoS ONE  2013;8(10):e76194.
Recent findings have revealed the complexity of the transcriptional landscape in mammalian cells. One recently described class of novel transcripts is the Cytoplasmic Intron-sequence Retaining Transcripts (CIRTs), hypothesized to confer post-transcriptional regulatory function. For instance, the neuronal CIRT KCNMA1i16 contributes to the firing properties of hippocampal neurons. Intronic sub-sequence retention within IL1-β mRNA in anucleate platelets has been implicated in activity-dependent splicing and translation. In a recent study, we showed that CIRTs harbor functional SINE ID elements, which are hypothesized to mediate dendritic localization in neurons. Based on these studies and others, we hypothesized that CIRTs may be present in a broad set of transcripts and comprise novel signals for post-transcriptional regulation. We carried out a transcriptome-wide survey of CIRTs by sequencing micro-dissected subcellular RNA fractions. We sequenced two batches of 150-300 individually dissected dendrites from primary cultures of hippocampal neurons in rat and three batches from mouse hippocampal neurons. After statistical processing to minimize artifacts, we found a broad prevalence of CIRTs in the neurons in both species (44-60% of the expressed transcripts). The sequence patterns, including stereotypical length, biased inclusion of specific introns, and intron-intron junctions, suggested CIRT-specific nuclear processing. Our analysis also suggested that these cytoplasmic intron-sequence retaining transcripts may serve as a primary transcript for ncRNAs. Our results show that retaining intronic sequences is not isolated to a few loci but may be a genome-wide phenomenon for embedding functional signals within certain mRNA. The results suggest a novel source of cis-sequences for post-transcriptional regulation.
Our results suggest two potentially novel splicing pathways: one within the nucleus for CIRT biogenesis, and another within the cytoplasm for removing CIRT sequences before translation. We also speculate that release of CIRT sequences prior to translation may form RNA-based signals within the cell, potentially comprising a novel class of signaling pathways.
doi:10.1371/journal.pone.0076194
PMCID: PMC3789819  PMID: 24098440
2.  Quantitative biology of single neurons 
The building blocks of complex biological systems are single cells. Fundamental insights gained from single-cell analysis promise to provide the framework for understanding normal biological systems development as well as the limits on systems/cellular ability to respond to disease. The interplay of cells to create functional systems is not well understood. Until recently, the study of single cells has concentrated primarily on morphological and physiological characterization. With the application of new highly sensitive molecular and genomic technologies, the quantitative biochemistry of single cells is now accessible.
doi:10.1098/rsif.2012.0417
PMCID: PMC3481569  PMID: 22915636
quantitative biology; single neurons; single cells; transcriptomics; proteomics; splicing
3.  Evolutionary genomics of host-use in bifurcating demes of RNA virus phi-6 
Background
Viruses are exceedingly diverse in their evolved strategies to manipulate hosts for viral replication. However, despite these differences, most virus populations will occasionally experience two commonly-encountered challenges: growth in variable host environments, and growth under fluctuating population sizes. We used the segmented RNA bacteriophage ϕ6 as a model for studying the evolutionary genomics of virus adaptation in the face of host switches and parametrically varying population sizes. To do so, we created a bifurcating deme structure that reflected lineage splitting in natural populations, allowing us to test whether phylogenetic algorithms could accurately resolve this ‘known phylogeny’. The resulting tree yielded 32 clones at the tips and internal nodes; these strains were fully sequenced and measured for phenotypic changes in selected traits (fitness on original and novel hosts).
Results
We observed that RNA segment size was negatively correlated with the extent of molecular change in the imposed treatments; molecular substitutions tended to cluster on the Small and Medium RNA chromosomes of the virus, and not on the Large segment. Our study yielded a very large molecular and phenotypic dataset, fostering possible inferences on genotype-phenotype associations. Using further experimental evolution, we confirmed an inference on the unanticipated role of an allelic switch in a viral assembly protein, which governed viral performance across host environments.
Conclusions
Our study demonstrated that varying complexities can be simultaneously incorporated into experimental evolution to examine the combined effects of population size and adaptation in novel environments. The imposed bifurcating structure revealed that some methods for phylogenetic reconstruction failed to resolve the true phylogeny, owing to a paucity of molecular substitutions separating the RNA viruses that evolved in our study.
doi:10.1186/1471-2148-12-153
PMCID: PMC3495861  PMID: 22913547
Adaptation; Bacteria; Bacteriophage; Experimental evolution; Known phylogeny; Pseudomonas; Virus
4.  Cytoplasmic intron sequence-retaining transcripts (CIRTs) can be dendritically targeted via ID element retrotransposons 
Neuron  2011;69(5):877-884.
RNA precursors give rise to mRNA after splicing of intronic sequences traditionally thought to occur in the nucleus. Here, we show that intron sequences are retained in a number of dendritically-targeted mRNAs, using microarray and Illumina sequencing of isolated dendritic mRNA as well as in situ hybridization. Many of the retained introns contain ID elements, a class of SINE retrotransposon. A portion of these SINEs confers dendritic targeting to exogenous and endogenous transcripts, showing the necessity of ID-mediated mechanisms for the targeting of different transcripts to dendrites. ID elements are capable of selectively altering the distribution of endogenous proteins, providing a link between intronic SINEs and protein function. As such, the ID element represents the first common dendritic targeting element to be found across multiple RNAs. Retention of intronic sequence is a more general phenomenon than previously thought and plays a functional role in the biology of the neuron, partly mediated by co-opted repetitive sequences.
doi:10.1016/j.neuron.2011.02.028
PMCID: PMC3065018  PMID: 21382548
5.  Sniper: improved SNP discovery by multiply mapping deep sequenced reads 
Genome Biology  2011;12(6):R55.
SNP (single nucleotide polymorphism) discovery using next-generation sequencing data remains difficult primarily because of redundant genomic regions, such as interspersed repetitive elements and paralogous genes, present in all eukaryotic genomes. To address this problem, we developed Sniper, a novel multi-locus Bayesian probabilistic model and a computationally efficient algorithm that explicitly incorporates sequence reads that map to multiple genomic loci. Our model fully accounts for sequencing error, template bias, and multi-locus SNP combinations, maintaining high sensitivity and specificity under a broad range of conditions. An implementation of Sniper is freely available at http://kim.bio.upenn.edu/software/sniper.shtml.
doi:10.1186/gb-2011-12-6-r55
PMCID: PMC3218843  PMID: 21689413
6.  RNA: State Memory and Mediator of Cellular Phenotype 
Trends in cell biology  2010;20(6):311-318.
It has become increasingly clear that the genome is dynamic and exquisitely sensitive, changing expression patterns in response to age, environmental stimuli and pharmacological and physiological manipulations. Similarly, cellular phenotype, traditionally viewed as a stable end-state, should be viewed as versatile and changeable. The phenotype of a cell is better defined as a “homeostatic phenotype” implying plasticity resulting from a dynamically-changing yet characteristic pattern of gene/protein expression. A stable change in phenotype is the result of the movement of a cell between different multi-dimensional identity spaces. Here, we describe a key driver of this transition and the stabilizer of phenotype: the relative abundances of the cellular RNAs. We argue that the quantitative state of RNA can be likened to a state memory, that when transferred between cells, alters the phenotype in a predictable manner.
doi:10.1016/j.tcb.2010.03.003
PMCID: PMC2892202  PMID: 20382532
7.  Heterochronic evolution reveals modular timing changes in budding yeast transcriptomes 
Genome Biology  2010;11(10):R105.
Background
Gene expression is a dynamic trait, and the evolution of gene regulation can dramatically alter the timing of gene expression without greatly affecting mean expression levels. Moreover, modules of co-regulated genes may exhibit coordinated shifts in expression timing patterns during evolutionary divergence. Here, we examined transcriptome evolution in the dynamical context of the budding yeast cell-division cycle, to investigate the extent of divergence in expression timing and the regulatory architecture underlying timing evolution.
Results
Using a custom microarray platform, we obtained 378 measurements for 6,263 genes over 18 timepoints of the cell-division cycle in nine strains of S. cerevisiae and one strain of S. paradoxus. Most genes show significant divergence in expression dynamics at all scales of transcriptome organization, suggesting broad potential for timing changes. A model test comparing expression level evolution versus timing evolution revealed a better fit with timing evolution for 82% of genes. Analysis of shared patterns of timing evolution suggests the existence of seven dynamically-autonomous modules, each of which shows coherent evolutionary timing changes. Analysis of transcription factors associated with these gene modules suggests a modular pleiotropic source of divergence in expression timing.
Conclusions
We propose that transcriptome evolution may generally entail changes in timing (heterochrony) rather than changes in levels (heterometry) of expression. Evolution of gene expression dynamics may involve modular changes in timing control mediated by module-specific transcription factors. We hypothesize that genome-wide gene regulation may utilize a general architecture comprised of multiple semi-autonomous event timelines, whose superposition could produce combinatorial complexity in timing control patterns.
doi:10.1186/gb-2010-11-10-r105
PMCID: PMC3218661  PMID: 20969771
8.  Translation of sensory input into behavioral output via an olfactory system 
Neuron  2008;59(1):110-124.
Summary
We investigate the logic by which sensory input is translated into behavioral output. First we provide a functional analysis of the entire odor receptor repertoire of an olfactory system. We construct tuning curves for the 21 functional odor receptors of the Drosophila larva, and show that they sharpen at lower odor doses. We construct a 21-dimensional odor space from the responses of the receptors and find that the distance between two odors correlates with the extent to which one odor masks the other. Mutational analysis shows that different receptors mediate the responses to different concentrations of an odorant. The summed response of the entire receptor repertoire correlates with the strength of the behavioral response. The activity of a small number of receptors is a surprisingly powerful predictor of behavior. Odors that inhibit more receptors are more likely to be repellents. Odor space is largely conserved between two dissimilar olfactory systems.
doi:10.1016/j.neuron.2008.06.010
PMCID: PMC2496968  PMID: 18614033
Olfaction; Drosophila; odor receptor; larva; behavior
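The "odor space" described in the abstract above can be illustrated with a small sketch. It assumes each odor is represented as a vector of 21 receptor responses and that distance is Euclidean; the abstract does not state which metric the authors used, so both the metric and the function names here are illustrative assumptions, and the numbers in the usage note are made up.

```python
import math


def odor_distance(responses_a, responses_b):
    """Euclidean distance between two odors in receptor-response space.

    Each argument is a sequence of per-receptor responses (for example,
    21 values, one per functional receptor). The Euclidean metric is an
    assumption for illustration, not necessarily the authors' choice.
    """
    if len(responses_a) != len(responses_b):
        raise ValueError("response vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(responses_a, responses_b)))


def summed_response(responses):
    """Total response across the whole receptor repertoire."""
    return sum(responses)
```

For example, with hypothetical response vectors `a = [3] + [0] * 20` and `b = [0, 4] + [0] * 19`, `odor_distance(a, b)` is 5.0 and `summed_response(a)` is 3.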
9.  Genome-Wide Analyses of Exonic Copy Number Variants in a Family-Based Study Point to Novel Autism Susceptibility Genes 
PLoS Genetics  2009;5(6):e1000536.
The genetics underlying the autism spectrum disorders (ASDs) is complex and remains poorly understood. Previous work has demonstrated an important role for structural variation in a subset of cases, but has lacked the resolution necessary to move beyond detection of large regions of potential interest to identification of individual genes. To pinpoint genes likely to contribute to ASD etiology, we performed high density genotyping in 912 multiplex families from the Autism Genetics Resource Exchange (AGRE) collection and contrasted results to those obtained for 1,488 healthy controls. Through prioritization of exonic deletions (eDels), exonic duplications (eDups), and whole gene duplication events (gDups), we identified more than 150 loci harboring rare variants in multiple unrelated probands, but no controls. Importantly, 27 of these were confirmed on examination of an independent replication cohort comprised of 859 cases and an additional 1,051 controls. Rare variants at known loci, including exonic deletions at NRXN1 and whole gene duplications encompassing UBE3A and several other genes in the 15q11–q13 region, were observed in the course of these analyses. Strong support was likewise observed for previously unreported genes such as BZRAP1, an adaptor molecule known to regulate synaptic transmission, with eDels or eDups observed in twelve unrelated cases but no controls (p = 2.3×10⁻⁵). Less is known about MDGA2, likewise observed to be case-specific (p = 1.3×10⁻⁴). It is notable, however, that the encoded protein shows an unexpectedly high similarity to Contactin 4 (BLAST E-value = 3×10⁻³⁹), which has also been linked to disease. That hundreds of distinct rare variants were each seen only once further highlights complexity in the ASDs and points to the continued need for larger cohorts.
Author Summary
Autism spectrum disorders (ASDs) are common neurodevelopmental syndromes with a strong genetic component. ASDs are characterized by disturbances in social behavior, impaired verbal and nonverbal communication, as well as repetitive behaviors and/or a restricted range of interests. To identify genes likely to contribute to ASD etiology, we performed high density genotyping in 912 multiplex families from the Autism Genetics Resource Exchange (AGRE) collection and contrasted results to those obtained for 1,488 healthy controls. To enrich for variants most likely to interfere with gene function, we restricted our analyses to deletions and gains encompassing exons. Of the many genomic regions highlighted, 27 were seen to harbor rare variants in cases and not controls, both in the first phase of our analysis, and also in an independent replication cohort comprised of 859 cases and 1,051 controls. More work in a larger number of individuals will be required to determine which of the rare alleles highlighted here are indeed related to the ASDs and how they act to shape risk.
doi:10.1371/journal.pgen.1000536
PMCID: PMC2695001  PMID: 19557195
10.  Self Containment, a Property of Modular RNA Structures, Distinguishes microRNAs 
PLoS Computational Biology  2008;4(8):e1000150.
RNA molecules will tend to adopt a folded conformation through the pairing of bases on a single strand; the resulting so-called secondary structure is critical to the function of many types of RNA. The secondary structure of a particular substring of functional RNA may depend on its surrounding sequence. Yet, some RNAs such as microRNAs retain their specific structures during biogenesis, which involves extraction of the substructure from a larger structural context, while other functional RNAs may be composed of a fusion of independent substructures. Such observations raise the question of whether particular functional RNA substructures may be selected for invariance of secondary structure to their surrounding nucleotide context. We define the property of self containment to be the tendency for an RNA sequence to robustly adopt the same optimal secondary structure regardless of whether it exists in isolation or is a substring of a longer sequence of arbitrary nucleotide content. We measured degree of self containment using a scoring method we call the self-containment index and found that miRNA stem loops exhibit high self containment, consistent with the requirement for structural invariance imposed by the miRNA biogenesis pathway, while most other structured RNAs do not. Further analysis revealed a trend toward higher self containment among clustered and conserved miRNAs, suggesting that high self containment may be a characteristic of novel miRNAs acquiring new genomic contexts. We found that miRNAs display significantly enhanced self containment compared to other functional RNAs, but we also found a trend toward natural selection for self containment in most functional RNA classes. We suggest that self containment arises out of selection for robustness against perturbations, invariance during biogenesis, and modular composition of structural function. Analysis of self containment will be important for both annotation and design of functional RNAs. 
A Python implementation and Web interface to calculate the self-containment index are available at http://kim.bio.upenn.edu/software/.
Author Summary
An RNA molecule is made up of a linear sequence of nucleotides, which form pairwise interactions that define its folded three-dimensional structure; the particular structure largely depends on the specific sequence. These base-pairing interactions are stabilizing, and the RNA will tend to fold in a particular way to maximize stability. Consider some nucleotide sequence that optimally folds into some structure in isolation; if this sequence is now embedded inside a larger sequence, then either the original structure will be a robust subcomponent of the larger folded structure, or it will be disrupted due to new interactions between the original sequence and the surrounding sequence. We explore this property of context robustness of structure and in particular define the property of “self containment” to describe intrinsic context robustness—i.e., the tendency for certain sequences to be structurally robust in many different sequence contexts. Self containment turns out to be a strong characteristic of a class of RNAs called microRNAs, whose biogenesis process depends on the maintenance of structural robustness. This finding will be useful in future efforts to characterize novel miRNAs, as well as in understanding the regulation and evolution of noncoding functional RNAs as modular units.
doi:10.1371/journal.pcbi.1000150
PMCID: PMC2517099  PMID: 18725951
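The idea of context robustness in the entry above can be sketched with a toy calculation. This is not the published self-containment index: it assumes a simple Nussinov base-pair maximization in place of energy-based folding, and it scores the fraction of a sequence's stand-alone base pairs that reappear when the sequence is embedded between random flanks. The function names, the flank length, and the number of contexts are all hypothetical choices made for illustration.

```python
import random

# Watson-Crick and G-U wobble pairs allowed by this toy model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_pairs(seq, min_loop=3):
    """Return one maximum-base-pair structure as a set of (i, j) index pairs."""
    n = len(seq)
    if n == 0:
        return set()
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case: position i is unpaired
            for k in range(i + min_loop + 1, j + 1):
                if (seq[i], seq[k]) in PAIRS:
                    right = dp[k + 1][j] if k < j else 0
                    best = max(best, 1 + dp[i + 1][k - 1] + right)
            dp[i][j] = best
    pairs = set()
    stack = [(0, n - 1)]
    while stack:  # traceback of one optimal structure
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i + 1][j]:
            stack.append((i + 1, j))
            continue
        for k in range(i + min_loop + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                right = dp[k + 1][j] if k < j else 0
                if dp[i][j] == 1 + dp[i + 1][k - 1] + right:
                    pairs.add((i, k))
                    stack.append((i + 1, k - 1))
                    stack.append((k + 1, j))
                    break
    return pairs


def context_robustness(seq, n_contexts=20, flank=30, seed=0):
    """Mean fraction of stand-alone pairs preserved across random contexts."""
    rng = random.Random(seed)
    native = nussinov_pairs(seq)
    if not native:
        return 0.0
    total = 0.0
    for _ in range(n_contexts):
        left = "".join(rng.choice("ACGU") for _ in range(flank))
        right = "".join(rng.choice("ACGU") for _ in range(flank))
        embedded = nussinov_pairs(left + seq + right)
        shifted = {(i - flank, j - flank) for i, j in embedded}
        total += len(native & shifted) / len(native)
    return total / n_contexts
```

Because base-pair maximization often has many equally good structures, the score from this sketch is noisy; a thermodynamic folding program would be the more faithful choice for real analyses.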
11.  Whole proteome identification of plant candidate G-protein coupled receptors in Arabidopsis, rice, and poplar: computational prediction and in-vivo protein coupling 
Genome Biology  2008;9(7):R120.
Computational prediction and in vivo protein coupling experiments identify candidate plant G-protein coupled receptors in Arabidopsis, rice and poplar.
Background
The classic paradigm of heterotrimeric G-protein signaling describes a heptahelical, membrane-spanning G-protein coupled receptor that physically interacts with an intracellular Gα subunit of the G-protein heterotrimer to transduce signals. G-protein coupled receptors comprise the largest protein superfamily in metazoa and are physiologically important as they sense highly diverse stimuli and play key roles in human disease. The heterotrimeric G-protein signaling mechanism is conserved across metazoa, and also readily identifiable in plants, but the low sequence conservation of G-protein coupled receptors hampers the identification of novel ones. Using diverse computational methods, we performed whole-proteome analyses of the three dominant model plant species, the herbaceous dicot Arabidopsis thaliana (mouse-eared cress), the monocot Oryza sativa (rice), and the woody dicot Populus trichocarpa (poplar), to identify plant protein sequences most likely to be GPCRs.
Results
Our stringent bioinformatic pipeline allowed the high confidence identification of candidate G-protein coupled receptors within the Arabidopsis, Oryza, and Populus proteomes. We extended these computational results through actual wet-bench experiments where we tested over half of our highest ranking Arabidopsis candidate G-protein coupled receptors for the ability to physically couple with GPA1, the sole Gα in Arabidopsis. We found that seven out of eight tested candidate G-protein coupled receptors do in fact interact with GPA1. We show through G-protein coupled receptor classification and molecular evolutionary analyses that both individual G-protein coupled receptor candidates and candidate G-protein coupled receptor families are conserved across plant species and that, in some cases, this conservation extends to metazoans.
Conclusion
Our computational and wet-bench results provide the first step toward understanding the diversity, conservation, and functional roles of plant candidate G-protein coupled receptors.
doi:10.1186/gb-2008-9-7-r120
PMCID: PMC2530877  PMID: 18671868
12.  Patterns of sequence conservation in presynaptic neural genes 
Genome Biology  2006;7(11):R105.
Comparative sequence analysis and annotation of genomic regions surrounding 150 presynaptic genes identified over 26,000 elements highly conserved in eight vertebrate species; these results are made available in the SynapseDB database.
Background
The neuronal synapse is a fundamental functional unit in the central nervous system of animals. Because synaptic function is evolutionarily conserved, we reasoned that functional sequences of genes and related genomic elements known to play important roles in neurotransmitter release would also be conserved.
Results
Evolutionary rate analysis revealed that presynaptic proteins evolve slowly, although some members of large gene families exhibit accelerated evolutionary rates relative to other family members. Comparative sequence analysis of 46 megabases spanning 150 presynaptic genes identified more than 26,000 elements that are highly conserved in eight vertebrate species, as well as a small subset of sequences (6%) that are shared among unrelated presynaptic genes. Analysis of large gene families revealed that upstream and intronic regions of closely related family members are extremely divergent. We also identified 504 exceptionally long conserved elements (≥360 base pairs, ≥80% pair-wise identity between human and other mammals) in intergenic and intronic regions of presynaptic genes. Many of these elements form a highly stable stem-loop RNA structure and consequently are candidates for novel regulatory elements, whereas some conserved noncoding elements are shown to correlate with specific gene expression profiles. The SynapseDB online database integrates these findings and other functional genomic resources for synaptic genes.
Conclusion
Highly conserved elements in nonprotein coding regions of 150 presynaptic genes represent sequences that may be involved in the transcriptional or post-transcriptional regulation of these genes. Furthermore, comparative sequence analysis will facilitate selection of genes and noncoding sequences for future functional studies and analysis of variation studies in neurodevelopmental and psychiatric disorders.
doi:10.1186/gb-2006-7-11-r105
PMCID: PMC1794582  PMID: 17096848
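The thresholds quoted in the entry above for exceptionally long conserved elements (≥360 base pairs, ≥80% pair-wise identity between human and other mammals) can be expressed as a simple filter. The sketch below assumes pre-aligned sequences of equal length, measures length on the ungapped human sequence, and counts a gap opposite a base as a mismatch; these conventions and the function names are assumptions for illustration, since the listing does not give the authors' exact procedure.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns; columns that are gaps in both are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def is_long_conserved(human, others, min_len=360, min_identity=80.0):
    """True if the human element is long enough and conserved against every other sequence.

    `others` holds sequences aligned to `human` (same length, '-' for gaps).
    """
    if len(human.replace("-", "")) < min_len:
        return False
    return all(percent_identity(human, other) >= min_identity for other in others)
```

With a hypothetical 400-base human sequence `"ACGT" * 100`, an identical mammalian sequence passes the filter, while `"ACGA" * 100` (75% identity) does not.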
13.  Estimating genomic coexpression networks using first-order conditional independence 
Genome Biology  2004;5(12):R100.
A computationally efficient statistical framework for estimating networks of coexpressed genes is presented that exploits first-order conditional independence relationships among gene expression measurements.
We describe a computationally efficient statistical framework for estimating networks of coexpressed genes. This framework exploits first-order conditional independence relationships among gene-expression measurements to estimate patterns of association. We use this approach to estimate a coexpression network from microarray gene-expression measurements from Saccharomyces cerevisiae. We demonstrate the biological utility of this approach by showing that a large number of metabolic pathways are coherently represented in the estimated network. We describe a complementary unsupervised graph search algorithm for discovering locally distinct subgraphs of a large weighted graph. We apply this algorithm to our coexpression network model and show that subgraphs found using this approach correspond to particular biological processes or contain representatives of distinct gene families.
doi:10.1186/gb-2004-5-12-r100
PMCID: PMC545795  PMID: 15575966
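The notion of first-order conditional independence in the entry above can be sketched with first-order partial correlations. The sketch assumes an edge between two genes is kept only when their marginal correlation and every partial correlation given a single third gene stay above a cutoff in absolute value. The cutoff, the function names, and the plain thresholding (rather than a formal statistical test) are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np


def first_order_partial_corr(r_ij, r_ik, r_jk):
    """Partial correlation of variables i and j given a single variable k."""
    denom = np.sqrt((1.0 - r_ik ** 2) * (1.0 - r_jk ** 2))
    return (r_ij - r_ik * r_jk) / denom


def foci_edges(data, threshold=0.2):
    """Edges (i, j) that survive marginal and all first-order conditioning.

    data: array of shape (n_samples, n_genes).
    threshold: hypothetical cutoff on absolute (partial) correlation.
    """
    r = np.corrcoef(data, rowvar=False)
    n = r.shape[0]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if abs(r[i, j]) < threshold:
                continue  # marginally uncorrelated
            keep = True
            for k in range(n):
                if k == i or k == j:
                    continue
                # Skip k when it is (nearly) collinear with i or j,
                # because the partial correlation is then undefined.
                if min(1.0 - r[i, k] ** 2, 1.0 - r[j, k] ** 2) < 1e-12:
                    continue
                if abs(first_order_partial_corr(r[i, j], r[i, k], r[j, k])) < threshold:
                    keep = False  # i and j look independent given k
                    break
            if keep:
                edges.append((i, j))
    return edges
```

In a simulated chain where x drives y and y drives z, x and z are strongly correlated marginally but nearly uncorrelated given y, so this sketch would keep the x-y and y-z edges and drop x-z.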
14.  The Cobweb of Life Revealed by Genome-Scale Estimates of Horizontal Gene Transfer 
PLoS Biology  2005;3(10):e316.
With the availability of increasing amounts of genomic sequences, it is becoming clear that genomes experience horizontal transfer and incorporation of genetic information. However, to what extent such horizontal gene transfer (HGT) affects the core genealogical history of organisms remains controversial. Based on initial analyses of complete genomic sequences, HGT has been suggested to be so widespread that it might be the “essence of phylogeny” and might leave the treelike form of genealogy in doubt. On the other hand, possible biased estimation of HGT extent and the findings of coherent phylogenetic patterns indicate that phylogeny of life is well represented by tree graphs. Here, we reexamine this question by assessing the extent of HGT among core orthologous genes using a novel statistical method based on statistical comparisons of tree topology. We apply the method to 40 microbial genomes in the Clusters of Orthologous Groups database over a curated set of 297 orthologous gene clusters, and we detect significant HGT events in 33 out of 297 clusters over a wide range of functional categories. Estimates of positions of HGT events suggest a low mean genome-specific rate of HGT (2.0%) among the orthologous genes, which is in general agreement with other quantitative estimates of HGT. We propose that HGT events, even when relatively common, still leave the treelike history of phylogenies intact, much like cobwebs hanging from tree branches.
A statistical approach applied to 297 orthologous gene clusters in 40 microbial genomes suggests a low rate of interspecies gene transfer. Species relationships can therefore be modeled with a tree structure.
doi:10.1371/journal.pbio.0030316
PMCID: PMC1233574  PMID: 16122348
