Results 1-8 (8)
 

1.  HIV-1 transcription is regulated by splicing factor SRSF1 
Nucleic Acids Research  2014;42(22):13812-13823.
Efficient transcription of the HIV-1 genome is regulated by Tat, which recruits P-TEFb from the 7SK small nuclear ribonucleoprotein (snRNP) and other nucleoplasmic complexes to phosphorylate RNA polymerase II and other factors associated with the transcription complex. Although Tat activity is dependent on its binding to the viral TAR sequence, little is known about the cellular factors that might also assemble onto this region of the viral transcript. Here, we report that the splicing factor SRSF1 (SF2/ASF) and Tat recognize overlapping sequences within TAR and the 7SK RNA. SRSF1 expression can inhibit Tat transactivation by directly competing for its binding to TAR. Additionally, we provide evidence that SRSF1 can increase the basal level of viral transcription in the absence of Tat. We propose that SRSF1 activates transcription in the early stages of viral infection by recruiting P-TEFb to TAR from the 7SK snRNP, whereas in the later stages Tat substitutes for SRSF1 by promoting release of the stalled polymerase and more efficient transcriptional elongation.
doi:10.1093/nar/gku1170
PMCID: PMC4267630  PMID: 25416801
2.  OLego: fast and sensitive mapping of spliced mRNA-Seq reads using small seeds 
Nucleic Acids Research  2013;41(10):5149-5163.
A crucial step in analyzing mRNA-Seq data is to accurately and efficiently map hundreds of millions of reads to the reference genome and exon junctions. Here we present OLego, an algorithm specifically designed for de novo mapping of spliced mRNA-Seq reads. OLego adopts a multiple-seed-and-extend scheme, and does not rely on a separate external aligner. It achieves high sensitivity of junction detection by strategic searches with small seeds (∼14 nt for mammalian genomes). To improve accuracy and resolve ambiguous mapping at junctions, OLego uses a built-in statistical model to score exon junctions by splice-site strength and intron size. Burrows–Wheeler transform is used in multiple steps of the algorithm to efficiently map seeds, locate junctions and identify small exons. OLego is implemented in C++ with fully multithreaded execution, and allows fast processing of large-scale data. We systematically evaluated the performance of OLego in comparison with published tools using both simulated and real data. OLego demonstrated better sensitivity, higher or comparable accuracy and substantially improved speed. OLego also identified hundreds of novel micro-exons (<30 nt) in the mouse transcriptome, many of which are phylogenetically conserved and can be validated experimentally in vivo. OLego is freely available at http://zhanglab.c2b2.columbia.edu/index.php/OLego.
doi:10.1093/nar/gkt216
PMCID: PMC3664805  PMID: 23571760
3.  Aberrant 5′ splice sites in human disease genes: mutation pattern, nucleotide structure and comparison of computational tools that predict their utilization 
Nucleic Acids Research  2007;35(13):4250-4263.
Despite a growing number of splicing mutations found in hereditary diseases, utilization of aberrant splice sites and their effects on gene expression remain challenging to predict. We compiled sequences of 346 aberrant 5′ splice sites (5′ss) that were activated by mutations in 166 human disease genes. Mutations within the 5′ss consensus accounted for 254 cryptic 5′ss and mutations elsewhere activated 92 de novo 5′ss. Point mutations leading to cryptic 5′ss activation were most common in the first intron nucleotide, followed by the fifth nucleotide. Substitutions at position +5 were exclusively G>A transitions, which was largely attributable to high mutability rates of C/G>T/A. However, the frequency of point mutations at position +5 was significantly higher than that observed in the Human Gene Mutation Database, suggesting that alterations of this position are particularly prone to aberrant splicing, possibly due to a requirement for sequential interactions with U1 and U6 snRNAs. Cryptic 5′ss were best predicted by computational algorithms that accommodate nucleotide dependencies and not by weight-matrix models. Discrimination of intronic 5′ss from their authentic counterparts was less effective than for exonic sites, as the former were intrinsically stronger than the latter. Computational prediction of exonic de novo 5′ss was poor, suggesting that their activation critically depends on exonic splicing enhancers or silencers. The authentic counterparts of aberrant 5′ss were significantly weaker than the average human 5′ss. The development of an online database of aberrant 5′ss will be useful for studying basic mechanisms of splice-site selection, identifying splicing mutations and optimizing splice-site prediction algorithms.
doi:10.1093/nar/gkm402
PMCID: PMC1934990  PMID: 17576681
4.  Comprehensive splice-site analysis using comparative genomics 
Nucleic Acids Research  2006;34(14):3955-3967.
We have collected over half a million splice sites from five species (Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana) and classified them into four subtypes: U2-type GT–AG and GC–AG and U12-type GT–AG and AT–AC. We have also found new examples of rare splice-site categories, such as U12-type introns without canonical borders, and U2-dependent AT–AC introns. The splice-site sequences and several tools to explore them are available on a public website (SpliceRack). For the U12-type introns, we find several features conserved across species, as well as a clustering of these introns on genes. Using the information content of the splice-site motifs, and the phylogenetic distance between them, we identify: (i) a higher degree of conservation in the exonic portion of the U2-type splice sites in more complex organisms; (ii) conservation of exonic nucleotides for U12-type splice sites; (iii) divergent evolution of C. elegans 3′ splice sites (3′ss) and (iv) distinct evolutionary histories of 5′ and 3′ss. Our study proves that the identification of broad patterns in naturally-occurring splice sites, through the analysis of genomic datasets, provides mechanistic and evolutionary insights into pre-mRNA splicing.
doi:10.1093/nar/gkl556
PMCID: PMC1557818  PMID: 16914448
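The abstract above mentions using the information content of splice-site motifs. As a minimal sketch of that general idea (not the authors' pipeline), the per-position information content of a set of aligned sites can be computed as the maximum entropy (2 bits for DNA) minus the observed Shannon entropy. The function name `information_content` and the four toy sequences are assumptions made for illustration; they are not data from the study, and no small-sample correction is applied.

```python
import math
from collections import Counter


def information_content(sequences, alphabet="ACGT"):
    """Return per-position information content (bits) for aligned sequences.

    Uses IC = log2(|alphabet|) - H, where H is the Shannon entropy of the
    observed base frequencies at that position. No small-sample correction.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must all have the same length")

    max_bits = math.log2(len(alphabet))
    bits = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sequences)
        total = sum(counts[b] for b in alphabet)
        entropy = 0.0
        for b in alphabet:
            if counts[b]:
                p = counts[b] / total
                entropy -= p * math.log2(p)
        bits.append(max_bits - entropy)
    return bits


# Hypothetical toy alignment, for illustration only (not data from the paper).
toy_sites = ["GTAAGT", "GTGAGT", "GTAAGA", "GTAAGT"]
ic = information_content(toy_sites)
```

Positions where every toy sequence carries the same base reach the 2-bit maximum, while variable positions score lower.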
5.  Distribution of SR protein exonic splicing enhancer motifs in human protein-coding genes 
Nucleic Acids Research  2005;33(16):5053-5062.
Exonic splicing enhancers (ESEs) are pre-mRNA cis-acting elements required for splice-site recognition. We previously developed a web-based program called ESEfinder that scores any sequence for the presence of ESE motifs recognized by the human SR proteins SF2/ASF, SRp40, SRp55 and SC35. Using ESEfinder, we have undertaken a large-scale analysis of ESE motif distribution in human protein-coding genes. Significantly higher frequencies of ESE motifs were observed in constitutive internal protein-coding exons, compared with both their flanking intronic regions and with pseudo exons. Statistical analysis of ESE motif frequency distributions revealed a complex relationship between splice-site strength and increased or decreased frequencies of particular SR protein motifs. Comparison of constitutively and alternatively spliced exons demonstrated slightly weaker splice-site scores, as well as significantly fewer ESE motifs, in the alternatively spliced group. Our results underline the importance of ESE-mediated SR protein function in the process of exon definition, in the context of both constitutive splicing and regulated alternative splicing.
doi:10.1093/nar/gki810
PMCID: PMC1201331  PMID: 16147989
6.  Intrinsic differences between authentic and cryptic 5′ splice sites 
Nucleic Acids Research  2003;31(21):6321-6333.
Cryptic splice sites are used only when use of a natural splice site is disrupted by mutation. To determine the features that distinguish authentic from cryptic 5′ splice sites (5′ss), we systematically analyzed a set of 76 cryptic 5′ss derived from 46 human genes. These cryptic 5′ss have a similar frequency distribution in exons and introns, and are usually located close to the authentic 5′ss. Statistical analysis of the strengths of the 5′ss using the Shapiro and Senapathy matrix revealed that authentic 5′ss have significantly higher score values than cryptic 5′ss, which in turn have higher values than the mutant ones. β-Globin provides an interesting exception to this rule, so we chose it for detailed experimental analysis in vitro. We found that the sequences of the β-globin authentic and cryptic 5′ss, but not their surrounding context, determine the correct 5′ss choice, although their respective scores do not reflect this functional difference. Our analysis provides a statistical basis to explain the competitive advantage of authentic over cryptic 5′ss in most cases, and should facilitate the development of tools to reliably predict the effect of disease-associated 5′ss-disrupting mutations at the mRNA level.
doi:10.1093/nar/gkg830
PMCID: PMC275472  PMID: 14576320
7.  ESEfinder: a web resource to identify exonic splicing enhancers 
Nucleic Acids Research  2003;31(13):3568-3571.
Point mutations frequently cause genetic diseases by disrupting the correct pattern of pre-mRNA splicing. The effect of a point mutation within a coding sequence is traditionally attributed to the deduced change in the corresponding amino acid. However, some point mutations can have much more severe effects on the structure of the encoded protein, for example when they inactivate an exonic splicing enhancer (ESE), thereby resulting in exon skipping. ESEs also appear to be especially important in exons that normally undergo alternative splicing. Different classes of ESE consensus motifs have been described, but they are not always easily identified. ESEfinder (http://exon.cshl.edu/ESE/) is a web-based resource that facilitates rapid analysis of exon sequences to identify putative ESEs responsive to the human SR proteins SF2/ASF, SC35, SRp40 and SRp55, and to predict whether exonic mutations disrupt such elements.
PMCID: PMC169022  PMID: 12824367
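The entry above describes scanning exon sequences for putative motifs. A common way to do this kind of scan, shown here only as a generic sketch, is to slide a position-specific score matrix along the sequence and report windows whose summed score meets a threshold. The function `scan_motif`, the matrix `TOY_MATRIX` and the threshold are hypothetical; they are not ESEfinder's matrices, thresholds or code, none of which are given in this listing.

```python
def scan_motif(sequence, matrix, threshold):
    """Slide a score matrix along a sequence and return windows that pass.

    matrix is a list of dicts, one per motif position, mapping base -> score.
    Returns a list of (start_index, window, score) tuples.
    """
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Bases missing from the matrix (e.g. N) contribute a score of 0.
        score = sum(matrix[i].get(base, 0.0) for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Hypothetical 3-position matrix with made-up scores, for illustration only.
TOY_MATRIX = [
    {"A": -1.0, "C": 1.0, "G": 0.5, "T": -1.0},
    {"A": 1.0, "C": -1.0, "G": 0.0, "T": -1.0},
    {"A": -1.0, "C": 0.5, "G": 1.0, "T": -1.0},
]

hits = scan_motif("TTCAGTT", TOY_MATRIX, threshold=2.5)
```

Running the same scan on a wild-type and a mutant sequence, then comparing the two hit lists, is one simple way to check whether a point mutation removes a high-scoring window.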
8.  Correlated alternative side chain conformations in the RNA-recognition motif of heterogeneous nuclear ribonucleoprotein A1 
Nucleic Acids Research  2002;30(7):1531-1538.
The RNA-recognition motif (RRM) is a common and evolutionarily conserved RNA-binding module. Crystallographic and solution structural studies have shown that RRMs adopt a compact α/β structure, in which four antiparallel β-strands form the major RNA-binding surface. Conserved aromatic residues in the RRM are located on the surface of the β-sheet and are important for RNA binding. To further our understanding of the structural basis of RRM-nucleic acid interaction, we carried out a high resolution analysis of UP1, the N-terminal, two-RRM domain of heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1), whose structure was previously solved at 1.75–1.9 Å resolution. The two RRMs of hnRNP A1 are closely related but have distinct functions in regulating alternative pre-mRNA splice site selection. Our present 1.1 Å resolution crystal structure reveals that two conserved solvent-exposed phenylalanines in the first RRM have alternative side chain conformations. These conformations are spatially correlated, as the individual amino acids cannot adopt each of the observed conformations independently. These phenylalanines are critical for nucleic acid binding and the observed alternative side chain conformations may serve as a mechanism for regulating nucleic acid binding by RRM-containing proteins.
PMCID: PMC101846  PMID: 11917013
