Results 1-14 (14)
 

1.  HnRNP L and HnRNP A1 Induce Extended U1 snRNA Interactions with an Exon to Repress Spliceosome Assembly 
Molecular cell  2013;49(5):972-982.
Pre-mRNA splicing is catalyzed through the activity of the spliceosome, a dynamic enzymatic complex. Forcing aberrant interactions within the spliceosome can reduce splicing efficiency and alter splice site choice; however, it is unknown whether such alterations are naturally exploited mechanisms of splicing regulation. Here we demonstrate that hnRNP L represses CD45 exon 4 by recruiting hnRNP A1 to a sequence upstream of the 5’ splice site. Together, hnRNP L and A1 induce extended contacts between the 5’ splice site-bound U1 snRNA and neighboring exonic sequences which, in turn, inhibit stable association of U6 snRNA and subsequent catalysis. Importantly, analysis of several exons regulated by hnRNP L shows a clear relationship between the potential for binding of hnRNP A1 and U1 snRNA, and the effect of hnRNP L on splicing. Together our results demonstrate conformational perturbations within the spliceosome are a naturally occurring and generalizable mechanism for controlling alternative splicing decisions.
doi:10.1016/j.molcel.2012.12.025
PMCID: PMC3595347  PMID: 23394998
alternative splicing; U1 snRNA; tri-snRNP; spliceosome assembly; hnRNP L; hnRNP A1
2.  Human Papillomavirus Type 58 Genome Variations and RNA Expression in Cervical Lesions 
Journal of Virology  2013;87(16):9313-9322.
Human papillomavirus type 58 (HPV58) is relatively prevalent in China and other Asian countries. In this study, the HPV58 genome in cervical lesions was decoded from five grade 2 or 3 cervical intraepithelial neoplasia lesion (CIN2/3) samples and five cervical cancer tissues using rolling-circle amplification of total cell DNA and deep sequencing and verified by whole-genome cloning and sequencing. HPV58 isolates from China feature a total of 52 nucleotide substitutions (0.66%) from the reference HPV58 sequence, which appear mainly in two regions, with 12 from nucleotides (nt) 3430 to 4136 covering the E2/E4/E5 open reading frames (ORFs) and 13 from nt 4621 to 5540 covering the L2 ORF; these could be grouped as HPV58 Chinese Zhejiang-1, -2, and -3 (CNZJ-1, -2, and -3) according to their sequence similarities and restriction enzyme digestion. Phylogenetically, CNZJ-3 is similar to the reference HPV58 sublineage A1 sequence. The other two are close to sublineage A2. Analysis of cervical lesion-derived RNA revealed abundant HPV58 early transcripts spliced at the E6 and E1/E2 ORFs, where two 5′ splice sites at nt 232 and nt 898 and two 3′ splice sites at nt 510 and nt 3355 can be identified. Thus, our study represents the first genome-wide analysis of HPV58 and its expression in cervical lesions.
doi:10.1128/JVI.01154-13
PMCID: PMC3754072  PMID: 23785208
3.  Genome sequencing accuracy by RCA-seq versus long PCR template cloning and sequencing in identification of human papillomavirus type 58 
Cell & Bioscience  2014;4:5.
Background
Genome variations in human papillomaviruses (HPVs) are common and have been widely investigated in the past two decades. HPV genotyping depends on the finding of the viral genome variations in the L1 ORF. Other parts of the viral genome variations have also been implicated as a possible genetic factor in viral pathogenesis and/or oncogenicity.
Results
In this study, the HPV58 genome in cervical lesions was completely sequenced both by rolling-circle amplification of total cell DNA and deep sequencing (RCA-seq) and by long PCR template cloning and sequencing. By comparing three HPV58 genome sequences decoded from three clinical samples with the reference HPV58, we demonstrated that RCA-seq is much more accurate than long PCR template cloning and sequencing in decoding the HPV58 genome. Three HPV58 genomes decoded by RCA-seq displayed a total of 52 nucleotide substitutions from the reference HPV58, which could be verified by long PCR template cloning and sequencing. However, long PCR template cloning and sequencing introduced additional nucleotide substitutions, insertions, and deletions relative to the authentic HPV58 genome in a clinical sample, and these varied from one cloned sequence to another. Because of the inherent error-prone nature of the Tgo DNA polymerase used to prepare the long PCR templates of the HPV58 genome from the clinical samples, the measurable error rate in incorporation of nucleotides into an elongating DNA template was about 0.149% ± 0.038% in our studies.
Conclusions
Since PCR template cloning and sequencing is widely used to identify single nucleotide polymorphisms (SNPs), our data indicate that serious caution should be taken in identifying true SNPs in various genetic studies.
doi:10.1186/2045-3701-4-5
PMCID: PMC3903022  PMID: 24410913
Human papillomaviruses; HPV58; Cervical cancer; Single nucleotide polymorphism; Genotyping; Genome variations; Rolling circle amplification; DNA deep sequencing
4.  PfSETvs methylation of histone H3K36 represses virulence genes in Plasmodium falciparum 
Nature  2013;499(7457):223-227.
The variant antigen, Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of P. falciparum infected Red Blood Cells (iRBCs) is a critical virulence factor for malaria [1]. Each parasite encodes 60 antigenically distinct var genes encoding PfEMP1s, but during infection the clonal parasite population expresses only one gene at a time before switching to the expression of a new variant antigen as an immune evasion mechanism to avoid the host’s antibody responses [2,3]. The mechanism by which 59 of the 60 var genes are silenced remains largely unknown [4–7]. Here we show that knocking out the P. falciparum variant-silencing SET gene (PfSETvs), which encodes an ortholog of Drosophila melanogaster ASH1 and controls histone H3 lysine 36 trimethylation (H3K36me3) on var genes, results in the transcription of virtually all var genes in the single parasite nuclei and their expression as proteins on the surface of individual iRBCs. PfSETvs-dependent H3K36me3 is present along the entire gene body including the transcription start site (TSS) to silence var genes. With low occupancy of PfSETvs at both the TSS of var genes and the intronic promoter, expression of var genes coincides with transcription of their corresponding antisense long non-coding RNA (lncRNA). These results uncover a novel role of the PfSETvs-dependent H3K36me3 in silencing var genes in P. falciparum that might provide a general mechanism by which orthologs of PfSETvs repress gene expression in other eukaryotes. PfSETvs knockout parasites expressing all PfEMP1s may also be applied to the development of a malaria vaccine.
doi:10.1038/nature12361
PMCID: PMC3770130  PMID: 23823717
5.  A Viral Genome Landscape of RNA Polyadenylation from KSHV Latent to Lytic Infection 
PLoS Pathogens  2013;9(11):e1003749.
RNA polyadenylation (pA) is one of the major steps in regulation of gene expression at the posttranscriptional level. In this report, a genome landscape of pA sites of viral transcripts in B lymphocytes with Kaposi sarcoma-associated herpesvirus (KSHV) infection was constructed using a modified PA-seq strategy. We identified 67 unique pA sites, of which 55 could be assigned for expression of the ∼90 annotated KSHV genes. Among the assigned pA sites, twenty are for expression of individual single genes and the rest for multiple genes (average 2.7 genes per pA site) in cluster-gene loci of the genome. A few novel viral pA sites that could not be assigned to any known KSHV genes are often positioned in the antisense strand to ORF8, ORF21, ORF34, K8 and ORF50, and their associated antisense mRNAs to ORF21, ORF34 and K8 could be verified by 3′ RACE. The usage of each mapped pA site correlates with its peak size: the larger (broader and wider) the peak, the greater the usage and thus the higher the expression of the pA site-associated gene(s). Similar to mammalian transcripts, KSHV RNA polyadenylation employs two major poly(A) signals, AAUAAA and AUUAAA, and is regulated by conservation of cis-elements flanking the mapped pA sites. Moreover, we found two or more alternative pA sites downstream of ORF54, K2 (vIL6), K9 (vIRF1), K10.5 (vIRF3), K11 (vIRF2), K12 (Kaposin A), T1.5, and PAN genes and experimentally validated the alternative polyadenylation for the expression of KSHV ORF54, K11, and T1.5 transcripts. Together, our data provide not only a comprehensive pA site landscape for understanding KSHV genome structure and gene expression, but also the first evidence of alternative polyadenylation as another layer of posttranscriptional regulation in viral gene expression.
Author Summary
A genome-wide polyadenylation landscape in the expression of human herpesviruses has not been reported. In this study, we provide the first genome landscape of viral RNA polyadenylation sites in B cells from KSHV latent to lytic infection by using a modified PA-seq protocol with selective validation by 3′ RACE. We found that the KSHV genome contains 67 active pA sites for the expression of its ∼90 genes and a few antisense transcripts. Among the mapped pA sites, a large fraction are for the expression of cluster genes and the production of bicistronic or polycistronic transcripts from the KSHV genome, and only one-third are used for the expression of single genes. We found that the size of individual PA peaks is positively correlated with the usage of the corresponding pA site, which is determined by the number of reads within the PA peak from latent to lytic KSHV infection, and that the strength of cis-elements surrounding a KSHV pA site determines the expression level of viral genes. Lastly, we identified and experimentally validated alternative polyadenylation of KSHV ORF54, T1.5, and K11 during viral lytic infection. To our knowledge, this is the first report on alternative polyadenylation events in KSHV infection.
doi:10.1371/journal.ppat.1003749
PMCID: PMC3828183  PMID: 24244170
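As an illustration of the poly(A) signal usage described in the entry above, the following minimal sketch scans the region upstream of a mapped pA site for the two hexamers named in the abstract, AAUAAA and AUUAAA. It is not the authors' pipeline: the function name `find_polya_signals` and the 40-nt default `window` are assumptions chosen for illustration only.

```python
# Illustrative sketch only; the function name and window size are hypothetical.
POLYA_SIGNALS = ("AAUAAA", "AUUAAA")  # the two signals named in the abstract


def find_polya_signals(rna, pa_site, window=40):
    """Return (position, hexamer) pairs found within `window` nt upstream of pa_site.

    rna     -- transcript sequence (RNA or DNA alphabet; T is read as U)
    pa_site -- 0-based index of the cleavage/polyadenylation site
    """
    start = max(0, pa_site - window)
    region = rna[start:pa_site].upper().replace("T", "U")
    hits = []
    # Slide a 6-nt window across the upstream region.
    for i in range(len(region) - 5):
        hexamer = region[i:i + 6]
        if hexamer in POLYA_SIGNALS:
            hits.append((start + i, hexamer))
    return hits


if __name__ == "__main__":
    seq = "GGGG" + "AAUAAA" + "C" * 15 + "GGGG"
    print(find_polya_signals(seq, pa_site=25))
```

Positions are reported in the coordinates of the input sequence, so a hit can be related directly to its distance from the pA site.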
6.  Distinct polyadenylation landscapes of diverse human tissues revealed by a modified PA-seq strategy 
BMC Genomics  2013;14:615.
Background
Polyadenylation is a key regulatory step in eukaryotic gene expression and one of the major contributors of transcriptome diversity. Aberrant polyadenylation often associates with expression defects and leads to human diseases.
Results
To better understand global polyadenylation regulation, we have developed a polyadenylation sequencing (PA-seq) approach. By profiling polyadenylation events in 13 human tissues, we found that alternative cleavage and polyadenylation (APA) is prevalent in both protein-coding and noncoding genes. In addition, APA usage, similar to gene expression profiling, exhibits tissue-specific signatures and is sufficient for determining tissue origin. A 3′ untranslated region shortening index (USI) was further developed for genes with tandem APA sites. Strikingly, the results showed that different tissues exhibit distinct patterns of shortening and/or lengthening of 3′ untranslated regions, suggesting the intimate involvement of APA in establishing tissue or cell identity.
Conclusions
This study provides a comprehensive resource to uncover regulated polyadenylation events in human tissues and to characterize the underlying regulatory mechanism.
doi:10.1186/1471-2164-14-615
PMCID: PMC3848854  PMID: 24025092
7.  Cellular RNA Binding Proteins NS1-BP and hnRNP K Regulate Influenza A Virus RNA Splicing 
PLoS Pathogens  2013;9(6):e1003460.
Influenza A virus is a major human pathogen with a genome composed of eight single-strand, negative-sense RNA segments. Two viral RNA segments, NS1 and M, undergo alternative splicing and yield several proteins including NS1, NS2, M1 and M2 proteins. However, the mechanisms or players involved in splicing of these viral RNA segments have not been fully studied. Here, by investigating the interacting partners and function of the cellular protein NS1-binding protein (NS1-BP), we revealed novel players in the splicing of the M1 segment. Using a proteomics approach, we identified a complex of RNA binding proteins containing NS1-BP and heterogeneous nuclear ribonucleoproteins (hnRNPs), among which are hnRNPs involved in host pre-mRNA splicing. We found that low levels of NS1-BP specifically impaired proper alternative splicing of the viral M1 mRNA segment to yield the M2 mRNA without affecting splicing of mRNA3, M4, or the NS mRNA segments. Further biochemical analysis by formaldehyde and UV cross-linking demonstrated that NS1-BP did not interact directly with viral M1 mRNA but its interacting partners, hnRNPs A1, K, L, and M, directly bound M1 mRNA. Among these hnRNPs, we identified hnRNP K as a major mediator of M1 mRNA splicing. The M1 mRNA segment generates the matrix protein M1 and the M2 ion channel, which are essential proteins involved in viral trafficking, release into the cytoplasm, and budding. Thus, reduction of NS1-BP and/or hnRNP K levels altered M2/M1 mRNA and protein ratios, decreasing M2 levels and inhibiting virus replication. The NS1-BP-hnRNP K complex is therefore a key mediator of influenza A virus gene expression.
Author Summary
Influenza A virus is a major human pathogen, which causes approximately 500,000 deaths/year worldwide. In pandemic years, influenza infection can lead to even higher mortality rates, as in 1918, when ∼30–50 million deaths occurred worldwide. In this manuscript, we identified a novel function for the cellular protein termed NS1-BP as a regulator of the influenza A virus life cycle. We found that NS1-BP, together with other host factors, mediates the expression of a key viral protein termed M2. NS1-BP and its interacting partner hnRNP K specifically regulate alternative splicing of the viral M1 mRNA segment, which generates the M2 mRNA that is translated into the essential viral M2 protein. The M2 protein is key for viral uncoating and entry into the host cell cytoplasm. Altogether, inhibition of NS1-BP and hnRNP K functions regulates influenza A virus gene expression and replication. In sum, these studies revealed new functions for the cellular proteins NS1-BP and hnRNP K during viral RNA expression, which facilitate the influenza A virus life cycle.
doi:10.1371/journal.ppat.1003460
PMCID: PMC3694860  PMID: 23825951
8.  Genome-wide identification and predictive modeling of tissue-specific alternative polyadenylation 
Bioinformatics  2013;29(13):i108-i116.
Motivation: Pre-mRNA cleavage and polyadenylation are essential steps for 3′-end maturation and subsequent stability and degradation of mRNAs. This process is highly controlled by cis-regulatory elements surrounding the cleavage/polyadenylation sites (polyA sites), which are frequently constrained by sequence content and position. More than 50% of human transcripts have multiple functional polyA sites, and the specific use of alternative polyA sites (APA) results in isoforms with variable 3′-untranslated regions, thus potentially affecting gene regulation. Elucidating the regulatory mechanisms underlying differential polyA preferences in multiple cell types has been hindered both by the lack of suitable data on the precise location of cleavage sites, as well as of appropriate tests for determining APAs with significant differences across multiple libraries.
Results: We applied a tailored paired-end RNA-seq protocol to specifically probe the position of polyA sites in three human adult tissue types. We specified a linear-effects regression model to identify tissue-specific biases indicating regulated APA; the significance of differences between tissue types was assessed by an appropriately designed permutation test. This combination allowed us to identify highly specific subsets of APA events in the individual tissue types. Predictive models successfully classified constitutive polyA sites from a biologically relevant background (auROC = 99.6%), as well as tissue-specific regulated sets from each other. We found that the main cis-regulatory elements described for polyadenylation are a strong, and highly informative, hallmark for constitutive sites only. Tissue-specific regulated sites were found to contain other regulatory motifs, with the canonical polyadenylation signal being nearly absent at brain-specific polyA sites. Together, our results contribute to the understanding of the diversity of post-transcriptional gene regulation.
Availability: Raw data are deposited on SRA, accession numbers: brain SRX208132, kidney SRX208087 and liver SRX208134. Processed datasets as well as model code are published on our website: http://www.genome.duke.edu/labs/ohler/research/UTR/
Contact: uwe.ohler@duke.edu
doi:10.1093/bioinformatics/btt233
PMCID: PMC3694680  PMID: 23812974
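The entry above mentions assessing differences between tissue types with a permutation test. The abstract does not describe the authors' actual test design, so the following is only a generic two-sample permutation test on the difference of group means, included to show the general idea. The function name `permutation_test`, its parameters, and the example values are all hypothetical.

```python
import random


def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Generic illustration; NOT the test design used in the paper above.
    Returns an estimated p-value with the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_perm):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / len(b))
        if diff >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up usage values for two hypothetical tissues.
    tissue_a = [10, 11, 12, 13, 14]
    tissue_b = [0, 1, 2, 3, 4]
    print(permutation_test(tissue_a, tissue_b))
```

The +1 in the numerator and denominator keeps the estimated p-value from being exactly zero when no permuted difference reaches the observed one.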
9.  Ancient gene transfer from algae to animals: Mechanisms and evolutionary significance 
Background
Horizontal gene transfer (HGT) is traditionally considered to be rare in multicellular eukaryotes such as animals. Recently, many genes of miscellaneous algal origins were discovered in choanoflagellates. Considering that choanoflagellates are the existing closest relatives of animals, we speculated that ancient HGT might have occurred in the unicellular ancestor of animals and affected the long-term evolution of animals.
Results
Through genome screening, phylogenetic and domain analyses, we identified 14 gene families, including 92 genes, in the tunicate Ciona intestinalis that are likely derived from miscellaneous photosynthetic eukaryotes. Almost all of these gene families are distributed in diverse animals, suggesting that they were mostly acquired by the common ancestor of animals. Their miscellaneous origins also suggest that these genes are not derived from a particular algal endosymbiont. In addition, most genes identified in our analyses are functionally related to molecule transport, cellular regulation and methylation signaling, suggesting that the acquisition of these genes might have facilitated the intercellular communication in the ancestral animal.
Conclusions
Our findings provide additional evidence that algal genes in aplastidic eukaryotes are not exclusively derived from historical plastids and are thus important for interpreting the evolution of eukaryotic photosynthesis. Most importantly, our data represent the first evidence that more anciently acquired genes might exist in animals and that ancient HGT events have played an important role in animal evolution.
doi:10.1186/1471-2148-12-83
PMCID: PMC3494510  PMID: 22690978
Gene transfer; Endosymbiosis; Plastids; Animal evolution
11.  A paired-end sequencing strategy to map the complex landscape of transcription initiation 
Nature methods  2010;7(7):521-527.
Recent high-throughput sequencing protocols have uncovered the complexity of mammalian transcription by RNA polymerase II, helping to define several initiation patterns in which transcription start sites (TSSs) cluster within both narrow and broad genomic windows. Here, we describe a paired-end sequencing strategy, which enables more robust mapping and characterization of capped transcripts. This strategy was applied to explore the transcription initiation landscape in the Drosophila melanogaster embryo. Extending the previous findings in mammals, we found that fly promoters exhibit distinct initiation patterns, which are linked to specific promoter sequence motifs. Furthermore, we identified a large number of 5′ capped transcripts originating from coding exons; analyses indicate that they are unlikely to be the result of alternative TSSs, but rather the product of post-transcriptional modifications. Taken together, paired-end TSS analysis is demonstrated to be a powerful method to uncover the transcriptional complexity of eukaryotic genomes.
doi:10.1038/nmeth.1464
PMCID: PMC3197272  PMID: 20495556
12.  Transcription Initiation Patterns Indicate Divergent Strategies for Gene Regulation at the Chromatin Level 
PLoS Genetics  2011;7(1):e1001274.
The application of deep sequencing to map 5′ capped transcripts has confirmed the existence of at least two distinct promoter classes in metazoans: “focused” promoters with transcription start sites (TSSs) that occur in a narrowly defined genomic span and “dispersed” promoters with TSSs that are spread over a larger window. Previous studies have explored the presence of genomic features, such as CpG islands and sequence motifs, in these promoter classes, but virtually no studies have directly investigated the relationship with chromatin features. Here, we show that promoter classes are significantly differentiated by nucleosome organization and chromatin structure. Dispersed promoters display higher associations with well-positioned nucleosomes downstream of the TSS and a more clearly defined nucleosome free region upstream, while focused promoters have a less organized nucleosome structure, yet higher presence of RNA polymerase II. These differences extend to histone variants (H2A.Z) and marks (H3K4 methylation), as well as insulator binding (such as CTCF), independent of the expression levels of affected genes. Notably, differences are conserved across mammals and flies, and they provide for a clearer separation of promoter architectures than the presence and absence of CpG islands or the occurrence of stalled RNA polymerase. Computational models support the stronger contribution of chromatin features to the definition of dispersed promoters compared to focused start sites. Our results show that promoter classes defined from 5′ capped transcripts not only reflect differences in the initiation process at the core promoter but also are indicative of divergent transcriptional programs established within gene-proximal nucleosome organization.
Author Summary
How are genes transcribed at the right levels and under the right conditions? Transcription regulation in eukaryotes has long been proposed to work by a division of labor: ubiquitous DNA sequence features in the core promoter region, close to the transcription start site (TSS) of genes, were thought to generically encode information to recruit RNA polymerase to initiate transcription, while specific sequence features, often distal from the genes, were thought to boost expression under the right conditions. Supporting the generic function of core promoters, genome-wide chromatin maps showed a stereotypical arrangement of well-spaced nucleosomes providing access to the TSS. High-throughput sequencing has generated genome-wide TSS maps at high resolution, which show that promoters exhibit different initiation patterns, ranging from focused start sites to dispersed regions. Linking these patterns to chromatin maps, we now find distinct core promoter classes, those in which the TSS location is defined broadly on the chromatin level and those in which the TSS is defined by precisely positioned sequence features. Notably, these architectures are conserved deeply across eukaryotes and are used for different functional classes of genes. Our work adds to the increasing understanding that core promoters contribute significantly to the complexity of eukaryotic gene expression.
doi:10.1371/journal.pgen.1001274
PMCID: PMC3020932  PMID: 21249180
13.  The Prevalence and Regulation of Antisense Transcripts in Schizosaccharomyces pombe 
PLoS ONE  2010;5(12):e15271.
A strand-specific transcriptome sequencing strategy, directional ligation sequencing or DeLi-seq, was employed to profile the antisense transcriptome of Schizosaccharomyces pombe. Under both normal and heat shock conditions, we found that polyadenylated antisense transcripts are broadly expressed while distinct expression patterns were observed for protein-coding and non-coding loci. Dominant antisense expression is enriched in protein-coding genes involved in meiosis or stress response pathways. Detailed analyses further suggest that antisense transcripts are independently regulated with respect to their sense transcripts, and diverse mechanisms might be involved in the biogenesis and degradation of antisense RNAs. Taken together, antisense transcription may have profound impacts on global gene regulation in S. pombe.
doi:10.1371/journal.pone.0015271
PMCID: PMC3004915  PMID: 21187966
14.  Computational analysis of the relationship between allergenicity and digestibility of allergenic proteins in simulated gastric fluid 
BMC Bioinformatics  2007;8:375.
Background
Safety assessment of genetically modified (GM) food, with regard to allergenic potential of transgene-encoded xenoproteins, typically involves several different methods, evaluation by digestibility being one thereof. However, there are still debates about whether the allergenicity of food allergens is related to their resistance to digestion by the gastric fluid. The disagreements may in part stem from classification of allergens only by their sources, which we believe is inadequate, and from the difficulties in achieving identical experimental conditions for studying digestion by simulated gastric fluid (SGF) so that results can be compared. Here, we reclassify food allergens into alimentary canal-sensitized (ACS) and non-alimentary canal-sensitized (NACS) allergens and use a computational model that simulates gastric fluid digestion to analyze the digestibilities of these two types.
Results
The model presented in this paper is as effective as SGF digestion experiments, but more stable and reproducible. On the basis of this model, food allergens are satisfactorily classified as ACS and NACS types by their pathways for sensitization; the former are relatively resistant to gastric fluid digestion while the latter are relatively labile.
Conclusion
The results suggest that it is better to classify allergens into ACS and NACS types when examining the relationship between their digestibility and allergenicity, and that the digestibility of a target foreign protein is a parameter for evaluating its allergenicity during safety assessments of GM food.
doi:10.1186/1471-2105-8-375
PMCID: PMC2099448  PMID: 17922925
