Results 1-9 (9)

1.  Comprehensive survey of human brain microRNA by deep sequencing 
BMC Genomics  2010;11:409.
MicroRNA (miRNA) play an important role in gene expression regulation. At present, the number of annotated miRNA continues to grow rapidly, in part due to advances of high-throughput sequencing techniques. Here, we use deep sequencing to characterize a population of small RNA expressed in human and rhesus macaque brain cortex.
Based on a total of more than 150 million sequence reads we identify 197 putative novel miRNA, in humans and rhesus macaques, that are highly conserved among mammals. These putative miRNA have significant excess of conserved target sites in genes' 3'UTRs, supporting their functional role in gene regulation. Additionally, in humans and rhesus macaques respectively, we identify 41 and 22 conserved putative miRNA originating from non-coding RNA (ncRNA) transcripts. While some of these molecules might function as conventional miRNA, others might be harmful and result in target avoidance.
Here, we further extend the repertoire of conserved human and rhesus macaque miRNA. Even though our study is based on a single tissue, the coverage depth of our study allows identification of functional miRNA present in brain tissue at background expression levels. Therefore, our study might cover a large proportion of the yet unannotated conserved miRNA present in the human genome.
PMCID: PMC2996937  PMID: 20591156
2.  Computational prediction of the osmoregulation network in Synechococcus sp. WH8102 
BMC Genomics  2010;11:291.
Osmotic stress is caused by sudden changes in the impermeable solute concentration around a cell, which induces instantaneous water flow in or out of the cell to balance the concentration. Very little is known about the detailed response mechanism to osmotic stress in marine Synechococcus, one of the major oxygenic phototrophic cyanobacterial genera that contribute greatly to the global CO2 fixation.
We present here a computational study of the osmoregulation network in response to hyperosmotic stress of Synechococcus sp. strain WH8102 using comparative genome analyses and computational prediction. In this study, we identified the key transporters, synthetases, signal sensor proteins and transcriptional regulator proteins, and found experimentally that of these proteins, 15 genes showed significantly changed expression levels under a mild hyperosmotic stress.
From the predicted network model, we have made a number of interesting observations about WH8102. Specifically, we found that (i) the organism likely uses glycine betaine as the major osmolyte, and others such as glucosylglycerol, glucosylglycerate, trehalose, sucrose and arginine as the minor osmolytes, making it efficient and adaptable to its changing environment; and (ii) σ38, one of the seven types of σ factors, probably serves as a global regulator coordinating the osmoregulation network and the other relevant networks.
PMCID: PMC2874817  PMID: 20459751
3.  Sequence features associated with microRNA strand selection in humans and flies 
BMC Genomics  2009;10:413.
During microRNA (miRNA) maturation in humans and flies, Drosha and Dicer cut the precursor transcript, thereby producing a short RNA duplex. One strand of this duplex becomes a functional component of the RNA-Induced Silencing Complex (RISC), while the other is eliminated. While thermodynamic asymmetry of the duplex ends appears to play a decisive role in the strand selection process, the details of the selection mechanism are not yet understood.
Here, we assess miRNA strand selection bias in humans and fruit flies by analyzing the sequence composition and relative expression levels of the two strands of the precursor duplex in these species. We find that the sequence elements associated with preferential miRNA strand selection and/or rejection differ between the two species. Further, we identify another feature that distinguishes human and fly miRNA processing machinery: the relative accuracy of the Drosha and Dicer enzymes.
Our result provides clues to the mechanistic aspects of miRNA strand selection in humans and other mammals. Further, it indicates that human and fly miRNA processing pathways are more distinct than currently recognized. Finally, the observed strand selection determinants are instrumental in the rational design of efficient miRNA-based expression regulators.
PMCID: PMC2751786  PMID: 19732433
4.  Estimating accuracy of RNA-Seq and microarrays with proteomics 
BMC Genomics  2009;10:161.
Microarrays revolutionized biological research by enabling gene expression comparisons on a transcriptome-wide scale. Microarrays, however, do not estimate absolute expression level accurately. At present, high throughput sequencing is emerging as an alternative methodology for transcriptome studies. Although free of many limitations imposed by microarray design, its potential to estimate absolute transcript levels is unknown.
In this study, we evaluate the relative accuracy of microarrays and transcriptome sequencing (RNA-Seq) using a third methodology: proteomics. We find that RNA-Seq provides a better estimate of absolute expression levels.
Our result shows that in terms of overall technical performance, RNA-Seq is the technique of choice for studies that require accurate estimation of absolute transcript levels.
PMCID: PMC2676304  PMID: 19371429
5.  RepPop: a database for repetitive elements in Populus trichocarpa 
BMC Genomics  2009;10:14.
Populus trichocarpa is the first tree genome to be completed, and its whole genome is currently being assembled. No functional annotation about the repetitive elements in the Populus trichocarpa genome is currently available.
We predicted 9,623 repetitive elements in the Populus trichocarpa genome, and assigned functions to 3,075 of them (31.95%). The 9,623 repetitive elements cover ~40% of the current (partially) assembled genome. Among the 9,623 repetitive elements, 668 have copies only in the contigs that have not been assigned to one of the 19 chromosomes, while the rest all have copies in the partially assembled chromosomes.
All the predicted data are organized into an easy-to-use web-browsable database, RepPop. Various search capabilities are provided against the RepPop database. A Wiki system has been set up to facilitate functional annotation and curation of the repetitive elements by a community rather than just the database developer. The database RepPop will facilitate the assembling and functional characterization of the Populus trichocarpa genome.
PMCID: PMC2645430  PMID: 19134208
6.  Insertion Sequences show diverse recent activities in Cyanobacteria and Archaea 
BMC Genomics  2008;9:36.
Mobile genetic elements (MGEs) play an essential role in genome rearrangement and evolution, and are widely used as an important genetic tool.
In this article, we present genetic maps of recently active Insertion Sequence (IS) elements, the simplest form of MGEs, for all sequenced cyanobacteria and archaea, predicted based on the previously identified ~1,500 IS elements. Our predicted IS maps are consistent with the NCBI annotations of the IS elements. By linking the predicted IS elements to various characteristics of the organisms under study and the organism's living conditions, we found that (a) the activities of IS elements heavily depend on the environments where the host organisms live; (b) the number of recently active IS elements in a genome tends to increase with the genome size; (c) the flanking regions of the recently active IS elements are significantly enriched with genes encoding DNA binding factors, transporters and enzymes; and (d) IS movements show no tendency to disrupt operonic structures.
These are the first genome-scale maps of IS elements with detailed structural information at the sequence level. These genetic maps of recently active IS elements and the several interesting observations would help to improve our understanding of how IS elements proliferate and how they are involved in the evolution of the host genomes.
PMCID: PMC2246112  PMID: 18218090
7.  Computational prediction of Pho regulons in cyanobacteria 
BMC Genomics  2007;8:156.
Phosphorus is an essential element for all life forms. However, it is limiting in most ecological environments that cyanobacteria inhabit. Elucidation of the phosphorus assimilation pathways in cyanobacteria will further our understanding of the physiology and ecology of this important group of microorganisms. However, a systematic study of the Pho regulon, the core of the phosphorus assimilation pathway in a cyanobacterium, is hitherto lacking.
We have predicted and analyzed the Pho regulons in 19 sequenced cyanobacterial genomes using a highly effective scanning algorithm that we have previously developed. Our results show that different cyanobacterial species/ecotypes may encode diverse sets of genes responsible for the utilization of various sources of phosphorus, ranging from inorganic phosphate and phosphodiesters to phosphonates. Unlike in E. coli, some cyanobacterial genes that are directly involved in phosphorus assimilation seem not to be under the regulation of the regulator SphR (orthologue of PhoB in E. coli) in some species/ecotypes. On the other hand, SphR binding sites are found for genes known to play important roles in other biological processes. These genes might serve as bridging points to coordinate the phosphorus assimilation and other biological processes. More interestingly, in three cyanobacterial genomes where no sphR gene is encoded, our results show that there is virtually no functional SphR binding site, suggesting that transcription regulators probably play an important role in retaining their binding sites.
The Pho regulons in cyanobacteria are highly diversified to adapt to their respective living environments. The phosphorus assimilation pathways in cyanobacteria are probably tightly coupled to a number of other important biological processes. The loss of a regulator may lead to the rapid loss of its binding sites in a genome.
PMCID: PMC1906773  PMID: 17559671
8.  Bioinformatic analysis of an unusual gene-enzyme relationship in the arginine biosynthetic pathway among marine gamma proteobacteria: implications concerning the formation of N-acetylated intermediates in prokaryotes 
BMC Genomics  2006;7:4.
The N-acetylation of L-glutamate is regarded as a universal metabolic strategy to commit glutamate towards arginine biosynthesis. Until recently, this reaction was thought to be catalyzed by either of two enzymes: (i) the classical N-acetylglutamate synthase (NAGS, gene argA) first characterized in Escherichia coli and Pseudomonas aeruginosa several decades ago and also present in vertebrates, or (ii) the bifunctional version of ornithine acetyltransferase (OAT, gene argJ) present in Bacteria, Archaea and many Eukaryotes. This paper focuses on a new and surprising aspect of glutamate acetylation. We recently showed that in Moritella abyssi and M. profunda, two marine gamma proteobacteria, the gene for the last enzyme in arginine biosynthesis (argH) is fused to a short sequence that corresponds to the C-terminal, N-acetyltransferase-encoding domain of NAGS and is able to complement an argA mutant of E. coli. Very recently, other authors identified in Mycobacterium tuberculosis an independent gene corresponding to this short C-terminal domain and coding for a new type of NAGS. We have investigated the two prokaryotic Domains for patterns of gene-enzyme relationships in the first committed step of arginine biosynthesis.
The argH-A fusion, designated argH(A), and discovered in Moritella was found to be present in (and confined to) marine gamma proteobacteria of the Alteromonas- and Vibrio-like group. Most of them have a classical NAGS with the exception of Idiomarina loihiensis and Pseudoalteromonas haloplanktis which nevertheless can grow in the absence of arginine and therefore appear to rely on the arg(A) sequence for arginine biosynthesis. Screening prokaryotic genomes for virtual argH-X 'fusions' where X stands for a homologue of arg(A), we retrieved a large number of Bacteria and several Archaea, all of them devoid of a classical NAGS. In the case of Thermus thermophilus and Deinococcus radiodurans, the arg(A)-like sequence clusters with argH in an operon-like fashion. In this group of sequences, we find the short novel NAGS of the type identified in M. tuberculosis. Among these organisms, at least Thermus, Mycobacterium and Streptomyces species appear to rely on this short NAGS version for arginine biosynthesis.
The gene-enzyme relationship for the first committed step of arginine biosynthesis should now be considered in a new perspective. In addition to bifunctional OAT, nature appears to implement at least three alternatives for the acetylation of glutamate. It is possible to propose evolutionary relationships between them starting from the same ancestral N-acetyltransferase domain. In M. tuberculosis and many other bacteria, this domain evolved as an independent enzyme, whereas it fused either with a carbamate kinase fold to give the classical NAGS (as in E. coli) or with argH as in marine gamma proteobacteria. Moreover, there is an urgent need to clarify the current nomenclature since the same gene name argA has been used to designate structurally different entities. Clarifying the confusion would help to prevent erroneous genomic annotation.
PMCID: PMC1382215  PMID: 16409639
9.  Retrieving sequences of enzymes experimentally characterized but erroneously annotated : the case of the putrescine carbamoyltransferase 
BMC Genomics  2004;5:52.
Annotating genomes remains a hazardous task. Mistakes or gaps in such a complex process may occur when relevant knowledge is ignored, whether lost, forgotten or overlooked. This paper exemplifies an approach which could help to resuscitate such meaningful data.
We show that a set of closely related sequences which have been annotated as ornithine carbamoyltransferases are actually putrescine carbamoyltransferases. This demonstration is based on the following points: (i) use of enzymatic data which had been overlooked, (ii) rediscovery of a short NH2-terminal sequence that allows a wrongly annotated ornithine carbamoyltransferase to be reannotated as a putrescine carbamoyltransferase, (iii) identification of conserved motifs that distinguish unambiguously between the two kinds of carbamoyltransferases, and (iv) comparative study of the gene context of these different sequences.
We explain why this specific case of misannotation had not yet been described and draw attention to the fact that analogous instances must be rather frequent. We urge special caution when high sequence similarity is coupled with an apparent lack of biochemical information. Moreover, from the point of view of genome annotation, proteins which have been studied experimentally but are not correlated with sequence data in current databases qualify as "orphans", just as unassigned genomic open reading frames do. The strategy we used in this paper to bridge such gaps in knowledge could work whenever it is possible to collect a body of facts about experimental data, homology, unnoticed sequence data, and accurate information about gene context.
PMCID: PMC514541  PMID: 15287962
