Results 1-18 (18)
 

author:("Luo, hsingchu")
1.  LSD 2.0: an update of the leaf senescence database 
Nucleic Acids Research  2013;42(D1):D1200-D1205.
This manuscript describes an update of the leaf senescence database (LSD) previously featured in the 2011 NAR Database Issue. LSD provides comprehensive information concerning senescence-associated genes (SAGs) and their corresponding mutants. We have made extensive annotations for these SAGs through both manual and computational approaches. Recently, we updated LSD to a new version LSD 2.0 (http://www.eplantsenescence.org/), which contains 5356 genes and 322 mutants from 44 species, an extension from the previous version containing 1145 genes and 154 mutants from 21 species. In the current version, we also included several new features: (i) Primer sequences retrieved based on experimental evidence or designed for high-throughput analysis were added; (ii) More than 100 images of Arabidopsis SAG mutants were added; (iii) Arabidopsis seed information obtained from The Arabidopsis Information Resource (TAIR) was integrated; (iv) Subcellular localization information of SAGs in Arabidopsis mined from literature or generated from the SUBA3 program was presented; (v) Quantitative Trait Loci information was added with links to the original database and (vi) New options such as primer and miRNA search for database query were implemented. The updated database will be a valuable and informative resource for basic research of leaf senescence and for the manipulation of traits of agronomically important plants.
doi:10.1093/nar/gkt1061
PMCID: PMC3965048  PMID: 24185698
2.  PlantTFDB 3.0: a portal for the functional and evolutionary study of plant transcription factors 
Nucleic Acids Research  2013;42(D1):D1182-D1187.
To provide a resource for the functional and evolutionary study of plant transcription factors (TFs), we updated the plant TF database PlantTFDB to version 3.0 (http://planttfdb.cbi.pku.edu.cn). After refining the TF classification pipeline, we systematically identified 129 288 TFs from 83 species, of which 67 species have genome sequences, covering the main lineages of green plants. Besides the abundant annotation provided in the previous version, we generated more annotations for identified TFs, including expression, regulation, interaction, conserved elements, phenotype information, expert-curated descriptions derived from UniProt, TAIR and NCBI GeneRIF, as well as references to provide clues for functional studies of TFs. To help identify evolutionary relationships among identified TFs, we assigned 69 450 TFs into 3924 orthologous groups, and constructed 9217 phylogenetic trees for TFs within the same families or same orthologous groups, respectively. In addition, we set up a TF prediction server in this version for users to identify TFs from their own sequences.
doi:10.1093/nar/gkt1016
PMCID: PMC3965000  PMID: 24174544
3.  ABrowse - a customizable next-generation genome browser framework 
BMC Bioinformatics  2012;13:2.
Background
With the rapid growth of genome sequencing projects, genome browsers are becoming indispensable, not only as visualization systems but also as interactive platforms to support open data access and collaborative work. Thus a customizable genome browser framework with rich functions and flexible configuration is needed to facilitate various genome research projects.
Results
Based on next-generation web technologies, we have developed a general-purpose genome browser framework, ABrowse, which provides an interactive browsing experience, open data access and collaborative work support. By supporting Google-map-like smooth navigation, ABrowse offers end users a highly interactive browsing experience. To facilitate further data analysis, multiple data access approaches are supported for external platforms to retrieve data from ABrowse. To promote collaborative work, an online user-space is provided for end users to create, store and share comments, annotations and landmarks. For data providers, ABrowse is highly customizable and configurable. The framework provides a set of utilities to import annotation data conveniently. To build ABrowse on existing annotation databases, data providers can specify SQL statements according to the database schema, and customized pages for detailed display of annotation entries can be easily plugged in. For developers, new drawing strategies can be integrated into ABrowse for new types of annotation data. In addition, a standard web service is provided for remote data retrieval, offering an underlying machine-oriented programming interface for open data access.
Conclusions
The ABrowse framework is valuable for end users, data providers and developers by providing rich user functions and flexible customization approaches. The source code is published under the GNU Lesser General Public License v3.0 and is accessible at http://www.abrowse.org/. To demonstrate all the features of ABrowse, a live demo for the Arabidopsis thaliana genome has been built at http://arabidopsis.cbi.edu.cn/.
doi:10.1186/1471-2105-13-2
PMCID: PMC3265404  PMID: 22222089
4.  Deficiency of X-Linked Inverted Duplicates with Male-Biased Expression and the Underlying Evolutionary Mechanisms in the Drosophila Genome 
Molecular Biology and Evolution  2011;28(10):2823-2832.
Inverted duplicates (IDs) are pervasive in genomes and have been reported to play functional roles in various biological processes. However, the general underlying evolutionary forces that maintain IDs in genomes remain largely elusive. Through a systematic screening of the Drosophila melanogaster genome, 20,223 IDs were detected in nonrepetitive intergenic regions, far more than expected under the neutrality model. Of these IDs, 3,846 were identified as having a stable hairpin structure (i.e., the structural IDs). Based on whole-genome transcriptome profiling data, we found 628 unannotated expressed structural IDs, which had significantly different genomic distributions and structural properties from the unexpressed IDs. Among the expressed structural IDs, 130 exhibited higher expression in males than in females (i.e., male-biased expression). Compared with sex-unbiased ones, these male-biased IDs were significantly underrepresented on the X chromosome, similar to the previously reported pattern of male-biased protein-coding genes. These analyses suggest that a selection-driven process, rather than a purely neutral mutation-driven mechanism, contributes to the maintenance of IDs in the Drosophila genome.
doi:10.1093/molbev/msr101
PMCID: PMC3176832  PMID: 21546357
inverted duplicates; noncoding RNA; sex evolution; MSCI; meiotic drive; Drosophila melanogaster
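Editorial illustration for entry 4: the abstract describes screening a genome for inverted duplicates. The following is a minimal sketch, assuming an inverted duplicate is a pair of arms in which the second arm is the exact reverse complement of the first and starts within a bounded spacer. The function names, `arm_len` and `max_spacer` are hypothetical and do not reproduce the authors' pipeline, which also evaluated hairpin stability and expression.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def find_inverted_duplicates(seq, arm_len=6, max_spacer=20):
    """Return (left_start, right_start) pairs for exact inverted repeats.

    Hypothetical toy criterion: the right arm equals the reverse
    complement of the left arm and begins at most max_spacer bases
    after the left arm ends.
    """
    seq = seq.upper()
    hits = []
    for left in range(len(seq) - 2 * arm_len + 1):
        arm = seq[left:left + arm_len]
        if set(arm) - set("ACGT"):
            continue  # skip arms containing ambiguous bases
        target = reverse_complement(arm)
        first_right = left + arm_len
        last_right = min(first_right + max_spacer, len(seq) - arm_len)
        for right in range(first_right, last_right + 1):
            if seq[right:right + arm_len] == target:
                hits.append((left, right))
    return hits
```

For example, `find_inverted_duplicates("TTACGTTGAAAACAACGTTT")` reports the arm starting at index 2 paired with its reverse complement starting at index 12.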
5.  Rice-Map: a new-generation rice genome browser 
BMC Genomics  2011;12:165.
Background
The concurrent release of rice genome sequences for two subspecies (Oryza sativa L. ssp. japonica and Oryza sativa L. ssp. indica) facilitates rice studies at the whole genome level. Since the advent of high-throughput analysis, huge amounts of functional genomics data have been delivered rapidly, making an integrated online genome browser indispensable for scientists to visualize and analyze these data. Based on next-generation web technologies and high-throughput experimental data, we have developed Rice-Map, a novel genome browser for researchers to navigate, analyze and annotate the rice genome interactively.
Description
More than one hundred annotation tracks (81 for japonica and 82 for indica) have been compiled and loaded into Rice-Map. These pre-computed annotations cover gene models, transcript evidences, expression profiling, epigenetic modifications, inter-species and intra-species homologies, genetic markers and other genomic features. In addition to these pre-computed tracks, registered users can interactively add comments and research notes to Rice-Map as User-Defined Annotation entries. By smoothly scrolling, dragging and zooming, users can browse various genomic features simultaneously at multiple scales. On-the-fly analysis for selected entries could be performed through dedicated bioinformatic analysis platforms such as WebLab and Galaxy. Furthermore, a BioMart-powered data warehouse "Rice Mart" is offered for advanced users to fetch bulk datasets based on complex criteria.
Conclusions
Rice-Map delivers abundant up-to-date japonica and indica annotations, providing a valuable resource for both computational and bench biologists. Rice-Map is publicly accessible at http://www.ricemap.org/, with all data available for free downloading.
doi:10.1186/1471-2164-12-165
PMCID: PMC3072960  PMID: 21450055
6.  PlantTFDB 2.0: update and improvement of the comprehensive plant transcription factor database 
Nucleic Acids Research  2010;39(Database issue):D1114-D1117.
We updated the plant transcription factor (TF) database to version 2.0 (PlantTFDB 2.0, http://planttfdb.cbi.pku.edu.cn) which contains 53 319 putative TFs predicted from 49 species. We made detailed annotation including general information, domain feature, gene ontology, expression pattern and ortholog groups, as well as cross references to various databases and literature citations for these TFs classified into 58 newly defined families with computational approach and manual inspection. Multiple sequence alignments and phylogenetic trees for each family can be shown as Weblogo pictures or downloaded as text files. We have redesigned the user interface in the new version. Users can search TFs with much more flexibility through the improved advanced search page, and the search results can be exported into various formats for further analysis. In addition, we now provide web service for advanced users to access PlantTFDB 2.0 more efficiently.
doi:10.1093/nar/gkq1141
PMCID: PMC3013715  PMID: 21097470
7.  LSD: a leaf senescence database 
Nucleic Acids Research  2010;39(Database issue):D1103-D1107.
Through a broad literature survey, we have developed a leaf senescence database (LSD, http://www.eplantsenescence.org/) that contains a total of 1145 senescence associated genes (SAGs) from 21 species. These SAGs were retrieved based on genetic, genomic, proteomic, physiological or other experimental evidence, and were classified into different categories according to their functions in leaf senescence or morphological phenotypes when mutated. We made extensive annotations for these SAGs by both manual and computational approaches, and users can either browse or search the database to obtain information including literature, mutants, phenotypes, expression profiles, miRNA interactions, orthologs in other plants and cross links to other databases. We have also integrated a bioinformatics analysis platform, WebLab, into LSD, which allows users to perform extensive sequence analysis of their SAGs of interest. The SAG sequences in LSD can also be downloaded readily for bulk analysis. We believe that the LSD contains the largest number of SAGs to date and represents the most comprehensive and informative plant senescence-related database, which would facilitate systems biology research and comparative studies on plant aging.
doi:10.1093/nar/gkq1169
PMCID: PMC3013730  PMID: 21097471
8.  AHD2.0: an update version of Arabidopsis Hormone Database for plant systematic studies 
Nucleic Acids Research  2010;39(Database issue):D1123-D1129.
Phytohormone studies have advanced our knowledge of plant responses to various changes. To provide a systematic and comprehensive view of genes participating in plant hormonal regulation, an online accessible database, the Arabidopsis Hormone Database (AHD), has been developed, which is a collection of hormone-related genes of the model organism Arabidopsis thaliana (AHRGs). Recently we updated our database from AHD to a new version, AHD2.0, by adding several notable features: (i) updating our collection of AHRGs based on the most recent publications as well as constructing elaborate schematic diagrams of each hormone biosynthesis and signaling pathway; (ii) adding orthologs of sequenced plants listed in OrthoMCL-DB to each AHRG in the updated database; (iii) providing predicted miRNA splicing site(s) for each AHRG; (iv) integrating genes that genetically interact with each AHRG according to literature mining; (v) providing links to a powerful online analysis platform, WebLab, for the convenience of in-time bioinformatics analysis and (vi) providing links to widely used protein databases and integrating more expression profiling information to help users carry out more systematic and integrative analyses related to phytohormone research.
doi:10.1093/nar/gkq1066
PMCID: PMC3013673  PMID: 21045062
9.  Expression pattern divergence of duplicated genes in rice 
BMC Bioinformatics  2009;10(Suppl 6):S8.
Background
Genome-wide duplication is ubiquitous during diversification of the angiosperms, and gene duplication is one of the most important mechanisms for evolutionary novelties. As an indicator of functional evolution, the divergence of expression patterns following duplication events has drawn great attention in recent years. Using large-scale whole-genome microarray data, we systematically analyzed expression divergence patterns of rice genes from block, tandem and dispersed duplications.
Results
We found a significant difference in expression divergence patterns for the three types of duplicated gene pairs. Expression correlation is significantly higher for gene pairs from block and tandem duplications than those from dispersed duplications. Furthermore, a significant correlation was observed between the expression divergence and the synonymous substitution rate which is an approximate proxy of divergence time. Thus, both duplication types and divergence time influence the difference in expression divergence. Using a linear model, we investigated the influence of these two variables and found that the difference in expression divergence between block and dispersed duplicates is attributed largely to their different divergence time. In addition, the difference in expression divergence between tandem and the other two types of duplicates is attributed to both divergence time and duplication type.
Conclusion
Consistent with previous studies on Arabidopsis, our results revealed a significant difference in expression divergence between the types of duplicated genes and a significant correlation between expression divergence and synonymous substitution rate. The contribution of duplication mode to expression divergence implies that duplicated genes of different types follow different evolutionary courses.
doi:10.1186/1471-2105-10-S6-S8
PMCID: PMC2697655  PMID: 19534757
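Editorial illustration for entry 9: the abstract compares expression correlation between duplicated gene pairs of different duplication types. The sketch below assumes divergence is measured as one minus the Pearson correlation of two expression profiles across the same samples; that measure, the function names and the input layout are assumptions for illustration, not details stated in the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def expression_divergence(profile_a, profile_b):
    """Assumed divergence measure: 1 - r, so 0 means identical patterns."""
    return 1.0 - pearson(profile_a, profile_b)


def mean_divergence_by_type(pairs):
    """Average divergence per duplication type.

    pairs is an iterable of (dup_type, profile_a, profile_b), where
    dup_type is a label such as "block", "tandem" or "dispersed".
    """
    totals = {}
    for dup_type, profile_a, profile_b in pairs:
        total, count = totals.get(dup_type, (0.0, 0))
        totals[dup_type] = (total + expression_divergence(profile_a, profile_b),
                            count + 1)
    return {dup_type: total / count
            for dup_type, (total, count) in totals.items()}
```

A linear model with divergence time (for instance the synonymous substitution rate) and duplication type as predictors, as the abstract describes, would take these per-pair divergence values as its response variable.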
10.  NTAP: for NimbleGen tiling array ChIP-chip data analysis 
Bioinformatics  2009;25(14):1838-1840.
Summary: NTAP is designed to analyze ChIP-chip data generated by the NimbleGen tiling array platform and to accomplish various pattern recognition tasks that are useful especially for epigenetic studies. The modular design of NTAP makes the data processing highly customizable. Users can either use NTAP to perform the full process of NimbleGen tiling array data analysis, or choose post-processing modules in NTAP to analyze pre-processed epigenetic data generated by other platforms. The output of NTAP can be saved in standard GFF format files and visualized in GBrowse.
Availability and Implementation: The source code of NTAP is freely available at http://ntap.cbi.pku.edu.cn/. It is implemented in Perl and R and can be used on Linux, Mac and Windows platforms.
Contact: ntap@mail.cbi.pku.edu.cn; luojc@pku.edu.cn; hekun78@gmail.com
doi:10.1093/bioinformatics/btp320
PMCID: PMC2705232  PMID: 19468055
11.  WebLab: a data-centric, knowledge-sharing bioinformatic platform 
Nucleic Acids Research  2009;37(Web Server issue):W33-W39.
With the rapid progress of biological research, there is great demand for integrative knowledge-sharing systems that efficiently support collaboration among biological researchers from various fields. To fulfill such requirements, we have developed a data-centric knowledge-sharing platform, WebLab, for biologists to fetch, analyze, manipulate and share data under an intuitive web interface. Dedicated space is provided for users to store their input data and analysis results. Users can upload local data or fetch public data from remote databases, and then perform analysis using more than 260 integrated bioinformatic tools. These tools can be further organized as customized analysis workflows to accomplish complex tasks automatically. In addition to conventional biological data, WebLab also provides rich support for scientific literature, such as searching against the full text of uploaded literature and exporting citations into various well-known citation managers such as EndNote and BibTeX. To facilitate teamwork among colleagues, WebLab provides a powerful and flexible sharing mechanism, which allows users to share input data, analysis results, scientific literature and customized workflows with specified users or groups with sophisticated privilege settings. WebLab is publicly available at http://weblab.cbi.pku.edu.cn, with all source code released as Free Software.
doi:10.1093/nar/gkp428
PMCID: PMC2703900  PMID: 19465388
12.  PlantTFDB: a comprehensive plant transcription factor database 
Nucleic Acids Research  2007;36(Database issue):D966-D969.
Transcription factors (TFs) play key roles in controlling gene expression. Systematic identification and annotation of TFs, followed by construction of TF databases may serve as useful resources for studying the function and evolution of transcription factors. We developed a comprehensive plant transcription factor database PlantTFDB (http://planttfdb.cbi.pku.edu.cn), which contains 26 402 TFs predicted from 22 species, including five model organisms with available whole genome sequence and 17 plants with available EST sequences. To provide comprehensive information for those putative TFs, we made extensive annotation at both family and gene levels. A brief introduction and key references were presented for each family. Functional domain information and cross-references to various well-known public databases were available for each identified TF. In addition, we predicted putative orthologs of those TFs among the 22 species. PlantTFDB has a simple interface to allow users to search the database by IDs or free texts, to make sequence similarity search against TFs of all or individual species, and to download TF sequences for local analysis.
doi:10.1093/nar/gkm841
PMCID: PMC2238823  PMID: 17933783
13.  Developmental stage related patterns of codon usage and genomic GC content: searching for evolutionary fingerprints with models of stem cell differentiation 
Genome Biology  2007;8(3):R35.
Developmental-stage-related patterns of gene expression correlate with codon usage and genomic GC content in stem cell hierarchies.
Background
The usage of synonymous codons shows considerable variation among mammalian genes. How and why this usage is non-random are fundamental biological questions and remain controversial. It is also important to explore whether mammalian genes that are selectively expressed at different developmental stages bear different molecular features.
Results
In two models of mouse stem cell differentiation, we established correlations between codon usage and the patterns of gene expression. We found that the optimal codons exhibited variation (AT- or GC-ending codons) in different cell types within the developmental hierarchy. We also found that genes that were enriched (developmental-pivotal genes) or specifically expressed (developmental-specific genes) at different developmental stages had different patterns of codon usage and local genomic GC (GCg) content. Moreover, at the same developmental stage, developmental-specific genes generally used more GC-ending codons and had higher GCg content compared with developmental-pivotal genes. Further analyses suggest that the model of translational selection might be consistent with the developmental stage-related patterns of codon usage, especially for the AT-ending optimal codons. In addition, our data show that after human-mouse divergence, the influence of selective constraints is still detectable.
Conclusion
Our findings suggest that developmental stage-related patterns of gene expression are correlated with codon usage (GC3) and GCg content in stem cell hierarchies. Moreover, this paper provides evidence for the influence of natural selection at synonymous sites in the mouse genome and novel clues for linking the molecular features of genes to their patterns of expression during mammalian ontogenesis.
doi:10.1186/gb-2007-8-3-r35
PMCID: PMC1868930  PMID: 17349061
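Editorial illustration for entry 13: the abstract relates gene expression patterns to codon usage (GC3) and genomic GC content. The sketch below assumes the simplest definition of GC3, the fraction of codons whose third base is G or C, computed over every complete codon of an in-frame coding sequence. The function names are hypothetical, and the authors may have used a refined definition (for example, excluding stop codons or codons without synonymous alternatives).

```python
def gc3(cds):
    """Fraction of complete codons whose third position is G or C.

    Assumes cds is an in-frame coding sequence; trailing bases that do
    not form a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third_bases = [cds[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def gc_content(seq):
    """Fraction of G and C bases in a sequence, e.g. a genomic window."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)
```

For the toy sequence `ATGGCCAAATAA` the codons are ATG, GCC, AAA and TAA, so two of the four third positions are G or C.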
14.  Statistical inference of chromosomal homology based on gene colinearity and applications to Arabidopsis and rice 
BMC Bioinformatics  2006;7:447.
Background
The identification of chromosomal homology will shed light on such mysteries of genome evolution as DNA duplication, rearrangement and loss. Several approaches have been developed to detect chromosomal homology based on gene synteny or colinearity. However, the previously reported implementations lack statistical inferences which are essential to reveal actual homologies.
Results
In this study, we present a statistical approach to detect homologous chromosomal segments based on gene colinearity. We implement this approach in a software package ColinearScan to detect putative colinear regions using a dynamic programming algorithm. Statistical models are proposed to estimate proper parameter values and evaluate the significance of putative homologous regions. Statistical inference, high computational efficiency and flexibility of input data type are three key features of our approach.
Conclusion
We apply ColinearScan to the Arabidopsis and rice genomes to detect duplicated regions within each species and homologous fragments between these two species. We find many more homologous chromosomal segments in the rice genome than previously reported. We also find many small colinear segments between rice and Arabidopsis genomes.
doi:10.1186/1471-2105-7-447
PMCID: PMC1626491  PMID: 17038171
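Editorial illustration for entry 14: the abstract mentions detecting colinear regions with a dynamic programming algorithm. The sketch below shows one generic way to chain homologous gene pairs by dynamic programming, assuming genes are indexed by their order on each chromosome and that neighbouring anchors in a chain may be at most `max_gap` positions apart. It is not the ColinearScan algorithm itself, it handles only same-orientation chains, and it omits the statistical significance evaluation that the paper emphasizes. All names are hypothetical.

```python
def longest_colinear_chain(anchors, max_gap=5):
    """Return the longest forward colinear chain of anchor gene pairs.

    anchors: iterable of (pos_a, pos_b), the gene-order positions of a
    homologous gene pair on chromosomes A and B. In the returned chain
    both positions strictly increase, and consecutive anchors are at
    most max_gap positions apart on each chromosome.
    """
    anchors = sorted(set(anchors))
    if not anchors:
        return []
    best_len = [1] * len(anchors)   # longest chain ending at each anchor
    previous = [None] * len(anchors)  # back-pointer for traceback
    for k, (pos_a, pos_b) in enumerate(anchors):
        for j in range(k):
            prev_a, prev_b = anchors[j]
            close_on_a = 0 < pos_a - prev_a <= max_gap
            close_on_b = 0 < pos_b - prev_b <= max_gap
            if close_on_a and close_on_b and best_len[j] + 1 > best_len[k]:
                best_len[k] = best_len[j] + 1
                previous[k] = j
    end = max(range(len(anchors)), key=best_len.__getitem__)
    chain = []
    while end is not None:
        chain.append(anchors[end])
        end = previous[end]
    return chain[::-1]
```

The quadratic loop is adequate for a sketch; a production tool would restrict the inner loop to anchors within the gap window.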
15.  KOBAS server: a web-based platform for automated annotation and pathway identification 
Nucleic Acids Research  2006;34(Web Server issue):W720-W724.
There is an increasing need to automatically annotate a set of genes or proteins (from genome sequencing, DNA microarray analysis or protein 2D gel experiments) using controlled vocabularies and identify the pathways involved, especially the statistically enriched pathways. We have previously demonstrated the KEGG Orthology (KO) as an effective alternative controlled vocabulary and developed a standalone KO-Based Annotation System (KOBAS). Here we report a KOBAS server with a friendly web-based user interface and enhanced functionalities. The server supports input by nucleotide or amino acid sequences or by sequence identifiers in popular databases and can annotate the input with KO terms and KEGG pathways by BLAST sequence similarity or by direct ID mapping to genes with known annotations. The server can then identify both frequent and statistically enriched pathways, offering a choice of four statistical tests and the option of multiple testing correction. The server also has a ‘User Space’ in which frequent users may store and manage their data and results online. We demonstrate the usability of the server by finding statistically enriched pathways in a set of upregulated genes in Alzheimer's Disease (AD) hippocampal cornu ammonis 1 (CA1). The KOBAS server can be accessed at .
doi:10.1093/nar/gkl167
PMCID: PMC1538915  PMID: 16845106
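Editorial illustration for entry 15: the abstract describes identifying statistically enriched pathways with a choice of statistical tests and optional multiple testing correction, without naming them. The sketch below assumes two common choices, a one-sided hypergeometric test and the Benjamini-Hochberg correction. Both choices and all function names are assumptions for illustration and are not claimed to be what the KOBAS server implements.

```python
from math import comb


def hypergeom_pvalue(hits, sample_size, pathway_size, background_size):
    """P(X >= hits) when sample_size genes are drawn without replacement
    from background_size genes, pathway_size of which are in the pathway."""
    upper = min(sample_size, pathway_size)
    outside = background_size - pathway_size
    favourable = sum(comb(pathway_size, i) * comb(outside, sample_size - i)
                     for i in range(hits, upper + 1))
    return favourable / comb(background_size, sample_size)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

In use, one p-value would be computed per pathway from the input gene set, and the whole list of p-values would then be passed to `benjamini_hochberg` before applying a significance cutoff.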
16.  GBA server: EST-based digital gene expression profiling 
Nucleic Acids Research  2005;33(Web Server issue):W673-W676.
Expressed Sequence Tag-based gene expression profiling can be used to discover functionally associated genes on a large scale. Currently available web servers and tools focus on finding differentially expressed genes in different samples or tissues rather than finding co-expressed genes. To fill this gap, we have developed a web server that implements the GBA (Guilt-by-Association) co-expression algorithm, which has been successfully used in finding disease-related genes. We have also annotated UniGene clusters with links to several important databases such as GO, KEGG, OMIM, Gene, IPI and HomoloGene. The GBA server can be accessed and downloaded at .
doi:10.1093/nar/gki480
PMCID: PMC1160240  PMID: 15980560
17.  RDfolder: a web server for prediction of RNA secondary structure 
Nucleic Acids Research  2004;32(Web Server issue):W150-W153.
Prediction of RNA secondary structure is important in the functional analysis of RNA molecules. The RDfolder web server described in this paper provides two methods for prediction of RNA secondary structure: random stacking of helical regions and helical regions distribution. The random stacking method predicts secondary structure by Monte Carlo simulations. The method of helical regions distribution predicts secondary structure based on the helices that appear most frequently in the set of structures, which are generated by the random stacking method. The RDfolder web server can be accessed at http://rna.cbi.pku.edu.cn.
doi:10.1093/nar/gkh445
PMCID: PMC441583  PMID: 15215369
18.  PCAS – a precomputed proteome annotation database resource 
BMC Genomics  2003;4:42.
Background
Many model proteomes or "complete" sets of proteins of given organisms are now publicly available. Much effort has been invested in computational annotation of those "draft" proteomes. Motif or domain based algorithms play a pivotal role in functional classification of proteins. Employing most available computational algorithms, mainly motif or domain recognition algorithms, we set out to develop an online proteome annotation system with integrated proteome annotation data to complement existing resources.
Results
We report here the development of PCAS (ProteinCentric Annotation System) as an online resource of pre-computed proteome annotation data. We applied most available motif or domain databases and their analysis methods, including hmmpfam search of HMMs in Pfam, SMART and TIGRFAM, RPS-PSIBLAST search of PSSMs in CDD, pfscan of PROSITE patterns and profiles, as well as PSI-BLAST search of SUPERFAMILY PSSMs. In addition, signal peptides and transmembrane (TM) regions are predicted using SignalP and TMHMM, respectively. We mapped SUPERFAMILY and COGs to InterPro, so the motif or domain databases are integrated through InterPro. PCAS displays table summaries of pre-computed data and a graphical presentation of motifs or domains relative to the protein. As of now, PCAS contains the human IPI, mouse IPI, rat IPI, A. thaliana, C. elegans, D. melanogaster, S. cerevisiae and S. pombe proteomes.
PCAS is available at
Conclusion
PCAS gives better annotation coverage for model proteomes by employing a wider collection of available algorithms. Besides presenting the most confident annotation data, PCAS also allows customized queries so users can inspect statistically less significant boundary information as well. Therefore, besides providing general annotation information, PCAS could be used as a discovery platform. We plan to update PCAS twice a year. We will upgrade PCAS when new proteome annotation algorithms are identified.
doi:10.1186/1471-2164-4-42
PMCID: PMC293463  PMID: 14594458
