Results 1-15 (15)
 

author:("Mi, huayu")
1.  Meeting report from the fourth meeting of the Computational Modeling in Biology Network (COMBINE) 
Standards in Genomic Sciences  2014;9(3):1285-1301.
The Computational Modeling in Biology Network (COMBINE) is an initiative to coordinate the development of community standards and formats in computational systems biology and related fields. This report summarizes the topics and activities of the fourth edition of the annual COMBINE meeting, held in Paris during September 16-20, 2013, and attended by a total of 96 people. This edition pioneered a first day devoted to modeling approaches in biology, which attracted a broad audience of scientists thanks to a panel of renowned speakers. During subsequent days, discussions were held on many subjects including the introduction of new features in the various COMBINE standards, new software tools that use the standards, and outreach efforts. Significant emphasis went into work on extensions of the SBML format, and also into community-building. This year’s edition once again demonstrated that the COMBINE community is thriving, and still manages to help coordinate activities between different standards in computational systems biology.
doi:10.4056/sigs.5279417
PMCID: PMC4149000
2.  PortEco: a resource for exploring bacterial biology through high-throughput data and analysis tools 
Nucleic Acids Research  2013;42(D1):D677-D684.
PortEco (http://porteco.org) aims to collect, curate and provide data and analysis tools to support basic biological research in Escherichia coli (and eventually other bacterial systems). PortEco is implemented as a ‘virtual’ model organism database that provides a single unified interface to the user, while integrating information from a variety of sources. The main focus of PortEco is to enable broad use of the growing number of high-throughput experiments available for E. coli, and to leverage community annotation through the EcoliWiki and GONUTS systems. Currently, PortEco includes curated data from hundreds of genome-wide RNA expression studies, from high-throughput phenotyping of single-gene knockouts under hundreds of annotated conditions, from chromatin immunoprecipitation experiments for tens of different DNA-binding factors and from ribosome profiling experiments that yield insights into protein expression. Conditions have been annotated with a consistent vocabulary, and data have been consistently normalized to enable users to find, compare and interpret relevant experiments. PortEco includes tools for data analysis, including clustering, enrichment analysis and exploration via genome browsers. PortEco search and data analysis tools are extensively linked to the curated gene, metabolic pathway and regulation content at its sister site, EcoCyc.
doi:10.1093/nar/gkt1203
PMCID: PMC3965092  PMID: 24285306
3.  Dopamine genes and nicotine dependence in treatment seeking and community smokers 
We utilized a cohort of 828 treatment seeking self-identified white cigarette smokers (50% female) to rank candidate gene single nucleotide polymorphisms (SNPs) associated with the Fagerström Test for Nicotine Dependence (FTND), a measure of nicotine dependence which assesses quantity of cigarettes smoked and time- and place-dependent characteristics of the respondent’s smoking behavior. 1123 SNPs at 55 autosomal candidate genes, nicotinic acetylcholine receptors and genes involved in dopaminergic function, were tested for association to baseline FTND scores adjusted for age, depression, education, sex and study site. SNP P values were adjusted for the number of transmission models, the number of SNPs tested per candidate gene, and their intragenic correlation. DRD2, SLC6A3 and NR4A2 SNPs with adjusted P values < 0.10 were considered sufficiently noteworthy to justify further genetic, bioinformatic and literature analyses. Each independent signal among the top-ranked SNPs accounted for ~1% of the FTND variance in this sample. The DRD2 SNP appears to represent a novel association with nicotine dependence. The SLC6A3 SNPs have previously been shown to be associated with SLC6A3 transcription or dopamine transporter density in vitro, in vivo and ex vivo. Analysis of SLC6A3 and NR4A2 SNPs identified a statistically significant gene-gene interaction (P=0.001), consistent with in vitro evidence that the NR4A2 protein product (NURR1) regulates SLC6A3 transcription. A community cohort of N=175 multiplex ever smoking pedigrees (N=423 ever smokers) provided nominal evidence for association with the FTND at these top ranked SNPs, uncorrected for multiple comparisons.
doi:10.1038/npp.2009.52
PMCID: PMC3558036  PMID: 19494806
dopamine transporter; Fagerström Test for Nicotine Dependence; single nucleotide polymorphism; candidate gene association scan; gene-gene interaction
4.  BioPAX support in CellDesigner 
Bioinformatics  2011;27(24):3437-3438.
Motivation: BioPAX is a standard language for representing and exchanging models of biological processes at the molecular and cellular levels. It is widely used by different pathway databases and genomics data analysis software. Currently, the primary source of BioPAX data is direct exports from the curated pathway databases. It is still uncommon for wet-lab biologists to share and exchange pathway knowledge using BioPAX. Instead, pathways are usually represented as informal diagrams in the literature. In order to encourage formal representation of pathways, we describe a software package that allows users to create pathway diagrams using CellDesigner, a user-friendly graphical pathway-editing tool, and save the pathway data in BioPAX Level 3 format.
Availability: The plug-in is freely available and can be downloaded at ftp://ftp.pantherdb.org/CellDesigner/plugins/BioPAX/
Contact: huaiyumi@usc.edu
Supplementary Information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btr586
PMCID: PMC3232372  PMID: 22021903
5.  PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees 
Nucleic Acids Research  2012;41(Database issue):D377-D386.
The data and tools in PANTHER—a comprehensive, curated database of protein families, trees, subfamilies and functions available at http://pantherdb.org—have undergone continual, extensive improvement for over a decade. Here, we describe the current PANTHER process as a whole, as well as the website tools for analysis of user-uploaded data. The main goals of PANTHER remain essentially unchanged: the accurate inference (and practical application) of gene and protein function over large sequence databases, using phylogenetic trees to extrapolate from the relatively sparse experimental information from a few model organisms. Yet the focus of PANTHER has continually shifted toward more accurate and detailed representations of evolutionary events in gene family histories. The trees are now designed to represent gene family evolution, including inference of evolutionary events, such as speciation and gene duplication. Subfamilies are still curated and used to define HMMs, but gene ontology functional annotations can now be made at any node in the tree, and are designed to represent gain and loss of function by ancestral genes during evolution. Finally, PANTHER now includes stable database identifiers for inferred ancestral genes, which are used to associate inferred gene attributes with particular genes in the common ancestral genomes of extant species.
doi:10.1093/nar/gks1118
PMCID: PMC3531194  PMID: 23193289
6.  PharmGKB summary: dopamine receptor D2 
Pharmacogenetics and Genomics  2011;21(6):350-356.
doi:10.1097/FPC.0b013e32833ee605
PMCID: PMC3091980  PMID: 20736885
dopamine receptor D2; PharmGKB; rs1799732; rs1800497; rs6277; rs1801028
8.  Software support for SBGN maps: SBGN-ML and LibSBGN 
Bioinformatics  2012;28(15):2016-2021.
Motivation: LibSBGN is a software library for reading, writing and manipulating Systems Biology Graphical Notation (SBGN) maps stored using the recently developed SBGN-ML file format. The library (available in C++ and Java) makes it easy for developers to add SBGN support to their tools, whereas the file format facilitates the exchange of maps between compatible software applications. The library also supports validation of maps, which simplifies the task of ensuring compliance with the detailed SBGN specifications. With this effort we hope to increase the adoption of SBGN in bioinformatics tools, ultimately enabling more researchers to visualize biological knowledge in a precise and unambiguous manner.
Availability and implementation: Milestone 2 was released in December 2011. Source code, example files and binaries are freely available under the terms of either the LGPL v2.1+ or Apache v2.0 open source licenses from http://libsbgn.sourceforge.net.
Contact: sbgn-libsbgn@lists.sourceforge.net
doi:10.1093/bioinformatics/bts270
PMCID: PMC3400951  PMID: 22581176
9.  InterPro in 2011: new developments in the family and domain prediction database 
Nucleic Acids Research  2011;40(Database issue):D306-D312.
InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.
doi:10.1093/nar/gkr948
PMCID: PMC3245097  PMID: 22096229
10.  BioPAX – A community standard for pathway data sharing 
Demir, Emek | Cary, Michael P. | Paley, Suzanne | Fukuda, Ken | Lemer, Christian | Vastrik, Imre | Wu, Guanming | D’Eustachio, Peter | Schaefer, Carl | Luciano, Joanne | Schacherer, Frank | Martinez-Flores, Irma | Hu, Zhenjun | Jimenez-Jacinto, Veronica | Joshi-Tope, Geeta | Kandasamy, Kumaran | Lopez-Fuentes, Alejandra C. | Mi, Huaiyu | Pichler, Elgar | Rodchenkov, Igor | Splendiani, Andrea | Tkachev, Sasha | Zucker, Jeremy | Gopinath, Gopal | Rajasimha, Harsha | Ramakrishnan, Ranjani | Shah, Imran | Syed, Mustafa | Anwar, Nadia | Babur, Ozgun | Blinov, Michael | Brauner, Erik | Corwin, Dan | Donaldson, Sylva | Gibbons, Frank | Goldberg, Robert | Hornbeck, Peter | Luna, Augustin | Murray-Rust, Peter | Neumann, Eric | Reubenacker, Oliver | Samwald, Matthias | van Iersel, Martijn | Wimalaratne, Sarala | Allen, Keith | Braun, Burk | Whirl-Carrillo, Michelle | Dahlquist, Kam | Finney, Andrew | Gillespie, Marc | Glass, Elizabeth | Gong, Li | Haw, Robin | Honig, Michael | Hubaut, Olivier | Kane, David | Krupa, Shiva | Kutmon, Martina | Leonard, Julie | Marks, Debbie | Merberg, David | Petri, Victoria | Pico, Alex | Ravenscroft, Dean | Ren, Liya | Shah, Nigam | Sunshine, Margot | Tang, Rebecca | Whaley, Ryan | Letovksy, Stan | Buetow, Kenneth H. | Rzhetsky, Andrey | Schachter, Vincent | Sobral, Bruno S. | Dogrusoz, Ugur | McWeeney, Shannon | Aladjem, Mirit | Birney, Ewan | Collado-Vides, Julio | Goto, Susumu | Hucka, Michael | Le Novère, Nicolas | Maltsev, Natalia | Pandey, Akhilesh | Thomas, Paul | Wingender, Edgar | Karp, Peter D. | Sander, Chris | Bader, Gary D.
Nature Biotechnology  2010;28(9):935-942.
BioPAX (Biological Pathway Exchange) is a standard language to represent biological pathways at the molecular and cellular level. Its major use is to facilitate the exchange of pathway data (http://www.biopax.org). Pathway data captures our understanding of biological processes, but its rapid growth necessitates development of databases and computational tools to aid interpretation. However, the current fragmentation of pathway information across many databases with incompatible formats presents barriers to its effective use. BioPAX solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. BioPAX was created through a community process. Through BioPAX, millions of interactions organized into thousands of pathways across many organisms, from a growing number of sources, are available. Thus, large amounts of pathway data are available in a computable form to support visualization, analysis and biological discovery.
doi:10.1038/nbt.1666
PMCID: PMC3001121  PMID: 20829833
pathway data integration; pathway database; standard exchange format; ontology; information system
11.  Ontologies and Standards in Bioscience Research: For Machine or for Human 
Ontologies and standards are very important parts of today's bioscience research. With the rapid increase of biological knowledge, they provide mechanisms to better store and represent data in a controlled and structured way, so that scientists can share the data, and utilize a wide variety of software and tools to manage and analyze the data. Most of these standards are initially designed for computers to access large amounts of data that are difficult for human biologists to handle, and it is important to keep in mind that ultimately biologists are going to produce and interpret the data. While ontologies and standards must follow strict semantic rules that may not be familiar to biologists, effort must be spent to lower the learning barrier by involving biologists in the process of development, and by providing software and tool support. A standard will not succeed without support from the wider bioscience research community. Thus, it is crucial that these standards be designed not only for machines to read, but also to be scientifically accurate and intuitive to human biologists.
doi:10.3389/fphys.2011.00005
PMCID: PMC3081276  PMID: 21519400
ontology; standard; systems biology
12.  PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium 
Nucleic Acids Research  2009;38(Database issue):D204-D210.
Protein Analysis THrough Evolutionary Relationships (PANTHER) is a comprehensive software system for inferring the functions of genes based on their evolutionary relationships. Phylogenetic trees of gene families form the basis for PANTHER and these trees are annotated with ontology terms describing the evolution of gene function from ancestral to modern day genes. One of the main applications of PANTHER is in accurate prediction of the functions of uncharacterized genes, based on their evolutionary relationships to genes with functions known from experiment. The PANTHER website, freely available at http://www.pantherdb.org, also includes software tools for analyzing genomic data relative to known and inferred gene functions. Since 2007, there have been several new developments to PANTHER: (i) improved phylogenetic trees, explicitly representing speciation and gene duplication events, (ii) identification of gene orthologs, including least diverged orthologs (best one-to-one pairs), (iii) coverage of more genomes (48 genomes, up to 87% of genes in each genome; see http://www.pantherdb.org/panther/summaryStats.jsp), (iv) improved support for alternative database identifiers for genes, proteins and microarray probes and (v) adoption of the SBGN standard for display of biological pathways. In addition, PANTHER trees are being annotated with gene function as part of the Gene Ontology Reference Genome project, resulting in an increasing number of curated functional annotations.
doi:10.1093/nar/gkp1019
PMCID: PMC2808919  PMID: 20015972
13.  PANTHER version 6: protein sequence and function evolution data with expanded representation of biological pathways 
Nucleic Acids Research  2006;35(Database issue):D247-D252.
PANTHER is a freely available, comprehensive software system for relating protein sequence evolution to the evolution of specific protein functions and biological roles. Since 2005, there have been three main improvements to PANTHER. First, the sequences used to create evolutionary trees are carefully selected to provide coverage of phylogenetic as well as functional information. Second, PANTHER is now a member of the InterPro Consortium, and the PANTHER hidden Markov models (HMMs) are distributed as part of InterProScan. Third, we have dramatically expanded the number of pathways associated with subfamilies in PANTHER. Pathways provide a detailed, structured representation of protein function in the context of biological reaction networks. PANTHER pathways were generated in the emerging Systems Biology Markup Language (SBML) standard using pathway network editing software called CellDesigner. The pathway collection currently contains ∼1500 reactions in 130 pathways, curated by expert biologists with authorship attribution. The curation environment is designed to be easy to use, and the number of pathways is growing steadily. Because the reaction participants are linked to subfamilies and corresponding HMMs, reactions can be inferred across numerous different organisms. The HMMs can be downloaded by FTP, and tools for analyzing data in the context of pathways and function ontologies are available at .
doi:10.1093/nar/gkl869
PMCID: PMC1716723  PMID: 17130144
14.  Applications for protein sequence–function evolution data: mRNA/protein expression analysis and coding SNP scoring tools 
Nucleic Acids Research  2006;34(Web Server issue):W645-W650.
The vast amount of protein sequence data now available, together with accumulating experimental knowledge of protein function, enables modeling of protein sequence and function evolution. The PANTHER database was designed to model evolutionary sequence–function relationships on a large scale. There are a number of applications for these data, and we have implemented web services that address three of them. The first is a protein classification service. Proteins can be classified, using only their amino acid sequences, to evolutionary groups at both the family and subfamily levels. Specific subfamilies, and often families, are further classified when possible according to their functions, including molecular function and the biological processes and pathways they participate in. The second application, then, is an expression data analysis service, where functional classification information can help find biological patterns in the data obtained from genome-wide experiments. The third application is a coding single-nucleotide polymorphism scoring service. In this case, information about evolutionarily related proteins is used to assess the likelihood of a deleterious effect on protein function arising from a single substitution at a specific amino acid position in the protein. All three web services are available at .
doi:10.1093/nar/gkl229
PMCID: PMC1538848  PMID: 16912992
15.  PANTHER: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification 
Nucleic Acids Research  2003;31(1):334-341.
The PANTHER database was designed for high-throughput analysis of protein sequences. One of the key features is a simplified ontology of protein function, which allows browsing of the database by biological functions. Biologist curators have associated the ontology terms with groups of protein sequences rather than individual sequences. Statistical models (Hidden Markov Models, or HMMs) are built from each of these groups. The advantage of this approach is that new sequences can be automatically classified as they become available. To ensure accurate functional classification, HMMs are constructed not only for families, but also for functionally distinct subfamilies. Multiple sequence alignments and phylogenetic trees, including curator-assigned information, are available for each family. The current version of the PANTHER database includes training sequences from all organisms in the GenBank non-redundant protein database, and the HMMs have been used to classify gene products across the entire genomes of human and Drosophila melanogaster. PANTHER is publicly available on the web at http://panther.celera.com.
PMCID: PMC165562  PMID: 12520017
