Results 1-18 (18)
 

author:("Lee, inuk")
1.  MORPHIN: a web tool for human disease research by projecting model organism biology onto a human integrated gene network 
Nucleic Acids Research  2014;42(Web Server issue):W147-W153.
Despite recent advances in human genetics, model organisms are indispensable for human disease research. Most human disease pathways are evolutionally conserved among other species, where they may phenocopy the human condition or be associated with seemingly unrelated phenotypes. Much of the known gene-to-phenotype association information is distributed across diverse databases, growing rapidly due to new experimental techniques. Accessible bioinformatics tools will therefore facilitate translation of discoveries from model organisms into human disease biology. Here, we present a web-based discovery tool for human disease studies, MORPHIN (model organisms projected on a human integrated gene network), which prioritizes the most relevant human diseases for a given set of model organism genes, potentially highlighting new model systems for human diseases and providing context to model organism studies. Conceptually, MORPHIN investigates human diseases by an orthology-based projection of a set of model organism genes onto a genome-scale human gene network. MORPHIN then prioritizes human diseases by relevance to the projected model organism genes using two distinct methods: a conventional overlap-based gene set enrichment analysis and a network-based measure of closeness between the query and disease gene sets capable of detecting associations undetectable by the conventional overlap-based methods. MORPHIN is freely accessible at http://www.inetbio.org/morphin.
doi:10.1093/nar/gku434
PMCID: PMC4086117  PMID: 24861622
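The "conventional overlap-based gene set enrichment analysis" mentioned in the MORPHIN abstract above is commonly done with a one-sided hypergeometric test. The following is a minimal sketch under that assumption; the abstract does not state which test MORPHIN uses, and the function name, gene labels, and background size are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(query, disease, background_size):
    """Return (overlap, p-value) for a one-sided hypergeometric test.

    The p-value is the probability of seeing at least the observed overlap
    when len(query) genes are drawn at random from the background.
    Assumption: this is a generic enrichment test, not MORPHIN's own code.
    """
    query, disease = set(query), set(disease)
    k = len(query & disease)          # observed overlap
    n, K, N = len(query), len(disease), background_size
    total = comb(N, n)                # all possible query sets of size n
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return k, tail / total


# Toy data (hypothetical gene labels, background of 20 genes).
query_genes = {"G1", "G2", "G3", "G4"}
disease_genes = {"G3", "G4", "G5"}
overlap, p_value = hypergeom_enrichment_pvalue(query_genes, disease_genes, 20)
print(overlap, round(p_value, 4))
```

An overlap test of this kind cannot score a disease whose genes are only network neighbors of the query set, which is the gap the abstract's second, network-based closeness measure is meant to cover.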
2.  A showcase of future plant biology: moving towards next-generation plant genetics assisted by genome sequencing and systems biology 
Genome Biology  2014;15(5):305.
A report on the Cold Spring Harbor Asia conference on Genome Assisted Biology of Crops and Model Plant Systems Meeting, held in Suzhou, China, April 21–25, 2014.
doi:10.1186/gb4176
PMCID: PMC4053818  PMID: 25001400
3.  WormNet v3: a network-assisted hypothesis-generating server for Caenorhabditis elegans 
Nucleic Acids Research  2014;42(Web Server issue):W76-W82.
High-throughput experimental technologies gradually shift the paradigm of biological research from hypothesis-validation toward hypothesis-generation science. Translating diverse types of large-scale experimental data into testable hypotheses, however, remains a daunting task. We previously demonstrated that heterogeneous genomics data can be integrated into a single genome-scale gene network with high prediction power for ribonucleic acid interference (RNAi) phenotypes in Caenorhabditis elegans, a popular metazoan model in the study of developmental biology, neurobiology and genetics. Here, we present WormNet version 3 (v3), which is a new network-assisted hypothesis-generating server for C. elegans. WormNet v3 includes major updates to the base gene network, which substantially improved predictions of RNAi phenotypes. The server generates various gene network-based hypotheses using three complementary network methods: (i) a phenotype-centric approach to ‘find new members for a pathway’; (ii) a gene-centric approach to ‘infer functions from network neighbors’ and (iii) a context-centric approach to ‘find context-associated hub genes’, which is a new method to identify key genes that mediate physiology within a specific context. For example, we demonstrated that the context-centric approach can be used to identify potential molecular targets of toxic chemicals. WormNet v3 is freely accessible at http://www.inetbio.org/wormnet.
doi:10.1093/nar/gku367
PMCID: PMC4086142  PMID: 24813450
4.  YeastNet v3: a public database of data-specific and integrated functional gene networks for Saccharomyces cerevisiae 
Nucleic Acids Research  2013;42(Database issue):D731-D736.
Saccharomyces cerevisiae, i.e. baker’s yeast, is a widely studied model organism in eukaryote genetics because of its simple protocols for genetic manipulation and phenotype profiling. The high abundance of publicly available data that has been generated through diverse ‘omics’ approaches has led to the use of yeast for many systems biology studies, including large-scale gene network modeling to better understand the molecular basis of the cellular phenotype. We have previously developed a genome-scale gene network for yeast, YeastNet v2, which has been used for various genetics and systems biology studies. Here, we present an updated version, YeastNet v3 (available at http://www.inetbio.org/yeastnet/), that significantly improves the prediction of gene–phenotype associations. The extended genome in YeastNet v3 covers up to 5818 genes (∼99% of the coding genome) wired by 362 512 functional links. YeastNet v3 provides a new web interface to run the tools for network-guided hypothesis generations. YeastNet v3 also provides edge information for all data-specific networks (∼2 million functional links) as well as the integrated networks. Therefore, users can construct alternative versions of the integrated network by applying their own data integration algorithm to the same data-specific links.
doi:10.1093/nar/gkt981
PMCID: PMC3965021  PMID: 24165882
5.  JiffyNet: a web-based instant protein network modeler for newly sequenced species 
Nucleic Acids Research  2013;41(Web Server issue):W192-W197.
Revolutionary DNA sequencing technology has enabled affordable genome sequencing for numerous species. Thousands of species already have completely decoded genomes, and tens of thousands more are in progress. Naturally, parallel expansion of the functional parts list library is anticipated, yet genome-level understanding of function also requires maps of functional relationships, such as functional protein networks. Such networks have been constructed for many sequenced species including common model organisms. Nevertheless, the majority of species with sequenced genomes still have no protein network models available. Moreover, biologists might want to obtain protein networks for their species of interest on completion of the genome projects. Therefore, there is high demand for accessible means to automatically construct genome-scale protein networks based on sequence information from genome projects only. Here, we present a public web server, JiffyNet, specifically designed to instantly construct genome-scale protein networks based on associalogs (functional associations transferred from a template network by orthology) for a query species with only protein sequences provided. Assessment of the networks by JiffyNet demonstrated generally high predictive ability for pathway annotations. Furthermore, JiffyNet provides network visualization and analysis pages for a wide variety of molecular concepts to facilitate network-guided hypothesis generation. JiffyNet is freely accessible at http://www.jiffynet.org.
doi:10.1093/nar/gkt419
PMCID: PMC3692116  PMID: 23685435
6.  Systematic prediction of gene function in Arabidopsis thaliana using a probabilistic functional gene network 
Nature protocols  2011;6(9):10.1038/nprot.2011.372.
AraNet is a functional gene network for the reference plant Arabidopsis and has been constructed in order to identify new genes associated with plant traits. It is highly predictive for diverse biological pathways and can be used to prioritize genes for functional screens. Moreover, AraNet provides a web-based tool with which plant biologists can efficiently discover novel functions of Arabidopsis genes (http://www.functionalnet.org/aranet/). This protocol explains how to conduct network-based prediction of gene functions using AraNet and how to interpret the prediction results. Functional discovery in plant biology is facilitated by combining candidate prioritization by AraNet with focused experimental tests.
doi:10.1038/nprot.2011.372
PMCID: PMC3654671  PMID: 21886106
7.  RIDDLE: reflective diffusion and local extension reveal functional associations for unannotated gene sets via proximity in a gene network 
Genome Biology  2012;13(12):R125.
The growing availability of large-scale functional networks has promoted the development of many successful techniques for predicting functions of genes. Here we extend these network-based principles and techniques to functionally characterize whole sets of genes. We present RIDDLE (Reflective Diffusion and Local Extension), which uses well developed guilt-by-association principles upon a human gene network to identify associations of gene sets. RIDDLE is particularly adept at characterizing sets with no annotations, a major challenge where most traditional set analyses fail. Notably, RIDDLE found microRNA-450a to be strongly implicated in ocular diseases and development. A web application is available at http://www.functionalnet.org/RIDDLE.
doi:10.1186/gb-2012-13-12-r125
PMCID: PMC4056375  PMID: 23268829
8.  Metabolomics as a Hypothesis-Generating Functional Genomics Tool for the Annotation of Arabidopsis thaliana Genes of “Unknown Function” 
Metabolomics is the methodology that identifies and measures global pools of small molecules (of less than about 1,000 Da) of a biological sample, which are collectively called the metabolome. Metabolomics can therefore reveal the metabolic outcome of a genetic or environmental perturbation of a metabolic regulatory network, and thus provide insights into the structure and regulation of that network. Because of the chemical complexity of the metabolome and limitations associated with individual analytical platforms for determining the metabolome, it is currently difficult to capture the complete metabolome of an organism or tissue, which is in contrast to genomics and transcriptomics. This paper describes the analysis of Arabidopsis metabolomics data sets acquired by a consortium that includes five analytical laboratories, bioinformaticists, and biostatisticians, which aims to develop and validate metabolomics as a hypothesis-generating functional genomics tool. The consortium is determining the metabolomes of Arabidopsis T-DNA mutant stocks, grown in standardized controlled environment optimized to minimize environmental impacts on the metabolomes. Metabolomics data were generated with seven analytical platforms, and the combined data is being provided to the research community to formulate initial hypotheses about genes of unknown function (GUFs). A public database (www.PlantMetabolomics.org) has been developed to provide the scientific community with access to the data along with tools to allow for its interactive analysis. Exemplary datasets are discussed to validate the approach, which illustrate how initial hypotheses can be generated from the consortium-produced metabolomics data, integrated with prior knowledge to provide a testable hypothesis concerning the functionality of GUFs.
doi:10.3389/fpls.2012.00015
PMCID: PMC3355754  PMID: 22645570
Arabidopsis; metabolomics; gene annotation; functional genomics; database
9.  Towards Establishment of a Rice Stress Response Interactome 
PLoS Genetics  2011;7(4):e1002020.
Rice (Oryza sativa) is a staple food for more than half the world and a model for studies of monocotyledonous species, which include cereal crops and candidate bioenergy grasses. A major limitation of crop production is imposed by a suite of abiotic and biotic stresses resulting in 30%–60% yield losses globally each year. To elucidate stress response signaling networks, we constructed an interactome of 100 proteins by yeast two-hybrid (Y2H) assays around key regulators of the rice biotic and abiotic stress responses. We validated the interactome using protein–protein interaction (PPI) assays, co-expression of transcripts, and phenotypic analyses. Using this interactome-guided prediction and phenotype validation, we identified ten novel regulators of stress tolerance, including two from protein classes not previously known to function in stress responses. Several lines of evidence support cross-talk between biotic and abiotic stress responses. The combination of focused interactome and systems analyses described here represents significant progress toward elucidating the molecular basis of traits of agronomic importance.
Author Summary
A major limitation of crop production is imposed by a suite of abiotic and biotic stresses resulting in 30%–60% yield losses globally each year. In this paper, we used a yeast-based approach to identify rice proteins that govern the rice stress response. We validated the role of these new proteins using additional analyses to evaluate the function of these genes in rice and assessed whether they serve to positively or negatively regulate the stress response. This approach allowed us to identify ten genes that control resistance to bacterial disease and tolerance to submergence. The combination of approaches described here represents significant progress toward elucidating the molecular basis of traits of agronomic importance.
doi:10.1371/journal.pgen.1002020
PMCID: PMC3077385  PMID: 21533176
10.  Characterising and Predicting Haploinsufficiency in the Human Genome 
PLoS Genetics  2010;6(10):e1001154.
Haploinsufficiency, wherein a single functional copy of a gene is insufficient to maintain normal function, is a major cause of dominant disease. Human disease studies have identified several hundred haploinsufficient (HI) genes. We have compiled a map of 1,079 haplosufficient (HS) genes by systematic identification of genes unambiguously and repeatedly compromised by copy number variation among 8,458 apparently healthy individuals and contrasted the genomic, evolutionary, functional, and network properties between these HS genes and known HI genes. We found that HI genes are typically longer and have more conserved coding sequences and promoters than HS genes. HI genes exhibit higher levels of expression during early development and greater tissue specificity. Moreover, within a probabilistic human functional interaction network HI genes have more interaction partners and greater network proximity to other known HI genes. We built a predictive model on the basis of these differences and annotated 12,443 genes with their predicted probability of being haploinsufficient. We validated these predictions of haploinsufficiency by demonstrating that genes with a high predicted probability of exhibiting haploinsufficiency are enriched among genes implicated in human dominant diseases and among genes causing abnormal phenotypes in heterozygous knockout mice. We have transformed these gene-based haploinsufficiency predictions into haploinsufficiency scores for genic deletions, which we demonstrate to better discriminate between pathogenic and benign deletions than consideration of the deletion size or numbers of genes deleted. These robust predictions of haploinsufficiency support clinical interpretation of novel loss-of-function variants and prioritization of variants and genes for follow-up studies.
Author Summary
Humans, like most complex organisms, have two copies of most genes in their genome, one from the mother and one from the father. This redundancy provides a back-up copy for most genes, should one copy be lost through mutation. For a minority of genes, one functional copy is not enough to sustain normal human function, and mutations causing the loss of function of one of the copies of such genes are a major cause of childhood developmental diseases. Over the past 20 years medical geneticists have identified over 300 such genes, but it is not known how many of the 22,000 genes in our genome may also be sensitive to gene loss. By comparing these ∼300 genes known to be sensitive to gene loss with over 1,000 genes where loss of a single copy does not result in disease, we have identified some key evolutionary and functional similarities between genes sensitive to loss of a single copy. We have used these similarities to predict for most genes in the genome, whether loss of a single copy is likely to result in disease. These predictions will help in the interpretation of mutations seen in patients.
doi:10.1371/journal.pgen.1001154
PMCID: PMC2954820  PMID: 20976243
11.  Rational association of genes with traits using a genome-scale gene network for Arabidopsis thaliana 
Nature biotechnology  2010;28(2):149-156.
Plants are essential sources of food, fiber and renewable energy. Effective methods for manipulating plant traits have important agricultural and economic consequences. We introduce a rational approach for associating genes with plant traits by combined use of a genome-scale functional network and targeted reverse genetic screening. We present a probabilistic network (AraNet) of functional associations among 19,647 (73%) genes of the reference flowering plant Arabidopsis thaliana. AraNet associations have measured precision greater than literature-based protein interactions (21%) for 55% of genes, and are highly predictive for diverse biological pathways. Using AraNet, we found a 10-fold enrichment in identifying early seedling development genes. By interrogating network neighborhoods, we identify At1g80710 (now Drought sensitive 1; Drs1) and At3g05090 (now Lateral root stimulator 1; Lrs1) as novel regulators of drought sensitivity and lateral root development, respectively. AraNet (http://www.functionalnet.org/aranet/) provides a global resource for plant gene function identification and genetic dissection of plant traits.
doi:10.1038/nbt.1603
PMCID: PMC2857375  PMID: 20118918
12.  The planar cell polarity effector Fuz is essential for targeted membrane trafficking, ciliogenesis, and mouse embryonic development 
Nature cell biology  2009;11(10):1225-1232.
The planar cell polarity (PCP) signaling pathway is essential for embryonic development because it governs diverse cellular behaviors, and the “core PCP” proteins, such as Dishevelled and Frizzled, have been extensively characterized [1–4]. By contrast, the “PCP effector” proteins, such as Intu and Fuz, remain largely unstudied [5, 6]. These proteins are essential for PCP signaling, but they have never been investigated in a mammal and their cell biological activities remain entirely unknown. We report here that Fuz mutant mice display neural tube defects, skeletal dysmorphologies, and Hedgehog signaling defects stemming from disrupted ciliogenesis. Using bioinformatics and imaging of an in vivo mucociliary epithelium, we establish a central role for Fuz in membrane trafficking, showing that Fuz is essential for trafficking of cargo to basal bodies and to the apical tips of cilia. Fuz is also essential for exocytosis in secretory cells. Finally, we identify a novel, Rab-related small GTPase as a Fuz interaction partner that is also essential for ciliogenesis and secretion. These results are significant because they provide novel insights into the mechanisms by which developmental regulatory systems like PCP signaling interface with fundamental cellular systems such as the vesicle trafficking machinery.
doi:10.1038/ncb1966
PMCID: PMC2755648  PMID: 19767740
13.  Rational Extension of the Ribosome Biogenesis Pathway Using Network-Guided Genetics 
PLoS Biology  2009;7(10):e1000213.
Gene networks are an efficient route for associating candidate genes with biological processes. Here, networks are used to discover more than 15 new genes for ribosomal subunit maturation, rRNA processing, and ribosomal export from the nucleus.
Biogenesis of ribosomes is an essential cellular process conserved across all eukaryotes and is known to require >170 genes for the assembly, modification, and trafficking of ribosome components through multiple cellular compartments. Despite intensive study, this pathway likely involves many additional genes. Here, we employ network-guided genetics—an approach for associating candidate genes with biological processes that capitalizes on recent advances in functional genomic and proteomic studies—to computationally identify additional ribosomal biogenesis genes. We experimentally evaluated >100 candidate yeast genes in a battery of assays, confirming involvement of at least 15 new genes, including previously uncharacterized genes (YDL063C, YIL091C, YOR287C, YOR006C/TSR3, YOL022C/TSR4). We associate the new genes with specific aspects of ribosomal subunit maturation, ribosomal particle association, and ribosomal subunit nuclear export, and we identify genes specifically required for the processing of 5S, 7S, 20S, 27S, and 35S rRNAs. These results reveal new connections between ribosome biogenesis and mRNA splicing and add >10% new genes—most with human orthologs—to the biogenesis pathway, significantly extending our understanding of a universally conserved eukaryotic process.
Author Summary
Ribosomes are the extremely complex cellular machines responsible for constructing new proteins. In eukaryotic cells, such as yeast, each ribosome contains more than 80 protein or RNA components. These complex machines must themselves be assembled by an even more complex machinery spanning multiple cellular compartments and involving perhaps 200 components in an ordered series of processing events, resulting in delivery of the two halves of the mature ribosome, the 40S and 60S components, to the cytoplasm. The ribosome biogenesis machinery has been only partially characterized, and many lines of evidence suggest that there are additional components that are still unknown. We employed an emerging computational technique called network-guided genetics to identify new candidate genes for this pathway. We then tested the candidates in a battery of experimental assays to determine what roles the genes might play in the biogenesis of ribosomes. This approach proved an efficient route to the discovery of new genes involved in ribosome biogenesis, significantly extending our understanding of a universally conserved eukaryotic process.
doi:10.1371/journal.pbio.1000213
PMCID: PMC2749941  PMID: 19806183
14.  Broad network-based predictability of Saccharomyces cerevisiae gene loss-of-function phenotypes 
Genome Biology  2007;8(12):R258.
Loss-of-function phenotypes of yeast genes can be predicted from the loss-of-function phenotypes of their neighbours in functional gene networks. This could potentially be applied to the prediction of human disease genes.
We demonstrate that loss-of-function yeast phenotypes are predictable by guilt-by-association in functional gene networks. Testing 1,102 loss-of-function phenotypes from genome-wide assays of yeast reveals predictability of diverse phenotypes, spanning cellular morphology, growth, metabolism, and quantitative cell shape features. We apply the method to extend a genome-wide screen by predicting, then verifying, genes whose disruption elongates yeast cells, and to predict human disease genes. To facilitate network-guided screens, a web server is available.
doi:10.1186/gb-2007-8-12-r258
PMCID: PMC2246260  PMID: 18053250
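The guilt-by-association idea in the abstract above (predicting a gene's phenotype from the phenotypes of its network neighbors) can be illustrated with a small sketch. It assumes a simple scoring rule, the sum of edge weights linking a candidate gene to genes already known to show the phenotype; the abstract does not specify the scoring function, and the gene names and weights below are hypothetical.

```python
from collections import defaultdict


def guilt_by_association(edges, seeds):
    """Rank non-seed genes by the summed weight of their links to seed genes.

    edges: iterable of (gene_a, gene_b, weight) for an undirected network.
    seeds: genes already known to show the phenotype.
    Assumption: sum-of-weights scoring is one common choice, used here
    only for illustration.
    """
    seeds = set(seeds)
    scores = defaultdict(float)
    for a, b, weight in edges:
        if a in seeds and b not in seeds:
            scores[b] += weight
        elif b in seeds and a not in seeds:
            scores[a] += weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy network with hypothetical genes A-E.
toy_edges = [
    ("A", "B", 2.0),
    ("A", "C", 1.0),
    ("B", "C", 1.5),
    ("C", "D", 3.0),
    ("B", "E", 0.5),
]
ranking = guilt_by_association(toy_edges, seeds={"A", "B"})
print(ranking)
```

Genes with no direct link to a seed (D in the toy network) receive no score under this rule, so they are absent from the ranking rather than listed with zero.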
15.  An Improved, Bias-Reduced Probabilistic Functional Gene Network of Baker's Yeast, Saccharomyces cerevisiae 
PLoS ONE  2007;2(10):e988.
Background
Probabilistic functional gene networks are powerful theoretical frameworks for integrating heterogeneous functional genomics and proteomics data into objective models of cellular systems. Such networks provide syntheses of millions of discrete experimental observations, spanning DNA microarray experiments, physical protein interactions, genetic interactions, and comparative genomics; the resulting networks can then be easily applied to generate testable hypotheses regarding specific gene functions and associations.
Methodology/Principal Findings
We report a significantly improved version (v. 2) of a probabilistic functional gene network [1] of the baker's yeast, Saccharomyces cerevisiae. We describe our optimization methods and illustrate their effects in three major areas: the reduction of functional bias in network training reference sets, the application of a probabilistic model for calculating confidences in pair-wise protein physical or genetic interactions, and the introduction of simple thresholds that eliminate many false positive mRNA co-expression relationships. Using the network, we predict and experimentally verify the function of the yeast RNA binding protein Puf6 in 60S ribosomal subunit biogenesis.
Conclusions/Significance
YeastNet v. 2, constructed using these optimizations together with additional data, shows significant reduction in bias and improvements in precision and recall, in total covering 102,803 linkages among 5,483 yeast proteins (95% of the validated proteome). YeastNet is available from http://www.yeastnet.org.
doi:10.1371/journal.pone.0000988
PMCID: PMC1991590  PMID: 17912365
16.  A high-accuracy consensus map of yeast protein complexes reveals modular nature of gene essentiality 
BMC Bioinformatics  2007;8:236.
Background
Identifying all protein complexes in an organism is a major goal of systems biology. In the past 18 months, the results of two genome-scale tandem affinity purification-mass spectrometry (TAP-MS) assays in yeast have been published, along with corresponding complex maps. For most complexes, the published data sets were surprisingly uncorrelated. It is therefore useful to consider the raw data from each study and generate an accurate complex map from a high-confidence data set that integrates the results of these and earlier assays.
Results
Using an unsupervised probabilistic scoring scheme, we assigned a confidence score to each interaction in the matrix-model interpretation of the large-scale yeast mass-spectrometry data sets. The scoring metric proved more accurate than the filtering schemes used in the original data sets. We then took a high-confidence subset of these interactions and derived a set of complexes using MCL. The complexes show high correlation with existing annotations. Hierarchical organization of some protein complexes is evident from inter-complex interactions.
Conclusion
We demonstrate that our scoring method can generate an integrated high-confidence subset of observed matrix-model interactions, which we subsequently used to derive an accurate map of yeast complexes. Our results indicate that essentiality is a product of the protein complex rather than the individual protein, and that we have achieved near saturation of the yeast high-abundance, rich-media-expressed "complex-ome."
doi:10.1186/1471-2105-8-236
PMCID: PMC1940025  PMID: 17605818
17.  Patterns of sequence conservation at termini of long terminal repeat (LTR) retrotransposons and DNA transposons in the human genome: lessons from phage Mu 
Nucleic Acids Research  2003;31(15):4531-4540.
Long terminal repeat (LTR) retrotransposons and DNA transposons are transposable elements (TEs) that perform cleavage and transfer at precise DNA positions. Here, we present statistical analyses of sequences found at the termini of precise TEs in the human genome. The results show that the terminal di- and trinucleotides of these TEs are highly conserved. 5′TG…CA3′ occurs most frequently at the termini of LTR retrotransposons, while 5′CAG…CTG3′ occurs most frequently in DNA transposons. Interestingly, these sequences are the most flexible base pair steps in DNA. Both the sequence preference and the degree of conservation of each position within the human LTR dinucleotide termini are remarkably similar to those experimentally demonstrated in transposable phage Mu. We discuss the significance of these observations and their implication for the function of terminal residues in the transposition of precise TEs.
PMCID: PMC169890  PMID: 12888514
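The terminal-sequence tally described in the abstract above (for example, how often 5′TG…CA3′ occurs at element ends) amounts to counting pairs of terminal dinucleotides. The sketch below shows that counting step only; the sequences are made-up examples, not data from the paper, and the function name is hypothetical.

```python
from collections import Counter


def terminal_dinucleotides(sequences):
    """Count (5' dinucleotide, 3' dinucleotide) pairs across sequences.

    Sequences shorter than 4 bases are skipped so the two ends
    never overlap.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        if len(seq) >= 4:
            counts[(seq[:2], seq[-2:])] += 1
    return counts


# Made-up element sequences for illustration.
elements = ["TGTTAACA", "tgaaccca", "TGCATTCA", "CAGGGCTG"]
counts = terminal_dinucleotides(elements)
print(counts.most_common(1))
```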
18.  Saccharomyces cerevisiae RAI1 (YGL246c) Is Homologous to Human DOM3Z and Encodes a Protein That Binds the Nuclear Exoribonuclease Rat1p 
Molecular and Cellular Biology  2000;20(11):4006-4015.
The RAT1 gene of Saccharomyces cerevisiae encodes a 5′→3′ exoribonuclease which plays an essential role in yeast RNA degradation and/or processing in the nucleus. We have cloned a previously uncharacterized gene (YGL246c) that we refer to as RAI1 (Rat1p interacting protein 1). RAI1 is homologous to Caenorhabditis elegans DOM-3 and human DOM3Z. Deletion of RAI1 confers a growth defect which can be complemented by an additional copy of RAT1 on a centromeric vector or by directing Xrn1p, the cytoplasmic homolog of Rat1p, to the nucleus through the addition of a nuclear targeting sequence. Deletion of RAI1 is synthetically lethal with the rat1-1ts mutation and shows genetic interaction with a deletion of SKI2 but not XRN1. Polysome analysis of an rai1 deletion mutant indicated a defect in 60S biogenesis which was nearly fully reversed by high-copy RAT1. Northern blot analysis of rRNAs revealed that rai1 is required for normal 5.8S processing. In the absence of RAI1, 5.8SL was the predominant form of 5.8S and there was an accumulation of 3′-extended forms but not 5′-extended species of 5.8S. In addition, a 27S pre-rRNA species accumulated in the rai1 mutant. Thus, deletion of RAI1 affects both 5′ and 3′ processing reactions of 5.8S rRNA. Consistent with the in vivo data suggesting that RAI1 enhances RAT1 function, purified Rai1p stabilized the in vitro exoribonuclease activity of Rat1p.
PMCID: PMC85771  PMID: 10805743
