Results 1-9 (9)

1.  Installing a Local Copy of the Reactome Web Site and Knowledgebase 
The Reactome project builds, maintains, and publishes a knowledgebase of biological pathways. The information in the knowledgebase is gathered from experts in the field, peer reviewed, edited by Reactome editorial staff, and then published to the Reactome Web site (see UNIT 8.7; Croft et al., 2013). The Reactome software is open source and builds on top of other open-source or freely available software. Reactome data and code can be freely downloaded in their entirety and the Web site installed locally. This allows for more flexible interrogation of the data and also makes it possible to add one’s own information to the knowledgebase.
PMCID: PMC4615588  PMID: 26087747
2.  Comparative RNAi Screens in C. elegans and C. briggsae Reveal the Impact of Developmental System Drift on Gene Function 
PLoS Genetics  2014;10(2):e1004077.
Although two related species may have extremely similar phenotypes, the genetic networks underpinning this conserved biology may have diverged substantially since they last shared a common ancestor. This is termed Developmental System Drift (DSD) and reflects the plasticity of genetic networks. One consequence of DSD is that some orthologous genes will have evolved different in vivo functions in two such phenotypically similar, related species and will therefore have different loss of function phenotypes. Here we report an RNAi screen in C. elegans and C. briggsae to identify such cases. We screened 1333 genes in both species and identified 91 orthologues that have different RNAi phenotypes. Intriguingly, we find that recently evolved genes of unknown function have the fastest evolving in vivo functions and, in several cases, we identify the molecular events driving these changes. We thus find that DSD has a major impact on the evolution of gene function and we anticipate that the C. briggsae RNAi library reported here will drive future studies on comparative functional genomics screens in these nematodes.
Author Summary
Although two related species may appear similar, the genetic pathways that underpin this shared biology may have drifted and changed. This phenomenon is known as Developmental System Drift (DSD). One consequence of DSD is that equivalent genes may play different roles in phenotypically similar, related species but there have been no systematic studies to examine this. How often do genes have different functions in similar species? Are certain genes more likely to change functions? Finally, what are the molecular changes that drive this? Here, we compare the effects of reducing the levels of over 1300 different genes in two species of nematode worm. These worms are very similar: they live in the same ecological niche, and have near-identical development and behavior. We find that over 25% of conserved genes have different functions in these two species, showing that DSD has a major impact on how gene function evolves. Intriguingly, we find that genes that have arisen recently are most likely to change functions and that this is often driven by changes in their expression. This is the first systematic comparison of loss of function phenotypes in related species and sheds light on how genetic pathways rewire during DSD.
PMCID: PMC3916228  PMID: 24516395
3.  The taxonomic name resolution service: an online tool for automated standardization of plant names 
BMC Bioinformatics  2013;14:16.
The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science.
The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets.
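The matching and synonym-conversion steps described above can be illustrated with a minimal Python sketch. It is not the TNRS's own algorithm or API: it substitutes the standard library's `difflib.get_close_matches` for the fuzzy matcher, and the reference list `ACCEPTED`, the synonym table `SYNONYMS`, the function `resolve`, and the 0.8 cutoff are all hypothetical choices made for illustration.

```python
from difflib import get_close_matches

# Hypothetical toy reference taxonomy (not drawn from Tropicos or the TNRS).
ACCEPTED = ["Quercus alba", "Quercus rubra", "Acer saccharum"]
# Hypothetical synonym table mapping a synonym to its accepted name.
SYNONYMS = {"Acer saccharophorum": "Acer saccharum"}


def resolve(name, cutoff=0.8):
    """Return (accepted_name, matched_name), or (None, None) if nothing is close.

    The submitted name is whitespace-normalised and re-capitalised, fuzzily
    matched against all known names, and converted to the accepted name
    when the best match is a synonym.
    """
    candidates = ACCEPTED + list(SYNONYMS)
    cleaned = " ".join(name.split()).capitalize()
    hits = get_close_matches(cleaned, candidates, n=1, cutoff=cutoff)
    if not hits:
        return None, None
    matched = hits[0]
    return SYNONYMS.get(matched, matched), matched


if __name__ == "__main__":
    for submitted in ["Quercus albaa", "acer  saccharophorum", "Zea mays"]:
        print(submitted, "->", resolve(submitted))
```

A misspelled name resolves to its nearest known name, a synonym resolves to its accepted name, and a name with no sufficiently similar candidate is returned unresolved rather than forced onto a poor match.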
We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at and as a RESTful web service and application programming interface. Source code is available at
PMCID: PMC3554605  PMID: 23324024
Biodiversity informatics; Database integration; Taxonomy; Plants
4.  Using the Generic Synteny Browser (GBrowse_syn) 
Genome browsers are software that allows the user to view genome annotations in the context of a reference sequence, such as a chromosome, contig, or scaffold. The Generic Genome Browser (GBrowse) is an open source genome browser package developed as part of the Generic Model Database Project (see Unit 9.9; Stein et al., 2002). The increasing number of sequenced genomes has led to a corresponding growth in the field of comparative genomics, which requires methods to view and compare multiple genomes. Using the same software framework as GBrowse, the Generic Synteny Browser (GBrowse_syn) allows the comparison of co-linear regions of multiple genomes using the familiar GBrowse-style web page. Like GBrowse, GBrowse_syn can be configured to display any organism and is currently the synteny browser used for model organisms such as C. elegans (WormBase; see Unit 1.8) and Arabidopsis (TAIR; see Unit 1.11). GBrowse_syn is part of the GBrowse software package and can be downloaded from the web and run on any Unix-like operating system, such as Linux, Solaris, or Mac OS X. GBrowse_syn is still under active development. This unit will cover installation and configuration as part of the current stable version of GBrowse (v1.71).
PMCID: PMC3162311  PMID: 20836076
5.  The modENCODE Data Coordination Center: lessons in harvesting comprehensive experimental details 
The model organism Encyclopedia of DNA Elements (modENCODE) project is a National Human Genome Research Institute (NHGRI) initiative designed to characterize the genomes of Drosophila melanogaster and Caenorhabditis elegans. A Data Coordination Center (DCC) was created to collect, store and catalog modENCODE data. An effective DCC must gather, organize and provide all primary, interpreted and analyzed data, and ensure the community is supplied with the knowledge of the experimental conditions, protocols and verification checks used to generate each primary data set. We present here the design principles of the modENCODE DCC, and describe the ramifications of collecting thorough and deep metadata for describing experiments, including the use of a wiki for capturing protocol and reagent information, and the BIR-TAB specification for linking biological samples to experimental results. modENCODE data can be found at
Database URL:
PMCID: PMC3170170  PMID: 21856757
6.  Genes that may modulate longevity in C. elegans in both dauer larvae and long-lived daf-2 adults 
Experimental gerontology  2007;42(8):825-839.
We used Serial Analysis of Gene Expression (SAGE) to compare the global transcription profiles of long-lived mutant daf-2 adults and dauer larvae, aiming to identify aging-related genes based on similarity of expression patterns. Genes that are expressed similarly in both long-lived types potentially define a common life-extending program. Comparison of eight SAGE libraries yielded a set of 120 genes, the expression of which was significantly different in long-lived worms versus normal adults. The gene annotations indicate a strong link between oxidative stress and life span, further supporting the hypothesis that metabolic activity is a major determinant in longevity. The SAGE data show changes in mRNA levels for electron transport chain components, elevated expression of glyoxylate shunt enzymes and significantly reduced expression for components of the TCA cycle in longer-lived nematodes. We propose a model for enhanced longevity through a cytochrome c oxidase-mediated reduction in reactive oxygen species commonly held to be a major contributor to aging.
PMCID: PMC2755518  PMID: 17543485
aging; C. elegans; gene expression; longevity; reactive oxygen species
7.  An archived activation tagged population of Arabidopsis thaliana to facilitate forward genetics approaches 
BMC Plant Biology  2009;9:101.
Functional genomics tools provide researchers with the ability to apply high-throughput techniques to determine the function and interaction of a diverse range of genes. Mutagenised plant populations are one such resource that facilitates gene characterisation. They allow complex physiological responses to be correlated with the expression of single genes in planta, through either reverse genetics, where target genes are mutagenised to assay the effect, or forward genetics, where populations of mutant lines are screened to identify those whose phenotype diverges from wild type for a particular trait. One limitation of these types of populations is the prevalence of gene redundancy within plant genomes, which can mask the effect of individual genes. Activation or enhancer populations, which provide not only knock-out but also dominant activation mutations, can facilitate the study of such genes.
We have developed a population of almost 50,000 activation tagged A. thaliana lines that have been archived as individual lines to the T3 generation. The population is an excellent tool for both reverse and forward genetic screens and has been used successfully to identify a number of novel mutants. Insertion site sequences have been generated and mapped for 15,507 lines to enable further application of the population, while providing a clear distribution of T-DNA insertions across the genome. The population is being screened for a number of biochemical and developmental phenotypes; provisional data identifying novel alleles and genes controlling steps in proanthocyanidin biosynthesis and trichome development are presented.
This publicly available population provides an additional tool for plant researchers to assist with determining gene function for the many as yet uncharacterised genes annotated within the Arabidopsis genome sequence. The presence of enhancer elements on the inserted T-DNA molecule allows both knock-out and dominant activation phenotypes to be identified for traits of interest.
PMCID: PMC3091532  PMID: 19646253
8.  An Integrated Strategy to Study Muscle Development and Myofilament Structure in Caenorhabditis elegans 
PLoS Genetics  2009;5(6):e1000537.
A crucial step in the development of muscle cells in all metazoan animals is the assembly and anchorage of the sarcomere, the essential repeat unit responsible for muscle contraction. In Caenorhabditis elegans, many of the critical proteins involved in this process have been uncovered through mutational screens focusing on uncoordinated movement and embryonic arrest phenotypes. We propose that additional sarcomeric proteins exist for which there is a less severe, or entirely different, mutant phenotype produced in their absence. We have used Serial Analysis of Gene Expression (SAGE) to generate a comprehensive profile of late embryonic muscle gene expression. We generated two replicate long SAGE libraries for sorted embryonic muscle cells, identifying 7,974 protein-coding genes. A refined list of 3,577 genes expressed in muscle cells was compiled from the overlap between our SAGE data and available microarray data. Using the genes in our refined list, we have performed two separate RNA interference (RNAi) screens to identify novel genes that play a role in sarcomere assembly and/or maintenance in either embryonic or adult muscle. To identify muscle defects in embryos, we screened specifically for the Pat embryonic arrest phenotype. To visualize muscle defects in adult animals, we fed dsRNA to worms producing a GFP-tagged myosin protein, thus allowing us to analyze their myofilament organization under gene knockdown conditions using fluorescence microscopy. By eliminating or severely reducing the expression of 3,300 genes using RNAi, we identified 122 genes necessary for proper myofilament organization, 108 of which are genes without a previously characterized role in muscle. Many of the genes affecting sarcomere integrity have human homologs for which little or nothing is known.
Author Summary
Muscular diseases affect many people worldwide. While we have learned much about the sarcomere, the basic building block of muscle cells, there are still numerous questions that remain to be answered. We must learn more about proteins expressed in muscle and how they interact so that better treatments for myopathies can be developed. The nematode Caenorhabditis elegans is a valuable model organism for the study of muscle due to similarities between worm body wall muscle and vertebrate muscle, along with its semi-transparent cuticle that allows for visualization of muscle structures in live animals. We have used transcriptional profiling methods to identify the majority of genes that are expressed in the embryonic body wall muscle cells of C. elegans. To gain insight into possible functions performed by these genes and their corresponding proteins, we examined animals and muscle cells for abnormalities after the targeted inactivation of about 3,300 genes. We identified 122 genes necessary for proper myofilament organization, 108 of which had no previously characterized role in muscle. This approach proved to be a rapid and sensitive means to identify genes that affect muscle differentiation and sarcomere assembly.
PMCID: PMC2694363  PMID: 19557190
9.  nGASP – the nematode genome annotation assessment project 
BMC Bioinformatics  2008;9:549.
While the C. elegans genome is extensively annotated, relatively little information is available for other Caenorhabditis species. The nematode genome annotation assessment project (nGASP) was launched to objectively assess the accuracy of protein-coding gene prediction software in C. elegans, and to apply this knowledge to the annotation of the genomes of four additional Caenorhabditis species and other nematodes. Seventeen groups worldwide participated in nGASP, and submitted 47 prediction sets across 10 Mb of the C. elegans genome. Predictions were compared to reference gene sets consisting of confirmed or manually curated gene models from WormBase.
The most accurate gene-finders were 'combiner' algorithms, which made use of transcript- and protein-alignments and multi-genome alignments, as well as gene predictions from other gene-finders. Gene-finders that used alignments of ESTs, mRNAs and proteins came in second. There was a tie for third place between gene-finders that used multi-genome alignments and ab initio gene-finders. The median gene level sensitivity of combiners was 78% and their specificity was 42%, which is nearly the same accuracy reported for combiners in the human genome. C. elegans genes with exons of unusual hexamer content, as well as those with unusually many exons, short exons, long introns, a weak translation start signal, weak splice sites, or poorly conserved orthologs posed the greatest difficulty for gene-finders.
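As a rough illustration of how gene-level figures such as those above are computed, the following Python sketch scores a prediction set against a reference set. It assumes the convention common in gene-prediction assessments, in which sensitivity is the fraction of reference genes predicted exactly and specificity is the fraction of predicted genes that exactly match a reference gene; the abstract does not spell out nGASP's definitions. The function name `gene_level_accuracy` and the exon-coordinate tuples are hypothetical and are not nGASP data.

```python
def gene_level_accuracy(predicted, reference):
    """Return (sensitivity, specificity) at the gene level.

    Each gene is represented as a hashable structure, here a tuple of
    (start, end) exon coordinates; a prediction counts as correct only
    if it is identical to a reference gene.
    """
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)
    # Assumed convention: sensitivity = correct / reference genes,
    # specificity = correct / predicted genes.
    sensitivity = correct / len(reference) if reference else 0.0
    specificity = correct / len(predicted) if predicted else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical toy gene models, each a tuple of exon (start, end) pairs.
    reference = [
        ((100, 200), (300, 400)),
        ((1000, 1500),),
        ((2000, 2100), (2200, 2300)),
        ((5000, 5200),),
    ]
    predicted = [
        ((100, 200), (300, 400)),      # exact match
        ((1000, 1500),),               # exact match
        ((2000, 2100), (2200, 2300)),  # exact match
        ((5000, 5250),),               # wrong exon boundary
        ((7000, 7100),),               # no corresponding reference gene
    ]
    print(gene_level_accuracy(predicted, reference))
```

Under this convention a gene-finder that over-predicts can reach high sensitivity while its specificity stays low, which is one way a gap like the one reported for combiners can arise.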
This experiment establishes a baseline of gene prediction accuracy in Caenorhabditis genomes, and has guided the choice of gene-finders for the annotation of newly sequenced genomes of Caenorhabditis and other nematode species. We have created new gene sets for C. briggsae, C. remanei, C. brenneri, C. japonica, and Brugia malayi using some of the best-performing gene-finders.
PMCID: PMC2651883  PMID: 19099578
