Results 1-21 (21)
 

2.  Deletions of chromosomal regulatory boundaries are associated with congenital disease 
Genome Biology  2014;15(9):423.
Background
Recent data from genome-wide chromosome conformation capture analysis indicate that the human genome is divided into conserved megabase-sized self-interacting regions called topological domains. These topological domains form the regulatory backbone of the genome and are separated by regulatory boundary elements or barriers. Copy-number variations can potentially alter the topological domain architecture by deleting or duplicating the barriers and thereby allowing enhancers from neighboring domains to ectopically activate genes causing misexpression and disease, a mutational mechanism that has recently been termed enhancer adoption.
Results
We use the Human Phenotype Ontology database to relate the phenotypes of 922 deletion cases recorded in the DECIPHER database to monogenic diseases associated with genes in or adjacent to the deletions. We identify combinations of tissue-specific enhancers and genes adjacent to the deletion and associated with phenotypes in the corresponding tissue, whereby the phenotype matched that observed in the deletion. We compare this computationally with a gene-dosage pathomechanism that attempts to explain the deletion phenotype based on haploinsufficiency of genes located within the deletions. Up to 11.8% of the deletions could be best explained by enhancer adoption or a combination of enhancer adoption and gene-dosage effects.
Conclusions
Our results suggest that enhancer adoption caused by deletions of regulatory boundaries may contribute to a substantial minority of copy-number variation phenotypes and should thus be taken into account in their medical interpretation.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0423-1) contains supplementary material, which is available to authorized users.
doi:10.1186/s13059-014-0423-1
PMCID: PMC4180961  PMID: 25315429
3.  Unification of multi-species vertebrate anatomy ontologies for comparative biology in Uberon 
Background
Elucidating disease and developmental dysfunction requires understanding variation in phenotype. Single-species model organism anatomy ontologies (ssAOs) have been established to represent this variation. Multi-species anatomy ontologies (msAOs; vertebrate skeletal, vertebrate homologous, teleost, amphibian AOs) have been developed to represent ‘natural’ phenotypic variation across species. Our aim has been to integrate ssAOs and msAOs for various purposes, including establishing links between phenotypic variation and candidate genes.
Results
Previously, msAOs contained a mixture of unique and overlapping content. This hampered integration and coordination due to the need to maintain cross-references or inter-ontology equivalence axioms to the ssAOs, or to perform large-scale obsolescence and modular import. Here we present the unification of anatomy ontologies into Uberon, a single ontology resource that enables interoperability among disparate data and research groups. As a consequence, independent development of TAO, VSAO, AAO, and vHOG has been discontinued.
Conclusions
The newly broadened Uberon ontology is a unified cross-taxon resource for metazoans (animals) that has been substantially expanded to include a broad diversity of vertebrate anatomical structures, permitting reasoning across anatomical variation in extinct and extant taxa. Uberon is a core resource that supports single- and cross-species queries for candidate genes using annotations for phenotypes from the systematics, biodiversity, medical, and model organism communities, while also providing entities for logical definitions in the Cell and Gene Ontologies.
The ontology release files associated with the ontology merge described in this manuscript are available at: http://purl.obolibrary.org/obo/uberon/releases/2013-02-21/
Current ontology release files are always available at: http://purl.obolibrary.org/obo/uberon/releases/
doi:10.1186/2041-1480-5-21
PMCID: PMC4089931  PMID: 25009735
Evolutionary biology; Morphological variation; Phenotype; Semantic integration; Bio-ontology
4.  BioJS: an open source standard for biological visualisation – its status in 2014 
F1000Research  2014;3:55.
BioJS is a community-based standard and repository of functional components to represent biological information on the web. The development of BioJS has been prompted by the growing need for bioinformatics visualisation tools to be easily shared, reused and discovered. Its modular architecture makes it easy for users to find a specific functionality without needing to know how it has been built, while components can be extended or created for implementing new functionality. The BioJS community of developers currently provides a range of functionality that is open access and freely available. A registry has been set up that categorises and provides installation instructions and testing facilities at http://www.ebi.ac.uk/tools/biojs/. The source code for all components is available for ready use at https://github.com/biojs/biojs.
doi:10.12688/f1000research.3-55.v1
PMCID: PMC4103492  PMID: 25075290
5.  Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research 
F1000Research  2014;2:30.
Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species.
We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains classes from the Human Phenotype Ontology, Mammalian Phenotype Ontology, and generated classes for zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline into our continuous integration system ensuring stable and up-to-date releases.
This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from http://purl.obolibrary.org/obo/hp/uberpheno/.
doi:10.12688/f1000research.2-30.v2
PMCID: PMC3799545  PMID: 24358873
6.  The environment ontology: contextualising biological and biomedical entities 
As biological and biomedical research increasingly reference the environmental context of the biological entities under study, the need for formalisation and standardisation of environment descriptors is growing. The Environment Ontology (ENVO; http://www.environmentontology.org) is a community-led, open project which seeks to provide an ontology for specifying a wide range of environments relevant to multiple life science disciplines and, through an open participation model, to accommodate the terminological requirements of all those needing to annotate data using ontology classes. This paper summarises ENVO’s motivation, content, structure, adoption, and governance approach. The ontology is available from http://purl.obolibrary.org/obo/envo.owl - an OBO format version is also available by switching the file suffix to “obo”.
doi:10.1186/2041-1480-4-43
PMCID: PMC3904460  PMID: 24330602
Environment; Ecosystem; Biome; Ontology
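The ENVO abstract above notes that an OBO-format version can be reached by switching the file suffix of the OWL URL to "obo". A minimal sketch of that suffix switch follows; the helper name `switch_suffix` is hypothetical and not part of any ENVO tooling, and the sketch only builds the URL without checking that the file is actually served there.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit, urlunsplit

# URL quoted in the ENVO abstract.
ENVO_OWL_URL = "http://purl.obolibrary.org/obo/envo.owl"


def switch_suffix(url: str, new_suffix: str) -> str:
    """Return `url` with the file suffix of its path replaced (hypothetical helper)."""
    parts = urlsplit(url)
    # Replace only the final suffix of the path component, e.g. ".owl" -> ".obo".
    new_path = PurePosixPath(parts.path).with_suffix("." + new_suffix.lstrip("."))
    return urlunsplit(parts._replace(path=str(new_path)))


if __name__ == "__main__":
    print(switch_suffix(ENVO_OWL_URL, "obo"))
```

Working on the parsed path rather than on the raw string keeps the host name and any query string untouched.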
7.  Web Apollo: a web-based genomic annotation editing platform 
Genome Biology  2013;14(8):R93.
Web Apollo is the first instantaneous, collaborative genomic annotation editor available on the web. One of the natural consequences following from current advances in sequencing technology is that there are more and more researchers sequencing new genomes. These researchers require tools to describe the functional features of their newly sequenced genomes. With Web Apollo researchers can use any of the common browsers (for example, Chrome or Firefox) to jointly analyze and precisely describe the features of a genome in real time, whether they are in the same room or working from opposite sides of the world.
doi:10.1186/gb-2013-14-8-r93
PMCID: PMC4053811  PMID: 24000942
GENOME; COLLABORATIVE; EDITOR
8.  MouseFinder: candidate disease genes from mouse phenotype data 
Human Mutation  2012;33(5):858-866.
Mouse phenotype data represents a valuable resource for the identification of disease-associated genes, especially where the molecular basis is unknown and there is no clue to the candidate gene’s function, pathway involvement or expression pattern. However, until recently these data have not been systematically used due to difficulties in mapping between clinical features observed in humans and mouse phenotype annotations. Here, we describe a semantic approach to solve this problem and demonstrate highly significant recall of known disease-gene associations and orthology relationships. A web application (MouseFinder; www.mousemodels.org) has been developed to allow users to search the results of our whole-phenome comparison of human and mouse. We demonstrate its use in identifying ARTN as a strong candidate gene within the 1p34.1-p32 mapped locus for a hereditary form of ptosis.
doi:10.1002/humu.22051
PMCID: PMC3327758  PMID: 22331800
phenotype; candidate disease genes; model organism; mouse
9.  Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research 
F1000Research  2013;2:30.
Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species.
We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline into our continuous integration system ensuring stable and up-to-date releases.
This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from http://purl.obolibrary.org/obo/hp/uberpheno/.
doi:10.12688/f1000research.2-30.v1
PMCID: PMC3799545  PMID: 24358873
10.  Phenotype Ontology Research Coordination Network meeting report: creating a community network for comparing and leveraging phenotype-genotype knowledge across species 
Standards in Genomic Sciences  2012;6(3):440-443.
Representing phenotype in a way that can be linked to thousands of molecular genetic and environmental databases is an unresolved research challenge. A recent meeting of the Phenotype Research Coordination Network (RCN) aimed to coordinate and leverage current efforts. The three-day summit meeting was hosted by NESCent (the National Evolutionary Synthesis Center) in Durham, North Carolina, on the 23rd – 25th of February, 2012.
doi:10.4056/sigs.2926219
PMCID: PMC3558964  PMID: 23409218
11.  Uberon, an integrative multi-species anatomy ontology 
Genome Biology  2012;13(1):R5.
We present Uberon, an integrated cross-species ontology consisting of over 6,500 classes representing a variety of anatomical entities, organized according to traditional anatomical classification criteria. The ontology represents structures in a species-neutral way and includes extensive associations to existing species-centric anatomical ontologies, allowing integration of model organism and human data. Uberon provides a necessary bridge between anatomical structures in different taxa for cross-species inference. It uses novel methods for representing taxonomic variation, and has proved to be essential for translational phenotype analyses. Uberon is available at http://uberon.org
doi:10.1186/gb-2012-13-1-r5
PMCID: PMC3334586  PMID: 22293552
12.  Integrating phenotype ontologies across multiple species 
Genome Biology  2010;11(1):R2.
A phenotypic ontology that can be used for the analysis of phenotype-genotype data across multiple species, paving the way for truly cross-species translational research.
Phenotype ontologies are typically constructed to serve the needs of a particular community, such as annotation of genotype-phenotype associations in mouse or human. Here we demonstrate how these ontologies can be improved through assignment of logical definitions using a core ontology of phenotypic qualities and multiple additional ontologies from the Open Biological Ontologies library. We also show how these logical definitions can be used for data integration when combined with a unified multi-species anatomy ontology.
doi:10.1186/gb-2010-11-1-r2
PMCID: PMC2847714  PMID: 20064205
13.  Promoting coherent minimum reporting guidelines for biological and biomedical investigations: the MIBBI project 
Nature biotechnology  2008;26(8):889-896.
The Minimum Information for Biological and Biomedical Investigations (MIBBI) project provides a resource for those exploring the range of extant minimum information checklists and fosters coordinated development of such checklists.
doi:10.1038/nbt.1411
PMCID: PMC2771753  PMID: 18688244
14.  Survey-based naming conventions for use in OBO Foundry ontology development 
BMC Bioinformatics  2009;10:125.
Background
A wide variety of ontologies relevant to the biological and medical domains are available through the OBO Foundry portal, and their number is growing rapidly. Integration of these ontologies, while requiring considerable effort, is extremely desirable. However, heterogeneities in format and style pose serious obstacles to such integration. In particular, inconsistencies in naming conventions can impair the readability and navigability of ontology class hierarchies, and hinder their alignment and integration. While other sources of diversity are tremendously complex and challenging, agreeing a set of common naming conventions is an achievable goal, particularly if those conventions are based on lessons drawn from pooled practical experience and surveys of community opinion.
Results
We summarize a review of existing naming conventions and highlight certain disadvantages with respect to general applicability in the biological domain. We also present the results of a survey carried out to establish which naming conventions are currently employed by OBO Foundry ontologies and to determine what their special requirements regarding the naming of entities might be. Lastly, we propose an initial set of typographic, syntactic and semantic conventions for labelling classes in OBO Foundry ontologies.
Conclusion
Adherence to common naming conventions is more than just a matter of aesthetics. Such conventions provide guidance to ontology creators, help developers avoid flaws and inaccuracies when editing, and especially when interlinking, ontologies. Common naming conventions will also assist consumers of ontologies to more readily understand what meanings were intended by the authors of ontologies used in annotating bodies of data.
doi:10.1186/1471-2105-10-125
PMCID: PMC2684543  PMID: 19397794
15.  The minimum information about a genome sequence (MIGS) specification 
Nature biotechnology  2008;26(5):541-547.
With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the ‘transparency’ of the information contained in existing genomic databases.
doi:10.1038/nbt1360
PMCID: PMC2409278  PMID: 18464787
16.  EGASP: the human ENCODE Genome Annotation Assessment Project 
Genome Biology  2006;7(Suppl 1):S2.
Background
We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment.
Results
The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified.
Conclusion
This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.
doi:10.1186/gb-2006-7-s1-s2
PMCID: PMC1810551  PMID: 16925836
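The EGASP abstract above reports accuracy at the coding nucleotide level in terms of sensitivity and specificity. The sketch below shows one way such figures can be computed from sets of coding positions. It assumes the convention common in gene-prediction evaluation, where sensitivity is TP/(TP+FN) and specificity is TP/(TP+FP); the abstract itself does not state the formulas, and the function name and toy coordinates are illustrative only.

```python
from __future__ import annotations


def nucleotide_accuracy(predicted: set[int], annotated: set[int]) -> tuple[float, float]:
    """Return (sensitivity, specificity) for predicted vs. annotated coding positions.

    Assumed definitions (gene-prediction convention, not taken from the abstract):
      sensitivity = TP / (TP + FN)
      specificity = TP / (TP + FP)
    """
    tp = len(predicted & annotated)   # coding positions predicted as coding
    fn = len(annotated - predicted)   # coding positions that were missed
    fp = len(predicted - annotated)   # positions wrongly predicted as coding
    sensitivity = tp / (tp + fn) if annotated else 0.0
    specificity = tp / (tp + fp) if predicted else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy data: 100 annotated coding positions, 100 predicted, 90 shared.
    annotated = set(range(100, 200))
    predicted = set(range(110, 210))
    sn, sp = nucleotide_accuracy(predicted, annotated)
    print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

Under this convention, "specificity" is what other fields call precision, which is worth keeping in mind when comparing the reported figures with results from other domains.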
17.  The Sequence Ontology: a tool for the unification of genome annotations 
Genome Biology  2005;6(5):R44.
The goal of the Sequence Ontology (SO) project is to produce a structured controlled vocabulary with a common set of terms and definitions for parts of a genomic annotation, and to describe the relationships among them. Details of SO construction, design and use, particularly with regard to part-whole relationships are discussed and the practical utility of SO is demonstrated for a set of genome annotations from Drosophila melanogaster.
The Sequence Ontology (SO) is a structured controlled vocabulary for the parts of a genomic annotation. SO provides a common set of terms and definitions that will facilitate the exchange, analysis and management of genomic data. Because SO treats part-whole relationships rigorously, data described with it can become substrates for automated reasoning, and instances of sequence features described by the SO can be subjected to a group of logical operations termed extensional mereology operators.
doi:10.1186/gb-2005-6-5-r44
PMCID: PMC1175956  PMID: 15892872
18.  Gene Ontology: looking backwards and forwards 
Genome Biology  2004;6(1):103.
The Gene Ontology consortium began six years ago with a group of scientists who decided to connect our data by sharing the same language for describing it. Its most significant achievement lies in uniting many independent biological database efforts into a cooperative force.
doi:10.1186/gb-2004-6-1-103
PMCID: PMC549054  PMID: 15642104
19.  Annotation of the Drosophila melanogaster euchromatic genome: a systematic review 
Genome Biology  2002;3(12):research0083.1-83.22.
The recent completion of the Drosophila melanogaster genomic sequence to high quality, and the availability of a greatly expanded set of Drosophila cDNA sequences, afforded FlyBase the opportunity to significantly improve genomic annotations.
Background
The recent completion of the Drosophila melanogaster genomic sequence to high quality and the availability of a greatly expanded set of Drosophila cDNA sequences, aligning to 78% of the predicted euchromatic genes, afforded FlyBase the opportunity to significantly improve genomic annotations. We made the annotation process more rigorous by inspecting each gene visually, utilizing a comprehensive set of curation rules, requiring traceable evidence for each gene model, and comparing each predicted peptide to SWISS-PROT and TrEMBL sequences.
Results
Although the number of predicted protein-coding genes in Drosophila remains essentially unchanged, the revised annotation significantly improves gene models, resulting in structural changes to 85% of the transcripts and 45% of the predicted proteins. We annotated transposable elements and non-protein-coding RNAs as new features, and extended the annotation of untranslated (UTR) sequences and alternative transcripts to include more than 70% and 20% of genes, respectively. Finally, cDNA sequence provided evidence for dicistronic transcripts, neighboring genes with overlapping UTRs on the same DNA sequence strand, alternatively spliced genes that encode distinct, non-overlapping peptides, and numerous nested genes.
Conclusions
Identification of so many unusual gene models not only suggests that some mechanisms for gene regulation are more prevalent than previously believed, but also underscores the complex challenges of eukaryotic gene prediction. At present, experimental data and human curation remain essential to generate high-quality genome annotations.
doi:10.1186/gb-2002-3-12-research0083
PMCID: PMC151185  PMID: 12537572
20.  Systematic determination of patterns of gene expression during Drosophila embryogenesis 
Genome Biology  2002;3(12):research0088.1-88.14.
As a first step to creating a comprehensive atlas of gene-expression patterns during Drosophila embryogenesis, 2,179 genes have been examined by in situ hybridization to fixed Drosophila embryos. Of the genes assayed, 63.7% displayed dynamic expression patterns that were documented with 25,690 digital photomicrographs of individual embryos.
Background
Cell-fate specification and tissue differentiation during development are largely achieved by the regulation of gene transcription.
Results
As a first step to creating a comprehensive atlas of gene-expression patterns during Drosophila embryogenesis, we examined 2,179 genes by in situ hybridization to fixed Drosophila embryos. Of the genes assayed, 63.7% displayed dynamic expression patterns that were documented with 25,690 digital photomicrographs of individual embryos. The photomicrographs were annotated using controlled vocabularies for anatomical structures that are organized into a developmental hierarchy. We also generated a detailed time course of gene expression during embryogenesis using microarrays to provide an independent corroboration of the in situ hybridization results. All image, annotation and microarray data are stored in a publicly available database. We found that the RNA transcripts of about 1% of genes show clear subcellular localization. Nearly all the annotated expression patterns are distinct. We present an approach for organizing the data by hierarchical clustering of annotation terms that allows us to group tissues that express similar sets of genes as well as genes displaying similar expression patterns.
Conclusions
Analyzing gene-expression patterns by in situ hybridization to whole-mount embryos provides an extremely rich dataset that can be used to identify genes involved in developmental processes that have been missed by traditional genetic analysis. Systematic analysis of rigorously annotated patterns of gene expression will complement and extend the types of analyses carried out using expression microarrays.
doi:10.1186/gb-2002-3-12-research0088
PMCID: PMC151190  PMID: 12537577
21.  The transposable elements of the Drosophila melanogaster euchromatin: a genomics perspective 
Genome Biology  2002;3(12):research0084.1-84.2.
Using Release 3 of the euchromatic genomic sequence of Drosophila melanogaster, 85 known and eight novel families of transposable element have been identified, varying in copy number from one to 146. A total of 1,572 full and partial transposable elements were identified, comprising 3.86% of the sequence.
Background
Transposable elements are found in the genomes of nearly all eukaryotes. The recent completion of the Release 3 euchromatic genomic sequence of Drosophila melanogaster by the Berkeley Drosophila Genome Project has provided precise sequence for the repetitive elements in the Drosophila euchromatin. We have used this genomic sequence to describe the euchromatic transposable elements in the sequenced strain of this species.
Results
We identified 85 known and eight novel families of transposable element varying in copy number from one to 146. A total of 1,572 full and partial transposable elements were identified, comprising 3.86% of the sequence. More than two-thirds of the transposable elements are partial. The density of transposable elements increases an average of 4.7 times in the centromere-proximal regions of each of the major chromosome arms. We found that transposable elements are preferentially found outside genes; only 436 of 1,572 transposable elements are contained within the 61.4 Mb of sequence that is annotated as being transcribed. A large proportion of transposable elements is found nested within other elements of the same or different classes. Lastly, an analysis of structural variation from different families reveals distinct patterns of deletion for elements belonging to different classes.
Conclusions
This analysis represents an initial characterization of the transposable elements in the Release 3 euchromatic genomic sequence of D. melanogaster for which comparison to the transposable elements of other organisms can begin to be made. These data have been made available on the Berkeley Drosophila Genome Project website for future analyses.
doi:10.1186/gb-2002-3-12-research0084
PMCID: PMC151186  PMID: 12537573
