Results 1-9 (9)
 

1.  The zebrafish reference genome sequence and its relationship to the human genome 
Howe, Kerstin | Clark, Matthew D. | Torroja, Carlos F. | Torrance, James | Berthelot, Camille | Muffato, Matthieu | Collins, John E. | Humphray, Sean | McLaren, Karen | Matthews, Lucy | McLaren, Stuart | Sealy, Ian | Caccamo, Mario | Churcher, Carol | Scott, Carol | Barrett, Jeffrey C. | Koch, Romke | Rauch, Gerd-Jörg | White, Simon | Chow, William | Kilian, Britt | Quintais, Leonor T. | Guerra-Assunção, José A. | Zhou, Yi | Gu, Yong | Yen, Jennifer | Vogel, Jan-Hinnerk | Eyre, Tina | Redmond, Seth | Banerjee, Ruby | Chi, Jianxiang | Fu, Beiyuan | Langley, Elizabeth | Maguire, Sean F. | Laird, Gavin K. | Lloyd, David | Kenyon, Emma | Donaldson, Sarah | Sehra, Harminder | Almeida-King, Jeff | Loveland, Jane | Trevanion, Stephen | Jones, Matt | Quail, Mike | Willey, Dave | Hunt, Adrienne | Burton, John | Sims, Sarah | McLay, Kirsten | Plumb, Bob | Davis, Joy | Clee, Chris | Oliver, Karen | Clark, Richard | Riddle, Clare | Eliott, David | Threadgold, Glen | Harden, Glenn | Ware, Darren | Mortimer, Beverly | Kerry, Giselle | Heath, Paul | Phillimore, Benjamin | Tracey, Alan | Corby, Nicole | Dunn, Matthew | Johnson, Christopher | Wood, Jonathan | Clark, Susan | Pelan, Sarah | Griffiths, Guy | Smith, Michelle | Glithero, Rebecca | Howden, Philip | Barker, Nicholas | Stevens, Christopher | Harley, Joanna | Holt, Karen | Panagiotidis, Georgios | Lovell, Jamieson | Beasley, Helen | Henderson, Carl | Gordon, Daria | Auger, Katherine | Wright, Deborah | Collins, Joanna | Raisen, Claire | Dyer, Lauren | Leung, Kenric | Robertson, Lauren | Ambridge, Kirsty | Leongamornlert, Daniel | McGuire, Sarah | Gilderthorp, Ruth | Griffiths, Coline | Manthravadi, Deepa | Nichol, Sarah | Barker, Gary | Whitehead, Siobhan | Kay, Michael | Brown, Jacqueline | Murnane, Clare | Gray, Emma | Humphries, Matthew | Sycamore, Neil | Barker, Darren | Saunders, David | Wallis, Justene | Babbage, Anne | Hammond, Sian | Mashreghi-Mohammadi, Maryam | Barr, Lucy | Martin, Sancha | Wray, Paul | Ellington, Andrew | Matthews, Nicholas | Ellwood, Matthew | Woodmansey, Rebecca | Clark, Graham | Cooper, James | Tromans, Anthony | Grafham, Darren | Skuce, Carl | Pandian, Richard | Andrews, Robert | Harrison, Elliot | Kimberley, Andrew | Garnett, Jane | Fosker, Nigel | Hall, Rebekah | Garner, Patrick | Kelly, Daniel | Bird, Christine | Palmer, Sophie | Gehring, Ines | Berger, Andrea | Dooley, Christopher M. | Ersan-Ürün, Zübeyde | Eser, Cigdem | Geiger, Horst | Geisler, Maria | Karotki, Lena | Kirn, Anette | Konantz, Judith | Konantz, Martina | Oberländer, Martina | Rudolph-Geiger, Silke | Teucke, Mathias | Osoegawa, Kazutoyo | Zhu, Baoli | Rapp, Amanda | Widaa, Sara | Langford, Cordelia | Yang, Fengtang | Carter, Nigel P. | Harrow, Jennifer | Ning, Zemin | Herrero, Javier | Searle, Steve M. J. | Enright, Anton | Geisler, Robert | Plasterk, Ronald H. A. | Lee, Charles | Westerfield, Monte | de Jong, Pieter J. | Zon, Leonard I. | Postlethwait, John H. | Nüsslein-Volhard, Christiane | Hubbard, Tim J. P. | Crollius, Hugues Roest | Rogers, Jane | Stemple, Derek L.
Nature  2013;496(7446):498-503.
Zebrafish have become a popular organism for the study of vertebrate gene function [1,2]. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and, increasingly, the study of human genetic disease [3–5]. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes [6], the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.
doi:10.1038/nature12111
PMCID: PMC3703927  PMID: 23594743
2.  The Vertebrate Genome Annotation browser 10 years on 
Nucleic Acids Research  2013;42(D1):D771-D779.
The Vertebrate Genome Annotation (VEGA) database (http://vega.sanger.ac.uk), initially designed as a community resource for browsing manual annotation of the human genome project, now contains five reference genomes (human, mouse, zebrafish, pig and rat). Its introduction pages have been redesigned to enable the user to easily navigate between whole genomes and smaller multi-species haplotypic regions of interest such as the major histocompatibility complex. The VEGA browser is unique in that annotation is updated via the Human And Vertebrate Analysis aNd Annotation (HAVANA) update track every 2 weeks, allowing single gene updates to be made publicly available to the research community quickly. The user can now access different haplotypic subregions more easily, such as those from the non-obese diabetic mouse, and display them in a more intuitive way using the comparative tools. We also highlight how the user can browse manually annotated updated patches from the Genome Reference Consortium (GRC).
doi:10.1093/nar/gkt1241
PMCID: PMC3964964  PMID: 24316575
3.  Current status and new features of the Consensus Coding Sequence database  
Nucleic Acids Research  2013;42(D1):D865-D872.
The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.
doi:10.1093/nar/gkt1059
PMCID: PMC3965069  PMID: 24217909
4.  Best practices in bioinformatics training for life scientists 
Briefings in Bioinformatics  2013;14(5):528-537.
The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.
doi:10.1093/bib/bbt043
PMCID: PMC3771230  PMID: 23803301
bioinformatics; training; bioinformatics courses; training life scientists; train the trainers
5.  iAnn: an event sharing platform for the life sciences 
Bioinformatics  2013;29(15):1919-1921.
Summary: We present iAnn, an open source community-driven platform for dissemination of life science events, such as courses, conferences and workshops. iAnn allows automatic visualization and integration of customised event reports. A central repository lies at the core of the platform: curators add submitted events, and these are subsequently accessed via web services. Thus, once an iAnn widget is incorporated into a website, it permanently shows timely relevant information as if it were native to the remote site. At the same time, announcements submitted to the repository are automatically disseminated to all portals that query the system. To facilitate the visualization of announcements, iAnn provides powerful filtering options and views, integrated in Google Maps and Google Calendar. All iAnn widgets are freely available.
Availability: http://iann.pro/iannviewer
Contact: manuel.corpas@tgac.ac.uk
doi:10.1093/bioinformatics/btt306
PMCID: PMC3712218  PMID: 23742982
6.  Structural and functional annotation of the porcine immunome 
BMC Genomics  2013;14:332.
Background
The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems.
Results
The Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type 1 Interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many un-annotated porcine genes. A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited an accelerated evolution as compared to 4.1% across the entire genome.
Conclusions
This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig’s adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response.
doi:10.1186/1471-2164-14-332
PMCID: PMC3658956  PMID: 23676093
Immune response; Porcine; Genome annotation; Co-expression network; Phylogenetic analysis; Accelerated evolution
7.  Tracking and coordinating an international curation effort for the CCDS Project 
The Consensus Coding Sequence (CCDS) collaboration involves curators at multiple centers with a goal of producing a conservative set of high quality, protein-coding region annotations for the human and mouse reference genome assemblies. The CCDS data set reflects a ‘gold standard’ definition of best supported protein annotations, and corresponding genes, which pass a standard series of quality assurance checks and are supported by manual curation. This data set supports use of genome annotation information by human and mouse researchers for effective experimental design, analysis and interpretation. The CCDS project consists of analysis of automated whole-genome annotation builds to identify identical CDS annotations, quality assurance testing and manual curation support. Identical CDS annotations are tracked with a CCDS identifier (ID) and any future change to the annotated CDS structure must be agreed upon by the collaborating members. CCDS curation guidelines were developed to address some aspects of curation in order to improve initial annotation consistency and to reduce time spent in discussing proposed annotation updates. Here, we present the current status of the CCDS database and details on our procedures to track and coordinate our efforts. We also present the relevant background and reasoning behind the curation standards that we have developed for CCDS database treatment of transcripts that are nonsense-mediated decay (NMD) candidates, for transcripts containing upstream open reading frames, for identifying the most likely translation start codons and for the annotation of readthrough transcripts. Examples are provided to illustrate the application of these guidelines.
Database URL: http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi
doi:10.1093/database/bas008
PMCID: PMC3308164  PMID: 22434842
8.  Community gene annotation in practice 
Manual annotation of genomic data is extremely valuable to produce an accurate reference gene set but is expensive compared with automatic methods and so has been limited to model organisms. Annotation tools that have been developed at the Wellcome Trust Sanger Institute (WTSI, http://www.sanger.ac.uk/) are being used to fill that gap, as they can be used remotely and so open up viable community annotation collaborations. We introduce the ‘Blessed’ annotator and ‘Gatekeeper’ approach to Community Annotation using the Otterlace/ZMap genome annotation tool. We also describe the strategies adopted for annotation consistency, quality control and viewing of the annotation.
Database URL: http://vega.sanger.ac.uk/index.html
doi:10.1093/database/bas009
PMCID: PMC3308165  PMID: 22434843
9.  Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers 
Briefings in Bioinformatics  2011;13(3):383-389.
Funding bodies are increasingly recognizing the need to provide graduates and researchers with access to short intensive courses in a variety of disciplines, in order both to improve the general skills base and to provide solid foundations on which researchers may build their careers. In response to the development of ‘high-throughput biology’, the need for training in the field of bioinformatics, in particular, is seeing a resurgence: it has been defined as a key priority by many institutions and research programmes and is now an important component of many grant proposals. Nevertheless, when it comes to planning and preparing to meet such training needs, tension arises between the reward structures that predominate in the scientific community, which compel individuals to publish or perish, and the time that must be devoted to the design, delivery and maintenance of high-quality training materials. Conversely, there is much relevant teaching material and training expertise available worldwide that, were it properly organized, could be exploited by anyone who needs to provide training or needs to set up a new course. To do this, however, the materials would have to be centralized in a database and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs, and review it in relation to similar initiatives and collections.
doi:10.1093/bib/bbr064
PMCID: PMC3357490  PMID: 22110242
Bioinformatics; training; end users; bioinformatics courses; learning bioinformatics
